There’s strong evidence that humans rely on coordinate frames, such as reference lines and curves, to judge the position of points in space. That’s unlike widely used computer vision algorithms, which differentiate among objects using numerical representations of their characteristics.
In pursuit of a more humanlike approach, researchers at Google, Alphabet subsidiary DeepMind, and the University of Oxford developed what they call the Stacked Capsule Autoencoder (SCAE), which reasons about objects using the geometric relationships between their parts. Since these relationships don’t depend on the position from which the model views the objects, the model classifies objects with high accuracy even when the point of view changes.
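To see why relationships between parts are robust to viewpoint while raw coordinates are not, consider a toy sketch (this is an illustrative example, not the SCAE implementation): if an object's parts are rotated and translated together, each part's absolute position changes, but the geometry among the parts is preserved.

```python
import numpy as np

def pairwise_distances(points):
    """Distances between every pair of 2-D part positions."""
    diffs = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diffs, axis=-1)

# Hypothetical part positions of a toy object (e.g. three parts of a face).
parts = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.5]])

# A change of viewpoint: rotate the whole object 40 degrees and translate it.
theta = np.deg2rad(40)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
moved = parts @ R.T + np.array([5.0, -3.0])

# Raw coordinates change under the new viewpoint...
print(np.allclose(parts, moved))  # False

# ...but part-to-part geometry is unchanged.
print(np.allclose(pairwise_distances(parts), pairwise_distances(moved)))  # True
```

A model that encodes the second kind of information, as the SCAE does with part-whole relationships, therefore has less to relearn when the camera moves.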
In 2017, Geoffrey Hinton — a foremost theorist of AI and a recipient of the Turing Award — and his collaborators proposed capsule networks, the architecture on which the SCAE builds.