We describe a computational model for inferring 3D structure from the motion of projected 2D points in an image, with the aim of understanding how biological vision systems learn and internally represent 3D transformations from the statistics of their input. The model uses manifold transport operators to describe the action of transformations on 3D points in a scene. We show that the model can learn the generator of the Lie group for these transformations from purely 2D input, providing a proof-of-concept demonstration of how biological systems could adapt their internal representations based on sensory input. Focusing on a rotational model, we evaluate the model's ability to infer depth from moving 2D projected points and to learn rotational transformations from 2D training stimuli. Finally, we compare the model's performance to psychophysical performance on structure-from-motion tasks.
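
To make the setup concrete, the sketch below illustrates the general idea in JAX: a single antisymmetric Lie group generator and per-point depths are jointly optimized so that transported 3D points reproject onto observed 2D trajectories. This is a minimal illustrative sketch, not the authors' implementation; the orthographic projection, the single-generator parameterization, the synthetic data, and all names (`hat`, `loss`, learning rate, step counts) are assumptions introduced here for illustration.

```python
import jax
import jax.numpy as jnp
from jax.scipy.linalg import expm

key = jax.random.PRNGKey(0)
n_pts, n_steps = 20, 10  # illustrative sizes: 20 points observed over 10 frames

def hat(w):
    """Map coefficients w in R^3 to the antisymmetric generator of SO(3)."""
    return jnp.array([[0.0, -w[2], w[1]],
                      [w[2], 0.0, -w[0]],
                      [-w[1], w[0], 0.0]])

# Ground truth used only to synthesize training data: rotation about the
# vertical (y) axis, so depth genuinely matters for predicting 2D motion.
true_w = jnp.array([0.0, 0.15, 0.0])
R_true = expm(hat(true_w))

X = jax.random.normal(key, (n_pts, 3))
frames = [X]
for _ in range(n_steps - 1):
    frames.append(frames[-1] @ R_true.T)
obs2d = jnp.stack(frames)[..., :2]  # the model sees only these 2D projections

def loss(params):
    """Reprojection error: transport inferred 3D points with expm(hat(w))
    and compare their orthographic projections to the observed 2D points."""
    w, z0 = params
    R_hat = expm(hat(w))
    # Lift the first 2D frame to 3D using the latent depth estimates z0.
    X3 = jnp.concatenate([obs2d[0], z0[:, None]], axis=1)
    err = 0.0
    for t in range(1, n_steps):
        X3 = X3 @ R_hat.T
        err = err + jnp.sum((X3[:, :2] - obs2d[t]) ** 2)
    return err

# Jointly optimize the generator coefficients w and the latent depths z0
# by plain gradient descent (expm is differentiable in JAX).
params = (0.01 * jax.random.normal(key, (3,)), jnp.zeros(n_pts))
grad_fn = jax.jit(jax.grad(loss))
for step in range(2000):
    grads = grad_fn(params)
    params = jax.tree_util.tree_map(lambda p, g: p - 0.01 * g, params, grads)

w_learned, z_learned = params
# Recovery is up to the classic depth-reversal ambiguity of orthographic
# structure from motion: (w, z) and (-w, -z) yield identical 2D trajectories.
print("learned generator coefficients:", w_learned)
```

In this toy setting the learned generator coefficients converge toward `true_w` (or its sign-flipped counterpart), which mirrors the abstract's claim that the Lie group generator can be learned from purely 2D input.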
