To form view-invariant representations of objects, neurons in the inferior temporal cortex may associate together different views of an object, which tend to occur close together in time under natural viewing conditions. This can be achieved in neuronal network models of this process by using an associative learning rule with a short-term temporal memory trace. It is postulated that within a view, neurons learn representations that enable them to generalize within variations of that view. When three-dimensional (3D) objects are rotated within small angles (up to, e.g., 30 degrees), their surface features undergo geometric distortion due to the change of perspective. In this article, we show how trace learning could solve the problem of in-depth rotation-invariant object recognition by developing representations of the transforms that features undergo when they are on the surfaces of 3D objects. Moreover, we show that having learned how features on 3D objects transform geometrically as the object is rotated in depth, the network can correctly recognize novel 3D variations within a generic view of an object composed of a new combination of previously learned features. These results are demonstrated in simulations of a hierarchical network model (VisNet) of the visual system that show that it can develop representations useful for the recognition of 3D objects by forming perspective-invariant representations to allow generalization within a generic view.