Abstract
We explore the hypothesis that linear cortical neurons are concerned with building a particular type of representation of the visual world: one that not only preserves the information and the efficiency achieved by the retina, but in addition preserves spatial relationships in the input, both in the plane of vision and in the depth dimension. Focusing on the linear cortical cells, we classify all transforms having these properties. They are given by representations of the scaling and translation group and turn out to be labeled by rational numbers ‘(p + q)/p’ (p, q integers). Any given (p, q) predicts a set of receptive fields that comes at different spatial locations and scales (sizes), with a bandwidth of log₂[(p + q)/p] octaves and, most interestingly, with a diversity of ‘q’ cell varieties. The bandwidth affects the trade-off between preservation of planar and depth relations and, we think, should be selected to match structures in natural scenes. For bandwidths between 1 and 2 octaves, which are the ones we feel provide the best matching, we find for each scale a minimum of two distinct cell types that reside next to each other and in phase quadrature, that is, differ by 90° in the phases of their receptive fields, as are found in the cortex. In special cases, they resemble the “even-symmetric” and “odd-symmetric” simple cells. An interesting consequence of the representations presented here is that the pattern of activation in the cells in response to a translation or scaling of an object remains the same but merely shifts its locus from one group of cells to another. This work also provides a new understanding of the changes in color coding from the retina to the cortex.
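
As a quick numerical illustration of the bandwidth formula quoted above, the following minimal sketch tabulates log₂[(p + q)/p] for small labels. It assumes that p and q are positive and coprime, which the abstract does not state explicitly; the function names `bandwidth_octaves` and `labels_in_band` are hypothetical helpers introduced only for this example and are not part of the paper.

```python
from math import gcd, log2


def bandwidth_octaves(p: int, q: int) -> float:
    """Bandwidth in octaves, log2((p + q) / p), as quoted in the abstract."""
    if p <= 0 or q <= 0:
        # Assumption of this sketch: both integers are positive.
        raise ValueError("this sketch assumes positive integers p and q")
    return log2((p + q) / p)


def labels_in_band(max_p: int, max_q: int, low: float = 1.0, high: float = 2.0):
    """List (p, q, bandwidth) with low < bandwidth < high.

    Assumption of this sketch: only coprime pairs are kept, so each
    pair stands for one reduced label.
    """
    found = []
    for p in range(1, max_p + 1):
        for q in range(1, max_q + 1):
            if gcd(p, q) != 1:
                continue  # skip unreduced pairs such as (2, 4)
            bw = bandwidth_octaves(p, q)
            if low < bw < high:
                found.append((p, q, bw))
    return found


if __name__ == "__main__":
    for p, q, bw in labels_in_band(3, 8):
        print(f"p={p}, q={q}: {bw:.3f} octaves, {q} cell varieties")
```

Under these assumptions, a bandwidth strictly between 1 and 2 octaves means 2 < (p + q)/p < 4, that is, p < q < 3p, which forces q ≥ 2. This agrees with the stated minimum of two distinct cell types in that range; the smallest such label in the sketch is (p, q) = (1, 2), at about 1.58 octaves.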