Abstract
Deep convolutional neural networks (DCNNs) have demonstrated impressive robustness to recognize objects under transformations (e.g., blur or noise) when these transformations are included in the training set. A hypothesis to explain such robustness is that DCNNs develop invariant neural representations that remain unaltered when the image is transformed. However, to what extent this hypothesis holds true is an outstanding question, as robustness to transformations could be achieved with properties different from invariance; for example, parts of the network could be specialized to recognize either transformed or nontransformed images. This article investigates the conditions under which invariant neural representations emerge by leveraging that they facilitate robustness to transformations beyond the training distribution. Concretely, we analyze a training paradigm in which only some object categories are seen transformed during training and evaluate whether the DCNN is robust to transformations across categories not seen transformed. Our results with state-of-the-art DCNNs indicate that invariant neural representations do not always drive robustness to transformations, as networks show robustness for categories seen transformed during training even in the absence of invariant neural representations. Invariance emerges only as the number of transformed categories in the training set is increased. This phenomenon is much more prominent with local transformations such as blurring and high-pass filtering than geometric transformations such as rotation and thinning, which entail changes in the spatial arrangement of the object. Our results contribute to a better understanding of invariant neural representations in deep learning and the conditions under which it spontaneously emerges.