Abstract
This letter studies the expansion and preservation of information in a binary autoencoder whose hidden layer is larger than the input. Such expansion is widespread in biological neural networks, as in the olfactory system of the fruit fly or the projection of thalamic inputs to the neocortex. We analyze the threshold model, the k-winners-take-all (kWTA) model, and the binary matching pursuit model to find how the sparsity and the dimension of the encoding influence input reconstruction, similarity preservation, and mutual information across layers. We show that sparser activation of the hidden layer is preferable for preserving information between the input and the output layers. In contrast, all three models show optimal similarity preservation at dense, not sparse, hidden layer activation. Furthermore, with a large enough hidden layer, it is possible to get zero reconstruction error for any input just by varying the thresholds of neurons. However, we show that the preference for sparsity is due to the noise in the weight matrix between layers. A fixed number of nonzero connections to every neuron achieves better information preservation and input reconstruction for dense hidden layer activation. The theoretical results give useful insight into models of neural computation based on sparse binary representation and associative memory.