Exhaustive exploration of an ensemble of networks is used to model learning and generalization in layered neural networks. A simple Boolean learning problem involving networks with binary weights is solved numerically to obtain the entropy $S_m$ and the average generalization ability $G_m$ as functions of the size $m$ of the training set. Learning curves $G_m$ vs. $m$ are shown to depend solely on the distribution of generalization abilities over the ensemble of networks. This distribution is determined prior to learning, and provides a novel theoretical tool for predicting network performance on a specific task.
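The exhaustive procedure is easy to reproduce at toy scale. Below is a minimal sketch, not the authors' original simulation: it assumes a single-layer perceptron student with binary weights $w \in \{-1,+1\}^N$, a teacher drawn from the same ensemble, a sign-threshold output rule, and an $N$ small enough that all $2^N$ networks and all $2^N$ Boolean inputs can be enumerated. The variable names `S_m` and `G_m` mirror the quantities in the abstract; everything else is an illustrative assumption.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
N = 8  # small enough that both weight space and input space fit in memory

# All 2^N Boolean inputs and all 2^N binary-weight networks.
inputs = np.array(list(itertools.product([-1, 1], repeat=N)))
weights = np.array(list(itertools.product([-1, 1], repeat=N)))

# Output table: entry [w, x] is sign(w . x), with ties broken toward +1.
out = np.where(weights @ inputs.T >= 0, 1, -1)

teacher = rng.integers(len(weights))    # target rule drawn from the same ensemble
g = (out == out[teacher]).mean(axis=1)  # generalization ability of every network:
                                        # fraction of ALL inputs where it agrees
                                        # with the teacher

order = rng.permutation(len(inputs))    # random presentation order of examples
for m in range(2 * N + 1):
    shown = order[:m]                   # training set of size m
    # Keep only the networks consistent with every training example seen so far.
    consistent = (out[:, shown] == out[teacher, shown]).all(axis=1)
    S_m = np.log(consistent.sum())      # entropy: log size of the consistent ensemble
    G_m = g[consistent].mean()          # average generalization ability of survivors
    print(f"m={m:2d}  S_m={S_m:5.2f}  G_m={G_m:.3f}")
```

Under these assumptions, `S_m` decreases and `G_m` increases monotonically on average as $m$ grows, and the whole learning curve is determined by the histogram of `g` over the ensemble, which the script computes before any training data is presented.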
The ability of learning networks to generalize can be greatly enhanced by providing constraints from the task domain. This paper demonstrates how such constraints can be integrated into a backpropagation network through the network's architecture. This approach has been applied successfully to the recognition of handwritten zip code digits provided by the U.S. Postal Service. A single network learns the entire recognition operation, going from the normalized image of a character to the final classification.
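To make the notion of an architectural constraint concrete, the sketch below implements one shared-weight (convolutional) feature map in plain numpy. The 16x16 input matches the size of a normalized character image mentioned in the abstract, but the 5x5 kernel, the stride of 2, and the tanh squashing function are illustrative choices for this sketch, not a reconstruction of the paper's exact network.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Shared-weight feature map: the same small kernel is applied at every
    image location, so sensitivity to local features and tolerance to their
    position are built into the architecture rather than learned from data."""
    H, W = image.shape
    k = kernel.shape[0]
    out = np.empty(((H - k) // stride + 1, (W - k) // stride + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i * stride:i * stride + k, j * stride:j * stride + k]
            out[i, j] = np.tanh(np.sum(patch * kernel))
    return out

# A 16x16 normalized digit image (random stand-in) and one 5x5 shared kernel.
image = np.random.default_rng(1).standard_normal((16, 16))
kernel = np.random.default_rng(2).standard_normal((5, 5)) * 0.1
feature_map = conv2d(image, kernel, stride=2)
print(feature_map.shape)  # (6, 6): 36 output units, but only 25 trainable weights
```

The design point is the parameter count: because the 25 kernel weights are reused at every position, the layer has far fewer free parameters than a fully connected layer with the same number of units, and it is this reduction, imposed by the architecture, that constrains the network toward solutions that generalize.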