We study the interaction between input distributions, learning algorithms, and finite sample sizes when learning classification tasks. Focusing on normal input distributions, we use statistical mechanics techniques to calculate the empirical and expected (or generalization) errors for several well-known algorithms that learn the weights of a single-layer perceptron. For spherically symmetric distributions within each class, we find that the simple Hebb rule, which corresponds to maximum-likelihood parameter estimation, outperforms the more complex algorithms based on error minimization. Moreover, we show that in the regime where the overlap between the classes is large, algorithms with low empirical error generalize worse, a phenomenon known as overtraining.
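The sketch below is a minimal numerical illustration of the setting described above, not the paper's statistical mechanics calculation: two spherically symmetric Gaussian classes with means ±mu, a Hebb-rule weight vector, and a simple error-minimizing (perceptron-style) alternative, compared on empirical and generalization error. The dimension, sample size, class separation, and the particular error-minimization update are all assumptions chosen for illustration.

```python
# Illustrative sketch (assumed parameters, not the paper's calculation):
# Hebb rule vs. an error-minimizing perceptron on two Gaussian classes.
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

N = 50                                # input dimension (assumed)
P = 100                               # number of training examples (assumed)
mu = 0.5 * np.ones(N) / np.sqrt(N)    # class-mean direction (assumed)

# Training set: labels y = +/-1, inputs drawn from N(y*mu, I) within each class.
y = rng.choice([-1.0, 1.0], size=P)
X = y[:, None] * mu + rng.standard_normal((P, N))

# Hebb rule: w = (1/P) * sum_i y_i x_i, the maximum-likelihood estimate
# of the class-mean direction for spherically symmetric Gaussian classes.
w_hebb = (y @ X) / P

# Error minimization (one simple choice): batch perceptron-style updates
# that repeatedly correct the currently misclassified training examples.
w_err = w_hebb.copy()
for _ in range(200):
    margins = y * (X @ w_err)
    wrong = margins <= 0
    if not wrong.any():
        break
    w_err += (y[wrong] @ X[wrong]) / P

def empirical_error(w):
    # Fraction of training examples misclassified by sign(w.x).
    return np.mean(np.sign(X @ w) != y)

def generalization_error(w):
    # For classes N(+/-mu, I) and the decision rule sign(w.x), the expected
    # error is Phi(-w.mu / |w|), where Phi is the standard normal CDF.
    z = w @ mu / np.linalg.norm(w)
    return 0.5 * (1 - erf(z / sqrt(2)))

for name, w in [("Hebb rule", w_hebb), ("error minimization", w_err)]:
    print(f"{name}: empirical error {empirical_error(w):.3f}, "
          f"generalization error {generalization_error(w):.3f}")
```

With strongly overlapping classes (small |mu|), runs of this kind typically show the error-minimizing weights achieving a lower empirical error than the Hebb weights while generalizing no better or worse, which is the overtraining effect the abstract refers to; the exact numbers depend on the assumed parameters above.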
