We describe a series of numerical experiments that measure the average generalization capability of neural networks trained on a variety of simple functions. These experiments are designed to test the relationship between average generalization performance and the worst-case bounds obtained from formal learning theory using the Vapnik-Chervonenkis (VC) dimension (Blumer et al. 1989; Haussler et al. 1990). Recent statistical learning theories (Tishby et al. 1989; Schwartz et al. 1990) suggest that surpassing these bounds might be possible if the spectrum of possible generalizations has a “gap” near perfect performance. We indeed find that, in some cases, the average generalization is significantly better than the VC bound: the approach to perfect performance is exponential in the number of examples m, rather than the 1/m result of the bound. However, in these cases, we have not found evidence of the gap predicted by the above statistical theories. In other cases, we do find the 1/m behavior of the VC bound, and in these cases, the numerical prefactor is closely related to the prefactor contained in the bound.