Abstract
The two-layer radial basis function network, with fixed basis-function centers, is analyzed within a stochastic training paradigm. Various definitions of generalization error are considered, and two such definitions are employed in deriving generic learning curves and generalization properties, both with and without a weight decay term. The generalization error is shown analytically to be related to the evidence and, via the evidence, to the prediction error and free energy. The generalization behavior is explored; the generic learning curve is found to be inversely proportional to the number of training pairs presented. Training is then optimized by minimizing the generalization error with respect to the free parameters of the training algorithm. Finally, the effect of the joint activations between hidden-layer units is examined and shown to speed training.
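To make the architecture under study concrete, the following is a minimal sketch of a two-layer RBF network with fixed centers, trained on a quadratic error with a weight decay term. All concrete choices here (Gaussian basis functions, the width sigma, the decay strength, the toy sine target) are illustrative assumptions, not taken from the paper; since the energy is quadratic in the output weights, the closed-form ridge solution used below coincides with the mean of the Gibbs posterior that a stochastic training paradigm would sample from.

    import numpy as np

    # Two-layer RBF network with fixed centers (illustrative sketch).
    # Hidden layer: Gaussian basis functions
    #   phi_j(x) = exp(-||x - c_j||^2 / (2 sigma^2)).
    # Output layer: linear weights w, the only trained parameters,
    # since the centers c_j are fixed in advance.

    def rbf_design_matrix(X, centers, sigma):
        """Hidden-layer activations for every (input, center) pair."""
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    def train_output_weights(X, y, centers, sigma, weight_decay):
        """Weight-decay-regularized least squares for the output weights.

        With fixed centers the network is linear in w, so minimizing
        ||Phi w - y||^2 + weight_decay * ||w||^2 has a closed form.
        """
        Phi = rbf_design_matrix(X, centers, sigma)
        H = Phi.T @ Phi + weight_decay * np.eye(Phi.shape[1])
        return np.linalg.solve(H, Phi.T @ y)

    def predict(X, centers, sigma, w):
        return rbf_design_matrix(X, centers, sigma) @ w

    # Toy usage: fit a noisy 1-D target from p training pairs.
    rng = np.random.default_rng(0)
    p = 40                                    # number of training pairs
    X = rng.uniform(-1.0, 1.0, size=(p, 1))
    y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(p)

    centers = np.linspace(-1.0, 1.0, 10)[:, None]  # fixed, never adapted
    sigma, weight_decay = 0.3, 1e-2                # illustrative values

    w = train_output_weights(X, y, centers, sigma, weight_decay)
    X_test = np.linspace(-1.0, 1.0, 5)[:, None]
    print(predict(X_test, centers, sigma, w))

Because the centers are fixed, training acts only on the output weights, in which the network is linear; this is what makes closed-form analysis of the learning curve and of the weight decay term tractable.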