Peter M. Williams
Journal Articles
Neural Computation (1996) 8 (4): 843–854.
Published: 01 May 1996
Abstract
Neural network outputs are interpreted as parameters of statistical distributions. This allows us to fit conditional distributions in which the parameters depend on the inputs to the network. We exploit this in modeling multivariate data, including the univariate case, in which there may be input-dependent (e.g., time-dependent) correlations between output components. This provides a novel way of modeling conditional correlation that extends existing techniques for determining input-dependent (local) error bars.
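For concreteness, the sketch below illustrates the general idea rather than the paper's implementation: a small network (assumed architecture, toy data, and hyperparameters) emits the means, standard deviations, and correlation of a conditional bivariate Gaussian and is fitted by minimizing the negative log-likelihood, so the correlation between the two output components depends on the input.

```python
# Minimal sketch (PyTorch, illustrative only): a network whose outputs are the
# parameters of a conditional bivariate Gaussian, so the error bars and the
# correlation between the two output components are input-dependent.
import math
import torch
import torch.nn as nn

class ConditionalGaussianNet(nn.Module):
    def __init__(self, n_in: int, n_hidden: int = 32):
        super().__init__()
        # 5 outputs: mu1, mu2, log sigma1, log sigma2, raw correlation
        self.net = nn.Sequential(
            nn.Linear(n_in, n_hidden), nn.Tanh(), nn.Linear(n_hidden, 5)
        )

    def forward(self, x):
        mu1, mu2, log_s1, log_s2, rho_raw = self.net(x).unbind(dim=-1)
        s1, s2 = log_s1.exp(), log_s2.exp()   # positive standard deviations
        rho = 0.999 * torch.tanh(rho_raw)     # correlation kept inside (-1, 1)
        return mu1, mu2, s1, s2, rho

def neg_log_likelihood(params, y):
    """Mean negative log-density of a bivariate Gaussian with given parameters."""
    mu1, mu2, s1, s2, rho = params
    z1 = (y[:, 0] - mu1) / s1
    z2 = (y[:, 1] - mu2) / s2
    one_m_r2 = 1.0 - rho ** 2
    log_det = torch.log(s1) + torch.log(s2) + 0.5 * torch.log(one_m_r2)
    quad = (z1 ** 2 - 2.0 * rho * z1 * z2 + z2 ** 2) / (2.0 * one_m_r2)
    return (math.log(2.0 * math.pi) + log_det + quad).mean()

# Toy usage: the two targets share a correlation that varies with x.
torch.manual_seed(0)
x = torch.rand(512, 1) * 2.0 - 1.0
rho_true = 0.9 * torch.sin(math.pi * x[:, 0])
e1, e2 = torch.randn(512), torch.randn(512)
y = torch.stack([e1, rho_true * e1 + (1 - rho_true ** 2).sqrt() * e2], dim=1)

model = ConditionalGaussianNet(n_in=1)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(500):
    opt.zero_grad()
    loss = neg_log_likelihood(model(x), y)
    loss.backward()
    opt.step()
```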
Journal Articles
Neural Computation (1995) 7 (1): 117–143.
Published: 01 January 1995
Abstract
Standard techniques for improved generalization from neural networks include weight decay and pruning. Weight decay has a Bayesian interpretation with the decay function corresponding to a prior over weights. The method of transformation groups and maximum entropy suggests a Laplace rather than a gaussian prior. After training, the weights then arrange themselves into two classes: (1) those with a common sensitivity to the data error and (2) those failing to achieve this sensitivity and that therefore vanish. Since the critical value is determined adaptively during training, pruning—in the sense of setting weights to exact zeros—becomes an automatic consequence of regularization alone. The count of free parameters is also reduced automatically as weights are pruned. A comparison is made with results of MacKay using the evidence framework and a gaussian regularizer.
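The sketch below illustrates the pruning mechanism in the simplest possible setting and is not the paper's training procedure: a Laplace prior contributes an L1 penalty to the objective, and a proximal (soft-thresholding) gradient step drives sub-threshold weights to exact zeros. A linear model stands in for a network layer; the penalty strength, data, and step size are assumed for illustration.

```python
# Minimal sketch (NumPy, illustrative only): a Laplace prior over weights adds
# an L1 term lam * sum|w| to the objective.  Proximal gradient descent applies
# a soft-threshold after each data-error step, so weights whose data-error
# gradient stays below the threshold end up at exact zeros.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 20
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.0, 0.5]          # only three weights really matter
y = X @ w_true + 0.1 * rng.normal(size=n)

lam = 0.1                              # strength of the Laplace prior (assumed)
lr = 1e-2                              # step size (assumed)
w = np.zeros(d)

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink towards zero, clamp at zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

for _ in range(2000):
    grad = X.T @ (X @ w - y) / n       # gradient of the data error
    w = soft_threshold(w - lr * grad, lr * lam)

print("nonzero weights:", np.flatnonzero(w))   # typically just the first three
```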