Steven J. Nowlan
Neural Computation (1992) 4 (4): 473–493.
Published: 01 July 1992
Abstract
One way of simplifying neural networks so they generalize better is to add an extra term to the error function that will penalize complexity. Simple versions of this approach include penalizing the sum of the squares of the weights or penalizing the number of nonzero weights. We propose a more complicated penalty term in which the distribution of weight values is modeled as a mixture of multiple gaussians. A set of weights is simple if the weights have high probability density under the mixture model. This can be achieved by clustering the weights into subsets with the weights in each cluster having very similar values. Since we do not know the appropriate means or variances of the clusters in advance, we allow the parameters of the mixture model to adapt at the same time as the network learns. Simulations on two different problems demonstrate that this complexity term is more effective than previous complexity terms.
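A minimal numpy sketch of the penalty described above may help: it computes the negative log-likelihood of the weights under a gaussian mixture and its gradient with respect to the weights. The function name, the use of logaddexp, and the parameter layout are illustrative choices, not the authors' code.

```python
import numpy as np

def soft_weight_sharing_penalty(w, pi, mu, sigma):
    """Negative log-likelihood of the weights under a gaussian mixture.

    w: (n,) flattened network weights.
    pi, mu, sigma: (k,) mixing proportions, means, and standard deviations.
    Returns the penalty and its gradient with respect to the weights.
    """
    diff = w[:, None] - mu[None, :]                    # (n, k)
    log_comp = (np.log(pi)
                - 0.5 * np.log(2.0 * np.pi * sigma**2)
                - 0.5 * diff**2 / sigma**2)            # log of pi_j * N(w_i | mu_j, sigma_j)
    log_mix = np.logaddexp.reduce(log_comp, axis=1)    # log p(w_i) under the mixture
    resp = np.exp(log_comp - log_mix[:, None])         # responsibilities r_ij
    grad_w = (resp * diff / sigma**2).sum(axis=1)      # d(-log p)/dw_i
    return -log_mix.sum(), grad_w
```

In training, this penalty would be scaled and added to the data error, and pi, mu, and sigma would receive gradient updates of their own so that, as the abstract describes, the clusters adapt to wherever the weights congregate.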
Neural Computation (1991) 3 (1): 79–87.
Published: 01 March 1991
Abstract
We present a new supervised learning procedure for systems composed of many separate networks, each of which learns to handle a subset of the complete set of training cases. The new procedure can be viewed either as a modular version of a multilayer supervised network, or as an associative version of competitive learning. It therefore provides a new link between these two apparently different approaches. We demonstrate that the learning procedure divides up a vowel discrimination task into appropriate subtasks, each of which can be solved by a very simple expert network.
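One standard way to write down such a system is a set of expert networks combined by a softmax gating network, trained with a competitive mixture-style objective. The sketch below uses linear experts and a single input vector for brevity; the loss shown, the class name, and all parameters are illustrative assumptions, not the paper's exact architecture or training procedure.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class MixtureOfExperts:
    """Linear experts plus a softmax gating network, for a single input x."""

    def __init__(self, n_in, n_out, n_experts, seed=0):
        rng = np.random.default_rng(seed)
        self.W_experts = rng.normal(0.0, 0.1, (n_experts, n_out, n_in))
        self.W_gate = rng.normal(0.0, 0.1, (n_experts, n_in))

    def forward(self, x):
        g = softmax(self.W_gate @ x)   # gating probabilities, one per expert
        o = self.W_experts @ x         # every expert's output, (n_experts, n_out)
        return g, o

    def loss(self, x, y):
        # Mixture-style objective: -log sum_j g_j exp(-||y - o_j||^2 / 2).
        # Minimizing it encourages a single expert to take responsibility
        # for each training case, which drives the split into subtasks.
        g, o = self.forward(x)
        log_p = np.log(g) - 0.5 * ((y - o) ** 2).sum(axis=1)
        return -np.logaddexp.reduce(log_p)
```

Because the objective rewards one expert per case rather than a blended average, each expert can stay very simple, as in the vowel-discrimination demonstration.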
Neural Computation (1990) 2 (3): 355–362.
Published: 01 September 1990
Abstract
An algorithm that is widely used for adaptive equalization in current modems is the “bootstrap” or “decision-directed” version of the Widrow-Hoff rule. We show that this algorithm can be viewed as an unsupervised clustering algorithm in which the data points are transformed so that they form two clusters that are as tight as possible. The standard algorithm performs gradient ascent in a crude model of the log likelihood of generating the transformed data points from two gaussian distributions with fixed centers. Better convergence is achieved by using the exact gradient of the log likelihood.
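A sketch of the two updates being compared, assuming a scalar filter output and nominal symbol values of ±1 (the centers, the variance parameter sigma, and the function names are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def decision_directed_lms(w, x, eta=0.01):
    """One bootstrap (decision-directed) Widrow-Hoff update.

    The filter output y = w @ x is pulled toward the nearer of the two
    nominal symbol values +1 / -1, with no knowledge of the true symbol.
    """
    y = w @ x
    target = np.sign(y)            # hard decision stands in for the teacher
    return w + eta * (target - y) * x

def exact_gradient_update(w, x, eta=0.01, sigma=0.5):
    """Update using the exact gradient of the two-gaussian log likelihood.

    Models y as drawn from two gaussians with fixed centers at +1 and -1;
    a soft responsibility replaces the hard sign decision.
    """
    y = w @ x
    # d/dy log[N(y; 1, s^2) + N(y; -1, s^2)] = (tanh(y / s^2) - y) / s^2
    grad = (np.tanh(y / sigma**2) - y) / sigma**2
    return w + eta * grad * x
```

Under these assumptions, the hard sign() decision is the crude model of the likelihood gradient; the tanh form weights ambiguous outputs near zero softly and is the exact gradient that the abstract reports converging better.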