Rodney M. Goodman
Journal Articles
Publisher: Journals Gateway
Neural Computation (1993) 5 (6): 976–990.
Published: 01 November 1993
Abstract
Recent work has shown that recurrent neural networks have the ability to learn finite state automata from examples. In particular, networks using second-order units have been successful at this task. In studying the performance and learning behavior of such networks we have found that the second-order network model attempts to form clusters in activation space as its internal representation of states. However, these learned states become unstable as longer and longer test input strings are presented to the network. In essence, the network “forgets” where the individual states are in activation space. In this paper we propose a new method to force such a network to learn stable states by introducing discretization into the network and using a pseudo-gradient learning rule to perform training. The essence of the learning rule is that in doing gradient descent, it makes use of the gradient of a sigmoid function as a heuristic hint in place of that of the hard-limiting function, while still using the discretized value in the feedback update path. The new structure uses isolated points in activation space instead of vague clusters as its internal representation of states. It is shown to have similar capabilities in learning finite state automata as the original network, but without the instability problem. The proposed pseudo-gradient learning rule may also be used as a basis for training other types of networks that have hard-limiting threshold activation functions.
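The substitution at the heart of the pseudo-gradient rule can be sketched on a much smaller problem than the one the abstract addresses. The example below is an illustration under stated assumptions, not the paper's algorithm: it trains a single hard-limiting threshold unit (not a second-order recurrent network) on the AND function. The forward path uses the discretized output, and the weight update uses the sigmoid's derivative where the hard limiter's true gradient would be zero almost everywhere. The names `train_threshold_unit`, `predict`, `AND_DATA`, the squared-error form of the update, and the learning rate are all hypothetical choices made for this sketch.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hard_limit(x):
    """Discretized activation: 1.0 for positive net input, else 0.0."""
    return 1.0 if x > 0.0 else 0.0


def pseudo_gradient(x):
    """Sigmoid derivative, used as a stand-in for the hard limiter's gradient."""
    s = sigmoid(x)
    return s * (1.0 - s)


def net_input(weights, x):
    # The last weight is a bias term (assumed layout for this sketch).
    return sum(w * xi for w, xi in zip(weights, x)) + weights[-1]


def predict(weights, x):
    return hard_limit(net_input(weights, x))


def train_threshold_unit(samples, lr=1.0, max_epochs=100):
    """Train one threshold unit; stop early once every sample is classified."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * (n_inputs + 1)
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in samples:
            net = net_input(weights, x)
            output = hard_limit(net)           # forward path stays discrete
            error = target - output
            if error != 0.0:
                mistakes += 1
                step = lr * error * pseudo_gradient(net)  # sigmoid slope as hint
                for i, xi in enumerate(x):
                    weights[i] += step * xi
                weights[-1] += step
        if mistakes == 0:
            break
    return weights


AND_DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 0.0), ((1.0, 0.0), 0.0), ((1.0, 1.0), 1.0)]

if __name__ == "__main__":
    w = train_threshold_unit(AND_DATA)
    print([predict(w, x) for x, _ in AND_DATA])
```

In the recurrent setting described in the abstract, the same discretized output would also be fed back as the next state; that feedback path is omitted here to keep the sketch short.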
Journal Articles
Publisher: Journals Gateway
Neural Computation (1992) 4 (6): 781–804.
Published: 01 November 1992
Abstract
In this paper we propose a network architecture that combines a rule-based approach with that of the neural network paradigm. Our primary motivation for this is to ensure that the knowledge embodied in the network is explicitly encoded in the form of understandable rules. This enables the network's decision to be understood, and provides an audit trail of how that decision was arrived at. We utilize an information theoretic approach to learning a model of the domain knowledge from examples. This model takes the form of a set of probabilistic conjunctive rules between discrete input evidence variables and output class variables. These rules are then mapped onto the weights and nodes of a feedforward neural network resulting in a directly specified architecture. The network acts as a parallel Bayesian classifier, but more importantly, can also output posterior probability estimates of the class variables. Empirical tests on a number of data sets show that the rule-based classifier performs comparably with standard neural network classifiers, while possessing unique advantages in terms of knowledge representation and probability estimation.
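One way to picture the mapping from conjunctive rules to a feedforward network is sketched below. This is an assumption-laden illustration, not the paper's construction: each rule is treated as a hidden unit that fires when all of its conditions match the evidence, each output unit adds a log-prior bias to the weights of the fired rules for its class, and a normalized exponential turns the scores into posterior estimates. The rule set, the weights, the priors, and the names `RULES`, `LOG_PRIORS`, `rule_fires`, and `posterior_estimates` are invented for the example.

```python
import math

# Hypothetical rules: (conjunction of input conditions, class, evidence weight).
RULES = [
    ({"x1": "high", "x2": "on"}, "A", 1.5),
    ({"x1": "low"}, "B", 0.8),
    ({"x2": "on"}, "B", 0.4),
]

# Hypothetical class priors, stored as log-probabilities (output-unit biases).
LOG_PRIORS = {"A": math.log(0.3), "B": math.log(0.7)}


def rule_fires(conditions, evidence):
    """A rule unit fires only if every condition in its conjunction is met."""
    return all(evidence.get(name) == value for name, value in conditions.items())


def posterior_estimates(evidence, rules=RULES, log_priors=LOG_PRIORS):
    """Sum the evidence of fired rules per class, then normalize the scores."""
    scores = dict(log_priors)
    for conditions, cls, weight in rules:
        if rule_fires(conditions, evidence):
            scores[cls] += weight
    top = max(scores.values())  # subtract the max for numerical stability
    exp_scores = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exp_scores.values())
    return {c: v / total for c, v in exp_scores.items()}


if __name__ == "__main__":
    print(posterior_estimates({"x1": "high", "x2": "on"}))
    print(posterior_estimates({}))
```

Because each fired rule is recorded explicitly, the list of rules that contributed to a score can serve as the kind of audit trail the abstract mentions. With no evidence, no rule fires and the estimates fall back to the assumed priors.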