Robert A. Jacobs
Neural Computation (2006) 18 (10): 2320–2342.
Published: 01 October 2006
Abstract
We consider the properties of motor components, also known as synergies, arising from a computational theory (in the sense of Marr, 1982) of optimal motor behavior. An actor's goals were formalized as cost functions, and the optimal control signals minimizing the cost functions were calculated. Optimal synergies were derived from these optimal control signals using a variant of nonnegative matrix factorization. This was done using two different simulated two-joint arms—an arm controlled directly by torques applied at the joints and an arm in which forces were applied by muscles—and two types of motor tasks—reaching tasks and via-point tasks. Studies of the motor synergies reveal several interesting findings. First, optimal motor actions can be generated by summing a small number of scaled and time-shifted motor synergies, indicating that optimal movements can be planned in a low-dimensional space by using optimal motor synergies as motor primitives or building blocks. Second, some optimal synergies are task independent—they arise regardless of the task context—whereas other synergies are task dependent—they arise in the context of one task but not in the contexts of other tasks. Biological organisms use a combination of task-independent and task-dependent synergies. Our work suggests that this may be an efficient combination for generating optimal motor actions from motor primitives. Third, optimal motor actions can be rapidly acquired by learning new linear combinations of optimal motor synergies. This result provides further evidence that optimal motor synergies are useful motor primitives. Fourth, synergies with similar properties arise regardless of whether one uses an arm controlled by torques applied at the joints or an arm controlled by muscles, suggesting that synergies, when considered in “movement space,” are more a reflection of task goals and constraints than of fine details of the underlying hardware.
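A minimal sketch of the factorization step, using scikit-learn's standard NMF as a stand-in for the time-shifted variant described above (the data, dimensions, and component count are illustrative placeholders):

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)

# Hypothetical data: 50 movements, each a nonnegative control signal
# flattened over 2 joints x 100 time steps.
controls = rng.random((50, 200))

# Factor controls ~ coefficients @ synergies with a small number of
# components, so each movement is a scaled sum of shared synergies.
model = NMF(n_components=4, init="nndsvda", max_iter=500, random_state=0)
coefficients = model.fit_transform(controls)  # (50, 4) per-movement scalings
synergies = model.components_                 # (4, 200) shared building blocks

# A low reconstruction error would indicate that a few synergies suffice,
# i.e., that the movements live in a low-dimensional space.
reconstruction = coefficients @ synergies
relative_error = np.linalg.norm(controls - reconstruction) / np.linalg.norm(controls)
print(f"relative reconstruction error: {relative_error:.3f}")
```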
Neural Computation (2006) 18 (3): 660–682.
Published: 01 March 2006
Abstract
Investigators debate the extent to which neural populations use pairwise and higher-order statistical dependencies among neural responses to represent information about a visual stimulus. To study this issue, three statistical decoders were used to extract the information in the responses of model neurons about the binocular disparities present in simulated pairs of left-eye and right-eye images: (1) the full joint probability decoder considered all possible statistical relations among neural responses as potentially important; (2) the dependence tree decoder also considered all possible relations as potentially important, but it approximated high-order statistical correlations using a computationally tractable procedure; and (3) the independent response decoder assumed that neural responses are statistically independent, meaning that all correlations are zero and can be ignored. Simulation results indicate that high-order correlations among model neuron responses contain significant information about binocular disparities and that the amount of this high-order information increases rapidly as a function of neural population size. Furthermore, the results highlight the potential importance of the dependence tree decoder to neuroscientists as a powerful but still practical way of approximating high-order correlations among neural responses.
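Two of the three decoders are easy to contrast in code. The sketch below binarizes synthetic "neural" responses and keeps the population small enough that the full joint table is tractable; all distributions are invented for illustration, not the paper's model neurons:

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_stimuli, n_train = 5, 2, 2000

def sample(stim, n):
    # A shared latent bit induces correlations; its probability depends on
    # the stimulus, so both marginals and correlations are informative.
    base = rng.random((n, 1)) < (0.3 + 0.4 * stim)
    noise = rng.random((n, n_neurons)) < 0.3
    return (base ^ noise).astype(int)

train = {s: sample(s, n_train) for s in range(n_stimuli)}

# Full joint probability decoder: empirical pattern counts per stimulus,
# with add-one smoothing over the 2**n_neurons possible patterns.
counts = {s: {} for s in range(n_stimuli)}
for s in range(n_stimuli):
    for pattern in map(tuple, train[s]):
        counts[s][pattern] = counts[s].get(pattern, 0) + 1

def joint_loglik(s, r):
    return np.log((counts[s].get(tuple(r), 0) + 1) / (n_train + 2 ** n_neurons))

# Independent response decoder: product of per-neuron marginals.
def indep_loglik(s, r):
    p = np.clip(train[s].mean(axis=0), 1e-3, 1 - 1e-3)
    return np.sum(r * np.log(p) + (1 - r) * np.log(1 - p))

r = sample(1, 1)[0]
print("joint decoder picks:", max(range(n_stimuli), key=lambda s: joint_loglik(s, r)))
print("independent decoder picks:", max(range(n_stimuli), key=lambda s: indep_loglik(s, r)))
```

The dependence tree decoder would sit between these two, approximating the joint distribution with a tree of pairwise conditionals rather than the full pattern table.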
Neural Computation (2003) 15 (9): 2051–2065.
Published: 01 September 2003
Abstract
Bernstein (1967) suggested that people attempting to learn to perform a difficult motor task try to ameliorate the degrees-of-freedom problem through the use of a developmental progression. Early in training, people maintain a subset of their control parameters (e.g., joint positions) at constant settings and attempt to learn to perform the task by varying the values of the remaining parameters. With practice, people refine and improve this early-learned control strategy by also varying those parameters that were initially held constant. We evaluated Bernstein's proposed developmental progression using six neural network systems and found that a network whose training included developmental progressions of both its trajectory and its feedback gains outperformed all other systems. These progressions, however, yielded performance benefits only on motor tasks that were relatively difficult to learn. We conclude that development can indeed aid motor learning.
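The core manipulation is easy to state in code: hold a subset of parameters at fixed values early in training and release them later. A toy quadratic objective stands in here for the paper's motor-control networks; the split point and schedule are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
target = rng.normal(size=10)
params = np.zeros(10)

def loss_grad(p):
    return p - target          # gradient of 0.5 * ||p - target||^2

frozen = np.zeros(10, dtype=bool)
frozen[5:] = True              # freeze half the degrees of freedom at first

for step in range(200):
    if step == 100:
        frozen[:] = False      # later in training, free all parameters
    grad = loss_grad(params)
    grad[frozen] = 0.0         # frozen parameters receive no update
    params -= 0.1 * grad

print("final error:", np.linalg.norm(params - target))
```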
Neural Computation (2003) 15 (4): 761–781.
Published: 01 April 2003
Abstract
We consider the hypothesis that systems learning aspects of visual perception may benefit from the use of suitably designed developmental progressions during training. Four models were trained to estimate motion velocities in sequences of visual images. Three of the models were developmental models in the sense that the nature of their visual input changed during the course of training. These models received a relatively impoverished visual input early in training, and the quality of this input improved as training progressed. One model used a coarse-to-multiscale developmental progression (it received coarse-scale motion features early in training and finer-scale features were added to its input as training progressed), another model used a fine-to-multiscale progression, and the third model used a random progression. The final model was nondevelopmental in the sense that the nature of its input remained the same throughout the training period. The simulation results show that the coarse-to-multiscale model performed best. Hypotheses are offered to account for this model's superior performance, and simulation results evaluating these hypotheses are reported. We conclude that suitably designed developmental sequences can be useful to systems learning to estimate motion velocities. The idea that visual development can aid visual learning is a viable hypothesis in need of further study.
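One way to sketch a coarse-to-multiscale progression is to blur the input heavily early in training and add finer scales over time. Here scipy's Gaussian filter stands in for the paper's motion-feature channels, and the stage schedule and sigmas are invented for illustration:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def scales_for_stage(stage):
    # Stage 0: coarsest scale only; later stages add progressively finer scales.
    all_sigmas = [8.0, 4.0, 2.0, 1.0]   # coarse -> fine
    return all_sigmas[: stage + 1]

def multiscale_input(frame, stage):
    # Stack one blurred copy of the frame per currently available scale.
    return np.stack([gaussian_filter(frame, s) for s in scales_for_stage(stage)])

frame = np.random.default_rng(3).random((64, 64))
for stage in range(4):
    x = multiscale_input(frame, stage)
    print(f"stage {stage}: input channels = {x.shape[0]}")
```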
Neural Computation (2003) 15 (1): 161–182.
Published: 01 January 2003
Abstract
This article considers the hypothesis that systems learning aspects of visual perception may benefit from the use of suitably designed developmental progressions during training. We report the results of simulations in which four models were trained to detect binocular disparities in pairs of visual images. Three of the models were developmental models in the sense that the nature of their visual input changed during the course of training. These models received a relatively impoverished visual input early in training, and the quality of this input improved as training progressed. One model used a coarse-scale-to-multiscale developmental progression, another used a fine-scale-to-multiscale progression, and the third used a random progression. The final model was nondevelopmental in the sense that the nature of its input remained the same throughout the training period. The simulation results show that the two developmental models whose progressions were organized by spatial frequency content consistently outperformed the nondevelopmental and random developmental models. We speculate that the superior performance of these two models is due to two important features of their developmental progressions: (1) these models were exposed to visual inputs at a single scale early in training, and (2) the spatial scale of their inputs progressed in an orderly fashion from one scale to a neighboring scale during training. Simulation results consistent with these speculations are presented. We conclude that suitably designed developmental sequences can be useful to systems learning to detect binocular disparities. The idea that visual development can aid visual learning is a viable hypothesis in need of study.
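The three developmental orderings reduce to different schedules over a fixed set of spatial scales, sketched below (scale indices are illustrative; 0 is coarsest):

```python
import random

SCALES = [0, 1, 2, 3]   # coarse -> fine spatial-frequency channels

def coarse_to_multiscale(stage):
    return SCALES[: stage + 1]           # start coarse, add finer scales

def fine_to_multiscale(stage):
    return SCALES[::-1][: stage + 1]     # start fine, add coarser scales

def random_progression(stage, seed=0):
    order = SCALES[:]
    random.Random(seed).shuffle(order)   # arbitrary, non-orderly order
    return order[: stage + 1]

for stage in range(4):
    print(stage, coarse_to_multiscale(stage),
          fine_to_multiscale(stage), random_progression(stage))
```

Only the first two schedules satisfy the properties the abstract singles out: a single scale early in training, and orderly movement from one scale to a neighboring scale.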
Neural Computation (2002) 14 (10): 2415–2437.
Published: 01 October 2002
Abstract
Previous researchers developed new learning architectures for sequential data by extending conventional hidden Markov models through the use of distributed state representations. Although exact inference and parameter estimation in these architectures are computationally intractable, Ghahramani and Jordan (1997) showed that approximate inference and parameter estimation in one such architecture, factorial hidden Markov models (FHMMs), is feasible in certain circumstances. However, the learning algorithm proposed by these investigators, based on variational techniques, is difficult to understand and implement and is limited to the study of real-valued data sets. This article proposes an alternative method for approximate inference and parameter estimation in FHMMs based on the perspective that FHMMs are a generalization of a well-known class of statistical models known as generalized additive models (GAMs; Hastie & Tibshirani, 1990). Using existing statistical techniques for GAMs as a guide, we have developed the generalized backfitting algorithm. This algorithm computes customized error signals for each hidden Markov chain of an FHMM and then trains each chain one at a time using conventional techniques from the hidden Markov models literature. Relative to previous perspectives on FHMMs, we believe that the viewpoint taken here has a number of advantages. First, it places FHMMs on firm statistical foundations by relating them to a class of models that are well studied in the statistics community, yet it generalizes this class of models in an interesting way. Second, it leads to an understanding of how FHMMs can be applied to many different types of time-series data, including Bernoulli and multinomial data, not just data that are real valued. Finally, it leads to an effective learning procedure for FHMMs that is easier to understand and easier to implement than existing learning procedures. Simulation results suggest that FHMMs trained with the generalized backfitting algorithm are a practical and powerful tool for analyzing sequential data.
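The generalized backfitting loop follows the classic backfitting algorithm for additive models: fit each component, one at a time, to the partial residual left by the others. A full FHMM is beyond a short sketch, so the version below uses kernel smoothers as the components; in the paper's algorithm, each component would instead be a hidden Markov chain trained on its customized error signal:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
x = rng.uniform(-1, 1, size=(n, 2))
y = np.sin(3 * x[:, 0]) + x[:, 1] ** 2 + rng.normal(0, 0.1, n)

def smooth(xj, residual, bandwidth=0.2):
    # Nadaraya-Watson-style local average of the residual against one input.
    w = np.exp(-0.5 * ((xj[:, None] - xj[None, :]) / bandwidth) ** 2)
    return (w @ residual) / w.sum(axis=1)

components = [np.zeros(n), np.zeros(n)]
for sweep in range(10):
    for j in range(2):
        # Customized error signal for component j: what the others leave over.
        partial_residual = y - y.mean() - sum(components[k] for k in range(2) if k != j)
        components[j] = smooth(x[:, j], partial_residual)
        components[j] -= components[j].mean()   # identifiability constraint

fit = y.mean() + components[0] + components[1]
print("RMSE:", np.sqrt(np.mean((y - fit) ** 2)))
```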
Neural Computation (1999) 11 (6): 1297–1330.
Published: 15 August 1999
Abstract
Three models of visual cue combination were simulated: a weak fusion model, a modified weak model, and a strong model. Their relative strengths and weaknesses are evaluated on the basis of their performances on the tasks of judging the depth and shape of an ellipse. The models differ in the amount of interaction that they permit among the cues of stereo, motion, and vergence angle. Results suggest that the constrained nonlinear interaction of the modified weak model allows better performance than either the linear interaction of the weak model or the unconstrained nonlinear interaction of the strong model. Further examination of the modified weak model revealed that its weighting of motion and stereo cues was dependent on the task, the viewing distance, and, to a lesser degree, the noise model. Although the dependencies were sensible from a computational viewpoint, they were sometimes inconsistent with psychophysical experimental data. In a second set of experiments, the modified weak model was given contradictory motion and stereo information. One cue was informative in the sense that it indicated an ellipse, while the other cue indicated a flat surface. The modified weak model rapidly reweighted its use of stereo and motion cues as a function of each cue's informativeness. Overall, the simulation results suggest that relative to the weak and strong models, the modified weak fusion model is a good candidate model of the combination of motion, stereo, and vergence angle cues, although the results also highlight areas in which this model needs modification or further elaboration.
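The weak fusion baseline is a reliability-weighted linear combination of per-cue estimates, sketched below with invented numbers; the modified weak model additionally allows the weights to depend on task, viewing distance, and cue informativeness:

```python
import numpy as np

def weak_fusion(estimates, variances):
    """Reliability-weighted linear combination of per-cue estimates."""
    estimates = np.asarray(estimates, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)  # inverse variance
    weights /= weights.sum()
    return weights @ estimates, weights

# Hypothetical depth estimates (cm) from stereo and motion, with stereo
# the noisier cue at a long viewing distance.
depth, w = weak_fusion(estimates=[52.0, 48.0], variances=[4.0, 1.0])
print(f"fused depth = {depth:.1f} cm, weights = {w.round(2)}")
```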
Neural Computation (1997) 9 (2): 369–383.
Published: 15 February 1997
Abstract
This article investigates the bias and variance of mixtures-of-experts (ME) architectures. The variance of an ME architecture can be expressed as the sum of two terms: the first term is related to the variances of the expert networks that comprise the architecture and the second term is related to the expert networks' covariances. One goal of this article is to study and quantify a number of properties of ME architectures via the metrics of bias and variance. A second goal is to clarify the relationships between this class of systems and other systems that have recently been proposed. It is shown that in contrast to systems that produce unbiased experts whose estimation errors are uncorrelated, ME architectures produce biased experts whose estimates are negatively correlated.
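The decomposition is straightforward to verify numerically: the variance of an averaged committee splits into a term from the members' individual variances and a term from their covariances. The toy experts below share a noise component, so the covariance term is positive; negatively correlated experts, as ME architectures are said to produce, would make that term negative and shrink the committee's variance:

```python
import numpy as np

rng = np.random.default_rng(5)
truth = 1.0
n_experts, n_runs = 3, 20000

# Toy experts: biased estimators with correlated noise (a shared component).
shared = rng.normal(0, 0.3, size=(n_runs, 1))
own = rng.normal(0, 0.3, size=(n_runs, n_experts))
estimates = truth + 0.1 + shared + own        # each expert biased by +0.1

combined = estimates.mean(axis=1)
cov = np.cov(estimates, rowvar=False)
var_term = np.trace(cov) / n_experts ** 2                 # experts' variances
cov_term = (cov.sum() - np.trace(cov)) / n_experts ** 2   # their covariances

print("Var(combined)                  =", combined.var())
print("variance term + covariance term =", var_term + cov_term)
print("bias of combined estimate       =", combined.mean() - truth)
```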
Neural Computation (1995) 7 (5): 867–888.
Published: 01 September 1995
Abstract
This article reviews statistical techniques for combining multiple probability distributions. The framework is that of a decision maker who consults several experts regarding some events. The experts express their opinions in the form of probability distributions. The decision maker must aggregate the experts' distributions into a single distribution that can be used for decision making. Two classes of aggregation methods are reviewed. When using a supra Bayesian procedure, the decision maker treats the expert opinions as data that may be combined with its own prior distribution via Bayes' rule. When using a linear opinion pool, the decision maker forms a linear combination of the expert opinions. The major feature that makes the aggregation of expert opinions difficult is the high correlation or dependence that typically occurs among these opinions. A theme of this paper is the need for training procedures that result in experts with relatively independent opinions or for aggregation methods that implicitly or explicitly model the dependence among the experts. Analyses are presented that show that m dependent experts are worth the same as k independent experts where k ≤ m. In some cases, an exact value for k can be given; in other cases, lower and upper bounds can be placed on k.
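A linear opinion pool is a one-liner; the sketch below uses uniform weights for illustration and includes two near-duplicate experts to echo the dependence problem discussed above:

```python
import numpy as np

def linear_opinion_pool(expert_dists, weights=None):
    """Combine expert probability distributions over the same events."""
    P = np.asarray(expert_dists, dtype=float)       # shape (experts, events)
    w = np.full(len(P), 1.0 / len(P)) if weights is None else np.asarray(weights)
    pooled = w @ P                                  # convex combination
    return pooled / pooled.sum()                    # guard against rounding

# Three experts' distributions over three events; experts 1 and 2 give
# highly similar (dependent) opinions, so the second adds little beyond
# the first.
experts = [[0.70, 0.20, 0.10],
           [0.68, 0.22, 0.10],
           [0.20, 0.30, 0.50]]
print(linear_opinion_pool(experts).round(3))
```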
Neural Computation (1994) 6 (2): 181–214.
Published: 01 March 1994
Abstract
We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIMs). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an on-line learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.
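A flat, one-level version of the model conveys the EM structure: softmax-gated linear-Gaussian experts, posterior responsibilities in the E-step, and weighted least squares per expert in the M-step. The full architecture nests such mixtures hierarchically, and the gate here is updated by gradient steps for brevity; all data and hyperparameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 400
x = rng.uniform(-1, 1, n)
y = np.where(x < 0, 2 * x + 1, -x)             # piecewise-linear target
X = np.column_stack([x, np.ones(n)])           # inputs with bias term

K, sigma = 2, 0.3
V = rng.normal(size=(K, 2))                    # gating-network parameters
W = rng.normal(size=(K, 2))                    # expert parameters

for it in range(50):
    # E-step: posterior responsibility of each expert for each case.
    g = np.exp(X @ V.T); g /= g.sum(1, keepdims=True)       # gate priors
    lik = np.exp(-0.5 * ((y[:, None] - X @ W.T) / sigma) ** 2)
    h = g * lik; h /= h.sum(1, keepdims=True)               # responsibilities

    # M-step: weighted least squares for each expert ...
    for k in range(K):
        Xw = X * h[:, k:k + 1]
        W[k] = np.linalg.solve(Xw.T @ X + 1e-6 * np.eye(2), Xw.T @ y)
    # ... and a few gradient steps moving the gate toward the responsibilities.
    for _ in range(10):
        g = np.exp(X @ V.T); g /= g.sum(1, keepdims=True)
        V += 0.5 * (h - g).T @ X / n

g = np.exp(X @ V.T); g /= g.sum(1, keepdims=True)
pred = (g * (X @ W.T)).sum(axis=1)
print("RMSE:", np.sqrt(np.mean((y - pred) ** 2)))
```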
Neural Computation (1991) 3 (1): 79–87.
Published: 01 March 1991
Abstract
We present a new supervised learning procedure for systems composed of many separate networks, each of which learns to handle a subset of the complete set of training cases. The new procedure can be viewed either as a modular version of a multilayer supervised network, or as an associative version of competitive learning. It therefore provides a new link between these two apparently different approaches. We demonstrate that the learning procedure divides up a vowel discrimination task into appropriate subtasks, each of which can be solved by a very simple expert network.
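The competitive flavor of the procedure comes from its objective, which places each expert's own error inside the log so that the gate tends to assign whole cases to experts rather than blending them. A toy forward pass for one training case contrasts this with the naive blended error (all values illustrative; the noise scale is fixed at 1 for simplicity):

```python
import numpy as np

rng = np.random.default_rng(7)
d = np.array([1.0])                           # desired output for one case

n_experts = 3
expert_out = rng.normal(size=(n_experts, 1))  # o_i: each expert's output
gate_logits = rng.normal(size=n_experts)
p = np.exp(gate_logits); p /= p.sum()         # gating probabilities

# Competitive objective: E = -log sum_i p_i exp(-||d - o_i||^2 / 2).
errs = np.sum((d - expert_out) ** 2, axis=1)
E = -np.log(np.sum(p * np.exp(-errs / 2)))

# Naive alternative, for contrast: error of the blended output, which
# couples the experts' gradients and discourages specialization.
blended = (p[:, None] * expert_out).sum(axis=0)
E_blend = np.sum((d - blended) ** 2)

print(f"competitive E = {E:.3f}, blended error = {E_blend:.3f}")
```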