Tom Heskes
1-7 of 7 journal articles
Neural Computation (2013) 25 (12): 3318–3339.
Published: 01 December 2013
Abstract
Partial least squares (PLS) is a class of methods that makes use of a set of latent or unobserved variables to model the relation between (typically) two sets of input and output variables, respectively. Several flavors, depending on how the latent variables or components are computed, have been developed in recent years. In this letter, we propose a Bayesian formulation of PLS along with some extensions. In a nutshell, we provide sparsity at the input space level and an automatic estimation of the optimal number of latent components. We follow the variational approach to infer the parameter distributions. We have successfully tested the proposed methods on a synthetic data benchmark and on electrocorticogram data associated with several motor outputs in monkeys.
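The Bayesian formulation, the sparsity priors, and the variational inference are specific to the letter, but the latent-variable regression that PLS performs can be sketched with scikit-learn's classical PLSRegression. The synthetic data and the fixed component count below are illustrative assumptions, not the letter's benchmark.

# Classical (non-Bayesian) PLS sketch: latent components link an input block X
# to an output block Y. Data, sizes, and n_components are illustrative only.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))                    # input variables (e.g., recording channels)
W_true = rng.normal(size=(30, 3))                 # low-rank ground-truth mapping
Y = X @ W_true + 0.1 * rng.normal(size=(200, 3))  # output variables (e.g., motor outputs)

pls = PLSRegression(n_components=3)               # component count fixed by hand here;
pls.fit(X, Y)                                     # the Bayesian variant estimates it automatically
print("training R^2:", pls.score(X, Y))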
Neural Computation (2010) 22 (12): 3127–3142.
Published: 01 December 2010
Abstract
Recent research has shown that reconstruction of perceived images based on hemodynamic response as measured with functional magnetic resonance imaging (fMRI) is starting to become feasible. In this letter, we explore reconstruction based on a learned hierarchy of features by employing a hierarchical generative model that consists of conditional restricted Boltzmann machines. In an unsupervised phase, we learn a hierarchy of features from data, and in a supervised phase, we learn how brain activity predicts the states of those features. Reconstruction is achieved by sampling from the model, conditioned on brain activity. We show that by using the hierarchical generative model, we can obtain good-quality reconstructions of visual images of handwritten digits presented during an fMRI scanning session.
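The full model couples conditional restricted Boltzmann machines into a hierarchy and conditions the sampling on brain activity; as a much smaller sketch, the block-Gibbs step of a plain binary RBM, the building block being stacked, can be written as follows (NumPy, with random weights and shapes chosen purely for illustration).

# One block-Gibbs step in a binary RBM: sample hidden features given an image,
# then resample the image given those features. Not the conditional/hierarchical
# model of the letter; weights and shapes are illustrative assumptions.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_visible, n_hidden = 784, 256                     # e.g., 28x28 digit pixels, hidden features
W = 0.01 * rng.normal(size=(n_visible, n_hidden))  # visible-hidden weights
b_v = np.zeros(n_visible)                          # visible biases
b_h = np.zeros(n_hidden)                           # hidden biases

v = rng.integers(0, 2, size=n_visible).astype(float)   # start from a random binary image
p_h = sigmoid(v @ W + b_h)                              # P(h = 1 | v)
h = (rng.random(n_hidden) < p_h).astype(float)          # sample hidden units
p_v = sigmoid(h @ W.T + b_v)                            # P(v = 1 | h)
v_next = (rng.random(n_visible) < p_v).astype(float)    # sample a reconstruction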
Neural Computation (2004) 16 (11): 2379–2413.
Published: 01 November 2004
Abstract
We derive sufficient conditions for the uniqueness of loopy belief propagation fixed points. These conditions depend on both the structure of the graph and the strength of the potentials and naturally extend those for convexity of the Bethe free energy. We compare them with (a strengthened version of) conditions derived elsewhere for pairwise potentials. We discuss possible implications for convergent algorithms, as well as for other approximate free energies.
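For reference, the fixed points in question are those of the standard sum-product (loopy belief propagation) updates on a pairwise Markov random field. In generic notation (not copied from the article), the message from node i to neighbor j and the resulting belief are

m_{i \to j}(x_j) \propto \sum_{x_i} \psi_{ij}(x_i, x_j)\, \psi_i(x_i) \prod_{k \in N(i) \setminus \{j\}} m_{k \to i}(x_i), \qquad b_i(x_i) \propto \psi_i(x_i) \prod_{k \in N(i)} m_{k \to i}(x_i),

where \psi_i and \psi_{ij} are the single-node and pairwise potentials and N(i) is the set of neighbors of i; a fixed point is a set of messages left unchanged by the update.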
Neural Computation (2000) 12 (4): 881–901.
Published: 01 April 2000
Abstract
Several studies have shown that natural gradient descent for on-line learning is much more efficient than standard gradient descent. In this article, we derive natural gradients in a slightly different manner and discuss implications for batch-mode learning and pruning, linking them to existing algorithms such as Levenberg-Marquardt optimization and optimal brain surgeon. The Fisher matrix plays an important role in all these algorithms. The second half of the article discusses a layered approximation of the Fisher matrix specific to multilayered perceptrons. Using this approximation rather than the exact Fisher matrix, we arrive at much faster “natural” learning algorithms and more robust pruning procedures.
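In generic notation (an illustrative summary, not the article's exact derivation), the contrast with plain gradient descent is the preconditioning by the inverse Fisher matrix:

w_{t+1} = w_t - \eta\, \nabla E(w_t) \quad \text{(standard gradient descent)}, \qquad w_{t+1} = w_t - \eta\, F(w_t)^{-1} \nabla E(w_t) \quad \text{(natural gradient descent)},

with Fisher matrix F(w) = \mathbb{E}\big[\nabla_w \log p(y \mid x; w)\, \nabla_w \log p(y \mid x; w)^\top\big].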
Neural Computation (1999) 11 (4): 977–993.
Published: 15 May 1999
Abstract
In this article, we introduce a measure of optimality for architecture selection algorithms for neural networks: the distance from the original network to the new network in a metric defined by the probability distributions of all possible networks. We derive two pruning algorithms, one based on a metric in parameter space and the other based on a metric in neuron space, which are closely related to well-known architecture selection algorithms, such as GOBS. Our framework extends the theoretical range of validity of GOBS and therefore can explain results observed in previous experiments. In addition, we give some computational improvements for these algorithms.
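For context, the classical optimal brain surgeon step that (G)OBS-style pruning builds on can be written, in its standard form (not taken from this article), as: delete the weight w_q with the smallest saliency and adjust the remaining weights,

L_q = \frac{w_q^2}{2\,[H^{-1}]_{qq}}, \qquad \delta w = -\frac{w_q}{[H^{-1}]_{qq}}\, H^{-1} e_q,

where H is the Hessian of the training error and e_q is the unit vector selecting weight q.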
Neural Computation (1998) 10 (6): 1425–1433.
Published: 15 August 1998
Abstract
The bias/variance decomposition of mean-squared error is well understood and relatively straightforward. In this note, a similar simple decomposition is derived, valid for any kind of error measure that, when using the appropriate probability model, can be derived from a Kullback-Leibler divergence or log-likelihood.
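The familiar mean-squared-error case that the note generalizes reads, in standard notation (averaging over training sets D at a fixed input x with target y, and writing \bar{f}(x) = E_D[f_D(x)]):

E_D\big[(y - f_D(x))^2\big] = \underbrace{(y - \bar{f}(x))^2}_{\text{bias}^2} + \underbrace{E_D\big[(f_D(x) - \bar{f}(x))^2\big]}_{\text{variance}}.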
Neural Computation (1996) 8 (8): 1743–1765.
Published: 01 November 1996
Abstract
We study the dynamics of on-line learning for a large class of neural networks and learning rules, including backpropagation for multilayer perceptrons. In this paper, we focus on the case where successive examples are dependent, and we analyze how these dependencies affect the learning process. We define the representation error and the prediction error. The representation error measures how well the environment is represented by the network after learning. The prediction error is the average error that a continually learning network makes on the next example. In the neighborhood of a local minimum of the error surface, we calculate these errors. We find that the more predictable the example presentation, the higher the representation error, i.e., the less accurate the asymptotic representation of the whole environment. Furthermore, we study the learning process in the presence of a plateau. Plateaus are flat spots on the error surface, which can severely slow down the learning process. In particular, they are notorious in applications with multilayer perceptrons. Our results, which are confirmed by simulations of a multilayer perceptron learning a chaotic time series using backpropagation, explain how dependencies between examples can help the learning process to escape from a plateau.
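In generic on-line learning notation (an illustrative paraphrase of the definitions above, not the article's exact formulas), the network is updated after every example x_t,

w_{t+1} = w_t - \eta\, \nabla_w e(w_t; x_t),

and the two errors compare the learned weights against the whole environment versus the next example: the representation error is \lim_{t \to \infty} \mathbb{E}\big[\langle e(w_t; x) \rangle_x\big], with \langle \cdot \rangle_x the average over the environment, while the prediction error is \lim_{t \to \infty} \mathbb{E}\big[e(w_t; x_{t+1})\big].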