Max Welling
1-5 of 5
Neural Computation (2016) 28 (1): 45–70.
Published: 01 January 2016
Abstract
We argue that when faced with big data sets, learning and inference algorithms should compute updates using only subsets of data items. We introduce algorithms that use sequential hypothesis tests to adaptively select such a subset of data points. The statistical properties of this subsampling process can be used to control the efficiency and accuracy of learning or inference. In the context of learning by optimization, we test for the probability that the update direction is no more than 90 degrees in the wrong direction. In the context of posterior inference using Markov chain Monte Carlo, we test for the probability that our decision to accept or reject a sample is wrong. We experimentally evaluate our algorithms on a number of models and data sets.
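To make the sequential-test idea concrete, here is a minimal Python sketch of the optimization case, assuming per-item gradients stacked in a NumPy array; the function name, batch-growth schedule, and test threshold are illustrative rather than the paper's exact procedure:

```python
import numpy as np
from scipy import stats

def confident_update_direction(per_item_grads, direction, eps=0.05, batch=100):
    """Illustrative sketch: grow a random subsample of per-item gradients until a
    one-sided t-test is confident (at level 1 - eps) that their mean projection
    onto `direction` is positive, i.e. `direction` is less than 90 degrees away
    from the full-data mean gradient."""
    n = len(per_item_grads)
    proj = per_item_grads[np.random.permutation(n)] @ direction  # per-item projections
    used = 0
    while True:
        used = min(used + batch, n)
        sample = proj[:used]
        if used == n:                                # data exhausted: decide exactly
            return sample.mean() > 0, used
        t, p_two = stats.ttest_1samp(sample, 0.0)    # H0: mean projection is zero
        p_pos = p_two / 2 if t > 0 else 1.0 - p_two / 2
        if p_pos < eps:                              # confident the angle is < 90 degrees
            return True, used
```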
Neural Computation (2009) 21 (4): 1145–1172.
Published: 01 April 2009
Abstract
We introduce a new class of “maximization-expectation” (ME) algorithms where we maximize over hidden variables but marginalize over random parameters. This reverses the roles of expectation and maximization in the classical expectation-maximization algorithm. In the context of clustering, we argue that these hard assignments open the door to very fast implementations based on data structures such as kd-trees and conga lines. The marginalization over parameters ensures that we retain the ability to infer model structure (i.e., number of clusters). As an important example, we discuss a top-down Bayesian k-means algorithm and a bottom-up agglomerative clustering algorithm. In experiments, we compare these algorithms against a number of alternative algorithms that have recently appeared in the literature.
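As a toy illustration of the ME idea (not the paper's kd-tree or conga-line implementation, and without the model-structure inference), the sketch below hard-assigns points to clusters while marginalizing the cluster means under a conjugate Gaussian prior with an assumed fixed observation variance:

```python
import numpy as np

def me_hard_clustering(X, K, iters=20, sigma2=1.0, prior_var=10.0, seed=0):
    """Toy sketch of maximization-expectation clustering: assignments are
    maximized (hard), cluster means are marginalized under a zero-mean
    Gaussian prior (variance prior_var) with known observation variance sigma2."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    z = rng.integers(K, size=n)                       # random initial hard assignments
    for _ in range(iters):
        logp = np.full((n, K), -np.inf)
        for k in range(K):
            Xk = X[z == k]
            nk = len(Xk)
            # posterior over the (marginalized) mean of cluster k
            post_var = 1.0 / (1.0 / prior_var + nk / sigma2)
            post_mean = post_var * Xk.sum(axis=0) / sigma2 if nk else np.zeros(d)
            # log posterior-predictive density of every point under cluster k
            pred_var = sigma2 + post_var
            logp[:, k] = -0.5 * (((X - post_mean) ** 2).sum(axis=1) / pred_var
                                 + d * np.log(2.0 * np.pi * pred_var))
        z = logp.argmax(axis=1)                       # "M-step" over the hidden variables
    return z
```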
Neural Computation (2006) 18 (2): 381–414.
Published: 01 February 2006
Abstract
We present an energy-based model that uses a product of generalized Student-t distributions to capture the statistical structure in data sets. This model is inspired by and particularly applicable to “natural” data sets such as images. We begin by providing the mathematical framework, where we discuss complete and overcomplete models and provide algorithms for training these models from data. Using patches of natural scenes, we demonstrate that our approach represents a viable alternative to independent component analysis as an interpretive model of biological visual systems. Although the two approaches are similar in flavor, there are also important differences, particularly when the representations are overcomplete. By constraining the interactions within our model, we are also able to study the topographic organization of Gabor-like receptive fields that our model learns. Finally, we discuss the relation of our new approach to previous work—in particular, gaussian scale mixture models and variants of independent components analysis.
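A minimal sketch of what such a product-of-Student-t energy looks like, assuming linear filters stored as rows of a matrix W and positive shape parameters alpha (both of which would be learned from data); the exact parameterization in the paper may differ:

```python
import numpy as np

def pot_energy(x, W, alpha):
    """Sketch of a product-of-Student-t style energy: each row of W is a linear
    filter applied to the data vector x, and alpha holds positive exponents.
    The unnormalized log-density of x is -pot_energy(x, W, alpha)."""
    s = W @ x                                    # filter responses
    return np.sum(alpha * np.log1p(0.5 * s ** 2))
```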
Neural Computation (2004) 16 (1): 197–221.
Published: 01 January 2004
Abstract
Belief propagation (BP) on cyclic graphs is an efficient algorithm for computing approximate marginal probability distributions over single nodes and neighboring nodes in the graph. However, it does not prescribe a way to compute joint distributions over pairs of distant nodes in the graph. In this article, we propose two new algorithms for approximating these pairwise probabilities, based on the linear response theorem. The first is a propagation algorithm that is shown to converge if BP converges to a stable fixed point. The second algorithm is based on matrix inversion. Applying these ideas to gaussian random fields, we derive a propagation algorithm for computing the inverse of a matrix.
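The linear response theorem the two algorithms build on can be illustrated numerically in the Gaussian case: the covariance between nodes i and j equals the sensitivity of the mean at node i to a small perturbation of the field at node j, which is the (i, j) entry of the inverse precision matrix. The finite-difference sketch below (not the paper's propagation algorithm) checks this identity:

```python
import numpy as np

def gaussian_linear_response(J, h, eps=1e-6):
    """Numerical illustration of the linear response theorem for a Gaussian
    random field with precision matrix J and field h: Cov[x_i, x_j] equals
    d mean_i / d h_j, i.e. (J^{-1})_{ij}."""
    mean = np.linalg.solve(J, h)
    n = len(h)
    cov_lr = np.empty((n, n))
    for j in range(n):
        h_pert = h.copy()
        h_pert[j] += eps
        cov_lr[:, j] = (np.linalg.solve(J, h_pert) - mean) / eps  # d mean / d h_j
    return cov_lr   # agrees with np.linalg.inv(J) up to finite-difference error
```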
Neural Computation (2001) 13 (3): 677–689.
Published: 01 March 2001
Abstract
We introduce a novel way of performing independent component analysis using a constrained version of the expectation-maximization (EM) algorithm. The source distributions are modeled as D one-dimensional mixtures of gaussians. The observed data are modeled as linear mixtures of the sources with additive, isotropic noise. This generative model is fit to the data using constrained EM. The simpler “soft-switching” approach is introduced, which uses only one parameter to decide on the sub- or supergaussian nature of the sources. We explain how our approach relates to independent factor analysis.
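For concreteness, here is a sketch of the generative model only (sampling, not the constrained-EM fit); parameter names and shapes are illustrative assumptions:

```python
import numpy as np

def sample_ica_model(A, source_means, source_stds, source_weights, noise_std, n, seed=0):
    """Sketch of the generative model: each of the D sources is a one-dimensional
    mixture of Gaussians, the sources are mixed linearly by A, and isotropic
    Gaussian noise is added. source_* are length-D lists of per-component arrays."""
    rng = np.random.default_rng(seed)
    D = A.shape[1]
    S = np.empty((n, D))
    for d in range(D):
        w = np.asarray(source_weights[d])
        comp = rng.choice(len(w), size=n, p=w)        # mixture component per sample
        S[:, d] = rng.normal(np.asarray(source_means[d])[comp],
                             np.asarray(source_stds[d])[comp])
    X = S @ A.T + noise_std * rng.normal(size=(n, A.shape[0]))   # x = A s + noise
    return X, S
```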