Christopher K. I. Williams
1-7 of 7
Journal Articles
Neural Computation (2023) 35 (4): 727–761.
Published: 18 March 2023
Abstract
Capsule networks (see Hinton et al., 2018) aim to encode knowledge of and reason about the relationship between an object and its parts. In this letter, we specify a generative model for such data and derive a variational algorithm for inferring the transformation of each model object in a scene and the assignments of observed parts to the objects. We derive a learning algorithm for the object models, based on variational expectation maximization (Jordan et al., 1999). We also study an alternative inference algorithm based on the RANSAC method of Fischler and Bolles (1981). We apply these inference methods to data generated from multiple geometric objects like squares and triangles (“constellations”) and data from a parts-based model of faces. Recent work by Kosiorek et al. (2019) has used amortized inference via stacked capsule autoencoders to tackle this problem; our results show that we significantly outperform them where we can make comparisons (on the constellations data).
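As a rough illustration of the RANSAC-style inference mentioned in the abstract (a sketch, not the letter's actual algorithm), the following hypothesises a 2-D translation aligning an object template to observed part locations and scores it by inlier count; the template, tolerance, and iteration count are placeholder assumptions.

import numpy as np

def ransac_translation(template, parts, n_iters=100, tol=0.1, seed=0):
    """Hypothesise a 2-D translation aligning a K x 2 part template to a subset
    of the N x 2 observed part locations, scoring hypotheses by inlier count."""
    rng = np.random.default_rng(seed)
    best_t, best_inliers = None, -1
    for _ in range(n_iters):
        # Minimal sample: pair one template part with one observed part.
        k, n = rng.integers(len(template)), rng.integers(len(parts))
        t = parts[n] - template[k]                      # hypothesised translation
        pred = template + t                             # predicted part locations
        d = np.linalg.norm(pred[:, None, :] - parts[None, :, :], axis=-1)
        inliers = int((d.min(axis=1) < tol).sum())      # parts explained by t
        if inliers > best_inliers:
            best_t, best_inliers = t, inliers
    return best_t, best_inliers

# Toy example: recover the translation of a single square from its corner parts.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
scene = square + np.array([2.0, 3.0])
print(ransac_translation(square, scene))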
Journal Articles
Neural Computation (2022) 34 (10): 2037–2046.
Published: 12 September 2022
Abstract
Barlow (1985) hypothesized that the co-occurrence of two events A and B is “suspicious” if P(A, B) ≫ P(A)P(B). We first review classical measures of association for 2 × 2 contingency tables, including Yule's Y (Yule, 1912), which depends only on the odds ratio λ and is independent of the marginal probabilities of the table. We then discuss the mutual information (MI) and pointwise mutual information (PMI), which depend on the ratio P(A, B)/P(A)P(B), as measures of association. We show that once the effect of the marginals is removed, MI and PMI behave similarly to Y as functions of λ. The pointwise mutual information is used extensively in some research communities for flagging suspicious coincidences. We discuss the pros and cons of using it in this way, bearing in mind the sensitivity of the PMI to the marginals, with increased scores for sparser events.
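For concreteness, the quantities discussed above can be computed directly from a 2 × 2 contingency table of counts. The sketch below uses the standard definitions of the odds ratio λ, Yule's Y = (√λ − 1)/(√λ + 1), the PMI = log P(A, B)/(P(A)P(B)), and the mutual information of the table; the example counts are arbitrary.

import numpy as np

def association_measures(n11, n10, n01, n00):
    """Association measures for a 2 x 2 contingency table of counts,
    where n11 = #(A and B), n10 = #(A and not B), and so on."""
    N = n11 + n10 + n01 + n00
    p = np.array([[n11, n10], [n01, n00]]) / N        # joint probabilities
    pA, pB = p[0].sum(), p[:, 0].sum()                # marginals P(A), P(B)
    lam = (n11 * n00) / (n10 * n01)                   # odds ratio lambda
    yule_Y = (np.sqrt(lam) - 1) / (np.sqrt(lam) + 1)  # Yule's Y
    pmi = np.log2(p[0, 0] / (pA * pB))                # pointwise MI for (A, B)
    # Mutual information of the table (nansum treats empty cells as 0).
    pr, pc = p.sum(axis=1, keepdims=True), p.sum(axis=0, keepdims=True)
    mi = np.nansum(p * np.log2(p / (pr * pc)))
    return {"lambda": lam, "Y": yule_Y, "PMI": pmi, "MI": mi}

print(association_measures(30, 10, 10, 50))  # example counts only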
Journal Articles
Neural Computation (2021) 33 (4): 853–857.
Published: 01 April 2021
Abstract
In this note, I study how the precision of a binary classifier depends on the ratio r of positive to negative cases in the test set, as well as the classifier's true and false-positive rates. This relationship allows prediction of how the precision-recall curve will change with r, which seems not to be well known. It also allows prediction of how Fβ and the precision gain and recall gain measures of Flach and Kull (2015) vary with r.
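The dependence on r can be written out explicitly: with true positive rate t, false positive rate f, and r the ratio of positive to negative cases, precision = t·r/(t·r + f), and Fβ follows from precision and recall (= t) in the usual way. A minimal sketch (the operating point and values of r below are arbitrary examples):

def precision_from_rates(tpr, fpr, r):
    """Precision of a classifier with true/false positive rates (tpr, fpr)
    on a test set with ratio r of positive to negative cases."""
    return tpr * r / (tpr * r + fpr)

def f_beta(tpr, fpr, r, beta=1.0):
    """F_beta computed from the same quantities (recall equals tpr)."""
    p, rec = precision_from_rates(tpr, fpr, r), tpr
    return (1 + beta**2) * p * rec / (beta**2 * p + rec)

# How precision and F1 shift with the class ratio r at a fixed operating point.
for r in (1.0, 0.1, 0.01):
    print(r, precision_from_rates(tpr=0.8, fpr=0.05, r=r), f_beta(0.8, 0.05, r))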
Journal Articles
Neural Computation (2005) 17 (1): 1–6.
Published: 01 January 2005
Abstract
In many areas of data modeling, observations at different locations (e.g., time frames or pixel locations) are augmented by differences of nearby observations (e.g., δ features in speech recognition, Gabor jets in image analysis). These augmented observations are then often modeled as being independent. How can this make sense? We provide two interpretations, showing (1) that the likelihood of data generated from an autoregressive process can be computed in terms of “independent” augmented observations and (2) that the augmented observations can be given a coherent treatment in terms of the products of experts model (Hinton, 1999).
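As a small illustration of the kind of augmentation meant here, the sketch below appends simple first-difference (δ) features to a 1-D sequence; the particular difference stencil and boundary convention are assumptions made for the example, not the paper's construction.

import numpy as np

def augment_with_deltas(x):
    """Augment a 1-D sequence x[t] with first differences d[t] = x[t] - x[t-1].

    The augmented pairs (x[t], d[t]) are the kind of observations that would
    then, approximately, be modelled as independent."""
    x = np.asarray(x, dtype=float)
    d = np.diff(x, prepend=x[0])           # d[0] = 0 by this boundary convention
    return np.stack([x, d], axis=1)        # shape (T, 2)

print(augment_with_deltas([1.0, 1.5, 1.2, 0.9]))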
Journal Articles
Neural Computation (1998) 10 (5): 1203–1216.
Published: 01 July 1998
Abstract
For neural networks with a wide class of weight priors, it can be shown that in the limit of an infinite number of hidden units, the prior over functions tends to a gaussian process. In this article, analytic forms are derived for the covariance function of the gaussian processes corresponding to networks with sigmoidal and gaussian hidden units. This allows predictions to be made efficiently using networks with an infinite number of hidden units and shows, somewhat paradoxically, that it may be easier to carry out Bayesian prediction with infinite networks rather than finite ones.
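For sigmoidal (erf) hidden units with a zero-mean Gaussian prior over input-to-hidden weights and biases, the covariance function usually associated with this analysis takes the arcsine form k(x, x') = (2/π) arcsin(2 x̃ᵀΣx̃' / √((1 + 2 x̃ᵀΣx̃)(1 + 2 x̃'ᵀΣx̃'))), with x̃ = (1, x). The sketch below assumes an isotropic prior Σ = σ²I and placeholder data, so it is an illustration rather than the article's general treatment.

import numpy as np

def erf_kernel(X1, X2, sigma2=1.0):
    """Arcsine covariance for an infinite network of erf hidden units,
    assuming an isotropic N(0, sigma2 I) prior over input weights and bias."""
    aug = lambda X: np.hstack([np.ones((X.shape[0], 1)), X])   # x~ = (1, x)
    A, B = aug(X1), aug(X2)
    S = 2.0 * sigma2 * A @ B.T
    dA = 1.0 + 2.0 * sigma2 * np.sum(A * A, axis=1)
    dB = 1.0 + 2.0 * sigma2 * np.sum(B * B, axis=1)
    return (2.0 / np.pi) * np.arcsin(S / np.sqrt(np.outer(dA, dB)))

def gp_predict(Xtr, ytr, Xte, noise=1e-2, sigma2=1.0):
    """GP regression mean using the infinite-network kernel (placeholder noise)."""
    K = erf_kernel(Xtr, Xtr, sigma2) + noise * np.eye(len(Xtr))
    Ks = erf_kernel(Xte, Xtr, sigma2)
    return Ks @ np.linalg.solve(K, ytr)

Xtr = np.linspace(-2, 2, 10)[:, None]
ytr = np.tanh(Xtr[:, 0])
print(gp_predict(Xtr, ytr, np.array([[0.5]])))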
Journal Articles
Neural Computation (1998) 10 (1): 215–234.
Published: 01 January 1998
Abstract
Latent variable models represent the probability density of data in a space of several dimensions in terms of a smaller number of latent, or hidden, variables. A familiar example is factor analysis, which is based on a linear transformation between the latent space and the data space. In this article, we introduce a form of nonlinear latent variable model called the generative topographic mapping (GTM), for which the parameters of the model can be determined using the expectation-maximization algorithm. GTM provides a principled alternative to the widely used self-organizing map (SOM) of Kohonen (1982) and overcomes most of the significant limitations of the SOM. We demonstrate the performance of the GTM algorithm on a toy problem and on simulated data from flow diagnostics for a multiphase oil pipeline.
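A heavily simplified sketch of one EM iteration of this kind of model, with a fixed latent grid, Gaussian RBF basis functions, and an isotropic noise model; the grid, basis width, regulariser, and initialisation are placeholder assumptions rather than the article's settings.

import numpy as np

def gtm_em_step(X, Z, mu, s2, W, beta):
    """One EM step of a toy GTM: data X (N x D), latent grid Z (K x L),
    RBF centres mu (M x L) with width s2, weights W (D x M), noise precision beta."""
    Phi = np.exp(-((Z[:, None, :] - mu[None, :, :]) ** 2).sum(-1) / (2 * s2))  # K x M
    Y = Phi @ W.T                                           # mixture centres, K x D
    # E-step: responsibility of each latent grid point for each data point.
    d2 = ((X[None, :, :] - Y[:, None, :]) ** 2).sum(-1)     # K x N squared distances
    logR = -0.5 * beta * d2
    R = np.exp(logR - logR.max(axis=0, keepdims=True))
    R /= R.sum(axis=0, keepdims=True)                       # K x N responsibilities
    # M-step: weighted least squares for W, then update the noise precision.
    G = np.diag(R.sum(axis=1))
    W = np.linalg.solve(Phi.T @ G @ Phi + 1e-6 * np.eye(Phi.shape[1]),
                        Phi.T @ R @ X).T
    beta = X.size / (R * ((X[None] - (Phi @ W.T)[:, None]) ** 2).sum(-1)).sum()
    return W, beta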
Journal Articles
Neural Computation (1992) 4 (5): 650–665.
Published: 01 September 1992
Abstract
Despite the fact that complex visual scenes contain multiple, overlapping objects, people perform object recognition with ease and accuracy. One operation that facilitates recognition is an early segmentation process in which features of objects are grouped and labeled according to the object to which they belong. Current computational systems that perform this operation are based on predefined grouping heuristics. We describe a system called MAGIC that learns how to group features based on a set of presegmented examples. In many cases, MAGIC discovers grouping heuristics similar to those previously proposed, but it also has the capability of finding nonintuitive structural regularities in images. Grouping is performed by a relaxation network that attempts to dynamically bind related features. Features transmit a complex-valued signal (amplitude and phase) to one another; binding can thus be represented by phase locking of related features. MAGIC's training procedure is a generalization of recurrent backpropagation to complex-valued units.
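As a rough illustration of the phase-locking idea only (not MAGIC's architecture or its training procedure), the sketch below represents each feature by a unit-magnitude complex number and repeatedly pulls each phase toward a weighted combination of its neighbours', so that strongly coupled features come to share a phase, i.e. a common object label; the coupling matrix here is a placeholder.

import numpy as np

def phase_lock(W, n_steps=50, seed=0):
    """Iteratively phase-lock features connected by positive weights W (N x N).
    Features whose phases converge to the same value are 'bound' together."""
    rng = np.random.default_rng(seed)
    z = np.exp(1j * rng.uniform(0, 2 * np.pi, W.shape[0]))   # random initial phases
    for _ in range(n_steps):
        u = W @ z                      # complex-valued input to each feature
        z = u / np.abs(u)              # keep amplitude fixed at 1, update phase only
    return np.angle(z)

# Two groups of features, strongly coupled within groups and not across.
W = np.block([[np.ones((3, 3)), np.zeros((3, 3))],
              [np.zeros((3, 3)), np.ones((3, 3))]])
print(np.round(phase_lock(W), 2))      # typically two distinct shared phases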