Journal Articles by William Bialek
Neural Computation (2004) 16 (12): 2483–2506.
Published: 01 December 2004
Abstract
Clustering provides a common means of identifying structure in complex data, and there is renewed interest in clustering as a tool for the analysis of large data sets in many fields. A natural question is how many clusters are appropriate for the description of a given system. Traditional approaches to this problem are based either on a framework in which clusters of a particular shape are assumed as a model of the system or on a two-step procedure in which a clustering criterion determines the optimal assignments for a given number of clusters and a separate criterion measures the goodness of the classification to determine the number of clusters. In a statistical mechanics approach, clustering can be seen as a trade-off between energy- and entropy-like terms, with lower temperature driving the proliferation of clusters to provide a more detailed description of the data. For finite data sets, we expect that there is a limit to the meaningful structure that can be resolved and therefore a minimum temperature below which we will capture sampling noise. This suggests that correcting the clustering criterion for the bias that arises due to sampling errors will allow us to find a clustering solution at a temperature that is optimal in the sense that we capture maximal meaningful structure—without having to define an external criterion for the goodness or stability of the clustering. We show that in a general information-theoretic framework, the finite size of a data set determines an optimal temperature, and we introduce a method for finding the maximal number of clusters that can be resolved from the data in the hard clustering limit.
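A rough illustration of the temperature trade-off described in this abstract is the soft clustering sketch below, written in the deterministic-annealing style: assignments follow a Boltzmann-like rule at a fixed inverse temperature. This is not the article's algorithm, and its finite-sample correction for choosing the temperature is not reproduced; all names and parameter values are illustrative.

```python
# Minimal sketch (not the article's method): soft clustering at inverse temperature
# beta. Larger beta (lower temperature) sharpens assignments and resolves more
# clusters, which is the proliferation the abstract refers to.
import numpy as np

def soft_cluster(x, n_clusters, beta, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), n_clusters, replace=False)]
    for _ in range(n_iter):
        # squared-distance "energy" of assigning point i to cluster c
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        # Boltzmann-like soft assignments p(c|x) proportional to exp(-beta * d)
        logp = -beta * d
        logp -= logp.max(axis=1, keepdims=True)
        p = np.exp(logp)
        p /= p.sum(axis=1, keepdims=True)
        # re-estimate cluster centers as responsibility-weighted means
        centers = (p.T @ x) / p.sum(axis=0)[:, None]
    return p, centers

# Toy data: three well-separated blobs in two dimensions
x = np.vstack([np.random.default_rng(1).normal(m, 0.3, (100, 2)) for m in (0, 2, 4)])
assignments, centers = soft_cluster(x, n_clusters=3, beta=5.0)
```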
Neural Computation (2004) 16 (2): 223–250.
Published: 01 February 2004
Abstract
We propose a method that allows for a rigorous statistical analysis of neural responses to natural stimuli that are nongaussian and exhibit strong correlations. We have in mind a model in which neurons are selective for a small number of stimulus dimensions out of a high-dimensional stimulus space, but within this subspace the responses can be arbitrarily nonlinear. Existing analysis methods are based on correlation functions between stimuli and responses, but these methods are guaranteed to work only in the case of gaussian stimulus ensembles. As an alternative to correlation functions, we maximize the mutual information between the neural responses and projections of the stimulus onto low-dimensional subspaces. The procedure can be done iteratively by increasing the dimensionality of this subspace. Those dimensions that allow the recovery of all of the information between spikes and the full unprojected stimuli describe the relevant subspace. If the dimensionality of the relevant subspace indeed is small, it becomes feasible to map the neuron's input-output function even under fully natural stimulus conditions. These ideas are illustrated in simulations on model visual and auditory neurons responding to natural scenes and sounds, respectively.
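A toy sketch of the underlying idea, not the article's published procedure: score a candidate direction by the information its one-dimensional projection carries about spiking, estimated from histograms, and improve the direction by a crude random search. Function names, bin counts, and the optimizer are assumptions for illustration.

```python
# Sketch: estimate the information (bits per spike) between a stimulus projection
# and spike occurrence, then search for a direction that maximizes it.
import numpy as np

def projection_info(stimuli, spikes, v, n_bins=20):
    x = stimuli @ v / np.linalg.norm(v)            # project stimuli onto v
    edges = np.histogram_bin_edges(x, bins=n_bins)
    p_all, _ = np.histogram(x, bins=edges)
    p_spk, _ = np.histogram(x[spikes > 0], bins=edges)
    p_all = p_all / p_all.sum()
    p_spk = p_spk / max(p_spk.sum(), 1)
    mask = (p_all > 0) & (p_spk > 0)
    # information between the projection and spiking, in bits per spike
    return np.sum(p_spk[mask] * np.log2(p_spk[mask] / p_all[mask]))

def find_dimension(stimuli, spikes, n_steps=2000, step=0.1, seed=0):
    rng = np.random.default_rng(seed)
    v = rng.normal(size=stimuli.shape[1])
    best = projection_info(stimuli, spikes, v)
    for _ in range(n_steps):
        trial = v + step * rng.normal(size=v.shape)  # random perturbation of v
        info = projection_info(stimuli, spikes, trial)
        if info > best:
            v, best = trial, info
    return v / np.linalg.norm(v), best
```

As the abstract describes, the recovered dimensions can then be checked against the total information between spikes and the unprojected stimuli, and the subspace dimensionality increased if information is missing.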
Neural Computation (2003) 15 (8): 1715–1749.
Published: 01 August 2003
Abstract
A spiking neuron “computes” by transforming a complex dynamical input into a train of action potentials, or spikes. The computation performed by the neuron can be formulated as dimensional reduction, or feature detection, followed by a nonlinear decision function over the low-dimensional space. Generalizations of the reverse correlation technique with white noise input provide a numerical strategy for extracting the relevant low-dimensional features from experimental data, and information theory can be used to evaluate the quality of the low-dimensional approximation. We apply these methods to analyze the simplest biophysically realistic model neuron, the Hodgkin-Huxley (HH) model, using this system to illustrate the general methodological issues. We focus on the features in the stimulus that trigger a spike, explicitly eliminating the effects of interactions between spikes. One can approximate this triggering “feature space” as a two-dimensional linear subspace in the high-dimensional space of input histories, capturing in this way a substantial fraction of the mutual information between inputs and spike time. We find that an even better approximation, however, is to describe the relevant subspace as two dimensional but curved; in this way, we can capture 90% of the mutual information even at high time resolution. Our analysis provides a new understanding of the computational properties of the HH model. While it is common to approximate neural behavior as “integrate and fire,” the HH model is not an integrator nor is it well described by a single threshold.
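The dimensional-reduction step described here is commonly carried out with spike-triggered average and covariance statistics; a minimal sketch under that assumption follows. It omits the curved-subspace refinement and the information-theoretic evaluation the abstract mentions, and all names are illustrative.

```python
# Sketch: recover a candidate low-dimensional feature space from spike-triggered
# stimulus statistics. spike_times are integer sample indices into the stimulus.
import numpy as np

def spike_triggered_subspace(stimulus, spike_times, window, n_features=2):
    # stimulus histories immediately preceding each spike
    ensemble = np.array([stimulus[t - window:t] for t in spike_times if t >= window])
    # all available histories (for long stimuli these could be subsampled)
    prior = np.array([stimulus[t - window:t] for t in range(window, len(stimulus))])
    sta = ensemble.mean(axis=0) - prior.mean(axis=0)        # spike-triggered average
    # change in covariance between spike-triggered and prior ensembles
    dC = np.cov(ensemble, rowvar=False) - np.cov(prior, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(dC)
    order = np.argsort(np.abs(eigvals))[::-1]
    # the leading eigenvectors of dC (together with the STA) span the feature space
    return sta, eigvecs[:, order[:n_features]], eigvals[order]
```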
Neural Computation (2001) 13 (11): 2409–2463.
Published: 01 November 2001
Abstract
We define predictive information $I_{\rm pred}(T)$ as the mutual information between the past and the future of a time series. Three qualitatively different behaviors are found in the limit of large observation times $T$: $I_{\rm pred}(T)$ can remain finite, grow logarithmically, or grow as a fractional power law. If the time series allows us to learn a model with a finite number of parameters, then $I_{\rm pred}(T)$ grows logarithmically with a coefficient that counts the dimensionality of the model space. In contrast, power-law growth is associated, for example, with the learning of infinite parameter (or non-parametric) models such as continuous functions with smoothness constraints. There are connections between the predictive information and measures of complexity that have been defined both in learning theory and in the analysis of physical systems through statistical mechanics and dynamical systems theory. Furthermore, in the same way that entropy provides the unique measure of available information consistent with some simple and plausible conditions, we argue that the divergent part of $I_{\rm pred}(T)$ provides the unique measure for the complexity of dynamics underlying a time series. Finally, we discuss how these ideas may be useful in problems in physics, statistics, and biology.
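With $S(T)$ denoting the entropy of a window of duration $T$ in a stationary series (notation assumed here rather than quoted from the article), the definition can be written compactly as

```latex
I_{\rm pred}(T) \;=\; \lim_{T' \to \infty} \Big[\, S(T) + S(T') - S(T + T') \,\Big],
```

so that, roughly, the extensive parts of the window entropies cancel and only the subextensive part survives, which is the piece that can stay finite, grow logarithmically, or grow as a fractional power law.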
Neural Computation (2000) 12 (7): 1531–1552.
Published: 01 July 2000
Abstract
We show that the information carried by compound events in neural spike trains—patterns of spikes across time or across a population of cells—can be measured, independent of assumptions about what these patterns might represent. By comparing the information carried by a compound pattern with the information carried independently by its parts, we directly measure the synergy among these parts. We illustrate the use of these methods by applying them to experiments on the motion-sensitive neuron H1 of the fly's visual system, where we confirm that two spikes close together in time carry far more than twice the information carried by a single spike. We analyze the sources of this synergy and provide evidence that pairs of spikes close together in time may be especially important patterns in the code of H1.
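The comparison described here can be written as a single quantity (notation assumed): the synergy of a compound event $E$ built from parts $E_1, \dots, E_n$, with respect to the stimulus $s$, is

```latex
\mathrm{Syn}(E) \;=\; I(E;\, s) \;-\; \sum_{i=1}^{n} I(E_i;\, s),
```

so positive synergy means the pattern carries more information than its parts contribute independently; for the spike pairs discussed above, this is the excess over twice the single-spike information when the two single-spike contributions are equal.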
Neural Computation (1993) 5 (1): 21–31.
Published: 01 January 1993
Abstract
We show that a simple statistical mechanics model can capture the collective behavior of large networks of spiking neurons. Qualitative arguments suggest that regularly firing neurons should be described by a planar "spin" of unit length. We extract these spins from spike trains and then measure the interaction Hamiltonian using simulations of small clusters of cells. Correlations among spike trains obtained from simulations of large arrays of cells are in quantitative agreement with the predictions from these Hamiltonians. We comment on the novel computational abilities of these "XY networks."
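A minimal sketch of the mapping the qualitative argument suggests, under the assumption that the planar spin is the neuron's instantaneous firing phase on the unit circle; the interaction Hamiltonian measured in the article is not reconstructed here, and the function name is illustrative.

```python
# Sketch: represent a regularly firing neuron at time t by a unit-length planar
# ("XY") spin whose angle is the phase within the current interspike interval.
import numpy as np

def planar_spin(spike_times, t):
    spike_times = np.asarray(spike_times, dtype=float)
    prev = spike_times[spike_times <= t]
    nxt = spike_times[spike_times > t]
    if len(prev) == 0 or len(nxt) == 0:
        raise ValueError("t must lie between two spikes")
    phase = 2 * np.pi * (t - prev[-1]) / (nxt[0] - prev[-1])
    return np.array([np.cos(phase), np.sin(phase)])  # unit-length planar spin
```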
Neural Computation (1992) 4 (5): 682–690.
Published: 01 September 1992
Abstract
In many biological systems the primary transduction of sensory stimuli occurs in a regular array of receptors. Because of this discrete sampling it is usually assumed that the organism has no knowledge of signals beyond the Nyquist frequency. In fact, higher frequency signals are expected to mask the available lower frequency information as a result of aliasing. It has been suggested that these considerations are important in understanding, for example, the design of the receptor lattice in the mammalian fovea. We show that if the organism has knowledge of the probability distribution from which the signals are drawn, outputs from a discrete receptor array can be used to estimate signals beyond the Nyquist limit. In effect, a priori knowledge can be used to de-alias the image, and the estimated signal above the Nyquist cutoff is in fact coherent with the real signal at these high frequencies. We address initially the problem of stimulus reconstruction from a noisy receptor array responding to a Gaussian stimulus ensemble. In this case, the best reconstruction strategy is a simple linear transformation. In the more interesting (and natural) case of nongaussian stimuli, optimal reconstruction requires nonlinear operations, but the higher order correlations in the stimulus ensemble can be used to improve the estimate of super-Nyquist signals.
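For the Gaussian-ensemble case described here, where the best reconstruction is a simple linear transformation, a minimal sketch is the conditional-mean estimate below, evaluated on a grid finer than the receptor spacing. The squared-exponential prior, noise level, and array sizes are assumptions for illustration, not taken from the article.

```python
# Sketch: linear (conditional-mean) reconstruction of a signal from a noisy,
# discretely sampled receptor array, given a known Gaussian prior over signals.
import numpy as np

def linear_reconstruction(receptor_pos, readings, fine_grid, prior_cov, noise_var):
    # prior_cov(a, b) returns the prior signal covariance between positions a and b
    C_rr = prior_cov(receptor_pos[:, None], receptor_pos[None, :])
    C_fr = prior_cov(fine_grid[:, None], receptor_pos[None, :])
    # conditional mean of the signal on the fine grid given the receptor outputs
    return C_fr @ np.linalg.solve(C_rr + noise_var * np.eye(len(receptor_pos)), readings)

# Toy example with an assumed squared-exponential prior (length scale 0.1)
prior = lambda a, b: np.exp(-0.5 * ((a - b) / 0.1) ** 2)
receptors = np.linspace(0, 1, 8)                         # coarse receptor lattice
grid = np.linspace(0, 1, 200)                            # finer reconstruction grid
readings = np.sin(2 * np.pi * 3 * receptors) + np.random.default_rng(0).normal(0, 0.05, 8)
estimate = linear_reconstruction(receptors, readings, grid, prior, 0.05 ** 2)
```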