1-20 of 73 Articles
Neural Computation (2018) 30 (10): 2616–2659.
Published: 01 October 2018
Abstract
We formulate the computational processes of perception in the framework of the principle of least action by postulating the theoretical action as a time integral of the variational free energy in the neurosciences. The free energy principle is accordingly rephrased, on autopoietic grounds, as follows: all viable organisms attempt to minimize their sensory uncertainty about an unpredictable environment over a temporal horizon. By taking the variation of informational action, we derive neural recognition dynamics (RD), which by construction reduces to the Bayesian filtering of external states from noisy sensory inputs. Consequently, we effectively cast the gradient-descent scheme of minimizing the free energy into Hamiltonian mechanics by addressing only the positions and momenta of the organisms' representations of the causal environment. To demonstrate the utility of our theory, we show how the RD may be implemented in a neuronally based biophysical model at a single-cell level and subsequently in a coarse-grained, hierarchical architecture of the brain. We also present numerical solutions to the RD for a model brain and analyze the perceptual trajectories around attractors in neural state space.
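For orientation, the gradient-descent scheme that this abstract recasts can be illustrated with a minimal sketch: gradient descent on a variational free energy for a one-level linear-Gaussian generative model (standard predictive coding). This is not the authors' recognition dynamics; the generative model, noise variances, and learning rate below are illustrative assumptions.

import numpy as np

def free_energy(mu, s, prior_mu, sigma_s=1.0, sigma_x=1.0):
    # F = precision-weighted prediction errors (sensory + prior), up to constants
    eps_s = s - mu            # sensory prediction error (identity mapping assumed)
    eps_x = mu - prior_mu     # prior prediction error
    return 0.5 * (eps_s ** 2 / sigma_s + eps_x ** 2 / sigma_x)

def perceive(s, prior_mu=0.0, eta=0.1, n_steps=200):
    mu = prior_mu
    for _ in range(n_steps):
        # numerical gradient of F with respect to the internal representation mu
        dF = (free_energy(mu + 1e-5, s, prior_mu) - free_energy(mu - 1e-5, s, prior_mu)) / 2e-5
        mu -= eta * dF        # gradient-descent "recognition" update
    return mu

print(perceive(s=2.0))        # settles between the prior (0) and the observation (2)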
Neural Computation (2018) 30 (10): 2593–2615.
Published: 01 October 2018
Abstract
We consider the problem of classifying data manifolds where each manifold represents invariances that are parameterized by continuous degrees of freedom. Conventional data augmentation methods rely on sampling large numbers of training examples from these manifolds. Instead, we propose an iterative algorithm, M_CP, based on a cutting plane approach that efficiently solves a quadratic semi-infinite programming problem to find the maximum margin solution. We provide a proof of convergence as well as a polynomial bound on the number of iterations required for a desired tolerance in the objective function. The efficiency and performance of M_CP are demonstrated in high-dimensional simulations and on image manifolds generated from the ImageNet data set. Our results indicate that M_CP is able to rapidly learn good classifiers and shows superior generalization performance compared with conventional maximum margin methods that rely on data augmentation.
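A schematic of the cutting-plane idea, not the authors' M_CP (which solves a quadratic semi-infinite program exactly): repeatedly find the worst-margin point on each manifold and add it as a training constraint before refitting a standard max-margin classifier. The manifold parameterization, the scikit-learn classifier, and the tolerance below are illustrative assumptions.

import numpy as np
from sklearn.svm import LinearSVC

def worst_point(svm, manifold_fn, label, thetas):
    # evaluate the manifold on a parameter grid and return the point with the
    # smallest signed margin for this label
    pts = np.array([manifold_fn(t) for t in thetas])
    margins = label * svm.decision_function(pts)
    i = int(np.argmin(margins))
    return pts[i], margins[i]

def cutting_plane_train(manifolds, labels, thetas, n_iter=20, tol=1e-3):
    X = np.array([m(thetas[0]) for m in manifolds])     # one seed point per manifold
    y = np.array(labels, dtype=float)
    svm = LinearSVC(C=10.0).fit(X, y)
    for _ in range(n_iter):
        added = False
        for m, lab in zip(manifolds, labels):
            pt, margin = worst_point(svm, m, lab, thetas)
            if margin < 1.0 - tol:                       # margin violated: add a cut
                X, y, added = np.vstack([X, pt]), np.append(y, lab), True
        if not added:                                    # every manifold satisfies the margin
            break
        svm = LinearSVC(C=10.0).fit(X, y)
    return svm

# toy example: two circles (one-parameter manifolds) separated along the first axis
thetas = np.linspace(0, 2 * np.pi, 100)
m_pos = lambda t: np.array([+3 + np.cos(t), np.sin(t)])
m_neg = lambda t: np.array([-3 + np.cos(t), np.sin(t)])
clf = cutting_plane_train([m_pos, m_neg], [+1, -1], thetas)
print(clf.score(np.array([m_pos(1.0), m_neg(2.0)]), [1, -1]))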
Neural Computation (2018) 30 (6): 1449–1513.
Published: 01 June 2018
Abstract
To accommodate structured approaches to neural computation, we propose a class of recurrent neural networks for indexing and storing sequences of symbols or analog data vectors. These networks with randomized input weights and orthogonal recurrent weights implement coding principles previously described in vector symbolic architectures (VSA) and leverage properties of reservoir computing. In general, the storage in reservoir computing is lossy, and crosstalk noise limits the retrieval accuracy and information capacity. A novel theory to optimize memory performance in such networks is presented and compared with simulation experiments. The theory describes linear readout of analog data and readout with winner-take-all error correction of symbolic data as proposed in VSA models. We find that diverse VSA models from the literature have universal performance properties, which are superior to what previous analyses predicted. Further, we propose novel VSA models with the statistically optimal Wiener filter in the readout that exhibit much higher information capacity, in particular for storing analog data. The theory we present also applies to memory buffers, networks with gradual forgetting, which can operate on infinite data streams without memory overflow. Interestingly, we find that different forgetting mechanisms, such as attenuating recurrent weights or neural nonlinearities, produce very similar behavior if the forgetting time constants are matched. Such models exhibit extensive capacity when their forgetting time constant is optimized for given noise conditions and network size. These results enable the design of new types of VSA models for the online processing of data streams.
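A hedged sketch of the basic encoding and decoding scheme such networks build on: random bipolar codewords for symbols, a permutation matrix as the orthogonal recurrent map, superposition storage, and a winner-take-all readout. The dimensions and codebook below are illustrative, and the statistically optimal Wiener-filter readout proposed in the article is not shown.

import numpy as np

rng = np.random.default_rng(0)
N, n_symbols, T = 1000, 26, 10
codebook = rng.choice([-1.0, 1.0], size=(n_symbols, N))  # random bipolar input codewords
perm = rng.permutation(N)                                 # orthogonal recurrent map (a permutation)
inv_perm = np.argsort(perm)                               # its inverse

seq = rng.integers(0, n_symbols, size=T)
memory = np.zeros(N)
for sym in seq:                       # store: memory <- W * memory + codeword(sym)
    memory = memory[perm] + codebook[sym]

decoded = []
for delay in range(T):                # item presented `delay` steps ago (0 = most recent)
    probe = memory.copy()
    for _ in range(delay):            # undo the recurrent rotation `delay` times
        probe = probe[inv_perm]
    decoded.append(int(np.argmax(codebook @ probe)))   # winner-take-all readout
print(decoded == list(seq[::-1]))     # recovers the sequence despite crosstalk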
Neural Computation (2018) 30 (6): 1514–1541.
Published: 01 June 2018
Abstract
A vast majority of computation in the brain is performed by spiking neural networks. Despite the ubiquity of such spiking, we currently lack an understanding of how biological spiking neural circuits learn and compute in vivo, as well as how we can instantiate such capabilities in artificial spiking circuits in silico. Here we revisit the problem of supervised learning in temporally coding multilayer spiking neural networks. First, by using a surrogate gradient approach, we derive SuperSpike, a nonlinear voltage-based three-factor learning rule capable of training multilayer networks of deterministic integrate-and-fire neurons to perform nonlinear computations on spatiotemporal spike patterns. Second, inspired by recent results on feedback alignment, we compare the performance of our learning rule under different credit assignment strategies for propagating output errors to hidden units. Specifically, we test uniform, symmetric, and random feedback, finding that simpler tasks can be solved with any type of feedback, while more complex tasks require symmetric feedback. In summary, our results open the door to obtaining a better scientific understanding of learning and computation in spiking neural networks by advancing our ability to train them to solve nonlinear problems involving transformations between different spatiotemporal spike time patterns.
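A rough sketch of the ingredients of a surrogate-gradient, three-factor rule of this kind: a fast-sigmoid surrogate in place of the spike nonlinearity's derivative, a filtered presynaptic trace, and a per-neuron error signal multiplying a local eligibility term. This is a simplified illustration, not the exact SuperSpike rule; the time constants, the error signal, and the trace filtering are assumptions.

import numpy as np

def surrogate_grad(u, beta=5.0):
    # derivative of a fast sigmoid, standing in for the gradient of the spike nonlinearity
    return 1.0 / (1.0 + beta * np.abs(u)) ** 2

rng = np.random.default_rng(1)
n_in, n_out, T, dt = 50, 3, 200, 1e-3
tau_mem, tau_syn, eta, threshold = 20e-3, 5e-3, 1e-3, 1.0
w = 0.05 * rng.standard_normal((n_out, n_in))
pre_spikes = rng.random((T, n_in)) < 0.02         # Poisson-like input spike trains
target = rng.random((T, n_out)) < 0.01            # desired output spike trains

v = np.zeros(n_out)
pre_trace = np.zeros(n_in)
elig = np.zeros((n_out, n_in))
for t in range(T):
    pre_trace += dt * (-pre_trace / tau_syn) + pre_spikes[t]    # filtered presynaptic activity
    v += dt * (-v / tau_mem) + w @ pre_spikes[t]                # leaky membrane potential
    spikes = v >= threshold
    err = target[t].astype(float) - spikes.astype(float)        # per-neuron output error (factor 1)
    hebb = np.outer(surrogate_grad(v - threshold), pre_trace)   # surrogate derivative x presynaptic trace
    elig += dt * (-elig / tau_syn) + hebb                       # slowly decaying eligibility
    w += eta * err[:, None] * elig                              # three-factor weight update
    v[spikes] = 0.0                                             # reset spiking neurons
print(float(np.abs(w).mean()))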
Neural Computation (2018) 30 (5): 1209–1257.
Published: 01 May 2018
Abstract
The primate visual system has an exquisite ability to discriminate partially occluded shapes. Recent electrophysiological recordings suggest that response dynamics in intermediate visual cortical area V4, shaped by feedback from prefrontal cortex (PFC), may play a key role. To probe the algorithms that may underlie these findings, we build and test a model of V4 and PFC interactions based on a hierarchical predictive coding framework. We propose that probabilistic inference occurs in two steps. Initially, V4 responses are driven solely by bottom-up sensory input and are thus strongly influenced by the level of occlusion. After a delay, V4 responses combine both feedforward input and feedback signals from the PFC; the latter reflect predictions made by PFC about the visual stimulus underlying V4 activity. We find that this model captures key features of V4 and PFC dynamics observed in experiments. Specifically, PFC responses are strongest for occluded stimuli and delayed responses in V4 are less sensitive to occlusion, supporting our hypothesis that the feedback signals from PFC underlie robust discrimination of occluded shapes. Thus, our study proposes that area V4 and PFC participate in hierarchical inference, with feedback signals encoding top-down predictions about occluded shapes.
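A toy sketch of the two-step idea: an initial "V4" response driven only by the occluded stimulus, and a delayed response that mixes that feedforward drive with a top-down prediction derived from a "PFC" belief over shape identity. The templates, softmax readout, and mixing weight are illustrative assumptions, not the article's network model.

import numpy as np

rng = np.random.default_rng(2)
templates = rng.standard_normal((4, 100))             # 4 candidate shapes, 100 "V4 features"
true_shape, occlusion = 2, 0.6
mask = rng.random(100) > occlusion                    # about 60% of features occluded
stimulus = templates[true_shape] * mask

v4_initial = stimulus                                 # step 1: bottom-up drive only
logits = templates @ v4_initial                       # "PFC" evidence for each shape
pfc_belief = np.exp(logits - logits.max()); pfc_belief /= pfc_belief.sum()
feedback = pfc_belief @ templates                     # top-down prediction of V4 activity

alpha = 0.5                                           # feedforward/feedback mixing weight
v4_delayed = alpha * stimulus + (1 - alpha) * feedback

# the delayed response correlates better with the unoccluded shape than the initial one
print(np.corrcoef(v4_initial, templates[true_shape])[0, 1],
      np.corrcoef(v4_delayed, templates[true_shape])[0, 1])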
Neural Computation (2018) 30 (5): 1180–1208.
Published: 01 May 2018
Abstract
Neurostimulation is a promising therapy for abating epileptic seizures. However, it is extremely difficult to identify optimal stimulation patterns experimentally. In this study, human recordings are used to develop a functional 24-neuron network statistical model of hippocampal connectivity and dynamics. Spontaneous seizure-like activity is induced in silico in this reconstructed neuronal network. The network is then used as a testbed to design and validate a wide range of neurostimulation patterns. Commonly used periodic trains were not able to permanently abate seizures at any frequency. A simulated annealing global optimization algorithm was then used to identify an optimal stimulation pattern, which successfully abated 92% of seizures. Finally, in a fully responsive, or closed-loop, neurostimulation paradigm, the optimal stimulation successfully prevented the network from entering the seizure state. We propose that the framework presented here for algorithmically identifying patient-specific neurostimulation patterns can greatly increase the efficacy of neurostimulation devices for seizures.
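A minimal sketch of the optimization step only: simulated annealing over a binary stimulation pattern scored by a stand-in objective. The seizure_rate function below is a placeholder for the in silico network simulation and is not the authors' model; the temperature schedule and iteration count are arbitrary.

import numpy as np

rng = np.random.default_rng(3)

def seizure_rate(pattern):
    # placeholder objective: pretend some unknown subset of pulse slots matters
    hidden_best = (np.arange(pattern.size) % 3 == 0)
    return np.mean(pattern != hidden_best) + 0.05 * rng.standard_normal()

def anneal(n_slots=24, n_iter=5000, T0=1.0, cooling=0.999):
    pattern = rng.integers(0, 2, n_slots).astype(bool)
    cost, T = seizure_rate(pattern), T0
    for _ in range(n_iter):
        candidate = pattern.copy()
        candidate[rng.integers(n_slots)] ^= True             # flip one pulse slot
        c = seizure_rate(candidate)
        if c < cost or rng.random() < np.exp((cost - c) / T):
            pattern, cost = candidate, c                      # accept downhill, occasionally uphill
        T *= cooling                                          # cool the temperature
    return pattern, cost

best_pattern, best_cost = anneal()
print(best_cost)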
Neural Computation (2018) 30 (5): 1151–1179.
Published: 01 May 2018
Abstract
The computational principles of slowness and predictability have been proposed to describe aspects of information processing in the visual system. From the perspective of slowness being a limited special case of predictability, we investigate the relationship between these two principles empirically. On a collection of real-world data sets we compare the features extracted by slow feature analysis (SFA) to the features of three recently proposed methods for predictable feature extraction: forecastable component analysis, predictable feature analysis, and graph-based predictable feature analysis. Our experiments show that the predictability of the learned features is highly correlated, and, thus, SFA appears to effectively implement a method for extracting predictable features according to different measures of predictability.
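For reference, linear slow feature analysis reduces to a generalized eigenvalue problem: find directions that minimize the variance of the temporal derivative subject to unit signal variance. A minimal sketch, with illustrative parameter names and toy data:

import numpy as np
from scipy.linalg import eigh

def sfa(X, n_features=2):
    X = X - X.mean(axis=0)               # samples in rows (time, dimensions)
    dX = np.diff(X, axis=0)              # discrete temporal derivative
    A = dX.T @ dX / (len(dX) - 1)        # covariance of derivatives (slowness objective)
    B = X.T @ X / (len(X) - 1)           # covariance of the signal (unit-variance constraint)
    eigvals, W = eigh(A, B)              # smallest eigenvalues = slowest features
    return X @ W[:, :n_features], eigvals[:n_features]

# toy data: a slow sine mixed with fast noise channels
t = np.linspace(0, 20, 2000)
slow = np.sin(0.5 * t)
X = np.column_stack([slow + 0.1 * np.random.randn(t.size),
                     np.random.randn(t.size),
                     np.random.randn(t.size)])
features, slowness = sfa(X, n_features=1)
print(np.abs(np.corrcoef(features[:, 0], slow)[0, 1]))   # close to 1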
Neural Computation (2018) 30 (4): 987–1011.
Published: 01 April 2018
Abstract
By controlling the state of neuronal populations, neuromodulators ultimately affect behavior. A key neuromodulation mechanism is the alteration of neuronal excitability via the modulation of ion channel expression. This type of neuromodulation is normally studied with conductance-based models, but those models are computationally challenging for large-scale network simulations needed in population studies. This article studies the modulation properties of the multiquadratic integrate-and-fire model, a generalization of the classical quadratic integrate-and-fire model. The model is shown to combine the computational economy of integrate-and-fire modeling and the physiological interpretability of conductance-based modeling. It is therefore a good candidate for affordable computational studies of neuromodulation in large networks.
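The base model being generalized here is the classical quadratic integrate-and-fire neuron; a minimal Euler-integration sketch is below. The multiquadratic extension studied in the article adds further quadratic terms on slower timescales, which are not shown, and all constants here are illustrative.

import numpy as np

def qif(I=10.0, dt=1e-3, T=2.0, v_reset=-1.0, v_spike=10.0):
    v, spikes, trace = v_reset, [], []
    for step in range(int(T / dt)):
        v += dt * (v * v + I)             # quadratic membrane dynamics dV/dt = V^2 + I
        if v >= v_spike:                  # spike threshold
            spikes.append(step * dt)
            v = v_reset                   # reset after the spike
        trace.append(v)
    return np.array(trace), spikes

trace, spike_times = qif()
print(len(spike_times), "spikes in 2 s of simulated time")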
Neural Computation (2018) 30 (4): 945–986.
Published: 01 April 2018
Abstract
A neuronal population is a computational unit that receives a multivariate, time-varying input signal and creates a related multivariate output. These neural signals are modeled as stochastic processes that transmit information in real time, subject to stochastic noise. In a stationary environment, where the input signals can be characterized by constant statistical properties, the systematic relationship between its input and output processes determines the computation carried out by a population. When these statistical characteristics unexpectedly change, the population needs to adapt to its new environment if it is to maintain stable operation. Based on the general concept of homeostatic plasticity, we propose a simple compositional model of adaptive networks that achieve invariance with regard to undesired changes in the statistical properties of their input signals and maintain outputs with well-defined joint statistics. To achieve such invariance, the network model combines two functionally distinct types of plasticity. An abstract stochastic process neuron model implements a generalized form of intrinsic plasticity that adapts marginal statistics, relying only on mechanisms locally confined within each neuron and operating continuously in time, while a simple form of Hebbian synaptic plasticity operates on synaptic connections, thus shaping the interrelation between neurons as captured by a copula function. The combined effect of both mechanisms allows a neuron population to discover invariant representations of its inputs that remain stable under a wide range of transformations (e.g., shifting, scaling and (affine linear) mixing). The probabilistic model of homeostatic adaptation on a population level as presented here allows us to isolate and study the individual and the interaction dynamics of both mechanisms of plasticity and could guide the future search for computationally beneficial types of adaptation.
Neural Computation (2018) 30 (4): 885–944.
Published: 01 April 2018
Abstract
While Shannon's mutual information has widespread applications in many disciplines, for practical applications it is often difficult to calculate its value accurately for high-dimensional variables because of the curse of dimensionality. This article focuses on effective approximation methods for evaluating mutual information in the context of neural population coding. For large but finite neural populations, we derive several information-theoretic asymptotic bounds and approximation formulas that remain valid in high-dimensional spaces. We prove that optimizing the population density distribution based on these approximation formulas is a convex optimization problem that allows efficient numerical solutions. Numerical simulation results confirmed that our asymptotic formulas were highly accurate for approximating mutual information for large neural populations. In special cases, the approximation formulas are exactly equal to the true mutual information. We also discuss techniques of variable transformation and dimensionality reduction to facilitate computation of the approximations.
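As background, one widely used asymptotic relation of this kind links mutual information to the Fisher information of a large neural population; a representative form (the article's own bounds and formulas may differ) is

I(\theta; \mathbf{r}) \;\approx\; H(\theta) + \frac{1}{2}\,\mathbb{E}_{\theta}\!\left[\log\det\frac{J(\theta)}{2\pi e}\right],
\qquad
J(\theta) = \mathbb{E}\!\left[\nabla_{\theta}\log p(\mathbf{r}\mid\theta)\,\nabla_{\theta}\log p(\mathbf{r}\mid\theta)^{\top}\right],

where H(\theta) is the stimulus entropy and J(\theta) is the Fisher information matrix of the population response.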
Neural Computation (2018) 30 (4): 857–884.
Published: 01 April 2018
Abstract
We extend the neural Turing machine (NTM) model into a dynamic neural Turing machine (D-NTM) by introducing trainable address vectors. This addressing scheme maintains two separate vectors for each memory cell: a content vector and an address vector. This allows the D-NTM to learn a wide variety of location-based addressing strategies, including both linear and nonlinear ones. We implement the D-NTM with both continuous and discrete read and write mechanisms. We investigate the mechanisms and effects of learning to read and write into a memory through experiments on Facebook bAbI tasks using both a feedforward and GRU controller. We provide extensive analysis of our model and compare different variations of neural Turing machines on this task. We show that our model outperforms long short-term memory and NTM variants. We provide further experimental results on the sequential MNIST, Stanford Natural Language Inference, associative recall, and copy tasks.
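A hedged sketch of the addressing idea: each memory cell carries a trainable address vector alongside its content vector, and read weights come from comparing a query against the concatenation of the two. The sizes and scoring rule below are illustrative, not the exact D-NTM parameterization.

import numpy as np

rng = np.random.default_rng(4)
n_cells, d_content, d_address = 8, 16, 4
content = rng.standard_normal((n_cells, d_content))       # written/overwritten by the controller
address = rng.standard_normal((n_cells, d_address))       # trainable, fixed within an episode

def read(query):
    # one part of the query is matched against content, the other against the address vectors
    keys = np.concatenate([content, address], axis=1)      # (n_cells, d_content + d_address)
    scores = keys @ query                                  # similarity per cell
    weights = np.exp(scores - scores.max()); weights /= weights.sum()   # soft (continuous) addressing
    return weights @ content                               # read vector

query = rng.standard_normal(d_content + d_address)
print(read(query).shape)                                   # (16,)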
Neural Computation (2018) 30 (1): 84–124.
Published: 01 January 2018
Abstract
Modeling self-organization of neural networks for unsupervised learning using Hebbian and anti-Hebbian plasticity has a long history in neuroscience. Yet derivations of single-layer networks with such local learning rules from principled optimization objectives became possible only recently, with the introduction of similarity matching objectives. What explains the success of similarity matching objectives in deriving neural networks with local learning rules? Here, using dimensionality reduction as an example, we introduce several variable substitutions that illuminate the success of similarity matching. We show that the full network objective may be optimized separately for each synapse using local learning rules in both the offline and online settings. We formalize the long-standing intuition of the rivalry between Hebbian and anti-Hebbian rules by formulating a min-max optimization problem. We introduce a novel dimensionality reduction objective using fractional matrix exponents. To illustrate the generality of our approach, we apply it to a novel formulation of dimensionality reduction combined with whitening. We confirm numerically that the networks with learning rules derived from principled objectives perform better than those with heuristic learning rules.
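A hedged sketch of the kind of single-layer network these objectives yield: Hebbian feedforward weights W, anti-Hebbian lateral weights M, and an output given by the fixed point y = inv(M) W x. The learning rates, initialization, and toy data are assumptions, not the article's exact derivation.

import numpy as np

rng = np.random.default_rng(5)
d_in, d_out, n_samples, eta = 10, 2, 5000, 0.01
W = 0.1 * rng.standard_normal((d_out, d_in))
M = np.eye(d_out)

# toy data with two dominant directions
basis = rng.standard_normal((d_in, 2))
X = (rng.standard_normal((n_samples, 2)) * [3.0, 2.0]) @ basis.T + 0.1 * rng.standard_normal((n_samples, d_in))

for x in X:
    y = np.linalg.solve(M, W @ x)              # steady state of the lateral dynamics
    W += eta * (np.outer(y, x) - W)            # Hebbian feedforward update
    M += eta * (np.outer(y, y) - M)            # anti-Hebbian lateral update

F = np.linalg.solve(M, W)                      # effective input-output map
U = np.linalg.svd(X - X.mean(0), full_matrices=False)[2][:2]   # top-2 PCA subspace
print(np.linalg.norm(F @ U.T @ U - F) / np.linalg.norm(F))     # small value => subspaces align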
Neural Computation (2018) 30 (1): 34–83.
Published: 01 January 2018
Abstract
Surprise describes a range of phenomena from unexpected events to behavioral responses. We propose a novel measure of surprise and use it for surprise-driven learning. Our surprise measure takes into account data likelihood as well as the degree of commitment to a belief via the entropy of the belief distribution. We find that surprise-minimizing learning dynamically adjusts the balance between new and old information without requiring knowledge of the temporal statistics of the environment. We apply our framework to a dynamic decision-making task and a maze exploration task. Our surprise-minimizing framework is suitable for learning in complex environments, even if the environment undergoes gradual or sudden changes, and it could eventually provide a framework to study the behavior of humans and animals as they encounter surprising events.
Neural Computation (2018) 30 (1): 1–33.
Published: 01 January 2018
Abstract
The dynamics of supervised learning play a central role in deep learning, which takes place in the parameter space of a multilayer perceptron (MLP). We review the history of supervised stochastic gradient learning, focusing on its singular structure and natural gradient. The parameter space includes singular regions in which parameters are not identifiable. One of our results is a full exploration of the dynamical behaviors of stochastic gradient learning in an elementary singular network. The bad news is its pathological nature, in which part of the singular region becomes an attractor and another part a repulser at the same time, forming a Milnor attractor. A learning trajectory is attracted by the attractor region, staying in it for a long time, before it escapes the singular region through the repulser region. This is typical of plateau phenomena in learning. We demonstrate the strange topology of a singular region by introducing blow-down coordinates, which are useful for analyzing the natural gradient dynamics. We confirm that the natural gradient dynamics are free of critical slowdown. The second main result is the good news: the interactions of elementary singular networks eliminate the attractor part and the Milnor-type attractors disappear. This explains why large-scale networks do not suffer from serious critical slowdowns due to singularities. We finally show that the unit-wise natural gradient is effective for learning in spite of its low computational cost.
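For reference, the contrast at the heart of this work is between ordinary and natural gradient descent, the latter preconditioning the gradient with the inverse Fisher information metric (which degenerates on the singular regions discussed):

\theta_{t+1} = \theta_t - \eta\,\nabla_{\theta} L(\theta_t) \quad\text{(ordinary)},
\qquad
\theta_{t+1} = \theta_t - \eta\, G(\theta_t)^{-1}\,\nabla_{\theta} L(\theta_t) \quad\text{(natural)},
\qquad
G(\theta) = \mathbb{E}\!\left[\nabla_{\theta}\log p(y\mid x,\theta)\,\nabla_{\theta}\log p(y\mid x,\theta)^{\top}\right].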
Neural Computation (2017) 29 (12): 3119–3180.
Published: 01 December 2017
Abstract
An appealing new principle for neural population codes is that correlations among neurons organize neural activity patterns into a discrete set of clusters, which can each be viewed as a noise-robust population codeword. Previous studies assumed that these codewords corresponded geometrically with local peaks in the probability landscape of neural population responses. Here, we analyze multiple data sets of the responses of approximately 150 retinal ganglion cells and show that local probability peaks are absent under broad, nonrepeated stimulus ensembles, which are characteristic of natural behavior. However, we find that neural activity still forms noise-robust clusters in this regime, albeit clusters with a different geometry. We start by defining a soft local maximum, which is a local probability maximum when constrained to a fixed spike count. Next, we show that soft local maxima are robustly present and can, moreover, be linked across different spike count levels in the probability landscape to form a ridge. We found that these ridges comprise combinations of spiking and silence in the neural population such that all of the spiking neurons are members of the same neuronal community, a notion from network theory. We argue that a neuronal community shares many of the properties of Donald Hebb's classic cell assembly and show that a simple, biologically plausible decoding algorithm can recognize the presence of a specific neuronal community.
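A hedged sketch of the soft-local-maximum check described above: a binary population word is a soft local maximum if no neighbor with the same spike count (obtained by moving one spike from an active to a silent neuron) has higher empirical probability. The toy probability table below is a stand-in for real retinal data.

import numpy as np
from collections import Counter

def is_soft_local_max(pattern, prob):
    # neighbors with equal spike count: move one spike from an active to a silent neuron
    p0 = prob.get(pattern, 0.0)
    on = [i for i, b in enumerate(pattern) if b]
    off = [i for i, b in enumerate(pattern) if not b]
    for i in on:
        for j in off:
            neighbor = list(pattern)
            neighbor[i], neighbor[j] = 0, 1
            if prob.get(tuple(neighbor), 0.0) > p0:
                return False
    return True

# toy empirical distribution over 5-neuron binary words
rng = np.random.default_rng(6)
words = [tuple(rng.integers(0, 2, 5)) for _ in range(2000)]
prob = {w: c / len(words) for w, c in Counter(words).items()}
print(is_soft_local_max((1, 1, 0, 0, 0), prob))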
Neural Computation (2017) 29 (12): 3181–3218.
Published: 01 December 2017
Abstract
High-density electrocorticogram (ECoG) electrodes are capable of recording neurophysiological data with high temporal resolution and wide spatial coverage. These recordings are a window to understanding how the human brain processes information and subsequently behaves in healthy and pathologic states. Here, we describe and implement delay differential analysis (DDA) for the characterization of ECoG data obtained from human patients with intractable epilepsy. DDA is a time-domain analysis framework based on embedding theory in nonlinear dynamics that reveals the nonlinear invariant properties of an unknown dynamical system. The DDA embedding serves as a low-dimensional nonlinear dynamical basis onto which the data are mapped. This greatly reduces the risk of overfitting and improves the method's ability to fit classes of data. Since the basis is built on the dynamical structure of the data, preprocessing of the data (e.g., filtering) is not necessary. We performed a large-scale search for a DDA model that best fit ECoG recordings using a genetic algorithm to qualitatively discriminate between different cortical states and epileptic events for a set of 13 patients. A single DDA model with only three polynomial terms was identified. Singular value decomposition across the feature space of the model revealed both global and local dynamics that could differentiate electrographic and electroclinical seizures and provided insights into highly localized seizure onsets and diffuse seizure terminations. Other common ECoG features such as interictal periods, artifacts, and exogenous stimuli were also analyzed with DDA. This novel framework for signal processing of seizure information demonstrates an ability to reveal unique characteristics of the underlying dynamics of the seizure and may be useful in better understanding, detecting, and maybe even predicting seizures.
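A hedged sketch of a three-term DDA fit: approximate the signal's derivative with a sparse polynomial of delayed copies of the signal, and use the fitted coefficients plus the residual error as features. The delays and terms below are illustrative, not the model selected by the article's genetic algorithm.

import numpy as np

def dda_features(x, tau1=5, tau2=15, dt=1.0):
    dx = np.gradient(x, dt)                           # numerical derivative of the recording
    t0 = max(tau1, tau2)
    X = np.column_stack([x[t0 - tau1:len(x) - tau1],        # x(t - tau1)
                         x[t0 - tau2:len(x) - tau2],        # x(t - tau2)
                         x[t0 - tau1:len(x) - tau1] ** 3])  # cubic nonlinear term
    y = dx[t0:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)    # least-squares fit of the DDA model
    residual = np.sqrt(np.mean((y - X @ coeffs) ** 2))
    return np.append(coeffs, residual)                # 3 coefficients + error = feature vector

signal = np.sin(0.1 * np.arange(2000)) + 0.05 * np.random.randn(2000)
print(dda_features(signal))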
Neural Computation (2017) 29 (11): 2887–2924.
Published: 01 November 2017
Abstract
The statistical dependencies that independent component analysis (ICA) cannot remove often provide rich information beyond the linear independent components. It would thus be very useful to estimate the dependency structure from data. While such models have been proposed, they have usually concentrated on higher-order correlations such as energy (square) correlations. Yet linear correlations are a fundamental and informative form of dependency in many real data sets. Linear correlations are usually completely removed by ICA and related methods so they can only be analyzed by developing new methods that explicitly allow for linearly correlated components. In this article, we propose a probabilistic model of linear nongaussian components that are allowed to have both linear and energy correlations. The precision matrix of the linear components is assumed to be randomly generated by a higher-order process and explicitly parameterized by a parameter matrix. The estimation of the parameter matrix is shown to be particularly simple because using score-matching (Hyvärinen, 2005), the objective function is a quadratic form. Using simulations with artificial data, we demonstrate that the proposed method improves the identifiability of nongaussian components by simultaneously learning their correlation structure. Applications on simulated complex cells with natural image input, as well as spectrograms of natural audio data, show that the method finds new kinds of dependencies between the components.
Neural Computation (2017) 29 (11): 2861–2886.
Published: 01 November 2017
Abstract
Two-node attractor networks are flexible models for neural activity during decision making. Depending on the network configuration, these networks can model distinct aspects of decisions including evidence integration, evidence categorization, and decision memory. Here, we use attractor networks to model recent causal perturbations of the frontal orienting fields (FOF) in rat cortex during a perceptual decision-making task (Erlich, Brunton, Duan, Hanks, & Brody, 2015). We focus on a striking feature of the perturbation results. Pharmacological silencing of the FOF resulted in a stimulus-independent bias. We fit several models to test whether integration, categorization, or decision memory could account for this bias and found that only the memory configuration successfully accounts for it. This memory model naturally accounts for optogenetic perturbations of FOF in the same task and correctly predicts a memory-duration-dependent deficit caused by silencing FOF in a different task. Our results provide mechanistic support for a “postcategorization” memory role of the FOF in upcoming choices.
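A hedged sketch of a generic two-node attractor network of this kind: two units with self-excitation and mutual inhibition, driven by a brief stimulus and holding the resulting choice as persistent activity. All constants are illustrative, not fitted to the FOF data.

import numpy as np

def simulate(stim_left, stim_right, w_self=2.0, w_inh=2.5, T=3.0, dt=1e-3, tau=0.1):
    f = lambda x: 1.0 / (1.0 + np.exp(-4.0 * (x - 0.5)))   # sigmoidal rate function
    r = np.array([0.1, 0.1])
    for step in range(int(T / dt)):
        stim = np.array([stim_left, stim_right]) if step * dt < 0.5 else np.zeros(2)
        drive = w_self * r - w_inh * r[::-1] + stim          # self-excitation minus cross-inhibition
        r += dt / tau * (-r + f(drive))
    return r                                                 # persistent activity encodes the decision

print(simulate(0.8, 0.4))   # left unit wins and stays high after the stimulus ends
print(simulate(0.3, 0.9))   # right unit wins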
Neural Computation (2017) 29 (10): 2581–2632.
Published: 01 October 2017
Abstract
With our ability to record more neurons simultaneously, making sense of these data is a challenge. Functional connectivity is one popular way to study the relationship of multiple neural signals. Correlation-based methods are a set of currently well-used techniques for functional connectivity estimation. However, due to explaining away and unobserved common inputs (Stevenson, Rebesco, Miller, & Körding, 2008), they produce spurious connections. The general linear model (GLM), which models spike trains as Poisson processes (Okatan, Wilson, & Brown, 2005; Truccolo, Eden, Fellows, Donoghue, & Brown, 2005; Pillow et al., 2008), avoids these confounds. We develop here a new class of methods by using differential signals based on simulated intracellular voltage recordings. This approach is equivalent to a regularized AR(2) model. We also expand the method to simulated local field potential recordings and calcium imaging. In all of our simulated data, the differential covariance-based methods achieved performance better than or similar to the GLM method and required fewer data samples. This new class of methods provides alternative ways to analyze neural signals.
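A hedged sketch of a differential-covariance-style estimator: correlate each signal's temporal derivative with the other signals, which breaks the symmetry of ordinary covariance and emphasizes directed influence. This simplified version omits the regularization and partial-covariance steps of the article's full methods.

import numpy as np

def differential_covariance(V):
    # V: array of shape (time, n_neurons) of continuous (e.g., voltage-like) signals
    dV = np.gradient(V, axis=0)                     # temporal derivative of each signal
    Vc = V - V.mean(axis=0)
    dVc = dV - dV.mean(axis=0)
    return dVc.T @ Vc / (len(V) - 1)                # entry (i, j): cov(dV_i/dt, V_j)

# toy example: signal 0 drives signal 1 with a delay
rng = np.random.default_rng(7)
T = 5000
x0 = np.convolve(rng.standard_normal(T), np.ones(20) / 20, mode="same")
x1 = np.roll(x0, 5) * 0.8 + 0.1 * rng.standard_normal(T)
dC = differential_covariance(np.column_stack([x0, x1]))
print(np.round(dC, 3))       # asymmetric off-diagonal entries reflect directed influence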
Neural Computation (2017) 29 (10): 2633–2683.
Published: 01 October 2017
Abstract
This article offers a formal account of curiosity and insight in terms of active (Bayesian) inference. It deals with the dual problem of inferring states of the world and learning its statistical structure. In contrast to current trends in machine learning (e.g., deep learning), we focus on how people attain insight and understanding using just a handful of observations, which are solicited through curious behavior. We use simulations of abstract rule learning and approximate Bayesian inference to show that minimizing (expected) variational free energy leads to active sampling of novel contingencies. This epistemic behavior closes explanatory gaps in generative models of the world, thereby reducing uncertainty and satisfying curiosity. We then move from epistemic learning to model selection or structure learning to show how abductive processes emerge when agents test plausible hypotheses about symmetries (i.e., invariances or rules) in their generative models. The ensuing Bayesian model reduction evinces mechanisms associated with sleep and has all the hallmarks of “aha” moments. This formulation moves toward a computational account of consciousness in the pre-Cartesian sense of sharable knowledge (i.e., con: “together”; scire: “to know”).
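As background, epistemic behavior of this kind is usually formalized by minimizing an expected free energy whose two terms correspond to information gain (curiosity) and expected preferences; a representative decomposition (notation may differ from the article's) is

G(\pi) \;=\; -\,\mathbb{E}_{q(o,s\mid\pi)}\!\big[\ln q(s\mid o,\pi) - \ln q(s\mid\pi)\big] \;-\; \mathbb{E}_{q(o\mid\pi)}\!\big[\ln p(o)\big],

where the first term is the expected information gain about hidden states s under policy \pi and the second is the expected log preference over outcomes o.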