1-20 of 1087
Letters
Journal Articles
Publisher: Journals Gateway
Neural Computation (2018) 30 (10): 2805–2832.
Published: 01 October 2018
Abstract
Although the number of artificial neural network and machine learning architectures is growing at an exponential pace, more attention needs to be paid to theoretical guarantees of asymptotic convergence for novel, nonlinear, high-dimensional adaptive learning algorithms. When properly understood, such guarantees can guide the algorithm development and evaluation process and provide theoretical validation for a particular algorithm design. For many decades, the machine learning community has widely recognized the importance of stochastic approximation theory as a powerful tool for identifying explicit convergence conditions for adaptive learning machines. However, the verification of such conditions is challenging for multidisciplinary researchers not working in the area of stochastic approximation theory. For this reason, this letter presents a new stochastic approximation theorem for both passive and reactive learning environments with assumptions that are easily verifiable. The theorem is widely applicable to the analysis and design of important machine learning algorithms including deep learning algorithms with multiple strict local minimizers, Monte Carlo expectation-maximization algorithms, contrastive divergence learning in Markov fields, and policy gradient reinforcement learning.
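As a minimal illustration of the classical stochastic approximation setting the letter builds on (a toy example of ours, not the letter's theorem): a noisy gradient iterate with Robbins-Monro step sizes, whose sum diverges while the sum of their squares converges, is driven to the minimizer of a simple quadratic despite the noise.

```python
import numpy as np

# Hypothetical sketch: Robbins-Monro step sizes (sum divergent, sum of
# squares convergent) pull a noisy gradient iterate toward the minimizer
# of f(x) = 0.5 * (x - target)**2 despite unit-variance gradient noise.
rng = np.random.default_rng(0)
theta = 5.0                       # initial guess
target = 2.0                      # true minimizer
for t in range(1, 20001):
    grad = (theta - target) + rng.normal(0.0, 1.0)  # noisy gradient oracle
    theta -= (1.0 / t) * grad     # step size 1/t satisfies both conditions
```

With step size 1/t this recursion is exactly a running average of noisy observations, which is why the iterate settles at the target.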
Neural Computation (2018) 30 (10): 2757–2780.
Published: 01 October 2018
Abstract
Modeling and interpreting spike train data is a task of central importance in computational neuroscience, with significant translational implications. Two popular classes of data-driven models for this task are autoregressive point-process generalized linear models (PPGLM) and latent state-space models (SSM) with point-process observations. In this letter, we derive a mathematical connection between these two classes of models. By introducing an auxiliary history process, we represent exactly a PPGLM in terms of a latent, infinite-dimensional dynamical system, which can then be mapped onto an SSM by basis function projections and moment closure. This representation provides a new perspective on widely used methods for modeling spike data and also suggests novel algorithmic approaches to fitting such models. We illustrate our results on a phasic bursting neuron model, showing that our proposed approach provides an accurate and efficient way to capture neural dynamics.
Neural Computation (2018) 30 (10): 2781–2804.
Published: 01 October 2018
Abstract
Least squares regression (LSR) is a fundamental statistical analysis technique that has been widely applied to feature learning. However, owing to its simplicity, it easily neglects the local structure of the data, and many methods have considered using an orthogonal constraint to preserve more local information. Another major drawback of LSR is that the loss between soft regression results and hard target values cannot precisely reflect classification ability; this motivates the large margin constraint. We therefore draw on the concepts of large margin and orthogonal constraint to propose a novel algorithm, orthogonal least squares regression with large margin (OLSLM), for multiclass classification in this letter. The core task of this algorithm is to learn regression targets from data and an orthogonal transformation matrix simultaneously, such that the proposed model not only ensures that every data point can be correctly classified with a larger margin than conventional least squares regression but also preserves more local structure information in the subspace. Our efficient iterative optimization method for the large margin and orthogonal constraints is proved to be convergent in both theory and practice. We also apply the large margin constraint when generating a sparse learning model for feature selection via joint ℓ2,1-norm minimization on both the loss function and the regularization terms. Experimental results validate that our method performs better than state-of-the-art methods on various real-world data sets.
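For readers unfamiliar with the baseline, here is a minimal sketch of conventional least squares regression for multiclass classification with one-hot hard targets, the model that OLSLM's margin and orthogonality constraints refine; the synthetic data and the small ridge term are our illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Conventional LSR baseline (not OLSLM): regress one-hot targets on the
# features in closed form, then classify by the largest regression output.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
Y = np.eye(2)[y]                            # one-hot "hard" target matrix
Xb = np.hstack([X, np.ones((100, 1))])      # append a bias column
lam = 1e-3                                  # tiny ridge term for stability
W = np.linalg.solve(Xb.T @ Xb + lam * np.eye(3), Xb.T @ Y)
pred = np.argmax(Xb @ W, axis=1)
accuracy = np.mean(pred == y)
```

The abstract's point is that the squared loss against these hard targets says little about the margin of the decision; OLSLM learns the targets and an orthogonal transformation instead.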
Neural Computation (2018) 30 (10): 2726–2756.
Published: 01 October 2018
Abstract
In recent years, the development of algorithms to detect neuronal spiking activity from two-photon calcium imaging data has received much attention, yet few researchers have examined the metrics used to assess the similarity of detected spike trains with the ground truth. We highlight the limitations of the two most commonly used metrics, the spike train correlation and success rate, and propose an alternative, which we refer to as CosMIC. Rather than operating on the true and estimated spike trains directly, the proposed metric assesses the similarity of the pulse trains obtained from convolution of the spike trains with a smoothing pulse. The pulse width, which is derived from the statistics of the imaging data, reflects the temporal tolerance of the metric. The final metric score is the size of the commonalities of the pulse trains as a fraction of their average size. Viewed through the lens of set theory, CosMIC resembles a continuous Sørensen-Dice coefficient, an index commonly used to assess the similarity of discrete, presence/absence data. We demonstrate the ability of the proposed metric to discriminate the precision and recall of spike train estimates. Unlike the spike train correlation, which appears to reward overestimation, the proposed metric score is maximized when the correct number of spikes has been detected. Furthermore, we show that CosMIC is more sensitive to the temporal precision of estimates than the success rate.
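The scoring idea can be sketched in a few lines; the boxcar pulse shape, the spike times, and the function name below are our illustrative assumptions, not the paper's exact construction, which derives the pulse from the imaging statistics.

```python
import numpy as np

# Toy CosMIC-style score: smooth both spike trains with a pulse, take the
# pointwise minimum as the "commonality", and report it as a fraction of
# the average pulse-train size (a continuous Sorensen-Dice coefficient).
def pulse_dice(true_spikes, est_spikes, n=200, width=5):
    pulse = np.ones(width)                      # boxcar smoothing pulse
    a = np.convolve(np.bincount(true_spikes, minlength=n), pulse)
    b = np.convolve(np.bincount(est_spikes, minlength=n), pulse)
    common = np.minimum(a, b).sum()             # size of the commonality
    return 2.0 * common / (a.sum() + b.sum())   # fraction of average size

perfect = pulse_dice([20, 80, 140], [20, 80, 140])   # exact estimate
jittered = pulse_dice([20, 80, 140], [22, 78, 141])  # small timing errors
missed = pulse_dice([20, 80, 140], [20, 80])         # one spike missed
```

Jitter within the pulse width is only partially penalized, while a missed spike reduces the score by its full pulse mass, which matches the temporal-tolerance behavior the abstract describes.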
Neural Computation (2018) 30 (10): 2691–2725.
Published: 01 October 2018
Abstract
Grid cells of the rodent entorhinal cortex are essential for spatial navigation. Although their function is commonly believed to be either path integration or localization, the origin or purpose of their hexagonal firing fields remains disputed. Here they are proposed to arise as an optimal encoding of transitions in sequences. First, storage requirements for transitions in general episodic sequences are examined using propositional logic and graph theory. Subsequently, transitions in complete metric spaces are considered under the assumption of an ideal sampling of an input space. It is shown that memory capacity of neurons that have to encode multiple feasible spatial transitions is maximized by a hexagonal pattern. Grid cells are proposed to encode spatial transitions in spatiotemporal sequences, with the entorhinal-hippocampal loop forming a multitransition system.
Neural Computation (2018) 30 (10): 2833–2854.
Published: 01 October 2018
Abstract
This study focuses on predicting stock closing prices by using recurrent neural networks (RNNs). A long short-term memory (LSTM) model, a type of RNN coupled with basic stock trading data and technical indicators, is introduced as a novel method to predict the closing price of the stock market. We reduce the dimension of the technical indicators by principal component analysis (PCA). To train the model, several optimization strategies are followed, including adaptive moment estimation (Adam) and Glorot uniform initialization. Case studies are conducted on Standard & Poor's 500, NASDAQ, and Apple (AAPL). Extensive comparison experiments are performed using a series of evaluation criteria to evaluate this model. Accurate prediction of the stock market is considered extremely challenging because of its noisy environment and the high volatility associated with external factors. We hope the methodology we propose advances research on analyzing and predicting stock time series. As the experimental results suggest, the proposed model achieves a good level of fitness.
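A hedged sketch of the PCA preprocessing step only (the indicator matrix is synthetic, and the 95% retained-variance threshold is our assumption, not the paper's): reduce a correlated block of technical indicators to a few components before feeding them to the LSTM.

```python
import numpy as np

# Synthetic "technical indicators": 10 observed series driven by 3 factors.
rng = np.random.default_rng(2)
latent = rng.normal(size=(250, 3))                 # 3 underlying factors
mixing = rng.normal(size=(3, 10))
indicators = latent @ mixing + 0.01 * rng.normal(size=(250, 10))

# PCA via SVD of the centered matrix; keep components up to 95% variance.
centered = indicators - indicators.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
explained = (S**2) / (S**2).sum()                  # variance ratio per component
k = int(np.searchsorted(np.cumsum(explained), 0.95) + 1)
reduced = centered @ Vt[:k].T                      # low-dimensional LSTM input
```

Because only three factors generate the data, the retained dimension collapses to at most three components while keeping essentially all the variance.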
Neural Computation (2018) 30 (10): 2660–2690.
Published: 01 October 2018
Abstract
Neural-inspired spike-based computing machines often claim to achieve considerable advantages in terms of energy and time efficiency by using spikes for computation and communication. However, fundamental questions about spike-based computation remain unanswered. For instance, how much advantage do spike-based approaches have over conventional methods, and under what circumstances does spike-based computing provide a comparative advantage? Simply implementing existing algorithms using spikes as the medium of computation and communication is not guaranteed to yield an advantage. Here, we demonstrate that spike-based communication and computation within algorithms can increase throughput, and they can decrease energy cost in some cases. We present several spiking algorithms, including sorting a set of numbers in ascending/descending order, as well as finding the maximum or minimum or median of a set of numbers. We also provide an example application: a spiking median-filtering approach for image processing providing a low-energy, parallel implementation. The algorithms and analyses presented here demonstrate that spiking algorithms can provide performance advantages and offer efficient computation of fundamental operations useful in more complex algorithms.
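The sorting idea can be caricatured in a few lines (our toy construction, not the paper's spiking circuits): encode each nonnegative integer as a neuron that fires at a time equal to its value; reading off the firing order as a simulated clock advances yields the sorted order.

```python
# Toy time-to-spike sort: neuron i fires at time values[i]; the order in
# which spikes arrive is the ascending order of the inputs.
def spike_sort(values):
    fire_time = {i: v for i, v in enumerate(values)}  # neuron -> firing time
    order = []
    for t in range(max(values) + 1):                  # advance the clock
        for i, v in fire_time.items():
            if v == t:
                order.append(v)                       # record spikes as they occur
    return order

result = spike_sort([5, 1, 4, 2, 8])
```

The point of the abstract is that such latency codes can trade wall-clock time against energy and circuit complexity; this sketch only conveys the encoding, not the efficiency analysis.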
Neural Computation (2018) 30 (9): 2418–2438.
Published: 01 September 2018
Abstract
The extreme complexity of the brain has long attracted the attention of neuroscientists and other researchers. More recently, neuromorphic hardware has matured to provide a powerful new tool for studying neuronal dynamics. Here, we study neuronal dynamics using different settings on a neuromorphic chip built with flexible parameters of neuron models. Our unique setting in the network of leaky integrate-and-fire (LIF) neurons is to introduce a weak noise environment. We observed three different types of collective neuronal activities, or phases, separated by sharp boundaries, or phase transitions. From this, we construct a rudimentary phase diagram of neuronal dynamics and demonstrate that a noise-induced chaotic phase (N-phase), which is dominated by neuronal avalanche activity (intermittent aperiodic neuron firing), emerges in the presence of noise, and that its width grows with the noise intensity. The dynamics can be manipulated in this N-phase. Our results and comparison with clinical data are consistent with the literature and with our previous work showing that the healthy brain must reside in the N-phase. We argue that the brain phase diagram, with further refinement, may be used for the diagnosis and treatment of mental disease, and we also suggest that the dynamics may be manipulated to serve as a means of new information processing (e.g., for optimization). Neuromorphic chips similar to the one we used, but with a variety of neuron models, may further enhance the understanding of human brain function and accelerate the development of neuroscience research.
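A minimal LIF sketch of the noise effect (all parameters are illustrative, not the chip's): with a subthreshold drive the neuron is silent, while weak noise alone produces intermittent, aperiodic firing of the kind the N-phase description refers to.

```python
import numpy as np

# Leaky integrate-and-fire neuron: dv = dt/tau * (-v + i_in) + noise.
# The constant input is subthreshold (i_in < v_th), so only noise can
# push the membrane potential over threshold.
def lif_spike_count(noise_std, seed=3, steps=5000, dt=1.0):
    rng = np.random.default_rng(seed)
    v, tau, v_th, i_in = 0.0, 20.0, 1.0, 0.9   # illustrative parameters
    spikes = 0
    for _ in range(steps):
        v += dt / tau * (-v + i_in) + noise_std * rng.normal()
        if v >= v_th:                           # threshold crossing
            spikes += 1
            v = 0.0                             # reset after the spike
    return spikes

quiet = lif_spike_count(noise_std=0.0)          # deterministic: never fires
noisy = lif_spike_count(noise_std=0.05)         # weak noise: intermittent firing
```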
Neural Computation (2018) 30 (9): 2568–2591.
Published: 01 September 2018
Abstract
Rule extraction from black box models is critical in domains that require model validation before implementation, as can be the case in credit scoring and medical diagnosis. Though already a challenging problem in statistical learning in general, the difficulty is even greater when highly nonlinear, recursive models, such as recurrent neural networks (RNNs), are fit to data. Here, we study the extraction of rules from second-order RNNs trained to recognize the Tomita grammars. We show that production rules can be stably extracted from trained RNNs and that in certain cases, the rules outperform the trained RNNs.
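For concreteness, the kind of symbolic rules at stake can be written down as a deterministic finite automaton. The DFA below encodes Tomita grammar 4 (strings over {0,1} containing no "000" substring); it is our illustration of the target language, not the rules extracted from the trained RNNs in the letter.

```python
# DFA for Tomita grammar 4: accept iff the string never contains "000".
# States count the current run of consecutive zeros; "dead" is absorbing.
TOMITA4 = {                      # state -> {symbol: next_state}
    "q0": {"0": "q1", "1": "q0"},
    "q1": {"0": "q2", "1": "q0"},
    "q2": {"0": "dead", "1": "q0"},
    "dead": {"0": "dead", "1": "dead"},
}

def accepts(string):
    state = "q0"
    for ch in string:
        state = TOMITA4[state][ch]
    return state != "dead"       # every non-dead state is accepting

a = accepts("1100101")           # no "000": accepted
b = accepts("110001")            # contains "000": rejected
```

Rule extraction asks whether a trained second-order RNN's state dynamics can be stably quantized into an automaton like this one.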
Neural Computation (2018) 30 (9): 2348–2383.
Published: 01 September 2018
Abstract
This letter makes scientific and methodological contributions. Scientifically, it demonstrates a new and behaviorally relevant effect of temporal expectation on the phase coherence of the electroencephalogram (EEG). Methodologically, it introduces novel methods to characterize EEG recordings at the single-trial level. Expecting events in time can lead to more efficient behavior. A remarkable finding in the study of temporal expectation is the foreperiod effect on reaction time, that is, the influence on reaction time of the delay between a warning signal and a succeeding imperative stimulus to which subjects are instructed to respond as quickly as possible. Here we study a new foreperiod effect in an audiovisual attention-shifting oddball task in which attention-shift cues directed the attention of subjects to impending deviant stimuli of a given modality and therefore acted as warning signals for these deviants. Standard stimuli, to which subjects did not respond, were interspersed between warning signals and deviants. We hypothesized that foreperiod durations modulated intertrial phase coherence (ITPC, the degree of phase alignment across multiple trials) evoked by behaviorally irrelevant standards and that these modulations are behaviorally meaningful. Using averaged data, we first observed that ITPC evoked by standards closer to the warning signal was significantly different from that evoked by standards further away from it, establishing a new foreperiod effect on ITPC evoked by standards. We call this effect the standard foreperiod (SFP) effect on ITPC. We reasoned that if the SFP influences ITPC evoked by standards, it should be possible to decode the former from the latter on a trial-by-trial basis. We were able to do so, showing that this effect can be observed in single trials. We demonstrated the behavioral relevance of the SFP effect on ITPC by showing significant correlations between its strength and subjects' behavioral performance.
Includes: Supplementary data
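The central quantity, intertrial phase coherence, has a compact definition: the length of the average unit phase vector across trials. A short sketch on synthetic phases (our data, not the study's recordings):

```python
import numpy as np

# ITPC = |mean over trials of exp(i * phase)|: near 1 for phase-locked
# trials, near 0 for uniformly scattered phases.
def itpc(phases):
    return np.abs(np.mean(np.exp(1j * np.asarray(phases))))

rng = np.random.default_rng(4)
locked = rng.normal(0.0, 0.1, 500)            # phases clustered around 0 rad
scattered = rng.uniform(-np.pi, np.pi, 500)   # uniformly random phases
high = itpc(locked)
low = itpc(scattered)
```

The letter's methodological contribution is to move from this trial-averaged quantity to single-trial characterizations that can be decoded.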
Neural Computation (2018) 30 (9): 2500–2529.
Published: 01 September 2018
Abstract
Estimation of a generating partition is critical for symbolization of measurements from discrete-time dynamical systems, where a sequence of symbols from a (finite-cardinality) alphabet may uniquely specify the underlying time series. Such symbolization is useful for computing measures (e.g., Kolmogorov-Sinai entropy) to identify or characterize the (possibly unknown) dynamical system. It is also useful for time series classification and anomaly detection. The seminal work of Hirata, Judd, and Kilminster (2004) derives a novel objective function, akin to a clustering objective, that measures the discrepancy between a set of reconstruction values and the points from the time series. They cast estimation of a generating partition as minimization of their objective function. Unfortunately, their proposed algorithm is nonconvergent, with no guarantee of finding even locally optimal solutions with respect to their objective. The difficulty is a heuristic nearest neighbor symbol assignment step. Alternatively, we develop a novel, locally optimal algorithm for their objective. We apply iterative nearest-neighbor symbol assignments with guaranteed discrepancy descent, by which joint, locally optimal symbolization of the entire time series is achieved. While most previous approaches frame generating partition estimation as a state-space partitioning problem, we recognize that minimizing the Hirata et al. (2004) objective function does not induce an explicit partitioning of the state space, but rather of the space consisting of the entire time series (effectively, clustering in a (countably) infinite-dimensional space). Our approach also amounts to a novel type of sliding block lossy source coding. Improvement, with respect to several measures, is demonstrated over popular methods for symbolizing chaotic maps. We also apply our approach to time-series anomaly detection, considering both chaotic maps and a failure application in a polycrystalline alloy material.
Neural Computation (2018) 30 (9): 2384–2417.
Published: 01 September 2018
Abstract
Apparent motion of the surroundings on an agent's retina can be used to navigate through cluttered environments, avoid collisions with obstacles, or track targets of interest. The pattern of apparent motion of objects (i.e., the optic flow) contains spatial information about the surrounding environment. For a small, fast-moving agent, such as those used in search and rescue missions, it is crucial to estimate the distance to close-by objects quickly in order to avoid collisions. This estimation cannot be done by conventional methods, such as frame-based optic flow estimation, given the size, power, and latency constraints of the necessary hardware. A practical alternative makes use of event-based vision sensors. Contrary to the frame-based approach, they produce so-called events only when there are changes in the visual scene. We propose a novel asynchronous circuit, the spiking elementary motion detector (sEMD), composed of a single silicon neuron and synapse, to detect elementary motion from an event-based vision sensor. The sEMD encodes the time an object's image needs to travel across the retina into a burst of spikes. The number of spikes within the burst is proportional to the speed of events across the retina. A fast but imprecise estimate of the time-to-travel can already be obtained from the first two spikes of a burst and refined by subsequent interspike intervals. The latter encoding scheme is possible due to an adaptive nonlinear synaptic efficacy scaling. We show that the sEMD can be used to compute a collision avoidance direction in the context of robotic navigation in a cluttered outdoor environment, and we compare this direction with that of a frame-based algorithm. The proposed computational principle constitutes a generic spiking temporal correlation detector that can be applied to other sensory modalities (e.g., sound localization), and it provides a novel perspective on gating information in spiking neural networks.
Includes: Supplementary data
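A toy sketch of the sEMD encoding idea (entirely our own construction, not the paper's circuit): the time an edge takes to travel between two neighboring event-based pixels is turned into a burst whose spike count grows with speed, so a shorter time-to-travel yields a larger burst within a fixed tuning window.

```python
# Toy burst encoder: map the travel time between two pixel events to a
# spike count; faster motion (smaller dt) produces a larger burst.
def burst_spikes(t_first, t_second, window=50, rate_scale=100):
    dt = t_second - t_first              # time-to-travel between two pixels
    if dt <= 0 or dt > window:
        return 0                         # outside the detector's tuning window
    return max(1, rate_scale // dt)      # faster motion -> larger burst

fast = burst_spikes(0, 2)                # quickly traveling edge
slow = burst_spikes(0, 25)               # slowly traveling edge
```

The real sEMD achieves this mapping with a single silicon neuron and an adaptive synapse, and refines the estimate with successive interspike intervals.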
Neural Computation (2018) 30 (9): 2472–2499.
Published: 01 September 2018
Abstract
A hippocampal prosthesis is a very large scale integration (VLSI) biochip that needs to be implanted in the biological brain to treat cognitive dysfunction. In this letter, we propose a novel low-complexity, small-area, and low-power programmable hippocampal neural network application-specific integrated circuit (ASIC) for a hippocampal prosthesis. It is based on the nonlinear dynamical model of the hippocampus, namely, the multi-input, multi-output (MIMO) generalized Laguerre-Volterra model (GLVM), and can realize real-time prediction of hippocampal neural activity. A new hardware architecture, a storage space configuration scheme, a low-power convolution module, and a gaussian random number generator module are proposed. The ASIC is fabricated in 40 nm technology with a core area of 0.122 mm² and a test power of 84.4 μW. Compared with a design based on the traditional architecture, experimental results show that the core area of the chip is reduced by 84.94% and the core power by 24.30%.
Neural Computation (2018) 30 (9): 2530–2567.
Published: 01 September 2018
Abstract
When modeling goal-directed behavior in the presence of various sources of uncertainty, planning can be described as an inference process. A solution to the problem of planning as inference was previously proposed in the active inference framework in the form of an approximate inference scheme based on variational free energy. However, this approximate scheme was based on the mean-field approximation, which assumes statistical independence of hidden variables and is known to show overconfidence and may converge to local minima of the free energy. To better capture the spatiotemporal properties of an environment, we reformulated the approximate inference process using the so-called Bethe approximation. Importantly, the Bethe approximation allows for representation of pairwise statistical dependencies. Under these assumptions, the minimizer of the variational free energy corresponds to the belief propagation algorithm, commonly used in machine learning. To illustrate the differences between the mean-field approximation and the Bethe approximation, we have simulated agent behavior in a simple goal-reaching task with different types of uncertainties. Overall, the Bethe agent achieves higher success rates in reaching goal states. We relate the better performance of the Bethe agent to more accurate predictions about the consequences of its own actions. Consequently, active inference based on the Bethe approximation extends the application range of active inference to more complex behavioral tasks.
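A minimal example of the algorithm that minimizing the Bethe free energy recovers (our own toy factor graph, not the paper's active-inference model): sum-product belief propagation on a three-variable chain, checked against brute-force enumeration, where it is exact because the graph has no loops.

```python
import numpy as np

# Chain x1 - x2 - x3 with two pairwise factors; BP messages into x2 give
# its exact marginal on a tree-structured graph.
rng = np.random.default_rng(8)
psi1 = rng.uniform(0.1, 1.0, (2, 2))     # factor psi1[x1, x2]
psi2 = rng.uniform(0.1, 1.0, (2, 2))     # factor psi2[x2, x3]

m12 = psi1.sum(axis=0)                   # message x1 -> x2 (sum over x1)
m32 = psi2.sum(axis=1)                   # message x3 -> x2 (sum over x3)
belief_x2 = m12 * m32                    # product of incoming messages
belief_x2 /= belief_x2.sum()             # normalize to a distribution

# Brute-force marginal of x2 for comparison.
joint = psi1[:, :, None] * psi2[None, :, :]          # shape (x1, x2, x3)
brute = joint.sum(axis=(0, 2))
brute /= brute.sum()
```

On loopy graphs BP is no longer exact, but, as the abstract notes, its fixed points correspond to stationary points of the Bethe free energy, which is what licenses its use as an approximate planning-as-inference scheme.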
Neural Computation (2018) 30 (9): 2439–2471.
Published: 01 September 2018
Abstract
Computer vision algorithms are often limited in their application by the large amount of data that must be processed. Mammalian vision systems mitigate this high bandwidth requirement by prioritizing certain regions of the visual field with neural circuits that select the most salient regions. This work introduces a novel and computationally efficient visual saliency algorithm for performing this neuromorphic, attention-based data reduction. The proposed algorithm has the added advantage that it is compatible with an analog CMOS design while still achieving performance comparable to existing state-of-the-art saliency algorithms. This compatibility allows for direct integration with the analog-to-digital conversion circuitry present in CMOS image sensors. This integration leads to power savings in the converter by quantizing only the salient pixels. Further system-level power savings are gained by reducing the amount of data that must be transmitted and processed in the digital domain. The analog CMOS compatible formulation relies on a pulse width (i.e., time mode) encoding of the pixel data that is compatible with pulse-mode imagers and slope-based converters often used in imager designs. This letter begins by discussing this time-mode encoding for implementing neuromorphic architectures. Next, the proposed algorithm is derived, and hardware-oriented optimizations and modifications to this algorithm are proposed and discussed. A metric for quantifying saliency accuracy is then proposed, and simulation results for this metric are presented. Finally, an analog synthesis approach for a time-mode architecture is outlined, and postsynthesis transistor-level simulations demonstrating functionality of an implementation in a modern CMOS process are discussed.
Neural Computation (2018) 30 (8): 2210–2244.
Published: 01 August 2018
Abstract
Biological networks have long been known to be modular, containing sets of nodes that are highly connected internally. Less emphasis, however, has been placed on understanding how intermodule connections are distributed within a network. Here, we borrow ideas from engineered circuit design and study Rentian scaling, which states that the number of external connections between nodes in different modules is related to the number of nodes inside the modules by a power-law relationship. We tested this property in a broad class of molecular networks, including protein interaction networks for six species and gene regulatory networks for 41 human and 25 mouse cell types. Using evolutionarily defined modules corresponding to known biological processes in the cell, we found that all networks displayed Rentian scaling with a broad range of exponents. We also found evidence for Rentian scaling in functional modules in the Caenorhabditis elegans neural network, but, interestingly, not in three different social networks, suggesting that this property does not inevitably emerge. To understand how such scaling may have arisen evolutionarily, we derived a new graph model that can generate Rentian networks given a target Rent exponent and a module decomposition as inputs. Overall, our work uncovers a new principle shared by engineered circuits and biological networks.
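The Rentian scaling test itself is a simple log-log fit. A sketch under our own assumptions (synthetic module-size and edge-count data generated with a known exponent, not the paper's biological networks): fit the Rent exponent p in E = k·N^p by linear regression in log space.

```python
import numpy as np

# Synthetic Rentian data: external edge count E follows an exact power law
# E = 3 * N**0.75 of the module size N, so the fit should recover p = 0.75.
sizes = np.array([8, 16, 32, 64, 128, 256])       # nodes inside each module
edges = 3.0 * sizes**0.75                         # external connections
slope, intercept = np.polyfit(np.log(sizes), np.log(edges), 1)
rent_exponent = slope                             # recovered Rent exponent p
```

On real networks the points scatter around the line, and the estimated exponent (and the quality of the linear fit) is what distinguishes Rentian from non-Rentian organization.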
Neural Computation (2018) 30 (8): 2245–2283.
Published: 01 August 2018
Abstract
Many machine learning problems can be formulated as predicting labels for a pair of objects. Problems of that kind are often referred to as pairwise learning, dyadic prediction, or network inference problems. During the past decade, kernel methods have played a dominant role in pairwise learning. They still obtain a state-of-the-art predictive performance, but a theoretical analysis of their behavior has been underexplored in the machine learning literature. In this work we review and unify kernel-based algorithms that are commonly used in different pairwise learning settings, ranging from matrix filtering to zero-shot learning. To this end, we focus on closed-form efficient instantiations of Kronecker kernel ridge regression. We show that independent task kernel ridge regression, two-step kernel ridge regression, and a linear matrix filter arise naturally as a special case of Kronecker kernel ridge regression, implying that all these methods implicitly minimize a squared loss. In addition, we analyze universality, consistency, and spectral filtering properties. Our theoretical results provide valuable insights into assessing the advantages and limitations of existing pairwise learning methods.
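The closed-form efficiency the abstract refers to can be sketched as follows (random positive semidefinite kernels of our choosing, not a real pairwise-learning data set): with eigendecompositions of the two factor kernels, the Kronecker kernel ridge regression system (G ⊗ K + λI)a = vec(Y) is solved without ever forming the Kronecker product.

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.normal(size=(4, 4))
K = A @ A.T                                     # object-side kernel (4 objects)
B = rng.normal(size=(3, 3))
G = B @ B.T                                     # task-side kernel (3 tasks)
Y = rng.normal(size=(4, 3))                     # pairwise labels
lam = 0.1                                       # ridge parameter

# Eigenbasis solve: eigenvalues of (G kron K) are all products lk_i * lg_j,
# so the big linear system becomes elementwise division.
lk, Vk = np.linalg.eigh(K)
lg, Vg = np.linalg.eigh(G)
C = (Vk.T @ Y @ Vg) / (np.outer(lk, lg) + lam)
alpha = Vk @ C @ Vg.T                           # dual coefficients, 4 x 3

# Naive check: explicitly form (G kron K + lam*I) and solve for vec(Alpha).
big = np.kron(G, K) + lam * np.eye(12)
naive = np.linalg.solve(big, Y.T.ravel()).reshape(3, 4).T
```

Forming the 12×12 system here is only for verification; the eigenbasis route is what makes the Kronecker methods scale, since it never materializes the product kernel.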
Neural Computation (2018) 30 (8): 2284–2318.
Published: 01 August 2018
Abstract
In this letter, we study the confounder detection problem in the linear model, where the target variable Y is predicted using its n potential causes X_n = (x_1, …, x_n)^T. Based on an assumption of a rotation-invariant generating process of the model, a recent study shows that the spectral measure induced by the regression coefficient vector with respect to the covariance matrix of X_n is close to a uniform measure in purely causal cases but differs from a uniform measure characteristically in the presence of a scalar confounder. Analyzing spectral measure patterns could thus help to detect confounding. In this letter, we propose to use the first moment of the spectral measure for confounder detection. We calculate the first moment of the regression vector–induced spectral measure and compare it with the first moment of a uniform spectral measure, both defined with respect to the covariance matrix of X_n. The two moments coincide in nonconfounding cases and differ from each other in the presence of confounding. This statistical causal-confounding asymmetry can be used for confounder detection. Without the need to analyze the spectral measure pattern, our method avoids the difficulty of metric choice and multiple parameter optimization. Experiments on synthetic and real data show the performance of this method.
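An illustrative computation of the two moments being compared (our reading of the construction, on synthetic data): the regression vector a induces a measure that puts weight (v_i·a)² on each eigenvalue λ_i of cov(X_n), and its first moment is compared with that of the uniform measure, trace(Σ)/n.

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(1000, 5)) @ rng.normal(size=(5, 5))   # correlated features
Sigma = np.cov(X.T)
lams, V = np.linalg.eigh(Sigma)          # eigenvalues and eigenvectors of cov
a = rng.normal(size=5)                   # stand-in regression coefficient vector

w = (V.T @ a) ** 2                       # spectral-measure weights (v_i . a)^2
first_moment = (w * lams).sum() / w.sum()   # measure-weighted mean eigenvalue
uniform_moment = lams.mean()                # uniform measure: trace(Sigma) / n
```

The detection idea is that under the rotation-invariant generating assumption these two moments agree in purely causal cases, so a systematic gap between them signals a confounder.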
Neural Computation (2018) 30 (8): 2175–2209.
Published: 01 August 2018
Abstract
It has been suggested that reactivation of previously acquired experiences or stored information in declarative memories in the hippocampus and neocortex contributes to memory consolidation and learning. Understanding memory consolidation depends crucially on the development of robust statistical methods for assessing memory reactivation. To date, several statistical methods have been established for assessing memory reactivation based on bursts of ensemble neural spike activity during offline states. Using population-decoding methods, we propose a new statistical metric, the weighted distance correlation, to assess hippocampal memory reactivation (i.e., spatial memory replay) during quiet wakefulness and slow-wave sleep. The new metric can be combined with an unsupervised population decoding analysis, which is invariant to latent state labeling and allows us to detect statistical dependency beyond linearity in memory traces. We validate the new metric using two rat hippocampal recordings in spatial navigation tasks. Our proposed analysis framework may have a broader impact on assessing memory reactivations in other brain regions under different behavioral tasks.
Includes: Supplementary data
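A sketch of the plain (unweighted) distance correlation that the proposed weighted variant builds on; the data are synthetic, and the point is the property the abstract relies on, namely that the statistic detects nonlinear dependence that Pearson correlation misses.

```python
import numpy as np

# Biased sample distance correlation for 1-D variables: double-center the
# pairwise distance matrices, then normalize their inner product.
def dist_corr(x, y):
    def centered(v):
        d = np.abs(v[:, None] - v[None, :])        # pairwise distance matrix
        return d - d.mean(0) - d.mean(1)[:, None] + d.mean()
    A = centered(np.asarray(x, float))
    B = centered(np.asarray(y, float))
    dcov2 = (A * B).mean()                         # squared distance covariance
    return np.sqrt(dcov2 / np.sqrt((A * A).mean() * (B * B).mean()))

rng = np.random.default_rng(7)
x = rng.uniform(-1, 1, 300)
nonlinear = dist_corr(x, x**2)                     # dependent but uncorrelated
independent = dist_corr(x, rng.uniform(-1, 1, 300))
```

Here x and x² have (near) zero Pearson correlation yet a clearly nonzero distance correlation, which is the "dependency beyond linearity" the metric is designed to capture.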
Neural Computation (2018) 30 (8): 2113–2174.
Published: 01 August 2018
Abstract
We explore classifier training for data sets with very few labels. We investigate this task using a neural network for nonnegative data. The network is derived from a hierarchical normalized Poisson mixture model with one observed and two hidden layers. With the single objective of likelihood optimization, both labeled and unlabeled data are naturally incorporated into learning. The neural activation and learning equations resulting from our derivation are concise and local. As a consequence, the network can be scaled using standard deep learning tools for parallelized GPU implementation. Using standard benchmarks for nonnegative data, such as text document representations, MNIST, and NIST SD19, we study the classification performance when very few labels are used for training. In different settings, the network's performance is compared to standard and recently suggested semisupervised classifiers. While other recent approaches are more competitive for many labels or fully labeled data sets, we find that the network studied here can be applied with numbers of labels so small that no other system has been reported to operate there so far.