Masato Okada
1–13 of 13 journal articles, Neural Computation
Neural Computation (2018) 30 (1): 1–33.
Published: 01 January 2018
The dynamics of supervised learning play a central role in deep learning and take place in the parameter space of a multilayer perceptron (MLP). We review the history of supervised stochastic gradient learning, focusing on its singular structure and natural gradient. The parameter space includes singular regions in which parameters are not identifiable. One of our results is a full exploration of the dynamical behaviors of stochastic gradient learning in an elementary singular network. The bad news is its pathological nature, in which part of the singular region becomes an attractor and another part a repulser at the same time, forming a Milnor attractor. A learning trajectory is attracted by the attractor region, staying in it for a long time, before it escapes the singular region through the repulser region. This is typical of plateau phenomena in learning. We demonstrate the strange topology of a singular region by introducing blow-down coordinates, which are useful for analyzing the natural gradient dynamics. We confirm that the natural gradient dynamics are free of critical slowdown. The second main result is the good news: the interactions of elementary singular networks eliminate the attractor part, and the Milnor-type attractors disappear. This explains why large-scale networks do not suffer from serious critical slowdowns due to singularities. We finally show that the unit-wise natural gradient is effective for learning in spite of its low computational cost.
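To make the contrast concrete, the following minimal sketch compares plain gradient descent with a natural-gradient step on a toy ill-conditioned quadratic loss. The curvature matrix standing in for the Fisher metric G, the step size, and the loss itself are illustrative assumptions, not the paper's MLP setting.

```python
import numpy as np

# Ordinary vs. natural gradient descent on a toy quadratic loss
# L(w) = 0.5 * w @ A @ w, with an ill-conditioned curvature A standing
# in for the Fisher metric G (an assumption for illustration).
A = np.diag([100.0, 1.0])
grad = lambda w: A @ w

w_sgd = np.array([1.0, 1.0])
w_nat = np.array([1.0, 1.0])
eta = 0.009
G_inv = np.linalg.inv(A)          # natural-gradient preconditioner

for _ in range(200):
    w_sgd = w_sgd - eta * grad(w_sgd)           # plain gradient step
    w_nat = w_nat - eta * G_inv @ grad(w_nat)   # natural gradient step

print("plain SGD:", w_sgd)      # one direction converges far more slowly
print("natural:  ", w_nat)      # isotropic contraction, no slow mode
```

The preconditioned trajectory contracts at the same rate in every direction, which is the toy analogue of the paper's claim that natural gradient dynamics show no critical slowdown.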
Neural Computation (2012) 24 (5): 1230–1270.
Published: 01 May 2012
The neural substrates of decision making have been intensively studied using experimental and computational approaches. Alternative-choice tasks with reinforcement have often been employed in investigations of decision making. Choice behavior has been empirically found in many experiments to follow Herrnstein's matching law. A number of theoretical studies have addressed the mechanisms responsible for matching behavior, proving that various learning rules achieve matching behavior as a steady state of the learning process. The models in these studies consisted of only a few parameters. However, a large number of neurons and synapses are expected to participate in decision making in the brain. We investigated learning behavior in simple but large-scale decision-making networks. We considered the covariance learning rule, which has been demonstrated to achieve matching behavior as a steady state (Loewenstein & Seung, 2006). We analyzed model behavior in a thermodynamic limit where the number of plastic synapses goes to infinity. Using techniques from statistical mechanics, we derived deterministic differential equations in this limit for the order parameters, which allow an exact calculation of the evolution of choice behavior. As a result, we found that matching behavior cannot be a steady state of learning when the fluctuations in input from individual sensory neurons are so large that they affect the net input to value-encoding neurons. This situation naturally arises when the synaptic strength is sufficiently strong and the excitatory and inhibitory inputs to the value-encoding neurons are balanced. The deviation from matching behavior is caused by increasing variance in the input potential due to the diffusion of synaptic efficacies. This effect causes an undermatching phenomenon, which has often been observed in behavioral experiments.
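A minimal simulation sketch of the covariance rule the abstract refers to, in my simplified reading of Loewenstein & Seung's idea: the weight update tracks the covariance between reward and action. The softmax readout, learning rates, and fixed-probability reward schedule are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
p_reward = np.array([0.6, 0.2])     # assumed fixed reward probabilities
w = np.zeros(2)                     # action values driving a softmax choice
eta, r_bar = 0.05, 0.0              # learning rate; running mean reward
choices = []

for _ in range(20000):
    p = np.exp(w - w.max()); p /= p.sum()       # softmax choice probabilities
    c = int(rng.random() >= p[0])               # choose option 0 w.p. p[0]
    r = float(rng.random() < p_reward[c])       # Bernoulli reward
    # Covariance rule: the update tracks cov(reward, action).
    w += eta * (r - r_bar) * (np.eye(2)[c] - p)
    r_bar += 0.01 * (r - r_bar)
    choices.append(c)

f = 1.0 - np.mean(choices[-5000:])              # late choice fraction, option 0
income0 = p_reward[0] * f
print("choice fraction:", round(f, 3))
print("income fraction:", round(income0 / (income0 + p_reward[1] * (1 - f)), 3))
# Matching equates the two printed fractions. Under this assumed
# fixed-probability schedule the matching solution is (near-)exclusive
# preference for the richer option, which the rule approaches; the baited
# schedules used in experiments yield graded matching instead.
```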
Neural Computation (2011) 23 (5): 1205–1233.
Published: 01 May 2011
We propose an algorithm for simultaneously estimating state transitions among neural states and nonstationary firing rates using a switching state-space model (SSSM). This algorithm enables us to detect state transitions on the basis of not only discontinuous changes in mean firing rates but also discontinuous changes in the temporal profiles of firing rates (e.g., temporal correlation). We construct estimation and learning algorithms for a nongaussian SSSM, whose nongaussian property is caused by binary spike events. Local variational methods can transform the binary observation process into a quadratic form. The transformed observation process enables us to construct a variational Bayes algorithm that can determine the number of neural states based on automatic relevance determination. Additionally, our algorithm can estimate model parameters from single-trial data using a priori knowledge about state transitions and firing rates. Synthetic data analysis reveals that our algorithm has higher performance for estimating nonstationary firing rates than previous methods. The analysis also confirms that our algorithm can detect state transitions on the basis of discontinuous changes in temporal correlation, which are transitions that previous hidden Markov models could not detect. We also analyze neural data recorded from the medial temporal area. The statistically detected neural states probably coincide with transient and sustained states that have been detected heuristically. Estimated parameters suggest that our algorithm detects the state transitions on the basis of discontinuous changes in the temporal correlation of firing rates. These results suggest that our algorithm is advantageous in real-data analysis.
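The local variational step can be made concrete with the standard Jaakkola-Jordan bound, one common way to turn a Bernoulli (binary spike) observation term into a quadratic form; the abstract does not state the exact bound used, so treat this as an assumption.

```python
import numpy as np

# Jaakkola-Jordan local variational bound: for any xi > 0,
#   log sigmoid(x) >= log sigmoid(xi) + (x - xi)/2 - lam(xi) * (x**2 - xi**2),
#   lam(xi) = tanh(xi / 2) / (4 * xi),
# i.e., the binary spike log likelihood is bounded below by a quadratic in x,
# which is what makes conjugate Gaussian updates in a variational Bayes
# algorithm possible.

def lam(xi):
    return np.tanh(xi / 2.0) / (4.0 * xi)

def log_sigmoid(x):
    return -np.log1p(np.exp(-x))

def bound(x, xi):
    return log_sigmoid(xi) + 0.5 * (x - xi) - lam(xi) * (x**2 - xi**2)

x = np.linspace(-6.0, 6.0, 201)
for xi in (0.5, 2.0, 4.0):
    assert np.all(log_sigmoid(x) >= bound(x, xi) - 1e-12)
print("quadratic lower bound verified; tight at x = +/-xi")
```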
Neural Computation (2010) 22 (10): 2586–2614.
Published: 01 October 2010
Spatiotemporal context in sensory stimulus has profound effects on neural responses and perception, and it sometimes affects task difficulty. Recently reported experimental data suggest that human detection sensitivity to motion in a target stimulus can be enhanced by adding a slow surrounding motion in an orthogonal direction, even though the illusory motion component caused by the surround is not relevant to the task. It is not computationally clear how the task-irrelevant component of motion modulates the subject's sensitivity to motion detection. In this study, we investigated the effects of encoding biases on detection performance by modeling the stochastic neural population activities. We modeled two types of modulation on the population activity profiles caused by a contextual stimulus: one type is identical to the activity evoked by a physical change in the stimulus, and the other is expressed more simply in terms of response gain modulation. For both encoding schemes, the motion detection performance of the ideal observer is enhanced by a task-irrelevant, additive motion component, replicating the properties observed for real subjects. The success of these models suggests that human detection sensitivity can be characterized by a noisy neural encoding that limits the resolution of information transmission in the cortical visual processing pathway. On the other hand, analyses of the neuronal contributions to the task predict that the effective cell populations differ between the two encoding schemes, posing a question concerning the decoding schemes that the nervous system used during illusory states.
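A minimal ideal-observer sketch under an assumed Poisson population code: the tuning curves, rates, and the way the surround is folded in (here as a small additional encoded shift, corresponding to the abstract's first modulation type) are all illustrative assumptions. The gain-modulation scheme would instead vary the `gain` argument.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 64
prefs = np.linspace(-np.pi, np.pi, N, endpoint=False)   # preferred directions

def rates(s, gain=1.0):
    """Assumed tuning: baseline plus a circular-Gaussian bump."""
    return gain * (2.0 + 20.0 * np.exp(np.cos(prefs - s) - 1.0))

def fraction_correct(delta, gain=1.0, trials=4000):
    """Ideal observer discriminating s = 0 from s = delta via the
    Poisson log-likelihood ratio."""
    f0, f1 = rates(0.0, gain), rates(delta, gain)
    llr_w, bias = np.log(f1 / f0), (f1 - f0).sum()
    correct = 0
    for s, f in ((0.0, f0), (delta, f1)):
        n = rng.poisson(f, size=(trials, N))
        llr = n @ llr_w - bias
        correct += np.sum((llr > 0) == (s == delta))
    return correct / (2 * trials)

print("target alone:           ", fraction_correct(0.05))
print("plus illusory component:", fraction_correct(0.05 + 0.02))
# The task-irrelevant additive component enlarges the encoded separation,
# so the ideal observer's detection performance improves.
```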
Neural Computation (2010) 22 (9): 2369–2389.
Published: 01 September 2010
Neural activity is nonstationary and varies across time. Hidden Markov models (HMMs) have been used to track the state transition among quasi-stationary discrete neural states. Within this context, an independent Poisson model has been used for the output distribution of HMMs; hence, the model is incapable of tracking the change in correlation without modulating the firing rate. To achieve this, we applied a multivariate Poisson distribution with correlation terms for the output distribution of HMMs. We formulated a variational Bayes (VB) inference for the model. The VB could automatically determine the appropriate number of hidden states and correlation types while avoiding the overlearning problem. We developed an efficient algorithm for computing posteriors using the recursive relationship of a multivariate Poisson distribution. We demonstrated the performance of our method on synthetic data and real spike trains recorded from a songbird.
Includes: Supplementary data
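One standard construction for a correlated-count output distribution of the kind described here is a bivariate Poisson built from a shared latent count; the rates below are made-up values for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
# X1 = Y1 + Y0, X2 = Y2 + Y0 with independent Poisson Yk: the margins stay
# Poisson while Cov(X1, X2) = lam0.
lam0, lam1, lam2 = 2.0, 3.0, 5.0
y0 = rng.poisson(lam0, 100_000)
x1 = rng.poisson(lam1, 100_000) + y0
x2 = rng.poisson(lam2, 100_000) + y0
print("means:     ", x1.mean(), x2.mean())       # ~5.0 and ~7.0
print("covariance:", np.cov(x1, x2)[0, 1])       # ~lam0 = 2.0
```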
Neural Computation (2007) 19 (9): 2468–2491.
Published: 01 September 2007
Repetitions of precise spike patterns observed both in vivo and in vitro have been reported for more than a decade. Studies on the spike volley (a pulse packet) propagating through a homogeneous feedforward network have demonstrated its capability of generating spike patterns with millisecond fidelity. This model is called the synfire chain and suggests a possible mechanism for generating repeated spike patterns (RSPs). The propagation speed of the pulse packet determines the temporal property of RSPs. However, the relationship between propagation speed and network structure is not well understood. We studied a feedforward network with Mexican-hat connectivity by using the leaky integrate-and-fire neuron model and analyzed the network dynamics with the Fokker-Planck equation. We examined the effect of the spatial pattern of pulse packets on RSPs in the network with multistability. Pulse packets can take spatially uniform or localized shapes in a multistable regime, and they propagate with different speeds. These distinct pulse packets generate RSPs with different timescales, but the order of spikes and the ratios between interspike intervals are preserved. This result indicates that the RSPs can be transformed into the same template pattern through the expanding or contracting operation of the timescale.
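A minimal leaky integrate-and-fire sketch of a pulse packet driving one receiving layer; all constants (membrane time constant, packet size, jitter, weight, noise) are illustrative assumptions rather than the paper's parameters.

```python
import numpy as np

rng = np.random.default_rng(3)
dt, T = 0.1, 50.0                       # ms
tau_m, v_th = 10.0, 1.0                 # membrane time constant, threshold
n_in, sigma_in, w = 100, 1.0, 0.03      # packet size, input jitter (ms), weight
steps = int(T / dt)

# Input volley: n_in spikes with Gaussian timing jitter around t = 20 ms.
drive = np.zeros(steps)
for t in np.clip(rng.normal(20.0, sigma_in, n_in), 0.0, T - dt):
    drive[int(t / dt)] += w / dt

first_spikes = []
for _ in range(200):                    # 200 independent receiving neurons
    v = 0.0
    for k in range(steps):
        v += dt * (-v / tau_m + drive[k]) + rng.normal(0.0, 0.02)
        if v >= v_th:
            first_spikes.append(k * dt)
            break

print("input jitter: 1.0 ms   output jitter:",
      round(float(np.std(first_spikes)), 2), "ms")
# A dense, tightly timed packet yields output jitter no larger than the
# input jitter -- the stability condition behind synfire propagation.
```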
Neural Computation (2007) 19 (7): 1854–1870.
Published: 01 July 2007
With spatially organized neural networks, we examined how bias and noise inputs with spatial structure result in different network states such as bumps, localized oscillations, global oscillations, and localized synchronous firing that may be relevant to, for example, orientation selectivity. To this end, we used networks of McCulloch-Pitts neurons, which allow theoretical predictions, and verified the obtained results with numerical simulations. Spatial inputs, no matter whether they are bias inputs or shared noise inputs, affect only firing activities with resonant spatial frequency. The component of noise that is independent for different neurons increases the linearity of the neural system and gives rise to less spatial mode mixing and less bistability of population activities.
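A toy sketch of the resonance claim with McCulloch-Pitts (binary threshold) neurons on a ring; the pure cosine coupling is a frequency-1 stand-in for spatially organized connectivity, and all constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 200
phi = np.linspace(0, 2 * np.pi, N, endpoint=False)
J = (2.0 / N) * np.cos(phi[:, None] - phi[None, :])   # frequency-1 coupling

def settle(bias, sweeps=40):
    s = (rng.random(N) < 0.5).astype(float)
    for _ in range(sweeps * N):                       # asynchronous updates
        i = rng.integers(N)
        s[i] = float(J[i] @ s + bias[i] > 0.0)
    return s

for k in (1, 3):                                      # bias spatial frequency
    proj = [2.0 / N * np.cos(k * phi) @ settle(0.05 * np.cos(k * phi))
            for _ in range(10)]
    print(f"bias frequency {k}: response along bias = {np.mean(proj):.2f}")
# The projection should be markedly larger at the resonant frequency k = 1:
# a weak bias matching the coupling's spatial frequency is amplified, while
# an off-resonance bias of equal strength is not.
```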
Neural Computation (2007) 19 (7): 1798–1853.
Published: 01 July 2007
Inspired by recent studies regarding dendritic computation, we constructed a recurrent neural network model incorporating dendritic lateral inhibition. Our model consists of an input layer and a neuron layer that includes excitatory cells and an inhibitory cell; this inhibitory cell is activated by the pooled activities of all the excitatory cells, and it in turn inhibits each dendritic branch of the excitatory cells that receive excitations from the input layer. Dendritic nonlinear operation consisting of branch-specifically rectified inhibition and saturation is described by imposing nonlinear transfer functions before summation over the branches. In this model with sufficiently strong recurrent excitation, on transiently presenting a stimulus that has a high correlation with feedforward connections of one of the excitatory cells, the corresponding cell becomes highly active, and the activity is sustained after the stimulus is turned off, whereas all the other excitatory cells continue to have low activities. But on transiently presenting a stimulus that does not have high correlations with feedforward connections of any of the excitatory cells, all the excitatory cells continue to have low activities. Interestingly, such stimulus-selective sustained response is preserved for a wide range of stimulus intensity. We derive an analytical formulation of the model in the limit where individual excitatory cells have an infinite number of dendritic branches and prove the existence of an equilibrium point corresponding to such a balanced low-level activity state as observed in the simulations, whose stability depends solely on the signal-to-noise ratio of the stimulus. We propose this model as a model of stimulus selectivity equipped with self-sustainability and intensity-invariance simultaneously, which was difficult in the conventional competitive neural networks with a similar degree of complexity in their network architecture. We discuss the biological relevance of the model in a general framework of computational neuroscience.
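A toy sketch of the architecture as I read it from the abstract (not the paper's exact equations): pooled inhibition is applied branch by branch before a rectifying, saturating dendritic nonlinearity, and excitatory cells self-excite recurrently. All constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n_cells, n_branch = 4, 200
W = rng.random((n_cells, n_branch))         # feedforward weights

def run(stim):
    r = np.zeros(n_cells)
    for t in range(600):
        drive = stim if t < 200 else np.zeros(n_branch)  # transient stimulus
        inh = 0.6 * r.sum()                 # inhibitory cell pools all rates
        # branch-wise rectified, saturating inhibition, then summation
        dend = np.clip(W * drive[None, :] - inh, 0.0, 1.0).sum(1) / n_branch
        r = np.clip(r + 0.1 * (-r + np.maximum(dend + 1.5 * r - 0.2, 0.0)),
                    0.0, 1.0)               # recurrent self-excitation
    return r

matched = W[2] + 0.1 * rng.standard_normal(n_branch)  # resembles cell 2
unmatched = rng.random(n_branch)                      # resembles no cell
print("matched stimulus, rates after offset:  ", np.round(run(matched), 2))
print("unmatched stimulus, rates after offset:", np.round(run(unmatched), 2))
# Expected: only cell 2 sustains activity after the matched input is turned
# off; all cells stay at low activity for the unmatched input.
```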
Neural Computation (2006) 18 (10): 2359–2386.
Published: 01 October 2006
We considered a gamma distribution of interspike intervals as a statistical model for neuronal spike generation. A gamma distribution is a natural extension of the Poisson process taking the effect of a refractory period into account. The model is specified by two parameters: a time-dependent firing rate and a shape parameter that characterizes spiking irregularities of individual neurons. Because the environment changes over time, observed data are generated from a model with a time-dependent firing rate, which is an unknown function. A statistical model with an unknown function is called a semiparametric model and is generally very difficult to solve. We used a novel method of estimating functions in information geometry to estimate the shape parameter without estimating the unknown function. We obtained an optimal estimating function analytically for the shape parameter independent of the functional form of the firing rate. This estimation is efficient without Fisher information loss and better than maximum likelihood estimation. We suggest a measure of spiking irregularity based on the estimating function, which may be useful for characterizing individual neurons in changing environments.
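A sketch in the spirit of this semiparametric idea (using SciPy's digamma and a root finder): if two adjacent interspike intervals share the same locally constant rate, their ratio statistic is rate-free and pins down the shape parameter. This rests on the Beta-distribution identity in the comments, not necessarily the paper's exact estimating function.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import digamma

rng = np.random.default_rng(6)
# If adjacent ISIs T1, T2 are gamma with shape kappa and a shared local rate,
# then T1 / (T1 + T2) ~ Beta(kappa, kappa) regardless of that rate, so
#   E[log(T1 / (T1 + T2))] = digamma(kappa) - digamma(2 * kappa)
# can be inverted for kappa without ever estimating the rate function.
kappa_true = 2.5
isis = rng.gamma(kappa_true, 1.0, 20000)         # unit rate for simplicity
t1, t2 = isis[0::2], isis[1::2]                  # non-overlapping pairs
target = np.mean(np.log(t1 / (t1 + t2)))
kappa_hat = brentq(lambda k: digamma(k) - digamma(2 * k) - target, 0.05, 50.0)
print("true shape:", kappa_true, "  estimated:", round(kappa_hat, 2))
```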
Neural Computation (2005) 17 (9): 2034–2059.
Published: 01 September 2005
We report on deterministic and stochastic evolutions of firing states through a feedforward neural network with Mexican-hat-type connectivity. The prevalence of columnar structures in the cortex implies spatially localized connectivity between neural pools. Although feedforward neural network models with homogeneous connectivity have been intensively studied within the context of the synfire chain, the effect of local connectivity has not yet been studied so thoroughly. When neurons fire independently, the dynamics of the macroscopic state variables (a firing rate and the spatial eccentricity of a firing pattern) are deterministic by the law of large numbers. The possible stable firing states, derived from the deterministic evolution equations, are uniform, localized, and nonfiring. The multistability of these three states is obtained when the excitatory and inhibitory interactions among neurons are balanced. When the presynapse-dependent variance in connection efficacies is incorporated into the network, the variance generates common noise. The evolution of the macroscopic state variables then becomes stochastic, and neurons begin to fire in a correlated manner due to the common noise. The correlation structure generated by common noise exhibits a nontrivial bimodal distribution. The development of a firing state through the neural layers does not converge to a fixed point but keeps fluctuating.
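A minimal sketch of the deterministic layer-to-layer evolution of the two macroscopic variables (firing rate and eccentricity), with a difference-of-cosines stand-in for Mexican-hat connectivity; the parameters are chosen, as an assumption, so that the localized state is the stable one.

```python
import numpy as np

N, theta = 300, 0.05
phi = np.linspace(0, 2 * np.pi, N, endpoint=False)
# Difference-of-cosines stand-in for Mexican-hat feedforward connectivity.
W = (1.0 / N) * (-0.3 + 1.2 * np.cos(phi[:, None] - phi[None, :]))

s = (np.cos(phi) > 0.5).astype(float)            # localized firing in layer 0
for layer in range(8):
    rate = s.mean()                              # macroscopic variable 1
    ecc = abs(np.exp(1j * phi) @ s) / max(s.sum(), 1.0)  # variable 2
    print(f"layer {layer}: rate = {rate:.2f}  eccentricity = {ecc:.2f}")
    s = (W @ s > theta).astype(float)            # deterministic layer map
# The (rate, eccentricity) pair settles to the localized fixed point; an
# all-zero layer likewise stays nonfiring, illustrating multistability.
```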
Neural Computation (2004) 16 (4): 737–765.
Published: 01 April 2004
A novel analytical method based on information geometry was recently proposed, and this method may provide useful insights into the statistical interactions within neural groups. The link between information-geometric measures and the structure of neural interactions has not yet been elucidated, however, because of the ill-posed nature of the problem. Here, possible neural architectures underlying information-geometric measures are investigated using an isolated pair and an isolated triplet of model neurons. By assuming the existence of equilibrium states, we derive analytically the relationship between the information-geometric parameters and these simple neural architectures. For symmetric networks, the first- and second-order information-geometric parameters represent, respectively, the external input and the underlying connections between the neurons provided that the number of neurons used in the parameter estimation in the log-linear model and the number of neurons in the network are the same. For asymmetric networks, however, these parameters are dependent on both the intrinsic connections and the external inputs to each neuron. In addition, we derive the relation between the information-geometric parameter corresponding to the two-neuron interaction and a conventional cross-correlation measure. We also show that the information-geometric parameters vary depending on the number of neurons assumed for parameter estimation in the log-linear model. This finding suggests a need to examine the information-geometric method carefully. A possible criterion for choosing an appropriate orthogonal coordinate is also discussed. This article points out the importance of a model-based approach and sheds light on the possible neural structure underlying the application of information geometry to neural network analysis.
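The first- and second-order information-geometric parameters for a neuron pair can be made explicit with the two-neuron log-linear model; the probability table below is made up for illustration.

```python
import numpy as np

# Two-neuron log-linear model:
#   log p(x1, x2) = th1*x1 + th2*x2 + th12*x1*x2 - psi,  x in {0, 1}.
# th12 is the log odds ratio: an interaction measure orthogonal to the
# firing rates in the mixed parameterization, unlike the raw covariance.
p = {(0, 0): 0.5, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.1}

th1 = np.log(p[1, 0] / p[0, 0])
th2 = np.log(p[0, 1] / p[0, 0])
th12 = np.log(p[1, 1] * p[0, 0] / (p[1, 0] * p[0, 1]))

rate1, rate2 = p[1, 0] + p[1, 1], p[0, 1] + p[1, 1]
cov = p[1, 1] - rate1 * rate2            # the cross-correlation ingredient
print("theta_1, theta_2 =", round(th1, 3), round(th2, 3))
print("theta_12 =", round(th12, 3), "   covariance =", round(cov, 4))
```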
Neural Computation (2004) 16 (2): 309–331.
Published: 01 February 2004
We modeled the inhibitory effects of transcranial magnetic stimulation (TMS) on a neural population. TMS is a noninvasive technique, with high temporal resolution, that can stimulate the brain via a brief magnetic pulse from a coil placed on the scalp. Because of these advantages, TMS is extensively used as a powerful tool in experimental studies of motor, perceptual, and other functions in humans. However, the mechanisms by which TMS interferes with neural activities, especially the theoretical aspects, are still unknown. In this study, we focused on the temporal properties of TMS-induced perceptual suppression, and we computationally analyzed the response of a simple network model of a sensory feature detector system to a TMS-like perturbation. The perturbation caused the mean activity to transiently increase and then decrease for a long period, accompanied by a loss in the degree of activity localization. When the afferent input consisted of a dual phase, with a strong transient component and a weak sustained component, there was a critical latency period for the perturbation during which the network activity was completely suppressed and converged to the resting state. The range of the suppressive period increased with decreasing afferent input intensity and reached more than 10 times the time constant of the neuron. These results agree well with typical experimental data for occipital TMS and support the conclusion that dynamical interactions in a neural population play an important role in TMS-induced perceptual suppression.
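A toy ring-network sketch of such a TMS-like perturbation (not the paper's model; every constant is an assumption). With these values an early uniform pulse can quench the developing localized activity, while a later pulse leaves it intact, qualitatively echoing the critical-latency effect.

```python
import numpy as np

N = 120
phi = np.linspace(0, 2 * np.pi, N, endpoint=False)
# Mexican-hat recurrent coupling: global inhibition plus local excitation.
W = (1.0 / N) * (-1.0 + 6.0 * np.cos(phi[:, None] - phi[None, :]))

def run(pulse_at, T=400.0, dt=0.5, tau=10.0, theta=0.3):
    r, t = np.zeros(N), 0.0
    while t < T:
        amp = 2.0 if t < 30.0 else 0.1   # strong transient, weak sustained
        inp = amp * np.maximum(np.cos(phi), 0.0)
        if pulse_at is not None and pulse_at <= t < pulse_at + dt:
            r = r + 3.0                  # brief, spatially uniform pulse
        drive = np.clip(W @ r + inp - theta, 0.0, 5.0)  # rectified, saturating
        r += (dt / tau) * (-r + drive)
        t += dt
    return round(float(r.max()), 2)

print("no pulse:       ", run(None))
print("pulse at 40 ms: ", run(40.0))     # within the critical latency
print("pulse at 150 ms:", run(150.0))    # after the bump consolidates
```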
Neural Computation (2002) 14 (12): 2883–2902.
Published: 01 December 2002
Recent biological experiments have shown that synaptic plasticity depends on the relative timing of the pre- and postsynaptic spikes, which determines whether long-term potentiation (LTP) or long-term depression (LTD) is induced. This form of synaptic plasticity has been called temporally asymmetric Hebbian plasticity (TAH). Many authors have numerically demonstrated that such networks are capable of storing spatiotemporal patterns. However, the mathematical mechanism behind the storage of spatiotemporal patterns is still unknown, and the effect of LTD in particular is poorly understood. In this article, we employ a simple neural network model and show that interference between LTP and LTD disappears in a sparse coding scheme. On the other hand, the covariance learning rule is known to be indispensable for the storage of sparse patterns. We also show that TAH has the same qualitative effect as the covariance rule when spatiotemporal patterns are embedded in the network.
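The TAH window itself is easy to state: exponential potentiation for pre-before-post spike pairs and depression for post-before-pre pairs. The amplitudes and time constants below are illustrative assumptions, not the paper's values.

```python
import numpy as np

def tah_window(dt_ms, a_plus=1.0, a_minus=1.0, tau=20.0):
    """Weight change for a post-minus-pre spike time difference (ms):
    LTP for pre-before-post, LTD for post-before-pre."""
    dt_ms = np.asarray(dt_ms, dtype=float)
    return np.where(dt_ms >= 0.0,
                    a_plus * np.exp(-dt_ms / tau),     # LTP branch
                    -a_minus * np.exp(dt_ms / tau))    # LTD branch

for d in (-40.0, -5.0, 5.0, 40.0):
    print(f"dt = {d:+5.0f} ms  ->  dw = {float(tah_window(d)):+.3f}")
```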