Abstract

A neuronal population is a computational unit that receives a multivariate, time-varying input signal and creates a related multivariate output. These neural signals are modeled as stochastic processes that transmit information in real time, subject to stochastic noise. In a stationary environment, where the input signals can be characterized by constant statistical properties, the systematic relationship between a population's input and output processes determines the computation it carries out. When these statistical characteristics unexpectedly change, the population needs to adapt to its new environment if it is to maintain stable operation. Based on the general concept of homeostatic plasticity, we propose a simple compositional model of adaptive networks that achieve invariance with regard to undesired changes in the statistical properties of their input signals and maintain outputs with well-defined joint statistics. To achieve such invariance, the network model combines two functionally distinct types of plasticity. An abstract stochastic process neuron model implements a generalized form of intrinsic plasticity that adapts marginal statistics, relying only on mechanisms locally confined within each neuron and operating continuously in time, while a simple form of Hebbian synaptic plasticity operates on synaptic connections, thus shaping the interrelation between neurons as captured by a copula function. The combined effect of both mechanisms allows a neuron population to discover invariant representations of its inputs that remain stable under a wide range of transformations (e.g., shifting, scaling, and affine linear mixing). The probabilistic model of homeostatic adaptation on a population level as presented here allows us to isolate and study the individual dynamics of both mechanisms of plasticity as well as their interaction, and could guide the future search for computationally beneficial types of adaptation.

1  Introduction

When we talk about adaptation, we take it to mean what W. Ross Ashby had in mind when writing his seminal book Design for a Brain, where he argued that in a volatile environment, “‘adaptive’ behavior is equivalent to the behavior of a stable system” (Ashby, 1954, p. 64). Active dynamical mechanisms that stabilize the activity of neural populations in spite of sudden changes in sensory inputs, lesions, or rewiring of synaptic connections have been studied extensively under the general term homeostatic plasticity. Some of them are confined within individual neurons, generally referred to as intrinsic plasticity; others operate within synapses and are thus grouped under the umbrella term of synaptic plasticity.

Here we aim to unify such forms of plasticity in a single mathematically simple framework of continuous-time stochastic processes that enables us to analyze their distinct functional roles and interactions and allows us to extend the notion of homeostatic plasticity from the level of individual neurons to that of populations.

A wide variety of specific candidate mechanisms of synaptic and intrinsic plasticity has been studied extensively, from both biological and information-theoretic perspectives: Bienenstock, Cooper, and Munro (1982) proposed a form of synaptic plasticity that determines the growth or decay of synaptic connections under the constraint of maintaining a fixed mean firing rate. Turrigiano and Nelson (2004) conjectured a role of homeostatic plasticity in stabilizing the transmission of information in feedforward networks by fine-tuning the balance between excitatory and inhibitory connections. Both approaches assume a self-regulating form of synaptic adaptation that renders a neuron population invariant to additive shifts in its inputs, ensuring that resulting mean firing rates remain well within physiological bounds. In abstract dynamical systems models of biological spiking neurons, spike rate adaptation effects were incorporated by slowly changing adaptation variables (Izhikevich, 2003) and spike-induced responses (Jolivet, Lewis, & Gerstner, 2004) to closely match experimental measurements. In these models, a neuron's adaptation to its spiking output, rather than its input, is the driving force of homeostatic plasticity.

Scale invariance was proposed via a form of plasticity referred to as synaptic scaling (Turrigiano, 2008) or gain control (Burrone & Murthy, 2003), where synaptic connection strength or neural excitability is regulated such that the variance of membrane potentials remains fixed. Multiple timescales of such adaptation, potentially serving different purposes, were observed to jointly improve information transmission (Fairhall, Lewen, Bialek, & de Ruyter van Steveninck, 2000). Synaptic depression (Abbott, Varela, Sen, & Nelson, 1997) and diffusion of neurotransmitters (Sweeney, Kotaleski, & Hennig, 2015) were suggested as two fast-acting candidate mechanisms that could achieve scale invariance in order to increase the dynamic range of a neuron's output and thus improve its ability to transmit information.

The concept of a neuron as a bottleneck of information transmission (Bell & Sejnowski, 1995; Stemmler & Koch, 1999) expanded this notion and offered an information-theoretical explanation for the utility of homeostatic adaptation. In this framework, the maintenance of stable characteristics of a neuron's output, in spite of changes in the characteristics of its input, allows a neuron to discover and encode information about its inputs in a stable representation, thus making a population more robust to environmental changes, noise, or spike timing variability (Buesing & Maass, 2010). Intrinsic plasticity mechanisms were proposed to tune the nonlinear response function of neurons to optimize properties of their outputs, such as the maximization of information transmission (Toyoizumi, Pfister, Aihara, & Gerstner, 2005) or the minimization of divergence from a desired stationary distribution (Savin, Joshi, & Triesch, 2010; Triesch, 2007). Their interaction with synaptic plasticity and synaptic scaling was analyzed (Toyoizumi, Kaneko, Stryker, & Miller, 2014) and shown to yield emergent properties, such as the ability to implement blind source separation on prewhitened inputs in simple model neurons (Buesing & Maass, 2010; Savin et al., 2010; Triesch, 2007), a feat observable in vitro as well (Isomura, Kotani, & Jimbo, 2015). Similar results were obtained by Hyvärinen and Oja (1998), who used synaptic scaling in combination with simple nonlinear Hebbian learning rules to discover independent components.

These results provide crucial insight into the capabilities and limitations of neural plasticity and serve as the basis of this article. We contribute to this field of theoretical research by unifying the information-theoretic concept of intrinsic plasticity, which enforces a stable, fixed distribution of activations in the face of changing input statistics, with Hebbian synaptic plasticity within an abstract but simple, probabilistic, and compositional model of adaptive neuron populations that avoids several limitations of the approaches discussed above. The model's activation function is directly parameterized by the membrane potential statistics that neurons should adapt to. Thus, the often complex update rules proposed in models of intrinsic plasticity are replaced by causal (nonlinear) filters of the membrane potential, such that adaptation dynamics can be analyzed and convergence ensured. Since our neuron model is fully determined by the stochastic properties of the neuron's membrane potential and activation processes, it could be easily adjusted to experimental data observed in vivo.

The continuous-time nature of our model facilitates studying the interaction between neural dynamics, synaptic dynamics, and plasticity and makes it easier to reconcile with its biological counterpart. By restricting plasticity to intrinsic and Hebbian synaptic plasticity while excluding global mechanisms such as a detailed balancing of excitation and inhibition or synaptic scaling, our model makes use only of information locally available to neurons and synapses, respectively. In light of Ashby (1954), we view many of the suggested benefits of plasticity as instances of the more general principle of homeostatic self-regulation, such that, for example, the purpose of blind source separation becomes, first and foremost, to find an informative, transformation-invariant representation of a population's input. We thus aim to elevate the notion of homeostatic adaptation from the level of individual neurons to the level of populations through the interplay of intrinsic and synaptic plasticity.

To illustrate the capabilities of the model, we reproduce results by Savin et al. (2010) and theoretically analyze the complementary role that intrinsic plasticity plays in stabilizing Hebbian learning, thus allowing individual neurons to discover informative components of their input signals. We demonstrate the generality of these results in a network trained on image patches, where lateral decorrelation drives neurons to learn a transformation-invariant representation of their multivariate inputs by implementing a form of principal or independent component analysis.

2  An Adaptive Network Model

The adaptive network model comprises adaptive neurons and adaptive synapses. The intrinsic plasticity mechanism implemented within each neuron uses locally available information about statistical properties of the neuron's membrane potential to adapt its behavior, such that its output remains stationary with predetermined statistics. Hebbian synaptic plasticity uses the product of (a function of) activations of a synapse's presynaptic source and postsynaptic target to update its connection strength. Correlated activity between source and target thus drives synaptic growth. By combining the adaptation of the marginal statistics via intrinsic plasticity with the adaptation of the copula via synaptic plasticity, the adaptive network realizes plasticity of its multivariate joint outputs and can become invariant to a large range of changes in its input statistics.

A population of neurons can thus be seen as a computational unit that receives a multivariate stochastic process as its input and linearly transforms and projects it on a set of adaptive neurons via adaptive synaptic connections. Each neuron then nonlinearly transforms its marginal input, and the joint activations of these neurons are taken to be the multivariate output of the population that subsequently becomes the input to the same or another population. (See Figure 1 for an illustration of the adaptive network model.)

Figure 1:

Schematic overview of the adaptive network model. Stochastic processes representing inputs into the population are combined via weighted, adaptive synaptic connections and integrated into membrane potential processes. Sufficient statistics of these processes are estimated in real time and used to adapt the neurons' nonlinear activation functions such that they result in stationary activation processes with predefined distributions. These activations, or sampled spike trains with accordingly time-varying intensity (not discussed here), are in turn used as inputs for other neurons.

We adhere to a simplistic yet powerful class of linear-nonlinear models (Ostojic & Brunel, 2011) that separates the dynamics of a neuron into two components: a linear, spatiotemporal filtering of inputs and a nonlinear transformation thereof, which yields the instantaneous firing intensity of the neuron. Although beyond the scope of this work, the model can be further extended using a spike train point process that samples spikes according to the neuron's time-varying firing intensity. Multiple input signals are linearly combined within the neuron's dendritic tree through weighted adaptive synaptic connections and integrated into a neuron's time-varying membrane potential $u(t)$, a real-valued stochastic process. The statistical properties of this membrane potential process are assumed to change rarely or slowly, such that the process can be locally well approximated by a (piecewise) stationary process. The neuron's nonlinear activation function $\phi_{\theta}$ is parameterized by the vector $\theta$ and maps its membrane potential to an intensity or instantaneous firing rate $\lambda(t) = \phi_{\theta}(u(t))$, also referred to as the neuron's activation or output. It follows that the activation is a stationary stochastic process as well, and for both the membrane potential and the activation, stationary distributions $p_{U}$ and $p_{\Lambda}$ can be derived such that $u(t) \sim p_{U}$ and $\lambda(t) \sim p_{\Lambda}$ (see lemma 7 in the supplementary text A.1). For the special case of an exponential function $\phi$, this corresponds to a continuous-time generalized linear model as proposed by Truccolo, Eden, Fellows, Donoghue, and Brown (2005).

Conversely, given both a stationary distribution of membrane potentials $p_{U}$ and a desired distribution of firing rates $p_{\Lambda}$, a nonlinearity $\phi = F_{\Lambda}^{-1} \circ F_{U}$ can be derived that satisfies $\phi(u(t)) \sim p_{\Lambda}$ (see lemma 8). This nonlinear activation function can be parameterized by some sufficient statistics $\theta$ of the membrane potential distribution $p_{U}$, the estimation of which is the principal task of intrinsic plasticity. By continuously estimating $\theta$, such an adaptation mechanism can maintain the desired output distribution in spite of gradual changes in the statistics of its membrane potential and thus generates a (nearly) stationary output process $\lambda(t)$. In this work, spike generation is not modeled; instead, the continuous activations are used directly in a rate-coding paradigm to derive theoretical results.
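As a minimal illustration of this construction, the following Python sketch builds such a nonlinearity as the composition $F_{\Lambda}^{-1} \circ F_{U}$ for a gaussian membrane potential and a log-gaussian target distribution; the distribution parameters are illustrative and not taken from the experiments below.

```python
# Sketch: derive an activation function phi(u) = F_out^{-1}(F_u(u)) that maps a
# known membrane potential distribution onto a desired output distribution.
import numpy as np
from scipy import stats

def make_activation(u_dist, out_dist):
    """Return phi(u) = F_out^{-1}(F_u(u)), mapping samples of u_dist onto out_dist."""
    return lambda u: out_dist.ppf(u_dist.cdf(u))

# Example: gaussian membrane potential -> log-normal (log-gaussian) firing rate.
u_dist = stats.norm(loc=-65.0, scale=5.0)            # membrane potential statistics (assumed)
out_dist = stats.lognorm(s=1.0, scale=np.exp(1.0))   # desired activation statistics (assumed)
phi = make_activation(u_dist, out_dist)

u_samples = u_dist.rvs(size=100_000, random_state=0)
lam = phi(u_samples)
# By the probability integral transform, the transformed samples follow the desired distribution.
print(lam.mean(), out_dist.mean())
```

By the probability integral transform, any sample passed through this function follows the desired output distribution exactly; intrinsic plasticity maintains this property approximately when the membrane potential statistics must be estimated online.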

2.1  Model Components

The network model described is composed of four components to be chosen independently. First, a class of stochastic processes can be chosen to model the dynamics of individual membrane potentials and thus determine their marginal membrane potential distributions. Second, the desired marginal distribution of the neurons' activation can be defined to match theoretical considerations (e.g., the exponential distribution—Triesch, 2007) or biological data (e.g., the log-normal distribution—Hromádka, DeWeese, & Zador, 2008). Third, codependency between the signals projected onto the neurons can be introduced or modified by an appropriate connectivity structure of synaptic connections. Finally, the precise mechanisms of intrinsic and synaptic plasticity can be chosen to implement invariance with respect to certain changes in the population's input statistics. Each of these modeling choices is briefly discussed next.

2.1.1  Membrane Potential Processes

With little loss of generality, we assume that the membrane potential of an individual neuron can be modeled as a process operating on two timescales. The time-varying potential $u(t)$ is described by a stationary stochastic diffusion process of the general form
$$\mathrm{d}u(t) = a\bigl(u(t)\bigr)\,\mathrm{d}t + b\bigl(u(t)\bigr)\,x(t)\,\mathrm{d}t \tag{2.1}$$
where $a$ models the autonomous deterministic behavior of the membrane potential and $b$ modulates the impact of the time-varying input $x(t)$. We assume for convenience that the stationary distribution of $u(t)$ is a member of an exponential family of distributions, parameterized by a vector of (minimally) sufficient statistics $\theta = \mathbb{E}[T(u)]$. This constraint is not particularly restrictive, since a large variety of probability distributions belong to an exponential family, for which a diffusion process with according stationary distribution can be constructed (Bibby, Skovgaard, & Sørensen, 2005). On an orders-of-magnitude slower timescale, the statistics of the membrane potential are themselves subject to changes that are not part of the model. The chosen diffusion process can be adjusted to match dynamic properties such as the membrane potential's impulse response, thus giving the modeler flexibility to choose the process in accordance with biological observations. For illustration purposes, we model the membrane potential as a simple leaky integrator of the form
$$\tau\,\mathrm{d}u(t) = \bigl(u_{0} - u(t)\bigr)\,\mathrm{d}t + c\,x(t)\,\mathrm{d}t \tag{2.2}$$
where $\tau$ is the leak time constant of the membrane potential, $u_{0}$ is the resting potential, and $c$ controls the sensitivity of the membrane potential to external input. When the neuron is driven by white noise (modeled as the derivative of Brownian motion), the resulting membrane potential is an Ornstein-Uhlenbeck process with a gaussian stationary distribution (see lemma 5). This choice of stochastic process is commonly used as a candidate model of membrane potentials (Ricciardi & Lánský, 2006). While this choice of membrane potential process is mathematically convenient, questions have been raised about the applicability of such a model to biological data (Shinomoto, Sakai, & Funahashi, 1999), and a more sophisticated choice could be made here if required. The stationary membrane potential distribution family of choice should be general enough that any relevant changes in the distribution (e.g., due to effects of learning) are captured in its parameters. In the examples presented here, input processes exhibit either stationary gaussian, Laplacian, or beta distributions.
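As a concrete illustration of such a membrane potential process, the following Euler-Maruyama sketch simulates the leaky integrator of equation 2.2 driven by white noise, that is, an Ornstein-Uhlenbeck process; the parameter values are illustrative.

```python
# Euler-Maruyama simulation of a leaky-integrator membrane potential driven by
# white noise, whose stationary distribution is gaussian (Ornstein-Uhlenbeck).
import numpy as np

def simulate_ou(tau=0.005, u0=-65.0, sigma=5.0, dt=1e-4, T=1.0, seed=0):
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    u = np.empty(n)
    u[0] = u0
    for k in range(1, n):
        noise = rng.normal(0.0, np.sqrt(dt))  # increment of Brownian motion
        u[k] = u[k-1] + (u0 - u[k-1]) / tau * dt + sigma * np.sqrt(2.0 / tau) * noise
    return u

u = simulate_ou()
print(u.mean(), u.std())  # approaches the stationary distribution N(u0, sigma^2)
```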

Note that the stochastic input term in the process makes no distinction between an unknown “signal” and “noise” present in the input—just the combination of both is modeled. This makes it possible to match the statistical properties to biological observations without knowledge of what is signal and what is noise in light of the neuron's computational role.

2.1.2  Activation Distributions

The stationary distribution of firing rates affects the neuron's capacity to transmit information about its membrane potential and has a considerable impact on the neuron's computational role. For a fixed mean firing rate, for example, the exponential distribution maximizes the entropy of the neuron's intensity (Triesch, 2007), whereas a narrowly peaked bimodal distribution could alternate between periods of high and low firing rates, leading to more precisely timed bursts of spikes. Distributions with heavier tail probabilities turn the neuron into a coincidence detector, while others may best match biological observations for certain neuron types. When linearly decorrelated, marginally uniform activations become maximally informative (see also section 2.1.4). For a more detailed discussion of the effects that the choice of an activation distribution can have on the neuron model, we refer readers to the supplementary text A.6. In the following, a continuous distribution is assumed for mathematical convenience, but this is not strictly necessary.

When modeling connected neural populations, the distribution should be chosen in accordance with the stationary distribution of the membrane potential process, such that a linear combination of multiple neurons' outputs, filtered by synaptic responses and integrated, results in the assumed stationary distribution of the membrane potential. However, when a gaussian process is chosen for the membrane potential, this may be neglected for a large number of synaptic inputs due to the central limit theorem and the combined smoothing effect of the synaptic and membrane potential spike response. For the examples presented here, either log-gaussian or log-Laplacian stationary activation distributions are used.

2.1.3  Intrinsic Plasticity

Consider a model neuron as described above with a known membrane potential process that has the stationary exponential family distribution parameterized by sufficient statistics $\theta = \mathbb{E}[T(u)]$, where $T$ is some continuous function of the random variable $u$. By filtering the process $T(u(t))$ with a causal exponential filter, we construct an exponentially weighted running estimate $\hat{\theta}(t)$ of the membrane potential's sufficient statistics. The dynamic properties of this process are derived in the supplementary text A.4. As outlined in section 2, the neuron's activation function is chosen to map the membrane potential distribution with cumulative distribution function $F_{U}$ onto the desired output distribution with cumulative distribution function $F_{\Lambda}$. By using $\hat{\theta}(t)$ as an empirical estimate of the expected value $\theta$, the activation function $\phi_{\hat{\theta}(t)}$ thus always approximates the mapping from the current membrane potential distribution parameterized by $\hat{\theta}(t)$ to the desired firing intensity distribution, making the neuron invariant to any (slow) changes in the parameters of the membrane potential distribution. It should be stressed that this functional notion of intrinsic plasticity models the combined effect of all forms of homeostatic mechanisms within the neuron, such as spike-rate adaptation, effects due to depletion, diffusion, and aggregation of neurotransmitters, gene expression, and more. If desired, multiple adaptation variables, changing with different timescales, can be included as long as they serve as sufficient statistics of the membrane potential distribution.
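The following sketch illustrates this mechanism for the gaussian-to-log-gaussian neuron: the sufficient statistics of the membrane potential (its first and second uncentered moments) are tracked by causal exponential filters and used to re-parameterize the activation function at every step. Class and parameter names are illustrative.

```python
# Sketch of intrinsic plasticity: exponentially weighted running estimates of
# E[u] and E[u^2] parameterize the activation function in real time.
import numpy as np
from scipy import stats

class AdaptiveNeuron:
    def __init__(self, eta_ip=1e-3, mu_out=0.0, sigma_out=1.0):
        self.eta_ip = eta_ip            # intrinsic-plasticity rate (per time step)
        self.th1, self.th2 = 0.0, 1.0   # running estimates of E[u] and E[u^2]
        self.out = stats.lognorm(s=sigma_out, scale=np.exp(mu_out))  # desired output distribution

    def step(self, u):
        """u: scalar membrane potential sample; returns the current activation."""
        # causal exponential filter of the sufficient statistics T(u) = (u, u^2)
        self.th1 += self.eta_ip * (u - self.th1)
        self.th2 += self.eta_ip * (u * u - self.th2)
        mean = self.th1
        std = np.sqrt(max(self.th2 - self.th1 ** 2, 1e-12))
        # activation function phi = F_out^{-1}(F_u(u)) with the current estimates
        p = np.clip(stats.norm(loc=mean, scale=std).cdf(u), 1e-9, 1 - 1e-9)
        return self.out.ppf(p)
```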

2.1.4  Connectivity and Copulas

Neurons within a population may be related through shared inputs or lateral connections. To model the relationship between the signals processed by each neuron, we assume that a population of neurons receives inputs from a multivariate stochastic process, which is subsequently linearly transformed by synaptic connections, projected onto the neurons, and integrated into their membrane potentials. With an appropriate choice of weights, a sufficiently large number of synaptic connections can induce an arbitrary covariance structure. Consequently, any change in the covariance structure of the input process could be reversed by an appropriate choice of synaptic connections (see the supplementary text B.3 for more information).
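A short numerical check of this point, assuming linear mixing $u = W x$: the covariance of the membrane potentials is $W \Sigma W^{\mathsf{T}}$, so a suitably chosen weight matrix (here a whitening transform, with an illustrative input covariance) can undo an arbitrary change in the input covariance.

```python
# Sketch: a whitening weight matrix removes an arbitrary input covariance.
import numpy as np

rng = np.random.default_rng(4)
Sigma = np.array([[2.0, 0.8], [0.8, 0.5]])   # input covariance (illustrative)
evals, evecs = np.linalg.eigh(Sigma)
W = np.diag(evals ** -0.5) @ evecs.T         # whitening transform: W Sigma W^T = I

x = rng.multivariate_normal([0.0, 0.0], Sigma, size=100_000)
u = x @ W.T                                  # membrane potentials after synaptic mixing
print(np.cov(u.T))                           # approximately the identity matrix
```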

Since each neuron only (invertibly) transforms its marginal membrane potential distribution to its marginal output distribution, only neurons driven by (in)dependent inputs exhibit (in)dependent outputs. To be able to define the dependency structure independently of the chosen marginal membrane potential and output distributions, we factorize the stationary joint membrane potential distribution into the marginal distributions and a so-called copula $C$ with density $c$ (Embrechts, Lindskog, & McNeil, 2001). For $n$ neurons, this allows us to express the joint distributions of a population's membrane potentials and activations as follows, where $p$ denotes a density and $F$ a cumulative distribution function of the corresponding random variable in the subscript (see lemma 10 in the supplementary text B.1):
$$F_{U}(u_{1}, \ldots, u_{n}) = C\bigl(F_{U_{1}}(u_{1}), \ldots, F_{U_{n}}(u_{n})\bigr) \tag{2.3}$$
$$p_{U}(u_{1}, \ldots, u_{n}) = c\bigl(F_{U_{1}}(u_{1}), \ldots, F_{U_{n}}(u_{n})\bigr)\,\prod_{i=1}^{n} p_{U_{i}}(u_{i}) \tag{2.4}$$
$$F_{\Lambda}(\lambda_{1}, \ldots, \lambda_{n}) = C\bigl(F_{\Lambda_{1}}(\lambda_{1}), \ldots, F_{\Lambda_{n}}(\lambda_{n})\bigr) \tag{2.5}$$
$$p_{\Lambda}(\lambda_{1}, \ldots, \lambda_{n}) = c\bigl(F_{\Lambda_{1}}(\lambda_{1}), \ldots, F_{\Lambda_{n}}(\lambda_{n})\bigr)\,\prod_{i=1}^{n} p_{\Lambda_{i}}(\lambda_{i}) \tag{2.6}$$

Under the effect of intrinsic plasticity, all activations are identically distributed, and the joint distribution of activations is fully determined by the desired marginal stationary activation distribution $p_{\Lambda_{i}}$ and the copula $C$. Note that $C$, which captures the relation between the neurons' joint outputs, also reflects the structure already found between their membrane potentials, which is in turn determined by the structure between the input signals driving the neurons. The copula thus captures the interrelation between the neurons independent of their activation functions or any invertible transformation thereof (see the supplementary text B.1). The copula itself is a probability distribution, the negative entropy of which measures the mutual information between the random variables modeled by it (Ma & Sun, 2011). Mutual independence is thus achieved when the entropy of their copula is maximized, resulting in a jointly uniform distribution, also referred to as the independence copula.

Specifically for stationary membrane potential processes with gaussian or Laplacian distributions, as used in the following, the copula is fully determined by the covariance structure of the inputs and the synaptic connection weights. A direct consequence of this is that an appropriate choice of synaptic connection weights can be used to render the neurons fully statistically independent (see the supplementary text B.2). In the special case of marginally uniform firing rate distributions, the copula is identical to the population's joint firing rate distribution, and mere uncorrelatedness of the neurons corresponds to jointly uniform, and thus statistically independent, outputs. The choice of the marginal membrane potential distribution thus defines what aspect of the copula synaptic weights can influence, whereas the marginal output distribution defines what can be revealed about it through (linear) correlation coefficients or Hebbian learning. This conceptual separation of the marginal distributions, on which the adaptive neurons operate, and the copula, which captures the full interrelation of neurons due to their related inputs and synaptic connections, allows us to better analyze the effects of intrinsic and synaptic plasticity and their interaction.
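The separation of margins and copula can be made concrete with a rank transform, which strips away the (invertible) marginal transformations and leaves only the copula; the sketch below does this for two log-gaussian activations with a gaussian copula. Parameter values are illustrative.

```python
# Sketch: the empirical copula via the rank (probability integral) transform.
# Rank-based dependence measures such as Spearman's rho depend only on the copula.
import numpy as np
from scipy import stats

def empirical_copula(samples):
    """samples: (n, d) array; returns rank-transformed data with uniform margins."""
    n = samples.shape[0]
    return stats.rankdata(samples, axis=0) / (n + 1.0)

rng = np.random.default_rng(1)
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=5000)
lam = np.exp(z)                 # log-gaussian activations sharing a gaussian copula
u = empirical_copula(lam)

rho_lam, _ = stats.spearmanr(lam[:, 0], lam[:, 1])
rho_cop, _ = stats.spearmanr(u[:, 0], u[:, 1])
print(rho_lam, rho_cop)         # identical: the margins do not affect the rank correlation
```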

2.1.5  Hebbian Plasticity

In the adaptive network model, a generalized rate-coding model of Hebbian plasticity (cf. the activity product rule proposed by Brown, Kairiss, & Keenan, 1990) updates synapses connecting a presynaptic source with a postsynaptic target according to a product of their respective activations. Generally, assuming a synaptic interaction delay $d$, the weight $w$ of a synapse that connects a source with activation $\lambda_{\mathrm{pre}}$ to a target with activation $\lambda_{\mathrm{post}}$ is updated according to a multiplicative rule of the form
$$\mathrm{d}w(t) = \eta_{w}\,f\bigl(\lambda_{\mathrm{pre}}(t - d)\bigr)\,g\bigl(\lambda_{\mathrm{post}}(t)\bigr)\,\mathrm{d}t \tag{2.7}$$
where $\eta_{w}$ is a synaptic learning rate and $f$ and $g$ model the potentially nonlinear dependency of the weight updates on the pre- and postsynaptic activations. The specific choice of $f$ and $g$ allows adapting the synaptic plasticity rule to the chosen marginal distribution of neural activations or tuning it to learn different nonlinear correlations. For example, by choosing $f$ and $g$ to be the cumulative distribution functions of pre- and postsynaptic activity, respectively, Hebbian learning can be set to approximate Spearman's rank correlation (Genest & Favre, 2007), whereas setting both to the identity function makes the learning rule approximate the simple covariance between pre- and postsynaptic activation. Various choices for $f$ and $g$ are discussed by Brito and Gerstner (2016), leading to the discovery of sparse codes, whereas the emergence of principal or independent components can be proven for linear or suitable nonlinear choices, respectively, in linear model neurons with constrained weights (Hyvärinen & Oja, 1998; Oja, 1982). It should be noted that these rules consider activations rather than membrane potentials, which is crucial here, since only the activation is subject to the effects of intrinsic adaptation.
For the simulation experiments presented here, both functions are set to simply subtract the mean activation $\bar{\lambda}$ enforced by intrinsic plasticity, that is, $f(\lambda) = g(\lambda) = \lambda - \bar{\lambda}$, such that uncorrelated firing at the mean rate leads to no synaptic growth. Transmission delays are neglected, resulting in the simple Hebbian learning rule
$$\mathrm{d}w(t) = \eta_{w}\,\bigl(\lambda_{\mathrm{pre}}(t) - \bar{\lambda}\bigr)\bigl(\lambda_{\mathrm{post}}(t) - \bar{\lambda}\bigr)\,\mathrm{d}t \tag{2.8}$$
For strictly inhibitory synapses, weights are constrained to remain nonpositive and a negative scaling constant is chosen, resulting in a form of anti-Hebbian learning (Földiák, 1990).
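A minimal discrete-time sketch of these two update rules (the mean-subtracted Hebbian rule of equation 2.8 and its anti-Hebbian counterpart); function and variable names are illustrative.

```python
# Sketch of the mean-subtracted Hebbian update and its anti-Hebbian counterpart.
import numpy as np

def hebbian_update(w, lam_pre, lam_post, lam_bar, eta_w, dt):
    """Discretized dw/dt = eta_w * (lam_pre - lam_bar) * (lam_post - lam_bar)."""
    return w + eta_w * (lam_pre - lam_bar) * (lam_post - lam_bar) * dt

def anti_hebbian_update(w, lam_pre, lam_post, lam_bar, eta_w, dt):
    """Same rule with a negative scaling constant; weights clipped to stay inhibitory."""
    w = w - eta_w * (lam_pre - lam_bar) * (lam_post - lam_bar) * dt
    return np.minimum(w, 0.0)
```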

2.2  The Full Network Model

The dynamics of a neuron $i$ within the adaptive network can be summarized in the most general form by equations 2.9 to 2.12, whereas a synaptic weight from neuron $j$ to neuron $i$ adapts according to equation 2.13:
$$\phi_{\hat{\theta}^{i}}(u) = F_{\Lambda}^{-1}\bigl(F_{U}(u \mid \hat{\theta}^{i})\bigr) \tag{2.9}$$
$$\mathrm{d}u^{i}(t) = a\bigl(u^{i}(t)\bigr)\,\mathrm{d}t + b\bigl(u^{i}(t)\bigr)\sum_{j} W_{ij}\,x^{j}(t)\,\mathrm{d}t \tag{2.10}$$
$$\mathrm{d}\hat{\theta}^{i}(t) = \eta_{\mathrm{IP}}\,\bigl(T\bigl(u^{i}(t)\bigr) - \hat{\theta}^{i}(t)\bigr)\,\mathrm{d}t \tag{2.11}$$
$$\lambda^{i}(t) = \phi_{\hat{\theta}^{i}(t)}\bigl(u^{i}(t)\bigr) \tag{2.12}$$
$$\mathrm{d}W_{ij}(t) = \eta_{w}\,f\bigl(\lambda^{j}(t - d)\bigr)\,g\bigl(\lambda^{i}(t)\bigr)\,\mathrm{d}t \tag{2.13}$$
Here, superscripts denote neurons, $x^{j}$ is an input signal (either external or the output of another neuron), $u^{i}$ is the membrane potential, $\hat{\theta}^{i}$ is the empirical estimate of the membrane potential's sufficient statistics $\theta^{i}$, $\lambda^{i}$ is the neuron's activation, $W$ is the matrix of synaptic connection weights, and $\eta_{\mathrm{IP}}$ and $\eta_{w}$ are the learning rates of intrinsic and synaptic plasticity, respectively.
In the following, specific instances of this model are used, where the membrane potential is modeled by leaky integration; inputs are themselves assumed to be stationary, Markovian, mean-reverting stochastic processes with either gaussian, Laplacian, or beta distributions; the marginal output distribution is chosen to be log-gaussian, log-Laplacian, or uniform; synaptic delays are neglected; and simple Hebbian learning is employed. In the cases shown here, equations 2.10 and 2.13 take the form:
$$\tau\,\mathrm{d}u^{i}(t) = \bigl(u_{0} - u^{i}(t)\bigr)\,\mathrm{d}t + c\sum_{j} W_{ij}\,x^{j}(t)\,\mathrm{d}t \tag{2.14}$$
$$\mathrm{d}W_{ij}(t) = \eta_{w}\,\bigl(\lambda^{j}(t) - \bar{\lambda}\bigr)\bigl(\lambda^{i}(t) - \bar{\lambda}\bigr)\,\mathrm{d}t \tag{2.15}$$
where $\bar{\lambda}$ is the expected value of the activation distribution, $\tau$ is the membrane potential time constant, and $c$ is a scaling factor. The function $T$ defining the sufficient statistics and the activation functions resulting from the different choices of membrane potential and activation distributions used in this paper are listed in Table 1.
Table 1:
Parameterization of Different Neuron Models. Columns correspond to the four neuron models used in this paper: gaussian membrane potential with log-gaussian activation, Laplacian with log-Laplacian activation, beta with uniform activation, and Laplacian with log-gaussian activation. Rows list the sufficient statistics $T(u)$ of the membrane potential distribution and the resulting parameterized activation function for each model, together with the auxiliary quantities used in their definitions.
In the simulations presented here, the inputs to the model neurons are either outputs from other neurons, pixel intensities of image patches, or artificially generated colored noise. Noise stimuli with an exponential autocorrelation structure and stationary gaussian distribution are generated by Ornstein-Uhlenbeck processes in the form of equation 2.16, whereas a Laplacian stationary distribution is produced by stochastic processes as given by equation 2.17:
$$\mathrm{d}x(t) = \frac{\mu(t) - x(t)}{\tau_{x}}\,\mathrm{d}t + \sigma(t)\,\sqrt{\frac{2}{\tau_{x}}}\,\mathrm{d}B(t) \tag{2.16}$$
$$\mathrm{d}x(t) = \frac{\mu(t) - x(t)}{\tau_{x}}\,\mathrm{d}t + \sqrt{\frac{\sigma(t)}{\tau_{x}}\Bigl(\sigma(t) + \sqrt{2}\,\bigl\lvert x(t) - \mu(t)\bigr\rvert\Bigr)}\,\mathrm{d}B(t) \tag{2.17}$$
The functions $\mu(t)$ and $\sigma(t)$ above are the (possibly time-varying) mean and standard deviation of the noise stimuli, and $\tau_{x}$ is their autocorrelation time constant.

In some simulations, to ensure that multiple neurons within a population represent different aspects of their inputs, lateral inhibition is employed, such that one neuron inhibits all others, the next neuron inhibits all but the first, and so on. These inhibitory synaptic weights are trained via simple anti-Hebbian learning (Földiák, 1990) according to rule 2.15 with a negative scaling constant and constrained to remain negative. This connectivity structure introduces a strict ordering of decorrelated neurons without imposing recurrent connectivity on the population.
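The following sketch shows one way to implement this strictly ordered lateral inhibition, with a strictly lower-triangular, nonpositive weight matrix updated by the anti-Hebbian rule; the details are illustrative rather than the exact scheme used in the simulations.

```python
# Sketch of ordered lateral inhibition: neuron i inhibits all neurons with a
# larger index, so the inhibitory weight matrix is strictly lower triangular.
import numpy as np

def init_lateral_weights(n_neurons):
    # only entries below the diagonal can become (negative) inhibitory weights
    return np.zeros((n_neurons, n_neurons))

def anti_hebbian_step(V, lam, lam_bar, eta, dt):
    """V[i, j] connects source j to target i, for j < i only."""
    delta = np.outer(lam - lam_bar, lam - lam_bar)   # postsynaptic x presynaptic products
    V = V - eta * np.tril(delta, k=-1) * dt          # update the strictly lower triangle only
    return np.minimum(V, 0.0)                        # keep the weights inhibitory
```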

3  Results

To illustrate the properties of the model described above, we present three simulation examples.

3.1  Intrinsic Plasticity in Isolation

First, consider a gaussian model neuron with a leaky integrating membrane potential as given in equation 2.14, driven by a single gaussian process as given by equation 2.16. During the simulation, the mean and standard deviation of the gaussian input signal are changed at three points in time, resulting in four time intervals, in each of which the membrane potential exhibits a different stationary gaussian distribution. We estimate the sufficient statistics of the respective membrane potential distributions through the two adaptation variables $\hat{\theta}_{1}$ and $\hat{\theta}_{2}$ (see Table 1, gaussian to log-gaussian), both of which change on a timescale much slower than the autocorrelation in the membrane potential process.

During simulation, the sufficient statistics, as well as the Kullback-Leibler divergence between the neuron's estimated and real membrane potential distribution, are tracked and aggregated over 100 trials. This equivalently measures the divergence between the desired, log-gaussian output distribution and the one actually achieved by the neuron (see supplementary text A.5 for a derivation).

The results are summarized in Figure 2. Evidently, the model neuron reliably adapts to the various input statistics on a timescale determined by the dynamics of the adaptation variables, as well as the dynamics of the membrane potential. For an analytical derivation of the adaptation dynamics, we again refer the reader to the supplementary text A.4. The result illustrates that using the estimated membrane potential statistics in the parameterized activation function allows the neuron to compensate for sudden changes in the input distribution and maintain its desired output distribution.
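A compact numerical sketch of this experiment, assuming illustrative parameter values: an Ornstein-Uhlenbeck membrane potential whose input mean and standard deviation switch between four stages, exponential-filter estimates of the first two moments, and the Kullback-Leibler divergence between the estimated and true gaussian stationary distributions.

```python
# Sketch: online moment estimation under abrupt changes of the input statistics,
# tracked by the KL divergence between the estimated and true gaussian.
import numpy as np

def kl_gauss(mu_q, var_q, mu_p, var_p):
    """KL( N(mu_p, var_p) || N(mu_q, var_q) ): divergence of the estimate q from the truth p."""
    return 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)

rng = np.random.default_rng(0)
dt, tau, eta = 1e-3, 0.02, 5e-3
stages = [(0.0, 1.0), (2.0, 1.0), (2.0, 3.0), (-1.0, 0.5)]  # (mean, std) of each interval
u, th1, th2 = 0.0, 0.0, 1.0
for mu_in, sd_in in stages:
    for _ in range(20_000):
        u += (mu_in - u) / tau * dt + sd_in * np.sqrt(2 * dt / tau) * rng.normal()
        th1 += eta * (u - th1)        # running estimate of E[u]
        th2 += eta * (u * u - th2)    # running estimate of E[u^2]
    kl = kl_gauss(th1, th2 - th1 ** 2, mu_in, sd_in ** 2)
    print(f"stage ({mu_in}, {sd_in}): KL = {kl:.4f}")
```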

Figure 2:

Adaptation of a model neuron to changing input distributions. Top: Stationary distributions of the gaussian membrane potential process (dashed lines) during four stages of the simulation (color-coded backgrounds) and the respective distributions estimated by the neuron's adaptation parameters at the end of each interval (filled histograms). Rows 2 and 3: Time course of the adaptation variables $\hat{\theta}_{1}$ and $\hat{\theta}_{2}$, estimating the first and uncentered second moment of the membrane potential process, respectively. Across 100 trials, mean (solid red lines) and standard deviation (error bars) of the activation parameters are shown. The true moments (dashed gray lines), as well as the analytical values attainable via causal exponential filtering of sufficient statistics (dashed black lines), are provided for reference (see supplementary text A.4 for a derivation). Row 4: Trace of the Kullback-Leibler divergence between underlying and estimated membrane potential or, equivalently, desired and realized output distribution (solid black line). See supplementary text A.5 for an analytical derivation. For reference, the divergence for a hypothetical nonadaptive neuron with identical initial parameter values is given (dashed gray line). Bottom: Desired log-gaussian (dashed black line) and achieved (filled histograms) distributions of activation at the end of each time interval. For reference, the distributions of activation resulting from the same input distributions are given for a nonadaptive neuron with identical initial parameter values (dashed gray line).

3.2  Interaction between Intrinsic and Synaptic Plasticity

While the ability of individual neurons to adapt to changes in their environment as illustrated in the previous experiment is arguably a very useful feature by itself, our primary interest lies in the interaction of such plastic neurons with plastic synapses. We present two variations of an experiment inspired by Hyvärinen and Oja (1998) and Savin et al. (2010) to illustrate how the interaction of both plasticity mechanisms can drive individual neurons to become selective to a transformation-invariant representation of their multivariate input by discovering a principal or independent component. We are able to reproduce these results using only intrinsic plasticity and Hebbian synaptic plasticity as discussed above and need not rely on a third, distinct mechanism of synaptic scaling as suggested there. For both conditions, we provide a detailed fixed-point analysis of the synapse dynamics in the presence and absence of intrinsic plasticity and show theoretically how independent component analysis emerges (only) from the complementary contributions of intrinsic and synaptic plasticity. To this end, we use plasticity operating on two distinct timescales: a fast-acting form of intrinsic firing rate adaptation that renders the neurons invariant to changes in the scale of their membrane potentials and much slower-acting Hebbian plasticity that adapts the weight vectors projecting into the neuron.

In both simulations, consider two independent source signals modeled by stationary stochastic processes. The source signals are mixed into two different linear combinations by multiplying the two-dimensional signal vector $s(t)$ with a mixing matrix $M$, here chosen to be a fixed rotation matrix. The transformed signal vector $M s(t)$ is then projected into two adaptive neurons through a matrix of adaptive synaptic weights $W$. The process driving the two adaptive neurons is thus given by $W M s(t)$. Recovering the two original source signals such that $W M$ equals the identity (up to permutation and scaling) requires finding the matrix $W$ without knowing $M$. This problem is accordingly referred to as blind source separation (Keziou, Fenniri, Messou, & Moreau, 2013). Since the sources here represent statistically independent signals, this can be realized through independent component analysis (ICA; Hyvärinen & Oja, 2000). In both simulations, the covariance matrix of both neurons' membrane potentials is determined by the constant mixing matrix $M$ and the time-varying weight matrix $W$. In particular, the variance of each neuron $i$'s membrane potential is given as a function of the corresponding row $w_{i}$ of $W$, that is, the vector of weights projecting into neuron $i$. At each point in time, the effect of the fast-acting intrinsic plasticity, which renders the neurons invariant to changes in their marginal membrane potential variances, can thus be modeled as a normalization of each membrane potential: $u^{i}(t) \mapsto u^{i}(t)/\sigma_{u^{i}}$, where $\sigma_{u^{i}}$ denotes the standard deviation induced by the current weight vector. This effect of intrinsic plasticity is equivalent to synaptic scaling (Turrigiano, 2008), albeit implemented within the neuron without explicit knowledge of the weight vector, rather than changing the weight vector itself. For each weight vector $w_{i}$, this makes it possible to theoretically derive the vector field of expected weight changes from the Hebbian weight update equation 2.7, identify fixed points, and prove convergence. A corresponding vector field in the absence of intrinsic plasticity can be derived in strict analogy, only replacing the normalization of the membrane potential by a scaling with a constant $\sigma_{0}$ independent of the weight vector: $u^{i}(t) \mapsto u^{i}(t)/\sigma_{0}$. An analytical derivation of the attractor landscapes can be found in the supplementary text C.2.

We show that the basic Hebbian plasticity rule, which updates synaptic weights according to equation 2.15 based on the pre- and postsynaptic activities, can achieve blind source separation if and only if it is paired with intrinsic plasticity. By recovering the original independent sources, the neurons become invariant to the mixing effect of an (orthogonal) transformation $M$, illustrating a network-level form of invariance due to plasticity. Thus, multiple neurons can develop a nonredundant, information-theoretically efficient representation of their multidimensional inputs (Isomura et al., 2015).
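The setup can be sketched as follows, with the rotation angle, learning rate, and renormalization step chosen for illustration only; in particular, the weight rows are explicitly renormalized here for readability, whereas in the model the stabilization emerges from intrinsic plasticity itself.

```python
# Sketch of the source-separation setup: independent Laplacian sources mixed by
# a rotation M, projected through weights W; fast intrinsic plasticity is modeled
# as an analytic normalization of each membrane potential to unit variance.
import numpy as np

rng = np.random.default_rng(2)
angle = np.pi / 6
M = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])            # unknown mixing (rotation)
S = rng.laplace(scale=1 / np.sqrt(2), size=(2, 200_000))   # unit-variance Laplacian sources
X = M @ S                                                  # mixed inputs to the population

W = rng.normal(size=(2, 2))
W /= np.linalg.norm(W, axis=1, keepdims=True)
eta_w, lam_bar = 1e-3, np.exp(0.5)                         # lam_bar: mean of exp(standard normal)

for t in range(X.shape[1]):
    u = W @ X[:, t]                                        # membrane potentials
    sigma_u = np.linalg.norm(W @ M, axis=1)                # std of u under unit-variance sources
    lam = np.exp(u / sigma_u)                              # normalized potential -> exponential activation
    W += eta_w * np.outer(lam - lam_bar, X[:, t])          # mean-subtracted Hebbian growth
    W /= np.linalg.norm(W, axis=1, keepdims=True)          # keep unit rows (see note above)

# Each row of W should roughly align with a column of M, i.e. an independent component,
# so |W M| approaches a permutation-like matrix.
print(np.abs(W @ M))
```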

3.2.1  PCA in Individual Neurons

For the first experiment, assume the sources to be gaussian with different variances. The model neurons are chosen to map gaussian membrane potentials to log-gaussian activations. Over time, we trace the evolution of the synaptic weights under the influence of synaptic and intrinsic plasticity and observe that both weight vectors rapidly align themselves with the largest eigenvector of the covariance matrix of the joint membrane potential distribution. As a consequence, both neurons become tuned to the first principal component of the two-dimensional input, which corresponds to the one recovered original source with largest variance. The magnitude of the resulting weight vectors is equal for both neurons. The top-left panel of Figure 3 summarizes these results, and the two panels on the top right confirm that due to plasticity, the empirical distributions of the membrane potential and activation approach the desired distributions at the end of the simulation.

Figure 3:

Principal component analysis realized by two plastic neurons with Hebbian synapses. The top-left panel shows, color-coded for each neuron, the weight vectors evolving over time, from their respective starting values indicated by a star to final values indicated by a circle. Dashed lines mark the standard basis, and solid straight lines represent eigendirections of the input covariance matrix. The jointly gaussian input distribution is represented by random samples (gray dots) and an iso-probability ellipse. The two panels on the top right show empirical and analytically expected or desired probability densities of the first neuron's membrane potential $u$ and its activation $\lambda$. The right-hand panel presents the absolute inner angle in radians between each weight vector (color-coded) and the first principal component direction as a function of time. The bottom-left panel presents the adaptation of the weight vectors in the phase space of the Hebbian learning rule (see equation 2.15) under the influence of IP for the given input distribution. The weight vectors converge to either of the two stable fixed points (filled circles), depending on the domain of attraction (coded by background color), and diverge from the trivial solution at $w = 0$. On the separatrix, two saddle points emerge (half-filled circles). The bottom-right panel, for comparison, shows the phase space and example weight trajectories resulting from synaptic plasticity alone in the absence of intrinsic plasticity.

To understand these results, the respective roles of intrinsic and synaptic plasticity must be analyzed. Hebbian synaptic plasticity as employed here drives the synaptic connections to maximize postsynaptic activations. Due to the nonlinear relationship, here exponential, between membrane potential and activation, more dispersed stimuli are more effective at driving the neuron to fire at high rates, since large positive deviations of the membrane potential are amplified, whereas similar negative deviations are attenuated. (For a precise derivation of how changes in higher-order moments of the membrane potential distribution affect the neuron's mean firing rate due to its nonlinear activation function, see the supplementary text C.1.) Hebbian plasticity thus rotates the weight vectors toward those directions in input space where dispersion is maximized. For the gaussian signals used here, where all cumulants beyond the second vanish, and due to the activation function's strong sensitivity to the second moment, these directions correspond to the principal components, ordered by the associated eigenvalues. In the absence of intrinsic plasticity, synaptic scaling, or other stabilizing mechanisms such as the BCM rule (Bienenstock et al., 1982), such a rule invariably leads to instability. Here, however, fast-acting intrinsic plasticity stabilizes Hebbian plasticity by keeping the postsynaptic neuron's output distribution constant, independent of the magnitude of the current weight vector, thus constraining synaptic growth. The choice of the desired activation distribution of the neuron thus indirectly determines the stable length of the weight vector resulting from Hebbian plasticity, as well as the speed and reliability of convergence to the principal components.
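As a brief illustration of this sensitivity for the exponential activation used here (with generic gaussian parameters): if the membrane potential is gaussian, $u \sim \mathcal{N}(\mu, \sigma^{2})$, and the activation is $\lambda = \exp(u)$, then the mean firing rate is
$$\mathbb{E}[\lambda] = \exp\!\Bigl(\mu + \tfrac{\sigma^{2}}{2}\Bigr),$$
which grows monotonically with the variance of the membrane potential, so weight directions that maximize dispersion also maximize the expected postsynaptic activation driving Hebbian growth.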

The bottom-left panel of Figure 3 shows the attractor landscape of the Hebbian learning procedure, which exhibits two stable solutions, each corresponding to a weight vector aligned with the first principal component, and an unstable fixed point at $w = 0$, pushing weight vectors toward nontrivial solutions. As indicated by the domains of attraction, a single neuron with Hebbian synapses and intrinsic plasticity converges to either of two representations of the first principal component of the input, recovering the source signal with the larger variance. The temporal dynamics and stability of the adaptation procedure are illustrated in the second-row panel on the right side of Figure 3, where the angle between each weight vector and the direction of the first principal component is plotted as a function of time. After quickly rotating toward this direction, the weight vectors remain stable at an angle close to 0, demonstrating the long-term stability of the solution. To illustrate the crucial role of intrinsic plasticity in stabilizing Hebbian learning, the bottom-right panel of the same figure shows the phase space and weight vector trajectories from identical starting positions in the absence of intrinsic plasticity. For the parameters chosen here, both initial conditions now lie within the domain of attraction of a stable fixed point at the trivial solution $w = 0$.

3.2.2  ICA in Individual Neurons

The perspective gained from the previous example can be transferred to a more interesting experiment presented by Savin et al. (2010). Consider now that the two sources are identical, independent nongaussian (here Laplacian) processes, which are again mixed by the same matrix $M$ and projected onto the adaptive neurons through a matrix of adaptive synaptic weights $W$. Applying the same reasoning as in the previous example, Hebbian learning paired with intrinsic plasticity leads the neuron to discover those projections of the input signal that result in the most dispersed membrane potentials. However, because the covariance matrix of the mixed inputs is now proportional to the identity, any normed linear combination of the inputs has the same variance, and thus no unique principal components can be defined. As a consequence, the original source signals cannot be recovered by any method, such as PCA, that takes only the second moment, that is, the covariance matrix, into account. However, since the input signals are no longer gaussian, higher-order moments beyond the variance, such as kurtosis, can be taken into account to define unique directions of maximum dispersion. In the example presented here, the source signals' marginal Laplace distributions are leptokurtic, showing a higher kurtosis than any normalized linear combination of the two. The independent components (ICs), which maximize kurtosis, thus correspond to the demixed original source signals.
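This can be checked numerically: the excess kurtosis of a normalized linear combination of two independent Laplacian sources is largest when the combination coincides with one of the sources. The following sketch (sample size and angles are illustrative) makes this explicit.

```python
# Quick check: mixtures of independent Laplacian sources are less leptokurtic
# than the sources themselves, so maximizing kurtosis recovers the unmixed components.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
s = rng.laplace(size=(2, 1_000_000))
for angle in np.linspace(0, np.pi / 2, 5):
    y = np.cos(angle) * s[0] + np.sin(angle) * s[1]   # normalized linear combination
    print(f"{np.degrees(angle):5.1f} deg: excess kurtosis = {stats.kurtosis(y):.2f}")
# The excess kurtosis is maximal (about 3) at 0 and 90 degrees, i.e. at the original sources.
```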

Since the nonlinear activation function is also sensitive to higher moments beyond the variance (see the supplementary text C.1), the neural adaptation procedure in this case selects the directions that maximize kurtosis and thus discovers the most prominent independent component, which corresponds to one demixed source signal. (See Figure 4 for a summary of the results.) To demonstrate that both neurons, due to different initial weight vectors, truly discover independent components, the top-right panel of Figure 4 shows the copula function of both neurons at the beginning and end of the simulation, converging to a uniform distribution that indicates statistical independence (rather than mere uncorrelatedness).

Figure 4:

Independent component analysis realized by two plastic neurons with Hebbian synapses. The top-left panel shows, color-coded for both independent components of the input, the weight vectors evolving over time from their respective starting values indicated by a star to final values indicated by a circle. Dashed lines mark the standard basis, and solid straight lines mark the rotated basis defined by the mixing matrix. The joint input distribution is represented by random samples (gray dots) and an iso-probability diamond. The two plots on the top right show the copula between both neurons at the beginning and end of learning. The bottom-left panel presents the adaptation of the weight vectors in the phase space of the Hebbian learning rule (see equation 2.15) under the influence of IP for the given input distribution. The weight vectors converge to any of the four stable fixed points (filled circles), depending on the domain of attraction (coded by background color), and diverge from the trivial solution at $w = 0$. On the separatrices, four saddle points emerge (half-filled circles). The bottom-right panel, for comparison, shows the phase space and example weight trajectories resulting from synaptic plasticity alone in the absence of intrinsic plasticity.

3.2.3  PCA and ICA in Populations

In the two experiments above, we consider the extreme cases where either no unique independent components can be defined, because the excess kurtosis of the gaussian inputs is zero, or no unique principal components can be defined, because the covariance matrix of the Laplacian inputs is proportional to the identity. In each case, a neuron then recovers the appropriately defined dominant component, a principal component or an independent component, respectively. Since the nonlinear activation functions chosen here depend on both second- and higher-order moments, this raises the question of whether, in the presence of both principal and independent components, a model neuron would select the former, the latter, or neither, instead converging to a compromise between the two. For an exponential activation function, the optimal solution maximizes a weighted combination of all moments of the neuron's membrane potential, with lower moments being the most influential (see supplementary text C.1). While dominant principal components are thus strongly favored by the adaptation procedure, independent components or yet other directions could coexist as stable solutions.

In order for a population of adaptive neurons to implement either PCA, ICA, or any other transformation-invariant representation of the population's input signals, it is crucial that individual neurons reliably become selective to different components. In the presence of strongly dominant components, as shown in the first example, different initial conditions alone are insufficient to ensure that different principal or independent components are discovered.1 To mitigate this problem, lateral, strictly inhibitory synapses with a negative scale parameter are placed between the adaptive neurons and updated via anti-Hebbian learning as outlined in section 2.2.

For two Laplacian source signals as in the previous example, we vary the mixing coefficients to realize three different scenarios in which the directions of maximum dispersion change from favoring principal components to favoring independent components. In all three cases, we train adaptive neurons mapping the Laplacian inputs to log-gaussian outputs and trace the weight vectors under the effect of Hebbian and intrinsic plasticity. (See Figure 5 for a summary of the results.) In accordance with expectations, the lateral decorrelation forces neurons to discover different components of the inputs in a predictable manner. While one adaptation variable emulates synaptic scaling by controlling the membrane potential's variance, the other compensates for the imbalance between feedforward activation and the combined effect of feedforward and lateral inhibition by controlling the mean. The identities of the extracted components by each neuron may change under nonorthogonal transformations, but the resulting representation of the input learned by the population is invariant with respect to further orthogonal transformations of the input.

Figure 5:

PCA and ICA in a population of four laterally decorrelated adaptive neurons with Hebbian synapses. Gray dots represent samples of the bivariate stochastic input process, solid black lines show iso-probability contours, dashed black lines show principal component directions, and dotted black lines represent independent component directions. Colored lines show trajectories of weight vectors learned by the neurons, with a star marking the initial and a circle marking the final value. On the left, all four neurons converge to both principal components with alternating signs. In the middle, two neurons converge to the dominant principal component, whereas the other two converge to the dominant independent component, all with alternating signs. On the right, all four neurons converge to both independent components with opposing signs.

3.3  Learning Representations of Image Patches

Finally, to demonstrate the generality of the results discussed thus far on high-dimensional data, we present a network of the same structure as above, evaluated on two different data sets of image patches of size 28 × 28 pixels. For each pixel, an adaptive "sensory neuron" is trained to map its input, the corresponding pixel's intensity (plus a small noise term) scaled to the range from 0 to 1 and assumed to follow a beta distribution, onto a uniformly distributed activation in the range from -0.5 to 0.5. These marginally uniform activations are subsequently used as stimuli and projected through Hebbian synapses onto five adaptive neurons that are laterally decorrelated via anti-Hebbian synapses in the same manner as in the previous example, with a fixed relative weight scale. The neurons are chosen to map gaussian inputs to log-gaussian activations and thus exhibit the same exponential activation function as discussed previously. Each image patch is presented for a simulated time of 50 ms, and the total simulation time is 150,000 s at discretization steps of 1 ms each. Time constants of the neurons' membrane potentials are set to 5 ms, and the adaptation timescales of intrinsic, anti-Hebbian, and Hebbian plasticity are set to 10 s, 100 s, and 200 s, respectively.
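A batch sketch of this sensory stage, using a per-pixel beta fit and the probability integral transform; the adaptive, online estimation used in the actual simulations is replaced here by a one-shot fit for brevity, and names are illustrative.

```python
# Sketch of the "sensory neuron" stage: map each pixel's (assumed beta-distributed)
# intensity onto a uniform activation in [-0.5, 0.5].
import numpy as np
from scipy import stats

def sensory_layer(patches):
    """patches: (n_samples, n_pixels) array of intensities in [0, 1]."""
    out = np.empty_like(patches, dtype=float)
    for j in range(patches.shape[1]):
        x = np.clip(patches[:, j], 1e-6, 1 - 1e-6)
        a, b, loc, scale = stats.beta.fit(x, floc=0.0, fscale=1.0)  # per-pixel beta fit
        out[:, j] = stats.beta.cdf(x, a, b) - 0.5                   # uniform in [-0.5, 0.5]
    return out
```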

First, a freely available collection of 500,000 natural image patches (Winder & Impressive Machines LLC, N.d.) is used, where the model neurons are shown to become selective to principal components, rather than independent components as extracted using an implementation of the FastICA algorithm (Hyvärinen & Oja, 2000; Pedregosa et al., 2011). For a summary of the results, see Figure 6. Due to the unstructured nature and translation-invariant statistics of the natural image patches, this result is in line with our expectations. Since the inputs to the adaptive neurons violate the assumption of gaussianity, intrinsic plasticity fails to achieve identical variance in the activation of all neurons but still succeeds in stabilizing Hebbian plasticity. The discrepancy between the assumed gaussian membrane potential distribution and the sparser observed distributions resulting from the neurons' discovery of nongaussian components could be alleviated by using a more general membrane potential distribution family such as the generalized normal distribution, which contains both gaussian and Laplacian distributions as special cases. Here, however, we restrict ourselves to the much simpler assumption of gaussian membrane potentials, for which closed-form expressions of the sufficient statistics exist.
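
For reference, the principal and independent components against which the learned weights are compared can be obtained with scikit-learn as sketched below. The array `patches` is a placeholder of shape (n_samples, 784) for the flattened 28 × 28 patches; loading the actual data set is omitted here.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

patches = np.random.rand(10_000, 784)        # placeholder standing in for the real patch data

pca = PCA(n_components=5).fit(patches)
principal_components = pca.components_       # (5, 784), ordered by explained variance

ica = FastICA(n_components=5, random_state=0).fit(patches)
independent_components = ica.components_     # (5, 784), unordered and sign-ambiguous

print(pca.explained_variance_ratio_)
```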

Figure 6:

On the left, from top to bottom, we show five random sample patches, the first five principal components of the stimuli, the first five independent components extracted by FastICA, and the weights learned by the five neurons of the population. On the top right, the final weight matrix of lateral inhibitory synapses is shown. The bottom right contains the resulting correlation matrix between the five neurons' activations. Evidently the population discovers mostly uncorrelated principal components. (Image patches courtesy of Winder & Impressive Machines LLC, N.d.)

Second, an identical network with the same parameters as in the previous case is used with 60,000 image patches of the same size from the MNIST database of handwritten digits (LeCun, Cortes, & Burges, N.d.). (For a summary of the results, see Figure 7). Due to the structured nature of the sample images, components representing the class averages emerge as the learned weights rather than principal components. In an unsupervised fashion, the population thus learns a sparse representation of its inputs. This result is in line with a prediction made by Triesch (2007), who proposes that in a high-dimensional learning problem with clustered data, the relative contribution of each cluster to a neuron's activation can be approximated by the expected activation in a corresponding percentile of the activation distribution.2 In this case, a neuron optimally tuned to respond strongly to a single cluster out of the 10 available can be heuristically expected to exhibit the 10% of its highest firing rates in response to inputs drawn from its favored cluster, resulting in an expected activation of in response to that class. Assuming that intrinsic plasticity successfully enforces the desired activation distribution with mean , the ratio could be viewed as a measure of selectivity toward a dominant class, which can be predetermined by an appropriate choice of activation distribution. As the neuron's mean activation drives Hebbian learning of its input synapses, this selectivity in the neuron's response to specific input classes is proportionally reflected in its synaptic weights. For a log-gaussian distribution with parameters and as used here, a well-tuned neuron could thus be expected to show an overall mean activation of with a mean activation of in response to its favored class input. For comparison, a neuron with enforced exponential activation distribution of equal mean could be expected to respond with a mean activation of to its favored cluster and would thus be slightly less selective for a unique class. For the first neuron in the numerical simulations above, the empirical results at and even exceed the heuristic predictions, implying that the neuron is indeed highly selective to a single cluster mean, in this case corresponding to the digit 0. The discrepancy between theoretical prediction and numerical results may be attributed to several compounding effects, such as temporal dynamics of the stochastic processes used here, high intraclass variability, or the correlatedness of the cluster centers of the MNIST digits, which make a direct quantitative comparison between numerical results and theoretical predictions difficult.
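
The heuristic can be checked numerically: the sketch below estimates the mean activation in the top decile of a log-gaussian and of an exponential activation distribution of equal mean and reports its ratio to the overall mean as a selectivity measure. The distribution parameters are illustrative and not those of the simulations above.

```python
import numpy as np
from scipy import stats

def top_decile_mean(dist, n=1_000_000, seed=0):
    """Monte Carlo estimate of E[X | X > 90th percentile]."""
    x = dist.rvs(size=n, random_state=seed)
    q = np.quantile(x, 0.9)
    return x[x > q].mean()

lognorm = stats.lognorm(s=1.0, scale=np.exp(0.0))   # assumed log-gaussian activation
expon = stats.expon(scale=lognorm.mean())           # exponential with matched mean

for name, dist in [("log-gaussian", lognorm), ("exponential", expon)]:
    sel = top_decile_mean(dist) / dist.mean()
    print(f"{name}: top-decile mean / overall mean = {sel:.2f}")
```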

Figure 7:

On the left, from top to bottom, we show five sample images, five principal components of the stimuli, the average stimuli for each of five classes, and the weights learned by the five neurons of the population. On the top right, the final weight matrix of lateral inhibitory synapses is shown. The bottom right contains the resulting correlation matrix between the five neurons' activations. Evidently the population discovers mostly uncorrelated representations of the classes. (MNIST patches courtesy of LeCun, Cortes, & Burges, N.d.)

In both cases discussed, convergence to the final solution is quick, with components converging one by one in the sequence determined by the lateral inhibition structure, and the discovered representation remains stable over the remainder of the simulation. As the learned components reflect the structure present in the stimuli, the population's representation of its input is thus invariant to (orthogonal) transformations in the 784-dimensional input space, as any such transformation of the input space is counteracted by an according adjustment of synaptic weights and intrinsic excitability.

4  Discussion

Our numerical results confirm the theoretical conclusion that a population of adaptive neurons, implementing only local mechanisms of intrinsic and synaptic plasticity, can realize homeostatic self-regulation that renders it invariant with respect to affine-linear/orthogonal transformations of its multivariate input. This is here achieved by the recovery of independent or principal components, but could be similarly realized by finding any other arbitrary but uniquely defined representation of the population's input. For clustered inputs, the activation distribution could be chosen in accordance with a heuristic outlined by Triesch (2007), such as to achieve a certain specificity in the neurons' response to specific clusters and thus enforce a sparse code.

The generality of the model allows the implementation and combination of various other forms of invariance, even where gradient-based methods might become prohibitively complex. For example, the stochastic membrane potential process with its parametric stationary distribution can be chosen freely, so that its sufficient statistics resemble the properties we wish the neuron to become invariant to, such as higher-order moments.

Nonlinear forms of Hebbian plasticity as discussed by Brito and Gerstner (2016) could be used to learn sparse input representations other than independent or principal components. Additionally, a concave nonlinear dependency on the postsynaptic activation could act to further stabilize Hebbian learning by slowing synaptic growth at high firing rates, thus allowing for much slower intrinsic adaptation to compensate.

By including interaction delays in the synapse model, synaptic plasticity could be used to influence not just the instantaneous correlation structure between neurons but the full auto- and cross-correlation structure in time. By choosing different time constants for different adaptation variables, multiple timescales of adaptation can be modeled, a phenomenon observed in vivo (Fairhall et al., 2000; Turrigiano, 2008). Here, only a combination of slow Hebbian plasticity with fast intrinsic plasticity, as could be realized by rapidly acting homeostatic mechanisms such as spike rate adaptation or the accumulation or depletion of neurotransmitters, is discussed. However, biological evidence suggests that in particular, slow intrinsic adaptation (Turrigiano, 2008) and fast-changing weights (Zucker & Regehr, 2002) should be studied. Instead of the artificially constrained topology of lateral connectivity, which is used here to demonstrate the recovery of (only) dominant components, random sparse connectivity could be used in a sufficiently large network to find a similar, overcomplete, and invariant representation of the population's input.

Despite the generality of our results, we make several assumptions that should be made explicit. First, intrinsic plasticity here operates on membrane potentials only. While rate-based or spike-triggered adaptation could be included, we refrain from doing so due to the complexity of the resulting model. Second, the neuron model is rate based; spiking effects could be included only via point or renewal processes with time-varying rate functions. This constrains the scope of spike timing effects and thus spike timing–dependent plasticity mechanisms that can be incorporated into the model. Third, intrinsic adaptation effects are not modeled as part of the membrane potential itself but rather as separate adaptation variables, similar to, for example, the adaptive exponential model (Gerstner & Brette, 2009), and can thus be observed only by their effect on the neurons' output behavior. Fourth, for the derivation of adaptation dynamics, the sufficient statistics that the neurons become invariant to are assumed to be twice differentiable for mathematical convenience, despite this not being a necessary condition. Fifth, the restriction of membrane potentials to (locally) stationary diffusion processes, albeit still very general, may be overly restrictive for some applications where synaptic inputs exhibit more structure, such as neurons driven by a low number of incoming spikes. In such cases, a different class of processes should be used. Finally, the model is stated here in the limiting case where changes to the parameters of the neurons' inputs occur so slowly or rarely that the resulting membrane potentials can be assumed to be locally or piecewise stationary. When adaptation variables and input statistics evolve on similar timescales, this assumption is violated; the separation between input processes and their (slowly changing) parameters breaks down, and some of the arguments presented here can no longer be directly applied. In a machine learning context, this problem is commonly referred to as concept drift (Tsymbal, 2004), and adaptive networks might well be capable of coping with it. In this regime of very fast adaptation, intrinsic plasticity might no longer stabilize neural activations but could instead endow neurons with a local estimate of membrane potential statistics, potentially increasing their dynamic range (Fairhall et al., 2000; Fairhall, Lewen, Bialek, & de Ruyter Van Steveninck, 2001). This remains to be studied in future work.

In ongoing work, we explore whether a simple choice of intrinsic plasticity, resulting in scale invariance, can be used to stabilize activity in sensory neuron populations, deep feedforward and recurrently connected networks, thus improving information transmission as suggested by Turrigiano and Nelson (2004). This effect could potentially complement or replace specifically crafted inherently stable (deep) feedforward networks (see Klambauer, Unterthiner, Mayr, & Hochreiter, 2017, for example) by addressing the problem of vanishing gradients, a pressing issue in the field of deep learning.

Conversely, despite the hypothesized stabilizing effect of intrinsic plasticity in feedforward networks, pathological recurrent connectivity could potentially lead intrinsic plasticity to actively destabilize a network. Such pathological network behavior needs to be studied to discover limitations and possible undesired side effects of plasticity and could potentially provide some theoretical insight into neurological conditions such as epilepsy.

Finally, in order to establish the biological compatibility of the abstract mathematical model presented here, the biophysical mechanisms underlying plasticity need to be tied to adaptation variables, and a reasonable class of membrane potential processes and instantaneous firing rate distributions needs to be determined from biological evidence to guide modeling choices.

Appendix A:  Mapping Stochastic Processes

A.1  Preliminaries

 

Lemma 1
(Itô's lemma, theorem 4.2.1. of Øksendal, 2003). For a function and an -dimensional stochastic Itô process , the transformation is a -dimensional Itô process that can be expressed as
formula

For linear transformations , we see that stochastic differentials preserve linearity. A special case of this lemma is:

Corollary 1.
For a one-dimensional process and a time-invariant function , Itô's lemma yields
formula
A.1
formula
A.2
which is an Itô process as well.
Lemma 2
(stationary distribution, theorem 4.68 of Capasso & Bakstein, 2015, and section 3.2 of Stepanov, 2013). Let be a stationary diffusion process of the form
formula
with sufficiently regular functions and .3 Then the density of the stationary distribution is (up to a scaling factor) given by
formula

Using corollary 1 and lemma 2, the properties of a specific process, the Ornstein-Uhlenbeck process, can be derived.

Lemma 3
(Ornstein-Uhlenbeck process). Let be a one-dimensional Itô process of the form
formula
where . We call an Ornstein-Uhlenbeck process (Stepanov, 2013, section 2.6). For an initial condition , its solution is given by
formula
For constant , , it is a stationary process with stationary gaussian distribution with mean and autocovariance .
Proof.
Let be as defined. We apply Itô's lemma to define with and simplify:
formula
Rewriting the stochastic differential equation above as stochastic integrals gives
formula
For a proof of stationarity and the full autocorrelation function, see, for example, Stepanov (2013, section 2.9).
Using lemma 2, it follows that the stationary distribution of has the gaussian density
formula
A.3
formula
A.4
formula
A.5
formula
A.6

When the above process is driven not by white noise but an integrable time-varying function (i.e., the solution of another stochastic process), a solution of can be equivalently derived where integration is not performed with respect to the differential but instead with respect to .
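
The following Euler-Maruyama sketch simulates the Ornstein-Uhlenbeck process of lemma 3 and compares its empirical stationary mean and variance with the closed-form values (for this parameterization, variance sigma^2 / (2 theta)); the parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, mu, sigma = 5.0, 1.0, 0.8        # assumed rate, mean, and noise amplitude
dt, T = 1e-3, 500.0
n = int(T / dt)

x = np.empty(n)
x[0] = mu
for t in range(n - 1):
    # Euler-Maruyama step for dX = theta*(mu - X)*dt + sigma*dW
    x[t + 1] = x[t] + theta * (mu - x[t]) * dt + sigma * np.sqrt(dt) * rng.standard_normal()

burn = n // 10                           # discard the initial transient
print("empirical mean/var:  ", x[burn:].mean(), x[burn:].var())
print("theoretical mean/var:", mu, sigma**2 / (2 * theta))
```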

A.2  Deriving a Nonlinear Activation Function

 

Lemma 4.
Let denote a stationary Itô diffusion process and let denote the density of the corresponding stable distribution . Choose a monotonically increasing function . Then is again a stationary Itô process with a stable distribution that has the density
formula
where we use to denote the derivative of with respect to .
Proof.
Given as above and a diffusion process of the form
formula
Itô's lemma implies for that
formula
A.7
formula
A.8
formula
A.9
formula
A.10
formula
A.11
Using lemma 2 on the stationary processes and allows us to derive the relationship between the two stationary densities and :
formula
A.12
formula
A.13
formula
A.14
formula
A.15
formula
A.16
Corollary 2.

Let denote the cumulative distribution function (CDF) of , and let denote the CDF of an arbitrary distribution . Define . If is continuous with density , then the distribution from lemma 4 is equal to with density .

Proof.
For as defined, it follows that
formula
A.17
formula
A.18
formula
A.19
formula
A.20
Then by lemma 4,
formula
This proves that indeed maps a process with stationary distribution onto a process with the arbitrarily chosen stationary distribution .
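
Corollary 2 translates directly into a construction: composing the CDF of the membrane potential's stationary distribution with the inverse CDF of a desired activation distribution yields a monotone map between the two distributions. The sketch below uses a gaussian-to-log-gaussian pairing, mirroring the neurons used in the experiments; the parameter values are illustrative.

```python
import numpy as np
from scipy import stats

membrane = stats.norm(loc=0.0, scale=1.0)     # assumed stationary membrane potential
target = stats.lognorm(s=0.5, scale=1.0)      # desired (log-gaussian) activation

def activation(u):
    """g = F_target^{-1} composed with F_membrane: a monotone map between the two."""
    return target.ppf(membrane.cdf(u))

u = membrane.rvs(size=100_000, random_state=0)
a = activation(u)
print("target vs empirical mean:", target.mean(), a.mean())
print("target vs empirical std: ", target.std(), a.std())
```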

A.3  Autocorrelation of Filtered OU Process

Assuming that the input into the leaky integrating membrane potential of the simple neuron model with and is not Brownian motion but instead another (OU) process (see lemma 3) with autocorrelation function
formula
then the resulting autocorrelation function for with is determined by the stochastic input and can be derived as follows (we assume , ):
formula
A.21
formula
A.22
formula
A.23
formula
A.24
formula
A.25
formula
A.26
formula
A.27
By considering the limit where both and are far from 0 and and the influence of the initial condition vanishes, this can be simplified to a stationary autocovariance function:
formula
A.28
formula
A.29
A direct result from equation A.29 is that the variance of the membrane potential then is given by .
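
As a numerical counterpart to this derivation, the sketch below drives a leaky integrator with a simulated OU input and estimates the stationary autocovariance of the resulting membrane potential at a few lags; all rates are illustrative assumptions rather than the values used in the experiments.

```python
import numpy as np

rng = np.random.default_rng(1)
dt, T = 1e-3, 500.0
n = int(T / dt)
theta_in, sigma_in = 2.0, 1.0      # OU input: rate and noise amplitude (assumed)
theta_m = 20.0                     # membrane leak rate (fast membrane, slow input)

s = np.zeros(n)                    # OU input process
u = np.zeros(n)                    # membrane potential (leaky integrator)
for t in range(n - 1):
    s[t + 1] = s[t] - theta_in * s[t] * dt + sigma_in * np.sqrt(dt) * rng.standard_normal()
    u[t + 1] = u[t] + theta_m * (s[t] - u[t]) * dt

burn = n // 10
u_c = u[burn:] - u[burn:].mean()
for lag in [0.0, 0.1, 0.5, 1.0]:
    k = int(lag / dt)
    cov = np.mean(u_c[: len(u_c) - k] * u_c[k:]) if k else np.mean(u_c * u_c)
    print(f"lag {lag:4.1f} s: empirical autocovariance ~ {cov:.4f}")
```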

A.4  Estimating Sufficient Statistics

By Itô's lemma, transforming the neuron's membrane potential process through the nonlinear sufficient statistics of its membrane potential distribution (assuming they satisfy the requirements) yields new Itô processes that can subsequently be filtered with a causal exponential filter. The resulting exponentially weighted running average approximates the expected values of the sufficient statistics. Sudden changes in the parameters of the membrane potential distribution take effect on the membrane potential only with a certain delay, owing to the continuity of the membrane potential; filtering a transformation introduces further delay. In the following, the dynamics of the adaptation variables are derived analytically for the special case of membrane potentials modeled by the Ornstein-Uhlenbeck process defined in lemma 3. Let be an Ornstein-Uhlenbeck process with time-varying mean and standard deviation and . We consider an adaptation variable . For and for , we can use the deterministic part of the solution as defined in lemma 3 to derive the expectation
formula
A.30
formula
A.31
formula
A.32
formula
A.33
formula
A.34
formula
A.35
The causal filter here is a valid probability density with mean . We thus see that the adaptation variable approximates the membrane potential's true mean value, lagging behind with a delay on the scale of . A similar derivation can be carried out for the second adaptation variable, . The larger the two time constants and are chosen, the more smoothly, but also the more slowly, the resulting adaptation variables approximate the true sufficient statistics. The noise in the adaptation variable can be estimated by considering its variance, which, calculated as in section A.3, is proportional to and approaches 0 for a sufficiently slow adaptation time constant . We conclude that the proposed mechanism of estimating sufficient statistics via filtering is valid and asymptotically correct, and that it follows the dynamics derived here. See Figure 2 to verify the match between derived and simulated adaptation parameter dynamics.
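
A minimal simulation of these adaptation-variable dynamics: exponentially weighted running averages of the membrane potential and of its square track a sudden shift of the input mean with a lag set by the filter time constant. The step scenario and all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
dt, T = 1e-3, 200.0
n = int(T / dt)
theta, sigma = 5.0, 0.5            # OU membrane potential parameters (assumed)
tau_a = 10.0                       # adaptation time constant of the filters (assumed)

u = 0.0
m1, m2 = 0.0, 1.0                  # adaptation variables: first and second moment
trace = np.empty((n, 2))
for t in range(n):
    mu = 0.0 if t * dt < T / 2 else 2.0       # sudden shift of the input mean
    u += theta * (mu - u) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    m1 += (u - m1) * dt / tau_a               # exponential filter of u
    m2 += (u**2 - m2) * dt / tau_a            # exponential filter of u^2
    trace[t] = (m1, m2 - m1**2)

print("estimated mean/var just before the shift:", trace[n // 2 - 1])
print("estimated mean/var at the end:           ", trace[-1])
print("true stationary variance:", sigma**2 / (2 * theta))
```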

A.5  Approximation Quality of Adaptive Neurons

To measure the quality of an adaptive neuron's temporary estimate of the current stationary membrane potential distribution , the Kullback-Leibler divergence between the two can be used. This measure is very informative, since it simultaneously represents the mismatch between the neuron's produced and desired activation distributions:

Lemma 5.
Let denote the stationary probability density of an adaptive neuron's membrane potential with parameters , and let denote the neuron's estimate of the same density, instead parameterized by the neuron's adaptation variables . Further, let denote the neuron's desired stationary distribution of activation and the distribution achieved by the adaptive neuron. Then the neuron's activation distribution approaches the desired distribution if and only if its estimated membrane potential distribution approaches the true membrane potential distribution and
formula
Proof.
Let be the monotonic, continuous activation function of the neuron with support that maps and ; then:
formula
A.36
formula
A.37
formula
A.38
formula
A.39

How well the desired distribution of activation can be achieved thus depends crucially on how well the neuron's adaptation parameters can approximate the sufficient statistics of the membrane potential distribution. Since the divergence is identical on the membrane potential side and the activation side of the neuron, we simply refer to the divergence between neurons.

Using lemma 5, the divergence between two gaussian neurons can be calculated based on the membrane potential distribution's sufficient statistics and the corresponding approximation by the adaptation variables,
formula
The Kullback-Leibler divergence is then defined as follows:
formula
A.40
formula
A.41
formula
A.42
formula
A.43
formula
A.44
We see that this converges to 0 for and .
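
For concreteness, the gaussian-gaussian divergence can be evaluated with the standard closed form, here cross-checked against a Monte Carlo estimate; the true and estimated parameter values are illustrative.

```python
import numpy as np
from scipy import stats

mu_p, sd_p = 0.0, 1.0              # true membrane potential distribution
mu_q, sd_q = 0.3, 1.2              # neuron's estimate from its adaptation variables

# KL(P||Q) for two univariate gaussians
kl_closed = (np.log(sd_q / sd_p)
             + (sd_p**2 + (mu_p - mu_q)**2) / (2 * sd_q**2) - 0.5)

x = stats.norm(mu_p, sd_p).rvs(size=1_000_000, random_state=0)
kl_mc = np.mean(stats.norm(mu_p, sd_p).logpdf(x) - stats.norm(mu_q, sd_q).logpdf(x))

print(f"closed form: {kl_closed:.4f}, Monte Carlo: {kl_mc:.4f}")
```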

A.6  Effects of the Choice of Activation Distribution

The choice of membrane potential distribution and activation distribution together determines the resulting activation function. While the computational role of the neuron is often explained based on the nonlinearity used, the distributions allow a different perspective. Consider first a discrete, bimodal probability distribution of activations, where the activation is either at a high, fixed value with probability or at 0 with probability . The expected activation or firing rate then is , and the neuron switches between periods of high firing rates (bursts) and periods of low firing rates.

Due to the continuous nature of the membrane potential process, the periods of bursting can be expected to be dense, measurable intervals, which allows us to understand the neuron as a bursting neuron. Depending on the time constants of the membrane potential dynamics, these intervals could represent longer or shorter bursts of spikes. For sufficiently high rates and low with a constant mean firing rate , the timing of spikes can be made precise, in particular if renewal processes with refractoriness are used to sample spikes, limiting the number of spikes per interval. The shape of the distribution of firing rates thus makes it possible to influence the temporal precision of spiking and controls the amount of noise introduced into spike trains sampled by an inhomogeneous Poisson process. For this example, the corresponding cumulative distribution function, and thus the nonlinearity, are step functions, and the neuron switches between stochastic bursting and silence. For continuous activation distributions, similar observations can be made. Highly kurtotic distributions, where the mean is much larger than the median, are dominated by rare yet disproportionately high firing rates and can thus best be viewed as encoding the rare events that elicit high membrane potentials (coincidences). Depending on the spike sampling mechanism used, different distributions might yield optimal information transmission through spiking output.

Appendix B:  Copulas and Joint Distributions

B.1  Preliminaries

 

Lemma 6
(Copulas, Sklar's theorem; see Embrechts et al., 2001). For an -variate continuous random variable with joint cumulative distribution function and invertible marginal cumulative distribution functions , the copula function can be defined as follows:
formula
B.1
formula
B.2
formula
B.3
formula
B.4

Except for the information about the marginals, the copula thus contains all information about the joint distribution.

Corollary 3
(copulas and densities). Using lemma 6 and the chain rule, a continuous -dimensional probability distribution with joint density , marginal densities , and marginal continuous distribution functions can be decomposed into the product
formula
B.5
where is the copula density.
Lemma 7

(marginal invariance of copulas). Let be an -variate continuous random variable with copula and let be monotone functions. Then and have the same copula .

Proof.
Let , and be the copula, joint, and marginal CDFs of , respectively, and let , , and be the copula, joint, and marginal CDFs of , respectively. Then and applying lemma 6 twice yields
formula
B.6
formula
B.7
formula
B.8
formula
B.9
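
Lemma 7 can be illustrated numerically: applying strictly monotone functions to the margins of a bivariate sample changes the marginal distributions but leaves the copula, and with it rank-based dependence measures such as Spearman's rho, unchanged. The correlation value and the transforms below are illustrative choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
rho = 0.6
cov = np.array([[1.0, rho], [rho, 1.0]])
x = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=200_000)

y = np.column_stack([np.exp(x[:, 0]),           # monotone transform of the first margin
                     stats.norm.cdf(x[:, 1])])  # monotone transform of the second margin

print("Spearman rho of X:   ", round(stats.spearmanr(x[:, 0], x[:, 1])[0], 3))
print("Spearman rho of g(X):", round(stats.spearmanr(y[:, 0], y[:, 1])[0], 3))
```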

B.2  Dependency of the Copula of the Joint Distribution of Activation on the Covariance of Gaussian Inputs

For a population of neurons driven by jointly gaussian input, the joint membrane potential distribution is determined by the stochastic input, particularly the covariance matrix of its stationary distribution. The variance of each neuron 's membrane potential is equal to the variance of the signal driving the neuron, scaled down by a constant that depends on the dynamics of the input and the membrane potential process (see section A.3 for a derivation). Due to the invariance of the copula to marginal transformations, as guaranteed by lemma 7, the stationary distribution of the neurons' joint activation thus has the same copula as the joint membrane potential distribution and the joint input distribution. The stationary, multivariate gaussian input process with covariance matrix has a gaussian copula (Embrechts et al., 2001) of the following form, where denotes the standard gaussian CDF:
formula
B.10
Given , which depends only on the covariance matrix of the stationary joint distribution of the signals driving a population of neurons, and the marginal probability densities of the neurons' activation, the joint distribution of activation in the population is thus fully specified.

B.3  Inducing Structure through Input Weights

In a population where neurons receive input signals via weighted synaptic connections from different source signals, which are jointly gaussian processes with stationary covariance matrix of rank , the signals driving the neurons are also jointly gaussian with stationary covariance matrix . Decomposing the (positive definite) matrix (e.g., using the Cholesky decomposition), the weight matrix could be chosen as (if more input signals than neurons are available, a pseudoinverse could be used here). The resulting covariance of the signals driving the population is then given by
formula
B.11
that is, the neurons are completely decorrelated. For a desired covariance matrix , the same decomposition into can be used to define a weight matrix , which results in the stationary covariance matrix:
formula
B.12
By choosing the synaptic connection weights accordingly, arbitrary covariance structures can thus be induced in the input signals driving the neurons, assuming that at least as many uncorrelated input signals are available as there are neurons in the population. Using the results from section B.2, this implies that for gaussian inputs, any activation distribution with a gaussian copula can be realized by an appropriate choice of marginal distributions and synaptic weights.
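
The weight construction sketched below makes this concrete: the Cholesky factor of the source covariance yields decorrelating weights, and composing it with the Cholesky factor of a desired covariance induces that covariance in the signals driving the neurons. The covariance matrices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
C = np.array([[2.0, 0.8],
              [0.8, 1.0]])                     # covariance of the (gaussian) sources
L = np.linalg.cholesky(C)                      # C = L L^T

W_white = np.linalg.inv(L)                     # decorrelating weights: W C W^T = I
C_target = np.array([[1.0, -0.5],
                     [-0.5, 1.0]])             # desired covariance of the driving signals
W_target = np.linalg.cholesky(C_target) @ W_white

S = rng.multivariate_normal(mean=[0.0, 0.0], cov=C, size=500_000)
print("decorrelated:\n", np.cov((S @ W_white.T).T).round(2))
print("induced:\n", np.cov((S @ W_target.T).T).round(2))
```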

The same reasoning can be applied across time, rather than neurons, when considering the autocovariance of a single neuron instead of the covariance between different neurons. If the autocovariance structure of a gaussian membrane potential process across different points in time is modeled by a multivariate gaussian distribution, it can again be decomposed into a (gaussian) copula and marginal distributions. The copula is preserved under the nonlinear transformation through the neuron's activation function and thus captures the autocovariance structure of its activation process as well. By adding multiple weighted synaptic connections with different delays for a single source signal, the autocovariance can be controlled in the same way as the covariance is controlled above.

Combining these two approaches, a set of weighted, delayed synaptic connections can be used to induce cross-covariance structures in the neurons' inputs and thus membrane potentials and activations.

Appendix C:  PCA and ICA in Single Neurons

C.1  Sensitivity of Nonlinear Neurons to Higher Moments

Let be a random variable with mean , and let be a monotonically increasing function that is infinitely differentiable at . We consider the expected value of the output random variable . Then by using the Taylor-series expansion of around , we can see that the expected value depends on the centered moments of :
formula
C.1
formula
C.2
formula
C.3
formula
C.4

For a linear function, all higher-order coefficients vanish, and the expected value of the output depends only on the zeroth- and first-order moments. For polynomials of order , all moments up to order influence the expected output of the function. For a general exponential function with parameters and , all coefficients are positive and form a decreasing sequence that converges to 0. In this case, the expected value of the output is a linear combination of all centered moments of with rapidly decreasing weights. For a gaussian random variable , where the mean and all moments beyond the second are fixed at zero, the expected value of the output is thus dominated by the second moment, the variance, whereas for a zero-mean, unit-variance Laplacian distribution, the third and higher moments influence the expected value of . The relative dependence of a function's expected output on the moments of is determined by the coefficients of its Taylor expansion (if it exists).
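
A small numerical check of this observation: for an exponential nonlinearity, two input distributions with identical mean and variance but different higher moments (gaussian versus Laplacian) yield different expected outputs. The gain of the nonlinearity is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(5)
a = 1.0                                                  # assumed gain of g(x) = exp(a * x)
n = 2_000_000

x_gauss = rng.normal(0.0, 1.0, size=n)                   # zero mean, unit variance
x_laplace = rng.laplace(0.0, 1.0 / np.sqrt(2), size=n)   # same mean and variance

print("E[exp(a X)], gaussian: ", np.exp(a * x_gauss).mean())    # ~ exp(a^2 / 2) ~ 1.65
print("E[exp(a X)], Laplacian:", np.exp(a * x_laplace).mean())  # larger, heavier tails
```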

C.2  Fixed-Point Analysis of the PCA and ICA Neurons

A neuron that receives input from a multivariate stochastic process with stationary covariance matrix through synaptic connections with weight vector has a membrane potential process with standard deviation , where the proportionality constant depends on the dynamics of the processes and (see also section A.3). We assume here that the membrane potential exhibits very fast adaptation dynamics and closely follows the much slower input process (we choose time constants ), such that . The activation of a neuron is given by , where for a nonadaptive neuron, is constant, and we just write . An adaptive neuron that compensates only for the changes in the scale of its membrane potential such as the gaussian or Laplacian models used here (the mean is fixed in these experiments) can thus be modeled by . Using equation 2.15, the vector field of expected weight changes can be calculated for an adaptive neuron:
formula
C.5
formula
C.6
Similarly, a nonadaptive neuron yields
formula
C.7

Since the stationary joint distribution of is explicitly given in both experiments (a multivariate gaussian or a product distribution of two independent Laplace distributions, rotated by ), these expectations can be analytically calculated for the two neuron types (adaptive and nonadaptive) in both conditions (gaussian and Laplacian inputs). Fixed points are located where the expected weight changes vanish. The bottom panels of Figures 3 and 4 show the calculated vector fields for both neuron types in both experimental setups.
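
The sketch below illustrates the flavor of this analysis with a simplified Hebbian drive E[x g(w.x)] in place of the exact update of equation 2.15: for unit-norm weight vectors, directions where the tangential component of the expected drive vanishes are candidate fixed points. The rotation angle, the exponential gain, and the adaptive rescaling are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500_000
rot = np.deg2rad(30.0)
R = np.array([[np.cos(rot), -np.sin(rot)], [np.sin(rot), np.cos(rot)]])
X = rng.laplace(size=(n, 2)) @ R.T               # rotated independent Laplacian sources

def tangential_drive(phi, adaptive):
    w = np.array([np.cos(phi), np.sin(phi)])     # unit-norm weight vector
    u = X @ w                                    # membrane potential (fast-membrane limit)
    if adaptive:
        u = u / u.std()                          # intrinsic plasticity: unit variance
    g = np.exp(u - 2.0)                          # exponential activation (assumed gain)
    drive = (X * g[:, None]).mean(axis=0)        # Monte Carlo estimate of E[x g(w.x)]
    tangent = np.array([-w[1], w[0]])
    return drive @ tangent                       # component orthogonal to w

for phi in np.linspace(0.0, np.pi, 7):
    print(f"phi = {np.degrees(phi):6.1f} deg: "
          f"adaptive {tangential_drive(phi, True):+.3f}, "
          f"nonadaptive {tangential_drive(phi, False):+.3f}")
```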

Notes

1

If an individual neuron, depending on its initial weight vector, were to select a nondominant component, this would be considered a flaw of the algorithm, since in that case it could not be ensured that the most relevant components of the input are discovered.

2

Assuming that cluster centers are linearly independent and equally frequent.

3

The conditions are quite general yet very technical and can be found in Capasso and Bakstein (2015, theorem 4.56).

Acknowledgments

We thank our anonymous referees for their constructive feedback, which greatly helped to improve this article.

References

Abbott, L. F., Varela, J. A., Sen, K., & Nelson, S. B. (1997). Synaptic depression and cortical gain control. Science, 275(5297), 221–224. doi:10.1126/science.275.5297.221
Ashby, W. R. (1954). Design for a brain. New York: Wiley. http://archive.org/details/designforbrain00ashb
Bell, A. J., & Sejnowski, T. J. (1995). An information-maximization approach to blind separation and blind deconvolution. Neural Computation, 7(6), 1129–1159.
Bibby, B. M., Skovgaard, I. M., & Sørensen, M. (2005). Diffusion-type models with given marginal distribution and autocorrelation function. Bernoulli, 11(2), 191–220. doi:10.3150/bj/1116340291
Bienenstock, E. L., Cooper, L. N., & Munro, P. W. (1982). Theory for the development of neuron selectivity: Orientation specificity and binocular interaction in visual cortex. Journal of Neuroscience, 2(1), 32–48.
Brito, C. S. N., & Gerstner, W. (2016). Nonlinear Hebbian learning as a unifying principle in receptive field formation. PLOS Computational Biology, 12(9), e1005070. doi:10.1371/journal.pcbi.1005070
Brown, T. H., Kairiss, E. W., & Keenan, C. L. (1990). Hebbian synapses: Biophysical mechanisms and algorithms. Annual Review of Neuroscience, 13, 475–511. doi:10.1146/annurev.ne.13.030190.002355
Buesing, L., & Maass, W. (2010). A spiking neuron as information bottleneck. Neural Computation, 22(8), 1961–1992. doi:10.1162/neco.2010.08-09-1084
Burrone, J., & Murthy, V. N. (2003). Synaptic gain control and homeostasis. Current Opinion in Neurobiology, 13(5), 560–567. doi:10.1016/j.conb.2003.09.007
Capasso, V., & Bakstein, D. (2015). An introduction to continuous-time stochastic processes. New York: Springer. doi:10.1007/978-1-4939-2757-9
Embrechts, P., Lindskog, F., & McNeil, A. (2001). Modelling dependence with copulas and applications to risk management. Amsterdam: Elsevier.
Fairhall, A. L., Lewen, G. D., Bialek, W., & de Ruyter Van Steveninck, R. R. (2000). Multiple timescales of adaptation in a neural code. In Proceedings of the 13th International Conference on Neural Information Processing Systems (pp. 115–121). Cambridge, MA: MIT Press. http://dl.acm.org/citation.cfm?id=3008751.3008769
Fairhall, A. L., Lewen, G. D., Bialek, W., & de Ruyter Van Steveninck, R. R. (2001). Efficiency and ambiguity in an adaptive neural code. Nature, 412(6849), 787–792. doi:10.1038/35090500
Földiák, P. (1990). Forming sparse representations by local anti-Hebbian learning. Biological Cybernetics, 64(2), 165–170. doi:10.1007/BF02331346
Genest, C., & Favre, A.-C. (2007). Everything you always wanted to know about copula modeling but were afraid to ask. Journal of Hydrologic Engineering, 12(4), 347–368. doi:10.1061/(ASCE)1084-0699(2007)12:4(347)
Gerstner, W., & Brette, R. (2009). Adaptive exponential integrate-and-fire model. Scholarpedia, 4(6), 8427. doi:10.4249/scholarpedia.8427
Hromádka, T., DeWeese, M. R., & Zador, A. M. (2008). Sparse representation of sounds in the unanesthetized auditory cortex. PLOS Biology, 6(1), e16. doi:10.1371/journal.pbio.0060016
Hyvärinen, A., & Oja, E. (1998). Independent component analysis by general nonlinear Hebbian-like learning rules. Signal Processing, 64(3), 301–313. doi:10.1016/S0165-1684(97)00197-7
Hyvärinen, A., & Oja, E. (2000). Independent component analysis: Algorithms and applications. Neural Networks, 13(4), 411–430. doi:10.1016/S0893-6080(00)00026-5
Isomura, T., Kotani, K., & Jimbo, Y. (2015). Cultured cortical neurons can perform blind source separation according to the free-energy principle. PLOS Computational Biology, 11(12), e1004643. doi:10.1371/journal.pcbi.1004643
Izhikevich, E. M. (2003). Simple model of spiking neurons. IEEE Transactions on Neural Networks, 14(6), 1569–1572. doi:10.1109/TNN.2003.820440
Jolivet, R., Lewis, T. J., & Gerstner, W. (2004). Generalized integrate-and-fire models of neuronal activity approximate spike trains of a detailed model to a high degree of accuracy. Journal of Neurophysiology, 92(2), 959–976. doi:10.1152/jn.00190.2004
Keziou, A., Fenniri, H., Messou, K., & Moreau, E. (2013). Blind source separation of independent/dependent signals using a measure on copulas. In Proceedings of the 21st European Signal Processing Conference (pp. 1–5). Piscataway, NJ: IEEE.
Klambauer, G., Unterthiner, T., Mayr, A., & Hochreiter, S. (2017). Self-normalizing neural networks. arXiv:1706.02515 [cs, stat].
LeCun, Y., Cortes, C., & Burges, C. (N.d.). MNIST handwritten digit database. http://yann.lecun.com/exdb/mnist/
Ma, J., & Sun, Z. (2011). Mutual information is copula entropy. Tsinghua Science and Technology, 16(1), 51–54. doi:10.1016/S1007-0214(11)70008-6
Oja, E. (1982). Simplified neuron model as a principal component analyzer. Journal of Mathematical Biology, 15(3), 267–273. doi:10.1007/BF00275687
Øksendal, B. (2003). Stochastic differential equations. In S. Axler, V. Capasso, C. Casacuberta, A. MacIntyre, K. Ribet, C. Sabbah, … W. Woyczyński (Eds.), Stochastic differential equations (pp. 65–84). Berlin: Springer. doi:10.1007/978-3-642-14394-6_5
Ostojic, S., & Brunel, N. (2011). From spiking neuron models to linear-nonlinear models. PLOS Computational Biology, 7(1), e1001056. doi:10.1371/journal.pcbi.1001056
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., … Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830. http://dl.acm.org/citation.cfm?id=1953048.2078195
Ricciardi, L. M., & Lánský, P. (2006). Diffusion models and neural activity. In L. Nadel (Ed.), Encyclopedia of Cognitive Science. Hoboken, NJ: Wiley. doi:10.1002/0470018860.s00365
Savin, C., Joshi, P., & Triesch, J. (2010). Independent component analysis in spiking neurons. PLOS Computational Biology, 6(4), e1000757. doi:10.1371/journal.pcbi.1000757
Shinomoto, S., Sakai, Y., & Funahashi, S. (1999). The Ornstein-Uhlenbeck process does not reproduce spiking statistics of neurons in prefrontal cortex. Neural Computation, 11(4), 935–951. doi:10.1162/089976699300016511
Stemmler, M., & Koch, C. (1999). How voltage-dependent conductances can adapt to maximize the information encoded by neuronal firing rate. Nature Neuroscience, 2(6), 521–527. doi:10.1038/9173
Stepanov, S. S. (2013). Stochastic world. Heidelberg: Springer. doi:10.1007/978-3-319-00071-8
Sweeney, Y., Kotaleski, J. H., & Hennig, M. H. (2015). A diffusive homeostatic signal maintains neural heterogeneity and responsiveness in cortical networks. PLOS Computational Biology, 11(7), e1004389. doi:10.1371/journal.pcbi.1004389
Toyoizumi, T., Kaneko, M., Stryker, M., & Miller, K. (2014). Modeling the dynamic interaction of Hebbian and homeostatic plasticity. Neuron, 84(2), 497–510. doi:10.1016/j.neuron.2014.09.036
Toyoizumi, T., Pfister, J.-P., Aihara, K., & Gerstner, W. (2005). Generalized Bienenstock-Cooper-Munro rule for spiking neurons that maximizes information transmission. Proceedings of the National Academy of Sciences of the United States of America, 102(14), 5239–5244. doi:10.1073/pnas.0500495102
Triesch, J. (2007). Synergies between intrinsic and synaptic plasticity mechanisms. Neural Computation, 19(4), 885–909. doi:10.1162/neco.2007.19.4.885
Truccolo, W., Eden, U. T., Fellows, M. R., Donoghue, J. P., & Brown, E. N. (2005). A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. Journal of Neurophysiology, 93(2), 1074–1089. doi:10.1152/jn.00697.2004
Tsymbal, A. (2004). The problem of concept drift: Definitions and related work (Tech. Rep.). Dublin: Department of Computer Science, Trinity College Dublin.
Turrigiano, G. G. (2008). The self-tuning neuron: Synaptic scaling of excitatory synapses. Cell, 135(3), 422–435. doi:10.1016/j.cell.2008.10.008
Turrigiano, G. G., & Nelson, S. B. (2004). Homeostatic plasticity in the developing nervous system. Nature Reviews Neuroscience, 5(2), 97–107. doi:10.1038/nrn1327
Winder, S. A., & Impressive Machines LLC. (N.d.). Natural image patch database. http://simonwinder.com/2015/08/natural-image-patch-database/
Zucker, R. S., & Regehr, W. G. (2002). Short-term synaptic plasticity. Annual Review of Physiology, 64(1), 355–405. doi:10.1146/annurev.physiol.64.092501.114547