Abstract

Nonlinear interactions in the dendritic tree play a key role in neural computation. Nevertheless, modeling frameworks aimed at the construction of large-scale, functional spiking neural networks, such as the Neural Engineering Framework, tend to assume a linear superposition of postsynaptic currents. In this letter, we present a series of extensions to the Neural Engineering Framework that facilitate the construction of networks incorporating Dale's principle and nonlinear conductance-based synapses. We apply these extensions to a two-compartment LIF neuron that can be seen as a simple model of passive dendritic computation. We show that it is possible to incorporate neuron models with input-dependent nonlinearities into the Neural Engineering Framework without compromising high-level function and that nonlinear postsynaptic currents can be systematically exploited to compute a wide variety of multivariate, band-limited functions, including the Euclidean norm, controlled shunting, and nonnegative multiplication. By avoiding an additional source of spike noise, the function approximation accuracy of a single layer of two-compartment LIF neurons is on a par with or even surpasses that of two-layer spiking neural networks up to a certain target function bandwidth.

1  Introduction

A central challenge in theoretical neuroscience is to describe how biological mechanisms ultimately give rise to behavior. One way to approach this challenge is to build models of neurobiological systems that generate the behavior of interest to a researcher. Since constructing models that span multiple levels of abstraction is typically difficult, theoretical neuroscientists are working on methods that facilitate mapping high-level behavior onto neural mechanisms. Such modeling frameworks include the Neural Engineering Framework (NEF; Eliasmith & Anderson, 2003; Eliasmith, 2013); efficient, balanced spiking networks (EBN; Boerlin & Denève, 2011; Boerlin, Machens, & Denève, 2013); and FORCE (Sussillo & Abbott, 2009; Nicola & Clopath, 2017). Generally these approaches describe how to translate dynamical systems—corresponding to some hypothesized behavioral model—into an idealized spiking neural network that adheres to the desired neurophysiological constraints, for example, neural tuning, firing rate distributions, and population-level connectivity (Komer & Eliasmith, 2016; Nicola & Clopath, 2017). This mechanistic grounding facilitates model validation by enabling a direct comparison of simulation results and empirical data (Stewart, Bekolay, & Eliasmith, 2012; Bekolay et al., 2014; Duggins, Stewart, Choo, & Eliasmith, 2017; Voelker & Eliasmith, 2018; Gosmann & Eliasmith, 2020).

The frameworks mentioned above primarily rely on two biophysical phenomena as computational primitives: synaptic filtering and the nonlinear relationship between somatic input currents and the neural response. Somatic response models range from leaky integrate-and-fire (LIF) to Hodgkin-Huxley-type dynamics (Schwemmer, Fairhall, Denève, & Shea-Brown, 2015; Eliasmith, Gosmann, & Choo, 2016; Duggins, 2017). Crucially, however, these approaches typically assume that postsynaptic currents are a linear superposition of filtered presynaptic events. Nonlinear interactions between input channels, as they may occur when modeling conductance-based synapses or dendritic structures, are typically ignored.

While some research exists that explores the effect of nonlinear postsynaptic currents within these frameworks (Bobier, Stewart, & Eliasmith, 2014; Thalmeier, Uhlmann, Kappen, & Memmesheimer, 2016; Alemi, Machens, Denève, & Slotine, 2018), these nonlinearities are seldom systematically exploited. Yet empirical and theoretical work suggests that active and passive nonlinear effects within the dendritic tree—and not only the soma—are at least partially responsible for the complex responses observed in some biological neurons, including cortical pyramidal cells (Mel, 1994; Koch, 1999; Polsky, Mel, & Schiller, 2004). London and Häusser (2005) argue that in addition to voltage-gated ionic currents, fundamental passive effects such as shunting inhibition are worth being investigated as computational resources.

Put differently, current functional modeling frameworks consider only a subset of the computational resources available in individual neurons and thus underestimate their computational power. Modelers wishing to multiply two signals might, for example, be forced to introduce an additional layer of neurons, although—in biology—the interplay between excitation and inhibition within the dendrites of a single neuron layer could have the same effect (Koch, 1999). The goal of this letter is to present mathematically tractable methods that allow researchers to take nonlinear postsynaptic currents into account. We demonstrate, as demanded by London and Häusser (2005), that the interactions between passive conductance-based input channels within a single dendritic compartment provide significant computational advantages over standard LIF neurons even within a noisy spiking neural network with low firing rates and small neuron counts.

Specifically, we extend the NEF toward systematically exploiting nonlinear postsynaptic currents. The NEF has been applied to various research areas, including low-level modeling of neurobiological systems (Kuo & Eliasmith, 2005; Tripp, 2009; Bobier et al., 2014) and studying large-scale models of cognitive systems grounded in biology (Eliasmith et al., 2012, 2016; Eliasmith, 2013). A software implementation of the NEF is part of the neural simulation package Nengo (Bekolay et al., 2014) and has been used as a neural compiler targeting analog and digital neuromorphic hardware (Choudhary et al., 2012; Mundy, Knight, Stewart, & Furber, 2015; Berzish, Eliasmith, & Tripp, 2016; Blouw, Choo, Hunsberger, & Eliasmith, 2018; Neckar et al., 2019).

The main contributions of this letter are as follows. First, we present a series of extensions to the NEF that improve its compatibility with more biologically detailed neuron models. We describe how to enforce nonnegative weights in conjunction with Dale's principle and extend the NEF toward nonlinear postsynaptic currents, thereby lifting some long-standing limitations of the NEF. Second, we derive a postsynaptic current model for a two-compartment leaky integrate-and-fire (LIF) neuron that can be interpreted as a simple dendritic nonlinearity. Third, we demonstrate that a single layer of two-compartment LIF neurons can compute a wide variety of functions with an error smaller than or on a par with the accuracy achieved by a comparable two-layer spiking neural network, as long as the target function does not surpass a certain bandwidth.

2  Materials and Methods

We begin with a review of relevant portions of the Neural Engineering Framework (NEF), followed by four novel extensions: decoding in current space, nonnegative weights, equality relaxation for subthreshold currents, and the input-dependent nonlinearity model H. These extensions facilitate incorporating biological detail into NEF models, including nonlinear, dendritic postsynaptic currents. We close with a description of the two-compartment LIF neuron and the derivation of a corresponding nonlinearity H.

2.1  The Neural Engineering Framework

At its core, the NEF describes three principles that govern the construction of models of neurobiological systems (Eliasmith & Anderson, 2003). These principles apply established concepts from artificial neural networks to time-continuous spiking neural networks in a way that facilitates the integration of neurophysiological constraints. The first principle postulates that populations of neurons represent values in a relatively low-dimensional manifold within their high-dimensional activity space via nonlinear encoding and linear decoding. According to the second principle, network connectivity defines transformations as mathematical functions of the represented values. Finally, the third principle states that recurrent connections approximate arbitrary dynamical systems in which the represented values are state variables. Although we will not discuss recurrent networks in this letter, it should be noted that the methods presented here are fully compatible with the third principle and can be used to implement dynamical systems.

2.1.1  Representation

A fundamental assumption of the NEF is that populations of spiking neurons represent $d$-dimensional vectors $\mathbf{x} \in \mathbb{R}^d$ by means of nonlinear encoding and linear decoding. In essence, each neuron population is treated as a single hidden-layer neural network with linear input and output units. Similar schemes have been proposed since the late 1960s, an early example being Marr's model of pattern extraction in the cerebellar granule layer (Marr, 1969).

For the encoding process, each neuron $i$ in a population of size $n$ is assigned a tuning curve $a_i(\mathbf{x})$ that maps any represented value $\mathbf{x}$ onto a corresponding activity (see Figures 1A and 1B). Mathematically,
$$a_i(\mathbf{x}) = G\big[J_i\big(\langle \mathbf{x}, \mathbf{e}_i \rangle\big)\big] = G\big[\alpha_i \langle \mathbf{x}, \mathbf{e}_i \rangle + \beta_i\big],$$
(2.1)
where the encoding vector $\mathbf{e}_i$ projects the input $\mathbf{x}$ onto a scalar $\xi$, which in turn is translated by $J_i(\xi)$ into a somatic current. While the specific current translation function $J_i(\xi)$ depends on modeling assumptions, it is often defined as a first-order polynomial parameterized by a gain $\alpha_i$ and a bias current $\beta_i$. When building models, encoders $\mathbf{e}_i$, biases $\beta_i$, and gains $\alpha_i$ must be selected in such a way that the population exhibits diverse neural tuning, while at the same time staying within modeling constraints such as maximum firing rates for the range of represented $\mathbf{x}$. The nonlinear neuron response $G[J]$ defines the spiking neuron model used in the network. For a wide variety of neuron models, $G$ can be characterized by a rate approximation that maps the somatic input current $J$ onto an average firing rate.

Critically, this mapping does not suggest the adoption of a rate code by the NEF, but rather is a convenience for solving the synaptic weight optimization problem. That is, equation 2.1 is merely normative: it defines the average activity that a modeler expects individual neurons to have when the population represents a certain x. This normative constraint can be dropped entirely and all optimization done in the spiking domain within the NEF, but the computational costs are significantly higher (MacNeil & Eliasmith, 2011; Eliasmith, 2013). In this letter, we focus purely on offline optimization of the synaptic weights. Hence, the methods we present should in no way be interpreted as a theory of learning within nervous systems, but as a method for constructing models of mature, trained neural systems.
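
To make the encoding concrete, the following sketch computes a small set of tuning curves according to equation 2.1, using the standard LIF rate approximation (cf. equation 2.14, here with $\tau_{\mathrm{spike}} = 0$ and normalized units). All parameter values and ranges are illustrative assumptions, not the values used in our experiments:

```python
import numpy as np

def lif_rate(J, tau_ref=2e-3, tau_rc=20e-3, J_th=1.0):
    """LIF rate approximation G[J]; zero below the threshold current."""
    J = np.asarray(J, dtype=float)
    rates = np.zeros_like(J)
    m = J > J_th
    rates[m] = 1.0 / (tau_ref + tau_rc * np.log1p(J_th / (J[m] - J_th)))
    return rates

# Tuning curves a_i(x) = G[alpha_i <x, e_i> + beta_i] for a 1D population.
rng = np.random.default_rng(0)
n = 10
e = rng.choice([-1.0, 1.0], size=n)    # encoders
alpha = rng.uniform(0.5, 2.0, size=n)  # gains
beta = rng.uniform(0.0, 1.5, size=n)   # bias currents
xs = np.linspace(-1, 1, 201)
A = lif_rate(alpha * np.outer(xs, e) + beta)  # activities, shape (201, n)
```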

Figure 1:

Representation and transformation in the NEF. Example for a population of $n = 10$ neurons. (A) Randomly selected linear (affine) current translation functions $J_i(\xi) = \alpha_i \xi + \beta_i$ map the encoded value $\xi = \langle \mathbf{x}, \mathbf{e}_i \rangle$ onto an input current $J_i$. The dotted line corresponds to the spike threshold $J_{\mathrm{th}}$. (B) Tuning curves for each neuron in the population after applying the somatic nonlinearity $G[J]$ (see equation 2.14). (C) Reconstructing the represented value from neural activities by means of linear decoding. (D) Approximating the function $f(x) = 2x^2 - 1$ with a different linear decoding. Dashed lines correspond to the mathematical function being approximated.


Notice that for the purpose of modeling neurobiological systems, all functions and parameters listed above can be hand-tuned by the modeler to match neurophysiological data. For example, as explored in Eliasmith and Anderson (2003), two-dimensional encoding vectors $\mathbf{e}_i$ can be used to reproduce a wide variety of localized tuning curves found in the neuroscience literature. Since NEF models, in contrast to classical multilayer neural networks, are typically not globally optimized, tuning curves are maintained throughout the weight optimization process, and neural activities can be interpreted at any point in the simulation. Global stochastic gradient descent can be applied to parts of the network (Hunsberger & Eliasmith, 2015) at the cost of negatively impacting the interpretability and biologically informed tuning of individual neuron populations.

Complementary to the encoding operation is decoding, which reconstructs an approximation of the represented value from the momentary population activity at time $t$ by multiplication with a decoding matrix $D \in \mathbb{R}^{d \times n}$, that is, $\mathbf{x}(t) \approx D \mathbf{a}(t)$ (see Figure 1C). Finding $D$ can be phrased as a Tikhonov regularized least-squares optimization problem,
$$\min_D \sum_{k=1}^{N} \big\| D \mathbf{a}(\mathbf{x}_k) - \mathbf{x}_k \big\|_2^2 + \lambda N \| D \|_F^2 \;=\; \min_D \| D A - X \|_F^2 + \lambda N \| D \|_F^2,$$
(2.2)
where $\|\cdot\|_2$ is the $L_2$-norm, $\|\cdot\|_F$ is the Frobenius matrix norm, $\mathbf{x}_k$ is one of $N$ training samples, $A \in \mathbb{R}^{n \times N}$ is a matrix of population responses for each sample, $X \in \mathbb{R}^{d \times N}$ is a matrix of training samples, and $\lambda$ is a regularization term accounting for spike variability and other sources of error (Eliasmith & Anderson, 2003).

The decoders $D$ are usually optimized offline under the assumption of the rate model $G[J]$. As formulated, the same $D$ can be used to decode represented values through time in spiking networks when defining the activity $\mathbf{a}(t)$ as low-pass filtered population spike trains. Linear low-pass filters are a common model for postsynaptic currents (Roth & van Rossum, 2009) and are usually employed in spiking NEF networks (Eliasmith & Anderson, 2003).
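
As a brief illustration, the regularized least-squares problem in equation 2.2 has the usual closed-form solution; a minimal NumPy sketch, assuming the activity matrix A and sample matrix X are already available (for example, from the tuning-curve sketch above):

```python
import numpy as np

def solve_decoders(A, X, lam=1e-3):
    """Solve min_D ||DA - X||_F^2 + lam * N * ||D||_F^2 (equation 2.2)."""
    n, N = A.shape
    G = A @ A.T + lam * N * np.eye(n)      # regularized Gram matrix
    return np.linalg.solve(G, A @ X.T).T   # decoders D, shape (d, n)
```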

2.1.2  Transformation

Nonlinear encoding and linear decoding schemes similar to the one described have been analyzed in more detail by researchers in the field of machine learning (Broomhead & Lowe, 1988). Neuron population tuning curves $\mathbf{a}(\mathbf{x})$ span a function space with a set of nonorthogonal basis functions. We can approximate any continuous function over the represented values to a desired degree by linearly combining a finite number of these basis functions (Hornik, Stinchcombe, & White, 1989). Specifically, the linear projection $D$ in equation 2.2 approximates the identity function. By modifying the loss function and substituting $X$ with a matrix $X^f$ of target vectors $f(\mathbf{x}_k)$, we can solve for decoders $D^f$ approximating some function $f$ (see Figure 1D):
$$\min_{D^f} \sum_{k=1}^{N} \frac{1}{2} \big\| D^f \mathbf{a}(\mathbf{x}_k) - f(\mathbf{x}_k) \big\|_2^2 + \lambda N \| D^f \|_F^2 .$$
(2.3)

In order to construct neural networks, we need to find synaptic weight matrices $W \in \mathbb{R}^{m \times n}$ that connect from a prepopulation of $n$ neurons to a postpopulation of $m$ neurons. In particular, we would like to find a $W$ that implicitly decodes $f(\mathbf{x}) = \mathbf{y}$ from the prepopulation and at the same time encodes $\mathbf{y}$ in the postpopulation. If we assume that the current translation function $J_i$ is an intrinsic part of the neuron model or, put differently, each neuron $i$ is assigned its own response curve $G_i[\langle \mathbf{e}_i, \mathbf{x} \rangle] = G\big[J_i\big(\langle \mathbf{e}_i, \mathbf{x} \rangle\big)\big]$, both the encoding and decoding process are linear operators. With this simplification, $W = E D^f$ fulfills the properties listed above, where $E \in \mathbb{R}^{m \times d}$ is a matrix of postpopulation encoding vectors $\mathbf{e}_i$, and $D^f$ is the desired function decoder for the prepopulation (Eliasmith & Anderson, 2003).

Crucially, when combining the input from multiple prepopulations, the postpopulation will always represent a linear combination of potentially nonlinear functions. To see this, consider two prepopulations projecting onto a common postpopulation, where the first projection approximates a function $f_1$ and the second a function $f_2$. Let $i$ be the index of a neuron in the postpopulation. Then it holds that
$$G_i\Big[\big\langle \mathbf{w}_i^{f_1}, \mathbf{a}^{\mathrm{pre}}(\mathbf{x}_1) \big\rangle + \big\langle \mathbf{w}_i^{f_2}, \mathbf{a}^{\mathrm{pre}}(\mathbf{x}_2) \big\rangle\Big] = G_i\Big[\big\langle \mathbf{e}_i,\, D^{f_1} \mathbf{a}^{\mathrm{pre}}(\mathbf{x}_1) + D^{f_2} \mathbf{a}^{\mathrm{pre}}(\mathbf{x}_2) \big\rangle\Big] \approx a_i^{\mathrm{post}}\big( f_1(\mathbf{x}_1) + f_2(\mathbf{x}_2) \big).$$
(2.4)
As a consequence, if we, for example, try to multiply two scalars x and y, these values must be represented as a two-dimensional vector in a common prepopulation. We alleviate this restriction by introducing input-dependent nonlinearities (see Figures 2B and 2C).
Figure 2:

Multivariate computation in the NEF. (A) In case the current translation function $J_i$ is part of the individual neuron response curve, NEF networks are additive: summation in activity space corresponds to addition in representational space. (B) Computing nonlinear multivariate functions $\varphi$ generally requires all variables to be represented in an intermediate population. (C) The dendritic computation scheme discussed in this letter. Two prepopulations project onto a postpopulation with separate excitatory and inhibitory input channels. The nonlinear interaction between these channels is exploited to compute $\varphi$.


2.2  Extending the Neural Engineering Framework

The NEF as described above makes several assumptions that are not biophysically plausible. These include the presence of a bias current $\beta_i$ in each neuron and the fact that excitatory and inhibitory connections are not treated separately. We describe two extensions that lift these assumptions and present an alternative synaptic weight-solving procedure that takes subthreshold currents into account, followed by our proposed method for accounting for input-dependent nonlinearities.

2.2.1  Decoding the Current Translation Function

In section 2.1, we assumed that the current translation function $J_i$ is an intrinsic part of the neuron model. Consequently, each neuron is not only assigned a neuron-specific tuning curve $a_i(\mathbf{x})$ but also a neuron-specific response curve $G_i[J]$. This comes at the cost of biological plausibility, since neurons in general do not possess a strong bias current source.

Tripp and Eliasmith (2007) demonstrate that it is possible to robustly solve for synaptic weight matrices that approximate arbitrary postsynaptic current functions. We can use this insight to directly solve for weights that approximate the current translation function $J_i\big(\langle \mathbf{e}_i, \mathbf{x} \rangle\big)$ instead of optimizing with respect to represented values $\mathbf{x}$. Since, for now, we assume that the postsynaptic current is linear in the prepopulation activities, we must find a weight vector $\mathbf{w}_i$ such that the following loss is minimized:
$$\min_{\mathbf{w}_i} \sum_{k=1}^{N} \Big( J_i\big(\langle \mathbf{e}_i, f(\mathbf{x}_k) \rangle\big) - \big\langle \mathbf{w}_i, \mathbf{a}^{\mathrm{pre}}(\mathbf{x}_k) \big\rangle \Big)^2 + \lambda N \|\mathbf{w}_i\|_2^2 .$$
(2.5)
This equation can be brought into canonical least squares form and solved as before.

2.2.2  Nonnegative Weights and Dale's Principle

So far we have assumed that synaptic weights are real-valued. This is problematic for two reasons. First, the least-squares optimization proposed above arbitrarily assigns algebraic signs to the synaptic weights; we cannot specify which preneurons are excitatory and which inhibitory. Being able to do so is important, since neurons tend to follow Dale's principle—individual preneurons exclusively influence all their postneurons in either an excitatory or inhibitory manner (Strata & Harvey, 1999). Empirical data suggest that, depending on the modeled brain region, excitatory cells outnumber inhibitory cells by a factor between two and four (Hendry & Jones, 1981; Gabbott & Somogyi, 1986). Second, real-valued weights do not generalize to conductance-based synapses. The concept of negative conductances—in contrast to negative currents—is neither physically nor biologically plausible. Biological correlates of synaptic weights, such as the number of vesicles released from the presynapse or the channel density in the postsynapse, are inherently nonnegative quantities (Roth & van Rossum, 2009).

An extension to the NEF that takes these biological constraints into account has been proposed by Parisien, Anderson, and Eliasmith (2008). The Parisien transform splits each projection into an excitatory and inhibitory path, where the latter mediates the signal from the excitatory prepopulation through a population of inhibitory interneurons. The solution we discuss here does not introduce interneurons, and as such does not affect the structure of the network. Modelers using the following method will have to explicitly specify inhibitory interneuron populations if present in the modeled circuit.

For the sake of simplicity, we assume in this letter that each population is arbitrarily split into a group of excitatory and inhibitory neurons. We write the somatic input current of postneuron $i$ in response to presynaptic activity as $\langle \mathbf{w}_i^+, \mathbf{a}^+ \rangle - \langle \mathbf{w}_i^-, \mathbf{a}^- \rangle$, where $\mathbf{w}_i^\pm$ are the nonnegative excitatory and inhibitory weight vectors and $\mathbf{a}^\pm$ the activities of the excitatory and inhibitory neurons in the prepopulation. Combining this current term with equation 2.5 yields an optimization problem that allows us to solve for weights approximating $f$ for each individual postneuron $i$,
$$\begin{aligned} \min_{\mathbf{w}_i^+,\, \mathbf{w}_i^-}\; & \frac{1}{2} \sum_{k=1}^{N} \Big( \langle \mathbf{w}_i^+, \mathbf{a}_k^+ \rangle - \langle \mathbf{w}_i^-, \mathbf{a}_k^- \rangle - J_i\big(\langle \mathbf{e}_i, f(\mathbf{x}_k) \rangle\big) \Big)^2 + \lambda N \|\mathbf{w}_i^+\|_2^2 + \lambda N \|\mathbf{w}_i^-\|_2^2 \\ & = \frac{1}{2} \big\| \mathbf{w}_i' A' - \mathbf{j} \big\|_2^2 + \lambda N \|\mathbf{w}_i'\|_2^2, \quad \text{where } \mathbf{w}_i' = (\mathbf{w}_i^+, \mathbf{w}_i^-),\; A' = (A^+, -A^-)^T,\; \text{and } (\mathbf{j})_k = J_i\big(\langle \mathbf{e}_i, f(\mathbf{x}_k) \rangle\big), \end{aligned}$$
(2.6)
subject to $\mathbf{w}_i^+ \geq 0$ and $\mathbf{w}_i^- \geq 0$. This problem is in canonical least-squares form and, as before, can be solved with a standard regularized nonnegative least-squares solver.
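
A minimal sketch of this procedure, emulating the Tikhonov term by row augmentation so that SciPy's standard NNLS solver can be used (the function and argument names are our own):

```python
import numpy as np
from scipy.optimize import nnls

def solve_weights_dale(A_exc, A_inh, j_tar, lam=1e-3):
    """Solve equation 2.6: A_exc/A_inh are (N x n) activity matrices of
    the excitatory/inhibitory preneurons, j_tar the (N,) target currents
    for a single postneuron."""
    N, n_exc = A_exc.shape
    A_prime = np.hstack((A_exc, -A_inh))            # A' = (A+, -A-)
    n_tot = A_prime.shape[1]
    # Stack sqrt(lam * N) * I below A' to emulate the regularization term.
    A_reg = np.vstack((A_prime, np.sqrt(lam * N) * np.eye(n_tot)))
    j_reg = np.concatenate((j_tar, np.zeros(n_tot)))
    w, _ = nnls(A_reg, j_reg)
    return w[:n_exc], w[n_exc:]                     # w+ and w-
```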
Alternatively, equation 2.6 can be phrased as a convex quadratic program (QP), a generalization of least-squares optimization (Wright & Nocedal, 1999):
$$\min_{\mathbf{w}_i'} \frac{1}{2} (\mathbf{w}_i')^T (A')^T A' \mathbf{w}_i' - \mathbf{j}^T A' \mathbf{w}_i' + \lambda N \|\mathbf{w}_i'\|_2^2, \quad \text{subject to } \mathbf{w}_i' \geq 0.$$
(2.7)
The global minimum of a convex QP can be computed in polynomial time (Kozlov, Tarasov, & Khachiyan, 1980). We propose a QP similar to equation 2.7 to solve for conductance-based synaptic weights in the context of the two-compartment neuron model discussed below.

2.2.3  Equality Relaxation for Subthreshold Currents

Most neurons possess a threshold current $J_{\mathrm{th}}$ below which their firing rate is zero. However, we do not take this into account when solving for somatic currents in equations 2.5 to 2.7. We optimize for synaptic weights that precisely evoke certain postsynaptic currents, even though the magnitude of currents smaller than $J_{\mathrm{th}}$ has no effect on the neural activity in a steady-state neural response model. Instead of optimizing for equality, that is, enforcing that the decoded current $J^{\mathrm{dec}}$ must equal the target current $J^{\mathrm{tar}}$, we can relax this condition to an inequality constraint $J^{\mathrm{dec}} < J_{\mathrm{th}}$ whenever $J^{\mathrm{tar}} < J_{\mathrm{th}}$.

We define a new optimization problem based on equation 2.5 that treats target currents smaller than $J_{\mathrm{th}}$ as an inequality constraint,
$$\min_{\mathbf{w}_i} \sum_{k=1}^{N} \frac{1}{2} E\Big( J_i\big(\langle \mathbf{e}_i, f(\mathbf{x}_k) \rangle\big),\, \big\langle \mathbf{w}_i, \mathbf{a}(\mathbf{x}_k) \big\rangle \Big)^2 + \lambda N \|\mathbf{w}_i\|_2^2, \quad \text{where } E\big(J^{\mathrm{tar}}, J^{\mathrm{dec}}\big) = \begin{cases} 0 & \text{if } J^{\mathrm{tar}} < J_{\mathrm{th}} \text{ and } J^{\mathrm{dec}} < J_{\mathrm{th}}, \\ J_{\mathrm{th}} - J^{\mathrm{dec}} & \text{if } J^{\mathrm{tar}} < J_{\mathrm{th}} \text{ and } J^{\mathrm{dec}} > J_{\mathrm{th}}, \\ J^{\mathrm{tar}} - J^{\mathrm{dec}} & \text{otherwise}, \end{cases}$$
(2.8)
and $N$ is the number of samples, $J_i$ is the current translation function of the $i$th postneuron, $\mathbf{e}_i$ is the encoding vector, $f$ is the desired target function, $\mathbf{a}(\mathbf{x}_k)$ is the prepopulation activity for the $k$th sample point $\mathbf{x}_k$, $\lambda$ is the regularization factor, and $J_{\mathrm{th}}$ is the aforementioned neuronal threshold current.
By splitting the matrix of prepopulation activities $A$ and the vector of target currents $\mathbf{j}$, we can rewrite equation 2.8 as a quadratic program. Let $A^+$ and $\mathbf{j}^+$ be the prepopulation activities and target currents for superthreshold samples, and $A^-$ be the activities for subthreshold samples. Then the QP is given as
$$\min_{\mathbf{w}_i, \mathbf{s}_i} \frac{1}{2} \mathbf{w}_i^T (A^+)^T A^+ \mathbf{w}_i - (\mathbf{j}^+)^T A^+ \mathbf{w}_i + \|\mathbf{s}_i\|_2^2 + \lambda N \|\mathbf{w}_i\|_2^2,$$
(2.9)
subject to $A^- \mathbf{w}_i - \mathbf{s}_i \leq J_{\mathrm{th}}$, where $\mathbf{s}_i$ is a vector of slack variables. Of course, the nonnegativity constraint from equation 2.7 can be incorporated into this quadratic program as well.
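
The following sketch assembles equation 2.9 in the standard form expected by the OSQP solver (which we also use for the final weight-solving problem, equation 2.20). The partitioning into Ap, jp, Am and all names are our own conventions; the nonnegativity constraint of equation 2.7 could be added as further bounds on w:

```python
import numpy as np
import scipy.sparse as sp
import osqp

def solve_relaxed(Ap, jp, Am, J_th, lam=1e-3):
    """Equation 2.9: Ap/jp are superthreshold activities and target
    currents, Am the subthreshold activities, for one postneuron."""
    n, Nm = Ap.shape[1], Am.shape[0]
    N = Ap.shape[0] + Nm
    # Decision variables z = (w, s); objective 1/2 z^T P z + q^T z.
    Pw = Ap.T @ Ap + 2 * lam * N * np.eye(n)
    P = sp.block_diag((sp.csc_matrix(Pw), 2 * sp.eye(Nm))).tocsc()
    q = np.concatenate((-Ap.T @ jp, np.zeros(Nm)))
    # Constraints: A^- w - s <= J_th and s >= 0.
    C = sp.vstack((
        sp.hstack((sp.csc_matrix(Am), -sp.eye(Nm))),
        sp.hstack((sp.csc_matrix((Nm, n)), sp.eye(Nm))),
    )).tocsc()
    l = np.concatenate((np.full(Nm, -np.inf), np.zeros(Nm)))
    u = np.concatenate((np.full(Nm, J_th), np.full(Nm, np.inf)))
    prob = osqp.OSQP()
    prob.setup(P, q, C, l, u, verbose=False)
    return prob.solve().x[:n]
```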

2.2.4  Extension toward Input-Dependent Nonlinearities

Up to this point, we have assumed current-based synapses. Their defining property is that the somatic current $J$ is linear in the synaptic weights $\mathbf{w}$ and the presynaptic activities $\mathbf{a}$. Now consider a neuron model with nonlinear input channels. We write the corresponding response curve $\mathfrak{G}$ (mapping from neural input onto an average firing rate) for the $i$th neuron in a population as a multivariate function,
$$a_i = \mathfrak{G}\big(g_{i1}, \ldots, g_{i\ell}\big) = \mathfrak{G}\big(\langle \mathbf{a}_{i1}, \mathbf{w}_{i1} \rangle, \ldots, \langle \mathbf{a}_{i\ell}, \mathbf{w}_{i\ell} \rangle\big),$$
(2.10)
where $g_{i1}, \ldots, g_{i\ell}$ describe some abstract input state, such as the conductance of each synaptic channel in a neuron model with conductance-based synapses. As expressed in the above equation, we assume that, on average, each $g_{ik}$ is linear in the presynaptic activities $\mathbf{a}_{ik}$. However, we do not make any assumption regarding the effect of $g_{ik}$ on the somatic current; more fundamentally, we do not assume that there exists an easily identifiable somatic current in the model at all.
The crucial idea is to mathematically reintroduce a “virtual” somatic current $J$ by decomposing $\mathfrak{G}$ into an input-dependent nonlinearity $H$ and a somatic nonlinearity $G$. We define $G$ and $H$ according to the following equivalence relations,
$$\mathfrak{G}\big(g_{i1}, \ldots, g_{i\ell}\big) = G\big[H\big(g_{i1}, \ldots, g_{i\ell}\big)\big], \qquad H\big(g_{i1}, \ldots, g_{i\ell}\big) = J \;\Leftrightarrow\; G[J] = \mathfrak{G}\big(g_{i1}, \ldots, g_{i\ell}\big),$$
(2.11)
where H maps from the input channel state onto an average somatic current and, as before, G maps from a somatic current onto the output activity. In other words, H summarizes nonlinear effects caused by dendrites or conductance-based synapses.

While this formalization does not constrain G and H beyond the above equivalence, a sensible choice for G and H will facilitate solving for synaptic weights. For example, if the neuron model in question is an extension to the current-based LIF neuron model, it makes sense to select G as the standard LIF response curve. Then H translates the input state into an LIF-equivalent somatic current.

As we show in the next section, $H$, or at least a parameterized surrogate for $H$, can be derived in closed form in some cases. In case this is not feasible, it is possible to rely purely on empirical data for $H$ by sampling the neuron output rate over varying synaptic states. Assume that we can observe $H$ only indirectly by controlling the input channels of our model neuron and measuring the output activity $\mathfrak{G}$. Depending on our choice of $G[J]$, we can apply the inverse mapping $G^{-1}$ to the recorded data to obtain $H$. See Figure 3 for an illustration.
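
A sketch of this empirical procedure for two channels; here, `measure_rate` is a hypothetical stand-in for running the neuron simulation at a given conductance pair, and the grid ranges and LIF parameters are illustrative assumptions:

```python
import numpy as np

def lif_rate_inverse(rate, tau_ref=2e-3, tau_rc=20e-3, J_th=1.0):
    """G^{-1} of the standard LIF response curve (valid for rate > 0)."""
    return J_th / -np.expm1((tau_ref - 1.0 / rate) / tau_rc)

gEs = np.linspace(0.0, 200e-9, 30)       # excitatory conductance grid (S)
gIs = np.linspace(0.0, 50e-9, 30)        # inhibitory conductance grid (S)
H_emp = np.full((len(gEs), len(gIs)), np.nan)
for i, gE in enumerate(gEs):
    for k, gI in enumerate(gIs):
        rate = measure_rate(gE, gI)      # hypothetical simulator call
        if rate > 0.0:                   # G^{-1} is ill-defined otherwise
            H_emp[i, k] = lif_rate_inverse(rate)
```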

When solving for weights that approximate a specific function, recall from the above review that the first NEF principle normatively assigns a somatic current $J_i(\mathbf{x})$ to each postneuron $i$ and representational state $\mathbf{x}$. Correspondingly, given $H$, we can combine equation 2.10 with the current-decoding problem discussed above, as well as the nonnegativity constraint and the equality relaxation,
$$\min_{\mathbf{w}_{i1}, \ldots, \mathbf{w}_{i\ell}} \sum_{k=1}^{N} E\Big( J_i\big(\langle \mathbf{e}_i, f(\mathbf{x}_k) \rangle\big),\, H\big(\langle \mathbf{w}_{i1}, \mathbf{a}_k^1 \rangle, \ldots, \langle \mathbf{w}_{i\ell}, \mathbf{a}_k^\ell \rangle\big) \Big)^2 + \lambda N \sum_{j=1}^{\ell} \|\mathbf{w}_{ij}\|_2^2,$$
(2.12)
subject to $\mathbf{w}_{ij} \geq 0$, where $E$ is as defined in equation 2.8.

2.3  Two-Compartment Leaky Integrate-and-Fire Neuron Model

In section 2.2, we established the abstract notion of an input-dependent nonlinearity $H$. We now derive a concrete $H$ for a biologically plausible extension of the leaky integrate-and-fire (LIF) neuron with nonlinear conductance-based synapses. In particular, we consider a two-compartment version of the LIF neuron originally described in Vu, Lee, and Krasne (1993) and subsequently discussed by Koch (1999) and Capaday and Van Vreeswijk (2006). In addition to the active, somatic compartment, the two-compartment LIF model possesses a resistively coupled passive compartment that represents excitatory and inhibitory input into the dendritic tree. Depending on the coupling conductance $g_C$, the input may be interpreted as either distal or proximal.

Figure 3:

Neural response curve decomposition. (A) Illustration of the multivariate neuron response curve $\mathfrak{G}(g_E, g_I)$ for a two-compartment LIF neuron with excitatory and inhibitory conductance-based channels. (B, C) The chosen somatic nonlinearity $G$ and its inverse $G^{-1}$. (D) The corresponding input-dependent nonlinearity $H$. The neuron does not fire in the hatched regions, that is, $G^{-1}$ is ill-defined.


We first review the neuron model itself, derive a parameterized surrogate for the nonlinearity H, and finally propose a convex quadratic program for this H that can be used to solve for synaptic weights.

2.3.1  Model Description

The subthreshold dynamics of the conductance-based two-compartment LIF model can be expressed as a two-dimensional system of linear differential equations,
$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t} C_{m,1} v_1 &= g_C (v_2 - v_1) + g_{L,1} (E_L - v_1), \\ \frac{\mathrm{d}}{\mathrm{d}t} C_{m,2} v_2 &= g_C (v_1 - v_2) + g_{L,2} (E_L - v_2) + g_E(t) (E_E - v_2) + g_I(t) (E_I - v_2), \end{aligned}$$
(2.13)
where the state variables $v_1$, $v_2$ correspond to the membrane potentials of the active somatic compartment and the passive dendritic compartment, respectively. $C_{m,1}$, $C_{m,2}$ are the compartment capacitances, $g_C$ is the intercompartment coupling conductance, $g_{L,1}$, $g_{L,2}$ are the individual compartment leak conductances, $g_E(t)$, $g_I(t)$ are the momentary excitatory and inhibitory conductances of the dendritic compartment as evoked by presynaptic spikes, and $E_L$, $E_E$, $E_I$ are the conductance-channel reversal potentials. An equivalent circuit diagram of the model is shown in Figure 4A.
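
For illustration, a minimal forward-Euler integration of the subthreshold dynamics in equation 2.13; the parameter values are placeholder assumptions rather than those of Table A1, and the threshold, spike phase, and reset described below are omitted:

```python
import numpy as np

dt, T = 1e-5, 0.1                       # time step and duration (s)
Cm1, Cm2 = 100e-12, 100e-12             # compartment capacitances (F)
gC, gL1, gL2 = 50e-9, 10e-9, 10e-9      # coupling and leak conductances (S)
EL, EE, EI = -65e-3, 0.0, -75e-3        # reversal potentials (V)
v1, v2 = EL, EL
for _ in range(int(T / dt)):
    gE, gI = 100e-9, 20e-9              # constant inputs; in a network,
                                        # low-pass filtered spike trains
    dv1 = (gC * (v2 - v1) + gL1 * (EL - v1)) / Cm1
    dv2 = (gC * (v1 - v2) + gL2 * (EL - v2)
           + gE * (EE - v2) + gI * (EI - v2)) / Cm2
    v1, v2 = v1 + dt * dv1, v2 + dt * dv2
```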
Figure 4:

Two-compartment LIF neuron model. (A) Equivalent circuit diagram of the two-compartment LIF neuron model. The “somatic” compartment corresponds to a classical LIF neuron; the “dendritic” compartment is resistively coupled to the somatic compartment. The current flowing between the two compartments corresponds to the somatic current $J = H(g_E, g_I)$. (B) Voltage trace of the two-compartment model state variables $v_1$ and $v_2$ during spike production. The artificial depolarization following spike events in the somatic compartment significantly affects $v_2$.


In contrast to point-neuron models, and as pointed out by Capaday and Van Vreeswijk (2006), a multicompartment model mandates an explicit spike model. The strong depolarization in the somatic compartment during spike production propagates backward into the dendritic compartment and has a significant effect on its membrane potential (see Figure 4B). The model accounts for this with a “spike phase” occurring right before the LIF refractory period. The spike phase is implemented by clamping the somatic compartment to a voltage $v_{\mathrm{spike}}$ for a time $\tau_{\mathrm{spike}}$ whenever the somatic membrane potential crosses the threshold $v_{\mathrm{th}}$. Subsequently, the membrane is clamped to $v_{\mathrm{reset}}$ for a period $\tau_{\mathrm{ref}}$.

2.3.2  Somatic Current Surrogate Model

We assume that $H(g_E, g_I)$ is equivalent to the current flowing from the dendritic into the somatic compartment (see Figure 4A), that is, $H(g_E, g_I) = g_C (v_2 - v_1)$. Considering the definition in equation 2.11, this implies that $G$ is the standard LIF response curve,
$$G[J] = \left( \tau_{\mathrm{ref}} + \tau_{\mathrm{spike}} - \frac{C_m}{g_L} \ln\left( 1 - \frac{(v_{\mathrm{th}} - E_L)\, g_L}{J} \right) \right)^{-1},$$
(2.14)
where $\tau_{\mathrm{ref}}$, $\tau_{\mathrm{spike}}$ are the lengths of the refractory and spike periods (see Figure 4B), $C_m$ is the membrane capacitance, $v_{\mathrm{th}}$ is the threshold potential, $E_L$ is the leak reversal (resting) potential, and $g_L$ is the leak conductance (Eliasmith & Anderson, 2003). Yet in practice, as pointed out by Hunsberger, Scott, and Eliasmith (2014), and as we demonstrate in our experiments that follow, a rectified linear unit (ReLU) may be a sensible choice for $G$ as well when modeling noisy input.

Unfortunately, when considering both sub- and superthreshold dynamics, there exists no exact closed-form solution for the average somatic current given constant $g_E$, $g_I$. Instead, our approach is to select a parameterized surrogate model for $H$ and to fit the model parameters to results obtained from numerical simulations to account for inaccuracies in our derivation.

Assuming that the subthreshold dynamics described in equation 2.13 are in equilibrium and that $g_E$, $g_I$ are constant, applying the above definition of $H(g_E, g_I)$ yields
$$H(g_E, g_I) = g_C (v_2 - v_1) = g_C\, \frac{g_{L,2} (E_L - v_1) + g_E (E_E - v_1) + g_I (E_I - v_1)}{g_C + g_{L,2} + g_E + g_I}.$$
(2.15)
A single-compartment model can be derived by taking the limit of this equation as $g_C \to \infty$. In this case, and as demonstrated in Stöckel, Voelker, and Eliasmith (2017), $H$ is an affine function and less interesting from a computational perspective. In general, we expect $H$ to become more nonlinear as $g_C$ decreases. While a higher degree of nonlinearity may be desirable for computation in a neural network, $g_C$ cannot be made arbitrarily small, as this limits the maximum somatic input current (see below).
Since the somatic membrane potential $v_1$ is clamped during the spike and refractory phases, the current flowing into the soma during those times does not influence the overall firing rate (ignoring the feedback effect on $v_2$ discussed earlier). Once the neuron is tonically firing, the somatic membrane potential oscillates between $v_{\mathrm{reset}}$ and $v_{\mathrm{th}}$. We can thus substitute $v_1$ with a constant average membrane potential $\bar{v}_{\mathrm{som}} \approx \frac{1}{2} (v_{\mathrm{reset}} + v_{\mathrm{th}})$ (see Figure 5). Under these assumptions, $H(g_E, g_I)$ from equation 2.15 can be written as a parameterized rational function
$$H(g_E, g_I) = \frac{b_0 + b_1 g_E - b_2 g_I}{a_0 + a_1 g_E + a_2 g_I}, \quad \text{where } a_0, a_1, a_2, g_E, g_I \geq 0.$$
(2.16)
This equation has one superfluous degree of freedom in the parameter space. Setting $b_1 = 1$ tends to be a numerically stable normalization.
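
In code, the surrogate and its theoretical parameter estimate read as follows. The estimate is obtained by matching equation 2.16 against equation 2.15 with $v_1 \approx \bar{v}_{\mathrm{som}}$ and then normalizing by $b_1$; the helper names are our own:

```python
def H_surrogate(gE, gI, b0, b2, a0, a1, a2, b1=1.0):
    """Equation 2.16: average somatic current for conductances gE, gI."""
    return (b0 + b1 * gE - b2 * gI) / (a0 + a1 * gE + a2 * gI)

def theoretical_estimate(gC, gL2, EL, EE, EI, v_som):
    """Coefficients implied by equation 2.15 before fitting, normalized
    so that b1 = 1 (all parameters divided by gC * (EE - v_som))."""
    scale = gC * (EE - v_som)
    b0 = gC * gL2 * (EL - v_som) / scale
    b2 = gC * (v_som - EI) / scale
    a0 = (gC + gL2) / scale
    a1 = 1.0 / scale
    a2 = 1.0 / scale
    return b0, b2, a0, a1, a2
```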
Figure 5:

Average LIF membrane potential over firing rate. Data correspond to a standard LIF neuron (i.e., the somatic compartment) over varying output rates, measured by capturing 1000 membrane potential traces for a 10-second current sweep with superimposed spike noise from a Poisson spike source. Individual lines correspond to the effective rate of the spike source, where smaller input rates are equivalent to a larger amount of noise on the input current. Shaded areas correspond to the 25th/75th percentile. (A) Average potential excluding the refractory period and spike production. Except for very low firing rates, the average potential remains relatively constant. The dotted line corresponds to $\bar{v}_{\mathrm{som}} = \frac{1}{2} (v_{\mathrm{reset}} + v_{\mathrm{th}})$. (B) Average potential including the refractory period and spike production. The dotted line corresponds to a linear model that takes the relative amount of time spent in the subthreshold, spike, and refractory regimes into account.


Equation 2.16 implies an absolute maximum and minimum somatic current; $H(g_E, g_I)$ maps onto an open interval $(J_{\mathrm{min}}, J_{\mathrm{max}})$, where
$$J_{\mathrm{min}} = \lim_{g_I \to \infty} H(g_E, g_I) = -\frac{b_2}{a_2} = g_C (E_I - \bar{v}_{\mathrm{som}}), \qquad J_{\mathrm{max}} = \lim_{g_E \to \infty} H(g_E, g_I) = \frac{b_1}{a_1} = g_C (E_E - \bar{v}_{\mathrm{som}}).$$
(2.17)
In practice, the maximum attainable current for realistic conductance values is significantly smaller than $J_{\mathrm{max}}$, limiting the maximum firing rate. This must be taken into account when selecting the neuron tuning curve.
Model parameters can be estimated by solving the QP
$$\min_{b_0, b_2, a_0, a_1, a_2} \sum_{i,\, J^i \geq J_{\mathrm{th}}} \Big( b_0 + b_1 g_E^i - b_2 g_I^i - J^i a_0 - J^i g_E^i a_1 - J^i g_I^i a_2 \Big)^2,$$
(2.18)
subject to the nonnegativity constraints in equation 2.16, where $J^i = G^{-1}\big[\mathfrak{G}(g_E^i, g_I^i)\big]$ and $b_1 = 1$ is fixed. The conductances $g_E^i$, $g_I^i$ should be sampled over the operating range of the neuron, and samples with zero or very small output rates should be ignored: the inverse of $G$ is not defined at a zero output rate, and $H$ was derived under the assumption of superthreshold dynamics.
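
Since the residual is linear in the remaining parameters, a bounded least-squares solver can stand in for the QP; a sketch, assuming superthreshold samples only and with the fixed $b_1 g_E^i$ term moved to the right-hand side:

```python
import numpy as np
from scipy.optimize import lsq_linear

def fit_surrogate(gE, gI, J):
    """Fit (b0, b2, a0, a1, a2) of equation 2.16 to superthreshold
    samples; gE, gI are conductances and J = G^{-1}[measured rate]."""
    M = np.column_stack((np.ones_like(gE), -gI, -J, -J * gE, -J * gI))
    y = -gE  # moves the fixed b1 * gE term (b1 = 1) to the right
    res = lsq_linear(M, y, bounds=([-np.inf, -np.inf, 0, 0, 0], np.inf))
    return res.x  # b0, b2, a0, a1, a2
```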

2.3.3  Synaptic Weights as a Quadratic Program

Given the nonlinearity model $H$ as defined in equation 2.16, our goal is to find weights $\mathbf{w}_i^E$, $\mathbf{w}_i^I$ such that a desired current $(\mathbf{j}_i)_k$ flows into the soma for every sample $k$ (see equation 2.6). Due to $H$ being a rational function, we cannot directly minimize the loss function in equation 2.12 as a quadratic program. Instead, we solve a related quadratic program that, in practice, produces reasonable solutions. In particular, we simply equate the desired currents and the input-dependent nonlinearity model $H$. Let $\div$ and $\odot$ denote elementwise division and multiplication. Then, in vector notation, we get
$$\mathbf{j}_i = \big( b_0 + b_1 A \mathbf{w}_i^E - b_2 A \mathbf{w}_i^I \big) \div \big( a_0 + a_1 A \mathbf{w}_i^E + a_2 A \mathbf{w}_i^I \big), \quad \text{where } a_0, a_1, a_2, A, \mathbf{w}_i^E, \mathbf{w}_i^I \geq 0,$$
and $A$ is a matrix of prepopulation activities. Rearranging into a canonical form yields
$$\begin{aligned} & \mathbf{j}_i \odot \big( a_0 + a_1 A \mathbf{w}_i^E + a_2 A \mathbf{w}_i^I \big) = b_0 + b_1 A \mathbf{w}_i^E - b_2 A \mathbf{w}_i^I \\ \Leftrightarrow\; & \big( a_1 \mathbf{J}_i \odot A - b_1 A \big)\, \mathbf{w}_i^E + \big( a_2 \mathbf{J}_i \odot A + b_2 A \big)\, \mathbf{w}_i^I = b_0 - a_0 \mathbf{j}_i, \quad \text{where } (\mathbf{J}_i)_{mn} = (\mathbf{j}_i)_m \\ \Leftrightarrow\; & A_i \mathbf{w}_i' = \mathbf{b}_i, \end{aligned}$$
(2.19)
where $A_i = \big( a_1 \mathbf{J}_i \odot A - b_1 A,\; a_2 \mathbf{J}_i \odot A + b_2 A \big)$, $\mathbf{b}_i = b_0 - a_0 \mathbf{j}_i$, and $\mathbf{w}_i' = (\mathbf{w}_i^E, \mathbf{w}_i^I)$.
We account for the subthreshold equality relaxation by splitting $A_i$, $\mathbf{b}_i$ according to the samples evoking a zero firing rate (resulting in $A_i^-$, $\mathbf{b}_i^-$) and those evoking a positive firing rate (resulting in $A_i^+$, $\mathbf{b}_i^+$). Thus, the quadratic program becomes
$$\min_{\mathbf{w}_i', \mathbf{s}_i} \frac{1}{2} (\mathbf{w}_i')^T (A_i^+)^T A_i^+ \mathbf{w}_i' - (\mathbf{b}_i^+)^T A_i^+ \mathbf{w}_i' + \lambda N \|\mathbf{w}_i'\|_2^2 + \|\mathbf{s}_i\|_2^2, \quad \text{subject to } A_i^- \mathbf{w}_i' - \mathbf{s}_i \leq \mathbf{b}_i^- \text{ and } \mathbf{w}_i' \geq 0.$$
(2.20)
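
A sketch of how $A_i$ and $\mathbf{b}_i$ can be assembled in practice from equation 2.19; the names are our own, the sample split tests the target currents against the threshold, and the resulting blocks plug into the same OSQP pattern as the sketch for equation 2.9, with the nonnegativity bound added:

```python
import numpy as np

def assemble_qp_blocks(A, j_tar, J_th, b0, b1, b2, a0, a1, a2):
    """Build A_i^+/-, b_i^+/- of equations 2.19/2.20 for one postneuron.
    A is the (N x n) prepopulation activity matrix, j_tar the (N,) target
    currents, and the remaining arguments the fitted surrogate parameters."""
    JA = j_tar[:, None] * A                      # J_i (.) A, elementwise
    A_i = np.hstack((a1 * JA - b1 * A, a2 * JA + b2 * A))
    b_i = b0 - a0 * j_tar
    sup = j_tar >= J_th                          # superthreshold samples
    return A_i[sup], b_i[sup], A_i[~sup], b_i[~sup]
```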

3  Experiments and Results

In this section, we validate the methodology described above. To this end, we perform three experiments. In experiment 1, we test how well the two-compartment LIF nonlinearity model H predicts simulation data for a single spiking neuron. In experiment 2, we study the computational properties of H under optimal circumstances when approximating random band-limited functions. Finally, in experiment 3, we test whether the results from experiments 1 and 2 are still visible in the context of a noisy, feedforward spiking neural network.

Unless explicitly specified, the neuron model parameters are chosen according to Table A1. We model fast excitatory synapses as an exponential low pass with a time constant of 5 ms, as found in glutamatergic pyramidal neurons with AMPA receptors (Jonas, Major, & Sakmann, 1993). The inhibitory pathway is modeled with a 10 ms time constant, as found in inhibitory interneurons with GABA$_{\mathrm{A}}$ receptors (Gupta, Wang, & Markram, 2000). We use the cvxopt library (Vandenberghe, 2010) as a QP solver for equation 2.18 and the osqp library (Stellato, Banjac, Goulart, Bemporad, & Boyd, 2020) to solve equation 2.20. The source code of the computer programs used to conduct and evaluate the experiments can be found in the supplemental materials.

3.1  Experiment 1: Fitting the Surrogate Model to Simulation Data

The surrogate model $H(g_E, g_I)$ in equation 2.16 predicts the current flowing into the somatic compartment of a two-compartment LIF neuron for excitatory and inhibitory conductances $g_E$ and $g_I$. $H$ is characterized by the five parameters $a_0$, $a_1$, $a_2$, $b_0$, $b_2$, for which a coarse theoretical estimate can be derived from equation 2.15. We compare spike rates measured in numerical simulation to the spike rate prediction obtained before and after fitting the model parameters to empirical data using equation 2.18. We do this both for constant conductances $g_E$, $g_I$ and for artificial temporal spike noise superimposed on the conductance pair.

3.1.1  Experiment 1.1: Constant Conductances

We consider three two-compartment LIF neurons with different coupling conductances $g_C$, namely, $50\,\mathrm{nS}$, $100\,\mathrm{nS}$, and $200\,\mathrm{nS}$, and measure their output spike rate for constant conductances $g_E$, $g_I$ on a grid. The conductance range has been selected such that the maximum firing rate is $100\,\mathrm{s}^{-1}$ and the spike onset approximately coincides with the diagonal of the resulting $g_E$-$g_I$-rate contour plot. We measure the steady-state firing rate by taking the inverse of the median interspike interval over the simulation period. We compare these data to the current predicted by $H$ according to the theoretical and optimized parameter sets. Parameter optimization is based on a training set of 200 conductance pairs sampled with uniform probability from the conductance range. The final prediction error is computed over all 10,000 grid points.

The results for this experiment are depicted in Figure 6A; the fitted parameters can be found in Table A2. When using the theoretical parameter estimate from equation 2.15, there is a significant discrepancy between the model prediction and the numerical simulation, especially for large $g_C$. This discrepancy is greatly reduced after fitting the model parameters. The model prediction almost perfectly fits the empirical data for output spike rates greater than $25\,\mathrm{s}^{-1}$. However, it fails to predict the spike onset correctly, placing it too early with respect to increasing $g_E$. Furthermore, the predicted slope at the spike onset is less steep than what is actually measured. As discussed in section 2.3.2, we can see the linearity of $H$ increase as $g_C$ is increased; that is, the contour lines are more “parallel” for larger $g_C$. Still, with an overall root mean square error (RMSE) of about $4\,\mathrm{s}^{-1}$, the model provides a reasonable approximation of the empirical data.

3.1.2  Experiment 1.2: Conductances with Artificial Temporal Spike Noise

In a network context, $g_E(t)$ and $g_I(t)$ are usually not constant but are instead modeled as the weighted sum of low-pass filtered spike trains, resulting in a considerable amount of “spike noise” being superimposed onto the signal. In this experiment, we simulate artificial spike noise as two Poisson spike sources (one excitatory, one inhibitory) with a rate of $1/\lambda = 1800\,\mathrm{s}^{-1}$ for inhibitory synapses and $1/\lambda = 4500\,\mathrm{s}^{-1}$ for excitatory synapses. These rates were obtained by fitting Poisson distributions to measurements from experiment 3. Spikes arriving at different synapses are simulated by uniformly sampling a random weight from the unit interval for each spike event. The time-averaged conductances equal $g_E$, $g_I$, respectively. Apart from the simulation period being extended to 100 s, the experimental setup is unchanged from the previous experiment.
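
The following sketch illustrates this noise model for a single channel: Poisson spike counts with unit-interval weights are low-pass filtered, and the gain is chosen so the time-averaged conductance equals the requested mean (all names are our own; the mean of the uniform weights is 0.5):

```python
import numpy as np

def noisy_conductance(g_mean, rate, tau, dt, T, rng):
    """Poisson spike noise with mean conductance g_mean (exponential
    synapse with time constant tau, spike rate in events per second)."""
    n_steps = int(T / dt)
    counts = rng.poisson(rate * dt, size=n_steps).astype(float)
    weights = counts * rng.uniform(0.0, 1.0, size=n_steps)
    scale = g_mean / (rate * tau * 0.5)   # normalize the time average
    g, out = 0.0, np.empty(n_steps)
    for i in range(n_steps):
        g += weights[i] * scale           # instantaneous synaptic events
        g -= dt * g / tau                 # exponential decay
        out[i] = g
    return out

gE = noisy_conductance(100e-9, 4500.0, 5e-3, 1e-4, 1.0,
                       np.random.default_rng(0))
```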

As can be seen in our results for noisy conductances in Figure 6B, and particularly in the bottom cross-sections of the measured spike rates, the steep spike onsets predicted by the theoretical LIF response curve $G[J]$ (see equation 2.14) are no longer present. Furthermore, the relationship between $g_E$ and the rate appears to be roughly linear in each cross-section of the $g_E$-$g_I$-rate plot, which is not well captured by the standard LIF response curve. Hence, a soft version of the LIF response curve that takes noise into account would be a better choice when fitting the parameters (Hunsberger, Scott, & Eliasmith, 2014; Capocelli & Ricciardi, 1971; Kreutz-Delgado, 2015). We instead take a pragmatic approach and define $G$ as a rectified linear unit (ReLU), that is, $G[J] = \max\{0, \alpha J + \beta\}$. Due to a relatively pronounced noise floor, we only consider samples with a spike rate greater than $12.5\,\mathrm{s}^{-1}$, both for fitting the model parameters and for reporting the RMSE.

Fitting the parameters using the ReLU results in excellent accuracy for spike rates greater than $12.5\,\mathrm{s}^{-1}$ (RMSE less than $1\,\mathrm{s}^{-1}$). However, the fitted model does not capture the subtle sigmoid shape of the response curve near the spike onset that is particularly pronounced for larger $g_C$.

To summarize, the results indicate that after fitting the model parameters, our surrogate model H can indeed be used to predict the neural response curve with relatively high accuracy. When taking noise in the input into account, it can be beneficial to choose a different neural response curve G[J].

3.2  Experiment 2: Computational Properties of the Two-Compartment LIF Nonlinearity H

The above experiment is encouraging in that our input-dependent nonlinearity $H$ seems to predict somatic currents well. Hence, in this experiment, we assume that $H$ accurately describes the somatic current and analyze whether, in theory, there could be any advantage to using a neuron model with this kind of input-dependent nonlinearity.

Conceptually, what we would like to do is measure how well a system can approximate functions of increasing complexity. If systems using input-dependent nonlinearities are able to approximate complex functions with a lower error than linear systems, they are computationally more powerful. Since complexity is somewhat ill-defined, we choose the spatial frequency content of a function as a proxy.

In our experiment, we randomly generate band-limited current functions $J_\sigma(x, y)$ over the compact domain $(x, y) \in [-1, 1]^2$ on a $63 \times 63$ grid. Here, $\sigma$ is a spatial low-pass filter coefficient that is inversely proportional to the bandwidth of the function. The functions we generate are normalized such that the mean is zero and the standard deviation equals $1\,\mathrm{nA}$. We measure how well we can approximate this desired postsynaptic current for a single postneuron with a given input-dependent nonlinearity. The preneuron populations are as depicted in Figure 2C. Two independent prepopulations with 100 neurons each represent $x$ and $y$, respectively, and project onto a single postneuron. The population tuning curves are randomly generated in each trial with a maximum firing rate between 50 and $100\,\mathrm{s}^{-1}$ per neuron. All preneurons project both inhibitorily and excitatorily onto the postneuron. We measure the static decoding error, that is, the difference between the target current function and the output of the surrogate model assuming ideal prepopulation tuning curves without any dynamics. We consider the dynamical simulation of a spiking network in experiment 3.

Figure 6:

Somatic current model parameter fits. Average spike rates $\mathfrak{G}(g_E, g_I)$ measured in simulation are depicted as colored contour plots and solid lines in the cross-sections. Dashed lines correspond to the model prediction $G[H(g_E, g_I)]$. Dotted lines indicate the location of the cross-section. $E$ denotes the spike rate RMSE over the regions where either the measured or predicted spike rate is greater than $12.5\,\mathrm{s}^{-1}$. Columns correspond to different values of $g_C$. The top two rows show the measured and modeled spike rates for constant conductance values $g_E$, $g_I$. The bottom two rows correspond to conductance values with superimposed spike noise. The model fits the simulated response functions well after optimization.


All synaptic weights are computed by solving the QP in equation 2.20 for 256 randomly selected training samples. For the conductance-based two-compartment LIF input-dependent current function $H_{\mathrm{cond}} = H$, we use the model parameters derived in the previous experiment (see Table A2). We emulate linear input currents $H_{\mathrm{cur}} = J_E - J_I$ in terms of $H$ by setting the model parameters to $b_0 = a_1 = a_2 = 0$ and $a_0 = b_1 = b_2 = 1$. The regularization parameters were selected independently for each setup (see Figure A1). The final error is the normalized RMSE (relative to the $1\,\mathrm{nA}$ standard deviation of $J_\sigma$) over all 3969 grid points. We add normally distributed noise (zero mean, unit standard deviation) to the preactivities when computing the error to test how well the computed weights generalize.

As a further point of comparison, we include a two-layer neural network setup (see Figure 2B). Here, 200 preneurons are tuned to both input dimensions $(x, y)$ and not to $x$ and $y$ independently. This setup should perform the best—but keep in mind that when modeling neurobiological systems, we may be constrained to prepopulations that do not exhibit the kind of multivariate tuning to the input signals we assume here.

Results are depicted in Figure 7. For a current-based neuron and without subthreshold relaxation (dashed line in the plot), the median error increases linearly on a log-log plot, from a 2.5% error for low-frequency—almost linear—functions to an error of about 50% for functions with a spatial cutoff frequency greater than one. Subthreshold relaxation reduces this error by up to 50% for low $\sigma^{-1}$.

The error for the conductance-based nonlinearity increases sublinearly on a log-log plot, starting at median errors of about 0.8% for low-frequency functions. It is competitive with the two-layer network (see below) for $\sigma^{-1} < 0.5$. The error function converges to the results for the current-based model for spatial frequencies $\sigma^{-1} > 1$. Overall, the error for the conductance-based model is reduced by up to 65% compared to the current-based model. The benefits of subthreshold relaxation are not as pronounced as for the linear current model.

The large errors for both the current-based and the conductance-based models for $\sigma^{-1} > 1$ can be explained by the fact that neither nonlinearity can be used to solve the XOR problem (cf. appendix C). Functions with $\sigma^{-1} \geq 1$ are likely to possess multiple maxima and minima over $[-1, 1]^2$, akin to XOR, leading to a large error.

The two-layer network setup is able to approximate functions well up to $\sigma^{-1} = 10$, where it reaches the same final error values as the other setups. The complexity of the functions that can be approximated well by this setup is limited by the number of preneurons. The two-layer network can be thought of as linearly combining rectified hyperplanes to fit the target function, where each hyperplane is a single preneuron response. At a certain $\sigma$, the number of hyperplanes is no longer sufficient to reconstruct all the local maxima and minima in the target function.

To summarize, this experiment demonstrates that using a neuron model with the input-dependent nonlinearity $H$ significantly reduces the approximation error for current functions with $\sigma^{-1} < 1$ in a network setup with prepopulations independently representing the input dimensions. It is competitive with a two-layer network for $\sigma^{-1} < 0.5$.

3.3  Experiment 3: Dendritic Computation of Multivariate Functions in Spiking Network Models

Experiment 1 suggests that we can use the nonlinear postsynaptic current model H to predict the average current flowing into the somatic compartment. Experiment 2 shows that we can, assuming that H accurately describes the somatic current, approximate a somewhat larger class of random functions well. In our final experiment, we study whether we can still observe the reduction in error in a feedforward spiking neural network when using our model H to solve for weights.

In contrast to the previous experiment, we do not base our error measurements on the decoded static somatic current for a single postneuron. Instead, we decode the represented value from the neural activities of a target population over time. Optimally, this decoded value should be $f(x(t), y(t))$, where $f$ is the function that we want to approximate and $x(t), y(t) \in [-1, 1]$ are input signals in turn represented by populations of 100 LIF neurons each. The target population consists of either standard current-based LIF neurons (see Figure 2A) or conductance-based two-compartment neurons (see Figure 2C).

Just as in the previous experiment, we consider the two-layer topology in Figure 2B as a point of reference. Here, the input is mediated via an additional layer of 200 neurons representing the vectorial quantity $(x(t), y(t))$ over the domain $[-1, 1]^2$.

For all neurons, we generate tuning curves such that the maximum firing rate falls between 50 and $100\,\mathrm{s}^{-1}$ over their represented range. Neurons are randomly marked as either excitatory or inhibitory, with a probability of 30% of a neuron being inhibitory. Excitatory and inhibitory synapses are modeled as first-order exponential low-pass filters with time constants of $\tau_E = 5\,\mathrm{ms}$ and $\tau_I = 10\,\mathrm{ms}$, respectively.

The network is simulated over 10 s at a time resolution of $100\,\mu\mathrm{s}$. The inputs $x(t)$ and $y(t)$ are sampled by moving through time along a fourth-order space-filling Hilbert curve over $[-1, 1]^2$. The output of the target population is decoded and filtered with a first-order exponential low pass at $\tau = 100\,\mathrm{ms}$. We compute the desired target value $f(x, y)$ from the original input and pass it through the same series of low-pass filters as the spiking signals; we use the average synaptic time constant of 7.5 ms to emulate the effect of the synaptic low-pass filters. Our final measure $E_{\mathrm{net}}$ is the normalized RMSE between the decoded output and the target values over time; the normalization is relative to the standard deviation of the target signal.
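
A sketch of how such an input trajectory can be generated, using the standard index-to-coordinate routine for Hilbert curves and mapping the resulting grid points to $[-1, 1]^2$ (the interpolation to a continuous 10 s signal is omitted):

```python
import numpy as np

def hilbert_d2xy(order, d):
    """Map index d to (x, y) on a 2^order x 2^order Hilbert curve."""
    x = y = 0
    s, t = 1, d
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                      # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x, y = x + s * rx, y + s * ry
        t //= 4
        s *= 2
    return x, y

order = 4                                # fourth-order curve, 16 x 16 grid
n = 1 << order
pts = np.array([hilbert_d2xy(order, d) for d in range(n * n)])
xy = 2.0 * pts / (n - 1) - 1.0           # trajectory through [-1, 1]^2
```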

All synaptic weights are computed by solving the QP in equation 2.20 with the same mapping of the current-based onto the conductance-based model as in experiment 2. The regularization term $\lambda$ has been chosen independently for each neuron type and model parameter set such that the network error $E_{\mathrm{net}}$ is minimized when computing multiplication (cf. Figure B1).

Figure 7:

Median decoding error for random multivariate current functions. (A) Normalized RMSE between random two-dimensional current functions and the decoded approximation using different input-dependent nonlinearities. The error measurement does not take subthreshold currents into account, using the function $E$ defined in equation 2.8. The low-pass filter coefficient $\sigma^{-1}$ is a proxy for the spatial frequency content of the target function. All points correspond to the median over 1000 trials. Dashed lines show results when the subthreshold relaxation is not taken into account while solving for weights. The black lines show the results for a linear, current-based $H_{\mathrm{cur}}$; blue and green lines show the results for the two-compartment, conductance-based model $H_{\mathrm{cond}}$ with parameters given in Table A2 (without noise). Orange lines correspond to a current-based network with two-dimensional preneuron tuning (i.e., a two-layer neural network). Shaded areas correspond to the 25th/75th percentile for the current-based models and the conductance-based model with $g_C = 50\,\mathrm{nS}$. (B) Exemplary random functions and the corresponding decodings. Deep violet regions (dashed contour lines) correspond to subthreshold currents. Note how the shape of the subthreshold contour lines no longer matches the target when subthreshold relaxation is active. As visible, the conductance-based nonlinearity $H_{\mathrm{cond}}$ helps to decode some target functions with a drastically smaller error compared to the current-based model $H_{\mathrm{cur}}$, especially when comparing the setups without subthreshold relaxation. $H_{\mathrm{cond}}$ does not provide any benefit for target functions with a high bandwidth.


3.3.1  Experiment 3.1: Random Band-Limited Functions

We first test our network setup with the random band-limited functions fσ we used in the previous experiment. Results are depicted in Figure 8. Qualitatively, the results are very similar to what we saw before. The reduction in error between the current- and conductance-based models is not quite as large as suggested by the theoretical experiment, with a maximum reduction (in terms of the median) of only 45% (instead of 65% before). While subthreshold relaxation mostly increased the performance of the current-based model in the previous experiment, the improvement in error is now clearly visible for the conductance-based model as well.
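For concreteness, a random band-limited target of this kind can be generated by low-pass filtering white noise on a grid with a Gaussian kernel and normalizing the result. The sketch below follows this generic recipe; it is not necessarily identical to the exact construction of fσ used in our experiments.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def random_band_limited(res=63, sigma=2.0, rng=np.random):
        """Sample a random function on a res x res grid over [-1, 1]^2.

        sigma: spatial width of the Gaussian low-pass filter in grid
        cells; its inverse is a proxy for the bandwidth of the result.
        """
        noise = rng.randn(res, res)
        f = gaussian_filter(noise, sigma=sigma, mode="wrap")
        f -= f.mean()
        f /= f.std()  # unit standard deviation, so RMSEs are comparable
        return f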
Figure 8:

Median error for computing random band-limited functions in a feedforward network over 1000 trials. The measured RMSE Enet is the difference between the represented value Da(t) and the expected value fσ(x(t), y(t)), relative to the standard deviation of fσ. See Figure 7A for further details.


Notably, the minimum median approximation error of the two-layer network is about 10%, whereas the single-layer, current- and conductance-based models reach minimum errors of about 6.5% and 5%, respectively. The two-layer network clearly surpasses the performance of the two-compartment LIF single-layer network for σ⁻¹ > 0.6. The larger errors are mainly caused by the representation of the two-dimensional quantity (x(t), y(t)) being noisier than the representation of the scalars x(t), y(t) in the two prepopulations: chaining multiple populations of spiking neurons slightly raises the noise floor. Furthermore, to cover the square [-1, 1]² as densely as two one-dimensional intervals [-1, 1], we would optimally have to square the number of neurons. In our case, this would mean using 10,000 instead of 200 neurons for the intermediate layer, which would no longer be comparable to the single-layer setups; note that the two-layer network already uses 66% more neurons.

3.3.2  Experiment 3.2: Benchmark Functions

While the random functions in the above experiments (see Figure 7B for an example) are useful for systematically characterizing the individual setups, it is hard to tell from these data alone what the practical impact of the two-compartment LIF neuron is. To this end, we selected eight mathematical benchmark functions f(x, y) and repeated the experiment. The functions include the maximum max(x, y) and various forms of multiplication (x × y, √(x × y), (x × y)²; see Table A3 for a complete list). Note that we compute all these functions over the interval [0, 1]² instead of [-1, 1]² by shifting and scaling the values represented by the neuron populations; that is, we compute f((x + 1)/2, (y + 1)/2). As mentioned above and proved in appendix C, we are not able to solve the XOR problem with the two-compartment LIF neuron, and multiplication over [-1, 1]² can be thought of as a continuous form of XOR. We should, however, be able to approximate multiplication over a single quadrant.4
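As a concrete illustration of this remapping, the sketch below lists a subset of the benchmark functions together with the shift and scale from the represented range [-1, 1]² onto [0, 1]²; the dictionary keys are ours, and the complete set of eight benchmarks is given in Table A3.

    import numpy as np

    # A subset of the benchmark targets f : [0, 1]^2 -> R (cf. Table A3).
    BENCHMARKS = {
        "max":         lambda x, y: np.maximum(x, y),
        "mul":         lambda x, y: x * y,
        "sqrt-mul":    lambda x, y: np.sqrt(x * y),
        "squared-mul": lambda x, y: (x * y) ** 2,
    }

    def remap(f):
        """Lift f from [0, 1]^2 onto the represented range [-1, 1]^2."""
        return lambda x, y: f((x + 1.0) / 2.0, (y + 1.0) / 2.0)

For example, remap(BENCHMARKS["mul"])(-1.0, 1.0) evaluates the multiplication benchmark at the corner x = -1, y = 1 of the represented space, that is, at (0, 1) in benchmark coordinates.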

A summary of the results over 256 trials per function and setup is given in Table 1, and traces from an example trial are depicted in Figure 9. More detailed results can be found in Table A4. For all but one target function (squared multiplication, which has the highest bandwidth of all tested functions), the conductance-based, two-compartment model with a coupling conductance of gC = 50 nS achieves the smallest error Enet. Using the surrogate model parameters derived under noise is beneficial when computing multiplicative functions and the maximum. For these target functions, the synaptic connection matrix tends to be sparser, increasing the input noise. Apparently, this increase in noise matches the environment for which the neuron parameters have been optimized. Interestingly, a purely current-based, single-layer network is competitive for all functions except multiplication. The minimum error for the two-layer network is about 8% even for simple functions, matching the observation we made in the random function experiment above.
Figure 9:

Single spiking neuron experiment showing computation of multiplication using a two-compartment LIF neuron. (A) Top two plots: inputs x(t) and y(t) as represented by the two prepopulations. The input is a fourth-order 2D Hilbert curve. Bottom: mathematical target f(x(t),y(t))=x(t)y(t), filtered target function, as well as the decoded target population output. (B) Spike raster plots corresponding to the spiking activity of each of the populations. Red shaded background corresponds to inhibitory neurons in the prepopulations; all other neurons are excitatory.


Table 1:
Spiking Neural Network Approximation Errors for Function Approximations on [0, 1]².

Notes: Error values Enet correspond to the normalized RMSE (relative to the standard deviation of the target function) and are measured as the difference between the output decoded from the target population and the desired output for a 10 s sweep across a fourth-order 2D Hilbert curve over the input space. Results are the mean and standard deviation over 256 trials. The best result for each target function is set in bold; darker background colors indicate a worse ranking of the result in the corresponding row. Columns labeled "Standard" refer to the default, single-layer network setup, "Two Layers" to the two-layer setup, and "Noise Model" to the single-layer network setup with model parameters derived under noise (see experiment 1.2). Additional tables are in appendix A.

a. With subthreshold relaxation.
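The 10 s input sweep along a fourth-order Hilbert curve can be reconstructed with the standard distance-to-coordinate conversion. The following is one possible implementation under our own naming, not the exact code used for the experiments.

    import numpy as np

    def hilbert_d2xy(order, d):
        """Map a distance d along a Hilbert curve of the given order to
        integer grid coordinates (x, y)."""
        n = 2 ** order
        x = y = 0
        t = d
        s = 1
        while s < n:
            rx = 1 & (t // 2)
            ry = 1 & (t ^ rx)
            if ry == 0:  # rotate the quadrant when necessary
                if rx == 1:
                    x, y = s - 1 - x, s - 1 - y
                x, y = y, x
            x += s * rx
            y += s * ry
            t //= 4
            s *= 2
        return x, y

    # Fourth-order curve: a 16 x 16 grid visited in 256 steps, stretched
    # over a 10 s sweep of the input space [-1, 1]^2.
    pts = np.array([hilbert_d2xy(4, d) for d in range(256)], dtype=float)
    xy = 2.0 * pts / 15.0 - 1.0
    ts = np.linspace(0.0, 10.0, len(xy))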

An effect that could contribute to the superior performance of the two-compartment neuron model in some experiments is the low-pass filtering performed by the dendritic compartment, which attenuates high-frequency spike noise and may thus reduce the target error. We control for this effect in an experiment described in appendix B, where we add an optimal low-pass filter to each network setup. Results are shown in Table B1. We find that a matched prefilter consistently reduces the error of all setups by only 1% to 2%, which indicates that the low-pass filter dynamics of the dendritic compartment are not the primary source of the reduction in error.
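This control can be pictured as a grid search over first-order low-pass time constants, reporting the filter that minimizes the normalized RMSE between the filtered decoding and the target. The sketch below uses our own naming and a generic grid of time constants, not the exact procedure from appendix B.

    import numpy as np

    def lowpass(x, tau, dt=1e-3):
        """First-order exponential low-pass filter of a signal x sampled
        at intervals dt."""
        x = np.asarray(x, dtype=float)
        alpha = dt / (tau + dt)
        y = np.empty_like(x)
        y[0] = x[0]
        for i in range(1, len(x)):
            y[i] = y[i - 1] + alpha * (x[i] - y[i - 1])
        return y

    def best_prefilter_error(decoded, target, taus=np.logspace(-3, -1, 21)):
        """Smallest normalized RMSE over a grid of filter time constants."""
        errs = [np.sqrt(np.mean((lowpass(decoded, tau) - target) ** 2))
                for tau in taus]
        return min(errs) / np.std(target)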

To summarize our experiments, we demonstrate in three stages (validation of the nonlinearity model Hcond for a single neuron, purely mathematical properties of Hcond, and, finally, performance on a network level) that we are able to successfully incorporate a model (admittedly a simple one) of nonlinear passive dendritic interaction into functional modeling frameworks. Instead of reducing the accuracy of our networks, the added detail can be systematically leveraged for computation. Our experiments also suggest that—at least in a biologically informed setting, that is, using spiking neurons—this type of computation may result in a higher accuracy compared to two-layer architectures that suffer from an increase in the amount of spike-induced temporal noise due to the additional neuron layer.

4  Discussion

We derived a mathematical model of input-dependent postsynaptic currents in a two-compartment LIF neuron that can be interpreted as a simple form of passive dendritic computation. We experimentally demonstrated that networks with fewer layers but biophysically plausible nonlinearities can compute a broad range of multivariate functions as well as or better than networks typically constructed using functional modeling frameworks. In particular, we proposed a mathematical model H that captures nonlinear interactions between input channels, for example, those caused by conductance-based synapses or the dendritic tree. By mapping individual channel states onto an average somatic current J, this model can be integrated into mathematical frameworks that classically rely on current-based input channels.

Specifically, we demonstrated how to incorporate the dendritic nonlinearity H into the Neural Engineering Framework (NEF). To this end, we discussed extensions to the NEF that allow us to optimize for nonnegative synaptic weights that invoke a desired somatic current J and relax the optimization problem by taking subthreshold currents into account. We combined these methods with a specific surrogate model for H in the context of a two-compartment LIF neuron. Finally, we performed a series of spiking neural network simulations that show that our methods allow dendritic nonlinearities to be systematically exploited to efficiently approximate nonlinear multivariate functions up to a certain spatial bandwidth.

While our approach is a step toward providing a general model of dendritic computation in top-down neurobiological modeling frameworks, it admittedly has several limitations. Most important, we treat the dendritic nonlinearity H as time independent. Correspondingly, we implicitly assume that synaptic time constants typically dominate the overall neuronal dynamics. However, dendritic trees in biology—especially when considering active channels and dendritic spikes (Koch, 1999)—possess filter properties and adaptation processes that are not accounted for in our model. It would be interesting to incorporate the dynamical properties of dendritic trees into the NEF by employing the recent techniques presented by Voelker and Eliasmith (2018).

A further shortcoming of the derivation of the surrogate model of H for the two-compartment neuron model is the assumption that the average somatic membrane potential is constant. While we are able to relax this assumption to some degree by fitting the model parameters to simulation data, the exact model parameters depend on the specific working regime in which the neuron is used. Deviations from the modeled behavior are particularly apparent at output firing rates below 10 spikes per second (see Figures 5 and 6). Correspondingly, the dendritic nonlinearity presented in this letter may not be a suitable model for brain areas featuring extremely low maximum firing rates. There are two potential ways to work around this limitation. First, it may be possible to include an input-dependent membrane potential term in the nonlinearity. Second, one could directly use a sampled model for H. While both approaches are compatible with the concept of dendritic nonlinearity as introduced above, they increase the mathematical complexity of the weight optimization problem to a point where strategies such as stochastic gradient descent are required. These techniques tend to have significantly weaker guarantees of finding an optimal solution than the convex quadratic programs employed in this letter.

In light of the above limitations, we reemphasize that, as stated in section 1, our goal is not to provide a detailed mechanistic model of dendritic computation. Instead, we hope to provide a useful tool that captures essential aspects of dendritic computation—a nonlinear interaction between input channels—while being computationally cheap and mathematically tractable but still grounded in biophysics. This helps to bridge the gap between purely abstract functional networks and more biophysically grounded mechanisms.

A potential application of our work outside of neurobiological modeling is programming neuromorphic hardware. Neuromorphic computers are inspired by neurobiological principles and promise to reduce the energy consumption of certain computational problems by several orders of magnitude compared to conventional computers (Boahen, 2017). Especially when considering mixed analog-digital neuromorphic hardware systems, it should be possible to achieve a higher energy efficiency by implementing a more complex model neuron—such as the two-compartment LIF neuron discussed here—and performing local analog computation. Potential future work in this regard would be to validate our methods on a neuromorphic computing platform that implements dendritic trees, such as the BrainScales 2 system (Schemmel, Kriener, Müller, & Meier, 2017).

Another line of future work is to consider arbitrary configurations of passive dendritic trees beyond the two-compartment LIF model. By applying Kirchhoff's circuit laws, any passive dendritic tree can be described as a linear dynamical system, from which the corresponding dendritic nonlinearity H can be derived. It would be interesting to see whether it is still possible to optimize connection weights relatively quickly, and to what extent the number of compartments influences the computational power of the dendritic nonlinearity.
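To sketch what such a derivation looks like in the static case, the equilibrium potentials of a passive tree with conductance-based inputs follow from solving the linear system implied by Kirchhoff's current law. The compartment and parameter names below are hypothetical, and the sketch ignores all temporal dynamics.

    import numpy as np

    def passive_tree_equilibrium(G, g_leak, E_leak, g_syn, E_syn):
        """Equilibrium potentials of a passive compartmental tree.

        G:      (n, n) symmetric inter-compartment coupling conductances
                with zero diagonal.
        g_leak, E_leak: (n,) leak conductances and reversal potentials.
        g_syn, E_syn:   (n, k) synaptic conductances and reversal
                potentials for k input channels per compartment.
        """
        # Kirchhoff's current law: in each compartment, the coupling,
        # leak, and synaptic currents must sum to zero at equilibrium.
        A = np.diag(G.sum(axis=1) + g_leak + g_syn.sum(axis=1)) - G
        b = g_leak * E_leak + (g_syn * E_syn).sum(axis=1)
        return np.linalg.solve(A, b)

In the two-compartment case, the somatic input current at these equilibrium potentials is simply the coupling current gC(v_dend - v_som), which recovers the static nonlinearity H discussed above.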

In conclusion, we believe that the methods proposed here provide a solid grounding for future work exploring both detailed biophysical mechanisms in the context of functional spiking networks and improving neuromorphic methods for neural computation. We have shown how to cast the determination of connection weights in a functional network with conductance-based synapses as an optimization problem with guaranteed convergence to the minimum. This optimization not only exploits known dendritic nonlinearities but respects specifiable network topologies that conform to Dale's principle. The results are functional spiking networks with improved accuracy and biophysical plausibility using fewer neurons than competing approaches.

Notes

1. The methods presented in this section have been implemented as a software library extending the Nengo simulation package; see https://github.com/astoeckel/nengo-bio. A stand-alone version used to conduct the experiments in this letter can be found in the supplemental material.

2. Our software library nengo-bio facilitates the process of specifying network topologies with excitatory and inhibitory populations. The library provides special syntactic sugar for networks with inhibitory interneurons built without the Parisien transform.

3. osqp is used as part of libbioneuronqp; see https://github.com/astoeckel/libbioneuronqp.

4. The obvious solution to approximating "full" multiplication using two-compartment LIF neurons is to split the target population into four quadrants; however, we wanted to use network setups that are not optimized for a particular problem.

Acknowledgments

We thank Aaron R. Voelker for his comments on earlier drafts of this letter, as well as his advice and constructive criticism regarding our experiments. This work was supported by the Canada Research Chairs program (no grant number), NSERC Discovery grant 261453, and AFOSR grant FA9550-17-1-002.

References

Alemi, A., Machens, C. K., Deneve, S., & Slotine, J.-J. (2018). Learning nonlinear dynamics in efficient, balanced spiking networks using local plasticity rules. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI.
Bekolay, T., Bergstra, J., Hunsberger, E., DeWolf, T., Stewart, T. C., Rasmussen, D., … Eliasmith, C. (2014). Nengo: A Python tool for building large-scale functional brain models. Frontiers in Neuroinformatics, 7(48).
Berzish, M., Eliasmith, C., & Tripp, B. (2016). Real-time FPGA simulation of surrogate models of large spiking networks. In A. E. Villa, P. Masulli, & A. J. Pons Rivero (Eds.), Artificial Neural Networks and Machine Learning—ICANN 2016 (pp. 349–356). Berlin: Springer.
Blouw, P., Choo, X., Hunsberger, E., & Eliasmith, C. (2018). Benchmarking keyword spotting efficiency on neuromorphic hardware. arXiv:1812.01739.
Boahen, K. (2017). A neuromorph's prospectus. Computing in Science Engineering, 19(2), 14–28.
Bobier, B., Stewart, T. C., & Eliasmith, C. (2014). A unifying mechanistic model of selective attention in spiking neurons. PLOS Computational Biology, 10(6), e1003577.
Boerlin, M., & Denève, S. (2011). Spike-based population coding and working memory. PLOS Computational Biology, 7(2), 1–18.
Boerlin, M., Machens, C. K., & Denève, S. (2013). Predictive coding of dynamical variables in balanced spiking networks. PLOS Computational Biology, 9(11), 1–16.
Broomhead, D. S., & Lowe, D. (1988). Radial basis functions, multi-variable functional interpolation and adaptive networks (RSRE MEMO-4148). Malvern, U.K.: Royal Signals and Radar Establishment.
Capaday, C., & Van Vreeswijk, C. (2006). Direct control of firing rate gain by dendritic shunting inhibition. Journal of Integrative Neuroscience, 5(2), 199–222.
Capocelli, R. M., & Ricciardi, L. M. (1971). Diffusion approximation and first passage time problem for a model neuron. Kybernetik, 8(6), 214–223.
Choudhary, S., Sloan, S., Fok, S., Neckar, A., Trautmann, E., Gao, P., … Boahen, K. (2012). Silicon neurons that compute. In A. E. P. Villa, W. Duch, P. Érdi, F. Masulli, & G. Palm (Eds.), Artificial Neural Networks and Machine Learning—ICANN 2012 (pp. 121–128). Berlin: Springer.
Duggins, P. (2017). Incorporating biologically realistic neuron models into the NEF. Master's thesis, University of Waterloo.
Duggins, P., Stewart, T. C., Choo, X., & Eliasmith, C. (2017). Effects of guanfacine and phenylephrine on a spiking neuron model of working memory. Topics in Cognitive Science, 9(1), 117–134.
Eliasmith, C. (2013). How to build a brain: A neural architecture for biological cognition. New York: Oxford University Press.
Eliasmith, C., & Anderson, C. H. (2003). Neural engineering: Computation, representation, and dynamics in neurobiological systems. Cambridge, MA: MIT Press.
Eliasmith, C., Gosmann, J., & Choo, X.-F. (2016). BioSpaun: A large-scale behaving brain model with complex neurons. arXiv:1602.05220.
Eliasmith, C., Stewart, T. C., Choo, X., Bekolay, T., DeWolf, T., Tang, Y., & Rasmussen, D. (2012). A large-scale model of the functioning brain. Science, 338, 1202–1205.
Gabbott, P. L. A., & Somogyi, P. (1986). Quantitative distribution of GABA-immunoreactive neurons in the visual cortex (area 17) of the cat. Experimental Brain Research, 61(2), 323–331.
Gosmann, J., & Eliasmith, C. (2020). CUE: A unified spiking neuron model of short-term and long-term memory. Psychological Review. https://doi.org/10.1037/rev0000250
Gupta, A., Wang, Y., & Markram, H. (2000). Organizing principles for a diversity of GABAergic interneurons and synapses in the neocortex. Science, 287(5451), 273–278.
Hendry, S., & Jones, E. G. (1981). Sizes and distributions of intrinsic neurons incorporating tritiated GABA in monkey sensory-motor cortex. Journal of Neuroscience, 1(4), 390–408.
Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2(5), 359–366.
Hunsberger, E., & Eliasmith, C. (2015). Spiking deep networks with LIF neurons. arXiv:1510.08829.
Hunsberger, E., Scott, M., & Eliasmith, C. (2014). The competing benefits of noise and heterogeneity in neural coding. Neural Computation, 26(8).
Jonas, P., Major, G., & Sakmann, B. (1993). Quantal components of unitary EPSCs at the mossy fibre synapse on CA3 pyramidal cells of rat hippocampus. Journal of Physiology, 472(1), 615–663.
Koch, C. (1999). Biophysics of computation: Information processing in single neurons. New York: Oxford University Press.
Komer, B., & Eliasmith, C. (2016). A unified theoretical approach for biological cognition and learning. Current Opinion in Behavioral Sciences, 11, 14–20.
Kozlov, M., Tarasov, S., & Khachiyan, L. (1980). The polynomial solvability of convex quadratic programming. USSR Computational Mathematics and Mathematical Physics, 20(5), 223–228.
Kreutz-Delgado, K. (2015). Mean time-to-fire for the noisy LIF neuron: A detailed derivation of the Siegert formula. arXiv:1501.04032.
Kuo, P. D., & Eliasmith, C. (2005). Integrating behavioral and neural data in a model of zebrafish network interaction. Biological Cybernetics, 93(3), 178–187.
London, M., & Häusser, M. (2005). Dendritic computation. Annual Review of Neuroscience, 28(1), 503–532.
MacNeil, D., & Eliasmith, C. (2011). Fine-tuning and the stability of recurrent neural networks. PLOS ONE, 6.
Marr, D. (1969). A theory of cerebellar cortex. Journal of Physiology, 202(2), 437–470.
Mel, B. W. (1994). Information processing in dendritic trees. Neural Computation, 6(6), 1031–1085.
Mundy, A., Knight, J., Stewart, T. C., & Furber, S. (2015). An efficient SpiNNaker implementation of the Neural Engineering Framework. In Proceedings of the International Joint Conference on Neural Networks. Piscataway, NJ: IEEE.
Neckar, A., Fok, S., Benjamin, B. V., Stewart, T. C., Oza, N. N., Voelker, A. R., … Boahen, K. (2019). Braindrop: A mixed-signal neuromorphic architecture with a dynamical systems-based programming model. Proceedings of the IEEE, 107(1), 144–164.
Nicola, W., & Clopath, C. (2017). Supervised learning in spiking neural networks with FORCE training. Nature Communications, 8(1), 2208.
Parisien, C., Anderson, C. H., & Eliasmith, C. (2008). Solving the problem of negative synaptic weights in cortical models. Neural Computation, 20, 1473–1494.
Polsky, A., Mel, B. W., & Schiller, J. (2004). Computational subunits in thin dendrites of pyramidal cells. Nature Neuroscience, 7, 621.
Roth, A., & van Rossum, M. C. W. (2009). Modeling synapses. In E. D. Schutter (Ed.), Computational modeling methods for neuroscientists (pp. 139–159). Cambridge, MA: MIT Press.
Schemmel, J., Kriener, L., Müller, P., & Meier, K. (2017). An accelerated analog neuromorphic hardware system emulating NMDA- and calcium-based non-linear dendrites. In Proceedings of the 2017 International Joint Conference on Neural Networks (pp. 2217–2226). Piscataway, NJ: IEEE.
Schwemmer, M. A., Fairhall, A. L., Denéve, S., & Shea-Brown, E. T. (2015). Constructing precisely computing networks with biophysical spiking neurons. Journal of Neuroscience, 35(28), 10112–10134.
Stellato, B., Banjac, G., Goulart, P., Bemporad, A., & Boyd, S. (2020). OSQP: An operator splitting solver for quadratic programs. Mathematical Programming Computation. https://doi.org/10.1007/s12532-020-00179-2
Stewart, T. C., Bekolay, T., & Eliasmith, C. (2012). Learning to select actions with spiking neurons in the basal ganglia. Frontiers in Decision Neuroscience, 6.
Strata, P., & Harvey, R. (1999). Dale's principle. Brain Research Bulletin, 50(5), 349–350.
Stöckel, A., Voelker, A. R., & Eliasmith, C. (2017). Point neurons with conductance-based synapses in the neural engineering framework. arXiv:1710.07659.
Sussillo, D., & Abbott, L. (2009). Generating coherent patterns of activity from chaotic neural networks. Neuron, 63(4), 544–557.
Thalmeier, D., Uhlmann, M., Kappen, H. J., & Memmesheimer, R.-M. (2016). Learning universal computations with spikes. PLOS Computational Biology, 12(6), 1–29.
Tripp, B. (2009). A search for principles of basal ganglia function. PhD diss., University of Waterloo.
Tripp, B., & Eliasmith, C. (2007). Neural populations can induce reliable postsynaptic currents without observable spike rate changes or precise spike timing. Cerebral Cortex, 17, 1830–1840.
Vandenberghe, L. (2010). The CVXOPT linear and quadratic cone program solvers. http://www.seas.ucla.edu/~vandenbe/publications/coneprog.pdf
Voelker, A. R., & Eliasmith, C. (2018). Improving spiking dynamical networks: Accurate delays, higher-order synapses, and time cells. Neural Computation, 30(3), 569–609.
Vu, E. T., Lee, S. C., & Krasne, F. B. (1993). The mechanism of tonic inhibition of crayfish escape behavior: Distal inhibition and its functional significance. Journal of Neuroscience, 13(10), 4379–4393.
Wright, S., & Nocedal, J. (1999). Numerical optimization. Springer.
