## Abstract

Nonlinear interactions in the dendritic tree play a key role in neural computation. Nevertheless, modeling frameworks aimed at the construction of large-scale, functional spiking neural networks, such as the Neural Engineering Framework, tend to assume a linear superposition of postsynaptic currents. In this letter, we present a series of extensions to the Neural Engineering Framework that facilitate the construction of networks incorporating Dale's principle and nonlinear conductance-based synapses. We apply these extensions to a two-compartment LIF neuron that can be seen as a simple model of passive dendritic computation. We show that it is possible to incorporate neuron models with input-dependent nonlinearities into the Neural Engineering Framework without compromising high-level function and that nonlinear postsynaptic currents can be systematically exploited to compute a wide variety of multivariate, band-limited functions, including the Euclidean norm, controlled shunting, and nonnegative multiplication. By avoiding an additional source of spike noise, the function approximation accuracy of a single layer of two-compartment LIF neurons is on a par with or even surpasses that of two-layer spiking neural networks up to a certain target function bandwidth.

## 1 Introduction

A central challenge in theoretical neuroscience is to describe how biological mechanisms ultimately give rise to behavior. One way to approach this challenge is to build models of neurobiological systems that generate the behavior of interest to a researcher. Since constructing models that span multiple levels of abstraction is typically difficult, theoretical neuroscientists are working on methods that facilitate mapping high-level behavior onto neural mechanisms. Such modeling frameworks include the Neural Engineering Framework (NEF; Eliasmith & Anderson, 2003; Eliasmith, 2013); efficient, balanced spiking networks (EBN; Boerlin & Denève, 2011; Boerlin, Machens, & Denève, 2013); and FORCE (Sussillo & Abbott, 2009; Nicola & Clopath, 2017). Generally these approaches describe how to translate dynamical systems—corresponding to some hypothesized behavioral model—into an idealized spiking neural network that adheres to the desired neurophysiological constraints, for example, neural tuning, firing rate distributions, and population-level connectivity (Komer & Eliasmith, 2016; Nicola & Clopath, 2017). This mechanistic grounding facilitates model validation by enabling a direct comparison of simulation results and empirical data (Stewart, Bekolay, & Eliasmith, 2012; Bekolay et al., 2014; Duggins, Stewart, Choo, & Eliasmith, 2017; Voelker & Eliasmith, 2018; Gosmann & Eliasmith, 2020).

The frameworks mentioned above primarily rely on two biophysical phenomena as computational primitives: synaptic filtering and the nonlinear relationship between somatic input currents and the neural response. Somatic response models range from leaky integrate-and-fire (LIF) to Hodgkin-Huxley-type dynamics (Schwemmer, Fairhall, Denéve, & Shea-Brown, 2015; Eliasmith, Gosmann, & Choo, 2016; Duggins, 2017). Crucially, however, these approaches typically assume that postsynaptic currents are a linear superposition of filtered presynaptic events. Nonlinear interactions between input channels as they may occur when modeling conductance-based synapses or dendritic structures are typically ignored.

While some research exists that explores the effect of nonlinear postsynaptic currents within these frameworks (Bobier, Stewart, & Eliasmith, 2014; Thalmeier, Uhlmann, Kappen, & Memmesheimer, 2016; Alemi, Machens, Deneve, & Slotine, 2018), these nonlinearities are seldom systematically exploited. Yet empirical and theoretical work suggests that active and passive nonlinear effects within the dendritic tree—and not only the soma—are at least partially responsible for the complex responses observed in some biological neurons, including cortical pyramidal cells (Mel, 1994; Koch, 1999; Polsky, Mel, & Schiller, 2004). London and Häusser (2005) argue that in addition to voltage-gated ionic currents, fundamental passive effects such as shunting inhibition are worth being investigated as computational resources.

Put differently, current functional modeling frameworks consider only a subset of the computational resources available in individual neurons and thus underestimate their computational power. Modelers wishing to multiply two signals might, for example, be forced to introduce an additional layer of neurons, although—in biology—the interplay between excitation and inhibition within the dendrites of a single neuron layer could have the same effect (Koch, 1999). The goal of this letter is to present mathematically tractable methods that allow researchers to take nonlinear postsynaptic currents into account. We demonstrate, as demanded by London and Häusser (2005), that the interactions between passive conductance-based input channels within a single dendritic compartment provide significant computational advantages over standard LIF neurons even within a noisy spiking neural network with low firing rates and small neuron counts.

Specifically, we extend the NEF toward systematically exploiting nonlinear postsynaptic currents. The NEF has been applied to various research areas, including low-level modeling of neurobiological systems (Kuo & Eliasmith, 2005; Tripp, 2009; Bobier et al., 2014) and studying large-scale models of cognitive systems grounded in biology (Eliasmith et al., 2012, 2016; Eliasmith, 2013). A software implementation of the NEF is part of the neural simulation package Nengo (Bekolay et al., 2014) and has been used as a neural compiler targeting analog and digital neuromorphic hardware (Choudhary et al., 2012; Mundy, Knight, Stewart, & Furber, 2015; Berzish, Eliasmith, & Tripp, 2016; Blouw, Choo, Hunsberger, & Eliasmith, 2018; Neckar et al., 2019).

The main contributions of this letter are as follows. First, we present a series of extensions to the NEF that improve its compatibility with more biologically detailed neuron models. We describe how to enforce nonnegative weights in conjunction with Dale's principle and extend the NEF toward nonlinear postsynaptic currents, thereby lifting some long-standing limitations of the NEF. Second, we derive a postsynaptic current model for a two-compartment leaky integrate-and-fire (LIF) neuron that can be interpreted as a simple dendritic nonlinearity. Third, we demonstrate that a single layer of two-compartment LIF neurons can compute a wide variety of functions with an error smaller than or on a par with the accuracy achieved by a comparable two-layer spiking neural network, as long as the target function does not surpass a certain bandwidth.

## 2 Materials and Methods

We begin with a review of relevant portions of the Neural Engineering Framework (NEF), followed by four novel extensions: decoding in current space, nonnegative weights, equality relaxation for subthreshold currents, and the input-dependent nonlinearity model $H$. These extensions facilitate incorporating biological detail into NEF models, including nonlinear, dendritic postsynaptic currents. We close with a description of the two-compartment LIF neuron and the derivation of a corresponding nonlinearity $H$.^{1}

### 2.1 The Neural Engineering Framework

At its core, the NEF describes three principles that govern the construction of models of neurobiological systems (Eliasmith & Anderson, 2003). These principles apply established concepts from artificial neural networks to time-continuous spiking neural networks in a way that facilitates the integration of neurophysiological constraints. The first principle postulates that populations of neurons represent values in a relatively low-dimensional manifold within their high-dimensional activity space via nonlinear encoding and linear decoding. According to the second principle, network connectivity defines transformations as mathematical functions of the represented values. Finally, the third principle states that recurrent connections approximate arbitrary dynamical systems in which the represented values are state variables. Although we will not discuss recurrent networks in this letter, it should be noted that the methods presented here are fully compatible with the third principle and can be used to implement dynamical systems.

#### 2.1.1 Representation

A fundamental assumption of the NEF is that populations of spiking neurons represent $d$-dimensional vectors $\vec{x} \in \mathbb{R}^d$ by means of nonlinear encoding and linear decoding. In essence, each neuron population is treated as a single hidden-layer neural network with linear input and output units. Similar schemes have been proposed since the late 1960s, an early example being Marr's model of pattern extraction in the cerebellar granule layer (Marr, 1969).

Critically, this mapping does not suggest the adoption of a rate code by the NEF, but rather is a convenience for solving the synaptic weight optimization problem. That is, equation 2.1 is merely normative: it defines the average activity that a modeler expects individual neurons to have when the population represents a certain $\vec{x}$. This normative constraint can be dropped entirely and all optimization done in the spiking domain within the NEF, but the computational costs are significantly higher (MacNeil & Eliasmith, 2011; Eliasmith, 2013). In this letter, we focus purely on offline optimization of the synaptic weights. Hence, the methods we present should in no way be interpreted as a theory of learning within nervous systems, but as a method for constructing models of mature, trained neural systems.

Notice that for the purpose of modeling neurobiological systems, all functions and parameters listed above can be hand-tuned by the modeler to match neurophysiological data. For example, as explored in Eliasmith and Anderson (2003), two-dimensional encoding vectors $\vec{e}_i$ can be used to reproduce a wide variety of localized tuning curves found in the neuroscience literature. Since NEF models, in contrast to classical multilayer neural networks, are typically not globally optimized, tuning curves are maintained throughout the weight optimization process, and neural activities can be interpreted at any point in the simulation. Global stochastic gradient descent can be applied to parts of the network (Hunsberger & Eliasmith, 2015) at the cost of negatively impacting the interpretability and biologically informed tuning of individual neuron populations.

The decoders $D$ are usually optimized offline under the assumption of the rate model $G[J]$. As formulated, the same $D$ can be used to decode represented values through time in spiking networks when defining the activity $\vec{a}(t)$ as low-pass filtered population spike trains. Linear low-pass filters are a common model for postsynaptic currents (Roth & van Rossum, 2009) and are usually employed in spiking NEF networks (Eliasmith & Anderson, 2003).

#### 2.1.2 Transformation

In order to construct neural *networks*, we need to find synaptic weight matrices $W \in \mathbb{R}^{m \times n}$ that connect from a prepopulation of $n$ neurons to a postpopulation of $m$ neurons. In particular, we would like to find a $W$ that implicitly decodes $f(\vec{x}) = \vec{y}$ from the prepopulation and at the same time encodes $\vec{y}$ in the postpopulation. If we assume that the current translation function $J_i(x)$ is an intrinsic part of the neuron model or, put differently, each neuron $i$ is assigned its own response curve $G_i[\langle \vec{e}_i, \vec{x} \rangle] = G[J_i(\langle \vec{e}_i, \vec{x} \rangle)]$, both the encoding and decoding process are linear operators. With this simplification, $W = E D^f$ fulfills the properties listed above, where $E \in \mathbb{R}^{m \times d}$ is a matrix of postpopulation encoding vectors $\vec{e}_i$, and $D^f$ is the desired function decoder for the prepopulation (Eliasmith & Anderson, 2003).
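The factorization $W = E D^f$ can be sketched in a few lines; the dimensions and the random decoder values below are placeholders, not quantities from the letter.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, d = 80, 60, 2                      # pre neurons, post neurons, dimensions
Df = rng.standard_normal((n, d)) * 0.01  # function decoders D^f (placeholder)
E = rng.standard_normal((m, d))          # postpopulation encoders
E /= np.linalg.norm(E, axis=1, keepdims=True)

W = E @ Df.T                             # full (m x n) synaptic weight matrix

# Connecting via W is equivalent to decoding y = f(x) and re-encoding it:
a_pre = rng.uniform(0.0, 100.0, n)       # some prepopulation activities
J_post_direct = W @ a_pre                # currents via the full weight matrix
J_post_factored = E @ (Df.T @ a_pre)     # decode first, then encode
```

Note that $W$ has rank at most $d$; the low-dimensional represented value acts as a bottleneck between the two populations.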

### 2.2 Extending the Neural Engineering Framework

The NEF as described above makes several assumptions that are not biophysically plausible: each neuron receives an intrinsic bias current $\beta_i$, and excitatory and inhibitory connections are not treated separately. We describe two extensions that lift these assumptions and present an alternative synaptic weight-solving procedure that takes subthreshold currents into account, followed by our proposed method for accounting for input-dependent nonlinearities.

#### 2.2.1 Decoding the Current Translation Function

In section 2.1, we assumed that the current translation function $J_i(x)$ is an intrinsic part of the neuron model. Consequently, each neuron is not only assigned a neuron-specific tuning curve $a_i(\vec{x})$ but also a neuron-specific response curve $G_i[J]$. This comes at the cost of biological plausibility, since neurons in general do not possess a strong bias current source.

#### 2.2.2 Nonnegative Weights and Dale's Principle

So far we have assumed that synaptic weights are real-valued. This is problematic for two reasons. First, the least-squares optimization proposed above arbitrarily assigns algebraic signs to the synaptic weights; we cannot specify which preneurons are excitatory and which inhibitory. Being able to do so is important, since neurons tend to follow Dale's principle—individual preneurons exclusively influence all their postneurons in either an excitatory or inhibitory manner (Strata & Harvey, 1999). Empirical data suggest that, depending on the modeled brain region, excitatory cells outnumber inhibitory cells by a factor between two and four (Hendry & Jones, 1981; Gabbott & Somogyi, 1986). Second, real-valued weights do not generalize to conductance-based synapses. The concept of negative conductances—in contrast to negative currents—is neither physically nor biologically plausible. Biological correlates of synaptic weights, such as the number of vesicles released from the presynapse or the channel density in the postsynapse, are inherently nonnegative quantities (Roth & van Rossum, 2009).

An extension to the NEF that takes these biological constraints into account has been proposed by Parisien, Anderson, and Eliasmith (2008). The Parisien transform splits each projection into an excitatory and inhibitory path, where the latter mediates the signal from the excitatory prepopulation through a population of inhibitory interneurons. The solution we discuss here does not introduce interneurons, and as such does not affect the structure of the network. Modelers using the following method will have to explicitly specify inhibitory interneuron populations if present in the modeled circuit.^{2}
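As a minimal sketch of solving for weights that respect Dale's principle, the nonnegative least-squares problem below finds purely nonnegative coefficients for a single postneuron, given a fixed excitatory/inhibitory assignment of the preneurons. The tuning curves and target currents are synthetic, and `scipy.optimize.nnls` stands in for the quadratic programs discussed later.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(2)
n_smp, n_pre = 200, 50
A = rng.uniform(0.0, 100.0, (n_smp, n_pre))  # prepopulation tuning curves (synthetic)
is_exc = rng.random(n_pre) < 0.7             # ~70% excitatory preneurons
J_tar = rng.standard_normal(n_smp)           # target somatic currents (synthetic)

# Excitatory columns enter positively, inhibitory columns negatively; the
# solved coefficients themselves are constrained to be nonnegative, so each
# preneuron acts exclusively excitatorily or exclusively inhibitorily.
signs = np.where(is_exc, 1.0, -1.0)
w, _ = nnls(A * signs[None, :], J_tar)
J_dec = (A * signs[None, :]) @ w
```

Folding the sign into the design matrix keeps the optimization a standard nonnegative least-squares problem while enforcing Dale's principle by construction.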

#### 2.2.3 Equality Relaxation for Subthreshold Currents

Most neurons possess a threshold current $J_\mathrm{th}$ below which their firing rate is zero. However, we do not take this into account when solving for somatic currents in equations 2.5 to 2.7. We optimize for synaptic weights that precisely evoke certain postsynaptic currents, despite the magnitude of currents smaller than $J_\mathrm{th}$ having no effect on the neural activity in a steady-state neural response model. Instead of optimizing for equality, that is, enforcing that the decoded current $J_\mathrm{dec}$ must equal the target current $J_\mathrm{tar}$, we could relax this condition to an inequality constraint $J_\mathrm{dec} < J_\mathrm{th}$ whenever $J_\mathrm{tar} < J_\mathrm{th}$.
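The relaxation can be illustrated with a small constrained least-squares problem: deviation from the target is penalized only for superthreshold samples, while subthreshold samples merely have to stay below the threshold. SLSQP stands in here for the QP solvers used in the letter, and all quantities (activities, targets, a threshold normalized to one) are synthetic.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n_smp, n_pre = 100, 30
A = rng.uniform(0.0, 100.0, (n_smp, n_pre))  # prepopulation activities (synthetic)
J_th = 1.0                                   # threshold current
J_tar = rng.uniform(0.0, 2.0, n_smp)         # target currents, some subthreshold
sup = J_tar >= J_th                          # superthreshold samples

def loss(w):
    # Equality is only enforced where the target current is superthreshold
    return np.sum((A[sup] @ w - J_tar[sup]) ** 2)

# For subthreshold samples it suffices that the decoded current stays below J_th
cons = [{"type": "ineq", "fun": lambda w: J_th - A[~sup] @ w}]
res = minimize(loss, x0=np.zeros(n_pre), constraints=cons, method="SLSQP")
w = res.x
```

The freed degrees of freedom below threshold are exactly what later allows the solver to shape the nonlinear interaction between input channels.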

#### 2.2.4 Extension toward Input-Dependent Nonlinearities

While this formalization does not constrain $G$ and $H$ beyond the above equivalence, a sensible choice for $G$ and $H$ will facilitate solving for synaptic weights. For example, if the neuron model in question is an extension to the current-based LIF neuron model, it makes sense to select $G$ as the standard LIF response curve. Then $H$ translates the input state into an LIF-equivalent somatic current.

As we show in the next section, $H$, or at least a parameterized surrogate for $H$, can be derived in closed form in some cases. In case this is not feasible, it is possible to rely purely on empirical data for $H$ by sampling the neuron output rate over varying synaptic states. Assume that we can observe $H$ only indirectly by controlling the input channels of our model neuron and measuring the output activity $G$. Depending on our choice of $G[J]$, we can apply the inverse mapping $G^{-1}$ to the recorded data to obtain $H$. See Figure 3 for an illustration.
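Sampling $H$ empirically amounts to measuring output rates and mapping them back through $G^{-1}$. In the sketch below, a hypothetical black-box neuron (here, secretly a linear current-based LIF, which is our own toy choice) illustrates the procedure; the standard LIF response curve and its analytic inverse are the only substantive ingredients.

```python
import numpy as np

tau_rc, tau_ref = 0.02, 0.002  # assumed LIF time constants

def lif_rate(J):
    """Standard LIF response curve G[J] (unit threshold current)."""
    J = np.asarray(J, dtype=float)
    r = np.zeros_like(J)
    sup = J > 1.0
    r[sup] = 1.0 / (tau_ref + tau_rc * np.log1p(1.0 / (J[sup] - 1.0)))
    return r

def lif_rate_inv(r):
    """G^{-1}: map measured rates back onto equivalent somatic currents.
    Subthreshold currents are unobservable and are reported as NaN."""
    r = np.asarray(r, dtype=float)
    J = np.full_like(r, np.nan)
    sup = r > 0.0
    J[sup] = 1.0 + 1.0 / np.expm1((1.0 / r[sup] - tau_ref) / tau_rc)
    return J

# Hypothetical black box: we only observe output rates, never currents.
measure_rate = lambda gE, gI: lif_rate(2.0 * gE - 1.5 * gI)
gE, gI = np.meshgrid(np.linspace(0.0, 2.0, 21), np.linspace(0.0, 2.0, 21))
H_emp = lif_rate_inv(measure_rate(gE, gI))  # empirical samples of H(gE, gI)
```

Note the NaN entries: as discussed above, currents below threshold leave no trace in the firing rate, which is what motivates the equality relaxation.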

### 2.3 Two-Compartment Leaky Integrate-and-Fire Neuron Model

In section 2.2, we established the abstract notion of an input-dependent nonlinearity $H$. We now derive a concrete $H$ for a biologically plausible extension to the leaky integrate-and-fire (LIF) neuron with nonlinear conductance-based synapses. In particular, we consider a two-compartment version of the LIF neuron originally described in Vu, Lee, and Krasne (1993) and subsequently discussed by Koch (1999) and Capaday and Van Vreeswijk (2006). In addition to the active, somatic compartment, the two-compartment LIF model possesses a resistively coupled passive compartment that represents excitatory and inhibitory input into the dendritic tree. Depending on the coupling conductance $g_\mathrm{C}$, the input may be interpreted as either distal or proximal.

We first review the neuron model itself, derive a parameterized surrogate for the nonlinearity $H$, and finally propose a convex quadratic program for this $H$ that can be used to solve for synaptic weights.

#### 2.3.1 Model Description

In contrast to point-neuron models, and as pointed out by Capaday and Van Vreeswijk (2006), a multicompartment model mandates an explicit spike model. The strong depolarization in the somatic compartment during spike production propagates backward into the dendritic compartment and has a significant effect on its membrane potential (see Figure 4B). The model accounts for this with a “spike phase” occurring right before the LIF refractory period. The spike phase is implemented by clamping the somatic compartment to a voltage $v_\mathrm{spike}$ over a time $\tau_\mathrm{spike}$ whenever the somatic membrane potential crosses the threshold $v_\mathrm{th}$. Subsequently, the membrane is clamped to $v_\mathrm{reset}$ for a period $\tau_\mathrm{ref}$.

#### 2.3.2 Somatic Current Surrogate Model

Unfortunately, when considering both sub- and superthreshold dynamics, there exists no exact closed-form solution for the average somatic current given constant $g_\mathrm{E}$, $g_\mathrm{I}$. Instead, our approach is to select a parameterized surrogate model for $H$ and to fit the model parameters to results obtained from numerical simulations to account for inaccuracies in our derivation.
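For intuition, the shape of such a surrogate and the fitting step can be sketched as follows. We assume a first-order rational form $H(g_\mathrm{E}, g_\mathrm{I}) = (b_0 + g_\mathrm{E} + b_2 g_\mathrm{I}) / (a_0 + a_1 g_\mathrm{E} + a_2 g_\mathrm{I})$, with $b_1$ normalized to one so as to match the five free parameters listed in experiment 1; the "measurements" below are synthetic, standing in for numerical simulations, and the coefficient values are arbitrary.

```python
import numpy as np
from scipy.optimize import curve_fit

def H_surrogate(g, b0, b2, a0, a1, a2):
    """First-order rational surrogate for the somatic current (b1 = 1)."""
    gE, gI = g
    return (b0 + gE + b2 * gI) / (a0 + a1 * gE + a2 * gI)

# Synthetic "simulation data" generated from known parameters
rng = np.random.default_rng(4)
gE = rng.uniform(0.0, 1.0, 500)
gI = rng.uniform(0.0, 1.0, 500)
p_true = (0.5, -2.0, 1.0, 0.3, 0.6)
J_meas = H_surrogate((gE, gI), *p_true)

# Fit the surrogate parameters to the measurements
p_fit, _ = curve_fit(H_surrogate, (gE, gI), J_meas, p0=(0.0, -1.0, 1.0, 0.0, 0.0))
```

The denominator makes the interaction between the two channels divisive rather than additive, which is the qualitative signature of conductance-based input.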

#### 2.3.3 Synaptic Weights as a Quadratic Program

## 3 Experiments and Results

In this section, we validate the methodology described above. To this end, we perform three experiments. In experiment 1, we test how well the two-compartment LIF nonlinearity model $H$ predicts simulation data for a single spiking neuron. In experiment 2, we study the computational properties of $H$ under optimal circumstances when approximating random band-limited functions. Finally, in experiment 3, we test whether the results from experiments 1 and 2 are still visible in the context of a noisy, feedforward spiking neural network.

Unless explicitly specified otherwise, the neuron model parameters are chosen according to Table A1. We model fast excitatory synapses as an exponential low pass with a time constant of 5 ms, as found in glutamatergic pyramidal neurons with AMPA receptors (Jonas, Major, & Sakmann, 1993). The inhibitory pathway is modeled with a 10 ms time constant, as found in inhibitory interneurons with GABA$_\mathrm{A}$ receptors (Gupta, Wang, & Markram, 2000). We use the cvxopt library (Vandenberghe, 2010) as a QP solver for equation 2.18 and the osqp library (Stellato, Banjac, Goulart, Bemporad, & Boyd, 2020) to solve equation 2.20.^{3} The source code of the computer programs used to conduct and evaluate the experiments can be found in the supplemental materials.

### 3.1 Experiment 1: Fitting the Surrogate Model to Simulation Data

The surrogate model $H(g_\mathrm{E}, g_\mathrm{I})$ in equation 2.16 predicts the current flowing into the somatic compartment of a two-compartment LIF neuron for excitatory and inhibitory conductances $g_\mathrm{E}$ and $g_\mathrm{I}$. $H$ is characterized by five parameters $a_0$, $a_1$, $a_2$, $b_0$, $b_2$, for which a coarse theoretical estimate can be derived from equation 2.15. We compare spike rates measured in numerical simulation to the spike rate prediction obtained before and after fitting the model parameters to empirical data using equation 2.18. We do this both for constant conductances $g_\mathrm{E}$, $g_\mathrm{I}$ and for artificial temporal spike noise superimposed on the conductance pair.

#### 3.1.1 Experiment 1.1: Constant Conductances

We consider three two-compartment LIF neurons with different coupling conductances $g_\mathrm{C}$, namely, $50\,\mathrm{nS}$, $100\,\mathrm{nS}$, and $200\,\mathrm{nS}$, and measure their output spike rate for constant conductances $g_\mathrm{E}$, $g_\mathrm{I}$ on a grid. The conductance range has been selected such that the maximum firing rate is 100 s$^{-1}$ and the spike onset approximately coincides with the diagonal of the resulting $g_\mathrm{E}$-$g_\mathrm{I}$-rate contour plot. We measure the steady-state firing rate by taking the inverse of the median interspike interval over the simulation period. We compare these data to the current predicted by $H$ according to the theoretical and the optimized parameter sets. Parameter optimization is based on a training set of 200 conductance pairs sampled with uniform probability from the conductance range. The final prediction error is computed over all 10,000 grid points.
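The rate measure used here is simply the inverse of the median interspike interval, which, unlike a spike count, is robust against startup transients; a minimal sketch (the example spike train is our own):

```python
import numpy as np

def steady_state_rate(spike_times):
    """Inverse of the median interspike interval (0 if fewer than 2 spikes)."""
    st = np.asarray(spike_times, dtype=float)
    if st.size < 2:
        return 0.0
    return 1.0 / np.median(np.diff(st))

# A 100 s^-1 spike train with one long startup interval that would bias a
# mean-based estimate, but which the median ignores:
spikes = np.concatenate([[0.0, 0.025], 0.025 + 0.01 * np.arange(1, 100)])
rate = steady_state_rate(spikes)
```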

The results for this experiment are depicted in Figure 6A; the fitted parameters can be found in Table A2. When using the theoretical parameter estimate from equation 2.15, there is a significant discrepancy between the model prediction and the numerical simulation, especially for large $g_\mathrm{C}$. This discrepancy is greatly reduced after fitting the model parameters. The model prediction almost perfectly fits the empirical data for output spike rates greater than 25 s$^{-1}$. However, it fails to predict the spike onset correctly, placing it too early with respect to increasing $g_\mathrm{E}$. Furthermore, the predicted slope at the spike onset is less steep than what is actually measured. As discussed in section 2.3.2, we can see the linearity of $H$ increase as $g_\mathrm{C}$ is increased; that is, the contour lines are more “parallel” for larger $g_\mathrm{C}$. Still, with an overall root mean square error (RMSE) of about 4 s$^{-1}$, the model provides a reasonable approximation of the empirical data.

#### 3.1.2 Experiment 1.2: Conductances with Artificial Temporal Spike Noise

In a network context, $g_\mathrm{E}(t)$ and $g_\mathrm{I}(t)$ are usually not constant, but instead modeled as the weighted sum of low-pass filtered spike trains, resulting in a considerable amount of “spike noise” being superimposed onto the signal. In this experiment, we simulate artificial spike noise as two Poisson spike sources (one excitatory, one inhibitory) with rates $1/\lambda = 1800\,\mathrm{s}^{-1}$ for the inhibitory and $1/\lambda = 4500\,\mathrm{s}^{-1}$ for the excitatory synapses. These rates were obtained by fitting Poisson distributions to measurements from experiment 3. Spikes arriving at different synapses are simulated by uniformly sampling a random weight from the unit interval for each spike event. The time-averaged conductances equal $g_\mathrm{E}$ and $g_\mathrm{I}$, respectively. Apart from the simulation period being extended to 100 s, the experimental setup is unchanged from the previous experiment.

As can be seen in our results for noisy conductances in Figure 6B, and particularly in the bottom cross-sections of the measured spike rates in Figure 6B, the steep spike onsets predicted by the theoretical LIF response curve $G[J]$ (see equation 2.14) are no longer present. Furthermore, the relationship between $g_\mathrm{E}$ and the rate appears to be roughly linear in each cross-section of the $g_\mathrm{E}$-$g_\mathrm{I}$-rate plot, which is not well captured by the standard LIF response curve. Hence, a soft version of the LIF response curve that takes noise into account would be a better choice when fitting the parameters (Hunsberger, Scott, & Eliasmith, 2014; Capocelli & Ricciardi, 1971; Kreutz-Delgado, 2015). We instead take a pragmatic approach and define $G$ as a rectified linear unit (ReLU), that is, $G[J] = \max\{0, \alpha J + \beta\}$. Due to a relatively pronounced noise floor, we only consider samples with a spike rate greater than 12.5 s$^{-1}$, both for fitting the model parameters and for reporting the RMSE.

Fitting the parameters using the ReLU results in excellent accuracy for spike rates greater than 12.5 s$^{-1}$ (RMSE less than 1 s$^{-1}$). However, the fitted model does not capture the subtle sigmoid shape of the response curve near the spike onset, which is particularly pronounced for larger $g_\mathrm{C}$.

To summarize, the results indicate that after fitting the model parameters, our surrogate model $H$ can indeed be used to predict the neural response curve with relatively high accuracy. When taking noise in the input into account, it can be beneficial to choose a different neural response curve $G[J]$.

### 3.2 Experiment 2: Computational Properties of the Two-Compartment LIF Nonlinearity $H$

The above experiment is encouraging in that our input-dependent nonlinearity $H$ seems to predict somatic currents well. Hence, in this experiment, we assume that $H$ accurately describes the somatic current and analyze whether, in theory, there could be any advantage to using a neuron model with this kind of input-dependent nonlinearity.

Conceptually, what we would like to do is measure how well a system can approximate functions of increasing complexity. If systems using input-dependent nonlinearities are able to approximate complex functions with a lower error than linear systems, they are computationally more powerful. Since *complexity* is somewhat ill-defined, we choose the spatial frequency content of a function as a proxy.

In our experiment, we randomly generate band-limited current functions $J_\sigma(x, y)$ over the compact domain $(x, y) \in [-1, 1]^2$ on a $63 \times 63$ grid. Here, $\sigma$ is a spatial low-pass filter coefficient that is inversely proportional to the bandwidth of the function. The functions we generate are normalized such that the mean is zero and the standard deviation equals $1\,\mathrm{nA}$. We measure how well we can approximate this desired postsynaptic current for a single postneuron with a given input-dependent nonlinearity. The preneuron populations are as depicted in Figure 2C. Two independent prepopulations with 100 neurons each represent $x$ and $y$, respectively, and project onto a single postneuron. The population tuning curves are randomly generated in each trial with a maximum firing rate between 50 and 100 s$^{-1}$ per neuron. All preneurons project both inhibitorily and excitatorily onto the postneuron. We measure the static decoding error, that is, the difference between the target current function and the output of the surrogate model assuming ideal prepopulation tuning curves without any dynamics. We consider the dynamical simulation of a spiking network in experiment 3.
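We do not reproduce the letter's exact construction of $J_\sigma$ here; as a stand-in, a Gaussian low-pass of white noise on the same $63 \times 63$ grid, normalized to zero mean and unit standard deviation (read as $1\,\mathrm{nA}$), captures the idea of a random target function with tunable bandwidth:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def random_band_limited(res=63, sigma=4.0, seed=None):
    """Random function on a res x res grid over [-1, 1]^2; a larger sigma
    means stronger spatial low-pass filtering, i.e., a smaller bandwidth."""
    rng = np.random.default_rng(seed)
    f = gaussian_filter(rng.standard_normal((res, res)), sigma, mode="wrap")
    return (f - f.mean()) / f.std()  # zero mean, unit standard deviation

J_sigma = random_band_limited(sigma=8.0, seed=0)
```

Sweeping `sigma` then traces out the bandwidth axis of the experiment: small values yield rugged targets with many local extrema, large values yield smooth, almost linear ones.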

All synaptic weights are computed by solving the QP in equation 2.20 for 256 randomly selected training samples. For the nonlinear, input-dependent current function of the conductance-based two-compartment LIF neuron, $H_\mathrm{cond} = H$, we use the model parameters derived in the previous experiment (see Table A2). We emulate linear input currents $H_\mathrm{cur} = J_\mathrm{E} - J_\mathrm{I}$ in terms of $H$ by setting the model parameters to $b_0 = a_1 = a_2 = 0$, $a_0 = b_1 = 1$, $b_2 = -1$. The regularization parameters were selected independently for each setup (see Figure A1). The final error is the normalized RMSE (relative to the standard deviation of $1\,\mathrm{nA}$ of $f_\sigma$) over all 3969 grid points. We add normally distributed noise (zero mean, unit standard deviation) to the preactivities when computing the error to test how well the computed weights generalize.

As a further point of comparison, we include a two-layer neural network setup (see Figure 2B). The 200 preneurons are tuned to both input dimensions $(x,y)$, and not $x$ and $y$ independently. This setup should perform the best—but keep in mind that when modeling neurobiological systems, we may be constrained to prepopulations that do not exhibit the kind of multivariate tuning to the input signals we assume here.

Results are depicted in Figure 7. For a current-based neuron and without subthreshold relaxation (dashed line in the plot), the median error increases linearly on a log-log plot, from a 2.5% error for low-frequency—almost linear—functions to an error of about 50% for functions with a spatial cutoff frequency greater than one. Subthreshold relaxation reduces this error by up to 50% for low $\sigma^{-1}$.

The error for the conductance-based nonlinearity increases sublinearly on a log-log plot, starting at median errors of about 0.8% for low-frequency functions. It is competitive with the two-layer network (see below) for $\sigma^{-1} < 0.5$. The error function converges to the results for the current-based model for spatial frequencies $\sigma^{-1} > 1$. Overall, the error for the conductance-based model is reduced by up to 65% compared to the current-based model. The benefits of subthreshold relaxation are not as pronounced as for the linear current model.

The large errors for both the current-based and the conductance-based models for $\sigma^{-1} > 1$ can be explained by the fact that neither function can be used to solve the XOR problem (cf. appendix C). Functions with $\sigma^{-1} \geq 1$ are likely to possess multiple maxima and minima over $[-1, 1]^2$, akin to XOR, leading to a large error.

The two-layer network setup is able to approximate functions well up to $\sigma^{-1} = 10$, where it reaches the same final error values as the other setups. The complexity of the functions that can be approximated well by this setup is limited by the number of preneurons. The two-layer network can be thought of as linearly combining rectified hyperplanes to fit the target function, where each hyperplane is a single preneuron response. At a certain $\sigma$, the number of hyperplanes is no longer sufficient to reconstruct all the local maxima and minima in the target function.

To summarize, this experiment demonstrates that using a neuron model with the input-dependent nonlinearity $H$ significantly reduces the approximation error for current functions with $\sigma^{-1} < 1$ in a network setup with prepopulations independently representing the input dimensions. It is competitive with a two-layer network for $\sigma^{-1} < 0.5$.

### 3.3 Experiment 3: Dendritic Computation of Multivariate Functions in Spiking Network Models

Experiment 1 suggests that we can use the nonlinear postsynaptic current model $H$ to predict the average current flowing into the somatic compartment. Experiment 2 shows that we can, assuming that $H$ accurately describes the somatic current, approximate a somewhat larger class of random functions well. In our final experiment, we study whether we can still observe the reduction in error in a feedforward spiking neural network when using our model $H$ to solve for weights.

In contrast to the previous experiment, we do not base our error measurements on the decoded static somatic current for a single postneuron. Instead, we decode the represented value from the neural activities of a target population over time. Optimally, this decoded value should be $f(x(t), y(t))$, where $f$ is the function that we want to approximate and $x(t), y(t) \in [-1, 1]$ are input signals in turn represented by populations of 100 LIF neurons each. The target population consists of either standard current-based LIF neurons (see Figure 2A) or conductance-based two-compartment neurons (see Figure 2C).

Just as in the previous experiment, we consider the two-layer topology in Figure 2B as a point of reference. Here, the input is mediated via an additional layer of 200 neurons representing the vectorial quantity $(x(t), y(t))$ over the square $[-1, 1]^2$.

For all neurons, we generate tuning curves such that the maximum firing rate falls between 50 and 100 s$^{-1}$ over their represented range. Neurons are randomly marked as either excitatory or inhibitory; the probability of a neuron being inhibitory is 30%. Excitatory and inhibitory synapses are modeled as first-order exponential low-pass filters with time constants of $\tau_\mathrm{E} = 5\,\mathrm{ms}$ and $\tau_\mathrm{I} = 10\,\mathrm{ms}$, respectively.
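The excitatory/inhibitory assignment and the synapse model described above can be sketched as follows. This is an illustrative implementation under the stated parameters (30% inhibitory, $\tau_\mathrm{E} = 5\,\mathrm{ms}$, $\tau_\mathrm{I} = 10\,\mathrm{ms}$), not the code used in the experiments; the function and variable names are our own.

```python
import numpy as np

def make_synapse_filter(tau, dt=1e-4):
    """First-order exponential low-pass, discretized with a zero-order hold.

    For a constant input u, the filter state converges to u with time
    constant tau.
    """
    decay = np.exp(-dt / tau)

    def step(state, u):
        return decay * state + (1.0 - decay) * u

    return step

rng = np.random.default_rng(0)
n_neurons = 100

# Dale's principle: each neuron is marked as either excitatory or
# inhibitory; the probability of being inhibitory is 30%.
is_inhibitory = rng.random(n_neurons) < 0.3

# Synaptic time constants from the text: 5 ms (excitatory), 10 ms (inhibitory).
tau_syn = np.where(is_inhibitory, 10e-3, 5e-3)
```

Filtering every spike train through such a first-order low pass is what turns discrete spike events into the smooth conductance traces driving the postsynaptic neurons.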

The network is simulated over 10 s at a time resolution of $100\,\mu\mathrm{s}$. Inputs $x(t)$ and $y(t)$ are sampled by moving through time along a fourth-order space-filling Hilbert curve over $[-1,1]^2$. The output of the target population is decoded and filtered with a first-order exponential low pass at $\tau = 100\,\mathrm{ms}$. We compute the desired target value $f(x,y)$ from the original input and pass it through the same series of low-pass filters as the spiking signals. We use the average synaptic time constant of 7.5 ms to emulate the effect of the synaptic low-pass filters. Our final measure $E_\mathrm{net}$ is the normalized RMSE between the decoded output and the target values over time; the normalization is relative to the standard deviation of the target signal.
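The input sweep and error measure can be reproduced with a short sketch. The Hilbert-curve mapping below is the classic bit-manipulation construction (not code from the letter's supplemental material); `nrmse` implements the normalization described above.

```python
import numpy as np

def hilbert_d2xy(order, d):
    """Map index d along a 2**order x 2**order Hilbert curve to grid (x, y).

    Standard iterative construction: at each scale, determine the quadrant
    and rotate/reflect the coordinates accumulated so far.
    """
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t >> 1)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t >>= 2
        s <<= 1
    return x, y

def hilbert_inputs(order=4):
    """Sample (x, y) points in [-1, 1]^2 along a Hilbert curve of the
    given order; consecutive points are neighbors on the grid, so a sweep
    through the returned array is a continuous input trajectory."""
    n = 1 << order
    pts = np.array([hilbert_d2xy(order, d) for d in range(n * n)], dtype=float)
    return 2.0 * pts / (n - 1) - 1.0

def nrmse(decoded, target):
    """RMSE normalized by the standard deviation of the target signal."""
    return np.sqrt(np.mean((decoded - target) ** 2)) / np.std(target)
```

A fourth-order curve visits all $16 \times 16 = 256$ grid cells exactly once, which is why a single 10 s sweep covers the input square densely and evenly.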

All synaptic weights are computed by solving the QP in equation 2.20 with the same mapping of the current-based onto the conductance-based model as in experiment 2. The regularization term $\lambda$ has been chosen independently for each neuron type and model parameter set such that the network error $E_\mathrm{net}$ is minimized when computing multiplication (cf. Figure B1).
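As a minimal stand-in for the full QP (which additionally handles Dale's principle and the subthreshold relaxation discussed in section 2), the core of the weight solve is a regularized nonnegative least-squares problem. The sketch below uses SciPy's NNLS solver on an augmented system; `A` would hold prepopulation activities and `J` the desired somatic currents.

```python
import numpy as np
from scipy.optimize import nnls

def solve_nonneg_weights(A, J, lam=1e-3):
    """Solve min_w ||A w - J||^2 + lam * ||w||^2  subject to  w >= 0.

    Tikhonov regularization is folded into the NNLS problem by stacking
    sqrt(lam) * I below A and zeros below J. This is only a simplified
    stand-in for the QP in equation 2.20.
    """
    n = A.shape[1]
    A_aug = np.vstack([A, np.sqrt(lam) * np.eye(n)])
    J_aug = np.concatenate([J, np.zeros(n)])
    w, _residual = nnls(A_aug, J_aug)
    return w
```

Because the problem is a convex quadratic program, the solver is guaranteed to find the global optimum, which is the property the letter contrasts with gradient-descent-based alternatives.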

#### 3.3.1 Experiment 3.1: Random Band-Limited Functions

Notably, the minimum median approximation error of the two-layer network is about 10%, whereas the single-layer current- and conductance-based models reach minimum errors of about 6.5% and 5%, respectively. The two-layer network clearly surpasses the performance of the two-compartment LIF single-layer network for $\sigma^{-1} > 0.6$. The larger errors are mainly caused by the representation of the two-dimensional quantity $(x(t), y(t))$ being noisier than the representation of the scalars $x(t)$, $y(t)$ in the two prepopulations. This is because chaining multiple populations of spiking neurons slightly increases the noise floor. Furthermore, to cover the square $[-1,1]^2$ as densely as two one-dimensional intervals $[-1,1]$, we would optimally have to square the number of neurons. In our case, we would have to use 10,000 instead of 200 neurons for the intermediate layer, which would not be comparable to the single-layer setups; keep in mind that the two-layer network already uses 66% more neurons.

#### 3.3.2 Experiment 3.2: Benchmark Functions

While the random functions in the above experiments (see Figure 7B for an example) are useful for systematically characterizing the individual setups, it is hard to tell from these data alone what the practical impact of the two-compartment LIF neuron is. To this end, we selected eight mathematical benchmark functions $f(x,y)$ and repeated the experiment. Functions include the maximum $\max(x,y)$ and various forms of multiplication ($x \times y$, $\sqrt{x \times y}$, $(x \times y)^2$; see Table A3 for a complete list). Note that we compute all these functions over the interval $[0,1]^2$ instead of $[-1,1]^2$ by shifting and scaling the values represented by the neuron populations; that is, we compute $f\bigl((x+1)/2, (y+1)/2\bigr)$. As mentioned above and proved in appendix C, we know that we are not able to solve the XOR problem with the two-compartment LIF neuron, and multiplication over $[-1,1]^2$ can be thought of as a continuous form of XOR. We should be able to approximate multiplication over one quadrant, however.^{4}
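The domain shift can be made concrete with a few lines of code. The dictionary below lists an illustrative subset of the benchmark targets (the full list is in Table A3); `shifted` wraps a function so that it is evaluated over one quadrant, as described above.

```python
import numpy as np

# Illustrative subset of the benchmark targets; see Table A3 for the
# complete list of eight functions used in the experiment.
benchmarks = {
    "max":            lambda x, y: np.maximum(x, y),
    "multiplication": lambda x, y: x * y,
    "mult_squared":   lambda x, y: (x * y) ** 2,
}

def shifted(f):
    """Evaluate f over [0, 1]^2 by mapping represented values from
    [-1, 1]^2, i.e. compute f((x + 1) / 2, (y + 1) / 2)."""
    return lambda x, y: f((x + 1.0) / 2.0, (y + 1.0) / 2.0)
```

After the shift, both arguments of the multiplication are nonnegative, so the target never requires the sign inversion that makes full four-quadrant multiplication XOR-like.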

Notes: Error values $E_\mathrm{net}$ correspond to the normalized RMSE (relative to the standard deviation of the target function) and are measured as the difference between the output decoded from the target population and the desired output for a 10 s sweep across a fourth-order 2D Hilbert curve over the input space. Results are the mean and standard deviation over 256 trials. The best result for a target function is in bold; darker background colors indicate a worse ranking of the result in the corresponding row. Columns labeled “Standard” refer to the default, single-layer network setup, “Two Layers” refers to the two-layer setup, and “Noise Model” to the single-layer network setup with model parameters derived under noise (see experiment 1.2). Additional tables are in appendix A.

$a$With subthreshold relaxation.

An effect that could contribute to the superior performance of the two-compartment neuron model in some experiments is the low-pass filter dynamics of the dendritic compartment. These dynamics filter out high-frequency spike noise and may thus reduce the measured error. We control for this effect in an experiment described in appendix B, where we add an optimal low-pass filter to each network setup. Results are shown in Table B1. We find that a matched prefilter consistently reduces the error of all setups by only 1% to 2%, which indicates that the low-pass filter dynamics of the dendritic compartment are not the primary source of the reduction in error.

To summarize our experiments, we demonstrate in three stages (validation of the nonlinearity model $H_\mathrm{cond}$ for a single neuron, purely mathematical properties of $H_\mathrm{cond}$, and, finally, performance on a network level) that we are able to successfully incorporate a model (admittedly a simple one) of nonlinear passive dendritic interaction into functional modeling frameworks. Instead of reducing the accuracy of our networks, the added detail can be systematically leveraged for computation. Our experiments also suggest that—at least in a biologically informed setting, that is, using spiking neurons—this type of computation may result in a higher accuracy compared to two-layer architectures that suffer from an increase in the amount of spike-induced temporal noise due to the additional neuron layer.

## 4 Discussion

We derived a mathematical model of input-dependent postsynaptic currents in a two-compartment LIF neuron that can be interpreted as a simple form of passive dendritic computation. We experimentally demonstrated that networks with fewer layers but biophysically plausible nonlinearities can compute a broad range of multivariate functions as well as or better than networks typically constructed using functional modeling frameworks. In particular, we proposed a mathematical model $H$ that captures nonlinear interactions between input channels—for example, caused by conductance-based synapses or the dendritic tree. By mapping individual channel states onto an average somatic current $J$, this model can be integrated into mathematical frameworks that classically rely on current-based input channels.
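To make the role of $H$ concrete: for the two-compartment model, the surrogate nonlinearity reduces to a ratio of affine functions of the excitatory and inhibitory conductances. The coefficients below are arbitrary placeholders chosen only to illustrate the qualitative shape; in practice they are derived from the neuron's biophysical parameters or fitted to simulation data.

```python
def H_cond(g_E, g_I, a=(1.0, 0.1, 0.2), b=(0.0, 1.0, -0.5)):
    """Surrogate dendritic nonlinearity: average somatic current J as a
    ratio-of-affine function of the channel conductances g_E, g_I.

    The coefficient tuples a, b are placeholders, not fitted values.
    Excitation (b[1] > 0) increases J sublinearly (saturating at
    b[1] / a[1] for large g_E), while inhibition (b[2] < 0) divisively
    and subtractively reduces it.
    """
    num = b[0] + b[1] * g_E + b[2] * g_I
    den = a[0] + a[1] * g_E + a[2] * g_I
    return num / den
```

It is exactly this multiplicative interaction between $g_E$ and $g_I$ in the denominator that the weight optimization exploits to compute nonlinear multivariate functions in a single layer.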

Specifically, we demonstrated how to incorporate the dendritic nonlinearity $H$ into the Neural Engineering Framework (NEF). To this end, we discussed extensions to the NEF that allow us to optimize for nonnegative synaptic weights that invoke a desired somatic current $J$ and relax the optimization problem by taking subthreshold currents into account. We combined these methods with a specific surrogate model for $H$ in the context of a two-compartment LIF neuron. Finally, we performed a series of spiking neural network simulations that show that our methods allow dendritic nonlinearities to be systematically exploited to efficiently approximate nonlinear multivariate functions up to a certain spatial bandwidth.

While our approach is a step toward providing a general model of dendritic computation in top-down neurobiological modeling frameworks, it admittedly has several limitations. Most important, we treat the dendritic nonlinearity $H$ as time independent. Correspondingly, we implicitly assume that synaptic time constants typically dominate the overall neuronal dynamics. However, dendritic trees in biology—especially when considering active channels and dendritic spikes (Koch, 1999)—possess filter properties and adaptation processes that are not accounted for in our model. It would be interesting to incorporate the dynamical properties of dendritic trees into the NEF by employing the recent techniques presented by Voelker and Eliasmith (2018).

A further shortcoming of the derivation of the surrogate model of $H$ for the two-compartment neuron model is the assumption that the average somatic membrane potential is constant. While we are able to alleviate this assumption to some degree by fitting the model parameters to simulation data, the exact model parameters depend on the specific working regime in which the neuron is used. Deviations from the modeled behavior are particularly apparent in situations with output firing rates smaller than 10 spikes per second (see Figures 5 and 6). Correspondingly, the dendritic nonlinearity presented in this letter may not be a suitable model for brain areas featuring extremely low maximum firing rates. There are two potential ways to work around this limitation. First, it may be possible to include an input-dependent membrane potential term in the nonlinearity. Or, second, one could directly use a sampled model for $H$. While these approaches are compatible with the concept of dendritic nonlinearity as introduced above, both increase the mathematical complexity of the weight optimization problem to a point where strategies such as stochastic gradient descent are required. These techniques tend to have significantly weaker guarantees regarding finding an optimal solution compared to the convex quadratic programs employed in this letter.

In light of the above limitations, we reemphasize that, as stated in section 1, our goal is not to provide a detailed mechanistic model of dendritic computation. Instead, we hope to provide a useful tool that captures essential aspects of dendritic computation—a nonlinear interaction between input channels—while being computationally cheap and mathematically tractable but still grounded in biophysics. This helps to bridge the gap between purely abstract functional networks and more biophysically grounded mechanisms.

A potential application of our work outside of neurobiological modeling is programming neuromorphic hardware. Neuromorphic computers are inspired by neurobiological principles and promise to reduce the energy consumption of certain computational problems by several orders of magnitude compared to conventional computers (Boahen, 2017). Especially when considering mixed analog-digital neuromorphic hardware systems, it should be possible to achieve a higher energy efficiency by implementing a more complex model neuron—such as the two-compartment LIF neuron discussed here—and performing local analog computation. Potential future work in this regard would be to validate our methods on a neuromorphic computing platform that implements dendritic trees, such as the BrainScales 2 system (Schemmel, Kriener, Müller, & Meier, 2017).

Another line of future work is to consider arbitrary configurations of passive dendritic trees beyond the two-compartment LIF model. By applying Kirchhoff's circuit laws, any passive dendritic tree configuration can be described as a linear dynamical system. Correspondingly, it is possible to derive the dendritic nonlinearity $H$. It would be interesting to see whether it is still possible to optimize connection weights relatively quickly and to what extent the number of compartments influences the computational power of the dendritic nonlinearity.
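The Kirchhoff argument can be illustrated for a simple chain of passive compartments: in steady state, the compartment potentials satisfy a linear system whose matrix collects leak, synaptic, and coupling conductances, and the current into the soma follows by solving it. This sketch is our own illustration (with the leak reversal fixed at zero and the soma clamped), not the parametrization used in the letter.

```python
import numpy as np

def somatic_current_passive_chain(g_c, g_leak, g_syn, E_syn, v_som=0.0):
    """Steady-state current from a passive dendritic chain into the soma.

    Compartment 0 couples to the soma (clamped at v_som) via g_c[0];
    compartment k couples to k + 1 via g_c[k + 1]. g_syn[k] / E_syn[k]
    list the synaptic conductances and reversal potentials in
    compartment k; the leak reversal potential is taken as zero.
    """
    n = len(g_leak)
    G = np.zeros((n, n))
    b = np.zeros(n)
    for k in range(n):
        # Leak plus synaptic conductances on the diagonal; reversal
        # potentials drive the right-hand side.
        G[k, k] = g_leak[k] + sum(g_syn[k])
        b[k] = sum(gs * Es for gs, Es in zip(g_syn[k], E_syn[k]))
    # Link between compartment 0 and the clamped soma.
    G[0, 0] += g_c[0]
    b[0] += g_c[0] * v_som
    # Internal links between compartments k and k + 1.
    for k in range(n - 1):
        g = g_c[k + 1]
        G[k, k] += g
        G[k + 1, k + 1] += g
        G[k, k + 1] -= g
        G[k + 1, k] -= g
    v = np.linalg.solve(G, b)        # steady-state compartment potentials
    return g_c[0] * (v[0] - v_som)   # current flowing into the soma
```

Because the somatic current is a ratio of polynomials in the conductances, each added compartment raises the degree of the resulting nonlinearity, which is one way to frame the question of how compartment count affects computational power.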

In conclusion, we believe that the methods proposed here provide a solid grounding for future work exploring both detailed biophysical mechanisms in the context of functional spiking networks and improving neuromorphic methods for neural computation. We have shown how to cast the determination of connection weights in a functional network with conductance-based synapses as an optimization problem with guaranteed convergence to the minimum. This optimization not only exploits known dendritic nonlinearities but respects specifiable network topologies that conform to Dale's principle. The results are functional spiking networks with improved accuracy and biophysical plausibility using fewer neurons than competing approaches.

## Notes

^{1}

The methods presented in this section have been implemented as a software library extending the Nengo simulation package; see https://github.com/astoeckel/nengo-bio. A stand-alone version used to conduct the experiments in this letter can be found in the supplemental material.

^{2}

Our software library nengo-bio facilitates the process of specifying network topologies with excitatory and inhibitory populations. The library provides special syntactic sugar for networks with inhibitory interneurons built without the Parisien transform.

^{3}

osqp is used as part of libbioneuronqp; see https://github.com/astoeckel/libbioneuronqp.

^{4}

The obvious solution to approximating “full” multiplication using two-compartment LIF neurons is to split the target population into four quadrants; however, we wanted to use network setups that are not optimized for a particular problem.

## Acknowledgments

We thank Aaron R. Voelker for his comments on earlier drafts of this letter, as well as his advice and constructive criticism regarding our experiments. This work was supported by the Canada Research Chairs program (no grant number), NSERC Discovery grant 261453, and AFOSR grant FA9550-17-1-002.