## Abstract

Fox and Lu introduced a Langevin framework for discrete-time stochastic models of randomly gated ion channels such as the Hodgkin-Huxley (HH) system. They derived a Fokker-Planck equation with state-dependent diffusion tensor $D$ and suggested a Langevin formulation with noise coefficient matrix $S$ such that SS$⊤=D$. Subsequently, several authors introduced a variety of Langevin equations for the HH system. In this article, we present a natural 14-dimensional dynamics for the HH system in which each directed edge in the ion channel state transition graph acts as an independent noise source, leading to a 14 $×$ 28 noise coefficient matrix $S$. We show that (1) the corresponding 14D system of ordinary differential equations is consistent with the classical 4D representation of the HH system; (2) the 14D representation leads to a noise coefficient matrix $S$ that can be obtained cheaply on each time step, without requiring a matrix decomposition; (3) sample trajectories of the 14D representation are pathwise equivalent to trajectories of Fox and Lu's system, as well as trajectories of several existing Langevin models; (4) our 14D representation (and those equivalent to it) gives the most accurate interspike interval distribution, not only with respect to moments but under both the $L1$ and $L∞$ metric-space norms; and (5) the 14D representation gives an approximation to exact Markov chain simulations that are as fast and as efficient as all equivalent models. Our approach goes beyond existing models, in that it supports a stochastic shielding decomposition that dramatically simplifies $S$ with minimal loss of accuracy under both voltage- and current-clamp conditions.

## 1  Introduction

Many natural phenomena exhibit stochastic fluctuations arising at the molecular scale, the effects of which impact macroscopic quantities. Understanding when and how microscale fluctuations will significantly contribute to macroscale behavior is a fundamental problem spanning the sciences. The impact of random ion channel fluctuations on the timing of action potentials in nerve cells provides an important example. Channel noise can have a significant effect on spike generation (Mainen & Sejnowski, 1995; Schneidman, Freedman, & Segev, 1998), propagation along axons (Faisal & Laughlin, 2007), and spontaneous (ectopic) action potential generation in the absence of stimulation (O'Donnell & van Rossum, 2015). At the network level, channel noise can drive the endogenous variability of vital rhythms such as respiratory activity (Yu, Dhingra, Dick, & Galán, 2017).

Hodgkin and Huxley's quantitative model for active sodium and potassium currents producing action potential generation in the giant axon of Loligo suggested an underlying system of gating variables consistent with a multistate Markov process description (Hill & Chen, 1972). The discrete nature of individual ion channel conductances was confirmed experimentally (Neher & Sakmann, 1976). Subsequently, numerical studies of high-dimensional discrete-state, continuous-time Markov chain models produced insights into the effects of fluctuations in discrete ion channel populations on action potentials (Skaugen & Walløe, 1979; Strassberg & DeFelice, 1993), also known as channel noise (White, Klink, Alonso, & Kay, 1998; White, Rubinstein, & Kay, 2000).

In the standard molecular-level HH model, which we adopt here, the K$+$ channel comprises four identical $n$ gates that open and close independently, giving a five-vertex channel-state diagram with eight directed edges; the channel conducts a current only when in the rightmost state (see Figure 1, top). The Na$+$ channel comprises three identical $m$ gates and a single $h$ gate, all independent, giving an eight-vertex diagram with 20 directed edges, of which one is conducting (see Figure 1, bottom).

Figure 1:

Molecular potassium (K$+$) and sodium (Na$+$) channel states for the Hodgkin-Huxley model. Filled circles mark conducting states $n4$ and $m31$. Per capita transition rates for each directed edge ($αn$, $βn$, $αm$, $βm$, $αh$ and $βh$) are voltage dependent (cf. equations B.1 and B.6). Directed edges are numbered 1 to 8 (K$+$ channel) and 1 to 20 (Na$+$-channel), marked in small red numerals.

Figure 1:

Molecular potassium (K$+$) and sodium (Na$+$) channel states for the Hodgkin-Huxley model. Filled circles mark conducting states $n4$ and $m31$. Per capita transition rates for each directed edge ($αn$, $βn$, $αm$, $βm$, $αh$ and $βh$) are voltage dependent (cf. equations B.1 and B.6). Directed edges are numbered 1 to 8 (K$+$ channel) and 1 to 20 (Na$+$-channel), marked in small red numerals.

Discrete-state channel noise models are numerically intensive, whether implemented using discrete-time binomial approximations to the underlying continuous-time Markov process (Skaugen & Walløe, 1979; Schmandt & Galán, 2012) or continuous-time hybrid Markov models with exponentially distributed state transitions and continuously varying membrane potential. The latter were introduced by Clay and DeFelice (1983) and are in principle exact (Anderson, Ermentrout, & Thomas, 2015). Under voltage-clamp conditions, the hybrid conductance-based model reduces to a time-homogeneous Markov chain (Colquhoun & Hawkes, 1981) that can be simulated using standard methods such as Gillespie's exact algorithm (Gillespie, 1977, 2007). Even with this simplification, such Markov chain (MC) algorithms are numerically expensive to simulate with realistic population sizes of thousands of channels or greater. Therefore, there is an ongoing need for efficient and accurate approximation methods.

Following Clay and DeFelice's exposition of continuous time Markov chain implementations, Fox and Lu (1994) introduced a Fokker-Planck equation (FPE) framework that captured the first- and second-order statistics of HH ion channel populations in a 14-dimensional representation. Taking into account conservation of probability, one needs four variables to represent the population of K$+$ channels, seven for Na$+$, and one for voltage, leading to a 12-dimensional state-space description. The resulting high-dimensional partial differential equation is impractical to solve numerically. However, as Fox and Lu observed, “To every Fokker-Planck description, there is associated a Langevin description” (Fox & Lu, 1994). They therefore introduced a Langevin stochastic differential equation:
$CdVdt=Iapp(t)-g¯NaM8V-VNa-g¯KN5V-VK-gleak(V-Vleak),$
(1.1)
$dMdt=ANaM+S1ξ1,$
(1.2)
$dNdt=AKN+S2ξ2,$
(1.3)
where $C$ is the capacitance, $Iapp$ is the applied current, maximal conductances are denoted $g¯ion$, with $Vion$ being the associated reversal potential, and ohmic leak current $gleak(V-Vleak)$. $M∈R8$ and $N∈R5$ are vectors for the fractions of Na$+$ and K$+$ channels in each state, with $M8$ representing the open channel fraction for Na$+$, and $N5$ the open channel fraction for K$+$ (see Figure 1). Vectors $ξ1(t)∈R8$ and $ξ2(t)∈R5$ are independent gaussian white noise processes with zero mean and unit variances $〈ξ1(t)ξ1⊺(t')〉=I8δ(t-t')$ and $〈ξ2(t)ξ2⊺(t')〉=I5δ(t-t')$. The state-dependent rate matrices $ANa$ and $AK$ are given in equations 2.10 and 2.11. In Fox and Lu's formulation, $S$ must satisfy $S=D$, where $D$ is a symmetric, positive semidefinite $k×k$ “diffusion matrix” (see appendix D for the $D$ matrices for the standard HH K$+$ and Na$+$ channels). We will refer to the 14-dimensional Langevin equations 1.1 to 1.3, with $S=D$, as the Fox-Lu model.
The original Fox-Lu model, later called the conductance noise model by Goldwyn and Shea-Brown (2011), did not see widespread use until gains in computing speed made the square root calculations more feasible. Seeking a more efficient approximation, Fox and Lu (1994) also introduced a four-dimensional Langevin version of the HH model. This model was systematically studied in Fox (1997), which can be written as
$CdVdt=Iapp(t)-g¯Nam3hV-VNa-g¯Kn4V-VK-gleak(V-Vleak),$
(1.4)
$dxdt=αx(1-x)-βxx+ξx(t),wherex=m,h,or,n,$
(1.5)
where $ξx(t)$ are gaussian processes with covariance function
$E[ξx(t),ξx(t')]=αx(1-x)+βxxNδ(t-t').$
(1.6)
Here $N$ represents the total number of Na$+$channels (respectively, the total number of K$+$channels), and $δ(·)$ is the Dirac delta function. This model, referred to as the “subunit noise model” by Goldwyn and Shea-Brown (2011), has been widely used as an approximation to MC ion channel models (see the references in Bruce, 2009; Goldwyn & Shea-Brown, 2011). For example, Schmid et al. (2001) used this approximation to investigate stochastic resonance and coherence resonance in forced and unforced versions of the HH model (e.g. in the excitable regime). However, the numerical accuracy of this method was criticized in several studies (Mino, Rubinstein, & White, 2002; Bruce, 2009), which found that its accuracy does not improve even with increasing numbers of channels.

Although more accurate approximations based on Gillespie's algorithm (using a piecewise constant propensity approximation: Bruce, 2009; Mino et al., 2002) and even based on exact simulations (Clay & DeFelice, 1983; Newby, Bressloff, & Keener, 2013; Anderson et al., 2015) became available, they remained prohibitively expensive for large network simulations. Meanwhile, Goldwyn and Shea-Brown's rediscovery of Fox and Lu's earlier conductance-based model (Goldwyn & Shea-Brown, 2011; Goldwyn, Shea-Brown, & Rubinstein, 2010) launched a flurry of activity seeking the best Langevin-type approximation. Goldwyn and Shea-Brown (2011) introduced a faster decomposition algorithm to simulate equations 1.1 to 1.3 and showed that Fox and Lu's (1994) method accurately captured the fractions of open channels and the interspike interval (ISI) statistics, in comparison with Gillespie-type Monte Carlo (MC) simulations. However, despite the development of efficient singular value decomposition based algorithms for solving $S=D$, this step still causes a bottleneck in the algorithms based on Fox and Lu (1994), Goldwyn and Shea-Brown (2011), and Goldwyn, Imennov, Famulare, and Shea-Brown (2011).

Many variations on Fox and Lu's (1994) Langevin model have been proposed in recent years (Dangerfield, Kay, & Burrage, 2010, 2012; Linaro, Storace, & Giugliano, 2011; Orio & Soudry, 2012; Güler, 2013b; Huang, Rüdiger, & Shuai, 2013, 2015; Pezo, Soudry, & Orio, 2014; Fox, 2018), including Goldwyn et al.'s work (Goldwyn & Shea-Brown, 2011; Goldwyn et al., 2011), each with its own strengths and weaknesses. One class of methods imposes projected boundary conditions (Dangerfield et al., 2010, 2012); as we will show in section 5, this approach leads to inaccurate interspike interval distribution and is inconsistent with a natural multinomial invariant manifold structure for the ion channels. Several methods implement correlated noise at the subunit level, as in equations 1.5 and 1.6 (Fox, 1997; Linaro et al., 2011; Güler, 2013a, 2013b). However, if one recognizes that at the molecular level, the individual directed edges represent the independent noise sources in ion channel dynamics, then the approach incorporating noise at the subunit level obscures the biophysical origin of ion channel fluctuations. Some methods introduce the noisy dynamics at the level of edges rather than nodes but lump reciprocal edges together into pairs (Orio & Soudry, 2012; Dangerfield et al., 2012; Huang et al., 2013; Pezo et al., 2014). This approach implicitly assumes, in effect, that the ion channel probability distribution satisfies a detailed balance (or microscropic reversibility) condition. However, while detailed balance holds for the HH model under stationary voltage clamp, this condition is violated during active spiking. Finally, the stochastic shielding approximation (Schmandt & Galán, 2012; Schmidt & Thomas, 2014; Schmidt et al., 2018) does not have a natural formulation in the representation associated with an $n×n$ noise coefficient matrix $S$; in the cases of rectangular $S$ matrices used in Orio and Soudry (2012) and Dangerfield et al. (2012) stochastic shielding can only be applied to reciprocal pairs of edges. We elaborate on these points in section 6.

In this article, we introduce a new variation of Fox and Lu's conductance-noise model that avoids the limitations we have described. We show that preserving each directed edge in the channel transition graph (see Figure 1) as an independent noise source leads to a natural, biophysically motivated Langevin model that does not require any matrix decomposition step. Our construction lends itself to direct application of stochastic shielding methods, leading to faster simulations that retain the accuracy of Fox and Lu's method.

As an additional benefit, our method answers an open question in the literature, arising from the fact that the decomposition $D=SS⊺$ is not unique. As Fox (2018) recently pointed out, subblock determinants of the $D$ matrices play a major role in the structure of the $S$ matrix elements. Fox conjectured that “a universal form for $S$ may exist.” In this article, we obtain the universal form for the noise coefficient matrix $S$. Moreover, we prove that our model is equivalent to Fox and Lu's 1994 model in the strong sense of pathwise equivalence.

The remainder of the article is organized as follows. In section 2, we review the canonical deterministic 14D version of the HH model. We prove a series of lemmas that show (1) the multinomial submanifold $M$ is an invariant manifold within the 14D space and (2) the velocity on the 14D space and the pushforward of the velocity on the 4D space are identical. Moreover, we show (numerically) that (3) the submanifold $M$ is globally attracting, even under current clamp conditions. Figure 2 illustrates the relationship between the 4D and 14D deterministic HH models. Section 3 lays out our $14×28$ Langevin HH model. Like Orio and Soudry (2012), Dangerfield et al. (2012), and Pezo et al. (2014), we avoid matrix decomposition by computing $S$ directly. The key difference between our approach and its closest relative (Pezo et al., 2014) is to use a rectangular $n×k$ matrix $S$ for which each directed edge is treated as an independent noise source rather than lumping reciprocal edges together in pairs. In the new Langevin model, the form of our $S$ matrix reflects the biophysical origins of the underlying channel noise and allows us to apply the stochastic shielding approximation by neglecting the noise on selected individual directed edges. As we prove in section 4, our model (without the stochastic shielding approximation) is pathwise equivalent to all those in a particular class of biophysically derived Langevin models, including those used in Fox and Lu (1994), Goldwyn et al. (2011), Goldwyn and Shea-Brown (2011), Orio and Soudry (2012), Pezo et al. (2014), and Fox (2018). In addition to 4D and 14D deterministic trajectories, Figure 2 shows a stochastic trajectory generated by our Langevin model. Finally, we compare our Langevin model to several alternative stochastic neural models in terms of accuracy (of the full ISI distribution) and numerical efficiency in section 5.

Figure 2:

4D and 14D HH models. The meshed surface is a three-dimensional projection of the 14D state space onto three axes representing the voltage, $v$, the probability of all four potassium gates being in the closed state, $n0$, and the probability of exactly one potassium gate being in the open state, $n1$. Blue curves: Trajectories of the deterministic 14D HH model with initial conditions located on the 4D multinomial invariant submanifold, $M$. We prove that $M$ is an invariant submanifold in section 2.3. Black curve: The deterministic limit cycle solution for the 14D HH model, which forms a closed loop within $M$. Green curve: A trajectory of the deterministic 14D HH model with initial conditions (vertical green arrow) off the multinomial submanifold. Red curve: A trajectory of the stochastic 14D HH model (see section 3) with the same initial conditions as the green trajectory. The blue and black arrows mark the directions of the trajectories. Note that trajectories starting away from $M$ converge to $M$, and all deterministic trajectories converge to the deterministic limit cycle. Parameters of the simulation are given in Table 5.

Figure 2:

4D and 14D HH models. The meshed surface is a three-dimensional projection of the 14D state space onto three axes representing the voltage, $v$, the probability of all four potassium gates being in the closed state, $n0$, and the probability of exactly one potassium gate being in the open state, $n1$. Blue curves: Trajectories of the deterministic 14D HH model with initial conditions located on the 4D multinomial invariant submanifold, $M$. We prove that $M$ is an invariant submanifold in section 2.3. Black curve: The deterministic limit cycle solution for the 14D HH model, which forms a closed loop within $M$. Green curve: A trajectory of the deterministic 14D HH model with initial conditions (vertical green arrow) off the multinomial submanifold. Red curve: A trajectory of the stochastic 14D HH model (see section 3) with the same initial conditions as the green trajectory. The blue and black arrows mark the directions of the trajectories. Note that trajectories starting away from $M$ converge to $M$, and all deterministic trajectories converge to the deterministic limit cycle. Parameters of the simulation are given in Table 5.

Matlab code to generate the figures throughout the article is available on github from https://github.com/shusenpu/Stochastic_shielding and on ModelDB at accession number 266551.

## 2  The Deterministic 4D and 14D HH Models

In this section, we review the classical four-dimensional model of Hodgkin and Huxley (1952) HH), as well as its natural 14-dimensional version (Dayan & Abbott, 2001, sec. 5.7), with variables comprising membrane voltage and the occupancies of five potassium channel states and eight sodium channel states. The deterministic 14D model is the mean field of the channel-based Langevin model proposed by Fox and Lu (1994); this article describes both the Langevin and the mean field versions of the 14D Hodgkin-Huxley system. For completeness of exposition, we briefly review the 4D deterministic HH system and its 14D deterministic counterpart. In section 4, we prove that the sample paths of a class of Langevin stochastic HH models are equivalent; in section 2.3, we review analogous results relating trajectories of the 4D and 14D deterministic ODE systems.

In particular, we show that the deterministic 14D model and the original 4D HH model are dynamically equivalent, in the sense that every flow (solution) of the 4D model corresponds to a flow of the 14D model. The consistency of trajectories between the 14D and 4D models is easy to verify for initial data on a 4D submanifold of the 14D space given by choosing multinomial distributions for the gating variables (Dayan & Abbott, 2001; Goldwyn et al., 2011). Similarly, Keener established results on multinomial distributions as invariant submanifolds of Markov models with ion channel kinetics under several circumstances (Keener, 2009, 2010; Earnshaw & Keener, 2010a, 2010b), but without treating the general current-clamped case. Consistent with these results, we show that the set of all 4D flows maps to an invariant submanifold of the state space of the 14D model. Moreover, we show numerically that solutions of the 14D model with arbitrary initial conditions converge to this submanifold. Thus the original HH model “lives inside” the 14D deterministic model in the sense that the former embeds naturally and consistently within the latter (see Figure 2).

In the stochastic case, the 14D model has a natural interpretation as a hybrid stochastic system with independent noise forcing along each edge of the potassium (8 directed edges) and sodium (20 directed edges) channel state transition graphs. The hybrid model leads naturally to a biophysically grounded Langevin model that we describe in section 3. In contrast to the ODE case, the stochastic versions of the 4D and 14D models are not equivalent (Goldwyn & Shea-Brown, 2011).

### 2.1  The 4D Hodgkin-Huxley Model

The 4D voltage-gated ion channel HH model is a set of four ordinary differential equations:
$Cdvdt=-g¯Nam3h(v-VNa)-g¯Kn4(v-VK)-gL(v-VL)+Iapp,$
(2.1)
$dmdt=αm(v)(1-m)-βm(v)m,$
(2.2)
$dhdt=αh(v)(1-h)-βh(v)h,$
(2.3)
$dndt=αn(v)(1-n)-βn(v)n,$
(2.4)
where $v$ is the membrane potential, $Iapp$ is the applied current, and $0≤m,n,h≤1$ are dimensionless gating variables associated with Na$+$ and K$+$ channels. The constant $g¯ion$ is the maximal value of the conductance for the sodium and potassium channel, respectively. Parameters $Vion$ and $C$ are the ionic reversal potentials and capacitance, respectively. The quantities $αx$ and $βx$, $x∈{m,n,h}$ are the voltage-dependent per capita transition rates, defined in appendix B.

This system is a $C∞$ vector field on a four-dimensional manifold (with boundary) contained in $R4$: $X={-∞ The manifold is forward and backward invariant in time. If $Iapp$ is constant then $X$ has an invariant subset given by $X∩{vmin≤v≤vmax}$, where $vmin$ and $vmax$ are calculated in lemma 1.

As pointed Keener and Sneyd (1998) and Keener (2009) pointed out, for voltage either fixed or given as a prescribed function of time, the equations for $m,h$, and $n$ can be interpreted as the parameterization of an invariant manifold embedded in a higher-dimensional time-varying Markov system. Several papers developed this idea for a variety of ion channel models and related systems (Keener, 2009; Earnshaw & Keener, 2010b), but the theory developed is restricted to the voltage-clamped case.

Under a fixed voltage clamp, the ion channels form a time-homogeneous Markov process with a unique (voltage-dependent) stationary probability distribution. Under a time-varying current clamp, the ion channels nevertheless form a Markov process, albeit no longer time homogeneous. Under these conditions, the ion channel state converges rapidly to a multinomial distribution indexed by a low-dimensional set of time-varying parameters ($m(t),h(t),n(t)$) (Keener, 2010). In the current-clamped case, the ion channel process, considered alone, is neither stationary nor Markovian, making the analysis of this case significantly more challenging from a mathematical point of view.

### 2.2  The Deterministic 14D Hodgkin-Huxley Model

For the HH kinetics given in Figure 1, we define the eight-component state vector $M$ for the Na$+$ gates and the five-component state vector $N$ for the K$+$ gates, respectively, as
$M=[m00,m10,m20,m30,m01,m11,m21,m31]⊺∈[0,1]8.$
(2.5)
$N=[n0,n1,n2,n3,n4]⊺∈[0,1]5,$
(2.6)
where $∑i=03∑j=01mij=1$ and $∑i=04ni=1$. The open probability for the Na$+$ channel is $M8=m31$ and is $N5=n4$ for the K$+$ channel. The deterministic 14D HH equations may be written (compare equations 2.1 to 2.4)
$CdVdt=-g¯NaM8(V-VNa)-g¯KN5(V-VK)-gL(V-VL)+Iapp,$
(2.7)
$dMdt=ANa(V)M,$
(2.8)
$dNdt=AK(V)N,$
(2.9)
where the voltage-dependent drift matrices $ANa$ and $AK$ are given by
$ANa(V)=ANa(1)βm00βh0003αmANa(2)2βm00βh0002αmANa(3)3βm00βh000αmANa(4)000βhαh000ANa(5)βm000αh003αmANa(6)2βm000αh002αmANa(7)3βm000αh00αmANa(8),$
(2.10)
$AK(V)=AK(1)βn(V)0004αn(V)AK(2)2βn(V)0003αn(V)AK(3)3βn(V)0002αn(V)AK(4)4βn(V)000αn(V)AK(5),$
(2.11)
and the diagonal elements
$Aion(i)=-∑j:j≠iAion(j,i),forion∈{Na,K}.$

### 2.3  Relation Between the 14D and 4D Deterministic HH Models

Earnshaw and Keener (2010b) suggest that it is reasonable to expect that the global flow of the 14D system should converge to the 4D submanifold but also that it is far from obvious that it must. Existing theory applies to the voltage-clamped case (Keener, 2009; Earnshaw & Keener, 2010b). Here, we consider instead the current-clamped case, in which the fluctuations of the ion channel state influences the voltage evolution, and vice versa.

In the remainder of this section, we (1) define a multinomial submanifold $M$ and show that it is an invariant manifold within the 14D space, and (2) we show that the velocity on the 14D space and the pushforward of the velocity on the 4D space are identical. In section 2.4, we (3) provide numerical evidence that $M$ is globally attracting within the higher-dimensional space.

In order to compare the trajectories of the 14D HH equations with trajectories of the standard 4D equations, we define lower-dimensional and higher-dimensional domains $X$ and $Y$, respectively, as
$X={-∞
(2.12)
where $Δk$ is the $k$-dimensional simplex in $Rk+1$ given by $y1+…+yk+1=1,yi≥0.$ The 4D HH model $dxdt=F(x)$, equations 2.1 to 2.4, is defined for $x∈X$, and the 14D HH model $dydt=G(y)$, equations 2.7 and 2.9, is defined for $y∈Y$. We introduce a dimension-reducing mapping $R:Y→X$ as in Table 1 and a mapping from lower to higher dimension, $H:X→Y$, as in Table 2. We construct $R$ and $H$ in such a way that $R∘H$ acts as the identity on $X$, that is, for all $x∈X$, $x=R(H(x))$. The maps $H$ and $R$ are consistent with a multinomial structure for the ion channel state distribution, in the following sense. The space $Y$ covers all possible probability distributions on the eight sodium channel states and the five potassium channel states. Those distributions, which are products of one multinomial distribution on the K$+$-channel1and a second multinomial distribution on the Na$+$-channel,2 form a submanifold of $Δ7×Δ4$. In this way we define a submanifold, denoted $M=H(X)$, the image of $X$ under $H$.
Table 1:
$R$: Map from the 14D HH Model $(m00,…,m31,n0,…,n4)$ to the 4D HH Model $(m,h,n)$.
14D Model4D Model
$(v,m00,…,m31,n0,…,n4)$ $(v,m,h,n)$
$v$ $v$
$13(m11+m10)+23(m21+m20)+m31+m30$ $m$
$m01+m11+m21+m31$ $h$
$n1/4+n2/2+3n3/4+n4$ $n$‘
14D Model4D Model
$(v,m00,…,m31,n0,…,n4)$ $(v,m,h,n)$
$v$ $v$
$13(m11+m10)+23(m21+m20)+m31+m30$ $m$
$m01+m11+m21+m31$ $h$
$n1/4+n2/2+3n3/4+n4$ $n$‘

Note: Note that both ${m00,…,m31}$ and ${n0,…,n4}$ follow multinomial distributions.

Table 2:
$H$: Map from the 4D HH Model $(m,h,n)$ and the 14D HH Model $(m00,…,m31,n0,…,n4)$.
4D Model14D Model
$(v,m,h,n)$ $(v,m00,…,m31,n0,…,n4)$
$v$ $v$
$(1-n)4$ $n0$
$4(1-n)3n$ $n1$
$6(1-n)2n2$ $n2$
$4(1-n)n3$ $n3$
$n4$ $n4$
$(1-m)3(1-h)$ $m00$
$3(1-m)2m(1-h)$ $m10$
$3(1-m)m2(1-h)$ $m20$
$m3(1-h)$ $m30$
$(1-m)3h$ $m01$
$3(1-m)2mh$ $m11$
$3(1-m)m2h$ $m21$
$m3h$ $m31$
4D Model14D Model
$(v,m,h,n)$ $(v,m00,…,m31,n0,…,n4)$
$v$ $v$
$(1-n)4$ $n0$
$4(1-n)3n$ $n1$
$6(1-n)2n2$ $n2$
$4(1-n)n3$ $n3$
$n4$ $n4$
$(1-m)3(1-h)$ $m00$
$3(1-m)2m(1-h)$ $m10$
$3(1-m)m2(1-h)$ $m20$
$m3(1-h)$ $m30$
$(1-m)3h$ $m01$
$3(1-m)2mh$ $m11$
$3(1-m)m2h$ $m21$
$m3h$ $m31$
Before showing that the multinomial submanifold $M$ is an invariant manifold within the 14D space, we first show that the deterministic 14D HH model is defined on a bounded domain. Having a bounded forward-invariant manifold is a general property of conductance-based models, which may be written in the form
$dVdt=f(V,Nopen)=1CIapp-gleak(V-Vleak)-∑i∈IgiNopeni(V-Vi)$
(2.13)
$dNdt=A(V)Nand$
(2.14)
$Nopen=O[N].$
(2.15)
Here, $C$ is the membrane capacitance, $Iapp$ is an applied current with upper and lower bounds $I±$, respectively, and $gi$ is the conductance for the $i$th ion channel. The index $i$ runs over the set of distinct ion channel types in the model, $I$. The gating vector $N$ represents the fractions of each ion channel population in various ion channel states, and the operator $O$ gives the fraction of each ion channel population in the open (or conducting) channel states. The following lemma establishes that any conductance-based model (including the 4D or 14D HH model) is defined on a bounded domain.
Lemma 1.

For a conductance-based model of the form 2.13 to 2.15 and for any bounded applied current $I-≤Iapp≤I+$, there exist upper and lower bounds $Vmax$ and $Vmin$ such that trajectories with initial voltage condition $V∈[Vmin,Vmax]$ remain within this interval for all times $t>0$, regardless of the initial channel state.

Proof.
Let $V1=mini∈I{Vi}∧Vleak$, and $V2=maxi∈I{Vi}∨Vleak$, where the index $i$ runs over $I$, the set of distinct ion channel types. Note that for all $i$, $0≤Nopeni≤1$, and $gi>0,gleak>0$. Therefore, when $V≤V1$,
$dVdt=1CIapp-gleak(V-Vleak)-∑i∈IgiNopeni(V-Vi)$
(2.16)
$≥1CIapp-gleak(V-V1)-∑i∈IgiNopeni(V-V1)$
(2.17)
$≥1CIapp-gleak(V-V1)-∑i∈Igi×0×(V-V1)$
(2.18)
$=1CIapp-gleak(V-V1).$
(2.19)
Inequality 2.17 follows because $V1=mini∈I{Vi}∧Vleak$, and inequality 2.18 follows because $V-V1≤0$, $gi>0$ and $Nopeni≥0$. Let $Vmin:=minI-gleak+V1,V1$. When $V, $dVdt>0$. Therefore, $V$ will not decrease beyond $Vmin$.
Similarly, when $V≥V2$,
$dVdt=1CIapp-gleak(V-Vleak)-∑i∈IgiNopeni(V-Vi)$
(2.20)
$≤1CIapp-gleak(V-V2)-∑i∈IgiNopeni(V-V2)$
(2.21)
$≤1CIapp-gleak(V-V2)-∑i∈Igi×0×(V-V2)$
(2.22)
$=1CIapp-gleak(V-V2).$
(2.23)
Inequality 2.21 holds because $V2=maxi∈I{Vi}∨Vleak,$ and inequality 2.22 holds because $V-V2≥0$, $gi>0$ and $Nopeni≥0$. Let $Vmax=maxIappgleak+V2,V2$. When $V>Vmax,$$dVdt<0$. Therefore, $V$ will not go beyond $Vmax$.

We conclude that if $V$ takes an initial condition in the interval $[Vmin,Vmax],$ then $V(t)$ remains within this interval for all $t≥0$.$□$

Given that $y∈Y$ has a bounded domain, lemma 2 follows directly and establishes that the multinomial submanifold $M$ is a forward-time–invariant manifold within the 14D space. The proof of lemma 2 is in appendix C.

Lemma 2.

Let $X$ and $Y$ be the lower-dimensional and higher-dimensional Hodgkin-Huxley manifolds given by equation 2.12, and let $F$ and $G$ be the vector fields on $X$ and $Y$ defined by equations 2.1 to 2.4 and 2.7 to 2.9, respectively. Let $H:X→M⊂Y$ and $R:Y→X$ be the mappings given in Tables 2 and 1, respectively, and define the multinomial submanifold $M=H(X)$. Then $M$ is forward-time-invariant under the flow generated by $G$. Moreover, the vector field $G$, when restricted to $M$, coincides with the vector field induced by $F$ and the map $H$. That is, for all $y∈M$, $G(y)=DxH(R(y))·F(R(y)).$

Lemma 2 establishes that the 14D HH model given by equations 2.7 to 2.9 is dynamically consistent with the original 4D HH model given by equations 2.1 to 2.4.

In section 2.4 we provide numerical evidence that the flow induced by $G$ on $Y$ converges to $M$ exponentially fast. Thus, an initial probability distribution over the ion channel states that is not multinomial quickly approaches a multinomial distribution with dynamics induced by the 4D HH equations. Similar results, restricted to the voltage-clamp setting, were established by Keener and Earnshaw (Keener & Sneyd, 1998; Keener, 2009; Earnshaw & Keener, 2010b).

### 2.4  Local Convergence Rate

Keener and Earnshaw (Keener & Sneyd, 1998; Keener, 2009; Earnshaw & Keener, 2010b) showed that for Markov chains with constant (even time-varying) transition rates, (1) the multinomial probability distributions corresponding to mean-field models (such as the HH sodium or potassium models) form invariant submanifolds within the space of probability distributions over the channel states, and (2) arbitrary initial probability distributions converged exponentially quickly to the invariant manifold. For systems with prescribed time-varying transition rates, such as for an ion channel system under voltage clamp with a prescribed voltage $V(t)$ as a function of time, the distribution of channel states had an invariant submanifold, again corresponding to the multinomial distributions, and the flow on that manifold induced by the evolution equations was consistent with the flow of the full system.

In the preceding section, we established the dynamical consistency of the 14D and 4D models with enough generality to cover both the voltage-clamp and current-clamp systems; the latter is distinguished by not having a prescribed voltage trace, but rather having the voltage coevolve along with the (randomly fluctuating) ion channel states. Here, we give numerical evidence for exponential convergence under current clamp similar to that established under voltage clamp by Keener and Earnshaw.

Rather than providing a rigorous proof, we give numerical evidence for the standard deterministic HH model that $y→M$ under current clamp (spontaneous firing conditions) in the following sense: if $y(t)$ is a solution of $y˙=G(y)$ with arbitrary initial data $y0∈Y$, then $||y(t)-H(R(y(t)))||→0$ as $t→∞$, exponentially quickly. Moreover, the convergence rate is bounded by $λ=max(λv,λNa,λK)$, where $λion$ is the least negative nontrivial eigenvalue of the channel state transition matrix (over the voltage range $Vmin≤v≤Vmax$) for a given ion, and $-1/λv$ is the largest value taken by the membrane time constant (for $Vmin≤v≤Vmax$). In practice, we find that the membrane time constant does not determine the slowest timescale for convergence to $M$. In fact it appears that the second-least-negative eigenvalues (not the least-negative eigenvalues) of the ion channel matrices set the convergence rate.

Note that $y∈Y$ can be written as $y=[V;M;N]$. As shown in appendix C, the Jacobian matrix $∂H∂x$ consists of three block matrices: one for the voltage terms, $∂V∂v$; one associated with the Na$+$ gates, given by $∂M∂m$ and $∂M∂h$; and one corresponding to the K$+$ gates, $∂N∂n$. Fixing a particular voltage $v$, let $λi,i∈{0,1,2,…,7}$ be the eight eigenvalues of $ANa$ and $vi$ be the associated eigenvectors, that is, $ANavi=λivi$ for the rate matrix in equations 2.8. Similarly, let $ηi,wi,i∈{0,1,2,…,4}$ be the five eigenvalues and the associated eigenvectors of $AK$, that is, $AKwi=ηiwi$, for the rate matrix in equations 2.9. If we rank the eigenvalues of either matrix in descending order, the leading eigenvalue is always zero (because the sum of each column for $ANa$ and $AK$ is zero for every $V$) and the remainder are real and negative. Let $λ1$ and $η1$ denote the largest (least negative) nontrivial eigenvalues of $ANa$ and $AK$, respectively, and let $v1∈R8$ and $w1∈R5$ be the corresponding eigenvectors.

The eigenvectors of the full 14D Jacobian are not simply related to the eigenvectors of the component submatrices, because the first (voltage) row and column contain nonzero off-diagonal elements. However, the eigenvectors associated with the largest nonzero eigenvectors of $ANa$ and $AK$ (respectively, $v2$ and $w2$) are parallel to $∂M/∂h$ and $∂N/∂n$, regardless of voltage. In other words, the slowest decaying directions for each ion channel, $v1$ and $w1$, transport the flow along the multinominal submanifold of $Y$. Therefore, it is reasonable to make the hypothesis that if $Y(t)$ is a solution of $y˙=G(y)$ with arbitrary initial data $y∈Y$, then
$∥y(t)-H(R(y(t)))∥∥y(0)-H(R(y(0)))∥≲e-λ2t$
(2.24)
for $λ2$ being the second largest nonzero eigenvalue of $AK$ and $ANa$ over all $v$ in the range $vmin. The convergence behavior is plotted numerically in Figure 3 and is consistent with the Ansatz equation 2.24. We calculate the distance from a point $y$ to $M$ as
$ymax=argmaxy∈Yy-H(R(y))2.$
(2.25)
In order to obtain an upper bound on the distance as a function of time, we begin with the farthest point in the simplex from $M$ by numerically finding the solution to the argument 2.25, which is
$ymax=[v,0.5,0,0,0.5,0,0,0,0,0.5,0,0,0,0.5].$
This vector represents the furthest possible departure from the multinomial distribution: all probability equally divided between the extreme states $m00$ and $m03$ for the sodium channel and the extremal states $n0$ and $n4$ for potassium. The maximum distance from the multinomial submanifold $M$, $dmax$, is calculated using this point. As shown in Figure 3, the function $dmaxe-λ2t$ provides a tight upper bound for the convergence rate from arbitrary initial data $y∈Y$ to the invariant submanifold $M$.
Figure 3:

Convergence of trajectories $y(t)$, for arbitrary initial conditions $y0∈Y$, to the multinomial submanifold $M$, for an ensemble of random initial conditions. (A) Distance (see equation 2.25) between $y(t)$ and $M$. (B) Logarithm of the distance in panel A. The red solid line shows $dmaxe-λ2t$ in panel A and $log(dmax)-λ2t$ in panel B.

Figure 3:

Convergence of trajectories $y(t)$, for arbitrary initial conditions $y0∈Y$, to the multinomial submanifold $M$, for an ensemble of random initial conditions. (A) Distance (see equation 2.25) between $y(t)$ and $M$. (B) Logarithm of the distance in panel A. The red solid line shows $dmaxe-λ2t$ in panel A and $log(dmax)-λ2t$ in panel B.

## 3  Stochastic 14D Hodgkin-Huxley Models

Finite populations of ion channels generate stochastic fluctuations (“channel noise”) in ionic currents that influence action potential initiation and timing (White et al., 1998; Schneidman et al., 1998). At the molecular level, fluctuations arise because transitions between individual ion channel states are stochastic (Hill & Chen, 1972; Neher & Sakmann, 1976; Skaugen & Walløe, 1979). Each directed edge in the ion channel state transition diagrams (see Figure 1) introduces an independent noise source. It is of interest to be able to attribute variability of the interspike interval timing distribution to specific molecular noise sources, specifically individual directed edges in each channel state graph. In order to explore these contributions, we develop a system of Langevin equations for the Hodgkin-Huxley equations, set in a 14-dimensional phase space.

Working with a higher-dimensional stochastic model may appear inconvenient, but in fact has several advantages. First, any projection of an underlying 14D model onto a lower (e.g., 4D) stochastic model generally entails loss of the Markov property. Second, the higher-dimensional representation allows us to assess the contribution of individual molecular transitions to the macroscopically observable variability of timing in the interspike interval distribution. Third, by using a rectangular noise coefficient matrix constructed directly from the transitions in the ion channel graphs, we avoid a matrix decomposition step. This approach leads to a fast algorithm that is equivalent to the slower algorithm due to (Fox & Lu, 1994; Goldwyn & Shea-Brown, 2011) in a strong sense (pathwise equivalence) that we detail in section 4.

### 3.1  Exact Stochastic Simulation of HH Kinetics: The Random–Time-Change Representation

An “exact” representation of the Hodgkin-Huxley system with a population of $Mtot$ sodium channels and $Ntot$ potassium channels treats each of the 20 directed edges in the sodium channel diagram, and each of the 8 directed edges in the potassium channel diagram, as independent Poisson processes, with voltage-dependent per capita intensities. As in the deterministic case, the sodium and potassium channel population vectors $M$ and $N$ satisfy $∑i=03∑j=01Mij≡1≡∑i=04Ni$.3 Thus, they are constrained, respectively, to a 7D simplex embedded in $R8$ and a 4D simplex embedded in $R5$. In the random-time-change representation (Anderson & Kurtz, 2015) the exact evolution equations are written in terms of sums over the directed edges $E$ for each ion channel, $ENa={1,…,20}$ and $EK={1,…,8}$, (see Figure 1),
$M(t)=M(0)+1Mtot∑k∈ENaζkNaYkNaMtot∫0tαkNa(V(s))Mi(k)(s)ds,$
(3.1)
$N(t)=N(0)+1Ntot∑k∈EKζkKYkKNtot∫0tαkK(V(s))Ni(k)(s)ds.$
(3.2)
Here $ζkion$ is the stoichiometry vector for the $k$th directed edge. If we write $i(k)$ for the source node and $j(k)$ for the destination node of edge $k$, then $ζkion=ej(k)ion-ei(k)ion$.4 Each $Ykion(τ)$ is an independent unit-rate Poisson process, evaluated at “internal time” (or integrated intensity) $τ$, representing the independent channel noise arising from transitions along the $k$th edge. The voltage-dependent per capita transition rate along the $k$th edge is $αkion(v)$, and $Mi(k)(s)$ (resp. $Ni(k)(s)$) is the fractional occupancy of the source node for the $k$th transition at time $s$. Thus, for example, the quantity $MtotαkNa(V(s))Mi(k)(s)$ gives the net intensity along the $k$th directed edge in the Na$+$ channel graph at time $s$.
Remark 1.

Under voltage-clamp conditions, with the voltage $V$ held fixed, equations 3.1 and 3.2 reduce to a time-invariant first-order transition process on a directed graph (Schmidt & Thomas, 2014; Gadgil, Lee, & Othmer, 2005).

Under current-clamp conditions, the voltage evolves according to a conditionally deterministic current balance equation of the form
$dVdt=1C{Iapp(t)-g¯NaM31V-VNa-g¯KN4V-VK-gleak(V-Vleak)}.$
(3.3)
Here, $C$ ($μF/cm2$) is the capacitance, $Iapp$ ($nA/cm2$) is the applied current, the maximal conductance is $g¯chan$ ($mS/cm2$), $Vchan$ ($mV$) is the associated reversal potential, and the ohmic leak current is $gleak(V-Vleak)$.

The random-time-change representation, equations 3.1 to 3.3, leads to an exact stochastic simulation algorithm, given in Anderson and Kurtz (2015); equivalent simulation algorithms have been used previously (Clay & DeFelice, 1983; Newby et al., 2013). Many authors substitute a simplified Gillespie algorithm that imposes a piecewise-constant propensity approximation, ignoring the voltage dependence of the transition rates $αkion$ between channel transition events (Goldwyn et al., 2011; Goldwyn & Shea-Brown, 2011; Orio & Soudry, 2012; Pezo et al., 2014). The two methods give similar moment statistics, provided $Ntot,Mtot≳40$ (Anderson & Kurtz, 2015); their similarity regarding path-dependent properties (including interspike interval distributions) has not been studied in detail. Moreover, both Markov chain algorithms are prohibitively slow for modest numbers (e.g., thousands) of channels; the exact algorithm may be even slower than the approximate Gillespie algorithm. For consistency with previous studies, in this article, we use the piecewise-constant propensity Gillespie algorithm with $Mtot=6000$ Na$+$ and $Ntot=1800$ K$+$ channels as our gold standard Markov chain (MC) model, as in Goldwyn and Shea-Brown (2011).

In section 3.2 we develop a 14D conductance-based Langevin model with 28 independent noise sources – one for each directed edge – derived from the random-time-change representation equations 3.1 to 3.3. In previous work (Schmidt & Thomas, 2014), we established a quantitative measure of edge importance, namely, the contribution of individual transitions (directed edges) to the variance of channel state occupancy under steady-state voltage-clamp conditions. Under voltage clamp, the edge importance was identical for each reciprocal pair of directed edges in the graph, a consequence of detailed balance. Some Langevin models lump the noise contributions of each pair of edges (Dangerfield et al., 2010, 2012; Orio & Soudry, 2012; Pezo et al., 2014). Under conditions of detailed balance, this simplification is well justified. However, as we will show (see Figure 5), under current-clamp conditions (e.g., for an actively spiking neuron) detailed balance is violated, the reciprocal edge symmetry is broken, and each pair of directed edges makes a distinct contribution to ISI variability.

### 3.2  Langevin Equations of the 14D HH Model

For sufficiently large number of channels, Schmidt and Thomas (2014) and Schmidt et al. (2018) showed that under voltage clamp, equations 3.1 and 3.2 can be approximated by a multidimensional Ornstein-Uhlenbeck (OU) process (or Langevin equation) in the form5
$dM=∑k=120ζkNaαkNa(V)Mi(k)dt+εNaαkNa(V)Mi(k)dWkNa,$
(3.4)
$dN=∑k=18ζkKαkK(V)Ni(k)dt+εKαkK(V)Ni(k)dWkK.$
(3.5)
Here, $M$, $N$, $ζkion$, and $αkion$ have the same meaning as in equations 3.1 and 3.2. The channel state increments in a short time interval $dt$ are $dM$ and $dN$, respectively. The finite-time increment in the Poisson process $Ykion$ is now approximated by a gaussian process, namely, the increment $dWkion$ in a Wiener (Brownian motion) process associated with each directed edge. These independent noise terms are scaled by $εNa=1/Mtot$ and $εK=1/Ntot$, respectively.
Equations 3.3 to 3.5 comprise a system of Langevin equations for the HH system (under current clamp) on a 14-dimensional phase space driven by 28 independent white noise sources, one for each directed edge. These equations may be written succinctly in the form
$dX=f(X)dt+εG(X)dW(t)$
(3.6)
where we define the 14-component vector $X=(V;M;N)$, and $W(t)$ is a Wiener process with 28 independent components. The deterministic part of the evolution equation $f(X)=dVdt;dMdt;dNdt$ is the same as the mean field, equations 2.7 to 2.9. The state-dependent noise coefficient matrix $G$ is $14×28$ and can be written as
$εG=01×2001×8SNa08×805×20SK.$
The coefficient matrix $SK$ is
$SK=1Ntot-4αnn0βnn1004αnn0-βnn1-3αnn12βnn2003αnn1-2βnn200000000⋯⋯00000000-2αnn23βnn3002αnn2-3βnn3-αnn34βnn400αnn3-4βnn4,$
and $SNa$ is given in appendix D. Note that each of the 8 columns of $SK$ corresponds to the flux vector along a single directed edge in the K$+$ channel transition graph. Similarly, each of the 20 columns of $SNa$ corresponds to the flux vector along a directed edge in the Na$+$ graph (see appendix D).
Remark 2.

Although the ion channel state trajectories generated by equation 3.6 are not strictly bounded to remain within the nonnegative simplex, empirically, the voltage nevertheless remains within specified limits with overwhelming probability.

To facilitate comparison of the model, equations 3.3 to 3.5, prior work (Fox & Lu, 1994; Fox, 1997; Goldwyn & Shea-Brown, 2011), we may rewrite the $14×28$D Langevin description in the equivalent form:
$CdVdt=Iapp(t)-g¯NaM8V-VNa-g¯KN5V-VK-gleak(V-Vleak),$
(3.7)
$dMdt=ANaM+SNaξNa,$
(3.8)
$dNdt=AKM+SKξK.$
(3.9)
The drift matrices $ANa$ and $AK$ remain the same as in Fox and Lu (1994), and are the same as in the 14D deterministic model, equations 2.10 and 2.11. $SNa$ and $SK$ are constructed from direct transitions of the underlying kinetics in Figure 1, and $ξNa∈R20$ and $ξK∈R8$ are vectors of independent gaussian white noise processes with zero mean and unit variance.

Fox and Lu's original approach (Fox & Lu, 1994) requires solving a matrix square root equation $SS⊺=D$ to obtain a square ($8×8$ for Na$+$ or $5×5$ for K$+$) noise coefficient matrix consistent with the state-dependent diffusion matrix $D$. As an advantage, the ion channel representation equations 3.7 to 3.9, uses sparse, nonsquare noise coefficient matrices ($8×20$ for the Na$+$ channel and $5×8$ for the K$+$ channel), which exposes the independent sources of noise for the system.

The new Langevin model in equations 3.7 to 3.9 does not require detailed balance, which gives more insight into the underlying kinetics. Review papers (e.g., Goldwyn & Shea-Brown, 2011; Pezo et al., 2014; Huang et al., 2015) did systematic comparison of various stochastic versions of the HH model. In sections 4 and 5, we quantitatively analyze the connection between the new model and other existing models (Fox & Lu, 1994; Goldwyn et al., 2011; Goldwyn & Shea-Brown, 2011; Dangerfield et al., 2010, 2012; Orio & Soudry, 2012; Huang et al., 2013, 2015; Pezo et al., 2014; Fox, 2018). Problems such as the boundary constraints are beyond the scope of this article; however, we would like to connect the new model to another type of approximation to the MC model; the stochastic shielding approximation.

### 3.3  Stochastic Shielding for the 14D HH Model

The stochastic shielding (SS) approximation was introduced by Schmandt and Galán (2012), in order to approximate the Markov process using fluctuations from only a subset of the transitions, namely, those corresponding to changes in the observable states. In Schmidt and Thomas (2014), we showed that under voltage clamp, each directed edge makes a distinct contribution to the steady-state variance of the ion channel conductance, with the total variance being a sum of these contributions. We call the variance due to the $k$th directed edge the edge importance; assuming detailed balance, it is given by
$Rk=Jk∑i=2n∑j=2n-1λi+λjc⊺viwi⊺ζkζk⊺wjvj⊺c.$
(3.10)
Here, $Jk$ is the steady-state probability flux along the $k$th directed edge; $λn<λn-1≤…≤λ2<0$ are the eigenvalues of the drift matrix ($ANa$ or $AK$, respectively), and $vi$ (resp. $wi$) are the corresponding right (resp. left) eigenvectors of the drift matrix. Each edge's stoichiometry vector $ζk$ has components summing to zero; consequently, the columns of $ANa$ and $AK$ all sum to zero. Thus, each drift matrix has a leading trivial eigenvalue $λ1≡0$. The vector $c$ specifies the unitary conductance of each ion channel state; for the HH model, it is proportional to $e8Na$ or $e5K$, respectively.

Figure 4 shows the edge importance for each pair of edges in the HH Na$+$ and K$+$ ion channel graph, as a function of voltage in the range $[-100,100]$ mV. Note that reciprocal edges have identical $Rk$ due to detailed balance. Under voltage clamp, the largest value of $Rk$ for the HH channels always corresponds to directly observable transitions, that is, edges $k$ such that $|c⊺ζk|>0$, although this condition need not hold in general (Schmidt, Galán, & Thomas, 2018).

Figure 4:

Stochastic shielding under voltage clamp. Redrawn (with permission) from Figures 10 and 13 of Schmidt and Thomas (2014). Each curve shows the edge importance $Rk$ (see equation 3.10) as a function of voltage in the range $[-100,100]$ mV for a different edge pair. For the K$+$ kinetics, $R7=R8$ are the largest $Rk$ value in the voltage range above. For the Na$+$ kinetics, $R11=R12$ have the largest $Rk$ values in the subthreshold voltage range (c. $[-100,-25]$ mV), and $R19=R20$ have the largest $Rk$ values in the suprathreshold voltage range (c. $[-25,100]$ mV).

Figure 4:

Stochastic shielding under voltage clamp. Redrawn (with permission) from Figures 10 and 13 of Schmidt and Thomas (2014). Each curve shows the edge importance $Rk$ (see equation 3.10) as a function of voltage in the range $[-100,100]$ mV for a different edge pair. For the K$+$ kinetics, $R7=R8$ are the largest $Rk$ value in the voltage range above. For the Na$+$ kinetics, $R11=R12$ have the largest $Rk$ values in the subthreshold voltage range (c. $[-100,-25]$ mV), and $R19=R20$ have the largest $Rk$ values in the suprathreshold voltage range (c. $[-25,100]$ mV).

To apply the stochastic shielding method under current clamp, we simulate the model with noise from only a selected subset $E'⊂E$ of directed edges, replacing equations 3.8 to 3.9 with
$dMdt=ANaM+SNa'ξNa,$
(3.11)
$dNdt=AKM+SK'ξK,$
(3.12)
where $SNa'$ (resp. $SK'$) is a reduced matrix containing only the noise coefficients from the most important edges $E'$. That is, $E'$ contains a subset of edges with the largest edge-importance values $Rk$.

Schmandt and Galán (2012) assumed that the edges with the largest contribution to current fluctuations under voltage clamp would also make the largest contributions to variability in voltage and timing under current clamp and included edges 7 and 8 of the K$+$channel ($EK'={7,8}$) and edges 11 and 12 and 19 and 20 of the Na$+$ channel ($ENa'={11,12,19,20}$), yielding an $8×4$ matrix $SNa'$ and a $5×2$ matrix $SK'$. They demonstrated numerically that restricting stochastic forcing to these edges gave a significantly faster simulation with little appreciable change in statistical behavior: under voltage clamp, the mean current remained the same, with a small (but noticeable) decrease in the current variance; meanwhile, similar interspike interval (ISI) statistics were observed.

Under current clamp, detailed balance is violated, and it is not clear from mathematical principles whether the edges with the largest $Rk$ under voltage clamp necessarily make the largest contribution under other circumstances. In order to evaluate the contribution of the fluctuations driven by each directed edge on ISI variability, we test the stochastic shielding method by removing all but one column of $Sion$ at a time. That is, we restrict to a single noise source and observe the resulting ISI variance empirically. For example, to calculate the importance of the $k$th direct edge in the Na$+$ channel, we suppress the noise from all other edges by setting $SK'ξK=05×1$ and
$SNa'=08×1,⋯,SNa(:,k),⋯,08×1,$
that is, we include only the $k$th column of $SNa$ and set other columns to be zeros. The ISI variance was calculated from an ensemble of $104$ voltage traces, each spanning about 500 ISIs.

Figure 5A plots the logarithm of the ISI variance for each edge in $EK$. Vertical bars (cyan) show the ensemble mean of the ISI variance, with a 95% confidence interval superimposed (magenta). Several observations are in order. First, the ISI variance driven by the noise in each edge decreases rapidly the farther the edge is from the observable transitions (edges 7,8), reflecting the underlying “stochastic shielding” phenomenon. Second, the symmetry of the edge importance for reciprocal edge pairs—(1,2), (3,4), (5,6), and (7,8)—that is observed under voltage clamp is broken under current clamp. The contribution of individual directed edges to timing variability under current clamp has another important difference compared with the edge importance (current fluctuations) under voltage clamp. A similar breaking of symmetry for reciprocal edges is seen for the Na$+$ channel, again reflecting the lack of detailed balance during active spiking.

Figure 5:

Logarithm of variance of ISI for stochastic shielding under current clamp. The cyan bar is the mean of ISI, and the magenta plots the 95% confidence interval of the mean ISI (see text for details). The applied current is 10 nA with other parameters specified in the appendixes. For the K$+$ kinetics, the largest contribution edge is 7, and 8 is slightly smaller, ranking the second largest. For the Na$+$ kinetics, the largest contribution pair is 19 and 20, with 20 slightly smaller than 19. Moreover, edge 17 and 18 is the second largest pair.

Figure 5:

Logarithm of variance of ISI for stochastic shielding under current clamp. The cyan bar is the mean of ISI, and the magenta plots the 95% confidence interval of the mean ISI (see text for details). The applied current is 10 nA with other parameters specified in the appendixes. For the K$+$ kinetics, the largest contribution edge is 7, and 8 is slightly smaller, ranking the second largest. For the Na$+$ kinetics, the largest contribution pair is 19 and 20, with 20 slightly smaller than 19. Moreover, edge 17 and 18 is the second largest pair.

Figure 5B shows the ISI variance when channel noise is included on individual edges of $ENa.$ Here, the difference between voltage and current clamp is striking. Under voltage clamp, the four most important edges are always those representing observable transitions, in the sense that the transition's stoichiometry vector $ζ$ is not orthogonal to the conductance vector $c$. That is, the four most important pairs are always 11 and 12 and 19 and 20, regardless of voltage (see Figure 4). Under current clamp, the most important edges are 17, 18, 19, and 20. Although edges 11 and 12 are among the four most important sources of channel population fluctuations under voltage clamp, they are not even among the top 10 contributors to ISI variance when taken singly. Even though edges 17 and 18 are hidden, meaning they do not directly change the instantaneous channel conductance, these edges are nevertheless the second most important pair under current clamp. Therefore, when we implement the stochastic-shielding based approximation, we include the pairs 17 and 18 and 19 and 20 in equation 3.11. We refer to the approximate SS model driven by these six most important edges as the $14×6$D HH model.

Given the other parameters we use for the HH model (see Table 5 in appendix B), the input current of $Iapp=10$ nA is slightly beyond the region of multistability associated with a subcritical Andronov-Hopf bifurcation. In order to make sure the results are robust against increases in the applied current, we tried current injections ranging from 20 to 100 nA. While injecting larger currents decreased the ISI variance, it did not change the rank order of the contributions from the most important edges.

## 4  Pathwise Equivalence for a Class of Langevin Models

Fox and Lu's method has been widely used since its appearance (see references in Bruce, 2009; Goldwyn & Shea-Brown, 2011; Huang et al., 2015), and the “best” approximation for the underlying Markov chain (MC) model has been a subject of ongoing discussion for decades. Several studies (Mino et al., 2002; Bruce, 2009; Sengupta, Laughlin, & Niven, 2010) attested to discrepancies between Fox's later approach (Fox, 1997) and the discrete-state MC model, raising the question of whether Langevin approximations could ever accurately represent the underlying fluctuations seen in the gold standard MC models. An influential review paper (Goldwyn & Shea-Brown, 2011) found that these discrepancies were due to the way in which noise is added to the stochastic differential equations, 1.1 to 1.3. Recent studies, including Dangerfield et al. (2010, 2012); Linaro et al. (2011); Goldwyn and Shea-Brown (2011); Goldwyn et al. (2011); Orio and Soudry (2012); Güler (2013b); Huang et al. (2013, 2015); Pezo et al. (2014); and Fox (2018), discussed various ways of incorporating channel noise into HH kinetics based on the original work by Fox and Lu (Fox & Lu, 1994; Fox, 1997), some of which have the same SDEs but with different boundary conditions. Different boundary conditions (BCs) are not expected to have much impact on computational efficiency. Indeed, if BCs are neglected, the main difference between channel-based (or conductance-based) models is the diffusion matrix $S$ in the Langevin euqations 1.2 and 1.3. As the discussion about where and how to incorporate noise into the HH model framework goes on, Fox (2018) recently asked whether there is a way of relating different models with different $S$ matrices. We give a positive answer to this question below.

In section 4.1 we demonstrate the equivalence (neglecting the boundary conditions) of a broad class of previously proposed channel-based Langevin models (Fox & Lu, 1994; Dangerfield et al., 2010, 2012; Goldwyn & Shea-Brown, 2011; Orio & Soudry, 2012; Huang et al., 2013; Pezo et al., 2014; Fox, 2018) and the 14D Langevin HH model with 28 independent noise sources (one for each directed edge in the channel state transition graph), that is, our $14×28$D Langevin model.

### 4.1  When Are Two Langevin Equations Equivalent?

Two Langevin models are pathwise equivalent if the sample paths (trajectories) of one model can be made to be identical to the sample paths of the other under an appropriate choice of gaussian white noise samples for each. To make this notion precise, consider two channel-based Langevin models of the form $dX=f(X)dt+G(X)dW$ with the same mean dynamics $f∈Rd$ and two different $d×n$ matrices (possibly with different values of $n1$ and $n2$), $G1$ and $G2$. Denote
$f:Rd→Rd,$
(4.1)
$G1:Rd→Rd×n1,$
(4.2)
$G2:Rd→Rd×n2.$
(4.3)
Let $X(t)=[X1(t),X2(t),…,Xd(t)]⊺$ and $X*(t)=[X1*(t),X2*(t),…,Xd*(t)]⊺$ be trajectories produced by the two models, and let $W(t)=[W1(t),W2(t),…,Wn1(t)]⊺$ and $W*(t)=[W1*(t),W2*(t),…,Wn2*(t)]⊺$ be vectors of Weiner processes. That is, $Wi(t),i=1,2,…,n1$ and $Wj*(t),j=1,2,…,n2$ are independent Wiener processes with $〈Wi(s)Wj(t)〉=δijδ(t-s)$ and $〈Wi*(s)Wj*(t)〉=δijδ(t-s)$. Note that $n1$ and $n2$ need not be equal. As defined in (Allen, Allen, Arciniega, & Greenwood, 2008), the stochastic differential equation (SDE) models
$dX=f(t,X(t))dt+G1(t,X(t))dW(t)$
(4.4)
and
$dX*=f(t,X*(t))dt+G2(t,X*(t))dW*(t)$
(4.5)
are pathwise equivalent if systems 4.4 and 4.5 possess the same probability distribution and, moreover, a sample path solution of one equation is also a sample solution to the other one. Allen et al. (2008) proved a theorem giving general conditions under which the trajectories of two SDEs are equivalent. We follow their construction closely below, adapting it to the case of two different Langevin equations for the Hodgkin-Huxley system represented in a 14-dimensional state space.
As in section 3, channel-based Langevin models for the stochastic dynamics of HH can be written as
$dX=f(X)dt+S(X)dW(t),$
(4.6)
where the 14-component random vector $X=(V;M;N)$ and $f(x)=dVdt;dMdt;dNdt$ is the same as the mean-field, equations 2.7 to 2.9. Recall that $x=[v,m,n]⊺$. Here we write
$S(x)=01×m01×nSNa(m)08×n05×mSK(n),withSNa:R8→R8×m,$
(4.7)
for the Na$+$ channel and
$SK:R5→R5×n$
(4.8)
for the K$+$ channel. Here, $m$ is the number of independent white noise forcing terms affecting the sodium channel variables, while $n$ is the number of independent noise sources affecting the potassium gating variables. We write
$W(t)=[W1(t),W2(t),…,Wm+n(t)]⊺$
for a Wiener process incorporating both the sodium and potassium noise forcing. Given two channel-based models with diffusion matrices,
$SNa,1:R8→R8×m1,$
(4.9)
$SNa,2:R8→R8×m2,$
(4.10)
for the Na$+$ channel, and
$SK,1:R5→R5×n1,$
(4.11)
$SK,2:R5→R5×n2,$
(4.12)
for the K$+$ channel, we construct the diffusion matrix $D=SS⊺$. In order for the two models to generate equivalent sample paths, it suffices that they have the same diffusion matrix:
$D=S1S1⊺=01×101×801×508×1DNa08×505×105×8DK=S2S2⊺.$
The SDEs corresponding to the two channel-based Langevin models are
$dX=f(t,X(t))dt+S1(t,X(t))dW(t),$
(4.13)
$dX*=f(t,X*(t))dt+S2(t,X*(t))dW*(t).$
(4.14)
The probability density function $p(t,x)$ for random variable $X$ in equation 4.13 satisfies the Fokker-Planck equation:
$∂p(t,x)∂t=12∑i=18∑j=18∂2∂xixjp(t,x)∑l=1m1+n1S1(i,l)(t,x)S1(j,l)(t,x)-∑i=18∂∂xifi(t,x)p(t,x)=12∑i=18∑j=18∂2∂xixjD(i,j)(t,x)p(t,x)-∑i=18∂∂xifi(t,x)p(t,x),$
(4.15)
where $S1(i,j)(t,x)$ is the $(i,j)$th entry of the diffusion matrix $S1(t,x)$. Equation 4.15 holds because
$D(i,j)(t,x)=∑l=1m1+n1S1(i,l)(t,x)S1(j,l)(t,x).$
If $z1$, $z2∈R14$ and $z1⩽z2$, then
$P(z1⩽X(t)⩽z2)=∫z1,14z2,14∫z1,13z2,13⋯∫z1,1z2,1p(t,x)dx1dx2⋯dx8.$
Note that equations 4.13 and 4.14 have the same expression equation 4.15, for the Fokker-Planck equation; therefore, $X$ and $X*$ possess the same probability density function. In other words, the probability density function of $X$ in equation 4.6 is invariant for different choices of the diffusion matrix $S$.

### 4.2  Map Channel-based Langevin Models to Fox and Lu's Model

We now explicitly construct a mapping between Fox and Lu's (1994) 14D model and any channel-based model (given the same boundary conditions). We begin with a channel-based Langevin description
$dX=f(t,X(t))dt+S(t,X(t))dW(t),$
(4.16)
and Fox and Lu's (1994) model,
$dX*=f(t,X*(t))dt+S0(t,X*(t))dW*(t),$
(4.17)
where $S$ is a $d$ by $m$ matrix satisfying $SS⊺=D$ (note that $S$ is not necessarily a square matrix), and $S0=D$.
Let $T$ be the total simulation time of the random process in equations 4.16 and 4.17. For $0⩽t⩽T$, denote the singular value decomposition (SVD) of $S$ as
$S(t)=P(t)Λ(t)Q(t),$
where $P(t)$ is a $d×d$ orthogonal matrix (i.e., $P⊺P=PP⊺=Id$), $Q(t)$ is an $m×m$ orthogonal matrix, and $Λ(t)$ is a $d×m$ matrix with $rank(Λ)=r⩽d$ positive diagonal entries and $d-r$ zero diagonal entries.

First, we prove that given a Wiener trajectory, $W(t),t∈[0,T]$ and the solution to equation 4.16$X(t)$, there exists a Wiener trajectory $W*(t)$ such that the solution to equation 4.17, $X*$, is also a solution to equation 4.16. In other words, for a Wiener process $W(t)$, we can construct a $W*(t)$, such that $X*(t)=X(t)$, for $0≤t≤T$.

Following Allen et al. (2008), we construct the vector $W*(t)$ of $d$ independent Wiener processes as follows:
$W*(t)=∫0tP(s)Λ(s)Λ⊺(s)12+Λ(s)Q(s)dW(s)+∫0tP(s)dW**(s)$
(4.18)
for $0⩽t⩽T$, where $W**(t)$ is a vector of length $d$ with the first $r$ entries equal to 0 and the next $d-r$ entries independent Wiener processes, and $Λ(s)Λ⊺(s)12+$ is the pseudoinverse of $Λ(s)Λ⊺(s)12$. Consider that
$D(t)=S(t)S⊺(t)=P(t)Λ(t)Q(t)P(t)Λ(t)Q(t)⊺$
(4.19)
$=P(t)Λ(t)Λ⊺(t)P⊺(t)$
(4.20)
$=[S0(t)]2,$
(4.21)
where $S0(t)=P(t)Λ(t)Λ⊺(t)12P⊺(t)$ is a square root of $D$, by construction.
The diffusion term on the right side of equation 4.17 with $X*(t)$ replaced by $X(t)$ satisfies
$S0(t,X(t))dW*(t)=S0(t)P(t)Λ(t)Λ⊺(t)12+Λ(t)Q(t)dW(t)+P(t)dW**(t)=P(t)Λ(t)Λ⊺(t)12P⊺(t)P(t)Λ(t)Λ⊺(t)12+Λ(t)Q(t)dW(t)+P(t)Λ(t)Λ⊺(t)12P⊺(t)P(t)dW**(t)=P(t)Λ(t)Q(t)dW(t).$
(4.22)
From the SVD of $S=PΛ$Q, we conclude that
$S0(t,X(t))dW*(t)=S(t,X(t))dW(t).$
(4.23)
Hence, $dX=f(t,X(t))dt+S0(t,X(t))dWt*$, that is, $X(t)$ is a sample path solution of equation 4.17.
Similarly, given a Wiener trajectory $W*(t)$ and the solution to equation 4.17$X*(t)$, we can construct a vector $W(t)$ of $m$ independent Winner processes as
$W(t)=∫0tQ⊺(s)Λ+(s)Λ(s)Λ⊺(s)1/2P⊺(s)dW*(s)+∫0tQ⊺(s)dW***(s)$
(4.24)
for $0⩽t⩽T$, where $W***(t)$ is a vector of length $m$ with the first $r$ entries equal to 0 and the next $m-r$ entries independent Wiener processes, and $Λ+(s)$ is the pseudoinverse of $Λ(s)$. Then, by an argument parallel to equation 4.22, we conclude that
$S(t,X*(t))dW(t)=S0(t,X*(t))dW*(t).$
(4.25)
Hence, $dX*=f(t,X*(t))dt+S(t,X*(t))dW(t)$, that is, $X*(t)$ is also a solution to equation 4.16. Therefore, we can conclude that the channel-based Langevin model in equation 4.16 is pathwise equivalent to Fox and Lu's model.

To illustrate pathwise equivalence, Figure 6 plots trajectories of the $14×28$D stochastic HH model and Fox and Lu's model using noise traces dictated by the preceding construction. In panel A, we generated a sample path for equation 4.16 and plot three variables in $X$: the voltage $V$, Na$+$ channel open probability $M31$, and K$+$ channel open probability $N4$. The corresponding trajectory, $X*$, for Fox and Lu's model was generated from equation 4.17 and the corresponding Wiener trajectory was calculated using equation 4.18. The top three subplots in panel A superposed the voltage $V*$, Na$+$ channel open probability $M31*$ and K$+$ channel open probability $N4*$ in $X*$ against those in $X$. The bottom three subplots in panel A plot the point-wise differences of each variable. Equations 4.16 and 4.17 are numerically solved in Matlab using the Euler-Maruyama method with a time step $dt=0.001$ ms. The slight differences observed arise in part due to numerical errors in calculating the singular value decomposition of $S$ (in equation 4.16); another source of error is the finite accuracy of the Euler-Maruyama method.6 As shown in Figure 6, most differences occur near the spiking region, where the system is numerically very stiff and the numerical accuracy of the SDE solver accounts for most of the discrepancies (analysis of which is beyond the scope of this article). To illustrate pathwise equivalence, Figure 6 shows superposed voltage trajectories for each simulation, as well as the point-wise voltage differences of each. We can conclude from the comparison in Figure 6 that the $14×28$D Langevin model is pathwise equivalent with Fox and Lu's model. Similarly, the same analogy applies for other channel-based Langevin models such that with the same diffusion matrix $D(X)$.

Figure 6:

Pathwise equivalency of 14D HH model and Fox and Lu's model. (A) Given a sample path of the $14×28$D Langevin model in equation 4.16, we construct the noise by equation 4.18 and generate the sample trajectory of Fox and Lu's model using equation 4.17. (B) Given a sample path of Fox and Lu's model in equation 4.17, we construct the noise by equation 4.24 and generate the sample trajectory of the $14×28$D Langevin model using equation 4.16. For both cases, we plot the voltage $V$, the open probability of the Na$+$ channel ($M31$), and the open probability of the K$+$ channel ($N4$). We obtain excellent agreement in both directions.

Figure 6:

Pathwise equivalency of 14D HH model and Fox and Lu's model. (A) Given a sample path of the $14×28$D Langevin model in equation 4.16, we construct the noise by equation 4.18 and generate the sample trajectory of Fox and Lu's model using equation 4.17. (B) Given a sample path of Fox and Lu's model in equation 4.17, we construct the noise by equation 4.24 and generate the sample trajectory of the $14×28$D Langevin model using equation 4.16. For both cases, we plot the voltage $V$, the open probability of the Na$+$ channel ($M31$), and the open probability of the K$+$ channel ($N4$). We obtain excellent agreement in both directions.

We have shown that our $14×28$D model, with a 14-dimensional state space and 28 independent noise sources (one for each directed edge), is pathwise equivalent to Fox and Lu's original 1994 model, as well as other channel-based models (under corresponding boundary conditions) including Goldwyn et al. (2011); Goldwyn and Shea-Brown (2011); Orio and Soudry (2012); Pezo et al. (2014); Fox (2018). As we shall see in section 5, the pathwise equivalent models give statistically indistinguishable interspike interval distributions under the same BCs. We emphasize the importance of boundary conditions for pathwise equivalence. Two simulation algorithms with the same $Ai$ and $Si$ matrices will generally have nonequivalent trajectories if different boundary conditions are imposed. For example, Dangerfield et al. (2012) employ the same dynamics as Orio and Soudry (2012) away from the boundary, where ion channel state occupancy approaches zero or one. But where the latter allow trajectories to move freely across this boundary (which leads only to small, short-lived excursions into “nonphysical” values), Dangerfield imposes reflecting boundary conditions through a projection step at the boundary. As we will see ln section 5, this difference in boundary conditions leads to a statistically significant difference in the ISI distribution, as well as a loss of accuracy when compared with the gold standard Markov chain simulation.

## 5  Model Comparison

In section 3, we studied the contribution of every directed edge to the ISI variability and proposed how stochastic shielding could be applied under current clamp. Moreover, in section 4, we proved that a family of Langevin models is pathwise equivalent.

Here we compare the accuracy and computational efficiency of several models, including the “subunit model” (Fox, 1997; Goldwyn & Shea-Brown, 2011), Langevin models with different $S$ matrices or boundary conditions (Fox & Lu, 1994; Goldwyn & Shea-Brown, 2011; Dangerfield et al., 2012; Orio & Soudry, 2012; Pezo et al., 2014), the 14D HH model (proposed in section 3.2), the 14D stochastic shielding model with six independent noise sources (proposed in section 3.3), and the gold standard Markov chain model (discussed in section 3.1). Where other studies have compared moment statistics such as the mean firing frequency (under current clamp) and stationary channel open probababilies (under voltage clamp), we base our comparison on the entire interspike interval (ISI) distributions, under current clamp with a common fixed driving current. We use two different comparisons of ISI distributions, the first based on the $L1$ norm of the difference between two distributions (the Wasserstein distance; Wasserstein, 1969), in section 5.1 and the second based on the $L∞$ norm (the Kolmogorov-Smirnov test; Kolmogorov, 1933; Smirnov, 1948), in section 5.2. We find similar results using both measures: as expected, the models that produce pathwise equivalent trajectories (Fox & Lu, 1994; Orio & Soudry; and our $14×28$D model) have indistinguishable ISI statistics, while the nonequivalent models (Fox, 1997; Dangerfield, 2010, 2012; Goldwyn & Shea-Brown, 2011; our $14×6$D stochastic-shielding model) have significantly different ISI distributions. Of these, the $14×6$D SS model is the closest to the models in the $14×28$D class and as fast as any other model considered.

### 5.1  $L1$ Norm Difference of ISIs

We first evaluate the accuracy of different stochastic simulation algorithms by comparing their ISI distributions under current clamp to that produced by a reference algorithm, namely, the discrete-state Markov chain (MC) algorithm.

Let $X1,X2,…,Xn$ be $n$ independent samples of ISIs with a true cumulative distribution function $F$. Let $Fn(·)$ denote the corresponding empirical cumulative distribution function (ECDF) defined by
$Fn(x)=1n∑i=1n1{Xi≤x},x∈R,$
(5.1)
where we write $1A$ to denote the indicator function for the set $A$. Let $Q$ and $QM$ be the quantile functions of $F$ and $FM$, respectively. The $L1$-Wasserstein distance between two CDF's $FM$ and $F$ can be written as (Shorack & Wellner, 2009)
$ρ1(F,FM)=∫0∞F(x)-FM(x)dx=∫01Q(x)-QM(x)dx.$
(5.2)
Note that $ρ1$ has the same units as $dx$. Thus, the $L1$ distances reported in Table 3 have units of milliseconds.
Table 3:
Summary of the $L1$-Wasserstein Distances of ISI Distributions for Langevin Type Hodgkin-Huxley Models Compared to the MC Model.
ModelVariables V+M+N$S$ MatrixNoise Dimensions Na+K$L1$ Norm (msec.) (Wasserstein Dist.)Run Time (sec.)
MC 1+8+5 n/a 20+8 $2.27e-4±7.15e-5$ 3790
Fox94 1+7+4 $S=D$ 7+4 $4.74e-2±1.93e-4$ 2436
Fox97 1+2+1 n/a $8.01e-1±9.48e-4$ 67
Dangerfield 1+8+5 $SEF$ 10+4 $2.18e-1±2.14e-4$ 655
Goldwyn 1+8+5 $S=D$ 8+5 $1.83e-1±1.93e-4$ 2363
Orio 1+8+5 $Spaired$ 10+4 $4.52e-2±2.08e-4$ 577
$14×28$1+8+5 $Ssingle$ 20+8 $4.93e-2±1.94e-4$ 605
SS 1+8+5 $Sss$ 4+2 $7.62e-2±7.57e-5$ 73
ModelVariables V+M+N$S$ MatrixNoise Dimensions Na+K$L1$ Norm (msec.) (Wasserstein Dist.)Run Time (sec.)
MC 1+8+5 n/a 20+8 $2.27e-4±7.15e-5$ 3790
Fox94 1+7+4 $S=D$ 7+4 $4.74e-2±1.93e-4$ 2436
Fox97 1+2+1 n/a $8.01e-1±9.48e-4$ 67
Dangerfield 1+8+5 $SEF$ 10+4 $2.18e-1±2.14e-4$ 655
Goldwyn 1+8+5 $S=D$ 8+5 $1.83e-1±1.93e-4$ 2363
Orio 1+8+5 $Spaired$ 10+4 $4.52e-2±2.08e-4$ 577
$14×28$1+8+5 $Ssingle$ 20+8 $4.93e-2±1.94e-4$ 605
SS 1+8+5 $Sss$ 4+2 $7.62e-2±7.57e-5$ 73

Notes: Model (see text for details): MC: Markov chain. Fox94; From Fox and Lu (1994). Fox97: Fox (1997). Goldwyn: Goldwyn and Shea-Brown (2011). Dangerfield: Dangerfield et al. (2010, 2012). $14×28$D: model proposed in section 3.2. SS: stochastic-shielding model (see section 3.3). Variables: number of degrees of freedom in Langevin equation representing voltage, sodium gates, and potassium gates, respectively. $S$ Matrix: Form of the noise coefficient matrix in equations 1.1 to 1.3. Noise dimensions: number of independent gaussian white noise sources represented for sodium and potassium, respectively. $L1$ Norm: Empirically estimated $L1$-Wasserstein distance between the model's ISI distribution and the MC model's ISI distribution. For MC versus MC, independent trials were compared. $a±b$: mean$±$standard deviation. Run time (in seconds): see text for details.

When two models have the same number of samples, $n$, equation 5.2 can be estimated by
$∫01Q(x)-QM(x)dx≈1n∑i=1nXi-Yi:=ρ1(Fn,FnM),$
(5.3)
where $X1,⋯,Xn$ and $Y1,⋯,Yn$ are $n$ independent samples sorted in ascending order with CDF $F$ and $FM$, respectively.

We numerically calculate $ρ1(Fn,FnM)$ to compare several Langevin models against the MC model. We consider the following models: “Fox94” denotes the original model proposed by Fox and Lu (1994), which requires a square root decomposition ($S=D$) for each step in the simulation (see equations 1.1 to 1.3). “Fox97” is the widely used “subunit model” of Fox (1997); see equations 1.4 and 1.5). “Goldwyn” denotes the method taken from Goldwyn and Shea-Brown (2011), where they restrict the 14D system ($V$, 5 K$+$ gates and 8 Na$+$ gates) to the 4D multinomial submanifold ($V,m,n,andh$; see section 4.2), with gating variables truncated to [0,1]. We write “Orio” for the model proposed by Orio and Soudry (2012), where they constructed a rectangular matrix $S$ such that $SS⊺=D$ (referred to as $Spaired$ in Table 3) combining fluctuations driven by pairs of reciprocal edges, thereby avoiding taking matrix square roots at each time step. The model “Dangerfield” represents Dangerfield et al. (2012), which used the same $S$ matrix as in Orio and Soudry (2012) but added a reflecting (no-flux) boundary condition via orthogonal projection (referred to as $SEF$ in Table 3). Finally, we include the $14×28$D model we proposed in section 3.2, or “14D” (referred to as $Ssingle$ in Table 3); “SS” is the stochastic shielding model specified in section 3.3.

For each model, we ran 10,000 independent samples of the simulation, holding channel number, injected current ($Iapp=10$ nA), and initial conditions fixed. Throughout the article, we presume a fixed channel density of 60 $channels/μm2$ for sodium and 18 $channels/μm2$ for potassium in a membrane patch of area $100μm2$, consistent with prior work such as Goldwyn and Shea-Brown (2011) and Orio and Soudry (2012). The initial condition is taken to be the point on the deterministic limit cycle at which the voltage crosses upward through $-60$ mV. An initial transient corresponding to 10 to 15 ISIs is discarded, to remove the effects of the initial condition. (See Table 5 in appendix B for a complete specification of simulation parameters.) We compared the efficiency and accuracy of each model through the following steps:

1. For each model, a single run simulates a total time of 84,000 milliseconds (ms) with time step 0.008 ms, recording at least 5000 ISIs.

2. For each model, repeat 10,000 runs in step 1.

3. Create a reference ISI distribution by aggregating all 10,000 runs of the MC model, that is, based on roughly $5×107$ ISIs.

4. For each of $104$ individual runs, align all ISI data into a single vector and calculate the ECDF using equation 5.1.

5. Compare the ISI distribution of each model with the reference MC distribution by calculating the $L1$-difference of the ECDFs using equation 5.3.

6. To compare the computational efficiency, we take the average execution time of the MC model as the reference. The relative computational efficiency is the ratio of the average execution time of a model with that of the MC model (about 3790 sec).

Table 3 gives the empiricially measured $L1$ difference in ISI distribution between several pairs of models.7 The first row (“MC”) gives the average $L1$ distance between individual MC simulations and the reference distribution generated by aggregating all MC simulations, in order to give an estimate of the intrinsic variability of the measure. Figure 8 plots the $L1$-Wasserstein differences versus the relative computational efficiency of several models against the MC model. These results suggest that the Fox94, Orio, and $14×28$D models are statistically indistinguishable when compared with the MC model using the $L1$-Wasserstein distance. This result is expected in light of our results (see section 4) showing that these three models are pathwise-equivalent. (We will make pairwise statistical comparisons between the ISI distributions of each model in see section 5.2.) Among these equivalent models, however, the $14×28$D and Orio models are significantly faster than the original Fox94 model (and the Goldwyn model) because they avoid the matrix square root computation. The Dangerfield model has speed similar to the $14×28$D model, but the use of reflecting boundary conditions introduces significant inaccuracy in the ISI distribution. The imposition of truncating boundary conditions in the Goldwyn model also appears to affect the ISI distribution. Of the models considered, the Fox97 subunit model is the fastest; however, it makes a particularly poor approximation to the ISI distribution of the MC model. Note that the maximum $L1$-Wasserstein distance between two distributions is 2. The ISI distribution of the Fox97 subunit model to that of the MC model is more than 0.8, which is 10 times larger than the $L1$-Wasserstein distance of the SS model, and almost half of the maximum distance. As shown in Figure 7,8 the Fox97 subunit model fails to achieve the spike firing threshold and produces longer ISIs. Because of its inaccuracy, we do not include the subunit model in our remaining comparisons. The stochastic shielding model, on the other hand, has nearly the same speed as the Fox97 model, but it is over 100 times more accurate (in the $L1$ sense). The SS model is an order of magnitude faster than the $14×28$D model and has less than twice the $L1$ discrepancy versus the MC model ($L1$ norm 76.2 versus 49.3 microseconds). While this difference in accuracy is statistically significant, it may not be practically significant, depending on the application (see section 6 for further discussion of this point).

Figure 7:

The probability density of interspike intervals (ISIs) for Fox97 (blue) and the MC model (red). The probability densities were calculated over more than $5.4×107$ ISIs.

Figure 7:

The probability density of interspike intervals (ISIs) for Fox97 (blue) and the MC model (red). The probability densities were calculated over more than $5.4×107$ ISIs.

Figure 8:

The $L1$-Wasserstein distances and relative computational efficiency versus the MC model. Fox94 (green circle), Goldwyn (black cross), Orio (cyan square), 14D (blue star), SS (magenta downward-pointing triangle), Dangerfield (red upward-pointing triangle), and the MC (brown diamond) model. The $L1$ error for ISI distribution was computed using the $L1$-Wasserstein distance, equation 5.3, with discrete time Gillespie/Monte Carlo simulations as a reference. The relative computational efficiency is the ratio of the recorded run time to the mean recorded time of the MC mode (3790 seconds). The mean and 95% confidence intervals were calculated using 100 repetitions of 10,000 runs each ($5×109$ ISIs total).

Figure 8:

The $L1$-Wasserstein distances and relative computational efficiency versus the MC model. Fox94 (green circle), Goldwyn (black cross), Orio (cyan square), 14D (blue star), SS (magenta downward-pointing triangle), Dangerfield (red upward-pointing triangle), and the MC (brown diamond) model. The $L1$ error for ISI distribution was computed using the $L1$-Wasserstein distance, equation 5.3, with discrete time Gillespie/Monte Carlo simulations as a reference. The relative computational efficiency is the ratio of the recorded run time to the mean recorded time of the MC mode (3790 seconds). The mean and 95% confidence intervals were calculated using 100 repetitions of 10,000 runs each ($5×109$ ISIs total).

### 5.2  Two-Sample Kolmogorov-Smirnov Test

In addition to using the $L1$-Wasserstein distances to test the differences between two CDFs, we can also make a pairwise comparison between each model by applying the Dvoretzky-Kiefer-Wolfowitz inequality (Dvoretzky, Kiefer, & Wolfowitz, 1956) and the two-sample Kolmogorov-Smirnov (KS) test (Kolmogorov, 1933; Smirnov, 1948). While the Wasserstein distance is based on the $L1$ norm, the KS statistic is based on the $L∞$ (or supremum) norm.

The Dvoretzky-Kiefer-Wolfowitz inequality (Dvoretzky et al., 1956) establishes confidence bounds for the CDF. Specifically, the interval that contains the true CDF, $F(·)$, with probability $1-α$, is given by
$|Fn(x)-F(x)|≤ɛwhereɛ=ln2α2n.$
(5.4)
When comparing samples $X1M,X2M,…,XnM$ obtained from an approximate model $M$ against the gold standard, in section 5.1 we computed the $L1$ difference of the empirical density functions as an approximation for the $L1$ difference of the true distributions. Instead, we work here with the $L∞$ norm:
$ρ∞(Fn,FnM)=limp→∞∫0∞FnM(x)-Fn(x)pdx1/p=sup0≤x<∞FnM(x)-Fn(x).$
(5.5)
For each $x≥0$, equation 5.4 bounds the discrepancy between the true and empirical distribution differences as follows. By the triangle inequality and independence of the $Xi$ from the $XiM$, the inequality
$|FM-F|=|FM-FnM+Fn-F+FnM-Fn|≤|FM-FnM|+|Fn-F|+|FnM-Fn|≤2ɛ+|FnM-Fn|$
(5.6)
holds with probability $(1-α)2$. Similarly,
$|FnM-Fn|=|FnM-FnM+F-Fn+FM-F|≤|FM-FnM|+|Fn-F|+|FM-F|≤2ɛ+|FM-F|$
(5.7)
holds with probability $(1-α)2$. Together, equations 5.6 and 5.7 indicate that the discrepancy between the difference of empirical distributions and the difference of true distributions is bounded as
$||FM-F|-|FnM-Fn||≤2ɛ$
(5.8)
with probability $(1-α)2$, for $ɛ=ln2α2n$.
We will use the pointwise difference of the ECDFs for a large sample as an estimate for the pointwise difference between two true CDFs. The two-sample Kolmogorov-Smirnov (KS) test (Kolmogorov, 1933; Smirnov, 1948) offers a statistics to test whether two samples are from the same distribution. The two-sample KS statistic is
$Dn,m=supx|F1,n(x)-F2,m(x)|,$
(5.9)
where $F1,n$ and $F2,m$ are two ECDFs for two samples defined in equation 5.1 and the $sup$ is the supremum function. The reference statistic, $Rn,m(α)$, depending on the significance level $α$, is defined as
$Rn,m(α)=-log(α/2)2n+mnm,$
(5.10)
where $n$ and $m$ are the sample sizes. The null hypothesis that “the two samples come from the same distribution” is rejected at the significance level $α$ if
$Dn,m>Rn,m(α).$
(5.11)

Figure 9 plots the logarithm of ratio of the two-sample KS statistics, $Dn,mRn,m(0.01)$, for Fox94 (Fox & Lu, 1994), Goldwyn (Goldwyn & Shea-Brown, 2011), Dangerfiled (Dangerfield et al., 2012), Orio (Orio & Soudry, 2012), 14D (the $14×28$D model we proposed in section 3.2). Data of “self-comparison” (e.g. Fox94 versus Fox 94) was obtained by comparing two ISI ECDFs from independent simulations. As shown in Figure 9, models that we previously proved were pathwise equivalent in section 4, namely, the Fox94, Orio, and 14D models, are not distinguishable at any confidence level justified by our data. Note that those three models use the same boundary conditions (free boundary condition as in Orio & Soudry, 2012) and the ratio $Dn,m/Rn,m(α)$ of pairwise comparison is on the same magnitude of that for the self-comparisons. The models of Fox and Lu and Orio and Soudry, and our 14D model generate indistinguishable ISI distributions but are distinguishable from Dangerfield's model and Fox's 97 model. Thus, as the bottom panel of Figure 9 shows, Fox94, Orio, and $14×28$D form a block of statistically indistinguishable samples. However, as pointed out above, these statistically equivalent simulation algorithms have different computational efficiencies (see Figure 8). Among these methods, Orio and Soundry's algorithm (14-dimensional state space with 14 undirected edges as noise sources) and our method (14-dimensional state space with 28 directed edges as noise sources) have similar efficiencies, with Orio's method being about 5% faster than our method. Our $14×28$D method provides the additional advantage that it facilitates further acceleration under the stochastic shielding approximation (see section 6).

Figure 9:

Logarithm of the ratio of Kolmogorov-Smirnov test statistic $Dn,m/Rn,m(α)$, equations 5.9 and 5.10, for samples from the ISI distribution for each pair of models. (Top) Box and whisker plots showing mean and 95% confidence intervals based on 10,000 pairwise comparisons. The first five plots show self-comparisons (green bars); the remainder compare distinct pairs (gray bars). A: Fox94 (Fox & Lu, 1994), B: Orio (Orio & Soudry, 2012), C: 14D ($14×28$D model, section 3.2), D: Dangerfield (Dangerfield et al., 2012), E: Goldwyn (Goldwyn & Shea-Brown, 2011). (Bottom) Mean logarithms (as in top panel) for all pairwise comparisons.

Figure 9:

Logarithm of the ratio of Kolmogorov-Smirnov test statistic $Dn,m/Rn,m(α)$, equations 5.9 and 5.10, for samples from the ISI distribution for each pair of models. (Top) Box and whisker plots showing mean and 95% confidence intervals based on 10,000 pairwise comparisons. The first five plots show self-comparisons (green bars); the remainder compare distinct pairs (gray bars). A: Fox94 (Fox & Lu, 1994), B: Orio (Orio & Soudry, 2012), C: 14D ($14×28$D model, section 3.2), D: Dangerfield (Dangerfield et al., 2012), E: Goldwyn (Goldwyn & Shea-Brown, 2011). (Bottom) Mean logarithms (as in top panel) for all pairwise comparisons.

In contrast to the statistically equivalent Orio, $14×28$D, and Fox94 models, algorithms using different boundary conditions are not pathwise equivalent, which is again verified in Figure 9. Algorithms with subunit approximation and truncated boundary condition (i.e., “Goldwyn”) and the reflecting boundary condition (i.e., “Dangerfield”) are significantly different in accuracy (and, in particular, they are less accurate) than models in the $14×28$D class.

## 6  Discussion and Conclusions

### 6.1  Summary

The exact method for Markov chain (MC) simulation for an electrotonically compact (single compartment) conductance-based stochastic model under current clamp is a hybrid discrete (channel state)/continuous (voltage) model of the sort used by Clay and DeFelice (1983), Newby et al. (2013), and Anderson et al. (2015). While MC methods are computationally expensive, simulations based on gaussian/Langevin approximation can capture the effects of stochastic ion channel fluctuations with reasonable accuracy and excellent computational efficiency. Since Goldwyn and Shea Brown's work focusing the attention of the computational neuroscience community on Fox and Lu's Langevin algorithm for the Hodgkin-Huxley system (Fox & Lu, 1994; Goldwyn & Shea-Brown, 2011), several variants of this approach have appeared in the literature.

In this article, we advocate for a class of models combining the best features of conductance-based Langevin models with the recently developed stochastic shielding approximation (Schmandt & Galán, 2012; Schmidt & Thomas, 2014; Schmidt et al., 2018). We propose a Langevin model with a 14-dimensional state space, representing the voltage, five states of the K$+$ channel, and eight states of the Na$+$-channel; and a 28-dimensional representation of the driving noise: one independent gaussian noise term for each directed edge in the channel-state transition graph. We showed in section 2 that the corresponding mean-field 14D ordinary differential equation model is consistent with the classical HH equations in the sense that the latter correspond to an invariant submanifold of the higher-dimensional model, to which all trajectories converge exponentially quickly. Figure 2 illustrated the relation between the deterministic 4D and 14D Hodgkin-Huxley systems. Building on this framework, we introduced the $14×28$D model, with independent noise sources corresponding to each ion channel transition (see section 3). We proved in section 4 that given identical boundary conditions, our $14×28$D model is pathwise equivalent to Fox and Lu's original Langevin model and to a 14-state model with 14 independent noise sources due to Orio and Soudry (2012).

The original 4D HH model, the 14D deterministic HH model, and the family of equivalent 14D Langevin models we consider here form a nested family, each contained within the next. Specifically, the 14D ODE model is the mean-field version of the 14D Langevin model, and the 4D ODE model forms an attracting invariant submanifold within the 14D ODE model, as we establish in our lemma 2. So in a very specific sense, the original HH equations “live inside” the 14D Langevin equations. Thus, these three models enjoy a special relationship. In contrast, the 4D Langevin equations studied in Fox (1997) do not bear an especially close relationship to the other three.

In addition to rigorous mathematical analysis, we also performed numerical comparisons (see section 5) showing that, as expected, the pathwise equivalent models produced statistically indistinguishable interspike interval (ISI) distributions. Moreover, the ISI distributions for our model (and its equivalents) were closer to the ISI distribution of the gold standard MC model under two different metric space measures. Our method (along with Orio and Soudry's) proved computationally more efficient than Fox and Lu's original method and Dangerfield's model (Dangerfield et al., 2012). In addition, our method lends itself naturally to model reduction (via the stochastic shielding approximation) to a significantly faster $14×6$D simulation that preserves a surprisingly high level of accuracy.

### 6.2  Discrete Gillespie Markov Chain Algorithms

The discrete-state MC algorithm due to Gillespie is often taken to be the gold standard simulation for a single-compartment stochastic conductance-based model. Most former literature on Langevin HH models (Goldwyn & Shea-Brown, 2011; Linaro et al., 2011; Orio & Soudry, 2012; Huang et al., 2013), when establishing a reference MC model, consider a version of the discrete Gillespie algorithm that assumes a piecewise-constant propensity approximation, that is, it does not take into account that the voltage changes between transitions, which changes the transition rates. This approximation can lead to biophysically unrealistic voltage traces for very small system sizes (see Figure 2 of Kispersky & White, 2008; top trace with $N=1$ ion channel) although the differences appear to be mitigated for $N≳40$ channels (Anderson et al., 2015). In this article our MC simulations are based on 6000 Na$+$ and 1800 K$+$ channels (as in Goldwyn & Shea-Brown, 2011), and we too use the ISI distribution generated by a piecewise-constant propensity MC algorithm as our reference distribution. As shown in Table 3 and Figure 8, the computation time for MC is one order of magnitude larger than efficient methods such as Orio and Soudry (2012) and Dangerfield et al. (2012) and the $14×28$D model. The computational cost of the MC model increases dramatically as the number of ion channels grows; therefore, even the approximate MC algorithm is inapplicable for a large number of channels.

### 6.3  Langevin Models

It is worth pointing out that the accuracy of Fox and Lu's original Langevin equations has not been fully appreciated. In fact, Fox and Lu's model (Fox & Lu, 1994) gives an approximation to the MC model that is just as accurate as that of Orio and Soudry (2012) both in the gating variable statistics (Goldwyn & Shea-Brown, 2011) and the ISI distribution sense (see section 5), because, as we have established, these models are pathwise equivalent! However, the original implementation requires taking a matrix square root in every time step in the numerical simulation, which significantly reduces its computational efficiency.

Models based on modifications of Fox and Lu's (1994) work can be divided into three classes: the subunit model (Fox, 1997), effective noise models (Linaro et al., 2011; Güler, 2013b), and channel-based Langevin models (e.g., Goldwyn & Shea-Brown, 2011; Dangerfield et al., 2012; Orio & Soudry, 2012; Pezo et al., 2014; Huang et al., 2013).

#### 6.3.1  Subunit Model

The first modification of Fox and Lu's model is the subunit model (Fox, 1997), which keeps the original form of the HH model, and adds noise to the gating variables ($m,h,andn$) (Fox, 1997; Goldwyn & Shea-Brown, 2011). The subunit approximation model was widely used because of its fast computational speed. However, as Bruce (2009) and others pointed out, the inaccuracy of this model remains significant even for large number of channels. Moreover, Goldwyn and Shea-Brown (2011) and Huang et al. (2015) found that the subunit model fails to capture the statistics of the HH Na$+$ and K$+$ gates. In this article, we also observed that the subunit model is more efficient than channel-based Langevin models but tends to delay spike generation. As shown in Figure 7, the subunit model generates significantly longer ISIs than the MC model.

#### 6.3.2  Effective Noise Models

Another modification to Fox and Lu's algorithm is to add colored noise to the channel open fractions. Though colored noise models such as those of Linaro et al. (2011) and Güler (2013b) are not included in our model comparison, Huang et al. (2015) found that both of these effective noise models generate shorter ISIs than the MC model with the same parameters. Though the comparison we provided in section 5 only include the Fox and Lu 94, Fox97, Goldwyn, Dangerfield, Orio, SS, and the $14×28$D models, combining the results from Goldwyn and Shea-Brown (2011) and Huang et al. (2015), the $14×28$D model could be compared to a variety of models (e.g., Fox & Lu, 1994; Fox, 1997; Goldwyn & Shea-Brown, 2011; Linaro et al., 2011; Orio & Soudry, 2012; Dangerfield et al., 2012; Huang et al., 2013; Güler, 2013b).

#### 6.3.3  Channel-based Langevin Models

The main focus of this article is the modification based on the original Fox and Lu's matrix decomposition method, namely, the channel-based (or conductance-based) Langevin models. We proved in section 4 that under the same boundary conditions, Fox and Lu's original model, Orio's model, and our $14×28$D model are pathwise equivalent, which was also verified from our numerical simulations in sections 4 and 5. In section 4 we discussed channel-based Langevin models (Fox & Lu, 1994; Goldwyn & Shea-Brown, 2011; Dangerfield et al., 2012; Orio & Soudry, 2012; Fox, 2018). We excluded Fox's more recent implementation (Fox, 2018) in section 5 for two reasons. First, the algorithm is pathwise equivalent to others considered there. Moreover, the method is vulnerable to numerical instability when the Cholesky decomposition is performed. Specifically, some of the elements in the $S$ matrix from the Cholesky decomposition in Fox (2018) involve square roots of differences of several quantities, with no guarantee that the differences will result in nonnegative terms—even with strictly positive value of the gating variables. Nevertheless, this model would be in the equivalence class and in any case would not be more efficient than the Orio model because of the noise dimension and complicated operations (involving taking multiple square roots) in each step.

### 6.4  Model Comparisons

If two random variables have similar distributions, then they will have similar moments, but not vice versa. Therefore, comparison of the full interspike-interval distributions produced by different simulation algorithms gives a more rigorous test than comparison of first and second moments of the ISI distribution. Most previous evaluations of competing Langevin approximations were based on the accuracy of low-order moments, for example, the mean and variance of channel state occupancy under voltage clamp, or the mean and variance of the interspike interval distribution under current clamp (Goldwyn et al., 2011; Goldwyn & Shea-Brown, 2011; Schmandt & Galán, 2012; Linaro et al., 2011; Dangerfield et al., 2012; Orio & Soudry, 2012; Huang et al., 2013, 2015). Here, we compare the accuracy of the different algorithms using the full ISI distribution but using the $L1$ norm of the difference (Wasserstein metric) and the $L∞$ norm (Kolmogorov-Smirnov test). Greenwood, McDonnell, and Ward (2015) previously compared the ISI distributions generated by the Markov chain (Gillespie algorithm) to the distribution generated by different types of Langevin approximations (LA), including the original Langevin models (Fox & Lu, 1994; Goldwyn & Shea-Brown, 2011), the channel-based LA with colored noise (Linaro et al., 2011; Güler, 2013b), and LA with a $14×14$ variant of the diffusion coefficient matrix $S$ (Orio & Soudry, 2012). They concluded that Orio and Soudry's method provided the best match to the Markov chain model, specifically, “Fox-Goldwyn, and Orio-Kurtz8 methods both generate ISI histograms very close to those of “Micro.”9 (Greenwood et al., 2015). We note that the comparison reported in this article simply superimposed plots of the ISI distributions, allowing a qualitative comparison, while our metric-space analysis is fully quantitative. In any case, their conclusions are consistent with our findings; we showed in section 4 that the Fox-Goldwyn and the Orio-Kurtz model are pathwise equivalent (when implemented with the same boundary conditions), which accounts for the similarity in the ISI histograms they generate. In fact, because of pathwise equivalence, we can conclude that the true distributions for these models are identical, and any differences observed just reflect finite sampling.

### 6.5  Stochastic Shielding Method

The stochastic shielding (SS) approximation (Schmandt & Galán, 2012) provides an efficient and accurate method for approximating the Markov process with only a subset of observable states. For conductance-based models, rather than aggregating ion channel states, SS affects dimension reduction by selectively eliminating those independent noise sources that have the least impact on current fluctuations. Recent work in (Pezo, Soudry, & Orio, 2014) compared previous methods (Gillespie, 1977; Orio & Soudry, 2012; Dangerfield et al., 2012; Huang et al., 2013; Schmandt & Galán, 2012) in accuracy, applicability, and simplicity, as well as computational efficiency. They concluded that for mesoscopic numbers of channels, stochastic shielding methods combined with diffusion approximation methods can be an optimal choice. Like Orio and Soudry (2012), the stochastic shielding method proposed by Pezo et al. (2014) also assumed a detailed balance of transitions between adjacent states and used edges that are directly connected to the open gates of HH Na$+$ and K$+$. We calculated the edge importance in section 3.3 and found that the 4 (out of 20) most important directed edges for the Na$+$ gates are not the 4 edges directly connected to the conducting state, as assumed in previous application of the SS method (Schmandt & Galán, 2012).

### 6.6  Which Model to Use?

Among all modifications of Fox and Lu's method considered here, Orio and Soudry's approach and our $14×28$D model provide the best approximation to the gold standard MC model, with the greatest computational efficiency. Several earlier models were studied in the review paper by Goldwyn and Shea-Brown (2011), where they rediscovered that the original Langevin model proposed by Fox and Lu is the best approximation to the MC model among those considered. Later work (Huang et al., 2015) further surveyed a wide range of Langevin approximations for the HH system (Fox & Lu, 1994; Fox, 1997; Goldwyn & Shea-Brown, 2011; Linaro et al., 2011; Güler, 2013b; Orio & Soudry, 2012; Huang et al., 2013) and explored models with different boundary conditions. Huang et al. (2015) concluded that the bounded and truncated-restored Langevin model (Huang et al., 2013) and the unbounded model (Orio & Soudry, 2012) provide the best approximation to the MC model.

As shown in sections 4 and 5, the $14×28$D Langevin model naturally derived from the channel structure is pathwise equivalent to the Fox94, Fox18, and Orio-Soudry models under the same boundary conditions. The $14×28$D model is more accurate than the reflecting boundary condition method of Dangerfield et al. (2012) and also better than the approximation method proposed by Goldwyn and Shea-Brown (2011) when the entire ISI distribution is taken into account. We note that Huang et al. (2015) treated Goldwyn's method (Goldwyn & Shea-Brown, 2011) as the original Fox and Lu model in their comparison; however, the simulation in Goldwyn and Shea-Brown (2011) uses the 4D multinomial submanifold to update gating variables. Our analysis and numerical simulations suggest that the original Fox and Lu model is indeed as accurate as the Orio-Soudry model, although the computational cost remains a major concern.

Though the $14×28$D model has similar efficiency and accuracy as that of Orio and Soudry (2012), it has several advantages. First, the rectangular $S$ matrix (in equations 1.2 and 1.3) in Orio's model merges the noise contributions of reciprocal pairs of edges. However, this dimension reduction assumes, in effect, that detailed balance holds along reciprocal edges, which our results show is not the case, under current clamp (see Figure 5). Moreover, the $14×28$D model arises naturally from the individual transitions of the exact evolution equations (see equations 3.1 and 3.2) for the underlying Markov chain model, which makes it conceptually easier to understand. In addition, our method for defining the $14×28$D Langevin model and finding the best SS model extends to channel-based models with arbitrary channel gating schemes beyond the standard HH model. Given any channel state transition graph, the Langevin equations may be read off from the transitions, and the edge importance under current clamp can be evaluated by applying the stochastic shielding method to investigate the contributions of noise from each individual directed edge. Finally, in exchange for a small reduction in accuracy, the stochastic shielding method affords a significant gain in efficiency. The $14×28$D model thus offers a natural way to quantify the contributions of the microscopic transitions to the macroscopic voltage fluctuations in the membrane through the use of stochastic shielding. For general ion channel models, extending a biophysically based Langevin model analogous to our $14×28$D HH model, together with the stochastic shielding method, may provide the best available tool for investigating how unobservable microscopic behaviors (such as ion channel fluctuations) affect the macroscopic variability in many biological systems.

### 6.7  Limitations

All Langevin models, including our proposed $14×28$D model, proceed from the assumption that the ion channel population is large enough (and the ion channel state transitions frequent enough) that the gaussian approximations by which the white noise forcing terms are derived are justified. Thus, when the system size is too small, no Langevin system will be appropriate. Fortunately the Langevin approximation appears quite accurate for realistic population sizes.

The $14×28$D model uses more noise sources than other approaches. However, stochastic shielding allows us to jettison noise sources that do not significantly affect the system dynamics (the voltage fluctuations and ISI distribution). Moreover, in order to compare the ISI distribution in detail among several variants of the Fox94’ model versus the Markov chain standard, we have considered a single value of the driving current, while other studies have compared parameterized responses such as the firing rate, ISI variance, or other moments, as a function of applied current. Accurate comparisons require large ensembles of independent trajectories, forcing a trade-off between precision and breadth; we opted here for precise comparisons at a representative level of the driving current.

From a conceptual point of view, a shortcoming of most Langevin models is the tendency for some channel state variables $x$ to collide with the domain boundaries $x∈[0,1]$ and to cross them during numerical simulations with finite time steps. We adopted the approach advocated by Orio and Soudry (2012) of using “free boundaries” in which gating variables can make excursions into the (unphysical) range $x<0$ or $x>1$. Practically, these excursions are always short if the time step is reasonably small, as they tend to be self-correcting.10 Another approach is to construct reflecting boundary conditions; different implementations of this idea were used in Dangerfield et al. (2010), Fox and Lu (1994), and Schmid, Goychuk, and Hänggi (2001). Dangerfield's method proved both slower and less accurate than the free boundary method. As an alternative method, one uses a biased rejection sampling approach, testing each gating variable of the 14D model on each time step and repeating the noise sample for any time step violating the domain conditions (Fox & Lu, 1994; Schmid et al., 2001). We found that this method had accuracy similar to that of Dangerfield's method ($L1$-Wasserstein difference $≈4.4e-1$ msec; see Table 3) and run time similar to that of the Fox94 implementation, about four times slower than our 14D Langevin model.

Yet another approach that in principle can guarantee that the stochastic process remains within proscribed bounds, rederives a “best diffusional approximation” Fokker-Planck equation based on matching a master-equation birth-and -death description of a (binomial) population of two-state ion channels, leading to modified drift and diffusion terms (Goychuk, 2014). This method does not appear to extend readily to the 14D setting, with underlying multinomial structure of the ion channel gates, so we do not dwell on it further.

Table 3 gives the accuracies with which each model reproduces the ISI distribution, compared to a standard reference distribution generated through a large number of samples of the MC method. The mean $L1$ difference between a single sample and the reference sample is about 0.227 microseconds. For a nonnegative random variable $T≥0$, the difference in the mean under two probability distributions is bounded above by the $L1$ difference in their cumulative distribution functions.11 Thus, the $L1$ norm gives an idea of the temporal accuracy with which one can approximate a given distribution by another. The mean difference between the ISI distribution generated by a single run of the full $14×28$D model is about 49 $μ$sec, and the discrepancy produced by the (significantly faster) SS model is about 76 $μ$sec. When would this level of accuracy matter for the function of a neuron within a network? The barn owl Tyto alba uses interaural time difference to localize its prey to within 1 to 2 degrees, a feat that requires encoding information in the precise timing of auditory system action potentials at the scale of 5 to 20 microseconds (Moiseff & Konishi, 1981; Gerstner, Kempter, Van Hemmen, & Wagner, 1996). For detailed studies of the effects of channel noise in this system, the superior accuracy of the MC model might be preferred. On the other hand, the timescale of information encoding in the human auditory nerve is thought to be in the millisecond range (Goldwyn, Shea-Brown, & Rubinstein, 2010), with precision in the feline auditory system reported as low as 100 $μ$s (Imennov & Rubinstein, 2009; see also Woo, Miller, & Abbas, 2009). For these and other mammalian systems, the stochastic shielding approximation should provide sufficient accuracy.

### 6.8  Future Work

In section 3.3 we compared the ISI variance when noise was included one edge at a time and found that the edges making the greatest contribution to population fluctuations under voltage clamp were not identical to the edges having the largest effect on ISI variance when taken one at a time. However, the ISI, considered as a random variable determined through a first-passage time process, depends on the entire trajectory, not just on the occupancy of the conducting states. The HH dynamics are strongly nonlinear, producing a limit cycle in the deterministic case, and it is not immediately clear whether the effects of channel noise on ISI variability should be additive. In future work, we plan to address the question of the additive contribution of individual/molecular noise sources on ISI variability.

A principal motivation for using the stochastic shielding algorithm is to develop fast and accurate algorithms for ensemble simulations of forward models for parameter estimation in a data assimilation framework. We expect that our method may prove useful for such studies based on current-clamp electrophysiological data in the future.

## Appendix A:  Table of Common Symbols and Notations

Table 4:
SymbolMeaning
$C$ Membrane capacitance ($μF/cm2$
$v$ Membrane potential ($mV$
$VNa,VK,VL$ Ionic reversal potential for Na$+$, K$+$and leak ($mV$
$Iapp$ Applied current to the membrane ($nA/cm2$
$m,h,n$ Dimensionless gating variables for Na$+$ and K$+$ channels
$αx,βxx∈{m,n,h}$ Voltage-dependent rate constant ($1/msec$
$x$ Vector of state variables
$M=[M1,M2,⋯,M8]$ Eight-component state vector for the Na$+$ gates
$[m00,m10,m20,m30,$ Components for the Na$+$gates
$m01,m11,m21,m31]⊺$
$N=[N1,N2,⋯,N5]$ Five-component state vector for the K$+$ gates
$[n0,n1,n2,n3,n4]⊺$ Components for the K$+$ gates
$Mtot,Ntot$ Total number of Na$+$ and K$+$ channels
$X$ 4-dimensional manifold domain for 4D HH model
$Y$ 14-dimensional manifold domain for 14D HH model
$Δk$ $k$-dimensional simplex in $Rk+1$ given by $y1+…+yk+1=1,yi≥0$
$M$ Multinomial submanifold within the 14D space
$ANa$, $AK$ State-dependent rate matrix
$D$ State diffusion matrix
$S$, $S1$, $S2$, $SNa$, $SK$ Noise coefficient matrices
$ξ$ Vector of independent $δ$-correlated gaussian white noise with zero mean and unit variance
$X=[X1,X2,…,Xd]$ A $d$-dimensional random variable for sample path
$W=[W1,W2,…,Wn]$ A Wiener trajectory with $n$ components
$δ(·)$ The Dirac delta function
$δij$ The Kronecker delta
$Fn$ Empirical cumulative distribution function with $n$ observations (in section 5, we use $m,n$ as sample sizes)
SymbolMeaning
$C$ Membrane capacitance ($μF/cm2$
$v$ Membrane potential ($mV$
$VNa,VK,VL$ Ionic reversal potential for Na$+$, K$+$and leak ($mV$
$Iapp$ Applied current to the membrane ($nA/cm2$
$m,h,n$ Dimensionless gating variables for Na$+$ and K$+$ channels
$αx,βxx∈{m,n,h}$ Voltage-dependent rate constant ($1/msec$
$x$ Vector of state variables
$M=[M1,M2,⋯,M8]$ Eight-component state vector for the Na$+$ gates
$[m00,m10,m20,m30,$ Components for the Na$+$gates
$m01,m11,m21,m31]⊺$
$N=[N1,N2,⋯,N5]$ Five-component state vector for the K$+$ gates
$[n0,n1,n2,n3,n4]⊺$ Components for the K$+$ gates
$Mtot,Ntot$ Total number of Na$+$ and K$+$ channels
$X$ 4-dimensional manifold domain for 4D HH model
$Y$ 14-dimensional manifold domain for 14D HH model
$Δk$ $k$-dimensional simplex in $Rk+1$ given by $y1+…+yk+1=1,yi≥0$
$M$ Multinomial submanifold within the 14D space
$ANa$, $AK$ State-dependent rate matrix
$D$ State diffusion matrix
$S$, $S1$, $S2$, $SNa$, $SK$ Noise coefficient matrices
$ξ$ Vector of independent $δ$-correlated gaussian white noise with zero mean and unit variance
$X=[X1,X2,…,Xd]$ A $d$-dimensional random variable for sample path
$W=[W1,W2,…,Wn]$ A Wiener trajectory with $n$ components
$δ(·)$ The Dirac delta function
$δij$ The Kronecker delta
$Fn$ Empirical cumulative distribution function with $n$ observations (in section 5, we use $m,n$ as sample sizes)
Table 5:
SymbolMeaningValue
$C$ Membrane capacitance 1 $μF/cm2$
$g¯Na$ Maximal sodium conductance 120 $μS/cm2$
$g¯K$ Maximal potassium conductance 36 $μS/cm2$
$gleak$ Leak conductance 0.3 $μS/cm2$
$VNa$ Sodium reversal potential for Na$+$ 50 $mV$
$VK$ Potassium reversal potential for K$+$ $-$77 $mV$
$Vleak$ Leak reversal potential $-$54.4 $mV$
$Iapp$ Applied current to the membrane 10 $nA/cm2$
$A$ Membrane area $100μm2$
$Mtot$ Total number of Na$+$ channels 6,000
$Ntot$ Total number of K$+$ channels 18,00
SymbolMeaningValue
$C$ Membrane capacitance 1 $μF/cm2$
$g¯Na$ Maximal sodium conductance 120 $μS/cm2$
$g¯K$ Maximal potassium conductance 36 $μS/cm2$
$gleak$ Leak conductance 0.3 $μS/cm2$
$VNa$ Sodium reversal potential for Na$+$ 50 $mV$
$VK$ Potassium reversal potential for K$+$ $-$77 $mV$
$Vleak$ Leak reversal potential $-$54.4 $mV$
$Iapp$ Applied current to the membrane 10 $nA/cm2$
$A$ Membrane area $100μm2$
$Mtot$ Total number of Na$+$ channels 6,000
$Ntot$ Total number of K$+$ channels 18,00

## Appendix B:  Parameters and Transition Matrices

Subunit kinetics for the Hodgkin and Huxley equations are given by
$αm(v)=0.1(25-v)exp(2.5-0.1v)-1,$
(B.1)
$βm(v)=4exp(-v/18),$
(B.2)
$αh(v)=0.07exp(-v/20),$
(B.3)
$βh(v)=1exp(3-0.1v)+1,$
(B.4)
$αn(v)=0.01(10-v)exp(1-0.1v)-1,$
(B.5)
$βn(v)=0.125exp(-v/80)$
(B.6)
$AK(v)=DK(1)βn(v)0004αn(v)DK(2)2βn(v)0003αn(v)DK(3)3βn(v)0002αn(v)DK(4)4βn(v)000αn(v)DK(5),$
$ANa=DNa(1)βm00βh0003αmDNa(2)2βm00βh0002αmDNa(3)3βm00βh000αmDNa(4)000βhαh000DNa(5)βm000αh003αmDNa(6)2βm000αh002αmDNa(7)3βm000αh00αmDNa(8),$
where the diagonal elements
$Dion(i)=-∑j:j≠iAion(j,i),ion∈{Na,K}.$

## Appendix C:  Proof of Lemma 2

For the reader's convenience, we restate this lemma.

Lemma 2.

Let $X$ and $Y$ be the lower-dimensional and higher-dimensional Hodgkin-Huxley manifolds given by equation 2.12, and let $F$ and $G$ be the vector fields on $X$ and $Y$ defined by equations 2.1 to 2.4 and 2.7 to 2.9, respectively. Let $H:X→M⊂Y$ and $R:Y→X$ be the mappings given in Tables 2 and 1, respectively, and define the multinomial submanifold $M=H(X)$. Then $M$ is forward-time-invariant under the flow generated by $G$. Moreover, the vector field $G$, when restricted to $M$, coincides with the vector field induced by $F$ and the map $H$. That is, for all $y∈M$, $G(y)=DxH(R(y))·F(R(y)).$

The main idea of the proof is to show that for every $y∈Y$, $G(y)$ is contained in the span of the four vectors $∂H∂xi(R(y))i=14.$

Proof.
The map from the 4D HH model to the 14D HH model is given in Table 2 as ${H:x→y|x∈X,y∈Y}$, and the map from the 14D HH model to the 4D HH model is given in Table 1 as ${R:y→x|x∈X,y∈Y}$. The partial derivatives $∂H∂x$ of the map $H$ are given by
$dm00dm=-3(1-m)2(1-h)dm00dh=-(1-m)3dm10dm=3(1-h)(3m2-4m+1)dm10dh=-3(1-m)2mdm20dm=3(1-h)(2m-3m2)dm20dh=-3(1-m)m2dm30dm=3(1-h)m2dm30dh=-m3dm01dm=-3h(1-m)2dm01dh=(1-m)3$
$dm11dm=3h(3m2-4m+1)dm11dh=3(1-m)2mdm21dm=3h(2m-3m2)dm21dh=3(1-m)m2dm31dm=3hm2dm31dh=m3.$
We can write $∂H/∂x$ in matrix form as:
$∂H∂x=10000-3(1-m)2(1-h)-(1-m)3003(1-h)(3m2-4m+1)-3(1-m)2m003(1-h)(2m-3m2)-3(1-m)m2003(1-h)m2-m300-3h(1-m)2(1-m)3003h(3m2-4m+1)3(1-m)2m003h(2m-3m2)3(1-m)m2003hm2m30000-4(1-n)30004(1-n)2(1-4n)00012n(1-n)(1-2n)0004n2(3-4n)0004n3.$
We write out the vector fields 2.8 and 2.9 component by component:
$dM1dt=βmM2+βhM5-(3αm+αh)M1=-3(1-m)2(1-h)(1-m)αm-mβm+(1-m)3hβh-(1-h)αh,dM2dt=3αmM1+2βmM3+βhM6-(2αm+βm+αh)M2=3(1-h)(3m2-4m+1)(1-m)αm-mβm+3(1-m)2mhβh-(1-h)αh,dM3dt=2αmM2+3βmM4+βhM7-(αm+2βm+αh)M3,=3(1-h)(2m-3m2)(1-m)αm-mβm+3(1-m)m2hβh-(1-h)αh,dM4dt=αmM3+βhM8-(3βm+αh)M4,=3(1-h)m2(1-m)αm-mβm+m3hβh-(1-h)αh,dM5dt=βmM6+αhM1-(3αm+βh)M5,=-3h(1-m)2(1-m)αm-mβm+(1-m)3hβh-(1-h)αh,dM6dt=3αmM5+2βmM7+αhM2-(2αm+βm+βh)M6,=3h(3m2-4m+1)(1-m)αm-mβm-3(1-m)2mhβh-(1-h)αh,dM7dt=2αmM6+3βmM8+αhM3-(αm+2βm+βh)M7,=3h(2m-3m2)(1-m)αm-mβm-3(1-m)m2hβh-(1-h)αh,dM8dt=αmM7+αhM4-(3βm+βh)M8,=3hm2(1-m)αm-mβm-m3hβh-(1-h)αh,dN1dt=βnN2-4αnN1=-4(1-n)3[αn(1-n)-nβn],dN2dt=4αnN1+2βnN3-(3αn+βn)N2=4(1-n)2(1-4n)[αn(1-n)-nβn],dN3dt=3αnN2+3βnN4-(2αn+2βn)N3=12n(1-n)(1-2n)[αn(1-n)-nβn],dN4dt=2αnN3+4βnN5-(3αn+3βn)N4=4n2(3-4n)[αn(1-n)-nβn],dN5dt=αnN4-4βnN5=4n3[αn(1-n)-nβn].$
By extracting common factors from the previous expressions it is clear that $G(y)$ may be written, for all $y∈Y$, as
$G(y)=-g¯NaM8(V-VNa)-g¯KN5(V-VK)-gL(V-VL)+IappC∂H∂v(R(y))+(1-m')αm-m'βm∂H∂m(R(y))-h'βh-(1-h')αh∂H∂h(R(y))+[αn(1-n')-n'βn]∂H∂n(R(y)),$
(C.1)
where $m'=(M2+M6)/3+2(M3+M7)/3+(M4+M8),$$h'=M5+M6+M7+M8$ and $n'=N2/4+N3/2+3N4/4+N5$. Thus, $G(y)$ is in the span of the column vectors $∂H/∂v$, $∂H/∂m$, $∂H/∂n$, and $∂H/∂h$, as was to be shown.
On the other hand, the vector field for the 4D HH ODE, equations 2.1 to 2.4, is given by
$F=-g¯Nam3h(V-VNa)-g¯Kn4(V-VK)-gL(V-VL)+Iapp/Cαm(V)(1-m)-βm(V)mαh(V)(1-h)-βh(V)hαn(V)(1-n)-βn(V)n.$
Referring to equation C.1, we see that $G(y)=DxH(R(y))F(R(y))$. Thus we complete the proof of lemma 2.$□$

## Appendix D:  Diffusion Matrix of the 14D Model

As defined in equations 2.5 and 2.6,
$M=[m00,m10,m20,m30,m01,m11,m21,m31]⊺,$
(D.1)
$N=[n0,n1,n2,n3,n4]⊺,$
(D.2)
the diffusion matrices $DNa$ and $DK$ are given by
$DK=DK(1,1)-4αnn0-βnn10-4αnn0-βnn1DK(2,2)3αnn1-2βnn20-3αnn1-2βnn2DK(3,3)00-2αnn2-3βnn3000⋯⋯00002αnn2-3βnn30DK(4,4)-αnn3-4βnn4-αnn3-4βnn4DK(5,5),DNa(1:3)=DNa(1,1)-3αmm00-βmm100-3αmm00-βmm10DNa(2,2)-2αmm10-2βmm200-2αmm10-2βmm20DNa(3,3)00-αmm20-3βmm30-αhm00-βhm01000-αhm10-βhm11000-αhm20-βhm21000,DNa(4:6)=0-αhm00-βhm01000-αhm10-βhm11-αmm20-3βmm3000DNa(4,4)000DNa(5,5)-3αmm01-βmm110-3αmm01-βmm11DNa(6,6)00-2αmm11-2βmm21-αhm30-βhm3100,DNa(7:8)=0000-αhm20-βhm2100-αhm30-βhm3100-2αmm11-2βmm210DNa(7,7)-αmm21-3βmm31-αmm21-3βmm31DNa(8,8),$
where
$Dion(i,i)=-∑j:j≠iDion(j,i),forion∈{Na,K}.$
The matrices $SK$ and $SNa$ for the $14×28$D HH model are given by
$SNa(1:5)=-αhm00βhm01-3αmm00βmm100003αmm00-βmm10-αhm100000000000αhm00-βhm01000-βhm110000000αhm20-βhm2100000SNa(6:10)=00000βhm11-2αmm102βmm20002αmm10-2βmm20-αhm20βhm210000000000-βhm110000000αhm20-βhm2100000SNa(11:15)=0000000000-αmm203βmm30000αmm20-3βmm30-αhm30βhm3100000000003αmm010000000αhm30-βhm310SNa(16:20)=00000000000000000000-3αmm01βmm11000-βmm11-2αmm112βmm210002αmm11-2βmm21-αmm213βmm31000αmm21-3βmm31,$
where $SNa(i:j)$ is the $ith-jth$ column of $SNa$, and
$SK=-4αnn0βnn1004αnn0-βnn1-3αnn12βnn2003αnn1-2βnn200000000⋯⋯000000002αnn2-3βnn3-αnn34βnn400αnn3-4βnn4.$

## Notes

1

That is, distributions indexed by a single open probability $n$, with the five states having probabilities $4ini(1-n)4-ifor0≤i≤4$.

2

That is, distributions indexed by two open probabilities $m$ and $h$, with the eight states having probabilities $3imi(1-m)3-ihj(1-h)1-j,for0≤i≤3,and0≤j≤1$.

3

We annotate the stochastic population vector $M$ as either $[M00,M10,…,M31]$ or as $[M1,…,M8]$, whichever is more convenient. In either notation $M31≡M8$ is the conducting state of the Na$+$channel. For the K$+$channel, $N4$ denotes the conducting state.

4

We write $eiNa$ and $eiK$ for the $i$th standard unit vector in $R8$ or $R5$, respectively.

5

The convergence of the discrete channel system to a Langevin system under voltage clamp is a special case of Kurtz's theorem (Kurtz, 1981).

6

The forward Euler method is first-order accurate for ordinary differential equations, but the forward Euler-Maruyama method is only $O(dt)$ accurate for stochastic differential equations (Kloeden & Platen, 1999).

7

Run times in Table 3, rounded to the nearest integer number of seconds, were obtained by averaging the run times on a distribution of heterogeneous compute nodes from Case Western Reserve University's high-performance computing cluster.

8

We refer to this model as “Orio.”

9

This is the model we refer to as the Markov chain model.

10

To avoid complex entries, we use $|x|$ when calculating entries in the noise coefficient matrix.

11

For a nonnegative random variable $T$ with cumulative distribution function $F(t)=P[T≤t]$, the mean satisfies $E[T]=∫0∞(1-F(t))dt$ (Grimmett & Stirzaker, 2005). Therefore, the difference in mean under two distributions $F1$ and $F2$ satisfies $|E1[T]-E2[T]|=∫0∞F1(t)-F2(t)dt≤ρ1(F1,F2).$

## Acknowledgments

We thank David Friel (Case Western Reserve University) for illuminating discussions of the impact of channel noise on neural dynamics and the anonymous referees for many detailed and constructive comments.

This work was made possible in part by grants from the National Science Foundation (DMS-1413770 and DEB-1654989) and the Simons Foundation. P.T. thanks the Oberlin College Department of Mathematics for research support. This research has been supported in part by the Mathematical Biosciences Institute and the National Science Foundation under grant DMS-1440386. Large-scale simulations made use of the High Performance Computing Resource in the Core Facility for Advanced Research Computing at Case Western Reserve University.

## References

Allen
,
E. J.
,
Allen
,
L. J.
,
Arciniega
,
A.
, &
Greenwood
,
P. E.
(2008)
.
Construction of equivalent stochastic differential equation models
.
Stochastic Analysis and Applications
,
26
(2)
,
274
297
.
Anderson
,
D. F.
,
Ermentrout
,
B.
, &
Thomas
,
P. J.
(2015)
.
Stochastic representations of ion channel kinetics and exact stochastic simulation of neuronal dynamics
.
Journal of Computational Neuroscience
,
38
(1)
,
67
82
.
Anderson
,
D. F.
, &
Kurtz
,
T. G.
(2015)
.
Stochastic analysis of biochemical systems
, vol.
1
.
Berlin
:
Springer
.
Bruce
,
I. C.
(2009)
.
Evaluation of stochastic differential equation approximation of ion channel gating models
.
Annals of Biomedical Engineering
,
37
(4)
,
824
838
.
Clay
,
J. R.
, &
DeFelice
,
L. J.
(1983)
.
Relationship between membrane excitability and single channel open-close kinetics
.
Biophys. J.
,
42
(2)
,
151
157
.
Colquhoun
,
D.
, &
Hawkes
,
A.
(1981)
.
On the stochastic properties of single ion channels
.
Proc. R. Soc. Lond. B
,
211
(1183)
,
205
235
.
Dangerfield
,
C.
,
Kay
,
D.
, &
Burrage
,
K.
(2010)
.
Stochastic models and simulation of ion channel dynamics
.
Procedia Computer Science
,
1
(1)
,
1587
1596
.
Dangerfield
,
C. E.
,
Kay
,
D.
, &
Burrage
,
K.
(2012)
.
Modeling ion channel dynamics through reflected stochastic differential equations
.
Physical Review E
,
85
(5)
,
051907
.
Dayan
,
P.
, &
Abbott
,
L. F.
(2001)
.
Theoretical neuroscience: Computational and mathematical modeling of neural systems
.
Cambridge, MA
:
MIT Press
.
Dvoretzky
,
A.
,
Kiefer
,
J.
, &
Wolfowitz
,
J.
(1956)
.
Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator
.
Annals of Mathematical Statistics
,
27
(3)
,
642
669
.
Earnshaw
,
B.
, &
Keener
,
J.
(2010a)
.
Global asymptotic stability of solutions of nonautonomous master equations.
SIAM Journal on Applied Dynamical Systems
,
9
(1)
,
220
237
.
Earnshaw
,
B.
, &
Keener
,
J.
(2010b)
.
Invariant manifolds of binomial-like nonautonomous master equations.
SIAM Journal on Applied Dynamical Systems
,
9
(2)
,
568
588
.
Faisal
,
A. A.
, &
Laughlin
,
S. B.
(2007)
.
Stochastic simulations on the reliability of action potential propagation in thin axons
.
PLOS Computational Biology
,
3
,
5
.
Fox
,
R. F.
(1997)
.
Stochastic versions of the Hodgkin-Huxley equations
.
Biophysical Journal
,
72
(5)
,
2068
2074
.
Fox
,
R. F.
(2018)
.
Critique of the Fox-Lu model for Hodgkin-Huxley fluctuations in neuron ion channels
.
arXiv:1807.07632
.
Fox
,
R.
, &
Lu
,
Y.
(1994)
.
Emergent collective behavior in large numbers of globally coupled independently stochastic ion channels
.
Phys. Rev. E Stat. Phys. Plasmas Fluids Relat. Interdiscip. Topics
,
49
(4)
,
3421
3431
.
,
C.
,
Lee
,
C. H.
, &
Othmer
,
H. G.
(
2005
).
A stochastic analysis of first-order reaction networks
.
Bulletin of Mathematical Biology
,
67
,
901
946
.
Gerstner
,
W.
,
Kempter
,
R.
, Van
Hemmen
,
J. L.
, &
Wagner
,
H.
(1996)
.
A neuronal learning rule for sub-millisecond temporal coding
.
Nature
,
383
(6595)
,
76
78
.
Gillespie
,
D. T.
(
1977
).
Exact stochastic simulation of coupled chemical reactions
.
J. Phys. Chem.
,
81
,
2340
2361
.
Gillespie
,
D. T.
(
2007
).
Stochastic simulation of chemical kinetics
.
Annu. Rev. Phys. Chem.
,
58
,
35
55
.
Goldwyn
,
J. H.
,
Imennov
,
N. S.
,
Famulare
,
M.
, &
Shea-Brown
,
E.
(2011)
.
Stochastic differential equation models for ion channel noise in Hodgkin-Huxley neurons.
Physical Review E
,
83
(4)
,
041908
.
Goldwyn
,
J. H.
, &
Shea-Brown
,
E.
(2011)
.
The what and where of adding channel noise to the Hodgkin-Huxley equations.
PLOS Comput. Biol.
,
7
(11)
,
e1002247
.
Goldwyn
,
J. H.
,
Shea-Brown
,
E.
, &
Rubinstein
,
J. T.
(2010)
.
Encoding and decoding amplitude-modulated cochlear implant stimuli—a point process analysis
.
Journal of Computational Neuroscience
,
28
(3)
,
405
424
.
Goychuk
,
I.
(2014)
.
Stochastic modeling of excitable dynamics, improved Langevin model for mesoscopic channel noise
. In
Proceedings of the International Conference on Nonlinear Dynamics of Electronic Systems
(pp.
325
332
).
Berlin
:
Springer
.
Greenwood
,
P. E.
,
McDonnell
,
M. D.
, &
Ward
,
L. M.
(2015)
.
Dynamics of gamma bursts in local field potentials
.
Neural Computation
,
27
(1)
,
74
103
.
Grimmett
,
G.
, &
Stirzaker
,
D.
(
2005
).
Probability and random processes
(3rd ed.).
New York
:
Oxford University Press
.
Güler
,
M.
(2013a)
.
An investigation of the stochastic Hodgkin-Huxley models under noisy rate functions.
Neural Computation
,
25
(9)
,
2355
2372
.
Güler
,
M.
(2013b)
.
Stochastic Hodgkin-Huxley equations with colored noise terms in the conductances.
Neural Computation
,
25
(1)
,
46
74
.
Hill
,
T. L.
, &
Chen
,
Y.-D.
(1972)
.
On the theory of ion transport across the nerve membrane: IV. Noise from the open-close kinetics of K$+$ channels.
Biophysical Journal
,
12
(8)
,
948
959
.
Hodgkin
,
A. L.
, &
Huxley
,
A. F.
(1952)
.
A quantitative description of membrane current and its application to conduction and excitation in nerve
.
J. Physiol.
,
117
,
500
544
.
Huang
,
Y.
,
Rüdiger
,
S.
, &
Shuai
,
J.
(2013)
.
Channel-based Langevin approach for the stochastic Hodgkin-Huxley neuron
.
Physical Review E
,
87
(1)
,
012716
.
Huang
,
Y.
,
Rüdiger
,
S.
, &
Shuai
,
J.
(2015)
.
Accurate Langevin approaches to simulate Markovian channel dynamics
.
Physical Biology
,
12
(6)
,
061001
.
Imennov
,
N. S.
, &
Rubinstein
,
J. T.
(2009)
.
Stochastic population model for electrical stimulation of the auditory nerve
.
IEEE Transactions on Biomedical Engineering
,
56
(10)
,
2493
2501
.
Keener
,
J. P.
(2009)
.
Invariant manifold reductions for Markovian ion channel dynamics
.
J. Math. Biol.
,
58
(3)
,
447
457
.
Keener
,
J. P.
(2010)
.
Exact reductions of Markovian dynamics for ion channels with a single permissive state
.
J. Math. Biol.
,
60
(4)
,
473
479
.
Keener
,
J. P.
, &
Sneyd
,
J.
(1998)
.
Mathematical physiology
,
vol. 1
.
Berlin
:
Springer
.
Kispersky
,
T.
, &
White
,
J. A.
(2008)
.
Stochastic models of ion channel gating
.
Scholarpedia
,
3
(1)
,
1327
.
Kloeden
,
P. E.
, &
Platen
,
E.
(1999)
.
Numerical solution of stochastic differential equations
.
Berlin
:
Springer
.
Kolmogorov
,
A.
(
1933
).
Sulla determinazione empirica di una lgge di distribuzione
.
Inst. Ital. Attuari, Giorn.
,
4
,
83
91
.
Kurtz
,
T. G.
(1981)
.
Approximation of population processes
.
:
SIAM
.
Linaro
,
D.
,
Storace
,
M.
, &
Giugliano
,
M.
(2011)
.
Accurate and fast simulation of channel noise in conductance-based model neurons by diffusion approximation
.
PLOS Computational Biology
,
7
(3)
,
e1001102
.
Mainen
,
Z. F.
, &
Sejnowski
,
T. J.
(1995)
.
Reliability of spike timing in neocortical neurons
.
Science
,
268
(5216)
,
1503
1506
.
Mino
,
H.
,
Rubinstein
,
J. T.
, &
White
,
J. A.
(2002)
.
Comparison of algorithms for the simulation of action potentials with stochastic sodium channels
.
Ann. Biomed. Eng.
,
30
(4)
,
578
587
.
Moiseff
,
A.
, &
Konishi
,
M.
(1981)
.
The owl's interaural pathway is not involved in sound localization
.
Journal of Comparative Physiology
,
144
(3)
,
299
304
.
Neher
,
E.
, &
Sakmann
,
B.
(1976)
.
Single-channel currents recorded from membrane of denervated frog muscle fibres
.
Nature
,
260
(5554)
,
799
.
Newby
,
J. M.
,
Bressloff
,
P. C.
, &
Keener
,
J. P.
(2013)
.
Breakdown of fast-slow analysis in an excitable system with channel noise
.
Phys. Rev. Lett.
,
111
(12)
,
128101
.
O'
Donnell
,
C.
, &
van Rossum
,
M. C.
(2015)
.
Spontaneous action potentials and neural coding in unmyelinated axons
.
Neural Computation
,
27
(4)
,
801
818
.
Orio
,
P.
, &
Soudry
,
D.
(2012)
.
Simple, fast and accurate implementation of the diffusion approximation algorithm for stochastic ion channels with multiple states
.
PLOS One
,
7
(5)
,
e36670
.
Pezo
,
D.
,
Soudry
,
D.
, &
Orio
,
P.
(2014)
.
Diffusion approximation-based simulation of stochastic ion channels: Which method to use?
Frontiers in Computational Neuroscience
,
8
,
139
.
Schmandt
,
N. T.
, &
Galán
,
R. F.
(2012)
.
Stochastic-shielding approximation of Markov chains and its application to efficiently simulate random ion-channel gating.
Physical Review Letters
,
109
(11)
,
118101
.
Schmid
,
G.
,
Goychuk
,
I.
, &
Hänggi
,
P.
(2001)
.
Stochastic resonance as a collective property of ion channel assemblies.
Europhysics Letters
,
56
(1)
,
22
.
Schmidt
,
D. R.
,
Galán
,
R. F.
, &
Thomas
,
P. J.
(2018)
.
Stochastic shielding and edge importance for Markov chains with timescale separation
.
PLOS Computational Biology
,
14
(6)
,
e1006206
.
Schmidt
,
D. R.
, &
Thomas
,
P. J.
(2014)
.
Measuring edge importance: A quantitative analysis of the stochastic shielding approximation for random processes on graphs
.
Journal of Mathematical Neuroscience
,
4
(1)
,
6
.
Schneidman
,
E.
,
Freedman
,
B.
, &
Segev
,
I.
(1998)
.
Ion channel stochasticity may be critical in determining the reliability and precision of spike timing
.
Neural Computation
,
10
(7)
,
1679
1703
.
Sengupta
,
B.
,
Laughlin
,
S.
, &
Niven
,
J.
(2010)
.
Comparison of Langevin and Markov channel noise models for neuronal signal generation
.
Physical Review E
,
81
(1)
,
011918
.
Shorack
,
G. R.
, &
Wellner
,
J. A.
(2009)
.
Empirical processes with applications to statistics
.
:
SIAM
.
Skaugen
,
E.
, &
Walløe
,
L.
(1979)
.
Firing behaviour in a stochastic nerve membrane model based upon the Hodgkin-Huxley equations.
Acta Physiol. Scand.
,
107
(4)
,
343
363
.
Smirnov
,
N.
(1948)
.
Table for estimating the goodness of fit of empirical distributions
.
Annals of Mathematical Statistics
,
19
(2)
,
279
281
.
Strassberg
,
A. F.
, &
DeFelice
,
L. J.
(1993)
.
Limitations of the Hodgkin-Huxley formalism: Effects of single channel kinetics on transmembrane voltage dynamics
.
Neural Computation
,
5
(6)
,
843
855
.
Wasserstein
,
L. N.
(1969)
.
Markov processes over denumerable products of spaces describing large systems of automata
.
Problems of Information Transmission
,
5
(3)
,
47
52
.
White
,
J. A.
,
,
R.
,
Alonso
,
A.
, &
Kay
,
A. R.
(1998)
.
Noise from voltage-gated ion channels may influence neuronal dynamics in the entorhinal cortex
.
Journal of Neurophysiology
,
80
(1)
,
262
269
.
White
,
J. A.
,
Rubinstein
,
J. T.
, &
Kay
,
A. R.
(2000)
.
Channel noise in neurons
.
Trends in Neurosciences
,
23
(3)
,
131
137
.
Woo
,
J.
,
Miller
,
C. A.
, &
Abbas
,
P. J.
(2009)
.
Simulation of the electrically stimulated cochlear neuron: Modeling adaptation to trains of electric pulses
.
IEEE Transactions on Biomedical Engineering
,
56
(5)
,
1348
1359
.
Yu
,
H.
,
Dhingra
,
R. R.
,
Dick
,
T. E.
, &
Galán
,
R. F.
(2017)
.
Effects of ion channel noise on neural circuits: An application to the respiratory pattern generator to investigate breathing variability.
Journal of Neurophysiology
,
117
(1)
,
230
242
.