## Abstract

Spike trains with negative interspike interval (ISI) correlations, in which long/short ISIs are more likely followed by short/long ISIs, are common in many neurons. They can be described by stochastic models with a spike-triggered adaptation variable. We analyze a phenomenon in these models where such statistically dependent ISI sequences arise in tandem with quasi-statistically independent and identically distributed (quasi-IID) adaptation variable sequences. The sequences of adaptation states and resulting ISIs are linked by a nonlinear decorrelating transformation. We establish general conditions on a family of stochastic spiking models that guarantee this quasi-IID property and establish bounds on the resulting baseline ISI correlations. Inputs that elicit weak firing rate changes in samples with many spikes are known to be more detectable when negative ISI correlations are present because they reduce spike count variance; this defines a variance-reduced firing rate coding benchmark. We performed a Fisher information analysis on these adapting models exhibiting ISI correlations to show that a spike pattern code based on the quasi-IID property achieves the upper bound of detection performance, surpassing rate codes with the same mean rate (including the variance-reduced rate code benchmark) by 20% to 30%. The information loss in rate codes arises because the benefits of reduced spike count variance cannot compensate for the lower firing rate gain due to adaptation. Since adaptation states have similar dynamics to synaptic responses, the quasi-IID decorrelation transformation of the spike train is plausibly implemented by downstream neurons through matched postsynaptic kinetics. This provides an explanation for observed coding performance in sensory systems that cannot be accounted for by rate coding, for example, at the detection threshold where rate changes can be insignificant.

## 1 Introduction

How a neuron's spiking pattern is produced by its intrinsic dynamics and how the resulting spike patterns encode information are distinct yet connected central questions in neuroscience (Stevens & Zador, 1995; London, Roth, Beeren, Häusser, & Latham, 2010; Jacobs et al., 2009; Panzeri, Harvey, Piasini, Latham, & Fellin, 2017). Spike patterns arise from the confluent effects of synaptic input and intracellular firing dynamics that preserve some memory of past spike times. Point process models provide a means to assess these combined effects on both the descriptive statistics and coding properties of spiking patterns (Kass & Ventura, 2001; Keat, Reinagel, Reid, & Meister, 2001; Truccolo, Eden, Fellows, Donoghue, & Brown, 2005; Paninski, 2004; Citi, Ba, Brown, & Barbieri, 2014). Including a spike-triggered intracellular adaptation variable in a point process model is one such memory mechanism, by which past spike events induce negative feedback that reduces the probability of current spike emissions (Kass & Ventura, 2001; Truccolo et al., 2005; Lüdtke & Nelson, 2006; Muller, Buesing, Schemmel, & Meier, 2007; Nesse, Maler, & Longtin, 2010; Farkhooi, Muller, & Nawrot, 2011). This adaptation can produce interspike interval (ISI) temporal dependencies in which spikes emitted in short succession (i.e., short ISIs) are commonly followed by longer ISIs and vice versa. Such negative serial correlations are commonly observed in many cell types and organisms (Perkel, Gerstein, & Moore, 1967; Nawrot et al., 2007; Farkhooi, Strube-Bloss, & Nawrot, 2009; Berry & Meister, 1998). They reduce spike count variability and enhance the detection of rate changes relative to comparable renewal interval sequences (Chacron, Longtin, & Maler, 2001; Chacron, Lindner, & Longtin, 2004; Chacron, Maler, & Bastian, 2005; Lüdtke & Nelson, 2006; Nawrot et al., 2007; Maimon & Assad, 2009; Farkhooi et al., 2009, 2011; Avila-Akerberg & Chacron, 2011; Nikitin, Stocks, & Bulsara, 2012).

Here, novel properties of stochastic models with spike-triggered adaptation that produce temporal ISI correlations are of interest. A previous study by our group (Nesse et al., 2010) demonstrated that a specific hybrid model, involving both a hazard function and adaptation dynamics, can exhibit statistically dependent successive ISIs; but paradoxically, the spike-triggered adaptation states exhibit probabilistic quasi-independence with identical distribution (quasi-IID). That work established that quasi-IID adaptation states could be utilized in lieu of correlated ISI sequences to represent the spike pattern of a cell. Adaptation dynamics thus amount to a nonlinear decorrelation transformation of the ISI sequence. The results in Nesse et al. (2010) established this quasi-IID property for a specific stochastic spiking model and confirmed its existence for a conductance-based model as well, but a more general set of sufficient conditions guaranteeing the property was not established.

We specify sufficient conditions on a general class of stochastic spiking models that guarantee the quasi-IID property and give several examples. We also establish that such quasi-IID adaptation processes necessarily induce negative serial ISI correlations and establish bounds on that correlation. Furthermore, because spike-triggered adaptation processes are dynamically similar to that of synaptic transmission, such representations of the spike pattern may be plausibly communicated postsynaptically if there are matched pre- and postsynaptic kinetics (Pfister, Dayan, & Lengyel, 2010; Khanbabaie, Nesse, Longtin, & Maler, 2010; Avila-Akerberg & Chacron, 2011; Marcoux, Clarke, Nesse, Longtin, & Maler, 2016), and our letter addresses this possibility.

An important related question is whether this quasi-IID representation of the spike pattern could provide additional information about an input that is not available in firing rate statistics (Gussin, Benda, & Maler, 2007). As stated above, numerous studies have established that negative serial ISI correlations reduce the variance of firing rate, making firing rate changes more statistically discriminable relative to comparable renewal ISI spike trains (Chacron et al., 2001, 2004; Chacron, Maler, & Bastian, 2005; Lüdtke & Nelson, 2006; Nawrot et al., 2007; Maimon & Assad, 2009; Farkhooi et al., 2009, 2011; Avila-Akerberg & Chacron, 2011; Nikitin et al., 2012). This variance-reduction phenomenon defines the rate coding performance benchmark for spike trains with negative serial interval correlations. The quasi-IID representation, because it contains a complete representation of the spike train in an IID format, provides the means to rigorously assess the statistical discrimination performance of serially correlated ISI sequences in response to weak signal inputs.

Here we show that spike pattern coding that utilizes the quasi-IID adaptation states is statistically efficient for weak signal discrimination. Further, we prove an inequality between the Fisher information of each code showing that, when large numbers of spikes are utilized for this weak signal discrimination, the pattern code exceeds the previous coding benchmark. We then show that the spike pattern Fisher information exceeds the Fisher information for all rate codes, including variance-reduced rate codes, by 20% to 30% in this general class of adapting models. We find that this reduction in rate code Fisher information arises because, while temporal ISI correlations reduce firing rate variance, the negative feedback of the adaptation also reduces firing rate gain in response to input signal changes, offsetting some of the gain in information from the variance reduction. The additional information contained in the spike pattern, beyond that found in the firing rate, may account for the behavioral performance observed in some sensory systems where firing rate coding is insufficient to account for observed weak signal discrimination performance (Knudsen, 1974; Gussin et al., 2007; Jacobs et al., 2009; Jung, Longtin, & Maler, 2016; Fettiplace & Hackney, 2006; Heil & Peterson, 2015). Indeed this question is one of the main motivations of our study, along with the related question of what determines the extreme detection capabilities seen in various senses across species.

The letter is organized as follows. We begin by defining the general category of adapting point process models we study. We then establish model conditions that guarantee the quasi-IID property, stated as a theorem, and provide a proof. Three illustrative example models are then discussed. We next explore consequences that the theorem has for spike statistics and make a case for the biophysical plausibility of such a coding scheme. Finally, we prove that the pattern coding that arises from the quasi-IID property exceeds the performance of rate codes.

## 2 Adapting Point Process Models

Let $\lambda(H): \mathbb{R} \to [0,\infty)$ be a nonincreasing and Lipschitz continuous spike emission rate function. The adaptation function $H(\eta): \mathbb{R} \to \mathbb{R}$ controls the emission rate, where $\eta$ is a special “adaptation time” variable. Adaptation time $\eta$ increases in lockstep with “clock” time $t$ during intervals in which no spiking occurs (see below for more details). The probability of spiking in the time interval $[\eta, \eta + d\eta]$ is $\lambda(H(\eta))\,d\eta + O(d\eta^2)$. The adaptation function $H$ is differentiable and monotonic decreasing in the $\eta$ variable, making $H(\eta)$ invertible. Because $\lambda(H)$ is nonincreasing in $H$ and $H$ declines with adaptation time, $\lambda(H(\eta))$ is nondecreasing in $\eta$: a reasonably nonrestrictive assumption that simply expresses that spiking becomes increasingly likely (or stays the same) as adaptation $H$ declines.

Upon spiking, the postspike adaptation time $\eta$ depends on the prespike adaptation time through the composition function $f = H^{-1} \circ g \circ H: \mathbb{R} \to \mathbb{R}$, which is monotonic increasing, differentiable, and invertible; it represents the mapping of the adaptation time variable: $f(\eta^*) = \eta$. As we will show, this formulation of the time variable $\eta$ with discontinuous jumps at each spike is necessary for the statistical analysis of the point process generated by the stochastic spiking model. These $\eta$-discontinuities at spike events allow $h = H(\eta)$ and $\eta$ to be in one-to-one correspondence over every interspike interval, such that when $h$ is activated via the $g$-function, the $\eta$-variable is correspondingly reset via the $f$-function.

The emission function $\lambda$, adaptation $H$, and activation $g$ (or equivalently $f$) together constitute the adapting point process model: $\{\lambda, H, g\}$. Spikes are emitted stochastically, followed by a postspike-triggered activation of adaptation (with a corresponding postspike resetting of $\eta$), and then on to the next spike, forming a cycle of spike, activate, spike again, and so on. This cycle is illustrated in Figure 1A. Let $\eta_0^*$ be a spike time associated with the level of adaptation $H(\eta_0^*) = h_0^*$ just prior to the clock time at which the model generates a first “reference” spike. Then $\eta_0$ is the postspike reset adaptation time. The spike triggers an activation $\eta_0 = f(\eta_0^*)$, or, alternatively, $h_0 = g(h_0^*)$, which defines the red circuit. Eventually the next spike event occurs at $\eta_1^*$ (blue circuit) and then resets again to $\eta_1$, and so on, producing tandem sequences $\{\eta_0, \eta_1, \eta_2, \ldots\}$ and $\{h_0, h_1, h_2, \ldots\}$. In what follows we focus our presentation on the postspike-activated $h$-values and reset $\eta$-values (nonasterisked) in order to simplify the approach, bearing in mind that the prespike values are obtained through the inverse transformations: $\eta^* = f^{-1}(\eta)$ and $h^* = g^{-1}(h)$.
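To make the reset map concrete, the following minimal sketch assumes, purely for illustration, the exponential adaptation $H(\eta) = e^{-\eta}$ and the saturating activation $g(h) = \Lambda + (1 - \Lambda)h$ that appear later in section 4; it checks the defining identity $H \circ f = g \circ H$ and the basic monotonicity of $f$:

```python
import math

# Illustrative components of an adapting point process model {lambda, H, g}.
# H(eta) = exp(-eta) and the saturating g below are assumptions for this sketch.

def H(eta):                      # adaptation: monotonic decreasing in eta
    return math.exp(-eta)

def H_inv(h):                    # inverse of H
    return -math.log(h)

def g(h, Lam=0.2):               # spike-triggered activation (saturating form)
    return Lam + (1.0 - Lam) * h

def f(eta_star, Lam=0.2):        # postspike reset map: f = H^{-1} o g o H
    return H_inv(g(H(eta_star), Lam))
```

Because $g$ increases $h$ and $H^{-1}$ is decreasing, $f(\eta^*) < \eta^*$: the reset always pulls the adaptation time back below the spike time, producing the jump discontinuities described above.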

## 3 Quasi-Independence of Adaptation Processes

The objective of this section is to establish the conditions for which $R(\eta_{n+1}|\eta_n) \sim R(\eta_{n+1})$ and $Q(h_{n+1}|h_n) \sim Q(h_{n+1})$, that is, that both $\eta$ and $h$ are independent, or quasi-independent, of the previous value, amounting to an effectively trivial Markov process. This implies that the asymptotic distributions are obtained from a single application of the Markov operator: $r_\infty(\eta) \sim R(\eta)$ and $q_\infty(h) \sim Q(h)$.

Stated simply, an effectively trivial Markov process is guaranteed if the activation $g \circ H(\eta_n^*) = h_n$ is always large enough, and in turn $\eta_n$ goes sufficiently below a threshold (see the vertical dashed line in Figures 1A and 1B) that the emission function $\lambda$ is minuscule for some time after a spike. In other words, the adaptation causes a significant relative refractory period. Consider the following theorem and proof:

**Theorem 1** (the Quasi-IID Adaptation Property). Suppose that for a given tolerance $0 \le \epsilon \ll 1$ there exists a particular $\eta_\epsilon$ such that $K(\eta|\eta_\epsilon) = \int_{\eta}^{\eta_\epsilon} \lambda(H(\xi))\,d\xi < \epsilon$ for all $\eta \le \eta_\epsilon$. Further suppose that $f(\eta) \le \eta_\epsilon$ for all $\eta$, and $\eta_0 < \eta_\epsilon$. Then the sequences $\{\eta_n\}$ and $\{h_n\}$ are quasi-IID to within tolerance $\epsilon$.

The proof of Theorem 1 is found in the appendix.

Theorem 1 implies that for sufficiently small $\epsilon$, $P(\eta) \approx P(\eta|\eta_0)$. We will use equality ($=$) in lieu of approximation ($\approx$) henceforth and define $P(\eta) \equiv P(\eta|\eta_0)$ for notational simplicity. We also regard the quasi-IID variables as IID in subsequent analyses. Also note that $\eta \le \eta_\epsilon$ and $h \ge h_\epsilon$ define the upper and lower domain boundaries for the respective variables. We use $K$ as the potential function in both the $\eta$ and $h$ variables and define the distribution $p_\infty(\eta) = K' e^{-K}$, where the prime means differentiation with respect to the relevant variable; the analogous postspike activated form $r_\infty$ is similarly defined through a change of variables. On the boundaries of the respective domains of $h$ or $\eta$, the potential function has the limits $K \to 0$ and $K \to \infty$, so that the probability function $P = e^{-K} \in [0,1]$, as required.

Theorem 1 yields both the adaptation states $h_n$ and adaptation times $\eta_n$ as quasi-independent random variable sequences, with distributions $q(h)$ and $r(\eta)$, respectively. Note that by using the activation functions, the ISIs are expressible in terms of the adaptation states or adaptation times:

Equation 3.3 defines a nonlinear transformation that expresses ISI sequences in terms of quasi-IID adaptation states $h_n$ or quasi-IID adaptation times $\eta_n$.

Only subsequent ISIs are statistically dependent, and they are negatively correlated, as we now show. Recall from equation 3.3 or 2.3 that each ISI is expressed as a difference of functions of two IID variables, $h_n$ and $h_{n+1}$, so that subsequent ISIs share a common variable and are negatively correlated, as is common in many neuron types (Farkhooi et al., 2009; Nawrot et al., 2007). We state this fact as a corollary:

If a point process model satisfies the quasi-IID adaptation property, then the ISI sequence is serially correlated, meaning that subsequent ISIs are statistically dependent; however, nonsubsequent ISIs $\Delta t_n$ and $\Delta t_k$, where $k \ne n \pm 1$, are quasi-IID and therefore uncorrelated.

The verification is simple: serial correlation results because any subsequent pair of ISIs $\Delta t_n$ and $\Delta t_{n+1}$ depends on a common adaptation variable, namely, $h_{n+1}$; nonsubsequent ISI pairs are functions of distinct, and thus independent, adaptation variables.
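The corollary can be illustrated numerically by building ISIs directly as a difference of functions of IID adaptation times, $\Delta t_n = f^{-1}(\eta_{n+1}) - \eta_n$. The uniform $\eta$-distribution and the particular reset map below (from an exponential-adaptation, saturating-activation model) are assumptions chosen purely for illustration:

```python
import math
import random

def f_inv(eta, Lam=0.2):
    # inverse reset map for H(eta) = exp(-eta), g(h) = Lam + (1 - Lam) * h
    return -math.log((math.exp(-eta) - Lam) / (1.0 - Lam))

def serial_corr(x, lag):
    # Pearson correlation between x_n and x_{n+lag}
    n = len(x) - lag
    mu = sum(x) / len(x)
    var = sum((v - mu) ** 2 for v in x) / len(x)
    cov = sum((x[i] - mu) * (x[i + lag] - mu) for i in range(n)) / n
    return cov / var

random.seed(1)
eta = [random.uniform(0.9, 1.2) for _ in range(100_001)]       # IID eta_n
isi = [f_inv(eta[n + 1]) - eta[n] for n in range(100_000)]     # ISI sequence

rho1 = serial_corr(isi, 1)   # negative: subsequent ISIs share eta_{n+1}
rho2 = serial_corr(isi, 2)   # ~zero: nonsubsequent ISIs are independent
```

Because $f^{-1}$ is increasing, subsequent ISIs covary negatively through their shared $\eta_{n+1}$, while lag-2 correlations vanish, exactly as the corollary states.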

Another important question is the allowable correlation range for subsequent ISIs. Let $\Delta t_n$ and $\Delta t_{n+1}$ be subsequent ISIs. The correlation range is as follows.

A short proof is given in the appendix.

In sum, this section has established that equation 3.3 constitutes a nonlinear decorrelation transformation between ISI sequences and adaptation states and times. In section 6 we will use this decomposition of ISIs into IID variables to analyze spike train coding performance. But first, we illustrate some examples of adapting point process models in the following section.

## 4 Examples of Quasi-IID Adapting Point Process Models and Stochastic Simulations

The following three examples progress from simple and minimally tunable to more physiologically realistic and tunable. The first example was chosen because its distributions can all be computed in terms of elementary functions, which facilitates deducing its behavior directly from the equations. The other two examples involve nonelementary functions but are easily depicted graphically.

For each example, we evaluate analytically derived distributions and compare them to distributions from Monte Carlo simulations. Those simulations involved incrementing by very small time steps $\Delta\eta$, computing the probability of spiking in each time step to first order, $\lambda(H(\eta))\Delta\eta$, and comparing it to a random number generator draw. If a spike was determined to occur, $\eta$ and $H$ were activated accordingly. Many thousands of spikes were recorded to generate empirical histograms for ISIs and $h$-data. Distributions of $\eta$ and $\eta^*$ are not shown because they can be obtained from $h$ through variable changes.

Importantly, Monte Carlo simulations make no assumptions on the statistics of the $h_n$-data, particularly the quasi-IID property. Therefore, good agreement between the empirical distribution of $h$-data and the presumptive IID asymptotic distribution $q_\infty(h)$ means that the quasi-IID property is a valid description of the true $h_n$-process for the $\epsilon$-values chosen.
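As a concrete sketch of this simulation recipe, the following minimal Python loop implements the example 3 model of section 4.3 (the specific parameter values are illustrative assumptions, close to the representative values used later) and checks the lag-1 serial correlation of the recorded $h_n$-data:

```python
import math
import random

# Monte Carlo sketch following the recipe in the text: step forward in small
# increments, spike with probability lambda(H)*dt, activate H at each spike.
# Parameters alpha, beta, tau, Lam are illustrative assumptions.

random.seed(0)
alpha, beta, tau, Lam = 1.0, 40.0, 20.0, 0.2
dt = 0.05
decay = math.exp(-dt / tau)            # exact per-step exponential decay of H

H = 1.0
h_seq = []                             # postspike adaptation states h_n
while len(h_seq) < 2000:
    H *= decay
    if random.random() < alpha * math.exp(-beta * H) * dt:
        H += Lam * (1.0 - H)           # spike-triggered activation g(H)
        h_seq.append(H)

# quasi-IID check: lag-1 serial correlation of h_n should be near zero
mu = sum(h_seq) / len(h_seq)
var = sum((h - mu) ** 2 for h in h_seq) / len(h_seq)
rho1 = sum((h_seq[i] - mu) * (h_seq[i + 1] - mu)
           for i in range(len(h_seq) - 1)) / (len(h_seq) - 1) / var
```

A near-zero `rho1`, obtained without assuming anything about the $h_n$ statistics, is the kind of empirical evidence for the quasi-IID property described above.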

### 4.1 Example 1: Threshold Spike Emission Function, Exponential Adaptation, and Additive Activation

### 4.2 Example 2: Exponential Emission Function, Exponential Adaptation, and Additive Activation

Theorem 1's strong activation hypothesis. In this case, $A = -\ln(\epsilon)$ makes it so. This $A$ is the minimum sufficient activation, which occurs at time $\eta_\epsilon$, so $A = H(\eta_\epsilon) = e^{-\eta_\epsilon}$ implies $\eta_\epsilon = -\ln(-\ln(\epsilon))$ and $h_\epsilon = -\ln(\epsilon)$. The potential function is

### 4.3 Example 3: Exponential Emission, Exponential Adaptation, with Saturating Activation

Let $\lambda(H) = \alpha e^{-\beta H}$ and $H(\eta) = e^{-\eta/\tau}$ be similar to example 2 but incorporating tunable parameters $\alpha$, $\beta$, and $\tau$. This model was explored to a more limited extent in previous work by our group (Nesse et al., 2010). The parameter $\beta$ sets the strength of the $H$-dependency of the emission rate, $\alpha$ sets the overall spike emission rate, and $\tau$ sets the characteristic timescale of adaptation decay. The activation function imposes an upper bound on the adaptation state, toward which it saturates at very high firing rates: $g(H) = \Lambda(1 - H) + H = \Lambda + (1 - \Lambda)H$, where $\Lambda \in (0,1)$. The saturation dictates that $g(H) \in (\Lambda, 1)$ for all $\eta$, since $H(\eta) = h \in (0,1)$. The parameter $\Lambda$ is the minimum additive activation.

This saturating activation model comprising $g(H)$ and $H(t)$ can be derived from the aggregate current of many stochastically activating and inactivating binary-state ion channels. Let the $i$th channel open/closed state be a binary stochastic process $X_i \in \{0, 1\}$, for $i = 1, 2, \ldots, M$, with $M$ large. Let $H = \frac{1}{M}\sum_i X_i \in [0,1]$ be the ensemble average. Each state $X_i$ undergoes fast state transitions. During the spike event, a fraction $\Lambda \in (0,1)$ of the zero-states $X_i = 0$ will transition to $X_i = 1$. Then $H$ is the fraction already activated and $1 - H$ is the fraction not activated prior to a spike. Upon spiking, the transition $H \to H + \Lambda(1 - H)$ occurs, which is the activation function we have defined already. In between the spike-triggered state transitions, there is a constant rate of transition back to the zero-state ($X_i \to 0$) with timescale $\tau$, according to $\tau \frac{dH}{dt} = -H$, resulting in a continuous exponential decay of $H$ in the mean field (for very large $M$).
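The channel-ensemble picture can be checked directly by simulation. The sketch below (the ensemble size $M$, the decay interval $T$, and the parameter values are illustrative assumptions) applies one spike-triggered transition and one interval of decay to $M$ binary channels and compares the ensemble mean to the mean-field rules:

```python
import math
import random

# M binary channel states X_i: at a spike each zero-state opens with
# probability Lam; between spikes each open state survives a time T with
# probability exp(-T/tau). For large M the ensemble mean H should follow
# H -> H + Lam*(1 - H) at a spike and decay as exp(-T/tau) in between.

random.seed(2)
M, Lam, tau = 100_000, 0.2, 20.0
X = [0] * M
H0 = sum(X) / M                        # ensemble average before the spike

# spike event: each zero-state flips to 1 independently with probability Lam
X = [1 if (x == 0 and random.random() < Lam) else x for x in X]
H1 = sum(X) / M                        # ~ H0 + Lam * (1 - H0)

# interspike decay over time T: each one-state survives w.p. exp(-T/tau)
T = 10.0
p_stay = math.exp(-T / tau)
X = [x if (x == 1 and random.random() < p_stay) else 0 for x in X]
H2 = sum(X) / M                        # ~ H1 * exp(-T/tau)
```

For $M = 10^5$ the sampled jump and decay agree with the mean-field activation function and exponential decay to within sampling error.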

The model satisfies Theorem 1 as follows. The parameter $\Lambda$ anchors our analysis. Let $\Lambda = h_\epsilon = e^{-\eta_\epsilon/\tau}$, which is inverted to get $\eta_\epsilon = -\tau\ln(\Lambda)$. For $\eta_0 < \eta_\epsilon$, we have

The inequality in equation 4.10 guarantees the hypotheses of Theorem 1. Recall that $\beta$ is the coupling parameter between adaptation and emission rate: a larger $\beta$ makes the adaptation $H$ more effective at reducing the emission rate. Of course, one can fix any three parameters, and equation 4.10 then constrains the fourth. For example, given fixed $\Lambda$, $\tau$, and $\alpha$, the inequality in equation 4.10 requires that $\beta$ be sufficiently large. Alternatively, $\Lambda$ must be sufficiently large and the product $\alpha\tau$ sufficiently small, all other things being constant. In the appendix, we perform an extensive parameter scan of the example 3 model.

In this letter, we focus entirely on parameter regimes in which $\epsilon$ is suitably small, so that the point process models are in an effectively IID parameter regime. Consequently, in what follows, we drop the “$\infty$” on the asymptotic distributions ($q_\infty \to q$) and treat the quasi-IID process as IID. To understand more about the statistical properties of the point process models over a broader range of parameters, in which the quasi-IID property is lost, see the prior study by our group (Nesse et al., 2010).

## 5 Biological Plausibility of Encoding ISI Sequences by Adaptation States: Postsynaptic Proxies

The nonlinear transformation from the IID sequence of adaptation states, or equivalently adaptation times, to correlated ISIs is an interesting operation that simplifies data analysis. But is such a representation of the data biophysically realized and plausibly transmitted synaptically? Postsynaptic coding schemes have been proposed in which presynaptic dynamics are transmitted through kinetics matched to postsynaptic dynamics (Pfister et al., 2010; Avila-Akerberg & Chacron, 2011; Lüdtke & Nelson, 2006; Marcoux et al., 2016). In our example models presented in section 4, we have utilized an instantaneously activating and exponentially decaying adaptation variable whose spike-activated dynamics are similar to synaptic dynamics.

In example 3, we derived a model for adaptation based on the average of independent stochastic processes $X_i \in \{0,1\}$, for $i = 1, 2, \ldots, M$ with $M$ large, that undergo spike-activating and ISI-inactivating state transitions. Note that we can similarly define a postsynaptic proxy $S$, consisting of an average of a distinct set of independent state variables $Y_i \in \{0,1\}$, for $i = 1, 2, \ldots, M$ (large $M$), so that $S = \frac{1}{M}\sum_i Y_i$. Here, $Y_i$ could define the conductance states of ion channels or bound-unbound states of postsynaptic receptors.

The above arguments show that standard models of synaptic dynamics are equivalent to how we have defined presynaptic adaptation dynamics. Provided kinetics are matched between adaptation and the postsynaptic processes, any information in the adaptation states is potentially representable and transmissible postsynaptically, as depicted schematically in Figure 3.
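The matched-kinetics idea can be sketched in a few lines: a postsynaptic variable $S$ with the same jump rule and decay constant as the presynaptic adaptation $H$, driven by the same spike train, reproduces the $h_n$ sequence exactly in the mean field. The spike times and parameters below are hypothetical:

```python
import math

# Presynaptic adaptation H and postsynaptic proxy S share the activation
# fraction Lam and decay timescale tau (matched kinetics); both are driven
# by the same (hypothetical) spike train.

Lam, tau = 0.2, 20.0
spike_times = [0.0, 12.5, 41.0, 55.3, 90.2]

H = S = 0.0
t_prev = 0.0
h_seq, s_seq = [], []
for t in spike_times:
    H *= math.exp(-(t - t_prev) / tau)   # presynaptic decay between spikes
    S *= math.exp(-(t - t_prev) / tau)   # matched postsynaptic decay
    H += Lam * (1.0 - H)                 # spike-triggered activation
    S += Lam * (1.0 - S)                 # matched postsynaptic activation
    h_seq.append(H)
    s_seq.append(S)
    t_prev = t
```

With matched parameters, $S$ carries exactly the information in the adaptation-state sequence $h_n$; a mismatch in $\Lambda$ or $\tau$ would degrade this correspondence.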

## 6 Weak Signal Detection Performance of Spike Pattern and Rate Codes Using Fisher Information

In this section, we show that the spike pattern code exhibits the best possible detection performance for weak signal perturbations, exceeding rate coding for the adapting point process model.

Consider a $\lambda$-rate model that contains a multiplicative excitability parameter $\alpha > 0$. Example 3 possesses this parameter ($\lambda = \alpha e^{-\beta H}$), but any model can be modified to include it (e.g., by writing $\lambda_\alpha(H) = \alpha\lambda(H)$). The parameter $\alpha$ is similar to parameters used in other studies: $\lambda_1$ in Kass and Ventura (2001) and the $q$ function in Berry and Meister (1998). The parameter $\alpha$ sets the baseline probability of spiking and can be interpreted as an input rate parameter.

Note that it is sometimes more convenient to work with the inverse rate, or timescale, parameter $\theta \equiv 1/\alpha$. We will use these parameters interchangeably depending on which is more convenient. We define a weak signal by a perturbation $\alpha + \Delta\alpha$ or $\theta + \Delta\theta$, where $\alpha$ and $\theta$ are considered baseline levels.

Statistical discrimination between baseline and perturbed excitability parameters is performed by analyzing the Fisher information per spike for the spike pattern and rate models, based on a sample of $N$ spikes. We define the spike pattern code by the IID sequence $\{h_n\}_{n=1}^{N}$ and its associated probability distribution $q(h)$, whereas spike rate coding is based on the distribution of the sample mean ISI: $T_N = \frac{1}{N}\sum_{n=1}^{N} \Delta t_n$. Note that $T_N$ is in units of time, so it is not itself a rate but rather an inverse rate; however, we call $T_N$ a rate statistic for convenience.

Note that the baseline parameter $\alpha$ must not be excessively large in order to guarantee the $\epsilon$-criterion for refractoriness that permits the quasi-IID property. Hence, there is a restricted range in which our analysis is valid, which we term the physiological range, given by an upper bound $\alpha_{UB}$. Conversely, too small an $\alpha$-value, while it satisfies Theorem 1, is also not very interesting because the resulting $h$-sequence is nearly constant, so the quasi-IID property is trivially satisfied while the ISI sequence is an effectively uncorrelated renewal process. Therefore we define a lower bound $\alpha_{LB}$ that excludes these trivial cases as well. The appendix rigorously defines $\alpha_{UB}$ and $\alpha_{LB}$ in terms of parameters of the example 3 model.

We compare the pattern code with two versions of a rate-based code. In each code, we assess discrimination performance on a per spike basis, which is reasonable because each code originates from the same model and so the firing rate is equal for each code. The first rate code, which we simply refer to as *the* rate code, is based on the $T_N$ statistic and takes into account the serial ISI correlations: the short-long-short ISI sequences result in a reduced variance of $T_N$ relative to the case where the ISIs form a renewal process (Chacron et al., 2001; Nawrot et al., 2007). The second rate code assumes, wrongly, that the ISI sequence originates from a renewal process, and so does not benefit from the variance reduction.

### 6.1 The Pattern Code

### 6.2 The Rate Code

The distribution of $Z(h)$ can be derived through a change of variables from $h$, and so it has the same Fisher information as the pattern code. However, rate code inferences are not made from isolated $Z_n$-variables or isolated $\Delta t_n$, but from the large-$N$ sample mean $T_N$ (equation 6.5) of the $Z_n$-variables. The distribution of $T_N$ is asymptotically normal for large-$N$ samples by the central limit theorem.

The mean and variance of $Z$, given as functions of $\theta$ (but which could also be given in terms of $\alpha$), can be related to the mean and standard deviation of the ISI statistics and allow a useful analysis of the spike train. We present the statistics of $Z$ as a critical lemma that enables us to analyze large-$N$ ISI averages (the inverse rate statistic):

Note that for large $N$, the mean ISI random variable, equation 6.5, approaches normality, where equations 6.7 and 6.9 define the mean and variance. Importantly, the variance of $Z$, equation 6.9, represents the variance reduction phenomenon of negative serially correlated ISI sequences.
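The variance-reduction phenomenon can be illustrated numerically by comparing the variance of the sample mean $T_N$ for a correlated ISI sequence against a renewal surrogate obtained by shuffling the same ISIs. The $\eta$-distribution and reset map below are illustrative assumptions, not the fitted model of the text:

```python
import math
import random

# ISIs built from IID adaptation times: Delta t_n = f^{-1}(eta_{n+1}) - eta_n.
# Shuffling destroys the serial correlations but preserves the ISI marginal,
# so the shuffled sequence is a renewal surrogate with the same mean rate.

def f_inv(eta, Lam=0.2):
    return -math.log((math.exp(-eta) - Lam) / (1.0 - Lam))

def var_of_block_means(x, N=100):
    # empirical variance of T_N over nonoverlapping blocks of N ISIs
    means = [sum(x[i:i + N]) / N for i in range(0, len(x) - N + 1, N)]
    mu = sum(means) / len(means)
    return sum((m - mu) ** 2 for m in means) / len(means)

random.seed(3)
eta = [random.uniform(0.9, 1.2) for _ in range(100_001)]
isi = [f_inv(eta[n + 1]) - eta[n] for n in range(100_000)]
shuffled = isi[:]
random.shuffle(shuffled)                     # removes serial correlations only

v_correlated = var_of_block_means(isi)       # Var(T_N) with correlations
v_renewal = var_of_block_means(shuffled)     # Var(T_N), renewal surrogate
```

The negative lag-1 correlations make `v_correlated` substantially smaller than `v_renewal`, which is the variance reduction that equation 6.9 captures.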

Figure 4A shows, for the example 3 model, the ISI mean ($\mu_Z$) and standard deviation ($\sigma_Z$) as functions of $\alpha$, with $\alpha_{LB}$ and $\alpha_{UB}$ marked on the graph delimiting the physiological range. There are significant negative serial correlations, $\rho_{\Delta t}(1) < 0$, within the physiological range, concomitant with a coefficient of ISI variation $\sigma_Z/\mu_Z$ substantially less than unity (see Figure 4B), which defines non-Poisson, negatively serially correlated spiking.

The asymptotic normality of $T_N$, together with lemma 4, yields the rate code Fisher information:

We note that the rate code Fisher information in equation 6.12 captures the variance-reduction effect of serially correlated ISI sequences. That is, $\sigma_Z^2(\theta)$ in equation 6.12 is smaller than it would be if the ISI sequence were a renewal process, making the ratio in equation 6.12 larger than it would be otherwise (see below for a detailed analysis of this point). Despite this variance-reduction effect, the Fisher information, equation 6.12, is still bounded above by the pattern code Fisher information, equation 6.3, due to the gain change reflected in the numerator, as we rigorously establish in the following theorem.

**Theorem 5.** The statistical efficiency $E_{\text{rate vs. pattern}}$, equation 6.13, is bounded by unity, provided the adaptation process $h_n$ is IID and $N$ is large.

A proof of Theorem 5 is given in the appendix.

### 6.3 The Renewal Rate Code

We performed a numerical investigation to determine the magnitude of the aforementioned inequalities in Theorem 5 and Corollary 2. Figure 4C shows that the ordering of efficiencies over the physiological range is $E_{\text{renewal vs. pattern}} < E_{\text{renewal vs. rate}} < E_{\text{rate vs. pattern}}$ using representative parameters $\beta = 40$, $\tau = 20$, and $\Lambda = 0.2$. The rate code outperforms the renewal rate code, consistent with numerous prior studies (Ratnam & Nelson, 2000; Chacron et al., 2001, 2005; Goense & Ratnam, 2003; Nikitin et al., 2012), and the pattern code outperforms both rate codes. The pattern code performance premium over any rate code is roughly 20% to 30% across the physiological range. Note also that this 20% to 30% factor holds over a wide range of the parameters $\alpha$, $\beta$, $\tau$, and $\Lambda$, as detailed in the appendix.

What accounts for the diminished Fisher information of the rate code? In terms of statistical estimation, the simple answer is that the pattern code, interpreted through the IID $h_n$-sequence data and distribution $q$, can form a sufficient statistic for $\theta$; the rate code utilizing $T_N$ cannot. However, the diminished efficiency can also be understood in terms of physiology. The variance-reducing effect of negative serial ISI correlations on the spike rate is a direct result of the adaptation $h_n$ that defines the spike pattern code. At first inspection, it would seem that the coding performance of the rate code should be commensurate with the spike pattern code. However, the adaptation also reduces spike emission. So the adaptation that produces the variance reduction in the denominator of equation 6.12 also reduces the firing rate gain $d\mu_Z/d\theta$ in the numerator of equation 6.12, thereby offsetting at least some of the benefit of variance reduction.

Figure 4D depicts this interaction between firing rate and firing rate variability with the curve $(\mu_Z, \sigma_Z)$ parameterized by $\alpha$ (the blue line, with black squares marking the physiological range). The curve goes from upper right to lower left for increasing $\alpha$-values. We used $\beta = 40$, $\tau = 20$, and $\Lambda = 0.2$ for this example. Any curve that bends further toward the lower-left corner, where the mean changes more for a given change in variance, will necessarily yield a more discriminable rate code.

## 7 Discussion

Spike-triggered adaptation produces the kind of serial ISI correlations present in many neuron types (Nawrot et al., 2007; Farkhooi et al., 2009) and can be captured by point process models incorporating adaptation (Muller et al., 2007; Nesse et al., 2010; Farkhooi et al., 2011). In broad classes of these models, the IID sequence of random variables $h_n$ representing adaptation levels serves as a decorrelating transformation of the correlated ISI sequence. This IID representation requires that adaptation be strong enough to produce a refractory period, a natural constraint for neuron models.

The quasi-IID property also requires that the spike emission function $\lambda(\cdot)$ be a monotonically decreasing function of the postspike adaptation $H(t)$. Furthermore, the adaptation must monotonically increase following a spike. Such requirements on adaptation are entirely plausible, and postspike exponential decay of adaptation is utilized in some point process models (Keat et al., 2001). However, monotonicity is not strictly observed in practice. Kass and Ventura (2001) and Truccolo et al. (2005) fitted point process models and obtained empirically estimated postspike probabilities that were nonmonotonic: the spike probability increased from zero to a peak and then declined modestly to a steady state. In both cases, however, the typical ISI fell mostly within the monotonic zone. Similarly, in a linear response approximation of the Hodgkin-Huxley model, the fitted kernels exhibit near-zero influence of inputs immediately postspike, followed by a rebound peak (Kistler, Gerstner, & Hemmen, 1997).

Numerous studies have established that negative serial ISI correlations reduce spike count variance, thereby making a given firing rate change more discriminable (Chacron et al., 2001, 2004, 2005; Lüdtke & Nelson, 2006; Nawrot et al., 2007; Maimon & Assad, 2009; Farkhooi et al., 2009, 2011; Avila-Akerberg & Chacron, 2011; Nikitin et al., 2012). This long-standing finding represented a performance benchmark for negatively serially correlated spike trains. Our results establish that there is a level of coding performance beyond this benchmark. Firing rate coding, whether from correlated ISIs or renewal ISIs, is less informative than spike pattern coding as represented by the adaptation states $h_n$ for the detection of weak input changes. This is because adaptation, while it can reduce spike count variance, also reduces firing rate gain, thereby offsetting some of the performance benefit. The stronger performance of the spike pattern code relative to rate codes arises because the firing rate is not a sufficient statistic for adapting cell excitability, whereas the pattern code based on the $h_n$ adaptation states is equipped with one.

Our work has focused exclusively on negative ISI correlations because they have been studied in depth both experimentally and theoretically. However, it is known that ISI correlations come in various forms, such as a mixture of positive and negative correlations at different lags, and even oscillatory ISI correlations. The precise makeup of correlations determines spike train statistics such as the Fano factor over asymptotically large counting windows. Recent theoretical work (Schwalger & Lindner, 2013; Shiau, Schwalger, & Lindner, 2015) and modeling studies (Fuwape & Neiman, 2008) have explored the dynamical origin of these correlations and their impact on information transfer, concluding that the nonrenewal property of the ISI sequence can often enhance information transfer. It remains to be seen how our results on the quasi-IID properties of large classes of adaptation processes can provide further insight into the single cell and cellular network coding abilities of those dynamical processes. This poses a serious challenge, given the more complex putative dynamics responsible for these richer ISI correlations.

It is noteworthy that the inverse-square relationship $1/\alpha^2$ of the spike pattern Fisher information is identical to that of a nonadapting, nonrefractory, renewal Poisson point process with $\lambda = \alpha$, owing to the fact that both are in the exponential family of distributions. With Poisson spiking, the information available to discriminate inputs is reflected purely by varying spike rates, with no upper bound on the rate or spike count and none of the refractory periods that characterize real neurons. In the adapting model, this same information is reflected in the adaptation states and requires smaller spike rate variation. Neuron dynamics permit only a limited dynamic range of spike rate variation. The adaptation-based scheme therefore permits maximum coding performance by neurons operating under physiological constraints on their dynamic range.
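For the Poisson case, the inverse-square scaling follows in one line: with rate $\lambda = \alpha$, each ISI $t$ is exponentially distributed with log likelihood $\ell(\alpha) = \ln \alpha - \alpha t$, so the per-interval Fisher information is

$$I(\alpha) = \mathbb{E}\!\left[\left(\frac{\partial \ell}{\partial \alpha}\right)^{\!2}\right] = \mathbb{E}\!\left[\left(\frac{1}{\alpha} - t\right)^{\!2}\right] = \operatorname{Var}(t) = \frac{1}{\alpha^2},$$

using $\mathbb{E}[t] = 1/\alpha$ and $\operatorname{Var}(t) = 1/\alpha^2$ for the exponential distribution.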

We can also speculate how such neural spike pattern representations would need to be decoded. Decoding the firing rate necessarily involves some physiological proxy of a time-windowed spike count. In contrast, decoding spike patterns is achieved by assessing the amplitudes of the IID postspike adaptation states $h_n$. Because the kinetics of spike-triggered adaptation are much like those of synaptic neurotransmission, there could be a corresponding quasi-IID postsynaptic proxy of each $h_n$ amplitude, provided the kinetics are matched (see Figure 3). A similar approach of representing ISI patterns by synaptic dynamics with appropriately tuned kinetics to infer presynaptic states has been proposed (Pfister et al., 2010; Lüdtke & Nelson, 2006). Furthermore, a recent experimental study established that postsynaptic potential peaks, akin to the $h_n$-sequence of adaptation states, are decorrelated when the driving presynaptic ISI structure exhibits negative serial correlations, but only if the synaptic kinetics of the postsynaptic potentials, which involve multiple neurotransmitter systems, were not pharmacologically perturbed (Marcoux et al., 2016). This result suggests that the theoretical findings in this letter may be applied to the decoding of neuron data. Moreover, such pattern-based discrimination may be necessary to account for performance in neural systems where rate coding involving spike patterns with negative serial ISI correlations is demonstrably insufficient to explain behavioral performance, especially near the threshold of sensory detection (Knudsen, 1974; Gussin et al., 2007; Jacobs et al., 2009; Jung et al., 2016; Fettiplace & Hackney, 2006; Heil & Peterson, 2015).
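The matched-kinetics idea can be sketched as follows: if a postsynaptic trace decays with the same time constant as the presynaptic adaptation and is incremented at each spike, its peaks read out at spike times reproduce the $h_n$ sequence exactly. The spike times, shared time constant, and unit increments below are hypothetical.

```python
import numpy as np

tau = 20.0                                        # assumed shared time constant
spikes = np.array([5.0, 12.0, 40.0, 47.0, 80.0])  # hypothetical spike times

# Presynaptic adaptation states via the post-spike recursion
# h_n = h_{n-1} * exp(-ISI_n / tau) + 1.
h = []
for i, t in enumerate(spikes):
    prev = h[-1] * np.exp(-(t - spikes[i - 1]) / tau) if i else 0.0
    h.append(prev + 1.0)

# Postsynaptic proxy: a dense-time exponential synaptic trace with matched
# kinetics, incremented by each presynaptic spike and sampled at spike times.
dt = 0.001
decay = np.exp(-dt / tau)
spike_idx = np.round(spikes / dt).astype(int)
spike_set = set(spike_idx.tolist())
trace = np.zeros(spike_idx[-1] + 1)
s = 0.0
for k in range(len(trace)):
    s *= decay
    if k in spike_set:
        s += 1.0
    trace[k] = s
readout = trace[spike_idx]

print(np.max(np.abs(readout - np.array(h))))  # ~0: matched kinetics recover h_n
```

Replacing the trace time constant with a mismatched value breaks this identity, consistent with the kinetics-matching requirement discussed above.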

Finally, an interesting avenue for future work is to understand more fully the encoding of temporal signals in the presence of quasi-IID adaptation states and negatively correlated ISIs, going beyond the simple step increases in stimulus amplitude considered here. Previous work (Ratnam & Nelson, 2000; Chacron et al., 2001, 2004) comparing models with and without these negative ISI correlations reveals a noise-shaping effect: the baseline power spectral density of the spontaneous spike trains is decreased at lower frequencies. This enhances signal detection and encoding, since the signal-to-noise ratio increases in that range. The effect is related to the regularization of spike counts by the anticorrelation discussed here (Chacron et al., 2001; Nesse et al., 2010). However, the decrease of the noise floor at lower frequencies is accompanied by an increase at higher frequencies. Consequently, the coding benefits gained at low frequencies lead to losses at higher frequencies, which effectively limits the bandwidth of the system to the lower range. The precise manner in which the theorems described here for quasi-IID behavior translate into signal encoding benefits for the adaptation states and the ISI sequence remains to be explored.
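The low-frequency noise floor of a spike train is proportional to the variance of spike counts in long windows, so the noise-shaping effect can be illustrated with a simple surrogate test. The MA(1) ISI sequence below is a generic stand-in for adaptation-induced negative correlations, not the letter's model; shuffling the ISIs destroys the correlations while preserving their marginal distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# MA(1) ISIs: lag-1 correlation -theta/(1+theta^2) < 0 (illustrative values).
n, mu, theta = 20000, 1.0, 0.8
e = 0.2 * rng.standard_normal(n + 1)
isis = np.clip(mu + e[1:] - theta * e[:-1], 0.01, None)  # guard rare negatives
isis_shuffled = rng.permutation(isis)

def count_var(isis, window=50.0):
    """Variance of spike counts in consecutive windows, a proxy for the
    low-frequency power of the spike train."""
    t = np.cumsum(isis)
    counts = np.histogram(t, bins=np.arange(0.0, t[-1], window))[0]
    return counts.var()

# Expect: far smaller count variance for the negatively correlated train.
print(count_var(isis), count_var(isis_shuffled))
```

The reduced count variance of the correlated train corresponds to the lowered noise floor at low frequencies described above.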

## Appendix

### A.1 Proof of Theorem 1, the Quasi-IID Adaptation Property

The value $\eta = \eta_0$ is the starting value after activation from some previous spike. We can mandate that $\eta_0 \leq \eta_\epsilon$ because the time activation function $f$ never exceeds $\eta_\epsilon$.

This expansion establishes the first case, equation 3.1. The statement means that for the initial interval of time $[\eta_0, \eta_\epsilon)$, less than an $\epsilon$-fraction of the ensemble spikes, so the survival function is $\epsilon$-close to unity.

### A.2 Proof of Corollary 3

### A.3 Proof of Theorem 5

$\square$

### A.4 The Effect of Auxiliary Parameters on Spike Statistics and Fisher Information

The coding efficiency of the rate code was found to be 20% to 30% less than that of the pattern code in the more biophysically realistic example 3 model, for a specific parameter set $\beta = 40$, $\tau = 20$, and $\Lambda = 0.2$, with $\alpha$ varying over what we termed the physiological range. Here we rigorously define this range and perform an extensive parameter scan to understand how all parameters $\beta$, $\Lambda$, and $\tau$ interact with $\alpha$ in the example 3 model to affect coding efficiency. We establish that the 20% to 30% efficiency loss we reported for the rate code is representative over an exhaustive scan of the parameters.

Physiologically realistic spiking is characterized by (1) significant serial correlations between subsequent ISIs, with $\rho_{\Delta t}(1)$ sufficiently negative; (2) excitability low enough that the quasi-IID property holds, that is, a postspiking refractory period is present; and (3) nontrivial variability in the adaptation pattern sequence $h_n$, so that neither the ISI sequence nor the $h_n$ sequence is constant or near constant. Conditions 1 to 3 imply that the ISI coefficient of variation (CV) is sufficiently below unity, unlike a Poisson process, yet not so close to zero as for a constant ISI sequence. Over a very large range of parameter combinations, we show that there exist $\alpha$ intervals where the spiking behavior is consistent with conditions 1 to 3.
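Conditions 1 and 3 reduce to two summary statistics of the ISI sequence, which can be checked in a few lines; the threshold values below mirror those used in this appendix, and the helper names are ours.

```python
import numpy as np

def isi_stats(isis):
    """Lag-1 serial correlation and coefficient of variation of an ISI
    sequence, the two statistics delimiting the physiological range."""
    isis = np.asarray(isis, dtype=float)
    rho1 = np.corrcoef(isis[:-1], isis[1:])[0, 1]
    cv = isis.std() / isis.mean()
    return rho1, cv

def in_physiological_range(isis, rho_max=-0.2, cv_lo=0.2, cv_hi=0.3):
    """Admissibility check mirroring conditions 1 and 3 (condition 2, the
    refractory/quasi-IID requirement, must be checked on the model itself)."""
    rho1, cv = isi_stats(isis)
    return rho1 <= rho_max and cv_lo <= cv <= cv_hi

# A perfectly alternating short/long sequence: rho1 = -1, CV = 0.25.
example = [0.75, 1.25] * 500
print(isi_stats(example), in_physiological_range(example))
```

Applied to simulated or recorded ISI sequences, such a check delimits the admissible region scanned in Figure 5.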

Within the physiological range defined by conditions 1 to 3, we find the coding efficiency $E_{\text{rate vs. pattern}}$ is consistently in the 70% to 80% range. In other, more extreme parameter ranges, coding efficiencies were both much lower and much higher, but the associated spiking statistics did not meet criteria 1 to 3. This is explored rigorously as follows.

The choice of $c = 4$ in equation A.26 was not very restrictive. Increasing $c$ further would result in a narrower admissible parameter range; the broader range was chosen to better illustrate the behavior of the adapting point process model. On the left-hand sides of all the parameter islands, the spike patterns possess weak negative serial ISI correlations (the darker red region of the color map, top row of Figure 5), and increasing $s$ yields stronger negative correlations. We consider as physiological only spiking that exhibits serial correlations below $-0.2$, which is typical of reported values (Farkhooi et al., 2009; Nawrot et al., 2007); this region lies to the right of the thick black line in the top row of Figure 5. The right-hand boundary of the parameter island is set by the upper bound of equation A.26 that satisfies condition 2.

The condition on the CV is shown in the second row of Figure 5. The admissible CV range was defined to be 0.2 to 0.3, an intermediate level of spiking variability in this model. Studies of real neurons typically report larger CV-values (Nawrot et al., 2007; Farkhooi et al., 2009), but these must be estimated from long sampling windows, typically much longer than those used for sensory detection, which can include intermediate-to-longer-timescale fluctuations that are not present in a stationary model. Such fluctuations necessarily elevate experimentally computed estimates of variability (Perkel et al., 1967). This 0.2 to 0.3 CV range is bounded by the lower and upper black lines in the second row of Figure 5. Increasing $\beta$- and $s$-levels for any $\Lambda$ had the effect of increasing the regularity of the ISI sequence, resulting in a lower CV.

Together, conditions 1 to 3 formed an admissible parameter range for realistic serially correlated spiking (Nawrot et al., 2007; Farkhooi et al., 2009). These combinations occurred to the right of the ISI correlation boundary, to the left of the edge of the parameter island formed by equation A.26, and between the upper and lower CV boundaries; together these delimit a region of parameter space for each $\Lambda$-value. For some of the larger $\Lambda$-values, there was no admissible parameter region because the increased $\Lambda$-value “pushed” the physiological parameter range off the island. A very large $\Lambda$-value in the range $0.6 < \Lambda < 1$ results in reduced variability for both the ISIs and the $h_n$-sequence because the adaptation levels become effectively saturated (not shown).

Two coding efficiencies were assessed and plotted in the third and fourth rows of Figure 5 within the admissible parameter range island. The border enclosing the physiological range (satisfying properties 1 to 3) was drawn as the white-black dashed lines. The first efficiency (third row) compared the adaptation pattern code with the renewal rate code. The renewal rate code exhibited a consistently low efficiency across a range of $\Lambda $-values, with an upper limit around 0.45 to 0.5, and a lower limit around 0.23 to 0.39 over the $\Lambda $-columns.

The second efficiency compared the pattern code to the spike rate code for the adapting model, with results shown in the fourth row of Figure 5. The negative ISI correlations reduce the variance of summed ISIs, and thus of spike counts, and improve detection performance relative to renewal ISIs. The upper coding efficiency level was approximately 0.75 to 0.85, while the lower level was approximately 0.71 to 0.75 over the $\Lambda$-columns. We conclude that for realistic spike trains exhibiting negative serial correlations and IID adaptation, the coding efficiency over this exhaustive parameter scan of the example 3 model is roughly 70% to 80%; in other words, the rate code detection performance is 20% to 30% less than that of the pattern code.

## Acknowledgments

We thank Gregory Rice (University of Waterloo) and Lajos Horvath (University of Utah) for helpful discussions. A.L. acknowledges support from the NSERC Canada Discovery Grant program, and L.M. acknowledges support from CIHR grant 153143.