The discussion whether temporally coordinated spiking activity really exists and whether it is relevant has been heated over the past few years. To investigate this issue, several approaches have been taken to determine whether synchronized events occur significantly above chance, that is, whether they occur more often than expected if the neurons fire independently. Most investigations ignore or destroy the autostructure of the spiking activity of individual cells or assume Poissonian spiking as a model. Such methods that ignore the autostructure can significantly bias the coincidence statistics. Here, we study the influence of the autostructure on the probability distribution of coincident spiking events between tuples of mutually independent non-Poisson renewal processes. In particular, we consider two types of renewal processes that were suggested as appropriate models of experimental spike trains: a gamma and a log-normal process. For a gamma process, we characterize the shape of the distribution analytically with the Fano factor (FFc). In addition, we perform Monte Carlo estimations to derive the full shape of the distribution and the probability for false positives if a different process type is assumed as was actually present. We also determine how manipulations of such spike trains, here dithering, used for the generation of surrogate data change the distribution of coincident events and influence the significance estimation. We find, first, that the width of the coincidence count distribution and its FFc depend critically and in a nontrivial way on the detailed properties of the structure of the spike trains as characterized by the coefficient of variation CV. Second, the dependence of the FFc on the CV is complex and mostly nonmonotonic. Third, spike dithering, even if as small as a fraction of the interspike interval, can falsify the inference on coordinated firing.
Synchronous spiking has been implicated in neural coding (Abeles, Bergman, Margalit, & Vaadia, 1993; Gray, König, Engel, & Singer, 1989; Kilavik et al., 2009; Riehle, Grün, Diesmann, & Aertsen, 1997; Singer, 1999). It has been associated with cortical avalanches (Beggs & Plenz, 2003) and synfire chains (Abeles, 1991) and has also been used to constrain the mechanisms that give rise to cortical activity (Diesmann, Gewaltig, & Aertsen, 1999; Gutkin, Laing, Colby, Chow, & Ermentrout, 2001; Vicente, Gollo, Mirasso, Fischer, & Pipa, 2008; Sommer & Wennekers, 2001; Aoki & Aoyagi, 2007). Hence the determination of whether synchronized events occur above chance (i.e. whether they occur more often than expected if the neurons fire independently) has recently attracted considerable attention. Several approaches have been taken to investigate this issue (Amari, Nakahara, Wu, & Sakai, 2003; Aertsen, Gerstein, Habib, & Palm, 1989; Brody, 1999; Brown, Kass, & Mitra, 2004; Truccolo, Eden, Fellows, Donoghue, & Brown, 2005; Gerstein & Perkel, 1972; Grün, Diesmann, Grammont, Riehle, & Aertsen, 1999; Grün, Diesmann, & Aertsen, 2002a, 2002b; Grün, 2009; Ikegaya et al., 2004; Kass, Ventura, & Brown, 2005; König, 1994; Louis, Gerstein, Grün, & Diesmann, 2010; Pipa & Grün, 2003; Pipa, Riehle, & Grün, 2007; Pipa, Wheeler, Singer, & Nikolic, 2008; Agmon, 2012; Ventura, Cai, & Kass, 2005; Gerhard, Haslinger, & Pipa, 2011a, 2011b; Pipa & Munk, 2011; Wu, Wheeler, & Pipa, 2011) (Oram, Wiener, Lestienne, & Richmond, 1999) performed a similar analysis, however, for patterns within single spike trains). Many of these approaches for the analysis of correlated spike trains imply more or less implicitly the assumption of Poisson spike trains, for example, by simulating spike trains as Poisson processes with the same rate profiles as the neurons under investigation to determine the statistics of coincident events analytically. However, experimental interspike interval (ISI) distributions can show a substantial deviation from the exponential distribution, since the nature of experimental spike trains can be dramatically different from Poisson processes (Burns & Webb, 1976; Levine, 1991; Teich, Heneghan, Lowen, Ozaki, & Kaplan, 1997; Iyengar & Liao, 1997; Nawrot et al., 2007, 2008; Smith, 1954a, 1954b; Krahe & Gabbiani, 2004; Farkhooi, Strube-Bloss, & Nawrot, 2009). However, their effect on the occurrence of synchronous or spatiotemporal spike events has been only rarely investigated (Grün et al., 2002a; Pipa & Grün, 2003; Gerstein, 2004; Louis, Borgelt, & Grün, 2010; Louis et al., 2010; Ventura et al., 2005), and in particular, which conclusions arise if such features of experimental spike trains are ignored. Figure 1 illustrates the influence of non-Poisson spike train structure (here modeled as renewal gamma processes) on the distribution of coincident spike events. The comparison of the distributions to the one of Poisson processes demonstrates that the shape of the coincidence distribution may be considerably affected by the spike train structure. Thus, assuming Poisson in the presence of non-Poisson processes would lead to incorrect estimations of the statistical significance of coincident spiking activity. This aspect will be rigorously evaluated in the first part of this letter. Two types of renewal processes (gamma and log-normal processes) considered as reasonable models for experimental spike trains (Burns & Webb, 1976; Levine, 1991; Teich et al., 1997) are used for the investigation. Their ISI characteristic can be parameterized by the coefficient of variation (CV) and thus can be varied from a bursty to a regular spike train structure. The influence of wrongly assumed spike train structure onto the distribution used for significance estimation will be shown by analytical and numerical methods.
Another approach increasingly used for the creation of surrogate data for significance evaluation of correlated spike trains is dithering (Date, Bienenstock, & Geman, 1998; Davies, Gerstein, & Baker, 2006; Gerstein, 2004; Hatsopoulos, Geman, Amarasingham, & Bienenstock, 2003; Shmiel et al. 2006). This procedure is also often called jittering (Kuhn, Aertsen, & Rotter, 2003), but we prefer to reserve the term jitter for the temporal imprecision in the original data and use dithering for the manipulation of the data (Grün, 2009). The method intentionally destroys existing spike correlation by a small displacement of the timing of each individual spike by a random amount. Therefore, the rate profiles stay almost unchanged. However, obviously this procedure interacts with the inter spike intervals of the original spike trains, and thus must lead to changes of the coincidence distribution. Although the efficiency of dithering was recently studied (Pazienti, Diesmann, & Grün, 2007; Pazienti, Maldonado, Diesmann, & Grün, 2008), there are no systematic investigations of the consequences of spike dithering on the significance evaluation of correlation. Here we will show first how the spike train structure interacts with dithering and, second, under which conditions the method may lead to false-positive results. Finally we discuss the results and methods to avoid false-positive results and incorrect data interpretation.
2. Spike Train Models
Although it is well known that spike trains are not Poissonian, this assumption is often used for simplicity for analytical and statistical purposes. Three main facts clearly indicate that biological spike trains have an autostructure that deviates from Poissonian processes. First, all neurons exhibit refractory periods after spike firing. This refractoriness makes short interspike intervals on the order of several milliseconds unlikely to happen and is therefore incompatible with the exponentially decaying interspike interval distribution of a Poisson process (Cox & Lewis, 1966). Second, biological neurons can fire with a certain temporal structure that can be described in a first approximation as regular or bursty firing. In the latter case, neurons fire two or more spikes during short epochs, followed by longer episodes of silence. And third, neuronal firing might not be described by renewal processes but exhibit higher-order dependencies of spike times that lay further back in the past (Nawrot et al., 2007). In this letter, we focus on the first two aspects and assume that the neuronal firing can be described by renewal processes. We study the impact of the refractoriness, the burstiness, and the regularity of spike trains on the probability distribution of coincident firing.
To this end, we use two models of neuronal firing that are both renewal processes with two model parameters each: the gamma process and the log-normal process. Both processes allow us to model burstiness, regularity, and refractoriness for different parameter combinations, and both have been proposed as model descriptions for neuronal spike trains (Burns & Webb, 1976; Levine, 1991; Teich et al., 1997; Iyengar & Liao, 1997; Nawrot et al., 2007, 2008; Smith, 1954a, 1954b).
2.1. Gamma Process.
This distribution is characterized by two variables: the spike rate R and the shape parameter . The Poisson process is a special case of the gamma process for which . Values of the shape parameter smaller than 1 make the distribution become hyperexponential, and short intervals are more likely to occur than for a Poisson process with the same rate. We use this parameter regime of a gamma process with a shape parameter smaller than 1 in order to model bursty spike trains. In contrast, for very large values of the shape parameter , the gamma distribution approaches a narrow normal distribution. Therefore, a gamma process with shape parameter is used to model regular firing.
2.2. Log-Normal Process.
3. Coincidence Count Distribution
In this section, we derive and discuss the impact of the interspike interval distribution on the shape of coincidence count distributions. To stress the importance of this interplay we first give a few examples in order to demonstrate that the impact of the detailed interspike interval distribution on the coincidence count distribution can be important. The first example we look at is the comparison of different renewal processes that look identical concerning two features that are frequently used to characterize biological spike trains: the CV of the interspike interval distribution and the Fano factor FFs of the spike count distributions. As a second example, we show that dithering of individual spikes can alter the coincidence count distribution for different renewal processes characterized by the same CV in very different ways.
Following these examples, we present analytical expressions and a rigorous discussion of the mean and the variance of the coincidence count distribution for a frequently used model of biological spike trains, the gamma process in the case of integer shape parameters. We then complement this analytical derivation by numerical estimations of the mean and the variance for gamma processes with noninteger shape parameters as well as with a second kind of renewal process with log-normal-distributed interspike intervals.
Given two parallel spike train processes, we define a coincidence based on binned versions of the original processes. The binned spike trains are obtained by segmenting the time axis into exclusive bins, each of length , and counting the exact number of spikes nk per bin k (no clipping; clipping reduces any number of spikes larger than 1 to 1; see Grün, Diesmann, & Aertsen, 2002b). The number of coincidences in bin k shared by the two corresponding simultaneous bins of neurons 1 and 2 is then defined by Nkc=n1kn2k.
3.1.1. Examples of Coincidence Count Distributions for Gamma and Log-Normal Processes
As a first example of the importance of the interspike interval distribution on the coincidence count distribution, we show in Figures 2 and 3 the coincidence count distributions for gamma and log-normal processes for the case of four different CVs but equal spike rate R and bin widths . Figures 2B1 and 3B1 show the coincidence count distribution for a gamma and log-normal process, respectively, in case of very regular spiking activity, yielding a CV of 0.1. In contrast, panel B4 in both figures shows the coincidence count distribution for a CV of 1.5. Note that the nature of the log-normal distribution makes very short intervals unlikely, even for CV values larger than 1, while short intervals are likely for the same CV in the case of a gamma process.
That difference in the likelihood of short ISIs makes the distribution of the coincidence counts much broader in the case of the gamma process than for the log-normal process (see Figures 4A, 5A3, and 5B3 for quantification of the broadening in terms of the Fano factor). This emphasizes that the detailed structure, namely, the stronger burstiness for Gamma compared to log-normal processes, is responsible for making the coincidence distribution different even though the CV and the FFs of the spike counts are identical.
3.1.2. Example for the Effect of Dithering on the Coincidence Count Distribution.
Another peculiarity occurs when individual spikes are dithered (Date et al., 1998; Davies et al., 2006; Gerstein, 2004; Hatsopoulos et al., 2003; Shmiel et al., 2006). This technique is often used for statistical purposes to derive surrogate data (e.g., to derive the probability of coincidence firing under the assumption that neurons fire independently). Dithering is defined by a random variation of the spike timing of individual spikes by a random offset, for example, drawn independently and identically distributed (i.i.d.) from a gaussian distribution with zero mean and variance (Jones, Depireux, Simons, & Keller, 2004). Since the gamma process with CV= 1 is actually a Poisson process, dithering affects neither the ISI distribution nor the coincidence count distribution, as shown in Figure 2B3.
However, although the CV and the Fano factor of the spike count distribution are identical for both the Poisson and the log-normal distribution, dithering changes the shape of the coincidence count distribution for the log-normal process (see Figure 3B3) but not for the Poisson processes (see Figure 2B3). In contrast, we see the opposite effect for CV=1.5. Dithering changes the coincidence distribution in case of the gamma process, but it stays nearly unchanged for the log-normal process. These examples show that dithering changes the shape of the coincidence count distribution in a rather unpredictable way when the nature of the underlying process is unknown.
3.2. Fano Factor (FFC) of Coincidence Count Distribution for Renewal Processes
After having presented examples that emphasize the necessity of a deeper understanding of the interplay between the nature of a renewal process and the coincidence count distribution, we now give a rigorous analytical description of the mean and the variance of the coincidence count distribution of a gamma process with integer shape parameter .
Next we take the limit and use the fact that as |k| approaches infinity (), Aik approaches exponentially. Therefore, approaches 0 exponentially as |k| increases.
Note that the analytical expression described so far for the Fano factor, equation 3.10 holds for any renewal process as well as any stationary nonrenewal point process for which the autocorrelation approaches sufficiently quickly.
3.3. Analytical Derivation of FFc for Poisson Processes
Note that FFc is larger than 1 and grows with increasing spike rates and bin width. The reason is that we counted the number of coincidences as the product of the number of spikes of simultaneous bins (see equation 3.1).
Mostly for simplification, an alternative way of counting coincidences has been proposed that requires that the spike trains be binary sequences. In this case, the number of coincidences per bin can be either 0 or 1 (clipping). Note also that in this case, the Fano factor FFclipc does not equal 1. In fact, FFclipc is smaller than 1 in the case of underlying Poisson processes.
3.4. Analytical Derivation of FFc for Gamma Processes
After having derived the Fano factor FFc for Poisson processes, we derive a closed analytical expression of FFc for gamma processes.
Hence, equation 3.16 is a closed analytical expression for the Fano factor FFc of the coincidence count distribution of two mutually independent gamma processes with integer shape factors in case coincidences are counted by the product of spike counts per bin as defined in equation 3.1.
Using equation 3.16, we then studied the functional dependencies of FFc as a function of, the shape parameter, different rates for the two spike trains considered simultaneously, and, the product of the bin width and the spike rate (see Figure 4).
In Figure 4A we plot FFc as a function of the CV for four variations of the product of the spike rate and the bin width for two processes with identical rate R=R1=R2 and identical shape factors . It demonstrates that the value of FFc changes nonmonotonically with increasing values of CV. For a low order of , FFc decreases with increasing shape parameter, while FFc increases with for larger values of the shape parameter. Moreover, although FFc is below 1 for all four variations of for an intermediate range of the CV, it is larger than 1 for low and high values of the CV.
As a second variation, Figure 4B, we used equation 3.16 to study changes of the Fano factor FFc as a function of the CV for the case that the rates of processes 1 and 2 are differing but have the same shape factors . The bin width is chosen such that . We quantify the difference in rate by with and . For , the Fano factor is and does not depend on , such that we reproduce the results from Figure 4A. For small values of , the dependence on is weak, but for large , there is a strong dependence. However, this dependence is not monotonic either.
For small differences of that are larger than , the quali- tative behavior for small values of CV is quite different and stays below 1, while the difference across differences of for larger values of the CV disappears.
In Figure 4C we study FFc as a function of the difference in rate for five different shape parameters and . Again the plot demonstrates that the behavior is not linear or trivial. With increasing shape parameter, the mapping becomes more complex. While it is monotonic as a function of the absolute value of for shape factors , the relation becomes nonmonotonic for shape parameter . For larger , the FFc has a maximum for and drops more rapidly with as is increased. However, for very large , there is a secondary peak for for this value of , R1=2R2. Presumably for even larger , there is another peak when R1=3R2. This illustrates that the Fano factor of the coincidence counts can change quite unexpectedly and strongly even for very small differences in the spike rates of two independent processes if the gamma processes becomes very regular.
As a last variation (see Figure 4D), we study FFc as a function of the for five variations of the shape factor . Again the complexity of the behavior increases with increasing shape factor. For a Poisson process, FFc grows linearly with the rate difference , while it is also nonmonotonic for shape factors .
This set of results demonstrates that the Fano factor FFc of the coincidence count distribution depends in a complex and nonintuitive way on all three parameters: the shape factor and the CV, the product of the spike rate and the bin width , and the difference between the spike rates of two independent gamma processes.
The fact that even small changes of single parameters that are each in a biological plausible range can alter the Fano factor quite heavily and emphasize that a statistical inference on coincident firing may fail in case that it is based on the assumption that neuronal firing can be approximated by Poissonian processes. We find that if the CV of the processes is approximately smaller than 0.2 or larger than 0.7, the significance would be overestimated, and for values in between, it would be underestimated (see Figure 4A). The picture changes if the firing rates of the two processes differ. Then the significance tends to be underestimated for small CVs (see Figure 4B and 4C). The underestimation is stronger if rate differences are larger (see Figure 4D).
3.5. Numerical Estimation of the Coincidence Count Distribution.
Next, we complement the analytical results from the last section with numerical estimations for the gamma processes with noninteger order parameter and use log-normal processes as a second kind of renewal processes. As for the analytical derivation of the Fano factor FFc for the coincidence count distribution, we choose the coefficient of variation (CV) of the ISI distribution for the parameterization of both processes (see equations 2.2 and 2.6). We first checked the correspondence between the analytically predicted model CV and the numerical estimation for the gamma and log-normal process (see Figures 5A1 and 5B1). For all tested parameter combinations, the numerical estimations of CV corresponded to the analytically predicted values.
In a further step, we study the effect of dithering of individual spikes (i.i.d., gaussian, zero mean, and 5 ms standard deviation) of gamma and log-normal processes on the CV of the resulting processes (see Figures 5A2 and 5B2). Dither widths as small as 5 ms change the CV of the original process quite strongly and increase the CV. This is because processes with small values of CV have many pairs of spikes that occur in short intervals and can be characterized as bursty. This bursty activity is destroyed for even small amounts of dithering, since the short timescale of the dithering is comparable to the timescale of the bursts. For an original CV of about 1, there is no obvious change, but for original CV larger than 1, there is a tendency for resulting CVs to be smaller than 1. Intuitively that can be explained by the fact that dithering makes the interspike interval distribution of any kind of spiking activity more Poisson.
To estimate FFC for noninteger gamma processes with shape factors ranging from to 100 (CV from 0.1 to 1.5), as well as renewal processes with log-normal distributed ISIs for values of CV between 0.1 to 1.5, we used Monte Carlo estimations (see Figures 5A3 and 5B3; for details regarding the numerical procedure, see appendix C). The numerical results of the gamma process correspond to the analytically predicted values (compare Figure 4A). For shape parameters smaller than 1, for which we do not have analytical predictions, the value FFc grows with increasing CV and increases of the product . The qualitative behavior of changes of FFc for a log-normal process as a function of the CV are comparable to the changes in the case of a gamma process. Both processes exhibit nonlinear and nonmonotonic changes of FFC as a function of CV. Moreover, both processes exhibit a low FFC for intermediate values of CV, while values of FFC are large for very low and high values of CV.
In general, changes of FFc are more pronounced for larger values of the product . This is true for both kinds of processes. However, despite the qualitatively identical behavior, changes for larger values CV are up to two times stronger for log-normal processes than for gamma processes.
3.6. Comparison of the Full Distributions.
To compare the full shape of the distribution of coincidence counts we use quantile-quantile (QQ) plots. We first compare the distribution for gamma (see Figure 6A1–6A3) and log-normal processes (see Figure 6B1–6B3) with the distribution for Poisson processes, for the case that all three types of processes have identical spike rates (R = 50 Hz). Differences between distributions resulting from gamma and log-normal processes as compared to Poisson processes are particularly obvious in the tails of the distributions. Because the average coincidence count is unaffected by the shape of the ISI distribution, the medians of the distributions are almost the same (QQ plots cut the diagonal close to 0.5). Large and small values of CV, deviating from CV=1, make the coincidence count distributions of the gamma and log-normal processes more heavily tailed than the distribution resulting from Poisson processes, whereas for intermediate values of the CV, their tails are less pronounced. The fact that these differences are especially pronounced in the tails implies strong consequences for hypothesis testing. To illustrate that, we added two insets for each QQ plot showing the first 10% from 0 to 0.1 corresponding to the left tail of the distribution (see the insets above the diagonals in Figure 6) and the last 10% from 0.9 to 1 corresponding to the right tail of the distribution (insets below the diagonal in Figure 6).
For the left tail of the distribution, any curve above the diagonal indicates an increased false-positive level if the test statistics is based on the assumption that spike trains follow Poissonian firing. In contrast, any curve below the diagonal indicates a reduced number of false positives and therefore a conservative hypothesis test. The actual number of false positives in comparison to the given the test level (values along the x-axis) is equal to the value of the QQ curve (y-axis) at the given test level (x-axis). In the case of the right tail of the distribution (insets below the diagonal showing values ranging from 0.9 to 1), this relation is the opposite. Here, any curve below the diagonal indicates an increased level of false positives. For this case, the actual number of false positives, given the assumption of Poissonian firing, is equal to 1 minus the value at which the curve intersects a given value of the x-axis that equals the assumed test level. Note that the interpretation of false positives occurring at the different tails of the distribution would be interpreted differently. While false positives in the left tail of the distribution would be interpreted as an overestimation of lacking coincidences, they would be interpreted as an overestimation of excess synchrony at the right tail. Typically of interest are tests showing excess synchrony.
Differences between the distributions can be quite dramatic, such that the number of false positives can be considerably increased (see Figures 6A3 and B3) for large bin width, as well as for large values of the CV. Qualitatively, changes for both types of processes, the gamma and log-normal, compared to Poisson are very similar. Quantitative differences are most pronounced in the tails, where differences in comparison to distributions generated by Poisson processes are generally strong.
4. Influence of Dithering on Coincidence Distribution
4.1. Analytical Approach for Gamma Processes.
As we have shown, the Fano factor for the number of coincidences FFc depends on the autocorrelations of the spikes trains of the two neurons, A(i)k. The autocorrelation of the spike trains after dithering is applied will be different from the autocorrelation of the undithered spike trains. Hence, dithering will change the shape of the distribution, and thus the Fano factor of the coincidence count.
In general, it is not possible to analytically derive the Fano factor of the coincidences after dithering. However, it is instructive to consider the extreme with an extremely large width of the dithering kernel. In this case, the autocorrelation becomes flat except for the peak at zero delay. Nevertheless, the process is not Poissonian, since dithering does not change the spike count distribution for the two processes.
Note that in this derivation, we have not used the fact that spike trains are renewal processes. This result holds for any stationary process.
The impact of dither on the Fano factor FFc is very nonintuitive. Changes of FFc are monotonic for small shape factors of the gamma process and nonmonotonic for large shape factors and can amount to up to 40% of relative changes. The strongest changes occur for dithering of the order of 0.1 to 1 times the average interspike interval (see Figure 7).
This is exactly the range that is used for significance estimations, since any smaller amount of dithering does not destroy coincidences and any larger dithering could change the profile of rate changes. This emphasizes that the actual p-value describing the likelihood that coincidence events are due to chance might depend on the amount of dithering, whereas the relation can be nonmonotonic.
4.2. Appropriateness of Dithering to Realize the Null Hypothesis of Independence.
So far we have analytically described the impact of dithering on the Fano factor FFc of gamma processes with the integer shape factor. Next, we supplement the analytical expressions of FFc by Monte Carlo–based estimations of the full shape of the distribution of coincidence counts of undithered and dithered gamma and log-normal processes. In addition, we explore how far the use of dithering is appropriate as an implementation of the null hypothesis of independence in a significance test of spike coincidences. Thus, we use the distribution of coincidence counts derived from mutually independent and dithered gamma and log-normal processes as respective reference distributions and estimate the probability for significant outcomes of the original, undithered spike trains. By comparing the distributions from the undithered to the dithered coincidence count distributions, we derive an estimate of the probability of false-positive outcomes. This is done in the following way. We first determine the critical number of coincidences Ncritical from the coincidence count distribution of the dithered processes as the minimal count just becoming significant given the test level such that test level. Then we compute the probability of false positives of the corresponding undithered processes as . Since these three numbers, N, Nundith, and Ncritical, are integer numbers, we use a linear interpolation on the cumulative distribution of the reference distribution to derive an interpolated Ncritical to overcome variations in the effective significance level occurring due to the discreteness of the coincidence count distribution. We use four different amounts of dithering ( ms, 10 ms, 20 ms, and 40 ms) that are gaussian distributed with zero mean and a standard deviation of . The spike rate was 50 Hz and interspike intervals were gamma distributed in Figures 8A1 to A4 and log-normal distributed in Figures 8B1 to 8B4.
For low and intermediate values of the CV, the qualitative and quantitative behavior of both process types is quite similar. Small values of the CV lead to increases in the number of false positives up to seven times compared to the test level. Intermediate values of the CV yield for both process types a level of false positives that is comparable to the test level. Only for high values of the CV do the false positives behave differently for gamma and log-normal processes. While for gamma processes, the false positives increase, the false-positive rate is comparable to the test level in the case of log-normal processes. The reason is that the probability for small interspike intervals for log-normal processes and gamma processes is very different for larger CV. In case of gamma processes, the distribution is hyperexponential, and short intervals are very likely, whereas for log-normal processes the probability for very short intervals is zero and very low for small ISIs. Since dithering destroys very short intervals more effectively than larger ones, the coincidence count distributions are in particular affected by this procedure.
This illustrates that the influence of dithering cannot be predicted by the CV of the underlying spike trains alone. Changes of the coincidence count distribution due to dithering depend on the detailed structure of the interspike interval distribution. For testing if coincidences are occurring by chance, dithering was introduced to eventually destroy existing excess coincidences between neurons to derive the distribution of coincident events occurring by chance. However, since dithering destroys not only coincident events of coupled neurons but also the shape of the coincidence distribution due to changes on the autostructure, it is not possible to tell the one effect from the other. Therefore, significant deviations of the number of coincidences could be due to either changes on the autostructure or the coupling of neurons. This illustrates that dithering, even if very small compared to the expected interspike interval or other methods that manipulate the autostructure of the original spike trains, may lead to false conclusions whether or not excess synchronous activity occurs.
The procedure we have taken here provides an average estimate of the effect of manipulation by dithering. However, when estimating the significance of a number of coincidences in a data set, dithering is applied many times to the original data set. Based on the coincidences counted in each dithered version of the original data, the coincidence distribution generated then serves for significance estimation of the number of coincidences found in the original data set (for details see Louis et al., 2010; Louis, Borgelt, & Grün, 2010).
Coordinated neuronal activity across many neurons is a major component of neuronal activity and involved in neuronal coding. However, the discussion whether coordinated activity really exists and if it is used for coding (Hebb, 1949; Barlow, 1972; Rieke, Warland, de Ruyter van Steveninck, & Bialek, 1997; Shadlen & Movshon, 1999; Singer, 1999; Eggermont, 1990; deCharms & Zador, 2000) remains heated and controversial. While a large and still growing number of studies have confirmed that coordinated neuronal activity like coincidences, spatiotemporal patterns, and avalanches occur more often than expected by chance, other studies have reported the opposite (Baker & Lemon, 2000). A major uncertainty is that it is unknown to what degree statistical features of the individual processes are involuntarily neglected and thus change the likelihood that experimentally observed instances of coordinated neuronal activity can be explained by chance (for a review on these issues, see Grün, 2009).
The existence of an autostructure within spike trains is clear, since neurons have a refractory period after they fire a spike, and they fire often with a temporal structure that is bursty, regular, or periodic. In addition, neuronal firing might not be described by renewal processes but exhibit higher-order dependencies of spike times further in the past (Nawrot et al., 2007). In this letter, we studied the influence of the autostructure of renewal processes on the shape of the coincidence count distribution of pairs of mutually independent point processes. To this end, we derived an analytical expression for the expected mean coincidence count value and the Fano factor of the coincidence counts by the autocorrelation of the coincidences, in particular, for gamma renewal processes with integer shape parameters. We supplemented these results with Monte Carlo estimations of the full shape of the coincidence count distributions for gamma renewal processes with noninteger and integer shape parameters, as well as for renewal processes with log-normal distributed interspike intervals.
Our first major finding is that the width of the coincidence count distribution depends critically and in a nontrivial way on detailed properties of the interspike interval distribution. This is best illustrated by the complex and nonmonotonic dependence of the Fano factor FFc on the CV of the interspike intervals for both the gamma and log-normal process (see Figures 4 and 5). The complexity of the FFc becomes even more apparent when differences in the individual spike rates , changes in the bin length , or changes in the shape parameter are considered. All of these affect the FFc nonlinearly and nonmonotonically for shape parameters larger than 1.
The second major finding is that the width of the coincidence count distribution characterized by FFc depends on the detailed properties of the individual point processes. This is demonstrated by differences in the Fano factor for gamma and log-normal processes with identical CV of the interspike interval distributions (see Figure 5). Still, for both types of processes, we found that CV values indicative of strong regularity or burstiness lead to a larger Fano factor than for intermediate CVs when the firing rates of the processes are identical. Note that the complex and mostly nonmonotonic behavior of the Fano factor for gamma processes as a function of the rate difference and the order parameter requires a precise knowledge of these parameters to allow a prediction of FFc. This suggests that predictions of the FFc based on estimated model parameters without knowing the underlying distributions are not trustworthy.
5.1. Interaction of Spike Count Variability and Spike Train Regularity.
The Fano factor of the coincidences is mainly determined by two factors: variability in the number of spikes for each of the spike trains and the sharpness of the ISI distribution. If the ISI distribution for both spike trains is very narrow and both processes have the same rate, the following is true: if there is a coincidence at a time t=t0, the probability that there is a second coincidence at a time is high, while this probability is low if there is no coincidence at time t=t0. This leads to an increase in the variability of the number of coincidences. A higher variability in the number of spikes of the individual neurons also leads to a larger Fano factor of the coincidences. These effects can be separated by dithering the spike trains by an amount that is large compared to the average ISI. This dithering destroys the fine temporal structure in the spike times but maintains the number of spikes and thus their count variability. Indeed we observed in this case that the FFc depends on only the Fano factor of the spikes counts FFs of the two neurons (see equation 4.12).
Since the Fano factor of the spike count is equal to C2V for renewal processes, this allows a qualitative understanding of the width of the distribution of coincident events as a function of the coefficient of variation. As the CV is decreased from a high value, the effect of lowering the Fano factor of the spike counts FFs dominates, and the Fano factor of the coincidences FFc decreases. For small CV, the effect of the fine temporal structure becomes increasingly important. In the latter case, FFc increases as CV is lowered even more. The point where this transition occurs, however, depends strongly on the details of the renewal processes. Note also that the higher likelihood of having a coincidence at time given a coincidence at time t=t0 applies only when the two spike trains have almost the same rate. This is true to a lesser extent when the ratio of the rates is around 1:2 or, in general, n:m (with ).
5.1.1. Interaction of Higher Complexity of More Than Two Neurons.
5.2. Statistical Inference from Dithering.
In order to statistically evaluate the existence of coordinated firing, one needs to define a model that predicts the number of coincidences occurring by chance for mutually independent processes. Different approaches have been taken to derive an approximate model for independent processes from recorded biological spike trains. The approaches can be divided into two classes: approaches that consider the full autostructure of the spiking activity of each individual neuron (Pipa et al., 2007, 2008; Harrison & Geman, 2009) and approaches that ignore, at least partly, the autostructure. Here we explored the approach of dithering (Date et al., 1998; Abeles & Gat, 2001; Hatsopoulos et al., 2003; Shmiel et al., 2006) that belongs in the latter class.
5.2.1. Dithering as Implementation of the Null Hypothesis for Coincidences.
Dithering of individual spikes was proposed to test whether coincidences occur by chance. The basic idea is that dithering destroys existing coincident events caused by neuronal coupling. In order to test if in the original data set coincidences occur more often than expected by chance, their number is compared with the distribution of coincidences derived by dithering. Since small amounts of dithering do not alter the rate profile significantly, it was proposed for the detection of synchronized neuronal activity during the presence of rate variation. However, as shown here, dithering not only destroys coincident events induced by coupling, but also changes the shape of the coincidence distribution due to changes of the autostructure. Therefore, significant deviations from the number of coincidences expected by chance can be due to either changes of autostructure or the coupling of neurons. To judge whether changes of the autostructure are of practical relevance for significance estimation based on dithering, we investigated to what degree FFc and the shape of the coincidence count distribution change as a function of the amount of dither and the nature of the autostructure of the spike trains.
The amount of dither is constrained by two timescales. The lower bound is given by the bin length . Any change smaller or of the order of has little chance of destroying potentially existing coordinated events (Pazienti et al., 2008). The upper bound is given by the fastest timescale of rate changes present in the data. Therefore, dithering should be chosen of a particular width, for example, from a gaussian distribution with zero mean and variance .
We analytically derived the effect of dithering on the Fano factor of the coincident count distribution FFc. The main result is that changes of FFc for changes of are nonmonotonic and can either decrease or increase the variance of the coincidence count distribution for gamma processes with a shape factor larger than or equal to two. The strongest changes of FFc occur exactly for the domain , which is often used to manipulate recorded biological spike trains to get a model for independent firing. To understand the impact of this procedure on the estimation of the statistical significance of coordinated firing, we estimated the probability of false positives. To this end, we estimated the probability of significant coincidence counts of simulated mutually independent gamma or log-normal processes given the coincidence count distributions derived from dithered versions of the spike trains.
We found that false positives were considerably increased for small values of the CV for both gamma and log-normal processes. For larger values of the CV>1, only the gamma process expressed an increased level of false positives. Thus, even very small numbers of dithers of less than the expected ISI interval can influence the occurrence of false positives quite strongly. Hence, even alternative dither approaches (Davies et al., 2006) that aim to minimize the number of changes on the spiking data by adapting the amount of jitter by preceding and subsequent ISIs may lead to false conclusions whether synchronous activity occurs by chance or not.
5.2.2. Dithering as Implementation of the Null Hypothesis for Complex Patterns.
So far we have discussed the impact of dithering on distributions of coincidences between two neurons. Since dithering has also been applied to derive the statistical significance of higher complexity -tuplets as well as for spatiotemporal patterns (Abeles & Gat, 2001) that can involve many neurons and more than one spike from the same neuron, we also studied the impact of dithering on changes of FFc for higher complexities of . We found (data not shown here) for the case of synchronous firing of spikes within n neurons for integer gamma processes that the effect of dithering on the width of the coincidence distribution becomes more pronounced with increasing complexity. However, the direction of the change of FFc (i.e., whether dithering broadens or narrows the distribution) can be different for different complexities even for the same and cannot be predicted from the effect of dithering on doublets. However, for most values of CV, dithering decreases the Fano factor of the coincidence count distribution FFc for complexities . This means that for most values of CV, dithering leads to an increased level of false positives.
In case of coincident firing that involves maximally one spike per neuron, the expected value of coincident events stays unchanged for different types of autostructures, while the variability of the coincident count distribution changes. In the case of spatiotemporal patterns, which can include more than a single spike from each neuron, the situation is different. Here, the autocorrelation influences the probability of occurrences of a certain spike sequence of the same neuron directly. Hence, any autostructure that deviates from Poissonian firing leads not only to a change of the variance but also to a change of the expected value of the number of spatiotemporal patterns. Whether the expected number is larger or smaller than in case of Poissonian firing depends on the autocorrelation. If the autocorrelation exhibits a bump that corresponds to the interval between spikes of the same neuron in the spatiotemporal pattern, the expected number will be larger than with Poisson processes.
Hence, for spatiotemporal patterns, dithering can alter not only the variance of the count distribution of spatiotemporal patterns but also the expected number. This makes a prediction of changes on the distribution of spatiotemporal pattern due to dithering difficult. However, our results suggest that spike trains that are very regular or bursty have a higher number of chance patterns as compared to Poisson processes. Thus, dithering may trivially reduce the expected number and may generate false positives. Unfortunately, up to now, this has not been investigated. However, it was shown that assuming an incorrect shape factor for surrogate gamma processes (e.g., too low) would also yield a lower expectation for the correct shape factor (Gerstein, 2004). Assuming a shape factor for surrogate data higher than in real data would lead to the opposite conclusion (Baker & Lemon, 2000).
5.2.3. Relation to Other Spike Correlation Analysis Approaches.
This study directly relates to other correlation analysis approaches that base their significance evaluation on certain assumptions. For example, the unitary events (UE) analysis (Grün et al., 2002a, 2002b) in its original form assumes Poisson statistics to analyze the significance of coincident spike events (see Grün, 2009, for surrogate-based UE analysis). This assumption enables calculating the coincidence distribution analytically. Thus, given the results in this letter, one would expect that assuming Poisson processes in the presence of, say, bursty processes would lead to false-positive outcomes in the UE analysis (see Figure 5). However, this is not the case (Grün, 2009). The reason is twofold. First, the UE analysis operates on binned and clipped spike trains, which considerably reduces the burstiness. Second, it adjusts the mean of the coincidence distribution (Poisson) used for the significance evaluation according to the spike counts in the data under evaluation. The comparison of coincidence counts resulting from Poisson and non-Poisson processes we have shown in this study was based on distributions derived from many realizations parameterized by the firing rate of the processes. As a consequence, spike count variations of the various realizations also contribute to the width of the distributions, which UE avoids (see for details, Pipa, van Vreeswijk, & Grün, 2010). Further, the UE analysis for intermediate CVs leads to fewer false positives as the applied test level, which explains the enhanced significance of excess spike synchrony found in a comparative analysis of the same data by NeuroXidence and UE (Pipa et al., 2007).
Louis, Gerstein, Grün, and Diesmann (2010) extensively compared various surrogate approaches for the significance estimation of excess spike synchrony in the framework of UE analysis (Grün et al., 2002b). Pairs of spike trains were modeled as gamma processes in nonstationary settings, and the significance of empirically found coincidences was evaluated using surrogate data. Different kinds of spike dithering (mere dither, square root dither, joint-ISI dither, and dithering in operational time) were studied despite the shift procedures used in Pipa et al. (2007). The false-positive level decreased for the methods that increasingly include more features of the original spike trains but still showed an increased FP level for small CVs. Thus, the results agree with our findings in this letter for CV<1 but differ for CV>1. In that range, FPs were at the expected level (significance level) due to the reasons discussed above for the UE analysis.
These results demonstrate that dithering with a dither width much smaller than the average interspike interval can falsify statistical inference of the existence of coordinated neuronal activity. This holds not only for coincidence patterns of two neurons, but also for patterns composed of more neurons or for spatiotemporal patterns. The amount of over- or underestimation of the statistical significance depends on details of the autostructure of the original spike trains and cannot be predicted on the basis of the CV alone.
Appendix A: List of Symbols Used
t Interspike interval (ISI)
p(t) ISI distribution
Expected value interspike interval t
Variance of the ISI distribution
R Spike rate in units of spikes per second
CV Coefficient of variation of the ISI distribution
FFc Fano factor of the coincidence count distribution
Shape parameter of the gamma distribution
a, k Parameter of the log-normal distribution
T Number of trials (realizations) with
Bin width for binning the spike trains, in seconds
Tw Duration of the spike trains in seconds
ni Spike count in bin i ()
Nc Total number of coincidences in interval of , corresponding to I bins
Ncritical Minimal number of coincidences required for significant excess
Variance of dither distribution
FFs Fano factor of the spike count distribution
Appendix B: Autocorrelation of the Gamma Process
B.1. Continuous Time.
We can write for , , where . That this choice has the right properties can be seen as follows. If k is a multiple of , for all l, so that , while if k is not a multiple of , , but , so that .
Appendix C: Estimation of the Coincidence Count Distribution
To estimate the coincidence count distribution we used the following procedure:
We generate mutually independent pairs of spike trains of length Tw as stochastic realizations of the same underlying process. To this end, we draw iteratively as many interspike intervals ti from the same underlying ISI distribution ) as needed to cover a period of . Hence, two periods, each of length —one preceding Tw and one afterward—are used to randomize the starting conditions (warm-up). was selected to be 500 s. That gives on average 25,000 spikes in the warm-up period for a spike rate of 50 Hz. In addition the first initial interval was drawn from an exponential distribution with the same expected ISI.
Binning of the sequence of interspike intervals. We partitioned Tw into exclusive bins each of length and counted the number of spikes ni falling into each individual bin ().
The number of coincidences was derived. To this end, we derived the coincidence count for each individual bin as the product of the spike counts of corresponding bins i of the first n1i and the second spike train n2i. In a last step, we computed the sum of coincidences across all i bins in the interval of Tw.
Steps 1 to 4 were iterated T times (“trials”) to estimate the probability distribution of p(Nc) by .
This work was supported in part by the Hertie Foundation (G.P.), the EU (EU project GABA–FP6-2005-NEST-Path-043309) (G.P.), the German Ministry for Education and Research (BMBF grants 01GQ01413) (S.G.), the Stifterverband für die Deutsche Wissenschaft (S.G.), the Volkswagen Foundation (I/79 342) (S.G., G.P.), Helmholtz Alliance on Systems Biology (S.G.), and EU Grant 269921 (BrainScaleS) (S.G.). We thank Martin P. Nawrot and Mina Shahi for constructive discussions.