## Abstract

Several authors have previously discussed the use of log-linear models, often called maximum entropy models, for analyzing spike train data to detect synchrony. The usual log-linear modeling techniques, however, do not allow time-varying firing rates that typically appear in stimulus-driven (or action-driven) neurons, nor do they incorporate non-Poisson history effects or covariate effects. We generalize the usual approach, combining point-process regression models of individual neuron activity with log-linear models of multiway synchronous interaction. The methods are illustrated with results found in spike trains recorded simultaneously from primary visual cortex. We then assess the amount of data needed to reliably detect multiway spiking.

## 1.  Introduction

Synchrony is widely believed to play a fundamental role in neural computation (e.g., Uhlhaas et al., 2009), but its statistical assessment is subtle (for reviews, see Grün, 2009, and Harrison, Amarasingham, & Kass, in press). To analyze multiway synchrony among simultaneously recorded multiple spike trains that are represented as binary time series, across many trials, it is natural to consider well-established log-linear modeling technology (Gütig & Aertsen, 2003; Martignon et al., 2000; Nakahara & Amari, 2002; Schneidman, Berry, Segev, & Bialek, 2006). The standard approach, however, has two shortcomings. First, it assumes stationarity of firing rates across suitable time intervals. Second, it does not incorporate spiking history or other covariates, and therefore it effectively assumes Poisson spiking. While previous authors have been aware of these issues (Martignon et al., 2000), they have not offered specific methods for dealing with them. In addition, the standard approach ignores the inherent relative sparsity of two-way and higher-order synchronous spiking. Here we provide a modification of the usual log-linear modeling methodology to deal with the case of inhomogeneous non-Poisson firing rates that result from stimulus-driven (or action-driven) recordings, and we give a straightforward procedure for parameter estimation, which is a variant of maximum likelihood via iterative proportional fitting. Tests for pairwise and multiway synchrony may then be based on the bootstrap.

To explain more fully the problem we are trying to solve and the idea behind our proposed solution, let us suppose we have simultaneously recorded spike trains from three neurons that we label (1, 2, 3), for which spiking patterns take the form (a, b, c), where so that, for instance, (1, 1, 0) would signify that the first two neurons fired but the third did not. The probability of obtaining the pattern (a, b, c) at some time t on trial r may depend on the history Ht,r of spiking patterns prior to time t, and it may also depend on some other covariates, such as a measurement of network activity, as in Kass, Kelly, and Loh (2011) or variables representing trial-to-trial variation, as in Ventura, Cai, and Kass (2005a), which we here take to be a vector ut,r. We then write the probability of pattern (a, b, c) at time t on trial r as p123abc,r(t|Ht,r, ut,r). If we were to treat the spike trains as jointly stationary and ignore history, covariates, and additional sources of trial-to-trial variation, we could omit t and r and write the usual model for two-way, but not three-way, interaction as
1.1
where the parameters satisfy certain constraints (e.g., , ). In computer science and physics, models such as equation 1.1 are often called maximum entropy models (Schneidman et al., 2006). In the statistics literature, the parameters are usually standardized by subtracting means (Agresti, 2002). Using equation 1.1, for a given set of data the log-likelihood function may be maximized iteratively to produce fitted parameters and probabilities. The problem we solve here is introducing a variant of equation 1.1 that allows both nonstationarity and the use of history and other covariates.

### 1.1.  Overview of Approach.

Suppose we have spike trains from n neurons recorded simultaneously over a time interval of length T, across R trials. We consider spiking patterns at a relatively fine time resolution, denoted by . In section 3, we report an analysis of simultaneous spiking data recorded from primary visual cortex where we took to be 5 milliseconds. The spike train data may be represented as binary arrays with dimensionality . For the data analysis reported in section 3 we used arrays of dimension to estimate probabilities p123abc,r(t|Ht,r, ut,r). In full generality there are more parameters than data values. Furthermore, nonstationarity and history dependence could, in their most general conception, be very complicated. The approach we take here simplifies the situation greatly by making these assumptions:

1. For every neuron i, the firing probability varies smoothly across time and depends on spiking history Ht,r only through the history Hit,r for neuron i, together with the covariate vector ut,r.

2. Excess joint pairwise spiking, above that expected under independence, does not depend on either spiking history or the covariate vector.

Importantly, the second assumption defines excess synchrony relative to “independence,” which here means independence conditionally on all history and covariate effects mentioned in the first assumption. In specifying these assumptions, we aim to emphasize the way synchrony is judged against a backdrop of explanatory covariates. For example, it is widely appreciated that a pair of neurons may exhibit excess pairwise spiking relative to what might be expected from their time-averaged firing rates because they respond to given stimuli with roughly similar temporal profiles; this would be synchrony due to their individual firing rate functions, as seen through overlapping peristimulus time histograms (PSTHs). Various methods may be used to adjust or “normalize” pairwise spiking to account for the individual time-varying firing rate functions (Aertsen, Gerstein, Habib, & Palm, 1989). Many other possible sources of pairwise spiking may be present in particular cases, including global network activity. One of our purposes here is to introduce a general framework for quantifying the contributions of alternative sources of pairwise spiking while assessing statistical evidence and uncertainty. A second purpose is to examine excess multiway spiking relative to that expected from pairwise spiking. The approach we develop melds log-linear modeling, as in equation 1.1, together with point-process regression modeling (which usually comes under the rubric of generalized linear models) as in numerous articles (Kass & Ventura, 2001; Kass, Ventura, & Brown, 2005; Kelly, Smith, Kass, & Lee, 2010a; Okatan, Wilson, & Brown, 2005; Pillow et al., 2008; Stevenson et al., 2009; Truccolo, Eden, Fellows, Donoghue, & Brown, 2005; Zhao et al., 2011). We use point-process regression to model the behavior of each individual neuron; we then overlay the structure of log-linear models to account for synchronous connections.

For notational simplicity, we concatenate the history and covariate vectors as a single vector:
When we consider only the spiking history Hit,r of neuron i at time t on trial r, we write
Let pi1,r(t|xt,r) be the probability of neuron i spiking at time t on trial r. The central object in the point-process framework is the conditional intensity function, , which is the firing rate function in continuous time, and the relationship between the continuous and discrete time representations is summarized by
1.2
the approximation being justified by passage to the limit as . The corresponding key statistical result is that the likelihood function based on the continuous-time representation, in terms of the conditional intensity function, is approximately equal to the likelihood function based on the binary data and the probability of spiking pi(t|xit,r). We use x in the notation xit,r to connote regression-style modeling of probability in terms of explanatory covariates, including history.
Now let pij11,r(t|xt,r) be the probability that neurons i and j will both spike at time t on trial r. Under assumption 1 above, we use the spike trains from each neuron i to fit that neuron's firing probabilities pi1,r(t|xt,r)=pi1,r(t|xit,r) across time and trials. Let us write such fits as . Under assumptions 1 and 2, we may define
1.3
so that represents the excess pairwise spiking above that predicted by independence (as in Ventura, Cai, & Kass, 2005b). We have written with the argument t and subscript x to indicate that is a function only of time, but because it is defined through equation 1.3, it depends indirectly on the covariates used in the individual neuron firing probabilities pi1,r(t|xt,r). For example, we may evaluate excess synchrony with and without a covariate that measures network activity. Kass et al. (2011) gave an example in which the data indicated that when a covariate for network activity was included in xt,r while when it was omitted. This suggested that excess synchrony above that expected from time-varying firing rates, while present, was due to global network activity rather than a local circuit that affected the particular pair of neurons.

We may now summarize the steps of the strategy we have implemented for estimating multiway spiking probabilities based on all combinations of excess pairwise spiking, thereby generalizing equation 1.1 to account for nonstationarity, history, and covariates. We consider first the case of n=3 neurons labeled i, j, k:

1. For all i and all t and for every trial r, fit pi1,r(t|xit,r) to get .

2. For all i, j estimate to obtain .

3. Based on the sets of estimated values , , , , , from steps 1 and 2, use an iterative algorithm to obtain the complete set of estimates of pijkabc,r(t|xt,r).

Once we have all the fitted probabilities we are able to use them to estimate quantities defined in terms of the probabilities pijkabc,r(t|xt,r). We can also use the set of fitted probabilities to simulate artificial spike trains that reflect all of the estimated multineuron dependence, and we can thereby use bootstrap simulation to obtain confidence intervals for estimated quantities. These confidence intervals incorporate the statistical uncertainty from all three steps of the fitting procedure. Similarly, we can test the null hypothesis of two-way interaction, but no three-way interaction, against the alternative that also includes three-way interaction. The bootstrap hypothesis tests also incorporate uncertainty from all the steps of the fitting procedure.

The same steps may be followed for more than three neurons. For example, for four neurons, we would again use the two-way probabilities in equation 1.5 to fit probabilities of the form and thereby test the null hypothesis of pairwise interaction against both three-way and four-way interaction. Thus, according to the approach described here, large numbers of spiking probabilities having potentially complicated forms are estimated by first fitting to each neuron a smooth conditional intensity function and any relevant covariates, then fitting two-way excess synchronous spiking terms, then finding multiway probabilities by iterative optimization.

To help fix these ideas, Figure 1 displays a simple situation involving time-varying firing rates from three artificial neurons (the dotted lines in panels A–C) where two-way synchronous spiking occurs more often than predicted by independence (the dotted lines in panels D–F), and three-way interaction occurs more often than that predicted by a two-way model. We simulated a large number of trials from this model. When a two-way interaction model was fitted to the simulated data, the one-way and two-way firing rates were estimated accurately (the solid lines in panels A–F), but the amount of three-way spiking was underestimated (the solid line in panel G).

Figure 1:

Time-varying two-way interaction model in the presence of three-way spiking. In all panels, the dotted line is the simulated rate and the solid line is the estimated rate based on the fitting algorithms discussed in section 2. (A–C) Three simulated cells, each with a different firing rate profile. Each cell has an elevation of the firing rate at a different point in the trials. (D–F) Coincidence rates for pairwise synchronous events, which are uniformly elevated above those predicted by independence. Panel D corresponds to coincidences between neurons shown in panles A and B, panel C to coincidences between panels A and C, and panel F to coincidences between panels B and C. (G) Three-way coincidence rates. The pairwise model underestimates the number of three-way synchronous events.

Figure 1:

Time-varying two-way interaction model in the presence of three-way spiking. In all panels, the dotted line is the simulated rate and the solid line is the estimated rate based on the fitting algorithms discussed in section 2. (A–C) Three simulated cells, each with a different firing rate profile. Each cell has an elevation of the firing rate at a different point in the trials. (D–F) Coincidence rates for pairwise synchronous events, which are uniformly elevated above those predicted by independence. Panel D corresponds to coincidences between neurons shown in panles A and B, panel C to coincidences between panels A and C, and panel F to coincidences between panels B and C. (G) Three-way coincidence rates. The pairwise model underestimates the number of three-way synchronous events.

### 1.2.  Overview of This Letter.

In section 2 we provide methodological details. We begin in section 2.1 by briefly summarizing the fitting methods for individual neuron probabilities, then go on to pairwise probabilities in section 2.2, giving procedures for bootstrap confidence intervals in section 2.3. In section 2.4 we present the algorithm for fitting multiway probabilities under the assumption of pairwise interaction but not higher-order interaction, in section 2.5 we discuss bootstrap confidence intervals for functions of these probabilities, and in section 2.6 we describe bootstrap hypothesis tests. In section 2.7 we review the essential motivation for the three-step approach outlined in section 1.1 in terms of what we call hierarchical sparsity. In section 3 we provide a real-data illustration by analyzing some simultaneous spiking data recorded from primary visual cortex. We then, in section 4, use the framework described here to address a fundamental question: How many data would be needed to distinguish three-way interaction from the three-way spiking that occurs by chance from pairwise interaction models? In section 5 we add remarks about the utility of this method in practice.

## 2.  Methodology

Here, and for the remainder of the letter, we omit explicit reference to the trial r, leaving it implicit. We also omit the subscript x on .

### 2.1.  Fitting One-Way Probabilities.

Individual neuron (one-way) firing probabilities may be fitted by invoking smooth point-process models, where each neuron i has a firing rate governed by a conditional intensity function Kass and Ventura (2001) and Kass, Ventura, and Cai (2003) discussed spline-based fitting of conditional intensity functions, and Pillow et al. (2008) used an alternative set of smooth basis functions that incorporate both rate variation and history dependence. Sometimes history and covariate effects may be ignored, and then the PSTH may be smoothed by gaussian filters (kernel smoothers), fixed-knot splines, or more sophisticated methods such as BARS (DiMatteo, Genovese, & Kass, 2001). The first step of the approach suggested here is to apply one of these individual neuron models in order to obtain estimates for all t.

### 2.2.  Fitting Pairwise Probabilities.

We now move on to pairwise probabilities, considering neurons i and j as in equation 1.3. While we could incorporate time-varying excess spiking effects as in equation 1.3 here, instead we further specialize by assuming that the excess pairwise spiking probability effects are constant across time. We therefore take and define
2.1
This is helpful when there are not large numbers of joint spikes, and it also allows us to use a simple and intuitive estimator of given in equation 2.6. By inserting the estimates , , and into the right-hand side of equation 2.1, we obtain estimates of the two-way probabilities pijab(t|xt) as follows:
2.2
These estimates, for i, j=1, 2, 3, give us all of the two-way probabilities needed to fit a two-way interaction model analogous to equation 1.1 that instead incorporates time-varying, history-dependent, and covariate-dependent firing probabilities.
To estimate , we fix the values of pi1(t|xit) to be . Using the point-process representation of joint spiking for neurons i and j derived by Kass et al. (2011), we replace with and write the synchronous-spike likelihood function as
2.3
where tij are the times of the joint (synchronous) spikes and
2.4
Equation 2.4 is the conditional-intensity analogue of equation 2.1. Setting the derivative of to zero and solving gives the maximum likelihood estimate
2.5
where Nij is the total number of joint spikes for neurons i and j and the denominator is the expected number of joint spikes under independence, after taking account of the covariate x. This corresponds to equation 22 of Kass et al. (2011). The formula we use in practice replaces the integral in equation 2.5 with
2.6
To interpret equation 2.6, let us reemphasize the covariate x by restoring its use as a subscript: on the left-hand side of equation 2.6, we replace by . If , the value of represents the proportionate excess synchronous spiking beyond that explained by the covariate x, while if , the value of is the proportionate diminution of synchrony, below that explained by x. When , it is helpful to consider the reciprocal,
2.7
which ranges from 0 to 1 and represents the proportion of synchronous spikes explained by the covariate vector x (including its time-varying, trial-averaged firing rate). For example, if when x contains only the time-varying, trial-averaged firing rate, then Ex=.5, and we would say the time-varying, trial-averaged firing rate explains half the synchronous spikes. This gives us a way of interpreting the relative effects of various alternative covariate vectors x, as we illustrate in section 3. If , we can define the inhibitory measure , which then represents the proportion of observed spikes relative to the number predicted by x.

### 2.3.  Confidence Intervals for Pairwise Effects.

Standard errors and confidence intervals associated with the pairwise estimates in equation 2.6 may be obtained by a parametric bootstrap using equation 2.2. That is, equations 2.2 specify a set of multinomial spiking probabilities at each time t (and for each trial, separately, due to separate spiking history or covariate effects) for every spiking pattern (1,1),(1,0),(0,1),(0,0). It is straightforward to generate G complete sets of pseudo-data pairwise spike trains based on these probabilities, each set of pseudo-data replicating the layout of the original data (a typical value of G being 500). Let us use to index the sets of pseudo-data. If we wish to obtain a 95% bootstrap confidence interval for a parameter (such as ), we use the pseudo-data set g to obtain for ; that is, for every g, we obtain , by applying equation 2.6 to the pseudo-data, where all the terms are computed from the pseudo-data. We then order the values of , find the .025 and .975 percentiles, and use them as end points of the 95% confidence interval. A standard error is similarly obtained as the standard deviation of the values of . When summarizing results with estimates and standard errors, we prefer to work with rather than because the log transform tends to symmetrize the distribution (and thus make it closer to normal, as it should be when one interprets standard errors). Because we include the refitting of to the pseudo-data, the bootstrap confidence intervals account for the uncertainty in that first step of the fitting procedure.

### 2.4.  Fitting Higher-Order Probabilities.

For n=2 neurons, equations 2.2 completely determine the estimated probabilities we need for statistical analysis. For n=3 neurons, pairwise interaction models do not provide analogous closed-form expressions for all the necessary probabilities p123abc(t|xt), and the same is true for n>3. Even in the simpler stationary Poisson setting, where the data become counts aggregated across time and model 1.1 may be applied, it is well known (Agresti, 2002) that maximum likelihood estimation of the two-way interaction model requires iterative methods. The standard algorithm in that context is iterative proportional fitting (IPF; Agresti, 2002; Schneidman et al., 2006).

Before reviewing IPF and the modification of it we employ here, we wish to make sure it is clear what model we are referring to as the “two-way interaction model” that we are fitting to the spike train data. It is helpful to return to the simpler model, equation 1.1, and explain how our approach would apply in that setting. The complete model for three-way probabilities pabc, including three-way interactions (in statistics, it is often called “saturated”), has 8−1=7 free parameters: the 8 values pabc must satisfy the constraint . The two-way interaction model, equation, 1.1, omits the three-way interaction and has six free parameters. (In equation 1.1 there are seven parameters with nonzero multipliers and, again, there is the constraint .)1 In our approach, we do not need to use the parameters that appear in equation 1.1. Instead, we effectively use a different set of six free parameters. We define
2.8
and then parameterize equation 1.1 using We are not able to write a closed-form expression for pabc in terms of the parameter vector for the same reason that we are unable to write closed-form maximum likelihood estimates for pabc in equation 1.1. However, any particular value of the vector does define a particular set of values of pabc according to equation 1.1 based on the nonlinear equations in equation 2.8. In the general case we are concerned with here, we similarly use to define, for every t, a two-way interaction model of the form 1.1 that we are fitting to the data.
Now, to explain the modified IPF that achieves the desired fitting, let us begin with IPF in the standard setting involving counts, with nabc being the number of time bins in which the pattern (a, b, c) occurred and mabc being the expected number according to the two-way interaction model, 1.1, where mabc=npabc and is the total number of spikes. IPF produces estimates of mabc. With nab+ being the number of bins in which the first two neurons have the pattern (a, b), and na+c, and n+bc defined similarly, the first cycle of the IPF algorithm, as in Agresti (2002), involves three steps:
Iterating these steps produces convergence to the maximum likelihood estimates under very general conditions (Haberman, 1974).
To fit our general time-varying and/or history-dependent and/or covariate-dependent two-way interaction model, we replace the counts in the standard IPF above with the probability estimates given by equation 2.2 and the expected values by the probabilities pijkabc(t|xt). Here, IPF will produce a set of estimates , which we write as
Let
and let and be defined analogously. Given values at iteration g, we obtain the next three updates as
2.9
For iteration g, the values of the form are obtained from
Because there are time values and R trials, equation 2.9 produces sets of three equations—for example, for each of the data analyses in section 3, there were 24,000 sets of three equations, which were iterated to convergence. There are many ways to initialize the algorithm, including taking . We have found in practice that the algorithm converges in only a few iterated sets of the three steps in equation 2.9. We summarize the fitting algorithm next.

Algorithm for Fitting the Generalized Two-Way Interaction Model

1. For all i and all t, and for every trial, fit pi1(t|xit) to get .

2. For all i, j, use equation 2.6 to obtain .

3. Initialize; then iterate equation 2.9 to convergence.

The output of this algorithm is the complete set of estimated probabilities .

When there are n>3 neurons we may apply the same method for fitting the two-way interaction model, thereby obtaining estimated probabilities. Clearly as n grows, it quickly becomes infeasible to compute a multiple of 2n quantities. We mention this again in section 5.

### 2.5.  Estimates and Confidence Intervals for Functions of Probabilities.

Once we have these fitted probabilities , we may also estimate any quantity that may be written as a function of them,
simply by plugging in the probability estimates:
We may then compute standard errors and confidence intervals using a parametric bootstrap as in the n=2 case, outlined following equation equation 2.6, using the probabilities to generate the three-way pseudo-data spike trains.
One of the benefits of two-way models such as equation 1.1 is that the two-way interaction coefficients may be used to provide definitions of the specific functional connectivity between two neurons after taking account of a third neuron (Martignon et al., 2000). For this purpose, let us define
2.10
and denote other marginal probabilities by analogous notations, then consider defined by
2.11
which we estimate with
2.12
where is the total number of (1, 1, 0) spike patterns. The estimator gives us a measure of the excess spiking activity of neurons i and j that is unrelated to the activity of neuron k. We do not mean to suggest that this particular measure has some special stature—others might be envisioned—but it is intuitive and easily computed within the framework we are describing here.
We may also define the excess three-way spiking and estimate it. For this purpose, let pijk111(t|xt) denote the three-way spiking probability under the general three-way model, which includes three-way interaction. We define by
2.13
and estimate it with
2.14
where Nijk is the total number of triplet joint spikes for neurons i, j, k, and obtained from fitting the two-way interaction model. Formula 2.14 is analogous to formula 2.6 and may be derived by an analogous argument.2 Once again, standard errors and confidence intervals may be obtained from the parametric bootstrap, but now the three-way interaction term is included in the model. Specifically, the spike trains are generated from the estimates obtained from using equation 2.14:
2.15
We illustrate by applying equations 2.12 and 2.14, together with bootstrap confidence intervals, in section 3.

### 2.6.  Hypothesis Tests.

When n=2, the null hypothesis concerning in equation 2.1 is that of independence, . To test H0, we may use a parametric bootstrap, with spike trains generated by the fitted independence model,
2.16
which simply requires that we generate the neuron i and neuron j spike trains independently using and . There are several choices for the test statistic. Let us suppose we use Nij, and let Nijobs denote the value computed from the data. We generate bootstrap pseudo-data, with the same number of trials as the real data, based on and , and repeat this procedure G times (e.g., we might take G=10, 000). We let label the pseudo-data, and for set g of pseudo-data we compute Nij and label its value Nij(g). The bootstrap p-value is then
2.17
This p-value satisfies the usual properties of the parametric bootstrap, that is, it furnishes approximately the correct probability of rejecting H0 under the assumed null model. In practice, if the numerator is 0, we do not report p=0 but rather say p<1/G. Some authors prefer to add 1 to both the numerator and denominator of equation 2.17.

When n=3, to test whether the two-way interaction model is adequate to explain all the three-way spikes, we take the general two-way interaction model as H0. We may then use a parametric bootstrap by generating pseudo-data spike trains according to the probabilities of section 2.4 and can use the total number of three-way spikes Nijk as the test statistic. If we again label the sets of pseudo-data with and the data-based value of Nijk with Nijkobs, we obtain the p-value by replacing the numerator in equation 2.17 with the number of values of g such that .

When n>3, we may apply the bootstrap test not only to the triplet spikes from all combinations of three neurons using Nijk, but also the quadruplet spikes from all combinations of four neurons using Nijkl, and so forth, in principle up to Nijk⋅⋅⋅n.

The p-value in equation 2.17 is one-sided in the sense that only excess joint spiking is being assessed. To check for either excess joint spiking or diminished joint spiking, we could instead use as the test statistic. Then, with being the value computed from set g of the pseudo-data and being the value computed from the real data, we would define
2.18
as a p-value for the two-sided test. When , equation 2.18 finds the proportion of sets of pseudo-data for which either or . Higher-order tests may be modified similarly.

### 2.7.  Motivation from Hierarchical Sparsity.

Central to the strategy in sections 2.1 to 2.5 is the idea that we may perform fitting hierarchically using the three steps articulated in section 1.1 and realized in the algorithm of section 2.5. The main justification for this procedure is based on the rate at which pairwise and multiway spikes occur.

For small bin width (e.g., seconds), each firing probability pi1(t|xt), is also small and, as we have
2.19
According to equation 2.1, we then have
2.20
Equations 2.19 and 2.20, together with higher-order counterparts such as
provide a formal expression of hierarchical sparsity. Kass et al. (2011) applied hierarchical sparsity in deriving point-process representations that approximate discrete-time models, analogous to equation 1.2. We use the idea here to motivate our hierarchical fitting procedure. Under hierarchical sparsity, there is likely to be good information (many spikes) available to estimate one-way probabilities, but as we move up the hierarchy of interactions, the information degrades: there are many fewer two-way spikes, and then even fewer three-way spikes, and so on. Although we do not offer a more precise theoretical statement, it is apparent that under hierarchical sparsity, the hierarchical fitting procedure should produce estimates of the multiway probabilities that capture well the available information in the data.

## 3.  Data Analysis

We illustrate the approach by analyzing two sets of three neurons recorded from primary visual cortex, as described in Kelly, Smith, Kass, and Lee (2010b) and Kass et al. (2011). In each case, the neural responses were recorded from an anesthetized monkey while sinusoidal gratings were moved across the visual field. The data in Figures 2 and 3 correspond to 1 second of the recordings, and there are 120 repeated trials. Both figures display raster plots from three cells, with dark circles superimposed to indicate triplet firing within 5 ms time bins. The three cells chosen for Figure 2 are different from the three chosen for Figure 3, except that cell three in Figure 2 is the same as cell 2 in Figure 3. In Figure 2, there are 11 such triplets in total, across the 120 trials, while in Figure 3, there are 12 triplets. To implement step 1 of the fitting procedure, in which we fit time-varying individual-neuron firing rate functions (see section 2.1), we used a gaussian filter with bandwidth ms to smooth each PSTH, as shown at the bottom of panels A, B, and C in Figures 2 and 3. Results were not sensitive to choice of bandwidth. In related analyses, we used spline fits, including BARS, but chose gaussian filters here for speed of implementation, which was important for the simulation study reported in section 4. Panel D in each figure shows the bootstrap distribution of the number of triplets obtained under the null model using , as described in section 2.6. (We used G=500 bootstrap samples.) For the neurons corresponding to Figure 2, N123obs=11 is in the extreme tail of the bootstrap distribution and, applying equation 2.17, the test has a significant p-value with p=.002. On the other hand, for the neurons corresponding to Figure 3, N123obs=12 is in middle of the bootstrap distribution, and the test is not significant (p=.52).

Figure 2:

(A–C) Raster plots, with PSTHs shown below, for three neurons recorded simultaneously from primary visual cortex. Dark circles indicate three-way synchronous spikes. (D) Histogram of bootstrap values Nijk(g). The value Nijkobs=11 is in the tail of this distribution.

Figure 2:

(A–C) Raster plots, with PSTHs shown below, for three neurons recorded simultaneously from primary visual cortex. Dark circles indicate three-way synchronous spikes. (D) Histogram of bootstrap values Nijk(g). The value Nijkobs=11 is in the tail of this distribution.

Figure 3:

(A–D) Same as Figure 2, except the data are from three different neurons. In this case the value Nijkobs=12 is in the middle of the bootstrap histogram in panel D.

Figure 3:

(A–D) Same as Figure 2, except the data are from three different neurons. In this case the value Nijkobs=12 is in the middle of the bootstrap histogram in panel D.

For the neurons in Figure 2, applying equation 2.14 and then bootstrapping based on eqaution 2.15, we obtained the estimate of is with approximate 95% confidence interval (1.7,4.3). We also computed estimates and confidence intervals for each using equation 2.12 and then again bootstrapping based on equation 2.15.3 We obtained
Thus, interestingly, the estimates were all statistically indistinguishable from the null value of 1. This indicates that once the three-way interactions are considered, there are no longer any significant two-way effects. We suspect the excess three-way spiking among these neurons is due to the slow-wave activity discussed by Kelly et al. (2010b) and Kass et al. (2011). For the neurons in Figure 3, all the corresponding estimated effects and confidence intervals were consistent with null values of 1.

The results given above did not include individual-neuron history terms. When we included history, based on spike count in the preceding 100 milliseconds, the three-way hypothesis test results were essentially unchanged: individual-neuron history effects apparently picked up network activity so that pairwise interactions became nonsignificant, but the three-way interaction for the neurons in Figure 2 remained highly significant. (This was not sensitive to the width of history window.)

A related example of the approach presented here, focusing on two-way interaction, was given in Kass et al. (2011). Two alternative pairs of neurons were analyzed, with and without covariates xit that had two components: the spike count for neuron i in the previous 100 ms and the total spike count among all neurons in the previous 100 ms. Thus, xit represents the recent network history, including that of neuron i itself but not including that of neuron j. We summarize the results in terms of , as in equation 2.7, the proportion of synchronous spikes explained by covariates x in Table 1. In the table, we have written to indicate that the bootstrap test of was not rejected for pair 1 when the network history covariates were included. The bootstrap 95% confidence intervals for were pair 1: (.79,1.4) and pair 2: (1.4,3.6). The main findings are, first, for both pairs, there is highly significant synchronous activity beyond that due to firing rate ( for pair 1, p<.0001 for pair 2) with only about 50% and 40% of the synchronous spikes explained by firing rate for the two pairs.4 Second, for pair 1, network activity appears to explain synchronous spiking, but for pair 2 it does not (p=.0002 from the bootstrap test of H0). Indeed, from the pair 2 values of Ex, the network activity explains only a very small additional proportion of spikes beyond the time-varying firing rate. For pair 2, there is excess synchronous spiking that presumably is associated with the stimulus.

Table 1:
Proportion Ex of Synchronous Spikes Explained by x.
Covariate xPair 1Pair 2
Average firing rate .52 .40
Time-varying firing rate .50 .42
Network history  .48
Covariate xPair 1Pair 2
Average firing rate .52 .40
Time-varying firing rate .50 .42
Network history  .48

For these two pairs of neurons, we have also examined whether there is evidence of time-varying synchrony. The interesting case would be time-varying synchrony within a particular stimulus epoch. We considered the 300 ms corresponding to each stimulus and decomposed these into the first 100 ms and the latter 200 ms, then used equation 2.6 within each of these time intervals and computed confidence intervals for both the early part of the time interval and the later part of the time interval. In every case, these two confidence intervals strongly overlapped, indicating no evidence in favor of time-varying synchrony. This may have been due to the relatively small number of synchronous spikes available within these 300 ms stimulus conditions across the 120 trials.

## 4.  The Power of Tests for Three-Way Interaction

We next investigate the amount of data needed to reliably detect three-way interaction. We chose four data-generating scenarios. For the first two, we used a model representing individual-neuron contributions together with three-way interaction,
4.1
while for the latter two, we used a model representing two-way interaction together with three-way interaction:
4.2
For each model, we simulated spike trains across 1 s, using bin width (5 ms), with individual-neuron firing rates adjusted to be either 5 Hz or 10 Hz. In equation 4.2, we then adjusted the coefficients so that the two-way excess firing would be . The combination of the two models with the two firing rates constituted our four scenarios. For each scenario, we chose a grid of values of excess three-way synchrony given by equation 2.13 and a grid of values R for the number of trials. For each combination of and R, we computed the probability of rejecting H0 after fixing the rejection cutoff so that the probability of rejection under H0 (the -level) was .05, as is customary. The results are shown in Figure 4, with black lines indicating the combinations of and R that produce the customary minimally acceptable power values of .8. The figure is based on 1000 replications of each simulation. Specifically, for each and R, the figure displays smoothed proportions, out of 1000, of the replications that rejected H0.
Figure 4:

Power analysis for detecting three-way interaction. Results for four scenarios are given in the four plots. Model 4.1 was used for panels A and B, while model 4.2 was used for panels C and D. For panels A and C, the individual-neuron firing rates were set to 5 Hz, and for panels B and D, they were set to 10 Hz. The pairwise synchrony coefficient for panels C and D is 2. In each panel, the x-axis is the value of the excess three-way firing rate , for example, indicates twice as many triplet spikes as would be observed under the null model. The y-axis is the number of trials R. The bold lines indicate the values of and R for which the power was .8.

Figure 4:

Power analysis for detecting three-way interaction. Results for four scenarios are given in the four plots. Model 4.1 was used for panels A and B, while model 4.2 was used for panels C and D. For panels A and C, the individual-neuron firing rates were set to 5 Hz, and for panels B and D, they were set to 10 Hz. The pairwise synchrony coefficient for panels C and D is 2. In each panel, the x-axis is the value of the excess three-way firing rate , for example, indicates twice as many triplet spikes as would be observed under the null model. The y-axis is the number of trials R. The bold lines indicate the values of and R for which the power was .8.

The plots in Figure 4 indicate that in order to have a substantial probability of detecting three-way interaction, one needs either a large value of or a large number of trials R, or both. For example, when there are, on average, twice as many triplets as would occur under H0 (), in order to reject H0 with power .8 in the 5 Hz scenario of panel A, one would need more than 700 trials (of 1 s duration). For values of clearly below 2, which could be realistic, it becomes extremely difficult to reliably reject H0. More than 1000 trials would be needed, and the extrapolation of the power curves indicates the number grows quickly as decreases. When the number of three-way spikes is larger, the situation is more favorable. When the firing rates are higher for the 10 Hz scenario of panel B, even there one needs about 150 trials to reliably reject H0 when . Panels C and D show the same two firing rates but with injected pairwise correlation. The pairwise correlation effectively increases the number of three-way spikes, and thus the power is larger when pairwise correlation is present. For the scenario in panel C, 200 trials are need to reliably reject H0, and in the most favorable situation of panel D, 75 trials are needed.

## 5.  Discussion

Our main purpose has been to generalize statistical assessment of synchrony based on log-linear modeling (using maximum entropy models) so that it can accommodate time-varying firing rates or non-Poisson history effects or time-varying covariates. We have presented methodological details to supplement the theoretical treatment of Kass et al. (2011), which served to provide a point-process foundation for discrete-time modeling. The hierarchical approach described here is sufficiently simple that the generalized two-way model, together with multiway estimates and hypothesis tests, can be implemented relatively easily and effectively. We have produced Matlab code that may be accessed at http://www.cnbc.cmu.edu/~rkelly/code/synchrony. The code includes inputs for user-defined history and covariate effects.

The chief theoretical novelty in our approach is to recognize the strong heuristic of hierarchical sparsity, as articulated in section 2.7: if multiway spikes are relatively rare compared to individual-neuron spikes, then it should be useful to conceptualize excess spiking as involving multiway gain factors and to combine these with individual-neuron firing probabilities described by point-process regression models. The strategy adopted here is different from other statistical treatments of multiple binary time series, such as in Liang and Zeger (1989), which are not aimed at point processes and do not consider the special circumstance, and opportunity, presented by hierarchical sparsity; they would be considerably more cumbersome in this setting. On the other hand, hierarchical sparsity is not universally valid. It may happen that population-level activity produces substantial bursts of multiway spiking, as illustrated in panel C of Figure 5. The data in this figure come from the same experiment as the data used in section 3. Our approach assumes the individual-neuron conditional intensity functions account for this kind of shared activity. If conditional intensity functions failed to include appropriate covariates to identify population activity, then multiway spiking might no longer be rare compared with individual-neuron spiking and hierarchical sparsity might no longer be applicable. Also, network activity such as that in Figure 5 would be an example source of trial-to-trial variation; in our approach, all important sources of trial-to-trial variation must be included by defining suitable covariates. It would be interesting to consider incorporating into the framework described here latent variables to accommodate network bursting (Chen, Vijayan, Barbieri, Wilson, & Brown, 2009; Tokdar, Xi, Kelly, & Kass, 2010; Wu, Kulkarni, Hatsopoulos, & Paninski, 2009), but that is a topic for future research. Kass et al. (2011) used a covariate based on population spike counts in point-process regression models to analyze synchrony, following the approach specified in greater detail here.

Figure 5:

Neural spike train raster plots for repeated presentations of a drifting sine wave grating stimulus (from Kass et al., 2011). Recordings were made in V1. (A) Single-cell responses to 120 repeats of multiple sinusoidal grating stimuli. At the top is a raster corresponding to the spike times, and below is a peristimulus time histogram for the same data. (B) Same as panel A, for a different cell. (C) Population responses to the same stimulus, for five repeats. Each block, corresponding to a single trial, is the population raster for n=125 units. On each trial, there are several dark bands, which constitute bursts of network activity.

Figure 5:

Neural spike train raster plots for repeated presentations of a drifting sine wave grating stimulus (from Kass et al., 2011). Recordings were made in V1. (A) Single-cell responses to 120 repeats of multiple sinusoidal grating stimuli. At the top is a raster corresponding to the spike times, and below is a peristimulus time histogram for the same data. (B) Same as panel A, for a different cell. (C) Population responses to the same stimulus, for five repeats. Each block, corresponding to a single trial, is the population raster for n=125 units. On each trial, there are several dark bands, which constitute bursts of network activity.

The usual parameters in equation 1.1 have some statistical virtues, as do the orthogonalized parameters discussed by Amari (2009). We have worked with a different parameterization, given in equation 2.8, with the chosen due to interpretability as a gain factor for increased synchronous firing rate. This also led to our suggestion of in equation 2.11 as a measure of functional connectivity of neurons i and j in the presence of neuron k. One could instead introduce a generalized version of the usual parameters in equation 1.1 to account for time-varying firing rates, for example. In the context of synchrony investigation, we find more directly interpretable, but this is a matter of taste and convenience.

We also conducted a power study, the main conclusion of which is that large numbers of trials are likely needed in order to detect realistic multiway spiking above that determined by two-way interactions. In the most favorable case we examined, where cells fire at 10 Hz, approximately 75 1 s trials are needed to reliably detect excess three-way spiking that produces double the number of spikes expected under the two-way model (and this assumed that diminished three-way spiking is not of interest so that a one-sided test could be used; a two-sided test would require considerably more data). When there are lower firing rates, or less dramatic excess firing, the ability to reliably detect excess three-way spiking deteriorates, and very long recording sessions will be needed. With the code we have made available, additional scenarios may be investigated so that the power of new experiments to find excess three-way synchrony could be considered carefully.

We have assumed that excess multiway spiking is constant in time. At least in the case of two-way spiking, there will be opportunities to examine stimulus-related time-locked increases in synchrony, with modest amounts of data, which could have important physiological relevance (Riehle, Grün, Diesmann, & Aertsen, 1997). In our framework, this would require estimating in equation 1.3, and an illustration of such an estimate was given by Ventura et al. (2005b) based on low-order spline fitting. An alternative would be to apply formula 2.6 repeatedly across distinct time intervals, and a continuous-time estimate could also be obtained by windowing or smoothing the numerator and denominator of equation 2.6, analogous to what was done in Kass et al. (2003). From our power results in section 4, however, we would expect time-varying multiway interactions to require either high firing rates or extended recording sessions. Indeed, when we examined data from two pairs of neurons in section 3 and found no evidence in favor of nonconstant , we observed that we had relatively few synchronous spikes to work with, and thus little statistical power to detect nonconstant . In the context of time-varying synchrony, it is also worth pointing out that in the presence of time-varying individual-neuron firing rates, alternative models of excess synchrony are not equivalent (Ito & Tsuji, 2000). In particular, the assumption in equation 1.1 that is constant in time is not equivalent to the assumption that is constant in time.

We have applied the IPF procedure of section 2.2 to triples of neurons. It could be applied to larger sets of n>3 neurons, but the number of probabilities that must be fitted is 2n, so as n grows, the computation will quickly become infeasible. We believe it remains possible to treat n neurons, even when n is large, but this will require additional methods and is a subject for future investigation. An important alternative to the parametric bootstrap described here is based on spike jittering, as in Harrison and Geman (2009). The jitter null hypothesis, however, is different from the bootstrap null of section 2.3 (see Harrison et al., in press). In future work we also plan to investigate the use of jitter in conjunction with the model-based approach described here.

Finally, no discussion of synchrony is complete without some reference to the problem of spike sorting, which undoubtedly can have an impact on synchrony detection. Considerable effort went into the characterization of spike waveforms in the data analyzed here (Kelly et al., 2007), but the problem is difficult (Harris, Henze, Csicsvari, Hirase, & Buzsaki, 2000; Ventura, 2009). We trust spike misclassification will be mitigated as recording technologies advance.

## Notes

1

In the usual way equation 1.1 is presented in the statistics literature, there are 19 parameters and 13 constraints.

2

Formula 2.14 is asymptotically equivalent, as , to equation 23 of Kass et al. (2011), but it may differ in practice and seems preferable because of the direct connection between equations 2.13 and 1.1.

3

We used equation 2.15 rather than the two-way model because the test of was significant.

4

The values of Ex are similar for the overall time-averaged firing rate (a constant firing rate) and for the time-varying firing rate; this depends on the “signal correlation,” that is, the overlap of the PSTHs, with the latter typically being bigger than the former when there is substantially shared time-varying response to the stimulus and, thus, relatively large signal correlation.

## References

Aertsen
,
A. M. H. J.
,
Gerstein
,
G. L.
,
Habib
,
M. K.
, &
Palm
,
G.
(
1989
).
Dynamics of neuronal firing correlation: Modulation of effective connectivity
.
J. Neurophys.
,
61
,
900
917
.
Agresti
,
A.
(
2002
).
Categorical data analysis
(2nd ed.).
Hoboken, NJ
:
Wiley
.
Amari
,
S.-I.
(
2009
).
Measure of correlation orthogonal to change in firing rate
.
Neural Comput.
,
21
,
960
972
.
Chen
,
Z.
,
Vijayan
,
S.
,
Barbieri
,
R.
,
Wilson
,
M. A.
, &
Brown
,
E. N.
(
2009
).
Discrete- and continuous-time probabilistic models and inference algorithms for neuronal decoding of up and down states
.
Neural Comput.
,
21
,
1797
1862
.
DiMatteo
,
I.
,
Genovese
,
C. R.
, &
Kass
,
R. E.
(
2001
).
Bayesian curve-fitting with free-knot splines
.
Biometrika
,
88
,
1055
1071
.
Grün
,
S.
(
2009
).
Data-driven significance estimation for precise spike correlation
.
J. Neurophysiol.
,
101
,
1126
1140
.
Gütig
,
R.
, &
Aertsen
,
A.
(
2003
).
Analysis of higher-order neuronal interactions based on conditional inference
.
Biol. Cybernetics
,
88
,
352
359
.
Haberman
,
S. J.
(
1974
).
The analysis of frequency data
.
Chicago
:
University of Chicago Press
.
Harris
,
K. D.
,
Henze
,
D. A.
,
Csicsvari
,
J.
,
Hirase
,
H.
, &
Buzsaki
,
G.
(
2000
).
Accuracy of tetrode spike separation as determined by simultaneous intracellular and extracellular measurements
.
J. Neurophysiol.
,
84
,
401
414
.
Harrison
,
M. T.
,
Amarasingham
,
A.
, &
Kass
,
R. E.
(
in press
).
Statistical identification of synchronous spiking
. In
P. Di Lorenzo & J. Victor
(Eds.),
Spike timing: Mechanisms and function
.
London
:
Taylor and Francis
.
Harrison
,
M. T.
, &
Geman
,
S.
(
2009
).
A rate and history-preserving algorithm for neural spike trains
.
Neural Comput.
,
21
,
1244
1258
.
Ito
,
H.
, &
Tsuji
,
S.
(
2000
)
Model dependence in quantification of spike interdependency by joint peri-stimulus time histogram
.
Neural Comput.
,
12
,
195
217
.
Kass
,
R. E.
,
Kelly
,
R. C.
, &
Loh
,
W.-L.
(
2011
).
Assessment of synchrony in multiple neural spike trains using loglinear point process models
.
Annals Appl. Statist.
,
5
,
1262
1292
.
Kass
,
R. E.
, &
Ventura
,
V.
(
2001
).
A spike-train probability model
.
Neural Comput.
,
13
,
1713
1720
.
Kass
,
R. E.
,
Ventura
,
V.
, &
Brown
,
E. N.
(
2005
).
Statistical issues in the analysis of neuronal data
.
J. Neurophysiol.
,
94
,
8
25
.
Kass
,
R. E.
,
Ventura
,
V.
, &
Cai
,
C.
(
2003
).
Statistical smoothing of neuronal data
.
Network: Comput. Neural Systems
,
14
,
5
15
.
Kelly
,
R. C.
,
Smith
,
M. A.
,
Kass
,
R. E.
, &
Lee
,
T.-S.
(
2010a
).
Accounting for network effects in neuronal responses using L1 penalized point process models
. In
J. Lafferty, C.K.I. Williams, J. Shawe-Taylor, R. S. Zemel, & A. Culotta
(Eds.),
Advances in neural information processing systems
,
23
.
Red Hook, NY
:
Curran Associates
.
Kelly
,
R. C.
,
Smith
,
M. A.
,
Kass
,
R. E.
, &
Lee
,
T.-S.
(
2010b
).
Local field potentials indicate network state and account for neuronal response variability
.
J. Comput. Neurosci.
,
29
,
567
579
.
Kelly
,
R. C.
,
Smith
,
M. A.
,
Samonds
,
J. M.
,
Kohn
,
A.
,
Bonds
,
A. B.
,
Movshon
,
J. A.
, et al. (
2007
).
Comparison of recordings from microelectrode arrays and single electrodes in the visual cortex
.
J. Neurosci.
,
27
,
261
264
.
Liang
,
K. Y.
, &
Zeger
,
S. L.
(
1989
).
A class of logistic regression models for multivariate binary time series
.
J. Amer. Statist. Assoc.
,
84
,
447
451
.
Martignon
,
L.
,
Deco
,
G.
,
,
K.
,
Diamond
,
M.
,
Freiwald
,
W.
, &
,
E.
(
2000
).
Neural coding: Higher-order temporal patterns in the neurostatistics of cell assemblies
.
Neural Comput.
,
12
,
2621
2653
.
Nakahara
,
H.
, &
Amari
,
S.
(
2002
).
Information geometric measure for neural spikes
.
Neural Comput.
,
14
,
2269
2316
.
Okatan
,
M.
,
Wilson
,
M. A.
, &
Brown
,
E. N.
(
2005
).
Analyzing functional connectivity using a network likelihood model of ensemble neural spiking activity
.
Neural Comput.
,
17
,
1927
1961
.
Pillow
,
J. W.
,
Shlens
,
J.
,
Paninski
,
L.
,
Sher
,
A.
,
Litke
,
A. M.
,
Chichilnisky
,
E. J.
, et al
(
2008
).
Spatio-temporal correlations and visual signalling in a complete neuronal population
.
Nature
,
454
,
995
999
.
Riehle
,
A.
,
Grün
,
S.
,
Diesmann
,
M.
, &
Aertsen
,
A.
(
1997
).
Spike synchronization and rate modulation differentially involved in motor cortical function
.
Science
,
278
,
1950
1953
.
Schneidman
,
E.
,
Berry
,
M. J.
,
Segev
,
R.
, &
Bialek
,
W.
(
2006
).
Weak pairwise correlations imply strongly correlated network states in a neural population
.
Nature
,
440
,
1007
1012
.
Stevenson
,
I. H.
,
Rebesco
,
J. M.
,
Hatsopoulos
,
N. G.
,
Haga
,
Z.
,
Miller
,
L. E.
, &
Körding
,
K. P.
(
2009
).
Bayesian inference of functional connectivity and network structure from spikes
.
IEEE Trans. Neural Systems and Rehabilitation
,
17
,
203
213
.
Tokdar
,
S.
,
Xi
,
P.
,
Kelly
,
R. C.
, &
Kass
,
R. E.
(
2010
).
Detection of bursts in extracelluar spike trains using hidden semi-Markov point process models
.
J. Comput. Neurosci.
,
29
,
203
212
.
Truccolo
,
W.
,
Eden
,
U. T.
,
Fellows
,
M. R.
,
Donoghue
,
J. P.
, &
Brown
,
E. N.
(
2005
).
A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects
.
J. Neurophys.
,
93
,
1074
1089
.
Uhlhaas
,
P. J.
,
Pipa
,
G.
,
Lima
,
B.
,
Melloni
,
L.
,
Neuenschwander
,
S.
,
Nikolić
,
D.
, et al. (
2009
).
Neural synchrony in cortical networks: History, concept and current status
.
Frontiers in Integrative Neuroscience
,
3
.
Ventura
,
V.
(
2009
).
Traditional waveform based spike sorting yields biased rate code estimates
.
PNAS
,
106
,
6921
6926
.
Ventura
,
V.
,
Cai
,
C.
, &
Kass
,
R. E.
(
2005a
).
Trial-to-trial variability and its effect on time-varying dependence between two neurons
.
Journal Neurophysiol.
,
94
,
2928
2939
.
Ventura
,
V.
,
Cai
,
C.
, &
Kass
,
R. E.
(
2005b
).
Statistical assessment of time-varying dependence between two neurons
.
Journal Neurophysiol.
,
94
,
2940
2947
.
Wu
,
W.
,
Kulkarni
,
J.
,
Hatsopoulos
,
N.
, &
Paninski
,
L.
(
2009
).
Neural decoding of goal-directed movements using a linear state-space model with hidden states
.
IEEE Trans. Neural Syst. Rehab. Engineering
,
17
,
370
378
.
Zhao
,
M.
,
Batista
,
A. P.
,
Cunningham
,
J. P.
,
Chestek
,
C. A.
,
Rivera-Alvidrez
,
Z.
,
Kalmar
,
R.
, et al. (
2011
).
An L1-regularized logistic model for detecting short-term neuronal interactions
.
J. Comput. Neurosci.
doi: 10.1007/s10827-011-0365-5
.