## Abstract

Several authors have previously discussed the use of log-linear models, often called maximum entropy models, for analyzing spike train data to detect synchrony. The usual log-linear modeling techniques, however, do not allow time-varying firing rates that typically appear in stimulus-driven (or action-driven) neurons, nor do they incorporate non-Poisson history effects or covariate effects. We generalize the usual approach, combining point-process regression models of individual neuron activity with log-linear models of multiway synchronous interaction. The methods are illustrated with results found in spike trains recorded simultaneously from primary visual cortex. We then assess the amount of data needed to reliably detect multiway spiking.

## 1. Introduction

Synchrony is widely believed to play a fundamental role in neural computation (e.g., Uhlhaas et al., 2009), but its statistical assessment is subtle (for reviews, see Grün, 2009, and Harrison, Amarasingham, & Kass, in press). To analyze multiway synchrony among simultaneously recorded multiple spike trains that are represented as binary time series, across many trials, it is natural to consider well-established log-linear modeling technology (Gütig & Aertsen, 2003; Martignon et al., 2000; Nakahara & Amari, 2002; Schneidman, Berry, Segev, & Bialek, 2006). The standard approach, however, has two shortcomings. First, it assumes stationarity of firing rates across suitable time intervals. Second, it does not incorporate spiking history or other covariates, and therefore it effectively assumes Poisson spiking. While previous authors have been aware of these issues (Martignon et al., 2000), they have not offered specific methods for dealing with them. In addition, the standard approach ignores the inherent relative sparsity of two-way and higher-order synchronous spiking. Here we provide a modification of the usual log-linear modeling methodology to deal with the case of inhomogeneous non-Poisson firing rates that result from stimulus-driven (or action-driven) recordings, and we give a straightforward procedure for parameter estimation, which is a variant of maximum likelihood via iterative proportional fitting. Tests for pairwise and multiway synchrony may then be based on the bootstrap.

*a*,

*b*,

*c*), where so that, for instance, (1, 1, 0) would signify that the first two neurons fired but the third did not. The probability of obtaining the pattern (

*a*,

*b*,

*c*) at some time

*t*on trial

*r*may depend on the history

*H*

_{t,r}of spiking patterns prior to time

*t*, and it may also depend on some other covariates, such as a measurement of network activity, as in Kass, Kelly, and Loh (2011) or variables representing trial-to-trial variation, as in Ventura, Cai, and Kass (2005a), which we here take to be a vector

*u*

_{t,r}. We then write the probability of pattern (

*a*,

*b*,

*c*) at time

*t*on trial

*r*as

*p*

^{123}

_{abc,r}(

*t*|

*H*

_{t,r},

*u*

_{t,r}). If we were to treat the spike trains as jointly stationary and ignore history, covariates, and additional sources of trial-to-trial variation, we could omit

*t*and

*r*and write the usual model for two-way, but not three-way, interaction as where the parameters satisfy certain constraints (e.g., , ). In computer science and physics, models such as equation 1.1 are often called maximum entropy models (Schneidman et al., 2006). In the statistics literature, the parameters are usually standardized by subtracting means (Agresti, 2002). Using equation 1.1, for a given set of data the log-likelihood function may be maximized iteratively to produce fitted parameters and probabilities. The problem we solve here is introducing a variant of equation 1.1 that allows both nonstationarity and the use of history and other covariates.

### 1.1. Overview of Approach.

Suppose we have spike trains from *n* neurons recorded simultaneously over a time interval of length *T*, across *R* trials. We consider spiking patterns at a relatively fine time resolution, denoted by . In section 3, we report an analysis of simultaneous spiking data recorded from primary visual cortex where we took to be 5 milliseconds. The spike train data may be represented as binary arrays with dimensionality . For the data analysis reported in section 3 we used arrays of dimension to estimate probabilities *p*^{123}_{abc,r}(*t*|*H*_{t,r}, *u*_{t,r}). In full generality there are more parameters than data values. Furthermore, nonstationarity and history dependence could, in their most general conception, be very complicated. The approach we take here simplifies the situation greatly by making these assumptions:

For every neuron

*i*, the firing probability varies smoothly across time and depends on spiking history*H*_{t,r}only through the history*H*^{i}_{t,r}for neuron*i*, together with the covariate vector*u*_{t,r}.Excess joint pairwise spiking, above that expected under independence, does not depend on either spiking history or the covariate vector.

Importantly, the second assumption defines excess synchrony relative to “independence,” which here means independence conditionally on all history and covariate effects mentioned in the first assumption. In specifying these assumptions, we aim to emphasize the way synchrony is judged against a backdrop of explanatory covariates. For example, it is widely appreciated that a pair of neurons may exhibit excess pairwise spiking relative to what might be expected from their time-averaged firing rates because they respond to given stimuli with roughly similar temporal profiles; this would be synchrony due to their individual firing rate functions, as seen through overlapping peristimulus time histograms (PSTHs). Various methods may be used to adjust or “normalize” pairwise spiking to account for the individual time-varying firing rate functions (Aertsen, Gerstein, Habib, & Palm, 1989). Many other possible sources of pairwise spiking may be present in particular cases, including global network activity. One of our purposes here is to introduce a general framework for quantifying the contributions of alternative sources of pairwise spiking while assessing statistical evidence and uncertainty. A second purpose is to examine excess multiway spiking relative to that expected from pairwise spiking. The approach we develop melds log-linear modeling, as in equation 1.1, together with point-process regression modeling (which usually comes under the rubric of generalized linear models) as in numerous articles (Kass & Ventura, 2001; Kass, Ventura, & Brown, 2005; Kelly, Smith, Kass, & Lee, 2010a; Okatan, Wilson, & Brown, 2005; Pillow et al., 2008; Stevenson et al., 2009; Truccolo, Eden, Fellows, Donoghue, & Brown, 2005; Zhao et al., 2011). We use point-process regression to model the behavior of each individual neuron; we then overlay the structure of log-linear models to account for synchronous connections.

*H*

^{i}_{t,r}of neuron

*i*at time

*t*on trial

*r*, we write Let

*p*

^{i}_{1,r}(

*t*|

*x*

_{t,r}) be the probability of neuron

*i*spiking at time

*t*on trial

*r*. The central object in the point-process framework is the conditional intensity function, , which is the firing rate function in continuous time, and the relationship between the continuous and discrete time representations is summarized by the approximation being justified by passage to the limit as . The corresponding key statistical result is that the likelihood function based on the continuous-time representation, in terms of the conditional intensity function, is approximately equal to the likelihood function based on the binary data and the probability of spiking

*p*(

^{i}*t*|

*x*

^{i}_{t,r}). We use

*x*in the notation

*x*

^{i}_{t,r}to connote regression-style modeling of probability in terms of explanatory covariates, including history.

*p*

^{ij}_{11,r}(

*t*|

*x*

_{t,r}) be the probability that neurons

*i*and

*j*will both spike at time

*t*on trial

*r*. Under assumption 1 above, we use the spike trains from each neuron

*i*to fit that neuron's firing probabilities

*p*

^{i}_{1,r}(

*t*|

*x*

_{t,r})=

*p*

^{i}_{1,r}(

*t*|

*x*

^{i}_{t,r}) across time and trials. Let us write such fits as . Under assumptions 1 and 2, we may define so that represents the excess pairwise spiking above that predicted by independence (as in Ventura, Cai, & Kass, 2005b). We have written with the argument

*t*and subscript

*x*to indicate that is a function only of time, but because it is defined through equation 1.3, it depends indirectly on the covariates used in the individual neuron firing probabilities

*p*

^{i}_{1,r}(

*t*|

*x*

_{t,r}). For example, we may evaluate excess synchrony with and without a covariate that measures network activity. Kass et al. (2011) gave an example in which the data indicated that when a covariate for network activity was included in

*x*

_{t,r}while when it was omitted. This suggested that excess synchrony above that expected from time-varying firing rates, while present, was due to global network activity rather than a local circuit that affected the particular pair of neurons.

We may now summarize the steps of the strategy we have implemented for estimating multiway spiking probabilities based on all combinations of excess pairwise spiking, thereby generalizing equation 1.1 to account for nonstationarity, history, and covariates. We consider first the case of *n*=3 neurons labeled *i*, *j*, *k*:

For all

*i*and all*t*and for every trial*r*, fit*p*^{i}_{1,r}(*t*|*x*^{i}_{t,r}) to get .For all

*i*,*j*estimate to obtain .Based on the sets of estimated values , , , , , from steps 1 and 2, use an iterative algorithm to obtain the complete set of estimates of

*p*^{ijk}_{abc,r}(*t*|*x*_{t,r}).

Once we have all the fitted probabilities we are able to use them to estimate quantities defined in terms of the probabilities *p ^{ijk}*

_{abc,r}(

*t*|

*x*

_{t,r}). We can also use the set of fitted probabilities to simulate artificial spike trains that reflect all of the estimated multineuron dependence, and we can thereby use bootstrap simulation to obtain confidence intervals for estimated quantities. These confidence intervals incorporate the statistical uncertainty from all three steps of the fitting procedure. Similarly, we can test the null hypothesis of two-way interaction, but no three-way interaction, against the alternative that also includes three-way interaction. The bootstrap hypothesis tests also incorporate uncertainty from all the steps of the fitting procedure.

The same steps may be followed for more than three neurons. For example, for four neurons, we would again use the two-way probabilities in equation 1.5 to fit probabilities of the form and thereby test the null hypothesis of pairwise interaction against both three-way and four-way interaction. Thus, according to the approach described here, large numbers of spiking probabilities having potentially complicated forms are estimated by first fitting to each neuron a smooth conditional intensity function and any relevant covariates, then fitting two-way excess synchronous spiking terms, then finding multiway probabilities by iterative optimization.

To help fix these ideas, Figure 1 displays a simple situation involving time-varying firing rates from three artificial neurons (the dotted lines in panels A–C) where two-way synchronous spiking occurs more often than predicted by independence (the dotted lines in panels D–F), and three-way interaction occurs more often than that predicted by a two-way model. We simulated a large number of trials from this model. When a two-way interaction model was fitted to the simulated data, the one-way and two-way firing rates were estimated accurately (the solid lines in panels A–F), but the amount of three-way spiking was underestimated (the solid line in panel G).

### 1.2. Overview of This Letter.

In section 2 we provide methodological details. We begin in section 2.1 by briefly summarizing the fitting methods for individual neuron probabilities, then go on to pairwise probabilities in section 2.2, giving procedures for bootstrap confidence intervals in section 2.3. In section 2.4 we present the algorithm for fitting multiway probabilities under the assumption of pairwise interaction but not higher-order interaction, in section 2.5 we discuss bootstrap confidence intervals for functions of these probabilities, and in section 2.6 we describe bootstrap hypothesis tests. In section 2.7 we review the essential motivation for the three-step approach outlined in section 1.1 in terms of what we call hierarchical sparsity. In section 3 we provide a real-data illustration by analyzing some simultaneous spiking data recorded from primary visual cortex. We then, in section 4, use the framework described here to address a fundamental question: How many data would be needed to distinguish three-way interaction from the three-way spiking that occurs by chance from pairwise interaction models? In section 5 we add remarks about the utility of this method in practice.

## 2. Methodology

Here, and for the remainder of the letter, we omit explicit reference to the trial *r*, leaving it implicit. We also omit the subscript *x* on .

### 2.1. Fitting One-Way Probabilities.

Individual neuron (one-way) firing probabilities may be fitted by invoking smooth point-process models, where each neuron *i* has a firing rate governed by a conditional intensity function Kass and Ventura (2001) and Kass, Ventura, and Cai (2003) discussed spline-based fitting of conditional intensity functions, and Pillow et al. (2008) used an alternative set of smooth basis functions that incorporate both rate variation and history dependence. Sometimes history and covariate effects may be ignored, and then the PSTH may be smoothed by gaussian filters (kernel smoothers), fixed-knot splines, or more sophisticated methods such as BARS (DiMatteo, Genovese, & Kass, 2001). The first step of the approach suggested here is to apply one of these individual neuron models in order to obtain estimates for all *t*.

### 2.2. Fitting Pairwise Probabilities.

*i*and

*j*as in equation 1.3. While we could incorporate time-varying excess spiking effects as in equation 1.3 here, instead we further specialize by assuming that the excess pairwise spiking probability effects are constant across time. We therefore take and define This is helpful when there are not large numbers of joint spikes, and it also allows us to use a simple and intuitive estimator of given in equation 2.6. By inserting the estimates , , and into the right-hand side of equation 2.1, we obtain estimates of the two-way probabilities

*p*(

^{ij}_{ab}*t*|

*x*) as follows: These estimates, for

_{t}*i*,

*j*=1, 2, 3, give us all of the two-way probabilities needed to fit a two-way interaction model analogous to equation 1.1 that instead incorporates time-varying, history-dependent, and covariate-dependent firing probabilities.

*p*

^{i}_{1}(

*t*|

*x*) to be . Using the point-process representation of joint spiking for neurons

^{i}_{t}*i*and

*j*derived by Kass et al. (2011), we replace with and write the synchronous-spike likelihood function as where

*t*are the times of the joint (synchronous) spikes and Equation 2.4 is the conditional-intensity analogue of equation 2.1. Setting the derivative of to zero and solving gives the maximum likelihood estimate where

_{ij}*N*is the total number of joint spikes for neurons

^{ij}*i*and

*j*and the denominator is the expected number of joint spikes under independence, after taking account of the covariate

*x*. This corresponds to equation 22 of Kass et al. (2011). The formula we use in practice replaces the integral in equation 2.5 with To interpret equation 2.6, let us reemphasize the covariate

*x*by restoring its use as a subscript: on the left-hand side of equation 2.6, we replace by . If , the value of represents the proportionate excess synchronous spiking beyond that explained by the covariate

*x*, while if , the value of is the proportionate diminution of synchrony, below that explained by

*x*. When , it is helpful to consider the reciprocal, which ranges from 0 to 1 and represents the proportion of synchronous spikes explained by the covariate vector

*x*(including its time-varying, trial-averaged firing rate). For example, if when

*x*contains only the time-varying, trial-averaged firing rate, then

*E*=.5, and we would say the time-varying, trial-averaged firing rate explains half the synchronous spikes. This gives us a way of interpreting the relative effects of various alternative covariate vectors

_{x}*x*, as we illustrate in section 3. If , we can define the inhibitory measure , which then represents the proportion of observed spikes relative to the number predicted by

*x*.

### 2.3. Confidence Intervals for Pairwise Effects.

Standard errors and confidence intervals associated with the pairwise estimates in equation 2.6 may be obtained by a parametric bootstrap using equation 2.2. That is, equations 2.2 specify a set of multinomial spiking probabilities at each time *t* (and for each trial, separately, due to separate spiking history or covariate effects) for every spiking pattern (1,1),(1,0),(0,1),(0,0). It is straightforward to generate *G* complete sets of pseudo-data pairwise spike trains based on these probabilities, each set of pseudo-data replicating the layout of the original data (a typical value of *G* being 500). Let us use to index the sets of pseudo-data. If we wish to obtain a 95% bootstrap confidence interval for a parameter (such as ), we use the pseudo-data set *g* to obtain for ; that is, for every *g*, we obtain , by applying equation 2.6 to the pseudo-data, where all the terms are computed from the pseudo-data. We then order the values of , find the .025 and .975 percentiles, and use them as end points of the 95% confidence interval. A standard error is similarly obtained as the standard deviation of the values of . When summarizing results with estimates and standard errors, we prefer to work with rather than because the log transform tends to symmetrize the distribution (and thus make it closer to normal, as it should be when one interprets standard errors). Because we include the refitting of to the pseudo-data, the bootstrap confidence intervals account for the uncertainty in that first step of the fitting procedure.

### 2.4. Fitting Higher-Order Probabilities.

For *n*=2 neurons, equations 2.2 completely determine the estimated probabilities we need for statistical analysis. For *n*=3 neurons, pairwise interaction models do not provide analogous closed-form expressions for all the necessary probabilities *p*^{123}_{abc}(*t*|*x _{t}*), and the same is true for

*n*>3. Even in the simpler stationary Poisson setting, where the data become counts aggregated across time and model 1.1 may be applied, it is well known (Agresti, 2002) that maximum likelihood estimation of the two-way interaction model requires iterative methods. The standard algorithm in that context is iterative proportional fitting (IPF; Agresti, 2002; Schneidman et al., 2006).

*p*, including three-way interactions (in statistics, it is often called “saturated”), has 8−1=7 free parameters: the 8 values

_{abc}*p*must satisfy the constraint . The two-way interaction model, equation, 1.1, omits the three-way interaction and has six free parameters. (In equation 1.1 there are seven parameters with nonzero multipliers and, again, there is the constraint .)

_{abc}^{1}In our approach, we do not need to use the parameters that appear in equation 1.1. Instead, we effectively use a different set of six free parameters. We define and then parameterize equation 1.1 using We are not able to write a closed-form expression for

*p*in terms of the parameter vector for the same reason that we are unable to write closed-form maximum likelihood estimates for

_{abc}*p*in equation 1.1. However, any particular value of the vector does define a particular set of values of

_{abc}*p*according to equation 1.1 based on the nonlinear equations in equation 2.8. In the general case we are concerned with here, we similarly use to define, for every

_{abc}*t*, a two-way interaction model of the form 1.1 that we are fitting to the data.

*n*being the number of time bins in which the pattern (

_{abc}*a*,

*b*,

*c*) occurred and

*m*being the expected number according to the two-way interaction model, 1.1, where

_{abc}*m*=

_{abc}*np*and is the total number of spikes. IPF produces estimates of

_{abc}*m*. With

_{abc}*n*

_{ab+}being the number of bins in which the first two neurons have the pattern (

*a*,

*b*), and

*n*

_{a+c}, and

*n*

_{+bc}defined similarly, the first cycle of the IPF algorithm, as in Agresti (2002), involves three steps: Iterating these steps produces convergence to the maximum likelihood estimates under very general conditions (Haberman, 1974).

*p*(

^{ijk}_{abc}*t*|

*x*). Here, IPF will produce a set of estimates , which we write as Let and let and be defined analogously. Given values at iteration

_{t}*g*, we obtain the next three updates as For iteration

*g*, the values of the form are obtained from Because there are time values and

*R*trials, equation 2.9 produces sets of three equations—for example, for each of the data analyses in section 3, there were 24,000 sets of three equations, which were iterated to convergence. There are many ways to initialize the algorithm, including taking . We have found in practice that the algorithm converges in only a few iterated sets of the three steps in equation 2.9. We summarize the fitting algorithm next.

**Algorithm for Fitting the Generalized Two-Way Interaction Model**

The output of this algorithm is the complete set of estimated probabilities .

When there are *n*>3 neurons we may apply the same method for fitting the two-way interaction model, thereby obtaining estimated probabilities. Clearly as *n* grows, it quickly becomes infeasible to compute a multiple of 2^{n} quantities. We mention this again in section 5.

### 2.5. Estimates and Confidence Intervals for Functions of Probabilities.

*n*=2 case, outlined following equation equation 2.6, using the probabilities to generate the three-way pseudo-data spike trains.

*i*and

*j*that is unrelated to the activity of neuron

*k*. We do not mean to suggest that this particular measure has some special stature—others might be envisioned—but it is intuitive and easily computed within the framework we are describing here.

*p*

^{ijk}_{111}(

*t*|

*x*) denote the three-way spiking probability under the general three-way model, which includes three-way interaction. We define by and estimate it with where

_{t}*N*is the total number of triplet joint spikes for neurons

^{ijk}*i*,

*j*,

*k*, and obtained from fitting the two-way interaction model. Formula 2.14 is analogous to formula 2.6 and may be derived by an analogous argument.

^{2}Once again, standard errors and confidence intervals may be obtained from the parametric bootstrap, but now the three-way interaction term is included in the model. Specifically, the spike trains are generated from the estimates obtained from using equation 2.14: We illustrate by applying equations 2.12 and 2.14, together with bootstrap confidence intervals, in section 3.

### 2.6. Hypothesis Tests.

*n*=2, the null hypothesis concerning in equation 2.1 is that of independence, . To test

*H*

_{0}, we may use a parametric bootstrap, with spike trains generated by the fitted independence model, which simply requires that we generate the neuron

*i*and neuron

*j*spike trains independently using and . There are several choices for the test statistic. Let us suppose we use

*N*, and let

^{ij}*N*denote the value computed from the data. We generate bootstrap pseudo-data, with the same number of trials as the real data, based on and , and repeat this procedure

^{ij}_{obs}*G*times (e.g., we might take

*G*=10, 000). We let label the pseudo-data, and for set

*g*of pseudo-data we compute

*N*and label its value

^{ij}*N*

^{ij(g)}. The bootstrap

*p*-value is then This

*p*-value satisfies the usual properties of the parametric bootstrap, that is, it furnishes approximately the correct probability of rejecting

*H*

_{0}under the assumed null model. In practice, if the numerator is 0, we do not report

*p*=0 but rather say

*p*<1/

*G*. Some authors prefer to add 1 to both the numerator and denominator of equation 2.17.

When *n*=3, to test whether the two-way interaction model is adequate to explain all the three-way spikes, we take the general two-way interaction model as *H*_{0}. We may then use a parametric bootstrap by generating pseudo-data spike trains according to the probabilities of section 2.4 and can use the total number of three-way spikes *N ^{ijk}* as the test statistic. If we again label the sets of pseudo-data with and the data-based value of

*N*with

^{ijk}*N*, we obtain the

^{ijk}_{obs}*p*-value by replacing the numerator in equation 2.17 with the number of values of

*g*such that .

When *n*>3, we may apply the bootstrap test not only to the triplet spikes from all combinations of three neurons using *N ^{ijk}*, but also the quadruplet spikes from all combinations of four neurons using

*N*, and so forth, in principle up to

^{ijkl}*N*

^{ijk⋅⋅⋅n}.

*p*-value in equation 2.17 is one-sided in the sense that only excess joint spiking is being assessed. To check for either excess joint spiking or diminished joint spiking, we could instead use as the test statistic. Then, with being the value computed from set

*g*of the pseudo-data and being the value computed from the real data, we would define as a

*p*-value for the two-sided test. When , equation 2.18 finds the proportion of sets of pseudo-data for which either or . Higher-order tests may be modified similarly.

### 2.7. Motivation from Hierarchical Sparsity.

Central to the strategy in sections 2.1 to 2.5 is the idea that we may perform fitting hierarchically using the three steps articulated in section 1.1 and realized in the algorithm of section 2.5. The main justification for this procedure is based on the rate at which pairwise and multiway spikes occur.

*p*

^{i}_{1}(

*t*|

*x*), is also small and, as we have According to equation 2.1, we then have Equations 2.19 and 2.20, together with higher-order counterparts such as provide a formal expression of hierarchical sparsity. Kass et al. (2011) applied hierarchical sparsity in deriving point-process representations that approximate discrete-time models, analogous to equation 1.2. We use the idea here to motivate our hierarchical fitting procedure. Under hierarchical sparsity, there is likely to be good information (many spikes) available to estimate one-way probabilities, but as we move up the hierarchy of interactions, the information degrades: there are many fewer two-way spikes, and then even fewer three-way spikes, and so on. Although we do not offer a more precise theoretical statement, it is apparent that under hierarchical sparsity, the hierarchical fitting procedure should produce estimates of the multiway probabilities that capture well the available information in the data.

_{t}## 3. Data Analysis

We illustrate the approach by analyzing two sets of three neurons recorded from primary visual cortex, as described in Kelly, Smith, Kass, and Lee (2010b) and Kass et al. (2011). In each case, the neural responses were recorded from an anesthetized monkey while sinusoidal gratings were moved across the visual field. The data in Figures 2 and 3 correspond to 1 second of the recordings, and there are 120 repeated trials. Both figures display raster plots from three cells, with dark circles superimposed to indicate triplet firing within 5 ms time bins. The three cells chosen for Figure 2 are different from the three chosen for Figure 3, except that cell three in Figure 2 is the same as cell 2 in Figure 3. In Figure 2, there are 11 such triplets in total, across the 120 trials, while in Figure 3, there are 12 triplets. To implement step 1 of the fitting procedure, in which we fit time-varying individual-neuron firing rate functions (see section 2.1), we used a gaussian filter with bandwidth ms to smooth each PSTH, as shown at the bottom of panels A, B, and C in Figures 2 and 3. Results were not sensitive to choice of bandwidth. In related analyses, we used spline fits, including BARS, but chose gaussian filters here for speed of implementation, which was important for the simulation study reported in section 4. Panel D in each figure shows the bootstrap distribution of the number of triplets obtained under the null model using , as described in section 2.6. (We used *G*=500 bootstrap samples.) For the neurons corresponding to Figure 2, *N*^{123}_{obs}=11 is in the extreme tail of the bootstrap distribution and, applying equation 2.17, the test has a significant *p*-value with *p*=.002. On the other hand, for the neurons corresponding to Figure 3, *N*^{123}_{obs}=12 is in middle of the bootstrap distribution, and the test is not significant (*p*=.52).

^{3}We obtained Thus, interestingly, the estimates were all statistically indistinguishable from the null value of 1. This indicates that once the three-way interactions are considered, there are no longer any significant two-way effects. We suspect the excess three-way spiking among these neurons is due to the slow-wave activity discussed by Kelly et al. (2010b) and Kass et al. (2011). For the neurons in Figure 3, all the corresponding estimated effects and confidence intervals were consistent with null values of 1.

The results given above did not include individual-neuron history terms. When we included history, based on spike count in the preceding 100 milliseconds, the three-way hypothesis test results were essentially unchanged: individual-neuron history effects apparently picked up network activity so that pairwise interactions became nonsignificant, but the three-way interaction for the neurons in Figure 2 remained highly significant. (This was not sensitive to the width of history window.)

A related example of the approach presented here, focusing on two-way interaction, was given in Kass et al. (2011). Two alternative pairs of neurons were analyzed, with and without covariates *x ^{i}_{t}* that had two components: the spike count for neuron

*i*in the previous 100 ms and the total spike count among all neurons in the previous 100 ms. Thus,

*x*represents the recent network history, including that of neuron

^{i}_{t}*i*itself but not including that of neuron

*j*. We summarize the results in terms of , as in equation 2.7, the proportion of synchronous spikes explained by covariates

*x*in Table 1. In the table, we have written to indicate that the bootstrap test of was not rejected for pair 1 when the network history covariates were included. The bootstrap 95% confidence intervals for were pair 1: (.79,1.4) and pair 2: (1.4,3.6). The main findings are, first, for both pairs, there is highly significant synchronous activity beyond that due to firing rate ( for pair 1,

*p*<.0001 for pair 2) with only about 50% and 40% of the synchronous spikes explained by firing rate for the two pairs.

^{4}Second, for pair 1, network activity appears to explain synchronous spiking, but for pair 2 it does not (

*p*=.0002 from the bootstrap test of

*H*

_{0}). Indeed, from the pair 2 values of

*E*, the network activity explains only a very small additional proportion of spikes beyond the time-varying firing rate. For pair 2, there is excess synchronous spiking that presumably is associated with the stimulus.

_{x}Covariate
. x | Pair 1
. | Pair 2
. |
---|---|---|

Average firing rate | .52 | .40 |

Time-varying firing rate | .50 | .42 |

Network history | .48 |

Covariate
. x | Pair 1
. | Pair 2
. |
---|---|---|

Average firing rate | .52 | .40 |

Time-varying firing rate | .50 | .42 |

Network history | .48 |

For these two pairs of neurons, we have also examined whether there is evidence of time-varying synchrony. The interesting case would be time-varying synchrony within a particular stimulus epoch. We considered the 300 ms corresponding to each stimulus and decomposed these into the first 100 ms and the latter 200 ms, then used equation 2.6 within each of these time intervals and computed confidence intervals for both the early part of the time interval and the later part of the time interval. In every case, these two confidence intervals strongly overlapped, indicating no evidence in favor of time-varying synchrony. This may have been due to the relatively small number of synchronous spikes available within these 300 ms stimulus conditions across the 120 trials.

## 4. The Power of Tests for Three-Way Interaction

*R*for the number of trials. For each combination of and

*R*, we computed the probability of rejecting

*H*

_{0}after fixing the rejection cutoff so that the probability of rejection under

*H*

_{0}(the -level) was .05, as is customary. The results are shown in Figure 4, with black lines indicating the combinations of and

*R*that produce the customary minimally acceptable power values of .8. The figure is based on 1000 replications of each simulation. Specifically, for each and

*R*, the figure displays smoothed proportions, out of 1000, of the replications that rejected

*H*

_{0}.

The plots in Figure 4 indicate that in order to have a substantial probability of detecting three-way interaction, one needs either a large value of or a large number of trials *R*, or both. For example, when there are, on average, twice as many triplets as would occur under *H*_{0} (), in order to reject *H*_{0} with power .8 in the 5 Hz scenario of panel A, one would need more than 700 trials (of 1 s duration). For values of clearly below 2, which could be realistic, it becomes extremely difficult to reliably reject *H*_{0}. More than 1000 trials would be needed, and the extrapolation of the power curves indicates the number grows quickly as decreases. When the number of three-way spikes is larger, the situation is more favorable. When the firing rates are higher for the 10 Hz scenario of panel B, even there one needs about 150 trials to reliably reject *H*_{0} when . Panels C and D show the same two firing rates but with injected pairwise correlation. The pairwise correlation effectively increases the number of three-way spikes, and thus the power is larger when pairwise correlation is present. For the scenario in panel C, 200 trials are need to reliably reject *H*_{0}, and in the most favorable situation of panel D, 75 trials are needed.

## 5. Discussion

Our main purpose has been to generalize statistical assessment of synchrony based on log-linear modeling (using maximum entropy models) so that it can accommodate time-varying firing rates or non-Poisson history effects or time-varying covariates. We have presented methodological details to supplement the theoretical treatment of Kass et al. (2011), which served to provide a point-process foundation for discrete-time modeling. The hierarchical approach described here is sufficiently simple that the generalized two-way model, together with multiway estimates and hypothesis tests, can be implemented relatively easily and effectively. We have produced Matlab code that may be accessed at http://www.cnbc.cmu.edu/~rkelly/code/synchrony. The code includes inputs for user-defined history and covariate effects.

The chief theoretical novelty in our approach is to recognize the strong heuristic of hierarchical sparsity, as articulated in section 2.7: if multiway spikes are relatively rare compared to individual-neuron spikes, then it should be useful to conceptualize excess spiking as involving multiway gain factors and to combine these with individual-neuron firing probabilities described by point-process regression models. The strategy adopted here is different from other statistical treatments of multiple binary time series, such as in Liang and Zeger (1989), which are not aimed at point processes and do not consider the special circumstance, and opportunity, presented by hierarchical sparsity; they would be considerably more cumbersome in this setting. On the other hand, hierarchical sparsity is not universally valid. It may happen that population-level activity produces substantial bursts of multiway spiking, as illustrated in panel C of Figure 5. The data in this figure come from the same experiment as the data used in section 3. Our approach assumes the individual-neuron conditional intensity functions account for this kind of shared activity. If conditional intensity functions failed to include appropriate covariates to identify population activity, then multiway spiking might no longer be rare compared with individual-neuron spiking and hierarchical sparsity might no longer be applicable. Also, network activity such as that in Figure 5 would be an example source of trial-to-trial variation; in our approach, all important sources of trial-to-trial variation must be included by defining suitable covariates. It would be interesting to consider incorporating into the framework described here latent variables to accommodate network bursting (Chen, Vijayan, Barbieri, Wilson, & Brown, 2009; Tokdar, Xi, Kelly, & Kass, 2010; Wu, Kulkarni, Hatsopoulos, & Paninski, 2009), but that is a topic for future research. Kass et al. (2011) used a covariate based on population spike counts in point-process regression models to analyze synchrony, following the approach specified in greater detail here.

The usual parameters in equation 1.1 have some statistical virtues, as do the orthogonalized parameters discussed by Amari (2009). We have worked with a different parameterization, given in equation 2.8, with the chosen due to interpretability as a gain factor for increased synchronous firing rate. This also led to our suggestion of in equation 2.11 as a measure of functional connectivity of neurons *i* and *j* in the presence of neuron *k*. One could instead introduce a generalized version of the usual parameters in equation 1.1 to account for time-varying firing rates, for example. In the context of synchrony investigation, we find more directly interpretable, but this is a matter of taste and convenience.

We also conducted a power study, the main conclusion of which is that large numbers of trials are likely needed in order to detect realistic multiway spiking above that determined by two-way interactions. In the most favorable case we examined, where cells fire at 10 Hz, approximately 75 1 s trials are needed to reliably detect excess three-way spiking that produces double the number of spikes expected under the two-way model (and this assumed that diminished three-way spiking is not of interest so that a one-sided test could be used; a two-sided test would require considerably more data). When there are lower firing rates, or less dramatic excess firing, the ability to reliably detect excess three-way spiking deteriorates, and very long recording sessions will be needed. With the code we have made available, additional scenarios may be investigated so that the power of new experiments to find excess three-way synchrony could be considered carefully.

We have assumed that excess multiway spiking is constant in time. At least in the case of two-way spiking, there will be opportunities to examine stimulus-related time-locked increases in synchrony, with modest amounts of data, which could have important physiological relevance (Riehle, Grün, Diesmann, & Aertsen, 1997). In our framework, this would require estimating in equation 1.3, and an illustration of such an estimate was given by Ventura et al. (2005b) based on low-order spline fitting. An alternative would be to apply formula 2.6 repeatedly across distinct time intervals, and a continuous-time estimate could also be obtained by windowing or smoothing the numerator and denominator of equation 2.6, analogous to what was done in Kass et al. (2003). From our power results in section 4, however, we would expect time-varying multiway interactions to require either high firing rates or extended recording sessions. Indeed, when we examined data from two pairs of neurons in section 3 and found no evidence in favor of nonconstant , we observed that we had relatively few synchronous spikes to work with, and thus little statistical power to detect nonconstant . In the context of time-varying synchrony, it is also worth pointing out that in the presence of time-varying individual-neuron firing rates, alternative models of excess synchrony are not equivalent (Ito & Tsuji, 2000). In particular, the assumption in equation 1.1 that is constant in time is not equivalent to the assumption that is constant in time.

We have applied the IPF procedure of section 2.2 to triples of neurons. It could be applied to larger sets of *n*>3 neurons, but the number of probabilities that must be fitted is 2^{n}, so as *n* grows, the computation will quickly become infeasible. We believe it remains possible to treat *n* neurons, even when *n* is large, but this will require additional methods and is a subject for future investigation. An important alternative to the parametric bootstrap described here is based on spike jittering, as in Harrison and Geman (2009). The jitter null hypothesis, however, is different from the bootstrap null of section 2.3 (see Harrison et al., in press). In future work we also plan to investigate the use of jitter in conjunction with the model-based approach described here.

Finally, no discussion of synchrony is complete without some reference to the problem of spike sorting, which undoubtedly can have an impact on synchrony detection. Considerable effort went into the characterization of spike waveforms in the data analyzed here (Kelly et al., 2007), but the problem is difficult (Harris, Henze, Csicsvari, Hirase, & Buzsaki, 2000; Ventura, 2009). We trust spike misclassification will be mitigated as recording technologies advance.

## Notes

^{1}

In the usual way equation 1.1 is presented in the statistics literature, there are 19 parameters and 13 constraints.

^{3}

We used equation 2.15 rather than the two-way model because the test of was significant.

^{4}

The values of *E _{x}* are similar for the overall time-averaged firing rate (a constant firing rate) and for the time-varying firing rate; this depends on the “signal correlation,” that is, the overlap of the PSTHs, with the latter typically being bigger than the former when there is substantially shared time-varying response to the stimulus and, thus, relatively large signal correlation.