## Abstract

Time series data sets often contain heterogeneous signals, composed of both continuously changing quantities and discretely occurring events. The coupling between these measurements may provide insights into key underlying mechanisms of the systems under study. To better extract this information, we investigate the asymptotic statistical properties of coupling measures between continuous signals and point processes. We first introduce martingale stochastic integration theory as a mathematical model for a family of statistical quantities that includes the phase locking value, a classical coupling measure used to characterize complex dynamics. Based on the martingale central limit theorem, we then derive the asymptotic gaussian distribution of estimates of such coupling measures, which can be exploited for statistical testing. Second, based on multivariate extensions of this result and random matrix theory, we establish a principled way to analyze the low-rank coupling between a large number of point processes and continuous signals. Under a null hypothesis of no coupling, we establish sufficient conditions for the empirical distribution of squared singular values of the matrix to converge, as the number of measured signals increases, to the well-known Marchenko-Pastur (MP) law, and for the largest squared singular value to converge to the upper end of the MP support. This justifies a simple thresholding approach for assessing the significance of multivariate coupling. Finally, we illustrate with simulations the relevance of our univariate and multivariate results in the context of neural time series, addressing how to reliably quantify the interplay between multichannel local field potential signals and the spiking activity of a large population of neurons.

## 1 Introduction

The observation of highly multivariate temporal point processes, corresponding to the activity of a large number of individuals or units, is pervasive in many applications, for example, neurons in brain networks (Johnson, 1996) or members of social networks (Dai, Wang, Trivedi, & Song, 2016; De, Valera, Ganguly, Bhattacharya, & Rodriguez, 2016). As the number of observed events per unit may remain small, inferring the underlying dynamical properties of the studied system from such observations is challenging. However, in many cases, it is possible to observe continuous signals whose coupling with the events can offer key insights.

In neuroscience, this is the case of the extracellular electrical field, which provides information complementary to spiking activity. Local field potentials (LFP) are mesoscopic (Liljenstroem, 2012) signals resulting from the superposition of the electric potentials generated by ionic currents flowing across the membranes of the cells located close to the tip of recording electrodes. The LFP reflects neural cooperation due to the anisotropic cytoarchitecture of most brain regions, allowing the summation of the extracellular currents resulting from the activity of neighboring cells. As such, a number of subthreshold integrative processes (i.e., modifying the neurons' internal state without necessarily triggering spikes) contribute to the LFP signal (Buzsaki, Anastassiou, & Koch, 2012; Buzsaki, Logothetis, & Singer, 2013; Einevoll, Kayser, Logothetis, & Panzeri, 2013; Pesaran et al., 2018; Herreras, 2016).

Reliably quantifying the coupling between the activities of individual units (e.g., spikes generated by individual neurons) in a circuit and aggregated measures (such as the LFP) may provide insights into underlying network mechanisms, as illustrated in the electrophysiology literature. At the single-neuron level, the relationship of spiking activity to subthreshold activity has broad implications for the underlying cellular and network mechanisms at play. For instance, it has been suggested that synaptic plasticity triggers changes in the coupling between spikes and LFPs (Grosmark, Mizuseki, Pastalkova, Diba, & Buzsáki, 2012; Grosmark & Buzsáki, 2016). Regarding the putative functional role of such observed couplings, they have been hypothesized to support cognitive functions such as attention. The coordination-by-oscillations hypothesis proposes that network oscillations differentially modulate the excitability of several target populations, such that a sender population can emit messages during the window of time for which a selected target is active, while unselected targets are silenced (Fries, 2005, 2015; Womelsdorf et al., 2007).

In the case of two continuous signals, coupling measures such as coherence and phase locking value (PLV) (Rosenblum, Pikovsky, Kurths, Schäfer, & Tass, 2001; Pereda, Quiroga, & Bhattacharya, 2005) are widely used, and their statistical properties have been investigated, in particular in the stationary gaussian case (Brillinger, 1981; Aydore, Pantazis, & Leahy, 2013). In a similar way, PLV (Ashida, Wagner, & Carr, 2010) and spike-field coherence (SFC) (Mitra, 2007) can measure spike-LFP coupling (see among others: Vinck, Battaglia, Womelsdorf, & Pennartz, 2012; Vinck, van Wingerden, Womelsdorf, Fries, & Pennartz, 2010; Jiang, Bahramisharif, van Gerven, & Jensen, 2015; Zarei, Jahed, & Daliri, 2018; Li, Cui, & Li, 2016) and are broadly used to make sense of the role played by neurons in coordinated network activity (Buzsaki & Schomburg, 2015). There are notable contributions investigating potential biases of these measures when both point processes and continuous signals are involved (Lepage, Kramer, & Eden, 2011; Kovach, 2017). However, two issues relevant for practical applications remain: (1) the effect of the intrinsic variability of spike occurrence on key statistical properties of the estimates, such as the variance, has not yet been thoroughly described, and (2) how to extend the rigorous statistical analysis of spike-field coupling to the highly multivariate signals available with modern recording techniques remains largely unaddressed.

We address these two issues by using continuous time martingale theory (see Liptser & Shiryaev, 2013a), the related concept of stochastic integration (see Protter, 2005), and random matrix theory (Anderson, Guionnet, & Zeitouni, 2010; Capitaine & Donati-Martin, 2016). The martingale central limit theorem (CLT) allows us to derive analytically the asymptotic gaussian distribution of a general family of coupling measures that can be expressed as stochastic integrals. We exploit this general result to show that the classical univariate PLV estimator is also asymptotically normally distributed and provide the analytical expression for its mean and variance. Furthermore, we study potential sources of bias for the commonly used von Mises coupling model (Ashida et al., 2010). We then go beyond univariate coupling measures and analyze the statistical properties of a family of multivariate coupling measures taking the form of a matrix with stochastic integral coefficients. We characterize the jointly gaussian asymptotic distribution of the matrix coefficients and exploit random matrix theory (RMT) principles to show that after appropriate normalization, the spectral distribution of such large matrices under the null hypothesis (of absence of coupling) approximately follows the Marchenko-Pastur (MP) law (Marchenko & Pastur, 1967), while the magnitude of the largest singular value converges to a fixed value whose simple analytic expression depends only on the shape of the matrix. We finally show how this result provides a fast and principled procedure to detect significant singular values of the coupling matrix, reflecting an actual dependency between the underlying signals. In the appendixes, we include detailed proofs and background material on RMT and stochastic integration, so that nonexpert readers can further apply these tools in neuroscience.

## 2 Background

### 2.1 Spike-Field Coupling in Neuroscience

Although our results are relevant to a broad range of applications within and beyond neuroscience, we will use the estimation of spike-LFP coupling introduced above as the guiding example of this letter. Spikes convey information communicated between individual neurons. This information is believed to be encoded in the occurrence times of successive spike events, which are typically modeled with point processes—for example, Poisson (Softky & Koch, 1993) or Hawkes process (Truccolo, 2016; Krumin, Reutsky, & Shoham, 2010).

While oscillatory dynamics is ubiquitous in the brain and instrumental to its coordinated activity (Buzsaki, 2006; Buzsaki et al., 2013; Peterson & Voytek, 2018), it is often challenging to uncover based solely on the sparse spiking activity of recorded neurons. On the other hand, LFPs often exhibit oscillatory components that can be isolated with signal processing tools (typically bandpass filtering or template matching), such that pairing the temporal information from LFPs and spiking activity can help extract reliable markers of neural coordination.

### 2.2 Counting Process Martingales

A stochastic process $M(t)$ is called a *martingale*^{1} with respect to the filtration $\{\mathcal{F}_t\}$ (which represents the past information accumulated up to time $t$) if (1) $M(0)=0$, (2) it is adapted to $\{\mathcal{F}_t\}$ (informally, the law of $M$ up to time $t$ “uses” only past information up to $t$), and (3) it satisfies the martingale property: $E[M(t)\,|\,\mathcal{F}_s]=M(s)$ for all $0\leq s\leq t$.
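As a minimal numerical illustration of this definition (a sketch with arbitrary parameters, not part of the letter), the compensated Poisson process $M(t)=N(t)-\lambda t$ is the prototypical counting process martingale: its mean vanishes and its variance equals the integrated intensity $\lambda t$.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, T, n_paths = 5.0, 2.0, 20000   # illustrative intensity, horizon, sample size

# Compensated Poisson process at time T: M(T) = N(T) - lam*T, with M(0) = 0.
N_T = rng.poisson(lam * T, size=n_paths)
M_T = N_T - lam * T

# Martingale property: E[M(T)] = 0; moreover Var[M(T)] equals the integrated
# intensity lam*T (the predictable variation of the counting process).
print(M_T.mean(), M_T.var())   # close to 0 and close to lam*T = 10
```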

In order to keep our results concise, we assume the following deterministic setting in the remainder of this letter (see section 5 for potential extensions).

**Assumption 1.**

Assume that the intensity function $\lambda(t)=\lambda(t\,|\,\mathcal{F}_t)$ of $N(t)$ and the signal $x(t)$ are deterministic, bounded, left-continuous, and adapted to $\mathcal{F}_t$ over $[0,T]$.

The terms of equation 2.5 separate the deterministic part of the measure from its (zero-mean) random fluctuations, which are entirely due to the martingale term. Using martingale properties, the statistics of the coupling measure are given by equation 2.6.
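These statistics can be checked numerically for the simplest instance of such a stochastic integral, $\int_0^T x(t)\,dN(t)$, under a homogeneous Poisson process: by the martingale properties above (Campbell's theorem), the mean is $\int_0^T x\lambda\,dt$ and the variance is $\int_0^T x^2\lambda\,dt$. The signal and parameter values below are illustrative assumptions, not taken from the letter.

```python
import numpy as np

rng = np.random.default_rng(1)
lam0, T, K = 10.0, 1.0, 20000
x = lambda t: np.cos(2 * np.pi * 3 * t)   # deterministic bounded signal on [0, T]

# K independent realizations of the stochastic integral int_0^T x(t) dN(t),
# i.e., the sum of x over the spike times of a homogeneous Poisson process.
vals = np.empty(K)
for k in range(K):
    spikes = rng.uniform(0, T, rng.poisson(lam0 * T))
    vals[k] = x(spikes).sum()

# Predicted statistics: mean = int x(t) lam0 dt = 0 here (whole periods),
# variance = int x(t)^2 lam0 dt = lam0 * T / 2.
print(vals.mean(), vals.var(), lam0 * T / 2)
```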

### 2.3 Random Matrix Theory

As data sets become increasingly high dimensional, it becomes important to replace the above univariate measure $\hat{c}$ by a quantity that summarizes the coupling between a large number of units and continuous signals. This extension leads to assessing the spectral properties of a coupling matrix $\hat{C}$ that gathers all pairwise measurements. However, such a task is nontrivial due to the martingale fluctuations affecting $\hat{C}$, which lead to spurious nonzero coupling coefficients and can also hide the deterministic structure of the matrix associated with significant coupling.

The relevant spectral properties are summarized by the empirical spectral distribution (ESD, a *spectral measure* in our case) of the coupling matrix.

Notably, recent developments in the field of random matrix theory extend the classic results, valid only for independent coefficients (uncorrelated Wishart matrices), to various forms of dependency between coefficients. For instance, El Karoui (2007, 2008) showed that the ESD and the distribution of the largest eigenvalue for a sequence of matrices $\{X_n\}_{n>0}$ with general covariance matrices (not necessarily the identity) follow similar laws, and Banna, Merlevède, and Peligrad (2015) investigated the case of symmetric random matrices with correlated entries. Furthermore, the behavior of high-dimensional autocovariance matrices in the context of discrete time stochastic processes is discussed in Liu, Aue, and Paul (2015) and Bhattacharjee and Bose (2016). Applications of this framework have also been considerably extended, including global finance (Namaki et al., 2020) and various aspects of machine learning and signal processing, such as shallow (Louart, Liao, & Couillet, 2018) and deep (Pennington & Bahri, 2017; Pennington & Worah, 2019) neural networks, denoising (Bun, Bouchaud, & Potters, 2017), and dimensionality reduction (Johnstone & Onatski, 2020).

In this study, we show that the martingale fluctuations of the coupling matrices also cause spectral convergence to the MP law in the absence of actual coupling between the signals. Recent results on the low-rank perturbation (Capitaine & Donati-Martin, 2016; Loubaton & Vallet, 2011; Benaych-Georges & Nadakuditi, 2012) of random matrices suggest this convergence can be exploited to further assess the significance of the largest eigenvalues of the coupling matrix with respect to the null hypothesis that they only reflect random fluctuations.

## 3 Assessment of Univariate Coupling

### 3.1 Mathematical Formulation

We exploit a central limit theorem (CLT) for martingales to show that the residual variability (the difference between the empirical estimate $\hat{c}^K$ and the expected coupling $c^*$ of equation 2.6) is asymptotically normally distributed. We formally state this in theorem 1.

**Theorem 1.**

Suppose assumption 1 holds and $x(t)$ is real-valued. Then,

We now apply theorem 1 to derive the asymptotic properties of the PLV introduced in section 2.1. For that, we adapt the empirical estimate of equation 2.1 to the $K$-trials setting introduced above and define

For this estimate, we get the following result.

**Corollary 1.**

Suppose assumption 1 holds, where $\varphi$ is real-valued and stands for the phase of the signal $x$. Then the expectation of the PLV statistic $PLV^K$ estimated from $K$ trials of measurements on $[0,T]$ tends to the limit

**Sketch of the proof.**

This relies on applying theorem 1 to the real and imaginary parts of $e^{i\varphi(t)}$. In addition, the coupling between both quantities is taken into account by replacing the variance of the univariate quantities $\tilde{V}(t)$ in theorem 1 by a covariance matrix that can be assessed with the martingale results given in section B.1.1.

**Remark 1.**

In the setting of corollary 1, the asymptotic variance reduces to $\frac{1}{\lambda_0 T}$, such that the coupling strength $\varkappa$ affects the mean but not the variance of the PLV estimate.

Corollary 24 and corollary 25, in appendix C, provide formal statements of this remark.

We can further use corollary 1 to predict the statistics of PLV estimates for other models of phase-locked spike trains. A classical model uses the von Mises distribution (also known as the circular normal distribution) with parameter $\kappa\geq 0$ to model the concentration of the spiking probability around a specified locking phase $\varphi_0$ (for more details, see Ashida et al., 2010). The original model uses a purely sinusoidal time series by assuming a linearly increasing phase $\varphi(t)=2\pi ft$, where $f$ is the modulating frequency, to derive the intensity of an inhomogeneous Poisson spike train,

In contrast to the sinusoidal coupling model, whose PLV magnitude can reach at most 1/2, this model can achieve PLV magnitudes arbitrarily close to 1, which might explain why it is more frequently used in applications.

Corollaries 2 and 3 derive the asymptotic covariance of the variability of the PLV estimate around this theoretical value (which is novel to the best of our knowledge). Furthermore, the results are derived in a more general model setting accounting for “biases”^{5} due to nonlinear phase increases $\varphi(t)$ and observation intervals that are not multiples of the modulating oscillation period. It should be noted that the mentioned biases are inherent in the estimator's definition: they occur independently of additional biases originating from the phase estimation procedure (e.g., phase extraction via the Hilbert transform; see Kovach, 2017).

(See corollary 25 in appendix C for a simplified version of corollary 2, assuming a linearly increasing phase $\varphi(t)=2\pi ft$ with frequency $f$.)

**Corollary 2.**

Under the assumptions of corollary 1, assume additionally that $\varphi(t)$ is continuous, strictly increasing, and piecewise differentiable on $[0,T]$, and that the intensity of the point process is given by equation 3.5 for a given $\kappa\geq 0$. Then the expectation of the multitrial PLV estimate converges (for $K\to+\infty$) to

**Sketch of the proof.**

This is based on plugging the intensity function $\lambda(t)$ of equation 3.5 into corollary 1. Using a change of variables in the integrals (from $\varphi(t)$ to $\theta$) and exploiting the symmetries of the functions, the integrals in the analytical expressions of the expectation and covariance turn into modified Bessel functions $I_k$ for integer $k$.

The above result has important consequences for the assessment of PLV from data. In particular, it exhibits key experimental requirements for PLV estimates to match the classical Bessel functions expression of equation 3.7: (1) evaluate PLV on an integer number of periods (this is critical for trials with short duration) and (2) take into account the fluctuations of the rate of increase of the phase $\varphi (t)$ across the oscillation period. This second point is critical in applications where the phase is inferred from signals (such as LFPs) through the Hilbert transform, as nonlinearities of the underlying phenomena may lead to nonsinusoidal oscillations, with periodic fluctuations of the time derivative of the phase $\varphi '(t)$. To further emphasize the consequences of this aspect, we also derive the asymptotic distribution of $PLV$ for a homogeneous Poisson process that corresponds to the special case $\kappa =0$ of the classical von Mises coupling of equation 3.5. Although there is no actual coupling between events and the continuous signal in such a case,^{6} the nonlinear phase increase leads asymptotically (for $K$ large) to a nonvanishing PLV estimate and to false detection of coupling.

**Corollary 3.**

Under the assumptions of corollary 1, assume additionally that the point process is homogeneous Poisson with rate $\lambda_0$ and that $\varphi(t)$ is strictly increasing (almost everywhere) and differentiable on $[0,T]$. Let $\theta\mapsto\tau(\theta)$ be its inverse function (such that $\tau(\varphi(t))=t$). Then the expectation of $PLV^K$ converges (for $K\to+\infty$) to

**Sketch of the proof.**

The result stems from using the constant intensity $\lambda_0$ in corollary 1 and then using a change of variables in the integrals and exploiting the symmetries of the functions.

This corollary will be further illustrated in the next paragraphs.
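As a first numerical illustration (a sketch with assumed parameters, not from the letter), the asymptotic expected PLV under a von Mises intensity can be evaluated by direct quadrature of the time average of the intensity-weighted phasor: when $fT$ is an integer, it matches the classical Bessel ratio $I_1(\kappa)/I_0(\kappa)$, while a noninteger $fT$ biases the estimate.

```python
import numpy as np

kappa, f, lam0 = 0.5, 1.0, 10.0   # illustrative parameter choices

def expected_plv(T, m=200000):
    # Asymptotic expected PLV under the von Mises intensity
    # lam(t) = lam0 * exp(kappa * cos(2*pi*f*t)) with linear phase phi(t) = 2*pi*f*t:
    # |int lam(t) e^{i phi(t)} dt| / int lam(t) dt, evaluated by the midpoint rule.
    t = (np.arange(m) + 0.5) * T / m
    lam = lam0 * np.exp(kappa * np.cos(2 * np.pi * f * t))
    return np.abs((lam * np.exp(1j * 2 * np.pi * f * t)).mean() / lam.mean())

# Classical Bessel ratio I_1(kappa)/I_0(kappa), using the integral representation
# I_k(x) = (1/pi) * int_0^pi exp(x*cos(u)) * cos(k*u) du.
u = (np.arange(200000) + 0.5) * np.pi / 200000
bessel = lambda k: np.mean(np.exp(kappa * np.cos(u)) * np.cos(k * u))
ratio = bessel(1) / bessel(0)

print(expected_plv(1.0), ratio)  # integer f*T: both ≈ 0.2425
print(expected_plv(1.5))         # noninteger f*T: biased upward (≈ 0.32)
```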

### 3.2 Application to Bias Assessment

Corollary 3 predicts scenarios where, in the absence of modulation of the spiking activity (constant intensity function $\lambda(t)=\lambda_0$), the expectation of the PLV estimate remains far from zero even when the number of trials is large; that is, the coupling between a homogeneous point process and a continuous oscillatory signal would appear significant and reflect a form of bias. Corollary 3 allows computing this bias and therefore correcting it.

We first consider a linearly increasing phase (see corollary 24), $\varphi(t)=2\pi ft$, where $f$ is the frequency of oscillation of the continuous signal. We then get

Following corollary 3, let $\theta\mapsto\tau(\theta)$ be the inverse of $\varphi(t)$, and let us use equation 3.9 to compute $PLV^*$. Taking a sinusoidal modulation over the oscillation period, $\tau(\theta)=\theta+\epsilon\sin(\theta)$ with $|\epsilon|<1$,^{7} we get a nonvanishing asymptotic expected PLV:

Our theoretical framework can be used for developing methods to correct such biases. In the linear phase setting, the bias can be avoided simply by using an integer number of periods for the coupling estimation. In the case of a nonlinear phase evolution of the continuous signal, we can use the theoretical phase (if available) or its empirical estimate to evaluate $PLV^*$ under the constant spike intensity assumption with equation 3.9 and subtract this quantity from the estimated PLV. For resolving issues that arise due to the nonlinearity of the estimated phase, specialized methods have been suggested. For instance, Hurtado, Rubchinsky, and Sigvardt (2004) dealt with phase jumps (a particular form of nonlinearity) by interpolating the signal from the available data before and after the sudden change, and Cole and Voytek (2019) introduced a cycle-by-cycle method for analyzing oscillatory dynamics, in which a linear phase is considered for each detected cycle of the oscillation. With this linear choice of phase, one can avoid the spurious coupling that can appear due to phase nonlinearities. Based on our framework, theoretically motivated methods that do not rely on a linearization of the phase can be developed.
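To make this evaluation concrete, the asymptotic expected PLV of a homogeneous Poisson process (the time average of $e^{i\varphi(t)}$, consistent with corollary 3's setting) can be computed by numerical quadrature after the change of variable $t=\tau(\theta)$. For the sinusoidal modulation $\tau(\theta)=\theta+\epsilon\sin(\theta)$ introduced above, the computation below yields $\epsilon/2$ (the value of $\epsilon$ is an arbitrary assumption for this sketch).

```python
import numpy as np

eps = 0.3
m = 4096
theta = np.arange(m) * 2 * np.pi / m       # uniform grid over one phase cycle
tau_prime = 1 + eps * np.cos(theta)        # tau'(theta) for tau = theta + eps*sin(theta)

# Asymptotic expected PLV for a homogeneous Poisson process: the time average of
# e^{i phi(t)}, rewritten with the change of variable t = tau(theta); the spike
# rate cancels between numerator and denominator.
plv_star = np.abs((np.exp(1j * theta) * tau_prime).mean() / tau_prime.mean())

print(plv_star, eps / 2)   # both 0.15: the bias grows linearly with eps
```

This makes the bias-correction step a one-line subtraction once the empirical phase trajectory (here, $\tau'$) has been estimated.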

### 3.3 Simulations

We demonstrate the outcome of our theoretical results using simulated phase-locked spike trains (similar to those introduced in corollaries 2 and 3) and sinusoidal oscillations. For generating phase-locked spike trains, we adopt the method introduced in Ashida et al. (2010). As the model has already been described elsewhere, we restrict ourselves to a brief explanation.

To generate phase-locked or periodic spike trains based on the classical von Mises model with rate $\lambda(t)$ as introduced in equation 3.5, we use a purely sinusoidal continuous signal $x(t)$ with linearly increasing phase $\varphi(t)=2\pi ft$, with $f=1$ Hz and various coupling strengths $\kappa$ (see appendix E for the lists of parameters used for each figure). Based on this simulation, we perform two numerical experiments to demonstrate the practical relevance of our (asymptotic) theoretical results.
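The generation step can be sketched as follows. This is a thinning implementation with assumed parameters, not necessarily the exact procedure of Ashida et al. (2010), who also describe alternative generation methods; the empirical PLV of the pooled spike phases should nonetheless approach the von Mises limit $I_1(\kappa)/I_0(\kappa)\approx 0.2425$ for $\kappa=0.5$.

```python
import numpy as np

rng = np.random.default_rng(4)
kappa, f, lam0, T, K = 0.5, 1.0, 20.0, 10.0, 100   # illustrative parameters

def von_mises_spikes():
    # One trial of an inhomogeneous Poisson process with intensity
    # lam(t) = lam0 * exp(kappa * cos(2*pi*f*t)), generated by thinning:
    # draw candidates at the maximal rate, keep each with probability lam(t)/lam_max.
    lam_max = lam0 * np.exp(kappa)
    cand = rng.uniform(0, T, rng.poisson(lam_max * T))
    keep = rng.uniform(0, 1, cand.size) < np.exp(kappa * (np.cos(2 * np.pi * f * cand) - 1))
    return np.sort(cand[keep])

phases = np.concatenate([2 * np.pi * f * von_mises_spikes() for _ in range(K)])
plv = np.abs(np.exp(1j * phases).mean())
print(plv)   # close to the von Mises limit I_1(kappa)/I_0(kappa) ≈ 0.2425
```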

#### 3.3.1 Experiment 1

In order to demonstrate the validity of corollaries 2 and 3, in Figure 2 we show the empirical distribution of the normalized residual of the PLV estimate and compare it to its asymptotic theoretical distribution. We simulate two cases, one with homogeneous Poisson spike trains ($\kappa=0$) and one with phase-locked spike trains ($\kappa=0.5$) with Poisson statistics. In both cases, we observe agreement between theory and simulation, as the joint distribution of the real and imaginary parts approaches an isotropic gaussian. The slightly non-gaussian shape of the real part histogram for $\kappa=0.5$ suggests, however, a slower convergence to the normal distribution in the case of coupled signals.

#### 3.3.2 Experiment 2

We demonstrate an application of corollary 3 for bias evaluation with a simple simulation. In section 3.2, we pointed out that using a noninteger $fT$ (i.e., $T$ is not a multiple of the oscillation period) can lead to spurious correlation between the point process and the oscillatory continuous signal. By using equation 3.11, we can compute this bias.
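A minimal version of this experiment (with assumed parameters): for a homogeneous Poisson process and linear phase, direct evaluation of the time average of $e^{i2\pi ft}$ gives the asymptotic expected PLV $|\sin(\pi fT)|/(\pi fT)$, which vanishes only for integer $fT$ (the letter's equation 3.11 may state the bias in more generality). The simulation below compares this prediction with the empirical estimate.

```python
import numpy as np

rng = np.random.default_rng(5)
f, lam0, K = 1.0, 20.0, 400   # illustrative parameters

def empirical_plv(T):
    # Pooled spike phases of K homogeneous Poisson trials (no coupling by construction).
    phasors = [np.exp(1j * 2 * np.pi * f * rng.uniform(0, T, rng.poisson(lam0 * T)))
               for _ in range(K)]
    return np.abs(np.concatenate(phasors).mean())

def predicted_bias(T):
    # Asymptotic expected PLV for linear phase: |(1/T) int_0^T e^{i 2 pi f t} dt|.
    return abs(np.sin(np.pi * f * T)) / (np.pi * f * T)

e_int, b_int = empirical_plv(10.0), predicted_bias(10.0)
e_frac, b_frac = empirical_plv(10.25), predicted_bias(10.25)
print(e_int, b_int)     # integer f*T: both close to 0
print(e_frac, b_frac)   # noninteger f*T: both close to 0.022
```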

## 4 Assessment of Multivariate Coupling

High-dimensional data sets have become increasingly important in biology (Bühlmann, Kalisch, & Meier, 2014). More specifically in neuroscience, state-of-the-art multichannel electrophysiology recording systems (Dickey, Suminski, Amit, & Hatsopoulos, 2009; Jun et al., 2017; Juavinett, Bekheet, & Churchland, 2019) allow the simultaneous recording of thousands of sites (Pesaran et al., 2018; Jun et al., 2017; Buzsáki, 2004; Fukushima, Chao, & Fujii, 2015). This growth in dimensionality requires the development of appropriate tools (Stevenson & Kording, 2011; O'Leary, Sutton, & Marder, 2015; Gao & Ganguli, 2015; Williamson, Doiron, Smith, & Yu, 2019) for computing an interpretable summary of the coupling between neurophysiological quantities reflecting the collective dynamics of the underlying neural ensembles (Truccolo, 2016; Safavi et al., 2020). To this aim, deriving low-rank approximations of high-dimensional coupling matrices is a natural approach, supported by empirical evidence and theoretical predictions of the existence of low-dimensional structures in neural activity (Ermentrout & Kleinfeld, 2001; Ermentrout & Pinto, 2007; Truccolo, Hochberg, & Donoghue, 2010; Gallego, Perich, Miller, & Solla, 2017; Mastrogiuseppe & Ostojic, 2018; Sohn, Narain, Meirhaeghe, & Jazayeri, 2019; Cueva et al., 2020). This section provides statistical results for such approximations in the context of the coupling between point processes and continuous signals.

### 4.1 Mathematical Formulation

We first adapt assumption 1 to this multivariate setting. By restricting ourselves to homogeneous Poisson processes, we investigate a null hypothesis of no coupling between continuous signals and point processes. Let us denote by $\bar{x}$ the complex conjugate of $x$ and by $\delta$ the Kronecker delta symbol:

**Assumption 2**

(Complex Multivariate Case). We consider an infinite sequence $\{x_j(t)\}_{j\geq 1}$ of complex-valued, left-continuous, deterministic functions uniformly bounded on $[0,T]$ and assume

- (1)
For all $i,j\geq 1$, $\frac{1}{T}\int_0^T \bar{x}_i x_j\,dt=\delta_{ij}$ and $\int_0^T x_i x_j\,dt=0$.

- (2)
For all $i\geq 1$, $\int_0^T x_i\,dt=0$.

- (3)
There exist $0<\lambda_{\min}<\lambda_{\max}$ and a sequence of independent homogeneous Poisson processes $\{N_i\}_{i\in\mathbb{N}^*}$ with associated rates $\{\lambda_i\}_{i\in\mathbb{N}^*}$ in the interval $[\lambda_{\min},\lambda_{\max}]$.

While the assumptions on $\{x_i(t)\}$ are designed for complex signals, which is the classical case when dealing with PLV-like quantities, the results of this section also hold for real signals by using the assumption $\frac{1}{T}\int_0^T x_i x_j\,dt=\delta_{ij}$ instead of condition 1 above. Condition 2 is added to ensure that there is no trivial bias leading to a nonvanishing expectation of the coupling coefficients (as illustrated in section 3.2): when the time average of each signal vanishes, theorem 1 implies that the expectation of all univariate coupling measures for a homogeneous Poisson process vanishes. We then exploit a multivariate generalization of the martingale CLT to characterize the distribution of the coupling matrix under these assumptions.

**Theorem 2.**

For given $n,p\geq 1$ and all $K\geq 1$, we use the sequences of signals defined in assumption 2 to build the multivariate continuous signal $x(t)=(x_j)_{j=1,\dots,p}$ and $K$ independent copies of the multivariate Poisson process $N(t)=(N_i)_{i=1,\dots,n}$ with rate vector $\lambda=[\lambda_1,\dots,\lambda_n]^\top$. Then the normalized coupling matrix $\sqrt{K}\,\hat{C}^K\mathrm{diag}(\sqrt{T\lambda})^{-1}$, with $\hat{C}^K$ given by equation 4.1, converges in distribution for $K\to+\infty$ to a matrix with i.i.d. complex standard normal coefficients.

**Sketch of the proof.**

This essentially uses a generalization of the CLT to multivariate point processes described in Aalen et al. (2008, appendix B). Based on the statistics of stochastic integrals presented in section B.1.1, the assumptions on $x$ entail vanishing correlations between all matrix coefficients and lead to the analytical expression of the covariance matrix.
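A simulation sketch of this normalization (with assumed parameters and a Fourier signal family satisfying assumption 2): after rescaling, the entries of the coupling matrix should be approximately zero mean with unit second absolute moment. The normalization $\sqrt{K}\,\hat{C}\,\mathrm{diag}(\sqrt{T\lambda})^{-1}$ used below is one consistent reading of the theorem's statement, not a verbatim quote of equation 4.1.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, K, T = 40, 20, 200, 1.0
lam = rng.uniform(5.0, 15.0, n)   # heterogeneous Poisson rates

# Coupling coefficients C[i, j] = (1/K) sum_k sum_{spike times t of unit i} x_j(t),
# with the Fourier family x_j(t) = exp(i 2 pi j t / T) (orthonormal, zero time average).
C = np.zeros((n, p), dtype=complex)
for i in range(n):
    for _ in range(K):
        s = rng.uniform(0, T, rng.poisson(lam[i] * T))
        C[i] += np.exp(1j * 2 * np.pi * np.outer(s, np.arange(1, p + 1)) / T).sum(axis=0)
C /= K

# Normalization in the spirit of theorem 2: sqrt(K) * C * diag(sqrt(T*lam))^{-1}
# should have approximately zero-mean entries with E|z|^2 = 1.
Z = np.sqrt(K) * C / np.sqrt(T * lam)[:, None]
print(np.abs(Z.mean()), (np.abs(Z) ** 2).mean())   # close to 0 and close to 1
```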

**Theorem 3.**

Under assumption 2, assume increasing, positive integer sequences $\{p(n),K(n)\}_{n\in\mathbb{N}^*}$ such that $\frac{p(n)}{n}\to\alpha\in(0,+\infty)$ as $n\to+\infty$ and the condition of equation 4.4 holds. Build the coupling matrix as in theorem 2 for $p=p(n)$; then the corresponding sequence $\{S_n\}$ defined by equation 4.3 has an ESD converging weakly with probability one to the MP law of equation 2.7.

**Sketch of the proof.**

We use theorem 1.1 of Bai and Zhou (2008), addressing the case of matrices with dependent coefficients within columns. We use Itô's formula (see appendix B) to check the simplified sufficient conditions provided in corollary 1.1 of Bai and Zhou (2008). This implies convergence of the Stieltjes transform to the same function as the transform of the MP distribution. By classical results on the Stieltjes transform (Anderson et al., 2010, theorem 2.4.4), this implies weak convergence to the MP measure (i.e., convergence for the weak topology; see appendix B.2).

**Remark 2.**

The condition in equation 4.4 determines how many trials are needed at most for spectral convergence. Due to the uniform boundedness assumption on the signal $x(t)$, and given that the number of terms in the sum is bounded by $n^4$, we can already see that $\frac{n}{K(n)}\to 0$, that is, having the number of trials increase at an even slightly faster rate than the dimension, is enough for convergence for any choice of continuous signals respecting the orthonormality condition of assumption 2. However, there are cases where even fewer trials than dimensions are required. An important example is the Fourier basis of the $[0,T]$ interval, $x_l(t)=\exp(i2\pi lt/T)$. Then all terms in the sum of equation 4.4 vanish, except the ones satisfying $j-j'-l+l'=0$, such that we are left with a number of bounded terms that scales with $n^3$. As a consequence, the condition on the number of trials to achieve spectral convergence becomes $\frac{n}{K(n)^2}\to 0$, such that we need increasingly fewer trials than dimensions. On the contrary, due to the uniform bound that we impose on the signals, choosing a basis of signals with decreasing support, such as a wavelet basis, typically departs from condition 1 of assumption 2: the normalization of condition 1 imposes unit norm on each signal, requiring its amplitude to increase as its support decreases, which violates the uniform bound assumption. This limitation supports the intuition that the statistical regularities exploited by our asymptotic results deteriorate with highly transient signals.

The convergence of the spectral measure to the MP law guarantees that eigenvalues do not accumulate in a large proportion above the upper end of the support of the MP law; however, it does not provide rigorous guarantees regarding the convergence of individual eigenvalues, in particular the largest one. Although such convergence is satisfied in classical settings (gaussian i.i.d. coefficients), it typically requires stronger assumptions than the (weak) spectral convergence to the MP law, and still only very few results are available in the non-i.i.d. setting. We could, however, prove such convergence by adding a constraint to our model.

**Theorem 4.**

**Sketch of the proof.**

The identical-intensities assumption allows us to use the result of Chafaï and Tikhomirov (2018) for matrices with i.i.d. columns. We first checked that their proof also holds in the complex case by replacing symmetric matrices with Hermitian matrices and the squared scalar product with the squared modulus of the Hermitian product. We satisfy their strong tail projection (STP) assumption using Chebyshev's inequality. The necessary fourth-order moment conditions exploit the same stochastic integration results as theorem 3.

**Remark 3.**

Without additional assumptions, the moment condition of equation 4.5 is satisfied by choosing $K(n)=n^2$ (as there are $p^2$ bounded moments, scaling as $n^2$ when $n$ grows). It is likely from the proof that by taking into account more information about the moments of the continuous signal sequence $\{x_j\}$, convergence can be achieved with a lower rate of increase of the number of trials. This is left to future work.

This result thus provides the guarantee that under the null hypothesis of no coupling (due to the homogeneity of the Poisson processes), the extreme eigenvalues of $S$ will asymptotically cover exactly the full support of the MP law. This will be used in section 4.2 to assess the significance of the eigenvalues $\ell_k$ by simply checking whether they are larger than $(1+\sqrt{\alpha})^2$.
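The thresholding rule can be illustrated with a stand-in for the normalized null coupling matrix, an i.i.d. complex gaussian matrix (the limiting distribution given by theorem 2): all squared singular values should fall inside the MP support, with the largest approaching the edge $(1+\sqrt{\alpha})^2$. The dimensions below are arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 400, 100
alpha = p / n

# Stand-in for the normalized coupling matrix under the null: i.i.d. complex
# standard normal entries (E|z|^2 = 1).
Z = (rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))) / np.sqrt(2)

# Squared singular values of Z / sqrt(n): their distribution approaches the MP law
# with ratio alpha, whose support ends at (1 + sqrt(alpha))^2.
ev = np.linalg.svd(Z / np.sqrt(n), compute_uv=False) ** 2
edge = (1 + np.sqrt(alpha)) ** 2
print(ev.max(), edge)   # largest value close to (but not above) the edge 2.25
```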

This significance analysis also relies on understanding what happens to the eigenvalues when the model departs from the null hypothesis. In a practical setting, we hypothesize that the coupling matrix has a deterministic structure superimposed on the martingale noise modeled in the above results. One qualitative justification of this assumption can be found in remark 1, showing that for sinusoidal coupling, a nonvanishing expectation proportional to the coupling is superimposed on martingale noise whose distribution is unaffected by the coupling, such that the noisy part of the matrix satisfies the conditions of the above theorems. As typically done in applications, we are mostly interested in the low-rank structure associated with the largest singular values of the coupling matrix, providing an interpretable summary of the multivariate interactions.

This naturally leads to modeling the departure from the null hypothesis with a low-rank perturbation assumption. In such a case, the eigenvalues related to significant coupling are expected to be reflected in the spectrum of the perturbed matrix, such that they can be isolated from the remaining eigenvalues associated with the martingale noise. This intuition is justified by results in the case of the Wishart ensemble (Loubaton & Vallet, 2011; see also Benaych-Georges & Nadakuditi, 2012, for a more general result, and Capitaine & Donati-Martin, 2016, for an overview of matrix perturbation results), which we restate here:

**Theorem 5**

A demonstration that this further applies rigorously to our nongaussian, non-i.i.d. case is left to future work (but see Benaych-Georges & Nadakuditi, 2012, for a generalization in this direction). This result shows that the upper end of the MP support is indeed the critical threshold for the eigenvalues of $A_n$ to stand out from the noise. Below this threshold, the convergence of the largest eigenvalue to the upper end of the support of the MP distribution is not informative about $\theta_i$. Above this threshold, the value of $\theta_i$ can be recovered and detected by comparing the largest eigenvalue to the upper end of the MP support.
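The phase transition described by this theorem can be illustrated numerically. The sketch below uses a simplified i.i.d. gaussian noise model (not the exact martingale noise of the text) and a rank-1 perturbation of hypothetical strength `theta`:

```python
import numpy as np

# Rank-1 (BBP-type) perturbation of a p x n noise matrix: the top squared
# singular value escapes the MP support only when theta is large enough.
rng = np.random.default_rng(1)
p, n = 100, 400
alpha = p / n
mp_edge = (1 + np.sqrt(alpha)) ** 2

def top_eig(theta):
    noise = rng.standard_normal((p, n))
    u = rng.standard_normal((p, 1)); u /= np.linalg.norm(u)
    v = rng.standard_normal((n, 1)); v /= np.linalg.norm(v)
    M = noise / np.sqrt(n) + theta * u @ v.T  # low-rank signal plus noise
    return np.linalg.svd(M, compute_uv=False)[0] ** 2

print(top_eig(0.1), top_eig(3.0), mp_edge)  # weak spike hides, strong one escapes
```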

We next illustrate the interest of these theoretical predictions in the context of neural time series, for reliably quantifying the interplay between multichannel LFP signals and the spiking of multiple neurons; the results are potentially applicable in other domains as well. In neuroscience, $x$ may represent the LFP measurements collected on each recording channel and $N$ the spiking activity of different neurons, called units. The numbers of recording channels $n_c$ and recorded units $n_u$ correspond to $p$ and $n$, respectively. These numbers may differ, so the coupling matrix is generally rectangular.

### 4.2 Application to Significance Assessment

In order to statistically assess the significance of the largest singular value(s) of the coupling matrix $C^K$, considered as a measure of coupling between point processes and continuous signals, we need a null hypothesis. Hypothesis testing based on the generation of surrogate data is one of the common methods for significance assessment in neuroscience and other fields. Generating appropriate surrogate data can be not only challenging (see Grün, 2009, and Elsayed & Cunningham, 2017, for examples in neuroscience) but also computationally expensive due to the increasingly large dimensions of modern data sets. Exploiting our theoretical results in this setting allows us to perform such statistical assessment in a principled way, without using surrogate data and sparing computational resources.

The continuous signals are first bandpass filtered, in line with assumption 7 (and to avoid bias in the coupling measure similar to what is described in section 3.3). We then need to process this signal further such that condition 1 of assumption 7 is satisfied approximately. To achieve this, we perform classical whitening of the signals to generate matrix $X$, the discrete-time approximation of $x(t)$, according to equation 4.6, which makes the empirical covariance of the signal components approximately equal to the identity, a key requirement for convergence to the MP law.
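A minimal sketch of such a whitening step, with illustrative names and shapes (this is not the exact pipeline of equation 4.6, just the standard covariance-based whitening it refers to):

```python
import numpy as np

# L holds n_c raw signals (channels x time samples); X is whitened so that
# its empirical covariance is approximately the identity.
rng = np.random.default_rng(2)
n_c, n_t = 4, 10000
A = rng.standard_normal((n_c, n_c))          # mixing creates channel correlations
L = A @ rng.standard_normal((n_c, n_t))      # raw (correlated) signals
cov = (L @ L.conj().T) / n_t                 # empirical channel covariance
eigval, eigvec = np.linalg.eigh(cov)
W = eigvec @ np.diag(eigval ** -0.5) @ eigvec.conj().T  # cov^{-1/2}
X = W @ L                                    # whitened signals
print(np.allclose((X @ X.conj().T) / n_t, np.eye(n_c)))  # prints True
```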

### 4.3 Simulation

We use a simulation to demonstrate the outcome of our (asymptotic) theoretical results on multivariate coupling. Similar to the simulations of section 3.3 for the univariate case, we use simulated phase-locked spike trains with Poisson statistics. The main difference from the previous simulation lies in the synthesis of the LFP: in order to simulate multichannel oscillatory signals that lead to a low-rank structure for $C^*$, we use a combination of noisy oscillatory components.

In this simulation, the frequencies of the oscillatory components range from 11 to 15 Hz. We used 100 LFP channels ($n_c=100$) and different choices for the number of spiking units (10, 50, and 90). Spiking activities are simulated in different scenarios, with and without coupling to the LFP oscillations. In the coupled scenario, we have two populations of neurons (each consisting of one-fifth of the total number of neurons), each coupled to one of the oscillatory groups of LFP channels. Both populations are coupled to their respective oscillation with identical strength ($\kappa=0.15$) and phase ($\varphi_0=0$).

To compute the coupling matrix $C^K$, we first preprocess $\Psi[t]$ by applying bandpass filtering in a range covering $[f_{\min},f_{\max}]$ and convert it to an analytic signal via the discrete-time Hilbert transform, leading to data matrix $L$, following the standards of PLV analysis in neuroscience (Chavez, Besserve, Adam, & Martinerie, 2006).

This signal matrix is then whitened according to equation 4.6 to yield matrix $X$, the discrete-time version of $x(t)$. The coupling matrix $C^K$ is then computed according to equation 4.1 using 10 trials (up to the trivial approximation of each spike time by the closest time sample in $X$).
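The overall simulation pipeline can be sketched as follows; all parameter names are illustrative, the analytic signal is built directly as a complex exponential rather than via filtering and the Hilbert transform, and spike trains are drawn by thinning an inhomogeneous Poisson process:

```python
import numpy as np

# Illustrative sketch: each coupling-matrix entry averages the analytic LFP
# value of channel j at the spike times of unit i (a PLV-type estimate).
rng = np.random.default_rng(3)
fs, T, f = 1000.0, 5.0, 12.0                 # sampling rate (Hz), duration (s), frequency
t = np.arange(0, T, 1 / fs)
n_c, n_u, lam0, kappa = 8, 6, 20.0, 0.15     # channels, units, base rate, coupling
phase = 2 * np.pi * f * t
# Analytic channels: common oscillation with random per-channel phase offsets.
X = np.exp(1j * (phase[None, :] + rng.uniform(0, 2 * np.pi, (n_c, 1))))

# Phase-locked spike trains via thinning of an inhomogeneous Poisson process.
lam_max = lam0 * (1 + kappa)
spikes = []
for _ in range(n_u):
    cand = np.sort(rng.uniform(0, T, rng.poisson(lam_max * T)))
    rate = lam0 * (1 + kappa * np.cos(2 * np.pi * f * cand))
    spikes.append(cand[rng.uniform(0, lam_max, cand.size) < rate])

# Coupling matrix: mean analytic value at each unit's spike times,
# rounding spike times to the closest sample.
C = np.zeros((n_c, n_u), dtype=complex)
for i, s in enumerate(spikes):
    idx = np.minimum((s * fs).round().astype(int), t.size - 1)
    C[:, i] = X[:, idx].mean(axis=1)
print(np.abs(C).max())
```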

## 5 Discussion

### 5.1 Insights for Data Analysis

Our theoretical results provide guarantees for specific coupling models to respect univariate and multivariate asymptotic statistics that can be easily exploited for statistical testing. The required assumptions provide guidelines for practical settings that are likely of interest beyond the strict framework that we imposed to get the rigorous results. For the univariate coupling measure, corollaries and simulations point out the importance of the choice of the observation interval $[0,T]$, which is particularly sensitive when considering short intervals covering only a few oscillation periods. This is the case when doing time-resolved analysis or dealing with experiments with short trial durations. Moreover, the univariate results also emphasize the effect of nonlinear phase increases, highly relevant in neuroscience due to the pervasive effects of nonlinear dynamics in the mesoscopic signals. Our results provide asymptotic bias correction terms that can be used for statistical testing.

In the same way, the theoretical results in the multivariate setting may seem constrained by our assumptions, but they provide critical guidelines to interpret singular values. First, whitening the continuous signals and normalizing the coupling by the square root of the rate are key preprocessing steps for making the asymptotic behavior of the martingale noise invariant to the specifics of the data at hand. This then reduces to an analytical model, the MP law, dependent on only a single matrix shape parameter. After assessment of the significance of the singular values of the normalized coupling matrix, it is of course possible to revert these preprocessing steps to get a low-rank approximation of the original coupling matrix (nonnormalized, nonwhitened) that summarizes the significant coupling structure in an interpretable way. A second insight provided by the multivariate results is the role of fourth-order moments of the continuous signals, represented by the integrals of order-four monomials of components of $x(t)$, in the MP convergence results. The magnitude of these moments determines the number of trials asymptotically needed to achieve convergence. Since these moments can be estimated empirically, we can check how they grow with the dimension of the signals in a specific application. With our minimal assumptions on the signals, the number of trials needs only to grow at most sublinearly in the dimension for spectral convergence; however, we could only show that convergence of the largest eigenvalues requires at most a quadratic increase in the dimension $n$. This last result might be improved in future work, with extra assumptions, to reach linear growth.

Our theoretical results can be extended in two directions. The first is toward point processes other than inhomogeneous Poisson (e.g., Hawkes processes), in order to apply the framework in the context of stochastic intensities. The second is toward exploiting recent developments in RMT, in order to develop a probabilistic significance assessment.

### 5.2 Extension of Signal Assumptions

Our theoretical results assume deterministic continuous signals and point process intensities (see assumption 1). This entails limitations, such as implicitly assuming the considered point processes are (homogeneous or inhomogeneous) Poisson processes. This assumption may be too restrictive in realistic scenarios (for examples in neuroscience, see Deger, Helias, Boucsein, & Rotter, 2012; Reimer, Staude, Ehm, & Rotter, 2012; Nawrot et al., 2008; Shinomoto, Shima, & Tanji, 2003; Maimon & Assad, 2009; Shinomoto et al., 2009). However, the stochastic integration methods that provide the basis of our results allow the treatment of random signals and intensities, provided they are predictable, which encompasses a wide enough class of processes to cover most applications (Protter, 2005). Our results thus have potential for generalization to the case of random continuous signals, with the difference that the variance of the estimates would increase due to the additional variability induced by the signal fluctuations, and to the case of random intensities, leading to different asymptotic properties of the coupling measures, which may or may not have simple analytical expressions.

As a potential direction for extending this framework, Hawkes processes (Hawkes, 1971) are point processes for which the probability of occurrence of future events can also depend on the sequence of past events. Due to this history dependency, they are also called self-exciting processes. Hawkes processes are used for modeling recurrent interactions in various fields: in finance, to model buy or sell transaction events on stock markets (Embrechts, Liniger, & Lin, 2011); in geology, to model the origin times and magnitudes of earthquakes (Ogata, 1988); in online social media, to model user actions over time (Rizoiu, Lee, Mishra, & Xie, 2017) and even the reliability of information on the web and the control of the spread of misinformation (Tabibian et al., 2017; Kim, Tabibian, Oh, Schölkopf, & Gomez-Rodriguez, 2018); and in neuroscience, to model spike trains (Krumin et al., 2010). We conjecture that such history dependency can be incorporated in our analytic treatment of the coupling measure, such that our theoretical results can be extended to this model.
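For concreteness, a univariate Hawkes process with exponential kernel can be simulated by Ogata's thinning algorithm; the sketch below uses illustrative parameter values (baseline rate `mu`, excitation `a`, decay `beta`):

```python
import numpy as np

# Univariate Hawkes process, intensity mu + sum_s a * exp(-beta (t - s)),
# simulated by Ogata's thinning (valid because the intensity decays
# between events, so the value at the current time is an upper bound).
def simulate_hawkes(mu=1.0, a=0.5, beta=2.0, T=10.0, seed=0):
    rng = np.random.default_rng(seed)
    events, t = [], 0.0
    while True:
        lam_bar = mu + sum(a * np.exp(-beta * (t - s)) for s in events)
        t += rng.exponential(1 / lam_bar)     # candidate next event time
        if t > T:
            break
        lam_t = mu + sum(a * np.exp(-beta * (t - s)) for s in events)
        if rng.uniform(0, lam_bar) < lam_t:   # accept with prob lam_t / lam_bar
            events.append(t)
    return events

print(len(simulate_hawkes()))
```

With branching ratio $a/\beta < 1$ the process is stable, and the mean rate is $\mu/(1-a/\beta)$.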

### 5.3 Extension beyond Binary Significance Assessment

We show that the Marchenko-Pastur distribution provides a good approximation of the distribution of eigenvalues in the absence of coupling and that the upper end of its support approximates the largest eigenvalue. This provides a threshold to assess the significance of empirical singular values. Nevertheless, this hard-thresholding approach does not take into account the actual fluctuations of the largest eigenvalue around this asymptotic upper end of the support and thus does not provide a meaningful $p$-value for the statistical test.

It has been shown that the appropriately rescaled and recentered^{8} largest eigenvalue of Wishart matrices is asymptotically distributed as the Tracy-Widom distribution (see, e.g., Johnstone, 2001; Tracy & Widom, 2002; El Karoui, 2003, 2005, 2007). However, note that in some cases of practical relevance, the normal distribution might be more appropriate (Bai & Yao, 2008). Such an asymptotic distribution of the largest eigenvalue can be exploited for reporting a theoretical $p$-value for the significance of the coupling, therefore extending the significance assessment from a binary decision to a probabilistic one. For example, Kritchman and Nadler (2009) exploit this idea (in a simpler scenario) to determine the number of signal components in noisy data. This extension would allow a precise probabilistic assessment of the significance of weaker couplings leading to eigenvalues in the neighborhood of the asymptotic threshold introduced above.
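Short of the analytic Tracy-Widom quantiles, an empirical $p$-value can already be obtained by Monte Carlo on a matching gaussian null model; the following sketch (dimensions and the observed value are hypothetical) illustrates the idea:

```python
import numpy as np

# Approximate the null distribution of the largest squared singular value
# by Monte Carlo on an i.i.d. complex gaussian model, then report an
# empirical p-value for an observed value.
rng = np.random.default_rng(4)
p, n, n_mc = 50, 200, 500

def largest_sq_sv():
    Z = (rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))) / np.sqrt(2)
    return np.linalg.svd(Z / np.sqrt(n), compute_uv=False)[0] ** 2

null = np.array([largest_sq_sv() for _ in range(n_mc)])
observed = 2.6   # hypothetical squared singular value measured on data
p_value = (null >= observed).mean()
print(p_value)
```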

## 6 Conclusion

We investigated the statistical properties of coupling measures between continuous signals and point processes. We first used martingale theory to characterize the distributions of univariate coupling measures such as the PLV. Then, based on multivariate extensions of this result and RMT, we established predictions regarding the null distribution of the singular values of coupling matrices between a large number of point processes and continuous signals, and a principled way to assess the significance of such multivariate coupling. These theoretical results build a solid basis for the statistical assessment of such coupling in applications dealing with high-dimensional data.

## Appendix: Proofs of Theorems in the Main Text

**Proof of Theorem 2.**

We then exploit a central limit theorem (CLT) for martingales to prove the second part of the theorem (convergence to a gaussian distribution). To satisfy the CLT in this case, it is sufficient to find a particular martingale sequence $\tilde{M}^{(K)}$ that satisfies the conditions described in Aalen et al. (2008, p. 63) ($\stackrel{P}{\longrightarrow}$ indicates convergence in probability):

- 1.
$\mathrm{Var}\big(\tilde{M}^{(K)}(t)\big)\stackrel{P}{\longrightarrow}_{K\to+\infty}\tilde{V}(t)$ for all $t\in[0,T]$, with $\tilde{V}$ increasing and $\tilde{V}(0)=0$.

- 2.
Informally, the size of the jumps of $\tilde{M}^{(K)}$ tends to zero (see Aalen et al., 2008, p. 63). Formally, for any $\epsilon>0$, the martingale $\tilde{M}_\epsilon^{(K)}(t)$ gathering the jumps larger than $\epsilon$ satisfies $\mathrm{Var}\big(\tilde{M}_\epsilon^{(K)}(t)\big)\stackrel{P}{\longrightarrow}_{K\to+\infty}0$.

Then $\tilde{M}^{(K)}(t)$ converges in distribution to a gaussian martingale of variance $\tilde{V}(t)$.

To achieve these conditions, we define $M^{(k)}$, the sequence of i.i.d. zero-mean martingales defined on $[0,T]$ canonically associated with the point process of each trial $N^{(k)}$. Then we build the martingales $M_x^{(k)}(t)=\int_0^t x(s)\,dM^{(k)}(s)$ and construct $\tilde{M}^{(K)}=\frac{1}{\sqrt{K}}\sum_{k=1}^K M_x^{(k)}$.

For condition 2, due to assumption 1, $x(t)$ is bounded, such that there is a $B>0$ satisfying $|x(t)|<B$ over $[0,T]$. As a consequence, the size of all jumps is bounded by $B/\sqrt{K}$, and for any $\epsilon$, $\tilde{M}_\epsilon^{(K)}(t)$ is constantly zero for $K>B^2/\epsilon^2$, so condition 2 is satisfied.

**Proof of Corollary 3.**

We apply theorem 2 to $e^{i\varphi(t)}$ (i.e., replacing $x(t)$ with $e^{i\varphi(t)}$). As $e^{i\varphi(t)}$ is complex valued, we need a covariance function for its predictable variation process $\tilde{V}(t)$. The covariance between a martingale's real part,

By theorem 2, as $K\to+\infty$, the residuals converge in distribution to a zero-mean complex gaussian variable $Z$ (i.e., the joint distribution of real and imaginary parts is gaussian):

Since theorem 2 guarantees that $\sqrt{K}(\hat{c}^K-c^*)$ tends to a gaussian with finite variance, $\hat{c}^K$ tends to the Dirac measure at $c^*$.

**Proof of Corollary 5.**

We apply corollary 3. The $PLV$ asymptotic value ($PLV^*$) can be derived from the definition introduced in equation 3.2:

We plug the assumed rate $\lambda(t)$ in corollary 3 and integrate on $[0,T]$:

(see theorem 2). With this change, equation A.18 becomes

We can have a similar calculation for the imaginary part, $\mathrm{Cov}(Z)_{22}$, using the identity $\sin^2(\theta)=\frac{1}{2}\left(1-\cos(2\theta)\right)$ instead of equation A.20. The off-diagonal elements of the covariance matrix vanish due to symmetries of the integrand.

**Proof of Corollary 6.**

Similar to corollary 5, we can derive the asymptotic $PLV$, equation 3.9, for this case from the definition in equation 3.2. We apply the intensity function $\lambda=\lambda_0$ in corollary 3. The $PLV$ asymptotic value ($PLV^*$) can be derived simply by changing the integration variable from $\varphi(t)$ to $\theta$ (letting $\theta\mapsto\tau(\theta)$ be its inverse).

As in corollary 5, we plug the rate $\lambda_0$ as $\lambda(t)$ in corollary 3 and integrate on $[0,T]$:

**Proof of Theorem 8.**

Similar to the proof of theorem 2, we rely on a CLT, this time adapted to the case of vector-valued martingales (Aalen et al., 2008, appendix B).

As in the proof of theorem 2, we introduce the $(p\times n)$-variate martingale:

- 1.
$\int_0^t H^{(K)}(u)\,\mathrm{diag}\{\lambda^{(K)}(u)\}\,H^{(K)}(u)^\top du\stackrel{P}{\longrightarrow}V(t)$.

- 2.
$\sum_{j=1}^k\int_0^t\left(H_j^{(K)}(u)\right)^2\mathbf{1}_{|H_j^{(K)}(u)|>\epsilon}\,\lambda_j^{(K)}(u)\,du\stackrel{P}{\longrightarrow}0$, for all $t\in[0,T]$ and $\epsilon>0$.

Then the above stochastic integral converges in distribution to a mean-zero gaussian martingale of covariance $V(t)$.

^{14}, as

Under assumption 7, we obtain a diagonal covariance matrix with block-constant diagonal coefficients,

**Proof of Theorem 9.**

Based on proposition 23 in section B.3, we need only check the following four conditions, using the Kronecker delta notation of equation 4.2:

- 1.
$\mathbb{E}\,\bar{X}_{jk}X_{lk}=\delta_{lj}$, for all $k$.

- 2.
$\frac{1}{n}\max_{j\neq l}\mathbb{E}\left|\bar{X}_{jk}X_{lk}\right|^2\longrightarrow 0$ uniformly in $k\leq n$.

- 3.
$\frac{1}{n^2}\sum_{\Gamma}\mathbb{E}\left[\left(\bar{X}_{jk}X_{lk}-\delta_{lj}\right)\left(X_{j'k}\bar{X}_{l'k}-\delta_{j'l'}\right)\right]^2\longrightarrow 0$ uniformly in $k\leq n$, where $\Gamma=\{(j,l,j',l'):1\leq j,l,j',l'\leq p\}\setminus\{(j,l,j',l'):j=j'\neq l=l'\text{ or }j=l'\neq j'=l\}$.

- 4.
$p/n\longrightarrow\alpha\in(0,\infty)$.

As in the proof of theorem 8, we use the auxiliary processes

Condition 1 is a direct application of equation A.42 in the proof of theorem 8, because $\mathbb{E}\,\bar{X}_{jk}X_{lk}$ is the covariance between the coefficients of the normalized coupling matrix.

(using assumption 7) that it has mean $\delta_{ij}$. As a consequence, $\mathbb{E}\left|\bar{X}_{jk}X_{lk}-\delta_{lj}\right|^2$ is the variance of the above expression, which is (by the stochastic integral formula)

Under assumption 7, this expression is bounded uniformly for any values of $i,j,n,k$, and condition 2 is fulfilled.

We use proposition 16 to compute the required fourth-order moments:

*integral product term*; the last term in this expression will be dealt with independently. Given assumption 7, it is clear that for $l,j,j',l'$ all different from each other, the integral product term vanishes. If only two indices are equal, the moment also vanishes (at least one term of each product vanishes). For the case $j=l=j'=l'$, the integral product term possibly does not vanish but is uniformly bounded, and only $n$ terms satisfy this relation, such that it does not affect the limit of the relevant expression for condition 3 (due to the $1/n^2$ factor).

Under assumption 7, the integral product term still vanishes. As a consequence, the asymptotic behavior we are interested in is given by the remaining single integral term of the moment, $\frac{1}{K\lambda_k}\int_0^T x_l\bar{x}_j x_{j'}\bar{x}_{l'}\,ds$ (the only remaining nonvanishing terms are bounded and intervene in only $n$ terms of the sum), such that

To sum up, all four conditions for the application of proposition 23 are fulfilled (condition 4 is part of the assumptions), and the convergence to the MP law follows immediately.

**Proof of Theorem 11.**

**Definition 1**

As in the proof of theorem 9, we use

The same computation as in the proof of theorem 9 (based on proposition 16) leads to an expansion for which all terms vanish but one per expectation, leading to

## Appendix B: Additional Background and Useful Results

### B.1 Jump Processes

Jump processes exhibit discontinuities related to the occurrence of random events, which are distributed according to the given point process models. In this letter, we are concerned with jump times distributed according to (possibly inhomogeneous) Poisson processes.

#### B.1.1 Martingales Related to Counting Processes

#### B.1.2 Stochastic Integrals

#### B.1.3 Second-Order Statistics

#### B.1.4 General Jump Stochastic Processes

#### B.1.5 Mean Stochastic Jump Integrals

#### B.1.6 Itô's Formula

Itô's formula, or Itô's lemma, is an identity for the differential of a function of a stochastic process. It is a counterpart of the chain rule used to compute the differential of composed functions. We restrict ourselves to the case of a time-independent scalar function of a jump process; different formulas exist for other cases.
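In the restricted setting just described (a pure jump process $X$ driven by a counting process $N$, and a time-independent scalar function $f$), the formula can be sketched as follows; the notation is ours and should be checked against the precise statement:

```latex
% Hedged sketch: pure jump process X(t), time-independent scalar f,
% jump times given by the counting process N (no diffusion part).
f(X_t) \;=\; f(X_0) \;+\; \sum_{0 < s \le t} \bigl( f(X_s) - f(X_{s^-}) \bigr)
       \;=\; f(X_0) \;+\; \int_0^t \bigl( f(X_{s^-} + \Delta X_s) - f(X_{s^-}) \bigr)\, dN_s .
```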


An application of this formula that we will use follows:

**Proposition 2.**

**Proof.**

### B.2 Notions of Convergence

#### B.2.1 Convergence in Distribution

The classical definition is based on the CDF.

**Definition 2**

An equivalent definition can be formulated in terms of weak convergence:

**Proposition 3.**

The generalization to the multidimensional variables encountered in theorem 8 consists simply in replacing the cumulative distribution by its multivariate version, $F_X(x)=P(X_1<x_1,\ldots,X_n<x_n)$, in the definitions. A simple necessary and sufficient condition for $X\longrightarrow Y$ in distribution is that $t^\top X\longrightarrow t^\top Y$ for all vectors $t$ (this is the Cramér-Wold theorem; see Billingsley, 1995).

#### B.2.2 Convergence in Probability

It can be shown that convergence in probability implies convergence in distribution. The converse is true only in special cases:

**Proposition 4.**

If $X$ converges in distribution to a (deterministic) constant $c$, then it also converges to it in probability.

An extension to the multivariate case is obtained in finite vector spaces by replacing the absolute value in equation B.14 by any norm, or simply by requiring the convergence of all components individually.

#### B.2.3 Convergence of Random Measures

The ESDs are random measures and, as such, random variables living in an infinite-dimensional space of measures. This means that for a fixed realization $\omega$, the random measure $\mu$ takes the deterministic value $\mu(\omega)$.

Several types of convergence can be defined. First, the notion of *convergence weakly in probability* can be seen as a combination of the above definitions. It is known that the weak convergence of deterministic measures (see proposition 18) can be associated with a (nonunique) metric (the topological space of weak convergence is metrizable). Let us pick such a metric $\rho(\mu,\nu)$ between two deterministic measures; then:

**Definition 3**

A sequence of random measures $\{\mu_k\}$ converges *weakly in probability* to the deterministic measure $\nu$ if, for any $\epsilon>0$:

Next, we can also define convergence with probability 1 (also called *almost sure convergence*).

**Definition 4**

As for the case of scalar random variables, convergence with probability one implies convergence in probability.

### B.3 Random Matrix Theory

Random matrix theory resulted from fairly recent developments in high-dimensional statistics. It has various applications in physics (Guhr, Müller-Groeling, & Weidenmüller, 1998; Doussal, Majumdar, & Schehr, 2016), machine learning (Pennington & Bahri, 2017; Pennington & Worah, 2017; Louart et al., 2018), and neuroscience (Timme, Geisel, & Wolf, 2006; Veraart et al., 2016; Almog et al., 2019).

#### B.3.1 Wishart Ensemble

Let $X$ be a $p\times n$ data matrix. Assume that the coefficients $x_{ij}$ of $X$ are i.i.d. $\mathcal{N}_{\mathbb{C}}(0,1)$, where $\mathcal{N}_{\mathbb{C}}$ denotes the standard complex normal distribution. By definition, this means that $x_{ij}=x_{ij}^{\mathrm{real}}+i\,x_{ij}^{\mathrm{imag}}$, where $x_{ij}^{\mathrm{real}}$ and $x_{ij}^{\mathrm{imag}}$ are independent real $\mathcal{N}(0,\frac{1}{2})$ variables. This implies that the columns of $X$ are i.i.d. $\mathcal{N}_{\mathbb{C}}(0_p,I_p)$ and, similarly, their real and imaginary parts are $\mathcal{N}(0_p,I_p/2)$.
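For reference, a standard statement of the MP law consistent with this normalization ($S=\frac{1}{n}XX^*$ of size $p\times p$, ratio $\alpha=p/n$) is the following sketch:

```latex
% Marchenko-Pastur law for S = (1/n) X X^*, with X of size p x n and alpha = p/n:
d\mu_{\mathrm{MP}}(x)
  \;=\; \max\!\Bigl(1-\tfrac{1}{\alpha},\,0\Bigr)\,\delta_0(dx)
  \;+\; \frac{\sqrt{(\lambda_+ - x)(x-\lambda_-)}}{2\pi\,\alpha\,x}\,
        \mathbf{1}_{[\lambda_-,\,\lambda_+]}(x)\,dx,
\qquad \lambda_\pm = \bigl(1\pm\sqrt{\alpha}\bigr)^2 .
```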

We wrote here the general formula that holds for all $\alpha>0$, accounting for zero eigenvalues with a Dirac mass at zero in the rank-deficient case $\alpha>1$.

#### B.3.2 Stieltjes Transform of ESD

**Proposition 5.**

If two sequences of random measures $\{\mu_k\}$ and $\{\nu_k\}$ converge weakly in probability to deterministic measures with identical Stieltjes transforms, they converge to the same measure.

#### B.3.3 Convergence to MP for Matrices with Dependent Coefficients

**Proposition 6.**

As $n\to\infty$, assume the following:

- 1.
$\mathbb{E}\,\bar{X}_{jk}X_{lk}=\delta_{lj}$, for all $k$.

- 2.
$\frac{1}{n}\max_{j\neq l}\mathbb{E}\left|\bar{X}_{jk}X_{lk}-\delta_{lj}\right|^2\longrightarrow 0$ uniformly in $k\leq n$.

- 3.
$\frac{1}{n^2}\sum_{\Gamma}\mathbb{E}\left[\left(\bar{X}_{jk}X_{lk}-\delta_{lj}\right)\left(X_{j'k}\bar{X}_{l'k}-\delta_{j'l'}\right)\right]^2\longrightarrow 0$ uniformly in $k\leq n$, where $\Gamma=\{(j,l,j',l'):1\leq j,l,j',l'\leq p\}\setminus\{(j,l,j',l'):j=j'\neq l=l'\text{ or }j=l'\neq j'=l\}$.

- 4.
$p/n\longrightarrow\alpha\in(0,\infty)$.

Then, with probability 1, the ESD of $Sn$ tends (weakly) to the MP law.

**Sketch of the proof.**

We use theorem 1.1 from Bai and Zhou (2008) combined with the sufficient condition of corollary 1.1, taking $T_n$ to be the identity matrix. These conditions are compatible with the case of the Wishart ensemble, such that the ESD converges to a distribution with the same Stieltjes transform as the MP law.^{9} As a consequence of proposition 22, we get that the limit ESD is the MP law.

## Appendix C: Additional Corollaries

The additional results in this appendix are corollaries based on the simplifying assumption 3, where a linear phase is considered instead of the general assumption on the phase used in corollaries 2 and 3.

**Assumption 3.**

**Corollary 4.**

Under the conditions of corollary 5, assume additionally that assumption 3 is satisfied and that the intensity of the point process is given by

**Proof.**

Using assumption 3:

Plugging this intensity in corollary 3 and integrating on $[0,T]$:

(see theorem 2). With this change, equation C.16 becomes

We can have a similar calculation for the imaginary part, $\mathrm{Cov}(Z)_{22}$, using the identity $\sin^2(\theta)=\frac{1}{2}\left(1-\cos(2\theta)\right)$ instead of equation A.20. The off-diagonal elements of the covariance matrix vanish due to the symmetry of the integrand.

**Corollary 5.**

**Proof.**

Similar to corollary 5, we can derive the asymptotic $PLV$ (see equation C.24) for this case from the definition in equation 3.2. We use the assumed phase $\varphi(t)=2\pi kt/T$ and apply the intensity function defined in equation C.23 in corollary 3:

As in corollary 5, we plug the rate $\lambda(t)$ assumed in the corollary (see equation C.23) and integrate on $[0,T]$,

We have a similar calculation for the imaginary part, $\mathrm{Cov}(Z)_{22}$. The off-diagonal elements of the covariance matrix vanish due to the symmetry of the integrand.

## Appendix D: Circular Noise

The advantage of such phase noise is that it preserves the spectral content of the original oscillation better than conventional normal noise. Nevertheless, using conventional normal (white) noise (on both the real and imaginary parts of the oscillation) did not change the results significantly.

## Appendix E: Tables of Parameters

The tables below list the parameters used for the figures in the main text. In all simulations, $\varphi_0=0$.

| Parameter | Description | A | B |
|---|---|---|---|
| $f$ | Frequency | 1 Hz | |
| $K$ | Number of trials | 5000 | |
| $T$ | Simulation length | 5 s | |
| $\lambda_0$ | Average firing rate | 20 Hz | |
| $N_S$ | Number of simulations | 5000 | |
| $\kappa$ | Modulation strength | 0 | 0.5 |


| Parameter | Description | A | B | C | D |
|---|---|---|---|---|---|
| $f$ | Frequency | 1 Hz | | | |
| $K$ | Number of trials | 10 | | | |
| $T$ | Simulation length | 0.75 s | 0.5 s | 1 s | $x$-axis |
| $\lambda_0$ | Average firing rate | 30 Hz | | | |
| $N_S$ | Number of simulations | 500 | | | |
| $\kappa$ | Modulation strength | 0 | | | |


| Parameter | Description | A1 | A2 | A3 | B1 | B2 | B3 |
|---|---|---|---|---|---|---|---|
| $f$ | Frequency | 5 oscillatory components, 11–15 Hz | | | | | |
| $K$ | Number of trials | 10 | | | | | |
| $T$ | Simulation length | 11 s | | | | | |
| $\lambda_0$ | Average firing rate | 20 Hz | | | | | |
| $N_S$ | Number of simulations | 100 | | | | | |
| $\kappa$ | Modulation strength | 0 | | | 0.15 | | |
| $n_c$ | Number of LFP channels | 100 | | | | | |
| $n_s$ | Number of spiking units | 10 | 50 | 90 | 10 | 50 | 90 |
| $\kappa_{\text{noise}}$ | Dispersion parameter of phase noise | 10 | | | | | |


## Code Availability

The code to reproduce our simulation results is at https://github.com/shervinsafavi/safavi_neuralComp2021.

## Notes

^{1}

Any martingale in this paper is zero-mean.

^{2}

See section B.1.1 for more details.

^{3}

This allows the normalization factor to converge to a deterministic quantity as $K\to+\infty$ (see equation 2.1).

^{4}

The sum of the variances of real and imaginary parts.

^{5}

They are biases in the sense that one would expect a coupling measure to vanish if there is no coupling in the data generating procedure.

^{6}

In the sense that we can generate the homogeneous spike train and the oscillation with parametric models that do not share any information.

^{7}

To guarantee the phase to be strictly increasing.

^{9}

This requires checking that the self-consistency equation 1.1 in Bai and Zhou (2008) has a unique solution, which they establish by equation 1.2.

## Acknowledgments

We are very grateful to Afonso Bandeira and Asad Lodhia for fruitful discussions at the beginning of the project. We thank Edgar Dobriban for pointing us to Bai and Yao (2008) and Joachim Werner and Michael Schnabel for their excellent IT support. This work was supported by the Max Planck Society.

## References

*Survival and event history analysis: A process point of view*

*Handbook of mathematical functions with formulas, graphs, and mathematical tables*

*PLOS Computational Biology*

*An introduction to random matrices*

*Analysis of parallel spike trains*

*NeuroImage*

*Annales de l'Institut Henri Poincaré, Probabilités et Statistiques*

*Statistica Sinica*

*Stochastic Processes and Their Applications*

*Journal of Multivariate Analysis*

*Annals of Statistics*

*Probability and measure*

*Time series: Data analysis and theory*

*Annual Review of Statistics and Its Application*

*Physics Reports*

*Nature Neuroscience*

*Rhythms of the brain*

*Nat. Rev. Neurosci.*

*Neuron*

*Nat. Neurosci.*

*Spectrum of deformed random matrices and free probability.*

*Probability Theory and Related Fields*

*J. Neurosci. Methods*

*Journal of Neurophysiology*

*PNAS, 117*

*Proceedings of the First Workshop on Deep Learning for Recommender Systems*

*Advances in neural information processing systems*

*Journal of Computational Neuroscience*

*J. Neurophysiol.*

*Europhysics Letters*

*Nat. Rev. Neurosci.*

*On the largest eigenvalue of Wishart matrices with identity covariance when n, p and p/n tend to infinity*

*Acta Physica Polonica. Series B*

*Annals of Probability*

*Annals of Statistics*

*Nature Neuroscience*

*Journal of Applied Probability*

*Neuron*

*Trends Cogn. Sci.*

*Neuron*

*Current Opinion in Neurobiology*

*Neuron*

*Current Opinion in Neurobiology*

*Science*

*Neuron*

*Journal of Neurophysiology*

*Physics Reports*

*Applied stochastic processes and control for jump-diffusions: Modeling, analysis and computation*

*Journal of the Royal Statistical Society: Series B (Methodological)*

*Front. Neural Circuits*

*Journal of Neurophysiology*

*NeuroImage*

*Journal of Computational Neuroscience*

*Annals of Statistics*

*Annals of Statistics*

*eLife*

*Nature*

*Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining*

*IEEE Transactions on Signal Processing*

*IEEE Transactions on Signal Processing*

*Front. Comput. Neurosci.*

*Neural Computation*

*J. Neurosci. Methods*

*Statistics of random processes: I. General theory*

*Statistics of random processes II: Applications*

*Annals of Statistics*

*Ann. Appl. Probab.*

*Electronic Journal of Probability*

*Neuron*

*Matematicheskii Sbornik*

*Neuron*

*Free probability and random matrices*

*Observed brain dynamics*

*Mathematical statistics for economics and business*

*Analysis of the global banking network by random matrix theory*

*Journal of Neuroscience Methods*

*Journal of the American Statistical Association*

*Current Opinion in Neurobiology*

*Proceedings of the International Conference on Machine Learning*

*Advances in neural information processing systems*

*J. Stat. Mech.*

*Progress in Neurobiology*

*Nature Neuroscience*

*Healthy oscillatory coordination is bounded by single-unit computation*

*Stochastic integration and differential equations*

*Journal of Neuroscience Methods*

*A tutorial on Hawkes processes for events in social media.*

*Handbook of biological physics*

*Uncovering the organization of neural circuits with generalized phase locking analysis*.

*PLOS Computational Biology*

*Neural Computation*

*Journal of Neuroscience*

*Neuron*

*Nature Neuroscience*

*Proceedings of the 26th International Conference on World Wide Web*

*Chaos: An Interdisciplinary Journal of Nonlinear Science*

*Distribution functions for largest eigenvalues and their applications*

*Journal of Physiology–Paris*

*Nat. Neurosci.*

*NeuroImage*

*J. Comput. Neurosci.*

*NeuroImage*

*A treatise on the theory of Bessel functions*

*Ann. Math*

*Annals of Mathematics*

*Current Opinion in Neurobiology*

*Science*

*Frontiers in Computational Neuroscience*