## Abstract

Time series data sets often contain heterogeneous signals, composed of both continuously changing quantities and discretely occurring events. The coupling between these measurements may provide insights into key underlying mechanisms of the systems under study. To better extract this information, we investigate the asymptotic statistical properties of coupling measures between continuous signals and point processes. We first introduce martingale stochastic integration theory as a mathematical model for a family of statistical quantities that include the phase locking value, a classical coupling measure to characterize complex dynamics. Based on the martingale central limit theorem, we can then derive the asymptotic gaussian distribution of estimates of such coupling measure that can be exploited for statistical testing. Second, based on multivariate extensions of this result and random matrix theory, we establish a principled way to analyze the low-rank coupling between a large number of point processes and continuous signals. For a null hypothesis of no coupling, we establish sufficient conditions for the empirical distribution of squared singular values of the matrix to converge, as the number of measured signals increases, to the well-known Marchenko-Pastur (MP) law, and the largest squared singular value converges to the upper end of the MP support. This justifies a simple thresholding approach to assess the significance of multivariate coupling. Finally, we illustrate with simulations the relevance of our univariate and multivariate results in the context of neural time series, addressing how to reliably quantify the interplay between multichannel local field potential signals and the spiking activity of a large population of neurons.

## 1  Introduction

The observation of highly multivariate temporal point processes, corresponding to the activity of a large number of individuals or units, is pervasive in many applications (for example, neurons in brain networks; Johnson, 1996) and members in social networks (Dai, Wang, Trivedi, & Song, 2016; De, Valera, Ganguly, Bhattacharya, & Rodriguez, 2016). As the number of observed events per unit may remain small, inferring the underlying dynamical properties of the studied system from such observations is challenging. However, in many cases, it is possible to observe continuous signals whose coupling with the events can offer key insights.

In neuroscience, this is the case of the extracellular electrical field, which provides information complementary to spiking activity. Local field potentials (LFP) are mesoscopic (Liljenstroem, 2012) signals resulting from the superposition of the electric potentials generated by ionic currents flowing across the membranes of the cells located close to the tip of recording electrodes. The LFP reflects neural cooperation due to the anisotropic cytoarchitecture of most brain regions, allowing the summation of the extracellular currents resulting from the activity of neighboring cells. As such, a number of subthreshold integrative processes (i.e., modifying the neurons' internal state without necessarily triggering spikes) contribute to the LFP signal (Buzsaki, Anastassiou, & Koch, 2012; Buzsaki, Logothetis, & Singer, 2013; Einevoll, Kayser, Logothetis, & Panzeri, 2013; Pesaran et al., 2018; Herreras, 2016).

Reliably quantifying the coupling between activities of individual units (e.g., spikes generated by individual neurons) in a circuit and the aggregated measures (such as the LFP) may provide insights into underlying network mechanisms, as illustrated in the electrophysiology literature. At the single neuron level, the relationship of spiking activity to subthreshold activity has broad implications for the underlying cellular and network mechanisms at play. For instance, it has been suggested that synaptic plasticity triggers changes in the coupling between spikes and LFPs (Grosmark, Mizuseki, Pastalkova, Diba, & Buzsáki, 2012; Grosmark & Buzsáki, 2016). Regarding the putative functional role of such observed couplings, it has been hypothesized to support cognitive functions such as attention. Such coordination by oscillations hypothesis proposes that network oscillations modulate differentially the excitability of several target populations, such that a sender population can emit messages during the window of time for which a selected target is active, while unselected targets are silenced (Fries, 2005, 2015; Womelsdorf et al., 2007).

In the case of two continuous signals, coupling measures such as coherence and phase locking value (PLV) (Rosenblum, Pikovsky, Kurths, Schäfer, & Tass, 2001; Pereda, Quiroga, & Bhattacharya, 2005) are widely used, and their statistical properties have been investigated, in particular in the stationary gaussian case (Brillinger, 1981; Aydore, Pantazis, & Leahy, 2013). In a similar way, PLV (Ashida, Wagner, & Carr, 2010) and spike-field coherence (SFC) (Mitra, 2007) can measure spike-LFP coupling (see among others: Vinck, Battaglia, Womelsdorf, & Pennartz, 2012; Vinck, van Wingerden, Womelsdorf, Fries, & Pennartz, 2010; Jiang, Bahramisharif, van Gerven, & Jensen, 2015; Zarei, Jahed, & Daliri, 2018; Li, Cui, & Li, 2016) and are broadly used to makes sense of the role played by neurons in coordinated network activity (Buzsaki & Schomburg, 2015). There are notable contributions investigating potential biases of those measures when both point processes and continuous signals are involved (Lepage, Kramer, & Eden, 2011; Kovach, 2017). However, two issues relevant for practical applications remain: (1) the effect of the intrinsic variability of spike occurrence on key statistical properties of the estimates, such as the variance, have not yet been thoroughly described, and (2) how to extend the rigorous statistical analysis of spike-field coupling in the context of the highly multivariate signals available with modern recording techniques remains largely unaddressed.

We address these two issues by using continuous time martingale theory (see Liptser & Shiryaev, 2013a), the related concept of stochastic integration (see Protter, 2005) and random matrix theory (Anderson, Guionnet, & Zeitouni, 2010; Capitaine & Donati-Martin, 2016). The martingale central limit theorem (CLT) allows us to derive analytically the asymptotic gaussian distribution of a general family of coupling measures that can be expressed as stochastic integrals. We exploit this general result to show that the classical univariate PLV estimator is also asymptotically normally distributed and provide the analytical expression for its mean and variance. Furthermore, we study potential sources of bias for the commonly used von Mises coupling model (Ashida et al., 2010). We then go beyond univariate coupling measures and analyze the statistical properties of a family of multivariate coupling measures taking the form of a matrix with stochastic integral coefficients. We characterize the jointly gaussian asymptotic distribution of matrix coefficients, and exploit random matrix theory (RMT) principles to show that after appropriate normalization, the spectral distribution of such large matrices under the null hypothesis (of absence of coupling), follows approximately the Marchenko-Pastur (MP) law (Marchenko & Pastur, 1967), while the magnitude of the largest singular value converges to a fixed value whose simple analytic expression depends only on the shape of the matrix. We finally show how this result provides a fast and principled procedure to detect significant singular values of the coupling matrix, reflecting an actual dependency between the underlying signals. In the appendixes, we included detailed proofs and background material on RMT and stochastic integration, such that nonexpert readers can further apply these tools in neuroscience.

## 2  Background

### 2.1  Spike-Field Coupling in Neuroscience

Although our results are relevant to a broad range of applications within and beyond neuroscience, we will use the estimation of spike-LFP coupling introduced above as the guiding example of this letter. Spikes convey information communicated between individual neurons. This information is believed to be encoded in the occurrence times of successive spike events, which are typically modeled with point processes—for example, Poisson (Softky & Koch, 1993) or Hawkes process (Truccolo, 2016; Krumin, Reutsky, & Shoham, 2010).

While oscillatory dynamics is ubiquitous in the brain and instrumental to its coordinated activity (Buzsaki, 2006; Buzsaki et al., 2013; Peterson & Voytek, 2018), it is often challenging to uncover based solely on the sparse spiking activity of recorded neurons. On the other hand, LFPs often exhibit oscillatory components that can be isolated with signal processing tools (typically bandpass filtering or template matching), such that pairing the temporal information from LFPs and spiking activity can help extract reliable markers of neural coordination.

An example of a coupling measure achieving such pairing is the phase locking value (PLV). Given, on the one hand, event (spike) times ${tj}$ where $j∈{1,2,…,N}$ (with $N$ the number of spikes in the spike train), and on the other hand, $ϕ(t)$ the time-varying phase of an oscillatory continuous signal, which is typically a bandpassed filtered LFP, phase locking between these two signals is estimated by the complex number
$PLV^=1N∑j=1Neiϕ(tj),withi2=-1.$
(2.1)
We use a “hat” notation to reflect that this quantity is empirical: indeed, even if we assume a fixed $ϕ$, the PLV depends on the specific values of event times $tj$. In this work, we assume these points are drawn from a Poisson process, with a possibly time-varying rate (inhomogeneous Poisson process), such that we can define a population statistics that is a function of the point process population distribution instead of its empirical counterpart. We then address under which conditions the empirical PLV reflects a true coupling between the rate of the underlying point process and $ϕ$.

### 2.2  Counting Process Martingales

We use a continuous time framework leading to powerful results based on concise deterministic and stochastic integral expressions, which can trivially be approximated using discrete time signals in practice. A (continuous time) stochastic process $M={M(t);t∈[0,τ]}$ is a zero-mean martingale relative1 to the filtration ${Ft}$ (which represents the past information accumulated up to time $t$) if (1) $M(0)=0$, (2) it is adapted to ${Ft}$ (informally the law of $M$ up to time $t$ “uses” only past information up to $t$), and (3) it satisfies the martingale property:
$EM(t)|Fs=M(s),forallt>s.$
(2.2)
Consider now a (univariate) counting process $(N(t),Ft);t≥0$, counting the number of events that occurred up to time $t$, adapted to filtration ${Ft}$ (Aalen, Borgan, & Gjessing, 2008, chap. 2). Under mild assumptions, it has a Doob-Meyer decomposition,
$N(t)=M(t)+∫0tλ(t)dt,$
(2.3)
where $λ(t)$ is a predictable process with respect to ${Ft}$ called the intensity function and $M(t)$ is a martingale, called the compensated counting process. Figure 1 shows an illustration of this decomposition for a Poisson process with sinusoidal intensity.
Figure 1:

Doob-Meyer decomposition for an example inhomogeneous Poisson process with oscillatory of rate $λ(t)$ of frequency $f=1$ Hz; average firing rate $λ0=5$ Hz (dashed line indicates the reference 0). See section 3.3 for details of the simulation.

Figure 1:

Doob-Meyer decomposition for an example inhomogeneous Poisson process with oscillatory of rate $λ(t)$ of frequency $f=1$ Hz; average firing rate $λ0=5$ Hz (dashed line indicates the reference 0). See section 3.3 for details of the simulation.

Consider now an empirical coupling measure $c$ between a (real or complex) predictable process $x(t)$ and $N(t)$ observed during time interval $[0,T]$, which takes the form of the stochastic integral (see Protter, 2005),
$c^=∑tk
(2.4)
where ${tk}$ denote the jump times of the counting process (note that the PLV defined in equation 2.1 is a normalized version of such coupling). The empirical coupling measure, $c^$, can then be decomposed as
$c^=∫0Tx(t)λ(t)dt+∫0Tx(t)dM(t).$
(2.5)
Interestingly, it can be shown that the second integral on the right-hand side is also a martingale (see Liptser & Shiryaev, 2013b, theorem 18.7).

In order to keep our results concise, we assume the following deterministic setting in the remainder of this letter (see section 5 for potential extensions).

Assumption 1.

Assume the intensity function, $λ(t)=λ(t|Ft)$ of $N(t)$ and the signal $x(t)$ are deterministic bounded left-continuous and adapted to $Ft$ over $[0,T]$.

Note this entails that $N(t)$ is a (possibly inhomogeneous) Poisson process (Liptser & Shiryaev, 2013b, theorem 18.10). Under assumption 1, the terms of equation 2.5 separate the deterministic part from the (zero-mean) random fluctuations of the measure that are integrally due to the martingale term. Using martingale properties, the statistics of the coupling measure are2
$c*≜Ec^=∫0Tx(t)λ(t)dtandVar[c^]=E|c^-c*|2=∫0Tx2(t)λ(t)dt.$
(2.6)
In case $x(t)$ integrates to zero, the expected coupling $c*$ thus reflects the covariation across time between $x(t)$ and the intensity of the point process up to random fluctuations.

### 2.3  Random Matrix Theory

As data sets get increasingly high dimensional, it becomes important to replace the above univariate measure $c^$ by a quantity that summarizes the coupling between a large number of units and continuous signals. This extension leads to assessing the spectral properties of a coupling matrix $C^$ that gathers all pairwise measurements. However, such task is nontrivial due to the martingale fluctuations affecting $C^$, leading to spurious nonzero coupling coefficients, and can also hide the deterministic structure of the matrix associated with significant coupling.

Random matrix theory allows investigating the spectral properties of some matrices in noisy settings by studying their asymptotic spectral properties as dimensions grow to infinity. Any $(p×p)$ complex Hermitian or real symmetric matrix $M$ has a set of $p$ real eigenvalues ${ℓk}$ (where we put several times the same eigenvalue in the set according to its multiplicity). One classically studied quantity is the empirical spectral distribution (ESD) (or empirical eigenvalue distribution, see Mingo & Speicher, 2017 and Anderson et al., 2010) of the set of all eigenvalues ${ℓk}$. ESD indistinctly refers (with a slight abuse of language), to either the probability measure (also called spectral measure in our case),
$μM(t)=1p(δℓ1(t)+⋯+δℓp(t)),t∈R,$
where $δℓk$ is the dirac measure with unit mass in $ℓk$, or to its associated cumulative distribution:
$FM(t)=∫-∞tdμM(s).$
Seminal works by Wigner (1955, 1958), Marchenko and Pastur (1967), and many others have established the convergence of the ESD of large random matrix ensembles (see section B.2 for the precise notions of convergence). In particular, for a sequence of matrices $Xnn>0$ of dimension $p×n$ such that $pn→n→+∞α≤1$, with coefficients sampled independently and identically distributed (i.i.d.) from a (possibly complex) standard Normal distribution, the ESD of the Wishart matrix $Sn=1nXnXnH$ (where $.H$ indicates the transposed complex conjugate) converges to the Marchenko-Pastur (MP) law $μMP(x)$ (Marchenko & Pastur, 1967) with density
$dμMPdx(x)=12παx(b-x)(x-a),a≤x≤b,0,otherwise,$
(2.7)
with $a=(1-α)2$ and $b=(1+α)2$. Additionally, the smallest and largest eigenvectors converge to $a$ and $b$, respectively. Importantly, these convergences also hold in the case $α>1$, but equation 2.7 is modified to account for the rank deficiency of the Wishart matrix, imposing $p-n$ zero eigenvalues in the spectrum (see section B.3.1 for details).

Notably, recent developments in the field of random matrix theory extend the classic results that were only valid for independent coefficients (uncorrelated Wishart matrices) to various forms of dependencies between coefficients. For instance, El Karoui (2007, 2008) showed that the ESD and the distribution of the largest eigenvalue for a sequence of matrices $Xnn>0$ with general covariance matrices (not necessarily with identity covariance matrix) follow similar laws and Banna, Merlevède, and Peligrad (2015) investigate the case of symmetric random matrices with correlated entries. Furthermore, the behavior of high-dimensional autocovariance matrices in the context of discrete time stochastic processes is discussed in Liu, Aue, and Paul (2015) and Bhattacharjee and Bose (2016). Applications of this framework have also been considerably extended including global finance (Namaki et al., 2020) and various aspects of machine learning and signal processing such as shallow (Louart, Liao, & Couillet, 2018) and deep (Pennington & Bahri, 2017; Pennington & Worah, 2019) neural networks, denoising (Bun, Bouchaud, & Potters, 2017) and dimensionality reduction (Johnstone & Onatski, 2020).

In this study, we show that the martingale fluctuations of the coupling matrices also cause spectral convergence to the MP law in the absence of actual coupling between the signals. Recent results on the low-rank perturbation (Capitaine & Donati-Martin, 2016; Loubaton & Vallet, 2011; Benaych-Georges & Nadakuditi, 2012) of random matrices suggest this convergence can be exploited to further assess the significance of the largest eigenvalues of the coupling matrix with respect to the null hypothesis that they only reflect random fluctuations.

## 3  Assessment of Univariate Coupling

### 3.1  Mathematical Formulation

We consider the setting of $K≥1$ independent trials of measurements on $[0,T]$ available to estimate the coupling statistics by the trial average
$c^K=1K∑k=1K∫0Tx(t)dN(k)(t),$
where ${N(k)}$ are $K$ independent copies of the process $N(t)$, associated with each trial. As this letter focuses on the statistical properties induced by the intrinsic variability of point process realizations, we assumed above that the continuous signal does not change across trials. However, including some forms of variability across trials, such as random time shifts affecting all processes in the same way, would not affect the results, barring additional technical details.

We exploit a central limit theorem (CLT) for martingales to show the residual variability (difference between the empirically estimated $c^K$ and the expected coupling $c*$ of equation 2.6) is asymptotically normally distributed. We formally state it in theorem 2.

Theorem 1.
Assume $(Ft,x(t),λ(t))$ satisfy assumption 1, and $x(t)$ real-valued. Then,
$E[c^K]≜c*=∫0Tx(t)λ(t)dtandVar[c^K]=1K∫0Tx2(t)λ(t)dt.$
Moreover, as the number of trials increases, fluctuations converge in distribution:
$Kc^K-c*⟶K→+∞N0,∫0Tx2(t)λ(t)dt.$
Sketch of the proof.

We rely on the decomposition of equation 2.5. As described in section B.1.1, the martingale property is preserved by the stochastic integral term and allows us to exploit a martingale CLT to prove convergence to a gaussian distribution.

The case of $x(t)$ complex-valued can be dealt with by distinguishing the real and imaginary parts of the signal, as is done in the proofs of the following corollaries. We can exploit theorem 2 to derive the asymptotic properties of the PLV introduced in section 2.1. For that, we adapt the empirical estimate of equation 2.1 to the $K$ trials setting introduced above and define
$PLV^K=1∑k=1KNk∑k=1K∑j=1Nkeiϕ(tjk),$
(3.1)
where $Nk$ is the number of events observed during trial $k$ and ${tjk}$ is the collection of the time stamps of these events. The specificity of this multitrial estimate is to use a single normalization constant corresponding to the total number of events pooled across trials.3 For this estimate, we get the following result.
Corollary 1.
Assume $(Ft,x(t)=eiϕ(t),λ(t))$ satisfy assumption 1, where $ϕ$ is real-valued and stands for the phase of the signal $x$. Then the expectation of the PLV statistics $PLV^K$ estimated from $K$ trials of measurements on $[0,T]$ tends to the limit
$PLV*=∫0Teiϕ(t)λ(t)dt/Λ(T),withΛ(T)=∫0Tλ(t)dt.$
(3.2)
Moreover, as $K→+∞$, the residual,
$KPLV^K-PLV*,$
(3.3)
converges in distribution to a zero-mean complex gaussian variable $Z$ (i.e., the joint distribution of real and imaginary parts is gaussian), such that
$CovRe{Z}Im{Z}=1Λ(T)2∫0TM(t)λ(t)dt,whereM(t)=cos2(ϕ(t))sin(2ϕ(t))/2sin(2ϕ(t))/2sin2(ϕ(t)).$
Sketch of the proof.

This relies on applying theorem 2 to the real and imaginary parts of $eiϕ(t)$. In addition, the coupling between both quantities is taken into account by replacing the variance of univariate quantities $V˜(t)$ in theorem 2 by a covariance matrix that can be assessed with martingale results given in section B.1.1.

Remark 1.
For the simple case of a $T/k$-periodic sinusoidal signal ($k$ integer), such that $ϕ(t)=2πkt/T$, and a sinusoidal modulation of the intensity with phase shift $φ0$ and modulation amplitude $ϰ$ such that
$λ(t)=λ01+ϰcosϕ(t)-φ0,λ0>0,0≤ϰ≤1,$
we get easily with trigonometric identities that $PLV*=12ϰeiφ0$ and the residual of equation 3.3 converges to an isotropic complex gaussian of total variance4$1λ0T$ such that the coupling strength $ϰ$ affects the mean but not the variance of the PLV estimate.
Also, it is easy to see that if $λ(t)$ is modulated by a sine wave at a different integer multiple $m≠k$ of the fundamental frequency $1/T$, such that
$λ(t)=λ0+ϰcos2πmt/T-φ0,$
the $PLV*$ vanishes and the residual's variance remains the same. These properties make PLV straighforward to interpret and test for sinusoidal coupling with a carefully chosen observation duration $T$. Assumption 24 and corollary 25, in appendix C, provide formal statements of this remark.
We can use corollary 3 to predict the statistics of PLV estimates for other models of phase-locked spike trains. A classical model uses the von Mises distribution (also known as circular normal distribution) with parameter $κ≥0$ to model the concentration of spiking probability around a specified locking phase $ϕ0$ (for more details, see Ashida et al., 2010). The original model uses a purely sinusoidal time series by assuming a linearly increasing phase $ϕ(t)=2πft$, where $f$ is the modulating frequency, to derive the intensity of an inhomogeneous Poisson spike train,
$λ(t)=λ0expκcos(ϕ(t)-φ0).$
(3.4)
resulting in an analytical expression for the asymptotic complex-valued PLV,
$PLV*=eiφ0∫0πcos(θ)exp(κcos(θ))dθ∫0πexp(κcos(θ))dθ=eiφ0I1(κ)I0(κ),$
with the $Ik$'s denoting the modified Bessel functions of the first kind for $k$ integer (see Abramowitz & Stegun, 1972, p. 376):
$Ik(κ)=1π∫0πcos(kθ)exp(κcos(θ))dθ.$
Compared to the sinusoidal coupling described in remark 4, whose PLV magnitude can reach at most 1/2, this model can achieve arbitrarily large PLV, which might explain why it is more frequently used in applications.

Corollaries 2 and 3 derive the asymptotic covariance of the variability of the PLV estimate around this theoretical value (which is novel to the best of our knowledge). Furthermore, the results are derived in a more general model setting accounting for “biases”5 due to nonlinear phase increases $ϕ(t)$ and observation intervals that are not multiples of the modulating oscillation period. It should be noted that the mentioned biases are inherent in the estimator's definition. They happen independent of additional biases originating from the phase estimation procedure (e.g., phase extraction via Hilbert transform; see Kovach, 2017).

We thus assume a coupling, parameterized by $κ$ between a possibly nonlinearly increasing phase $ϕ(t)$ and a point process with intensity
$λ(t)=λ0expκcos(ϕ(t)-φ0)dϕdt(t).$
(3.5)
Note that for linearly increasing phases, this coupling amounts to the classical von Mises model of equation 3.4. The additional factor $dϕdt(t)$ allows preserving the analytical expression of PLV statistics even for nonlinearly increasing phases, providing a novel generalization of the von Mises model (see corollary 25 in appendix C for a simplified version of corollary 5, assuming a linearly increasing phase $ϕ(t)=2πft$ with frequency $f$).
Corollary 2.
Under the assumptions of corollary 3, assume additionally that $ϕ(t)$ is continuous, strictly increasing, and piece-wise differentiable on $[0,T]$ and the intensity of the point-process is given by equation 3.5 for a given $κ≥0$, then the expectation of the multitrial PLV estimate converges (for $K→+∞$) to
$PLV*=∫ϕ(0)ϕ(T)eiθexp(κcos(θ-φ0))dθ∫ϕ(0)ϕ(T)exp(κcos(θ-φ0))dθ.$
(3.6)
If in addition $[0,T]$ corresponds to an integer number of periods of the oscillation,
$PLV*=eiφ0∫0πcos(θ)exp(κcos(θ))dθ∫0πexp(κcos(θ))dθ=eiφ0I1(κ)I0(κ),$
(3.7)
and the scaled residual $KPLV^K-PLV*$ converges to a zero mean complex gaussian $Z$ with the following covariance:
$CovRe{Ze-iφ0}Im{Ze-iφ0}=12λ0(ϕ(T)-ϕ(0))I0(κ)2I0(κ)+I2(κ)00I0(κ)-I2(κ).$
(3.8)
Sketch of the proof.

This is based on plugging the intensity function $λ(t)$ of equation 3.5 in corollary 3. Using change of variable in the integrals ($ϕ(t)$ to $θ$) and exploiting the symmetries of the functions, the integrals in the analytical expressions of the expectation and covariance turn into modified Bessel functions $Ik$ for $k$ integer.

The above result has important consequences for the assessment of PLV from data. In particular, it exhibits key experimental requirements for PLV estimates to match the classical Bessel functions expression of equation 3.7: (1) evaluate PLV on an integer number of periods (this is critical for trials with short duration) and (2) take into account the fluctuations of the rate of increase of the phase $ϕ(t)$ across the oscillation period. This second point is critical in applications where the phase is inferred from signals (such as LFPs) through the Hilbert transform, as nonlinearities of the underlying phenomena may lead to nonsinusoidal oscillations, with periodic fluctuations of the time derivative of the phase $ϕ'(t)$. To further emphasize the consequences of this aspect, we also derive the asymptotic distribution of $PLV$ for a homogeneous Poisson process that corresponds to the special case $κ=0$ of the classical von Mises coupling of equation 3.5. Although there is no actual coupling between events and the continuous signal in such a case,6 the nonlinear phase increase leads asymptotically (for $K$ large) to a nonvanishing PLV estimate and to false detection of coupling.

Corollary 3.
Under the assumptions of corollary 3, we assume additionally that the point process is homogeneous Poisson with rate $λ0$ and that $ϕ(t)$ is strictly increasing (almost everywhere) and differentiable on $[0,T]$. Let $θ↦τ(θ)$ be its inverse function (such that $τ(ϕ(t))=t$). Then the expectation of $PLV^K$ converges (for $K→+∞$) to
$PLV*=∫ϕ(0)ϕ(T)eiθτ'(θ)dθϕ(T)-ϕ(0),$
(3.9)
and the scaled residual,
$Z=KPLV^K-PLV*,$
converges to a zero mean complex gaussian:
$KPLV^K-PLV*⟶K→+∞N00,Cov(Z),$
with the following covariance:
$Cov(Z)=1λ0T2∫ϕ(0)ϕ(T)cos2(θ)sin(2θ)/2sin(2θ)/2sin2(θ)τ'(θ)dθ.$
Sketch of the proof.

The result stems from using the intensity function $λ0$ in corollary 3 and then using change of variable in the integrals and exploiting the symmetries of the functions.

This corollary will be further illustrated in the next paragraphs.

### 3.2  Application to Bias Assessment

Corollary 6 predicts scenarios where in the absence of modulation of spiking activity (having a constant intensity function $λ(t)=λ0$), the expectation of the PLV estimates remains far from zero even when the number of trials is large, that is, the coupling between a homogeneous point process and a continuous oscillatory signal would appear significant and reflect a form of bias. Corollary 6 allows computing this bias and therefore correcting it.

One such case is when the observation interval is not an integer number of oscillation periods. To demonstrate this analytically, we can start from the PLV expectation with the constant intensity $λ0$,
$PLV*=∫0Teiϕ(t)λ(t)dt∫0Tλ(t)dt=λ0∫0Teiϕ(t)dtλ0∫0Tdt=1T∫0Teiϕ(t)dt.$
(3.10)
Furthermore, we assume $ϕ(t)$ has linear phase (assumption 24): $ϕ(t)=2πft$, where $f$ is the frequency of oscillation of the continuous signal. We then get
$PLV*=1T∫0Tei2πftdt=12πγTie2πγTi-1,$
(3.11)
where $γT=Tf$ is the ratio of the length of the time series ($T$) to the period of oscillation $1f$. As is noticeable in equation 3.11, the coupling measure $PLV*$ is not zero when $γT$ is not an integer number. Notably, this bias affects both the magnitude and the phase of the $PLV*$ estimate.
Furthermore, even using an observation interval covering an integer number of periods, nonlinear increases in phase may lead to a nonvanishing PLV. This can be demonstrated with a simple example. Again, we can start from the original definition of PLV expectation, equation 3.2, but now we do not assume the linearity of the phase. As introduced in corollary 6, let $θ↦τ(θ)$ be the inverse of $ϕ(t)$, and let us use equation 3.9 to compute the $PLV*$. Taking a sinusoidal modulation over the oscillation period, $τ(θ)=θ+εsin(θ)$ with $|ε|<1$,7 we get a nonvanishing asymptotic expected PLV:
$PLV*=12π∫02πeiθ(1+εcos(θ))dθ=ε∫0πeiθcos(θ)dθ=ε/2≠0,ifε≠0.$

Our theoretical framework can be used for developing methods to correct such biases. In the linear phase setting, bias can be avoided simply by using an integer number of periods for coupling estimation. In the case of a nonlinear phase evolution of the continuous signal, we can use the theoretical phase (if available) or its empirical estimate to evaluate $PLV*$ under constant spike intensity assumptions with equation 3.9 and subtract this quantity to the estimated PLV. For resolving issues that arise due to the nonlinearity of the estimated phase, specialized methods have been suggested. For instance, Hurtado, Rubchinsky, and Sigvardt (2004) dealt with phase jumps (a particular form of nonlinearity) by interpolating the signal from the available data before and after the sudden change and Cole and Voytek (2019) introduced a cycle-by-cycle method for analyzing oscillatory dynamics. In this method, they consider a linear phase for each detected cycle of oscillation. Therefore, with this linear choice of phase, one can avoid the spurious coupling that can appear due to phase nonlinearities. Based on our framework, theoretically motivated methods that are not relying on the linearization of the phase can be developed.

### 3.3  Simulations

We demonstrate the outcome of our theoretical results using simulated phase-locked spike trains (similar to what has been introduced in corollaries 2 and 4) and sinusoidal oscillations. For generating phase-locked spike trains, we adopt the method introduced in Ashida et al. (2010). As the model has already been described elsewhere, we restrict ourselves to a brief explanation.

To generate phase-locked or periodic spike trains based on the classical von Mises model with rate $λ(t)$ as introduced in equation 3.4, we use a purely sinusoidal continuous signal $x(t)$ with linearly increasing phase $ϕ(t)=2πft$, with $f=1$ Hz and various coupling strength ($κ$) (see appendix E for lists of parameters used for each figure). Based on this simulation we perform two numerical experiments to demonstrate the practical relevance of our (asymptotic) theoretical results.

#### 3.3.1  Experiment 1

In order to demonstrate the validity of corollaries 2 and 4, in Figure 2 we show the empirical distribution of the normalized residual of the PLV estimate and compare it to its asymptotic theoretical distribution. We simulate two cases, one with homogeneous Poisson spike trains ($κ=0$) and one with phase-locked spike trains ($κ=0.5$) with Poisson statistics. In both cases, we observe the agreement between theory and simulation, as the joint distribution of real and imaginary part approaches an isotropic gaussian. The slightly non-gaussian shape of the real part histogram for $κ=0.5$ suggests, however, a slower convergence to the normal distribution in the case of coupled signals.

#### 3.3.2  Experiment 2

We demonstrate an application of corollary 6 for bias evaluation with a simple simulation. In section 3.2 we pointed out that using a noninteger $fT$ ($T$ is not a multiple of the oscillation period) can lead to spurious correlation between the point process and the oscillatory continuous signal. By using equation 3.11 we can compute this bias.

We use a simulation similar to the one used in the previous experiment with an oscillatory signal and a homogeneous Poisson spike train ($κ=0$) and investigate the coupling between these two signals. If the length of the continuous signal is not an integer number of the oscillation period, the PLV estimate has a nonzero empirical mean (see Figures 3A and B) while when it is a multiple of number of the oscillation period, the estimate matches the ground truth (see Figure 3C). In Figure 3D we compare the theoretical prediction and the numerical simulation for various length of the signals, showing that this effect disappears when the observation window covers a larger number of oscillation periods.
Figure 2:

Simulation of (A) homogeneous Poisson spike trains and (B) phase-locked spike train with Poisson statistics (von Mises model with $κ=0.5$). First row: Example raster plot of the spikes. Second row: Empirical firing rate (gray line) and ground truth firing rate (orange and purple traces). Third row: Continuous signal $x(t)$. (C) Scatter plots represent the complex-valued PLVs estimates. Each dot represents one realization of the simulation. Insets depict the zoomed version of both distributions. Green crosses indicate the theoretical complex-valued PLV. (D, E) Histograms of real and imaginary parts of scaled residuals for simulations (D) without coupling and (E) with coupling. Green lines indicate the theoretical predictions of corresponding distributions according to corollaries 2 and 4, and the bars indicate the empirical distributions. Note the subtle difference between real and imaginary parts in panels D versus E. See Table 1 for parameters used for this figure.

Figure 2:

Simulation of (A) homogeneous Poisson spike trains and (B) phase-locked spike train with Poisson statistics (von Mises model with $κ=0.5$). First row: Example raster plot of the spikes. Second row: Empirical firing rate (gray line) and ground truth firing rate (orange and purple traces). Third row: Continuous signal $x(t)$. (C) Scatter plots represent the complex-valued PLVs estimates. Each dot represents one realization of the simulation. Insets depict the zoomed version of both distributions. Green crosses indicate the theoretical complex-valued PLV. (D, E) Histograms of real and imaginary parts of scaled residuals for simulations (D) without coupling and (E) with coupling. Green lines indicate the theoretical predictions of corresponding distributions according to corollaries 2 and 4, and the bars indicate the empirical distributions. Note the subtle difference between real and imaginary parts in panels D versus E. See Table 1 for parameters used for this figure.

Figure 3:

(A–C) Distribution of simulated complex-valued PLVs (gray dots), average of the simulated PLVs (red circle), and theoretical prediction based on equation 3.11 (green crosses) for (A) $γT=0.75$, (B) $γT=0.5$, and (C) $γT=1$. All complex-valued PLVs are represented in the complex plane. Angles indicate the locking phase and the radius the PLV. (D) PLV for different interval lengths $T$. Box plots represent the simulated PLVs, and the dashed green trace represents theoretical prediction of the expectation based on equation 3.11. Vertical broken blue lines indicate integer number of oscillation periods. See Table 2 for parameters used for this figure.

Figure 3:

(A–C) Distribution of simulated complex-valued PLVs (gray dots), average of the simulated PLVs (red circle), and theoretical prediction based on equation 3.11 (green crosses) for (A) $γT=0.75$, (B) $γT=0.5$, and (C) $γT=1$. All complex-valued PLVs are represented in the complex plane. Angles indicate the locking phase and the radius the PLV. (D) PLV for different interval lengths $T$. Box plots represent the simulated PLVs, and the dashed green trace represents theoretical prediction of the expectation based on equation 3.11. Vertical broken blue lines indicate integer number of oscillation periods. See Table 2 for parameters used for this figure.

## 4  Assessment of Multivariate Coupling

High-dimensional data sets have become increasingly important in biology (Bühlmann, Kalisch, & Meier, 2014). More specifically in neuroscience, state-of-the-art multichannel electrophysiology recording systems (Dickey, Suminski, Amit, & Hatsopoulos, 2009; Jun et al., 2017; Juavinett, Bekheet, & Churchland, 2019) allow the simultaneous recording of thousands of sites (Pesaran et al., 2018; Jun et al., 2017; Buzsáki, 2004; Fukushima, Chao, & Fujii, 2015). This growth in dimensionality requires the development of appropriate tools (Stevenson & Kording, 2011; O'Leary, Sutton, & Marder, 2015; Gao & Ganguli, 2015; Williamson, Doiron, Smith, & Yu, 2019) for computing an interpretable summary of the coupling between neurophysiological quantities reflecting the collective dynamics of the underlying neural ensembles (Truccolo, 2016; Safavi et al., 2020). To achieve this aim, deriving low-rank approximations of high-dimensional matrices is supported by empirical evidence and theoretical predictions of the existence of low-dimensional structures in neural activity (Ermentrout & Kleinfeld, 2001; Ermentrout & Pinto, 2007; Truccolo, Hochberg, & Donoghue, 2010; Gallego, Perich, Miller, & Solla, 2017; Mastrogiuseppe & Ostojic, 2018; Sohn, Narain, Meirhaeghe, & Jazayeri, 2019; Cueva et al., 2020). This section provides statistical results for such approximation in the context of the coupling between point processes and continuous signals.

As a natural extension of the scalar case discussed in the previous section, we now consider the expected coupling matrix $C*$ between an $n$-dimensional vector of counting processes $N$ with associated intensity vector $λ(t)$ and a multivariate $p$-dimensional signal $x(t)$, and its estimate based on independent trials $C^K$, respectively defined as
$C*=∫0Tx(t)λ(t)⊤dtandC^K=1K∑k=1K∫0Tx(t)dN(k)(t)⊤.$
(4.1)
In this multivariate setting, the coupling matrix between the point process and continuous signal can be characterized by the singular value(s) of $C*$,
$σ1≥σ2≥⋯≥σp≥0,$
and associated orthonormal singular vectors ${(uk,vk)}$, such that
$C*=∑k=1pukσkvkH.$
When the dimension of the coupling matrix gets large, recovering the entire structure of $C*$ using its estimate $C^K$ becomes unlikely due to the fluctuations of individual coupling coefficients investigated in the previous section. However, the largest singular values may remain reliably estimated because they correspond to low-rank structures of the matrix that stand out from the noise. Random matrix theory provides justifications for this approach by characterizing the spectral properties of “noisy” matrices. Up to a normalization explained later, this will involve indirectly characterizing the behavior of the empirical singular values ${σ^k}$ of the estimate matrix $C^K$ by analyzing the eigenvalues of the hermitian matrix $1nC^KC^KH$ denoted
$ℓ1≥ℓ2≥⋯≥ℓp≥0.$
These are related to each other by the relation $σ^k=nℓk$ for all $k$.

### 4.1  Mathematical Formulation

We now replace assumption 1 to adapt to this multivariate setting. By restricting ourselves to homogeneous Poisson processes, we investigate a null hypothesis of no coupling between continuous signals and point processes. Let us denote $x¯$ the complex conjugate of $x$ and $δ$ the Kronecker delta symbol:
$δlj=1,ifl=j,0,otherwise.$
(4.2)
Assumption 2

(Complex Multivariate Case). We consider an infinite sequence ${xj(t)}j≥1$ of complex valued left-continuous deterministic functions uniformly bounded on $[0,T]$ and assume

1. (1)

For all $i,j≥1$, $1T∫0Tx¯ixjdt=δijand∫0Txixjdt=0.$

2. (2)

For all $i≥1$, $∫0Txidt=0$,

3. (3)

There exist $0<λmin<λmax$ and a sequence of independent homogeneous Poisson processes ${Ni}i∈N*$'s with associated rates ${λi}i∈N*$ in the interval $[λmin,λmax]$.

While the assumptions on ${xi(t)}$ are designed for complex signals, which is the classical case when dealing with PLV-like quantities, the results of this section also hold for real signals by using the assumption $1T∫0Txixjdt=δij$ instead of the above condition 1. Condition 2 is also added to ensure that there is no trivial bias leading to a nonvanishing expectation of the coupling coefficients (as illustrated in section 3.2). Indeed, when the time average of each signal vanishes, based on theorem 2, the expectation of all univariate coupling measures for a homogeneous Poisson process vanishes. We then exploit a multivariate generalization of the martingale CLT to characterize the distribution of the coupling matrix given these assumptions.

Theorem 2.

For given $n,p≥1$ and all $K≥1$, we use sequences of signals defined in assumption 7 to build multivariate continuous signal $x(t)=(xj)j=1⋯p$ and $K$ independent copies of multivariate Poisson process $N(t)=(Ni)i=1⋯n$ with rate vector $λ=[λ1,⋯,λn]⊤$. Then the normalized coupling matrix $KC^Kdiag(Tλ)-1$, with $C^K$ given by equation 4.1, converges in distribution for $K→+∞$ to a matrix with i.i.d. complex standard normal coefficients.

Sketch of the proof.

This essentially uses a generalization of the CLT to multivariate point processes described in Aalen et al. (2008, appendix B). Based on the statistics of stochastic integrals presented in section B.1.1, assumptions on $x$ entail vanishing correlations between all matrix coefficients and lead to the analytical expression of the covariance matrix.

This result suggests that for large $n$ and $p=p(n)$, coupling matrices $C^Kn$ of increasing size can be used to build the Wishart-like matrix sequence,
$Sn≜KnC^Kndiag(Tλ)-1(C^Kn)H,$
(4.3)
whose ESD may converge to the Marchenko-Pastur law. This is, however, not guaranteed by classical results due to the nongaussianity and dependence of the matrix coefficients of $C^Kn$ for fixed $n$ and $K$. Convergence will thus depend on how much the departure from these assumptions plays a role as $n$ becomes large. We show in the following theorem that increasing the number of trials as a function of the dimension guarantees convergence to the MP law.
Theorem 3.
In addition to assumption 7, assume an increasing, positive integer sequence ${p(n),K(n)}n∈N*$ such that $p(n)n⟶n→+∞α∈(0,+∞)$, and
$1n2K(n)2∑Γ∫0Tx¯jxlxj'x¯l'dt2→0,uniformlyink≤n,$
(4.4)
where $Γ={(j,l,j',l'):1≤j,l,j',l'≤p}∖{(j,l,j',l'):j=j'≠l=l'orj=l'≠j'=l}$. Consider the sequence ${C^K(n)n}n∈N*$ built as in theorem 8 for $p=p(n)$; then the corresponding sequence ${Sn}$ defined by equation 4.3 has an ESD converging weakly with probability one to the MP law of equation 2.7.
Sketch of the proof.

We use theorem 1.1 of Bai and Zhou (2008) addressing the case of matrices with dependence of coefficients within columns. We use Itô's formula (see appendix B) to check the simplified necessary conditions provided in corollary 1.1 of Bai and Zhou (2008). This implies convergence of the Stieltjes transform to the same function as the transform of the MP distribution. By classical results on the Stieltjes transform (Anderson et al., 2010, theorem 2.4.4), this implies weak convergence to the MP measure (i.e., convergence for the weak topology; see appendix B.2).

Remark 2.

Condition in equation 4.4 determines how many trials are needed at most for spectral convergence. Due to the uniform boundedness assumption on signal $x(t)$ and given the number of terms in the sum bounded by $n4$, we can already see that $nK(n)→0$, that is, having the number of trials increasing at an even slightly faster rate than the dimension is enough for convergence for any choice of continuous signals respecting orthonormality assumption 7. However, there are cases where even fewer trials than dimensions are required. An important example is the Fourier basis of the $[0,T]$ interval, $xl(t)=1Texp(i2πlt/T)$. Then all terms in the sum of equation 4.4 vanish, except the ones satisfying $j-j'-l+l'=0$, such that we are left with a number of bounded terms that scale with $n3$. As a consequence, the condition on the number of trials to achieve spectral convergence becomes $nK(n)→0$, such that we need increasingly fewer trials than dimensions. On the contrary, due to the uniform bound that we impose on the signals, choosing a basis of signals with decreasing support, such as a wavelet basis, typically departs from our condition 1 of assumption 7, as the normalization of condition 1 of assumption 7 imposes unit norm on each signal, requiring their amplitude to increase as their support decreases, violating the uniform bound assumption. This limitation supports the intuition that statistical regularities exploited by our asymptotic results deteriorate with highly transient signals.

This convergence of the spectral measure to the MP law guarantees eigenvalues do not accumulate in a large proportion above the upper end of the support of the MP law; however, they do not provide rigorous guarantees regarding convergence of individual eigenvalues and, in particular, the largest eigenvalue. Although such convergence is satisfied in classical settings (gaussian i.i.d. coefficients), they typically require stronger assumptions than for the (weak) spectral convergence to the MP law, and still only very few results are available in the non-i.i.d. setting. We could, however, prove such convergence by adding a constraint to our model.

Theorem 4.
In addition to assumption 2, assume all homogeneous rates $λk$ are equal. Assume two increasing, positive integer sequences ${p(n),K(n)}n∈N*$ such that
$p(n)n→α∈(0,+∞)and1K(n)∑1≤i,k≤p(n)∫0T|xixj|2(t)dt
(4.5)
for some constant $B$. Then $Sn$ defined in equation 4.3 has an ESD converging weakly with probability one to the MP law of equation 2.7. Moreover, let $ℓ1$ and $ℓp$ be the largest and smallest eigenvalues of ${Sn}$, respectively. Then in probability
$ℓ1(n)→(1+α)2andℓp(n)→(1-α)21α<1.$
Sketch of the proof.

The identical intensities assumption allows us to use the result of Chafaï and Tikhomirov (2018) for matrices with i.i.d. columns. We first checked that their proof holds also for the complex case by replacing symmetric matrices by Hermitian matrices and squared scalar product by an absolute squared Hermitian product. We satisfy their strong tail projection (STP) assumption using Chebyshev's inequality. The necessary fourth-order moment conditions exploit the same stochastic integration results as theorem 9.

Remark 3.

Without additional assumptions, the moment condition of equation 4.5 is satisfied by choosing $K(n)=n2$ (as there are $p2$ bounded moments, scaling as $n2$ when $n$ grows). It is likely from the proof that taking into account more information about the moments of the continuous signal sequence ${xj}$, we can achieve convergence with a lower rate of increase for the number of trials. This is left to future work.

This result thus provides the guarantees that under a null hypothesis of no coupling (due to homogeneity of the Poisson processes), the extreme eigenvalues of $S$ will asymptotically cover exactly the full support of the MP law. This will be used in section 4.2 to assess the significance of the eigenvalues $ℓk$ by simply checking whether they are larger than $(1+α)2$.

This significance analysis relies as well on understanding what happens to the eigenvalues when the model departs from the null hypothesis. In a practical setting, we hypothesize that the coupling matrix has a deterministic structure superimposed on the martingale noise modeled in the above results. One qualitative justification of this assumption can be found in remark 4, showing that for sinusoidal coupling, a nonvanishing expectation proportional to the coupling is superimposed to martingale noise, whose distribution is unaffected by coupling, such that the noisy part of the matrix satisfies the conditions of the above theorems. As typically done in applications, we are mostly interested in the low-rank structure associated with the largest singular values of the coupling matrix, providing an interpretable summary of the multivariate interactions.

This naturally leads to a modeling departure from the null hypothesis with a low-rank perturbation assumption. In such a case, the eigenvalues related to significant coupling are expected to be reflected in the spectrum of the perturbed matrix, such that they can be isolated from the remaining eigenvalues associated with the martingale noise. This intuition is justified by results in the case of the Wishart ensemble (Loubaton & Vallet, 2011); see also Benaych Georges and Nadakuditi (2012) for a more general result and Capitaine and Donati Martin (2016) for an overview of matrix perturbation results), that we restate here:

Theorem 5
(From Loubaton & Vallet, 2011, Theorem 6). Let $Xn$ be an $n×p$ sequence of i.i.d. complex gaussian matrices defined in section 2.3 and $An$ be a finite rank perturbation of the null matrix with nonzero eigenvalues $θi$. Let $Mn=(1nXn+An)(1nXn+An)H$. Then as $n→∞$ and $pn→α∈(0,1)$, almost surely,
$λi(Mn)→(1+θi)(c+θi)θi,ifθi>α,(1+α)2,otherwise.$

A demonstration that this further applies rigorously to our nongaussian, non-i.i.d. case is left to further work (but see Benaych-Georges & Nadakuditi, 2012, for a generalization in this direction). This result shows the upper end of the MP support is indeed the critical threshold for the eigenvalues of $An$ to stand out from the noise. Below this threshold, the largest eigenvalue convergence to the upper end of the support of the MP distribution is not informative about $θi$. Above this threshold, the value of $θi$ can be recovered and detected by comparing the largest eigenvalue to the upper end of the MP distribution.

We next illustrate the interest of these theoretical predictions in the context of neural time series for reliably quantifying the interplay between multichannel LFP signals and the spiking of multiple neurons. Nevertheless, the results are potentially applicable in other domains as well. In neuroscience, $x$ may represent LFP measurements collected on each recording channel and $N$ the spiking activity of different neurons, called units. The number of recording channels $nc$ and recorded units $nu$ correspond to $p$ and $n$, respectively. These numbers may differ, and as a consequence, the coupling matrix is generally rectangular.

### 4.2  Application to Significance Assessment

In order to statistically assess the significance of the largest singular value(s) of coupling matrix $C^Kn$, considered as a measure of coupling between point processes and continuous signals, we need a null hypothesis. Hypothesis testing based on the generation of surrogate data is one of the common methods for significant assessment in neuroscience and other fields. Generating appropriate surrogate data can be not only challenging (see Grün, 2009, and Elsayed & Cunningham, 2017, for examples in neuroscience), but also computationally expensive due to the increasingly large dimensions of modern data sets. Exploiting our theoretical results for this setting allows us to perform such statistical assessment in a principled way, without using surrogate data and sparing computational resources.

In order to exploit the results of the theoretical part, it is best to preprocess the $p×q$ matrix of time-discretized signals $L$ that correspond to $q$ samples over interval $[0,T]$ with sampling interval $Δ=T/q$. The chosen signals are driven by the application (in our case, they are preprocessed LFPs, see section 4.3 for a simulation reproducing the context of neurophysiolgy data). We assume the rows of $L$ sum to zero to match condition 2 of assumption 7 (and avoid bias in the coupling measure similar to what is described in section 3.3). We then need to process further this signal such that condition 1 of assumption 7 is satisfied approximately. In order to achieve this, we perform classical whitening of the signals to generate matrix $X$, the discrete time approximation of $x(t)$, according to
$X=WL,withW=1qLLH-12,$
(4.6)
where the power in the expression of the whitening matrix $W$ describes the inversion of a matrix square root, typically achieved via eigenvalue decomposition, and which may require PCA-like dimensionality reduction in practice to minimize the numerical effects of small eigenvalues. This procedure decorrelates the martingale fluctuations of coefficients within the same column of the coupling matrix (see theorem 8), a key requirement for convergence to the MP law.
As explained in section 4.1, theoretical results support using $θDET=(1+α)2$, the upper end of the support of the MP law, as a detection threshold for the significance of the eigenvalues of the Hermitian matrix,
$Sn=KnC^Kndiag(Tλ0)-1(C^Kn)H.$
The null hypothesis of nonsignificance of the $k$th largest singular value $σ^k$ of the normalized coupling matrix,
$KC^Kndiag(Tλ0)−1,$
should thus be rejected if the corresponding $k$th largest eigenvalue $ℓk$ of $Sn$ is superior to the significance threshold, leading to the condition
$σ^k=nℓk>nθDET=n(1+α).$
(4.7)
Therefore, this last condition on the empirical singular values is used to identify those reflecting a significant coupling between the multivariate point process and continuous signal. An illustration of our overall significance assessment approach is shown in Figure 4.
Figure 4:

We assume $C^Kn$ is a superposition of martingale noise and a low-rank deterministic matrix $C*$ reflecting the actual coupling. If the singular values of a normalized version of $C*$ are large enough (larger than the upper end of the MP law support), theory suggests that they will correspond to the largest eigenvalues of $Sn$ appearing beyond the support of MP distributed eigenvalues reflecting martingale noise. They can thus be detected with a simple thresholding approach (see equation 4.7).

Figure 4:

We assume $C^Kn$ is a superposition of martingale noise and a low-rank deterministic matrix $C*$ reflecting the actual coupling. If the singular values of a normalized version of $C*$ are large enough (larger than the upper end of the MP law support), theory suggests that they will correspond to the largest eigenvalues of $Sn$ appearing beyond the support of MP distributed eigenvalues reflecting martingale noise. They can thus be detected with a simple thresholding approach (see equation 4.7).

### 4.3  Simulation

We use a simulation to demonstrate the outcome of our (asymptotic) theoretical results on mutlivariate coupling. Similar to the simulations of section 3.3 for the univariate case, we use simulated phase-locked spike trains with Poisson statistics. The main difference between this simulation and the previous one is in synthesizing the LFP. In order to simulate multichannel oscillatory signals that lead to a low-rank structure for $C*$, we use a combination of noisy oscillatory components.

The LFPs contain $Nosc$ oscillatory groups of channels; each channel $l$ within the same group contains the same oscillatory component with index $j(l)$, with the time course of all these components being $Oj(t)=e2πifjt$, $j∈{1,…,Nosc}$, with all frequencies $fj$ in the range $[fmin,fmax]$, and all multiples of $1/T$. Due to the necessary time axis discretization, the bracket notation $[t]$ indicates the oscillation is sampled at equispaced discrete times $t={kΔ}k=1,…,q$. The synthesized discrete time multichannel LFP ($Ψ[t]={ψl[t]}l=1,…,nc$) can be written as
$Ψl[t]=Oj(l)[t]⊙expiηl[t],$
(4.8)
with $⊙$ entrywise product and ${ηl[t])}$ i.i.d. sampled (white) phase noises contaminating each channel independently (see appendix D for more details).

In this simulation, the frequencies of the oscillatory components range from 11 to 15 Hz. We used 100 LFP channels ($nc=100$) and different choices for the number of spiking units (10, 50, and 90). Spiking activities are simulated in different scenarios, with and without coupling to the LFP oscillations. In the latter case, we have two populations of neurons (each consisting of 1/5th of the total number of neurons) that are each coupled to one of the oscillatory groups of LFP channels. Both populations are coupled to their respective oscillation with identical strength ($κ=0.15$) and phase ($ϕ0=0$).

To compute the coupling matrix $C^K$, we first preprocess $Ψ[t]$ by applying bandpass filtering in a range covering $[fmin,fmax]$ and convert it to an analytic signal via the discrete time Hilbert transform, leading to data matrix $L$, following the standards of PLV analysis in neuroscience (Chavez, Besserve, Adam, & Martinerie, 2006).

This signal matrix is then whitened according to equation 4.6 to yield matrix $X$, the discrete time version of $x(t)$. The coupling matrix $C^K$ is then computed according to equation 4.1 using 10 trials (barring trivial approximation to the closest time sample in $X$).

Then in order to approximate the normalization $KC^Kndiag(Tλ0)-1$ based on empirical data, we use the total number of events for unit $u$ occurring across all $K$ trials $Ntotu=∑k=1KNku$ and multiply each column $u$ of the coupling matrix by
$KNtotu≈KK∫0Tλu(t)dt$
for the corresponding unit $u$, asymptotically matching the theoretical normalization in the homogeneous Poisson case.
We observe in Figure 5A that in the absence of coupling, the distribution of eigenvalues originating from the random matrix structure is very close to the theoretically predicted MP distribution. In Figure 5B, we have coupling between spike and eigenvalues reflecting the coupling beyond the MP support (blue line in Figure 5), and the eigenvalue bulk below the threshold is also close to MP distribution. This suggests an easy thresholding approach for significance assessment.
Figure 5:

Theoretical Marchenko-Pastur distribution (green lines) and empirical distribution (gray bars) for (A) simulation without coupling ($κ=0$) and (B) with coupling ($κ=0.15$) between multivariate spikes and LFP. Rows represent the spectral distribution of simulations with different number of spiking units—rows 1, 2, and 3, respectively, 10, 50, and 90 (which leads to different $α$ for MP law). Insets zoom the tail of the distributions. Parameters used for this figure are denoted in Table 3.

Figure 5:

Theoretical Marchenko-Pastur distribution (green lines) and empirical distribution (gray bars) for (A) simulation without coupling ($κ=0$) and (B) with coupling ($κ=0.15$) between multivariate spikes and LFP. Rows represent the spectral distribution of simulations with different number of spiking units—rows 1, 2, and 3, respectively, 10, 50, and 90 (which leads to different $α$ for MP law). Insets zoom the tail of the distributions. Parameters used for this figure are denoted in Table 3.

## 5  Discussion

### 5.1  Insights for Data Analysis

Our theoretical results provide guarantees for specific coupling models to respect univariate and multivariate asymptotic statistics that can be easily exploited for statistical testing. The required assumptions provide guidelines for practical settings that are likely of interest beyond the strict framework that we imposed to get the rigorous results. For the univariate coupling measure, corollaries and simulations point out the importance of the choice of the observation interval $[0,T]$, which is particularly sensitive when considering short intervals covering only a few oscillation periods. This is the case when doing time-resolved analysis or dealing with experiments with short trial durations. Moreover, the univariate results also emphasize the effect of nonlinear phase increases, highly relevant in neuroscience due to the pervasive effects of nonlinear dynamics in the mesoscopic signals. Our results provide asymptotic bias correction terms that can be used for statistical testing.

In the same way, theoretical results in the multivariate setting may seem to be constrained by our assumptions, but they provide critical guidelines to interpret singular values. First, whitening the continuous signals and normalizing the coupling by the square root of the rate are key preprocessing steps for making the asymptotic behavior of the martingale noise invariant to the specifics of the data at hand. This then reduces to an analytical model, the MP law, dependent on only a single matrix shape parameter. After assessment of the significance of the singular value of the normalized coupling matrix, it is of course possible to revert these preprocessing steps to get a low-rank approximation of the original coupling matrix (nonnormalized, nonwhitened) to summarize the significant coupling structure in an interpretable way. A second insight provided by the multivariate results is the role of fourth-order moments of the continuous signals, represented by the integrals of order four monomials of components of $x(t)$, in the MP convergence results. The magnitude of these moments determines the number of trials asymptotically needed to achieve convergence. Since these moments can be estimated empirically, we can check how they grow with the dimension of the signals in a specific application. With our minimal assumptions on the signals, the number of trials need only to grow at most sublinearly in the dimension for spectral convergence; however, we could only show that convergence of the largest eigenvalues requires at most quadratic increase in the dimension $n$. This last result might be improved in future work, with extra assumptions, to reach linear growth.

Our theoretical results can be extended in two directions. The first is toward exploiting point processes different from inhomogeneous Poisson (e.g., Hawkes processes) in order to be able to apply the framework in the context of stochastic intensities. The second direction is toward exploiting recent developments in RMT, in order to develop a probabilistic significance assessment.

### 5.2  Extension of Signal Assumptions

Our theoretical results assume deterministic continuous signals and point process intensities (see assumption 1). This entails limitations, such as implicitly assuming the considered point processes are (homogeneous or inhomogeneous) Poisson processes. This assumption may be too restrictive in realistic scenarios (for examples in neuroscience, see Deger, Helias, Boucsein, & Rotter, 2012; Reimer, Staude, Ehm, & Rotter, 2012; Nawrot et al., 2008; Shinomoto, Shima, & Tanji, 2003; Maimon & Assad, 2009; Shinomoto et al., 2009). However, the stochastic integration methods that provide the basis of our results allow the treatment of random signals and intensities, provided they are predictable, which encompasses a wide enough class of processes to cover most applications (Protter, 2005). Our results thus have potential for generalizations to the case of random continuous signal, with the difference that the variance of the estimates would increase due to the additional variability induced by the signal fluctuations, and to the case of random intensities, leading to different asymptotic properties of the coupling measures, which may or may not have simple analytical expressions.

As a potential direction for extending this framework, Hawkes processes (Hawkes, 1971) are point processes for which the probability of occurrence of future events can also depend on the sequence of past events. Due to this history dependency, they are also called self-exciting processes. Hawkes processes are used for modeling recurrent interactions in various fields; for instance—in finance it is used to model buy or sell transaction events on stock markets (Embrechts, Liniger, & Lin, 2011) in geology to model the origin times and magnitudes of earthquakes (Ogata, 1988), in online social media to model user actions over time (Rizoiu, Lee, Mishra, & Xie, 2017), and even modeling reliability of information on the web and controlling the spread misinformation (Tabibian et al., 2017; Kim, Tabibian, Oh, Schölkopf, & Gomez-Rodriguez, 2018), and in neuroscience to model spike trains (Krumin et al., 2010). We conjecture that such history dependency can be incorporated in our analytic treatment of the coupling measure, such that our theoretical results can be extended to this model.

### 5.3  Extension beyond Binary Significance Assessment

We show that the Marchenko-Pastur distribution provides a good approximation of the distribution of eigenvalues in the absence of coupling, and the upper end of its support approximates the largest eigenvalue. This provides us a threshold to assess the significance of empirical singular values. Nevertheless, this hard thresholding approach does not take into account the actual fluctuations of the largest eigenvalue around this asymptotic upper end of the support and thus does not provide meaningful $p$-value for the statistical test.

It has been shown that the appropriately rescaled and recentered8 largest eigenvalue of Wishart matrices is asymptotically distributed as the Tracy-Widom distribution—for example, see Johnstone (2001); Tracy and Widom (2002); and El Karoui (2003, 2005, 2007). However, note that in some cases of practical relevance, the normal distribution might be more appropriate (Bai & Yao, 2008). Such asymptotic distribution of the largest eigenvalue can be exploited for reporting a theoretical $p$-value for the significance of the coupling and therefore extending the significance assessment from a binary decision to a probabilistic one. For example, Kritchman and Nadler (2009) exploit this idea (but in a simpler scenario) to determine the number of signal components in noisy data. This extension would allow a precise probabilistic assessment of the significance of weaker couplings leading to eigenvalues in the neighborhood of the asymptotic threshold introduced above.

## 6  Conclusion

We investigated the statistical properties of coupling measures between continuous signals and point processes. We first used martingale theory to characterize the distributions of univariate coupling measures such as the PLV. Then, based on multivariate extensions of this result and RMT, we established predictions regarding the null distribution of the singular values of coupling matrices between a large number of point processes and continuous signals and a principled way to assess significance of such multivariate coupling. These theoretical results build a solid basis for the statistical assessment of such coupling in applications dealing with high dimensional data.

## Appendix: Proofs of Theorems in the Main Text

Proof of Theorem 2.
For the first part of the theorem (expectation), we use the martingale $M(k)$ associated with each copied process $N(k)$ to rewrite
$c^K=1K∑k=1K∫0Tx(t)dM(k)(t)+1K∑k=1K∫0Tx(t)λ(t)dt(t).$
(A.1)
Elements of the sum in the first term are then zero mean martingale, and by linearity, so is the whole term. As a consequence (using the zero mean property), the expectation of the first term is zero so only the second term remains:
$Ec^K=∫0Tx(t)λ(t)dt(t).$

We then exploit a central limit theorem (CLT) for martingales to prove the second part of the theorem (convergence to gaussian distribution). To satisfy the CLT in such a case, it is sufficient to find a particular martingale $M˜(K)$ sequence that will satisfy the conditions described in Aalen et al. (2008, p. 63) ($→P$ indicate convergence in probability):

1. 1.

$Var(M˜(K)(t))⟶K→+∞PV˜(t)$ for all $t∈[0,T]$, with $V˜$ increasing and $V˜(0)=0$.

2. 2.

Informally, the size of the jumps of $M˜(K)$ tends to zero (see Aalen et al., 2008, p. 63). Formally, for any $ε>0$, the martingale $M˜ε(K)(t)$ gathering the jumps $>ε$ satisfies $VarM˜ε(K)(t)⟶K→+∞P0$.

Then $M˜(K)(t)$ converges in distribution to a gaussian martingale of variance $V˜(t)$.

To achieve these conditions, we define $M(k)$, the sequence of i.i.d. zero mean martingales defined on $[0,T]$ canonically associated with the point process of each trial $N(k)$. Then we build martingales $Mx(k)(t)=∫0tx(s)dM(k)(s)ds$ and construct $M˜(K)=1/K∑k=1KMx(k)$.

The variance of this latter martingale (also called its predictable variation processes) can be computed based on the rules provided in section B.1.1. First, due to trial independence,
$V˜(t)=VarM˜(K)(t)=Var1K∑k=1KMx(k)(t)=∑k=1KVar1KMx(t),$
(A.2)
and using equation B.4, we get
$V˜(t)=1K∑∫0tx2(t)λ(t)dt=∫0tx2(t)λ(t)dt.$
(A.3)
Equation A.3 clearly fulfills CLT's condition 1.

For condition 2, due to assumption 1, $x(t)$ is bounded, such that there is a $B>0$ satisfying $|x(t)| over $[0,T]$. As a consequence, the size of all jumps is bounded by $B/K$, and for any $ε$, $M˜ε(K)(t)$ is the constant zero for $K>B2ε2$ and condition 2 is satisfied.

Fulfillment of both conditions leads to convergence in distribution to a gaussian martingale of variance $V˜(t)$;
$M˜(K)⟶K→+∞N0,∫0Tx2(t)λ(t)dt.$
(A.4)
Finally, using equation A.1, we conclude the proof by noticing that the above martingale corresponds exactly to the quantity $Kc^K-c*$. Therefore,
$Kc^K-c*⟶K→+∞N0,∫0Tx2(t)λ(t)dt.$
(A.5)
Proof of Corollary 3.
We apply theorem 2 to $eiϕ(t)$ (i.e., replacing $x(t)$ with $eiϕ(t)$). As $eiϕ(t)$ is complex valued, we should have a covariance function for its predictable variation process $V˜(t)$. The covariance between a martingale's real part,
$MRe(t)=∫0tRe(eiϕ(s))dM(s)ds,$
and imaginary part,
$MIm(t)=∫0tIm(eiϕ(s))dM(s)ds,$
is given by
$∫0tRe(eiϕ(s))Im(eiϕ(s))λ(s)ds.$
(A.6)
The diagonal elements of the covariance function are the predictable variation process of $MRe$ and $MIm$ that can be computed based on equation B.4, and the off-diagonal elements are the covariance between martingale's real and imaginary part that can be computed based on equation B.5. Therefore, the covariance function for its predictable variation process is
$CovRe{Z}Im{Z}=∫0tRe(eiϕ(s))2λ(s)ds∫0tRe(eiϕ(s))Im(eiϕ(s))λ(s)ds∫0tRe(eiϕ(s))Im(eiϕ(s))λ(s)ds∫0tIm(eiϕ(s))2λ(s)ds$
(A.7)
$=∫0tcos2(ϕ(s))sin(2ϕ(s))/2sin(2ϕ(s))/2sin2(ϕ(s))λ(s)ds.$
(A.8)
Similar to theorem 2, as $K→+∞$, the residuals converge in distribution to a zero-mean complex gaussian variable $Z$ (i.e., the joint distribution of real and imaginary parts is gaussian):
$Kc^K-c*⟶K→+∞N0,Cov(Z).$
Because theorem 2 guarantees that the $K(c^K-c*)$ tends to a gaussian with finite variance, $c^K$ tends to the Dirac measure at $c*$.
However, given that we use $x(t)=eiϕ(t)$, $c^K$ is not exactly the multitrial $PLV$ estimate—more precisely,
$c^K=1K∑k=1K∫0Teiϕ(t)dN(k)(t)=1K∑k=1K∑j=1Nkeiϕ(tjk)=∑k=1KNkKPLV^K.$
Thus, we can write $PLV^K=νK·c^K$, with $νK=K∑k=1KNk$. With the same techniques (using $x(t)=1$), we can show convergence in the distribution of $νK$ to a constant:
$1νK=∑k=1KNkK=1/K∑k∫0T1·dN(k)⟶K→+∞∫0Tλ(t)dt=Λ(T).$
$νK⟶K→+∞1Λ(T).$
Following a version of Slutsky's theorem (Mittelhammer, 1996, theorem 5.10), since $νk$ and $c^K$ tend to a limit in distribution, and one of these limits is a constant, the product tends to the product of the limits such that we get
$PLV*=limK→∞νK·c^K=c*Λ(T)$
and can decompose the PLV residual as follows:
$KPLV^K-PLV*=KνKc^K-c*+KνKc*-PLV*.$
Taking the limit of the above equation, the second term clearly vanishes (see the above limit of $νK$), and the first term, using again the limit of products, leads to the final result:
$KPLV^K-PLV*⟶K→+∞N0,1Λ(T)2Cov(Z).$
Proof of Corollary 5.
We use the intensity function introduced in equation 3.5 in corollary 3. The $PLV$ asymptotic value ($PLV*$) can be derived from definition introduced in equation 3.2:
$PLV*=∫0Teiϕ(t)λ(t)dt∫0Tλ(t)dt$
(A.9)
$=ro∫0Teiϕ(t)exp(κcos(ϕ(t)-φ0))ϕ'(t)dtro∫0Texp(κcos(ϕ(t)-φ0))ϕ'(t)dt.$
(A.10)
We change the integration variable from $ϕ(t)$ to $θ$:
$PLV*=∫ϕ(0)ϕ(T)eiθexp(κcos(θ-φ0))dθ∫ϕ(0)ϕ(T)exp(κcos(θ-φ0))dθ.$
(A.11)
To simplify the integral (bring the $φ0$ out of the integral), we change the integration variable again, from $θ$ to $ψ$, ($ψ=θ-φ0$):
$PLV*=∫ϕ(0)-φ0ϕ(T)-φ0ei(ψ+φ0)exp(κcos(ψ))dψ∫ϕ(0)-φ0ϕ(T)-φ0exp(κcos(ψ))dψ$
(A.12)
$=eiφ0∫ϕ(0)-φ0ϕ(T)-φ0eiψexp(κcos(ψ))dψ∫ϕ(0)-φ0ϕ(T)-φ0exp(κcos(ψ))dψ.$
(A.13)
Given that that integrand is a $2π$-periodic functions (thus, the integral is invariant to translations of the integration interval), we get
$PLV*=eiφ0∫-ππeiψexp(κcos(ψ))dψ∫-ππexp(κcos(ψ))dψ.$
Observing that the integrand of the denominator is even, while for the numerator, the imaginary part is odd and the real part is even, we get
$PLV*=eiφ0∫0πcos(ψ)exp(κcos(ψ))dψ∫0πexp(κcos(ψ))dψ.$
This proves the first part of the corollary—equation 3.6. By using the integral form of the modified Bessel functions $Ik$ for $k$ integer (see, e.g., Watson, 1995, p. 181):
$Ik(κ)=1π∫0πcos(kθ)exp(κcos(θ))dθ+sin(kπ)π∫0+∞e-κcosht-ktdt$
(A.14)
$=1π∫0πcos(kθ)exp(κcos(θ))dθ,$
(A.15)
we can derive the compact form:
$PLV*=eiφ0I1(κ)I0(κ).$
(A.16)
The covariance matrix of the asymptotic distribution can be easily derived by plugging equation 3.5 as $λ(t)$ in corollary 3 and integrating on $[0,T]$:
$Cov(Z)11=λ0Λ(T)2∫0Tcos2(ϕ(t))expκcos(ϕ(t)-φ0)ϕ'(t)dt.$
(A.17)
Based on the above developments and noticing that the integration intervals correspond to $2πγT$, with $γT$ the number of oscillation periods, we have
$Λ(T)=λ02γTπI0(κ)=λ02ϕ(T)-ϕ(0)2ππI0(κ),$
such that
$Cov(Z)11=1λ0ϕ(T)-ϕ(0)2I0(κ)2×∫0Tcos2(ϕ(t))expκcos(ϕ(t)-φ0)ϕ'(t)dt.$
(A.18)
To simplify the rest of the derivations, we transform the complex variable coordinates by using $eiϕ(t)e-iφ0$ instead of $eiϕ(t)$ as predictable with respect to ${Ft}$ (i.e., replacing $x(t)$ with $eiϕ(t)e-iφ0$ in theorem 2). With this change, equation A.18 becomes
$Cov(Z)11=1λ0ϕ(T)-ϕ(0)2I0(κ)2×∫0Tcos2(ϕ(t)-φ0)expκcos(ϕ(t)-φ0)ϕ'(t)dt.$
(A.19)
We change the variable of the integral from $ϕ(t)-φ0$ to $θ$ and use the following trigonometric identity,
$cos2(θ)=121+cos(2θ),$
(A.20)
to obtain
$Cov(Z)11=12λ0ϕ(T)-ϕ(0)2I0(κ)2∫ϕ(0)ϕ(T)1+cos(2θ)expκcos(θ)dθ=12λ0ϕ(T)-ϕ(0)2I0(κ)2×∫ϕ(0)ϕ(T)expκcos(θ)+cos(2θ)expκcos(θ)dθ.$
Using again that the integration interval is $2πγT$ with $γT$ integer, and integrates $2π$-periodic functions (thus, the integral is invariant to translations of the integration interval), we get
$Cov(Z)11=12λ0ϕ(T)-ϕ(0)2I0(κ)2∫02πγTexpκcos(θ)dθ+∫02πγTcos(2θ)expκcos(θ)dθ,Cov(Z)11=12λ0ϕ(T)-ϕ(0)2I0(κ)22γTπI0(κ)+2γTπI2(κ)$
(A.21)
$=2πγT2λ0ϕ(T)-ϕ(0)2I0(κ)2I0(κ)+I2(κ)$
(A.22)
$=12λ0ϕ(T)-ϕ(0)I0(κ)2I0(κ)+I2(κ),$
(A.23)
where $γT$ is the number of oscillation periods contained in $[0,T]$.

We can have a similar calculation for the imaginary part, $Cov(Z)22$, as well, but using the identity $sin2(θ)=121-cos(2θ)$ instead of equation A.20. The off-diagonal elements of the covariance matrix vanish due to symmetries of integrand.

Therefore, we showed that for a given $κ≥0$, scaled residual
$Z'=e-iφ0KPLV^K-PLV*$
converges to a zero mean complex gaussian with the following covariance:
$CovRe{Z'}Im{Z'}=Re{Ze-iφ0}Im{Ze-iφ0}=12λ0(ϕ(T)-ϕ(0))I0(κ)2I0(κ)+I2(κ)00I0(κ)-I2(κ).$
Proof of Corollary 6.

Similar to corollary 5, we can derive the asymptotic $PLV$, equation 3.9, for this case, from the definition in equation 3.2. We apply the intensity function $λ=λ0$ in corollary 3. The $PLV$ asymptotic value ($PLV*$) can be derived simply by changing the integration variable from $ϕ(t)$ to $θ$ (and let $θ↦τ(θ)$ be its inverse).

The covariance matrix of the asymptotic distribution, can be derived by the procedure we used for the proof of corollary 5. We plug the rate $λ0$ as $λ(t)$ in corollary 3 and integrate on $[0,T]$:
$Cov(Z)11=λ0Λ(T)2∫0Tcos2(ϕ(t))dt.$
(A.24)
By changing the variable from $ϕ(t)$ to $θ$, we get
$Cov(Z)11=λ0Λ(T)2∫ϕ(0)ϕ(T)cos2(θ)τ'(θ)dθ.$
(A.25)
As $Λ(T)=∫0Tλ0dt=λ0T$, we have
$Cov(Z)11=λ0Λ(T)2∫ϕ(0)ϕ(T)cos2(θ)τ'(θ)dθ$
(A.26)
$=1λ0T2∫ϕ(0)ϕ(T)cos2(θ)τ'(θ)dθ.$
(A.27)
With a similar calculation for other coefficients of the covariance matrix, we get
$Cov(Z)=1λ0T2∫ϕ(0)ϕ(T)cos2(θ)sin(2θ)/2sin(2θ)/2sin2(θ)τ'(θ)dθ.$
Therefore, we showed that the scaled residual,
$Z=KPLV^K-PLV*,$
converges to a zero mean complex gaussian:
$KPLV^K-PLV*⟶K→+∞N00,Cov(Z).$
Proof of Theorem 8.

Similar to the proof of theorem 2, we rely on a CLT, but this time adapted to the case of vector-valued martingales (Aalen et al., 2008, appendix B) to prove this theorem.

We start from the single trial empirical vector-valued coupling measure of equation 4.1:
$C=∫0tx(t)dN(t)⊤.$
(A.28)
As for the univariate case, under mild assumptions, we can associate a martingale with a vector-valued counting process $N(t)$:
$M(t)=N(t)-∫0tλ(s)ds.$
(A.29)
As in this theorem, we assume $λ(t)=λ0,t∈[0,T]$, we get
$M(t)=N(t)-λ0t.$
(A.30)
The $(p×n)$ matrix-valued martingale for the empirical coupling matrix of equation 4.1, resulting from stochastic integration, is
$Mx(t)=∫0tx(s)dM⊤(s)ds$
(A.31)
and can be decomposed similarly to equation B.3 as
$Mx(t)=∫0tx(s)dN(s)⊤-∫0tx(s)λ0ds.$
(A.32)
By generalizing the steps of theorem 2, we introduced the $(p×n)$-variate martingale:
$M˜(K)(t)=1/K∑k=1KMx(k)(t)$
(A.33)
$=1/K∑k=1K∫0tx(s)dM(k)⊤(s)ds.$
(A.34)
We now state the CLT theorem for multivariate stochastic integrals.
Proposition 1
(Multivariate Martingale CLT; Aalen et al., 2008, Section B.3). Given the (real) matrix valued predictable functions $H(K)(t)$, consider the multivariate stochastic integral of multivariate martingale $M(K)$ with intensity vector $λ(K)(t)$:
$∫0tH(K)(u)dM(K)(u).$
Assume:
1. 1.

$∫0tH(K)(u)diag{λ(K)(u)}H(K)(u)⊤du⟶PV(t)$.

2. 2.

$∑j=1k∫0t(H(K)(u))21|H(K)(u)|>ελj(K)(u)du⟶P0,$ for all $t∈[0.T]$ and $ε>0$.

The above stochastic integral converges in distribution to a mean-zero gaussian martingale of covariance $V(t)$.

We notice that when summing across $K$ trials (see equation A.34), deterministic signals $x$ remain identical and point processes are pooled across $K$-trials. Given that trials are independent, the counting processes derived from the trial-pooled Poisson processes $∑k=1KN(k)(t)$ are distributed as multivariate Poisson processes with intensity vector $Kλ0$, such that
$M˜(K)(t)=1/K∫0tx(s)dP⊤(s)ds,$
(A.35)
where $P$ is the martingale associated with the pooled process,
$P(t)=∑k=1KN(k)(t)-∫0tKλ(s)ds.$
(A.36)
Given that the coupling matrix is matrix valued, we have to vectorize it in order to apply the above CLT. Let $Vec{.}$ be the operator that concatenates the successive columns of a matrix into a larger column vector. $M˜(K)(t)$ is a $(p×n)$-variate matrix-valued process, and its vectorized version, $Vec{M˜(K)(t)}$, is a $(pn×1)$-variate vector process. We can write equation A.35 in vectorized form as
$Vec{M˜(K)(t)}=∫0tH(s)dP⊤(s)ds,$
with the $(pn×n)$-variate block diagonal matrix:
$H(s)=1Kx(s)0⋯⋯00x(s)0⋯000⋱⋱00⋱⋱⋱00⋯⋯0x(s).$
(A.37)
The variance of $Vec{M˜(K)(t)}$ (a $(pn×pn)$-variate covariance matrix which is also called predictable variation process) can be written, based on proposition 14, as
$V˜(t)=∫0tH(s)diagλ(s)H(s)⊤ds.$
(A.38)
Since we assume a constant intensity function, $λ(t)=λ0={λk}k$ ($(n×1)$-variate matrix), we can simplify equation A.38 as follows:
$V˜(t)=∫0tH(s)diagKλ0H(s)⊤ds.$
(A.39)
Replacing $H(s)$ with the block diagonal matrix defined in equation A.37 leads us to
$V˜(t)=1K∫0tKλ1x(s)x(s)Hds0⋯⋯00∫0tKλ2x(s)x(s)Hds0⋯000⋱⋱00⋱⋱⋱00⋯⋯0∫0tKλnx(s)x(s)Hds$
(A.40)
$=λ1∫0tx(s)x(s)Hds0⋯⋯00λ2∫0tx(s)x(s)Hds0⋯000⋱⋱00⋱⋱⋱00⋯⋯0λn∫0tx(s)x(s)Hds.$
(A.41)
This fulfills condition 1 of the CLT for all $t∈[0,T]$. For the second condition, it is enough to see that the coeffcients of $H$ are bounded by a term decreasing in $1K$. The CLT is thus satisfied, and we get convergence in the distribution to a zero-mean complex gaussian of covariance $V˜(t)$ for each $t$. Specializing the result for $t=T$, we get, based on assumption 7, a diagonal covariance matrix with block-constant diagonal coefficients,
$V˜(T)=Tλ1Ip0⋯⋯00Tλ2Ip0⋯000⋱⋱00⋱⋱⋱00⋯⋯0TλnIp,$
(A.42)
where $Ip$ indicates the $(p×p)$ identity matrix, which provides the covariance matrix of the (vectorized) coefficients of matrix $KC^K$.
Therefore, for the normalized coupling matrix, $C^Kdiag(Tλ0)-1$, the column by-column normalization, normalizes each block of the above covariance matrix by a multiplicative term $1Tλk$ to lead to an identity covariance. This proves convergence of the normalized coupling matrix in distribution for $K→+∞$ to a random matrix with i.i.d. unit variance complex gaussian coefficients (because lack of correlations implies independence in the gaussian case):
$KVec{C^Kdiag(Tλ0)-1}⟶K→+∞N0pn,Ipn.$
(A.43)
Proof of Theorem 9.

Based on proposition 23 in section B.3, we need only to check the four following necessary conditions, using the Kronecker delta notation of equation 4.2:

1. 1.

$EX¯jkXlk=δlj$, for all $k$.

2. 2.

$1nmaxj≠lEX¯jkXlk2→0$ uniformly in $k≤n$.

3. 3.

$1n2∑ΓE[X¯jkXlk-δljXj'kX¯l'k-δj'l'2→0$ uniformly in $k≤n$, where $Γ={(j,l,j',l'):1≤j,l,j',l'≤p}∖{(j,l,j',l'):j=j'≠l=l'orj=l'≠j'=l}$.

4. 4.

$p/n→α∈(0,∞)$.

Based on the same developments as theorem 8, we use the auxiliary processes
$Xlk(t)=KλkT1K∫0txl(s)dPk(s)=1KλkT∫0txl(s)dPk(s)=∫0tHlk(s)dPk(s)$
with $Pk$ zero-mean martingale associated with the Poisson process of intensity $Kλk$ (see equation A.36) and
$Hlk(t)=xl(t)KλkT,$
and will denote $Xlk=Xlk(T)$—that is, random variables that we are concerned with are the final values (at $t=T$) of those processes.

Condition 1 is a direct application of results from equation A.42 in the proof of theorem 8 because $EX¯jkXlk$ is the covariance between the coefficients of the normalized coupling matrix.

For condition 2, let us first evaluate
$EX¯jkXlk-δlj2.$
For that, we can use Ito's formula of equation B.10 and derive the expression of $X¯jkXlk$ as a stochastic integral, using the function $F(X¯jk,Xlk)=X¯jkXlk$ We obtain
$X¯jkXlk=-∫0TXlkH¯jk(s)+X¯jkHlk(s)Kλkds+∫0TX¯jk(s-)+H¯jk(s-)Xlk(s-)+Hlk(s-)-X¯jkXlk(s-)dPk(s)+Kλkdt,=∫0TXlkH¯jk(s-)+X¯jkHlk(s-)dPk(s)+∫0TH¯jk(s-)Hlk(s-)dPk(s)+Kλkds.$
(A.44)
The first term is a stochastic integral of a zero mean martingale, while the second term is a stochastic integral of a Poisson counting process, which we can verify (due to assumption 7) that it has mean $δij$. As a consequence, $EX¯jkXlk-δlj2$ is the variance of the above expression, which is (by stochastic integral formula)
$EX¯jkXlk-δlj2=-∫0TEXlk(s-)H¯jk(s-)+X¯jk(s-)Hlk(s-)2Kλkds+∫0TH¯jk(s-)Hlk(s-)2Kλkds.$
(A.45)
Applying again the formula for predictable variation process, we obtain
$EX¯jkXlk-δlj2=-∫0T∫0sHlk(u)H¯jk(s-)+H¯jk(u)Hlk(s-)2KλkduKλkds+∫0TH¯jk(s-)Hlk(s-)2Kλkds.$
(A.46)
Due to assumption 7, this expression is bounded uniformly for any values of $i,j,n,k$, and condition 2 is fulfilled.
For condition 3, we use the auxiliary result presented in proposition 16 to compute the required fourth-order moments:
$1K2λk2EX¯jkXlkXj'kX¯l'k=∫0THlkHj'kds∫0TH¯jkH¯l'kds+∫0THlkH¯jkds∫0THj'kH¯l'kds+∫0THlkH¯l'kds∫0TH¯jkHj'kds+1Kλk∫0THlkH¯jkHj'kH¯l'kds=1λk2T2K2∫0Txlxj'ds∫0Tx¯jx¯l'ds+∫0Txlx¯jds∫0Txj'x¯l'ds+∫0Txlx¯l'ds∫0Tx¯jkxj'kds+1K3λk3T2∫0Txlx¯jxj'x¯l'ds.$
We first consider the term consisting in all products of two integrals, which we call integral product term; the last term in this expression will be dealt with independently. Given assumption 7, it is clear that for $l,j,j',l'$, all different from each other, the integral product term is vanishing. If there happen to be only two indices that are equal, the moment also vanishes (at least one term of each product vanishes). For the case $j=l=k'=l'$, the integral product term possibly does not vanish, but is uniformly bounded, and only $n$ terms satisfy this relation, such that it will not affect the limit of the relevant expression for condition 3 (due to the $1/n2$ factor).
It remains the case in which three indices exactly are identical. In such a case, one among $δjl$ or $δj'l'$ is one while the other is zero. Take $δjl=1$ and $δj'l'=0$ without loss of generality, assuming $j=l=j'≠l'$. The relevant quantity of condition 3 is
$1K2λk2EX¯jkXlk-1Xj'kX¯l'k=1K2λk2EX¯jkXlkXj'kX¯l'k-1K2λk2EXj'kX¯l'k=1λk2T2K2∫0Txlxj'ds∫0Tx¯jx¯l'ds+∫0Txlx¯j-Tds∫0Txj'x¯l'ds+∫0Txlx¯l'ds∫0Tx¯jkxj'kds+1K3λk3T2∫0Txlx¯jxj'x¯l'ds,$
in which, due to assumption 7, the integral product term still vanishes. As a consequence, the asymptotic behavior we are interested in is given by the behavior of the remaining single integral term of the moment: $1Kλk∫0Txlx¯jxj'x¯l'ds$ (the only remaining nonvanishing terms are bounded and intervene only in $n$ terms of the sum), such that
$lim1n2∑ΓEX¯jkXlk-δljXj'kX¯l'k-δj'l'2=lim1n2K2λk2∑ΓEX¯jkXlk-δljXj'kX¯l'k-δj'l'2.$
(A.47)
Thus condition 3 is satisfied due to the theorem's assumption.

To sum up, all four necessary conditions for the application of proposition 23 are fulfilled (condition 4 is part of the assumptions), and the convergence to the MP law follows immediately.

Proof of Theorem 11.
Let us use the result of Chafaï and Tikhomirov (2018) adapted to our complex case and adapt the dimension notation ($n→p(n)$, $mn→n$, but we keep the notation $Xn$). We additionally checked in all proofs and lemmas that the result still holds when we replace symmetric matrices by Hermitian ones and the scalar product of real vectors by Hermitian products of complex vectors, putting an absolute value on the Hermitian product when the original scalar product was squared. We consider ${Xn}$, a sequence of isotropic (i.e. identity covariance) zero mean random vectors and consider the empirical covariance matrix estimated from observing $n$ independent copies of $Xn$,
$Σ^n=1n∑k=1nXn(k)Xn(k)H.$
We rely on the strong tail projection property (STP) that guarantees convergence of the spectral measure of the empirical covariance to the MP law, and convergence of the extreme eigenvalues to the ends of the MP support.
Definition 1
(Strong Tail Projection Property (STP)). STP holds when there exist $f:N→[0,1]$, $g:N→R+$ such that $f(r)→0$ and $g(r)→0$ as $r→∞$, and for every $p∈N$, for any orthogonal projection $P:Cp→Cp$ of rank $r>0$, for any real $t>f(r).r$ we have
$PPXn2-r≥t≤g(r)rt2.$
By noting that $EPXn2=r$, we can use Chebyshev's inequality to satisfy such property. Let $σ2$ be the variance of $PXn2$. The inequality leads to, for any $t$,
$PPXn2-r≥σt≤PPXn2-r≥σt≤1t2,$
so we get $PPXn2-r≥t≤σ2/t2$ and just need to find an upper bound of $σ2$ of the form $g(r)r$. To limit the complexity of the rank-dependent analysis, we will look for $g(r)=C/r$ for a fixed positive constant $C$, such that we just need to bound the above variance by a constant. Finer bounds are likely possible but left to future work.
In our specific case, in line with the proof of theorem 9, we use
$Xn=∫0Tx(t)KλTdP(t),$
with $P$ the compensated Poisson process martingale of rate $Kλ$. In an orthonormal basis adapted to the othogonal projection $P$ with rank $r$, we can rewrite
$PXn2=∑k=1rwk,Xn2,$
where ${wk}$ are $r$ orthonormal vectors in $Cp$. Then we have
$σ2=∑k,l≤rEwk,Xn2wl,Xn2-1.$
Using similar fourth-order moment results as in theorem 9 (based on proposition 16) leads to an expansion for which all terms vanish but one per expectation, leading to
$σ2=1KλT2∑k,l≤r∫0Twk,x(t)x(t),wkwl,x(t)x(t),wldt,$
which can be rewritten using the Hermitian operator $X$ acting on the space of $p×p$ matrices as a positive definite bilinear form,
$X(U,V)=∫0TV,xxH(t)xxH(t),Udt,$
with associated eigenvalues $ξ1≥...≥ξp2≥0$ such that
$σ2=1KλT2∑k,l≤rXwkwlH,wkwlH.$
This sum is maximized when the $r2$ unitary tensor matrices of the sum $wkwl$ are eigenvectors associated with the largest eigenvalues of the operator, such that we get
$σ2≤1KλT2∑k=1≤r2ξk,$
which is itself upper bounded by the trace of the operator, leading to
$σ2≤1KλT2∑k,l≤p(n)∫0T|xkxl|2dt,$
which is bounded according to the theorem's assumptions, completing the proof.

## Appendix B: Additional Background and Useful Results

### B.1  Jump Processes

Jump processes exhibit discontinuities related to the occurrence of random events, which are distributed according to the given point process models. In this letter, we are concerned with jump times distributed according to (possibly inhomogeneous) Poisson processes.

#### B.1.1  Martingales Related to Counting Processes

As introduced in section 2.2 (see equation 2.3), under mild assumptions, we can associate a zero-mean martingale with a counting process $N(t)$:
$M(t)=N(t)-∫0tλ(s)ds.$
(B.1)
In addition, in our case (deterministic intensity), the variance of $M(t)$ is given by
$V(t)=EM(t)2=∫0tλ(s)ds.$

#### B.1.2  Stochastic Integrals

Now, if we consider for a deterministic predictable process $H$ (with regard to the same filtration $Ft$), the stochastic integration
$MH(t)=∫0tH(s)dM(s)ds.$
(B.2)
Using equation B.1, we can write
$MH(t)=∫0tH(s)dN(s)-∫0tH(s)λ(s)ds,$
(B.3)
which is equivalent to equation 2.5, which introduced the separation of the deterministic component of empirical coupling measure from the (zero-mean) random fluctuations of the measure. $MH(t)$ is also a zero-mean martingale with respect to history ${Ft}$. This trivially entails that $EMH(t)=0$ at all times.

#### A.1.3  Second Order Statistics

In addition, the second-order statistics of such stochastic integrals can be explicitly derived from the original intensities. In particular, for $MH(t)=∫0tH(s)dM(s)ds$, we have the variance
$VH(t)=EMH(t)2=∫0tH(s)2λ(s)ds,$
(B.4)
which corresponds to its predictable variation process (see Aalen et al., 2008, sec. 2.2.6). A similar result applies to covariance as well. Let $G$ and $H$ be deterministic predictable; then
$VH,G(t)=EMH(t)MG(t)=∫0tH(s)G(s)λ(s)ds.$
(B.5)
Importantly, we note that this nonvanishing covariance reflects the fact that both stochastic integrals are computed from the same realization of $M(t)$. If two stochastic integrals are derived from independent point processes, the resulting covariance between them is zero.

#### B.1.4  General Jump Stochastic Processes

For the proofs of our results, it is convenient to state some general results for jump processes that combine deterministic and a jump stochastic integral, decomposable as
$X(t)=X(0)+∫0tf(X(s),s)ds+∫0th(X(s),s)dN(s),$
(B.6)
with $N(t)$ a Poisson process with intensity $λ(t)$, $f$ and $h$ square integrable. This clearly includes the martingales defined above.

#### B.1.5  Mean Stochastic Jump Integrals

According to Hanson (2007, theorem 3.20), we can compute the expectation of $X(t)$ defined in equation B.6:
$E[X(t)]=E[X(0)]+∫0tf(X(s),s)ds+∫0tEh(X(s),s)λ(s)ds.$
(B.7)
This allows retrieval of the zero-mean property of the stochastic integral of martingales.

#### B.1.6  Itô's Formula

Itô's formula or Itô's lemma is an identity to find the differential of a function of a stochastic process. It is a counterpart of the chain rule used to compute the differential of composed functions. We restrict ourselves to the case of a time-independent scalar function of a jump process, while different formulas exist for other cases.

A generalized chain rule for the time derivative of such processes allows deriving an integral formula for scalar process $Y(t)=F(X(t))$ with $F$ continuously differentiable (see Hanson, 2007, lemma 4.22, rule 4.23):
$Y(t)=Y(0)+∫0tdFdx(X(s))f(X(s),s)ds+∫0tFX(s-)+h(X(s-),s)-F(X(s-))dN(s),$
(B.8)
where $X(s-)=limt→s-X(t)$ indicates the left limit.
For a scalar function of a multivariate process $Y(t)=F(X(t))$ with
$X(t)=X(0)+∫0tf(X(s),s)ds+∫0th(X(s),s)dN(s),$
(B.9)
the generalization is straightforward:
$Y(t)=Y(0)+∫0t∑kdFdxk(X(s))fk(X(s),s)ds+∫0tFX(s-)+h(X(s-),s)-F(X(s-))dN(s).$
(B.10)
This allows retrieving the expression of martingale second-order statistics presented above, as well as computing higher-order moments required in the proof of theorem 9.

An application of this formula that we will use follows:

Proposition 2.
Assume that $W(t)=∫0tA(s)dM(s)$, $X(t)=∫0tB(s)dM(s)$, $Y(t)=∫0tC(s)dM(s)$, and $Z(t)=∫0tD(s)dM(s)$ are stochastic integrals with respect to the same (possibly inhomogeneous) Poisson process martingale $M(t)=N(t)-∫0tλ(s)ds$ with intensity $λ(t)$. Then
$EWXYZ(t)=∫0tABCD(s-)λ(s)ds+∫0tAB(s)λ(s)ds∫0tCD(s)λ(s)ds+∫0tAC(s)λ(s)ds∫0tBD(s)λ(s)ds+∫0tADλ(s)(s)ds∫0tBC(s)λ(s)ds.$
(B.11)
Proof.
We apply the above formula to $F(W,X,Y,Z)=WXYZ$, yielding
$WXYZ(t)=-∫0t(AXYZ(s)+WBYZ(s)+WXCZ(s)+WXYD(s))λds+∫0t[(W(s-)+A)(X(s-)+B)(Y(s-)+C)(Z(s-)+D)-WXYZ(s-)]dN(s).$
Expanding the second term, we obtain the formula
$WXYZ(t)=∫0tAXYZ(s)+WBYZ(s)+WXCZ(s)+WXYD(s)dM(s)+∫0t(ABYZ(s-)+AXCZ(s-)+AXYD(s-)+WBCZ(s-)+WBYD(s-)+WXCD(s-))dN(s)+∫0tABCD(s-)dN(s)+∫0t(ABCZ(s-)+AXCD(s-)+ABYD(s-)+WBCD(s-))dN(s).$
The first and last integral terms in this formula have vanishing expectation, the first because it is a stochastic integral of zero mean martingale $M$, the last because each term inside the integral contains only one random variable, which is itself a stochastic integral of the martingale $M$ (and thus zero mean). Thus, for the expectation, we get
$EWXYZ(t)=∫0tABCD(s-)dλ(s)+∫0tABEYZ(s-)+ACEXZ(s-)+ADEXY(s-)+BCEWZ(s-)+BDEWY(s-)+CDEWX(s-))λ(s)ds.$
(B.12)
Based on the Itô integral formula, one can easily derive an expression for the expectation of each product of two variables (see equation A.44), leading to, after reordering the terms,
$EWXYZ(t)=∫0tABCD(s-)dλ(s)+∫0tAB(s-)∫0sCD(u-)λ(u)du+CD(s-)∫0sAB(u-)λ(u)du+AC(s-)∫0sBD(u-)λ(u)du+BD(s-)∫0sAC(u-)λ(u)du+AD(s-)∫0sBC(u-)λ(u)du+BC(s-)∫0sAD(u-)λ(u)duλ(s)ds.$
(B.13)
We then observe that the terms inside the integral can be paired such that the integral form of the product derivative formula ($∫f∫g=∫g∫f+f∫f$) can be applied, leading directly to equation B.11.

### B.2  Notions of Convergence

In contrast to finite-dimensional vectors, there are different and nonequivalent notions of convergence for functions and random variables. We explain the two types of convergence encountered in this letter. For a random variable $X$, we consider its probability measure $μX$ such that
$μX(A)=P(X∈A),$
and its associated cumulative distribution function (CDF),
$FX(x)=μX-∞,x=P(X≤x).$

#### B.2.1  Convergence in Distribution

The classical definition is based on the CDF.

Definition 2
(Convergence in Distribution). We say that the sequence of random variables ${Xn}$ converges in distribution (or in law) to $X$ whenever
$FXn(x)⟶n→+∞FX,$
at all continuity points of $FX$. This is then denoted $Xn⟶DX$.

An equivalent definition can be formulated in terms of weak convergence:

Proposition 3.
$Xn⟶DX$ if and only if, for any bounded continuous function $f$,
$Ef(Xn)=∫fdμXn→Ef(X)=∫fdμX,$
that is, in classical topological terms, the measure $μXn$ converges weakly to $μX$.

The generalization to multidimensional variables encountered in theorem 8 consists simply in replacing the cumulative distribution by its multivariate version, $FX(x)=P(X1, in definitions. A simple necessary and sufficient condition for $X⟶Y$ is that for all vectors $t$, $t⊤X⟶t⊤Y$ (this is the Cramér-Wold theorem, see Billingsley, 1995).

#### B.2.2  Convergence in Probability

This stronger notion of convergence denotes $Xn⟶PX$, stating that for any $ε>0$,
$P|Xn-X|>ε⟶n→+∞0.$
(B.14)

It can be shown that convergence in probability implies convergence in distribution. The converse is true only in special cases:

Proposition 4.

If $X$ converges in distribution to a (deterministic) constant $c$, then it also converges to it in probability.

An extension to the multivariate case is obtained in finite vector spaces by replacing the absolute value in equation B.14 by any norm, or simply by requiring the convergence of all components individually.

#### B.2.3  Convergence of Random Measures

The ESDs are random measures, and as such, random variables, leaving in an infinite-dimensional space of measures. This means that for a fixed realization $ω$, the random measure $μ$ takes the deterministic value $μ(ω)$.

Several types of convergence can be defined. First, the notion of convergence weakly in probability can be seen as a combination of the above definitions. It is known that the weak convergence of deterministic measures (see proposition 18) can be associated with a (nonunique) metric (the topological space of weak convergence is metrizable). Let us pick such a metric $ρ(μ,ν)$ between two deterministic measures; then:

Definition 3
(Convergence Weakly in Probability). The sequence of random measures $μn$ converges weakly in probability to the deterministic measure $ν$ for any $ε>0$:
$Pρ(μn,ν)>ε⟶n→+∞0.$
(B.15)

Next, we can also define convergence with probability 1 (also called almost sure convergence).

Definition 4
(Convergence (Weakly) with Probability One). The sequence of random measures $μn$ converges weakly with probability one to the deterministic measure $ν$ for any $ε>0$:
$Pρ(μn(ω),ν)⟶n→+∞0=1.$
(B.16)

As for the case of scalar random variables, convergence with probability one implies convergence in probability.

### B.3  Random Matrix Theory

Random matrix theory resulted from fairly recent developments in high-dimensional statistics. It has various application in physics (Guhr, Müller-Groeling, & Weidenmüller, 1998; Doussal, Majumdar, & Schehr, 2016), machine learning (Pennington & Bahri, 2017; Pennington & Worah, 2017; Louart et al., 2018), and neuroscience (Timme, Geisel, & Wolf, 2006; Veraart et al., 2016; Almog et al., 2019).

#### B.3.1  Wishart Ensemble

Let $X$ be a $p×n$ data matrix. Assume that the coefficients of $X$, $xij$ are i.i.d. $NC0,1$. $NC$ specifies a standard complex normal distribution. By definition, this means that $xij=xijreal+ixijimag$, where $xij=xijreal$ and $xijimag$ are independent (real) $N(0,12)$. This implies that columns of $X$ are i.i.d. $NC0p,Ip$ and, similarly, the real and imaginary parts are $N0p,Ip/2$.

As $n$ grows and $pn→n→+∞α∈(0,+∞)$, the ESD of the so-called Wishart ensemble, $Sn=1nXXH$, converges to the Marchenko-Pastur law $μMP(x)$ (Marchenko & Pastur, 1967) with density
$dμMPdx(x)=1-αα1α>1δ0+12παx(b-x)(x-a)1[a,b],$
(B.17)
with $a=(1-α)2$ and $b=(1+α)2$ (see examples for Marchenko-Pastur law for different values of $α$ in Figure 6).
Figure 6:

Density of the Marchenko-Pastur law for different values of the aspect ratio of the matrices, $α$, in equation 2.7.

Figure 6:

Density of the Marchenko-Pastur law for different values of the aspect ratio of the matrices, $α$, in equation 2.7.

We wrote here the general formula that holds for all $α>0$, accounting for zero eigenvalues with a Dirac mass in zero in the rank-deficient case $α>1$.

#### B.3.2  Stieltjes Transform of ESD

The Stieltjes transform is a very useful tool to establish the convergence of ESD and determine its limit. The Stieltjes transform of a measure $μ$ is defined as
$mμ(z)=∫1x-zdμ(x),z∈C∖R.$
A key example for us is the Stieltjes transform of the MP law:
$m(z)=1-c-z+(1+c-z)2-4c2cz.$
Many important results relate measures to their Stieltjes transform. We only need the property that the Stieltjes transform identifies the limit of a sequences of measures, with the following proposition that immediately derives from Anderson et al. (2010, theorem 2.4.4).
Proposition 5.

If two sequences of random measures ${μk}$ and ${νk}$ converge weakly in probability to a deterministic with identical Stieltjes transform, they converge to the same measure.

#### B.3.3  Convergence to MP for Matrices with Dependent Coefficients

Based on the above, we can now write a result that is a combination of results found in Bai and Zhou (2008—mainly theorem 1.1 and corollary 1.1) adapted to our specific case. We consider a sequence of random matrices ${Xn}$ with independent columns and study the ESD of
$Sn=1nXnXnH.$
In the following proposition, we use the Kronecker delta symbold delta (see equation 4.2) and denote by $X¯$ the complex conjugate of $X$.
Proposition 6.

Let As $n→∞$, and assume the following. Let

1. 1.

$EX¯jkXlk=δlj$, for all $k$.

2. 2.

$1nmaxj≠lEX¯jkXlk-δlj2→0$ uniformly in $k≤n$.

3. 3.

$1n2∑ΓEX¯jkXlk-δljXj'kX¯l'k-δj'l'2→0$ uniformly in $k≤n$, where $Γ={(j,l,j',l'):1≤j,l,j',l'≤p}∖{(j,l,j',l'):j=j'≠l=l'orj=l'≠j'=l}$.

4. 4.

$p/n→α∈(0,∞)$.

Then, with probability 1, the ESD of $Sn$ tends (weakly) to the MP law.

Sketch of the proof.

We use theorem 1.1 from Bai and Zhou (2008) combined with the sufficient condition of corollary 1.1, assuming the identity matrix $Tn$. These conditions are compatible with the case of the Wishart ensemble, such that the ESD converges to a distribution with the same Stieltjes transform as the MP law.9 As a consequence of proposition 22, we get that the limit ESD is the MP law.

The additional results in this appendix are corollaries based on simplifying assumption 24, where a linear phase is considered instead of the general assumption on phase that was used in corollaries 2 and 3.

Assumption 3.
Assume that $ϕ(t)$ is a linear function of $t$ on $[0,T]$,
$ϕ(t)=mt,m=2πf=2π/τ,$
(C.1)
where $f>0$ (interpretable as the frequency of an oscillation for the continuous signal) and $γT$ is the ratio of length ($T$) of signal to period of oscillation $τ$:
$γT=Tτ=ϕ(T)-ϕ(0)2π.$
Corollary 4.
Under the assumptions of corollary 5, assume additionally assumption 24 is also satisfied, and the intensity of the point-process is given by
$λ(t)=λ0exp(κcos(ϕ(t)-φ0)),$
(C.2)
for a given $κ≥0$. Then the expectation of the multitrial PLV estimate converges (for $K→+∞$) to
$PLV*=∫0Tei2πftexp(κcos(2πft-φ0))dt∫0Texp(κcos(2πft-φ0))dt.$
(C.3)
If, in addition, $[0,T]$ corresponds to an integer number $γT>0$ of periods of the oscillation,
$PLV*=eiφ0∫ϕ(0)ϕ(T)cos(θ)exp(κcos(θ))dθ∫ϕ(0)ϕ(T)exp(κcos(θ))dθ=eiφ0I1(κ)I0(κ),$
(C.4)
and the scaled residual $KPLV^K-PLV*$ converges to a zero mean complex gaussian $Z$ with the following covariance:
$CovRe{Ze-iφ0}Im{Ze-iφ0}=12λ0TI0(κ)2I0(κ)+I2(κ)00I0(κ)-I2(κ).$
(C.5)
Proof.
We use the intensity function introduced in equation C.2. The $PLV$ asymptotic value ($PLV*$) can be derived from definition introduced in equation 3.2 by using assumption 24:
$PLV*=∫0Teiϕ(t)λ(t)dt∫0Tλ(t)dt$
(C.6)
$=λ0∫0Teiϕ(t)exp(κcos(ϕ(t)-φ0))dtλ0∫0Texp(κcos(ϕ(t)-φ0))dt$
(C.7)
$=λ0∫0Teimtexp(κcos(mt-φ0))dtλ0∫0Texp(κcos(mt-φ0))dt.$
(C.8)
We change the integration variable from $mt$ to $θ$:
$PLV*=∫θ(0)θ(T)eiθexp(κcos(θ-φ0))dθ∫θ(0)θ(T)exp(κcos(θ-φ0))dθ.$
(C.9)
To simplify the integral (bring the $φ0$ out of the integral), we change the integration variable again, from $θ$ to $ψ$, ($ψ=θ-φ0$),
$PLV*=∫θ(0)-φ0θ(T)-φ0ei(ψ+φ0)exp(κcos(ψ))dψ∫θ(0)-φ0θ(T)-φ0exp(κcos(ψ))dψ$
(C.10)
$=eiφ0∫θ(0)-φ0θ(T)-φ0eiψexp(κcos(ψ))dψ∫θ(0)-φ0θ(T)-φ0exp(κcos(ψ))dψ.$
(C.11)
When $[0,T]$ corresponds to an integer number of periods of the oscillation (i.e., is an integer number), and given that the integration interval is $2πγT$, and integrates $2π$-periodic functions (thus the integral is invariant to translations of the integration interval), we have
$PLV*=eiφ0∫-ππeiψexp(κcos(ψ))dψ∫-ππexp(κcos(ψ))dψ.$
Observing that the integrand of the denominator is even, while for the numerator the imaginary part is odd and the real part is even, we get
$PLV*=eiφ0∫0πcos(ψ)exp(κcos(ψ))dψ∫0πexp(κcos(ψ))dψ.$
We prove the first part of the corollary, equation C.3. By using the integral form of the modified Bessel functions $Ik$ for $k$ integer (see Watson, 1995, p. 181)
$Ik(κ)=1π∫0πcos(kθ)exp(κcos(θ))dθ+sin(kπ)π∫0+∞e-κcosht-ktdt$
(C.12)
$=1π∫0πcos(kθ)exp(κcos(θ))dθ,$
(C.13)
we can derive the compact form:
$PLV*=eiφ0I1(κ)I0(κ).$
(C.14)
The covariance matrix of the asymptotic distribution can be easily derived by plugging equation C.2 as $λ(t)$ in corollary 3 and integrating on $[0,T]$
$Cov(Z)11=λ0Λ(T)2∫0Tcos2(ϕ(t))expκcos(ϕ(t)-φ0)dt.$
(C.15)
As we have
$Λ(T)=λ0TI0(κ),$
we can continue with equation C.15 as,
$Cov(Z)11=1λ0T2I0(κ)2∫0Tcos2(ϕ(t))expκcos(ϕ(t)-φ0)dt.$
(C.16)
To simplify the rest of the derivations, we transform the complex variable coordinates by using $eiϕ(t)e-iφ0$ instead of $eiϕ(t)$ as predictable with respect to ${Ft}$ (i.e., replacing $x(t)$ with $eiϕ(t)e-iφ0$ in theorem 2). With this change, equation C.16 becomes
$Cov(Z)11=1λ0T2I0(κ)2∫0Tcos2(ϕ(t)-φ0)expκcos(ϕ(t)-φ0)dt.$
(C.17)
Then we change the variable of the integral from $mt-φ0$ to $θ$ (and consequently $dt$ to $1mdθ$) and use the following trigonometric identity,
$cos2(θ)=121+cos(2θ),$
(C.18)
to obtain
$Cov(Z)11=12mλ0T2I0(κ)2∫θ(0)θ(T)1+cos(2θ)expκcos(θ)dθ=12mλ0T2I0(κ)2∫θ(0)θ(T)expκcos(θ)+cos(2θ)expκcos(θ)dθ.$
Given that the integral is invariant to translations of the integration, we get
$Cov(Z)11=12mλ0T2I0(κ)2∫02πγTexpκcos(θ)dθ+∫02πγTcos(2θ)expκcos(θ)dθCov(Z)11=12mλ0T2I0(κ)22γTπI0(κ)+2γTπI2(κ)$
(C.19)
$=2πγT2mλ0T2I0(κ)2I0(κ)+I2(κ)$
(C.20)
$=mT2mλ0T2I0(κ)2I0(κ)+I2(κ)$
(C.21)
$=12λ0TI0(κ)2I0(κ)+I2(κ).$
(C.22)

We can have a similar calculation for the imaginary part, $Cov(Z)22$, as well, but using the identity $sin2(θ)=121-cos(2θ)$ instead of equation A.20. The off-diagonal elements of the covariance matrix vanish due to symmetry of integrand.

Therefore, we showed that for a given $κ≥0$, scaled residual
$Z'=e-iφ0KPLV^K-PLV*$
converges to a zero mean complex gaussian with the following covariance:
$CovRe{Z'}Im{Z'}=Re{Ze-iφ0}Im{Ze-iφ0}=12λ0TI0(κ)2I0(κ)+I2(κ)00I0(κ)-I2(κ).$
Corollary 5.
Assume $ϕ(t)=2πkt/T$, with $k>0$ integer, and a sinusoidal modulation of the intensity at frequency $m/T$, with $m>0$ integer possibly different from $k$, phase shift $φ0$, and modulation amplitude $ϰ$ such that
$λ(t)=λ01+ϰcos2πmt/T-φ0,λ0>0,0≤ϰ≤1,$
(C.23)
and the point process is homogeneous Poisson with rate $λ0$. Then the expectation of the PLV estimate converges (for $K↦+∞$) to
$PLV*=12ϰeiφ0δkm,$
(C.24)
where $δkm$ denotes the Kronecker symbol. Moreover, the asymptotic covariance of $Z=KPLV^K-PLV*$ is
$CovRe{Z}Im{Z}=12λ0T1001.$
(C.25)
Proof.
Similar to corollary 5, we can derive the asymptotic $PLV$ (see equation C.24) for this case, from the definition in equation 3.2. We use the assumed phase $ϕ(t)=2πkt/T$ and apply the intensity function defined in equation C.23 in corollary 3:
$PLV*=∫0Teiϕ(t)λ(t)dt∫0Tλ(t)dt$
(C.26)
$=∫0Tei2πkt/T1+ϰcos2πmt/T-φ0dt∫0T1+ϰcos2πmt/T-φ0dt.$
(C.27)
By using Euler's formula, we can write the second term in the numerator as weighted sum of exponentials ($cos(x)=12(eix+e-ix))$,
$PLV*=12∫0Tei2πkt/T+ϰ∫0Tei2πkt/Tei(2πmt/T-φ0)+e-i(2πmt/T-φ0)dt∫0Tdt+∫0Tϰcos2πmt/T-φ0dt$
(C.28)
$=12∫0Tei2πkt/T+ϰ∫0Tei2π(k+m)t/Teiφ0+ϰ∫0Te-i2π(k-m)t/Teiφ0dt∫0Tdt+∫0Tϰcos2πmt/T-φ0dt$
(C.29)
$=12∫0Tei2πkt/T+ϰeiφ0∫0Tei2π(k+m)t/T+ϰeiφ0∫0Te-i2π(k-m)t/Tdt∫0Tdt+ϰ∫0Tcos2πmt/T-φ0dt.$
(C.30)
Given that $k,m>0$ and we are integrating over full periods, all terms vanish except the last term in the numerator (if and only if $k=m$) and the first term in the denominator. Therefore we have,
$PLV*=12ϰeiφ0∫0Te-i2π(k-m)t/Tdt∫0Tdt$
(C.31)
$=12ϰeiφ0δkm.$
(C.32)
We prove the first part of the corollary.
The covariance matrix of the asymptotic distribution can be derived by the procedure we used for the proof of corollary 5. We plug the rate $λ(t)$ assumed in the corollary (see equation C.23) and integrate on $[0,T]$,
$Cov(Z)11=λ0Λ(T)2∫0Tcos2(2πkt/T)1+ϰcos2πmt/T-φ0dt,$
(C.33)
and use the trigonometric identity, equation A.20, to get
$Cov(Z)11=λ02Λ(T)2∫0T1+cos(4πkt/T)1+ϰcos2πmt/T-φ0dt.$
(C.34)
In the resulting equation,
$Cov(Z)11=λ02Λ(T)2∫0Tdt+ϰ∫0Tcos2πmt/T-φ0dt+∫0Tcos(4πkt/T)dt+ϰ∫0Tcos(4πkt/T)cos2πmt/T-φ0dt,$
(C.35)
all terms except the first one vanish. The second and third vanish as we integrate in the full period, and the last term vanishes given that
$∫0Tcos(4πkt/T)cos2πmt/T-φ0dt=cos(φ0)∫0Tcos(4πkt/T)cos2πmt/Tdt+sin(φ0)∫0Tcos(4πkt/T)sin2πmt/Tdt,$
(C.36)
and $k$ and $m$ are integers.
Finally, given that $Λ(T)=∫0Tλ(t)dt=λ0T$, we have
$Cov(Z)11=12λ0T.$
(C.37)

We have a similar calculation for the imaginary part, $Cov(Z)22$. The off-diagonal elements of the covariance matrix vanish due to the symmetry of he integrand.

Therefore, we showed that for the scaled residual,
$Z=KPLV^K-PLV*$
converges to a zero mean isotropic complex gaussian:
$KPLV^K-PLV*⟶K→+∞N00,12λ0T1001.$

## Appendix D: Circular Noise

We use random numbers drawn from the von Mises distribution to generate noise for the phase of an oscillation. Consider the oscillation $Oorig[t]=e2πift$, where the bracket indicates the oscillation is sampled at equispaced discrete times $t={kΔ}k=1,…,q$. Then $O[t]$ is a noisy version of this oscillation, which is perturbed in the phase
$O[t]=e2πiftexpiη[t],$
(D.1)
where $η[t]$ is sampled i.i.d. from the zero-mean von Mises distribution $M(0,κ)$ at each time $t$. Notably, $κ$ is the dispersion parameter; therefore, larger $κ$ correspond to smaller variance of the noise. In the simulation used in section 4.3, we use $κ=10$.
In the simulation for the multivariate case, we use $Nosc$-dimensional vector of oscillations, $Oorig[t]={Ojorig[t]}j=1,…,Nosc$, and sample i.i.d. the noise for each oscillation, leading to the vector time series $η[t]$. In this case, the noisy oscillations can be written as
$O[t]=Oorig[t]⊙expiη[t],$
(D.2)
where $⊙$ is (entrywise) Hadamard product.

The advantage of such phase noise is to preserve the spectral content of the original oscillation better than conventional normal noise. Nevertheless, using conventional normal (white) noise (on both the real and imaginary parts of the oscillation) did not change the results significantly.

## Appendix E: Tables of Parameters

The choice of parameters used in the figures in the main text. In all simulations, $ϕ0=0$.123

Table 1:

Parameters Used for Simulations in Figure 2.

ParameterDescriptionAB
$f$ Frequency 1 Hz
$K$ Number of trials 5000
$T$ Simulation length 5 s
$λ0$ Average firing rate 20 Hz
$NS$ Number of simulations 5000
$κ$ Modulation strength 0.5
ParameterDescriptionAB
$f$ Frequency 1 Hz
$K$ Number of trials 5000
$T$ Simulation length 5 s
$λ0$ Average firing rate 20 Hz
$NS$ Number of simulations 5000
$κ$ Modulation strength 0.5
Table 2:

Parameters Used for Simulations in Figure 3.

ParameterDescriptionABCD
$f$ Frequency 1 Hz
$K$ Number of trials 10
$T$ Simulation length 0.75 s 0.5 s 1 s $x$-axis
$λ0$ Average firing rate 30 Hz
$NS$ Number of simulations 500
$κ$ Modulation strength
ParameterDescriptionABCD
$f$ Frequency 1 Hz
$K$ Number of trials 10
$T$ Simulation length 0.75 s 0.5 s 1 s $x$-axis
$λ0$ Average firing rate 30 Hz
$NS$ Number of simulations 500
$κ$ Modulation strength
Table 3:

Parameters Used for Simulations in Figure 5.

ParameterDescriptionA1A2A3B1B2B3
$f$ Frequency 5 oscillatory components, 11–15 Hz
$K$ Number of trials 10
$T$ Simulation length 11 s
$λ0$ Average firing rate 20 Hz
$NS$ Number of simulations 100
$κ$ Modulation strength 0.15
$nc$ Number of LFP channels 100
$ns$ Number of spiking units 10 50 90 10 50 90
$κnoise$ Dispersion parameter of phase noise 10
ParameterDescriptionA1A2A3B1B2B3
$f$ Frequency 5 oscillatory components, 11–15 Hz
$K$ Number of trials 10
$T$ Simulation length 11 s
$λ0$ Average firing rate 20 Hz
$NS$ Number of simulations 100
$κ$ Modulation strength 0.15
$nc$ Number of LFP channels 100
$ns$ Number of spiking units 10 50 90 10 50 90
$κnoise$ Dispersion parameter of phase noise 10

## Code Availability

The code to reproduce our simulation results is at https://github.com/shervinsafavi/safavi_neuralComp2021.

## Notes

1

Any martingale in this paper is zero-mean.

2

See section B.1.1 for more details.

3

This allows the normalization factor to converge to a deterministic quantity as $K→+∞$ equation 2.1.

4

The sum of the variances of real and imaginary parts.

5

They are biases in the sense that one would expect a coupling measure to vanish if there is no coupling in the data generating procedure.

6

In the sense that we can generate the homogeneous spike train and the oscillation without parametric models that do not share any information.

7

To guarantee the phase to be strictly increasing.

8

The required recentering and rescaling of the eigenvalues is studied in the literature (Johnstone, 2001; El Karoui, 2003, 2007).

9

This requires checking that the self-consistency equation 1.1 in Bai and Zhou (2008) has a unique solution, which they establish by equation 1.2.

## Acknowledgments

We are very grateful to Afonso Bandeira and Asad Lodhia for fruitful discussions at the beginning of the project. We thank Edgar Dobriban for pointing us to Bai and Yao (2008) and Joachim Werner and Michael Schnabel for their excellent IT support. This work was supported by the Max Planck Society.

## References

Aalen
,
O. O.
,
Borgan
,
Ø.
, &
Gjessing
,
H. K.
(
2008
).
Survival and event history analysis: A process point of view
.
New York
:
Springer
.
Abramowitz
,
M.
, &
Stegun
,
I. A.
(
1972
).
Handbook of mathematical functions with formulas, graphs, and mathematical tables
.
New York
:
Dover
.
Almog
,
A.
,
Buijink
,
M. R.
,
Roethler
,
O.
,
Michel
,
S.
,
Meijer
,
J. H.
,
Rohling
,
J. H. T.
, &
Garlaschelli
,
D.
(
2019
).
Uncovering functional signature in neural systems via random matrix theory
.
PLOS Computational Biology
,
15
(
5
), e1006934.
Anderson
,
G. W.
,
Guionnet
,
A.
, &
Zeitouni
,
O.
(
2010
).
An introduction to random matrices
.
Cambridge
:
Cambridge University Press
.
Ashida
,
G.
,
Wagner
,
H.
, &
Carr
,
C. E.
(
2010
). Processing of phase-locked spikes and periodic signals. In
S.
Rotter
&
S.
Grn
(Eds.),
Analysis of parallel spike trains
(pp.
59
74
).
New York
:
Springer
.
Aydore
,
S.
,
Pantazis
,
D.
, &
Leahy
,
R. M.
(
2013
).
A note on the phase locking value and its properties
.
NeuroImage
,
74
,
231
244
.
Bai
,
Z.
, &
Yao
,
J.-f.
(
2008
).
Central limit theorems for eigenvalues in a spiked population model.
Annales de l'Institut Henri Poincaré, Probabilités et Statistiques
,
44
(
3
),
447
474
.
Bai
,
Z.
, &
Zhou
,
W.
(
2008
).
Large sample covariance matrices without independence structures in columns
.
Statistica Sinica
,
18
,
425
442
.
Banna
,
M.
,
Merlevède
,
F.
, &
,
M.
(
2015
).
On the limiting spectral distribution for a large class of symmetric random matrices with correlated entries
.
Stochastic Processes and Their Applications
,
125
(
7
),
2700
2726
.
Benaych-Georges
,
F.
, &
,
R. R.
(
2012
).
The singular values and vectors of low rank perturbations of large rectangular random matrices
.
Journal of Multivariate Analysis
,
111
,
120
135
.
Bhattacharjee
,
M.
, &
Bose
,
A.
(
2016
).
Large sample behaviour of high dimensional autocovariance matrices
.
Annals of Statistics
,
44
(
2
),
598
628
.
Billingsley
,
P.
(
1995
).
Probability and measure
.
New York
:
Wiley
.
Brillinger
,
D. R.
(
1981
).
Time series: Data analysis and theory
.
:
SIAM
.
Bühlmann
,
P.
,
Kalisch
,
M.
, &
Meier
,
L.
(
2014
).
High-dimensional statistics with a view toward applications in biology
.
Annual Review of Statistics and Its Application
,
1
,
255
278
.
Bun
,
J.
,
Bouchaud
,
J.-P.
, &
Potters
,
M.
(
2017
).
Cleaning large correlation matrices: Tools from random matrix theory
.
Physics Reports
,
666
,
1
109
.
Buzsáki
,
G.
(
2004
).
Large-scale recording of neuronal ensembles.
Nature Neuroscience
,
7
(
5
),
446
451
.
Buzsaki
,
G.
(
2006
).
Rhythms of the brain
.
New York
:
Oxford University Press
.
Buzsaki
,
G.
,
Anastassiou
,
C. A.
, &
Koch
,
C.
(
2012
).
The origin of extracellular fields and currents–EEG, ECOG, LFP and spikes.
Nat. Rev. Neurosci.
,
13
(
6
),
407
20
.
Buzsaki
,
G.
,
Logothetis
,
N.
, &
Singer
,
W.
(
2013
).
Scaling brain size, keeping timing: Evolutionary preservation of brain rhythms
.
Neuron
,
80
(
3
),
751
764
.
Buzsaki
,
G.
, &
Schomburg
,
E. W.
(
2015
).
What does gamma coherence tell us about inter-regional neural communication?
Nat. Neurosci.
,
18
,
484
489
.
Capitaine
,
M.
, &
Donati-Martin
,
C.
(
2016
).
Spectrum of deformed random matrices and free probability.
arXiv:1607.05560.
Chafaï
,
D.
, &
Tikhomirov
,
K.
(
2018
).
On the convergence of the extremal eigenvalues of empirical covariance matrices with dependence
.
Probability Theory and Related Fields
,
170
(
34
),
847
889
.
Chavez
,
M.
,
Besserve
,
M.
,
,
C.
, &
Martinerie
,
J.
(
2006
).
Towards a proper estimation of phase synchronization from time series
.
J. Neurosci. Methods
,
154
(
1–2
),
149
160
.
Cole
,
S.
, &
Voytek
,
B.
(
2019
).
Cycle-by-cycle analysis of neural oscillations
.
Journal of Neurophysiology
,
122
(
2
),
849
861
.
Cueva
,
C. J.
,
Saez
,
A.
,
Marcos
,
E.
,
Genovesio
,
A.
,
Jazayeri
,
M.
,
Romo
,
R.
, …
Fusi
,
S.
(
2020
).
Low-dimensional dynamics for working memory and time encoding.
PNAS, 117
,
23021
23032
.
Dai
,
H.
,
Wang
,
Y.
,
Trivedi
,
R.
, &
Song
,
L.
(
2016
).
Recurrent coevolutionary latent feature processes for continuous-time recommendation.
In
Proceedings of the First Workshop on Deep Learning for Recommender Systems
(pp.
29
34
).
New York
:
ACM
.
De
,
A.
,
Valera
,
I.
,
Ganguly
,
N.
,
Bhattacharya
,
S.
, &
Rodriguez
,
M. G.
(
2016
). Learning and forecasting opinion dynamics in social networks. In
D.
Lee
,
M.
Sugiyama
,
U.
Luxburg
,
I.
Guyon
, &
R.
Garnett
(Eds.),
Advances in neural information processing systems
,
29
(pp.
397
405
).
Red Hook, NY
:
Curran
.
Deger
,
M.
,
Helias
,
M.
,
Boucsein
,
C.
, &
Rotter
,
S.
(
2012
).
Statistical properties of superimposed stationary spike trains
.
Journal of Computational Neuroscience
,
32
(
3
),
443
463
.
Dickey
,
A. S.
,
Suminski
,
A.
,
Amit
,
Y.
, &
Hatsopoulos
,
N. G.
(
2009
).
Single-unit stability using chronically implanted multielectrode arrays
.
J. Neurophysiol.
,
102
(
2
),
1331
1339
.
Doussal
,
P. L.
,
Majumdar
,
S. N.
, &
Schehr
,
G.
(
2016
).
Large deviations for the height in 1D Kardar-Parisi-Zhang growth at late times
.
Europhysics Letters
,
113
(
6
), 60004.
Einevoll
,
G. T.
,
Kayser
,
C.
,
Logothetis
,
N. K.
, &
Panzeri
,
S.
(
2013
).
Modelling and analysis of local field potentials for studying the function of cortical circuits
.
Nat. Rev. Neurosci.
,
14
(
11
),
770
785
.
El Karoui
,
N.
(
2003
).
On the largest eigenvalue of Wishart matrices with identity covariance when n, p and p/n tend to infinity
. arXiv:0309355(math).
El Karoui
,
N.
(
2005
).
Recent results about the largest eigenvalue of random covariance matrices and statistical application
.
Acta Physica Polonica. Series B
,
B35
(
9
),
2681
2697
.
El Karoui
,
N.
(
2007
).
Tracy–Widom limit for the largest eigenvalue of a large class of complex sample covariance matrices
.
Annals of Probability
,
35
(
2
),
663
714
.
El Karoui
,
N.
(
2008
).
Spectrum estimation for large dimensional covariance matrices using random matrix theory
.
Annals of Statistics
,
36
(
6
),
2757
2790
.
Elsayed
,
G. F.
, &
Cunningham
,
J. P.
(
2017
).
Structure in neural population recordings: An expected byproduct of simpler phenomena?
Nature Neuroscience
,
20
(
9
), 1310.
Embrechts
,
P.
,
Liniger
,
T.
, &
Lin
,
L.
(
2011
).
Multivariate Hawkes processes: An application to financial data
.
Journal of Applied Probability
,
48
(
A
),
367
378
.
Ermentrout
,
B.
, &
Pinto
,
D.
(
2007
).
Neurophysiology and waves.
SIAM News
,
40
(
2
).
Ermentrout
,
G. B.
, &
Kleinfeld
,
D.
(
2001
).
Traveling electrical waves in cortex: Insights from phase dynamics and speculation on a computational role
.
Neuron
,
29
(
1
),
33
44
.
Fries
,
P.
(
2005
).
A mechanism for cognitive dynamics: Neuronal communication through neuronal coherence
.
Trends Cogn. Sci.
,
9
(
10
),
474
80
.
Fries
,
P.
(
2015
).
Rhythms for cognition: Communication through coherence
.
Neuron
,
88
,
220
35
.
Fukushima
,
M.
,
Chao
,
Z. C.
, &
Fujii
,
N.
(
2015
).
Studying brain functions with mesoscopic measurements: Advances in electrocorticography for non-human primates
.
Current Opinion in Neurobiology
,
32
,
124
131
.
Gallego
,
J. A.
,
Perich
,
M. G.
,
Miller
,
L. E.
, &
Solla
,
S. A.
(
2017
).
Neural manifolds for the control of movement
.
Neuron
,
94
,
978
984
.
Gao
,
P.
, &
Ganguli
,
S.
(
2015
).
On simplicity and complexity in the brave new world of large-scale neuroscience
.
Current Opinion in Neurobiology
,
32
,
148
155
.
Grosmark
,
A. D.
, &
Buzsáki
,
G.
(
2016
).
Diversity in neural firing dynamics supports both rigid and learned hippocampal sequences
.
Science
,
351
(
6280
),
1440
1443
.
Grosmark
,
A. D.
,
Mizuseki
,
K.
,
Pastalkova
,
E.
,
Diba
,
K.
, &
Buzsáki
,
G.
(
2012
).
REM sleep reorganizes hippocampal excitability
.
Neuron
,
75
(
6
),
1001
1007
.
Grün
,
S.
(
2009
).
Data-driven significance estimation for precise spike correlation
.
Journal of Neurophysiology
,
101
(
3
),
1126
1140
.
Guhr
,
T.
,
Müller-Groeling
,
A.
, &
Weidenmüller
,
H. A.
(
1998
).
Random-matrix theories in quantum physics: Common concepts
.
Physics Reports
,
299
(
4–6
),
189
425
.
Hanson
,
F. B.
(
2007
).
Applied stochastic processes and control for jump-diffusions: Modeling, analysis and computation
.
:
SIAM
.
Hawkes
,
A. G.
(
1971
).
Point spectra of some mutually exciting point processes
.
Journal of the Royal Statistical Society: Series B (Methodological)
,
33
(
3
),
438
443
.
Herreras
,
O.
(
2016
).
Local field potentials: Myths and misunderstandings
.
Front. Neural Circuits
,
10
, 101.
,
J. M.
,
Rubchinsky
,
L. L.
, &
Sigvardt
,
K. A.
(
2004
).
Statistical method for detection of phase-locking episodes in neural oscillations
.
Journal of Neurophysiology
,
91
(
4
),
1883
1898
.
Jiang
,
H.
,
Bahramisharif
,
A.
,
van Gerven
,
M. A. J.
, &
Jensen
,
O.
(
2015
).
Measuring directionality between neuronal oscillations of different frequencies
.
NeuroImage
,
118
,
359
367
.
Johnson
,
D. H.
(
1996
).
Point process models of single-neuron discharges
.
Journal of Computational Neuroscience
,
3
(
4
),
275
299
.
Johnstone
,
I. M.
(
2001
).
On the distribution of the largest eigenvalue in principal components analysis
.
Annals of Statistics
,
29
,
295
327
.
Johnstone
,
I. M.
, &
Onatski
,
A.
(
2020
).
Testing in high-dimensional spiked models
.
Annals of Statistics
,
48
(
3
),
1231
1254
.
Juavinett
,
A. L.
,
Bekheet
,
G.
, &
Churchland
,
A. K.
(
2019
).
Chronically implanted Neuropixels probes enable high-yield recordings in freely moving mice
.
eLife
,
8
, e47188.
Jun
,
J. J.
,
Steinmetz
,
N. A.
,
Siegle
,
J. H.
,
Denman
,
D. J.
,
Bauza
,
M.
,
Barbarits
,
B.
, …
Harris
,
T. D.
(
2017
).
Fully integrated silicon probes for high-density recording of neural activity
.
Nature
,
551
(
7679
),
232
236
.
Kim
,
J.
,
Tabibian
,
B.
,
Oh
,
A.
,
Schölkopf
,
B.
, &
Gomez-Rodriguez
,
M.
(
2018
).
Leveraging the crowd to detect and reduce the spread of fake news and misinformation.
In
Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining
(pp.
324
332
).
New York
:
ACM
.
Kovach
,
C. K.
(
2017
).
A biased look at phase locking: Brief critical review and proposed remedy
.
IEEE Transactions on Signal Processing
,
65
(
17
),
4468
4480
.
Kritchman
,
S.
, &
,
B.
(
2009
).
Non-parametric detection of the number of signals: Hypothesis testing and random matrix theory
.
IEEE Transactions on Signal Processing
,
57
,
3930
3941
.
Krumin
,
M.
,
Reutsky
,
I.
, &
Shoham
,
S.
(
2010
).
Correlation-based analysis and generation of multiple spike trains using Hawkes models with an exogenous input
.
Front. Comput. Neurosci.
,
4
, 147.
Lepage
,
K. Q.
,
Kramer
,
M. A.
, &
Eden
,
U. T.
(
2011
).
The dependence of spike field coherence on expected intensity
.
Neural Computation
,
23
(
9
),
2209
2241
.
Li
,
Z.
,
Cui
,
D.
, &
Li
,
X.
(
2016
).
Unbiased and robust quantification of synchronization between spikes and local field potential
.
J. Neurosci. Methods
,
269
,
33
8
.
Liljenstroem
,
H.
(
2012
).
Mesoscopic brain dynamics
.
Scholarpedia
,
7
(
9
), 4601.
Liptser
,
R. S.
, &
Shiryaev
,
A. N.
(
2013a
).
Statistics of random processes: I. General theory
.
New York
:
.
Liptser
,
R. S.
, &
Shiryaev
,
A. N.
(
2013b
).
Statistics of random processes II: Applications
.
New York
:
.
Liu
,
H.
,
Aue
,
A.
, &
Paul
,
D.
(
2015
).
On the Marčenko–Pastur law for linear time series
.
Annals of Statistics
,
43
(
2
),
675
712
.
Louart
,
C.
,
Liao
,
Z.
, &
Couillet
,
R.
(
2018
).
A random matrix approach to neural networks
.
Ann. Appl. Probab.
,
28
(
2
),
1190
1248
.
Loubaton
,
P.
, &
Vallet
,
P.
(
2011
).
Almost sure localization of the eigenvalues in a gaussian information plus noise model: Application to the spiked models
.
Electronic Journal of Probability
,
16
,
1934
1959
.
Maimon
,
G.
, &
,
J. A.
(
2009
).
Beyond Poisson: Increased spike-time regularity across primate parietal cortex
.
Neuron
,
62
(
3
),
426
440
.
Marchenko
,
V. A.
, &
Pastur
,
L. A.
(
1967
).
Distribution of eigenvalues for some sets of random matrices
.
Matematicheskii Sbornik
,
114
(
4
),
507
536
.
Mastrogiuseppe
,
F.
, &
Ostojic
,
S.
(
2018
).
Linking connectivity, dynamics, and computations in low-rank recurrent neural networks
.
Neuron
,
99
(
3
),
609
623
.
Mingo
,
J. A.
, &
Speicher
,
R.
(
2017
).
Free probability and random matrices
.
New York
:
Springer
.
Mitra
,
P.
(
2007
).
Observed brain dynamics
.
New York
:
Oxford University Press
.
Mittelhammer
,
R. C.
(
1996
).
Mathematical statistics for economics and business
.
New York
:
Springer
.
Namaki
,
A.
,
Ardalankia
,
J.
,
Raei
,
R.
,
Hedayatifar
,
L.
,
Hosseiny
,
A.
,
Haven
,
E.
, &
Jafari
,
G. R.
(
2020
).
Analysis of the global banking network by random matrix theory
. arXiv:200714447.
Nawrot
,
M. P.
,
Boucsein
,
C.
,
Molina
,
V. R.
,
Riehle
,
A.
,
Aertsen
,
A.
, &
Rotter
,
S.
(
2008
).
Measurement of variability dynamics in cortical spike trains
.
Journal of Neuroscience Methods
,
169
(
2
),
374
390
.
Ogata
,
Y.
(
1988
).
Statistical models for earthquake occurrences and residual analysis for point processes
.
Journal of the American Statistical Association
,
83
(
401
),
9
27
.
O'Leary
,
T.
,
Sutton
,
A. C.
, &
Marder
,
E.
(
2015
).
Computational models in the age of large datasets
.
Current Opinion in Neurobiology
,
32
,
87
94
.
Pennington
,
J.
, &
Bahri
,
Y.
(
2017
).
Geometry of neural network loss surfaces via random matrix theory.
In
Proceedings of the International Conference on Machine Learning
(pp.
2798
2806
).
Pennington
,
J.
, &
Worah
,
P.
(
2017
). Nonlinear random matrix theory for deep learning. In
I.
Guyon
,
Y. V.
Luxburg
,
S.
Bengio
,
H.
Wallach
,
R.
Fergus
,
S.
Vishwanathan
, &
R.
Garnett
(Eds.),
Advances in neural information processing systems
30
(pp.
2637
2646
).
Red Hook, NY
:
Curran
.
Pennington
,
J.
, &
Worah
,
P.
(
2019
).
Nonlinear random matrix theory for deep learning
.
J. Stat. Mech.
,
2019
(
12
), 124005.
Pereda
,
E.
,
Quiroga
,
R. Q.
, &
Bhattacharya
,
J.
(
2005
).
Nonlinear multivariate analysis of neurophysiological signals
.
Progress in Neurobiology
,
77
(
1–2
),
1
37
.
Pesaran
,
B.
,
Vinck
,
M.
,
Einevoll
,
G. T.
,
Sirota
,
A.
,
Fries
,
P.
,
Siegel
,
M.
, …
Srinivasan
,
R.
(
2018
).
Investigating large-scale brain dynamics using field potential recordings: Analysis and interpretation
.
Nature Neuroscience
,
21
,
903
919
.
Peterson
,
E. J.
, &
Voytek
,
B.
(
2018
).
Healthy oscillatory coordination is bounded by single-unit computation
. bioRxiv:309427.
Protter
,
P. E.
(
2005
).
Stochastic integration and differential equations
.
New York
:
Springer
.
Reimer
,
I. C.
,
Staude
,
B.
,
Ehm
,
W.
, &
Rotter
,
S.
(
2012
).
Modeling and analyzing higher-order correlations in non-Poissonian spike trains
.
Journal of Neuroscience Methods
,
208
(
1
),
18
33
.
Rizoiu
,
M.-A.
,
Lee
,
Y.
,
Mishra
,
S.
, &
Xie
,
L.
(
2017
).
A tutorial on Hawkes processes for events in social media.
arXiv:1708.06401.
Rosenblum
,
M.
,
Pikovsky
,
A.
,
Kurths
,
J.
,
Schäfer
,
C.
, &
Tass
,
P. A.
(
2001
).
Phase synchronization: From theory to data analysis.
In
F.
Moss
&
S.
Gielen
(Eds.),
Handbook of biological physics
(vol.
4
, pp.
279
321
).
Amsterdam
:
Elsevier
.
Safavi
,
S.
,
Panagiotaropoulos
,
T.
,
Kapoor
,
V.
,
Ramirez-Villegas
,
J. F.
,
Logothetis
,
N. K.
, &
Besserve
,
M.
(
2020
).
Uncovering the organization of neural circuits with generalized phase locking analysis.
bioRxiv. https://doi.org/10.1101/2020.12.09.413401
Shinomoto
,
S.
,
Kim
,
H.
,
Shimokawa
,
T.
,
Matsuno
,
N.
,
Funahashi
,
S.
,
Shima
,
K.
, …
Toyama
,
K.
(
2009
).
Relating neuronal firing patterns to functional differentiation of cerebral cortex.
PLOS Computational Biology
,
5
(
7
).
Shinomoto
,
S.
,
Shima
,
K.
, &
Tanji
,
J.
(
2003
).
Differences in spiking patterns among cortical neurons
.
Neural Computation
,
15
(
12
),
2823
2842
.
Softky
,
W. R.
, &
Koch
,
C.
(
1993
).
The highly irregular firing of cortical cells is inconsistent with temporal integration of random EPSPs
.
Journal of Neuroscience
,
13
(
1
),
334
350
.
Sohn
,
H.
,
Narain
,
D.
,
Meirhaeghe
,
N.
, &
Jazayeri
,
M.
(
2019
).
Bayesian computation through cortical latent dynamics
.
Neuron
,
103
,
934
907
.
Stevenson
,
I. H.
, &
Kording
,
K. P.
(
2011
).
How advances in neural recording affect data analysis
.
Nature Neuroscience
,
14
(
2
),
139
142
.
Tabibian
,
B.
,
Valera
,
I.
,
Farajtabar
,
M.
,
Song
,
L.
,
Schölkopf
,
B.
, &
Gomez-Rodriguez
,
M.
(
2017
).
Distilling information reliability and source trustworthiness from digital traces.
In
Proceedings of the 26th International Conference on World Wide Web
(pp.
847
855
).
New York
:
ACM
.
Timme
,
M.
,
Geisel
,
T.
, &
Wolf
,
F.
(
2006
).
Speed of synchronization in complex networks of neural oscillators: Analytic results based on random matrix theory
.
Chaos: An Interdisciplinary Journal of Nonlinear Science
,
16
(
1
), 015108.
Tracy
,
C. A.
, &
Widom
,
H.
(
2002
).
Distribution functions for largest eigenvalues and their applications
. arXiv:0210034(math-ph).
Truccolo
,
W.
(
2016
).
From point process observations to collective neural dynamics: Nonlinear Hawkes process GLMs, low-dimensional dynamics and coarse graining
.
Journal of Physiology–Paris
,
110
(
4, Part A
),
336
347
.
Truccolo
,
W.
,
Hochberg
,
L. R.
, &
Donoghue
,
J. P.
(
2010
).
Collective dynamics in human and monkey sensorimotor cortex: Predicting single neuron spikes
.
Nat. Neurosci.
,
13
,
105
111
.
Veraart
,
J.
,
Novikov
,
D. S.
,
Christiaens
,
D.
,
,
B.
,
Sijbers
,
J.
, &
Fieremans
,
E.
(
2016
).
Denoising of diffusion MRI using random matrix theory
.
NeuroImage
,
142
,
394
406
.
Vinck
,
M.
,
Battaglia
,
F. P.
,
Womelsdorf
,
T.
, &
Pennartz
,
C.
(
2012
).
Improved measures of phase-coupling between spikes and the local field potential
.
J. Comput. Neurosci.
,
33
(
1
),
53
75
.
Vinck
,
M.
,
van Wingerden
,
M.
,
Womelsdorf
,
T.
,
Fries
,
P.
, &
Pennartz
,
C. M.
(
2010
).
The pairwise phase consistency: A bias-free measure of rhythmic neuronal synchronization
.
NeuroImage
,
51
(
1
),
112
22
.
Watson
,
G. N.
(
1995
).
A treatise on the theory of Bessel functions
.
Cambridge
:
Cambridge University Press
.
Wigner
,
E. P.
(
1955
).
Characteristic vectors of bordered matrices with infinite dimensions
.
Ann. Math
,
62
, 548.
Wigner
,
E. P.
(
1958
).
On the distribution of the roots of certain symmetric matrices
.
Annals of Mathematics
,
67
,
325
327
.
Williamson
,
R. C.
,
Doiron
,
B.
,
Smith
,
M. A.
, &
Yu
,
B. M.
(
2019
).
Bridging large-scale neuronal recordings and large-scale network models using dimensionality reduction
.
Current Opinion in Neurobiology
,
55
,
40
47
.
Womelsdorf
,
T.
,
Schoffelen
,
J. M.
,
Oostenveld
,
R.
,
Singer
,
W.
,
Desimone
,
R.
,
Engel
,
A. K.
, &
Fries
,
P.
(
2007
).
Modulation of neuronal interactions through neuronal synchronization
.
Science
,
316
,
1609
1612
.
Zarei
,
M.
,
Jahed
,
M.
, &
Daliri
,
M. R.
(
2018
).
Introducing a comprehensive framework to measure spike-LFP coupling.
Frontiers in Computational Neuroscience
,
12
.