Abstract

Marked point process models have recently been used to capture the coding properties of neural populations from multiunit electrophysiological recordings without spike sorting. These clusterless models have been shown in some instances to better describe the firing properties of neural populations than collections of receptive field models for sorted neurons and to lead to better decoding results. To assess their quality, we previously proposed a goodness-of-fit technique for marked point process models based on time rescaling, which for a correct model produces a set of uniform samples over a random region of space. However, assessing uniformity over such a region can be challenging, especially in high dimensions. Here, we propose a set of new transformations in both time and the space of spike waveform features, which generate events that are uniformly distributed in the new mark and time spaces. These transformations are scalable to multidimensional mark spaces and provide uniformly distributed samples in hypercubes, which are well suited for uniformity tests. We discuss the properties of these transformations and demonstrate aspects of model fit captured by each transformation. We also compare multiple uniformity tests to determine their power to identify lack-of-fit in the rescaled data. We demonstrate an application of these transformations and uniformity tests in a simulation study. Proofs for each transformation are provided in the appendix.

1  Introduction

In recent years, marked point process models have become increasingly common in the analysis of population neural spiking activity (Kloosterman, Layton, Chen, & Wilson, 2014; Deng, Liu, Kay, Frank, & Eden, 2015; Sodkomkham et al., 2016). For multiunit spike data, these models directly relate the occurrences of spikes with particular waveform features to the biological, behavioral, or cognitive variables encoded by the population, without the need for a separate spike-sorting step. For this reason, these are sometimes called clusterless neural models. Clusterless models have been shown to capture coding properties for spikes that cannot be sorted with confidence and to lead to improved population decoding results in place-field data from rat hippocampus during spatial navigation tasks (Kloosterman et al., 2014; Deng et al., 2015; Sodkomkham et al., 2016). Additionally, avoiding a computationally intensive spike-sorting step allows for neural decoding to be implemented in real-time, closed-loop experiments. Figure 1 illustrates how marked point process data are generated from neural activity recorded by a tetrode. Figure 1A shows the neural activity of multiple cells recorded using the tetrode, and Figure 1B shows the corresponding spike events. A spike event is detected when the peak-to-peak amplitude of the neural activity exceeds a threshold on at least one channel of the tetrode. For each spike event, the peak amplitudes on the four tetrode electrodes constitute the mark.
Figure 1:

Mapping neural activity recorded from a tetrode to spike events. (A) Example neural activity captured by a tetrode. The dashed line shows the threshold for spike detection. (B) The mark for each event has dimension 4, corresponding to the peak amplitude on each channel.

A critical element of any statistical modeling procedure is the ability to assess the goodness-of-fit between a fitted model and the data. For point process models of sorted spike train data, effective goodness-of-fit methods have been developed based on the time-rescaling theorem (Papangelou, 1972; Brown, Barbieri, Ventura, Kass, & Frank, 2002). Previously, we developed an extension of the time-rescaling theorem for marked point processes that, given the correct model, rescales the observed spike and mark data to a uniform distribution on a random subset of the space of marks and rescaled times (Tao, Weber, Arai, & Eden, 2018). We can then use established statistical tests for uniformity to assess whether the model used for rescaling is consistent with the observed data. However, several challenges still limit the efficient application of these methods to marked point process models in some cases. For models with high-dimensional marks representing the waveform features, computing the space in which the rescaled data should be uniform can be computationally expensive (Tao et al., 2018). Since this space is random and typically not convex, the number of statistical tests for uniformity is limited to those that can be applied in general spaces. Finally, of the multitude of uniformity tests, it is often not clear which should be applied to the rescaled data.

Here, we propose several extensions to this goodness-of-fit approach based on combinations of time and mark scaling, which for a correct model transform the observed spike and waveform data to uniformly distributed samples in a hypercube. This in turn simplifies and opens up more options for assessing uniformity. We discuss the properties of each transformation and demonstrate which aspects of model lack-of-fit are better captured using each. Finally, we perform a simulation analysis to compare and contrast the transformations proposed here, along with the multiple uniformity tests, to assess different models' fit to the simulated data. Using these simulated data, we can visualize the transformation results and study different aspects of the model, including factors that affect the fit, as well as the uniformity tests' results. For real neural data, the mark is often of higher dimension; for instance, the mark can be the peak amplitudes recorded on each electrode of a tetrode for each event. In appendix D, we demonstrate the transformations and uniformity tests applied to neural data recorded from the hippocampus of a rat traversing a W-shaped maze.

Our goal here is not to identify one single, best transformation and uniformity test for assessing goodness-of-fit of marked point process models; instead, we aim to provide a toolbox of methods to identify multiple ways in which a model may fail to capture structure in the data and provide guidance about which methods are most likely to be useful in different situations. We also developed an interactive and easy-to-use toolbox for the transformations and uniformity tests described here to assist other researchers in applying these goodness-of-fit techniques in their analysis of neural spike trains.

The letter is organized as follows. We first introduce each transformation in detail and briefly discuss their core properties. We then discuss different uniformity tests and their main attributes. We next go through a simulation example and compare goodness-of-fit results for the true and a set of alternative generative models. We finish with theoretical proofs that the transformations under the correct model yield uniform samples. We developed an open source toolbox that implements different transformations and uniformity tests discussed in this letter. The toolbox is written in Matlab, and it is available through our GitHub repository (Yousefi, Amidi, Nazari, & Eden, 2020).

2  Marked Point Process to Uniform Transformation

In this section, we introduce two transformations that take a data set of spike times and waveforms from a marked point process model to a set of identically distributed uniform samples on the hypercube $[0,1]^{d+1}$, where $d$ is the dimension of the mark used to describe the spike waveform features in the model. We also discuss the properties of both transformations and explain which features of model lack-of-fit can be better captured by each transformation.

A marked point process model is defined by a joint mark intensity function (JMIF), $\lambda(t, \vec{m})$, where $t$ represents time and $\vec{m}$ represents a vector mark describing spike waveform features. $\lambda(t, \vec{m})$ is defined so that the probability of observing a spike in $[t, t+\Delta]$ with a waveform whose features lie in the full space of marks, $M$, is given by
$\Pr(\text{spike in } [t, t+\Delta] \text{ with mark in } M \mid H_t) = \int_M \lambda(t, \vec{m})\, d\vec{m}\, \Delta.$
A marked point process model expresses this intensity as a function of any signals or covariates encoded by the neural population, $x(t)$, and the history of spikes and waveforms up to time $t$, $H_t$. Using this joint mark intensity, we can compute the ground intensity, $\Lambda(t) = \int_M \lambda(t, \vec{m})\, d\vec{m}$, where $M$ is the full space of marks, which defines the intensity of observing any spike at time $t$. Similarly, we can define the mark intensity, $\Gamma(\vec{m}) = \int_0^T \lambda(t, \vec{m})\, dt$.
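To make these definitions concrete, the ground and mark intensities can be approximated numerically for any candidate JMIF. The Python sketch below uses a made-up single-unit intensity, constant in time and Gaussian in a scalar mark; none of these choices come from the models in this letter. It integrates $\lambda(t, m)$ over the mark grid to obtain $\Lambda(t)$ and over time to obtain $\Gamma(m)$.

```python
import numpy as np

# Hypothetical JMIF for illustration: a single unit firing at 5 spikes/s
# whose scalar mark is Gaussian around 10 with variance 0.25.
def jmif(t, m):
    return 5.0 * np.exp(-0.5 * (m - 10.0) ** 2 / 0.25) / np.sqrt(2 * np.pi * 0.25)

T = 100.0                              # observation interval [0, T]
t_grid = np.linspace(0.0, T, 1001)
m_grid = np.linspace(5.0, 15.0, 1001)
dt = t_grid[1] - t_grid[0]
dm = m_grid[1] - m_grid[0]

tt, mm = np.meshgrid(t_grid, m_grid, indexing="ij")
lam = jmif(tt, mm)                     # lambda(t, m) on the grid

# Ground intensity: Lambda(t) = integral of lambda(t, m) over the mark space.
Lambda_t = lam.sum(axis=1) * dm        # one value per time point

# Mark intensity: Gamma(m) = integral of lambda(t, m) over [0, T].
Gamma_m = lam.sum(axis=0) * dt         # one value per mark value

# For this stationary example, Lambda(t) is about 5 everywhere, and
# integrating Gamma over marks recovers the expected spike count 5 * T.
print(Lambda_t[0], Gamma_m.sum() * dm)
```

The same grid-based integration applies to higher-dimensional marks, at the usual cost of the grid growing exponentially with the mark dimension.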

For an observation interval $[0, T]$, we observe a sequence of spike events at times $0 = s_0 < s_1 < \cdots < s_{N(T)} \le T$, with associated marks $\vec{m}_i$, for $i = 1, \dots, N(T)$, generated by a joint mark intensity function $\lambda(t, \vec{m})$. We assume this joint mark intensity function is integrable over both time and mark space. The notation we use to define the data and model components is listed in Table 1.

Table 1:
Notation for the Marked Point Process Model and Data.
Name    Mathematical Notation
Joint mark intensity function    $\lambda(t, \vec{m})$
Ground intensity    $\Lambda(t)$
Mark intensity    $\Gamma(\vec{m})$
Full spike event    $(s_i, \vec{m}_i)$
Spike time    $s_i$
Spike mark    $\vec{m}_i$
Conditional mark distribution    $\vec{m}_i \mid s_i$
Conditional intensity    $s_i \mid \vec{m}_i$

In the following sections, we present the transformations and associated uniformity tests.

2.1  Interval-Rescaling Conditional Mark Distribution Transform (IRCM)

This algorithm requires the computation of the ground intensity, $\Lambda(t)$, followed by the conditional mark distribution, $\vec{m} \mid s_i$. We first rescale the interspike intervals across all observed spikes based on the ground intensity. We then rescale each mark dimension sequentially using a Rosenblatt transformation (Rosenblatt, 1952) based on the conditional mark distribution given the spike time. The order of conditioning for the mark features can be specified directly or selected randomly. The dimension of the new data samples is $d+1$, where $d$ is the dimension of the mark space. The transformed data samples are independent and identically distributed (i.i.d.) with a uniform distribution in the hypercube $[0,1]^{d+1}$. (See algorithm 1.)

No matter which ordering we select for the mark components, the i.i.d. and uniformity properties hold under the true model. The theoretical proof of the IRCM transformation is given in section A.1.
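A minimal sketch of the IRCM steps, consistent with the description above, is given below in Python. The generative model here is an assumption made for the example (a homogeneous ground intensity with a scalar, normally distributed mark whose mean drifts in time); it is not the simulation model of section 4. Because the mark is scalar, the Rosenblatt step reduces to evaluating the conditional mark CDF.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Assumed "true" model for illustration: ground intensity Lambda(t) = 5,
# and scalar mark m_i | s_i ~ N(10 + 0.01 * s_i, 0.5^2).
rate, T = 5.0, 200.0
isis = rng.exponential(1.0 / rate, size=int(3 * rate * T))
s = np.cumsum(isis)
s = s[s < T]                                  # spike times in [0, T)
m = 10.0 + 0.01 * s + 0.5 * rng.standard_normal(s.size)

# IRCM step 1: rescale interspike intervals with the ground intensity.
# Here the integral of Lambda over (s_{i-1}, s_i] is simply rate * ISI.
tau = rate * np.diff(np.concatenate(([0.0], s)))
u = 1.0 - np.exp(-tau)                        # uniform on [0,1] under the true model

# IRCM step 2: Rosenblatt transform of the mark given its spike time,
# i.e. the conditional mark CDF evaluated at each observed mark.
z = (m - (10.0 + 0.01 * s)) / (0.5 * math.sqrt(2.0))
v = 0.5 * (1.0 + np.vectorize(math.erf)(z))

# Under the correct model, the pairs (u_i, v_i) are i.i.d. uniform on the unit square.
print(u.mean(), v.mean())
```

Replacing the model used for rescaling with a misspecified one (for example, ignoring the drift in the mark mean) would push the $v$ samples away from uniformity, which is exactly what the uniformity tests of section 3 are designed to detect.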

The rescaled data samples from the IRCM transformation not only provide insight into the overall quality of the proposed joint mark intensity function but also reveal finer aspects of the model fit or lack-of-fit. The $u_i$ samples are computed using only the spike times and the estimated ground intensity model and can be used separately to assess the goodness-of-fit of the temporal component of the model to the unmarked spike times using the time-rescaling theorem (Brown et al., 2002). The $v_i^{\Theta(l)}$ samples for a fixed dimension, $l$, are computed using only marks $\Theta(1), \dots, \Theta(l)$ and the conditional mark density, $f(m_{\Theta(l)} \mid m_{\Theta(l-1)}, \dots, m_{\Theta(1)}, s_i)$. If samples $v_i^{\Theta(k)}$ for $k < l$ are uniform but $v_i^{\Theta(l)}$ are not, this suggests a specific lack-of-fit in modeling the coding properties of the waveform features associated with mark dimension $l$. Figure 2 shows how this transformation maps the data to $u$ and $v$ coordinates.
Figure 2:

How the marks and their timing are transformed into $u$ and $v$ coordinates under the IRCM transformation. (A) The same spike events shown with two different color gradients. In the top panel, spike events are colored based on their timing, and in the bottom panel, the same spike events are colored based on their mark value. Here, we assume the mark dimension is 1. (B) The spike events' transformation under IRCM. Note that the rescaled times and marks are shuffled along both axes.

2.2  Mark Density Conditional Intensity Transform (MDCI)

Algorithm 2 requires rescaling time separately for each spike, based on its joint mark intensity. This can potentially break the ordering of spikes with different waveform features, while spikes with similar waveforms will tend to maintain their relative ordering. Next, the algorithm sequentially rescales each mark dimension, again based on a Rosenblatt transformation (Rosenblatt, 1952). Like IRCM, we can choose any ordering for the mark features or select a random ordering. Distinct from IRCM, the mark rescaling does not depend on the time of the spike, only on its mark value. Algorithm 2 describes this mapping.
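The following Python sketch gives one concrete reading of the MDCI mapping for a scalar mark, under an illustrative model we make up here (constant ground intensity, drifting Gaussian mark): the time coordinate of each spike is the cumulative joint mark intensity evaluated at that spike's own mark, normalized by its full-interval integral, and the mark coordinate is a Rosenblatt transform based on the normalized mark intensity $\Gamma(m)$. All model and grid choices are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative model: lambda(t, m) = 5 * N(m; 10 + 0.01 * t, 0.5^2).
rate, T, sig = 5.0, 200.0, 0.5

def jmif(t, m):
    mu = 10.0 + 0.01 * t
    return rate * np.exp(-0.5 * ((m - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))

# Simulate spikes: constant ground intensity, marks drifting with time.
s = np.cumsum(rng.exponential(1.0 / rate, size=int(3 * rate * T)))
s = s[s < T]
m = 10.0 + 0.01 * s + sig * rng.standard_normal(s.size)

t_grid = np.linspace(0.0, T, 1001)
dt = t_grid[1] - t_grid[0]

# MDCI time rescaling: integrate each spike's own joint mark intensity
# over time and normalize by the integral over the whole interval.
lam_i = jmif(t_grid[None, :], m[:, None])        # lambda(t, m_i) on the grid
cum = np.cumsum(lam_i, axis=1) * dt
u = np.array([np.interp(si, t_grid, ci) for si, ci in zip(s, cum)]) / cum[:, -1]

# MDCI mark rescaling: Rosenblatt transform with the normalized mark
# intensity Gamma(m), computed on a mark grid.
m_grid = np.linspace(7.0, 15.0, 1001)
Gamma = jmif(t_grid[:, None], m_grid[None, :]).sum(axis=0) * dt
F = np.cumsum(Gamma) / Gamma.sum()
v = np.interp(m, m_grid, F)

# Pooled over spikes, the (u, v) samples fill the unit square uniformly
# under the correct model, although they are not independent.
print(u.mean(), v.mean())
```

Note that, unlike IRCM, the $u$ coordinates of spikes with similar marks stay close together, which is the mixing behavior contrasted in the next paragraphs.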

The proof that the MDCI transformation under the true conditional intensity model leads to uniform samples is included in section A.2. Figure 3 explains how this transformation maps the data to $u$ and $v$ coordinates.
Figure 3:

How the marks and their timing are transformed into $u$ and $v$ coordinates under the MDCI transformation. (A) The same spike events shown with two different color gradients. In the top panel, spike events are colored based on their timing, and in the bottom panel, the same spike events are colored based on their mark value. Here, we assume the mark dimension is 1. (B) The spike events' transformation under MDCI. Note that the rescaled times preserve their order but are stretched to the range of 0 to 1, and the rescaled marks are stretched and shuffled.

The key difference between the IRCM and MDCI transforms is that the IRCM transforms the interspike intervals independent of their marks and then transforms each mark based on the intensity of spikes with that waveform at the observed time, while the MDCI transforms the marks independent of when the corresponding spikes occur and then transforms time differently for each spike waveform. For neural spiking models, the IRCM examines the intervals between spikes and tends to mix the marks so that spikes with similar waveforms may end up far apart in the transformed mark space; inversely, the MDCI tends to leave spikes with similar waveforms nearby in the transformed mark space while mixing up the spike timing from different neurons. Another important difference is that for the correct model, the IRCM generates i.i.d. uniform samples while the MDCI samples are not independent. However, the set of all the unordered MDCI samples does have a joint uniform distribution. We therefore expect these transforms to allow us to determine separate aspects of lack-of-fit. The lack-of-fit associated with the model of individual neurons or particular waveform features might be better assessed using MDCI, while lack-of-fit associated with interactions between neurons might be better assessed using IRCM. We investigate these expectations in section 4.

In this section, we described two algorithms that take marked point process data and map them to uniformly distributed samples in a hypercube, $[0,1]^{d+1}$, based on their joint mark intensity. These methods allow for marks of arbitrary dimension. In appendix B, we describe one additional transformation, which applies in the specific case where the mark is scalar.

3  Uniformity Tests

There are a multitude of established uniformity tests for one-dimensional data; however, the number of established, robust, multidimensional uniformity tests is more limited. Pearson's chi-square test can be used to assess uniformity by partitioning the space into discrete components and counting the number of samples in each (Pearson, 1938; Greenwood & Nikulin, 1996). Another approach is to apply a multivariate Kolmogorov-Smirnov (Rosenblatt) test (Justel, Peña, & Zamar, 1997), which uses a statistic based on the maximum deviation between the empirical multivariate CDF and that of the uniform distribution to build a distribution-free test for multidimensional samples. Other test statistics are derived from number-theoretic or quasi-Monte Carlo methods for measuring the discrepancy of points in $[0,1]^d$ (Liang, Fang, Hickernell, & Li, 2001; Ho & Chiu, 2007). Monte Carlo simulation shows that the finite-sample distributions of these statistics are well approximated by a standard normal distribution (Liang et al., 2001; Ho & Chiu, 2007). Two other approaches to assessing multivariate uniformity are based on distances between samples and the boundary of the hypercube (Berrendero, Cuevas, & Vázquez-Grande, 2006) and on distances between nearest samples, which leads to the computation of Ripley's K function (Ripley, 2005; Lang & Marcon, 2013; Marcon, Traissac, & Lang, 2013). Fan (1998) describes a test based on the $L_2$ distance between a kernel density estimate of the underlying probability density and the uniform density. Other tests include those built on order statistics (Chen & Ye, 2009), Friedman-Rafsky's minimal spanning tree (Friedman & Rafsky, 1979), or a weighted $K$-function (Veen & Schoenberg, 2006; Jafari Mamaghani, Andersson, & Krieger, 2010; Dixon, 2014).
There are several other multivariate uniformity tests which are not presented here; a comprehensive discussion of scalar and multivariate uniformity tests can be found in Marhuenda, Morales, and Pardo (2005). There are also uniformity tests specifically designed for two- and three-dimensional spaces including complete spatial randomness or bivariate Cramer-von Mises tests that are described in Zimmerman (1993), Chiu and Liu (2009), and Csörgő (2014).

Here, we investigate a few of these approaches in terms of their ability to detect model lack-of-fit in rescaled samples from the spike transformations described above; the tests are Pearson's chi-square test (Pearson, 1938), a multivariate KS test (Justel et al., 1997), the distance-to-boundary method (Berrendero et al., 2006), a discrepancy-based test (Liang et al., 2001), a test based on Ripley's $K$-function (Ripley, 2005; Lang & Marcon, 2013; Marcon et al., 2013), and a test using minimum spanning trees (MST) (Smith & Jain, 1984; Jain, Xu, Ho, & Xiao, 2002). The tests are described in detail in the cited literature and are expressed algorithmically in Table 2. These tests tend to be straightforward to implement, with a few exceptions: Ripley's $K$-function becomes computationally expensive in more than two dimensions, and Pearson's chi-square test requires defining a set of subregions of the hypercube. The remaining tests do not require any parameters to be selected except for the test significance level.

Table 2:
Uniformity Tests.
Test Name and Method

Pearson $\chi^2$ test (Pearson, 1938)
1. Define $M$ subregions $R_j$, $j = 1, \dots, M$, in the hypercube $R$.
2. Let $p_j = |R_j| / |R|$, where $|R|$ represents the volume of region $R$.
3. Calculate $X^2 = \sum_{j=1}^{M} (r_j - n p_j)^2 / (n p_j)$ using the rescaled data samples, where $r_j$ is the number of samples in $R_j$.
4. If $X^2 > \chi_{M-1}^2(1 - \alpha)$, reject the null hypothesis, where $\chi_{M-1}^2(1 - \alpha)$ is the inverse of the chi-square CDF with $M - 1$ degrees of freedom.

Multivariate Kolmogorov-Smirnov test (Justel et al., 1997)
1. Define $D_{KS} = \sup_n |F(v_n^{(1)}, \dots, v_n^{(d)}, u_n) - u_n \prod_{l=1}^{d} v_n^{(l)}|$, where $F(\cdot)$ is the empirical multivariate CDF.
2. Use Monte Carlo simulation to find the critical value, $C_\alpha$, for the test at significance level $\alpha$.
3. Calculate $D_{KS}$ using the transformed data samples.
4. If $D_{KS} > C_\alpha$, reject the null hypothesis.

Distance-to-boundary test (Berrendero et al., 2006)
1. Define $D(u, \delta S) = \min_{x \in \delta S} \|x - u\|$, where $\delta S$ is the boundary of the hypercube and $\|\cdot\|$ is the Euclidean norm on $\mathbb{R}^d$.
2. Calculate $y_i = D(u_i, \delta S) / \max_j D(u_j, \delta S)$ for $i = 1, \dots, n$ samples.
3. Calculate $D_{KS} = \sup_i |F(y_i) - 1 + (1 - y_i)^d|$, where $F(\cdot)$ is the empirical CDF.
4. If $D_{KS} > C_\alpha / \sqrt{n}$, where $C_\alpha$ is the KS critical value, reject the null hypothesis.

Discrepancy test (Liang et al., 2001)
1. Calculate $A_n = \sqrt{n}\,[(U_1 - M^d) + 2(U_2 - M^d)] / \sqrt{5 \xi_1}$; section C.1 provides the definitions for $M$, $\xi_1$, $U_1$, and $U_2$. Under the null hypothesis, $A_n$ has a standard normal distribution.
2. If $|A_n| > z_{\alpha/2}$, reject the null hypothesis, where $z_{\alpha/2}$ is the critical value of a standard normal at significance level $\alpha$.

Ripley statistics test (Ripley, 2005; Lang & Marcon, 2013)
1. Compute the distance from each rescaled point to its nearest neighbor, $\vec{r} = \{r_1, \dots, r_s\}$.
2. For each $r_k$, calculate Ripley's $K$-function statistic $\hat{K}(r_k) = \sum_{u_i \ne u_j} I\{d(u_i, u_j) \le r_k\}$.
3. Let $\hat{K} = (\hat{K}(r_1) - B(r_1), \dots, \hat{K}(r_s) - B(r_s))$, where $B(r_k) = -\frac{8}{3} r_k^3 + \frac{1}{2} r_k^4$.
4. Calculate $T^2 = \|\Sigma^{-\frac{1}{2}} (\hat{K} - \pi \vec{r}^2)\|$. Under the null hypothesis, $T^2$ has a chi-square distribution with $d$ degrees of freedom. Section C.2 provides the definition for $\Sigma$.
5. If $T^2 > \chi_d^2(\alpha)$, where $\chi_d^2(\alpha)$ is the chi-square critical value, reject the null hypothesis.

Minimal spanning tree (MST) test (Smith & Jain, 1984; Jain et al., 2002)
1. Draw $m$ multivariate uniform sample points.
2. Calculate the number of degree pairs, $C$, in the MST that share a common node.
3. If $d_i$ is the degree of the $i$th node, $C$ is defined by $C = \frac{1}{2} \sum_{i=1}^{N} d_i (d_i - 1)$, with $N = m + n$.
4. Calculate $T$, the number of edges linking a point from the generated data to a uniform sample point.
5. Under the null hypothesis, calculate $E[T] = \frac{2mn}{N} + 1$ and $Var[T \mid C] = \frac{2mn}{N(N-1)} \left( \frac{2mn - N}{N} + \frac{C - N + 2}{(N-2)(N-3)} [N(N-1) - 4mn + 2] \right)$.
6. Calculate $D = (T - E[T]) / \sqrt{Var[T \mid C]}$.
7. If $D < z_\alpha$, reject the null hypothesis, where $z_\alpha$ is the $\alpha$-quantile of the standard normal distribution.

Note: $d$ is the dimension of the data, $n$ is the number of data samples, and $α$ is the significance level.

The data transformations require the selection of an ordering of the mark dimensions; the uniformity tests can be applied to one particular ordering or can be modified to allow for assessment across multiple permutations of orderings. In such cases, the test procedures should be adjusted for multiple comparisons (Miller, 1977).
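As a concrete example of the simplest of these tests, the Python sketch below implements the Pearson chi-square recipe from Table 2 for samples on the unit square. The $4 \times 4$ partition and the hard-coded critical value $\chi^2_{15}(0.95) \approx 24.996$ are choices made for this example, not prescriptions from the tests above.

```python
import numpy as np

rng = np.random.default_rng(2)

def pearson_uniformity_test(pts, bins=4, crit=24.996):
    """Pearson chi-square test for uniformity on the unit square.

    Partitions [0,1]^2 into bins x bins equal cells (the subregions R_j),
    compares observed counts with the expected count n / bins^2, and
    rejects when X^2 exceeds the supplied chi-square critical value.
    The default 24.996 is chi^2_{15}(0.95) for a 4 x 4 partition.
    """
    n = len(pts)
    counts, _, _ = np.histogram2d(pts[:, 0], pts[:, 1],
                                  bins=bins, range=[[0, 1], [0, 1]])
    expected = n / bins ** 2
    x2 = ((counts - expected) ** 2 / expected).sum()
    return x2, x2 > crit

# Uniform samples should pass; samples clustered near the origin should fail.
good = rng.uniform(size=(2000, 2))
bad = rng.uniform(size=(2000, 2)) ** 3        # pushed toward 0 in both axes
print(pearson_uniformity_test(good))
print(pearson_uniformity_test(bad))
```

The same counting scheme extends directly to $[0,1]^{d+1}$ by replacing `histogram2d` with `histogramdd`, at the cost of rapidly shrinking per-cell counts as $d$ grows.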

4  Simulation Study

In this section, we demonstrate an application of the IRCM and MDCI transformations, along with the multiple uniformity tests described in Table 2, to assess their ability to measure goodness-of-fit in simulated data. We first describe how the simulated data are generated and then examine the transformations and goodness-of-fit results.

4.1  Simulated Data

We generate simulated spiking data using a marked point process intensity model consisting of two connected neurons encoding a simulated position variable, $x_t$. $x_t$ is modeled as the first-order autoregressive process AR(1),
$x_t = \alpha x_{t-1} + \varepsilon_t,$
(4.1)
where $\alpha$ is set at 0.98 and $\varepsilon_t$ is a zero-mean white noise process with a standard deviation of 0.3. The neurons' spiking activity depends on $x_t$ and on previous spiking; each neuron has a refractory period, and neuron 2 has an excitatory influence on neuron 1. We generate the simulated spike data in discrete time using a step size of 1 millisecond, based on the following joint mark intensity function,
$\lambda(t, m) = [\lambda_{1,x}(x_t) + \lambda_{1,E}(t)]\, \lambda_{1,H}(t)\, \phi(m; \omega_{1,m}(t), \sigma_{1,m}^2) + \lambda_{2,x}(x_t)\, \lambda_{2,H}(t)\, \phi(m; \omega_{2,m}(t), \sigma_{2,m}^2),$
(4.2)
where $\phi(x; w, \sigma^2)$ is the PDF of a normal distribution with mean $w$ and variance $\sigma^2$ evaluated at $x$, used to represent the variability of the spike waveform marks. In equation 4.2,
$\lambda_{j,x}(x_t) = \exp\!\left(a_j - \frac{(x_t - \mu_{j,x})^2}{2\sigma_{j,x}^2}\right), \quad j = 1, 2,$
(4.3)
represents the receptive fields of neurons 1 and 2, where $\mu_{j,x}$ and $\sigma_{j,x}^2$ define the field center and width, and $a_j$ defines the peak firing rate. The excitatory influence of neuron 2 on neuron 1 is defined by
$\lambda_{1,E}(t) = \sum_{i=1}^{N(t^-)} \exp\!\left(a_3 - \frac{(t - s_i - r)^2}{2\sigma_1^2}\right) I\{s_i \in S_2\},$
(4.4)
where $S_2$ is the set containing all the spike times of neuron 2 and $r$ is the time lag of the peak effect of each neuron 2 spike on neuron 1. The variable $a_3$ defines the peak excitatory influence from neuron 2 on the firing rate of neuron 1. The history-dependent terms for each neuron are defined by
$\lambda_{j,H}(t) = \sum_{i=1}^{N(t^-)} \left[1 - \exp\!\left(-\frac{(t - s_i)^2}{2\sigma_2^2}\right)\right] I\{s_i \in S_j\}, \quad j = 1, 2,$
(4.5)
where $N(t^-)$ is the total number of spikes up to, but not including, time $t$, and $S_j$ is the set containing the spike times of neuron $j$.
The mark process for each neuron is a scalar random variable with distribution
$m \sim N\!\left(\omega_{j,m}(t), \sigma_{j,m}^2\right), \quad j = 1, 2.$
(4.6)
The marks are normally distributed with a time-dependent mean and a known variance $\sigma_{j,m}^2$, $j = 1, 2$. The time-dependent mean for each neuron is defined by
$\omega_{j,m}(t) = \mu_{j,m} + \rho \frac{t}{T}, \quad j = 1, 2,$
(4.7)
where $\mu_{j,m}$ is the time-independent component of the mean and $T$ is the total time interval for the experiment. $\rho$ defines how rapidly the mean of the mark distribution changes as a function of time. Such a time-dependent drift in the mark could reflect, for example, changes in the spike waveform amplitude of each neuron due to electrode drift. Table 3 shows the numerical values of the model's free parameters. We note that these parameters are assumed to be known in both the true and misspecified models. Here, we focus on assessing lack-of-fit due to model misspecification rather than parameter estimation error.
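The position process of equation 4.1 and the receptive fields of equation 4.3 can be sketched directly. The Python example below uses the peak rates and field variances from Table 3, but the field centers are placeholder values chosen for illustration, since Table 3 does not list them in this copy.

```python
import numpy as np

rng = np.random.default_rng(3)

# AR(1) latent position (equation 4.1): x_t = alpha * x_{t-1} + eps_t,
# with alpha = 0.98, sd(eps) = 0.3, and 1 ms time steps.
alpha, sd_eps, n_steps = 0.98, 0.3, 200_000
eps = sd_eps * rng.standard_normal(n_steps)
x = np.zeros(n_steps)
for t in range(1, n_steps):
    x[t] = alpha * x[t - 1] + eps[t]

# Gaussian receptive fields (equation 4.3). Peak rates and variances
# follow Table 3; the centers mu_x are assumed values, not from the table.
a = np.log([0.18, 0.18])
mu_x = np.array([-1.0, 1.0])          # placeholder field centers
sig2_x = np.array([0.5, 0.5])

def lam_x(xt, j):
    # lambda_{j,x}(x_t) = exp(a_j - (x_t - mu_{j,x})^2 / (2 sigma_{j,x}^2))
    return np.exp(a[j] - (xt - mu_x[j]) ** 2 / (2.0 * sig2_x[j]))

rates = np.stack([lam_x(x, 0), lam_x(x, 1)])   # per-millisecond intensities
print(rates.shape, rates.max())
```

The per-millisecond intensities peak at $\exp(a_j) = 0.18$ when the position passes through a field center; the history and excitatory terms of equations 4.4 and 4.5 would multiply and add to these rates in a full simulation.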
Table 3:
Values for the Simulation Model Parameters.
Parameter    Value
$a_1$ Neuron 1 peak firing rate    $\log(0.18)$
$a_2$ Neuron 2 peak firing rate    $\log(0.18)$
$a_3$ Peak excitatory influence    $\log(0.3)$
$\mu_{1,x}$ Mean of receptive field model for neuron 1    $-$
$\mu_{2,x}$ Mean of receptive field model for neuron 2
$\sigma_{1,x}^2$ Variance of receptive field model for neuron 1    0.5
$\sigma_{2,x}^2$ Variance of receptive field model for neuron 2    0.5
$\mu_{1,m}$ Neuron 1 time-independent mark mean    11
$\mu_{2,m}$ Neuron 2 time-independent mark mean    12
$\rho$ Mark time-dependency drift parameter    0.8
$\sigma_{1,m}^2$ Variance of neuron 1 mark distribution    0.09
$\sigma_{2,m}^2$ Variance of neuron 2 mark distribution    0.09
$\sigma_1^2$ Excitatory term variance
$\sigma_2^2$ Inhibitory term variance    14
$r$ Lag time of the excitatory influence    10
To assess how the IRCM and MDCI transformations and the selected uniformity tests can capture goodness-of-fit, or the lack thereof, for marked point process data, we generate simulated spike data using the joint mark intensity model described in equations 4.2 to 4.7; we then compare the assessed goodness-of-fit of a set of alternative models, including the true model and a number of misspecified models, to these data. The true model is the one specified by equations 4.2 to 4.7 and the parameter values in Table 3. The first misspecified model uses the correct place and mark structure for each neuron and the interaction between them, but omits the refractory period for each neuron. Its JMIF is
$\lambda(t, m) = [\lambda_{1,x}(x_t) + \lambda_{1,E}(t)]\, \phi(m; \omega_{1,m}(t), \sigma_{1,m}^2) + \lambda_{2,x}(x_t)\, \phi(m; \omega_{2,m}(t), \sigma_{2,m}^2).$
(4.8)
The second misspecified model lacks only the excitatory influence of neuron 2 on neuron 1. Its JMIF is
$\lambda(t, m) = \lambda_{1,x}(x_t)\, \lambda_{1,H}(t)\, \phi(m; \omega_{1,m}(t), \sigma_{1,m}^2) + \lambda_{2,x}(x_t)\, \lambda_{2,H}(t)\, \phi(m; \omega_{2,m}(t), \sigma_{2,m}^2).$
(4.9)
The final misspecified model includes all components but lacks the temporal drift in the mark distribution for both neurons. Its JMIF is
$\lambda(t, m) = [\lambda_{1,x}(x_t) + \lambda_{1,E}(t)]\, \lambda_{1,H}(t)\, \phi(m; \hat{\mu}_{1,m}, \hat{\sigma}_{1,m}^2) + \lambda_{2,x}(x_t)\, \lambda_{2,H}(t)\, \phi(m; \hat{\mu}_{2,m}, \hat{\sigma}_{2,m}^2),$
(4.10)
where $\hat{\mu}_{j,m} = \mu_{j,m} + 0.5\rho$ are the means and $\hat{\sigma}_{j,m}^2$ are the variances of the mark density for each neuron, based on the best estimates of these parameters using the true model under the incorrect assumption that the means are constant in time.
Figure 4 shows an example of the simulated data, including the spike times and marks, along with the position variable generated using the true model. This example has 784 spikes: 486 from neuron 1 and 298 from neuron 2. The marks for neuron 1 range from 9.62 to 11.77 and for neuron 2 from 10.53 to 12.71. This overlap means that perfect spike sorting using this single mark is impossible. Similarly, the two neurons fire over overlapping regions of the space of $x_t$, as shown in Figure 4A.
Figure 4:

Simulated spiking by a marked point process model with the JMIF and $x_t$ defined in equations 4.1 to 4.7. There are 784 spikes in this example. (A) Simulated $x_t$ and spike locations in time $t$. Spikes from neuron 1 change from red to yellow to indicate time into the simulation; similarly, spikes from neuron 2 change from blue to green. This coloring scheme helps in visualizing the transformations in the IRCM and MDCI mappings in subsequent figures. (B) Mark values of each spike. Red and blue indicate whether a spike comes from neuron 1 or neuron 2, respectively.


4.2  Transformation Results

Figure 5 shows the mapping results for the IRCM transform of the simulated data using the true and alternative models. The colors of the dots indicate both the identity of the neuron generating the spike and the relative order of the spike within the experiment, matching those of Figure 4A. For the true model (see Figure 5A), the transformed points are shuffled in both the rescaled time and mark axes. Visually, the transformed points appear uniformly distributed over the square (we assess this quantitatively using multiple uniformity tests in the next section). Figure 5B shows the transformed data using the first misspecified model, which lacks the refractory behavior of each neuron. When this inhibitory history term is omitted, the ground intensity function is overestimated immediately after each spike, which increases the values of $u$. Since the missing inhibitory term does not affect the marks, the transformed data points for $v$ do not show clear deviation from uniformity. Figure 5C shows the mapping result for the model missing the excitatory influence from neuron 2 to neuron 1. In this misspecified model, a subset of the transformed data points is shifted toward lower values of $u$, since the intensity for neuron 1's marks is underestimated immediately after neuron 2 spikes. In addition, since the influence of the excitatory term is only on neuron 1, it is primarily the red-to-yellow rescaled points that are concentrated near the origin.
Figure 5:

Rescaling results for the IRCM algorithm using four models: the true model and three alternative models described in section 4.1. Dot color indicates the neuron identity and timing of spike before rescaling, consistent with Figure 4A. (A) Rescaling using the true mark intensity function produces apparently uniform data. (B) Rescaling using the missing refractoriness model shows clear nonuniformity for $u$. (C) Rescaling using the missing interaction model shows clustering of points from neuron 1 at the origin. (D) Rescaling using the missing mark drift model produces apparently uniform but not independent samples.


Figure 5D shows the rescaled data using the alternative model missing the drift in the mark structure. Here, there is no apparent lack of uniformity among the points, but there is a clear pattern where the yellow and green points from the end of the simulation session tend to cluster near the origin, and the red and blue points from earlier in the session tend to cluster near the opposite corner of the square. This suggests that simple tests of uniformity might be insufficient to detect this lack-of-fit based on the IRCM transformation. In this case, including tests for independence between rescaled samples may provide a more complete view of model fit to the observed data.

Figure 6 shows the rescaling results for the MDCI algorithm. Figure 6A shows that the transformed data points using the true model are distributed uniformly. In contrast to IRCM, there is little shuffling of points along the rescaled time and mark axes. In this transform, each neuron's early spikes tend to rescale to smaller values of $u_i$ and its late spikes to larger values. Therefore, each $u_i$ individually is not uniformly distributed; however, the full set of unordered rescaled times is jointly uniform.
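This distinction between marginal and joint uniformity is easy to reproduce numerically. A minimal numpy sketch (synthetic data, not the simulation above): sorting jointly uniform samples makes any fixed index non-uniform while leaving the unordered collection uniform, which is exactly the behavior described for the rescaled times.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw replicates of n jointly uniform samples and sort each replicate,
# mimicking rescaled times that largely preserve temporal order.
n, reps = 50, 2000
u_sorted = np.sort(rng.uniform(size=(reps, n)), axis=1)

# The i-th order statistic of n uniforms is Beta(i, n + 1 - i) with mean
# i / (n + 1), so a fixed index is far from marginally Uniform(0, 1)...
mean_first = u_sorted[:, 0].mean()   # close to 1 / (n + 1)
mean_last = u_sorted[:, -1].mean()   # close to n / (n + 1)

# ...while the pooled, unordered collection remains uniform.
pooled_mean = u_sorted.mean()        # close to 0.5
```

The Beta distribution of the order statistics is a standard fact; the sketch only illustrates why tests of joint, rather than index-wise, uniformity are the appropriate check for MDCI output.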
Figure 6:

Rescaling results for the MDCI algorithm using four models: the true model and three alternative models described in section 4.1. Dot color indicates neuron and timing of spike, consistent with Figure 4A. (A) Rescaling using the true mark intensity function produces apparently uniform data. (B) Rescaling using the missing refractoriness model appears like the true model in panel A. (C) Rescaling using the missing interaction model shows more density at lower values of $v$ for neuron 1. (D) Rescaling using the missing mark drift model shows a nonuniform drift.


Figure 6B shows the rescaled points for the misspecified model lacking refractoriness. Visually, there is no clear evidence of lack of uniformity among the samples, suggesting that tests based on this transform may lack statistical power to reject this model. When the excitatory influence of neuron 2 on neuron 1 is omitted from the model (see Figure 6C), a subtle deviation from uniformity is observed in the resulting transformed data: the fewer spikes from neuron 2 (blue-to-green points) occupy as much area as the more numerous spikes from neuron 1 (red-to-yellow points), suggesting a lack of uniformity in $v$. Figure 6D shows the rescaled points for the model lacking the drift in the marks. This leads to an apparent drift in the rescaled points, with earlier spikes producing larger values of $v$ and later spikes producing smaller values of $v$. Unlike the IRCM transformation, the lack of uniformity is visually clear for this misspecified model.

Figures 5 and 6 suggest that different forms of model misspecification may be better identified using different transformations; the missing refractoriness model shows clear lack-of-fit based on the IRCM but not the MDCI transformation, while the missing mark drift model shows more apparent lack-of-fit through the MDCI transformation. It remains to be seen whether this apparent lack-of-fit is captured quantitatively using each of the uniformity tests described previously; we explore this in the following section.

4.3  Uniformity Test Results

Tables 4 and 5 provide the results of the uniformity tests described in Table 2, along with their corresponding $p$-values, on the rescaled data shown in Figures 5 and 6 using the IRCM and MDCI transformations. A small $p$-value indicates strong evidence against the null hypothesis; here, we set the significance level at $\alpha=0.05$. The null hypothesis is that the sample data are distributed uniformly in a unit square; this hypothesis would be true if the original marked point process data were generated from the joint mark intensity model used for the transformation.
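The Pearson variant of these tests is simple enough to sketch. The following is a minimal implementation (our own, not the paper's toolbox) that bins 2-D samples on an $M\times N$ grid and compares observed with expected counts; the critical value 15.51 for $M=N=3$ is the $\chi^2_8$ threshold quoted in Table 4, and the clustered test data are hypothetical.

```python
import numpy as np

def pearson_chi2_uniformity(points, M=3, N=3):
    """Pearson chi-square statistic for uniformity of 2-D points in the
    unit square, using an M x N grid; under uniformity the statistic is
    approximately chi-square with M * N - 1 degrees of freedom."""
    pts = np.asarray(points)
    counts, _, _ = np.histogram2d(pts[:, 0], pts[:, 1],
                                  bins=[M, N], range=[[0, 1], [0, 1]])
    expected = len(pts) / (M * N)
    return ((counts - expected) ** 2 / expected).sum()

rng = np.random.default_rng(1)
uniform_pts = rng.uniform(size=(784, 2))         # stands in for rescaled (u, v)
clustered_pts = rng.uniform(size=(784, 2)) ** 3  # mass pushed toward the origin

CHI2_8_CRIT = 15.51  # chi^2_8 critical value at alpha = 0.05, as in Table 4
stat_ok = pearson_chi2_uniformity(uniform_pts)
stat_bad = pearson_chi2_uniformity(clustered_pts)
```

Truly uniform samples produce a statistic near its 8-degree-of-freedom expectation, while samples concentrated near the origin, like the misspecified-model output in Figure 5C, produce a statistic far above the critical value.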

Table 4:
Different Uniformity Test Statistics and Corresponding $p$-Values Using the IRCM Algorithm.
Uniformity Test (critical value) | True Model | Missing Refractoriness Model | Missing Neural Interaction Model | Constant Mark Model
Pearson $\chi^2$, $M=N=3$ ($\chi^2_{8}(\alpha)=15.51$) | 3.109 (0.9273) | 746.55* ($\sim$0) | 396.30* ($\sim$0) | 4.7398 (0.785)
Pearson $\chi^2$, $M=N=4$ ($\chi^2_{15}(\alpha)=24.99$) | 10.571 (0.7823) | 838.08* ($\sim$0) | 707.02* ($\sim$0) | 15.346 (0.4267)
Pearson $\chi^2$, $M=N=5$ ($\chi^2_{24}(\alpha)=36.41$) | 23.461 (0.4927) | 920.33* ($\sim$0) | 1029.4* ($\sim$0) | 20.910 (0.644)
Multivariate Kolmogorov-Smirnov (KS), $n=784$ ($\hat{C}_{\alpha}/n=0.053$) | 0.0314 (0.462) | 0.4316* (7.1e-05) | 0.2291* (3.1e-05) | 0.0314 (0.462)
Distance-to-boundary, $n=784$ ($C_{\alpha}/n=0.048$) | 0.0420 (0.251) | 0.1359* (3.5e-04) | 0.2252* (2.1e-05) | 0.0563* (0.013)
Discrepancy ($z_{1-\alpha/2}=\pm 1.6449$) | 0.5435 (0.3442) | 4.1766* (6.5e-05) | $-$3.312* (0.0017) | 0.7858 (0.2930)
Ripley statistics, $d=2$ ($\chi^2_{d}(\alpha)=5.9915$) | 4.3124 (0.1158) | 31.084* (1.77e-07) | 123.54* (1.4e-27) | 4.731 (0.09387)
Minimal spanning tree (MST) ($z_{1-\alpha}=\pm 1.64$) | 0.9367 (0.2573) | $-$6.770* (4.4e-11) | $-$4.60* (9.9e-06) | $-$0.86 (0.2756)

Note: Entries are reported as test statistic ($p$-value); asterisks mark cases where the test identified lack-of-fit at a significance level of $\alpha=0.05$.

Table 5:
Different Uniformity Test Statistics and Corresponding $p$-Values Using the MDCI Algorithm.
Uniformity Test (critical value) | True Model | Missing Refractoriness Model | Missing Neural Interaction Model | Constant Mark Model
Pearson $\chi^2$, $M=N=3$ ($\chi^2_{8}(\alpha)=15.51$) | 1.997 (0.981) | 2.788 (0.946) | 54.33* (5.9e-09) | 94.510* ($\sim$0)
Pearson $\chi^2$, $M=N=4$ ($\chi^2_{15}(\alpha)=24.99$) | 10.22 (0.805) | 10.693 (0.774) | 77.87* (1.7e-10) | 117.79* ($\sim$0)
Pearson $\chi^2$, $M=N=5$ ($\chi^2_{24}(\alpha)=36.41$) | 21.202 (0.626) | 18.423 (0.782) | 82.900* (2.08e-08) | 139.15* ($\sim$0)
Multivariate Kolmogorov-Smirnov (KS), $n=784$ ($\hat{C}_{\alpha}/n=0.053$) | 0.038 (0.307) | 0.0152 (0.834) | 0.120* (9.1e-04) | 0.079* (0.007)
Distance-to-boundary, $n=784$ ($C_{\alpha}/n=0.048$) | 0.0162 (0.814) | 0.023 (0.661) | 0.035 (0.368) | 0.028 (0.524)
Discrepancy ($z_{1-\alpha/2}=\pm 1.6449$) | $-$0.209 (0.390) | $-$0.169 (0.393) | 0.245 (0.387) | 0.037 (0.398)
Ripley statistics, $d=2$ ($\chi^2_{d}(\alpha)=5.9915$) | 1.862 (0.394) | 2.376 (0.304) | 3.920 (0.140) | 4.903 (0.086)
Minimal spanning tree (MST) ($z_{1-\alpha}=\pm 1.64$) | 0.128 (0.395) | $-$0.150 (0.394) | $-$1.341 (0.162) | $-$2.561* (0.0150)

Note: Entries are reported as test statistic ($p$-value); asterisks mark cases where the test identified lack-of-fit at a significance level of $\alpha=0.05$.

The results in Tables 4 and 5 suggest that no single combination of transform and uniformity test will identify all forms of model lack-of-fit. For this simulation, the IRCM transformation makes it simple to identify lack-of-fit due to an incorrect history-dependence structure, whether missing refractoriness or missing neural interactions, using any of the uniformity tests. However, it remains difficult to detect lack-of-fit due to the missing mark dynamics; while the distance-to-boundary test detects the lack-of-fit at the $\alpha=0.05$ significance level, this result would not hold up to correction for multiple tests. The MDCI transformation is not able to detect lack-of-fit in the missing refractoriness model using any of the uniformity tests, but both the Pearson and multivariate KS tests detect lack-of-fit due to missing neural interactions and missing mark dynamics at very small significance levels.
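The multiple-testing point can be made concrete with a Bonferroni adjustment (our choice of correction; the letter does not prescribe one). Counting the eight test entries reported per transformation:

```python
# Eight uniformity tests are reported per transformation in Table 4
# (three chi-square grid resolutions plus five other tests); a Bonferroni
# correction divides the significance level accordingly.
alpha, n_tests = 0.05, 8
bonferroni_alpha = alpha / n_tests    # 0.00625

# Distance-to-boundary p-value for the constant mark model under IRCM:
p_dtb = 0.013
significant_raw = p_dtb < alpha                   # rejects at alpha = 0.05
significant_corrected = p_dtb < bonferroni_alpha  # fails after correction
```

Bonferroni is conservative; a less strict procedure such as Holm's would reach the same conclusion here, since 0.013 is the only small $p$-value for that model under IRCM.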

These results also suggest that certain uniformity tests may achieve substantially higher statistical power over others for the types of lack-of-fit often encountered in neural models. While all of the tests were able to identify the misspecified models missing history-dependent components via the IRCM transformation, the Pearson and multivariate KS tests provided much lower $p$-values for detecting the missing interaction and constant mark models' lack-of-fit under the MDCI transformation. This suggests that different combinations of transformations and uniformity and independence tests can provide different views on goodness-of-fit that can be used together to provide a more complete picture of the quality of a model.

While the IRCM transformed samples in the constant mark model do not show obvious lack of uniformity in Figure 5D, these samples do show obvious dependence through the structure in the dot colors. The rescaling theorem for this transformation guarantees that under the true model, these samples will be independent. We can therefore apply correlation tests to these samples to further assess model goodness-of-fit. To demonstrate, we used a correlation test between consecutive samples based on a Fisher transform (Meng, Rosenthal, & Rubin, 1992), defined by
$Z=\frac{0.5\,\log\!\left(\frac{1+r}{1-r}\right)}{1/\sqrt{n-3}},$
(4.11)
where $r$ is the correlation coefficient between $v_i$ and $v_{i+1}$ and $n$ is the number of samples. Under the true model, the $p$-value for this test is 0.58, suggesting no lack-of-fit related to dependence in the transformed samples; under the constant mark model, the $p$-value was 2.1e-08, suggesting lack-of-fit in the model leading to dependence of the samples. This suggests that uniformity and independence tests can provide complementary tools to identify model misspecification in IRCM transformed samples.
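A sketch of this statistic, assuming $n$ in equation 4.11 is the total sample count (our reading) and using hypothetical rescaled samples: independent uniforms give a small $|Z|$, while samples with a residual drift give a large one.

```python
import math
import numpy as np

def fisher_z(v):
    """Fisher-transformed lag-1 correlation statistic (equation 4.11);
    approximately standard normal when consecutive samples are independent."""
    v = np.asarray(v, dtype=float)
    r = np.corrcoef(v[:-1], v[1:])[0, 1]   # correlation of (v_i, v_{i+1})
    n = len(v)                             # number of samples
    return 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - 3)

rng = np.random.default_rng(2)
independent = rng.uniform(size=784)                           # well-fit model
drifting = np.linspace(0, 1, 784) + rng.normal(0, 0.05, 784)  # residual drift

z_ok = fisher_z(independent)   # small |Z|: no evidence of dependence
z_bad = fisher_z(drifting)     # large |Z|: strong serial correlation
```

A two-sided $p$-value follows from comparing $|Z|$ with the standard normal tail, as in the values quoted above.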

5  Discussion

A fundamental component of any statistical model is the ability to evaluate the quality of its fit to observed data. While the marked point process framework has the potential to provide holistic models of the coding properties of a neural population while avoiding a computationally expensive spike-sorting procedure, methods for assessing their goodness-of-fit have been lacking until recently. Our preceding work to extend the time-rescaling theorem (Brown et al., 2002) to marked point process neural models has provided a preliminary approach to address this problem (Tao et al., 2018), but further work was necessary to make the approach computationally efficient in higher dimensions, enable the use of more statistically powerful test methods, and understand which tests are most useful for capturing different aspects of model misspecification.

In this letter, we have proposed two new transformations, IRCM and MDCI, that combine rescaling in both the time and mark spaces to produce samples that are uniformly distributed in a hypercube for the correct marked point process model. This removes one of the most troublesome issues with our prior method: the time-rescaling solution produced uniform samples in a random space, which made performing multiple uniformity tests computationally challenging. The integrals and transformations derived as part of the IRCM and MDCI algorithms can be solved analytically for some simple classes of the joint mark intensity function, $\lambda(t,\vec{m})$. For all of the analyses in this letter, we have used simple numerical techniques to solve the integrals and implement these transformations. Numerical solutions can be accurate and computationally tractable when the dimension of the mark process is low; however, the computational cost of the transformations grows as the dimension of the mark grows. Different approaches can be taken to reduce the computational cost of the integrals in IRCM and MDCI. For instance, the toolbox we developed (Yousefi et al., 2020) adjusts the bin width of its Riemann-sum integration to control the algorithms' computational cost, trading accuracy for speed. While the focus of this letter is not on minimizing the computational cost of these methods, we note that judicious selection or approximation of the joint mark intensity function can lead to closed-form solutions or approximations, which can substantially reduce the computational costs of these algorithms in higher dimensions.
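The accuracy-for-cost trade can be seen with a left Riemann sum over a hypothetical ground intensity (the decaying-rate form and all constants below are illustrative, not the toolbox's defaults):

```python
import math

def riemann_integral(f, a, b, n_bins):
    """Left Riemann sum of f over [a, b] with n_bins equal-width bins."""
    h = (b - a) / n_bins
    return h * sum(f(a + i * h) for i in range(n_bins))

# A hypothetical ground intensity with an exponentially decaying component;
# its integral over [0, 10] is known in closed form.
lam = lambda t: 5.0 + 3.0 * math.exp(-t / 4.0)
exact = 5.0 * 10.0 + 12.0 * (1.0 - math.exp(-2.5))

coarse = riemann_integral(lam, 0.0, 10.0, 50)    # cheap, less accurate
fine = riemann_integral(lam, 0.0, 10.0, 5000)    # 100x the work
err_coarse = abs(coarse - exact)
err_fine = abs(fine - exact)
```

Widening the bins by a factor of 100 cuts the number of intensity evaluations by the same factor while the first-order error grows proportionally, which is the trade-off the toolbox exposes.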

While both the IRCM and MDCI transformations produce samples that are uniformly distributed in a hypercube for the true model, each transformation can capture different attributes of the quality of the model fit to the observed data. The IRCM rescales the interspike intervals between all observed spikes, regardless of their waveforms, and then rescales the marks in a time-dependent manner. For correct models, this causes mixing between the spike marks from different neurons. This transformation is likely to be particularly sensitive to misspecification of interactions between neurons as in our simulation example. The MDCI transformation rescales the spike waveform features regardless of when they occur and then rescales the spike times in a manner that depends on their waveforms. This transformation tends to keep spikes from a single neuron nearby and is likely to be sensitive to misspecification of the coding properties of individual neurons. The fact that the IRCM makes the rescaled samples independent allows us to use correlation tests as further goodness-of-fit measures. The fact that the MDCI keeps marks from individual neurons nearby allows us to identify regions of nonuniformity in the hypercube to determine which waveforms have spiking that is poorly fit by the model. Together, these transformations provide complementary approaches for model assessment and refinement.

In addition to having multiple, complementary transformations for the data, we have multiple tests for uniformity and dependence with which to assess the transformed samples. Here, we explored six well-established uniformity tests to examine how different forms of model misspecification could be captured using combinations of these transforms and tests. As expected, the true model did not lead to a significant lack of uniformity in either transformation based on any of the tests we explored. Similarly, for the true model, our correlation test did not detect dependence in the IRCM transformed samples. For the misspecified models, different combinations of transformations and uniformity tests were able to identify different sources of lack-of-fit. The missing refractoriness and missing neural interaction models were easily identified as poorly fitting under the IRCM transform using all of our tests, but the constant mark model could not be identified by any of the tests using this transform. The constant mark model was identified as poorly fitting under the MDCI using the Pearson and multivariate KS tests but not the other uniformity tests.

Across these simulations, the Pearson $\chi^2$, multivariate KS, and MST tests proved statistically more powerful in capturing the particular forms of model misspecification that we examined. However, these simulations were limited to a simple two-neuron population model and a one-dimensional mark. While more systematic exploration of uniformity tests is necessary to know which combinations of transforms and tests are most useful for determining different aspects of model goodness-of-fit, these results suggest that no one combination is likely to work in all cases. Relatedly, goodness-of-fit for marked point process models should not be limited to rescaling methods; deviance analysis and point process residuals can provide additional, complementary goodness-of-fit measures. A toolbox that includes multiple approaches, including different rescaling transformations and tests, provides substantially more statistical power than any one approach on its own.

Ultimately, insight into which goodness-of-fit methods are most useful for these clusterless coding models will require extensive analysis of real neural population spiking data. Based on the many advantages of the clusterless modeling approach (the reduction of bias in receptive field estimation (Ventura, 2009), the ability to use spikes that cannot be sorted with confidence (Deng et al., 2015), and the ability to fit models in real time during recording sessions) and the experimental trend toward recording larger populations and closed-loop experiments, we anticipate that clusterless modeling approaches and methods to assess their quality will become increasingly important. To enable experimentalists to apply these algorithms in their data analysis, we have made the Matlab code for these transformations, along with the uniformity tests explored here, available through our Github repository (Yousefi et al., 2020).

Appendix A: Rescaling Theorem Proofs

In this letter, we introduced the IRCM and MDCI algorithms. In this appendix, we present the theoretical proofs for these algorithms.

A.1  Interval-Rescaling Conditional Mark Distribution

In this section, we present a set of new transformations and their properties for mapping observed marked point process data to uniformly distributed samples in a hypercube. We use different methodologies to prove the properties of these transformations: we either use the change-of-variables theorem (Jeffreys, Jeffreys, & Swirles, 1999) or derive the distribution of the observed joint mark and spike events under the transformation. We assume that we observe a marked point process with marks $\vec{m}_i\in M$, $i=1,\dots,N(T)$, associated with spike times $0=s_0<s_1<\cdots<s_{N(T)}\le T$ and with a joint mark intensity function $\lambda(t,\vec{m})=\lambda(t,\vec{m}\,|\,H_t)$. The joint probability of observing $N(T)$ events over the period $t\in[0,T]$ is defined by
$P\left(\{s_i,\vec{m}_i\}_{i=1,\dots,n},\,N(T)=n\right)=\prod_{i=1}^{n}\lambda(s_i,\vec{m}_i)\exp\!\left(-\int_0^T\!\!\int_M \lambda(t,\vec{m})\,d\vec{m}\,dt\right)=\exp\!\left(-\int_{s_n}^T\!\!\int_M \lambda(t,\vec{m})\,d\vec{m}\,dt\right)\prod_{i=1}^{n}\lambda(s_i,\vec{m}_i)\exp\!\left(-\int_{s_{i-1}}^{s_i}\!\!\int_M \lambda(t,\vec{m})\,d\vec{m}\,dt\right),$
(A.1)
and we show that the following transformation takes $\{s_i,\vec{m}_i\}_{i=1,\dots,n}$ to a set of new data points $\{u_i,\vec{v}_i\}_{i=1,\dots,n}$, which are i.i.d. samples from a uniform distribution on $[0,1]^{1+d}$, where $d$ is the dimension of the mark.
Theorem 1.
Define the ground intensity by
$\Lambda(t)=\int_M \lambda(t,\vec{m})\,d\vec{m},$
(A.2)
and the conditional intensity of mark given the event time by
$f(\vec{m}\,|\,t)=f\!\left(m^{(1)},m^{(2)},\dots,m^{(d)}\,|\,t\right)=\lambda(t,\vec{m})/\Lambda(t),$
(A.3)
where $m^{(l)}$ is the $l$th element of the vector $\vec{m}$. The conditional density of the mark can be factored as
$f\!\left(m^{(1)},m^{(2)},\dots,m^{(d)}\,|\,t\right)=f\!\left(m^{(d)}\,|\,m^{(d-1)},\dots,m^{(1)},t\right)\cdots f\!\left(m^{(3)}\,|\,m^{(2)},m^{(1)},t\right)f\!\left(m^{(2)}\,|\,m^{(1)},t\right)f\!\left(m^{(1)}\,|\,t\right),$
(A.4)
where
$f\!\left(m^{(1)}\,|\,t\right)=\int\cdots\int f(\vec{m}\,|\,t)\,dm^{(2)}\cdots dm^{(d)},$
(A.5a)
$f\!\left(m^{(2)}\,|\,m^{(1)},t\right)=f\!\left(m^{(2)},m^{(1)}\,|\,t\right)/f\!\left(m^{(1)}\,|\,t\right)=\int\cdots\int f(\vec{m}\,|\,t)\,dm^{(3)}\cdots dm^{(d)}\Big/ f\!\left(m^{(1)}\,|\,t\right),$
(A.5b)
$f\!\left(m^{(l)}\,|\,m^{(l-1)},\dots,m^{(1)},t\right)=\frac{f\!\left(m^{(l)},\dots,m^{(1)}\,|\,t\right)}{f\!\left(m^{(l-1)},\dots,m^{(1)}\,|\,t\right)},\quad l>1.$
(A.5c)
If $\lambda(t,\vec{m})$ is the true joint mark intensity, then the $(d+1)$-dimensional random vectors with elements $u_i, v_i^{(1)}, v_i^{(2)},\dots,v_i^{(d)}$, $i=1,\dots,n$,
$u_i=1-\exp\!\left(-\int_{s_{i-1}}^{s_i}\Lambda(t)\,dt\right),$
(A.6a)
$v_i^{(l)}=\int_{-\infty}^{m_i^{(l)}} f\!\left(m^{(l)}\,|\,m_i^{(j)};\,j<l,\,s_i\right)dm^{(l)},\quad l=1,\dots,d,$
(A.6b)
are i.i.d. and uniformly distributed in the $[0,1]^{d+1}$ hypercube.
Proof.
We can redefine equation A.1 by
$P\left(\{s_i,\vec{m}_i\}_{i=1,\dots,n},\,N(T)=n\right)=\prod_{i=1}^{n}\frac{\lambda(s_i,\vec{m}_i)}{\Lambda(s_i)}\,\Lambda(s_i)\exp\!\left(-\int_{s_{i-1}}^{s_i}\Lambda(t)\,dt\right)\exp\!\left(-\int_{s_n}^T\Lambda(t)\,dt\right)=\prod_{i=1}^{n}f(\vec{m}_i\,|\,s_i)\,\Lambda(s_i)\exp\!\left(-\int_{s_{i-1}}^{s_i}\Lambda(t)\,dt\right)\exp\!\left(-\int_{s_n}^T\Lambda(t)\,dt\right).$
(A.7)
Here, the first $n$ terms represent the probability of observing the $n$ events over the time period $[0,s_n]$ with marks $\{\vec{m}_i\}_{i=1,\dots,n}$. The last term corresponds to the probability of not observing any event in the period $(s_n,T]$; that is, $N(T)-N(s_n)=0$.
We want to build the joint probability distribution of $\{u_1,v_1^{(1)},\dots,v_1^{(d)},\dots,u_n,v_n^{(1)},\dots,v_n^{(d)}\}$ given $\{(s_1,\vec{m}_1),\dots,(s_n,\vec{m}_n),N(T)=n\}$. We first focus on the time period $[0,s_n]$, where we observe $n$ events, and use the change-of-variables theorem (Jeffreys et al., 1999) to build the joint probability distribution of the full events over this period. To derive the joint probability distribution, we need to calculate the Jacobian matrix, which is defined by
$J=\begin{pmatrix}
\dfrac{\partial u_1}{\partial s_1}&\cdots&\dfrac{\partial u_1}{\partial s_n}&\dfrac{\partial u_1}{\partial m_1^{(1)}}&\cdots&\dfrac{\partial u_1}{\partial m_1^{(d)}}&\cdots&\dfrac{\partial u_1}{\partial m_n^{(1)}}&\cdots&\dfrac{\partial u_1}{\partial m_n^{(d)}}\\
\vdots& &\vdots&\vdots& &\vdots& &\vdots& &\vdots\\
\dfrac{\partial u_n}{\partial s_1}&\cdots&\dfrac{\partial u_n}{\partial s_n}&\dfrac{\partial u_n}{\partial m_1^{(1)}}&\cdots&\dfrac{\partial u_n}{\partial m_1^{(d)}}&\cdots&\dfrac{\partial u_n}{\partial m_n^{(1)}}&\cdots&\dfrac{\partial u_n}{\partial m_n^{(d)}}\\
\dfrac{\partial v_1^{(1)}}{\partial s_1}&\cdots&\dfrac{\partial v_1^{(1)}}{\partial s_n}&\dfrac{\partial v_1^{(1)}}{\partial m_1^{(1)}}&\cdots&\dfrac{\partial v_1^{(1)}}{\partial m_1^{(d)}}&\cdots&\dfrac{\partial v_1^{(1)}}{\partial m_n^{(1)}}&\cdots&\dfrac{\partial v_1^{(1)}}{\partial m_n^{(d)}}\\
\vdots& &\vdots&\vdots& &\vdots& &\vdots& &\vdots\\
\dfrac{\partial v_n^{(d)}}{\partial s_1}&\cdots&\dfrac{\partial v_n^{(d)}}{\partial s_n}&\dfrac{\partial v_n^{(d)}}{\partial m_1^{(1)}}&\cdots&\dfrac{\partial v_n^{(d)}}{\partial m_1^{(d)}}&\cdots&\dfrac{\partial v_n^{(d)}}{\partial m_n^{(1)}}&\cdots&\dfrac{\partial v_n^{(d)}}{\partial m_n^{(d)}}
\end{pmatrix},$
(A.8)
where
$\frac{\partial u_p}{\partial s_q}=0\quad\forall\, q>p,$
(A.9a)
$\frac{\partial u_p}{\partial m_q^{(l)}}=0\quad\forall\, p,q,l,$
(A.9b)
$\frac{\partial v_p^{(l)}}{\partial m_q^{(k)}}=0\quad\text{for } q>p \text{ or } k>l,$
(A.9c)
where $p,q\in\{1,\dots,n\}$ and $l,k\in\{1,\dots,d\}$. Note that the upper-triangular elements of the matrix $J$ are equal to zero; thus, we only need the diagonal elements to calculate its determinant. The diagonal elements are
$\frac{\partial u_i}{\partial s_i}=\Lambda(s_i)\,(1-u_i),\quad i=1,\dots,n,$
(A.10a)
$\frac{\partial v_i^{(l)}}{\partial m_i^{(l)}}=f\!\left(m_i^{(l)}\,|\,m_i^{(j)};\,j<l,\,s_i\right),\quad l=1,\dots,d.$
(A.10b)
The Jacobian matrix determinant is equal to
$|J|=\prod_{i=1}^{n}\Lambda(s_i)\,(1-u_i)\prod_{l=1}^{d}f\!\left(m_i^{(l)}\,|\,m_i^{(j)};\,j<l,\,s_i\right).$
(A.11)
By replacing these elements in equation A.7, we get
$P\left(\{u_i,v_i^{(l)}\}_{i=1,\dots,n}\right)=|J|^{-1}\prod_{i=1}^{n}\left[\prod_{l=1}^{d}f\!\left(m_i^{(l)}\,|\,m_i^{(j)};\,j<l,\,s_i\right)\right]\Lambda(s_i)\,(1-u_i)\,\exp\!\left(-\int_{s_n}^T\Lambda(t)\,dt\right)=\exp\!\left(-\int_{s_n}^T\Lambda(t)\,dt\right).$
(A.12)
Now we consider the last component of the distribution, which implies no event from time $sn$ to $T$. We define a new random variable $z$,
$z=\int_{s_n}^{T}\Lambda(t)\,dt,$
(A.13)
which is the expected number of events in the interval $(s_n,T]$. We define the probability of not observing an event by
$u=P(\text{no extra event})=\exp(-z),$
(A.14)
where $u$ is a new random variable in the range of 0 to 1. By changing the variable from $z$ to $u$, the joint probability distribution can be written by
$P\left(\{u_i,v_i^{(l)}\}_{i=1,\dots,n;\,l=1,\dots,d},\,N(T)=n\right)=|J|^{-1}\exp(z)\times\prod_{i=1}^{n}\left[\prod_{l=1}^{d}f\!\left(m_i^{(l)}\,|\,m_i^{(j)};\,j<l,\,s_i\right)\right]\Lambda(s_i)\,(1-u_i)\exp(-z)=1.$
(A.15)
Thus, all elements, including the last, are uniformly distributed on $[0,1]^{d+1}$, given the assumption that the samples are generated using the true $\lambda(t,\vec{m})$.$\square$

Note that the last element becomes a sure event in the $u$ space. The last term can also be projected back onto the $u_i$'s; with this assumption, when it is not compensated, the transformed $u_i$'s are uniformly distributed on the scaled space $\left[0,\exp\!\left(\int_{s_n}^{T}\Lambda(t)\,dt\right)^{1/n}\right]$.

Corollary 1.

The marginalization steps over the mark dimension described in theorem 1 are valid on any arbitrary sequence of the mark space dimension.

Corollary 2.

The $u_i$ samples generated by equation A.6a, given the ground intensity defined in equation A.2, are independent, uniformly distributed over the range 0 to 1, and independent of the $v_i$ samples. The $u_i$ samples correspond to the time-rescaling theorem (Brown et al., 2002) samples over the full events' time intervals.
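As a numerical sanity check of theorem 1 in its simplest setting (our own example with hypothetical parameters): for a homogeneous marked Poisson process with constant ground intensity and time-independent Gaussian marks, the IRCM reduces to exponential interval rescaling plus the marginal mark CDF.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(3)

# Homogeneous marked Poisson process (hypothetical parameters): ground
# intensity LAM, time-independent Gaussian marks, observed on [0, T].
LAM, T = 10.0, 100.0
n = rng.poisson(LAM * T)
spike_times = np.sort(rng.uniform(0.0, T, size=n))
marks = rng.normal(11.0, 1.0, size=n)

# Equation A.6a: u_i = 1 - exp(-LAM * (s_i - s_{i-1})).
isi = np.diff(np.concatenate(([0.0], spike_times)))
u = 1.0 - np.exp(-LAM * isi)

# Equation A.6b: v_i is the conditional mark CDF, here the normal CDF.
normal_cdf = np.vectorize(lambda m: 0.5 * (1.0 + erf((m - 11.0) / sqrt(2.0))))
v = normal_cdf(marks)

# Under the true model, both coordinates should be close to Uniform(0, 1):
# compare empirical CDFs against the identity at a few probe points.
grid = np.array([0.25, 0.5, 0.75])
u_dev = max(abs((u <= g).mean() - g) for g in grid)
v_dev = max(abs((v <= g).mean() - g) for g in grid)
```

Substituting a misspecified intensity (e.g., the wrong rate) into the same code shifts the empirical CDF of $u$ away from the identity, which is what the uniformity tests of section 3 quantify.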

A.2  Mark Density Conditional Intensity

In this section, we provide a proof for algorithm 2 using theorem 2. In theorem 2, we use an established result, the Rosenblatt transformation (Rosenblatt, 1952), which maps a $d$-dimensional random vector to one uniformly distributed on the $d$-dimensional hypercube.

Rosenblatt Transformation. Let $X=(X_1,\dots,X_d)$ be a random vector with distribution function $F(x_1,\dots,x_d)$, and let $z=(z_1,\dots,z_d)=Tx=T(x_1,\dots,x_d)$, where $T$ is the transformation considered. Then $T$ is given by
$\begin{aligned} z_1&=P(X_1\le x_1)=F_1(x_1)\\ z_2&=P(X_2\le x_2\,|\,X_1=x_1)=F_2(x_2\,|\,x_1)\\ &\;\;\vdots\\ z_d&=P(X_d\le x_d\,|\,X_{d-1}=x_{d-1},\dots,X_1=x_1)=F_d(x_d\,|\,x_{d-1},\dots,x_1). \end{aligned}$
(A.16)
One can readily show that the random vector $Z=TX$ is uniformly distributed on the $d$-dimensional hypercube. For
$P(Z_i\le z_i;\,i=1,\dots,d)=\int\cdots\int_{\{Z\,|\,Z_i\le z_i\}} dF_d(x_d\,|\,x_{d-1},\dots,x_1)\cdots dF_1(x_1)=\int_0^{z_d}\cdots\int_0^{z_1}dz_1\cdots dz_d=\prod_{i=1}^{d}z_i$
(A.17)
when $0\le z_i\le 1$, $i=1,\dots,d$. Hence, $Z_1,\dots,Z_d$ are uniformly and independently distributed on $[0,1]$.
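A concrete instance of the transformation, with our own choice of a correlated bivariate normal (conditional distribution $X_2\,|\,X_1=x_1\sim N(\rho x_1,\,1-\rho^2)$), shows the output is an independent uniform pair:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(4)

# Correlated bivariate normal: X2 | X1 = x1 ~ N(rho * x1, 1 - rho^2).
rho, n = 0.8, 5000
x1 = rng.normal(size=n)
x2 = rho * x1 + sqrt(1.0 - rho**2) * rng.normal(size=n)

std_normal_cdf = np.vectorize(lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0))))

# Equation A.16: marginal CDF for the first coordinate, conditional CDF
# for the second coordinate given the first.
z1 = std_normal_cdf(x1)
z2 = std_normal_cdf((x2 - rho * x1) / sqrt(1.0 - rho**2))

# The transformed pair should behave as independent Uniform(0, 1) samples.
corr = np.corrcoef(z1, z2)[0, 1]
mean1, mean2 = z1.mean(), z2.mean()
```

Applying only the marginal CDF to each coordinate would leave the correlation intact; conditioning on the earlier coordinates is what removes the dependence, which is the property theorem 2 inherits.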

We now state theorem 2, which constructs the MDCI transformation.

Theorem 2.
Let $\Gamma(\vec{m})=\int_0^T \lambda(t,\vec{m})\,dt$ be the mark density, and let
$f(\vec{m})=\frac{\Gamma(\vec{m})}{\int_M \Gamma(\vec{m})\,d\vec{m}},$
(A.18a)
$f(t\,|\,\vec{m})=\frac{\lambda(t,\vec{m})}{\Gamma(\vec{m})}.$
(A.18b)
If $\lambda(t,\vec{m})$ is the true joint mark intensity, then the $(d+1)$-dimensional random vectors with elements $u_i, v_i^{(1)}, v_i^{(2)},\dots,v_i^{(d)}$, $i=1,\dots,n$,
$u_i=\int_0^{s_i} f(t\,|\,\vec{m}_i)\,dt,$
(A.19a)
$v_i^{(l)}=\int_{-\infty}^{m_i^{(l)}} f\!\left(m^{(l)}\,|\,m_i^{(l-1)},\dots,m_i^{(1)}\right)dm^{(l)},\quad l=1,\dots,d,$
(A.19b)
are jointly uniformly distributed in the $[0,1]^{d+1}$ hypercube.
Proof.
The joint probability distribution of the full event defined in equation A.1 can be rewritten by
$P\left(\{s_i,\vec{m}_i\}_{i=1}^{n},\,N(T)=n\right)=\exp\!\left(-\int_M\!\int_0^T\lambda(t,\vec{m})\,dt\,d\vec{m}\right)\left(\int_M\!\int_0^T\lambda(t,\vec{m})\,dt\,d\vec{m}\right)^{n}\prod_{i=1}^{n}f(s_i\,|\,\vec{m}_i)\,f(\vec{m}_i).$
(A.20)
To prove the theorem, we must build the joint probability distribution of $\{u_1,v_1^{(1)},\dots,v_1^{(d)},\dots,u_n,v_n^{(1)},\dots,v_n^{(d)}\}$ given $\{(s_1,\vec{m}_1),\dots,(s_n,\vec{m}_n),N(T)=n\}$. First, we define $P\left(\{s_i,\vec{m}_i\}_{i=1}^{n}\,|\,N(T)=n\right)$, which corresponds to
$P\left(\{s_i,\vec{m}_i\}_{i=1}^{n}\,|\,N(T)=n\right)=\frac{P\left(\{s_i,\vec{m}_i\}_{i=1}^{n},\,N(T)=n\right)}{P(N(T)=n)}=\frac{P\left(\{s_i,\vec{m}_i\}_{i=1}^{n},\,N(T)=n\right)}{\exp\!\left(-\int_M\Gamma(\vec{m})\,d\vec{m}\right)\left(\int_M\Gamma(\vec{m})\,d\vec{m}\right)^{n}\!\big/\,n!}=n!\prod_{i=1}^{n}f(s_i\,|\,\vec{m}_i)\,f(\vec{m}_i).$
(A.21)
In equation A.21, the denominator is the probability of observing $n$ events, independent of their temporal order. Given the history dependence of the events, the joint probability distribution of the temporally ordered events (Tao et al., 2018) is defined by
$P\left(\{s_i,\vec{m}_i\}_{i=1}^{n},\,s_{i-1}<s_i\,\big|\,N(T)=n\right)=\prod_{i=1}^{n}f(s_i\,|\,\vec{m}_i)\,f(\vec{m}_i).$
(A.22)
In equation A.22, $f(t\,|\,\vec{m})\,f(\vec{m})$ defines the joint probability distribution of the mark and spike time. Equation set A.19 is the Rosenblatt transformation (Rosenblatt, 1952) of the spike time and mark, mapping the observed events from the multivariate continuous random variable $(t,\vec{m})$ to another, $(u,\vec{v})$. Under the Rosenblatt transformation, the transformed data points are uniformly distributed in the hypercube $[0,1]^{d+1}$. As a result, the joint distribution of $(u,\vec{v})$ is defined by
$Pui,vi(l);l=1⋯di=1n|NT=n=1,$
(A.23)
which suggests that $ui,vi(l);l=1⋯d$ s are uniform samples in a hypercube of $01d+1$.$□$
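As an illustration of theorem 2, the following sketch (ours, not from the paper's code) uses a separable toy joint mark intensity, $\lambda(t,m)=c\,(1+t)\cdot 2m$ on $[0,1]\times[0,1]$, chosen so that every CDF in equation A.19 has a closed form; the rescaled times and marks should then be uniform on the unit square.

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(1)
T, c = 1.0, 400.0

# Simulate a marked Poisson process with joint mark intensity
# lambda(t, m) = c*(1 + t)*2m on [0, T] x [0, 1] (a toy model).
Lambda = c * (T + T**2 / 2)              # expected number of events
n = rng.poisson(Lambda)
q = rng.uniform(size=n)
s = np.sort(-1 + np.sqrt(1 + 3 * q))     # times: inverse CDF of density (1+t)/1.5
m = np.sqrt(rng.uniform(size=n))         # marks: inverse CDF of density 2m

# Theorem 2 transformation. The intensity is separable, so
# f(t|m) = (1+t)/(T + T^2/2) and f(m) = 2m, giving closed-form CDFs.
u = (s + s**2 / 2) / (T + T**2 / 2)
v = m**2

print(kstest(u, "uniform").pvalue, kstest(v, "uniform").pvalue)
```

Under the true model, both KS tests should fail to reject uniformity; replacing the CDFs with those of a misspecified intensity would push the rescaled samples away from uniformity.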

Appendix B: Mark-Rescaling Conditional Intensity

The MRCI algorithm starts by building the mark intensity function, $\Gamma(m)$, which is followed by deriving the conditional intensity of time given the mark, $f(t\,|\,m_i)$. This transformation corresponds to a time rescaling over the mark and the CDF of the conditional intensity over spike time. Algorithm 3 describes the steps taken to map the full spike event data to the unit square.

In the MRCI algorithm, we can use any permuted sequence over the marks, and the resulting $(v_i,u_i)$ samples still remain uniform. We provide the theoretical proof of the MRCI transformation in section B.1. Like the previous algorithms, we can use multiple uniformity tests, described in section 3, to assess the accuracy of the model fit.

In contrast to IRCM and MDCI, where we marginalized the conditional mark distribution over the mark space, marginalizing $\Gamma(m)$ to construct uniformly distributed variables is not straightforward. When the mark is a scalar variable, we can use the $u_i$ and $v_i$ samples to draw further insight into the model fit. The $u_i$ samples, when constructed using sorted marks, reflect how well the temporal properties of the full events are captured by the model.

A challenge with MRCI, when the observed mark is multidimensional, is the interpretation of full spike events in multidimensional spaces. There is no unique way to sort multidimensional marks. As a result, MRCI is a proper transformation under two circumstances: (1) when the mark, independent of its dimension, is treated as a single element, and (2) when the mark is one-dimensional.

We use the simulated data described in section 4.1 to assess the mapping result of MRCI. Figure 7 shows the mapping results for this algorithm. For the true model, we expect the observed data points to be mapped to uniformly distributed samples in the unit square; Figure 7A strongly supports this expectation. Figures 7B to 7D show the mapping results using the alternative models under the MRCI mapping. For the inhibitory independent model, the transformed data points are shifted toward higher values of $v$. This is because, when the history term is dropped, the joint mark intensity function is overestimated for the times after each spike, and thus the mark intensity function increases in the interval between pairs of events (see line 5 in algorithm 3). Similarly, for the excitatory independent model, the new data points are shifted toward lower values of $v$; by eliminating the excitatory effect of neuron 2 on neuron 1, the joint mark intensity function is underestimated for the times after neuron 2 spikes, and thus the mark intensity function decreases in the interval between pairs of events (see line 5 in algorithm 3). Figure 7D shows the transformed data points under the time-independent mark model. Like the IRCM, the MRCI algorithm is not capable of capturing the misspecification embedded in the mark distribution; this is because we integrate the mark intensity function over intervals between pairs of events, and the mark intensity function both increases and decreases over these intervals.
Figure 7:

Rescaling results for the MRCI algorithm using four models: the true model and three alternative models described in section 4.1. Dot color indicates neuron and timing of spike, consistent with Figure 1A. (A) Rescaling using the true mark intensity function produces apparently uniform data. (B) Rescaling using the missing refractoriness model shows a nonuniform drift. (C) Rescaling using the missing interaction model shows more density at lower values of $v$. (D) Rescaling using the missing mark drift model appears like the true model in panel A.


The uniformity test results for the MRCI algorithm are reported in Table 6. Given the results in the table, the Pearson chi-square and multivariate KS tests reject the null hypothesis for all alternative models. The distance-to-boundary, discrepancy, Ripley statistics, and MST tests reject the null hypothesis only for the inhibitory independent model and fail to reject it for the other misspecified models.
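For reference, the Pearson $\chi^2$ uniformity test reported in Table 6 can be implemented by binning the unit square into an $M\times N$ grid with $MN-1$ degrees of freedom. This is our sketch (the function name and example inputs are ours):

```python
import numpy as np
from scipy.stats import chi2

def pearson_uniformity_test(u, v, M=3, N=3):
    """Pearson chi-square test for uniformity of (u, v) on the unit square.

    Bins the samples into an M-by-N grid, compares observed counts with
    the equal expected count, and uses M*N - 1 degrees of freedom.
    """
    counts, _, _ = np.histogram2d(u, v, bins=[M, N], range=[[0, 1], [0, 1]])
    expected = len(u) / (M * N)
    stat = ((counts - expected) ** 2 / expected).sum()
    pval = chi2.sf(stat, df=M * N - 1)
    return stat, pval

rng = np.random.default_rng(2)
# Uniform samples should not be rejected.
stat_u, p_u = pearson_uniformity_test(rng.uniform(size=784), rng.uniform(size=784))
# A clearly nonuniform (skewed, perfectly correlated) sample should be rejected.
x = rng.uniform(size=784) ** 2
stat_b, p_b = pearson_uniformity_test(x, x)
```

Varying $M=N$ from 3 to 5, as in the table, probes nonuniformity at different spatial scales.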

Table 6:
Different Uniformity Test Statistics and Corresponding $p$-Values Using the MRCI Algorithm.

Each row reports the test statistic and $p$-value for, in order: the true model, the missing refractoriness model, the missing neural interaction model, and the constant mark model.

Pearson $\chi^2$ test, $M=N=3$, $\chi^2_{8}(\alpha)=15.51$: 8.6071 ($p=0.3765$); 279.07 ($p\approx 0$); 44.452 ($p=4.65\mathrm{e}{-7}$); 33.484 ($p=5.03\mathrm{e}{-5}$)
Pearson $\chi^2$ test, $M=N=4$, $\chi^2_{15}(\alpha)=24.99$: 15.931 ($p=0.3867$); 357.77 ($p\approx 0$); 53.061 ($p=3.7\mathrm{e}{-6}$); 45.469 ($p=6.45\mathrm{e}{-5}$)
Pearson $\chi^2$ test, $M=N=5$, $\chi^2_{24}(\alpha)=36.41$: 22.452 ($p=0.5523$); 453.36 ($p\approx 0$); 67.658 ($p=4.92\mathrm{e}{-6}$); 74.035 ($p=5.26\mathrm{e}{-7}$)
Multivariate Kolmogorov-Smirnov test, $n=784$, $C_{\hat\alpha}/\sqrt{n}=0.053$: 0.0510 ($p=0.0510$); 0.2974 ($p=7.25\mathrm{e}{-5}$); 0.0857 ($p=0.003$); 0.0820 ($p=0.005$)
Distance-to-boundary test, $n=784$, $C_{\alpha}/\sqrt{n}=0.048$: 0.0282 ($p=0.536$); 0.1843 ($p=2.29\mathrm{e}{-4}$); 0.035 ($p=0.366$); 0.0432 ($p=0.232$)
Discrepancy test, $z_{1-\alpha/2}=\pm 1.6449$: $-$0.667 ($p=0.3193$); $-$1.720 ($p=0.0408$); $-$0.893 ($p=0.2677$); $-$1.351 ($p=0.1602$)
Ripley statistics test, $d=2$, $\chi^2_{d}(\alpha)=5.9915$: 2.3306 ($p=0.3118$); 23.374 ($p=8.4\mathrm{e}{-9}$); 4.2077 ($p=0.122$); 4.5260 ($p=0.104$)
Minimal spanning tree (MST) test, $z_{1-\alpha}=\pm 1.6449$: 0.9616 ($p=0.2513$); $-$5.133 ($p=7.5\mathrm{e}{-7}$); $-$1.492 ($p=0.1311$); $-$1.111 ($p=0.2151$)

The bold numbers show cases where the test identified lack-of-fit at a significance level of $\alpha=0.05$.

The results here are consistent with the previous results in Tables 5 and 6, where the Pearson chi-square and multivariate KS tests are shown to be statistically stronger tests. The recommendation to combine multiple uniformity tests to build stronger confidence in the goodness-of-fit result holds for this transformation as well.

B.1  Mark-Rescaling Conditional Intensity Theoretical Proof

Theorem 3.
We define the ground conditional intensity over the mark space by
$\Gamma(m)=\int_0^T\lambda(t,m)\,dt$
(B.1)
and the conditional intensity of the event time given the mark by
$f(t\,|\,m)=\lambda(t,m)/\Gamma(m).$
(B.2)
Then the two new variables, $u_i$ and $v_i$,
$v_i=\begin{cases}1-\exp\left(-\int_{m_{\Theta(i-1)}}^{m_{\Theta(i)}}\Gamma(m)\,dm\right) & m_{\Theta(i)}-m_{\Theta(i-1)}\ge 0\\[4pt] 1-\exp\left(\int_{m_{\Theta(i-1)}}^{m_{\Theta(i)}}\Gamma(m)\,dm\right) & m_{\Theta(i)}-m_{\Theta(i-1)}<0,\end{cases}$
(B.2a)
$u_i=\int_0^{s_i}f(t\,|\,m_i)\,dt,$
(B.2b)
are uniformly distributed samples over $[0,1]^2$ under the correct model. Here, $\Theta(i)$ represents the corresponding permuted index for $i$; $m_{\Theta(0)}$ is equal to $-\infty$, and $m_{\Theta(n+1)}$ is equal to $+\infty$.
Proof.
We can write the joint probability distribution of the event times and marks as
$P(\{s_i,m_i\}_{i=1}^{n}, N(T)=n)=\prod_{i=1}^{n}\lambda(s_i,m_i)\exp\left(-\int_M\int_0^T\lambda(t,m)\,dt\,dm\right)=\prod_{i=1}^{n}\frac{\lambda(s_i,m_i)}{\Gamma(m_i)}\Gamma(m_i)\exp\left(-\int_M\Gamma(m)\,dm\right).$
(B.3)
We can expand the exponential term in equation B.3 using the marks of the different samples,
$\exp\left(-\int_M\Gamma(m)\,dm\right)=\frac{\exp\left(-\int_{-\infty}^{m_1}\Gamma(m)\,dm\right)}{\exp(0)}\times\frac{\exp\left(-\int_{-\infty}^{m_2}\Gamma(m)\,dm\right)}{\exp\left(-\int_{-\infty}^{m_1}\Gamma(m)\,dm\right)}\times\frac{\exp\left(-\int_{-\infty}^{m_3}\Gamma(m)\,dm\right)}{\exp\left(-\int_{-\infty}^{m_2}\Gamma(m)\,dm\right)}\times\cdots\times\frac{\exp\left(-\int_M\Gamma(m)\,dm\right)}{\exp\left(-\int_{-\infty}^{m_n}\Gamma(m)\,dm\right)},$
(B.4)
where the marks are sorted in ascending order. Note that the above equality is valid for any sequence of $m_i$ where $i$ is defined by a permutation of $[1,n]$. Thus, we have
$\exp\left(-\int_M\Gamma(m)\,dm\right)=\prod_{i=1}^{n+1}\frac{\exp\left(-\int_{-\infty}^{m_{\Theta(i)}}\Gamma(m)\,dm\right)}{\exp\left(-\int_{-\infty}^{m_{\Theta(i-1)}}\Gamma(m)\,dm\right)}=\prod_{i=1}^{n+1}(1-v_i),$
(B.5)
where $\Theta(i)$ represents the corresponding permuted index for $i$; $m_{\Theta(0)}$ is equal to $-\infty$, and $m_{\Theta(n+1)}$ is equal to $+\infty$. In equation B.5, we assume $m_{\Theta(i)}$ is larger than $m_{\Theta(i-1)}$; otherwise, we swap the numerator and denominator (see equation B.2a).
Now we can use the change-of-variables theorem to prove that $u_i$ and $v_i$ are uniformly distributed in $[0,1]^2$. The partial derivative of $v_i$ with respect to $m_{\Theta(i)}$ is
$\frac{\partial v_i}{\partial m_{\Theta(i)}}=\Gamma(m_{\Theta(i)})\exp\left(-\int_{m_{\Theta(i-1)}}^{m_{\Theta(i)}}\Gamma(m)\,dm\right)=\Gamma(m_{\Theta(i)})(1-v_i),$
(B.6)
and the partial derivative of $u_i$ with respect to $s_i$ is
$\frac{\partial u_i}{\partial s_i}=f(s_i\,|\,m_i).$
(B.7)
By substituting these in the joint probability distribution of $(s_i,m_i)$ defined in equation B.3, we get the following probability distribution,
$P(u_i,v_i,\ i=1,\dots,n;\ N(T)=n)=\prod_{i=1}^{n}f(s_i\,|\,m_i)\,\Gamma(m_{\Theta(i)})(1-v_i)\,J^{-1}(1-v_{n+1}),$
(B.8)
where $J$ is the Jacobian of the transformation, defined by
$J=\prod_{i=1}^{n}(1-v_i)\,f(s_i\,|\,m_i)\,\Gamma(m_{\Theta(i)})\,(1-v_{n+1}),$
(B.9)
where, for the last term, we take the derivative with respect to $m_{\Theta(n+1)}$, which we assume goes to $+\infty$. To calculate the determinant, we use the fact that $J$ is an upper- or lower-triangular matrix. Note that we can use row operations to change the order of $\Theta(i)$ to $i$, and this changes only the sign of the determinant in equation B.9. For the transformation defined in equation B.8, the sign of the determinant is not important, given that we need only its absolute value. The right side of equation B.8 then becomes equal to 1,
$P(u_i,v_i,\ i=1,\dots,n;\ N(T)=n)=1,$
(B.10)
which shows that $u_i$ and $v_i$ are uniformly distributed samples over the space $[0,1]^2$.$\square$
Corollary 3.

The $v_i$ samples generated by equation B.2a, given the ground conditional intensity defined in equation B.1, are uniformly distributed over the range zero to one, independently of the $u_i$ samples.

Here, we build a point process over the mark space, $m$; thus, if the marks are sorted in ascending order, then under the time-rescaling theorem (Brown et al., 2002), the $v_i$s are independent and uniformly distributed random variables, which yields corollary 3.
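Corollary 3 can be checked numerically. The following sketch (ours) uses a constant mark intensity $\Gamma(m)=\gamma$ on $[0,M]$, for which the integrals in equation B.2a reduce to $\gamma$ times the gaps between consecutive sorted marks:

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(3)
gamma, M = 50.0, 10.0   # constant mark intensity Gamma(m) = gamma on [0, M]

# Marks of one realization: a homogeneous Poisson process on the mark axis.
inter = rng.exponential(1 / gamma, size=800)
marks = np.cumsum(inter)
marks = marks[marks < M]                  # keep the marks that fall in [0, M]

# Mark rescaling (equation B.2a with sorted scalar marks):
# v_i = 1 - exp(-gamma * (m_(i) - m_(i-1))), with m_(0) = 0 here.
gaps = np.diff(np.concatenate(([0.0], marks)))
v = 1 - np.exp(-gamma * gaps)

print(len(marks), kstest(v, "uniform").pvalue)
```

Because the inter-mark gaps are exactly exponential with rate $\gamma$, the rescaled values are exactly independent Uniform(0,1) samples, and the KS test should not reject.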

Appendix C: Uniformity Tests

Here, we describe different terms used in the discrepancy and Ripley statistic tests.

C.1  Discrepancy Test

Here, we provide the definitions of the $U_1$ and $U_2$ terms used in step 4 of the discrepancy test. In this test, we calculate the following statistic after generating data samples by running the IRCM or MDCI transformations:
$A_n=\sqrt{n}\,\frac{(U_1-M^d)+2\,(U_2-M^d)}{\sqrt{5\,\xi_1}}.$
(C.1)
In this equation, for the symmetric discrepancy, $M=4/3$, $\xi_1=(9/5)^d-(16/9)^d$, and $U_1$, $U_2$ are defined by
$U_1=\frac{1}{n}\sum_{k=1}^{n}\prod_{j=1}^{d}\left(1+2z_{kj}-2z_{kj}^2\right),$
(C.1a)
$U_2=\frac{2^{d+1}}{n(n-1)}\sum_{k<l}\prod_{j=1}^{d}\left(1-\left|z_{kj}-z_{lj}\right|\right),$
(C.1b)
where $z_{kj}$ are the rescaled samples; $k$ and $j$ refer to the $k$th sample and the $j$th dimension of the data. For example, for the simulated data, $z_{k1}=u_k$ and $z_{k2}=v_k$, $k=1,\dots,n$.
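The discrepancy statistic can be implemented directly from these definitions; the following sketch is ours, with $M=4/3$ and $\xi_1=(9/5)^d-(16/9)^d$, and the exact normalization of $A_n$ is our reading of the source, so treat the scale as indicative:

```python
import numpy as np

def discrepancy_statistic(z):
    """Symmetric-discrepancy statistic A_n for samples z of shape (n, d)
    in the unit hypercube, built from the U-statistics U1 and U2."""
    n, d = z.shape
    # U1: one-sample term, product over dimensions of (1 + 2z - 2z^2).
    U1 = np.mean(np.prod(1 + 2 * z - 2 * z**2, axis=1))
    # U2: pairwise term over k < l, product over dimensions of (1 - |z_k - z_l|).
    diff = np.abs(z[:, None, :] - z[None, :, :])
    prod = np.prod(1 - diff, axis=2)
    iu = np.triu_indices(n, k=1)
    U2 = 2 ** (d + 1) * prod[iu].sum() / (n * (n - 1))
    Md = (4 / 3) ** d
    xi1 = (9 / 5) ** d - (16 / 9) ** d
    return np.sqrt(n) * ((U1 - Md) + 2 * (U2 - Md)) / np.sqrt(5 * xi1)

rng = np.random.default_rng(4)
A_unif = discrepancy_statistic(rng.uniform(size=(784, 2)))          # near zero
A_clus = discrepancy_statistic(rng.uniform(size=(200, 2)) * 0.5)    # large
```

Both $U_1$ and $U_2$ have expectation $M^d=(4/3)^d$ under uniformity, so the statistic is centered at zero for well-fit rescaled data and grows for clustered samples.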

C.2  Ripley Statistics Test

The $\Sigma$ used in step 4 of the Ripley statistics method is defined by
$\Sigma=\begin{pmatrix}\mathrm{Var}(\hat K(r_1)) & \cdots & \mathrm{cov}(\hat K(r_1),\hat K(r_s))\\ \vdots & \ddots & \vdots\\ \mathrm{cov}(\hat K(r_1),\hat K(r_s)) & \cdots & \mathrm{Var}(\hat K(r_s))\end{pmatrix},$
(C.2)
where $r_i\in r$, $r=\{r_1,\dots,r_s\}$, indicates the neighborhood radius of each sample point. The variance terms, $\mathrm{Var}(\hat K(r_i)),\ i=1,\dots,s$, and covariance terms, $\mathrm{cov}(\hat K(r_i),\hat K(r_j)),\ j=1,\dots,s$, are defined by
$\mathrm{Var}(\hat K(r_i))=\frac{2e_{r_i}-e_{r_i}^2}{n(n-1)}+\frac{4V_{r_i}(n-2)}{n(n-1)}+\frac{(1+n)\exp(-n)}{1-\exp(-n)-n\exp(-n)}\,e_{r_i}^2,$
(C.3)
where $n$ is the number of samples,
$e_{r_i}=\pi r_i^2-\frac{8r_i^3}{3}+\frac{r_i^4}{2},$
(C.4)
and
$V_{r_i}=2r_i^5\left(\frac{8}{3}\pi-\frac{256}{45}\right)+r_i^6\left(\frac{11}{48}\pi-\frac{8}{9}-64\right)+\frac{8r_i^7}{3}-\frac{r_i^8}{4},$
(C.5)
and for the covariance term,
$\mathrm{cov}(\hat K(r_i),\hat K(r_j))=\frac{2e_{r_i}-e_{r_i}e_{r_j}}{n(n-1)}+\frac{4(n-2)\,C_{r_i,r_j}}{n(n-1)}+\frac{e_{r_i}e_{r_j}\exp(-n)(1+n)}{1-\exp(-n)-n\exp(-n)},$
(C.6)
where
$C_{r_i,r_j}=\left(1-2r_j^2\right)r_i^2r_j^2\,b_{r_i}b_{r_j}+2\left(2-4r_j\right)r_i^2r_j^3\,b_{r_i}\int_{r_i/r_j}^{1}\left(b_{r_j}-g(x_1)\right)dx_1+2\left(2-4r_j\right)r_i^3r_j^2\,b_{r_j}\int_0^1\left(b_{r_j}-g\!\left(\frac{r_i x_1}{r_j}\right)\right)\left(b_{r_i}-g(x_1)\right)dx_1+4r_i^2r_j^4\int_0^1\!\!\int_0^1\left[h_{A_1}\!\left(\frac{r_j x}{r_i},r_i\right)+h_{A_2}\!\left(\frac{r_j x}{r_i},r_i\right)+h_{A_3}\!\left(\frac{r_j x}{r_i},r_i\right)+h_{A_4}\!\left(\frac{r_j x}{r_i},r_i\right)\right]\left[h_{A_3}(x,r_i)+h_{A_4}(x,r_i)\right]dx_1\,dx_2,$
(C.7)
where $x=(x_1\ x_2)^T$ is the sample point, and
$b_{r_i}=\frac{1}{r_i^2}e_{r_i}=\pi-\frac{8}{3}r_i+\frac{r_i^2}{2}.$
(C.8)
The functions $g(\cdot)$ and $h_{A_k}(\cdot),\ k=1,\dots,4$, in equation C.7 are defined through indicator functions over the unit square; their full expressions are given in Lang and Marcon (2013).
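As a quick consistency check on the geometric terms, $b_r$ is the edge-corrected disc area $e_r$ divided by $r^2$, so $b_r=e_r/r^2$ should hold identically; a short numerical verification (our sketch):

```python
import numpy as np

def e_r(r):
    # Equation C.4: e_r = pi*r^2 - 8*r^3/3 + r^4/2.
    return np.pi * r**2 - 8 * r**3 / 3 + r**4 / 2

def b_r(r):
    # Equation C.8: b_r = e_r / r^2 = pi - 8*r/3 + r^2/2.
    return np.pi - 8 * r / 3 + r**2 / 2

r = np.linspace(0.01, 0.5, 50)
assert np.allclose(b_r(r), e_r(r) / r**2)
```

The identity holds exactly for every radius, which confirms that the two expressions are consistent term by term.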

Appendix D: Rat Hippocampus Data and Transformation Results

In this appendix, we demonstrate the application of IRCM and MDCI to population spiking activity recorded from rat hippocampus (Tao et al., 2018). The data represent the neural activity of the cell ensembles around a tetrode placed in the CA1 and CA3 regions of the hippocampus of a rat traversing a W-shaped maze environment. Spikes were detected offline by choosing events whose peak-to-peak amplitudes were above a 40 $\mu$V threshold in at least one of the channels. For each spike, the peak amplitudes across the electrode channels were used as a four-dimensional mark. In Tao et al. (2018), the joint mark intensity of the recorded data is modeled by a mixture of gaussians (MOG) model, defined by
$\lambda(t,\vec m)=\sum_{c=1}^{C}\lambda_c\exp\left(-\frac{(x_t-f_c)^2}{2\sigma_c^2}-\frac{(\vec m-\vec\mu_c)^T\Sigma_c(\vec m-\vec\mu_c)}{2}\right),$
(D.1)
where $C$ is the number of gaussian components, $\lambda_c$ is a normalized weight for the $c$th gaussian component, $x_t$ is the position of the animal, $\vec m$ is the four-dimensional mark vector, and $f_c$, $\sigma_c^2$ and $\vec\mu_c$, $\Sigma_c$ are the means and covariances in the position and mark spaces, respectively. Model parameters are estimated using a Gibbs sampling technique presented in Tao et al. (2018). Figure 8 shows the mark data from a sample tetrode, along with the positions at which the unsorted spikes occurred, tracing out the path the rat traveled in the maze.
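Equation D.1 can be evaluated directly. The following sketch is ours, with hypothetical toy parameters, and it applies $\Sigma_c$ exactly as written in equation D.1 (that is, acting as a precision-like matrix in the exponent):

```python
import numpy as np

def mog_intensity(x_t, m, lam, f, sigma2, mu, Sigma):
    """Evaluate the mixture-of-gaussians joint mark intensity of equation D.1.

    x_t: scalar position; m: (4,) mark vector.
    lam: (C,) component weights; f, sigma2: (C,) position means/variances;
    mu: (C, 4) mark means; Sigma: (C, 4, 4) mark-space matrices.
    """
    val = 0.0
    for c in range(len(lam)):
        dm = m - mu[c]
        quad = dm @ Sigma[c] @ dm
        val += lam[c] * np.exp(-(x_t - f[c]) ** 2 / (2 * sigma2[c]) - quad / 2)
    return val

# Hypothetical toy parameters (for illustration only): C = 2 components.
lam = np.array([1.0, 0.5])
f = np.array([0.0, 1.0]); sigma2 = np.array([0.1, 0.2])
mu = np.zeros((2, 4)); Sigma = np.stack([np.eye(4), 2 * np.eye(4)])
val = mog_intensity(0.0, np.zeros(4), lam, f, sigma2, mu, Sigma)
```

At $x_t=0$ and $\vec m=\vec 0$, the first component contributes $\lambda_1$ and the second contributes $\lambda_2\exp(-2.5)$, so the value can be checked against the closed form.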
Figure 8:

Neural data. (A) Four-dimensional mark information corresponding to the peak amplitudes of spikes across each electrode channel displayed two dimensions at a time. (B) The time and position of occurrence of unsorted spikes. (C) The spike peak amplitude on each channel, as a function of time.


To assess the MOG model fit to the data, we use the IRCM and MDCI algorithms to map the data to rescaled time and mark spaces. Figure 9 shows the rescaled mark samples, $v_i^{(l)},\ l=1,\dots,4$, plotted against the rescaled time samples, $u_i$, under the IRCM and MDCI transformations. Figures 9A and 9B, which are for the IRCM algorithm, suggest the transformed data are pushed toward lower values in the rescaled time space, $u$. Figures 9C and 9D, which are for MDCI, also suggest possible lack-of-fit in the rescaled time and mark. For instance, in the $(u_1,v_1)$ pair, the rescaled points are pushed toward lower values of $u$. For $(u_1,v_2)$, the samples are less uniform toward larger values of $u_1$ and $v_2$. The collective result might suggest a more pronounced lack-of-fit in capturing the temporal characteristics of the events.
Figure 9:

Rescaling results for the IRCM and MDCI algorithms using the MOG model for two different permutations of the mark. (A) Rescaling using IRCM and the $\Theta^{(1)}=\{1,2,3,4\}$ mark order. (B) Rescaling using IRCM and the $\Theta^{(1)}=\{2,1,3,4\}$ mark order. (C) Rescaling using MDCI and the $\Theta^{(2)}=\{1,2,3,4\}$ mark order. (D) Rescaling using MDCI and the $\Theta^{(2)}=\{2,1,3,4\}$ mark order.


Besides the visual inspection, we applied two uniformity tests to the rescaled data: the Pearson chi-square and multivariate KS tests. Table 7 shows the uniformity test results, which indicate lack-of-fit under both the IRCM and MDCI transformations for both mark dimension orders.

Table 7:
Uniformity Tests' Metrics and Corresponding $p$-Values Applied to the Hippocampus Data.

Each row reports the test statistic and $p$-value for, in order: IRCM with mark permutation order $\Theta^{(1)}=\{1,2,3,4\}$, IRCM with $\Theta^{(1)}=\{2,1,3,4\}$, MDCI with $\Theta^{(2)}=\{1,2,3,4\}$, and MDCI with $\Theta^{(2)}=\{2,1,3,4\}$.

Pearson $\chi^2$ test, $d_i=2,\ i=1,\dots,4$, $\chi^2_{31}(\alpha)=20.07$: 96.61 ($p\approx 0$); 107.39 ($p\approx 0$); 101.28 ($p\approx 0$); 103.94 ($p\approx 0$)
Multivariate Kolmogorov-Smirnov test, $n=967$, $C_{\hat\alpha}/\sqrt{n}=0.011$: 0.10 ($p\approx 0$); 0.201 ($p\approx 0$); 0.067 ($p=0.012$); 0.730 ($p=0.015$)

Note: The bold numbers show cases where the test identified lack-of-fit at a significance level of $\alpha=0.05$.

Additionally, Figure 10 shows the multivariate KS plots for these two permutations, out of 12 possible permutations, under the IRCM and MDCI transformations. For each permutation, we consider a different order of the mark dimensions to derive the rescaled mark defined in the IRCM and MDCI algorithms. The multivariate KS plots and corresponding KS distances are in agreement with Table 7.
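The multivariate KS distance behind these plots compares the empirical joint CDF of the rescaled samples with the uniform CDF; here is a simplified sketch (ours, evaluating the empirical CDF at the sample points themselves rather than via the full Justel et al. construction):

```python
import numpy as np

def multivariate_ks_distance(z):
    """Simplified multivariate KS distance for samples z of shape (n, d)
    in the unit hypercube: the largest gap between the empirical joint
    CDF, evaluated at the sample points, and the uniform CDF prod_j z_j."""
    n, d = z.shape
    # below[i, k] is True when sample k is componentwise <= sample i.
    below = np.all(z[None, :, :] <= z[:, None, :], axis=2)
    ecdf = below.mean(axis=1)
    return np.max(np.abs(ecdf - np.prod(z, axis=1)))

rng = np.random.default_rng(5)
d_unif = multivariate_ks_distance(rng.uniform(size=(967, 2)))       # small
d_skew = multivariate_ks_distance(rng.uniform(size=(967, 2)) ** 2)  # large
```

For well-rescaled data, the distance shrinks at the $1/\sqrt{n}$ rate, matching the critical values of the form $C_{\hat\alpha}/\sqrt{n}$ quoted in Table 7; skewed samples produce a clearly larger distance.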
Figure 10:

KS plots for 2 out of 12 permutations under the IRCM and MDCI algorithms using the MOG model. (A) KS plot based on IRCM. (B) KS plot based on MDCI. Both graphs suggest lack-of-fit in the model, as the empirical CDFs do not follow the theoretical 45-degree line.


These results suggest that the MOG model misses some temporal dynamics present in the spike events. The test information can be used to revise the joint mark intensity functions derived to characterize the spike data.

References

Berrendero, J. R., Cuevas, A., & Vázquez-Grande, F. (2006). Testing multivariate uniformity: The distance-to-boundary method. Canadian Journal of Statistics, 34(4), 693–707.
Brown, E. N., Barbieri, R., Ventura, V., Kass, R. E., & Frank, L. M. (2002). The time-rescaling theorem and its application to neural spike train data analysis. Neural Computation, 14(2), 325–346.
Chen, Z., & Ye, C. (2009). An alternative test for uniformity. International Journal of Reliability, Quality and Safety Engineering, 16(4), 343–356.
Chiu, S. N., & Liu, K. I. (2009). Generalized Cramér–von Mises goodness-of-fit tests for multivariate distributions. Computational Statistics and Data Analysis, 53(11), 3817–3834.
Csörgő, M. (2014). Multivariate Cramér–von Mises statistics. Wiley StatsRef: Statistics Reference Online.
Deng, X., Liu, D. F., Kay, K., Frank, L. M., & Eden, U. T. (2015). Clusterless decoding of position from multiunit activity using a marked point process filter. Neural Computation, 27(7), 1438–1460.
Dixon, P. M. (2014). Ripley's K function. Wiley StatsRef: Statistics Reference Online.
Fan, Y. (1998). Goodness-of-fit tests based on kernel density estimators with fixed smoothing parameters. Econometric Theory, 14, 604–621.
Friedman, J. H., & Rafsky, L. C. (1979). Multivariate generalizations of the Wald-Wolfowitz and Smirnov two-sample tests. Annals of Statistics, 7, 697–717.
Greenwood, P. E., & Nikulin, M. S. (1996). A guide to chi-squared testing. New York: Wiley.
Ho, L. P., & Chiu, S. N. (2007). Testing uniformity of a spatial point pattern. Journal of Computational and Graphical Statistics, 16(2), 378–398.
Jafari Mamaghani, M., Andersson, M., & Krieger, P. (2010). Spatial point pattern analysis of neurons using Ripley's K-function in 3D. Frontiers in Neuroinformatics, 4, 9.
Jain, A. K., Xu, X., Ho, T. K., & Xiao, F. (2002). Uniformity testing using minimal spanning tree. In Proceedings of the International Conference on Pattern Recognition. Piscataway, NJ: IEEE.
Jeffreys, H., Jeffreys, B., & Swirles, B. (1999). Methods of mathematical physics. Cambridge: Cambridge University Press.
Justel, A., Peña, D., & Zamar, R. (1997). A multivariate Kolmogorov-Smirnov test of goodness of fit. Statistics and Probability Letters, 35(3), 251–259.
Kloosterman, F., Layton, S. P., Chen, Z., & Wilson, M. A. (2014). Bayesian decoding using unsorted spikes in the rat hippocampus. Journal of Neurophysiology, 111, 217–227.
Lang, G., & Marcon, E. (2013). Testing randomness of spatial point patterns with the Ripley statistic. ESAIM: Probability and Statistics, 17, 767–788.
Liang, J.-J., Fang, K.-T., Hickernell, F., & Li, R. (2001). Testing multivariate uniformity and its applications. Mathematics of Computation, 70(233), 337–355.
Marcon, E., Traissac, S., & Lang, G. (2013). A statistical test for Ripley's K function rejection of Poisson null hypothesis. International Scholarly Research Notices, 2013, 753475.
Marhuenda, Y., Morales, D., & Pardo, M. (2005). A comparison of uniformity tests. Statistics, 39(4), 315–327.
Meng, X.-L., Rosenthal, R., & Rubin, D. B. (1992). Comparing correlated correlation coefficients. Psychological Bulletin, 111(1), 172.
Miller Jr., R. G. (1977). Developments in multiple comparisons 1966–1976. Journal of the American Statistical Association, 72(360a), 779–788.
Papangelou, F. (1972). Integrability of expected increments of point processes and a related random change of scale. Transactions of the American Mathematical Society, 165, 483–506.
Pearson, E. S. (1938). The probability integral transformation for testing goodness of fit and combining independent tests of significance. Biometrika, 30(1/2), 134–148.
Ripley, B. D. (2005). Spatial statistics. New York: Wiley.
Rosenblatt, M. (1952). Remarks on a multivariate transformation. Annals of Mathematical Statistics, 23(3), 470–472.
Smith, S. P., & Jain, A. K. (1984). Testing for uniformity in multidimensional data. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-6(1), 73–81.
Sodkomkham, D., Ciliberti, D., Wilson, M. A., Fukui, K.-i., Moriyama, K., Numao, M., & Kloosterman, F. (2016). Kernel density compression for real-time Bayesian encoding/decoding of unsorted hippocampal spikes. Knowledge-Based Systems, 94, 1–12.
Tao, L., Weber, K. E., Arai, K., & Eden, U. T. (2018). A common goodness-of-fit framework for neural population models using marked point process time-rescaling. Journal of Computational Neuroscience, 45(2), 147–162.
Veen, A., & Schoenberg, F. P. (2006). Assessing spatial point process models using weighted K-functions: Analysis of California earthquakes. In A. Baddeley, P. Gregori, J. Mateu, R. Stoica, & D. Stoyan (Eds.), Lecture Notes in Statistics: Vol. 185. Case studies in spatial point process modeling. New York: Springer.
Ventura, V. (2009). Traditional waveform based spike sorting yields biased rate code estimates. Proceedings of the National Academy of Sciences, 106(17), 6921–6926.
Yousefi, A., Amidi, Y., Nazari, B., & Eden, U. T. (2020). GitHub repository. https://github.com/YousefiLab/Marked-PointProcess-Goodness-of-Fit
Zimmerman, D. L. (1993). A bivariate Cramér–von Mises type of test for spatial randomness. Journal of the Royal Statistical Society: Series C (Applied Statistics), 42(1), 43–54.