Abstract

Pyramidal neurons recorded from the rat hippocampus and entorhinal cortex, such as place and grid cells, have diverse receptive fields, which are either unimodal or multimodal. Spiking activity from these cells encodes information about the spatial position of a freely foraging rat. At fine timescales, a neuron’s spike activity also depends significantly on its own spike history. However, due to limitations of current parametric modeling approaches, it remains a challenge to estimate complex, multimodal neuronal receptive fields while incorporating spike history dependence. Furthermore, efforts to decode the rat’s trajectory in one- or two-dimensional space from hippocampal ensemble spiking activity have mainly focused on spike history–independent neuronal encoding models. In this letter, we address these two important issues by extending a recently introduced nonparametric neural encoding framework that allows modeling both complex spatial receptive fields and spike history dependencies. Using this extended nonparametric approach, we develop novel algorithms for decoding a rat’s trajectory based on recordings of hippocampal place cells and entorhinal grid cells. Results show that both encoding and decoding models derived from our new method performed significantly better than state-of-the-art encoding and decoding models on 6 minutes of test data. In addition, our model’s performance remains invariant to the apparent modality of the neuron’s receptive field.

1  Introduction

A fundamental goal in neuroscience is to establish a functional relationship between a sensory stimulus or induced behavior and the associated neuronal response. For example, in a rodent experiment (O’Keefe, 1979; Hasselmo, 2008), microelectrodes are implanted into a rat’s hippocampus and entorhinal cortex, and the activities of ensembles of neurons are recorded while the animal freely forages in an open arena (see Figure 1A). The goals of such an experiment are to understand how a single neuron or a population of neurons encodes the rat’s spatial position and to determine whether we can decode the animal’s position from the ensemble spiking activity (Brown, Frank, Tang, Quirk, & Wilson, 1998; Kloosterman, Layton, Chen, & Wilson, 2014).

Figure 1:

Spiking activity of unimodal and multimodal place cells. (A) A schematic of the circular arena in which the rat is freely foraging. Spiking activity of (B) a unimodal place cell and (C) a multimodal place cell. The rat’s run trajectory is marked by blue lines. Red dots mark the rat’s spatial position when the cell spiked.

In the experiment described above, pyramidal neurons recorded from the rat entorhinal cortex and hippocampus show complex grid-like or place-like spiking patterns as a function of the rat’s position in two-dimensional (2D) environments (O’Keefe, 1979; Moser, Kropff, & Moser, 2008; Hasselmo, 2008) (see Figure 1). However, it is well known that these neurons fire in a non-Poisson fashion and that their spiking also depends on their own spiking history. For instance, bursting patterns have been found to be a significant component of hippocampal neuronal spiking activity (Barbieri, Quirk, Frank, Wilson, & Brown, 2001; Fenton & Muller, 1998). Thus, encoding and decoding models that capture spike history dependencies will likely yield better performance (Truccolo, Eden, Fellows, Donoghue, & Brown, 2005; Pillow et al., 2008). However, this well-known intrinsic phenomenon has not been fully explored in the rodent hippocampal decoding literature (Brown et al., 1998; Zhang, Ginzburg, McNaughton, & Sejnowski, 1998; Barbieri et al., 2004; Santaniello, Montgomery, Gale, & Sarma, 2012; Kloosterman et al., 2014). The fact that receptive fields can have complex, multimodal shapes (especially in 2D space) further complicates such modeling. Parametric models may be insufficient to characterize a complex, multimodal spatial receptive field structure and history dependence simultaneously.

Nonparametric modeling assumes no explicit probability distribution (i.e., it is distribution free) and lets the data speak for themselves; it is therefore more flexible for statistical data analysis. In the literature, several nonparametric methods have been developed for modeling neuronal spike activity (Chen, Kloosterman, Layton, & Wilson, 2012; Kloosterman et al., 2014; Rad & Paninski, 2010; Coleman & Sarma, 2010). Specifically, Chen et al. (2012) and Kloosterman et al. (2014) proposed an encoding framework based on kernel density estimation methods for capturing complex dependencies between spiking features and the rat’s position in a 1D environment. However, their model did not incorporate spike history dependence. In other work, Theis, Chagas, Arnstein, Schwarz, and Bethge (2013) used a semiparametric mixture-of-gaussians model to estimate ganglion cell responses incorporating spike history effects and showed improvement over standard parametric models. Nonparametric methods, such as histograms and radial basis functions, have also been used for dimensionality reduction in fitting neuronal response functions (Sharpee, Rust, & Bialek, 2004; Williamson, Sahani, & Pillow, 2015).

Inspired by these works, we applied nonparametric algorithms to develop encoding models that use both spatial and spike history information for neurons recorded from a rat’s hippocampus and entorhinal cortex. Further, we proposed a novel nonparametric decoding algorithm and applied it to reconstruct the rat’s trajectory using spiking activity from these cells. Our nonparametric framework is based on a novel and efficient nonparametric maximum likelihood (ML) density estimator over a set of densities whose square root is band-limited (BL) (Agarwal, Chen, & Sarma, 2015). The procedure uses convex optimization with quadratic time complexity for encoding (for details, see appendix  A) and linear time complexity (in terms of number of neurons per time step) for decoding.

Finally, we use 6 minutes of unseen data from these 53 cells to evaluate the performance of our nonparametric models and compare them with state-of-the-art parametric models with and without spike history covariates. Results show that the performance of our nonparametric encoding and decoding models with spike history is significantly better than that of state-of-the-art parametric models. The result remained robust across the population of neurons with both unimodal and multimodal receptive fields.

2  Methods

2.1  Notation

Let t denote a continuous-time index, and let c ∈ {1, …, C} denote the neuronal index within the population. We use lowercase p to denote a probability density function (pdf) and uppercase P to denote a probability. Further, the terms unimodal and multimodal are used throughout as descriptive labels for cells, not as definitive classifications.

2.2  Data Collection

In the experimental protocol, a Long-Evans rat was freely foraging in an open field arena (1 m radius) for a period of 36.36 minutes. Custom microelectrode drives with variable numbers of tetrodes were implanted in the rat’s medial entorhinal cortex and dorsal hippocampal CA1 area. Spikes were acquired with a sampling rate of 31.25 kHz and filter settings of 300 Hz to 6 kHz. Two infrared diodes alternating at 60 Hz were attached to the microelectrode array drive of the animal for position tracking. Spike sorting was accomplished using a custom manual clustering program (Xclust, M. A. Wilson). All procedures were approved by the MIT Institutional Animal Care and Use Committee.

In this selected recording session, 74 neurons were recorded from the entorhinal cortex (n = 8) and dorsal hippocampus (n = 66). Of these 74 neurons, 21 (6 from entorhinal cortex) were discarded, as they fired either independently of position (i.e., putative interneurons) or at very low rates (below 0.175 Hz). Of the remaining 53 neurons, 27 (1 from entorhinal cortex, 26 from hippocampus) were unimodal and 26 (1 from entorhinal cortex, 25 from hippocampus) were multimodal. The x and y coordinates of the rat’s position when these cells spiked are shown as scatter plots in Figure 2.

Figure 2:

Spike scatter plots of spiking activity. (A) Unimodal hippocampal place cells and (B) multimodal hippocampal place cells and entorhinal grid cells. Each plot corresponds to the spiking activity of a single neuron displayed in (x, y) coordinates in the circular arena. Each dot corresponds to the (x, y) coordinates of the rat’s position when the cell spiked.

2.3  Neuronal Encoding and Decoding Models

The encoding model describes the propensity of a single cell to spike over time as a function of external and internal covariates. As a neuronal spike train is a stochastic point process (i.e., random binary events on a continuum), the goal of encoding is to characterize the conditional intensity function (CIF) (Cox & Isham, 2000; Agarwal & Sarma, 2011, 2012a, 2012b). In this letter, we assume that the spiking activity of each place and grid cell depends only on the rat’s position and the cell’s own spike history. Therefore, we write the CIF for neuron c as (Brown, Barbieri, Ventura, Kass, & Frank, 2002; Truccolo et al., 2005)
$$\lambda_c\big(t \mid x_t, y_t, H^c_t\big) \;=\; \lim_{\Delta \to 0} \frac{P\big(N_c(t+\Delta) - N_c(t) = 1 \mid x_t, y_t, H^c_t\big)}{\Delta}, \tag{2.1}$$
where $(x_t, y_t)$ denotes the rat’s position at time t in the two-dimensional (2D) environment, $N_c(t)$ is the spike counting process of cell c, and the vector $H^c_t$ consists of history covariates for cell c up to time t, as in Kass and Ventura (2001), Sarma et al. (2012), Sarma et al. (2010), Santaniello et al. (2012), Kahn, Schieber, Thakor, and Sarma (2011), and Agarwal, Sarma, Thakor, Schieber, and Massaquoi (2015).

The decoding model is the inverse of the encoding model and is used to predict the external covariates at any given time t from the spiking activity of the cells up to time t. In this letter, we realize the decoding model with the maximum a posteriori (MAP) estimator that maximizes the posterior probability $p(x_t, y_t \mid \Delta t_1, \ldots, \Delta t_C)$, where $\Delta t_c$ denotes the time since the last spike for cell c at time t and $p(\Delta t_1, \ldots, \Delta t_C \mid x_t, y_t)$ denotes the likelihood.

As the likelihood is a high-dimensional object, we simplify it as follows. At any given time t, we assume that the likelihood of observing the times since the last spike of the C neurons, given the rat’s spatial position at time t, has the factorial form
$$p(\Delta t_1, \ldots, \Delta t_C \mid x_t, y_t) \;=\; \prod_{c=1}^{C} p(\Delta t_c \mid x_t, y_t), \tag{2.2}$$
where we assume that the neurons fire conditionally independently given the position, such that $p(\Delta t_c \mid x_t, y_t, \Delta t_{c'}) = p(\Delta t_c \mid x_t, y_t)$ for any $c' \neq c$. This allows us to write the MAP estimator as
$$(\hat{x}_t, \hat{y}_t) \;=\; \arg\max_{x, y} \; P(x, y) \prod_{c=1}^{C} p(\Delta t_c \mid x, y). \tag{2.3}$$

For simplicity, we do not consider the temporal prior in the course of computing the posterior. In sections 2.3.1 and 2.3.2, we use both parametric and nonparametric methods to estimate the encoding and decoding models for neuronal spike trains.
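As a concrete illustration of equations 2.2 and 2.3, the following sketch evaluates the factorized likelihood over a 2D grid under a uniform prior. It assumes an exponential density for the time since the last spike (the history-independent pseudo-equilibrium case discussed later); the function name, its arguments, and all numerical values are illustrative, not from the letter.

```python
import numpy as np

def map_decode(dt_since_spike, rate_maps, grid_xy):
    """MAP position estimate from the times since the last spike of C cells.

    dt_since_spike : (C,) time since the last spike, one entry per cell
    rate_maps      : (C, G) CIF value of each cell at each of G grid points
    grid_xy        : (G, 2) coordinates of the candidate grid points

    Assumes (a modeling choice of this sketch) an exponential density
    p(dt_c | x, y) = lam_c * exp(-lam_c * dt_c) and a uniform prior, so the
    MAP estimate maximizes the log likelihood summed over conditionally
    independent cells.
    """
    lam = np.clip(rate_maps, 1e-12, None)                  # guard against log(0)
    loglik = np.log(lam) - lam * dt_since_spike[:, None]   # (C, G)
    return grid_xy[np.argmax(loglik.sum(axis=0))]
```

Intuitively, a cell that just spiked pulls the estimate toward the peak of its own rate map, while a long-silent cell pushes the estimate away from regions where that cell fires strongly.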

2.3.1  Parametric Models

Encoding. In the context of the generalized linear model (GLM) framework, we assume that the CIF has a log-linear form; namely, the logarithm of the CIF is linear in its parameters (Truccolo et al., 2005). Inspired by past successes of gaussian and Zernike polynomial representations in modeling 2D hippocampal place cells (Barbieri et al., 2004), these two representations are chosen to estimate the CIFs. In addition, spike history dependencies are modeled empirically with spike count history covariates, each counting the number of spikes within a past interval (in ms). Specifically, the CIF for each neuron c is modeled in the following two ways:

  • Quadratic GLM:
    $$\log \lambda_c(t) \;=\; \alpha_0 + \alpha_1 x_t + \alpha_2 y_t + \alpha_3 x_t^2 + \alpha_4 y_t^2 + \alpha_5 x_t y_t \;+\; \sum_{j=1}^{J} \gamma_j\, n^c_{t-j} \tag{2.4}$$
  • Zernike GLM:
    $$\log \lambda_c(t) \;=\; \sum_{n=0}^{4} \sum_{m=-n}^{n} \theta_{n,m}\, z_{n,m}(\rho_t, \phi_t) \;+\; \sum_{j=1}^{J} \gamma_j\, n^c_{t-j} \tag{2.5}$$

where the coefficients α, θ, and γ are the unknown GLM parameters for the quadratic and Zernike models, respectively; $z_{n,m}$ denotes the fourth-order Zernike polynomials (in polar coordinates $(\rho_t, \phi_t)$) as used in Barbieri et al. (2004); and $n^c_{t-j}$ denotes the spike count history covariates for cell c at time t. Note that since the log of the CIF is modeled as a quadratic function in $(x_t, y_t)$, the CIF has a gaussian shape.
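The quadratic log-linear model can be sketched as follows. The design-matrix layout, the single history column, and the Newton-Raphson (IRLS) fitting routine with a tiny ridge term are assumptions of this sketch; the letter fits its GLMs with a conjugate gradient solver.

```python
import numpy as np

def quadratic_design(x, y, hist_counts):
    """Design matrix for a quadratic log-linear CIF with one history column:
    log lam = b0 + b1*x + b2*y + b3*x^2 + b4*y^2 + b5*x*y + b6*hist."""
    return np.column_stack([np.ones_like(x), x, y, x**2, y**2, x * y, hist_counts])

def fit_poisson_glm(X, counts, n_iter=50, ridge=1e-8):
    """Fit log lam_t = X_t . beta by Newton-Raphson (IRLS) on the Poisson
    log likelihood; the tiny ridge term keeps the Hessian invertible."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(np.clip(X @ beta, -20, 20))   # mean spike count per bin
        grad = X.T @ (counts - mu) - ridge * beta
        hess = X.T @ (mu[:, None] * X) + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta
```

Because the Poisson log likelihood is concave under the log link, Newton-Raphson converges reliably here without line search.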

Decoding. In light of equations 2.2 and 2.3, $p(\Delta t_c \mid x_t, y_t)$ is evaluated as (Barbieri et al., 2001)
$$p(\Delta t_c \mid x_t, y_t) \;=\; \lambda_c\big(t \mid x_t, y_t, H^c_t\big)\, \exp\!\Big(-\int_{t-\Delta t_c}^{t} \lambda_c\big(u \mid x_t, y_t, H^c_u\big)\, du\Big), \tag{2.6}$$
where the integration in equation 2.6 is computed as
formula
2.7
formula
2.8
formula
2.9
where $(x_t, y_t)$ denote the spatial position covariates in either model. In writing the above probabilities and equations, it is assumed that the rat’s spatial position remains unchanged at the fine timescales of neuronal spiking. Note that the spike history term can be estimated directly from the data, as the spike history is known at any time during the decoding process.

A uniform prior P(x, y) is used for MAP estimation in equation 2.3. A similar encoding and decoding formulation can be used for spike history–independent models, except that the hyperparameter J is set to zero. The MAP estimates derived from both spike history–dependent and –independent models are then smoothed by filtering bidirectionally with a two-pole Butterworth filter whose cutoff frequency is chosen to contain 95% of the power of the position coordinates, yielding the final smooth estimates $(\hat{x}_t, \hat{y}_t)$.

Model estimation and validation. The parameters of history-independent models are estimated by maximum likelihood regression, whereas the parameters of history-dependent models are estimated using $\ell_2$-regularized (ridge) GLM regression. The log-linear form of the model results in a convex negative log likelihood that allows efficient computation of the ML estimate (McCullagh, 1984; Truccolo et al., 2005). We use a conjugate gradient algorithm to solve the convex optimization problem.

For simplicity, despite the shape differences in receptive fields, we assume that all neuronal place receptive fields have the same level of history dependence and the same number of parameters. We therefore use a single vector of hyperparameters for all cells. These hyperparameters are chosen by maximizing the R2 statistic, computed as a measure of decoding performance on the validation data set, using the hill-climbing algorithm (Yuret & De La Maza, 1993), which guarantees convergence to a local maximum. In particular, the first parameter is varied while the second parameter is fixed at some initial value. A maximum of decoding performance is found in the first dimension, and the first parameter is then fixed at the corresponding value. The second parameter is then varied, and the value providing maximum decoding performance is obtained. The process is repeated until a local maximum in all dimensions is found, namely, until no improvement in decoding performance is observed by varying either of the two parameters in any direction.
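The coordinate-wise search described above can be sketched as follows; the function name, the fixed step sizes, and the toy score function in the usage note are illustrative stand-ins for the validation-R2 objective.

```python
def hill_climb(score, init, deltas, max_rounds=50):
    """Coordinate-ascent ("hill-climbing") hyperparameter search.

    score  : callable mapping a tuple of hyperparameters to, e.g., validation R2
    init   : starting hyperparameter values
    deltas : step size for each hyperparameter

    Each parameter is nudged up or down while the others stay fixed; a move
    is kept only if it improves the score, and the search stops when no
    single-parameter move helps (a local maximum).
    """
    params = list(init)
    best = score(tuple(params))
    improved, rounds = True, 0
    while improved and rounds < max_rounds:
        improved, rounds = False, rounds + 1
        for i, d in enumerate(deltas):
            for step in (d, -d):
                trial = list(params)
                trial[i] += step
                s = score(tuple(trial))
                if s > best:
                    best, params, improved = s, trial, True
    return tuple(params), best
```

For example, `hill_climb(lambda t: -(t[0] - 3)**2 - (t[1] + 1)**2, (0, 0), (1, 1))` climbs to the maximizer (3, -1).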

Out of the 30.36 minutes of training data, the first 85% is used for model building, and the remaining 15% is used to validate hyperparameters. Once the hyperparameters are validated, the model is retrained on the entire 30.36 minutes of training data and assessed on the new 6-minute test data.

2.3.2  Nonparametric Models

Encoding. Let x, y, and h be the outcomes of the random variables X, Y, and $H_c$, respectively, and let $\Delta N^c_t = 1$ if and only if cell c fires an action potential in $(t, t + \Delta]$ and $\Delta N^c_t = 0$ otherwise.

Then the CIF for neuron c can be written in the following equivalent form as a ratio of two densities (Kloosterman et al., 2014):
$$\lambda_c(t \mid x, y, h) \;=\; \frac{N}{T} \cdot \frac{p(x, y, h \mid \text{spike})}{p(x, y, h)}, \tag{2.10}$$
where N denotes the total number of spikes within the total duration T. The numerator contains the 3D conditional pdf of spatial position and history conditioned on neuronal spiking, and the denominator contains the 3D joint pdf of the spatial position and spike history. Note that the time and cell indices are dropped from x, y, and h in these definitions, as the densities are defined on the space of X, Y, and $H_c$, which take the particular values $x_t$, $y_t$, and $h^c_t$ for any given cell c at any given time t. The use of the logarithm in the history function allows for a smoother dependence of $\lambda_c$ on h, which in turn captures high-frequency components in the CIF due to refractoriness (i.e., a sharp decrease in $\lambda_c$ right after a spike) and bursting.
To estimate $p(x, y, h \mid \text{spike})$ and $p(x, y, h)$, we apply the probability product rule:
$$p(x, y, h \mid \text{spike}) = p(h \mid x, y, \text{spike})\, p(x, y \mid \text{spike}), \qquad p(x, y, h) = p(h \mid x, y)\, p(x, y). \tag{2.11}$$
Let $\lambda_0(x, y) \triangleq \frac{N}{T}\, \frac{p(x, y \mid \text{spike})}{p(x, y)}$. We rewrite equation 2.10 as
$$\lambda_c(t \mid x, y, h) \;=\; \lambda_0(x, y)\, \frac{p(h \mid x, y, \text{spike})}{p(h \mid x, y)} \;\approx\; \lambda_0(x, y)\, \frac{p(h \mid \lambda_0, \text{spike})}{p(h \mid \lambda_0)}. \tag{2.12}$$
In the second step, it is assumed that $\lambda_0(x, y)$ conveys the same information about h as knowing (x, y). This assumption is reasonable, as the spiking history h depends on the spatial coordinates through the spiking patterns of the neuron, which are characterized by $\lambda_0$. This assumption reduces a 3D density estimation problem to a 2D one. The validity of this assumption is tested by performing goodness-of-fit tests and decoding analysis on the held-out test data. A direct way to verify the assumption is to calculate the dependencies between $H_c$ and (X, Y) and between $H_c$ and $\lambda_0$. (For details, see Agarwal, Sacre, & Sarma, 2015.)
Further, to estimate the densities in equation 2.12, we use the nonlinear transformation
$$\tilde{\lambda}_0 \;=\; g(\lambda_0) \;=\; \log(\lambda_0 + \epsilon),$$
where $\epsilon$ is a very small constant that prevents numerical underflow. This transformation is motivated by the fact that $\lambda_0 \geq 0$ and hence has a nonsmooth pdf at $\lambda_0 = 0$, which is difficult to estimate using nonparametric methods. The transformation aims to smooth out the density of $\lambda_0$ for easier nonparametric estimation. Note that for any one-to-one transformation g, the change of variables does not affect the estimation, due to the axioms of probability (see appendix  B).
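As a sketch of the logarithmically smoothed history covariate discussed above, the following computes the time since the last spike per time bin and log-transforms it; the function name, the reset convention at spike bins, and the value of the small constant are assumptions of this sketch.

```python
import numpy as np

def log_time_since_last_spike(spike_train, dt, eps=1e-3):
    """Log-transformed time since the last spike, one value per time bin.

    spike_train : 0/1 array, 1 where the cell spiked
    dt          : bin width in seconds
    eps         : small constant guarding the log near zero (assumed value)

    The raw covariate resets to 0 at each spike bin and grows by dt per
    silent bin; the log compresses its sharply peaked density near zero.
    Bins before the first spike are left at +inf (history undefined).
    """
    h = np.empty(len(spike_train))
    since = np.inf
    for t, s in enumerate(spike_train):
        since = 0.0 if s else since + dt
        h[t] = since
    return np.log(h + eps)
```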

To estimate history-independent encoding models, we drop the history ratio in equation 2.12 (i.e., we set $\lambda_c = \lambda_0$).

Finally, we define the place receptive field (RF) as the expected value of $\lambda_c(t \mid x, y, h)$ with respect to the probability distribution $p(h \mid x, y)$:
$$RF(x, y) \;=\; \int \lambda_c(t \mid x, y, h)\, p(h \mid x, y)\, dh. \tag{2.13}$$
Therefore, the estimation of the RF consists of three steps: (1) estimate $\lambda_0(x, y)$; (2) estimate the history densities in equation 2.12 (note that step 2 depends on step 1, but step 1 is independent of step 2, so no iterative solution is required); and (3) estimate the expected value and product according to equation 2.13.
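Step 1, the spatial rate λ0 as a ratio of densities (equation 2.10 without the history term), can be sketched with crude 2D-histogram density estimates; the letter instead uses the KDE and BLML estimators described below, and the bin count, arena extent, and function name here are assumptions of this sketch.

```python
import numpy as np

def occupancy_normalized_rate(spike_xy, traj_xy, T, bins=20, extent=(-1, 1, -1, 1)):
    """lam0(x, y) ~ (N/T) * p(x, y | spike) / p(x, y), with 2D histograms
    as stand-in density estimates.

    spike_xy : (N, 2) rat positions at the cell's spike times
    traj_xy  : (M, 2) rat positions sampled along the whole trajectory
    T        : total duration in seconds
    """
    rng = [[extent[0], extent[1]], [extent[2], extent[3]]]
    p_spk, xe, ye = np.histogram2d(spike_xy[:, 0], spike_xy[:, 1],
                                   bins=bins, range=rng, density=True)
    p_occ, _, _ = np.histogram2d(traj_xy[:, 0], traj_xy[:, 1],
                                 bins=bins, range=rng, density=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        lam = (len(spike_xy) / T) * p_spk / p_occ
    return np.nan_to_num(lam), xe, ye   # unvisited bins mapped to 0
```

The occupancy density in the denominator corrects for the rat spending unequal time in different parts of the arena, so a bin is assigned a high rate only if the cell spiked there disproportionately often.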
Decoding. Assuming the change of variable $\tilde{h}_c = g(\Delta t_c)$, instead of using equation 2.3, we use an equivalent yet simpler form of the MAP estimator:
$$(\hat{x}_t, \hat{y}_t) \;=\; \arg\max_{x, y} \; P(x, y) \prod_{c=1}^{C} p\big(g(\Delta t_c) \mid x, y\big), \tag{2.14}$$
which yields the identical estimator by the rules of probability: the transformed likelihood (see equation B.3 in appendix  B) has the same dependence on (x, y). To simplify equation 2.14, we assume that $\lambda_0^c(x, y)$ is a sufficient statistic (Lehmann & Scheffé, 1950) for $h_c$ for the cth neuron. That is, knowing (x, y) does not convey any additional information about $h_c$ if $\lambda_0^c(x, y)$ is known. This assumption leads to
$$p\big(g(\Delta t_c) \mid x, y\big) \;=\; p\big(g(\Delta t_c) \mid \lambda_0^c(x, y)\big). \tag{2.15}$$
Now the density $p(g(\Delta t) \mid \lambda_0)$ (the cell index c has been dropped for brevity of notation) is estimated as
$$p\big(g(\Delta t) \mid \lambda_0\big) \;=\; \frac{p\big(g(\Delta t), \lambda_0\big)}{p(\lambda_0)} \;=\; \frac{p\big(\lambda_0 \mid g(\Delta t)\big)\, p\big(g(\Delta t)\big)}{p(\lambda_0)}, \tag{2.16}$$
where g is a one-to-one nonlinear transformation. The second equality follows from the rules of probability (see equations B.4 and B.5 in appendix  B). The densities $p(\lambda_0 \mid g(\Delta t))$, $p(g(\Delta t))$, and $p(\lambda_0)$ are estimated nonparametrically, as described in the next section.

For spike history–independent decoding models, the likelihood $p(\Delta t_c \mid x, y)$ in equation 2.3 is first replaced by $p(\Delta t_c \mid \lambda_0^c(x, y))$ using the sufficient statistic assumption. Then $p(\Delta t_c \mid \lambda_0^c)$ is calculated using the pseudo-equilibrium assumption (as done in GLM decoding): $p(\Delta t_c \mid \lambda_0^c) = \lambda_0^c \exp(-\lambda_0^c\, \Delta t_c)$.

The MAP estimates $(\hat{x}_t, \hat{y}_t)$ are computed by maximizing equation 2.3 or 2.14 over a 2D grid of the coordinate space. These MAP estimates are then smoothed by filtering bidirectionally with a two-pole Butterworth filter whose cutoff frequency is chosen to contain 95% of the power of the true coordinates, yielding the final smooth estimates. Figure 3 provides a summary of the encoding and decoding processes.
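The bidirectional smoothing step can be sketched as follows. To keep the sketch dependency-free, a single-pole recursive smoother run forward and then backward stands in for the two-pole Butterworth filter used in the letter (scipy.signal.butter with filtfilt would match it more closely); the smoothing constant is an assumption.

```python
import numpy as np

def smooth_bidirectional(z, alpha=0.2):
    """Zero-phase smoothing of a raw MAP coordinate trace.

    A single-pole recursive smoother is run forward and then backward so
    the phase delays cancel, mimicking the zero-phase effect of
    bidirectional (forward-backward) filtering; alpha in (0, 1] trades
    smoothness for responsiveness (an assumed value).
    """
    def one_pass(v):
        out = np.empty(len(v))
        acc = v[0]
        for i, value in enumerate(v):
            acc = alpha * value + (1 - alpha) * acc
            out[i] = acc
        return out
    return one_pass(one_pass(z)[::-1])[::-1]
```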

Figure 3:

Encoding and decoding processes under the nonparametric framework. (A) Encoding process and (B) decoding process. Green boxes denote training data, blue boxes denote densities estimated nonparametrically using training data, yellow boxes denote important functions computed using these densities, and red boxes denote test data. The CIF and decoding estimates are obtained by applying the functions in the tan boxes to the test data. Dark blue boxes denote the densities used in both encoding and decoding processes. g is a nonlinear function that smooths out the dependence on $\lambda_0$ for easier nonparametric estimation.

Model estimation and validation. Note that the rate functions $\lambda_0$ and $\lambda_c$ appear as products and ratios of densities. These densities are estimated using both second-order kernel density estimators (KDE; Silverman, 1986) and our recently developed nonparametric band-limited maximum likelihood (BLML) estimator (Agarwal, Chen et al., 2015; see also appendix  A). Specifically, the two nonparametric density estimators are defined as follows:

  • Kernel density estimation (KDE): assumes that the pdf is a linear combination of kernel functions centered at the data samples:
    $$\hat{p}_{KDE}(\mathbf{z}) \;=\; \frac{1}{n} \sum_{i=1}^{n} K_{\boldsymbol{\sigma}}(\mathbf{z} - \mathbf{z}_i). \tag{2.17}$$
    Here, the vectors $\mathbf{z}_i$ are the data samples, and $K_{\boldsymbol{\sigma}}$ is a d-dimensional gaussian kernel with bandwidth vector $\boldsymbol{\sigma}$. We have used the isotropic kernel with an optimal rule-of-thumb bandwidth (Silverman, 1986), where the constants are determined by a cross-validation procedure.
  • Band-limited maximum likelihood (BLML) estimation: assumes that the square root of the pdf is band-limited and smooth (Agarwal, Chen et al., 2015). The estimator has the following form:
    $$\hat{p}_{BL}(\mathbf{z}) \;=\; \Big(\sum_{i=1}^{n} c_i\, \operatorname{sinc}\big(\mathbf{f}_c \circ (\mathbf{z} - \mathbf{z}_i)\big)\Big)^2. \tag{2.18}$$
    Here, the vectors $\mathbf{z}_i$ are the data samples, $\mathbf{f}_c$ denotes the vector of cutoff frequencies, $\circ$ denotes the elementwise product, and the coefficient vector $\mathbf{c}$ is computed by solving the following optimization problem (see appendix  A for details):
    $$\hat{\mathbf{c}} \;=\; \arg\max_{\mathbf{c}} \sum_{i=1}^{n} \log \hat{p}_{BL}(\mathbf{z}_i) \quad \text{subject to} \quad \int \hat{p}_{BL}(\mathbf{z})\, d\mathbf{z} = 1. \tag{2.19}$$
    Note that $\hat{p}_{BL}$ appears similar to $\hat{p}_{KDE}$, but it differs in the following ways: (1) $\hat{p}_{BL}$ is a nonparametric ML estimator, while $\hat{p}_{KDE}$ is not optimal in any simple statistical sense; (2) the summation in the $\hat{p}_{BL}$ estimator is squared; and (3) the coefficients in front of the sinc functions in $\hat{p}_{BL}$ are computed by solving an optimization problem that falls out of the ML procedure. The estimator is computed using the algorithm of Agarwal, Chen et al. (2015) defined in appendix  A.
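A minimal sketch of the kernel estimator in equation 2.17 with a Silverman rule-of-thumb bandwidth follows; the product-kernel form and the function names are assumptions of this sketch (the BLML estimator is omitted here, as it requires the convex solver of appendix A).

```python
import numpy as np

def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth per dimension for a gaussian kernel
    (Silverman, 1986): std_d * (4 / ((d + 2) * n)) ** (1 / (d + 4))."""
    n, d = samples.shape
    return samples.std(axis=0, ddof=1) * (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))

def kde(query, samples, bw):
    """Gaussian product-kernel density estimate at one query point:
    p_hat(z) = (1/n) * sum_i prod_d N(z_d; z_id, bw_d^2)."""
    n, d = samples.shape
    diff = (query[None, :] - samples) / bw          # (n, d) scaled offsets
    k = np.exp(-0.5 * np.sum(diff ** 2, axis=1))    # unnormalized kernel heights
    return k.sum() / (n * np.prod(bw) * (2 * np.pi) ** (d / 2))
```

The rule-of-thumb bandwidth is only a starting point; as in the letter, cross-validated selection can then refine it.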

For simplicity, despite the shape differences in place receptive fields, we assume that all neuronal place receptive fields have the same level of smoothness in the x and y directions across the entire population. Namely, we use a single vector of bandwidths (for the KDE estimator) and cutoff frequencies (for the BLML estimator) for the entire neuronal population. To select these hyperparameters, decoding performance on validation data was evaluated as in Kloosterman et al. (2014). Specifically, the combination of bandwidths and cutoff frequencies that produces the best decoding performance on validation data (as measured by the R2 statistic) is selected. As the optimization over cutoff frequencies is a 3D problem, a hill-climbing strategy (Yuret & De La Maza, 1993), as used for the parametric models, finds cutoff frequencies and bandwidths that jointly maximize decoding performance locally. In particular, the first parameter is varied while the second and third parameters are fixed at some initial values. A maximum of decoding performance is found in the first dimension, and the first parameter is then fixed at the corresponding value. The second and third parameters are then varied in turn, and the values providing maximum decoding performance are obtained. The process is repeated until a local maximum in all dimensions is found, namely, until no improvement in decoding performance is observed by varying any of the three parameters in any direction.

As in the GLM estimation, out of the 30.36 minutes of training data, the first 85% is used for model building and the remaining 15% is used to validate hyperparameters. Once the hyperparameters are validated, the model is retrained on the entire 30.36 minutes of training data and assessed on the new 6-minute test data.

2.4  Performance Evaluation

2.4.1  Encoding Models

The goodness-of-fit of the eight estimators of $\lambda_c$ (the four models, each with and without spike history) is assessed by using the time-rescaling theorem and the Kolmogorov-Smirnov (KS) statistic (Brown et al., 2002) on 6 minutes of test data. In particular, we tested whether the rescaled interspike intervals are distributed as those of a Poisson process with rate 1. The similarity between the two cumulative distribution functions is quantified using the KS statistic normalized by its 95% confidence bound, which depends on the number of spikes. Due to this normalization, a value of KS > 1 indicates that the estimated $\lambda_c$ is too extreme (one-sample KS test) to have generated the test data. The closer the normalized KS statistic is to 0, the better the estimate.
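The time-rescaling/KS procedure can be sketched as follows. The discrete integration of the CIF, the retention of the interval before the first spike, and the 1.36/√K normalization (the standard 95% KS bound) are assumptions of this sketch.

```python
import numpy as np

def ks_time_rescaling(spike_bins, lam, dt):
    """Normalized KS statistic via the time-rescaling theorem.

    spike_bins : sorted indices of time bins containing spikes
    lam        : (T,) estimated CIF, one value per bin
    dt         : bin width in seconds

    Rescaled interspike intervals tau_k (integrals of lam between spikes)
    are Exp(1) under a correct model, so u_k = 1 - exp(-tau_k) should be
    Uniform(0, 1). The KS distance to the uniform CDF is divided by the
    standard 95% bound 1.36 / sqrt(K), so values > 1 flag a poor fit.
    """
    cum = np.concatenate([[0.0], np.cumsum(lam) * dt])   # integral of lam at bin edges
    at_spikes = cum[np.asarray(spike_bins) + 1]
    taus = np.diff(np.concatenate([[0.0], at_spikes]))   # rescaled ISIs
    u = np.sort(1.0 - np.exp(-taus))
    k = len(u)
    ks = max(np.max(np.arange(1, k + 1) / k - u),
             np.max(u - np.arange(0, k) / k))
    return ks / (1.36 / np.sqrt(k))
```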

Since the KS statistic may be insufficient to assess non-Poisson neural encoding models (Pillow, 2009; Gerhard & Gerstner, 2010), we also compute the negative log likelihood of the 6-minute test data for all tested models:
$$-\log L \;=\; -\sum_{c=1}^{C} \sum_{t} \Big[\Delta N^c_t\, \log\big(\lambda_c(t)\, \Delta\big) \;-\; \lambda_c(t)\, \Delta\Big], \tag{2.20}$$
where a millisecond-scale bin width $\Delta$ is used. A smaller negative log-likelihood value on test data implies better data predictability. For comparison and display, the negative log likelihoods of all models are plotted against the result of a baseline model with no spike history. In the reported negative log-likelihood statistics, all values have been normalized by 200 and then shifted up by 3.
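For a single cell, the test statistic of equation 2.20 can be computed as below (without the normalization and shift applied for display); the clipping constant is an assumption guarding against a log of zero.

```python
import numpy as np

def neg_log_likelihood(spike_counts, lam, dt):
    """Single-cell discrete-time point-process negative log likelihood:
    -sum_t [ dN_t * log(lam_t * dt) - lam_t * dt ]; summing over cells
    gives the population statistic. Clipping guards against log(0)."""
    lam_dt = np.clip(lam * dt, 1e-12, None)
    return float(-(spike_counts * np.log(lam_dt) - lam_dt).sum())
```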

2.4.2  Decoding Models

The decoding model performance is assessed using the R2 statistic of the position estimates obtained from 6 minutes of test data. In addition, box plots with the median and the 5%, 25%, 75%, and 95% quantiles of the decoding errors for all eight models are presented.

All parametric and nonparametric methods are implemented in custom software written in MATLAB 2014a (MathWorks, Natick, MA).

3  Results

3.1  Hyperparameter Validation

The validated hyperparameters of parametric and nonparametric models are summarized in Table 1. Note that spike history–independent parametric models contain no hyperparameter.

Table 1:
Model Hyperparameters Selected from Validation of Experimental Data.
Model | Without Spike History | With Spike History
 NA  
 NA  
   
   

3.2  Encoding: CIF Estimation

Figure 4 shows the estimated CIFs obtained using GLM methods for a unimodal and a multimodal place cell, respectively. In Figure 4A, the GLM with and without spike history is compared. As can be seen in Figure 4A(ii,iii), both variants capture the position dependence. However, as shown in Figure 4A(v), the CIF estimated without spike history dependence fails to capture the bursting and refractoriness phenomena, as evidenced by the KS statistic (2.01 versus 1.34 with spike history dependence) and the negative log likelihood (3 versus 2.28 with spike history dependence) (see Figure 4A(vi)).

Figure 4:

Performance of GLM methods on a (A) unimodal place cell and (B) multimodal place cell. (i) The rat’s spatial positions when the cells spiked are marked by dots. (ii, iii) Estimated dependence of the CIF on the (x, y) coordinates, using different GLMs. (iv) Estimated spatial parameters for the different models. (v) Estimated CIF as a function of time. (vi) Performance metrics of the different models evaluated on test data: KS statistic (blue bar) and negative log likelihood (yellow bar). Labels denote the respective GLMs without and with spike history.

In Figure 4B, the performance of the quadratic and Zernike GLMs (both with history dependence) on the multimodal place cell data is shown. The quadratic model fails to capture the spatial complexity in the firing patterns of the multimodal place cell. Although the Zernike model did better in capturing the spatial complexity, a KS statistic of 2.2 implies room for improvement in the model fit (see Figure 4B(vi)).

The encoding results are summarized in Table 2. In contrast to the GLMs, the nonparametric encoding models yield better performance for the same multimodal cell. Figure 5B shows that the KS statistics for both nonparametric methods (KS = 0.58 and KS = 0.84) are lower than 1. Similarly, their negative log-likelihood values (0.85 and 0.98) are better than those of the two GLMs (1.83 and 1.43). The estimates of $\lambda_0$ from the two nonparametric methods are shown in Figures 5C and 5E, respectively. First, it can be seen that for both place and grid cells, $\lambda_0$ is larger wherever the neuron spikes more, verifying the “place-like” and “grid-like” structures. Second, one method produces a visibly smoother $\lambda_0$ than the other. Figures 5D and 5F show the estimated history dependence for the two methods, respectively.

Table 2:
Comparison of Performance Metrics (on Test Data) of Parametric and Nonparametric Methods in Encoding and Decoding.
Model | Median KS Statistic | Median Negative Log Likelihood | Median Decoding Error (cm) | R2
Unimodal cells  
    (w/o, w/ history) 4.33, 2.04 3.00, 2.36 23.59, 24.01 0.40, 0.44 
    (w/o, w/ history) 3.80, 1.88 2.90, 2.28 24.21, 28.25 0.27, 0.32 
    (w/o, w/ history) 4.41, 0.99 2.91, 2.29 25.67, 15.39 0.40, 0.80 
    (w/o, w/ history) 4.04, 1.01 2.94, 1.82 22.29, 15.51 0.55, 0.83 
Multimodal  
    (w/o, w/ history) 5.74, 2.50 3.00, 2.00 29.70, 30.81 0.36, 0.34 
    (w/o, w/ history) 4.65, 2.25 2.72, 1.83 21.58, 23.62 0.44, 0.55 
    (w/o, w/ history) 4.86, 1.63 2.70, 2.07 19.07, 14.12 0.64, 0.83 
    (w/o, w/ history) 4.65, 1.30 2.66, 1.22 19.99, 14.42 0.62, 0.83 
All combined  
    (w/o, w/ history) 4.82, 2.36 3.00, 2.15 24.24, 23.63 0.47, 0.50 
    (w/o, w/ history) 4.27, 2.20 2.82, 1.94 20.04, 23.40 0.47, 0.59 
    (w/o, w/ history) 4.58, 1.35 2.80, 2.25 21.75, 12.85 0.64, 0.88 
    (w/o, w/ history) 4.25, 1.09 2.79, 1.52 18.21, 12.39 0.72, 0.89 

Note: The best performance in each column is shown in bold.

Figure 5:

Comparison of the two nonparametric methods and the GLM methods with spike history dependence for a multimodal place cell. (A) Each black dot marks the rat’s spatial location when the cell spiked. (B) Performance statistics for the four models on test data (blue bars, KS statistic; yellow bars, negative log likelihood). (C,E) Estimated $\lambda_0$ for the two nonparametric methods, respectively. (D,F) Estimated history dependence for the two nonparametric methods, respectively. (G) Estimated CIF as a function of time for the two nonparametric encoding methods.

Figure 5G shows the estimated CIF over time from the two nonparametric methods, BLML and KDE. Both methods successfully capture refractoriness and bursting in the neuronal activity. Specifically, refractoriness is captured by small CIF values immediately after each spike at all positions, and bursting is captured by sudden large CIF values near 4 ms after each spike at all positions (see Figures 5D and 5F). The CIF values reached around 4 ms after a spike, even at positions with small spatial rates, are higher than the values reached at longer lags at positions with larger spatial rates. This suggests that even if the spatial component of the CIF is small, the CIF can still reach a significant value if a spike occurred in the past 4 ms. This in turn makes the CIF within 4 ms of any spike almost constant, and therefore nearly independent of position. These findings confirm that bursting in place and grid cells is an internal phenomenon that is little affected by the external stimulus.
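To make the interplay concrete, the multiplicative structure described here can be sketched with a toy CIF: an assumed Gaussian place field modulated by an assumed history gain that vanishes during the refractory period and exceeds 1 near 4 ms. All functional forms and constants below are illustrative, not the fitted model.

```python
import numpy as np

def toy_cif(x, tau_ms):
    """Toy CIF: spatial rate (Hz) times a spike-history gain (all assumed).

    x      : position on a normalized track, in [0, 1]
    tau_ms : time elapsed since the neuron's last spike, in ms
    """
    spatial = 30.0 * np.exp(-0.5 * ((x - 0.5) / 0.1) ** 2)   # Gaussian place field
    # Gain ~0 during the refractory period (< 2 ms), > 1 near 4 ms (bursting),
    # and ~1 long after the last spike.
    refractory = 1.0 / (1.0 + np.exp(-4.0 * (tau_ms - 2.0)))
    bursting = 1.0 + 3.0 * np.exp(-0.5 * (tau_ms - 4.0) ** 2)
    return spatial * refractory * bursting

# Refractoriness suppresses the CIF right after a spike, at every position
print(toy_cif(0.5, 0.5) < toy_cif(0.5, 50.0))    # True
# Near 4 ms, the CIF is high even where the spatial rate is modest
print(toy_cif(0.35, 4.0) > toy_cif(0.35, 50.0))  # True
```

The multiplicative form makes the burst window dominate the spatial term at short lags, which is exactly the near-constancy discussed above.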

Figure 6 shows the recorded and simulated activity (using the BLML model) for the multimodal place cell shown in Figure 5. At both macro- and microlevels, the nonparametric model is able to reproduce the recorded activity: the spatial dependence is captured well, as shown in Figure 6B, along with microlevel bursting, as shown in Figure 6D. Further, the zoomed-in view in Figure 6C shows high variability (Fenton & Muller, 1998; Moser et al., 2008) in the spike counts across different trajectories that pass through the region, even though the CIF is almost constant over this small region. This effect is replicated in the simulated activity because spike history is incorporated. Statistical dependence on the spike history, particularly bursting, makes the neuron either spike three or four times once it fires, or not fire at all, which increases the variance.
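A minimal discrete-time simulation (Bernoulli approximation of the point process, with a made-up constant rate and history gain) reproduces this burst-or-nothing behavior and the resulting overdispersion:

```python
import numpy as np

rng = np.random.default_rng(0)
dt = 0.001                 # 1 ms bins
T = 20000                  # 20 s of simulated time
base_rate = 5.0            # Hz; stands in for the spatial drive, held constant

def history_gain(tau_bins):
    """Multiplicative gain vs. bins elapsed since the last spike (assumed)."""
    if tau_bins <= 2:      # absolute refractoriness for ~2 ms
        return 0.0
    if tau_bins <= 6:      # strong bursting drive ~3-6 ms after a spike
        return 30.0
    return 1.0

spikes = np.zeros(T, dtype=int)
last = -10**6              # no spike yet
for t in range(T):
    lam = base_rate * history_gain(t - last)
    if rng.random() < lam * dt:      # P(spike in bin) ~ lambda * dt
        spikes[t] = 1
        last = t

# Burst-or-nothing firing inflates the variance of windowed spike counts
counts = spikes.reshape(-1, 200).sum(axis=1)   # counts in 200 ms windows
print(counts.var() / counts.mean())            # typically well above 1
```

A history-independent Poisson model with the same mean rate would give a variance-to-mean ratio near 1, which is the extra variance the text refers to.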

Figure 6:

Simulated activity for a multimodal place cell using the BLML model. (A,C) The recorded spiking activity of the multimodal place cell shown in Figure 5 and a zoomed-in view. (B,D) The simulated spiking activity of the same cell and a zoomed-in view. Each red dot marks a spike at the corresponding spatial coordinates.

For the entire population of unimodal place cells, multimodal place cells, and grid cells, models with spike history are significantly better, as measured by KS statistics and predictive log likelihoods, than the corresponding models without spike history for each of the four methods (paired Wilcoxon signed-rank test; KS statistic P-values: 2.5e-10, 2.4e-10, 2.4e-10, and 2.5e-10; log likelihood P-values: 2.39e-10, 1.94e-05, 3.57e-08, and 2.39e-10). Further, BLML with spike history (median KS statistic 1.09, median negative log likelihood 1.52) performs significantly better than KDE with spike history (median KS statistic 1.35, negative log likelihood 2.25), GLM-Quad with spike history (median KS statistic 2.36, negative log likelihood 2.15), and GLM-Zern with spike history (median KS statistic 2.2, negative log likelihood 1.94) (respective P-values for KS statistic and negative log likelihood: P = 2.19e-05 and 2.7e-10, P = 1.03e-09 and 2.5e-10, P = 1.03e-09 and 4.7e-10; paired Wilcoxon signed-rank test). Box plots of the KS statistic and negative log likelihood values for all eight statistical models are shown in Figures 7 and 8. As expected, the improvement in CIF estimation using nonparametric methods is greater for the multimodal place cell and grid cell populations than for the unimodal place cell population (compare the gaps between the median negative log likelihoods with spike history in Table 2).
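The KS statistics used throughout come from the time-rescaling theorem (Brown et al., 2002): integrating the estimated CIF between successive spikes should yield unit-rate exponential intervals if the model is correct. A generic sketch of this computation on a binned CIF follows; note that the normalization here is the plain KS distance, which may differ from the scaled statistic reported in the tables.

```python
import numpy as np

def ks_statistic(cif, spike_bins, dt):
    """KS statistic via the time-rescaling theorem.

    cif        : estimated conditional intensity in each time bin (Hz)
    spike_bins : indices of bins containing spikes
    dt         : bin width (s)
    """
    cum = np.concatenate([[0.0], np.cumsum(cif) * dt])   # integral of the CIF
    taus = np.diff(cum[spike_bins])    # rescaled inter-spike intervals
    z = np.sort(1.0 - np.exp(-taus))   # Uniform(0,1) if the model is correct
    n = z.size
    b = (np.arange(1, n + 1) - 0.5) / n                  # model (uniform) CDF
    return np.max(np.abs(z - b))

# Sanity check: a homogeneous Poisson train scored against its true rate
rng = np.random.default_rng(1)
dt, rate, T = 0.001, 20.0, 600.0
nbins = int(T / dt)
spikes = np.flatnonzero(rng.random(nbins) < rate * dt)
ks = ks_statistic(np.full(nbins, rate), spikes, dt)
print(ks)   # small: the train is consistent with the model
```

A misspecified CIF (for example, ignoring a strong history dependence) inflates this distance, which is why the with-history models score better here.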

Figure 7:

Comparison of KS statistic values. Box plots of KS statistics for the GLM-Quad, GLM-Zern, KDE, and BLML methods (A) without spike history dependence and (B) with spike history dependence, for unimodal place cells, for multimodal place cells and grid cells, and for all cells combined. Symbols above the box plots denote statistical significance for comparisons between the mean KS statistics achieved by BLML and the indicated method, and for comparisons between the mean KS statistics achieved by a method without and with spike history dependence. All P-values are calculated using the Wilcoxon signed-rank test.

Figure 8:

Comparison of predictive negative log-likelihood values. Box plots of negative log-likelihood values for the GLM-Quad, GLM-Zern, KDE, and BLML methods (A) without spike history dependence and (B) with spike history dependence, for unimodal place cells, for multimodal place cells and grid cells, and for all cells combined. Symbols above the box plots denote statistical significance for comparisons between the mean negative log likelihoods derived from BLML and the indicated method, and for comparisons between the mean negative log likelihoods achieved by a method without and with spike history dependence. All P-values are calculated using the Wilcoxon signed-rank test.

3.3  Decoding

The decoding results are summarized in Table 2. In addition, Figure 9 summarizes the decoding errors obtained using 6 minutes of test data, over the populations of unimodal place cells, multimodal place cells, and grid cells, and all combined, using BLML, KDE, GLM-Quad, and GLM-Zern with and without spike history covariates. Overall, models with spike history (respective R2 statistics: 0.89, 0.88, 0.50, 0.59; respective median decoding errors: 12.39, 12.85, 23.62, 23.40 cm) perform significantly better than their counterparts without spike history (respective R2: 0.72, 0.64, 0.47, 0.47; respective median decoding errors: 18.21, 21.75, 24.24, 20.04 cm; respective P-values, signed-rank test: 9.1e-97, 1.7e-135, 7.4e-07, 1.6e-05). Note that although the R2 value is better for GLM-Zern with spike history (0.59 versus 0.47), its median decoding error is slightly worse (23.4 versus 20.04 cm). More important, BLML with spike history performs best among all four methods (respective P-values, signed-rank test on decoding error: 0.02, 5.7e-165, and 4.0e-203 for KDE, GLM-Quad, and GLM-Zern). Similarly, BLML without spike history also performs significantly better than KDE, GLM-Quad, and GLM-Zern in decoding error (respective P-values: 8.6e-27, 9.9e-54, and 1.01e-16). From Table 2, it can be seen that BLML, with or without spike history, also performs best in decoding the rat's position using only unimodal cells or only multimodal cells. Interestingly, GLM-Quad has better performance than GLM-Zern in R2 for unimodal cells, yet worse performance for multimodal cells.
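The comparisons above all rest on Bayesian inversion of an encoding model: candidate positions are scored by the point-process likelihood of the observed ensemble counts, and the posterior peak is the decoded position. A minimal 1D sketch with hypothetical Gaussian place fields, a flat prior, and no history term:

```python
import numpy as np

rng = np.random.default_rng(2)
grid = np.linspace(0.0, 1.0, 201)        # candidate positions on a 1D track
centers = np.linspace(0.1, 0.9, 8)       # assumed place-field centers (8 cells)
rates = 20.0 * np.exp(-0.5 * ((grid[None, :] - centers[:, None]) / 0.08) ** 2) + 0.5

def decode(counts, dt):
    """MAP position given ensemble spike counts in a window of length dt (s)."""
    # Poisson log likelihood summed over cells, with a flat prior over the grid
    loglik = (counts[:, None] * np.log(rates * dt) - rates * dt).sum(axis=0)
    return grid[np.argmax(loglik)]

# Simulate one window of ensemble activity at a known position and decode it
true_x, dt = 0.37, 0.25
true_rates = 20.0 * np.exp(-0.5 * ((true_x - centers) / 0.08) ** 2) + 0.5
counts = rng.poisson(true_rates * dt)
error = abs(decode(counts, dt) - true_x)
print(error)   # decoding error in track units
```

The history-dependent decoders in the text replace the per-cell Poisson term with the full point-process likelihood under the estimated CIF; the grid search over positions is unchanged.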

Figure 9:

Comparison of decoding error performance. Box plots of neural decoding errors for the GLM-Quad, GLM-Zern, KDE, and BLML methods (A) without and (B) with spike history dependence, for unimodal place cells, for multimodal place cells and grid cells, and for all cells combined. Symbols above the box plots denote statistical significance for comparisons between the mean decoding errors derived from BLML and the indicated method, and for comparisons between the mean decoding errors achieved by a method without and with spike history dependence. All P-values are calculated using the Wilcoxon signed-rank test.

As an illustration, Figure 10 shows the rat's trajectory reconstructed using the nonparametric and parametric decoding methods (both with spike history) for a 1 minute period in the test data. The animal's actual position (black) is overlaid on the estimates (green; red), and each subpanel shows the reconstruction for a 20 second period. A video clip demonstrating our decoding performance on experimental data (using the nonparametric decoder) is provided in the supplementary material.

Figure 10:

Comparison of trajectory decoding using the nonparametric and parametric decoding approaches. Reconstructed run trajectory of the rat calculated using the nonparametric (A) and parametric (B) decoders (both with spike history covariates) for a 1 minute period in the test data. Each subpanel shows the reconstruction for a 20 second period.

4  Discussion and Conclusion

4.1  Nonparametric Modeling: Flexibility and Robustness

In this letter, we develop a novel nonparametric approach to study encoding properties and decoding performance in hippocampal place cells and entorhinal grid cells. Our approach is inspired by the nonparametric encoding framework previously proposed in Chen et al. (2012) and Kloosterman et al. (2014). The basic idea of that framework is to convert the encoding or decoding problem into a density estimation problem. In this letter, two nonparametric density estimators are constructed: the BLML estimator (Agarwal, Chen et al., 2015) and the standard KDE. The performance of these two nonparametric models is then compared to parametric GLM models (using either quadratic or Zernike polynomials). The BLML estimator has the best performance among encoding models, whereas KDE is comparable to BLML in decoding performance. Both BLML and KDE outperformed the parametric models in both encoding and decoding.
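The density-estimation idea can be made concrete: by Bayes' rule, the spatial CIF equals the mean firing rate times the ratio of the spike-triggered position density to the occupancy density. The sketch below uses a plain Gaussian KDE for both densities as a stand-in for either estimator; all rates, field parameters, and bandwidths are made up.

```python
import numpy as np

rng = np.random.default_rng(3)
dt = 0.01                                  # 10 ms bins
x_path = rng.uniform(0, 1, 60000)          # positions occupied over ~10 min
true_rate = 15.0 * np.exp(-0.5 * ((x_path - 0.6) / 0.07) ** 2)
spike_x = x_path[rng.random(x_path.size) < true_rate * dt]   # spike positions

def kde(samples, x, bw=0.02):
    """Plain Gaussian kernel density estimate evaluated at points x."""
    z = (np.asarray(x)[:, None] - samples[None, :]) / bw
    return np.exp(-0.5 * z**2).sum(axis=1) / (samples.size * bw * np.sqrt(2 * np.pi))

mu = spike_x.size / (x_path.size * dt)     # overall mean firing rate (Hz)

def cif(x):
    # lambda(x) = mu * p(x | spike) / p(x)
    return mu * kde(spike_x, x) / kde(x_path, x)

print(cif([0.6])[0])   # close to the true 15 Hz peak
print(cif([0.1])[0])   # close to 0, far from the field
```

Swapping the density estimator (for example, a band-limited estimator in place of the Gaussian KDE) changes only the two `kde` calls, which is the modularity the framework exploits.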

The proposed neural encoding models also capture the statistical dependence of current spiking activity on spike history. This allows modeling of neuronal spiking at a millisecond timescale, so both bursting and refractoriness phenomena are captured successfully. Further, our results can also explain the observed extra variance in spiking activity, which cannot be accounted for sufficiently by an inhomogeneous Poisson process (Fenton & Muller, 1998; Moser et al., 2008). This improved the decoding performance of our methods over state-of-the-art decoding methods that ignore spike history dependence and hence incorrectly assume that neuronal spiking is an inhomogeneous Poisson process.

Due to the nonparametric nature of the BLML estimator, the same model structure (with the same cutoff frequency) is able to characterize the variety of receptive field types observed in both place and grid cells. This modeling flexibility is an important step toward a practical solution for neural decoding, as required in brain-machine interfaces. Currently, neural decoding via implantable devices suffers from random drifts in recordings and in neuronal firing over time. Such drifts may include changes in the firing patterns of the cell itself due to changes in neural structure or in the environment (Moser et al., 2008; Frank, Stanley, & Brown, 2004; Muller & Kubie, 1987). Therefore, to achieve robust decoding, the encoding models should be flexible enough to model a variety of receptive fields. As shown in this letter, the nonparametric method constructed from the BLML estimator is statistically and computationally efficient, flexible enough to characterize a variety of receptive field types in the encoding phase, and may remain robust if the receptive field structure changes over time.

More important, our neural decoding results represent the first attempt to decode the rat's trajectory in a 2D space using multimodal place receptive fields (from either place cells or grid cells) together with spike history. The popular parametric GLM framework is insufficient for modeling such multimodal place and grid cells, with or without spike history (Barbieri et al., 2004; Brown et al., 2002). Previous attempts using nonparametric models concentrated mainly on decoding from unsorted multiunit activity in a 1D environment (Kloosterman et al., 2014), without using the information stored in spike history.

4.2  Fourier Representation in Place Receptive Fields

Recently, researchers (Kubie & Fox, 2015; Ormond & McNaughton, 2015) have hypothesized that the complexity of hippocampal place receptive fields results from a Fourier-like summation of multiple grid cell receptive fields (see Figure 11), where a grid cell's receptive field can be closely approximated by a 2D cosine function with a specific spatial frequency. These grid cells are generally arranged in anatomical regions called modules (see Figure 11A), each containing grid cells with similar spatial frequencies (Stensola et al., 2012). The hippocampal place cells receive projections from grid cells across multiple modules (each with a distinct spatial frequency). A toy example of such a sum is shown in Figure 11B. This hypothesis is called the Fourier hypothesis. If the Fourier hypothesis is biologically plausible, it would also explain the superior performance of the nonparametric BLML estimator, which assumes that place and grid fields are supported by sinusoidal functions, as in the Fourier representation. However, the superior performance of the BLML estimator in this investigation may also simply reflect the statistical superiority of the density estimator itself. It remains to be tested which of these explanations is correct.
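A toy numerical version of the Fourier hypothesis (arbitrary frequencies, equal weights, phases aligned at one location) shows how a few 2D cosines with distinct spatial frequencies sum to a localized, place-field-like peak:

```python
import numpy as np

# Grid over a 1 m x 1 m arena
x = np.linspace(0.0, 1.0, 101)
X, Y = np.meshgrid(x, x)

# Sum 2D cosines (one per "module", arbitrary geometrically spaced frequencies),
# all phase-aligned at the intended field center (0.5, 0.5)
field = np.zeros_like(X)
for f in [2.0, 3.0, 4.5, 6.75]:   # cycles per meter
    field += np.cos(2 * np.pi * f * (X - 0.5)) * np.cos(2 * np.pi * f * (Y - 0.5))

i, j = np.unravel_index(np.argmax(field), field.shape)
print(x[j], x[i])   # the summed field peaks at the aligned center (0.5, 0.5)
```

Away from the aligned center, the incommensurate frequencies interfere destructively, so the sum stays small; this is the mechanism sketched in Figure 11B.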

Figure 11:

The Fourier hypothesis. (A) A lateral view of rat brain illustrating medial entorhinal cortex projection to the hippocampus. The grid cells are arranged in modules within the medial entorhinal cortex. Each module contains grid cells that have receptive fields with similar spatial frequency. The hippocampal place cells receive synaptic projections from grid cells in different modules. (B) The anatomical connections suggest that the hippocampal place cells linearly sum up the inputs from grid fields to generate a place field. (Figure courtesy of Kubie & Fox, 2015, reprinted by permission of PNAS.)

4.3  Extension and Future Work

Several extensions of our work can be considered in future investigations. First, in neural encoding, we can in principle incorporate other informative neural covariates, such as the power or phase of the local field potential (Agarwal & Sarma, 2010, 2012a; Agarwal et al., 2014). The expectation is that this will further increase the performance of the encoding model. In addition, an efficient procedure for bandwidth parameter selection is important for good performance; details on this investigation can be found in our previous work (Agarwal, Chen et al., 2015). Second, we can formulate the neural decoding problem within a state-space framework (Brown et al., 1998; Chen, 2013, 2015) by considering a smooth temporal prior, which should further improve decoding accuracy. Finally, our methodology is not limited to rodent hippocampal place cells or grid cells; it can be extended to other modalities, such as the retina, lateral geniculate nucleus, primary visual cortex, and primary motor cortex.

Appendix A:  Algorithm for the BLML Estimator

To make this letter self-contained, we present a short introduction to the BLML estimator and its algorithm. Detailed results are provided in Agarwal, Chen et al. (2015).

A.1  Formulation of the BLML Estimator

Consider a pdf of a (dimensionless) random variable whose Fourier transform has bounded support (unit: radian); the half-width of this support is the cutoff frequency of the pdf, and fc denotes the corresponding cutoff frequency in Hz. Then,
formula
A.1
Define,
formula
A.2
and
formula
A.3
where the remaining quantities are defined accordingly. Note that both spaces are Hilbert spaces under their respective inner products, with the corresponding norm defined on each.

A.2  The Likelihood Function for Band-Limited PDFs

Now consider a random variable with an unknown pdf and its n independent realizations. The likelihood of observing these realizations is
formula
A.4a
formula
A.4b
Defining
formula
A.5
yields
formula
A.6
Further, consider the element that maximizes the likelihood function (see equation A.6):
formula
A.7
Then the BLML estimator is
formula
A.8
Theorem 1
Consider n independent samples of an unknown band-limited pdf with assumed cutoff frequency fc (unit: Hz). Then the BLML estimator of the pdf is given as
formula
A.9
where and
formula
A.10
Here ().

A.3  Choice of Optimal Orthant

Agarwal, Chen et al. (2015) showed that the asymptotic solution of equation A.10 lies in the all-positive orthant if the pdf is band-limited and strictly positive. Therefore, the optimal orthant vector can be chosen trivially, and equation A.10 is then solved within that orthant to compute the estimator. It is important to note that the asymptotic properties begin to manifest at moderate sample sizes, and strict positivity is obeyed by many pdfs encountered in nature. Therefore, the two conditions are not very restrictive.

A.4  Algorithm

Consider a function such that
formula
A.11
where and is the sampling frequency. It is easy to verify that is also a pdf and . Now consider samples (where denotes an integer scale). Clearly these samples are related to as
formula
A.12
Further consider ’s computed by binning from independent and identically distributed (i.i.d.) observations of random variable from , as
formula
A.13
where the floor function returns the greatest integer not exceeding its input. Note that the binned values are computed from i.i.d. observations of the underlying random variable. By Nyquist's sampling theorem (Marks, 1991), the estimate obtained from these binned samples should converge to the discretized pdf; we call the result the binned BLML estimator. Moreover, if fs is chosen appropriately, the binned estimator inherits the convergence rate of the original estimator.
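The binning construction can be sketched as follows, with an arbitrary pdf and a sampling frequency chosen in the spirit of the Nyquist argument (all constants are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
samples = rng.normal(0.0, 1.0, 10000)   # i.i.d. draws from the (here known) pdf

fs = 8.0   # sampling frequency: bins per unit length, chosen >= 2 * fc
edges = np.arange(np.floor(samples.min() * fs), np.ceil(samples.max() * fs) + 1) / fs
counts, _ = np.histogram(samples, bins=edges)
p_k = counts / counts.sum()             # empirical bin masses

mids = (edges[:-1] + edges[1:]) / 2
density_at_bins = p_k * fs              # masses / bin width -> density samples
err = np.abs(density_at_bins - np.exp(-mids**2 / 2) / np.sqrt(2 * np.pi)).max()
print(err)   # shrinks as n grows
```

The bin masses play the role of the discretized samples in equation A.13; a band-limited estimator fit to them recovers the continuous pdf between bin centers.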

A.5  Implementation and Computational Complexity

First consider the following theorem (Agarwal, Chen et al., 2015).

Theorem 2.

Consider n i.i.d. observations of a random variable x whose pdf has a tail characterized by an exponent r. Then the number of bins B required grows slowly with n for large n.

Let denote the index for samples and denote the index for bins. The implementation of the algorithm consists of the following steps:

  1. Compute from . This step has a computational complexity of .

  2. Sort the samples and construct the required vectors and matrices. Note that the resulting matrix is block-Toeplitz (a Toeplitz arrangement of blocks, each of which is itself Toeplitz). The computational complexity of this step follows from this structure (Akaike, 1973).

  3. Use convex optimization algorithms to solve the resulting problem. Newton's method should take a finite number of iterations to reach a given tolerance, since the cost function is self-concordant (Boyd & Vandenberghe, 2004). Therefore, the computational complexity of the optimization is the same as that of one iteration, which in turn is the complexity of calculating
    formula
    As the matrix has a block-Toeplitz structure, the Akaike algorithm (Akaike, 1973) can be used to evaluate each iteration of Newton's method efficiently. Note that simulations show that the matrix can be approximated accurately (to machine accuracy) by a low-rank matrix; therefore, the inversion can also be performed cheaply. Further, in some cases one may end up with a large B (e.g., if the true pdf has heavy tails), so that storing the Hessian matrix becomes expensive. In such cases, a quasi-Newton or gradient-descent optimization algorithm can be used, which still computes the estimator fairly quickly.
  4. Evaluate the estimate at l given points. This step has a computational complexity of .

In summary, the total computational complexity of the algorithm is the sum of the complexities of the four steps above. Substituting the bound on B from Theorem 2 yields the total computational complexity as a function of n, fs, and l, where r describes the tail of the pdf (the Gaussian pdf has an infinite value of r) and l is the number of points at which the estimate is evaluated.
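The payoff of the Toeplitz structure can be illustrated in one dimension, where the linear system in each Newton step admits a Levinson-type O(B^2) solve; the sinc-kernel matrix below is an arbitrary stand-in for the Gram matrices that shift-invariant, band-limited bases produce:

```python
import numpy as np
from scipy.linalg import solve_toeplitz, toeplitz

B = 512
# First column of a symmetric Toeplitz matrix built from a sinc kernel,
# plus a small diagonal ridge to keep it safely positive definite
col = np.sinc(0.5 * np.arange(B))
col[0] += 0.1

b = np.ones(B)
x_fast = solve_toeplitz(col, b)            # Levinson recursion, O(B^2)
x_ref = np.linalg.solve(toeplitz(col), b)  # generic dense solve, O(B^3)
print(np.allclose(x_fast, x_ref))          # True
```

The block-Toeplitz case used in the actual algorithm generalizes this idea to 2D, with Toeplitz blocks arranged in a Toeplitz pattern.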

Appendix B:  Change of Variables

Let , and , where is a one-to-one nonlinear function with inverse image .

When the transform is an increasing function, the cumulative distribution function (cdf) of the transformed variable is
formula
B.1
When the transform is a decreasing function, the cdf of the transformed variable is
formula
B.2
When equations B.1 and B.2 are combined, the pdf of the transformed variable is
formula
B.3
Considering the ratio between a conditional pdf and a marginal pdf, we have
formula
B.4
where it is assumed that and . Similarly,
formula
B.5
where . Therefore, the density ratio is invariant to the one-to-one nonlinear transform.
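This invariance is easy to check numerically: transform samples through a monotone nonlinearity, estimate densities on matched bins, and compare the ratio before and after the transform (toy distributions, with one density standing in for the conditional and one for the marginal):

```python
import numpy as np

rng = np.random.default_rng(5)

def hist_density(s, edges):
    """Histogram-based density estimate on the given bin edges."""
    counts, _ = np.histogram(s, bins=edges)
    return counts / (counts.sum() * np.diff(edges))

# Two densities of x, standing in for the conditional and the marginal
x1 = rng.normal(0.3, 0.2, 200000)
x2 = rng.normal(0.0, 0.5, 200000)

g = np.tanh                          # one-to-one increasing transform
edges_x = np.linspace(-0.2, 0.6, 9)
edges_y = g(edges_x)                 # matched bins in the transformed space

r_x = hist_density(x1, edges_x) / hist_density(x2, edges_x)
r_y = hist_density(g(x1), edges_y) / hist_density(g(x2), edges_y)
print(np.max(np.abs(r_x - r_y)))     # ~ 0: the ratio survives the transform
```

The Jacobian factor in equation B.3 multiplies numerator and denominator alike, so it cancels exactly in the ratio, which is what the numerical check confirms.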

Acknowledgments

Z.C. was supported by an NSF-CRCNS award (no. 1307645) from the U.S. National Science Foundation. M.A.W. was supported by NIH grants R01-MH06197 and TR01-GM10498 and by ONR-MURI grant N00014-10-1-0936. S.V.S. was supported by the U.S. National Science Foundation CAREER Award 1055560, the Burroughs Wellcome Fund CASI Award 1007274, and NSF EFRI-M3C.

References

Agarwal, G., Stevenson, I. H., Berényi, A., Mizuseki, K., Buzsáki, G., & Sommer, F. T. (2014). Spatially distributed local fields in the hippocampus encode rat position. Science, 344(6184), 626–630.

Agarwal, R., Chen, Z., & Sarma, S. V. (2015). Nonparametric estimation of band-limited probability density functions. arXiv:1503.06236v1. http://arxiv.org/pdf/1503.06236v1.pdf

Agarwal, R., Sacre, P., & Sarma, S. V. (2015). Mutual dependence: A novel method for computing dependencies between random variables. arXiv preprint arXiv:1506.00673.

Agarwal, R., & Sarma, S. V. (2010). Restoring the basal ganglia in Parkinson's disease to normal via multi-input phase-shifted deep brain stimulation. In Proc. of the Conference of the IEEE Engineering in Medicine and Biology Society (pp. 1539–1542). Piscataway, NJ: IEEE.

Agarwal, R., & Sarma, S. V. (2011). An analytical study of relay neuron's reliability: Dependence on input and model parameters. In Proc. of the Conference of the IEEE Engineering in Medicine and Biology Society (pp. 2426–2429). Piscataway, NJ: IEEE.

Agarwal, R., & Sarma, S. V. (2012a). The effects of DBS patterns on basal ganglia activity and thalamic relay. Journal of Computational Neuroscience, 33(1), 151–167.

Agarwal, R., & Sarma, S. V. (2012b). Performance limitations of relay neurons. PLoS Computational Biology, 8(8), e1002626.

Agarwal, R., Sarma, S. V., Thakor, N. V., Schieber, M. H., & Massaquoi, S. (2015). Sensorimotor Gaussian fields integrate visual and motor information in premotor neurons. Journal of Neuroscience, 35(25), 9508–9525.

Akaike, H. (1973). Block Toeplitz matrix inversion. SIAM J. Appl. Math., 24(2), 234–241.

Barbieri, R., Frank, L., Nguyen, D., Quirk, M., Solo, V., Wilson, M., & Brown, E. (2004). Dynamic analyses of information encoding in neural ensembles. Neural Computation, 16, 277–307.

Barbieri, R., Quirk, M. C., Frank, L. M., Wilson, M. A., & Brown, E. N. (2001). Construction and analysis of non-Poisson stimulus-response models of neural spiking activity. Journal of Neuroscience Methods, 105(1), 25–37.

Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press.

Brown, E., Barbieri, R., Ventura, V., Kass, R., & Frank, L. (2002). The time-rescaling theorem and its application to neural spike train data analysis. Neural Comput., 14(2), 325–346.

Brown, E., Frank, L., Tang, D., Quirk, M., & Wilson, M. (1998). A statistical paradigm for neural spike train decoding applied to position prediction from ensemble firing patterns of rat hippocampal place cells. Journal of Neuroscience, 18(18), 7411–7425.

Chen, Z. (2013). An overview of Bayesian methods for neural spike train analysis. Computational Intelligence and Neuroscience, 2013, 251905.

Chen, Z. (Ed.). (2015). Advanced state space methods in neural and clinical data. Cambridge: Cambridge University Press.

Chen, Z., Kloosterman, F., Layton, S., & Wilson, M. A. (2012). Transductive neural decoding for unsorted neuronal spikes of rat hippocampus. In Proc. of the Conference of the IEEE Engineering in Medicine and Biology Society (pp. 1310–1313). Piscataway, NJ: IEEE.

Coleman, T. P., & Sarma, S. V. (2010). A computationally efficient method for nonparametric modeling of neural spiking activity with point processes. Neural Comput., 22(8), 2002–2030.

Cox, D., & Isham, V. (2000). Point processes. London: Chapman and Hall/CRC.

Fenton, A. A., & Muller, R. U. (1998). Place cell discharge is extremely variable during individual passes of the rat through the firing field. Proceedings of the National Academy of Sciences USA, 95(6), 3182–3187.

Frank, L. M., Stanley, G. B., & Brown, E. N. (2004). Hippocampal plasticity across multiple days of exposure to novel environments. Journal of Neuroscience, 24(35), 7681–7689.

Gerhard, F., & Gerstner, W. (2010). Rescaling, thinning or complementing? On goodness-of-fit procedures for point process models and generalized linear models. In J. D. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. S. Zemel, & A. Culotta (Eds.), Advances in neural information processing systems, 23. Cambridge, MA: MIT Press.

Hasselmo, M. E. (2008). Grid cell mechanisms and function: Contributions of entorhinal persistent spiking and phase resetting. Hippocampus, 18(12), 1213–1229.

Kahn, K., Schieber, M., Thakor, N., & Sarma, S. V. (2011). Neuron selection for decoding dexterous finger movements. In Proc. of the Conference of the IEEE Engineering in Medicine and Biology Society. Piscataway, NJ: IEEE.

Kass, R., & Ventura, V. (2001). A spike train probability model. Neural Comput., 13, 1713–1720.

Kloosterman, F., Layton, S., Chen, Z., & Wilson, M. (2014). Bayesian decoding using unsorted spikes in the rat hippocampus. J. Neurophysiol., 111(1), 217–227.

Kubie, J. L., & Fox, S. E. (2015). Do the spatial frequencies of grid cells mold the firing fields of place cells? Proceedings of the National Academy of Sciences USA, 112(13), 3860–3861.

Lehmann, E. L., & Scheffé, H. (1950). Completeness, similar regions, and unbiased estimation: Part I. Sankhyā: The Indian Journal of Statistics, 10, 305–340.

Marks, R. J., II (1991). Introduction to Shannon sampling and interpolation theory. New York: Springer-Verlag.

McCullagh, P. (1984). Generalized linear models. European Journal of Operational Research, 16(3), 285–292.

Moser, E. I., Kropff, E., & Moser, M.-B. (2008). Place cells, grid cells, and the brain's spatial representation system. Annu. Rev. Neurosci., 31, 69–89.

Muller, R. U., & Kubie, J. L. (1987). The effects of changes in the environment on the spatial firing of hippocampal complex-spike cells. Journal of Neuroscience, 7(7), 1951–1968.

O'Keefe, J. (1979). A review of the hippocampal place cells. Progress in Neurobiology, 13(4), 419–439.

Ormond, J., & McNaughton, B. L. (2015). Place field expansion after focal MEC inactivations is consistent with loss of Fourier components and path integrator gain reduction. Proceedings of the National Academy of Sciences USA, 112(13), 4116–4121.

Pillow, J. W. (2009). Time-rescaling methods for the estimation and assessment of non-Poisson neural encoding models. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, & A. Culotta (Eds.), Advances in neural information processing systems, 22 (pp. 1473–1481). Cambridge, MA: MIT Press.

Pillow, J. W., Shlens, J., Paninski, L., Sher, A., Litke, A. M., Chichilnisky, E. J., & Simoncelli, E. P. (2008). Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature, 454(7207), 995–999.

Rad, K. R., & Paninski, L. (2010). Efficient, adaptive estimation of two-dimensional firing rate surfaces via Gaussian process methods. Network, 21(3–4), 142–168.

Santaniello, S., Montgomery, E. B., Jr., Gale, J. T., & Sarma, S. V. (2012). Non-stationary discharge patterns in motor cortex under subthalamic nucleus deep brain stimulation. Front. Integr. Neurosci., 6(35), 1–13.

Sarma, S., Cheng, M., Eden, U., Williams, Z., Brown, E., & Eskandar, E. (2012). The effects of cues on neurons in the basal ganglia in Parkinson's disease. Front. Integr. Neurosci., 6(40).

Sarma, S. V., Eden, U. T., Cheng, M. L., Williams, Z., Hu, R., Eskandar, E. N., & Brown, E. N. (2010). Using point process models to compare neural spiking activity in the subthalamic nucleus of Parkinson's patients and a healthy primate. IEEE Trans. Biomed. Engr., 57(6), 1297–1305.

Sharpee, T., Rust, N. C., & Bialek, W. (2004). Analyzing neural responses to natural signals: Maximally informative dimensions. Neural Computation, 16, 223–250.

Silverman, B. W. (1986). Density estimation for statistics and data analysis. London: CRC Press.

Stensola, H., Stensola, T., Solstad, T., Frøland, K., Moser, M.-B., & Moser, E. I. (2012). The entorhinal grid map is discretized. Nature, 492(7427), 72–78.

Theis, L., Chagas, A. M., Arnstein, D., Schwarz, C., & Bethge, M. (2013). Beyond GLMs: A generative mixture modeling approach to neural system identification. PLoS Computational Biology, 9(11), e1003356.

Truccolo, W., Eden, U. T., Fellows, M. R., Donoghue, J. P., & Brown, E. N. (2005). A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. Journal of Neurophysiology, 93(2), 1074–1089.

Williamson, R. S., Sahani, M., & Pillow, J. W. (2015). The equivalence of information-theoretic and likelihood-based methods for neural dimensionality reduction. PLoS Computational Biology, 11(4), e1004141.

Yuret, D., & De La Maza, M. (1993). Dynamic hill climbing: Overcoming the limitations of optimization techniques. In The Second Turkish Symposium on Artificial Intelligence and Neural Networks (pp. 208–212). N.p.

Zhang, K., Ginzburg, I., McNaughton, B., & Sejnowski, T. (1998). Interpreting neuronal population activity by reconstruction: Unified framework with application to hippocampal place cells. Journal of Neurophysiology, 79, 1017–1044.