## Abstract

Firing activity from neural ensembles in rat hippocampus has been previously used to determine an animal's position in an open environment and separately to predict future behavioral decisions. However, a unified statistical procedure to combine information about position and behavior in environments with complex topological features from ensemble hippocampal activity has yet to be described. Here we present a two-stage computational framework that uses point process filters to simultaneously estimate the animal's location and predict future behavior from ensemble neural spiking activity. First, in the encoding stage, we linearized a two-dimensional T-maze, and used spline-based generalized linear models to characterize the place-field structure of different neurons. All of these neurons displayed highly specific position-dependent firing, which frequently had several peaks at multiple locations along the maze. When the rat was at the stem of the T-maze, the firing activity of several of these neurons also varied significantly as a function of the direction it would turn at the decision point, as detected by ANOVA. Second, in the decoding stage, we developed a state-space model for the animal's movement along a T-maze and used point process filters to accurately reconstruct both the location of the animal and the probability of the next decision. The filter yielded exact full posterior densities that were highly nongaussian and often multimodal. Our computational framework provides a reliable approach for characterizing and extracting information from ensembles of neurons with spatially specific context or task-dependent firing activity.

## 1. Introduction

The firing activity of place cells in the CA1 region of rat hippocampus has been shown to correlate with the location of the animal while exploring its environment. Ensembles of hippocampal neurons have been shown to maintain robust representations of the animal's position (O'Keefe & Dostrovsky, 1971; O'Keefe, 1979; Wilson & McNaughton, 1993), and decoding analyses have successfully been used to reconstruct the trajectory of a rat running through either open environments or linear tracks using ensemble spiking activity, based on Bayesian statistical methods (Barbieri et al., 2004; Brown, Frank, Tang, Quirk, & Wilson, 1998; Zhang, Ginzburg, McNaughton, & Sejnowski, 1998). Point process filters have been applied with great success to address problems of estimating a continuous state variable, such as the animal's position, from neural spike train observations (Barbieri et al., 2004; Brown et al., 1998). Often these point process filters are based on gaussian approximations to the posterior density, allowing explicit formulas to be derived for updating the posterior mean and variance (Brown et al., 1998; Eden, Frank, Barbieri, Solo, & Brown, 2004). Gaussian approximate filters have been used effectively for decoding in open environments or on linear tracks, but they may be inappropriate for decoding on environments with more complicated topologies, and specifically on T-shaped mazes, since the posterior can often be nongaussian and multimodal.

Certain spatially specific hippocampal neurons have also been shown to fire differently during memory tasks as a function of the animal's recent past or future behavior (Lee, Griffin, Zilli, Eichenbaum, & Hasselmo, 2006; Griffin, Eichenbaum, & Hasselmo, 2007; Smith & Mizumori, 2006; Hasselmo & Eichenbaum, 2005; Ferbinteanu & Shapiro, 2003; Frank, Brown, & Wilson, 2000). These cells have been termed splitter cells (Wood, Dudchenko, Robitsek, & Eichenbaum, 2000). For example, in rats trained to perform a continuous alternation task on a T-shaped maze, several neurons were recorded that would fire intensely along the common stem of the track preceding turns to one direction, while firing less intensely or not at all preceding turns in the opposite direction. This suggests that these neurons encode information about spatial location and behavioral context simultaneously.

This article focuses on understanding how ensembles represent information about location, history, and context and use this information to make decisions about future behavior. Constructing appropriate models for the firing activity of these neurons presents a number of statistical challenges. First, it is not obvious how to most appropriately incorporate information about position and future decisions in a single neural model. Poorly chosen models may not accurately capture the neural representation of these signals. Second, since many of the neurons have multimodal place fields that are active at multiple locations in the environment, it is necessary to develop a class of models that is sufficiently flexible to capture complex functional relations. Additionally, there are dual statistical challenges associated with using the ensemble activity from this area to reconstruct, or decode, the animal's movement trajectory and predict its future decisions. First, how do we build a decoding algorithm that can simultaneously reconstruct the animal's movement trajectory and predict its future movement? Second, we expect the posterior distribution for the animal's position to be nongaussian and have multiple modes associated with turning to each direction. Therefore, established methods based on gaussian approximations to the posterior will not be applicable. Exact numerical methods of computing the posterior distribution such as numerical integration and particle filtering become computationally intractable at higher dimensions, so simplifying assumptions are necessary to compute estimates accurately and efficiently.

To address these challenges, a simple point process framework was developed to model the spiking activity of an ensemble of neurons from the CA1 region of hippocampus and reconstruct the animal's position from the combined activity of this ensemble. By linearizing the maze and assuming that the neurons predominantly represented distance along the track rather than side-to-side position on the track, we were able to greatly reduce the computational complexity of the problem and compute the optimal filtering distributions at high resolution numerically. In practice, we first characterized the firing activity of each of the neurons in the ensemble. Next, we examined how information from the joint activity of an ensemble of neurons could be combined to reconstruct the animal's trajectory through the maze and predict future decisions. We constructed point process filters to iteratively calculate the full posterior probability for the animal's position at each time step, given the complete firing activity until the current time.

The remainder of this article is organized as follows. Section 2 describes the experimental methods used to train the animals and collect electrophysiological data. Sections 3 and 4 develop the neural models and estimation algorithms used in the encoding and decoding analyses, as well as statistical measures of goodness of fit. Section 5 presents the results of these analyses on the ensemble hippocampal activity. Finally, section 6 provides a discussion of the potential advantages of this framework and how it can be extended to modeling position and context-dependent firing in general maze environments.

## 2. Experimental Methods

In order to provide an illustration of the use of these analysis techniques, we have applied the modeling and estimation framework to the analysis of an example set of electrophysiological data. The experimental methods are described in detail in Lee et al. (2006). Briefly, one male Long Evans rat was implanted with a movable array of 12 recording tetrodes (four 12.7 micron diameter nichrome wires) and trained to perform a continuous alternation task conducted on a modified T-maze (116 × 107 cm long, 10 cm wide) that contained return arms (112 cm long, 10 cm wide) connecting the food wells to the start of the stem. Each reward zone contained a small plastic disc appropriately baited with chocolate sprinkles. The animal would start at the food well and walk down the return arm toward the base of the stem. It would turn onto the stem and run toward the T-intersection, where a choice would need to be made. A correct choice is to alternate between left and right on successive turns. If a correct turn is made, the animal receives a reward and continues on to the next trial. If an error is made, the animal is not given a reward and must make another lap to make a correct choice. All animal procedures and surgery were in accordance with National Institutes of Health and Boston University Animal Care and Use Committee guidelines.

Neural signals were amplified and bandpass filtered 0.3 kHz to 6 kHz (Neuralynx, Tucson, AZ), and threshold crossings were recorded. Single units were extracted from tetrode recordings using standard cluster cutting techniques (Offline Sorter, Plexon, Inc.). Principal component and energy waveform measurements were used to increase unit isolation quality. Recordings were made from multiple single CA1 neurons, and position data were recorded at 30 Hz (frames/sec). An array of diodes positioned on the head stage of the animal allowed a video camera to track the position of the animal during behavior. An ensemble of 47 neurons that fired more than two spikes over the entire experimental session was used for this analysis. The recorded data for one of the neurons in the ensemble is shown in Figure 1. The movement trajectory of the rat is shown in gray, and the locations at which this neuron fired are shown in black. There are 69 trials in this session, and 3 of these trials were error trials.

Position and spiking data from resting periods before and after each running session were removed, and linear interpolation was used to determine the coordinates of a few missing location points that occurred when the diode array on the headstage was occluded.

## 3. Encoding Methods

### 3.1. Two-Dimensional Memory/Prediction Tasks and Linearization.

We began by linearizing the two-dimensional (2D) T-shaped maze into a one-dimensional (1D) track. Since the movement of the rat is restricted on the linear track, simple state models are more appropriate in 1D. At the same time, treating the problem in 1D decreases the complexity of the problem, simplifies the calculations, and allows us to address questions about how the brain represents position in linear environments.

Figure 2 illustrates the methodology that was used to linearize the track. Figure 2A shows the original shape of the track in two dimensions, with individual sections shaded to show where they appear in the linearization. Figure 2B shows the resulting linearized track. We rotated the light-gray-with-horizontal-stripes and dark-gray-with-vertical-stripes sections, so they became respective extensions of the dark-gray-with-horizontal-stripes and light-gray-with-vertical-stripes sections of the horizontal part of the maze. When the animal was on the center stem of the maze, the region to which its position was mapped was determined by the direction it would turn when it next reached the decision point. We mapped the points preceding left turns to the light gray region and those preceding right turns to the dark gray region.

This linearization represents a transformation of the actual 2D coordinate position of the animal, (*x*, *y*), to a single coordinate, *x _{l}*. At each point in time, the animal's true position can be transformed to the linearized coordinate system, and its trajectory through the maze becomes a trajectory in 1D. The 1D coordinate itself indicates the total distance, in units of pixels, from the connection point at the lower end of the T-maze, with negative numbers indicating trajectories that include a left turn and positive numbers indicating ones that include a right turn. For example, a trajectory where the animal starts at the connection point and ends at the turn-decision point maps to either the interval 0 to −245 or 0 to 245 in 1D. Zero to −245 corresponds to the movement trajectory preceding left turns, and 0 to 245 corresponds to that preceding right turns. The connection point of the original T-maze was mapped to three distinct points in the linearized track (marked by dots in Figure 2B): the left end, right end, and midpoint of the linearized maze (corresponding to +/−753 and origin of the

*x*-axis). Respectively, the points represent the ending location for a left trial or a right trial or the starting point for a next trial. Any point along the maze away from the neighborhood of this connection point is topologically equivalent to a line, in that the animal can move only forward or backward in one dimension at such a point; however, at the connection point, the animal can travel in four possible directions. In the topology literature, a space with this structure is called a figure eight space or the wedge of two circles (Lee, 2008) because it is topologically equivalent to two circles that are joined at a single point.

Although this transformation is not exactly invertible, we can define an approximate inverse transformation that maps back the linearized position of the animal onto a set of central paths along the T-maze. In this way, we can compare our estimated position of the animal with its true position in the original 2D maze.

### 3.2. Modeling of the Conditional Intensity Function.

A set of spiking observations can be described in terms of the number of spikes observed over any time interval. We define a counting process *N*(*t*) on an observation interval (0, *T*], as the total number of spikes fired by a single neuron from time 0 up to and including time *t*, for any *t* < *T*. *N*(*t*) is then a continuous time specification of a stochastic point process (Cox & Isham, 1980). We can also consider point processes in discrete time by partitioning the observation interval (0, *T*] evenly into *K* subintervals (*t*_{k−1}, *t _{k}*]

^{K}

_{k=1}, for a large integer

*K*. A spike train can thus be expressed in terms of discrete increments Δ

*N*=

_{k}*N*(

*t*) −

_{k}*N*(

*t*

_{k−1}), which count the number of spikes in a single subinterval.

It is known that any point process can be completely characterized by its conditional intensity function (CIF) (Daley & Vere-Jones, 2003). For this analysis, two intensity models were constructed for the firing activity of each neuron. For the first model, an inhomogeneous Poisson model was used to capture the spatial component for the neuron's firing activity, independent of its past firing history. Thus, the firing rate was related solely to the position of the animal along the linearized track. Since we observed that the firing activity was high at multiple locations along the linearized maze, low-dimensional parametric functions such as a gaussian-shaped curve would not accurately capture the spatial structure of the place field. On the other hand, spline curves that are flexible and piecewise continuous (Hearn & Baker, 1996) have been successfully used to characterize firing intensity as a function of time or location (Frank, Eden, Solo, Wilson, & Brown, 2002; Kass, Ventura, & Cai, 2003). We used a spline model of the same form as in Frank et al. (2002) to characterize the firing rate as a function as the linearized position.

*x*(

*t*) the linear position of the animal at time

*t*. Under this model, we assume that the firing activity in each neuron follows an inhomogeneous Poisson process with rate function where

*θ*

_{i}for

*i*= 1, …,

*P*are a collection of model parameters that differ for each neuron and define the shape of the conditional intensity as a function of the linearized position.

*g*(.) for

_{i}*i*= 1, …,

*P*are basis functions for cardinal splines, defined on the position data

*x*(

*t*). Figure 3 shows an example for a cardinal spline fit. Specific details of the spline basis functions used are described in Frank et al. (2002).

*Q*is a parameter that determines how long history effects persist. The value of

*Q*will be determined by data, as described in section 3.4. Δ

*N*

_{t–j}is the number of spikes this neuron previously fired in the time interval [

*t*−

*j*× Δ

*t*,

*t*−

*j*× Δ

*t*+ Δ

*t*), where Δ

_{t}is the difference between two consecutively recorded time steps in the experiment. When

*j*varies from 1 to

*Q*, this model incorporates the spiking history for this neuron from the previous time steps to

*Q*time steps before the current time in our model (Truccolo, Eden, Fellows, Donoghue, & Brown, 2005). The parameters for this model include

*θ*

_{i}for

*i*= 1, …,

*P*, which determines the magnitude of the CIF in the absence of the modulatory history effect, and

*γ*

_{j}for

*j*= 1, …,

*Q*, which defines the magnitude of the effects of past spiking activity at a given time lag. Positive coefficient values suggest excitatory effects while negative values suggest inhibition.

### 3.3. Likelihood Function and GLM Models.

We denote the parameter vector to be estimated in equation 3.1 as β_{1} = [θ_{1}, …, θ_{R}], and that in equation 3.2 as β_{2} = [θ_{1}, …, θ_{R}, γ_{1}, …γ_{Q}]. We will use the notation λ_{m}(*t _{k}* ∣ β

_{m}),

*m*= 1, 2 for the conditional intensity function parameterized by β

_{m}and evaluated at time step

*t*. Let

_{k}*N*

_{1:K}= (Δ

*N*)

_{k}_{k=1,…,K}. The log-likelihood function for parameter vector β

_{1}= [θ

_{1}, …, θ

_{R}] is given by (Bremaud, 1981; Daley & Vere-Jones, 2003; Brown, Barbieri, Eden, & Frank, 2003; Eden et al., 2004; Trucollo et al., 2005).

Let be a parameter vector at which log *L*_{1}(β_{1} ∣ *N*_{1:K}) attains its maximum as a function of β_{1}, with the observed spike train data, *N*_{1:K}, held fixed. Then is called the maximum likelihood estimator (MLE) of β_{1}. Similarly, the that maximizes log *L*_{2}(β_{2} ∣ *N*_{1:K}) is the MLE of β_{2}. Intuitively, the MLE is an appropriate estimator since it represents the parameter vector for which the observed spike train data are most likely. MLEs also possess some of the mathematically desirable optimality properties. In large samples under certain regularity conditions, the MLE has approximately a (multivariate) normal distribution with mean equal to the true parameter value and covariance matrix given by the inverse of the Fisher information matrix, allowing confidence regions for the point estimator to be rapidly computed (Casella & Berger, 2001).

Our goal was to find and in order to fit the conditional intensity functions of the forms shown in equations 3.1 and 3.2, respectively. Both of these models fall into the class of generalized linear models (GLMs) (McCullagh & Nelder, 1989). The GLM is an extension of the general linear regression model in which the variable being predicted, in this case spike times, need not be gaussian. GLMs have a number of specific theoretical and computational advantages. It is easily shown that for a GLM, the log-likelihood function is concave as a function of the model parameters. This means that maximum likelihood methods will be easy to implement and guaranteed to converge to a global maximum. Additionally, the GLM framework is integrated into several standard mathematical and statistical packages. These properties make the GLM framework ideal for rapidly assessing the possible relevance of a large number of covariates on the spiking properties of neurons whose receptive fields have not previously been fully characterized (Truccolo et al., 2005).

### 3.4. Goodness-of-Fit Measures.

*Q*in equation 3.2 needs to be determined for each of the neurons in the ensemble. In order to select the optimal number of parameters, the Aikaike information criterion (AIC) function was computed, which penalizes the likelihood function by the number of parameters used. It has the form where the likelihood function is evaluated at the MLE of model parameters, , and

*q*is the total number of parameters to be estimated in the model, or the dimension of the vector . The model that minimizes the AIC asymptotically maximizes the cross-validation likelihood and generalizes best to the unobserved data (Burnham & Anderson, 2002). If the AIC for model 2 is minimized at

*Q*= 0, the optimal conditional intensity model reduces to model 1; otherwise equation 3.2 with the minimized value for

*Q*gives the optimal model in this set.

After fitting spiking data to the Poisson model and the history-dependent model with optimal *Q*, it is important to determine how well the fit models capture the structure observed in the data. For spike train data, this amounts to characterizing how well the models predict the timing of the observed spike events. Goodness-of-fit measures that are typically applied in continuous data analyses are not directly appropriate for point process data. One solution to this problem is to apply the time-rescaling theorem (Brown, Barbieri, Ventura, Kass, & Frank, 2001; Ogata, 1988; Papangelou, 1972) to the intensity functions computed from model 1 and model 2 respectively.

Given the conditional intensity function, define . By the time-rescaling theorem, if the conditional intensity function accurately reflects the structure of the spiking activity, the Λ(*t _{k}*)s will be a Poisson process with unit rate, and the rescaled interspike intervals (ISIs), defined as Λ(

*t*) − Λ(

_{k}*t*

_{k−1}) for

*k*= 1, …,

*K*, will be independent exponential random variables with mean 1.

A Kolmogorov-Smirnov (KS) plot is a plot of the empirical cumulative distribution function (CDF) of the rescaled ISIs against an exponential CDF. If the conditional intensity model accurately describes the observed spiking data, then the KS plot should follow a 45 degree line. Confidence bounds for the degree of agreement between a model and the data can be constructed using the distribution of the KS statistic; the 95% confidence bounds can be approximated as , for *k* = 1, …, *K* (Johnson & Kotz, 1970).

If the conditional intensity function from either of our point process models is accurate, the rescaled ISIs should be not only exponential but also independent. Autocorrelation (ACF) plots are useful for determining whether there is a significant correlation in the rescaled ISIs. The rescaled ISIs are first transformed to uniform variables on the unit interval. Then the mean-adjusted sample autocorrelation function is calculated for the transformed ISIs and plotted against the lags varying from 1 to 20 to generate an ACF plot. If our model correctly describes the spiking data, the ACF at all lags should be close to zero (Bartlett, 1946; Shumway, 1988). Bootstrap methods (Lehmann, 1999; Casella & Berger, 2001) were used to calculate the empirical confidence bounds. If the conditional intensity model accurately describes the measured spiking activity, we expect the observed correlation values to fall within these bounds with 95% probability.

## 4. Decoding Methods

Bayesian state-space estimation procedures require both a state transition model and an observation model. The observation model characterizes the distribution of the observed data as a function of the state, which is determined by the conditional intensity function and equation 4.3 below. The state transition model determines the prior distribution on the evolution of the state, *p*(*x _{k}* ∣

*x*

_{k−1}). In this application, the state model describes the statistical properties of the movement trajectory of the rat, and the observation model describes the spiking activity as a function of this movement.

*x*= ±753 and

*x*= 0 along the linearized track, it is topologically equivalent to a figure eight space in that it is locally linear everywhere except for a single connection point, shown in Figure 2. Under this modified random walk,

*d*, the distance the animal can move in one time step is normally distributed with mean zero: If the animal does not move through the connection point, then this movement distance uniquely determines the next position,

_{k}*x*=

_{k}*x*

_{k−1}+

*d*. If the animal does move through the connection point, then there are three possible points at this distance, and under the model, it is equally likely to move in any of these three possible directions.

_{k}This state model generalizes a simple random walk to the linearized maze, such that from any point in the track, the animal is equally likely to move in any direction available to it. The transition probabilities are such that they are approximately gaussian when the animal is far from the connection point, but, they can be highly nongaussian, including points of discontinuity as well as multiple local maxima, when the previous position of the animal is near the connection point.

*d*, is normally distributed with constant variance σ

_{k}^{2}, but is now centered around a predictable component

*f*(

*x*

_{k−1}): The drift term

*f*(

*x*

_{k−1}) estimates the expected movement of the animal as a function of the animal's current location between two adjacent time steps. Local linear fitting (Fan & Gijbels, 1996), a form of nonparametric regression, was performed to evaluate

*f*at each of the linearized positions, based on behavioral data of the animal but not the spiking data (see Figure 4 below). Similar to the adapted random walk state model, if this movement distance does not cause the animal to pass through the connection point, the next position is uniquely determined through equation 4.2. For each location along the return arm near the connection point, we calculated the observed relative frequency with which the animal moved through the connection point to the maze stem within a single time step. For this state model, we set the probability that the animal passes through the connection point equal to this observed relative frequency and set the probability that the next trajectory would be a left turn or a right turn, each to one half.

In comparison to the first simple random walk state model, this state model takes into consideration the animal's actual behavior; therefore, we expect a closer reconstruction of the animal's movement. However, if we are interested only in the information about the movement trajectory due to the ensemble spiking activity, we may prefer to use a less informative state model such as the modified random walk.

The goal of our decoding algorithm is to compute, at each time point, the posterior distribution of the animal's position along the linearized track given the combined spiking activity of the entire neural ensemble. Notice that when the animal is on the common stem of the T-maze, this posterior distribution not only predicts its location on the stem, but also predicts which direction the animal will turn when it next reaches the decision point. For example, if the posterior distribution is multimodal with a large peak centered at 120 and a smaller peak centered at −120, this would suggest that the rat is approximately halfway up the stem and is more likely right than left when it reaches the decision point.

In the above expression, *p*(*x _{k}* ∣

*x*

_{k−1}) is given by either of the state models defined by equations 4.1 and 4.2, along with the property that movements through the connection point are equally likely to move in each of the possible directions. We multiply this state transition probability to the posterior distribution of the animal's position at the previous time step

*t*

_{k−1}, and numerically integrate the product over all possible positions of the animal to get a one-step prediction distribution at the current time

*t*. This distribution is used as the prior distribution for the present time step. In this case, we performed the numerical integration using a simple Riemann sum. If integration over a much larger spatial scale or in higher dimensions were required, well-developed methods such as importance sampling or subregion-adaptive integration (Davis & Rabinowitz, 1984; Genz & Kass, 1991, 1997; Evans & Swartz, 2000) should be considered for improving the efficiency of the algorithm.

_{k}*N*∣

^{c}_{k}*H*), is fully characterized by its conditional intensity function. For neuron

_{k}*c*, we denote by λ

^{c}

_{k}the conditional intensity function at time step

*t*. The probability of seeing Δ

_{k}*N*spikes from this neuron during the time interval (

^{c}_{k}*t*

_{k−1},

*t*] is where Δ

_{k}*t*=

*t*−

_{k}*t*

_{k−1}is the time interval between two recorded location points and

*M*is an expression that does not depend on the state,

*x*. The observation distribution for an ensemble of

_{k}*C*neurons, ∏

^{C}

_{c=1}Pr(Δ

*N*∣

^{c}_{k}*H*), is multiplied by the one-step prediction density to get the posterior at the current time.

_{k}After calculating the posterior density, the mode of the distribution was used to estimate the location of the animal, providing a maximum a posteriori (MAP) estimator. We denote this estimator by at time step .

The uncertainty of can be characterized by constructing confidence intervals. The traditional approach to computing confidence intervals of integrating the posterior density up to .025 and .975 is not suitable for two reasons. First, since the figure eight topology of the track has no natural starting point, an arbitrary starting point needs to be selected, which would have a serious effect on the resulting confidence region. Second, whichever starting point we choose, this approach will lead to inappropriately large confidence bounds, since the posterior density can be multimodal and CIs constructed this way can include long regions where the posterior density is close to zero.

_{k}the posterior density at time step

*t*. To calculate the 95% confidence bounds, we find the largest value

_{k}*y*

_{0}such that The 95% confidence region is thus given by {

*x*:ρ

_{k}(

*x*)>

*y*

_{0}}. Note that the HPD allows multiple separate regions at any point in time, which can be especially informative when the animal was on the stem of the T-maze or getting close to the decision point. It is also, by definition, the narrowest among all 95% confidence intervals, since it is the region that corresponds to the highest posterior densities.

At each time step, we check whether the 95% confidence interval actually covers the true position of the animal. The total number of time steps when this occurs is divided by the total number of time steps across the whole experiment to calculate the coverage probability of our decoding model. A larger coverage probability indicates that the decoding procedure is more appropriately estimating its uncertainty about the animal's location. The point and interval estimators calculated in 1D can then be mapped back in 2D using an inverse transformation (see section 3.1), and the root mean squared error (RMSE) between the true and estimated position can be computed.

To assess whether the models generalize well to unobserved data, cross-validation analyses are performed (Cox, 1975; Mosteller & Tukey, 1977). We divided both position and spiking data from the whole experiment into two equal-length halves. The first half of data was used as the training set, for building the observation model and the empirical state model. The spiking information from the second half of the data was used as the test set to reconstruct the location of the animal and compute confidence bounds. Coverage probability and RMSE were calculated for the decoded movement trajectories of the test set and compared with the results from decoding over the whole experiment.

## 5. Results

In order to illustrate the encoding and decoding methods, we applied this framework to the spiking activity of the 47 cell ensembles collected from the rat performing the continuous alternation task.

### 5.1. Encoding Results.

By linearizing the two-dimensional track (see section 3.1), we were able to construct and fit the two spline-based GLMs separately for the spiking activity of each of the 47 neurons.

An example for the encoding results is shown in Figure 5 for a single neuron whose place field was shown in Figure 1. Figure 5A shows the movement trajectory of the rat in gray, along with the locations at which this neuron fired in black. This neuron was most active when the animal was at the right corner of the T-maze. Figure 5B shows the MLE of model 1, the Poisson model, fit to the linearized data. The solid black curve represents the MLE of the firing rate as a function of the animal's location on the track, while the gray region with vertical stripes represents a 95% confidence interval for the firing rate. We can see from the plot that the neuron spikes most when the animal's linearized position was around 395, which corresponds to the right corner for the original 2D maze (see Figure 2).

The goodness-of-fit analysis based on time-rescaling, KS plots, and autocorrelation analyses as described in section 3.4 were then carried out for the Poisson model. Figure 6A shows KS plots for the Poisson model applied to the neuron illustrated in Figure 5. The fact that the KS plot passes outside its 95% confidence interval suggests that this Poisson model does not fully capture the structure in the data. In order to explore the possible reasons for the KS test to fail, we checked the correlation of the rescaled ISIs at multiple lags. An ACF plot constructed for the transformed ISIs according to the Poisson model is shown in Figure 6B. We can see a clear structure in the ACF at the first 20 lags. This suggests that the uncaptured structure in the data is related to dependencies between nearby spike events.

To capture the effects of spiking history on the firing properties of these neurons, we also fit the spiking data to model 2, the history-dependent model. The AIC function was used to determine the number of history covariates, or the length of history to be considered. The AIC of model 2 reaches a minimum at lag 17, which suggests that our optimal model should include the spiking for the past 17 time steps. The maximum likelihood fits for the history-dependent model with *Q* equivalent to 17 are shown in Figures 5C and 5D. Figure 5C shows the MLEs for the exponential of the 17 coefficients for the history covariates of this neuron and their 95% confidence bounds. Three of the exponentiated coefficients are significantly different from one (indicated by stars). Note that the exponentials of those three parameters are all greater than one, indicating excitation at these time lags. The most significant coefficient is one time step, or about 33 ms, back in history, which can be reflective of bursting activity in this neuron. For example, if this neuron fired about 33 ms in the past, it would have a fourfold increase in firing probability at the next time step. To examine the effect that this history dependence has on the spatial component model, we plotted the estimate for the firing rate assuming no relevant recent spiking activity, represented by the solid black curve in Figure 5D. The 95% confidence bounds are shown as a gray region with vertical stripes. In a comparison of this plot with Figure 5B, the major difference is that the peak for the estimated firing rate is much lower after we take into consideration the neuron's firing history. This suggests that some of the firing intensity when the animal was at the right corner is more properly attributed to history-related effects such as bursting than purely to position effects.

The model fit shown in Figures 5B and 5D also suggests that the firing activity increases more along the stem just prior to right turns (around 245 for the linearized position) than it does prior to left turns (around −245 for the linearized position). However, this difference in firing rate is not significant at the 95% confidence level, suggesting that it would be difficult to infer the next turn direction from the firing activity of this neuron alone.

Figures 6C and 6D show the KS and ACF plots for the same neuron using the fit for the history-dependent model. Figure 6C shows that adding the simple history component in model 2 improves the model fit to the point that it passes the KS test. The ACF plot at the first 20 lags constructed for model 2 is shown in Figure 6D; the structure at small lags that was present in Figure 6B is eliminated, and almost all the points fall within the 95% confidence bounds.

Over the full ensemble of 47 neurons, we found that, similar to the encoding results shown in Figure 5, the firing intensity varied as a function of the linearized location for each of the neurons. By visual inspection, we found that the CIF for between two-fifths and one-half of the neurons displayed multiple peaks along the linearized maze under both the Poisson model and the history-dependent spiking model. The rest of the neurons had a single narrow peak at one particular location, suggesting that these neurons are narrowly tuned to position. However, few of the neurons displayed significant differential firing between left and right turns when the animal was at the stem of the T- maze.

We also carried out goodness-of-fit analyses for the firing activity of each neuron in the ensemble. Overall, 19 of the 47 neurons were well fit by the Poisson model as measured by the KS test. Under the history-dependent model, the number of neurons that were well fit improved to 29 out of the 47. For the neurons whose firing activities modeled by inhomogeneous Poisson failed the KS test, most of them showed clear structure in their ACF plots. That structure was eliminated or reduced once a simple history component was added to the model. This suggests that these neurons have history-dependent firing properties that must be modeled to properly describe the observed spiking patterns.

### 5.2. Decoding Results.

In order to decode the position of the animal from the firing activity of an ensemble of neurons, we calculated the posterior density of the animal's position along the linearized T-maze given the past spiking activity of the entire ensemble, at each time step, for the duration of the experiment. The decoding results from the empirical state model (see section 4) and history-dependent encoding model for a minute-long segment of the experiment are shown in Figure 7. Figure 7A shows the posterior distribution for the position of the animal at four time steps, with each frame coming from distinct passes through the maze. The horizontal black line shows the true position of the animal, and the curve is the estimated posterior density at that time step, with the black dot being the MAP estimator for the location of the animal. The 95% confidence region is shown in light gray. We can see that in all four frames, the MAP estimate is close to the true position of the animal, and the 95% confidence interval for the true position of the animal is narrow. We also found that the posterior distribution is highly nongaussian, that is, asymmetric in general and often multimodal. This can be partly attributed to the discontinuity at the connection point of the T-maze, as well as the fact that the middle stem of the maze was split according to the animal's next turn direction during the linearization process (see section 3.1). Bimodal distributions often occurred when the animal was traveling up the stem and approaching the decision point.

Figure 7B shows the decoding results in one dimension by comparing the true and estimated linearized positions as a function of time. Here, the continuous dark gray curve represents the actual trajectory of the rat along the linearized track, while the discontinuous black points are the MAP estimator and the light gray region is the 95% confidence interval at each time step. The estimate tracks the true position closely, and the confidence interval covers the true position most of the time. In Figure 7C, the decoding results for the four chosen frames displayed in Figure 7A are mapped back to our original T-maze. The dark gray dot represents the true position of the animal; the black dot and the light gray interval, derived by mapping the MAP estimator and its 95% confidence interval in 1D back to 2D, respectively, show the estimated location of the animal in the original track with 95% confidence region. At each time step, we computed an estimate of the rat's actual position by mapping the linearized track estimate back onto the actual track. By piecing together individual frames of this sort, we constructed a 2D decoding movie that is available online in the Supplemental Materials for this article on the *Neural Computation* Web site.

We carried out this decoding analysis for the full length of experiment under four different models, and the results are summarized in Table 1. We considered both the Poisson and the history-dependent observation models and two different state models, the adapted random walk model and the empirical state model derived from local polynomial regression (see section 4).

Conditional Intensity Model . | State Model . | Coverage Probability of 95% CI . | RMSE (cm) . |
---|---|---|---|

Equation 3.1 | Random walk | 62.76% | 7.69 |

without history component | Empirical | 82.08 | 5.58 |

Equation 3.2 | Random walk | 61.26 | 8.17 |

with history component | Empirical | 83.10 | 5.71 |

From Table 1, we can see that although the firing history greatly improved the goodness of fit to the data in the encoding analysis, it does not have a major impact on the decoding results in terms of either the coverage probability or the mean squared error. However, using the empirical state model does improve the estimation accuracy, since it uses statistical information about the animal's likely movement at each time step. It is clear from the reconstructions based on the random walk model that even without the information afforded by the empirical state model, the neural ensemble contains sufficient information to accurately decode the rat's position and intended movement direction. For the empirical state models, the RMSE in 2D is less than 6 cm, which is less than half the body length of the animal (about 15 cm).

This decoding framework can also be used to predict the animal's next turn decision. When the animal is at the stem of the T-maze, the posterior density is likely to be bimodal, with separate modes providing the conditional probabilities for the animal to turn left or right, given the ensemble spiking activity. As the animal approaches the decision point, if the probability under one of the modes is greater than that under the other, we can predict that the animal will turn to the corresponding direction. One example is shown in Figure 8. The posterior density splits into two modes as the animal moves up the stem toward the decision point (from 0 toward 245 or −245 in 1D), with each side representing the same spatial location but different future turn directions. The posterior density is much higher in the region corresponding to a right turn than it is in the region corresponding to a left turn. In this case, we are much more certain that the animal will choose to turn right at the next decision point. As the animal nears the decision point, the left side of the modes dies off, and we are increasingly confident that the animal will eventually turn right. In the last frame of Figure 8, since the right mode contains the whole of the 95% CI, we can predict with over 95% confidence the next turn direction for the animal at that time step, during which the true position of the animal is just over halfway up the stem. Another example is illustrated in Figure 7C(4). As the animal travels up the stem approaching the decision point, we can predict, using the posterior distribution shown in Figure 7A(4), that the animal will make a right turn with over 99% confidence.

For all four combinations of state and observation models, we correctly predicted 66 of 69 turns. All three of these errors occurred on right turns that were predicted as left turns. One of the trials in which the algorithm incorrectly predicted the turn was the last trial of the experiment; the other two were trials for which the rat also made a behavioral error—a consecutive right turn after the previous trial. In these cases, the decoding algorithm correctly predicted the direction the animal should have chosen according to the task rather than the direction where the animal actually went. This could occur if the ensemble were correctly coding for features of the task that would allow it to choose the proper direction, but the rat did not act properly on this information, or if the ensemble were retrospectively coding where the animal had been rather than where it was going (Ferbinteanu & Shapiro, 2003).

We carried out cross-validation by partitioning the recorded data into two equal-length halves. The first half was used as the training set to fit the neural models, and the second half was used as a test set to reconstruct the movement trajectories. The cross-validated decoding results are summarized in Table 2.

Conditional Intensity Model . | State Model . | Coverage Probability of 95% CI . | RMSE (cm) . |
---|---|---|---|

Equation 1, | Random walk | 60.47% | 11.15 |

without history component | Empirical | 82.46 | 6.76 |

Equation 2, | Random walk | 58.96 | 11.99 |

with history component | Empirical | 81.32 | 7.45 |

Conditional Intensity Model . | State Model . | Coverage Probability of 95% CI . | RMSE (cm) . |
---|---|---|---|

Equation 1, | Random walk | 60.47% | 11.15 |

without history component | Empirical | 82.46 | 6.76 |

Equation 2, | Random walk | 58.96 | 11.99 |

with history component | Empirical | 81.32 | 7.45 |

Comparing Table 2 with Table 1, there is not a very big change for the coverage probability of the four models. The corresponding RMSE does increase slightly, but this may also be due to the decrease in sample size, since the training and test set consist of only half of the original data. In any case, it is clear that overfitting to the data is minimal. For the empirical state models, the RMSE still remains smaller than half of the body length of the animal.

## 6. Discussion

Using a simple model for the place-specific differential firing of these neurons, we were able to characterize the firing properties of each neuron in the ensemble, reconstruct the animal's trajectory through its environment, and predict future behaviors.

We started by simplifying the problem both conceptually and computationally by linearizing the animal's position into a 1D vector, focusing our interest on the animal's distance along the narrow track. We mapped the rat's movement trajectory at all locations (excluding only the resting area outside of the track) from 2D to 1D. Most place cell analyses exclude spiking data collected when the animal's movement falls below a specific speed threshold since this activity may be associated with activities apart from spatial exploration (Buzsáki, 1986; Lee et al., 2006). However, for this analysis, we sought to use all of the spiking data that provided information about the animal's location, including data at the corner of the T-maze where the feeding wells are located. These data are still informative about position, and we can maintain better estimates of the animal's position by including all of the observed data.

To address how information about future turn decisions is represented by this ensemble, we mapped the middle stem of the T-maze to different regions on the linearized track according to the animal's next turn decision. In the encoding stage, we constructed and fit MLEs for conditional intensity models of each of the neurons along the linearized maze. Differential firing based on the direction of the next turn decision can be identified by comparing the fit of the intensity model over the linearized region corresponding to left turns (−245 to 0) to that corresponding to right turns (0 to 245). For the decoding stage, it is more natural to construct state models that describe the rat's movements for the linearized track, and the posterior distribution of the animal's distance along the track can be more efficiently computed under this 1D framework than in 2D. Linearization enables us to predict the animal's full movement trajectory and next turn direction at a high level of accuracy.

The linearization procedure described in this article is specific to the T-maze topology but can be easily expanded to arbitrary linear mazes. Random walks can be generalized by constructing gaussian models on the distance the rat moves in a particular direction at a single time step and assigning equal probability to each point in the maze at the resulting distance and direction. For example, if the rat travels up a stem that branches into three different choice paths in the original 2D maze, the generalized random walk model will define the probability that the animal passes through the branch point based on the gaussian distribution and assign equal probability to each of the three directions. More general state models can also be constructed by defining a distribution on the current state given the previous one, *p*(*x _{k}* ∣

*x*

_{k−1}).

After we linearized the maze, we proposed two models to characterize the firing activity of each neuron along the linearized T-maze. The first is a Poisson model that relates the firing rate of the neuron to the spatial location of the animal using a cardinal spline. This simple model was able to capture the statistical structure in the spiking activity of 19 of 47 neurons (about 40%), based on the KS tests. For those that failed the KS tests, the ACF plot for their transformed ISIs demonstrated correlated structure at small lags, potentially indicating history-dependent firing activity. Thus, in the second model, we incorporated a simple history component. Under this history-dependent model, the firing activity of 29 neurons (over 60%) passed the KS tests, and the correlation structure originally shown in the ACF plots of many of the neurons was reduced or eliminated. This suggests that firing history is an essential component necessary to accurately model this spiking activity.

Even under the conditional intensity models with simple history dependence, over a third of the neurons (17 of 47) did not pass the KS test and retained a significant correlation structure between the rescaled ISIs, suggesting that the coarse history component of this model could not fully capture the true dependence structure. This model could be improved by incorporating a more sophisticated history component. This can be achieved in several ways. Binning the spiking activity at finer time scales would allow us to capture more structure in the history dependence of these neurons, especially at small time intervals after a spike. Alternately, renewal-type models of history dependence that relate the firing intensity to the time since the last spike could also be constructed. Additionally, this history-dependent model made a simplification by assuming that the history covariates are independent of the spatial firing properties. Known phenomena such as theta precession suggest that dependencies between spike times could be place specific.

However, the decoding results shown in Table 1 suggest that neither the estimation accuracy for the position of the animal nor the coverage probability of the 95% confidence interval is greatly improved by using the model that incorporates covariates related to the past firing history. Although the simple Poisson model does not fully capture the statistical structure of each neuron's firing activity, it has captured the spatial information in the ensemble spiking very well. For this reason, we do not expect significant improvement in the decoding results using an encoding model with a finer history component. A better model of history dependence will, however, help us better understand the biophysical properties of these neurons.

Next, we examined the estimated conditional intensity function for individual neurons in the ensemble under both the Poisson model and the history-dependent model. We found that many of the neurons show narrow tuning of position, since a number of individual cells contain pertinent information for the location of the animal. To examine how the information of future turn direction is encoded over the full ensemble, a two-way ANOVA analysis (Wood et al., 2000) was performed comparing spike activity as a function of location and turn direction. The results suggested that only 13 of the 47 neurons show significant differential firing preceding different turn directions. Physiologically this suggests that a small subpopulation of these neurons contains the preponderance of information about future decisions.

The errors in predicting the next turn direction tended to occur when the animal made a behavioral mistake in the alternation task by turning to the same direction as in the previous loop; in other words, in these cases, the decoding framework was predicting the direction to which the animal should go instead of where it actually went. One possible explanation for this observation is that these neurons may be retrospectively coding the rat's past movement trajectory rather than predicting its future behavioral decisions. However, because the movement task required the rat to alternate between left and right turns, we cannot disambiguate between retrospective coding and prospective coding in this experiment. One approach to this problem would be to examine an alternative movement task or alternatively designed mazes that would enable us to distinguish between the influence of the past and future movement trajectories (Ferbinteanu & Shapiro, 2003; Shapiro & Ferbinteanu, 2006; Fleischer, Gally, Edelman, & Krichmar, 2007).

Recall that we predicted the animal's next turn direction by calculating the full posterior distribution of the animal's position along the linearized track. When the animal was going up the middle stem of the T-maze, the posterior was often bimodal, with one of the two modes dying off as the animal approached the decision point, suggesting more and more certainty that the animal would turn in the direction of the mode that persists. Overall, the posterior distribution over the linearized track was observed to be highly nongaussian and often multimodal. This is an expected outcome of using a state model that allows the rat to move in any direction at the decision points. As a direct result, the posterior distribution for the animal's position appears multimodal with multiple modes whenever it approaches the connection point at a linearized position of 0 and ±753.

By calculating the exact posterior distribution at each time step and using the MAP as our estimate for the animal's true position, the RMSE for our tracking across the whole experiment is less than 6 cm under the empirical state model and about 8 cm under the random walk model (the body length of the animal is about 15 cm). The coverage probability falls short of the expected 95%, suggesting that this framework has captured much, but not all, of the spatial information contained in this ensemble. Further improvements in the encoding model might lead to more accurate decoding. For example, we could incorporate other factors into the encoding model such as the phase of oscillations in the EEG in the theta band (Skaggs, McNaughton, Wilson, & Barnes, 1996; Jensen & Lisman, 2000).

It is unlikely that the accuracy of the decoding results is related to overfitting in the neural spiking models, since the number of model parameters fit is very small compared to the number of time points at which the movement trajectory was reconstructed. To demonstrate that this decoding analysis generalizes to unobserved data, we performed a cross-validation analysis, which led to comparable decoding results to the analysis using the full data set for model fitting. This suggests that the conditional intensity models were not just capturing actual structure and not simple trial-by-trial fluctuations in the spiking data, that is, the models describe the actual relationship between firing and position rather than the noise.

There are several possible research areas to be further explored. One future direction is to compare the decoding results from this algorithm with those from other decoding paradigms for the T-maze, either under our linearization framework or directly in 2D. Approximate gaussian filters are common tools for efficient computation of the posterior distribution. When the posterior is assumed to be gaussian, a recursive expression for the mean and variance of the posterior can often be derived analytically. This could lead to a simplification of the decoding procedure described here, since such an algorithm needs only to update the mean and variance according to the explicit formula. However, it is not trivial to construct a gaussian filter for this task. As we discussed before, the posterior distribution for the animal's position is often multimodal along the linearized track due to the figure eight topology of the maze. Fundamentally, the gaussian distribution is defined on the real line. Whether there is an appropriate way to impose this structure over the figure eight topology of this task is unclear.

Particle filters offer another alternative to the numerical computations of the posterior distribution that underlie this decoding framework (Wang, Paiva, Principe, & Sanchez, 2007; Brockwell, Rojas, & Kass, 2004; Kelly & Lee, 2004; Eden, Frank, Barbieri, & Brown, 2002; Doucet, de Freitas, & Gordon, 2001; Doucet, Godsill, & Andrieu, 2000). The questions of when such particle filters are more computationally efficient and how many particles are required for accurate decoding for specific decoding problems are still unanswered. By linearizing the track and computing the posterior distribution numerically, we have a ground truth to which we can compare different particle filtering algorithms in terms of accuracy and computational efficiency. Ergün, Barbieri, Eden, Wilson, and Brown (2007) found through simulation that the commonly used bootstrap particle filter could be inefficient when dealing with point process observations, since the posterior density could jump instantaneously when a spike occurs. A future analysis focused on applying these methods to this data set and comparing the results to the ones presented here could help elucidate these issues.

These methods can also be compared to other decoding algorithms developed directly in 2D (Johnson & Redish, 2007). For decoding paradigms designed directly in the original 2D maze, accurate state models that capture the structure of this T-maze task are more difficult to construct. One issue is whether the state model should allow the animal to move outside the confines of the maze. If there is a possibility for the animal to move outside and the encoding model is constructed using only spiking data along the track (usually the case), the model will be incorrect, and the posterior density will inappropriately increase in regions outside the track during nonspiking periods. Therefore, such an algorithm will inappropriately incorporate information coming from nonspike times. On the other hand, the problem of defining a state model that confines the animal to the maze and has appropriate probability distributions for its movements through the maze is nontrivial. It is also more challenging to construct a conditional intensity model in 2D with great accuracy. Because of the maze structure, it is not proper to assume a gaussian model for the CIF; at the same time, spline-based estimation algorithms that use flexible functions to characterize the place field are much more difficult to express in 2D than in 1D.

The focus of this article has been to develop a point process modeling and decoding paradigm and illustrate its application to a small data set from the CA1 region of rat hippocampus. Ultimately we would like to carry out large-scale analysis across different animals within the same regions of the brain using these methods. As a direct extension to this experiment, it is of great interest to explore how prediction accuracy, as well as confidence of prediction, varies as a function of distance or time preceding the decision point. The data set used to illustrate this application is too small to draw strong conclusions, but similar analyses on larger data sets will allow us to quantify prediction accuracy as a function of time prior to the decision. Also, this would allow us to investigate commonalities and differences in the ensemble coding properties of hippocampal neurons across individual animals and draw general conclusions about the coding features of the brain. This will also allow us to study how the results vary as the ensemble size changes in order to understand the information content in subpopulations of these neurons.

In summary, this work provides an important first step toward understanding how the brain encodes information about its environment and uses that information to make behavioral decisions. We proposed a decoding paradigm for a T-maze topology, which in principle can be generated to a number of any linear maze structures. Because of the decrease in dimensionality of the problem, analytical methods appropriate for linear track experiments or open field environments can be carried out in the 2D maze topology as well. This approach offers a powerful new methodology to study how this spatial and context-dependent information is encoded by neural firing patterns across a variety of environments and experimental paradigms.

## Acknowledgments

This material is based on work supported by the National Science Foundation under grant no. IIS-0643995 and by the National Institute of Mental Health under grant nos. MH60013 and MH71702.

## References

*Statistical inference*