## Abstract

Large-scale fluorescence calcium imaging methods have become widely adopted for studies of long-term hippocampal and cortical neuronal dynamics. Pyramidal neurons of the rodent hippocampus show spatial tuning in freely foraging or head-fixed navigation tasks. Development of efficient neural decoding methods for reconstructing the animal's position in real or virtual environments can provide a fast readout of spatial representations in closed-loop neuroscience experiments. Here, we develop an efficient strategy to extract features from fluorescence calcium imaging traces and further decode the animal's position. We validate our spike inference-free decoding methods in multiple in vivo calcium imaging recordings of the mouse hippocampus based on both supervised and unsupervised decoding analyses. We systematically investigate the decoding performance of our proposed methods with respect to the number of neurons, imaging frame rate, and signal-to-noise ratio. Our proposed supervised decoding analysis is ultrafast and robust, and thereby appealing for real-time position decoding applications based on calcium imaging.

## 1 Introduction

Pyramidal cells in the rodent hippocampus are known to encode space. Large-scale, long-term calcium imaging has been used for studies of long-term memories in freely behaving or head-fixed animals (Dombeck, Harvey, Tian, Looger, & Tank, 2010; Ziv et al., 2013; Jercog, Rogerson, & Schnitzer, 2016; Mau et al., 2018). One of interesting neuroscience questions is to examine how populations of hippocampal neurons represent space at different brain states and how these representations correlate with memory, learning, and planning.

Genetically encoded fluorescent reporters of neurobiological processes can be used to monitor biochemical events and signals in neurons and tissues of living brains. Rapid advances in neurotechnology have allowed us to simultaneously record thousands or tens of thousands of neurons using large-scale calcium imaging (Pachitariu et al., 2017; Stringer et al., 2019). However, the curse of dimensionality may arise for processing such high-throughput neural data, especially in closed-loop neuroscience experiments.

Neural decoding for position reconstruction has been well studied based on large-scale hippocampal neuronal activities or population responses. In electrophysiology, multiple sources of effort have been dedicated to position decoding based on either unsorted spikes or multi-unit activity (Chen, Kloosterman, Layton, & Wilson, 2012; Kloosterman, Layton, Chen, & Wilson, 2014; Deng, Liu, Kay, Frank, & Eden, 2015; Ciliberti, Michon, & Kloosterman, 2018; Hu et al., 2018) or the amplitude of ultra-high-frequency band multisite hippocampal field potentials (Cao, Varga, & Chen, 2019). In calcium imaging, there was an approximately linear relationship between the peak amplitude of a somatic Ca$2+$ transient and the underlying number of spikes in the recorded brain areas (Chen et al., 2013; Jercog et al., 2016). This naturally invites the next question: Can we directly estimate the position of freely behaving animals directly based on the calcium imaging data without the need of spike inference?

In the standard analysis pipeline for calcium imaging, spike inference algorithms are used to extract spiking activity from fluorescence traces (e.g., Vogelstein et al., 2010; Pnevmatikakis et al., 2016; Giovannucci et al., 2018; Jewell & Witten, 2018; Zhou et al., 2018; Pnevmatikakis, 2019). However, spike deconvolution is a challenging and computationally demanding procedure that seeks an approximate solution to the inverse problem. However, statistical methods have also been developed to bypass spike deconvolution for subsequent analyses, such as estimating neuronal firing rates directly from calcium imaging data (Ganmor, Krumin, Rossi, Carandini, & Simoncelli, 2016). In this letter, we propose a new analysis pipeline for position decoding based on in vivo calcium imaging recordings from the mouse hippocampus. We show that the preprocessed fluorescence traces (without spike inference) can be used directly in either supervised or unsupervised position decoding analysis. The proposed method not only is efficient in computational speed, but also shows comparable performance with the standard spike inference methods. From the unsupervised learning analysis, we further recover the latent state representations that encode the animal's position.

## 2 Methods

### 2.1 Data Preprocessing and Approximate Spike Inference

In calcium imaging, the Ca$2+$ dynamics are characterized by the fluorescence traces, which are typically shown in units of a percentage change of each imaged cell's baseline fluorescence level (i.e., $dF/F$). At first, the standard preprocessing procedures, such as motion artifact correction, neuropil removal, and source extraction, were conducted for the raw calcium imaging data (Giovannucci et al., 2018; Pnevmatikakis, 2019). In traditional neural data analysis, the spike inference procedure is further employed to estimate the calcium concentration and the spiking activity from the $dF/F$ fluorescence traces. Here, we used one state-of-the-art spike deconvolution algorithm--based nonnegative deconvolution (Friedrich, Zhou, & Paninski, 2017; Pachitariu, Stringer, & Harris, 2018). At the end of spike inference, a time series of spiking activity (real-valued) from each fluorescence trace was obtained (see Figure 1A). The fluorescence trace can be viewed as a temporal convolution between a discrete point process and low-pass filter, corrupted by an additive noise process (Vogelstein et al., 2010). Regardless of the techniques or inference principle, spike deconvolution remains a computational bottleneck for processing large-scale and long-term calcium imaging recordings.

To sidestep the spike deconvolution procedure, we developed a simple yet efficient strategy for inferring the proxy of spiking activity. The procedure consisted of three steps:

*Step 1.*We estimated the positive peak amplitude from the $dF/F$ fluorescence traces. The amplitude were identified based on a threshold-based peak detection algorithm (e.g., Matlab function findpeaks; see Figure 1B). Depending on the noise level, the minimum peak threshold was often set as at least 10% to 30% peak fluorescence amplitude.*Step 2*. We set the signal below the threshold to 0 and then converted the fluorescence trace into a one-dimensional marked point process (MPP; Jacobsen, 2006), where the nonzero value indicated the peak event, and the mark was defined by the peak amplitude (see Figure 1B).*Step 3*. To accommodate the spiking activity during the rise time before reaching the peak, we convolved the marked point process with an asymmetric temporal filter $h$ (see Figure 1C), resulting in a filtered MPP (see Figure 1D). The custom-designed filter may have the following shape,where $\alpha $ is the decay parameter and $A=exp(-\alpha t0)/\alpha $ is a normalization constant such that $\u222bh(t)dt=1$.$h(t)=1-exp(-\alpha (t-t0))At<t0\u226400t>t0,$

Upon feature extraction, either the deconvolved spiking activity or the filtered MPPs of selected neurons were used as the observations for the subsequent decoding analysis. Notably, the filtered MPPs can be viewed and used as the surrogate of scaled spiking activity). However, as we see below, position decoding does not require exact spike timing or scaling, implying that a proxy of spiking activity can provide a much simpler and efficient solution to decoding analysis.

### 2.2 Optimal Linear Estimation for Position Decoding

For simplicity, we used the standard optimal linear estimation (OLE) method (Agarwal et al., 2014; Cao et al., 2019). The OLE method is based on a linear regression model. However, the principle discussed here can be adapted for other likelihood-based decoding methods, for possible incorporation of a temporal prior.

Assuming gaussianity, the solution to equation 2.4 was given by a least-squared (LS) estimate. In the preprocessing step, the observed features can be *Z*-scored across neurons. However, when the number of neurons is large (e.g., more than 500), the overfitting problem will potentially arise. To resolve this issue, we imposed a sparsity constraint on the estimate and employed a variational Bayes (VB) linear regression combined with an automatic relevance determination (ARD) procedure (Bishop, 2006; Wu, Nagarajan, & Chen, 2016; Drugowtisch, 2017). The VB-ARD sparse regression showed improved decoding performance in the presence of high-dimensionality (see the results in section 3.2).

### 2.3 Maximum Likelihood Estimation for Position Decoding

### 2.4 Bayesian Hidden Markov Model

In our previous work, we developed a hidden Markov model (HMM) to uncover latent structures of hippocampal population activity during spatial navigation, based on observed spiking activity (Chen, Kloosterman, Brown, & Wilson, 2012; Chen, Gomperts, Yamamoto, & Wilson, 2014) or multichannel LFP features (Cao et al., 2019). Here we extended the idea by using the new features derived from the fluorescence calcium imaging traces.

To accommodate automatic selection of model order $m$, we have developed a Bayesian nonparametric inference procedure to identify the unknown parameters ${\pi ,P,\Lambda}$ and the latent state sequences ${S1:T}$ (Linderman, Johnson, Wilson, & Chen, 2016). The Bayesian nonparametric version of the HMM—the hierarchical Dirichlet process-HMM (HDP-HMM)—generalizes the finite-state HMM with a nonparametric HDP prior. Upon completion of Bayesian inference, we estimated the posterior of latent state sequences $S1:T$, the tuning curve matrix $\Lambda $, and identified the correspondence between the inferred state and animal's position (Chen, Gomperts, Yamamoto, & Wilson, 2014).

### 2.5 Rank-Invariant Resampling

The deconvolved spiking activity or filtered MPPs derived from the fluorescence traces are nonnegative real-valued, which do not satisfy the Poisson assumption in the likelihood model (see equation 2.5 or 2.8). To accommodate this change, we employed a rank-invariant resampling procedure (Honey et al., 2009) to generate the same size of random samples from a targeted Poisson distribution.

In the resampling procedure, we replaced the original ordered samples with the resampled ordered samples (by keeping the order unchanged). Specifically, we replaced the smallest raw data value with the smallest randomly sampled value, the second-smallest raw data value with the second-smallest randomly sampled value, and so on until all raw data values were replaced. A schematic illustration of the rank-invariant resampling procedure is shown in Figure 2C. Since the targeted resampled data were guaranteed to be Poisson distributed, we treated them as the pseudo-spike count observations and then conducted the inference described in sections 2.3 or 2.4.

The choice of the targeted mean statistic in the targeted Poisson distribution was ad hoc, and the HMM-based decoding performance may vary according to the Poisson mean statistic. When the latent state is known, the rank-invariant resampling may be approximately scale and translationally invariant (see the appendix).

### 2.6 Experimental Methods

#### 2.6.1 Hippocampal Data Set 1

In the first data set, male and female C57/BL6 mice ages 6 to 8 weeks (Charles River Laboratory) were used in the experiments. We used AAV1-synapsin-GCaMP6s to label neurons in hippocampal dorsal CA1 area and tracked somatic calcium activity of neurons by two-photon imaging through a chronic window as previously described (Dombeck et al., 2010). The in vivo Ca$2+$ imaging experiments were performed using a Bruker two-photon laser scanning system equipped with a Ti:sapphire laser (Mai Tai DeepSee; Spectra Physics) tuned to 920 nm. The average laser power on the brain sample was approximately 15 mW to 30 mW. All imaging was performed using a 40$\xd7$ objective immersed in an artificial cerebral spinal fluid (ACSF) solution and with a 1$\xd7$ digital zoom. Images were collected at a resolution of 473 $\xd7$ 473 pixels (see Figure 3A) and a frame rate of 2 Hz. The head-restrained animal was trained to move his forelimbs to perform running forward on the treadmill (see Figure 3B), while the rotation of belt provided continual changes in visual and tactile stimuli (Royer et al., 2012). Training on the 108 cm treadmill consisted of eight trials with continuous running for 10 laps in each trial (80 laps in total). The treadmill speed was 2.5 to 2.7 cm/s. The $dF/F$ fluorescence calcium activity (see Figure 3C) in trial 8 (laps 71–80) was analyzed. Recordings of two mice were used in the position decoding analysis (see Table 1). Overall, the mean$\xb1$SD cell population firing rate was $0.1440\xb10.1121$ (spike magnitude/s) for mouse 1 and $0.1438\xb10.1187$ (spike magnitude/s) for mouse 2. In addition, the mean$\xb1$SD rate of fluorescence-evoked events was $0.1647\xb10.0826$ Hz and $0.1642\xb10.0901$ Hz for two mice, respectively. The experimental studies were performed in accordance with the National Institutes of Health (NIH) Guide for the Care and Use of Laboratory Animals to ensure minimal animal use and discomfort and were approved by the New York University School of Medicine (NYUSOM) Institutional Animal Care and Use Committee.

Data Set (animal) . | # Session . | # Cells . | Sampling Rate . | Imaging . |
---|---|---|---|---|

Data set 1 (mouse 1) | 1 | 79 | 2 Hz | Two-photon, GCaMP6s |

(mouse 2) | 1 | 53 | 2 Hz | Two-photon, GCaMP6s |

Data set 2 (mouse 3) | 4 | 1091 | 20 Hz | One-photon, GCaMP6f |

(mouse 4) | 4 | 835 | 20 Hz | One-photon, GCaMP6f |

Data Set (animal) . | # Session . | # Cells . | Sampling Rate . | Imaging . |
---|---|---|---|---|

Data set 1 (mouse 1) | 1 | 79 | 2 Hz | Two-photon, GCaMP6s |

(mouse 2) | 1 | 53 | 2 Hz | Two-photon, GCaMP6s |

Data set 2 (mouse 3) | 4 | 1091 | 20 Hz | One-photon, GCaMP6f |

(mouse 4) | 4 | 835 | 20 Hz | One-photon, GCaMP6f |

#### 2.6.2 Hippocampal Data Set 2

The second data set consists of one-photon calcium imaging from the mouse hippocampus (https://data.mendeley.com/datasets/f9fmrj98n3/1). Details of experimental setup and recordings are provided by Mau et al. (2018). Briefly, mice were introduced to a 40 $\xd7$ 60 cm$2$ rectangular track with an embedded motorized mouse treadmill as one of its long sides. Mice were acclimated to the environment until they reliably sought 20% sucrose water solution. Then they were trained to run in place on the treadmill for increasing time intervals ($>$6 s) in between laps. For the beginning sessions, running speed varied between 10 and 24 cm/s. Once mice would reliably run for approximately 30 laps per day, data were collected for four days, with each session lasting approximately 30 min and consisting of approximately 30 laps of 10 s treadmill running and water retrieval. A miniaturized epifluorescence microscope (Inscopix) was used to collect imaging videos of hippocampal CA1 activity at a frame rate of 20 Hz. Videos were motion corrected and cropped (500 $\xd7$ 500 pixels) to exclude areas without GCaMP6f activity. The videos were then postprocessed using an open-source software package (https://github.com/SharpWave/TENASPIS) to identify the regions of interest (ROIs) and further extract the fluorescence activity ($dF/F$; see Figure 4B). Across days, imaged neurons were tracked using image registration software. We selected two mice (G45 and G48) from this public data set for our current study. Overall, the mean$\xb1$SD cell population firing rate was $0.0022\xb10.0011$ (spike magnitude/s) for mouse 3 and $0.0010\xb10.0003$ (spike magnitude/s) for mouse 4. In addition, the mean$\xb1$SD rate of fluorescence-evoked events was $0.0282\xb10.0146$ Hz and $0.0198\xb10.0089$ Hz for two mice, respectively.

## 3 Results

In the following analyses, we tested our methods on two calcium imaging data sets recorded from head-fixed or freely behaving mice. We compared the performance of supervised and unsupervised decoding analyses based on the deconvolved spiking activity, MPPs or filtered MPPs. In supervised decoding analysis, we conducted 10-fold cross-validation using OLE or MLE. In unsupervised learning analysis, we used 90% data for training the HMM and 10% held-on data for validation. For simplicity, median position decoding errors were reported for performance assessment. However, the mean, variance, and distribution statistics were also important. Custom software for all analyses was written in Python and Matlab and is publicly accessible (www.cn3lab.org/software.html).

### 3.1 Hippocampal Data Set 1

Using a temporal bin size of 500 ms (i.e., 2 Hz frame rate), we compared the cross-validated median position decoding error derived from three features: (1) deconvolved spiking activity from fluorescence traces, (2) unfiltered MPPs derived from fluorescence traces, (3) filtered MPPs derived from fluorescence traces, and (4) when necessary, Poisson resampling was performed on features 1 to 3. The raw features were used in OLE, and the resampled features were used in MLE and HMM. To obtain feature 1, we used a state-of-the-art spike nonnegative deconvolution method (Friedrich et al., 2017). From the inferred spiking activities of hippocampal cells, we constructed the normalized place fields (see Figure 3D, left panel). By comparison, there was a high degree of resemblance in heat map derived from filtered MPPs (see Figure 3D, right panel).

Remarkably, the MLE- or OLE-based supervised decoding analysis showed that our proposed feature 3 yielded lower median decoding error than deconvolved spiking activity in two tested animals (see Figures 3E and 3F). Using the same feature, MLE produced better decoding accuracy than OLE in supervised decoding analysis. Compared with OLE, the HMM-based unsupervised decoding analysis showed comparable median decoding error based on (resampled) deconvolved spikes; however, the median decoding accuracy was worse based on the (resampled) filtered or unfiltered MPP features. In all tested methods, the filtered MPPs yielded consistently better performance than the unfiltered MPPs (especially in the HMM-based method), implying the importance of preprocessing in fluorescence traces.

It is noteworthy that the choice of the hyperparameters $K$ and $\kappa $ influenced the position decoding accuracy. The optimal values could be selected by cross-validation (see Figure 3G). However, the decoding accuracy was relatively robust with a wide range of hyperparameters. For implementation simplicity, we have used a finite-length digital filter $h=[h1,h2,h3,0,0,0,0]$ (where $h1<h2<h3$ and $h1+h2+h3=1$) and assumed a constant $h3/h2=\alpha $. Similarly, the shape of temporal filter, or the decay constant $\alpha $, could be optimized by cross-validation.

### 3.2 Hippocampal Data Set 2

We linearized the rectangular track (including the treadmill area), resulting in a track length of approximately 150 cm (see Figure 4B). We used a velocity threshold of 5 cm/s to exclude low-speed movement periods. The summary of the data set is shown in Table 1. Similar to the first data set, we inferred individual spiking activities and filtered MPPs, and computed the corresponding heat maps of hippocampal place fields (see Figure 4C). In the remaining decoding analyses, we focused only on the results based on deconvolved spikes and filtered MPPs.

We binned the spiking activity or filtered MPPs with a bin size of 200 to 300 ms. We picked the bin size that yielded the better decoding performance (mouse 3: 300 ms; mouse 4: 200 ms). We repeated both supervised and unsupervised decoding analysis as before and varied the number of neurons by randomly selecting a subset of the population. Each decoding analysis was repeated 20 times. In the OLE method, we used the same hyperparameters for all tested features ($K=75$ and $\kappa =25$ in mouse 3; $K=25$ and $\kappa =25$ in mouse 4). In the MLE method, we resampled the deconvolved spikes or filtered MPPs with Poisson count data (using the same mean statistic: 5). In the HMM method, we used the same number of latent states for all tested features ($m=50$). The results are shown in Figures 4D and 4E. As seen, excellent decoding performance was achieved by using only a small number of neurons, and the decoding accuracy gradually saturated with the increasing number of neurons. Comparing supervised and unsupervised decoding analyses, in the mouse 3 recording, MLE consistently yielded the overall best decoding performance, while OLE and HMM approaches had comparable decoding accuracy when the number of neurons was small; however, OLE-based decoding achieved slightly better accuracy than the HMM-based decoding with an increasing number of neurons. In the mouse 4 recording, MLE still had the lowest decoding error, whereas the HMM-based strategy achieved better decoding accuracy than the OLE-based strategy. More remarkable, the simple filtered MPP (resampled) features achieved the best median decoding error in the presence of a small number of neurons ($C=50$ or $C=100$). From Figure 4E, it seemed that the optimal number of neurons for achieving a low yet reliable decoding error was around 200, regardless of the methods used or the features. Notably, when the number of neurons was more than 400, the LS estimate in the OLE suffered from overfitting, whereas the sparse VB-ARD estimate did not (see Figure 4F). However, the VB-ARD estimate was biased; therefore, the LS solution was slightly better when there was no overfitting. In addition, we investigated the impact of resampled Poisson mean statistic on the decoding accuracy. We found that the decoding accuracy changed according to the resampling mean statistics, and the optimal remapping value depended on the data set (see Figure 4G). The optimal resampling mean varied from different features and different data sets. The spike-based decoding accuracy was slightly more stable, possibly due to the distribution difference between the filtered MPP features and deconvolved spikes.

Next, we focused on the investigation on the OLE method. Specifically, we systematically downsampled the florescence traces (from 20 Hz to 10 Hz, to 4 Hz and to 2 Hz) and investigated the impact of sampling rate on position decoding accuracy. We considered two different downsampling conditions: we took one from every $N$ sample from the 20 Hz sampled florescence trace, and we temporally averaged every $N$ sample from the 20 Hz sampled florescence trace. In each case, the decoding bin size remained the same as before resampling. Using 100 randomly selected neurons, we repeated the decoding analysis pipeline 20 times. We found that the decoding accuracy trends derived from these two downsampling operations were somewhat similar, but the accuracy was slightly better in the second downsampling condition (see Figure 5A). The OLE-based decoding accuracy was better with deconvolved spikes than with filtered MPPs. In addition, we observed a decreasing trend in the median decoding error with a lower sampling rate. Although the result appeared counterintuitive, our downsampling operations did not apply on the raw data directly, which might deviate from the physical reality of low imaging frame rate.

Furthermore, we examined the relationship between the SNR and the OLE-based decoding accuracy based on deconvolved spikes and filtered MPPs. In each calcium imaging data set, we computed and sorted the SNR (see equation 2.1) and used the top $N$ and bottom $N$ neurons ($N=500$ for mouse 3, and $N=400$ for mouse 4) in decoding, separately. We observed a decreased decoding accuracy with lower SNR for both deconvolved spikes and filtered MPPs (see Figure 5B).

To test the stability and generalization of the decoding method based on filtered MPPs, we further conducted cross-session OLE-based decoding analysis: training on one session and testing on another session, while using 100 neurons randomly selected (yet identical) from the whole population. For a fair comparison, the within-session position decoding error was computed by two-fold cross-validation. In between-session decoding, data from one complete session were trained, followed by testing on another complete session. Remarkably, we still achieved good position decoding accuracy across multiple days (see Figure 5C). In general, the within-session (i.e., the diagonal) had the lowest decoding error, followed by the between-session results in neighboring days.

### 3.3 Computer Simulations

. | OLE . | MLE . | HMM . | OLE . | MLE . | HMM . | OLE . | MLE . | HMM . |
---|---|---|---|---|---|---|---|---|---|

Features . | $\sigma =0.3$ . | $\sigma =0.6$ . | $\sigma =1$ . | ||||||

Deconvolved spikes | 7.87 $\xb1$ 0.25 | 3.64 $\xb1$ 0.07 | 4.76 $\xb1$ 0.52 | 8.30 $\xb1$ 0.27 | 3.90 $\xb1$ 0.13 | 5.13 $\xb1$ 0.47 | 9.42 $\xb1$ 0.36 | 4.71 $\xb1$ 0.09 | 6.45 $\xb1$ 0.68 |

MPPs | 8.89 $\xb1$ 0.23 | 3.73 $\xb1$ 0.07 | 5.75 $\xb1$ 0.44 | 9.46 $\xb1$ 0.28 | 4.39 $\xb1$ 0.12 | 5.98 $\xb1$ 0.50 | 11.65 $\xb1$ 0.25 | 6.11 $\xb1$ 0.26 | 10.19 $\xb1$ 0.89 |

Filtered MPPs | 6.26 $\xb1$ 0.16 | 2.40 $\xb1$ 0.07 | 3.10 $\xb1$ 0.36 | 6.65 $\xb1$ 0.22 | 2.80 $\xb1$ 0.13 | 3.53 $\xb1$ 0.47 | 7.81 $\xb1$ 0.27 | 4.38 $\xb1$ 0.18 | 5.28 $\xb1$ 0.45 |

. | OLE . | MLE . | HMM . | OLE . | MLE . | HMM . | OLE . | MLE . | HMM . |
---|---|---|---|---|---|---|---|---|---|

Features . | $\sigma =0.3$ . | $\sigma =0.6$ . | $\sigma =1$ . | ||||||

Deconvolved spikes | 7.87 $\xb1$ 0.25 | 3.64 $\xb1$ 0.07 | 4.76 $\xb1$ 0.52 | 8.30 $\xb1$ 0.27 | 3.90 $\xb1$ 0.13 | 5.13 $\xb1$ 0.47 | 9.42 $\xb1$ 0.36 | 4.71 $\xb1$ 0.09 | 6.45 $\xb1$ 0.68 |

MPPs | 8.89 $\xb1$ 0.23 | 3.73 $\xb1$ 0.07 | 5.75 $\xb1$ 0.44 | 9.46 $\xb1$ 0.28 | 4.39 $\xb1$ 0.12 | 5.98 $\xb1$ 0.50 | 11.65 $\xb1$ 0.25 | 6.11 $\xb1$ 0.26 | 10.19 $\xb1$ 0.89 |

Filtered MPPs | 6.26 $\xb1$ 0.16 | 2.40 $\xb1$ 0.07 | 3.10 $\xb1$ 0.36 | 6.65 $\xb1$ 0.22 | 2.80 $\xb1$ 0.13 | 3.53 $\xb1$ 0.47 | 7.81 $\xb1$ 0.27 | 4.38 $\xb1$ 0.18 | 5.28 $\xb1$ 0.45 |

Notes: Median $\xb1$ SD statistics (unit: cm) are derived from 20 Monte Carlo runs. $\sigma $ denotes the measurement noise standard deviation in the simulated calcium concentration.

### 3.4 Computational Speed

Low-latency computation and real-time scalable decoding analysis are two important factors for closed-loop neuroscience experiments. Compared to the state-of-the-art fast spike deconvolution methods, our supervised decoding strategy is simple and ultrafast. Here, we compared the CPU time between two fast spike deconvolution methods (Friedrich et al., 2017; Rahmati, Kirmse, Holthoff, & Kiebel, 2018) and our strategy.^{1} Using custom-written Matlab codes (Matlab R2019, Windows 10 OS; Intel Core i7-6500U CPU 2.50 GHz, 8 GM RAM), we averaged the CPU time across complete recordings and all neurons. A naive (nonoptimized) implementation of our strategy already achieved significant speed-up in computation (see Table 3). We expect an efficient C/C++ implementation will improve the speed further.

. | CPU Time (ms per sample) . | |
---|---|---|

Method . | Data Set 1 . | Data Set 2 . |

Proposed OLE method | (1.77 $\xb1$ 0.18) $\xd7$$10-4$ | (1.63 $\xb1$ 0.19) $\xd7$$10-4$ |

Spike deconvolution (Friedrich et al., 2017) | 0.0556 $\xb1$ 0.0060 | 0.0483 $\xb1$ 0.0060 |

Spike deconvolution (Rahmati et al., 2018) | 0.0012 $\xb1$ 0.0002 | 0.0011 $\xb1$ 0.0001 |

. | CPU Time (ms per sample) . | |
---|---|---|

Method . | Data Set 1 . | Data Set 2 . |

Proposed OLE method | (1.77 $\xb1$ 0.18) $\xd7$$10-4$ | (1.63 $\xb1$ 0.19) $\xd7$$10-4$ |

Spike deconvolution (Friedrich et al., 2017) | 0.0556 $\xb1$ 0.0060 | 0.0483 $\xb1$ 0.0060 |

Spike deconvolution (Rahmati et al., 2018) | 0.0012 $\xb1$ 0.0002 | 0.0011 $\xb1$ 0.0001 |

Note: Statistics were averaged across all neurons in each data set.

Several points are noteworthy:

The convolution step in our proposed preprocessing procedure introduces a temporal delay and appears acausal. This issue becomes more particularly problematic when the frame rate is low (e.g., 2 Hz). However, when the frame rate is relatively high (e.g., 20 Hz), the delay and causality issue will not be the issue given the decoding bin size is 200 to 300 ms (i.e., four to six frames).

If the causation of real-time processing is a concern, we can also discard the convolution step and use only MPPs (instead of filtered MPPs) for the subsequent decoding analysis. The decoding result will be slightly reduced (e.g., see Figure 3F and Table 2) but still acceptable.

Unlike the OLE method, the MLE- or HMM-based decoding strategy relies on a rank-resampling strategy, limiting its real-time application.

## 4 Discussion

In this letter, we have proposed an efficient method to infer the proxy of spiking activity directly from $dF/F$ florescence in calcium imaging and used it for the subsequent position decoding analysis. Our spike inference–free procedure is simply based on a peak detection followed by a filtering operation. The temporal filter can be optimized depending on the neuron's firing rate and the imaging frame rate. The strategy has been proven effective in both supervised and unsupervised decoding analyses for reconstruction of the animal's position. It is demonstrated, via both simulated and experimental data, that exact spike timing is not needed for position decoding; instead, we can infer the mouse's position directly from the proxy of spiking activity via filtered MPP.

In the literature, many methods have been developed for position decoding analysis based on calcium imaging of the hippocampus or neocortex (Ziv et al., 2013; Mao et al., 2018; Wei et al., 2019). The common idea of these methods is to establish neuronal correlates (either deconvolved spikes or their proxy) with the animal's position. A similar rationale has also been extended for decoding time (Mau et al., 2018). Statistical modeling of deconvolved calcium signals or their preprocessed signals is critical for interpreting calcium measurements. However, for some analyses, a deconvolution step may turn out unnecessary, as demonstrated in our study and others (e.g., Ganmor et al., 2016).

Based on the mouse hippocampal calcium imaging data, we show that a small number (approximately 100) of hippocampal place cells is sufficient to provide a reliable spatial readout at various imaging frame rates. The unsupervised HMM-based method can also uncover spatial representations of a population of rodent hippocampal place cells from their calcium imaging activities, consistent with our prior results derived from spikes and LFP signals (Linderman, Johnson, Wilson, & Chen, 2016; Cao et al., 2019). Once the temporal resolution of imaging techniques is further improved (say 50–100 Hz), the same analysis can be used to decode hippocampal memory replays (Liu, Grosmark, & Chen, 2018; Cao et al., 2019; Chen, Grosmark, Penagos, & Wilson, 2016; Chen & Wilson, 2017; Maboudi et al., 2018). Real-time calcium imaging–based decoding analyses can be readily adapted for closed-loop experiments in behaving mice.

Our calcium imaging “place” decoding analysis for the mouse hippocampus can also be extended to neocortical circuits that encode spatial information, including the neurons from the entorhinal cortex, primary visual cortex, retrosplenial cortex, and parietal cortex (Hafting, Fyhn, Molden, Moser, & Moser, 2005; Ji & Wilson, 2007; Whitlock, Sutherland, Witter, Moser, & Moser, 2008; Haggerty & Ji, 2015; Mao, Kandler, McNaughton, & Bonin, 2017). Extending this analysis pipeline to large-scale *in vivo* neocortical population calcium imaging data in other behavioral tasks will be the subject of future investigation.

## Appendix: Implication of Rank-Invariant Resampling

Without the loss of generality, we assume the observed data $y\u2208R$. The goal of rank-invariant resampling is to find a linear or nonlinear transformation: $ynew=\phi (y)$, where $\phi (\xb7)$ denotes a nonnegative monotonic increasing function, such that the rank-invariant order among the observed samples is preserved. The rank-invariant resampling procedure can be viewed as a special form of nonlinear mapping function.

The rank-invariant resampling can be approximately viewed as the implementation of a set of domain-based piecewise linear functions within the CDF interval of [0, 1]. Therefore, the local invariance property approximately holds. In the presence of latent state variables, the invariance property cannot be guaranteed.

## Notes

^{1}

The software packages are available at https://github.com/j-friedrich/OASIS and https://github.com/VahidRahmati/UFARSA.

## Acknowledgments

We thank William Mau for sharing the calcium imaging data. This work was partially supported by NIH grants R01-MH118928 (Z.S.C.), R01-MH111486 (W.-B.G.) from the NIMH, R01-NS100065 (Z.S.C.), R01-NS047325 (W.-B.G.) from the NINDS, as well as an NSF grant CBET-1835000 (Z.S.C).