## Abstract

Recent remarkable advances in experimental techniques have provided a background for inferring neuronal couplings from point process data that include a great number of neurons. Here, we propose a systematic procedure for pre- and postprocessing generic point process data in an objective manner, to handle the data in the framework of a simple binary statistical model: the Ising or generalized McCulloch–Pitts model. The procedure has two steps: (1) determining the time bin size for transforming the point process data into discrete-time binary data and (2) screening relevant couplings from the estimated couplings. For the first step, we decide the optimal time bin size by introducing the null hypothesis that all neurons fire independently and then choosing a time bin size such that the null hypothesis is rejected under strict criteria. The likelihood associated with the null hypothesis is analytically evaluated and used in the rejection process. For the second, postprocessing step, after an estimate of the couplings is obtained from the preprocessed data set (any estimator can be used with the proposed procedure), the estimate is compared with many other estimates derived from data sets obtained by randomizing the original data set in the time direction. We accept the original estimate as relevant only if its absolute value is sufficiently larger than those of the randomized data sets. These manipulations suppress false-positive couplings induced by statistical noise. We apply this inference procedure to spiking data from synthetic and in vitro neuronal networks. The results show that the proposed procedure identifies the presence or absence of synaptic couplings fairly well, including their signs, for both the synthetic and experimental data. In particular, the results support the view that we can infer the physical connections of underlying systems in favorable situations, even when using a simple statistical model.

## 1 Introduction

Recent tremendous technological advances have enabled us to measure multipoint signals simultaneously, which may reveal much about the central nervous system and other biological systems (Weigt, White, Szurmant, Hoch, & Hwa, 2009; Lezon, Banavar, Cieplak, Maritan, & Fedoroff, 2006). Multipoint neuronal activities can be recorded by microelectrode arrays (Buzsáki, 2004; Brown, Kass, & Mitra, 2004; Steinmetz, Koch, Harris, & Carandini, 2018) or calcium imaging (Ikegaya et al., 2004; Cheng, Gonçalves, Golshani, Arisaka, & Portera-Cailliau, 2011; Grewe, Langer, Kasper, Kampa, & Helmchen, 2010). These types of studies provide rich data and help us to understand the mechanisms of information processing in large, coupled systems beyond the single-neuron level (Yuste, 2015; Roudi, Dunn, & Hertz, 2015; Grosmark & Buzsáki, 2016; Maass, 2016; Paninski & Cunningham, 2018).

Theoretical results from the field of statistical physics have offered powerful tools for inferring intrinsic structures from such measurements. In cases with undirected (symmetric) couplings, where the systems are in equilibrium, mean-field formulas for statistical inference have been developed (Kappen & Rodríguez, 1998; Tanaka, 1998; Sessak & Monasson, 2009; Roudi, Tyrcha, & Hertz, 2009) and successfully applied to biological data (Schneidman, Berry, Segev, & Bialek, 2006; Shlens et al., 2006; Cocco, Leibler, & Monasson, 2009; Tang et al., 2008; Ohiorhenuan et al., 2010). More recently, general cases with directed (asymmetric) connectivity structures in nonequilibrium states have become tractable at reasonable computational cost through improved systematic techniques (Roudi & Hertz, 2011; Mézard & Sakellariou, 2011; Zeng, Aurell, Alava, & Mahmoudi, 2011; Sakellariou, Roudi, Mezard, & Hertz, 2012; Aurell & Ekeberg, 2012; Zeng, Alava, Aurell, Hertz, & Roudi, 2013), which has led to applications to real data (Tyrcha, Roudi, Marsili, & Hertz, 2013; Dunn, Mørreaunet, & Roudi, 2015).

## 2 Inference Model and Methods

### 2.1 Inference Model: The Kinetic Ising Model

In this study, the maximum likelihood method forms the basis for inferring the parameters. However, direct maximization of the likelihood often incurs a substantial computational cost, which can become a bottleneck in the postprocessing screening step. Therefore, we use a simple mean-field formula to reduce the computational cost. This approximation preserves the inference accuracy in terms of coupling signs, as shown in section 4.4. The direct maximum likelihood method and other, more sophisticated formulas can, in principle, also be applied in combination with our methodology. The mean-field inverse formula derived by Roudi and Hertz (2011) is written as $J^{\mathrm{est}} = A^{-1} D C^{-1}$, where $m_i = \langle s_i(t) \rangle$, $C_{ij} = \langle s_i(t) s_j(t) \rangle - m_i m_j$, $D_{ij} = \langle s_i(t+1) s_j(t) \rangle - m_i m_j$, and $A_{ij} = (1 - m_i^2)\,\delta_{ij}$. The angled brackets represent the time average, which is expected to agree with the ensemble average under the ergodic assumption, which we adopt in this work (Churchland, Yu, Sahani, & Shenoy, 2007; Cunningham & Yu, 2008).
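Under the definitions above, the mean-field formula reduces to a few matrix operations. The following is a minimal NumPy sketch; the function name and the time-by-neuron raster layout are our own conventions, not from the original work:

```python
import numpy as np

def mean_field_couplings(S):
    """Mean-field estimate J_est = A^{-1} D C^{-1} (Roudi & Hertz, 2011).

    S : array of shape (T, N) with entries +1/-1; rows are time steps.
    Time averages stand in for ensemble averages (ergodic assumption).
    """
    S = np.asarray(S, dtype=float)
    m = S.mean(axis=0)                              # m_i = <s_i(t)>
    S0, S1 = S[:-1], S[1:]                          # states at t and t + 1
    C = (S0.T @ S0) / len(S0) - np.outer(m, m)      # C_ij = <s_i s_j> - m_i m_j
    D = (S1.T @ S0) / len(S0) - np.outer(m, m)      # D_ij = <s_i(t+1) s_j(t)> - m_i m_j
    A_inv = np.diag(1.0 / (1.0 - m ** 2))           # A_ij = (1 - m_i^2) delta_ij
    return A_inv @ D @ np.linalg.inv(C)
```

The cost is dominated by one matrix inversion, which is what makes this formula far cheaper than repeated likelihood maximization in the screening step below.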

Throughout this letter, we assume that the data do not involve dominant long-range collective modes of statistical fluctuations that interfere with accurate coupling inference (Das & Fiete, 2019). This marks the limit of the application range of the proposed methods. Although such modes are observed in some natural setups, data without the long-range modes have recently attracted much attention because they can be observed, for example, in neuronal activity during tasks (Dahmen, Grün, Diesmann, & Helias, 2019; Stringer, Pachitariu, Steinmetz, Carandini, & Harris, 2019) as well as in spontaneous activity. Therefore, we focus on these types of data. Note that the absence of the long-range modes in data can be tested by a statistical method, as shown in section 4.5.

### 2.2 Selecting Optimal Bin Size

For the kinetic Ising model and other types of generalized linear models (Okatan, Wilson, & Brown, 2005; Pillow et al., 2008; Field et al., 2010; Kobayashi et al., 2019), we have to determine the size of time bins when preprocessing raw data. Although it is known that the size of the time bins in binarizing the spiking data can affect the inference results (Capone, Filosa, Gigante, Ricci-Tersenghi, & Del Giudice, 2015), objective criteria for the bin size have not been considered sufficiently. To address this issue, we employ an information-theoretic method to determine the bin size in an objective way.

To apply the inverse kinetic Ising scheme to spike train data, we first determine the length $\Delta \tau$ of the time bin from an information-theoretic viewpoint (Kullback, 1997), following the procedure shown in Figure 1. We begin with the spike series, obtained at a measurement frequency $\Delta t^{-1}$, which sets the minimal time unit (i.e., temporal resolution). We divide the spike trains into time bins of width $\Delta \tau$ and binarize them, obtaining a spike raster of total time length $T$ in which each bin takes the value $+1$ or $-1$ to express the presence or absence of spikes in that bin, respectively. For the inverse formula to operate effectively, the raster must be coarse-grained with an appropriate $\Delta \tau$, which corresponds to the unit time of the kinetic Ising system.
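The binning step can be sketched as follows. The function name, the list-of-spike-times input format, and the clipping at the final bin edge are our own illustrative choices:

```python
import numpy as np

def binarize_spikes(spike_times, T, dtau):
    """Coarse-grain spike trains into a +/-1 raster with bin size dtau.

    spike_times : list of arrays; spike_times[i] holds neuron i's spike times
    T           : total recording length (same unit as the spike times)
    dtau        : time-bin width (Delta tau)
    Returns an array of shape (n_bins, N): +1 if the neuron spiked in the
    bin, -1 otherwise. Multiple spikes in one bin collapse to a single +1.
    """
    n_bins = int(np.ceil(T / dtau))
    raster = -np.ones((n_bins, len(spike_times)), dtype=int)
    for i, times in enumerate(spike_times):
        # map each spike time to its bin; clip a spike exactly at T
        idx = np.minimum((np.asarray(times) / dtau).astype(int), n_bins - 1)
        raster[idx, i] = 1
    return raster
```

The same routine is rerun for each candidate $\Delta \tau$ when searching for the optimal bin size.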

### 2.3 Screening Relevant Couplings

Once the optimal time bin size is decided, applying the inverse Ising formula to the coarse-grained binary sequence is straightforward. Calculating the estimated coupling matrix provides us with a continuous-valued $J^{\mathrm{est}}_{ij}$ for each pair $(i,j)$ of neurons. This continuity may make the results unclear because it is not easy to distinguish statistically significant couplings from the others. Therefore, we introduce an additional computational-statistical step (Aru et al., 2015; Xu, Aurell, Corander, & Kabashima, 2017; Xu, Puranen, Corander, & Kabashima, 2018) to extract the relevant couplings from $J^{\mathrm{est}}$. To prepare, we generate $L$ randomized time series, each obtained by shuffling the original coarse-grained sequence with $\Delta \tau^{\mathrm{opt}}$ in the time direction, independently for each element. For all of these series, we calculate the coupling matrices $\{(J^{\mathrm{est}}_{\mathrm{ran}})^{(r)}\}_{r=1}^{L}$. Then, for each pair $(i,j)$, we have $L$ reference values $\{(J^{\mathrm{est}}_{\mathrm{ran}})^{(r)}_{ij}\}_{r=1}^{L}$ against which to compare the value for the nonrandomized data, $J^{\mathrm{est}}_{ij}$. If the nonrandomized value is relevant, its absolute value should be sufficiently larger than the $L$ reference values. Accordingly, we accept $J^{\mathrm{est}}_{ij}$ as a relevant coupling only if its absolute value is larger than the $p_{\mathrm{th}} L$-th largest value among $\{|(J^{\mathrm{est}}_{\mathrm{ran}})^{(r)}_{ij}|\}_{r=1}^{L}$. The value $p_{\mathrm{th}}$ is a parameter that controls the tightness of this criterion and should be small. Although we use the simple mean-field formula to obtain $J^{\mathrm{est}}$ in this letter, this procedure can be applied to any other inference algorithm solving the inverse Ising problem. Similar data processing methods have been used in other contexts for spike statistics (Bialek & van Steveninck, 2005; Schwartz et al., 2006).
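The screening step can be sketched as follows, with any coupling estimator passed in as a callable. The function and parameter names are our own; the per-element time shuffling and the threshold at the $p_{\mathrm{th}} L$-th largest surrogate value follow the description above:

```python
import numpy as np

def screen_couplings(S, estimator, L=100, p_th=0.05, seed=0):
    """Keep J_ij only if |J_ij| exceeds the ceil(p_th * L)-th largest of
    L time-shuffled surrogate estimates, element by element.

    S         : (T, N) binary raster with entries +1/-1
    estimator : callable mapping a raster to an (N, N) coupling matrix
    Rejected couplings are set to zero.
    """
    rng = np.random.default_rng(seed)
    J = estimator(S)
    refs = np.empty((L,) + J.shape)
    for r in range(L):
        # shuffle each neuron's time series independently
        S_ran = np.column_stack([rng.permutation(S[:, j])
                                 for j in range(S.shape[1])])
        refs[r] = np.abs(estimator(S_ran))
    k = max(int(np.ceil(p_th * L)), 1)
    threshold = np.sort(refs, axis=0)[-k]   # k-th largest per element
    return np.where(np.abs(J) > threshold, J, 0.0)
```

Shuffling in time destroys the temporal dependencies that carry the coupling information while preserving each neuron's firing rate, so the surrogate estimates quantify the statistical noise floor.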

This screening procedure can incur a large computational cost when dealing with a huge amount of data. In such cases, it is better to use the approximation proposed in Terada, Obuchi, Isomura, and Kabashima (2018).

### 2.4 Symmetric Inference

If the coupling matrix is symmetric, the system is described by an equilibrium distribution in a static manner (Kappen & Rodríguez, 1998; Tanaka, 1998; Sessak & Monasson, 2009; Roudi et al., 2009). Hence, we use the mutual information between equal-time states, $I_{\Delta \tau}(s_i(t); s_j(t))$, instead of $I_{\Delta \tau}(s_i(t+1); s_j(t))$ in equation 2.4. The mean-field inverse formula is replaced by $J^{\mathrm{est}} = A^{-1} - C^{-1}$ (Kappen & Rodríguez, 1998; Tanaka, 1998).
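In code, the symmetric (equilibrium) formula differs from the asymmetric one only in using equal-time statistics; a minimal sketch, with our own naming conventions:

```python
import numpy as np

def mean_field_couplings_symmetric(S):
    """Equilibrium naive mean-field estimate J_est = A^{-1} - C^{-1}
    from equal-time statistics of a +/-1 raster S of shape (T, N)."""
    S = np.asarray(S, dtype=float)
    m = S.mean(axis=0)
    C = (S.T @ S) / len(S) - np.outer(m, m)  # equal-time covariance
    A_inv = np.diag(1.0 / (1.0 - m ** 2))    # A_ij = (1 - m_i^2) delta_ij
    return A_inv - np.linalg.inv(C)
```

Because $C$ is symmetric, the resulting estimate is symmetric by construction, which is what the comparison in section 3 exploits.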

To characterize the asymmetric property, we compare the results obtained using symmetric and asymmetric inferences in section 3.

## 3 Results

### 3.1 Synthetic Data: The Izhikevich Model

To confirm the effectiveness of our methods, we first adopt the Izhikevich model as an example. This is a standard neuronal model that generates neurobiologically plausible spike sequences (Izhikevich, 2003). We set 100 neurons (90 excitatory and 10 inhibitory) on an asymmetric cyclic chain, where each neuron projects synaptic connections and sends signals to up to three clockwise neighboring neurons. The coupling strengths are drawn from uniform distributions between 5 and 10 for the excitatory neurons and between $-20$ and $-10$ for the inhibitory neurons. The other parameters are set following the original work (Izhikevich, 2003).
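For readers unfamiliar with the model, a single-neuron sketch of the Izhikevich dynamics (Izhikevich, 2003) is given below. The network coupling and the parameter draws described above are omitted; the regular-spiking parameters and the simple Euler step are illustrative defaults, not the exact simulation code used here:

```python
def izhikevich(I, a=0.02, b=0.2, c=-65.0, d=8.0, dt=0.5, T=1000.0):
    """Single regular-spiking Izhikevich neuron (Izhikevich, 2003):
        v' = 0.04 v^2 + 5 v + 140 - u + I,   u' = a (b v - u),
    with reset v <- c, u <- u + d when v >= 30 mV.
    I is a constant input current; returns spike times in ms.
    """
    n_steps = int(T / dt)
    v, u = c, b * c           # start at the reset point
    spikes = []
    for k in range(n_steps):
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I)
        u += dt * a * (b * v - u)
        if v >= 30.0:         # spike: record and reset
            spikes.append(k * dt)
            v, u = c, u + d
    return spikes
```

In the network simulations, the input $I$ additionally includes the summed synaptic currents from the presynaptic neurons drawn as described above.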

We also consider systems on sparse and dense random networks. For each pair $(i,j)$, excluding self-pairs, a connection from $j$ to $i$ is generated with probability $q$. In the sparse case, the coupling strengths are drawn uniformly between 2 and 3 for the excitatory neurons and between $-6$ and $-4$ for the inhibitory neurons. In the dense case, they are drawn from $[0.8/q,\,0.8/q+1]$ for the excitatory neurons and from $[-2(0.8/q+1),\,-2(0.8/q)]$ for the inhibitory neurons. The other conditions, except for the coupling strengths, are the same as those in the chain model.

### 3.2 Experimental Data: In Vitro Neuronal Networks

Generally, we do not precisely know the underlying connectivity in real neuronal systems, which makes it difficult to assess the accuracy of inference methods. The in vitro networks examined here, however, offer a favorable situation: the physical circular restriction imposed on the synaptic connections allows us to validate the inference methods more easily.

For comparison, we also plot the counterpart results obtained from the conventional symmetric inference methods in Figures 5e to 5g. A similar chain-like network structure is generated; however, the coupling plot in Figure 5f tends to be noisier, and we find that the finer coupling structures are missed. These results highlight the effectiveness of our methods with the use of the asymmetric model.

## 4 Discussion

### 4.1 Robustness of Bin-Size Determination: Comparison with Cross-Correlation

Our statistical study confirmed that the proposed determination of the optimal bin size based on equation 2.4 is adequately robust, as shown in Figures 2c, 5b, and 5e. Here, we examine cross-correlation as a comparison, which is another quantity commonly used for determining the characteristic timescale of neuronal data (Ostojic, Brunel, & Hakim, 2009).

### 4.2 Finite-Size Bias of Mutual Information

We used the gross mutual information of binarized time series to determine the optimal time bin size for coupling inference in section 3. In the limit of large data size, we can obtain accurate estimates of the gross mutual information from the observed data. In real situations, however, the data size is finite, and hence it is important to assess the finite-size bias of this quantity. Thus, we calculate the bias of the mutual information $I(s_i(t+1); s_j(t))$ induced by the finiteness of the data size by using a resampling method.
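One simple way to realize such a resampling estimate is to average the plug-in mutual information over time-shuffled surrogates: their true mutual information is zero, so their average plug-in value directly estimates the finite-size bias. This is a sketch of one possible scheme, not necessarily the exact one used in our analysis:

```python
import numpy as np

def plugin_mi(x, y):
    """Plug-in (maximum likelihood) mutual information, in nats, between
    two +/-1 sequences of equal length."""
    mi = 0.0
    for a in (-1, 1):
        for b in (-1, 1):
            p_ab = np.mean((x == a) & (y == b))
            p_a, p_b = np.mean(x == a), np.mean(y == b)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def mi_bias(x, y, n_resamples=200, seed=0):
    """Estimate the finite-size bias of plugin_mi by destroying the
    dependence: average the MI of time-shuffled surrogates, whose true
    mutual information is zero."""
    rng = np.random.default_rng(seed)
    return np.mean([plugin_mi(rng.permutation(x), y)
                    for _ in range(n_resamples)])
```

For binary variables, the bias of the plug-in estimator is known to scale roughly as $1/(2T)$ nats for $T$ samples, so this correction matters mostly for short recordings or very small bins.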

### 4.3 Comparison to Other Regularization Methods

This letter proposes a method for screening significant couplings from the estimates. Several regularization methods to avoid overfitting have been proposed in other works (Zeng, Hertz, & Roudi, 2014; Decelle & Zhang, 2015; Bulso, Marsili, & Roudi, 2016). Among them, some sparsity-inducing regularizations also work as effective screening methods, and it is of interest to compare them with our method. Therefore, we examine $\ell_1$ regularization as a representative example of sparsity-inducing regularizations.
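As a concrete reference point, $\ell_1$-regularized maximum likelihood for one row of the kinetic Ising couplings can be implemented with proximal gradient descent (ISTA). The sketch below is our own illustration; the step size, iteration count, and the omission of the external field are simplifications:

```python
import numpy as np

def l1_kinetic_ising_row(X, y, lam=0.01, lr=0.05, n_iter=2000):
    """l1-regularized ML estimate of one neuron's incoming couplings in
    the kinetic Ising model, via proximal gradient descent (ISTA).

    X : (T, N) states s(t); y : (T,) states s_i(t+1); all entries +/-1.
    Minimizes -(1/T) sum_t [y_t H_t - log 2 cosh H_t] + lam * ||J||_1,
    where H_t = sum_j J_j X_tj (external field omitted for brevity).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    J = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = -(X.T @ (y - np.tanh(X @ J))) / len(y)  # gradient of -logL/T
        J = J - lr * grad
        # soft-thresholding: the proximal step for the l1 penalty
        J = np.sign(J) * np.maximum(np.abs(J) - lr * lam, 0.0)
    return J
```

The soft-thresholding step drives weak couplings exactly to zero, which is what makes $\ell_1$ regularization act as an implicit screening rule to compare against ours.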

### 4.4 Goodness of Fit of the Kinetic Ising Model

We investigate the goodness of fit achieved by the kinetic Ising model. Although we used the mean-field (MF) approximation as the inference algorithm in section 3, here we use the maximum likelihood (ML) method to evaluate the expressive ability of the kinetic Ising model (the generalized linear model) itself and to assess the accuracy of the MF approximation. To perform the ML method, we use the glmfit package in Matlab. This package treats the spike train data as $\sigma_i \in \{0,1\}$, where 1 indicates the existence of a spike and 0 indicates no spike, during direct application of the package code. We then rephrase the result in the Ising description, $s_i = \pm 1$, through the simple transformation of variables $s_i = 2\sigma_i - 1$. To evaluate the statistics in the kinetic Ising model, we generate binary data using the transition probability conditioned on the original data at the previous time step.
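The resampling step can be sketched as follows: given the observed states at time $t$, we draw the states at $t+1$ from the kinetic Ising transition probability $P(s_i(t+1) = \pm 1 \mid s(t)) = \exp(\pm H_i(t)) / (2 \cosh H_i(t))$ with $H_i(t) = h_i + \sum_j J_{ij} s_j(t)$. The function name and array conventions are our own:

```python
import numpy as np

def sample_next_states(S_prev, J, h=None, seed=0):
    """One-step samples from the kinetic Ising transition probability,
    conditioned on the observed states S_prev (rows are time steps).

    S_prev : (T, N) array of +/-1 states at time t
    J      : (N, N) coupling matrix, J[i, j] = coupling from j to i
    h      : optional (N,) external fields (defaults to zero)
    """
    rng = np.random.default_rng(seed)
    H = S_prev @ J.T + (0.0 if h is None else h)   # H_i(t) for every row
    p_plus = (1.0 + np.tanh(H)) / 2.0              # P(s_i(t+1) = +1 | s(t))
    return np.where(rng.random(H.shape) < p_plus, 1, -1)
```

Statistics of the sampled data (rates, correlations) can then be compared with the original data to assess the goodness of fit.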

| Methods | Correct Ratio: Existence | Absence | Excitatory | Inhibitory | Time (s) |
| --- | --- | --- | --- | --- | --- |
| Mean field | $0.9993 \pm 0.0015$ (s.d.) | $0.9979 \pm 0.0006$ | $1 \pm 0$ | $0.9933 \pm 0.0149$ | 55,344 |
| Maximum likelihood | 1 | 0.9944 | 1 | 1 | 529,772 |


### 4.5 Fundamental Limitation by Dominant Long-Range Collective Modes

Finally, we note that the presence of dominant long-range modes in a system can hide local statistics and degrade the inference accuracy. As recently reported in Das and Fiete (2019), most inference methods fail to reconstruct neuronal couplings in the presence of long-range modes. Indeed, if such modes are hidden in the data and contaminate the statistics directly related to the couplings, it is very difficult to infer the connectivity. However, using statistical quantities, we can assess the presence or absence of these long-range modes, allowing us to avoid these situations in advance. We use a recurrent network model introduced by Das and Fiete (2019) to demonstrate this.

## 5 Conclusion

In this letter, we proposed systematic methods equipped with objective criteria for processing spike time series data in the inverse problem of statistical inference models. The first method appropriately discretizes raw data into time bins; it showed superior effectiveness compared with the conventional cross-correlation method, as demonstrated in section 4.1. The second method effectively screens the couplings obtained as the solution of the inverse problem; as shown in section 4.3, it can be accelerated by the approximation of Terada, Obuchi, Isomura, and Kabashima (2018) while maintaining inference accuracy comparable to that of $\ell_1$ regularization. We showed that these methods work well on simulated and in vitro neuronal networks. These results highlight that the proposed inference procedure, with the use of the kinetic asymmetric Ising model, is quite effective and has the potential to infer actual connectivity, including neuronal excitatory and inhibitory properties. We stress that the methods described here for pre- and postprocessing data can be used with other types of generalized linear models.

We note that strong long-range modes may hide local statistics, and in such cases, we cannot reconstruct synaptic connectivity from observed spiking data as reported in Das and Fiete (2019). However, as shown in section 4.5, we can detect such long-range modes from the statistics of the neuronal activity. We can use the weighted average of the inverse participation ratios in the correlation matrix to discriminate such conditions, and this method may be useful before applying an inference procedure to neural data.
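As an illustration of this diagnostic, the inverse participation ratio (IPR) of each eigenvector of the correlation matrix can be averaged with eigenvalue weights: a delocalized dominant mode (long-range collective activity) yields an IPR of order $1/N$, whereas localized modes yield $O(1)$ values. The weighting below is one natural choice and may differ from the exact definition used in our analysis:

```python
import numpy as np

def weighted_ipr(C):
    """Eigenvalue-weighted average of the inverse participation ratios of
    the eigenvectors of a correlation matrix C.

    Small values (~1/N) signal a delocalized dominant mode, i.e., a
    long-range collective mode that can spoil coupling inference.
    """
    w, V = np.linalg.eigh(C)              # columns of V are unit-norm eigenvectors
    ipr = np.sum(V ** 4, axis=0)          # IPR of each eigenvector
    return np.sum(w * ipr) / np.sum(w)    # eigenvalue-weighted average
```

Running such a check before the inference procedure flags data sets on which coupling reconstruction is likely to fail.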

We also note two potential directions for future work. First, it is important to develop inference methods that handle hidden neurons, since we often do not have access to the entire relevant population (Battistin, Dunn, & Roudi, 2017). Statistical-mechanical analyses have been carried out for the symmetric (Huang, 2015) and asymmetric coupling cases (Dunn & Roudi, 2013; Battistin, Hertz, Tyrcha, & Roudi, 2015). Thus, it would be interesting to investigate whether our methods are effective in those cases. Second, although our treatment ignores the time dependence of the external force, it becomes important in many experimental settings where neurons are exposed to nonstationary input (Tyrcha, Roudi, Marsili, & Hertz, 2013). Hence, it is interesting to generalize the methods presented in this letter and apply them to cases with nonstationary inputs.

## Acknowledgments

We thank Abhranil Das and Ila R. Fiete for providing us with a sample code of their work (Das & Fiete, 2019). We are grateful to Yasser Roudi and John Hertz for fruitful discussion. This work was supported by the Special Postdoctoral Research Program at RIKEN (Y.T.), MEXT KAKENHI grant numbers 17H00764 (Y.T., T.O., and Y.K.), 18K11463 (T.O.), 19H01812 (T.O.), and 19K20365 (Y.T.), and JST CREST grant number JPMJCR1912 (T.O. and Y.K.). T.O. is also supported by a Grant for Basic Science Research Projects from the Sumitomo Foundation.