## Abstract

As our ability to record more neurons simultaneously grows, making sense of these data becomes a challenge. Functional connectivity is one popular way to study the relationships among multiple neural signals. Correlation-based methods are a set of currently well-used techniques for functional connectivity estimation. However, because of explaining away and unobserved common inputs (Stevenson, Rebesco, Miller, & Körding, 2008), they produce spurious connections. The generalized linear model (GLM), which models spike trains as Poisson processes (Okatan, Wilson, & Brown, 2005; Truccolo, Eden, Fellows, Donoghue, & Brown, 2005; Pillow et al., 2008), avoids these confounds. Here we develop a new class of methods that uses differential signals based on simulated intracellular voltage recordings; it is equivalent to a regularized AR(2) model. We also extend the method to simulated local field potential recordings and calcium imaging. On all of our simulated data, the differential covariance-based methods achieved performance better than or similar to the GLM method and required fewer data samples. This new class of methods provides alternative ways to analyze neural signals.

## 1 Introduction

Simultaneous recording of large populations of neurons is an inexorable trend in current neuroscience research (Kandel, Markram, Matthews, Yuste, & Koch, 2013). Over the last five decades, the number of simultaneously recorded neurons has doubled approximately every seven years (Stevenson & Kording, 2011). One way to make sense of these big data is to measure the functional connectivity between neurons (Friston, 2011) and link the function of the neural circuit to behavior. As previously reviewed (Stevenson, Rebesco, Miller, & Körding, 2008), correlation-based methods have long been used to estimate functional connectivity. However, they suffer from explaining away and from unobserved common inputs (Stevenson et al., 2008), which makes it difficult to interpret the link between the estimated correlation and the physiological network. More recently, Okatan, Wilson, and Brown (2005), Truccolo et al. (2005), and Pillow et al. (2008) applied the generalized linear model to spike train data and showed good probabilistic modeling of the data.

To overcome these issues, we developed a new class of methods that use not only the spike trains but the voltages of the neurons. They achieve better performance than the GLM method but are free from the Poisson process model and require fewer data samples. They provide directionality information about sources and sinks in a network, which is important to determine the hierarchical structure of a neural circuit.

In this article, we further show that our differential covariance method is equivalent to a regularized second-order multivariate autoregressive model. The multivariate autoregressive (MAR) model has been used to analyze neuroscience data (Friston, 2011; McIntosh & Gonzalez-Lima, 1991), and this model with an arbitrary order has been discussed previously (Harrison, Penny, & Friston, 2003). The continuous AR process, known as the Ornstein-Uhlenbeck (OU) process (Uhlenbeck & Ornstein, 1930), has also been applied to model neuronal activity (Burkitt, 2006; Ricciardi & Sacerdote, 1979; Lánskỳ & Rospars, 1995). However, the modified AR(2) model used in this article has not been applied previously.
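The letter's exact regularized AR(2) formulation is not reproduced in this excerpt, but a generic second-order multivariate autoregressive model of the kind referred to here can be written as (symbols are illustrative, not the letter's):

```latex
% Generic second-order MAR model: x_t is the vector of recorded signals,
% A_1, A_2 are coefficient matrices, and \varepsilon_t is gaussian noise.
x_t = A_1 x_{t-1} + A_2 x_{t-2} + \varepsilon_t,
\qquad \varepsilon_t \sim \mathcal{N}(0, \Sigma).
```

Under this reading, estimating a differential covariance with a two-step (central-difference) derivative amounts to fitting a constrained combination of $A_1$ and $A_2$, which is where the AR(2) equivalence comes from.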

All of the data that we analyze in this article are simulated data so that we can compare different methods with ground truth. We first generated data using a simple passive neuron model and provided theoretical proof for why the new methods perform better. Then we used a more realistic Hodgkin-Huxley (HH)–based thalamocortical model to simulate intracellular recordings and local field potential data. This model can successfully generate sleep patterns such as spindles and slow waves (Bazhenov, Timofeev, Steriade, & Sejnowski, 2002; Chen, Chauvette, Skorheim, Timofeev, & Bazhenov, 2012; Bonjean et al., 2011). Since the model has a cortical layer and a thalamic layer, we further assume that the neural signals in the cortical layer are visible by the recording instruments, while those from the thalamic layer are not. This is a reasonable assumption for many experiments. Since the thalamus is a deep brain structure, most experiments involve only measurements from the cortex.

Next, we simulated sparsely connected networks of 1000 Hodgkin-Huxley neurons with 80% excitatory and 20% inhibitory neurons. As in real experiments, we recorded simulated calcium signals from only a small percentage of the network (50 neurons) and compared the performance of different methods. In all simulations, our differential covariance-based methods achieved performance better than or similar to the GLM method, and on the LFP and calcium imaging data sets, they achieved the same performance with fewer data samples.

The article is organized as follows. In section 2, we introduce our new methods. In section 3, we show the performance of all methods and explain why ours perform better. In section 4, we discuss the advantages and generalizability of our methods and propose a few improvements for future work.

## 2 Methods

### 2.1 Simulation Models Used to Benchmark the Methods

#### 2.1.1 Passive Neuron Model

We added latent inputs to the system because unobserved common inputs exist in real-world problems (Stevenson et al., 2008). For example, one could use two-photon imaging to record calcium signals from a cortical circuit. The cortical circuit might receive synaptic inputs from deeper structures in the brain, such as the thalamus, which are not visible to the microscope. Each invisible neuron projects to many visible neurons, providing common synaptic currents to the cortical circuit and causing neurons in the cortical circuit to be highly correlated. We will discuss how to remove interference from these latent inputs.

#### 2.1.2 Thalamocortical Model

|  | PYTC | PYRE | TCPY | TCIN | PYIN | INPY | RERE | TCRE | RETC |
|---|---|---|---|---|---|---|---|---|---|
| Radius | 8 | 6 | 15 | 3 | 1 | 5 | 5 | 8 | 8 |


For each simulation, we simulated the network for 600 s. The data were sampled at 1000 Hz.

#### 2.1.3 Local Field Potential (LFP) Model

We place an LFP electrode at the center of each of the 50 PY neuron local circuits. The distance between the 10 local neurons is , and the distance between the PY subnetworks is 1 cm. The LFP is calculated according to the standard model in Bonjean et al. (2011), Destexhe (1998), and Nunez and Srinivasan (2005). The LFPs arise mainly from the elongated dendrites of the cortical pyramidal neurons. In our model, each cortical pyramidal neuron has a 2 mm long dendrite.
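The standard model cited above treats each pyramidal neuron's dendrite as a current source and sink pair. A minimal sketch of such a two-point dipole approximation follows; the function name, sign convention, and conductivity value are our assumptions, not the letter's exact implementation:

```python
import numpy as np

def lfp_at_electrode(i_syn, soma_pos, dend_pos, elec_pos, sigma=0.3):
    """Two-point dipole approximation of the LFP (a sketch; sigma is an
    illustrative extracellular conductivity in S/m).

    i_syn    : (n_neurons, n_samples) synaptic currents, assumed to enter
               at the dendritic end and return at the soma
    soma_pos : (n_neurons, 3) positions of the somatic ends
    dend_pos : (n_neurons, 3) positions of the dendritic ends
    elec_pos : (3,) electrode position
    """
    r_soma = np.linalg.norm(soma_pos - elec_pos, axis=1)  # (n_neurons,)
    r_dend = np.linalg.norm(dend_pos - elec_pos, axis=1)
    # Each neuron contributes a current source and an equal-magnitude sink.
    weights = (1.0 / r_dend - 1.0 / r_soma) / (4.0 * np.pi * sigma)
    return weights @ i_syn  # (n_samples,)
```

Note that when the electrode is equidistant from the two ends of a dendrite, that neuron's dipole contribution cancels, which is why the 2 mm dendrite orientation matters in this kind of model.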

#### 2.1.4 Calcium Imaging Model

In real experiments, we can image only a small percentage of the neurons in the brain. Therefore, we simulated HH-based networks of 1000 neurons and recorded from only 50 neurons. We used the four connection patterns (normal-1 to normal-4) provided online (https://www.kaggle.com/c/connectomics) (Stetter, Battaglia, Soriano, & Geisel, 2012). Briefly, the networks have 80% excitatory neurons and 20% inhibitory neurons. The connection probability is 0.01 (each neuron connects to about 10 others), so the network is sparsely connected.

Similar to our thalamocortical model, we used AMPA and NMDA synapses for the excitatory synapses and GABA synapses for the inhibitory synapses. The simulations ran for 600 s and were sampled at 1000 Hz. The intracellular voltages obtained from the simulations were then transformed into calcium fluorescence signals and downsampled to 50 Hz. For each of the four networks, we conducted 25 recordings, each containing calcium signals from 50 randomly selected neurons.
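The voltage-to-fluorescence step above can be sketched as follows. This is only a generic forward model we assume (threshold spike detection, an exponential calcium kernel, then downsampling); the threshold, decay constant, and amplitude are illustrative placeholders, not the letter's parameters:

```python
import numpy as np

def voltage_to_fluorescence(v, fs=1000, fs_out=50, v_thresh=0.5,
                            tau_ca=1.0, amp=1.0):
    """Sketch of a voltage -> calcium fluorescence forward model.

    v      : (n_samples,) intracellular voltage sampled at fs Hz
    fs_out : output sampling rate after downsampling (Hz)
    """
    # Detect upward threshold crossings as spikes.
    crossings = (v[1:] >= v_thresh) & (v[:-1] < v_thresh)
    spikes = np.concatenate([[False], crossings]).astype(float)
    # Convolve with a decaying exponential calcium kernel.
    t = np.arange(0.0, 5.0 * tau_ca, 1.0 / fs)
    kernel = amp * np.exp(-t / tau_ca)
    ca = np.convolve(spikes, kernel)[: len(v)]
    # Downsample, e.g., 1000 Hz -> 50 Hz.
    step = fs // fs_out
    return ca[::step]
```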

### 2.2 Differential Covariance-Based Methods

In this section, we introduce a new class of methods to estimate the functional connectivity of neurons (code is provided online at https://github.com/tigerwlin/diffCov).

#### 2.2.1 Step 1: Differential Covariance
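The core quantity of step 1 can be sketched as the (nonsymmetric) covariance between each recorded signal's time derivative and the signals themselves. The variable names and the central-difference choice below are our assumptions; the letter's exact equation is not reproduced in this excerpt:

```python
import numpy as np

def differential_covariance(V, dt=1.0):
    """Step 1 sketch: covariance between dV/dt (rows) and V (columns).

    V : (n_neurons, n_samples) recorded signals.
    """
    dV = (V[:, 2:] - V[:, :-2]) / (2.0 * dt)  # central difference
    Vm = V[:, 1:-1]                           # align V with dV in time
    dV = dV - dV.mean(axis=1, keepdims=True)
    Vm = Vm - Vm.mean(axis=1, keepdims=True)
    # Nonsymmetric by construction: rows index derivatives, columns signals.
    return dV @ Vm.T / (Vm.shape[1] - 1)
```

The resulting matrix is nonsymmetric, which is what later allows the method to indicate directionality (sources versus sinks).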

#### 2.2.2 Step 2: Applying Partial Covariance Method

As Stevenson et al. (2008) previously mentioned, one problem of the correlation method is the propagation of correlation. Here we designed a customized partial covariance algorithm to reduce this type of error in our methods. We use to denote the differential covariance estimation after applying the partial covariance method.

As we explain in section B.2, the partial covariance of the differential covariance is not equivalent to the inverse of the differential covariance estimation. The two are equivalent only for the covariance matrix and when the partial correlation is controlling on all variables in the observed set. In our case, the differential covariance matrix is nonsymmetric because it is the covariance between recorded signals and their differential signals. We have signals and differential signals in our observed set; however, we are controlling only on the original signals for the partial covariance algorithm. Due to these differences, we developed this customized partial covariance algorithm, equation 2.8, which performs well for neural signals in the form of equation 2.2.

#### 2.2.3 Step 3: Sparse Latent Regularization

Finally, we applied the sparse latent regularization method to the partial covariance version of the differential covariance (Chandrasekaran, Sanghavi, Parrilo, & Willsky, 2011; Yatsenko et al., 2015). As explained in section B.3, in the sparse latent regularization method, we assume that there are observed neurons and unobserved common inputs in a network. If the connections between the observed neurons are sparse and the number of unobserved common inputs is small, this method can separate the covariance into two parts, where the sparse matrix contains the intrinsic connections between the observed neurons.
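The sparse + low-rank split can be sketched with a simple alternating-shrinkage scheme. This is a stand-in for the convex program of Chandrasekaran et al. (2011), not the letter's exact solver, and the penalty weights `lam` and `mu` are illustrative knobs:

```python
import numpy as np

def sparse_latent_split(M, lam=0.1, mu=1.0, n_iter=200):
    """Step 3 sketch: split M into a sparse part S (intrinsic connections)
    and a low-rank part L (latent common inputs) by alternating shrinkage.
    """
    S = np.zeros_like(M)
    L = np.zeros_like(M)
    for _ in range(n_iter):
        # Soft-threshold the residual to update the sparse part.
        R = M - L
        S = np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)
        # Shrink singular values of the remaining residual -> low-rank part.
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = U @ np.diag(np.maximum(s - mu, 0.0)) @ Vt
    return S, L
```

The design intuition matches the text: entrywise shrinkage enforces sparsity on the intrinsic connections, while singular-value shrinkage confines the broad, correlated background from a small number of latent inputs to the low-rank term.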

#### 2.2.4 Computing Derivative

In differential covariance, computing the derivative using provides better estimation results than using . To elaborate on this point, we first explain our method's connection to the autoregressive (AR) method.

As explained above, using different methods to compute the derivative will produce different differential covariance estimators, and they are equivalent to estimators from different AR models for the connection matrix. In section 3, we show that the performances of these two estimators are significantly different.
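The two standard derivative schemes being compared can be written out explicitly; which scheme the letter prefers is elided in this excerpt, so both are given neutrally:

```python
import numpy as np

def forward_diff(v, dt=1.0):
    # dV(t) ~ (V(t+1) - V(t)) / dt; leads to an AR(1)-style estimator.
    return (v[1:] - v[:-1]) / dt

def central_diff(v, dt=1.0):
    # dV(t) ~ (V(t+1) - V(t-1)) / (2 dt); leads to an AR(2)-style estimator,
    # since it involves samples two steps apart.
    return (v[2:] - v[:-2]) / (2.0 * dt)
```

For a smooth signal, the central difference is second-order accurate while the forward difference is only first-order accurate, which is one plausible reason the two resulting estimators behave so differently.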

### 2.3 Performance Quantification

The performance of each method is judged by four quantified values. The first three values indicate the method's abilities to reduce the three types of false connections (see Figure 4). The last one indicates the method's ability to correctly estimate the true positive connections against all possible interference.
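One way to realize such a quantification, consistent with the ROC-based numbers reported later in Table 2, is a rank-based AUC computed per error mask. The masks and the use of absolute estimate values are our assumptions here; the letter's exact masks are defined by its figures:

```python
import numpy as np

def masked_auc(est, truth_mask, false_mask):
    """Sketch: AUC for separating true connections from one type of
    false connection, given boolean masks over the estimate matrix.
    """
    pos = np.abs(est[truth_mask])   # strengths at true connections
    neg = np.abs(est[false_mask])   # strengths at false connections
    # AUC = P(random true entry outranks random false entry), ties count 1/2.
    wins = (pos[:, None] > neg[None, :]).sum() \
        + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (pos.size * neg.size)
```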

## 3 Results

### 3.1 False Connections in Correlation-Based Methods

For convenience of explanation, we define the diagonal strip of connections in the ground truth (the first 50 neurons in Figure 3A) as the 3 and 4 diagonal lines, because they are three and four steps away from the diagonal of the matrix. As shown in Figure 3, all of these methods produce false connections on the diagonal lines. The precision matrix method (see Figure 3C) also shows a square-box-shaped pattern of false connections in the background.

#### 3.1.1 Type 1 False Connections

#### 3.1.2 Type 2 False Connections

The type 2 false connections shown in Figure 4B are due to the propagation of correlation: when one neuron connects to a second neuron, and that neuron connects to a third, the correlation method reports a correlation between the first and third neurons, which have no physical connection. This phenomenon appears in Figure 3B as the extra diagonal strips. This problem can be greatly reduced by the precision matrix/partial covariance method, as shown in Figure 3C.

#### 3.1.3 Type 3 False Connections

The type 3 false connections shown in Figure 4C are also due to common currents that pass into two neurons. In this case, however, they come from the unobserved neurons. For this particular passive neuron model, they are due to the inputs from the 10 unobserved neurons (neurons 51–60) shown in Figure 3A. Because the latent neurons have broad connections to the observed neurons, they introduce a square-box-shaped correlation pattern into the estimations (see Figure 3C; Figure 3B also contains this error, but it is hard to see). Since the latent neurons are not observable, the partial covariance method cannot be used to regress out this type of correlation. The sparse latent regularization can be applied if the sparse and low-rank assumption is valid; the sparse + latent regularized result is shown in Figure 3D. However, even after this regularization, the correlation-based methods still leave false connections in Figure 3D.

### 3.2 Estimations from Differential Covariance-Based Methods

#### 3.2.1 The Differential Covariance Method Reduces Type 1 False Connections

This shows that the strength of the type 1 false connections is reduced in the differential covariance method by a factor of . Moreover, the performance of the differential covariance method in reducing type 1 false connections is quantified in Figure 6.

#### 3.2.2 The Partial Covariance Method Reduces Type 2 False Connections

#### 3.2.3 The Sparse + Latent Regularization Reduces Type 3 False Connections

Third, the sparse + latent regularization removes the correlation introduced by the latent inputs. When the observed neurons' connections are sparse and the number of unobserved common inputs is small, the covariance introduced by the unobserved common inputs can be removed. As shown in Figure 5D, the external covariance in the background of Figure 5C is removed, while the true diagonal connections and the directionality of the connections are maintained. This regularization is also effective for correlation-based methods, but type 1 false connections remain in the estimation even after applying it (see Figure 3D). Each estimator's performance in reducing type 3 false connections is quantified in Figure 6.

#### 3.2.4 The Differential Covariance-Based Methods Provide Directionality Information of the Connections

As shown in section A.3, we have . We note here that there is another ambiguous setting that provides the same result, which is an inhibitory connection . Conceptually, the differential covariance indicates the current sources and sinks in a neural circuit, but the exact type of synapse is unknown.

#### 3.2.5 Performance Quantification for the Passive Neuron Data Set

In Figure 6, we quantified the performance of the estimators for one example data set. With increasing sample size, our differential covariance-based methods reduce the three types of false connections while maintaining high true positive rates. Also, as we apply more advanced techniques, the estimator's performance increases in all four panels of the quantification indices. Although the precision matrix and the sparse + latent regularization help the correlation method reduce type 2 and type 3 errors, all correlation-based methods handle the type 1 false connections poorly. We also note that the masks we used to quantify each type of false connection are not mutually exclusive (i.e., some false connections belong to more than one type). Therefore, in Figure 6, a single regularization can appear to reduce multiple types of false connections; for example, the sparse + latent regularization reduces both type 2 and type 3 false connections.

In Table 2, we provide quantified results (area under the ROC curve) for two connection patterns (cxcx34 and cxcx56789) and three conductance settings (g5, g30, and g50). The key results in Figure 6 generalize here: applying more advanced techniques to the original differential covariance estimator increases performance with respect to the three types of error without sacrificing the true positive rate. We also note that although the precision matrix and the sparse + latent regularization help the correlation method reduce type 2 and type 3 errors, all correlation-based methods handle the type 1 error poorly.

|  | Cov | Precision | Precision+SL-reg |  |  |  |
|---|---|---|---|---|---|---|
| **cxcx34 g5** |  |  |  |  |  |  |
| Error 1 | 0 | 0 | 0 | 0.6327 | 0.3469 | 0.8776 |
| Error 2 | 0.1469 | 0.9520 | 0.9915 | 0.3757 | 0.8347 | 1.0000 |
| Error 3 | 0.4638 | 0.9362 | 0.9797 | 0.6541 | 0.8391 | 0.9986 |
| True positive | 0.7312 | 1.0000 | 1.0000 | 0.9677 | 0.9946 | 1.0000 |
| **cxcx34 g30** |  |  |  |  |  |  |
| Error 1 | 0 | 0 | 0 | 0.0510 | 0.5816 | 0.9490 |
| Error 2 | 0.0056 | 0.8927 | 0.9972 | 0.2881 | 0.9548 | 1.0000 |
| Error 3 | 0.2164 | 0.9188 | 0.9942 | 0.5430 | 0.9662 | 1.0000 |
| True positive | 0.5591 | 1.0000 | 0.9892 | 0.6559 | 1.0000 | 1.0000 |
| **cxcx34 g50** |  |  |  |  |  |  |
| Error 1 | 0 | 0 | 0 | 0 | 0.2041 | 0.6531 |
| Error 2 | 0 | 0.7034 | 0.9944 | 0.0523 | 0.9054 | 1.0000 |
| Error 3 | 0.3179 | 0.8000 | 0.9894 | 0.4145 | 0.9309 | 1.0000 |
| True positive | 0.9140 | 1.0000 | 0.9946 | 0.9516 | 0.9785 | 1.0000 |
| **cxcx56789 g5** |  |  |  |  |  |  |
| Error 1 | 0 | 0.0053 | 0.0053 | 0.6895 | 0.6263 | 0.8526 |
| Error 2 | 0.1896 | 0.6229 | 0.7896 | 0.5240 | 0.7896 | 0.9938 |
| Error 3 | 0.3573 | 0.6085 | 0.7659 | 0.6957 | 0.7591 | 0.9817 |
| True positive | 0.6884 | 0.9442 | 0.6674 | 0.9930 | 0.9605 | 0.9837 |
| **cxcx56789 g50** |  |  |  |  |  |  |
| Error 1 | 0 | 0 | 0 | 0.0263 | 0.2816 | 0.6842 |
| Error 2 | 0.1083 | 0.5312 | 0.8240 | 0.2844 | 0.6990 | 0.9979 |
| Error 3 | 0.4256 | 0.4927 | 0.7762 | 0.5091 | 0.7116 | 0.9835 |
| True positive | 0.9256 | 0.9116 | 0.6698 | 0.9395 | 0.9279 | 0.9419 |


### 3.3 Thalamocortical Model Results

We tested the methods further in a more realistic Hodgkin-Huxley–based model. Because the synaptic conductances in this model are no longer constants but nonlinear dynamic functions of the presynaptic voltages, the previous derivations can be considered only a first-order approximation.

As in the passive neuron model, the correlation method in Figure 7B still suffers from the three types of false connections. The latent inputs generate false correlations in the background, and the diagonal-line false connections, which are due to the common currents, appear in all correlation-based methods (see Figures 7B–7D). Comparing Figures 7C and 7D, because the type 1 false connections are strong in the Hodgkin-Huxley–based model, the sparse + latent regularization removed the true connections but kept these false connections in its final estimation.

### 3.4 Simulated LFP Results

### 3.5 Simulated Calcium Imaging Results

## 4 Discussion

### 4.1 Generalizability and Applicability of the Differential Covariance-Based Methods to Real Experimental Data

Many methods have been proposed to solve the problem of reconstructing the connectivity of a neural network. Winterhalder et al. (2005) reviewed the nonparametric methods and Granger causality–based methods. Much progress has been made recently using the kinetic Ising model and Hopfield network (Huang, 2013; Dunn & Roudi, 2013; Battistin, Hertz, Tyrcha, & Roudi, 2015; Capone, Filosa, Gigante, Ricci-Tersenghi, & Del Giudice, 2015; Roudi & Hertz, 2011) with sparsity regularization (Pernice & Rotter, 2013). The GLM method (Okatan et al., 2005; Truccolo et al., 2005; Pillow et al., 2008) and the maximum entropy method (Schneidman, Berry, Segev, & Bialek, 2006) are two popular classes of methods and the main modern approaches for modeling multi-unit recordings (Roudi, Dunn, & Hertz, 2015).

In current research, more and more neurons are being recorded simultaneously, and new data analysis techniques are needed to handle larger data with higher dimensionality. The field favors algorithms that require fewer samples and scale well with dimensionality without sacrificing accuracy. Also, an algorithm that is model free, or makes minimal assumptions about the hidden structure of the data, has the potential to be applied to multiple types of neural recordings.

The key difference between our methods and others is that we use the relationship between a neuron's differential voltage and its voltage rather than the relationships between the voltages alone. This provides better performance because the differential voltage is a proxy for a neuron's synaptic current. Also, the relationship between a neuron's synaptic current and its input voltages is more linear, making it suitable for data analysis techniques such as the covariance method. While this linear relationship holds only for our passive neuron model, we still see similar or better performance of our methods in the examples based on the Hodgkin-Huxley model, where we relaxed this assumption and allowed loops in the networks. This implies that this class of methods remains applicable even when the ion channels' conductances vary nonlinearly with the voltages, so that the linear relationship holds only weakly.

### 4.2 Caveats and Future Directions

One open question for the differential covariance-based methods is how to improve the way they handle neural signals that are nonlinearly transformed from the intracellular voltages. Currently, to achieve good performance in the calcium imaging example, we need to assume we know the exact transfer function and reversely reconstruct the action potentials. We find this reverse transform method to be prone to additive gaussian noise. Further study is needed to find a better way to preprocess calcium imaging data for the differential covariance–based methods.

Across our simulations, the differential covariance-based methods achieve better performance, and they require fewer data samples in some simulations but not in others. Further investigation is needed to understand where this improvement comes from.

Our Hodgkin-Huxley simulations did not include axonal or synaptic delays, a critical feature of a real neural circuit. Unfortunately, it is nontrivial to add this feature to our Hodgkin-Huxley model. Nevertheless, we tested our methods with the passive neuron model using the same connection patterns but with random synaptic delays between neurons. In appendix D, we show that for up to a 10 ms uniformly distributed synaptic delay pattern, our methods still outperform the correlation-based methods.

## Appendix A: Differential Covariance Derivations

In this appendix we first build a simple three-neuron network to demonstrate that our differential covariance–based methods can reduce type 1 false connections. Then we develop a generalized theory showing that the type 1 false connections' strength is always lower in our differential covariance–based methods than the original correlation-based methods.

### A.1 A Three-Neuron Network

Notice that because the ratio between and is , in differential covariance, the type 1 false connection has value 0.

### A.2 Type 1 False Connection's Strength Is Reduced in Differential Covariance

- All neurons' leakage conductance and membrane capacitance are constants and the same.
- There is no loop in the network.
- , where is the maximum of , for

Then we prove below that:

- For two neurons that have a physical connection, their covariance is .
- For two neurons that do not have a physical connection, their covariance is .
- For two neurons that have a physical connection, their differential covariance is .
- For two neurons that do not have a physical connection, their differential covariance is .
- The type 1 false connection's strength is reduced in differential covariance.

The autocovariance of a neuron is . The covariance of two different neurons with or without physical connection is .

We prove this by induction.

*The basis*. The base case contains two neurons. From lemma 1, we have . Then, from lemma 2, we have .

So the statement holds.

*The inductive step*. If the statement holds for a network of neurons, we add one more neuron to it.

**Part 1**. First, we prove that the covariance of any neuron with is also .

From lemma 2, we have: , where are the neurons projecting to neuron and are the neurons projecting to neuron .

Note that because are neurons from the old network, is at most , and it is only when .

Here we separate the problem into two situations: cases 1 and 2.

*Case 1: Neuron i projects to neuron n.* Since there is no loop in the network, . Therefore, is the covariance of two neurons from the old network and is . must be larger than , such that is larger than .

Therefore, if a neuron's covariance with neuron is larger than , one of its antecedents' covariance with neuron is also larger than . Since we assume there is no loop in this network, there must be at least one antecedent (say, neuron ) whose covariance with neuron is larger than and it has no antecedent.

From lemma 2: since is , is , which is smaller than . This is a contradiction. So is no larger than . Therefore, is .

*Case 2: Neuron i does not project to neuron n.* Now, in equation A.24, it is possible that . However, in case 1, we just proved that the covariance of any neuron that projects to neuron n is . Therefore, is regardless of whether .

Then, similar to case 1, there must be an antecedent of neuron (say neuron ), whose covariance with neuron is larger than and it has no antecedent. Then from equation A.25, we know this is a contradiction. So is .

As we already proved that any neuron's covariance with neuron is , the dominant term in equation A.26 is . Therefore, the autocovariance of neuron is also .

The covariance of two neurons that are physically connected is . The covariance of two neurons that are not physically connected is .

If neuron and neuron are physically connected, let's say ; then . Thus, one of the for is . Therefore, is . Since there is no loop in the network, , so is . Therefore, is .

If neuron and neuron are not physically connected, we have , so is . And , so is . Therefore, is .

The differential covariance of two neurons that are physically connected is .

The differential covariance of two neurons that are not physically connected is .

Note:

- There is no physical connection between a neuron in and neuron ; otherwise, this neuron belongs to . Therefore, from theorem 4, is .
- There is no physical connection between a neuron in and neuron ; otherwise, this neuron belongs to . Therefore, from theorem 4, is .
- There could be physical connections between neurons in and , so is .
- There could be physical connections between neurons in and , so is .
- There could be physical connections between neurons in and neuron , so is .
- There could be physical connections between neurons in and neuron , so is .

From lemma 5, we know that is also .

Type 1 false connection's strength is reduced in differential covariance.

From theorems 3 and 4, we know that in the correlation method, the strength of a nonphysical connection is times that of a physical connection .

From theorems 6 and 7, we know that in the differential covariance method, the strength of a nonphysical connection is times that of a physical connection .

Because , the relative strength of the nonphysical connections is reduced in the differential covariance method.

### A.3 Directionality Information in Differential Covariance

If neuron projects to neuron with an excitatory connection, and .

As in theorem 7, from lemma 2, we have, for any neuron :

Note that in theorem 7, we already proved the scale of . Also:

- There are physical connections between neurons in and neuron , so is .
- From lemma 1, the autocovariance of neuron is . So is .

Therefore, is the dominant term in . Since , for excitatory connection , .

If neuron projects to neuron with an inhibitory connection, and .

## Appendix B: Benchmarked Methods

We compared our methods to a few popular methods.

### B.1 Covariance Method

### B.2 Precision Matrix

We begin by considering a pair of variables and removing the correlation introduced in them by a control variable .

So and differ by a ratio of .

### B.3 Sparse Latent Regularization

Prior studies (Banerjee, Ghaoui, d'Aspremont, & Natsoulis, 2006; Friedman, Hastie, & Tibshirani, 2008) have shown that regularization can provide a better estimation if the ground-truth connection matrix has a known structure (e.g., sparse). For all data tested in this letter, the sparse latent regularization (Yatsenko et al., 2015) worked best. For a fair comparison, we applied the sparse latent regularization to both the precision matrix method and our differential covariance method.

### B.4 The Generalized Linear Model Method

As summarized by Roudi et al. (2015), GLMs assume that every neuron spikes at a time-varying rate that depends on earlier spikes (both those of other neurons and its own) and on external covariates (such as a stimulus or other quantities measured in the experiment). As they explained, the influence of earlier spikes on the firing probability at a given time is assumed to depend on the time since they occurred. For each pre- and postsynaptic pair, this influence is described by a function of the time lag (Roudi et al., 2015). In this article, we average this time-dependent function over time to obtain the functional connectivity estimate of this method.
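The averaging step described above can be sketched as follows; the coupling filters themselves would come from a fitted GLM (e.g., Pillow et al.'s code), and the array layout here is our assumption:

```python
import numpy as np

def glm_connectivity(coupling_filters):
    """Collapse GLM coupling filters h_ij(tau) into a scalar
    functional-connectivity matrix by averaging over the time lag.

    coupling_filters : (n, n, n_lags) array, entry [i, j, :] being the
                       filter from presynaptic neuron j to neuron i.
    """
    return coupling_filters.mean(axis=2)
```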

The spike trains used for the GLM method were computed using the spike detection algorithm from Quiroga, Nadasdy, and Ben-Shaul (2004) (https://github.com/csn-le/wave_clus). The article's default parameter set is used except that the maximum detection threshold is increased. The code is provided in our github repository (https://github.com/tigerwlin/diffCov/blob/master/spDetect.m) for reference. The GLM code was obtained from Pillow et al. (2008) (http://pillowlab.princeton.edu/code_GLM.html) and applied on the spike trains.