## Abstract

Knowledge of synaptic input is crucial for understanding synaptic integration and ultimately neural function. However, *in vivo*, the rates at which synaptic inputs arrive are high, so that it is typically impossible to detect single events. We show here that it is nevertheless possible to extract the properties of the events and, in particular, to extract the event rate, the synaptic time constants, and the properties of the event size distribution from *in vivo* voltage-clamp recordings. Applied to cerebellar interneurons, our method reveals that the synaptic input rate increases from 600 Hz during rest to 1000 Hz during locomotion, while the amplitude and shape of the synaptic events are unaffected by this state change. This method thus complements existing methods to measure neural function *in vivo*.

## 1 Introduction

*in vitro*, or in cases where activity is artificially lowered, individual excitatory and inhibitory inputs can be resolved (see Figure 1a, top);

*in vivo*the rates are typically so high that this is impossible. Instead, the total synaptic current trace is wildly fluctuating, and single-event extraction methods will fail.

Nevertheless, information can still be extracted from the statistical properties of the recorded *in vivo* currents. Although individual synaptic events might not be distinguishable in the observed current trace, the trace will still bear signatures of the underlying events. Intuitively, the mean current should be proportional to the product of the synaptic event size and the total event frequency. But it is possible to extract other information as well. For instance, when the synaptic events have short time constants, the observed current trace will have more high-frequency content than when the synaptic time constants are slow. Similarly, when the input is composed of many small events, the variance of the current trace will be smaller than when it is composed of a few large events.

An early application of these ideas was used at the neuromuscular junction (Katz & Miledi, 1972) and in the retina to measure visually evoked transmitter release (Ashmore & Falk, 1982). Other earlier methods have estimated both the excitatory and inhibitory conductances using across-trial averages of current injections of different magnitude (Borg-Graham, Monier, & Frégnac, 1996; Häusser & Roth, 1997; Anderson, Carandini, & Ferster, 2000; Wehr & Zador, 2003; Rudolph, Piwkowska, Badoual, Bal, & Destexhe, 2004; Greenhill & Jones, 2007). More recently, conductances have been estimated from a single trace by applying a diverse range of probabilistic inference methods. In some of those studies, the size of the excitatory and inhibitory inputs is assumed to be identical, fixed, and known a priori, while the synaptic inputs were assumed to be -functions, with instantaneous rise and decay time and Poisson statistics (Kobayashi, Shinomoto, & Lansky, 2011). Some of the assumptions were relaxed in Paninski, Vidne, Depasquale, and Ferreira (2012), where the number of inputs in a time window followed either an exponential or a truncated gaussian distribution, but the synaptic decay time constant had to be known a priori. Finally, Lankarany, Zhu, Swamy, and Toyoizumi (2013) further generalized the distribution of the number of inputs in a time window by making use of a mixture of gaussians. This method allowed a good estimation of the conductance traces even when the distribution of synaptic amplitudes has long tails.

These methods typically attempt to recover the global excitatory and inhibitory conductances. Here instead we focus on recovering the statistics of the constituent synaptic events. Specifically, we introduce a method that aims to infer the event rate, synaptic time constants, and distribution of synaptic event amplitudes from the power spectral density and statistical moments of the observed current trace. We applied our method to voltage-clamp traces of electrotonically compact interneurons recorded in the cerebellum of awake mice. We find that during voluntary locomotion, the excitatory input rate increases from 600 to 1000 Hz, while the synaptic event amplitudes remain the same. Our method thus provides a novel way to resolve synaptic event properties *in vivo*.

## 2 Results

A common method to extract synaptic properties is to identify and analyze isolated events from current traces, but *in vivo* this fails because the events will overlap (see Figure 1a). To demonstrate the problem explicitly, we simulated a neuron randomly receiving excitatory synaptic events (see below for model details). For illustration purposes, we assume momentarily that the amplitude of all events is identical (50 pA). From the total current recorded in voltage clamp, we attempt to reconstruct the frequency of events and the distribution of their amplitudes.

We used single-event dectection software (see the appendix) to find putative postsynaptic currents (PSCs). At low-input frequencies (50 Hz), most PSCs were correctly identified, and the resulting estimation of the synaptic input amplitude distribution was correct. However, at higher frequencies, when the event interval became shorter than the synaptic decay time, the event frequency was grossly underestimated and reached a plateau (see Figure 1b, left). At this point the individual excitatory postsynaptic currents (EPSCs) overlapped and became indistinguishable. The reason is that the most probable intertime interval of a Poisson process (a common model for the inputs received by a neuron, but see Lindner, 2006) is zero. In addition, as a result of the overlap, the estimated PSC amplitude distribution had peaks at multiples of the original amplitude and the variance of the event amplitude was highly overestimated (see Figure 1b, right). Finally, at high-input frequencies, the traces had to be manually postprocessed to correct mistakes in event detection. This manual processing is time-consuming; even an experienced researcher spent more than 1 hour to analyze a 10 second trace. Thus, at high-input frequencies, single-event analysis is not only incorrect; it is also time-consuming. While a somewhat better result might be achievable when the PSCs are identified by their rising phase, such methods will still fail at high rates.

### 2.1 Generative Model for the Observed Current Trace

*in vitro*situation, the synaptic properties are not directly accessible from

*in vivo*recordings. Instead, the data indirectly and stochastically reflect the synaptic properties. We therefore use a generative model to couple the data, in particular, the statistics of the current trace, to the underlying synaptic properties. We define the generative model as follows. The synaptic inputs are assumed to arrive according to a Poisson process with a rate (see Figure 1a and section 3). The synaptic events are modeled with a biexponential time course, as this can accurately fit most fast synapses (e.g., Roth & van Rossum, 2009), with rise time and decay time . While we initially assumed that all PSCs have the same time constants, the effect of heterogeneous time constants is studied below. The total current is where denotes the time of event and is the amplitude of that event. Unlike the schematic example above, the event amplitudes are drawn from a synaptic amplitude distribution (with ). This distribution captures the spread of amplitudes across the population of synapses, as well as variation in single-synapse-event amplitudes due to randomness and nonstationarities such as short-term plasticity.

*LN*), with raw moments ; or (2) a stretched exponential distribution (

*SE*), with moments where is the gamma function; or (3) a zero-truncated-normal distribution (

*TN*), where , and and are the density of a normal distribution with zero-mean and unit variance and its cumulative. The mean and variance , where (for higher moments see, e.g., Horrace, 2013).

*LN*and the

*SE*distribution (for ) are heavy tailed, while the

*TN*distribution is not. Conveniently, all of these distributions are characterized by the two parameters and . These parameters determine the mean and variance of the events, but as can be seen from the equations for the moments above, the exact relations are different for each of the three candidate distributions.

#### 2.1.1 Moments of the Synaptic Current

#### 2.1.2 Power Spectrum of the Synaptic Current

### 2.2 Inference Procedure

Now that we have expressed both the PSD and the moments of the current distribution in the model parameters, one could proceed using classical fitting techniques, such as least square fitting, to find the synaptic parameters that best fit the data. However, we use a probabilistic approach that yields the distribution of parameters that best fit the data. A probabilistic approach is advantageous for three reasons. First, we expect strong correlations between the model parameters, which can cause traditional fitting to fail (see, e.g., Costa, Sjostrom, & van Rossum, 2013). Second, the probabilistic approach naturally yields the probability distribution of possible fit parameters. Third, the probabilistic approach offers a natural way to perform model selection.

We first present an idealized model, which ignores some distortions typical of *in vivo* recordings. Figure 2 shows the Bayesian network and the dependencies among the variables (nodes). The green nodes stand for variables that are measured directly from the data: the PSD and the first four moments of the current . Together the data are succinctly denoted . The orange nodes represent variables that are to be inferred. The five parameters of the model are the rate , the mean synaptic amplitude , its variance , synaptic risetime , and decay time , as well as the type of distribution (see Table 1). The set of parameters is denoted .

Parameter Name . | Description . |
---|---|

Measured data | |

Observed first four moments of the current | |

Power spectral density of | |

Parameters of idealized model | |

, | Rise and decay time of the EPSCs |

LN,SE,TN | Synaptic amplitude distribution = {log-normal, |

stretched exponential, truncated normal} | |

, | Mean and SD of the amplitude distribution |

Moments of the synaptic amplitude distribution | |

Frequency of synaptic inputs | |

Additional parameters of full model | |

Voltage-clamp baseline current | |

SD of high-frequency noise | |

SD of low-frequency fluctuations |

Parameter Name . | Description . |
---|---|

Measured data | |

Observed first four moments of the current | |

Power spectral density of | |

Parameters of idealized model | |

, | Rise and decay time of the EPSCs |

LN,SE,TN | Synaptic amplitude distribution = {log-normal, |

stretched exponential, truncated normal} | |

, | Mean and SD of the amplitude distribution |

Moments of the synaptic amplitude distribution | |

Frequency of synaptic inputs | |

Additional parameters of full model | |

Voltage-clamp baseline current | |

SD of high-frequency noise | |

SD of low-frequency fluctuations |

From this equation the parameter distribution given the data follows as . We now describe and the probabilistic dependencies among the nodes term by term. The first two terms infer the values for the synaptic time constants from the PSD. Since we cannot obtain an analytic expression of the likelihood of the PSD, we use empirical Bayes to set the prior on the time constants of the postsynaptic current (Casella, 1985). We fit the shape of equation 2.8 with a least square method to the PSD to find and (see the top-left inset in Figure 2; note that the values , , are not needed to perform this fit). Since we found the relative cross-terms of the Hessian matrix between and to be very small (), we model the time constants with independent gaussian distributions with mean and variance given by the mean and the Hessian of the PSD fit. A common criticism of empirical Bayes is that it uses data for both prior and inference, thus double-counting the data. Here, however, the PSD data are used to set the prior on the time constants, but the data are not used as evidence in the inference process (see Figure 2).

The next term in equation 2.9 is . This is a deterministic function, because the moments of the synaptic amplitude distribution are fully determined by , and the type of amplitude distribution type (see equation 2.3– 2.5). The parameters , , and are given uninformative uniform priors spanning a reasonable and positive range of values.

The final term in equation 2.9, the likelihood of the moments of the current , cannot be calculated analytically. Although equation 2.6 gives the expected value, is a stochastic quantity that due to the Poisson process is different on each run and thus requires simulation. Below, we present a method to speed up its calculation.

#### 2.2.1 Inclusion of In Vivo Variability and Other Experimental Confounds

*In vivo* voltage-clamp recordings show a number of effects that need to be included in the model with additional parameters. The first additional feature is the baseline current of the voltage clamp that has to be subtracted from the current. In *in vitro* situations, one can estimate it by finding the baseline of the current trace, but due to the high input rates, this is challenging for *in vivo* recordings. Instead a prior probability of was included. It was normally distributed with mean and variance estimated with an informed guess, reflecting the uncertainty in the value of .

*in vivo*synaptic activity (e.g., Schiemann et al., 2015). We relax the constant rate assumption by adding a modulation term to the Poisson rate, which is modeled as an OU process with power and cutoff frequency of 5 Hz: As a result, in the expression for the PSD, equation 2.8, the rate is replaced by , where the power spectrum of the OU process is given by . To find the variance of this slow noise, we fit the PSD with equation 2.8 in a range above and calculate its integral (the theoretical standard deviation of the modulation-free trace). The value of was set heuristically as the minimal value that resulted in a good fit. Since the observed variance of the signal is the sum of , , and (the slow component is independent of the underlying process), it follows that .

#### 2.2.2 Description of the Sampling Algorithm

In the Bayesian framework, the posterior probabilities of the parameters of the model can be estimated by sampling from , for instance, using a suitable Markov chain Monte Carlo (MCMC) algorithm. However, this approach is very slow, because the likelihood does not have a closed form and has to be estimated with multiple simulations after each MCMC sample. As the estimation of the likelihood takes about 1 minute on a standard PC, a typical MCMC run of about 100,000 samples would take approximately two months.

We introduce a speed-up that can be used whenever a likelihood can be obtained only by sampling from the generative model, but its means can be calculated analytically. The idea is to fit the likelihood with a kernel density estimate (KDE). Assuming that the shape of the likelihood does not depend much on the parameter values, the same KDE can be exploited to approximate the likelihood for different parameter values. As a result, we can keep the shape fixed, but we translate it to a new location determined by the analytically calculated average moments of the likelihood. A thorough validation shows that the method correctly infers the parameters across a wide range of biologically plausible values (see below).

### 2.3 Validation on Simulated Data

*in vivo*PSCs in cerebellar interneurons (Szapiro & Barbour, 2007, and below). We first assumed that the shape of the synaptic amplitude distribution (

*LN*,

*SE*, or

*TN*) is known a priori. Figure 4a compares the estimated parameters versus their true value. The inference works well in a physiologically plausible range, and the true value is almost always within the confidence interval. The largest error bars occur when either the mean event amplitude is small or the standard deviation is large (i.e., the CoV is large).

The approach also yields the inferred joint distribution for a given parameter setting. The posterior distribution of the parameters contains the true values in the region of maximal density (see Figure 4b). Unlike single-point estimates (e.g., maximum a posteriori MAP estimates), one can also evaluate the dependencies between the parameters. In particular we observe a strong anticorrelation between event rate and event size (bottom left panel). In other words, the model compensates for changes in the rate by changing the estimate for the event size; their product is approximately invariant.

#### 2.3.1 Model Selection

*LN, SE*, or

*TN*) when it is not known a priori. The Bayesian framework offers straightforward tools to assess the likelihood of a model, such as the deviance information criterion (DIC) (Spiegelhalter, Best, Carlin, & Linde, 2002). The higher the DIC is, the less likely is the model suitable to describe the data, and this would be the simplest way to choose the most likely distribution. However, the DIC value is a random variable that varies from trial to trial. Thus, rather than selecting the lowest DIC, we use Bayesian model comparison based on the distribution of the DIC values. We generated 100 traces using a given amplitude distribution and run the inference algorithm assuming either

*LN, SE,*or

*TN*amplitude distribution and calculate the DIC for each mode (see Figure 5a). From the three DIC values of the three models , , and (corresponding to the

*LN, SE,*and

*TN*model, respectively), we calculate two quantities: and . To find the most likely amplitude distribution, we apply Bayes theorem and calculate where, in the second line, we assumed that each amplitude distribution is a priori equiprobable. Thus, for each point in the space , we select the distribution that has the highest probability according to equation 2.13 (see Figure 5b). This method is able to correctly identify the amplitude distribution with approximately accuracy (see Figure 5c).

#### 2.3.2 Robustness of Method

*in vivo*traces and one can see more rapid modulation in the synaptic inputs. Indeed, longer traces lead to less uncertainty on the parameter (see Figure 6a). The analysis shows that 10 second recordings in general are enough to obtain a reasonable estimation of the parameters.

Next, *in vivo* PSCs rise and decay times might vary across synapses as different synapses may have different kinetic properties and may be subject to different amounts of dendritic filtering (Williams & Mitchell, 2008). To test whether our model performs well when the shape of the PSCs varies, both time constants that determine the PSC shape were independently drawn from truncated normal distributions for each PSC. When the heterogeneity of the time constants was modest (CV, the model correctly extracted , , and , but at larger values, the frequency estimate in particular diverges (see Figure 6b). One reason might be that our model assumes fixed time constants, but in this case, the cumulants of equation 2.6 should actually be averaged over the distribution of time constants.

Finally, we tested what happens when *in vivo* activity breaks the stationary assumption of the homogeneous Poisson model and inputs typically fluctuate on a slow timescale. To test the robustness of our model, we generate *in vivo*–like traces by adding an inhomogeneous component to the Poisson rate, modeled as an OU process with 5 Hz cutoff frequency. When we applied the correction described above, the model performs well even in the presence of considerable fluctuations in the synaptic input rate (see Figure 6c.i). Including the correction is important; without it, large biases arise (see Figure 6c.ii).

### 2.4 Inference Method Applied to Cerebellar in Vivo Data

*in vivo*recordings obtained from cerebellar interneurons in the molecular layer (basket and/or stellate cells). These neurons are ideal to test our method as they are electronically compact (Kondo & Marty, 1998), although some cable filtering can be observed in older animals (Abrahamsson, Cathala, Matsui, Shigemoto, & DiGregorio, 2012). The voltage clamp held neurons at 70 mV to isolate excitatory inputs. The head-restrained mice displayed bouts of self-paced voluntary locomotion on a cylindrical treadmill (see Figure 7a). All traces () were 90 seconds and contained at least 10 seconds of movement. Locomotion modulates subthreshold and spiking activity in a large number of brain regions (Dombeck, Graziano, & Tank, 2009; Polack, Friedman, & Golshani, 2013; Schiemann et al., 2015). In cerebellar interneurons, locomotion is associated with increased excitatory input drive (see Figure 7b). In particular we were interested in what underlies this increased drive. For instance, it could be caused by increased frequency, increased amplitude as an effect of neuromodulation, or recruitment of a distinct set of synapses.

To apply our method we extracted the PSD and distribution from the current trace (see Figure 7b). We corrected for the low-frequency modulation as described above, while the high-frequency noise was measured directly from the system. The subsequent inference showed that the increase in excitatory synaptic current is associated with an increased input frequency, shown for a representative trace in Figure 7c (bottom panel). However, in this cell movement did not lead to changes in the mean amplitude or the standard deviation of the synaptic amplitudes (see Figure 7c, top and middle panels). During movement, the input frequency roughly doubled, from 484 to 1233 Hz. The synaptic time constants found by fitting the power spectrum of the current were ms and ms (mean standard error), comparable to the 20% to 80% rise time of 0.41 0.14 ms and the 1.85 0.52 ms decay reported in slice (Szapiro & Barbour, 2007).

Across the population, the MAP estimates of , , and during quiet wakefulness and movement show a similar pattern (see Figure 7d and Table 2). Note that given the small changes between quiet and moving state, the power of the test calculated from the data is low, but 10% changes would be detected with high probability.

. | . | . | . | Power . | Power to Detect . | ||
---|---|---|---|---|---|---|---|

. | Quiet . | Movement . | . | . | . | ||

. | Mean . | SE . | Mean . | SE . | p-Value
. | of Data . | 10% Change . |

0.27 ms | 0.03 ms | 0.28 ms | 0.03 ms | 0.24 | 0.15 | 0.61 | |

1.68 ms | 0.22 ms | 1.65 ms | 0.19 ms | 0.61 | 0.1 | 0.74 | |

42.8 pA | 8.7 pA | 43.2 pA | 7.9 pA | 1.00 | 0.04 | 0.83 | |

31.3 pA | 6.2 pA | 31.0 pA | 4.9 pA | 0.86 | 0.04 | 0.42 | |

585 Hz | 153 Hz | 1006 Hz | 80 Hz | 0.03 | 0.93 | 0.07 |

. | . | . | . | Power . | Power to Detect . | ||
---|---|---|---|---|---|---|---|

. | Quiet . | Movement . | . | . | . | ||

. | Mean . | SE . | Mean . | SE . | p-Value
. | of Data . | 10% Change . |

0.27 ms | 0.03 ms | 0.28 ms | 0.03 ms | 0.24 | 0.15 | 0.61 | |

1.68 ms | 0.22 ms | 1.65 ms | 0.19 ms | 0.61 | 0.1 | 0.74 | |

42.8 pA | 8.7 pA | 43.2 pA | 7.9 pA | 1.00 | 0.04 | 0.83 | |

31.3 pA | 6.2 pA | 31.0 pA | 4.9 pA | 0.86 | 0.04 | 0.42 | |

585 Hz | 153 Hz | 1006 Hz | 80 Hz | 0.03 | 0.93 | 0.07 |

Next, we applied our inference method to each trace using the *LN* (log-normal), *SE* (stretched exponential), and *TN* (truncated normal) distribution and determined which synaptic amplitude distribution was the most likely, where it should be kept in mind that the statistical power of the data is limited. In general, we found that during both quiet periods and movement, the most likely distributions were heavy-tailed, being either LN or SE (with exponent on average 0.8, range 0.7–1.2; see Figure 7e). In particular, during active periods, the *LN* distribution (the most common) was significantly more likely than the *TN* ( = 0.046), but the *SE* distribution was not significantly less likely ( = 0.37). Thus, while this suggests that the distribution is stretched, the current data cannot distinguish between the *LN* and *SE* types. Next, we wondered if the shape changed using a binomial test. For instance, if a recording yielded five times the LN distribution out of eight data traces during the quiet period, and did so six out of eight times during the active period, there is no evidence for a change. We found no evidence for a change in the distribution shape between quiet and active period (*LN*, = 0.78; *SE*, = 0.96; *TN*, = 0.71).

Finally, we compared our estimates to a standard single-event extraction method (see the appendix). Because the event extraction method fails at frequencies higher than about 500 inputs per second, the frequency of the synaptic inputs is underestimated by a factor two due to the misclassification of overlapping events.

## 3 Discussion

In the past decade, numerous studies have been published using voltage-clamp data from anesthetized animals to investigate the contribution of excitation and inhibition to the dynamics, with recordings from auditory cortex (Wehr & Zador, 2003; Poo & Isaacson, 2009; Liu et al., 2010), visual cortex (Liu et al., 2010; Haider, Hausser, & Carandini, 2012), and prefrontal cortex (Haider, Duque, Hasenstaub, & McCormick, 2006). However, in these experiments, only the total excitatory or inhibitory contributions can be extracted; therefore, they are unable to distinguish properties of single synapses and changes therein. We proposed a novel probabilistic method to infer the synaptic time constants, the mean and variance of the synaptic event amplitude distribution, and the synaptic event rate from *in vivo* voltage-clamp traces. Moreover, the method accurately recovers the shape of the distribution of synaptic inputs. The inference is robust to slow fluctuations of synaptic input rate, experimental noise, and heterogeneity in the time constants of the PSCs.

The extracted distribution reflects the amplitude of events received by the neuron. It therefore includes not only variations across synapses but also variation due to synaptic unreliability and heterogeneity from effects like short-term synaptic plasticity (Szapiro & Barbour, 2007; Abrahamsson et al., 2012). Furthermore, the contribution of each synapse is weighted by its own input rate: synapses receiving inputs at higher rates will contribute more to the estimated amplitude distribution than synapses receiving low rates. Our method thus captures the effective distribution of synaptic inputs in an *in vivo* recording and thereby complements techniques that infer the amplitude distribution either anatomically from spine size or from paired recordings *in vitro* and that are not weighted by the input rate.

Applied to voltage-clamp recordings from cerebellar interneurons of awake mice, we found that the excitatory synaptic amplitude distribution is either a stretched exponential or log normal. This means that the probability for large events is larger than for a gaussian with the same mean and variance. Such heavy-tailed distributions have been observed in a number of systems (Sayer, Friedlander, & Redman, 1990; Song, Sjöström, Reigl, Nelson, & Chklovskii, 2005; Barbour et al., 2007; Ikegaya et al., 2013) and are believed to be an important characteristic of neural processing (Koulakov, Hromádka, & Zador, 2009; Roxin, Brunel, Hansel, Mongillo, & Vreeswijk, 2011; Teramae, Tsubo, & Fukai, 2012). While any distribution can be tested (although for efficiency reasons the moments should ideally be available analytically), a future goal is to reconstruct the amplitude distribution directly, for instance, by reconstructing it from its moments. However, there are currently no fully satisfactory mathematical methods to achieve this.

Furthermore, we found no evidence that the synaptic amplitude distribution changes in these neurons when the animal is moving. Instead, the increase of the excitatory current during movement is due to the higher frequency of the inputs. The most parsimonious explanation is that all inputs, big and small, increase their rates similarly during movement. However, it is important to remember that the method is based on the ensemble of inputs. While our findings are inconsistent with a case where only large inputs become more active and a case where all single synaptic events become stronger by, say, neuromodulation, we cannot rule out that, for instance, a second population of inputs with an identical amplitude distribution becomes active during movement.

We summarize restrictions of the method. First, as in most other methods, the *in vivo* traces need to be stationary over a period long enough to accumulate sufficient statistics. The second assumption is that the neurons are electronically compact such that a good “space clamp” can be achieved, which is problematic for Purkinje and pyramidal neurons (Williams & Mitchell, 2008). It would be of interest to examine how robust our method is toward deviations from this (Abrahamsson et al., 2012), for example, using compartmental simulations.

The third assumption is that synaptic inputs are uncorrelated and follow a Poisson distribution. Experimental measurements of correlations in the brain are contradictory and largely depend on what timescale is considered, reviewed in Cohen and Kohn (2011). Notably, slow correlations are visible in the PSD, adding a component with a different time constant (Moreno-Bote, Renart, & Parga, 2008). When fitting the PSD of *in vivo* data, we observed a bump in activity in the low frequencies ( Hz) that could correspond to spike correlations on timescales longer than about 15 ms. Such correlations are included in our model. The method would not be able to identify spike correlations on the order of the synaptic time constants ( and ) because they would contribute to the PSD in the same frequency range. However, it is generally believed that spike count correlations on a short timescale (about 2 ms) are small, normally below 0.03 (Smith & Kohn, 2008; Helias, Tetzlaff, & Diesmann, 2014; Grytskyy et al., 2013; Renart et al., 2010; Ecker et al., 2010), and thus the inference would likely still give correct results. Recent experimental evidence shows that high-frequency firing Purkinje cells contact interneurons directly (Witter, Rudolph, Pressler, Lahlaf, & Regehr, 2016), which could lead to strongly correlated input. We did not observe bumps in the interneuron powerspectum at around 200 Hz to 250 Hz, the typical cerebellar oscillation frequency (de Solages et al., 2008). Nevertheless, when applying this method, one should be aware of the possibility of such effects. Finally, in these population measurements, truly instantaneous correlations, where multiple events arrive simultaneously, in principle can never be distinguished from altered distributions. However, the error associated with this effect is likely limited. Consider a neuron that receives inputs of equal amplitude at a rate . If the inputs have correlation , every 100 events, as a first approximation, one will observe on average only 95 events: 90 of size and 5 of size . In general, for a given correlation , the observed frequency is and the observed average amplitude . Thus, even assuming , the error in the estimate would be 5% or less.

While the robustness of the method can be tested further using (multicompartment) simulations, physiological validation is much harder; even with optogenetics, there is no obvious way to generate high-frequency Poisson-like input trains.

The method used the first four moments of the measured current. While in order to infer the distribution shape, one needs at least three moments, we found that using only the first three led to a consistent overestimate of the event amplitude with some 10 pA (not shown). In contrast, higher moments are difficult to estimate with brief recordings. Thus, four seems a good middle ground for the recordings analyzed here.

In summary, commonly used methods to analyze *in vivo* voltage-clamp data cannot infer the single-event statistics at all or introduce large errors. Instead the proposed method represents an important step in extracting such information from *in vivo* intracellular recordings.

### Appendix: Implementation Details

We implemented the model in PyMC2, a Python package to perform Bayesian computation (Patil, Huard, & Fonnesbeck, 2010), using a Metropolis Hastings sampler, with normal proposal distribution and standard deviation in each dimension equal to 1 over the absolute value of the parameters. Usually the autocorrelation of the chains was about 300 to 500 samples, and the burn-in phase was about 10 effective samples. To construct the posterior, we generated 150,000 samples yielding approximately 400 effective samples and assessed the mixing by using the Geweke method provided by the PyMC package. The computational analysis tools and data are available at https://github.com/ppuggioni/invivoinfer.

To compare our method to traditional single-event detection methods, we employed TaroTools, a freely available IgorPro package (see sites.google.com/site/tarotoolsregister/) to detect putative postsynaptic currents (PSCs).

The experimental data are described in detail elsewhere (Jelitai, Puggioni, Ishikawa, Rinaldi, & Duguid, 2016). Briefly, whole-cell patch clamp recordings of molecular layer interneurons (basket and stellate cells) were obtained from awake, behaving, but head-restrained, mice at a depth of 100 to 300 m from the pial surface of the cerebellum, using a Multiclamp 700B amplifier (Molecular Devices, USA). The signal was filtered at 6–10 kHz and acquired at 10–20 kHz using PClamp 10 software in conjunction with a DigiData 1440 DAC interface (Molecular Devices). Patch pipettes (5–8 M) were filled with internal solution (285–295 mOsm) containing (in mM): 135 K-gluconate, 7 KCl, 10 HEPES, 10 sodium phosphocreatine, 2 MgATP, 2 Na2ATP, 0.5 Na2GTP and 1 mg/ml biocytin (pH adjusted to 7.2 with KOH). External solution contained (in mM): 150 NaCl, 2.5 KCl, 10 HEPES, 1.5 CaCl2, 1 MgCl2 (adjusted to pH 7.3 with NaOH). While biocytin was included in the pipette for histological identification, to allow for offline classification interneuron type, our recovery rate was relatively low (<10%) Thus, we were unable to differentiate further between different interneuron subtypes in this study.

To detect movement, the animals were filmed using a moderate frame rate digital camera (60 fps) synchronized with the electrophysiological recording. We defined a region of interest (ROI) covering the forepaws, trunk and face and calculated a motion index between successive frames (as in Schiemann et al., 2015). All movements (positioning, grooming and locomotion) were included.

## Acknowledgments

We are grateful to M. Nolan, P. Latham, and L. Acerbi for helpful discussions. This work was supported by the Engineering and Physical Sciences Research Council Doctoral Training Centre in Neuroinformatics, grant EP/F500386/1; the Biotechnology and Biological Sciences Research Council, grant BB/F529254/1 (PP); and a Wellcome Trust Career Development fellowship (grant WT086602MF) to I.D.