## Abstract

Generalized linear models (GLMs) have a wide range of applications in systems neuroscience describing the encoding of stimulus and behavioral variables, as well as the dynamics of single neurons. However, in any given experiment, many variables that have an impact on neural activity are not observed or not modeled. Here we demonstrate, in both theory and practice, how these omitted variables can result in biased parameter estimates for the effects that are included. In three case studies, we estimate tuning functions for common experiments in motor cortex, hippocampus, and visual cortex. We find that including traditionally omitted variables changes estimates of the original parameters and that modulation originally attributed to one variable is reduced after new variables are included. In GLMs describing single-neuron dynamics, we then demonstrate how postspike history effects can also be biased by omitted variables. Here we find that omitted variable bias can lead to mistaken conclusions about the stability of single-neuron firing. Omitted variable bias can appear in any model with confounders—where omitted variables modulate neural activity and the effects of the omitted variables covary with the included effects. Understanding how and to what extent omitted variable bias affects parameter estimates is likely to be important for interpreting the parameters and predictions of many neural encoding models.

## 1 Introduction

Regression models have been widely used in systems neuroscience to explain how external stimulus and task variables, as well as internal state variables, may relate to observed neural activity (Brown, Barbieri, Eden, & Frank, 2003; Kass, Ventura, & Brown, 2005). However, in many cases, the full set of variables that explain the activity of the observed neurons is not observed or is not even known. It is important to recognize that in these cases, omitted variables can cause the parameter estimates for the effects that are included in a regression model to be biased. That is, parameter estimates for the modeled effects would be different if other, omitted variables were to be included in the model (Box, 1966). In experiments from behaving animals (Niell & Stryker, 2010; Reimer et al., 2014), but also in more controlled sensory tasks (Arandia-Romero, Tanabe, Drugowitsch, Kohn, & Moreno-Bote, 2016; Kelly, Smith, Kass, & Lee, 2010), there is growing evidence that neural activity is affected by many more variables than are typically considered relevant (Kandler, Mao, McNaughton, & Bonin, 2017; Stringer et al., 2018). At the same time, although it has long been a concern in statistics (Pearl, 2009) and has received some attention in other fields (Clarke, 2005; Greenland, 1989), omitted variable bias, as a general problem, appears underappreciated in systems neuroscience. Here we demonstrate why systematically considering omitted variable bias may be important in neural data analysis and examine how omitted variable bias can affect one popular framework for describing neural spiking activity: the generalized linear model (GLM) with Poisson observations.

In general, regression methods aim to estimate variations in a response variable as a function of other variables or covariates. When the goal of modeling is to maximize prediction accuracy, such as with brain-machine interfaces, interpreting the model parameters may not be a high priority. However, in many other cases, parameter estimates are, at least to some extent, interpreted and analyzed. For instance, tuning curves or receptive fields may be measured and compared under different stimulus or task conditions or before and after a manipulation. In fully controlled experiments, where the covariates are assigned at random, estimated coefficients can often be interpreted as estimates of causal effects (Gelman & Hill, 2007). However, for many cases in neuroscience, it may be difficult or impossible to completely control or randomize all the relevant variables.

In modeling neural activity, omitted variable bias can appear in any situation where neurons are modulated by omitted variables and the omitted variables (often called confounders) are not independent from the variables included in the model—the ones whose effects we are trying to estimate. Minimizing the influence of confounding variables is a major part of most experimental design (Rust & Movshon, 2005), and the statistical effects of confounding variables are well understood (Wasserman, 2004). However, when the goal of modeling is description or explanation (Shmueli, 2010), the effects of these omitted variables are frequently neglected. To give a concrete example, imagine an idealized neuron in primary motor cortex (M1) whose firing, unlike typical M1 neurons (Georgopoulos, Kalaska, Caminiti, & Massey, 1982), is not at all modulated by reach direction but instead is modulated by reach speed (see Figure 1). In a typical experimental setting, an animal's reach directions are randomized, but reach speed cannot be randomized or tightly controlled. If the average speed differs across reach directions, such a hypothetical neuron will appear to be tuned to reach direction, despite not being directly affected by direction. First, fitting a typical tuning curve for reach direction, we would infer that such a neuron has a clear preferred direction and nonzero modulation depth. If we then fit a second model that included both reach direction and speed, we would infer that the neuron is modulated by speed alone, and it would be apparent that the original preferred direction and modulation depth estimates were biased due to the omitted variable.
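The confound in this hypothetical example is easy to reproduce in simulation. The sketch below (all parameter values are hypothetical, chosen only to make the confound visible) drives a model neuron with speed alone, lets mean speed covary with target direction, and then fits a direction-only cosine tuning curve; the fit shows a clear, but entirely illusory, preferred direction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical M1 neuron: firing depends only on reach speed, never direction.
n_trials = 5000
dirs = np.arange(0, 360, 45)
directions = rng.choice(dirs, size=n_trials)                # randomized targets
mean_speed = 10 + 3 * np.cos(np.deg2rad(directions - 45))   # speed covaries with direction
speed = rng.gamma(shape=20.0, scale=mean_speed / 20.0)      # trial-by-trial speeds
counts = rng.poisson(np.exp(1.0 + 0.08 * speed))            # speed-only modulation

# Direction-only "tuning curve": mean spike count per target direction,
# fit with a cosine model: tuning ~ b0 + b1*cos(theta) + b2*sin(theta).
tuning = np.array([counts[directions == d].mean() for d in dirs])
th = np.deg2rad(dirs)
X = np.column_stack([np.ones_like(th), np.cos(th), np.sin(th)])
b = np.linalg.lstsq(X, tuning, rcond=None)[0]
pd_est = np.rad2deg(np.arctan2(b[2], b[1])) % 360
depth = np.hypot(b[1], b[2])
print(f"apparent PD: {pd_est:.0f} deg, modulation depth: {depth:.2f} spikes")
```

Because average speed peaks at the 45 degree target, the direction-only fit recovers a preferred direction near 45 degrees and a nonzero modulation depth, even though direction has no direct effect on this simulated neuron; adding speed to the model removes the apparent tuning.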

In adding more variables, previous studies have largely focused on the fact that including previously omitted variables improves model accuracy or the fact that neural activity is often influenced by a host of task variables. In M1, for instance, including speed improves model accuracy (Moran & Schwartz, 1999), but the presence of many correlated motor variables (e.g., kinematics, end-point forces, muscle activity) makes it difficult to interpret how neurons represent movement overall (Humphrey, Schmidt, & Thompson, 1970; Omrani, Kaufman, Hatsopoulos, & Cheney, 2017). In this letter, instead of focusing on the advantages or complexities of models with many variables, we focus on the well-known, but underdiscussed, fact that the parameters describing the original effects change as additional variables are included. The hypothetical M1 neuron above points to a more general question about regression models of neural activity. What happens when we cannot or do not include variables that are relevant to the process that we are modeling?

Here we first evaluate the statistical problem of omitted variable bias in the canonical generalized linear model with Poisson observations. Then, as a case study, we examine how speed affects estimates of direction tuning of neurons in primary motor cortex, as well as two other case studies where the spike counts are modeled as a function of external variables: orientation tuning in primary visual cortex (V1) and place tuning in the hippocampus (HC). In each of these case studies, we find that commonly omitted variables (speed in M1, population activity in V1, and speed and heading in HC) can bias the estimated effects of commonly included variables (reach direction in M1, stimulus orientation or direction in V1, and place in HC). Across all three case studies, including the omitted variables reduces the estimated modulation due to typical tuning effects. We also illustrate how omitted variable bias can affect generalized linear models of spike dynamics where a postspike history filter aims to describe refractoriness and bursting (Truccolo, Eden, Fellows, Donoghue, & Brown, 2005). The goal of these models is typically to differentiate aspects of spike dynamics that are due to the neurons' own properties (e.g., membrane time constant, resting potential, after-hyperpolarization currents) from those due to input to the neuron from other sources (Brillinger & Segundo, 1979; Paninski, 2004). In this setting, the input to the neuron is typically not directly observed but is approximated by stimulus or behavioral covariates, local field potential, or the activity of other neurons. Here we show that omitting the input can lead to large biases in postspike history filters and that including omitted variables describing the input can change the interpretation and stability of the estimated history effects.

GLMs have been used in many settings to disentangle the effects of multiple, possibly correlated, stimulus or task variables (Fernandes, Stevenson, Phillips, Segraves, & Kording, 2014; Park, Meister, Huk, & Pillow, 2014; Runyan, Piasini, Panzeri, & Harvey, 2017) and also to model neural mechanisms such as postspike dynamics, interactions between neurons, and coupling to local fields (Harris, Csicsvari, Hirase, Dragoi, & Buzsáki, 2003; Pillow et al., 2008; Truccolo et al., 2005). It is often argued that GLMs are advantageous because they have unique maximum likelihood estimates and can be more robust to nonspherical covariate distributions than other methods, such as spike-triggered averaging (Paninski, 2004; Pillow, 2007). Although these advantages are important, GLMs are not immune to bias. Here we show how the possibility of omitted variable bias, in particular, should encourage researchers to be cautious in their interpretation of model parameters, even in cases where a GLM achieves high predictive accuracy (Shmueli, 2010).

## 2 Results

Here we introduce the problem of omitted variable bias and examine differences between omitted variable bias in linear models and the canonical Poisson GLM. We then consider three tuning curve estimation problems: estimating direction tuning in primary motor cortex, place tuning in hippocampus, and orientation tuning in primary visual cortex, and show how omitted variables in each of these three cases can alter parameter estimates. Finally, we consider a GLM that aims to describe the dynamics of postspike history and show how omitted inputs can bias the estimated history effects and qualitatively change model stability.

### 2.1 Omitted Variable Bias in Linear Regression and Canonical Poisson GLMs

When relevant variables are not included in a regression model, the estimated effects for the variables that are included can be biased (Box, 1966). Omitted variable biases can cause the parameters describing the effects of the original variables to be over- or underestimated, and model fits can change qualitatively when omitted variables are included (see Figure 1).
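In the linear model, the form of this bias is a standard textbook result. If the true model is $y = X\beta + Z\gamma + \varepsilon$ but $y$ is regressed on $X$ alone, then

```latex
\mathbb{E}[\hat{\beta}]
  = \mathbb{E}\!\left[(X^\top X)^{-1} X^\top y\right]
  = \beta + (X^\top X)^{-1} X^\top Z \gamma .
```

The bias term $(X^\top X)^{-1} X^\top Z \gamma$ vanishes only when the omitted variables have no effect ($\gamma = 0$) or are orthogonal to the included ones ($X^\top Z = 0$). For the canonical Poisson GLM, where $\mathbb{E}[y \mid X, Z] = \exp(X\beta + Z\gamma)$, there is no comparable closed form, but the analogous intuition holds: if $Z$ is independent of $X$, the expectation factorizes and, asymptotically, only the intercept absorbs the omitted effect, whereas dependence between $X$ and $Z$ biases the remaining coefficients as well.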

### 2.2 Omitted Variable Bias in Tuning Curve Estimation

When fitting tuning curve models to spike count data, omitted variable bias can cause preferred stimuli and modulation depths to be misestimated and can even lead to completely illusory tuning (see Figure 1). To illustrate how omitted variable bias affects GLMs of neural spiking, not just in theory but in practice, we consider three case studies where we fit typical tuning curve models that omit potentially relevant variables along with augmented models that include these additional variables. We first consider modeling spike counts across trials and on relatively slow (over 100 ms) timescales. Here we assess (1) the tuning of neurons in motor cortex to reach direction, with speed as a potential omitted variable; (2) the tuning of neurons in hippocampus to position, with both speed and head direction as potential omitted variables; and (3) the tuning of neurons in visual cortex to the direction of motion of a sine-wave grating, with population activity as a potential omitted variable. In each of these case studies, we show how the omitted variables are not independent of the commonly included variables and how neural responses are modulated by the omitted variables. Together, these two properties can lead to omitted variable biases.

In our first case study, we model data recorded from primary motor cortex (M1) of a macaque monkey performing a center-out, planar reaching task. In this task, speed differs systematically across reach directions (see Figure 2A), with average speed differing by as much as 35 ± 3% (for the targets at 45 and 225 degrees relative to right; see Figure 2B). To model neural responses, we first fit a traditional tuning curve model (Amirikian & Georgopulos, 2000; Georgopoulos et al., 1982), where the predicted responses depend only on target direction. Here we use a circular, cubic B-spline basis (five equally spaced knots) to allow for deviations from sinusoidal firing, but in most cases, the responses of the $n=81$ neurons in this experiment are well described by cosine-like tuning curves with clear modulation for reach direction. We then fit a second model that includes effects from movement speed. Here we use covariates based on Moran and Schwartz (1999), including a linear speed effect, as well as cosine-tuned direction-speed interactions (see section 4). This model captures the responses of individual neurons, where spike counts can increase (see Figure 2C, top) or decrease (see Figure 2C, middle) as a function of speed, and, in some cases, speed and direction appear to interact (see Figure 2C, bottom). Together, the fact that direction and speed are not independent, along with the fact that neural responses appear to be modulated by speed, could lead to biased parameter estimates for the model where speed is omitted.
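The two designs being compared can be sketched as follows. This is an illustrative stand-in, not the fitted models themselves: a low-order Fourier basis replaces the circular cubic B-splines used in the text, and the speed terms follow the Moran and Schwartz-style form described above.

```python
import numpy as np

def direction_basis(theta, n_harmonics=2):
    # Periodic basis for direction tuning. The text uses circular cubic
    # B-splines (five knots); a low-order Fourier basis is a simpler
    # periodic stand-in for illustration.
    cols = [np.ones_like(theta)]
    for k in range(1, n_harmonics + 1):
        cols += [np.cos(k * theta), np.sin(k * theta)]
    return np.column_stack(cols)

def design_with_speed(theta, speed):
    # Augmented design: a linear speed effect plus cosine-tuned
    # direction-speed interactions (in the style of Moran & Schwartz, 1999).
    Xd = direction_basis(theta)
    Xs = np.column_stack([speed, speed * np.cos(theta), speed * np.sin(theta)])
    return np.column_stack([Xd, Xs])

theta = np.deg2rad(np.array([0.0, 45.0, 90.0]))
speed = np.array([8.0, 10.0, 12.0])
X_full = design_with_speed(theta, speed)
print(X_full.shape)  # (3, 8): 5 direction columns + 3 speed columns
```

Fitting the direction-only model uses `direction_basis` alone; the comparison in the text is between Poisson GLMs fit on these two designs.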

Comparing the models with and without omitted variables, we find that, averaged across the population, there are only minimal shifts in the preferred direction (3 ± 2 degrees) when speed is included in the traditional tuning curve model, and there do not appear to be large, systematic shifts in the population distribution of PDs (Kuiper's test, $p>0.1$). At the same time, there is substantial variability between neurons in the size of the PD shift (circular SD 32 ± 5 degrees). Across the population, modulation depth (measured using the standard deviation of the tuning curve) decreases slightly on average (3 ± 2%), and the size of the modulation change also varies substantially between individual neurons (SD of changes 18 ± 3%). An example neuron in Figure 2C (bottom), for instance, has a modulation decrease of 9 ± 5%, and the preferred direction changes 4 ± 9 degrees when speed is included in the model (standard error from bootstrapping). Overall, approximately 10% of neurons have statistically significant changes in PD, and approximately 14% have significant changes in modulation (bootstrap tests, $\alpha =0.05$, not corrected for multiple comparisons). For some individual neurons, at least, the parameters of the model without speed thus have clear omitted variable bias. However, since individual neurons have diverse speed dependencies, in this case, the average biases across the population are minimal.
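The two summary statistics used throughout can be computed from a fitted tuning curve as follows (a hypothetical helper, with modulation depth defined, as in the text, as the SD of the tuning curve):

```python
import numpy as np

def pd_and_modulation(tuning_fn, n_grid=360):
    # Summarize a fitted tuning curve: preferred direction (PD) is the
    # curve's argmax; modulation depth is the SD of the curve evaluated
    # over uniformly spaced directions (the measure used in the text).
    theta = np.linspace(0, 2 * np.pi, n_grid, endpoint=False)
    rates = tuning_fn(theta)
    pd = np.rad2deg(theta[np.argmax(rates)])
    return pd, rates.std()

# Toy cosine curve with PD at 90 degrees and amplitude 4:
pd, depth = pd_and_modulation(lambda th: 10 + 4 * np.cos(th - np.pi / 2))
print(pd, depth)  # PD ≈ 90 deg; depth = 4/sqrt(2) ≈ 2.83
```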

When speed is included in the model, model accuracy (pseudo-$R^2$) does increase slightly ($p=0.01$, one-sided paired $t$-test across neurons). The average cross-validated (jackknife) pseudo-$R^2$ for the original model is 0.23 ± 0.01 and for the model with speed, 0.24 ± 0.01 (see Figure 5). However, it seems likely that in other experimental contexts, the effects of omitting speed could be more pronounced. By requiring the animal to make reaches to the same targets at different speeds, previous studies have more clearly demonstrated that responses in M1 are modulated by speed (Churchland, Santhanam, & Shenoy, 2006). Here we demonstrate how this type of modulation can lead to omitted variable biases in the estimated parameters of typical tuning curve models without speed.
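The accuracy measure can be computed as a deviance-based pseudo-$R^2$. The helper below is one common definition (the text does not spell out its exact variant, so treat this as an assumption): the fraction of the null model's deviance explained by the fitted rates.

```python
import numpy as np
from scipy.special import gammaln

def poisson_pseudo_r2(y, mu):
    # Deviance-based pseudo-R^2 for Poisson counts y with predicted mean
    # counts mu: 1 - (LL_sat - LL_model) / (LL_sat - LL_null), where the
    # saturated model predicts each count exactly and the null model
    # predicts the overall mean.
    eps = 1e-12
    def ll(m):
        return np.sum(y * np.log(m + eps) - m - gammaln(y + 1))
    ll_model = ll(mu)
    ll_null = ll(np.full_like(y, y.mean(), dtype=float))
    ll_sat = ll(y.astype(float))
    return 1 - (ll_sat - ll_model) / (ll_sat - ll_null)

y = np.array([0, 1, 2, 4, 3, 0])
r2_perfect = poisson_pseudo_r2(y, y.astype(float))    # saturated fit
r2_null = poisson_pseudo_r2(y, np.full(6, y.mean()))  # homogeneous fit
print(r2_perfect, r2_null)  # 1.0 and 0.0
```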

In our second case study, we examine the activity of neurons in the dorsal hippocampus of a rat foraging in an open field. Here we consider to what extent the practice of omitting speed and head direction from a place field model biases estimates of a neuron's position tuning. As in the first case study, omitted variable bias can occur if neural activity is modulated by omitted variables and the omitted variables covary with the included variables. In the case of the hippocampus, neural activity is known to be modulated by both movement speed and head direction (McNaughton, Barnes, & O'Keefe, 1983), in addition to an animal's position (O'Keefe & Dostrovsky, 1971). Additionally, behavioral variables can be highly nonuniform across the open field (Walsh & Cummins, 1976)—for instance, near and far from the walls. Together the fact that the omitted variables may covary with position and the fact that neurons appear to be modulated by the omitted variables suggest that there may be omitted variable bias.

Here, in one recording during an open field foraging task, we find that the average speed (see Figure 2A) and heading (see Figure 2B) differ extensively as a function of position. Within a given neuron's place field, the distributions of speed and heading may be very different from their marginal distributions. Across the population of $n=68$ place cells (selected from 117 simultaneously recorded neurons; see section 4), average in-field speed was between 80% and 135% of the average overall speed (5.5 cm/s), and the animal's heading can be either more or less variable in-field (circular SD 57-80 degrees) compared to overall (75 degrees).

As previous studies have shown, we also find that neural responses are modulated by speed and head direction. Responses due to place, speed, and heading are shown for one example neuron in Figure 3. This neuron shows a stereotypical place-dependent response (see Figure 3B), but splitting the observations into speed quartiles (see Figure 3C, top) or heading quadrants (see Figure 3C, bottom) reveals that there is also tuning to these variables. The neuron appears to increase its firing with increasing speed and responds most strongly when the rat is facing to the left. These dependencies are well fit by the full model where the firing rate depends not just on position, but also on the (log-transformed) speed and the heading (see Figure 3D, bottom). For the example place cell shown here, the location of the place field does not change substantially when the omitted variables are included (see Figure 3E). However, the modulation (SD of the rate map) decreases by 27%. That is, 27% of the apparent modulation due to position when it is modeled alone can be explained by speed and heading effects.

Across the population of place cells, there were no clear, systematic differences in the place field locations, but the modulation (SD of the rate map $\lambda(x)$) decreases by 9 ± 1% on average when speed and heading are included. Individual neurons showed substantial variability in their modulation differences (population SD 10 ± 1%). As in M1, including the omitted variables increased spike prediction accuracy: the average cross-validated (10-fold) pseudo-$R^2$ was 0.29 ± 0.02 for the original model and 0.31 ± 0.02 for the model including speed and heading. This difference appears small, since there is large variability in pseudo-$R^2$ values across the population, but the average within-neuron increase in pseudo-$R^2$ was 11 ± 3% (see Figure 5). Given that neurons appear to be modulated by speed and heading, it is unsurprising that including these variables improves model fit. However, as before, it is important to note that this modulation can lead to biases in the place field estimates for the model with only position.

In our third case study, we examine the activity of neurons in a more controlled sensory experiment. Here we use data recorded from primary visual cortex (V1) of an anaesthetized monkey viewing oriented sine-wave gratings moving in 1 of 12 different directions (see section 4). In this experiment, variability in the animal's behavior is purposefully minimized, and instead of considering the effect of omitting a behavioral variable, we consider the effect of omitting a variable relating to the animal's internal state: the total population activity. Several studies have shown that population activity alters neural responses in V1 (Arandia-Romero et al., 2016; Arieli, Sterkin, Grinvald, & Aertsen, 1996; Kelly et al., 2010; Okun et al., 2015). If the distribution of population activity also varies with stimulus direction, there is the potential for omitted variable bias.

Here we assess neural activity from $n=90$ simultaneously recorded neurons across many (2400) repeated trials with 12 different movement directions. We find high trial-to-trial variability in the population rate (see Figure 4A), and the average firing across all neurons differs across stimulus directions by up to about 50%. For this recording, the most extreme differences were between the 180 degree stimulus, where the average rate across the population was 3.4 ± 0.1 Hz, and the 60 degree stimulus, where the average rate was 6.3 ± 0.1 Hz (see Figure 4B). By adding the (log-transformed) population rate as a covariate to a more typical model of direction tuning, we find that population activity may lead to omitted variable bias in models of direction tuning alone.

As in the case studies above, there do not appear to be any consistent or systematic effects on the preferred stimulus direction at the population level (Kuiper's test, $p = 0.1$). However, the modulation depth (measured using the SD of the tuning curve) decreases substantially (15 ± 2%) when population rate is included in the model, and there is again high variability across neurons (SD 20 ± 2%). In this case, model accuracy increases substantially when the omitted variable is included. The cross-validated (10-fold) pseudo-$R^2$ is 0.26 ± 0.02 for the original model and 0.43 ± 0.02 for the model including population activity, with an average increase of 164 ± 31% (see Figure 5).

Unlike in M1, where the effect of speed was highly diverse for different neurons, in this case study, the effect of the population rate is largely consistent. Higher population rates are associated with higher firing rates, and for most neurons, the effect of the population rate is stronger in the preferred direction(s), consistent with a multiplicative effect. Note that we do not include the neuron whose rate we are modeling in the calculation of the population rate. However, using the population rate as an omitted variable requires some interpretation. The population rate will certainly be affected by the tuning of the relatively small sample of neurons that we observe. If we have a disproportionate number of neurons tuned to a specific preferred direction, the population rate in those directions will be higher. This suggests that in a different recording, the covariation between the stimulus and the population rate could very likely be different. However, it appears that the omitted variable biases in this case are mostly driven by noise correlations, where neural activity is correlated on single trials even within the same stimulus condition, rather than stimulus correlations, where neural activity is correlated due to similar tuning. When we shuffle the data within each stimulus condition (removing noise correlations), the average change in the modulation depth is −1 ± 2% (SD 18 ± 3%), and the effect of the omitted variable becomes negligible.
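The within-condition shuffle used for this control can be sketched as follows (a synthetic example, with a shared gain signal standing in for the recorded data): permuting each neuron's trials independently within each stimulus condition preserves tuning but destroys trial-by-trial noise correlations.

```python
import numpy as np

def shuffle_within_condition(counts, condition, rng):
    # counts: (n_trials, n_neurons); condition: (n_trials,) stimulus labels.
    # Permute each neuron's trials independently within each condition:
    # stimulus correlations survive, noise correlations do not.
    shuffled = counts.copy()
    for c in np.unique(condition):
        idx = np.flatnonzero(condition == c)
        for j in range(counts.shape[1]):
            shuffled[idx, j] = counts[rng.permutation(idx), j]
    return shuffled

rng = np.random.default_rng(1)
cond = np.repeat([0, 1], 500)
gain = rng.normal(size=1000)                   # shared trial-to-trial state
lam = np.exp(1.0 + 0.4 * gain + 0.5 * cond)    # two neurons share gain and tuning
counts = rng.poisson(np.column_stack([lam, lam])).astype(float)

shuf = shuffle_within_condition(counts, cond, rng)
r_before = np.corrcoef(counts[cond == 0].T)[0, 1]
r_after = np.corrcoef(shuf[cond == 0].T)[0, 1]
print(round(r_before, 2), round(r_after, 2))  # noise correlation before vs. after
```

Refitting the tuning model on the shuffled counts then isolates how much of the omitted variable bias is attributable to noise correlations.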

### 2.3 Omitted Variable Bias in the Estimation of Postspike History Effects

In addition to modeling spike counts over trials or on relatively slow (more than 100 ms) behavioral timescales, GLMs are also often used to describe detailed, single-trial spike dynamics on fast (less than 10 ms) timescales. One common covariate used in these types of models is a postspike history effect, where the probability of spiking at a given time depends on the recent history of spiking. Modeling these effects allows us to describe refractoriness, bursting (Paninski, 2004; Truccolo et al., 2005), and a whole host of other dynamics (Weber & Pillow, 2017). Conceptually, the goal of these models is to disentangle the sources of rate variation based only on observations of a neuron's spiking, with history effects ideally reflecting intrinsic biophysics. However, since the full synaptic input is typically not known with extracellular spike recordings, there is potential for omitted variable biases.

These examples with strong, periodic input are not necessarily biologically realistic, but they make it apparent how the postspike history can be biased by omitted input variables. In vivo, neurons instead appear to be in a high-conductance state, where membrane potential fluctuations have approximately $1/f$ power spectra (Destexhe, Rudolph, Fellous, & Sejnowski, 2001; Destexhe, Rudolph, & Paré, 2003). When these naturalistic input statistics are used to drive the GLM, omitted variable bias can occur as well. Here we simulate a GLM receiving $1/f^\alpha$ noise input with $\alpha = 0$ (white noise), 1, and 2 (see Figure 7). For white noise input, the MLE accurately recovers the simulated postspike history filter when the input is omitted from the model, but when $\alpha = 1$ or 2, the estimates become increasingly biased (see Figures 7A and 7C). With the full model, where the input is included as a covariate, the history is recovered accurately no matter what the input statistics are. Just as in the periodic case, however, these different input statistics alter the autocorrelation, and when the input is omitted from the model, the maximum likelihood history filter simply aims to capture these patterns.
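Input with a $1/f^\alpha$ power spectrum can be generated by shaping white noise in the frequency domain, a standard spectral method (parameters here are illustrative):

```python
import numpy as np

def one_over_f_noise(n, alpha, rng):
    # Shape complex white noise so the amplitude spectrum falls off as
    # f^(-alpha/2), giving 1/f^alpha power. alpha = 0 recovers white noise.
    freqs = np.fft.rfftfreq(n, d=1.0)
    amp = np.ones_like(freqs)
    amp[1:] = freqs[1:] ** (-alpha / 2)
    amp[0] = 0.0  # drop the DC component
    spectrum = amp * (rng.normal(size=freqs.size) + 1j * rng.normal(size=freqs.size))
    x = np.fft.irfft(spectrum, n=n)
    return x / x.std()  # normalize to unit SD

rng = np.random.default_rng(0)
x = one_over_f_noise(10000, alpha=1.0, rng=rng)
print(x.shape, x.std())  # (10000,) with unit SD
```

Scaled copies of this signal can then drive the conditional intensity of a simulated GLM as the "input" covariate that is either included or omitted at fitting time.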

In GLMs for single-neuron dynamics, one effect of omitted variable bias is that it may lead us to misinterpret how stable a neuron's dynamics are. Even if the true history filter only reduces the neuron's firing rate following a spike (as in Figure 7C), the estimated filter can be biased upward when the input is omitted. If we were to simulate the activity of this neuron based on the biased filter, the bias could cause the neuron's rate to diverge if the rate becomes high enough. To assess the stability of the estimated postspike history effects quantitatively, here we make use of a quasi-renewal approximation analysis introduced in Gerhard, Deger, and Truccolo (2017). Given a history filter, this approach finds an approximate transfer function describing the neuron's future firing rate (output) given its recent (input) firing rate (see section 4). For all estimated models, the transfer function has a stable fixed point near the neuron's baseline firing rate. When the true input is omitted and $\alpha >0$, the estimated history filters also have an unstable fixed point where the neuron's firing rate will diverge if the rate exceeds this point (Gerhard et al., 2017). Here we find that omitted variable bias leads to apparent fragility (see Figures 7B and 7D). The stable region shrinks as $\alpha $ increases, and even when the true dynamics are strictly stable (as in Figures 7C and 7D), omitted variable bias can lead us to mistakenly conclude that the neuron has fragile dynamics.
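The qualitative stability analysis can be sketched numerically. As a crude stand-in for the quasi-renewal approximation (not the actual method of Gerhard et al., 2017), suppose past spiking at steady rate $A$ is treated as Poisson; the expected next-bin rate then reduces to $F(A) = \lambda_0 \exp(sA)$, where $s = \sum_j \Delta\,(e^{\eta_j} - 1)$ summarizes the history filter $\eta$ in bins of width $\Delta$. Counting fixed points of $F$ then distinguishes the two regimes described above; the filter summaries below are hypothetical values.

```python
import numpy as np

def n_fixed_points(base_rate, s, A_max=300.0):
    # Count fixed points of the approximate transfer function
    # F(A) = base_rate * exp(s * A). Here s summarizes the net history
    # effect: s < 0 for a purely suppressive filter, while an upward-biased
    # filter pushes s toward (or past) zero.
    A = np.linspace(0.01, A_max, 30000)
    F = base_rate * np.exp(s * A)
    above = (F > A).astype(int)
    return int(np.count_nonzero(np.diff(above)))

n_suppressive = n_fixed_points(5.0, -0.01)  # hypothetical net-suppressive filter
n_biased = n_fixed_points(5.0, 0.03)        # hypothetical upward-biased filter
print(n_suppressive, n_biased)  # 1 vs 2: bias creates an unstable fixed point
```

With the suppressive filter, $F$ crosses the identity once, at a stable point near baseline; with the biased filter, a second, unstable crossing appears, above which the rate diverges, mirroring the fragility described in the text.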

With most extracellular spike recordings, the synaptic input that the neuron receives is unknown. However, there may also be omitted variable bias when history effects are estimated from real data. In this case, the input to a neuron can be approximated by stimulus or behavioral variables, local field potentials, or the activity of simultaneously recorded neurons (Gerhard et al., 2013; Harris et al., 2003; Kelly et al., 2010; Pillow et al., 2008; Truccolo et al., 2005; Truccolo, Hochberg, & Donoghue, 2010; Volgushev, Ilin, & Stevenson, 2015). Just as in the simulations above, including or omitting these variables can alter the estimated history effects, even though they are not as directly related to spiking as the synaptic input itself. Here we consider total population spiking activity as a proxy for synaptic input and consider how including population activity alters the history filters when compared to a model of history alone.

We examine two data sets: spontaneous activity from primary visual cortex of an anaesthetized monkey with $n=62$ simultaneously recorded neurons and activity from dorsal hippocampus of a sleeping rat with $n=39$ simultaneously recorded neurons. To model population covariates, we sum the spiking of all neurons, except the one whose spiking we aim to predict, and low-pass filter the signal (see section 4). Similar to previous results (Okun et al., 2015), we find that, since neurons often have correlated fluctuations in their spiking (see Figures 8A and 8D), the population rate is a good predictor for single-neuron activity. Moreover, when we add population covariates to a GLM with postspike history effects, the history filter changes.
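The population covariate construction can be sketched as follows (the boxcar smoother here is a hypothetical stand-in for whatever low-pass filter is specified in section 4):

```python
import numpy as np

def population_rate_covariate(spikes, i, width=10):
    # spikes: (n_neurons, n_bins) binned spike counts. Sum over all neurons
    # except neuron i (the one being predicted), then smooth with a simple
    # boxcar as an illustrative low-pass filter.
    others = np.delete(spikes, i, axis=0).sum(axis=0).astype(float)
    kernel = np.ones(width) / width
    return np.convolve(others, kernel, mode="same")

rng = np.random.default_rng(0)
spikes = rng.poisson(0.1, size=(40, 1000))   # toy population recording
pop = population_rate_covariate(spikes, i=3)
print(pop.shape)  # (1000,): one covariate value per time bin
```

This covariate (or its log transform) is then appended to the design matrix alongside the postspike history terms.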

In the V1 data set, the postspike gain decreases by 7.8 ± 0.5% on average when population covariates are included and 14.9 ± 0.8% when considering only the first 50 ms after a spike (see Figure 8B). The effects of adding population covariates are less pronounced in the hippocampal data set. The postspike gain decreases by 2.5 ± 0.3% on average and 9.5 ± 1.2% when considering only the first 50 ms after the spike (see Figure 8E). Based on the quasi-renewal approximation, all neurons in both the V1 and hippocampal data sets have fragile transfer functions where there is a stable fixed point (near the neuron's average firing rate) and an unstable fixed point where the neuron's rate diverges if the input becomes too strong. For V1, the average upper limit of the stable region is 80 ± 3 Hz for the models with history only and 143 ± 7 Hz for the models with population covariates (see Figure 8C). In the hippocampal data, the average upper limit of the stable region is 38 ± 6 Hz for the models with history only and 75 ± 13 Hz for the models with population covariates (see Figure 8F). Each neuron is thus apparently more stable after the population covariates are included.

As in the case studies using tuning curves, adding covariates also improves spike prediction accuracy. In the V1 data set, the average log-likelihood ratio relative to a homogeneous Poisson model is 2.2 ± 0.3 bits/s for the history model and 3.3 ± 0.3 bits/s for the model with population covariates. In hippocampus, the log-likelihood ratio is 0.9 ± 0.3 bits/s for the history model and 2.0 ± 0.5 bits/s for the model with population covariates. The larger effects in V1 are likely explained by the fact that the population rate is predictive for many more neurons here than for the hippocampal data. In the hippocampus, only 26% of the neurons have an increase of over 0.5 bits/s when the population covariates are included, compared to 85% of neurons in V1. Altogether, these results demonstrate how omitted variable bias could affect estimates of postspike history filters in vivo. In both data sets, we find that when population covariates are included in the GLM, spike prediction accuracy increases, postspike gain decreases, and apparent stability increases.
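The bits-per-second accuracy measure compares the model log likelihood to that of a homogeneous Poisson model with the same mean rate. A minimal version (one common definition; the $\log y!$ terms cancel in the ratio and are dropped):

```python
import numpy as np

def llr_bits_per_s(y, rate_hat, dt):
    # Log-likelihood ratio (bits/s) of predicted rates rate_hat (Hz) vs. a
    # homogeneous Poisson model, for spike counts y in bins of width dt (s).
    eps = 1e-12
    ll_model = np.sum(y * np.log(rate_hat * dt + eps) - rate_hat * dt)
    null = np.full_like(y, y.mean(), dtype=float)  # mean count per bin
    ll_null = np.sum(y * np.log(null + eps) - null)
    return (ll_model - ll_null) / np.log(2) / (y.size * dt)

rng = np.random.default_rng(0)
dt = 0.001
lam = np.where(rng.random(100000) < 0.5, 2.0, 20.0)  # hypothetical two-state rate
y = rng.poisson(lam * dt)
llr = llr_bits_per_s(y, lam, dt)
print(round(llr, 2))  # positive: the rate-varying model beats the null
```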

## 3 Discussion

When the goal of modeling is causal inference or understanding of biological mechanisms, the potential for biases due to omitted variables is often clear. The statistical effects of confounders (Wasserman, 2004), as well as the limits that they place on neuroscientific understanding, are widely appreciated (Jonas & Kording, 2017; Krakauer, Ghazanfar, Gomez-Marin, MacIver, & Poeppel, 2017; Yoshihara & Yoshihara, 2018). However, when the goal of modeling is to create an abstract explanation or summary of observed neural activity, the fact that omitted variables can bias these explanations is not always widely acknowledged. Here we have illustrated the potential for omitted variable bias in two types of commonly used GLMs for neural spiking activity: tuning curve models using spike counts across trials and models that capture single-neuron dynamics with a postspike history filter. In each model, adding a previously omitted variable improved spike prediction accuracy, as expected. However, what we emphasize here is that when omitted variables were included, the estimates of the original parameters changed. For three case studies using tuning curves, we found that by adding a traditionally omitted variable, tuning curves showed less modulation due to the originally included variables. In models of single-neuron dynamics, adding omitted variables led to decreased postspike gain and greater apparent stability. Importantly, omitted variable bias can arise in GLMs in any situation where an omitted variable affects neural activity and the effect of the omitted variable is not independent of the included variables.

The case studies here are not unique; many studies have described how adding variables to a tuning curve or single-neuron model can improve prediction accuracy. In M1, in addition to movement speed, joint angles, muscle activity, end-point force, and many other variables also appear to modulate neural responses (Fetz, 1992; Kalaska, 2009). In addition to speed and head direction in the hippocampus, theta-band local field potential, sharp-wave ripples, and environmental features, such as borders, appear to modulate neural activity (Hartley, Lever, Burgess, & O'Keefe, 2014). And in V1, there is growing evidence that population activity (Lin, Okun, Carandini, & Harris, 2015) and nonvisual information (Ghazanfar & Schroeder, 2006) modulate neural responses. In each of these systems, neural responses are affected by many factors. Responding to many task variables may even be functional, allowing downstream neurons to more effectively discriminate inputs (Fusi, Miller, & Rigotti, 2016). In any case, it seems clear that our models do not yet capture the full complexity of neural responses (Carandini et al., 2005). By omitting relevant variables, current models are likely to be not just less accurate but also biased.

Parameter bias may be problematic in and of itself. However, omitted variable bias may also have an important effect on generalization performance. As Box (1966) noted, in a new context, the effect of the omitted variables and the relationship between the omitted and included variables may be different. Since the parameters of the included variables are biased, this change can reduce generalization accuracy. This phenomenon may explain, to some extent, why tuning models fit in one condition often do not generalize to others (Graf, Kohn, Jazayeri, & Movshon, 2011; Oby, Ethier, & Miller, 2013). For models of single-neuron dynamics, omitted variable bias can also have a negative effect on the accuracy of simulations. Previous work has shown that simulating a GLM with postspike filters estimated from data often results in unstable, diverging simulations. Although several methods for stabilizing these simulations have recently been developed (Gerhard et al., 2017; Hocker & Park, 2017), one, perhaps primary, reason for this instability may be that the postspike filters are biased due to omitted synaptic input. Since estimated postspike filters may reflect not just intrinsic neuron properties but also the statistics of the input, interpreting and comparing postspike filters may be difficult. Different history parameters may be different due to intrinsic biophysics (Tripathy, Padmanabhan, Gerkin, & Urban, 2013) or due to differing input, and resolving this ambiguity will likely involve more accurately accounting for the input itself (Kim & Shinomoto, 2012).
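The basic mechanism, where an included coefficient absorbs the effect of an omitted, correlated covariate, can be reproduced in a few lines. The simulation below is an illustrative toy example, not an analysis from this article: spikes are generated from a Poisson GLM with two correlated inputs, and refitting with one input omitted inflates the coefficient on the included input.

```python
import numpy as np

def fit_poisson_glm(X, y, n_iter=50):
    # Newton's method (IRLS) for a Poisson GLM with exponential (log) link.
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(np.clip(X @ b, -30, 30))  # clip for numerical safety
        grad = X.T @ (y - mu)                 # score
        hess = X.T @ (X * mu[:, None])        # Fisher information
        b += np.linalg.solve(hess, grad)
    return b

rng = np.random.default_rng(1)
n = 20000
x = rng.normal(size=n)
z = 0.8 * x + 0.6 * rng.normal(size=n)   # "omitted" variable, correlated with x
y = rng.poisson(np.exp(-1.0 + 0.5 * x + 0.5 * z))

X_full = np.column_stack([np.ones(n), x, z])
b_full = fit_poisson_glm(X_full, y)        # recovers b_x near the true 0.5
b_omit = fit_poisson_glm(X_full[:, :2], y)  # omitting z inflates b_x toward 0.9
```

Because $z = 0.8x + \text{noise}$, the omitted-variable fit converges toward a coefficient of roughly $0.5 + 0.5 \times 0.8 = 0.9$ on $x$: the included effect soaks up the covarying part of the omitted effect, just as in the tuning curve and history filter examples.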

The possibility of omitted variable bias does not mean that estimated parameters, predictions, and simulations from simplified models are useless, but it may mean that we need to be cautious in interpreting these models and their outputs. When reporting the results of regression, in addition to avoiding describing associations with causal language, it may be generally useful to discuss known and potential confounds. Previous studies have already identified several specific cases of omitted variable bias where careful interpretation is necessary. For instance, omitted common input can bias estimates of interactions between neurons (Brody, 1999), and omitted history effects can bias receptive field estimates (Pillow & Simoncelli, 2003). In estimating peristimulus time histograms, omitting variables that account for trial-to-trial variation may cause biases (Czanner et al., 2008) or issues with identifiability (Amarasingham, Geman, & Harrison, 2015). Similarly, biases due to spike sorting errors (Ventura, 2009) could be framed as a result of omitting variables related to missing or excess spikes. Since we typically do not model or observe all the variables that affect neural activity, omitted variable problems are likely to be pervasive in systems neuroscience far beyond these specific cases.

Although we have focused on GLMs here, omitted variable bias can affect any model, and other types of model misspecification can also result in biased parameter estimates. Adding input nonlinearities (Ahrens, Paninski, & Sahani, 2008; David, Mesgarani, Fritz, & Shamma, 2009), interaction effects (McFarland, Cui, & Butts, 2013), or higher-order terms to the GLM (Berger, Song, Chan, & Marmarelis, 2010; Park, Archer, Priebe, & Pillow, 2013) may fix certain types of model misspecification, but any model that omits relevant variables is still likely to suffer from the same problems. This includes both machine learning methods that may provide better prediction accuracy than GLMs (Benjamin et al., 2017) and single-neuron models aiming to describe greater biophysical detail (Herz, Gollisch, Machens, & Jaeger, 2006). Unlike over-fitting or nonconvergence (Zhao & Iyengar, 2010), omitted variable bias will generally not be resolved by including additional data or by adding regularization. Moreover, adding one omitted variable, as we have done with the case studies here, is no guarantee that there are not other relevant variables being omitted.

One approach that could potentially reduce omitted variable bias is latent variable modeling, where the effects of unknown covariates are explicitly included (constrained by simplifying assumptions). Recent work has introduced latent variables for neural activity with linear dynamics (Kulkarni & Paninski, 2007; Paninski et al., 2010; Smith & Brown, 2003), switching dynamics (Putzky, Franzen, Bassetto, & Macke, 2014), rectification (Whiteway & Butts, 2017), and oscillations (Arai & Kass, 2017). These models appear to outperform GLMs on population data in retina (Vidne et al., 2012), visual (Archer, Koster, Pillow, & Macke, 2014), and motor cortices (Chase, Schwartz, & Kass, 2010; Macke et al., 2011). Inferring latent variables requires making (sometimes strong) assumptions about the nature of the variables and may require observations from multiple neurons or across multiple trials, but by approximating some of the effects of relevant omitted variables, latent variables may reduce omitted variable bias. However, generally determining when relevant variables are omitted from a model and what those variables are is not a trivial problem.

There is a well-known aphorism from George E. P. Box: “All models are wrong, but some are useful.” The lengthier version of this quip is, “All models are approximations. Essentially, all models are wrong, but some are useful. However, the approximate nature of the model must always be borne in mind” (Box & Draper, 1987). GLMs are certainly useful descriptions of neural activity. They are computationally tractable, can disentangle the relative influence of multiple covariates, and often provide the core components for Bayesian decoders. Here we emphasize, however, one ubiquitous circumstance in systems neuroscience where the “approximate nature” of the models should be “borne in mind.” Namely, omitted variables can bias estimates of the included effects.

## 4 Methods

### 4.1 Neural Data

All data analyzed here were previously recorded and shared by other researchers through the Collaborative Research in Computational Neuroscience (CRCNS) Data Sharing Initiative (crcns.org).

Data from primary motor cortex are from CRCNS data set DREAM-Stevenson_2011 (Walker & Kording, 2013). These data were recorded using a 100-electrode Utah array (Blackrock Microsystems, 400 $\mu$m spacing, 1.5 mm length) chronically implanted in the arm area of primary motor cortex of an adult macaque monkey. The monkey made center-out reaches in a 20 $\times$ 20 cm workspace while seated in a primate chair, grasping a two-link manipulandum in the horizontal plane (arm roughly in a sagittal plane). Each trial of the center-out task began with a hold period at a center target (0.3–0.5 s). After a go cue, subjects had 1.25 s to reach one of eight peripheral targets and then hold this outer target for 0.2 to 0.5 s. Each success was rewarded with juice, and feedback about arm position was displayed on-screen as a circular cursor (1–2 cm diameter). Spike sorting was performed by manual cluster cutting using an offline sorter (Plexon), with waveforms classified as single- or multi-unit based on action potential shape and minimum ISIs greater than 1.6 ms (yielding $n=81$ single units). Here we model tuning curves using spike counts from 150 ms before to 350 ms after the speed reached its half-max. Average movement speed for each trial was calculated from 0 to 250 ms after the speed reached its half-max (290 trials in total). (For details of the surgery, recording, and spike sorting, see Stevenson et al., 2011.)

Hippocampal data are from CRCNS data set HC-3 (Mizuseki, Sirota, Pastalkova, Diba, & Buzsáki, 2013). Here we use recording sessions ec16.19.272 and ec014.215, where a Long-Evans rat was sleeping and foraging in a 180 $\times$ 180 cm maze, respectively. For both recordings, 12-shank silicon probes (with 8 recording sites each, 20 $\mu$m separation) were implanted in CA1 (8 shanks) and EC3-5 (4 shanks), based on histology. Spikes were sorted automatically using KlustaKwik and then manually adjusted (Klusters), yielding 85 units for the sleep data and 117 for the open field data. For the sleep data, where we model postspike history effects, the spike trains were binned at 1 ms, and the recording length was 27 min. Here we model all neurons with firing rates above 0.5 Hz ($n=39$). For the open field data, where we model place tuning, spike trains were binned at 250 ms, and the recording length was 93 min. Place cells ($n=68$) were selected based on having an overall firing rate below 5 Hz (to rule out interneurons), a peak firing rate above 2 Hz, and a contiguous set of pixels (after smoothing with an isotropic gaussian, $\sigma=8$ cm) of at least 200 cm$^2$ where the firing rate was above 10% of the peak rate. (For details of the surgery, recording, and spike sorting, see Diba & Buzsáki, 2008; Mizuseki, Sirota, Pastalkova, & Buzsáki, 2009.)
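The place-cell selection rule described above can be sketched as follows, assuming a precomputed, occupancy-normalized rate map on a square grid. The 4 cm bin size and the function name are illustrative assumptions, not details from the recordings.

```python
import numpy as np
from scipy import ndimage

def is_place_cell(rate_map, px_cm=4.0, sigma_cm=8.0,
                  mean_max_hz=5.0, peak_min_hz=2.0, area_min_cm2=200.0):
    # rate_map: 2D occupancy-normalized firing rate map in Hz (NaN-free).
    if rate_map.mean() >= mean_max_hz:        # overall rate too high: likely interneuron
        return False
    # Smooth with an isotropic gaussian (sigma given in cm, converted to pixels).
    smoothed = ndimage.gaussian_filter(rate_map, sigma=sigma_cm / px_cm)
    peak = smoothed.max()
    if peak <= peak_min_hz:                   # peak rate too low
        return False
    # Find contiguous pixels where the rate exceeds 10% of the peak.
    labels, n_fields = ndimage.label(smoothed > 0.1 * peak)
    if n_fields == 0:
        return False
    largest = np.bincount(labels.ravel())[1:].max()  # pixels in the largest field
    return largest * px_cm**2 >= area_min_cm2        # field area threshold in cm^2
```

For example, a map with a single smooth firing field peaking around 8 Hz passes all three criteria, while a flat low-rate map fails on the peak-rate check.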

Data from primary visual cortex are from CRCNS data set PVC-11 (Kohn & Smith, 2016). Here we use spontaneous activity during a gray screen (monkey 1) and responses to drifting sine-wave gratings (monkey 1), both from an anaesthetized (sufentanil: 4–18 $\mu$g/kg/hr) adult monkey (*Macaca fascicularis*). Recordings were made in parafoveal V1 (receptive fields within 5 degrees of the fovea) using a 96-channel multielectrode array (Blackrock Microsystems, 400 $\mu$m electrode spacing, 1 mm depth). After automatic spike sorting and manual cluster adjustment, 87 and 106 units were recorded during spontaneous activity and grating presentation, respectively. Only neurons with waveform SNR >2 and firing rates >1 Hz were analyzed ($n=62$ for spontaneous and $n=90$ for grating data). For the spontaneous activity, we bin spike counts at 1 ms, and the recording length was 20 min. For the drifting grating data, we analyzed spike counts from 200 ms to 1.2 s after stimulus onset on each trial (12 directions, 2400 trials total). Gratings had a spatial frequency of 1.3 cyc/deg, a temporal frequency of 6.25 Hz, and a size of 8 to 10 degrees (to cover the receptive fields of all recorded neurons) and were presented for 1.25 s with a 1.5 s intertrial interval between stimuli. (For surgical, stimulus, and preprocessing details, see Kelly et al., 2010, and Smith & Kohn, 2008.)

### 4.2 Tuning Curve Models

### 4.3 Postspike History Simulations and Population Rate Models

### 4.4 Stability Analysis

## References

Park, I. M., Archer, E. W., Priebe, N., & Pillow, J. W. (2013). Spectral methods for neural characterization using generalized quadratic models. In *Advances in neural information processing systems*, 26.