Abstract

The neural correlates of decision making have been extensively studied with tasks involving a choice between two alternatives that is guided by visual cues. While a large body of work argues for a role of the lateral intraparietal (LIP) region of cortex in these tasks, this role may be confounded by the interaction between LIP and other regions, including the middle temporal (MT) area. Here, we describe a simplified linear model of decision making that is adapted to two tasks: a motion discrimination task and a categorization task. We show that the distinct contributions of MT and LIP may indeed be confounded in these tasks. In particular, we argue that the motion discrimination task relies on a straightforward visuomotor mapping, which leads to redundant information between MT and LIP. The categorization task requires a more complex mapping between visual information and decision behavior and therefore does not lead to redundancy between MT and LIP. Going further, the model predicts that noise correlations within LIP should be greater in the categorization task than in the motion discrimination task because of shared inputs from MT. The impact of these correlations on task performance is examined by analytically deriving error estimates of an optimal linear readout for shared and unique inputs. Taken together, these results clarify the contributions of MT and LIP to decision making and help characterize the role of noise correlations in these regions.

1  Introduction

Decision making is an everyday activity that involves a deliberation among alternative choices (Smith & Ratcliff, 2004). To examine the neural correlates of decision making, studies have designed simplified tasks using two alternative choices and rapid-timed responses (Gold & Shadlen, 2000; Meister, Hennig, & Huk, 2013; Roitman & Shadlen, 2002; Shadlen & Kiani, 2013). These tasks provide a wealth of data on response times and choice probabilities and allow researchers to look for correlates of those variables in neural activity (Beck et al., 2008; Churchland et al., 2011; Kira, Yang, & Shadlen, 2015).

One region that has generated sustained interest for its role in decision making is the lateral intraparietal (LIP) region of posterior parietal cortex, a region that is suggested to track sensory evidence over time (de Lafuente, Jazayeri, & Shadlen, 2015; Huk & Shadlen, 2005; Wong, Huk, Shadlen, & Wang, 2007) and integrate sensory inputs during oculomotor decisions (Hanks & Summerfield, 2017). LIP may be causally involved in modulating both response times and choice probabilities, as suggested by manipulations of LIP activity through microstimulation (Hanks, Ditterich, & Shadlen, 2006). In fact, an impressive body of work relating LIP to psychophysiological mechanisms of decision making over the past decades gives the firm impression of a well-characterized causal role for this region.

However, recent work has begun to challenge the causal role of LIP during decision making (Churchland & Kiani, 2016). In particular, deactivation of LIP using a GABA agonist (muscimol) has shown no impact on psychophysical variables during a motion discrimination task (Katz, Yates, Pillow, & Huk, 2016), calling into question the seemingly well-established causal role of LIP in this task. Addressing this controversy is challenging given that LIP is embedded in a broad neuronal network involving both cortical and subcortical regions (Collins, Airey, Young, Leitch, & Kaas, 2010), making it difficult to rule out the possibility that its activity merely reflects afferent activation from neural structures that causally affect behavior.

Computational models of neural activity offer a promising avenue to shed light on this controversy and address the role of brain regions involved in decision making (Averbeck, Latham, & Pouget, 2006; Cisek, 2006; Lo & Wang, 2016; Moreno-Bote et al., 2014; Sompolinsky, Yoon, Kang, & Shamir, 2001; Wong et al., 2007). In a model composed of several layers, each representing a distinct brain region, it is straightforward to inhibit, deactivate, or alter targeted aspects of a circuit to examine their role on various cognitive and behavioral tasks (Layton & Fajen, 2016).

Here, we examined a simplified model of decision making and analyzed it using an optimal linear readout of neural activity. The model consisted of a layer of middle temporal (MT; extrastriate visual cortex) units that project in a feedforward manner to a layer of LIP units. This configuration does not, of course, suggest that these two structures are the sole contributors to decision making; rather, our aim was to characterize one example of a neural transformation that illustrates the gradual passage from more sensory (MT) to more cognitive (LIP) representations. This pathway has been intensely studied in decision making: LIP receives direct input from MT that likely guides the direction of behavioral decisions (Blatt, Andersen, & Stoner, 1990).

We applied the model to a well-studied motion discrimination task (Gold & Shadlen, 2000; Roitman & Shadlen, 2002; see Figure 1a). Of central importance for the current work, the direction of the correct saccade in this task always matches the direction of stimulus motion. Using computational simulations of activity during the motion discrimination task, we show that the passage from a sensory to a motor representation involves largely redundant information, as information on the direction of movement is already present in the sensory activation from MT. This aspect of the motion discrimination task represents a fundamental confound in a large body of work on the neural basis of perceptual decision making. In fact, the combined computation of perceptual decision and movement planning may actually reflect distinct processes that are linked only in tasks where this confound is present (Bennur & Gold, 2011).

Figure 1:

Behavioral tasks that highlight the role of regions MT and LIP in decision making. (a) Motion discrimination task. A stimulus consisting of a random-dot kinematogram is presented in the response field (RF) of a given LIP cell (arrow indicates motion direction of the majority of dots). (b) Delayed-match categorization task. Two random-dot stimuli are separated by a brief delay period. The subject indicates via saccade whether the two stimuli belong to the same broad category (match) or not (nonmatch). (c) Category membership based on stimulus orientation for the categorization task as well as a novel task (see main text). Dashed line, category boundary.


By a combination of simulations and formal analyses, we argue that this visuomotor confound is at the origin of a causal masking effect, whereby the causal influence of LIP on decision making is observed only in selective tasks. By comparing simulation results on the motion discrimination task with those obtained on a delayed-match categorization task (Freedman & Assad, 2006, 2009; Sarma, Masse, Wang, & Freedman, 2016; see Figures 1b and 1c), we show that MT-LIP interactions are modulated by task demands, resulting in LIP neurons having more (or less) shared input from MT. In turn, these shared inputs affect noise correlations in LIP (correlations between responses to repeats of the same input), with consequences for behavioral performance.

This work is structured as follows. First, we show the impact of noise correlation in a simplified neural circuit that is fed through an optimal linear readout. Second, we expand the model to a two-layer circuit (MT and LIP) and apply it to decision-making tasks. We show the impact of neural deactivation on this circuit and discuss conditions that lead to a causal masking effect.

2  Results

2.1  Optimal Linear Readout of Correlated Neural Activity

In a first step, we examined the effect of noise correlation on stimulus discrimination using a pair of simulated units. Nearby cortical neurons show marked correlations in local synaptic activity (Rosenbaum, Smith, Kohn, Rubin, & Doiron, 2017). These noise correlations have been extensively studied in terms of their role in decision-making tasks (Abbott & Dayan, 1999; Averbeck et al., 2006; Ecker, Berens, Tolias, & Bethge, 2011; Sompolinsky et al., 2001; Zohary, Shadlen, & Newsome, 1994), as well as their influence on the dynamics of neural circuits (Brunel, 2000; Bujan, Aertsen, & Kumar, 2015; de la Rocha, Doiron, Shea-Brown, Josić, & Reyes, 2007; Graupner & Reyes, 2013; Renart et al., 2010; Rosenbaum & Josić, 2011; Salinas & Sejnowski, 2000; Yim, Kumar, Aertsen, & Rotter, 2014). Correlations may provide advantages or disadvantages in terms of neural coding; despite extensive work on this issue, conditions leading to either outcome have yet to be fully characterized.

To investigate the effect of noise correlations, we began with a simple linear neural integrator model based on mean firing rate (Cain, Barreiro, Shadlen, & Shea-Brown, 2013; Goldman, 2009). This model represents an approximation of underlying spiking activity in neuronal circuits (Dayan & Abbott, 2001; Ganguli et al., 2008; Murphy & Miller, 2009; Miri et al., 2011) and captures the quasi-linear responses of firing rates found across many neuronal types (Chance, Abbott, & Reyes, 2002).

In the simplest scenario, we considered two independent units that each received a constant input as well as a time-fluctuating noise source (see Figure 2a) and whose firing rates are described by
[Equation 2.1]
Each unit in this equation has its own leak term, a gaussian noise term with finite mean and variance, a gain that scales the noise, and a stimulus input that takes a distinct value for each stimulus.
Figure 2:

Shared inputs and classification error in simulated networks of MT and LIP. (a) Two LIP units are connected via a reciprocal pathway (dashed line). The two units receive an input (either shared or independent), as well as individual noise. (b) Illustration of activity for the two units at discrete time points (solid black circles). Solid green line: optimal linear decision boundary. Dashed line shows correspondence between mean of stimulus 2 (solid red circle) and a gaussian function with mean centered at that value. Gray shaded area: estimation of classification error for stimulus 2.


Here, the two additive noise terms make no distinction between stimulus noise and neural noise, whose effects are quantitatively similar in mean-rate models (Burak & Fiete, 2012). Direct synaptic connections between the two units are not modeled explicitly, but they are accounted for in our analysis through correlated activity.
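As a rough illustration of this two-unit model, the sketch below integrates a leaky rate equation with correlated gaussian noise using the Euler-Maruyama method. The specific form tau * dx/dt = -alpha * x + s + beta * xi(t), the names `tau`, `alpha`, `beta`, and `rho`, and all numerical values are assumptions made for this example; they are not the parameters of equation 2.1.

```python
import numpy as np

def simulate_pair(s, alpha=1.0, beta=0.8, tau=1.0, rho=0.5,
                  dt=0.01, n_steps=100_000, seed=0):
    """Euler-Maruyama integration of two leaky units with correlated noise.

    Assumed (hypothetical) form for each unit i:
        tau * dx_i/dt = -alpha * x_i + s_i + beta * xi_i(t)
    where xi_1 and xi_2 are white-noise terms with correlation rho.
    """
    rng = np.random.default_rng(seed)
    s = np.asarray(s, dtype=float)
    # The Cholesky factor turns independent draws into correlated ones.
    chol = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
    noise = rng.standard_normal((n_steps, 2)) @ chol.T
    x = np.empty((n_steps, 2))
    x[0] = s / alpha  # start at the noise-free steady state
    for t in range(1, n_steps):
        drift = (-alpha * x[t - 1] + s) / tau
        x[t] = x[t - 1] + dt * drift + (beta / tau) * np.sqrt(dt) * noise[t]
    return x

# Both units receive the same constant input (a shared-input example).
rates = simulate_pair(s=[10.0, 10.0])
mean_rates = rates.mean(axis=0)
noise_corr = np.corrcoef(rates.T)[0, 1]
```

Because both units share the same leak and time constant in this sketch, the correlation of their firing rates settles near the correlation `rho` imposed on the noise, and the mean rates settle near `s / alpha`.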

We aimed to identify a linear readout mechanism that would optimally decode neural activity in the above model. This is a central question to virtually all studies of neural coding (Moreno-Bote et al., 2014). A linear readout is sufficient to perform neural decoding across many tasks (Klampfl, David, Yin, Shamma, & Maass, 2012; Meyers, Freedman, Kreiman, Miller, & Poggio, 2008) and perform a motor decision based on neural activity (Nienborg & Cumming, 2010; Shadlen, Britten, Newsome, & Movshon, 1996). Further, a linear readout can be implemented straightforwardly by a biologically realistic Hebbian or perceptron rule (Buonomano & Maass, 2009).

Here, a readout mechanism was implemented through a Fisher linear discriminant analysis (LDA), which provides an upper bound on coding capacity. LDA yields an optimal decoder under the assumptions of a linear decision boundary, equal covariances between the stimulus classes, discrete categories, and homogeneous variance along the classification boundary (Berberian, MacPherson, Giraud, Richardson, & Thivierge, 2017). In addition, it provides a conservative alternative to measures such as Fisher information, which may lead to overfitting (Kohn, Coen-Cagli, Kanitscheider, & Pouget, 2016). Linear discriminant analysis has shown good agreement with behavioral performance on decision-making tasks in recordings of LIP (Berberian et al., 2017) and elsewhere in cortex (Rich & Wallis, 2016).

Briefly, this readout works by computing a line of projection that offers the lowest variability within exemplars of the same stimulus class and the highest variability between stimulus classes. This projection, termed the optimal linear projection (OLP), can be computed directly from any generative model of activity that is analytically tractable, which is the case for equation 2.1 (see the appendix). From the OLP, an optimal classification boundary can be derived (see Figure 2b). Assuming a multivariate distribution of stimulus responses, classification error can be estimated by integrating the area of responses that fall on the wrong side of the boundary.
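The readout just described can be sketched for the textbook case of two equally likely gaussian classes that share one covariance matrix. This is an assumed setting for illustration and does not reproduce the derivation in the appendix; the function name `lda_error` and the example values are hypothetical.

```python
import numpy as np
from math import erfc, sqrt

def lda_error(mu1, mu2, cov):
    """Optimal linear projection and error for two gaussian classes.

    Assumes equal class priors and a covariance matrix `cov` shared by
    both classes. Returns the (unnormalized) projection vector and the
    probability of landing on the wrong side of the boundary.
    """
    delta = np.asarray(mu2, dtype=float) - np.asarray(mu1, dtype=float)
    w = np.linalg.solve(np.asarray(cov, dtype=float), delta)  # inverse(cov) @ delta
    d_prime = sqrt(delta @ w)  # class separation along the projection
    # Area of one gaussian beyond the midpoint boundary: Phi(-d'/2).
    error = 0.5 * erfc(d_prime / (2.0 * sqrt(2.0)))
    return w, error

# Two units with unit-variance, uncorrelated noise; only unit 1 is informative.
w, err = lda_error(mu1=[10.0, 5.0], mu2=[12.0, 5.0], cov=np.eye(2))
```

In this example the projection places all of its weight on the informative unit, and the error equals the gaussian tail area beyond one standard deviation.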

To examine classification error in the above model, we considered two distinct scenarios. Both consider that the model receives two distinct inputs across distinct trials. For simplicity, these inputs are constant over time. First, and most straightforward, is a scenario where the distribution of responses falls along the line of unity when displayed graphically (see Figure 3a). Formally, this scenario meets the condition
[Equation 2.2]
Here, the quantities entering this condition are the absolute differences between stimuli 1 and 2 for each of the two units. One example of such a scenario is when the two units receive shared inputs. In this scenario, the effect of noise correlations can be thought of as stretching the activity along the line of unity. As a consequence, a gradual increase in correlation between the two units leads to a monotonic increase in readout error, a result found in both numerical simulations and analytical estimates (see Figure 3b).
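The monotonic increase can be checked in a standard gaussian setting, which is an assumption made here and not the derivation given in the appendix. Suppose both units have the same absolute stimulus difference \Delta and the same noise standard deviation \sigma, with noise correlation \rho (all three symbols are introduced only for this sketch), and let \Phi denote the standard normal cumulative distribution function.

```latex
\[
\Sigma = \sigma^2
\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix},
\qquad
d'^2 =
\begin{pmatrix} \Delta & \Delta \end{pmatrix}
\Sigma^{-1}
\begin{pmatrix} \Delta \\ \Delta \end{pmatrix}
= \frac{2\Delta^2}{\sigma^2 (1 + \rho)},
\]
\[
\text{error} = \Phi\!\left(-\frac{d'}{2}\right)
= \Phi\!\left(-\frac{\Delta}{\sigma \sqrt{2 (1 + \rho)}}\right).
\]
```

Under these assumptions, the separation d' shrinks as \rho grows from 0 toward 1, so the error rises monotonically, in line with the behavior described for shared inputs.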
Figure 3:

Impact of noise correlation on optimal linear readout. (a) Two units receive shared inputs (mean activation of both units set to 10 Hz and 13 Hz for stimuli 1 and 2, respectively). Each dot is the firing rate at a given time point. Pairwise correlation is 0.9. (b) With shared inputs, classification error increases monotonically with the correlation between the two units. (c) Two units each receiving a unique input. One unit receives a mean input of 11 Hz for stimulus 1 (blue) and 13 Hz for stimulus 2 (red). The other unit receives a 2 Hz input for stimulus 1 and a 4 Hz input for stimulus 2. (d) With unique inputs, classification error follows a nonmonotonic function. Low noise correlations lead to an increase in error until an intermediate value (solid blue lines) beyond which error decreases rapidly. (e) Illustration of the relation between noise correlation and overlap between the firing rates of two units in response to distinct stimuli.


In a different scenario, we relaxed the condition set by equation 2.2 and allowed
[Equation 2.3]
This scenario occurs when the two units each receive different inputs. Graphically, one example of such a scenario arises when two stimuli result in parallel distributions that overlap only minimally (see Figure 3c). A surprising result is that under certain model parameters, classification error as a function of noise correlation follows an inverted U with a single peak (see Figure 3d). Intuitively, this result is explained by the overlap between the distributions. With small correlations, variance pushes the two distributions evenly in all directions, resulting in a small overlap (see Figure 3e, left). With a larger correlation (0.5), variance stretches the distributions such that the overlap increases (see Figure 3e, middle). However, with a very large correlation (0.9), the distributions are so stretched that no overlap occurs (see Figure 3e, right).
We can calculate the value of noise correlation where error shifts from an increasing to a decreasing function:
[Equation 2.4]
Substituting the solution with model parameters, this yields
[Equation 2.5]
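For orientation, the turning point can be derived in the standard setting of two gaussian classes with a shared covariance matrix. This is a sketch under that assumption, and its form may differ from equations 2.4 and 2.5. The symbols \Delta_i (absolute stimulus difference of unit i), \sigma_i (noise standard deviation of unit i), and \rho (noise correlation) are introduced here for illustration.

```latex
\[
d'^2(\rho) = \frac{a_1^2 + a_2^2 - 2 \rho\, a_1 a_2}{1 - \rho^2},
\qquad a_i = \frac{\Delta_i}{\sigma_i},
\]
\[
\frac{\partial d'^2}{\partial \rho} = 0
\;\Longrightarrow\;
a_1 a_2\, \rho^2 - \left(a_1^2 + a_2^2\right) \rho + a_1 a_2 = 0
\;\Longrightarrow\;
\rho^{*} = \min\!\left(\frac{a_1}{a_2}, \frac{a_2}{a_1}\right).
\]
```

The quadratic has roots a_1/a_2 and a_2/a_1, and only the smaller one lies in the admissible range. Since the error is \Phi(-d'/2), the minimum of d' at \rho^{*} is the maximum of the error. When a_1 = a_2 the turning point sits at \rho^{*} = 1, so the error increases over the whole range of correlations.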

We expanded the analytical mean-rate results to a broad range of parameter values and explored the impact of correlations on an optimal linear readout of neural activity. We began by examining the impact of distance between the two stimuli. When stimuli were close enough to overlap (see Figure 4a, left), error was high and increased nearly linearly with increased noise correlation (see Figure 4b). However, with increased distance between the stimuli (see Figure 4a, right), the impact of correlation took a nonmonotonic form (see Figure 4b). In this case, the correlation associated with maximal error (see equation 2.4) was high (i.e., approaching a value of 1; see the asterisk in Figure 4b); hence, for realistic values of correlation (typically between 0 and 0.2; Cohen & Newsome, 2008; Zohary et al., 1994), classification error increases monotonically with noise correlations.

Next, we considered the effect of noise gain (the two gain parameters in equation 2.1) on classification error. First, as one should expect, increasing the noise gain of both units worsens stimulus classification (see Figure 4c). However, more complex results are obtained if we fix the noise gain of one unit and alter that of the other. The result is a shift in the correlation value associated with maximum classification error. More specifically, the shift from detrimental to beneficial correlations occurs at a lower value of correlation when noise is increased (see Figure 4d). In an extreme scenario where this noise gain is very large (beyond which further changes in error are negligible), maximum error occurs at a correlation value near zero, and therefore increases in correlation lead to a decrease in error over most of the range of correlation values (see Figure 4d, dashed line).
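The shift of the error peak with noise can be explored numerically. The sketch below assumes two gaussian classes with a shared covariance matrix and treats the noise standard deviation of each unit as a stand-in for its noise gain; the function name `readout_error` and all values are hypothetical and are not the parameters behind Figure 4.

```python
import numpy as np
from math import erfc, sqrt

def readout_error(rho, delta, sigma):
    """Error of the optimal linear readout for two gaussian classes.

    delta: absolute stimulus difference of each unit.
    sigma: noise standard deviation of each unit (proxy for noise gain).
    rho:   noise correlation between the two units.
    """
    a1, a2 = delta[0] / sigma[0], delta[1] / sigma[1]
    d2 = (a1**2 + a2**2 - 2.0 * rho * a1 * a2) / (1.0 - rho**2)
    return 0.5 * erfc(sqrt(d2) / (2.0 * sqrt(2.0)))

rhos = np.linspace(0.0, 0.99, 1000)
delta = (3.0, 3.0)
peak_rho = {}
for sigma1 in (0.8, 1.6, 4.0):  # noise of unit 1 varies, unit 2 fixed at 0.8
    errors = [readout_error(r, delta, (sigma1, 0.8)) for r in rhos]
    peak_rho[sigma1] = rhos[int(np.argmax(errors))]
```

With equal noise on both units the peak sits at the upper edge of the grid, so the error rises throughout. As the noise of one unit grows, the peak moves toward lower correlations.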

Figure 4:

Impact of stimuli and model parameters on optimal linear readout. (a) Examples of responses to two stimuli (red and blue circles). The two units have a correlation of 0.9. Mean stimulus values are indicated on the figure. (b) The impact of correlation depends on the distance between the two stimuli being discriminated. Classification error was solved analytically (see the appendix). The asterisk shows the point of maximum error obtained when one unit receives an input of 3 Hz for stimulus 1 and 6 Hz for stimulus 2 as illustrated in panel a, right. (c) Impact of varying the noise gain of both units simultaneously. (d) Effect of varying the noise gain of a single unit. Solid black circles: example of a scenario with a correlation of 0.6 where increasing noise gain for one unit from 0.8 to 10 leads to a reduction in classification error. (e) Increasing the noise gain of one unit redistributes the variance of the firing rate along the corresponding axis. (f) The correlation corresponding to maximum error varies according to noise gain. Solid lines show the impact of altering the ratio of absolute stimulus differences for the two units. Dashed vertical line shows the value defined in equation 2.10 for a ratio of 0.5.


Paradoxically, increasing noise gain may reduce classification error in some circumstances. For instance, with the noise gain of one unit held fixed and a noise correlation of 0.6, substantially increasing the noise gain of the other unit from 0.8 to 10 leads to a decrease in error (see Figure 4d, filled black circles). Intuitively, this is because noise increases the spread of the data points associated with each given stimulus along the axis corresponding to that unit (see Figure 4e). If such an increase in spread reduces the overlap between the categories, the result will be an improvement in classification.
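A small numerical check of this effect, under the same assumed gaussian setting as above (shared covariance, noise standard deviation used as a proxy for noise gain). The stimulus differences and the fixed noise level of the second unit are hypothetical choices; only the correlation of 0.6 and the change from 0.8 to 10 follow the example in the text.

```python
from math import erfc, sqrt

def readout_error(rho, delta, sigma):
    """Optimal linear readout error for two gaussian classes that share
    a covariance matrix (per-unit noise sd `sigma`, correlation `rho`)."""
    a1, a2 = delta[0] / sigma[0], delta[1] / sigma[1]
    d2 = (a1**2 + a2**2 - 2.0 * rho * a1 * a2) / (1.0 - rho**2)
    return 0.5 * erfc(sqrt(d2) / (2.0 * sqrt(2.0)))

rho = 0.6
delta = (2.0, 2.0)  # hypothetical stimulus differences
err_low_noise = readout_error(rho, delta, sigma=(0.8, 0.8))
err_high_noise = readout_error(rho, delta, sigma=(10.0, 0.8))
noise_helped = err_high_noise < err_low_noise
```

In this sketch the very noisy unit carries almost no stimulus information, yet it still shares noise with the other unit, so the readout can use it to subtract part of that shared noise.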

To further analyze these results relating to the gain of noise, we first reformulate equation 2.5 as follows:
[Equation 2.6]
Substituting for the model parameters of equation 2.1 yields
[Equation 2.7]
Extending this analysis to any given model parameter, we can write
[Equation 2.8]
where one term is a function of the parameter of interest and the other is a constant with respect to that parameter. Given the results of Table 1, one can substitute the appropriate expressions for these two terms into equation 2.8. Taking the noise gain as an example, where
[Equation 2.9]
and
[Equation 2.10]
this yields
[Equation 2.11]
A plot of equation 2.11 reveals a nonlinear effect of noise gain on the correlation associated with maximum error (see Figure 4f). This correlation increases linearly with increased noise gain up to a critical value, beyond which it decreases in inverse proportion to the noise gain. The intuition for this result is similar to that described in Figure 3e: at low values of noise gain, the two distributions have no preferred direction of variance, resulting in a small overlap. While a small increase in noise gain increases this overlap, a larger increase stretches the distributions preferentially along one axis, therefore reducing their overlap. A similar analysis applies to all other model parameters (see Table 1).
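The rise-then-fall shape can be reproduced in the standard setting of two gaussian classes with a shared covariance matrix, where the error peaks at the smaller of the two ratios of a_1 = \Delta_1/\sigma_1 and a_2 = \Delta_2/\sigma_2. The sketch below assumes that the noise standard deviation \sigma_1 of the first unit scales in proportion to its noise gain; the symbols are introduced here for illustration and need not match those of equations 2.9 to 2.11.

```latex
\[
\rho^{*}(\sigma_1) =
\begin{cases}
\dfrac{\Delta_2}{\Delta_1 \sigma_2}\, \sigma_1,
  & \sigma_1 \le \dfrac{\Delta_1 \sigma_2}{\Delta_2}, \\[2ex]
\dfrac{\Delta_1 \sigma_2}{\Delta_2}\, \dfrac{1}{\sigma_1},
  & \sigma_1 > \dfrac{\Delta_1 \sigma_2}{\Delta_2}.
\end{cases}
\]
```

The first branch grows linearly with the noise of the altered unit, the second decays as its inverse, and the two meet at \rho^{*} = 1.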
Table 1: Analytical Solutions of the Impact of Model Parameters on Classification Error.
Parameters: Absolute Difference between Stimuli | Integration Constant | Leak Term | Noise Gain

Note: The variables in this table refer to equations 2.9 and 2.10.

In sum, we showed that the effect of noise correlations on an optimal linear classifier depends on detailed parameters as well as the strength of the input. Different configurations of these parameters can lead to drastically different conclusions on the impact of noise correlations. These effects may explain the divergence in results obtained over the last decades in experimental and computational studies reporting either helpful or detrimental effects of noise correlations (Averbeck et al., 2006). While some of the above results are specific to the model described here, the analyses we derive rely on few assumptions and may therefore be applicable to a broad class of neural models whose activity approximates a multivariate distribution.

One caveat of the above analyses is that we did not consider a scenario where the direction of maximum variance in neural responses is different for the two stimuli being classified. However, theoretical work shows that the impact of unequal variances on classification is expected to be minimal: the angles of maximum variance can vary by as much as 40 degrees while error remains low (da Silveira & Berry, 2014). Next, we expanded the model to a population of units that capture neural activity in areas MT and LIP during decision making.

2.2  Motion Discrimination Task

We considered a neural circuit comprising simple integrator units organized into two layers representing areas MT and LIP, respectively (see section 5). Units in layer MT had gaussian tuning curves (equally spaced, with variance of 20 degrees and amplitude of 1 arbitrary unit) that were sensitive to sensory motion orientation (see Figure 5a). The activation of these units by oriented input was relayed to the LIP layer, which encoded the direction of decision choices in the model. We employed this simplified circuit to capture neural activity in a task of motion discrimination (Katz et al., 2016; Roitman & Shadlen, 2002). In this task, monkeys indicate via a saccade the direction of moving dots on a visual display (see Figure 1a). Unless otherwise indicated, we focused on stimuli with 100% coherence (i.e., all dots moving in the target direction).
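The MT tuning described above can be sketched as follows. The number of units (six, as in Figure 5a), the circular wrapping of orientation differences, and the reading of the 20-degree figure as a standard deviation are assumptions made for this example; the function name `mt_tuning` is hypothetical.

```python
import numpy as np

def mt_tuning(theta_deg, n_units=6, width_deg=20.0, amplitude=1.0):
    """Responses of equally spaced MT units to a motion orientation.

    width_deg is treated as the standard deviation of each gaussian
    tuning curve (an assumption of this sketch).
    """
    preferred = np.arange(n_units) * 360.0 / n_units
    theta = np.asarray(theta_deg, dtype=float)[..., None]
    # Wrap differences into [-180, 180) so that tuning is circular.
    diff = (theta - preferred + 180.0) % 360.0 - 180.0
    return amplitude * np.exp(-0.5 * (diff / width_deg) ** 2)

# Responses of the six units to a stimulus moving at 120 degrees.
responses = mt_tuning(120.0)
```

The unit whose preferred orientation matches the stimulus responds at full amplitude, and its neighbors respond weakly because adjacent preferred orientations are 60 degrees apart.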

Figure 5:

Computational simulation of MT and LIP activity for the motion discrimination task. (a) Orientation-selective tuning of six MT units. Top: The normalized activation (“norm. activation”) of MT units in response to oriented stimuli. Solid black line: example of stimulus with 25.6% coherence, leading to a larger activation near 270 degrees compared to 90 degrees. Arrows show the peak amplitude of activation for each stimulus. Bottom: Neural input into MT units obtained by multiplying the normalized activation (top panel) with the stimulus. (b) Schematic representation of the model used for the motion discrimination task. MT units are tuned to respond to selective orientations following gaussian tuning curves (panel a). Activation from the MT layer propagates to LIP units in a one-to-one fashion, where each MT unit is associated with a single LIP unit. Small arrows correspond to preferred orientation relative to the motion of sensory input (for MT) and the motor response orientation (for LIP). (c) Firing rate of MT and LIP units in response to a stimulus in the motion discrimination task (each solid line is a single unit, with color corresponding to panels a and b). Solid gray line shows the duration of stimulus into MT units. Stimulus orientation is indicated by the asterisk above the figure. (d) Fisher linear discriminant analysis that considered either the simulated activity from MT or LIP, or both. Cross-validation (c.v.) error reflects the accuracy of linear classification with respect to correct choices. Time is relative to stimulus onset. Dashed line: expected chance performance. (e) Linear discriminant analysis of LIP single-cell data during the motion discrimination task using nonoverlapping time windows of 10 ms in activity. Shaded area: SEM. (f) Higher stimulus coherence leads to lower classification error of simulated activity. Dashed lines: best-fitting lines of regression.


Crucially, the motion discrimination task has a one-to-one correspondence between the direction where the majority of dots are moving and the direction of the behavioral decision. To capture this straightforward mapping from visual information to decision, we designed connections between MT and LIP that projected in a one-to-one fashion between units of the two layers (see Figure 5b). As a result, each LIP unit received unique (i.e., nonshared) inputs from MT. Here, as in previous work, we focus on the feedforward pathway between MT and LIP and do not explicitly consider the role of feedback from LIP to MT (Engel, Chaisangmongkon, Freedman, & Wang, 2015). These feedback projections are comparatively weaker and restricted to layer I of MT (Blatt et al., 1990); further work will be required to isolate their effect. Following the stimulus-driven response of MT units, activation propagated to LIP, where activity was characterized by a gradual buildup over the course of a trial (see Figure 5c). LIP activation in response to oriented sensory input occurred because of afferent MT connections and despite simulated LIP units having no explicit sensory tuning, in line with experimental evidence that LIP neurons respond to motion stimuli within their response fields even in the absence of saccades (Balan & Gottlieb, 2009; Pesaran & Freedman, 2016).
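A minimal, noise-free sketch of this feedforward arrangement is given below. The identity weight matrix stands for the one-to-one MT-to-LIP mapping; the time constants, trial duration, and the function name `run_trial` are hypothetical choices, and the full model described in section 5 also includes noise.

```python
import numpy as np

n_units = 6
preferred = np.arange(n_units) * 60.0  # preferred orientations in degrees
W = np.eye(n_units)                    # one-to-one MT -> LIP weights (assumed unit strength)

def run_trial(theta_deg, t_max=0.5, dt=0.001, tau_mt=0.02, tau_lip=0.2):
    """Euler integration of a two-layer leaky integrator circuit."""
    diff = (theta_deg - preferred + 180.0) % 360.0 - 180.0
    drive = np.exp(-0.5 * (diff / 20.0) ** 2)  # gaussian tuned input to MT
    mt = np.zeros(n_units)
    lip = np.zeros(n_units)
    lip_trace = []
    for _ in range(int(t_max / dt)):
        mt += dt / tau_mt * (-mt + drive)       # fast sensory response
        lip += dt / tau_lip * (-lip + W @ mt)   # slower buildup driven by MT
        lip_trace.append(lip.copy())
    return mt, np.array(lip_trace)

mt_final, lip_trace = run_trial(theta_deg=60.0)
```

Because the slower LIP time constant is the only difference between the layers here, LIP activity builds up gradually toward the MT response, and the most active LIP unit is the one paired with the MT unit tuned to the stimulus.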

To examine whether an ideal observer could perform an accurate classification of the stimuli based on neural activity from MT and LIP, we applied an optimal linear readout that aimed to classify activity based on the direction of an accurate decision (see section 5). Classification error of the readout gradually decreased over the course of a trial until it approached zero around 400 ms after stimulus onset (see Figure 5d). This is consistent with the rapid time course of decision making based on visual input (DiCarlo, Zoccolan, & Rust, 2012). This result was also consistent with an optimal linear readout of experimental data obtained from LIP during the motion discrimination task (see Figure 5e), despite no attempts at a precise quantitative fit.

In the model, a similar result was obtained whether the linear classifier considered only MT activity, only LIP, or both (see Figure 5d). Therefore, deactivating LIP by not entering it into the linear classifier did not alter performance in a meaningful way. This result is consistent with recent experimental evidence on muscimol deactivation of LIP during the motion discrimination task (Katz et al., 2016).

Importantly, the above results depend on the ability of the linear readout to receive inputs from both MT and LIP separately. While the integration of decision-relevant signals remains to be fully characterized, one candidate region is the ventral intraparietal area (VIP), which processes multimodal information and receives distinct connectivity from MT and LIP (Blatt et al., 1990; Lewis & Van Essen, 2000). Functionally, VIP reflects choice-related signals that guide decision making and are independent of pure sensory tuning (Zaidel, DeAngelis, & Angelaki, 2017), thus making VIP a potential site for behavior-related readout of neural signals from MT and LIP.

A follow-up series of simulations varied stimulus coherence (see section 5). Stimulus coherence was linearly related to readout error in the model; this result was found when decoding MT activity, LIP activity, or both (see Figure 5f). In all cases, decoding error based on simulated activity at the end of the trial was typically low (less than 5%). A similar relation between readout error and coherence was found in analyses of LIP data (see Figure 5f). Next, we considered what happens when deactivating LIP during a task of stimulus classification, where the directions of visual input and decision output do not correspond in a straightforward manner.

2.3  Stimulus Categorization Task

We altered the above population model to capture activity during a stimulus categorization task (Sarma et al., 2016). In this task, subjects must indicate whether dot motion stimuli belong to the same broad category (see Figures 1b and 1c). To capture LIP activity during this task, we assumed that a subpopulation of LIP units responds preferentially to each of the two category orientations (see Figure 1c). In this way, LIP units transformed motion information to categorical information. LIP units in the model were segregated into two nonoverlapping subpopulations whose tuning curves reflected categorical information by showing a selective response to stimuli from one of two preferred categories (see Figure 6a).

Figure 6:

Simulations of MT and LIP during the categorization task. (a) Modeled LIP units received activation from a shared pool of MT units, reflecting the category membership of orientation-tuned inputs. The category selectivity of MT and LIP units is indicated by small arrows. (b) Firing rate of MT and LIP units for a given stimulus (indicated by asterisk). Solid gray line shows the duration of stimulus into MT units. (c) Sum of squared differences of within- versus between-category distances (WCD and BCD, respectively) in the activation of MT and LIP units. Each dot represents an individual unit, where activation is averaged across eight stimuli orientations and 100 trials. Dashed line: unity. (d) Left: Linear discriminant analysis of simulated activity during the categorization task. Middle: Novel task (see Figure 1c). Right: Categorization task employing a model with one-to-one connectivity from MT to LIP. Inset shows the configuration of the model with either shared or unique connectivity between MT and LIP. (e) Classification accuracy related to distance of the stimuli from the categorization boundary. Vertical bars: SEM. Filled circles are averaged over stimuli and trials.

We hard-wired the model such that MT units corresponding to each category projected to distinct LIP subpopulations (see Figure 6a). In this way, LIP units within a given subpopulation received shared inputs from MT. Related computational work shows that a similar configuration of MT-LIP projections can be learned by a reward-based plasticity rule (Engel et al., 2015) or a supervised Hebbian learning rule, as described below.

Stimuli were injected in the model as in the motion discrimination task (see Figure 5a); however, only a single stimulus orientation per trial was present for the categorization task. Activation of MT units by a stimulus generated a population response in LIP with a sustained time course of activation exceeding that of MT units (see Figure 6b). Further, LIP population activity reflected the category membership of the sensory input (e.g., the “blue” stimulus activating MT leads to the activation of the “blue” category in LIP in Figure 6b). This characteristic of simulated LIP activity captured experimental LIP recordings during the sample and delay periods of the categorization task (Sarma et al., 2016).

To further examine the category-specific responses of simulated LIP activity, we presented the model with individual stimuli in each of eight orientations (see Figure 1c), with 100 trials for each direction. We then computed two scores: within- and between-category differences (WCD and BCD, respectively) as in experiments (Freedman & Assad, 2006). These scores were obtained by taking the sum of squared differences between firing rates within (WCD) and between (BCD) stimulus categories. We found that LIP yielded markedly higher scores than MT (see Figure 6c), in accord with experimental results. Thus, LIP accentuated the difference in firing rate between stimuli belonging to different categories, while keeping the difference between stimuli of the same category relatively smaller.
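The two scores can be sketched as follows for a single unit. This is a minimal sketch that assumes the sums run over all pairs of stimulus directions; the function name and the toy firing rates are hypothetical. Note that the number of within-category and between-category pairs differs (12 versus 16 for eight directions in two categories), so raw sums are not normalized for pair count.

```python
from itertools import combinations


def category_scores(rates, labels):
    """Return (WCD, BCD) for one unit.

    rates[k]  : trial-averaged firing rate for stimulus direction k
    labels[k] : category of stimulus direction k
    WCD sums squared rate differences over same-category pairs,
    BCD over different-category pairs.
    """
    wcd = 0.0
    bcd = 0.0
    for k, l in combinations(range(len(rates)), 2):
        diff_sq = (rates[k] - rates[l]) ** 2
        if labels[k] == labels[l]:
            wcd += diff_sq
        else:
            bcd += diff_sq
    return wcd, bcd


# Toy unit with a purely categorical response across eight directions.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
categorical_rates = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
wcd, bcd = category_scores(categorical_rates, labels)
print(wcd, bcd)  # 0.0 256.0
```

A unit with a perfectly categorical response has zero WCD and a large BCD; a unit tuned smoothly to direction would show nonzero values for both.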

Next, we entered the model's activity in an optimal linear readout that aimed to classify stimuli according to their target category. Simultaneously feeding both MT and LIP activity into the linear classifier led to above-chance performance beyond 400 ms after stimulus onset (see Figure 6d, left). Performance at a trial offset of 1200 ms was lower (68%) with stimuli presented close to the categorization boundary compared to stimuli presented farther away (see Figure 6e), capturing psychometric results (Freedman & Assad, 2006).

However, when we entered MT activity alone in the classifier, classification error did not fall below chance level (see Figure 6d, left). Therefore, “deactivating” LIP by not entering it in the readout markedly disrupted performance. This is consistent with MT exhibiting motion-relevant but not category-relevant activation, and hence not serving as an adequate basis for stimulus categorization. This result is in stark contrast with results obtained with the motion discrimination task, where an accurate readout could be obtained with MT alone (see Figure 5d). Taken in isolation, the motion discrimination task may therefore point to the misleading conclusion that LIP does not participate in computations involving the discrimination of motion stimuli. This masking effect is in fact due to the overlapping representations of MT and LIP during the motion discrimination task, given that the stimulus motion is straightforwardly related to the direction of a correct decision. Such overlap is not present in the categorization task.

Next, we examined how adequate projections from MT to LIP can be learned in order to perform the categorization task. We employed a supervised Hebbian learning method to train feedforward connection weights from MT to LIP (see section 5). Connections were adjusted through training (1000 trials) in order to learn a set of target associations between sensory motion directions encoded in MT and movement directions encoded in LIP (see Figure 7a, left). Beginning from a set of random feedforward connections (see Figure 7a, middle), the network refined its connectivity in order to reflect the target associations (see Figure 7a, right). This learning was gradual over the course of trials (see Figure 7b). These results show that accurate projections from MT to LIP can be learned using a simplified yet biologically grounded learning rule.

Figure 7:

Training projections from MT to LIP using supervised Hebbian learning. (a) Target associations are set to 1 (black squares) for connections between motion direction encoded in MT units and movement direction encoded in LIP units, and zero (white squares) otherwise. Connection weights are initialized at random following a uniform distribution between 0 and 1. Final connection weights reflect the target associations. (b) Decrease in error over the course of 1000 trials of Hebbian learning. Dashed line: random performance.

One limitation of the categorization task is that it does not offer a complete dissociation between sensory input and correct decision. While it does offer some form of dissociation (above and beyond the motion discrimination task), the direction of correct decision corresponds approximately to the mean of the within-category sensory inputs. To address this issue, we developed a novel task where sensory motion offered a low correspondence to target decisions (see Figure 1c). As with the categorization task, a readout of activity during the novel task shows accurate classification with LIP alone but not with MT alone (see Figure 6d, middle). This result reinforces the notion that LIP plays a central role in visuomotor transformations during tasks where such mapping is not straightforward.

The model employed here to perform the categorization task relied on shared inputs between MT and corresponding subpopulations of LIP units (see Figure 6a). To illustrate the importance of shared connectivity in the categorization task, we examined what happens if we take the model configured for the motion discrimination task and apply it to the categorization task. Specifically, we maintained one-to-one connectivity between MT and LIP units (see Figure 5b) and entered the resulting activity in a linear readout. In this scenario, classification error did not decrease below chance level (see Figure 6d, right). Therefore, one-to-one connectivity did not allow for an adequate classification of neural activity in the categorization task.

An important consequence of shared connectivity between MT and LIP (see Figure 6a) is an increase in LIP noise correlations. Mean Pearson correlations of LIP units across trials and stimuli were . By comparison, running the categorization task using a model with one-to-one connectivity (as used for the motion discrimination task; see Figure 5b) yielded mean correlations of . While these results are likely an overestimate of true correlations given the simplicity of the mean rate model, they nonetheless outline an important prediction of the model that feedforward propagation of activity from MT to LIP during the categorization leads to an increase in noise correlations compared to the motion discrimination task. The consequences of increased noise correlations on stimulus classification depend on several factors, including the input statistics and the gain of noise, as examined earlier.
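The intuition that shared inputs raise noise correlations can be illustrated with a toy calculation. The sketch below is not the paper's mean-rate model: it assumes a fixed stimulus, independent unit-variance gaussian fluctuations in MT, and a private noise term of unit variance in each LIP unit; the function names and the pool size of four MT units are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def lip_pair_correlation(shared, n_trials=2000, n_pool=4, seed=0):
    """Trial-to-trial correlation between two toy LIP units.

    shared=True : both units sum the same pool of n_pool MT units.
    shared=False: each unit receives one distinct MT unit (one-to-one).
    """
    rng = random.Random(seed)
    lip_1, lip_2 = [], []
    for _ in range(n_trials):
        mt = [rng.gauss(0, 1) for _ in range(n_pool)]  # MT fluctuations
        if shared:
            drive_1 = drive_2 = sum(mt)
        else:
            drive_1, drive_2 = mt[0], mt[1]
        # Private noise in each LIP unit (assumed unit variance).
        lip_1.append(drive_1 + rng.gauss(0, 1))
        lip_2.append(drive_2 + rng.gauss(0, 1))
    return pearson(lip_1, lip_2)


shared_r = lip_pair_correlation(shared=True)
unique_r = lip_pair_correlation(shared=False)
print(shared_r, unique_r)
```

Under these assumptions the expected correlation is n_pool / (n_pool + 1) = 0.8 with shared inputs and zero with unique inputs; the simulated values fluctuate around these figures.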

3  Discussion

In this work, we developed a model circuit that highlighted the distinct roles of areas MT and LIP in decision making. Taking an “ideal observer” approach, we used an optimal linear readout of neural activity (Berberian et al., 2017) to show that pairwise correlations can have either a positive or a negative impact on neural decoding depending on several aspects of the stimuli and model parameters. Further, we argued that depending on the nature of the behavioral task, markedly different conclusions may be reached on the causal role of LIP. In tasks that exhibit a visuomotor confound, a prime example being the motion discrimination task, MT and LIP conveyed largely redundant information. Conversely, tasks that do not exhibit this confound, such as a categorization task, showed a clear role for LIP in the accurate classification of sensory input. Hence, we argue that the causal contribution of a region such as LIP may be easily confounded by task demands, an effect referred to as causal masking. In this section, we begin by exploring solutions to causal masking and then discuss future avenues for both experimental and theoretical work.

3.1  Solutions to Causal Masking

Causal masking occurs when a particular behavioral or cognitive task is linked to the activation of a given brain region, leading to the (possibly erroneous) conclusion of a causal relation between that region and aspects of the task. Such an effect is pervasive not only in cellular electrophysiology but also in studies of large-scale brain imaging (Honey, Thivierge, & Sporns, 2010).

In our results, causal masking occurs because of a visuomotor confound in the motion discrimination task. Such a confound is present in many studies showing that deactivation of posterior parietal cortex has little effect on performance (Chafee & Goldman-Rakic, 2000; Li, Mazzoni, & Andersen, 1999; Schiller & Tehovnik, 2003; Wardak, Olivier, & Duhamel, 2002). Conversely, tasks that feature an arbitrary mapping between sensory and motor components show an impact of LIP deactivation, for instance, on contralesional target detection time (Wardak, Olivier, & Duhamel, 2004). This effect is not just attentional, as increasing the number of visual distractors does not lead to progressively worsened performance (Liu, Yttri, & Snyder, 2010).

Unfortunately, causal masking is not straightforwardly resolved by the use of selective activation (e.g., optogenetics or microstimulation) or deactivation (e.g., local application of muscimol) (Panzeri, Harvey, Piasini, Latham, & Fellin, 2017). A simple example was provided in our study, where we selectively entered MT and LIP activity in an optimal linear decoder and reached markedly different conclusions depending on the nature of the task. Further examples of the failure of activation-deactivation approaches abound in the literature (Jonas & Kording, 2017; Marom et al., 2009).

It is possible to limit the effect of causal masking by placing a strong focus on hypothesis-driven behavioral tasks (Krakauer, Ghazanfar, Gomez-Marin, MacIver, & Poeppel, 2017). This is a particularly important point when considering the neural correlates of decision making, given that the causal role of associative cortical regions such as LIP cannot be inferred strictly on anatomical grounds (Katz, 2016). Hypothesis-driven tasks that examine causality do not have to be intricate or particularly sophisticated; although simple tasks may lack ecological validity, they provide a wealth of information on the basic computations performed by brain circuits (Thivierge & Marcus, 2007).

Giving a central role to hypothesis-driven behavioral tasks in understanding the function of brain circuits does not render useless the role of activation-deactivation approaches. LIP is part of a large-scale network of regions that contribute to decision making. In the motion discrimination task, for instance, the redundancy between MT and LIP may be thrown off by microstimulation (Hanks et al., 2006). Thus, results of microstimulation studies may be highly informative, as they reveal how decision making involves pooling information across a number of brain regions and how shifting the representation in one region affects this process.

As an illustration, consider three regions that represent evidence for a decision toward a given motor orientation or its opposite (see Figure 8a). These regions differ in the strength of evidence that they contribute toward a final decision outcome. In a simplified scenario, this outcome could be computed as a weighted average of individual contributions, where the weight is given by the strength of evidence (see Figure 8b). Microstimulation of a given region could bias the decision by altering both the direction of motor orientation and the strength of evidence. Depending on these factors, microstimulation may affect the final weighted average and therefore have an impact on the decision outcome (see Figure 8b). On the other hand, deactivation of the same given region may have no impact on the direction of the final decision (see Figure 8b). Several computational models of category-based decision making rely on a similar idea of pooling across subpopulations that each encode different aspects of the task (Amit, Brunel, & Tsodyks, 1994; Ardid & Wang, 2013; Engel & Wang, 2011). In our view, finding no effect of LIP deactivation is fully compatible with finding an effect of microstimulation in the same region; what is central is how information is pooled across regions and how much redundancy is present. If MT and LIP provide estimates of decision-relevant variables that are statistically dependent, then combining both regions will not tend to reduce the variance of a readout estimate when compared to a scenario where both regions provide independent estimates. Anatomically, statistical dependencies between the two regions likely arise from a combination of direct connectivity and shared common inputs.
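The weighted-average scenario can be made concrete with a toy calculation. The sketch below assumes binary votes (+1 or -1 for the two motor orientations) and arbitrary illustrative weights; the region names, weights, and function name are hypothetical and do not correspond to measured quantities.

```python
def pooled_decision(votes):
    """Sign of the weighted average of regional votes.

    votes maps a region name to (direction, weight), where direction is
    +1 or -1 and weight is the strength of evidence from that region.
    """
    total_weight = sum(weight for _, weight in votes.values())
    average = sum(d * weight for d, weight in votes.values()) / total_weight
    return 1 if average > 0 else -1


# Three regions that agree on the decision (redundant evidence).
baseline = {
    "region_1": (+1, 0.5),
    "region_2": (+1, 0.3),
    "region_3": (+1, 0.2),
}

# Deactivation: region_3 is removed from the weighted average.
deactivated = {name: v for name, v in baseline.items() if name != "region_3"}

# Microstimulation: region_3 now votes the other way with strong evidence.
stimulated = dict(baseline, region_3=(-1, 1.5))

baseline_choice = pooled_decision(baseline)
deactivated_choice = pooled_decision(deactivated)
stimulated_choice = pooled_decision(stimulated)
print(baseline_choice, deactivated_choice, stimulated_choice)  # 1 1 -1
```

With these values, removing the third region leaves the decision unchanged because the remaining regions carry the same vote, whereas biasing that region with a sufficiently strong opposite vote flips the pooled outcome.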

Figure 8:

Schematic illustration of the effect of microstimulation and deactivation on decision making in a framework where information across several regions is pooled via weighted averaging. Each arrow from panel a represents the “vote” of a given region toward a particular direction of behavior (e.g., saccadic eye movement). The strength of evidence contributed by each region is represented by color. The size of each arrow represents the strength of contribution of each region toward the final decision. Microstimulation may bias a given region toward a given direction of decision, as shown in panel b. Deactivation would remove that region from the weighted averaging. The net effect of such biases depends on how much it shifts the sum of weighted contributions from each region.

3.2  From Tasks to Computations

An adequate, hypothesis-driven quest for neural causality should be inspired by Marr's levels of understanding (Marr, 1982). Specifically, systems neuroscience longs for an “algorithmic” understanding of neural circuits. Such understanding focuses on computations rather than tasks and requires a shift in language from “region X is involved in task Y” to “region X performs computation Y,” where computations may serve a broad (yet, it is hoped, well delineated) range of tasks.

A shift in focus from a vocabulary based on tasks to one based on computations is necessary when considering that any given brain circuit may participate in a plurality of tasks. Computational models have begun to capture this important notion with networks of randomly connected neurons that perform different tasks by connecting to distinct readout units (Sussillo & Abbott, 2009; Vincent-Lamarre, Lajoie, & Thivierge, 2016). Importantly, these networks need no alteration to their internal connectivity in order to accommodate various tasks; all that is required are different sets of projections to trained readout units. If biological circuits operate on principles similar to these models, one may abandon the promise made by microconnectomics that an understanding of neural computation will be achieved by a detailed mapping of connectivity (Schröter, Paulsen, & Bullmore, 2017) or even a recording of all neurons in a given circuit (Pillow et al., 2008).

3.3  A Mapping Hypothesis

What are the computations performed by LIP according to our account? Our results suggest that LIP is ideally suited to perform a mapping of sensory input into more abstract categories linked to behavioral decisions (Freedman & Assad, 2006). While this mapping was explicitly defined in our model, it nonetheless serves to explain an important aspect of neural activity during the categorization task: the category-specific responses (see Figure 6c) observed during the sample and delay periods of the task (Sarma et al., 2016).

Going further, a mapping hypothesis generates a prediction on task-modulated noise correlations in LIP. Specifically, tasks that involve a straightforward visuomotor mapping between MT and LIP (such as the motion discrimination task) may result in lower noise correlations than tasks based on a more complex mapping (such as the categorization task). This is because the latter tasks lead to a reconfiguration of projections from MT that results in shared inputs into LIP targets (Engel et al., 2015).

Importantly, a mapping hypothesis is distinct from a hypothesis of feature binding (Shadlen & Movshon, 1999). Here, feature binding would require that multiple aspects of stimuli be combined in LIP. In a mapping hypothesis, however, there is no need to postulate that different features bind together, given that stimuli all vary along a single dimension (i.e., orientation). Further, muscimol deactivation of LIP shows an effect on tasks involving either one or many distinct features, so feature binding does not seem to matter extensively in this context (Wardak et al., 2004).

Could some of the conclusions on the role of LIP be influenced by sheer memory load? For instance, perhaps LIP deactivation has little effect in a task of lower memory load but a more pronounced effect with increased load. Even assuming that there was an uneven memory load across the two tasks described here, it is unclear whether LIP activation would be influenced by this effect, given that it does not appear to rely on memory-based comparisons (Shadlen & Shohamy, 2016). Rather, we suggest that it is not memory load per se that constitutes the determining factor, but rather the degree of abstraction in sensory-motor correspondence. To be sure, however, one would need to design a task that allows independent control of memory load and stimulus abstraction.

The postulate that LIP plays a role in abstract visuomotor mapping is one of many views on the computations performed in that region, which include evidence integration (Roitman & Shadlen, 2002), likelihood ratio estimation (Kira et al., 2015), regulation of a speed-accuracy trade-off (Hanks, Kiani, & Shadlen, 2014), and pattern separation (Berberian et al., 2017). Hence, our results highlight the contribution of LIP to one of several distinct possible, yet well-delimited, computations.

3.4  Decoding Noise Correlations

The beneficial or detrimental contribution of noise correlations to coding continues to spark debate (Averbeck et al., 2006; Franke et al., 2016; Zylberberg, Cafaro, Turner, Shea-Brown, & Rieke, 2016). Here, we shed light on this issue by developing a generative model of neural activity that can be analyzed using an optimal linear readout, thus providing an upper bound on the coding capacity of the model given certain parameter settings.

One distinguishing feature of the analysis provided here is that we examined both analytically and numerically how specific dynamical parameters (such as time integration, leak, and noise gain) mediate the relation between noise correlation and discrimination error. By doing so, we unveiled conditions where noise correlation has an impact on error in a nonmonotonic fashion (see Figures 3b to 3d), an effect that has not been reported elsewhere in the literature to the best of our knowledge. This is an important finding because it warns against broad conclusions on the impact of noise correlation that do not take into account dynamical parameters of a neural circuit, a point that has not been addressed by previous work.

By comparison, previous work has largely focused on how decoding (e.g., Fisher information) is affected when gradually more and more units are incorporated in the analysis (Abbott & Dayan, 1999; Hu, Zylberberg, & Shea-Brown, 2014; Kohn et al., 2016; Moreno-Bote et al., 2014; Sompolinsky et al., 2001; Panzeri, Treves, Schultz, & Rolls, 1999; Zylberberg et al., 2016). The few recent papers that have begun to examine the impact of model parameters have focused on the role of neural heterogeneity (Ecker et al., 2011; da Silveira & Berry, 2014), not dynamical parameters of the model's equations (see Table 1). Our work shows that these parameters play a crucial role in shaping how noise correlations impact decoding.

Finally, we provide formal analyses that offer direct quantitative predictions on the impact of correlations on linear decoding. This approach depicts a highly nontrivial picture of noise correlations on stimulus classification, one that depends on several assumptions including noise, stimulus characteristics, and dynamical parameters such as neuronal leak and the time constant of integration (see Table 1).

4  Conclusion

Our work offers a cautionary tale on the overinterpretation of causality associating a brain region such as LIP with a particular task or even sets of tasks. The effect of causal masking is pervasive in any computation that is distributed among brain regions. While LIP appears to be involved in the passage from motion information to saccadic choice (Law & Gold, 2008), this role may be shared among several regions, including superior colliculus and frontal eye field (Horwitz & Newsome, 1999; Kim & Shadlen, 1999), making it difficult to isolate the computations performed. The mapping hypothesis proposed here offers a possible computational role for LIP in assembling visual inputs into categories corresponding to motorcentric representations (Pesaran & Freedman, 2016).

In addition, our results argue against oversimplistic conclusions relating noise correlations to sensory decoding, in favor of a more nuanced picture based on characteristics of the stimuli, noise, and firing rate. The approach outlined here may be extended to more realistic neural circuits to gain further insight into how correlations affect the computations underlying decision making.

5  Methods

5.1  Model of MT-LIP Activity

Simulations of neural activity (see Figures 5 and 6) were performed with a two-layer model. Here, we describe the model only briefly and refer readers to recent work (Berberian et al., 2017). Neural activity was modeled with linear mean-rate units that have been successful at capturing LIP activity during decision making. The firing rate of simulated MT units relates to the motion orientation of visual stimuli, whereas LIP reflects the more abstract passage from sensory input to motor response.

The mean firing rate of MT units is described by
[formula 5.1]
where the state variables are the mean firing rates of individual MT and LIP units, and the remaining terms are an external input delivered at a given time, the within-layer connection strengths between MT cells, a constant of integration (in ms), a leak parameter (set to a default value), and gaussian noise with zero mean and unit variance.
Activity in the LIP layer evolved according to
[formula 5.2]
where the between-layer weights reflect feedforward connectivity and a time-dependent urgency signal acts as a gain on inputs originating from MT. This urgency is included in many related models that feature a time-varying gain (Churchland, Kiani, & Shadlen, 2008; Cisek, Puskas, & El-Murr, 2009; Ditterich, 2006; Drugowitsch, Moreno-Bote, Churchland, Shadlen, & Pouget, 2012; Thura, 2016; Thura, Beauregard-Racine, Fradet, & Cisek, 2012; Thura & Cisek, 2016). This simplified account of LIP activity aims to capture the broad pattern of activity observed in motion discrimination and categorization tasks during the stimulus presentation and delay period only; it does not include an explicit behavioral output or motor feedback.
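To make the ingredients of such a model concrete, the sketch below simulates a generic two-layer linear rate model with a leak, an external input to MT, and an urgency gain on the feedforward drive to LIP. It is an assumption-laden sketch, not the paper's equations: the exact form of the dynamics, the parameter values (`tau`, `leak`, `noise_gain`), the linear urgency ramp, and the omission of within-layer connectivity are all hypothetical, and a simple Euler-Maruyama step stands in for the fourth-order Runge-Kutta scheme used by the authors.

```python
import math
import random


def simulate_trial(n_units=4, stim_unit=0, t_max=500.0, dt=1.0,
                   tau=100.0, leak=1.0, noise_gain=0.0, seed=0):
    """Simulate one trial and return the LIP activity at every time step.

    Assumed dynamics (illustrative only):
      tau * dx_i/dt = -leak * x_i + I_i(t) + noise          (MT)
      tau * dy_i/dt = -leak * y_i + u(t) * sum_j V_ij x_j + noise   (LIP)
    """
    rng = random.Random(seed)
    # One-to-one feedforward weights (identity matrix).
    V = [[1.0 if i == j else 0.0 for j in range(n_units)]
         for i in range(n_units)]
    mt = [0.0] * n_units
    lip = [0.0] * n_units
    lip_trace = []
    for step in range(int(t_max / dt)):
        t = step * dt
        urgency = t / t_max  # linear ramp from 0 to 1 (assumption)
        ext = [1.0 if i == stim_unit else 0.0 for i in range(n_units)]
        new_mt, new_lip = [], []
        for i in range(n_units):
            feedforward = sum(V[i][j] * mt[j] for j in range(n_units))
            noise_mt = noise_gain * math.sqrt(dt) * rng.gauss(0, 1)
            noise_lip = noise_gain * math.sqrt(dt) * rng.gauss(0, 1)
            new_mt.append(
                mt[i] + (dt * (-leak * mt[i] + ext[i]) + noise_mt) / tau)
            new_lip.append(
                lip[i]
                + (dt * (-leak * lip[i] + urgency * feedforward) + noise_lip)
                / tau)
        mt, lip = new_mt, new_lip
        lip_trace.append(lip[:])
    return lip_trace


lip_trace = simulate_trial()
print(lip_trace[99][0], lip_trace[399][0])
```

With noise switched off, the LIP unit paired with the stimulated MT unit shows a gradual buildup over the trial, while the other LIP units stay at zero. Setting `noise_gain` above zero adds trial-to-trial variability.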

The resulting model is one where information and urgency gradually shape firing rate over the course of a given trial (Morcos & Harvey, 2016). This account differs from a strict accumulator winner-take-all model of neural activity during decision making (Wong & Wang, 2006). We opted against such a mechanism for several reasons. First, the presence of high trial-to-trial variability in LIP is inconsistent with the convergence of activity toward a single low-dimensional attractor. Second, experimental recordings show that neuronal variability is markedly greater than expected by classic attractor models (Brody, Hernandez, Zainos, & Romo, 2003; Shafi et al., 2007). In fact, persistent activity during tasks of working memory is typically irregular and Poisson-like (Baeg et al., 2003; Compte et al., 2003). Finally, both experimental and theoretical results suggest that dimensionality remains large even at the decision point (Berberian et al., 2017). We do note that despite these points, the debate over which model best represents the neural and psychophysical correlates of decision making remains unresolved (Thura, 2016).

5.2  MT Within-Layer Connectivity

Within-layer connectivity in MT followed a number of rules. First, 80% of units were chosen to be excitatory, and the remaining 20% were inhibitory. Following Dale's law, a given excitatory or inhibitory unit had only positive or only negative outgoing connection weights, respectively. Second, within-layer connectivity in MT was sparse, with only 20% of all possible connections present (chosen randomly among all possible connections). Self-connections were not permitted. Third, we balanced the sum of excitation and inhibition coming into each unit. The result was an approximate cancellation of the sum of afferent inputs as evidenced in cortical networks (Haider, Duque, Hasenstaub, & McCormick, 2006; Shu, Hasenstaub, & McCormick, 2003), albeit at a mean-rate level instead of detailed synaptic inputs. Fourth, connectivity followed a “sunken Mexican hat” profile whereby units with a similar orientation preference interacted via excitatory connections and units with a dissimilar orientation preference interacted via inhibitory connections. This connectivity was obtained by subtracting one gaussian function (amplitude: 0.4; SD: 0.43) from another (amplitude: 0.45; SD: 0.38), where the means of both functions corresponded to the preferred orientation of each neuron. Finally, all connection weights were scaled by a factor that depends on the total number of input connections.
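The difference-of-gaussians profile can be sketched directly from the amplitudes and standard deviations given above. The sketch assumes that the broader gaussian is subtracted from the narrower one (so that similar preferences yield positive weights), that orientation differences are measured in radians on a circle, and that the layer has 64 evenly spaced units; these choices, along with the function names, are hypothetical. Sparsity, excitatory/inhibitory assignment, balancing, and weight scaling are omitted.

```python
import math


def sunken_mexican_hat(delta):
    """Weight as a function of the difference in preferred orientation."""
    narrow = 0.45 * math.exp(-delta ** 2 / (2 * 0.38 ** 2))
    broad = 0.40 * math.exp(-delta ** 2 / (2 * 0.43 ** 2))
    return narrow - broad


def circular_difference(a, b):
    """Smallest absolute angular difference between a and b (radians)."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def build_weights(n_units=64):
    """Dense weight matrix from the profile, without self-connections."""
    preferred = [2 * math.pi * k / n_units for k in range(n_units)]
    weights = []
    for i in range(n_units):
        row = []
        for j in range(n_units):
            if i == j:
                row.append(0.0)  # self-connections are not permitted
            else:
                delta = circular_difference(preferred[i], preferred[j])
                row.append(sunken_mexican_hat(delta))
        weights.append(row)
    return weights


weights = build_weights()
peak = sunken_mexican_hat(0.0)
print(peak, weights[0][1], weights[0][16])
```

Under these assumptions, nearby units (such as units 0 and 1) receive a small positive weight, whereas units with orthogonal preferences (units 0 and 16) receive a negative weight.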

5.3  LIP Within-Layer Connectivity

Connectivity within the LIP layer of the model followed similar principles to MT (see above). However, for the categorization task, connections between LIP units of the same preferred response orientation were excitatory (), whereas units of opposite orientations were inhibitory ().

5.4  Between-Layer Connectivity

Feedforward projections from MT to LIP were designed differently based on the task performed. For the motion-discrimination task, each MT unit was paired with a distinct LIP unit, reflecting the fact that each stimulus orientation was associated with a distinct motor response orientation (see Figure 5b). For the categorization task, we explicitly set those connections in order for LIP to combine MT sensory signals within the same response category (see Figure 6a).

In Figure 7, we employed a supervised Hebbian learning method to train connection weights from MT to LIP. This method updated connections as follows,
[Equation 5.3]
where 0.1 is a learning constant and is a teacher signal set to 1 when a given orientation direction in MT links to a target direction of movement encoded in LIP, and zero otherwise (see Figure 7a). Initial connections from MT to LIP were drawn from a uniform distribution within the range [0,1]. The learning error of the Hebbian rule was computed by first thresholding the weights such that for , and otherwise. Then we computed the percentage difference between (see Figure 7a) and (see Figure 7b).

We performed all simulations using a fourth-order Runge-Kutta integration in the Matlab programming language (Mathworks, Natick, MA). Custom code in the Matlab language is available from the authors on request.
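For readers unfamiliar with the integration scheme, a fourth-order Runge-Kutta step can be sketched in Python as follows. The leaky rate unit used here, with a time constant of 20 ms and a constant drive of 30 Hz, is a hypothetical stand-in chosen for illustration; it is not the model's own equation, and the names `rk4_step`, `leaky_rate`, and `integrate` are assumptions.

```python
import math

def rk4_step(f, t, x, dt):
    """Advance x by one classical fourth-order Runge-Kutta step of size dt."""
    k1 = f(t, x)
    k2 = f(t + dt / 2.0, x + dt * k1 / 2.0)
    k3 = f(t + dt / 2.0, x + dt * k2 / 2.0)
    k4 = f(t + dt, x + dt * k3)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def leaky_rate(tau_ms, drive_hz):
    """Hypothetical leaky unit: tau * dx/dt = -x + drive."""
    return lambda t, x: (-x + drive_hz) / tau_ms

def integrate(f, x0, t_end_ms, dt_ms):
    """Integrate from t = 0 to t_end_ms with a fixed step and return x(t_end)."""
    x, t = x0, 0.0
    for _ in range(int(round(t_end_ms / dt_ms))):
        x = rk4_step(f, t, x, dt_ms)
        t += dt_ms
    return x

# After one time constant the unit should be close to drive * (1 - exp(-1)).
x_tau = integrate(leaky_rate(20.0, 30.0), 0.0, 20.0, 1.0)
exact = 30.0 * (1.0 - math.exp(-1.0))
```

For this linear test problem the exact solution is known, which makes it a convenient check of the step function before applying it to a full network.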

5.5  Decision-Making Tasks

To ensure a straightforward comparison between the motion discrimination and categorization tasks, and unless otherwise stated, we limited simulations to a scenario where inputs had 100% coherence, resulting in unambiguous stimuli. Inputs were turned on at trial onset (0 ms) and shut off after 500 ms for the motion discrimination task (Meister et al., 2013) and 650 ms for the categorization task (Freedman & Assad, 2006).

For both the motion discrimination and categorization tasks, the peak amplitude of the stimulus was set to 30 Hz, corresponding to the approximate change in firing rate observed experimentally from baseline to delay period. As in previous computational work, each stimulus was represented by a gaussian distribution (SD 15 degrees) centered around a single motion orientation (Berberian et al., 2017; see Figure 5a). Tuning curves of all units (in units of normalized firing rate, given their range of [0,1]) were multiplied by the stimulus, and the result was entered as input into MT units of the model ( in equation 5.10). More specifically, the stimulus S was stored in a matrix of size 360 (degrees orientation) by time steps ( ms for a given trial); tuning curves C were stored in a matrix of size 360 (degrees orientation) by units (set to 100 by default). The input was computed as $I = C^{T}S$ (where $T$ indicates a matrix transpose), resulting in a matrix of size units by time steps. This input assumes that the stimulus is within both the receptive field of activated MT units and the response field of corresponding LIP units. Finally, the urgency ramp ( in equation 5.2) was initialized at zero at the beginning of each trial, then increased with a slope of 0.01 Hz/ms approximating experimental results (Roitman & Shadlen, 2002).
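The construction of the input matrix described above can be sketched in Python as follows. The matrix sizes, the 30 Hz peak, the 15-degree stimulus SD, and the 0.01 Hz/ms urgency slope come from the text. The gaussian shape and 30-degree width of the tuning curves, the evenly spaced preferred orientations, the circular distance, the time step, and all function names are assumptions made for illustration.

```python
import math

def gaussian(theta, mu, sd):
    """Unnormalized gaussian on the circle (distance wrapped to [-180, 180))."""
    d = (theta - mu + 180.0) % 360.0 - 180.0
    return math.exp(-0.5 * (d / sd) ** 2)

def stimulus_matrix(center_deg, n_steps, on_steps, peak_hz=30.0, sd_deg=15.0):
    """S: 360 x n_steps. The stimulus is on for the first on_steps steps."""
    profile = [peak_hz * gaussian(th, center_deg, sd_deg) for th in range(360)]
    return [[profile[th] if t < on_steps else 0.0 for t in range(n_steps)]
            for th in range(360)]

def tuning_matrix(n_units=100, sd_deg=30.0):
    """C: 360 x n_units with values in [0, 1] (tuning width is an assumption)."""
    prefs = [360.0 * u / n_units for u in range(n_units)]
    return [[gaussian(th, prefs[u], sd_deg) for u in range(n_units)]
            for th in range(360)]

def input_matrix(C, S):
    """I = C^T S, a matrix of size n_units x n_steps."""
    n_units, n_steps = len(C[0]), len(S[0])
    return [[sum(C[th][u] * S[th][t] for th in range(len(C)))
             for t in range(n_steps)] for u in range(n_units)]

def urgency_ramp(n_steps, dt_ms=1.0, slope_hz_per_ms=0.01):
    """Urgency signal starting at zero and growing linearly within a trial."""
    return [slope_hz_per_ms * dt_ms * t for t in range(n_steps)]

# Small illustrative example: 20 units, 6 time steps, stimulus on for 4 steps.
C = tuning_matrix(n_units=20)
S = stimulus_matrix(center_deg=90.0, n_steps=6, on_steps=4)
I = input_matrix(C, S)
```

In this sketch the unit whose preferred orientation matches the stimulus center receives the largest input while the stimulus is on, and all inputs fall to zero once it is switched off.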

For one set of simulations involving the motion discrimination task (see Figure 5f), we varied the coherence of the stimulus by altering the relative strength of inputs to MT units while keeping the sum of peak firing rates (see Figure 5a, arrows) fixed at 30 Hz across trials. Peak firing rates for a given coherence were obtained as
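$$r_{\pm} = \frac{30}{2}\left(1 \pm \frac{c}{100}\right)\ \text{Hz},$$
where $c$ is the stimulus coherence in percent and $r_{+}$ and $r_{-}$ are the peak firing rates of the two stimuli.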
A coherence of 25.6%, for instance, yields peak firing rates of 18.84 Hz and 11.16 Hz for each stimulus, respectively. Stimulus coherences employed here matched those of experimental work (0, 3.2, 6.4, 12.8, 25.6, and 51.2%). We simulated 100 trials with each of these coherences.
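The worked values above (18.84 Hz and 11.16 Hz at 25.6% coherence, summing to 30 Hz) are consistent with a symmetric split of the total rate around 15 Hz. The short Python sketch below assumes that form; the function name `peak_rates` is hypothetical.

```python
def peak_rates(coherence_pct, total_hz=30.0):
    """Split a fixed total peak rate between two stimuli.

    Assumed form: half * (1 + c) and half * (1 - c), with c = coherence / 100.
    """
    c = coherence_pct / 100.0
    half = total_hz / 2.0
    return half * (1.0 + c), half * (1.0 - c)

# Coherence levels listed in the text.
rates = {c: peak_rates(c) for c in (0.0, 3.2, 6.4, 12.8, 25.6, 51.2)}
```

At 0% coherence both stimuli receive 15 Hz, so the two alternatives are indistinguishable at the input level, and the difference between the two peak rates grows linearly with coherence.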

5.6  Linear Discrimination Analysis

We employed LDA to compute the optimal boundary between responses of the model or LIP data corresponding to distinct sensory motion directions (motion discrimination task) or categories (categorization task) (Dehaqani et al., 2016; Rich & Wallis, 2016). We computed LDA over time using nonoverlapping time bins of 10 ms, a value that offers a good correlation between classification accuracy of single-unit LIP activity and psychophysics (Berberian et al., 2017). We employed ten-fold cross-validation to estimate classification error (see Figures 5 and 6). This was achieved numerically by splitting the data set into 10 subsets, training LDA on 9 of these subsets, testing on the 10th subset, and repeating until every subset had been used as the test set. The final cross-validation error corresponded to the average error obtained over all ten folds.
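The ten-fold procedure described above can be sketched in Python for the simplest case of two classes and two units. This is a minimal illustration, not the authors' Matlab code. The synthetic gaussian data, the equal-prior midpoint threshold, and all function names are assumptions.

```python
import random

def mean_vec(rows):
    return [sum(r[k] for r in rows) / len(rows) for k in range(2)]

def fit_lda(x0, x1):
    """Two-class LDA in 2-D with a pooled scatter matrix; returns (w, threshold)."""
    m0, m1 = mean_vec(x0), mean_vec(x1)
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((x0, m0), (x1, m1)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # w = S^{-1} (m1 - m0), using the closed-form 2 x 2 inverse.
    w = [(s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
         (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det]
    # Midpoint of the projected class means (assumes equal priors).
    thr = 0.5 * sum(w[k] * (m0[k] + m1[k]) for k in range(2))
    return w, thr

def predict(w, thr, r):
    return 1 if w[0] * r[0] + w[1] * r[1] > thr else 0

def cv_error(data, labels, k=10, seed=0):
    """Average test error over k folds, each fold used once as the test set."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    errors = []
    for fold in folds:
        held_out = set(fold)
        x0 = [data[i] for i in idx if i not in held_out and labels[i] == 0]
        x1 = [data[i] for i in idx if i not in held_out and labels[i] == 1]
        w, thr = fit_lda(x0, x1)
        wrong = sum(predict(w, thr, data[i]) != labels[i] for i in fold)
        errors.append(wrong / len(fold))
    return sum(errors) / k

def make_data(n_per_class=200, sep=3.0, seed=1):
    """Synthetic two-class data (hypothetical): unit-variance gaussian clouds."""
    rng = random.Random(seed)
    data, labels = [], []
    for lab, mu in ((0, 0.0), (1, sep)):
        for _ in range(n_per_class):
            data.append([rng.gauss(mu, 1.0), rng.gauss(0.0, 1.0)])
            labels.append(lab)
    return data, labels

data, labels = make_data()
error = cv_error(data, labels)
```

Because each data point is tested exactly once by a classifier that never saw it during training, the averaged error estimates generalization rather than fit to the training set.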

5.7  Macaque Electrophysiology

The LIP data analyzed here (see Figures 5e and 5f) were obtained from previous work in which two rhesus monkeys learned a motion discrimination task (Roitman & Shadlen, 2002). We considered only the spike times of neurons whose response was maximal at the stimulus orientation presented. Responses of neurons recorded over repeated trials (with the same behavioral decision) were averaged together before being entered into LDA. A total of 54 single units were analyzed.

Appendix

Here we analytically compute estimates of classification error for an optimal linear readout applied to a mean-rate model of neural activity.

We begin by assuming two units with firing rates and that follow the same dynamics and do not interact (see equation 2.1), and therefore have zero correlation. The solution to these units can be expressed as an Ornstein-Uhlenbeck process,
[Equation A.1]
where , , and . The stationary, asymptotic expected mean and variance of this process are, respectively,
[Equation A.2]
[Equation A.3]
We assume that during different trials, unit receives two distinct inputs from MT, denoted and for simplicity. We let where and . Similarly, we define for the second input. This results in , where
[Equation A.4]
with a variance-covariance matrix
[Equation A.5]
and its inverse,
[Equation A.6]
To perform an optimal linear readout, we first obtain the total within-stimulus covariance matrix:
[Equation A.7]
Next, we define the optimal linear projection (OLP) for unit as a set of weighted connections from to a linear readout that best classifies the input (Berberian et al., 2017). The OLP for unit is
[Equation A.8]
Using the OLP, we obtain the means of the projected joint distribution,
[Equation A.9]
with variance
[Equation A.10]
The optimal categorization threshold is given by
[Equation A.11]
Next, we aim to estimate error rates of the OLP, corresponding to the probability that a point will fall on the wrong side of (i.e., where the correct side of is the side closest to the distribution mean) and therefore be misclassified (see Figure 2b). This probability is given by the improper integral of a gaussian probability distribution function centered on a given stimulus. For simplification, we shift the means of the projected joint distribution by such that the threshold value becomes zero, yielding
[Equation A.12]
and
The estimated error is obtained by
[Equation A.13]
where is the axis of projection of the data distribution. This error can be further simplified by first defining the squared Mahalanobis distance between means,
[Equation A.14]
leading to
[Equation A.15]
The above result assumes zero correlation between units. In order to introduce some positive noise correlation between and , the variance-covariance matrix becomes
[Equation A.16]
with inverse
[Equation A.17]
assuming asymptotically stable behavior of the model. The conditions for such behavior are provided elsewhere (Berberian et al., 2017). Importantly, the analysis assumes that noise correlations do not have an impact on firing rates. While we acknowledge such an effect in biological circuits (de la Rocha et al., 2007), isolating the effect of correlations allows us to investigate their role independent of their effect on firing rates (Hu et al., 2014). This is central to understanding the role of noise correlations on stimulus decoding.
With equations A.16 and A.17 introducing correlations into the variance-covariance matrix, the OLP for each unit becomes
[Equation A.18]
The means of each projected joint distribution are
[Equation A.19]
with variance
[Equation A.20]
The estimated error can be calculated by incorporating the above result directly into equation A.13.

A.1  Special Cases of Error Maxima

The maximum point of the error function is derived in the main text (see equation 2.6). Here, we consider the Mahalanobis distance in special cases dealing with overlapping versus nonoverlapping distributions of firing rates.

Overlapping distributions are obtained with identical variances and shared input across units, and . In this case, the squared Mahalanobis distance, (equation A.14) becomes
[Equation A.21]
In a different scenario, we consider parallel distributions, defined by equal variances but no shared inputs, and . In this case, the squared Mahalanobis distance becomes
[Equation A.22]
In both cases, equations A.21 and A.22, the Mahalanobis distance can be incorporated into an estimation of discrimination error (equation A.15).

A.2  Comparing Analytical Solutions to Numerical Simulations

Analytical estimates of classification error were compared to numerical simulations as follows (see Figures 2 and 3). Assuming that firing rates follow a multivariate distribution, we first computed the mean, equation A.4, and variance, equation A.5, of that distribution. We then generated 5000 data points for each distribution. The resulting data were entered in LDA.
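A comparison of this kind can be sketched in Python for two units with equal variance and a noise correlation $\rho$. The sketch relies on the standard result that, for two gaussian classes with a shared covariance matrix and equal priors, the error of the optimal linear readout is $\Phi(-d/2)$, where $d$ is the Mahalanobis distance between the class means; that this matches the notation of equation A.15 is an assumption. As a simplification, the readout below uses the true means and covariance rather than an LDA fit to the samples, and all function names and parameter values are hypothetical.

```python
import math
import random

def mahalanobis_sq(mu_a, mu_b, var, rho):
    """Squared Mahalanobis distance for covariance var * [[1, rho], [rho, 1]]."""
    d0, d1 = mu_b[0] - mu_a[0], mu_b[1] - mu_a[1]
    scale = 1.0 / (var * (1.0 - rho ** 2))
    return scale * (d0 * d0 - 2.0 * rho * d0 * d1 + d1 * d1)

def analytic_error(d_sq):
    """Phi(-d/2), written with the complementary error function."""
    return 0.5 * math.erfc(math.sqrt(d_sq) / (2.0 * math.sqrt(2.0)))

def simulated_error(mu_a, mu_b, var, rho, n=5000, seed=0):
    """Monte Carlo error with n samples per class and the optimal projection."""
    rng = random.Random(seed)
    sd = math.sqrt(var)
    d0, d1 = mu_b[0] - mu_a[0], mu_b[1] - mu_a[1]
    # w is proportional to Sigma^{-1} (mu_b - mu_a); positive scaling is irrelevant.
    w = (d0 - rho * d1, d1 - rho * d0)
    thr = 0.5 * (w[0] * (mu_a[0] + mu_b[0]) + w[1] * (mu_a[1] + mu_b[1]))
    wrong = 0
    for mu, is_b in ((mu_a, False), (mu_b, True)):
        for _ in range(n):
            z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            # Correlated pair built from two independent standard normals.
            x0 = mu[0] + sd * z0
            x1 = mu[1] + sd * (rho * z0 + math.sqrt(1.0 - rho ** 2) * z1)
            wrong += ((w[0] * x0 + w[1] * x1) > thr) != is_b
    return wrong / (2 * n)

# Illustrative parameters (hypothetical): firing rates in Hz, variance in Hz^2.
mu_a, mu_b = (10.0, 12.0), (12.0, 10.0)
err_analytic = analytic_error(mahalanobis_sq(mu_a, mu_b, var=4.0, rho=0.3))
err_simulated = simulated_error(mu_a, mu_b, var=4.0, rho=0.3)
```

Varying `rho` while keeping the means fixed shows how the same correlation can raise or lower the error depending on whether the two units' mean differences share a sign.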

Acknowledgments

This work was supported by a Discovery grant from the Natural Sciences and Engineering Research Council of Canada (NSERC grants 210977 and 210989) and the University of Ottawa Brain and Mind Institute. We thank Alex Huk and Peter Latham for useful discussions.

References

Abbott, L. F., & Dayan, P. (1999). The effect of correlated variability on the accuracy of a population code. Neural Computation, 11(1), 91–101.
Amit, D. J., Brunel, N., & Tsodyks, M. V. (1994). Correlations of cortical Hebbian reverberations: Theory versus experiment. Journal of Neuroscience, 14(11), 6435–6445.
Ardid, S., & Wang, X.-J. (2013). A tweaking principle for executive control: Neuronal circuit mechanism for rule-based task switching and conflict resolution. Journal of Neuroscience, 33(50), 19504–19517.
Averbeck, B. B., Latham, P. E., & Pouget, A. (2006). Neural correlations, population coding and computation. Nature Reviews Neuroscience, 7(5), 358–366.
Baeg, E. H., Kim, Y. B., Huh, K., Mook-Jung, I., Kim, H. T., & Jung, M. W. (2003). Dynamics of population code for working memory in the prefrontal cortex. Neuron, 40, 177–188.
Balan, P. F., & Gottlieb, J. (2009). Functional significance of nonspatial information in monkey lateral intraparietal area. Journal of Neuroscience, 29(25), 8166–8176.
Beck, J. M., Ma, W. J., Kiani, R., Hanks, T., Churchland, A. K., Roitman, J., … Pouget, A. (2008). Probabilistic population codes for Bayesian decision making. Neuron, 60(6), 1142–1152.
Bennur, S., & Gold, J. I. (2011). Distinct representations of a perceptual decision and the associated oculomotor plan in the monkey lateral intraparietal area. Journal of Neuroscience, 31(3), 913–921.
Berberian, N., MacPherson, A., Giraud, E., Richardson, L., & Thivierge, J.-P. (2017). Neuronal pattern separation of motion-relevant input in LIP activity. Journal of Neurophysiology, 117(2), 738–755.
Blatt, G. J., Andersen, R. A., & Stoner, G. R. (1990). Visual receptive field organization and cortico-cortical connections of the lateral intraparietal area (area LIP) in the macaque. Journal of Comparative Neurology, 299(4), 421–445.
Brody, C. D., Hernandez, A., Zainos, A., & Romo, R. (2003). Timing and neural encoding of somatosensory parametric working memory in macaque prefrontal cortex. Cerebral Cortex, 13, 1196–1207.
Brunel, N. (2000). Dynamics of sparsely connected networks of excitatory and inhibitory spiking neurons. Journal of Computational Neuroscience, 8(3), 183–208.
Bujan, A. F., Aertsen, A., & Kumar, A. (2015). Role of input correlations in shaping the variability and noise correlations of evoked activity in the neocortex. Journal of Neuroscience, 35(22), 8611–8625.
Burak, Y., & Fiete, I. R. (2012). Fundamental limits on persistent activity in networks of noisy neurons. Proceedings of the National Academy of Sciences of the United States of America, 109(43), 17645–17650.
Buonomano, D. V., & Maass, W. (2009). State-dependent computations: Spatiotemporal processing in cortical networks. Nature Reviews Neuroscience, 10(2), 113–125.
Cain, N., Barreiro, A. K., Shadlen, M., & Shea-Brown, E. (2013). Neural integrators for decision making: A favorable tradeoff between robustness and sensitivity. Journal of Neurophysiology, 109(10), 2542–2559.
Chafee, M. V., & Goldman-Rakic, P. S. (2000). Inactivation of parietal and prefrontal cortex reveals interdependence of neural activity during memory-guided saccades. Journal of Neurophysiology, 83(3), 1550–1566.
Chance, F. S., Abbott, L. F., & Reyes, A. D. (2002). Gain modulation from background synaptic input. Neuron, 35, 773–782.
Churchland, A. K., & Kiani, R. (2016). Three challenges for connecting model to mechanism in decision-making. Current Opinion in Behavioral Sciences, 11, 74–80.
Churchland, A. K., Kiani, R., Chaudhuri, R., Wang, X.-J., Pouget, A., & Shadlen, M. N. (2011). Variance as a signature of neural computations during decision making. Neuron, 69(4), 818–831.
Churchland, A. K., Kiani, R., & Shadlen, M. N. (2008). Decision-making with multiple alternatives. Nature Neuroscience, 11(6), 693–702.
Cisek, P. (2006). Integrated neural processes for defining potential actions and deciding between them: A computational model. Journal of Neuroscience, 26(38), 9761–9770.
Cisek, P., Puskas, G. A., & El-Murr, S. (2009). Decisions in changing conditions: The urgency-gating model. Journal of Neuroscience, 29(37), 11560–11571.
Cohen, M. R., & Newsome, W. T. (2008). Context-dependent changes in functional circuitry in visual area MT. Neuron, 60(1), 162–173.
Collins, C. E., Airey, D. C., Young, N. A., Leitch, D. B., & Kaas, J. H. (2010). Neuron densities vary across and within cortical areas in primates. Proceedings of the National Academy of Sciences of the United States of America, 107(36), 15927–15932.
Compte, A., Constantinidis, C., Tegner, J., Raghavachari, S., Chafee, M. V., Goldman-Rakic, P. S., & Wang, X. J. (2003). Temporally irregular mnemonic persistent activity in prefrontal neurons of monkeys during a delayed response task. Journal of Neurophysiology, 90, 3441–3454.
da Silveira, R. A., & Berry, M. J. (2014). High-fidelity coding with correlated neurons. PLoS Computational Biology, 10(11), e1003970.
Dayan, P., & Abbott, L. F. (2001). Theoretical neuroscience: Computational and mathematical modeling of neural systems. Cambridge, MA: MIT Press.
de la Rocha, J., Doiron, B., Shea-Brown, E., Josić, K., & Reyes, A. (2007). Correlation between neural spike trains increases with firing rate. Nature, 448(7155), 802–806.
de Lafuente, V., Jazayeri, M., & Shadlen, M. N. (2015). Representation of accumulating evidence for a decision in two parietal areas. Journal of Neuroscience, 35(10), 4306–4318.
Dehaqani, M.-R. A., Vahabie, A.-H., Kiani, R., Ahmadabadi, M. N., Araabi, B. N., & Esteky, H. (2016). Temporal dynamics of visual category representation in the macaque inferior temporal cortex. Journal of Neurophysiology, 116(2), 587–601.
DiCarlo, J. J., Zoccolan, D., & Rust, N. C. (2012). How does the brain solve visual object recognition? Neuron, 73(3), 415–434.
Ditterich, J. (2006). Evidence for time-variant decision making. European Journal of Neuroscience, 24(12), 3628–3641.
Drugowitsch, J., Moreno-Bote, R., Churchland, A. K., Shadlen, M. N., & Pouget, A. (2012). The cost of accumulating evidence in perceptual decision making. Journal of Neuroscience, 32(11), 3612–3628.
Ecker, A. S., Berens, P., Tolias, A. S., & Bethge, M. (2011). The effect of noise correlations in populations of diversely tuned neurons. Journal of Neuroscience, 31(40), 14272–14283.
Engel, T. A., Chaisangmongkon, W., Freedman, D. J., & Wang, X.-J. (2015). Choice-correlated activity fluctuations underlie learning of neuronal category representation. Nature Communications, 6, 6454.
Engel, T. A., & Wang, X.-J. (2011). Same or different? A neural circuit mechanism of similarity-based pattern match decision making. Journal of Neuroscience, 31(19), 6982–6996.
Franke, F., Fiscella, M., Sevelev, M., Roska, B., Hierlemann, A., & da Silveira, R. A. (2016). Structures of neural correlation and how they favor coding. Neuron, 89(2), 409–422.
Freedman, D. J., & Assad, J. A. (2006). Experience-dependent representation of visual categories in parietal cortex. Nature, 443(7107), 85–88.
Freedman, D. J., & Assad, J. A. (2009). Distinct encoding of spatial and nonspatial visual information in parietal cortex. Journal of Neuroscience, 29(17), 5671–5680.
Ganguli, S., Bisely, J. W., Roitman, J. D., Shadlen, M. N., Goldberg, M. E., & Miller, K. D. (2008). One-dimensional dynamics of attention and decision making in LIP. Neuron, 58, 15–25.
Gold, J. I., & Shadlen, M. N. (2000). Representation of a perceptual decision in developing oculomotor commands. Nature, 404(6776), 390–394.
Goldman, M. S. (2009). Memory without feedback in a neural network. Neuron, 61(4), 621–634.
Graupner, M., & Reyes, A. D. (2013). Synaptic input correlations leading to membrane potential decorrelation of spontaneous activity in cortex. Journal of Neuroscience, 33(38), 15075–15085.
Haider, B., Duque, A., Hasenstaub, A. R., & McCormick, D. A. (2006). Neocortical network activity in vivo is generated through a dynamic balance of excitation and inhibition. Journal of Neuroscience, 26(17), 4535–4545.
Hanks, T. D., Ditterich, J., & Shadlen, M. N. (2006). Microstimulation of macaque area LIP affects decision-making in a motion discrimination task. Nature Neuroscience, 9(5), 682–689.
Hanks, T., Kiani, R., & Shadlen, M. N. (2014). A neural mechanism of speed-accuracy tradeoff in macaque area LIP. ELife, 3.
Hanks, T. D., & Summerfield, C. (2017). Perceptual decision making in rodents, monkeys, and humans. Neuron, 93(1), 15–31.
Honey, C. J., Thivierge, J.-P., & Sporns, O. (2010). Can structure predict function in the human brain? NeuroImage, 52(3), 766–776.
Horwitz, G. D., & Newsome, W. T. (1999). Separate signals for target selection and movement specification in the superior colliculus. Science, 284(5417), 1158–1161.
Hu, Y., Zylberberg, J., & Shea-Brown, E. (2014). The sign rule and beyond: Boundary effects, flexibility, and noise correlations in neural population codes. PLoS Computational Biology, 10(2), e1003469.
Huk, A. C., & Shadlen, M. N. (2005). Neural activity in macaque parietal cortex reflects temporal integration of visual motion signals during perceptual decision making. Journal of Neuroscience, 25(45), 10420–10436.
Jonas, E., & Kording, K. P. (2017). Could a neuroscientist understand a microprocessor? PLoS Computational Biology, 13(1), e1005268.
Katz, L. N., Yates, J. L., Pillow, J. W., & Huk, A. C. (2016). Dissociated functional significance of decision-related activity in the primate dorsal stream. Nature, 535(7611), 285–288.
Katz, P. S. (2016). Evolution of central pattern generators and rhythmic behaviours. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 371(1685), 20150057.
Kim, J. N., & Shadlen, M. N. (1999). Neural correlates of a decision in the dorsolateral prefrontal cortex of the macaque. Nature Neuroscience, 2(2), 176–185.
Kira, S., Yang, T., & Shadlen, M. N. (2015). A neural implementation of Wald's sequential probability ratio test. Neuron, 85(4), 861–873.
Klampfl, S., David, S. V., Yin, P., Shamma, S. A., & Maass, W. (2012). A quantitative analysis of information about past and present stimuli encoded by spikes of A1 neurons. Journal of Neurophysiology, 108(5), 1366–1380.
Kohn, A., Coen-Cagli, R., Kanitscheider, I., & Pouget, A. (2016). Correlations and neuronal population information. Annual Review of Neuroscience, 39, 237–256.
Krakauer, J. W., Ghazanfar, A. A., Gomez-Marin, A., MacIver, M. A., & Poeppel, D. (2017). Neuroscience needs behavior: Correcting a reductionist bias. Neuron, 93(3), 480–490.
Law, C.-T., & Gold, J. I. (2008). Neural correlates of perceptual learning in a sensory-motor, but not a sensory, cortical area. Nature Neuroscience, 11(4), 505–513.
Layton, O. W., & Fajen, B. R. (2016). A neural model of MST and MT explains perceived object motion during self-motion. Journal of Neuroscience, 36(31), 8093–8102.
Lewis, J. W., & Van Essen, D. C. (2000). Corticocortical connections of visual, sensorimotor, and multimodal processing areas in the parietal lobe of the macaque monkey. Journal of Comparative Neurology, 428(1), 112–137.
Li, C. S., Mazzoni, P., & Andersen, R. A. (1999). Effect of reversible inactivation of macaque lateral intraparietal area on visual and memory saccades. Journal of Neurophysiology, 81(4), 1827–1838.
Liu, Y., Yttri, E. A., & Snyder, L. H. (2010). Intention and attention: Different functional roles for LIPd and LIPv. Nature Neuroscience, 13(4), 495–500.
Lo, C.-C., & Wang, X.-J. (2016). Conflict resolution as near-threshold decision-making: A spiking neural circuit model with two-stage competition for antisaccadic task. PLoS Computational Biology, 12(8), e1005081.
Marom, S., Meir, R., Braun, E., Gal, A., Kermany, E., & Eytan, D. (2009). On the precarious path of reverse neuro-engineering. Frontiers in Computational Neuroscience, 3, 5.
Marr, D. (1982). Vision: A computational approach. Cambridge, MA: MIT Press.
Meister, M. L. R., Hennig, J. A., & Huk, A. C. (2013). Signal multiplexing and single-neuron computations in lateral intraparietal area during decision-making. Journal of Neuroscience, 33(6), 2254–2267.
Meyers, E. M., Freedman, D. J., Kreiman, G., Miller, E. K., & Poggio, T. (2008). Dynamic population coding of category information in inferior temporal and prefrontal cortex. Journal of Neurophysiology, 100(3), 1407–1419.
Miri, A., Daie, K., Arrenberg, A. B., Baier, H., Aksay, E., & Tank, D. W. (2011). Spatial gradients and multidimensional dynamics in a neural integrator circuit. Nature Neuroscience, 14, 1150–1159.
Morcos, A. S., & Harvey, C. D. (2016). History-dependent variability in population dynamics during evidence accumulation in cortex. Nature Neuroscience, 19(12), 1672–1681.
Moreno-Bote, R., Beck, J., Kanitscheider, I., Pitkow, X., Latham, P., & Pouget, A. (2014). Information-limiting correlations. Nature Neuroscience, 17(10), 1410–1417.
Murphy, B. K., & Miller, K. D. (2009). Balanced amplification: A new mechanism of selective amplification of neural activity patterns. Neuron, 61, 635–648.
Nienborg, H., & Cumming, B. (2010). Correlations between the activity of sensory neurons and behavior: How much do they tell us about a neuron's causality? Current Opinion in Neurobiology, 20(3), 376–381.
Panzeri, S., Harvey, C. D., Piasini, E., Latham, P. E., & Fellin, T. (2017). Cracking the neural code for sensory perception by combining statistics, intervention, and behavior. Neuron, 93(3), 491–507.
Panzeri, S., Treves, A., Schultz, S., & Rolls, E. T. (1999). On decoding the responses of a population of neurons from short time windows. Neural Computation, 11(7), 1553–1577.
Pesaran, B., & Freedman, D. J. (2016). Where are perceptual decisions made in the brain? Trends in Neurosciences, 39(10), 642–644.
Pillow, J. W., Shlens, J., Paninski, L., Sher, A., Litke, A. M., Chichilnisky, E. J., & Simoncelli, E. P. (2008). Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature, 454(7207), 995–999.
Renart, A., de la Rocha, J., Bartho, P., Hollender, L., Parga, N., Reyes, A., & Harris, K. D. (2010). The asynchronous state in cortical circuits. Science, 327(5965), 587–590.
Rich, E. L., & Wallis, J. D. (2016). Decoding subjective decisions from orbitofrontal cortex. Nature Neuroscience, 19(7), 973–980.
Roitman, J. D., & Shadlen, M. N. (2002). Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task. Journal of Neuroscience, 22(21), 9475–9489.
Rosenbaum, R., & Josić, K. (2011). Mechanisms that modulate the transfer of spiking correlations. Neural Computation, 23(5), 1261–1305.
Rosenbaum, R., Smith, M. A., Kohn, A., Rubin, J. E., & Doiron, B. (2017). The spatial structure of correlated neuronal variability. Nature Neuroscience, 20(1), 107–114.
Salinas, E., & Sejnowski, T. J. (2000). Impact of correlated synaptic input on output firing rate and variability in simple neuronal models. Journal of Neuroscience, 20(16), 6193–6209.
Sarma, A., Masse, N. Y., Wang, X.-J., & Freedman, D. J. (2016). Task-specific versus generalized mnemonic representations in parietal and prefrontal cortices. Nature Neuroscience, 19(1), 143–149.
Schiller, P. H., & Tehovnik, E. J. (2003). Cortical inhibitory circuits in eye-movement generation. European Journal of Neuroscience, 18(11), 3127–3133.
Schröter, M., Paulsen, O., & Bullmore, E. T. (2017). Micro-connectomics: Probing the organization of neuronal networks at the cellular scale. Nature Reviews Neuroscience, 18(3), 131–146.
Shadlen, M. N., Britten, K. H., Newsome, W. T., & Movshon, J. A. (1996). A computational analysis of the relationship between neuronal and behavioral responses to visual motion. Journal of Neuroscience, 16(4), 1486–1510.
Shadlen, M. N., & Kiani, R. (2013). Decision making as a window on cognition. Neuron, 80(3), 791–806.
Shadlen, M. N., & Movshon, J. A. (1999). Synchrony unbound: A critical evaluation of the temporal binding hypothesis. Neuron, 24(1), 67–77, 111–125.
Shadlen, M. N., & Shohamy, D. (2016). Decision making and sequential sampling from memory. Neuron, 90(5), 927–939.
Shafi, M., Zhou, Y., Quintana, J., Chow, C., Fuster, J., & Bodner, M. (2007). Variability in neuronal activity in primate cortex during working memory tasks. Neuroscience, 146, 1082–1108.
Shu, Y., Hasenstaub, A., & McCormick, D. A. (2003). Turning on and off recurrent balanced cortical activity. Nature, 423(6937), 288–293.
Smith, P. L., & Ratcliff, R. (2004). Psychology and neurobiology of simple decisions. Trends in Neurosciences, 27(3), 161–168.
Sompolinsky, H., Yoon, H., Kang, K., & Shamir, M. (2001). Population coding in neuronal systems with correlated noise. Physical Review E, 64(5), 051904.
Sussillo, D., & Abbott, L. F. (2009). Generating coherent patterns of activity from chaotic neural networks. Neuron, 63(4), 544–557.
Thivierge, J.-P., & Marcus, G. F. (2007). The topographic brain: From neural connectivity to cognition. Trends in Neurosciences, 30(6), 251–259.
Thura, D. (2016). How to discriminate conclusively among different models of decision making? Journal of Neurophysiology, 115(5), 2251–2254.
Thura, D., Beauregard-Racine, J., Fradet, C.-W., & Cisek, P. (2012). Decision making by urgency gating: Theory and experimental support. Journal of Neurophysiology, 108(11), 2912–2930.
Thura, D., & Cisek, P. (2016). On the difference between evidence accumulator models and the urgency gating model. Journal of Neurophysiology, 115(1), 622–623.
Vincent-Lamarre, P., Lajoie, G., & Thivierge, J.-P. (2016). Driving reservoir models with oscillations: A solution to the extreme structural sensitivity of chaotic networks. Journal of Computational Neuroscience, 41(3), 305–322.
Wardak, C., Olivier, E., & Duhamel, J.-R. (2002). Saccadic target selection deficits after lateral intraparietal area inactivation in monkeys. Journal of Neuroscience, 22(22), 9877–9884.
Wardak, C., Olivier, E., & Duhamel, J.-R. (2004). A deficit in covert attention after parietal cortex inactivation in the monkey. Neuron, 42(3), 501–508.
Wong, K.-F., Huk, A. C., Shadlen, M. N., & Wang, X.-J. (2007). Neural circuit dynamics underlying accumulation of time-varying evidence during perceptual decision making. Frontiers in Computational Neuroscience, 1, 6.
Wong, K.-F., & Wang, X.-J. (2006). A recurrent network mechanism of time integration in perceptual decisions. Journal of Neuroscience, 26(4), 1314–1328.
Yim, M. Y., Kumar, A., Aertsen, A., & Rotter, S. (2014). Impact of correlated inputs to neurons: Modeling observations from in vivo intracellular recordings. Journal of Computational Neuroscience, 37(2), 293–304.
Zaidel, A., DeAngelis, G. C., & Angelaki, D. E. (2017). Decoupled choice-driven and stimulus-related activity in parietal neurons may be misrepresented by choice probabilities. Nature Communications, 8(1), 715.
Zohary, E., Shadlen, M. N., & Newsome, W. T. (1994). Correlated neuronal discharge rate and its implications for psychophysical performance. Nature, 370(6485), 140–143.
Zylberberg, J., Cafaro, J., Turner, M. H., Shea-Brown, E., & Rieke, F. (2016). Direction-selective circuits shape noise to ensure a precise population code. Neuron, 89(2), 369–383.