Previous theoretical work has shown that a single-layer neural network can implement the optimal decision process for simple, two-alternative forced-choice (2AFC) tasks. However, it is likely that the mammalian brain comprises multilayer networks, raising the question of whether and how optimal performance can be approximated in such an architecture. Here, we present theoretical work suggesting that the noradrenergic nucleus locus coeruleus (LC) may help optimize 2AFC decision making in the brain. This is based on the observations that neurons of the LC selectively fire following the presentation of salient stimuli in decision tasks and that the corresponding release of norepinephrine can transiently increase the responsivity, or gain, of cortical processing units. We describe computational simulations that investigate the role of such gain changes in optimizing performance of 2AFC decision making. In the tasks we model, no prior cueing or knowledge of stimulus onset time is assumed.
Performance is assessed in terms of the rate of correct responses over time (the reward rate). We first present the results of a single-layer model that accumulates (integrates) sensory input and implements the decision process as a threshold crossing. Gain transients, representing the modulatory effect of the LC, are driven by separate threshold crossings in this layer. We optimize over all free parameters to determine the maximum reward rate achievable by this model and compare it to the maximum reward rate when gain is held fixed. We find that the dynamic gain mechanism yields no improvement in reward rate for this single-layer model.
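As a rough illustration of the single-layer setup described above, the following sketch simulates a noisy accumulator with two thresholds: a lower one that triggers a gain increase (standing in for the LC response) and a higher one that triggers the decision. It is a minimal sketch under stated assumptions, not the model used in this work. All function names, parameter names, and parameter values are hypothetical. Gain is assumed to scale the whole input (signal and noise), the elevated gain persists to the end of the trial rather than decaying, and the uncertainty about stimulus onset time is omitted.

```python
import random


def simulate_trial(rng, drift=0.1, noise=1.0, gain=1.0, boosted_gain=2.0,
                   lc_threshold=0.5, decision_threshold=1.0,
                   dt=0.01, max_time=10.0):
    """Simulate one 2AFC trial; return (correct, decision_time).

    All parameter values are illustrative assumptions.
    """
    x = 0.0          # accumulated evidence; positive sign = correct choice
    t = 0.0
    g = gain
    sqrt_dt = dt ** 0.5
    while t < max_time:
        # Euler-Maruyama step; gain scales both signal and noise (assumption).
        x += g * (drift * dt + noise * sqrt_dt * rng.gauss(0.0, 1.0))
        t += dt
        # Crossing the lower threshold switches on the higher gain.
        if abs(x) >= lc_threshold:
            g = boosted_gain
        # Crossing the decision threshold ends the trial.
        if abs(x) >= decision_threshold:
            return x > 0, t
    # No crossing before the deadline: force a choice from the sign of x.
    return x > 0, max_time


def reward_rate(n_trials=2000, response_stimulus_interval=1.0, seed=0,
                **params):
    """Correct responses per unit time, including the inter-trial interval."""
    rng = random.Random(seed)
    n_correct = 0
    total_time = 0.0
    for _ in range(n_trials):
        correct, rt = simulate_trial(rng, **params)
        n_correct += int(correct)
        total_time += rt + response_stimulus_interval
    return n_correct / total_time


if __name__ == "__main__":
    # Setting boosted_gain equal to gain gives the fixed-gain comparison.
    print("fixed gain  :", reward_rate(boosted_gain=1.0))
    print("dynamic gain:", reward_rate(boosted_gain=2.0))
```

Comparing the two conditions fairly would require optimizing each over its own free parameters, as the text describes; the sketch only shows how the two conditions and the reward rate could be expressed.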
We then examine a two-layer model, in which competing sensory accumulators in the first layer (capable of implementing the task-relevant decision) pass activity to response accumulators in a second layer. Again, we compare a version in which threshold crossing in the first (decision) layer elicits an LC response (and a concomitant increase in gain) with a fixed-gain version of the model. Here, we find that gain transients modeling the LC phasic response yield an improvement in reward rate of 12% to 24%. Furthermore, we show that the timing characteristics of these gain transients agree with observations concerning LC firing patterns reported in recent experimental studies. This provides converging evidence for the hypothesis that the LC optimizes processes underlying 2AFC decision making in multilayer networks.
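The two-layer architecture described above can be sketched in the same spirit. The code below is an assumption-laden illustration, not the model reported here: the leaky, mutually inhibiting form of the first layer, the logistic activation, and every name and parameter value are hypothetical choices made for the example. A threshold crossing in the first layer raises the gain of the activation that feeds the second layer, and the response is read out from a threshold crossing in the second layer. For simplicity the gain stays elevated until the end of the trial instead of decaying as a true transient would.

```python
import math
import random


def logistic(x, gain, bias):
    """Sigmoid activation whose slope is set by the gain."""
    return 1.0 / (1.0 + math.exp(-gain * (x - bias)))


def two_layer_trial(rng, inputs=(0.55, 0.45), base_gain=2.0, phasic_gain=8.0,
                    lc_threshold=0.4, response_threshold=0.5, bias=0.3,
                    noise=0.2, dt=0.01, max_time=10.0):
    """Simulate one trial; return (correct, response_time, lc_time).

    Unit 0 receives the larger input, so choosing unit 0 counts as correct.
    lc_time is None if the first layer never crossed lc_threshold.
    """
    d = [0.0, 0.0]   # layer 1: competing sensory (decision) accumulators
    r = [0.0, 0.0]   # layer 2: response accumulators
    gain = base_gain
    lc_time = None
    t = 0.0
    sqrt_dt = math.sqrt(dt)
    while t < max_time:
        # Layer 1: leak plus mutual inhibition, rectified at zero.
        drive = [inputs[i] - d[i] - d[1 - i] for i in range(2)]
        d = [max(0.0, d[i] + drive[i] * dt
                 + noise * sqrt_dt * rng.gauss(0.0, 1.0))
             for i in range(2)]
        t += dt
        # First-layer threshold crossing elicits the modeled LC response.
        if lc_time is None and max(d) >= lc_threshold:
            gain = phasic_gain
            lc_time = t
        # Layer 2: leaky units driven by gain-modulated layer-1 output.
        r = [r[i] + (logistic(d[i], gain, bias) - r[i]) * dt
             for i in range(2)]
        if max(r) >= response_threshold:
            choice = 0 if r[0] >= r[1] else 1
            return choice == 0, t, lc_time
    # Deadline reached: pick the more active response unit.
    choice = 0 if r[0] >= r[1] else 1
    return choice == 0, max_time, lc_time


if __name__ == "__main__":
    rng = random.Random(0)
    print("dynamic gain:", two_layer_trial(rng))
    # Setting phasic_gain equal to base_gain gives the fixed-gain version.
    print("fixed gain  :", two_layer_trial(rng, phasic_gain=2.0))
```

In this sketch the higher gain sharpens the contrast between the winning and losing first-layer units as seen by the second layer, which is one way a gain increase could speed the response once the first layer has effectively decided.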