Abstract
pFC is generally regarded as a region critical for abstract reasoning and high-level cognitive behaviors. As such, it has become the focus of intense research involving a wide variety of subdisciplines of neuroscience and employing a diverse range of methods. However, even as the amount of data on pFC has increased exponentially, it appears that progress toward understanding the general function of the region across a broad array of contexts has not kept pace. Effects observed in pFC are legion, and their interpretations are generally informed by a particular perspective or methodology with little regard for how those effects may apply more broadly. Consequently, the number of specific roles and functions that have been identified makes the region a very crowded place indeed and one that appears unlikely to be explained by a single general principle. In this theoretical article, we describe how the function of large portions of pFC can be accommodated by a single explanatory framework based on the computation and manipulation of error signals and how this framework may be extended to account for additional parts of pFC.
INTRODUCTION
In his studies of color phenomena (Newton, 1730), Isaac Newton investigated the composition of white light. Before Newton's work, color was generally believed to derive from combinations of light and dark. In his experiments, he demonstrated that white light, rather than indicating the absence of color, is in fact composed of all colors. In a famous experiment, white light was refracted through a prism to produce the color spectrum, after which the entire spectrum was refracted through a second prism, resulting in a white light produced by reintegration of the color spectrum. This experiment provides a concise and clear example of the processes of analysis and synthesis (Ritchey, 1991). One prism decomposes white light into a spectrum (analysis), whereas the second prism reconstitutes the color bands into white light (synthesis). More generally, the process of analysis attempts to understand a phenomenon through decomposition into its constituent parts—literally breaking it up into simpler, more tractable entities. In turn, synthesis attempts to take individual components and combine them into a unified whole (Figure 1).
Figure 1. Illustration from Opticks (Newton, 1730) depicting the apparatus used to decompose and reintegrate white light.
The metaphorical prisms used by neuroscientists for analysis are the methods by which observations regarding a particular brain region are recorded (Grinvald & Hildesheim, 2004). Such tools decompose the world into a wide spectrum of data. For example, microelectrode arrays produce data with high temporal resolution recorded from a limited number of neurons. EEG and electrocorticography yield data with similarly high temporal resolution but reflecting the activity of ensembles of neurons over broad neural regions. Data from fMRI, conversely, have a relatively coarse temporal resolution but can provide much greater spatial detail over a larger area than other methods. The manner in which data are recorded can have profound implications for its interpretations; in extreme cases, data from the same region, recorded at different temporal and spatial resolutions, can yield interpretations that are almost diametrically opposed (Ford, Gati, Menon, & Everling, 2009).
Although it is generally assumed that the data obtained by these diverse methods reflect some aspect of the underlying neural mechanisms, the meaning ascribed to them is informed by a second, metaphorical prism, the process of synthesis: the particular theory that is brought to bear on interpreting brain function. For a particular region of the brain, for example, the ACC, a social neuroscientist may find that it is primarily involved in processing social cues (Rotge et al., 2015), a neuroeconomist may discover deep connections with quantities important for decision-making such as value and uncertainty (Rangel, Camerer, & Montague, 2008), and all the while, an affective neuroscientist might insist that the same region is a vital hub of emotional information such as happiness, pain, regret, and so on (Lieberman & Eisenberger, 2015). This is not to say that any of these interpretations is necessarily wrong; however, the fractionation of interpretation induced by specialized subfields may result in a disjointed and incomplete understanding of the neural mechanisms underlying human behavior. At worst, this trend might produce an overly complex “integrative” account that attempts to explain different functions as the product of multiple, spatially overlapping modules subserving specific and dissociable roles (Alexander & Brown, 2015b).
If the range of methods and perspectives deployed in recording and interpreting brain activity reflects the process of decomposing a signal into more easily understood constituents (analysis), then what are the tools by which the constituent elements are reintegrated? Generally, this is the work of the theorist who proposes models and frameworks by which sets of data might be understood as emerging from some common underlying mechanism. Models can be specified in a variety of fashions, from simple graphical or written descriptions explaining how a system may function to more formal computational or mathematical descriptions. Synthesis through modeling tends to be a more catholic pursuit than analysis: To be worthwhile, a model should explain a range of analytic results rather than only one. However, even with this broader scope, synthesis can still be constrained by the perspective of the single theorist. A neuroscientist who is interested in intracellular signaling cascades will not pursue a theory of behavior, whereas a psychologist will generally not be interested in protein phosphorylation. Likewise, a cognitive neuroscientist may be able to explain the role of a region such as ACC across a variety of tasks but might be at a loss as to why the same region also responds to pain (Jahn, Nee, Alexander, & Brown, 2016).
Recent years have seen an increased emphasis on the synthetic process in neuroscience, frequently in the search for unifying principles underlying brain function, by which the diversity of data might be reintegrated. Proposed unifying frameworks include predictive coding, free energy, and the Bayesian brain hypothesis. Although the success of reinforcement learning and deep learning architectures at approximating human level performance at a variety of tasks provides existence proofs that relatively simple mechanisms can be used to understand human cognition, it remains an open question as to whether the variety of effects observed in brain and behavior can be reduced to a simple underlying principle. Indeed, the range of effects observed across different neuroscientific methodologies seems to provide evidence to the contrary.
Nevertheless, to the extent that the goal of neuroscience is to understand the function of the brain, it is insufficient to develop comprehensive models of highly circumscribed data. However, the sheer proliferation of data, in type and in quantity, tends to resist easy integration, sometimes prompting attempts to impose some degree of order on an otherwise chaotic landscape through sufficiently sophisticated analyses and machine learning methods (although it remains unclear how effective these approaches are at recovering function; Jonas & Kording, 2017). In attempting to uncover the principles underlying brain function, then, it is necessary to negotiate between competing demands: The range of data incorporated should be sufficiently broad to specify a general underlying mechanism, yet not so broad as to render integration unlikely.
An Integrative Account of Medial pFC
One region that typifies this tradeoff is medial pFC (mPFC), especially ACC. mPFC/ACC activity is routinely observed across a range of experimental paradigms and is frequently associated with processing behavioral error (Gehring, Goss, Coles, Meyer, & Donchin, 1993). However, a host of other interpretations have been ascribed to the region, often deriving from the particular subdiscipline from which a study hails. From the affective neuroscience literature, the region has been assigned roles in processing somatic pain, joy, regret, perseverance, and other functions primarily associated with emotionally relevant information (Lieberman & Eisenberger, 2015; Parvizi, Rangarajan, Shirer, Desai, & Greicius, 2013; Chandrasekhar, Capra, Moore, Noussair, & Berns, 2008; Coricelli et al., 2005). In the domain of social neuroscience, ACC has been observed to be involved in processing social exclusion, monitoring the outcomes of another's choices, or learning from observing others (Hill, Boorman, & Fried, 2016; Rotge et al., 2015; Apps, Balsters, & Ramnani, 2012). Meanwhile, in the cognitive domain, ACC has been implicated in processing behavioral conflict, predicting the likelihood of an error, determining the value of exerting effort, or selecting optimal control signals (Holroyd & McClure, 2015; Verguts, Vassena, & Silvetti, 2015; Holroyd & Yeung, 2012; Brown & Braver, 2005; Botvinick, Braver, Barch, Carter, & Cohen, 2001).
Given the diverse array of effects observed in the region, it is an open question as to whether a single explanatory framework could be brought to bear to interpret signals generated by ACC. By and large, theorizing regarding ACC function (and pFC function in general) has tended to avoid overarching accounts and instead focused on the role of ACC under particular contexts (Holroyd & McClure, 2015; Shenhav, Straccia, Cohen, & Botvinick, 2014; Shenhav, Botvinick, & Cohen, 2013; Holroyd & Yeung, 2012; Kolling, Behrens, Mars, & Rushworth, 2012; Grinband et al., 2011; Silvetti, Seurinck, & Verguts, 2011; Brown & Braver, 2005; Yeung, Cohen, & Botvinick, 2004; Holroyd & Coles, 2002; Botvinick et al., 2001). Computational and mathematical models of the region have typically concerned themselves with the function of ACC within constrained empirical perspectives such as cognitive control or value-based decision-making. Indeed, the impetus behind the development of the predicted response–outcome (PRO) model (Alexander & Brown, 2010, 2011) was to provide an account of the function of ACC under relatively simple cognitive control tasks. The PRO model states that ACC learns to predict the likely outcomes of actions and signals deviations between observed and expected outcomes. Although the PRO model successfully captured effects related primarily to cognitive control, the formulation of the model as signaling surprising deviations from expectations suggested that it could be applied in a more general manner. 
In follow-up modeling work (Brown & Alexander, 2017; Alexander, Fukunaga, Finn, & Brown, 2015; Alexander & Brown, 2014) based on the PRO model, as well as tests of model predictions performed by other researchers (Jahn, Nee, Alexander, & Brown, 2014; Chang, Gariépy, & Platt, 2013; Talmi, Atkinson, & El-Deredy, 2013; Ferdinand, Mecklinger, Kray, & Gehring, 2012; Bryden, Johnson, Tobia, Kashtelyan, & Roesch, 2011), the twin functions of the PRO model, prediction and error signaling, have been applied to a broad range of perspectives, including decision-making; social, affective, and clinical neuroscience; and perception and attention. A noncomprehensive list of effects captured by the PRO model is included elsewhere in this issue (Brown & Alexander, 2017). Given the breadth of effects encompassed by the PRO model, the range of perspectives to which the PRO account of ACC can be applied, and its ability to address observations from the level of single units to behavior, the PRO model remains the most comprehensive account of ACC function to date.
Building Out the Brain
Beyond merely addressing the function of ACC, however, the formulation of the PRO model carries implications regarding the function of regions of the brain with which ACC interacts. The PRO model generates two main signals, one related to predicting future events and another related to signaling surprising deviations. If, as the PRO model suggests, these two signals constitute the main outputs of ACC, regions connected to ACC should interact with at least one, and possibly both, of those signals (Figure 2A). Furthermore, the error and prediction signals generated by the PRO model are vector valued, carrying information regarding all possible outcomes that may be observed after a stimulus and reporting the amount by which an observed event deviates from all predictions.
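The two signals described above can be illustrated with a minimal sketch. It assumes a simple delta-rule learner with one-hot outcome coding; the class name `OutcomePredictor`, the outcome labels, the learning rate, and the `surprise` summary are hypothetical choices made for illustration and are not the published PRO equations.

```python
# Hypothetical sketch: names, learning rate, and one-hot coding are
# illustrative assumptions, not the published PRO model equations.

OUTCOMES = ["correct", "error"]  # assumed set of possible outcomes


def one_hot(outcome):
    """Code an observed outcome as a vector over all possible outcomes."""
    return [1.0 if o == outcome else 0.0 for o in OUTCOMES]


class OutcomePredictor:
    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = {}  # stimulus -> predicted outcome vector

    def predict(self, stimulus):
        """Prediction signal: expected likelihood of every outcome."""
        return list(self.weights.get(stimulus, [0.0] * len(OUTCOMES)))

    def update(self, stimulus, outcome):
        """Return the vector-valued error and apply a delta-rule update."""
        prediction = self.predict(stimulus)
        observed = one_hot(outcome)
        error = [o - p for o, p in zip(observed, prediction)]
        self.weights[stimulus] = [
            p + self.learning_rate * e for p, e in zip(prediction, error)
        ]
        return error


def surprise(error):
    """Scalar summary of a vector-valued error (sum of absolute deviations)."""
    return sum(abs(e) for e in error)


# Usage: errors occur on every tenth trial, so "error" becomes the rare outcome.
model = OutcomePredictor()
for trial in range(200):
    model.update("cue", "error" if trial % 10 == 9 else "correct")
print(model.predict("cue"))
```

In this sketch the error vector has one component per possible outcome, so a rare outcome produces a positive deviation on its own component and a negative deviation on the expected one.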
Figure 2. (A) According to the PRO model, ACC generates two principal signals, prediction and prediction error. If this account is correct, regions in pFC with which ACC interacts must do so through one or both of these signals, constraining the range of possible functions those regions may have. (B) The HER model specifies how dlPFC may interact with ACC by learning representations of the error signal generated by ACC and deploying active error representations to modulate predictive activity. (1) Task stimuli lead to predictions regarding likely outcomes. (2) Deviations between predicted and observed outcomes produce error signals, which are used to train distributed error representations in dlPFC. (4) Subsequent encounters with task stimuli leading to prediction errors reactivate error representations in dlPFC, (5) which are then used to modulate predictive activity in ACC.
Concurrently, the possible function of interactive brain regions is further implied by the class of tasks that the PRO model is unable to address. As noted above, the PRO model was developed with the intent of capturing effects related to cognitive control. In typical cognitive control experiments, participants observe a stimulus indicating that a response is required, and after the generation of a response, the participant receives feedback regarding their performance, after which the next trial begins. Beyond a limited range of intertrial effects (Alexander & Brown, 2014), however, the PRO model is unable to address observations regarding ACC involvement in more sophisticated working memory tasks that involve the maintenance of information over protracted delays, often in the face of distracting, irrelevant information and potentially involving complex interrelationships among stimulus features that must be learned to inform correct behavior (Nee & Brown, 2013). An example of such a task is the AX Continuous Performance Task (CPT; Rosvold, Mirsky, Sarason, Bransome, & Beck, 1956), in which participants observe a sequence of stimuli (A, B, X, and Y) and are required to make a target response when an X appears, but only if the stimulus immediately preceding it was an A. To successfully perform this task, information related to the stimulus preceding an X must be maintained to correctly determine the response to the X.
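The response rule of the AX CPT described above can be written compactly. The following sketch only encodes the rule as stated in the text; the function name `ax_cpt_targets` and the example sequence are illustrative assumptions.

```python
def ax_cpt_targets(sequence):
    """Mark each position that requires a target response in the AX CPT.

    A target response is required only when the current stimulus is an X
    and the immediately preceding stimulus was an A.
    """
    targets = []
    previous = None  # no stimulus precedes the first item
    for stimulus in sequence:
        targets.append(stimulus == "X" and previous == "A")
        previous = stimulus  # the information that must be maintained
    return targets


sequence = ["A", "X", "B", "X", "A", "Y", "X"]
print(ax_cpt_targets(sequence))
# -> [False, True, False, False, False, False, False]
```

The single variable `previous` stands in for the working memory demand of the task: without it, the two X trials that follow A and B cannot be told apart.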
Considering these two points, then, that (1) regions with which ACC interacts either receive or alter processing of prediction and/or error signals generated by ACC and (2) these regions are important for learning and performing complex cognitive tasks that require representing information regarding the relationships of task components, along with the assumption that prediction and error signaling constitute a general role for ACC across a range of experimental paradigms, we can begin to develop a clearer idea of the functions of additional regions of pFC. In this regard, dorsolateral pFC (dlPFC) is a likely candidate: dlPFC is densely and reciprocally interconnected with ACC (Medalla & Barbas, 2009, 2010; Barbas & Pandya, 1989) and is generally implicated in representing rules and complex task structure as well as in maintaining information over protracted delays, that is, working memory (Badre, Kayser, & D'Esposito, 2010; Chadderdon & Sporns, 2006; Koechlin, Ody, & Kouneiher, 2003). dlPFC is believed to be organized along a rostrocaudal abstraction gradient, with caudal regions representing concrete rules and rostral areas representing abstract context information, and it is frequently coactivated, with ACC, in tasks that involve complex interrelationships and learning models of the world (Nee & Brown, 2013; Badre & Frank, 2012; Gläscher, Daw, Dayan, & O'Doherty, 2010; Badre & D'Esposito, 2007, 2009).
How might dlPFC interact with prediction and error signals generated by ACC? In Alexander and Brown (2011), we noted that the vector-valued error signal used by the PRO model is appropriate for model-based reinforcement learning (Barto, Bradtke, & Singh, 1995; Sutton, 1990) as distinct from model-free reinforcement learning approaches that employ a scalar value signal to drive learning (Sutton & Barto, 1990). Previous work (Gläscher et al., 2010) has observed effects in dlPFC consistent with such a learning signal, suggesting that error signals generated by ACC may be used in dlPFC to learn representations of a task. Furthermore, working memory is important for informing and contextualizing behavioral responses; to respond correctly to an X in the AX CPT, information carried by the immediately preceding stimulus is required to modify predictions regarding the likely outcomes of the various responses one could make. Together, these observations led to the development of the hierarchical error representation (HER) model (Alexander & Brown, 2015a, 2016). The HER model proposes that error signals generated in ACC/mPFC are used to train representations in dlPFC, which are associated with task-relevant stimuli that reliably precede a prediction error (Figure 2B). When these representations are elicited by future presentations of the task stimuli with which they are associated, they are used to modulate prediction-related activity in ACC/mPFC. In typical reinforcement learning applications, the error signal specifies the direction and magnitude by which associations between a stimulus and its associated outcomes should be modified. In contrast, representations learned by the dlPFC in the HER model are predictions of prediction errors reported by ACC/mPFC; error signals constitute a kind of “proxy” outcome upon which representations in dlPFC converge during the course of learning. 
By using error signals themselves as outcomes that are the target of predictive processes, additional error signals reflecting the discrepancy between a predicted error and an actual error can be calculated, and these higher-order error signals may themselves be subject to further prediction and error calculations, and so on. Although this process of calculating increasingly abstract prediction errors could, in principle, continue arbitrarily, it is computationally limited by the capacity of computer systems on which the HER model is simulated, and biologically, it appears that the human brain is organized into three to five hierarchical processing levels in pFC, from premotor cortex at the base layer to rostral dlPFC (Reynolds, O'Reilly, Cohen, & Braver, 2012; Badre, 2008; Koechlin et al., 2003).
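One simplified way to write this recursion is given below. The notation is introduced here for illustration and is an assumption rather than the published HER formulation: $o$ is the observed outcome, $p^{(0)}$ is the base-level outcome prediction, $p^{(n)}$ is the prediction made at level $n$ of the error reported by level $n-1$, and $N$ is the index of the highest level.

```latex
\begin{aligned}
e^{(0)} &= o - p^{(0)}, \\
e^{(n)} &= e^{(n-1)} - p^{(n)}, \qquad n = 1, \dots, N .
\end{aligned}
```

Under this reading, each level treats the error from the level below as its own outcome, so the depth of the hierarchy is set by how many times the second line is applied.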
The purpose of learning to predict prediction errors themselves, as opposed to some other quantity, is to refine predictions regarding the likely outcomes of actions; being able to predict the kinds of prediction errors that are possible within a given context provides information sufficient to refine predictions of the likely outcomes given a current stimulus. The use of error signals in this fashion—as being the target of predictive processes in addition to governing prediction learning—is appealing for two reasons. First, and most pragmatically, learning representations of errors works: The HER model is able to learn to perform structured tasks from trial-and-error learning in a manner consistent with human behavior. Second, from an aesthetic point of view, the use of a common representation scheme used among regions in pFC is parsimonious and does not require intermediate transformations of information. The HER model is thus composed of a relatively simple computational motif that is hierarchically iterated. At each level, the model attempts to learn to predict the association between task stimuli and outcome signals arriving from lower hierarchical levels (or, at the base level, from the external environment), passing the results of error calculations upward along the hierarchy, while prediction information is passed downward to modulate the processing of lower hierarchical levels. However, although the calculation and maintenance of quantities related to error appear to be a useful scheme for interpreting the function of pFC, this aspect of the model remains speculative and in need of testing.
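The motif described above can be sketched with two levels applied to cue-probe trials from the AX CPT. This is a loose illustration under stated assumptions, not the published HER implementation: the class name `TwoLevelSketch`, the lookup-table weights, the learning rate, the additive form of top-down modulation, and the fixed trial order are all hypothetical.

```python
# Loose two-level sketch inspired by the description above. All names, the
# learning rate, and the additive top-down modulation are assumptions.

RESPONSES = ["target", "nontarget"]


def correct_response(cue, probe):
    """AX CPT rule: a target response only when X follows A."""
    return "target" if (cue, probe) == ("A", "X") else "nontarget"


def one_hot(label):
    return [1.0 if r == label else 0.0 for r in RESPONSES]


class TwoLevelSketch:
    def __init__(self, learning_rate=0.2):
        self.learning_rate = learning_rate
        self.base = {}     # lower level: probe -> predicted outcome
        self.context = {}  # upper level: (cue, probe) -> predicted lower-level error

    def predict(self, cue, probe):
        """Base prediction plus the error predicted by the upper level."""
        zeros = [0.0] * len(RESPONSES)
        base = self.base.get(probe, zeros)
        top_down = self.context.get((cue, probe), zeros)
        return [b + t for b, t in zip(base, top_down)]

    def learn(self, cue, probe):
        outcome = one_hot(correct_response(cue, probe))
        modulated = self.predict(cue, probe)
        # Error remaining after top-down modulation. For the upper level this
        # equals (outcome - base prediction) - predicted error, that is, its
        # own error in predicting the lower-level error.
        error = [o - m for o, m in zip(outcome, modulated)]
        zeros = [0.0] * len(RESPONSES)
        for table, key in ((self.base, probe), (self.context, (cue, probe))):
            old = table.get(key, zeros)
            table[key] = [w + self.learning_rate * e for w, e in zip(old, error)]
        return error


model = TwoLevelSketch()
trials = [("A", "X"), ("B", "X"), ("A", "Y"), ("B", "Y")]
for _ in range(200):
    for cue, probe in trials:
        model.learn(cue, probe)

for cue, probe in trials:
    print(cue, probe, [round(v, 3) for v in model.predict(cue, probe)])
```

The lower level alone cannot settle on a prediction for X, because the same probe is followed by different outcomes. The cue-specific error prediction held at the upper level is what disambiguates the two cases in this sketch.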
Although the computational motif on which it is based is relatively simple (in fact, it is functionally identical to the PRO model), the HER model is capable of learning complex cognitive tasks in a manner consistent with human behavior and evidence from neuroimaging studies (Alexander & Brown, 2015a). Rather than being limited to explaining results from a single task, the architecture of the HER model constitutes a general learning algorithm that can solve a range of tasks reported in the literature (Alexander & Brown, 2016), ranging from relatively simple examples such as the AX CPT or delayed-match-to-sample tasks to highly involved tasks using multiple stimulus categories with complex interrelationships (Koechlin et al., 2003) in a way that captures the function of ACC/mPFC and dlPFC as well as how the two regions interact during behavior (Kim, Johnson, Cilles, & Gold, 2011). The HER model additionally captures patterns of activity in single neurons observed in lateral pFC and mPFC during tasks involving maintenance of information and sequential decision-making (Procyk, Tanaka, & Joseph, 2000; Miller, Erickson, & Desimone, 1996). In brief, the HER model addresses itself to a broad range of tasks to account for data from multiple levels of description simultaneously.
Toward an Integrative Model of pFC
Together, the PRO and HER models provide one of the most comprehensive accounts of effects observed in pFC (cf. Brown & Alexander, 2017; Table 1). These effects span the activity of single units in lateral pFC and mPFC, the activity of ensembles of neurons indexed by EEG and fMRI, the nature of representations deployed by pFC in the context of high-level cognitive tasks, and the way in which the acquisition of these representations during learning contributes to behavioral markers of adaptive behavior. The ability of the models to capture these effects rests on the reconceptualization of activity in pFC as being fundamentally related to calculating, maintaining, and manipulating quantities related to prediction error: In the HER framework, mPFC calculates deviations between expected and observed outcomes, whereas dlPFC learns representations of the expected error reported by mPFC and associated with task-relevant stimuli.
Table 1. Effects Simulated by the HER Model So Far
Method | Study | Region
fMRI | Badre et al., 2010 | LPFC
fMRI | Kim et al., 2011 | LPFC/mPFC
fMRI | Koechlin et al., 2003 | LPFC
fMRI | Nee & Brown, 2012 | LPFC
fMRI | Nee & Brown, 2013 | LPFC
fMRI | Nee, Jahn, & Brown, 2013 | LPFC
fMRI | Nee & D'Esposito, 2016 | LPFC
fMRI | Reverberi, Görgen, & Haynes, 2011 | LPFC
fMRI | Reynolds et al., 2012 | LPFC
Lesion | Gehring & Knight, 2000 | mPFC
Lesion | Tsuchida & Fellows, 2008 | LPFC/mPFC
Single unit | Hayden, Pearson, & Platt, 2011 | mPFC
Single unit | Miller et al., 1996 | LPFC
Single unit | Procyk et al., 2000 | mPFC
Single unit | Shidara & Richmond, 2002 | mPFC
Single unit | Stoll et al., 2016 | LPFC/mPFC
Behavioral | Badre et al., 2010 | NA
Behavioral | Krueger, 2011 | NA
Behavioral | Krueger & Dayan, 2009 | NA
Behavioral | Markant & Gureckis, 2012 | NA
Behavioral | Stoll et al., 2016 | NA
LPFC = lateral prefrontal cortex; NA = not applicable.
The integration suggested by the HER model, then, is twofold. First, the HER model bridges multiple levels of description, concurrently providing an account of the function of single neurons in pFC, the role those units play in neural ensembles, and ultimately, how their distributed activity conspires to produce observed patterns of behavior. Second, the architecture of the HER model suggests a relationship with theoretical frameworks that have been proposed as potentially unifying models of neocortex. Recent years have seen a renewed interest in the search for such a unifying framework that may be of use in interpreting the function and organization of the brain. Approaches such as hierarchical Bayesian inference, free energy, and predictive coding (Clark, 2013; Friston, 2010; Lee & Mumford, 2003; Rao & Ballard, 1999) have garnered significant interest in this respect and have achieved success in explaining effects observed in sensory and motor cortices. Generally, these approaches suggest a hierarchical organization of the brain in which information in the form of prediction errors is passed from inferior hierarchical levels to superior levels, whereas information required to “explain away” prediction errors generated at a lower level is passed downward from superior hierarchical levels. The HER model conforms to this overall framework, with prediction errors traveling through the hierarchy along bottom–up routes, while representations of prediction errors are passed in a top–down fashion to refine predictions of lower levels. Within each level of the hierarchy, mPFC and dlPFC serve complementary roles along the bottom–up and top–down processing pathways. In the bottom–up pathway, mPFC calculates error signals used to train error representations in dlPFC at superior hierarchical layers. 
In the top–down pathway, contextually relevant components of active error representations in dlPFC are selected by mPFC to modulate ongoing prediction-related activity at lower hierarchical layers (Alexander & Brown, 2015a). At the base layer of the hierarchy, the HER model interprets mPFC activity as being involved in predicting response–outcome conjunctions (as in the PRO model) and signaling discrepancies; top–down information thus serves to contextualize or “explain away” errors that would otherwise be reported without top–down modulation. Thus, the HER model provides a demonstration that predictive coding and related approaches may be extended into pFC.
By recasting the function of large portions of pFC as relating to prediction errors, either through the explicit calculation of error or through maintaining predictions of potential future prediction errors, the HER model suggests that error calculation and representation may serve as a common code underlying neural activity and communication. This possibility stands in contrast to recent proposals (Shenhav et al., 2013; Levy & Glimcher, 2012) that quantities related to the prediction and calculation of value might constitute the common neural currency under which the function of brain regions should be interpreted. Although a large literature in neuroeconomics and judgment and decision-making has implicated aspects of the frontal lobes in value computations, especially, for example, ventromedial and orbitofrontal pFC (Grabenhorst & Rolls, 2011; Gläscher, Hampton, & O'Doherty, 2009; Rangel et al., 2008; Padoa-Schioppa & Assad, 2006; Kringelbach, 2005; Gottfried, O'Doherty, & Dolan, 2003), it is not automatic that value representation needs to be the only, or even primary, role of those regions (Stalnaker, Cooch, & Schoenbaum, 2015; Gläscher et al., 2010; Hampton, Bossaerts, & O'Doherty, 2006). One possibility is that effects that appear to relate to neuroeconomic quantities such as value may have an alternate interpretation under the framework of error and error representation. Alternately, it is possible that different processing streams in pFC utilize complementary but distinct forms of representation to support diverse cognitive behaviors. An open question therefore is whether predictive coding in general, and the HER model in particular, might be expanded to account for the function of additional regions of pFC without reference to explicit value signaling.
In this regard, one possible avenue by which the HER model might be extended relates to the status of internal representations used by the model. As detailed above, the PRO model was aimed initially at explaining effects observed within mPFC and with little regard as to how the signals postulated by the model might be deployed by regions with which mPFC interacts. Additional modeling work, building on the PRO model, specifies how prediction and error signals in the model may be used in supporting proactive and reactive control (Brown & Alexander, 2017) or in the acquisition and performance of cognitive tasks (Alexander & Brown, 2015a). In a similar fashion, the origin of internal representations used by the PRO and HER models as the bases for learning is left underspecified; the appearance of an external stimulus results in the activation of an internal representation corresponding to that stimulus. This mapping of external stimuli to internal representations in a one-to-one fashion is likely overly simplistic—besides the considerable processing needed to transform patterns of light hitting the retina into unitary internal representations (e.g., letters or numbers), additional processes are involved in governing whether the presence of an external stimulus is registered (e.g., attention) as well as contextual influences on how that stimulus, once registered, informs ongoing behavior. A significant challenge to be addressed then is whether unifying schemes such as predictive coding can be leveraged to explain the representation and contextualization of task stimuli in pFC.
It is possible that additional regions with which mPFC interacts may be involved in regulating access of internal stimulus representations to regions of pFC involved with outcome prediction and error calculations. One region that may potentially serve this role is the anterior insula cortex (AIC). AIC is reciprocally connected with mPFC/ACC (Augustine, 1996), and coactivation of the two regions is routinely observed, especially during the registration and processing of behavioral error (Ullsperger, Harsay, Wessel, & Ridderinkhof, 2010). It has been suggested, considering the dense innervation of AIC from amygdala (Augustine, 1996), that AIC is important for processing emotionally relevant information (Jones, Ward, & Critchley, 2010; Wiech et al., 2010; Singer, Critchley, & Preuschoff, 2009). However, considering that ACC has also been extensively implicated in processing affective information (Lieberman & Eisenberger, 2015; Rotge et al., 2015; Chandrasekhar et al., 2008; Bush, Luu, & Posner, 2000), it seems unlikely that the two regions are dissociated by their role in emotional processing. An alternative possibility is that AIC may be involved in the selection of information for further processing by ACC. AIC receives rich interoceptive signals related to bodily states (Barrett & Simmons, 2015; Critchley, Wiens, Rotshtein, Öhman, & Dolan, 2004) as well as information potentially related to the significance of sensory input (Han & Marois, 2014; Menon & Uddin, 2010; Nelson et al., 2010; Eckert et al., 2009; Corbetta & Shulman, 2002). Models of associative learning (Alexander, 2007; Kruschke, 2001; Pearce & Hall, 1980; Mackintosh, 1975) have suggested that error signals generated during learning might not only support the alteration of associations between a stimulus and its subsequent outcomes but also modulate the associability (or salience) of a stimulus. 
Error signals generated by AIC might therefore provide a means by which incoming information, interoceptive or exteroceptive, is triaged for further processing, whereas error signals in mPFC influence the associations learned regarding selected information. In support of this possibility, AIC is known to project to the nucleus basalis, the primary source of cholinergic input to cortex, although evidence for innervation of the nucleus basalis by cingulate is mixed (Russchen, Amaral, & Price, 1985; Mesulam & Mufson, 1984). Acetylcholine, in turn, has been implicated as an important neuromodulator for estimating risk and selecting internally represented information (Smith, Saaj, & Allouis, 2012; Krichmar, 2008; Yu & Dayan, 2005).
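To make the distinction between the two proposed uses of an error signal concrete, the following minimal sketch shows a single-cue update in which the same prediction error both changes the learned association and adjusts the cue's associability. It is written in the spirit of the associative learning models cited above, in a simplified hybrid form, and is not the PRO or HER model. The function name, the parameters `salience` and `gamma`, and their default values are assumptions introduced purely for illustration.

```python
def associability_update(value, alpha, outcome, salience=0.5, gamma=0.3):
    """One trial of a simplified, Pearce-Hall-style update for a single cue.

    value   : current prediction of the outcome given the cue
    alpha   : current associability (salience) of the cue
    outcome : observed outcome on this trial
    salience, gamma : illustrative constants (assumed, not taken from the article)
    """
    # Signed prediction error: observed outcome minus predicted outcome.
    error = outcome - value
    # Role 1: the error changes the association, scaled by current associability.
    new_value = value + salience * alpha * error
    # Role 2: the unsigned error updates associability, so surprising cues
    # gain access to further processing and well-predicted cues lose it.
    new_alpha = gamma * abs(error) + (1.0 - gamma) * alpha
    return new_value, new_alpha, error


def run_trials(outcomes, value=0.0, alpha=1.0):
    """Apply the update over a sequence of outcomes and record the trajectory."""
    history = []
    for outcome in outcomes:
        value, alpha, error = associability_update(value, alpha, outcome)
        history.append((value, alpha, error))
    return history
```

Under these assumptions, a run of consistent outcomes drives the prediction toward the outcome while associability declines, and a subsequent unexpected outcome raises associability again. In the speculative mapping described above, the second role would correspond to the triage function attributed to AIC and the first to the associative learning attributed to mPFC.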
The integration suggested by the HER model, although of potential interest, remains speculative for a number of reasons. First, although the HER model is able to learn a number of tasks that have been deployed in the study of high-level cognitive behaviors (Alexander & Brown, 2016), these tasks represent only one “operating mode” of the brain. Specifically, in the kinds of tasks the HER model was developed to learn, participants are required to integrate a history of observations to determine the correct behavior given a currently observed stimulus. This type of task is exemplified by the 1-2 AX CPT (O'Reilly & Frank, 2006) in which the sequence of stimuli observed by a participant is externally controlled; when a potential target cue is displayed, participants can only refer to past observations to arrive at a decision as to whether to make a target or nontarget response. Contrast this with a situation in which participants may be asked to navigate from one point in a maze to another; in this case, participants must themselves determine the sequence of observations required to correctly solve the maze. Although the present version of the HER model is unable to address this kind of behavior, it is possible that the general hierarchical organization of pFC as instantiated in the model, as well as its interpretation of activity in pFC as relating to error representation and manipulation, may be suitable for this form of goal-oriented decision-making. Under this mode of operation, goals might be interpreted as discrepancies (or errors) between a desired and current state and behaviors selected on the basis of how efficiently this discrepancy is reduced.
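As an illustration of what it means for the correct response to depend only on a history of externally controlled observations, the sketch below encodes the response rule of the 1-2 AX CPT as it is usually described: an X that immediately follows an A is a target when the most recent digit was 1, and a Y that immediately follows a B is a target when the most recent digit was 2. This is a hand-written statement of the task contingencies, not the HER model, which must learn them from feedback; the function and variable names are assumptions made for the example.

```python
def one_two_ax_responses(stimuli):
    """Return 'target' or 'nontarget' for each stimulus in a 1-2 AX CPT sequence.

    stimuli : iterable of single-character strings, e.g. '1', '2', 'A', 'B', 'X', 'Y'
    """
    context = None   # most recently observed digit ('1' or '2'), the outer context
    previous = None  # immediately preceding stimulus, the inner context
    responses = []
    for stimulus in stimuli:
        if stimulus in ("1", "2"):
            context = stimulus
        # The response to the current stimulus depends only on past observations.
        is_target = (
            (context == "1" and previous == "A" and stimulus == "X")
            or (context == "2" and previous == "B" and stimulus == "Y")
        )
        responses.append("target" if is_target else "nontarget")
        previous = stimulus
    return responses
```

Note that the rule needs two pieces of remembered information at different time scales (the last digit and the last letter), which is the kind of nested contingency that motivates a hierarchical account. Nothing in the rule requires the participant to choose which stimulus appears next, in contrast to the maze example above.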
A second potential limitation of the model in its current form, also related to the class of tasks the model was developed to perform, is its inability to account for behaviors related to the manipulation of internal representations. An example of this kind of behavior, pervasive in the working memory literature, is the n-back task (Kirchner, 1958), in which participants observe a sequence of stimuli and are required to report whether the current stimulus is a match for the stimulus observed n steps previously (typically n is a number from 1 to 3). Above and beyond passively integrating a history of observations, the n-back task requires participants, upon presentation of a new stimulus, not only to maintain the identity of previously observed stimuli but also to update those representations with information pertaining to the number of steps in the past they were observed. For example, if a picture of a dog was observed one step in the past, the presentation of a new stimulus requires participants to remember that the dog was now observed two steps in the past. Related to this kind of task are other cognitive behaviors, such as mental calculation, in which the results of simple calculations must be represented internally and used in further calculation to arrive at the correct solution. The active maintenance and manipulation of internal representations implied by tasks of this sort suggest the existence of a “visuospatial sketchpad” or “phonological loop” (Baddeley & Hitch, 1974) in which the results of internal representational manipulations can be stored for later use or reintegrated during further manipulation. Although the HER model in its current form does not incorporate a mechanism by which such manipulations of internal representations might be carried out, it is possible that future work might extend the model to include such operations.
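The updating requirement of the n-back task can be sketched as a fixed-length buffer in which every new stimulus shifts all stored items one step further into the past. The code below is a plain procedural description of the task demands under that assumption, not a proposed neural mechanism and not part of the HER model; the function name and the use of a double-ended queue are illustrative choices.

```python
from collections import deque


def n_back_responses(stimuli, n):
    """Return, for each stimulus, whether it matches the one seen n steps earlier."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # The buffer holds the last n stimuli; index 0 is the item seen n steps ago
    # once the buffer is full.
    buffer = deque(maxlen=n)
    responses = []
    for stimulus in stimuli:
        is_match = len(buffer) == n and buffer[0] == stimulus
        responses.append(is_match)
        # Appending ages every stored item by one step and discards the oldest,
        # which is the active updating that passive history integration lacks.
        buffer.append(stimulus)
    return responses
```

In the example from the text, a dog observed one step in the past occupies the most recent position of the buffer; when the next stimulus arrives, the same item is found one position further back, which is the relabeling of stored items that the task demands.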
More generally, although the HER model is incomplete, its reconceptualization of large portions of frontal cortex as engaging in error calculation and representation provides a lens through which additional regions might be viewed. A key challenge for future work is to investigate whether and how processes related to error computation might constitute a general functional principle of pFC. In much the same way that the PRO model provided critical constraints on how regions with which ACC interacts may function, the HER model may further inform our understanding of the organization and function of the rest of pFC. Although the success of the HER and PRO models in accounting for a wide array of effects, from single neurons to behavior, suggests that error-related processes may be a useful framework for interpreting brain function, it is possible that such a framework may in fact prove insufficient to explain the diversity of observations throughout pFC. In either eventuality, whether it serves as the basis for a broader understanding of pFC, or as a theory to be superseded by more comprehensive accounts, the HER model is a step toward the ultimate goal of understanding the function of pFC.
Acknowledgments
W. H. A. was supported by FWO-Flanders Odysseus II Award #G.OC44.13N. E. V. was supported by the Marie Sklodowska-Curie action with a standard IF-EF fellowship, within the H2020 framework (H2020-MSCA-IF2015, Grant number 705630).
Reprint requests should be sent to William H. Alexander, Department of Experimental Psychology, Ghent University, Henri Dunantlaan 2, Ghent, Belgium 9000, or via e-mail: william.alexander@ugent.be.