How does the brain maintain to-be-remembered information in working memory (WM), particularly when the focus of attention is drawn to processing other information? Cognitive models of WM propose that when items are displaced from focal attention, recall involves retrieval from long-term memory (LTM). In this fMRI study, we tried to clarify the role of LTM in performance on a WM task and the type of representation that is used to maintain an item in WM during rehearsal-filled versus distractor-filled delays. Participants made a deep or shallow levels-of-processing (LOP) decision about a single word at encoding and tried to recall the word after a delay filled with either rehearsal of the word or a distracting math task. Recalling one word after 10 sec of distraction demonstrated behavioral and neural indices of retrieval from LTM (i.e., LOP effects and medial-temporal lobe activity). In contrast, recall after rehearsal activated cortical areas that reflected reporting the word from focal attention. In addition, areas that showed an LOP effect at encoding (e.g., left ventrolateral prefrontal cortex [VLPFC] and the anterior temporal lobes [ATLs]) were reactivated at recall, especially when recall followed distraction. Moreover, activity in left VLPFC during encoding, left ATL during the delay, and left hippocampus during retrieval predicted recall success after distraction. Whereas shallow LOP and rehearsal-related areas supported active maintenance of one item in focal attention, the behavioral processes and neural substrates that support LTM supported recall of one item after it was displaced from focal attention.
How does the brain maintain to-be-remembered information in working memory (WM), particularly when one's focus of attention is drawn to processing other information? “Store” models of WM suggest that a small number of items (e.g., four) may be maintained in a temporary storage site, such as a domain-specific buffer (Baddeley, 1986). According to Baddeley, verbal items are maintained by rehearsing them in a phonological loop, instantiated neurally by Broca's area (BA 44/45) and areas of left temporo-parietal cortex (e.g., supramarginal gyrus [BA 40]), with contributions from premotor, supplemental motor, and cerebellar areas (Baddeley, 2003; for an alternative view, see Buchsbaum & D'Esposito, 2008). “State” models suggest that information “in WM” consists of representations that differ in their level of activation (Oberauer, 2009; Cowan, 2008; McElree, 2006). According to these researchers, a small amount of information may be within the focus of attention—representing the highest level of activation. This information is thought to be available for cognitive processing without the need for retrieval per se. Views on the capacity limit vary, but all of these models claim that at least one item can be maintained in WM over short delays.
However, when the focus of attention is drawn away from maintaining to-be-remembered items in WM, state models propose that recall involves retrieving representations from activated long-term memory (LTM; Oberauer, 2009; Cowan, 2008) and involves areas associated with LTM retrieval, such as the hippocampus and/or the parahippocampal gyrus in the medial-temporal lobes (MTLs) as well as the inferior frontal gyrus and angular gyrus (AG; Jonides et al., 2008; Postle, 2006; Talmi, Grady, Goshen-Gottstein, & Moscovitch, 2005). In contrast, according to Baddeley (2007), WM interacts with LTM through an “episodic buffer” but is still thought to involve information in conscious awareness. This information may be associated with (or bound to) information either in LTM or domain-specific stores, yet it is proposed to do so independently of the hippocampus (Baddeley, Allen, & Vargha-Khadem, 2010). Thus, a critical difference between store and state models concerns their conceptualization of the structural architecture of WM. Recently, we showed that patient H. C., an MTL amnesic with bilateral hippocampal lesions, had a striking deficit in maintaining a single novel word in WM over 4–8 sec of distraction (Rose, Olsen, Craik, & Rosenbaum, 2012). Such a finding counters the traditional view of a double dissociation in the neural substrates supporting WM and LTM (e.g., Baddeley & Warrington, 1970; for reviews of similar findings, see Jonides et al., 2008; Ranganath & Blumenfeld, 2005). In this fMRI study, we sought to clarify the role of LTM in performance on a WM task and the type of representation that healthy participants use to maintain a single word in WM during a brief rehearsal-filled or distractor-filled delay.
Another traditional view of the distinction between WM and LTM concerns the involvement of different representational codes in WM and LTM tasks. Classically, “deeper” (semantic/conceptual) levels of processing (LOP) are thought to support LTM, whereas “shallow” (perceptual) types of processing are sufficient for WM (Craik & Lockhart, 1972). Verbal WM tasks typically involve rehearsal of shallow, phonological codes associated with cortical regions along a largely left-lateralized dorsal pathway connecting temporoparietal areas with frontal areas via the arcuate and superior longitudinal fasciculi (Buchsbaum et al., 2011; Fiebach, Rissman, & D'Esposito, 2006; Martin, 2005; Shivde & Thompson-Schill, 2004; Hickok, Buchsbaum, Humphries, & Muftuler, 2003; Gruber, 2001; D'Esposito, Postle, & Rypma, 2000).
However, substantial evidence from behavioral (Rose, Buchsbaum, & Craik, 2014; Rose & Craik, 2012; Shivde & Anderson, 2011), neuropsychological (Barde, Schwartz, Chrysikou, & Thompson-Schill, 2010; Hamilton & Martin, 2007; Martin, 2005), EEG (Cameron, Haarmann, Grafman, & Ruchkin, 2005; Haarmann & Cameron, 2005; Ruchkin, Grafman, Cameron, & Berndt, 2003), and fMRI (Fiebach, Friederici, Smith, & Swinney, 2007; Shivde & Thompson-Schill, 2004) studies suggest that deeper, semantic codes and associated representational cortical areas do support WM in many situations. For instance, deeper (semantic) LOP at encoding benefits WM when active maintenance processes are disrupted (Rose et al., 2014). Furthermore, patients with damage to left ventrolateral prefrontal cortex (VLPFC) and anterior temporal lobe (ATL) tend to have a relatively selective deficit in short-term maintenance of semantic features. Such patients do not show better immediate recall of words than pseudowords (i.e., a lexicality effect), and their immediate recognition is better for rhyme probes than semantic associate probes (Martin & He, 2004; Martin & Saffran, 1997; Martin, Shelton, & Yaffee, 1994). Such findings suggest that these patients rely more on shallow, phonological codes than deeper, semantic codes for short-term retention and have led some researchers to hypothesize the existence of a specialized semantic WM store (Hamilton & Martin, 2007; Martin, 2005).
A Processing Approach to the WM/LTM Distinction
Some have argued that the use of codes (semantic) and areas of cortex (MTL) classically associated with deeper LOP and LTM on WM tasks does not necessarily call for the existence of a specialized WM store or suggest that items are retrieved from the LTM store (Hoffman, Jefferies, & Lambon Ralph, 2011; Postle, 2006; Belleville, Caza, & Peretz, 2003; Barde & Thompson-Schill, 2002). Rather, a processing view of the WM/LTM distinction argues that involvement of semantic codes as opposed to articulatory/phonological codes, for example, reflects a difference in the maintenance processes and representations used to support performance (Rose & Craik, 2012; Craik & Jacoby, 1975). That is, words may be maintained in conscious awareness (“in WM”) in terms of either their phonological features or their semantic features depending on task demands and participants' expectations. Several findings from the neuroimaging literature showing substantial overlap in areas supporting WM and LTM are consistent with this processing view of the WM/LTM distinction (Ranganath, Johnson, & D'Esposito, 2003; Cabeza, Dolcos, Graham, & Nyberg, 2002; Braver et al., 2001). For example, Speer, Jacoby, and Braver (2003) showed that biasing the use of maintenance-focused (phonologically based) processes over retrieval-focused (semantically based) processes by manipulating participants' test expectations (presenting blocks of trials with recognition of four vs. eight words) led to greater activation in left posterior frontal and supplemental motor regions typically associated with phonological processing and articulatory rehearsal, even for trials in which the overall memory task was identical (recognition of six words). Similarly, Shivde and Thompson-Schill (2004) had participants maintain one word in WM over a delay to make a semantic (relatedness) or phonological (rhyme) judgment about the word. 
The type of test dictated the preferential maintenance of either semantic or phonological features, and consequently, semantic maintenance elicited greater activity in the bilateral inferior frontal gyrus and left middle temporal gyrus, whereas phonological maintenance elicited greater activity in the left superior parietal cortex.
Therefore, depending on the nature of the task conditions, WM maintenance may involve a code classically associated with either short-term retention (e.g., phonological) or long-term retrieval (e.g., semantic) and corresponding areas of associated cortex. Taken together, WM task conditions that show behavioral and neural indices for an involvement of retrieval from LTM and/or the use of LTM retrieval codes may be best understood by the cognitive principle of encoding specificity (Tulving & Thomson, 1973) and related views of memory (LOP: Craik & Lockhart, 1972; transfer-appropriate processing: Morris, Bransford, & Franks, 1977). That is, retention is determined by how well the requirements of a memory test match initial processing used during encoding (Morris et al., 1977; Craik & Tulving, 1975; for examples with neural data, see Vaidya, Zhao, Desmond, & Gabrieli, 2002; Nyberg, Habib, McIntosh, & Tulving, 2000). From this perspective, the extent to which similarities and differences are observed on WM and LTM tests should depend on the extent to which the constituent processes converge or diverge (Rose, Myerson, Roediger, & Hale, 2010).
According to this “processing view,” retention interval is not the defining factor for the distinction between “short-term” memory and “long-term” memory. In the current study, we hypothesized that recall after rehearsal relies on maintenance of a phonological representation and that recall after math involves retrieval of an episodic (item-in-context) representation from LTM, which predominantly relies on a semantic/conceptual representation. To fit with the current terminology, we use the term WM as shorthand to describe the collection of processes associated with actively maintaining to-be-remembered information, which is to be distinguished from retrieval of information that has been dropped from focal attention during a delay, that is, “from LTM.”
The Current Study
To cast further light on the role of LTM in WM and on the maintenance of information in WM during a delay, we hypothesized that LTM involvement and the type of representation that is maintained would change as a function of distraction. To this end, we developed a novel task involving recall of a single word after a delay period filled with either rehearsal or an easy- or hard-math task (see Figure 1).
By manipulating the nature of encoding and the amount of distraction during the delay, we could assess how the representation that was encoded changed as a function of distraction and what the consequences were to subsequent recall performance. We predicted that retrieval after a math-filled delay would benefit from deeper LOP at encoding and would recruit areas associated with LTM retrieval, such as the inferior frontal gyrus in VLPFC, the AG, and the hippocampus and/or the parahippocampal gyrus in the MTL. We also predicted that deeper (semantic) LOP at encoding would be associated with activation in areas known to be important for semantic processing, such as left VLPFC and left ATL, and that such areas would be both reactivated during retrieval and predictive of successful recall. Such findings would reflect a neural instantiation of the cognitive principle of encoding specificity and a processing approach to the WM/LTM distinction.
Eighteen right-handed young adults (18–35 years old) were recruited from the Rotman Research Institute Research Volunteer Pool. All participants had normal or corrected-to-normal vision and did not report a history of any neurological or psychiatric illness. Informed consent was obtained from all participants, and the local ethics board at the Rotman Research Institute at Baycrest Hospital approved the study.
Design and Procedure
LOP at encoding (deep, shallow) and delay condition (rehearse, easy math, hard math) were manipulated within participants. The WM task was the same as that reported in Rose et al. (2014), adapted for use in the MRI scanner. Stimuli were projected onto a screen behind the scanner, which the participant viewed through a mirror mounted on the head coil. On each trial, an LOP orienting question asked the participant either to indicate whether the to-be-remembered word represented something living (deep encoding condition) or contained an “e” (shallow encoding condition) by pressing the left (yes) or right (no) button on a response box with, respectively, the index or middle finger of their right hand. The to-be-remembered word was presented visually for 1.25 sec. Half of the words represented something living, and half of these words contained an “e”; the other half did not contain an “e.” Similarly, half of the nonliving words contained an “e,” and the other half did not. This permitted a counterbalancing scheme such that, across participants, the same words were to be recalled in each of the LOP × Delay conditions.
After the processing decision, there was either a filled or unfilled delay of 10 sec, before the participant was to recall the word. For the unfilled delay, the participant was instructed to subvocally rehearse the word (i.e., “repeat the word over and over in your head”). For the filled delay, the participant performed a continuous mental calculation task, which required adding or subtracting a series of digits using the numbers −9 to +9, excluding the numbers between −3 and +3. (Pilot testing indicated that adding or subtracting 0, 1, 2, or 3 to the current sum was considerably less distracting than adding or subtracting 4, 5, 6, 7, 8, or 9.) The “easy-math” condition consisted of a series of five numbers presented at a rate of 2000 msec per number (e.g., 8, −2, +6, +4, −7). In the “hard-math” condition, the first number presented in the series was a random number between 100 and 200 followed by six single digits presented at a rate of 1430 msec per number (e.g., 127, −4, +8, −6, −9, +4, −7). In both cases, the participant was asked to mentally calculate the current value with each successive digit. After the final digit, a number was presented (e.g., 119), and the participant pressed a button to indicate if the number was or was not the correct sum. On half of the trials, the number deviated from the actual sum by ±1. Then, the participant was to recall the word presented on that trial by saying it aloud within the 5-sec response window. Vocal responses were recorded via a noise-canceling microphone for later scoring of accuracy and response time. Participants performed 120 total trials presented in one of six pseudorandom orders (counterbalanced across participants). There were 20 trials for each of the LOP × Delay conditions; trials for the different conditions were mixed randomly with equal numbers split into five blocks (24 trials per block). 
There was a pseudorandom jittering of intertrial intervals ranging from 3 to 6 sec in 500-msec steps with a rectangular distribution and a 1-min break between each block. Feedback on the LOP decision accuracy and math accuracy was provided between each block. Participants were explicitly told not to sacrifice performance on the math task so that they could recall the word—that is, both tasks were equally important.
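The distractor sequences and jittered intertrial intervals described above can be illustrated with a minimal sketch. This is not the authors' stimulus code: the function names, the dictionary layout, and the choice of a positive single digit as the first easy-math number are assumptions made only for illustration.

```python
import random

# Operands named in the task description: -9..+9, excluding -3..+3.
OPERANDS = [n for n in range(-9, 10) if abs(n) >= 4]

# Intertrial intervals: 3 to 6 sec in 500-msec steps (rectangular distribution).
ITI_VALUES = [3.0 + 0.5 * k for k in range(7)]


def make_math_trial(difficulty, probe_correct, rng):
    """Build one math-delay trial (hypothetical helper, not the authors' code)."""
    if difficulty == "easy":
        # Five numbers at 2000 msec each. Assumption: the first number is a
        # positive single digit with the same magnitude range as the operands.
        start, n_steps, rate_ms = rng.randint(4, 9), 4, 2000
    elif difficulty == "hard":
        # A starting value between 100 and 200, then six digits at 1430 msec.
        start, n_steps, rate_ms = rng.randint(100, 200), 6, 1430
    else:
        raise ValueError("difficulty must be 'easy' or 'hard'")
    steps = [rng.choice(OPERANDS) for _ in range(n_steps)]
    true_sum = start + sum(steps)
    # On foil trials the probe deviates from the true sum by +1 or -1.
    probe = true_sum if probe_correct else true_sum + rng.choice([-1, 1])
    return {
        "sequence": [start] + steps,
        "true_sum": true_sum,
        "probe": probe,
        "duration_ms": (n_steps + 1) * rate_ms,
    }


def sample_iti(rng):
    """Draw one jittered intertrial interval in seconds."""
    return rng.choice(ITI_VALUES)


rng = random.Random(0)  # fixed seed only so the sketch is reproducible
print(make_math_trial("hard", probe_correct=False, rng=rng))
print(sample_iti(rng))
```

Under these assumptions, the easy sequence lasts 5 × 2000 = 10,000 msec and the hard sequence lasts 7 × 1430 = 10,010 msec, so both fill the 10-sec delay. Passing `probe_correct` explicitly lets the caller balance correct and foil probes across trials.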
MRI Acquisition, Preprocessing, and Analysis
Whole-brain images were acquired with a 3.0-T Siemens MAGNETOM Trio MRI scanner (Siemens, Erlangen, Germany). First, high-resolution T1-weighted images were obtained for anatomical localization (3-D MP-RAGE Ax 2000TR, 2.63TE; field of view = 192 × 256 mm, 160 slices, 1.0/0). T2*-weighted EPIs acquired BOLD contrasts with a 12-channel head coil system using a two-shot gradient-echo EPI sequence (22.5 × 22.5 cm field of view with a 96 × 96 matrix size, resulting in an in-plane resolution of 2.35 × 2.35 mm for each of twenty-six 3.5-mm axial slices with a 0.5-mm interslice gap; repetition time = 1.5 sec; echo time = 27 msec; flip angle = 62°).
The data for each participant were preprocessed using Analysis of Functional Neuroimages (AFNI). Functional images were first realigned to the mean image volume of the first scanning run with the AFNI program 3dvolreg. The data were spatially smoothed with a 5-mm FWHM Gaussian kernel. The reference EPI image was then aligned to the participant's T1 structural image and warped to standard Montreal Neurological Institute (MNI) atlas space with nonlinear diffeomorphic registration using Advanced Normalization Tools (ANTs). General linear models were created in AFNI using 3dDeconvolve with regressors using the SPM canonical hemodynamic response function for each condition and trial phase (encoding: LOP; delay: LOP × delay condition; recall: LOP × delay condition). Thus, there were two regressors for the encoding phase (0–2 sec after word onset; deep, shallow), six regressors for the delay period (4–10 sec after word onset), and six regressors for the recall phase (11–15 sec after word onset). Nuisance regressors from head motion were used to control for movement associated with recall. It may also be noted that studies have suggested the movement artifact associated with overt recall is minimal and often uncorrelated with the expected hemodynamic response, especially for production of a single item (see, e.g., Öztekin, Long, & Badre, 2010; Birn, Cox, & Bandettini, 2004; Barch et al., 1999). The delay regressor (4–10 sec after word onset) was deliberately started late and ended early relative to the full delay period to reduce collinearity with the encoding and recall regressors. At the group level, t tests were carried out on contrasts of regression weights after spatial normalization, and statistical significance was assessed using an alpha value of .005 and minimum cluster extent of 18 voxels determined via numerical simulation with the AFNI program 3dAlphaSim.
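To illustrate how the three trial phases map onto model regressors, the following sketch convolves a boxcar for each phase window with a double-gamma hemodynamic response function and samples the result once per 1.5-sec scan. It is a simplified stand-in for the 3dDeconvolve model, not a reproduction of it. The HRF parameters (gamma shapes 6 and 16, undershoot ratio 1/6) are the commonly cited defaults for an SPM-style canonical response and are assumed here, and the function names are hypothetical.

```python
import math

TR = 1.5  # repetition time in seconds, from the acquisition parameters


def double_gamma_hrf(t):
    """Double-gamma HRF at time t (sec); the parameter values are assumed."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0


def phase_regressor(onset, offset, n_scans, dt=0.1):
    """Convolve a boxcar over [onset, offset) sec with the HRF, one value per scan."""
    n_fine = int(round(n_scans * TR / dt))
    on_i, off_i = int(round(onset / dt)), int(round(offset / dt))
    boxcar = [1.0 if on_i <= i < off_i else 0.0 for i in range(n_fine)]
    hrf = [double_gamma_hrf(i * dt) for i in range(int(round(32 / dt)))]
    regressor = []
    for scan in range(n_scans):
        idx = int(round(scan * TR / dt))
        total = 0.0
        for k, h in enumerate(hrf):
            j = idx - k
            if j < 0:
                break
            total += boxcar[j] * h
        regressor.append(total * dt)  # Riemann-sum approximation of the integral
    return regressor


# Phase windows in seconds after word onset, as stated in the text.
encoding = phase_regressor(0, 2, n_scans=16)
delay = phase_regressor(4, 10, n_scans=16)
recall = phase_regressor(11, 15, n_scans=16)
print([round(v, 3) for v in encoding])
```

Because the predicted responses to adjacent phases overlap in time after convolution, shortening the delay window is one way to keep the delay regressor less correlated with its neighbors.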
Brain–Behavior Correlation Analysis
To examine the relationship between BOLD activity and recall performance, we carried out a subsequent memory analysis for all time points from 0 to 24 sec after trial onset, a range spanning the encoding, delay, and recall phases. Thus, for every voxel in the brain, the BOLD time series data were split into sixteen 1.5-sec chunks that were aligned to the onset of each trial, yielding a 16 (time points) × 80 (number of easy and hard math trials) matrix. Rehearsal trials were not used for this analysis because performance was at ceiling, and thus, there was almost no variance in recall accuracy. A Spearman rank correlation was then computed between the BOLD value and recall performance (recall = 1, no recall = 0) for each time point. Because of the relatively small number of errors in the easy math condition, we collapsed across both LOPs (shallow, deep) and math (easy, hard) conditions after removing each of the condition means. There were slightly more trials, on average, for deep than shallow conditions; however, separate correlation coefficients were computed for each LOP condition. These coefficients were then averaged and therefore equally weighted so that deep trials did not contribute more than shallow trials. Thus, this subsequent memory cross-correlation analysis allowed us to examine the point-by-point relationship between BOLD activity and subsequent memory performance across the trial. Whole-brain inference at the group level was assessed, after spatial normalization of the 16 correlation maps, using nonparametric analogs of a one-way repeated-measures ANOVA on the time series, the Friedman test (Friedman, 1937), and a one-sample t test on each time point, the Wilcoxon test (Wilcoxon, 1945).
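For a single voxel and a single participant, the cross-correlation profile described above can be sketched as follows. The sketch assumes that the BOLD values have already had the condition means removed, and it returns zero when a correlation is undefined (for example, when every trial in a condition was recalled). Both choices, along with the function names and data layout, are assumptions for illustration and are not taken from the authors' pipeline.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable is constant (assumption)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation computed as Pearson on tie-corrected ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def subsequent_memory_profile(bold_by_lop, recalled_by_lop):
    """Average the per-LOP Spearman correlations at each trial time point.

    bold_by_lop[lop] is a list of trials, each a list of BOLD values (one per
    time point); recalled_by_lop[lop] is the matching list of 0/1 outcomes.
    """
    n_time = len(next(iter(bold_by_lop.values()))[0])
    profile = []
    for t in range(n_time):
        rhos = [
            spearman([trial[t] for trial in bold_by_lop[lop]], recalled_by_lop[lop])
            for lop in bold_by_lop
        ]
        profile.append(sum(rhos) / len(rhos))  # equal weight per LOP condition
    return profile
```

Averaging the two coefficients, rather than pooling the trials, is what keeps the condition with more trials from dominating the estimate.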
First, we confirmed that LOP decision accuracy and RT at encoding did not differ between the deep (97%, 824 msec) and shallow (96%, 812 msec) encoding conditions. We also confirmed that the “hard-math” task (67%) was actually hard for the participants and the “easy-math” task was actually easy (92%).
Recall performance is presented in Figure 2. Recall was nearly perfect after a rehearsal-filled delay (M = 99.7%, SD = 0.8), poorer after an easy-math-filled delay (M = 85.8%, SD = 11.9), and poorest after a hard-math-filled delay (M = 77.8%, SD = 11.8). The main effect of Delay condition was significant, F(2, 34) = 37.59, p < .001. Deep encoding benefited recall overall, as indicated by a significant main effect of LOP, F(1, 17) = 28.19, p < .001, but there was a significant interaction between Delay condition and LOP, F(2, 34) = 10.67, p < .001. Follow-up analyses on the easy- and hard-math conditions confirmed that LOP and delay condition interacted, F(1, 17) = 5.01, p < .05, because there was a substantially larger benefit of deep encoding when recall followed hard math (t(17) = 4.67, p < .001, d = .90) than easy math (t(17) = 2.53, p < .05, d = .37). The pattern of behavioral results obtained in the scanner is almost an exact replication of the results obtained with an independent sample that performed a slightly modified version of the task outside the scanner (Rose et al., 2014).
Activation differences between conditions for the encoding (deep > shallow), delay (math > rehearse), and recall (math > rehearse) phases are shown in Figure 3. Deep encoding elicited more activity than shallow encoding in areas associated with semantic processing including left VLPFC and left anterior and middle temporal regions, consistent with previous findings (Badre & Wagner, 2007; Nyberg, Forkstam, Petersson, Cabeza, & Ingvar, 2002; Poldrack et al., 1999; Kapur, Craik, et al., 1994; Kapur, Rose, et al., 1994). In contrast, shallow encoding elicited more activity than deep encoding in visual association areas in occipital and parietal cortices (for all coordinates, see Supplemental Table 1). A contrast between activation during math-filled and rehearsal-filled delays (math > rehearsal, collapsed across easy-math and hard-math conditions) showed increased activation for math-filled delays in areas in the “cognitive control” network (including the superior parietal lobule and DLPFC; Niendam et al., 2012; Cole & Schneider, 2007). Increased activation during rehearsal-filled delays was observed in a distributed network of areas associated with the “default mode” (Buckner, Andrews-Hanna, & Schacter, 2008; for coordinates, see Supplemental Table 2). Note that, because the mental calculation task was designed to require subvocal articulation and rehearsal of digits, and therefore to disrupt rehearsal of the to-be-remembered word, areas activated in both conditions (e.g., premotor cortex) were canceled out. The same contrast computed for the recall phase showed increased activation after a math-filled delay in the parahippocampal gyrus, the anterior insula/VLPFC, the AG, and the posterior portion of the middle frontal gyrus. Activation after a rehearsal-filled delay was seen predominantly in the left and right precentral and postcentral gyri and anterior intraparietal sulci (see Supplemental Table 3).
We selected two regions that showed significantly greater activation for deep versus shallow processing at encoding and that have been associated with semantic processing: the left VLPFC and the left ATL. We then created ROIs by finding the peak voxel within each significant activation cluster in the two areas and identified the top 20 (deep > shallow) voxels in the connected neighborhood. Using these masks derived from the deep > shallow encoding contrast, we then extracted single-subject t statistics for the recall phase of the task and computed a 2 (deep, shallow) × 3 (rehearsal, easy math, hard math) ANOVA for each ROI. As may be seen in Figure 4, in both the left VLPFC and ATL, there were significant main effects of both LOP (F(1, 17) = 3.64 and 10.18, respectively, one-tailed p < .05) and Delay condition (F(2, 34) = 10.37 and 2.85, respectively, one-tailed p < .05). Tests were one tailed because of the a priori prediction, based on the LOP framework (Craik & Lockhart, 1972), that the effect should be deep > shallow, consistent with semantic reinstatement. This pattern indicates that these two regions showed an LOP effect at encoding that was mirrored during recall, that is, reactivation of encoding areas during retrieval, especially after distraction.
Next, we examined activity across all trial time points that predicted recall success. For every time point, at 1.5-sec intervals, from 0- to 24-sec posttrial onset, we computed a Spearman rank correlation between the BOLD activation level and subsequent memory (0 for incorrect, 1 for correct). Because of a ceiling effect in the rehearsal condition, only math trials were used in this analysis, resulting in 80 trials per participant (40 easy math, 40 hard math). To test whether there was a reliable pattern in the shape of this cross-correlation function for the group, we computed a one-way Friedman test (random factor: participant, repeated measure: time). A significant effect on the Friedman test indicates whether there was a significant change in the correlation with performance across time within a task trial.
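The group-level test on the shape of the cross-correlation function can be sketched with the textbook Friedman statistic, treating participants as blocks and time points as repeated measures. The version below omits the correction for ties and does not compute a p value; whether the original analysis applied a tie correction is not stated, so this is an assumption, and the function names are hypothetical.

```python
def rank_within_row(row):
    """Rank one participant's values from 1..k, using mean ranks for ties."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def friedman_chi_square(data):
    """Friedman statistic without tie correction.

    data[i][j] is the correlation for participant i at time point j.
    Returns (chi_square, degrees_of_freedom).
    """
    n = len(data)      # number of participants (blocks)
    k = len(data[0])   # number of time points (repeated measures)
    rank_sums = [0.0] * k
    for row in data:
        for j, r in enumerate(rank_within_row(row)):
            rank_sums[j] += r
    chi2 = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return chi2, k - 1


# Toy example: three participants whose correlations rise over three time points.
toy = [[0.01, 0.05, 0.20], [0.00, 0.10, 0.15], [-0.02, 0.03, 0.30]]
print(friedman_chi_square(toy))
```

The statistic is large when participants agree on which time points carry the strongest correlation with recall, and it is near zero when the rank order of time points varies freely across participants.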
We first examined whether activation in the left VLPFC and ATL ROIs specifically defined by the deep > shallow contrast at encoding was also predictive of subsequent recall. The same cross-correlation analysis between BOLD activity and recall performance was carried out for both ROIs. A one-way, repeated-measures Friedman test was again computed. Significant differences in the correlation across time were observed in the left VLPFC (χ² = 29.99, p < .01) and in the left ATL (χ² = 26.31, p < .05; see the time courses in Figure 5).
To examine subsequent memory effects throughout the brain, we also carried out the subsequent memory analysis for every voxel in MNI space, again using the one-way Friedman test (p < .005; cluster size > 18 voxels). To aid in the visualization of the temporal structure of the cross-correlation profile, we divided the time points into three bins: encoding (1.5–7.5 sec after word onset), delay (7.5–16.5 sec after word onset), and recall (16.5–24 sec after word onset). The surface map, presented in Figure 5, was colored according to which time bin the correlation with performance was maximal within the mask defined by the thresholded Friedman test map (encoding = green, delay = blue, recall = red).
As may be seen in Figure 5, the large areas of red are regions that were correlated with recall performance during the recall phase itself. Note, for example, that the red areas encompass the ventral portion of the motor cortex, which is to be expected because incorrect recall often involved an omission and thus no overt vocalization. The green and blue coloring in areas of VLPFC indicate that, during both encoding and delay periods, activation in VLPFC predicted subsequent recall. The blue coloring in the ATL indicates that activity in this region during the delay predicted subsequent recall. Other areas that predicted recall performance, which are not entirely visible on the surface view, include the medial portion of left ATL and the left hippocampus (see slice views in Figure 5).
Thus, BOLD activation in areas that showed greater activation for deep processing at encoding was correlated with subsequent recall, yet the correlation changed over different phases of the task. During encoding, activity in the left VLPFC (MNI: −31, 27, −3) predicted recall. During the delay, activity in the left (MNI: −38, 19, −27) and right (MNI: 41, 21, −27) ATLs predicted recall. During retrieval, activity in the left hippocampus (MNI: −22, −19, −15) predicted recall success (see Figure 5).
Recall of one word after 10 sec of either an easy- or hard-math task demonstrated a benefit of deep encoding and greater activation in VLPFC, ATL, and MTL than recall after a rehearsal-filled delay. Two areas known to be important for semantic retrieval, left VLPFC and left ATL, showed greater activation during deep encoding, a pattern that was also evident during subsequent recall. Moreover, activity in VLPFC and ATL during the math-filled delay predicted recall success. The temporal profile of this predictive activity differed across regions. Specifically, during the encoding phase, activity in the left VLPFC predicted subsequent recall; during the delay, activity in the left ATL predicted subsequent recall; and during the retrieval phase itself, left hippocampal activity was most predictive of recall success.
Taken together, whereas recall after rehearsal was nearly perfect and activated a network of cortical areas thought to reflect direct readout from the phonological loop (Buchsbaum & D'Esposito, 2008; Hickok et al., 2003), recall after a distracting math-filled delay reflected behavioral and neural indices of retrieval from LTM (Rose et al., 2014; Nee & Jonides, 2008, 2011; Öztekin, Davachi, & McElree, 2010; Öztekin, McElree, Staresina, & Davachi, 2009; Talmi et al., 2005; Sakai, Rowe, & Passingham, 2002a). These results have important implications for the distinction between WM and LTM. They help to clarify the involvement of LTM in performance on WM tasks and the concept of semantic WM.
Although rehearsal of phonological representations may support verbal WM in situations when it is possible to engage continuous rehearsal mechanisms (see Buchsbaum & D'Esposito, 2008, for a review), it is important to acknowledge that the type of neural code that is maintained in WM changes when rehearsal mechanisms are disrupted. The activations in the left VLPFC and left ATL that were observed during math-filled delays and the reactivation of deep encoding areas in the left VLPFC and left ATL during recall after math-filled delays, but not after rehearsal-filled delays, may reveal the use of semantic representations to support recall after distraction. It has been suggested that markers of semantic processing during a WM delay period reflect active semantic maintenance (Shivde & Anderson, 2011). For example, Fiebach and colleagues have shown anterior inferotemporal activation, as well as functional coupling with the left inferior PFC, during maintenance of words versus pseudowords (Fiebach et al., 2006) and that maintaining a conceptual (semantic) representation of three words (tomato note ripe) to make a judgment about a semantically associated word (green?) showed delay-period activity that was greater in posterior inferior temporal and inferior frontal areas (compared with item recognition trials; Fiebach et al., 2007). In addition, during the retention period of a verbal WM task, semantic priming effects were seen on the N400 ERP component for incidental probes that were related to words from a memory list (Cameron et al., 2005). As Fiebach et al. (2007) noted, “[f]inding a reduction in an ERP component sensitive to semantic context effects suggests that the processing of the incidental probe is facilitated by the sustained activation of the semantic representations of the memory set, lending support to the notion of sustained activation of conceptual-semantic representations during the short-term maintenance of words in WM” (p. 2046). 
That left VLPFC and left ATL areas were active during the delay and predicted subsequent recall in the math conditions of the current study may be seen as evidence for maintenance of semantic features of the to-be-remembered word (see also Shivde & Thompson-Schill, 2004).
These findings are also consistent with previous neuroimaging studies that have shown that disrupting rehearsal impairs verbal WM performance, diminishes activity in areas associated with phonological rehearsal, and increases activity in lateral PFC (e.g., Gruber, 2001). In addition, during the distractor-filled delay period of a WM task, less activation in lateral PFC was associated with more retrieval errors (Sakai, Rowe, & Passingham, 2002b). Taken together, the current findings are consistent with the hypotheses that lateral PFC supports maintenance of distraction-resistant representations (Sakai et al., 2002b) and that WM coding is flexible and takes on the form that suits the task (e.g., rehearsal of phonological features or reactivation of semantic features; Speer et al., 2003).
VLPFC, Deep Processing, and Distinctiveness
We found that left VLPFC was activated by deep processing at encoding and was reactivated at recall after distraction. This suggests that VLPFC facilitates retrieval from WM after distraction through activation of semantic features of to-be-retrieved representations. This hypothesis is plausible because semantic retrieval cues provide diagnostic information that can help resolve interference from competing representations by discriminating target memories from nontargets (Craik & Tulving, 1975). In prior neuroimaging studies, the left VLPFC has been associated with resolving interference during both encoding (Fletcher, Shallice, & Dolan, 2000; Dolan & Fletcher, 1997) and retrieval (Badre, Poldrack, Paré-Blagoev, Insler, & Wagner, 2005; D'Esposito, Postle, Jonides, & Smith, 1999; Thompson-Schill, D'Esposito, Aguirre, & Farah, 1997). It becomes more active as proactive interference accumulates over trials in WM tasks (Jonides & Nee, 2006), and patients with left IFG damage have deficits on semantic STM tasks (see Martin, 2005, for a review), particularly with regard to intrusions from previously relevant items (Barde et al., 2010; Hamilton & Martin, 2007). However, left IFG patients typically have deficits in semantic processing in general (Belleville et al., 2003; Hoffman et al., 2011; Barde et al., 2010). In addition, the left IFG has been implicated in tasks requiring semantic processing with no memory demands per se (see Badre & Wagner, 2007, for a review). An alternative hypothesis is that, rather than having the specific function of resolving interference, the left VLPFC provides detailed semantic information that allows one to discriminate target from nontarget memories during recall.
Together with the left temporal areas associated with processing conceptual/semantic aspects of representations and areas of the MTL that are crucial for binding temporal/contextual information to episodes, the left IFG may provide information (e.g., conceptual/semantic, temporal/contextual) that allows one to distinguish a target memory from other memories that may be activated but are irrelevant. That is, rather than “inhibiting” irrelevant representations, the left IFG may help to access distinctive (e.g., semantic) information that can discriminate target from nontarget memories.
MTL and Retrieval from LTM on WM Tasks
We also observed MTL activation associated with recall after distraction, but not after rehearsal. This finding is consistent with accounts suggesting that short-term retrieval after distraction involves binding of and access to temporal–contextual information, whereas temporal context cues are less important for retrieval of an item rehearsed in focal attention (Sakai et al., 2002b). This finding is also consistent with prior neuroimaging studies of the WM/LTM distinction that showed greater activation on WM tasks in the MTL during recognition of items that were assumed to have been retrieved from LTM based on the serial position of the items (Nee & Jonides, 2008, 2011; Öztekin et al., 2009, 2010; Talmi et al., 2005). The current study is novel in that we used single-item memoranda and recall instead of recognition. In prior studies, it was difficult to know from an item's serial position whether it was retrieved from focal attention or from LTM, because that would depend on which item the participant was rehearsing or attending to before retrieval. By requiring recall of one word, the current study provided clear conditions in which the item was either in focal attention or was displaced from focal attention at the time of recall, and so, according to contemporary models of WM, the item had to be retrieved from activated LTM (Rose et al., 2014; Oberauer, 2009; Cowan, 2008). Importantly, the present results provide behavioral and neural evidence that recall of one highly familiar, concrete noun (e.g., moose) can involve retrieval from LTM after a brief, attention-demanding activity (addition or subtraction of 5–7 digits).
In conclusion, the current study showed that, whereas shallow processing and rehearsal-related areas supported active maintenance of one item in focal attention, the behavioral processes and neural substrates that support LTM also supported recall of one item after it was displaced from focal attention. That the same “semantic” areas activated at encoding are reactivated at retrieval parallels the finding that, when words are initially presented auditorily, later recognition again activates auditory cortex; a similar pattern of reactivation applies for visual presentation and occipital cortex (Vaidya et al., 2002; Nyberg et al., 2000). These findings support hypotheses suggesting that retrieval from memory involves a reinstatement of the same features that were activated at encoding. Therefore, we argue that the current results reflect a neural instantiation of a processing view of the WM/LTM distinction inspired by the cognitive principles of LOP and encoding specificity. That is, the extent to which WM and LTM appear similar or dissimilar in terms of cognitive processes (e.g., LOP) and neural substrates (e.g., MTL) depends on the degree to which initial processing and subsequent retrieval converge or diverge (Rose & Craik, 2012).
This research was supported by grants from the Natural Sciences and Engineering Research Council of Canada (grant nos. 8261 to F. I. M. C. and 386631 to B. R. B.). We thank Sabrina Lemire-Rodger and Ashley Bondad for assistance with data collection.
Reprint requests should be sent to Nathan S. Rose, Department of Psychiatry, University of Wisconsin-Madison, 6001 Research Park Blvd., Madison, WI 53719, or via e-mail: firstname.lastname@example.org.