Refreshing is the component cognitive process of directing reflective attention to one of several active mental representations. Previous studies using fMRI suggested that refresh tasks involve a component process of initiating refreshing as well as the top–down modulation of representational regions central to refreshing. However, those studies were limited by fMRI's low temporal resolution. In this study, we used EEG to examine the time course of refreshing on the scale of milliseconds rather than seconds. ERP analyses showed that a typical refresh task does have a distinct electrophysiological response as compared to a control condition and includes at least two main temporal components: an earlier (∼400 msec) positive peak reminiscent of a P3 response and a later (∼800–1400 msec) sustained positivity over several sites reminiscent of the late directing attention positivity. Overall, the evoked potentials for refreshing representations from three different visual categories (faces, scenes, words) were similar, but multivariate pattern analysis showed that some category information was nonetheless present in the EEG signal. When related to previous fMRI studies, these results are consistent with a two-phase model, with the first phase dominated by frontal control signals involved in initiating refreshing and the second by the top–down modulation of posterior perceptual cortical areas that constitutes refreshing a representation. This study also lays the foundation for future studies of the neural correlates of reflective attention at a finer temporal resolution than is possible using fMRI.
Recently, interest has grown in studying the similarities and differences between two types of attention: externally directed or perceptual attention and internally directed or reflective attention (Johnson et al., 2005; for review, see Chun, Golomb, & Turk-Browne, 2011; Chun & Johnson, 2011). These two types of attention involve activity in highly overlapping networks of brain regions related to executive function and have similar modulatory effects on posterior areas of cortex related to perceptual processing (e.g., Johnson & Johnson, 2009a; Johnson, Mitchell, Raye, D'Esposito, & Johnson, 2007; Lepsien & Nobre, 2007; Wojciulik, Kanwisher, & Driver, 1998). Although reflective attention, as a means of limiting and shaping information flow, is as central to the study of thought as perceptual attention is to the study of the senses, difficulties controlling or even ascertaining the target of reflective attention in the lab—versus the relative ease of providing a controlled perceptual environment—pose special challenges for reflective attention research.
One way of addressing such challenges is to focus on relatively simple, constrained reflective processes such as refreshing: the act of thinking of or foregrounding one of several active mental representations via reflective attention, similar to highlighting one of several present sensory stimuli via perceptual attention (Johnson, Reeder, Raye, & Mitchell, 2002). Refreshing is thought to be a key process for selecting, maintaining, and manipulating information within working memory (Chun & Johnson, 2011). It is proposed to be different from rehearsing in that rehearsing typically involves recycling multiple items over several seconds or minutes via a phonological looping process (Baddeley, 2012). A typical task for studying refreshing might begin by displaying one to three items (e.g., words, pictures, or other stimuli), followed by a short delay (e.g., 400–1500 msec) and then a cue indicating that the participant should think back to one item (e.g., verbalize a cued word or visualize a cued picture, depending on modality). Neuroimaging investigations have shown that refreshing reliably activates left dorsolateral pFC (DLPFC; Johnson et al., 2005) and parietal cortex (Raye, Mitchell, Reeder, Greene, & Johnson, 2008; Raye, Johnson, Mitchell, Reeder, & Greene, 2002) and is capable of both enhancing and suppressing activity in high-level representational areas in visual cortex (Johnson & Johnson, 2009a). Baddeley (2012, p. 23) has suggested that refreshing may underlie the visual–spatial sketch pad and/or maintenance in the episodic buffer in his model of working memory.
This would be consistent with evidence that refreshing is not specific to modality of input (e.g., can occur for either visual or auditory information; Johnson et al., 2005, Experiment 4) and the suggestion that refreshing could operate not only on information that has just been perceived but also on information that is being reflectively rehearsed; thus, refreshing may be a critical component in tasks that require manipulation such as updating (e.g., n-back, Cohen et al., 1997) or alphabetizing (D'Esposito, Postle, Ballard, & Lease, 1999). Refreshing has been referred to as a “minimal” executive process (Raye, Johnson, Mitchell, Greene, & Johnson, 2007), but the brain activity associated with refreshing can vary depending on task demands. For example, increasing the number of potential candidates for refreshing increases activity in ACC (Raye et al., 2008; Johnson et al., 2005).
Although refreshing—a single, brief instance of directing reflective attention—is one of the simplest executive functions a participant might be asked to perform, its operationalization in experimental task paradigms may invoke additional reflective component processes (Johnson, 1992). For example, in addition to the theoretical component process of refreshing (the mental foregrounding of a particular representation, with concomitant enhancement of appropriate brain activity patterns), a refresh task procedure may require participants to initiate (i.e., switch between) tasks. Comparisons with other task conditions can help distinguish these processes: In one fMRI experiment, Raye and colleagues (2007, Experiment 1) compared a Refresh task condition to two control conditions, one in which participants read a novel word (Read) and one in which participants saw a square onscreen that cued them to press a button (Act). As in previous studies, Raye and colleagues found greater activity in DLPFC associated with refreshing than either control condition. In addition, an area of anterior pFC was equally active for the Act and Refresh conditions but exhibited little activity for the Read condition (Figure 1). The authors concluded that the anterior pFC area was likely responsible for initiating a nonautomatic action based on a cue, as this was the major commonality between Refresh and Act, whereas Reading the word was mostly automatic. This interpretation dovetails with the proposed role of anterior pFC (also known as frontopolar cortex) in subgoal management and cognitive branching (e.g., Koechlin & Hyafil, 2007; Braver & Bongiolatti, 2002) and in task initiation (Koshino et al., 2011).
As noted above, refreshing a stimulus such as a face or scene also modulates activity in extrastriate cortical regions selective for the category in question (Johnson & Johnson, 2009a; Johnson et al., 2007), in agreement with the idea that memory representations of sensory percepts are maintained by reinstantiating activity patterns from when they were originally perceived (Pasternak & Greenlee, 2005; Ranganath & D'Esposito, 2005; Curtis & D'Esposito, 2003; Postle, Druzgal, & D'Esposito, 2003; Ruchkin, Grafman, Cameron, & Berndt, 2003). These results, coupled with the refresh-related activity observed in anterior pFC, DLPFC, and other areas, suggest a two-phase model of refresh tasks in which a frontal (and/or parietal) control signal first initiates the component cognitive process refreshing, which subsequently manifests as modulated activity in posterior representational areas. This hypothesized sequence of neural and cognitive events within a short (<2 sec) act of refreshing occurs too quickly to be detected easily with fMRI, but EEG measures neural activity on the scale of milliseconds. Thus, we probed a refresh task using EEG to determine whether ERPs associated with refresh events might indeed be further broken down into two (or more) distinct, temporally defined subcomponents.
We presented participants with pairs of face, scene, or word stimuli, followed by a cue to either refresh one of the stimuli, press a button, or do nothing. Our primary aims were to determine (1) whether refreshing had an ERP signature that could be distinguished from control conditions, (2) whether refresh-related ERPs could be divided into temporal subcomponents, and (3) whether the refresh response was significantly modulated by or contained measurable information about the category of the refreshed item, as had previously been shown using fMRI.
Twenty-one right-handed, self-reported healthy young adults (nine men, mean age = 21.9 years, SD = 2.5 years) with normal or corrected-to-normal vision participated in exchange for compensation. Procedures were approved by the Yale University institutional review board. Six additional participants also took part in the study, but their data sets were rejected because of poor fixation, excessive sleepiness, or more than 50% of their trials meeting rejection criteria due to movements or blinking (see below).
On each trial (Figure 2), a white central fixation point against a black background (750 msec) first signaled the start of the trial. Participants were asked to maintain fixation on this point throughout each trial without blinking. Then, two stimuli of the same category (either two faces, two scenes, or two words) appeared above and below fixation (1500 msec). Next, a 500-msec delay (with only the fixation point shown) was followed by a 1500-msec cue that could be (1) a Refresh cue: a white arrow pointing up or down, indicating that participants should briefly refresh (think back to, visualize) the stimulus just presented in the upper or lower position; (2) a NoAct cue: the white central fixation dot turning bright green, indicating that participants need not do anything at all; or (3) an Act cue: the white central fixation dot both turning green and growing larger, indicating that participants should press a button with their right index fingers. Lastly, the central fixation point was presented alone again (750 msec), indicating that the trial was nearly over. After that, the screen went entirely black for an intertrial interval of 2500 msec, during which participants could blink freely, before the next trial began.
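The trial timeline described above can be laid out as a small sketch. The durations come from the text; the phase names are hypothetical labels chosen only for this illustration.

```python
# Trial phases and durations in msec, taken from the task description.
# The phase names are hypothetical labels chosen for this sketch.
TRIAL_PHASES_MS = [
    ("fixation", 750),
    ("stimulus_pair", 1500),
    ("delay", 500),
    ("cue", 1500),
    ("final_fixation", 750),
    ("intertrial_blank", 2500),
]


def phase_onsets(phases):
    """Return ({phase: onset in msec from trial start}, total trial duration)."""
    onsets = {}
    elapsed = 0
    for name, duration in phases:
        onsets[name] = elapsed
        elapsed += duration
    return onsets, elapsed


onsets, trial_duration_ms = phase_onsets(TRIAL_PHASES_MS)
stimulus_to_cue_ms = onsets["cue"] - onsets["stimulus_pair"]
print(onsets["cue"], stimulus_to_cue_ms, trial_duration_ms)
```

Under these durations the cue begins 2750 msec into the trial, 2000 msec after the stimulus pair first appears, and one full trial cycle (including the blank interval) lasts 7500 msec.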
The Refresh condition was similar to tasks and instructions used in previous fMRI studies of refreshing (Johnson & Johnson, 2009a; Johnson et al., 2007). Postexperimental surveys after these types of studies typically indicate that participants understand the instructions, comply, and do not report engaging in additional processes beyond refreshing. Participants did not complete such a survey for Experiments 1 and 2 of this study, but for a similar, contemporaneous EEG study involving a Refresh condition for scene stimuli, participants (n = 19) responded to a postexperiment survey as follows. “How easy did you find it to think back to each scene picture when you saw the arrow?”: M = 4.3, SD = 2.2 on a 0–10 scale (10 = very difficult). “Rate how vivid your mental image was when you saw the arrow and had to think back to a scene”: M = 5.8, SD = 1.7 on a 0–10 scale (10 = incredibly vivid). “…what percentage of each scene picture would you say you were able to mentally revisualize?”: M = 57%, SD = 18%. For the free response question, “Do you recall using any strategies to think back to the scenes when you saw the arrow?” most participants did not report a specific strategy (e.g., “Not really—not enough time to strategize,” “Not really. Just tried to concentrate,” “Just tried to ‘see’ the image again”), aside from several noting that they tended to focus on the most salient or striking elements of the stimulus, and some others reporting that they sometimes automatically associated stimuli with a verbal label (e.g., “college apartment,” “red mountain”) or a feeling/memory from their past.
There were 240 trials, divided into four runs of 60 trials each. There were equal numbers of trials using face, scene, and word stimuli. Fifty percent of all trials ended with a Refresh cue, 40% with a NoAct cue, and 10% with an Act cue. Thus, participants received a total of 120 Refresh trials (40 each of faces, scenes, and words), 24 Act trials, and 96 NoAct trials, pseudorandomly intermixed. Whereas Raye et al. (2007; see Figure 1) had used an Act condition as a primary comparison to a Refresh condition, we were concerned that large potentials from preparing and executing the button press would dominate the Act ERPs, making it a poor comparison condition. (This was not a concern in fMRI studies, as the spatial dispersion of fMRI activity is much lower than that of scalp potentials; thus button-press-related activity during Act would not spread to other brain regions during fMRI in the same way that motor-related potentials can be recorded from many distant scalp sites during EEG.) However, we did not use NoAct as the sole comparison condition because this may have given participants too little to do on NoAct trials, potentially leading to mind-wandering or unintentional refreshing. Thus, we included both Act and NoAct trials to introduce some ambiguity to the meaning of the NoAct cue and induce participants to process the Act and NoAct cues more fully than if there were only a single “non-Refresh” condition. Fewer Act trials were included because the primary purpose of the Act condition was to facilitate this ambiguity; our primary comparison was between the Refresh and NoAct conditions.
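As a quick check on the design arithmetic above, the following sketch recomputes the per-condition trial counts from the stated proportions. The variable and function names are illustrative assumptions, not part of the original experiment code.

```python
# Design parameters as stated in the text.
N_TRIALS = 240
N_RUNS = 4
CUE_PROPORTIONS = {"Refresh": 0.50, "NoAct": 0.40, "Act": 0.10}
CATEGORIES = ("face", "scene", "word")


def trial_counts(n_trials=N_TRIALS, proportions=CUE_PROPORTIONS):
    """Number of trials ending with each cue type."""
    return {cue: round(n_trials * p) for cue, p in proportions.items()}


counts = trial_counts()
refresh_per_category = counts["Refresh"] // len(CATEGORIES)
trials_per_run = N_TRIALS // N_RUNS
print(counts, refresh_per_category, trials_per_run)
```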
All face and scene stimuli were color pictures measuring 300 × 300 pixels. Faces were forward-facing complete head shots of young and older men and women (in equal proportions) with neutral or pleasant facial expressions, drawn from a database developed by Minear and Park (2004). Scenes were indoor and outdoor (in equal proportions) pictures of landscapes, buildings, and interior rooms in a wide variety of settings, drawn from a number of sources (mostly freely available images from the Internet). Words were chosen from a set of everyday, neutrally valenced, one- to three-syllable nouns, presented in a bold white font that could be easily read even while maintaining central fixation. Stimuli were counterbalanced across participants with regard to the condition and run in which they appeared, and on Refresh trials, the stimulus to be refreshed occurred equally often on the top and bottom for each category. Every face, scene, and word stimulus was used exactly once per participant.
Data Acquisition and Analysis
Scalp potentials were recorded from a 32-channel EEG cap using a nose reference. Channels for horizontal and vertical EOG were also included to monitor eye movements and blinks. Signals were recorded with a gain of 10,000 and a bandpass filter (−3 dB) of 0.01–100 Hz and continuously digitized and stored with 14-bit precision and a 250-Hz sampling rate. Electrodes included a 31-channel subset of the international 10/20 and 10/10 systems and were positioned and labeled according to the conventions of those systems. Electrode impedances were kept below 5 kΩ.
Analyses were focused on the final cue period when either a Refresh, NoAct, or Act cue was presented. For each trial, a 2100-msec signal epoch was extracted for each channel (the 1500 msec that the cue was onscreen, as well as a 100-msec preonset baseline period and a 500-msec postoffset period). After each epoch was extracted, all channels were linearly detrended, and artifact rejection was performed. Trials were rejected if the peak-to-peak amplitude of any EEG channel exceeded 150 μV after linear detrending or if any EEG channel contained a flat period of more than 75 msec, which generally indicated amplifier clipping caused by excessive movement. Next, signal correlated with each of the EOG channels was regressed out of all EEG channels to remove any residual influence from small eye movements that may have occurred on trials that survived blink artifact rejection. As noted above, participants who had more than 50% of their trials rejected because of these criteria were not included in later analyses. The mean voltage from the 100 msec preceding the cue period was treated as a baseline and subtracted from the entire epoch to ensure that the signal at each cue onset began at approximately 0 μV.
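The preprocessing steps in this paragraph could be sketched roughly as follows, using NumPy. This is an illustration under stated assumptions rather than the original pipeline: the function names are hypothetical, flat periods are checked on the raw (pre-detrend) signal, a flat period is taken to be a run of identical consecutive samples, and the EOG regression is an ordinary least-squares fit with an intercept.

```python
import numpy as np

FS = 250                      # sampling rate (Hz), from the text
PRE_MS, POST_MS = 100, 2000   # pre-cue baseline; cue (1500) + post-offset (500)
N_SAMPLES = (PRE_MS + POST_MS) * FS // 1000 + 1   # both endpoints included
PTP_LIMIT_UV = 150.0          # peak-to-peak rejection threshold
FLAT_LIMIT_MS = 75            # longest tolerated flat stretch


def linear_detrend(x):
    """Subtract a least-squares line from each row of x (channels x samples)."""
    t = np.arange(x.shape[1], dtype=float)
    design = np.column_stack([t, np.ones_like(t)])
    coef, *_ = np.linalg.lstsq(design, x.T, rcond=None)
    return x - (design @ coef).T


def longest_flat_run(row):
    """Number of samples in the longest run of identical consecutive values."""
    longest = run = 1
    for same in np.diff(row) == 0:
        run = run + 1 if same else 1
        longest = max(longest, run)
    return longest


def is_artifact(raw, detrended):
    """Apply both rejection rules; flatness is checked on the raw signal (assumption)."""
    ptp = detrended.max(axis=1) - detrended.min(axis=1)
    flat_limit = FLAT_LIMIT_MS * FS / 1000          # 18.75 samples
    too_flat = any(longest_flat_run(ch) > flat_limit for ch in raw)
    return bool((ptp > PTP_LIMIT_UV).any() or too_flat)


def regress_out_eog(eeg, eog):
    """Remove the part of each EEG channel linearly predicted by the EOG channels."""
    design = np.column_stack([eog.T, np.ones(eog.shape[1])])
    coef, *_ = np.linalg.lstsq(design, eeg.T, rcond=None)
    return eeg - (eog.T @ coef[:-1]).T


def baseline_correct(eeg):
    """Subtract the mean of the pre-cue baseline from every channel."""
    n_base = PRE_MS * FS // 1000
    return eeg - eeg[:, :n_base].mean(axis=1, keepdims=True)


def preprocess_epoch(eeg, eog):
    """Return (cleaned epoch or None, rejected flag) for one trial."""
    detrended = linear_detrend(eeg)
    if is_artifact(eeg, detrended):
        return None, True
    cleaned = regress_out_eog(detrended, linear_detrend(eog))
    return baseline_correct(cleaned), False
```

The order of operations follows the paragraph: detrend, reject, regress out EOG, then subtract the baseline, so the final step leaves each channel's pre-cue mean at 0 μV.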
For standard ERP analyses, the nonrejected trials for each condition/participant/channel were collapsed to create a participant average. These were, in turn, smoothed slightly using a 5-point moving average and then collapsed across participants to create a grand average. (These grand averages were then smoothed again with a 5-point moving average for display purposes in Figures 3, 5, and 7, but this did not affect data analysis.) Standard parametric statistics (e.g., repeated-measures ANOVAs) were conducted to determine time points where the ERPs from each condition significantly differed from one another. A false discovery rate (FDR) correction was used to adjust for multiple comparisons (Genovese, Lazar, & Nichols, 2002; Benjamini & Hochberg, 1995). See Results for further details.
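A minimal sketch of the averaging and smoothing steps, assuming NumPy arrays shaped (trials, channels, samples). The text does not say how the moving average treats the first and last samples; this sketch shrinks the window at the edges, which is an assumption, and the function names are hypothetical.

```python
import numpy as np


def moving_average(x, width=5):
    """Centered boxcar along the last axis; the window shrinks at the edges (assumption).

    Assumes the last axis has at least `width` samples.
    """
    kernel = np.ones(width)

    def smooth(row):
        total = np.convolve(row, kernel, mode="same")
        count = np.convolve(np.ones_like(row), kernel, mode="same")
        return total / count

    return np.apply_along_axis(smooth, -1, np.asarray(x, dtype=float))


def participant_average(trials):
    """(n_trials, n_channels, n_samples) -> smoothed (n_channels, n_samples) ERP."""
    return moving_average(np.mean(trials, axis=0))


def grand_average(participant_erps):
    """Average the per-participant ERPs across participants."""
    return np.mean(np.stack(participant_erps), axis=0)
```

For display only, `moving_average(grand_average(...))` would give the second smoothing pass mentioned in the parenthetical, leaving the values used for statistics untouched.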
We also fed trial-by-trial ERP data from refreshing faces, scenes, and words into a multivariate pattern analysis (MVPA; Norman, Polyn, Detre, & Haxby, 2006; Haxby et al., 2001) to see if it could reliably classify the category being refreshed. The same analysis was performed for the NoAct condition (attempting to classify the stimulus category that was initially presented on NoAct trials, although participants were not refreshing during those trials) to determine whether successful classification was specific to refreshing. We used values that were preprocessed as detailed above, but also binned them into 40-msec bins to reduce the total number of features fed into the classifier. Only ERP data from the 1500-msec period that the Refresh (or NoAct) cue was actually onscreen were used for MVPA. To perform the classification, for a given participant, we first determined which subcondition (initial presentation of faces, scenes, or words for either the Refresh or NoAct condition) had the smallest number of acceptable trials. That number was rounded down to the nearest multiple of 5. That number of trials was then randomly selected from each condition and divided into training and test sets, with 4/5 of the trials being used for training and 1/5 for testing. This process was iterated 500 times for each analysis, with a different random sampling of trials used in each iteration.
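The binning and trial-balancing steps might look something like the sketch below. It assumes NumPy arrays with time on the last axis and a dictionary mapping each subcondition to its trials; those data structures and the function names are illustrative assumptions, and the classifier itself is not shown.

```python
import numpy as np

FS = 250        # sampling rate (Hz), from the text
BIN_MS = 40     # bin width (msec), from the text


def bin_time(epochs, bin_ms=BIN_MS, fs=FS):
    """Average consecutive samples into bins along the last axis.

    Leftover samples that do not fill a whole bin are dropped (assumption).
    """
    per_bin = bin_ms * fs // 1000                 # 10 samples per bin
    n_bins = epochs.shape[-1] // per_bin
    trimmed = epochs[..., : n_bins * per_bin]
    return trimmed.reshape(*epochs.shape[:-1], n_bins, per_bin).mean(axis=-1)


def balanced_split(trials_by_condition, rng):
    """Draw equal trial counts per condition and split them 4/5 train, 1/5 test."""
    n_min = min(len(t) for t in trials_by_condition.values())
    n_use = (n_min // 5) * 5                      # round down to a multiple of 5
    n_test = n_use // 5
    train, test = {}, {}
    for name, trials in trials_by_condition.items():
        picked = rng.permutation(len(trials))[:n_use]
        test[name] = trials[picked[:n_test]]
        train[name] = trials[picked[n_test:]]
    return train, test
```

Calling `balanced_split` 500 times with the same `rng` would give a different random sampling on each iteration, as described. With a 1500-msec cue period sampled at 250 Hz (375 samples), `bin_time` yields 37 bins per channel.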
For classification, we used sparse multinomial logistic regression (Krishnapuram, Figueiredo, Carin, & Hartemink, 2005), which attempts to obtain a sparse classifier during training. This removes the need for a separate feature selection step and is useful in exploratory analyses where it is not known a priori which features (i.e., EEG channels/time points) are likely to be most informative. After classification, the sparse multinomial logistic regression algorithm yielded a matrix of scores indicating, for each trial, the classifier's confidence that the trial belonged to each of the three conditions. These scores were used to calculate receiver operating characteristic (ROC) curves and the area under the ROC curve (AUC) for each condition and each participant. Finally, AUCs were collapsed across condition and classifier iteration to yield a single AUC value for each participant, indicating the classifier's overall accuracy at distinguishing among the three conditions for that participant. These AUC values could range from 0 to 1, with chance equaling 0.5; although there were three conditions being classified, the final AUC was calculated from an average of per-condition ROC curves, hence chance equaling 0.5 rather than 0.33. These AUC scores were then subjected to traditional group statistics (e.g., t tests against chance).
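To illustrate why chance is 0.5 even with three classes, here is a small sketch that computes a one-versus-rest AUC for each condition from a matrix of classifier scores and then averages them. It does not implement sparse multinomial logistic regression, and averaging per-condition AUC values is a simplification of the per-condition ROC averaging described above; the function names and score layout are assumptions.

```python
def auc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(scores_pos) * len(scores_neg))


def mean_one_vs_rest_auc(scores, labels):
    """Average one-vs-rest AUC over classes.

    scores: one row per trial, one score per class.
    labels: true class index for each trial.
    """
    n_classes = len(scores[0])
    aucs = []
    for c in range(n_classes):
        pos = [row[c] for row, y in zip(scores, labels) if y == c]
        neg = [row[c] for row, y in zip(scores, labels) if y != c]
        aucs.append(auc(pos, neg))
    return sum(aucs) / n_classes
```

Each per-class AUC is a two-way comparison (this class versus the rest), so an uninformative classifier scores 0.5 on each, and the average is also 0.5.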
The MVPA classifications described above were initially performed for both the Refresh and NoAct conditions for the full time (1500 msec) that the corresponding cue was onscreen and then separately for the initial time period that the cue was onscreen (0–800 msec postcue) and a later portion of the cue period (800–1400 msec postcue). They were also performed separately at each 40 msec time bin to examine the time course of classification performance for Refresh and NoAct. See Results for further details.
ERPs for Refresh, Act, and NoAct
ERPs for representative EEG electrodes are shown in Figure 3 collapsed across the category (faces, scenes, or words) initially presented. As noted in Methods, a separate repeated-measures ANOVA was run at each time point (526 time points, for a 2100-msec epoch acquired at 250 Hz) and channel (31 channels of EEG data), yielding a total of 16,306 p values. These were subjected to an FDR correction at q = 0.05, and those channels/time points that survived the correction are shown in bold in the ERP plots of Figure 3.
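The count of tests and the correction step can be sketched as follows. The FDR function is a plain Benjamini-Hochberg step-up procedure written for illustration; the original analysis code is not given in the text, so the function name and array handling are assumptions.

```python
import numpy as np

FS = 250                  # sampling rate (Hz)
EPOCH_MS = 2100           # epoch length (msec)
N_CHANNELS = 31           # EEG channels analyzed
N_TIMEPOINTS = EPOCH_MS * FS // 1000 + 1   # 526, counting both endpoints
N_TESTS = N_TIMEPOINTS * N_CHANNELS        # 16,306 p values


def fdr_bh(pvals, q=0.05):
    """Benjamini-Hochberg: boolean mask of p values significant at FDR level q."""
    p = np.asarray(pvals, dtype=float).ravel()
    m = p.size
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    significant = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.nonzero(passed)[0].max()        # largest rank meeting its threshold
        significant[order[: k + 1]] = True
    return significant.reshape(np.shape(pvals))
```

In use, the p values from every channel and time point would be passed in as one array (for example, shaped 31 × 526), and the returned mask marks the points to draw in bold.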
Only Figure 3A–B shows ERPs in the Act condition, for illustrative purposes; in general, Act ERPs progressed similarly to the NoAct condition for the first ∼400 msec postcue but then continued to grow in amplitude to reach a substantially higher positive peak at ∼600 msec postcue, likely because of the button press response, the substantially lower frequency of Act trials (making it an “oddball” in some sense), and/or greater temporal synchrony in neural responses to Act trials. This confirmed our expectation that Act is a less appropriate condition for comparison to Refresh in ERP than in fMRI analyses. Thus, because the numbers of Refresh and NoAct trials were more nearly equal and neither required a motor response, the analyses reported below (and illustrated in Figure 3C–F) compare only the Refresh and NoAct conditions.
At a number of electrodes, both the Refresh and NoAct conditions showed a large positive peak at approximately 400–500 msec after cue onset, but with the Refresh response peaking earlier than the NoAct response. These effects are illustrated for Fz and Pz in Figure 3A and B; similar patterns were found at F3, F7, FCz, FC3, FT7, Cz, T3, and CPz (not shown). These central and left-lateralized frontal sites showed positive peak responses that were similar in amplitude (but different in latency) in the Refresh and NoAct conditions, perhaps associated with initiating the appropriate response to the onscreen cue in the two conditions.
The response at several other electrodes exhibited a similar positive peak for both Refresh and NoAct, but in addition to a faster latency for Refresh trials, there was also a higher amplitude peak for Refresh versus NoAct. This pattern is shown for the F4, CP4, T6, and O2 electrodes in Figure 3C–F; similar patterns were found at F8, FC4, FT8, C4, T4, TP7, TP8, T5, P3, P4, O1, and Oz (see also Figure 4C). One possible hypothesis arising from this finding is that the greater amplitude in these right frontal and bilateral parietal, temporal, and occipital electrodes could reflect the onset of top–down modulation signals that are unique to the Refresh condition.
Consistent with previous fMRI findings that refreshing modulates activity in posterior representational regions, we observed another set of differences between Refresh and NoAct ERPs, primarily in more posterior electrodes on the right side, arising later during the cue period. The T6 electrode in Figure 3E and the O2 electrode in Figure 3F show this effect most clearly: a sustained positivity for Refresh (relative to NoAct, which hovered around the 0 μV baseline) that reached multiple-comparison-corrected significance at several points between approximately 800 and 1400 msec after the onset of the cue. Other electrodes showing similar, but somewhat weaker, patterns were FC4, TP8, and O1 (see also Figure 4D).
Figure 4A–B summarizes the electrodes and time points that showed significant differences between the Refresh and NoAct conditions (after FDR multiple-comparison correction). Many electrodes showed significant differences at the earlier (∼400 msec postcue) large peak of the Refresh response and then again (∼600–800 msec postcue) as that response returned to baseline faster than the NoAct response. Fewer significant differences were found in the later period (∼800–1400 msec postcue) of sustained Refresh positivity observed at the electrodes noted above, but some additional significant differences, all showing greater positivity for Refresh than NoAct, did appear at electrodes Oz, C4, and FT8 for at least two consecutive time points within that window. All in all, 3356 time point–electrode combinations (out of 16,306), or about 20.6%, showed a significant difference between the Refresh and NoAct conditions at the FDR-corrected threshold (q = 0.05). Figure 4C–D also shows scalp distributions for the subtraction Refresh − NoAct at the periods of interest discussed above: Figure 4C shows the distribution of the amplitude difference at the early positive peak (adjusted for the latency difference between Refresh and NoAct, by taking each condition's peak voltage anywhere in the period 300–600 msec postcue at each electrode), and Figure 4D shows the distribution of the overall difference during the later period of sustained Refresh positivity (by taking each condition's mean voltage over the period 800–1400 msec postcue at each electrode).
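The two summary measures behind the scalp distributions (peak difference in the 300–600 msec window, mean difference in the 800–1400 msec window) could be computed along these lines. The sketch assumes NumPy arrays with time on the last axis and an epoch that starts 100 msec before cue onset; the function names are hypothetical.

```python
import numpy as np

FS = 250        # sampling rate (Hz)
PRE_MS = 100    # epoch starts this long before cue onset


def window(start_ms, stop_ms, fs=FS, pre_ms=PRE_MS):
    """Slice of sample indices covering start_ms..stop_ms postcue (inclusive)."""
    first = (start_ms + pre_ms) * fs // 1000
    last = (stop_ms + pre_ms) * fs // 1000
    return slice(first, last + 1)


def peak_difference(refresh, noact, start_ms=300, stop_ms=600):
    """Refresh minus NoAct peak voltage, each peak taken anywhere in the window."""
    w = window(start_ms, stop_ms)
    return refresh[..., w].max(axis=-1) - noact[..., w].max(axis=-1)


def mean_difference(refresh, noact, start_ms=800, stop_ms=1400):
    """Refresh minus NoAct mean voltage over the later sustained period."""
    w = window(start_ms, stop_ms)
    return refresh[..., w].mean(axis=-1) - noact[..., w].mean(axis=-1)
```

Taking each condition's own maximum within the window is what makes the peak measure insensitive to the latency difference between Refresh and NoAct.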
In addition to the ∼400 msec positive peak and ∼800–1400 msec later period, differences between the Refresh and NoAct conditions were also observed in the very early or very late portions of the time window (i.e., less than 250 msec postcue, or after the offset of the cue at 1500 msec postcue). Namely, in several electrodes, there was a more pronounced and/or earlier latency negative peak at ∼200 msec postcue for the Refresh condition compared to NoAct, and an earlier latency cue offset response for Refresh than NoAct. Given the timing of these ERPs, it is likely that they were primarily sensory responses related to the onset and offset of the cue stimuli and were not directly related to the cognitive process of refreshing per se. However, to confirm that our primary effects of interest were not driven by low-level sensory differences between the Refresh and NoAct cues, we conducted Experiment 2 (see below), focusing on replicating the primary effects of interest at ∼400 msec and between 800 and 1400 msec, while using more similar cues for the Refresh and NoAct conditions to better equate low-level sensory responses.
ERPs for Refreshing Faces, Scenes, and Words
To determine if refresh ERPs differed by stimulus category, we split all Refresh epochs based on whether the refreshed stimuli were faces, scenes, or words. Figure 5 shows ERPs for a set of representative electrodes. Qualitatively, it is clear that the three categories of Refresh responses track together much more closely than the Refresh and NoAct responses, with no obvious pattern of differences among the three Refresh subconditions. In fact, no time point–electrode combinations survived an FDR correction, so time points plotted in bold in Figure 5 only differed at an uncorrected threshold of p < .05.
However, based on fMRI findings of differences in brain activity patterns depending on what is being refreshed (Johnson & Johnson, 2009b; Johnson et al., 2005; Johnson, Raye, Mitchell, Greene, & Anderson, 2003), we hypothesized that the entire pattern of scalp activity might contain enough information to afford above-chance category decoding. Thus, we conducted the MVPA described in the Methods. Across participants, the mean AUC for decoding the category refreshed during the full 1500-msec cue period (Figure 6A, left pair of bars, in blue) was 0.540. Although not numerically far above chance (0.5), the difference was statistically significant (t(20) = 3.48, p = .0024, two-tailed t test against chance) and the effect size (Cohen's d = 0.76) indicated a medium-to-large effect. To confirm that this result was not due to bias in our algorithm, we ran the same analysis but shuffled the labels of the conditions randomly before classifying, which should yield chance performance. Indeed, the shuffled classification did not differ from chance (mean AUC = 0.499, t(20) = 0.82, p = .42) but did differ from the nonshuffled analysis (t(20) = 3.60, p = .0018, two-tailed paired t test).
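The relation between the reported t statistic and effect size can be checked with a short sketch. For a one-sample test against chance, Cohen's d is the mean difference divided by the sample standard deviation, so d = t / √n. The function names are illustrative, and no p value is computed here because that would need a t distribution from a statistics library.

```python
import math
import statistics


def one_sample_t(values, chance=0.5):
    """Return (t, Cohen's d, degrees of freedom) for a test of the mean against chance."""
    n = len(values)
    mean_diff = statistics.fmean(values) - chance
    d = mean_diff / statistics.stdev(values)     # sample SD (n - 1 denominator)
    return d * math.sqrt(n), d, n - 1


def d_from_t(t, n):
    """Recover one-sample Cohen's d from a t statistic and the sample size."""
    return t / math.sqrt(n)


print(round(d_from_t(3.48, 21), 2))
```

With t(20) = 3.48 and n = 21 participants this gives d ≈ 0.76, matching the value reported above.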
We also performed the same analysis for NoAct, to determine whether successful category decoding was specific to refreshing. NoAct classification was also significantly above chance (mean AUC = 0.526, t(20) = 2.21, p = .039; Figure 6A, left pair of bars, in green). Although this was numerically worse than Refresh, the difference between Refresh and NoAct classification during the full 1500-msec cue period was not significant (t(20) = 0.88, p = .39).
However, given that our ERP analyses (Figure 3) had shown differences between Refresh and NoAct at two separate periods (early and late) within the overall cue period, we hypothesized that NoAct category classification might be driven exclusively by activity in the earlier period (e.g., by lingering perceptual activity from the initial two-stimulus display and/or participants' inadvertently beginning to refresh on some NoAct trials before processing the NoAct cue and realizing that they did not have to). Thus, we repeated the Refresh and NoAct MVPA separately for the earlier (0–800 msec postcue) and later (800–1400 msec postcue) periods. As predicted, during the earlier period (Figure 6A, middle pair of bars), both Refresh category classification (mean AUC = 0.534, t(20) = 3.81, p = .0011) and NoAct category classification (mean AUC = 0.531, t(20) = 2.49, p = .022) differed from chance, but not from each other (t(20) = 0.28, p = .78). However, during the later period (Figure 6A, right pair of bars), Refresh category classification remained above chance (mean AUC = 0.536, t(20) = 3.40, p = .0028), but NoAct category classification dropped to chance (mean AUC = 0.502, t(20) = 0.14, p = .89), and Refresh classification was significantly better than that of NoAct (t(20) = 2.31, p = .032).
We also ran the same MVPA for both Refresh and NoAct separately at each 40 msec time bin to plot the time course of category classification performance. As shown in Figure 6B, at individual time points early in the cue period, category classification for both conditions was somewhat above chance, with no separation between Refresh and NoAct performance. However, at ∼700 msec postcue, NoAct classification dropped to near chance whereas Refresh classification remained high. When we tested for significant differences in performance (paired t tests at each time bin), no time points survived an FDR correction; however, Refresh category classification was significantly better than NoAct category classification at an uncorrected p threshold of .05 at 5 time points (out of 37 in the entire cue period), all between 700 and 1200 msec postcue.
As noted above, Experiment 2 was conducted to replicate the major effects observed in Experiment 1, but using more similar cues for the Refresh and NoAct conditions, to eliminate the possibility that low-level sensory differences might be driving our effects. All methods were the same as in Experiment 1, except where stated below.
Sixteen right-handed, self-reported healthy young adults (eight men, mean age = 22.4 years, SD = 2.9 years) with normal or corrected-to-normal vision participated in the study. Four additional participants were excluded according to the same criteria as in Experiment 1.
The task was identical to that used in Experiment 1, except that we changed the NoAct condition's cue stimulus. Whereas the NoAct cue was a small green dot in Experiment 1, in Experiment 2 it was a white arrow—identical to the arrow used in Refresh trials, except that the arrow pointed left or right instead of up or down. Given that the faces, scenes, and words presented initially on each trial were above and below fixation, participants were instructed to refresh the indicated item if the arrow pointed up or down and do nothing if the arrow pointed left or right (i.e., not toward the previous location of a stimulus). The presentation of the Refresh and Act cues as well as the proportion of different trial types, the face/scene/word stimuli used, and the order in which stimuli/conditions were presented were all the same as in Experiment 1.
ERPs for Refresh versus NoAct
For this replication, we focused on differences between the Refresh and NoAct conditions at the electrodes/time points where the most notable differences were found in Experiment 1. Figure 7A–D show the Refresh and NoAct ERPs for the same electrodes illustrated in Figure 3C–F from Experiment 1, with a similar pattern of results; in fact, all four electrodes in Figure 7A–D (F4, CP4, T6, O2) exhibited differences between Refresh and NoAct at both the earlier ∼400 msec positive peak (driven primarily by a faster latency Refresh response) and during the later 800–1400 msec period. In contrast, only T6 and O2 showed a difference between Refresh and NoAct during the later period in Experiment 1. Although these differences did not pass the FDR-corrected threshold of q = 0.05 used in Experiment 1 (time points shown in bold in Figure 7A–D are at an uncorrected p < .05 threshold), they did emerge at the same time points with the same qualitative characteristics (faster latency for Refresh during the earlier positive peak, greater sustained positivity for Refresh during the later period), suggesting that the overall pattern of results from Experiment 2 did replicate that of Experiment 1. For further quantification, we used the enhanced FDR procedure introduced by Storey (2002). After feeding the p values for the Refresh–NoAct comparisons at all electrodes and time points into this procedure, it returned a π0 parameter of 0.771, suggesting that approximately (1 − π0) or 22.9% of measurements in Experiment 2 contain true differences between conditions, even if they do not pass a conventional FDR threshold. When considering only the electrodes/time points of maximum interest that (1) occurred between 300 msec postcue and cue offset and (2) passed the FDR threshold in Experiment 1, the expected percentage of true differences rose to 40.0%. This suggests a fair rate of replication that is relatively specific to the effects of maximum interest, despite the changes to the Refresh and NoAct cue stimuli.
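The logic behind the π0 parameter can be sketched as follows. Under the null hypothesis, p values are uniformly distributed, so the density of p values above some cutoff λ estimates the proportion of tests with no true effect: π0(λ) = #{p > λ} / (m(1 − λ)). The sketch below is illustrative only: it fixes λ = 0.5 as an assumption (Storey's procedure offers ways of choosing λ, and the setting used for the analysis above is not stated), and the function name `storey_pi0` and the p values are hypothetical.

```python
def storey_pi0(p_values, lam=0.5):
    """Estimate the proportion of true nulls as #{p > lam} / (m * (1 - lam))."""
    m = len(p_values)
    n_above = sum(1 for p in p_values if p > lam)
    # Cap at 1.0 because a proportion cannot exceed one.
    return min(1.0, n_above / (m * (1.0 - lam)))


# Hypothetical p values: 800 nulls spread evenly over (0, 1),
# plus 200 very small p values standing in for true effects.
null_p = [(i + 0.5) / 800 for i in range(800)]
effect_p = [0.001 * (i + 1) / 200 for i in range(200)]

pi0 = storey_pi0(null_p + effect_p, lam=0.5)
print(round(pi0, 3), round(1 - pi0, 3))
```

In this toy case the estimate recovers the 80% of tests that were constructed as nulls; by the same reading, π0 = 0.771 corresponds to 1 − 0.771 = 0.229 of measurements carrying true differences.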
In addition to the electrodes in Figure 7A–D, similar patterns of differences (p < .05, uncorrected, at multiple consecutive time points) between Refresh and NoAct at both the earlier peak and the later period were found at F8, FC4, FT8, C4, T3, T4, TP7, TP8, P3, P4, T5, O1, and Oz. A related but somewhat different pattern was also seen at Fz, FCz, Cz, CPz, Pz, F3, FC3, C3, and CP3. Those electrodes showed the same greater sustained Refresh positivity during the later 800–1400 msec period, but with no significant difference between conditions at the Refresh peak time point of ∼400 msec postcue; instead, early-period differences between Refresh and NoAct were driven by a greater positive peak for NoAct (though still at a slower latency than Refresh), occurring ∼500–600 msec postcue. See Figure 7E for the scalp distribution of the magnitude difference of the initial peak between Refresh and NoAct, irrespective of latency, and Figure 7F for the scalp distribution of the difference in mean voltage during the 800–1400 msec later period. Note that although Figures 4C/7E and 4D/7F exhibit some clear visual dissimilarities from each other, these are largely due to overall baseline voltage shifts. Figure 7E reflects a global negative shift relative to Figure 4C, and Figure 7F reflects a global positive shift relative to Figure 4D, but the relative distributions of voltages independent of these baseline shifts are largely similar between Figures 4C and 7E and between Figures 4D and 7F.
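The point about baseline shifts can be made concrete with a small sketch: two scalp maps that differ only by a global offset look different in absolute voltage but have identical relative distributions, which a Pearson correlation across electrodes captures because it is invariant to additive shifts. The electrode values and the 0.8 μV offset below are hypothetical and are not taken from Figures 4 or 7.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of voltages."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Mean-centering removes any global (baseline) shift from each map.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical difference-map values at five electrodes.
map_exp1 = [0.2, 0.5, 1.1, 1.4, 0.9]
# Same spatial pattern with a hypothetical global negative shift.
map_exp2 = [v - 0.8 for v in map_exp1]

mean_abs_diff = sum(abs(a - b) for a, b in zip(map_exp1, map_exp2)) / len(map_exp1)
r = pearson_r(map_exp1, map_exp2)
print(round(mean_abs_diff, 2), round(r, 3))
```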
Critically, the very early (∼200 msec postcue) and very late (after cue offset, >1500 msec postcue) differences between Refresh and NoAct, which were observed in Experiment 1 and served as the motivation for Experiment 2, were eliminated in Experiment 2 (see Figure 7A–D). This suggests that the differences found in Experiment 1 at these time points were indeed due to low-level sensory differences between the Refresh and NoAct cues. However, as the effects at ∼400 msec postcue and between 800 and 1400 msec postcue remained (although at an uncorrected significance threshold), Experiment 2 also suggests that those effects of primary interest were not driven by low-level sensory differences between cues.
ERPs for Refreshing Faces, Scenes, and Words
MVPA results were also similar between Experiments 1 and 2. Over the full 1500-msec cue period, decoding for the category refreshed (Figure 8A, left pair of bars, in blue) was significantly better than chance (mean AUC = 0.566, t(15) = 4.35, p = .00057, two-tailed t test). Classification of the initially presented category from NoAct trials (Figure 8A, left pair of bars, in green) was not different from chance (mean AUC = 0.516, t(15) = 1.61, p = .13). Unlike in Experiment 1, the difference between Refresh and NoAct classification during the full cue period was significant (t(15) = 3.79, p = .0018).
We then split the data into earlier (0–800 msec postcue) and later (800–1400 msec postcue) periods and ran the MVPA again, as in Experiment 1. During the earlier period (Figure 8A, middle pair of bars), Refresh category classification was significantly better than chance (mean AUC = 0.560, t(15) = 4.03, p = .0011) whereas NoAct classification showed only a trend toward better-than-chance performance (mean AUC = 0.524, t(15) = 1.88, p = .080); the difference between Refresh and NoAct performance during the earlier period was significant (t(15) = 2.20, p = .044). During the later period (Figure 8A, right pair of bars), Refresh category classification, as in Experiment 1, remained above chance (mean AUC = 0.534, t(15) = 2.18, p = .045) whereas NoAct was at chance (mean AUC = 0.493, t(15) = 0.54, p = .59), and Refresh performance was again significantly better than that of NoAct (t(15) = 3.74, p = .0020).
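For readers less familiar with these statistics, the following minimal sketch shows how an AUC and a group-level t statistic against chance relate to the values reported above. It makes several simplifying assumptions: it handles only a two-class AUC (how AUC was computed across the three categories is not described in this section), the helper names `auc` and `one_sample_t` are hypothetical, and all scores are invented for illustration.

```python
import math


def auc(scores_pos, scores_neg):
    """Probability that a positive-class score exceeds a negative-class score.

    Ties count as half a win, so a classifier with no information yields 0.5.
    """
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def one_sample_t(values, chance=0.5):
    """Return the t statistic and degrees of freedom for mean(values) vs. chance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    return (mean - chance) / math.sqrt(var / n), n - 1


# Hypothetical classifier scores for one participant: 7 of 9 pairs are ordered correctly.
print(auc([0.9, 0.7, 0.6], [0.8, 0.4, 0.3]))

# Hypothetical per-participant AUCs for 16 participants.
aucs = [0.50, 0.52, 0.54, 0.56, 0.58, 0.60, 0.62, 0.64] * 2
t_stat, df = one_sample_t(aucs)
print(round(t_stat, 2), df)
```

With 16 participants, the test has 15 degrees of freedom, matching the t(15) values reported for the comparisons against chance.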
Finally, we performed the MVPA separately at each 40 msec time bin and plotted the time course of category classification in Figure 8B. As in Experiment 1, performance was more similar between Refresh and NoAct during the early cue period; Refresh classification was only significantly better than NoAct at one early time point (time bin centered at 420 msec postcue). However, NoAct performance dropped to near-chance later in the cue period whereas Refresh remained high, as in Experiment 1. Eleven time points between 700 and 1420 msec postcue showed a significant difference between Refresh and NoAct performance (paired t tests, p < .05, uncorrected); the differences at 420, 980, and 1220 msec postcue survived FDR correction.
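The time-bin layout implied by the reported time points can be checked with simple arithmetic. The sketch below assumes consecutive, nonoverlapping 40 msec bins starting at cue onset; the exact bin edges are not stated in this section, so this layout is an assumption, although it is consistent with the 37 bins per cue period and the bin centered at 420 msec mentioned in the text.

```python
BIN_WIDTH_MS = 40
CUE_PERIOD_MS = 1500

# Number of complete bins that fit in the cue period.
n_bins = CUE_PERIOD_MS // BIN_WIDTH_MS
# Center of bin k is its start (k * width) plus half a bin width.
centers = [k * BIN_WIDTH_MS + BIN_WIDTH_MS // 2 for k in range(n_bins)]

print(n_bins, centers[0], centers[-1])
# Check whether the time points reported as bin centers fall on this grid.
print([c in centers for c in (420, 980, 1220, 1420)])
```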
Summary of Results
To our knowledge, this is the first EEG study of the cognitive processes engaged during refresh tasks. The results supported our two main hypotheses: Refresh ERPs (relative to a control condition) can be broken down into temporal subcomponents, and the refresh response is significantly modulated by the category refreshed. There were two major ERP differences between refreshing and our control NoAct condition: a positive peak relatively early in the cue period and a span of more sustained positivity later in the cue period. Both Refresh and NoAct had early peaks at multiple electrodes, with the Refresh peak occurring sooner (∼400 msec postcue) than the NoAct peak (∼500 msec postcue). This latency difference was not due to sensory differences between the Refresh and NoAct cues (or difficulty distinguishing the NoAct and Act cues), as it occurred not only in Experiment 1 but also in Experiment 2, in which the Refresh and NoAct cues were visually similar and the NoAct and Act cues were visually dissimilar.
However, in Experiment 2, the amplitudes of Refresh peaks compared to NoAct peaks were somewhat reduced (or, conversely, NoAct peaks were relatively larger). This could either have been due to purely sensory effects (i.e., the NoAct cues were larger and more visually salient, thus producing amplitudes more similar to those of Refresh) or cognitive effects relating to the cue change (i.e., the greater similarity of the cues made it more challenging to interpret the cue and initiate the appropriate response, leading to more similar amplitudes between conditions). Regardless, in both experiments, we observed somewhat different scalp distributions in the Refresh versus NoAct peak responses. In central and left-lateralized frontal sites (Fz, FCz, Cz, CPz, Pz, F3, FC3), the Refresh peak amplitude was approximately equal to (Experiment 1) or somewhat smaller than (Experiment 2) the NoAct peak, whereas in right frontal and bilateral parietal, temporal, and occipital electrodes (F4, CP4, F8, FC4, FT8, C4, T4, TP7, TP8, T5, T6, P3, P4, O1, O2, Oz), the Refresh peak amplitude was somewhat greater than (Experiment 1) or approximately equal to (Experiment 2) the NoAct peak (see Figures 4C and 7E).
The Refresh and NoAct responses also differed later in the cue period, in the form of a more sustained positivity, rather than a distinct peak, that was greater for Refresh than NoAct from ∼800 msec postcue onward. In Experiment 1, this was seen at several posterior sites, most notably T6 and O2, as well as at a number of additional sites in Experiment 2. At several sites, the Refresh and NoAct ERPs converged to similar amplitudes after their initial peaks before separating for the later period, suggesting that the later positivity is a distinct ERP subcomponent of refreshing, separate from that initial peak. This is further supported by our MVPA results; although category information could be decoded during Refresh trials throughout the cue period, it could only be decoded during NoAct trials in the earlier period, and thus, the MVPA primarily differentiated Refresh from NoAct during the later period. Classifiable category information during the early cue period may have been more related to persisting activity from the initial stimulus presentation in both the Refresh and NoAct conditions or spontaneous (anticipatory) refreshing in NoAct. Thus, for future studies examining the relationship between top–down modulation of sensory cortex and category-specific EEG patterns evoked by refreshing, it may be most useful to focus on the later cue period, which is presumably carrying the clearest signal of the top–down modulation of representations that is a defining function of the refresh process.
Relation to Previous Work
The distinct refresh-related ERP components we observed here bear some resemblance to ERP effects previously found in other contexts. In particular, our earlier positive peak is reminiscent of the large, well-characterized ERP component known as the P3 or P300. Although our refresh task was quite different from the infrequent target detection or “oddball” tasks classically used to elicit the P3, the positivity, latency, and magnitude of the responses are similar enough to suggest that our early refresh-related peak and the classical P3 might share some degree of underlying neural activity. The P3 has been linked to attention and working memory updating and is presumed to arise from activity in a broad frontoparietal network (for review, see Polich, 2007; Soltani & Knight, 2000; Polich & Kok, 1995). Given that refreshing is thought to constitute a fundamental component of many executive functions and shows some overlap in brain activity with perceptual attention processes (e.g., Roth, Johnson, Raye, & Constable, 2009; for a review, see Chun & Johnson, 2011), including both frontal and parietal regions of activation (Johnson et al., 2005), it seems reasonable to draw some relation between the peaks we observed here and the P3 family of responses.
More specifically, the P3 is typically subdivided into two subcomponents, the P3a—which is associated with irrelevant, novel, or distractor stimuli, a frontocentral scalp distribution, frontal source generators, and an earlier latency—and the P3b—which is associated with voluntary target detection, a more posterior scalp distribution, parietal and inferior temporal source generators, and a later latency (Polich, 2007; Bledowski et al., 2004; Knight, 1997). Given that our Refresh cue might be thought of as a type of target and our NoAct cue as a type of distractor, the scalp distributions we observed are broadly consistent with these divisions: a less Refresh-associated (more NoAct-associated) frontocentral distribution with the P3a and a more Refresh-associated (less NoAct-associated) posterior distribution with the P3b.
Thus, although our initial Refresh- and NoAct-related peaks likely both reflect some weighted combination of P3a-like and P3b-like processing, our results could be interpreted in terms of (and may shed new light on) theoretical models of P3a and P3b. The details of such models are still debated, but it appears that the P3a is related to the initial orienting to and evaluation of a stimulus, driven primarily by pFC (Polich, 2007; Bledowski et al., 2004; Friedman, Cycowicz, & Gaeta, 2001), whereas the P3b seems to be more related to the resolution of uncertainty about stimuli and the concomitant updating of expectancies or context, potentially engaging additional attentional or memory processes, and driven primarily by temporoparietal activity (Polich, 2007; Bledowski et al., 2004; Knight, 1997; Verleger, 1988; Sutton, Tueting, Zubin, & John, 1967). Both our Refresh and NoAct conditions are likely to involve P3a-like processing in the need to initially orient to and evaluate a cue to make an action (or inaction) decision, but P3b-like processing should be more Refresh-specific, given that only Refresh involves subsequent deployment of reflective attention to an active representation. The earlier latency for Refresh is consistent with the sensitivity of the P3 latency to the time required to evaluate/resolve a stimulus, driven primarily by the later and more temporally variable P3b. In our study, given the greater salience of the Refresh cue, participants likely held a refresh-specific attentional set that may have facilitated faster evaluation of that cue.
Previous ERP studies of orienting internally directed attention to items held in working memory, a task that likely entails refreshing, have also reported enhanced P3-like responses (e.g., Griffin & Nobre, 2003); however, those tasks have typically also involved a subsequent memory probe for the attended item. Thus, the less complex nature of the refresh task used here helps to establish with greater certainty that this P3-like enhancement is due to the act of reflective attention itself, rather than the preparation of a response based on the representation selected.
The later sustained positivity we associated with refreshing also has some analogue in previous work, the closest of which may be the late directing attention positivity (LDAP; Hopf & Mangun, 2000; Harter, Miller, Price, LaLonde, & Keyes, 1989). The LDAP is a late positive potential associated with perceptual attention, lasting up to several hundred milliseconds. It has been interpreted as arising from the anticipatory top–down modulation of visual regions in response to an attentional precue. Given the known top–down modulation effect of refreshing on activity in extrastriate category-selective visual regions of cortex (e.g., Johnson & Johnson, 2009a; Johnson et al., 2007), this interpretation is broadly consistent with our later refresh-related sustained positivity. If the LDAP and our refresh-related late positivity were indeed determined to stem from similar sources, it would suggest (1) that although the LDAP has previously been observed in terms of greater contralateral than ipsilateral positivity for visual attention directed to one hemifield, similar positivity can also be observed from directing reflective attention to representations of stimuli presented centrally and thus not explicitly lateralized and (2) that LDAP-like positivity is not limited to simply a gain increase from attending to an empty visual field but can also be evoked by top–down modulatory signals to visual regions that carry meaningful information about currently active mental representations (after the offset of the corresponding perceptual stimulus). These task differences (lateralized vs. central stimulus presentation, spatial/perceptual vs. reflective attention) limit how directly we might compare the traditional LDAP to our LDAP-like positivity, although they also may help explain differences in timing (the traditional LDAP arises ∼500 msec postcue whereas our positivity began ∼800 msec postcue, but reflective attention may reasonably be expected to take longer to initiate than spatial/perceptual attention) and create opportunities for future studies more specifically designed to assess the similarities and differences between ERPs associated with perceptual versus reflective attention.
Consistent with the above interpretation is our finding of Refresh-specific category decoding during the later part of the cue period that contained this LDAP-like positivity. At least one previous EEG study (LaRocque, Lewis-Peacock, Drysdale, Oberauer, & Postle, 2013) has successfully decoded the general category of information (visual, phonological, or semantic) maintained in working memory over several seconds; our study extends this result to demonstrate above-chance decoding for a shorter time span and a more similar set of categories. Although classifier performance in both cases was modest, this is to be expected with EEG; even during perception, category-specific ERP effects are not as pronounced as in fMRI. For example, although the fusiform face area in fMRI studies and the N170 potential in ERP both respond selectively to faces (Kanwisher, McDermott, & Chun, 1997; McCarthy, Puce, Gore, & Allison, 1997; Bentin, McCarthy, Perez, Puce, & Allison, 1996), there is no similarly diagnostic ERP component for visual scenes, despite the existence of several scene-preferring areas that are readily observed using fMRI (e.g., Epstein, 2008; Epstein & Kanwisher, 1998). Thus, both our study and that of LaRocque et al. highlight the utility of applying MVPA to reveal information about reflective processing that would not be recoverable from traditional ERP analysis (and with finer temporal resolution than is possible with fMRI); however, relatively large differences between stimulus categories may be necessary to achieve satisfactory decoding.
These EEG results support hypotheses formed as a result of previous fMRI investigations, thus fleshing out a dynamic cognitive and neural model of reflective attention and executive function more generally. Those fMRI studies found that refreshing was associated with activity in DLPFC, anterior pFC, and parietal regions, particularly the supramarginal gyrus (Raye et al., 2007; Johnson et al., 2005), as well as category-specific modulation of extrastriate visual areas (Johnson & Johnson, 2009a; Johnson et al., 2007). The short timescale of the refresh process (typically <2 sec) and the low temporal resolution of fMRI make it difficult to resolve the order in which those regions become active, but comparisons among task conditions suggested a basic model of how these areas interact: Anterior pFC, associated in previous studies with subgoal management, cognitive branching, and task initiation (Koshino et al., 2011; Koechlin & Hyafil, 2007; Braver & Bongiolatti, 2002) and activated for both Refresh and Act conditions (Raye et al., 2007), is primarily responsible for initiating an appropriate nonautomatic cognitive or motor action based on the interpretation of a cue. DLPFC, which is relatively specific to the Refresh condition in most studies of refreshing and which is thought to generate signals that bias the flow of activity in other brain regions (Miller & Cohen, 2001), produces a control signal to direct reflective attention to a representation. Subsequently and potentially mediated by parietal regions, activity patterns associated with that representation's initial perception are sustained, enhanced, or partially revived in representational cortical regions (e.g., visual areas, for visual stimuli).
Although our present findings do not allow us to map ERP phenomena directly onto specific cortical areas, they do suggest that refresh tasks contain at least two distinct component cognitive processes of reflection, which is generally consistent with the two-phase model predicted from fMRI: one with a peak at ∼400 msec postcue (initiating) and another that is more distributed between ∼800 and 1400 msec postcue (refreshing). The latter potential, its similarity to the LDAP, and the refresh-specific category decoding we found during its temporal window imply that this interval represents the period during which top–down modulation of representational regions occurs and patterns associated with the attended representation are most enhanced.
The earlier P3-like peak also integrates well with existing fMRI data, suggesting that disambiguation of the cue and initiation of the appropriate action occur by 400–500 msec postcue. The scalp distributions we observed for Refresh versus NoAct suggested both P3a-like (relatively less specific to refreshing) and P3b-like (more specific to refreshing) aspects. The known role of the P3a in initial stimulus orienting and evaluation, coupled with frontal source generators and an earlier latency than the P3b, may map onto the posited function of anterior pFC in initiating the appropriate cue-based response in both conditions. By contrast, the more refresh-specific P3b-like activity may reflect the slightly later recruitment of DLPFC and/or parietal regions to bias reflective attention to one representation, consistent with the P3b's later latency and role in context updating or recruitment of further attention/memory processes. Although the P3b is generally associated with temporoparietal source generators (Polich & Criado, 2006; Bledowski et al., 2004), frontal sources have also been found (Volpe et al., 2007); it is also true that our task is quite different from traditional P3 elicitation paradigms, and thus, interpretations of our P3-like peak may not map onto a “canonical” P3 response in every respect. Another alternative is that, because the temporal onset of DLPFC activity in refreshing is thought to occur between that of anterior pFC and more posterior regions, any DLPFC-associated ERPs may overlap too heavily with the earlier and later aspects of the P3-like peak to be easily dissociated from either.
Although this study did not record any behavioral data for the Refresh or NoAct conditions, there are clear implications for behavior. The refresh process has been proposed to be a key component of many more complex mental tasks (Chun & Johnson, 2011; Johnson & Johnson, 2009a; Johnson et al., 2005) and a critical element in conceptual (Baddeley, 2012) and quantitative (Barrouillet, Portrat, & Camos, 2011) models of working memory performance. Indeed, we have found that refreshing can have both immediate and long-term behavioral consequences. Refreshing can inhibit immediate perceptual access to a refreshed item (Johnson et al., 2013) but produce perceptual priming after a delay (Yi, Turk-Browne, Chun, & Johnson, 2008) and increase long-term recognition memory of the refreshed item (Johnson et al., 2002). The current findings provide some new tools for later studies to investigate such behavioral phenomena in a more fine-grained manner; for example, the latencies of the temporal subcomponents of the refresh task could be used as a more precise measure of when reflective attention is deployed on each trial or classifier performance could be used as a measure of representation strength. Either or both of those variables could then be used to predict behavioral effects such as long-term recognition performance or RTs for overtly refreshing (e.g., speaking aloud) an item or identifying a later re-presentation of a previously refreshed item.
All in all, these results contribute to a more complete understanding of the refresh component process specifically, of its potential relation to another component process (initiating), and more generally of the spatiotemporal neural dynamics of the building blocks of more complex reflective thought processes. Additionally, the successful isolation of refresh-related ERP responses paves the way for future studies that may manipulate the refresh task to obtain a more thorough understanding of the refresh process itself, the downstream consequences of refreshing upon memory representations, or the role of reflective attention in more complex cognitive operations (e.g., the relation between refreshing and rehearsing or retrieving). Future studies may also benefit from employing combined fMRI and EEG analyses in designs specifically targeted toward integrating the spatial and temporal features of the model described above.
The authors especially thank Christina Ramsay for assistance in data collection. Funding support was provided by grants AG009253 and MH092953 to M. K. J. and AG034773 to M. R. J.
Reprint requests should be sent to Matthew R. Johnson, Department of Psychology, Yale University, PO Box 208205, New Haven, CT 06520-8205, or via e-mail: firstname.lastname@example.org.