Maintaining visual working memory (VWM) representations recruits a network of brain regions, including the frontal, posterior parietal, and occipital cortices; however, it is unclear to what extent the occipital cortex is engaged in VWM after sensory encoding is completed. Noninvasive brain stimulation data show that stimulation of this region can affect working memory (WM) during the early consolidation time period, but it remains unclear whether it does so by influencing the number of items that are stored or their precision. In this study, we investigated whether single-pulse transcranial magnetic stimulation (spTMS) to the occipital cortex during VWM consolidation affects the quantity or quality of VWM representations. In three experiments, we disrupted VWM consolidation with either a visual mask or spTMS to retinotopic early visual cortex. We found robust masking effects on the quantity of VWM representations up to 200 msec poststimulus offset and smaller, more variable effects on WM quality. Similarly, spTMS decreased the quantity of VWM representations, but only when it was applied immediately following stimulus offset. Like visual masks, spTMS also produced small and variable effects on WM precision. The disruptive effects of both masks and TMS were greatly reduced or entirely absent within 200 msec of stimulus offset. However, there was a reduction in swap rate across all time intervals, which may indicate a sustained role of the early visual cortex in maintaining spatial information.
Research examining the neural bases of working memory (WM) has suggested that, whereas early sensory cortex is primarily engaged during stimulus encoding, the formation and maintenance of WM representations are most likely mediated by sustained activation in “higher-level” areas such as the parietal and frontal cortex (Postle, 2006; Xu & Chun, 2006; Todd & Marois, 2004; Courtney, Ungerleider, Keil, & Haxby, 1997; Goldman-Rakic, 1995). More recent evidence, however, has demonstrated that stimulus attributes, such as the color or direction of motion of a stimulus, can be decoded from delay period activity in sensory cortex using pattern classifiers (Emrich, Riggall, LaRocque, & Postle, 2013; Ester, Anderson, & Serences, 2013; Harrison & Tong, 2009; Serences, Ester, Vogel, & Awh, 2009), and the accuracy of these classifiers has been correlated with the precision of stored information (Emrich et al., 2013). These data suggest that early sensory areas may play an important role in the formation and short-term retention of WM representations, in addition to their well-established role in perceptual encoding. The correlational nature of the majority of these studies, however, has made it difficult to draw strong inferences regarding the causal role of early sensory areas in WM functions. To address this, several studies have used noninvasive brain stimulation methods to demonstrate a causal link between early sensory areas and WM (Makovski & Lavidor, 2014; van de Ven & Sack, 2013; van de Ven, Jacobs, & Sack, 2012; Cattaneo, Vecchi, Pascual-Leone, & Silvanto, 2009).
For example, van de Ven et al. (2012) used single-pulse transcranial magnetic stimulation (spTMS) to examine the contribution of the early visual cortex to visual working memory (VWM). Results showed that spTMS of the visual cortex produced a retinotopically specific disruption of performance on a shape change detection task when applied 200 msec poststimulus offset (350 msec poststimulus onset), but not when applied at 100 or 400 msec. Additionally, this disruption was only observed in a high-load condition in which three shapes were remembered; no effects of spTMS were observed when only a single item was remembered. Crucially, the TMS-related decrease in performance occurred at the same time period as visual mask-related decreases in performance observed in a second experiment. The authors concluded from these results that occipital-cortex-mediated VWM consolidation occurs early during the retention interval (see also Cattaneo et al., 2009) and is both topographically organized and highly capacity limited. Underlying this conclusion is the idea that spTMS and visual masks disrupt performance in an all-or-none fashion by interfering with visual cortex activity that is critical for the formation of stimulus representations in WM. That is, TMS prevents one or more items from being successfully consolidated and maintained in WM. This possibility appears consistent with evidence suggesting that the functional effect of TMS is to interrupt ongoing neural activity, rather than to add random noise to the signal (Harris, Clifford, & Miniussi, 2008; but see Abrahamyan, Clifford, Arabzadeh, & Harris, 2011; Schwarzkopf, Silvanto, & Rees, 2011). Another possibility, however, is that both forms of disruption produce their effects by reducing mnemonic precision (i.e., the quality of information in VWM), rather than preventing the consolidation or maintenance of some items (i.e., the quantity of information in VWM). 
This could happen if the disruption caused by TMS (or visual masks) prevents the further accumulation of high-resolution stimulus information for a subset of the remembered items, rather than having an all-or-none effect on consolidation. Previous TMS studies, which have relied exclusively on variants of the change detection paradigm, are not well suited to addressing this possibility, because they do not allow the quantity and quality of WM representations to be separately estimated.
In this study, we sought to clarify the functional role of early visual cortex in WM by applying single pulses of TMS to retinotopic visual cortex while participants performed a cued recall WM task. This task required participants to remember three colors presented in either the lower left or lower right visual field, randomly determined by a spatial cue presented at the beginning of each trial (Vogel & Machizawa, 2004), and to estimate a given remembered color at test by selecting its value from a continuous representation of the color space (Zhang & Luck, 2008; Wilken & Ma, 2004). Critically, although the visual field location of memory stimuli varied from trial to trial, the stimulated hemisphere was held constant. Additionally, the timing of TMS relative to the offset of the memory display was varied randomly across trials. These design elements allowed us to assess the topographic and temporal specificity of TMS effects on performance (as in van de Ven et al., 2012). To assess the functional contribution of early visual cortex activity to WM, we adopted a mixture modeling approach (Bays, Catalao, & Husain, 2009; Zhang & Luck, 2008) that attributes recall response errors to three different underlying sources: (1) response variability, measured as the standard deviation (SD) of a circular Gaussian (von Mises) distribution centered on the target color, a proxy for memory quality; (2) the probability of uniform responding (i.e., guessing, denoted as g), proposed to reflect the proportion of trials on which no information about the cued item is present in WM, a proxy for WM capacity; and (3) nontarget responses, in which participants mistakenly report the color of one of the un-cued items at test.
Our initial hypothesis was that TMS would produce retinotopically specific effects on estimates of the SD (rather than g). This was based on previous findings suggesting that the ability to decode stimulus identity from patterns of activation in early visual cortex is predictive of SD (Emrich et al., 2013), and on theories suggesting that the representation of high-precision featural information in WM involves the recruitment by attention of early visual areas involved in the initial processing of that information (D'Esposito, 2007; Postle, 2006; Pasternak & Greenlee, 2005). However, a previous study examining the effects of visual masks on recall performance (Zhang & Luck, 2008) revealed selective effects on random responding, rather than SD. If TMS influences WM through a similar mechanism, it could also be expected to affect the rate of random responding. Finally, if retinotopic early visual cortex plays a functional role in both the initial encoding and the maintenance of information in VWM, we expected TMS pulses to continue to exert an influence on behavior across each of the time points tested (0, 100, or 200 msec poststimulus offset).
Results showed that, contrary to our initial prediction, TMS-related declines in performance were most prominently reflected in an increase in g, although smaller effects were also observed on SD. Additionally, TMS produced retinotopically specific improvements in performance; the likelihood of making a swap error was reduced for targets contralateral to the stimulated hemisphere. In a corresponding experiment, we found a similar pattern of disruptive effects when visual masks were used to interfere with performance; visual masks produced a significant increase in g and smaller, more variable effects on SD. Visual masks had no effect on swap rate. Although the disruptive effects of visual masks were larger and extended further into the delay than the TMS-induced effects, in both cases the effects were either absent or considerably reduced by 200 msec poststimulus offset (350 msec after stimulus onset).
EXPERIMENT 1: EFFECTS OF VISUAL MASKS ON CHANGE DETECTION
Before examining the effects of masks and TMS on recall performance, we conducted an initial experiment to determine the timing of masking effects in the context of change detection, the VWM task utilized by van de Ven et al. (2012). In a change detection task, participants are asked to remember a small set of simple objects (e.g., colored squares, oriented bars, abstract shapes, etc.) across a short retention interval (∼800 msec to 3 sec) and then to judge whether the items in a test display are the same as or different from the items they saw originally (Luck & Vogel, 1997). In a shape change detection task, van de Ven et al. (2012) found masking effects at 200 msec poststimulus offset (350 msec post onset), but not at 100 or 400 msec poststimulus offset. In contrast, Vogel, Woodman, and Luck (2006) found that visual masks disrupted performance on a color change detection task until approximately 183 msec after the onset of the memory display (33 msec poststimulus offset). Therefore, in Experiment 1, we utilized a change detection task with masks at three time points: 0, 100, or 200 msec following the offset of the stimulus display. This way, we could determine whether mask timings differ for change detection versus recall tasks before administering spTMS in a cued recall task (Experiment 3).
Thirteen undergraduate students (10 women, average age = 24.5 years) participated in this experiment for monetary compensation ($10/hr). All participants had normal or corrected-to-normal visual acuity and normal color vision and provided informed consent before participation. Study protocols for this and all subsequent experiments were approved by the North Dakota State University institutional review board.
Stimulus presentation and response recording were controlled by a PC running Matlab (The Mathworks, Inc., Natick, MA) with Psychophysics Toolbox extensions (Brainard, 1997; Pelli, 1997). Stimulus displays contained three colored squares subtending 1.23° × 1.23° of visual angle on either side of fixation at a viewing distance of 70 cm (see Figure 1). Individual colors were selected at random from a set of 180 colors equally distributed in CIELAB (1976) color space (centered at CIE L*a*b* coordinates: L* = 70, a* = 28, b* = 12). All objects were presented in the lower left and right visual hemifields, equally spaced on an invisible circle centered at fixation with a radius of 6.31° of visual angle; squares within each hemifield were spaced 3.3° from each other, center to center.
The task is shown in Figure 1A. Each trial began with a precue (500 msec), directing participants to attend to either the left or right visual hemifield, followed by a 500-msec fixation screen, then the stimulus display (150 msec), and then a 1000-msec delay. During the delay, a pattern mask (composed of randomly selected colors from the possible color space) flashed for 200 msec over the spatial location occupied by each object on the left side of the screen. Therefore, when cued to attend to the left visual hemifield, the memory representations were masked, but when cued to attend to the right visual hemifield, memory representations were unmasked (the left side alone was masked to correspond with the TMS experiment, in which stimulation was always applied to the right hemisphere). Masks appeared at delays of 0, 100, or 200 msec after the offset of the stimulus display (randomly intermixed). Because the duration of the stimulus display was 150 msec, these timings correspond to SOAs between the onset of the stimulus display and the onset of the mask of 150, 250, and 350 msec, respectively. These timings were selected on the basis of previous findings related to the time course of consolidation of color stimuli in VWM, which revealed that visual masks continue to disrupt memory performance at SOAs up to 183 msec (see Vogel et al., 2006, Experiment 2). Following the delay, the test display was presented, which was either identical to the memory display or contained a color change in one of the objects. When a color changed, the new color was selected at random from the total set of possible colors, with the constraint that it must be different from the sample stimulus by at least 20° in color space. Participants indicated whether the test display was the same as or different from the stimulus display by pressing one of two keys on the keyboard.
Participants completed 60 experimental trials in each condition (30 change, 30 no change; 360 trials total) divided into 10 blocks with an even number of each trial type, plus a block of 36 practice trials.
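The trial structure described above can be sketched as follows. This is a minimal illustration rather than the experiment code (which ran in Matlab): the function name `build_trials`, the dictionary keys, and the assumption that every block contains exactly three repetitions of each cue × delay × change cell are all hypothetical.

```python
import itertools
import random

STIM_MS = 150  # duration of the stimulus display, from the procedure


def build_trials(seed=0):
    """Return 10 shuffled blocks of 36 change detection trials each.

    Assumed layout: 2 cue sides x 3 mask delays x 2 trial types = 12 cells,
    repeated 3 times per block (36 trials), giving 360 trials overall.
    """
    rng = random.Random(seed)
    cells = list(itertools.product(["left", "right"], [0, 100, 200], [True, False]))
    blocks = []
    for _ in range(10):
        block = [
            {
                "cue": cue,
                # The mask always covers the left items, so only left-cued
                # trials have their memory representations masked.
                "masked": cue == "left",
                "delay_ms": delay,
                # SOA = stimulus duration + offset-to-mask delay.
                "soa_ms": STIM_MS + delay,
                "change": change,
            }
            for cue, delay, change in cells
            for _ in range(3)
        ]
        rng.shuffle(block)
        blocks.append(block)
    return blocks
```

Under these assumptions, each of the six cue × delay conditions receives 60 trials (30 change, 30 no change), and the mask SOAs come out as 150, 250, and 350 msec.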
A mask-related decrease in performance was found at 0 and 100 msec poststimulus offset, but not 200 msec (see Figure 2). This was supported by a 2 (Mask: mask, no mask) × 3 (Timing: 0, 100, 200 msec) repeated-measures ANOVA. Where the assumption of sphericity was violated, a Greenhouse–Geisser correction was applied. In this and all subsequent experiments, post hoc tests were corrected with the Holm–Bonferroni correction for family-wise error. With this correction, the obtained p values are ranked from smallest to largest, and each is compared against its own critical value, α/(m + 1 − k), where α represents the selected criterion for rejecting the null hypothesis (.05 in the present case), m is the total number of comparisons, and k is the rank of the p value, from lowest to highest. Testing proceeds in rank order and stops at the minimal index k for which the obtained p value exceeds its critical value, that is, P(k) > α/(m + 1 − k); that comparison and all those ranked after it are treated as nonsignificant (Holm, 1979). In all cases, three comparisons were made. Thus, the threshold for significance for the smallest p value is p < .0167, the second smallest p < .025, and the largest p value p < .05.
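As a minimal sketch of the Holm (1979) step-down procedure, the following computes the critical value α/(m + 1 − k) for each rank k among m comparisons and applies the sequential test. The function names are hypothetical, and the sketch uses the standard "reject if p ≤ critical value" convention, which differs from a strict inequality only at exact ties.

```python
def holm_thresholds(m, alpha=0.05):
    """Critical p values for ranks k = 1..m (smallest p value first)."""
    return [alpha / (m + 1 - k) for k in range(1, m + 1)]


def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans (in the input order) marking rejected nulls."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] > alpha / (m + 1 - rank):
            # Step-down rule: stop at the first failure; every comparison
            # ranked after it is also treated as nonsignificant.
            break
        reject[idx] = True
    return reject
```

With three comparisons, `holm_thresholds(3)` yields critical values of approximately .0167, .025, and .05, matching the thresholds stated in the text.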
The ANOVA revealed a significant effect of Mask, F(1, 12) = 12.83, p = .004, ηp2 = 0.52, no main effect of Timing, F(1.33, 15.924) = 2.33, p = .14, ηp2 = 0.16, and a significant Mask × Timing interaction, F(2, 24) = 20.62, p < .001, ηp2 = 0.63. Post hoc t tests revealed a significant difference between the mask and no-mask conditions at 0 msec, t(12) = −7.39, p < .001 (critical p = .0167), a marginally significant effect at 100 msec, t(12) = −2.53, p = .03 (critical p = .025), and no effect at 200 msec, t(12) = .01, p = .99 (critical p = .05).
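The effect sizes reported above can be checked against the F ratios and degrees of freedom, assuming they follow the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The helper name below is hypothetical; the sketch only restates that identity.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    ss_ratio = f_value * df_effect  # proportional to SS_effect / MS_error
    return ss_ratio / (ss_ratio + df_error)


# Values taken from the Experiment 1 ANOVA reported above.
mask = partial_eta_squared(12.83, 1, 12)
timing = partial_eta_squared(2.33, 1.33, 15.924)
interaction = partial_eta_squared(20.62, 2, 24)
```

Rounded to two decimals, these reproduce the reported values of 0.52, 0.16, and 0.63.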
Visual masks produced a temporally graded pattern of disruption of color change detection, producing a large effect when the mask appeared immediately after stimulus offset, a marginal effect 100 msec later, and no effect at all by 200 msec (SOA between onset of stimulus and onset of mask = 350 msec), similar to the timing effects found by Vogel et al. (2006). This is in contrast to van de Ven et al. (2012), who found masking-related decreases in performance only at 200 msec (sample-mask onset SOA = 350 msec), but not at either earlier (100 msec) or later (400 msec) time points. The reasons for this discrepancy are unclear. van de Ven et al. suggest that the timing of the interference effect observed in their experiment could be due to interference with a later-occurring sweep of feedback input from higher-level areas to the visual cortex, which, they propose, may be critical for successful consolidation. Differences in the timing of masking effects between our studies could potentially be explained if we assume that the timing and/or necessity of such feedback interactions for consolidation differ depending on the specific stimuli used (e.g., complex shapes vs. colors). Determining whether this is in fact the case is beyond the scope of this study and will require further research, ideally using neuroimaging methods that make it possible to carefully track the patterns of activity involved in task performance. Of greater relevance, the results of Experiment 1 provide a range of sample-mask SOAs that were utilized in Experiment 2 in the context of a recall WM task. Use of recall, rather than change detection, allowed us to determine whether visual masks influence g (as a proxy for capacity) or SD (as a proxy for mnemonic precision).
EXPERIMENT 2: EFFECTS OF VISUAL MASKS ON RECALL PERFORMANCE
Experiment 2 was identical to Experiment 1, with the exception that a cued recall test was used, rather than change detection. The task was identical to the change detection task up until the presentation of the test display. In this task, the participant is asked to recall a particular cued item by selecting its value from a continuous representation of the feature space (Zhang & Luck, 2008; Wilken & Ma, 2004). We then examined the impact of visual masks on both total error (the absolute difference between the recalled and actual color across trials) and on different putative sources of error using a mixture modeling approach (Bays et al., 2009; Zhang & Luck, 2008), which allows both the quantity and quality of VWM representations to be separately estimated.
Twenty-two undergraduate and graduate students (age M = 22.09 years, 19 women) participated in this experiment for either course credit or monetary compensation ($10/hr). All participants had normal or corrected-to-normal vision and normal color vision.
Stimuli and Procedure
The stimuli and procedure were identical to Experiment 1, with the exception that a cued recall rather than change detection test was used (Figure 1B). The test display contained a filled white square at the location of a randomly selected test item, with empty white box placeholders in the locations of the nontargets in the attended hemifield only. These boxes were surrounded by a color wheel centered at fixation with a radius of 8.13° of visual angle, which contained all possible colors equally distributed in steps of 2° and was randomly rotated on each trial so that participants could not generate anticipatory responses before the onset of the test display. Participants were instructed to report the color of the cued item by clicking on the color wheel using the computer mouse (see Figure 1). As participants moved the mouse around the color wheel, the cued square was filled with the selected color. Once participants made a response, the target square was filled with the response color, and a border of the correct color was added so that participants could compare their response to the correct one. Additional feedback was provided in the form of a black bar that appeared on the outside of the color wheel, marking the correct color. Participants completed 100 trials in each condition (600 trials total), evenly distributed across 10 experimental blocks, plus one practice block of 60 trials.
Modeling Recall Response Distributions
Participants' data were analyzed using the MemToolbox (Suchow, Brady, Fougnie, & Alvarez, 2013). To analyze performance in the recall task, we made use of analytic techniques proposed by Zhang and Luck (2008) and Bays et al. (2009), in which recall response distributions are assumed to reflect a mixture of response types drawn from different distributions. According to the logic of this approach, when the cued item is successfully remembered, recall responses are drawn from a circular Gaussian (i.e., Von Mises) distribution, in which the mean indicates how close recall responses were on average to the actual target value across trials, and the standard deviation reflects the precision (quality) of the recalled items. If the item was not successfully stored, recall responses are assumed to be drawn from a uniform distribution in which individual colors are selected equiprobably from the color wheel; this “guess rate” can then be used to estimate the quantity of items stored in VWM. Finally, accurate performance on this task necessitates keeping track of which colors appeared in each location in the original memory display. A failure to do so can result in a third type of recall response in which one of the noncued memory items is recalled instead of the cued item, known as a “swap error” (Bays et al., 2009). Therefore, this method makes it possible to separately estimate the quantity and quality of stored information, as well as the likelihood of mistaking the cued item for one of the other items in WM. For each participant, we compared goodness of fit between the three-component variant of the mixture model and the two-component model proposed by Zhang and Luck (2008), which does not include swap errors, using the log likelihood and the corrected Akaike information criteria (cAIC). For all model comparisons, the three-component model provided the superior fit for the majority of participants (100% and 86% for the log likelihood and cAIC, respectively). 
For both the log likelihood and cAIC, we compared the computed scores for each model with a paired-samples t test and report the mean difference between the scores as the two-component score minus the three-component score. The three-component model provided a significantly superior fit for both the cAIC, t(21) = 4.30, p < .001 (mean difference = 13.53, SD = 14.76), and log likelihood, t(21) = −5.49, p < .001 (mean difference = −8.07, SD = 6.89). Therefore, the three-component model was used for all analyses.
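To make the logic of the mixture model concrete, the following is a minimal sketch of the three-component response density (target, uniform guess, and nontarget "swap" components) together with a small-sample corrected AIC. It is not the MemToolbox implementation used for the reported fits. The function names are hypothetical, angles are assumed to be in radians, and precision is parameterized by the von Mises concentration κ rather than by SD (the conversion between the two is not shown).

```python
import math


def bessel_i0(x):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total


def von_mises_pdf(x, kappa):
    """Zero-centered von Mises density at angle x (radians)."""
    return math.exp(kappa * math.cos(x)) / (2.0 * math.pi * bessel_i0(kappa))


def mixture_pdf(response, target, nontargets, kappa, g, b):
    """Density of a response under the three-component mixture.

    g: probability of a uniform guess; b: probability of a swap error.
    Setting b = 0 reduces this to the two-component model.
    """
    density = (1.0 - g - b) * von_mises_pdf(response - target, kappa)
    density += g / (2.0 * math.pi)
    if nontargets:
        swap = sum(von_mises_pdf(response - nt, kappa) for nt in nontargets)
        density += b * swap / len(nontargets)
    return density


def log_likelihood(trials, kappa, g, b):
    """Sum of log densities; trials = [(response, target, nontargets), ...]."""
    return sum(math.log(mixture_pdf(r, t, nts, kappa, g, b)) for r, t, nts in trials)


def aicc(log_lik, n_params, n_obs):
    """Small-sample corrected Akaike information criterion."""
    aic = 2.0 * n_params - 2.0 * log_lik
    return aic + (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
```

In a comparison of this kind, the two-component model would be scored with two free parameters (κ, g) and the three-component model with three (κ, g, b), with lower corrected AIC indicating the preferred model.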
Analysis of Absolute Error
The present experiment manipulated the timing of visual masks and whether the attended visual field location was masked or unmasked. To examine the effects of mask timing and presence/absence on recall performance, we first performed a two-way (2 Mask × 3 Timing) repeated-measures ANOVA on the absolute error (absolute difference between the recalled and the actual target color). This revealed a significant main effect of Mask, F(1, 21) = 45.78, p < .0001, ηp2 = 0.69, a significant main effect of Timing, F(2, 42) = 27.99, p < .0001, ηp2 = 0.57, and a significant Mask × Timing interaction, F(2, 42) = 14.145, p < .0001, ηp2 = 0.40. Post hoc t tests revealed significant mask-related elevations in absolute error at each sample-mask timing (all ps < .001), although the effect grew substantially smaller at longer sample-mask delays (see Figure 3A).
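Because colors are arranged on a circle, the absolute error is presumably computed as the shortest angular distance between the reported and target colors. A minimal sketch under that assumption, with hypothetical function names and angles in degrees:

```python
def circular_abs_error(response_deg, target_deg):
    """Shortest angular distance (0 to 180 degrees) between two colors on the wheel."""
    # Wrap the signed difference into the interval [-180, 180).
    diff = (response_deg - target_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


def mean_abs_error(responses_deg, targets_deg):
    """Mean absolute recall error across trials for one condition."""
    errors = [circular_abs_error(r, t) for r, t in zip(responses_deg, targets_deg)]
    return sum(errors) / len(errors)
```

Wrapping matters near the ends of the scale: a response at 350° to a target at 10° is a 20° error, not a 340° error.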
Analysis of Mixture Model Fits
Results of the mixture model analysis can be seen in Figure 3B–D. Differences between conditions were assessed with separate 2 (Mask: mask, no mask) × 3 (Timing: 0, 100, 200) repeated-measures ANOVAs conducted for each parameter (g, SD, swap rate). When the assumption of sphericity was violated, a Greenhouse–Geisser correction was applied.
The results indicate a masking-related increase in guess rate at all three time points, although effects tended to decrease over time. Confirming this pattern, ANOVA revealed a main effect of Mask, F(1, 21) = 19.29, p < .001, ηp2 = 0.48, no effect of Timing, F(2, 42) = 1.14, p = .33, ηp2 = 0.05, and a significant Mask × Timing interaction, F(2, 42) = 3.72, p = .03, ηp2 = 0.15. Post hoc t tests revealed a significant increase in guess rate as a result of the mask at 0 msec, t(21) = 4.63, p < .001 (critical p = .0167), and 100 msec, t(21) = 3.56, p = .002 (critical p = .025), and a smaller effect at 200 msec, t(21) = 2.09, p = .049 (critical p = .05).
Masks also produced a small increase in SD that did not vary reliably with mask timing. For SD, the ANOVA revealed only a main effect of Mask, F(1, 21) = 4.39, p = .049, ηp2 = 0.17, with a marginal effect of Timing, F(1.55, 32.49) = 3.20, p = .065, ηp2 = 0.13, and no significant interaction, F(1.51, 31.77) = 1.58, p = .22, ηp2 = 0.07.
No masking effects on swap rate were found. The ANOVA revealed no effects of Mask, F(1, 21) = 2.73, p = .11, ηp2 = 0.12, Timing, F(2, 42) = 2.43, p = .10, ηp2 = 0.10, or a Mask × Timing interaction, F(2, 42) = 1.00, p = .38, ηp2 = 0.05.
Using a recall test, Experiment 2 revealed the presence of masking effects at all tested intervals, although the magnitude of the effect decreased over time. Although the greater number of participants in Experiment 2 makes it difficult to directly compare these results to Experiment 1, this result suggests that the recall task may have greater sensitivity to reveal masking effects than the change detection task. The time required to fully consolidate VWM representations may therefore be somewhat longer than suggested by studies examining this issue using the change detection task (Vogel et al., 2006). In keeping with the findings of Zhang and Luck (2008), the masks in this experiment primarily influenced the likelihood of generating a guess-like response. However, we also observed a small, mask-related increase in SD that did not vary based on mask timing. Thus, these findings are broadly consistent with the proposal that encoding in VWM is an all-or-none process, as opposed to a process characterized by a gradual accumulation of featural information over time (Zhang & Luck, 2008).
EXPERIMENT 3: EFFECTS OF SPTMS OF RETINOTOPIC VISUAL CORTEX ON RECALL PERFORMANCE
The goal of Experiment 3 was to determine whether TMS affects the quantity and quality of information in VWM in a manner similar to the visual masks used in Experiment 2. The same procedure was used as in Experiment 2, except that instead of visual masks, spTMS was applied to retinotopic visual cortex at variable delays relative to stimulus offset. It was expected that disruptions in performance would occur when participants encoded information presented in the visual field contralateral to the stimulated hemisphere (as in van de Ven et al., 2012); contralateral stimulation trials therefore served the same function as the “mask” trials in Experiments 1 and 2. The trials in which participants attended to the hemifield ipsilateral to stimulation (i.e., right hemifield during right-occipital stimulation) served as a hemispheric control. Although it is common to also include a sham control condition in TMS experiments, the use of a control hemisphere has the advantage that (1) trials probing the control and target hemispheres are randomly intermixed throughout the session, ensuring that the state of the participant is roughly equivalent across conditions, and (2) the tactile sensation produced by the coil's discharge is identical across conditions (acoustic sensations were largely eliminated by the use of masking noise during the session). For these reasons, a sham control was not included.
Twenty-two participants recruited from the North Dakota State University undergraduate and graduate population completed the experiment for monetary compensation ($20/hr). All participants were between the ages of 18 and 35 years (age M = 22 years, 15 women), were right-handed, reported normal or corrected-to-normal visual acuity, and had normal color vision (as established with the Ishihara color vision test). Participants gave informed consent and were screened for the presence of neurological and psychiatric conditions and other risk factors related to the application of both MRI and TMS before participation (based on guidelines for TMS safety set forth in Rossi, Hallett, Rossini, & Pascual-Leone, 2009). One participant was excluded because of very low performance (guess rate > 2 SDs above the mean across all conditions), bringing the final N to 21. The results do not change when this participant is included.
TMS Targeting and Stimulation
TMS was delivered with a Magstim Super Rapid 2 magnetic stimulator fitted with a focal bipulse, figure-of-eight 70-mm stimulating coil (Magstim, Whitland, UK). TMS targeting and online guidance were achieved using a Visor2 neuronavigation system (Advanced Neuro Technology, Enschede, The Netherlands) that uses infrared-based frameless stereotaxy to map the position of the coil and the participant's head within the reference space of the individual's high-resolution anatomical MRI. Whole-brain T1-weighted anatomical MRI scans were acquired with a GE Signa HD 1.5-T MRI scanner for each participant before participation (206 axial slices, with a resolution of 1 mm). Throughout the experiment, participants listened to constant masking noise played through a pair of inserted earplugs. The volume of the masking noise, which never exceeded 90 dB, was adjusted immediately before the experimental session for each participant until the “click” produced by the discharge of the TMS coil could no longer be heard (Johnson, Kundu, Casali, & Postle, 2012; Esser et al., 2006).
Phosphene Localization and Thresholding
Phosphene localization procedures and all subsequent stimulus presentation and response recording were controlled by a PC running Matlab with Psychophysics Toolbox extensions (Brainard, 1997; Pelli, 1997). The right primary visual cortex was first identified and targeted on the basis of individual anatomy, and target coordinates were further refined by determining the coil position and orientation that elicited visual phosphenes in a particular region of visual space (van de Ven et al., 2012). To do this, participants fixated a central white cross presented against a black background while single pulses of TMS were administered to visual cortex at 70% of stimulator output. Coil position and orientation were then adjusted until reproducible phosphenes were visible in the hemifield contralateral to stimulation. Once phosphenes were induced, participants used a computer mouse to draw a circle around the region in the visual field where the phosphenes appeared. Responses were recorded as both an image file and as the x–y coordinates of all mouse locations. The coil position was then adjusted until reliable phosphenes were produced overlapping or adjacent to the location of the sample display items (for stimulus positions, see Experiment 1 Methods). Phosphenes were considered to be elicited when the following criteria were met: Phosphenes could be localized in both left and right hemispheres, phosphenes were elicited with eyes shut, and phosphenes moved with fixation (Kammer, 1998). Following phosphene localization, phosphene thresholds were established. Stimulator intensity was reduced by increments of 5%, and 10 pulses were delivered at each intensity until phosphene threshold, defined as the minimum intensity required to elicit phosphenes 50% of the time, was established. During the experiment, pulses were administered to the right hemisphere at 110% of phosphene threshold. Average stimulation intensity was at 67% of stimulator output (range = 57–82%). 
Postexperiment debriefing confirmed that TMS at this intensity did not give rise to visible phosphenes during performance of the color recall task, in which stimuli were presented against a light gray background and attention was focused on the task.
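The thresholding rule described above (step intensity down in 5% increments, 10 pulses per level, and take the lowest intensity that still elicits phosphenes on at least half of the pulses) can be sketched as follows. The function names and the example counts are hypothetical, and reading "50% of the time" as "at least 5 of 10 pulses" is an assumption.

```python
def phosphene_threshold(counts_by_intensity, pulses_per_level=10, criterion=0.5):
    """Lowest intensity (% stimulator output) still meeting the criterion.

    counts_by_intensity maps each tested intensity to the number of pulses
    on which a phosphene was reported. Levels are scanned from the highest
    intensity downward, and the scan stops at the first level that fails.
    """
    threshold = None
    for intensity in sorted(counts_by_intensity, reverse=True):
        if counts_by_intensity[intensity] / pulses_per_level >= criterion:
            threshold = intensity
        else:
            break
    return threshold


def stimulation_intensity(threshold, scale=1.10):
    """Experimental intensity, set to 110% of the phosphene threshold."""
    return threshold * scale


# Hypothetical session: phosphene reports out of 10 pulses at each level.
example_counts = {70: 10, 65: 8, 60: 5, 55: 2}
example_threshold = phosphene_threshold(example_counts)
example_intensity = stimulation_intensity(example_threshold)
```

For these illustrative counts, the threshold is 60% of stimulator output, so pulses during the task would be delivered at about 66%.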
Stimuli and Procedure
Stimuli and procedure were identical to Experiment 2, except that TMS pulses, rather than visual masks, were applied at varying intervals relative to stimulus offset (see Figure 1C). Viewing distance was held constant at 70 cm, and head position was stabilized using a chinrest. spTMS was administered either 0, 100, or 200 msec poststimulus offset.
Analysis of Absolute Error
Absolute error for each combination of TMS side and timing can be seen in Figure 4A. As in Experiment 2, differences in absolute error across conditions were assessed by a two-way (2 Side × 3 Timing) repeated-measures ANOVA. In contrast to the masking data, neither main effect reached significance (ps = .20 and .23 for side and timing, respectively). However, there was a trend toward a significant Side × Timing interaction, F(2, 40) = 2.608, p = .086, ηp2 = 0.12, with somewhat elevated absolute error for targets contra versus ipsi to the stimulated hemisphere when TMS was applied 0 msec poststimulus offset.
Analysis of Mixture Model Fits
As with Experiment 2, estimates of response errors were derived from the three-component mixture model proposed by Bays et al. (2009). Comparison of model fits using the log likelihood and cAIC suggested that the three-component model performed better than the two-component model for the majority of participants (100% and 82% of participants for log likelihood and cAIC, respectively). For both the log likelihood and cAIC, we also compared the computed scores for each model with a paired-samples t test. The three-component model provided a significantly superior fit for both the cAIC, t(20) = 2.34, p = .03 (mean difference = 26.41, SD = 51.70), and log likelihood, t(20) = −2.52, p = .02 (mean difference = −14.22, SD = 25.85). Average estimated model parameters are depicted in Figure 4B. As in Experiment 2, three separate 2 (Visual hemifield: contra vs. ipsi to stimulated hemisphere) × 3 (TMS timing: 0, 100, or 200 msec after stimulus display offset) repeated-measures ANOVAs were conducted, one each for g, SD, and swap errors.
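For readers unfamiliar with the three-component mixture model referred to above, the following Python sketch shows the general form of its response density: a von Mises component centered on the target, a uniform guessing component, and von Mises components centered on the non-target items. It is an illustration under stated assumptions, not the fitting code used in the study. In particular, the function names are hypothetical, colors are assumed to be coded as angles in radians, and response variability is expressed through the von Mises concentration parameter kappa rather than through SD.

```python
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function of the first kind, order 0 (power series).

    Adequate for the moderate concentration values used in this sketch.
    """
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2) ** 2 / ((k + 1) ** 2)
    return total


def von_mises_pdf(error, kappa):
    """Von Mises density centered on zero; `error` is in radians."""
    return math.exp(kappa * math.cos(error)) / (2 * math.pi * bessel_i0(kappa))


def response_density(response, target, nontargets, kappa, guess, swap):
    """Density of one response under the three-component mixture.

    `guess` is the weight of the uniform (guessing) component and `swap`
    is the weight shared equally among the non-target components.
    """
    p_target = 1.0 - guess - swap
    density = p_target * von_mises_pdf(response - target, kappa)
    density += guess / (2 * math.pi)
    if nontargets:
        density += swap * sum(von_mises_pdf(response - nt, kappa)
                              for nt in nontargets) / len(nontargets)
    return density


def log_likelihood(trials, kappa, guess, swap):
    """Sum of log densities over trials of (response, target, nontargets)."""
    return sum(math.log(response_density(r, t, nts, kappa, guess, swap))
               for r, t, nts in trials)
```

In this sketch, setting `swap` to 0 reduces the model to a two-component mixture of target responses and uniform guesses. Parameter estimates would be obtained by maximizing `log_likelihood` over `kappa`, `guess`, and `swap` for each participant and condition.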
A TMS-related increase in g was found when stimulation was applied coincident with stimulus offset. Specifically, the ANOVA revealed no main effects of either stimulation side, F(1, 20) = 2.41, p = .14, ηp2 = 0.11, or TMS timing, F(2, 40) = 1.33, p = .28, ηp2 = 0.06, but there was a significant interaction between Timing and Stimulation side, F(2, 40) = 3.69, p = .03, ηp2 = 0.16. Follow-up post hoc t tests revealed an increase in guess rate for targets in the contralateral versus ipsilateral hemifield when TMS was applied at 0 msec, t(20) = 2.50, p = .02, which was just above the corrected threshold value for significance (critical p = .0167). No effects of TMS on guess rate were found at either 100 msec, t(20) = .48, p = .63, or 200 msec, t(20) = .99, p = .33.
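The critical value quoted above is consistent with dividing an alpha level of .05 equally across the three timing comparisons. The correction procedure is not named in the text, so the Bonferroni-style rule in the short Python sketch below is an assumption, and the function name is hypothetical. The p values are those reported above for the guess rate comparisons.

```python
def bonferroni_critical_p(alpha, n_comparisons):
    """Per-comparison threshold when alpha is split equally across tests."""
    return alpha / n_comparisons


critical_p = bonferroni_critical_p(0.05, 3)

# Reported p values for contra vs. ipsi guess rate at each TMS timing (msec).
reported_p = {0: 0.02, 100: 0.63, 200: 0.33}
survives_correction = {timing: p < critical_p
                       for timing, p in reported_p.items()}
```

Under this assumption, none of the three comparisons falls below the corrected threshold, in line with the description of the 0-msec effect as just above the critical value.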
For SD, the ANOVA revealed a main effect of Stimulation side, F(1, 20) = 5.05, p = .04, ηp2 = 0.20, but no main effect of Timing, F(2, 40) = 1.45, p = .25, ηp2 = 0.07, and a marginal interaction, F(2, 40) = 2.80, p = .07, ηp2 = 0.12. However, post hoc t tests revealed no significant differences in SD between the contralateral and ipsilateral hemifield when TMS was applied at 0 msec, t(20) = −1.40, p = .18, 100 msec, t(20) = 2.14, p = .045 (critical p = .0167), or 200 msec, t(20) = .83, p = .42.
Interestingly, analysis of swap rate revealed an overall decrease in swap errors for targets in the contralateral hemifield, but this was not specific to a particular TMS timing. The ANOVA revealed a main effect of Stimulation side, F(1, 20) = 5.44, p = .03, ηp2 = 0.21, but no effects of timing, F(2, 40) = .44, p = .65, ηp2 = 0.02, and no interaction, F(2, 40) = 1.14, p = .33, ηp2 = 0.05.
These data demonstrate that TMS of the early visual cortex produced temporally and topographically specific effects on g, an index of the number of items that were successfully stored in VWM. Specifically, when applied coincident with the offset of the stimulus display, spTMS increased the likelihood of making a guess response for targets appearing in the visual hemifield contralateral to the stimulated cortex (i.e., the left visual field following right visual cortex stimulation). Contrary to our predictions, TMS produced only a small effect on SD. Although the effect on SD appeared to be specific to the 100-msec condition, the TMS Timing × Visual Hemifield interaction of the ANOVA was not significant, nor was the post hoc t test looking at differences in SD at this interval. Additionally, contrary to our predictions, the effects of spTMS on both g and SD were entirely absent by 200 msec after stimulus offset (i.e., 350 msec poststimulus onset). In contrast to these disruptive effects, spTMS also produced a temporally nonspecific reduction in the likelihood of making swap errors for targets in the contralateral hemifield (i.e., swap errors were less likely when estimating the color of targets contra vs. ipsi to the stimulated hemisphere). The effects of TMS on g and SD are qualitatively similar to the findings of van de Ven et al. (2012), who found that spTMS was no longer effective at later time points, when consolidation is presumably already completed. However, these results go beyond their findings by demonstrating temporally and topographically specific effects on the parameters of the mixture model, which may map onto qualitatively distinct sources of error in recall WM tasks.
This study sought to clarify the functional relevance of early visual cortex contributions to VWM. To do this, we presented visual pattern masks or administered single pulses of TMS to retinotopic visual cortex at different time points relative to the offset of a stimulus display. To determine the nature of the influence of masks and TMS on the formation of VWM representations, participants' memory for the sample display colors was assessed using either delayed recognition (change detection) or cued recall, which made it possible to separately estimate the number of items stored in WM and the precision of the stored information. Critically, although memory stimuli could be encoded from either the lower left or lower right visual fields, as determined by a spatial cue presented at the beginning of each trial, both masks and TMS selectively targeted the lower left visual field. Thus, performance for stimuli encoded from the lower right visual field served as a within-subject control (as in van de Ven et al., 2012). Across experiments, we observed similar effects of masks and TMS on memory for items appearing in the affected visual field, although the effects of masks were generally more pronounced. Specifically, visual masks impaired change detection performance (Experiment 1) and increased the likelihood of making a guess-like response (Experiment 2) for targets appearing in the lower-left visual field, and spTMS induced a similar increase in guessing for stimuli in the visual field contralateral to the targeted hemisphere (Experiment 3). Both masks and TMS produced the largest effects when applied coincident with stimulus offset, with the disruption either entirely dissipating (Experiments 1 and 3) or growing progressively smaller (Experiment 2) the further into the retention interval it was applied. Visual masks and TMS also produced small and variable effects on the standard deviation of responses. 
Finally, TMS produced an unexpected decrease in swap errors for targets appearing in the affected hemifield. These findings have several implications for our understanding of the time course and functional relevance of early visual cortex to VWM.
Regarding the question of whether masks (and TMS) affect the number of items stored or their quality, results of the mixture model analysis in Experiment 2 confirmed the findings of Zhang and Luck (2008), who found that masks primarily affect performance by reducing the probability that an item is encoded into VWM, rather than increasing the variability of recall responses (SD). Similarly, Experiment 3 revealed that TMS applied to early visual cortex coincident with stimulus offset produced an increase in guess rate for targets presented in the contralateral visual field. Although there were also effects on SD, these effects were more variable, and post hoc tests suggested that they were not reliable at any of the tested latencies. Thus, taken together, these results support the proposal of van de Ven et al. (2012) that interfering with activity in early visual cortex likely disrupts an ongoing process of memory consolidation, suggesting a functional role for early sensory areas in the initial formation of VWM representations.
Although the mechanisms underlying the functional effects of spTMS are not well understood, research by Harris and colleagues (2008) suggests that the pattern of disruption observed here and in the study of van de Ven et al. (2012) may have been caused by a TMS-related decrease in signal intensity, rather than an increase in random image noise (Abrahamyan et al., 2011; Schwarzkopf et al., 2011). This would explain the finding of more robust and reliable effects on guessing, rather than standard deviation. Although the precise mechanism by which TMS induces these effects is unclear, a recent optical imaging study examining the effects of TMS on cat visual cortex activity suggests that the reduction in signal intensity may be caused by TMS-induced local cortical inhibition (Kozyrev, Eysel, & Jancke, 2014). In their study, single pulses of TMS applied to the visual cortex were found to induce a localized pattern of inhibition that lasted approximately 300 msec before returning to baseline. Assuming similar mechanisms are at work in human visual cortex, this brief period of inhibition may be adequate to disrupt the initial formation of WM representations, without affecting later activity once consolidation is complete. Whether this is correct will require further work examining the neural effects of TMS. In particular, this question could be profitably addressed by the adoption of a computational neurostimulation approach, in which the effects of simulated TMS pulses are examined in the context of realistic neural models of the consolidation and maintenance of information in WM (for discussion of this approach, see Bestmann & Feredoes, 2013).
Another notable aspect of the data that speaks to the functional role of the early visual cortex in VWM is the time course of the effects of masks and TMS on performance. If activity in the early visual cortex is critical for both the initial consolidation and later maintenance of information in VWM, as has been proposed by sensory recruitment models of WM (D'Esposito & Postle, 2015), we would expect TMS to disrupt performance at all latencies tested. Contrary to this possibility, in each experiment, the effects of TMS and of masks were either substantially reduced or entirely absent when stimulation was applied at later time points, when consolidation was nearing completion and short-term maintenance had presumably begun. These findings match those of van de Ven et al. (2012), who observed a significant disruption of performance when spTMS was applied 200 msec after memory display offset, but not 200 msec later. Similarly, Beckers and Hömberg (1991) reported no effect on performance when spTMS was applied to the visual cortex during the retention interval of a delayed match to sample task requiring memory for faces, although performance was disrupted by TMS applied during retrieval.
These findings suggest that early visual cortex involvement in VWM may be restricted to an early consolidation time window during which fragile sensory representations are being transformed into more durable VWM representations.
This conclusion is consistent with several lines of evidence suggesting that, although initial perceptual processing depends on activity in early visual areas, maintenance in VWM likely depends on activity in higher-order cortical areas, such as the parietal or frontal cortex (Bettencourt & Xu, 2016; Mendoza-Halliday, Torres, & Martinez-Trujillo, 2014; Xu, 2010; Xu & Chun, 2006; Todd & Marois, 2004, 2005). For example, using fMRI and multivoxel pattern classification, Bettencourt and Xu (2016) reported that the ability to decode the orientation of a remembered stimulus from patterns of activity in specific regions of the occipital and parietal cortex differed depending on whether new, task-irrelevant stimuli were presented during the delay. Although it was possible to decode item orientation from both areas during an unfilled delay, when task-irrelevant distractors were presented, decoding from early visual areas was no longer possible, even though task performance was unaffected by the new input. By contrast, decoding from the superior intraparietal sulcus, a region that has been implicated in setting capacity limits in WM (Todd & Marois, 2004, 2005), remained intact and closely tracked WM task performance.
Similarly, a recent neurophysiological recording study in macaques (Mendoza-Halliday et al., 2014) found that, although the direction of motion of a stored stimulus could be decoded from local field potential activity recorded in the motion-selective middle temporal area (MT), stimulus-specific spiking activity was only observed in higher-order, multimodal areas. Moreover, low-frequency neural oscillations in MT were phase-coherent with spiking activity in the frontal cortex, suggesting that the local field potential activity supporting decoding in MT was likely driven by feedback inputs from the frontal or parietal cortex, rather than reflecting memory-related activity arising in early visual area MT itself. Findings such as these suggest that, although it may be possible to decode stimulus identity from the early visual cortex in some cases, this activity may not be causally necessary for storage in WM.
In a reply to the study of Bettencourt and Xu (2016), Ester, Rademaker, and Sprague (2016) pointed out that a failure to decode from a given area does not constitute strong evidence against this area's involvement in the process in question. For example, it remains possible that the visual cortex was engaged in storage in the distractor condition of Bettencourt and Xu (2016), but at a level that was inaccessible to multivoxel decoding analyses (see Dubois, de Berker, & Tsao, 2015, for evidence demonstrating such a failure of decoding in the macaque face patch system). Similar arguments have been forwarded to explain several notable failures to decode item identity from delay period activity in both frontal and parietal regions (Emrich et al., 2013; Linden, Oosterhof, Klein, & Downing, 2012; Riggall & Postle, 2012). For example, in a recent review paper, Riley and Constantinidis (2016) argue that the spatial resolution of fMRI may be insufficient to support successful decoding of item identity from the frontal cortex, which is characterized by nontopographic organization and the representation of information at spatial scales that can be an order of magnitude finer than the resolution of fMRI. Given this, failures to decode the contents of WM from the frontal cortex would be expected, even if this information is robustly represented in this area.
This discussion highlights potential limitations in the kinds of inferences that can be drawn from pattern classification analyses alone. On the one hand, as noted by Ester et al. (2016) and Riley and Constantinidis (2016), a failure to decode does not necessarily constitute strong evidence against a particular area's involvement in a given process. On the other hand, the results of Bettencourt and Xu (2016) and Mendoza-Halliday et al. (2014) suggest that the ability to decode from a particular area is no guarantee that this area is causally necessary for storage. One way of addressing these inferential limitations is to adopt causal methods such as TMS, as we have done here. Crucially, in both our study and that of van de Ven et al. (2012), TMS only disrupted performance when applied relatively early in the delay period (see also Beckers & Hömberg, 1991). These findings strongly suggest that, although it is sometimes possible to decode item identity from this area, the early visual cortex is likely not causally necessary to represent item information beyond the time period of initial memory consolidation.
Finally, in addition to the temporally specific disruptive effects of TMS on guess rate, we also observed a significant main effect of stimulation side on swap rate (i.e., on the likelihood of confusing the cued item with one of the other items in memory). Contrary to what we would have expected, the swap rate was significantly reduced for targets appearing in the affected hemifield across all intervals tested. That is, TMS appears to have had a reliable enhancing effect for contralateral targets, increasing the likelihood of correctly binding individual colors to specific locations. According to Bays et al. (2009), swap errors likely arise as a result of coarse coding of the spatial positions of remembered features. This can lead to confusion about which item to report at test, particularly when either a large number of items need to be remembered (Bays et al., 2009) or when stimuli are presented very close together in space (Emrich & Ferber, 2012). One means by which TMS could reduce this type of error, therefore, is if it serves to sharpen the spatial tuning of neurons in the stimulated area. Although we are unaware of any direct evidence supporting this possibility, the results of Kozyrev et al. (2014) discussed above are suggestive. In their study, single pulses of TMS were found to produce a short-lived, localized increase in inhibition, which they attribute to a selective effect of TMS on inhibitory interneurons. Increased inhibition has, in turn, been shown to produce a sharpening of neural tuning in various sensory cortical areas (Isaacson & Scanziani, 2011). If spTMS of the visual cortex in our study increased inhibition and this increase sharpened the tuning of cells coding for the spatial position of remembered items, this could explain the beneficial effect of TMS on color–location binding. 
This possibility is speculative, but it could be tested by assessing the impact of spTMS of early visual cortex on recall of the spatial position, rather than color, of a remembered item. If TMS does sharpen spatial tuning, we would expect it to produce an improvement in the resolution of spatial recall (i.e., reduced SD).
TMS as applied here is a powerful method for determining the functional relevance of a particular brain area. Future work could benefit from stimulating a wider range of areas and/or localizing TMS targets using functional criteria. For example, current models and data suggest that WM functions are likely supported by distributed activity within a network of partly functionally specialized brain regions, rather than being mediated by a single area (Ester et al., 2016; D'Esposito & Postle, 2015; Ester, Sprague, & Serences, 2015). Future research could therefore target other regions within this network, such as the parietal cortex or pFC, to determine their specific contributions to WM. In addition, although the current study suggests that early visual cortex is not causally necessary for WM storage, different results may be achieved by targeting specific subregions of the visual cortex on the basis of functional criteria, rather than phosphene perception alone. For example, TMS targeting could be guided by fMRI scans aimed at localizing specific areas exhibiting particular functional characteristics (e.g., color selectivity or load sensitivity), by selecting regions on the basis of accurate classification of stimulus features, or via TMS-based localization of feature-specific regions. As an example of the latter, Banissy, Walsh, and Muggleton (2012) have shown that continuous theta-burst stimulation applied to area V4 disrupted color priming effects in a subsequent task. A procedure such as this one could be used to localize color-selective areas of visual cortex before the application of TMS during a color WM task. It is possible that localizing stimulus-selective regions in this way could reveal a role of visual cortex in WM maintenance. This remains to be determined, however, and additional research will be required to determine the functional roles of each region in the network of brain areas associated with WM functioning.
In conclusion, the results of this study suggest that interfering with visual cortex activity by either presenting visual masks or directly stimulating the cortex with TMS disrupts performance primarily by influencing the number of items that are successfully stored, with smaller, more variable disruptions of the quality of VWM representations. In each case, observed disruptive effects were largest when interference was applied very early during the retention interval, growing smaller or disappearing altogether at later intervals. In addition, TMS produced longer-lasting, location-specific improvements in color–location binding. Taken together, this pattern of results echoes recent findings suggesting that early visual cortex involvement in VWM for object features may be restricted to the period of time during which fragile sensory representations are being transformed into more durable VWM representations, with its role dissipating the further one goes into the retention interval.
Reprint requests should be sent to Amanda E. van Lamsweerde, North Dakota State University, Dept. 2765, P.O. Box 6050, Fargo, ND 58108, or via e-mail: email@example.com.