Although humans can hold multiple items in mind simultaneously, the contents of working memory (WM) can be selectively prioritized to guide future behavior. We explored whether the “same-object” benefits in visual processing may also be observed in visual WM. fMRI data were collected while participants performed a multistep serial retrocuing task in which they first viewed two 2-D objects (coherently moving colored dots). During retention, an initial relevance cue then indicated whether only the first or only the second object (“object-relevant”), or only the color of both objects or only their direction of motion would be relevant for the remainder of the trial (“feature-relevant”). On “object-relevant” trials, the ensuing priority cues selected either one of the features (“color” or “direction”) bound to the relevance-cued object, whereas on “feature-relevant” trials, the priority cues selected one of the two relevance-cued features. Using multivariate inverted encoding models, we found a same-object benefit on object-relevant trials in occipitotemporal regions: On feature-relevant trials, the first priority cue triggered a strengthening of the neural representation of the cued feature and a concomitant weakening to baseline of the uncued feature, whereas on object-relevant trials, the cued item remained active but did not increase in strength and the uncued item weakened but remained significantly elevated throughout the delay period. Although the stimulus-specific representation in frontoparietal regions was weak and uneven, these regions closely tracked the higher order information of which stimulus category was relevant for behavior throughout the trial, suggesting an important role in controlling the prioritization of information in visual WM.
Working memory (WM) is a cognitive function that enables the mental retention of information in the absence of sustained input from the physical world, its manipulation, and its use for guiding behavior. Current accounts of sensory WM hold that it relies on attentional mechanisms also involved in the prioritization of information perceived in the environment (D'Esposito & Postle, 2015; Kiyonaga & Egner, 2013; Gazzaley & Nobre, 2012; Oberauer & Hein, 2012). Consistent with this view is the fact that instructing an individual to prioritize a subset of information being held in WM with a retrodictive cue (hereafter, “retrocue”) improves its subsequent recall at the expense of uncued information, in a manner comparable to the effects on visual perception of prospectively cuing a location or a feature in an impending visual scene (Sahan, Verguts, Boehler, Pourtois, & Fias, 2016; Zokaei, Manohar, Husain, & Feredoes, 2014; Pertzov, Bays, Joseph, & Husain, 2013; Lepsien & Nobre, 2007; Griffin & Nobre, 2003). The retrocuing technique has also been used to test theories about the capacity and the temporal dynamics of the putative focus of attention (or “region of direct access”) in state-based models of WM (as reviewed in LaRocque, Lewis-Peacock, & Postle, 2014). In one series of studies, we have used a multistep dual serial retrocuing (DSR) procedure in which participants are first presented with two sample items to hold in WM, then after a retention interval, an initial retrocue indicates which of the two will be tested by the impending memory probe. Because participants know, however, that this probe will be followed by a second retrocue and a second probe and because they know there is an equal probability that the second retrocue will prioritize either sample item, they cannot forget the item not cued by the initial retrocue. This creates a portion of the trial in which two items are being held in WM, but only one is a “prioritized memory item” (PMI).
Initially, in studies with fMRI (LaRocque, Riggall, Emrich, & Postle, 2017; Rose et al., 2016; Lewis-Peacock & Postle, 2012) and with EEG (Rose et al., 2016; LaRocque, Lewis-Peacock, Drysdale, Oberauer, & Postle, 2013), multivariate pattern analyses (MVPAs) showed that decoding the neural representation of the PMI improved following the initial retrocue, whereas decoding the neural representation of the initially uncued (and, therefore, “unprioritized”) memory item (UMI) dropped to baseline levels. More recently, there have been reports of multivariate evidence for the UMI, with an item represented in a different region (e.g., Christophel, Iamshchinina, Yan, Allefeld, & Haynes, 2018) or in a different neural code (e.g., van Loon, Olmos-Solis, Fahrenfort, & Olivers, 2018) when a UMI relative to when a PMI. This study was designed to use a multivariate inverted encoding modeling (IEM) approach, which offers advantages over multivariate decoding approaches (Serences & Saproo, 2012), to address several questions that arise from these observations: (1) Is one effect of selection to increase the strength of the neural representation of the PMI? (2) Does the degradation of MVPA decodability of the UMI truly correspond to a weakening of its neural representation? (3) Regardless of the answer to (2), is the retrocuing effect on the neural representation of the UMI sensitive to its status as a discrete object versus as a feature in a multidimensional object? The answers to these questions will have implications for broader questions, such as whether principles of object- and feature-based attention also apply to WM, and whether a complex object is represented in WM as more than the set of features that define it.
Well-established principles of visual attention, such as biased competition (Desimone & Duncan, 1995) and divisive normalization (Carandini & Heeger, 2012), have the potential to account for the finding that retrocuing leads to a weakening of the neural representation of the UMI. That is, prioritization of one among multiple mnemonic representations could be achieved via top–down signals from frontoparietal systems that are important for the endogenous control of attention (e.g., Nelissen, Stokes, Nobre, & Rushworth, 2013). Importantly, the dynamics of biased competition have been demonstrated, with MVPA, to influence the population-level representation of objects in a manner that would be predicted from single-neuron studies, both in analyses of extracellular recordings from neurons in monkey ventral temporal cortex (Zhang et al., 2011) and in unpublished fMRI data from humans performing a selective attention task (Sheldon, Saad, Sahan, Meyering, & Postle, Submitted). Indeed, it is the effects suggesting the operation of object-based attention, in a different previous fMRI study of WM, that motivate the present experiment.
In an fMRI study of DSR by Lewis-Peacock, Drysdale, and Postle (2015), participants first viewed an image of a real-world object (e.g., a baseball) and were then cued as to what dimension of that stimulus would be interrogated by a memory probe: its silhouette outline, the phonology of its name, or the semantic category to which it belonged. Although the performance of classifiers trained independently to discriminate visual from phonological from semantic processing strengthened and weakened in a manner congruent with the cues, unlike in studies that required WM for two discrete objects, decoding for the uncued stimulus features did not drop to baseline levels. One possible account of this observation was that we were observing a neural correlate of an analogue of the “same-object” benefit that is seen in visual selective detection (Driver, 2001; Vecera, Behrmann, & McGoldrick, 2000; Duncan, 1984). That is, perhaps the uncued stimulus dimensions retained some level of activity because they were an inherent part of the same object from which a different stimulus dimension had been selected, and attention therefore spread to all components of the selected object. Limitations of that study's design, however, precluded a strong test of this possibility.
The question of whether multidimensional objects are represented as bound objects in visual WM remains contentious in WM research. On one hand, there are several studies suggesting that objects defined by a conjunction of two or more features can be maintained just as well as can single-feature objects, suggesting that the elementary units of visual WM are integrated objects (e.g., Peters, Kaiser, Rahm, & Bledowski, 2015; Luria & Vogel, 2011; Woodman & Vogel, 2008; Luck & Vogel, 1997). Others have argued, however, that the elementary units of visual WM are the features that make up complex visual objects and that the various features of an object are simultaneously stored in dimension-specific channels (e.g., Burmester & Fougnie, 2016; Bays, Wu, & Husain, 2011; Wheeler & Treisman, 2002). Furthermore, according to these feature-based accounts, feature binding in WM only occurs when attention is exerted over the to-be-bound features. For instance, Wheeler and Treisman (2002) showed that same-object benefits were observed in visual WM only when participants were not holding competing multifeature objects in WM, presumably because these would disrupt sustained attentional control. The design of this study may also help to address this debate.
To address more directly whether principles of object-based attention can be observed during visual WM, we designed this study to compare the neural effects of selecting one feature from among two 1-D objects being held concurrently in WM versus those of selecting one feature from a single 2-D compound object being held in WM. Additionally, for this study, we adopted an analytic method that would allow us to quantify the effects of selection on the strength of WM representations. A limitation of the MVPA decoding approach, such as what we have used in many previous DSR studies, is that it does not index the strength of neural representations per se. For example, although one can observe systematic changes in MVPA performance with the manipulation of, say, the number of items being held in WM (e.g., Emrich, Riggall, LaRocque, & Postle, 2013), the interpretation of such a finding with regard to the neural representation of stimulus information is equivocal. Although it could be that a load-related decline in the strength of the neural representations of items leads to the decline in MVPA decoding, there are equally plausible alternative explanations: Perhaps increasing load changes the level of stimulus-nonspecific noise that nonetheless influences the performance of the decoder, or perhaps increasing load changes the nature of the neural code, but not of the amplitude per se, of stimulus representations. IEM, in contrast, entails the fitting of data to one or more a priori models that specify the mapping from multiple sensor-level signals into a hypothesized population-level representation. This affords the quantification of different parameters of the model fit, such that one can estimate, in our case, the extent to which selection might strengthen or weaken our model's estimated reconstruction of the neural representation of stimulus information.
Ten neurologically healthy students from the University of Wisconsin–Madison (three women, 18–30 years, M = 22 years, all right-handed) participated in three 2-hr scanning sessions in return for payment. One participant was excluded from the analyses because of excessive head movement. Another participant was an author of this study (A. D. S.). All participants had normal or corrected-to-normal vision and reported having normal color vision. The research complied with the guidelines of the University of Wisconsin–Madison's Health Sciences institutional review board, and all participants gave written informed consent.
The experiment comprised a one-item delayed recall (a.k.a. “delayed estimation”) task and a multiple serial retrocuing (MSR) task (Figure 1). The purpose of the delayed recall task was twofold: to serve as a localizer that was independent of the MSR task and to train feature dimension-specific IEMs for testing on the MSR data. The primary rationale for this approach is that reconstructions by an IEM that is trained on an independent data set can be used for quantitative comparisons between experimental conditions of interest (e.g., the effect of priority status on estimates of the strength of the delay period representation of motion in the MSR task). Operationally, because we were interested in the effects of prioritization on mnemonic representations, we trained IEMs on the final repetition time (TR) of the delay period from the one-item delayed recall task for testing on data from the MSR task. Our reasoning was that such “one-item delay” models would constitute relatively robust and “pure” models of the representation of stimulus motion and stimulus color in WM: robust because multivariate estimates of stimulus information are stronger when just one item is being held in WM (e.g., Sprague, Ester, & Serences, 2014; Emrich et al., 2013) and “pure” because additional processes such as context binding (Gosseries et al., 2018) and attention shifting (e.g., Thigpen, Petro, Oschwald, Oberauer, & Keil, 2019) were less likely to be engaged. We made this choice with the knowledge that a limitation of this approach would be that all the results that it would generate could only be interpreted in terms of the codes engaged by the one-item delayed recall task. That is, if one or more neural codes are engaged uniquely by the MSR task, our approach would not be sensitive to this.
Nonetheless, our experience from previous studies (e.g., Cai, Sheldon, Yu, & Postle, 2019; LaRocque et al., 2017) gave us confidence that this approach would be effective at testing our principal hypotheses of interest.
The delayed recall task began with the presentation of a sample stimulus, either a patch of uniformly colored static dots whose color varied from trial to trial or a patch of gray dots moving with 100% coherence in a direction that varied from trial to trial. Data from this task were used to train IEMs to learn the neural bases of perceiving and remembering “color” and “direction of motion.” We note that, when moving dots are presented at high contrast within a circular aperture, as is the case here, it is possible that the signals that support direction-of-motion decoding may not (only) correspond to motion processing but perhaps (also) to other factors, such as the transients generated by appearance/disappearance of individual dots at the opaque boundary of the aperture (e.g., an aperture-inward bias; Wang, Merriam, Freeman, & Heeger, 2014). Importantly, the possible ambiguity about the precise computations underlying the signals generated by this condition is not problematic for interpreting our results, because our interest is not in the neural bases of motion perception, per se, but rather in the neural bases of attention to either of two visual features that had the subjective properties (for the participant) of being categorically different (one reproducible on a color bar and the other with a radial dial) and the objective properties (for the experimenters) of being varied along orthogonal dimensions: values in color space for static dots or direction of motion for color-invariant gray dots. For expository parsimony, from this point onward, we will refer to these feature dimensions as “color” and “direction.”
Each trial of the task of primary experimental interest (the MSR task) began with the serial presentation of two 2-D stimuli, each a patch of coherently moving colored dots. Next, a “relevance cue” designated what information from the two sample stimuli would be relevant for that trial: the color and direction of either the first or the second sample, or the color of both samples or the direction of both samples. Thus, a “relevance cue” indicating “〈First〉” or “〈Second〉” would designate a trial that would require selection of a feature bound to a 2-D object, whereas a “relevance cue” indicating “〈Color〉” or “〈Direction〉” would designate a trial that would require selection of the relevant feature of one from among two objects. Following the “relevance cue,” each trial proceeded in the same way as many previous D(ual)SR tasks: a first “prioritization cue” indicated which of the two relevant features would be tested by the first recall probe, then a second “prioritization cue” indicated, with a probability of .5, which of the same two relevant features would be tested by the second recall probe (Figure 1). For the remainder of this report, we will refer to trials when the “relevance cue” designated the “〈First〉” or “〈Second〉” sample stimulus as Bound trials (because they entailed prioritizing a feature that, when perceived in the sample display, was bound to another feature as part of a 2-D compound stimulus) and trials when the “relevance cue” designated the “〈Color〉” or “〈Direction〉” of the two sample stimuli as Unbound trials.
All sample stimuli comprised 400 dots (0.08° in diameter) displayed within an invisible circular aperture (7.75° in diameter). Delayed recall tasks featured 1-D trials: On direction trials, gray dots (L = 38, a = 0, b = 0) moved with 100% coherence at a constant speed of 3.2°/sec in a direction that was randomly sampled (without replacement) from a list of 180 vectors that spanned the full space of 360° in increments of 2°. The dots were repeatedly redrawn on each frame of the monitor with a refresh rate of 60 Hz. The direction recall interface was a dial: a white circle (7.75° in diameter) presented centrally with a white radius line (0.05° wide) extending from the center to the edge of the circle (like the “needle” of an analog speedometer). On each trial, the initial angle of the needle was determined randomly, and it could be made to rotate in a clockwise or counterclockwise direction by movement of a trackball.
On color trials, the dots were stationary and appeared in a color that was randomly sampled (without replacement) from a list of 180 values that spanned the full color space of 360° in increments of 2°. This color list was generated from an evenly distributed circle on the CIE L*a*b* color space, centered at (L = 80, a = 20, b = 20) with a radius of 60. The white point for the CIE XYZ space defining the LAB colors was set to (1, 1, 1). All colors had an equal luminance and brightness and only varied in hue. The color recall interface was a horizontal bar (12.14° × 1.55°) appearing at the center of the screen, its color transitioning smoothly across all possible colors in the color space, and a superimposed vertical white line (0.78° long, 0.05° wide; like the analog “tuning bar” on a radio). On each trial, the initial position of the tuning bar was determined randomly, and it could be made to translate horizontally along the color bar, to the left or to the right, by movement of the trackball.
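As an illustration only (this is not the stimulus code used in the experiment), the color list described above could be generated as in the following Python sketch. It assumes that the circle lies in the a-b plane at a constant L, which is consistent with the statement that the colors varied only in hue; the starting angle of 0° and the function name `lab_color_circle` are hypothetical, and the conversion from L*a*b* to monitor RGB is omitted.

```python
import math

def lab_color_circle(n_colors=180, center=(80.0, 20.0, 20.0), radius=60.0):
    """Return n_colors (L, a, b) triplets evenly spaced on a circle.

    Assumption: the circle lies in the a-b plane at constant L, and the
    first color sits at an angle of 0 degrees (an arbitrary choice).
    """
    lightness, a_center, b_center = center
    step_deg = 360.0 / n_colors  # 2 degrees for 180 colors
    colors = []
    for i in range(n_colors):
        theta = math.radians(i * step_deg)
        a = a_center + radius * math.cos(theta)
        b = b_center + radius * math.sin(theta)
        colors.append((lightness, a, b))
    return colors

colors = lab_color_circle()
```

Each returned triplet lies at a distance of 60 from the center (a = 20, b = 20), so hue is the only quantity that changes from one entry to the next.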
Scanning was performed across three sessions, each on a different day: Day 1, sixteen 15-trial blocks of one-item delayed recall; Day 2, eight 15-trial blocks of one-item delayed recall (for a total of 360 trials of delayed recall) plus six 10-trial blocks of MSR; Day 3, ten 10-trial blocks of MSR (for a total of 160 trials of MSR). Before the start of each scanning session, participants were given instructions, and the tasks were practiced both outside and inside the scanner. The task procedures in each phase are described in detail below.
Each trial started with a central presentation of a white fixation cross (0.78° width and height) against a black background (2 sec), followed by the central presentation of the sample (1 sec). The sample was either a patch of coherently moving gray dots or a patch of static dots all presented in a uniform color. The white cross returned during the subsequent 9-sec delay period, after which recall was prompted during a 3-sec window. On direction trials, participants were instructed to click a response key when they had rotated the needle to an angle that matched their memory of the direction of motion of the sample dots. For both trial types, the needle/tuning bar became thicker (0.13°) when the response was registered and remained at the selected position for the remainder of the 3-sec response window, followed by feedback (1 sec; errors ≤15° elicited “great,” >15° and <30° elicited “good,” and ≥30° elicited “poor”), followed by a gray fixation cross displayed throughout the 8-sec intertrial interval (ITI). Trial type was pseudorandomly interleaved across the whole experiment, with each trial type being equiprobable, hence unpredictable to the participants.
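The feedback rule stated above can be written compactly. The following Python sketch is illustrative only; the function name `feedback_label` is hypothetical, and the sketch assumes that the thresholds apply to the absolute recall error in degrees.

```python
def feedback_label(error_deg):
    """Map a recall error (in degrees) onto the feedback word shown to the participant.

    Assumption: thresholds apply to the absolute value of the error.
    """
    abs_error = abs(error_deg)
    if abs_error <= 15:
        return "great"
    if abs_error < 30:
        return "good"
    return "poor"
```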
Multiple Serial Retrocuing
In the MSR task, sample stimuli were 2-D compound objects that combined the features of the delayed recall stimuli: patches of moving, colored dots. After two samples were presented, a “relevance cue” indicated what would be the critical to-be-remembered information for that trial: cues indicating “〈First〉” or “〈Second〉” designated one of the two initially presented stimuli (i.e., the Bound condition); whereas cues indicating “〈Color〉” or “〈Direction〉” designated one of the two initially presented features (i.e., the Unbound condition). Then, the remainder of all trials unfolded with two serially occurring sequences of “prioritization cue”-delay-probe (Figure 1), with each prioritization cue indicating which feature would be tested by the ensuing recall probe. The logic was that memory load was equated across both conditions—four feature tokens were initially presented, then the “relevance cue” indicated which two of these four were relevant for the trial, presumably reducing the memory load to two feature tokens—and the factor of principal theoretical interest was whether the two trial-relevant feature tokens were bound together in the same object or were drawn from two discrete objects. Note that, in this design, the factor boundedness is confounded with category homogeneity, in that Bound trials always required memory for a color and a direction whereas Unbound trials always required memory for two colors or for two directions. However, because previous studies have shown a drop-to-baseline of the MVPA decodability of the UMI regardless of whether the two (unbound) memory items are drawn from the same or from different categories (LaRocque et al., 2013, 2017; Lewis-Peacock & Postle, 2012), this confound was deemed unlikely to complicate the interpretation of the results.
Whereas “relevance cues” presented a single word displayed in brackets—“〈First〉” or “〈Second〉” for Bound trials; “〈Direction〉” or “〈Color〉” for Unbound trials—“priority cues” used the same four words but without brackets. Note that the same word could never appear as both types of cues on the same trial (i.e., after a “relevance cue” of “〈First〉” or “〈Second〉,” the subsequent “priority cues” could only be “Direction” or “Color,” and vice versa). In both conditions, Priority Cue 2 was equally likely to cue the feature token that had or that had not been cued by Priority Cue 1, resulting in 20 “stay” trials—in which the same feature token was probed twice—and 20 “switch” trials, per cell in our design. The total duration of a trial was 52 sec, with participants performing 160 trials in randomized order across 16 blocks of 10 trials each. All stimulus parameters were the same as in the delayed recall task unless specified otherwise, and trial timing is illustrated in Figure 1. Both feature dimensions were randomly drawn (with replacement) from the full 360° of their respective feature spaces, in increments of 1°. The feature values of the second sample were constrained to a minimum angular separation of 40° relative to the first sample.
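One way to satisfy the 40° separation constraint is sketched below in Python. The source does not state how the constraint was enforced, so this is an assumption: the first value is drawn uniformly from the 360 integer values, and the second is drawn uniformly from the values that lie at least 40° away on the circle. The names `circular_distance` and `sample_feature_pair` are hypothetical, and the sketch applies the constraint to one feature dimension at a time.

```python
import random

MIN_SEPARATION_DEG = 40  # minimum angular separation stated in the text

def circular_distance(a_deg, b_deg):
    """Shortest angular distance between two values on a 360-degree circle."""
    diff = abs(a_deg - b_deg) % 360
    return min(diff, 360 - diff)

def sample_feature_pair(rng):
    """Draw two feature values (1-degree steps) separated by at least 40 degrees."""
    first = rng.randrange(360)
    allowed = [v for v in range(360)
               if circular_distance(v, first) >= MIN_SEPARATION_DEG]
    second = rng.choice(allowed)
    return first, second

rng = random.Random(0)                   # seeded only for reproducibility
color_pair = sample_feature_pair(rng)    # colors of sample 1 and sample 2
direction_pair = sample_feature_pair(rng)  # directions of sample 1 and sample 2
```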
Behavioral Data Analysis
Performance was assessed using both a descriptive approach and a model-fitting approach. For each, a continuous measure of error for each response was obtained as the angular distance between the reported feature value and the true feature value. For the descriptive approach, a precision measure was then calculated as the reciprocal of the standard deviation of the error (calculated with Fisher's formula with a correction for systematic underestimation as outlined in Bays, Catalao, & Husain, 2009; paulbays.com/). The descriptive precision measures for each of the two probes were then submitted to a 2 × 2 repeated-measures ANOVA, with Category selected by the “relevance cue” (Bound or Unbound) and Feature Dimension (direction or color) as within-participant factors. Trials on which no responses were given were excluded from the analyses (3%). An alpha level of .05 was applied, and Bonferroni correction was used on multiple tests to control for false-positives in post hoc testing.
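The descriptive measure can be sketched as follows in Python. This is a simplified illustration, not the analysis code used here: it computes the signed angular error and takes precision as the reciprocal of the standard circular standard deviation, sqrt(-2 ln R), where R is the mean resultant length. The correction for systematic underestimation described by Bays et al. (2009) is deliberately omitted, and all function names are hypothetical.

```python
import math

def angular_error(reported_deg, true_deg):
    """Signed error in degrees, wrapped to the interval [-180, 180)."""
    return (reported_deg - true_deg + 180.0) % 360.0 - 180.0

def circular_sd(errors_deg):
    """Circular standard deviation (radians) from the mean resultant length R.

    Note: undefined when R is 0 (errors uniform around the circle).
    """
    radians = [math.radians(e) for e in errors_deg]
    mean_cos = sum(math.cos(r) for r in radians) / len(radians)
    mean_sin = sum(math.sin(r) for r in radians) / len(radians)
    resultant_length = math.hypot(mean_cos, mean_sin)
    return math.sqrt(-2.0 * math.log(resultant_length))

def precision(errors_deg):
    """Reciprocal of the circular SD (no bias correction applied in this sketch).

    Note: raises ZeroDivisionError if all errors are identical (SD of 0).
    """
    return 1.0 / circular_sd(errors_deg)
```

Wrapping the error before computing the dispersion matters because both feature spaces are circular: a report of 350° for a true value of 10° is an error of 20° in magnitude, not 340°.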
Data Acquisition and Preprocessing
Whole-brain images were acquired with a 3-T MRI scanner (Discovery MR750; GE Healthcare) at the Lane Neuroimaging Laboratory at the University of Wisconsin–Madison. High-resolution T1-weighted images were acquired for all participants with a fast spoiled gradient echo sequence (TR = 8.132 msec, echo time = 3.18 msec, 12° flip angle, 156 axial slices, 256 × 256 in-plane, 1.0 mm isotropic). BOLD-sensitive data were acquired using a gradient-echo, echoplanar sequence (TR = 2 sec, echo time = 25 msec) within a 64 × 64 matrix (39 sagittal slices, 3.5 mm isotropic).
fMRI Data Analysis
The fMRI data were analyzed with the Analysis of Functional NeuroImages (AFNI) software package (afni.nimh.nih.gov; Cox, 1996). The first three volumes of each EPI series were included to allow magnetic saturation and were then removed from the analysis. The preprocessing pipeline for each day of scanning included the following steps. All volumes were spatially realigned to the first volume of the first functional run using rigid body realignment. Slice time correction was then applied to these functional volumes. Skull-stripped anatomical images were generated, to which the functional images were coregistered. Linear, quadratic, and cubic trends were removed from each run to reduce the influence of scanner drift. Data were then converted to percent signal change. For univariate analyses, data were spatially smoothed with a 4-mm FWHM Gaussian kernel and transformed into Talairach space. The (inverted) transformation parameters obtained in this step were then later used in the generation of the masks by transforming them from Talairach space to the native subject space. For multivariate analyses, data were z-scored separately within run for each voxel and left in their native space. As a common native space for all the volumes acquired across different days of scanning, we defined the volumes of the second day of scanning as a reference to which the volumes of Days 1 and 3 were realigned.
Generation of ROIs
ROIs were generated as a conjunction of anatomically and functionally defined voxels.
First, anatomical ROIs were generated using the Talairach anatomical atlas (TTatlas; https://sscc.nimh.nih.gov/afni/doc/misc/afni_ttatlas/index_html). Coordinates for relevant gyri in the TTatlas were used to generate masks for each gyrus, which were then warped into an individual's native space and aggregated to create three regional masks. The frontal anatomical mask comprised the precentral, anterior cingulate, inferior frontal, middle frontal, superior frontal, and medial frontal gyri. The parietal anatomical mask was similarly generated and comprised the posterior cingulate gyrus, precuneus, inferior parietal, and superior parietal lobules. Importantly, this included the intraparietal sulcus. The ventral occipitotemporal (VOT) mask comprised the lingual, fusiform, inferior occipital, inferior temporal, middle occipital, superior occipital gyri, and the cuneus.
Next, we fit a general linear model, separately for each participant, to the data from the delayed recall task. Regressors of interest were delta functions placed at stimulus onset and a 9-sec boxcar modeling the delay period, all convolved with a canonical hemodynamic response function. Nuisance covariates modeled head motion and block effects. From the solution of the general linear model, we extracted, from each anatomical region, the top 400 voxels with the highest positive t statistic associated with each of several contrasts: [Sample_color − baseline], [Sample_direction − baseline], [Delay_color − baseline], and [Delay_direction − baseline] to construct “feature-selective” ROIs; [Sample_color+direction − baseline] and [Delay_color+direction − baseline] to construct “feature-nonselective” ROIs. Of the resultant functionally defined ROIs, different instantiations would be most suitable for different analyses.
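The voxel selection step can be illustrated with a short Python sketch. It is a simplified stand-in for the actual AFNI-based procedure: it assumes that the t statistics for one contrast within one anatomical mask are available as a flat list, keeps only positive values, and returns the indices of the 400 largest. The function name `top_voxels` is hypothetical, and the sketch returns fewer than 400 indices if fewer voxels have a positive t value.

```python
def top_voxels(t_values, n_voxels=400):
    """Return sorted indices of the n_voxels voxels with the largest positive t.

    Assumption: t_values holds one t statistic per voxel in a single
    anatomical mask, for a single contrast.
    """
    positive = [(t, index) for index, t in enumerate(t_values) if t > 0]
    positive.sort(reverse=True)  # largest t statistic first
    return sorted(index for _, index in positive[:n_voxels])
```

For example, `top_voxels([0.5, 3.0, -1.0, 2.0], n_voxels=2)` selects the voxels at indices 1 and 3.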
Multivariate Inverted Encoding Modeling
As an initial validation step, we implemented a leave-one-run-out cross-validation procedure where, for each fold, one of the runs from the delayed recall task was set aside and each time point from the remaining runs was used to generate a weight matrix for each feature dimension (color and direction) within each ROI. We then inverted the weight matrix and applied it to data from the left-out run to generate reconstructions in channel space (also referred to as “channel tuning functions”). Reconstructions from each iteration of the leave-one-run-out procedure were then aligned and averaged together to generate reconstructions for the delayed recall task, which we then quantified using the procedure outlined below. These results are shown in Figure 4. Important to note is that the results from the analyses of the one-item delayed recall data indicated that the IEM reconstruction of the neural representation of direction was markedly superior to that for color and, furthermore, that the reconstruction of direction was markedly stronger in the VOT (Figure 4) than in the parietal and frontal ROIs (Figures 10 and 11). Therefore, to maximize the sensitivity for addressing our questions of principal interest, we focused on the representation of direction in the VOT ROI by training a “one-item delay” IEM from TR 6 of the delayed recall task and testing it on data from the MSR task.
Once the weight matrix was generated from delayed recall data, data from each time point in the MSR task (testing phase) were multiplied by the inverted weight matrix as described in Equation 3 to generate a reconstruction time course of direction. Each of these feature-specific reconstruction time courses was then circularly shifted to a common center (0°) and averaged with those from like trials. Thus, for example, to generate the “Prioritized” reconstructions for the Unbound condition (Figure 5A), channel outputs from trials for which “〈Direction〉” was “relevance”-cued and for the item that was cued by Priority Cue 1 were aligned along the “priority”-cued item's direction and averaged together. To generate the smooth, 360-point functions shown in the IEM figures (6 and 7), we repeated the IEM analysis a total of 39 times and shifted the centers of the direction or color channels by 1° on each iteration.
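The alignment and averaging step can be sketched as follows in Python. This is an illustration under stated assumptions rather than the analysis code: each reconstruction is taken to be a 360-point list in which index i corresponds to i°, and recentering is implemented as a rotation that moves the value at the remembered direction to index 0. The function names `recenter` and `average_aligned` are hypothetical.

```python
def recenter(reconstruction, stimulus_deg):
    """Rotate a reconstruction so the value at stimulus_deg lands at index 0.

    Assumption: the reconstruction is sampled in 1-degree steps, with
    index i corresponding to i degrees.
    """
    n_points = len(reconstruction)
    shift = int(round(stimulus_deg)) % n_points
    return reconstruction[shift:] + reconstruction[:shift]

def average_aligned(reconstructions, stimulus_degs):
    """Recenter each trial's reconstruction on its own stimulus, then average."""
    aligned = [recenter(r, s) for r, s in zip(reconstructions, stimulus_degs)]
    n_trials = len(aligned)
    n_points = len(aligned[0])
    return [sum(trial[i] for trial in aligned) / n_trials for i in range(n_points)]
```

After this step, a reconstruction that carries stimulus-specific information should peak at 0° regardless of which direction was remembered on each contributing trial.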
This procedure was repeated 10,000 times in total, yielding 10,000 bootstrapped estimates of amplitude, baseline, and concentration. To test whether the amplitudes of the PMI and UMI representations were significantly above baseline levels, (one-tailed) p values for the robustness of the PMI and UMI feature reconstructions were separately calculated in the Unbound and Bound conditions by assessing the percentage of bootstrapped iterations whose amplitude estimates were negative. In other words, statistical significance at an alpha level of .05 implies that at least 95% of resampled reconstructions have a positive amplitude (ppos).
To test whether feature reconstructions of PMIs were stronger than those of UMIs, we computed the difference between the bootstrapped amplitudes of the PMI and UMI reconstructions, forming a distribution of difference scores. We then assessed the percentage of bootstrapped iterations whose difference scores were negative. In other words, a statistically significant difference (at p < .05) between the PMI and UMI conditions would indicate that at least 95% of the differences in the resampled amplitude estimates of the PMI and UMI reconstructions were positive (ppos). The differences between the PMI and UMI reconstructions were separately calculated in the Unbound and Bound conditions. The same principle of statistical testing was applied to the baseline parameter. We were particularly interested in the attentional modulations of feature representations late in the delay period, namely, in time points 15 and 21 in response to Priority Cue 1 and Priority Cue 2, respectively. Therefore, the tests in the results sections are mainly focused on these time points (see Figure 6). However, reconstructions of the entire time courses of the delay periods of both priority cues are presented in Figures 14 and 15, along with the statistics on the amplitude and baseline differences between PMI and UMI feature reconstructions.
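The logic of these one-tailed bootstrap tests can be sketched in a few lines of Python. The sketch assumes that the bootstrapped amplitude estimates are already available as lists and, for the PMI versus UMI comparison, that the difference is computed within each bootstrap iteration (the pairing is an assumption; the text does not specify it). The function names are hypothetical.

```python
def proportion_negative(bootstrap_values):
    """One-tailed p value: share of bootstrapped estimates that are negative."""
    n_negative = sum(1 for value in bootstrap_values if value < 0)
    return n_negative / len(bootstrap_values)

def p_amplitude_above_zero(amplitudes):
    """Test whether a reconstruction's amplitude is reliably positive."""
    return proportion_negative(amplitudes)

def p_pmi_greater_than_umi(pmi_amplitudes, umi_amplitudes):
    """Test whether PMI amplitude exceeds UMI amplitude.

    Assumption: the two lists are paired by bootstrap iteration.
    """
    differences = [p - u for p, u in zip(pmi_amplitudes, umi_amplitudes)]
    return proportion_negative(differences)
```

With 10,000 iterations, a result is significant at an alpha level of .05 when the returned proportion is below .05, that is, when at least 95% of the estimates (or difference scores) are positive.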
Multivariate Pattern Analyses
We carried out MVPAs to clarify and/or refine the interpretation of some of the findings from the IEM analyses, using L2-regularized logistic regression (with a lambda penalty term of 25) applied to z-scored BOLD, and implemented with the Princeton Multivoxel Pattern Analysis toolbox (www.pni.princeton.edu/mvpa/) and custom scripts in MATLAB (cf. Lewis-Peacock, Drysdale, Oberauer, & Postle, 2012). MVPA was carried out on two levels of stimulus information: (i) within feature (color and direction labeled as belonging to one of four quadrants in their respective 360° stimulus spaces, carried out in feature-selective ROIs) and (ii) between feature (color vs. direction, carried out in feature-nonselective ROIs).
To assess the representation of stimulus level of information, we trained two classifiers, one from color-sensitive voxels and one from motion direction-sensitive voxels as in the IEM analyses, separately for each ROI, to classify motion direction and color values categorized as belonging to one of four quadrants (centered at 45°, 135°, 225°, and 315°, each spanning a 90° wedge of positions within the full 360° range of possible colors and motion directions). Categorizing stimuli in this way enabled us to determine whether coarse stimulus information might be decodable in the frontal and parietal ROIs for which the IEMs failed to reconstruct exemplar-specific feature information. Classifiers were trained on late delay period data (TR 6) from the delayed recall task, with k-fold cross-validation (train on 23 runs, test on the 24th); classification accuracy was averaged across folds and compared against chance performance with two-tailed t tests (Bonferroni corrected). Note that, as we did for IEM analyses, we focused on the late delay period of the one-item delayed recall task for MVPA classifier training.
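The quadrant labeling and the leave-one-run-out split can be sketched as follows. This is an illustration only: the function names are hypothetical, and it assumes the four wedges begin at 0°, 90°, 180°, and 270° (so that they are centered at 45°, 135°, 225°, and 315°). The classifier itself is omitted.

```python
def quadrant_label(angle_deg):
    """Map an angle in degrees to a quadrant index 0-3.

    Wedges are [0, 90), [90, 180), [180, 270), and [270, 360).
    """
    return int((angle_deg % 360.0) // 90.0)


def leave_one_run_out(run_ids):
    """Yield (train_indices, test_indices), holding out one run per fold."""
    for held_out in sorted(set(run_ids)):
        train = [i for i, r in enumerate(run_ids) if r != held_out]
        test = [i for i, r in enumerate(run_ids) if r == held_out]
        yield train, test


# Toy example: four trials drawn from three hypothetical runs.
labels = [quadrant_label(a) for a in (45.0, 100.0, 200.0, 359.9)]
folds = list(leave_one_run_out([1, 1, 2, 3]))
```

With 24 runs, this generator would produce 24 folds, each training on 23 runs and testing on the remaining one.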
Additionally, to assess the representation of higher order information about stimulus category, we trained classifiers (in feature-nonselective ROIs) to accurately distinguish between the categories of color versus direction on data from the one-item delayed recall task. MVPA methods were the same as described above, with the exception that only two labels were used for training (“color,” “direction”), and so statistical significance was assessed with two-tailed t tests comparing accuracy to chance performance (50%).
Finally, to assess evidence for cognitive control-related activity, we also applied the category-level decoders trained on data from the one-item delayed recall task to data from the MSR task. Because a hallmark of control is that it should dynamically track changing contingencies within individual trials, we carried out these analyses by applying late-delay classifiers from the one-item delayed recall task to every TR of “switch” trials from the Bound condition of the MSR task. This would generate classification time courses for MSR trials that featured within-trial switches of priority between stimulus categories. For this analysis, at each time point of the MSR task, and for each category, a measure of pattern similarity was computed between the voxel patterns for that TR and the late-delay patterns from the one-item delayed recall task. Using logistic regression, each category's pattern similarity score was then converted into “classifier evidence,” a value between 0 and 1 that can be interpreted as the extent to which the pattern at the tested TR matches the pattern learned by the classifier (i.e., conceptually similar to a correlation coefficient; cf. Lewis-Peacock & Postle, 2012; Polyn, Natu, Cohen, & Norman, 2005). Average classifier estimates were computed by sorting trials according to the feature dimension selected by Priority Cue 1 and the feature dimension selected by Priority Cue 2. Statistical significance of the evidence between the feature dimensions as a function of the priority cues was computed by pairwise t tests at each time point (Bonferroni-corrected).
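The conversion of a pattern score into classifier evidence can be illustrated with a minimal sketch of a two-category logistic readout. It assumes that a weight vector and bias have already been learned from the delayed recall data; the values below are hypothetical, and the sketch does not reproduce the toolbox's actual computation.

```python
import math


def classifier_evidence(weights, bias, pattern):
    """Logistic readout: map a voxel pattern to a value between 0 and 1."""
    score = sum(w * x for w, x in zip(weights, pattern)) + bias
    # Numerically stable logistic function for large positive or negative scores.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def evidence_time_course(weights, bias, patterns_by_tr):
    """Apply one fixed (late-delay) classifier to the pattern at every TR."""
    return [classifier_evidence(weights, bias, p) for p in patterns_by_tr]


# Toy example: hypothetical two-voxel classifier applied to three TRs.
toy_weights, toy_bias = [1.0, -1.0], 0.0
toy_course = evidence_time_course(
    toy_weights, toy_bias, [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
)
```

In this toy example, evidence is at 0.5 when the pattern is uninformative, rises when the first voxel dominates, and falls when the second dominates, which is the kind of cue-locked reversal the time course analysis is designed to detect.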
Analysis of the precision of responses revealed only main effects of Feature Dimension: Probe 1, F(1, 8) = 10.76, p < .05, ηp2 = .57, with an overall higher precision for direction (M = 2.99, SE = .61) than for color (M = 1.44, SE = .19) responses (other effects ns); Probe 2, F(1, 8) = 7.62, p < .05, ηp2 = .49, with an overall higher precision for direction (M = 2.46, SE = .49) than for color (M = 1.53, SE = .25) responses (other effects ns; Figure 2). Follow-up analyses comparing the probes revealed that the drop in precision for direction in response to Probe 2 relative to Probe 1 was statistically significant, t(8) = 4.25, p < .01, CI [.25, .86], whereas the precision for color did not statistically differ across probes, t(8) = −0.23, p = .82, CI [−.27, .22].
Although inspection of results for κ (Figure 3A) suggests qualitatively similar patterns to those for precision, ANOVAs indicated, instead, a greater sensitivity to boundedness (main effect of Relevance Cue, F(1, 8) = 5.66, p < .05, ηp2 = .41, with κ higher in the Bound (M = 20.02, SE = 2.91) than the Unbound (M = 13.94, SE = 1.4) condition), and the difference between feature dimensions no longer meeting the threshold for significance, F(1, 8) = 4.84, p = .059, ηp2 = .38 (direction trials, M = 21.75, SE = 3.78; color trials, M = 12.22, SE = 1.50). The interaction between boundedness and feature dimension for κ did not reach significance (F < 1).
The greater difficulty of color than direction performance, as suggested by the descriptive statistics, was captured in the model's estimates of PT, F(1, 8) = 7.51, p < .05, ηp2 = .48 (with a higher PT for direction, M = .95, SE = .016, than for color, M = .89, SE = .036, responses; other Fs < 1), and of PNT, F(1, 8) = 8.17, p < .05, ηp2 = .51 (with a lower PNT for direction, M = .02, SE = .006, than for color, M = .059, SE = .016; other Fs < 1). There were no differences in PU (Fs < 4).
One-item Delayed Recall
Sample-evoked ventral occipitotemporal ROI.
In the sample-evoked VOT ROI, stimulus feature reconstructions were markedly better for direction than for color. For direction, a model that was trained on TR 4 (i.e., the volume collected from 5 to 6 sec after sample onset, when the sample-evoked BOLD response was expected to be maximal; cf. Figure 1C) yielded a robust reconstruction when tested at that same TR (with leave-one-run-out k-fold cross-validation). Furthermore, sweeping this model across all 12 TRs of the trial yielded reliable reconstructions spanning from the first TR of the delay period through to the second TR after the response window, thereby demonstrating robust cross-temporal generalization and indicating that stimulus direction was represented, in part, with a perceptual neural code throughout the trial (Figure 4A). The same qualitative pattern of reconstruction was observed when this process was repeated with a model trained on data from TR 6, which was intended to capture signal primarily attributable to delay period processing (Figure 4B). Finally, direction reconstruction was achieved at all but the TR preceding sample presentation when a model was trained and tested at each TR (i.e., “along the diagonal” of a cross-temporal matrix; Figure 4C). For color, an IEM could be successfully trained at TR 4 but did not generalize to any other TRs (Figure 4D), and one trained at TR 6 generalized to TR 5 and TR 8 (Figure 4E). When models were trained and tested at each TR, the reconstruction of color information was successful for only a subset of TRs associated with the perception/encoding and retention of color information, as well as for TRs during the ITI that followed the probe (Figure 4F).
Although one might expect poorer IEM reconstruction for the stimulus feature that was remembered less well (i.e., color recall was inferior to direction recall), it could also be the case that the reconstruction of neural representations of color would have been more robust had we attempted to optimize for IEM the plane of the slice through CIE space that we selected to generate our stimuli. Additionally, our method does not allow us to know the extent to which verbalization may have contributed to color WM performance, a possibility that would not be expected to produce robust IEM reconstructions in the VOT ROI.
Delay-evoked parietal and frontal ROIs.
In parietal cortex, successful reconstruction of sample direction was restricted to just a few TRs associated with encoding/early delay and with recall/response, and there were no successful reconstructions of sample color (Figure 10). In frontal cortex, the pattern was similar, with the exception that there were a few successful reconstructions of sample color in ITI TRs (Figure 11).
Multiple Serial Retrocuing
The results from the analyses of the one-item delayed recall data indicated that the IEM reconstruction of the neural representation of direction was markedly superior to that for color and, furthermore, that the reconstruction of direction was markedly stronger in the VOT (Figure 4) than in the parietal and frontal ROIs (Figures 10 and 11). Therefore, to maximize the sensitivity for addressing our question of principal interest, we focused on the representation of direction in the VOT ROI by training a “one-item delay” IEM from TR 6 of the delayed recall task and testing it on data from the MSR task.
On trials of all types, the neural representation of direction in the VOT was robust during the delay period prior to the “relevance cue” (TR 6 in Figure 5).
Unbound condition, Direction relevance cue.
The reconstructions of the direction of the sample that would become the PMI (Figure 5A) and of the sample that would become the UMI (Figure 5B) were both robust during the TRs leading up to and immediately following Priority Cue 1 (TR 11), with statistically comparable reconstructions from the data collapsed across TR 9, TR 10, and TR 11 (two-tailed, p = .69).
Once Priority Cue 1 designated one item to be the PMI and the other to be the UMI, the amplitude of the reconstructions diverged markedly. IEM of the PMI produced robust delay period reconstructions from TR 12 through TR 15, with the reconstruction strengthening across the delay preceding Probe 1 (slope = 0.039, p < .05). For IEM of the UMI, in contrast, the amplitude of the 0° channel declined significantly from TR 12 to TR 15 (slope = −0.069, p < .001), and no reconstructions were reliable from TR 13 to TR 16. Indeed, the output of the channels near the 0° channel dropped below that of flanking channels, although this trend toward a “negative” channel tuning function did not achieve significance when the data were collapsed across TR 13, TR 14, and TR 15 (two-tailed, p = .27). Finally, statistical comparisons confirmed that reconstructions of the PMI were higher in amplitude than those of the UMI (signal from both collapsed across TR 13 through TR 15, one-tailed, p < .0001).
No reconstructions were reliable in the parietal and frontal ROIs. Unexpectedly, however, in all ROIs, the overall magnitude of IEM channel outputs increased markedly from Delay 1 (TR 4 through TR 6) to Delay 2 (TR 8 through TR 10), as reflected in significant increases in the values of the baseline parameter in the VOT (p < .001, two-tailed) and parietal (p < .01, two-tailed) ROIs; the increase in the frontal ROI did not reach significance (p = .11). Importantly, these effects were not accompanied by changes in the overall BOLD signal intensity. (Possible interpretations of changes in baseline will be considered in the Multivariate Pattern Classification section.)
The IMI following Priority Cue 2.
At TR 21, on “stay” trials, neither the PMI nor the IMI could be reconstructed (Figure 6B). On “switch” trials, however, the IEM reconstruction of the newly cued PMI (which had been flat and nonsignificant during the previous delay; Figure 6A) was robust at the end of Delay 4 (TR 21) and that of the IMI trended in the opposite direction: that is, the reconstruction displayed minimal channel output along the aligned direction and maximal output for channels corresponding to the opposite direction. Because such a flipped reconstruction was not predicted, we assessed its significance with a two-tailed test, which indicated that it missed the threshold for significance (p = .06; Figure 6C). At TR 21, the PMI and IMI reconstructions differed in amplitude (p < .01) and in baseline (p < .05, two-tailed). Overall, the IEM reconstructions of the PMI were significantly stronger than those of the IMI in the entire interval following Prioritization Cue 2 (signal from both collapsed across TR 19 through TR 21, one-tailed, p < .001).
Interim summary of results from Unbound condition.
For trials on which the “relevance cue” indicated “〈Direction〉,” the subsequent Priority Cue 1 influenced the representation of both features held in WM: The representation of the PMI increased in strength across the subsequent Delay 3, whereas the representation of the UMI decreased in strength to the point that it could no longer be reconstructed by the end of Delay 3. This is the pattern of results that would be expected if the principles of biased competition apply to the selection of one from among two WM representations of stimuli in the same way that they do for the selection of one from among two objects in a visual scene (e.g., Sheldon et al., Submitted). On trials when Priority Cue 2 prompted a switch of prioritization status, the IEM reconstruction of the newly designated PMI increased over the course of Delay 4, whereas that of the newly designated IMI decreased over the course of Delay 4, from being significantly positive at the beginning (Figure 6A) to being nonsignificant and, indeed, by the end of the delay period, bordering on flipped relative to the trained model. Additionally, the baseline parameter of this flipped reconstruction of the IMI during Delay 4 was higher than that of the PMI.
After the “relevance cue” indicated “〈First〉” or “〈Second〉,” the reconstructions from the VOT ROI of the direction of the sample that would become the PMI (Figure 5C) and of the sample that would become the UMI (Figure 5D) were robust during the TRs leading up to and immediately following Priority Cue 1 (TR 11), with statistically comparable reconstructions from signal collapsed across TR 9, TR 10, and TR 11 (two-tailed, p = .77). Unlike in the Unbound condition, however, the designation by Priority Cue 1 of the PMI and the UMI had only a relatively minor effect on the IEM reconstructions. Statistical comparisons confirmed that reconstructions of the PMI and UMI did not differ over the course of this delay period (signal from both collapsed across TR 13 through TR 15, one-tailed, p = .14). Although the IEMs of the PMI reconstructions were sustained across the ensuing delay period, their strength did not increase (slope = 0.026, p = .25). Furthermore, although the amplitude of the reconstructions of the UMI decreased across this delay period (slope = −0.050, p < .05), they remained statistically significant across the delay period, and only beginning with TR 15 did the reconstruction of the UMI decline in amplitude to a point at which it was significantly lower than that of the PMI (p < .05).
Results from the Bound condition also differed markedly from the Unbound condition in the parietal and frontal ROIs, in that representations of the direction of the PMI became significant with the onset of Priority Cue 1 and for a few TRs into the ensuing Delay 3 (Figures 12 and 13), as did the representation of the UMI in the frontal ROI for a single TR.
The overall magnitude of IEM channel outputs increased from Delay 1 to Delay 2, as reflected in the values of the baseline parameters, although, as with the Unbound condition, this increase only reached significance in the VOT (two-tailed, p < .0001) and parietal (two-tailed, p < .05) ROIs.
The IMI following Priority Cue 2.
The patterns at late Delay 4 (TR 21) mirrored those from the Unbound condition: On “stay” trials, neither the PMI nor the IMI could be reconstructed (Figure 6E); on “switch” trials, reconstruction of the PMI was significantly positive (p < .0001) and that of the IMI was nonsignificant, but trending toward a flipped reconstruction (p = .09, two-tailed), and the two differed significantly from each other (p < .0001). Moreover, the overall IEM reconstructions of the PMI were significantly stronger than those of the IMI (signal from both collapsed across TR 19 through TR 21, one-tailed, p < .001).
Interim summary of results from Bound condition.
On trials when “direction” was cued by Prioritization Cue 1, the IEM reconstruction of the PMI remained robust across the ensuing delay period, but it did not increase in strength. On trials when “color” was cued by Prioritization Cue 1, although the amplitude of the neural representation of the UMI declined across the ensuing delay period, it nonetheless remained significantly elevated throughout the delay (Figure 5). In comparison to the Unbound condition, these results are consistent with the idea that a cardinal principle of object-based attention may apply to visual WM in a manner similar to visual perception: When one feature of an object is selected, the benefits of attention extend to all features of that object (e.g., Egly, Driver, & Rafal, 1994; Duncan, 1984). Although we cannot rule out the possibility that the UMI may have also benefited from the allocation of spatial attention to the stimulus of which it was a part, this possibility seems unlikely because, by this account, spatial attention would also be expected to boost the strength of the IMI, but this was not observed: Following Prioritization Cue 2, the representation of the PMI at TR 21 was positive, and the representation of the IMI was nonsignificant (and, indeed, trending in a direction opposite to what a spatial attention account would predict). Because this pattern mirrored what was observed in the Unbound condition, the implication is that the dynamics of “dropping” a no-longer-needed feature from WM may be similar across these two conditions.
Comparisons between conditions.
To quantify comparisons across conditions, we first established that the amplitudes of the reconstructions of direction that would become the PMI during Delay 3 (i.e., at TR 12–TR 15) were comparable during the time immediately preceding Priority Cue 1 (i.e., at TR 9–TR 11, p = .31; ns). Next, comparison of the change in PMI amplitudes across Delay 3 indicated that the strengthening of the reconstruction of the PMI that was observed in the Unbound condition did not differ significantly from the flat slope of the amplitude of the PMI in the Bound condition (slope difference = .01; two-tailed, p = .74). Finally, at TR 15, the amplitudes of the PMI did not differ between Bound and Unbound conditions (two-tailed, p = .52). Turning to the UMI, we first established that the amplitudes of the representations of direction that would become the UMI during Delay 3 were comparable during the time immediately preceding Priority Cue 1 (i.e., at TR 9–TR 11: two-tailed, p = .62; ns). Next, comparison of the slopes of the decline in the strength of the UMI across Delay 3 indicated that the weakening of the reconstruction of the UMI across Delay 3 did not differ significantly between the two conditions (slope difference = −0.019, p = .55). At TR 15, however, the amplitudes of the UMI differed between Bound and Unbound conditions (two-tailed, p < .05). Across the delay period (collapsing across TR 13 through TR 15), the magnitude of the PMI–UMI difference was not different between Unbound and Bound conditions (two-tailed, p = .31).
For Delay 4, on “switch” trials, the pattern was qualitatively similar for Bound and Unbound conditions: Reconstructions at TR 21 were significantly positive for the PMI and trending toward flipped for the IMI. The amplitudes of the PMI did not differ between Bound and Unbound conditions (two-tailed, p = .31), nor did the amplitudes of the IMI differ between Bound and Unbound conditions (two-tailed, p = .30). The two conditions differed quantitatively, however, in that, across the delay period (collapsing across TR 19 through TR 21), the magnitude of the PMI–IMI difference was greater in the Unbound than in the Bound condition (p < .05, two-tailed).
Interpretation of the differences in the Bound versus Unbound UMI at TR 15 and IMI at TR 21 must be qualified by the fact that the factor of binding (bound/unbound) was confounded with category congruity (different/same). For example, the lower amplitude of the Unbound UMI at TR 15 could be due in part to the fact that the PMI on these trials was also a direction of motion, which may have resulted in greater interitem interference relative to Bound trials, on which the PMI was a color. One argument against this alternative account is that, although such interitem interference might also be expected to impede the transition of the Unbound feature from UMI to PMI on switch trials, this transition was, to the contrary, greater than it was on Bound trials. In addition to this line of argumentation, however, it was also important to find evidence in these data for a same-object effect for contrasts to which this confound did not apply. This was accomplished by examining patterns of IEM reconstructions of motion during epochs when color was the cued stimulus feature.
Color-cued trial epochs.
Unbound condition, Color relevance cue.
When the relevance cue indicated that the colors of the two stimuli were the critical to-be-remembered features, the neural representation of the direction of motion of the two stimuli could not be reconstructed at the end of Delay 2 (TR 10; Figure 7B). There remained, however, evidence of an effect of feature binding on these trials, in that Priority Cue 1 had the effect of dissociating the strength of the reconstructions of the two no-longer-relevant directions of motion: Although the reconstruction of the direction associated with the cued color was flat at the end of the ensuing delay period (TR 15), that of the direction associated with the uncued color became significant in the flipped direction (p < .0001), and the two differed significantly from each other (p < .0001; Figure 7D).
Bound condition, Color Priority Cue 1.
When the relevance cue indicated that one of the two stimuli would be relevant for the remainder of the trial, the reconstruction of the direction of the relevance-cued stimulus was highly significant at the end of Delay 2 (TR 10; Figure 7C, p < .001), whereas that of the uncued stimulus became flipped (TR 10; Figure 7C, p < .01), and the baseline parameter differed between the two (TR 10; Figure 7C, p < .0001). When Priority Cue 1 then cued color, the reconstruction of the direction of that stimulus was significantly above baseline at the end of the ensuing delay period (TR 15; p < .05), whereas the reconstruction of the direction of the irrelevant stimulus was flat, and the two differed significantly (p < .05).
Interim summary of results from color-cued trials.
These analyses provided evidence of a same-object effect in conditions that were not complicated by confounding factors. After a Color relevance cue, direction was irrelevant for the remainder of the trial. Nonetheless, IEM reconstructions of direction differed significantly after Priority Cue 1 prompted the prioritization of one of the two colors, with the reconstruction of the direction linked to the uncued color becoming significantly flipped. On bound trials, when Priority Cue 1 designated the relevant object's color, the reconstruction of that object's direction was significant at the end of the ensuing delay period and significantly greater than the flat reconstruction of the direction of the no-longer-relevant object.
Multivariate Pattern Classification
Delayed Recall Task
After binning stimuli into arbitrarily defined quadrants, direction of motion could be successfully decoded from the VOT ROI, t(8) = 3.45, p < .05, but not from the parietal or frontal ROIs (ts < 1.36; Figure 8A). Decoding of color was unsuccessful in all ROIs (ts < 1.03; Figure 12A). Decoding of trial type (color vs. direction) was successful in all ROIs: VOT, t(8) = 6.72, p < .001; parietal, t(8) = 10.61, p < .001; and frontal, t(8) = 8.4, p < .001 (Figure 8B).
MVPA of stimulus category.
Although parietal and frontal cortex are generally associated with the control of WM (e.g., Brincat, Siegel, von Nicolai, & Miller, 2018; Pribram, Ahumada, Hartog, & Ross, 1964), for the successful decoding of trial type (e.g., Figure 8B) to be interpreted as reflecting control-related activity, one would want to see that it dynamically tracks changing contingencies within individual trials. We assessed this possibility by applying late-delay classifiers from the one-item delayed recall task to every TR of “switch” trials from the Bound condition of the MSR task, so as to generate classification time courses. (Note that only the Bound condition included within-trial switches between stimulus categories.) These analyses, carried out in feature-nonselective ROIs in parietal and frontal cortex, revealed that information about color and direction was represented, to the same extent, from the beginning of the trial until TR 11 (the onset of Prioritization Cue 1). At this point in the trial, classifier evidence for the cued feature increased steeply and evidence for the uncued feature decreased steeply. At TR 17 (the onset of Prioritization Cue 2, a “switch” cue in these analyses), these patterns reversed, with evidence for the newly prioritized feature rising precipitously, and evidence for the newly unprioritized feature falling precipitously (Figure 9A).
Time course of the baseline parameter from IEM mirrors MVPA of stimulus category.
Inspection of the IEM time courses of the first half of the MSR trials (Figure 5), as well as the late-delay reconstructions from Delay 4 (Figure 6), reveals considerable variation in the baseline parameter of IEMs. By definition, this parameter does not relate directly to stimulus representation. To explore the possibility that these patterns of variation may track the priority of category-level information and thereby possibly index control-related activity, we plotted the values of the baseline parameter from tests of the late-delay IEM of one-item delayed recall on each TR of the trials of the MSR task that are featured in Figure 9A. As illustrated in Figure 9B, the fluctuations of the baseline parameter closely followed those of the MVPA of feature category. Importantly, univariate BOLD fluctuations for these different trial types were statistically indistinguishable, indicating that this evidence for task-related variation of multivariate category-level information in the parietal and frontal ROIs is not a mere byproduct of fluctuations in overall signal intensity (Figure 9C).
Interim summary of MVPA results.
The results of MVPA of delayed recall data binned post hoc into quadrants in stimulus-feature space were broadly consistent with the IEM results, in that evidence for stimulus-level representation of sample information was only reliably found for the dimension of direction of motion and only in the VOT ROI. This reinforces the idea that stimulus-specific direction information was most prominently represented in VOT cortex. Classification at the more abstract level of stimulus category (i.e., color vs. direction), however, was reliable in all three ROIs. Furthermore, when the late-delay delayed recall decoder was swept across data from the MSR task, it revealed that the representation of stimulus category in all ROIs was priority dependent and with a time course that was tightly coupled to the structure of the task: MVPA evidence for both categories was comparable at the beginning of the trial, prior to item prioritization, and closely tracked prioritization once prioritization cuing began. Coupled with the weak and uneven evidence for stimulus-level representation in frontal and parietal ROIs, these results are consistent with the idea that frontal and parietal networks were preferentially involved in controlling the maintenance of and changing of the priority of VOT-supported stimulus representations. Finally, the post hoc comparison of these MVPA time courses with the time courses of fluctuations in the baseline parameter of IEM suggests that the latter, too, may provide an index of the control of representations in visual WM.
The results from our study of an MSR task yielded two novel sets of empirical observations, each with important implications for our understanding of the mechanisms underlying visual WM. The first relates to the effects of Prioritization Cue 1 on stimulus representations. When two pieces of remembered information are drawn from separate objects, IEM indicates that the prioritization of one results in the strengthening of its neural representation and in the weakening, to baseline levels, of the active neural representation of the unprioritized item (even though this UMI must be retained in WM). When these two pieces of remembered information are drawn from the same object, in contrast, the effects of prioritization are markedly different: The active representation of the PMI is sustained but does not increase in strength, and the strength of the active representation of the UMI declines but nonetheless remains significantly above baseline. Although these two observations confound boundedness with category homogeneity of the two memory items, comparable same-object effects are also observed on color-cued trials, when the category homogeneity does not pertain. The second novel observation arises from contrasting the weak and uneven representation of stimulus identity in parietal and frontal cortex, whether assessed by IEM or MVPA, with the robust cue-locked dynamics of MVPA decoding of stimulus category in these two regions. These patterns are consistent with a role for these regions in the representation of priority.
Object-based Attention in Visual WM
The differential pattern of results in the Unbound versus the Bound condition suggests that key principles governing object-based attentional prioritization in visual perception also apply to visual WM. When the two remembered items belonged to separate objects (Unbound), the biasing of their competition for representation (Desimone & Duncan, 1995) resulted in the strengthening of the neural representation of the PMI at the expense of the strength of the UMI. When, however, the two remembered items belonged to the same object (Bound), the neural representation of the UMI remained elevated, consistent with an automatic spread of object-based attention to all elements of the remembered object. The theoretical implications of this finding are twofold. First, by illustrating object-based attention-like effects in visual WM, it extends the boundary conditions for which it can be said that visual WM appears to arise from “nothing more” than attention allocated to neural representations of objects not currently accessible to the eyes (cf. Myers, Stokes, & Nobre, 2017; Chun, 2011; Postle, 2006; Cowan, 1995). Second, it supports the idea that multidimensional objects are represented in visual WM as bound objects, not as a collection of unbound features (cf. Park, Sy, Hong, & Tong, 2017; Bays et al., 2011; Luria & Vogel, 2011; Woodman & Vogel, 2008; Wheeler & Treisman, 2002; Luck & Vogel, 1997).
Controlling Priority in Visual WM
Stimulus representation in parietal and frontal cortex was markedly weaker than in the VOT ROI, whether assessed by IEM or by MVPA. There was, however, clear evidence that these two regions tracked the higher order information of which stimulus category was prioritized during each epoch of the trial, and they did so with a high degree of temporal precision. This is consistent with the idea that cue-driven changes in priority were implemented in WM via activity in frontoparietal circuits, whose representation of the prioritized category may have acted as a source of top–down bias on high-fidelity representations of stimulus features in VOT cortex (cf. Sheldon et al., Submitted; Nelissen et al., 2013).
Evidence for the Active Removal of Information from WM?
There are, in principle, two ways that information can exit WM: Its links to trial-specific context can decay after attention has been shifted away from it, or it can be actively removed, via suppression, recoding, or some other mechanism. Our results provide suggestive evidence for an active mechanism: In three instances when a retrocue indicated that the uncued information was no longer relevant for behavior (i.e., designated it an IMI), its neural representation transitioned from a robust reconstruction of the trained model to one that approached (after Priority Cue 2) or achieved (after the “relevance cue”) a state that was flipped relative to when it had been in the focus of attention. Although we cannot know from these data what mechanism(s) would have effected this change in representation (e.g., inhibition, recoding), we can postulate that they may only be engaged when task demands require the active removal of information from WM. In the one-item delayed recall task, in contrast, the neural representation of the sample item seems to just “fade away” at the end of the trial.
A mechanistically noncommittal interpretation of the retrocue-triggered transformation of IMIs is that, once the identity of the information that will be relevant for the remainder of the MSR trial is known, the resultant IMI is processed in a manner that minimizes the likelihood that it will interfere with performance on the remainder of the trial. A similar phenomenon has been observed in a different variant of the MSR task (with fMRI; Yu & Postle, 2018), in a 2-back task (with EEG; Wan, Cai, Samaha, & Postle, In Press), and in a dual serial visual search task (van Loon et al., 2018). It could be that recoding unprioritized information is a general mechanism for preventing that information from interfering with performance that needs to be guided by a PMI. By this account, removing an item from WM would be accomplished in a two-step process: First, recode it so that it is less likely to interfere with the PMI; second, let the recoded representation decay.
Reprint requests should be sent to Muhammet I. Sahan, Department of Experimental Psychology, Ghent University, H Dunanlaan 2, 9000 Ghent, Belgium, or via e-mail: firstname.lastname@example.org.
These authors contributed equally to this work.