Abstract

The dorsal attention network is consistently involved in verbal and visual working memory (WM) tasks and has been associated with task-related, top–down control of attention. At the same time, WM capacity has been shown to depend on the amount of information that can be encoded in the focus of attention independently of top–down strategic control. We examined the role of the dorsal attention network in encoding load and top–down memory control during WM by manipulating encoding load and memory control requirements during a short-term probe recognition task for sequences of auditory (digits, letters) or visual (lines, unfamiliar faces) stimuli. Encoding load was manipulated by presenting sequences with small or large sets of memoranda while keeping the number of sensory stimuli constant. Top–down control was manipulated by instructing participants to passively maintain all stimuli or to selectively maintain stimuli from a predefined category. By using ROI and searchlight multivariate analysis strategies, we observed that the dorsal attention network encoded information for both load and control conditions in verbal and visuospatial modalities. Decoding of load conditions was in addition observed in modality-specific sensory cortices. These results highlight the complexity of the role of the dorsal attention network in WM by showing that this network supports both quantitative and qualitative aspects of attention during WM encoding, in a partially modality-specific manner.

INTRODUCTION

An important characteristic of neural networks associated with working memory (WM) tasks is their sensitivity to WM load. This WM load sensitivity is more particularly observed in the dorsal attention network, involving the intraparietal cortices and the superior frontal gyri. A common assumption is that the dorsal attention network exerts a role of top–down attentional control during WM tasks by considering that higher WM load is also associated with higher attentional control demands (e.g., Majerus et al., 2012, 2016; Cowan et al., 2011; Postle, 2006; Corbetta & Shulman, 2002). At the same time, the precise nature of these attentional processes in WM remains a controversial question. On the behavioral level, it has been shown that WM load and attentional control account for variability in behavior that is partly shared and partly unique to load or control (Cowan, Fristoe, Elliott, Brunner, & Saults, 2006). The aim of this study was to clarify the role of dorsal attention network involvement in WM by distinguishing nonstrategic, quantitative aspects associated with encoding load and strategic, qualitative aspects associated with top–down task-related attentional control.

Dorsal attention network involvement is very consistently observed across WM tasks and across different WM modalities, such as verbal, visuospatial, motor, and tactile modalities (Konoike et al., 2015; Savini, Brunetti, Babiloni, & Ferretti, 2012; Cowan et al., 2011; Majerus et al., 2010; Todd & Marois, 2004). A critical element is that increased activity in this network is typically observed for high load WM conditions, with no or minimal activity for low load WM conditions (Majerus et al., 2012; Ravizza, Delgado, Chein, Becker, & Fiez, 2004; Todd & Marois, 2004). This load dependency does not appear to be directly related to the storage of memoranda. Riggall and Postle (2012) observed that, despite showing load-related elevated activity, neural patterns in posterior parietal/intraparietal cortices of the dorsal attention network could not discriminate the different stimuli that had to be maintained in a WM task, whereas this discrimination was possible for neural patterns in sensory cortices, which, however, did not show elevated brain activity. Hence, the dorsal attention network does not appear to support the representation of memoranda as such. At the same time, Riggall and Postle showed that frontal and posterior parietal cortex encoded information about trial-specific instructions, indicating that the dorsal attention network may exert a more general role of task-related attentional control. This is also supported by a recent study investigating neural patterns associated with visual and verbal WM load. Majerus et al. (2016) observed that neural patterns in the dorsal attention network, but not in sensory cortices, could predict WM load conditions between verbal and visual modalities. These results suggest that load effects in the dorsal attention network are associated with domain general processes.

The nature of these processes remains poorly understood. The dorsal attention network has been associated with top–down, task-related attention, which allows task execution to be controlled as a function of the task instructions (Shulman et al., 2009; Corbetta, Kincade, Ollinger, McAvoy, & Shulman, 2000). In WM tasks, it has further been shown that the dorsal attention network interacts with the ventral attention network involving the TPJ and OFC, both networks showing antagonistic activity patterns (Asplund, Todd, Snyder, & Marois, 2010; Shulman et al., 2009; Corbetta et al., 2000). The ventral attention network has been associated with bottom–up, stimulus-driven attention supporting the attentional capture of unexpected, distractor information (Cabeza, Ciaramelli, & Moscovitch, 2012; Shulman et al., 2009). Several studies documented the same phenomenon in WM tasks by showing an increase of activity of the dorsal attention network and a decrease of activity of the ventral attention network, as a function of increasing WM load, in both verbal and visuospatial domains (Majerus et al., 2012; Todd, Fougnie, & Marois, 2005). Furthermore, when participants are engaged in high load WM tasks and show increased activation of the dorsal attention network, their ability to detect task-irrelevant distractor stimuli is diminished (Kurth et al., 2016; Majerus et al., 2012; Fougnie & Marois, 2007; Todd et al., 2005). These findings suggest that the dorsal attention network is mainly involved in top–down aspects of attentional control during WM tasks.

However, existing evidence is indirect and relies on the assumption that high WM load conditions require more top–down attentional control. Despite the similarities of the neural phenomena observed in WM and attentional domains, the studies conducted so far did not directly manipulate attentional control in WM tasks, and hence the attribution of a top–down attentional control function to the dorsal attention network during WM tasks remains speculative. Critically, behavioral studies have made a distinction between two different types of attentional investment in WM tasks. A first type reflects the top–down, controlled attention aspect we already discussed; it is supposed to control task execution as a function of task requirements, including the control of task-related strategies such as stimulus selection, control of encoding and maintenance strategies, or attentional refreshing of memoranda (Barrouillet, Bernardin, & Camos, 2004; Engle, Kane, & Tuholski, 1999; Cowan, 1995). A second type concerns a less strategic aspect of attention, although, importantly, it is different from the bottom–up, stimulus-driven type of attention discussed earlier. This aspect has been termed the focus of attention by Cowan (1995, 2001) and reflects, within a stream of auditorily or visually presented information, the limited number of stimuli we can be aware of at one time. This aspect does not involve the implementation of top–down controlled strategies on the stimuli but rather reflects the amount of information that is available to attention.
This attentional load capacity is measured by presenting continuous memory lists at a very fast pace (i.e., two to three items per second for verbal lists), preventing participants from implementing any controlled attentional processes, such as attentional refreshing, stimulus elaboration and regrouping, or articulatory rehearsal processes, until the moment when the list ends and retrieval is to begin (Gray et al., 2017; Broadway & Engle, 2010; Cowan et al., 2005; Hockey, 1973; Pollack, Johnson, & Knaff, 1959). In this type of task, the amount of information participants can recall/recognize at any time during the task is about three or four units. This number also corresponds to the WM load at which activity in the dorsal attention network reaches an asymptote in visual array WM tasks (Todd & Marois, 2004). At the same time, in verbal WM tasks using standard presentation times and allowing for the implementation of top–down controlled attentional processes, intraparietal cortex activity continues to increase up to a load of six stimuli (Majerus et al., 2016). It follows that dorsal attention activity in WM tasks could reflect nonstrategic attentional load aspects as theorized by the concept of the focus of attention, or top–down control of attention, or both.

A further aspect that needs to be considered is the fact that WM load is frequently confounded with sensory numerical aspects. When comparing high and low memory load, studies contrast stimulus sets containing a large versus small amount of stimuli (e.g., Majerus et al., 2016). We know from studies in the numerical domain that the intraparietal sulci are sensitive to numerosity information for physical objects and their univariate activation levels change when the amount of physical stimuli presented on the screen changes, and this even in passive viewing tasks (Piazza, Pinel, Le Bihan, & Dehaene, 2007; Piazza, Izard, Pinel, Le Bihan, & Dehaene, 2004). Bulthe, De Smedt, and Op de Beeck (2015) further showed that multivariate signals in the parietal cortex are particularly sensitive to physical differences in numerosity (one dot vs. five dots), rather than abstract differences of numerosity (the number “1” vs. the number “5”). It follows that the univariate activity differences observed in the intraparietal sulcus for WM sets differing in stimulus load may also reflect the differences in physical numerosity and not only differences in WM load per se. Cowan et al. (2011) aimed at controlling for this possible confound by presenting the same set of stimuli in high and low conditions and by subsequently cuing only a subset of them for further retention, and they still observed higher left intraparietal sulcus activity for high load auditory and visual WM conditions. However, in that study, WM load was manipulated via postencoding, top–down controlled processes preventing the investigation of nonstrategic processes involved in WM encoding load. 
Emrich, Riggall, LaRocque, and Postle (2013) controlled perceptual load during a visual WM paradigm by presenting a fixed number of dot patterns while varying the number of dot patterns that were moving (one, two, or three moving patterns); they did not observe load-related activity in frontoparietal cortices during WM encoding, but load-related differences appeared later during maintenance.

This study examined the encoding load versus top–down memory control processes that define the intervention of the dorsal attention network in WM tasks by disentangling nonstrategic attentional load from top–down attentional control during WM encoding while maintaining the number of physical stimuli constant in all conditions. The different experimental conditions all involved the rapid presentation of stimuli, which allowed us to control the implementation of top–down strategies by the participants. To determine the domain generality of the two attentional aspects targeted by this study, the memory lists were either sequences of auditory–verbal stimuli (letter names and digits) or sequences of visual stimuli (lines in different geometric orientations or unfamiliar faces). In the low top–down control condition, participants simply focused their attention on all the successive stimuli, until a probe stimulus appeared for which the participants had to decide whether it was part of the memory list. In this condition, the fast presentation ensured that participants could not implement strategic and controlled attentional processes. In the high top–down control condition, participants had to selectively focus on a specific item category (e.g., encoding of only the digit stimuli), imposing top–down attentional control on the encoding process; this was possible due to the fact that digit and letter stimuli were presented in alternation, and hence, participants had to orient their attention to every second unique stimulus of the sequence. For examining the role of nonstrategic attentional load, we manipulated the encoding load of the stimulus sequence, with low load conditions containing successive stimulus repetitions reducing the amount of unique stimuli entering in the focus of attention by half relative to the high load condition. 
This procedure ensured that the numerosity of sensory events presented in high and low load conditions was the same as the same number of physical stimuli was presented in both load conditions; despite the successive stimulus repetitions in the low load condition, the stimuli remained separated at the temporal level and hence were encoded as distinct sensory events; at the same time, the successive repetition of stimuli in the low load condition led to a reduction of encoding load as the redundancy of information contained in two following identical stimuli will automatically lead to the formation of a chunk (Mathy & Feldman, 2012), and unique chunks define the limits of focus of attention capacity (Cowan, 2001).

Both univariate and multivariate analysis strategies were used. Whole-brain univariate analyses assessed differences in regional brain activity levels between the different control and load conditions. If the univariate activity level differences observed in the dorsal attention network between high and low WM load conditions in previous studies reflect differences in physical stimulus numerosity, then no or strongly reduced univariate differences between high and low load/control conditions may be found when physical stimulus numerosity is held constant. Next, we used whole-brain multivariate voxel pattern analyses to determine whether neural activation patterns nevertheless encode information about the different load and control conditions. Although these analyses are most frequently used to identify neural patterns associated with specific stimulus categories, they have also been shown to be informative about the type of cognitive processes that are supported by specific neural patterns (e.g., Riggall & Postle, 2012; Esterman, Chiu, Tamber-Rosenau, & Yantis, 2009). This whole-brain multivariate voxel pattern analysis was followed up by an ROI analysis to determine whether informative neural activation patterns for both the load and control manipulations were supported by the dorsal attention network. A searchlight analysis strategy further examined whether the load and control aspects of attention are supported exclusively by regions that are part of the dorsal attention network or whether they involve additional brain regions. Finally, the domain generality of neural patterns distinguishing load and control aspects of attention was assessed by conducting between-domain predictions (from auditory–verbal to visual and from visual to auditory–verbal modalities) of load and control conditions.

METHODS

Participants

Valid data were obtained for 26 right-handed native French-speaking young adults (14 men; mean age = 23.12 years, age range 20–33) recruited from the university community, with no history of psychological or neurological disorders. The data of five additional participants were discarded due to incomplete data acquisition (four participants) or sudden head movement resulting in volume-to-volume displacement exceeding 9 mm and 15° (one participant). The study was approved by the ethics committee of the Faculty of Medicine of the University of Liège and was performed in accordance with the ethical standards described in the Declaration of Helsinki (1964). All participants gave their written informed consent before their inclusion in the study.

Task Description

The stimulus material consisted of digits (1–9), names of consonant letters (B, C, D, F, G, H, J, K, L, N, R, S, T, V), unfamiliar faces (nine male faces taken from FERET database; Phillips, Wechsler, Huang, and Rauss, 1998), and line stimuli presented in different orientations (nine different orientations). The verbal stimuli had been recorded by a neutral female voice and transformed to digital .wav mono sound files (44,100 Hz sampling frequency), with a normalized duration of 300 msec and a mean amplitude approximating 70 dB. The visual stimuli had a size of 397 × 529 pixels and a resolution of 96 ppi (see Figure 1). The stimuli were presented in continuous sequences containing 12 stimuli each, with an ISI of 50 msec; both auditory–verbal and visual stimuli had a presentation duration of 300 msec.

Figure 1. 

Description of the experimental design used for the verbal and visual modalities of the WM task. The first two examples represent verbal WM trials, and the two last examples represent visual WM trials. For each of the two examples, the first row represents a low encoding load trial, and the second row represents a high encoding load trial. In the high control conditions, the participants had to selectively encode a specific stimulus category (digits or letters for the verbal WM trials; faces or lines for the visual WM trials), whereas in the low control condition, the participants encoded the stimuli as they appeared.

For the low load condition, each stimulus was repeated once immediately after its first presentation to diminish the number of different items to be maintained in the focus of attention while ensuring that the same number of physical stimuli was presented as in the high load condition, in which every successive item was different. For the auditory–verbal sequences, the temporal separation of 50 msec between two adjacent items ensured that repeated items were still perceived, at the sensory level, as two successive auditory stimulations (see Figure 1). The presentation of letters and digits was alternated within the same sequence to allow, for the high top–down control condition, selective encoding of only one of the two stimulus categories (see below). The same alternation also characterized visual sequences, with a regular alternation between line and face stimuli; again, a 50-msec black screen separating two successive items ensured that repeated items were perceived as two successive sensory events (see Figure 1).
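The sequence logic described above can be illustrated with a minimal Python sketch. This is not the authors' stimulus code: the function name `build_sequence`, the random sampling scheme, and the choice to start each sequence with a digit are assumptions made for illustration only. The sketch shows how a 12-item auditory–verbal sequence can hold the number of physical stimuli constant while halving the number of unique items in the low load condition.

```python
import random

DIGITS = [str(d) for d in range(1, 10)]   # digits 1-9
LETTERS = list("BCDFGHJKLNRSTV")          # consonant letter names listed in the text
SEQ_LEN = 12                              # physical stimuli per sequence

def build_sequence(load, rng):
    """Return 12 (category, item) pairs for one auditory-verbal trial.

    high load: 12 unique items, categories alternating on every item.
    low load : 6 unique items, each immediately repeated once.
    Starting with a digit is an arbitrary assumption of this sketch.
    """
    if load not in ("high", "low"):
        raise ValueError("load must be 'high' or 'low'")
    n_unique = SEQ_LEN if load == "high" else SEQ_LEN // 2
    per_category = n_unique // 2
    digits = rng.sample(DIGITS, per_category)
    letters = rng.sample(LETTERS, per_category)
    # Alternate categories at the level of unique items.
    unique_items = []
    for digit, letter in zip(digits, letters):
        unique_items.append(("digit", digit))
        unique_items.append(("letter", letter))
    if load == "high":
        return unique_items
    # Low load: repeat each unique item once, immediately after itself.
    return [item for item in unique_items for _ in range(2)]

rng = random.Random(0)
high_trial = build_sequence("high", rng)
low_trial = build_sequence("low", rng)
```

In both conditions the list has 12 entries, so the number of sensory events is matched; only the number of unique items entering the focus of attention differs.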

At the beginning of each sequence, an instruction screen displayed for 1500 msec informed participants about the type of control that was required (high control: stimulus selection; low control: no stimulus selection) and, in the high control condition, the stimulus category they had to focus on. In the high control condition, the different stimulus categories (letters, digits; faces, lines) were targeted for an equal number of trials. During the entire task, the background of the screen was black. An instruction “In the list?” appeared 1000 msec after the last item of each sequence; the participants then either heard a probe stimulus (auditory–verbal condition) or saw a probe stimulus (visual condition) and had to decide whether the stimulus had been in the list or not. To ensure that the participants' responses were based on active WM representations and not familiarity-based recognition judgments, participants were instructed to respond “yes” only if they were certain about their response. Participants pressed the response button under their middle finger for “yes” (i.e., definitely in the list) and the button under their index finger for “no.” Nonmatching stimuli were stimuli not presented in the list but were from the same stimulus category as the target stimulus category. Also, given that for rapid, continuous sequence presentation paradigms, information within the focus of attention is very quickly lost and updated, the number of positive trials largely exceeded the number of negative trials, with a maximum ratio of one negative trial for four positive trials.
This procedure was motivated by the fact that negative probes would not be informative about the content of information held in WM; pilot data had shown that, although recognition accuracy is up to 90% when items from the two most recently presented serial positions are probed, performance drops quickly to chance level for earlier serial positions.

Participants were allowed 3000 msec for giving their response; if the participant did not respond within the allotted time, a no response was recorded. After the response or after the 3000-msec response waiting time in case of no response, there was a fixation intertrial interval of similar duration as the duration of the trials, ensuring proper separation of the brain signal associated with each trial (two successive random Gaussian distributions, a first one being centered on a mean duration of 3000 ± 1000 msec and a second one being centered on a mean duration of 5500 ± 1100 msec, amounting to a total mean duration of 8500 msec). Finally, baseline trials, controlling for basic sensorimotor and decision processes involved in the tasks, were also presented. They had the same structure as the experimental trials, except that they consisted of the continuous repetition of a single stimulus, and during the response stage, a perceptually identical or an acoustically reversed/contrast-reversed stimulus was presented; participants simply had to judge the perceptual “normality” of the probe stimulus relative to the standard presentation of the stimuli across the task.
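The jittered intertrial interval can be sketched as the sum of two Gaussian draws. The sketch below is illustrative only: it assumes that the ± values are standard deviations, and the clipping of negative draws at zero is an addition of this sketch rather than something stated in the text. The function name `sample_iti_ms` is hypothetical.

```python
import random

def sample_iti_ms(rng):
    """Draw one intertrial interval (msec) as the sum of two Gaussian draws.

    Assumptions of this sketch: the '±' values are standard deviations,
    and negative draws are clipped at 0 msec.
    """
    first = rng.gauss(3000, 1000)    # first interval: mean 3000, SD 1000 msec
    second = rng.gauss(5500, 1100)   # second interval: mean 5500, SD 1100 msec
    return max(0.0, first) + max(0.0, second)

rng = random.Random(1)
itis = [sample_iti_ms(rng) for _ in range(20000)]
mean_iti = sum(itis) / len(itis)     # close to the stated total mean of 8500 msec
```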

For each stimulus modality, there were 20 trials for each of the four cells resulting from the crossing of the different conditions (low load–low control, low load–high control, high load–low control, high load–high control), as well as 20 baseline trials. The auditory–verbal and visual trials were presented in two different sessions on the same day, and the order of the sessions was randomly assigned to participants; the two sessions were separated by the acquisition of a T1 structural brain scan (see below). This allowed us to make between-modality predictions of load and control conditions based on independent data sets. Furthermore, participants completed a practice session for both verbal and visual trials outside the scanner before the start of the experiment to ensure that they had understood the difference between the high and low control instructions and complied with task requirements.

MRI Acquisition

The experiments were carried out on a 3-T whole-body scanner (Prisma, Siemens Medical Solutions, Erlangen, Germany) operated with a standard transmit–receive quadrature head coil. fMRI data were acquired using a T2*-weighted gradient-echo EPI sequence with the following parameters: repetition time (TR) = 1830 msec, echo time (TE) = 30 msec, field of view (FOV) = 192 × 192 mm², 64 × 64 matrix, 30 axial slices, voxel size 3 × 3 × 3 mm³, and 25% interslice gap to cover most of the brain. The four initial volumes were discarded to avoid T1 saturation effects. Field maps were generated from a double echo gradient-recalled sequence (TR = 634 msec, TE = 10.00 msec and 12.46 msec, FOV = 192 × 192 mm², 64 × 64 matrix, 40 transverse slices with 3 mm thickness and 25% gap, flip angle = 90°, bandwidth = 260 Hz/pixel) and used to correct echo-planar images for geometric distortion due to field inhomogeneities. A high-resolution T1-weighted magnetization-prepared rapid gradient echo image was acquired for anatomical reference (TR = 1900 msec, TE = 2.19 msec, inversion time = 900 msec, FOV 256 × 240 mm², matrix size 256 × 240 × 176, voxel size 1 × 1 × 1 mm³). For the auditory WM task, between 936 and 1032 functional volumes were obtained, and for the visual WM task, between 926 and 1006 functional volumes were obtained. Head movement was minimized by restraining the participant's head using a vacuum cushion. Stimuli were displayed on a screen positioned at the rear of the scanner, which the participant could comfortably see through a mirror mounted on the head coil.

fMRI Analyses

Image Preprocessing

Data were preprocessed and analyzed using SPM12 software (version 12.0; Wellcome Department of Imaging Neuroscience, www.fil.ion.ucl.ac.uk/spm) implemented in MATLAB (Mathworks Inc., Natick, MA) for univariate analyses. EPI time series were corrected for motion and distortion with “Realign and Unwarp” (Andersson, Hutton, Ashburner, Turner, & Friston, 2001) using the generated field map together with the FieldMap toolbox (Hutton et al., 2002) provided in SPM12. A mean realigned functional image was then calculated by averaging all the realigned and unwarped functional scans and the structural T1-image was coregistered to this mean functional image (rigid body transformation optimized to maximize the normalized mutual information between the two images). The mapping from subject to Montreal Neurological Institute space was estimated from the structural image with the “unified segmentation” approach (Ashburner & Friston, 2005). The warping parameters were then separately applied to the functional and structural images to produce normalized images of resolution 2 × 2 × 2 mm³ and 1 × 1 × 1 mm³, respectively. Finally, the warped functional images were spatially smoothed with a Gaussian kernel of 4 mm FWHM to improve signal-to-noise ratio while preserving the underlying spatial distribution (Schrouff, Kussé, Wehenkel, Maquet, & Phillips, 2012); this smoothing also diminishes the impact residual head motion can have on MVPA performance, even after head motion correction (Gardumi et al., 2012).

Univariate Analyses

Univariate analyses first assessed brain activity levels associated with attentional control and load manipulations in the verbal and visual WM tasks. For each participant, brain responses were estimated at each voxel, using a general linear model with event-related and epoch-related regressors. For each WM task (session), the design matrix contained four regressors modeling the encoding phase with two regressors for the control conditions (high, low) and two regressors for the load conditions (high, low); an additional regressor modeled the response phase. The sensory and motor control trials were modeled implicitly. The regressors resulted from the convolution of the onset and duration parameters for each event of interest with a canonical hemodynamic response function. The design matrices for each session also included the session-specific realignment parameters to account for any residual movement-related effect. A high-pass filter was implemented using a cutoff period of 128 sec to remove the low-frequency drifts from the time series. Serial autocorrelations were estimated with a restricted maximum likelihood algorithm with an autoregressive model of order 1 (+white noise). For each design matrix, linear contrasts were defined for the two attentional control conditions and the two attentional load conditions. For each task, the resulting contrast images, after additional smoothing by 6 mm FWHM, were entered in a second-level, random effect ANOVA analysis to assess control and load responsive brain areas at the group level. The additional smoothing was implemented to reduce noise due to intersubject differences in anatomical variability and to reach a more conventional filter level for group-based univariate analyses (√(4² + 6²) = 7.21 mm; Mikl et al., 2008).
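The effective smoothing of 7.21 mm follows from the fact that convolving two Gaussian kernels yields a Gaussian whose variance is the sum of the two variances, so the FWHM values combine in quadrature. A short Python check, with the hypothetical helper name `combined_fwhm`:

```python
import math

def combined_fwhm(*fwhm_mm):
    """Effective FWHM (mm) after applying Gaussian kernels in succession.

    Variances add under convolution, so FWHM values add in quadrature.
    """
    return math.sqrt(sum(f ** 2 for f in fwhm_mm))

# 4 mm at preprocessing plus 6 mm on the first-level contrast images
print(round(combined_fwhm(4, 6), 2))  # 7.21
```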

Multivariate Analyses

Multivariate analyses were conducted using PRoNTo, a pattern recognition toolbox for neuroimaging (www.mlnl.cs.ucl.ac.uk/pronto; Schrouff et al., 2013). They were used to determine the voxel patterns discriminating between the different control and load trials at an individual subject level. We trained classifiers to distinguish whole-brain voxel activation patterns associated with high versus low control and with high versus low load in the preprocessed and 4-mm smoothed functional images for the verbal and visual WM encoding events separately, using a binary support vector machine (Burges, 1998). For within WM modality classifications of control and load conditions, a leave-one-block-out cross-validation procedure was used. For cross WM modality predictions of load and control conditions, a leave-one-run-out cross-validation procedure was used, resulting in training the classifier on one modality and testing the classifier on the other modality. At the individual level, classifier performance was assessed by running permutation tests on individual balanced classification accuracies (Npermutation = 1000, p < .05). At the group level, classifier performance was tested by comparing the group-level distribution of classification accuracies to a chance-level distribution using Bayesian one sample t tests; Bayesian statistics were used given their robustness in case of small-to-moderate sample sizes and nonnormal distributions (Moore, Reise, Depaoli, & Haviland, 2015) and because, with these analyses, the bias toward accepting or rejecting the null hypothesis does not change with sample size. Furthermore, Bayesian statistics assess evidence for a model under investigation in the light of the data, whereas group-level classical t tests make population-level inferences; population-level inferences using a classical t test have been shown to be problematic when comparing classification accuracies against chance-level (Allefeld, Gorgen, & Haynes, 2016). 
A Bayes factor (BF10) greater than 3 was considered as providing moderate evidence in favor of above-chance-level classification accuracy, a BF10 greater than 10 as providing strong evidence, a BF10 greater than 30 as providing very strong evidence, and a BF10 greater than 100 as providing decisive evidence (Lee & Wagenmakers, 2013). A BF10 smaller than the reciprocal of each number (<1/3, <1/10, <1/30, and <1/100) serves as commensurate evidence favoring the null, a conclusion that could not be drawn using null hypothesis statistical testing. Note that a Bayesian analysis approach was also used to assess group-level behavioral performance by using Bayesian ANOVA. A standard mask removing voxels outside the brain was applied to all images, and all models included timing parameters for hemodynamic response function delay (5 sec) and hemodynamic response function overlap (5 sec), ensuring that stimuli from different categories falling within the same 5 sec were excluded (Schrouff et al., 2013). The whole-brain multivariate analyses were followed up by ROI analyses to determine the role of the frontoparietal cortices of the dorsal attention network in the discrimination of the different load and control conditions (see the Results section for more details).
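The two levels of inference described above can be illustrated with a minimal Python sketch. It is not the PRoNTo implementation, and all function names are hypothetical. For brevity, the permutation test shuffles the true labels against a fixed set of predictions; this is a simplification, since a full permutation test would typically refit the classifier on each permuted label set. The label "inconclusive" for BF10 values between 1/3 and 3 is a choice made for this sketch.

```python
import random

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recall values."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def permutation_p_value(y_true, y_pred, n_perm=1000, seed=0):
    """Simplified label-permutation test on balanced accuracy.

    Assumption: predictions are held fixed and only the true labels are
    shuffled; the classifier is not retrained for each permutation.
    """
    rng = random.Random(seed)
    observed = balanced_accuracy(y_true, y_pred)
    labels = list(y_true)
    n_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if balanced_accuracy(labels, y_pred) >= observed:
            n_as_extreme += 1
    # The +1 terms count the observed labeling as one permutation.
    return observed, (n_as_extreme + 1) / (n_perm + 1)

def bf10_evidence(bf10):
    """Map BF10 to (strength, favored hypothesis) using the thresholds in the text."""
    if bf10 <= 0:
        raise ValueError("BF10 must be positive")
    scale = [(100, "decisive"), (30, "very strong"), (10, "strong"), (3, "moderate")]
    for threshold, label in scale:
        if bf10 > threshold:
            return label, "H1"
        if bf10 < 1 / threshold:
            return label, "H0"
    return "inconclusive", None

# Hypothetical individual-level example: 40 test trials, 5 misclassified
y_true = [0] * 20 + [1] * 20
y_pred = [0] * 18 + [1] * 2 + [1] * 17 + [0] * 3
accuracy, p_value = permutation_p_value(y_true, y_pred)
```

Under these thresholds, a value such as BF10 = 5 would be read as moderate evidence for above-chance classification, and a value below 1/3 as at least moderate evidence for the null.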

Finally, in addition to the ROI analyses, a searchlight decoding approach was used to determine the local spatial distribution of the voxels that discriminate between the different conditions (Kriegeskorte, Goebel, & Bandettini, 2006). A searchlight sphere of 10 mm was applied on the whole-brain multivariate feature map, and the classification accuracy of each voxel cluster was determined, using ad hoc code built for the PRoNTo toolbox and available at https://github.com/CyclotronResearchCentre/PRoNTo_SearchLight. Significance of searchlight classifications was also assessed at both individual and group levels, given that analyses of group-level classification accuracies can indicate small above-chance level classifications as being significant while, at an individual level, only a few (if any) participants may show significant classification accuracies. It is therefore important to also consider the prevalence of the effect across participants and not only the mean classification rate of the group (Allefeld et al., 2016). To obtain significance values for individual-level classification accuracies, we used binomial tests indicating the classification accuracy threshold at which voxels are significant at p < .05 according to a binomial distribution (Noirhomme et al., 2014). For displaying searchlight results, a prevalence image was built on individual searchlight classification maps summarizing the number of individual participants for which a given voxel showed a classification accuracy higher than the binomial significance threshold. This led to a prevalence image indicating the proportion of participants showing significant classification accuracies for a given voxel.
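The binomial threshold and the prevalence image can be sketched as follows. This is an illustration under stated assumptions, not the code from the repository cited above: the function names are hypothetical, accuracy maps are represented as plain lists of per-voxel accuracies, the number of test trials is a free parameter, and a voxel is counted when its accuracy is at or above the smallest accuracy that is significant under a one-sided binomial test.

```python
from math import comb

def binomial_accuracy_threshold(n_trials, alpha=0.05, chance=0.5):
    """Smallest accuracy k/n with P(X >= k) < alpha for X ~ Binomial(n, chance).

    Returns a value above 1 if no attainable accuracy is significant.
    """
    tail = 0.0
    threshold_k = n_trials + 1
    for k in range(n_trials, -1, -1):   # accumulate the upper tail from k = n down
        tail += comb(n_trials, k) * chance ** k * (1 - chance) ** (n_trials - k)
        if tail >= alpha:
            break
        threshold_k = k
    return threshold_k / n_trials

def prevalence_map(accuracy_maps, threshold):
    """Proportion of participants whose accuracy reaches the threshold, per voxel."""
    n_subjects = len(accuracy_maps)
    n_voxels = len(accuracy_maps[0])
    return [
        sum(subject_map[v] >= threshold for subject_map in accuracy_maps) / n_subjects
        for v in range(n_voxels)
    ]

# Hypothetical example: 40 test trials, three participants, two voxels
threshold = binomial_accuracy_threshold(40)
maps = [[0.70, 0.50], [0.60, 0.50], [0.66, 0.70]]
prevalence = prevalence_map(maps, threshold)
```

With 40 trials and a chance level of 0.5, the threshold is 26 correct trials, that is, an accuracy of 0.65.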

RESULTS

Behavioral Analyses

A first group-level within-subject Bayesian ANOVA assessed the effects of Load (high, low) and Control (high, low) manipulations as well as Task modality (auditory–verbal, visual) on recognition accuracy (for positive trials). Specific effect analysis showed that there was decisive evidence for the inclusion of Load (BFInclusion = 1.57e+15) and Control (BFInclusion = 3.22e+15) effects, whereas evidence for the inclusion of Modality (BFInclusion = 0.85) or any of the interactions was very low (all BFInclusion < 1.50). As shown in Figure 2A and B, recognition accuracy was, as expected, higher for the low versus high load condition and for the high versus low control condition in the auditory–verbal and visual WM tasks, respectively. The fact that recognition accuracy was higher for the high versus low control condition confirms that participants were able to selectively focus on target stimuli only, reducing the overall amount of stimuli to be maintained. A further analysis assessed response bias by calculating d′ and C scores (Brophy, 1986), collapsing conditions within each modality to obtain reliable estimates of the rejection rates for the rarely occurring negative trials (see Methods). A Bayesian paired t test on d′ scores in the auditory–verbal and visual modalities showed very low evidence for an effect of Modality, BF10 = 0.22, with d′ values showing reliable discrimination of positive and negative trials in both the auditory–verbal (mean = 1.80 ± 0.59) and visual (mean = 1.73 ± 0.50) modalities. A Bayesian paired t test on C scores also showed very low evidence for an effect of Modality, BF10 = 0.30, with C scores indicating an overall conservative response criterion in the auditory–verbal (mean = 0.14 ± 0.24) and visual (mean = 0.22 ± 0.38) modalities, confirming that participants complied with the task instruction to accept targets only if they were certain about their response.
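
For readers unfamiliar with these signal detection measures, the sketch below shows the standard equal-variance formulas for d′ and C. It is an illustration under assumptions: the function name is hypothetical, and the text does not state whether any correction was applied to hit or false alarm rates of exactly 0 or 1, which the formulas cannot handle.

```python
from statistics import NormalDist

def dprime_and_criterion(hit_rate, false_alarm_rate):
    """Return (d', C) for hit and false alarm rates strictly between 0 and 1."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa               # sensitivity
    criterion = -0.5 * (z_hit + z_fa)    # C > 0 means a conservative criterion
    return d_prime, criterion
```

Under this convention, positive C values, such as the group means reported above, indicate a tendency to reject probes unless the evidence for a match is strong.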

Figure 2. 

Accuracy and RTs for performance on auditory–verbal (A, C) and visual (B, D) WM tasks. Error bars represent standard errors.

An analysis of RTs for positive trials again showed very strong evidence for the effects of Load (BFInclusion = 1508.26) and Control (BFInclusion = 3343.69), whereas all interactions were associated with very low evidence (all BFInclusion < 0.65); this analysis, however, also revealed very strong evidence for a Modality effect (BFInclusion = +∞). As shown in Figure 2C and D, RTs were faster for the low versus high load conditions and for the high versus low control conditions, but they were also faster for the visual versus auditory–verbal modality. The results confirm that participants were able to implement a selective encoding strategy in the high control condition, which led to faster access to items selectively held in the focus of attention. The generally faster RTs in the visual modality are likely to reflect the temporal nature of auditory–verbal stimuli, which can only be identified after a sufficient portion of the acoustic signal, which unfolds over time, has been presented: The mean difference in poststimulus onset RTs (315 ± 193 msec) is indeed equivalent to the duration of the acoustic signal of the auditory probe stimulus (300 msec).

Neuroimaging—Univariate Analyses

A first set of neuroimaging analyses assessed the effects of load and control on univariate neural activity changes in the verbal and visual WM tasks. As expected, when contrasting the different load and control conditions, univariate analyses yielded few significant differential activity foci in the frontoparietal, dorsal attention network of interest, as shown in Table 1. The only contrast that yielded significant between-condition activity differences was the high versus low control contrast in the verbal WM task: the left intraparietal sulcus, covering the horizontal segment from anterior to posterior portions (see Figure 3 and Table 1), and the left dorsolateral pFC showed increased activity at pFWE_corrected < .05 (with a cluster-forming threshold of puncorrected < .001 at the voxel level; Eklund, Nichols, & Knutsson, 2016) for the high control condition, in line with an involvement of the dorsal attention network in top–down attentional control. This was paralleled by increased activity for the low control condition in the bilateral fronto-orbital cortex and, at uncorrected levels, in the bilateral TPJ; these regions are part of the ventral attention network, which is involved in stimulus-driven attention and has been shown to be activated (or less deactivated) when task-related top–down control is low (Majerus et al., 2012; Todd et al., 2005; Corbetta & Shulman, 2002). Finally, as shown by a conservative conjunction null analysis (Friston, Penny, & Glaser, 2005) over all conditions in the verbal and visual WM tasks (see Table 1), frontoparietal cortices defining the dorsal attention network were strongly activated above baseline, with activity in the intraparietal sulcus covering the entire horizontal segment, from anterior to posterior portions, in both hemispheres. Next, we used multivariate analyses to determine to what extent neural activity patterns make it possible to distinguish between the different load and control conditions.

Table 1. 

Univariate Activity Foci, as a Function of Control and Load Contrasts, as well as a Conjunction Analysis over All Conditions

Anatomical Region | Brodmann Area | No. Voxels | Left/Right | x | y | z | SPM {Z}-value
High > low control (auditory) 
 Inferior frontal gyrus 9/44 637 −40 12 28 4.83 
 Intraparietal sulcus (posterior) 385 −26 −60 42 4.46* 
Low > high control (auditory) 
 OFC 10 371 −4 60 12 4.35* 
High vs. low load (auditory) 
 No voxel survived threshold 
High vs. low control (visual) 
 No voxel survived threshold 
High vs. low load (visual) 
 No voxel survived threshold 
Conjunction null analysis (auditory and visual; all conditions) 
 Anterior cingulate 6/32 1753 −6 14 46 >7.76 
 Superior frontal gyrus  −28 −2 52 >7.76 
742 46 56 7.76 
  36 58 7.25 
 Middle frontal gyrus  −38 28 30 5.59 
223 38 40 26 7.05 
 Inferior frontal gyrus/insula 47 423 −30 26 >7.76 
47 529 34 24 >7.76 
 Inferior frontal gyrus 44  −38 18 26 5.37 
44 202 42 12 22 >7.76 
 Intraparietal sulcus 7/40 4053 −40 −40 42 7.56 
  −34 −50 42 >7.76 
  −28 −58 42 >7.76 
7/40 766 50 −34 52 >7.76 
  38 −42 42 5.92 
  32 −62 42 6.00 
 Superior temporal gyrus 20 19 −48 −46 12 5.32 
20 222 56 −46 12 6.98 
 Caudate (head)  −12 16 5.03 
 41 14 14 5.64 
 Cerebellum  135 −28 −62 −28 >7.76 
 75 34 −60 −28 6.35 

All regions are significant at p < .05, with voxel-level and/or cluster-level family-wise error (FWE) corrections for whole-brain volume.

* p < .05 for cluster-level FWE corrections only, with a cluster-forming threshold of puncorrected < .001 at the voxel level.

Figure 3. 

Univariate results for the high versus low attentional control contrasts in the verbal WM task, rendered on a standard 3-D brain template (display threshold: p < .001 uncorrected).

Neuroimaging—Whole-brain and ROI Multivariate Analyses

A first set of multivariate analyses assessed whole-brain within-task prediction of control and load effects for auditory–verbal and visual WM tasks separately. We observed reliable within-task prediction for both control and load manipulations in both types of WM tasks. As shown in Figure 4 (left and middle columns), for the verbal WM task, significant multivariate discrimination of neural patterns was observed for control conditions in 77% of participants (mean classification accuracy = .77 ± .10) and for load in 50% of participants (mean classification accuracy = .70 ± .09). Similar results were observed for the visual WM task, with significant multivariate discrimination for control conditions in 81% of participants (mean classification accuracy = .69 ± .08) and for load conditions in 92% of participants (mean classification accuracy = .71 ± .07). When performing a Bayesian t test on group-level classification accuracies against a chance-level classification distribution, decisive evidence in favor of above-chance level classification was observed for control and load conditions in both modalities (auditory–verbal WM, control: BF10 = 3.78e+10, load: BF10 = 1.72e+8; visual WM, control: BF10 = 7.76, load: BF10 = 9.01e+10).

Figure 4. 

Individual classification accuracies for the discrimination of attentional control conditions (first row) and for the discrimination of attentional load conditions (second row), within verbal WM (left column) and visual WM (middle column) tasks, and across WM modalities (right column). The horizontal axis corresponds to the subject index. Each circle represents the balanced classification accuracy observed for a given participant; black circles represent classification accuracies significant at p < .05 using permutation tests on individual classification accuracies. The black continuous line indicates chance-level classifier performance.

The whole-brain analyses were followed up by ROI analyses, focusing on the intraparietal sulci and the superior frontal gyri that define the dorsal attention network, as well as the dorsolateral pFC, which has also been associated with WM load in previous studies; to increase the precision of these analyses, the intraparietal sulcus ROIs were further segmented in anterior, middle, and posterior portions (Gillebert, Mantini, Peeters, Dupont, & Vandenberghe, 2013; Gillebert et al., 2012). The ROIs were defined as spheres with a radius of 10 mm selected from previously published studies with the following coordinates: x = ±43, y = −40, z = 43, x = ±34, y = −49, z = 45, and x = ±26, y = −60, z = 41, for the left/right anterior, middle, and posterior intraparietal sulci, respectively; x = −20/+26, y = −1/−2, z = 50/47 for the left/right superior frontal gyrus; and x = −40, y = 13, z = 28 for the left dorsolateral pFC (Majerus et al., 2012, 2016; Gillebert et al., 2012, 2013; Asplund et al., 2010). The ROI analyses proceeded as for the whole-brain multivariate analyses but by limiting the feature selection to the target ROIs. 
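
To make the ROI definition concrete, the sketch below tests whether a coordinate falls within one of the 10-mm spheres centered on the coordinates listed above. It only illustrates the geometry and is not the masking procedure used in PRoNTo. The dictionary name, the ROI labels, and the function name are hypothetical.

```python
from math import dist

ROI_RADIUS_MM = 10.0

# Sphere centers (x, y, z, in mm) as listed in the text.
ROI_CENTERS = {
    "left anterior IPS": (-43, -40, 43),
    "right anterior IPS": (43, -40, 43),
    "left middle IPS": (-34, -49, 45),
    "right middle IPS": (34, -49, 45),
    "left posterior IPS": (-26, -60, 41),
    "right posterior IPS": (26, -60, 41),
    "left SFG": (-20, -1, 50),
    "right SFG": (26, -2, 47),
    "left DLPFC": (-40, 13, 28),
}

def rois_containing(coord_mm):
    """Names of all spherical ROIs whose center lies within 10 mm of coord_mm."""
    return [
        name
        for name, center in ROI_CENTERS.items()
        if dist(coord_mm, center) <= ROI_RADIUS_MM
    ]
```

Because neighboring intraparietal centers are less than 20 mm apart, adjacent spheres overlap, so a coordinate can belong to more than one ROI and the function returns a list.
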
Using Bayesian one-sample t tests on classification accuracies for the attention control conditions in the auditory–verbal modality, very strong evidence for group-level above-chance discrimination was observed in the anterior intraparietal sulcus (left: mean classification accuracy = .64 ± .07, BF10 = 6.55e+7; right: mean classification accuracy = .60 ± .07, BF10 = 78,079.40), in the middle intraparietal sulcus (left: mean classification accuracy = .66 ± .08, BF10 = 2.47e+7; right: mean classification accuracy = .62 ± .08, BF10 = 77,084.22), in the posterior intraparietal sulcus (left: mean classification accuracy = .58 ± .09, BF10 = 348.20; right: mean classification accuracy = .60 ± .09, BF10 = 3199.10), in the superior frontal gyrus (left: mean classification accuracy = .61 ± .07, BF10 = 315,932.99; right: mean classification accuracy = .59 ± .07, BF10 = 97,078.90), and in the dorsolateral pFC (mean classification accuracy = .65 ± .08, BF10 = 1.79e+7). Very similar results were observed when conducting the same ROI analysis on the discrimination of load conditions in the auditory–verbal modality, with very strong evidence for group-level above-chance discriminations in the anterior intraparietal sulcus (left: mean classification accuracy = .57 ± .07; BF10 = 4784.00; right: mean classification accuracy = .58 ± .06, BF10 = 10213.00), in the middle intraparietal sulcus (left: mean classification accuracy = .56 ± .06, BF10 = 692.70; right: mean classification accuracy = .57 ± .08, BF10 = 293.96), in the posterior intraparietal sulcus (left: mean classification accuracy = .57 ± .06, BF10 = 4549.50; right: mean classification accuracy = .57 ± .05, BF10 = 124,050.40), in the superior frontal gyrus (left: mean classification accuracy = .55 ± .04, BF10 = 31498.57; right: mean classification accuracy = .55 ± .06, BF10 = 137.36), and in the dorsolateral pFC (mean classification accuracy = .58 ± .06, BF10 = 73,285).

When applying the same ROI analyses to the visual modality, robust evidence for above-chance discrimination of both control and load conditions was also observed in regions of the dorsal attention network. For the control conditions, strong to very strong evidence for group-level above-chance discrimination was observed in the anterior intraparietal sulcus (left: mean classification accuracy = .55 ± .05, BF10 = 597.90; right: mean classification accuracy = .54 ± .05, BF10 = 103.90), in the middle intraparietal sulcus (left: mean classification accuracy = .58 ± .06, BF10 = 87,388.90; right: mean classification accuracy = .55 ± .07, BF10 = 28.61), in the posterior intraparietal sulcus (left: mean classification accuracy = .57 ± .07, BF10 = 371.80; right: mean classification accuracy = .56 ± .06, BF10 = 956.30), in the superior frontal gyrus (left: mean classification accuracy = .56 ± .05, BF10 = 11,862.12; right: mean classification accuracy = .55 ± .08, BF10 = 24.79), and in the dorsolateral pFC (mean classification accuracy = .57 ± .07, BF10 = 1490.00). For the load conditions, very strong evidence for above-chance discriminations was observed in the anterior intraparietal sulcus (left: mean classification accuracy = .56 ± .05; BF10 = 51198.00; right: mean classification accuracy = .56 ± .06, BF10 = 865.00), in the middle intraparietal sulcus (left: mean classification accuracy = .58 ± .08, BF10 = 1828.00; right: mean classification accuracy = .57 ± .07, BF10 = 981.83), in the posterior intraparietal sulcus (left: mean classification accuracy = .56 ± .06, BF10 = 2928.50; right: mean classification accuracy = .57 ± .06, BF10 = 2122.40), and in the dorsolateral pFC (mean classification accuracy = .56 ± .05, BF10 = 2277.00). Only evidence for above-chance discrimination in the superior frontal gyrus was inconclusive (left: mean classification accuracy = .52 ± .06, BF10 = 1.47; right: mean classification accuracy = .53 ± .07, BF10 = 0.831).

Neuroimaging—Searchlight Analyses

To determine whether neural patterns in additional brain regions outside the dorsal attention network ROIs also contribute to differentiating the control and load conditions in the auditory–verbal and visual WM modalities, a multivariate searchlight analysis was conducted. This analysis revealed that, in addition to the intraparietal sulci and the superior frontal gyri target areas that are part of the dorsal attention network, the attentional control conditions could also be reliably distinguished (as indicated by a binomial test; see Methods for further details) in the dorsolateral and ventrolateral pFC in up to 89% of participants for the auditory–verbal modality and up to 77% of participants for the visual modality (see Figure 5A). When running the same analyses for the neural differentiation of attentional load conditions, modality-specific neural patterns in sensory processing areas were identified in the majority of participants (see Figure 5B). In the auditory–verbal modality, most participants showed discriminatory neural patterns (up to 77%) in the superior temporal gyrus close to the auditory cortex, reflecting the differences in auditory encoding load of the high and low conditions. In the visual modality, most participants (up to 77%) showed discriminatory patterns involving voxels in the bilateral fusiform gyri, reflecting the differences in visual encoding load of the high and low load conditions. At the same time, it is important to note that, for both modalities, neural patterns in the target areas of the dorsal attention network still discriminated between high and low load conditions in up to 65% (auditory modality) and 70% (visual modality) of participants, in line with the ROI analyses.

Figure 5. 

Searchlight regions discriminating between high and low control conditions (A) and between high and low load conditions (B) in the verbal (top) and visual (bottom) WM tasks. The colors indicate the prevalence of participants showing individual-level significant classification accuracies for a searchlight region around a given voxel.

Neuroimaging—Cross-modality Predictions

Finally, we determined to what extent the neural patterns that discriminate attentional control and attentional load are similar in the auditory–verbal and visual WM modalities. We conducted between-WM task whole-brain multivariate predictions by training classifiers to discriminate control or load conditions in one WM modality (e.g., auditory–verbal) and by testing the classifiers on the data from the other WM modality (e.g., visual). As shown in Figure 4 (right column), individual-level cross-modality predictions were reduced relative to the equivalent within-modality predictions and were significant in only 27% of participants for the predictions of control conditions (mean classification accuracy = .53 ± .05) and in 19% of participants for the prediction of load conditions (mean classification accuracy = .52 ± .04). Yet, Bayesian t tests on group-level classifications indicated moderate-to-strong evidence in favor of above-chance level discrimination of control and load conditions (control: BF10 = 36.68, load: BF10 = 4.59). Given that the searchlight analyses had shown that patterns in the intraparietal cortices were sensitive to load and control conditions in both modalities but that there were also additional modality-specific neural patterns, we assessed whether these additional patterns could have diminished the reliability of cross-modality predictions of load and control conditions. We therefore ran an additional cross-modality prediction analysis that was restricted to voxels in frontoparietal cortices of the dorsal attention network that had been shown, in the searchlight analyses, to discriminate between the different control and load conditions in both the auditory–verbal and visual modalities. 
As shown in Figure 6, these constrained between-modality predictions led to similar results as the whole-brain between-modality predictions, with significant cross-modality predictions of control conditions in 38% of participants (mean classification accuracy = .53 ± .05) and of load conditions in 19% of participants (mean classification accuracy = .52 ± .05). This was confirmed by group-level analysis, showing only anecdotal-to-moderate levels of evidence for above-chance level cross-modality predictions of control and load conditions (Bayesian t tests for control, BF10 = 4.63 and load, BF10 = 0.73).
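
The logic of a cross-modality prediction (train on one modality, test on the other, and score with balanced accuracy) can be sketched as follows. This is a toy illustration under assumptions: a nearest-centroid classifier stands in for the classifier actually used in the analyses, which is not described in this section, and all function names and the list-based data layout are hypothetical.

```python
from math import dist

def fit_centroids(patterns, labels):
    """Average the training patterns of each condition label."""
    centroids = {}
    for label in set(labels):
        members = [p for p, l in zip(patterns, labels) if l == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def predict(centroids, patterns):
    """Assign each pattern the label of its nearest centroid."""
    return [min(centroids, key=lambda lab: dist(p, centroids[lab])) for p in patterns]

def balanced_accuracy(true_labels, predicted_labels):
    """Mean per-class recall, so unequal class sizes do not inflate accuracy."""
    recalls = []
    for label in set(true_labels):
        idx = [i for i, t in enumerate(true_labels) if t == label]
        recalls.append(sum(predicted_labels[i] == label for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def cross_modality_accuracy(train_x, train_y, test_x, test_y):
    """Train on one modality (e.g., auditory-verbal), test on the other (e.g., visual)."""
    centroids = fit_centroids(train_x, train_y)
    return balanced_accuracy(test_y, predict(centroids, test_x))
```

In this scheme, above-chance accuracy requires that the condition-related pattern differences point in a similar direction in both modalities. Patterns that are informative within each modality but differ between modalities would yield accuracies near chance.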

Figure 6. 

Individual classification accuracies for constrained between-modality predictions of attentional control conditions and for constrained between-modality predictions of attentional load conditions. The horizontal axis corresponds to the subject index. Each circle represents the balanced classification accuracy observed for a given participant; black circles represent classification accuracies significant at p < .05 using permutation tests on individual classification accuracies. The black continuous line indicates chance-level classifier performance.

DISCUSSION

This study examined the role of the dorsal attention network in verbal and visual WM tasks by examining to what extent this network supports attentional aspects involved in memory load versus memory control during WM encoding. We observed that multivariate neural patterns in the dorsal attention network were able to discriminate both between the different control conditions and between the different encoding load conditions, within verbal and visual WM modalities. Modality-specific neural patterns were further observed. In the verbal modality, neural patterns in frontotemporal cortices discriminated between high and low control conditions, and neural patterns in superior temporal cortices discriminated between high and low encoding load conditions. In the visual modality, neural patterns in bilateral fusiform gyri discriminated between high and low encoding load conditions. Between-modality predictions of neural patterns associated with top–down control or encoding load conditions were not reliable. Finally, univariate analyses did not show differential levels of activity in the dorsal attention network as a function of load conditions, which, unlike in most previous studies, were matched for differences in physical stimulus numerosity.

First, the results of this study show that, although univariate activity in the dorsal attention network may be attributable to confounds between encoding load and the physical numerosity of memory stimuli, multivariate activity patterns are still able to decode encoding load when physical numerosity is held constant. Intraparietal cortex is known to react to differences in numerosity, particularly for physical differences (e.g., different visual quantities) as opposed to abstract differences (e.g., digits symbolizing different numerical quantities; Bulthe et al., 2015; Piazza et al., 2004, 2007). In this study, the sensory events were matched in high and low load conditions as the number of temporally/visually separated physical stimuli was the same in both conditions. Our findings partially mirror the results of a study by Emrich et al. (2013), in which the numerosity of visual stimuli was also held constant (three dot patterns) while varying memory load (one, two, or three dot patterns with movements): Univariate load effects in the frontoparietal cortices did not appear during the encoding WM phase. However, these effects appeared later during the maintenance delay period; the nature of these univariate effects still needs to be clarified, as they may reflect load effects in nonstrategic maintenance of information in the focus of attention and/or top–down control processes involved in structuring and reviewing variable amounts of memoranda during the delay period. Moreover, it could be argued that the fast and continuous presentation of stimuli used in this study may have led to saturation effects in the hemodynamic signal, masking possible univariate load effects during encoding; however, in that case, no univariate differences should have been observed for any contrast, which is contradicted by the observation of univariate differences for the control conditions at least in the auditory–verbal modality. 
In summary, our results indicate that, although differences in perceptual numerosity may be a confound for univariate load effects observed in previous studies at least during encoding, encoding load is still represented by multivariate activity patterns within the dorsal attention network after control of perceptual numerosity.

The second finding of this study is that two different aspects of attention are represented by activity patterns in the dorsal attention network during WM encoding. We showed that the dorsal attention network, known to be sensitive to WM load, does not only reflect top–down attentional control. Rather, our study shows that the dorsal attention network represents both the number of items in the focus of attention (as reflected by encoding load) and task-related top–down attentional control (as reflected by the type of items that needs to be selected for retention), in line with WM accounts in which attentional control coexists with attention used for WM storage (e.g., D'Esposito & Postle, 2015; Cowan et al., 2006; Cowan, 1988). This is a critical finding, as previous studies have attributed the intervention of the dorsal attention network during WM tasks implicitly or explicitly to top–down, task-related attentional control mechanisms (Majerus et al., 2012; Riggall & Postle, 2012; Todd et al., 2005; Todd & Marois, 2004).

It is important to note here that the neural classifiers of encoding load and top–down control conditions were independent of each other while involving the same sets of images: Classifiers discriminating top–down control conditions were trained and tested on both high and low encoding load trials for each control condition and hence could not reflect encoding load-related neural differences as these differences were matched within the high and low control conditions. The same was the case for the classifiers discriminating encoding load, which were trained and tested on both high and low top–down control conditions for each load condition. Furthermore, in the verbal modality, univariate analyses revealed an increase of activation for the high control condition, even though this condition led to lower WM load as a consequence of the top–down, selective encoding strategy requiring the subject to focus on only a subset of stimuli. If the attentional control factor merely reflected a load factor, then univariate analyses should have shown diminished activity for the high control condition, given that low load is typically associated with decreased posterior parietal cortex activity (Majerus et al., 2012, 2016; Cowan et al., 2011; Ravizza et al., 2004; Todd & Marois, 2004).

A further important observation of this study is that, although the control and load manipulations led to similar behavioral and neural effects, there were important modality-specific findings, indicating that the two attentional processes identified here may be less domain-general than previously suggested (Majerus et al., 2016; Cowan et al., 2011). First, the manipulation of control was associated, specifically in the verbal modality, with neural patterns in a widespread, left-hemisphere dominant frontotemporal network, involving the inferior pFC, the supramarginal gyrus, and the temporo-occipital cortices. The regions of this network are known to support controlled phonological and orthographic segmentation and comparison processes (Deng, Chou, Ding, Peng, & Booth, 2011; Booth, Mehdiratta, Burman, & Bitan, 2008); these processes may have allowed participants, in the auditory–verbal high control condition, to segment the incoming speech stream and explicitly compare the segmented stimuli to the target stimulus category defined by both phonological and orthographic characteristics. This may also have included attempts at rehearsing the target stimuli in the high control condition (as each target stimulus is followed by a nontarget stimulus, which can be actively ignored, leaving some small room for verbal rehearsal); in the low control conditions, this strategy is indeed very unlikely as verbal rehearsal in standard running span conditions, which come closest to the low control trials used in this study, is known to be very difficult to implement and even worsens performance (Bhatarah, Ward, Smith, & Hayes, 2009; Hockey, 1973). Further modality-specific results were observed for the load conditions. Modality-specific sensory cortices in superior temporal cortices encoded information about auditory–verbal encoding load, in line with differences in the richness of acoustic, phonetic, and phonological features between the two auditory–verbal load conditions. 
The posterior superior temporal cortex is known to support phonetic and phonological aspects of input, bottom–up speech processing (Skeide & Friederici, 2016; Rauschecker & Scott, 2009). Similarly, for visual encoding load conditions, a broad set of voxels in bilateral fusiform gyri discriminated between high and low visual encoding load, reflecting the differences in the visual richness of the sequences to be encoded. Bilateral fusiform cortices are known to be sensitive to sensory features associated with faces, and posterior occipitotemporal cortex has been shown to be sensitive to line orientation (Guntupalli, Wheeler, & Gobbini, 2017; Sneve, Sreenivasan, Alnaes, Endestad, & Magnussen, 2015; Sneve, Alnaes, Endestad, Greenlee, & Magnussen, 2012; Rossion et al., 2003). Moreover, even when restricting between-modality predictions of top–down control conditions to regions of the dorsal attention network that had been shown to be informative about control conditions in both verbal and visual conditions, no reliable between-modality prediction of attentional conditions was observed. These results suggest that, despite a common involvement of neural patterns in the dorsal attention network, the way attentional control and load are processed in this network differs between auditory–verbal and visual modalities. For the control manipulations, it could be argued that verbal stimulus categories (such as digits and letters) are more difficult to distinguish than visual stimulus categories (such as faces versus lines), requiring more attentional control for distinguishing the two stimulus categories, which could also explain the univariate differences we observed for high versus low top–down control conditions in the auditory–verbal but not the visual modality. At the same time, in this study, this did not have any consequences on overall task performance, as there was no evidence for modality effects or any modality-by-condition interaction on performance accuracy levels.

At first glance, these results appear to contradict those of previous studies claiming domain-general recruitment of the dorsal attention network as a function of WM load (Majerus et al., 2016; Cowan et al., 2011; Ravizza et al., 2004). Majerus et al. (2016) observed cross-modality decoding within the dorsal attention network of WM load for verbal (letters) and visual (colored squares) stimulus sets increasing from two to six stimuli, whereas in this study neither encoding load nor top–down control led to reliable between-modality predictions. These contradictory results could, however, be reconciled if we consider the recently proposed concept of saliency or priority maps. Intraparietal cortices have been considered to establish and maintain saliency or priority maps, resulting from bottom–up and top–down prioritization processes. Saliency or priority maps encode information about the type of stimuli that are salient as a function of a given (task) context (Chelazzi et al., 2014; Gillebert et al., 2013; Bisley & Goldberg, 2010; Gottlieb, 2007). If we analyze the attentional manipulations in this study in terms of these saliency maps, no cross-domain decoding of the different attentional conditions would be expected, given that the information that is prioritized in the different control and load conditions is clearly different for auditory and visual tasks. By contrast, in studies manipulating WM load across verbal and visual modalities by simply varying the number of stimuli presented in a given load condition, the number of stimuli within each load condition may become an important, common source of salient information in both modalities; hence, reliable cross-modality decoding of WM load, as observed by Majerus et al. (2016), could be expected in this specific case, particularly when both verbal and visual stimuli are presented visually.
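
The logic of this argument can be illustrated with a toy sketch of between-modality prediction. This is not the analysis pipeline of the study: the data are synthetic, a simple nearest-centroid rule stands in for the classifiers typically used in multivariate pattern analysis, and all names (make_trials, fit_centroids, cross_modal_accuracy) are hypothetical. The sketch only shows that a classifier trained on one modality transfers to the other when the condition difference is carried by the same pattern dimension, and falls to chance when each modality carries it in a different dimension.

```python
from statistics import mean


def make_trials(signal_voxel, shift, n_trials=20, n_voxels=4):
    """Build toy activity patterns for one modality (synthetic, not real data).

    Even-numbered trials are labelled "high", odd-numbered trials "low".
    "high" trials carry +1.0 at signal_voxel; every voxel also receives a
    small deterministic jitter in [-0.2, 0.2] so that trials differ.
    """
    patterns, labels = [], []
    for i in range(n_trials):
        label = "high" if i % 2 == 0 else "low"
        pattern = [0.1 * (((i + shift) * 7 + v * 3) % 5 - 2)
                   for v in range(n_voxels)]
        if label == "high":
            pattern[signal_voxel] += 1.0
        patterns.append(pattern)
        labels.append(label)
    return patterns, labels


def fit_centroids(patterns, labels):
    """Nearest-centroid model: one voxel-wise mean pattern per condition."""
    model = {}
    for lab in set(labels):
        members = [p for p, l in zip(patterns, labels) if l == lab]
        model[lab] = [mean(voxel) for voxel in zip(*members)]
    return model


def predict(model, pattern):
    """Return the condition whose centroid is closest (squared distance)."""
    def sq_dist(centroid):
        return sum((c - x) ** 2 for c, x in zip(centroid, pattern))
    return min(model, key=lambda lab: sq_dist(model[lab]))


def accuracy(model, patterns, labels):
    hits = sum(predict(model, p) == l for p, l in zip(patterns, labels))
    return hits / len(labels)


def cross_modal_accuracy(voxel_a, voxel_b):
    """Train on toy 'modality A', test on toy 'modality B'."""
    train_x, train_y = make_trials(voxel_a, shift=0)
    test_x, test_y = make_trials(voxel_b, shift=1)
    return accuracy(fit_centroids(train_x, train_y), test_x, test_y)


# Same informative voxel in both modalities: the condition code is shared.
shared = cross_modal_accuracy(0, 0)
# Different informative voxels: each modality has its own condition code.
specific = cross_modal_accuracy(0, 1)
print(shared, specific)  # 1.0 0.5
```

In the second case the classifier can still be trained and tested successfully within each toy modality; only the transfer between them fails, which is the pattern expected if the prioritized information differs between auditory and visual tasks.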

To conclude, this study shows that multivariate neural activity patterns in the dorsal attention network during WM represent not only top–down attentional control aspects but also nonstrategic attentional aspects involved in memory encoding. Nonstrategic aspects of attention reflect the quantity of information in the focus of attention. Top–down controlled attention determines the quality of WM representations and directs the focus of attention to target information in a strategic, task-related manner. These attentional functions are supported by the dorsal attention network in both verbal and visual modalities, but their implementation further requires modality-specific neural processes. This study highlights the complexity of the attentional processes that characterize the recruitment of the dorsal attention network in WM tasks. Given that this study focused on WM encoding, further studies need to determine the extent to which the different aspects of attention identified in this study also characterize the intervention of the dorsal attention network during WM maintenance and retrieval stages.

Acknowledgments

This work was supported by the government of the French-speaking community of Belgium (ARC, Convention 12/17-01-REST awarded to S. M.), by the Belgian Science Policy (Back-to-Belgium grant awarded to F. P.; Interuniversity Attraction Pole P7/11 awarded to S. M.), and by the National Institutes of Health (NIH R01 HD-21338 awarded to N. C.).

Reprint requests should be sent to Steve Majerus, Psychology and Neuroscience of Cognition Research Unit, Université de Liège, Boulevard du Rectorat, B33, 4000 Liège, Belgium, or via e-mail: smajerus@ulg.ac.be.

REFERENCES

Allefeld, C., Gorgen, K., & Haynes, J. D. (2016). Valid population inference for information-based imaging: From the second-level t-test to prevalence inference. Neuroimage, 141, 378–392.
Andersson, J. L., Hutton, C., Ashburner, J., Turner, R., & Friston, K. (2001). Modeling geometric deformations in EPI time series. Neuroimage, 13, 903–919.
Ashburner, J., & Friston, K. (2005). Unified segmentation. Neuroimage, 26, 839–851.
Asplund, C. L., Todd, J. J., Snyder, A. P., & Marois, R. (2010). A central role for the lateral prefrontal cortex in goal directed and stimulus-driven attention. Nature Neuroscience, 13, 507–512.
Barrouillet, P., Bernardin, S., & Camos, V. (2004). Time constraints and resource sharing in adults' working memory spans. Journal of Experimental Psychology: General, 133, 83–100.
Bhatarah, P., Ward, G., Smith, J., & Hayes, L. (2009). Examining the relationship between free recall and immediate serial recall: Similar patterns of rehearsal and similar effects of word length, presentation rate, and articulatory suppression. Memory & Cognition, 37, 689–713.
Bisley, J. W., & Goldberg, M. E. (2010). Attention, intention, and priority in the parietal lobe. Annual Review of Neuroscience, 33, 1–21.
Booth, J. R., Mehdiratta, N., Burman, D. D., & Bitan, T. (2008). Developmental increases in effective connectivity to brain regions involved in phonological processing during tasks with orthographic demands. Brain Research, 1189, 78–89.
Broadway, J. M., & Engle, R. W. (2010). Validating running memory span: Measurement of working memory capacity and links with fluid intelligence. Behavior Research Methods, 42, 563–570.
Brophy, A. L. (1986). Alternatives to a table of criterion values in signal detection theory. Behavior Research Methods, Instruments, & Computers, 18, 285–286.
Bulthe, J., De Smedt, B., & Op de Beeck, H. P. (2015). Visual number beats abstract numerical magnitude: Format-dependent representation of Arabic digits and dot patterns in human parietal cortex. Journal of Cognitive Neuroscience, 27, 1376–1387.
Burges, C. J. C. (1998). A tutorial on support vector machines for pattern recognition. Boston: Kluwer Academic Publishers.
Cabeza, R., Ciaramelli, E., & Moscovitch, M. (2012). Cognitive contributions of the ventral parietal cortex: An integrative theoretical account. Trends in Cognitive Sciences, 16, 338–352.
Chelazzi, L., Estocinova, J., Calletti, R., Lo Gerfo, E., Sani, I., Della Libera, C., et al. (2014). Altering spatial priority maps via reward-based learning. Journal of Neuroscience, 34, 8594–8604.
Corbetta, M., Kincade, J. M., Ollinger, J. M., McAvoy, M. P., & Shulman, G. L. (2000). Voluntary orienting is dissociated from target detection in human posterior parietal cortex. Nature Neuroscience, 3, 292–297.
Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3, 201–215.
Cowan, N. (1988). Evolving conceptions of memory storage, selective attention, and their mutual constraints within the human information-processing system. Psychological Bulletin, 104, 163–191.
Cowan, N. (1995). Attention and memory: An integrated framework. New York: Oxford University Press.
Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24, 87–185.
Cowan, N., Elliott, E. M., Saults, J. S., Morey, C. C., Mattox, S., Hismjatullina, A., et al. (2005). On the capacity of attention: Its estimation and its role in working memory and cognitive aptitudes. Cognitive Psychology, 51, 42–100.
Cowan, N., Fristoe, N. M., Elliott, E. M., Brunner, R. P., & Saults, J. S. (2006). Scope of attention, control of attention, and intelligence in children and adults. Memory & Cognition, 34, 1754–1768.
Cowan, N., Li, D., Moffitt, A., Becker, T. M., Martin, E. A., Saults, J. S., et al. (2011). A neural region of abstract working memory. Journal of Cognitive Neuroscience, 23, 2852–2863.
Deng, Y., Chou, T. L., Ding, G. S., Peng, D. L., & Booth, J. R. (2011). The involvement of occipital and inferior frontal cortex in the phonological learning of Chinese characters. Journal of Cognitive Neuroscience, 23, 1998–2012.
D'Esposito, M., & Postle, B. R. (2015). The cognitive neuroscience of working memory. Annual Review of Psychology, 66, 115–142.
Eklund, A., Nichols, T. E., & Knutsson, H. (2016). Cluster failure: Why fMRI inferences for spatial extent have inflated false-positive rates. Proceedings of the National Academy of Sciences, U.S.A., 113, 7900–7905.
Emrich, S. M., Riggall, A. C., LaRocque, J. J., & Postle, B. R. (2013). Distributed patterns of activity in sensory cortex reflect precision of multiple items maintained in visual short-term memory. Journal of Neuroscience, 33, 6516–6523.
Engle, R. W., Kane, M. J., & Tuholski, S. W. (1999). Individual differences in working memory capacity and what they tell us about controlled attention, general fluid intelligence, and functions of the prefrontal cortex. In A. Miyake & P. Shah (Eds.), Models of working memory: Mechanisms of active maintenance and executive control (pp. 102–134). Cambridge: Cambridge University Press.
Esterman, M., Chiu, Y. C., Tamber-Rosenau, B. J., & Yantis, S. (2009). Decoding cognitive control in human parietal cortex. Proceedings of the National Academy of Sciences, U.S.A., 106, 17974–17979.
Fougnie, D., & Marois, R. (2007). Executive working memory load induces inattentional blindness. Psychonomic Bulletin and Review, 14, 142–147.
Friston, K. J., Penny, W. D., & Glaser, D. E. (2005). Conjunction revisited. Neuroimage, 25, 661–667.
Gardumi, A., Ivanov, D., Hausfeld, L., Valente, G., Formisano, E., & Uludag, K. (2016). The effect of spatial resolution on decoding accuracy in fMRI multivariate pattern analysis. Neuroimage, 132, 32–42.
Gillebert, C. R., Dyrholm, M., Vangkilde, S., Kyllingsbaek, S., Peeters, R., & Vandenberghe, R. (2012). Attentional priorities and access to short-term memory: Parietal interactions. Neuroimage, 62, 1551–1562.
Gillebert, C. R., Mantini, D., Peeters, R., Dupont, P., & Vandenberghe, R. (2013). Cytoarchitectonic mapping of attentional selection and reorienting in parietal cortex. Neuroimage, 67, 257–272.
Gottlieb, J. (2007). From thought to action: The parietal cortex as a bridge between perception, action, and cognition. Neuron, 53, 9–16.
Gray, S., Green, S., Alt, M., Hogan, T. P., Kuo, T., Brinkley, S., et al. (2017). The structure of working memory in young children and its relation to intelligence. Journal of Memory and Language, 92, 183–201.
Guntupalli, J. S., Wheeler, K. G., & Gobbini, M. I. (2017). Disentangling the representation of identity from head view along the human face processing pathway. Cerebral Cortex, 27, 46–53.
Hockey, R. (1973). Changes in information-selection patterns in multisource monitoring as a function of induced arousal shifts. Journal of Experimental Psychology, 101, 35–42.
Hutton, C., Bork, A., Josephs, O., Deichmann, R., Ashburner, J., & Turner, R. (2002). Image distortion correction in fMRI: A quantitative evaluation. Neuroimage, 16, 217–240.
Konoike, N., Kotozaki, Y., Jeong, H., Miyazaki, A., Sakaki, K., Shinada, T., et al. (2015). Temporal and motor representation of rhythm in fronto-parietal cortical areas: An fMRI study. PLoS One, 10, e0130120.
Kriegeskorte, N., Goebel, R., & Bandettini, P. (2006). Information-based functional brain mapping. Proceedings of the National Academy of Sciences, U.S.A., 103, 3863–3868.
Kurth, S., Majerus, S., Bastin, C., Collette, F., Jaspar, M., Bahri, M. A., et al. (2016). Effects of aging on task- and stimulus-related cerebral attention networks. Neurobiology of Aging, 44, 85–95.
Lee, M. D., & Wagenmakers, E. J. (2013). Bayesian modeling for cognitive science: A practical course. Cambridge, UK: Cambridge University Press.
Majerus, S., Attout, L., D'Argembeau, A., Degueldre, C., Fias, W., Maquet, P., et al. (2012). Attention supports verbal short-term memory via competition between dorsal and ventral attention networks. Cerebral Cortex, 22, 1086–1097.
Majerus, S., Cowan, N., Peters, F., Van Calster, L., Phillips, C., & Schrouff, J. (2016). Cross-modal decoding of neural patterns associated with working memory: Evidence for attention-based accounts of working memory. Cerebral Cortex, 26, 166–179.
Majerus, S., D'Argembeau, A., Martinez, T., Belayachi, S., Van der Linden, M., Collette, F., et al. (2010). The commonality of neural networks for verbal and visual short-term memory. Journal of Cognitive Neuroscience, 22, 2570–2593.
Mathy, F., & Feldman, J. (2012). What's magic about numbers? Chunking and data compression in short-term memory. Cognition, 122, 346–362.
Mikl, M., Marecek, R., Hlustik, P., Pavlicova, M., Drastich, A., Chlebus, P., et al. (2008). Effects of spatial smoothing on fMRI group inferences. Magnetic Resonance Imaging, 26, 490–503.
Moore, T. M., Reise, S. P., Depaoli, S., & Haviland, M. G. (2015). Iteration of partially specified target matrices: Applications in exploratory and Bayesian confirmatory factor analysis. Multivariate Behavioral Research, 50, 149–161.
Noirhomme, Q., Lesenfants, D., Gomez, F., Soddu, A., Schrouff, J., Garraux, G., et al. (2014). Biased binomial assessment of cross-validated estimation of classification accuracies illustrated in diagnosis predictions. Neuroimage Clinical, 4, 687–694.
Phillips, P. J., Wechsler, H., Huang, J., & Rauss, P. (1998). The FERET database and evaluation procedure for face recognition algorithms. Image and Vision Computing, 16, 295–306.
Piazza, M., Izard, V., Pinel, P., Le Bihan, D., & Dehaene, S. (2004). Tuning curves for approximate numerosity in the human intraparietal sulcus. Neuron, 44, 547–555.
Piazza, M., Pinel, P., Le Bihan, D., & Dehaene, S. (2007). A magnitude code common to numerosities and number symbols in human intraparietal cortex. Neuron, 53, 293–305.
Pollack, I., Johnson, L. B., & Knaff, P. R. (1959). Running memory span. Journal of Experimental Psychology, 57, 137–146.
Postle, B. R. (2006). Working memory as an emergent property of the mind and brain. Neuroscience, 139, 23–28.
Rauschecker, J. P., & Scott, S. K. (2009). Maps and streams in the auditory cortex: Nonhuman primates illuminate human speech processing. Nature Neuroscience, 12, 718–724.
Ravizza, S. M., Delgado, M. R., Chein, J. M., Becker, J. T., & Fiez, J. A. (2004). Functional dissociations within the inferior parietal cortex in verbal working memory. Neuroimage, 22, 562–573.
Riggall, A. C., & Postle, B. R. (2012). The relationship between working memory storage and elevated activity as measured with functional magnetic resonance imaging. Journal of Neuroscience, 32, 12990–12998.
Rossion, B., Caldara, R., Seghier, M., Schuller, A. M., Lazeyras, F., & Mayer, E. (2003). A network of occipito-temporal face-sensitive areas besides the right middle fusiform gyrus is necessary for normal face processing. Brain, 126, 2381–2395.
Savini, N., Brunetti, M., Babiloni, C., & Ferretti, A. (2012). Working memory of somatosensory stimuli: An fMRI study. International Journal of Psychophysiology, 86, 220–228.
Schrouff, J., Kussé, C., Wehenkel, L., Maquet, P., & Phillips, C. (2012). Decoding semi-constrained brain activity from fMRI using support vector machines and Gaussian processes. PLoS One, 7, e35860.
Schrouff, J., Rosa, M. J., Rondina, J. M., Marquand, M. F., Chu, C., Ashburner, J., et al. (2013). PRoNTo: Pattern Recognition for Neuroimaging Toolbox. Neuroinformatics, 11, 319–337.
Shulman, G. L., Astafiev, S. V., Franke, D., Pope, D. L., Snyder, A. Z., McAvoy, M. P., et al. (2009). Interaction of stimulus-driven reorienting and expectation in ventral and dorsal frontoparietal and basal ganglia-cortical networks. Journal of Neuroscience, 29, 4392–4407.
Skeide, M. A., & Friederici, A. D. (2016). The ontogeny of the cortical language network. Nature Reviews Neuroscience, 17, 323–332.
Sneve, M. H., Alnaes, D., Endestad, T., Greenlee, M. W., & Magnussen, S. (2012). Visual short-term memory: Activity supporting encoding and maintenance in retinotopic visual cortex. Neuroimage, 63, 166–178.
Sneve, M. H., Sreenivasan, K. K., Alnaes, D., Endestad, T., & Magnussen, S. (2015). Short-term retention of visual information: Evidence in support of feature-based attention as an underlying mechanism. Neuropsychologia, 66, 1–9.
Todd, J. J., Fougnie, D., & Marois, R. (2005). Visual short-term memory load suppresses temporo-parietal junction activity and induces inattentional blindness. Psychological Science, 16, 965–972.
Todd, J. J., & Marois, R. (2004). Capacity limit of visual short-term memory in human posterior parietal cortex. Nature, 428, 751–754.