We investigated the sources of dual-task costs arising in multisensory working memory (WM) tasks, where stimuli from different modalities have to be simultaneously maintained. Performance decrements relative to unimodal single-task baselines have been attributed to a modality-unspecific central WM store, but such costs could also reflect increased demands on central executive processes involved in dual-task coordination. To compare these hypotheses, we asked participants to maintain two, three, or four visual items. Unimodal trials, where only this visual task was performed, and bimodal trials, where a concurrent tactile WM task required the additional maintenance of two tactile items, were randomly intermixed. We measured the visual and tactile contralateral delay activity (CDA/tCDA components) as markers of WM maintenance in visual and somatosensory areas. There were reliable dual-task costs, as visual CDA components were reduced in size and visual WM accuracy was impaired on bimodal relative to unimodal trials. However, these costs did not depend on visual load, which caused identical CDA modulations in unimodal and bimodal trials, suggesting that memorizing tactile items did not reduce the number of visual items that could be maintained. Visual load also did not affect tCDA amplitudes. These findings indicate that bimodal dual-task costs do not result from a competition between multisensory items for shared storage capacity. Instead, these costs reflect generic limitations of executive control mechanisms that coordinate multiple cognitive processes in dual tasks. Our results support hierarchical models of WM, where distributed maintenance processes with modality-specific capacity limitations are controlled by a central executive mechanism.
The capacity of working memory (WM) is strictly limited (Cowan, 2001), but the reasons for this limitation remain under debate. Early accounts that assumed a unitary memory system for the short-term storage of information (Atkinson & Shiffrin, 1968) have been superseded by a multicomponent model (Baddeley & Hitch, 1974) that proposes separate domain-specific storage systems for auditory and visual information, which are controlled by a central executive mechanism. The hypothesis that WM storage processes operate in a domain-specific fashion is in line with evidence for content-specific WM capacity limitations (Fougnie, Zughni, Godwin, & Marois, 2015; Wheeler & Treisman, 2002). For example, the capacity of visual WM for colors is limited to three to five items (Cowan, 2001), consistent with neuroimaging studies that reported corresponding set size effects (Todd & Marois, 2004; Vogel & Machizawa, 2004). However, the hypothesis that WM capacity is strictly domain-specific has been challenged by an alternative account. This account assumes that WM capacity is shared across different sensory modalities, as it posits that the storage of sensory information in WM is mediated by a domain-general process that maintains items of any modality, resulting in a shared central WM store (Cowan, 2010, 2011; Barrouillet, Bernardin, Portrat, Vergauwe, & Camos, 2007; for functional brain imaging evidence, see Majerus et al., 2016; Cowan et al., 2011).
Behavioral evidence consistent with this view comes from auditory/visual dual-task experiments. These studies reported performance costs when a WM task was performed together with a memory task in another modality (dual-task condition) relative to a single-task baseline, where the same task was performed in isolation (Cowan, Saults, & Blume, 2014; Fougnie & Marois, 2011; Saults & Cowan, 2007; Morey & Cowan, 2005; Cocchini, Logie, Della Sala, MacPherson, & Baddeley, 2002). Such dual-task costs have been attributed to a central storage capacity, which is shared between two modalities on bimodal trials, but can be fully allocated to a single modality on unimodal baseline trials. However, alternative explanations of these costs remain viable. They could be unrelated to limitations in storage capacity (Cocchini et al., 2002) and instead reflect the increased executive demands of dual-tasking (for a review, see Vandierendonck, 2016), which requires cognitive control functions that play no role in single-task baselines (cf. Logie, Cocchini, Della Sala, & Baddeley, 2004). The need to effectively coordinate two concurrent maintenance processes in vision and audition could impair WM performance relative to single tasks, even if the number of items that are maintained by these processes remains entirely unaffected by dual-tasking. In other words, dual-task costs in multisensory WM tasks may reflect a fundamental domain-general bottleneck that is unrelated to quantitative WM capacity limitations. Such a bottleneck could affect performance whenever two tasks are performed concurrently or in close succession, regardless of whether these tasks are perceptual (requiring an immediate response, see Pashler, 1994) or mnemonic (requiring a delayed response after a maintenance phase). This was illustrated in an ERP study where participants performed a speeded response to an auditory stimulus that was followed by a visual search display (Brisson & Jolicoeur, 2007).
When the SOA between auditory and visual stimuli decreased, the onset of neural activity indicating the maintenance of search targets in WM was delayed. Because the auditory task did not require WM storage, this delay cannot be due to competition between auditory and visual stimuli for a central storage capacity but may instead reflect an executive bottleneck for the control of dual-tasks.
The question whether performance costs in bimodal relative to unimodal WM tasks reflect a competition between items from different modalities for a shared central WM store or whether they instead represent dual-task coordination costs is difficult to resolve exclusively on the basis of behavioral measures. Both types of mechanisms will result in impaired WM performance, and both will therefore affect estimates of WM storage capacity (such as Cowan's K; Cowan, 2001), even though the increased demands on executive control in bimodal WM dual tasks are essentially unrelated to WM capacity limitations. In contrast to performance, which is assessed at the end of a trial in response to test displays, EEG-based measures can track WM maintenance processes directly when they occur. For example, the contralateral delay activity (CDA) that is observed over posterior visual areas during visual WM maintenance increases in amplitude in a load-dependent fashion until WM capacity is reached (e.g., Fukuda, Awh, & Vogel, 2010) and thus provides an online measure of the number of items that can be stored in visual WM. Earlier studies reported CDA enhancements when WM load was increased from one to two to three items and no additional enhancement (Vogel & Machizawa, 2004) or a decrease in amplitude (Fukuda, Woodman, & Vogel, 2015) with four items. The CDA typically emerges around 300 msec after the onset of a visual sample set and is often preceded by an N2pc component (e.g., Drew, Horowitz, Wolfe, & Vogel, 2012), which reflects the allocation of attention to sample stimuli during their perceptual encoding. The maintenance of tactile stimuli in WM elicits the tactile CDA component (tCDA) over somatosensory cortex (Katus & Müller, 2016; Katus & Eimer, 2015; Katus, Müller, & Eimer, 2015), which is also sensitive to tactile WM load. 
The tCDA increases in size when tactile WM load is increased from one to two items but shows no further amplitude enhancement for higher tactile WM loads (Katus, Grubert, & Eimer, 2015). Analogous to the visual CDA, tCDA components are preceded by the somatosensory equivalent of the N2pc (N2cc component), which shows a topography centered over somatosensory areas (Katus, Grubert, et al., 2015) and reflects the encoding of tactile stimuli into WM.
Because the CDA and tCDA reflect dissociable maintenance processes for visual and tactile information (Katus & Eimer, 2016), measuring both components in bimodal visual–tactile WM tasks can provide direct insights about the nature of WM capacity limits. The tCDA can be coregistered with the visual CDA in tactile/visual dual tasks (Katus & Eimer, 2016, 2018; Katus, Grubert, & Eimer, 2017) after transforming EEG data to current source densities (CSDs; Tenke & Kayser, 2012). If visual and tactile items compete for access to a capacity-limited central WM store, increasing the number of tactile items in this store would reduce its remaining capacity for visual items and vice versa. An increase in tactile load that results in corresponding tCDA enhancements should hence preclude any CDA enhancements when visual load is also increased. In a recent ERP study (Katus & Eimer, 2018), we tested this prediction in a task where visual and tactile WM load was manipulated independently (one, two, or three items per modality). We found no load-dependent crossmodal interactions. Although CDA and tCDA amplitudes increased when WM load in the corresponding visual or tactile modality was increased, these load effects were entirely modality specific, as increasing tactile WM load had no effect on CDA amplitudes and increasing visual load did not modulate tCDA amplitudes. These observations provide support for modality-specific WM capacity limitations but no evidence for the hypothesis that visual and tactile items were maintained in a shared WM store.
Our previous study (Katus & Eimer, 2018) demonstrated the existence of independent maintenance systems for tactile and visual information (see also Katus & Eimer, 2016). The existence of such modality-specific stores does not necessarily imply that there are no costs in tasks where visual and tactile stimuli have to be maintained in parallel, relative to the corresponding unimodal baseline conditions. Critically, such costs could arise due to the demands on top–down executive control processes that are involved in the coordination of concurrent WM maintenance processes in different modalities. In this case, these costs would not reflect the limitations of modality-unspecific central storage mechanisms. Importantly, such generic dual-task coordination costs should be entirely independent of the capacity limitations of modality-specific visual and tactile WM stores. This prediction has not been tested in our previous ERP studies of visual/tactile WM, because these studies did not include unimodal WM tasks and thus could not measure behavioral and electrophysiological correlates of bimodal dual-task costs that arise relative to unimodal baselines. In the present experiment, we employed a unimodal baseline condition to isolate such dual-task costs by identifying differences in the encoding or maintenance of sensory items in WM in bimodal as compared with unimodal task contexts. More specifically, we investigated how visual WM mechanisms are affected in a task where visual items have to be maintained concurrently with tactile items relative to a unimodal baseline task without tactile stimuli.
We used a lateralized change detection task where participants had to memorize two, three, or four visual items on each trial. The choice of these visual WM load conditions was based on previous behavioral and electrophysiological evidence that visual WM capacity is limited to about three items (Vogel & Machizawa, 2004; Cowan, 2001). Thus, a WM load of two, three, or four items should be below, at, or above the capacity of visual WM, respectively. This visual WM task was either performed in isolation (unimodal baseline) or concurrently with a tactile WM task where two items had to be memorized (bimodal dual-task condition). A tactile WM load of 2 was chosen for bimodal trials to ensure that tactile WM capacity was reached, but not exceeded, based on previous ERP evidence, suggesting a capacity limit of two items for tactile WM (e.g., Katus, Grubert, et al., 2015). Visual load (two, three, or four items) and tactile load (zero or two items) were manipulated orthogonally, resulting in six conditions that varied unpredictably across trials. Each trial included a unimodal or bimodal sample set that was followed after a retention period of 1000 msec by a unimodal test set (50% vision and 50% touch in bimodal trials; 100% vision in unimodal trials).
In the unimodal visual baseline, CDA components should reflect visual WM load, with increased amplitudes for Load 3 relative to Load 2 and no further increase for Load 4. If tactile and visual items compete for representation in a shared central WM store with a capacity limit of approximately four items, encoding two tactile items on bimodal trials into WM should diminish the storage capacity available for visual items. Hence, load-dependent CDA enhancements observed on unimodal trials should be eliminated—or strongly attenuated—on bimodal trials, resulting in an interaction between visual and tactile WM load. Given the absence of load-dependent interactions between visual and tactile CDA amplitudes in our previous study (Katus & Eimer, 2018), we predict no such competition on an item level in the current experiment. We assume instead that bimodal dual-task costs are the result of a competition between tactile and visual maintenance processes for access to a central executive control system. In this case, the pattern of CDA modulations caused by visual WM load should not differ between unimodal and bimodal trials. However and importantly, a general impairment in the effectiveness of visual WM maintenance during dual-task performance could result in a general reduction of CDA amplitudes on bimodal trials (main effect of tactile WM load) that is independent of visual WM load. In addition, we used the tCDA components elicited in bimodal trials to examine the effects of visual WM load on tactile WM maintenance. If dual-task costs do not arise because tactile and visual items compete for a central storage capacity but because tactile and visual maintenance processes rely on a shared central executive process, visual WM load should have no impact on tCDA amplitudes in bimodal trials. 
To evaluate the statistical reliability of any predicted null effects in electrophysiological and behavioral measures, Bayesian procedures (Rouder, Morey, Verhagen, Swagman, & Wagenmakers, 2017) were used. In contrast to conventional significance tests, these procedures can provide evidence for the null hypothesis.
It is possible that reduced CDA amplitudes in bimodal versus unimodal trials might be due to dual-task costs that are generated at an early perceptual encoding stage. If there was a bottleneck for encoding in tasks where visual and tactile stimuli are presented simultaneously, smaller CDA components on bimodal trials could be the result of fewer visual stimuli having entered visual WM on these trials. To test this hypothesis, we measured N2pc components to the visual sample sets. As the N2pc marks the allocation of attention to items during their encoding into WM, N2pc amplitudes should increase when visual sample size was increased. If there was a central bottleneck for encoding in bimodal trials, N2pcs to visual sample sets should generally be smaller on these trials. We also tested whether the N2cc component to tactile sample sets was affected by visual WM load on bimodal trials. A perceptual encoding bottleneck for tactile stimuli should be more pronounced when the number of visual items that have to be encoded concurrently is increased, resulting in smaller N2cc components on bimodal trials with larger visual WM load.
All participants were neurologically unimpaired and gave informed written consent before testing. Twenty paid volunteers were tested, and 18 participants remained in the sample for statistical analysis (mean age = 29 years, 13 women, 15 right-handed). One participant was excluded due to low performance (<60% correct in trials with a visual load of four items); the other excluded participant had excessive ocular artifacts that could not be corrected (see below for details on artifact correction procedures). The experiment was conducted in accordance with the Declaration of Helsinki and was approved by the Psychology ethics committee, Birkbeck, University of London.
We used a lateralized change detection task (672 trials, 12 blocks), where a sample set was followed by a test set after 1000 msec, with a presentation duration of 200 msec for each set (Figure 1). The sample set involved simultaneously presented tactile and visual stimuli in the bimodal condition, whereas only visual stimuli were presented in the single-task baseline condition. Participants memorized the locations of the tactile samples and/or the colors of the visual samples on one task-relevant side and judged whether the test set differed relative to the sample set (50% match, 50% mismatch). The task-relevant side changed after six blocks; the side relevant for the first six blocks (left or right) was randomly selected for each participant. In 336 bimodal trials, two tactile and a variable number of visual items (two, three, or four items, 112 trials per load condition) were presented simultaneously, and memory was unpredictably tested for either modality after the trial with a unimodal test set (50% touch or vision). In the unimodal single-task baseline, participants only received the visual samples (again, two, three, or four items, 112 trials per condition), and memory was always assessed with a visual test set. Visual WM load and tactile load (zero or two items in the single- and dual-task conditions, respectively) varied unpredictably on a trial-to-trial basis. Vocal responses (“a” for match, “e” for mismatch) were recorded using a headset microphone in the 2000 msec period following the memory test. Instructions emphasized accuracy over speed. One training block was run, and feedback on the percentage of correct responses was provided after each 4-min block. During EEG recordings, participants were asked to maintain central gaze fixation and to avoid head and body movements. Continuous white noise was played on headphones to mask any sounds produced by the tactile stimulators.
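The trial counts described above can be made concrete with a short sketch. It is an illustration only, not the experimental control code: the function name `build_trials`, the dictionary keys, and the decision to shuffle all 672 trials across the whole session (rather than balancing conditions within each block) are assumptions that the text does not specify.

```python
import random


def build_trials(seed=0):
    """Sketch of the 2 x 3 design: 112 trials per condition, 12 blocks of 56."""
    rng = random.Random(seed)
    trials = []
    for visual_load in (2, 3, 4):
        for tactile_load in (0, 2):  # 0 = unimodal baseline, 2 = bimodal
            for i in range(112):
                bimodal = tactile_load == 2
                # Bimodal trials: memory test alternates between modalities (50/50).
                tested = "touch" if bimodal and i % 2 == 1 else "vision"
                # Half of the trials per tested modality are match trials.
                match = (i // 2) % 2 == 0
                trials.append({
                    "visual_load": visual_load,
                    "tactile_load": tactile_load,
                    "tested": tested,
                    "match": match,
                })
    # Assumption: conditions are intermixed by shuffling the full trial list.
    rng.shuffle(trials)
    first_side = rng.choice(["left", "right"])
    other_side = "right" if first_side == "left" else "left"
    blocks = [trials[b * 56:(b + 1) * 56] for b in range(12)]
    for b, block in enumerate(blocks):
        # The task-relevant side switches after six blocks.
        side = first_side if b < 6 else other_side
        for trial in block:
            trial["side"] = side
    return blocks
```

Calling `build_trials()` returns 12 lists of 56 trial dictionaries; the 336 bimodal trials contain 168 tactile and 168 visual memory tests.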
Stimulus Material and Randomization Procedure
Visual stimuli were colored squares (width and height: 0.52° of visual angle), shown for 200 msec against a black background on a 22-in. monitor (Samsung wide SyncMaster 2233; 1280 × 1024 resolution, 100 Hz refresh rate, 16 msec RT). The same number of squares was simultaneously shown on the left and right sides on each trial. The sample set was presented again at memory test on match trials (50%), whereas the color of one randomly selected square changed on mismatch trials (50%). The colors were selected from 180 color values on a circular wheel, which was invisible to the participant, in the CIE L*a*b* color space (centered at L = 70, a = −6, b = 14; radius = 49). Randomization was separately performed for the left and right sides to avoid any systematic relation between the attended/ignored sides and is here explained for one side. In each trial, the color wheel was rotated by a random degree between 0° and 359°, and five equidistant color values were selected from the rotated wheel, starting at 1° with increments of 360°/5 degrees (i.e., 1°, 73°, 145°, 217°, and 289°). One of these color values was randomly selected to serve as distracter (shown on mismatch trials). A variable number n of the remaining four color values were randomly selected for the sample set, with n indexing visual WM load in the given trial. On each side, the squares were randomized to the quadrants of an invisible matrix, with the constraint that symmetric quadrants were occupied on the left and right sides. The matrix comprised two columns (inner vs. outer column: 1.03° vs. 1.83° horizontal offset, relative to central fixation) and two rows (equidistantly above and below central fixation: 0.80° vertical offset).
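The color randomization for one side can be sketched as follows. This is a minimal illustration under stated assumptions: the names `lab_from_angle` and `sample_colors` are hypothetical, colors are computed from continuous wheel angles (the mapping of angles onto the 180 discrete wheel entries is not described above), and the angle-to-(a, b) convention (cosine for a, sine for b) is an arbitrary choice.

```python
import math
import random

# Wheel parameters as given in the text (CIE L*a*b*).
CENTER_L, CENTER_A, CENTER_B, RADIUS = 70.0, -6.0, 14.0, 49.0


def lab_from_angle(deg):
    """Return (L, a, b) for a wheel angle in degrees (assumed convention)."""
    rad = math.radians(deg)
    return (CENTER_L,
            CENTER_A + RADIUS * math.cos(rad),
            CENTER_B + RADIUS * math.sin(rad))


def sample_colors(visual_load, rng):
    """Pick sample angles and one distracter angle for one side of one trial."""
    rotation = rng.randrange(360)  # random wheel rotation, 0..359 degrees
    # Five equidistant positions: 1, 73, 145, 217, 289 degrees on the rotated wheel.
    angles = [(rotation + 1 + k * 72) % 360 for k in range(5)]
    distracter = rng.choice(angles)
    remaining = [a for a in angles if a != distracter]
    sample = rng.sample(remaining, visual_load)  # visual_load is 2, 3, or 4
    return sample, distracter
```

For example, `sample_colors(3, random.Random(0))` returns three sample angles and one distracter angle, all separated by multiples of 72°; on a mismatch trial, one sample color would be replaced by the distracter.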
The tactile stimuli were 100 Hz sinusoids (duration: 200 msec, intensity: 0.37 N), presented via eight mechanical stimulators on the left and right hands' distal phalanges of the index, middle, ring, and little fingers. Stimulators were driven by custom-built amplifiers, using an eight-channel sound card (M-Audio, Delta 1010LT), controlled by MATLAB routines (MathWorks). Tactile stimuli were only involved in bimodal trials, where the tactile samples were randomized to two stimulators, separately for the left and right hands. On match trials (50%), both sample stimuli on the task-relevant hand were repeated at memory test. On mismatch trials (50%), one test pulse on the task-relevant hand appeared at the same location as one of the sample pulses, and the location of the other pulse on this hand changed.
Processing of EEG Data
EEG data, sampled at 500 Hz using a BrainVision amplifier, were DC-recorded from 64 Ag/AgCl active electrodes at standard locations of the extended 10–20 system. Two electrodes at the outer canthi of the eyes monitored horizontal eye movements (horizontal EOG). Continuous EEG data were referenced to the left mastoid during recording and rereferenced to the arithmetic mean of both mastoids (electrode sites TP9 and TP10) for data preprocessing. Data were offline submitted to a low-pass filter (30 Hz cutoff, Blackman window, filter order 500). Epochs were extracted for the 1-sec period after the sample set and were corrected relative to a 200-msec prestimulus baseline.
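The low-pass filter parameters above (30 Hz cutoff, Blackman window, order 500, 500 Hz sampling rate) correspond to a windowed-sinc FIR design with 501 taps. The sketch below shows one standard way to build such a kernel; it is not the authors' filtering code, the function names are hypothetical, and treating 30 Hz as the half-amplitude (−6 dB) point is an assumption.

```python
import math


def blackman_lowpass(cutoff_hz=30.0, srate_hz=500.0, order=500):
    """Windowed-sinc low-pass FIR kernel with order + 1 taps (unit DC gain)."""
    fc = cutoff_hz / srate_hz  # cutoff in cycles per sample
    taps = []
    for n in range(order + 1):
        k = n - order / 2  # distance from the kernel centre
        # Ideal (sinc) low-pass impulse response.
        if k == 0:
            ideal = 2 * fc
        else:
            ideal = math.sin(2 * math.pi * fc * k) / (math.pi * k)
        # Blackman window.
        window = (0.42
                  - 0.5 * math.cos(2 * math.pi * n / order)
                  + 0.08 * math.cos(4 * math.pi * n / order))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def gain(taps, freq_hz, srate_hz=500.0):
    """Magnitude response of the kernel at a single frequency."""
    re = sum(t * math.cos(2 * math.pi * freq_hz * n / srate_hz)
             for n, t in enumerate(taps))
    im = sum(t * math.sin(2 * math.pi * freq_hz * n / srate_hz)
             for n, t in enumerate(taps))
    return math.hypot(re, im)
```

With these settings the kernel passes slow activity such as the CDA essentially unchanged and strongly attenuates frequencies well above 30 Hz; `gain(blackman_lowpass(), f)` can be used to inspect the response at any frequency `f`.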
Artifact correction was based on EEGLab using independent component analysis (Delorme, Sejnowski, & Makeig, 2007; Delorme & Makeig, 2004). Independent components accounting for eye blinks were subtracted from the data. Epochs with lateral eye movements were identified and rejected using a differential step function that ran on the bipolarized horizontal EOG (step width 100 msec, threshold 30 μV). Subsequently, independent components accounting for horizontal eye movements were subtracted to remove residual traces of ocular artifacts that had not exceeded the amplitude threshold of the step function. Trials where any electrode exceeded a 100-μV amplitude threshold were discarded, and the remaining epochs entered Fully Automated Statistical Thresholding for EEG Artifact Rejection (FASTER; Nolan, Whelan, & Reilly, 2010) for the interpolation of noisy electrodes. Subsequently, EEG data were converted to CSDs (iterations = 50, m = 4, lambda = 10⁻⁵; Tenke & Kayser, 2012). 98.6% of all epochs remained for analysis after artifact rejection. Statistical tests were based on correct and incorrect trials, as the exclusion of incorrect trials did not change the pattern of results but would have reduced the signal-to-noise ratio of EEG data.
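The step-function criterion for lateral eye movements can be illustrated with a short sketch. The exact definition used in the analysis pipeline is not given above; the version below is one common variant, which compares the mean of the bipolar horizontal EOG in the 100 msec before and the 100 msec after each time point, and it should be read as an assumption. The function name `has_step` is hypothetical.

```python
def has_step(heog_uv, srate_hz=500, width_ms=100, threshold_uv=30.0):
    """Return True if the bipolar HEOG trace contains a step above threshold.

    heog_uv: list of bipolar horizontal EOG samples (in microvolts) for one epoch.
    """
    w = int(round(width_ms * srate_hz / 1000))  # window length in samples
    for t in range(w, len(heog_uv) - w + 1):
        before = sum(heog_uv[t - w:t]) / w  # mean of the preceding window
        after = sum(heog_uv[t:t + w]) / w   # mean of the following window
        if abs(after - before) > threshold_uv:
            return True
    return False
```

An epoch would be rejected when `has_step` returns True. A sustained 40 μV shift in the trace exceeds the 30 μV threshold, whereas a 20 μV shift does not and would be left for the subsequent independent component correction.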
CSDs were separately averaged across three adjacent electrodes contralateral and ipsilateral to the task-relevant side. Tactile CDA (tCDA component) was measured at lateral central scalp regions (C3/4, FC3/4, CP3/4), and visual CDA was measured at lateral occipital regions (PO7/8, PO3/4, O1/2; same electrodes as in prior work: Katus & Eimer, 2016, 2018; Katus et al., 2017). Statistical tests were conducted on difference values of contra- minus ipsilateral ERPs, averaged between 300 and 1000 msec after sample onset (e.g., Vogel & Machizawa, 2004). When necessary, we adjusted the degrees of freedom in ANOVAs using Greenhouse–Geisser corrections. Error bars in graphs indicate 95% confidence intervals for the true population mean.
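The contralateral-minus-ipsilateral measure can be sketched for a single averaged waveform as follows. This is an illustration, not the analysis code: the function name, the dictionary-based data layout, and the inclusive window boundaries are assumptions. Odd-numbered electrodes lie over the left hemisphere, so they are contralateral when the task-relevant side is the right.

```python
VISUAL_PAIRS = [("PO7", "PO8"), ("PO3", "PO4"), ("O1", "O2")]    # CDA cluster
TACTILE_PAIRS = [("C3", "C4"), ("FC3", "FC4"), ("CP3", "CP4")]   # tCDA cluster


def lateralized_mean(erp, times_ms, relevant_side, pairs,
                     t_start=300, t_end=1000):
    """Mean contra-minus-ipsi amplitude in a time window.

    erp: dict mapping electrode name -> list of CSD values (one per time point).
    times_ms: list of time points in msec relative to sample onset.
    relevant_side: "left" or "right" (the task-relevant side).
    """
    idx = [i for i, t in enumerate(times_ms) if t_start <= t <= t_end]

    def cluster_mean(names):
        total = sum(erp[name][i] for name in names for i in idx)
        return total / (len(names) * len(idx))

    left = [l for l, _ in pairs]
    right = [r for _, r in pairs]
    # Contralateral = hemisphere opposite to the task-relevant side.
    contra, ipsi = (right, left) if relevant_side == "left" else (left, right)
    return cluster_mean(contra) - cluster_mean(ipsi)
```

Passing `TACTILE_PAIRS` yields the tCDA measure; narrower windows (for example, 200 to 300 msec for the N2pc) can be obtained by changing `t_start` and `t_end`.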
Bayesian t tests (Rouder et al., 2017) and the software JASP (JASP Team, 2016) were used to calculate Bayes factors (BFs) for each main effect/interaction in our statistical designs, as conventional null hypothesis significance tests do not permit interpreting nonsignificant results. BFs, in contrast, quantify the empirical evidence in the data for the presence—or absence—of modulations and thus principally allow for accepting the null hypothesis. The BF for the null hypothesis (BF01) is the inverse of the BF for the alternative hypothesis (BF10) and denotes the relative evidence in the data supporting the hypothesis that a statistical effect is absent rather than present. We either report BF10 or BF01, depending on which hypothesis is more likely to be true for a particular effect. Reliable evidence for either hypothesis is indexed by a BF larger than 3 (Jeffreys, 1961), suggesting that the empirical data are at least three times more likely under the respective hypothesis (relative to the competing hypothesis).
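The relation between BF10 and BF01 and the threshold of 3 can be written out in a few lines. This sketch only restates the reporting convention described above; the function names and the label strings are hypothetical.

```python
def bf01_from_bf10(bf10):
    """BF01 is the inverse of BF10."""
    return 1.0 / bf10


def classify(bf10, threshold=3.0):
    """Label the evidence using the threshold of 3 adopted in the text."""
    if bf10 > threshold:
        return "evidence for the alternative hypothesis"
    if bf01_from_bf10(bf10) > threshold:
        return "evidence for the null hypothesis"
    return "inconclusive"
```

For instance, a reported BF01 of 6.795 corresponds to a BF10 of about 0.147 and is classified as evidence for the null hypothesis, whereas a BF01 of 2.580 falls short of the threshold and is classified as inconclusive.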
Visual CDA Components
Visual CDA amplitudes (Figures 2A and 3A) entered an ANOVA with the factors Visual load (two, three, or four visual items) and Tactile load (zero or two tactile items: single-task vs. dual-task trials, respectively). As expected, there was a main effect of Visual WM load on CDA amplitudes (Visual load: F(1.309, 22.256) = 8.309, p = .005, BF10 = 71.481; see Figure 4A). CDA amplitudes (collapsed across single-task and dual-task trials) were enhanced when visual load increased from two to three items, t(17) = 3.564, p = .002, BF10 = 17.685. No further CDA enhancement was obtained when WM load increased to four items. CDA amplitudes were in fact smaller relative to trials with a WM load of three items, t(17) = 4.301, p < .001, BF10 = 70.303. Critically, this pattern of load-dependent CDA modulations was identical in single-task trials and dual-task trials where the visual WM task was performed together with the tactile WM task (Tactile Load × Visual Load: F(2, 34) = 0.025, p = .975, BF01 = 6.795; see Figure 4A). Although the presence versus absence of a tactile WM task had no impact on CDA load effects, there was a general reduction of CDA amplitudes in dual- relative to single-task trials (tactile load: F(1, 17) = 9.454, p = .007, BF10 = 4.181).
Visual N2pc Components
To assess whether the generic dual-task costs observed for CDA amplitudes were already present during the encoding of the visual samples, we examined the N2pc component to visual sample stimuli (measured between 200 and 300 msec poststimulus at the same electrodes as the CDA). Across all three visual load conditions, N2pcs were present both in unimodal trials, t(17) = 2.144, p = .047, BF10 = 1.519, as well as bimodal trials, t(17) = 2.498, p = .023, BF10 = 2.668. A main effect of Visual load, F(2, 34) = 7.690, p = .0018, BF10 = 114.360, demonstrated that N2pc amplitudes increased when visual WM load was increased. When collapsed across single-task and dual-task trials, N2pc components were larger with Load 3 relative to Load 2, t(17) = 3.556, p = .002, BF10 = 17.427, with no further increase when visual WM load increased to four items, t(17) = 0.109, p = .915, BF01 = 4.092. Critically, as illustrated in Figure 3A, N2pc amplitudes were essentially identical on single-task and dual-task trials (tactile load: F(1, 17) = 1.022, p = .326, BF01 = 3.362). There was also no interaction between tactile load and visual load, F(2, 34) = 0.241, p = .787, BF01 = 5.978.
Tactile CDA and N2cc Components
The amplitudes of tactile CDA components on dual-task trials were not affected by concurrent visual WM load, F(2, 34) = 1.377, p = .266, BF01 = 2.580 (see Figures 2B, 3B, and 4A). A direct comparison of contralateral and ipsilateral CSDs (collapsed across all three levels of visual load) confirmed that tCDA components were reliably elicited on dual-task trials, t(17) = 4.936, p < .001, BF10 = 230.930. An analogous contralateral–ipsilateral comparison showed that the tCDA was absent in the visual single-task baseline, t(17) = 0.544, p = .594, BF01 = 3.607, demonstrating that the maintenance of visual items did not produce any lateralized activity over somatosensory regions (see Figure 4B). To assess whether the perceptual encoding of tactile sample stimuli was affected by concurrent visual WM load, we measured N2cc components elicited between 180 and 260 msec after sample stimulus onset (i.e., the N2cc time window used in our previous study; Katus, Grubert, et al., 2015) at the same electrodes where the tCDA was recorded. The N2cc was reliably present in bimodal trials (collapsed across the visual load conditions: t(17) = 5.787, p < 10⁻⁴, BF10 > 10³). Critically, N2cc amplitudes measured in bimodal trials were entirely unaffected by whether two, three, or four visual stimuli had to be encoded at the same time (visual load: F(2, 34) = 0.232, p = .795, BF01 = 5.917).
When memory was tested for vision, correct responses were obtained in 89.7% of trials. Performance sharply decreased when visual load increased from two to three and four items (visual load: F(1.339, 22.766) = 73.939, p < 10⁻⁸, BF10 > 10²³; Figure 4B). Critically and analogous to the pattern found for the CDA component, this load-dependent modulation of visual WM performance was identical on single-task and dual-task trials (Tactile load × Visual load: F(2, 34) = 0.636, p = .536, BF01 = 5.083; Figure 4B). On dual-task trials where two tactile items were simultaneously maintained, visual WM performance dropped by 1.3% relative to single-task trials (89.1% vs. 90.4%, main effect Tactile load: F(1, 17) = 4.750, p = .044, BF01 = 1.933); this effect, although significant, was statistically less reliable than the dual-task costs observed for the CDA component (compare Figure 4A and B, left). Tactile WM accuracy (87.2% correct) was not influenced by visual load on bimodal trials where memory was tested for touch (visual load: F(2, 34) = 0.467, p = .631, BF01 = 5.004; Figure 4B, right).
Responses to visual test displays slowed with increasing visual WM load (672, 728, and 773 msec for two, three, and four visual items; main effect Visual load: F(1.294, 21.999) = 53.883, p < 10⁻⁷, BF10 > 10¹⁷). This load-related RT increase did not differ between single- and dual-task trials (Tactile load × Visual load: F(2, 34) = 0.411, p = .666, BF01 = 6.312). There was, however, an RT cost of 34 msec on dual- relative to single-task trials (741 vs. 707 msec, main effect Tactile load: F(1, 17) = 25.091, p = 10⁻⁴, BF10 > 10⁴). RTs to tactile memory tests were not significantly affected by visual WM load, F(1, 17) = 2.653, p = .085, BF01 = 1.102.
We employed behavioral and electrophysiological measures to examine how the WM maintenance of items in one sensory modality is affected by the concurrent maintenance of items in another modality. Participants performed a visual WM task that required them to memorize two, three, or four visual items, either in isolation (unimodal trials) or simultaneously with two tactile items (bimodal trials). The maintenance of visual and tactile items in WM was tracked by measuring visual and tactile CDA components. Consistent with previous observations (e.g., Vogel & Machizawa, 2004), varying visual WM load modulated CDA amplitudes. Critically, however, although electrophysiological and behavioral bimodal dual-task costs were present, CDA modulations due to visual WM load did not differ between unimodal and bimodal trials.
The key finding of this study is that visual WM maintenance operated less effectively in trials where tactile information was simultaneously maintained relative to unimodal trials. CDA amplitudes were generally smaller in dual-task trials. There was also a small drop in visual WM accuracy for bimodal trials relative to the unimodal single-task baselines. Dual-task costs on bimodal trials have been interpreted as reflecting the competition between multisensory items for central storage capacity (e.g., Cowan, 2011; Saults & Cowan, 2007). If this interpretation was correct, tactile items should compete with visual items for access to a limited capacity central WM store on bimodal trials, thus reducing the probability that a particular visual item will be represented in this store. Because no such competition occurs on unimodal trials, more visual items will be able to gain access to this store. As a result, load-dependent increases of visual CDA amplitudes on unimodal trials should be much smaller or possibly entirely absent on bimodal trials, as reflected by an interaction between tactile load and visual load. In fact, the profiles of CDA modulations produced by varying visual WM load were identical in single- and dual-task trials. This can be seen in Figure 4A (left), which shows that, for both types of trials, maximal CDA amplitudes were found when three visual items had to be maintained, in line with the three-item capacity limit reported in unimodal visual WM experiments (Fukuda et al., 2010, 2015; Vogel & Machizawa, 2004).1 These observations are inconsistent with models postulating a central storage capacity with a fixed item limit that is shared between vision and touch. This point can be made most clearly by focusing on the CDA amplitude enhancement produced by increasing visual load from two to three items, which demonstrates the storage of a third visual item. 
With this third item, visual WM capacity limits were reached, as no further CDA enhancement was found on unimodal trials with four items. If tactile and visual items were encoded into the same shared WM store, maintaining additional tactile items on bimodal trials should have prevented the storage of this third visual item, resulting in no CDA enhancement when visual load was increased from two to three items on bimodal trials. This was not the case, as increasing visual WM load from two to three items elicited identical CDA enhancements in unimodal and bimodal trials. The absence of any differences of CDA load effects between these two types of trials was confirmed by a reliable BF for the interaction between tactile load (zero vs. two items) and visual load (two vs. three items; BF01 = 4.114).
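To make the reported Bayes factor easier to read, the following minimal sketch converts BF01 = 4.114 into the equivalent BF10 and into a posterior probability for the null hypothesis (no Tactile Load × Visual Load interaction). The function names are hypothetical, and the equal prior odds for the two hypotheses are an illustrative assumption that is not stated in the text.

```python
def bf10_from_bf01(bf01: float) -> float:
    """Return the Bayes factor in favor of H1, the reciprocal of BF01."""
    if bf01 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    return 1.0 / bf01


def posterior_prob_null(bf01: float, prior_prob_null: float = 0.5) -> float:
    """Posterior probability of H0 given BF01 and a prior probability of H0.

    The default prior of 0.5 (equal prior odds) is an assumption made
    for illustration only.
    """
    if not 0.0 < prior_prob_null < 1.0:
        raise ValueError("prior_prob_null must lie strictly between 0 and 1.")
    prior_odds = prior_prob_null / (1.0 - prior_prob_null)
    posterior_odds = bf01 * prior_odds  # Bayes' rule in odds form
    return posterior_odds / (1.0 + posterior_odds)


# Value reported in the text for the interaction between tactile load
# (zero vs. two items) and visual load (two vs. three items).
BF01_INTERACTION = 4.114

if __name__ == "__main__":
    print(f"BF10 = {bf10_from_bf01(BF01_INTERACTION):.3f}")
    print(f"P(H0 | data) = {posterior_prob_null(BF01_INTERACTION):.3f}")
```

Under these assumptions, the data are about four times more likely under the null hypothesis than under the alternative, which falls in the range that is conventionally described as moderate evidence for the absence of an effect.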
A possible alternative explanation for the observation that the effects of visual WM load on CDA amplitudes were identical on unimodal and bimodal trials is that participants always prioritized the maintenance of visual information, regardless of whether tactile sample items were simultaneously presented or not. A previous behavioral study (Morey, Cowan, Morey, & Rouder, 2011) has shown that WM maintenance in a bimodal visual/auditory task with tones and colors can be selectively biased toward stimuli in one modality when the reward values associated with these two modalities are manipulated. In contrast, visual and tactile stimuli were equally task relevant on bimodal trials in this study, and there was no evidence that participants prioritized the maintenance of visual over tactile stimuli on these trials. If this had been the case, tCDA amplitudes and tactile WM performance should have been reduced when visual WM load was increased from two to three or four items. In fact, tCDA components and accuracy on bimodal trials where touch was tested were entirely unaffected by the manipulation of visual WM load.²
The maintenance of tactile items on bimodal trials was reflected by a tactile CDA (tCDA) over somatosensory cortex (Figures 2B, 3B, and 4A). tCDA amplitudes were not modulated by visual WM load (Figure 4A, right), demonstrating that the ability to store two tactile items was unaffected by the number of visual items that had to be maintained concurrently and confirming our earlier suggestion that the tactile and visual CDA components reflect modality-specific maintenance processes (e.g., Katus & Eimer, 2016). Overall, the absence of any differences of CDA modulations due to visual WM load between unimodal and bimodal trials and the absence of visual WM load effects on tCDA components on bimodal trials provide evidence that the capacities of visual and tactile WM storage processes are independent. This conclusion is consistent with the results of our recent tactile/visual dual-task experiment (Katus & Eimer, 2018), where manipulations of visual and tactile WM load caused strictly modality-specific electrophysiological and behavioral effects. The existence of modality-specific WM limits is in line with the sensory recruitment hypothesis (Jonides, Lacey, & Nee, 2005), which assumes that sensory stimuli are stored in the same regions that are responsible for their perceptual encoding (cf. Bergmann, Genc, Kohler, Singer, & Pearson, 2016). If visual, tactile, and auditory items are stored in anatomically and functionally segregated content maps (Franconeri, Alvarez, & Cavanagh, 2013), there should be little, if any, interference or competition between these items.
The central new finding of the present experiment was the presence of electrophysiological dual-task costs, reflected by reduced visual CDA amplitudes on dual-task trials where visual and tactile items had to be maintained concurrently relative to unimodal single-task baseline trials. Given the absence of any crossmodal WM load effects in this experiment (which confirms similar observations from our earlier ERP study; Katus & Eimer, 2018), the dual-task costs obtained here cannot be attributed to a capacity-limited central WM store and must therefore be produced by storage-unrelated central executive mechanisms involved in dual-task coordination. When visual and tactile WM tasks have to be performed concurrently, dual-task costs could, in principle, arise during the encoding or maintenance of memorized sample items or at a later stage when these items are matched to test stimuli. Current evidence for dual-task costs for WM retrieval and matching processes is mixed (Fougnie & Marois, 2009; Cowan & Morey, 2007). The fact that the modality of test stimuli is uncertain in bimodal WM tasks could affect the speed of sample–test comparisons and/or the preparation of stimulus–response mappings (cf. Logan, 1978). This may have contributed to the 34-msec dual-task costs observed in RTs to visual test stimuli in the present experiment.
It could be argued that the reduction of CDA amplitudes on bimodal versus unimodal trials does not reflect dual-task coordination costs for WM maintenance but an earlier bottleneck during the encoding of visual items into WM on bimodal trials. To test this, we analyzed N2pc components to visual samples in unimodal and bimodal trials. As expected, N2pc amplitudes were load sensitive, with larger N2pcs when visual load was increased from two to three items and no additional enhancement with four items. Analogous to the pattern observed for the CDA, these load-dependent N2pc modulations were identical on unimodal and bimodal trials. In contrast to the CDA, however, there were no overall dual-task costs for the N2pc, as N2pc amplitudes did not differ between unimodal and bimodal trials. This shows that the requirement to concurrently encode two tactile items on bimodal trials had no impact on the encoding of visual sample stimuli and provides strong evidence that dual-task interference effects did not arise during this early encoding stage. The fact that subsequent CDA components were reliably reduced in amplitude on bimodal relative to unimodal trials points toward the WM maintenance stage as the primary locus of dual-task interference. In addition, N2cc components to tactile sample stimuli were entirely unaffected by the number of visual stimuli that had to be encoded simultaneously. This provides additional evidence that there was no competition between visual and tactile items during the encoding stage on bimodal trials.
Why should WM maintenance processes be subject to dual-task costs in bimodal trials? We propose that these costs reflect domain-general limitations in the ability to perform two tasks concurrently. Such limitations can indeed be found in multisensory WM experiments (e.g., Cocchini et al., 2002), but they are not exclusive to these tasks. Similar multitasking costs have also been obtained in perceptual tasks with no apparent WM demands, where two target stimuli are presented sequentially, separated by a variable SOA, as in the psychological refractory period (PRP) paradigm. Here, responses to the second target are progressively delayed as the SOA decreases, suggesting a central bottleneck at the stage of response selection (Pashler, 1994). Similar delays have also been reported in modified PRP tasks where the second target display had to be encoded into WM (Brisson & Jolicoeur, 2007; Dell'Acqua & Jolicoeur, 2000). A central bottleneck associated with the encoding of target stimuli into WM may also be responsible for the attentional blink (Awh, Vogel, & Oh, 2006; Vogel & Luck, 2002; Chun & Potter, 1995; for the relation between attentional blink and PRP effects, see Wong, 2002). The dual-task costs for visual WM maintenance reflected by the reduction of CDA amplitudes on bimodal trials in the current experiment are thus likely to reflect the additional demands on central executive processes that are domain general (because they are involved in perceptual and WM tasks) and modality unspecific (because they are engaged by visual, auditory, and tactile tasks).
Why would executive demands linked with multitasking impair WM maintenance, given that the maintenance of stimuli from different modalities is mediated by parallel processes with modality-specific capacity limitations (Katus & Eimer, 2018; Fougnie et al., 2015)? The neural networks underlying the executive control of lower-level cognitive mechanisms are primarily located in pFC (Koechlin, Ody, & Kouneiher, 2003), and they regulate the operation of perceptual networks involved in the storage of sensory information via top–down feedback signals to the relevant perceptual (e.g., somatosensory or visual) brain areas (D'Esposito, 2007; Curtis & D'Esposito, 2003). Dual-task costs in bimodal WM tasks may arise when multiple top–down control processes involving different sensory regions are concurrently activated. One possibility is that these executive processes operate in a strictly serial fashion (Tamber-Rosenau & Marois, 2016; Marti, Sigman, & Dehaene, 2012), so that top–down signals to visual or tactile regions are sent at different points in time, with rapid switches between transient visual and tactile WM activation processes. Another possibility is that these executive control processes operate in parallel but that interference between them results in less efficient or less precisely targeted feedback signals (Oberauer & Kliegl, 2004) relative to unimodal WM tasks. As a result, information stored in modality-specific areas might be subject to faster decay (Ricker, Vergauwe, & Cowan, 2016) or stronger interitem competition within content-specific maps (Franconeri et al., 2013). At the electrophysiological level, these processes should result in a general reduction of CDA amplitudes on bimodal dual-task trials.
At the cognitive level, they could affect the precision with which particular items are represented in WM (see Luck & Vogel, 2013, for a review of quantitative and qualitative aspects of maintenance mechanisms associated with WM capacity and precision, respectively). It is important to note that measures of WM capacity such as Cowan's K are affected by all processes that modulate performance in WM tasks. For this reason, the dual-task costs that were found in previous behavioral bimodal WM experiments and were interpreted as evidence for domain-general storage mechanisms could primarily reflect the effects of the executive demands of dual-task coordination mechanisms on the precision of WM representations.
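The point that Cowan's K reflects every process that changes behavioral performance can be illustrated with a minimal sketch of the standard single-probe formula, K = N × (H − FA), where N is set size, H is the hit rate, and FA is the false alarm rate. The function name and all hit and false alarm rates below are hypothetical values chosen for illustration; they are not data from this experiment.

```python
def cowans_k(set_size: int, hit_rate: float, false_alarm_rate: float) -> float:
    """Estimate the number of items held in WM for single-probe change detection.

    Implements K = N * (H - FA). The estimate depends only on behavioral
    response rates, so any factor that lowers hits or raises false alarms
    (e.g., less precise representations) lowers K, even if the number of
    stored items is unchanged.
    """
    if set_size < 1:
        raise ValueError("set_size must be at least 1.")
    for rate in (hit_rate, false_alarm_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("Rates must lie between 0 and 1.")
    return set_size * (hit_rate - false_alarm_rate)


if __name__ == "__main__":
    # Hypothetical rates for a set size of three visual items.
    k_unimodal = cowans_k(set_size=3, hit_rate=0.90, false_alarm_rate=0.10)
    # Hypothetical dual-task condition: slightly fewer hits, more false alarms.
    k_bimodal = cowans_k(set_size=3, hit_rate=0.85, false_alarm_rate=0.15)
    print(f"K (unimodal, hypothetical) = {k_unimodal:.2f}")
    print(f"K (bimodal, hypothetical)  = {k_bimodal:.2f}")
```

In this sketch, K is lower in the hypothetical bimodal condition although nothing in the formula distinguishes a loss of stored items from a loss of precision, which is why a reduction in K alone cannot identify a shared storage limit.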
Using electrophysiological markers of WM maintenance, we investigated the sources of dual-task costs observed in bimodal WM tasks. We found that quantitative capacity limits of visual maintenance (i.e., the number of visual items held in WM) were not affected by concurrent tactile WM load. There was, however, a general cost for visual maintenance when it had to be coordinated with a tactile WM task. This cost is unrelated to storage capacity and reflects the executive control demands of coordinating two WM tasks, which impairs the effectiveness of content-specific maintenance processes. Dual-task coordination may reduce the precision of distributed sensory representations in perceptual brain areas, but not the number of items that can be stored in these areas.
This work was funded by the Leverhulme Trust (grant RPG-2015-370). We thank Laura Kischkel for proofreading the manuscript.
Reprint requests should be sent to Tobias Katus, Department of Psychology, Birkbeck, University of London, WC1E 7HX London, UK, or via e-mail: email@example.com.
1. We observed a significant drop in CDA amplitudes when visual WM load was increased from three to four items, in contrast to previous studies (e.g., Vogel & Machizawa, 2004), where the CDA appeared to reach a stable plateau for set sizes of more than three items. More recent experiments (Fukuda et al., 2015) showed that CDA amplitudes tend to decrease for supracapacity loads, especially for participants with below-average performance in the grand mean. The fact that this drop in CDA amplitudes was particularly pronounced in the present experiment may be related to task demands. We used 180 different color values, which were less easy to distinguish than the smaller sets of color categories employed in prior work.
2. Support for the absence of any strategic prioritization of vision over touch, or vice versa, was also provided in our previous EEG study (Katus & Eimer, 2018), where WM load (one, two, or three items) was varied orthogonally for vision and touch. To test for such strategic biases resulting in trade-offs between modalities, performance and ERP data for both modalities were submitted to the same ANOVA. For example, if visual stimuli had been prioritized, this should have resulted in large performance costs for the maintenance of three tactile items on trials when three as compared to just one visual item had to be retained, whereas tactile WM load should have had little effect on visual WM performance. Statistical analyses yielded strong evidence for the absence of such asymmetries, both for WM accuracy (BF01 = 17) and tCDA/CDA data (BF01 = 18).