When attention is directed to stimuli in a given modality and location, information processing in other irrelevant modalities at this location is affected too. This spread of attention to irrelevant stimuli is often interpreted as superiority of location selection over modality selection. However, this conclusion is based on experimental paradigms in which spatial attention was transient whereas intermodal attention was sustained. Furthermore, whether modality selection affects processing in the task-relevant modality at irrelevant locations remains an open question. Here, we addressed effects of simultaneous spatial and intermodal attention in an EEG study using a balanced design where spatial attention was transient and intermodal attention sustained or vice versa. Effects of spatial attention were not affected by which modality was attended and effects of intermodal attention were not affected by whether the stimuli were at the attended location or not. This suggests not only spread of spatial attention to task-irrelevant modalities but also spread of intermodal attention to task-irrelevant locations. Whether spatial attention was transient or sustained did not alter the effect of spatial attention on visual N1 and Nd1 responses. Prestimulus preparatory occipital alpha band responses were affected by both transient and sustained spatial cueing, whereas late post-stimulus responses were more strongly affected by sustained than by transient spatial attention. Sustained but not transient intermodal attention affected late responses (>200 msec) to visual stimuli. Together, the results undermine the universal superiority of spatial attention and suggest that the mode of attention manipulation is an important factor determining attention effects.
Imagine a scene at a dance show: Two couples are dancing on opposite sides of the stage, accompanied by live music. A spectator may choose to attend to dancers on one side or the other (i.e., spatial attention) or to ignore the dancing altogether and listen to the music (i.e., intermodal attention). But how do these two types of attention interact? If one chooses to pay attention to sounds coming from the left (perhaps the piano at the dance show), would one also be more attentive to the left side of the visual scene although it is irrelevant to one's goal? And if one chooses to pay attention to the piano sounds (rather than sight), would that lead the listener to be more attentive to sounds of other instruments at other locations of the scene (e.g., the violins, situated on the other side of the stage)? That is, does spatial attention spread to irrelevant modalities, and does intermodal attention spread to irrelevant locations? The former question, that is, whether spatial attention is supramodal or modality-specific, has been studied extensively using behavioral, fMRI, and EEG measures (e.g., Banerjee, Snyder, Molholm, & Foxe, 2011; Föcker, Hötting, Gondan, & Röder, 2010; Green, Teder-Sälejärvi, & McDonald, 2005; Eimer, van Velzen, & Driver, 2004; Teder-Sälejärvi, Münte, Sperlich, & Hillyard, 1999; Spence & Driver, 1996). The findings have mostly shown that spatial attention spreads to the irrelevant modality and have often been interpreted as superiority of spatial attention over intermodal attention (i.e., selection by space precedes selection by modality). However, the opposite question, whether intermodal attention similarly spreads to the unattended location, has never been tested in a setup where the two kinds of attention were manipulated equally. Instead, spatial attention was generally transiently cued, whereas attention to modality was sustained throughout blocks or experiments.
Measuring brain responses provides an opportunity to assess the processing of task-irrelevant information without requiring the participants in the experiments to overtly respond to the stimuli (which would defy their “irrelevance”). In a typical study, auditory, visual, or tactile stimuli are presented in one of two (or more) spatial locations. Responses to stimuli are measured while spatial attention is directed to the stimulus location or elsewhere, and the stimulus modality is either congruent or incongruent with the attended modality (Zhang, Hong, Gao, Gao, & Röder, 2011; Nager, Estorf, & Münte, 2006; Eimer et al., 2004; Eimer, van Velzen, & Driver, 2002; Teder-Sälejärvi et al., 1999; Eimer & Schröger, 1998; Hillyard, Simpson, Woods, Van Voorhis, & Münte, 1984). In these experiments, the processing of stimuli of the unattended modality (as measured by post-stimulus auditory and visual ERPs such as the auditory and visual N1 components, as well as later components) still benefited if the stimuli were at the attended location, leading to the conclusion that spatial attention is at least partly a supramodal mechanism. A similar conclusion was reached looking at the ADAN and LDAP EEG responses, which are measured following a spatial cue but before the target stimulus, presumably reflecting preparatory processes such as shift of attention or anticipation (e.g., Green & McDonald, 2006; Green et al., 2005; Eimer et al., 2002): Enhanced preparatory responses were found following spatial cues, whether participants were expecting stimuli in the relevant modality or in an irrelevant modality.
In the EEG frequency domain, it has been shown that spatial cueing leads to parietal alpha band suppression contralateral to the attended location, both in visual and auditory tasks, albeit with some differences in topography suggesting also the involvement of modality-specific mechanisms (e.g., Banerjee et al., 2011; Worden, Foxe, Wang, & Simpson, 2000; Foxe, Simpson, & Ahlfors, 1998). Taken together, these studies show that spatial attention spreads across modalities, supporting the involvement of a supramodal mechanism of spatial attention.
It is tempting to conclude based on these findings that spatial attention has superiority in a selection hierarchy: having to select by space and modality, one first selects the region of space, boosting the processing of all modalities in that location. Under this premise, the spillover of attention is unidirectional: spatial attention spreads across modalities. After all, there is a clear advantage to attending the selected location in all modalities: If something important occurs at a certain location, we would like to see it as well as hear it. However, the possibility that intermodal attention similarly spreads across locations is not unlikely either. Studies on visuospatial attention have demonstrated global effects of feature selection; that is, processing of the attended feature was affected by attention also at the unattended location (Störmer & Alvarez, 2014; Stoppel et al., 2012; Hopf, Boelmans, Schoenfeld, Luck, & Heinze, 2004; Saenz, Buracas, & Boynton, 2002). If feature-based attention spreads across locations, it is possible that intermodal attention spreads across locations too.
Furthermore, the apparent superiority of spatial attention could be related to the specific experimental designs that have been used: In most studies, spatial attention was required to shift much more frequently than intermodal attention. In some, attention to modality was manipulated between participants, so different groups attended only one modality each, while directing spatial attention to different locations in subsequent blocks (Nager et al., 2006; Eimer et al., 2004; Teder-Sälejärvi et al., 1999; Hillyard et al., 1984). Alternatively, the modality to be attended was manipulated within participants at the beginning of each block, whereas the direction of spatial attention changed on a trial-by-trial basis (Smith et al., 2010; Salmi, Rinne, Degerman, Salonen, & Alho, 2007; Eimer et al., 2002; Eimer & Schröger, 1998).
Transient shifts of attention are a very different experience from maintaining sustained attention for long durations. Indeed, it is widely accepted that transient events are more salient than stationary ones, consistent with the notion that changes may require a response more frequently than static situations. The idea that transient and sustained attention operate differently was suggested long ago (Eimer, 1996; Posner, Snyder, & Davidson, 1980). Eimer (1996, 1998) found that ERPs related to spatial attention are modulated differently by transiently or continuously orienting visual spatial attention. Other studies found behavioral differences depending on whether attention needed to be reallocated during the block or not. For instance, Spence and Driver (1996, Experiments 6 and 7) found that participants were able to split auditory and visual spatial attention, but only when spatial attention was sustained. Thus, if attention is to be directed simultaneously to modality and to location, and one (attention to modality) remains constant while the other (attention to location) is constantly changing, this difference in itself could prioritize spatial attention over intermodal attention.
The current study aimed at testing the putative hierarchy between spatial attention and intermodal attention while controlling for effects driven by the experimental design. Using a within-subject design, we manipulated spatial attention and intermodal attention and, critically, which kind of attention was sustained and which was transiently cued (hereafter, we refer to this variable as “attention assignment”). This attention assignment manipulation allowed us to compare effects of transient and sustained attention and directly assess whether spatial and/or intermodal attention effects depend on how they were manipulated. We tested EEG responses to visual and auditory stimuli on the left and on the right. The stimulus in each trial could be either attended or unattended on each of the two dimensions: intermodal attention and spatial attention. One dimension (either spatial or intermodal attention) was sustained throughout the block, and the other was transiently cued on every trial. Specifically, we tested:
Is the effect of spatial attention different when sustained and when transient, and specifically, does it still spread across modalities when it is sustained (and when intermodal attention is transiently cued)?
Is the effect of intermodal attention different when sustained and when transient, and can it spread across locations when manipulated in a transient manner (and when spatial attention is sustained)?
To allow comparison with the previous literature, our main focus was on effects of attention on post-stimulus ERP effects: the auditory N1, the visual N1, and the presence of the Nd1 (negative difference) over parietal regions in response to stimuli of both modalities (Eimer, 1994; Schröger, 1994). We tested later effects of attention in a more exploratory manner (see Methods below). Finally, because we reasoned that spatial and intermodal effects of attention might be prioritized differently at different stages of processing, we also measured alpha band suppression at an earlier stage, following the cue but before the appearance of the stimulus (Worden et al., 2000; Foxe et al., 1998).
Twenty-four healthy volunteers participated in the experiment for course credit or payment. The data of three participants were excluded from analysis because of excessive EEG artifacts. Altogether, data of 21 participants were included in the analysis (13 women, 8 men, aged 19–30 years old, M = 23.7, SD = 3.11, 20 right-handed). All participants had normal hearing by self-report, normal or corrected-to-normal vision, and no history of neurological disorders. The experiment was approved by the ethical committee of the faculty of social science at the Hebrew University of Jerusalem, and informed consent was obtained after the experimental procedures were explained.
Stimuli and Apparatus
Participants were seated in a dimly lit, sound-attenuated, and echo-reduced chamber (C-26; Eckel, Cambridge, MA). A 17-in. CRT monitor (100-Hz refresh rate) was placed in front of them, at a viewing distance of 90 cm. Two loudspeakers (model 821615 midrange 122M; Peerless, Cambridge, MA) and two ensembles of yellow LEDs were mounted on a semicircular bar with a radius of 90 cm, at 15° to the right and to the left of participants' midsagittal plane. Participants placed their heads in a chinrest to minimize movement. Binocular eye tracking (Eyelink 1000/2K; SR Research Ltd., Mississauga, Ontario, Canada) was used to monitor eye movements and to confirm that participants maintained fixation during the trials. The experiment was run using the E-Prime 2.1 software (Psychology Software Tools, Pittsburgh, PA). Throughout the experiment, a white fixation cross appeared above the center of the screen on a black background, approximately at the same height as the LED ensembles (Figure 1). Cue stimuli were pairs of symbols presented simultaneously on the monitor, one above and one below the fixation cross. In blocks in which location was transiently cued (and thus the attended modality was sustained), the symbols were a pair of one blue arrow and one red arrow, one of which was pointing to the left and the other to the right. In blocks in which modality was transiently cued (and thus the attended location was sustained), the pair of symbols consisted of one blue icon and one red icon of an eye and of an ear. The red icon/arrow signified which location or modality was to be attended in a given trial (Figure 1). This cue display was designed to equate, as much as possible, the cueing mode of modality and location and to prevent any exogenous spatial attention effects (which could arise with a single arrow, which is intrinsically asymmetric).
Auditory nontarget (standard) stimuli were continuous bursts of white noise (200-msec duration, including 5-msec rise and fall times), presented from one of the two loudspeakers. Auditory deviants were identical to the standards, except a short silence gap was introduced at the middle of the noise burst. Visual nontarget (standard) stimuli were 200-msec illuminations of one of the two LED ensembles. Visual deviants were identical to the standard stimuli, except for a short “gap” that was introduced at the middle of the illumination period. The gap durations were adjusted individually in each modality separately, in an attempt to reach 85% discrimination between standard and deviant stimuli (see Procedure).
Participants underwent two sessions of the main experiment on two separate days. The sustained modality session included separate attend auditory and attend visual runs, in which one modality had to be attended and the other ignored throughout the run. In these runs, the to-be-attended location was cued on each trial. The sustained modality session was designed to replicate previous findings, where spatial attention was cued on a trial-by-trial basis and the relevant modality remained constant throughout the block. The novel sustained location session was identical to the sustained modality session in terms of the stimuli and task (with the exception of the symbolic cues). However, this session consisted of separate attend right and attend left runs, in which the attended location was sustained throughout the run, whereas the to-be-attended modality was cued on a trial-by-trial basis according to the central cues.
Each experimental session consisted of 16 blocks: In the sustained modality session, there were eight consecutive sustained auditory blocks (together constituting a “run”) and eight consecutive sustained visual blocks. In the sustained location session, there were eight consecutive sustained left blocks and eight consecutive sustained right blocks. The order of sessions and runs was counterbalanced between participants. The modality or location to be continuously attended throughout the block was designated at the beginning of each block. Each block lasted 3 min or slightly longer (depending on participants' deviation from fixation, which could delay the beginning of a trial, see below), and short subject-paced breaks were provided between blocks. In each block, 50% of the stimuli were visual and 50% were auditory.
Participants' task was to respond by key press to rare targets, as quickly and as accurately as possible, and to withhold responses to all other stimuli, while maintaining gaze at fixation. Targets were the combination of three conditions: They were deviant stimuli (i.e., included a gap) of the to-be-attended modality presented at the to-be-attended location. The deviant trials were randomly drawn from a large sample where all four combinations of stimulus modality and side of presentation were equiprobable, and thus, each stimulus had a .25 probability of being presented at the attended location as well as the attended modality on a given trial. Of the 80 trials in each block, standard stimuli were presented on 64 trials. All standard stimuli had equal probabilities of appearing on the left and on the right. On the remaining 16 trials, deviant stimuli were presented. Thus, on average, four targets appeared per block.
A trial began with a central fixation presented for 100 msec, after which a 100-msec visual cue was presented, specifying the modality (in “sustained location” blocks) or location (in “sustained modality” blocks) to be attended in this trial. Importantly, the cue was independent of stimulus location and modality (i.e., the cue was instructional in that it determined what would be considered a target on a given trial, but it was not informative in providing information regarding the probability of the upcoming stimulus location/modality). The cue was followed after an interval of 600 msec by a visual or auditory peripheral stimulus (200-msec duration). Participants were required to respond to targets within a 1000-msec window following stimulus onset. Gaze direction was measured binocularly throughout the trial, and if it deviated by more than 1.5° in any direction from the fixation point during the 100 msec before stimulus appearance, the trial was aborted and a red exclamation mark appeared, accompanied by an abrasive tone. In addition, the presentation of the cue at the beginning of the trial was delayed if the distance between gaze and the center of the fixation cross exceeded 1.5°, until the participant refixated.
Before the main experiment, on a separate day, participants underwent a staircase procedure based on the transformed up–down method (Levitt, 1971). This session was meant to find individual gap durations leading to ∼85% discrimination between standard and deviant stimuli, for both visual and auditory stimuli. Four staircases were run: visual left, visual right, auditory left and auditory right. For each modality, we used the average gap of left and right staircase results to create individual visual and auditory deviants for the main experiment.
Each staircase session was composed of two interleaved staircases, one going up and the other going down, using a two-alternative forced-choice task in which participants were asked to report on each trial which of two stimuli had a gap. The staircase converged when at least 10 reversals had occurred in each of the two staircases. Participants' individual thresholds were computed as the average of the last four reversals in each of the staircases.
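The threshold rule described above (averaging the last four reversals of a staircase track) can be illustrated with a minimal sketch. The function names and the example track are hypothetical, and the definition of a reversal used here (the level at which the track changes direction) is an assumption rather than a description of the software used in the study.

```python
def reversal_levels(levels):
    """Return the stimulus levels at which a staircase track changed direction.

    `levels` is the sequence of gap durations presented on successive trials
    of one staircase (hypothetical input format).
    """
    reversals = []
    last_direction = 0  # +1 = track going up, -1 = going down, 0 = unknown
    for prev, cur in zip(levels, levels[1:]):
        if cur == prev:
            continue  # no change in level, so no direction information
        direction = 1 if cur > prev else -1
        if last_direction and direction != last_direction:
            reversals.append(prev)  # the turning point is the previous level
        last_direction = direction
    return reversals


def staircase_threshold(levels, n_last=4):
    """Average the last `n_last` reversal levels of one staircase track."""
    reversals = reversal_levels(levels)
    if len(reversals) < n_last:
        raise ValueError("not enough reversals to estimate a threshold")
    return sum(reversals[-n_last:]) / n_last


# Hypothetical gap durations (msec) from one track
track = [50, 40, 30, 40, 30, 20, 30, 40, 30, 20, 30]
print(reversal_levels(track))      # turning points of the track
print(staircase_threshold(track))  # mean of the last four turning points
```

In this sketch each of the two interleaved staircases would be passed through `staircase_threshold` separately; how the two estimates are then combined is not specified here.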
EEG Acquisition and Processing
EEG was recorded using an Active 2 system (BioSemi, Amsterdam, The Netherlands) from 64 preamplified Ag/AgCl electrodes mounted on an elastic cap according to the extended 10–20 system, with the addition of two electrodes above the mastoid processes and a nose electrode. Eye movements were recorded using two electrodes at the outer canthi of the right and left eyes, two electrodes below the center of both eyes and one above the center of the right eye. The EEG was continuously sampled at 512 Hz with a low-pass anti-aliasing filter set at one fifth of the sampling rate and stored for offline analysis. EEG was analyzed offline using Vision Analyzer 2.0 (Brain Products, Munich, Germany) and MATLAB (2013b, The MathWorks, Inc., Natick, MA). The continuous data were digitally filtered (zero-phase 24 dB/octave Butterworth filter) with a high-pass cutoff of 0.5 Hz. Blinks were removed using the independent component analysis method (Jung et al., 2000). Segments contaminated by other artifacts were discarded (rejection criteria: absolute difference between samples > 100 μV within segments of 100 msec; absolute amplitude > 100 μV; absolute amplitude < 0.5 μV within segments of 100 msec). Only standard trials, with no false alarms and which were not aborted because of eye movements, were analyzed. Additionally, trials with visual stimuli were only analyzed if there were no blinks within 100 msec before or after stimulus presentation. After rejection of contaminated trials, an average of 53 trials was analyzed per participant per condition (range = 44–57 trials).
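The three rejection criteria listed above can be sketched for a single channel as follows. This is an illustration only: it assumes that the "absolute difference between samples" criterion and the low-activity criterion both refer to the max-minus-min range within a sliding 100-msec window, and the function and parameter names are hypothetical rather than taken from the analysis software.

```python
def is_contaminated(samples_uv, srate_hz=512, window_ms=100,
                    max_range_uv=100.0, max_abs_uv=100.0, min_range_uv=0.5):
    """Flag one channel segment (in microvolts) using three criteria.

    Assumed interpretation of the criteria:
      1. any absolute amplitude above `max_abs_uv`
      2. max-min range above `max_range_uv` within any window of `window_ms`
      3. max-min range below `min_range_uv` within any window (flat signal)
    """
    if max(abs(v) for v in samples_uv) > max_abs_uv:
        return True
    win = max(2, round(window_ms * srate_hz / 1000))
    last_start = max(1, len(samples_uv) - win + 1)
    for start in range(last_start):
        chunk = samples_uv[start:start + win]
        span = max(chunk) - min(chunk)
        if span > max_range_uv or span < min_range_uv:
            return True
    return False
```

A segment would then be kept only if no channel is flagged; whether rejection was applied per channel or across all channels is not stated above.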
For ERP analysis, data were referenced offline to the averaged mastoids and parsed into 900-msec segments starting 100 msec before stimulus onset. A low-pass zero-phase 24 dB/octave Butterworth filter of 30 Hz was applied to the segmented data. The voltage was measured relative to the mean of the 100 msec prestimulus period (baseline correction).
For the frequency domain analysis, a custom designed notch filter of 50 Hz (24 dB/octave) was used to remove line noise (Keren, Yuval-Greenberg, & Deouell, 2010). Data were rereferenced to an average channel reference. Next, the data were parsed into 400-msec segments starting 200 msec after the cue (to avoid the visual evoked responses to the visual cue) and ending at the time of stimulus appearance.
To reduce the dimensionality of the data, for poststimulus analyses left and right trials were collapsed based on stimulus location, after transforming left and right electrodes to ipsilateral and contralateral electrodes relative to the location of the stimulus.
Effects of attention on EEG responses were analyzed separately for sensory evoked potentials and for later effects. For visual stimuli, we initially focused on effects of attention on the visual N1, which we extracted from the average of a pair of occipitoparietal channels (PO3/PO4, PO7/PO8), separately for channels ipsilateral and contralateral to the stimulus. We defined peak latencies individually for each participant, as the time of the negative peak between 100 and 220 msec post-stimulus across all conditions and calculated the average voltage of 30 msec around this peak latency, as the dependent variable for each condition. For auditory stimuli, we initially focused on effects of attention on the auditory N1 response. We defined peak latencies individually for each participant as the time of the negative peak between 40 and 160 msec post-stimulus across all conditions in the average of midline central channels FCz, Cz, and CPz and computed the average voltage of 40 msec around the peak as the dependent variable in each condition. For both auditory and visual stimuli, we also assessed the Nd1 effect of attention over parietal channels, which we measured as the difference wave between response amplitudes to spatially attended and unattended stimuli, averaged across Pz, P1, and P2 between 140 and 180 msec post-stimulus (Eimer, 1994, 1996).
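The two-step N1 measure described above (a peak latency located in the across-condition average, followed by a mean amplitude in a fixed window around that latency for each condition) might be sketched as below. The function names and the synthetic waveform are hypothetical, and "negative peak" is taken here to mean the most negative sample in the search window, which is an assumption.

```python
import math


def peak_latency_ms(times_ms, grand_avg_uv, lo_ms, hi_ms):
    """Latency of the most negative sample between lo_ms and hi_ms."""
    candidates = [(v, t) for t, v in zip(times_ms, grand_avg_uv)
                  if lo_ms <= t <= hi_ms]
    return min(candidates)[1]


def mean_amplitude(times_ms, erp_uv, center_ms, width_ms):
    """Mean voltage in a window of `width_ms` centered on `center_ms`."""
    half = width_ms / 2
    values = [v for t, v in zip(times_ms, erp_uv) if abs(t - center_ms) <= half]
    return sum(values) / len(values)


# Synthetic example: a negative deflection centered at 160 msec
times = list(range(-100, 400, 2))
grand_avg = [-5 * math.exp(-((t - 160) / 20) ** 2) for t in times]

latency = peak_latency_ms(times, grand_avg, 100, 220)  # visual N1 search window
amplitude = mean_amplitude(times, grand_avg, latency, 30)
print(latency, amplitude)
```

For the auditory N1, the same functions would be called with a 40 to 160 msec search window and a 40-msec averaging window.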
Beyond amplification of visual and auditory evoked responses, attention effects have been shown at later stages of processing, usually in the form of negative shifts for attended versus unattended stimuli (“selection negativity” or “negative difference” [Nd]; Hansen & Hillyard, 1980; Näätänen, Gaillard, & Mäntysalo, 1978). To evaluate these effects within a reasonable number of comparisons, we divided the channels into nine clusters (see Figure 2) and averaged the potentials across the electrodes in each cluster in six non-overlapping time bins of 100 msec between 200 and 800 msec post-stimulus. The division was done a priori, similar to previous studies from our lab (e.g., Mudrik, Shalgi, Lamy, & Deouell, 2014; Mudrik, Lamy, & Deouell, 2010) and others (e.g., Renoult et al., 2014; see Dien & Santuzzi, 2004). We ran separate ANOVAs on each of the clusters and time bins (54 ANOVAs altogether) and determined the family-wise critical p value (p < .05) using false discovery rate (FDR) correction for multiple comparisons, considering the dependence between comparisons (Benjamini & Yekutieli, 2001).
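As an illustration of the FDR procedure for dependent tests (Benjamini & Yekutieli, 2001), the following sketch computes which p values survive correction and the resulting critical p value. The function name and the example p values are hypothetical; this is a generic implementation of the step-up rule, not the code used for the reported analyses.

```python
def benjamini_yekutieli(p_values, q=0.05):
    """Benjamini-Yekutieli step-up procedure for dependent tests.

    Returns (rejected, critical_p): a list of booleans in the input order
    and the largest threshold k*q/(m*c(m)) met by the sorted p values
    (0.0 if nothing is rejected).
    """
    m = len(p_values)
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic correction term
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / (m * c_m):
            k_max = rank  # keep the largest rank that meets its threshold
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    critical_p = k_max * q / (m * c_m) if k_max else 0.0
    return rejected, critical_p


# Hypothetical p values from four tests
rejected, critical_p = benjamini_yekutieli([0.001, 0.2, 0.03, 0.0005])
print(rejected, critical_p)
```

With 54 cluster-by-time-bin tests, the same function would be applied to the 54 p values obtained for a given effect.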
For the auditory and visual evoked potentials, the Nd1, and the analysis of later time bins, we applied the same three-way ANOVA with the factors Spatial attention (attended/unattended), Intermodal attention (attended/unattended), and Attention assignment (sustained modality/sustained location). Analyses were conducted separately for visual stimuli and for auditory stimuli.
For the predefined components, following the traditional ANOVA analysis, we adopted a Bayesian approach of computing Bayes factors (BFs), reflecting the strength of the evidence for the presence or absence of an effect (Dienes, 2014; Rouder, Morey, Speckman, & Province, 2012; Kass & Raftery, 1995). Unlike the traditional frequentist hypothesis testing approach, this method allows us to evaluate the evidence supporting a null hypothesis by measuring the likelihood ratio of the null hypothesis versus the alternative hypothesis. This ratio (the BF) provides an approximation of the strength of the evidence in favor of one model over another (Kass & Raftery, 1995): Values below 3 are considered inconclusive, values between 3 and 20 are interpreted as positive evidence for the competing model, and values above 20 are interpreted as strong evidence for one model over the other. We used the BayesFactor package in R (R Core Team 2014, Vienna, Austria; BayesFactor Version 0.9.5; for further details regarding the implementation, see Rouder et al., 2012).
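The interpretation scheme above can be written as a small lookup, shown here as a sketch. The function name is hypothetical, and two details are assumptions: values of exactly 3 and 20 are counted as "positive", and a BF below 1 is inverted so that the same scale expresses evidence for the other model.

```python
def interpret_bf(bf):
    """Label a Bayes factor using the thresholds described in the text.

    `bf` is the ratio of evidence for the numerator model over the
    denominator model. Boundary handling and the inversion of values
    below 1 are assumptions made for this illustration.
    """
    if bf <= 0:
        raise ValueError("a Bayes factor must be positive")
    favored = "numerator model" if bf >= 1 else "denominator model"
    strength = bf if bf >= 1 else 1.0 / bf
    if strength < 3:
        label = "inconclusive"
    elif strength <= 20:
        label = "positive"
    else:
        label = "strong"
    return label, favored


print(interpret_bf(3.97))   # a value in the 3 to 20 range
print(interpret_bf(1.5))    # a value below 3
```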
Frequency Domain Analysis
To examine preparatory effects of attention (before stimulus appearance), we measured the amplitude of the alpha band signal in each condition in occipitoparietal channel clusters ipsilateral and contralateral to the direction of spatial attention (P3, P4, P5, P6, PO3, PO4, PO7, PO8). For each participant and condition, we segmented the data into epochs of 400 msec preceding stimulus onset (excluding the first 200 msec following the cue to avoid effects originating from visual evoked potential (VEP) responses to the visual cues). For each trial, we computed the absolute value (amplitude) of the fast Fourier transform between 8 and 12 Hz and averaged trials separately for each combination of spatial attention, intermodal attention, and attention assignment. Because the time of interest was before stimulus appearance and the modality and location of the upcoming stimulus were unpredictable, we could collapse across the modality and location of the stimulus itself and measure effects of attention based only on the spatial attention and intermodal attention allocation. Because alpha band activity may be asymmetrical (Thut, Nietzel, Brandt, & Pascual-Leone, 2006), we did not collapse across left- and right-directed spatial attention. Rather, we ran a four-way ANOVA with the factors Hemisphere (left/right), Spatial attention (left/right), Intermodal attention (attend vision vs. audition), and Attention assignment (modality sustained vs. location sustained).
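A single-trial alpha amplitude of the kind described above could be computed as in the following sketch. To keep the example dependency-free, it evaluates the discrete Fourier transform directly at the bins of interest instead of calling an FFT routine, which gives the same values for those bins. The function name, the 2/N amplitude scaling, and the averaging across bins that fall between 8 and 12 Hz are assumptions made for illustration.

```python
import cmath
import math


def band_amplitude(samples, srate_hz, lo_hz=8.0, hi_hz=12.0):
    """Mean spectral amplitude of one epoch across DFT bins in [lo_hz, hi_hz]."""
    n = len(samples)
    amplitudes = []
    for k in range(n // 2 + 1):
        freq = k * srate_hz / n
        if lo_hz <= freq <= hi_hz:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(samples))
            amplitudes.append(abs(coeff) * 2 / n)  # scale to sine amplitude
    if not amplitudes:
        raise ValueError("no DFT bin falls inside the requested band")
    return sum(amplitudes) / len(amplitudes)


# Synthetic epoch: 256 samples at 512 Hz containing a 10-Hz sine of amplitude 3
epoch = [3 * math.sin(2 * math.pi * 10 * t / 512) for t in range(256)]
print(band_amplitude(epoch, 512))
```

The synthetic epoch uses 256 samples so that bins fall exactly on 8, 10, and 12 Hz. A 400-msec epoch has a frequency resolution of 2.5 Hz, so fewer bins fall inside the band. Per-condition values would be obtained by averaging this measure across trials.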
For each participant, we computed the discriminability of the targets from distracters using d′ (the difference between the z-transformed hit rate and the z-transformed false alarm rate) separately for responses to visual and auditory targets in each session. Hit rates and false alarm rates were computed for each condition (mean hit rate for visual stimuli = 0.81, for auditory stimuli = 0.83; mean false alarm rate for visual stimuli = 0.04, for auditory stimuli = 0.01) and were used to compute the d′ sensitivity index. The values of d′ were high in all conditions, showing good performance in the task (Figure 3). A two-way ANOVA with the factors Stimulus modality and Condition revealed that d′ was higher for auditory than for visual stimuli (mean auditory d′ = 3.73, mean visual d′ = 3.03, F(1, 20) = 21.3, p < .0001, ηp2 = 0.503). However, performance within each modality was almost identical for the traditional sustained modality and new sustained location conditions (F(1, 20) < 1; ηp2 = 0.004; Figure 3), suggesting that task difficulty was similar in the two conditions.
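For reference, the d′ index can be computed from a hit rate and a false alarm rate as in the minimal sketch below, which uses the inverse of the standard normal cumulative distribution. The function name is hypothetical, and no correction for rates of exactly 0 or 1 is included, since the text does not say how such cases were handled.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index: z(hit rate) minus z(false alarm rate).

    Both rates must lie strictly between 0 and 1; otherwise the inverse
    normal transform is undefined and a StatisticsError is raised.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


print(d_prime(0.81, 0.04))
```

Because d′ is a nonlinear function of the two rates, applying it to group-mean rates, as in this example call, does not reproduce the mean of the individually computed d′ values.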
Responses to Visual Stimuli
Early effects of attention on VEPs
First, we analyzed peak amplitudes of the visual N1 component in response to lateralized visual stimuli over occipitoparietal electrodes (Figure 4) using a four-way ANOVA with the factors Channel laterality (ipsilateral, contralateral), Attention assignment (sustained location, sustained modality), Intermodal attention (auditory, visual), and Spatial attention (left, right). The results showed more negative N1 peaks contralateral to stimulus location than ipsilateral to stimulus location (F(1, 20) = 14.87, p = .001, ηp2 = 0.426), as well as more negative N1 amplitudes when intermodal attention was directed to the visual modality than when it was directed to the auditory modality (F(1, 20) = 4.58, p = .045, ηp2 = 0.186), and more negative N1 peaks when spatial attention was directed to the location of the stimulus than when it was directed to the other side (F(1, 20) = 89.64, p < .001, ηp2 = 0.818). Furthermore, an interaction between spatial attention and laterality revealed that the effect of spatial attention was stronger contralateral to stimulus location than ipsilateral to stimulus location (F(1, 20) = 10.44, p = .004, ηp2 = 0.343).
Further investigation of this interaction revealed significant effects of spatial attention in both ipsilateral channels (F(1, 20) = 32.7, p < .001, ηp2 = 0.621) and contralateral channels (F(1, 20) = 99.7, p < .001, ηp2 = 0.833), with more negative N1 peak amplitudes to stimuli that appeared at the attended location than to stimuli that appeared at the unattended location. The effect of intermodal attention was significant in ipsilateral channels (F(1, 20) = 5.18, p = .034, ηp2 = 0.206), showing more negative responses to stimuli when attention was directed to visual stimuli than when it was directed to auditory stimuli. In contralateral channels, the effect was in the same direction, but only approached significance (F(1, 20) = 3.32, p = .083, ηp2 = 0.142). In addition, in ipsilateral channels, a main effect of Attention assignment showed more negative values in general when spatial attention was sustained than when it was transiently cued (F(1, 20) = 5.52, p = .029, ηp2 = 0.216).
In line with previous findings showing that spatial attention spreads across modalities, there was no significant interaction between spatial attention and intermodal attention in either hemisphere (compare green and purple traces in Figure 4). Because traditional hypothesis testing does not allow us to accept the null hypothesis of no interaction, we used a complementary Bayesian approach to further examine whether the null interaction supports a spread of spatial attention across modalities (i.e., a “true” null result) or simply the lack of power necessary to reach significance (see Methods). For this purpose, we calculated the BF comparing a model including only the spatial attention and intermodal attention main effects to a model including also the Spatial Attention × Intermodal Attention interaction. The BF was 3.97 and 3.34 in contralateral and ipsilateral channels, respectively, providing positive support for the lack of an interaction.
Whereas there were significant main effects of Spatial attention, Intermodal attention, and Attention assignment in the ipsilateral channels, no interactions were found in the ANOVA between Attention assignment and Spatial and/or Intermodal attention. The BF in comparing a model including only the three main effects against a model including also a Spatial Attention × Attention Assignment interaction was 4.89 in the ipsilateral channels, providing positive evidence in favor of the model without the interaction. Comparing the main effects model to a model including also an Intermodal Attention × Attention Assignment interaction yielded a BF of 4.2, providing positive support in favor of the main effects model.
Taken together, these results show that the N1 was affected by spatial attention similarly when vision was the attended or the unattended modality, indicating that at the time of the N1 the two forms of attention operated independently. Furthermore, this effect occurred whether spatial attention was transiently cued or sustained, suggesting that the N1 effects of spatial and intermodal attention may not depend on whether attention was sustained or transiently cued.
Early effects of attention in parietal channels
Looking at the Nd1 effect in parietal channels, we found a main effect of spatial attention (F(1, 20) = 126.3, p < .001, ηp2 = 0.863), such that there was a negative difference between responses to stimuli that were spatially attended and stimuli that were spatially unattended (Figure 5A). There were no other main effects or interactions. The complementary Bayesian analysis yielded a BF value of 12.51 in favor of the model including only the spatial attention effect compared with the model including spatial attention, attention assignment, and a Spatial Attention × Attention Assignment interaction, providing positive support for the former model over the latter. Similarly, comparing the model including spatial attention only and the model including spatial attention, intermodal attention, and a Spatial Attention × Intermodal Attention interaction, the BF was 14.51, providing positive support for the model including only the effect of spatial attention over the model including an interaction between spatial and intermodal attention.
Effects of attention on responses to visual stimuli in later time windows
At later time windows, three-way ANOVAs on EEG response amplitudes were conducted for each of nine electrode clusters (Figure 2) at six consecutive time bins (FDR-corrected for multiple comparisons).
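With nine clusters and six time bins, each effect is tested 54 times, which is why a correction for multiple comparisons is needed. The specific FDR procedure is not named in this section; the sketch below assumes the standard Benjamini-Hochberg step-up procedure and uses made-up p values purely for illustration.

```python
import numpy as np


def benjamini_hochberg(p_values, q=0.05):
    """Boolean mask of tests rejected at FDR level q (Benjamini-Hochberg step-up)."""
    p = np.asarray(p_values, dtype=float).ravel()
    m = p.size
    order = np.argsort(p)                       # indices that sort p ascending
    ranked = p[order]
    thresholds = q * np.arange(1, m + 1) / m    # rank-dependent critical values
    below = ranked <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()          # largest rank meeting its threshold
        rejected[order[: k + 1]] = True         # reject that test and all smaller p
    return rejected.reshape(np.shape(p_values))


# Illustrative p values only (not taken from the study).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
mask = benjamini_hochberg(example_p, q=0.05)
print(mask)
```

For the design described here, the same function could be applied to a 9 × 6 array of p values (clusters by time bins) for each effect, since the output preserves the shape of the input.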
We found a main effect of Spatial attention in central and occipitoparietal channels between 400 and 800 msec post-stimulus (F ≥ 22.01, p < .05, FDR-corrected), such that response amplitude was more positive in the attended than in the unattended condition. This effect did not interact with intermodal attention at any time bin or cluster, suggesting that attending a location in space may not depend on whether the modality is attended or not at late time windows either.
In contrast with the early effects, spatial attention interacted with attention assignment between 400 and 800 msec in central and parieto-occipital channels (F values ≥ 19.65, p < .05, FDR-corrected). Further analysis of the simple effects (Figure 6) revealed significant effects of sustained spatial attention between 400 and 800 msec (F values ≥ 17.23, p < .05, FDR-corrected) in midline and ipsilateral central and parieto-occipital clusters, whereas the effect of transient spatial attention was significant only in the ipsilateral parieto-occipital cluster (F values ≥ 22.74, p < .05, FDR-corrected). Overall these results show a posterior ipsilateral positive effect of spatial attention on late processing of visual stimuli, more widespread when spatial attention was sustained than when it was transiently cued.
Although there was no main effect of Intermodal attention on late processing, we found an interaction between Intermodal Attention and attention assignment (F values ≥ 20.72, p < .05, FDR-corrected) between 500 and 700 msec in parieto-occipital clusters: Responses to visual stimuli were more positive when sustained intermodal attention was directed to vision than when directed to audition, and this effect was significantly diminished with transient intermodal attention (compare Figure 7A with Figure 7B). Further analysis of this interaction (simple effects) showed no other significant effects. However, the interaction implies that, similar to spatial attention, intermodal attention may have a stronger effect on late processing of visual stimuli when attention was sustained than when it was transient. We found the same pattern of results at both the attended and the unattended location (Figure 7C, 7D), with no two-way interaction between spatial attention and intermodal attention and no three-way interaction. This suggests that intermodal attention may spread to the unattended location. However, after separating the data into attended-location and unattended-location trials, the simple effects of sustained intermodal attention were not significant.
Responses to Auditory Stimuli
Early effects of attention on auditory evoked potentials
Auditory stimuli elicited clear N1 and P2 responses. However, there were no significant effects of spatial or intermodal attention on these components.
In parietal channels, we found an Nd1 effect similar to the one found in response to visual stimuli (Figure 5B). During the time window between 140 and 180 msec, responses were more negative when sounds were spatially attended than when they were spatially unattended (F(1, 20) = 27.07, p < .001, ηp2 = 0.575). No other main effects or interactions were found. The complementary Bayesian model comparison of a model including only the spatial attention main effect with a model including spatial attention, attention assignment, and the Spatial Attention × Attention Assignment interaction yielded a BF of 2.86. Comparing a model including only the main effect of Spatial attention with a model including Spatial attention, Intermodal attention, and a Spatial Attention × Intermodal Attention interaction yielded a BF of 9.43, providing positive evidence in favor of the hypothesis that there was no interaction between spatial and intermodal attention.
Effects of attention on responses to auditory stimuli in later time windows
Spatial attention affected late processing of sounds between 300 and 600 msec (F values ≥ 26.49, p < .05, FDR-corrected), where responses to sounds were more negative when spatial attention was directed to the location of the sound than when it was directed elsewhere (Figure 8). Similar to the VEP results, we found no interaction between spatial attention and intermodal attention, suggesting that the effect of spatial attention on responses to sounds may not depend on the attended modality at these time windows. No other main effects or interactions were significant.
Preparatory Alpha Desynchronization Effects
In addition to the ERP analysis, we also examined preparatory alpha (8–12 Hz) desynchronization effects in the prestimulus interval over the occipitoparietal scalp (Figure 9).
Because this analysis focused on the 400 msec before the stimulus appeared, we collapsed the data across visual and auditory trials and across stimulus location. We ran a four-way ANOVA including the factors Hemisphere (left, right), Direction of spatial attention (left, right), Intermodal attention (attend auditory, attend visual), and Attention assignment (sustained modality, sustained location). As expected, spatial attention affected the prestimulus alpha amplitude: There was a significant disordinal interaction of Hemisphere × Spatial Attention (F(1, 20) = 16.953, p = .001, ηp2 = 0.459), showing opposite effects of direction of attention in left and right hemispheres. Separate analyses of effects in the left and right hemisphere revealed that for the right electrodes, directing attention to the left (contralaterally) showed reduced amplitude of alpha activity relative to directing attention to the right (ipsilaterally; F(1, 20) = 6.875, p = .016, ηp2 = 0.256). For the left hemisphere electrodes, the amplitude of alpha activity was also reduced when attention was directed contralaterally, but this effect was not significant (F < 1). In addition, there was a marginal (p = .053) three-way interaction of Hemisphere × Spatial Attention Direction × Attention Assignment. This trend was due to a stronger disordinal two-way interaction between hemisphere and direction of spatial attention when spatial attention was assigned transiently than when it was sustained. Finally, although alpha power was measured over parieto-occipital scalp, presumably most sensitive to visual processing, there was no three-way interaction of Spatial Attention Direction × Hemisphere × Intermodal Attention (F < 1, ηp2 = 0.032), suggesting similar effects of spatial attention in occipitoparietal channels whether participants attended the visual modality or the auditory modality.
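The disordinal Hemisphere × Spatial Attention interaction can be read as a difference of differences: the attend-left minus attend-right effect in the left hemisphere, minus the same effect in the right hemisphere. When all factors are within-subject and have two levels, the interaction F(1, n − 1) equals the squared one-sample t statistic on these per-participant contrast scores. The sketch below illustrates this on simulated amplitudes only; the sample size of 21 is inferred from the reported error df of 20, and the array layout, effect size, and function names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects = 21  # assumption: inferred from the reported error df of 20

# Simulated alpha amplitudes in arbitrary units, NOT the study's data.
# Axes: subject x hemisphere (0 = left, 1 = right) x attended side (0 = left, 1 = right).
alpha = rng.normal(loc=10.0, scale=0.5, size=(n_subjects, 2, 2))
alpha[:, 1, 0] -= 0.8  # right hemisphere, attend left: contralateral reduction
alpha[:, 0, 1] -= 0.8  # left hemisphere, attend right: contralateral reduction


def interaction_contrast(x):
    """Per-subject difference of differences (Hemisphere x Attended side)."""
    left_hemi = x[:, 0, 0] - x[:, 0, 1]    # attend left minus attend right
    right_hemi = x[:, 1, 0] - x[:, 1, 1]   # attend left minus attend right
    return left_hemi - right_hemi


def one_sample_t(scores):
    """One-sample t statistic against zero."""
    scores = np.asarray(scores, dtype=float)
    return scores.mean() / (scores.std(ddof=1) / np.sqrt(scores.size))


contrast = interaction_contrast(alpha)
t_value = one_sample_t(contrast)
f_interaction = t_value ** 2  # equals the interaction F(1, n_subjects - 1)
print(f"t({n_subjects - 1}) = {t_value:.2f}, F(1, {n_subjects - 1}) = {f_interaction:.2f}")
```

A positive mean contrast in this layout corresponds to the pattern described above, with alpha amplitude reduced over the hemisphere contralateral to the attended side.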
The complementary Bayesian analysis yielded a BF value of 25.8 in favor of the model including only the main effect of spatial attention compared with the model including spatial attention, intermodal attention, and the interaction between spatial attention and intermodal attention. This supports the conclusion that allocation of spatial attention in preparation for the stimulus, as reflected in alpha band activity, is indeed supramodal.
Attention is required when multiple stimuli are encountered, and processing all of them similarly would be suboptimal. In these situations, attention entails selection of some task-relevant stimuli over others. However, there are multiple dimensions across which a subset of stimuli may be selected (in the present case, their location in space and their modality) and in some cases the selection requires a conjunction of conditions to be met (e.g., visual stimuli on the left). Selection of one dimension over others may have a price to it: When spatial attention to one modality affects processing in a different modality, it essentially counteracts intermodal attention. Similarly, if intermodal attention affects processing in unattended locations, it might diminish the effects of spatial attention. Given this trade-off, is there a fixed hierarchy between the selection dimensions? If not, does the hierarchy depend on which dimension demands sustained attention to a relevant condition and which requires frequent shifts of attention? In the current study, we aimed at testing the simultaneous effects of spatial and intermodal attention on EEG responses, whether there is a hierarchy between them and whether this hierarchy depends on attention assignment, that is, whether attention is sustained or transiently cued.
Starting with effects of spatial attention, we found that selecting a location affected stimulus processing regardless of whether the modality of the stimulus was the attended or the unattended modality, supporting the claim that spatial attention is based on a supramodal mechanism¹ (e.g., Nager et al., 2006; Green et al., 2005; Eimer et al., 2002, 2004; Teder-Sälejärvi et al., 1999). Expanding previous findings, here we show that the effect of spatial attention did not depend on whether the modality was relevant at any stage of processing: in anticipatory alpha band suppression in occipitoparietal regions between the cue and target, during the visual N1 response, at the time of the parietal Nd1, and during late EEG responses to both visual and auditory stimuli.²
The fact that spatial attention “spills over” to unattended modalities has been described before (Zhang et al., 2011; Nager et al., 2006; Eimer et al., 2002, 2004; Teder-Sälejärvi et al., 1999; Eimer & Schröger, 1998; Hillyard et al., 1984). However, the design of previous studies left open the possibility that the pervasiveness of spatial attention across modalities was due to the manipulation of spatial attention on a trial-by-trial basis while keeping intermodal attention sustained, thus making spatial attention more salient than intermodal attention. Our results do not support this concern. We directly examined the effect of transient versus sustained assignment of attention and found similar spread of spatial attention in both conditions. These results support previous interpretations of spatial attention being supramodal. Surprisingly, unlike previous studies that found the Nd1 effect in parietal channels mainly in transient attention conditions (e.g., Eimer, 1994; Schröger, 1994), in our study we found similar Nd1 effects for both sustained and transient spatial attention. This difference between our findings and those reported previously could be related to the complexity of the current design, in which one dimension of attention (either spatial or intermodal) had to be frequently updated, so even when spatial attention was sustained, participants still performed a task that required transient allocation of attention.
Nonetheless, our results revealed differences between effects of sustained and transient spatial attention at other stages of processing. Whereas early visual evoked responses were similarly affected by transient and sustained attention orienting, the effect of spatial attention depended on attention assignment in late (>400 msec) stages of processing. When spatial attention was sustained, late responses to visual stimuli at the attended location showed a widespread positivity that was not evident when lights were presented at the unattended location. Similarly, the effect of spatial attention on late responses to sounds was more widespread when spatial attention was sustained than when it was transient.
Intermodal attention generally showed weaker effects than spatial attention in our study, yet it was modulated by attention assignment at relatively late stages of processing, with stronger effects on visual responses in the 500–700 msec time window when intermodal attention was sustained than when it was transient. Importantly, the intermodal sustained attention effects were not restricted to the attended location. In other words, at least in the visual modality, effects of intermodal attention might spread to the unattended location. This poses a challenge for the claim that spatial attention inherently takes precedence over other forms of attention. The current results correspond to studies that questioned the superiority of spatial attention over feature-based attention, finding that effects of feature-based (e.g., color, motion direction) attention spread outside the focus of spatial attention in the visual domain (Störmer & Alvarez, 2014; Andersen, Hillyard, & Müller, 2013; Stoppel et al., 2012; Hopf et al., 2004; Saenz et al., 2002). In this sense, modality might be seen as an additional feature of the stimulus, and intermodal attention resembles feature-based attention and can operate across locations.
The late positive effects, found in the exploratory analysis, may be related to other protracted positive components that have been reported in a variety of paradigms. In addition to the well-known P3 complex, in response to rare events in oddball paradigms (Sutton, Braren, Zubin, & John, 1965), late positive components/potentials were also found in nonoddball situations. These include, for example, a central-posterior late positive component elicited by perceiving a reversal in a bistable image (Pitts, Martínez, Stalmaster, Nerger, & Hillyard, 2009; Pitts, Gavin, & Nerger, 2008), a frontocentral “P400” elicited by the detection of a mistuned auditory harmonic (Alain, Arnott, & Picton, 2001), and a late positive potential in response to images with emotional valence versus more neutral images (Weinberg & Hajcak, 2011; Schupp, Flaisch, Stockburger, & Junghofer, 2006). Another broad late positive potential was found in response to stimuli triggering episodic memory (the old/new effect; Wilding, Doyle, & Rugg, 1995), and recent studies by Renoult et al. (2014) found a similar occipitoparietal late positive effect enhanced by personal relevance (autobiographical significance) of memoranda. A late parietal positive potential was also related to the sound or view of one's own name (Tamura, Mizuba, & Iramina, 2016; Kutas, Hillyard, Volpe, & Gazzaniga, 1990). The distribution of these effects varies quite a bit across paradigms, from frontocentral to parieto-occipital, with various degrees of lateralization. This corresponds with the evidence for multiple neural generators across the brain (Soltani & Knight, 2000; Clarke, Halgren, & Chauvel, 1999).
A full review of these effects is beyond the scope of our article, but it seems that common to these disparate findings is the response to task-relevant or personally relevant/salient stimuli leading to attentional modulation resulting in extended processing of the stimulus, updating of working memory, or inhibition of irrelevant processes. This type of interpretation may be relevant to the current late attentional effect, but this remains for direct testing in future research.
One last concern relates to the asymmetry found in our study between effects of attention on auditory and visual evoked responses: Whereas visual responses were modulated by attention starting from the N1, auditory evoked responses (such as the auditory N1) were not modulated by attention in our study. Furthermore, visual responses were affected by both spatial and sustained intermodal attention, whereas in auditory responses we only found an effect of spatial attention. We suspect that this could be due to a difference between our auditory and visual task difficulties: Despite our attempt to equate task difficulties using the staircase procedure, accuracy in the auditory gap detection task was higher than in the visual gap detection task. Furthermore, in the auditory task, participants could possibly perform the task by first deciding whether a sound is a deviant and then deciding whether a response is necessary or not, depending on sound location, whereas in the visual task spatial attention was necessary to decide whether a visual stimulus was a deviant or not. This difference between the two tasks may have led to lower overall attention requirements in the auditory modality than in the visual modality, reducing our sensitivity to early attentional effects on auditory evoked potentials. Given this limitation of our design, our findings regarding effects of intermodal attention are currently restricted to the visual modality.
To conclude, we examined the interaction of three dimensions of attentional selection: spatial selection, modality selection, and the mode of allocating attention (transient or sustained). At no stage of processing did we find any interactions between spatial attention and intermodal attention, suggesting that attending a location was not restricted to the attended modality. Similarly, based on the effect of sustained intermodal attention on visual EEG responses, attending a modality does not appear to be restricted to the attended location, though further research is needed to establish this claim regarding intermodal attention more generally. In contrast with our initial hypothesis, the “spillover” of attention to the unattended modality/location is unaffected by attention assignment. However, although effects of spatial attention on visual evoked potentials were independent of attention assignment, we found that late effects of spatial and intermodal attention on stimulus processing (between 400 and 800 msec) depend on whether attention was sustained or transient. This should be considered whenever designing or interpreting results of experiments where spatial and/or intermodal attention is manipulated.
Reprint requests should be sent to Talia Shrem, The Department of Psychology, The Hebrew University of Jerusalem, Mount Scopus, Jerusalem, Israel, 91905, or via e-mail: email@example.com.
1. Note that this does not mean that spatial attention cannot be restricted to one modality if the task explicitly so requires (Santangelo, Fagioli, & Macaluso, 2010).
2. The lack of an effect of spatial attention on the auditory N1 in our study may be due to our choice of a longer ISI than the ISI chosen in some studies that found effects of spatial attention on the N1 (see Näätänen et al., 1978).