Abstract
In natural vision, processing of spatial and nonspatial features occurs simultaneously; however, the two types of attention in charge of facilitating this processing have distinct mechanisms. Here, we tested the independence of spatial and feature-based attention at different stages of visual processing by examining color-based attentional selection while spatial attention was focused or divided. Human observers attended to one or two of four fields of randomly moving dots presented in both left and right visual hemifields. In the focused attention condition, the target stimulus was defined both by color and location, whereas in the divided attention condition stimuli of the target color had to be attended in both hemifields. Sustained attentional selection was measured by means of steady-state visual evoked potentials elicited by each of the frequency-tagged flickering dot fields. Additionally, target and distractor selection was assessed with ERPs to these stimuli. We found that spatial and color-based attention independently modulated the amplitude of steady-state visual evoked potentials, confirming independent top–down influences on early visual areas. In contrast, P3 amplitudes elicited only by targets and distractors of the attended color were subject to space-based enhancement, suggesting increasing integration of spatial and feature-based selection over the course of perceptual processing.
INTRODUCTION
Human visual processing is limited; therefore, selective attention is necessary to focus processing on relevant stimuli to allow for adaptive behavior. This idea is captured by the biased competition model (Desimone & Duncan, 1995), according to which simultaneously presented stimuli activate overlapping neural populations, thus competing for cortical representation. Top–down attentional modulation can help resolve this competition in favor of attended sensory input, allocating the limited cortical resource to task-relevant stimuli (Reynolds & Heeger, 2009; Reynolds & Chelazzi, 2004; Luck, Chelazzi, Hillyard, & Desimone, 1997). However, it is unclear whether top–down attentional modulation is itself capacity limited. If this were the case, the strength of attentional modulation of visual processing should depend upon the demands placed on top–down selection. For example, selection of a single stimulus feature (e.g., color) should be more effective when selection is based solely on that feature as compared with when selection is also based on another feature (e.g., shape, size). A recent study tested this hypothesis for color and orientation selection, concluding that joint selection of a feature conjunction occurs in parallel, with each constituent feature enhanced independently (Andersen, Müller, & Hillyard, 2015). This result implies that there is no shared “resource” for attentional selection of different feature dimensions.
Some caution is warranted before generalizing this finding across all possible feature dimensions. It has been suggested that spatial location is itself a feature and that it participates in attentional selection equally to all the other features (Patzwahl & Treue, 2009; Martinez-Trujillo & Treue, 2004; Bundesen, 1990). For example, the feature similarity gain model (Cohen & Maunsell, 2011; Maunsell & Treue, 2006; Martinez-Trujillo & Treue, 2004; Treue & Martínez Trujillo, 1999; Connor, Preddie, Gallant, & Essen, 1997) proposes that the gain of a visual neuron depends on the similarity between the response selectivity of the neuron and the currently relevant feature(s) across all available feature dimensions, including location. However, other theories propose that location information has higher priority for selection (Tsal & Lavie, 1993; van der Heijden, 1993; Cave & Wolfe, 1990; Treisman, 1988). This view is supported by behavioral and electrophysiological evidence that feature-based attention operates later than spatial attention (Liu, Stevens, & Carrasco, 2007; Anllo-Vento & Hillyard, 1996; Eimer, 1995; Hillyard & Münte, 1984) and that feature-based enhancement is more pronounced at attended locations (Leonard, Balestreri, & Luck, 2015; Bengson, Lopez-Calderon, & Mangun, 2012; Hillyard & Anllo-Vento, 1998). Thus, it could be the case that, although concurrent selection of features of different nonspatial feature dimensions is mutually independent (Jenkins, Grubert, & Eimer, 2017; Andersen et al., 2015; Andersen, Hillyard, & Müller, 2008), selection of such features is not independent of spatial selection. If spatial information is indeed prioritized, then the magnitude of feature-based top–down modulation should depend on the availability of spatial cues and demands on concurrent spatial selection.
To test this hypothesis, we compared the effectiveness of color-based selection when spatial attention was focused on a single location or divided across two locations. Participants observed two pairs of overlapping fields of randomly moving dots of different colors and were asked to detect brief luminance decrements of the dots of the cued color on one side or on both sides. Frequency-tagged steady-state visual evoked potentials (SSVEPs) elicited by each of the four dot fields were recorded, as well as RTs and ERPs elicited by target and distractor events.
If voluntary selection of space and color operates under a common limit, then dividing spatial attention would make spatial selection more demanding, leaving fewer resources available to feature-based attention. This would result in reduced color-based attentional modulation of SSVEP amplitudes when spatial attention is cued to both sides of the visual field. Alternatively, independent color-based selection would be equally strong across both focused and divided spatial attention conditions, meaning that feature-based attention is immune to the costs of distributing spatial attention across the entire visual field.
METHODS
Participants
Sixteen participants (10 women, 15 right-handed, mean age = 22.4 years, SD = 1.7 years) with normal or corrected-to-normal vision were recruited for the experiment after giving informed consent. Data from all 16 participants were included in the analyses. Five additional participants were excluded because of poor task performance in the practice session (<60% correct responses) and did not complete the experimental session. The study was approved by the ethics committee of the School of Psychology at the University of Aberdeen.
Stimuli and Procedure
On each trial, participants were presented with two pairs of completely overlapping red and blue fields of randomly moving dots, one on the left and one on the right of the fixation cross (Figure 1). A cue at the beginning of the trial indicated which dot field or fields were to be attended. Participants were instructed to detect brief luminance decrements of cued dot fields (targets) and respond by pressing the space bar on a standard keyboard while ignoring luminance decrements of noncued dot fields (distractors). Target or distractor events, during which 20% of the dots belonging to one of the dot fields decreased in luminance by 30% for 200 msec, occurred with equal probability in all four dot fields. Each trial contained between zero and three events in total, with consecutive events separated by a minimum of 700 msec and the earliest target or distractor appearing at least 600 msec after the onset of the dot fields.
Trial timeline, stimuli, and stimulation frequencies. All dots moved randomly and flickered at the assigned frequency.
There were six cue conditions. Four conditions (focused attention) specified both the color and the location of the to-be-attended dot field (left red, right red, left blue, or right blue). In two other conditions (divided attention), only the target color was cued (both red and both blue), instructing participants to attend to both the left and right fields simultaneously. Thus, in each trial within the focused attention group, one dot field was a target (S+C+, for side and color cued) and the other three were distractors: S−C+ (uncued side and cued color), S+C− (cued side, uncued color), and S−C− (both side and color uncued). Within the divided attention conditions, two dot fields at a time were targets (C++, for color cued) and the other two were distractors (C−−, color uncued).
Stimuli were presented on a midgray background (8 cd/m²). Each dot field consisted of 75 dots spread randomly within a rectangle (5.9° wide and 11.8° high) positioned 5.4° to the left or right of the fixation cross. On each side, red (8 cd/m²) and blue (8 cd/m²) dot fields overlapped. Each of the four fields of dots flickered at an individual frequency synchronized to the screen's refresh rate (left red, 10 Hz; right red, 8.57 Hz; left blue, 7.5 Hz; right blue, 12 Hz). On each frame, dots moved 0.03° of visual angle in a random direction (0% coherence), and dots that moved outside the rectangular apertures were wrapped around to the opposite side. All dots were drawn in random order to avoid systematic occlusion, which could otherwise have provided a depth cue.
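The four flicker frequencies are what one would obtain from whole numbers of frames per flicker cycle on a 60 Hz display. The refresh rate is not stated in the text, so the 60 Hz value and the frame counts in the following sketch are assumptions chosen only because they reproduce the reported frequencies.

```python
from fractions import Fraction

REFRESH_HZ = 60  # assumption: the refresh rate is not given in the text

# Hypothetical frames-per-cycle assignment that reproduces the reported
# flicker frequencies when the display runs at REFRESH_HZ.
frames_per_cycle = {
    "left red": 6,
    "right red": 7,
    "left blue": 8,
    "right blue": 5,
}

# Flicker frequency = refresh rate / frames per cycle, kept exact as a fraction.
flicker_hz = {name: Fraction(REFRESH_HZ, n) for name, n in frames_per_cycle.items()}

for name, hz in flicker_hz.items():
    print(f"{name}: {float(hz):.2f} Hz ({frames_per_cycle[name]} frames per cycle)")
```

Under this assumption the 8.57 Hz value is 60/7 Hz rounded to two decimals.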
Trials were presented in eight blocks of 84 trials. Of the total 672 trials, 288 contained one to three targets and distractors, and the remaining 384 contained none. In total, each condition contained 96 luminance decrement events randomly distributed between four dot fields. Thus, focused attention conditions contained between 22 and 28 targets (M = 24) and between 68 and 74 distractors (M = 72). Divided attention conditions contained approximately equal numbers of targets and distractors (46–50, M = 48). Trials of the six attentional conditions were presented in random order. Before the start of the experiment, participants familiarized themselves with the task during the practice session, which continued until participants' performance in a block reached 60% with average RT faster than 700 msec.
Behavioral Data Analysis
Detection was considered correct if RT fell between 250 and 900 msec after the target event. Reactions following nontarget events were counted as false alarms. To obtain a behavioral measure of feature selection, we calculated observer sensitivity d′ for participants' ability to discriminate attended color targets from unattended color distractors at attended locations (i.e., S+C+ vs. S+C− or C++ vs. C−−). Hit and false alarm proportions were corrected using the loglinear approach (Hautus, 1995) before calculating d′ to control the influence of extreme proportions (i.e., hit or false alarm rates close to 0 or 1). RTs of correct responses and sensitivity measures (d′) were averaged and statistically compared between focused and divided attention conditions. Additionally, false alarm rates were averaged and compared across four distractor types (S−C+, S+C−, S−C−, C−−).
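As a minimal sketch of the sensitivity measure, the loglinear correction adds 0.5 to the hit and false alarm counts and 1 to the corresponding numbers of events before the proportions are converted to z scores. The function name and the counts in the example are hypothetical and are not taken from the data.

```python
from statistics import NormalDist


def dprime_loglinear(hits, n_targets, false_alarms, n_distractors):
    """Sensitivity d' with the loglinear correction for extreme proportions."""
    # Add 0.5 to each count and 1 to each total so that rates of exactly
    # 0 or 1 never reach the inverse normal transform.
    hit_rate = (hits + 0.5) / (n_targets + 1)
    fa_rate = (false_alarms + 0.5) / (n_distractors + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts: a perfect observer still receives a finite d'.
perfect = dprime_loglinear(hits=24, n_targets=24, false_alarms=0, n_distractors=24)
# Hypothetical counts: equal hit and false alarm rates give zero sensitivity.
chance = dprime_loglinear(hits=12, n_targets=24, false_alarms=12, n_distractors=24)
print(round(perfect, 2), round(chance, 2))
```

Without the correction, the first example would require the z score of a proportion of exactly 1, which is undefined.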
EEG Acquisition and Analyses
EEG Recordings
EEG data were recorded using an ActiveTwo amplifier system (Biosemi) from 64 Ag/AgCl electrodes at a sampling rate of 256 Hz. To enhance the spatial sampling of lower occipital locations, electrode positions were modified from the manufacturer's default 10–20 setup by removing electrodes at positions T7/8 and F5/6 and instead placing electrodes at positions PO9/10 and I1/2. Eye movements and blinks were monitored by electrooculographic recordings from supra- and infraorbital right eye electrodes (vertical EOG) and outer canthi of both eyes (horizontal EOG). EEG data were processed using EEGLAB toolbox (Delorme & Makeig, 2004) in combination with custom written MATLAB (2015a, The Mathworks) routines.
SSVEP Amplitudes
Only trials without targets or distractors were used for SSVEP analyses. Epochs were extracted from 600 msec before the onset of the stimulation to 3000 msec after. Epochs with blinks or eye movements were excluded, and the remaining artifacts were corrected using an automated trial exclusion and channel approximation procedure based on statistical properties of the data (Junghöfer, Elbert, Tucker, & Rockstroh, 2000), resulting in an average of 250 (±42) trials submitted to the analysis. The resulting averaged EOG traces indicated that remaining gaze position deviations from fixation were smaller than 0.8°. Because the borders of the dot fields were approximately 2.45° away from the fixation point, a 0.8° EOG cutoff excluded the possibility that participants foveated part or all of a stimulus. Data were then rereferenced to the average of all electrodes. All epochs within the same attentional condition were averaged for each participant. A cluster of occipital and parietal electrodes of interest (PO3/4, PO7/8, PO9/10, O1/2, I1/2, OZ, IZ, POz) was selected a priori based on a previous study that used comparable stimulation (Andersen, Hillyard, & Müller, 2013).
SSVEPs were analyzed in the time window from 400 to 2900 msec after the onset of stimulation to exclude evoked EEG responses at trial onset and allow the SSVEP signal to build up. Data within the time window were detrended to correct for linear drifts. SSVEP amplitudes at each of the four stimulation frequencies were calculated as the absolute value of the complex Fourier coefficients for each of the 13 selected electrodes. Figure 2 shows the spectrum of SSVEP waveforms averaged across all participants as well as the average voltage maps for each stimulation frequency.
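The amplitude computation can be illustrated with a minimal sketch that detrends one averaged waveform and evaluates the Fourier coefficient directly at a stimulation frequency. At 256 Hz, the 400 to 2900 msec window spans 640 samples. The function names, the scaling by 2/n, and the synthetic test signal are assumptions made for illustration and do not reproduce the original MATLAB routines.

```python
import cmath
import math


def linear_detrend(samples):
    """Subtract the least-squares straight line from a sequence of samples."""
    n = len(samples)
    x_mean = (n - 1) / 2
    y_mean = sum(samples) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(samples))
    slope = sxy / sxx
    return [y - (y_mean + slope * (i - x_mean)) for i, y in enumerate(samples)]


def ssvep_amplitude(samples, freq_hz, srate_hz=256.0):
    """Absolute value of the complex Fourier coefficient at freq_hz."""
    detrended = linear_detrend(samples)
    n = len(detrended)
    coeff = sum(
        y * cmath.exp(-2j * math.pi * freq_hz * i / srate_hz)
        for i, y in enumerate(detrended)
    )
    # Assumed scaling: a sinusoid of amplitude A returns approximately A.
    return abs(coeff) * 2 / n


# Synthetic 10 Hz response of amplitude 1.5 with a slow linear drift.
srate = 256.0
n_samples = 640  # 2500 msec at 256 Hz
signal = [
    1.5 * math.sin(2 * math.pi * 10 * i / srate) + 0.2 * i / srate
    for i in range(n_samples)
]
amp_10 = ssvep_amplitude(signal, 10.0, srate)
amp_12 = ssvep_amplitude(signal, 12.0, srate)
print(round(amp_10, 2), round(amp_12, 2))
```

Evaluating the coefficient at the exact stimulation frequency avoids depending on the frequency grid of a fast Fourier transform, which would not contain 7.5 Hz or 8.57 Hz for a 2.5 sec window.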
(A) Spline-interpolated isocontour voltage maps averaged over all conditions and participants for each stimulation frequency. SSVEP amplitudes for each stimulation frequency show lateralized peaks at occipital electrodes. Electrodes selected a priori for analysis are indicated with larger dots. (B) Grand-averaged amplitude spectrum for all conditions obtained by Fourier transformation zero padded to 16,384 points. Distinctive peaks on the spectrum correspond to stimulation frequencies (7.5, 8.57, 10, and 12 Hz) and the second harmonic of 7.5 Hz (15 Hz; not included in analysis). Within each peak, the highest activation was registered in the condition in which both the color and location of the stimulus flickering at that frequency were attended. (C) Summary of grand-averaged amplitudes for each stimulation frequency. For each frequency, conditions are arranged in the following order: attend left blue, attend both blue, attend right blue, attend left red, attend both red, and attend right red (small dots underneath the bars depict attended colors on each side). All four frequencies (stimuli) exhibit corresponding patterns of attentional modulation, with enhanced amplitudes when the driving stimulus' color or location was attended.
After normalization, the amplitudes were collapsed across frequencies to yield average normalized SSVEP amplitudes for every attentional condition. Average amplitudes were then subjected to a 2 × 3 repeated-measures ANOVA with a Greenhouse–Geisser correction to examine main effects of color-selective and spatial attention, as well as their interaction. Planned analyses also included the following contrasts: (1) S−C+ versus S−C− (to estimate the global effect of feature-based attention), (2) S+C− versus S−C− (to estimate the global effect of spatial attention), (3) S−C+ versus S+C− (to compare the magnitude of feature-based and spatial attentional enhancement; this comparison uses S−C+ and S+C− as isolated contributions of each type of attention with the reference to a fully unattended, S−C− stimulus), (4) S+C+ versus C++, and (5) S+C− versus C−− (to estimate the cost of dividing spatial attention).
ERPs
Epochs for ERP analyses were extracted from 100 msec before to 700 msec after luminance decrements (target and distractor events). Note that the trials used for ERP and SSVEP analyses are nonoverlapping, with only trials without events used for examining SSVEPs and only trials containing events used for examining ERPs. Artifacts were treated in the same manner as in the SSVEP analysis (16% of trials rejected on average), and data were rereferenced to the average of the earlobes. The mean amplitude from 100 msec before stimulus onset to stimulus onset was subtracted as a baseline. The amplitude of the P3 component was averaged over the time window from 450 to 600 msec after stimulus onset at electrode PZ, where it peaked, and compared across all attentional conditions.
Controlling for Voluntary Switching of Attention
Instead of (a) concurrently attending to both sides in the divided attention condition or (b) concurrently attending based on color and space in the focused attention condition, participants might have voluntarily switched between attending to the left and right or between color- and space-based selection. This type of switching could have occurred between or within trials. Such alternative accounts, according to which the condition means consist of a mixture of different attentional states, may be hard or impossible to distinguish by inspecting trial averages. However, in both cases, these alternative accounts can be tested by considering correlations of SSVEP amplitudes of pairs of stimuli over time. The presence of spatial switching can also be assessed by examining the variance in RTs.
Behavioral Analysis
Single-trial SSVEP Analyses
A single-trial SSVEP analysis was performed to confirm that participants were concurrently attending to (a) both left and right stimuli in the divided attention condition and (b) both color and space in the focused attention condition. In the case of spatial switching, stimuli on the left and right would never be attended concurrently. Instead, attention would move from attending the left side (high SSVEP amplitudes for stimuli on the left and low SSVEP amplitudes for stimuli on the right) to attending the right side (low SSVEP amplitudes for stimuli on the left and high SSVEP amplitudes for stimuli on the right), leading to a negative correlation over time between SSVEP amplitudes of stimuli on the left and right. If participants were alternating between attending to the cued color (S+C+ and S−C+ enhanced) and attending to the cued location (S+C+ and S+C− enhanced) in focused attention conditions, then a similar negative correlation should arise between S+C− and S−C+ stimuli, as these would always be attended alternatingly (Figure 5A).
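The logic of this prediction can be illustrated with a toy simulation: if attention alternates between sides, the amplitudes on the two sides are anticorrelated over time, whereas concurrent attention leaves them uncorrelated. All values in this sketch (baseline, gain, noise level, block length) are hypothetical, and the simulation is not a model of the recorded data.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(switching, n=2000, gain=1.0, noise_sd=0.5, seed=0):
    """Correlate simulated left and right amplitudes over n time points."""
    rng = random.Random(seed)
    left, right = [], []
    for i in range(n):
        if switching:
            # Attention alternates between sides in blocks of 100 samples.
            attend_left = (i // 100) % 2 == 0
            l_gain, r_gain = (gain, 0.0) if attend_left else (0.0, gain)
        else:
            # Attention is shared between both sides at all times.
            l_gain = r_gain = gain / 2
        left.append(1.0 + l_gain + rng.gauss(0, noise_sd))
        right.append(1.0 + r_gain + rng.gauss(0, noise_sd))
    return pearson_r(left, right)


r_switching = simulate(switching=True)
r_concurrent = simulate(switching=False)
print(round(r_switching, 2), round(r_concurrent, 2))
```

With these settings the switching scenario yields a clearly negative correlation, while the concurrent scenario yields a correlation near zero.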
Voluntary switching, if present, is expected to be slow. Typically, endogenous shifts of spatial attention take, in various conditions, from 300 to 500 msec if measured behaviorally (Carlson, Hogendoorn, & Verstraten, 2006; Horowitz, Holcombe, Wolfe, Arsenio, & DiMase, 2004; Reeves & Sperling, 1986) or slightly longer (400–600 msec) if measured with SSVEPs (Kashiwase, Matsumiya, Kuriki, & Shioiri, 2012; Müller, Teder-Sälejärvi, & Hillyard, 1998). Shifts of feature-based attention typically take longer than 300 msec (Andersen & Müller, 2010; Ravizza & Carter, 2008; Liu et al., 2007). Single-trial SSVEP analysis described here assumes that one cycle of switching back and forth between two locations or features takes at least 800 msec.
SSVEP amplitudes for all four stimulation frequencies on individual trials (400–2900 msec after stimulation onset) were extracted by means of complex Morlet wavelets (Gabor filters) with a FWHM resolution of ±441.3 msec (±0.5 Hz). The resulting complex amplitudes were concatenated for the trials of the same attentional condition and projected within condition onto the mean phase to obtain evoked amplitudes before averaging over electrodes. To maximize statistical power, the analysis was carried out on three electrodes selected for each participant individually on the basis of the highest numerical difference in SSVEP amplitude between the S+C+ and S−C− stimuli (largest overall difference between attentional conditions). The amplitudes driven by the stimuli of interest (see Results for further explanation) were correlated across stimuli for each participant separately, and the resulting correlations were z-transformed and compared against zero using an equivalence test (Lakens, 2017) and a two-tailed t test.
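The projection step can be sketched as follows, starting from complex amplitudes that are assumed to have been obtained already (the wavelet filtering itself is not reproduced here). The function names and example values are hypothetical, and the z-transform is assumed to be Fisher's transformation, which the text does not state explicitly.

```python
import cmath
import math


def project_onto_mean_phase(complex_amps):
    """Project complex amplitudes onto the phase of their mean.

    Components aligned with the mean phase are kept with their sign;
    components orthogonal to it are discarded.
    """
    mean = sum(complex_amps) / len(complex_amps)
    rotation = cmath.exp(-1j * cmath.phase(mean))
    return [(z * rotation).real for z in complex_amps]


def fisher_z(r):
    """Fisher z-transform of a correlation coefficient (assumed transform)."""
    return math.atanh(r)


# Hypothetical complex amplitudes with a mean phase of 45 degrees.
amps = [1 + 1j, 2 + 2j, -1 - 1j]
evoked = project_onto_mean_phase(amps)
print([round(v, 3) for v in evoked])
```

Unlike taking the absolute value, this projection lets samples in antiphase with the mean contribute negative amplitudes, so noise does not inflate the estimate.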
RESULTS
Behavioral Data
Hit RTs were faster in focused attention conditions compared with divided attention conditions, t(15) = −2.4, p = .03, d = 0.6 (Figure 3A), indicating a cost of divided attention. False alarm rates did not differ between the four types of distractors, F(2, 30.6) = 2.65, p = .09, ηG² = .12 (Figure 3C); however, distractors at the unattended location with the unattended color (S−C−) produced numerically fewer false alarm responses. As a behavioral measure of feature selection, we computed sensitivity (d′) for participants' ability to distinguish between attended (targets) and unattended (distractors) color luminance decrements at attended location(s). Interestingly, this measure did not differ between focused and divided attention conditions, t(15) = 0.98, p = .34, d = 0.25 (Figure 3B), indicating that feature selection was unaffected by division of spatial attention.
Average RTs (A) and sensitivity for target-distractor discrimination (B) under focused and divided attention conditions. (C) False alarm rates for different types of distractors: cued color and uncued side (C+S−), uncued color and cued side (C−S+), uncued color and side (C−S−), and uncued color under divided spatial attention (C−−). (D) Normalized grand-averaged SSVEP amplitudes for all attentional conditions. Error bars are within-subject 95% CI (A, B, D) or Wilson score intervals (C).
SSVEP Amplitudes
Figure 3 shows the summary of normalized and averaged SSVEP amplitudes. SSVEP amplitudes were significantly enhanced by both spatial attention, F(1.2, 18) = 12.9, p = .001, ηG² = .24, and color-selective attention, F(1, 15) = 46.62, p < 10⁻⁵, ηG² = .6. The interaction between the two types of attention was not statistically significant, F(1.6, 24) = 2.08, p = .15, ηG² = .02, indicating that the magnitude of feature-based attentional enhancement did not depend on the presence of spatial attention or its state of focus.
SSVEP amplitudes in divided attention conditions were lower than those at the attended location, t(15) = 2.739, p = .018, and larger than those at the unattended location, t(15) = 2.613, p = .025, of the focused attention conditions.
Pairwise comparisons (Tukey contrasts) revealed significant global effects of both feature-based (S−C+ vs. S−C−, t(15) = 7.06, p < 10⁻¹¹, d = 1.77) and spatial attention (S+C− vs. S−C−, t(15) = 3.75, p < 10⁻³, d = 0.94). That is, SSVEP amplitudes for the attended color were also enhanced at the unattended side, and amplitudes on the attended side were also greater when the color was unattended. The effect of feature-based attention was stronger than the effect of spatial attention (S+C− vs. S−C+, t(15) = 3.32, p < 10⁻³, d = 0.83).
ERPs
Figure 4 shows averaged ERPs time-locked to the onset of the luminance decrement at electrode Pz. P3 amplitudes were significantly enhanced by spatial attention, F(1.8, 27) = 7.46, p = .003, ηG² = .06, as well as color-selective attention, F(1, 15) = 27.06, p < 10⁻³, ηG² = .19.
(A) Grand-averaged ERP elicited by target and distractor events under all attentional conditions. Shaded area represents the time window used for averaging P3 amplitude. (B) Summary of P3 amplitudes. Error bars represent 95% CI.
However, in contrast with SSVEP results, the interaction between the two types of attention was also significant for P3 amplitudes, F(1.6, 24) = 7.14, p = .003, ηG² = .03. Pairwise comparisons revealed that spatial attention enhanced P3 amplitudes elicited by the stimuli of the attended color (S−C+ vs. S+C+, t(15) = 14.28, p < 10⁻¹⁵, d = 3.57), but the effect was not extended to the stimuli of the unattended color (S−C− vs. S+C−, t(15) = 2.35, p = .17, d = 0.59). Divided spatial attention resulted in smaller P3 amplitudes compared with focused spatial attention only for the stimuli of the attended color (S+C+ vs. C++, t(15) = 5.79, p < 10⁻⁸, d = 2.98; S+C− vs. C−−, t(15) = 2.37, p = .17, d = 0.59). Conversely, the effect of feature-based attention was spatially global, enhancing P3 amplitudes related to the attended color even on the unattended side (S−C− vs. S−C+, t(15) = 6.36, p < 10⁻⁸, d = 1.59).
Voluntary Switching of Attention: Behavior
RT variance under the assumption of a binary mixture distribution of focused and unattended conditions was larger than the empirically observed variance, t(15) = 1.89, p = .03, d = 0.47; thus, the data are not consistent with the mixture distribution. Figure 5C shows an example of representative distributions as well as the difference between the observed and predicted variances. This result supports the conclusion that the divided attention condition represents a relatively sustained attentional state rather than a combination of attended and unattended trials produced by voluntary switching of spatial attention between locations.
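The reason a mixture account predicts inflated variance can be made concrete with the standard formula for the variance of a two-component mixture, which adds a term for the separation of the component means. How the predicted variance was actually constructed is not described in this section, so the function, the equal mixing weight, and the RT values in this sketch are illustrative assumptions only.

```python
def mixture_variance(mean_a, var_a, mean_b, var_b, weight_a=0.5):
    """Variance of a two-component mixture distribution.

    Var = w * var_a + (1 - w) * var_b + w * (1 - w) * (mean_a - mean_b) ** 2
    """
    w = weight_a
    within = w * var_a + (1 - w) * var_b
    between = w * (1 - w) * (mean_a - mean_b) ** 2
    return within + between


# Hypothetical RT distributions (msec): attended versus unattended trials.
predicted = mixture_variance(mean_a=500.0, var_a=2500.0, mean_b=600.0, var_b=2500.0)
print(predicted)
```

In this example each component has a variance of 2500 msec², but a 100 msec difference between the means doubles the variance of the mixture, which is why an observed variance below the mixture prediction speaks against switching.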
(A) Example pairs of stimuli used for single-trial SSVEP analyses. Top row: to-be-attended stimuli—two dot fields (spatial switching possible) or one dot field (switching between color and space selection possible). Bottom row: stimuli (frequencies) for which single-trial SSVEP amplitudes were extracted and correlated, for the given conditions. Left and right dot fields of the same color were correlated for the spatial switching analysis, and dot fields sharing either side or color with the target were correlated for space–feature switching. (B) Summary of the single-trial SSVEP results. Gray dots represent individual observations; shaded areas are 95% CI. Posterior distributions of the means are shown in vertical histograms, summarized as 95% highest posterior density intervals (black bars). (C) Behavioral data of a representative participant: analysis of switching. Density plots represent basis RT distributions (focused and unattended trials) as well as divided attention distributions predicted by spatial switching and empirically observed.
Voluntary Switching of Attention: Single-trial SSVEP Analyses
The same hypothesis was also tested by correlating single-trial SSVEP amplitudes between the frequencies of attended stimuli in divided attention conditions (i.e., left red and right red for attend both red, left blue and right blue for attend both blue). This correlation over time did not differ significantly from zero, r = .009, t(31) = 0.238, p = .95, d = 0.04 (Figure 5B). Evidence in favor of the null hypothesis was assessed with an equivalence test (two one-sided t tests against the smallest effect of interest; Lakens, 2017). The equivalence bounds were set at Cohen's d = 0.5. For the divided attention condition, both one-sided t tests (against d = −0.5 and d = 0.5) were significant, t(31) = −2.6, p = .007, which means that we can consistently reject the alternative explanation of voluntary spatial switching.
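For a one-sample design with equivalence bounds expressed in Cohen's d, the two one-sided t statistics follow directly from the observed effect size and the number of observations. The following sketch assumes 32 observations (inferred from the 31 degrees of freedom) and uses a hypothetical function name. It computes only the t statistics; p-values would come from the t distribution with n − 1 degrees of freedom.

```python
import math


def tost_t_statistics(cohens_d, n, bound_d=0.5):
    """t statistics of the two one-sided tests for a one-sample design.

    With bounds of +/- bound_d in units of Cohen's d, each statistic is
    sqrt(n) times the distance of the observed d from the bound.
    """
    root_n = math.sqrt(n)
    t_lower = root_n * (cohens_d + bound_d)  # tests against d = -bound_d
    t_upper = root_n * (cohens_d - bound_d)  # tests against d = +bound_d
    return t_lower, t_upper


# Reported effect size d = 0.04; n = 32 is inferred from df = 31.
t_lower, t_upper = tost_t_statistics(cohens_d=0.04, n=32)
print(round(t_lower, 2), round(t_upper, 2))
```

Under these assumptions the test against the upper bound gives a t value of about −2.60, in line with the reported t(31) = −2.6, and the test against the lower bound is further from zero, so it is the less critical of the two.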
Attentional switching between space and color was tested using complementary S+C− and S−C+ stimuli. For instance, SSVEP amplitudes elicited by right red and left blue stimuli were compared in attend left red and attend right blue conditions (where they were, respectively, S−C+ and S+C− and vice versa). No significant correlation was observed between spatially cued and color cued stimuli, r = −.008, t(63) = −0.036, p = .97. An equivalence test confirmed the absence of a detectable switching effect, t(63) = 3.96, p < 10⁻⁴. In summary, neither attentional switching between cued sides in divided attention conditions nor switching between spatial and color selection in focused attention conditions is consistent with single-trial SSVEP amplitudes and variance of RTs.
DISCUSSION
The main goal of the study was to compare feature-based attentional selection under the conditions of focused and divided spatial attention to establish whether concurrent top–down attentional selection of spatial and nonspatial attributes relies on a shared resource. Divided spatial attention consistently yielded costs on measures of spatial selection: Hit RTs were slower, target P3 amplitudes were reduced, and SSVEP amplitudes were lower than in focused attention conditions. However, we found no effects of divided spatial attention on the strength of color-based selection as measured by SSVEPs or sensitivity d′. In other words, the effect of color-based selection on SSVEP amplitudes was of a similar magnitude regardless of the spatial attention condition, and the effect of spatial selection was not contingent on attending a particular color, indicating independent enhancement of individual attentional dimensions. This combination of attention to spatial and nonspatial features shows the same regularities as for the conjunctions of two nonspatial features in a previous study (Andersen et al., 2015). Additionally, neuroimaging studies of top–down attentional control have shown overlapping sources of feature-based and spatial attention with spatially interspersed neural populations tuned to either spatial or feature representation (Greenberg, Esterman, Wilson, Serences, & Yantis, 2010; Egner et al., 2008; Schenkluhn, Ruff, Heinen, & Chambers, 2008). Together, these findings suggest that simultaneous top–down attentional modulation of different dimensions relies on independent resources deployed jointly. This pattern of attentional enhancement cannot be explained by voluntary switching of attention to the preferred dimension. The results of the single-trial analyses confirmed that during focused attention, both cued space and color were attended simultaneously.
In contrast with independent selection of features of different dimensions, our data showed that simultaneous selection of multiple features within the same dimension (i.e., splitting attention) is subject to capacity limitations. The costs of dividing spatial attention were evident in SSVEP, P3, and behavioral data. Lower SSVEP and P3 amplitudes and longer RTs in divided attention compared with focused attention conditions indicate that the total strength of attentional modulation due to spatial orienting was distributed across the two behaviorally relevant locations. This cost is a result of top–down selection and cannot be attributed to competitive interactions between the stimuli themselves, as they were located in opposite hemifields and represented by different cell populations (Kastner et al., 2001). Costs of splitting attention within a dimension also exist for other features such as color (Martinovic, Wuerger, Hillyard, Müller, & Andersen, 2018; Liu & Jigo, 2017; Andersen et al., 2013), direction of motion (Liu, Becker, & Jigo, 2013), and orientation (Herrmann, Heeger, & Carrasco, 2012). Importantly, this division of attentional enhancement was sustained across hemifields and not subject to strategic switching of attention. As consistently shown by the single-trial SSVEP analysis and the analysis of RTs, stimuli in both hemifields were attended concurrently. This may seem to conflict with previous claims that spatial attention is fundamentally periodic (Landau & Fries, 2012), that is, that divided attention is achieved by rapid sequential sampling of attended locations rather than fully parallel division of attentional resources. The key point here is that we can confirm that participants divided spatial attention rather than voluntarily employing a sequential strategy.
We make no claims as to whether this division of spatial attention is achieved by parallel coselection or through serial sampling of locations at rates exceeding the speed of voluntary attentional selection and the temporal resolution of our single-trial analysis. Importantly, our conclusion that feature-based selection is unaffected by division of spatial attention is compatible with either proposed implementation of divided attention.
Although no interaction between spatial and color-selective attention was observed in SSVEP amplitudes, it was present in the ERP data. Spatial attention had no influence on P3 amplitudes to distractors of the unattended color. Color-based modulation was effective for all spatial attention conditions but was the highest on the attended side. This closely matches the difference between SSVEP and ERP evidence reported in Andersen, Fuchs, and Müller (2011), suggesting that the integration of attentional modulation varies across different levels of the cortical processing hierarchy. Early selection of continuously presented stimuli is achieved through parallel and independent facilitation of their features, whereas later selection stages for transient events show interactions, consistent with hierarchical feature selection (Anllo-Vento & Hillyard, 1996). These interactions at later stages most likely arise from nonlinear processing of the input from earlier stages related to competition (Tompary, Al-Aidroos, & Turk-Browne, 2018; Leonard et al., 2015; White, Rolfs, & Carrasco, 2015) or limitations in STM (Bundesen, 1990) as well as decisional processes.
It is likely that feature selection starts integrating with spatial selection earlier than the P3 time range. Additional analysis of the target- and distractor-elicited ERPs revealed a selection negativity (SN), an index of feature-selective processing, beginning around 200 msec after target/distractor onset in all three spatial attention conditions. Typically, the SN is more pronounced at attended locations (Hillyard & Anllo-Vento, 1998), which would in our case indicate a spatial bias in feature processing in this time window. The SN did not differ between the three spatial attention conditions in our experiment, F(2, 30) = 2.06, p = .14; however, this may be due to the fact that the current study was optimized to precisely resolve attention effects in SSVEPs rather than target- and distractor-elicited ERPs; thus, the SN here is not as clean as it can be in pure ERP experiments.
One important question is whether the integration of feature-based and spatial selection is task-specific and depends on the attention demands. This study did not manipulate discriminability of the stimuli; however, independent effects of spatial and feature-based attentional selection during focused attention were previously demonstrated using a more challenging task (Andersen et al., 2011). High discriminability in the current task is likely reflected in the large P3 amplitude effects and low false alarm rates, indicating reduced sensitivity resulting from effective feature-based filtering. With a more demanding task, early parallel selection may be less effective, producing more false alarms and higher P3 amplitudes associated with the unattended color. This is in line with the previous studies linking the relative strength of spatial and feature-based modulation with spatial discriminability as measured by ERPs (Hillyard & Münte, 1984; Harter, Aine, & Schroeder, 1982). Thus, attentional selection is fine-tuned according to the task demands during the later stages of processing, as a result of early independent filtering.
Alternative explanations of the present SSVEP results that are based on the physical characteristics of the stimuli, such as selection biased by depth cues or interference from target and distractor events, can be ruled out. Dots of different colors were rendered on screen in random order to avoid systematic occlusion and prevent depth cues. Target and distractor events could not directly have affected SSVEP amplitudes, as trials with events were excluded from these analyses.
SSVEP amplitudes were collapsed across frequencies before statistical analyses. This approach is consistent with previous studies using similar paradigms, which have generally revealed comparable SSVEP attention effects across different frequencies (Andersen et al., 2008, 2011, 2013, 2015). The frequencies employed here (7.5–12 Hz) were not further apart than in those previous studies. However, multiple studies have concluded that rhythmic visual stimulation, particularly in the alpha band (8–12 Hz), can entrain endogenous brain rhythms potentially interfering with associated cognitive processes (Gulbinaite, van Viegen, Wieling, Cohen, & VanRullen, 2017; Spaak, de Lange, & Jensen, 2014; Graaf et al., 2013). Applied to our data, this could mean that attentional effects on processing of the stimuli flickering within the alpha-band range (8.57, 10, and 12 Hz) would be different compared with the stimulus flickering outside the alpha band (7.5 Hz). Multiple arguments speak against the possibility that such frequency-specific effects may have affected our conclusions. First, our phase-locked analysis of SSVEP amplitudes strongly attenuates non-phase-locked ongoing oscillations (e.g., alpha). This is apparent from the spectrum depicted in Figure 2: The alpha band, visible as a slight bump in the range from roughly 8 to 12 Hz, is much smaller than the elicited SSVEP amplitudes and shows no clear condition differences. Thus, any possible differences in alpha activity across conditions cannot explain the SSVEP attention effects in our data. Second, although the effect of spatial attention on the stimulus flickering at 7.5 Hz seems less consistent than for the three other frequencies (Figure 2), attentional effects were highly consistent across frequencies in two previous experiments using the exact same frequencies and similar stimuli and task (Andersen et al., 2013). 
The ANOVA of SSVEP amplitudes in the present experiment yielded equivalent results when SSVEP amplitudes at 7.5 Hz were excluded. Third, attentional conditions were fully counterbalanced across stimuli. Thus, even if some frequencies were more or less prone to spatial or feature-based attentional modulation, such differences would be controlled for by the experimental design and thus cannot explain our pattern of results.
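The first argument above, that a phase-locked analysis attenuates ongoing oscillations, can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the analysis pipeline of this study: the sampling rate, trial count, epoch length, and unit amplitudes are hypothetical, and the "alpha" component is simply a 10 Hz sinusoid with a random phase on every trial.

```python
import numpy as np


def amplitude_spectrum(signal, fs):
    """Single-sided amplitude spectrum along the last axis."""
    n = signal.shape[-1]
    amps = np.abs(np.fft.rfft(signal, axis=-1)) * 2.0 / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, amps


def simulate_trials(n_trials=200, fs=500.0, duration=2.0,
                    ssvep_hz=7.5, alpha_hz=10.0, seed=0):
    """Trials = phase-locked SSVEP + alpha with a random phase per trial."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * duration)) / fs
    ssvep = np.sin(2 * np.pi * ssvep_hz * t)             # identical on every trial
    phases = rng.uniform(0, 2 * np.pi, size=(n_trials, 1))
    alpha = np.sin(2 * np.pi * alpha_hz * t + phases)    # not phase-locked
    return ssvep + alpha                                  # shape (n_trials, n_samples)


def value_at(freqs, amps, f):
    """Amplitude at the frequency bin closest to f."""
    return amps[np.argmin(np.abs(freqs - f))]


trials = simulate_trials()
# Phase-locked analysis: average in the time domain first, then transform.
freqs, evoked = amplitude_spectrum(trials.mean(axis=0), fs=500.0)
# Non-phase-locked analysis: transform each trial, then average amplitudes.
_, per_trial = amplitude_spectrum(trials, fs=500.0)
total = per_trial.mean(axis=0)

ssvep_evoked = value_at(freqs, evoked, 7.5)   # preserved by averaging
alpha_evoked = value_at(freqs, evoked, 10.0)  # strongly attenuated
alpha_total = value_at(freqs, total, 10.0)    # fully present in single trials
```

In this simulation the 7.5 Hz component survives time-domain averaging at full amplitude, while the random-phase 10 Hz component shrinks roughly with the square root of the number of trials, even though it is equally large in every single trial.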
Our results challenge the role of spatial attention as the primary or a more fundamental form of attentional selection (Tsal & Lavie, 1993; Cave & Wolfe, 1990; Posner, Snyder, & Davidson, 1980). Undoubtedly, retinotopy and spatial maps play a significant role in organization of visual representations; however, our data support the models of attention treating spatial and nonspatial features equally for the purposes of attentional selection (Reynolds & Heeger, 2009; Martinez-Trujillo & Treue, 2004; Bundesen, 1990). The effects of spatial and feature-based attention on sensory responses in visual cortex are very similar as measured by SSVEP amplitudes (Andersen et al., 2011) and neural firing rates (Cohen & Maunsell, 2011; Patzwahl & Treue, 2009). Dividing spatial (Müller, Bartelt, Donner, Villringer, & Brandt, 2003) or feature-based attention (Andersen et al., 2013; Liu et al., 2013) as well as increased spatial (Voytek et al., 2017; Mangun & Hillyard, 1988) or feature-based uncertainty (Herrmann et al., 2012) leads to weaker attentional selection and performance costs. Additionally, both feature-only (Andersen et al., 2015) and feature-space conjunctions (focused vs. divided attention in this study) occur through parallel and independent selection of individual constituent dimensions, suggesting that feature-based and spatial selection are functionally equivalent.
Most existing theories of attention are underspecified with regard to how attentional resources are distributed between spatial and feature-based attention. For example, the feature-similarity gain model (Maunsell & Treue, 2006; Treue & Martínez Trujillo, 1999) proposes additive combination of attentional effects across multiple dimensions, including space, which is fully consistent with our results. However, this model is agnostic about the consequences of dividing attention in one of the dimensions. The biased competition account, on the other hand, does not specify the source of possible feature-selective biases. In normalization models of attention (Lee & Maunsell, 2009; Reynolds & Heeger, 2009), attentional enhancement is implemented in the form of an attentional field, which has a certain spread or specificity. The attentional field can be spatial or featural, but the constraints on the combination of the two are not yet specified. Finally, the (neural) theory of visual attention (Bundesen, Habekost, & Kyllingsbæk, 2005; Bundesen, 1990) was recently extended to account for spatial as well as “nonspatial criteria” in determining attentional weights (Nordfang, Staugaard, & Bundesen, 2017). This theory incorporates spatial and feature-based attention in a way that has been proposed to be compatible with the feature-similarity gain model. Similar to the normalization model(s), its equations deal with the effects of attentional weights on stimulus processing but do not specify how attentional weights are set or whether any constraints on attentional weights exist. The present findings could potentially be integrated with these models by inclusion of additional equations that constrain the attentional field or the attentional weights of features of different dimensions.
In summary, our results strongly support parallel and independent modulation of sensory stimulus processing by selective attention to features of different dimensions. We observed no cost of simultaneous attentional selection of location and color during early sensory processing, demonstrating that these two types of selection do not rely on a shared resource. Conversely, dividing spatial attention reduced the magnitude of spatial attentional modulation of each selected item, indicating a capacity limit for selecting multiple features within the same dimension.
Acknowledgments
This work was supported by a grant from the Biotechnology and Biological Sciences Research Council (BB/P002404/1) awarded to S. K. A.
Reprint requests should be sent to Nika Adamian or Søren K. Andersen, School of Psychology, University of Aberdeen, William Guild Building, Kings College, Aberdeen AB24 3FX, United Kingdom, or via e-mail: nika.adamian@abdn.ac.uk, skandersen@abdn.ac.uk.