Abstract
Feature-based attentional selection is accomplished by increasing the gain of sensory neurons encoding target-relevant features while decreasing that of other features. But how do these mechanisms work when targets and distractors share features? We investigated this in a simplified color–shape conjunction search task using ERP components (N2pc, PD, and SPCN) that index lateralized attentional processing. In Experiment 1, we manipulated the presence and frequency of color distractors while holding shape distractors constant. We tested the hypothesis that the color distractor would capture attention, requiring active suppression such that processing of the target can continue. Consistent with this hypothesis, we found that color distractors consistently captured attention, as indexed by a significant N2pc, but were reactively suppressed (indexed by PD). Interestingly, when the color distractor was present, target processing was sustained (indexed by SPCN), suggesting that the dynamics of attentional competition involved distractor suppression interlinked with sustained target processing. In Experiment 2, we examined the contribution of shape to the dynamics of attentional competition under similar conditions. In contrast to color distractors, shape distractors did not reliably capture attention, even when the color distractor was very frequent and attending to target shape would be beneficial. Together, these results suggest that target-colored objects are prioritized during color–shape conjunction search, and the ability to select the target is delayed while target-colored distractors are actively suppressed.
INTRODUCTION
The core function of attention is to efficiently select currently relevant information while suppressing irrelevant information. Decades of research have demonstrated that this occurs, in part, by increasing the activity of neurons tuned to target-related features (Bichot, Heard, DeGennaro, & Desimone, 2015; Bichot, Rossi, & Desimone, 2005; Martinez-Trujillo & Treue, 2004; McAdams & Maunsell, 1999; Treue & Martinez Trujillo, 1999) and suppressing neurons tuned to other features (Trott & Born, 2015; Braithwaite & Humphreys, 2003; Bichot & Schall, 2002). However, it remains unclear how attentional facilitation and suppression operate simultaneously when they come into conflict, for instance, when distractors share features with the target during conjunction search.
Enhancement of relevant information and suppression of distracting information are inextricably linked processes. Previous research on feature-based gain enhancement proposed “on-target gain” accounts, which suggest that the maximal gain is applied to neurons tuned to a target-defining feature (Bichot et al., 2005; Hamker, 2004; Treue & Martinez Trujillo, 1999). Recent psychophysical and neuroimaging studies, however, have demonstrated that gain is applied to neurons differently based on stimulus context: “On-target gain” is optimal only when target and distractor features are dissimilar (Serences, Saproo, Scolari, Ho, & Muftuler, 2009; David, Hayden, Mazer, & Gallant, 2008; Martinez-Trujillo & Treue, 2004) but can be suboptimal for discriminating between similar targets and distractors (Duncan & Humphreys, 1989), because neurons tuned to the target feature will also respond to similar distractors (Geng, DiQuattro, & Helm, 2017; Scolari, Byers, & Serences, 2012; Scolari & Serences, 2009; Jazayeri & Movshon, 2006, 2007; Navalpakkam & Itti, 2007; Regan & Beverley, 1985).
This suggests that feature-based attentional selection must balance the need to enhance target information and to minimize distractor interference. For instance, in everyday life, it is impossible to select an object such as an apple, which has multiple features that overlap with other objects, using only one feature dimension such as color. It remains unknown how feature-based attentional selection is modulated when multiple feature dimensions are relevant and the need to enhance target information comes into conflict with the need to minimize interference by distractors.
A similar conundrum exists for mechanisms of proactive distractor suppression (Braver, 2012; Aron, 2011; Braver, Paxton, Locke, & Barch, 2009; Braver, Gray, & Burgess, 2007; Geng, 2014). Proactive suppression can improve performance by preemptively down-modulating distractor processing (Gaspar & McDonald, 2014; Sawaki, Geng, & Luck, 2012), but this mechanism is problematic for distractor features that are identical to target features (because suppression of distractors will also interfere with target selection). Reactive suppression, on the other hand, occurs after a distractor captures attention and is subsequently identified as a nontarget. Although reactive suppression is less efficient than proactive suppression, it is more flexible and facilitates visual search by rapidly disengaging attention from erroneous capture (Geng, 2014; Geng & DiQuattro, 2010; Fukuda & Vogel, 2009). Thus, reactive suppression could be a way of dealing with the conundrum of feature-based attentional selection when distractors that have the same feature as the target capture attention. Because reactive suppression is also an active mechanism, it may be more efficient than relying on passive decay of attention on distractors (Sawaki et al., 2012).
Here, we investigated how attentional processing unfolds for a conjunction target embedded within distractors that share features with the target. We used ERPs, focusing on the N2pc, PD, and SPCN components, which index different aspects of lateralized attentional processing. The N2pc component indexes the deployment of spatial visual attention to an object and is observed as a greater negativity at contralateral than ipsilateral posterior electrode sites from approximately 175–300 msec after stimulus onset (Luck, 2012; Woodman, Kang, Rossi, & Schall, 2007; Hopf et al., 2000, 2006; Eimer, 1996; Luck & Hillyard, 1994a, 1994b). Because N2pc is calculated as the difference in voltage between contralateral and ipsilateral electrodes, its magnitude reflects competition between lateral stimuli in the two visual hemifields: Greater N2pc amplitude indicates greater allocation of attention to the contralateral stimulus compared with the ipsilateral stimulus.
The distractor-related positivity (PD) is a more recently discovered component that is thought to reflect the suppression of a lateralized object (Cosman, Lowe, Zinke, Woodman, & Schall, 2018; Sawaki et al., 2012; Sawaki & Luck, 2010; Hickey, Di Lollo, & McDonald, 2009; Eimer & Kiss, 2008). Like N2pc, PD is observed at posterior occipital-temporal scalp sites but is a more positive voltage at sites contralateral to the suppressed object compared with those ipsilateral. Importantly, PD is hypothesized to reflect the active suppression of objects as opposed to passive decay. For example, trials with larger distractor-elicited PD responses have shorter RTs (Gaspar & McDonald, 2014), suggesting that the suppression indexed by PD is responsible for freeing attention to be deployed to another object. Together, these studies suggest that PD indicates the active suppression of a distractor stimulus (Weaver, van Zoest, & Hickey, 2017).
The N2pc is often followed by a sustained contralateral negativity. In visual working memory paradigms, this sustained negativity persists across the delay interval and is therefore called “contralateral delay activity.” This neural signal has been closely linked with the maintenance of information in working memory (Luria, Balaban, Awh, & Vogel, 2016; Ikkai, McCollough, & Vogel, 2010; Vogel & Machizawa, 2004). Similar activity is also seen in tasks that do not involve an explicit memory task but nonetheless require a briefly presented object to be maintained while it is being processed for the current task. In such experiments, this activity is called “sustained posterior contralateral negativity” (SPCN; Jolicoeur, Brisson, & Robitaille, 2008). For example, Mazza, Turatto, Umiltà, and Eimer (2007) found that a singleton target elicited a sustained contralateral negativity following the N2pc when the target needed to be discriminated but not when it simply needed to be localized. Similarly, Gaspar and McDonald (2014) found an SPCN in response to distractors that slowed down the search RT, but not for those that did not interfere with search.
In this study, we used the N2pc, PD, and SPCN components to index the ongoing dynamics of attentional competition for a color–shape conjunction target when distractors are known to share the same color. As shown in Figure 1, the search display sometimes contained two items of the target color, one of which also had the appropriate shape and one of which did not have the target shape and was therefore a distractor. Because both items shared the same color, proactive control mechanisms could not easily be used to avoid directing attention to the target-colored distractor. However, once attention was focused onto the target-colored distractor, the shape information could potentially be used to mobilize reactive control mechanisms and terminate the allocation of attention, freeing up resources for the target object. Thus, we predicted that the target-colored distractor would initially attract attention, as indexed by the N2pc component, but that this would be followed by an active suppression, as indexed by the PD component. An alternative possibility is that the allocation of attention to the target-colored distractor would passively fade over time as more evidence accumulated that the other target-colored item was the target. This alternative hypothesis predicts that the N2pc elicited by the target-colored distractor would gradually return to baseline without a PD component.
We also manipulated the frequency of the target-colored distractor to test whether participants would decrease the use of color-based attention when the target-colored distractor occurred frequently. Experiment 2 replicated the basic design of Experiment 1 but additionally manipulated competition from a target-shaped distractor to assess attentional priority for color versus shape.
METHODS
Participants
Different groups of 18 individuals participated in the two experiments. The sample size of 18 was selected based on the anticipated ability to detect the N2pc, PD, and SPCN in visual search tasks similar to ours, which used sample sizes ranging from 15 to 20 (Jenkins, Grubert, & Eimer, 2017; Eimer & Grubert, 2014a; Sawaki et al., 2012). Participants ranged in age from 19 to 38 years (Experiment 1: 14 women, mean age = 23.9 years; Experiment 2: 11 women, mean age = 22.5 years) and were paid $30 for a 2-hr session. We excluded participants for whom more than 30% of trials were rejected because of artifacts; two participants in Experiment 1 and three participants in Experiment 2 were excluded from all analyses for this reason (see details below). All participants had normal or corrected-to-normal vision and gave informed consent. All of the experimental procedures were approved by the institutional review board of the University of California, Davis.
Apparatus
Stimuli were presented on a Dell 2408WFP monitor (refresh rate = 60 Hz) using Presentation software (Version 16.5; neurobs.com). The participants viewed the monitor from a distance of 100 cm in a dimly lit room. The monitor had a black background (0.31 cd/m², x = 0.31, y = 0.42) and contained a gray fixation cross (11.7 cd/m², x = 0.30, y = 0.33) that was visible at all times unless occluded by an experimental stimulus.
Task
Each participant performed a conjunction search task (Figure 1), in which the target was defined by the combination of a specific color and shape. The color of the target was randomly chosen from orange (21.4 cd/m², x = 0.54, y = 0.41), blue (21.6 cd/m², x = 0.17, y = 0.11), and green (21.3 cd/m², x = 0.28, y = 0.56), and the shape of the target was randomly chosen to be either a circle (2.18° in diameter) or a square (1.93° × 1.93°). At the beginning of each block, one of the six possible combinations of color and shape (e.g., orange square) was shown on the center of the monitor, indicating that this would be the target for that block. The selection of the target was random and without replacement so that every participant saw every possible target.
Each trial started with a fixation display of 900–1100 msec (mean = 1000 msec, jittered in 20-msec steps with a rectangular distribution). Then, a search display was presented for 100 msec, followed by another fixation display until response or for a maximum duration of 1100 msec. The search display consisted of four different items, with each item located at one of four positions that were 3.27° left, right, above, and below the fixation cross. Each item contained a small hole (0.36° × 0.36°), either 0.73° above or below the center of the item. Participants responded to the location (up or down) of the hole in the target by pressing the upper or lower button on a gamepad (Logitech G-UG15) with the right index or middle finger, respectively. Speed and accuracy were both emphasized. Participants were instructed to maintain fixation throughout the trial, and fixation performance was verified with EOG recordings.
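As an illustration of the jittered fixation interval described above, the following is a minimal Python sketch rather than the Presentation script actually used in the experiments. The names FIXATION_DURATIONS_MS and sample_fixation_ms are hypothetical, and the seed is there only to make the example reproducible.

```python
import random

# Candidate fixation durations: 900-1100 msec in 20-msec steps (11 values),
# which gives a mean of 1000 msec under a rectangular distribution.
FIXATION_DURATIONS_MS = list(range(900, 1101, 20))


def sample_fixation_ms(rng):
    """Draw one fixation duration with equal probability for every step."""
    return rng.choice(FIXATION_DURATIONS_MS)


rng = random.Random(0)  # seeded only for reproducibility of the sketch
example_durations = [sample_fixation_ms(rng) for _ in range(5)]
```

Drawing from an explicit list of the 11 permitted values keeps every duration on the 20-msec grid.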
There were two types of trials, color distractor (CD)-present and CD-absent trials (Figure 1). In CD-present trials, the search display consisted of the target, a distractor that shared only the shape with the target (shape distractor; SD), a distractor that shared only the color with the target (CD), and a neutral distractor (ND) that shared neither color nor shape with the target. In CD-absent trials, the search display consisted of the target, a distractor that shared only the shape with the target (SD), and two NDs that had the nontarget shape randomly combined with either of the two possible nontarget colors. The location of the target and distractors in the search display was pseudorandomly determined with the constraint that the same shapes were always opposite to each other (e.g., circles located left and right, with squares located top and bottom). Therefore, the two lateralized items always had the same shape but different colors, such that the effect of CD frequency on color-based enhancement could be examined without sensory confounds from lateralized shape.
The percentage of CD-present and CD-absent trials was manipulated over blocks: High-frequency blocks consisted of 75% CD-present trials and 25% CD-absent trials, whereas low-frequency blocks consisted of 25% CD-present trials and 75% CD-absent trials. Participants were not explicitly told about the distractor frequency manipulations and had to implicitly learn the distractor frequency context while doing the task. Six high-frequency blocks and six low-frequency blocks were randomly intermixed (128 trials/block). Participants were encouraged to take a rest after every 32 trials and after each block. Note that the frequency of the SD was kept constant (100%), whereas the frequency of the CD was manipulated to be high or low (75% or 25%), which may have contributed to participants relying more on the color rather than the shape dimension.
The stimuli and procedure in Experiment 2 were identical to those of Experiment 1, except for the way in which the location of each item in the search display was determined: In half of the CD-present trials, different shapes were opposite to each other (e.g., a circle and a square were located left and right, and top and bottom), and in the other half of the CD-present trials, the same shapes were opposite to each other as in Experiment 1. In CD-absent trials, different shapes were always opposite to each other. The frequency of a CD was manipulated across blocks, just as in Experiment 1. Six high-frequency blocks and six low-frequency blocks were randomly intermixed (160 trials/block). Participants were encouraged to take a rest after every 40 trials and after each block.
EEG Recording and Analysis
The EEG was recorded inside a shielded chamber using active Ag/AgCl electrodes (Biosemi ActiveTwo) from the left and right mastoids and 32 scalp sites according to the extended 10–20 System (FP1, FP2, F3, Fz, F4, F7, F8, C3, Cz, C4, P1, P2, P3, Pz, P4, P5, P6, P7, P8, P9, P10, T7, T8, PO3, POz, PO4, PO7, PO8, O1, Oz, O2, and Iz). Horizontal eye movements were recorded from EOG electrodes placed at the outer canthi of each eye, and blinks were detected by EOG electrodes above and below the right eye. The single-ended voltage was recorded between each electrode site and a common mode sense electrode. The signals were low-pass filtered with a fifth-order sinc filter (half-power cutoff at 208 Hz) and digitized at 1024 Hz.
Offline data analyses were performed using EEGLAB Toolbox (Delorme & Makeig, 2004), ERPLAB Toolbox (erpinfo.org/erplab/), and custom MATLAB scripts. All EEG signals from the scalp electrodes were referenced to the average of the left and right mastoids, and the EOG signals were rereferenced into bipolar horizontal and vertical EOG derivations. The continuous data were then bandpass-filtered using a noncausal Butterworth infinite impulse response filter (12 dB/oct) with a half-amplitude bandpass of 0.01–36 Hz. Averaged ERP waveforms were then computed with a −200 to +700 msec epoch, relative to onset of the search display.
Trials were excluded if they contained an incorrect response or if the RT was shorter than 250 msec or longer than 1200 msec. Standard artifact rejection procedures were used to remove trials that contained large voltage deflections or blinks (Luck, 2014). Trials with saccades were rejected by means of a step function algorithm that eliminated trials in which a saccade exceeded ∼1.8° (Lins, Picton, Berg, & Scherg, 1993). We also excluded participants for whom more than 30% of the trials were rejected due to EEG/EOG artifacts (two participants in Experiment 1 and three participants in Experiment 2). Among the final set of participants, an average of 15.5% (range = 0.6–28.8%) of trials were rejected in Experiment 1, and an average of 7.6% (range = 2.4–15.5%) of trials were rejected in Experiment 2.
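The behavioral and participant-level exclusion rules described above can be sketched in Python as follows. This is an illustration only, not the analysis code used in the study; the function and constant names are hypothetical, and the artifact detection itself (voltage deflections, blinks, saccades) is assumed to have been done elsewhere.

```python
RT_MIN_MS = 250                # RTs shorter than this were excluded
RT_MAX_MS = 1200               # RTs longer than this were excluded
MAX_REJECTED_FRACTION = 0.30   # participant excluded above this artifact rate


def keep_trial(correct, rt_ms):
    """Keep a trial only if the response was correct and the RT was in range."""
    return correct and RT_MIN_MS <= rt_ms <= RT_MAX_MS


def exclude_participant(n_artifact_trials, n_trials):
    """Exclude a participant when more than 30% of trials contain artifacts."""
    return n_artifact_trials / n_trials > MAX_REJECTED_FRACTION
```

For example, keep_trial(True, 585) returns True, whereas keep_trial(True, 1250) and keep_trial(False, 585) both return False.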
We focused on the trials in which the item of interest (target, CD, or SD) was located on the horizontal midline in the search display due to the lateralized nature (contralateral–ipsilateral) of the ERP components (N2pc, PD, and SPCN). The ERP components were measured at parietal–occipital electrode sites (P1/P2, P3/P4, P5/P6, P7/P8, P9/P10, PO3/PO4, PO7/PO8, and O1/O2) in difference waves in which the waveform from the hemisphere ipsilateral to the item of interest was subtracted from the waveform from the hemisphere contralateral to the item of interest. Specifically, the contralateral waveform was the average of the left-hemisphere electrodes when the item of interest was in the right visual field and the right-hemisphere electrodes when the item of interest was in the left visual field; the ipsilateral waveform was the average of the left-hemisphere electrodes when the item was in the left visual field and the right-hemisphere electrodes when the item was in the right visual field. Because stimuli were always bilaterally presented, the contra-minus-ipsilateral subtraction eliminates most of the other ERP components, with lateralized components (N2pc, PD, and SPCN) remaining in the difference wave (Luck, 2012).
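The contralateral and ipsilateral averaging just described can be sketched as below. This is a hedged, minimal Python illustration and not the ERPLAB procedure itself. It assumes that each hemisphere's waveform has already been averaged over trials and over the electrode sites of that hemisphere, and that waveforms are plain lists of voltages sampled at the same time points; all names are hypothetical.

```python
def mean_wave(waves):
    """Point-by-point average of equally long waveforms."""
    return [sum(samples) / len(samples) for samples in zip(*waves)]


def contra_minus_ipsi(left_hemi, right_hemi):
    """Compute the contralateral-minus-ipsilateral difference wave.

    left_hemi and right_hemi are dicts with keys "left" and "right" (the
    visual field of the item of interest), each holding the waveform
    recorded over that hemisphere for that stimulus side.
    """
    # Contralateral: left hemisphere for right-field items and
    # right hemisphere for left-field items.
    contra = mean_wave([left_hemi["right"], right_hemi["left"]])
    # Ipsilateral: left hemisphere for left-field items and
    # right hemisphere for right-field items.
    ipsi = mean_wave([left_hemi["left"], right_hemi["right"]])
    return [c - i for c, i in zip(contra, ipsi)]
```

In this sketch, a negative value in the returned waveform means that the voltage was more negative over the hemisphere contralateral to the item of interest, as expected for the N2pc and SPCN.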
The amplitude of the N2pc component for each condition was measured as the negative area of the contra-minus-ipsilateral difference wave between 190 and 290 msec after the onset of the search display. The amplitude of the PD component for each condition was measured as the positive area of the contra-minus-ipsilateral difference wave between 250 and 400 msec after the onset of the search display. The amplitude of the SPCN component for each condition was measured as the negative area of the contra-minus-ipsilateral difference wave between 290 and 700 msec after the onset of the search display. We chose these time windows on the basis of a grand average waveform that was averaged across all conditions, which provides an unbiased method for selecting ERP measurement windows (Luck & Gaspelin, 2017; Luck, 2014).
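A signed area measure of this kind can be sketched in Python as follows. The sketch is illustrative and makes several assumptions that are not stated in the text: samples are evenly spaced, the area is approximated by a rectangular sum, window boundaries are inclusive, and the area is returned as a nonnegative magnitude. The names WINDOWS_MS and signed_area are hypothetical.

```python
# Measurement windows (msec after search display onset) and the polarity
# that contributes to each component's area.
WINDOWS_MS = {
    "N2pc": (190, 290, "negative"),
    "PD": (250, 400, "positive"),
    "SPCN": (290, 700, "negative"),
}


def signed_area(times_ms, wave_uv, start_ms, end_ms, polarity):
    """Area of one polarity only, within [start_ms, end_ms].

    Samples of the opposite polarity contribute nothing, so adjacent
    components of opposite sign cannot cancel each other.
    """
    if polarity not in ("negative", "positive"):
        raise ValueError("polarity must be 'negative' or 'positive'")
    dt = times_ms[1] - times_ms[0]  # assumes evenly spaced samples
    area = 0.0
    for t, v in zip(times_ms, wave_uv):
        if start_ms <= t <= end_ms:
            if polarity == "negative" and v < 0:
                area += -v * dt
            elif polarity == "positive" and v > 0:
                area += v * dt
    return area
```

Because the measure can never fall below zero, noise alone yields a positive value, which is why a separate test against a null distribution is needed to decide whether a component is present.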
To avoid cancellation by temporally adjacent but opposite polarity components without requiring overly narrow measurement windows, we used signed area measures in which only the area below the baseline contributed to the N2pc and SPCN measurements and only the area above the baseline contributed to the PD measurements (Sawaki & Luck, 2013; Sawaki et al., 2012). A shortcoming of this approach is that the measured values are biased away from zero (Luck, 2014). This is not problematic when comparing across conditions (unless the bias differs across conditions), but it is problematic when determining whether a component is present or absent in a given condition. To account for this bias when asking whether a component was present or absent, we used a nonparametric permutation approach that estimated the distribution of values that would be expected from noise alone (the null distribution; Ernst, 2004). This approach has been widely used in recent ERP and neuroimaging studies (Maris, 2012; Sawaki et al., 2012; Maris & Oostenveld, 2007; Nichols & Holmes, 2002).
On each iteration of this procedure, we inverted the contra-minus-ipsilateral difference waveforms of a random subset of participants, which simulates a situation in which any differences among participants are random and the null hypothesis is therefore true (Groppe, Urbach, & Kutas, 2011). This procedure is conceptually identical to swapping the left/right stimulus location labels on a random subset of trials. We then measured the resulting positive or negative area from the grand average waveform across participants to estimate the area that would occur when the null hypothesis is true. This step was repeated 10,000 times to obtain the probability distribution of the area for each component that would be expected if the null hypothesis were true (an empirical null distribution). We estimated the p value for a given analysis by finding the percentile of the actual amplitude relative to the null distribution. For instance, if the actual amplitude was exceeded by 630 of the 10,000 permuted values, then the p value was estimated as p = .063. We rejected the null hypothesis if the actual amplitude was exceeded by fewer than 250 of the permuted amplitudes (p < .025, which corresponds to a two-tailed alpha of .05).
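The sign-flipping permutation procedure can be sketched in Python as below. This is a minimal illustration under stated assumptions and not the custom MATLAB code used in the study. Each participant's difference wave is assumed to be a list already cropped to the measurement window, each participant is flipped with probability .5, permuted values that tie the observed value are counted as at least as extreme, and the area is a simple sum of rectified samples. All names are hypothetical.

```python
import random


def grand_average(waves):
    """Point-by-point average across participants."""
    return [sum(samples) / len(samples) for samples in zip(*waves)]


def negative_area(wave):
    """Summed magnitude of the samples below baseline (arbitrary units)."""
    return sum(-v for v in wave if v < 0)


def permutation_p_value(subject_waves, measure, n_permutations=10000, seed=0):
    """Estimate p as the fraction of permuted areas >= the observed area."""
    rng = random.Random(seed)
    observed = measure(grand_average(subject_waves))
    n_at_least_as_large = 0
    for _ in range(n_permutations):
        # Invert the difference wave of a random subset of participants.
        permuted = [
            [-v for v in wave] if rng.random() < 0.5 else wave
            for wave in subject_waves
        ]
        if measure(grand_average(permuted)) >= observed:
            n_at_least_as_large += 1
    return n_at_least_as_large / n_permutations
```

Under this sketch, the null hypothesis for a component would be rejected when the returned value is below .025. A positive-area measure for the PD could be passed in the same way through the measure argument.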
Another potential shortcoming of signed area measures is that the data are likely to be nonnormally distributed, with less variance for smaller components than for larger components. We therefore conducted additional confirmation analyses using nonparametric methods for the tests in which the relevant variables violated the assumption of normality. For the comparisons that violated the assumption of equality of variance, Greenhouse–Geisser corrected p values were used. The signed area measures also make comparisons between larger and smaller components more conservative, because smaller components will be more sensitive to the positive bias created by the metric. However, we believe that the advantages of the signed area measure outweigh this cost. First, the signed area approach avoids the direct cancellation that occurs when the measurement window contains part of a positive component and part of a negative component, which may make it more powerful than mean peak amplitude measures within narrower time windows. (Note, however, that some cancellation may still occur when positive and negative components overlap in time, so this is only a partial solution to the cancellation problem.) Second, when a component of one polarity is surrounded by components of the opposite polarity, the exact boundaries of the measurement window no longer have much impact on the measured value, whereas the boundaries can have a very large impact on mean amplitude. This eliminates the need to use a narrow a priori window (which may not actually match the timing of the component and may be wildly inappropriate for some participants) or the temptation to use the observed data to select the measurement window (which can dramatically increase the Type I error rate; see Luck & Gaspelin, 2017).
For component latency estimation, the peak latency for each component was defined as the time of the peak amplitude in each component's time window. The peak latency of each component was compared between conditions on jackknife-averaged ERPs, and corrected F and t values (denoted Fc and tc) were computed (Kiesel, Miller, Jolicoeur, & Brisson, 2008; Ulrich & Miller, 2001).
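A jackknife latency comparison of this kind can be sketched in Python as follows. The sketch follows the general approach of Ulrich and Miller (2001), in which the statistic is computed on leave-one-out grand averages and then corrected by dividing t by (n − 1), or F by (n − 1)². It is an illustration rather than the code used in the study, it handles only the two-condition paired case, and all names are hypothetical.

```python
import math


def leave_one_out_averages(subject_waves):
    """Return n grand averages, each omitting a different participant."""
    n = len(subject_waves)
    totals = [sum(samples) for samples in zip(*subject_waves)]
    return [
        [(total - wave[i]) / (n - 1) for i, total in enumerate(totals)]
        for wave in subject_waves
    ]


def peak_latency_ms(times_ms, wave_uv, start_ms, end_ms, polarity="negative"):
    """Time of the most extreme sample of the given polarity in the window."""
    sign = -1.0 if polarity == "negative" else 1.0
    candidates = [
        (sign * v, t) for t, v in zip(times_ms, wave_uv) if start_ms <= t <= end_ms
    ]
    # With ties, the earliest extreme sample is returned.
    return max(candidates, key=lambda c: c[0])[1]


def corrected_paired_t(latencies_a, latencies_b):
    """Paired t on jackknife latencies, corrected as tc = t / (n - 1)."""
    n = len(latencies_a)
    diffs = [a - b for a, b in zip(latencies_a, latencies_b)]
    mean_diff = sum(diffs) / n
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t = mean_diff / math.sqrt(variance / n)
    return t / (n - 1)
```

The correction is needed because leave-one-out averages vary far less than single-participant waveforms, so an uncorrected statistic would be greatly inflated.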
RESULTS
Experiment 1
We examined how the dynamics of attentional competition are resolved during color–shape conjunction search using ERP components (N2pc, PD, SPCN) that index lateralized processing. The primary goal was to measure the evolution of attentional dynamics between a target and a distractor that possesses the target color. One hypothesis is that attention can only be allocated to one object at a time and must passively decay before it is moved to another object. Another hypothesis is that the competition between target-colored items can occur in parallel and is resolved as the distractor is actively suppressed and attentional processing of the target continues. In addition to the main question about the dynamics of competition, we also manipulated the probability of the target-colored distractor (CD) occurring across blocks (high or low) to see if expectancies modulated the dynamics of competition between the CD and target indexed by the N2pc, PD, and SPCN.
Behavior: Accuracy and RT
Accuracy and RT data were entered into repeated-measures ANOVAs with CD frequency (high, low) and CD presence (present, absent) as within-subject factors. Numerically, mean accuracy was slightly lower for CD-present trials than for CD-absent trials. However, accuracy was near ceiling, and there were no significant main effects or interactions (Fs < 1) in the accuracy data (Figure 2A).
RTs for correct responses (Figure 2B) revealed a significant main effect of CD presence, F(1, 15) = 183.73, p < .0001, ηp² = .925, with a slower mean RT for CD-present trials (585 msec) compared with CD-absent trials (533 msec). This effect indicates that the CD reliably captured attention. This effect was larger in the low-frequency blocks than in the high-frequency blocks, yielding a significant CD frequency by CD presence interaction, F(1, 15) = 35.42, p < .0001, ηp² = .702. Interestingly, this interaction appeared to reflect a change on CD-absent trials rather than a change on CD-present trials: Paired comparisons revealed that mean RT for CD-absent trials was slower in high-frequency blocks (539 msec) than in low-frequency blocks (527 msec), t(14) = 2.84, p < .05, but there was no effect of block type for the CD-present trials, t(14) = −1.51, p = .15. This result suggests that, as a way of dealing with frequent target-colored distractors, facilitation for the target color may be reduced in high-frequency blocks (albeit only by 12 msec).
ERPs
The three ERP components of interest (N2pc, PD, and SPCN) are all lateralized (isolated with a contralateral-minus-ipsilateral difference wave), and therefore, the EEG data were analyzed on the basis of the exact spatial configuration of the search display. There were three different trial types in Experiment 1 (Figure 3A): target&SD-lateralCD-absent (lateralized target, lateralized SD, no CD), target&SD-lateralCD-present (lateralized target, lateralized SD, CD on the vertical midline), and CD-lateral (lateralized CD, target and SD on the vertical midline). In all three trial types, the two lateralized items in the search display had the same shape but different colors. This allowed us to measure attention to the target or target-colored distractor relative to a neutral-colored object of the same shape. Area amplitudes and peak latencies of the ERP components (N2pc, PD, and SPCN) to each trial type of interest were derived. We first report which components were significantly greater than zero in each condition using permutation tests and then compare the amplitude and latency of each component between conditions. Finally, the amplitude of each ERP component is compared between fast and slow response trials.
Dynamics of ERP components in each condition.
Figure 3B shows the contralateral-minus-ipsilateral difference waves for the three key trial types, collapsed across the two CD frequency conditions, and Figure 3C–E shows the waveforms separately for each trial type without collapsing across frequency conditions. An initial N2pc component was observed for all three trial types, indicating that both the targets and the CDs captured attention. However, processing diverged after the initial attentional capture. A PD immediately followed the N2pc to the lateralized CD, suggesting that the distractor was reactively suppressed. In contrast, when the target was lateralized and the CD was on the midline, there was little or no PD following the N2pc to the target; instead, the N2pc was followed by a substantial SPCN. Interestingly, when the target was lateralized without a CD on the midline, the N2pc was relatively brief and was not followed by an SPCN. Indeed, in the absence of a CD, the lateralized target elicited a small PD component, as has often been observed in similar tasks and appears to reflect the termination of attention to the target (Sawaki et al., 2012).
To provide statistical evidence for these observations, we performed the permutation tests described in the Methods section, in which the areas of the negative (N2pc, SPCN) and positive (PD) regions over broad time windows for each trial type (collapsed across low- and high-frequency blocks) were compared against the distribution of areas that would be expected by chance. In the absence of the CD (target&SD-lateralCD-absent trials; Figure 3C), the lateralized target elicited a significant N2pc (p = .001) but no statistically significant PD (p > .99) or SPCN (p = .73). When the CD was present on the vertical midline (target&SD-lateralCD-present trials; Figure 3D), the lateralized target elicited both a significant N2pc (p < .0001) and a significant SPCN (p = .005) but no significant PD (p > .99). When the CD was lateralized and the target was on the vertical midline (CD-lateral trials; Figure 3E), the lateralized CD elicited a significant N2pc (p = .001) and a significant PD (p = .004) but no significant SPCN (p = .15).
Together, these results indicate that both the targets and the target-colored distractors initially captured attention (as indexed by a significant N2pc) but that this was followed by an active suppression of the target-colored distractor (as indexed by a significant PD). In other words, it appears that proactive control processes led to a bias toward the target color, and reactive control processes were used to terminate the allocation of attention to a target-colored distractor once the shape information became available. Moreover, when a CD was present, the allocation of attention to the target was sustained over a longer period (as indexed by a significant SPCN), which may reflect the need to protect target processing from interference from the CD. More generally, these results suggest that attentional competition is resolved over time even after initial attentional capture and involves concurrent reactive suppression of distractors and continued processing of targets.
Comparisons of amplitudes and latencies between conditions.
Next, the amplitude (area under the curve) and peak latency of ERP components were directly compared across conditions.1 First, the amplitude and latency of the N2pc were entered into separate two-factor ANOVAs with factors of CD frequency (high, low) and Trial type (CD-lateral, target&SD-lateralCD-present, target&SD-lateralCD-absent). The analyses revealed significant main effects of Trial type for both latency and amplitude (amplitude: F(2, 30) = 6.32, p < .05, Greenhouse–Geisser corrected; peak latency: Fc(2, 30) = 25.53, p < .0001). Paired comparisons indicated that the amplitude of N2pc on CD-lateral trials was smaller than on target&SD-lateralCD-present trials, F(1, 15) = 5.67, p < .05, and on target&SD-lateralCD-absent trials, F(1, 15) = 6.91, p < .05. The N2pc peak latency on CD-lateral trials (222 msec) was also earlier than that on target&SD-lateralCD-present trials (239 msec), Fc(1, 15) = 12.76, p < .0001, or on target&SD-lateralCD-absent trials (255 msec), Fc(1, 15) = 128.54, p < .0001. This suggests that the N2pc elicited by lateralized CDs was smaller and terminated more rapidly than the N2pc elicited by lateralized targets. However, given the large overlap in N2pc timing for the target-lateral and CD-lateral trials, it is likely that attention was simultaneously allocated to both objects (Eimer & Grubert, 2014b). Target-elicited N2pc was smaller in amplitude, F(1, 15) = 4.83, p < .05, and peaked earlier, Fc(1, 15) = 7.60, p < .05, for target&SD-lateralCD-present trials than for target&SD-lateralCD-absent trials.
N2pc amplitude was not impacted by whether the CD occurred frequently or infrequently, with no significant main effect of frequency, F(1, 15) = 2.26, p > .15. However, there was a significant main effect of frequency on peak latency, Fc(1, 15) = 13.77, p < .0001. Specifically, the N2pc peak latency was earlier in low-frequency blocks (233 msec) than in high-frequency blocks (245 msec). This latency effect was mainly visible when the CD was absent (see Figure 3C), but the interaction between CD frequency and Trial type did not reach significance for either amplitude, F(2, 30) = .17, p > .85, or peak latency, Fc(2, 30) = .40, p > .67. This pattern of latency effects suggests that there was an overall decrease in attentional priority to the target color when target-colored distractors were likely to occur as compared with when they were rare, as would be expected if participants proactively decreased the priority of the color when this was a less reliable indicator of which item was the target. However, it is not clear why this effect was limited to latency and did not also impact N2pc amplitude.
Next, the amplitude and latency of the PD were entered into separate ANOVAs, paralleling the N2pc analyses. The analyses revealed significant main effects of trial type for PD amplitude, F(2, 30) = 5.64, p < .05, Greenhouse–Geisser corrected. Paired comparisons indicated that the amplitude of PD on target&SD-lateralCD-present trials was smaller than that on CD-lateral trials, F(1, 15) = 17.14, p < .001, or on target&SD-lateralCD-absent trials, F(1, 15) = 11.90, p < .005. This is consistent with the permutation results showing that a significant PD component was present only on CD-lateral trials, suggesting that the lateralized CD was actively suppressed. There was a marginal main effect of CD frequency for PD amplitude, F(1, 15) = 4.20, p = .06, with greater amplitude for high- than low-frequency blocks. Other effects did not reach significance (Fs < 2.4, ps > .10).
The amplitude and latency of the SPCN were analyzed in the same manner as the N2pc and PD. The analyses revealed a significant main effect of Trial type for SPCN amplitude, F(2, 30) = 4.55, p < .05, Greenhouse–Geisser corrected. Paired comparisons indicated that the amplitude of SPCN on target&SD-lateralCD-present trials was greater than on CD-lateral trials, F(1, 15) = 5.38, p < .05, or on target&SD-lateralCD-absent trials, F(1, 15) = 12.20, p < .005, with no significant difference between target&SD-lateralCD-absent trials and CD-lateral trials (F < 1). This is consistent with the permutation results showing that a significant SPCN component was found only when the lateralized target was simultaneously presented with a CD on the midline, which suggests that distractor competition necessitated sustained processing of the target. Other effects did not reach significance (Fs < 1). The absence of latency differences for the SPCN may simply reflect the fact that this component does not have a very distinct peak.
Comparisons of fast versus slow response trials.
The results in Experiment 1 indicated that both the target and the target-colored distractor initially captured attention (as indexed by a significant N2pc), but then the distractor was actively suppressed (as indexed by a significant PD) while the target continued to be processed (as indexed by a significant SPCN). An alternative hypothesis, however, is that the observed pattern of data was due to averaging trials on which the target initially captured attention with trials on which the CD was selected first. To distinguish between these hypotheses, we divided trials into fast and slow response subsets depending on whether the RT was shorter or longer than the participant's median RT for that display configuration and examined whether the dynamics of ERP components differed between fast and slow response trials (Gaspar & McDonald, 2014; Hickey, van Zoest, & Theeuwes, 2010). If attention was randomly drawn to either the target or the target-colored distractor first, then the order of selection should influence the speed of response and the dynamics of the ERP components. For instance, trials on which attention was first drawn to the target-colored distractor would be likely to elicit a longer RT and a larger N2pc to the lateralized CD as well as a smaller N2pc to the lateralized target. If both the target and the target-colored distractor initially captured attention, on the other hand, the dynamics of the N2pc and PD would not differ between fast and slow response trials.
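The median split described above can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: the dictionary keys (`subject`, `config`, `rt`) and the toy RTs are hypothetical, and assigning trials that fall exactly at the median to the slow subset is an arbitrary choice made here.

```python
from statistics import median


def median_split(trials):
    """Label each trial 'fast' or 'slow' relative to the median RT of its
    own participant and display configuration.

    trials: list of dicts with hypothetical keys 'subject', 'config', 'rt'.
    Trials exactly at the median are labeled 'slow' (an arbitrary choice).
    """
    # Collect RTs separately for every participant x configuration cell.
    cells = {}
    for trial in trials:
        key = (trial["subject"], trial["config"])
        cells.setdefault(key, []).append(trial["rt"])
    medians = {key: median(rts) for key, rts in cells.items()}

    labeled = []
    for trial in trials:
        cutoff = medians[(trial["subject"], trial["config"])]
        speed = "fast" if trial["rt"] < cutoff else "slow"
        labeled.append({**trial, "speed": speed})
    return labeled


demo = [
    {"subject": 1, "config": "CD-lateral", "rt": 520},
    {"subject": 1, "config": "CD-lateral", "rt": 610},
    {"subject": 1, "config": "CD-lateral", "rt": 480},
    {"subject": 1, "config": "CD-lateral", "rt": 700},
    {"subject": 2, "config": "CD-lateral", "rt": 900},
    {"subject": 2, "config": "CD-lateral", "rt": 1000},
]
labels = [trial["speed"] for trial in median_split(demo)]
```

In the toy data, the 900-msec trial of the second participant is labeled fast even though it is slower than every trial of the first participant, because the split is computed within each participant and configuration rather than across the whole sample.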
In the absence of the CD (target&SD-lateralCD-absent trials; Figure 4A), the lateralized target elicited a significant N2pc (ps < .0006) but no statistically significant PD (ps > .99) or SPCN (ps > .36) on both fast and slow response trials. The amplitude of each ERP component was not significantly different between fast and slow response trials (ps > .18).
When the CD was present on the vertical midline (target&SD-lateralCD-present trials; Figure 4B), the lateralized target elicited a significant N2pc (ps < .002) but no significant PD (ps > .99) on both fast and slow response trials, with no significant difference in the amplitude between fast and slow response trials (ps > .34). This is consistent with the hypothesis that both the target and the target-colored distractor initially captured attention, rather than being sequentially selected in a random order. Interestingly, the lateralized target elicited a significant SPCN only on slow response trials (p = .004), but not on fast response trials (p = .10), with greater amplitude of SPCN for slow response trials than for fast response trials, t(15) = 3.31, p = .005. This indicated that the greater amplitude of SPCN was elicited when sustained processing of the target was needed, which slowed down the RT.
When the CD was lateralized and the target was on the vertical midline (CD-lateral trials; Figure 4C), the lateralized CD elicited a significant N2pc (ps < .008) and significant PD (ps < .04) on both fast and slow response trials. The SPCN was significant only for fast response trials (p = .02), but not for slow response trials (p = .93). The amplitude of each ERP component was not significantly different between fast and slow response trials (ps > .16). The lack of significant difference in the amplitude of N2pc or PD between fast and slow response trials is consistent with the hypothesis that both the target and the target-colored distractor initially captured attention.
Experiment 2
In Experiment 1, we focused on how attentional competition evolves between the target and a CD during color–shape conjunction search. The results provided clear evidence that the CD captured attention and that the resolution of competition involved a combination of reactive suppression of the CD and sustained processing of the target. However, the target was defined by shape as well as color, and the design of the experiment did not make it possible to assess the allocation of attention to the shape dimension. Specifically, the two lateralized objects in the search display always had the same shape, making it impossible to separately evaluate the contribution of the shape dimension to the calculation of lateralized ERP components. Thus, in Experiment 2 we manipulated the location of the target-shaped distractor to independently measure the effect of the color and shape dimensions on the allocation of attention. This also provided an opportunity to assess the replicability of the effects observed in Experiment 1.
The experimental design was the same as in Experiment 1, with the exception that the two lateralized objects in the search display sometimes had the same shape (as in Experiment 1) and sometimes had different shapes (Figure 5A). Consequently, there were new conditions in which an SD was lateralized with or without a CD on the vertical midline (SD-lateralCD-present, SD-lateralCD-absent), enabling direct measurement of attentional processing of the SD. Also, the target was lateralized with an SD on the opposite side (target&SD-lateralCD-present) or an ND on the opposite side (target&ND-lateralCD-present). Because the manipulations of the CD were preserved from Experiment 1, we could examine the strength of attentional capture by target-colored versus target-shaped distractors and see how competition between these distractor types is resolved over time.
One possibility is that SDs would operate in the same way as CDs, capturing attention (as indexed by the N2pc component) and then being suppressed (as indexed by the PD component). Another possibility is that, because shape was less discriminable than color, participants would not use proactive control to prioritize the initial allocation of attention to objects containing the target shape but would instead rely on reactive control to assess shape only for items presented in the target color.
Behavior
Accuracy data were entered into a repeated-measures ANOVA with CD frequency (high, low) and CD presence (present, absent) as within-subject factors (Figure 5B). There was a significant main effect of CD presence, F(1, 14) = 12.40, p < .005, ηp2 = .470, with a slightly lower accuracy rate for CD-present trials (94%) than for CD-absent trials (96%). The main effect of CD frequency was also significant, F(1, 14) = 7.47, p < .05, ηp2 = .348, with slightly greater accuracy for high-frequency blocks (96%) than for low-frequency blocks (95%). The CD frequency by CD presence interaction was not significant (F < 1).
Mean RTs for correct response trials were entered into a repeated-measures ANOVA with CD frequency (high, low) and CD presence (present, absent) as within-subject factors (Figure 5C). The analysis revealed a significant main effect of CD presence, F(1, 14) = 193.50, p < .0001, ηp2 = .933, with slower mean RTs for CD-present trials (589 msec) than for CD-absent trials (532 msec). The main effect of CD frequency was not significant (F < 1). The CD frequency by CD presence interaction was significant, F(1, 14) = 26.89, p < .0001, ηp2 = .658. Paired comparisons revealed that the interaction was due to the mean RT for CD-absent trials being slower in high-frequency blocks (538 msec) than in low-frequency blocks (526 msec), t(14) = 2.85, p < .05, with no significant difference between CD-present trials in high- and low-frequency blocks, t(14) = −1.60, p = .13. These RT results replicate the pattern of behavior observed in Experiment 1 and further confirm that feature-based attentional enhancement was flexibly adjusted according to the frequency of CDs. However, the magnitude of the effect was relatively small, which limits our ability to assess the effect of probability on the ERPs.
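As a consistency check on the effect sizes reported for the accuracy and RT analyses above, partial eta squared can be recovered from an F ratio and its degrees of freedom through the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies this identity to the four effects for which both F and ηp2 are reported; the function name is ours.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, reported partial eta squared) for the effects with F(1, 14) above.
reported = [(12.40, 0.470), (7.47, 0.348), (193.50, 0.933), (26.89, 0.658)]
recomputed = [round(partial_eta_squared(f, 1, 14), 3) for f, _ in reported]
```

All four recomputed values match the reported effect sizes to three decimal places.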
ERPs
Dynamics of ERP components in each condition.
Figure 6 shows the contralateral-minus-ipsilateral difference waves for each of the key conditions, with lateralized target trials in the top row and lateralized distractor (CD and SD) trials in the bottom row. The left column overlays the relevant trial types, collapsed across CD frequency, and the other columns overlay the waveforms for the high-frequency and low-frequency blocks for a given trial type. Visual inspection indicates that the pattern of results was similar to the pattern observed in Experiment 1 for the trial types that were present in both experiments. Specifically, the lateralized CDs elicited an N2pc that was rapidly followed by a large PD, whereas the lateralized targets elicited an N2pc with a smaller PD. In addition, the SPCN for lateralized targets was larger when a CD was present. These results show that the key results from Experiment 1 are replicable.
Lateralized SDs, which were not included in Experiment 1, did not seem to elicit any N2pc, PD, or SPCN activity. That is, even though the target was defined by a combination of shape and color, the initial attentional selection prioritized color and not shape.
These observations were verified statistically by first assessing the presence or absence of each ERP component for each of the six trial types shown in Figure 6. As in Experiment 1, we used permutation tests to determine whether the area amplitude of each component (collapsed across low- and high-frequency blocks) was significantly greater than expected from noise alone.
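The logic of such a permutation test can be sketched as follows. This is a simplified illustration and not the procedure used in this study: it assumes that the null distribution is built by randomly flipping the sign of each participant's difference wave (equivalent to swapping the contralateral and ipsilateral labels), and the function names, number of permutations, and simulated data are all hypothetical.

```python
import random


def negative_area(wave):
    """Sum of the negative-going samples with the sign removed (never < 0)."""
    return sum(-v for v in wave if v < 0)


def grand_average(waves):
    """Average the participants' waveforms sample by sample."""
    return [sum(samples) / len(waves) for samples in zip(*waves)]


def permutation_p(waves, n_perm=2000, seed=0):
    """Proportion of sign-flipped grand averages whose negative area is at
    least as large as the observed one (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = negative_area(grand_average(waves))
    exceed = 0
    for _ in range(n_perm):
        flipped = []
        for wave in waves:
            sign = rng.choice((1, -1))  # swap contra/ipsi labels at random
            flipped.append([sign * v for v in wave])
        if negative_area(grand_average(flipped)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


# Simulated data: 16 participants, 20 samples in the measurement window,
# each with a consistent negative deflection plus Gaussian noise.
data_rng = random.Random(1)
effect_waves = [[-1.0 + data_rng.gauss(0, 0.5) for _ in range(20)]
                for _ in range(16)]
p_effect = permutation_p(effect_waves)
```

Because the signed area of pure noise is greater than zero on average, the observed area is compared against this noise-only distribution rather than against zero.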
In the absence of the CD (target&ND-lateralCD-absent; Figure 6B), the lateralized target elicited a significant N2pc (p = .0002) but no PD (p = .75) or SPCN (p = .07), replicating the result in Experiment 1. When the CD was present on the vertical midline (target&ND-lateralCD-present, target&SD-lateralCD-present; Figure 6C, D), the lateralized target elicited both an N2pc (ps < .0006) and an SPCN (ps < .02) but no significant PD (ps > .99), also replicating the finding from Experiment 1 that sustained target processing was necessary to complete the search when a CD was present in the display. When the CD was lateralized and the target was on the vertical midline (CD-lateral; Figure 6H), the lateralized CD elicited both an N2pc (p < .0001) and a PD (p = .005) but no significant SPCN (p = .15), again replicating the results of Experiment 1. This suggests that the CD captured attention and then was actively suppressed.
In addition to assessing the replicability of the results of Experiment 1, Experiment 2 was designed to assess the allocation of attention to SDs. However, in contrast to the lateralized CD, the lateralized SD (SD-lateralCD-present, SD-lateralCD-absent; Figure 6F, G) elicited no significant N2pc (p = .10, p = .09), PD (p > .99, p > .99), or SPCN (p = .82, p = .62). Whereas the target-colored distractor captured attention, the target-shaped distractor appeared to have little impact on the initial allocation of attention. This suggests that proactive control mechanisms did not effectively direct the initial allocation of attention to the target shape. This is consistent with previous research showing that the allocation of attention (as indexed by N2pc) to targets defined by a conjunction of features is determined almost entirely by the more salient of the two features (Luck & Hillyard, 1994b).
Comparisons of amplitudes and latencies between conditions.
Next, the amplitude and peak latency of each ERP component were directly compared across conditions (see Note 2). First, the amplitude and peak latency of the N2pc were entered into separate ANOVAs with factors of CD frequency (high, low) and Trial type (target&ND-lateralCD-absent, target&ND-lateralCD-present, target&SD-lateralCD-present, SD-lateralCD-absent, SD-lateralCD-present, CD-lateral). The analyses revealed a significant main effect of Trial type only for amplitude, F(5, 70) = 13.26, p < .0001, Greenhouse–Geisser corrected. Paired comparisons indicated that N2pc amplitude was smaller on SD-lateral trials (SD-lateralCD-absent, SD-lateralCD-present) than for any other trial type (Fs > 6.11, ps < .05), consistent with the permutation results that lateralized SDs did not elicit a significant N2pc. Also, N2pc amplitude was significantly smaller on CD-lateral trials than on target&ND-lateralCD-present trials, F(1, 14) = 8.07, p < .05, or target&ND-lateralCD-absent trials, F(1, 14) = 14.37, p < .005, replicating the patterns observed in Experiment 1. N2pc amplitude was also significantly larger on target&ND-lateralCD-absent trials than on target&SD-lateralCD-present trials, F(1, 14) = 12.18, p < .005, replicating the result in Experiment 1. Interestingly, N2pc amplitude was significantly larger on target&ND-lateralCD-present trials than on target&SD-lateralCD-present trials, F(1, 14) = 13.68, p < .005, indicating that attentional selection for the target was stronger when the opposite item shared nothing with the target as compared with when the opposite item shared the same shape with the target. The differences in other pairs did not reach significance (Fs < 2.82).
The main effect of CD frequency was not significant in amplitude or in peak latency (Fs < 1). The interaction between CD frequency and Trial type was not significant in amplitude or in peak latency (Fs < 2.04, ps > .08). Thus, although all the key amplitude results from Experiment 1 were replicated, the effect of CD frequency on N2pc latency was not.
Next, the amplitude and peak latency of the PD were entered into parallel ANOVAs. The analyses revealed a significant main effect of Trial type only for amplitude, F(5, 70) = 9.00, p < .0001. Paired comparisons indicated that PD amplitude was significantly greater on CD-lateral trials than for any other Trial type (Fs > 8.86, ps < .01), consistent with the permutation result indicating that a significant PD was elicited only by the lateralized CD. Also, PD amplitude was significantly greater on target&ND-lateralCD-absent trials than on target&ND-lateralCD-present trials or SD-lateral trials (Fs > 9.24, ps < .009). The differences between other pairs did not reach significance (Fs < 3.05, ps > .10). The main effect of CD frequency and the interaction between CD frequency and Trial type did not reach significance in amplitude or in peak latency (Fs < 1.09, ps > .37).
Finally, the amplitude and peak latency of the SPCN were entered into parallel ANOVAs. The analyses revealed a marginal main effect of Trial type on amplitude, F(5, 70) = 2.70, p = .07, Greenhouse–Geisser corrected. Paired comparisons indicated that SPCN amplitude on target&ND-lateralCD-present trials was significantly greater than on SD-lateral trials (Fs > 5.97, ps < .05), consistent with the permutation result showing that the lateralized target elicited a significant SPCN only when the CD was present on the vertical midline. The differences in other pairs did not reach significance (Fs < 3.18, ps > .10). The main effect of CD frequency and the interaction between CD frequency and Trial type did not reach significance for amplitude or peak latency (Fs < 1.99).
Comparisons of fast versus slow response trials.
To assess the replicability of the results observed in comparisons of fast versus slow response trials in Experiment 1, we also conducted the median-split analyses based on RT in Experiment 2. In the absence of the CD (target&ND-lateralCD-absent; Figure 7A), the lateralized target elicited a significant N2pc (ps < .0004) but no significant PD (ps > .44) on both fast and slow response trials. N2pc amplitude was not significantly different between fast and slow response trials (p = .23). The amplitude of PD was greater on fast response trials than on slow response trials, t(14) = 3.00, p = .01. The SPCN was significant only for slow response trials (p = .005) but not for fast response trials (p = .61), with greater amplitude on slow response trials than on fast response trials, t(14) = 2.89, p = .01.
When the CD was present on the vertical midline (target&ND-lateralCD-present, target&SD-lateralCD-present; Figure 7B, C), the lateralized target elicited a significant N2pc (ps < .01) but no significant PD (ps > .99) on both fast and slow response trials, with no significant difference in amplitude between fast and slow response trials (ps > .19). The SPCN was significant only on slow response trials (ps < .008) but not on fast response trials (ps > .13), with significantly greater amplitude for slow response trials than for fast response trials (ps < .02).
When the CD was lateralized and the target was on the vertical midline (CD-lateral; Figure 7D), the lateralized CD elicited both a significant N2pc (ps < .005) and a significant PD (ps < .05) but no significant SPCN (ps > .43) on both fast and slow response trials. There was no significant difference in amplitude between fast and slow response trials (ps > .54). These results replicate the patterns of results observed in Experiment 1 and support the hypothesis that both the target and the target-colored distractor initially captured attention, rather than being sequentially selected in a random order.
DISCUSSION
Successful achievement of most complex visual tasks requires the ability to selectively enhance goal-relevant information in various contexts of distracting information. Previous research has demonstrated that efficient attentional allocation is achieved by modulating feature-based attentional enhancement in a way that maximally differentiates target and distractor processing (Scolari et al., 2012; Scolari & Serences, 2009; Navalpakkam & Itti, 2007). However, shifting attentional enhancement to an off-target feature to provide maximum signal-to-noise ratio is not possible when distractors share precisely the same feature value as the target. It is also difficult in this case for mechanisms of proactive suppression (Braver, 2012; Aron, 2011; Braver et al., 2007) to help adjudicate between the target and distractor, because suppression of the distractor feature will also interfere with target selection.
The results of the current study indicated that both target-colored distractors and targets are initially attended, as indexed by a significant N2pc early in the trial, and that two mechanisms are used to resolve competition for attention. The first mechanism is reactive distractor suppression, as indexed by the presence of a PD following the N2pc for the CD. The PD is known to reflect the active suppression of objects before attention is deployed to another object (Gaspar & McDonald, 2014; Sawaki & Luck, 2013; Sawaki et al., 2012). Thus, the result that the CD elicited an N2pc followed by a PD indicates that the CD initially captured attention but then was reactively suppressed (Geng, 2014; Geng & DiQuattro, 2010; Fukuda & Vogel, 2009). The second mechanism is sustained target processing, indexed by a significant SPCN to the target when the CD was present. The SPCN is associated with the sustained processing of items in visual working memory (Jolicoeur et al., 2008; Mazza et al., 2007; Vogel & Machizawa, 2004). The finding that the SPCN was observed for the target only when a CD was present and was larger on trials with longer RTs indicates that the SPCN reflected continued target processing until attentional competition between the CD and the target was resolved and a response decision was reached. Together, these results suggest that all objects with the target color were initially selected, but then the distractor was actively suppressed while the target continued to be processed until a response decision was made. These findings extend our knowledge of the sources of behavioral interference from distractors: In the specific visual search paradigm of this study, slower RTs on CD-present trials compared with CD-absent trials appear to be due to the parallel suppression of distractors and continued selection of targets, rather than the purely serial selection of one object at a time (Wolfe & Horowitz, 2004; Treisman & Gelade, 1980).
Interestingly, the prioritization of different feature dimensions (color, shape) during conjunction target search was not equivalent. Whereas the CD clearly captured attention (N2pc) in both Experiments 1 and 2, the SD appeared to produce little or no attentional capture (Experiment 2). This is inconsistent with a recent study that found a significant N2pc for both color- and shape-matching distractors (Jenkins et al., 2017). This discrepancy is likely due to differences in the task design: In the current study, the frequency of SDs was kept constant (100%) whereas the frequency of CDs was manipulated to be high or low (75% or 25%), which may have contributed to participants relying mostly on the color rather than the shape dimension. In addition, other studies have found that color and shape are processed in partially segregated pathways, with color perception temporally preceding shape perception (Rentzeperis, Nikolaev, Kiper, & van Leeuwen, 2014; White, Lunau, & Carrasco, 2014; Clifford, Arnold, & Pearson, 2003; Moutoussis & Zeki, 1997), and that color has precedence in selection over shape (Theeuwes, 1991, 1992; Treisman & Gelade, 1980). The results from these studies are consistent with the idea that color may generally enjoy greater attentional priority than shape information. However, irrespective of whether color always dominates shape in attentional priority, the primary result from the current study is the finding that a strongly competitive (color) distractor is initially attended but is then actively suppressed.
Finally, the current study provides behavioral evidence that prioritization of feature dimensions during conjunction search is adjusted depending on the distractor context. RTs to targets in CD-absent trials were longer in high-frequency blocks than in low-frequency blocks, both in Experiments 1 and 2. However, the associated ERP results were not consistent between experiments, and therefore, the electrophysiological bases of attentional adjustments to target features based on distractor expectancies require further study.
Taken together, the current study provides a comprehensive picture of how feature-based attentional processing evolves during conjunction search. The attentional competition between the target and distractors is resolved by reactive suppression of target-similar distractors along with sustained target processing. Also, the prioritization of features during conjunction target search is not equivalent across dimensions but is biased toward the more discriminable dimension, suggesting that conjunction search proceeds by selecting all objects with the dominant target feature and then adjudicating the actual target features subsequently through analysis of the secondary feature, which results in suppression of the nontarget and continued processing of the target.
Acknowledgments
This work was funded by NSF BCS 1230377-0 to J. J. G.
Reprint requests should be sent to Jeongmi Lee, Graduate School of Culture Technology, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, South Korea, or via e-mail: [email protected], or Joy J. Geng, Center for Mind and Brain, University of California, Davis, 267 Cousteau Pl., Davis, CA 95618, or via e-mail: [email protected].
Notes
1. Because we used signed area measures, some of the amplitude variables violated the assumption of normality. We therefore conducted additional confirmation analyses using nonparametric methods (Friedman test, Wilcoxon signed-rank test) to test the main and simple effects of Trial type for comparisons in which the relevant variables violated the assumption of normality. All the effects obtained from the conventional ANOVA in Experiment 1 were replicated with the nonparametric tests (ps < .05).
2. As in Experiment 1, we conducted additional confirmation analyses using nonparametric methods (Friedman test, Wilcoxon signed-rank test) for comparisons in which the relevant variables violated the assumption of normality. All the effects obtained from the conventional ANOVA in Experiment 2 were replicated with the nonparametric tests (ps < .05).