Abstract

In everyday life, we often need to track several objects simultaneously, a task modeled in the laboratory using the multiple-object tracking (MOT) task [Pylyshyn, Z., & Storm, R. W. Tracking multiple independent targets: Evidence for a parallel tracking mechanism. Spatial Vision, 3, 179–197, 1988]. Unlike in MOT, however, the set of relevant targets in everyday life tends to be fluid and to change over time. Humans are quite adept at “juggling” targets in and out of the target set [Wolfe, J. M., Place, S. S., & Horowitz, T. S. Multiple object juggling: Changing what is tracked during extended MOT. Psychonomic Bulletin & Review, 14, 344–349, 2007]. Here, we measured the neural underpinnings of this process using electrophysiological methods. Vogel and colleagues [McCollough, A. W., Machizawa, M. G., & Vogel, E. K. Electrophysiological measures of maintaining representations in visual working memory. Cortex, 43, 77–94, 2007; Vogel, E. K., McCollough, A. W., & Machizawa, M. G. Neural measures reveal individual differences in controlling access to working memory. Nature, 438, 500–503, 2005; Vogel, E. K., & Machizawa, M. G. Neural activity predicts individual differences in visual working memory capacity. Nature, 428, 748–751, 2004] have shown that the amplitude of a sustained lateralized negativity, the contralateral delay activity (CDA), indexes the number of items held in visual working memory. Drew and Vogel [Drew, T., & Vogel, E. K. Neural measures of individual differences in selecting and tracking multiple moving objects. Journal of Neuroscience, 28, 4183–4191, 2008] showed that the CDA also indexes the number of items being tracked in a standard MOT task. In the current study, we set out to determine whether the CDA is a signal that merely represents the number of objects that are attended during a trial or a dynamic signal capable of reflecting on-line changes in tracking load during a single trial. By measuring the response to add or drop cues, we were able to observe dynamic changes in CDA amplitude. The CDA appears to rapidly represent the current number of objects being tracked. In addition, we were able to generate some initial estimates of the time course of this dynamic process.

INTRODUCTION

In a typical multiple-object tracking (MOT) experiment, participants are asked to track several target items among a number of identical distractors for 5–20 sec (Pylyshyn & Storm, 1988). After this period, they are asked to identify the tracked items. Most individuals can successfully track about four items, illustrating a significant capacity limitation in visual processing. In real life, we rarely track the same set of items for extended periods. It is more likely that we will be rapidly switching between items as some become relevant and others lose relevance. For instance, while driving, one might need to track the movements of the cars to the left and the front when preparing to change lanes, then switch to monitoring the truck to the right when attempting to get in an exit lane. To study this more complex natural operation, Wolfe, Place, and Horowitz (2007) introduced the “multiple-object juggling” paradigm. Instead of tracking a fixed set of targets throughout the trial, participants were cued to add and drop targets throughout the trial. Participants were surprisingly good at this task; there was no significant cost for juggling targets as opposed to tracking a fixed set.

How quickly do we acquire new targets and stop tracking old ones? This is an attentional switching problem. Previous research on the topic has studied shifting attention between fixed spatial locations in response to symbolic cues using ERPs and fMRI. Converging evidence from these studies suggests that top–down attentional modulation of spatial attention is primarily driven by areas in the posterior parietal cortex (e.g., Yantis, 2008). For example, Bisley and Goldberg (2003) have shown that the lateral intraparietal area dynamically represents attended locations during sustained voluntary attention tasks. Yantis et al. (2002) have shown that activity in the superior parietal lobule responds transiently to cues that instruct the participant to switch the spatial location of attention. This area is also active when switching between nonspatial features (Liu, Slotnick, Serences, & Yantis, 2003), strongly suggesting a role for this area in controlling voluntary shifts of attention in multiple representational domains. Two ERP studies on the subject have focused on nonlateralized parietal activity presumed to reflect a change in the focus of spatial attention. The switch activity occurred between 300 and 700 msec after the cue (Brignani, Lepsien, Rushworth, & Nobre, 2009; Grent-'T-Jong & Woldorff, 2007). In both cases, the investigators studied the response to a symbolic cue that instructed the observer to shift attention to a different location. Brignani and colleagues (2009) contrasted the evoked response to “hold” and “switch” cues, thereby allowing them to focus on activity that was necessary to shift the locus of attention in preparation for a stimulus in the new location. However, although the observed activity suggests that processes related to the attentional shift begin roughly 350 msec after the cue, it is not clear what aspect of the attentional switch is reflected by this activity. As this was the first time at which the “switch” and “hold” activity differed significantly, one might expect that it reflects the beginning of the process, but with the existing data, there is no way to know when the process is complete; that is, when has attention effectively switched to the new location?

The current study addressed this question by examining the effect of an exogenous shift cue during an MOT task on a sustained, lateralized ERP component called the contralateral delay activity (CDA; McCollough, Machizawa, & Vogel, 2007; Vogel & Machizawa, 2004). When a participant is asked to track a set of stimuli presented in one visual hemifield, the difference between contralateral and ipsilateral activity at posterior-occipital electrode sites increases with the number of items and saturates above tracking capacity (Drew & Vogel, 2008). By manipulating the size of the area within which the objects moved, we have previously shown that this activity appears to reflect the number of attended items rather than the size of the attentional spotlight (Drew & Vogel, 2008). Vogel and colleagues have suggested that this component is an index of the number of currently attended targets (Drew & Vogel, 2008; McCollough et al., 2007; Drew, McCollough, & Vogel, 2006; Vogel & Machizawa, 2004). If so, the CDA should be sensitive to changes in tracking load during a trial, thereby allowing us to estimate the time course of effectively switching from tracking one group of objects to another.

In visual working memory (VWM) tasks, the CDA responds to increases in memory load. When participants held two items in memory and were then asked to add two more items, CDA amplitude increased (Ikkai, McCollough, & Vogel, 2010; Vogel, McCollough, & Machizawa, 2005). A clear prediction of the hypothesis that the CDA is a dynamic index of the number of items currently being attended is that the CDA would decrease when load is reduced. Alternatively, and consistent with existing data, the CDA could reflect the number of items selected or individuated during the course of the entire trial. That is, CDA amplitude might ratchet up each time target load is modified and might simply asymptote once its maximum amplitude was reached. This is straightforward in the case of adding items. CDA amplitude should increase when additional targets are added to a preexisting target load (2 + 2 = 4; Vogel et al., 2005). However, the “ratchet-up” account would predict a paradoxical result when participants are asked to drop three items and pick up one new item (as in the drop conditions of the present article). Now, 3 − 1 = 4, because the number of items attended during the entire trial would increase, whereas the active number of items being tracked would decrease. If the ratchet-up hypothesis were true, it would render the CDA a much less effective tool for studying dynamic situations because it would no longer be representative of the current load and because the signal appears to saturate above three or four items (Drew & Vogel, 2008; Vogel & Machizawa, 2004).

To test the dynamic index hypothesis for the CDA, we employed a variant of Wolfe et al.'s (2007) multiple-object juggling experiments. In this paradigm, participants are able to change tracking load with no observable behavioral cost. This creates an ideal situation to determine whether the CDA is truly representative of the number of items being actively attended: if so, it should dynamically increase and decrease in response to unpredictable changes in tracking load during the trial. On the other hand, if the CDA merely reflects the number of items selected in a given trial, the amplitude should increase when the tracking load decreases.
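
The contrast between the two accounts can be made concrete with a small sketch. The function names are hypothetical, and the ceiling of four items is an illustrative assumption standing in for the saturation described above; the sketch encodes only the logic of the two hypotheses, not any measured amplitude.

```python
# Illustrative sketch only: function names and the cap of 4 are assumptions.

def dynamic_index(initial_load, final_load):
    """Post-cue load predicted if the CDA follows the current target set."""
    return final_load


def ratchet_up(initial_load, final_load, cap=4):
    """Post-cue load predicted if the CDA counts every item selected during
    the trial (switch cues always designate new items), up to a ceiling."""
    return min(initial_load + final_load, cap)


# (load before the cue, load after the cue)
conditions = {"Add": (1, 3), "Drop": (3, 1)}

for name, (before, after) in conditions.items():
    print(name, "dynamic:", dynamic_index(before, after),
          "ratchet-up:", ratchet_up(before, after))
```

Under these assumptions the two accounts diverge most clearly in the Drop condition, where the dynamic account predicts a fall to the one-target level and the ratchet-up account predicts a rise.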

In two experiments reported here, participants performed a lateralized tracking task, as in Drew and Vogel (2008). In the Track 1 and Track 3 baseline conditions, participants tracked either one or three targets, providing baseline CDA measures for one and three targets. In both experiments, we presented a cue after 500 msec of tracking on some trials. If tracking three targets, the cue would either indicate that the participant should Hold on to that target set or Drop to tracking a single target. If tracking a single target, the cue would indicate that the participant should switch to tracking three targets (Add). In all switch conditions, participants were asked to switch to new items. For instance, when switching from one to three items, none of the three items was previously a target. We recorded ERPs and analyzed CDA amplitude. To anticipate the results, CDA amplitude followed the tracking conditions. Thus, in the Add condition, for example, the CDA was initially equivalent to that in the Track 1 condition because participants were tracking a single target in both cases. After the cue, CDA amplitude rose to equal the Track 3 level, indicating that the participant was tracking three targets. The time course of the CDA transition is a measure of the time necessary to make the switch in tracking behavior. The same logic applies in reverse to the Drop condition. The Hold condition serves as a control to evaluate the effects of the cue. In Experiment 2, we replaced the Hold condition with Refresh 1 and Refresh 3 conditions, where the objects that the participant was tracking were briefly flashed again during the cue period (“refreshing” their identity). Because the changes in target load were indicated by flashing the appropriate number of new targets, these “refresh” conditions served as an important control for the visually evoked activity in response to the cue items. This allowed us to generate some initial estimates of the time course of increasing and decreasing target load during MOT.

METHODS

Participants

There were 17 participants in each of Experiments 1 and 2, with no overlap between the two samples. All were neurologically normal, recruited from the Eugene, Oregon, community, and gave informed consent according to procedures approved by the University of Oregon institutional review board.

Stimulus Displays and Procedures

Our method followed the lateralized MOT design developed by Drew and Vogel (2008). Participants were required to fixate the center of the monitor. Similar stimuli were presented on both sides of fixation to equalize the perceptual input in each hemifield. However, participants were instructed to attend to one hemifield on a given trial, and the attended hemifield was varied randomly from trial to trial.

Each trial began with a 200-msec arrow cue presented at fixation that informed the participants which side of the screen to attend. Following a 100–200 msec ISI, a set of eight stationary squares (each subtending 0.4° × 0.4°) appeared on either side of the screen. One or three of these squares on the attended side were then illuminated in the relevant color for 500 msec to designate the targets. An equal subset on the unattended side was illuminated in the irrelevant color to equate stimulus energy in the two hemifields. For half of the participants, red was the relevant color and green was the irrelevant color, and the color mappings were reversed for the remaining subjects. These colors were photometrically equiluminant according to a Konica Minolta ChromaMeter CS-100a.

After the selection period, the color cues disappeared and all objects began to move independently for 2000 msec. Motion in each hemifield was confined within an invisible rectangle subtending 8.90° × 4.45°, with the inner edge of this rectangle laterally offset from fixation by 2.16°. Objects moved at a speed of approximately 1.6°/sec. Motion trajectories were linear and changed at random intervals or when the items made contact with other items or the boundaries of the invisible rectangle.

Following the motion phase, one item on the attended side was illuminated in the relevant color and another on the unattended side was illuminated in the irrelevant color. Participants made a button press to identify the selected item on the attended side as having been either a tracked target or an untracked distractor.

In Experiment 1, there were five different trial types: Track 1, Track 3, Add, Drop, and Hold (see Figure 1). For Track 1 and Track 3 trials, participants tracked the initial targets throughout the duration of the trial. On Add trials, participants initially tracked one item. After 500 msec of tracking, three items, which had previously been distractors, were cued in the relevant color on the attended side of the screen for 200 msec. To match the physical stimulation on both sides of the screen as closely as possible, three items were illuminated on the unattended side in the relevant color at the same time. Participants were instructed to immediately stop tracking the initial target and start tracking the new targets. Drop and Hold trials followed the same time course but asked the participant to initially track three items. In the Drop condition, one former distractor was then cued, and the participant was to track this item until the end of the trial. In the Hold condition, one distractor item was cued in the irrelevant color on both the attended and unattended sides of the screen. Participants were told to ignore this cue and continue tracking the three initially cued targets. All trial types were interleaved so that when initially asked to track one target, there was a 50% probability of being asked to continue tracking that item throughout the trial, and when initially tracking three targets, there was a two-thirds chance of being asked to stay with the initial targets. After a short practice block, participants completed 880 trials, yielding 176 trials per condition.
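
The trial counts and conditional probabilities above follow directly from interleaving five equally frequent trial types. The short sketch below reproduces that arithmetic; the variable and function names are hypothetical, and equal trial counts per condition are taken from the figures given in the text.

```python
from fractions import Fraction

TRIALS_TOTAL = 880

# condition name -> (initial target load, keeps the initial targets?)
CONDITIONS = {
    "Track 1": (1, True),
    "Track 3": (3, True),
    "Add": (1, False),
    "Drop": (3, False),
    "Hold": (3, True),
}

# Equal numbers of trials in each condition.
trials_per_condition = TRIALS_TOTAL // len(CONDITIONS)


def p_keep_initial_targets(initial_load):
    """Probability of staying with the initial targets, given the initial load."""
    keeps = [keep for load, keep in CONDITIONS.values() if load == initial_load]
    return Fraction(sum(keeps), len(keeps))


print(trials_per_condition)       # trials in each condition
print(p_keep_initial_targets(1))  # after starting with one target
print(p_keep_initial_targets(3))  # after starting with three targets
```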

Figure 1. 

Experimental design for Experiment 1. Here, we have depicted just the attended side of the screen. The unattended side of the screen was matched in all phases except selection, where the attended side contained relevant colors on targets whereas the unattended side contained the irrelevant color on the same number of items. Relevant color was counterbalanced across participants and is depicted as gray in this example, whereas the irrelevant color is black. In the actual experiment, isoluminant red and green were used as the two color types. On switch trials, the items that the participant switched to were always different from the originally tracked items. All trial types were interleaved so that a participant who began tracking three objects did not know whether they would be told to switch to tracking one item (Drop), ignore an irrelevant cue (Hold), or simply continue tracking.

There were six conditions in Experiment 2, four (Track 1, Track 3, Add, and Drop) of which were replications from Experiment 1. The two new conditions in this experiment were the Refresh 1 and Refresh 3 conditions. In these conditions, the initially cued target items were cued again (or refreshed) using the relevant color during the cue period. As in the Add and Drop conditions, an identical number of items was cued in the relevant color on the unattended side. After a short practice block, participants completed 1248 trials, yielding 208 trials per condition.

Electrophysiological Recording and Analysis

ERPs were recorded in each experiment using our standard recording and analysis procedures, including rejection of trials contaminated by blocking, blinks, or large (>1°) eye movements (see McCollough et al., 2007). We recorded from 22 tin electrodes mounted in an elastic cap (Electrocap International) using the International 10/20 System. 10/20 sites F3, FZ, F4, T3, C3, CZ, C4, T4, P3, PZ, P4, T5, T6, O1 and O2 were used along with five nonstandard sites: OL midway between T5 and O1; OR midway between T6 and O2; PO3 midway between P3 and OL; PO4 midway between P4 and OR; POz midway between PO3 and PO4. All sites were recorded with a left mastoid reference, and the data were re-referenced off-line to the algebraic average of the left and right mastoids. Horizontal EOG was recorded from electrodes placed approximately 1 cm to the left and right of the external canthi of each eye to measure horizontal eye movements. To detect blinks, vertical EOG was recorded from an electrode mounted beneath the left eye and referenced to the left mastoid. The EEG and EOG were amplified with a SA Instrumentation amplifier with a bandpass of 0.01–80 Hz and were digitized at 250 Hz in LabView 6.1 running on a Macintosh.

Eye Movements

Trials containing either blinks or eye movements were excluded from further analysis. Participants with trial rejection rates >25% were excluded from the sample. Using these criteria, we eliminated 1 of the 17 participants who took part in Experiment 1 and 4 of the 17 participants in Experiment 2. We analyzed the horizontal EOG channel over a long time window (100–2400 msec) to ensure that eye position did not drift into the movement area during the trial and that the presence of a switch cue on some trials did not result in an increased rate of eye position drift during the trial. We divided the data on the basis of the side that was attended and the experimental condition for each trial. We found no evidence of significant hEOG drift in either experiment: The main effects of both side attended (Experiment 1: F(1, 15) = 2.00, p = .18, η2 = .12; Experiment 2: F(1, 12) = 4.38, p = .058, η2 = .27) and condition (Experiment 1: F(4, 60) = 1.41, p = .24, η2 = .09; Experiment 2: F(5, 60) = 1.93, p = .10, η2 = .14) were not significant, and the two factors did not interact (Experiment 1: F(4, 60) = .73, p = .57, η2 = .05; Experiment 2: F(5, 60) = 1.72, p = .14, η2 = .13). Overall, the drift toward the attended side of the screen was less than 0.7 μV in both experiments. Hillyard and Galambos (1970) have shown that a 1° eye movement elicits roughly a 16-μV deflection in hEOG waveforms. Given that the area that the boxes moved within was lateralized by 2.16° from fixation, it is unlikely that these small drifts in fixation affected the data.
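
To put the drift figure in spatial terms, the hEOG bound can be converted to degrees of visual angle. The sketch below assumes that hEOG voltage scales linearly with eye position at the rate of roughly 16 μV per degree; that linear scaling is an assumption made for illustration, and the constant names are hypothetical.

```python
# Rough conversion of hEOG drift to visual angle, assuming linear scaling.
UV_PER_DEGREE = 16.0    # approximate hEOG deflection for a 1-degree eye movement
MAX_DRIFT_UV = 0.7      # upper bound on observed drift in both experiments
INNER_EDGE_DEG = 2.16   # offset of the motion area from fixation

drift_deg = MAX_DRIFT_UV / UV_PER_DEGREE
fraction_of_offset = drift_deg / INNER_EDGE_DEG

print(f"drift < {drift_deg:.3f} deg "
      f"({fraction_of_offset:.1%} of the distance to the motion area)")
```

Under this assumption, the drift corresponds to less than about 0.05° of visual angle, a small fraction of the 2.16° separating fixation from the nearest edge of the motion area.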

Difference Waves

Contralateral and ipsilateral waveforms were defined in terms of the side of the screen attended on a given trial. We then examined the data in terms of contralateral and ipsilateral response, collapsing the data across attend-left and attend-right trials. As in previous work (Drew & Vogel, 2008; McCollough et al., 2007; Vogel et al., 2005; Vogel & Machizawa, 2004), we averaged the response from a set of five posterior-occipital electrode pairs (P3/4, PO3/4, T5/6, OL/R, O1/2). We computed difference waves by subtracting ipsilateral activity from contralateral activity. For the remainder of the article, we will analyze difference waves from this averaged response unless otherwise stated.
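
The difference-wave computation can be sketched as follows. This is a minimal illustration rather than the analysis code used here: the data layout (a dictionary mapping each site name to a list of voltages, one per time sample) and the function name are assumptions, while the electrode pairs are those listed above.

```python
from statistics import mean

# Electrode pairs from the text, written as (left-hemisphere, right-hemisphere).
PAIRS = [("P3", "P4"), ("PO3", "PO4"), ("T5", "T6"), ("OL", "OR"), ("O1", "O2")]


def difference_wave(erp, attended_side):
    """Contralateral minus ipsilateral activity, averaged over PAIRS.

    erp: dict mapping site name -> list of voltages (hypothetical layout).
    attended_side: "left" or "right" (the attended hemifield).
    """
    n_samples = len(erp[PAIRS[0][0]])
    wave = []
    for t in range(n_samples):
        diffs = []
        for left_site, right_site in PAIRS:
            if attended_side == "left":
                # Attending left: right-hemisphere sites are contralateral.
                contra, ipsi = erp[right_site][t], erp[left_site][t]
            else:
                contra, ipsi = erp[left_site][t], erp[right_site][t]
            diffs.append(contra - ipsi)
        wave.append(mean(diffs))
    return wave


# Toy data: right-hemisphere sites more negative than left-hemisphere sites.
toy = {}
for left_site, right_site in PAIRS:
    toy[left_site] = [0.0, 0.0]
    toy[right_site] = [-1.0, -2.0]

print(difference_wave(toy, "left"))
```

Collapsing across attend-left and attend-right trials then amounts to averaging the difference waves obtained for the two sides.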

RESULTS

Experiment 1

Behavioral Results

We converted behavioral accuracy to an estimate of the number of objects tracked using the equation: m = n(2P − 1). Here, m is an estimate of the number of items accurately tracked, n is the total number of target items, and P is the proportion correct (Scholl, 2001). We consider n to be the number of targets the participant was ultimately responsible for at report, so n was 3 for Add trials and 1 for Drop trials. The primary utility of this transformation is to ensure that participants were capable of tracking more than one target when asked to do so. If we consider percent correct instead of m, the pattern of results is identical, except that accuracy is naturally lower for the three-target conditions relative to the one-target conditions. The performance data in Figure 2A show that participants were able to efficiently switch target load on-line without any detectable cost, as observed by Wolfe et al. (2007). A repeated measures ANOVA showed a large effect of trial type on performance (F(1, 15) = 52.38, p < .001, η2 = .78). Using planned comparisons, we found that, as expected, the estimated number of items tracked was higher in the Track 3 condition than in the Track 1 condition (t(15) = 8.22, p < .001, η2 = .82). In the Drop condition, the final target load was a single item. Accordingly, we found that the Track 1 and Drop conditions yielded equivalent performance (t(15) = 1.52, p = ns, η2 = .13). Similarly, performance in the Add and Hold conditions, in which the final target load was three items, was equivalent to the Track 3 case (Add vs. Track 3: t(15) = −0.43, p = ns, η2 = .01; Hold vs. Track 3: t(15) = 1.10, p = ns, η2 = .08). In other words, there was no observable behavioral cost of switching tracking load during the trial despite a cue period (200 msec) that was shorter than that used in previous work (500 msec in Wolfe et al., 2007).
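
The transformation m = n(2P − 1) is simple enough to state as code. In the sketch below, P is expressed as a proportion between 0 and 1, the function name is hypothetical, and the accuracy values in the usage lines are made up for illustration rather than taken from the data.

```python
def estimated_items_tracked(n_targets, proportion_correct):
    """m = n(2P - 1): chance performance (P = 0.5) maps to m = 0,
    and perfect performance (P = 1.0) maps to m = n."""
    return n_targets * (2 * proportion_correct - 1)


# Illustrative values only, not data from the experiments.
print(estimated_items_tracked(3, 0.75))  # three targets, 75% correct
print(estimated_items_tracked(1, 0.5))   # one target, chance performance
print(estimated_items_tracked(3, 1.0))   # three targets, perfect performance
```

Because the same accuracy yields a larger m when n is larger, the estimate separates a participant who tracks one of three targets from one who tracks all three, which raw accuracy alone does not make explicit.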

Figure 2. 

Behavioral performance in Experiment 1 (A) and Experiment 2 (B). Error bars here and in all subsequent figures represent SEM.

Electrophysiological Results

Replicating previous work (Drew & Vogel, 2008), CDA amplitude was higher for the Track 3 condition than the Track 1 condition throughout the trial, including during the initial selection of the target before movement onset. Amplitude in both the Add and Drop conditions changed dramatically after the cue period, rising in the Add condition and falling in the Drop condition. The Hold condition amplitude appears to be stable aside from a positive deflection immediately after the cue period. To quantify these results, we initially analyzed the mean amplitude of the CDA during three periods. The first was during the typical time window associated with selection activity before motion onset (the N2pc; 200–300 msec). The remaining two measured the CDA at different time points during tracking: one immediately before the onset of the switch cue (800–1000 msec) and a later period once the amplitude of juggling conditions appeared to have stabilized (1700–1900 msec).
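
Mean amplitude within each of these windows can be computed from a difference wave as sketched below. The sketch assumes a 250-Hz sampling rate, an epoch whose first sample falls at selection onset (0 msec), and windows that include the start time but exclude the end time; the epoch start, the window convention, and the function name are all assumptions made for illustration.

```python
SAMPLE_RATE_HZ = 250  # digitization rate reported in the Methods

# Analysis windows in msec relative to the onset of the selection period.
WINDOWS_MS = {
    "selection": (200, 300),
    "early": (800, 1000),
    "late": (1700, 1900),
}


def window_mean(wave, start_ms, end_ms, epoch_start_ms=0):
    """Mean of the samples falling in [start_ms, end_ms) of a difference wave."""
    ms_per_sample = 1000 / SAMPLE_RATE_HZ
    first = round((start_ms - epoch_start_ms) / ms_per_sample)
    last = round((end_ms - epoch_start_ms) / ms_per_sample)
    segment = wave[first:last]
    return sum(segment) / len(segment)


# Toy wave whose value at each sample equals its latency in msec.
toy_wave = [4 * i for i in range(500)]

for name, (start, end) in WINDOWS_MS.items():
    print(name, window_mean(toy_wave, start, end))
```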

In the selection time window (Figure 3C), we first compared the amplitude among conditions in which the participant was asked to initially track the same number of targets. Amplitude was equivalent for the two conditions that began with tracking one target (Track 1 and Add; F(1, 15) = .05, p = .82, η2 = .00) and for the three conditions that began with tracking three targets (Track 3, Drop and Hold; F(2, 30) = 2.24, p = .12, η2 = .13). Accordingly, we collapsed the data for this time window into these two categories and then compared amplitude between them. As in Drew and Vogel (2008), we found that the N2pc was higher when three targets were initially selected (F(1, 15) = 21.12, p < .001, η2 = .59).

Figure 3. 

Electrophysiological results from Experiment 1. (A) Difference waveforms of the CDA amplitude for the five conditions. Waveforms were taken from an average of five electrode pairs from posterior–parietal sites and were time-locked to the onset of the selection period. Motion began 500 msec subsequently. Only data from correct trials are shown. (B) Mean CDA amplitude for the Add and Drop conditions as a function of time. (C, D, and E) Mean CDA amplitude for the five conditions over the selection (C), early (D), and late (E) time windows.

We followed the same procedure for the early time window (Figure 3D). Again, amplitude was equivalent for the Track 1 and Add conditions (F(1, 15) = 0.45, p = .51, η2 = .03). There was a small but significant effect of condition for the Track 3, Drop, and Hold conditions (F(2, 30) = 3.84, p = .03, η2 = .2), but this appears to be driven by slightly lower amplitude in the Track 1 condition. Furthermore, an analysis during the same early time window and using visually identical stimuli in Experiment 2 found no effect of condition (p > .6). Comparing these across initial target load, we found that amplitude for conditions where one item was initially tracked was much lower than for conditions where three items were initially tracked (F(1, 15) = 30.9, p < .001, η2 = .67).

In the late time window (Figure 3E), we again grouped the data on the basis of the current number of items being tracked. Because the late time window is after the cue, the categories are different from those in the previous time windows. Amplitude was equivalent within the one target category (now comprising Track 1 and Drop; F(1, 15) = .01, p = .97, η2 = .00) and within the three target category (now comprising Track 3, Add, and Hold; F(2, 30) = .21, p = .81, η2 = .01). As in the other time windows, amplitude was significantly higher when tracking three targets compared with tracking one (F(1, 15) = 13.52, p < .005, η2 = .47).
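
The η2 values reported alongside the F tests in these window analyses appear to match partial eta squared as recovered from F and its degrees of freedom. That reading is an inference from the numbers rather than something stated in the text, and the function name below is hypothetical; the sketch simply shows the standard conversion.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic:
    (F * df_effect) / (F * df_effect + df_error)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics quoted in the window analyses above.
print(round(partial_eta_squared(30.9, 1, 15), 2))   # early window, load effect
print(round(partial_eta_squared(13.52, 1, 15), 2))  # late window, load effect
print(round(partial_eta_squared(2.24, 2, 30), 2))   # selection window, three-target group
```

Because reported F values are themselves rounded, a value recomputed this way can differ from the reported effect size in the second decimal place.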

Next, we compared amplitude within the juggling conditions across time windows (Figure 3B). Comparing the amplitude in early and late time windows for the Add and Drop conditions, we found a main effect of time (F(1, 15) = 8.3, p = .02, η2 = .36) and, more importantly, a significant crossover interaction (F(1, 15) = 22.3, p < .001, η2 = .6), but no effect of condition (F(1, 15) = 2.8, p = .11, η2 = .16). Paired t tests show that amplitude increased significantly for the Add condition (t(15) = 19.89, p < .001, η2 = .56) and decreased significantly for the Drop condition (t(15) = −5.19, p < .01, η2 = .26).

Timing Analysis

The previous analyses confirm that CDA amplitude reflects the number of targets that participants should be currently tracking within a given time window and that amplitude changes accordingly when participants are asked to switch tracking loads. However, as is clear from the waveforms, this is not an instantaneous process: It takes time for amplitude to shift from the level equivalent to the number of targets initially tracked to the level of the final tracking load. From the waveforms, it appears that amplitude in the Add condition reaches the level of the Track 3 condition later than the Drop condition reaches the level of Track 1. However, it is also clear that the amplitude in the Hold condition decreases for roughly 200 msec soon after the cue period. Hold amplitude during the 1200–1400 msec time window where this effect is maximal is significantly more positive than amplitude in the early (t(15) = 3.67, p < .005, η2 = .47) or late time window (t(15) = 6.43, p < .001, η2 = .73). It is important to understand this effect before we attempt to make any estimates about the time course of adding and dropping items. The positive-going transient response to the cue in the Hold condition could be related to inhibitory processes as the participants attempt to suppress the irrelevant information during the cue period. Alternatively, the onset of stimuli could induce an automatic response irrespective of the need to inhibit information. If this is the case, then there may be a visually evoked transient to the cue (e.g., P1) in both the Add and Drop conditions that obscures our ability to discern when object information has been added or dropped. We conducted a second experiment to investigate this possibility.

Experiment 2

To determine whether the cue transient observed in the Hold condition was related to inhibition of the irrelevant stimulus in this condition or to a more general response, we replicated Experiment 1 with two new conditions, Refresh 1 and Refresh 3, in place of the Hold condition. In both Refresh conditions, the targets that were initially cued were simply cued again in the relevant color during the cue period. To balance visual stimulation, the same number of items was cued in the relevant color on the unattended side as well.

Behavioral Results

As in Experiment 1, we analyzed behavioral performance on the basis of the number of objects the participant had to track at the end of each trial (Figure 2B). The estimated number of objects tracked was higher when tracking three targets than when tracking one target (F(1, 12) = 64.3, p < .001, η2 = .84). Performance was equivalent for the one target conditions (Track 1, Refresh 1, and Drop; F(2, 24) = 2.70, p = .88, η2 = .18). However, we did observe a significant effect of condition within the three target category (Track 3, Refresh 3, and Add; F(2, 24) = 7.94, p < .005, η2 = .40), which appears to be driven by the fact that performance in the Track 3 condition was significantly worse than performance in the Refresh 3 condition (t(12) = 3.79, p < .005, η2 = .55). As in Experiment 1, participants showed no cost of switching targets during the trial: performance in the Add and Track 3 conditions was equivalent (t(12) = 1.88, p = ns, η2 = .23), as was Drop and Track 1 performance (t(12) = .79, p = ns, η2 = .05).

Electrophysiological Results

Four of the six conditions in this experiment were repeated from Experiment 1, and the overall results for these conditions are strikingly similar (Figure 4A). As in Experiment 1, we grouped conditions on the basis of the number of targets that the participant was instructed to be tracking during the given time window. Tracking three targets was associated with higher amplitude than tracking one target in both the selection (200–300 msec, Figure 4C; F(1, 12) = 30.58, p < .001, η2 = .72) and early (800–1000 msec, Figure 4D; F(1, 12) = 46.3, p < .001, η2 = .79) time windows. There was no effect of condition within the one or three target categories (all Fs < 0.6, all ps > .6). The same pattern of results held in the later period: Three target amplitude was greater than one target amplitude (Figure 4E; F(1, 12) = 26.94, p < .001, η2 = .69), with no effect of condition within the one target category (F(2, 24) = 1.90, p = .17, η2 = .14) or the three target category (F(2, 24) = 1.28, p = .30, η2 = .10). When we compared the Add and Drop conditions (Figure 4B), there was once again an effect of time (F(1, 12) = 42.49, p < .001, η2 = .78) and a strong crossover interaction (F(1, 12) = 52.27, p < .001, η2 = .81), but no effect of condition (F(1, 12) = .03, p = .86, η2 = .00).

Figure 4. 

Electrophysiological results from Experiment 2. (A) Difference waveforms for the four conditions also used in Experiment 1; Refresh 1 and Refresh 3 waveforms have been omitted for ease of viewing. (B) Mean CDA amplitude for the Add and Drop conditions as a function of time. (C, D, and E) Mean CDA amplitude over the selection (C), early (D), and late (E) time windows. As in Experiment 1, amplitude in the Add and Drop conditions changed to reflect the current number of items being tracked. This can be observed in mean amplitude bar graphs for the early (D) and late (E) time windows.

Critically, we observed a cue transient soon after the cue period in each of the Refresh conditions that appeared to be very similar to the deflection in response to the Hold cue in Experiment 1 (compare Figure 4F and Figure 3A). This suggests that the positive-going transient response to the cue is stimulus-driven rather than an inhibitory process that is task-driven. Aside from the period immediately following the cue, amplitude in the Refresh conditions followed the same pattern as the Track 1 and Track 3 conditions. Although there was a significant effect of both time (early or late time window: F(1, 12) = 9.75, p < .01, η2 = .45) and condition (Refresh 1 or Refresh 3: F(1, 12) = 33.1, p < .001, η2 = .73), the two factors did not interact (F(1, 12) = 0.00, p = .96, η2 = 0). Furthermore, as with the Hold condition in Experiment 1, amplitude in both Refresh conditions was significantly lower during the 1200–1400 msec time window than in either the early (Refresh 1: t(12) = 2.69, p < .05, η2 = .37; Refresh 3: t(12) = 2.81, p < .005, η2 = .40) or late time windows (Refresh 1: t(12) = 5.23, p < .001, η2 = .70; Refresh 3: t(12) = 5.43, p < .001, η2 = .71).

Timing Analysis

In both Experiment 1 and Experiment 2, we observed a transient decrease in CDA amplitude in response to cues that did not necessitate switching object load (i.e., the Hold and Refresh conditions, respectively). The transient positivity also appears to be present in the Add and Drop waveforms. In both experiments, amplitude for the Add condition deflects positively before increasing to reflect the increase in target load, whereas the Drop waveform appears to immediately decrease to reflect the lowered tracking load. We, therefore, suggest that the observed waveforms are a composite of two underlying processes, one reflecting the change in the number of targets being tracked, and one reflecting a temporary positive response to the cue. A better estimate of the switching process should therefore account for the positive deflection that is unrelated to the switching process. We did so by subtracting the amplitude on Refresh trials from the amplitude on switch trials. Figure 5B shows the result of subtracting the difference wave on Refresh 1 trials from that of the Drop trials and Refresh 3 from Add. Figure 5C shows the result of subtracting the difference wave of Refresh 1 from the Add condition and subtracting Refresh 3 from the Drop condition. These figures provide bookends for estimating when the switch took place while controlling for the positive deflection observed in response to flashing objects during the cue interval. In both cases, we used a 20-msec sliding window to estimate the timing of the switch. We used Figure 5B to estimate the first time at which the switch conditions (Add and Drop) reflected the current number of objects being tracked. Here, we found the first time at which the switch condition amplitude was statistically equivalent (p > .001) to the amplitude for the newly cued target set size for all subsequent periods. This yields an estimate of 280 msec after the cue onset for both the Drop and Add conditions. 
The subtraction in Figure 5C allowed us to estimate the final time at which the switch conditions were equivalent to the target load before the switch cue. Here, we found the first point where all subsequent points of the waveform were greater than 0 (p < .001). This yields an estimate of 480 msec after the cue for the Drop condition and 500 msec for the Add condition.
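
The second of these estimates (the first window from which the subtracted waveform stays reliably above zero) can be sketched as follows. This is an illustration under stated assumptions, not the original analysis code: it assumes one subtracted waveform per participant (switch condition minus Refresh condition), consecutive non-overlapping windows, a one-sample t test against zero in each window, and a critical t value supplied by the caller. The sampling rate, the window step, and all function names are hypothetical.

```python
import math
import statistics


def subtract(switch_wave, refresh_wave):
    """Sample-by-sample difference: switch waveform minus Refresh waveform."""
    return [s - r for s, r in zip(switch_wave, refresh_wave)]


def one_sample_t(values):
    """t statistic for testing whether the mean of `values` differs from zero."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        # No variability across participants: t is undefined, so report 0 or +/-inf.
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / (sd / math.sqrt(len(values)))


def sustained_onset(subject_waves, times_ms, window_samples, t_crit):
    """Start time of the first window from which every later window has t > t_crit.

    subject_waves: one subtracted waveform per participant, all the same length.
    window_samples: samples per window (assumption: windows do not overlap).
    Returns None when the final window is not above threshold.
    """
    starts = range(0, len(times_ms) - window_samples + 1, window_samples)
    significant = []
    for s in starts:
        per_subject = [statistics.fmean(w[s:s + window_samples]) for w in subject_waves]
        significant.append(one_sample_t(per_subject) > t_crit)
    onset = None
    # Walk backwards from the end of the trial; stop at the first window that fails.
    for s, sig in zip(reversed(starts), reversed(significant)):
        if not sig:
            break
        onset = times_ms[s]
    return onset
```

With a hypothetical 100-Hz recording, `window_samples=2` would correspond to a 20-msec window. The complementary estimate from Figure 5B could reuse the same backward walk with the test inverted, looking for the first window from which the subtracted waveform no longer differs from zero.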

Figure 5. 

Timing analysis for Experiment 2. (A) Difference CDA waveforms for the Refresh 1 and Refresh 3 conditions for comparison with the Add and Drop waveforms; Track 1 and Track 3 waveforms have been omitted for ease of viewing. As in the Hold condition from Experiment 1, there is a positive amplitude deflection in both Refresh conditions soon after the cue period. (B) Two conditions designed to control for the positive deflection in response to cue information found in Experiment 1. We subtracted Refresh 1 amplitude from the Drop condition and Refresh 3 amplitude from Add. The stars denote the last time window where each waveform differed significantly from zero. (C) Two additional conditions: Drop–Refresh 3 and Add–Refresh 1. Here the stars denote the first time when each condition differed significantly from zero and remained above this level for the rest of the trial.

Both of these analyses suggest that the time courses for adding and dropping items are remarkably similar. These numbers vary somewhat on the basis of how conservative we are in our statistical tests, but the surprising pattern of Add and Drop diverging at approximately the same time is consistent.

DISCUSSION

Shifting target sets while tracking multiple moving objects is a complex attentional operation which humans engage in on a daily basis. These results provide the first demonstration that we can measure the neural activity underlying this process operating in real time by measuring the amplitude of CDA. Our results conclusively demonstrate that the CDA is a dynamic rather than transient signal that is sensitive to both tracking load increases and decreases. This technique provides a powerful tool for studying how visual attention operates under conditions that approximate real-world behaviors more closely than typical laboratory tasks.

Relationship to Previous Research

Previous studies have suggested that CDA amplitude may serve as an on-line index of the number of items being held in working memory or the number of items being tracked during MOT (Drew & Vogel, 2008; McCollough et al., 2007; Vogel & Machizawa, 2004). The current study extends this line of research by showing that CDA amplitude is sensitive to dynamic changes in the number of targets during an MOT task. Our group has previously suggested that one similarity between visual working memory (VWM) and MOT is that both tasks require a pointer for each object that is being attended (Drew, Horowitz, Wolfe, & Vogel, 2011; Drew & Vogel, 2008). In fact, when viewing visually identical trials where the task was manipulated to be either a tracking task or a VWM task, the CDA amplitude mirrors the target load in both tasks, suggesting substantial overlap in the neural mechanisms that underlie these two tasks (Drew et al., 2011).

Given the hypothesis that CDA amplitude serves as an on-line index of the number of items being attended rather than reflecting the number of items that has been attended in a given trial, it is critical to demonstrate that the CDA is sensitive to dynamic changes in the number of items being attended in a given trial. The current study confirms this hypothesis in the MOT domain. This paves the way for future research using the CDA to examine other tasks where attentional load might change during the trial.

We replicated and extended Wolfe et al.'s (2007) finding that participants can “juggle” moving items in and out of the target set with little or no impairment in tracking performance. In the original Wolfe et al. study, targets were added or deleted one by one. Here, participants were asked to simultaneously delete their current target set and acquire new targets while holding fixation and attending lateralized items. Again, there was no noticeable decrement in performance relative to constant set tracking.

Although we view MOT as a more ecologically valid task than many typical cognitive psychology tasks, it would be very unusual to attentively track a fixed set of cars for a long period while driving on a highway. It is much more common to rapidly change what we are tracking as old targets become irrelevant and new ones appear. This behavior is captured in the laboratory with the multiple-object juggling task. Our data and those of Wolfe et al. (2007) suggest that humans are quite good at this task. We can add and drop objects one at a time or drop one set and acquire a whole new set at the same time with equal aplomb. Thus, the multiple-object juggling task may prove a useful way to assess everyday cognition in populations with potential deficits, such as the elderly.

One might question whether these cues might simply be treated as the initiation of a new trial, because the target items in these experiments always changed during switch trials. This might have been the case if our trial types were blocked, but because all trial types were interleaved, it was necessary to track items up until the time when a switch cue could occur. Alternatively, the participants could have chosen to gamble that a given trial would be a switch trial, but such a strategy would result in lower performance on nonswitch trials with the appropriate target set size. For instance, this would predict that Set Size 1 accuracy would be lower than Drop condition accuracy. We found no evidence of such effects. This means that one fundamental difference between the start of the trial and the cue onset for switch trials is that participants were actively tracking either one or three items when the switch cue occurred. When the switch cue occurred, the participant therefore had to both pick up the new items and quickly stop tracking the initial items.

Time Course Information

After controlling for low-level effects in response to the onset of cue information, we documented the time course of attentional switching between moving objects in response to peripheral cues. There is an existing literature on attentional switching, which has focused on the neural response to symbolic switch cues. By focusing on the ERPs evoked by the switch cue, two groups have estimated that cue information is processed between 300 and 700 msec after the cue onset (Brignani et al., 2009; Grent-'T-Jong & Woldorff, 2007). In the current work, we have focused on the shifts engendered by the cue rather than cue processing alone, and the second experiment was specifically designed to account for low-level activity evoked by stimulus onset but not specific to cue processing. In our paradigm, once the switch cue has been encoded, the participant must then switch tracking sets, and our timing estimates are based on examining both the time when the new tracking load is first reflected in the CDA waveform and the time when the switch waveform first differed significantly from the original target load amplitude. These analyses gave a range of time during which the switch process appears to be taking place. Given the many differences between our paradigm and those used in the previous studies, it was surprising that the time course of this process (between 280 and 500 msec) was quite similar to the estimates generated by focusing on cue processing alone.

We did not expect to find that the time course was nearly identical for adding and dropping items, having assumed that switching to three items would take more time than switching to one. However, this is less surprising when we note that there was also no effect of number of targets on the latency of the initial rise of CDA at the start of the trial. Furthermore, our participants showed hardly any behavioral cost of switching tracking load. Future work could explore the possibility of modulating the amount of time it takes to complete the switching process. For instance, the current study measures the time course for changing target load without a concurrent load, but we might expect that this switching process would take longer if the participant had to continue tracking additional objects during a switch condition.

Future Directions

Recent work has shown that the CDA is elicited by a number of lateralized tasks beyond VWM and MOT, including visual search (Woodman & Arita, 2011) and curve tracing (Lefebvre, Jolicoeur, & Dell'Acqua, 2010). The current finding that the CDA is sensitive to dynamic changes in attentional load during a trial provides researchers with a number of advantages over measuring behavioral performance alone. In the MOT domain, participants can fail to track targets in several different ways. A participant could fail to select targets for tracking at the start of a trial, lose track of a target while tracking, or confuse a target with a distractor and begin tracking the wrong object. These effects would be indistinguishable in accuracy measures but would yield different patterns in the ERP data. Failures of target selection (Pylyshyn & Annan, 2006) would be reflected in N2pc amplitude. As the data from the current study show, the CDA is sensitive to decreases in tracking load, so when objects are lost, we should see decreased CDA amplitude, whereas target–distractor swaps would yield relatively stable amplitudes.

Consider the perceptual grouping effects observed by Yantis (1992). He found that there was a significant advantage for tracking items that initially appeared in an easily grouped configuration (such as a square) and suggested that perceptual grouping enables participants to effectively lower the number of items that are being tracked in a given trial. If so, we would expect to observe reduced CDA amplitude. Alternatively, perhaps the effect depends entirely on improved selection. By measuring the N2pc and the CDA, we could determine which phase of the MOT task is affected by perceptual grouping: selection, tracking, or both.

Conclusions

One of the defining properties of working memory is that the contents of this system are constantly changing as new information is encoded and old information is moved to a more consolidated form or simply forgotten. In the current study, we used CDA amplitude and the excellent temporal resolution of ERPs to study this dynamic process. Although previous CDA research has primarily studied the process of encoding new information, here we also studied the process of deleting irrelevant information in the face of new, more relevant information. We have shown quite clearly that the CDA is dynamically sensitive to both increases and decreases in tracking load and have documented for the first time the time course of attentional switching between moving objects in response to peripheral cues. Although a great deal of important recent research on the relationship between attention and working memory has employed fMRI techniques, critical information about these processes may be missed because of the poor temporal resolution of the hemodynamic response. As we move toward studying these processes in more ecologically valid paradigms, it is advantageous to employ techniques that are capable of detecting changes that occur along the same time scale (milliseconds as opposed to seconds) as the processes we are interested in studying. In the current work, we have shown that CDA amplitude reflects changes in the number of items being attended within less than a second and that the time course appears to be relatively stable when picking up one or three new items. We hope that future research will be able to use these techniques to further refine our understanding of the time course of shuttling information in and out of working memory.

Reprint requests should be sent to Trafton Drew, Harvard Medical School, Visual Attention Lab, 64 Sidney St., Suite 170, Cambridge, MA 02139, or via e-mail: tdrew1@rics.bwh.harvard.edu.

REFERENCES

Bisley, J. W., & Goldberg, M. E. (2003). Neuronal activity in the lateral intraparietal area and spatial attention. Science, 299, 81–86.

Brignani, D., Lepsien, J., Rushworth, M. F. S., & Nobre, A. C. (2009). The timing of neural activity during shifts of spatial attention. Journal of Cognitive Neuroscience, 21, 2369–2383.

Drew, T., Horowitz, T. S., Wolfe, J. M., & Vogel, E. K. (2011). Delineating the neural signatures of tracking spatial position and working memory during attentive tracking. Journal of Neuroscience, 31, 659–668.

Drew, T., McCollough, A. W., & Vogel, E. K. (2006). Event-related potential measures of visual working memory. Clinical EEG and Neuroscience, 37, 286–291.

Drew, T., & Vogel, E. K. (2008). Neural measures of individual differences in selecting and tracking multiple moving objects. Journal of Neuroscience, 28, 4183–4191.

Grent-'T-Jong, T., & Woldorff, M. G. (2007). Timing and sequence of brain activity in top–down control of visual-spatial attention. PLoS Biology, 5, 114–126.

Hillyard, S. A., & Galambos, R. (1970). Eye movement artifact in the CNV. Electroencephalography and Clinical Neurophysiology, 28, 173–182.

Ikkai, A., McCollough, A. W., & Vogel, E. K. (2010). Contralateral delay activity provides a neural measure of the number of representations in visual working memory. Journal of Neurophysiology, 103, 1963–1968.

Lefebvre, C., Jolicoeur, P., & Dell'Acqua, R. (2010). Electrophysiological evidence of enhanced cortical activity in the human brain during visual curve tracing. Vision Research, 50, 1321–1327.

Liu, T. S., Slotnick, S. D., Serences, J. T., & Yantis, S. (2003). Cortical mechanisms of feature-based attentional control. Cerebral Cortex, 13, 1334–1343.

McCollough, A. W., Machizawa, M. G., & Vogel, E. K. (2007). Electrophysiological measures of maintaining representations in visual working memory. Cortex, 43, 77–94.

Pylyshyn, Z., & Storm, R. W. (1988). Tracking multiple independent targets: Evidence for a parallel tracking mechanism. Spatial Vision, 3, 179–197.

Pylyshyn, Z. W., & Annan, V. (2006). Dynamics of target selection in multiple object tracking (MOT). Spatial Vision, 19, 485–504.

Scholl, B. J. (2001). Objects and attention: The state of the art. Cognition, 80, 1–46.

Vogel, E. K., & Machizawa, M. G. (2004). Neural activity predicts individual differences in visual working memory capacity. Nature, 428, 748–751.

Vogel, E. K., McCollough, A. W., & Machizawa, M. G. (2005). Neural measures reveal individual differences in controlling access to working memory. Nature, 438, 500–503.

Wolfe, J. M., Place, S. S., & Horowitz, T. S. (2007). Multiple object juggling: Changing what is tracked during extended multiple object tracking. Psychonomic Bulletin & Review, 14, 344–349.

Woodman, G., & Arita, J. T. (2011). Direct electrophysiological measurement of attentional templates in visual working memory. Psychological Science, 22, 212–215.

Yantis, S. (1992). Multi-element visual tracking: Attention and perceptual organization. Cognitive Psychology, 24, 295–340.

Yantis, S. (2008). The neural basis of selective attention: Cortical sources and targets of attentional modulation. Current Directions in Psychological Science, 17, 86–90.

Yantis, S., Schwarzbach, J., Serences, J. T., Carlson, R. L., Steinmetz, M. A., Pekar, J. J., et al. (2002). Transient neural activity in human parietal cortex during spatial attention shifts. Nature Neuroscience, 5, 995–1002.