Gestalt psychology has traditionally ignored the role of attention in perception, leading to the view that autonomous processes create perceptual configurations that are then attended. More recent research, however, has shown that spatial attention influences a form of Gestalt perception: the coherence of random-dot kinematograms (RDKs). Using ERPs, we investigated whether temporal expectations exert analogous attentional effects on the perception of coherence level in RDKs. Participants were presented fixed-length sequences of RDKs and reported the coherence level of a target RDK. The target was indicated immediately after its appearance by a postcue. Target expectancy increased as the sequence progressed until target presentation; afterward, remaining RDKs were perceived without target expectancy. Expectancy influenced the amplitudes of ERP components P1 and N2. Crucially, expectancy interacted with coherence level at N2, but not at P1. Specifically, P1 amplitudes decreased linearly as a function of RDK coherence irrespective of expectancy, whereas N2 exhibited a quadratic dependence on coherence: larger amplitudes for RDKs with intermediate coherence levels, and only when they were expected. These results suggest that expectancy at early processing stages is an unspecific, general readiness for perception. At later stages, expectancy becomes stimulus specific and nonlinearly related to Gestalt coherence.
Expectation is one of several processes that guide attention to visual stimuli (Zhao, Al-Aidroos, & Turk-Browne, 2013). Expectations are shaped by probabilities concerning spatial position, feature, or time (Nobre & Rohenkohl, 2014; Summerfield & de Lange, 2014), which provide modulatory biases that guide perception (Nobre & van Ede, 2018). In the case of expectations concerning time, also referred to as “temporal expectation” or “temporal attention,” temporal structure is used to prioritize and select items for processing (Nobre & van Ede, 2018). Temporal expectations are typically investigated using temporal cueing, rhythms, and foreperiods (Nobre & Rohenkohl, 2014; Summerfield & de Lange, 2014; Summerfield & Egner, 2009; Nobre, 2001). These studies raise the question of whether temporal expectation functions as a temporal analog to direction of attention by spatial expectation (Carrasco, 2018; Nobre & Rohenkohl, 2014).
Temporal expectation likewise influences sensory performance (Nobre & van Ede, 2018; Rungratsameetaweemana, Itthipuripat, Salazar, & Serences, 2018; Burr, Baldassi, Morrone, & Verghese, 2009; Rezec, Krekelberg, & Dobkins, 2003). For example, Coull and Nobre (1998) observed that temporal cueing of “x” or “+” stimuli improved performance in a detection task. However, temporal expectation influences perception later than spatial attention: In contrast to spatial attention, temporal attention modulates the later ERP component N1, but not P1 (Correa & Nobre, 2008; Hackley, Schankin, Wohlschlaeger, & Wascher, 2007; Correa, Lupiáñez, Madrid, & Tudela, 2006; Doherty, Rao, Mesulam, & Nobre, 2005; Griffin, Miniussi, & Nobre, 2002; Miniussi, Wilding, Coull, & Nobre, 1999).
Here, we study the effects of temporal expectation on the organization of dynamic Gestalt stimuli. Gestalt theory historically attributed little importance to the concept of attention (Boring, 1929) and instead relied on mechanisms of figure-ground organization as a mechanism of selection (van Leeuwen et al., 2011). The neo-Gestalt tradition (e.g., Pomerantz, 1981) understood this as implying that perceptual organization and, more specifically, perceptual grouping occur preattentively and that attentional effects occur only after grouping has been achieved (Julesz, 1991; Duncan, 1984; Kahneman & Henik, 1981).
Nonetheless, some behavioral evidence has challenged the notion of preattentive grouping. For example, Rezec et al. (2003) found that manipulating spatial expectations by spatially cueing random-dot kinematograms (RDKs) reduced coherence thresholds in a direction discrimination task. The notion of preattentive grouping would be further undermined in ERPs if expectation effects were found to occur as early as P1. Some evidence suggests that temporal expectation does modulate P1 when combined with spatial orienting (Doherty et al., 2005) and that expectation may influence early stages of perception when certain conditions are met, for example, when task demands are high (Correa et al., 2006). Accordingly, the neo-Gestalt view has since been replaced by the view that principles of grouping operate at multiple stages and at multiple levels (Wagemans et al., 2012), based on neurophysiological evidence for the dynamical organization of perceptual experience (van Leeuwen et al., 2011). The dynamical framework allows for complex interactions between grouping and selection based on attention and/or expectancy. For instance, in MEG, anticipatory (prestimulus) effects of cueing were found in the right V1 and cuneus, as well as early and late effects on evoked activity (Plomp, van Leeuwen, & Ioannides, 2010). Thus, an effect of expectancy on early visual evoked potentials, including P1, might arise.
In this study, we investigate one specific type of perceptual organization process, specifically, the perception of coherent motion. The Gestalt tradition views coherent global motion as the product of common-fate grouping (Stürzel & Spillmann, 2004; Uttal, Spillmann, Stürzel, & Sekuler, 2000; Yuille & Grzywacz, 1988). A recent model of common-fate grouping (Levinthal & Franconeri, 2011) proposes that motion directions across the visual field are processed in parallel based on common fate (direction). This process produces a map of regions, in which each region is assigned a direction. Activation peaks in the selected map correspond to all locations where elements move in the selected direction. An attentional process of feature selection then selects one of those peaks, allowing objects that move in the same direction to be grouped by common fate, therefore appearing to move together. Thus, common-fate grouping involves both an early stage of parallel processing of motion direction and a later stage where attentional selection occurs. In this model, therefore, grouping is an attentional process. Investigating if and when attention influences common-fate grouping is important to reconcile the neo-Gestalt view of preattentive grouping and the contemporary dynamic view of perceptual organization. To that end, here we use ERPs while using temporal expectations to direct attention to motion stimuli.
The typical ERP response to motion stimuli involves three peaks: P1, N2, and P2 (Martin, Huxlin, & Kavcic, 2010; Kuba, Kubová, Kremlácek, & Langrová, 2007; Hoffmann, Dorn, & Bach, 1999; Bach & Ullrich, 1994). The source of the earliest of these components, P1, is the extrastriate cortex (Di Russo, Martínez, Sereno, Pitzalis, & Hillyard, 2002), which includes area V3A, suggesting that P1 is sensitive to motion. N2 is generated in the motion-sensitive visual cortex, including the temporo-occipital and parietal areas (see Kuba et al., 2007, for a review). N2 reflects the coherence level of motion stimuli (Martin et al., 2010; Niedeggen, Hesselmann, Sahraie, & Milders, 2006; Aspell, Tanskanen, & Hurlbert, 2005). For example, the amplitude of the N2 increases linearly with perceived coherence (Niedeggen et al., 2006). Attentional modulations of coherent motion have also been observed in this time window (Niedeggen et al., 2006; Niedeggen, Hesselmann, Sahraie, Milders, & Blakemore, 2004; Niedeggen, Sahraie, Hesselmann, Milders, & Blakemore, 2002), as well as in later visual processing (Kau et al., 2013). No effects of coherent motion on P1 were reported, however.
Despite clear evidence that attention modulates processing of moving stimuli starting from the N2, previous studies on the influence of attention on the perception of coherent motion (e.g., Martin et al., 2010; Niedeggen et al., 2002, 2006) had a number of limitations. First, they employed only a few coherence levels, or even just two: completely random (0% coherence) and completely coherent (100% coherence) RDKs (e.g., Kau et al., 2013). In our study, we employ a full range of coherence levels. This is important because attention may have distinct effects within the range of coherence levels. For example, discrimination of motion direction in 100% coherent RDKs can be performed using only local motion signals, because every dot has the same local motion vector (Cai, Chen, Zhou, Thompson, & Fang, 2014). Intermediate coherence levels, on the other hand, demand global integration to discriminate motion direction.
Second, in some previous paradigms, the effect of temporal attention is contingent on other factors. For example, motion blindness (Niedeggen et al., 2002, 2004) has been interpreted as resulting from a limitation in the ability to redirect attention from one stimulus to another given short SOAs (Niedeggen et al., 2006), which is similar to what occurs in the psychological refractory period (Pashler, 1994). This paradigm imposes demands beyond the selective aspects of attention and might not engage the same mechanisms that operate in the context of the temporal structure of tasks. Such mechanisms are typically observed with manipulations, such as foreperiods or temporal cueing (Nobre & van Ede, 2018). The results of those studies might depend on additional demands. Thus, a demonstration of systematic relationships between expectation and coherent motion perception is still missing.
Furthermore, the relationship between attention and coherent motion is influenced by which feature is task relevant. For example, patterns of hemodynamic activation and ERPs are distinct when participants attend to motion direction and motion coherence (Kau et al., 2013). This might occur because judgment of simple features, such as direction of motion, is supported by cortical areas that are lower in the visual hierarchy than those involved in judgment of motion coherence. This would lead to distinct latencies of modulation by attention. This is especially important for ERP research, given that the effects of attention on early ERP components, such as P1, depend on which feature of the stimulus (e.g., color or motion direction) is attended (Zhang & Luck, 2009).
The goal of our study is to investigate the influence of temporal expectations on the perception of coherent global motion in a parametric stimulus space. To this end, we independently manipulated the coherence levels of RDKs and the amount of participants' attention to motion coherence, both of which were varied in small steps. We collected participants' reports of the perceived coherence of RDKs, as well as ERPs. Two intervals after motion onset were considered, which are defined by two major motion-related ERP components: P1 and N2. This allowed us to construct a common space of neurophysiological and phenomenological responses, which represents gradual relationships between attention and motion coherence, with a level of precision that has not been achieved before.
Considering the literature on ERPs and coherent motion perception, we expected that coherence level would modulate the amplitude of the N2 component. We had no specific hypothesis for effects of coherence on P1. Crucially, considering the model of common-fate grouping by Levinthal and Franconeri (2011) and the view of dynamic organization of perceptual experience (Wagemans et al., 2012; van Leeuwen et al., 2011), we expected to find effects of temporal expectation on Gestalt motion. Thus, we hypothesized that expectation would influence the magnitude of the N2 component, which reflects motion coherence. Finally, given the earlier reports of effects of expectation on the earlier P1 component, we hoped that using a full range of coherence levels and proper task demands would uncover possible effects of expectation on the earlier P1 stage.
A power analysis was performed before data collection to determine the sample size for the study, based on previous studies showing ηp² = .26 for the difference between amplitudes of random and coherent RDKs in the motion-related N2 (e.g., Martin et al., 2010). Assuming an alpha of .05 and a power of .80, this resulted in a sample size of 26. We collected data from three more participants to ensure a sufficient sample size given an anticipated loss of two to three participants due to preprocessing of EEG.
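As a hedged illustration of the effect size entering such a power analysis, the sketch below converts the reported partial eta squared into Cohen's f using the standard relation f = sqrt(ηp² / (1 − ηp²)). The text does not say which software or design settings were used, so this shows only the conversion step and does not attempt to reproduce the reported sample size. The function name is hypothetical.

```python
import math


def cohens_f_from_partial_eta_squared(eta_p2: float) -> float:
    """Convert partial eta squared to Cohen's f: f = sqrt(eta_p2 / (1 - eta_p2))."""
    if not 0.0 <= eta_p2 < 1.0:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(eta_p2 / (1.0 - eta_p2))


# Effect size quoted in the text for random vs. coherent RDKs in the N2.
effect_f = cohens_f_from_partial_eta_squared(0.26)
print(f"Cohen's f = {effect_f:.3f}")
```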
Twenty-nine healthy adults participated in the experiment. All conformed to the following inclusion criteria: no psychiatric or neurological disorders (self-reported), no use of any medication or alcohol before the experiment, and normal or corrected-to-normal vision. Data from two participants were excluded because of excess of EEG artifacts (as described below), leaving 27 participants (10 men) aged 20–30 years (mean = 23.8 years, SD = 2.6 years). All participants provided written informed consent. The study was approved by the ethics committee of the Faculty of Psychology and Educational Sciences of KU Leuven.
Stimuli and Experimental Design
The stimuli were RDKs in which some dots moved coherently in the same direction across frames for the whole stimulus duration (signal dots), whereas others (noise dots) moved in random directions in between frames (Scase, Braddick, & Raymond, 1996). Throughout the entire presentation of the RDK, the same dots were the signal dots, whereas the remaining dots were noise dots (“same rule”; Scase et al., 1996). The percentage of signal dots defined the coherence level of an RDK, which varied between 0% and 100% in steps of 10. To prevent participants from tracking individual dots, all dots had limited lifetimes (Saenz, Buracas, & Boynton, 2002). Dots that drifted beyond the boundary of the RDK field were replotted in a random location in the field to keep the number of dots in the field constant across frames. Noise dots moved following a Random Walk algorithm (Scase et al., 1996), so that, in each frame, noise dots were assigned a random direction while keeping the same speed, instead of being replotted on the screen in each frame. This avoids any possible confounds of systematic sudden onsets on the early ERPs.
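The update rules described above can be sketched as follows. This is a minimal illustration rather than the stimulus code used in the study: the function names, the randomized initial dot ages, the choice not to reset a dot's age when it is replotted at the boundary, and the step size of 13/60 deg per frame (one update per refresh at 60 Hz) are all assumptions.

```python
import math
import random


def random_point_in_disc(radius, rng):
    """Sample a point uniformly inside a disc centred on the origin."""
    r = radius * math.sqrt(rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return r * math.cos(theta), r * math.sin(theta)


def init_rdk(n_dots, coherence, radius, lifetime, rng):
    """Create dots; the first n_signal dots stay signal dots throughout ("same rule")."""
    n_signal = round(n_dots * coherence)
    dots = []
    for i in range(n_dots):
        x, y = random_point_in_disc(radius, rng)
        # Assumption: ages start at random values so dots do not all expire together.
        dots.append({"x": x, "y": y, "signal": i < n_signal,
                     "age": rng.randrange(lifetime)})
    return dots


def step_rdk(dots, direction, step, radius, lifetime, rng):
    """Advance every dot by one frame, in place."""
    for dot in dots:
        dot["age"] += 1
        if dot["age"] >= lifetime:
            # Limited lifetime: the dot is replotted at a random location.
            dot["x"], dot["y"] = random_point_in_disc(radius, rng)
            dot["age"] = 0
            continue
        # Signal dots share one direction; noise dots follow a random walk
        # (new random direction each frame, same speed).
        angle = direction if dot["signal"] else rng.uniform(0.0, 2.0 * math.pi)
        dot["x"] += step * math.cos(angle)
        dot["y"] += step * math.sin(angle)
        if math.hypot(dot["x"], dot["y"]) > radius:
            # Dots leaving the field are replotted so the dot count stays constant.
            dot["x"], dot["y"] = random_point_in_disc(radius, rng)


rng = random.Random(0)
dots = init_rdk(n_dots=720, coherence=0.5, radius=7.35, lifetime=4, rng=rng)
for _ in range(10):  # illustrative number of frames
    step_rdk(dots, direction=0.0, step=13.0 / 60.0, radius=7.35, lifetime=4, rng=rng)
```

Keeping the signal flag attached to each dot, even across replotting, is what distinguishes the "same rule" from schemes that reassign signal and noise dots on every frame.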
RDKs were presented at the center of the screen within a circular region with a radius of 7.35 degrees of visual angle against a uniform gray background (11.2 cd/m² luminance). The following parameters were used for the RDKs: number of dots, 720; dot field area, 678.86 deg²; dot density, 1.06 dots/deg²; dot size, 0.05 × 0.05 deg; dot contrast, 3.6875 (Weber contrast); dot luminance, 52.5 cd/m²; dot speed, 13 deg/sec; dot lifetime, 4 frames. These parameters were kept constant throughout the experiment.
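Two of the listed parameters follow directly from the others, which the short check below makes explicit. It assumes the usual definition of Weber contrast, (dot luminance − background luminance) / background luminance, and takes the dot count and field area exactly as stated. The variable names are illustrative.

```python
# Values taken from the parameter list above.
background_luminance = 11.2  # cd/m^2
dot_luminance = 52.5         # cd/m^2
n_dots = 720
field_area = 678.86          # deg^2

# Weber contrast: luminance increment relative to the background.
weber_contrast = (dot_luminance - background_luminance) / background_luminance

# Dot density: number of dots per unit area of the dot field.
dot_density = n_dots / field_area

print(round(weber_contrast, 4), round(dot_density, 2))
```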
Within a trial, a sequence of RDKs was presented in a rapid serial visual presentation (RSVP) paradigm. A trial comprised one target RDK, pseudorandomly embedded in a sequence of nine distractor RDKs, with the restriction that consecutive RDKs could never have the same coherence level, although two RDKs with equal coherence could be presented in the same trial in nonconsecutive positions. The coherence levels of target and distractor RDKs were drawn from the same uniform distribution of coherence levels. Targets (and distractors) were presented the same number of times in each position across the experiment. Coherence level and direction of motion varied across trials.
A target RDK was indicated by an auditory cue with a frequency of 350 Hz and a duration of 48 msec, which followed the target with an onset jitter of 0–50 msec drawn from a uniform distribution. The sound served as a postcue to direct the participant's attention retroactively to the target.
Participants were seated in front of a 24.1-in. LCD monitor screen (1920 × 1200 resolution, 60 Hz refresh rate, Eizo FlexScan S2410W) at a distance of 70 cm. Stimulus presentation and registration of keyboard responses were performed with custom software programmed in Python using the PsychoPy library (Peirce & MacAskill, 2018; Peirce, 2009). A chinrest was used to control the position of the participant's head.
To manipulate temporal expectation, we applied a method related to the foreperiod technique (Ambinder & Lleras, 2009). We presented trials comprising sequences of 10 RDKs, one of which was a target. The probability of target occurrence in a position (i.e., hazard rate; Nobre & Rohenkohl, 2014) varied across sequences, and each sequence necessarily contained one target. Hence, the probability of any RDK being the target increased as the sequence progressed, given that the target had not yet appeared. Consequently, expectation toward the target gradually increased with the position of the RDK in the sequence until target presentation (Ambinder & Lleras, 2009), after which target expectation was gone. Therefore, by comparing ERPs for RDKs in distinct positions within the sequence, we can examine how increasing expectation influences perception of coherent motion.
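The growth of expectancy across the sequence can be made concrete with a small sketch. It assumes that the target is equally likely to occupy each of the 10 positions (consistent with targets appearing equally often at every position), so that the hazard rate at position k, counted from 0, is P(target at k | no target before k) = 1 / (10 − k). The function name is hypothetical.

```python
from fractions import Fraction


def hazard_rates(n_positions):
    """Hazard rate per position, assuming the target position is uniform.

    P(target at k | no target before k) = (1/n) / ((n - k)/n) = 1 / (n - k).
    """
    return [Fraction(1, n_positions - k) for k in range(n_positions)]


rates = hazard_rates(10)
for position, rate in enumerate(rates):
    print(f"position {position}: hazard = {rate} ({float(rate):.3f})")
```

Under this assumption the hazard rises from 1/10 at the first position to 1 at the last, where a target that has not yet appeared is certain to occur.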
Participants were requested to keep fixating at the center of the screen. They were instructed that, in each trial, a series of stimuli would be presented sequentially, with one of those stimuli being followed by a sound (postcue; Thibault, van den Berg, Cavanagh, & Sergent, 2016). The target was indicated by a postcue because a precue would modify participants' temporal expectation toward a target (Summerfield & de Lange, 2014) and interfere with our manipulation of expectancy through stimulus position.
The task for the participant was to report the coherence level of the target. Participants' responses were collected at the end of each trial. On the response screen, a bar labeled “random” at the left end and “coherence” at the right end was presented. Participants were asked to indicate the perceived coherence level by moving the cursor along the response bar using two arrow keys (one to move the cursor to the left and another to move it to the right) and to confirm their choice by pressing the central down arrow key. Participants had up to 5 sec to respond. In case of timeout, they were requested to respond faster in the following trials. Trials with timeout were discarded from the analysis. This task proved to be demanding in earlier pilot studies. This is relevant because task demands influence expectation effects on perception-related ERP components, in particular P1 (Correa et al., 2006; Handy & Mangun, 2000). Thus, we used a demanding discrimination task with high uncertainty about the target instead of a detection task to ensure that participants needed to keep focusing their attention throughout the trial. By employing a demanding task with target and distractor RDKs that were drawn with equal frequency from the same set of coherence levels (see above), we increased the likelihood of observing attentional effects on P1.
A trial started with a fixation cross at the center of the screen, which lasted for 500 msec (Figure 1). Afterward, the presentation of RDKs began. RDKs were presented for 296 msec, with an ISI jittered between 200 and 250 msec, resulting in an SOA of 496–546 msec. Such stimulus duration and ISI were chosen to reduce overlap between epochs in the time window of interest, because the N2, in which we were interested, peaks at around 200 msec poststimulus (Luck, 2014; Niedeggen et al., 2006). A cue was presented between RDKs during the ISI. The cue's start time was jittered during the ISI, ranging from 50 to 100 msec. An intertrial interval of 500 msec was employed, during which a fixation cross was presented at the center of the screen. Each participant completed 600 trials, for a total of 6000 RDK stimuli. Hence, there were 600 targets at each stimulus position, balanced across 10 levels of coherence; 600 targets at each coherence level, balanced across stimulus positions; and 60 targets of identical coherence at each stimulus position.
A practice session with 20 trials was performed before the EEG session to ensure that participants understood the task. During this practice, participants received feedback on every trial to ensure that they correctly understood the task. The feedback consisted of presenting the participant's response error: the absolute value of the difference between the participant's response and the target's true coherence level.
EEG was continuously recorded throughout the experimental session using a Geodesic Sensor Net with 256 Ag/AgCl electrodes, amplified through a high input impedance Net Amps amplifier (EGI, a Philips company) using the Net Station software. The electrode montage included sensors for recording vertical and horizontal electrooculograms. Data were digitized at a sampling rate of 250 Hz. Impedance was kept below 50 kΩ. All channels were referenced to the vertex electrode (Cz) and were preprocessed online using a low-frequency cutoff of 0.1 Hz and a high-frequency cutoff of 100 Hz.
For the behavioral analysis, we considered two dependent variables: coherence ratings for each coherence level and cue location, and accuracy, quantified as response errors (the absolute difference between coherence rating and true coherence for a given stimulus).
To investigate if participants' sensitivity to differences in coherence level changes across cue locations, we analyzed the slopes of coherence ratings by true coherence level for each cue location. Our hypothesis was that, if expectation increases sensitivity to coherence, slopes of coherence rating by coherence level should be steeper for later than for early cue locations.
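A minimal sketch of this slope analysis is given below, assuming an ordinary least-squares fit of rating on true coherence within each cue location. The function names and the toy trials are invented purely for illustration and are not data from the study.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def slopes_by_cue_location(trials):
    """trials: iterable of (cue_location, coherence_level, rating) tuples."""
    grouped = {}
    for cue, coherence, rating in trials:
        xs, ys = grouped.setdefault(cue, ([], []))
        xs.append(coherence)
        ys.append(rating)
    return {cue: ols_slope(xs, ys) for cue, (xs, ys) in sorted(grouped.items())}


# Hypothetical toy data: ratings track coherence more closely at the later cue location.
levels = range(0, 101, 10)
toy_trials = [(1, c, 30 + 0.4 * c) for c in levels] + [(9, c, 10 + 0.8 * c) for c in levels]
slopes = slopes_by_cue_location(toy_trials)
print(slopes)
```

A steeper slope at a later cue location, as in the toy example, is the pattern the hypothesis predicts if expectation increases sensitivity to coherence.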
In the analysis of accuracy, we compared response errors between cue locations separately for each coherence level. The reason for this was that, because of the fixed length of the response bar, the maximum possible error systematically varies with coherence levels. Additionally, response scales, such as the one employed here, are susceptible to bias by the central tendency of judgment (Hollingworth, 1910). Such bias influences points along the scale to a distinct degree, precluding direct comparisons of accuracy between coherence levels.
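The dependence of the maximum possible error on coherence level can be illustrated as follows. The sketch assumes that the response bar maps linearly onto a 0 to 100 scale, which the text does not state explicitly; the function names are hypothetical.

```python
def response_error(rating, true_coherence):
    """Absolute difference between the rating and the true coherence level."""
    return abs(rating - true_coherence)


def max_possible_error(true_coherence, scale_min=0, scale_max=100):
    """Largest error a response on a bounded scale can produce for a given target."""
    return max(true_coherence - scale_min, scale_max - true_coherence)


# Assumption: the response bar spans 0 to 100 in the same units as coherence.
max_errors = {c: max_possible_error(c) for c in range(0, 101, 10)}
print(max_errors)
```

On such a scale the error ceiling is largest at the extremes (0% and 100%) and smallest at the midpoint, which is one reason errors are compared within, rather than between, coherence levels.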
EEG was analyzed using BrainVision Analyzer 2 software (Brain Products GmbH). The EEG data were filtered with zero phase shift Butterworth filters of the second order with a low cutoff frequency of 0.5 Hz and with a high cutoff of 30 Hz, with a filter slope of 12 dB/oct for both cutoffs and a 50-Hz notch filter. We removed 95 of 256 electrodes on the cheeks and neck, which showed strong muscle artifacts or poor contacts, and retained the data from the remaining 161 electrodes for further analyses.
We visually inspected the EEG channels and excluded those that appeared noisy, as well as those marked as bad during recording by Net Station. We derived the vertical and horizontal electrooculograms, respectively, as the difference between the activity of electrodes placed above and below the eyes and of the ones placed near the right and left outer canthi of the eyes. We segmented the EEG into epochs from −100 to +300 msec relative to stimulus onset. To identify further bad channels, we employed an automatic artifact detection procedure. The following criteria were employed for artifact detection: the absolute voltage difference between two neighboring sampling points exceeded 50 μV, the amplitude fell outside the range of −100 to +100 μV, or the maximal difference in amplitude within an epoch exceeded 100 μV in any channel. If the percentage of excluded epochs for a particular channel exceeded 3%, the channel was removed. Next, we used independent component analysis to correct for oculomotor and other artifacts. The removed channels were interpolated using spherical spline interpolation across the channel set. Then, EEG epochs were submitted to the artifact detection procedure using the same criteria as for the detection of bad channels reported above. Epochs that matched any of the criteria were excluded (on average, 0.87% of epochs). The data were rereferenced to the average reference, baseline-corrected using the interval from −100 to 0 msec relative to stimulus onset, and then averaged across epochs.
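The three rejection criteria and the 3% channel rule can be expressed compactly, as sketched below. The analysis itself was run in BrainVision Analyzer 2; this standalone version is only an approximation of the stated thresholds, with hypothetical function names and a plain-list data layout chosen for illustration.

```python
def epoch_is_artifact(samples_uv, max_step=50.0, abs_limit=100.0, max_range=100.0):
    """Return True if a single-channel epoch (in microvolts) violates any criterion."""
    # Criterion 1: voltage jump between neighboring samples exceeds max_step.
    if any(abs(b - a) > max_step for a, b in zip(samples_uv, samples_uv[1:])):
        return True
    # Criterion 2: amplitude falls outside [-abs_limit, +abs_limit].
    if any(abs(v) > abs_limit for v in samples_uv):
        return True
    # Criterion 3: peak-to-peak difference within the epoch exceeds max_range.
    return max(samples_uv) - min(samples_uv) > max_range


def bad_channels(epochs_by_channel, max_bad_fraction=0.03):
    """Channels whose share of flagged epochs exceeds max_bad_fraction.

    epochs_by_channel: dict mapping channel name -> list of epochs,
    each epoch being a list of voltages in microvolts.
    """
    bad = []
    for channel, epochs in epochs_by_channel.items():
        n_bad = sum(epoch_is_artifact(epoch) for epoch in epochs)
        if epochs and n_bad / len(epochs) > max_bad_fraction:
            bad.append(channel)
    return bad


# Hypothetical toy data: one clean channel and one with a large voltage jump.
toy = {
    "clean": [[0.0, 5.0, 10.0, 5.0]] * 10,
    "noisy": [[0.0, 5.0, 10.0, 5.0]] * 9 + [[0.0, 80.0, 10.0, 5.0]],
}
flagged = bad_channels(toy)
print(flagged)
```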
We focused on ERP components P1 and N2 related to motion perception (Martin et al., 2010; Kuba et al., 2007; Hoffmann et al., 1999; Bach & Ullrich, 1994). We did not consider C1, although it has been argued to be modulated by attention (e.g., Ding, 2018; Kelly & Mohr, 2018; Slotnick, 2018). The reason was that C1 is generated in V1 (Di Russo et al., 2002; Clark, Fan, & Hillyard, 1994), whereas the earliest visual cortex sensitive to motion is V3A (Bartels, Zeki, & Logothetis, 2008; Tootell et al., 1997).
We chose time windows for the analysis based on inspection of the ERPs grand-averaged across all participants and conditions. We selected a time window for P1 from 120 to 160 msec after RDK onset and for the N2 from 160 to 200 msec after RDK onset and used the mean amplitude in these windows in the analyses. For the analysis of P1 and N2, 30 electrodes were selected over the parietal and occipital areas (Figure 4A). The selection was based on a priori hypotheses to maximize statistical power (Luck, 2014; Groppe, Urbach, & Kutas, 2011). Specifically, we selected electrodes over which the effects of attention on motion perception were previously observed (Martin et al., 2010; Niedeggen et al., 2004, 2006). We selected 12 electrodes around P3 and P4 and 14 electrodes around O1, Oz, and O2 electrodes of the International 10–20 System of Electrode Placement. Then, ERPs were averaged over the 30 electrodes because regional averaging collapses covarying measurements. The resulting averages provide a better fit to the ANOVA model than the individual sensor data. Thus, averaging yields a more reliable estimate of the activity in a region than a single electrode does (Dien & Santuzzi, 2005).
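The mean-amplitude measure can be sketched as follows, assuming epochs that start at −100 msec and are sampled at 250 Hz (one sample every 4 msec), as described for the recording and segmentation. Treating both window boundaries as inclusive is an assumption, as are the function names and the toy epoch.

```python
def mean_amplitude(epoch_uv, window_ms, epoch_start_ms=-100.0, sampling_rate_hz=250.0):
    """Mean voltage of the samples whose latency lies within window_ms (inclusive)."""
    step_ms = 1000.0 / sampling_rate_hz
    start_ms, end_ms = window_ms
    values = [v for i, v in enumerate(epoch_uv)
              if start_ms <= epoch_start_ms + i * step_ms <= end_ms]
    if not values:
        raise ValueError("window lies outside the epoch")
    return sum(values) / len(values)


def regional_average(epochs_by_electrode):
    """Average equally long single-electrode waveforms sample by sample."""
    waveforms = list(epochs_by_electrode.values())
    return [sum(samples) / len(samples) for samples in zip(*waveforms)]


# Hypothetical toy epoch: the "voltage" equals the latency in msec, so the mean
# over a window is simply the window midpoint.
toy_epoch = [-100.0 + 4.0 * i for i in range(101)]  # -100 ... +300 msec
p1_mean = mean_amplitude(toy_epoch, (120.0, 160.0))
n2_mean = mean_amplitude(toy_epoch, (160.0, 200.0))
print(p1_mean, n2_mean)
```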
For the behavioral analysis, we focused on two factors: (1) coherence level, the percentage of dots moving in the same direction, ranging from 0 to 100 in steps of 10, and (2) cue location, the temporal location of the cue in the RSVP stream, from 1 to 9. Cue location 0 was excluded because it was preceded by a longer prestimulus interval and therefore was qualitatively distinct from other locations.
For the analysis of ERPs, we first examined the effects of expectation with the following factors as fixed effects: (1) expectancy versus postexpectancy condition, all epochs within a trial that were presented before the sound cue (expectancy, when participants expected a target) versus all epochs after the sound cue (postexpectancy, when no expectancy was present), and (2) stimulus position, the position of an epoch in the RSVP stream, from 1 to 9 (we excluded Position 0 because it was only present in the expectancy condition and was not affected by the overlap from previous events, making it qualitatively different from other positions). Afterward, we assessed the effects of expectation on the perception of coherence, using the following factors: (1) expectancy versus postexpectancy condition, as above, and (2) coherence level, the percentage of dots moving in the same direction, ranging from 0 to 100 with a step of 10.
For statistical analysis, we built linear mixed models with participants as a random effect. We ran ANOVAs on the models to investigate the effects of the factors above, reporting likelihood ratios for all tests. An alpha of 5% was adopted for all significance tests.
The responses to the coherence of the target RDK (coherence ratings) are presented in Figure 2. Participants were able to discriminate between targets with distinct coherence levels (Figure 2A). An ANOVA with Coherence Rating as dependent variable and factors of Coherence Level and Cue Location showed an increase in Coherence Rating with Cue Location, F(9, 189) = 2.83, p = .004; an increase in Coherence Rating with Coherence Level, F(10, 210) = 64.98, p < .001; and an interaction between Cue Location and Coherence Level, F(90, 1890) = 2.22, p < .001. This interaction is illustrated in Figure 2B and C, which show the dependency of Coherence Rating on coherence level for each cue location. The slopes of linear fits for later cue locations were steeper than for earlier ones (Figure 2C and D).
To further investigate whether discrimination improved with expectancy, we compared response errors using an ANOVA including only Cue Location as factor, because errors are not comparable between coherence levels (see Behavioral Analysis section). The ANOVA revealed a significant effect of Cue Location, F(8, 25) = 6.63, p < .001. Planned comparisons with linear contrasts for each coherence level showed that cue location had an effect (p < .001) only for coherence levels 0, 10, and 100. As seen in Figure 3, error decreases with cue location only for stimuli with high or low coherence, but not for those with intermediate coherence.
Because coherence level was only reported at the end of the trial, it was possible that the reports were affected by memory. Two types of memory effects might be expected. First, because the interval between the target RDK and the response decreases with cue location, the target representation was maintained in memory longer for early than for late targets. Second, earlier targets were followed by a larger number of distractor RDKs than later ones, which might have interfered with memory for the target. Possible memory effects on coherence ratings would be reflected in the variability of responses. In the case of memory decay, variability arises due to buildup of internal noise in the memory system (Nilsson, 2020; Donkin, Nosofsky, Gold, & Shiffrin, 2015). This happens because stochastic variability of neural firing leads, over time, to desynchronization of the firing of the feature bundles that build a representation (Jonides et al., 2008). Furthermore, the presence of distractors also increases variability of the memory representation (Marini, Scott, Aron, & Ester, 2017). Thus, a possible contribution of memory to the results may be assessed by comparing response variability between cue locations: If memory has an effect on the results, the variability of coherence ratings should be larger for early than for later cue locations. Therefore, to rule out the possibility that the behavioral results are affected by memory, we computed the standard deviation of responses at each cue location. An ANOVA with Cue Location (10 levels) as a factor and Standard Deviation of responses as dependent variable showed no significant differences between target locations, F(9, 26) = 0.58, p = .81. Hence, we conclude that the current results cannot be attributed to memory.
In the following sections, we first describe the effects of expectation on P1 and N2 amplitudes. Then, we describe how those effects interact with coherence levels to address our main question about influence of expectation on Gestalt perception of motion.
Effects of Expectation
Figure 4 shows the grand-averaged ERPs and maps for expectancy and postexpectancy conditions for different stimulus positions.
To test whether our manipulation of expectation had an effect on P1, we compared the P1 amplitudes between the expectancy and postexpectancy conditions for nine stimulus positions. An ANOVA with the factors of Expectancy (two levels) and Stimulus Position (nine levels) showed that the P1 amplitude was significantly larger in the expectancy than in the postexpectancy condition, F(1, 25) = 51.71, p < .001. We also found an effect of Stimulus Position, with P1 amplitudes increasing with stimulus position, F(8, 25) = 105.57, p < .001. The interaction between Stimulus Position and Condition was not significant, F(8, 25) = 1.35, p = .21 (Figure 5A).
The same analysis on the N2 amplitude showed a significant effect of Expectancy Condition, with smaller N2 amplitudes for expectancy stimuli than postexpectancy stimuli, F(1, 25) = 60.46, p < .001. The effect of Stimulus Position was also significant, F(8, 25) = 196.09, p < .001. In contrast to the P1 results, we found an interaction between Expectancy Condition and Stimulus Position, F(8, 25) = 10.72, p < .001, due to a difference between the expectancy and postexpectancy conditions at the early but not late stimulus positions (Figure 5B).
To examine the fine distribution of ERP amplitudes depending on stimulus position and cue location, we built matrices of P1 (Figure 6A) and N2 (Figure 6B) amplitudes by Stimulus Position × Cue Location. For both P1 and N2, we observed a clear diagonal trend: The amplitude changed from early to later epochs and from early to later positions of target occurrence. In the follow-up analysis, we compared P1 and N2 amplitudes between two diagonals: one corresponding to the last expectancy epochs before the cue (the targets) and another corresponding to the first postexpectancy stimulus after the cue, for each level of stimulus position from 1 to 9. The means and linear fits for each diagonal are shown in Figure 6C and D.
An ANOVA on the P1 amplitude with Expectancy Condition (represented by the diagonals in this model) and Stimulus Position as factors showed an effect of Expectancy Condition, F(1, 25) = 24.66, p < .001; an effect of Stimulus Position, F(8, 25) = 32.19, p < .001; and no interaction, F(8, 25) = 1.97, p = .06, although the slope of the linear fit was smaller in the expectancy (β = 0.103) than the postexpectancy condition (β = 0.146; Figure 6C). For the N2, we also observed an effect of Expectancy Condition, F(1, 25) = 82.99, p < .001, and an effect of Stimulus Position, F(8, 25) = 33.04, p < .001. However, in contrast to P1, there was an interaction, F(8, 25) = 3.35, p = .001, also with a smaller slope in the expectancy (β = 0.124) than the postexpectancy condition (β = 0.186; Figure 6D). For both P1 and N2, the slopes converged, that is, the effect of expectation on ERP decreased with the stimulus position.
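The slope comparison above can be illustrated with an ordinary least-squares fit of amplitude on stimulus position. The sketch below is an assumption-laden illustration: the source states only that linear fits were used, the helper `ols_slope` is hypothetical, and the amplitude values are constructed for the example rather than taken from the study.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x (single predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


positions = list(range(1, 10))  # Stimulus Positions 1 to 9

# Hypothetical mean amplitudes (NOT study data), built as exact lines:
# a higher but shallower expectancy line and a lower, steeper postexpectancy line.
expectancy = [1.0 + 0.10 * p for p in positions]
postexpectancy = [0.5 + 0.15 * p for p in positions]

slope_exp = ols_slope(positions, expectancy)
slope_post = ols_slope(positions, postexpectancy)

# Difference between conditions at the first and last stimulus position.
gap_first = expectancy[0] - postexpectancy[0]
gap_last = expectancy[-1] - postexpectancy[-1]
```

With a shallower slope in the higher-amplitude condition, the gap between the two lines shrinks from the first to the last position, which is what convergence of the slopes means here.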
To investigate the irregular increment with stimulus position visible in the curves (Figure 6C and D), we conducted a post hoc analysis between adjacent stimulus positions within each expectancy condition by testing consecutive contrasts (a t test adjusted for multiple comparisons with the Tukey correction). The results were similar for P1 and N2. Only in the postexpectancy condition did we find significant differences: between Stimulus Positions 4 and 5, t(182) = 2.93, p = .02, for P1, and between Positions 3 and 4, t(182) = 3.02, p = .02, and Positions 4 and 5, t(182) = 2.90, p = .03, for N2. In the expectancy condition, none of the comparisons were significant.
Effects of Motion Coherence
Stimulus positions were aggregated for this analysis. A linear mixed-model ANOVA with the factors of Coherence Level (11 levels) and Expectancy Condition (two levels) showed an effect of Coherence on P1 amplitude, F(1, 25) = 51.58, p < .001: The amplitude decreased with increasing coherence level (Figure 7A). There was an effect of Expectancy Condition, F(1, 25) = 23.88, p < .001; however, Expectancy Condition and Coherence did not interact, F(1, 25) = 0.75, p = .68, in contrast with the N2 results (see below). To characterize the effect of coherence on P1, we tested the linear, quadratic, and cubic contrasts for coherence levels separately for the expectancy and postexpectancy conditions. Only the linear contrast showed significant results in both the expectancy and postexpectancy conditions (Table 1).
| Condition | Degree | Estimate | SE | t Ratio | p |
Values in bold indicate results significant at the .05 level.
The same ANOVA on N2 amplitude also showed an effect of Coherence, F(1, 25) = 6.67, p < .001, and an effect of Expectancy Condition, F(1, 25) = 59.76, p < .001 (Figure 7B). In contrast to the P1 results, there was an interaction between Coherence and Expectancy Condition, F(1, 25) = 2.54, p = .004. The contrast analysis revealed a significant result for the linear contrast for both the expectancy and postexpectancy conditions. The quadratic contrast showed a significant result for the expectancy condition only (Table 2).
| Condition | Degree | Estimate | SE | t Ratio | p |
Values in bold indicate results significant at the .05 level.
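The linear, quadratic, and cubic contrasts over the 11 coherence levels can be sketched with orthogonal polynomial weights. This is a minimal illustration assuming equally spaced levels and unnormalized weights; the scaling used by the authors' software is not stated in the text, the function names are hypothetical, and the toy means are not study data. Standard errors and t ratios from the mixed model are not computed.

```python
def poly_contrasts(n_levels, max_degree=3):
    """Orthogonal polynomial contrast weights for equally spaced levels.

    Returns a list of weight vectors (degree 1 up to max_degree),
    obtained by Gram-Schmidt orthogonalization of centered powers.
    """
    center = (n_levels - 1) / 2
    x = [i - center for i in range(n_levels)]
    basis = [[1.0] * n_levels]  # degree-0 (constant) vector
    for degree in range(1, max_degree + 1):
        v = [xi ** degree for xi in x]
        # Remove the projection onto every lower-degree vector.
        for b in basis:
            coef = sum(vi * bi for vi, bi in zip(v, b)) / sum(bi * bi for bi in b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        basis.append(v)
    return basis[1:]


def contrast_estimate(weights, means):
    """Weighted sum of condition means for one contrast."""
    return sum(w * m for w, m in zip(weights, means))


linear, quadratic, cubic = poly_contrasts(11)

# Hypothetical condition means (NOT study data): a symmetric inverted U
# across the 11 coherence levels, peaking at the middle level.
toy_means = [2.0 - 0.1 * (level - 5) ** 2 for level in range(11)]

lin_est = contrast_estimate(linear, toy_means)
quad_est = contrast_estimate(quadratic, toy_means)
cub_est = contrast_estimate(cubic, toy_means)
```

For a purely quadratic profile such as the toy means, only the quadratic estimate differs from zero. A profile that changes steadily with coherence would instead load on the linear contrast, which is the distinction drawn between the P1 and N2 results.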
We investigated how expectation influences the Gestalt perception of motion stimuli and its neural correlates. Participants were presented with series of RDKs, one of which was signaled as a target by an auditory postcue. RDK coherence varied randomly across 11 levels. The participants' task was to report the coherence level of the target RDK. By varying the position of the auditory postcue within the RDK series, we manipulated participants' temporal expectation toward the target: Expectation gradually increased with the stimulus position until the cue and then suddenly dropped afterward. To investigate the time course of influences of expectation on the neural correlates of Gestalt motion perception, we compared mean amplitudes for the P1 and N2 ERP components evoked by RDKs with different coherence levels and in distinct locations within the stimulus sequence.
The behavioral results indicate that coherence ratings changed with the position of the target RDK within the RDK stream. Crucially, the direction of this change depended on coherence level: Coherence ratings increased with cue location for high coherence levels and decreased with cue location for low coherence levels (Figure 2B). Consequently, slopes of coherence rating by coherence level were larger for later cue locations than for early cue locations. Analysis of response errors mirrored the results for coherence rating: Errors decreased with cue location, but only for extreme coherence levels (Figure 3). This indicates that temporal expectations influence lower and higher coherence stimuli to a larger degree than intermediate-coherence stimuli. Thus, the behavioral results reveal that expectation (which increases with cue location) facilitates judgment of some coherence levels but not others.
Effect of Expectation on ERPs
Before the cue, target expectation gradually builds up with stimulus position within a trial (a series of 10 RDKs) and more or less abruptly decreases after the cue. Because P1 amplitude and N2 amplitude generally increase with attention (Hillyard, Vogel, & Luck, 1998), such a manipulation of expectation should be manifested in amplitude changes in the ERPs in the expectancy condition only, not in the postexpectancy condition. However, the amplitudes of both P1 and N2 became more positive with stimulus positions in both the expectancy and postexpectancy conditions. This effect is evident in the matrices of amplitude by cue location versus stimulus position shown in Figure 6A and B and was confirmed statistically in the analysis of diagonal slices of ERP amplitudes at the boundary between the expectancy and postexpectancy conditions within the matrices. Notably, the size of this effect is larger than that of coherence, as demonstrated by the different scales of y-axes in Figures 6 and 7.
Such similar patterns of changes in the expectancy and postexpectancy conditions are unlikely to be the consequence of our manipulation of expectation. Instead, we propose that they are related to an increase in cortical excitability, as a result of a rhythmic presentation of attended stimuli (Mathewson et al., 2012; Schroeder & Lakatos, 2009). However, whereas the amplitude of P1 gradually increased, the amplitude of N2 (a negative component) gradually decreased with stimulus position (Figure 6C and D). This suggests that different processes are associated with changes of these components. Whereas P1 may reflect increasing excitability, N2 may reflect increasing adaptation. The N1 component, which has a similar latency to the motion N2, is known to have a refractory nature: N1 amplitude decreases in response to repeated stimuli at the same attended location (Luck, 2014). Thus, the predominant P1 and N2 changes with stimulus position may reflect a background effect of the RSVP on the brain state, which occurs both with and without expectation.
Nevertheless, both P1 and N2 were sensitive to expectation. This is evident from the difference in P1 and N2 amplitudes between the expectancy and the postexpectancy conditions in the analysis of matrix diagonal slices. We propose that the observed effect of temporal expectations is supported by a modulation of cortical excitability by rhythmic stimulation (Schroeder & Lakatos, 2009) in both expectancy conditions. Indeed, the entrainment of oscillations to a rhythmic structure of sensory inputs functions as a mechanism supporting temporal expectation and attention (Cravo, Rohenkohl, Wyart, & Nobre, 2013; Mathewson et al., 2012; Schroeder & Lakatos, 2009).
Notably, the linear fits for the expectancy and the postexpectancy conditions converge as stimulus position increases for both P1 and N2. Statistically, such convergence corresponds to an interaction between expectancy condition (expectancy vs. postexpectancy) and stimulus position. For P1, the interaction was not significant: At this stage, expectation simply boosts perceptual analysis, irrespective of stimulus position. This may be interpreted as a moderation of sensory gain processing by expectation (Luck & Kappenman, 2012; Hillyard et al., 1998), which does not vary with stimulus probability.
For N2, the interaction occurs due to the steeper amplitude decrease with stimulus position in the postexpectancy condition compared with the expectancy condition. The steeper decrease in the postexpectancy condition is accompanied by stepwise changes between adjacent stimulus positions. In the first half of the trial, stepwise changes occur for both P1 and N2 until Stimulus Position 5, where the amplitude curves almost intersect (Figures 6C and D). This suggests that the effect of expectation on ERP reaches a ceiling in the middle of the trial. Similar ceiling effects were observed in behavioral studies exploring temporal orienting of attention in the “attentional awakening” (Ambinder & Lleras, 2009; Ariga & Yokosawa, 2008): Performance in a discrimination task first increases with the position of the stimulus in a rhythmic sequential presentation and then reaches an asymptote. The authors propose that this effect arises due to the time it takes to synchronize internal attentional oscillations to the rhythmic stimulation so as to maximize perceptual processing by “attentional pulses” entrained to the stimuli (Large & Jones, 1999) and show that the magnitude of this effect is modulated by foreperiod expectation (Ambinder & Lleras, 2009). Here, we show a similarly shaped trend at the neural level when expectation is present, suggesting that similar mechanisms underlie both patterns of results.
Previous reports on the effect of expectation on P1 have been inconsistent. Whereas some studies found such an effect (Rohenkohl, Gould, Pessoa, & Nobre, 2014; Correa et al., 2006), other studies indicated that P1 is not affected by temporal expectation (Correa & Nobre, 2008; Doherty et al., 2005; Griffin et al., 2002; Miniussi et al., 1999). The effect of expectation on P1 in our study might arise from the distinctive features of our experimental design. First, our experiment employed postcueing of RDKs in an RSVP paradigm, which distinguishes it from more common manipulations of temporal attention and expectation, such as variations of foreperiods (Correa & Nobre, 2008). Different manipulations of expectancy lead to distinct effects on ERPs (Nobre & Rohenkohl, 2014). Furthermore, moderation of P1 amplitude by expectation may depend on the perceptual demands of the task (Correa et al., 2006), which in our experiment were high. Finally, in contrast to other experiments (e.g., Martin et al., 2010; Aspell et al., 2005), coherence level was a task-relevant feature in our experiment. Attending to the coherence level leads to activation of higher tier parietal areas, which are involved in the integration of motion components, compared with attending to other motion features, such as speed (e.g., Kau et al., 2013). Thus, one possibility is that the elevated cortical excitability, the difficulty of the task, and the task relevance of the coherence level shift visual processing in our experiment to earlier stages. Conversely, the finding that expectation reduces the amplitude of the N2 reproduces the pattern observed in previous studies (Seibold & Rolke, 2014; Hackley et al., 2007; Correa et al., 2006; Doherty et al., 2005; Lange, Rösler, & Röder, 2003).
Effect of Motion Coherence on ERPs
P1 amplitude decreases linearly with increasing coherence level (Figure 7A). This indicates that processing of motion coherence is reflected in P1. The dependence of ERP amplitude on coherence level in the P1 time window (120 to 160 msec after RDK onset) indicates that this time is sufficient to process motion coherence. This is earlier than previously reported: Motion coherence is typically reflected in MEG/EEG about 200 msec after motion onset (Kau et al., 2013; Martin et al., 2010; Niedeggen et al., 2006; Aspell et al., 2005). This latency shift may occur because the elevated excitability, as proposed above, increases sensitivity to motion coherence.
In contrast to the linear relationship for P1, the relationship between N2 amplitude and coherence level is approximately linear in the postexpectancy condition only. In the expectancy condition, a quadratic fit adequately describes the dependence of N2 amplitude on coherence level (although a linear trend is significant and is clearly visible for the coherence levels from 0 to 50 in Figure 7B). This finding indicates particular visual processing of intermediate coherence levels, in line with a TMS study that found a larger effect of TMS applied to the MT+ area on perception of RDKs of intermediate coherence than of fully coherent RDKs (Cai et al., 2014). However, it contradicts previous results showing that the ERP amplitude at a latency of 200 msec increases linearly with increasing coherence (Niedeggen et al., 2006; Aspell et al., 2005; Nakamura et al., 2003). Differences in design make it difficult to compare our results with those studies. Whereas our participants discriminated coherence level, Aspell et al. (2005) asked their participants to discriminate the direction of motion. In Nakamura et al.'s (2003) study, participants watched a fixation point continuously without performing any other task. Compared with our stimuli, Niedeggen et al. (2006) and Nakamura et al. (2003) employed RDKs with much larger areas and shorter (Niedeggen et al., 2006) or longer (Nakamura et al., 2003) durations. Large stimuli reduce thresholds for coherent motion detection (Morrone, Burr, & Vaina, 1995) and facilitate center-periphery interactions in perception of coherence (Habak, Casanova, & Faubert, 2002). RDK duration interacts with coherence level in tasks where subjects need to discriminate RDK direction, producing shifts in accuracy that vary with coherence level (Pilly & Seitz, 2009). It is possible that these differences are responsible for the discrepancies with our results.
Effect of Expectation on the Perception of Coherent Motion
Our key findings concern relationships between expectancy conditions and coherence level for P1 and N2. For P1, we found a main effect of Expectation but no interaction between Expectation and Coherence Level. Conversely, for N2 we found both a main effect of Expectation and an interaction (Figure 7). The distinct effects of expectation on P1 and N2 allow us to dissociate two stages of coherent motion perception.
The first stage, indicated by P1, includes processing of physical and Gestalt features of the motion stimulus, as indicated by the gradual brain responses to the gradual changes in the stimulus property, that is, the coherence level. Although expectation does affect the P1 stage, as evidenced by the main effect of Expectancy Condition (Figure 7A), the effect of expectation at this stage, instead of reflecting selective attention, may be understood as a general readiness for perception (Nobre & Rohenkohl, 2014; Serences & Kastner, 2014) in the visually demanding RSVP paradigm. The enhanced readiness in the expectancy condition may result in higher arousal, which increases P1 amplitude (Vogel & Luck, 2000). This effect is nonspecific in the sense that it does not depend on the configuration of the stimulus and thus does not vary with coherence level. In our design, this general readiness increases along a trial sequence and leads to linear changes in P1 amplitude with stimulus probability.
At the second stage, indicated by N2, the effect of expectation becomes more specific, deploying selective attention instead of general readiness mechanisms. At this stage, expectation interacts with the processing of motion coherence. This disturbs the linear relationships between stimulus properties and brain responses observed at the first, P1 stage. In the expectancy condition, those relationships become nonlinear, as indicated by the quadratic fit of the dependence of the N2 amplitude on coherence level (Figure 7B).
Remarkably, the U-shaped N2 dependence mirrors the behavioral observation: An inverted U-shaped dependence was also observed for the response errors, suggesting that expectation influences lower and higher coherence stimuli to a larger degree than intermediate ones (Figure 3). Because N2 amplitudes are smaller for the lower and higher coherence stimuli (Figure 7B), the lower N2 amplitudes are associated with larger susceptibility to modulatory consequences of expectation. In summary, we found that temporal expectation modulates perception of Gestalt motion in a nonlinear fashion.
What may be the mechanism underlying such nonlinear modulation? On the one hand, it may occur because perceptual processing differs between lower and higher coherence stimuli and intermediate-coherence stimuli. Indeed, random (i.e., 0% coherent) RDKs are easy to judge because they do not lead to coherent motion at all, whereas 100% coherent RDKs can be perceived using only local motion vectors (Cai et al., 2014). However, at intermediate levels, perception of dots moving in a single direction among randomly moving dots involves both integration of signal motion and segregation of noise (Husk, Huang, & Hess, 2012). The expectation-related enhancement of coherence discrimination for intermediate levels may be reduced because of this competition between integration and segregation. Previous studies have suggested that integration and segregation interact differently with attention, for example, that figure-ground segregation demands attention when there is competition for figural assignment (Rashal, Yeshurun, & Kimchi, 2017; Kimchi, 2009). This may explain the nonlinear modulation of the response errors (Figure 3) and N2 results (Figure 7B).
On the other hand, the U-shaped dependency observed for N2 may result from a combination of N2 reduction by the increase in temporal expectancy and N2 enhancement by discrimination difficulty for attended stimuli. Particularly, N2 amplitude was generally reduced in the expectancy condition as compared with the postexpectancy condition (Figure 7B). Intermediate-coherence stimuli may require larger perceptual resources to separate signal from noise in judgment of coherence, and they may be harder to discriminate. The visual N1 component, which has the same latency as the motion N2 explored here, has been proposed as an index of a general-purpose discrimination process (Hopf, Vogel, Woodman, Heinze, & Luck, 2002; Vogel & Luck, 2000). Effects of spatial attention on the visual N1 are larger for difficult discriminations than for easy discriminations (Parks, Beck, & Kramer, 2013; Fu et al., 2008; Handy & Mangun, 2000). In the auditory domain, temporal orienting leads to larger N1 amplitudes for difficult discrimination (Lange & Schnuerch, 2014). We may observe a similar effect in our experiment that occurs only in the expectancy condition, because stimuli in the postexpectancy condition are not attended. A combination of reduced N2 amplitudes by temporal expectancy and increased N2 amplitudes for stimuli that are harder to discriminate should result in the U-shaped N2 pattern in the expectancy condition. However, because we cannot compare the performance between coherence levels directly, we are not able to decisively distinguish between the first and second mechanisms.
Modulations by grouping difficulty of visual ERPs with a similar scalp distribution and latency have been described for other types of grouping. Han (2004) investigated grouping by proximity and by similarity when cues for each type of grouping were congruent (easy condition) or incongruent (difficult condition). Larger N2 amplitudes were revealed for grouping by similarity when grouping by proximity was congruent with similarity, compared with when it was incongruent. Villalba-García, Santaniello, Luna, Montoro, and Hinojosa (2018), studying shape similarity versus proximity grouping, reported a similar N2 modulation by congruence. They suggested that the N2 effect reflects a difference in difficulty (“processing fluency”) or visual salience of grouping. This is in line with the interpretation that N2 amplitude reflects a difficulty in discrimination of coherence in RDKs.
In summary, we found that expectation modulates the perception of Gestalt global motion. These results support the view that perception of motion Gestalts is not purely preattentive (Duncan, 1984; Julesz, 1981; Kahneman & Henik, 1981). Instead, an interesting transition is found between two distinct processing stages. Specifically, directing attention through expectancy does not initially (i.e., within the P1 time window) interact with the organization of the stimulus, represented by coherence level in our study. Afterward, a switch occurs from a preattentive to an attentive stage of Gestalt perception, which in our experiment corresponds to the transition from the P1 to the N2 component, occurring about 160 msec after motion onset. This timing is distinct from that observed for other types of stimuli and Gestalt percepts; for example, for perceptual grouping by similarity and proximity, attention influences grouping around 100 msec earlier (Nikolaev, Gepshtein, Kubovy, & van Leeuwen, 2008; Han, Jiang, Mao, Humphreys, & Qin, 2005). The timing of this transition suggests that attention is not deployed at a fixed time after stimulus presentation, but rather only after completion of an initial processing stage. Because coherent motion is processed in the MT area (Saproo & Serences, 2014; Hesselmann, Kell, & Kleinschmidt, 2008; Rees, Friston, & Koch, 2000), which is relatively high in the visual hierarchy, it is likely that the deployment of selective attention to coherent motion processing also occurs later than in the case of grouping by similarity and proximity, which is associated with V1, V2, and the lateral occipital areas (Altmann, Bülthoff, & Kourtzi, 2003).
Moreover, attention may have distinct effects within the range of coherence levels because of increasing nonlinearity across the hierarchy of the visual system (Norcia, Appelbaum, Ales, Cottereau, & Rossion, 2015). At lower levels in the hierarchy, visual processing leading to perception of coherent motion may only slightly deviate from linearity. Correspondingly, attentional effects here may be straightforward. However, when the result of low-level processing propagates to higher levels, further nonlinear operations are applied to the signal (Alp, Nikolaev, Wagemans, & Kogo, 2017), and attentional influences may become nonadditive. Therefore, the high level of coherent motion processing may explain the observed nonlinear character of its interaction with attention.
A number of psychophysical and ERP studies have established different dynamics for different types of grouping. For example, grouping by proximity is achieved earlier than grouping by similarity, which involves more complex processing (Villalba-García et al., 2018; Rashal et al., 2017; Kimchi, 2009; Han, 2004; Ben-Av & Sagi, 1995). Similarly, our results show that common-fate grouping of coherently moving dots has a particular time course that is distinct from that of other types of grouping. These findings support the current view that perceptual grouping is not a unitary construct and that the operations underlying the various types of grouping have to be studied case by case (Wagemans, 2018; Rashal et al., 2017; Kimchi, 2009).
We would like to thank Christophe Bossens for his help with programming the experiment and with lab setup, and we are grateful to Vera Eymann for her help in data processing.
Reprint requests should be sent to Alexandre de P. Nobre, Department of Developmental and Personality Psychology, Universidade Federal do Rio Grande do Sul, Room 227, Rua Ramiro Barcelos, 2600, Bairro Santa Cecília, Porto Alegre, Rio Grande do Sul, Brazil, 90035-003, or via e-mail: firstname.lastname@example.org.
Alexandre de P. Nobre: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Project administration; Software; Visualization; Writing—Original draft; Writing—Review & editing. Andrey R. Nikolaev: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Project administration; Visualization; Writing—Original draft; Writing—Review & editing. Gustavo Gauer: Conceptualization; Resources; Supervision; Writing—Original draft. Cees van Leeuwen: Conceptualization; Funding acquisition; Methodology; Resources; Supervision; Writing—Original draft; Writing—Review & editing. Johan Wagemans: Conceptualization; Funding acquisition; Methodology; Project administration; Resources; Supervision; Writing—Original draft; Writing—Review & editing.
Andrey R. Nikolaev and Cees van Leeuwen, Fonds Wetenschappelijk Onderzoek (http://dx.doi.org/10.13039/501100003130), grant number: G.0003.12. Alexandre de P. Nobre, Brazilian Coordination for the Improvement of Higher Education Personnel. Johan Wagemans, Vlaamse regering (http://dx.doi.org/10.13039/501100011878), grant number: METH/14/02.
Diversity in Citation Practices
A retrospective analysis of the citations in every article published in this journal from 2010 to 2020 has revealed a persistent pattern of gender imbalance: Although the proportions of authorship teams (categorized by estimated gender identification of first author/last author) publishing in the Journal of Cognitive Neuroscience (JoCN) during this period were M(an)/M = .408, W(oman)/M = .335, M/W = .108, and W/W = .149, the comparable proportions for the articles that these authorship teams cited were M/M = .579, W/M = .243, M/W = .102, and W/W = .076 (Fulvio et al., JoCN, 33:1, pp. 3–7). Consequently, JoCN encourages all authors to consider gender balance explicitly when selecting which articles to cite and gives them the opportunity to report their article's gender citation balance.