Self-motion perception relies primarily on the integration of the visual, vestibular, proprioceptive, and somatosensory systems. There is a gap in understanding how a temporal lag between visual and vestibular motion cues affects visual–vestibular weighting during self-motion perception. The beta band is an index of visual–vestibular weighting, in that robust beta event-related synchronization (ERS) is associated with visual weighting bias, and robust beta event-related desynchronization is associated with vestibular weighting bias. The present study examined modulation of event-related spectral power during a heading judgment task in which participants attended to either visual (optic flow) or physical (inertial cues stimulating the vestibular, proprioceptive and somatosensory systems) motion cues from a motion simulator mounted on a MOOG Stewart Platform. The temporal lag between the onset of visual and physical motion cues was manipulated to produce three lag conditions: simultaneous onset, visual before physical motion onset, and physical before visual motion onset. There were two main findings. First, we demonstrated that when the attended motion cue was presented before an ignored cue, the power of beta associated with the attended modality was greater than when visual–vestibular cues were presented simultaneously or when the ignored cue was presented first. This was the case for beta ERS when the visual-motion cue was attended to, and beta event-related desynchronization when the physical-motion cue was attended to. Second, we tested whether the power of feature-binding gamma ERS (demonstrated in audiovisual and visual–tactile integration studies) increased when the visual–vestibular cues were presented simultaneously versus with temporal asynchrony. 
We did not observe an increase in gamma ERS when cues were presented simultaneously, suggesting that electrophysiological markers of visual–vestibular binding differ from markers of audiovisual and visual–tactile integration. All event-related spectral power reported in this study was generated from dipoles projecting from the left and right motor areas, based on the results of Measure Projection Analysis.
The visual, vestibular, proprioceptive, and somatosensory systems collect information about how an organism moves through its environment, and integrate this information in associated brain areas, such as medial superior temporal area and ventral intraparietal area (for a review, see DeAngelis & Angelaki, 2012), to produce a smooth, unified perception of self-motion. One complicating factor in this integration process is that each of these cues to motion is perceived on different timelines. For example, self-motion information from the visual system is perceived faster than self-motion information from the vestibular system (e.g., RTs are ∼220 msec for light and ∼440 msec for galvanic vestibular stimulation; Barnett-Cowan & Harris, 2009); however, our perception of self-motion is a function of multisensory integration. Understanding how the temporal factors of visual and vestibular perception affect multisensory integration has been of interest to researchers in many fields of science and engineering. For example, understanding this construct has been a major focus for transfer of training research and for setting policies by flight training administration authorities.
Given the different temporal trajectories of information processing between sensory systems, the temporal integration of multisensory stimuli has long been of interest to researchers. For example, in audiovisual integration, direction-incongruent stimuli give rise to the ventriloquist effect, in which the two stimuli are perceived as having the same source despite a spatially separated origin (Alais & Burr, 2004). This effect disappears when the asynchrony of the audiovisual stimuli exceeds ∼300 msec (Slutsky & Recanzone, 2001). We still do not fully understand the potential effect of temporal asynchrony on visual–vestibular integration and self-motion perception, especially in the context of driving and flight motion-simulator research. However, a recent study demonstrated that changes in the velocity of a visual or physical self-motion cue are most quickly detected when the stimuli are aligned, compared with a 100-msec timing difference (Kenney et al., 2020). Moreover, Rodriguez and Crane (2021) demonstrated that visual-inertial (e.g., visual–vestibular) heading perception is also sensitive to temporal misalignments of less than 250 msec between the motion cues.
Multisensory integration is also affected by attention allocation (Macaluso et al., 2016). Attention can be voluntarily allocated toward a stimulus, a sensory modality, or a specific region of space to achieve task goals (Li, Piëch, & Gilbert, 2004). However, processing can also be involuntarily captured by sensory events, even when the attention capturing signals are unrelated to the current goal-directed activity (Öhman, Flykt, & Esteves, 2001). EEG is a useful tool to explore the online processes related to the interaction between attention and multisensory integration. The high temporal resolution of EEG has been effective in testing hypotheses related to synchronization of neural oscillations as a mechanism for the integration of information across sensory modalities (Senkowski, Schneider, Foxe, & Engel, 2008). Synchronization of neural oscillations (event-related spectral power [ERSP]) is quantified by measuring power of event-related synchronizations (ERSs) and desynchronizations (ERDs) within particular frequency bands (e.g., theta, alpha, beta, gamma). One hypothesis about interpretation of neural oscillations is that distinct spectral timelines index different local cortical networks involved in sensory processing, attention allocation, and multisensory integration (Siegel, Donner, & Engel, 2012). Most studies that support the spectral timelines hypothesis are based on audiovisual or visuotactile integration (for a review, see Keil & Senkowski, 2018). For example, Senkowski, Talsma, Grigutsch, Herrmann, and Woldorff (2007) showed that the closer in time the audiovisual stimuli were presented together, the more feature binding-related gamma ERS was elicited early after stimulus onset. This finding also supports Singer and Gray's (1995) temporal correlation hypothesis, which suggests that oscillations within the gamma band facilitate integration across sensory modalities. 
As far as we know, there are few published studies exploring how the onset timing of multisensory stimuli affects EEG correlates of visual–vestibular integration.
Townsend, Legere, O'Malley, von Mohrenschildt, and Shedden (2019) used a high-fidelity motion simulator and a high-density EEG array to observe ERSP in response to simultaneous-onset visual- and physical-motion stimuli. To examine the effect of attention allocation to visual versus physical motion, in a blocked design, participants made heading judgments to visual (or physical) cues only, while ignoring the other modality. For each trial, headings of the motion cues were either spatially congruent (e.g., heading was the same for visual and physical) or incongruent (e.g., visual and physical headings differed). Importantly, in all conditions, the visual and physical cues to self-motion were presented simultaneously. Measure Projection Analysis (MPA) identified cortical regions in the premotor and sensory motor areas (Brodmann's areas [BAs] 6 and 4) associated with motor processing. ERSP analysis within these areas revealed sensitivity of theta- (4–7 Hz), alpha- (8–12 Hz), and beta- (13–30 Hz) band oscillations to attended visual versus physical self-motion stimuli. Specifically, attending to the visual-motion stimulus (while ignoring the physical-motion stimulus) evoked earlier theta ERS and alpha ERD, whereas attention to the physical-motion stimulus (while ignoring the visual-motion stimulus) evoked longer-lasting and more powerful beta ERD. Complementary research suggests that theta ERS is an index of heading processing (Townsend, Legere, von Mohrenschildt, & Shedden, 2022; for a review, see Buzsáki & Moser, 2013), and alpha ERD/ERS is associated with focal attention and cognitive load (for a review, see Klimesch, 2012). Most important for the present article, previous research has indicated that beta ERD/ERS indexed visual–vestibular weighting (Townsend et al., 2019, 2022).
For example, when attention was focused on the visual-motion stimulus (while ignoring physical-motion cues), beta ERS was stronger, whereas when attention was focused on the physical-motion stimulus (while ignoring visual-motion cues), beta ERD was stronger (Townsend et al., 2019). The purpose of the present article was to further examine visual–vestibular weighting by manipulating the timing of onset of the self-motion cues.
Previous research has demonstrated that the beta band is an index of visual–vestibular weighting, and that attention allocation plays a key role in how weighting is distributed among multisensory inputs (Townsend et al., 2019, 2022). Those studies, however, did not investigate the impact stimulus onset timing has on the process of visual–vestibular weighting within self-motion perception. Previous research has shown that discrepancies in the onset timing of audiovisual stimuli can affect multisensory weighting (Fister, Stevenson, Nidiffer, Barnett, & Wallace, 2016; Sheppard, Raposo, & Churchland, 2013). We need a better understanding of how the interaction of attention allocation and temporal misalignment affects the underlying cortical activity associated with visual–vestibular integration during self-motion perception. The goals of the present study were twofold. The first goal was to examine the effect of attention allocation and temporal asynchrony on induced ERSP, specifically the power and time course of beta oscillations associated with visual–vestibular weighting. The second goal was to examine induced gamma oscillations. Previous multisensory research (e.g., Senkowski et al., 2007) demonstrated more powerful feature-binding gamma ERS when audiovisual multisensory cue onsets were presented closer in time. The present study extends this work by asking whether feature-binding reflected by gamma ERS is similar for visual–vestibular integration.
Participants attended to either physical (ignoring visual) or visual (ignoring physical) motion cues (blocked design) and discriminated between left and right self-motion headings (random presentation within a block). There were three SOA conditions: (1) visual motion onset 100 msec before physical motion onset, (2) physical motion onset 100 msec before visual motion onset, and (3) simultaneous visual and physical motion onset. Given previous research (Townsend et al., 2019, 2022), we hypothesized that beta ERD would be most powerful when participants attended to the physical-motion cues, and beta ERS would be most powerful when participants attended to visual-motion cues. This pattern, however, would be modulated by the temporal lag conditions, such that beta ERD in response to attention to physical motion would be enhanced if the attended physical-motion cue was presented before the ignored visual-motion cue, and beta ERS in response to attention to visual motion would be enhanced if the attended visual-motion cue was presented before the ignored physical-motion cue. Moreover, if gamma ERS is most powerful during conditions of temporal synchrony (Senkowski et al., 2007), the present study may provide evidence that gamma ERS is an index of general processes related to multisensory binding and integration across multiple sensory systems. If this is not the case, feature binding-related gamma ERS may only be specific to processes such as audiovisual and visual–tactile integration.
Thirty-six participants (20 women) were recruited from the McMaster University psychology participant pool and the McMaster community. The sample size was sufficient based on a power analysis of data from our previous study (Townsend et al., 2019; 37 sample size, 0.73 effect size, 0.05 error probability, 0.95 power, four measurements) conducted by G*Power Software (Faul, Erdfelder, Buchner, & Lang, 2009). Ages ranged from 17 to 23 years (M = 18 years, SD = 1.30 years). Those recruited from the participant pool were compensated with course credits. All participants self-reported normal or corrected-to-normal visual acuity and reported no major problems with vertigo, motion sickness, or claustrophobia. This experiment was approved by the Hamilton Integrated Research Ethics Board and complied with the Canadian tri-council policy on ethics.
Visual Motion Stimuli
Visual motion stimuli were presented on a 43-in. LCD panel, 51 in. in front of the participant, subtending a visual angle of 41°. The panel had a refresh rate of 60 Hz and a resolution of 1920 × 1080 (1080p).
The visual display, which contributed to the perception of self-motion, was composed of a fixation cross in the center of the display and two tracks on a gray surface. Each track consisted of a series of yellow dashes perpendicular to the length of the track, drawn in perspective to a vanishing point so that the track appeared to extend into the distance. One track veered right, whereas the other veered left, both at 35°, starting at the lower center of the display. Both tracks together subtended a horizontal visual angle of 33.69°. A horizon line was created by a gray surface upon which the tracks lay, and a blue sky with white clouds above, accentuating the perception of traveling along a track into the distance. The perception of self-motion along the track was created via a first-person viewpoint animation that simulated a forward trajectory to align with the acceleration and perceived velocity that result from the physical-motion cues (see Figure 1B and C for two temporal snapshots). The duration of the visual-motion stimulus on each trial was 700 msec, which included a 200-msec acceleration period followed by 500 msec at a fixed velocity. This was followed by a 960-msec pause in the final position at the end of the track. At the completion of the trial (1660 msec), the visual stimulus was reset to the starting position of the tracks.
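The trial timeline described above can be summarized in a short sketch. The durations are taken from the text; the function name and the phase labels are hypothetical and serve only to illustrate how the three phases add up to the 1660-msec trial.

```python
# Durations (msec) as reported in the text.
ACCEL_MS = 200      # acceleration period
CONSTANT_MS = 500   # fixed-velocity period
PAUSE_MS = 960      # pause in the final position

MOTION_MS = ACCEL_MS + CONSTANT_MS   # total visual-motion duration
TRIAL_MS = MOTION_MS + PAUSE_MS      # time at which the display is reset


def visual_phase(t_ms):
    """Return a label for the visual stimulus phase at t_ms after visual onset.

    The labels are illustrative names, not terms used by the authors.
    """
    if t_ms < 0:
        return "pre-onset"
    if t_ms < ACCEL_MS:
        return "acceleration"
    if t_ms < MOTION_MS:
        return "constant velocity"
    if t_ms < TRIAL_MS:
        return "pause"
    return "reset"
```

For example, `visual_phase(450)` falls in the fixed-velocity period, and `visual_phase(1660)` marks the reset of the display.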
Physical Motion Stimuli
A motion simulator provided physical-motion stimuli. The motion simulator cabin was supported by a MOOG Stewart platform with six-degrees-of-freedom motion (Moog series 6DOF2000E). Participants were seated in a bucket-style car seat fixed to the cabin floor.
The entire session was between 1.5 and 2 hr in duration. The timeline of the session included collection of demographic information, followed by completion of one practice block (30 trials; ∼2 min), application of EEG electrodes (25 min), completion of four experimental blocks (60 min), and participant clean up and debriefing (15 min).
There were 796 experimental trials divided into four blocks of 199 trials each. Participants fixated on the fixation cross for the duration of each trial; a blink break was provided every 15 trials. The attend-visual (AV) and attend-physical (AP) tasks were blocked to avoid task switching effects. The task required participants to direct attention to the visual-motion stimulus and ignore the physical-motion cues (AV task) or to direct attention to the physical-motion stimulus and ignore the visual-motion cues (AP task). They responded with a button press to indicate whether the direction of the attended-modality motion was left or right heading.
Given the importance of collecting enough clean data with correct responses in each attention condition for EEG analyses, and given that participants have a more difficult time ignoring the visual while attending the physical stimulus (Townsend et al., 2019), we collected three AP blocks compared with one AV block. Presentation order was controlled so that the AV block was presented as the first, second, or third of the four blocks. Moreover, to ensure that participants maintained attention to the intended modality (especially during AP blocks), each block contained eight catch trials in which the ignored modality heading was incongruent with the attended modality heading.
SOA was manipulated to produce simultaneous (S), visual-first (V1st), and physical-first (P1st) conditions. In the simultaneous condition, the visual and physical motion cues began at the same time. In the V1st condition, the visual motion stimulus began 100 msec before the physical motion, and in the P1st condition, the physical motion stimulus began 100 msec before the visual motion. The duration of 100 msec was selected as the SOA based on previous research that demonstrated a window in which temporal alignment of visual–vestibular cues speeds up the perception of self-motion (Kenney et al., 2020; O'Malley, Townsend, von Mohrenschildt, & Shedden, 2015). This research provided evidence that a temporal misalignment of 100 msec delayed the responses to the self-motion cues, relative to visual–vestibular cues that were closer in temporal alignment. Thus, the benefits of multisensory integration were weakened, which was the case regardless of which motion cue was being attended. There were an equal number of left and right heading trials in each block, randomly presented.
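The three SOA conditions amount to a simple onset schedule, sketched below. Only the condition labels and the 100-msec value come from the text; the function name and the convention of timing both cues relative to whichever cue starts first are assumptions made for illustration.

```python
SOA_MS = 100  # onset asynchrony reported in the text


def cue_onsets(condition):
    """Return (visual_onset_ms, physical_onset_ms) for an SOA condition.

    Onsets are expressed relative to the first cue of the trial, which is
    an illustrative convention rather than the authors' stated time base.
    """
    if condition == "S":      # simultaneous onset
        return (0, 0)
    if condition == "V1st":   # visual cue leads by 100 msec
        return (0, SOA_MS)
    if condition == "P1st":   # physical cue leads by 100 msec
        return (SOA_MS, 0)
    raise ValueError(f"unknown SOA condition: {condition!r}")
```

Under this convention, the lag of the physical cue relative to the visual cue is 0 msec in S, +100 msec in V1st, and −100 msec in P1st.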
EEG Data Acquisition
EEG data were collected using the BioSemi ActiveTwo electrophysiological system (www.biosemi.com) with 128 sintered Ag/AgCl scalp electrodes. Four additional electrodes recorded eye movements (two placed laterally from the outer canthi and two below the eyes on the upper cheeks). Continuous signals were recorded using an open pass band from direct current to 150 Hz and digitized at 1024 Hz.
All processing was performed in MATLAB 2014a (The MathWorks) using functions from EEGLAB (Delorme & Makeig, 2004) on the Shared Hierarchical Academic Research Computing Network (www.sharcnet.ca). EEG data were band-pass filtered between 1 and 50 Hz, and epoched from 1000 msec prestimulus to 2000 msec poststimulus. Each epoch was baseline corrected using the whole-epoch mean (Groppe, Makeig, & Kutas, 2009). Channels with a standard deviation exceeding 200 μV were interpolated after referencing (on average, 0.97 channels interpolated per participant, with a total of 35 channels interpolated). Bad epochs were rejected if they had voltage spikes exceeding 500 μV or violated EEGLAB's joint probability functions (Delorme, Sejnowski, & Makeig, 2007).
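The amplitude-based steps of this pipeline can be illustrated with a minimal NumPy sketch. It is not the EEGLAB code used in the study: the function names are hypothetical, the array layout (epochs × channels × samples, in μV) is assumed, the 500-μV criterion is assumed to apply to absolute voltage, and the joint-probability rejection step is not reproduced.

```python
import numpy as np


def baseline_correct(epochs):
    """Subtract the whole-epoch mean from every channel of every epoch.

    epochs: array of shape (n_epochs, n_channels, n_samples), in microvolts.
    """
    return epochs - epochs.mean(axis=-1, keepdims=True)


def noisy_channels(epochs, sd_limit_uv=200.0):
    """Return indices of channels whose SD over all epochs exceeds the limit."""
    n_channels = epochs.shape[1]
    per_channel = epochs.transpose(1, 0, 2).reshape(n_channels, -1)
    return np.flatnonzero(per_channel.std(axis=1) > sd_limit_uv)


def spike_free(epochs, spike_limit_uv=500.0):
    """Return a boolean mask of epochs with no absolute voltage above the limit."""
    return np.abs(epochs).max(axis=(1, 2)) <= spike_limit_uv
```

Channels flagged by `noisy_channels` would be candidates for interpolation, and epochs for which `spike_free` is false would be dropped before the remaining rejection criteria are applied.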
Single-subject EEG data were submitted to an extended adaptive mixture independent component (IC) analysis (Palmer, Kreutz-Delgado, & Makeig, 2012) with an n − (1 + interpolated channels) principal components analysis reduction (Makeig, Bell, Jung, & Sejnowski, 1995). Decomposing an EEG signal into ICs allows for analysis of each individual signal produced by the brain that would otherwise be indistinguishable. Dipoles were then fit to each IC using the fieldtrip plugin for EEGLAB following adaptive mixture IC analysis (Oostenveld, Fries, Maris, & Schoffelen, 2011). ICs for which dipoles were located outside the brain, or explained less than 85% of the weight variance, were excluded from further analysis. On average, 20.47 ICs per participant were excluded from analysis.
ERSP Measure Projection Analysis
ERSP was computed for each of the remaining ICs. Fifty log-spaced frequencies between 3 and 50 Hz were computed, with three cycles per wavelet at the lowest frequency up to 25 at the highest. MPA was used to cluster ICs across participants using the Measure Projection Toolbox for MATLAB (Bigdely-Shamlo, Mullen, Kreutz-Delgado, & Makeig, 2013). MPA is a method of categorizing the location and consistency of EEG measures, such as ERSP, across single-subject data into 3-D domains. Each domain is a subset of ICs that are identified as having spatially similar dipole models, as well as similar cortical activity (measure-similarity). MPA fits the selected ICs into a 3-D model of the brain, composed of a cubic space grid with 8-mm spacing according to normalized Montreal Neurological Institute space. The MPA toolbox identified cortical regions of interest by incorporating the probabilistic atlas of human cortical structures provided by the Laboratory of Neuroimaging project (Shattuck et al., 2008). Voxels that fell outside of the brain model (muscle artifacts, etc.) were excluded from the analysis.
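To make the time-frequency parameters concrete, the sketch below builds the frequency axis and the number of wavelet cycles at each frequency. The frequency range, the number of frequencies, and the cycle limits come from the text; the linear interpolation of cycles across the 50 frequency steps is an assumption, because the text gives only the two end points.

```python
import numpy as np

N_FREQS = 50
F_MIN_HZ, F_MAX_HZ = 3.0, 50.0
CYCLES_MIN, CYCLES_MAX = 3.0, 25.0

# Fifty log-spaced frequencies between 3 and 50 Hz.
freqs_hz = np.geomspace(F_MIN_HZ, F_MAX_HZ, N_FREQS)

# Assumption: cycles increase linearly from 3 to 25 across the frequency steps.
cycles = np.linspace(CYCLES_MIN, CYCLES_MAX, N_FREQS)

# Approximate temporal extent of each wavelet in seconds (cycles / frequency).
wavelet_span_s = cycles / freqs_hz
```

With these values, the wavelet spans roughly 1 s at 3 Hz and roughly 0.5 s at 50 Hz, which illustrates the usual trade-off: finer temporal resolution at high frequencies and finer frequency resolution at low frequencies.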
We then calculated local convergence values, using an algorithm based on Bigdely-Shamlo et al. (2013), which deals with the multiple comparisons problem. Local convergence calculates the measure-similarity of dipoles within a given domain and compares them with randomized dipoles. A pairwise IC similarity matrix was created by estimating the signed mutual information between IC-pair ERSP measure vectors, assuming a Gaussian distribution, to compare dipoles. As explained in detail by Bigdely-Shamlo et al. (2013), signed mutual information was estimated to improve the spatial smoothness of the obtained MPA significance value beyond determining similarity of dipoles through correlation. Bootstrap statistics were used to obtain a significance threshold for convergence at each location of our 3-D brain model. Following past literature, we set the raw voxel significance threshold to p < .001 (Chung, Ofori, Misra, Hess, & Vaillancourt, 2017; Bigdely-Shamlo et al., 2013).
Our analyses focused on two relevant domains: the right motor area, with the greatest concentration of dipoles consistent with right premotor and SMA (BA 6), and the left motor area, with the greatest concentration of dipoles consistent with left premotor and SMA (BA 6). For the right motor area, each participant contributed, on average, 2.33 (±1.53) ICs, with each participant contributing at least one IC, with a range from 1–7 ICs. For the left motor area, each participant contributed, on average, 2.19 (±1.51) ICs. There were five participants who did not contribute to this domain. The range of contributed ICs was 0–6.
ERSPs were computed for each experimental condition within each domain calculated by MPA. Bootstrap statistics were used to assess differences in ERSP between conditions to uncover main effects of task and SOA. Differences at each power band were computed by projecting the ERSP for each condition to each voxel in the domain. This projection was weighted by dipole density per voxel and then normalized by the total domain voxel density for each participant. Projected source measures were separated into discrete spatial domains by threshold-based affinity propagation clustering based on a similarity matrix of pairwise correlations between ERSP measure values for each position. Following Chung et al. (2017), we set the maximal exemplar-pair similarity, which ranges from 0–10, to a value of 0.8 (Chung et al., 2017; Ofori, Coombes, & Vaillancourt, 2015; Bigdely-Shamlo et al., 2013).
Behavioral data were analyzed with two 2 × 3 repeated-measures ANOVAs for measures of judgment accuracy and RT. Outliers were defined as trials with RTs more than 3 SDs above or below the mean in each condition and were eliminated from all further analyses. The Greenhouse–Geisser correction was applied to all effects that violated Mauchly's test of sphericity. All behavioral results are illustrated in Figure 2.
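The outlier rule can be sketched as follows for the RTs of a single condition. The function name is hypothetical, and the use of the sample standard deviation (ddof = 1) is an assumption, because the text does not specify which estimator was used.

```python
import numpy as np


def trim_rt_outliers(rts_ms, n_sd=3.0):
    """Drop RTs more than n_sd standard deviations from the condition mean.

    rts_ms: RTs (msec) from one participant in one condition.
    Assumption: the sample SD (ddof=1) is used.
    """
    rts = np.asarray(rts_ms, dtype=float)
    mean, sd = rts.mean(), rts.std(ddof=1)
    keep = np.abs(rts - mean) <= n_sd * sd
    return rts[keep]
```

Because the mean and SD are computed per condition, the function would be applied separately to each cell of the 2 × 3 design.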
Participants were more accurate at discriminating direction in the attend-visual task (M = 99%, SE = .003) than the attend-physical task (M = 95%, SE = .01), F(1, 35) = 10.50, p = .003, ηp2 = .23. Moreover, there was a main effect of SOA on accuracy (Greenhouse–Geisser corrected), F(1.69, 59.02) = 5.77, p = .03, ηp2 = .14, and a Task × SOA interaction, F(2, 70) = 5.00, p = .009, ηp2 = .13. Bonferroni-corrected pairwise comparisons supported the observation that the SOA effects were apparent during the attend-physical task only; there were no significant differences in accuracy between any of the SOA conditions during the attend-visual task. More specifically, participants were more accurate in the attend-physical physical-first (AP(P1st)) condition (M = 95.9%, SE = .01) than the attend-physical visual-first (AP(V1st)) condition (M = 94.10%, SE = 0.02; p = .007).
Participants were faster at discriminating direction in the attend-visual task (M = 1018 msec, SE = 90.20) than the attend-physical task (M = 1409 msec, SE = 78.72), F(1, 35) = 39.43, p < .001, ηp2 = .53. There was a main effect of SOA, F(2, 70) = 519.35, p < .001, ηp2 = .94, such that responses were fastest in the V1st conditions (M = 1189 msec, SE = 6.10), followed by the simultaneous conditions (M = 1317 msec, SE = 5.88), and slowest in the P1st conditions (M = 1451 msec, SE = 6.16). There was a trend toward a Task × SOA interaction on RTs (Greenhouse–Geisser corrected), F(1.52, 53.22) = 3.48, p = .05, ηp2 = .09, such that Bonferroni-corrected pairwise comparisons revealed RT differences across conditions in both attend-physical and attend-visual tasks. During the attend-visual task, responses were faster for the visual-first (AV(V1st)) trials (M = 899 msec, SE = 92.99) compared with simultaneous (AV(S)) trials (M = 1020 msec, SE = 90.19; p < .001), which were in turn faster than physical-first (AV(P1st)) trials (M = 1135 msec, SE = 88.36; p < .001). Likewise, during the attend-physical task, responses were faster for the AP(V1st) trials (M = 1269 msec, SE = 77.69) compared with simultaneous (AP(S)) trials (M = 1406 msec, SE = 79.04; p < .001), which were in turn faster than AP(P1st) trials (M = 1552 msec, SE = 80.12; p < .001). Thus, two important observations are that (1) participants are faster overall when attending to visual motion, but importantly, (2) both attend-visual and attend-physical conditions are highly sensitive to which stimulus was presented first. Exploring the ERSP results provides insights into how the temporal order of stimuli may be affecting multisensory integration and thus leading to differences in accuracy and RTs.
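As a consistency check on the reported effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom using the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The helper below is a hypothetical illustration of that arithmetic and is not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios and degrees of freedom as reported for the RT analysis.
task_effect = partial_eta_squared(39.43, 1, 35)
soa_effect = partial_eta_squared(519.35, 2, 70)
interaction = partial_eta_squared(3.48, 1.52, 53.22)
```

Rounded to two decimals, these come to .53 for the task effect, .94 for the SOA effect, and .09 for the Task × SOA interaction.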
Effects of SOA in Attend-Visual Task
Figure 3 presents a comparison of the left and right motor areas to illustrate the effect of the timing of the stimulus onset on the cortical activity during the attend-visual conditions in both MPA domains. All ERSPs represent a difference in oscillatory power compared with baseline (pretrial) cortical activity, where an ERS represents more spectral power than baseline and an ERD represents less spectral power than baseline. The 1000-msec baseline EEG was recorded during the ISI before each trial, while the simulator was stationary and participants were fixating on the fixation cross. Figure 3A shows the left motor area, with the highest dipole density in the premotor and SMA (BA 6), and Figure 3D shows the right motor area, with the highest dipole density in the premotor and SMA (BA 6). In Panels B (left motor) and E (right motor), we show the associated ERSP plots for the AV(V1st), AV(S), and AV(P1st) conditions. The ERSP plots are followed by bootstrapped comparisons (α = .05) between each possible pair of conditions for left (Panel C) and right (Panel F) motor areas. The following sections will describe observations of the activity changes associated with experimental conditions across frequency bands theta, alpha, beta, and gamma. All of the comparisons outlined in the following sections were significant at p < .05.
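The relation between baseline power and ERS/ERD can be made concrete with a small sketch. It assumes a divisive baseline expressed in decibels, which is a common convention but is not stated explicitly in the text; the function name and array layout are hypothetical.

```python
import numpy as np


def ersp_db(power, times_ms, baseline_ms=(-1000, 0)):
    """Express spectral power as dB change from the mean pretrial baseline.

    power: array of shape (n_freqs, n_times), trial-averaged spectral power.
    times_ms: array of shape (n_times,), time relative to stimulus onset.
    Positive values correspond to ERS and negative values to ERD.
    Assumption: divisive baseline on a decibel scale.
    """
    power = np.asarray(power, dtype=float)
    times_ms = np.asarray(times_ms)
    in_baseline = (times_ms >= baseline_ms[0]) & (times_ms < baseline_ms[1])
    baseline = power[:, in_baseline].mean(axis=1, keepdims=True)
    return 10.0 * np.log10(power / baseline)
```

Under this convention, a tenfold increase in power relative to baseline corresponds to +10 dB (ERS), and a tenfold decrease corresponds to −10 dB (ERD).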
Theta-band latency differences.
The AV(P1st) condition elicited theta ERS significantly later than the AV(S) and AV(V1st) conditions. Specifically, in both the left and right motor areas (Panels C and F, respectively), AV(S) elicited greater theta ERS from ∼100 msec to 200 msec post stimulus and AV(P1st) elicited greater theta ERS later in the trial, from ∼500 msec to 950 msec post stimulus. Likewise, AV(V1st) elicited greater theta ERS from stimulus onset to 300 msec post stimulus and AV(P1st) elicited greater theta ERS from ∼500 msec to 1000 msec post stimulus.
Alpha-band power differences.
In the left and right motor areas (C and F, respectively), AV(P1st) elicited the strongest alpha ERD, compared with AV(S) (∼750–1500 msec poststimulus) and AV(V1st) (∼600–1500 msec poststimulus), and AV(S) elicited stronger alpha ERD than AV(V1st) (∼550–1500 msec poststimulus). Thus, in general, alpha ERD AV(P1st) > AV(S) > AV(V1st).
Beta-band power differences.
Much like the results in the alpha band, we found that the earlier the physical motion was presented, the stronger the elicited beta-band ERD power. In the left and right motor areas (C and F, respectively), AV(P1st) elicited the strongest beta ERD, compared with AV(S) (∼500–1500 msec poststimulus) and AV(V1st) (∼400–1500 msec poststimulus), and AV(S) elicited stronger beta ERD than AV(V1st) (∼300–1000 msec poststimulus). Thus, in general, beta ERD AV(P1st) > AV(S) > AV(V1st).
Gamma-band power differences.
AV(V1st) elicited a more powerful gamma ERS than AV(P1st) from ∼600–1200 msec poststimulus in the right motor area (F).
Effects of SOA in Attend-Physical Task
Figure 4 presents a comparison of the same left and right motor areas as Figure 3 to illustrate the effect of stimulus onset timing on the cortical activity during the attend-physical conditions in both MPA domains. All of the comparisons outlined in the following sections were significant at p < .05.
Theta-band latency differences.
The AP(P1st) condition elicited theta ERS significantly later than the AP(S) and AP(V1st) conditions. Specifically, in both the left and right motor areas (C and F, respectively), AP(S) elicited greater theta ERS from stimulus onset to ∼300 msec post stimulus and AP(P1st) elicited greater theta ERS later in the trial, from ∼500 msec to 600 msec post stimulus. Likewise, AP(V1st) elicited greater theta ERS from stimulus onset to ∼400 msec post stimulus and AP(P1st) elicited greater theta ERS from ∼500 msec to 600 msec post stimulus.
Alpha-band power differences.
In the left and right motor areas (C and F, respectively), AP(P1st) elicited the strongest alpha ERD, compared with AP(S) (∼700–1500 msec poststimulus) and AP(V1st) (∼600–1500 msec poststimulus), and AP(S) elicited stronger alpha ERD than AP(V1st) (∼600–1500 msec poststimulus). Thus, in general, alpha ERD AP(P1st) > AP(S) > AP(V1st).
Beta-band power differences.
In the left and right motor areas (C and F, respectively), AP(P1st) elicited the strongest beta ERD, compared with AP(S) (∼550–1500 msec poststimulus) and AP(V1st) (∼500–1500 msec poststimulus), and AP(S) elicited stronger beta ERD than AP(V1st) (∼800–1200 msec poststimulus). Thus, in general, beta ERD AP(P1st) > AP(S) > AP(V1st).
Effects of Attention Allocation across SOA Conditions
Figure 5 presents the same right motor area as Figures 3 and 4 to illustrate the interaction of stimulus onset timing and attention allocation. We compared cortical activity between conditions of attention allocation at each level of the SOA condition (i.e., AV(S) vs. AP(S), AV(V1st) vs. AP(V1st), and AV(P1st) vs. AP(P1st)). Similar results were found in the left motor area. All of the comparisons outlined in the following sections were significant at p < .05.
Theta-band power differences.
AV(S) elicited a more powerful theta ERS than AP(S) from ∼250 msec to 400 msec post stimulus (C).
Alpha-band power differences.
In the right motor area (A), AV(S) elicited a stronger alpha ERD than AP(S) from ∼50–550 msec poststimulus (C), and AP(V1st) elicited greater alpha ERD than AV(V1st) from ∼800 msec to the end of the trial (D).
Beta-band power differences.
In the right motor area (A), AP(P1st) elicited a stronger beta ERD than AV(P1st) from ∼550–1500 msec poststimulus (B), AV(S) elicited a stronger beta ERS than AP(S) from ∼800 msec to the end of the trial (C), and AV(V1st) elicited more powerful beta ERS than AP(V1st) from ∼700 msec to the end of the trial (D).
Behavioral research has demonstrated a temporal binding window for visual–vestibular integration, in which multisensory integration affects heading perception, temporal order judgments, and attention allocation (e.g., Rodriguez & Crane, 2021; Shayman et al., 2018). Research exploring the cortical processes underlying this temporal window is currently scarce. To better understand the online processes related to multisensory temporal binding, we must look to literature focused on the integration of other senses, such as audiovisual or visuotactile integration. Studies such as Senkowski et al. (2007) have demonstrated that the closer audiovisual stimuli are presented temporally, the more powerful the elicited feature-binding gamma ERS response. Past multisensory research has demonstrated a Gaussian integration window, in which integration breaks down at a temporal asynchrony specific to the senses being integrated (e.g., Rodriguez & Crane, 2021). The present study explored how EEG oscillations related to attention and multisensory weighting in self-motion perception (theta, alpha, and beta; Townsend et al., 2019, 2022), and multisensory feature binding (gamma; Senkowski et al., 2007) were affected by varying conditions of SOA. All differences in cortical activity discussed are projected from the motor area (likely including integrative areas such as ventral intraparietal area and medial superior temporal area) based on the MPA, which identified ROIs across participants.
The Effects of Timing Onset within an Attended Modality
Recent research by Townsend et al. (2019, 2022) showed that theta, alpha, and beta oscillations reveal brain networks involved in the perception of self-motion. Moreover, the power of these individual oscillations changed dynamically depending on which sensory inputs were attended to. Taken together, our two previous studies demonstrated that the beta band is most sensitive to changes in visual–vestibular weighting. Specifically, these studies showed that a strong beta ERS is an electrophysiological signature of heavy visual weighting, and a strong beta ERD is a signature of vestibular weighting.
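As a rough illustration of how ERS and ERD are distinguished, the sketch below expresses band power relative to a prestimulus baseline in decibels, so that positive values indicate synchronization (ERS) and negative values indicate desynchronization (ERD). The dB normalization, the function names, and the example power values are assumptions made for illustration; they are not taken from the analysis pipeline of this study.

```python
import math

def ersp_db(power, baseline_power):
    """Event-related change in spectral power, in dB relative to baseline."""
    return 10.0 * math.log10(power / baseline_power)

def label_ersp(db, tol=1e-9):
    """Label a dB change as ERS (increase), ERD (decrease), or no change."""
    if db > tol:
        return "ERS"
    if db < -tol:
        return "ERD"
    return "no change"

# Hypothetical beta-band power values (arbitrary units), not study data.
baseline = 2.0
rebound = ersp_db(4.0, baseline)     # power doubled relative to baseline
suppressed = ersp_db(1.0, baseline)  # power halved relative to baseline

print(label_ersp(rebound), round(rebound, 2))
print(label_ersp(suppressed), round(suppressed, 2))
```

Under this convention, a doubling of power corresponds to roughly +3 dB and a halving to roughly -3 dB, which is why ERS and ERD of similar size appear as symmetric deviations around zero.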
The current study revealed changes in the same spectral bands as the previously mentioned studies and contributed additional key insights to the understanding of self-motion perception. One robust result was that when an attended motion cue was presented before an ignored cue, the power of the beta oscillation associated with weighting bias toward the attended modality (ERS for visual and ERD for vestibular) was greater than during simultaneous presentation of the attended and ignored cues. This result suggests that the power of weighting-related beta oscillations during self-motion perception is sensitive to the timing of cue onset, and not just to attention allocation. Regardless of which modality is being attended to, the earlier the attended motion cue is presented in relation to the ignored cue, the more powerful the weighting-related ERSP. The inverse was true when the ignored cues were presented before the attended cues: Beta ERS was less powerful in the AV(P1st) condition versus AV(S), and beta ERD was less powerful in the AP(V1st) condition versus AP(S).
The beta cycle has long been thought to reflect the initiation and termination of motor output (for a review, see Kilavik, Zaepffel, Brovelli, MacKay, & Riehle, 2013). Contrary to this hypothesis, Townsend et al. (2019, 2022) demonstrated a beta rebound during passive full-body motion that was induced by attention, and suggested that beta oscillations during motor processing may actually reflect perceptual weighting of the visual, vestibular, and proprioceptive systems. The beta rebound may reflect the inhibition of processing the physical-motion stimuli, considering visual–vestibular integration is a subadditive process. Subadditive inhibition typically occurs during integration when there is a discrepancy in the reliability of multiple sensory inputs (Angelaki, Gu, & DeAngelis, 2009). The Townsend et al. (2022) study showed that participants performed the heading discrimination task at 99% accuracy in both the visual-motion-only and physical-motion-only conditions (the same motion stimuli as the current study). Considering there were likely no significant differences in reliability between the two sensory inputs, we believe that the temporal advantage caused by the SOA led to strong inhibitory responses during integration. Our behavioral and EEG results fall in line with Townsend et al. (2019, 2022). Similar to our previous research, participants' mean accuracy on the heading discrimination task ranged from 98% to 100% across conditions. We believe the oscillatory differences in the beta band between the stimulus onset timing conditions may be a product of the perceptual weights being changed because of the SOA. For example, the processing of the visual stimulus during the AV(V1st) condition began 100 msec before the processing of the physical-motion stimulus. This perceptual head start could have increased the weighting in favor of the visual stimulus, more so than in the AV(S) condition. A similar weighting bias may have taken place during the attend-physical conditions, as we found similar results (but in beta ERD). These power differences in ERSP did not result in differences in accuracy, however (attend-visual 99% accuracy, attend-physical 95% accuracy). We believe that the tasks may not have been sensitive enough to capture correlations between behavioral differences and oscillatory power.
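The reasoning about reliability above can be made concrete with a common reliability-based account of cue combination, in which each cue is weighted by its inverse variance. The sketch below is only an illustration under that assumption: the function name and noise values are hypothetical, the model was not fitted in this study, and it contains no term for onset timing.

```python
def cue_weights(sigma_visual, sigma_vestibular):
    """Weight each cue by its reliability (inverse variance); weights sum to 1."""
    reliability_visual = 1.0 / sigma_visual ** 2
    reliability_vestibular = 1.0 / sigma_vestibular ** 2
    total = reliability_visual + reliability_vestibular
    return reliability_visual / total, reliability_vestibular / total

# Hypothetical noise levels (arbitrary units), not estimates from the study.
equal = cue_weights(2.0, 2.0)           # equally reliable cues
visual_better = cue_weights(1.0, 2.0)   # visual cue less noisy

print("equal reliability:", equal)
print("visual more reliable:", visual_better)
```

With equally reliable cues this account predicts equal weights, so a weighting bias observed under near-ceiling accuracy in both modalities would have to come from another factor, such as the temporal advantage of the cue that is presented first.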
RTs, on the other hand, were affected by the SOA. Keeping in mind that RTs were measured from the onset of the to-be-attended stimulus, RTs were fastest when the visual-motion cues were presented first regardless of whether visual or physical cues were attended. In contrast, RTs were slowest when the physical-motion cues were presented first, regardless of which cue was attended. The visual system is dominant over the vestibular system, as reported in many studies (e.g., Angelaki et al., 2009), and it is not surprising that we see this RT effect with 100-msec SOAs. Visual cues also lead to faster perceptual processing compared with vestibular cues (Barnett-Cowan & Harris, 2013), and the visual cue would have provided stronger priming than the vestibular cue when attention was directed to the opposite cue. Thus, RTs benefited more when the visual-motion cue was presented first. The present study clearly demonstrates that the timing of stimulus onset is a critical component of the visual–vestibular weighting process and is indexed by dynamic changes in the beta band.
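Because RTs were measured from the onset of the to-be-attended cue, the reference point shifts with the lag condition. The sketch below shows one way this bookkeeping could be written. The 100-msec SOA comes from the study; the trial-clock layout, the names, and the example response time are hypothetical.

```python
SOA_MS = 100  # onset asynchrony between cues in the V1st and P1st conditions

# Cue onsets in msec from trial start (hypothetical trial-clock layout).
ONSETS = {
    "S":    {"visual": 0,      "physical": 0},
    "V1st": {"visual": 0,      "physical": SOA_MS},
    "P1st": {"visual": SOA_MS, "physical": 0},
}

def rt_from_attended_onset(response_time_ms, lag_condition, attended):
    """RT measured from the onset of the to-be-attended cue."""
    return response_time_ms - ONSETS[lag_condition][attended]

def ignored_lead_ms(lag_condition, attended):
    """Msec by which the ignored cue preceded the attended cue (negative if it followed)."""
    ignored = "physical" if attended == "visual" else "visual"
    onsets = ONSETS[lag_condition]
    return onsets[attended] - onsets[ignored]

# AP(V1st): attend physical, visual cue first; hypothetical button press at 700 msec.
print(rt_from_attended_onset(700, "V1st", "physical"))
print(ignored_lead_ms("V1st", "physical"))
```

In the AP(V1st) example, the ignored visual cue has a 100-msec lead over the attended physical cue, which is the interval in which the priming described above could operate before the RT clock starts.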
The Interaction of Stimulus Timing and Attentional Selection
Not only did we find that the timing of stimulus onsets affected ERSP, we also found an interaction between the timing of onsets and attention allocation. This result has a direct application to pilot training; for example, current policies of Transport Canada and Federal Aviation Administration require physical cues to motion to precede visual cues to motion during pilot simulator training. Pilots are trained to attend to visual instruments and ignore vestibular inputs caused by forces such as turbulence, to avoid spatial disorientation (Braithwaite, 1997). One question that arises from this practice is how the temporal asynchrony and selective attention interact to affect pilots' multisensory processing. We compared the visual- versus the physical-motion conditions at each SOA condition. Our comparison of AP(S) versus AV(S) was a replication of a condition in Townsend et al. (2019), and we found similar results in the present study, the most important observation being stronger beta ERS in attend-visual conditions and stronger beta ERD in attend-physical conditions. This comparison acted as a baseline, whereas the other two comparisons presented novel findings.
The comparisons AP(P1st) versus AV(P1st) (contrasting attention conditions when the physical stimulus onset first), and AP(V1st) versus AV(V1st) (contrasting attention conditions when the visual stimulus onset first) demonstrated an interaction of attention allocation and SOA in the beta band. When the physical-motion cue was presented 100 msec before the visual cue, there were fewer ERSP differences between AP(P1st) versus AV(P1st), compared with the baseline comparison. Most notably, the typical beta rebound elicited by attention to the visual-motion cue was not present in the AV(P1st) condition. Based on the findings of Townsend et al. (2019, 2022), the lack of a beta rebound in the AV(P1st) condition suggests that presenting the physical-motion cue before the visual-motion cue resulted in greater weighting of vestibular signals than if the motion cues were presented simultaneously. This finding is relevant to simulator training for pilots. If the vestibular cue to motion is presented before the visual cue, it may disrupt the operator's ability to down-weight potentially disorienting vestibular cues that pilots are trained to ignore.
The lack of a beta rebound in the AV(P1st) condition resulted in relatively little difference in ERSP between AP(P1st) versus AV(P1st). However, when the visual-motion cue was presented 100 msec before the physical-motion cue, there was a robust beta ERS in the AV(V1st) condition versus a beta ERD in the AP(V1st) condition. This analysis revealed that visual–vestibular weighting is more sensitive to changes in the onset timing of the visual cues to motion than the vestibular cues. This finding is supported by Barnett-Cowan and Harris (2013), who demonstrated that perception of visual stimuli is faster than perception of vestibular stimuli. Considering the visual cue naturally has a temporal advantage (during simultaneous presentation), it is likely that the vestibular cue would need to be presented more than 100 msec before the visual cue to create the robust ERSP differences that were demonstrated between the conditions of attention allocation when the visual cue was presented first.
Feature-binding Gamma ERS in Visual–Vestibular Integration
We examined gamma ERS under varying conditions of SOA to test the temporal correlation hypothesis (Engel, Fries, & Singer, 2001; Singer & Gray, 1995) in the context of visual–vestibular integration. This hypothesis posits that synchronization of gamma-band oscillations is a key mechanism for integration across distributed cortical networks. Evidence supporting this hypothesis has been demonstrated in multiple studies (e.g., Senkowski et al., 2007; Sakowitz, Quiroga, Schürmann, & Başar, 2001) that typically focus on audiovisual integration. For example, Senkowski et al. (2007) presented human participants with audiovisual stimuli with varying degrees of temporal asynchrony and required them to attend to one modality-specific stimulus while ignoring the other. They found that gamma ERS was not significantly different between modalities but, for both modalities, significantly stronger gamma ERS was elicited when temporal asynchrony was 25 msec or less, compared with longer SOAs. In the present study, the temporal correlation hypothesis predicts that the simultaneous conditions (AP(S) and AV(S)) elicit stronger gamma ERS compared with the V1st and P1st conditions. Our results do not support this prediction. The present study only found differences in the gamma band when comparing the AV(V1st) and AV(P1st) conditions, such that AV(V1st) elicited stronger gamma ERS than AV(P1st). We are currently unaware of any literature directly explaining this finding. We offer two possible explanations for our results. First, visual–vestibular integration may not rely on gamma ERS to synchronize modality-specific information across cortical networks. This facilitation of gamma ERS could be specific to superadditive integration processes (e.g., audiovisual integration; Dias, McClaskey, & Harris, 2021) as opposed to subadditive integration processes (e.g., visual–vestibular integration; Angelaki et al., 2009). Second, visual–vestibular integration may have a broader temporal window than 100 msec for gamma facilitation (compared with the 25-msec window reported by Senkowski et al., 2007), and therefore our experimental design was not sensitive enough to detect differences in gamma ERS because of SOA. A broader temporal window for visual–vestibular integration would be consistent with behavioral research (Rodriguez & Crane, 2021) and with research demonstrating that perception of vestibular inputs is relatively slower than that of other senses (Barnett-Cowan & Harris, 2013). More research needs to be conducted to better understand the role of stimulus timing in visual–vestibular feature binding.
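The broader-window explanation can be illustrated with a Gaussian temporal integration window, in which relative integration strength falls off with SOA. The sketch below is a toy calculation: the window widths are hypothetical values chosen to contrast a narrow and a broad window, and neither is an estimate from this study or from Senkowski et al. (2007).

```python
import math

def binding_strength(soa_ms, sigma_ms):
    """Relative integration strength under a Gaussian window (1.0 at SOA = 0)."""
    return math.exp(-(soa_ms ** 2) / (2.0 * sigma_ms ** 2))

SOA_MS = 100  # asynchrony used in the V1st and P1st conditions

# Hypothetical window widths (standard deviations, in msec).
narrow = binding_strength(SOA_MS, sigma_ms=25)
broad = binding_strength(SOA_MS, sigma_ms=250)

print(f"narrow window: {narrow:.4f}")
print(f"broad window:  {broad:.4f}")
```

With a narrow window, a 100-msec SOA falls far outside the region of integration, so synchronous and asynchronous conditions should differ clearly. With a broad window, the same SOA leaves integration strength close to its synchronous value, and a difference between conditions would be difficult to detect.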
Limitations and Future Directions
Our heading discrimination task required participants to push a button as quickly as possible to make a heading judgment. It is possible that the preparation and execution of thumb movements during the button press contributed to the recorded EEG signal in the motor areas. Pilot studies revealed that participants tended to attend only to visual cues to motion unless they were told that some physical-motion cues were spatially incongruent with the visual-motion cues. Collecting RTs during the heading judgment task was important to ensure that participants attended to the correct motion cues to elicit the appropriate cortical activity. Our previous research (Townsend et al., 2019, 2022) demonstrated that RT data were diagnostic of attention allocation, such that visual headings were judged faster than physical headings.
The somatosensory system detects pressure and stretch on the skin, muscles, and joints during self-motion (Lackner, 1992). The forces generated by acceleration that produce vestibular or proprioceptive cues would be strong signals of self-motion perception; however, forces generated by the acceleration of our motion simulator would have also stimulated receptors in the back, seat, and feet of the seated participants. Although there is evidence from patients with spinal lesions that the somatosensory system does not contribute significantly to our perception of self-motion (Walsh, 1961), we cannot completely rule out the somatosensory system's contribution to our EEG signal projecting from the motor areas.
Functional neuroimaging studies exploring the neural correlates of visual motion perception typically use optic flow to elicit cortical responses to vection, or the illusion of inertial motion generated by visual-only stimuli. Some studies have compared coherent optic flow to control stimuli such as random (incoherent) dot motions (e.g., Cardin & Smith, 2010), static dot patterns (e.g., Deutschländer et al., 2004), or spatially scrambled versions of the original self-motion stimulus (e.g., Barry et al., 2014). In these studies, participants are not physically moved, so researchers commonly rely on self-report data to determine whether participants experienced the vection illusion. We did not collect self-report data to determine whether participants experienced vection from our visual-motion cues in the present study. Therefore, we cannot be completely certain that our visual-motion stimuli would have elicited vection on their own. However, a large body of research has shown that visually induced vection is strengthened when paired with vestibular stimulation (e.g., Gallagher, Dowsett, & Ferrè, 2019; Weech & Troje, 2017; Johnson, Sunahara, & Landolt, 1999). Our visual- and physical-motion stimuli were developed to combine for an immersive experience of self-motion that is similar to environments used in aviation and driving research and training.
Our research can be applied to the clinical space to better understand pathologies of self-motion perception and visual–vestibular integration. Patients with pathologies such as Mal de Débarquement Syndrome (Van Ombergen, Van Rompaey, Maes, Van de Heyning, & Wuyts, 2016), Persistent Postural-Perceptual Dizziness (Popkirov, Staab, & Stone, 2018), and Parkinson's disease (Yakubovich et al., 2020) show lower thresholds for self-motion perception. For example, a recent study has shown that, compared with healthy, age-matched controls, patients with Parkinson's disease perform worse on heading judgment tasks because of overweighting of impaired visual-motion cues (Yakubovich et al., 2020). If we can establish electrophysiological biomarkers of healthy versus impaired self-motion perception, we will develop a better understanding of the integration and motor impairments that are common in pathologies such as Parkinson's disease. Identification of these biomarkers in the prediagnostic phase of the disease could lead to a greater time window for possible preventative measures and earlier treatments (Noyce, Lees, & Schrag, 2016).
The present study examined cortical activity elicited in response to self-motion cues that varied in attention allocation and stimulus onset synchrony. There were two main findings. First, SOA produced robust differences in cortical activity during attention to both visual and physical motion. The electrophysiological signatures of visual (strong beta ERS) versus vestibular (strong beta ERD) weighting bias were enhanced when the attended motion cue was presented 100 msec before the ignored cue. When comparing across conditions of attention allocation, presenting the visual-motion cue first created more robust conditional differences than when physical-motion cues were presented first. These results demonstrate that the timing of visual–vestibular stimuli plays a critical role in multisensory weighting during self-motion perception, and that this weighting process is more sensitive to temporal changes in visual stimuli compared with vestibular stimuli. Second, contrary to the findings of several audiovisual and visuotactile studies, the temporal synchrony of visual- and physical-motion cues did not elicit gamma ERS beyond baseline. It is possible that the 100-msec SOA was not long enough to elicit these hypothesized differences. It could also be the case that visual–vestibular integration does not elicit processes indexed by gamma ERS.
Reprint requests should be sent to Ben Townsend, Department of Psychology, Neuroscience and Behaviour, McMaster University, 1280 Main St. West, Hamilton, Ontario, Canada L8S 4L8, or via e-mail: email@example.com.
Data Availability Statement
The data and code for all analyses are available online at https://github.com/bentownsend11/Stimulus-onset-asynchrony-affects-attention-related-ERSP-in-self-motion-perception.
Ben Townsend: Conceptualization; Formal analysis; Investigation; Methodology; Project administration; Visualization; Writing—Original draft; Writing—Review & editing. Joey K. Legere: Formal analysis; Software. Martin v. Mohrenschildt: Funding acquisition; Methodology; Resources; Software; Supervision. Judith M. Shedden: Conceptualization; Funding acquisition; Methodology; Project administration; Resources; Supervision; Writing—Review & editing.
Funding for this study was provided to JMS and MvM by The Natural Sciences and Engineering Research Council of Canada, grant numbers: RGPGP-2014-00051 and RGPIN-2020-07245; and the Canada Foundation for Innovation (https://dx.doi.org/10.13039/501100000196), grant number: 2009M00034. These funding sources had no involvement in the study design, the collection, analysis and interpretation of data, in the writing of the report, and in the decision to submit the article for publication.
Diversity in Citation Practices
Retrospective analysis of the citations in every article published in this journal from 2010 to 2021 reveals a persistent pattern of gender imbalance: Although the proportions of authorship teams (categorized by estimated gender identification of first author/last author) publishing in the Journal of Cognitive Neuroscience (JoCN) during this period were M(an)/M = .407, W(oman)/M = .32, M/W = .115, and W/W = .159, the comparable proportions for the articles that these authorship teams cited were M/M = .549, W/M = .257, M/W = .109, and W/W = .085 (Postle and Fulvio, JoCN, 34:1, pp. 1–3). Consequently, JoCN encourages all authors to consider gender balance explicitly when selecting which articles to cite and gives them the opportunity to report their article's gender citation balance. The authors of this article report its proportions of citations by gender category to be as follows: M/M = .675; W/M = .125; M/W = .15; W/W = .05.