Abstract
In everyday listening situations, we need to constantly switch between alternative sound sources and engage attention according to cues that match our goals and expectations. The exact neuronal bases of these processes are poorly understood. We investigated oscillatory brain networks controlling auditory attention using cortically constrained fMRI-weighted magnetoencephalography/EEG source estimates. During consecutive trials, participants were instructed to shift attention based on a cue, presented in the ear where a target was likely to follow. To promote audiospatial attention effects, the targets were embedded in streams of dichotically presented standard tones. Occasionally, an unexpected novel sound occurred opposite to the cued ear to trigger involuntary orienting. According to our cortical power correlation analyses, increased frontoparietal/temporal 30–100 Hz gamma activity at 200–1400 msec after cued orienting predicted fast and accurate discrimination of subsequent targets. This sustained correlation effect, possibly reflecting voluntary engagement of attention after the initial cue-driven orienting, spread from the TPJ, anterior insula, and inferior frontal cortices to the right FEFs. Engagement of attention to one ear resulted in a significantly stronger increase of 7.5–15 Hz alpha in the ipsilateral than contralateral parieto-occipital cortices 200–600 msec after the cue onset, possibly reflecting cross-modal modulation of the dorsal visual pathway during audiospatial attention. Comparisons of cortical power patterns also revealed significant increases of sustained right medial frontal cortex theta power, right dorsolateral pFC and anterior insula/inferior frontal cortex beta power, and medial parietal cortex and posterior cingulate cortex gamma activity after cued versus novelty-triggered orienting (600–1400 msec). 
Our results reveal sustained oscillatory patterns associated with voluntary engagement of auditory spatial attention, with the frontoparietal and temporal gamma increases being best predictors of subsequent behavioral performance.
INTRODUCTION
Human behavior and communication depend on our ability to flexibly shift the focus between alternative locations of acoustic space and to engage voluntary auditory attention to the most relevant sources of information, such as when one tries to follow a conversation in a multitalker environment. In such situations, attention may also be involuntarily or exogenously captured by novel sounds that occur beyond our immediate perceptual field. Distinguishing between these voluntary versus involuntary modes of orienting has been a major goal in previous attention studies (for a review, see Corbetta & Shulman, 2002). Yet, many previous human neuroimaging studies, including both visual (Serences & Yantis, 2007; Mayer, Harrington, Adair, & Lee, 2006; Peelen, Heslenfeld, & Theeuwes, 2004; Kim et al., 1999; Rosen et al., 1999) and auditory efforts (Salmi, Rinne, Koistinen, Salonen, & Alho, 2009), suggest that largely overlapping networks are activated during voluntary and involuntary orienting. Notably, the majority of this evidence has been obtained with methods that provide only indirect indices of neuronal activities and have a poor spectral and temporal resolution. This limitation might be particularly relevant in the auditory domain: Unlike visual objects that occur in parallel across the visual field, sounds consist of complex spectrotemporal signals that are distributed across time. Voluntary and involuntary auditory attention processes could therefore be better distinguished based on their distinct temporal behaviors.
Transient Orienting versus Sustained Endogenous Engagement of Attention
Classic models developed based on visuospatial studies suggest that attention shifting can be divided into distinct stages, involving the disengagement of previous activity, redirection of attention, and finally the engagement of attention to a new location (Posner & Petersen, 1990). Cognitive control of auditory attention can, thus, be presumed to involve processes occurring at different timescales. Redirection of attention from one sound source to another may occur almost instantaneously, as suggested by psychoacoustic (Mondor & Zatorre, 1995) and EEG (Näätänen, 1992) studies in humans. The rapid orienting process is presumably accompanied by sustained control activities that facilitate longer-term engagement of processing resources after the redirection of attention has occurred (Posner & Rothbart, 2007). Evidence for such executive processes, beyond transient activities related to the shifting operations, has been found in visuospatial (Liu, Slotnick, Serences, & Yantis, 2003; Yantis et al., 2002) and visual abstract task-switching fMRI studies (Braver, Reynolds, & Donaldson, 2003). Behavioral visuospatial studies, in turn, suggest that, whereas the facilitating effects of stimulus-driven orienting occur in less than 100 msec (Klein, 2000; Posner & Cohen, 1984), the effects of endogenous attention shifting may evolve more gradually, with the focus of attention sharpening and detection performance improving within the first second after orienting (Cheal & Lyon, 1991; Shepherd & Muller, 1989). In the auditory system, neurophysiological animal models have shown analogous top–down effects that develop up to several seconds after orienting (Fritz, Elhilali, & Shamma, 2005). However, despite the importance of temporal information in sound processing, the distinction between transient and sustained attention processes is not yet fully understood in the auditory domain.
Neuronal Oscillations and Auditory Attention
In humans, noninvasive estimates of transient and sustained control processes could be achieved with magnetoencephalography (MEG) and EEG, which, as direct measures of neural activity, provide much better temporal resolution than fMRI, the prevailing technology in contemporary attention studies. Pioneering advances have been achieved using trial-averaged MEG/EEG event-related fields (ERF)/ERPs. ERF/ERPs have been utilized to investigate attention shifting with visual (Rushworth, Passingham, & Nobre, 2002; Hopf & Mangun, 2000; Nobre, Sebestyen, & Miniussi, 2000) and auditory paradigms (Salmi, Rinne, Degerman, & Alho, 2007; Alho et al., 1998; Escera, Alho, Winkler, & Näätänen, 1998; Schröger & Wolff, 1998; Näätänen, 1992). One important limitation of ERP/ERF analyses is, however, that endogenous processes are often relatively loosely phase locked to external triggers, and thus, particularly higher-frequency oscillations related to such processes are easily canceled out from trial-averaged responses (Tallon-Baudry & Bertrand, 1999). Additional information about such endogenous processes could be obtained by analyzing neuronal oscillations, either transient oscillatory processes related to specific stimuli/events or sustained oscillatory modulations occurring over longer periods of time. Furthermore, examining oscillations occurring at distinct frequency ranges, such as the theta (4–7 Hz), alpha (8–15 Hz), beta (15–30 Hz), and gamma bands (30–100 Hz), may help distinguish between different types of attention effects within anatomically overlapping networks (Schroeder & Lakatos, 2009; Buschman & Miller, 2007; Fan et al., 2007).
In previous studies (Jensen, Kaiser, & Lachaux, 2007; Fries, Reynolds, Rorie, & Desimone, 2001), tasks requiring endogenous effort, including voluntary orienting (Fan et al., 2007; Landau, Esterman, Robertson, Bentin, & Prinzmetal, 2007), have been most consistently associated with oscillations at the gamma band. Neurophysiological animal models suggest that gamma synchronization increases in local neuronal networks representing attended features (Bichot, Rossi, & Desimone, 2005; Fries et al., 2001). In noninvasive MEG or EEG measurements, this is detectable as increases of transient gamma-band responses to attended auditory stimuli (Mulert et al., 2007; Kaiser, Hertrich, Ackermann, & Lutzenberger, 2006; Tiitinen et al., 1993). Sustained increases of frontotemporal MEG gamma-band activity, as potentially reflecting ongoing endogenous cognitive processing, have, in turn, been found during auditory pattern memory maintenance (Kaiser, Ripper, Birbaumer, & Lutzenberger, 2003), analogous to similar effects identified in the visual system (Tallon-Baudry, Bertrand, Peronnet, & Pernier, 1998). Interestingly, recent intracranial EEG studies in humans have suggested that endogenous attention is reflected by sustained broadband gamma-band activity in the dorsolateral pFC (DLPFC), medial frontal cortex, superior parietal cortex, and FEFs (Ossandon et al., 2012). Intracranial EEG studies in human visual cortices have provided evidence for sustained broadband gamma activities that are modulated by selective attention (Tallon-Baudry, Bertrand, Henaff, Isnard, & Fischer, 2005), consistent with similar results obtained in noninvasive MEG measurements (Kahlbrock, Butz, May, & Schnitzler, 2012). On the basis of these findings, one might expect stronger sustained gamma patterns after voluntary engagement of attention than transient involuntary orienting.
The associations between attention and beta-band oscillations are not yet as clear as those related to gamma oscillations. It has been proposed that beta-band activities reflect less demanding control processes than gamma-band responses (Engel & Fries, 2010). However, there is human EEG evidence that tasks requiring enhanced top–down processing are coupled with increased beta activity (Kaminski, Brzezicka, Gola, & Wrobel, 2012). Neurophysiological studies in nonhuman primates and cats have shown sustained beta synchronization patterns during attentional expectancy of stimuli (de Oliveira, Thiele, & Hoffmann, 1997; Roelfsema, Engel, König, & Singer, 1997; Montaron, Bouyer, & Rougeul-Buser, 1979). Beta-band modulations have been linked to increased endogenous processing in the human auditory system as well (Iversen, Repp, & Patel, 2009), but their exact role in auditory attention is still relatively poorly known.
Although the higher-frequency beta and gamma oscillations may relate to increased processing, it is generally thought that enhancement of alpha rhythms reflects disengagement of task-irrelevant cortex areas (Klimesch, Sauseng, & Hanslmayr, 2007; Cooper, Croft, Dominey, Burgess, & Gruzelier, 2003; Pfurtscheller, 2003; Adrian & Matthews, 1934). Visuospatial attention studies have shown that alpha activity increases in cortical areas representing task-irrelevant aspects of the visual field (Worden, Foxe, Wang, & Simpson, 2000), and that such effects correlate with the participants' ability to ignore irrelevant stimuli (Händel, Haarmeier, & Jensen, 2011). Parieto-occipital visual alpha oscillations are also strongly affected by cross-modal effects by auditory attention (Ahveninen et al., 2012; Fu et al., 2001). These cross-modal effects include lateralized alpha increases in parieto-occipital areas ipsilateral to the attended auditory field (Thorpe, D'Zmura, & Srinivasan, 2012; Banerjee, Snyder, Molholm, & Foxe, 2011). However, identifying lateralized alpha inhibition effects in auditory cortices, per se, may be more difficult because, although sounds presented to one ear elicit strongest activations in the contralateral auditory pathway (Langers, van Dijk, & Backes, 2005; Virtanen, Ahveninen, Ilmoniemi, Näätänen, & Pekkonen, 1998), a significant amount of information ascends also directly to the ipsilateral hemisphere (see, however, also Müller & Weisz, 2012; Gomez-Ramirez et al., 2011).
In the human cortex, attentional oscillatory activities may also occur at the theta range. It has been proposed that, whereas higher-frequency synchronization reflects local integration, inter-regional phase locking at the theta range supports longer-distance functional coupling in the brain (Doesburg, Green, McDonald, & Ward, 2012; Canolty et al., 2006; von Stein, Chiang, & König, 2000). The power of frontocentral cortical theta, presumably originating from MFC regions (Wang, 2010), strengthens with increased allocation of attentional resources (Sauseng, Hoppe, Klimesch, Gerloff, & Hummel, 2007) and enlarged working memory load (Jensen & Tesche, 2002; Gevins, Smith, McEvoy, & Yu, 1997). On the basis of these findings, one might expect increased MFC theta during engagement of auditory attention.
Localizing Oscillatory Attention Networks of the Human Brain
Going beyond comparisons of activation–magnitude differences provided by methods such as fMRI, examining neuronal oscillatory activities at different frequency bands can help make more specific inferences about the role of activated areas. However, achieving anatomically accurate noninvasive MEG/EEG estimates of oscillatory activities during auditory attention has been limited by the ill-posed electromagnetic inverse problem. Employing constraints that reduce the number of potential solutions mitigates this problem. Because MEG and EEG activities are mainly generated within cerebral gray matter, the source locations can be restricted to the cortical mantle derived from anatomical MRI (Dale & Sereno, 1993). Additional improvements can be achieved by combining the complementary information provided by simultaneously measured MEG and EEG (Sharon, Hämäläinen, Tootell, Halgren, & Belliveau, 2007). The MEG/EEG inverse solution can be constrained even further by fMRI information on BOLD changes within the gray matter (Dale et al., 2000), which have been shown to correlate with the postsynaptic neuronal events that also generate the MEG/EEG signals (Logothetis, 2002).
Here, we therefore utilized a combination of MEG/EEG and fMRI to investigate orienting and engagement of auditory attention, using a cued dichotic listening paradigm modified from classic visuospatial (Posner, 1980), auditory spatial selective attention (Näätänen, 1992; Hillyard, Hink, Schwent, & Picton, 1973), and auditory involuntary attention-shifting studies (Escera et al., 1998; Schröger & Wolff, 1998). We presumed that exogenous and endogenous stages of orienting and engagement of attention would be reflected by differential transient and sustained oscillatory power changes. We specifically hypothesized that voluntary allocation of attention, subsequent to cued orienting, would result in power increases of high-frequency oscillations in frontoparietal regions, whereas inhibition of task-irrelevant processes was expected to result in lateralized alpha modulations in posterior cortex.
METHODS
Task and Stimuli
During separate fMRI and MEG/EEG sessions, participants (n = 16, mean ± SD age = 23 ± 5 years, eight women) were presented with randomly ordered 10-sec dichotic listening trials (Figure 1). During each trial, participants were instructed to look at a fixation mark at the center of an MRI- or MEG-compatible video display and wait for a cue (250-msec buzzer sound) delivered to the ear where a subsequent target (50-msec tone with 800- and 1500-Hz harmonics) was likely to occur. Upon hearing the cue, the participants were advised to shift their attention accordingly (without shifting their gaze), pay close attention to the tones presented to the designated ear, and press a button with the right index finger as rapidly as possible after hearing the target. To maximize the attention effects (Näätänen, 1992; Woldorff, Hackley, & Hillyard, 1991; Hillyard et al., 1973), we increased the serial load by embedding the cues and targets among pure tone “standards” (duration 50 msec, 5-msec ramps) presented randomly to the right (800 Hz) or left ear (1500 Hz). The participants were instructed to discriminate changes in the timbre (a “thickening” of the sound) in the task-relevant standard sound sequence, but they were not told that the targets were actually similar in both ear sequences. According to previous studies (Alho et al., 1998; Knight, 1984), strong event-related MEG/EEG responses related to involuntary auditory orienting can be evoked by physically varying “novel” sounds. To investigate oscillatory networks underlying involuntary attention shifting, in 20% of the trials, the target was therefore replaced by a task-irrelevant novel sound presented opposite to the cued ear. The novel sounds consisted of eight spectrotemporally complex environmental and synthetic sounds whose peak intensities, onset rise times, and perceived loudness, as well as their grand-averaged time envelope, were made as close to the cues as possible.
The design also included 20% of “catch trials” with the cue and pure tone standards, but no target. Only pure tones (no cue, novel, or target) were presented in 20% of trials. The MEG/EEG session, consisting of two 37-min runs (220 trials/run), lasted approximately 2.5–3 hr. The fMRI session, including three 23-min runs (136 trials/run), lasted 2 hr with preparations and training. During each task session, there were an equal number of right and left ear events (cues, targets, novels, and standards). The order of sessions was randomized. Additionally, a separate 10-min behavioral experiment (n = 10, four women, age = 22–43 years) was conducted to test whether the spatial cueing indeed produced significant performance benefits. In this control experiment, we replaced 50% of the novel sounds with a target sound opposite to the cued ear (“invalidly cued target”).
Figure 1. Task and stimuli. During 10-sec trials, participants heard a cue in the ear where a subsequent target, a harmonic sound within pure tone trains consisting of randomly ordered 800-Hz left-ear and 1500-Hz right-ear tones (Hillyard et al., 1973), was likely to appear. The task was to shift attention to the cued ear, wait for the target, and press a button as quickly as possible upon hearing the target. Novel sounds, which occasionally occurred opposite to the cued ear, were to be ignored. Each trial ended with a 2.18-sec sound, the fMRI scanning noise or its recorded simulation during MEG/EEG (i.e., fMRI was obtained with a sparse-sampling approach with other sounds presented in-between scans). Four types of trials were utilized similarly during fMRI and MEG/EEG, including Cue/Target (40%), Cue/Novel (20%), Cue/No target (20%), and Standards Only (20%). The SOA was jittered at 350–750 msec to mitigate expectancy confounds such as omission responses (there was at least a 650-msec period after cues, targets, and novels).
In all sessions, sound stimuli were presented at 55 dB over the subjective hearing threshold, tested individually at the beginning of each session for each ear. At 7.82 sec after the trial onset, participants heard the sound of 2.18-sec fMRI volume acquisition or during MEG/EEG a 2.18-sec binaural scan-sound recording, signaling that the trial had ended. Using the scan sound as a controlled alerting stimulus thus minimized the confounding effects of fMRI acquisition noise. The fMRI scanner sound (along with the buzzer cues and novel sounds) also acted as an interruption signal, to prevent the buildup of frequency-based “streaming” (which according to previous studies typically takes 5–10 sec; Bregman, 1978). In each trial, a total of 13 auditory stimuli were presented starting 2.3 sec after the onset of preceding scan/simulation and ending on average 1.3 sec before the next scan. Within the sequence of 13 sounds, the SOA was 350–750 msec, varied quasirandomly such that there was always at least 650 msec silence after each cue, target, and novel. The jittering of SOA was presumed to help avoid omission response confounds. During fMRI, three silent baseline trials occurred after every six active trials (i.e., a mixed blocked/event-related design was utilized). A standardized computerized approach taking about 5 min was utilized to teach the task to the participants before the scanning. In subsequent analyses, individual trials with target detection responses beyond the participant's mean ± 2SD RT were considered outliers. Of an initial cohort of 20 participants, one participant was excluded from the final MEG/EEG/fMRI analyses because of an inability to perform the tasks and three other participants for technical reasons.
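The RT outlier criterion described above can be illustrated with a minimal Python sketch. The function name, the use of the sample SD, and the example RT values are assumptions made for illustration only; the text does not specify how the SD was computed.

```python
from statistics import mean, stdev

def exclude_rt_outliers(rts_msec, n_sd=2.0):
    """Keep the RTs that fall within mean +/- n_sd * SD of one participant's RTs."""
    m = mean(rts_msec)
    sd = stdev(rts_msec)  # sample SD (assumption; the text does not specify)
    lower, upper = m - n_sd * sd, m + n_sd * sd
    return [rt for rt in rts_msec if lower <= rt <= upper]

# Hypothetical RTs (msec) of one participant, including one very slow response.
rts = [450, 470, 480, 460, 455, 465, 475, 1200]
kept = exclude_rt_outliers(rts)
```

In this toy example, only the 1200-msec response falls beyond the 2SD bound and is dropped.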
Data Acquisition
Human participants' approval was obtained, and voluntary consents were signed before each measurement. Three hundred six-channel MEG (Elekta-Neuromag, Helsinki, Finland) and 74-channel EEG data were recorded simultaneously (600 samples/sec, passband 0.01–192 Hz) in a magnetically shielded room. Common-average reference was utilized for all analyses of EEG data. The position of the head relative to the sensor array was monitored continuously using four head position indicator coils attached to the scalp. EOG was also recorded to monitor eye artifacts. Whole-head 3T fMRI was acquired in a separate session using a 32-channel coil (Siemens TimTrio, Erlangen, Germany). To circumvent response contamination by scanner noise, we used a sparse-sampling gradient-echo BOLD sequence (repetition time [TR]/echo time [TE] = 10,000/30 msec, 7.82 sec silent period between acquisitions, flip angle = 90°, field of view = 192 mm) with 36 axial slices aligned along the anterior–posterior commissure line (3-mm slices, including 0.75-mm gap, 3 × 3 mm² in-plane resolution), with the coolant pump switched off. T1-weighted anatomical images were obtained for combining anatomical and functional data using a multiecho MPRAGE pulse sequence (TR = 2510 msec; four echoes with TEs = 1.64, 3.5, 5.36, and 7.22 msec; 176 sagittal slices with 1 × 1 × 1 mm³ voxels, 256 × 256 mm² matrix; flip angle = 7°). A field mapping sequence (TR = 500 msec, flip angle 55°; TE1 = 2.83 msec, TE2 = 5.29 msec) with similar slice and voxel parameters to the EPI sequence was utilized to obtain phase and magnitude maps used for unwarping of B0 distortions of the functional data.
Data Analysis
Neuronal bases of auditory attention shifting were studied using an MEG/EEG/fMRI approach, analogous to, for example, Ahveninen et al. (2011). External MEG noise was suppressed, and participant movements, estimated continuously at 200-msec intervals, were compensated for using the signal–space separation method (Taulu, Simola, & Kajola, 2005; Maxfilter, Elekta-Neuromag, Helsinki, Finland). The MEG/EEG data were then downsampled (300 samples/sec, passband 0.5–100 Hz). Epochs coinciding with over 150 μV EOG, 100 μV EEG, or 2000 fT/cm MEG sensor data changes were excluded from further analyses. Some studies suggest that ocular muscle activity may become time locked to sound presentations (Yuval-Greenberg & Deouell, 2011). We therefore used the signal–space projection (SSP), calculated around the time points of artifacts, for removing MEG/EEG field patterns originating from the eyes.
To calculate fMRI-guided depth-weighted ℓ2 minimum-norm estimates (Lin, Belliveau, Dale, & Hämäläinen, 2006; Hämäläinen, Hari, Ilmoniemi, Knuutila, & Lounasmaa, 1993), the information from structural segmentation of the individual MRIs and the MEG sensor and EEG electrode locations were used to compute the forward solutions for all putative source locations in the cortex using a three-compartment boundary element model (Hämäläinen et al., 1993). The shapes of the surfaces separating the scalp, skull, and brain compartments were determined from the anatomical MRI data using FreeSurfer 5.0 (surfer.nmr.mgh.harvard.edu/). For whole-brain inverse computations, cortical surfaces extracted with FreeSurfer were decimated to ∼1000 vertices per hemisphere. The individual forward solutions for current dipoles placed at these vertices comprised the columns of the gain matrix (A). A noise covariance matrix (C) was estimated from the raw MEG/EEG data during a 20–200 msec prestimulus baseline. These two matrices, along with the source covariance matrix R, were used to calculate the minimum-norm estimate inverse operator $W = RA^{T}(ARA^{T} + C)^{-1}$.
To obtain an fMRI prior, that is, an fMRI-weighted source covariance matrix, each vertex point in the cortical surface was assigned an fMRI significance value using FreeSurfer-FSFAST 5.0. Individual functional volumes were motion corrected, unwarped, coregistered with each participant's structural MRI, intensity normalized, resampled into cortical surface space, smoothed using a two-dimensional Gaussian kernel with an FWHM of 5 mm, and entered into a general linear model (GLM) with the task conditions as explanatory variables. The fMRI weighting was set to 90%. That is, diagonal elements in R corresponding to vertices with below-threshold (p < .01, all conditions vs. baseline) significance values were multiplied by 0.1 (group fMRI result was used as a prior in three participants). The entire MEG/EEG raw data time series at each time point were multiplied by the inverse operator W and noise normalized to yield the estimated source activity as a function of time (Lin et al., 2006).
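The construction of the fMRI-weighted source covariance and the inverse operator can be sketched in Python with NumPy as follows. This is a toy illustration under stated assumptions, not the analysis pipeline itself: the function names, matrix sizes, random gain matrix, and identity noise covariance are hypothetical, R is assumed to be diagonal, and the depth weighting and noise normalization steps are omitted.

```python
import numpy as np

def fmri_weighted_source_cov(fmri_significant, weight=0.9):
    """Diagonal R: 1 at fMRI-significant vertices, (1 - weight) elsewhere."""
    r = np.where(fmri_significant, 1.0, 1.0 - weight)
    return np.diag(r)

def mne_inverse_operator(A, R, C):
    """W = R A^T (A R A^T + C)^-1 for a gain matrix A (sensors x sources)."""
    G = A @ R @ A.T + C
    # Solve G^T W^T = (R A^T)^T rather than forming the explicit inverse.
    return np.linalg.solve(G.T, (R @ A.T).T).T

rng = np.random.default_rng(0)
n_sensors, n_sources = 6, 10               # toy sizes (hypothetical)
A = rng.standard_normal((n_sensors, n_sources))
C = np.eye(n_sensors)                      # toy noise covariance (hypothetical)
significant = np.arange(n_sources) < 4     # pretend the first 4 vertices pass the fMRI threshold
R = fmri_weighted_source_cov(significant)  # 90% weighting: other vertices scaled by 0.1
W = mne_inverse_operator(A, R, C)

x = rng.standard_normal((n_sensors, 5))    # sensors x time points
s_hat = W @ x                              # estimated source activity, sources x time points
```

Solving the linear system instead of inverting the sensor-space matrix is a numerical convenience and gives the same W as the formula.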
Accepted MEG/EEG/fMRI trial epochs were analyzed using the FieldTrip toolbox (www.ru.nl/fcdonders/fieldtrip) in Matlab 7.11 (Mathworks, Natick, MA), for each vertex of the cortical surface. Power analyses were performed using a fast Fourier transform taper approach with sliding time windows. At 4–30 Hz, an adaptive time window of three cycles and a Hanning taper was used. At 30–100 Hz, a fixed 0.2-sec time window and 10-Hz frequency smoothing were used, resulting in three orthogonal Slepian tapers being applied to the sliding time window (Percival & Walden, 1993). Dynamic power estimates were then calculated within a 2-sec period (including a 500-msec prestimulus baseline) relative to the stimulus onset at each vertex location with neighboring time points temporally segregated by 0.05 sec. The resulting time-frequency estimates, particularly when converted to a standard brain representation, would have yielded extremely large data sets. Therefore, the whole-cortex power estimates were pooled to a smaller number of time-frequency ROIs. The available frequency band (4–100 Hz) was divided into consecutive one-octave wide subranges, which corresponded to theta (4–7.5 Hz), alpha (7.5–15 Hz), beta (15–30 Hz), lower gamma (30–60 Hz), and higher gamma (60–100 Hz) bands (i.e., the highest band was less than one octave). For each band, a power estimate, divided by the prestimulus baseline and base-10 logarithm normalized, was calculated within three geometrically increasing poststimulus time windows of 0–200, 200–600, and 600–1400 msec, separately for the cues and novels. The resulting cortical power estimates were then normalized to the FreeSurfer standard brain representation (Fischl, Sereno, Tootell, & Dale, 1999). Note that estimates of responses to the targets, per se, were analyzed only cursorily because of the presence of motor activity.
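The spectral parameters and the baseline normalization described above can be made concrete with a short Python sketch. The function names and the toy power time course are hypothetical. The taper count uses the common rule of thumb K = 2TW − 1 (T, window length; W, frequency smoothing), which is an assumption here but is consistent with the three tapers reported for a 0.2-sec window and 10-Hz smoothing.

```python
import math

# Frequency bands (Hz) and poststimulus windows (msec) as listed in the text.
BANDS_HZ = {
    "theta": (4.0, 7.5),
    "alpha": (7.5, 15.0),
    "beta": (15.0, 30.0),
    "lower_gamma": (30.0, 60.0),
    "higher_gamma": (60.0, 100.0),
}
WINDOWS_MSEC = [(0, 200), (200, 600), (600, 1400)]

def adaptive_window_sec(freq_hz, n_cycles=3):
    """Window length covering n_cycles of the given frequency (4-30 Hz range)."""
    return n_cycles / freq_hz

def n_slepian_tapers(window_sec, smoothing_hz):
    """Rule of thumb K = 2*T*W - 1; the epsilon guards against float round-off."""
    return math.floor(2 * window_sec * smoothing_hz - 1 + 1e-9)

def log_power_ratio(times_msec, power, window, baseline=(-500, 0)):
    """log10 of the mean power in `window` divided by the mean baseline power."""
    def mean_in(t0, t1):
        vals = [p for t, p in zip(times_msec, power) if t0 <= t < t1]
        return sum(vals) / len(vals)
    return math.log10(mean_in(*window) / mean_in(*baseline))

# Toy single-vertex power time course sampled every 50 msec (hypothetical values):
# a tenfold power increase at 200-600 msec relative to the prestimulus baseline.
times = list(range(-500, 1500, 50))
power = [20.0 if 200 <= t < 600 else 2.0 for t in times]
ratios = [log_power_ratio(times, power, w) for w in WINDOWS_MSEC]
```

With these toy values, the normalized estimate is 1 (a tenfold increase) in the 200–600 msec window and 0 (no change from baseline) in the other two windows.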
Statistical Analyses
Analyses of cortical MEG/EEG power estimates were conducted at each vertex point of the cortical surface, in both hemispheres, using a GLM masked at locations where the group fMRI omnibus F test contrast (i.e., including both activations and deactivations related to any individual regressor) was significant at p < .05 (FreeSurfer/FSFAST 5.0). To control for multiple comparisons, the resulting statistical estimates were tested against an empirical null distribution of maximum cluster size across 10,000 iterations with a vertex-wise threshold of p < .05 and cluster-forming threshold of p < .05, yielding clusters corrected for multiple comparisons across the surface.
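The logic of testing against a null distribution of maximum cluster size can be sketched in Python. This is a deliberately simplified illustration and not the FreeSurfer procedure: it assumes a one-dimensional chain of vertices instead of a cortical surface, spatially independent Gaussian noise instead of noise with realistic smoothness, and fewer iterations than the 10,000 used in the analysis. All function names are hypothetical.

```python
import random

def cluster_sizes(supra):
    """Lengths of contiguous runs of True in a 1-D sequence of vertices."""
    sizes, run = [], 0
    for flag in supra:
        if flag:
            run += 1
        elif run:
            sizes.append(run)
            run = 0
    if run:
        sizes.append(run)
    return sizes

def max_cluster_null(n_vertices, z_thresh, n_iter, seed=0):
    """Null distribution of the largest supra-threshold cluster in pure noise."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_iter):
        supra = [abs(rng.gauss(0.0, 1.0)) > z_thresh for _ in range(n_vertices)]
        null.append(max(cluster_sizes(supra), default=0))
    return null

def cluster_p_value(size, null):
    """Fraction of null iterations whose largest cluster is at least `size`."""
    return sum(s >= size for s in null) / len(null)

# |z| > 1.96 corresponds to a two-sided vertex-wise threshold of p < .05.
null = max_cluster_null(n_vertices=200, z_thresh=1.96, n_iter=2000)
p_large = cluster_p_value(8, null)   # a hypothetical observed cluster of 8 vertices
p_small = cluster_p_value(1, null)   # a single supra-threshold vertex
```

An observed cluster is retained when its corrected p value, computed against the null distribution of maximum cluster sizes, falls below the chosen threshold.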
Three kinds of statistical comparisons were conducted. (a) We examined correlations between cortical oscillatory power changes after cued orienting, that is, allocation of attention before target presentation, and subsequent behavioral performance. To reduce the dimensionality of this analysis, postcue cortical power estimates were correlated with an inverse efficiency score (IES), which reflects the RT divided by the hit rate (Townsend & Ashby, 1978). That is, a higher value indicates worse performance, similar to RT. The IES has been utilized in behavioral and ERP studies of visual orienting (Kennett, Eimer, Spence, & Driver, 2001; Akhtar & Enns, 1989) as a measure of processing efficiency that discounts possible criterion shifts or speed/accuracy tradeoffs. Before the analysis, the IES values were logarithm normalized; according to a Jarque–Bera test, the resulting behavioral correlate was normally distributed. (b) To explore lateralization of power changes during audiospatial attention, we compared cortical power estimates to the left ear versus right ear cues. (c) Finally, we compared power estimates to the cues versus novels, with the ear-specific responses pooled together. To control for possible gradual baseline power shifts within trials, comparisons across cues and novels were normalized based on standards occurring in catch trials in the same sequential positions as the events of interest.
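The behavioral measure in comparison (a) can be illustrated with a short Python sketch that computes the IES, its logarithm, a Pearson correlation with power, and the Jarque–Bera statistic. The participant values are invented for illustration, the function names are hypothetical, and the natural logarithm is an assumption because the text does not state the base.

```python
import math

def inverse_efficiency(mean_rt_msec, hit_rate):
    """IES = mean RT divided by the proportion of hits (unit: msec)."""
    return mean_rt_msec / hit_rate

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def jarque_bera(x):
    """JB statistic n/6 * (S^2 + (K - 3)^2 / 4) from sample skewness S and kurtosis K."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

# Hypothetical per-participant values (not data from the study).
mean_rts = [420.0, 455.0, 470.0, 510.0, 560.0]   # msec
hit_rates = [0.97, 0.93, 0.90, 0.85, 0.78]       # proportion of hits
gamma_power = [0.30, 0.22, 0.18, 0.10, 0.02]     # baseline-normalized log power

# Natural logarithm used here (assumption; the base is not stated in the text).
log_ies = [math.log(inverse_efficiency(rt, hr)) for rt, hr in zip(mean_rts, hit_rates)]
r = pearson_r(gamma_power, log_ies)
jb = jarque_bera(log_ies)
```

A negative r in such an analysis means that higher postcue power goes with a smaller IES, that is, with faster and more accurate performance.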
RESULTS
Behaviorally, the participants' mean ± SD RTs were 466 ± 100 msec and hit rates (HR) were 90 ± 10% during MEG/EEG. These results did not differ significantly from behavioral observations made during fMRI measurements (mean ± SD: RT = 495 ± 48 msec, HR = 90 ± 8% during fMRI). To verify the beneficial effect of cues in directing attention to subsequent targets, we conducted a separate behavioral control analysis (n = 10, four women, age = 22–43 years) where a portion of targets were presented in the location opposite to the cued ear. The result demonstrated that spatial cueing significantly [t(9) = −4.17, p < .01, paired t test] speeded up target discrimination, as compared with trials where the target occurred in the ear opposite of the cue (mean ± SD RTs 463 ± 68 vs. 555 ± 105 msec to validly vs. invalidly cued targets, respectively). Notably, the RTs to invalidly cued targets of these participants were also significantly longer [t(26) = 2.21, p < .05, independent samples t test] than the RTs to validly cued targets during MEG/EEG, suggesting that participants complied with the task instruction and also benefited behaviorally from the cueing.
Predicting Behavioral Target Discrimination by Oscillatory Power after Cued Attention Shifting
To examine engagement of auditory attention after cued orienting, oscillatory activities following the cues and preceding the target presentation were correlated with the behavioral measure IES, which represents RT normalized by HR (i.e., the unit is millisecond, smaller IES values represent fast and accurate performance) (Figure 2). These analyses showed significant (cluster-based Monte Carlo simulation test) negative correlations, with higher power predicting better behavioral performance, between gamma power and IES at 200–600 msec and 600–1400 msec postcue time windows. These correlations started earlier in the right hemisphere, spreading from TPJ, auditory cortices, anterior insula, and inferior frontal cortex (IFC) areas to the right FEF. In the left hemisphere, the cluster of significant correlations also extended to postcentral and posterior parietal regions, as well as to the anterior temporal cortex areas. The details of cluster statistics are shown in Table 1.
Figure 2. Goal-driven engagement of attention estimated by correlation analyses between postcue/pretarget oscillations and behavioral performance. Fast and accurate behavioral performance, as measured with IES (RT/HR, unit: millisecond, smaller values represent improved performance; Townsend & Ashby, 1978), correlated with increased oscillatory power at the lower (30–60 Hz) and higher (60–100 Hz) gamma bands. Significant correlations (cluster-based Monte Carlo simulation test) started earlier (0.2–0.6 sec) in the right hemisphere and subsequently (0.6–1.4 sec) spread also to the left hemisphere. These gamma correlation patterns emerged first in the TPJ, auditory cortices, anterior temporal cortex, anterior insula, and IFC. In the later time window, a strong correlation at the higher gamma band emerged also in an area corresponding to the right FEFs. No significant correlations were observed in the other frequency bands. The figure shows the significance values of the initial GLM, masked to the locations that survived the post hoc correction based on the cluster-based Monte Carlo simulation test. All estimates are normalized relative to a 500-msec prestimulus baseline before statistical analyses.
Table 1. Correlations between Postcue/Pretarget Oscillations and Behavioral Performance
| F Range | Time (msec) | Cluster | Max Value | Size (mm²) | x Talairach | y Talairach | z Talairach | Clusterwise p | Maximum Location |
|---|---|---|---|---|---|---|---|---|---|
| Left Hemisphere | | | | | | | | | |
| Lower gamma | 0–200 | | | | | | | | |
| | 200–600 | | | | | | | | |
| | 600–1400 | 1 | −2.9 | 2740 | −39 | −2 | −33 | .0023 | Inferior temp., extending to anterior insula |
| | | 2 | −2.3 | 3161 | −57 | −17 | 26 | .0008 | Postcentral, extending to IFC |
| Higher gamma | 0–200 | | | | | | | | |
| | 200–600 | | | | | | | | |
| | 600–1400 | 1 | −3.2 | 10846 | −36 | −1 | −7 | .0001 | Anterior insula |
| Right Hemisphere | | | | | | | | | |
| Lower gamma | 0–200 | | | | | | | | |
| | 200–600 | 1 | −2.8 | 4256 | 48 | −36 | 21 | .0001 | TPJ/supramarginal gyrus |
| | 600–1400 | 1 | −3.5 | 5181 | 36 | −5 | −7 | .0001 | Anterior insula |
| Higher gamma | 0–200 | | | | | | | | |
| | 200–600 | 1 | −2.7 | 6427 | 45 | −35 | 23 | .0001 | TPJ/supramarginal gyrus |
| | 600–1400 | 1 | −3.7 | 1539 | 44 | 3 | 37 | .0219 | FEF/precentral cortex |
| | | 2 | −3.3 | 6732 | 29 | 25 | 0 | .0001 | Anterior insula |
Monte Carlo simulation results of the data in Figure 2 are shown. The maximum values of the initial GLM reflect −log10(p) × sign(t). Significant correlations emerged at the lower (30–60 Hz) and higher (60–100 Hz) gamma ranges.
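The signed statistic reported in the "Max Value" column can be unpacked with a short calculation. This is a minimal sketch of the −log10(p) × sign(t) convention stated above. The function names are hypothetical, the t value in the first example is made up for illustration, and the back-conversion only recovers the p value of the initial GLM at the cluster maximum, not the clusterwise p value.

```python
import math


def signed_log_p(p_value, t_value):
    """Return -log10(p) * sign(t): magnitude is significance, sign is direction."""
    return math.copysign(-math.log10(p_value), t_value)


def p_from_signed_log(max_value):
    """Invert the convention: recover p from a signed -log10(p) value."""
    return 10.0 ** (-abs(max_value))


# Illustrative input (not from the study): a negative t with p = .001
# maps to a value near -3, i.e., a negative correlation.
example = signed_log_p(0.001, -4.2)

# Table 1, left hemisphere, lower gamma, cluster 1: Max Value = -2.9.
p_left_lower_gamma = p_from_signed_log(-2.9)

print(f"signed statistic: {example:.2f}")
print(f"p implied by -2.9: {p_left_lower_gamma:.5f}")
```

Read this way, the negative maxima in Table 1 indicate negative correlations between gamma power and IES, and a magnitude of about 3 corresponds to an initial-GLM p value of about .001.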
Lateralized Oscillatory Power Changes during Cued Auditory Attention
To examine attentional lateralization of oscillatory power changes, we compared estimates to the cues presented to the left and right ears (Figure 3). A double dissociation of cortical power changes as a function of the direction of attention emerged at the alpha range, at 200–600 msec after the cue onset. In the left posterior parietal and lateral occipital regions, a significant (cluster-based Monte Carlo simulation test) increase of alpha power was observed for the left versus right cues. In contrast, in the corresponding right parieto-occipital areas, the alpha power was significantly decreased after left versus right ear cues. These effects suggest that, in both hemispheres, the parieto-occipital visual cortex alpha power is significantly stronger when attention is directed to the ipsilateral than to the contralateral ear. As shown by the data in Figure 3, the ipsilateral power enhancement effect was stronger, longer lasting, and anatomically more widespread in the right hemisphere, with significant effects observed also in the medial parieto-occipital regions (the precuneus, retrosplenial cortex), in the lateral inferior parietal cortices, and in the TPJ. Furthermore, in the right hemisphere, the effect also spread into the surrounding frequency ranges, consistent with, for example, recent macaque studies that showed occipital oscillatory modulations that were similar at the alpha and low beta ranges (Zhang, Wang, Bressler, Chen, & Ding, 2008). Finally, a significant increase of gamma power was observed in the right inferior temporal regions at 600–1400 msec after left versus right ear cues.
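The sign logic behind this double dissociation can be made explicit. The sketch below is an illustration only: it assumes the signed values are Attend Left minus Attend Right contrasts, as in Table 2, and the helper name `ipsi_vs_contra` is hypothetical. Flipping the sign in the right hemisphere expresses both hemispheres in a common ipsilateral-versus-contralateral direction.

```python
def ipsi_vs_contra(hemisphere, left_minus_right):
    """Re-express an Attend Left - Attend Right value as ipsilateral - contralateral.

    For the left hemisphere, the left ear is ipsilateral, so the sign is kept.
    For the right hemisphere, the left ear is contralateral, so the sign flips.
    """
    if hemisphere == "left":
        return left_minus_right
    if hemisphere == "right":
        return -left_minus_right
    raise ValueError("hemisphere must be 'left' or 'right'")


# Alpha-band maxima at 200-600 msec, taken from Table 2 (signed -log10(p) values).
left_alpha = ipsi_vs_contra("left", 2.2)
right_alpha = ipsi_vs_contra("right", -3.5)

# Both values come out positive: alpha is stronger ipsilateral to the cued ear.
print(left_alpha, right_alpha)
```

Opposite contrast signs in the two hemispheres therefore describe the same ipsilateral alpha enhancement, with the larger magnitude on the right matching the stronger right-hemisphere effect noted in the text.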
Figure 3. Lateralized power changes after engagement of attention to left versus right ears. The clearest lateralization pattern is observed at the alpha band, 200–600 msec after cue onset. A significant (cluster-based Monte Carlo simulation test) increase of alpha power is observed when attention is directed to the ipsilateral versus contralateral ear, as reflected by a positive Attend Left versus Attend Right contrast in the left hemisphere and a negative Attend Left versus Attend Right contrast in the right hemisphere. This ipsilateral alpha enhancement effect seems to be more widespread and longer lasting in the right hemisphere, where it also extends to the theta and beta ranges. At the theta range, an early power decrease was also observed in the left sensorimotor areas that are close to the right-hand representation, as well as in the left secondary somatosensory area. In these areas, the power is significantly increased when attention is directed to the contralateral side (i.e., Attend Right > Attend Left). Finally, indices of lateralized gamma power increases were observed in the right inferior temporal visual cortices when attention is directed to the contralateral ear. The figure shows the significance values of the initial GLM, masked to the locations that survived the post hoc correction based on the cluster-based Monte Carlo simulation test. All estimates are normalized relative to a 500-msec prestimulus baseline before statistical analyses.
In addition to the lateralized alpha power changes, there was also evidence of a significant modulation of theta power at 0–200 msec after cues presented to the left versus right ears in the left sensorimotor regions, near the representation of the right hand, and in the left retrosplenial complex and left inferior temporal cortex. In other words, in these areas, the theta power was significantly increased when attention was directed to the right, that is, to the contralateral side (or, alternatively, decreased when attention was directed to the left). The details of cluster statistics are shown in Table 2.
Table 2. Lateralization Analyses
| F Range | Time (msec) | Cluster | Max Value | Size (mm²) | x Talairach | y Talairach | z Talairach | Clusterwise p | Maximum Location |
|---|---|---|---|---|---|---|---|---|---|
| Left Hemisphere | | | | | | | | | |
| Theta | 0–200 | 1 | −3.3 | 3281 | −35 | −25 | 45 | .0001 | Postcentral cortex |
| | | 2 | −2.7 | 2142 | −34 | −27 | 20 | .0048 | Insula |
| | | 3 | −2.3 | 2817 | −10 | −45 | 8 | .0004 | Retrosplenial cortex |
| | 200–600 | | | | | | | | |
| | 600–1400 | | | | | | | | |
| Alpha | 0–200 | | | | | | | | |
| | 200–600 | 1 | 2.2 | 2293 | −22 | −88 | 17 | .0089 | Lateral occipital cortex |
| | 600–1400 | | | | | | | | |
| Right Hemisphere | | | | | | | | | |
| Theta | 0–200 | | | | | | | | |
| | 200–600 | 1 | −3.1 | 3158 | 34 | −70 | 27 | .0001 | Inferior parietal cortex |
| | 600–1400 | 1 | −2.6 | 2833 | 19 | −62 | 44 | .0001 | Superior parietal cortex |
| | | 2 | −2.4 | 3550 | 50 | −40 | 39 | .0001 | TPJ/supramarginal gyrus |
| Alpha | 0–200 | | | | | | | | |
| | 200–600 | 1 | −3.5 | 2637 | 29 | −51 | 41 | .0003 | Superior parietal cortex |
| | | 2 | −3.4 | 8533 | 26 | −66 | 32 | .0001 | Superior parietal cortex |
| | 600–1400 | 1 | −2.2 | 3177 | 20 | −75 | 36 | .0001 | Superior parietal cortex |
| Beta | 0–200 | | | | | | | | |
| | 200–600 | 1 | −2.8 | 2883 | 26 | −66 | 32 | .0002 | Superior parietal cortex |
| | 600–1400 | | | | | | | | |
| Lower gamma | 0–200 | | | | | | | | |
| | 200–600 | | | | | | | | |
| | 600–1400 | 1 | 2.3 | 1862 | 52 | −42 | −15 | .0066 | Inferior temporal cortex |
Monte Carlo simulation results related to the data in Figure 3 are shown. The maximum values of the initial GLM reflect −log10(p) × sign(t). Significant effects occurred at the theta (4–7.5 Hz) and alpha (7.5–15 Hz) bands in both hemispheres and additionally at the beta (15–30 Hz) and lower gamma (30–60 Hz) bands in the right hemisphere.
Oscillatory Changes during Cued and Novelty-triggered Orienting
There was a significant decrease of alpha power in the right auditory cortex and TPJ regions at 0–200 msec after cues versus novels (Figure 4). This effect was followed by an increase of alpha power in the right inferior parietal cortex, right medial parieto-occipital cortices (precuneus, retrosplenial cortex), left retrosplenial cortex, bilateral inferior temporal cortices, and in the right auditory cortex.
Figure 4. Comparisons of power estimates to cued versus novelty-triggered attention shifting. At the theta range, an increase of right medial frontal cortex/ACC theta power was observed during 0.6–1.4 sec (encircled with white rectangle), consistent with previous observations of midline theta increases after allocation of selective attention. This effect coincided with a beta power increase in the right anterior insular/DLPFC regions (white rectangle on the right). In addition, there was evidence for suppression of sensorimotor mu rhythm power (alpha/beta ranges) near the right-hand representation, possibly reflecting motor preparation for the upcoming target. In the posterior medial surfaces (the precuneus and retrosplenial complex), alpha and beta power increased at 0.2–0.6 sec, and this effect was followed by a bilateral lower gamma-band power increase at 0.6–1.4 sec. This later gamma pattern also extended to the posterior cingulate cortex. No significant effects emerged at the higher gamma band (not shown). The figure shows the significance values of the initial GLM, masked to the locations that survived the post hoc correction based on the cluster-based Monte Carlo simulation test. All estimates are normalized relative to a 500-msec prestimulus baseline before statistical analyses.
At 600–1400 msec after the onsets of cues versus novels, there was a significant increase of theta activity in the right MFC region, including the ACC, which coincided with an increase of beta power in the right DLPFC and anterior insula and with a bilateral increase of lower gamma-band activity in medial parieto-occipital and posterior cingulate cortices. These more sustained oscillatory differences after cues versus novels might be associated with voluntary engagement of auditory attention, after the initial reorienting triggered by the cue. There was also evidence of a sustained decrease of mu-rhythm power (alpha/beta ranges) 600–1400 msec after cues versus novels in the left sensorimotor cortices (∼right hand area), possibly reflecting differences in motor preparation. The details of cluster statistics are shown in Table 3.
Table 3. Differences between Cued and Novelty-triggered Attention Shifting
| F Range | Time (msec) | Cluster | Max Value | Size (mm²) | x Talairach | y Talairach | z Talairach | Clusterwise p | Maximum Location |
|---|---|---|---|---|---|---|---|---|---|
| Left Hemisphere | | | | | | | | | |
| Alpha | 0–200 | | | | | | | | |
| | 200–600 | 1 | 2.0 | 1977 | −55 | −8 | −20 | .0219 | Middle temporal gyrus |
| | | 2 | 1.9 | 2119 | −14 | −39 | −4 | .0156 | Parahippocampal gyrus |
| | 600–1400 | 1 | −3.4 | 5363 | −47 | −6 | 45 | .0001 | Precentral gyrus |
| | | 2 | 2.1 | 2632 | −9 | 55 | 11 | .0053 | Superior frontal gyrus |
| Beta | 0–200 | | | | | | | | |
| | 200–600 | | | | | | | | |
| | 600–1400 | 1 | −3.9 | 3389 | −53 | −15 | 47 | .001 | Postcentral gyrus |
| Lower gamma | 0–200 | | | | | | | | |
| | 200–600 | | | | | | | | |
| | 600–1400 | 1 | 3.1 | 7043 | −18 | −71 | 12 | .0001 | Pericalcarine cortex |
| Right Hemisphere | | | | | | | | | |
| Theta | 0–200 | | | | | | | | |
| | 200–600 | | | | | | | | |
| | 600–1400 | 1 | 1.8 | 1384 | 6 | 32 | 17 | .0237 | Caudal ACC |
| Alpha | 0–200 | 1 | −2.9 | 1396 | 56 | −11 | 2 | .0213 | Superior temporal gyrus |
| | | 2 | −2.0 | 2017 | 50 | −52 | 43 | .0016 | Inferior parietal cortex |
| | 200–600 | 1 | 3.0 | 2469 | 48 | −44 | 8 | .0003 | STS |
| | | 2 | 2.9 | 6213 | 27 | −65 | 28 | .0001 | Superior parietal cortex |
| | | 3 | 2.6 | 1600 | 43 | −59 | 23 | .0112 | Inferior parietal cortex |
| | 600–1400 | | | | | | | | |
| Beta | 0–200 | | | | | | | | |
| | 200–600 | 1 | 3.0 | 3577 | 26 | −66 | 32 | .0002 | Superior parietal cortex |
| | 600–1400 | 1 | 2.6 | 1705 | 31 | 27 | 5 | .0096 | IFC/pars triangularis |
| | | 2 | 2.5 | 2312 | 38 | 42 | 12 | .0014 | DLPFC/rostral middle frontal gyrus |
| | | 3 | 2.2 | 3554 | 43 | −57 | 21 | .0001 | Inferior parietal cortex |
| Lower gamma | 0–200 | | | | | | | | |
| | 200–600 | | | | | | | | |
| | 600–1400 | 1 | 2.9 | 11442 | 8 | −69 | 10 | .0001 | Pericalcarine cortex |
Monte Carlo simulation results of the data in Figure 4 are shown. The maximum GLM values reflect −log10(p) × sign(t). Significant effects occurred at the alpha (7.5–15 Hz), beta (15–30 Hz), and lower gamma (30–60 Hz) bands in both hemispheres and additionally at the theta (4–7.5 Hz) band in the right hemisphere.
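For reference, the frequency bands named in the table notes of this section can be collected into a small lookup. This is a minimal sketch: the band edges are those listed in the notes, whereas the treatment of the shared edges (lower edge inclusive, upper edge exclusive) and the names `BANDS` and `band_of` are assumptions made for illustration.

```python
# Band edges in Hz, as listed in the table notes of this section.
BANDS = {
    "theta": (4.0, 7.5),
    "alpha": (7.5, 15.0),
    "beta": (15.0, 30.0),
    "lower gamma": (30.0, 60.0),
    "higher gamma": (60.0, 100.0),
}


def band_of(frequency_hz):
    """Return the band name for a frequency, or None if it lies outside 4-100 Hz.

    Assumption: each band includes its lower edge and excludes its upper edge,
    except that 100 Hz is counted as higher gamma.
    """
    for name, (low, high) in BANDS.items():
        if low <= frequency_hz < high:
            return name
    if frequency_hz == 100.0:
        return "higher gamma"
    return None


print(band_of(10.0))  # alpha
print(band_of(45.0))  # lower gamma
print(band_of(2.0))   # None
```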
DISCUSSION
We utilized fMRI-weighted MEG/EEG source estimates to examine changes in cortical oscillations during cued orienting and subsequent goal-directed engagement of auditory attention. We specifically focused on effects that occur beyond the transient stimulus-driven processes and are presumed to reflect endogenous attention. Gamma power increases after cued attention, emerging within auditory cortices, TPJ, anterior insula, IFC, and the right FEF, were associated with fast and accurate discrimination of subsequent targets. These correlation patterns evolved 200–1400 msec after cued orienting, which corresponds to the hypothesized time course of endogenous engagement of attention after the initial reorienting process. Engagement of attention to one ear resulted in a stronger increase of alpha activity in the ipsilateral than contralateral parieto-occipital cortex areas at 200–600 msec after cue onset, suggesting cross-modal modulation of dorsal visual pathways by audiospatial attention. Our results also suggest sustained increases of MFC theta, DLPFC and IFC/anterior insula beta, and medial parieto-occipital and posterior cingulate cortex gamma power (30–60 Hz) during cued versus novelty-triggered auditory attention.
Evolution of Cued Attention in Behavioral Correlation Analyses
In addition to the initial redirection of focus, cognitive control of auditory attention presumably involves a variety of endogenous processes associated with disengagement from previous tasks and engagement of heightened top–down control on the new task-relevant activity (Braver et al., 2003; Rushworth et al., 2002; Mayr & Keele, 2000). In current neurophysiological literature (Engel & Fries, 2010; Jensen et al., 2007), these kinds of control processes have been associated with gamma-band activities. Consistent with this prediction, we found significant correlations between postcue/pretarget gamma activities, which followed the initial cue-triggered orienting, and subsequent target discrimination performance in frontoparietal and temporal cortices. The significant correlations emerged >200 msec after the shifting cue, which is in line with visuospatial behavioral results (Cheal & Lyon, 1991; Shepherd & Muller, 1989) suggesting gradual emergence of attention within the first second after reorienting to a new focus. The spectral distribution of the present correlation patterns, extending across the lower and higher gamma bands, is consistent with recent human intracranial EEG evidence of sustained power increases that occur very broadly across the gamma band during voluntary endogenous attention (Ossandon et al., 2012). The gamma correlations in the auditory cortex are, in turn, in agreement with sustained broadband gamma increases by selective attention that have been reported in intracranial EEG (Tallon-Baudry et al., 2005) and noninvasive MEG measurements (Kahlbrock et al., 2012) of human visual cortex activity.
Significant gamma correlation effects were found in frontoinsular (IFC/anterior insula, FEF) regions previously associated with shifting or controlling spatial attention. The behavioral correlations at the higher gamma range in the right FEF are consistent with previous literature associating this area with control of spatial attention (Mesulam, 1981), voluntary shifts of attention (Kincade, Abrams, Astafiev, Shulman, & Corbetta, 2005), and preparation for motor activity after cued attention shifting (Wise, Weinrich, & Mauritz, 1983). Similarly, the recent intracranial EEG study of Ossandon et al. (2012) reported broadband gamma power increases in FEF during endogenous engagement of attention in humans. Areas roughly corresponding to IFC/anterior insula have, in turn, been linked to voluntary task shifting (Wager, Jonides, & Reading, 2004) and sustained effort-related control of auditory processing (Falkenberg, Specht, & Westerhausen, 2011; Altmann, Henning, Doring, & Kaiser, 2008). The behavioral correlations in auditory cortices, in turn, support previous findings that increased gamma power in sensory cortices correlates with improved behavioral RTs (Rose, Sommer, & Buchel, 2006) and accuracy (Kaiser et al., 2006). During the last time window of 600–1400 msec, the gamma correlation effect might also have partially reflected attentional modulations of oscillatory responses elicited by standard stimuli, given that transient auditory cortex gamma activities may be enhanced by selective attention (Mulert et al., 2007; Kaiser et al., 2006; Ahveninen et al., 2000; Tiitinen et al., 1993). In both hemispheres, significant correlations between increased gamma power and enhanced behavioral performance emerged also in areas near TPJ. This effect differs somewhat from the predictions of the influential model that distinguishes between dorsal voluntary and ventral stimulus-driven attention networks (Corbetta & Shulman, 2002).
However, there are also several studies that have associated TPJ with top–down auditory processes (Salmi et al., 2009; Alain, He, & Grady, 2008). For example, a recent time-domain MEG study (Larson & Lee, 2013) suggested correlations between activations of right TPJ at 300–600 msec after visually presented shifting cues and behavioral indices of auditory attention. Taken together, the present correlation patterns between postcue/pretarget gamma power increases and behavioral discrimination performance could therefore reflect voluntary engagement of auditory attention, which follows the initial cue-triggered orienting process.
Lateralized Auditory Attention Effects
The main effects in comparisons between cued and novelty-triggered dynamic oscillatory responses, with ear-specific attention conditions pooled together, as well as the gamma-band correlation patterns, suggest stronger effects in the right than left frontoparietal cortex regions. Previous PET studies have shown that, irrespective of the attended ear, audiospatial attention is dominated by a network of right-hemispheric frontoparietal areas (Zatorre, Mondor, & Evans, 1999). Similarly, a recent fMRI study comparing exogenous and endogenous audiospatial modulations suggested that highest-order auditory attention is most likely controlled by right-hemispheric frontoparietal networks (Teshiba et al., 2012). Our result is also consistent with previous studies suggesting that the right hemisphere dominates higher-order processing of auditory space (At, Spierer, & Clarke, 2011; Spierer, Bellmann-Thiran, Maeder, Murray, & Clarke, 2009).
Here, we also found evidence for lateralized modulations of posterior alpha activities as a function of the direction of auditory attention. Alpha power increased in parieto-occipital cortices ipsilateral to the cued ear. This effect, which resembles interhemispheric alpha power modulations during visuospatial attention (Worden et al., 2000), is consistent with lateralized parietal-occipital alpha modulations found in other recent auditory attention studies (Thorpe et al., 2012; Banerjee et al., 2011). The modulation of alpha activities by auditory attention was centered in the superior parietal cortex, near the border of parietal and occipital lobes (Desikan et al., 2006). These modulations, further, overlapped with the areas V3 and V7 in both hemispheres and extended to the precuneus, inferior parietal cortex, and intraparietal sulcus in the right hemisphere. From the perspective of the alpha inhibition theory, the present lateralized alpha modulations could thus be interpreted to reflect cross-modal suppression of dorsal visual “where” pathway representing the task-irrelevant spatial hemifield.
Consistent with the overall pattern suggesting more enhanced attention effects in the right than left hemisphere, the alpha suppression effect was stronger, longer lasting, and anatomically more widespread in the right than in the left hemisphere. At the same time, in the right hemisphere, the contralateral suppression effects spread also to the adjacent frequency bands. Evidence for analogous broader-band power suppression effects has been found in human auditory EEG studies (Thorpe et al., 2012), which suggested lateralized theta, alpha, and beta power increases in the hemisphere ipsilateral to the attended auditory direction. The spreading of the present contralateral power suppression effect from alpha to beta bands is also consistent with a recent local field potential measurement in macaque visual cortex (Zhang et al., 2008). Finally, there was also evidence for lateralized increases of lower gamma-band power in inferior temporal visual cortices contralateral to the attended ear, 600–1400 msec after the cue. Although this effect is, in principle, consistent with the alpha modulations (i.e., alpha decrease coupled with gamma increase in the same hemisphere), further studies are needed to verify the exact interpretation of this effect, as no corresponding modulations were observed in the opposite hemisphere.
There was no clear evidence of lateralized attention effects at the level of auditory cortex. The lack of these effects may be explained by the fact that, although the strongest stimulus-evoked responses to monaural sounds are observed in the hemisphere contralateral to the stimulated ear, human auditory cortices also receive significant inputs from the ipsilateral ear. Previous MEG studies have shown that selective attention increases auditory responses to monaural sounds similarly in the hemisphere ipsilateral and contralateral to the task-relevant ear (Fujiwara, Nagamine, Imai, Tanaka, & Shibasaki, 1998), which contradicts a hypothesis of widespread inhibition of the ipsilateral auditory cortex by ear-specific selective attention. Previous studies have also shown that the contralateral dominance of ascending auditory pathways is much clearer in the primary auditory cortex areas than in nonprimary areas most sensitive to attentional modulations (Woods et al., 2009). It should also be noted that, although certain PET studies have provided evidence of lateralized auditory cortex attention effects (Alho et al., 2003), many previous fMRI (Salmi et al., 2009; Shomstein & Yantis, 2006) and PET (Zatorre et al., 1999) studies have failed to support this finding.
Finally, there was also evidence of an early modulation of theta power in the sensorimotor areas representing the right hand after right versus left ear cues. One potential explanation is a slight enhancement of an auditory cross-modal effect within the left hemisphere, given the spatial congruency of the cue and responding hand (or, alternatively, a decrease of theta when the laterality of attention and responding hand is incongruent). Further studies are needed to verify the functional significance of this phenomenon.
Cued versus Novelty-triggered Attention
We also compared the power estimates for cues and novel sounds, assuming that this comparison would reveal differences related to voluntary versus stimulus-driven attention, particularly after the time window of initial stimulus processing (0–200 msec). These comparisons demonstrated a sustained increase of right MFC theta power at 600–1400 msec after cues versus novels. This result is in line with previous findings that theta power strengthens with increased need for top–down auditory processing (Sauseng et al., 2007; Jensen & Tesche, 2002; Gevins et al., 1997). The present theta increase was centered in the right ACC, an area that has previously been associated with attention and cognitive control (Wager et al., 2005; Weissman, Gopalakrishnan, Hazlett, & Woldorff, 2005). It is thus possible that this theta effect is related to endogenous allocation of auditory attention after cued orienting.
At the alpha range, we observed power decreases in the right auditory cortex and TPJ at 0–200 msec after cues versus novels, which were at 200–600 msec followed by power increases in the medial parieto-occipital cortices (right precuneus, bilateral retrosplenial cortex), bilateral inferior temporal cortices, and areas located inferiorly/posteriorly to the right auditory cortex areas (STS, right inferior parietal cortex). On the basis of the assumptions of alpha inhibition theory, these effects could reflect increased activation of the right auditory cortex and TPJ after cued versus novelty-triggered attention, followed by cross-modal inhibition of posterior and medial parietal cortices. The latter effect is consistent with previously reported increases of posterior “visual” alpha power during engagement of auditory attention (Ahveninen et al., 2012; Fu et al., 2001; Foxe, Simpson, & Ahlfors, 1998). Taken together, it thus seems that engagement of auditory attention increases alpha power in visual cortex areas. However, based on the lateralization effects discussed above, this cross-modal inhibition effect could be even more prominent in regions ipsilateral than contralateral to the attended ear.
The comparisons between cued and novelty-triggered attention effects also revealed significant differences at the beta range that could potentially be related to longer-term engagement of endogenous auditory attention. Significant sustained increases of beta power were observed in the right DLPFC, in the right anterior insula, and in the right IFC 600–1400 msec after cues versus novels. In comparison with the well-documented links between gamma oscillations and top–down processing, the association of beta oscillations with attention and cognitive control has remained somewhat elusive (Engel & Fries, 2010). However, beta power increases have been previously observed during attentional expectancy of stimuli (de Oliveira et al., 1997; Roelfsema et al., 1997; Montaron et al., 1979) and also during increased endogenous processing of auditory stimuli (Iversen et al., 2009). It should also be noted that the right DLPFC and anterior insula/IFC, where the present sustained beta increases during cued versus novelty-triggered attention originated, have previously been reported to be activated during top–down auditory attention tasks (Huang, Belliveau, Tengshe, & Ahveninen, 2012; Alain, Shen, Yu, & Grady, 2010; Rinne, Koistinen, Salonen, & Alho, 2009; Wu, Weissman, Roberts, & Woldorff, 2007; Zatorre et al., 1999).
Sustained power increases at 600–1400 msec to cues versus novels were also observed at the lower gamma range (30–60 Hz), centered in the precuneus, retrosplenial cortex, and posterior cingulate cortices. These areas have been previously linked to spatial attention (Grent-'t-Jong & Woldorff, 2007; Shomstein & Yantis, 2006; Small et al., 2003; Corbetta & Shulman, 2002). Previous fMRI studies have also shown that, of these areas, the posterior cingulate cortex is activated during anticipatory allocation of spatial attention (Small et al., 2003). Medial parietal cortex areas including the precuneus are, in turn, activated during both visual (Astafiev et al., 2003) and auditory (Smith et al., 2010; Wu et al., 2007; Shomstein & Yantis, 2006) spatial attention tasks. The significant increase of sustained gamma activity at 600–1400 msec after cues, as compared with novels, could thus reflect increased engagement of audiospatial attention. However, in this study, correlations between gamma increases and behavioral performance measures were focused in lateral frontoparietal and temporal regions, whereas the correlations in medial parietal/posterior cingulate regions did not survive the cluster-based post hoc correction.
Finally, the power comparisons between cued and novelty-triggered attention suggest sustained decreases of sensorimotor mu rhythms near the left hemisphere representation of the right hand, which was utilized for responding in this study. These effects are consistent with previous human MEG observations obtained in tactile attention studies (van Ede, de Lange, & Maris, 2012) as well as motor MEG studies (Donner, Siegel, Fries, & Engel, 2009) that have suggested lateralized motor/premotor cortex beta power decreases during preparation for contralateral hand movements.
Potential Limitations
A central question is whether our participants indeed utilized spatial attention to perform the task. That is, analogous to previous dichotic listening studies, the nontarget sound sequences were also separable based on their frequency (Näätänen, 1992; Woldorff et al., 1991; Hillyard et al., 1973). This may have, in theory, provided an alternative nonspatial streaming cue. However, according to the present behavioral control experiment, spatially valid versus invalid cues improved performance, suggesting that spatial cueing (and attention) was beneficial. The interpretation that our participants utilized spatial attention is also supported by the lateralized alpha effects to the left versus right ear cues (Figure 3), which are consistent with previous auditory–spatial attention studies (Thorpe et al., 2012; Banerjee et al., 2011). It is also worth noting that there were a number of factors that likely counteracted frequency (or pitch)-based streaming in the present paradigm. First, frequency-based streaming does not occur instantly (Bregman, 1978) but can take several seconds to evolve, depending on a variety of stimulus parameters. Here, the task was divided into 10-sec trials, interrupted by the scanner noise stimulus, which according to previous studies results in a washout of streaming (Bregman, 1978). Furthermore, in contrast to typical studies on binaural auditory grouping where participants are presented with regularly alternating patterns or temporally consistent melody components (Darwin, 1997; Deutsch, 1979), here, the SOA was jittered and the order of dichotic presentation was randomized. Within each trial, there were additional interruptions caused by the cues, targets, and novel sounds. It is therefore very unlikely that frequency-based streaming could have evolved during these trials, at least to a degree that would have made it a stronger cue than the spatial separation across the ears (for further details on comparisons between dichotic vs. frequency/pitch cues, see Woods, Alain, Diaz, Rhodes, & Ogawa, 2001; Näätänen, Porkka, Merisalo, & Ahtola, 1980).
A potential limitation of this study needing further experimental attention is that, because of technical challenges, it was not possible to include several control conditions in the present MEG/EEG paradigm. For example, it would have been beneficial to have the novel sound appear also in the task-relevant channel to make it possible to better control for the effect of the saliency of the novel sounds versus the involuntary attention shifting effect. At the same time, because of the nature of novelty detection, there was no behavioral response associated with sounds following novels to confirm that attention had indeed shifted to the stimuli. However, this concern is alleviated by numerous previous studies that demonstrate, using both psychophysical and neurophysiological measures, that stimuli similar to the present novel sounds trigger very strong involuntary auditory attention shifting (see, e.g., Escera et al., 1998).
Although the present MEG/EEG source localization methods can separate functionally distinct anatomical areas with sufficient accuracy (Sharon et al., 2007), their spatial resolution is not as good as that provided, for example, by fMRI. Even with current anatomical and functional constraints, the estimates are spatially distributed, which is why the present discussion has concentrated on the overall statistical patterns. A further limitation of MEG/EEG is their limited capability to detect simultaneous synchronous activity on the opposite banks of sulci because of cancellation effects (Ahlfors et al., 2009). This limitation might, for example, explain why the present alpha lateralization effects were more prominent in the lateral than medial parieto-occipital cortices. At the same time, it is important to note that the time resolution of oscillatory analyses is limited. For example, at the lowest frequency of 4 Hz, oscillatory power is estimated with a three-cycle, that is, 750-msec sliding window. In this case, a power estimate at any time point is thus affected by activity up to 375 msec before and after the centerpoint of the analysis window. However, because the sliding window is tapered, the results are strongly weighted toward the midpoint of the window.
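The window arithmetic in the preceding paragraph can be made explicit with a short sketch. It assumes that the window spans a fixed three cycles at every analysis frequency, as stated for the 4 Hz case; the function name and the example frequencies other than 4 Hz are illustrative and are not taken from the analysis pipeline.

```python
def window_ms(freq_hz, n_cycles=3.0):
    """Return (window length, half-width) in msec for an n-cycle sliding window.

    Assumption: the window covers a fixed number of cycles at each frequency,
    so its duration is n_cycles / freq_hz seconds.
    """
    length_ms = 1000.0 * n_cycles / freq_hz
    return length_ms, length_ms / 2.0


# 4 Hz is the case described in the text; 10 and 40 Hz are illustrative only.
for f in (4.0, 10.0, 40.0):
    length, half = window_ms(f)
    print(f"{f:g} Hz: {length:.0f} msec window, "
          f"+/-{half:.1f} msec around the centerpoint")
```

Under this assumption, the temporal smearing shrinks in proportion to frequency, so the caveat would apply most strongly to the theta and alpha ranges.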
Finally, the scope of this study is limited to the power estimates. Future studies examining the functional connectivity patterns using cortico-cortical phase-locking analyses are needed to clarify the role of the different areas showing effects related to cued and novelty-triggered attention. For example, Granger causality estimates (Lin et al., 2009) could be highly informative in elucidating the internal driving relationships across the areas showing sustained gamma correlation patterns during postcue engagement of attention.
Conclusions
Our results suggest distinct modulations of neuronal oscillations during cued auditory spatial attention and novelty-triggered orienting. Endogenous engagement of attention, subsequent to cue-triggered orienting, may be associated with sustained higher-frequency gamma activity in frontoparietal and temporal cortices. The brain regions associated with endogenous engagement of auditory attention include the bilateral TPJ, anterior insula, IFC, and the right FEF. As shown by the lateralized alpha power modulations, cross-modal inhibition of parieto-occipital visual cortices by auditory attention is stronger in the hemisphere ipsilateral than contralateral to the attended ear. Comparisons of dynamic power estimates during cued and novelty-triggered attention also suggested increased theta activity in medial frontal cortices, increased alpha in the medial and lateral parieto-occipital regions, and increased beta activity in the right DLPFC and IFC/anterior insula.
Acknowledgments
We thank An-Yi Hung, Natsuko Mori, Stephanie Rossi, Chinmayi Tengshe, and Nao Suzuki for their help. This work was supported by National Institutes of Health Awards R01MH083744, R21DC010060, R01HD040712, R01NS037462, 5R01EB009048, and P41RR14075. The research environment was supported by National Center for Research Resources Shared Instrumentation Grants S10RR014978, S10RR021110, S10RR019307, S10RR014798, and S10RR023401.
Reprint requests should be sent to Jyrki Ahveninen, MGH/MIT/HMS-Martinos Center, Bldg. 149 13th Street, Charlestown, MA 02129, or via e-mail: jyrki@nmr.mgh.harvard.edu.