Most evidence on the neural and perceptual correlates of sensory processing derives from studies that have focused on only a single sensory modality and averaged the data from groups of participants. Although valuable, such studies ignore the substantial interindividual and intraindividual differences that are undoubtedly at play. Such variability plays an integral role both in the behavioral/perceptual realms and in the neural correlates of these processes, but substantially less is known about it than about group-averaged effects. Recently, it has been shown that the presentation of stimuli from two or more sensory modalities (i.e., multisensory stimulation) not only results in the well-established performance gains but also gives rise to reductions in behavioral and neural response variability. To better understand the relationship between neural and behavioral response variability under multisensory conditions, this study investigated both behavior and brain activity in a task requiring participants to discriminate moving versus static stimuli presented in either a unisensory or multisensory context. EEG data were analyzed with respect to intraindividual and interindividual differences in RTs. The results showed that trial-by-trial variability of RTs was significantly reduced under audiovisual presentation conditions as compared with visual-only presentations across all participants. Intraindividual variability of RTs was linked to changes in correlated activity between clusters within an occipital to frontal network. In addition, interindividual variability of RTs was linked to differential recruitment of medial frontal cortices. The present findings highlight differences in the brain networks that support behavioral benefits during unisensory versus multisensory motion detection and provide an important view into the functional dynamics within neuronal networks underpinning intraindividual performance differences.
A large body of work has focused its effort on disentangling the general principles underlying perceptual processes (and ultimately behavior). Much of this work has focused on reporting behavioral and/or neurophysiological findings that result from group-averaged data analysis approaches (for reviews, see Schmid & Maier, 2015; Pulvermüller & Fadiga, 2010; Peelen & Downing, 2007; Grill-Spector, 2003). Although such group-averaged analyses represent a necessary initial step, they often fail to address the enormous (and important) variability that characterizes performance both within and across individuals (Dickman, 1985; Witkin, 1949, 1950). The importance of interindividual differences is evident in behavior, perception, and cognition (for reviews, see Kanai & Rees, 2011; Kane & Engle, 2002). In addition to interindividual variability, substantial intraindividual variability is a hallmark of many sensory, perceptual, and cognitive processes (Simmonds et al., 2007; MacDonald, Nyberg, & Bäckman, 2006; Castellanos et al., 2005; Morell & Morell, 1966; Fiske & Rice, 1955). An exemplary account of such trial-by-trial fluctuations in sensory processing was provided by Dehaene (1993), who reported a periodic structure in the distribution of RTs across a series of auditory and visual discrimination tasks. The stochastic nature of the RT distributions suggested intertrial fluctuations in the accumulation of sensory information and/or response generation. More recently, these trial-to-trial changes in performance have been linked to differences in state-dependent neural processing that in turn cascade into differences in the processing time of perceptual information (e.g., Bourgeois, Chica, Valero-Cabré, & Bartolomeo, 2013; Corbetta & Shulman, 2002).
Additional evidence has been gathered regarding the neurophysiological mechanisms that may support these intraindividual differences (e.g., Chaumon & Busch, 2014; de Graaf et al., 2013; Romei, Gross, & Thut, 2010). One example is that trial-to-trial differences in the magnitude and latency of evoked gamma band responses can predict variability in response speed in visual detection tasks (Fründ, Busch, Schadow, Körner, & Herrmann, 2007).
Although these studies have produced important insights into the functional mechanisms underpinning intraindividual variability, lesion and hemodynamic imaging studies have provided additional evidence on the neuronal architecture supporting such variability. For example, lesions to frontal cortices, right inferior parietal cortex, and regions of the thalamus can result in increased variability in intraindividual RT (Bellgrove, Hester, & Garavan, 2004; Stuss, Murphy, Binns, & Alexander, 2003; Stuss, Floden, Alexander, Levine, & Katz, 2001). In a reexamination of neuroimaging data across a broad range of visual experimental tasks, Yarkoni and colleagues reported evidence for an extended network including visual cortices as well as cerebellum and additional subcortical structures to be related to RT variability (Yarkoni, Barch, Gray, Conturo, & Braver, 2009). Collectively, these studies have revealed a highly distributed network of brain regions that appears to play a role in trial-to-trial response variability, with a specific emphasis on RTs.
Although these studies are highly revealing as to the characteristics of interindividual and intraindividual variability at both the behavioral and neural levels, it is important to note that they have largely focused on a single sensory system, namely vision. As an extension of this work, there is a growing body of evidence showing both perceptual benefits and reduced behavioral variability when multiple redundant signals are presented within a single sensory modality (e.g., two redundant visual targets; Los & Van der Burg, 2013; Pérez-Bellido, Soto-Faraco, & López-Moliner, 2013; Miller, 1982). However, even under these unisensory redundant signal presentation conditions, substantial intraindividual response variability still exists (Krummenacher, Grubert, Töllner, & Müller, 2014; Ivanov & Werner, 2009; Martuzzi et al., 2006; Iacoboni & Zaidel, 2003; Murray, Foxe, Higgins, Javitt, & Schroeder, 2001; Saron, Schroeder, Foxe, & Vaughan, 2001; Miniussi, Girelli, & Marzi, 1998).
Under naturalistic circumstances, sensory information about an event is frequently conveyed through multiple sensory systems. Take as an example a bouncing ball, in which the visual and auditory cues provide complementary information about the ball's collision with the floor. Under such multisensory conditions, intraindividual response variability can be significantly reduced when compared with unisensory presentation conditions (reviewed in Murray & Wallace, 2011). Moreover, these investigations have provided robust evidence that the decreases in variability of behavioral and neurophysiological responses observed under multisensory presentation conditions exceed those observed under redundant unisensory trials (e.g., Gingras, Rowland, & Stein, 2009). Despite a growing number of circumstances in which such multisensory redundancy-mediated reductions in performance variability have been demonstrated, the neural correlates of these effects remain poorly understood. The aim of the current study was to address this open question.
Several studies have provided some insight into this issue and have shown that multisensory presentations that result in fast RTs are accompanied by increased power and phase coherence within early, low-level sensory cortices (Altieri, Stevenson, Wallace, & Wenger, 2015; Mercier et al., 2015; Sperdin, Cappe, Foxe, & Murray, 2009; Senkowski, Molholm, Gomez-Ramirez, & Foxe, 2006). Although these data provide some information about the temporal dynamics underpinning RT variability under multisensory conditions, hemodynamic imaging has provided additional insight into the important nodes in a putative network. For example, Noppeney, Ostwald, and Werner (2010) found activity within a widespread network, spanning occipital to frontal cortices, to be modulated by both the ambiguity of the auditory and visual stimuli and their congruency in a visual object categorization task. Moreover, these authors provided evidence that this network activation is linked to the efficacy of the behavioral responses observed. Taken together, these previous studies provide a base of knowledge regarding the functional mechanisms and neuronal structures involved in response variability under multisensory presentations. Nonetheless, much remains to be elucidated, most notably how trial-by-trial changes in activity within the associated network(s) give rise to the striking variability observed in behavioral performance.
To this aim, in the current study, we reanalyzed previously published ERP data (Cappe, Thelen, Romei, Thut, & Murray, 2012). The initial study had been designed to investigate the neuronal mechanisms involved in the selective response facilitation observed under multisensory conditions for the detection of approaching (i.e., looming) versus receding motion cues. Specifically, participants were asked to detect motion under unisensory (auditory or visual-only) and multisensory (audiovisual) presentation conditions. The present analysis specifically focused on determining the neuronal networks underlying the RT variability that accompanies behavioral performance on this task.
After applying criteria for the minimal number of accepted trials per condition (detailed below), the data from eight healthy individuals were included in the current analyses (aged 18–28 years, mean = 23 ± 3 years; three women and five men; seven right-handed). All participants reported normal hearing and normal or corrected-to-normal vision. Handedness was assessed with the Edinburgh questionnaire (Oldfield, 1971). None of the participants reported a history of neurological or psychiatric illness. Participants provided written informed consent to the procedures that were approved by the ethics committee of the Faculty of Biology and Medicine of the University Hospital and University of Lausanne.
Stimuli and Procedure
We performed quartile-by-quartile analysis on a previously published data set (Cappe et al., 2012). A quartile analysis was chosen because interquartile range (IQR) is considered a robust measure of the spread of data, particularly when they are nonnormally distributed as is often the case for RT data from individual participants (Ratcliff, 1993). Only the features relevant to the quartile-by-quartile analysis will be detailed here. Briefly, participants performed a speeded detection task in which they indicated the presence of moving versus static stimuli by a simple button press. Stimuli could be presented in a unisensory (i.e., visual or auditory only) or multisensory (i.e., audiovisual) manner. Visual and auditory motion was perceived as either approaching or receding from the observer. In addition, the original design included static stimuli in both modalities. The experiment was composed of 15 conditions, consisting of six unisensory (auditory [A] and visual [V] only, static [s], receding [r], or approaching [looming = l] stimuli) and nine multisensory pairings (the full set of possible auditory–visual [AV] combinations). Overall, we collected 252 trials for each condition over 18 blocks. In anticipation of our analysis strategy, comparing predicted versus empirically derived cumulative distribution functions (CDFs; see Methods section for behavioral data), we here only considered multisensory stimuli that were composed of a combination of two unisensory motion cues (VlAl, VlAr, VrAl, and VrAr, together with their respective unisensory components: Vl, Vr, Al, Ar).
Visual motion stimuli consisted of a disc (initial size = 7° of visual angle) dynamically contracting (to 1°) or expanding (to 13°) over 500 msec of stimulus duration. Auditory stimuli consisted of 1000-Hz complex pure tones (44.1-kHz sampling, 500-msec duration, 10-msec rise/fall to avoid clicks), composed of triangular waveforms. To induce the perception of motion, the amplitude of the stimuli was linearly modulated (77 ± 10 dB SPL) over the stimulus duration period. After stimulus offset, a variable intertrial interval of 800–1400 msec was interleaved such that the onset of the next trial could not be anticipated by participants. In addition, all audiovisual stimulus pairings were presented synchronously. Stimulus delivery and response recording were controlled by E-Prime in conjunction with their Serial Response Box (Psychology Software Tools, Pittsburgh, PA; www.pstnet.com).
Concurrently with the behavioral task, we acquired continuous high-density (160-channel BioSemi ActiveTwo; www.biosemi.com) EEG at 1024 Hz. The low-impedance AD box references the data online to the common mode sense (active electrode), while grounding the data to the driven right leg (passive electrode). This functions as a feedback loop, driving the average instantaneous potential over the whole montage to the amplifier zero (for a more detailed description of the setup, see www.biosemi.com/faq/cms&drl.htm).
Data Processing and Analyses
The aim of the current study was to investigate the neuronal correlates underpinning intraindividual RT variability observed at the behavioral level. To this end, only task-relevant conditions requiring participants to respond to either sensory modality were included in the analyses (Vl, Vr, Al, Ar, VlAl, VrAr, VrAl, VlAr). To investigate the neuronal networks underpinning response speed variability, we ranked RT data for each of these eight conditions separately into four quartiles and calculated the mean response speed for each bin. Furthermore, the present analyses sought to assess the neuronal networks underpinning multisensory benefits of behavioral responses over unisensory events. To this end, behavioral and EEG data were subsequently averaged across conditions, leading to three grand averages independently of motion direction (A, V, and AV). In what follows, we only considered trials where RTs fell within the first and last of the quartiles, to compare behavioral and neuronal responses upon the fastest and slowest trials within each participant. Consequently, the data analyses carried out here specifically tested (1) differential processing under unisensory versus multisensory presentations and its impact on behavior and (2) the neural correlates underpinning response variability in terms of RTs.
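The ranking of RTs into quartile bins described above can be illustrated with a minimal sketch. It is not the original analysis code: the function name `quartile_bin_means`, the simulated gamma-distributed RTs, and the way leftover trials are distributed across bins (NumPy's `array_split`) are assumptions made purely for illustration.

```python
import numpy as np

def quartile_bin_means(rts):
    """Rank the RTs of one condition, split them into four bins, and
    return the mean RT of each bin (fastest bin first)."""
    rts = np.sort(np.asarray(rts, dtype=float))
    if rts.size < 4:
        raise ValueError("need at least four trials to form quartile bins")
    # array_split tolerates trial counts that are not a multiple of four
    bins = np.array_split(rts, 4)
    return np.array([b.mean() for b in bins])

# Hypothetical single-participant RTs (msec) for one condition;
# 252 matches the number of trials collected per condition.
rng = np.random.default_rng(0)
example_rts = 200.0 + rng.gamma(shape=4.0, scale=60.0, size=252)

bin_means = quartile_bin_means(example_rts)
fast_mean, slow_mean = bin_means[0], bin_means[-1]  # first and last quartiles
```

Only the first and last entries of `bin_means` correspond to the fast and slow trials retained for the subsequent analyses.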
To assess the occurrence of multisensory facilitation, response accuracy and RTs were initially computed for each condition, separately. Race models were calculated to evaluate the occurrence of redundant signal effects under multisensory versus unisensory conditions (Ulrich, Miller, & Schröter, 2007; Ulrich & Miller, 1997). The race model assumes that auditory and visual information is processed independently upon multisensory presentations and that responses are triggered by the faster unisensory process. Therefore, CDFs of RTs for multisensory events can be computed based on the observed unisensory CDFs. These multisensory CDF predictions are then compared with the empirical CDFs from the observed RTs. If the empirical CDFs show significantly faster RTs for 20–50% of the percentiles, this is considered a race model violation and suggests that multisensory information was integrated before motor response initiation (Miller & Ulrich, 2003). Note, however, that neural integration can occur in the absence of evidence for race model violation in behavioral data (Murray et al., 2001).
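As an illustration of this comparison, the sketch below evaluates the empirical audiovisual CDF against a race model prediction. It assumes that the prediction takes the form of Miller's inequality, F_AV(t) ≤ F_A(t) + F_V(t), capped at 1; an independent race would instead predict F_A + F_V − F_A·F_V. The function names, the percentile grid, and the use of pooled RT quantiles as evaluation time points are hypothetical choices, not details taken from the analysis reported here.

```python
import numpy as np

def ecdf(rts, t):
    """Empirical CDF of `rts`, evaluated at each time point in `t`."""
    rts = np.sort(np.asarray(rts, dtype=float))
    return np.searchsorted(rts, t, side="right") / rts.size

def race_model_difference(rt_a, rt_v, rt_av, probs=None):
    """Return the evaluation times and (empirical AV CDF - race bound).
    Positive values indicate RTs faster than the bound allows."""
    if probs is None:
        probs = np.arange(0.05, 1.0, 0.10)  # assumed percentile grid
    pooled = np.concatenate([rt_a, rt_v, rt_av]).astype(float)
    t = np.quantile(pooled, probs)          # common time axis
    bound = np.minimum(ecdf(rt_a, t) + ecdf(rt_v, t), 1.0)
    return t, ecdf(rt_av, t) - bound
```

In a group analysis, the bound and the empirical CDF would be computed per participant and then compared statistically at each percentile, as described above.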
Interindividual Effects and Response Facilitation
In addition to investigating the response variability within participants and across conditions, we also addressed how individual differences in response variability affect neuronal processing. To this end, we computed the difference in response speed between the mean RTs within the first and last quartiles for each participant and each condition (i.e., the IQR). IQRs have been considered a more robust quantification of the width of RT distributions than other measures of dispersion, such as standard deviations, because these distributions are nonnormal by nature (see Ratcliff, 1993, for further details). Thus, the index chosen here reflects the width of the individual response distributions and serves as a descriptor of the response variability for each participant.
Furthermore, this RT difference also served as a variable when directly testing the assumption that RT distributions were narrowed rather than broadened under multisensory conditions as compared with unisensory conditions. In recent years, there has been a debate as to whether RT distributions are skewed or broadened under multisensory as compared with unisensory presentation conditions (see Otto, Dassy, & Mamassian, 2013). Although not the central focus of this study, our data contribute to the resolution of this debate by providing evidence that RT distributions are significantly skewed, rather than broadened, under multisensory presentation conditions.
EEG Data Preprocessing
EEG data were imported into MATLAB (www.mathworks.com), and preprocessing was performed using functions derived from the free EEGLAB toolbox and its ERPLAB plug-in (Lopez-Calderon & Luck, 2014; Delorme & Makeig, 2004). After import, a conventional 40-Hz finite impulse response low-pass filter was applied to the data. Subsequently, epochs from 200 msec prestimulus to 700 msec poststimulus onset were extracted for each of the experimental conditions and from each participant to calculate ERPs. Epochs containing ±80 μV artifacts, eye blinks, or other noise transients were rejected by trial-by-trial visual inspection. Remaining epochs were binned according to RTs into fast (first quartile of the RT distribution) and slow (last quartile of the RT distribution) trials, and single-participant averages were computed for each condition separately. The single-participant ERPs were then exported to CARTOOL (Brunet, Murray, & Michel, 2011; sites.google.com/site/cartoolcommunity/files) for further processing. Data at artifact electrodes were interpolated using 3-D splines before creating the single-participant supra-condition averages (Perrin, Pernier, Bertrand, Giard, & Echallier, 1987). Baseline-corrected group-averaged ERPs were computed over 100 msec prestimulus to 600 msec poststimulus onset. When calculating ERPs, we equated the number of trials from the various contributing stimulus pairings, in the case of AV trials, and the number of artifact-free trials from each quartile. These criteria resulted in the exclusion of data from 6 of the original 14 participants because a reliable ERP was not evident upon visual inspection of their data after equating the number of trials.
General ERP Analysis Framework
Differences in neuronal activity were identified within an electrical neuroimaging framework, implemented in a variety of freeware and toolboxes (CarTool: Brunet et al., 2011; Randomization Graphical User interface: Koenig, Kottlow, Stein, & Melie-García, 2011; STEN toolbox developed by Jean-François Knebel [www.unil.ch/line/home/menuinst/about-the-line/software–analysis-tools.html]). This particular framework allows us to differentiate between modulations in response strength (global field power [GFP]) and/or configuration (topography of the electric field) of neuronal networks recruited between conditions (for a review, see Murray, Brunet, & Michel, 2008). Ultimately, we estimated and statistically assessed the neuronal sources involved by using the local autoregressive average (LAURA) distributed linear inverse solution (Michel et al., 2004).
Last, to further investigate the neuronal correlates of interindividual differences of response variability (i.e., differences in the spread of RTs between participants), we submitted the data to an additional ANCOVA design, using the IQR as a covariate.
ERP Waveform Modulations
In a first step, we entered ERPs into a repeated-measures ANOVA, to analyze the waveforms from all electrodes as a function of time poststimulus onset. We specifically tested for differences due to response speed (i.e., first vs. fourth quartiles) and possible interactions with condition (i.e., A, V, AV). Temporal autocorrelation at the level of individual electrodes was corrected by applying a threshold criterion of ≥11 consecutive data points (∼11 msec; Guthrie & Buchwald, 1991). In addition, only effects present at >5% of channels (i.e., ≥8) were considered reliable. This was implemented as a way to account for spatial correlation, which also varies as a function of time and thus cannot be set a priori. This mass univariate analysis of voltage waveforms was chosen to provide an overview of the spatiotemporal dynamics and distribution of the statistical effects. We emphasize that our analyses of interest were those based on reference-independent and global measures of the electric field at the scalp. Because these are global measures, no correction for spatial correlation was necessary.
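The temporal criterion can be sketched for a single electrode as follows. This is a minimal illustration rather than the toolbox implementation: the function name `sustained_mask` is hypothetical and the alpha level of .05 is an assumption, as the threshold is not restated here. Only runs of at least 11 consecutive significant time points are retained.

```python
import numpy as np

def sustained_mask(p_values, alpha=0.05, min_run=11):
    """Boolean mask marking time points that belong to a run of at least
    `min_run` consecutive p values below `alpha` (alpha is an assumption)."""
    sig = np.asarray(p_values, dtype=float) < alpha
    keep = np.zeros(sig.size, dtype=bool)
    run_start = None
    # The appended False acts as a sentinel that closes a run reaching the end
    for i, is_sig in enumerate(np.append(sig, False)):
        if is_sig and run_start is None:
            run_start = i
        elif not is_sig and run_start is not None:
            if i - run_start >= min_run:
                keep[run_start:i] = True
            run_start = None
    return keep
```

Applied electrode by electrode, the resulting masks could then be summed across channels at each time point to apply the additional spatial criterion on the number of channels showing an effect.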
In addition, the analyses of voltage ERP waveforms at each electrode revealed a minimal contribution of auditory ERPs to the overall pattern of statistical results. Thus, although the full experimental design including auditory trials was considered throughout the analyses, we reanalyzed and focused the present report on results derived from a 2 × 2 ANOVA design, with the factors of Response speed (fast, slow) and Condition (V, AV). This approach led to the added advantage of increasing the observed effect sizes by reducing the number of factors considered in our statistical analyses.
As mentioned above, analyses of ERP voltage waveforms are reference dependent, with the consequence that statistical effects (and interpretations thereof) will also depend on the choice of the reference location (Murray et al., 2008). Consequently, our analyses focus instead on reference-independent global measures of ERP strength and topography that were analyzed within a so-called electrical neuroimaging framework (Michel & Murray, 2012). The first measure is GFP, which is the root mean square of the voltage data across the scalp (Lehmann & Skrandies, 1980). GFP is larger for stronger ERPs but provides no information about the spatial distribution of the ERP. Here, GFP was analyzed with a 2 × 2 ANOVA using within-participant factors of RT speed (fast vs. slow trials) and Condition (V vs. AV). ANOVA was performed on a millisecond-by-millisecond basis. Correction for temporal autocorrelation was achieved by considering as reliable only those effects lasting for at least 11 consecutive data points (∼11 msec; Guthrie & Buchwald, 1991). The second global measure is global dissimilarity (DISS), which is the root mean square of the difference between two GFP-normalized vectors (Lehmann & Skrandies, 1980). DISS can be analyzed in a factorial design using the Randomization Graphical User interface (Koenig et al., 2011). Furthermore, in an additional ANCOVA design, we tested the impact of individual behavioral response variability, quantified here as the IQR for each experimental condition, on brain responses to the AV and V conditions leading to slow and fast RTs. Subsequently, significant effects were assessed by submitting the data to post hoc t tests.
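The two global measures can be written compactly. The sketch below follows the definitions given above (GFP as the root mean square across electrodes; DISS as the root mean square of the difference between two GFP-normalized maps) and assumes that the data are first recalculated against the average reference, as is conventional for these measures. The function names and the array layout (electrodes × time points) are illustrative assumptions.

```python
import numpy as np

def average_reference(v):
    """Re-reference a (n_electrodes, n_times) array to the average reference."""
    return v - v.mean(axis=0, keepdims=True)

def gfp(v):
    """Global field power: root mean square across electrodes per time point."""
    v = average_reference(np.asarray(v, dtype=float))
    return np.sqrt((v ** 2).mean(axis=0))

def diss(u, v):
    """Global dissimilarity between two ERPs per time point.
    0 = identical topographies, 2 = inverted topographies.
    Time points with zero GFP are not handled in this sketch."""
    u = average_reference(np.asarray(u, dtype=float))
    v = average_reference(np.asarray(v, dtype=float))
    u_norm = u / gfp(u)
    v_norm = v / gfp(v)
    return np.sqrt(((u_norm - v_norm) ** 2).mean(axis=0))
```

Because each map is scaled by its own GFP, DISS is insensitive to pure changes in response strength, which is what allows strength and topography to be assessed separately.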
A topographic clustering analysis was also performed on the four group-averaged ERPs using CarTool. Specifically, we applied an atomize and agglomerate hierarchical clustering (AAHC) approach that uses measures of global explained variance alongside spatial correlation (see Brunet et al., 2011, and Murray et al., 2008, for detailed descriptions of the methods). By way of summary, topographic clustering is a data-driven and largely assumption-free means for identifying the minimal number of ERP topographies that explains a maximum of variance in the cumulative data set (here, the four group-averaged ERPs). Once this set of topographies and their sequence in time poststimulus onset was identified, they were used as template maps for the fitting to single-participant ERPs. This fitting is based on the spatial correlation between a given template map and the single-participant ERP at a given moment poststimulus for each condition and RT speed. As output, the fitting procedure yields the total amount of time a given template map was associated with responses to a given condition and/or RT speed.
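The fitting step can be sketched as follows, assuming a winner-take-all assignment in which each time point is labeled with the template map showing the highest (polarity-sensitive) spatial correlation. This is an illustrative simplification rather than the CarTool implementation, which may apply additional constraints such as temporal smoothing. The function name and arguments are hypothetical; the default sampling rate is the acquisition rate, whereas the fitted ERPs may have been resampled.

```python
import numpy as np

def fit_template_maps(erp, templates, sfreq=1024.0):
    """Label each time point of a single-participant ERP with the
    best-correlated template map and return, for each map, the total
    time (msec) over which it was the winner.

    erp:       (n_electrodes, n_times) array
    templates: (n_maps, n_electrodes) array
    """
    erp = np.asarray(erp, dtype=float)
    templates = np.asarray(templates, dtype=float)
    # Center across electrodes and scale to unit norm so that the dot
    # product equals the spatial (Pearson) correlation
    erp_c = erp - erp.mean(axis=0, keepdims=True)
    tmp_c = templates - templates.mean(axis=1, keepdims=True)
    erp_n = erp_c / np.linalg.norm(erp_c, axis=0, keepdims=True)
    tmp_n = tmp_c / np.linalg.norm(tmp_c, axis=1, keepdims=True)
    corr = tmp_n @ erp_n                       # (n_maps, n_times)
    labels = corr.argmax(axis=0)               # winning map per time point
    counts = np.bincount(labels, minlength=templates.shape[0])
    return counts * 1000.0 / sfreq
```

The returned durations correspond to the output described above and could be entered into the same factorial design as the other ERP measures.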
We estimated the neuronal sources of the electrical activity measured at the level of the scalp using a distributed linear inverse solution (minimum norm) together with the LAURA regularization approach (Grave de Peralta Menendez, Murray, Michel, Martuzzi, & Gonzalez Andino, 2004; Michel et al., 2004; Grave de Peralta Menendez, Gonzalez Andino, Lantz, Michel, & Landis, 2001). LAURA selects the source configuration that best mimics the biophysical behavior of electric vector fields (i.e., according to electromagnetic laws, activity at one point depends on the activity at neighboring points). In our study, homogeneous regression coefficients in all directions and within the whole solution space were used. LAURA uses a realistic head model, and the solution space included 3,005 nodes, selected from a 6 × 6 × 6 mm grid of equally distributed nodes within the gray matter of the Montreal Neurological Institute's average brain (courtesy of R. Grave de Peralta and S. Gonzalez Andino; www.electrical-neuroimaging.ch/). Prior basic and clinical research from members of our group and others has documented and discussed in detail the spatial accuracy of the inverse solution model used here (e.g., Martuzzi et al., 2009; Gonzalez Andino, Murray, Foxe, & de Peralta Menendez, 2005; Michel et al., 2004).
The results of the above topographic pattern analysis defined periods for which intracranial sources were estimated and statistically compared between conditions (here, 183–250 msec poststimulus). Before calculation of the inverse solution, the ERPs were downsampled and affine-transformed to a common 111-channel montage. Statistical analyses of source estimations were performed on a single average data point over the 183- to 250-msec poststimulus onset epoch. This procedure increases the signal-to-noise ratio of the data from each participant. The inverse solution was then estimated for each of the 3,005 nodes. Subsequently, the data were entered into a 2 × 2 ANOVA with the factors of Response speed (i.e., fast vs. slow trials) and Condition (i.e., AV, V). In addition, the data were submitted to an ANCOVA, using the difference in RTs between the first and fourth quartiles as a covariate for each condition. Statistical results were corrected using a spatial extent criterion of at least 12 contiguous significant nodes. This spatial criterion was determined using the AlphaSim program (available at afni.nimh.nih.gov) and assuming a spatial smoothing of 2-mm FWHM and cluster connection radius of 8.5 mm. After 10,000 Monte Carlo iterations, a cluster of 10 nodes was observed with a probability of .034, yielding a corresponding node-level p value of ≤.001 (see Thelen, Cappe, & Murray, 2012; Sperdin, Cappe, & Murray, 2010; Toepel, Knebel, Hudry, le Coutre, & Murray, 2009, for similar criteria). Results have been rendered on the Montreal Neurological Institute's average brain with the Talairach and Tournoux (1988) coordinates of the largest statistical differences within each cluster indicated.
In a last exploratory step, we investigated the relationship between activity within a left-lateralized occipital cluster and the activity between the clusters identified by the main effect of quartile (i.e., fast vs. slow RTs) to shed light on the neuronal network interactions underpinning our results. This analysis was motivated by a recent hemodynamic study by Noppeney et al. (2010), which revealed a linear relationship between activity within visual and frontal areas and trial-by-trial response efficacy.
To extract the activity within the most prominent occipital cluster while minimizing the contribution of weakly responsive sources, we only considered nodes with current density values exceeding 2 SDs above the whole-brain volume's mean in each condition (here, mean + 2 SDs: Vslow = 0.0008 + 0.0011 μA/mm³, Vfast = 0.0008 + 0.0012 μA/mm³, AVslow = 0.0008 + 0.001 μA/mm³, and AVfast = 0.0007 + 0.001 μA/mm³; see Thelen et al., 2012, for a similar procedure). A cluster within left visual cortices extending to the middle temporal cortex was identified showing the strongest activations during the 183- to 250-msec poststimulus onset period in all conditions (coordinates of nodes with maximum current source density values: Vslow = −48, −61, 1 mm; Vfast/AVslow/AVfast = −49, −67, 6 mm; middle temporal gyrus [MTG], BA 37). No further nodes exceeding our statistical threshold were found.
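The selection of strongly responsive nodes amounts to a simple threshold on the source estimates, sketched below for one condition. The function name is hypothetical, and the use of the population (rather than sample) standard deviation is an assumption; with 3,005 nodes the difference is negligible.

```python
import numpy as np

def suprathreshold_nodes(current_density, n_sd=2.0):
    """Return the indices of nodes whose current density exceeds the
    whole-volume mean by more than `n_sd` standard deviations, together
    with the threshold value itself."""
    cd = np.asarray(current_density, dtype=float)
    threshold = cd.mean() + n_sd * cd.std()   # population SD (assumption)
    return np.flatnonzero(cd > threshold), threshold
```

Applying the function to the time-averaged source estimates of each condition yields the candidate nodes from which the most prominent cluster is then identified.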
Consequently, mean current density values for the cluster within the occipital cortex and each of the clusters yielding a main effect of RT quartile were extracted (i.e., first vs. last quartiles of the response distribution). More precisely, the mean activity across all voxels within three separate clusters, situated within the left inferior frontal gyrus (IFG), the right angular gyrus (AG)/MTG, and the right inferior parietal lobule (see Results for further details), was considered. We then correlated (1) the mean activity within each of these clusters with the activity within the occipital cluster and (2) the mean activity between each of the three clusters as a function of time. Although we are hesitant to overinterpret correlational relationships between activity patterns, this approach can at least reveal the basic interactions within a functional network. Given the small sample size, we used Spearman's nonparametric rank-ordered correlation coefficient and a bootstrapping procedure with 2,000 iterations to assess statistical reliability.
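A minimal sketch of such a correlation analysis for one pair of clusters at one time point is given below. It assumes that participants are resampled with replacement and that reliability is summarized as a percentile confidence interval; neither detail is specified above, and all function names are hypothetical.

```python
import numpy as np

def average_ranks(x):
    """Rank values from 1..n, assigning tied values their average rank."""
    x = np.asarray(x, dtype=float)
    ranks = np.empty(x.size)
    ranks[np.argsort(x, kind="mergesort")] = np.arange(1, x.size + 1)
    for value in np.unique(x):
        tied = x == value
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return np.corrcoef(average_ranks(x), average_ranks(y))[0, 1]

def bootstrap_spearman(x, y, n_boot=2000, seed=0):
    """Return rho and a 95% percentile bootstrap interval (assumed form)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    boots = []
    while len(boots) < n_boot:
        idx = rng.integers(0, x.size, size=x.size)   # resample participants
        # Skip degenerate resamples in which either variable is constant
        if np.unique(x[idx]).size < 2 or np.unique(y[idx]).size < 2:
            continue
        boots.append(spearman_rho(x[idx], y[idx]))
    lower, upper = np.percentile(boots, [2.5, 97.5])
    return spearman_rho(x, y), (lower, upper)
```

With only eight participants, the bootstrap distribution is coarse, which is one reason to treat the resulting correlations as descriptive of network interactions rather than as strong inferential evidence.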
For the original analyses of the RT data, we refer readers to the previously published article (Cappe, Thut, Romei, & Murray, 2009). In the current study, we replicate the central behavioral result of speeded RTs under combined audiovisual stimulation, even with the smaller sample size dictated by the EEG analyses (significant main effect of condition: F(2, 6) = 43.898, p < .001, ηp² = 1; post hoc t test confirming faster RTs to audiovisual presentations, median ± SEM: AV = 427 ± 30 msec, V = 473 ± 20 msec, A = 624 ± 28 msec; AV vs. V: t(7) = −2.532, p = .039; AV vs. A: t(7) = −9.823, p < .001; see Figure 1A).
We first sought to determine whether the audiovisual response speeding exceeded race model predictions (Ulrich et al., 2007; Ulrich & Miller, 1997). To this end, we modeled multisensory CDFs based on the empirically derived unisensory CDFs and compared these with the empirically derived multisensory CDFs. We then entered the data from the modeled and empirically derived multisensory CDFs into a repeated-measures ANOVA with the factors of Data type (empirical vs. modeled) and CDF percentile. This analysis revealed a main effect of Percentile (F(5, 3) = 142.5, p = .001, ηp² = 1) and a significant Data type × Percentile interaction (F(5, 3) = 9.6, p = .046, ηp² = 0.66). Subsequently, we performed post hoc one-tailed t tests on each percentile, which revealed significant race model violations for trials within the first ∼40 percentiles (p = .026). Note that one-tailed tests were conducted as we specifically tested for facilitation beyond race model predictions (i.e., a unidirectional effect). Furthermore, we divided the RT distributions into the fastest (first) and slowest (last) quartiles. Figure 1B plots the median RTs for the first and last quartiles on the right of the CDFs. This figure illustrates the main effect of quartile (F(1, 7) = 142.919, p < .001, ηp² = 1), the main effect of condition (F(2, 6) = 47.649, p < .001, ηp² = 1), and the Condition × Quartile interaction (F(2, 6) = 8.627, p = .017, ηp² = 0.816).
In a final step of the behavioral analyses, to assess response variability between conditions and at the interindividual level, we computed the RT difference between the means of the first and fourth quartiles for each condition, both at the group level (Figure 2A) and for each participant (i.e., an approximation of the IQR; see Figure 2B). The one-way ANOVA on these RT difference scores revealed a significant main effect of condition (F(2, 6) = 8.51, p = .018, ηp2 = 0.81; Figure 2A). Post hoc t tests showed that the difference score was significantly smaller for audiovisual RTs than for either visual or auditory RTs (median difference score ± SEM: AV = 227 ± 21.6 msec, V = 261 ± 24.2 msec, A = 428 ± 50.5 msec; AV vs. V: t(7) = −2.441, p = .045; AV vs. A: t(7) = −4.106, p = .005). In addition, the difference score was significantly smaller for visual than for auditory RTs (t(7) = −3.443, p = .011). Together, the behavioral data strongly support the presence of reduced variability under redundant (audiovisual) presentation conditions.
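The difference score described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: quartiles are formed by sorting the trials and taking the fastest and slowest quarter, with any remainder trials left in the middle of the distribution; the function name is hypothetical.

```python
from statistics import mean


def quartile_difference(rts):
    """Mean RT of the slowest quartile minus mean RT of the fastest quartile."""
    ordered = sorted(rts)
    per_quartile = len(ordered) // 4  # remainder trials stay in the middle
    if per_quartile == 0:
        raise ValueError("at least four trials are needed")
    fastest = ordered[:per_quartile]
    slowest = ordered[-per_quartile:]
    return mean(slowest) - mean(fastest)


# Toy RTs in msec (illustrative values only).
print(quartile_difference([400, 410, 420, 430, 440, 450, 460, 470]))
```

A smaller score indicates a narrower RT distribution, which is the sense in which audiovisual responses are described as less variable.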
ERP analyses were structured to reveal the neuronal networks underlying response variability observed at the behavioral level. Thus, we will refrain from reporting statistical differences between conditions (i.e., audiovisual vs. visual only) because nonlinear multisensory interactions were not the focus of the present article (i.e., as compared with analyses presented by Mercier et al., 2015, and Sperdin et al., 2009, or prior analyses of the same data set in Cappe et al., 2012; Cappe, Thut, Romei, & Murray, 2010). The same statistical design was applied to all ERP measures and source estimations.
To address differential processing according to interquartile variability of RTs and presentation condition (i.e., intraindividual variability), 2 × 2 repeated-measures ANOVAs with the factors of Quartile (first vs. fourth) and Condition (V and AV) were performed. Note that this analysis was structured to specifically contrast brain responses during trials in which there was evidence for multisensory facilitation exceeding probability summations (first quartile of the RT distribution) from those in which no evidence for such race model violations was found (fourth quartile). The choice of limiting our analyses to contrasting AV and V trials only was motivated by the facts that (1) adding auditory-only conditions to the statistical design did not significantly alter the results and, (2) by reducing the number of conditions entered into the statistical design matrix, we increased the power in our analyses.
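For a 2 × 2 within-participant design, each effect has one numerator degree of freedom, so its F value equals the squared t value of a one-sample test on the corresponding per-participant contrast. The sketch below uses this equivalence; it is not the software used in the study, and the function names and toy values are hypothetical.

```python
import math
from statistics import mean, stdev


def contrast_f(contrast):
    """F(1, n - 1) for a per-participant contrast, computed as t squared."""
    n = len(contrast)
    t = mean(contrast) / (stdev(contrast) / math.sqrt(n))
    return t * t


def rm_anova_2x2(v_fast, v_slow, av_fast, av_slow):
    """Return F values for Quartile, Condition, and their interaction.

    Each argument holds one value per participant (e.g., ERP amplitude at
    one electrode and time point), in the same participant order.
    """
    cells = list(zip(v_fast, v_slow, av_fast, av_slow))
    quartile = [(vs + avs) - (vf + avf) for vf, vs, avf, avs in cells]
    condition = [(avf + avs) - (vf + vs) for vf, vs, avf, avs in cells]
    interaction = [(avs - avf) - (vs - vf) for vf, vs, avf, avs in cells]
    return {
        "quartile": contrast_f(quartile),
        "condition": contrast_f(condition),
        "interaction": contrast_f(interaction),
    }


# Toy values for four hypothetical participants.
print(rm_anova_2x2([1, 2, 3, 4], [3, 5, 4, 7], [1, 1, 2, 3], [2, 2, 3, 4]))
```

With eight participants, each F value would have (1, 7) degrees of freedom, matching the form of the quartile effects reported for the source estimations.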
Analyses of the visual and audiovisual ERP waveforms (see Figure 3A for ERPs at a representative midline occipital electrode) as a function of time revealed a main effect of quartile (i.e., intraparticipant RT variability) starting at 188 msec poststimulus onset (see Figure 3B). In addition, there was a significant Quartile × Condition interaction starting at 184 msec poststimulus onset. Analyses of interparticipant differences (i.e., ANCOVA) revealed a three-way interaction between Quartile, Condition, and between-participant differences in RT spread (i.e., IQR) starting at 196 and 284 msec poststimulus onset. These analyses of the ERP waveforms highlight significant differences found at single electrodes over time and serve as an initial indicator of differential neural processing. Nonetheless, these statistical results are reference dependent and cannot distinguish activity differences due to changes in response strength from those due to differences in the topographic configuration of the scalp potentials, the latter of which is indicative of a change in the underlying neuronal generators (Murray et al., 2008).
Thus, to quantify statistical differences over the entire electrode montage, we analyzed both GFP and topographic dissimilarity (DISS; Brunet et al., 2011; Murray et al., 2008; Michel et al., 2004). GFP analyses did not reveal any statistically reliable differences. In contrast, the DISS analyses revealed a significant main effect of RT difference (from 142 to 239 msec poststimulus onset; see Figure 3C). Furthermore, we found a significant IQR × Condition interaction at 93–155 msec poststimulus onset as well as at a subsequent period (253–280 msec). Next, we sought to determine whether these topographic effects stemmed from stable differences in map configurations in each condition or from latency shifts of map onsets between conditions. To this end, we entered the group-averaged ERPs into an AAHC analysis (Murray et al., 2008). The procedure identified 17 maps that could account for 95.7% of the variance over the four group-averaged ERPs (i.e., AV and V conditions resulting in fast and slow responses) over the entire poststimulus onset period. These template maps are shown in Figure 4. During the 183- to 250-msec poststimulus onset period, three maps (framed in black, light gray, and dark gray in Figure 4) differentially characterized group-averaged ERPs across conditions.
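The two reference-independent measures used above can be sketched as follows, based on their standard definitions (Murray et al., 2008): GFP is the standard deviation of the average-referenced potentials across electrodes, and DISS is the root mean square difference between two GFP-normalized maps. The function names and toy maps are hypothetical, and the permutation statistics applied to these measures are not shown.

```python
import math


def average_reference(scalp_map):
    """Subtract the mean across electrodes from each electrode value."""
    m = sum(scalp_map) / len(scalp_map)
    return [x - m for x in scalp_map]


def gfp(scalp_map):
    """Global field power: spatial standard deviation of the scalp map."""
    u = average_reference(scalp_map)
    return math.sqrt(sum(x * x for x in u) / len(u))


def diss(map_u, map_v):
    """Global dissimilarity between two maps (0 = same shape, 2 = inverted)."""
    u = average_reference(map_u)
    v = average_reference(map_v)
    gu, gv = gfp(map_u), gfp(map_v)
    return math.sqrt(sum((a / gu - b / gv) ** 2 for a, b in zip(u, v)) / len(u))


# Toy four-electrode maps (illustrative values only).
a = [1.0, -1.0, 2.0, -2.0]
print(gfp(a))
print(diss(a, [3 * x for x in a]))  # same topography, stronger response
print(diss(a, [-x for x in a]))     # inverted topography
```

Because each map is normalized by its own GFP, DISS is insensitive to pure changes in response strength, which is why the two measures can dissociate strength effects from topographic effects.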
This pattern observed at the group-averaged ERP level was next statistically assessed in the single-participant ERPs using a spatial-correlation fitting procedure over the 183- to 250-msec poststimulus period, using within-participant factors of Condition (AV and V), RT quartile (slow and fast), and Map (Murray et al., 2008). We observed a significant Condition × Map interaction (F(2, 14) = 11.38, p < .001, ηp2 = 0.62). No other main effect or interaction was statistically reliable (ps > .10). Given this interaction, we then performed separate ANOVAs for the AV and V conditions. For the AV condition, there was a nonsignificant trend for a main effect of Map (F(2, 14) = 3.65, p = .053, ηp2 = 0.34), but neither the main effect of RT quartile nor the interaction was statistically reliable (ps > .10). This suggests that one template map (i.e., that framed in black) predominated the responses to the AV conditions irrespective of the resultant RT and that the patterns were statistically indistinguishable for responses leading to slow and fast RTs. For the V condition, neither main effect was statistically reliable, yet there was a significant interaction between RT quartile and Map (F(2, 14) = 5.53, p = .017, ηp2 = 0.44). Post hoc analyses revealed that, for the visual-only condition leading to slow responses, the template map framed in dark gray predominated. By contrast, for the visual-only condition leading to fast responses, the template map framed in light gray predominated. In accordance with our behavioral findings showing a significant reduction of response variability under multisensory conditions, these results revealed a single stable map configuration (and, by extension, likely a stable neuronal generator configuration) underpinning multisensory processing within the 183- to 250-msec poststimulus onset window. 
In addition to quantifying the total duration of a given template map, the fitting procedure also provided output concerning the first onset of a given template map. Analysis of this output indicated that, for visual-only presentations, the switch between template maps occurred earlier on trials resulting in faster RTs than on trials resulting in slower RTs (304 vs. 319 msec; p < .001).
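The logic of the spatial-correlation fitting procedure can be sketched as follows: each time frame of a single-participant ERP is labeled with the template map it correlates with most strongly across electrodes, and the labels are then summarized as a duration (number of frames) and a first onset per map. This is a simplified illustration; the template names and toy values are hypothetical, and constraints applied by dedicated software (e.g., temporal smoothing) are omitted.

```python
import math


def spatial_correlation(u, v):
    """Pearson correlation across electrodes between two scalp maps."""
    mu = sum(u) / len(u)
    mv = sum(v) / len(v)
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return num / den


def fit_templates(erp, templates):
    """Label each time frame with the best-correlated template map.

    erp: list of time frames, each a list of electrode values.
    templates: dict mapping a map name to its electrode values.
    Returns (labels, frame count per map, first onset frame per map).
    """
    labels = [
        max(templates, key=lambda name: spatial_correlation(frame, templates[name]))
        for frame in erp
    ]
    counts = {name: labels.count(name) for name in templates}
    onsets = {name: labels.index(name) if name in labels else None for name in templates}
    return labels, counts, onsets


# Toy data: two template maps and three time frames (illustrative only).
templates = {"black": [1, -1, 2, -2], "gray": [2, 1, -1, -2]}
erp = [[2, -2, 4, -4], [1, 0.5, -0.5, -1], [0.9, -1.1, 2, -1.8]]
print(fit_templates(erp, templates))
```

The per-map frame counts obtained for each participant and condition are the kind of values that would enter the Condition × RT quartile × Map ANOVA, and the onset values correspond to the first-onset measure.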
The time window revealed by this clustering analysis (i.e., 183–250 msec poststimulus onset) served as a basis for determining the time window of analysis for the source estimations. Source estimations were carried out to identify the networks likely contributing to the effects observed at the scalp level. The topographic clustering analyses revealed two distinct template maps under visual-only presentation conditions but a single predominant template map under audiovisual conditions. Nonetheless, we chose to collapse over the whole period of interest when computing source estimations. This choice was mainly motivated by our relatively small sample size (N = 8) and by the aim of increasing the signal-to-noise ratio of our scalp recordings.
During the period of interest identified by the clustering procedure (183–250 msec poststimulus onset), all four conditions (i.e., fast and slow responses for visual and audiovisual conditions) included prominent sources within occipital and temporal cortices. Statistical analyses revealed a main effect of RT quartile (i.e., fast vs. slow responses) that included several clusters located within the bilateral IFG, the right parietal cortex, and the right superior occipital cortex (SOC) extending to the middle temporal cortex (see Figure 5A and Table 1 for more detailed descriptions). Further analyses revealed a distinct network showing a significant Quartile × Condition interaction, which included sources within the right IFG, right middle frontal gyrus, and right superior temporal gyrus as well as a cluster within the left posterior PC (see Figure 5B). There was also a significant three-way interaction between Quartile, Condition, and RT difference located within the frontal cortex, extending from the superior frontal cortex to the medial frontal gyrus (see Figure 5C).
| Region | BA | x | y | z | F | p |
|---|---|---|---|---|---|---|
| Main effect of RT quartile | | | | | | |
| Left IFG | BA 45 | −50 | 19 | 8 | F(1, 7) = 14.3 | p = .007 |
| Right IFG | BA 44 | 54 | 13 | 8 | F(1, 7) = 10.2 | p = .015 |
| Right IPL | BA 40 | 64 | −32 | 31 | F(1, 7) = 24.23 | p = .0017 |
| Right AG | BA 39 | 44 | −68 | 28 | F(1, 7) = 17.79 | p = .004 |
| Condition × Quartile interaction | | | | | | |
| Right IFG | BA 47 | 52 | 31 | −1 | F(2, 6) = 12.77 | p = .001 |
| Right SFG | BA 8 | 26 | 30 | 48 | F(2, 6) = 10.33 | p = .015 |
| Right STG | BA 22 | 57 | 2 | 0 | F(2, 6) = 9.22 | p = .019 |
| Left IPL | BA 40 | −46 | −51 | 46 | F(2, 6) = 14.22 | p = .007 |
| Caudate body | | −14 | 6 | 20 | F(3, 5) = 321.62 | p < .001 |
IPL = inferior parietal lobule; SFG = superior frontal gyrus; STG = superior temporal gyrus.
In a final step, we sought to shed light on the patterns of functional connectivity (FC; in terms of correlated activations) within the brain network showing differential responses as a function of RTs. To this end, we first extracted mean activity across all voxels within the cluster in the left visual areas showing the greatest activity (i.e., 2 SDs above mean whole-brain activity) during the 183- to 250-msec poststimulus onset window (see Methods). Second, we extracted mean activation values across voxels within each of the three clusters that showed a main effect of RTs within the same period of interest (see prior section). Third, we computed Student t values of Spearman's rank-ordered correlation coefficients over time. Because of the relatively small sample size, the reliability of the correlation coefficients was assessed using bootstrap estimations (2,000 samples). Finally, we estimated differences in correlated activity patterns between visual cortices and the three clusters revealed by the main effect of RTs. This analysis sought to extend prior hemodynamic imaging results suggesting that differential connectivity within a very similar neuronal network (including visual cortices) is associated with differences in RTs (see Noppeney et al., 2010). More precisely, correlations were computed as a function of time between the mean cluster activity within left visual cortices (i.e., the cluster showing the greatest activity) and the three clusters identified by the statistical analyses in the source space (i.e., the main effect of RTs): (1) a cluster containing the right AG, extending to the posterior MTG; (2) a cluster in the right inferior parietal lobule extending to the SOC; and (3) a cluster within the right IFG.
Correlations between occipital cortices and the three clusters investigated here were significantly less pronounced under multisensory presentation conditions (within the 183- to 250-msec poststimulus onset time window). In contrast, response speed under visual-only presentations was facilitated when activity within occipital cortices was correlated with activity within all three clusters (see Figure 6A(i)).
In a last step, we sought to further elucidate how the connectivity between nodes beyond visual cortices differentially contributed to RT variability. To do this, we directly correlated activity from each of the three clusters shown to be differentially recruited as a function of RTs with one another (see Figure 6A(ii)). Again, we computed t values of Spearman's rank-ordered correlation coefficients (bootstrap estimation with 2,000 samples; time window of 183–250 msec; statistical criteria: t(6) > 2.44, p < .05; >12 contiguous time frames). These analyses showed that during the 183- to 250-msec period, intercluster correlations between the AG and the inferior parietal lobule were most robust for those conditions that led to faster RTs (i.e., AV fast, AV slow, and V fast; see Figure 6A(ii)). Similarly, only trials resulting in fast responses within each condition revealed significant correlations between all three clusters.
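The statistical criteria above can be illustrated with a short Python sketch: a Spearman coefficient is computed across participants at each time frame, converted to a t value with n − 2 degrees of freedom (6 for eight participants), and an effect is retained only if the threshold is exceeded for enough contiguous frames. The bootstrap step is omitted, the use of the absolute t value is an assumption, and all function names and toy values are hypothetical.

```python
import math


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranked values."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = math.sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))
    return num / den


def t_from_r(r, n):
    """Convert a correlation coefficient to a t value with n - 2 df."""
    if abs(r) == 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1 - r * r))


def longest_run(t_values, threshold=2.44):
    """Longest run of contiguous frames with |t| above the threshold."""
    best = run = 0
    for t in t_values:
        run = run + 1 if abs(t) > threshold else 0
        best = max(best, run)
    return best


# Toy cluster activity for eight hypothetical participants at one time frame.
occipital = [1, 2, 3, 4, 5, 6, 7, 8]
frontal = [2, 1, 4, 3, 6, 5, 8, 7]
rho = spearman(occipital, frontal)
print(rho, t_from_r(rho, n=8))
```

Applying `longest_run` to the series of t values across the 183- to 250-msec window would then show whether the >12 contiguous time frames criterion is met.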
Generally, when participants' RTs were fastest (i.e., under audiovisual presentation conditions), occipital cortices did not exhibit significant correlations with the posterior parietal lobule and the IFG. In contrast, these clusters showed significant activity correlations with occipital cortices when RTs were slower (i.e., for slow responses to audiovisual stimuli and for both visual-only conditions). We tentatively hypothesize that this difference in correlation patterns reflects more efficient stimulus processing (i.e., a reduced need for sustained FC) between visual cortices and the identified network clusters. In terms of between-cluster correlations, the data clearly showed that faster responses to both audiovisual and visual-only conditions were supported by stronger between-cluster activity correlations within the higher-level network (i.e., not including lower-level visual cortices).
The current study provides an important link between behavioral and neural data focused on examining intraindividual and interindividual differences (i.e., variability) in multisensory processing. The behavioral data support the presence of multisensory integrative processes as evidenced by violations of the race model—a result consistent with a number of prior studies (e.g., Mercier et al., 2015; Pomper, Brincker, Harwood, Prikhodko, & Senkowski, 2014; Stevenson, Fister, Barnett, Nidiffer, & Wallace, 2012; Sperdin et al., 2009). In addition, the behavioral analyses illustrate a reduction in response variability in audiovisual trials, again consistent with prior work (Sarko, Ghose, & Wallace, 2013). Analyses of the scalp-recorded EEG data show that differences in RT variability between visual and audiovisual conditions are related to the presence of different, stable ERP topographies. Specifically, these analyses revealed the recruitment of a single stable topography to occur under multisensory conditions (which characterized both fast and slow responses). In contrast, two stable network configurations characterized visual-only trials where a greater RT distribution variability was observed. We hypothesize that this apparent stability in the ERP topography under multisensory presentations reflects the more efficient (faster and less variable) processing of audiovisual stimuli. Source estimations suggest that the intraindividual (trial-by-trial) RT variability observed at the behavioral level is linked to differences in the recruitment of an extensive cortical network, which includes occipital, parietal, and frontal cortices. In addition, the analyses suggest that interindividual differences in the variability of RT distributions can be related to activity within middle frontal cortices. 
Finally, correlational analyses between clusters within this network revealed that greater behavioral benefits (under both visual and multisensory conditions) appear linked to more correlated (i.e., more efficient) interactions within the clusters of this network. In what follows, we discuss these findings within the framework of the existing literature.
In the current study, our measure of intraindividual and interindividual response variability is the mean difference in RTs between the first and fourth quartiles of the individual RT distributions. These results provide strong evidence that RTs under multisensory conditions are less variable when compared with those under unisensory visual conditions (for similar results, see Zehetleitner, Ratko-Dehnert, & Müller, 2015; Altieri & Hudock, 2014) and are of interest in the context of recent work that has distinguished between the concepts of sensory integration and cue interactions (Otto & Mamassian, 2012). Under circumstances of cue interactions, there should be an accompanying increase in sensory noise from a stimulus in a second modality, thus resulting in broadening of RT distributions under multisensory conditions in tasks like those used in the current study. In contrast, our observation of a less variable response distribution under multisensory conditions argues for a decrease in sensory noise, suggestive of an active integration process between the visual and auditory cues and supporting concepts of cue reliability (Morgan, DeAngelis, & Angelaki, 2008; Ernst & Banks, 2002).
To date, only a few studies have directly investigated the neuronal loci and networks that are associated with variability in behavioral responses to multisensory stimuli (Thelen, Matusz, & Murray, 2014; Tyll et al., 2013; Noppeney et al., 2010; Sperdin et al., 2009; Murray et al., 2004). Sperdin and colleagues (2009), reexamining a previously published data set from Murray and colleagues (2005) stemming from an audiotactile detection task, specifically addressed the neuronal interactions that accompanied RT facilitations under multisensory versus unisensory conditions but did not specifically address the neuronal correlates of RT variability under multisensory conditions. In particular, these authors found a facilitation of RTs under multisensory conditions that was associated with differences in activation strength over the left posterior superior temporal cortex. Nonetheless, their analyses focused only on testing differences in nonlinear multisensory interactions as a function of RTs, rather than specifically addressing the neuronal correlates of RT variability per se.
Our findings also help to bridge results from EEG with those from hemodynamic imaging (Tyll et al., 2013; Noppeney et al., 2010). We provide the first evidence that trial-to-trial RT variability within an individual participant is linked to quantitative differences in terms of correlated activity within an occipital-to-frontal network. It has been argued that such correlations of neuronal activity can be highly informative about the FC between (relatively) distant cortical regions (Salinas & Sejnowski, 2001). Compared with simple correlation analyses, FC measures represent a more detailed analysis of the cross-correlation patterns between neural nodes as a function of experimental conditions (see Friston, 2011, for a review). The present results suggest the existence of a strong correlational relationship between the amount of neural activity within this occipito-to-frontal network and both presentation conditions (i.e., audiovisual vs. visual only) and response speed (i.e., fast vs. slow responses). Moreover, significant changes in activity correlations between neural nodes have been linked back to trial-by-trial fluctuations (i.e., intraindividual performance variability) reflected in behavior (e.g., Hansen, Chelaru, & Dragoi, 2012). Further investigations are needed to provide more detailed information concerning the links between connectivity patterns and their relationship to neural activation patterns and behavioral variability.
Although the current study focused on intraindividual RT variability, our analyses also revealed an important relationship between the neural correlates of intraindividual differences and interindividual variability. A distinct cluster within frontal cortices exhibited differential activation patterns as a function of response speed (fast vs. slow trials), presentation condition (visual only vs. multisensory), and the individual, within-participant RT differences (see Figure 2B). Activity within these areas has been linked to task difficulty and cognitive control mechanisms (Desai, Conant, Waldron, & Binder, 2006; Ridderinkhof, Ullsperger, Crone, & Nieuwenhuis, 2004) and sensory evidence accumulation in decision-related processes (Filimon, Philiastides, Nelson, Kloosterman, & Heekeren, 2013; Heekeren, Marrett, & Ungerleider, 2008). Here, frontal areas showed stronger activations when a participant's RT was slower under multisensory conditions, suggestive of less efficient evidence accumulation as compared with trials resulting in faster responses. Middle frontal areas have been related to individual differences in RT observed in attentional tasks and to aspects of behavioral control (Kelly, Uddin, Biswal, Castellanos, & Milham, 2008; Simmonds et al., 2007). Similarly, it has been suggested that the recruitment of premotor circuits is linked to more efficient behavioral performance (Ionta, Ferretti, Merla, Tartaro, & Romani, 2010; Simmonds et al., 2007), similar to what we have observed for visual-only trials that resulted in shorter RTs. In other words, previous studies propose that frontal cortices are more strongly recruited under conditions of greater sensory uncertainty and higher cognitive demands. We propose that this increased activity within frontal cortices could reflect a greater effort to maintain performance (see Stuss et al., 1989, for a similar proposal; see Figure 6B for an illustration).
Stated a bit differently, differential recruitment of frontal areas is linked to interindividual differences in the ability to maintain performance throughout the task. Such variability across individuals has been linked to differences in the maturation of executive functions as well as personality traits in both clinical and healthy cohorts (Alvarez & Emory, 2006; Barkley, 1997; Stuss, 1992). Thus, our results extend these prior findings by partially dissociating neuronal activation patterns responsible for intraindividual response variability from those related to between-participant differences in RTs.
The current behavioral and electrical neuroimaging data provide important insights into the spatiotemporal dynamics involved in RT variability in response to visual-only and audiovisual stimuli. Our results show that RT variability is related to differences in correlated activity of a distributed network involving occipital, temporal, parietal, and frontal cortices. Furthermore, our data suggest that the significant reduction of trial-to-trial variability under audiovisual presentations is related to the differential activation of superior and medial frontal cortex and accounts for differences in RTs as a function of race model predictions (i.e., violation vs. nonviolation). In contrast, for visual-only trials, a more extensive occipito-to-frontal network must be considered to explain RT variability.
The STEN toolbox (www.unil.ch/line/home/menuinst/about-the-line/software–analysis-tools.html) has been programmed by Jean-François Knebel, from the Laboratory for Investigative Neurophysiology (the LINE), Lausanne, Switzerland, and is supported by the Center for Biomedical Imaging of Geneva and Lausanne and by the National Center of Competence in Research project “SYNAPSY—The Synaptic Bases of Mental Disease” (project no. 51AU40_125759). Support for M. M. M. is from the Swiss National Science Foundation (grants 320030-149982 and 320030_169206) and from a grantor advised by Carigest SA. Support for A. T. is from the Swiss National Science Foundation (grant numbers P2LAP3_151771 and P300PB_164754). Support for S. I. is from the Swiss National Science Foundation (grants PZ00P1_148186 and PP00P1_170506/1) and the International Foundation for Research in Paraplegia (grant P164). Support for M. T. W. is from NIH (grant numbers HD083211 and MH109225).
Reprint requests should be sent to Micah M. Murray, Radiology, Centre Hospitalier Universitaire Vaudois, BH08.078, Rue du Bugnon 46, 1011 Lausanne, Switzerland, or via e-mail: firstname.lastname@example.org.
This paper appeared as part of a Special Focus deriving from a symposium at the 2017 annual meeting of Cognitive Neuroscience Society, entitled, “Real World Neuroscience.”
Denotes equal contributions.