Abstract
The brain's ability to extract information from multiple sensory channels is crucial to perception and effective engagement with the environment, but the individual differences observed in multisensory processing lack mechanistic explanation. We hypothesized that, from the perspective of information theory, individuals with more effective multisensory processing will exhibit a higher degree of shared information among distributed neural populations while engaged in a multisensory task, representing more effective coordination of information among regions. To investigate this, healthy young adults completed an audiovisual simultaneity judgment task to measure their temporal binding window (TBW), which quantifies the ability to distinguish fine discrepancies in timing between auditory and visual stimuli. EEG was then recorded during a second run of the simultaneity judgment task, and partial least squares was used to relate individual differences in the TBW width to source-localized EEG measures of local entropy and mutual information, indexing local and distributed processing of information, respectively. The narrowness of the TBW, reflecting more effective multisensory processing, was related to a broad pattern of higher mutual information and lower local entropy at multiple timescales. Furthermore, a small group of temporal and frontal cortical regions, including those previously implicated in multisensory integration and response selection, respectively, played a prominent role in this pattern. Overall, these findings suggest that individual differences in multisensory processing are related to widespread individual differences in the balance of distributed versus local information processing among a large subset of brain regions, with more distributed information being associated with more effective multisensory processing. 
The balance of distributed versus local information processing may therefore be a useful measure for exploring individual differences in multisensory processing, its relationship to higher cognitive traits, and its disruption in neurodevelopmental disorders and clinical conditions.
INTRODUCTION
We experience the world through multiple sensory systems, each sampling different kinds of energy from the environment and offering complementary information about the world around us. The brain's ability to extract relevant information from across sensory channels is critical to interacting effectively with the world, as demonstrated by the numerous behavioral enhancements that multisensory stimuli provide compared with unisensory stimuli (e.g., Shams & Seitz, 2008; Diederich & Colonius, 2004; Hershenson, 1962; Sumby & Pollack, 1954). These enhancements, in combination with converging anatomical and physiological evidence of extensive multisensory influences across neural scales, suggest that the brain is fundamentally organized to facilitate multisensory processing, with wide-reaching consequences for perception, cognition, and action (Driver & Noesselt, 2008; Ghazanfar & Schroeder, 2006).
Multisensory processing is also characterized by its variability across the population, with substantial individual differences in multisensory tasks and illusions occurring in healthy individuals (e.g., Cecere, Rees, & Romei, 2015; Nath & Beauchamp, 2012; Stevenson, Zemtsov, & Wallace, 2012; Miller & D'Esposito, 2005). Furthermore, correlations among various multisensory tasks and audiovisual illusions suggest that these individual differences may be expressions of a general mechanism of multisensory processing (Stevenson & Wallace, 2013; Stevenson et al., 2012), which may use a set of basic neural operations to flexibly integrate information between distributed neural populations in a wide variety of contexts (van Atteveldt, Murray, Thut, & Schroeder, 2014; Senkowski, Schneider, Foxe, & Engel, 2008). However, prior studies investigating individual differences in multisensory processing have largely focused on univariate measures of neural activity (e.g., Ferri et al., 2017; Balz et al., 2016; Kaganovich & Schumaker, 2016; Cecere et al., 2015) without considering interactions among neural populations (but see Kumar, Dutta, Talwar, Roy, & Banerjee, 2020). Given that multisensory processing involves reconciling information encoded in different sensory systems to produce percepts and guide action, we expect that effective multisensory processing requires information to be processed across widely distributed neural populations. As such, we hypothesize that individuals capable of more effective multisensory processing will demonstrate a higher degree of shared information among distributed cortical regions while engaged in a multisensory task, likely including regions of multisensory convergence such as the superior temporal sulcus (STS; Jones & Powell, 1970), which is frequently implicated in audiovisual processing (e.g., Marchant, Ruff, & Driver, 2012; Balk et al., 2010; Noesselt et al., 2007; Calvert, Hansen, Iversen, & Brammer, 2001).
To investigate this possibility, we related the balance of local versus distributed information processing to multisensory performance using information-theoretical measures of local entropy and mutual information computed from source-localized EEG time series. Multisensory performance was assessed as audiovisual temporal discrimination, employing the commonly used audiovisual simultaneity judgment (SJ) task (e.g., Stevenson & Wallace, 2013; Powers, Hillock, & Wallace, 2009; van Eijk, Kohlrausch, Juola, & van de Par, 2008). This task allows a temporal binding window (TBW) to be measured for each participant, quantifying the probability that an auditory and visual stimulus will be perceived as synchronous as a function of the time difference between them. TBWs are known to vary substantially between individuals (Stevenson et al., 2012; Powers et al., 2009; Conrey & Pisoni, 2006; Miller & D'Esposito, 2005), and those with narrower TBWs were considered to have more effective multisensory processing, reflecting a better ability to resolve timing differences between sensory channels. Following measurement of the TBW, participants underwent EEG recording during a second run of the SJ task, which was calibrated to equalize difficulty between participants.
To quantify the balance of distributed versus local information processing, source-localized EEG time series were submitted to a time delay embedding-based algorithm computing local entropy and mutual information at multiple timescales (Vakorin, Lippé, & McIntosh, 2011). These information-theoretical measures allow the joint information of two neural time series to be partitioned into that which is unique to each population (local entropy) and that which is shared between them (mutual information). As such, local entropy and mutual information provide measures of local and distributed information, respectively. Partial least squares (PLS; McIntosh & Lobaugh, 2004) was then used to identify components (latent variables [LVs]) of local entropy and mutual information distribution that maximally covaried with the individual-level TBW measurements. Additionally, source-localized power spectral densities (PSDs) were subjected to the same PLS analysis to allow comparison with previous electrophysiological work and disambiguate the role of cross-frequency dependencies and nonlinear autocorrelations in differentiating individuals (Courtiol et al., 2016).
METHODS
Participants
Twenty-eight healthy young adults were recruited, with three being excluded from analysis because of failed sigmoid model fitting after the calibration task (see below) and three excluded for excessive EEG artifacts. The final sample of 22 young adults (11 women, ages 19–33 years, mean = 23.6 years, SD = 3.5 years) had an average of 16.2 years of education, and all had normal or corrected-to-normal vision and hearing. Twenty participants were right-handed, one was left-handed, and one was ambidextrous. No participant reported a diagnosis of dyslexia, autism spectrum disorder, schizophrenia, or other clinical condition with noted relevance to multisensory processing (Hahn, Foxe, & Molholm, 2014; de Boer-Schellekens, Eussen, & Vroomen, 2013; Martin, Giersch, Huron, & van Wassenhove, 2013; Kwakye, Foss-Feig, Cascio, Stone, & Wallace, 2011; Foucher, Lacambre, Pham, Giersch, & Elliott, 2007; Hairston, Burdette, Flowers, Wood, & Wallace, 2005). All participants provided written consent according to the guidelines established by Baycrest Centre and the University of Toronto and were provided monetary compensation for their participation.
This work is a secondary analysis of a data set (collected by the authors and previously unpublished) intended to investigate the neural correlates of audiovisual perceptual binding. The final sample size (22) offered by this data set surpassed that of similar work using EEG to investigate multisensory perception during the SJ task (Yuan, Li, Liu, Yuan, & Huang, 2016 [18]; Kambe, Kakimoto, & Araki, 2015 [14]) and is comparable to other EEG studies investigating individual differences in multisensory perception (e.g., Kumar et al., 2020 [18]; Cecere et al., 2015 [22]).
Behavioral Protocol
Audiovisual SJ Task
Participants first completed a two-alternative forced-choice SJ task to measure the width of their TBWs, as well as to calibrate the stimuli for the task presented during EEG recording. The SJ task consisted of a jittered fixation period (1000–1500 msec), followed by a visual flash and auditory beep stimulus, and lastly a response prompt (see Figure 1). The flash and beep stimuli were separated by a systematically varied SOA, where a negative number denotes auditory-leading (AV) presentation and a positive number denotes visual-leading (VA) presentation.
The auditory stimulus was a 3500-Hz pure sine tone, 10 msec in length, delivered by a GSI 61 audiometer through ER-3A insert earphones (Etymotic Research). The audiometer was calibrated such that a 5-sec tone at the same frequency produced an intensity of 102 dB SPL. The visual stimulus was a white annulus flash on a black background, presented for 10 msec and covering 3.8° of visual angle at a viewing distance of 60 cm. It was presented on a Dell Trinitron CRT monitor at a refresh rate of 100 Hz.
After an interval of 750 msec, a prompt was displayed, and the participant reported whether they perceived the two stimuli as synchronous (“yes”) or asynchronous (“no”) by pressing the left or right arrow button (counterbalanced between participants) on a standard computer keyboard. Participants responded with the right hand, and responses were made within a 2000-msec time limit. Trials were then separated by a 750-msec intertrial interval before the next fixation.
The task was built with PsychoPy software (Version 1.90.3; Peirce et al., 2019) and presented on a Dell Precision T3600 computer. Stimulus timing was verified to be accurate within ±4 msec using a Tektronix TDS210 two-channel oscilloscope.
For the initial calibration run, 19 SOAs were used in total, ranging from −300 to 300 msec, to estimate each participant's TBW. Specifically, SOAs of 0, 10, 20, 50, 80, 100, 150, 200, 250, and 300 msec were presented in both the AV and VA cases (Stevenson et al., 2012). The task was broken into four blocks, wherein each SOA was presented four times in a pseudorandom order, for a total of 16 presentations per SOA. A total of 304 trials (19 SOAs × 16 presentations) were presented over the course of the whole task, which lasted roughly 15 min in total. Participants were offered a self-timed break between each block.
EEG was recorded during a second run of the SJ task. This version of the task, originally designed to investigate the neural correlates of audiovisual binding, presented six SOAs, four of which were calibrated for each participant individually based on the results of the preceding behavioral run. To do so, the rate of synchrony perception for each SOA was calculated as the number of “synchronous” responses divided by the total number of presentations (16). Two psychometric sigmoid functions, one for the AV SOAs (−300 to 0 msec) and one for the VA SOAs (0–300 msec), were fit to the resulting rates using the lmfit package in Python (least squares mode; Stevenson & Wallace, 2013; Hillock-Dunn & Wallace, 2012; Powers et al., 2009). For the AV stimuli, “A50” and “A95” SOAs were calibrated to produce asynchrony perception 50% and 95% of the time, respectively, by solving the AV psychometric sigmoid for values of 0.5 and 0.95. “V50” and “V95” trials were calibrated in the same way for VA stimuli. Lastly, “A10” and “V10” trials used a fixed SOA of 10 msec with auditory- and visual-leading stimuli, respectively. A total of 512 trials were presented in a pseudorandom order, broken into four blocks of 128 trials each. Within each block, A50 and V50 trials were presented 32 times each, and A10, V10, A95, and V95 trials were presented 16 times each. This balance of trial types was chosen to equalize subjective difficulty by presenting participants with a high proportion of ambiguous stimuli while preventing guessing or adaptation by offsetting these with more obviously synchronous (A10, V10) and asynchronous (A95, V95) stimuli.
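The calibration logic can be illustrated with a minimal sketch. It is not the code used in the study: the logistic form, the parameter names `midpoint` and `slope`, the example parameter values, and the helper names are all assumptions, because the text specifies only that a psychometric sigmoid was fit with lmfit. The sketch omits the least-squares fit itself and shows only how a synchrony rate is computed from response counts and how an already-fitted sigmoid could be inverted to find the SOA that yields a chosen synchrony rate.

```python
import math

def synchrony_rate(n_synchronous: int, n_presentations: int = 16) -> float:
    """Proportion of 'synchronous' responses at one SOA."""
    return n_synchronous / n_presentations

def logistic(soa: float, midpoint: float, slope: float) -> float:
    """Assumed psychometric sigmoid: P('synchronous') as a function of SOA (msec).

    For visual-leading SOAs a negative slope makes the rate fall as SOA grows.
    """
    return 1.0 / (1.0 + math.exp(-slope * (soa - midpoint)))

def soa_for_rate(rate: float, midpoint: float, slope: float) -> float:
    """Invert the logistic: the SOA at which P('synchronous') equals `rate`."""
    return midpoint + math.log(rate / (1.0 - rate)) / slope

# Hypothetical fitted parameters for the visual-leading (VA) side.
midpoint, slope = 180.0, -0.03

# SOA judged synchronous on half of the trials (the midpoint of the sigmoid).
v50 = soa_for_rate(0.5, midpoint, slope)
```

Under this parameterization the 50% point coincides with the sigmoid midpoint, whereas more extreme target rates fall further out along the SOA axis.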
Observer Model
Although an arbitrary psychometric sigmoid was used to calibrate the ambiguous stimuli at the time of data collection, a more sophisticated observer model was applied post hoc to parameterize the TBW. The four-parameter observer model proposed by Yarrow et al. (Yarrow, 2018, 2020; Yarrow, Jahn, Durant, & Arnold, 2011) posits that SJs are the product of a latent decision process, where a noisy internal representation of the difference in timing between stimuli is compared with lower (auditory-leading) and upper (visual-leading) decision criteria. The TBW is therefore estimated as the difference of two cumulative Gaussians representing the noisy boundaries separating simultaneous from nonsimultaneous judgments. Each Gaussian is defined by a mean (mAV and mVA; the estimated mean SOA values of the decision criteria) and standard deviation (σAV and σVA; representing a combination of sensory and criterion noise). As such, mAV and mVA define the width of the TBW and are analogous to other point estimates of TBW width, which have previously shown individual differences (e.g., Stevenson et al., 2012; Powers et al., 2009; Conrey & Pisoni, 2006; Miller & D'Esposito, 2005), whereas σAV and σVA determine the slope of the TBW. In line with previous work, mAV and mVA were considered the primary measures of interest quantifying individual differences in the multisensory SJ task and will hereafter be referred to simply as the AV and VA TBW width, respectively.
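The difference-of-cumulative-Gaussians form described above can be written out as a short sketch. This is an illustration under stated assumptions rather than the fitting code used here: the function names and parameter values are hypothetical, and the lower criterion `m_av` is expressed as a signed (negative) SOA, whereas TBW widths are reported in this article as positive magnitudes. Full details of the model are given in the cited work by Yarrow and colleagues.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_simultaneous(soa, m_av, sigma_av, m_va, sigma_va):
    """Probability of a 'simultaneous' judgment at a given SOA (msec).

    m_av, sigma_av: mean and SD of the lower (auditory-leading) criterion,
                    with m_av given as a negative SOA (assumed convention).
    m_va, sigma_va: mean and SD of the upper (visual-leading) criterion.
    """
    below_upper = normal_cdf((m_va - soa) / sigma_va)
    below_lower = normal_cdf((m_av - soa) / sigma_av)
    # The difference can dip slightly below zero when the SDs differ.
    return max(0.0, below_upper - below_lower)

# Hypothetical parameter values, chosen only for illustration.
params = dict(m_av=-160.0, sigma_av=50.0, m_va=240.0, sigma_va=60.0)
curve = [(soa, p_simultaneous(soa, **params)) for soa in range(-300, 301, 100)]
```

In this parameterization the predicted probability is close to 0.5 at each criterion mean, so moving `m_av` and `m_va` changes the width of the window, and the two SDs control how steeply it falls off.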
To assess the fit of the observer model, it was compared with a simpler two-parameter “guessing” model, designed to determine whether participants were simply guessing and whether they were presented with a sufficient range of SOAs to adequately sample their transitions from simultaneous to nonsimultaneous judgments. Full details of the observer model, the guessing model, and their implementation are available in Yarrow et al. (Yarrow, 2018, 2020; Yarrow et al., 2011).
Self-report Measures
Participants completed a posttask questionnaire, the results of which were not analyzed here. Participants reflected on the SJ task and tried to quantify potential biases in their responding, as well as guess the intention of the study. Additional questions assessed discomfort and whether participants fell asleep at any point during the task, as well as clinical or neurodevelopmental diagnoses (e.g., dyslexia, autism spectrum disorder, and schizophrenia). Finally, participants reported their experience with and time spent playing video games, as well as musical training and proficiency.
Electrophysiological Analysis
EEG Recording and Preprocessing
EEG was recorded with a BioSemi ActiveTwo acquisition system (BioSemi Instrumentation) at a sampling rate of 2048 Hz and bandwidth (−3 dB) of DC-400 Hz and then decimated to 512 Hz in ActiView acquisition software. Sixty-six scalp electrodes were employed, using BioSemi's 64 + 2 electrode cap configuration based on the 10/20 system. Ten additional electrodes were applied in pairs to the mastoids, preauricular points, upper cheeks, outer canthi of the eyes, and inferior orbit of the eyes. These provided better coverage of the scalp, as well as an accurate record of eye movements for later artifact removal. All recordings took place in a dimly lit, sound-attenuating room.
All EEG preprocessing was performed in Brainstorm (Version 07-Apr-2020; Tadel, Baillet, Mosher, Pantazis, & Leahy, 2011), an open-source application for M/EEG data processing and visualization. Continuous recordings were digitally band-pass filtered at 0.5–90 Hz (linear-phase FIR, stopband attenuation = 60 dB, transition band = 0 Hz), with a notch filter centered at 60 Hz to attenuate line noise (second-order IIR, 3-dB notch bandwidth = 2 Hz). Bad electrodes and contaminated segments of continuous data were manually rejected from subsequent processing, and the remaining data were rereferenced to the average of all remaining electrodes. Eye- and cardiac-related artifact components were then detected and removed using Brainstorm's implementation of Infomax independent component analysis (Makeig, Bell, Jung, & Sejnowski, 1996) applied to the longest available continuous segment of data without major artifacts (minimum of 5 min or 153,600 samples).
We assessed individual differences in neural activity during the multisensory task by analyzing data from the intertrial interval, with epochs spanning 1750 msec starting immediately after the response and ending before the onset of the next stimulus. This interval was chosen to capture individual differences in the functional organization of brain activity that emerges during the multisensory task while minimizing spillover from the stimuli themselves, which varied between participants and could not be directly compared. Implications of this choice are addressed in the Discussion section. After rejection of contaminated segments, an average of 471.4 epochs of 1750 msec (13.7 min of data in total) were analyzed for each participant.
EEG Source Estimation
The cortical current sources of the EEG signals were estimated with Brainstorm using sLORETA (Pascual-Marqui, 2002), with one dipole modeled normal to the cortical surface at each vertex, using an OpenMEEG BEM forward model (Gramfort, Papadopoulo, Olivi, & Clerc, 2010) computed on the default MNI/ICBM152 anatomy in Brainstorm (Fonov, Evans, Mckinstry, Almli, & Collins, 2009). The inversion kernel computed on the full cortical surface (15,002 vertices) was used to downsample the results to the Desikan-Killiany atlas (Desikan et al., 2006), resulting in one time series for each of the 68 regions of the atlas.
PSD Computation
PSDs were computed using a smoothed fast Fourier transform implemented in the neurodsp package in Python (“medfilt” method, frequency range = 1–90 Hz, frequency resolution = 0.57 Hz, median filter length = 1 Hz; Cole, Donoghue, Gao, & Voytek, 2019). PSDs were computed on each individual epoch and then averaged over epochs, resulting in one average PSD associated with each source.
Local Entropy and Mutual Information
Entropy H(X) is a measure of information or uncertainty associated with a single random variable X. Similarly, the joint entropy H(X, Y) is the entropy of the joint probability distribution of two random variables X and Y. By conceptualizing neural time series as random variables, information theory provides tools to partition the joint entropy of two neural time series into that which is unique to one or the other (local entropy) and that which is shared between them (mutual information; see Figure 2).
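The partition of joint entropy can be checked numerically on a toy example. The sketch below uses a hypothetical discrete joint distribution over two binary variables, which is an assumption made purely for illustration; the estimates in this study were obtained from continuous time series with time delay embedding, not from discrete probability tables.

```python
import math
from collections import defaultdict

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical joint distribution P(X = x, Y = y) for two binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Marginal distributions of X and Y.
p_x = defaultdict(float)
p_y = defaultdict(float)
for (x, y), p in joint.items():
    p_x[x] += p
    p_y[y] += p

h_xy = entropy(joint.values())   # joint entropy H(X, Y)
h_x = entropy(p_x.values())      # H(X)
h_y = entropy(p_y.values())      # H(Y)

h_x_given_y = h_xy - h_y         # information unique to X: H(X|Y)
h_y_given_x = h_xy - h_x         # information unique to Y: H(Y|X)
mutual_info = h_x + h_y - h_xy   # information shared by X and Y: I(X; Y)

# The three parts sum to the joint entropy: H(X|Y) + H(Y|X) + I(X; Y) = H(X, Y).
total = h_x_given_y + h_y_given_x + mutual_info
```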
Local entropy and mutual information were estimated with time delay embedding (embedding dimension d = 2, embedding delay τ = 1, characteristic scale length r = 1) using the method previously described in Vakorin et al. (2011). Given that entropy estimates are scale dependent (Costa, Goldberger, & Peng, 2005; Zhang, 1991), these quantities were computed at multiple timescales. To do so, each time series was downsampled by averaging within nonoverlapping windows of successively increasing length before time delay embedding. A maximum window length of 18 samples was chosen to ensure that a minimum of about 50 data points (896 samples/18 = 49.78 windows) were included in the estimation.
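The coarse-graining step can be sketched as follows. The function name is hypothetical, and two details are assumptions: that an incomplete trailing window is discarded, and that a timescale in milliseconds is obtained by dividing the window length by the 512-Hz sampling rate. The placeholder signal stands in for one 896-sample epoch.

```python
def coarse_grain(signal, scale):
    """Average a time series within nonoverlapping windows of `scale` samples.

    Any incomplete window at the end of the series is dropped (assumption).
    """
    n_windows = len(signal) // scale
    return [sum(signal[i * scale:(i + 1) * scale]) / scale
            for i in range(n_windows)]

SAMPLING_RATE_HZ = 512
EPOCH_SAMPLES = 896  # 1750 msec at 512 Hz

# Placeholder signal standing in for one source time series.
epoch = [float(i) for i in range(EPOCH_SAMPLES)]

# Number of data points and approximate timescale for a few window lengths.
summary = []
for scale in (1, 6, 18):
    n_points = len(coarse_grain(epoch, scale))
    timescale_msec = 1000.0 * scale / SAMPLING_RATE_HZ
    summary.append((scale, n_points, round(timescale_msec, 1)))
```

Under these assumptions, a window of 6 samples corresponds to roughly 11.7 msec, and the longest window of 18 samples leaves 49 complete windows per epoch.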
To characterize the dynamic interactions of the whole-brain network, multiscale local entropy and mutual information were computed between each pair of sources at each epoch and then averaged across epochs. Local entropy for a given node X was then taken to be the average of local entropy values, H(X|Y), computed between that node and all other nodes Y.
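The node-level summary described above amounts to a row average that skips the diagonal. A minimal sketch, assuming a hypothetical square matrix `cond_entropy` in which entry `[i][j]` holds the epoch-averaged H(X_i | X_j) at one timescale:

```python
def node_local_entropy(cond_entropy):
    """Average H(X_i | X_j) over all other nodes j for each node i.

    cond_entropy[i][j] is assumed to hold H(X_i | X_j); the diagonal is ignored.
    """
    n = len(cond_entropy)
    return [
        sum(cond_entropy[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]

# Hypothetical values for three sources (the study used 68).
example = [
    [0.0, 1.0, 2.0],
    [0.5, 0.0, 1.5],
    [3.0, 1.0, 0.0],
]
local_entropy = node_local_entropy(example)
```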
PLS Analysis
PLS (McIntosh & Lobaugh, 2004; McIntosh, Bookstein, Haxby, & Grady, 1996) is a multivariate statistical technique, similar to canonical correlation, which uses singular value decomposition (SVD) to extract orthogonal patterns of maximal covariance, called LVs, between two data matrices. PLS can therefore identify components of a brain data matrix that maximally covary with behavioral measurements. Specifically, the relationship between brain and behavior data, represented by their cross-product matrix, is decomposed using SVD, producing three new matrices, containing (1) behavior saliences, (2) brain saliences, and (3) singular values. The behavior saliences can be thought of as contrasts that specify the relationship between the elements of the behavioral matrix, and the “brain saliences” are the extracted patterns of brain data that best characterize the cross-product matrix. The singular values are the square roots of the eigenvalues and are proportional to the amount of covariance in the cross-product matrix captured by each LV. Finally, because the extraction of these brain–behavior relationships via SVD takes place across the entire brain in a single mathematical step, correction for multiple comparisons is not necessary (McIntosh & Lobaugh, 2004; McIntosh et al., 1996).
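The decomposition can be sketched for the case of two behavioral measures, such as the AV and VA TBW widths. This is a minimal illustration and not the MATLAB implementation used in the study: it assumes the cross-product matrix is the matrix of Pearson correlations between each behavioral measure and each brain variable, and it replaces a general SVD routine with the closed-form eigendecomposition of a symmetric 2 × 2 matrix. All names and data values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def behavior_pls_two_measures(behavior, brain):
    """behavior: n x 2 rows; brain: n x p rows (one row per participant).

    Returns two LVs as (singular_value, behavior_saliences, brain_saliences),
    ordered from largest to smallest singular value.
    """
    p = len(brain[0])
    # 2 x p cross-product matrix R of brain-behavior correlations.
    R = [[pearson([row[k] for row in behavior], [row[j] for row in brain])
          for j in range(p)] for k in range(2)]
    # Entries of the symmetric 2 x 2 matrix R R^T.
    a = sum(v * v for v in R[0])
    d = sum(v * v for v in R[1])
    b = sum(v * w for v, w in zip(R[0], R[1]))
    # Rotation angle that diagonalizes R R^T; its columns are behavior saliences.
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    lvs = []
    for u in ((c, s), (-s, c)):
        # R^T u equals the singular value times the brain salience vector.
        proj = [u[0] * R[0][j] + u[1] * R[1][j] for j in range(p)]
        sv = math.sqrt(sum(v * v for v in proj))
        v = [x / sv for x in proj] if sv > 0 else proj
        lvs.append((sv, u, v))
    return lvs

# Hypothetical data: 6 participants, 2 behavioral measures, 4 brain variables.
behavior = [[120, 180], [150, 210], [160, 260], [200, 250], [230, 330], [260, 340]]
brain = [[0.9, 0.2, 0.5, 0.1], [0.8, 0.3, 0.4, 0.6], [0.7, 0.3, 0.6, 0.2],
         [0.5, 0.5, 0.5, 0.7], [0.4, 0.6, 0.4, 0.3], [0.2, 0.8, 0.6, 0.5]]
lvs = behavior_pls_two_measures(behavior, brain)
```

With two behavioral measures the decomposition yields at most two LVs, and the two brain salience vectors are orthogonal by construction.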
Here, behavior PLS implemented in MATLAB (McIntosh & Lobaugh, 2004) was used to identify LVs relating individual differences in TBW width to patterns in (1) local entropy, (2) mutual information, and (3) PSDs. To explore the maximal experimental effects in the data, the rotated version of PLS was used, which produces a set of LVs based on mutually orthogonal contrasts in a data-driven manner. Statistical assessment of the resulting LVs was performed with a two-stage resampling procedure. First, statistical significance of each LV was estimated using permutation testing (1000 resamples), where observations of the behavioral matrix were randomly reassigned without replacement. The resulting p value represents the proportion of permuted singular values that exceeded the observed singular values. The reliability of each element was then estimated using bootstrap resampling (1000 resamples), where the standard error of each element salience was estimated by recomputing PLS on a set of observations resampled with replacement (maintaining the mapping between brain and behavioral observations). The bootstrap ratio (the ratio of the salience to the standard error computed through resampling) captures how dependent the element salience is on the particular makeup of the sample (Sampson, Streissguth, Barr, & Bookstein, 1989), and is roughly equivalent to a z score when the bootstrap distribution is normal (Efron & Tibshirani, 1986). We therefore interpret the bootstrap ratios as measures of reliability, using the permutation test for null hypothesis testing.
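The two-stage resampling procedure can be sketched in simplified form. To keep the example short it makes several assumptions that depart from the analysis reported here: it uses a single behavioral measure (so there is one LV, whose singular value is the norm of the brain-behavior correlation vector), it uses synthetic data, and it omits any alignment of resampled saliences to the original solution. Function names are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def first_lv(behavior, brain):
    """Singular value and brain saliences for one behavioral measure."""
    r = [pearson(behavior, [row[j] for row in brain]) for j in range(len(brain[0]))]
    sv = math.sqrt(sum(v * v for v in r))
    saliences = [v / sv for v in r] if sv > 0 else r
    return sv, saliences

def permutation_p(behavior, brain, n_perm=1000, seed=0):
    """Proportion of permuted singular values exceeding the observed one."""
    rng = random.Random(seed)
    observed_sv = first_lv(behavior, brain)[0]
    n_exceeding = 0
    for _ in range(n_perm):
        shuffled = list(behavior)
        rng.shuffle(shuffled)  # reassign behavior without replacement
        if first_lv(shuffled, brain)[0] > observed_sv:
            n_exceeding += 1
    return n_exceeding / n_perm

def bootstrap_ratios(behavior, brain, n_boot=1000, seed=0):
    """Observed salience divided by its bootstrap standard error, per element."""
    rng = random.Random(seed)
    n, p = len(behavior), len(brain[0])
    observed = first_lv(behavior, brain)[1]
    samples = []
    for _ in range(n_boot):
        # Resample participants with replacement, keeping brain and
        # behavior rows paired.
        idx = [rng.randrange(n) for _ in range(n)]
        samples.append(first_lv([behavior[i] for i in idx],
                                [brain[i] for i in idx])[1])
    ratios = []
    for j in range(p):
        col = [s[j] for s in samples]
        mean = sum(col) / n_boot
        se = math.sqrt(sum((v - mean) ** 2 for v in col) / (n_boot - 1))
        ratios.append(observed[j] / se if se > 0 else float("inf"))
    return ratios

# Synthetic data: 22 participants; the first brain variable tracks the
# behavioral measure negatively, the second is an unrelated alternating pattern.
n = 22
width = [100.0 + 10.0 * i for i in range(n)]
brain = [[-0.01 * w + 0.1 * math.sin(3 * i), (-1.0) ** i]
         for i, w in enumerate(width)]

p_value = permutation_p(width, brain)
ratios = bootstrap_ratios(width, brain)
```

In this synthetic case the first element, which tracks the behavioral measure, receives a large negative bootstrap ratio, whereas the unrelated element does not. This mirrors how the ratios are read as element-wise reliability, with the permutation test used separately for the LV as a whole.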
RESULTS
Observer Model Fitting
The observer model provided a significant improvement in fit over the guessing model in all cases (p < .001; Appendix A), suggesting that participants were not guessing, and SOAs were adequately sampled.
The mean TBW width estimated with the observer model was 162.1 msec (SD = 59.3 msec) for AV stimuli and 236.0 msec (SD = 86.44 msec) for VA stimuli, reproducing the commonly reported finding that the TBW is wider for VA stimuli than AV stimuli on average, t(21) = −5.9, p < .001. Figure 3A depicts the average TBW for all participants in the sample. As described previously (e.g., Stevenson et al., 2012), AV and VA TBW widths were strongly correlated within individuals, r(20) = .75, p < .001 (see Figure 3B).
Local Entropy
Behavior PLS relating local entropy values to TBW width identified one significant LV (LV1: p < .001, singular value = 18.58; LV2: p = .70, singular value = 3.28). Figure 4 illustrates the relationship between TBW width and local entropy estimates across sources and timescales captured by this LV. The data-driven contrast values for AV and VA TBW width had the same sign (see Figure 4A, left), indicating that this LV captures commonalities between the two measures in their relation to local entropy. For this LV, correlations [95% CI] between brain scores and TBW width were AV: −.75 [−.89, −.67] and VA: −.67 [−.85, −.55]. Bootstrap ratios quantify the stability of this contrast at each source and timescale, where positive ratios indicate support for the contrast and negative ratios indicate support for the inverse of the contrast. Because the contrast values in this case are negative, negative bootstrap ratios indicate where TBW width and local entropy are positively related. Bootstrap ratios were overwhelmingly negative across sources and timescales (see Figure 4B–D), indicating that narrower TBWs were associated broadly with lower local entropy, and therefore, wider TBWs were associated with higher local entropy. Furthermore, the effect was most stable at finer timescales, with the 11.7 msec timescale having the most extreme median bootstrap ratio. Table 1A lists the sources demonstrating the highest median bootstrap ratios across timescales.
Source | Median Bootstrap Ratio |
---|---|
(A) Local Entropy | |
**Pars triangularis R** | −4.98 |
**Caudal anterior cingulate R** | −4.75 |
**Middle temporal R** | −4.49 |
**Superior temporal R** | −4.38 |
Caudal anterior cingulate L | −4.28 |
**Superior frontal L** | −4.22 |
Pars opercularis R | −3.80 |
Caudal middle frontal R | −3.72 |
Superior frontal R | −3.59 |
Fusiform R | −3.56 |
(B) Mutual Information | |
**Middle temporal R** | 2.60 |
Rostral middle frontal R | 2.46 |
**Pars triangularis R** | 2.44 |
Pars orbitalis R | 2.14 |
**Caudal anterior cingulate R** | 2.13 |
Paracentral R | 2.12 |
**Superior temporal R** | 2.10 |
**Superior frontal L** | 2.05 |
Rostral middle frontal L | 2.03 |
Pars triangularis L | 1.97 |
Sources in bold are within the top 10 for both local entropy and mutual information. R = right; L = left.
In addition to the width parameters, the slope parameters were also compared with local entropy values using behavior PLS. Whether relating local entropy to width and slope together (Appendix Figure B1, left side) or slope alone (Appendix Figure B1, right side), the pattern that emerged was largely the same as that described above.
Mutual Information
PLS relating mutual information to TBW width identified one significant LV (LV1: p = .005, singular value = 125.96; LV2: p = .70, singular value = 27.88). Figure 5 depicts the expression of this LV across timescales and source-to-source connections. The data-driven contrast values for AV and VA TBW width again had the same sign (see Figure 5A, left), indicating that this LV captures commonalities between the two measures in their relationship to mutual information. For this LV, correlations [95% CI] between brain scores and TBW width were AV: −.63 [−.85, −.54] and VA: −.71 [−.88, −.65]. In contrast to local entropy, the bootstrap ratios for mutual information were predominantly positive, indicating that narrower TBWs were largely associated with higher mutual information across timescales and connections, and therefore, wider TBWs were associated with lower mutual information. Again, this effect was most stable at finer timescales. Table 1B lists the sources with the highest median bootstrap ratios across timescales, which are most prominently involved in this pattern, and Figure 5B depicts the number of times a given source is involved in a highly reliable (bootstrap ratio > 4) connection with any other source. To illustrate the pattern of connectivity itself, Figure 6 displays the pattern of connections exhibiting the highest reliability at the 11.7-msec timescale.
The slope parameters were also compared with mutual information using behavior PLS. Relating mutual information to width and slope together revealed a similar pattern to the one described above (Appendix Figure C1, left side); however, slope alone did not yield any significant LVs (Appendix Figure C1, right side).
PSD
PLS relating PSDs to TBW width extracted two LVs, but permutation testing indicated that neither was significant (LV1: p = .28, singular value = 31.41; LV2: p = .73, singular value = 9.71).
DISCUSSION
Narrower TBW Is Broadly Associated with Higher Mutual Information and Lower Local Entropy
We found that individual differences in audiovisual temporal discrimination ability, as measured by TBW width, covaried with differences in local entropy and mutual information while participants engaged in an audiovisual task. Widespread differences were observed, with better temporal discrimination (narrower TBW) being associated with higher mutual information and lower local entropy broadly across cortical sources and timescales. Taken together, these results support the hypothesis that better multisensory processing abilities are associated with a propensity for greater shared information among distributed cortical sources and, conversely, less effective processing is associated with more localized processing of information. A potential interpretation of this finding is that the ability to extract stimulus features (e.g., timing) encoded in separate modalities, compare them, and respond appropriately is facilitated in individuals who show broad integration of information involving the coordination of diverse neural populations while engaged in a multisensory task.
Furthermore, for both local entropy and mutual information, only the LV capturing similarities between the AV and VA TBW widths passed statistical assessment, suggesting that the pattern of mutual information and local entropy identified in the brain data applies to both types of stimulus. This finding, along with the strong correlation observed between AV and VA TBWs, accords with previous work proposing that the TBW width indexes a general mechanism of multisensory processing that applies across stimulus types (Stevenson et al., 2012), although further work comparing multiple tasks will be necessary to confirm this. Moreover, if the balance of local entropy and mutual information does reflect domain-general multisensory processing ability, further work will need to consider how this more general process interacts with possible stimulus-specific mechanisms, which may produce the observed difference in neural responses to AV and VA stimuli (e.g., Cecere, Gross, Willis, & Thut, 2017) and differences in the malleability of the AV and VA TBWs (Cecere, Gross, & Thut, 2016; Powers et al., 2009).
Although TBW width was our primary measure of interest, the observer model also provides the slope of the TBW, and these two parameters have different theoretical interpretations. Under the assumptions of the observer model, width indexes the position of the boundary between simultaneous and nonsimultaneous judgments, whereas slope is hypothesized to capture the “noisiness” of internal representations of stimulus timing and decision criteria. Despite these theoretical differences, our results suggest that both width and slope relate to largely the same pattern of individual differences in local entropy and mutual information (Appendices B and C). It could be the case that the observer model is correct in its interpretation of these parameters, but they are too strongly correlated to disambiguate (i.e., noisy observers may tend to also have noisy criteria; Magnotti, Ma, & Beauchamp, 2013). Alternatively, the model may not reflect the underlying process that produced these simultaneity judgments accurately enough to afford such specific interpretation of its parameters. In either case, we can only conclude that the individual differences observed in the brain data correspond to general performance on the SJ task, rather than criterion setting or the noisiness of internal representations specifically.
Temporal and Frontal Regions Exhibit Most Reliable Differences in Local Entropy and Mutual Information
Although narrower TBWs were associated with more distributed and less localized information processing broadly throughout the cortex, the most reliable differences were observed primarily in temporal and frontal cortex (see Table 1). Several of these temporal regions (right superior temporal gyrus and right middle temporal gyrus) and frontal regions (right pars triangularis, right caudal anterior cingulate, and left superior frontal gyrus) were among those that most reliably demonstrated an effect in both the local entropy and mutual information analyses.
Of particular interest is the STS, which has historically been considered a key region for multisensory processing because of its converging auditory and visual inputs (Jones & Powell, 1970) and because nonhuman primate work has demonstrated the presence of neurons responsive to both modalities there (Schroeder & Foxe, 2002; Bruce, Desimone, & Gross, 1981; Benevento, Fallon, Davis, & Rezak, 1977). In humans, numerous fMRI studies have implicated STS (Marchant et al., 2012; Balk et al., 2010; Noesselt et al., 2007; Calvert et al., 2001), as well as nearby superior temporal gyrus (Marchant et al., 2012; Stevenson, VanDerKlok, Pisoni, & James, 2011; Stevenson, Altieri, Kim, Pisoni, & James, 2010; Noesselt et al., 2007) in the integration or temporal discrimination of asynchronous auditory and visual stimuli using a variety of task paradigms, which motivated our prediction that STS would be linked to individual differences on this task. Given the limited spatial resolution of the present analysis, it is plausible that the prominent differences observed in the right superior temporal and right middle temporal sources could originate in the STS and superior temporal gyrus. If so, these results would suggest that individual variability in audiovisual temporal discrimination may in part reflect the degree to which these putatively multisensory temporal regions exchange information with a wide array of other cortical regions (see Figure 6). However, it is important to note that these sources are based on atlas parcellations encompassing large areas of cortex, likely including multiple subregions that play diverse roles in cognition (Hein & Knight, 2008). Therefore, the sources used here may capture different processes than those reported in the fMRI literature. Furthermore, in contrast to previous findings, here the most reliable effects in temporal cortex were observed in the right hemisphere, with little difference emerging in the left hemisphere. Although the fMRI literature shows largely left-hemispheric or bilateral effects at the group level, the emphasis here on individual differences means that this is not necessarily a contradictory finding (but see Marchant et al., 2012). This discrepancy may be the result of heterogeneity in task design as well as inherent differences between the BOLD signal and the information-theoretical measures used here. Alternatively, this finding could indicate that left multisensory temporal cortex is more consistently implicated in audiovisual processing tasks, but the degree to which right multisensory temporal cortex is involved varies between individuals and may be more closely related to performance. This individual variability in the involvement of right multisensory temporal cortex could therefore explain why it is less consistently identified using BOLD signal contrasts computed at the group level. Future work seeking to clarify the networks involved in multisensory processing may therefore benefit from considering how individual differences in these networks may relate to task performance.
In frontal cortex, a particularly strong relationship was identified between bilateral caudal anterior cingulate and TBW width. In particular, higher local entropy in this region was strongly associated with wider TBWs, especially at finer timescales. Involvement of this region in a forced-choice task is not unexpected, given that ACC has been implicated in decision-making and action selection in general and is thought to play a flexible role in integrating behaviorally relevant information from its prefrontal, parietal, and subcortical connections (Monosov, Haber, Leuthardt, & Jezzini, 2020). Furthermore, there is evidence that ACC is capable of integrating information over long timescales, including those spanning multiple trials (Spitmaan, Seo, Lee, & Soltani, 2020). Although still speculative, this result could suggest that individual differences in TBW may, to some extent, reflect differences in response selection processes mediated by ACC and its network, perhaps involving the way information from previous trials is integrated to affect response selection.
Differences in Local Entropy and Mutual Information Most Evident at Fine Timescales
In addition to spatial information, the current method gives insight into the temporal scales where individual differences in the information-theoretical measures most reliably correspond to individual differences in TBW. Although some sources exhibited differences across all timescales, the finer timescales were most consistently identified for both local entropy and mutual information, with median bootstrap ratios peaking at the 11.7 msec timescale for local entropy and decreasing monotonically from fine to coarse timescales for mutual information. Because the downsampling procedure removes the influence of fast-changing activity as the coarse-graining windows become larger, fine timescales represent the entropy of both fast- and slow-changing activity, whereas coarser timescales represent only slow-changing activity (Courtiol et al., 2016). This suggests that the key individual differences in local entropy and mutual information are likely attributable to relatively fast-changing elements of the neural signals. This finding would seem to corroborate the view that gamma synchronization between distinct neuronal groups may play an instrumental role in structuring information within cortical networks in general (Fries, 2009; Engel, Fries, & Singer, 2001) as well as the orchestration of multisensory interactions more specifically (Keil & Senkowski, 2018). However, relating power spectra to the width of the TBW with PLS did not show differences in power in the gamma range or any other range, highlighting the fact that straightforward comparisons between the (multiscale) entropy measure used here and the power spectrum are likely not possible. Simulation work has demonstrated that multiscale entropy (Costa et al., 2005), a similar technique to that used here, is sensitive to nonlinear autocorrelations in the time series, as well as cross-frequency dependencies, whereas the power spectrum is not (Courtiol et al., 2016; McIntosh, Kovacevic, & Itier, 2008). These differences confirm that the information-theoretical measures provide additional insight into the temporal structure of neural signals at fine timescales, which is not captured in the power spectrum.
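The effect of coarse-graining on fast versus slow activity can be illustrated with a minimal sketch. It assumes the conventional multiscale-entropy coarse-graining step (averaging consecutive, nonoverlapping windows of samples); the function name `coarse_grain`, the 512 Hz sampling rate, and the 2 Hz and 64 Hz sinusoids are hypothetical choices made for illustration and are not taken from the analysis pipeline of this study.

```python
import numpy as np


def coarse_grain(signal, scale):
    """Average consecutive, nonoverlapping windows of `scale` samples.

    Illustrative helper (hypothetical name); trailing samples that do not
    fill a complete window are discarded.
    """
    signal = np.asarray(signal, dtype=float)
    n_windows = len(signal) // scale
    trimmed = signal[: n_windows * scale]
    return trimmed.reshape(n_windows, scale).mean(axis=1)


# Hypothetical toy signals: 2 s sampled at an assumed rate of 512 Hz.
fs = 512
t = np.arange(2 * fs) / fs
slow = np.sin(2 * np.pi * 2 * t)    # slow-changing component (2 Hz)
fast = np.sin(2 * np.pi * 64 * t)   # fast-changing component (64 Hz)

# A window of 8 samples spans exactly one cycle of the 64 Hz component,
# so averaging cancels it while leaving the 2 Hz component nearly intact.
scale = 8
slow_cg = coarse_grain(slow, scale)
fast_cg = coarse_grain(fast, scale)

print("variance retained, slow:", np.var(slow_cg) / np.var(slow))
print("variance retained, fast:", np.var(fast_cg) / np.var(fast))
```

In this toy case the fast component is essentially eliminated at the coarser scale while the slow component keeps almost all of its variance, which is why measures computed at coarse timescales reflect only slow-changing activity, whereas those at fine timescales reflect both.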
Implications of Task Design
Given the substantial variability in performance on the SJ task, the SOAs were calibrated to equalize difficulty across participants (Yuan et al., 2016; Kambe et al., 2015). For the present analysis, the choice to vary stimuli between participants comes with a trade-off, namely, that stimulus-related activity cannot be compared directly between individuals. Although this is not ideal, the particularities of the neural responses to each type of stimulus (i.e., AV and VA) are of secondary interest to individual differences in the broader context of information processing that emerged during the task, which we judged more likely to relate to general (i.e., nonstimulus specific) mechanisms of multisensory processing. Furthermore, focusing on the intertrial interval allowed longer epochs without motor contamination to be analyzed and therefore allowed longer timescales to be assessed. Lastly, equalizing task difficulty across participants is intended to prevent the boredom or blind guessing that comes with a task that is too easy or too difficult, respectively, which are problematic when participants are assumed to be comparably engaged in the task.
The use of the intertrial interval in this analysis, rather than task-free resting-state data, poses additional questions for interpretation. Task-free resting-state data are typically assumed to better reflect inherent differences in functional organization between individuals, as opposed to the more transient brain states related to a specific task. However, the assumption that task-free resting-state provides an inherently neutral baseline reading of brain organization or that such a baseline can exist has been questioned on both procedural and theoretical grounds (Duncan & Northoff, 2013; Morcom & Fletcher, 2007), and evidence from fMRI work suggests that engaging participants in a task may in fact enhance behaviorally relevant individual differences compared with rest (Finn et al., 2017). With this in mind, we judged the intertrial interval sufficient to identify the task-related individual differences of interest; however, future work may need to consider both recordings made during multisensory tasks and recordings made during task-free rest to disambiguate this issue.
Conclusion and Future Directions
Partitioning entropy into local and distributed components provided new insight into the neural correlates of individual differences in multisensory processing commonly observed in healthy adults and served as a proof of principle for the utility of information-theoretical measures for investigating multisensory processing. Overall, more effective multisensory processing, here operationalized as a narrower audiovisual TBW, was associated with a widespread pattern of higher mutual information and lower local entropy while participants were engaged in an audiovisual SJ task. This pattern provides support for the hypothesis that more effective multisensory processing requires sharing of information between widely distributed neural populations. Furthermore, several regions were strongly implicated in this pattern, including temporal and frontal regions that have previously been linked to multisensory integration and response selection, respectively. This suggests that the involvement of these regions within the larger pattern of information exchange could be an important determinant of individual differences in multisensory processing.
To more definitively establish the causal role that the distribution versus localization of information plays in individual multisensory performance, future work will need to adopt an experimental approach. Prior work has shown that the TBW can be narrowed with training (Stevenson, Wilson, Powers, & Wallace, 2013; Powers et al., 2009) and that such training produces changes in resting-state and task-related BOLD functional connectivity among a network of unisensory and multisensory areas, including the posterior STS (Powers, Hevey, & Wallace, 2012). Training could be combined with the information-theoretical approach used here to test whether pre- and posttraining comparisons identify similar differences in local and distributed information processing throughout the brain. Furthermore, comparing local entropy and mutual information across multiple multisensory tasks could distinguish effects related to the particularities of each task from those related to a more general mechanism of multisensory processing and/or response selection underlying the correlation between tasks (van Atteveldt et al., 2014; Stevenson & Wallace, 2013; Stevenson et al., 2012).
Lastly, given the widespread nature of the individual differences in local entropy and mutual information, this perspective may help explain the links between individual differences in multisensory processing and higher-order cognitive abilities (see Wallace, Woynaroski, & Stevenson, 2020, for a review), such as those between multisensory RT benefit and intelligence in children (Barutchu et al., 2011), TBW width and problem solving in young adults (Zmigrod & Zmigrod, 2016), and audiovisual detection and mild cognitive impairment in older adults (Murray et al., 2018). Similarly, this approach may also shed light on mechanisms of dysfunction in clinical and neurodevelopmental groups with abnormalities in multisensory processing such as schizophrenia (Martin et al., 2013; Foucher et al., 2007), autism spectrum disorder (de Boer-Schellekens et al., 2013; Kwakye et al., 2011), and dyslexia (Hahn et al., 2014; Hairston et al., 2005).
APPENDIX A
APPENDIX B
APPENDIX C
Reprint requests should be sent to Phillip R. Johnston, Department of Psychology, University of Toronto, Toronto, Ontario M5S 3G3, Canada.
Funding Information
This work was supported by the Natural Sciences and Engineering Research Council of Canada (https://dx.doi.org/10.13039/501100000038), grant numbers: CGS M and CGS D to P. R. J., RGPIN-2018-04457 to A. R. M., and RGPIN-2016-05523 to C. A.
Diversity in Citation Practices
Retrospective analysis of the citations in every article published in this journal from 2010 to 2021 reveals a persistent pattern of gender imbalance: Although the proportions of authorship teams (categorized by estimated gender identification of first author/last author) publishing in the Journal of Cognitive Neuroscience (JoCN) during this period were M(an)/M = .407, W(oman)/M = .32, M/W = .115, and W/W = .159, the comparable proportions for the articles that these authorship teams cited were M/M = .549, W/M = .257, M/W = .109, and W/W = .085 (Postle and Fulvio, JoCN, 34:1, pp. 1–3). Consequently, JoCN encourages all authors to consider gender balance explicitly when selecting which articles to cite and gives them the opportunity to report their article's gender citation balance. The authors of this article report its proportions of citations by gender category to be as follows: M/M = .574, W/M = .296, M/W = .093, and W/W = .037.
Note
Nearly identical results were obtained from the arbitrary sigmoid fit at the time of data collection, with the width of the sigmoid at 50% simultaneity perception correlating almost perfectly with the width estimated from the observer model (Pearson's r = .99 for AV stimuli and r = .98 for VA stimuli).
Author notes
Current affiliation: Simon Fraser University.