Abstract

Evaluating multisensory emotional content is a part of normal day-to-day interactions. We used fMRI to examine brain areas sensitive to congruence of audiovisual valence and their overlap with areas sensitive to valence. Twenty-one participants watched audiovisual clips with either congruent or incongruent valence across visual and auditory modalities. We showed that affective congruence versus incongruence across visual and auditory modalities is identifiable on a trial-by-trial basis across participants. Representations of affective congruence were widely distributed with some overlap with the areas sensitive to valence. Regions of overlap included bilateral superior temporal cortex and right pregenual anterior cingulate. The overlap between the regions identified here and in the emotion congruence literature lends support to the idea that valence may be a key determinant of affective congruence processing across a variety of discrete emotions.

INTRODUCTION

Emotional experiences are triggered by signals from multiple sensory modalities that can be either affectively congruent or incongruent. Imagine a scene in a horror movie played to scary music: information from vision and audition combines, making the experience scarier than that elicited by the music or the film alone. The same scene played to happy music may produce a very different affective experience and interpretation. Sensitivity to the emotional content of real-life situations is vitally important for success in environments in which visual and auditory information together influence our mental states (De Gelder & Bertelson, 2003). This raises the question of which brain regions are involved in distinguishing between affectively congruent and incongruent content across different modalities and how these regions relate to those linked to general affective processing.

Although previous research has shown that congruent affective information from visual and auditory channels enhances emotional experiences compared with incongruent affective information (Gao, Wedell, Green, et al., 2018; Gao, Wedell, Kim, Weber, & Shinkareva, 2018; Christensen, Gaigg, Gomila, Oke, & Calvo-Merino, 2014; Gerdes et al., 2013; Baumgartner, Esslen, & Jäncke, 2006), the neural correlates underlying this phenomenon have received far less attention. Several fMRI studies have manipulated congruence of discrete emotions between audiovisual channels (Jansma, Roebroeck, & Münte, 2014; Pehrs et al., 2013; Watson et al., 2013; Jeong et al., 2011; Klasen, Kenworthy, Mathiak, Kircher, & Mathiak, 2011; Müller et al., 2011; Petrini, Crabbe, Sheridan, & Pollick, 2011; Dolan, Morris, & De Gelder, 2001). Brain regions including superior temporal cortex, amygdala, posterior/middle cingulate cortex, superior frontal cortex, insula, and thalamus, among others, showed activity modulated by congruence of discrete emotions. For example, one fMRI study presented happy or sad faces paired with happy or sad music (Jeong et al., 2011). Participants were instructed to experience the stimuli without attending exclusively to one modality. Superior temporal gyrus and fusiform gyrus were differentially activated by affective congruence across audiovisual modalities. In another fMRI study, happy, neutral, and fearful faces were combined with happy, neutral, and fearful sounds (Müller et al., 2011). Participants were instructed to rate the valence of the facial expressions as quickly and accurately as possible while ignoring the sounds. For affectively incongruent minus affectively congruent conditions, there were activations in middle cingulate cortex, right superior frontal gyrus, right SMA, and right TPJ, but there were no significant differences for affectively congruent minus affectively incongruent conditions. These studies have provided valuable insights into the neural correlates of emotional congruence across audiovisual modalities, but they do not speak directly to congruence along affective dimensions rather than discrete categories.

Valence (positive or negative) and arousal (exciting or calming) are two dimensions characterizing emotional information that have been widely validated and can be seen as key components of emotional states (Lindquist, 2013; Russell, 2003). Although there are large individual differences in people's ability to differentiate discrete emotions, almost everyone can tell the difference between a positive and a negative affective state. Therefore, valence and, to a lesser extent, arousal may be considered the basic building blocks of emotion (Barrett, 2006). Recent evidence also suggests that there is no one-to-one mapping between a given brain region and a given emotion category. Instead, emotions and affective dimensions are represented in distributed neural systems (Satpute & Lindquist, 2019; Lindquist, Wager, Kober, Bliss-Moreau, & Barrett, 2012). Within this framework, it is important to examine the nature of congruent and incongruent valence representations in the brain.

Based on the existing literature, there are two outstanding questions that need to be addressed. First, how is valence congruence across visual and auditory modalities represented in brain activity? Prior research has focused primarily on brain areas involved in integration of discrete emotions. Rather than using discrete emotions, this study manipulated valence congruence across visual and auditory modalities. This is useful for understanding core neural mechanisms shared across emotions, given that accumulating evidence indicates that there are no specific and consistent neural markers for each discrete emotion (Lindquist & Barrett, 2012). Moreover, the majority of previous studies examining the neural correlates of audiovisual affective congruence focused on emotion perception with exclusive attention to a single modality, using faces and voices as stimuli, whereas the overall experience created by affective congruence or incongruence in complex situations has received far less attention. Affective experiences are mostly multisensory in nature and typically processed without attentional instructions, conditions that are currently understudied in the literature. We address this gap by focusing on how affective valence information from visual and auditory modalities combines into an overall experience rather than investigating how emotion perception from one modality is influenced by emotion information from another modality.

A second key question we explore is how neural representations of valence congruence are related to neural representations of valence in general. Accumulating evidence suggests that valence congruence might modulate valence-related brain systems (Jansma et al., 2014; Klasen et al., 2011; Dolan et al., 2001). Affectively congruent conditions may enhance valence-related brain activations compared with affectively incongruent conditions. For example, Dolan et al. (2001) used happy and fearful faces and voices as stimuli and contrasted emotionally congruent conditions to emotionally incongruent conditions. They found stronger activation for congruent conditions in the amygdala, a fear- and valence-related brain region. Moreover, recent studies decoding neural representations of emotion using multivariate pattern analysis (MVPA) also found distributed neural systems that partially overlap with brain areas involved in integrating audiovisual affective signals (e.g., Kim, Shinkareva, & Wedell, 2017). Although this evidence is suggestive of affective congruence modulating valence-related brain regions, this relationship has not been examined directly. Examining the relationship between valence congruence and valence processing in general will provide a more complete picture of how crossmodal affective integration is achieved.

To investigate these questions, we measured brain activity in participants while they viewed naturalistic video clips and listened to instrumental music. Videos, like auditory stimuli, unfold over time, giving them an advantage in ecological validity over static pictures and faces. Furthermore, instrumental music carries little semantic content and so can be combined with videos with minimal semantic conflict. We created valence-congruent or valence-incongruent audiovisual stimuli while matching arousal at a moderate level.

Most previous studies examining affective congruence across audiovisual modalities have used a univariate approach, which focuses on differences in each voxel in isolation (e.g., Jeong et al., 2011). We used an MVPA approach to determine whether audiovisual affective congruence can be identified from distributed patterns of neural activity and to localize these patterns. First, we tested whether affective congruence across visual and auditory modalities can be identified on a trial-by-trial basis across participants and sought to identify the brain areas sensitive to affective congruence. Second, we identified areas sensitive to valence, enabling us to compare the neural representation of affective congruence to that of valence.

METHODS

Participants

Twenty-one healthy, right-handed adult volunteers (14 women, mean age = 22 years, age range = 19–30 years) participated in the experiment. Participants reported no history of psychiatric or neurological disorders and no use of CNS medications. All had normal or corrected-to-normal vision and reported normal hearing. Participants were not screened for drug use and were prescreened to ensure fit with the 32-channel head coil. They gave written informed consent in accordance with the institutional review board at the University of South Carolina.

Stimuli

Stimuli consisted of 3-sec audiovisual clips created from video and music components selected based on valence and arousal ratings from a previously developed and validated in-house affective stimulus set. The validation procedure is described in detail in Kim et al. (2017). This stimulus set has been successfully used to induce affective experiences in previous studies (Gao, Wedell, Green, et al., 2018; Gao, Wedell, Kim, et al., 2018; Kim et al., 2017). Visual components of the stimuli selected for the current study consisted of 18 positive and 18 negative naturalistic video clips with balanced semantic content (human or animal) between the two valence categories. Auditory components consisted of 18 positive and 18 negative instrumental music clips without any vocal sounds, to avoid semantic information from lyrics. Components were matched on arousal (Table 1). Visual and auditory exemplars from the valence categories were randomly paired to create audiovisual stimuli that either matched or mismatched on valence, with each stimulus used once in congruent and once in incongruent pairings to create 18 unique sets for each of the four experimental conditions (2 visual valence × 2 auditory valence), with two repetitions each for a total of 144 trials. An additional six unique sets for each of the four experimental conditions were used for catch trials. The catch trials were included to maintain the participants' attention throughout the experiment and were not included in the analyses.
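
To make the pairing scheme concrete, the sketch below builds congruent and incongruent audiovisual sets from hypothetical component lists; the exemplar names, the random seed, and the use of Python are illustrative assumptions rather than the actual stimulus lists or pairing code.

```python
# Hypothetical sketch of the stimulus pairing scheme described above.
# Exemplar names and the random seed are placeholders, not the actual stimuli.
import random

random.seed(0)
videos = {"pos": [f"video_pos_{i:02d}" for i in range(18)],
          "neg": [f"video_neg_{i:02d}" for i in range(18)]}
music = {"pos": [f"music_pos_{i:02d}" for i in range(18)],
         "neg": [f"music_neg_{i:02d}" for i in range(18)]}

conditions = {"VpAp": ("pos", "pos"), "VnAn": ("neg", "neg"),   # congruent
              "VpAn": ("pos", "neg"), "VnAp": ("neg", "pos")}   # incongruent

# Randomly pair visual and auditory exemplars within each condition,
# yielding 18 unique audiovisual sets per condition; each component thus
# appears once in a congruent and once in an incongruent pairing.
pairings = {}
for cond, (v_val, a_val) in conditions.items():
    shuffled_music = random.sample(music[a_val], len(music[a_val]))
    pairings[cond] = list(zip(videos[v_val], shuffled_music))

# Each set is presented twice: 18 sets x 4 conditions x 2 repetitions = 144 trials.
n_trials = sum(len(p) for p in pairings.values()) * 2
print(n_trials)  # 144
```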

Table 1. 
Mean Valence and Arousal Ratings from Prior Norming Studies

Rating      Vp           Vn           Ap           An
Valence     7.5 (0.40)   2.6 (0.65)   7.1 (0.35)   2.8 (0.53)
Arousal     6.4 (0.47)   6.6 (0.42)   6.6 (0.50)   7.0 (0.58)

Means and standard deviations are shown. Vp = visual positive; Vn = visual negative; Ap = auditory positive; An = auditory negative.

Procedure

fMRI was used to measure brain activity while participants viewed audiovisual clips presented using E-Prime software (Psychology Software Tools). All video stimuli were 320 × 240 pixels and were presented in 32-bit color. The sound was delivered via Nordic Neuro Headphones. The experimental design crossed visual valence (positive, negative) with auditory valence (positive, negative), producing four audiovisual conditions: two congruent, visual positive with auditory positive (VpAp) and visual negative with auditory negative (VnAn), and two incongruent, visual positive with auditory negative (VpAn) and visual negative with auditory positive (VnAp). In the scanner, participants were presented with 144 experimental trials and 24 catch trials distributed over three sessions. There were 12 experimental and 2 catch trials for each of the four conditions per session (12 experimental trials × 4 conditions × 3 sessions = 144 trials; 2 catch trials × 4 conditions × 3 sessions = 24 trials). Audiovisual pairings with the same exemplars were restricted to the same session (i.e., cross-validation fold) to generalize the identification of audiovisual affective representation across stimuli. In addition to breaks between sessions for the participants to relax, there was also a 12-sec break in the middle of each session. During the experimental trials, an audiovisual stimulus was presented for 3 sec followed by a white fixation cross presented in the center of a black screen for 7 sec (Figure 1A). During the catch trials, an audiovisual stimulus was presented for 3 sec followed by a 3-sec cue (“How do you feel?”). The cue signaled participants to evaluate how they felt by pressing one of two response keys, with the right index finger indicating “positive” and the right middle finger indicating “negative.” Participants were instructed to respond as quickly as possible within the 3-sec presentation of the cue. The mean accuracy of valence judgments was 90.1% (range = 67–100%), suggesting that participants remained alert in the scanner throughout the experiment. The order of all trials within each session was random, and the order of the three sessions was counterbalanced across participants. Before scanning, participants completed a practice session outside the scanner using different stimuli from those in the main experiment.
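
As a simple check on the design arithmetic above, the following sketch reproduces the trial counts and per-trial timing; the constants are taken from the description and the code itself is purely illustrative.

```python
# Minimal sketch of the trial counts and timing described above (illustrative only).
n_sessions = 3
n_conditions = 4                        # VpAp, VnAn, VpAn, VnAp
exp_per_cond_per_session = 12
catch_per_cond_per_session = 2

n_experimental = exp_per_cond_per_session * n_conditions * n_sessions   # 144
n_catch = catch_per_cond_per_session * n_conditions * n_sessions        # 24

# Experimental trial: 3-sec audiovisual clip + 7-sec fixation = 10 sec.
# Catch trial: 3-sec clip + 3-sec response cue ("How do you feel?") = 6 sec.
exp_trial_duration = 3 + 7
catch_trial_duration = 3 + 3
print(n_experimental, n_catch, exp_trial_duration, catch_trial_duration)
```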

Figure 1. 

Schematic illustration of trial timing for (A) fMRI experiment and (B) postscan behavioral ratings.


MRI Acquisition

Data were acquired using a 3-T Siemens Prisma Fit scanner with a 32-channel coil (Siemens) at the McCausland Center for Brain Imaging at the University of South Carolina. fMRI data were acquired using a multiband gradient-echo EPI sequence using T2*-weighted BOLD contrast (multiband acceleration factor = 4), with the following parameters: repetition time = 1000 msec, echo time = 35 msec, flip angle = 62°, field of view = 210 × 210 mm, in-plane resolution = 70 × 70 pixels, slice thickness = 2.73 mm, gap = 0.27 mm, voxel size = 3 × 3 × 2.7 mm, number of slices = 40, order of slice acquisition = interleaved ascending, slice orientation = axial. fMRI data were acquired in three sessions, with 590 images collected in each session. Functional scans with reversed phase encoding were collected after the first session, resulting in 20 pairs of field map images with distortions going in opposite directions (anterior–posterior, posterior–anterior). These scans were used for distortion corrections. Anatomical MRI data were acquired using a high-resolution T1-weighted sequence with the following parameters: repetition time = 2250 msec, echo time = 4.11 msec, flip angle = 9°, field of view = 256 × 256 mm, in-plane resolution = 256 × 256 pixels, voxel size = 1 × 1 × 1 mm. The total scanning time was approximately 50 min.

Postscan Behavioral Assessments

After the scanning session, participants were asked to rate valence and arousal outside the scanner for the same stimuli used in the fMRI experiment, along with their unimodal components, as a manipulation check. The ratings were made using a 9 × 9 grid with the horizontal axis reflecting valence, varying from negative to positive, and the vertical axis reflecting arousal, varying from low to high. Participants viewed the 72 audiovisual clips used in the main experiment (18 exemplars × 2 visual valence × 2 auditory valence) along with 18 unimodal components for each of four conditions: visual positive (Vp), visual negative (Vn), auditory positive (Ap), and auditory negative (An). A trial began with the participant clicking a mouse button, followed by a 500-msec presentation of a blank screen, which was then followed by either a visual, auditory, or audiovisual clip presented for 3 sec (Figure 1B). The order of all 144 trials was random.

Data Analysis

fMRI Data Preprocessing

The neuroimaging data were preprocessed using SPM 12 (Statistical Parametric Mapping 12, https://www.fil.ion.ucl.ac.uk/spm) in MATLAB (MATLAB 2015b; The MathWorks, Inc.) and FSL 5.0 (FMRIB Software Library 5.0, https://fsl.fmrib.ox.ac.uk/fsl/fslwiki). The preprocessing procedure included realignment of all functional scans to the mean functional scan using a rigid body transformation implemented in SPM, field inhomogeneity correction with FSL's TOPUP tool, coregistration of the mean functional image to the T1 anatomical image with SPM, normalization to the standard SPM 12 EPI template (MNI stereotactic space), and resampling to a 3-mm isotropic voxel size.

MVPA

In addition to standard preprocessing as reported above, we used the GLMdenoise toolbox Version 1.4 (kendrickkay.net/GLMdenoise/) as a denoising step (Kay, Rokem, Winawer, Dougherty, & Wandell, 2013), which has been demonstrated to improve MVPA performance for fMRI data (Charest, Kriegeskorte, & Kay, 2018). For each participant, a general linear model was fit at each voxel by convolving the stimulus onsets with a canonical hemodynamic response function. A temporal derivative of the hemodynamic response function was included and orthogonalized with respect to the original regressor to account for misspecification of the hemodynamic timing (Pernet, 2014). Six head motion parameters (three translations, three rotations) from realignment were included as nuisance regressors. Low-frequency noise was removed by applying a high-pass filter of 128 sec, and temporal autocorrelations were accounted for with a first-order autoregressive model, AR(1). The estimated parameter values from the general linear model for each trial were then standardized across voxels to have zero mean and unit variance. Thus, the final input for the multivariate analyses for each voxel for each participant contained 144 values (6 exemplars × 2 repetitions × 4 conditions × 3 sessions).
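
As an illustration of the final standardization step, the sketch below z-scores each trial's spatial pattern across voxels; the array shapes and random data are placeholders for the actual single-trial parameter estimates.

```python
# Minimal sketch of the trial-wise standardization step, assuming single-trial
# beta estimates are arranged as a (trials x voxels) array.
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_voxels = 144, 50000                  # illustrative dimensions
betas = rng.normal(size=(n_trials, n_voxels))    # stand-in for GLM parameter estimates

# Standardize each trial's spatial pattern across voxels: zero mean, unit variance.
patterns = (betas - betas.mean(axis=1, keepdims=True)) / betas.std(axis=1, keepdims=True)

assert np.allclose(patterns.mean(axis=1), 0)
assert np.allclose(patterns.std(axis=1), 1)
```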

Classification Analyses

Two-way classifications were performed to identify “congruence” (congruent: VpAp + VnAn vs. incongruent: VpAn + VnAp) and “valence” (positive: VpAp vs. negative: VnAn). Classifications were performed with leave-one-participant-out cross-validation. In this procedure, the Gaussian Naïve Bayes classifier was trained on data from all but one participant and then tested by predicting each trial for the left-out participant. Within each cross-validation fold, a univariate t-test feature selection was used on the training set, selecting features based on absolute t values. For simplicity, we chose to use the top 400 voxels (Wang, Baucom, & Shinkareva, 2013; Shinkareva, Malave, Mason, Mitchell, & Just, 2011). The average classification accuracy across all cross-validation runs was reported. Statistical significance was evaluated with permutation tests, wherein obtained accuracy was compared with an empirically generated null distribution, formed by 1000 classification accuracies obtained from the same procedure, but with randomly permuted labels on each iteration.
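
A minimal sketch of this cross-participant classification scheme is given below, using scikit-learn as a stand-in for the original analysis code. The randomly generated data, the ANOVA F ranking for feature selection (equivalent to ranking by absolute t for two classes), and the reduced number of permutations are assumptions for illustration, not the study's implementation.

```python
# Illustrative sketch: leave-one-participant-out classification with Gaussian
# Naive Bayes, top-400-voxel feature selection, and a permutation test.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import LeaveOneGroupOut, permutation_test_score

rng = np.random.default_rng(0)
n_participants, n_trials, n_voxels = 21, 144, 2000
X = rng.normal(size=(n_participants * n_trials, n_voxels))   # trial patterns
y = rng.integers(0, 2, size=n_participants * n_trials)       # e.g., congruent vs. incongruent
groups = np.repeat(np.arange(n_participants), n_trials)      # participant labels

# Feature selection is placed inside the pipeline so it is refit on each
# training fold only, as in the analysis described above.
clf = make_pipeline(SelectKBest(f_classif, k=400), GaussianNB())
cv = LeaveOneGroupOut()

# Accuracy plus an empirical null distribution from label permutations
# (50 permutations here for speed; the study used 1000).
score, perm_scores, p_value = permutation_test_score(
    clf, X, y, groups=groups, cv=cv, n_permutations=50, n_jobs=-1)
print(f"accuracy = {score:.3f}, permutation p = {p_value:.3f}")
```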

Searchlight Analyses

Multivariate searchlight analyses were performed to identify the brain areas that have a distinct spatial pattern of activity for “congruence” and, separately, “valence” (Kriegeskorte, Goebel, & Bandettini, 2006). Searchlight analyses were conducted using the Searchmight toolbox (Pereira & Botvinick, 2011). For each participant and each voxel, data were extracted from a 5 × 5 × 5 voxel neighborhood centered at a given voxel. For each searchlight, two-way classifications were performed to identify “valence” as well as “congruence.” Classifications were performed with leave-one-session-out cross-validation. In this procedure, the Gaussian Naïve Bayes classifier was trained on data from two sessions and then tested on data from the left-out session, which ensured independence between training and testing data sets given the temporal separation between sessions. Notably, training and test sets did not contain the same exemplars. Classification performance was computed based on the average classification accuracy across the three cross-validation folds. The individual accuracy maps for positive versus negative or congruent versus incongruent were then subjected to a group-level random effects analysis after subtracting 50% (chance-level accuracy). Significance was tested using nonparametric permutation tests implemented in the Statistical non-Parametric Mapping toolbox (SnPM 13; warwick.ac.uk/snpm), wherein a cluster-forming threshold of p < .001 (Woo, Krishnan, & Wager, 2014) and .05 family-wise error (FWE) control of cluster size was used via 5000 permutations in conjunction with cluster size > 20 to exclude small clusters that are difficult to interpret (Christophel, Hebart, & Haynes, 2012).
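
The sketch below illustrates the searchlight logic for a single participant on toy data; the study itself used the Searchmight toolbox, so the volume dimensions, random data, and scikit-learn classifier here are illustrative assumptions only.

```python
# Simplified single-participant searchlight sketch. A Gaussian Naive Bayes
# classifier is trained and tested with leave-one-session-out cross-validation
# on each 5x5x5 voxel neighborhood; mean accuracy is stored at the center voxel.
import numpy as np
from itertools import product
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
dims = (12, 12, 12)                            # toy brain volume
n_trials = 144
data = rng.normal(size=(n_trials, *dims))      # single-trial patterns
labels = rng.integers(0, 2, size=n_trials)     # e.g., congruent vs. incongruent
sessions = np.repeat([0, 1, 2], n_trials // 3) # session labels (CV folds)

radius = 2                                     # 5 x 5 x 5 cube
accuracy_map = np.full(dims, np.nan)
cv = LeaveOneGroupOut()

for x, y, z in product(*(range(radius, d - radius) for d in dims)):
    cube = data[:,
                x - radius:x + radius + 1,
                y - radius:y + radius + 1,
                z - radius:z + radius + 1].reshape(n_trials, -1)
    scores = cross_val_score(GaussianNB(), cube, labels,
                             groups=sessions, cv=cv)
    accuracy_map[x, y, z] = scores.mean()

# Group-level inference would then be run on (accuracy_map - 0.5) across participants.
```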

Conjunction Analysis

Whole-brain conjunction analysis (Nichols, Brett, Andersson, Wager, & Poline, 2005) was performed to identify overlap between multivariate representations of valence congruence across visual and auditory modalities and valence. The conjunction null hypothesis was tested in SPM using an FWE correction based on random field theory, with a cluster-forming threshold of p < .001 and .05 FWE control of cluster size.
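
Conceptually, the conjunction null can be tested with a minimum-statistic rule: a voxel is retained only if both effects exceed the threshold. The sketch below illustrates this on placeholder group-level maps; the data and threshold are assumptions, and the actual analysis additionally applied FWE cluster-extent correction in SPM.

```python
# Minimal sketch of a minimum-statistic conjunction (Nichols et al., 2005),
# assuming two group-level z maps thresholded at the same voxel-wise level.
import numpy as np

rng = np.random.default_rng(0)
z_congruence = rng.normal(size=(10, 10, 10))   # stand-in group z map: congruence
z_valence = rng.normal(size=(10, 10, 10))      # stand-in group z map: valence
z_threshold = 3.09                             # roughly p < .001, one-sided

# Conjunction null: a voxel survives only if BOTH effects exceed the threshold,
# i.e., the minimum of the two statistics exceeds it.
conjunction = np.minimum(z_congruence, z_valence) > z_threshold
print(conjunction.sum(), "voxels in the conjunction (before cluster-extent correction)")
```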

Behavioral Data Analyses

For the postscan behavioral assessments, paired t tests and repeated-measures ANOVAs were used to analyze the data. To confirm the differences in valence ratings for each of the valence categories, a Modality (visual, auditory) × Valence (positive, negative) two-way ANOVA was conducted on mean valence ratings for unimodal trials. To confirm the audiovisual interaction effects, a Visual Valence (positive, negative) × Auditory Valence (positive, negative) two-way ANOVA was conducted on mean valence ratings for audiovisual trials. Follow-up tests used a Bonferroni correction for multiple comparisons.
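
A hedged sketch of this analysis pipeline is shown below using statsmodels and SciPy on randomly generated ratings; the column names and data are assumptions, not the actual data format.

```python
# Illustrative 2 x 2 repeated-measures ANOVA on mean valence ratings for
# audiovisual trials, followed by Bonferroni-corrected paired t tests.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM
from scipy.stats import ttest_rel

rng = np.random.default_rng(0)
rows = []
for s in range(21):                                  # 21 participants
    for v_val in ("positive", "negative"):
        for a_val in ("positive", "negative"):
            rows.append({"subject": s, "visual_valence": v_val,
                         "auditory_valence": a_val,
                         "rating": rng.normal(loc=5, scale=1)})
df = pd.DataFrame(rows)

# Visual Valence x Auditory Valence repeated-measures ANOVA.
anova = AnovaRM(df, depvar="rating", subject="subject",
                within=["visual_valence", "auditory_valence"]).fit()
print(anova)

# Follow-up: auditory valence effect within each visual valence condition,
# Bonferroni-corrected for the two comparisons.
for v_val in ("positive", "negative"):
    sub = df[df.visual_valence == v_val].pivot(index="subject",
                                               columns="auditory_valence",
                                               values="rating")
    t, p = ttest_rel(sub["positive"], sub["negative"])
    print(v_val, "videos: t =", round(t, 2), "p_bonf =", round(min(p * 2, 1.0), 3))
```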

RESULTS

Congruence

Identification of Affective Congruence across Participants

The classifiers were trained on data from all but one participant to identify each trial as affectively congruent or incongruent in the left-out participant. The average classification accuracy across participants was 62.4% (p < .05), with the accuracies ranging from 52.8% to 72.2%.

Representation of Affective Congruence

The multivariate searchlight analyses identified several regions sensitive to affective congruence: left inferior parietal lobule, triangular and opercular parts of inferior frontal gyrus and insula; right superior parietal and postcentral gyri, pregenual anterior cingulate and middle cingulate cortices, and precuneus; bilateral superior temporal and supramarginal gyri; and rolandic operculum (Figure 2A and Table 2).

Figure 2. 

(A) Multivariate searchlight results for identifying affectively congruent from affectively incongruent audiovisual trials. (B) Multivariate searchlight results for identifying affectively positive from affectively negative audiovisual trials. (C) Conjunction analysis showing overlap between multivariate representations of valence congruence across visual and auditory modalities and valence. IFG = inferior frontal gyrus; STC = superior temporal cortex; IP = inferior parietal; SP = superior parietal; Precu = precuneus; pgACC = pregenual anterior cingulate; PreC = precentral; PostC = postcentral.


Table 2. 
Summary of Multivariate Searchlight Results Identifying Brain Regions that Differentiate Affectively Congruent versus Incongruent Audiovisual Trials
Region                                                               Cluster Size   Peak Coordinates (x, y, z; mm)   Z Score
R superior parietal                                                  25             18, −49, 56                      4.17
                                                                                    27, −49, 56                      3.61
R pregenual anterior cingulate                                       28             12, 44, 14                       4.14
                                                                                    12, 38, 20                       3.41
R middle cingulate/precuneus                                         75             12, −46, 35                      4.14
                                                                                    12, −58, 32                      4.10
                                                                                    15, −52, 44                      4.03
L superior temporal/rolandic operculum/supramarginal                 34             −51, −34, 17                     4.13
                                                                                    −45, −25, 20                     3.54
                                                                                    −54, −34, 26                     3.43
L inferior parietal                                                  28             −42, −43, 41                     4.13
                                                                                    −36, −55, 41                     3.67
L insula/triangular and opercular parts of inferior frontal gyrus    21             −33, 20, 14                      4.11
                                                                                    −45, 23                          3.70
                                                                                    −48, 17, 14                      3.67
L supramarginal/postcentral                                          21             39, −37, 44                      3.96
                                                                                    45, −28, 44                      3.46
R superior temporal/rolandic operculum                               21             51, −34, 14                      3.91
                                                                                    42, −31, 17                      3.24

The results are based on a permutation test with a cluster-forming threshold of p < .001 and .05 FWE control of cluster size via 5000 permutations in conjunction with cluster size > 20. Anatomical location labeling is based on the AAL3 atlas (www.gin.cnrs.fr/en/tools/aal/). L = left hemisphere; R = right hemisphere.

Valence

Identification of Valence across Participants

The classifiers were trained on data from all but one participant to identify each trial as positive or negative in the left-out participant. The average classification accuracy across participants was 73.7% (p < .05), with the accuracies ranging from 55.6% to 81.9%.

Representation of Valence

The multivariate searchlight analyses identified several regions sensitive to valence information: left paracentral lobule, middle cingulate cortex, supramarginal gyrus, precuneus, and cuneus; right rolandic operculum, superior medial frontal, superior frontal, middle frontal, precentral and postcentral gyri, and pregenual ACC; and bilateral superior temporal cortex (Figure 2B and Table 3).

Table 3. 
Summary of Multivariate Searchlight Results Identifying Brain Regions that Differentiate Positive versus Negative Audiovisual Trials
Region                                                    Cluster Size   Peak Coordinates (x, y, z; mm)   Z Score
R superior temporal/rolandic operculum                    58             51, −31, 17                      4.42
                                                                         57, −28, 11                      3.81
                                                                         42, −28, 20                      3.64
R pregenual anterior cingulate/superior medial frontal    49             47, 26                           4.17
                                                                         15, 41, 17                       3.67
L paracentral lobule/middle cingulate                     81             −6, −34, 59                      3.97
                                                                         −15, −34, 44                     3.94
                                                                         −3, −43, 50                      3.80
L superior temporal/supramarginal                         33             −48, −37, 14                     3.90
                                                                         −51, −28, 11                     3.53
                                                                         −51, −31, 26                     3.42
R superior frontal/middle frontal                         22             24, 23, 47                       3.84
                                                                         33, 14, 44                       3.18
R precentral/postcentral                                  34             45, −1, 35                       3.83
                                                                         54, −4, 26                       3.80
                                                                         42, −10, 32                      3.21
L precuneus/cuneus                                        30             −6, −67, 38                      3.66
                                                                         −12, −76, 35                     3.64
                                                                         −6, −64, 29                      3.40

The results are based on a permutation test with a cluster-forming threshold of p < .001 and .05 FWE control of cluster size via 5000 permutations in conjunction with cluster size > 20. Anatomical location labeling is based on the AAL3 atlas (www.gin.cnrs.fr/en/tools/aal/). L = left hemisphere; R = right hemisphere.

Overlap of Congruence and Valence Representations

Conjunction analysis identified overlap between multivariate representations of valence congruence across visual and auditory modalities and valence in left supramarginal gyrus, right pregenual ACC, and bilateral superior temporal gyri (Figure 2C and Table 4).

Table 4. 
Summary of Conjunction Analysis Results Identifying Brain Regions Associated with Representations of Both Valence Congruence across Visual and Auditory Modalities and Valence
Region                                   Cluster Size   Peak Coordinates (x, y, z; mm)   Z Score
R superior temporal                      13             51, −34, 14                      4.40
                                                        51, −25, 17                      3.32
L superior temporal/supramarginal        19             −51, −34, 17                     4.16
                                                        −51, −31, 26                     3.55
R pregenual anterior cingulate           24             44, 14                           3.90
                                                        47, 26                           3.34

The results are based on a FWE correction found by random field theory with a cluster-forming threshold of p < .001 and .05 FWE control of cluster size. Anatomical location labeling is based on the AAL3 atlas (www.gin.cnrs.fr/en/tools/aal/). L = left hemisphere; R = right hemisphere.

Postscan Behavioral Assessment

For the postscan behavioral assessments, the Modality (visual, auditory) × Valence (positive, negative) two-way repeated-measures ANOVA conducted on mean valence ratings for unimodal trials showed that positive valence was differentiated from negative valence, confirming the experimental manipulation, F(1, 20) = 474.9, p < .001, η2 = .86 (Table 5). The Modality main effect was not significant, F(1, 20) = 0.18, p = .68. The Modality × Valence interaction was significant, F(1, 20) = 6.43, p = .02, η2 = .07. Follow-up tests of valence differences conducted for each modality were both significant (ps < .001), with a larger difference for visual stimuli. Furthermore, postscan and prior norming ratings were highly correlated, r = .945, p < .001.

Table 5. 
Postscan Mean Valence Ratings with Standard Errors in Parentheses
Unimodal          Vp           Vn           Ap           An
Valence rating    7.2 (0.17)   2.9 (0.20)   6.8 (0.14)   3.4 (0.20)

Audiovisual       VpAp         VpAn         VnAp         VnAn
Valence rating    7.4 (0.01)   5.2 (0.01)   3.8 (0.01)   2.6 (0.01)

Vp = visual positive; Vn = visual negative; Ap = auditory positive; An = auditory negative; VpAp = visual positive–auditory positive; VpAn = visual positive–auditory negative; VnAp = visual negative–auditory positive; VnAn = visual negative–auditory negative.

The Visual Valence (positive, negative) × Auditory Valence (positive, negative) two-way repeated-measures ANOVA conducted on mean valence ratings for audiovisual stimuli showed a main effect of Visual Valence, F(1, 20) = 161.2, p < .001, η2 = .82; a main effect of Auditory Valence, F(1, 20) = 72.0, p < .001, η2 = .58; and an interaction between Visual Valence and Auditory Valence, F(1, 20) = 25.5, p < .001, η2 = .09 (Table 5). This interaction reflects a negativity bias in which negatively valenced stimuli carry greater weight. Follow-up tests of auditory valence differences conducted for each visual valence condition were both significant (ps < .001), with a smaller difference for the negative than the positive videos, suggesting the negative videos are more difficult to alter. Finally, to examine whether there is an enhancement effect for congruent audiovisual stimuli relative to the average of their unimodal components, we compared the average of single modality ratings to the congruent multimodal pairings. The rating of VpAp was more extreme than the average of the Vp and Ap stimuli, t(20) = 2.3, p = .04, Cohen's d = 0.49 (7.4 > 7.0), and the rating of VnAn was more extreme than the average of the Vn and An stimuli, t(20) = 6.0, p < .001, Cohen's d = 1.3 (2.6 < 3.1). We also compared the congruent audiovisual pairings to each of their unimodal components. There was no significant difference between rating of VpAp and Vp, t(20) = 0.73, p = .5. The rating of VpAp was more extreme than Ap, t(20) = 2.8, p = .01, Cohen's d = 0.61 (7.4 > 6.8). The rating of VnAn was more extreme than Vn, t(20) = 2.9, p = .01, Cohen's d = 0.63 (2.6 < 2.9). The rating of VnAn was more extreme than An, t(20) = 5.1, p < .001, Cohen's d = 1.1 (2.6 < 3.4).

DISCUSSION

The main objective of this study was to determine brain regions sensitive to the distinction between affective congruence and incongruence across visual and auditory modalities. First, we identified regions sensitive to valence congruence. Second, we found that whole-brain activity patterns for decoding valence are also generalizable across participants. Third, we found that the patterns of affective congruence overlap with the patterns of valence in bilateral superior temporal cortex and right pregenual anterior cingulate. These findings add to a neural account of affective congruence effects and contribute to our understanding of how different brain regions support our ability to have complex affective evaluations and experiences.

Our cross-participant classification results provide evidence that fine-grained patterns of neural activity for distinguishing affective congruence and affective valence are generalizable across participants (see also Baucom, Wedell, Wang, Blitzer, & Shinkareva, 2012). In everyday life, people evaluate emotional scenes by integrating information from multiple modalities, most frequently vision and audition, and differentially respond to affectively congruent and incongruent content. Our study demonstrates that the underlying neural patterns of activation reflecting this cognitive process are shared across participants.

Previous studies have examined congruence for discrete emotions. Our study extends these findings to the dimensional representation of affective valence under conditions in which attention is not directed to one modality or the other. We will discuss key regions in the distributed neural system supporting the distinction between affective congruence and incongruence: left inferior frontal gyrus, bilateral superior temporal, right pregenual anterior cingulate, and right superior and inferior parietal cortices.

Similar to studies of affective congruence for discrete emotions that used prosody and semantic content in spoken sentences, we identified the inferior frontal gyrus (Wittfoth et al., 2009; Mitchell, 2006; Schirmer, Zysset, Kotz, & von Cramon, 2004). Previous studies have suggested that inferior frontal cortex is associated with domain-general conflict resolution and cognitive control processes (Novick, Trueswell, & Thompson-Schill, 2010; Nelson, Reuter-Lorenz, Persson, Sylvester, & Jonides, 2009; Derrfuss, Brass, Neumann, & von Cramon, 2005; Novick, Trueswell, & Thompson-Schill, 2005). For example, one study found that interference resolution during retrieval from working memory and from semantic memory can be mapped to a common brain area: inferior frontal gyrus (Nelson et al., 2009). Our results imply that the domain-general role of inferior frontal gyrus in conflict resolution generalizes to crossmodal valence processing and that the inferior frontal gyrus contains information distinguishing affectively congruent from incongruent signals beyond discrete emotions.

We also identified the involvement of bilateral superior temporal cortex in distinguishing affective congruence from incongruence. Previous work has implicated superior temporal cortex in affective congruence: Wittfoth et al. (2009) paired emotional prosody with semantic content and found superior temporal cortex involvement in an affectively incongruent versus affectively congruent comparison. Another study found superior temporal gyrus activation for a congruent versus incongruent comparison when affective congruence of discrete emotions was manipulated across visual and auditory modalities using face–voice stimuli (Jeong et al., 2011). Thus, there appears to be strong evidence that superior temporal cortex plays an important role in evaluating emotional content across visual and auditory modalities.

Pregenual ACC was also involved in distinguishing affective congruence from incongruence. The conflict monitoring hypothesis posits that the ACC functions to detect the occurrence of conflict (Botvinick, Cohen, & Carter, 2004). Our results are consistent with prior findings implicating the ACC in the resolution of emotional conflict (Egner, Etkin, Gale, & Hirsch, 2007; Etkin, Egner, Peraza, Kandel, & Hirsch, 2006) and extend these results to affective congruence detection in audiovisual settings.

Superior and inferior parietal cortices were also involved in distinguishing affective congruence from incongruence. Superior parietal cortex has been associated with multiple cognitive processes, including attention (Corbetta, Shulman, Miezin, & Petersen, 1995), spatial perception (Weiss, Marshall, Zilles, & Fink, 2003), episodic memory (Vilberg & Rugg, 2008), and visual–motor integration (Culham & Valyear, 2006). The involvement of superior parietal cortex in distinguishing affective congruence from incongruence in our study could reflect attention- and memory-related processes, with congruent and incongruent content eliciting different attentional allocation and memory retrieval. Inferior parietal cortex (i.e., angular gyrus) has been associated with semantic processing (Binder, Desai, Graves, & Conant, 2009). Therefore, the involvement of superior and inferior parietal cortices may reflect a composite of cognitive processes, such as memory retrieval and attentional allocation, in distinguishing affective congruence from incongruence.

Our results distinguishing affectively congruent and incongruent content across visual and auditory modalities may be compared with multisensory integration processes in general. Previous studies have manipulated crossmodal congruence to identify neural correlates of audiovisual integration. For instance, when sensory cues from more than one modality occur simultaneously from the same spatial location, electrophysiological studies of animals show an enhanced firing rate of multisensory cells in the superior colliculus, whereas a response depression is found when cues are asynchronous or spatially disparate (Kadunce, Vaughan, Wallace, Benedek, & Stein, 1997; Wallace, Wilkinson, & Stein, 1996). Similar findings have been shown for human participants using fMRI (Calvert, Hansen, Iversen, & Brammer, 2001; Calvert, Campbell, & Brammer, 2000). One study investigated audiovisual integration of speech by instructing participants to hear speech while watching silent mouth and lip movements (Calvert et al., 2001). They found greater engagement of superior temporal gyrus for the congruent audiovisual inputs than for the incongruent audiovisual inputs. However, these findings do not necessarily generalize to affective processing. A recent meta-analysis summarized neuroimaging studies on audiovisual affective processing and found a series of brain areas involved in integrating visual and auditory affective signals, including right posterior superior temporal gyrus, left anterior superior temporal gyrus, right amygdala, and thalamus (Gao, Weber, & Shinkareva, 2019). The combined evidence implicates the superior temporal cortex in both general audiovisual integration and audiovisual integration of affect.

Notably, our design allowed us to determine regions sensitive to processing positive versus negative valence. We found several regions in an affective network whose activity patterns were sensitive to valence. Our results are consistent with previous studies decoding affective dimensions and discrete emotions, which showed activity patterns in superior temporal cortex (Kim et al., 2017; Kim, Wang, Wedell, & Shinkareva, 2016; Kotz, Kalberlah, Bahlmann, Friederici, & Haynes, 2013; Sitaram et al., 2011; Peelen, Atkinson, & Vuilleumier, 2010; Said, Moore, Engell, Todorov, & Haxby, 2010; Ethofer, Van De Ville, Scherer, & Vuilleumier, 2009), anterior cingulate (Saarimäki et al., 2018; Kotz et al., 2013; Sitaram et al., 2011), middle cingulate (Chikazoe, Lee, Kriegeskorte, & Anderson, 2014), precentral (Saarimäki et al., 2015, 2018), postcentral (Kim et al., 2017; Kragel & LaBar, 2016; Saarimäki et al., 2015), superior frontal (Sitaram et al., 2011), middle frontal (Kim et al., 2017; Kotz et al., 2013; Sitaram et al., 2011), and precuneus (Kim et al., 2016, 2017; Saarimäki et al., 2015; Sitaram et al., 2011). Our findings are also consistent with contemporary views that affective dimensions and emotion categories are represented across distributed neural systems (Satpute & Lindquist, 2019).

We found an overlap between regions sensitive to affective congruence and valence representations. Brain regions that were sensitive to both congruence and valence included bilateral superior temporal and right pregenual anterior cingulate cortices. This result highlights the multiple roles of superior temporal cortex in multisensory affective processing. As a multisensory integration center, superior temporal cortex integrates information from visual and auditory modalities (Gao et al., 2019; Driver & Noesselt, 2008). Superior temporal cortex also codes affective information, with its affect-related activity modulated by the congruence manipulation. Moreover, our results showed involvement of pregenual anterior cingulate in both valence congruence and valence processing. The pregenual ACC is associated with valuation processes (Dixon, Thiruchselvam, Todd, & Christoff, 2017; Bartra, McGuire, & Kable, 2013; Amemori & Graybiel, 2012) and is activated when individuals attend internally to their subjective feelings (Kulkarni et al., 2005). Our results suggest that pregenual anterior cingulate is important for multisensory affective processing, which involves an evaluation of interoceptive feelings. Taken together, our results lend support to the idea that valence may be a key determinant of affective congruence processing across a variety of discrete emotions.

Although Dolan et al. (2001) found that the amygdala was modulated by congruence of discrete emotions using face–voice stimuli, we did not find amygdala involvement in our study. Previous literature has documented that the amygdala plays a strong role in facial expression processing but a weak role in processing vocal expressions (see Schirmer & Adolphs, 2017, for a review). The absence of the amygdala in our findings might therefore be due to the different stimulus types.

An alternative explanation of the overlap of these regions is that the comparison between affectively congruent and affectively incongruent conditions overlaps with a comparison between valenced and neutral conditions. According to the averaging model, combining positive and negative valence can result in an intermediate valence level: neutral (Gao, Wedell, Green, et al., 2018; Gao, Wedell, Kim, et al., 2018). This interpretation is in line with our postscanner ratings showing that ratings of affectively incongruent stimuli are intermediate between the ratings of their visual and auditory components.

Our postscanner ratings also showed that combinations of the same extreme valence led to more extreme state ratings than component stimuli presented in isolation (see also Gao, Wedell, Green, et al., 2018; Gao, Wedell, Kim, et al., 2018). These results support the hypothesis that affective congruence can influence valence-related neural systems. Previous studies have demonstrated that audiovisual congruence can influence perception, speech, and emotional processes. For example, participants' perceptions of a speech signal can be changed depending on the combination of visual and auditory signals (e.g., McGurk & MacDonald, 1976), wherein brain areas related to speech perception are influenced (Beauchamp, 2016). Evidence from affective processing also showed modulation of congruence on valence-related brain areas (Klasen et al., 2011; Dolan et al., 2001). Combined with these previous findings, this study is consistent with the idea that processing related to affective congruence may influence the behavioral and neural measures associated with valence processing.

One caveat in interpreting these results is that MVPA analyses are nondirectional. Therefore, it is unclear whether activation in identified regions is stronger for the congruent or incongruent conditions or how these activation patterns relate to behavioral ratings. Also, our behavioral analyses of postscan ratings indicated that there was a significant difference in arousal ratings between congruent and incongruent conditions. Therefore, although we attempted to control arousal and the arousal difference was not large (Mcongruent = 6.41, Mincongruent = 6.07), we cannot fully rule out possible influences of arousal.

In conclusion, we were able to identify individual trials as affectively congruent or incongruent across participants based on whole-brain activity patterns. We showed that widely distributed brain areas contain information for distinguishing affectively congruent from incongruent content. We also found that the neural systems related to affective congruence overlap with the neural representations associated with valence processing. Taken together, these results provide insights into the neural mechanisms for distinguishing congruent from incongruent affective content across visual and auditory modalities.

Acknowledgments

We thank Roger Newman-Norlund for his consultation and technical assistance with fMRI data collection. This work was supported by the College of Arts and Sciences Faculty Research Initiative at the University of South Carolina.

Reprint requests should be sent to Svetlana V. Shinkareva, Department of Psychology, Institute for Mind and Brain, University of South Carolina, Columbia, SC 29201, or via e-mail: shinkareva@sc.edu.

REFERENCES

Amemori, K., & Graybiel, A. M. (2012). Localized microstimulation of primate pregenual cingulate cortex induces negative decision-making. Nature Neuroscience, 15, 776–785.
Barrett, L. F. (2006). Valence is a basic building block of emotional life. Journal of Research in Personality, 40, 35–55.
Bartra, O., McGuire, J. T., & Kable, J. W. (2013). The valuation system: A coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. Neuroimage, 76, 412–427.
Baucom, L. B., Wedell, D. H., Wang, J., Blitzer, D. N., & Shinkareva, S. V. (2012). Decoding the neural representation of affective states. Neuroimage, 59, 718–727.
Baumgartner, T., Esslen, M., & Jäncke, L. (2006). From emotion perception to emotion experience: Emotions evoked by pictures and classical music. International Journal of Psychophysiology, 60, 34–43.
Beauchamp, M. S. (2016). Audiovisual speech integration: Neural substrates and behavior. In L. Hickok & S. Small (Eds.), Neurobiology of language (pp. 515–526). Cambridge, MA: Academic Press.
Binder, J. R., Desai, R. H., Graves, W. W., & Conant, L. L. (2009). Where is the semantic system? A critical review and meta-analysis of 120 functional neuroimaging studies. Cerebral Cortex, 19, 2767–2796.
Botvinick, M. M., Cohen, J. D., & Carter, C. S. (2004). Conflict monitoring and anterior cingulate cortex: An update. Trends in Cognitive Sciences, 8, 539–546.
Calvert, G. A., Campbell, R., & Brammer, M. J. (2000). Evidence from functional magnetic resonance imaging of crossmodal binding in the human heteromodal cortex. Current Biology, 10, 649–657.
Calvert, G. A., Hansen, P. C., Iversen, S. D., & Brammer, M. J. (2001). Detection of audio-visual integration sites in humans by application of electrophysiological criteria to the BOLD effect. Neuroimage, 14, 427–438.
Charest, I., Kriegeskorte, N., & Kay, K. N. (2018). GLMdenoise improves multivariate pattern analysis of fMRI data. Neuroimage, 183, 606–616.
Chikazoe, J., Lee, D. H., Kriegeskorte, N., & Anderson, A. K. (2014). Population coding of affect across stimuli, modalities and individuals. Nature Neuroscience, 17, 1114–1122.
Christensen, J. F., Gaigg, S. B., Gomila, A., Oke, P., & Calvo-Merino, B. (2014). Enhancing emotional experiences to dance through music: The role of valence and arousal in the cross-modal bias. Frontiers in Human Neuroscience, 8, 757.
Christophel, T. B., Hebart, M. N., & Haynes, J. D. (2012). Decoding the contents of visual short-term memory from human visual and parietal cortex. Journal of Neuroscience, 32, 12983–12989.
Corbetta, M., Shulman, G. L., Miezin, F. M., & Petersen, S. E. (1995). Superior parietal cortex activation during spatial attention shifts and visual feature conjunction. Science, 270, 802–805.
Culham, J. C., & Valyear, K. F. (2006). Human parietal cortex in action. Current Opinion in Neurobiology, 16, 205–212.
De Gelder, B., & Bertelson, P. (2003). Multisensory integration, perception and ecological validity. Trends in Cognitive Sciences, 7, 460–467.
Derrfuss, J., Brass, M., Neumann, J., & von Cramon, D. Y. (2005). Involvement of the inferior frontal junction in cognitive control: Meta-analyses of switching and Stroop studies. Human Brain Mapping, 25, 22–34.
Dixon, M. L., Thiruchselvam, R., Todd, R., & Christoff, K. (2017). Emotion and the prefrontal cortex: An integrative review. Psychological Bulletin, 143, 1033–1081.
Dolan, R. J., Morris, J. S., & de Gelder, B. (2001). Crossmodal binding of fear in voice and face. Proceedings of the National Academy of Sciences, U.S.A., 98, 10006–10010.
Driver, J., & Noesselt, T. (2008). Multisensory interplay reveals crossmodal influences on 'sensory-specific' brain regions, neural responses, and judgments. Neuron, 57, 11–23.
Egner, T., Etkin, A., Gale, S., & Hirsch, J. (2007). Dissociable neural systems resolve conflict from emotional versus nonemotional distracters. Cerebral Cortex, 18, 1475–1484.
Ethofer, T., Van De Ville, D., Scherer, K., & Vuilleumier, P. (2009). Decoding of emotional information in voice-sensitive cortices. Current Biology, 19, 1028–1033.
Etkin, A., Egner, T., Peraza, D. M., Kandel, E. R., & Hirsch, J. (2006). Resolving emotional conflict: A role for the rostral anterior cingulate cortex in modulating activity in the amygdala. Neuron, 51, 871–882.
Gao, C., Weber, C. E., & Shinkareva, S. V. (2019). The brain basis of audiovisual affective processing: Evidence from a coordinate-based activation likelihood estimation meta-analysis. Cortex, 120, 66–77.
Gao, C., Wedell, D. H., Green, J. J., Jia, X., Mao, X., Guo, C., et al. (2018). Temporal dynamics of audiovisual affective processing. Biological Psychology, 139, 59–72.
Gao, C., Wedell, D. H., Kim, J., Weber, C. E., & Shinkareva, S. V. (2018). Modelling audiovisual integration of affect from videos and music. Cognition and Emotion, 32, 516–529.
Gerdes, A., Wieser, M. J., Bublatzky, F., Kusay, A., Plichta, M. M., & Alpers, G. W. (2013). Emotional sounds modulate early neural processing of emotional pictures. Frontiers in Psychology, 4, 741.
Jansma, H., Roebroeck, A., & Münte, T. (2014). A network analysis of audiovisual affective speech perception. Neuroscience, 256, 230–241.
Jeong, J. W., Diwadkar, V. A., Chugani, C. D., Sinsoongsud, P., Muzik, O., Behen, M. E., et al. (2011). Congruence of happy and sad emotion in music and faces modifies cortical audiovisual activation. Neuroimage, 54, 2973–2982.
Kadunce, D. C., Vaughan, J. W., Wallace, M. T., Benedek, G., & Stein, B. E. (1997). Mechanisms of within- and cross-modality suppression in the superior colliculus. Journal of Neurophysiology, 78, 2834–2847.
Kay, K., Rokem, A., Winawer, J., Dougherty, R., & Wandell, B. (2013). GLMdenoise: A fast, automated technique for denoising task-based fMRI data. Frontiers in Neuroscience, 7, 247.
Kim, J., Shinkareva, S. V., & Wedell, D. H. (2017). Representations of modality-general valence for videos and music derived from fMRI data. Neuroimage, 148, 42–54.
Kim, J., Wang, J., Wedell, D. H., & Shinkareva, S. V. (2016). Identifying core affect in individuals from fMRI responses to dynamic naturalistic audiovisual stimuli. PLoS One, 11, e0161589.
Klasen, M., Kenworthy, C. A., Mathiak, K. A., Kircher, T. T., & Mathiak, K. (2011). Supramodal representation of emotions. Journal of Neuroscience, 31, 13635–13643.
Kotz, S. A., Kalberlah, C., Bahlmann, J., Friederici, A. D., & Haynes, J. D. (2013). Predicting vocal emotion expressions from the human brain. Human Brain Mapping, 34, 1971–1981.
Kragel, P. A., & LaBar, K. S. (2016). Somatosensory representations link the perception of emotional expressions and sensory experience. eNeuro, 3, ENEURO.0090-15.2016.
Kriegeskorte, N., Goebel, R., & Bandettini, P. (2006). Information-based functional brain mapping. Proceedings of the National Academy of Sciences, U.S.A., 103, 3863–3868.
Kulkarni, B., Bentley, D. E., Elliott, R., Youell, P., Watson, A., Derbyshire, S. W., et al. (2005). Attention to pain localization and unpleasantness discriminates the functions of the medial and lateral pain systems. European Journal of Neuroscience, 21, 3133–3142.
Lindquist, K. A. (2013). Emotions emerge from more basic psychological ingredients: A modern psychological constructionist model. Emotion Review, 5, 356–368.
Lindquist, K. A., & Barrett, L. F. (2012). A functional architecture of the human brain: Emerging insights from the science of emotion. Trends in Cognitive Sciences, 16, 533–540.
Lindquist, K. A., Wager, T. D., Kober, H., Bliss-Moreau, E., & Barrett, L. F. (2012). The brain basis of emotion: A meta-analytic review. Behavioral and Brain Sciences, 35, 121–143.
McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264, 746–748.
Mitchell, R. L. (2006). How does the brain mediate interpretation of incongruent auditory emotions? The neural response to prosody in the presence of conflicting lexico-semantic cues. European Journal of Neuroscience, 24, 3611–3618.
Müller, V. I., Habel, U., Derntl, B., Schneider, F., Zilles, K., Turetsky, B. I., et al. (2011). Incongruence effects in crossmodal emotional integration. Neuroimage, 54, 2257–2266.
Nelson, J. K., Reuter-Lorenz, P. A., Persson, J., Sylvester, C. Y., & Jonides, J. (2009). Mapping interference resolution across task domains: A shared control process in left inferior frontal gyrus. Brain Research, 1256, 92–100.
Nichols, T., Brett, M., Andersson, J., Wager, T., & Poline, J. B. (2005). Valid conjunction inference with the minimum statistic. Neuroimage, 25, 653–660.
Novick, J. M., Trueswell, J. C., & Thompson-Schill, S. L. (2005). Cognitive control and parsing: Reexamining the role of Broca's area in sentence comprehension. Cognitive, Affective, & Behavioral Neuroscience, 5, 263–281.
Novick, J. M., Trueswell, J. C., & Thompson-Schill, S. L. (2010). Broca's area and language processing: Evidence for the cognitive control connection. Language and Linguistics Compass, 4, 906–924.
Peelen, M. V., Atkinson, A. P., & Vuilleumier, P. (2010). Supramodal representations of perceived emotions in the human brain. Journal of Neuroscience, 30, 10127–10134.
Pehrs, C., Deserno, L., Bakels, J. H., Schlochtermeier, L. H., Kappelhoff, H., Jacobs, A. M., et al. (2013). How music alters a kiss: Superior temporal gyrus controls fusiform–amygdalar effective connectivity. Social Cognitive and Affective Neuroscience, 9, 1770–1778.
Pereira, F., & Botvinick, M. (2011). Information mapping with pattern classifiers: A comparative study. Neuroimage, 56, 476–496.
Pernet, C. R. (2014). Misconceptions in the use of the general linear model applied to functional MRI: A tutorial for junior neuro-imagers. Frontiers in Neuroscience, 8, 1.
Petrini, K., Crabbe, F., Sheridan, C., & Pollick, F. E. (2011). The music of your emotions: Neural substrates involved in detection of emotional correspondence between auditory and visual music actions. PLoS One, 6, e19165.
Russell, J. A. (2003). Core affect and the psychological construction of emotion. Psychological Review, 110, 145–172.
Saarimäki, H., Ejtehadian, L. F., Glerean, E., Jääskeläinen, I. P., Vuilleumier, P., Sams, M., et al. (2018). Distributed affective space represents multiple emotion categories across the human brain. Social Cognitive and Affective Neuroscience, 13, 471–482.
Saarimäki, H., Gotsopoulos, A., Jääskeläinen, I. P., Lampinen, J., Vuilleumier, P., Hari, R., et al. (2015). Discrete neural signatures of basic emotions. Cerebral Cortex, 26, 2563–2573.
Said, C. P., Moore, C. D., Engell, A. D., Todorov, A., & Haxby, J. V. (2010). Distributed representations of dynamic facial expressions in the superior temporal sulcus. Journal of Vision, 10, 11.
Satpute, A. B., & Lindquist, K. A. (2019). The default mode network's role in discrete emotion. Trends in Cognitive Sciences, 23, 851–864.
Schirmer, A., & Adolphs, R. (2017). Emotion perception from face, voice, and touch: Comparisons and convergence. Trends in Cognitive Sciences, 21, 216–228.
Schirmer, A., Zysset, S., Kotz, S. A., & von Cramon, D. Y. (2004). Gender differences in the activation of inferior frontal cortex during emotional speech perception. Neuroimage, 21, 1114–1123.
Shinkareva, S. V., Malave, V. L., Mason, R. A., Mitchell, T. M., & Just, M. A. (2011). Commonality of neural representations of words and pictures. Neuroimage, 54, 2418–2425.
Sitaram, R., Lee, S., Ruiz, S., Rana, M., Veit, R., & Birbaumer, N. (2011). Real-time support vector classification and feedback of multiple emotional brain states. Neuroimage, 56, 753–765.
Vilberg, K. L., & Rugg, M. D. (2008). Memory retrieval and the parietal cortex: A review of evidence from a dual-process perspective. Neuropsychologia, 46, 1787–1799.
Wallace, M. T., Wilkinson, L. K., & Stein, B. E. (1996). Representation and integration of multiple sensory inputs in primate superior colliculus. Journal of Neurophysiology, 76, 1246–1266.
Wang, J., Baucom, L. B., & Shinkareva, S. V. (2013). Decoding abstract and concrete concept representations based on single-trial fMRI data. Human Brain Mapping, 34, 1133–1147.
Watson, R., Latinus, M., Noguchi, T., Garrod, O. G. B., Crabbe, F., & Belin, P. (2013). Dissociating task difficulty from incongruence in face-voice emotion integration. Frontiers in Human Neuroscience, 7, 744.
Weiss, P. H., Marshall, J. C., Zilles, K., & Fink, G. R. (2003). Are action and perception in near and far space additive or interactive factors? Neuroimage, 18, 837–846.
Wittfoth, M., Schröder, C., Schardt, D. M., Dengler, R., Heinze, H. J., & Kotz, S. A. (2009). On emotional conflict: Interference resolution of happy and angry prosody reveals valence-specific effects. Cerebral Cortex, 20, 383–392.
Woo, C. W., Krishnan, A., & Wager, T. D. (2014). Cluster-extent based thresholding in fMRI analyses: Pitfalls and recommendations. Neuroimage, 91, 412–419.