There is a lively and theoretically important debate about whether, how, and when embodiment contributes to language comprehension. This study addressed these questions by testing how interference with facial action impacts the brain's real-time response to emotional language. Participants read sentences about positive and negative events (e.g., “She reached inside the pocket of her coat from last winter and found some (cash/bugs) inside it.”) while ERPs were recorded. Facial action was manipulated within participants by asking participants to hold chopsticks in their mouths using a position that allowed or blocked smiling, as confirmed by EMG. Blocking smiling did not influence ERPs to the valenced words (e.g., cash, bugs) but did influence ERPs to final words of sentences describing positive events. Results show that affectively positive sentences can evoke smiles and that such facial action can facilitate the semantic processing indexed by the N400 component. Overall, this study offers causal evidence that embodiment impacts some aspects of high-level comprehension, presumably involving the construction of the situation model.
Once thought of as separable input and output systems, cognition and action are increasingly viewed as integrated processes, reflecting the embodied nature of the mind (Barsalou, 2008). One area where embodiment is clearly an important factor is in the processing of emotional faces. For example, understanding the meaning of a smile has been argued to involve the construction of an embodied simulation supported by the brain's motor, somatosensory, and interoceptive systems (Niedenthal, Mermillod, Maringer, & Hess, 2010). Consistent with this argument, viewing emotional faces elicits spontaneous mimicry, and interfering with such facial actions can impair recognition and memory for emotional expressions (Neal & Chartrand, 2011; Halberstadt, Winkielman, Niedenthal, & Dalle, 2009; Pitcher, Garrido, Walsh, & Duchaine, 2008; Oberman, Winkielman, & Ramachandran, 2007; Niedenthal, Brauer, Halberstadt, & Innes-Ker, 2001).
The role of embodiment in higher-order cognition, such as language, is more controversial (e.g., Hoedemaker & Gordon, 2013; Mahon & Caramazza, 2008). According to embodied, or grounded, theories of meaning, language comprehension is supported by the partial reactivation of somatosensory and motor systems involved in experience with the domains referred to by a given sentence (e.g., Bergen, 2005; Zwaan, 2004). For emotional language, this can include a partial reinstatement of relevant facial actions. As such, manipulating facial action should influence language comprehension. Indeed, some evidence, reviewed below, is consistent with this prediction (Foroni & Semin, 2009; Niedenthal, Winkielman, Mondillon, & Vermeulen, 2009; Havas, Glenberg, & Rinck, 2007). However, because most studies have not used direct measures of language processing, they have not addressed the causal nature and timing of embodiment effects, issues that are central to their theoretical relevance. The current work uses the real-time brain response to language to provide such evidence and, as a result, informs recent theoretical debates on embodied cognition.
Facial Action and Understanding Emotional Language
As noted earlier, it is now recognized that facial action can contribute to the perception and recognition of emotional faces. It also has been hypothesized to play a role in the comprehension of emotional language. For example, Pulvermüller (2005) suggests that the frequent co-occurrence of words such as “smile” with the action of smiling leads to the formation of neuronal assemblies connecting a given word representation with the relevant motor program. Once assembled, the word representation triggers the rapid, automatic activation of associated motor programs. Of course, such motor activations might simply correlate with language comprehension processes without causally contributing to them. However, Pulvermüller and Fadiga (2010) have proposed that motor resonance processes are actually part of the neural representation of word meaning and, as such, contribute to comprehension.
Consistent with these proposals, evidence suggests that, at least under some conditions, emotional language activates somatic processes involved in emotion experience. Niedenthal et al. (2009) recorded facial EMG as participants read individual emotion words as well as neutral ones. Compared to baseline, words related to happiness elicited increased muscle activity at sites typically recruited during smiling (zygomaticus major). Analogously, words related to anger and disgust elicited increased corrugator supercilii (frowning) activity. Similar effects have also been obtained with verbs and adjectives, even when presented very briefly (Foroni & Semin, 2009).
Critically, additional research suggests that blocking facial action impairs the comprehension of emotional language. Niedenthal et al. (2009) asked participants to hold a pen horizontally in their mouth between their teeth and lips, effectively inhibiting facial expressions involving the lower half of the face, such as smiling or wrinkling one's nose in disgust. This manipulation lowered the accuracy of classification judgments for emotional words associated with either happiness or disgust, but not for control words (words associated with anger and neutral words). Moving beyond single words, some studies examined the processing of emotional sentences using peripheral blocking manipulations, such as a pen or Botox injection, and found evidence for impaired comprehension as measured by valence judgments and reading times (Havas, Glenberg, Gutowski, Lucarelli, & Davidson, 2010; Havas et al., 2007).
What and When: Neural Markers of Comprehension
Although intriguing, previous research on the importance of facial action for understanding emotional language leaves open a number of critical issues. First, the effects of facial action on behavioral measures might not exclusively reflect changes in language comprehension. For example, changes in judgments about the valence of words and sentences could reflect influences on decision processes, and changes in reading times could reflect influences on general variables such as depth of processing, attention, or interest in specific valenced material. These alternative interpretations limit the support these findings offer for theories of embodied meaning and call for investigations using neural markers of comprehension.
More importantly, even if the effects on comprehension were clear, the previous studies do not speak to the precise timing of those effects or to the linguistic level at which they occur. Both issues are crucial for the larger theoretical interpretations of embodiment effects. Specifically, some models suggest that embodiment influences processing at the word level (e.g., Pulvermüller, 2005). As such, this view predicts that blocking facial action should impair processing of specific valence words. In contrast, other models suggest that words are just one input to a simulation that unfolds over time, and embodiment plays the greatest role when constructing a full situation model (Havas et al., 2007; Bergen, 2005; Zwaan, 2004). On that view, embodiment effects should be most pronounced at the end of the phrase or the sentence, where readers integrate previously presented information and update their mental model of the situation (Rayner, Kambe, & Duffy, 2000; Just & Carpenter, 1980).
Current Study: Design and Predictions
To help address the above theoretical issues, we investigated whether interfering with facial action impacted the brain's real-time response to emotional language. Participants read sentences describing positive and negative events and made valence ratings. Facial action was manipulated within participants using an experimental paradigm (blocking) that interfered with the production of a smile. Facial EMG was used as a manipulation check for the default effect of sentence valence on facial action in the control condition and for the effectiveness of our blocking manipulation in the experimental condition.
Because the goal of this study was to elucidate “when” facial action influenced language comprehension, we capitalized on the temporal resolution of EEG and measured ERPs to sentences whose valence was determined by their third-to-last word (the valence word), for example, “She reached into the pocket of her coat from last year and found some (cash/bugs) inside it.” Real-time language comprehension was assessed via two ERP components, the N400, associated with ease of semantic processing (see Kutas & Federmeier, 2011, for a review), and the LPC, which is sensitive to differences in affect (Holt, Lynn, & Kuperberg, 2009; Kissler, Herbert, Winkler, & Junghofer, 2009; Schupp et al., 2004; Bernat, Bunce, & Shevrin, 2001; Cacioppo & Berntson, 1994). If facial action impacts comprehension at the word level, it should be evident in ERPs time-locked to the valence word (e.g., “cash” or “bugs”). If facial action impacts comprehension at a larger discourse level, such as during the construction of a situation model, ERP differences should coincide with the timing associated with wrap-up effects, at the end of a phrase or, in the case of the current study, the last word of the sentence (Rayner et al., 2000; Just & Carpenter, 1980).
To manipulate facial action, we used an experimental procedure known to interfere with smiling and that has previously been shown to reduce the recognition of positive emotions in faces (Oberman et al., 2007) and to reduce categorization accuracy for positive words (Niedenthal et al., 2009). In the experimental condition, our participants held a pair of conjoined wooden chopsticks horizontally in their mouths with their lips closed around them. Importantly, the pair of chopsticks was held between participants' teeth at the corners of their mouth, making it difficult to raise the mouth into a smile. Additionally, holding the chopsticks with the teeth generated a level of baseline muscle noise such that if an individual did attempt to smile, the feedback signal from the facial muscle would be embedded in this noise. The control condition was very similar to the experimental condition in that participants also held the chopsticks horizontally in their mouths. However, in the control condition, the chopsticks were held at the front of the lips, allowing the corners of the lips to be raised in a smile (see Figure 1). Facial EMG was recorded to check the effectiveness of the facial action manipulation and to measure participants' affective responses to the positive and negative sentences.
Given that the facial action manipulation blocks smiling, we predicted that it would selectively interfere with the comprehension of sentences describing positive events. This should be reflected in a larger amplitude N400 to the positive language in the experimental condition than in the control condition, indicating that blocking a smile made comprehending positive language more difficult.
Selective N400 differences for the positive but not the negative language would argue against important alternative interpretations. For example, the experimental manipulation of facial action could possibly elicit negative mood, which might itself influence comprehension. However, if a negative mood impaired the comprehension of positive language, it should do the opposite for negative language, that is, facilitate comprehension. Consequently, the negative sentences helped control for the effect of the facial action manipulation on mood and other nonspecific variables that might influence both positive and negative sentences.
As an additional manipulation check, we time-locked ERPs to the word before the valence word, the control word (for example, the word “some” in the sentence “She reached into the pocket of her coat from last year and found some (cash/bugs) inside it.”). We expected to find no ERP differences at this word, indicating that the manipulation selectively impacted positive language and not language in general.
ERPs and EMG were also used to confirm that participants processed the affective content in the language. One indication of affective processing is the emergence of the standard negativity bias, as reflected in a larger amplitude LPC to negative versus positive stimuli (Holt et al., 2009; Kissler et al., 2009; Schupp et al., 2004; Bernat et al., 2001; Cacioppo & Berntson, 1994). Therefore, when comparing across sentence types, we predicted a larger LPC for the negative than the positive sentences, regardless of the facial action manipulation. Another indication of affective processing is the production of incipient facial expressions consistent with the affect implied by the sentences (Larsen, Norris, & Cacioppo, 2003). Therefore, we recorded EMG at the zygomaticus major (involved in smiling), the levator labii (involved in scrunching one's nose), and the corrugator supercilii (involved in knitting together one's brows) and predicted greater smiling to positive than negative sentences in the control (unblocked) condition.
In summary, this study manipulated facial action as participants read affective sentences and measured their ERPs during the comprehension process, in conjunction with measurements from facial EMG. As such, the current study provides a comprehensive characterization of the relationship between central and peripheral measures during the processing of affective language and the causal impact that facial action has on the real-time comprehension process.
Twenty-one University of California, San Diego, undergraduates participated for course credit and financial compensation. All participants were screened via self-report for history of head injury, drug use, psychiatric illness, and other neurological conditions. All participants reported that they were healthy, right-handed, native English speakers with normal or corrected-to-normal vision. One participant was removed because of equipment malfunction, and two participants were removed because of excessive EEG artifacts (more than 30% contamination). Consequently, 18 participants' data were analyzed (mean age = 19.4 years, range = 18–26 years, 10 women). This sample size is similar to what has been used in other ERP research on emotional language processing (e.g., Holt, Lynn, & Kuperberg, 2009; Kanske & Kotz, 2007; Vanderploeg, Brown, & Marsh, 1987).
One hundred fifty-two experimental sentence pairs and an equal number of fillers were used. Each experimental sentence pair consisted of sentences that were the same until the last three words. The third-to-last word (the valence word) drove the direction of the sentence's affect: “She reached inside the pocket of her coat from last winter and found some (cash/bugs) inside it.” Although attempts were made to create sentence pairs that differed only on the third-to-last word (e.g., “cash”/“bugs”), the difficulty of creating enough stimuli meant that some pairs differed in all three of the last words (see Table 1 for examples). We thought this preferable to either showing both members of a sentence pair to a participant or repeating stimuli. However, it should be noted that the motivation for the experiment was not to compare the processing of sentences of different valence (this comparison was done only to ensure that participants were registering the affective differences between the conditions). Instead, the study examined how the facial action manipulation impacted language comprehension when that language is held constant.
|Sentence Stem||Positive Ending||Negative Ending|
|Every time she thought about his kiss, her||heart raced with excitement.||heart broke once more.|
|When she arrived home from work, she noticed that her annoying roommate||had vacuumed for once.||had overflowed the toilet.|
|He tried on the jacket his girlfriend had bought him and it||looked incredible on him.||looked ridiculous on him.|
Each participant saw only one ending. The italicized words are the valence words. They were not presented to participants in italics.
The sentences were rated for valence before the experiment. No participant took part in both the normative task (n = 64) and the ERP experiment. We wanted to force participants to construe the sentences as affectively positive or negative; therefore, we did not provide them with a neutral option in the rating scale. Valence was rated on a 6-point scale: 1 = very good, 2 = good, 3 = somewhat good, 4 = somewhat bad, 5 = bad, and 6 = very bad. A 6-point scale was used rather than a binary scale to encourage richer semantic processing and reduce the probability that responses would be made by more superficial semantic processes such as word association. Note that this was also the scale that was used during the ERP experiment. The normative ratings indicated that sentences intended to be positive were rated on the positive side of the scale, 2.27 (SD = 0.41), and sentences intended to be negative were rated on the negative side of the scale, 4.62 (SD = 0.41). A statistical analysis of these means at the level of the sentence (item) revealed a highly significant difference in ratings of sentences of different valence, F2(1, 151) = 4023.61, p < .001.
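The F2 statistic reported above treats sentences (items), rather than participants, as the random factor. With two valence conditions and paired items, a repeated-measures F with (1, n − 1) degrees of freedom equals the square of a paired t statistic computed over items. The following is only a minimal sketch of that relationship; the function name and the ratings are hypothetical illustrations, not the study's norming data or analysis code.

```python
from statistics import mean, stdev

def paired_item_f(positive_ratings, negative_ratings):
    """Return (F, (df1, df2)) for a two-condition by-items comparison.

    Each index is one sentence pair; the values are that pair's mean
    ratings for its positive and negative versions. F is the square
    of the paired t statistic over items.
    """
    diffs = [neg - pos for pos, neg in zip(positive_ratings, negative_ratings)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two items")
    standard_error = stdev(diffs) / n ** 0.5
    t = mean(diffs) / standard_error
    return t * t, (1, n - 1)

# Hypothetical mean ratings for four sentence pairs (1 = very good, 6 = very bad)
f_value, dfs = paired_item_f([2, 2, 3, 2], [5, 4, 5, 5])
```

With 152 sentence pairs, the same computation would yield the (1, 151) degrees of freedom reported for the item analysis.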
Apart from valence, the stimuli were closely matched on other psycholinguistic properties. The positive and negative valence words that were listed in the MRC database (Coltheart, 1981) were matched for concreteness, t(84) = −0.612, p > .5, Kucera–Francis written word frequency, t(168) = 0.4, p > .6, and imageability, t(96) = −0.21, p > .8.
An additional norming study with new participants (n = 122) collected cloze probabilities for the critical (valence) words, because cloze probability is the greatest predictor of amplitude differences in the N400 to words in sentences (Kutas & Federmeier, 2011). Each participant only performed this task on a subset of the sentences. They were given a sentence fragment (up to the valence word) and asked to complete the sentences with the first reasonable ending that came to mind. The average cloze probability of the critical words was relatively low (positive sentences, 9%; negative sentences, 6%) and did not significantly differ between the positive and negative versions of the experimental stimuli, F2(1, 151) = 3.086, p > .05.
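To make the cloze measure concrete: the cloze probability of a word is the proportion of norming participants who completed the sentence fragment with that word. The sketch below illustrates the computation under that standard definition; the function name and the completions are hypothetical and are not taken from the norming data.

```python
from collections import Counter

def cloze_probability(completions, target):
    """Proportion of completions matching the target word (case-insensitive)."""
    if not completions:
        raise ValueError("need at least one completion")
    counts = Counter(word.strip().lower() for word in completions)
    return counts[target.strip().lower()] / len(completions)

# Hypothetical completions for the fragment "... and found some ___"
responses = ["money", "cash", "gum", "lint", "money",
             "keys", "cash", "change", "money", "tissues"]

cash_cloze = cloze_probability(responses, "cash")   # 2 of 10 completions
bugs_cloze = cloze_probability(responses, "bugs")   # never produced
```

A word that no participant produces has a cloze probability of zero, which is the sense in which the critical words here were relatively unpredictable.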
Participants provided informed consent and completed surveys on handedness, neurological damage, and drug use, as well as the PBC inventory (Miller, Murphy, & Buss, 1981). They were prepped with EEG and EMG electrodes (see the sections on EEG and EMG recording and analysis). Recording took place in a dimly lit, sound-attenuated chamber. Participants were told that they would be reading sentences and making valence decisions using a numerical keypad. Valence decisions were made on the same 6-point scale as in the norming study. The side of the keypad corresponding to positive and negative (i.e., “good” and “bad”, viz. left or right) was counterbalanced across participants. Participants were given a demonstration of each of the postures associated with the facial action manipulation (see Figure 1) and provided with feedback when they tried it themselves. Language about emotional facial expressions was explicitly avoided. Participants were also shown their EEG signal to practice performing the facial action manipulations while minimizing muscle noise in the EEG. Before the onset of the experiment, participants performed one practice trial for each facial posture, during which they were given verbal feedback on their valence ratings and encouraged to use the entire rating scale.
Each block began with information on how to hold the chopsticks: “TEETH and LIPS” (for the experimental manipulation) or “LIPS ONLY” (for the control). Whether participants began in the experimental or control condition was counterbalanced across participants. Each trial consisted of self-paced reading followed by rapid serial visual presentation of the last four words of the sentence: “She reached inside the pocket of her coat from last winter and found” + “some” + “cash” + “inside” + “it.” (see Figure 2 for a depiction of a trial). ERPs were time-locked to the control word, SOME (fourth from last), the valence word, CASH or BUGS (third from last), and the sentence-final word, IT. After the sentence-final word, there was a 3-sec pause followed by a cue to rate the sentence's valence. The 3-sec delay was included to prevent EEG contamination from overlapping decision- or motor-related activity evoked by the button-pressing valence judgment task (see Luck, 2005). The valence judgment was included to make sure participants were attending to the task and to encourage the processing of the sentence's affect. It is worth noting here that giving participants 3 sec to deliberate on how they ought to rate the sentence might wash out any potential effects of the facial action manipulation on the sentence evaluation. Our design thus prioritized cleaner ERP data over the potential observation of behavioral differences that might be found with a speeded decision.
No individual participant saw both the positive and negative versions of any given sentence pair, and these were split across two lists. Each list was pseudorandomized so that each participant saw an equal number of positive and negative sentences within a block and, therefore, within each of the facial action conditions.
EEG Recording and Analysis
EEG was collected from 27 scalp sites using a cap mounted with tin electrodes. Scalp electrodes were referenced to the left mastoid. Blinks were monitored from an electrode below the right eye and referenced to the left mastoid. Horizontal eye movements were monitored via a bipolar derivation of electrodes placed at the outer canthus of each eye. At all sites, electrical impedance was reduced to less than 5 kΩ by gentle abrasion of the skin. EEG was recorded and amplified using an SA Instruments (Stonybrook, NY) bioelectric amplifier with a high pass filter of 0.01 Hz and a low pass filter of 100 Hz. It was digitized online at 1024 Hz.
After recording, EEG was epoched from 200 msec before word onset until 800 msec after. Epochs were visually examined and manually rejected when contaminated by blinks, eye movements, muscle noise, or channel blocking. This resulted in the removal of an average of 18% of trials across participants (range = 6–29%, SD = 8%). An omnibus 2 (Valence) × 2 (Facial action) × 3 (Location in sentence) ANOVA was run on the number of trials that were analyzed. This revealed a main effect of Location in sentence, such that there were significantly fewer trials at the control location (four words from the end) relative to the valence words (three words from the end) and the sentence-final words, F(2, 34) = 13.2, p < .001. Importantly, the number of trials for the valence and final words did not differ from each other. Because the control word was the same in both the positive and negative sentences, we collapsed across the positive and negative sentences when analyzing the ERP effects at this word (see ERP results). Trials in which the participants rated a positive sentence on the “bad” side of the scale, or a negative sentence on the “good” side, were also removed, resulting in the removal of 4% of experimental trials.
EMG Recording and Analysis
EMG was recorded from three sites—the zygomaticus major (associated with smiling), the levator labii (associated with nose wrinkling), and the corrugator supercilii (associated with frowning)—using bipolar derivations of tin electrodes. Electrodes were placed according to the guidelines for human EMG research established by Fridlund and Cacioppo (1986). At all sites, electrical impedance was reduced to less than 5 kΩ by gentle abrasion. EMG was sampled at 1024 Hz, recorded and amplified using the same bioelectric amplifier as the EEG, and then band-passed between 0.01 Hz and 200 Hz. The signals were then screened for artifacts, rectified, and integrated offline.
For the purpose of analysis, EMG was epoched from the onset of the control word until the end of the sentence, plus an additional second afterward (viz., 2 sec before the onset of the response prompt). This resulted in 3000 msec of EMG activity. As with the ERP, only trials in which responses were congruent with the normative valence of the sentence were analyzed. To minimize the impact of individual differences in muscle activity, we standardized each individual's EMG values using z scores within participants and muscle sites (Niedenthal et al., 2009; Oberman et al., 2007). The epochs were analyzed in 500-msec intervals. As we discuss shortly, we analyzed the EMG activity in two ways. First, we tested whether, in the experimental condition designed to block stimulus-related muscle movement, the manipulation of facial action influenced baseline activity of the zygomaticus muscle. If successful, we should observe greater baseline activity for all materials presented in the experimental than the control condition. Second, we tested whether in the control condition, designed to allow stimulus-related muscle movement, positive affective content of sentences produced relevant muscle activity in the zygomaticus muscle.
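The standardization and binning steps described above can be sketched as follows. This is an illustrative outline only: the function names and the data are hypothetical, the use of the population standard deviation is an assumption, and the sketch assumes that z scores are computed over the sample values of one participant at one muscle site, which the description above does not specify in detail.

```python
from statistics import mean, pstdev

SAMPLING_RATE_HZ = 1024  # EMG sampling rate reported above
BIN_MS = 500             # analysis interval reported above

def zscore(values):
    """Standardize values from one participant and one muscle site.

    Assumption: the population standard deviation is used.
    """
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant signal")
    return [(v - mu) / sigma for v in values]

def bin_means(samples, rate_hz=SAMPLING_RATE_HZ, bin_ms=BIN_MS):
    """Average consecutive bin_ms-long chunks; trailing partial bins are dropped."""
    bin_len = rate_hz * bin_ms // 1000  # 512 samples per 500 msec at 1024 Hz
    n_bins = len(samples) // bin_len
    return [mean(samples[i * bin_len:(i + 1) * bin_len]) for i in range(n_bins)]

# Hypothetical rectified EMG for one trial: 3000 msec at 1024 Hz = 3072 samples
trial = [0.2] * 1536 + [0.6] * 1536
binned = bin_means(zscore(trial))  # six 500-msec intervals
```

A 3000-msec epoch therefore yields six 500-msec intervals, the first of which spans the control word.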
We organize the presentation of the results as follows. First, we cover the behavioral results, as they bear on the success of our valence manipulations on ratings for sentences. Second, we cover the EMG results, which indicate the muscular effects of both the sentence valence manipulation and the facial action manipulation. As such, behavioral and EMG results essentially constitute a manipulation check. Third, and most importantly, we cover the ERP results that provide a real-time measure of brain processes involved in comprehension.
Behavioral Results: Sentence Ratings
This initial analysis was intended to ensure that our sentences robustly communicated emotional meaning. Valence ratings ranged from 1 = very good to 6 = very bad. Participants rated the positive sentences with a mean of 2.10, similar to the value obtained in the norming study (2.27). The mean rating for the negative sentences was 4.83, quite similar to the value of 4.62 obtained in the norming study. Ratings were analyzed with a 2 (Valence) × 2 (Facial action) repeated-measures ANOVA. We found only a very strong main effect of Valence, F(1, 17) = 298.671, p < .001, such that positive sentences were rated more positive than the negative sentences, and no effect of Facial action (p > .05), suggesting that, as designed, the sentences conveyed similar affective meaning in all conditions.
EMG was analyzed to answer two questions: (i) Was our facial action manipulation successful? (ii) Was our linguistic valence manipulation successful?
Effects of the Facial Action Manipulation on Sentence-unrelated Baseline Muscle Activity
We first examined how the facial action manipulation influenced overall facial muscle activity that was unrelated to any sentence content. The initial analysis was intended as a manipulation check to ensure that the experimental condition selectively activated the target (zygomaticus) muscle. Accordingly, we analyzed activity of the three measured muscles during the control word (before any sentence changes). As can be seen in Figure 3, zygomaticus was indeed selectively activated by the manipulation. A repeated-measures ANOVA with factors Muscle site (zygomaticus, levator, corrugator) and Facial action (experimental, control) revealed a significant interaction, F(2, 34) = 8.48, p = .001. As expected, this interaction was driven by the significant effect of the facial action manipulation on the zygomaticus, t(17) = 41.25, p < .001, but not on the levator (t < 1.6) or the corrugator (t < .4). Additional analyses on later time points confirmed that this enhanced activation persisted throughout the EMG measurement period (zygomaticus activity in the experimental condition was significantly greater in all time periods). In summary, our facial action manipulation was successful in selectively and robustly elevating the baseline level of activity at the zygomaticus muscle.
Effect of Valence on Muscle Activity
Next, we examined whether EMG picked up changes related to the word valence manipulation (i.e., whether there was more smiling after positive words). Because the experimental manipulation interfered with stimulus-related facial action, any EMG changes related to word valence should occur only in the control condition. Accordingly, we calculated baseline-corrected activity for five time windows beginning at the onset of the valence word (e.g., “cash” or “bugs”), using the 500-msec period of the control word (e.g., “some”) as the baseline. We entered this activity into a MANOVA for a 2 (Facial action) × 2 (Valence) × 5 (Time intervals) analysis. This revealed a significant three-way interaction of Facial action × Valence × Time interval, F(4, 68) = 2.65, p = .04. Figure 4 shows the results for zygomaticus activity. As can be seen, in the control condition, positive words generated more smiling than negative words at the first, second, and fourth 500-msec time intervals beginning with the valence word: 0–500 msec, t(17) = 2.35, p = .03; 500–1000 msec, t(17) = 2.53, p = .02; 1000–1500 msec, t(17) < 2, p = .07; 1500–2000 msec, t(17) = 2.29, p = .04; 2000–2500 msec, t(17) = −0.02, p = .98. As predicted, there were no significant differences between the positive and negative sentences at any period for the experimental condition, each t(17) < 1.5, suggesting that our manipulation successfully reduced stimulus-related smiling. Moreover, no such differences emerged on other muscles.
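The baseline correction described above can be illustrated with a short sketch. It assumes that each trial has already been reduced to six 500-msec interval means, with the first interval covering the control word and the remaining five beginning at valence word onset. The function name and the values are hypothetical.

```python
def baseline_correct(interval_means):
    """Subtract the control-word interval from each later interval.

    interval_means[0] is the 500-msec control-word baseline;
    the remaining entries are the windows from valence word onset.
    """
    if len(interval_means) < 2:
        raise ValueError("need a baseline and at least one later interval")
    baseline, *later = interval_means
    return [value - baseline for value in later]

# Hypothetical standardized zygomaticus activity for one trial
# (control word first, then five windows from valence word onset)
trial_intervals = [0.5, 0.75, 1.0, 0.5, 0.25, 0.5]
corrected = baseline_correct(trial_intervals)  # [0.25, 0.5, 0.0, -0.25, 0.0]
```

Positive corrected values indicate more activity than during the control word, so greater smiling to positive sentences would appear as larger corrected zygomaticus values for positive than for negative trials.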
Mean amplitude ERPs to the control word, valence word, and sentence-final word were measured. As mentioned, the main component of interest was the N400, assessed in the window from 300 to 500 msec (N400) poststimulus onset. The N400 is associated with the retrieval of meaning from semantic memory, with a relatively larger N400 for items whose meanings are more difficult to access (Kutas & Federmeier, 2011). As such, if the facial action manipulation increased the difficulty of comprehending the material (i.e., it made semantic access more difficult), then it should lead to a larger N400 relative to the control. Finding this pattern for the positive sentences but not the negative sentences would indicate that interfering with a smile had a valence-specific effect on language processing. If such a valence-selective pattern is found, then the location(s) within the sentence at which the N400 differences occur will inform the temporal dynamics of when bodily information enters into the process of making meaning—at the moment a valence word appears and/or later, when sentence wrap-up effects are known to occur.
A secondary component of interest was the LPC, which we measured in the 500–800 msec window. This interval began at the end of our N400 window (300–500 msec) and continued until the end of our recording epoch and has been used to measure the LPC in other studies of emotional language (e.g., Citron, Weekes, & Ferstl, 2013). Differences in valence are associated with differences in the LPC (see Kotz & Paulmann, 2011, for a review). As such, we predicted a negativity bias: a larger amplitude LPC would be elicited by the negative valence words (such as “bugs”) than by the positive valence words (such as “cash”). Such a finding would indicate that ERP differences due to stimulus valence were detectable in our sample of participants and would therefore serve as another manipulation check of real-time evaluative processing.
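As an illustration of how mean amplitude in these two windows can be extracted from an epoch, the sketch below converts the window boundaries to sample indices using the reported sampling rate (1024 Hz) and epoch start (200 msec before word onset). The function names and the synthetic epoch are hypothetical, and rounding to the nearest sample is an assumption.

```python
SAMPLING_RATE_HZ = 1024  # digitization rate reported for the EEG
EPOCH_START_MS = -200    # epochs begin 200 msec before word onset

def ms_to_index(time_ms, epoch_start_ms=EPOCH_START_MS, rate_hz=SAMPLING_RATE_HZ):
    """Index of the sample nearest to time_ms (relative to word onset)."""
    return round((time_ms - epoch_start_ms) * rate_hz / 1000)

def mean_amplitude(epoch, start_ms, end_ms):
    """Mean voltage over samples in [start_ms, end_ms) of a one-channel epoch."""
    window = epoch[ms_to_index(start_ms):ms_to_index(end_ms)]
    if not window:
        raise ValueError("window lies outside the epoch")
    return sum(window) / len(window)

# Synthetic one-channel epoch (microvolts), -200 to 800 msec = 1024 samples:
# flat until 300 msec, then -2.0 until 500 msec, then +1.0 until 800 msec
epoch = [0.0] * 512 + [-2.0] * 205 + [1.0] * 307

n400_amplitude = mean_amplitude(epoch, 300, 500)  # N400 window
lpc_amplitude = mean_amplitude(epoch, 500, 800)   # LPC window
```

In practice these values would be averaged across artifact-free trials and across the electrodes in each ROI before being entered into the analysis.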
ERP measurements for the valence word, sentence-final word, and control word were each analyzed with a separate omnibus repeated-measures ANOVA with factors Component (N400, LPC) × Sentence valence (positive, negative) × Facial action manipulation (experimental, control) × Lateral ROI (left, center, right) × Anterior–posterior ROI (front, back). ROIs were defined this way because it is well known that there are hemispheric differences in the way that language and affect are processed and that there are broad functional differences between the processes that occur at the front and back of the brain. Similarly, previous research examining N400 and LPC effects on emotional language has made laterality and anterior–posterior ROI distinctions (e.g., Kissler et al., 2009; Holt, Lynn, & Kuperberg, 2009). Electrode sites included in each ROI can be seen in Figure 5. Greenhouse–Geisser correction was applied where appropriate.
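The crossing of the two ROI factors yields six regions (three lateral positions by two anterior–posterior positions). The sketch below shows one way to collapse per-electrode mean amplitudes into those six cells. The electrode assignments are placeholders chosen for illustration from standard 10–20 labels; they are not the sites shown in Figure 5.

```python
import numpy as np

# Hypothetical electrode-to-ROI assignment (NOT the layout in Figure 5)
ROIS = {
    ("left", "front"): ["F3", "FC5"],
    ("center", "front"): ["Fz", "FCz"],
    ("right", "front"): ["F4", "FC6"],
    ("left", "back"): ["P3", "CP5"],
    ("center", "back"): ["Pz", "CPz"],
    ("right", "back"): ["P4", "CP6"],
}

def roi_means(amplitudes):
    """Average per-electrode mean amplitudes (in microvolts) within each ROI.

    amplitudes : dict mapping electrode label -> mean amplitude for one
                 participant, condition, and time window
    Returns a dict keyed by (lateral, anterior-posterior) ROI.
    """
    return {
        roi: float(np.mean([amplitudes[ch] for ch in channels]))
        for roi, channels in ROIS.items()
    }
```

Each participant would contribute one such set of six values per time window, valence, and facial action condition, giving the cells of the repeated-measures design.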
As a reminder, positive and negative sentences were matched on cloze probability but differed in valence of the critical words (e.g., CASH vs. BUGS). The omnibus analysis (described above) revealed a main effect of ERP component time window, which has no theoretical significance, F(1, 17) = 56.940, p < .001. More interestingly, there was a main effect of Valence, F(1, 17) = 4.822, p = .042. Specifically, negative valence words (e.g., BUGS) elicited ERPs of greater amplitude (mean difference of approximately +0.9 μV) than the positive valence words (such as CASH). There was a marginal interaction of Valence with time window, F(1, 17) = 4.223, p = .056, reflecting a greater effect of valence in the later time window. This finding is consistent with an LPC negativity bias. This bias is illustrated in Figure 6, which shows ERPs time-locked to the valence words in the positive and negative sentences, alongside the scalp topography. Critically, at the valence word (e.g., “cash” or “bugs”) we found no significant effects of the Facial action manipulation (main effect F = .12, interaction with Valence F = .28). This suggests that embodied information does not impact processing at the level of the specific valence word.
The omnibus ANOVA for the sentence-final words revealed a main effect of ERP component time window (N400, LPC), F(1, 17) = 17.155, p = .001, which has no theoretical significance. In addition, there were significant interactions for Valence × ROI (left, center, right), F(2, 134) = 4.730, p = .037, and Valence × ROI (anterior, posterior), F(2, 34) = 8.858, p = .008. These were further qualified by a five-way interaction of ERP component time window × Valence × Facial action manipulation × ROI (left, center, right) × ROI (anterior, posterior), F(2, 134) = 5.962, p = .014. Critically, this interaction reflects the presence of an N400 facial action effect at the sentence-final word in the positive sentences and its absence in the negative ones (see Figure 7).
To better characterize the data, we then conducted separate analyses of the positive and negative sentences in each time window of the sentence-final words. The individual ANOVAs included the factors Facial action manipulation × ROI (left, center, right) × ROI (anterior, posterior).
For the negative sentences, we found no effects of the facial action manipulation in either ERP time window (see Figure 8A).
For the positive sentences, analysis of the N400 time window revealed a main effect of the Facial action manipulation, F(1, 17) = 4.707, p = .045. ERPs in the experimental condition were approximately 1.1 μV more negative than in the control condition (see Figure 8B). No effects were observed in the LPC time window.
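For readers less familiar with repeated-measures designs, a main effect of a two-level within-participant factor with F(1, n − 1) is equivalent to a paired t test on each participant's condition means, with F = t². On that reading, the reported F(1, 17) = 4.707 corresponds to |t(17)| ≈ 2.17. The sketch below illustrates the relationship with made-up numbers; it assumes each participant's amplitudes have already been averaged over the ROI factors and is not a reanalysis of the study's data.

```python
import math

def paired_t(cond_a, cond_b):
    """Paired t statistic and degrees of freedom for two within-participant conditions.

    cond_a, cond_b : sequences with one value per participant, in the same order
    """
    if len(cond_a) != len(cond_b):
        raise ValueError("both conditions need one value per participant")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two participants")
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)  # sample variance
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1

# Synthetic example (illustrative values only, not data from the study)
t, df = paired_t([1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 1.0, 1.0])
f_equivalent = t ** 2  # F(1, df) for the same two-level contrast
```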
The control word occurred before the valence word. Its function was to exclude the possibility that the facial action manipulation induced global ERP differences. Because artifacts led to the rejection of more control word trials than valence or sentence-final word trials, and because the control word did not differ between positive and negative sentences, we collapsed across sentence valence to increase statistical power when examining the control word. As expected, the ANOVA revealed no main effect of the Facial action manipulation, F(1, 17) = 0.405, p = .533, and no interactions.
Embodied approaches to meaning suggest that conceptual understanding of emotions involves their partial reinstatement and recruits somatosensory and motor systems involved in the experience of emotions. This study tested how interfering with facial action impacted the brain's real-time response to affective language. To that end, participants' facial action was manipulated as they read sentences about positive and negative events, whose meaning turned on the third-to-last word of the sentence, the “valence” word. For example, “She reached inside the pocket of her coat from last winter and found some (cash/bugs) inside it”. EMG recordings in the control condition, where facial action was allowed, revealed greater zygomaticus activity in the positive than negative sentences beginning at the valence word, suggesting that participants began to smile as they read, for example, CASH. By contrast, no such differences were observed in the experimental facial action condition, indicating that the manipulation worked as intended to interfere with stimulus-related smiling and any accompanying somatosensory feedback it might produce.
Critically, the ERP data reveal the neural consequences of this manipulation. Interfering with smiling did not influence ERPs to the valence words. However, it led to a larger amplitude N400 to the sentence-final words of the sentences describing positive events. This suggests that interfering with spontaneous smiling induced a semantic processing cost at the end of the positive sentences. Such a semantic processing cost implies differences in real-time comprehension. Interestingly, 2 sec later, this real-time comprehension difference was no longer reflected in the way participants behaviorally rated the valence of the sentences (we return to this issue later).
To our knowledge, this is the first experiment using direct measures of brain activity to examine how facial action impacts the real-time processing of affective language. Previous research on this issue has found that perturbing feedback from facial muscles associated with particular emotions selectively interfered with behavioral responses (RTs and classifications) to language expressing those emotions (Havas et al., 2007, 2010; Niedenthal et al., 2009). Critically, prior research using behavioral methods has not addressed the exact mental processes causally influenced by facial actions, as behavioral measures can be influenced not only by language comprehension but also by attention, motivation, and decision-related processes. By recording EMG, we were able to determine that our facial action manipulation successfully and selectively modulated activity at the zygomaticus major (used when smiling). By recording participants' ERPs, we were able to establish that the facial action manipulation impacted neural indices of the real-time processing of language meaning.
Furthermore, the use of ERP recordings allowed us to address open questions regarding the timing of facial action effects on comprehension. One such question concerns whether facial mimicry influences processing at the level of the valenced words, as predicted by some models (e.g., Pulvermüller, 2005), or whether it influences sentence-level processing most evident at the end of phrases and sentences (as predicted by Bergen, 2005; Zwaan, 2004) or both. Interestingly, we found no effects of facial action on the word level, though as always, a null effect must be interpreted with caution, as it may reflect limitations in the power or other aspects of the design. Still, the absence of facial action effects on ERPs to the valence words stands in contrast to the presence of robust valence effects on ERPs to the same words. Specifically, the larger amplitude LPC to the negative than positive valence words (see Figure 6) resembles previously reported ERP effects to emotional language (Kissler et al., 2009). This effect, known as the negativity bias (Cacioppo & Berntson, 1994), is attributed to the tendency to devote more processing resources to negative events than positive or neutral ones. Its manifestation in the LPC component has been found for faces (Schupp et al., 2004), single words (Bernat et al., 2001), and emotional words in neutral contexts (Holt et al., 2009). The presence of the negativity bias effect here indicates that our participants were attending to the emotional meaning of the materials and had registered the valence differences by the time they had read the target words.
It is worth highlighting here that valence effects on ERPs to individual words did not vary as a function of the facial action manipulation (i.e., they occurred even in the “blocked” condition). This suggests that facial feedback is not strictly necessary to recognize a basic difference between positive and negative words. One way to interpret this finding is as an argument against a radical view of embodied cognition in which understanding of any emotional language, including single words, requires a partial reinstatement of the relevant affective state using the motor system. As such, results of this study are consistent with reports that indicate facial responses induced by single emotional words are not automatic but instead depend on participants' level of engagement with the words. Similarly, Niedenthal et al. (2009) found that muscle activations in the face were stronger when participants needed to categorize the individual word's emotionality or think about its emotional implications, and argued that motor resonance may be employed strategically to facilitate fine-grained inferences about emotion. In the context of this study, this suggests that there might be some task contexts where understanding of single emotional words will depend on embodiment (e.g., Niedenthal et al., 2009, Study 3). More broadly, a recent paper reviewed much of the evidence for grounded cognition and proposed from the review and new empirical data that most grounded congruency effects rely dynamically on context, with the central grounded features in a concept becoming active only when the current context makes them salient (Lebois, Wilson-Mendenhall, & Barsalou, 2014).
Most importantly, though, robust effects of the facial action manipulation did emerge on sentence-final words of positive sentences, as these words elicited a larger amplitude N400 in the experimental condition than in the control condition (see Figures 7 and 8B). The N400 has been linked to brain activity underlying the retrieval of information from semantic memory, with larger amplitude responses associated with greater processing demands (see Kutas & Federmeier, 2011, for a review). The N400 effects observed here suggest that the inability to smile in the experimental condition made semantic retrieval more difficult than in the control condition. These data suggest that, when comprehending emotional language, somatosensory feedback from the face can act as a retrieval cue, playing a causal role in the construction of meaning.
Results of this study were most consistent with a tempered version of embodiment in which embodied simulations can have a facilitative impact on emotional processing, but whose deployment depends on strategic factors (see also Lebois et al., 2014). The automaticity of processes indexed by the N400 component is somewhat controversial, probably because the underlying neural substrates are involved in both automatic and strategic processes (see Lau, Phillips, & Poeppel, 2008, for a review). ERPs to sentence-final words are particularly likely to register sentence “wrap-up” effects that result when readers apply the results of compositional analysis of the sentence (Kuperberg, Choi, Cohn, Paczynski, & Jackendoff, 2010). The observation of facial action effects on ERPs to sentence-final words is thus consistent with claims that embodiment effects work at the phrasal or sentential level rather than the lexical one (Havas et al., 2010; Bergen, 2005; Zwaan, 2004).
Inhibiting participants' ability to smile thus influenced the real-time processing of sentences about pleasant events, but only for sentence-final words. The last word of positive sentences elicited a larger N400 when participants' ability to smile was blocked than in the control condition, suggesting the facial action manipulation had a detrimental impact on semantic retrieval operations associated with the N400. These data suggest that emotional language gradually prompts affective responses in the body and these bodily responses can play a functional role in its comprehension. Overall, results are in keeping with grounded approaches to cognition in which language comprehension can involve partial simulation of modally specific sensorimotor and affective neural substrates recruited during action and perception (Barsalou, 1999, 2008; Havas et al., 2007; Niedenthal, 2007).
As noted above, our failure to find facial action manipulation effects on the ERPs to the valence word is inconsistent with a radical view of embodied cognition in which the comprehension of emotional language cannot proceed without the automatically triggered bodily state. However, such a view may be somewhat simplistic in adopting an overly reflexive model of both lexical activation and the generation of emotions. Recent advances in the language sciences, for example, have undermined the early idea that words automatically activate a fixed lexical entry, suggesting instead that they prompt context-sensitive retrieval from semantic memory (see, e.g., Elman, 2009; Coulson, 2006). Similarly, mounting evidence suggests that the view of emotions as a set of discrete, automatic, and biologically innate responses to the environment (e.g., Ekman & Cordaro, 2011) should be replaced by a more flexible and dynamic model of emotions as psychological constructions that selectively draw on embodied resources in a context-sensitive way (Oosterwijk, Mackey, Wilson-Mendenhall, Winkielman, & Paulus, 2015; Lebois et al., 2014; Oosterwijk et al., 2012; Wilson-Mendenhall, Barrett, Simmons, & Barsalou, 2011; Barrett, Lindquist, & Gendron, 2007). Indeed, emotion research increasingly suggests an important role for language in the perception of emotions, interpretation of bodily states, and the development of emotional concepts (Lindquist, 2013). The tight interplay between emotion, embodiment, and high-level language comprehension will come as no surprise to anyone who has ever laughed out loud at a joke on the radio, cried reading a novel, or cringed while reading an Op Ed in the New York Times.
Reprint requests should be sent to Joshua D. Davis, Department of Cognitive Science, University of California, San Diego, 9500 Gilman Dr., Mailcode 9515, La Jolla, CA 92093-0515, or via e-mail: firstname.lastname@example.org, email@example.com.