Abstract

Prediction is pervasive in human cognition and plays a central role in language comprehension. At an electrophysiological level, this cognitive function contributes substantially to determining the amplitude of the N400. In fact, the amplitude of the N400 to words within a sentence has been shown to depend on how predictable those words are: The more predictable a word, the smaller the N400 elicited. However, predictive processing can be based on different sources of information that allow anticipation of upcoming constituents and integration in context. In this study, we investigated the ERPs elicited during the comprehension of idioms, that is, prefabricated multiword strings stored in semantic memory. When a reader recognizes a string of words as an idiom before the idiom ends, she or he can develop expectations concerning the incoming idiomatic constituents. We hypothesized that the expectations driven by the activation of an idiom might differ from those driven by discourse-based constraints. To this aim, we compared the ERP waveforms elicited by idioms and two literal control conditions. The results showed that, in both comparisons, the literal condition exhibited a more negative potential than the idiomatic condition. Our analyses suggest that before idiom recognition the effect is due to modulation of the N400 amplitude, whereas after idiom recognition a P300 for the idiomatic sentence has a fundamental role in the composition of the effect. These results suggest that two distinct predictive mechanisms are at work during language comprehension, based respectively on probabilistic information and on categorical template matching.

INTRODUCTION

An increasing number of studies attribute a crucial role to predictive mechanisms in language processing (e.g., Federmeier, 2007; Pickering & Garrod, 2007). These studies have generally manipulated the semantic and world knowledge (e.g., Kutas & Hillyard, 1984) provided by a linguistic fragment that builds up an expectation for a specific upcoming word (e.g., “The burglar had no trouble locating the secret family safe. Of course it was situated behind a …”; Van Berkum, Brown, Zwitserlood, Kooijman, & Hagoort, 2005). In our study, we explored the electrophysiological correlates of predictive forward-looking processing when the linguistic fragment contains a multiword expression (i.e., an idiom) whose canonical structure and meaning are stored in semantic memory.

Predictive Forward-looking Mechanisms in Language Comprehension

The comprehension of a linguistic message relies on a complex interplay between previously processed information, information currently being processed, and the predictions arising from the combination of these two sources of information (Roehm, Bornkessel-Schlesewsky, Rösler, & Schlesewsky, 2007). It has long been known that the longer a sentential fragment, the fewer the available alternatives in the language to continue and/or conclude it in a syntactically well-formed and semantically meaningful fashion (Miller & Selfridge, 1950). The notion of cloze-probability1 capitalizes on the fact that as a sentential context becomes more informative, reducing the uncertainty about possible alternative completions, the range of elicited responses becomes smaller and the reader's ability to predict the most probable response increases.

Despite the bulk of evidence showing the presence of predictive forward-looking mechanisms in language, the exact role of these anticipatory mechanisms has not yet been established. Syntax-first serial models of sentence comprehension attribute an essential role to top–down predictive processing in the architecture of the parser, although specifically limited to the minimal structural nodes necessary to construct a sentence. The meaning of a sentence is considered a function of the meaning of the constituents and of the syntactic rules that determine their combination. Accordingly, language interpretation is viewed as a two-step event in which, firstly, the context-free meaning of a sentence is computed compositionally in ways specified by the syntax. In a second step, the sentential meaning can be integrated with information coming from prior context, world knowledge, and pragmatic information that are crucial in establishing sentence interpretation in a final stage (e.g., Cutler & Clifton, 1999). In contrast, proponents of one-step models of language processing (e.g., Hagoort & van Berkum, 2007) assume that all sources of information relevant to the interpretation of a sentence interact from the very beginning to form a coherent mental model of what the sentence is about (e.g., Tanenhaus & Trueswell, 1995). According to this bottom–up view, highly expected information can be inserted in the language comprehension architecture as one of the sources of information that acts as soon as possible in parallel with other sources of information. Despite these differences, both types of model agree that we can anticipate (part of) a message and develop word-level expectations, but they disagree as to when and how predictive forward-looking processing occurs. The time course of anticipatory predictions, and whether they are managed by a single system or by multiple systems, are thus fundamental topics for neurocognitive models of language comprehension.

The study of predictive language processing has frequently used the measurement of scalp-recorded ERPs and exploited the characteristics of the N400 component of the ERP. The N400 is a broad negative deflection that begins 200–300 msec after a word has been presented and has its peak after approximately 400 msec. Since its original discovery (Kutas & Hillyard, 1984), the processing nature of the N400 has been extensively investigated (for recent overviews, see Lau, Phillips, & Poeppel, 2008; Kutas, Van Petten, & Kluender, 2006). Several studies have shown that the amplitude of the N400 to words within a sentence depends on how predictable those words are (where predictability is measured by off-line cloze-probability tests): The more predictable a word, the smaller the N400 elicited. The N400 has been considered an index of message-level semantic integration and contextual facilitation. Although it has been shown to be associated with semantic anomaly or low predictability, the N400 can be elicited by a variety of meaningful stimuli (e.g., isolated words, pronounceable pseudowords, faces, pictures). The functional interpretation of the word- and sentence-related N400 differs according to the way in which the on-line impact of language predictability is conceived: According to the integration view (e.g., Holcomb, 1993), the N400 indexes the amount of search in a semantic space necessary to select the semantic value of a word and insert it into a partial interpretation of the sentence fragment. The ERP waveform, and the behavioral facilitation often observed, does not necessarily reflect the explicit prediction of upcoming constituents. In contrast, the prediction view (e.g., Federmeier, 2007) posits that the N400 indexes the mismatch between the predicted lexical entry and the actual value of the incoming word.

A number of studies (e.g., Wicha, Moreno, & Kutas, 2004) have shown that the N400 is strongly influenced by predictive language processing. For instance, in Federmeier and Kutas (1999), participants read sentences such as, “They wanted to make the hotel look more like a tropical resort. So along the driveway, they planted rows of ….” The sentence could be completed with the most expected constituent (i.e., palms) or with an unexpected noun of the same or different category (pines and tulips, respectively). The expected noun elicited an N400 smaller than for both types of unexpected noun. Critically, and in contrast with what would have been predicted by the integration view, pines elicited a smaller N400 than tulips, despite their similar low cloze-probability (Federmeier, 2007).

Most of the studies that have obtained predictive N400 effects have manipulated expectations deriving from sentential and discourse information. Recently, Roehm et al. (2007) investigated the comprehension of antonymous adjectives (e.g., black–white). According to Gross, Fischer, and Miller (1989), the meanings of predicative adjectives are organized in semantic memory by relations of antonymy and synonymy. Antonymous adjective pairs provide the basic semantic structure with synonymous adjectives clustering around the two antonyms. Hence, antonymous word pairs can be considered as units retrieved as such from semantic memory. Roehm et al. (2007, Experiment 1) presented participants with visual sentences that contained the first part of an antonym (e.g., “The opposite of black is…”) and ended with either the correct antonymous adjective (white), or an adjective of the same category (yellow) or of a different one (nice). An N400 emerged for the nonantonymous adjectives, whereas the antonymous adjective elicited a clear positive peak interpreted as a P300, a waveform commonly associated with general processes of context updating (Donchin & Coles, 1988). Roehm et al. argued that “the P300 occurs in the same time range as the N400 to index functionally distinct levels of predictive processing via distinct electrophysiological characteristics” (p. 1260). Specifically, “the P300 for the expected antonymous noun arises because the correct identification of the predicted word does not require a lexical search (there is a unique prediction that may either be fulfilled or not)” (p. 1272). The authors acknowledge that this positivity might depend on the nature of the task (judging whether the sentence was right or wrong) and on individual processing strategies. Nonetheless, these results suggest possible P300 effects within the N400 time range.

Multiword Expressions: A Neglected Source of Evidence for Neurocognitive Models of Language

Semantic memory is a repository for a variety of knowledge that includes word meanings, concepts, and also learnt multiword strings (e.g., book titles, lines of poetry, clichés, and idioms). Idioms are strings of words whose meaning is generally not derived from that of the constituent parts. Idioms are well suited to investigating predictive mechanisms because their constituents are bound together in the string and have a typical, canonical structure and word order (Cacciari & Glucksberg, 1991). The principles that govern the syntactic and semantic variability of idioms have yet to be formalized. However, almost all idioms, like other types of multiword expressions, share the characteristic that prediction of their identity is based on how much of the string is necessary before the expression is called to mind. For example, cry over… is completed by most speakers with spilt milk, whereas break the… is more often given a literal ending (e.g., cup, bottle, dish) rather than an idiomatic one (e.g., ice). According to Cacciari and Tabossi (1988), the former idiom (cry over spilt milk) is predictable: It is recognized as soon as spilt is processed and its figurative meaning becomes available at that point. In contrast, break the ice is unpredictable and its meaning is not retrieved until the whole string has been processed.

The words that form an idiom string might be anticipated during reading or listening to an ongoing sentence in a way that partly differs from what happens in literal sentences. The comprehension of literal sentences, in fact, proceeds incrementally and compositionally by integrating each piece of semantic information in a dynamic representation of the described state of affairs. The more semantic and syntactic information accumulates to express it, the more semantic and structural expectations the reader can develop. However, although one can understand, “Unfortunately Cristina spilt the milk” by applying the semantic and morphosyntactic compositional rules of the language, this does not suffice for comprehending the figurative meaning of “Unfortunately Cristina spilt the beans.” The solution to this problem was initially found by postulating that idioms are semantically empty “long words” retrieved as such from the mental lexicon (Swinney & Cutler, 1979). According to this view, the idiom's meaning is directly retrieved from semantic memory and not elaborated via linguistic processing. However, consistent evidence has accumulated to show that: (1) idioms undergo syntactic analysis as literal sentences and (2) the semantic structure of the constituent words can constrain the final interpretation assigned to an idiomatic sentence (Cacciari & Glucksberg, 1991). In fact, current models of idiom processing (for an overview, see Cacciari, Padovani, & Corradini, 2007) assume that idiom meaning activation is not based on a mere retrieval of a word-like unit from the lexicon. Despite the fact that idioms form highly constrained contexts, higher-order language processes are maintained at least until the idiom has been retrieved, with respect to the semantic contribution of the constituent words, and until the end of the sentence, with respect to the syntactic analysis of the sentence (Peterson, Burgess, Dell, & Eberhard, 2001). 
An influential view of how idiom comprehension processes unfold is the Configuration Hypothesis (Cacciari & Tabossi, 1988), according to which an idiom is processed word by word, just like any other piece of language, until enough information has accumulated to render the sequence of words identifiable as—or highly expected to be—a memorized idiom. Only at this point is the idiomatic meaning retrieved. This implies that, once the reader has enough information to realize that the unfolding sentence contains an idiom or an idiom fragment (e.g., “All of a sudden John realized that he was barking up the…”), she or he can retrieve the string from semantic memory and compare the expected constituent (wrong tree) with the actual idiom string. Therefore, idiom recognition is a necessary prerequisite for the idiom meaning to be retrieved from semantic memory. The point at which the string is identified as a known idiom determines how early the idiomatic meaning is activated (for a related claim, see Sprenger, Levelt, & Kempen, 2006).

The aim of this study is to explore the possibility that the electrophysiological correlates of the processing of highly expected words in idioms, where predictability is determined by the knowledge of that specific expression stored in semantic memory, might differ from those reflecting the processing of highly expected words in sentences where predictability is subject to constraints deriving from sentence-level semantic–pragmatic information. Despite the pervasiveness of multiword strings in language, the electrophysiological correlates of their comprehension have been scarcely investigated, and none of the very few ERP studies testing idiom comprehension has manipulated the predictability of the constituents of the idiom string. Strandburg et al. (1993) measured ERPs time-locked to the second word of idiomatic, literal, or nonsensical pairs of words presented in an acceptability judgment task. The authors found that the N400 amplitude increased from the idiomatic to the literal and to the nonsense condition. In Laurent, Denhières, Passerieux, Iakimova, and Hardy-Baylé (2006), participants were visually presented with the first part of a French idiom string followed by the final constituent and asked to perform a semantic relatedness task. The idiom string had both a literal and an idiomatic meaning. The N400 was smaller for highly salient idioms than for weakly salient ones. However, the notion of saliency partially overlaps with idiomatic meaning dominance and predictability of the last constituent, neither of which was controlled for. In Moreno, Federmeier, and Kutas (2002), English–Spanish bilinguals read literal or figurative English sentences that could end with three different types of word: expected high cloze-probability words (i.e., literal completions or proverb/idiom completions); within-language lexical switches, namely, English synonyms of the expected completions; or code switches, namely, translations into Spanish of the expected completions.
Within-language lexical switches elicited larger N400s in literal and figurative contexts and a late positivity in figurative contexts, whereas code switches elicited a positivity that began at about 450 msec and continued into the 650–850 msec time window. In summary, the amplitude of the N400 response was affected by the predictability of the lexical item regardless of the prior context, whereas the latency of the N400 to lexical switches was affected by English vocabulary proficiency.

The Present Study

Predictable idioms (i.e., those that are identified as idioms before the last constituent) are an ideal test case to investigate the ERP components associated with the comprehension of multiword strings. The Configuration Hypothesis posits that during on-line idiom processing a qualitative change occurs after the idiom's recognition point (RP) in that only then can the idiomatic meaning be retrieved from semantic memory. Hence, idiom recognition is a prerequisite for idiom meaning activation (for an overview of the behavioral evidence, see Cacciari et al., 2007). If predictive sentence processing modulates only the N400, we might expect to find an N400 whose amplitude differs before and after the idiom RP. This N400 might index sensitivity to the co-occurrence of constituents even before enough perceptual input has accumulated to trigger recognition of the idiom. As the strings unfold, co-occurrence can create a “sense of familiarity” that incrementally increases as more constituents arrive, up to a “threshold” after which the idiom is recognized and then activated. After the idiom's RP, the specific configuration is retrieved from semantic memory. The matching of the actual input (the idiom fragment) to the stored template (the idiomatic configuration) might be indexed by a different component: a P300, similar to that found by Roehm et al. (2007) for antonymous pairs. 
To test these hypotheses, we compared an idiom-neutral sentential context containing a predictable Italian idiom (idiomatic condition; see 1a in Table 1 for an example) with two semantically well-formed literal sentences: one in which the constituent forming the idiom's RP (i.e., the constituent after which the idiom is retrieved from semantic memory, henceforth RP, indicated with a subscript in the example) was replaced with an idiom-unrelated word (1b, substitution condition) and one in which the constituent just after the RP (when we assume the idiom to be already retrieved, henceforth RP+1, indicated with a subscript in the example) was replaced with an idiom-unrelated word (1c, expectancy–violation2 condition, henceforth referred to as violation condition).

Table 1. 

An Example of the Experimental Materials in Italian, with Word-by-Word English Translations and a Free Translation of the Figurative Meaning for the Example 1a

Condition       Example
Idiomatic       1a. Giorgio aveva un buco_RP nello_RP+1 stomaco quella mattina. (George had a hole in the stomach that morning, namely, George was hungry that morning)
Substitution    1b. Giorgio aveva un dolore_RP nello_RP+1 stomaco quella mattina. (George had a pain in the stomach that morning)
Violation       1c. Giorgio aveva un buco_RP sulla_RP+1 camicia quella mattina. (George had a hole on the shirt that morning)

Recognition point (RP) and the following word (RP+1) are reported as subscripts.

The substitution condition was designed to track the development of idiom prediction as the sentence unfolds and the fragment is still perceived as literal. In fact, the idiom should be available to the reader after it is recognized as a known configuration, namely, after the RP. In this condition, we replaced the constituent coinciding with the idiom's RP with another constituent with similar lexical characteristics (see below). We expect that this change might modulate the N400 amplitude, as a function of cloze-probability differences, insofar as this component is sensitive to the distributional properties of language regardless of the literal or figurative nature of the linguistic input. The violation condition was designed to test the response to the match versus mismatch of the idiomatic configuration (just retrieved from semantic memory) to the actual sentence fragment that, in this condition, continued with a different constituent, again matched for lexical characteristics (the position of the changed constituent is indicated as RP+1). The waveforms associated with the perception of this mismatch might be a larger N400, as typically observed for unexpected constituents or for constituents that are more difficult to integrate. However, if our hypothesis is correct, we might alternatively find a P300 to index the match between the retrieved idiomatic configuration and the idiomatic fragment.

Determining whether an effect is caused by a single component or by different components (e.g., Luck, 2005), specifically whether an effect is a diminished N400 or a larger P300, is well known to be problematic. According to the literature, both ERP effects should result in a more negative potential at centro-parietal sites for substitution at the RP and for violation at the RP+1 with respect to the idiomatic condition. However, despite the similarity, some crucial differences exist. Regarding latency, the N400 has a peak around 400 msec and is usually evident between 300 and 500 msec for visually presented words in sentences. The P300 peaks at around 300 msec with an onset at around 250 msec. Regarding topography, the N400 is broadly distributed on the scalp with a maximum at the head vertex, usually slightly right-lateralized (Cz, C4). The P300 reported by Roehm et al. (2007) had a more posterior distribution with a maximum at parietal sites (P3, Pz, P4). We thus expect a peak at around 400 msec and maximum around Cz for the comparison between the substitution and idiomatic conditions at the RP. If the match between the constituent at RP+1 and the retrieved idiom elicits a P300, we expect to find an earlier effect in the comparison between the idiomatic and violation conditions peaking around Pz and also larger on occipital sites.

To aid interpretation of the ERPs, we conducted a self-paced reading time experiment using the same experimental materials. Consistent with the behavioral literature, we expected idiomatic sentences to be read faster than literal sentences after the idiom RP, namely, at the RP+1. No effect was expected at the RP in the substitution versus idiomatic condition, as the idiom should not yet have been retrieved from semantic memory. In contrast, the expectancy violation at the RP+1 should produce longer reading times than in the idiomatic condition.

METHODS

Participants

A group of undergraduates from the University of Modena participated in the study for course credit: All were Italian native speakers unaware of the aim of the experiment, with normal or corrected-to-normal vision and no history of neurological disease. Specifically, 303 students participated in the norming of the experimental materials. Fifty different students participated in the ERP experiment after giving informed consent (26 women, 24 men; mean age = 21.0 years). They were right-handed, as assessed with an Italian version of the Oldfield questionnaire (Oldfield, 1971). Seventy students participated in the self-paced reading time experiment (40 women, 30 men; mean age = 20.9 years), none of whom had participated in the norming phase or in the ERP experiment.

Materials

One hundred seventy (170) idioms formed by a verb plus at least two constituents were selected from various collections of Italian idioms. This initial set of idioms was presented to 62 participants who were asked to rate each idiom for familiarity (on a 7-point scale, ranging from 1 = never heard to 7 = heard very often) and to paraphrase it. We selected 124 idioms that were familiar (M = 4.96, SD = 0.68, range = 3.6–6.3) and were correctly paraphrased (M = 0.90, SD = 0.08, range = 0.74–1.00). In order to test the idiom predictability and to identify the RP, 10 written questionnaires were prepared containing idiom fragments of increasing length inserted in minimal neutral contexts. The literal fragments were intermixed with the idiom fragments so that the latter represented only one third of the materials in each list. Eighteen students per list were asked to complete each sentence with the first words that came to mind. The RP of each idiom was operationally defined as the constituent after which idiomatic completion probability exceeded .65. The idiom's RP was at least one or two words before the offset of the idiom string. The mean cloze-probability of idiomatic completions after the RP was .85 (range = 0.66–1.00). For each of the 87 selected idioms,3 we constructed three sentences of similar syntactic structure and length: In the idiomatic condition (see Example 1a in Table 1; for further examples see Appendix 1), the sentence contained the idiom string in its canonical form embedded in as neutral a context as possible. The idiom string was always followed by two or three constituents (the same was done in the substitution and violation conditions).
In the substitution condition (Example 1b, Table 1), the RP was substituted with a constituent unrelated to the idiomatic meaning and matched to the idiom constituent for number of characters, concreteness, grammatical class, age of acquisition (AoA) (idiom: M = 2.58, SD = 0.93, range = 0.72–4.44; substitution: M = 2.58, SD = 0.85, range = 0.88–4.28; t < 1), and written frequency (idiom: M = 5.88, SD = 2.98, range = 0–11.84; substitution: M = 6.05, SD = 2.83, range = 0.39–11.71; t < 1). In the violation condition (Example 1c, Table 1), the constituent after the RP (RP+1) was changed to a constituent unrelated to the idiomatic meaning and matched to the idiomatic constituent for number of characters, concreteness, grammatical class, AoA (idiom: M = 3.03, SD = 1.21, range = 0.6–5.45; violation: M = 2.78, SD = 0.85, range = 1.07–4.49; t < 1), and written frequency (idiom: M = 7.59, SD = 3.23, range = 1.13–14.05; violation: M = 7.24, SD = 2.92, range = 1.4–13.08; t < 1).4 The substitution and violation conditions had semantically and syntactically well-formed literal sentences with the same number of words as the sentences in the idiomatic condition (idiom: M = 8.54, SD = 1.31; substitution: M = 8.53, SD = 1.32; violation: M = 8.69, SD = 1.42; t < 1).
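The operational definition of the RP given above can be illustrated with a minimal sketch. It assumes that, for each idiom, the proportion of idiomatic completions is available for fragments ending at each successive constituent; the function name, the data layout, and the example proportions are hypothetical and are not taken from the norming data.

```python
from typing import Optional, Sequence

RP_THRESHOLD = 0.65  # criterion reported in the text


def recognition_point(words: Sequence[str],
                      idiomatic_completion: Sequence[float],
                      threshold: float = RP_THRESHOLD) -> Optional[int]:
    """Return the index of the first constituent after which the proportion
    of idiomatic completions exceeds the threshold, or None if it never does."""
    if len(words) != len(idiomatic_completion):
        raise ValueError("one completion proportion is needed per constituent")
    for index, proportion in enumerate(idiomatic_completion):
        if proportion > threshold:
            return index
    return None


# Hypothetical proportions of idiomatic completions for fragments
# ending at each word (illustrative values only).
words = ["aveva", "un", "buco", "nello"]
proportions = [0.00, 0.05, 0.80, 0.95]
rp_index = recognition_point(words, proportions)
rp_word = words[rp_index] if rp_index is not None else None
```

With these illustrative values the RP falls on buco, the first constituent whose fragment is completed idiomatically by more than 65% of respondents.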

We assessed the cloze-probability of the constituents at the RP and at the RP+1 in the idiomatic, substitution, and violation conditions. The idiom predictability and the cloze-probability of an idiom constituent might differ because predictability refers to the probability that an idiom fragment is completed idiomatically, whereas the cloze-probability of a constituent refers only to the probability of a given word appearing irrespective of further continuations. For instance, in “George had a holeRP inRP+1 the stomach that morning,” predictability after the RP (i.e., after hole) is defined as the proportion of participants who, when presented with the fragment George had a hole…, completed it with in the stomach. In contrast, the cloze-probability of the RP (hole) is the proportion of participants who, when presented with George had a…, continued it with hole, regardless of whether the following words were consistent with the idiom or not. The cloze-probability values for the two critical constituents (RP and RP+1) in the three experimental conditions are reported in Table 2. The cloze-probability of the constituent after which the idiom was recognized (i.e., the RP) showed large between-item variability; however, the cloze-probability of half of the items lay between .11 and .62, leading to a mean value of .37. The cloze-probability of the word that substituted for the RP was even lower (75% of the items had a cloze-probability below .05). The constituent following the RP had a high cloze-probability in the idiomatic condition with all the idioms showing a cloze-probability larger than .68 (mean value = .86). The constituent that violated the expectancy generated by the idiom fragment (i.e., in the RP+1 position) had a very low cloze-probability (none of the participants completed the sentence with that constituent in more than 75% of the items). The word that followed this item had, on average, a cloze-probability of .18.
Again, there was high variability in the level of cloze-probability of this constituent (50% of the items were below .1 vs. 25% above .32). These differences reflect semantic constraints deriving from the need to complete the sentences in a meaningful fashion (we preserved the vocabulary class of the items in almost all cases, but it was impossible to balance the other lexical characteristics).5
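The distinction between idiom predictability and the cloze-probability of a single constituent can be made concrete with a short sketch. It assumes that each participant's completion is stored as a tuple of words; the function names and the example responses are hypothetical.

```python
from typing import Sequence, Tuple

Completion = Tuple[str, ...]


def cloze_probability(completions: Sequence[Completion], target: str) -> float:
    """Proportion of completions whose first word is the target word,
    irrespective of how the completion continues."""
    if not completions:
        raise ValueError("at least one completion is required")
    hits = sum(1 for c in completions if c and c[0] == target)
    return hits / len(completions)


def idiom_predictability(completions: Sequence[Completion],
                         idiomatic: Completion) -> float:
    """Proportion of completions that begin with the whole idiomatic
    continuation."""
    if not completions:
        raise ValueError("at least one completion is required")
    n = len(idiomatic)
    hits = sum(1 for c in completions if c[:n] == idiomatic)
    return hits / len(completions)


# Hypothetical completions of the fragment "George had a hole ...".
responses = [
    ("in", "the", "stomach"),
    ("in", "the", "stomach"),
    ("in", "the", "shirt"),
    ("on", "the", "shirt"),
]
cloze_in = cloze_probability(responses, "in")
predictability = idiom_predictability(responses, ("in", "the", "stomach"))
```

In this made-up sample the word in has a cloze-probability of .75, whereas only half of the completions are idiomatic, which is why the two measures need to be assessed separately.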

Table 2. 

Means and Quartile Values of the Cloze-probability Distribution at the RP and RP+1 in the Three Experimental Conditions


Position  Condition     Mean   Min    1st Quartile  2nd Quartile  3rd Quartile  Max
RP        Idiom         0.37   –      0.11          0.37          0.62          –
RP        Substitution  0.05   –      –             –             0.05          0.37
RP+1      Idiom         0.86   0.68   0.79          0.86          0.94          –
RP+1      Substitution  0.18   –      –             0.10          0.32          0.90
RP+1      Violation     0.02   0      0             0             0             0.48


Twenty-five percent of the items had a cloze-probability lower than the first quartile value, 50% lower than the second, and 75% lower than the third one. A zero value on the third quartile means that for more than 75% of the items, none of the participants completed the fragment with that specific word.
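A summary of the kind reported in Table 2 could be computed along the following lines. This is only a sketch: the quartile interpolation method is an assumption (the text does not specify one), and the item values are invented to reproduce the case in which the third quartile is zero.

```python
from statistics import mean, quantiles
from typing import Dict, Sequence


def cloze_summary(values: Sequence[float]) -> Dict[str, float]:
    """Mean, extremes, and quartiles of item-level cloze-probabilities.
    The 'inclusive' interpolation method is an assumption."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "mean": mean(values),
        "min": min(values),
        "q1": q1,
        "q2": q2,
        "q3": q3,
        "max": max(values),
    }


# Hypothetical item-level values: most items were never completed
# with the target word, so the third quartile is zero.
hypothetical_item_cloze = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.48]
summary = cloze_summary(hypothetical_item_cloze)
```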

Three lists were prepared using a Latin square design. The participants were randomly assigned to one of the three lists. Each list contained 29 experimental sentences per condition (idiomatic vs. substitution vs. violation) intermixed with 120 literal filler sentences6 of similar length and structure in a random order, the only constraint being that two experimental sentences in the same condition were never presented sequentially.
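The Latin square assignment can be sketched as a simple rotation of conditions over items. The rotation scheme and the names below are assumptions made for illustration; the sketch does not model the filler sentences or the constraint on the presentation order.

```python
from collections import Counter
from typing import Dict, List, Tuple

CONDITIONS = ("idiomatic", "substitution", "violation")


def latin_square_lists(n_items: int) -> List[List[Tuple[int, str]]]:
    """Build one list per condition rotation. Each list contains every item
    once, and each item appears in a different condition in each list."""
    n_conditions = len(CONDITIONS)
    lists = []
    for list_index in range(n_conditions):
        assignment = [
            (item, CONDITIONS[(item + list_index) % n_conditions])
            for item in range(n_items)
        ]
        lists.append(assignment)
    return lists


def condition_counts(assignment: List[Tuple[int, str]]) -> Dict[str, int]:
    """Number of items assigned to each condition within one list."""
    return dict(Counter(condition for _, condition in assignment))


lists = latin_square_lists(87)  # 87 idioms, as reported in the text
counts_per_list = [condition_counts(assignment) for assignment in lists]
```

With 87 items and three conditions, the rotation yields 29 sentences per condition in each list, matching the design described above.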

Procedure

The sentences were visually presented word by word in the center of a computer screen. The participants were instructed to read the sentences for comprehension. The instructions were given in written form and then orally repeated after a brief training. Each trial began when a participant pressed a keyboard button. A fixation point (a cross) at the center of the screen was replaced by the words of the sentence, each presented for 300 msec and followed by a 300-msec blank (SOA = 600 msec). The last word of a sentence was followed by a period. The presentation of each sentence was followed by a 1500-msec blank. Every 10 sentences on average, the participants were asked to answer a true–false question about the content of the sentence just read. After each response, feedback was given. The experiment lasted approximately 35 min. The experiment started with a practice session consisting of 15 literal sentences similar in structure and length to the experimental ones.

In the self-paced reading experiment, the same experimental and filler sentences were visually presented word by word in the center of a computer screen using a moving window self-paced reading procedure (Just, Carpenter, & Woolley, 1982). Each trial began with a button press, at which the first word of the sentence was displayed on the screen with all nonspace characters of the rest of the sentence replaced by dashes. When the participant pressed the space bar, the following word was displayed, replacing the corresponding dashes, and the previous one reverted to dashes. We measured the time between successive button presses and the accuracy of responses to the comprehension questions. The moving window paradigm, in which the reader knows in advance how many words are coming up, did not bias the reader toward the idiom as the idioms were always embedded in larger contexts. Practice, task, and instructions were the same as in the ERP experiment.
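The moving window display and the reading time measure can be illustrated with a minimal sketch. The function names are hypothetical, words are assumed to be separated by single spaces, and the timestamps are invented.

```python
from typing import List, Sequence


def moving_window_frame(sentence: str, visible_index: int) -> str:
    """Show only the word at visible_index; every other word is replaced
    by a run of dashes of the same length, and spaces are preserved."""
    words = sentence.split(" ")
    masked = [
        word if position == visible_index else "-" * len(word)
        for position, word in enumerate(words)
    ]
    return " ".join(masked)


def reading_times(press_times_ms: Sequence[float]) -> List[float]:
    """Reading time of each word as the interval between the button press
    that revealed it and the next button press."""
    return [
        later - earlier
        for earlier, later in zip(press_times_ms, press_times_ms[1:])
    ]


frame = moving_window_frame("Giorgio aveva un buco", 1)
times = reading_times([0, 350, 720, 1000])  # hypothetical timestamps
```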

EEG Acquisition and ERP Extraction

The electroencephalogram (EEG) was amplified and recorded with the BioSemi Active-Two System from 32 active electrodes placed on the scalp (Fp1, Fp2, AF3, AF4, F3, F4, F7, F8, FC1, FC2, FC5, FC6, C3, C4, T7, T8, CP1, CP2, CP5, CP6, P3, P4, P7, P8, O1, O2, PO3, PO4, Fz, Cz, Pz, Oz) plus four electrodes placed around the eyes for eye movement monitoring (two at the external ocular canthi and two below the eyes) and two electrodes placed over the left and right mastoids. Two additional electrodes placed close to Cz, the Common Mode Sense (CMS) active electrode and the Driven Right Leg (DRL) passive electrode, were used to form the feedback loop that drives the average potential of the participant as close as possible to the AD-box reference potential (Metting van Rijn, Peper, & Grimbergen, 1990). EEG and EOG signals were amplified and digitized continuously with a sampling rate of 512 Hz. Adequate trigger signals were generated and recorded for synchronization. EEG signals were re-referenced off-line to the average activity of the two mastoids and then analyzed using BrainVision Analyzer. After band-pass filtering (0.2–30 Hz), 1500-msec epochs containing the ERPs elicited by the two target words (RP and RP+1) were extracted, starting 200 msec prior to the onset of the RP. Segments including artifacts exceeding ±100 μV on any channel were rejected, and the accepted epochs were averaged after a 200-msec prestimulus baseline correction. Six participants were excluded from the analyses because of the high number of rejected epochs (>25%).
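The epoching, artifact rejection, and baseline correction steps can be sketched with NumPy. This is an illustrative reconstruction under stated assumptions, not the BrainVision Analyzer pipeline: the function name `average_epochs` and the toy data are hypothetical, the data are assumed to be already filtered and re-referenced, and the amplitude criterion is assumed to be applied to each epoch before baseline correction.

```python
import numpy as np

SFREQ = 512                    # sampling rate in Hz, as reported
PRE_MS, POST_MS = 200, 1300    # 1500-msec epoch starting 200 msec before RP onset
REJECT_UV = 100.0              # absolute amplitude criterion in microvolts

def average_epochs(data, onsets, sfreq=SFREQ):
    """data: (n_channels, n_samples) in microvolts; onsets: RP onset samples."""
    n_pre = int(round(PRE_MS * sfreq / 1000))
    n_post = int(round(POST_MS * sfreq / 1000))
    kept = []
    for onset in onsets:
        epoch = data[:, onset - n_pre:onset + n_post]
        # Reject the whole epoch if any channel exceeds +/-100 microvolts.
        if np.abs(epoch).max() > REJECT_UV:
            continue
        # Subtract each channel's mean over the 200-msec prestimulus interval.
        baseline = epoch[:, :n_pre].mean(axis=1, keepdims=True)
        kept.append(epoch - baseline)
    rejected_fraction = 1 - len(kept) / len(onsets)
    erp = np.mean(kept, axis=0) if kept else None
    return erp, rejected_fraction

# Toy recording: two channels with a constant 5-microvolt offset and one
# 150-microvolt artifact that falls inside the second epoch.
data = np.full((2, 6000), 5.0)
data[0, 2100] = 150.0
erp, rejected_fraction = average_epochs(data, onsets=[1000, 2000, 3000])
```

In this sketch `rejected_fraction` is the quantity that would be compared with the 25% exclusion criterion for each participant.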

EEG Data Analyses

The extracted average waveforms for each participant and condition were used to calculate the grand-average waveforms, to carry out ANOVAs on the fixed time windows, and to conduct latency and principal component analyses (PCAs). The statistical analyses on single-subject mean voltages in fixed time windows were performed using repeated measures ANOVAs with the Greenhouse–Geisser correction when the numerator degrees of freedom exceeded one. Separate ANOVAs were carried out on different electrode groups in the 300–500 msec time window: one for midline sites (Fz, Cz, Pz, Oz) and one using 24 lateralized sites that were organized into three orthogonal topographical factors, allowing topographical effects to be evaluated along three dimensions (see Table 3). The ANOVAs compared the idiomatic, substitution, and violation conditions at the RP and at the RP+1. ANOVAs were followed by t tests of the average voltage differences between conditions on separate topographic levels in order to test our hypothesis of a maximal effect around Cz in the comparison between the idiomatic and substitution conditions after the RP (N400), and of a more posterior effect, maximal around Pz, between the idiomatic and violation conditions at RP+1 (given that the P300 in the idiomatic condition is expected to contribute to the global effect). Separate comparisons were conducted for each of the midline sites, whereas for the 24 lateralized sites, comparisons were made on the average differences within the cells defined by the longitude and mesiolateral factors used for the ANOVA, as the effects were not expected to be lateralized. All the p values were adjusted with the Bonferroni correction. Because the violation and idiomatic conditions coincide up to RP+1, we pooled the mean values of these two conditions for the analyses at the RP. Accordingly, at the RP the t tests were conducted on the differences between the ERP amplitude of the substitution condition and the pooled means of the other two conditions. At RP+1, the t tests were conducted on pairwise differences among the three conditions.
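The Bonferroni adjustment applied to the t tests amounts to multiplying each uncorrected p value by the number of comparisons in the family and capping the result at 1. A minimal sketch follows; the function name `bonferroni` and the uncorrected p values are hypothetical, and the size of the comparison family is an assumption of the example.

```python
def bonferroni(p_values):
    """Multiply each p value by the number of tests, capping the result at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]

# Hypothetical uncorrected p values for four midline sites (Fz, Cz, Pz, Oz).
raw = [0.04, 0.002, 0.015, 0.30]
adjusted = bonferroni(raw)
```

This capping is why nonsignificant comparisons appear with adjusted p values of exactly 1.000.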

Table 3. 

Organization of 24 Measurement Sites into Three Orthogonal Topographical Factors for the ANOVAs


| Longitude | Mesiolateral M, Lateralization L | Mesiolateral M, Lateralization R | Mesiolateral L, Lateralization L | Mesiolateral L, Lateralization R |
|---|---|---|---|---|
| F | F3 | F4 | F7 | F8 |
| FC | FC1 | FC2 | FC5 | FC6 |
| C | C3 | C4 | T7 | T8 |
| CP | CP1 | CP2 | CP5 | CP6 |
| P | P3 | P4 | P7 | P8 |
| O | O1 | O2 | PO3 | PO4 |

Lateralization (2 Levels; L = Left; R = Right), Longitude (6 Levels; F = Frontal; FC = Fronto-central; C = Central; CP = Centro-parietal; P = Parietal; O = Occipital), Mesiolateral (2 Levels; M = Medial; L = Lateral).

A latency analysis was carried out in order to evaluate the onset of the effect both at the RP and at the RP+1. For this purpose, we considered the average activity over a large cluster of centro-parietal sites (C3, Cz, C4, CP1, CP2, P3, Pz, P4), where both the N400 and P300 should be visible. Separate t tests were conducted on contiguous 10-msec intervals, comparing the substitution condition with the pooled mean value of the idiomatic and violation conditions at the RP, and the idiomatic with the violation condition at the RP+1. The onset of the effect was defined as the point at which at least five consecutive comparisons were statistically significant. This technique allows for evaluation of the latency of an effect (Rugg, Doyle, & Wells, 1995) and is appropriate when a relatively small number of trials per subject and condition (29 in our case) undermines the possibility of estimating peak latencies or fractional area latencies at the single-subject level.
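The onset criterion can be expressed as a search for the first run of five consecutive significant windows. The sketch below assumes that the outcome of each 10-msec t test is already available as a Boolean flag; the function name `effect_onset` and the example flags are hypothetical.

```python
def effect_onset(significant, window_ms=10, min_run=5, start_ms=0):
    """Start time (msec) of the first run of at least min_run significant windows."""
    run = 0
    for index, flag in enumerate(significant):
        run = run + 1 if flag else 0
        if run == min_run:
            first = index - min_run + 1
            return start_ms + first * window_ms
    return None  # no sustained effect found

# Hypothetical pattern: a short isolated run, then a sustained effect.
flags = [False] * 5 + [True] * 3 + [False] * 4 + [True] * 6
onset = effect_onset(flags)
```

Requiring a run of consecutive significant windows guards against isolated false positives among the many tests carried out along the epoch.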

A temporal PCA was performed to better describe the ERP components that underlie the effects at the RP and at the RP+1. This statistical decomposition technique (together with the spatial PCA, the spatio-temporal PCA, and independent component analysis) can be used to describe features in the ERP more objectively and more precisely than is possible with the unaided eye (Dien & Frishkoff, 2005). Recent simulations have shown that the temporal PCA is well suited for distinguishing between components partially superimposed in time such as the N400 and the P300 (Dien, Khoe, & Mangun, 2007). Moreover, the temporal PCA defines different factors on the basis of their evolution in time; hence, it is particularly appropriate for analyzing long multiword epochs. In the temporal PCA, the variables are the voltage amplitudes at each time point. The factor loadings thus represent the time course of each factor. The scores assign a value to the contribution of each subject, condition, and electrode to each factor. The factor loadings are necessarily the same across the entire dataset, and a visual inspection of their time course is necessary in order to choose the latent factors likely to explain the effects under study. Before running the temporal PCA, the data were filtered (low-pass, 15 Hz cutoff) and resampled at 256 Hz because we were not interested in extracting factors accounting for fast signals. Temporal factors (accounting for 90% of the variance) were extracted using the Varimax rotation procedure.
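The logic of a temporal PCA with Varimax rotation can be sketched in NumPy. This is a simplified illustration, not the procedure or software used for the reported analysis: the function names, the use of the covariance matrix, the absence of Kaiser normalization, the least-squares factor scores, and the simulated data are all assumptions of the example.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal Varimax rotation of a (n_timepoints, n_factors) loading matrix."""
    n_rows, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / n_rows
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if s.sum() < criterion * (1 + tol):
            break
        criterion = s.sum()
    return loadings @ rotation

def temporal_pca(waveforms, variance=0.90):
    """waveforms: (n_observations, n_timepoints); one row per subject x condition x electrode."""
    centered = waveforms - waveforms.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0, None)
    eigvecs = eigvecs[:, order]
    explained = np.cumsum(eigvals) / eigvals.sum()
    # Smallest number of factors whose cumulative variance reaches the target.
    n_factors = min(int(np.searchsorted(explained, variance)) + 1, len(eigvals))
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    rotated = varimax(loadings)                    # time course of each factor
    scores = centered @ np.linalg.pinv(rotated).T  # one score per observation and factor
    return rotated, scores, centered

# Simulated data: two latent time courses peaking at different latencies.
rng = np.random.default_rng(0)
time = np.arange(60)
sources = np.stack([np.exp(-(time - 20) ** 2 / 50), np.exp(-(time - 40) ** 2 / 50)])
waveforms = rng.normal(size=(200, 2)) @ sources + 0.05 * rng.normal(size=(200, 60))
rotated, scores, centered = temporal_pca(waveforms)
```

The rows of `rotated` index time points and its columns index temporal factors, whereas `scores` holds the contribution of each observation to each factor, mirroring the distinction between loadings and scores drawn above.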

RESULTS

The ERP Study

The participants responded to the comprehension questions with an overall accuracy of 93%, indicating that they indeed read for comprehension. Figure 1 shows the grand-averaged ERP for the three experimental conditions time-locked to the idiom RP. Figure 2 shows the grand-average waveform over the cluster of centro-parietal sites used for the latency analysis.

Figure 1. 

Grand-average waveforms of the ERPs plotted in the negative-upward convention, for the three experimental conditions: idiomatic (thick line), substitution (thin continuous line), and violation (thin dashed line). The vertical lines correspond to the onset of the recognition point and the two following words (SOA = 600 msec). Note that the first word in the idiomatic and violation conditions is the same.

Figure 2. 

Grand-average waveforms of the ERPs over the cluster of centro-parietal sites (C3, Cz, C4, CP1, CP2, P3, Pz, P4) for the idiomatic (thick line), substitution (thin continuous line), and violation (thin dashed line) conditions. Thicker vertical lines correspond to the onset of the RP and of the RP+1; thin vertical lines are drawn at 300 and 500 msec after the onset of each word.

Visual inspection of the waveforms reveals a more negative potential (peaking at around 400 msec) for the literal sentences, compared to the idiomatic condition, that occurred at the RP in the substitution condition, and at the RP+1 in the violation condition. However, the timing and the topographical distribution of these effects are different. The effect observed at the RP in the idiomatic versus substitution condition is compatible with a centrally distributed N400, slightly more pronounced over the right hemisphere. In contrast, the effect at the RP+1 in the idiomatic versus violation condition begins earlier and has a different topographical distribution: It is more posterior and is also visible at the occipital sites. Moreover, the waveform elicited by the idiomatic condition at the RP+1 is characterized by a clearly visible peak that is absent in the other conditions. This peak is also missing in the waveforms elicited by the idiomatic and substitution conditions at the RP. Its amplitude is similar to that of the preceding P200 at the occipital and parietal sites and it is comparable to what Roehm et al. (2007) have classified as a P300.

We performed the ANOVAs on the mean voltage in the 300–500 msec time window, an interval typically used for quantifying the N400. The results for the midline and the lateralized sites are reported in Tables 4 and 5, respectively. At the RP, we obtained main effects of condition and longitude for the midline electrodes (see Table 4). The t tests (see Table 6) showed that, despite the absence of a significant Condition × Longitude interaction in the ANOVA, the effect was maximal at Cz. The ANOVAs on the lateralized sites (see Table 5) for the RP showed two interactions involving the factor condition: a Mesiolateral × Condition interaction and a Longitude × Mesiolateral × Condition interaction. The Bonferroni-corrected t tests showed only a marginally significant effect for the C3, C4 electrode pool (Longitude C, Mesiolateral M): mean difference = 0.983 μV, t(42) = 2.991, p < .1. The overall analyses indicate that the scalp distribution of the effect was broad and maximal around the vertex as expected for modulation of the N400 amplitude.

Table 4. 

Summary of the ANOVA Analyses

| Source | df | RP: F | RP: p | RP+1: F | RP+1: p |
|---|---|---|---|---|---|
| Long | (3, 129) | 5.34 | .013*** | 1.04 | .345 |
| Cond | (2, 86) | 3.69 | .031** | 11.11 | <.001**** |
| Long × Cond | (6, 258) | 1.09 | .361 | 10.37 | <.001**** |

(Greenhouse–Geisser Corrected). Analyses separately conducted after the RP and the RP+1, on the mean ERP amplitude within the 300–500 msec interval for the four midline sites (Long).

**p < .05.

***p < .01.

****p < .001.

Table 5. 

Summary of the ANOVA Analyses

| Source | df | RP: F | RP: p | RP+1: F | RP+1: p |
|---|---|---|---|---|---|
| Long | (5, 215) | 4.399 | .024** | 4.213 | .027** |
| ML | (1, 43) | 18.566 | <.001**** | 1.958 | .169 |
| Lat | (1, 43) | 4.401 | .042** | 3.419 | .071* |
| Cond | (2, 86) | 2.919 | .062* | 6.239 | .004*** |
| Long × ML | (5, 215) | 3.626 | .011** | 1.360 | .254 |
| Long × Lat | (5, 215) | 2.343 | .069* | 2.919 | .039** |
| ML × Lat | (1, 43) | 0.136 | .714 | 3.413 | .072* |
| Long × ML × Lat | (5, 215) | 0.790 | .500 | 0.634 | .591 |
| Long × Cond | (10, 430) | 0.284 | .840 | 12.995 | <.001**** |
| ML × Cond | (2, 86) | 4.794 | .011** | 15.151 | <.001**** |
| Long × ML × Cond | (10, 430) | 2.167 | .036** | 3.919 | .001*** |
| Lat × Cond | (2, 86) | 2.311 | .112 | 2.926 | .069* |
| Long × Lat × Cond | (10, 430) | 1.577 | .144 | 2.724 | .011** |
| ML × Lat × Cond | (2, 86) | 0.103 | .863 | 3.113 | .054* |
| Long × ML × Lat × Cond | (10, 430) | 1.311 | .248 | 2.088 | .051* |

(Greenhouse–Geisser Corrected). Separately conducted after the RP and the RP+1, on the mean ERP amplitude within the 300–500 msec interval in the three conditions (Cond = Idiomatic, Violation, and Substitution), for the 24 lateralized sites organized into three orthogonal topographical factors: Lateralization (Lat, 2 Levels; L = Left; R = Right), Longitude (Long, 6 Levels; F = Frontal; FC = Fronto-central; C = Central; CP = Centro-parietal; P = Parietal; O = Occipital), Mesiolateral (ML, 2 Levels; M = Medial; L = Lateral).

*p < .1.

**p < .05.

***p < .01.

****p < .001.

Table 6. 

Means of the Difference of the ERP Amplitude

| Level | Mean | df | t | p |
|---|---|---|---|---|
| Fz | −0.843 | 43 | −2.112 | .162 |
| Cz | −1.267 | 43 | −3.345 | .007*** |
| Pz | −0.98 | 43 | −2.521 | .062* |
| Oz | −0.481 | 43 | −1.487 | .577 |

Differences in the 300–500 msec interval after the onset of the RP between the substitution condition and the pooled means of the other two conditions (idiomatic and violation), with t tests for the midline sites (Long); p values are adjusted according to the Bonferroni method.

*p < .1.

***p < .01.

At the RP+1 (see Table 4), we observed a main effect of condition and a Longitude × Condition interaction for the midline sites, which quantitatively supports the more posterior distribution of the effect. In fact, the t tests (Table 7) showed significant differences between the violation and idiomatic conditions not only at Cz but also at Pz and Oz. We obtained similar results in the comparison between the idiomatic and substitution conditions, whereas no significant differences emerged when we compared the two literal conditions. The ANOVAs at RP+1 for the lateralized sites (see Table 5) showed a main effect of condition and significant interactions of this factor with the longitude and mesiolateral factors. Two significant three-way interactions (Longitude × Mesiolateral × Condition; Longitude × Lateralization × Condition) were also obtained. The t tests (Table 8) suggest that the Longitude × Condition interaction was due to significant differences between the violation and idiomatic conditions on posterior sites (centro-parietal, parietal, and occipital). Significant results also emerged in the comparison between the idiomatic and the substitution conditions, but at centro-parietal and parietal sites only on the mesial level (this might account for the Longitude × Mesiolateral × Condition interaction). The three-way interaction with lateralization was unexpected. It might reflect either a slight asymmetry of the posterior effect between the idiomatic and the violation conditions (i.e., larger effects on left centro-parietal sites) or the onset of a negative deflection for the idiomatic condition on left anterior sites evident in the grand average (see Figure 1). Post hoc t tests conducted on Lateralization × Longitude mean differences were not conclusive for the interpretation of this unexpected interaction. The overall pattern emerging at RP+1 showed a more posterior topography, specifically when we compared the idiomatic versus violation conditions. This is consistent with our hypothesis of a P300 for the idiomatic condition after idiom recognition.

Table 7. 

Means of the Pairwise Differences of the ERP Amplitude

| Long | Contrast | Mean | df | t | p |
|---|---|---|---|---|---|
| Fz | vio–idi | −0.395 | 43 | −0.616 | 1.000 |
| Cz | vio–idi | −2.132 | 43 | −3.737 | .007*** |
| Pz | vio–idi | −3.241 | 43 | −5.802 | <.001**** |
| Oz | vio–idi | −2.458 | 43 | −4.898 | <.001**** |
| Fz | sub–idi | −0.124 | 43 | −0.201 | 1.000 |
| Cz | sub–idi | −1.930 | 43 | −3.595 | .010*** |
| Pz | sub–idi | −2.615 | 43 | −5.182 | <.001**** |
| Oz | sub–idi | −1.651 | 43 | −3.313 | .023** |
| Fz | vio–sub | −0.271 | 43 | −0.554 | 1.000 |
| Cz | vio–sub | −0.201 | 43 | −0.449 | 1.000 |
| Pz | vio–sub | −0.625 | 43 | −1.460 | 1.000 |
| Oz | vio–sub | −0.807 | 43 | −1.936 | .714 |

(Substitution Minus Idiomatic = sub–idi; Violation Minus Idiomatic = vio–idi; Violation Minus Substitution = vio–sub) in the 300–500 msec interval after the onset of the RP+1, with t tests for the midline sites (Long); p values are adjusted according to the Bonferroni method.

**p < .05.

***p < .01.

****p < .001.

Table 8. 

Means of the Pairwise Differences of the ERP Amplitude

| Long | ML | Contrast | Mean | df | t | p |
|---|---|---|---|---|---|---|
| C | L | sub–idi | −0.306 | 43 | −0.734 | 1.000 |
| C | M | sub–idi | −1.499 | 43 | −3.237 | .084* |
| CP | L | sub–idi | −1.109 | 43 | −2.341 | .862 |
| CP | M | sub–idi | −2.003 | 43 | −3.853 | .014** |
| F | L | sub–idi | 0.585 | 43 | 1.058 | 1.000 |
| F | M | sub–idi | −0.104 | 43 | −0.177 | 1.000 |
| FC | L | sub–idi | −0.114 | 43 | −0.222 | 1.000 |
| FC | M | sub–idi | −0.808 | 43 | −1.453 | 1.000 |
| O | L | sub–idi | −2.057 | 43 | −4.611 | .001*** |
| O | M | sub–idi | −1.651 | 43 | −3.574 | .032** |
| P | L | sub–idi | −0.883 | 43 | −2.246 | 1.000 |
| P | M | sub–idi | −2.352 | 43 | −4.489 | .002*** |
| C | L | vio–idi | −0.697 | 43 | −1.438 | 1.000 |
| C | M | vio–idi | −1.598 | 43 | −2.941 | .189 |
| CP | L | vio–idi | −1.906 | 43 | −3.917 | .011** |
| CP | M | vio–idi | −2.297 | 43 | −3.863 | .013** |
| F | L | vio–idi | 0.579 | 43 | 1.092 | 1.000 |
| F | M | vio–idi | −0.237 | 43 | −0.397 | 1.000 |
| FC | L | vio–idi | −0.369 | 43 | −0.615 | 1.000 |
| FC | M | vio–idi | −0.955 | 43 | −1.579 | 1.000 |
| O | L | vio–idi | −2.877 | 43 | −5.957 | <.001**** |
| O | M | vio–idi | −2.444 | 43 | −4.861 | .001**** |
| P | L | vio–idi | −1.763 | 43 | −4.075 | .007*** |
| P | M | vio–idi | −2.639 | 43 | −5.202 | <.001**** |
| C | L | vio–sub | −0.391 | 43 | −0.919 | 1.000 |
| C | M | vio–sub | −0.099 | 43 | −0.243 | 1.000 |
| CP | L | vio–sub | −0.797 | 43 | −2.053 | 1.000 |
| CP | M | vio–sub | −0.294 | 43 | −0.700 | 1.000 |
| F | L | vio–sub | −0.006 | 43 | −0.015 | 1.000 |
| F | M | vio–sub | −0.133 | 43 | −0.263 | 1.000 |
| FC | L | vio–sub | −0.254 | 43 | −0.610 | 1.000 |
| FC | M | vio–sub | −0.147 | 43 | −0.325 | 1.000 |
| O | L | vio–sub | −0.820 | 43 | −2.127 | 1.000 |
| O | M | vio–sub | −0.792 | 43 | −2.033 | 1.000 |
| P | L | vio–sub | −0.880 | 43 | −2.499 | .589 |
| P | M | vio–sub | −0.287 | 43 | −0.750 | 1.000 |

(Substitution Minus Idiomatic = sub–idi; Violation Minus Idiomatic = vio–idi; Violation Minus Substitution = vio–sub) in the 300–500 msec interval after the onset of the RP+1, with t tests for the data pooled into the cells defined by the levels of the Longitude (Long) and the Mesiolateral (ML) factors; p values are adjusted according to the Bonferroni method.

*p < .1.

**p < .05.

***p < .01.

****p < .001.

To rule out the possibility that the observed statistical differences were simply due to the different amplitudes of the effects, we conducted a further ANOVA on the midline sites that compared the amplitude of the literal conditions (substitution at RP and violation at RP+1) with that of the idiomatic condition, with two other factors: the four-level longitude factor (Fz, Cz, Pz, Oz) and the two-level word position factor (RP, RP+1). This analysis was also meant to assess the differing topographies emerging from comparisons of the substitution and idiomatic conditions at the RP and of the violation and idiomatic conditions at the RP+1. A significant three-way interaction was obtained, which suggests that the effects at the two word positions had a different longitudinal scalp distribution. This analysis was conducted both on the mean voltage amplitudes and on the same data scaled by the square root of the sum of the squared voltages over all electrode locations for each cell defined by nontopographical factors, according to the McCarthy and Wood (1985) procedure. This latter procedure is specifically recommended for comparing topographies because it takes into account possible multiplicative differences in effect size. The results (see Table 9) further confirmed the differing topographies of the effects at the RP and the RP+1.
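The vector scaling step can be sketched as follows. The function name `vector_scale` and the voltages are hypothetical, and whether the scaling was applied to individual or group means is not specified here, so the example simply scales each row of a cells-by-electrodes array.

```python
import numpy as np

def vector_scale(amplitudes):
    """amplitudes: (n_cells, n_electrodes); divide each cell by its root sum of squares."""
    norms = np.sqrt((amplitudes ** 2).sum(axis=1, keepdims=True))
    return amplitudes / norms

# Hypothetical mean voltages at Fz, Cz, Pz, Oz for two cells whose
# topographies differ only by a multiplicative factor of 2.
cells = np.array([[-0.5, -1.0, -1.5, -1.0],
                  [-1.0, -2.0, -3.0, -2.0]])
scaled = vector_scale(cells)
```

After scaling, the two rows are identical, so a remaining interaction with the electrode factor cannot be attributed to a purely multiplicative difference in effect size.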

Table 9. 

Summary of the ANOVA Results

| Source | df | Unscaled Data: F | Unscaled Data: p | Scaled Data: F | Scaled Data: p |
|---|---|---|---|---|---|
| WP | (1, 43) | 10.525 | .002*** | 10.837 | .002*** |
| Cond | (1, 43) | 14.536 | <.001**** | 17.906 | <.001**** |
| Long | (3, 129) | 0.394 | .612 | 0.701 | .459 |
| WP × Cond | (1, 43) | 5.62 | .022** | 3.506 | .068* |
| WP × Long | (3, 129) | 11.963 | <.001**** | 10.013 | <.001**** |
| Cond × Long | (3, 129) | 9.123 | <.001**** | 7.992 | .001**** |
| WP × Cond × Long | (3, 129) | 10.650 | <.001**** | 8.142 | .001**** |

Results with the Word Position (WP), Longitude, and Condition Factors (Substitution vs. Idiomatic Conditions at RP; Violation vs. Idiomatic Conditions at RP+1) without and with the McCarthy and Wood Scaling Procedure (unscaled and scaled data, respectively).

*p < .1.

**p < .05.

***p < .01.

****p < .001.

The Latency Analysis

Figure 3 depicts the values of the t tests conducted on the average voltage over the centro-parietal cluster of sites as a function of time, for the idiomatic versus substitution conditions at the RP and for the idiomatic versus violation conditions at the RP+1. The straight horizontal line corresponds to the significance level (α = .05). The onset time of the effect, defined as the time at which at least five consecutive tests were significant, is 330 msec at the RP and 260 msec at the RP+1.

Figure 3. 

t Test on the average voltage within contiguous 10-msec time windows for the comparisons between the idiomatic and substitution conditions at the recognition point [RP] (top) and between the idiomatic and violation conditions at the word following the RP (bottom). Horizontal lines correspond to significance level (p < .05). Onset latency estimations are reported.

The Temporal PCA

The 11 extracted PCA temporal factors are reported in Figure 4. The temporal factors TF5 and TF9 had the highest loadings respectively at 400 msec and 300 msec after the onset of the two target words (the constituents coinciding with the RP and with the RP+1). Other temporal factors (TF1, TF4, TF6, TF8) clearly account for exogenous potentials because they had similar loadings peaking at short latencies after the two critical words (less than 300 msec). In contrast, the temporal factors TF2 and TF3 were sensitive to only one of the two target words and with a later peak with respect to the components of interest.7 TF10 had a similar time loading and scalp distribution to TF9, but was larger at the RP than at the RP+1. It is possible that both reflect similar patterns of activity statistically split into two factors.

Figure 4. 

Summary of PCA analysis: For each temporal factor (TF), the percentage of explained variance is reported (VE) together with a plot of the factor loading (time loading) between 0 and 1200 msec from the onset of the RP with vertical lines every 200 msec (the first and the fourth corresponding to the onset of the RP and of the RP+1, respectively). The maps on the following columns correspond to the average across subjects of the PCA scores aggregated as a function of the experimental conditions: idiomatic, substitution, and violation.

TF5 and TF9 were present both at the RP and at the RP+1 in the interval where the effects of our manipulations are evident in the grand average (see Figures 1 and 2). TF5 had two peaks 400 msec after the onset of each target word: This temporal factor had a higher score after the RP than after the RP+1. On the other hand, TF9 had two peaks at 300 msec but the factor scores were higher after the RP+1 than after the RP. The topography of TF5 in the three conditions is similar and showed the scalp distribution typical of the N400. The violation condition had a higher mean score for this time factor. Given that it differed from the idiomatic condition only at the RP+1, this might suggest that the effect at the RP+1 is due to the superimposition of a P300 for the idiomatic condition and of a larger N400 for the violation condition. It is, however, surprising that the TF5 scores for the substitution condition are not larger than for the other two conditions, despite a large N400 (and a relatively low cloze-probability) both at the RP and at the RP+1. One possibility is that although TF5 captured much of the variance of the N400 (given its time course and scalp distribution), part of it might have spread to other time factors. The topography of TF9 was drastically different and exhibited larger scores for the idiomatic condition compared to the other two conditions. This and the fact that the TF9 time loadings were higher at RP+1 together identify this temporal factor as responsible for much of the variance associated with a P300 with a clear posterior distribution. These observations suggest that TF5 captured much of the N400 component, whereas TF9 captured that part of the effect that we hypothesized as being a P300 specifically elicited by the idiomatic condition at the RP+1.

The PCA suggests an interplay between a P300 for the idiomatic condition and an N400 for the violation condition in the build-up of the effect at RP+1. Indirect evidence of this interplay also comes from the size of the mean differences in the 300–500 msec time windows. The differences between the substitution and idiomatic conditions at the RP are about 1.2 μV at Cz and 1 μV at Pz; the differences between the violation and idiomatic conditions at the RP+1 are about double at Cz (2.1 μV) and triple at Pz (3.2 μV) (see Tables 6 and 7). The cloze-probability values in the literal conditions (see Table 1) were both very low (less than .05), whereas in the idiomatic condition the value at RP+1 (.86) was about double that at RP (.38). Therefore, the differences at Cz may be accounted for by the N400, whose amplitude has a well-known (inverse) correlation with the cloze-probability of the target items (DeLong, Urbach, & Kutas, 2005). In contrast, the threefold increase at Pz is more consistent with the clearly posterior topography of the P300 estimated in terms of the TF9 factor in the PCA waveform decomposition.

The Self-paced Reading Study

Overall, 93% of the questions were answered correctly, suggesting that participants did indeed read for comprehension. Data points more than 3 SD from the mean reading time of each participant were excluded from the analyses as outliers (2.04%). The mean reading times were computed for each word and experimental condition, both by participant and by item, and were submitted to separate ANOVAs (see Table 10) with condition (idiomatic vs. substitution vs. violation) as a within-subject factor in the by-participant analysis and as a between-subject factor in the by-item analysis. We analyzed the reading times of the regions corresponding to the RP, to the two constituents following it (RP+1, RP+2), and to the final word of each sentence.
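The outlier criterion can be sketched with the standard library. The function name `trim_outliers` and the reading times are hypothetical; the sketch assumes that the mean and SD are computed over all of a participant's reading times, which is one possible reading of the criterion.

```python
import statistics

def trim_outliers(reading_times, n_sd=3.0):
    """Drop one participant's reading times lying more than n_sd SDs from the mean."""
    mean = statistics.mean(reading_times)
    sd = statistics.stdev(reading_times)
    return [rt for rt in reading_times if abs(rt - mean) <= n_sd * sd]

# Hypothetical reading times (msec) for one participant, with one very slow response.
times = [380 + 10 * (i % 5) for i in range(29)] + [5000]
kept = trim_outliers(times)
```

Only the 5000-msec response is discarded here; the remaining 29 values lie well within 3 SD of the participant's mean.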

Table 10. 

Summary

| Condition | RP: Mean (SD) | RP+1: Mean (SD) | RP+2: Mean (SD) | Final: Mean (SD) |
|---|---|---|---|---|
| Idiomatic | 394 (106) | 387 (84) | 373 (82) | 516 (180) |
| Substitution | 395 (101) | 399 (99) | 400 (83) | 546 (173) |
| Violation | 402 (110) | 413 (112) | 428 (108) | 587 (180) |

| Source | | RP: F (df) | RP+1: F (df) | RP+2: F (df) | Final: F (df) |
|---|---|---|---|---|---|
| Condition | F1 | 1.87 (2, 69) | 9.39**** (2, 69) | 37.53**** (2, 69) | 22**** (2, 69) |
| | F2 | <1 (2, 86) | 3.74** (2, 86) | 21.05**** (2, 86) | 8.39**** (2, 86) |

| Contrast | | RP: t (df) | RP+1: t (df) | RP+2: t (df) | Final: t (df) |
|---|---|---|---|---|---|
| Idiomatic vs. Substitution | t1 | | −2.44** (69) | −5.32**** (69) | −2.62** (69) |
| | t2 | | −1.84 (86) | −4.22**** (86) | −2.11** (86) |
| Idiomatic vs. Violation | t1 | | −3.76**** (69) | −7.23**** (69) | −6.68**** (69) |
| | t2 | | −2.59** (86) | −6.02**** (86) | −4.05**** (86) |
| Substitution vs. Violation | t1 | | −2.32** (69) | −4.59**** (69) | −4.02**** (69) |
| | t2 | | −1.07 (86) | −2.68*** (86) | −2.02** (86) |

The table presents mean reading times, standard deviations, and statistical analyses for the self-paced reading time study in the three experimental conditions (Idiomatic vs. Substitution vs. Violation) in the regions corresponding to the idiom's RP, the two following words (RP+1, RP+2), and the last word of the sentences.

**p < .05.

***p < .01.

****p < .001.
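The pairwise comparisons in Table 10 can be read with the help of the sketch below. It assumes (this is not stated explicitly here) that t1 and t2 are paired t tests computed over participant means and item means, respectively. Under that reading, df = n − 1, so the by-item df of 86 corresponds to the 87 idioms mentioned in Note 3. The function name and any input values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for two matched lists.

    first[i] and second[i] must come from the same participant (t1)
    or the same item (t2). A negative t means that the first condition
    has the smaller mean, i.e., the shorter reading times.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1
```

With idiomatic reading times passed as `first` and literal reading times as `second`, the negative t values in the table indicate faster reading in the idiomatic condition.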

The reading times at the RP did not show any effect of the factor condition (idiomatic vs. substitution condition). After the idiom's RP, the constituents belonging to the idiomatic sentences were read faster than in the two other conditions, and this advantage persisted until the end of the sentences. These results are consistent with the behavioral evidence on idiom processing showing that, once an idiom has been recognized, it is comprehended faster than a matched literal sentence. The reading times in the violation condition were longer than in the substitution condition in all the regions of interest after the RP, although at the RP+1 this difference was significant only in the by-participant analysis (Table 10). This suggests that, after idiom recognition, the readers perceived a mismatch between the idiom-consistent expected constituent and the constituent actually read. This produced an increase in the reading times persisting until the end of the sentence and the final wrap-up.

These results nicely complement those of the ERP experiment: It has, in fact, already been shown (e.g., Ditman, Holcomb, & Kuperberg, 2007) that the N400 might have a delayed behavioral time course often associated with spillover effects (Van Berkum et al., 2005). On the other hand, the reading time differences between the idiomatic and violation conditions at the RP+1, characterized at the electrophysiological level by a P300, emerged immediately and not as a spillover effect on the following word.

GENERAL DISCUSSION

The aim of this study was to test the possibility that the electrophysiological correlates of the processing of highly expected words in idioms, where predictability arises from our knowledge of idioms, might differ from those underlying the processing of highly expected words in literal-compositional sentences where predictability is largely due to context- and sentence-level information. In order to explore this possibility, we selected a set of Italian idioms, predictable before the string offset, and designed three experimental conditions: In the idiomatic condition, the idiom was inserted in a minimal neutral context and was followed by at least two constituents; in the substitution condition, the constituent coinciding with the idiom's RP was substituted with an idiom-unrelated word; in the violation condition, the constituent after the RP (RP+1) was changed, again with an idiom-unrelated word. All the sentences were syntactically and semantically well formed.

The ERP difference between the idiomatic and the substitution conditions at the RP is compatible in timing and topographic distribution with an N400. In contrast, when we compared the idiomatic versus the violation conditions at the RP+1, a different picture emerged: The effect was not only larger (as predicted by cloze-probability) but also differed in its timing, with an onset about 70 msec earlier, and in its topographic distribution, being significantly more posterior after the RP+1. Moreover, on the unsubtracted waveforms, a posterior positive peak following the P200 is clearly visible only in the idiomatic condition at the RP+1. Because the constituents in that specific position were balanced for the main lexical variables that are known to influence early ERP responses, it is unlikely that this deflection was due to differences in lexical access. These differences cast doubt on the possibility that both effects were just modulations of the amplitude of the same N400 component, or that they were due to an earlier onset of the N400. The N400 is usually centrally distributed, with some instability in the degree of right lateralization, and has a quite stable spatial distribution and temporal development, at least when comparing data from the same subject pool and within the same modality. Even variables that certainly affect the speed of lexical access, such as lexical frequency, do not modulate the time course of the N400, but only its amplitude. It is thus unlikely that a faster recognition of constituents after idiom recognition alone can explain the different ERP patterns.

The temporal PCA further suggested that the differences between the effects at the RP and at the RP+1 are better explained by a parietal P300 that is present for the idiomatic condition and only at the RP+1. This P300 would account for a relevant part of the whole effect in its early development on more posterior sites, possibly partially superimposed on a subsequent large N400 for the violation condition.
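For readers unfamiliar with the technique, the core idea of a temporal PCA can be sketched as follows: time points are treated as variables, individual waveforms (e.g., one per participant, electrode, and condition) as observations, and each component is a time course of loadings with one score per waveform. The sketch below is a simplified illustration under stated assumptions, not the analysis reported here: it extracts only the first, unrotated component by power iteration, and it omits the rotation step commonly applied in ERP work (e.g., the Promax rotation discussed by Dien, Khoe, & Mangun, 2007). The function name and data layout are hypothetical.

```python
from math import sqrt


def first_temporal_component(waveforms, n_iter=100):
    """First unrotated temporal principal component.

    waveforms: list of observations, each a list of voltages sampled
    at the same time points. Returns (loadings, scores), where loadings
    is a unit-length time course and scores has one value per waveform.
    """
    n_obs, n_time = len(waveforms), len(waveforms[0])

    # Center every time point across observations.
    means = [sum(w[t] for w in waveforms) / n_obs for t in range(n_time)]
    centered = [[w[t] - means[t] for t in range(n_time)] for w in waveforms]

    # Covariance between time points (n_time x n_time).
    cov = [[sum(c[i] * c[j] for c in centered) / (n_obs - 1)
            for j in range(n_time)] for i in range(n_time)]

    # Power iteration, started from the covariance column of the
    # most variable time point.
    start = max(range(n_time), key=lambda t: cov[t][t])
    v = cov[start][:]
    for _ in range(n_iter):
        norm = sqrt(sum(x * x for x in v))
        if norm == 0.0:
            raise ValueError("waveforms have no variance")
        v = [x / norm for x in v]
        v = [sum(cov[i][j] * v[j] for j in range(n_time)) for i in range(n_time)]

    norm = sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("waveforms have no variance")
    loadings = [x / norm for x in v]
    scores = [sum(c[t] * loadings[t] for t in range(n_time)) for c in centered]
    return loadings, scores
```

In such an analysis, the scores of a component would then be compared across conditions and electrodes, which is how a component with a P300-like time course could be tied to one condition and to posterior sites.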

We hypothesized that the electrophysiological correlates of the processing of highly expected words in idioms, where predictability is due to knowledge of that specific idiomatic configuration, differed from those underlying the processing of highly expected words in sentences where predictability is due to sentence-level information. Therefore, we expected a qualitative change in the waveforms associated with the retrieval of the idiomatic configuration from semantic memory. Language processing, predictive or not, unfolds in time and, in fact, we hypothesized that before the idiom string was identified as a specific idiom (i.e., at the RP), a sufficient amount of input might have accumulated to render a constituent change noticeable even before the idiom was identified and retrieved from semantic memory. Immediately after the retrieval of the idiom, the waveforms changed and exhibited what we take to be a P300. Roehm et al. (2007) did not replicate the P300 when they presented antonymous word pairs in a lexical decision task (Experiment 2), suggesting that the P300 was task dependent. This would imply that categorical predictive processes are operative under specific circumstances, whereas probabilistic expectations are built up in more automatic ways. In this ERP experiment, participants were asked to read for comprehension without any explicit judgment on the correctness of the sentence. The fact that we observed a P300 suggests that this waveform can emerge during sentence comprehension even when no metalinguistic judgment is required.

Recently, Kok (2001) proposed a distinction between probabilistic and categorical expectations. This distinction might contribute in a substantial way to the interpretation of the present findings. In his dual model of event categorization, Kok claimed that the matching of an external stimulus to an internal representation might be based on the interplay of two independent mechanisms: a categorical matching mechanism whose electrophysiological correlate corresponds to a P300, and a probabilistic mechanism of search in working memory associated with a slow negative wave with similar latencies. We interpret our findings as evidence that the different ERP effects (N400 and P300) are the product of two different types of predictive mechanisms: one based on probabilistic expectations (leading to N400 modulations) and one on categorical expectations (leading to a P300). In general terms, we propose that the first mechanism is at work for sentential meanings constructed on-line and exploits semantic–pragmatic knowledge (context- and sentence-level information). The second mechanism specifically operates for multiword expressions (in our case idioms) when the compositional analysis must be integrated with the retrieval of prefabricated meaning from semantic memory. These two types of mechanism can be realized by different neural networks, which produce ERP effects with different timings and topographical distributions. In our study, the N400 at the RP is presumably due to the violation of a distributional-based sense of familiarity with the words forming the idiom fragment up to that point: The nature of the expectations based on the frequency of co-occurrence of a given set of words is intrinsically probabilistic. 
These expectations can develop when part of a multiword unit is presented, even before the reader recognizes that the fragment is a part of a specific idiom, and they modulate the amplitude of the N400. After the RP, readers retrieve the idiom from semantic memory, as posited by the Configuration Hypothesis. This might explain the clearly different ERP waveforms: After recognition of the idiomatic configuration, a categorical prediction mechanism operates, confirmed by a P300 (as in Roehm et al., 2007, Experiment 1).

Further evidence of different cognitive processes before and after the idiom's RP came from the reading time experiment. After the RP, the idiomatic sentences were read faster than the literal sentences, consistent with previous behavioral findings. In contrast, after the RP, the sentences in the violation condition were read more slowly than in the substitution condition. This is unsurprising if one assumes that the categorical prediction mechanism is sensitive to the mismatch between the expected idiom template and the actual constituent. This mismatch might have forced the reader to revise interpretation of the unfolding sentence, a process that lasted until the end of the sentence. However, there was no reading time difference at the RP between the idiomatic and substitution conditions, despite evident N400 effects. This suggests that ERPs are more sensitive than reading times in detecting the effect of the distributional properties of language.

The P300s are a family of functionally distinct components: There is a more anterior P3a (novelty P300), typically elicited by unexpected events, and a more posterior P3b elicited by infrequent task-relevant stimuli. The distribution of the P3b is similar to the one observed in our study, and its latency varies as a function of the time necessary to categorize the rare event. Nevertheless, we rule out an interpretation of the P300 we observed as an oddball P3b for several reasons. First, it could not reflect the rarity of the idiomatic sentences because the presence of idiomatic sentences was undetected by the participants. Furthermore, the processing of a rare event is usually associated with a behavioral cost. In contrast, we obtained faster reading times for the idiomatic sentences. Finally, classifying a sentence as idiomatic was task-irrelevant in our experiment because participants read the sentences word-by-word and the comprehension verification sentences were always associated with filler sentences. That our P300 might differ from an oddball P3b does not exclude some functional similarities between them. A long-standing debate opposes a traditional context-updating explanation of the P3b (Donchin & Coles, 1988) to a context-closure explanation (Verleger, 1988). Both views explain the larger P3b for infrequent events in terms of management of expectations, but while for the context-updating view the P3b reflects surprise at an unexpected element, for the context-closure hypothesis it reflects the closure of an active expectation for the relevant event. The nature of predictable idioms is more easily accommodated by the context-closure explanation because the constituents after the idiom's RP are highly expected. 
The abovementioned refinement of Verleger's (1988) explanation offered by Kok (2001) fits well with the explanation we have offered of our data as an interplay between diminished N400 and enhanced P300 in terms of separate prediction–verification mechanisms at play in idiom comprehension.

To conclude, our results on idiom comprehension showed an interplay between a P300 for the (categorically) expected word and a partially superimposed N400 for the (probabilistically) unexpected word. These results indicate that different types of predictive forward-looking mechanisms may operate during sentence comprehension. Further empirical work on this topic is, however, necessary for the development of a more detailed theoretical account of the relationships between predictive mechanisms and expectation–verification mechanisms, be they language specific or not (see Note 8).

APPENDIX

Further examples of the experimental triplets in the three conditions (I = idiomatic; S = substitution; V = violation) with English word-by-word and idiomatic translations. Bracketed tags follow the word they mark: [RP] = the point after which the idiom is retrieved; [RP+1] = the constituent after it.

  1. Piangere sul latte versato (to cry over spilt milk)

    • I: Marco piangeva sul latte[RP] versato quella volta (Marco cried over the milk spilt that time).

    • S: Marco piangeva sul letto disfatto quella volta (Marco cried over the bed unmade that time).

    • V: Marco piangeva sul latte macchiato[RP+1] quella volta (Marco cried over the milk with a dash of coffee that time).

  2. Mettere la testa a posto (To put the head in place: to get wiser)

    • I: Davide ha messo la testa[RP] a posto in fretta (Davide put the head in place in a hurry).

    • S: Davide ha messo la spalla a posto in fretta (Davide put the shoulder in place in a hurry).

    • V: Davide ha messo la testa sul[RP+1] cuscino in fretta (Davide put the head on the pillow in a hurry).

  3. Avere il coltello dalla parte del manico (to have the knife by (the side of) the handle: to have the upper hand)

    • I: Maria aveva il coltello dalla[RP] parte del manico quel giorno (Maria had the knife by the side of the handle that day).

    • S: Maria aveva il coltello senza parte del manico quel giorno (Maria had the knife without part of the handle that day).

    • V: Maria aveva il coltello dalla vicina[RP+1] di casa quel giorno (Maria had the knife at the neighbour's that day).

  4. Levarsi un peso dallo stomaco (To get rid of a weight from the stomach: to get something off one's chest)

    • I: Donatella si era levata un peso dallo[RP] stomaco quella sera (Donatella got rid of a weight from the stomach that night).

    • S: Donatella si era levata un peso senza fatica quella sera (Donatella got rid of a weight without effort that night).

    • V: Donatella si era levata un peso dallo stivale[RP+1] quella sera (Donatella got rid of a weight from the boot that night).

  5. Perdere il filo del discorso (To lose the thread of the discourse: To get confused)

    • I: Nicola aveva perso il filo[RP] del discorso durante la lezione (Nicola had lost the thread of the discourse during the lesson).

    • S: Nicola aveva perso il video del concerto durante la lezione (Nicola had lost the video of the concert during the lesson).

    • V: Nicola aveva perso il filo sul[RP+1] banco durante la lezione (Nicola had lost the thread on the bench during the lesson).

  6. Toccare il cielo con un dito (To touch the sky with the finger: to be extremely happy)

    • I: Anna aveva toccato il cielo[RP] con un dito quel giorno (Anna had touched the sky with the finger that day).

    • S: Anna aveva toccato il muro con lo specchietto quel giorno (Anna had touched the wall with the hand mirror that day).

    • V: Anna aveva toccato il cielo sul[RP+1] suo quadro quel giorno (Anna had touched the sky on her painting that day).

  7. Fare castelli in aria (To build castles in the air: to imagine things that will not happen)

    • I: Gina faceva castelli in[RP] aria molto spesso (Gina built castles in the air very often).

    • S: Gina faceva castelli di sabbia molto spesso (Gina built castles of sand very often).

    • V: Gina faceva castelli in spiaggia[RP+1] molto spesso (Gina built castles on the beach very often).

  8. Aggiungere legna al fuoco (To add wood to the fire: to add fuel to the fire)

    • I: Giulia ha aggiunto legna[RP] al fuoco durante la rissa (Giulia added wood to the fire during the brawl).

    • S: Giulia ha aggiunto sale al brodo durante la cena (Giulia added salt to the broth during the dinner).

    • V: Giulia ha aggiunto legna in[RP+1] garage durante la settimana (Giulia added wood in the garage during the week).

  9. Fare vedere i sorci verdi (To make someone see the rats green: to make someone sweat blood)

    • I: Umberto gli fece vedere i sorci[RP] verdi quella volta (Umberto made him see the rats green that time).

    • S: Umberto gli fece vedere i vermi morti nel giardino (Umberto made him see the worms dead in the garden).

    • V: Umberto gli fece vedere i sorci morti[RP+1] quella volta (Umberto made him see the rats dead that time).

  10. Tirare i remi in barca (To pull the oars in the boat: to draw in one's horns)

    • I: Amedeo ha tirato i remi in[RP] barca da tempo (Amedeo has pulled the oars in the boat for a while).

    • S: Amedeo ha tirato i remi dalla barca da tempo (Amedeo has pulled the oars from the boat for a while).

    • V: Amedeo ha tirato i remi in spiaggia[RP+1] da tempo (Amedeo has pulled the oars on the beach for a while).

Acknowledgments

We thank Matteo Corradini and Michela Slomp for their help in carrying out the ERP experiment. The research was supported by a PRIN grant to Cristina Cacciari (2005119758_003).

Reprint requests should be sent to Francesco Vespignani, Dipartimento di Scienze della Cognizione e della Formazione, Università degli Studi di Trento, Corso Bettini, 31, Rovereto, 38068, Italy, or via e-mail: francesco.vespignani@unitn.it.

Notes

1. Cloze-probability is the probability that a given word will be produced to continue a sentence fragment.

2. In fact, there was no overt violation of either the syntactic or semantic structure of the sentences in this condition because all were well-formed sentences (as in the other two conditions).

3. Only 20 out of 87 idioms also have a possible literal interpretation. However, these interpretations were of low literal plausibility or denoted rather infrequent actions (as in take the bull by the horns, for instance).

4. When the constituents changed in the substitution and violation conditions were open-class items, we preserved the grammatical gender of the idiomatic constituent as far as possible. In fact, we only had two changes of gender in the substitution condition and nine in the violation condition out of 87 idioms.

5. The substitution condition was created for testing idiom processing at the RP. Any comparison between this condition and the other two conditions at the RP+1 is problematic because of the large between-item cloze-probability variability and the lack of a strict control of lexical variables such as word frequency.

6. In figurative language experiments, to avoid participants developing a special processing mode if they detect a figurative expression, it is common practice to restrict the ratio of figurative expressions to a maximum of ¼ of the experimental materials. This is why we had only 29 sentences containing the entire idiom and 120 literal fillers.

7. We used the PCA to better describe the effect under study; however, the clear difference in TF2 between the idiomatic condition and the two literal ones, as a possible correlate of the activation of the figurative meaning following recognition, is an attractive explanation. The left anterior scalp distribution is consistent with the deflection on left anterior sites already noted as a possible explanation of the unexpected Longitude × Laterality × Condition interaction at RP+1. A careful study of this slow-wave deflection, however, needs a specifically designed paradigm and will be addressed in future studies.

8. Recent results by Fisher et al. (2009), who studied the ERP correlates of algebraic computation using a verification task, show an N400 with an earlier onset than the classical linguistic N400. The effect, as in our experiment, is characterized by a positive peak in the unsubtracted waveform for the correct (categorically predictable) result that is absent when the result of the algebraic computation is wrong.
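As an illustration of the definition in Note 1, cloze-probability can be estimated from a completion task as the proportion of respondents who continue a fragment with the target word. The sketch below is a minimal example under that assumption; the function name, the normalization of responses (trimming and lowercasing), and the completion counts in the usage note are hypothetical and are not data from this study.

```python
from collections import Counter


def cloze_probability(completions, target):
    """Proportion of completions that match the target word.

    completions: one response per respondent for the same fragment.
    Responses are compared after trimming whitespace and lowercasing.
    """
    if not completions:
        raise ValueError("at least one completion is required")
    counts = Counter(word.strip().lower() for word in completions)
    return counts[target.strip().lower()] / len(completions)
```

For instance, if 18 of 20 hypothetical respondents continued the fragment "Marco piangeva sul latte ..." with "versato", `cloze_probability(["versato"] * 18 + ["macchiato", "caldo"], "versato")` would return 0.9.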

REFERENCES

Cacciari, C., & Glucksberg, S. (1991). Understanding idiomatic expressions. In G. Simpson (Ed.), Understanding word and sentence (pp. 217–240). Amsterdam: Elsevier Science.

Cacciari, C., Padovani, R., & Corradini, P. (2007). Exploring the relationship between individuals' speed of processing and their comprehension of spoken idioms. European Journal of Cognitive Psychology, 27, 668–683.

Cacciari, C., & Tabossi, P. (1988). The comprehension of idioms. Journal of Memory and Language, 27, 668–683.

Cutler, A., & Clifton, C. (1999). Comprehending spoken language. In C. M. Brown & P. Hagoort (Eds.), The neurocognition of language (pp. 123–166). Oxford: Oxford University Press.

DeLong, K. A., Urbach, T. P., & Kutas, M. (2005). Probabilistic word pre-activation during language comprehension inferred from electrical brain activity. Nature Neuroscience, 8, 1117–1121.

Dien, J., & Frishkoff, G. A. (2005). Introduction to principal components analysis of event related potentials. In T. Handy (Ed.), Event related potentials: A methods handbook (pp. 189–208). Cambridge, MA: MIT Press.

Dien, J., Khoe, W., & Mangun, G. R. (2007). Evaluation of PCA and ICA of simulated ERP: Promax vs. infomax rotations. Human Brain Mapping, 28, 742–763.

Ditman, T., Holcomb, P. J., & Kuperberg, G. R. (2007). An investigation of concurrent ERP and self-paced reading methodologies. Psychophysiology, 44, 927–935.

Donchin, E., & Coles, M. G. H. (1988). Is the P300 component a manifestation of context-updating? Behavioral and Brain Sciences, 11, 355–372.

Federmeier, K. D. (2007). Thinking ahead: The role and roots of prediction in language comprehension. Psychophysiology, 44, 491–505.

Federmeier, K. D., & Kutas, M. (1999). A rose by any other name: Long term memory structure and sentence processing. Journal of Memory and Language, 41, 469–495.

Fisher, K., Bassok, M., & Osterhout, L. (2009, March). The arithmetic incongruency effect: Same processing for different symbolic representations. Poster presented at The 16th Annual Meeting of the Cognitive Neuroscience Society, San Francisco, CA.

Gross, D., Fischer, U., & Miller, G. A. (1989). The organization of adjectival meanings. Journal of Memory and Language, 28, 92–106.

Hagoort, P., & van Berkum, J. (2007). Beyond the sentence given. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 362, 801–811.

Holcomb, P. (1993). Semantic priming and stimulus degradation: Implications for the role of the N400 in language processing. Psychophysiology, 30, 47–61.

Just, M. A., Carpenter, P. A., & Wooley, J. D. (1982). Paradigms and processes in reading comprehension. Journal of Experimental Psychology: General, 111, 228–238.

Kok, A. (2001). On the utility of P300 amplitude as a measure of processing capacity. Psychophysiology, 38, 557–577.

Kutas, M., & Hillyard, S. A. (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature, 307, 161–163.

Kutas, M., Van Petten, C., & Kluender, R. (2006). Psycholinguistics electrified II. In M. A. Gernsbacher & M. Traxler (Eds.), Handbook of psycholinguistics (pp. 659–724). New York: Elsevier.

Lau, E. F., Phillips, C., & Poeppel, D. (2008). A cortical network for semantics: (De)constructing the N400. Nature Reviews Neuroscience, 9, 920–933.

Laurent, J., Denhières, G., Passerieux, C., Iakimovac, G., & Hardy-Baylé, M. (2006). On understanding idiomatic language. Brain Research, 1068, 151–160.

Luck, S. J. (2005). An introduction to the event-related potential technique. Cambridge, MA: MIT Press.

McCarthy, G., & Wood, C. C. (1985). Scalp distributions of event-related potentials: An ambiguity associated with analysis of variance models. Electroencephalography and Clinical Neurophysiology, 62, 203–208.

Metting van Rijn, A. C., Peper, A., & Grimbergen, C. A. (1990). High-quality recording of bioelectric events: Part 1. Medical and Biological Engineering and Computing, 28, 389–397.

Miller, G. A., & Selfridge, J. A. (1950). Verbal context and the recall of meaningful material. American Journal of Psychology, 63, 176–185.

Moreno, E. M., Federmeier, K. D., & Kutas, M. (2002). Switching languages, switching palabras (Words): An electrophysiological study of code switching. Brain and Language, 80, 188–207.

Oldfield, R. C. (1971). The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia, 9, 97–114.

Peterson, R. R., Burgess, C., Dell, G. S., & Eberhard, K. L. (2001). Dissociation between syntactic and semantic processing during idiom comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 90, 227–234.

Pickering, M. J., & Garrod, S. (2007). Do people use language production to make predictions during comprehension? Trends in Cognitive Sciences, 11, 105–110.

Roehm, D., Bornkessel-Schlesewsky, I., Rösler, F., & Schlesewsky, M. (2007). To predict or not to predict: Influences of task and strategy on the processing of semantic relations. Journal of Cognitive Neuroscience, 19, 1259–1274.

Rugg, M. D., Doyle, M. C., & Wells, T. (1995). Word and nonword repetition within- and across-modality: An event related potential study. Journal of Cognitive Neuroscience, 7, 209–227.

Sprenger, S. A., Levelt, W. J. M., & Kempen, G. (2006). Lexical access during the production of idiomatic phrases. Journal of Memory and Language, 54, 161–184.

Strandburg, R. J., Marsh, J. T., Brown, W. S., Asarnow, R. F., Guthrie, D., & Higa, J. (1993). Event-related potentials in high-functioning adult autistics. Neuropsychologia, 31, 413–434.

Swinney, D. A., & Cutler, A. (1979). The access and processing of idiomatic expression. Journal of Verbal Learning and Verbal Behavior, 18, 523–534.

Tanenhaus, M. K., & Trueswell, J. C. (1995). Sentence comprehension. In P. D. Eimas & J. L. Miller (Eds.), Handbook in perception and cognition: Vol. 11. Speech language and communication (pp. 217–262). San Diego, CA: Academic Press.

Van Berkum, J. J. A., Brown, C. M., Zwitserlood, P., Kooijman, V., & Hagoort, P. (2005). Anticipating upcoming words in discourse: Evidence from ERP and reading times. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 443–467.

Verleger, R. (1988). Event-related potentials and cognition: A critique of the context-updating hypothesis and an alternative interpretation of the P300. Behavioral and Brain Sciences, 11, 343–427.

Wicha, N. Y. Y., Moreno, E. M., & Kutas, M. (2004). Anticipating words and their gender: An event-related brain potential study of semantic integration, gender expectancy, and gender agreement in Spanish sentence reading. Journal of Cognitive Neuroscience, 16, 1272–1288.