Abstract

In an ERP experiment, we examined whether listeners, when making sense of spoken utterances, take into account the meaning of spurious words that are embedded in longer words, either at their onsets (e.g., pie in pirate) or at their offsets (e.g., pain in champagne). In the experiment, Dutch listeners heard Dutch words with initial or final embeddings presented in a sentence context that did or did not support the meaning of the embedded word, while equally supporting the longer carrier word. The N400 at the carrier words was modulated by the semantic fit of the embedded words, indicating that listeners briefly relate the meaning of initial- and final-embedded words to the sentential context, even though these words were not intended by the speaker. These findings help us understand the dynamics of initial sense-making and its link to lexical activation. In addition, they shed new light on the role of lexical competition and the debate concerning the lexical activation of final-embedded words.

INTRODUCTION

Many spoken words contain other shorter words. For example, the word pirate starts with the initial-embedded word pie and champagne contains the final embedding pain. According to a count by McQueen, Cutler, Briscoe, and Norris (1995), no less than 84% of all polysyllabic words in English have shorter words embedded within them. These words are not intended by the speaker, but are, nevertheless, present in the acoustic signal. What are the implications of this for the listener, who is trying to understand what the speaker is saying? Do listeners briefly take into account the meaning of these spurious words when making sense of the input? For example, if listeners hear a sentence such as He asked when the champagne would be cold enough to be served, do they also briefly consider the meaning of pain as being part of the message?

Research on spoken word recognition directly speaks to our question because any evidence of lexical activation of embedded words makes it more likely that these words are also involved during sense-making. Spoken word recognition is a rapid and continuous process. Although the speech signal unfolds in time, several lexical candidates are activated in parallel as a function of the goodness-of-fit between the acoustic input and stored mental lexical representations (e.g., Allopenna, Magnuson, & Tanenhaus, 1998; McQueen, Norris, & Cutler, 1994; Zwitserlood, 1989; Marslen-Wilson, 1987). As more acoustic information becomes available, the set of matching candidates is narrowed down until only one candidate is left. Furthermore, there is good evidence that the recognition of spoken words involves a process of competition between lexical candidates (McQueen et al., 1994; Norris, 1994; McClelland & Elman, 1986), so that listeners can more rapidly settle on one particular candidate.
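To make the notion of parallel activation and competition concrete, the toy sketch below steps through the input "pirate" phone by phone, letting each candidate accrue bottom-up support and then inhibiting it in proportion to its competitors' activation. This is only an illustration in the spirit of competition-based models such as TRACE or Shortlist, not an implementation of any of them; the mini-lexicon, the crude phone coding, and all parameter values are invented for this sketch.

```python
# Toy illustration of parallel activation plus lateral inhibition.
# Lexicon, "phone" strings, and parameters are invented for illustration only.

candidates = {"pie": "paI", "pirate": "paIr@t", "pile": "paIl"}
unfolding_input = "paIr@t"   # the listener hears "pirate", phone by phone

activation = {word: 0.0 for word in candidates}

for t in range(1, len(unfolding_input) + 1):
    heard = unfolding_input[:t]
    for word, phones in candidates.items():
        # bottom-up support: number of phones matched so far, zero once the word mismatches
        matches = phones.startswith(heard) or heard.startswith(phones)
        support = min(len(heard), len(phones)) if matches else 0
        activation[word] += 0.2 * support
    # lateral inhibition: each word is suppressed in proportion to its competitors' activation
    total = sum(activation.values())
    activation = {w: max(0.0, a - 0.1 * (total - a)) for w, a in activation.items()}
    print(heard, {w: round(a, 2) for w, a in activation.items()})
```

In a run like this, pie gains early support alongside pirate and is then outcompeted once the second syllable arrives, which is the pattern at issue in the studies discussed next.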

In line with the notion of parallel activation, there is evidence that a word with an initial embedding such as pirate also briefly activates the shorter lexical candidate pie (e.g., Salverda, Dahan, & McQueen, 2003). The activation of pie, however, is believed to be short-lived, because as soon as the second syllable of the word comes in, pirate should rapidly gain more support from the input and should suppress the activation of pie. In line with this, several priming studies have shown that the activation of initial-embedded words has disappeared at the end of the carrier words (Isel & Bacri, 1999; Marslen-Wilson, Tyler, Waksler, & Older, 1994).

For final embeddings, such as pain in champagne, the story is less straightforward. One essential property of final embeddings is that they start later in time than their carrier words, so that when the acoustic information starts to match with the shorter lexical candidate (pain), the longer candidate (champagne) has already gained considerable support. In principle, this could result in early suppression of any lexical activation of final embeddings. However, the empirical results, to date, are not consistent. Numerous priming studies have examined the lexical activation of final embeddings, with some studies reporting facilitatory priming effects (Isel & Bacri, 1999; Luce & Cluff, 1998; Vroomen & de Gelder, 1997; Shillcock, 1990), others reporting inhibitory priming (Shatzman, 2006; Marslen-Wilson et al., 1994), and some finding no priming at all (Norris, Cutler, McQueen, & Butterfield, 2006; Gow & Gordon, 1995). Thus, there is no consensus about whether final embeddings are activated upon hearing the carrier word.

The purpose of the present study is to look beyond the mere lexical activation of embedded words and examine whether listeners briefly take into account the meaning of embedded words when constructing the meaning of an unfolding sentence. After all, the ultimate goal of the listener is not to recognize the words, but to make sense of what is said. To achieve this, the meanings of the individual words have to be combined into a sensible whole. As yet, little is known about the exact link between lexical activation and higher-level sense-making (but see Jackendoff, 2007, for a theoretical sketch involving parallel multiple constraint satisfaction), and there are, to our knowledge, no studies that examine to what extent spuriously activated lexical embeddings take part in it. Examining this is important because, although spuriously embedded words will usually not matter to the final interpretation, their role in the real-time process that constructs this interpretation can hold important clues to the nature and architecture of incremental sense-making.

Recent studies using ERPs have shown that listeners can start relating the meaning of a word to the context before the word is uniquely identified (Van den Brink, Brown, & Hagoort, 2001, 2006; Van Petten, Coulson, Rubin, Plante, & Parks, 1999). As the first syllable of pirate completely overlaps phonemically with the word pie, these earlier findings might lead one to expect that listeners will also briefly take into account the meaning of an initial embedding like pie upon hearing the word pirate. However, the case is not so clear, because the exact acoustic realization of a syllable like pie differs as a function of whether it is produced as a monosyllabic word or as the onset of a longer word; the latter typically results in a shorter syllable. Furthermore, previous research has shown that listeners can use these durational differences to distinguish between monosyllabic words and the onsets of longer words (Salverda et al., 2003; Davis, Marslen-Wilson, & Gaskell, 2002). Thus, what we ask here is whether the comprehension system still regards initial embeddings as good enough exemplars of the actual monosyllabic words to allow their meaning to affect initial sense-making. Also, if it does, what happens when more of the acoustic information in favor of the longer candidate comes in?

As for the final embeddings, our question is whether listeners briefly take into account the meaning of such embeddings even though they are preceded by nonnegligible acoustic information in favor of the carrier word. As explained previously, competition of this longer and earlier-starting lexical candidate may well strongly inhibit the activation of the later-starting embedded word (e.g., Norris et al., 2006), to such an extent that the meaning of the latter is not taken into consideration at all. Any ERP evidence that the embedded word's meaning can still temporarily impact sentence-level sense-making would thus be highly informative. In all, evidence for or against the involvement of initial- and final-embedded words in higher-level sense-making can help us understand the dynamics of initial sense-making and its link to lexical activation. In addition, such evidence will shed new light on the role of lexical competition and the debate concerning the lexical activation of final-embedded words.

To examine these issues, we made use of the N400 (Kutas & Hillyard, 1980), a negative-polarity component of the scalp-recorded EEG known to be highly sensitive to the relative ease with which the meaning of a word is retrieved and related to the preceding context. Words whose meanings are difficult to relate to the context elicit larger N400s than words for which this is easier (see Kutas, Van Petten, & Kluender, 2006, for a review), a robust phenomenon that allows us to selectively tap into the process of early sense-making. For current purposes, an additional advantage of the use of EEG is that it allows us to continuously monitor the brain activity of the listener as the speech signal unfolds in time, with high temporal resolution, and without the need for an additional response task that might interfere with natural sentence comprehension.

In the experiment, carrier words with an initial or final embedding were presented in sentences in which the meaning of the embedding was either supported by the preceding sentence frame or not. For example, a word like pirate would be presented in a sentence supporting the meaning of the initial embedding pie, as in “While Clare was waiting at the bakery she eagerly looked at the pirate on the film poster,” or in a sentence that did not support the embedding, as in “While Clare was waiting at the pharmacy she eagerly looked at the pirate on the film poster.” Similarly, a word like champagne with the final embedding pain would be presented in a sentence supporting the meaning of pain, as in “The patient asked the nurse when the champagne would be cold enough to be served,” or in an unsupporting sentence as, “The tourist asked the driver when the champagne would be cold enough to be served.” Critically, we made sure that the semantic fit of the longer carrier words was the same in both types of contexts, such that any difference in the ERPs evoked by the carrier words could only be attributed to the difference in the goodness-of-fit of the embedded words. For technical reasons explained in the Methods section, this was realized by keeping the semantic fit of the carrier words in these sentences relatively low.

The logic of the experiment was as follows: If listeners try to relate the meaning of the embedded words (e.g., pain) to the context, the N400 should be reduced in the context that initially supports the meaning of the embedding. In contrast, if listeners ignore the presence of the embedding, for example, because lexical activation of the embedded word is too weak or too short-lived (due to a poor acoustic match and/or due to lexical competition), both conditions should elicit similar N400 components.

METHODS

Participants

Forty volunteers (31 women and 9 men, mean age = 20.5 years), all right-handed college students from the University of Amsterdam, participated for course credits.

Materials

Words

The experimental carrier words were 79 Dutch multisyllabic words (2–5 syllables) with primary stress on the first syllable and 65 Dutch multisyllabic words (2–4 syllables) with primary stress on the last syllable. In all cases, the stressed syllable coincided phonemically with an existing Dutch monosyllabic word, such that each carrier word contained a monosyllabic word that was aligned with a syllable boundary. Thus, there were 79 word-initial embedded words [e.g., snor (moustache) in snorkel (snorkel)] and 65 word-final embedded words [e.g., meel (flour) in kameel (camel)]. In addition, for reasons explained below, for each carrier word, a corresponding word without any embedding was selected, matching the carrier word in number of syllables, stress pattern, and frequency (using the Corpus Spoken Dutch). The mean duration and mean frequency of the embedded words, carrier words, and corresponding embedding-free words are given in Table 1.

Table 1. Durations (msec) and Word Frequencies (10 Log of the Token Counts per 9 Million According to the Corpus Spoken Dutch)

              Duration, mean (min–max)    Frequency, mean (SD)
Initial
  Embeddings  233 (111–329)               1.95 (0.89)
  Carriers    506 (261–791)               0.74 (0.78)
  Controls    469 (246–768)               0.76 (0.74)
Final
  Embeddings  255 (160–431)               1.98 (0.87)
  Carriers    442 (291–752)               1.05 (0.84)
  Controls    438 (270–633)               1.08 (0.81)
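As a small worked example of the measures summarized in Table 1, the snippet below computes the 10-log frequency measure and a mean (min–max) duration summary. The token counts and durations are hypothetical, and it is assumed that the counts are already expressed per 9 million tokens (roughly the size of the Corpus Spoken Dutch); these are not the actual values used in the study.

```python
import math

# Hypothetical token counts (illustrative only, not the actual Corpus Spoken Dutch counts).
token_counts = {"snor": 89, "snorkel": 6, "meel": 95, "kameel": 11}

# The frequency measure in Table 1 is the 10-log of the token count per 9 million tokens.
log_freq = {w: math.log10(c) for w, c in token_counts.items()}
print({w: round(f, 2) for w, f in log_freq.items()})

# Duration summaries of the kind reported in Table 1 (mean with min-max range),
# computed over made-up word durations in milliseconds.
durations_ms = [233, 180, 310, 255, 199]
summary = (sum(durations_ms) / len(durations_ms), min(durations_ms), max(durations_ms))
print("mean %.0f msec (range %d-%d)" % summary)
```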

Sentences

As shown in Table 2, two critical sentences were constructed for each carrier word, one whose incremental sentential meaning (up to just before the carrier word) supported the meaning of the embedded word, and one whose sentential meaning did not support it. As explained earlier, it was critical to our logic that the semantic fit of the carrier word in these two sentences would be identical. Because it is practically impossible to construct sentences that support the meaning of the carrier word and still allow us to systematically manipulate the semantic fit of the word embedded within it, the only way to control for the goodness-of-fit of the carrier words was by using initially anomalous carrier words. Furthermore, to have a sensitivity check in case of a critical null result, we included, for each pair of critical sentences with embeddings, a comparable supplementary sentence with a fully coherent embedding-free word. Relative to anomalous critical carrier words of either type, these coherent embedding-free words should elicit a strongly attenuated N400 component.

Table 2. 

Example Materials for Initial and Final Embeddings

Initial Embeddings

Contextually supported embedding:
  De man vroeg de kapster of ze zijn snorkel op zolder had zien liggen
  Lit. The man asked the hairdresser whether she his [moustache]snorkel in the attic had seen lying

Contextually unsupported embedding:
  De man vroeg de zangeres of ze zijn snorkel op zolder had zien liggen
  Lit. The man asked the singer whether she his [moustache]snorkel in the attic had seen lying

Coherent embedding-free word:
  De hoogleraar vroeg zijn vrouw of ze zijn toga in de kast had zien hangen
  Lit. The professor asked his wife whether she his gown in the closet had seen hanging

Final Embeddings

Contextually supported embedding:
  Jane wilde een quiche bakken, maar zag dat er geen kameel in de dierentuin was
  Lit. Jane wanted to bake a pie, but saw that there no camel[flour] in the zoo was

Contextually unsupported embedding:
  Jane wilde een jurk kopen, maar zag dat er geen kameel in de dierentuin was
  Lit. Jane wanted to buy a dress, but saw that there no camel[flour] in the zoo was

Coherent embedding-free word:
  Amy kreeg een beetje honger en hoopte dat er nog een banaan in haar tas zat
  Lit. Amy got a bit hungry, and hoped that there still a banana in her bag was

The critical word in each sentence is the carrier word (snorkel, kameel) or the embedding-free control word (toga, banaan); in the literal glosses, the meaning of the embedded word is given in square brackets next to the carrier word.

A rating task, in which 40 subjects evaluated the semantic fit of the critical words on a 6-point scale (1 = very poor fit, 6 = excellent fit), confirmed that the semantic fit of the carrier words was matched between the two conditions, with mean scores of 2.0 (SD = 0.73) and 2.1 (SD = 0.83) for the carrier words with initial embeddings, and 2.0 (SD = 0.73) and 2.0 (SD = 0.66) for the words with final embeddings. Furthermore, the semantic fit of the embedded words was rated higher in the contexts supporting the embeddings than in the contexts that did not support the embeddings, with mean scores of 5.4 (SD = 0.42) and 2.2 (SD = 0.75), respectively, for the carrier words with initial embeddings and 5.5 (SD = 0.54) and 2.4 (SD = 0.98) for the carrier words with final embeddings. The semantic fit of the coherent embedding-free words in their contexts was 5.5 (SD = 0.41) for the ones corresponding to the carriers with initial embeddings and 5.6 (SD = 0.22) for the ones corresponding to the carriers with final embeddings. In addition to the experimental sentences, 144 normal filler sentences and 10 practice sentences were constructed.

Recordings and Stimulus Construction

The recordings were made by a female native speaker of Dutch who was unaware of the purpose of the experiment. To make sure that she did not notice the embedded words (as this could have affected the production of the carrier words), new sentences were constructed such that the information supporting the embedded word never appeared in the same sentence as the embedding. Afterward, the critical parts (either the first or second part of the recorded sentences) were spliced together. For the two critical conditions, the same recordings containing the carrier words were used to make sure that the second parts of these sentences were acoustically identical. In most sentences, the splice point was situated at a phrase boundary, and it never coincided with the onset of the critical word.
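A minimal sketch of this kind of cross-splicing is given below, assuming mono WAV recordings and hand-chosen splice points. The file names and time points are hypothetical, and the authors' actual editing procedure is not specified beyond the description above.

```python
import numpy as np
from scipy.io import wavfile

# Illustrative cross-splicing: the context part of one recording is joined to the part
# containing the critical word from another recording, so the carrier word itself is
# acoustically identical across conditions. File names and splice points are hypothetical.
rate_a, context_part = wavfile.read("context_supported.wav")   # e.g., "...vroeg de kapster of ze zijn"
rate_b, critical_part = wavfile.read("carrier_recording.wav")  # e.g., "...snorkel op zolder had zien liggen"
assert rate_a == rate_b

splice_context = int(1.84 * rate_a)    # end of the context part (seconds -> samples), ideally at a phrase boundary
splice_critical = int(2.10 * rate_b)   # start of the stretch containing the carrier word

spliced = np.concatenate([context_part[:splice_context], critical_part[splice_critical:]])
wavfile.write("supported_embedding_item.wav", rate_a, spliced)
```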

For the EEG experiment, two different lists were created. Both lists contained half of the contextually supported embeddings, half of the contextually unsupported embeddings, all sentences with coherent embedding-free words, and all filler sentences. Each list consisted of six blocks of 72 sentences (48 experimental sentences, 24 fillers). The different types of experimental sentences were evenly distributed among the six blocks. The experiment started with 10 practice sentences. For each list, a second randomization was created by reversing its presentation order.
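The list construction logic described above can be sketched as follows. Item labels are placeholders, and the even distribution of the specific sentence types over blocks is only approximated here by shuffling; this is not the authors' actual script.

```python
import random

random.seed(1)

carrier_items = [f"carrier_{i}" for i in range(144)]               # 79 initial + 65 final embeddings
coherent = [(f"coherent_{i}", "embedding-free") for i in range(144)]
fillers = [(f"filler_{i}", "filler") for i in range(144)]

random.shuffle(carrier_items)
half = len(carrier_items) // 2

# Counterbalancing: an item heard in its supported context on list 1 is heard in its
# unsupported context on list 2, and vice versa; coherent items appear on both lists.
list1_exp = [(it, "supported") for it in carrier_items[:half]] + \
            [(it, "unsupported") for it in carrier_items[half:]] + coherent
list2_exp = [(it, "unsupported") for it in carrier_items[:half]] + \
            [(it, "supported") for it in carrier_items[half:]] + coherent

def make_blocks(experimental, filler_trials, n_blocks=6):
    experimental, filler_trials = experimental[:], filler_trials[:]
    random.shuffle(experimental)
    random.shuffle(filler_trials)
    blocks = []
    for b in range(n_blocks):
        block = experimental[b * 48:(b + 1) * 48] + filler_trials[b * 24:(b + 1) * 24]
        random.shuffle(block)                      # mix experimental and filler trials within the block
        blocks.append(block)
    return blocks

blocks_list1 = make_blocks(list1_exp, fillers)
print([len(b) for b in blocks_list1])              # six blocks of 72 trials each
```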

Procedure

During the experiment, participants sat in a comfortable chair in front of two loudspeakers. They were informed that they would hear a large number of unrelated sentences, and that their only task was to listen attentively to each sentence and to try to imagine the situation described.1 Each trial started with 1000 msec of silence, followed by the sentence. At 1000 msec after the offset of each sentence, a plus sign appeared in the middle of the screen for 2000 msec, after which the participant could press a button to play the next sentence. Participants were asked to sit still from the moment they had started a new sentence, to look at the middle of the screen, and to blink as little as possible. One block took, on average, 15 min and was followed by a break. Participants were informed that a short questionnaire would follow the experiment.

EEG Recording

The EEG was recorded from 30 silver–chloride electrodes mounted in an elastic cap at standard 10–20 locations (Fz, Cz, Pz, Oz, Fp1/2, F3/4, F7/8, F9/10, FC1/2, FC5/6, FT9/10, C3/4, T7/8, CP1/2, CP5/6, P3/4, P7/8), all referenced to the left mastoid, and with impedances below 5 kΩ. Signals were amplified with BrainAmps DC amplifiers (0.03–100 Hz band pass), digitized at 500 Hz, and re-referenced off-line to the mastoid average. Additional HEOG and VEOG signals were computed from F9 to F10 and from Fp1 to V1 (an electrode below the left eye), respectively. Then, EEG segments ranging from 500 msec before to 1600 msec after critical word onset were extracted and baseline corrected (by subtraction) to a 200-msec pre-onset baseline. Segments with potentials exceeding ±75 μV were rejected. If the total rejection rate exceeded 50% in any condition, the participant was excluded. Across the remaining 28 participants, the average segment loss was 19%, with no asymmetry across conditions. EEG segments were averaged per participant and condition. Because our hypotheses specifically involved the N400, repeated measures analyses of variance (ANOVAs) were conducted over mean amplitudes in the standard 300–500 msec latency range, or a subrange thereof, across all 16 posterior electrodes.
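For concreteness, the following sketch mirrors the segmentation, baseline correction, artifact rejection, and mean-amplitude steps described above on simulated data. The variable names, channel ordering, and data are hypothetical; this is not the authors' actual analysis pipeline.

```python
import numpy as np

# Segmentation and measurement sketch for one condition. Assumes `eeg` is a
# channels x samples array sampled at 500 Hz and `onsets` holds the sample indices
# of critical-word onsets (both hypothetical stand-ins here).
fs = 500
pre, post = int(0.5 * fs), int(1.6 * fs)          # -500 to +1600 msec around word onset
baseline = int(0.2 * fs)                          # 200-msec pre-onset baseline

def make_segments(eeg, onsets):
    segments = []
    for onset in onsets:
        seg = eeg[:, onset - pre:onset + post].astype(float)
        seg -= seg[:, pre - baseline:pre].mean(axis=1, keepdims=True)  # baseline correction by subtraction
        if np.abs(seg).max() <= 75.0:             # reject segments exceeding +/-75 microvolts
            segments.append(seg)
    return np.stack(segments)                     # trials x channels x samples

def mean_n400(segments, channel_idx, t_start=0.3, t_end=0.5):
    # mean amplitude in the 300-500 msec window, averaged over trials and selected channels
    a, b = pre + int(t_start * fs), pre + int(t_end * fs)
    return segments[:, channel_idx, a:b].mean()

# Example with random data standing in for a real recording (30 channels, 10 min at 500 Hz).
eeg = np.random.randn(30, 500 * 600)
onsets = np.arange(5000, 250000, 5000)
segs = make_segments(eeg, onsets)
# channel indices of the 16 electrodes entering the analysis (ordering is hypothetical)
print(segs.shape, mean_n400(segs, channel_idx=list(range(14, 30))))
```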

RESULTS

Initial Embeddings

Figure 1A displays the grand-average waveforms at nine electrodes for the carrier words with initial embeddings (plotted separately for the contextually supported and unsupported conditions) and for the coherent embedding-free words. Waveforms were time-locked to the onsets of the critical words (which also correspond to the onsets of the embeddings). As expected, the incoherent carrier words elicited a larger N400 than the coherent embedding-free words. An ANOVA with coherence (2) and electrodes (16) as within-subject factors indeed showed a main effect of coherence [F(1, 27) = 11.60, p = .002, prep = .99, ηp2 = .300] in the 300–500 msec window. This baseline finding reveals that our listeners were processing the sentences for meaning.

Figure 1. 

Grand-average ERPs from nine scalp sites to incoherent carrier words with initial embeddings that were supported by the context (solid line) or not supported by the context (dashed line) and to coherent embedding-free words (dotted line), after baseline correction in the 200-msec prestimulus interval, time-locked to (A) the onset of the carrier/embedding-free words, which corresponds to the onset of the initial embeddings, and (B) the offset of the initial embeddings. The time axis is in milliseconds (msec). Note that negative polarity is plotted upward. Waveforms are filtered (5 Hz high cutoff, 12 dB/oct) for presentation purposes only. The bars in the lower left corner show the offset of the embedded words (EWOFF) and the offset of the carrier words (CWOFF). The start of each bar corresponds to the minimal value, the end to the maximal value, and the middle to the mean.


More interestingly, the N400 elicited by the carrier words in the context supporting the embedding was consistently reduced relative to the N400 elicited by the same words presented in sentences not supporting the embedding. Because we considered the possibility that only part of the N400 may be modulated, we divided the 300–500 msec interval into two separate windows of analysis: 300–400 msec and 400–500 msec after the onset of the embedded word. The ANOVAs showed that the N400 difference was significant in the 400–500 msec window only [F(1, 27) = 4.88, p = .036, prep = .93, ηp2 = .153]. Thus, the N400 elicited by the carrier words is briefly modulated by the semantic fit of the initial embeddings, indicating that while listeners make sense of the incoming speech signal, they also take into account the meaning of initial-embedded words (e.g., pie in pirate), at least for a short period of time.
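A sketch of this window-wise repeated measures analysis on simulated per-participant mean amplitudes is given below, using statsmodels' AnovaRM. The real analysis additionally included electrode as a within-subject factor; all numbers here are simulated and the column names are placeholders.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Simulated per-participant mean N400 amplitudes (microvolts) per context condition
# and time window; a small negative shift is built in for the supported / 400-500 cell.
rng = np.random.default_rng(0)
n_subjects = 28
rows = []
for subj in range(n_subjects):
    for context in ("supported", "unsupported"):
        for window in ("300-400", "400-500"):
            effect = -0.8 if (context == "supported" and window == "400-500") else 0.0
            rows.append({"subject": subj, "context": context, "window": window,
                         "amplitude": rng.normal(loc=effect, scale=1.0)})
df = pd.DataFrame(rows)

# One repeated measures ANOVA per time window, with context as within-subject factor.
for window, sub in df.groupby("window"):
    res = AnovaRM(sub, depvar="amplitude", subject="subject", within=["context"]).fit()
    print(window)
    print(res.anova_table)
```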

After this initial reduction of the N400 for the carrier words in the context supporting the embeddings, there is a period during which these carrier words show a larger negative amplitude than the carrier words with unsupported embeddings. To examine the possibility that this effect was a delayed N400 effect, related to the moment at which there is more acoustic information in favor of the longer word (i.e., when the second syllable comes in), waveforms were time-locked to the offsets of the embedded words, as depicted in Figure 1B. The ANOVA showed that the observed difference was significant in the 400–500 msec window [F(1, 27) = 4.86, p = .036, prep = .93, ηp2 = .152].

In summary, we see two different effects for the carrier words with initial embeddings. First, and most critically, there is a reduction of the N400 time-locked to the onset of the carrier words (and the embeddings) in the context supporting the embedded word. Based on what is known about the N400, we take this as evidence that initial embeddings participate in some aspect of the initial sentence-level analysis of meaning. Second, this reduced N400 to carrier words with supported word-initial embeddings is followed by a larger negativity to the same words. This secondary differential ERP effect occurs some 400–500 msec after the offset of the embeddings, that is, the onset of the remainder of the carrier word, and might therefore be a delayed N400 effect related to the moment at which listeners have to deal with the carrier word in this context. We will return to this in the Discussion.

Final Embeddings

Figure 2A displays the grand-average waveforms for the carrier words with final embeddings and the embedding-free words, time-locked to the onsets of the critical words (note that now the onsets do not correspond to the onsets of the embedded words). As before, we found that the incoherent carrier words were more difficult to relate to the context than the coherent embedding-free words, as indicated by a standard N400 effect [F(1, 27) = 12.16, p = .002, prep = .99, ηp2 = .310].

Figure 2. 

Grand-average ERPs from nine scalp sites to incoherent carrier words with final embeddings that were supported by the context (solid line) or not supported by the context (dashed line) and to coherent embedding-free words (dotted line), after baseline correction in the 200-msec prestimulus interval, time-locked to (A) the onset of the carrier/embedding-free words and (B) the onset of the final embeddings. The time axis is in milliseconds (msec). Note that negative polarity is plotted upward. Waveforms are filtered (5 Hz high cutoff, 12 dB/oct) for presentation purposes only. The bars in the lower left corner indicate the onset of the embedded words (EWON), the offset of the embedded words (EWOFF), and the offset of the carrier words (CWOFF). The start of each bar corresponds to the minimal value, the end to the maximal value, and the middle to the mean.


To specifically examine the effect of the semantic fit of the final-embedded words, the grand-average waveforms were then time-locked to the onsets of the embeddings (e.g., to the onset of the word pain in champagne; see Figure 2B). The waveforms elicited by these words show a smaller N400 for the embedded words in the supporting context than in the unsupporting context, with a significant difference in the 300–400 msec window2 [F(1, 27) = 4.87, p = .036, prep = .93, ηp2 = .153], indicating that, like initial embeddings, final embeddings are also momentarily involved in some aspect of higher-level sentential sense-making. Given the unfavorable position of these words within their carrier words, this is quite a surprise. We will discuss this in more detail in the following section.

DISCUSSION

The purpose of this study was to examine whether listeners briefly take into account the meaning of initial and final embeddings when making sense of spoken language. We presented carrier words with initial or final embeddings in sentences whose meaning, at the point of the carrier word, either did or did not support the embedded word. The resulting N400 effects unequivocally demonstrate that words unintentionally embedded in other words briefly take part in the incremental sense-making process.

In the case of carrier words with an initial embedding, for example, pie in pirate, listeners briefly consider the meaning of pie when the context favors this interpretation, despite the fact that this syllable was produced as part of the longer word pirate and, therefore, contains acoustic cues in favor of the longer word (see Salverda et al., 2003; Davis et al., 2002). Thus, although the monosyllabic word is acoustically not the best matching lexical candidate, and although this mismatch is known to affect the lexical competition mechanism to some extent, listeners do still retrieve the meaning of the embedded word and try to relate it to the sentential context.

Remarkably, not only initial embeddings but also final embeddings are taken into account. Compared with an initial embedding, a word embedded at the offset of a longer word (such as pain in champagne) is in a less favorable position, because the initial part of the carrier word already supports the lexical representation of that longer word before the embedding comes along. Also, an interpretation involving the embedded word leaves the initial part of the longer word (e.g., cham_) unaccounted for. Nevertheless, our results show that listeners also briefly consider the meaning of a final-embedded word such as pain when making sense of the input. This converges with the evidence for the momentary interpretation of embedded words obtained with initial embeddings. In addition, it indicates that the sense-making system allows for interpretations that require passing over considerable portions of the input (the duration of the preceding part was, on average, 187 msec). Furthermore, in contrast to what is suggested by some previous word recognition studies (e.g., Norris et al., 2006; Shatzman, 2006; Gow & Gordon, 1995; Marslen-Wilson et al., 1994), it shows that the initial part of a carrier word is not capable of blocking the activation of a word embedded at its offset.

In all, what we see is that despite their unfavorable acoustic realization and positioning, both initial and final embeddings have the opportunity to “get through” to a higher level of the comprehension system, where word-level meaning is related to the wider communicative context. We believe that this is an unintended, but unavoidable, consequence of a highly incremental comprehension system designed to be fast and reliable as well as robust, that is, capable of extracting meaning under slightly suboptimal acoustic conditions. Such an analysis is in line with a more general perspective on comprehension as involving incremental and parallel multiple constraint satisfaction at various levels of analysis (e.g., acoustic, phonological, syntactic, conceptual), originally proposed in the domain of syntactic ambiguity resolution (Tanenhaus & Trueswell, 1995; MacDonald, Pearlmutter, & Seidenberg, 1994), and later embedded in a broader analysis of the language system (Jackendoff, 2002, 2007). In particular, what this broader perspective allows us to understand is how the system can temporarily pursue multiple analyses of the same input (e.g., “While Clare was waiting at the bakery she eagerly looked at the pie/pirate…”), and how certain “attractors” within one of these analyses (e.g., the semantic attraction of having a pie, rather than a pirate, in a bakery context) can temporarily lead the system to set aside suboptimalities elsewhere in the same analysis (e.g., the fact that the first syllable in pirate is a somewhat suboptimal realization of the single word pie).

Our core findings raise a number of interesting issues. First, an important open question is whether listeners try to integrate the meaning of the embeddings simply because the word that is actually spoken makes little sense here while the embedded word does fit, or whether they instead take embeddings into account more generally, as a consequence of the architecture of their comprehension system. We are currently addressing this in follow-up research, and preliminary findings indicate that the latter is the case. Note, however, that any explanation that depends on how well the carrier word fits the context relative to the embedded word presupposes that not only the former but also the latter is related to the context, which is exactly what we propose here. Furthermore, even though the word pirate may be very unexpected in a context such as “While Clare was waiting at the bakery she eagerly looked at the pirate on the film poster,” the fact is that we do sometimes encounter such sentences, and clearly understand them. In other words, the type of input studied here is not unusual, and our comprehension system can simply handle it.

A second issue left open by our findings has to do with the fact that all embedded words in the experiment were aligned with a syllable boundary and carried primary stress. We selected this type of embedding because previous studies have suggested that stressed and aligned embeddings are the most likely to be activated (e.g., Vroomen & de Gelder, 1997), making them the most plausible embeddings to be taken into account during sense-making. Whether listeners also try to relate the meaning of unaligned embeddings such as fee in feet or unstressed embeddings such as dough in meadow to the context remains an empirical issue.

A third issue concerns the nature of the N400. Although our N400 results hinge on early processes that relate word meaning to the sentential context, they do not directly speak to what the nature of this initial sense-making is. This is because there are two slightly different accounts of what exactly the N400 reflects: the ease or difficulty of semantically integrating elements of meaning into a larger whole (e.g., Chwilla, Kolk, & Mulder, 2000; Brown & Hagoort, 1993; Holcomb, 1993), or the ease or difficulty of retrieving the meaning of a particular word from memory, given the particular context (e.g., Van Berkum, 2009; Kutas et al., 2006; Kutas & Federmeier, 2000). Our findings are compatible with either account. Of course, one may ask whether, in actual processing, semantic integration and context-dependent retrieval can be sensibly distinguished from one another; perhaps these are two sides of the same coin (see Coulson & Federmeier, in press). Independently of how this specific debate is resolved, though, our findings clearly show that words embedded in other words are semantically related to the wider sentence context.

When designing the experiment, we had not considered the possibility of an increased negativity following the initial N400 reduction to initial embeddings (e.g., pie in pirate), and our interpretation of this secondary finding as a delayed N400 effect remains speculative. It is conceivable that when more and more acoustic information in favor of the actually produced word comes in (e.g., the second syllable of pirate), listeners briefly run into trouble because, after just having considered the context-supported meaning of pie, it is extra hard to relate the meaning of pirate to the context. This may not be the only explanation for this second effect. But note that both the initial reduction of the N400 and the presence of this delayed increased negativity must, one way or another, be caused by the semantic involvement of the embedding, as they both hinge on how the embedding relates to the sentential context.

In the light of the current results, our perspective is as follows. During spoken language comprehension, listeners are faced with the challenge of dividing the speech signal into sensible chunks, taking into account the acoustic match with the lexicon and the most likely combination of meanings, while the acoustic signal unfolds in time. They cannot wait until all pieces are on the table and then start combining them until nothing is left and everything makes sense. Instead, listeners solve the puzzle in a measured and flexible fashion, by constantly calculating the goodness-of-fit between the acoustic input and the lexicon, by exercising a certain degree of tolerance with respect to suboptimal signal fit, and by constantly initiating and updating possible interpretations using all available information as soon as they can. As a result, words that are coincidentally embedded in the words uttered by the speaker briefly participate in the sense-making process. Thus, although embedded words are not meant by the speaker, they do mean something to the listener.

Acknowledgments

This research was supported by an NWO Innovation Impulse Veni grant to P. v. A. (Veni Grant 016.054.071) and a Vidi grant to J. v. B. (Vidi Grant 016.008.021). We thank Alan Langus, Femke van der Meulen, and four anonymous reviewers for their help.

Reprint requests should be sent to Petra M. van Alphen, Rathenau Institute, PO Box 95366, 2509 CJ The Hague, The Netherlands, or via e-mail: P.vanAlphen@rathenau.nl.

Notes

1. We assume that participants try to interpret sentences even when we do not force them to by means of comprehension questions or a secondary decision task. This assumption is supported by the fact that many similarly task-less experiments have elicited ERP effects that are difficult to explain otherwise. Particularly relevant here are studies in which subtle sentence- or discourse-dependent referential manipulations elicit robust ERP effects (e.g., see Van Berkum, Koornneef, Otten, & Nieuwland, 2007, for a review), as well as semantic prediction and integration studies that controlled for simple word–word priming (so that the observed ERP effects must hinge on nontrivial compositional sense-making; e.g., Otten & Van Berkum, 2007, 2008; Otten, Nieuwland, & Van Berkum, 2007; see also Ditman, Holcomb, & Kuperberg, 2007, for a comparable result).

2. Note that the effect for the final embeddings appears earlier than for the initial embeddings, even though in both cases the ERPs were time-locked to the onset of the embeddings. However, the final embeddings started in the middle of the word, whereas the initial embeddings were preceded by a word boundary. Because coarticulatory cues are usually stronger within words than between words, the final embeddings may have been recognized earlier (relative to their onset) than initial embeddings.

REFERENCES

Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. (1998). Tracking the time course of spoken word recognition using eye movements: Evidence for continuous mapping models. Journal of Memory and Language, 38, 419–439.
Brown, C., & Hagoort, P. (1993). The processing nature of the N400: Evidence from masked priming. Journal of Cognitive Neuroscience, 5, 34–44.
Chwilla, D. J., Kolk, H. H. J., & Mulder, G. (2000). Mediated priming in the lexical decision task: Evidence from event-related potentials and reaction time. Journal of Memory and Language, 42, 314–341.
Coulson, S., & Federmeier, K. D. (in press). Words in context: ERPs and the lexical/postlexical distinction. Journal of Psycholinguistic Research.
Davis, M. H., Marslen-Wilson, W. D., & Gaskell, M. G. (2002). Leading up the lexical garden-path: Segmentation and ambiguity in spoken word recognition. Journal of Experimental Psychology: Human Perception and Performance, 28, 218–244.
Ditman, T., Holcomb, P. J., & Kuperberg, G. R. (2007). The contributions of lexico-semantic and discourse information to the resolution of ambiguous categorical anaphors. Language and Cognitive Processes, 22, 793–827.
Gow, D. W., & Gordon, P. C. (1995). Lexical and prelexical influences on word segmentation: Evidence from priming. Journal of Experimental Psychology: Human Perception and Performance, 21, 344–359.
Holcomb, P. J. (1993). Semantic priming and stimulus degradation: Implications for the role of the N400 in language processing. Psychophysiology, 30, 47–61.
Isel, F., & Bacri, N. (1999). Spoken-word recognition: The access to embedded words. Brain and Language, 68, 61–67.
Jackendoff, R. (2002). Foundations of language: Brain, meaning, grammar, evolution. Oxford: Oxford University Press.
Jackendoff, R. (2007). A parallel architecture perspective on language processing. Brain Research, 1146, 2–22.
Kutas, M., & Federmeier, K. D. (2000). Electrophysiology reveals semantic memory use in language comprehension. Trends in Cognitive Sciences, 4, 463–470.
Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207, 203–205.
Kutas, M., Van Petten, C., & Kluender, R. (2006). Psycholinguistics electrified II: 1994–2005. In M. Traxler & M. A. Gernsbacher (Eds.), Handbook of psycholinguistics (2nd ed., pp. 659–724). New York: Elsevier.
Luce, P. A., & Cluff, M. S. (1998). Delayed commitment in spoken word recognition: Evidence from cross-modal priming. Perception & Psychophysics, 60, 484–490.
MacDonald, M. C., Pearlmutter, N. J., & Seidenberg, M. S. (1994). Lexical nature of syntactic ambiguity resolution. Psychological Review, 101, 676–703.
Marslen-Wilson, W. D. (1987). Functional parallelism in spoken word-recognition. Cognition, 25, 71–102.
Marslen-Wilson, W. D., Tyler, L. K., Waksler, R., & Older, L. (1994). Morphology and meaning in the English mental lexicon. Psychological Review, 101, 3–33.
McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1–86.
McQueen, J. M., Cutler, A., Briscoe, T., & Norris, D. (1995). Models of continuous speech recognition and the contents of the vocabulary. Language and Cognitive Processes, 10, 309–331.
McQueen, J. M., Norris, D., & Cutler, A. (1994). Competition in spoken word recognition: Spotting words in other words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 621–638.
Norris, D. (1994). Shortlist: A connectionist model of continuous speech recognition. Cognition, 52, 189–234.
Norris, D., Cutler, A., McQueen, J. M., & Butterfield, S. (2006). Phonological and conceptual activation in speech comprehension. Cognitive Psychology, 53, 146–193.
Otten, M., Nieuwland, M. S., & Van Berkum, J. J. A. (2007). Great expectations: Specific lexical anticipation influences the processing of spoken language. BMC Neuroscience, 8, 89.
Otten, M., & Van Berkum, J. J. A. (2007). What makes a discourse constraining? Comparing the effects of discourse message and scenario fit on the discourse-dependent N400 effect. Brain Research, 1153, 166–177.
Otten, M., & Van Berkum, J. J. A. (2008). Discourse-based word anticipation during language processing: Prediction or priming? Discourse Processes, 45, 464–496.
Salverda, A. P., Dahan, D., & McQueen, J. M. (2003). The role of prosodic boundaries in the resolution of lexical embedding in speech comprehension. Cognition, 90, 51–89.
Shatzman, K. B. (2006). Sensitivity to detailed acoustic information in word recognition (MPI Series in Psycholinguistics No. 37). Doctoral dissertation, University of Nijmegen, Nijmegen, the Netherlands.
Shillcock, R. C. (1990). Lexical hypotheses in continuous speech. In G. T. M. Altmann (Ed.), Cognitive models of speech processing: Psycholinguistic and computational perspectives (pp. 24–49). Cambridge, MA: MIT Press.
Tanenhaus, M. K., & Trueswell, C. (1995). Sentence comprehension. In J. L. Miller & P. D. Eimas (Eds.), Speech, language, and communication (pp. 217–262). San Diego, CA: Academic Press.
Van Berkum, J. J. A. (2009). The neuropragmatics of “simple” utterance comprehension: An ERP review. In U. Sauerland & K. Yatsushiro (Eds.), Semantics and pragmatics: From experiment to theory (pp. 276–316). Basingstoke: Palgrave Macmillan.
Van Berkum, J. J. A., Koornneef, A. W., Otten, M., & Nieuwland, M. S. (2007). Establishing reference in language comprehension: An electrophysiological perspective. Brain Research, 1146, 158–171.
Van den Brink, D., Brown, C. M., & Hagoort, P. (2001). Electrophysiological evidence for early contextual influences during spoken-word recognition: N200 versus N400 effects. Journal of Cognitive Neuroscience, 13, 967–985.
Van den Brink, D., Brown, C. M., & Hagoort, P. (2006). The cascaded nature of lexical selection and integration in auditory sentence processing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 364–372.
Van Petten, C., Coulson, S., Rubin, S., Plante, E., & Parks, M. (1999). Time course of word identification and semantic integration in spoken language. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 394–417.
Vroomen, J., & de Gelder, B. (1997). Activation of embedded words in spoken word recognition. Journal of Experimental Psychology: Human Perception and Performance, 23, 710–720.
Zwitserlood, P. (1989). The locus of the effects of sentential-semantic context in spoken word processing. Cognition, 32, 25–64.