Abstract

Many studies refer to the relevance of metric cues in speech segmentation during language acquisition and adult language processing. However, the on-line use (i.e., time-locking the unfolding of a sentence to EEG) of metric stress patterns that are manifested by the succession of stressed and unstressed syllables during auditory syntactic processing has not been investigated. This is surprising as both processes rely on abstract rules that allow the building up of expectancies of which element will occur next and at which point in time. Participants listened to metrically regular sentences that could either be correct, syntactically incorrect, metrically incorrect, or doubly incorrect. They either judged syntactic correctness or metric homogeneity in two different sessions. We provide first event-related potential evidence that the metric structure of a given language is processed in two stages as evidenced in a biphasic pattern of an early frontal negativity and a late posterior positivity. This pattern is comparable to the biphasic pattern reported in syntactic processing. However, metric cues are processed earlier than syntactic cues during the first stage (LAN), whereas both processes seem to interact at a later integrational stage (P600). The present results substantiate the important impact of metric cues during auditory syntactic language processing.

INTRODUCTION

Successful perception and, consequently, comprehension of a specific speech signal is an ambitious challenge that we handle daily and effortlessly. However, imagine being confronted with a foreign language that you have never heard before. In this case, speech is merely a continuous noise stream and it is nearly impossible to extract a single word or to understand an utterance. Thus, successful auditory language comprehension requires that a listener can segment and, in turn, sequence an incoming acoustic signal. We use the term segmenting to describe the process of extracting words from a continuous acoustic signal (e.g., Mattys & Samuel, 1997). Therefore, segmenting is a precursor of sequencing an utterance, that is, predicting the order of future signals (“What next?”; Large & Kolen, 1994). In line with others (e.g., Mueller, Bahlmann, & Friederici, 2008; Chang, Dell, & Bock, 2006; Berridge, Aldridge, Houchard, & Zhuang, 2005; Hoen & Dominey, 2000), we argue that syntactic processing is a form of sequencing. The main question we address is: How do we manage to sequence complex acoustic information such as speech without any obvious problems? Seminal ERP work showed that prosody plays an important role in this context (e.g., Eckstein & Friederici, 2006; Pannekamp, Toepel, Alter, Hahne, & Friederici, 2005; Steinhauer, 2003; Steinhauer, Alter, & Friederici, 1999). These studies revealed that intonational patterns are immediately applied and have a high impact on successful sentence processing. However, according to the prosodic hierarchy proposed by Selkirk (1986), intonational phrases are located at a very high level of sentence prosody, whereas feet (metric stress patterns) are linked to a lower level of prosody. In the current study, we concentrated on this lower prosodic level, that is, the metric pattern of a given sentence (i.e., the succession of stressed and unstressed syllables). 
We predict that the underlying metric structure of a given language is a prime candidate to facilitate syntactic sequencing. This should be the case if sequencing goes hand in hand with the precursor segmentation as has been previously discussed (e.g., Quené & Koster, 1998; Cutler, 1994; Quené, 1993; Slowiaczek, 1990; Cutler & Carter, 1987; Mehler, Dommergues, Frauenfelder, & Segui, 1981). Thus, successful segmentation of the incoming speech stream and, consequently, extracting single words is mandatory for syntactic sequencing. According to Large and Kolen (1994), meter is defined as a temporal processing system that allows predicting when future events are likely to occur (“When next?”). Thus, we define meter at an abstract level as a regular occurrence of beats, where beats are perceived pulses marking regularly distributed points in time (see Lerdahl & Jackendoff, 1983). In languages such as German or English, beats may be realized as the distribution of stressed syllables in a sentence as stressed syllables mark prominent acoustic features (Lee & Todd, 2004; Cummins & Port, 1998; Slowiaczek, 1990; Cutler & Norris, 1988; Cutler & Carter, 1987; Kohler, 1982; Mehler et al., 1981). Thus, we assume that the meter of a certain language is a critical cue for segmentation and, in turn, for sequencing language input.

Language Rhythms

Languages of the world are associated with different speech inherent rhythms. Originally, Abercrombie (1967) proposed that stress pulses recur at equal time points in stress-timed languages, whereas syllable-timed languages are characterized by isochronally distributed syllables. Although the original “isochrony hypothesis” has been rejected, it is indisputable that languages of the world have different underlying speech rhythms (Nazzi & Ramus, 2003; Auer, 1993). In addition, Lee and Todd (2004) have noted that syllables of stress-timed languages such as German or English show greater variability in auditory prominence (influenced by intensity, duration, and frequency) than syllables in syllable-timed languages (such as French). Thus, the subdivision between stress- and syllable-timed languages has been maintained, although the distinction between the two language groups on the basis of physical parameters (i.e., constant duration between two subsequent stressed syllables in stress-timed languages) seems to be more complicated than initially assumed (see Auer & Uhmann, 1988; Roach, 1982).

Stress-timed Languages

German belongs to the group of stress-timed languages, and thus, offers a prominent alternation of stressed and unstressed syllables. The trochee1 as a pattern of stressed and unstressed syllables is considered to be the default meter in German (Féry, 1997; Eisenberg, 1991). As such, it plays a significant role in grouping an incoming speech stream into smaller units, and consequently, in facilitating speech processing.2 Evidence in support for this concept comes from language acquisition (Nazzi & Ramus, 2003; Sansavini, 1997), differentiation of languages (Ramus, 2002; Ramus, Hauser, Miller, Morris, & Mehler, 2000), working memory cost (Saito & Ishio, 1998), phoneme monitoring (Cutler, 1976), syllable monitoring (Gow & Gordon, 1993), and embedded word monitoring (Cutler & Norris, 1988). In stress-timed languages, meter offers an important doorway to auditory language comprehension during language acquisition (Jusczyk, 1999). Thus, the central function of meter and stressed syllables in segmentation and sequencing an incoming speech input is indisputable (Sanders & Neville, 2000; Norris, McQueen, & Cutler, 1995; Cutler, 1994; McQueen, Norris, & Cutler, 1994; Cutler & Butterfield, 1992; Cutler & Norris, 1988).

The aim of the current experiment was to investigate the impact of metric structure by means of predictable syllable stress on auditory syntactic processing. Gasser, Eck, and Port (1999) stated that “Meter is a skill, manifested as a particular mechanism, a means by which signals are processed, guided by underlying tendencies toward periodicity and integral relationships between periodicities. This mechanism self-organizes to discover and reproduce the temporal regularities in the input.” Consequently, meter is involved in a number of cognitive systems that require a detailed temporal analysis of an input and a resulting output. However, why should meter interact with syntax during auditory language processing?

Similarities between Meter and Syntax

Meter and syntax share similar structural properties (Patel, Gibson, Ratner, Besson, & Holcomb, 1998) as both processes rely on the building up of expectancies concerning the continuation in a speech stream. Although metric and syntactic structure cannot be reduced or derived from one another (Jackendoff, 2002), both are “rule-based” systems. Obviously, other rule-based systems do exist. However, metric and syntactic rules share a high degree of abstraction. Consider the following German syntax example: As soon as one hears an article (Art), the listener expects the next incoming element to be a nominal phrase (NP). Thus, the rule is modeled as “Art → NP” that guides the listener's expectancy. As such, syntactic structures allow predictions on the basis of normative rules and impart predictability to the sequence as a whole (Chang et al., 2006). This is not the case for semantics where semantic associations are dependent on the individual world knowledge of a listener. Thus, to some degree, semantic associations do rely on interindividual differences, whereas syntactic predictions do not.

In meter perception, a listener focuses attention to time points in the speech signal at which salient stressed syllables are expected to occur. This expectancy is derived from the timing properties of the speech signal (Quené & Port, 2005). This is in line with the “attentional bounce hypothesis” (Pitt & Samuel, 1990), where attention is thought of as moving from one stressed syllable to the next. Thus, expectancy allows the listener to focus attention on relevant aspects in the continuous speech stream. Therefore, both meter and syntax support the structuring of a continuous speech stream. As successful metric processing allows predicting when the next element occurs, a metric pattern serves as a “framework” that enables the listener to sequence linguistic input and to build-up syntactic hierarchies (Large & Kolen, 1994). However, it is important to note that a metric error is only noticeable in the auditory domain. Consider a native accent of a second language (L2) speaker. If a native accent persistently overlays L2 speech production, it may become rather difficult to understand the speaker even if she/he has an advanced L2 level of proficiency. If the same person writes an email, one will understand her/him without a problem as no foreign accent overlays the written text. Thus, an incorrect metric structure (e.g., stress shifting) is critical in auditory language processing, but may not be relevant for visual language processing.

Previous ERP Evidence

Many sentence processing studies have investigated the function of syntactic cues (e.g., Deutsch & Bentin, 2001; Gunter, Friederici, & Schriefers, 2000; Hahne & Friederici, 1999; Coulson, King, & Kutas, 1998; Neville, Nicol, Barss, Forster, & Garrett, 1991), but very little evidence exists for the on-line use of metric cues in auditory sentence processing (Dohmas, Wiese, Bornkessel-Schlesewsky, & Schlesewsky, 2008; Magne et al., 2007). Furthermore, to our knowledge, there are no published ERP studies that investigated the interplay of metric and syntactic cues in auditory language processing.

Various ERP studies have confirmed a biphasic ERP pattern elicited during morphosyntactic processing: an anterior negativity (LAN = left anterior negativity) and a late posterior positivity (P600) (e.g., Friederici & Kotz, 2003; Deutsch & Bentin, 2001; Gunter et al., 2000; Friederici, 1999; Hahne & Friederici, 1999; Neville et al., 1991). Due to the early onset of the LAN (∼300–500 msec after onset of the critical item), syntactic structure has been interpreted as a process supporting correct sound-to-meaning mapping (Friederici, 1995). However, it has to be stated that the LAN may not be syntax specific. Some studies have shown that the LAN may be linked to working memory load (King & Kutas, 1995), reflecting a general sequencing phenomenon (Hoen & Dominey, 2000) or incorrect tool use (Bach, 2005).

The P600 has been interpreted as a correlate of syntactic reanalysis/repair (e.g., Friederici, Hahne, & Saddy, 2002; Friederici, Steinhauer, & Frisch, 1999) or late integration (Kaan & Swaab, 2003) in language comprehension. However, ERP studies investigating metric processing in language (Dohmas et al., 2008; Magne et al., 2007), as well as in music and tones (Abecasis, Brochard, Granot, & Drake, 2005; Brochard, Abecasis, Potter, Ragot, & Drake, 2003; Besson & Faita, 1995), have reported a similar biphasic ERP pattern in response to metric violations. However, none of these studies have investigated the use of metric and syntactic cues in parallel. Thus, it remains to be tested (i) which negativity (metric or syntactic) deflects earlier, and (ii) whether metric and syntactic information interact in the P600 as this component may reflect a late integrational stage in language comprehension (Kaan & Swaab, 2003). As both meter and syntax are rule-based and structure language input, we predict that successful syntactic reanalysis/integration requires metric competencies if both processes go hand in hand.

The Present Study

We investigated metrical and syntactical violations in parallel to (i) establish electrophysiological correlates of metric processing in German, (ii) test whether metric cues precede syntactic cues or vice versa during language comprehension, and (iii) test whether meter and syntax interact during auditory language processing. Of particular interest is the P600 as this component has been linked to syntactic, metric, and prosodic violations in language processing (Eckstein, 2007; Magne et al., 2007).

Previous studies provide evidence that morphosyntactic violations in various languages evoke a LAN (e.g., Silva-Pereyra & Carreiras, 2007; Metz-Lutz, Otzenberger, Gounot, & Jaffre, 2006; Friederici, 2004; Allen, Badecker, & Osterhout, 2003; Coulson et al., 1998) as a first correlate of error-related morphosyntactic processing. Others reported “metric” negativities between 300 and 400 msec, and a fronto-central or left-lateralized distribution (Magne et al., 2007; Böcker, Bastiaansen, Vroomen, Brunia, & Gelder, 1999). The onset and latency of the metric negativity give first evidence that metric cues may be used prior to syntactic cues (i.e., inflectional features). If meter pushes syntactic processing, then we expect that a metric negativity is deflected earlier than a syntactic LAN.

Additionally, a change in a metric pattern (trochaic to iambic) demands a readjustment of the expected metric structure that is reflected in a P600 analogous to a positivity reported by Magne et al. (2007) and Besson and Faita (1995). If metric violations elicit a P600-like ERP response, such a result would contribute to the currently hotly debated domain-specific nature of the P600 (for recent reviews, see Kuperberg, 2007; van Herten, Kolk, & Chwilla, 2005). As previous data have shown, the P600 is not only evoked by syntactic deviations but also by certain semantic deviations (Osterhout, Kim, & Kuperberg, 2007; Kaan & Swaab, 2003). Thus, the present study should clarify whether meter needs to be considered next to semantics and syntax. Second, if meter and syntax rely on similar structural principles (i.e., normative rules that are independent of individual world-knowledge), we predict that metric and syntactic P600 effects should show a similar morphology. Lastly, if syntax and meter were to interact, we would predict that the amplitude of the P600 elicited by a double violation should not be larger than in any of the single violation conditions. If the two processes operate independently, the amplitude of the P600 in the double violation should be equal to the sum of the P600s in the single violation conditions (Gondan & Röder, 2006; Barth, 1995): Double = Meter + Syntax, or equivalently, Double − (Meter + Syntax) = 0. Thus, independent neural generators have additive effects on the amplitude of an ERP component. If, in turn, two processes interact, the amplitude of the double violation condition should be over- or underadditive: Double − (Meter + Syntax) ≠ 0. Consequently, an additive effect in the double violation condition would indicate that meter and syntax processing relate to distinct neural processes, whereas over- or underadditivity would suggest an interaction of meter and syntax.
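The additivity logic can be illustrated with a minimal sketch. The function names, the amplitude values, and the tolerance are hypothetical and serve only to make the three possible outcomes explicit; in the actual analysis, the decision rests on a statistical test rather than on a fixed tolerance.

```python
def additivity_residual(double_uv, meter_uv, syntax_uv):
    """Return Double - (Meter + Syntax) in microvolts."""
    return double_uv - (meter_uv + syntax_uv)


def classify_additivity(residual_uv, tolerance_uv=0.1):
    """Label a residual as additive, overadditive, or underadditive.

    tolerance_uv is a hypothetical stand-in for a statistical test.
    """
    if abs(residual_uv) <= tolerance_uv:
        return "additive"
    return "overadditive" if residual_uv > 0 else "underadditive"


# Hypothetical P600 effects (violation minus correct), in microvolts.
residual = additivity_residual(double_uv=1.5, meter_uv=1.4, syntax_uv=1.0)
label = classify_additivity(residual)
print(round(residual, 2), label)
```

A residual near zero would point to independent processes, whereas a clearly negative or positive residual would point to an interaction.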

In sum, the following key questions motivated the current experiment:

  • How relevant is metric competence in auditory language processing, especially during syntactic processing, and which ERP components reflect metric processing in German?

  • As both processes are based on abstract rules, do they interact during sentence processing?

METHODS

Participants

Twenty-four (12 women) right-handed students as determined by the Edinburgh Inventory (Oldfield, 1971), aged 21 to 29 years (mean age = 25.2 years, SD = 2.5), were tested. All participants were native speakers of German, had normal or corrected-to-normal vision, and were high-span readers according to the Reading Span Test3 (equal or higher than 3.5). None of the participants had any neurological impairment or hearing deficit.

Materials

To reveal a possible interaction between meter and syntax and to verify ERP correlates of metric violations during auditory language processing, stimuli with a consistent trochaic pattern were constructed. These sentences contained either a metric, a syntactic, or a double (metric as well as syntactic) violation next to correct sentences (see Table 1).

Table 1.

Experimental Conditions

Condition             Example
Correct               'Vera 'hätte 'Christoph 'gestern 'morgen 'duzen 'können.
                      Vera could have addressed Christoph informally yesterday morning
Metric violation      'Detlef 'hätte 'Franzi 'gestern 'morgen du'ZEN 'können.
                      Detlef could have addressed Franzi informally yesterday morning
Syntactic violation   'Wilma 'hätte 'David 'gestern 'morgen 'duzte 'können.
                      Wilma could have address David informally yesterday morning
Double violation      'Hermann 'hätte 'Anke 'gestern 'morgen duz'TE 'können.
                      Hermann could have address Anke informally yesterday morning

In terms of the syntactic condition, we are aware of the fact that the difference between “duzen” and “duzte” can also be interpreted as a morphological violation. However, from a theoretical linguistic point of view, the inflected verb (auxiliary) is located in the C⁰ position in verb-second languages such as German (e.g., Grewendorf, 2002), whereas inflectional features have already been checked in the I⁰ position. Therefore, it is impossible to have any inflectional features (left) at the critical sentence position (V⁰ position).

To avoid sentence-final wrap-up effects (Frisch, 2000), the penultimate word in a sentence was the critical item. All verbs as well as the preceding adverbs were matched in word frequency (according to the Wortschatz Lexikon4).

Production of Stimuli

Sentences were spoken by a professional native female speaker of German at a normal speech rate and digitally recorded with a 16-bit resolution and a sampling rate of 44.1 kHz. The speaker was instructed to avoid overemphasizing the regular stress pattern. In order to familiarize the speaker with the incorrect stress pattern in the metrically and the doubly violated sentences, three-word phrases were constructed, with the first word adhering to iambic patterns to facilitate the speaker's access to unexpected metrical words. The last two words remained the same as in the original sentence. Then, the two verbs were cut out of the signal and were inserted into the original sentence. To avoid coarticulatory deviations, the iambic word ended with the same consonant–vowel combination as the adverb before the critical item in the original sentence. The same procedure was performed for the correctly pronounced sentences in which the first word of the three-word phrases was trochaic. This extensive splicing procedure was conducted to avoid coarticulatory artifacts with respect to incorrectly stressed items. In Figure 1, exemplary pitch contours of the different experimental conditions are plotted. Pitch patterns are similar for all conditions up to the critical item. The pitch contour of the critical verb in the metric condition proceeds in the opposite direction of the syntactically violated and correctly inflected verb.

Figure 1. 

Exemplary pitch contours of critical sentence fragments: correct condition (black) and metric violation (red).

Procedure

As several studies report that the P600 depends on probability, salience, and task relevance (e.g., Coulson et al., 1998), we used two different tasks, a metric task and a syntactic task. In each session (task), 50% of the trials required an “incorrect” judgment. This allowed testing whether responses to specific violations vary as a function of explicit and implicit processing demands. We predicted that purely automatic potentials are unaffected by task demands, whereas attentionally controlled components are affected (e.g., Tokowicz & MacWhinney, 2005).

In one session, participants were asked to evaluate the metrical homogeneity of each sentence, whereas in the other session they judged grammatical correctness. The order of the tasks was counterbalanced across participants. Each trial was introduced by a visual cue (star) at the center of a computer screen. At 2000 msec after the offset of the presented stimulus, participants were asked to perform the respective judgment. The next trial started 2000 msec after the participant's response (button press, counterbalanced for correct and incorrect). All 208 experimental sentences (52 per condition) were presented auditorily via two loudspeakers in pseudorandomized order. The experimental trials were presented in four blocks of approximately 8 min each. After the second block, participants were offered a break of 5 min. Participants were tested in a dimly illuminated sound-attenuating booth, were seated in a comfortable reclining chair, and were instructed to move and blink as little as possible.

Electrophysiological Recording

The EEG was recorded from 59 scalp sites by means of Ag/AgCl electrodes mounted in an elastic cap (Electro-Cap, n.d.) according to the 10–20 International System (cf. American Electroencephalographic Society, 1991). The sternum served as ground, and the left mastoid as on-line reference (recordings were re-referenced to averaged mastoids off-line). Electrode impedances were kept below 3 kΩ. In order to control for eye movements, a horizontal and a vertical EOG were recorded. EEG and EOG signals were digitized on-line with a sampling frequency of 500 Hz. An anti-aliasing filter of 135 Hz was applied during recording.
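The off-line re-referencing step can be sketched as follows. The sketch assumes that the right mastoid was recorded as a regular channel against the left-mastoid reference (the text does not state this explicitly); the function and variable names are hypothetical. Because each channel then holds (site − left mastoid), subtracting half of the right-mastoid channel yields site − (left + right) / 2.

```python
def rereference_to_averaged_mastoids(channel_uv, right_mastoid_uv):
    """Re-reference one channel from left mastoid to averaged mastoids.

    Both inputs are sample lists recorded against the left mastoid.
    """
    if len(channel_uv) != len(right_mastoid_uv):
        raise ValueError("channel and mastoid must have equal length")
    return [x - 0.5 * m for x, m in zip(channel_uv, right_mastoid_uv)]


# Toy samples in microvolts (hypothetical values).
fz = [2.0, 4.0, -1.0]
right_mastoid = [1.0, -2.0, 0.0]
fz_reref = rereference_to_averaged_mastoids(fz, right_mastoid)
print(fz_reref)
```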

Data Analyses

Individual EEG recordings were scanned for artifacts such as electrode drifting, amplifier blocking, muscle artifacts, eye movements, or blinks by means of a rejection algorithm as well as on the basis of visual inspection. Epochs extended from 100 msec before onset of the critical item (main verb) up to 2500 msec after the critical item. Contaminated trials, as well as incorrectly answered trials, were rejected. In the metric task, 19% of all trials were excluded, and in the syntactic task, 14%. The remaining trials (about 42 and 45 per condition, respectively) were averaged per participant, condition, and electrode site, using a 100-msec prestimulus baseline, as acoustic analyses revealed no significant differences between conditions in this time frame. For graphical display only, data were filtered off-line with a 7-Hz low-pass filter. All statistical evaluations were carried out on unfiltered ERP data.
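The epoching and baseline correction described above can be sketched in a few lines. This is an illustration only: it assumes that the 2500 msec are counted from the onset of the critical item, and all function names and the toy signal are hypothetical. At a sampling frequency of 500 Hz, the 100-msec baseline corresponds to 50 samples and the poststimulus interval to 1250 samples.

```python
SAMPLING_RATE_HZ = 500  # sampling frequency reported for the recording
PRESTIM_MS = 100        # prestimulus baseline
POSTSTIM_MS = 2500      # assumed to be counted from item onset


def ms_to_samples(duration_ms, sampling_rate_hz=SAMPLING_RATE_HZ):
    """Convert a duration in msec to a number of samples."""
    return round(duration_ms * sampling_rate_hz / 1000)


def extract_epoch(signal_uv, onset_index):
    """Cut one epoch around the onset sample of the critical item."""
    start = onset_index - ms_to_samples(PRESTIM_MS)
    stop = onset_index + ms_to_samples(POSTSTIM_MS)
    if start < 0 or stop > len(signal_uv):
        raise ValueError("epoch exceeds the recording")
    return signal_uv[start:stop]


def baseline_correct(epoch_uv):
    """Subtract the mean of the prestimulus samples from the epoch."""
    n_base = ms_to_samples(PRESTIM_MS)
    baseline = sum(epoch_uv[:n_base]) / n_base
    return [x - baseline for x in epoch_uv]


# Toy signal: constant 3 microvolts, stepping to 5 at the item onset.
signal = [3.0] * 100 + [5.0] * 1900
epoch = extract_epoch(signal, onset_index=100)
corrected = baseline_correct(epoch)
print(len(epoch), corrected[0], corrected[50])
```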

RESULTS

Behavioral Data

As responses were delayed, we refrain from analyzing reaction times and restrict the behavioral analyses to the error data. Overall, accuracy rates were above 80%, indicating that all participants performed well on both tasks. A repeated measures ANOVA in the metric task (see Figure 2) revealed a significant main effect of condition [F(3, 69) = 19.25, p < .001]. Planned comparisons between the levels of the factor condition (correct/metrically violated/syntactically violated/doubly violated) revealed significant differences between the syntactic (93.02%) and the correct (99.27%) condition [F(1, 23) = 13.93, p < .01], between the correct and the double violation (83.17%) condition [F(1, 23) = 27.01, p < .001], between the syntactic violation and the double violation condition [F(1, 23) = 12.30, p < .01], and between the metric violation (96.07%) and the double violation condition [F(1, 23) = 16.35, p < .001]. Due to restricted degrees of freedom, a Bonferroni-adjusted α-level of .025 was applied.

Figure 2. 

Percentage of correctly answered trials for each condition and task.

In the syntactic task, correct responses were above 99% (see Figure 2; correct: 99.35%, syntactic: 99.19%, metric: 99.11%, double: 99.51%). The omnibus ANOVA confirmed no significant differences between conditions (all p > .05).

ERP Data

The following ERP components were elicited in all three experimental manipulations: a frontally distributed negativity and a late posterior positivity. Based on a 50-msec timeline analysis, ERP components in the respective tasks were analyzed separately as they varied in latency. This resulted in the following time windows after onset of the critical item. Metric task: 200 to 450 msec and 600 to 850 msec (negativity), and 550 to 850 msec as well as 850 to 1150 msec (posterior positivity). Syntactic task: 250 to 500 msec and 500 to 650 msec (negativity), and 750 to 1050 msec (positivity). The computation of two P600 time windows in the metric task versus one window in the syntactic task was motivated by the timeline analysis, which revealed latency differences between conditions in the metric task, whereas in the syntactic task, latencies were similar across conditions. In both tasks, the following regions of interest were statistically analyzed: anterior left (AF7, AF3, F7, F5, F3, FT7, FC5, FC3), anterior right (AF8, AF4, F8, F6, F4, FT8, FC6, FC4), posterior left (CP3, P3, P5, P7, P9, PO3, PO7, O1), and posterior right (CP4, P4, P6, P8, P10, PO4, PO8, O2). To evaluate effect sizes, we computed omega squared (Ω2), that is, the coefficient of determination that represents the proportion of variance in the dependent variable accounted for by the independent variable (interpreted in a similar manner as r2). As we have used a within-subject design, Ω2 values greater than .26 are defined as large effects, Ω2 values from .048 to .26 are defined as medium effects, and Ω2 values from .019 to .048 are small effects (cf. Cohen, 1992). Greenhouse–Geisser (1959) correction was applied for effects with more than one degree of freedom.
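The text does not state how Ω2 was computed. As a sketch under that caveat, the following uses a common approximation for comparisons with one numerator degree of freedom, Ω2 = (F − 1) / (F − 1 + N), with N = 24 participants. This assumed formula happens to reproduce the values reported for the single-df comparisons (e.g., F(1, 23) = 25.43 yields .50), but it is not confirmed as the authors' method and should not be applied to the effects with three numerator degrees of freedom. The function names are hypothetical; the labels follow the cut-offs given above.

```python
def omega_squared_single_df(f_value, n_participants):
    """Approximate omega squared for a comparison with 1 numerator df.

    Assumed formula: (F - 1) / (F - 1 + N).
    """
    return (f_value - 1.0) / (f_value - 1.0 + n_participants)


def effect_size_label(omega_sq):
    """Classify an effect size using the cut-offs stated in the text."""
    if omega_sq > 0.26:
        return "large"
    if omega_sq >= 0.048:
        return "medium"
    if omega_sq >= 0.019:
        return "small"
    return "below small"


# Correct vs. metric violation, 200 to 450 msec window, metric task.
early_metric = omega_squared_single_df(25.43, n_participants=24)
print(round(early_metric, 2), effect_size_label(early_metric))
```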

Metric Task

Negativities

As shown in Figure 3, an anteriorly distributed negativity was elicited in all of the violation conditions. However, the negativity evoked by metric and double violations deflected earlier than the negativity (LAN) elicited by syntactic violations. A repeated measures ANOVA with three within-subject factors supported this impression: window (200–450/600–850), hemisphere (right/left), and condition (correct, syntactic violation, metric violation, double violation). The ANOVA revealed a significant interaction of condition and window [F(3, 69) = 13.51, p < .001] as well as a three-way interaction between window, condition, and hemisphere [F(3, 69) = 2.98, p = .05]. Resolving the interactions by window revealed a main effect of condition [F(3, 69) = 13.65, p < .001] for the 200 to 450 msec time window. Breakdown analyses supported that the negativity evoked by the metric and double violation deflected early compared to the syntactic violation condition as differences between the correct and the metric condition [F(1, 23) = 25.43, p < .001, Ω2 = .50] as well as differences between the correct and the double violation condition [F(1, 23) = 4.03, p < .06, Ω2 = .11] turned out to be significant in this early time window. In the later time window (600–850 msec), a main effect of condition [F(3, 69) = 4.88, p < .01], as well as an interaction between the factors condition and hemisphere [F(3, 69) = 4.37, p < .01], was significant. Resolving the interaction between condition and hemisphere resulted in a significant effect for condition in the left hemisphere only [F(3, 69) = 8.04, p < .001]. Post hoc analyses showed that all comparisons between the correct and each of the violation conditions turned out to be significant [syntactic: F(1, 23) = 19.39, p < .001, Ω2 = .43; metric: F(1, 23) = 5.62, p < .01, Ω2 = .16; double: F(1, 23) = 11.34, p < .001, Ω2 = .30], indicating that the syntactically evoked LAN starts later and is clearly left-lateralized. 
Metrically induced negativities start bilaterally and become left-lateralized in a later time window. However, the metric negativity has its maximum in the early time window as effect sizes in the later time window show larger effects for the syntactic and the double violation condition.

Figure 3. 

Metric task: Negativity elicited by the critical main verb in the syntactic, the metric, and the double violation conditions. Waveforms show the average for correct and the particular violation condition from 100 msec prior to the item onset up to 1500 msec.

In sum, negativities elicited in the metric and double violation condition have an earlier onset than the syntactically evoked negativity, and start bilaterally before shifting to the left hemisphere. The syntactically evoked negativity was left-lateralized right from the start (Table 2).

Table 2.

Metric Task: ANOVA of Mean ERP Amplitudes in 200 to 450 and 600 to 850 msec Latency Range (Negativity)

                      Source              df       F        p
Omnibus ANOVA
Overall               Con × Win           3, 69    13.51    <.001
                      Con × Win × Hemi    3, 69    2.98     .05

200 to 450 msec
Corr vs. Met          Con                 1, 23    25.43    <.001
Corr vs. Double       Con                 1, 23    4.88     <.06

600 to 850 msec
                      Con × Hemi          3, 69    4.37     <.01

600 to 850 msec Left
Corr vs. Syn          Con                 1, 23    19.39    <.001
Corr vs. Met          Con                 1, 23    5.62     <.01
Corr vs. Double       Con                 1, 23    11.34    <.001

Corr = correct condition; Syn = syntactic violation; Met = metric violation; Double = double violation; Con = condition; Win = window; Hemi = hemisphere.

P600

All violation conditions elicited a posterior P600 (see Figure 4). An ANOVA with three within-subject factors was applied to statistical analysis of posterior electrode sites: window (550–850/850–1150 msec), hemisphere (left/right), condition (correct, syntactic violation, metric violation, double violation). The omnibus ANOVA yielded a two-way interaction between window and condition [F(3, 69) = 3.62, p < .05], and a three-way interaction between window, condition, and hemisphere [F(3, 69) = 7.24, p = .001]. The interactions were resolved by window that revealed a main effect of condition in both time windows [550–850: F(3, 69) = 6.32, p < .01, Ω2 = .13; 850–1150: F(3, 69) = 4.15, p < .05, Ω2 = .08], and an interaction between hemisphere and condition [F(3, 69) = 4.37, p < .01] in the later time window (850–1150 msec). Planned comparisons between the correct condition and each of the violation conditions in the 550 to 850 msec time window revealed a significant effect for the comparison between the correct condition and the double violation [F(1, 23) = 8.87, p < .01, Ω2 = .25], and for the comparison between the correct condition and the metric violation [F(1, 23) = 4.20, p = .05, Ω2 = .12]. This result indicates that the conditions involving metric violations evoked an early deflecting positivity, with the purely metric effect being smaller than the effect for the double violation condition. In the 850 to 1150 msec time window, planned comparisons yielded a significant effect for all violation types [syntactically incorrect: F(1, 23) = 4.13, p = .05, Ω2 = .12; metrically incorrect: F(1, 23) = 12.39, p < .01, Ω2 = .32; double violation: F(1, 23) = 6.02, p < .05, Ω2 = .17]. In this late time window, Ω2 values indicate that the metric violation induced the largest effect followed by the double violation condition and the syntactic violation. 
Resolving the interaction of hemisphere and condition yielded a significant effect for condition in both hemispheres [left hemisphere: F(3, 69) = 5.09, p < .01, Ω2 = .11; right hemisphere: F(3, 69) = 3.22, p < .05, Ω2 = .06]. Planned comparisons between the levels of the factor condition in both hemispheres showed that the positivity induced by the metric and the double violation condition is bilaterally distributed [correct–metric: left F(1, 23) = 17.17, p < .001, Ω2 = .40; right F(1, 23) = 6.67, p < .05, Ω2 = .19; correct–double: left F(1, 23) = 5.18, p < .05, Ω2 = .15; right F(1, 23) = 5.63, p < .05, Ω2 = .16]. However, post hoc analyses showed that the syntactically induced positivity is right-lateralized [F(1, 23) = 5.27, p < .05, Ω2 = .15].
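
The effect sizes reported for the planned comparisons can be related to their F values with a short calculation. The sketch below assumes one common estimator for a single-df contrast, Ω2 = (F − 1)/(F − 1 + N), with N = 24 inferred from the error degrees of freedom in F(1, 23). The text does not state which estimator was used, so both the formula and the function name `omega_squared_1df` are assumptions made for illustration only.

```python
def omega_squared_1df(f_value: float, n_participants: int) -> float:
    """Estimate omega-squared for a 1-df within-subject contrast.

    Assumed estimator (not stated in the text): (F - 1) / (F - 1 + N).
    """
    return (f_value - 1.0) / (f_value - 1.0 + n_participants)


# N is inferred from the error df: F(1, 23) implies 23 + 1 participants.
N = 24

# (comparison, reported F, reported omega-squared), copied from the P600
# analysis of the metric task.
REPORTED = [
    ("550-850 msec, correct vs. double", 8.87, 0.25),
    ("550-850 msec, correct vs. metric", 4.20, 0.12),
    ("850-1150 msec, correct vs. syntactic", 4.13, 0.12),
    ("850-1150 msec, correct vs. metric", 12.39, 0.32),
    ("850-1150 msec, correct vs. double", 6.02, 0.17),
    ("left, correct vs. metric", 17.17, 0.40),
    ("right, correct vs. metric", 6.67, 0.19),
    ("left, correct vs. double", 5.18, 0.15),
    ("right, correct vs. double", 5.63, 0.16),
    ("right, correct vs. syntactic", 5.27, 0.15),
]

for label, f_value, reported in REPORTED:
    estimate = omega_squared_1df(f_value, N)
    print(f"{label}: F = {f_value:.2f}, estimated = {estimate:.2f}, reported = {reported:.2f}")
```

Under this assumption, the rounded estimates agree with the reported values for the single-df comparisons listed here. The estimator is not intended for the omnibus effects with 3 numerator degrees of freedom.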

Figure 4. 

Metric task: Positivity elicited by the critical main verb in the syntactic, the metric, and the double violation conditions. Waveforms show the average for correct and the particular violation condition from 100 msec prior to the item onset up to 1500 msec.

To evaluate the expected underadditivity of the P600 component, we summed the difference waves of the syntactic and the metric P600. If the P600 is additive, this calculated difference wave should equal the evoked difference wave in the double violation condition. Therefore, we computed an ANOVA including the factors condition (calculated/evoked) and hemisphere (left/right). As the syntactic violation does not come into play until 850 msec, we analyzed the 850 to 1150 msec time window for interaction. The ANOVA yielded a significant main effect of condition [F(1, 23) = 5.57, p < .05]: The amplitude increase of the calculated P600 (+2.44 μV) was larger than the amplitude increase of the evoked P600 (+1.25 μV).
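
The logic of this additivity test can be sketched in a few lines. The example below is a simplified illustration, not the analysis pipeline used in the study. It drops the hemisphere factor (amplitudes are assumed to be averaged across hemispheres), uses the fact that a two-level within-subject effect yields F(1, n − 1) = t² from a paired comparison, and runs on four hypothetical participants whose amplitudes are invented purely to make the code executable. The function name `additivity_test` is likewise hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def additivity_test(correct, syntactic, metric, double):
    """Compare the calculated (summed) and the evoked double-violation effect.

    Each argument is a list of per-participant mean amplitudes (in microvolts)
    in the analysis window, one value per participant.
    """
    # Calculated effect: sum of the two single-violation difference waves.
    calculated = [(s - c) + (m - c) for c, s, m in zip(correct, syntactic, metric)]
    # Evoked effect: difference wave of the double violation.
    evoked = [d - c for c, d in zip(correct, double)]

    # Paired comparison; for two levels, F(1, n - 1) equals t squared.
    diffs = [x - y for x, y in zip(calculated, evoked)]
    n = len(diffs)
    t_value = mean(diffs) / (stdev(diffs) / sqrt(n))
    return {
        "calculated_mean": mean(calculated),
        "evoked_mean": mean(evoked),
        "F": t_value ** 2,
        "df": (1, n - 1),
    }


# Hypothetical toy amplitudes for four participants (NOT data from the study).
toy = additivity_test(
    correct=[3.0, 2.5, 3.5, 2.8],
    syntactic=[3.8, 3.1, 4.4, 3.5],
    metric=[4.2, 3.9, 4.6, 4.1],
    double=[4.0, 3.4, 4.5, 3.9],
)
print(toy)
```

A calculated effect that is reliably larger than the evoked effect, as in the toy data, corresponds to the underadditive pattern described above.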

To sum up, the P600 has varying onsets depending on the violation condition. Metric and double violations evoked earlier deflecting positivities than the syntactic violation condition. Although the P600 effects for the metric and the double violation condition are bilaterally distributed, the syntactically induced P600 is right-lateralized. Additionally, the P600 of the double violation condition was underadditive (Tables 3 and 4).

Table 3. 

Metric Task: ANOVA of Mean ERP Amplitudes in 550 to 850 and 850 to 1150 msec Latency Range (P600)


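| Analysis | Contrast | Source | df | F | p |
| --- | --- | --- | --- | --- | --- |
| Omnibus ANOVA | Overall | Con × Win | 3, 69 | 3.62 | <.05 |
| Omnibus ANOVA | Overall | Con × Win × Hemi | 3, 69 | 7.24 | .001 |
| 550 to 850 msec | Corr vs. Met | Con | 1, 23 | 4.20 | .05 |
| 550 to 850 msec | Corr vs. Double | Con | 1, 23 | 8.87 | <.01 |
| 850 to 1150 msec | | Con × Hemi | 3, 69 | 4.37 | <.01 |
| 850 to 1150 msec Left | Corr vs. Met | Con | 1, 23 | 17.17 | <.001 |
| 850 to 1150 msec Left | Corr vs. Double | Con | 1, 23 | 5.18 | <.05 |
| 850 to 1150 msec Right | Corr vs. Syn | Con | 1, 23 | 5.27 | <.05 |
| 850 to 1150 msec Right | Corr vs. Met | Con | 1, 23 | 6.67 | <.05 |
| 850 to 1150 msec Right | Corr vs. Double | Con | 1, 23 | 5.63 | <.05 |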

Corr = correct condition; Syn = syntactic violation; Met = metric violation; Double = double violation; Con = condition; Win = window; Hemi = hemisphere.

Table 4. 

Metric Task: Mean Voltage Values (μV) for Each Condition, ERP Component, and Time Window


| Condition | Early Negativity, 200–450 msec, Mean (SD) | Early Negativity, 600–850 msec, Mean (SD) | P600, 550–850 msec, Mean (SD) | P600, 850–1150 msec, Mean (SD) |
| --- | --- | --- | --- | --- |
| Correct | −0.21 (1.35) | −0.66 (1.71) | 3.38 (2.84) | 2.94 (2.71) |
| Metric | −1.13 (1.47) | −1.43 (1.84) | 4.03 (2.83) | 4.16 (2.79) |
| Syntactic | – | −1.86 (1.82) | – | 3.73 (2.38) |
| Double | −1.23 (1.47) | −1.23 (1.58) | 4.40 (2.01) | 3.92 (2.25) |

Syntactic Task

Negativities

All violation conditions evoked an anteriorly distributed negativity in the syntactic task. Although attention was not directed to metric processing, the negativity evoked by metric violations deflected earlier than the LAN (see Figure 5). To evaluate these negativities, we conducted an omnibus ANOVA with the three within-subject factors: window (250–500/500–650), hemisphere (left/right), and condition (correct/syntactic violation/metric violation/double violation). The ANOVA yielded a significant interaction between window and condition [F(3, 69) = 12.22, p < .001]. Follow-up analyses by window showed that the negativity evoked by the metric violation deflected earlier and lasted longer than the syntactically evoked LAN. This was confirmed in the analysis of the 250 to 500 msec time window in a significant effect for condition [F(3, 69) = 4.88, p < .01]. Planned comparisons revealed a significant condition effect comparing correct and metric violation [F(1, 23) = 13.52, p < .01, Ω2 = .34], and correct and double violation [F(1, 23) = 1.46, p < .01, Ω2 = .28].

Figure 5. 

Syntactic task: Negativity elicited by the critical main verb in the syntactic, the metric, and the double violation conditions. Waveforms show the average for correct and the particular violation condition from 100 msec prior to the item onset up to 1500 msec.

In the 500 to 650 msec time window, the condition effect turned out to be significant [F(3, 69) = 4.54, p < .01]. Planned comparisons in this window yielded significant effects between correct and metric violation [F(1, 23) = 5.71, p < .05, Ω2 = .16], and between correct and syntactic violation [F(1, 23) = 12.03, p < .01, Ω2 = .31]. Analogous to the results in the metric task, the metric effect diminished in the later time window, whereas the syntactic effect was only evoked in the second time window.

In sum, in the syntactic task, all negativities were distributed bilaterally with the metric and double negativity deflecting earlier than the syntactic negativity (Table 5).

Table 5. 

Syntactic Task: ANOVA of Mean ERP Amplitudes in 250 to 500 and 500 to 650 msec Latency Range (Negativity)


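| Analysis | Contrast | Source | df | F | p |
| --- | --- | --- | --- | --- | --- |
| Omnibus ANOVA | Overall | Con × Win | 3, 69 | 12.22 | <.001 |
| 250 to 500 msec | Corr vs. Met | Con | 1, 23 | 13.52 | <.01 |
| 250 to 500 msec | Corr vs. Double | Con | 1, 23 | 1.46 | <.01 |
| 500 to 650 msec | Corr vs. Met | Con | 1, 23 | 5.71 | <.05 |
| 500 to 650 msec | Corr vs. Syn | Con | 1, 23 | 12.03 | <.01 |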

Corr = correct condition; Syn = syntactic violation; Met = metric violation; Double = double violation; Con = condition; Win = window.

Positivity

The statistical analysis of the P600 in the syntactic task (see Figure 6) contained two within-subject factors, namely, hemisphere (left/right) and condition (correct/syntactically incorrect/metrically incorrect/double violation), and was conducted for the time window from 750 to 1050 msec after the onset of the critical item. The ANOVA revealed a main effect of condition [F(3, 69) = 4.88, p < .01]. Planned comparisons revealed a condition effect for the comparisons between the correct condition and the syntactic violation [F(1, 23) = 15.14, p < .001, Ω2 = .37], between the correct condition and the metric violation [F(1, 23) = 4.04, p < .06, Ω2 = .11], and between the correct condition and the double violation [F(1, 23) = 9.00, p < .01, Ω2 = .25]. Amplitudes did not differ significantly among the violation conditions. Effect sizes indicated a larger effect for the syntactic violation than for the double violation and the metric violation. In sum, the P600 in the syntactic task deflected in the same time window for all violation conditions and had a bilateral distribution.

Figure 6. 

Syntactic task: Positivity elicited by the critical main verb in the syntactic, the metric, and the double violation conditions. Waveforms show the average for correct and the particular violation condition from 100 msec prior to the item onset up to 1500 msec.

To evaluate the underadditivity of the P600 component in the syntactic task, we also summed the difference waves of the single violation conditions and compared this calculation with the evoked difference wave in the double violation condition. We computed an ANOVA with the factors condition (calculated P600/evoked P600) and hemisphere (left/right). In line with our hypothesis, the amplitude of the calculated P600 is larger (+2.40 μV) than the amplitude of the evoked P600 (+1.37 μV). Statistical analysis revealed that this difference is marginally significant [F(1, 23) = 3.54, p < .08] (Tables 6 and 7).

Table 6. 

Syntactic Task: ANOVA of Mean ERP Amplitudes in 750 to 1050 msec Latency Range (P600)


| Contrast | Source | df | F | p |
| --- | --- | --- | --- | --- |
| Corr vs. Syn | Con | 1, 23 | 15.14 | <.001 |
| Corr vs. Met | Con | 1, 23 | 4.04 | <.06 |
| Corr vs. Double | Con | 1, 23 | 9.00 | <.01 |

Corr = correct condition; Syn = syntactic violation; Met = metric violation; Double = double violation; Con = condition.

Table 7. 

Syntactic Task: Mean Voltage Values (μV) for Each Condition, ERP Component, and Time Window


| Condition | Early Negativity, 250–500 msec, Mean (SD) | Early Negativity, 500–650 msec, Mean (SD) | P600, 750–1050 msec, Mean (SD) |
| --- | --- | --- | --- |
| Correct | −0.09 (1.51) | 0.09 (2.56) | 1.43 (1.82) |
| Metric | −1.28 (1.50) | −0.70 (2.84) | 2.22 (2.34) |
| Syntactic | – | −1.30 (3.05) | 2.72 (2.43) |
| Double | −0.55 (1.51) | −0.86 (2.76) | 2.64 (2.08) |

DISCUSSION

The present study investigated the influence of metric cues in speech by utilizing predictable syllable stress during on-line auditory sentence processing. We tested whether meter not only supports the structuring of the incoming speech signal, that is, “segmentation” (e.g., Quené & Koster, 1998; Cutler, 1994; Quené, 1993; Slowiaczek, 1990; Cutler & Carter, 1987; Mehler et al., 1981), but also interacts with other linguistic processes such as syntax. In addition, we tested whether metric violations in German evoke an ERP pattern similar to that found in French, namely, a biphasic pattern consisting of an earlier negativity and a late posterior positivity (Magne et al., 2007). The current results clearly support the role of metric cues in auditory sentence processing. We are able to show that, at an early processing stage, metric violations are processed prior to syntactic violations as reflected in an early frontally distributed negativity. Furthermore, our results suggest that meter and syntax interact in the processes underlying the P600. In the following, behavioral and ERP data will be discussed in turn.

Behavioral Data

Under both explicit (grammaticality judgment with respect to syntax/metrical judgment with respect to metric homogeneity) and implicit instructions (vice versa), participants responded highly accurately. However, performance differed between the two tasks. We only found significantly different error rates in the metric task, but not in the syntactic task. This may be due to the fact that the detection of double violations in the metric task seems to be more difficult. A reason may be that metric violations are harder to detect, whereas syntactic violations are more salient. As soon as the participant's attention is focused on metric violations, the double violation has to be identified as “incorrect,” but the participant may be unsure whether the violation is a purely syntactic violation or coincides with a metric violation. Participants may adopt the strategy of labeling syntactic violations in the metric task as “correct” because the sentence context is metrically regular. They are only in conflict in the double violation condition. However, the lower performance in the double violation (metric task) may result from explicit judgment of metric structure. Although syntax is based on explicit rules, meter is inherent in most human systems (i.e., motor, speech, music). Thus, making inherent concepts explicit may interfere with our ability to judge. This may be particularly true when salient syntactic violations superimpose metric violations, and in turn, may result in higher error rates in the metric task.

ERP Data

In terms of the ERP data, we return to each of our key questions.

How Relevant is Metric Competence in Auditory Language Processing, Especially during Syntactic Processing, and Which ERP Components Do Reflect Metric Processing in German?

The critical verb elicited an anterior negativity whose latency varied as a function of violation type. This negativity, evoked by metric and double violations, emerged earlier than the negativity evoked by syntactic violations in both tasks. This indicates that metric deviations are processed prior to syntactic violations independent of the focus of attention.

However, one has to be cautious in interpreting these latency differences. As the critical cue for the syntactic violation may be perceived later than the critical cue for the metric violation, these latency differences are, in fact, not as large as they seem to be. Metrically induced negativities deflect between 250 and 400 msec earlier than syntactically evoked negativities. However, the first syllable of the critical item lasted 300 msec on average. This means that the metric violation has no processing advantage if the metric deviation is directly detected at the onset of the first syllable, whereas the syntactic violation is not detected until the onset of the second syllable. However, it is rather unlikely that the metric violation is immediately perceived at the onset of the critical item as (i) stress is always relative to another event, that is, a foot (trochee or iamb) always consists of two syllables, a strong and a weak one, and (ii) the nucleus of a syllable is thought to be the stress-bearing element, which is not the onset of the first syllable. Furthermore, it is also unlikely that the syntactic violation is not detected until the onset of the second syllable due to coarticulatory effects. To conclude, although the processing advantage is not large, the latency differences indicate that metric violation is processed prior to syntactic violation. We interpret this metrically induced negativity as a correlate of the ongoing segmentation process. This segmentation process is interrupted in metrically violated sentences as the stressed syllable in the critical verb does not mark the onset of a new word. However, a similar negativity in response to metric violations reported by Magne et al. (2007) and Böcker et al. (1999) has been interpreted differently. Böcker et al. claimed that the negativity is a correlate of metrical stress violations, whereas Magne et al. proposed that the component is an N400 linked to effortful lexical access. 
Thus, the metrically induced negativity that we report may be an electrophysiological index of increased effort during lexical access (N400), or it may be interpreted as a subcomponent of a LAN reflecting the violations of a general (nonlinguistic) rule-based mechanism (see Hoen & Dominey, 2000). This interpretation is supported by data from Abecasis et al. (2005) and Brochard et al. (2003), who reported an early negativity in response to metric deviations in tone sequences that can be interpreted as a correlate of metric error detection (affecting segmentation). As data from the present experiment cannot support one or the other competing explanations, a follow-up experiment to test the nature of the negativity is currently undertaken. However, the present experiment shows that metric cues are used prior to syntactic cues during language comprehension, stressing the high relevance of metric competencies in language processing.
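
The latency argument made above can be retraced as simple arithmetic. The sketch below takes the start of each analysis window (from Tables 4 and 7) as a rough proxy for the onset of the corresponding negativity, which is an assumption made for illustration, and subtracts the average first-syllable duration of 300 msec as the worst-case delay of the syntactic cue. The function name `residual_advantage` is hypothetical.

```python
FIRST_SYLLABLE_MS = 300  # average duration of the first syllable (from the text)

# Start of the analysis windows in msec, used here as a proxy for the onset
# of the negativity (an assumption made for this sketch).
WINDOW_ONSETS = {
    "metric task": {"metric": 200, "syntactic": 600},
    "syntactic task": {"metric": 250, "syntactic": 500},
}


def residual_advantage(metric_onset, syntactic_onset, cue_delay=FIRST_SYLLABLE_MS):
    """Latency advantage of the metric effect after discounting the cue delay.

    Worst case: the metric cue is available at word onset, whereas the
    syntactic cue only becomes available at the onset of the second syllable.
    """
    return (syntactic_onset - metric_onset) - cue_delay


for task, onsets in WINDOW_ONSETS.items():
    raw = onsets["syntactic"] - onsets["metric"]
    adjusted = residual_advantage(onsets["metric"], onsets["syntactic"])
    print(f"{task}: raw difference = {raw} msec, worst-case adjusted = {adjusted} msec")
```

Under this worst-case assumption, the raw differences of 400 and 250 msec shrink to 100 and −50 msec, which is why the latency differences have to be interpreted with caution.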

As Both Processes are Based on Rule-based Structures, Do They Interact during Sentence Processing?

The present data suggest that meter and syntax interact in auditory language comprehension. As already described in the Introduction, the amplitude of the P600 in the double violation should equal the sum of the P600s in the single violation conditions if the two processes operate independently, because independent neural generators have additive effects on the amplitude of an ERP component. If, in turn, two processes interact, the amplitude of the double violation condition should be over- or underadditive. In the current study, the amplitude of the P600 in response to the double violation did not differ from the amplitude of the P600 in response to the single violation conditions and is thus underadditive. This is further supported by the reported effect sizes: If meter and syntax were independent processes, the effect in the double violation condition should be larger than in the single violation conditions. However, the interaction was only marginally significant for the syntactic task. We assume that the established language system (i.e., healthy native speakers) may not be as reliant on the metric structure if attention is directed to syntactic processing. In such a case, appending metric violations may not influence syntactic processing as strongly as they should in a less calibrated system (e.g., second-language learners). Nonetheless, the current data suggest that metric violations were reanalyzed in the syntactic task and that both processes interact regardless of the task. However, as pointed out in the Introduction, underadditivity does not necessarily imply interdependency of the underlying processes (Palolahti, Leino, Jokela, Kopra, & Paavilainen, 2005; Osterhout & Nicol, 1999). Here, fMRI and MEG are useful techniques to test for identical sources of both (metric and syntactic) P600 effects. In fact, the results from our fMRI experiment (Kotz, Rothermich, & Schmidt-Kassow, 2007) are in support of this. 
The data reveal that meter and syntax processing share neural sources as evidenced in overlapping activation of the left STG.

The P600 in the current experiment was evoked in all violation conditions and in both tasks, but was clearly task-sensitive. Task-specific modulation of conditions resulted in an earlier deflection of all positivities. These results support the idea that the P600 reflects a process that is under attentional control (Vissers, Chwilla, & Kolk, 2007; Kolk, Chwilla, van Herten, & Oor, 2003; Gunter & Friederici, 1999; Coulson et al., 1998; Gunter, Stowe, & Mulder, 1997; Hahne & Friederici, 1997). Thus, if attention is not directed toward syntax, syntactic reanalysis is initiated later than when attention is focused on syntax. The same is true for meter. Thus, task-specific attention leads to earlier reanalysis than non-task-specific processing.

One might argue that two factors of metric processing are confounded in the current material. Next to the manipulation of the metric structure of the whole sentence (strong–weak–strong–weak–strong–weak…), we also violated the lexical stress of each particular critical item. Thus, the elicited ERP components can be both (i) a reflection of the disrupted metric structure at the sentence level and (ii) a reflection of locally violated lexical stress. In a follow-up experiment (Kotz & Schmidt-Kassow, 2007), we investigated this open question by changing the critical item into a real existing German verb that is stressed on the second instead of the first syllable (e.g., “ver'höhnen”; to mock s.o.). Even in this case we found a P600 effect in response to the metrically deviant item, indicating that the brain is not only sensitive to lexical stress, but also to the metric pattern in a given sentence.

With respect to the ongoing discussion whether the P600 is syntax specific or not, the present data support the idea that the P600 is a correlate of a general integration mechanism (see also Osterhout, Kim, & Kuperberg, 2007; Kaan & Swaab, 2003). The present results extend the functional interpretation of the P600 by adding the influence of meter to such a general integration mechanism. As has been previously shown, the P600 cannot be labeled as a purely syntactic component as it has already been found in response to semantic violations that contradict certain plausibility heuristics, that is, those violations that are inconsistent with the listener's expectancy about how a sentence will continue (see van Herten et al., 2005). Such semantically driven P600s always involve difficulties in theta-role assignment. Thus, a semantic P600 can be traced back to a rule-based expectancy violation, as theta-role assignment is subject to specific rules. Furthermore, the P600 does not seem to be a language-specific component as this ERP component has also been evoked in studies investigating music processing, mathematics, as well as the rhythmization of simple tone sequences (Martín-Loeches, Casado, Gonzalo, Heras, & Fernández-Frías, 2006; Abecasis et al., 2005; Besson & Faita, 1995). The question is: What is the driving factor eliciting a P600? We propose that all factors that are relevant for a (rule-based) building up of expectancies about a plausible continuation of the input (i.e., meter in tone sequences and music pieces, mathematics, and syntactic rules) modulate the P600. On the basis of the present and previous results, the P600 appears to reflect a reprocessing mechanism as has likewise been proposed by van Herten et al. (2005). In this context, reprocessing takes place if the rule-based expectancy of a given input is not fulfilled, and consequently, an initial analysis has to be rejected. 
Thus, the ongoing discussion about late integration of different factors that lead to a particular analysis of language input needs to be extended by one more factor, that is, the metric structure of a given language.

Conclusion

The present results are twofold. First, they confirm the importance of metric structure during segmentation of an incoming speech stream. Furthermore, the data confirm that metric structure is analyzed prior to syntactic structure at an early processing stage (early negativity). This indicates that meter and metric competence provide a grid to structure an incoming speech stream. Second, the current data suggest that meter and syntax do interact during a late stage of auditory language processing (P600). This indicates that meter is not only involved in the early segmentation of speech but is also interwoven with a late integrational process, namely, the reanalysis of a syntactically violated sentence.

Acknowledgments

We thank Sandra Pappert, Kathrin Rothermich, and three anonymous reviewers for helpful comments, Maren Grigutsch for providing useful tools and hints facilitating the ERP analysis, Heike Boethel and Solvejg Schulz for help in data acquisition, and Kerstin Flake as well as Sebastian Wahnelt for graphics support.

Reprint requests should be sent to Maren Schmidt-Kassow, Research Group Neurocognition of Rhythm in Communication, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstrasse 1a, Leipzig, Germany, 04103, or via e-mail: kassow@cbs.mpg.de.

Notes

1. 

A binary metric pattern has been defined as a default leading to perceptual advantages not only in stress-timed languages, but also with respect to other acoustic events as, for instance, tone sequences (Abecasis et al., 2005; Drake, 1993; Smith & Cuddy, 1989; Fraisse, 1982).

2. 

The process of grouping operates across modalities. It describes the active search for regularity in an acoustic input and results in “a predisposition to find regular pulse around which other events are organized” (Abecasis et al., 2005). By means of grouping, memory demands and processing time are reduced (for a detailed review, see Drake & Bertrand, 2003; Lerdahl & Jackendoff, 1983).

3. 

The Reading Span Test was originally developed by Daneman and Carpenter (1980). For the current experiment, we used a German adaptation of the Reading Span Test (Steinhauer, 1995).

4. 

The “Wortschatz Lexikon” is a German database evaluating words’ frequency of occurrence in print media (wortschatz.uni-leipzig.de).

REFERENCES

Abecasis, D., Brochard, R., Granot, R., & Drake, C. (2005). Differential brain response to metrical accents in isochronous auditory sequences. Music Perception, 22, 549–562.
Abercrombie, D. (1967). Elements of general phonetics. Edinburgh, UK: Edinburgh University Press.
Allen, W., Badecker, W., & Osterhout, L. (2003). Morphological analysis in sentence processing: An ERP study. Language and Cognitive Processes, 18, 405–430.
American Electroencephalographic Society. (1991). Guidelines for standard electrode position nomenclature. Journal of Clinical Neurophysiology, 8, 200–202.
Auer, E. T. (1993). Dynamic processing in spoken word recognition: The influence of paradigmatic and syntactic states. PhD thesis, State University of New York at Buffalo.
Auer, P., & Uhmann, S. (1988). Silben- und akzentzählende Sprachen. Zeitschrift für Sprachwissenschaft, 7, 214–259.
Bach, P. (2005). Space and function in action comprehension. Berlin: Logos.
Barth, D. S. (1995). The spatiotemporal organization of auditory, visual and auditory–visual evoked-potentials in rat cortex. Brain Research, 678, 177–190.
Berridge, K. C., Aldridge, J. W., Houchard, K. R., & Zhuang, X. (2005). Sequential super-stereotypy of an instinctive fixed action pattern in hyper-dopaminergic mutant mice: A model of obsessive compulsive disorder and Tourette's. BMC Biology, 3.
Besson, M., & Faita, F. (1995). An event-related potential (ERP) study of musical expectancy: Comparison of musicians with nonmusicians. Journal of Experimental Psychology: Human Perception and Performance, 21, 1278–1296.
Böcker, K. B., Bastiaansen, M. C., Vroomen, J., Brunia, C. H., & Gelder, B. D. (1999). An ERP correlate of metrical stress in spoken word recognition. Psychophysiology, 36, 706–720.
Brochard, R., Abecasis, D., Potter, D., Ragot, R., & Drake, C. (2003). The “ticktock” of our internal clock: Direct brain evidence of subjective accents in isochronous sequences. Psychological Science, 14, 362–366.
Chang, F., Dell, G. S., & Bock, K. (2006). Becoming syntactic. Psychological Review, 113, 234–272.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155–159.
Coulson, S., King, J., & Kutas, M. (1998). Expect the unexpected: Event-related brain response to morphosyntactic violations. Language and Cognitive Processes, 13, 21–58.
Cummins, F., & Port, R. (1998). Rhythmic constraints on stress timing in English. Journal of Phonetics, 26, 145–171.
Cutler, A. (1976). Phoneme-monitoring reaction time as a function of preceding intonation contour. Perception & Psychophysics, 20, 55–60.
Cutler, A. (1994). The perception of rhythm in language. Cognition, 50, 79–81.
Cutler, A., & Butterfield, S. (1992). Rhythmic cues to speech segmentation: Evidence from juncture misperception. Journal of Memory and Language, 31, 218–236.
Cutler, A., & Carter, D. (1987). The predominance of strong initial syllables in the English vocabulary. Computer Speech and Language, 2, 133–142.
Cutler, A., & Norris, D. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception and Performance, 14, 113–121.
Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19, 450–466.
Deutsch, A., & Bentin, S. (2001). Syntactic and semantic factors in processing gender agreement in Hebrew: Evidence from ERPs and eye movements. Journal of Memory and Language, 45, 200–224.
Domahs, U., Wiese, R., Bornkessel-Schlesewsky, I., & Schlesewsky, M. (2008). The processing of German word stress: Evidence for the prosodic hierarchy. Phonology, 25, 1–36.
Drake, C. (1993). Perceptual and performed sequences in musical sequences. Bulletin of the Psychonomic Society, 31, 107–110.
Drake, C., & Bertrand, D. (2003). The quest for universals in temporal processing in music. Annals of the New York Academy of Sciences, 930, 17–27.
Eckstein
,
K.
(
2007
).
Interaktion von Syntax und Prosodie beim Sprachverstehen: Untersuchungen anhand ereigniskorrelierter Hirnpotentiale
, MPI Series in Cognitive Neurosciences.
Eckstein
,
K.
, &
Friederici
,
A.
(
2006
).
It's early: Event-related potential evidence for initial interaction of syntax and prosody in speech comprehension.
Journal of Cognitive Neuroscience
,
18
,
1696
1711
.
Eisenberg
,
P.
(
1991
).
Syllabische Strukturen und Wortakzent: Prinzipien der Prosodik deutscher Woerter.
Zeitschrift fuer Sprachwissenschaft
,
10
,
37
64
.
Féry
,
C.
(
1997
).
“Uni und Studis: Die besten Wörter des Deutschen”.
Linguistische Berichte
,
172
,
461
490
.
Fraisse
,
P.
(
1982
).
Rhythm and tempo.
In D. Deutsch (Ed.), The psychology of music.
New York
:
Academic Press
.
Friederici
,
A.
(
1995
).
The time course of syntactic activation during language processing: A model based on neuropsychological and neurophysiological data.
Brain and Language
,
50
,
259
281
.
Friederici
,
A.
(
1999
). In A. Friederici (Ed.),
Language comprehension: A biological perspective
(Chapter 9pp.
263
301
).
Berlin
:
Springer
.
Friederici
,
A.
(
2004
).
Event-related brain potential studies in language.
Current Neurology and Neuroscience Reports
,
4
,
466
470
.
Friederici
,
A.
,
Hahne
,
A.
, &
Saddy
,
D.
(
2002
).
Distinct neurophysiological patterns reflecting aspects of syntactic complexity and syntactic repair.
Journal of Psycholinguistic Research
,
31
,
45
63
.
Friederici
,
A.
, &
Kotz
,
S.
(
2003
).
The brain basis of syntactic processes: Functional imaging and lesion studies.
Neuroimage
,
20
, (Suppl. 1),
S8
S17
.
Friederici
,
A.
,
Steinhauer
,
K.
, &
Frisch
,
S.
(
1999
).
Lexical integration: Sequential effects of syntactic and semantic information.
Memory & Cognition
,
27
,
438
453
.
Frisch
,
S.
(
2000
).
Verb-Argument-Struktur, Kasus und thematische Interpretation beim Sprachverstehen
, MPI Series in Cognitive Neurosciences.
Gasser
,
M.
,
Eck
,
D.
, &
Port
,
R.
(
1999
).
Meter as mechanism: A neural network model that learns metrical patterns.
Connection Science
,
11
,
187
216
.
Gondan
,
M.
, &
Röder
,
B.
(
2006
).
A new method for detecting interactions between the senses in event-related potentials.
Brain Research
,
1073
,
389
397
.
Gow
,
D.
, &
Gordon
,
P.
(
1993
).
Coming to terms with stress: Effects of stress location in sentence processing.
Journal of Psycholinguistic Research
,
22
,
545
578
.
Greenhouse, S., & Geisser, S. (1959). On methods in the analysis of profile data. Psychometrika, 24, 95–112.
Grewendorf, G. (2002). Minimalistische Syntax. Tübingen/Basel: Francke Verlag.
Gunter, T., & Friederici, A. (1999). Concerning the automaticity of syntactic processing. Psychophysiology, 36, 126–137.
Gunter, T., Friederici, A., & Schriefers, H. (2000). Syntactic gender and semantic expectancy: ERPs reveal early autonomy and late interaction. Journal of Cognitive Neuroscience, 12, 556–568.
Gunter, T., Stowe, L., & Mulder, G. (1997). When syntax meets semantics. Psychophysiology, 34, 660–676.
Hahne, A., & Friederici, A. D. (1997). Two stages in parsing: Early automatic and late controlled processes. Experimental Brain Research, 117, 47.
Hahne, A., & Friederici, A. D. (1999). Electrophysiological evidence for two steps in syntactic analysis. Early automatic and late controlled processes. Journal of Cognitive Neuroscience, 11, 194–205.
Hoen, M., & Dominey, P. (2000). ERP analysis of cognitive sequencing: A left anterior negativity related to structural transformation processing. NeuroReport, 11, 3187–3191.
Jackendoff, R. (2002). Foundations of language. Oxford University Press.
Jusczyk, P. (1999). How infants begin to extract words from speech. Trends in Cognitive Sciences, 3, 323–328.
Kaan, E., & Swaab, T. Y. (2003). Repair, revision, and complexity in syntactic analysis: An electrophysiological differentiation. Journal of Cognitive Neuroscience, 15, 98–110.
King, J. W., & Kutas, M. (1995). Who did what and when? Using word and clause-level ERPs to monitor working memory usage in reading. Journal of Cognitive Neuroscience, 7, 376–395.
Kohler, K. (1982). Rhythmus im Deutschen. Arbeitsberichte des Instituts für Phonetik Kiel, 19, 89–105.
Kolk, H. H. J., Chwilla, D., van Herten, M., & Oor, P. J. W. (2003). Structure and limited capacity in verbal working memory: A study with event-related potentials. Brain and Language, 85, 1–36.
Kotz, S. A., Rothermich, K., & Schmidt-Kassow, M. (2007). A network of organizational principles in the brain: fMRI evidence on the meter–syntax interface. Journal of Neuroscience. Program No. 738.12. 2007 Neuroscience Meeting Planner. San Diego, CA: Society for Neuroscience. (On line).
Kotz, S. A., & Schmidt-Kassow, M. (2007). ERP evidence on the interaction of metrical and syntactic processing: The case of the P600. Journal of Cognitive Neuroscience, Suppl., 165–166.
Kuperberg, G. R. (2007). Neural mechanisms of language comprehension: Challenges to syntax. Brain Research, 1146, 23–49.
Large, E., & Kolen, J. (1994). Resonance and the perception of musical meter. Connection Science, 6, 177–208.
Lee, C., & Todd, N. (2004). Towards an auditory account of speech rhythm: Application of a model of the auditory “primal sketch” to two multi-language corpora. Cognition, 93, 225–254.
Lerdahl, F., & Jackendoff, R. (1983). A generative theory of tonal music. Cambridge, MA: MIT Press.
Magne, C., Astesano, C., Aramaki, M., Ystad, S., Kronland-Martinet, R., & Besson, M. (2007). Influence of syllabic lengthening on semantic processing in spoken French: Behavioral and electrophysiological evidence. Cerebral Cortex, 17, 2659–2668.
Martín-Loeches, M., Casado, P., Gonzalo, R., Heras, L. D., & Fernández-Frías, C. (2006). Brain potentials to mathematical syntax problems. Psychophysiology, 43, 579–591.
Mattys, S., & Samuel, A. (1997). How lexical stress affects speech segmentation and interactivity: Evidence from the migration paradigm. Journal of Memory and Language, 36, 87–116.
McQueen, J., Norris, D., & Cutler, A. (1994). Competition in word recognition: Spotting words in other words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 621–638.
Mehler, J., Dommergues, J., Frauenfelder, U., & Segui, J. (1981). The syllable's role in speech segmentation. Journal of Verbal Learning and Verbal Behavior, 20, 298–305.
Metz-Lutz, M.-N., Otzenberger, H., Gounot, D., & Jaffre, J. P. (2006). Neurophysiological correlates of morphosyntactical processing. Langue Francaise, 151, 94.
Mueller, J. L., Bahlmann, J., & Friederici, A. D. (2008). The role of pause cues in language learning: The emergence of ERPs related to sequence processing. Journal of Cognitive Neuroscience, 20, 892–905.
Nazzi, T., & Ramus, F. (2003). Perception and acquisition of linguistic rhythm by infants. Speech Communication, 4, 233–243.
Neville, H., Nicol, J., Barss, A., Forster, K., & Garrett, M. (1991). Syntactically based sentence processing classes: Evidence from event-related brain potentials. Journal of Cognitive Neuroscience, 3, 151.
Norris, D., McQueen, J., & Cutler, A. (1995). Competition and segmentation in spoken-word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1209–1228.
Oldfield, R. C. (1971). The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia, 9, 97–113.
Osterhout, L., Kim, A., & Kuperberg, G. (2007). The neurobiology of sentence comprehension. In M. Spivey, M. Joannisse, & K. McRae (Eds.), The Cambridge handbook of psycholinguistics. Cambridge: Cambridge University Press.
Osterhout, L., & Nicol, J. (1999). On the distinctiveness, independence, and time course of the brain responses to syntactic and semantic anomalies. Language and Cognitive Processes, 14, 283–317.
Palolahti, M., Leino, S., Jokela, M., Kopra, K., & Paavilainen, P. (2005). Event-related potentials suggest early interaction between syntax and semantics during online sentence comprehension. Neuroscience Letters, 384, 222–227.
Pannekamp, A., Toepel, U., Alter, K., Hahne, A., & Friederici, A. D. (2005). Prosody-driven sentence processing: An event-related brain potential study. Journal of Cognitive Neuroscience, 17, 407–421.
Patel, A., Gibson, E., Ratner, J., Besson, M., & Holcomb, P. (1998). Processing syntactic relations in language and music: An event-related potential study. Journal of Cognitive Neuroscience, 10, 717–733.
Pitt, M. A., & Samuel, A. G. (1990). The use of rhythm in attending to speech. Journal of Experimental Psychology: Human Perception and Performance, 16, 564–573.
Quené, H. (1993). Segment durations and accent as cues to word segmentation in Dutch. Journal of the Acoustical Society of America, 94, 2027–2035.
Quené, H., & Koster, M. L. (1998). Metrical segmentation in Dutch: Vowel quality or stress? Language and Speech, 41, 185–202.
Quené, H., & Port, R. F. (2005). Effects of timing regularity and metrical expectancy on spoken-word perception. Phonetica, 62, 1–13.
Ramus, F. (2002). Language discrimination by newborns: Teasing apart phonotactic, rhythmic, and intonational cues. Annual Review of Language Acquisition, 2, 85–115.
Ramus, F., Hauser, M. D., Miller, C., Morris, D., & Mehler, J. (2000). Language discrimination by human newborns and by cotton-top tamarin monkeys. Science, 288, 349–351.
Roach, P. (1982). Linguistic controversies: Essay in linguistic theory and practice in honor of F. R. Palmer. London: Arnold.
Saito, S., & Ishio, A. (1998). Rhythmic information in working memory: Effects of concurrent articulation on reproduction of rhythms. Japanese Psychological Research, 40, 10–18.
Sanders, L. D., & Neville, H. J. (2000). Lexical, syntactic, and stress-pattern cues for speech segmentation. Journal of Speech, Language, and Hearing Research, 43, 1301–1321.
Sansavini, A. (1997). Neonatal perception of the rhythmical structure of speech. Early Development and Parenting, 6, 3–13.
Selkirk, E. (1986). On derived domains in sentence phonology. Phonology Yearbook, 3, 371–405.
Silva-Pereyra, J. F., & Carreiras, M. (2007). An ERP study of agreement features in Spanish. Brain Research, 1185, 201–211.
Slowiaczek, L. M. (1990). Effects of lexical stress in auditory word recognition. Language and Speech, 33, 47–68.
Smith, K. C., & Cuddy, L. L. (1989). Effects of metric and harmonic rhythm on the detection of pitch alterations in melodic sequences. Journal of Experimental Psychology: Human Perception and Performance, 15, 457–471.
Steinhauer, K. (1995). Hirnelektrische Korrelate sprachlicher Verarbeitungsprozesse beim Lesen lokal ambiger Relativsätze. Unpublished master's thesis, Free University Berlin.
Steinhauer, K. (2003). Electrophysiological correlates of prosody and punctuation. Brain and Language, 86, 142–164.
Steinhauer, K., Alter, K., & Friederici, A. D. (1999). Brain potentials indicate immediate use of prosodic cues in natural speech processing. Nature Neuroscience, 2, 191–196.
Tokowicz, N., & MacWhinney, B. (2005). Implicit and explicit measures of sensitivity to violations in second language grammar: An event-related potential investigation. Studies in Second Language Learning, 27, 173–204.
van Herten, M., Kolk, H., & Chwilla, D. (2005). An ERP study of P600 effects elicited by semantic anomalies. Brain Research, Cognitive Brain Research, 22, 241–255.
Vissers, C. T. W. M., Chwilla, D. J., & Kolk, H. H. J. (2007). The interplay of heuristics and parsing routines in sentence comprehension: Evidence from ERPs and reaction times. Biological Psychology, 75, 8–18.