In auditory neuroscience, electrophysiological synchronization to low-level acoustic and high-level linguistic features is well established—but its functional purpose for verbal information transmission is unclear. Based on prior evidence for a dependence of auditory task performance on delta-band oscillatory phase, we hypothesized that the synchronization of electrophysiological responses at delta-band frequency to the speech stimulus serves to implicitly align neural excitability with syntactic information. The experimental paradigm of our auditory EEG study uniformly distributed morphosyntactic violations across syntactic phrases of natural sentences, such that violations would occur at points differing in linguistic information content. In support of our hypothesis, we found behavioral responses to morphosyntactic violations to increase with decreasing syntactic information content—in significant correlation with delta-band phase, which had synchronized to our speech stimuli. Our findings indicate that rhythmic electrophysiological synchronization to the speech stream is a functional mechanism that may align neural excitability with linguistic information content, optimizing language comprehension.
To accurately extract the entire linguistic information conveyed by speech, listeners have to decode the hidden syntactic information coded in speech. The decoding of syntactic information is critical for language comprehension, because it partially determines the computation of compositional meaning from individual words (Bonhage, Meyer, Gruber, Friederici, & Mueller, 2017); syntactic information also allows for overcoming capacity limitations, as sentences' word count often exceeds verbal working memory capacity—when words occur in random sequences, individual words are not remembered too well; but when word sequences allow for the grouping of individual words into syntactic phrases, the same words are remembered much better (Schremm, Horne, & Roll, 2015; Roll, Lindgren, Alter, & Horne, 2012; Wingfield & Byrnes, 1972; Miller, 1962).
Electrophysiological responses at delta-band frequency have been found to align with syntactic phrase structure during speech comprehension. On the one hand, delta-band oscillations have been found to track the superficial acoustic markings of syntactic structures that are present in speech prosody as pitch modulations (Ghitza, 2016; Bourguignon et al., 2013; Frazier, Carlson, & Clifton, 2006). On the other hand, the frequency of delta-band oscillations can be experimentally driven to match the occurrence frequency of syntactic phrases (Zhang & Ding, 2016; Ding, Melloni, Zhang, Tian, & Poeppel, 2015), delta-band oscillatory phase is shifted by internally generated syntactic structure in spite of the absence of acoustic markings (Meyer, Henry, Gaston, Schmuck, & Friederici, 2017), and delta-band power increases with the presence of syntactic structure (Bonhage et al., 2017). That is, delta-band oscillations are sensitive to syntactic structure independent of the electrophysiological tracking of acoustic information at the phonemic, syllabic, and intonation phrasal rates (Molinaro, Lizarazu, Lallier, Bourguignon, & Carreiras, 2016; Bourguignon et al., 2013; Gross et al., 2013; Peelle, Gross, & Davis, 2013; Giraud & Poeppel, 2012; Lehongre, Ramus, Villiermet, Schwartz, & Giraud, 2011; for a review, see Meyer, 2017).
It is unclear how the alignment of electrophysiological responses at delta-band frequency with the speech stimulus links to their general role in the modulation of neural excitability. Excitability of auditory regions can fluctuate with delta phase and is maximal at phase troughs, benefitting auditory task performance (Schroeder & Lakatos, 2009; Lakatos, Karmos, Mehta, Ulbert, & Schroeder, 2008): When delta-band oscillations are experimentally driven into synchronicity with a rhythmic acoustic stimulus, short gaps in the stimulus are detected best at specific phase angles of the delta-band cycle (Henry & Obleser, 2012). One possibility emerging from these findings is that delta phase is a vehicle for preallocating neural excitability to informative stimuli and thereby optimizing behavioral performance (Schroeder & Lakatos, 2009; Lakatos et al., 2008; Shannon, 1951). In line with this, delta-band phase has been found to speed up the detection of auditory stimuli, depending on their information-theoretic expectedness, that is, their information content (e.g., Ng, Schroeder, & Kayser, 2012; Stefanics et al., 2010).
Based on the proposed functions of delta-band oscillations, we hypothesized here that the phase alignment between electrophysiological responses at delta-band frequency and the speech stimulus would indeed result in an implicit alignment between neural excitability and fine-grained syntactic information: During processing, the syntactic category of each incoming word adds syntactic information to the syntactic structure that the listener incrementally computes (Levy, 2008; Hale, 2001)—in information-theoretic terms, incoming words are either noninformative (i.e., when expected from prior knowledge) or informative (i.e., when unexpected). Hence, it is conceivable that once delta-band oscillations phase-align with the stimulus, excitable neural phases could be in optimal alignment with syntactic information, benefitting its processing and thus task performance. Importantly, our hypothesis was not committed to expecting either speech prosody driving delta-band oscillations into alignment with syntactic information content (i.e., via entrainment proper) or delta-band oscillations aligning with the internally generated syntactic structure hidden in the speech stream. To test our hypothesis, we first quantified syntactic information content of our experimental stimuli, employing the syntactic surprisal metric: Syntactic surprisal is a computational measure of the unexpectedness of the syntactic category of an incoming word, modeling tacit grammatical knowledge (Roark, Bachrach, Cardenas, & Pallier, 2009; Levy, 2008; Hale, 2001). In information-theoretic terms, the degree of unexpectedness equals a word's information content: When the syntactic category of a word is expected (e.g., a noun following an article), the word's information content is low; conversely, when the syntactic category is unexpected (e.g., an adverb following an article), the word's information content is high. 
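The relation between expectedness and information content can be illustrated with a minimal sketch. Surprisal is standardly defined as the negative log probability of an event given its context (Levy, 2008; Hale, 2001). The conditional probabilities below are invented for illustration, do not sum to one, and are not taken from the grammar used in this study; the function name is likewise hypothetical.

```python
import math

def surprisal_bits(p_category_given_context):
    """Surprisal in bits: -log2 P(syntactic category | prior context)."""
    if not 0.0 < p_category_given_context <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log2(p_category_given_context)

# Hypothetical probabilities of the next word's category after an article.
p_after_article = {"noun": 0.5, "adjective": 0.25, "adverb": 0.125}

for category, p in p_after_article.items():
    # Expected categories carry few bits; unexpected ones carry many.
    print(f"{category}: {surprisal_bits(p):.1f} bits")
```

Under these assumed probabilities, the expected noun carries less information than the unexpected adverb, mirroring the article example given above.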
Critically, our experimental paradigm allowed us to quantify participants' electrophysiological and behavioral responses during stimulus periods differing in syntactic information content. Our auditory sentence stimuli contained morphosyntactic violations that were spaced quasiuniformly across syntactic phrases, such that violations would occur at positions differing in syntactic information content. We acquired EEG data during auditory sentence comprehension. Participants' behavioral violation detection performance was recorded. First, we hypothesized that violation detection performance would decrease with syntactic information content: In a situation where most of the upcoming information had already been predicted anyway, participants would easily spot any violation, as it is highly salient, strongly against their predictions. In contrast, when participants' predictions are weak, a morphosyntactic violation would be hard to spot, because it would be less salient—in nonpredictive contexts, many different things can happen. Second, we hypothesized that electrophysiological responses at delta-band frequency would align in phase with the syntactic information coded in our speech stimuli (Meyer et al., 2017; Ding et al., 2015). Third, as a result, we hypothesized that behavioral performance should be correlated with the phase of the EEG at delta-band frequency. Together, this would show that delta-band synchronization to speech aligns neural excitability with syntactic information.
Twenty-three right-handed (Oldfield, 1971) participants (12 men, mean age = 24.52 years, SD = 2.21 years) participated in the study; none reported neurological or hearing disorders. Participants were naive about the purpose of the study and received reimbursement (€21) for participation.
Sentences containing two morphosyntactic violations each were employed to assess our hypothesis (Figure 1A). Violations were restricted to the sentence-initial and sentence-final phrases to allow for nonoverlapping RT windows. Violations were distributed uniformly across 10 different positions increasing in latency from the onset of each of the two violated phrases, such that violations would associate with different degrees of syntactic informativity (Figure 1B; see below). Each item consisted of a subject noun phrase (Die ziemlich berühmte Künstlerin/The somewhat famous artist), a verb (verwirklichte/conducted), an object noun phrase (das politisch umstrittene Projekt/the politically controversial project), and a prepositional phrase modifier of the verb phrase (nach einem kürzlich erlassenen Urteil/after a recently issued judgment). Length (i.e., syllable count) was matched across items within sentential positions (except for the verb to increase productivity). Violations were either morphosyntactic or syntactic-categorical; this mixture was inevitable to allow for violations at phrase onset, where agreement violations cannot occur. Syntactic-categorical violations were created by replacing words with words of a syntactically wrong category (e.g., eins Künstlerin/a single one artist [the numeral eins/a single one cannot be used as a determiner]); morphosyntactic violations were created by an agreement mismatch (gender, number, or case; e.g., die junge [singular] Künstlerinnen [plural]/the young [singular] artists [plural]). All violations were latency-adjusted by either replacing words with words of the same word class but different syllable counts or removing optional words (e.g., adverbs). Violation time points were defined as points in time when integration with the prior syntactic context became impossible.
Sentences containing a single violation each were recorded (48 sentences × 10 morphosyntactic violations per phrase × 2 violated phrases) with neutral intonation by a professional male speaker in a soundproof cabin. Audio editing was performed in PRAAT (Boersma & Weenink, 2001). The final stimulus sentences including two violations each (48 sentences × 10 morphosyntactic violations in the first noun phrase × 10 morphosyntactic violations in the prepositional phrase) were constructed by splicing. Sound files were normalized to 65 dB SPL, and onset and offset ramps of 50 msec duration were attached to avoid acoustic edge artifacts (Meyer et al., 2017; Meyer, Grigutsch, Schmuck, Gaston, & Friederici, 2015; Meyer, Obleser, & Friederici, 2013). Average duration of sound files was 5.81 sec (SD = 0.49 sec); average duration of syntactic phrases was 2.05 sec (SD = 1.42 sec).
To test our hypothesis on a relationship between delta-band phase and linguistic information content, we calculated syntactic surprisal (i.e., contextual unpredictability of a word's syntactic category; Hale, 2001) for our original stimuli (i.e., sentences that did not contain violations). The choice to calculate surprisal on the nonviolated stimuli was made for various reasons: First and most importantly, surprisal is in essence a measure of unpredictability; that is, surprisal quantifies the new information in the stimulus that has not been predicted from the prior context and thus needs to be extracted from the speech stream. The syntactic information available in the stimulus is thus best modeled as the information that is present at each point in the stimulus under normal circumstances (i.e., without a violation), minus the information that is known to the listener from the prior context. As a second reason, the calculation of surprisal for morphosyntactic and syntactic violations would require a large annotated corpus of such violations, which does not exist. To calculate surprisal, a probabilistic context-free grammar was derived from the TIGER treebank (Brants et al., 2004) using a freely available incremental top–down parsing algorithm (Roark et al., 2009). For each word of each phrase of our stimuli, syntactic surprisal was then calculated. Within phrase, the vector was up-sampled to each word's syllable count and interpolated to the 10 exact violation time points (Figure 1B).
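The up-sampling and interpolation step can be sketched as follows. This is only one possible reading of the procedure: the sketch assumes that each word's surprisal is repeated once per syllable and that the 10 violation time points are evenly spaced along the syllable axis, with linear interpolation between syllables. The surprisal values, the function names, and the even spacing are hypothetical; the syllable counts are our own count for the example phrase.

```python
def upsample_by_syllables(word_surprisal, syllable_counts):
    """Repeat each word's surprisal value once per syllable of that word."""
    series = []
    for value, n_syllables in zip(word_surprisal, syllable_counts):
        series.extend([value] * n_syllables)
    return series

def interpolate(series, positions):
    """Linearly interpolate a series (indexed 0..len-1) at fractional positions."""
    out = []
    for x in positions:
        lo = min(int(x), len(series) - 2)  # keep lo + 1 inside the series
        frac = x - lo
        out.append(series[lo] * (1 - frac) + series[lo + 1] * frac)
    return out

# "Die ziemlich berühmte Künstlerin": 1 + 2 + 3 + 3 syllables.
syllables = [1, 2, 3, 3]
surprisal = [1.0, 4.0, 2.0, 1.0]  # hypothetical word-level surprisal (bits)

series = upsample_by_syllables(surprisal, syllables)
# Ten evenly spaced positions across the phrase (an assumption of this sketch).
positions = [i * (len(series) - 1) / 9 for i in range(10)]
at_violations = interpolate(series, positions)
print([round(v, 2) for v in at_violations])
```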
Of the 48 × 10 × 10 combinations, 2 × 10 × 10 served as training items and 46 × 10 × 10 served as test items. Using MATLAB (The MathWorks, Inc., Natick, MA), the 4600 stimuli were distributed among 23 individual pseudorandomized lists, counterbalancing violation conditions. Each participant thus received 400 violations in 200 stimuli. For training, each participant was presented with eight stimuli from the training item pool.
Participants were seated in a dimly lit, electrically shielded, and soundproof cabin. Stimuli were presented using Presentation software (Neurobehavioral Systems, Inc., Albany, CA). Sentences were played through stereo loudspeakers (Harman International Industries, Inc., Stamford, CT) located about 100 cm in front of the participant. Fixation crosses were presented against a gray background on a CRT computer screen (Sony Corporation, Tokyo, Japan). At trial onset, a green fixation cross was presented for 1500 msec, which transitioned to red during stimulus playback and remained on screen for 2000 msec after offset. The cross then transitioned to green and remained on screen for 1500 msec before the onset of a new trial. To reduce the number of blink artifacts, participants were instructed to blink during green crosses only. Participants were instructed to respond as fast as possible to violations via button press. Before the experiment, participants were familiarized with the task in a training session, which they could repeat freely until acquainted with the task. The experiment was split into four blocks to avoid fatigue. Between blocks, participants could take a self-paced break. Block duration was 10 min. The experiment lasted about 2 hr (including preparation).
RTs were recorded using a one-button button box. The EEG was recorded from a BrainVision BrainAmp DC amplifier (Brain Products GmbH, Munich, Germany) using a 64 Ag/AgCl channel setup mounted on an elastic cap (ANT Neuro, Enschede, the Netherlands), according to the extended international 10–20 system. Channels were referenced to the left mastoid (i.e., Channel A1) and grounded to the sternum. Vertical electrooculograms (EOGs) were recorded from channels above and below the right eye. Horizontal EOGs were recorded from the outer canthi of both eyes. Signals were recorded from DC to 250 Hz at a sampling rate of 500 Hz. Channel impedances were kept below 10 kΩ.
Data analysis was performed in MATLAB. Trials with RTs above 2000 msec were defined as misses and excluded from analysis because of ambiguity between responses to the first and second violations in some sentences. Outlier trials were removed within participant, across trials, using the box plot method (Tukey, 1977; mean rejection percentage = 5.03%, SD = 1.64%). RTs were averaged within each of the 10 violation time points. Because misses were rare and ambiguous (see Results), we focused on analyzing RTs only. To assess whether RTs would increase with syntactic information content, we correlated average RTs with average surprisal within participant; the resulting coefficients were Fisher z-transformed, and group-level significance was assessed through a one-sample t test.
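The behavioral pipeline described above (outlier removal, within-participant correlation, Fisher z-transform, group-level t test) can be sketched in a few lines. The analysis itself was run in MATLAB; the Python below is an illustration only. The fence multiplier of 1.5, the quantile method, all function names, and all data values are assumptions of this sketch, not details reported in the text.

```python
import math
import statistics

def tukey_keep(values, k=1.5):
    """Keep values inside the box plot fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [v for v in values if q1 - k * iqr <= v <= q3 + k * iqr]

def fisher_z(r):
    """Fisher z-transform of a correlation coefficient (undefined for |r| = 1)."""
    return math.atanh(r)

def one_sample_t(values, popmean=0.0):
    """t statistic of a one-sample t test with len(values) - 1 degrees of freedom."""
    n = len(values)
    return (statistics.fmean(values) - popmean) / (statistics.stdev(values) / math.sqrt(n))

# Hypothetical RTs (msec) of one participant; the slow trial is removed.
rts = [412, 455, 430, 1280, 441, 467, 438]
print(tukey_keep(rts))

# Hypothetical within-participant correlations between mean RT and surprisal.
rs = [0.66, 0.44, 0.70, 0.52, 0.31]
zs = [fisher_z(r) for r in rs]
print(f"t({len(zs) - 1}) = {one_sample_t(zs):.2f}")
```

The p value would follow from the t distribution with n − 1 degrees of freedom, which is omitted here to keep the sketch free of external dependencies.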
Processing of the EEG data was performed using FieldTrip (Oostenveld, Fries, Maris, & Schoffelen, 2011). Epochs of 6 sec were created, 3 sec before and 3 sec after the violation, to avoid filtering artifacts (see below). To minimize slow drifts, the data were high-pass filtered at 0.1 Hz using a two-pass Kaiser-windowed finite impulse response (FIR) filter (Widmann, Schröger, & Maess, 2015). Recordings were re-referenced offline to the average of all channels (excluding the EOG). Afterward, due to missing or artifact-heavy recordings in three participants, channels were interpolated based on the average of the surrounding channels (Participant 9: T8; Participant 18: O2, PO4, POZ; Participant 22: F5) on missing or artifact-heavy trials only. Muscle artifacts were detected with a semiautomatic distribution-based approach (z = 4) and rejected after visual inspection. On average, 36.76% (SD = 7.95%) of trials were rejected, the high percentage resulting from two factors: First, any nonrejected trial would require 6 sec of artifact-free data; second, each sentence contained two violations; thus, a single muscle artifact in the vicinity of the stimulus would result in the removal of two trials. Independent component analysis was then performed to correct pulse and blink artifacts. To improve classification accuracy, independent component analysis was run on high-pass filtered data (1 Hz cutoff, two-pass Kaiser-windowed FIR filtered; Winkler, Debener, Müller, & Tangermann, 2015). To-be-rejected components were detected visually (topography and waveform) and rejected from the 0.1-Hz high-pass filtered data. On average, 11.52% (SD = 4.18%) of components were rejected. To derive delta-band phase, the preprocessed data were filtered with a sixth-order two-pass Butterworth infinite-impulse-response 25-Hz low-pass filter, down-sampled to 100 Hz, and filtered again with an optimal (Parks & McClellan, 1972) 2148th-order linear-phase FIR 0–4 Hz low-pass filter.
Phase shift was corrected for by a corresponding time shift, and analytic phase was derived via the Hilbert transform.
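The size of the time shift that undoes the filter's phase shift is not reported, but it follows from a standard property of symmetric (linear-phase) FIR filters: all frequencies are delayed by half the filter order. For the 2148th-order filter applied at 100 Hz, this would correspond to 1074 samples, or 10.74 sec. The sketch below computes this value and demonstrates the correction on a toy impulse; the short filter and the function names are hypothetical and only serve the illustration.

```python
def group_delay_samples(filter_order):
    """Delay of a symmetric (linear-phase) FIR filter: half the filter order."""
    return filter_order / 2

def fir_filter(signal, taps):
    """Causal FIR filtering by direct convolution (output as long as the input)."""
    return [
        sum(h * signal[n - k] for k, h in enumerate(taps) if n - k >= 0)
        for n in range(len(signal))
    ]

def undo_delay(filtered, delay):
    """Shift the filtered signal earlier by `delay` samples, zero-padding the end."""
    return filtered[delay:] + [0.0] * delay

# Values reported in the text: 2148th-order filter, 100 Hz sampling rate.
delay = group_delay_samples(2148)
print(f"{delay:.0f} samples = {delay / 100:.2f} s")

# Toy demonstration with a hypothetical fourth-order symmetric filter.
taps = [1 / 9, 2 / 9, 3 / 9, 2 / 9, 1 / 9]
impulse = [0.0] * 20
impulse[10] = 1.0
filtered = fir_filter(impulse, taps)
aligned = undo_delay(filtered, int(group_delay_samples(len(taps) - 1)))
# The filtered peak lags the impulse; after the shift it lines up again.
print(filtered.index(max(filtered)), aligned.index(max(aligned)))
```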
Statistical analysis of phase was performed using the CircStats toolbox (Berens, 2009). As we did not have a topographical hypothesis, initial analysis was performed at all channels; resulting p values were Bonferroni-corrected (Dunnett, 1955). To assess phase-locking to the speech stimulus, we first averaged within participants and within violation bins and then quantified the nonuniformity of the phase distribution across participants within each violation position and channel using Rayleigh's tests; this test is commonly used to quantify phase consistency across time-locked electrophysiological responses; specifically, significant nonuniformity of phase is taken as an index of an alignment between an electrophysiological oscillation and an external stimulus, in particular for delta-band oscillations (Soltesz, Szucs, Leong, White, & Goswami, 2013; Stefanics et al., 2010; Lakatos et al., 2008). Here, we hypothesized significant nonuniformity of phase within each violation position, indicating significant synchronization to the stimulus across participants. Subsequent analyses were then restricted to the channel that showed maximal phase nonuniformity across violation positions, that is, the channel where delta-band phase was most consistent across participants (i.e., F7; see Results; e.g., Herrmann, Henry, Grigutsch, & Obleser, 2013). The relationships between stimulus-synchronized phase and surprisal (factoring out the time point of violation occurrence; see Results) and phase and RTs (factoring out the time point of violation occurrence; see Results) were also assessed using within-subject circular–linear correlation analysis. This test assesses the linear association between a linear and a circular variable, correlating the linear variable with the cosine and sine transforms of the circular variable independently (Berens, 2009). 
Because the sampling distribution of correlation coefficients is skewed, the resulting coefficients were Fisher z-transformed (Fisher, 1915), and group-level significance of correlations was assessed through a one-sample t test. We hypothesized a correlation between phase and residual surprisal as well as between phase and residual RT, indicating an alignment of neural excitability with syntactic information.
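The circular–linear correlation can be sketched from its textbook definition, which combines the Pearson correlations of the linear variable with the sine and cosine of the phase. We assume, but have not verified line by line, that this corresponds to what the toolbox computes. The data are hypothetical, and the p value uses the chi-square approximation with 2 degrees of freedom.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def circ_linear_corr(phases, values):
    """Circular-linear correlation and its approximate p value."""
    s = [math.sin(a) for a in phases]
    c = [math.cos(a) for a in phases]
    rxs, rxc, rcs = pearson_r(values, s), pearson_r(values, c), pearson_r(s, c)
    rho = math.sqrt((rxc ** 2 + rxs ** 2 - 2 * rxc * rxs * rcs) / (1 - rcs ** 2))
    # n * rho**2 is compared with a chi-square distribution (2 df),
    # whose survival function is exp(-x / 2).
    p = math.exp(-len(phases) * rho ** 2 / 2)
    return rho, p

# Hypothetical participant: 10 violation positions, RTs varying with phase.
phases = [2 * math.pi * k / 10 for k in range(10)]
rts = [400 + 50 * math.cos(a - 1.0) for a in phases]
rho, p = circ_linear_corr(phases, rts)
print(f"rho = {rho:.2f}, p = {p:.4f}")
```

Because the coefficient is a magnitude, it cannot be negative; this is one reason why group-level inference on such coefficients deserves care.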
Detection rate was high across participants and violation positions (mean detection rate = 95.89%, SD = 4.91%). Within each violation position, a large fraction of participants (minimum fraction = 6/23, maximum fraction = 15/23) showed detection rates of 100%. Given this ceiling effect, we chose to focus on RTs only for statistical analysis. RTs for violation detection were positively correlated with surprisal (median r = .66, first quartile = .44, third quartile = .70; across-participant one-sample t test, t(22) = 9.40, p = 3.7032e−09). Yet, in principle, this effect might have partially been driven by our experimental procedure: Morphosyntactic violations occurred in all experimental trials, potentially increasing hazard rate (i.e., participants' expectation for violation occurrence toward the end of the stimulus). To control for hazard rate, we repeated the analysis using partial linear correlations, factoring out the exact time point of violation occurrence. Because hazard rate could, in principle, have influenced RTs in a nonlinear fashion as well, we verified the validity of applying partial linear correlations via model comparison: First, we fitted RTs to violation occurrence time points with regression functions of increasing complexity (linear to eighth-degree polynomial, after which the functions became ill-conditioned); we quantified the models' goodness-of-fit by the adjusted r² metric. Second, we compared the adjusted r² of the polynomial models to that of the linear model (paired-samples t tests). None of the polynomials fitted the data significantly better (linear model: mean adjusted r² = .57, SD adjusted r² = .34; quadratic model to eighth-degree polynomial model: range of mean adjusted r² = .58–.66, range of SD adjusted r² = .28–.87; all −1.59 < t(22) < −0.06); hence, we decided to keep the original partialization procedure.
After factoring out hazard rate, surprisal remained a significant predictor of RTs (median r = .28, first quartile = −.08, third quartile = .37; across-participant one-sample t test, t(22) = 2.20, p = .04). Together, this supports our hypothesis that behavioral performance was inversely related to syntactic information content. Yet, because correlations were affected substantially when factoring out hazard rate, we chose to employ residual RTs and residual surprisal values in all subsequent analyses (i.e., factoring out hazard rate; Figure 2).
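The two quantities used to control for hazard rate can be sketched from their standard formulas: the first-order partial correlation and the adjusted coefficient of determination. All data values and function names below are hypothetical. Note that with 10 observations per participant, the denominator n − p − 1 of the adjusted metric would be zero for a ninth-degree polynomial, so in this sketch degree eight is the highest for which the metric is defined.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation between x and y after factoring out z (linear partialization)."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared; penalizes model complexity."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Hypothetical participant: 10 violation time points (sec), surprisal, mean RTs.
time_points = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0]
surprisal = [3.1, 2.8, 3.5, 2.2, 2.6, 1.9, 2.4, 1.5, 1.8, 1.2]
rts = [520, 505, 540, 470, 490, 450, 465, 430, 445, 410]

print(f"raw r = {pearson_r(surprisal, rts):.2f}")
print(f"partial r = {partial_corr(surprisal, rts, time_points):.2f}")
# The same raw fit is penalized more heavily for a more complex model.
print(adjusted_r2(0.60, 10, 1), adjusted_r2(0.60, 10, 8))
```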
Phase exhibited the most significant nonuniformity at Channel F7, where phase was nonuniform within every violation position (all 2.51 < κ < 8.96; Rayleigh's tests, all 295.74 < z < 500.62, all p < .001, Bonferroni-corrected for sensors and violation positions; Figure 3A and B). This is an indication that electrophysiological responses at delta-band frequency were successfully phase-locked to the syntactic information coded in our stimuli: If there was no relationship between stimulus and phase, then all possible phases should be observed equally often leading to a uniform distribution; if, on the other hand, there was a relationship between stimulus and phase, then certain parts of the oscillatory cycle—and thus, certain phase angles—would be overrepresented within violation position, across participants. To substantiate this interpretation, we first calculated intertrial phase coherence (ITPC; Lachaux, Rodriguez, Martinerie, & Varela, 1999) within participant, within violation bin, across trials; we then compared ITPC values across participants against zero using a one-sample t test. Within participant, within violation bin, we then generated 10,000 random phase distributions, matching the number of trials in the observed data for that participant and calculated the random ITPC. We then ran 10,000 across-participant one-sample t tests against zero, sorted the resulting t statistics, and compared the t statistic on the observed ITPC values to the t statistic on the random ITPC values at the 95th percentile. This test was significant for 8 of 10 bins (one-sample t tests, all 12.86 < t(22) < 18.54, all 0 < p < .02), suggesting that phase nonuniformity indeed reflected phase-locking (Maris & Oostenveld, 2007; for a similar procedure, see van Diepen, Cohen, Denys, & Mazaheri, 2015).
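The logic of comparing observed phase coherence against surrogate data can be illustrated at the level of a single participant. This is a deliberate simplification: the procedure described above operates on group-level t statistics, whereas the sketch below only compares one observed ITPC value with the 95th percentile of ITPC values from uniformly random phases. Trial counts, phase values, the reduced number of surrogates, and function names are hypothetical. The sketch also shows why a comparison against zero alone is not sufficient: ITPC is bounded at zero and is therefore positive even for random phases.

```python
import cmath
import math
import random

def itpc(phases):
    """Intertrial phase coherence: length of the mean unit phase vector (0 to 1)."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)

def rayleigh_z(phases):
    """Rayleigh statistic z = n * r**2, with r the mean resultant length."""
    return len(phases) * itpc(phases) ** 2

def surrogate_itpcs(n_trials, n_surrogates, rng):
    """ITPC values for sets of uniformly random phases with matched trial count."""
    return [
        itpc([rng.uniform(-math.pi, math.pi) for _ in range(n_trials)])
        for _ in range(n_surrogates)
    ]

rng = random.Random(0)  # fixed seed so that the sketch is reproducible
# Hypothetical participant: 40 trials with phases clustered around 0.5 rad.
observed = [rng.gauss(0.5, 0.8) for _ in range(40)]
null = sorted(surrogate_itpcs(n_trials=40, n_surrogates=1000, rng=rng))
threshold = null[int(0.95 * len(null))]  # 95th percentile of surrogate ITPCs

print(f"observed ITPC = {itpc(observed):.2f}")
print(f"surrogate 95th percentile = {threshold:.2f}")
print(f"Rayleigh z = {rayleigh_z(observed):.2f}")
```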
There were significant correlations between phase and residual surprisal (median r = .43, first quartile = .17, third quartile = .68; across-participants one-sample t test, t(22) = 6.81, p = 7.5979e−07; Figure 3D) as well as between phase and residual RTs (median r = .45, first quartile = .23, third quartile = .60; across-participants one-sample t test, t(22) = 8.10, p = 4.8245e−08; Figure 3C). Control analyses (see below) suggested that the set of correlations between surprisal and RTs, phase and surprisal, and phase and RTs is evidence that phase was aligned to syntactic information within the syntactic phrase, predicting behavioral performance.
To substantiate our claim that phase nonuniformity at sensor F7 indeed reflected significant phase-locking to the syntactic structure of our stimuli, rather than to superficial acoustic cues of speech prosody (i.e., pitch; Bourguignon et al., 2013), we first calculated phase-locking to surprisal and pitch across the whole sentence. We found that phase-locking to surprisal and pitch did not differ significantly (paired-samples t test, t(22) = −0.95, p = .35), leaving the question whether delta-band oscillations had synchronized to abstract stimulus features (i.e., surprisal) or physical characteristics (i.e., pitch) unanswered.
To ensure that phase-locking was not artificially increased by our experimental manipulation, we calculated phase-locking separately for the whole sentence and for the violation-containing segments of the sentence; as phase-locking was not found to differ between surprisal and pitch, we averaged across both. Phase-locking did not significantly differ between the whole sentence and the violation-containing segment (paired-samples t test, t(22) = 0.99, p = .34), suggesting that our experimental manipulation was not the main source of synchronization.
To support our interpretation that the correlations between surprisal and RT, phase and surprisal, and phase and RT together indeed suggest that phase intervenes between information content of the stimulus and behavioral performance, we compared the correlation coefficients for the phase–surprisal and the phase–RT correlation to the coefficients for the surprisal–RT correlation. We hypothesized that the correlation between surprisal and RTs should be significantly lower than the correlations each between phase and surprisal and phase and RT. This hypothesis was supported statistically (paired-samples t tests; surprisal–RT vs. phase–surprisal: t(22) = −2.86, p = .009; surprisal–RT vs. phase–RT: t(22) = −2.51, p = .020; Figure 4), indicating that phase is a likely intervening variable between stimulus information content and behavioral performance.
The findings can be taken as evidence that when delta-band oscillations synchronize with the speech stream, neural excitability is implicitly aligned with syntactic information (Roark et al., 2009; Levy, 2008; Shannon, 1951), predicting behavioral performance in response to the stimulus. We thus connect the roles of delta-band oscillations in the processing of syntactic structure during speech comprehension (Meyer et al., 2017; Ding et al., 2015), the fluctuation of neural excitability over time (Henry & Obleser, 2012; Schroeder & Lakatos, 2009; Lakatos et al., 2008), and the facilitation of information processing (Henry & Obleser, 2012; Stefanics et al., 2010). First, we observed the behavioral detection of morphosyntactic violations to mirror syntactic information. Second, delta-band oscillations aligned with our stimuli. Third, the phase gradient within syntactic phrase predicted behavioral violation detection. Control analyses suggested that phase was indeed intervening between stimulus and behavior.
Our results provide preliminary evidence that the synchronization of delta-band oscillations with the speech stream serves to align neural excitability with information that is maximally predicted from a high-level linguistic point of view. By showing that such a synchronization is intrinsically functional, this extends prior work that has found delta-band oscillations to synchronize with both superficial acoustic cues and abstract syntactic structure during speech processing (Bonhage et al., 2017; Meyer et al., 2017; Ding et al., 2015; Bourguignon et al., 2013), independent of speech onset-related ERPs (Zhang & Ding, 2016), and the relative time point within a sentence (Zhang & Ding, 2016). Although our control analyses leave it open whether concrete acoustic stimulus characteristics or abstract syntactic structure drove delta-band oscillations into synchronicity, our analyses do suggest that synchronicity facilitates the processing of genuine syntactic information. As suggested by a reviewer, the fact that we observed a relationship between phase and syntactic surprisal specifically could mean that, by aligning with speech, neural excitability is steered to upcoming syntactic information by taking into account expectations derived from internal grammatical knowledge (e.g., Herrmann, Maess, Hasting, & Friederici, 2009). In line with this idea, we note that phase was not only correlated with syntactic surprisal but also with syntactic entropy (median r = .48, first quartile = .31, third quartile = .60; across-participants one-sample t test, t(22) = 9.07, p = 6.9247e−09)—which is a measure of the strength of a syntactic expectation derived from the syntactic structure that has been generated before the incoming, more or less surprising information (e.g., Hale, 2001, 2016). 
Yet, correlation coefficients did not differ between the surprisal and entropy correlations (across-participant paired-samples t test, t(22) = −0.57, p = .58); we are thus careful in interpreting this observation, and further research is needed.
The current results also reveal that higher-level linguistic violation–detection performance fluctuates with delta-band phase, extending prior work that reported behavioral performance in low-level auditory tasks and putatively neural excitability to cofluctuate with delta-band oscillatory phase (Hickok, Farahbod, & Saberi, 2015; Henry & Obleser, 2012; Schroeder & Lakatos, 2009; Lakatos et al., 2008) to the sentence level. Furthermore, this supports the general hypothesis that the pace of periods of high neuronal excitability, as reflected by attentive delta-band phase angles, aligns neural excitability and stimulus informativeness (Barne, Claessens, Reyes, Caetano, & Cravo, 2016; Schroeder, Wilson, Radman, Scharfman, & Lakatos, 2010; Stefanics et al., 2010). In general, our results support the proposal that high-level language comprehension is parasitic upon electrophysiological excitability gradients that manifest in the analytic properties of neural oscillations across cognitive domains (Friederici & Singer, 2015; Meyer et al., 2013, 2015).
Although we propose that our results are evidence for the interpretation that delta-band oscillations are a neural mechanism that supports the decoding of high-level linguistic information, an alternative interpretation is that the responses observed here are merely signatures of an underlying train of transient electrophysiological responses disguised as oscillatory (e.g., Ding & Simon, 2014; Klimesch, Sauseng, Hanslmayr, Gruber, & Freunberger, 2007). In particular, experimental paradigms that employ syntactic or morphosyntactic violations are notorious for eliciting ERPs such as the LAN and P600 effects (Molinaro, Barber, & Carreiras, 2011; Kaan & Swaab, 2003; Kluender & Kutas, 1993; Osterhout & Holcomb, 1992). We note here cautiously that our control analyses suggest that, in the current study, ERPs were not the major driving force behind the observed across-participant phase-locking that we interpret here in terms of phase alignment to our stimuli: In the case of an ERP in disguise, phase-locking should have been significantly higher for the violation-containing sentence segment as compared with the whole sentence, because the violation-containing segments would have been the only data segments containing a violation-related ERP, which would have led to increased phase-locking; in contrast, phase-locking to the whole sentence would have been dominated by non-ERP data, because there were no syntactic or morphosyntactic violations. Yet, although it has been argued that oscillatory synchronization also occurs in the absence of rhythmic amplitude modulations (Henry & Obleser, 2012; Obleser, Herrmann, & Henry, 2012), rhythmicity of oscillatory synchronization is to some extent robust to decreased stimulus rhythmicity (cf. Calderone, Lakatos, Butler, & Castellanos, 2014; Mathewson et al., 2012), and auditory processing performance transiently keeps stimulation frequency even after stimulation offset (e.g., Hickok et al., 2015; Neuling, Rach, Wagner, Wolters, & Herrmann, 2012), our current paradigm cannot fully rule out this alternative interpretation of the current effects.
A possible, yet inevitable, limitation of our experimental paradigm could be the association of surprisal with violation type: Syntactic-categorical violations were confined to phrase onsets, where surprisal was high. Conversely, morphosyntactic violations occurred across the phrase, where surprisal was more variable. We tested this by comparing for each experimental item the model deviance of a logistic regression model, including surprisal as a predictor of violation type, to the deviance of an intercept-only model, using a chi-square statistic. We confirmed the significance of these deviance differences (paired-samples t test; t(45) = −10.19, p = 3.1641e−14). Yet, this cannot explain the current result, as it would predict the direct opposite of the behavioral pattern observed here—faster RTs at phrase onset and slower RTs across the phrase: The electrophysiological response to syntactic-categorical violations peaks roughly 150–250 msec before the electrophysiological response to morphosyntactic violations (e.g., Molinaro et al., 2011; Friederici, Pfeifer, & Hahne, 1993) and is generated by distinct cortical areas (Jakuszeit, Kotz, & Hasting, 2013); critically, a prior direct comparison yielded no evidence for an inversion of this pattern in RTs (Rossi, Gugler, Hahne, & Friederici, 2005). As a further confirmation, an exclusion of syntactic-categorical violations from analysis increased the group-level significance of the correlation between residual surprisal and RTs (median r = .31, first quartile = .11, third quartile = .46; across-participant one-sample t test, t(22) = 3.47, p = .002).
We thank Angela D. Friederici and Thomas Gunter for very helpful discussion. We thank Philipp Kuhnke for methodological advice. We are obliged to Katrin Ina Koch for data acquisition. The Max Planck Society funded this research.
Reprint requests should be sent to Lars Meyer, Department of Neuropsychology, Max-Planck-Institut für Kognitions- und Neurowissenschaften, Stephanstraße 1A, Leipzig, Germany, 04103, or via e-mail: firstname.lastname@example.org.
This paper is part of a Special Focus deriving from a symposium at the 2017 annual meeting of Cognitive Neuroscience Society, entitled “Top–Down Functions of Neural Oscillations for Speech and Language Processing.”
Joint first authorship.