Language acquisition is a complex process that requires the synergic involvement of different cognitive functions, which include extracting and storing the words of the language and their embedded rules for progressive acquisition of grammatical information. As has been shown in other fields that study learning processes, synchronization mechanisms between neuronal assemblies might have a key role during language learning. In particular, studying these dynamics may help uncover whether different oscillatory patterns sustain more item-based learning of words and rule-based learning from speech input. Therefore, we tracked the modulation of oscillatory neural activity during the initial exposure to an artificial language, which contained embedded rules. We analyzed both spectral power variations, as a measure of local neuronal ensemble synchronization, as well as phase coherence patterns, as an index of the long-range coordination of these local groups of neurons. Synchronized activity in the gamma band (20–40 Hz), previously reported to be related to the engagement of selective attention, showed a clear dissociation of local power and phase coherence between distant regions. In this frequency range, local synchrony characterized the subjects who were focused on word identification and was accompanied by increased coherence in the theta band (4–8 Hz). Only those subjects who were able to learn the embedded rules showed increased gamma band phase coherence between frontal, temporal, and parietal regions.
Learning any new skill is a dynamic process that requires the coordination of different brain networks and cognitive functions. In particular, language is one of the most important skills that humans have to acquire early in life and often again later in adulthood. Investigation of oscillatory brain activity is an interesting tool used to approach the interplay between different cognitive processes and brain networks during language learning. Neural oscillations represent a fundamental mechanism that allows the precise coordination of activity between distant regions of the brain as well as regional synchronization. Additionally, this measure represents a remarkable instrument for the investigation of the rapid changes associated with learning-induced brain plasticity (Benchenane et al., 2010). These types of studies have been particularly relevant for understanding the brain dynamics underlying learning and memory but also for our understanding of how different cognitive functions relate to different synchronization patterns at different frequency bands (see Table 1 in Uhlhaas & Singer, 2010, for a recent review).
|Structure||Middle Syllables||Word||Nonword||Rule Word|
|le__di||ka, fi, ro||lerodi||dirole||lemadi|
|ba__gu||fe, pi, lo||bapigu||gupiba||badogu|
|pa__mi||te, la, ko||patemi||mitepa||pabumi|
|da__lu||na, tu, go||dagolu||lugoda||dabilu|
Middle syllables could be combined with the three structures of the language. Each language had a filler version with a random combination of the same syllables. Word, Nonword, and Rule Word columns provide examples of test items.
Although an increasing number of studies have recently focused on discovering the oscillatory activity behind language comprehension (Bastiaansen, Magyari, & Hagoort, 2010; Weiss et al., 2005; Hagoort, Hald, Bastiaansen, & Petersson, 2004; Weiss & Mueller, 2003; Bastiaansen, van Berkum, & Hagoort, 2002b; Rohm, Klimesch, Haider, & Doppelmayr, 2001), surprisingly, studies that investigate the brain dynamics in terms of oscillatory activity related to language acquisition (see Buiatti, Peña, & Dehaene-Lambertz, 2009) are lacking. In language comprehension, modulations associated with semantic processing have been reported at different frequency bands, with diverse results (Davidson & Indefrey, 2009; Weiss & Mueller, 2003; Rohm et al., 2001). More consistent results have been observed in relation to syntactic processing, with the low beta frequency band appearing to be specifically modulated as a function of syntactic complexity (Weiss et al., 2005; Weiss & Mueller, 2003) and displaying an interruption of the activity with unstructured sentences or upon the introduction of grammatical violations (Bastiaansen et al., 2010; Davidson & Indefrey, 2009). Gradual increases in theta power and coherence that are related to the progressive building up of the working memory trace of the linguistic input have also been found during sentence comprehension (Weiss et al., 2005; Bastiaansen, van Berkum, & Hagoort, 2002a). In this context, concurrent increased gamma coherence has been observed with greater syntactic complexity and has been interpreted as reflecting greater attentional effort (Weiss et al., 2005). Interestingly, the theta–gamma relationships in that study have been interpreted to be consistent with the idea that theta and gamma frequencies interact during memory encoding to store representations of sensory inputs (Lisman, 1999; Jensen & Lisman, 1998). 
Although this has not been explored yet, this mechanism might be particularly relevant in the course of language acquisition.
Despite the absence of language learning studies, some previous work on reading difficulties observed in dyslexic children has shown an association between theta activity and learning deficits (Klimesch et al., 2001). Additionally, a recent study has suggested that induced gamma band responses observed for musical sounds in professional musicians and children after musical training may be related to learning in many cognitive domains such as language (Trainor, Shahin, & Roberts, 2009). Thus, it is likely that language acquisition, like other general domain learning abilities, might engage the dynamic coordination of distinct neuronal assemblies that are related to different cognitive functions. The study of brain oscillatory activity has led to the observation that synchronization and desynchronization in different frequency bands allow the selection, blocking out, and enhancement of information (Schroeder & Lakatos, 2009). These oscillatory dynamics may help humans tune to speech in a manner that shows preferences for this type of stimulation from a few days of age (Vouloumanos & Werker, 2007) and progressively detect the different regularities that characterize speech input (Gomez & Maye, 2005; Saffran & Wilson, 2003; Marcus, Vijayan, Rao, & Vishton, 1999; Saffran, Aslin, & Newport, 1996). Thus, our cognitive system is equipped to decode the speech signal and eventually acquire the two major milestones that will allow us to develop complex adult language: word learning for lexical development and rule learning for grammar acquisition. Indeed, infants' language acquisition shows a functional distinction between the way our cognitive system is able to track words and rules in language. 
Infants start using words with no-rule-based productive variations (Tomasello & Brooks, 1999; Clark, 1998) before they start using rules productively, and second language learners have a hard time avoiding grammatical errors while flawlessly learning vocabulary (Weber-Fox & Neville, 1996). Data from infants at different ages and adults also suggest that learners have an initial tendency to rely first on the extraction of adjacent dependencies, which is relevant for word acquisition, before shifting to the detection of nonadjacent dependencies, which are more necessary for the extraction of grammatical relationships (Gomez & Maye, 2005; Gomez, 2002).
Our working hypothesis is that this synchronization mechanism might play a key role during language learning for the extraction of words and rules from speech input. Indeed, a recent study by Buiatti and colleagues nicely showed that the explicit extraction of words from a nonsense speech stream is accompanied by a frequency entrainment with perceived acoustic boundaries (Buiatti et al., 2009). In addition, it has also been shown that rapid changes in brain activity differentiate the extraction of words and rules from the speech stream. In a previous study (De Diego-Balaguer, Toro, Rodriguez-Fornells, & Bachoud-Levi, 2007), brain electrophysiological activity (ERPs) was registered while participants were trying to learn an artificial language that contained embedded rules. Through exposure time, a gradual increase in the amplitude of an early positive component in the range of 200 msec (P2) was observed to be correlated with rule learning performance. This modulation was similar to those observed in other studies on perceptual grouping (Snyder, Alain, & Picton, 2006; Reinke, He, Wang, & Alain, 2003; Hillyard, Hink, Schwent, & Picton, 1973) and was clearly dissociated from the N400 modulation, which is related to lexical acquisition (Cunillera et al., 2009; Dobel, Lagemann, & Zwitserlood, 2009; Cunillera, Toro, Sebastian-Galles, & Rodriguez-Fornells, 2006; Mestres-Misse, Rodriguez-Fornells, & Munte, 2006; McLaughlin, Osterhout, & Kim, 2004; Sanders, Newport, & Neville, 2002).
The aim of the present study was to further understand whether different specific cognitive mechanisms and brain dynamics underlie word and rule learning in the very early stages of exposure to a new language. The achievement of this goal requires the ability to track the evolution of the learning process in real time. In our previous study, we directly tackled the on-line language learning process involved in rule extraction (De Diego-Balaguer et al., 2007). However, whereas the ERP measures in that study were helpful to dissociate word learning from rule learning functionally, the specific cognitive functions and the brain dynamics that sustain these two types of learning could not be fully understood with that analysis. In the present work, we employed trial-by-trial wavelet-based time-frequency (TF) analysis to study the ongoing modulation of oscillatory neural activity (Herrmann, Munk, & Engel, 2004; Tallon-Baudry & Bertrand, 1999). These trial-by-trial analyses of oscillatory activity allow a better temporal resolution in terms of the evolution of the activity through the learning process and make them a better measure for the evaluation of neural plasticity than the standard ERP approach (Miltner, Braun, Arnold, Witte, & Taub, 1999). In addition, this analysis has been effectively used to understand the brain dynamics underlying different cognitive functions (Makarov, Panetsos, & de Feo, 2005; Laufs et al., 2003).
In addition, although incoming stimulation is initially processed locally, learning requires the simultaneous cross talk between different regions of the brain that influences this local activity (Stevens, 2009). These brain plasticity changes occur through the modification of neural efficacy between cortical regions (Hebb, 1949), and they could be characterized by different coherence patterns in local brain regions and between distal brain regions (Singer, 1995). There is also evidence that supports that large-scale coordination between fronto-parietal and sensory cortices enables top–down influences on attention (Corbetta & Shulman, 2002) and that such coordination is reflected in the dynamic modification of coherent oscillatory synchronization between neuronal groups in distant cortical areas (Siegel, Donner, Oostenveld, Fries, & Engel, 2008; Fries, 2005; Buzsaki & Draguhn, 2004; Engel, Fries, & Singer, 2001). Thus, in addition to the analysis of local (spectral power at electrode level) scale synchrony EEG bands, in the present study, a large-scale (coherence across distant electrodes) analysis was performed. Mostly on the basis of single-cell recordings in monkeys, each type of measure has been identified as a critical “middle ground” between cortical mechanisms and cognitive functions (see, for a review, Varela, Lachaux, Rodriguez, & Martinerie, 2001). They have also been proven to be independently modulated as a function of memory (Jensen & Tesche, 2002; Sarnthein, Petsche, Rappelsberger, Shaw, & von, 1998) and attention (Maunsell & Treue, 2006; Miller & D'Esposito, 2005) in different measures of coherence and in different modalities. 
We believe that the coherence analysis (phase synchrony) can add important information to understand to which degree different cortical regions show synchronous coherence in a specific frequency band, which is interpreted as the degree to which two regions are consistently coordinating their respective neural activities associated with specific cognitive functions (Varela et al., 2001; Lachaux, Rodriguez, Martinerie, & Varela, 1999). It has been proposed that neural oscillators showing a similar temporal pattern (i.e., interelectrode phase synchrony) might indicate large-scale integration mechanisms among different neural assemblies (Lachaux et al., 1999).
In the present investigation, this analysis will permit the examination of the variations in functional connectivity in different frequency bands throughout the learning process of an artificial language with embedded words that follows simple nonadjacent dependency rules. If word and rule learning are achieved by the involvement of the same brain dynamics, then subjects who learned the rules and those who, at the end of the learning phase, were not able to extract this information should not show remarkable differences in their oscillatory patterns, as long as they have comparable word learning abilities. Common modulations through the learning phase in the two groups should correspond to those processes related to word learning. In contrast, differential oscillatory patterns, in terms of power spectra of different frequency bands involved and interelectrode phase synchrony, should give us information about the specific brain dynamics, which characterize rule learning. In addition, in the current investigation, the results of the on-line variations in the oscillatory activity during the learning process are complemented by a second behavioral experiment to better understand the evolution of the learning performances half way through the learning period, when the observed variations in oscillatory activity start to emerge. To determine whether the two groups of participants also differ earlier in the learning process, this second experiment follows the evolution of the performances and characterizes participants who were able to learn the rules and those who were only able to extract the words of the speech stream.
Twenty-four right-handed volunteers (seven men, mean age = 25 years, SD = 6 years) participated in the study. None of them had a history of neurological or hearing deficits. Written consent was obtained from each volunteer before the experiment. The experiment was approved by the local ethics committee of the University of Barcelona. Four participants were discarded from the analysis because of excessive eye movements.
Four artificial language streams were created according to the same principle used by Peña, Bonatti, Nespor, and Mehler (2002). They contained trisyllabic words built following a rule that established that their initial syllable determined their ending (paliku, paseku, paroku) irrespective of the middle element, thus forming a structure similar to some morphosyntactic rules (heplays, hewants, hewalks) (see Figure 1). There were three different frames, and the intervening middle syllable could take up to three values, for a total of nine different words per language. None of the syllables were repeated across languages. Details of the exact stimuli used can be found in De Diego-Balaguer et al. (2007) and are included in Table 1. This type of material has been shown to induce learning of abstract rules as shown by the acquisition of categories on the basis of the underlying dependencies (Endress & Bonatti, 2007) and by the transfer of these rules to new material with no physical overlap with the learned language (De Diego-Balaguer et al., 2008).
Streams and test items were synthesized using the MBROLA speech synthesizer software (Dutoit, Pagel, Pierret, Bataille, & van der Vreken, 1996) concatenating diphones at 16 kHz from the Spanish male database (es2) (tcts.fpms.ac.be/synthesis/mbrola.html). All phonemes had the same duration (116 msec) and pitch (200 Hz; equal pitch rise and fall, 216 with pitch maximum at 50% of the phoneme) in the language streams. Thus, words in the language streams had a duration of 696 msec each. They were separated by 25-msec pauses to induce the extraction of structural information (Peña et al., 2002) and were concatenated in a pseudorandom order so that a word was never immediately repeated in the stream. As the same three middle syllables appeared in the three frames of a given language, the transitional probability between the initial and middle syllable or between this one and the final syllable was 0.33. The transitional probability between the first and the last syllable of every word was 1.0, whereas the corresponding probability between the last syllable of any word and the first syllable of the following one was 0.5. To have the same length in the different streams and fit the duration to the necessary millisecond precision for the ERP recordings, we used Adobe Audition™ to slightly stretch the audio files.
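The transitional probabilities described above can be checked on a simulated stream. The following is a minimal sketch, not the procedure used to build the actual stimuli. The frame pa__ku and the middle syllables li, se, and ro come from the example words given earlier (paliku, paseku, paroku); the other two frames, the function names, and the stream length are hypothetical. The sketch also assumes that the same frame never occurs twice in a row, which is one way to obtain a between-word probability of 0.5 with three frames; the text itself only states that a word was never immediately repeated.

```python
import random
from collections import Counter

def build_stream(frames, middles, n_words, seed=0):
    """Concatenate trisyllabic words; the same frame never occurs twice in a row
    (an assumption of this sketch, see the lead-in)."""
    rng = random.Random(seed)
    stream, prev_frame = [], None
    for _ in range(n_words):
        frame = rng.choice([f for f in frames if f != prev_frame])
        stream.extend([frame[0], rng.choice(middles), frame[1]])
        prev_frame = frame
    return stream

def transitional_probability(stream, first, second, lag=1):
    """P(`second` occurs `lag` syllables after `first` | `first` occurred)."""
    pairs = Counter(zip(stream, stream[lag:]))
    n_first = sum(count for (a, _), count in pairs.items() if a == first)
    return pairs[(first, second)] / n_first if n_first else 0.0

# pa__ku is taken from the text; the two other frames are hypothetical.
frames = [("pa", "ku"), ("be", "ga"), ("ti", "do")]
middles = ["li", "se", "ro"]
stream = build_stream(frames, middles, n_words=3000)

adjacent = transitional_probability(stream, "pa", "li", lag=1)     # close to 0.33
nonadjacent = transitional_probability(stream, "pa", "ku", lag=2)  # exactly 1.0
boundary = transitional_probability(stream, "ku", "be", lag=1)     # close to 0.5
```

The nonadjacent dependency is the only fully predictive cue in such a stream, which is why it can support rule extraction whereas the adjacent statistics remain weak.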
During the learning phase of the experiment, participants were presented with four different languages counterbalanced across individuals. They listened to 4 min of each language leading to 336 word observations. For each language, participants were told that they would hear a nonsense language and that their task was to pay attention to it because they would be asked afterwards to recognize words of this language. After listening to each stream, participants were tested using a two-alternative forced-choice recognition test. Thus, the learning procedure was performed four times. Isolated test items were created and presented in pairs (Figure 1). The two test items of each trial were separated by 704 msec. For half of the streams, participants were tested for word acquisition, such that they had to choose between words from the exposed language and nonwords in each trial (see Figure 1). For the other half, rule learning was evaluated, such that participants had to choose between a nonword and a rule word. Each test item (9 words, 9 rule words, 18 nonwords) appeared twice, leading to 72 rule word, 72 word, and 144 nonword presentations. Participants were instructed to listen to the two alternative stimuli and wait until an indication on the screen appeared to respond with the right or left button of the mouse. Nonwords were new items formed with the same three syllables of a previously exposed word in the wrong order: the first and last syllables were placed in the inverse order (see Figure 1). Participants thus had to encode the order of presentation of the syllables and their position (Endress & Bonatti, 2007) to detect this sequence as an invalid item. Rule words were new words with the same initial and final syllable of a word from the exposed language while a syllable corresponding to another word was inserted in the middle position (see Figure 1). 
Thus, although these new words followed the structure of words in the artificial languages, the participants had not heard these rule words before.
The experiment was run individually in an electrically and acoustically shielded room on a PC computer using the Presentation Software (nbs.neuro-bs.com/). Stimuli were played through Sennheiser (HMD224) headphones connected to the computer, via a Proaudio Spectrum 16 soundcard.
EEG activity was recorded from the scalp by using tin electrodes mounted in an electrocap (Electro-Cap International, Eaton, OH) and located at 29 standard locations (Fp1/2, Fz, F7/8, F3/4, Fc1/2, Fc5/6, Cz, C3/4, T3/4, Cp1/2, Cp5/6, Pz, P3/4, T5/6, Po1/2, O1/2). Biosignals were rereferenced off-line to the mean of the activity at the two mastoids. Vertical eye movements were monitored with an electrode at the infraorbital ridge of the right eye. Electrode impedances were kept below 3 kΩ. The electrophysiological signals were filtered with a bandpass of 0.01–50 Hz (half-amplitude cutoffs) and digitized at a rate of 250 Hz. Trials with base-to-peak EOG amplitude of more than 100 μV or amplifier saturation were automatically rejected off-line.
Single-Trial TF Analysis
To trace the continuous pattern of brain oscillatory activity associated with the learning process, the spectral power analysis was computed on blocks of 50 word trials grouped following the order of appearance during the experiment (i.e., Trials 1–50: first block; Trials 51–100: second block, etc., pooling together the four exposed languages). For each subject, words from each of the four languages were first divided as a function of the minute of exposure and pooled together. Then 50 consecutive word trials were selected for each block (mean number of word trials per language per block = 12.42, SD = 0.17), first for Minute 1 until all word trials free of artifacts were consumed. The same procedure was used for words in the second minute and so on. Hence, a total of 750 word trials free of artifacts were used, which gave rise to 15 blocks of 50 words over the four languages learned per subject. Therefore, each block corresponded to approximately 9 sec of exposure to each language. As the objective of the study was to follow the ongoing neural mechanisms underlying the word and rule learning processes from speech, time-to-time modulations of spectral power (divided in blocks of 50 words across languages) during the stimuli presentation were measured as amplitude changes compared with a baseline window of −75 to 0 msec before the presentation of the language in the first block. Given the short interword time window derived from the experimental paradigm (i.e., 25 msec), there was a tradeoff between frequency resolution and power calculation for the theta band. We, therefore, decided to include a long prestimulus baseline of 1 sec in the initial analysis, which is sufficient to reliably capture more than three theta cycles to calculate the spectral power. Having decomposed the data into the frequency domain, we used a short time window (i.e., 75 msec, equivalent to approximately ½ theta cycle at 7 Hz) to normalize the poststimulus power spectra. 
This period was set to avoid the influence of brain activity changes evoked by the previous word presentation during baseline period. However, a longer baseline of −200 to 0 msec was also used for the analyses of this frequency band (4–8 Hz) to ensure that the effects observed were not baseline dependent.
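The blocking and baseline normalization steps can be illustrated as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, power is expressed as percent change from the baseline mean (the text only speaks of amplitude changes relative to baseline), and each spectrogram is normalized by its own baseline window rather than by the baseline of the first block.

```python
import numpy as np

def split_into_blocks(n_trials, block_size=50):
    """Index arrays for consecutive, non-overlapping blocks of trials."""
    n_blocks = n_trials // block_size
    return [np.arange(b * block_size, (b + 1) * block_size) for b in range(n_blocks)]

def baseline_normalize(power, times, baseline=(-0.075, 0.0)):
    """Percent change of power relative to the mean of the baseline window.

    power: array of shape (n_freqs, n_times)
    times: array of shape (n_times,), in seconds relative to word onset
    """
    mask = (times >= baseline[0]) & (times < baseline[1])
    base = power[:, mask].mean(axis=1, keepdims=True)
    return 100.0 * (power - base) / base

# 750 artifact-free trials give 15 blocks of 50 trials each.
blocks = split_into_blocks(750)

# Toy spectrogram sampled every 4 msec (250 Hz): power 2 before onset, 3 after.
times = np.arange(-0.2, 0.7, 0.004)
power = np.repeat(np.where(times < 0, 2.0, 3.0)[None, :], 3, axis=0)
normalized = baseline_normalize(power, times)
```

Passing `baseline=(-0.2, 0.0)` reproduces the longer control window mentioned above for the theta band.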
Power spectrum analysis focused on four commonly studied frequency bands, namely theta (4–8 Hz), alpha (8–12 Hz), beta (13–29 Hz), and low gamma (30–40 Hz). To deal with the problem of comparing multiple frequencies simultaneously, we averaged the power data of each frequency according to the bands described, which is a common method in TF studies (Hagoort et al., 2004). Mean 0–700 msec spectral band power changes related to the learning process were submitted to an omnibus repeated measures ANOVA including 16 electrodes (Fp1/2, F7/8, F3/4, Fc5/6, Cp1/2, Cp5/6, P3/4, and T5/6) optimally covering the whole scalp distribution of the effects and four within-subject factors: Block (15 levels) × Anterior–Posterior (frontal and parietal locations) × Hemisphere (right and left) × Laterality (medial and lateral) (Cunillera et al., 2006). These analyses allowed us to observe the variations in each frequency band as a function of exposure to the language stream as well as the topographical distribution of these effects. When necessary, degrees of freedom were corrected using the Greenhouse–Geisser epsilon value.
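The band averaging described above reduces each spectrogram to one value per band. A minimal sketch follows, assuming a power array indexed by frequency and time; the function name and the inclusive band edges are choices of this sketch (with inclusive edges, 8 Hz contributes to both theta and alpha, as the stated band limits overlap at that frequency).

```python
import numpy as np

# Band limits in Hz as given in the text.
BANDS = {"theta": (4, 8), "alpha": (8, 12), "beta": (13, 29), "gamma": (30, 40)}

def band_power(power, freqs, times, window=(0.0, 0.7)):
    """Mean power per band within a time window.

    power: array of shape (n_freqs, n_times)
    freqs: array of shape (n_freqs,), in Hz
    times: array of shape (n_times,), in seconds
    """
    t_mask = (times >= window[0]) & (times <= window[1])
    result = {}
    for name, (low, high) in BANDS.items():
        f_mask = (freqs >= low) & (freqs <= high)
        result[name] = float(power[np.ix_(f_mask, t_mask)].mean())
    return result

# Toy input: power at each frequency equals the frequency itself.
freqs = np.arange(1, 41, dtype=float)
times = np.linspace(-0.2, 0.7, 226)
power = np.tile(freqs[:, None], (1, times.size))
means = band_power(power, freqs, times)
```

One such value per block, electrode, and subject would then form the input to the repeated measures ANOVA.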
In addition, to study the spectral features specifically characterizing the rule learning process, we further compared the differences in the TF domain for two extreme groups of participants: those who better learned the rules of the languages and those who were not able to learn them. Importantly, both groups were nevertheless matched for their word learning capacity. We used the M/EEG cluster-based nonparametric permutation test described by Maris, Schoffelen, and Fries (2007) to avoid the need to define frequency bands on an a priori basis. This method provides two important advantages: (1) it provides a simple way to solve the multiple comparison problem that arises from the large number of electrodes, frequencies, and word blocks, and (2) it is nonparametric, insofar as it does not depend on parametric assumptions about the probability distribution of the data. This analysis was also implemented to control for possible statistical bias from preselecting specific frequency bands in the previous ANOVA of spectral power bands. We first calculated a paired t test value (i.e., Good vs. Poor learners) for each of the 16 electrodes used in the previous ANOVA and for each frequency and word block. We then clustered the bins whose t value exceeded the threshold of p < .05 (uncorrected) and that also fulfilled an adjacency criterion, such that each bin was contiguous with at least one other bin in the Electrode × Frequency × Word Block space. The permutation distribution was then obtained by (1) collecting the trials of the two relevant experimental conditions (e.g., Good vs. Poor) in a single set, (2) randomly partitioning the trials into two subsets, (3) for each cluster, computing the sum of the t2 values and then taking a test statistic equal to the maximum of the cluster level statistic, and (4) repeating Steps 2 and 3 1000 times to construct a histogram. The nonparametric statistical test was finally performed by calculating a “Monte Carlo” p value under the permutation distribution and comparing it with an alpha level of 0.05.
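The logic of the cluster-based permutation test can be sketched as follows. This is an illustration in the spirit of Maris et al. (2007), not the implementation used in the study. Several choices are assumptions of the sketch: an independent-samples t statistic is computed between groups, the cluster statistic is the sum of absolute t values, group labels are shuffled across subjects, and adjacency is defined by neighboring array indices (so the electrode axis would need a proper neighborhood definition in a real analysis). Function names are hypothetical.

```python
import numpy as np
from scipy import ndimage, stats

def max_cluster_stat(a, b, p_thresh=0.05):
    """Largest summed |t| over contiguous supra-threshold bins.

    a, b: arrays of shape (n_subjects, ...) holding one map per subject,
    e.g. (n_subjects, n_electrodes, n_freqs, n_blocks).
    """
    t, _ = stats.ttest_ind(a, b, axis=0)
    df = a.shape[0] + b.shape[0] - 2
    t_crit = stats.t.ppf(1 - p_thresh / 2, df)      # two-sided threshold
    labels, n_clusters = ndimage.label(np.abs(t) > t_crit)
    if n_clusters == 0:
        return 0.0
    return float(max(np.abs(t)[labels == k].sum() for k in range(1, n_clusters + 1)))

def cluster_permutation_test(a, b, n_perm=1000, seed=0):
    """Observed maximum cluster statistic and its Monte Carlo p value."""
    rng = np.random.default_rng(seed)
    observed = max_cluster_stat(a, b)
    pooled = np.concatenate([a, b], axis=0)          # step 1: one single set
    n_a = a.shape[0]
    null = np.empty(n_perm)
    for i in range(n_perm):
        order = rng.permutation(pooled.shape[0])     # step 2: random partition
        null[i] = max_cluster_stat(pooled[order[:n_a]], pooled[order[n_a:]])
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p_value
```

Because only the maximum cluster statistic of each permutation enters the null distribution, a single comparison against that distribution controls the family-wise error rate across all electrodes, frequencies, and blocks.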
EEG Coherence Analysis
Compared with other approaches previously used (e.g., Lachaux et al., 1999), in this method, the phase difference vector is modulated by the product of the amplitudes of both electrodes. Therefore, although synchrony does not depend on the amplitude, the relative weight of each trial in the global coherence magnitude affects the final computation. This is particularly interesting because only those trials containing “real information” (and not noisy trials) will have an impact on the final coherence measure. Coherence analysis was performed in the present study at the single-trial level, and EEG spectral power was obtained from 0 to 700 msec after word onset and extracted by a continuous wavelet transform (CWT) at all electrode locations (see Single-Trial TF Analysis).
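The difference between an amplitude-weighted coherence and a pure phase measure can be made concrete with a short sketch. The exact formula used in the study is not reproduced in this section, so the code assumes the standard definition of coherence computed from complex wavelet coefficients across trials; the phase-locking value is included only for comparison, and both function names are hypothetical.

```python
import numpy as np

def amplitude_weighted_coherence(x, y):
    """Coherence magnitude across trials at one time-frequency point.

    x, y: complex wavelet coefficients of two electrodes, shape (n_trials,).
    Each trial contributes its phase difference weighted by the product
    of the two amplitudes.
    """
    cross = np.sum(x * np.conj(y))
    norm = np.sqrt(np.sum(np.abs(x) ** 2) * np.sum(np.abs(y) ** 2))
    return float(np.abs(cross) / norm)

def phase_locking_value(x, y):
    """Amplitude-free phase synchrony: every trial has the same weight."""
    dphi = np.angle(x) - np.angle(y)
    return float(np.abs(np.mean(np.exp(1j * dphi))))

# Toy data: 50 strong trials with a fixed phase lag, 50 weak trials with random phases.
rng = np.random.default_rng(0)
strong_x = 10.0 * np.exp(1j * rng.uniform(0, 2 * np.pi, 50))
strong_y = strong_x * np.exp(-1j * 0.5)
weak_x = 0.1 * np.exp(1j * rng.uniform(0, 2 * np.pi, 50))
weak_y = 0.1 * np.exp(1j * rng.uniform(0, 2 * np.pi, 50))
x = np.concatenate([strong_x, weak_x])
y = np.concatenate([strong_y, weak_y])

coherence = amplitude_weighted_coherence(x, y)
plv = phase_locking_value(x, y)
```

In this toy case the low-amplitude trials barely affect the weighted coherence, whereas they pull the phase-locking value down, which illustrates why noisy trials have little impact on the weighted measure.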
A cluster-based nonparametric permutation test similar to the one used in the power analyses was also implemented for statistical comparisons between Good and Poor learners' coherence data. The analyses followed the parameters described for the previous nonparametric permutation test on spectral power data between groups, except that in this case, we only implemented it to compare the differences between groups in theta (4–8 Hz), alpha (8–12 Hz), and gamma coherence (20–30 Hz) separately. The range of the gamma frequency in this analysis was adjusted to the range that appeared significant in the previous power analyses (i.e., from 20 Hz, see Figure 3). Clustering was restricted to sensor pairs whose paired t test value exceeded a threshold of p < .05 (uncorrected) and fulfilled an adjacency criterion of electrode contiguity, which we defined as those sensor pairs that shared at least one sensor with another significant (p < .05, uncorrected) contiguous sensor pair. This analysis was performed for each block separately, and the block showing the cluster of sensors with the maximal sum of t values was used to create the empirical permutation distribution. The permutation distribution and the significance threshold (corrected) were obtained following the exact same steps as in the power data.
Language Learning Accuracies
The behavioral measures showed that participants were able to both recognize words (62% ± 13.3%; t(19) = 4.04, p < .001) and extract the underlying structure from the language streams (54.5% ± 8.5%; t(19) = 2.34, p < .03) significantly better than chance (50% for a two-alternative forced-choice, one-sample t test). Better performance in the test for word recognition than for rule generalization (t(19) = 2.4, p < .02) was observed. Concerning the two groups of participants that differed in their rule learning accuracy, we used the groups defined in the previous study (De Diego-Balaguer et al., 2007) to compare the previously reported ERP results with the current TF analyses. In short, the eight participants with the highest performances (>58%) were included in the Good learner group (mean = 63%, SD = 5%), and the eight lowest performers, those who performed at chance in the rule learning test (mean = 46%, SD = 4%), were assigned to the Poor learner group (t(14) = −7.84, p < .0001). Performance in word learning was comparable in the two groups (Good learners: 67% ± 14%, Poor learners: 59% ± 10%; t(14) = −1.39, p > .1). The remaining four participants with intermediate values were excluded from the group analyses.
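The comparison against chance can be approximately reproduced from the reported summary statistics. The sketch below assumes that the ± values are standard deviations and that n = 20 (consistent with the 19 degrees of freedom); the function name is hypothetical.

```python
import math

def one_sample_t(mean, sd, n, mu0=50.0):
    """t statistic of a sample mean against a fixed reference value (chance = 50%)."""
    return (mean - mu0) / (sd / math.sqrt(n))

t_words = one_sample_t(62.0, 13.3, 20)   # word recognition
t_rules = one_sample_t(54.5, 8.5, 20)    # rule generalization
```

With these rounded inputs the word test gives a value of about 4.04, as reported, and the rule test gives about 2.37, close to the reported 2.34; small differences of this kind are expected when the statistic is recomputed from rounded means and standard deviations.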
Brain Oscillatory Dynamics Associated with Language Learning Process
Results for the whole group
Power spectrum analysis
TF results during the learning process at the Fz location are depicted in Figure 2A. TF charts of continuous blocks revealed a modulation of the four main spectral bands with time, which comprised the theta (4–8 Hz), alpha (8–12 Hz), beta (13–29 Hz), and gamma (30–40 Hz) bands. This was confirmed by a main effect of Block in the repeated measures ANOVA for theta (F(14, 266) = 3.56, p < .001, η2 = 0.15), alpha (F(14, 266) = 2.67, p = .001, η2 = 0.12), beta (F(14, 266) = 2.42, p = .001, η2 = 0.12), and gamma (F(14, 266) = 3.81, p = .0001, η2 = 0.17), indicating spectral changes as a function of language exposure. Moreover, the evolution of the spectral power differed across bands: whereas the theta and alpha bands increased linearly (F(1, 19) = 17.65, p < .001, η2 = 0.48 and F(1, 19) = 13.16, p = .002, η2 = 0.41, respectively), the beta and gamma bands followed a cubic trend (F(1, 19) = 7.35, p = .01, η2 = 0.29 and F(1, 19) = 13.19, p < .002, η2 = 0.41, respectively). The linear increase can be clearly observed in Figure 2C for both the theta and alpha bands, whereas the increase observed in the beta and gamma bands reached its maximum at the third block and decreased afterward. The linear increase in the theta band showed the same pattern in the analysis with the longer baseline (−200 to 0 msec) (F(1, 19) = 6.29, p = .02, η2 = 0.28).
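For readers who wish to relate the effect sizes to the F ratios, the conversion can be sketched as follows. This assumes that η2 denotes partial eta squared, which is not stated explicitly here; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Linear trend of theta power, F(1, 19) = 17.65
theta_linear = partial_eta_squared(17.65, 1, 19)
# Main effect of Block on gamma power, F(14, 266) = 3.81
gamma_block = partial_eta_squared(3.81, 14, 266)
```

Under this assumption, the two examples give approximately 0.48 and 0.17, in line with the values reported above.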
As can be observed in Figure 2A, these power modulations were block dependent but were constant throughout the 0–700 msec word length. Although a peak in frequency seems to be observed around 400 msec in the theta band, rapid differences in the 0–700 msec range are difficult to detect at low-frequency ranges (i.e., theta and alpha band modulations) because of the uncertainty principle that applies to wavelet analyses. That is, an increase in frequency resolution involves a decrease in time sensitivity (see Methods). It is nevertheless worth noticing that, at higher frequencies (i.e., the gamma modulation) where this limitation is not present, a sustained power activity still characterizes the learning process. Therefore, for further analyses and description of the results, we focused on the entire time interval between 0 and 700 msec.
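The time-frequency trade-off can be quantified for a complex Morlet wavelet. The sketch below assumes a constant ratio of center frequency to spectral width of 7, a commonly used value; the ratio used in this study is not stated in this section, and the function name is hypothetical.

```python
import math

def morlet_resolution(freq_hz, n_cycles=7.0):
    """Temporal and spectral standard deviations of a Morlet wavelet.

    n_cycles is the ratio freq / sigma_f (an assumed value, see the lead-in).
    Returns (sigma_t in seconds, sigma_f in Hz); their product is 1 / (2 * pi).
    """
    sigma_f = freq_hz / n_cycles
    sigma_t = 1.0 / (2.0 * math.pi * sigma_f)
    return sigma_t, sigma_f

theta_sigma_t, theta_sigma_f = morlet_resolution(6.0)    # middle of the theta band
gamma_sigma_t, gamma_sigma_f = morlet_resolution(35.0)   # middle of the gamma band
```

Under this assumption, the temporal standard deviation is about 186 msec at 6 Hz and about 32 msec at 35 Hz, so a theta wavelet smears activity over a sizable part of the 700-msec word, whereas a gamma wavelet can resolve changes within it.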
The topographical analysis of the 16 selected electrodes for the theta band showed no significant interactions either with scalp locations (all p > .05) or power evolution during the learning process (interaction between block and topographical factors, in all cases p > .05); this indicates that the theta band was distributed bilaterally in both anterior and posterior areas throughout the learning process in both the analyses with short and long baselines. Likewise, no main effect on the scalp distribution was observed for the alpha band (all p > .05). However, in this case, the distribution varied as a function of the time of exposure, showing a progressive increase in posterior central locations throughout the learning process (Block × Anterior–Posterior: F(14, 266) = 1.77, p = .043, η2 = 0.09). In contrast to the previous analysis, the beta and gamma bands showed a progressive increase in frontal areas of the scalp throughout language exposure (Block × Anterior–Posterior, beta band: F(14, 266) = 2.22, p = .007, η2 = 0.1; gamma band: F(14, 266) = 2.13, p = .01, η2 = 0.1). Furthermore, gamma band power was more pronounced in the right compared with the left hemisphere (Laterality × Hemisphere: F(1, 19) = 4.97, p = .04, η2 = 0.21). The topographical representation of each band power throughout word exposure can be observed in Figure 2B.
Good versus Poor rule learners
Power spectrum analysis
To elucidate the implication of the oscillatory patterns observed specifically during the rule learning process, we carried out a random permutation analysis comparing two subsamples of the group: the group of Good learners, formed by those participants who showed the highest scores in the rule learning test, and the group of Poor learners, which consisted of participants who performed at chance in this test. Both groups were matched in their word learning performances (see accuracy results). The cluster-based permutation analysis yielded a single cluster of sensor frequencies representing significant results (p = .006; corrected for multiple comparisons) (Figure 3A). This cluster corresponded to an increased gamma band activity for Poor learners (Figure 3B). The range of frequencies in this cluster included those of the gamma band studied at the whole group level, but it also spanned lower frequencies corresponding to the beta band (20–45 Hz). This cluster was distributed over the last period of language learning (i.e., Blocks 10–15) and included frontal, parietal, and occipital sensors, which were slightly left lateralized for frontal and parietal electrodes (see scalp representation in Figure 3B). A second cluster with sensor frequency characteristics similar to the previous one was apparent at the very beginning of the learning process (see Figure 3A), although this result was not significant. Neither theta nor alpha bands yielded significant results in this analysis (all contrasts p > .1; corrected).
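The logic of a between-group permutation test can be sketched as follows. This is a simplified illustration only: it permutes group labels for a single summary value per participant and omits the cluster-forming step over sensors, frequencies, and blocks that the analysis above relies on. The function name `permutation_p_value` and the power values are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided label-permutation test on the difference of group means.

    Simplified sketch: one value per participant, no clustering across
    sensors or frequencies.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(n_perm):
        # Randomly reassign participants to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            count += 1
    # Add-one correction so the estimated p value is never exactly zero.
    return (count + 1) / (n_perm + 1)

# Hypothetical per-participant gamma power values (arbitrary units).
poor_learners = [1.9, 2.3, 2.1, 2.6, 2.4, 2.2]
good_learners = [1.1, 1.4, 0.9, 1.3, 1.2, 1.5]
print(permutation_p_value(poor_learners, good_learners))
```

In a cluster-based variant, the statistic compared against the permutation distribution would be the summed t value of the largest cluster rather than a single mean difference, which is what provides the correction for multiple comparisons.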
The results of the coherence analyses also showed that the differences between groups were evident in the progression of brain dynamics as a function of exposure to the language. Figure 4A shows the electrode pairs that displayed significant coherence values (p < .05) averaged across the whole learning process for Good and Poor learners. Statistical significance was determined as the square value of correlation thresholds at p < .05. Averaged over the whole learning period, the pattern of coherence was relatively similar in the two groups and was only significant between adjacent electrodes. The results indicate that both groups showed significant theta (4–8 Hz) and gamma (20–40 Hz) coherence through widely distributed scalp locations with no clearly specific pattern. Significant coherence appeared only between adjacent electrodes, indicating simple volume conduction effects between nearby areas. Substantial differences between groups were nevertheless observed regarding the strength of the synchrony patterns between more distant electrodes in frontal and left temporal sites and between parietal and left temporal sites in the theta band, which was greater for Poor learners (Figure 4A, right). Notably, coherence in the alpha band showed practically no electrode pair synchrony differences between groups.
To elucidate the progression through time of these synchrony differences, the interelectrode coherence was further computed as a function of the block of exposure. For the purpose of comparison with the results obtained in the power analyses, the same eight relevant periods depicted in Figure 2A and B were analyzed (Blocks 1, 3, 5, 7, 9, 11, 13, and 15). A systematic pattern emerged when time of exposure was taken into account highlighting clear distinct long-range patterns of neural synchrony underlying the learning process in the two groups. Although the two groups showed almost no differences in coherence in the first 10 sec of exposure (1–50 words), a clear dissociation in their theta and gamma band coherence patterns arose after approximately 1 min of exposure (i.e., 250 words, Figure 4B). Poor learners showed progressive enhanced neural synchrony between temporal and parietal areas and between frontal and parietal areas in the theta band. Good learners revealed progressive enhancement of high frequency coherence in the gamma band between bilateral frontal and temporal electrodes and between parietal and temporal electrodes (p < .05) (Figure 4B). The application of the permutation test to control for multiple comparison effects showed that, at the theta band, significant clusters (p < .05, corrected) belonged to Blocks 3, 5, 11, and 13 for Poor learners with the max summed t values for Cluster 1 at Block 3 (101–150 words). For the Good learner group, the gamma band clusters with significant results (p < .05, corrected) belonged to Blocks 5, 9, 11, and 13 with max summed t values for Cluster 1 at Block 11 (451–500 words).
The difference between the amount of coherence in the gamma and the theta bands significantly correlated with the performance in the different tests in the two groups only at specific blocks. This difference was positively correlated with rule learning performance very early in the learning process (Block 3: r = .75, p < .05) and later at Block 9 (r = .75, p < .05) for Good learners. These blocks coincide with the first and last peaks of the coherence clusters. Interestingly, for word learning, the direction of the correlation was inverted in the two groups. Whereas a positive correlation was significant during the first and second minutes for Good learners (Block 3: r = .71, Block 5: r = .62, Block 7: r = .66, Block 9: r = .65; all p < .05), a negative correlation was observed at the end of the third minute for Poor learners (Block 11: r = −.75, Block 13: r = −.79; p < .05) corresponding to the later significant coherence cluster.
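The correlation described above can be sketched in a few lines. The example assumes one mean gamma coherence value, one mean theta coherence value, and one test score per participant at a given block; the function name `pearson_r` and all numbers are hypothetical and serve only to show the computation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # Undefined (division by zero) if either sequence is constant.
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-participant values at a single block.
gamma_coherence = [0.42, 0.38, 0.51, 0.47, 0.35, 0.55]
theta_coherence = [0.30, 0.33, 0.29, 0.31, 0.34, 0.28]
rule_score = [68.0, 64.0, 79.0, 74.0, 63.0, 84.0]  # percent correct

# Gamma minus theta coherence, computed per participant.
difference = [g - t for g, t in zip(gamma_coherence, theta_coherence)]
print(round(pearson_r(difference, rule_score), 2))
```

A positive coefficient indicates that participants with relatively more gamma than theta coherence scored higher on the test, and a negative coefficient indicates the opposite relationship.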
These correlations to specific blocks and the evolution of the oscillatory patterns as a function of exposure suggest that these differences may reflect a shift in the learning patterns of the two groups, with Good learners being more focused on structural information, and Poor learners being more focused on whole word memorization. However, because in this experiment measures of behavioral performance were only recorded at the end of the learning process (i.e., after 4 min of exposure to the language), we could not reject the possibility that these differences may nevertheless reflect a better general performance in the group of Good learners throughout learning. We, therefore, performed a second experiment with a different group of participants to explore the evolution of the performances at shorter language exposures corresponding to the time points where differences between groups started to emerge in the EEG analyses.
Thus, to understand the evolution of the rule learning performance through time in the two groups of learners better, we carried out a second post hoc experiment where performances in word and rule learning were recorded after 2 min and after 4 min of exposure. If the electrophysiological responses differentiating Good from Poor rule learners corresponded to a shift in the learning strategies, then the two groups should show a comparable performance in rule learning at 2 min of exposure despite the better performance shown after 4 min by the Good learners. In contrast, if these differences do not reflect a shift in strategy but rather a general better performance in the group of Good learners throughout learning then, at 2 min as well as at 4 min, Good learners should show better performance than Poor learners.
Thirty-two new volunteers (four men; mean age = 21, SD = 1.2; 31 right-handed) participated in the study. None of them had a history of neurological or hearing deficits. Written consent was obtained from each volunteer before the experiment, and all participants received course credits for their participation.
Stimuli and Procedure
Participants in this experiment were presented with the four languages used in Experiment 1. For two of the languages, participants were tested after 2 min of exposure and, for the other two languages, after 4 min of exposure, this latter test reproducing the same condition as Experiment 1. For each condition (2 min and 4 min), one of the languages was tested for word learning and the other for rule learning, forming a Latin square design. In this case, test items were presented only once, leading to 19 test trials per language.
The results at 4 min replicated the effects observed in Experiment 1. Participants were able to learn both words (t(31) = 8.48, p < .0001) and rules (t(31) = 4.36, p < .0001) above chance and displayed better performance in the word learning test (74% ± 17.8%) than the rule learning test (58.4% ± 14.3%) (t(31) = 3.81, p < .001). Following the same procedure as in Experiment 1, the 12 participants with the highest performances in the 4-min test (>63%) were included in the Good learner group (mean = 73.2%, SD = 7.2%), and the 12 lowest performers at the rule learning test (mean = 43.9%, SD = 7.2%) were assigned to the Poor learner group (t(22) = 9.95, p < .0001). Performance in word learning was comparable in the two groups (Good learners: 69.3% ± 2%, Poor learners: 75.9% ± 1.8%; t(22) = −0.8, p < .4) (see Figure 5). The remaining eight participants with intermediate performances were excluded from the group analyses.
After this segregation into groups, a repeated measures ANOVA was carried out including two within-subject factors, Time (2 vs. 4 min) and Learning (Word vs. Rule), and one between-subject factor, Group (Good vs. Poor rule learners). A significant interaction was obtained (Time × Learning × Group: F(1, 22) = 13.77, p < .001), confirming the prediction that the two groups shifted their learning strategy after the first 2 min of exposure (see Figure 5). In contrast to the observed results at 4 min (Learning × Group: F(1, 22) = 23.07, p < .0001), at 2 min of exposure, the two groups displayed the same level of performance for the word learning and the rule learning conditions (all p > .7; Learning × Group: F < 1). Both groups showed a good level of overall achievement. Performances were above chance levels at this time point for word learning in both groups (Good: words t(11) = 5.7, p < .0001; Poor: words t(11) = 6.9, p < .0001), for rule learning in the Poor learner group (t(11) = 2.64, p < .023), and marginally significant for rule learning in the Good learner group (t(11) = 2.1, p < .055). The idea of a shift in strategy is further reinforced by two additional results. On the one hand, Good learners improved their rule learning performance from 2 to 4 min of exposure (t(11) = −2.88, p < .015). On the other hand, although Poor learners had correctly learned the rule after 2 min of exposure (t(11) = 2.64, p < .023), they showed a significant decrease in performance at 4 min (43.9%; t(11) = 3.48, p < .005).
By studying both synchronization of local neuronal assemblies (i.e., electrode-wise spectral power variations) as well as their spatially distributed coordination patterns (i.e., large-scale phase synchrony modulations) (Varela et al., 2001), we have been able to observe dissociations in these responses that occur during the learning of words and during the extraction of the embedded rules in an artificial language. Theta, alpha, beta, and gamma bands showed progressive spectral power modulations during the learning process. To isolate the changes specifically related to rule extraction from speech, we compared two groups of participants that were matched in their word learning performance but different in their rule learning abilities. We found that the gamma band was differentially modulated in the two groups, both in terms of spectral power as well as phase synchrony patterns. A boost in long-range gamma band phase synchrony was observed in the Good learner group. This systematic pattern of phase coherence coordinating bilateral frontal and temporal regions was observed from the third minute of the learning period. In contrast, greater gamma band activity in Poor learners appeared only in measures of local synchrony, toward the end of the learning phase. However, the pattern of long-range synchronization was focused on the theta band in this group, displaying widespread phase synchronization between frontal and parietal electrodes. The implication of these band-specific modulations is discussed in the following sections.
Theta Band Modulations
In our learning task, language was presented as a continuous auditory stream inducing participants to constantly evaluate new inputs and compare them with previously presented syllable sequences to detect possible matchings for word candidates. In that sense, in the current study, the whole group of participants showed a progressive increase in theta power through learning. However, this power was accompanied by a greater long-range theta fronto-parietal phase synchrony in the Poor learner group of participants only. Theta synchrony between these regions has been reported in human studies involving periods of information retention and has been attributed to a common mechanism of neural interaction that sustains working memory functions (Sarnthein et al., 1998). In our study, this increased activity appeared after the period when differences in performance between Good and Poor rule learners might have started to emerge as indicated by the results of Experiment 2, that is, after 2 min of exposure. This change in the pattern of performance along with a theta power comparable to the group of Good learners but accompanied by higher theta fronto-parietal coherence suggests that Poor rule learners might have applied a language learning strategy that relies more heavily on the enhancement of memory traces of the words. Therefore, the increased theta coherence observed in the group of Poor learners might reflect the sharpening or increase in neural efficiency of this memory-matching process across the learning blocks. Because Good learners rely on a different learning strategy, this sharpening process reflected by theta band modulations was not observed (notice the differences of theta phase synchrony in Figure 4B). These findings are consistent with the increased theta coherence found by Weiss et al. (2005) during sentence comprehension related to updating of episodic memory information.
Gamma Band Modulations
The most interesting results were obtained in the gamma band range. Increased synchrony in the gamma band has been previously associated with attention and memory (Jensen, Kaiser, & Lachaux, 2007). Because of its important role in neuronal communication and synaptic plasticity, this frequency band has been interpreted as an index of increased synchrony between the neural representations or cell assemblies created during the learning process (Gruber, Keil, & Muller, 2001; Miltner et al., 1999). In the present study, a clear dissociation was evident in the two groups of learners. Poor learners displayed greater local neural synchrony in the gamma band, mostly at bilateral fronto-temporal electrodes, whereas Good learners engaged long-range synchronization between frontal, parietal, and temporal electrodes.
As previously mentioned, the process of learning a new language seems to demand the involvement of working memory but also an ongoing matching of bottom–up sensory information with top–down biases. Several studies have reported enhancement of induced gamma band power in the 40-Hz range when perceiving a coherent object (Busch, Herrmann, Muller, Lenz, & Gruber, 2006; Gruber, Tsivilis, Montaldi, & Muller, 2004; Tallon-Baudry et al., 1996), and its role has been emphasized in matching stimuli to memory templates (Osipova et al., 2006; Gruber et al., 2004; Herrmann et al., 2004; Tallon-Baudry, Bertrand, Peronnet, & Pernier, 1998). Similarly, increased induced gamma band activity in the anterior temporal electrode sites has been found when participants listened to correctly identified speech compared with those not identified from degraded speech signals (Hannemann, Obleser, & Eulitz, 2007). These findings suggest that local gamma band enhancement during language learning could reflect the matching process between auditory input and lexical candidates that are built up continuously and incrementally during the ongoing exposure to the language. Gruber, Muller, and Keil (2002) argued that this activity might arise as a learning process on the basis of strengthening and refreshing lexical memory traces that are continuously handled in working memory. Indeed, memory formation or maintenance has been shown to be accompanied by neural oscillatory synchrony in the gamma range (Gruber & Muller, 2005; Pesaran, Pezaris, Sahani, Mitra, & Andersen, 2002; Fell et al., 2001; Tallon-Baudry et al., 1998). These proposals are consistent with our data that shows a clear frontal and temporal enhancement of sustained gamma band power that is not present at the initial presentations of the language words. 
This increase is progressively enhanced at the end of the learning phase, when memory traces of words are likely to be created (see Figures 2 and 3), and appears in those participants who could learn the words of the language but failed to identify the underlying rules. The absence of this local gamma power increase in Good learners suggests that this group was not using a template-matching strategy during learning.¹
On the other hand, it is worth mentioning that power differences between groups identified a cluster of spectral power including frequencies in the gamma band range (30–40 Hz) but also spanning the high beta band (20–30 Hz). This result is consistent with recent studies addressing the importance of neural oscillatory activity in the beta range that is associated with rule-based (i.e., syntactic) processing in sentence comprehension (Bastiaansen et al., 2010; Davidson & Indefrey, 2007; Weiss et al., 2005).
In our previous work (De Diego-Balaguer & Lopez-Barroso, 2010; De Diego-Balaguer et al., 2007), we hypothesized that rule learning, in contrast to word memorization, required discarding of irrelevant variable information and, thus, orienting the focus of attention to the relevant information to be chunked. This orienting of attention might be a core aspect for the acquisition of language rules. Gomez and Maye (2005) previously argued that tracking of adjacent elements, which are necessary for word learning, is actually the default strategy used when confronted with a new language, whereas rule extraction requires a shift in focus to the nonadjacent elements of the sequence, which involves filtering out the variable and nonuseful information. In that sense, a recent study (Rose, Sommer, & Buchel, 2006) showed that attention reallocation to global instead of local features of a complex visual scene underlies the process of binding neural representations to a global percept. More important to our study, this work showed that this shift could be mediated by long-range gamma band coherence (Rose et al., 2006). This result is consistent with our results because long-range gamma band coherence was not accompanied by power modulation at the sensor level. Such findings highlight the relevance of gamma phase synchrony as a neural mechanism underlying long-range connectivity in the present results. In relation to language comprehension, this interpretation is also in agreement with the greater coherence in the gamma band reported to be related to increased syntactic complexity (Weiss et al., 2005). 
The long-range synchronization observed is also consistent with fMRI studies that report increased activation in the posterior parietal cortex as a function of rule learning during artificial language exposure and during switching between two different language tasks (Opitz & Friederici, 2003; Gurd et al., 2002; Sohn, Ursu, Anderson, Stenger, & Carter, 2000), in addition to the involvement of frontal and subcortical areas (Bahlmann, Schubotz, & Friederici, 2008; Forkstam, Hagoort, Fernandez, Ingvar, & Petersson, 2006; Lieberman, Chang, Chiao, Bookheimer, & Knowlton, 2004).
The results of the correlations of the gamma–theta coherence difference with the performance of the participants in the two learning tasks support the idea that word learning could have been performed using different strategies in the two groups. Good learners appear to rely more on gamma coherence, and Poor learners rely more on theta band coherence. In addition, Poor learners display greater gamma power along with greater theta coherence. This possibility is also consistent with the results of Experiment 2, which indicated that both groups displayed comparable word and rule learning after 2 min of exposure despite their different performance in rule learning after 4 min. The consolidation of the information learned with these different strategies is reflected in different evolutions of the performances in the two groups. An improved performance through time is observed in the Good learners for rule learning, and this indicates greater reliance on the use of structural/grammatical information in performing the task. In contrast, the greater reliance of Poor learners on whole word information led to a decrease in their rule learning test performance as exposure progressed.
The dissociated local and long-range gamma synchronization modulations observed in Poor and Good learning groups align with recent results from magneto-encephalography in the visual domain. Combining the two types of synchrony measures used in our study, Siegel and colleagues (2008) showed that top–down bias of attention was mainly reflected by long-range synchrony mechanisms in the gamma band, whereas stimulus-dependent neuronal activity was associated with changes in local visual sensory synchrony in the same frequency range. In our study, reallocation of attention might exert top–down filtering to the relevant information that carries the rule while discarding irrelevant variable information (Gomez, 2002). This type of attentional bias has been described as having an important role in sentence comprehension (Astheimer & Sanders, 2009) and may underlie the increased gamma band coherence observed with syntactic complexity (Weiss et al., 2005). Indeed, recent intracortical results from Womelsdorf et al. (2007) demonstrated that gamma band synchronization of neural groups enhances their effective synaptic strength by cognitive top–down influences, which facilitates sensory information in terms of sensitization of stimulus-driven processing by means of increased expectations of specific features (Summerfield et al., 2006; Friston, Penny, & David, 2005). This modulation has also been described in speech perception and could have a particularly important role in the learning process (Davis & Johnsrude, 2007; Hickok & Poeppel, 2007). However, although we have favored an attentional account in this interpretation, oscillations in the gamma band represent a versatile neurophysiological mechanism that may serve many roles in addition to mediating attentional processes.
The study of neural oscillatory activity has been fruitful in understanding top–down and bottom–up interactions in perceptual processes (Buschman & Miller, 2007; Engel et al., 2001). By applying these analyses to word and rule acquisition from speech in adults, the results obtained in the present study suggest that there are different roles of attention and memory processes while learning these two types of information. Word learning was characterized by increased coherence in the theta band (4–8 Hz), most likely reflecting the matching between the words handled in working memory and incoming stimulation and progressively reinforcing the memory traces for new words. In contrast, rule learning was associated with a clear increase in synchrony between frontal and temporal areas in the gamma band. This result supports the hypothesis of a possible role of the reorienting of attention for rule learning. However, this consideration requires an extension of our findings and analyses to other experiments that focus on tracing the neural mechanisms underlying the process of language learning. We believe the results presented in this study open a new perspective in understanding how high cognitive functions tightly operate in the very early stages of language learning.
This work has been supported by the Spanish Ministerio de Educación y Ciencia postdoctoral grants EX2005-0404 to R. D. B. and 2007-0956 to L. F. and research grants SEJ2005-06067/PSIC and PSI2008-03901 to A. R. F. and PSI2008-3885 to R. D. B. We thank Josep Marco-Pallarés for the advice on analysis methods, Toni Cunillera and Matti Laine for helpful discussions on the data, and Marco Buiatti and two anonymous reviewers for their valuable comments on a previous version of the manuscript.
Reprint requests should be sent to Ruth de Diego-Balaguer, Dept. de Psicologia Bàsica, Faculty of Psychology, University of Barcelona, Pg. Vall d'Hebron 171, 08035 Barcelona, Spain, or via e-mail: email@example.com.
1. Importantly, it has been recently shown that induced gamma band power increases in the EEG are tightly driven by saccadic eye movements (Yuval-Greenberg, Tomer, Keren, Nelken, & Deouell, 2008). However, these findings were confined to induced gamma power that occurred in a specific time window of 200–300 msec after stimulus onset. In the present study, the induced gamma band power increase arose along the 696-msec word presentation (see Figure 2A), which excludes the possibility that eye movements could be considered the main origin of these increments.