Abstract

A foreign language (a language not spoken in one's community) is difficult to master completely. Early introduction of foreign-language (FL) education during childhood is becoming standard in many countries. However, the neural process of child FL learning remains largely unknown. We longitudinally followed 322 school-age children with diverse FL proficiency for three consecutive years, and acquired the children's event-related potential (ERP) responses to FL words that were semantically congruous or incongruous with the preceding picture context. As FL proficiency increased, various ERP components previously reported in mother-tongue (L1) acquisition (such as a broad negativity, an N400, and a late positive component) appeared sequentially, critically in an identical order to L1 acquisition. This finding was supported not only by cross-sectional analyses of children at different proficiency levels but also by longitudinal analyses of the same children over time. Our data are consistent with the hypothesis that FL learning in childhood reproduces identical developmental stages in an identical order to L1 acquisition, suggesting that the nature of the child's brain itself may determine the normal course of FL learning. Future research should test the generalizability of the results in other aspects of language such as syntax.

INTRODUCTION

Children are given opportunities to learn a foreign language (FL; a language not spoken in one's community) to prepare them for its practical use in the future, perhaps in adulthood. Both behavioral and neuroimaging evidence has accumulated on the processing of a second language in adults who have moved or immigrated to a country where that language is actually spoken (Wartenburger et al., 2003; Kim, Relkin, Lee, & Hirsch, 1997; Weber-Fox & Neville, 1996; Johnson & Newport, 1989). In particular, ultimate attainment of processing accuracy and efficiency in the second language has been claimed to be higher in early than in late arrivals (Weber-Fox & Neville, 1996; Johnson & Newport, 1989). Such claims support widespread beliefs about children's language learning ability. Currently, many countries start FL education at primary rather than secondary schools, targeting children of 6 to 12 years of age, who have acquired the basics of grammar in their mother tongue (L1) (Crain & Lillo-Martin, 1999) but have not passed the hypothesized sensitive period for language learning (Weber-Fox & Neville, 1996; Johnson & Newport, 1989). Our study aims to reveal the neural process of FL learning in children of these ages.

Behavioral research on FL learning has rigorously studied children in this age group, as well as adolescents and adults. One milestone of such research is the finding that FL learning undergoes developmental stages identical to L1 acquisition (for a review, see Muñoz, 2006; Ellis, 1994; Dulay, Burt, & Krashen, 1982). Both the multiple stages observed for FL learning and the order of the appearance of those stages seem to be identical to those of L1 acquisition, suggesting the strong role of biological constraints on language learning in general (Dulay et al., 1982) and the possibility of educational practice based on the learner-internal syllabus (see Ellis, 1994, for a review). Neuroimaging studies of child L1 acquisition have also revealed that brain responses to linguistic stimuli gradually change in multiple dimensions at different times during the long years of childhood (e.g., Hahne, Eckstein, & Friederici, 2004; Holcomb, Coffey, & Neville, 1992). To our knowledge, however, no neuroimaging data exist on FL learning in primary school years. In adult populations, longitudinal neuroimaging studies of FL learning have reported intriguing findings (Osterhout et al., 2008; Osterhout, McLaughlin, Pitkänen, Frenck-Mestre, & Molinaro, 2006; McLaughlin, Osterhout, & Kim, 2004), but comprehensive data covering multiple levels of FL proficiency have not been reported even in adults.

To detect multiple developmental stages of child FL learning, we longitudinally followed 322 school-age Japanese children with diverse proficiency in English for three consecutive years. This study size, which is comparable in scale to the largest longitudinal structural imaging studies of children (Shaw et al., 2006; Giedd et al., 1999), was essential for gaining a process-wide understanding of child FL learning and, in particular, for revealing its multiple developmental stages spanning from elementary to high levels of proficiency. We divided these children into groups of different levels of English proficiency (for more details, see below). The groups inevitably differed in the age of first exposure (AOFE) to English; children with higher proficiency started learning English earlier than did those with lower proficiency. A simple one-time cross-sectional comparison of these groups (Figure 1A) would thus be susceptible to confounding by AOFE. A purely longitudinal study (Figure 1B), in which a single group of child FL learners would be followed from elementary to high levels of FL proficiency, can avoid this problem, but was unrealistic in terms of the time required for data acquisition. We thus took a hybrid approach (Figure 1C), wherein multiple groups of children at different levels of FL proficiency were studied longitudinally. In this design, neural changes associated with progress in FL learning are assessed within each of the multiple groups. None of these intraindividual changes can be due to AOFE, because a child's AOFE does not change over the course of the study.

Figure 1. 

Possible study designs. (A) A simple cross-sectional design. Multiple groups of participants with different levels of FL proficiency are compared. This design is susceptible to confounding by factors that correlate with proficiency. (B) A simple longitudinal design. A single group of participants is followed longitudinally. This is too time-consuming because it takes many years to attain high FL proficiency. (C) A hybrid design. We adopted this design, in which cross-sectional and longitudinal designs are combined. Participants are divided into groups of different proficiency. Then, neural changes between two time points are assessed within each group, after some progress has been made in proficiency. Our three experimental groups, namely, Low (low-proficiency group), Medium (medium-proficiency group), and High (high-proficiency group), correspond to Groups A, B, and C, respectively.

Using the event-related potential technique, we studied children's cortical processing of FL words as a first attempt to provide direct and comprehensive neuroimaging evidence on child FL learning. Words are basic building blocks of language that underlie higher-level linguistic processing such as syntax and discourse, and are an appropriate initial target of study, especially in children who do not always have a high command of the FL. We measured event-related potentials (ERPs), which are real-time electrical reflections of ongoing neural activities in the brain. Recorded on the scalp, they lack precise spatial resolution but have excellent temporal resolution on the order of milliseconds (Rugg & Coles, 1995). They can be measured safely in children and have been successfully used to study L1 acquisition during childhood (Hahne et al., 2004; Holcomb et al., 1992).

As indices of cortical processing of L1 and FL words, we acquired ERPs using a modified version of the N400 paradigm (Kutas & Hillyard, 1980), in which the participants passively listened to words that were either congruous or incongruous in meaning with the preceding picture context (Friedrich & Friederici, 2004, 2005; Byrne et al., 1999; Connolly, Byrne, & Dywan, 1995). The ERPs thus elicited in the L1 correlate with development in children (Friedrich & Friederici, 2004, 2005; Byrne et al., 1999) and lexical knowledge in adults (Connolly et al., 1995).

Previous L1 studies using this particular paradigm as well as other N400 paradigms, taken together, indicate the following developmental stages in L1 acquisition: (Stage 1) no N400-like negativity (Friedrich & Friederici, 2005); (Stage 2) a long-lasting broad negativity which has a more anterior scalp distribution than does the typical posterior N400 (Silva-Pereyra, Rivera-Gaxiola, & Kuhl, 2005; Friedrich & Friederici, 2004; Hahne et al., 2004); (Stage 3) a posterior N400 and a late negativity with anterior dominance (Hahne et al., 2004; Byrne et al., 1999; Holcomb et al., 1992); and (Stage 4) addition of a late positive component (LPC) (Juottonen, Revonsuo, & Lang, 1996; McCallum, Farmer, & Pocock, 1984). These stages are listed in Table 1. Below we test the general hypothesis that child FL learning closely follows stages in L1 acquisition (Dulay et al., 1982).

Table 1. 

Predicted ERPs at Each Stage

| Stage | ERP |
|---|---|
| 1 | no negativity |
| 2 | long-lasting broad negativity |
| 3 | posterior N400 and anterior late negativity |
| 4 | addition of LPC (late positive component) |

METHODS

Participants

From the 322 children who had taken part in the project in all 3 years (85.41% of the original 377 children), 201 children were selected for the analyses of 3-year longitudinal data. They were 6 to 9 years old in Year 1 of the project, had lived in Japan since birth, were right-handed (Oldfield, 1971), had no disorder (psychological, neurological, auditory, or linguistic), had not lived with a native English speaker, had been born to a native Japanese-speaking mother, had provided ERP data of good signal-to-noise ratio (defined as having 30 or more accepted trials in each of the four conditions tested) in at least 2 years, and could be classified into one of the four groups of interest (Low, Medium, High, and Little Progress; see below). To determine children's levels of English proficiency and their 3-year progress, a meaning comprehension test of spoken English tailored for primary-school pupils in Japan (score range: 0–100 points)1 was administered in each year. The scores of this test were used as critical information in group selection.

We divided the children (total n = 201) into four groups of interest, namely, Low (n = 53), Medium (n = 55), High (n = 53), and Little Progress (n = 40), and analyzed longitudinal changes in their ERPs separately. The first three groups, Low (low-proficiency group), Medium (medium-proficiency group), and High (high-proficiency group), are experimental groups consisting of children who were learning English as an FL. These children made large progress in English proficiency between Year 1 and Year 3, and were selected to detect ERP changes associated with behavioral progress in FL proficiency. The three groups differed in English proficiency in Year 1. Children in the Low group had low proficiency in Year 1 (English score: 25–50 points in Year 1). Those in the Medium group and those in the High group had medium and high proficiency in Year 1, respectively (English score: 50–75 points in Medium, 75–100 points in High). The Little Progress group served as a control group. They had no or only short exposure to English and did not make large behavioral progress in English proficiency between Year 1 and Year 3. They were selected to obtain further evidence on the dependency between progress in FL proficiency and longitudinal ERP changes (see Table 2 for more details of the four groups).

Table 2. 

Details of Participants



| Measure | Year | Large Progress: Low | Large Progress: Medium | Large Progress: High | Little Progress |
|---|---|---|---|---|---|
| Number of children | | 53 | 55 | 53 | 40 |
| Number of boys (%) | | 19 (35.8) | 24 (43.6) | 28 (52.8) | 22 (55) |
| Mean handedness quotient (SD)^a | | 0.954 (0.113) | 0.944 (0.106) | 0.967 (0.079) | 0.973 (0.067) |
| Number of children analyzed^b | Year 1 | 48 | 48 | 47 | 36 |
| | Year 2 | 53 | 53 | 53 | 39 |
| | Year 3 | 52 | 53 | 53 | 37 |
| Mean age at EEG recording (SD) | Year 1 | 7.705 (1.330) | 7.987 (0.732) | 8.263 (1.437) | 7.804 (1.407) |
| | Year 2 | 8.824 (0.788) | 9.000 (0.732) | 9.417 (0.875) | 9.008 (0.666) |
| | Year 3 | 9.810 (0.789) | 9.978 (0.727) | 10.409 (0.870) | 10.002 (0.670) |
| Mean English score (SD) | Year 1 | 44.99 (4.664) | 59.52 (5.798) | 92.95 (7.279) | 57.37 (8.928) |
| | Year 2 | 54.72 (10.556) | 69.04 (12.199) | 98.06 (4.800) | 55.46 (11.263) |
| | Year 3 | 67.78 (6.301) | 80.43 (7.354) | 99.06 (2.051) | 55.53 (8.978) |
| Mean AOFE (SD) | | 5.559 (1.842) | 3.819 (2.465) | 2.594 (2.508) | 5.336^c (1.506)^c |
| Median HOE (= hours of exposure)^d | Year 1 | 37.5 | 248.2 | 1740 | 39.75 |
| | Year 2 | 67.5 | 329.5 | 2224 | 52 |
| | Year 3 | 97.5 | 380.7 | 2672 | 68.25 |


Data are shown separately for Low (low-proficiency group, with English scores ranging from 25 to 50 points in Year 1), Medium (medium-proficiency group, 50–75 points in Year 1), High (high-proficiency group, 75–100 points in Year 1), and Little Progress group (control group). SD = standard deviation; AOFE = age of first exposure to English.

aObtained by the Edinburgh Inventory (Oldfield, 1971). 1 = completely right-handed, −1 = completely left-handed.

bThe numbers can differ among the years because not all children provided ERPs with good signal-to-noise ratio in all years.

cBased on data from only 25 children, because not all children in the Little Progress group had been exposed to English.

dHOE is the total hours of exposure to English since AOFE. Children had been exposed to English through one or more of the following means: (1) school lessons (10–25 hours per year at public schools and 450–600 hours per year at one private school running an immersion program), (2) private lessons outside school, (3) home study, and (4) contact with English-speaking acquaintances.

Additional information relevant to group selection is as follows. The English scores of Low and Medium improved by about 20 points between Year 1 and Year 3 (minimum 10 points) (see also Table 2). All children in the High group scored 90 points or higher in Year 3. They became more proficient than measurable by the test we used (i.e., ceiling effects) due to their extremely long hours of exposure to English (Table 2).2 The behavioral progress of the Little Progress group was less than 5 points (average −1.82 points) between Year 1 and Year 3. Exposure to English began earlier in children with higher English proficiency; AOFE to English was earliest in the High group and latest in the Low group (Table 2). Thus, one-time cross-sectional comparisons of these groups cannot tease apart the effects of proficiency from those of AOFE. Hence, we resort to longitudinal analyses within each group to determine what ERP changes accompany progress in FL proficiency.
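The score-based part of the selection criteria above can be sketched as a small function. This is only an illustration under stated assumptions, not the procedure actually used: the function name `assign_group` is hypothetical, the score bands are treated as half-open intervals because the text does not say how boundary scores (50 or 75 points) were handled, and the exposure-related criteria for the Little Progress group are ignored.

```python
from typing import Optional


def assign_group(score_y1: float, score_y3: float) -> Optional[str]:
    """Assign a child to a group from Year 1 and Year 3 English scores (0-100).

    Assumptions (not stated in the text): score bands are half-open, the High
    band is checked first, and the Little Progress group is defined by
    progress alone, without the exposure criterion.
    """
    progress = score_y3 - score_y1

    if score_y1 >= 75:
        # High: 75-100 points in Year 1 and 90 points or higher in Year 3.
        return "High" if score_y3 >= 90 else None
    if progress < 5:
        # Control group: progress of less than 5 points over the 3 years.
        return "Little Progress"
    if progress >= 10:
        # Low and Medium: minimum progress of 10 points.
        if 50 <= score_y1 < 75:
            return "Medium"
        if 25 <= score_y1 < 50:
            return "Low"
    # Children meeting none of the criteria are not analyzed.
    return None


# Example with the group means of Table 2 (Year 1, Year 3).
print(assign_group(44.99, 67.78))  # Low
print(assign_group(57.37, 55.53))  # Little Progress
```

Under these assumptions, a child who progressed by between 5 and 10 points falls into none of the four groups, which is one way the 201 analyzed children could be a subset of the 322.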

All children and their parents gave informed assent and consent, respectively. The ethics committee of Tokyo Metropolitan University approved the procedures.

Stimuli

Prerecorded words pronounced by a female professional bilingual narrator were used as the critical experimental stimuli. They consisted of 80 basic-level English words appropriate for Japanese children (words difficult to understand due to cultural differences were not used) and 80 Japanese words with the corresponding meanings [mean word length: 526 and 518 msec; mean log10 of print frequency per million: 1.661 and 1.592 counts3; differences between languages: t(158) = −0.472, p = .670 in word length; t = 0.666, p = .506 in log10 frequency; t = −0.975, p = .331 in root-mean-square amplitude; and t = 0.606, p = .545 in maximal amplitude; full list in Appendix]. The playback of a spoken word was preceded by the appearance of a colored illustration. The illustrations were refreshed every year.

Procedure

Children were invited for electroencephalogram (EEG) recording once every year. One recording session consisted of four subsessions, each of which contained 80 trials. One trial sequence consisted of the appearance of a picture (colored illustration) at a visual angle of 10.61° × 10.61°, the playback of a spoken word 1000 msec after picture onset (around 60 dB sound pressure level), and the disappearance of the picture 750 msec after word offset. The stimulus onset asynchrony was 4000 msec. Each of the 80 English (FL) and 80 Japanese (L1) stimulus words was presented twice during the entire recording session, once in a congruous context (congruous with the preceding picture) and once in an incongruous context (Table 3). The English and Japanese words were presented in different subsessions. In each of the four subsessions, half of the trials (40) were in the congruous condition and the other half (40) in the incongruous condition. Children were instructed to attend to the visual and auditory stimuli silently. No behavioral task was assigned during EEG recording.
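The trial counts above can be checked with a short sketch that builds one possible session. How words were allocated to subsessions, the order of subsessions, and the randomization are not described in the text, so the scheme below (each word appears once in each of the two subsessions of its language, congruous in one and incongruous in the other) is an assumption, and `build_session` is a hypothetical name.

```python
import random

N_WORDS = 80    # stimulus words per language (from the text)
SOA_MS = 4000   # stimulus onset asynchrony in msec (from the text)


def build_session(seed: int = 0):
    """Return four subsessions of 80 (language, word_id, condition) trials.

    Assumed allocation: within a language, half of the words are congruous
    in the first subsession and incongruous in the second, and vice versa.
    """
    rng = random.Random(seed)
    subsessions = []
    for language in ("English", "Japanese"):
        words = list(range(N_WORDS))
        rng.shuffle(words)
        first_half, second_half = words[:40], words[40:]
        for cong, incong in ((first_half, second_half), (second_half, first_half)):
            trials = [(language, w, "congruous") for w in cong]
            trials += [(language, w, "incongruous") for w in incong]
            rng.shuffle(trials)  # randomize trial order within the subsession
            subsessions.append(trials)
    return subsessions


session = build_session()
total_trials = sum(len(s) for s in session)
# Pure stimulation time implied by the SOA, excluding any breaks.
duration_min = total_trials * SOA_MS / 60000
print(total_trials, round(duration_min, 1))
```

With 320 trials at an asynchrony of 4000 msec, stimulation alone takes about 21 minutes per session, before any breaks between subsessions.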

Table 3. 

Experimental Conditions

| Condition | Language | Context | Word |
|---|---|---|---|
| Congruous L1 | Japanese | picture of a bird | tori (“bird”) |
| Incongruous L1 | Japanese | picture of a bird | kuruma (“car”) |
| Congruous FL | English | picture of a bird | bird |
| Incongruous FL | English | picture of a bird | desk |

L1 = mother tongue; FL = foreign language. Each of the four conditions had 80 trials. One trial consisted of the appearance of a picture which provided a context, and the playback of a spoken word. Examples of picture–word pairs are given in the table.

Data Acquisition and Analyses

In a sound-proof, electrically shielded room, EEGs were continuously recorded through 27 scalp electrodes arranged according to the Extended International 10–20 System (Sharbrough et al., 1991) (Figure 2). NuAmp (Neuroscan, Texas) digitized EEGs at 500 Hz, using Fpz as the ground and the left earlobe as the on-line reference, with a band-pass filter of 0.1 to 100 Hz applied on-line. Vertical and horizontal electrooculograms were recorded through the electrodes placed above and below the left eye and those placed beside the outer canthi. After applying an off-line band-pass filter of 0.3–30 Hz, the continuous EEGs were segmented into epochs of 1700 msec consisting of 200 msec before word onset and 1500 msec after word onset. The scalp EEG channels were re-referenced to the average of both earlobes off-line. Ocular artifacts on all scalp EEG channels were corrected by a method based on regression analysis (Semlitsch, Anderer, Schuster, & Presslich, 1986). Epochs exceeding ±75 μV at the right earlobe or ±150 μV at the scalp electrodes were rejected automatically. Noisy datasets were further eliminated by visual inspection. Table 4 presents the mean numbers of averaged trials.
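The segmentation and automatic rejection steps can be illustrated with a minimal NumPy sketch. It is not the software actually used: filtering, re-referencing, and ocular correction are omitted, the function and argument names are hypothetical, and applying the thresholds to absolute amplitude anywhere in the epoch is an assumption about how "exceeding" was evaluated.

```python
import numpy as np

FS = 500                      # sampling rate in Hz (from the text)
PRE_MS, POST_MS = 200, 1500   # epoch limits relative to word onset
EAR_LIMIT_UV = 75.0           # rejection threshold at the right earlobe
SCALP_LIMIT_UV = 150.0        # rejection threshold at scalp electrodes


def ms_to_samples(ms: float) -> int:
    return int(round(ms * FS / 1000))


def extract_epochs(eeg: np.ndarray, onsets, ear_index: int) -> np.ndarray:
    """Cut epochs around word onsets and drop those exceeding the thresholds.

    eeg: array (n_channels, n_samples) in microvolts.
    onsets: sample indices of word onsets.
    ear_index: row of the right-earlobe channel; all other rows are
        treated as scalp channels (an assumption of this sketch).
    """
    pre, post = ms_to_samples(PRE_MS), ms_to_samples(POST_MS)
    scalp = np.ones(eeg.shape[0], dtype=bool)
    scalp[ear_index] = False

    kept = []
    for onset in onsets:
        if onset - pre < 0 or onset + post > eeg.shape[1]:
            continue  # incomplete epoch at the edge of the recording
        epoch = eeg[:, onset - pre:onset + post]
        if np.abs(epoch[ear_index]).max() > EAR_LIMIT_UV:
            continue
        if np.abs(epoch[scalp]).max() > SCALP_LIMIT_UV:
            continue
        kept.append(epoch)

    if not kept:
        return np.empty((0, eeg.shape[0], pre + post))
    return np.stack(kept)  # (n_epochs, n_channels, 850 samples)
```

At 500 Hz, the 1700-msec epoch corresponds to 850 samples, of which the first 100 precede word onset.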

Figure 2. 

Electrode positions displayed on a schematic head seen from above. The participant wore an elastic cap (EasyCap, Germany) in which 27 Ag/AgCl sintered electrodes were mounted. The shaded areas in this schema correspond to the regions used in statistical analyses.

Table 4. 

Mean Number of Averaged Trials (Standard Deviation in Parentheses)



| Group | Year | English (Foreign Language): Congruous | English (Foreign Language): Incongruous | Japanese (Mother Tongue): Congruous | Japanese (Mother Tongue): Incongruous |
|---|---|---|---|---|---|
| Low | Year 1 | 67.0 (10.6) | 67.5 (10.4) | 69.3 (9.3) | 68.5 (10.5) |
| | Year 2 | 72.5 (7.9) | 72.4 (8.4) | 73.9 (6.7) | 73.8 (6.9) |
| | Year 3 | 73.4 (8.5) | 73.2 (8.8) | 73.7 (8.4) | 73.2 (9.5) |
| Medium | Year 1 | 67.7 (11.4) | 68.1 (10.9) | 67.3 (11.7) | 66.7 (11.6) |
| | Year 2 | 73.7 (5.8) | 73.4 (5.8) | 73.4 (5.7) | 73.6 (5.3) |
| | Year 3 | 74.0 (6.6) | 73.8 (6.2) | 73.8 (5.8) | 74.0 (5.7) |
| High | Year 1 | 65.5 (12.4) | 66.3 (10.9) | 66.1 (11.5) | 66.0 (11.3) |
| | Year 2 | 74.8 (6.3) | 74.2 (6.0) | 73.5 (6.0) | 72.9 (7.5) |
| | Year 3 | 75.7 (6.4) | 74.9 (7.2) | 75.3 (6.0) | 75.6 (6.1) |
| Little Progress | Year 1 | 68.3 (10.3) | 68.5 (10.2) | 70.5 (9.0) | 70.0 (11.1) |
| | Year 2 | 72.5 (8.9) | 73.0 (8.7) | 71.1 (8.1) | 71.5 (9.2) |
| | Year 3 | 74.7 (6.8) | 73.5 (7.9) | 72.6 (7.9) | 72.2 (7.9) |


The lateral electrodes were analyzed as four regions: anterior (left: Fp1, F3, F7; right: Fp2, F4, F8), temporal (left: FC5, T7, CP5; right: FC6, T8, CP6), central (left: FC1, C3, CP1; right: FC2, C4, CP2), and posterior (left: P3, P7, O1; right: P4, P8, O2). The ERP data for the L1 and the FL were analyzed in different time windows (TWs) to accommodate possible latency differences (L1: 100–200, 250–450, 500–700, 800–1100 msec; FL: 100–200, 300–500, 600–900, 1000–1400 msec). These TWs were determined based on previous L1 acquisition studies and on visual inspection of the waveforms. Using multivariate analysis of variance (MANOVA), we tested the effects of condition (congruous, incongruous) and its interactions with TW (4 levels), hemisphere (left, right), region (lateral electrodes only: anterior, temporal, central, posterior), and electrode (midline electrodes only: Fz, Cz, Pz), with the significance level set at p = .05. Significant interactions between condition and a regional factor (region or electrode) were followed by a test of condition effect at each level of the regional factor. Group (Low, Medium, High) and year (1, 2, 3) were analyzed in MANOVAs as between-subject factors. One could have used regression analyses without separating the participants into groups, but we opted for testing the significance of condition in groups, in order to present the results in a manner comparable to the previous ERP studies of L1 acquisition we referred to above.
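The regional and time-window structure described above can be sketched as follows. The sketch assumes that the dependent measure is the mean amplitude over the electrodes of a region within a time window, which the text does not state explicitly; the function names are hypothetical, and the MANOVA itself is not reproduced.

```python
import numpy as np

FS = 500       # sampling rate in Hz
PRE_MS = 200   # pre-onset interval contained in each epoch

# Lateral regions by hemisphere (from the text).
REGIONS = {
    ("anterior", "left"): ["Fp1", "F3", "F7"],
    ("anterior", "right"): ["Fp2", "F4", "F8"],
    ("temporal", "left"): ["FC5", "T7", "CP5"],
    ("temporal", "right"): ["FC6", "T8", "CP6"],
    ("central", "left"): ["FC1", "C3", "CP1"],
    ("central", "right"): ["FC2", "C4", "CP2"],
    ("posterior", "left"): ["P3", "P7", "O1"],
    ("posterior", "right"): ["P4", "P8", "O2"],
}
FL_WINDOWS_MS = [(100, 200), (300, 500), (600, 900), (1000, 1400)]
L1_WINDOWS_MS = [(100, 200), (250, 450), (500, 700), (800, 1100)]


def window_slice(start_ms: int, end_ms: int) -> slice:
    """Convert a post-onset time window into sample indices of the epoch."""
    def to_sample(ms):
        return int(round((ms + PRE_MS) * FS / 1000))
    return slice(to_sample(start_ms), to_sample(end_ms))


def condition_effect(erp_incong, erp_cong, channel_names, windows=FL_WINDOWS_MS):
    """Mean incongruous-minus-congruous amplitude per region, hemisphere, and TW.

    erp_incong, erp_cong: arrays (n_channels, n_samples) holding one child's
    averaged ERPs; channel_names gives the electrode label of each row.
    """
    diff = np.asarray(erp_incong) - np.asarray(erp_cong)
    index = {name: i for i, name in enumerate(channel_names)}
    effects = {}
    for (region, hemisphere), electrodes in REGIONS.items():
        rows = [index[e] for e in electrodes]
        for win in windows:
            effects[(region, hemisphere, win)] = float(
                diff[rows, window_slice(*win)].mean()
            )
    return effects
```

A negative value in a given cell indicates that incongruous words elicited more negative ERPs than congruous words there, as expected for an N400-like effect at posterior sites in the 300–500 msec window.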

RESULTS

Longitudinal changes in each group are summarized in Figure 3. In Year 1, congruous and incongruous FL words did not differ clearly in Low (low-proficiency group). Incongruous FL words evoked a broad negativity in Medium (medium-proficiency group). An N400 response with posterior scalp distribution appeared in High (high-proficiency group). Overall, the ERPs of Low, Medium, and High in Year 1 are compatible with ERPs at Stages 1, 2, and 3 in L1 acquisition, respectively. In Year 3, a broad negativity appeared in Low. Children in Medium showed an N400 with posterior distribution. An LPC was found in High in addition to an N400. Thus, the ERPs of Low, Medium, and High in Year 3 are compatible with ERPs at Stages 2, 3, and 4 in L1 acquisition. The results of statistical analyses are presented in detail below.

Figure 3. 

Longitudinal changes in ERP responses to FL words. ERP waveforms (at electrode Pz) and t maps comparing incongruous and congruous words are shown separately for Year 1 (left column) and Year 3 (right column). (A–C) ERP responses to FL words obtained in Year 1 for Low (low-proficiency group), Medium (medium-proficiency group), and High (high-proficiency group). (D) ERP responses to mother-tongue (L1) words obtained from High in Year 1 are shown as a reference. The t maps in (D) are for the following time windows: 100–200 msec, 250–450 msec, 500–700 msec, and 800–1100 msec. (E–G) ERP responses to FL words obtained in Year 3 for Low, Medium, and High. (H, I) ERP responses to FL words obtained in Years 1 and 3 for the Little Progress group, who did not show large behavioral progress in FL proficiency, unlike Low, Medium, and High. aLN = anterior late negativity; bNeg = broad negativity; FL = foreign language; L1 = mother tongue; LPC = late positive component.

Low, Medium, and High in Year 1

Grand-average ERP waveforms in the FL obtained in Year 1 are shown separately for Low, Medium, and High in Figure 4A–C. A MANOVA involving group as a factor showed that in Year 1, condition (congruous vs. incongruous) effects in the FL significantly differed among the three proficiency groups [TW × Condition × Region × Group, F(18, 264) = 3.415, p < .001]. Table 5 shows the results of MANOVAs conducted separately for Low, Medium, and High.

Figure 4. 

Grand-average waveforms in the FL (English). ERP responses to congruous (solid lines) and incongruous words (dotted lines) in Years 1 and 3 are plotted separately for Low, Medium, and High. The vertical lines indicate the onset of the word. Negative is plotted upward.

Table 5. 

MANOVA Results on ERPs in Year 1 and Year 3



| Dataset | Electrodes | Effect | df | 100–200 msec: F | p | 300–500 msec: F | p | 600–900 msec: F | p | 1000–1400 msec: F | p |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Low, Year 1 | Midline | Condition | 1, 47 | 0.362 | | 2.094 | | 3.116 | .084(*) | 0.582 | |
| | | C × Electrode | 2, 46 | 0.676 | | 0.268 | | 0.204 | | 2.073 | |
| | Lateral | Condition | 1, 47 | 0.243 | | 0.444 | | 3.739 | .059(*) | 0.377 | |
| | | C × H | 1, 47 | 0.155 | | 1.368 | | 0.607 | | 0.760 | |
| | | C × Region | 3, 45 | 0.198 | | 2.062 | | 0.575 | | 0.096 | |
| | | C × H × R | 3, 45 | 2.377 | .082(*) | 1.170 | | 2.430 | .078(*) | 1.273 | |
| Medium, Year 1 | Midline | Condition | 1, 47 | 0.758 | | 8.982 | .004** | 8.673 | .005** | 0.029 | |
| | | C × Electrode | 2, 46 | 0.591 | | 0.511 | | 0.063 | | 1.289 | |
| | Lateral | Condition | 1, 47 | 0.783 | | 6.768 | .012* | 7.420 | .009** | 0.560 | |
| | | C × H | 1, 47 | 1.078 | | 0.364 | | 0.146 | | 0.088 | |
| | | C × Region | 3, 45 | 1.958 | | 2.403 | .080(*) | 1.817 | | 3.267 | .030* |
| | | C × H × R | 3, 45 | 0.401 | | 0.298 | | 1.090 | | 0.742 | |
| High, Year 1 | Midline | Condition | 1, 46 | 1.894 | | 26.64 | <.001*** | 17.05 | <.001*** | 0.039 | |
| | | C × Electrode | 2, 45 | 1.139 | | 3.521 | .038* | 3.345 | .044* | 0.900 | |
| | Lateral | Condition | 1, 46 | 0.773 | | 17.68 | <.001*** | 23.14 | <.001*** | 0.264 | |
| | | C × H | 1, 46 | 2.413 | | 1.081 | | 0.083 | | 5.058 | .029* |
| | | C × Region | 3, 44 | 5.231 | [.004] | 4.469 | .008** | 10.09 | <.001*** | 4.837 | [.005] |
| | | C × H × R | 3, 44 | 0.131 | | 0.222 | | 0.182 | | 1.307 | |
| Mother Tongue (High), Year 1 | Midline | Condition | 1, 46 | 1.001 | | 63.88 | <.001*** | 50.46 | <.001*** | 1.756 | |
| | | C × Electrode | 2, 45 | 0.133 | | 3.170 | .052(*) | 8.350 | .001** | 9.830 | <.001*** |
| | Lateral | Condition | 1, 46 | 0.186 | | 54.49 | <.001*** | 61.04 | <.001*** | 2.104 | |
| | | C × H | 1, 46 | 1.143 | | 0.115 | | 0.268 | | 3.622 | .063(*) |
| | | C × Region | 3, 44 | 3.038 | [.039] | 9.431 | <.001*** | 19.56 | <.001*** | 14.72 | <.001*** |
| | | C × H × R | 3, 44 | 0.629 | | 1.094 | | 0.934 | | 0.751 | |
| Low, Year 3 | Midline | Condition | 1, 51 | 0.010 | | 1.108 | | 15.61 | <.001*** | 7.655 | .008** |
| | | C × Electrode | 2, 50 | 1.144 | | 0.158 | | 0.214 | | 0.115 | |
| | Lateral | Condition | 1, 51 | 0.411 | | 1.666 | | 14.17 | <.001*** | 5.687 | .021* |
| | | C × H | 1, 51 | 0.959 | | 0.449 | | 0.435 | | 0.113 | |
| | | C × Region | 3, 49 | 0.679 | | 3.490 | .022* | 4.911 | .005** | 3.107 | .035* |
| | | C × H × R | 3, 49 | 1.154 | | 1.350 | | 0.455 | | 0.850 | |
| Medium, Year 3 | Midline | Condition | 1, 52 | 0.433 | | 15.48 | <.001*** | 23.80 | <.001*** | 0.115 | |
| | | C × Electrode | 2, 51 | 0.529 | | 2.149 | | 1.830 | | 0.174 | |
| | Lateral | Condition | 1, 52 | 1.046 | | 8.006 | .007** | 18.12 | <.001*** | 0.015 | |
| | | C × H | 1, 52 | 0.024 | | 0.032 | | 3.040 | .087(*) | 5.177 | .027* |
| | | C × Region | 3, 50 | 0.143 | | 5.118 | .004** | 3.951 | .013* | 3.248 | [.029] |
| | | C × H × R | 3, 50 | 0.402 | | 0.344 | | 2.461 | .073(*) | 0.454 | |
| High, Year 3 | Midline | Condition | 1, 52 | 1.684 | | 24.63 | <.001*** | 13.80 | <.001*** | 0.450 | |
| | | C × Electrode | 2, 51 | 0.651 | | 3.720 | .031* | 11.17 | <.001*** | 6.680 | .003** |
| | Lateral | Condition | 1, 52 | 3.539 | .066(*) | 9.970 | .003** | 10.96 | .002** | 0.001 | |
| | | C × H | 1, 52 | 0.000 | | 0.414 | | 0.054 | | 8.801 | .005** |
| | | C × Region | 3, 50 | 1.247 | | 10.04 | <.001*** | 17.22 | <.001*** | 8.016 | <.001*** |
| | | C × H × R | 3, 50 | 1.063 | | 0.142 | | 0.097 | | 0.911 | |


The values are for foreign-language words unless otherwise stated. p Values larger than .1 are omitted. Asterisks indicate statistically significant results (*p < .05, **p < .01, ***p < .001). Marginally significant results (.05 < p < .1) are accompanied by (*). p Values in square brackets [ ] denote statistically significant interactions that were not followed by a statistically significant condition main effect in subsequent analyses. All the significant Condition × Hemisphere interactions reported above reflect more negative ERPs elicited by incongruous than by congruous words in the right hemisphere in the last TW. Hence, these interactions are not elaborated in the main text. df = degrees of freedom; C = condition; H = hemisphere; R = region.
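The marking convention described in the table note can be illustrated with a small lookup. The following Python sketch is only an illustration: the function name `significance_marker` is hypothetical, and the handling of values exactly equal to a boundary is an assumption, since the note does not specify it. The square-bracket notation is left out because it depends on the outcome of follow-up analyses rather than on the p value alone.

```python
def significance_marker(p: float) -> str:
    """Return the marker that the table note attaches to a p value.

    Assumption (not stated in the note): a value exactly equal to a
    boundary such as .05 falls into the less significant category.
    """
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    if p < 0.1:
        return "(*)"  # marginally significant
    return ""  # p values larger than .1 are omitted from the table


# Example p values of the kind that appear in the table
for p in (0.0005, 0.007, 0.031, 0.087, 0.45):
    print(p, repr(significance_marker(p)))
```

Under this reading, a tabulated value such as .087 receives the marginal marker (*), and .007 receives two asterisks.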

In Low (Figure 3A, Figure 4A), incongruous FL words elicited no significantly different response compared to congruous FL words (all ps > .05; Table 5).

In Medium (Figure 3B, Figure 4B), a long-lasting broad negativity appeared for incongruous words, as indicated by significant main effects of condition in both 300–500 msec and 600–900 msec TWs (Table 5). The negativity had a broad scalp distribution and spread from posterior to anterior regions (Figure 5A and B).

Figure 5. Scalp distributions of the negativity in the 300–500 msec time window. ERP voltages of [Incongruous − Congruous] subtraction are shown. (A) Voltage maps of Medium. The negativity elicited by incongruous FL words was distributed broadly and spread to anterior regions in Year 1, but not in Year 3, when the negativity in the same time window had a posterior distribution, which is typical of the N400. (B) Average voltages at fronto-polar (Fp1/2), frontal (F7/3/z/4/8), fronto-central (Fc5/1/2/6), central (T7/8, C3/z/4), centro-parietal (CP5/1/2/6), parietal (P7/3/z/4/8), and occipital electrodes (O1/2) in Medium. Only the data in Year 3 have a clear posterior distribution. (C) Voltage maps of High. Upper: ERPs in the FL in Year 1 (time window: 300–500 msec). Lower: ERPs in the mother tongue in Year 1 (time window: 250–450 msec).
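The [Incongruous − Congruous] subtraction within a fixed time window can be sketched as follows. This is a minimal Python illustration, not the authors' analysis pipeline: the function names, the 100-msec sampling grid, and the voltage values are all invented for the example, and preprocessing steps such as filtering and baseline correction are not shown.

```python
from statistics import mean


def window_mean(samples, times_ms, start_ms, end_ms):
    """Mean voltage of one waveform over the window [start_ms, end_ms]."""
    selected = [v for v, t in zip(samples, times_ms) if start_ms <= t <= end_ms]
    if not selected:
        raise ValueError("no samples fall inside the time window")
    return mean(selected)


def incongruity_effect(incongruous, congruous, times_ms, start_ms=300, end_ms=500):
    """[Incongruous - Congruous] mean amplitude in a time window (microvolts)."""
    return (window_mean(incongruous, times_ms, start_ms, end_ms)
            - window_mean(congruous, times_ms, start_ms, end_ms))


# Toy waveforms at one electrode (hypothetical values, in microvolts)
times = [0, 100, 200, 300, 400, 500, 600]
congruous = [0.0, 0.5, 1.0, 1.0, 1.0, 1.0, 0.5]
incongruous = [0.0, 0.5, 1.0, -1.0, -2.0, -3.0, 0.5]

effect = incongruity_effect(incongruous, congruous, times)
print(effect)  # a negative value means incongruous words were more negative
```

Averaging such differences over the electrodes of one region (for example, the parietal electrodes listed in panel B) would give one value per region, which is the kind of quantity plotted in the figure.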

In High (Figure 3C, Figure 4C), a clear N400 with posterior dominance and a late negativity with anterior dominance can be seen. In the 300–500 msec TW, the large negativity elicited by incongruous words was stronger toward central to posterior regions (Figure 5C, upper) [anterior: F(1, 46) = 6.235, p = .016; temporal: F = 13.22, p = .001; central: F = 21.51, p < .001; posterior: F = 16.09, p < .001]. In the next TW (600–900 msec), the negativity was statistically significant at anterior, temporal, and central regions [F(1, 46) = 41.88, 25.58, and 18.66; all ps < .001] and at Fz and Cz [t(46) = −5.394 and −4.095, both ps < .001], but not at the posterior region [F < 1] or at Pz [t(46) = −1.661], suggesting that the negativity in this TW had a more anterior distribution than that in the previous TW. There seems to be a trend toward the appearance of an LPC, but this trend was statistically unreliable (Figure 6A).
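The single-electrode comparisons reported above as t(46) are paired comparisons across children; a df of 46 corresponds to 47 paired observations. The following Python sketch shows how such a statistic is formed from per-child mean amplitudes. It is an illustration only: the function name `paired_t` and the four toy values are hypothetical, and the sketch does not reproduce the authors' MANOVA-based procedure.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(incongruous, congruous):
    """Paired t statistic and degrees of freedom for per-child amplitudes."""
    if len(incongruous) != len(congruous):
        raise ValueError("each child needs one value per condition")
    diffs = [i - c for i, c in zip(incongruous, congruous)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two children are required")
    standard_error = stdev(diffs) / sqrt(n)  # sample SD of the differences
    return mean(diffs) / standard_error, n - 1


# Toy per-child mean amplitudes at one electrode (hypothetical, microvolts)
incongruous = [-3.0, -1.0, -4.0, -2.0]
congruous = [-1.0, 0.0, -1.0, -1.0]

t_value, df = paired_t(incongruous, congruous)
print(round(t_value, 3), df)
```

A negative t value, as in the values reported at Fz and Cz, indicates that incongruous words elicited more negative potentials than congruous words.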

Figure 6. Voltage maps and t maps in consecutive 100-msec time windows of the late latency range in High. [Incongruous − Congruous] subtraction data in the FL are shown separately for Year 1 (A) and Year 3 (B). The contour lines in the t maps surround areas of statistically significant effects.

Mother Tongue

In addition to the posterior N400 and the anterior late negativity, incongruous words in the L1 evoked an LPC at the posterior region in all of Low, Medium, and High (Figure 7 for grand-average ERP waveforms). Hence, the different patterns of ERP responses in the FL among Low, Medium, and High are not attributable to the L1. Here, the data of High in Year 1 are statistically analyzed as representative data (see also Figure 3D). The strong negativity elicited by incongruous words (Table 5) was weaker toward the anterior region in the 250–450 msec TW (Figure 5C, lower) [F(1, 46) = 8.416, p = .006 at the anterior region; all other Fs > 30, all other ps < .001], whereas it was weaker toward the posterior region in the 500–700 msec TW [F(1, 46) < 1 at posterior region; all other Fs > 45, ps < .001]. In the last TW (800–1100 msec), the Condition × Electrode and Condition × Region interactions were significant, supporting the appearance of an LPC at posterior sites [t(46) = 3.388, p = .001 at Pz; F(1, 46) = 24.16, p < .001 at the posterior region].

Figure 7. Grand-average waveforms in the mother tongue (Japanese). ERP responses to congruous (solid lines) and incongruous words (dotted lines) are plotted separately for each group (Low, Medium, and High), for each year (Year 1, 2, and 3). The vertical lines indicate the onset of the word. Negative is plotted upward.

The early priming effect for congruous words (more negative ERPs for congruous than for incongruous words starting around 100 msec) found previously in an identical experimental paradigm (Friedrich & Friederici, 2004, 2005) seemed to be inconsistent even in the L1 in our study. Our stepwise approach using an omnibus MANOVA and follow-up analyses of significant interactions failed to detect any statistically significant results for this effect in any of the three groups. Hence, we resorted to a region-of-interest (ROI) approach. When electrodes F7/8 and T7/8 were analyzed as ROIs in the 100–200 msec TW, we found a significant condition effect at the left ROI [F(1, 46) = 5.939, p = .019] in High, due to more negative potentials for congruous than for incongruous words. This effect was not detected statistically in Low and Medium (for FL words, ROI analyses revealed no significant effects in any group in Year 1).

Longitudinally, the amplitudes of ERP responses to L1 words seem to have become smaller toward Year 3 in all groups (Figure 7). As will be clear below, these longitudinal changes in the L1 did not parallel those in the FL. The possibility that the ERP changes detected in the FL are not specific to the FL can thus be excluded.

Low, Medium, and High in Year 3

The results in Year 1 provide precise predictions as to longitudinal changes between Year 1 and Year 3: Low will show a broad negativity (specific to Stage 2) in Year 3, as did Medium in Year 1; a posterior N400 and an anterior late negativity (that are typical of Stage 3) will appear in Medium in Year 3, as in the Year 1 data of High; an LPC (associated with Stage 4) will be found in High in Year 3 as in the L1. The ERP data obtained (Figure 3E–G) seem to be compatible with these predictions. Changes which occurred between Year 1 and Year 3 differed significantly among the three groups [TW × Condition × Region × Group × Year, F(18, 574) = 2.065, p = .006]. The groups will be analyzed separately below (Table 5).

In Low (Figure 3E, Figure 4D), a significant Condition × Region interaction was found in the 300–500 msec TW, due to significantly more negative potentials elicited by incongruous words than by congruous words only at the temporal region [F(1, 51) = 5.363, p = .025]. In both the 600–900 msec and 1000–1400 msec TWs, incongruous words continued to elicit more negative ERPs than congruous words, as indicated by the significant condition main effect and the significant Condition × Region interaction (Table 5). In the 600–900 msec TW, the condition main effect was significant at all regions but was stronger at temporal and central regions [anterior: F(1, 51) = 4.439, p = .040; temporal, F = 15.80, p < .001; central, F = 14.51, p < .001; posterior, F = 8.223, p = .006]. Only at those two regions did the condition effect reach significance in the next TW [temporal, F(1, 51) = 8.436, p = .005; central, F = 6.103, p = .017]. Hence, Low showed a long-lasting negativity at broad areas, which did not have a posterior distribution typical of the N400.

In Medium (Figure 3F, Figure 4E), significant condition main effects at both lateral and midline electrodes and a significant Condition × Region interaction at the lateral electrodes were found in the 300–500 msec TW (Table 5), due to more negative potentials elicited by incongruous than by congruous words. The negativity had a posterior distribution typical of the full-blown N400 (Figure 5A and B), and was statistically significant at the temporal [F(1, 52) = 5.873, p = .019], central [F = 13.97, p < .001], and posterior region [F = 7.147, p = .010], but not at the anterior region (F < 1). In the 600–900 msec TW, the negativity spread to all regions [anterior: F(1, 52) = 7.463, p = .009; temporal: F = 14.35, p < .001; central: F = 25.58, p < .001; posterior: F = 13.38, p = .001], but statistically, the scalp distributions of the negative potentials did not differ between 300–500 msec and 600–900 msec TW in the anterior–posterior dimension [TW × Condition × Region, F(3, 50) < 1; TW × Condition × Hemisphere × Region, F(3, 50) = 2.318, p = .087]. In a slightly later TW, however, the late negativity did shift to more anterior sites; in the 750–1050 msec TW, the Condition × Region interaction was significant [F(3, 50) = 3.237, p = .030; condition main effect, F(1, 52) = 7.363, p = .009], reflecting the more anterior distribution of the negativity than in the 300–500 msec TW [anterior: F(1, 52) = 5.523, p = .023; temporal: F = 6.956, p = .011; central: F = 9.213, p = .004; posterior: F = 2.143, p > .1; TW × Condition × Region interaction, F(3, 50) = 3.122, p = .034].

In High (Figure 3G, Figure 4F), the negativity in the 300–500 msec TW was significant at all but the anterior region [anterior: F(1, 52) = 2.288, p > .1; temporal: F = 5.832, p = .019; central: F = 24.67, p < .001; posterior: F = 4.055, p = .049], whereas the negativity in the 600–900 msec TW was significant at all but the posterior region [anterior: F(1, 52) = 17.55, p < .001; temporal: F = 9.764, p = .003; central: F = 20.84, p < .001; posterior: F < 1]. In the last TW (1000–1400 msec), a significant Condition × Electrode interaction and a significant Condition × Region interaction were found because incongruous words elicited more positive potentials (i.e., LPC) than did congruous words at posterior sites [Pz: t(52) = 2.834, p = .007; posterior region: F(1, 52) = 8.982, p = .004]. The LPC was present from around 800 msec to at least 1500 msec (Figure 6B). In High, ROI analyses (electrodes: F7/8, T7/8) indicated the presence of early priming for congruous FL words. In the 100–200 msec TW, congruous FL words elicited more negative ERPs than did incongruous FL words in Year 3 [F(1, 52) = 4.790, p = .033] (Figure 3G) as well as in Year 2 [F(1, 52) = 5.617, p = .022].

Little Progress Group

The ERP responses to FL words in the Little Progress group (children who showed little progress in English proficiency from Years 1 to 3) (Figure 3H and I, Figure 8) showed no statistically significant interactions involving both condition and year [MANOVA design for lateral electrodes: TW × Condition × Hemisphere × Region × Year; midline electrodes: TW × Condition × Electrode × Year, all ps > .15]. In contrast, each of the three other groups of children (Low, Medium, High) showed at least one significant interaction involving both condition and year (1 vs. 3) when analyzed the same way [Low: TW × Condition × Hemisphere × Year, F(3, 96) = 2.804, p = .044; Medium: Condition × Region × Year, F(3, 97) = 3.705, p = .014; High: TW × Condition × Region × Year, F(9, 90) = 3.243, p = .002]. These data, taken together, indicate that ERP changes can be detected in children who made large behavioral progress in FL proficiency, but not in children who did not show such progress. The possibility that significant longitudinal ERP changes in the FL can be detected independent of behavioral progress in FL proficiency is not tenable.

Figure 8. Grand-average waveforms of the Little Progress group in the FL (English). ERP responses to congruous (solid lines) and incongruous words (dotted lines) are plotted separately for Year 1 and Year 3. The vertical lines indicate the onset of the word. Negative is plotted upward.

Data in Year 2

Additional analyses of the Year 2 data (Figure 9) lend further support to the dependency between ERP changes and behavioral progress. Children in Low and Medium made some progress in English proficiency from Year 1 to Year 2 (see Table 2), but this progress (about 10 points) was only about half of that between Year 1 and Year 3 (about 20 points). Accordingly, MANOVAs involving year (1 vs. 2) as a factor revealed no statistically significant interactions involving both year and condition in either Low or Medium (all ps > .18). Thus, in the absence of large progress in FL proficiency, statistically significant ERP changes cannot be detected between years, even in groups of about 50 participants. In High, who had extremely long hours of exposure to English (as of Year 1, 1740 hours; between Year 1 and 2, about 500 hours; see Table 2), a significant TW × Condition × Region × Year interaction was found [F(9, 90) = 2.645, p = .009].

Figure 9. Grand-average waveforms in Year 2. ERP responses to congruous (solid lines) and incongruous FL words (dotted lines) are shown. The vertical lines indicate the onset of the word. Negative is plotted upward.

DISCUSSION

The overall results are consistent with the general hypothesis that child FL learning follows L1 developmental stages (Dulay et al., 1982); the four stages noted in L1 developmental research appeared in our FL data, critically in an identical order. In Year 1, the ERPs of Low (low-proficiency group), Medium (medium-proficiency group), and High (high-proficiency group) contained no negativity, a long-lasting broad negativity, and a posterior N400 followed by an anterior late negativity, respectively. These ERP responses in the FL are compatible with Stages 1, 2, and 3 in L1 acquisition. These cross-sectional data appear to support the hypothesis that child FL learning proceeds from Stage 1 to 2 to 3, but Low, Medium, and High differed not only in English proficiency but also in AOFE. Here, the longitudinal data within each group become essential because intraindividual changes over time cannot be induced by AOFE, which stays constant over time. In Year 3, the ERPs of Low contained a long-lasting broad negativity, which is compatible with Stage 2 in L1 acquisition. The posterior N400 and the anterior late negativity found in Medium are compatible with Stage 3. The appearance of an LPC in High is compatible with Stage 4. Thus, in terms of neural responses, Low progressed from Stage 1 to 2, Medium from Stage 2 to 3, and High from Stage 3 to 4. Taken together, these data suggest that child FL learning proceeds from Stage 1 to 2 to 3 to 4, just like in L1 acquisition. The appearance of identical developmental stages in child FL learning and L1 acquisition highlights the role of learner-internal factors because large environmental differences exist between FL learning and L1 acquisition. It may be the nature of the child's brain itself that determines the normal course of child FL learning.

The functional significance of the specific ERP responses observed at different stages (1–4) in the FL can be interpreted using information contained in the ERPs themselves and previous neuroimaging literature. In the transition from Stage 1 to 2, a broad negativity appeared, which lacked the posterior dominance of, and peaked later than, the typical N400 component. In L1 acquisition research, similar responses have been reported in young children (Silva-Pereyra et al., 2005; Friedrich & Friederici, 2004; Hahne et al., 2004). The presence of the broad negativity in our ERP data ensures that the children who produced it had known the meanings of at least some of the FL words used as stimuli. Its late peak indicates that lexical processing (Deacon, Hewitt, Yang, & Nagata, 2000) or postlexical integration (Brown & Hagoort, 1993) is still slow and/or that the number of words comprehended by the children may still be small (Friedrich & Friederici, 2004). The broad distribution of the negativity implies either that additional cognitive processing specific to the experimental paradigm was involved (Friedrich & Friederici, 2004) or that distributed (rather than focal) cerebral activations occurred which are characteristic of early stages of learning in general (Olesen, Westerberg, & Klingberg, 2004). In the transition from Stage 2 to 3, a posterior N400 appeared in the 300–500 msec TW. The earlier peak of this response than the broad negativity is indicative of acceleration of lexical processing and/or semantic integration, and the posterior dominance of this negativity indicates that its neural generators may be similar to those engaged in L1 processing in teenagers and adults (Hahne et al., 2004; Byrne et al., 1999; Holcomb et al., 1992). In the transition to Stage 4, an LPC was added to the posterior N400. Because this positivity is the last to appear in L1 acquisition (Juottonen et al., 1996), its appearance in the FL indicates the attainment of qualitatively fully L1-like patterns of cortical processing of the FL words used in the experiment, with quantitative differences in timing and strength still remaining. The LPC elicited by semantic incongruities probably originates from frontal cortex (Van Petten & Luka, 2006) and is used to study various groups of normal and clinical populations (Iakimova et al., 2009; Daltrozzo, Wioland, & Kotchoubey, 2007; Revonsuo, Portin, Juottonen, & Rinne, 1998), but the exact cognitive processes it indexes are still unknown (Kuperberg, 2007; Van Petten & Luka, 2006).

Another ambiguity remains in the interpretation of the results. We did not obtain direct behavioral information as to whether the children knew the meanings of the stimulus words. Without such information, it is possible that the different patterns of ERP responses observed above are not due to the different levels of proficiency, but due to the different numbers of comprehended words. This issue should be clarified empirically by future research. Several L1 acquisition studies analyzed only the trials on which the participant had responded correctly and still report that ERPs varied as a function of age (Hahne et al., 2004; Holcomb et al., 1992). In line with these studies, we predict that ERPs will vary as a function of FL proficiency even if the analyses are restricted to the comprehended words only.

For educational purposes, we should address how factors such as AOFE, length of exposure, and type of exposure influence the neural process of child FL learning. However, investigations of the factors influencing this process are possible only after the process itself has been characterized properly, which is the main theme of the present article. We have already obtained detailed information as to the children's exposure to the FL via extensive questionnaires and follow-up phone interviews answered by the parents and teachers. The factors that accelerate the neural process of child FL learning should be clear soon.

ERPs have poor spatial resolution, and other methods are needed to localize the cortical regions involved in the processing of FL words in children and their changes as a function of proficiency. fMRI has excellent spatial resolution but is vulnerable to motion artifacts and produces loud noise during scanning. Pediatric fMRI studies on auditory language processing already exist (Brauer & Friederici, 2007), but for a large-scale study of child FL learning involving many children at various levels of proficiency, near-infrared spectroscopy (NIRS) may be a better candidate. NIRS is far less costly than fMRI, tolerates participant motion to a greater degree, and does not produce scanning-related noise that may hinder auditory experiments, although its spatial resolution is poorer than that of fMRI. The participants in our project underwent an NIRS experiment as well, and their NIRS data are currently being analyzed.

Whether our conclusion based on single-word processing can be extended to higher-level combinatory processing in sentences and discourse should be tested by future research. Because combinatory processing presupposes knowledge of words combined, only children who have substantial FL vocabulary can serve as participants. Contrary to common beliefs, behavioral research has consistently shown that younger children are slower at learning syntax as well as vocabulary than older children and adults, particularly at early stages of learning or when exposure is limited (Muñoz, 2006; Dulay et al., 1982). This tendency will make it even harder to run a syntax experiment on child FL learners. Moreover, due to large cross-linguistic differences, implementation of identical syntax experiments in both L1 and FL may prove difficult for some pairs of languages (such as Japanese and English). Interpretations of the results should be guided by published studies of L1 acquisition, but existing ERP data on children's syntactic processing in the L1 are based exclusively on German- or English-speaking populations (Silva-Pereyra et al., 2005; Hahne et al., 2004). Clearly, more research is necessary here.

Differences in a stimulus property (i.e., speaking speed) may have led to different ERP patterns in the L1 between previous studies which used an experimental paradigm identical to ours (Friedrich & Friederici, 2004, 2005) and our study. The early priming effect for congruous words reported previously was at best weak even in the L1 in our study (and was not reported by Byrne et al., 1999, or Connolly et al., 1995), which prevents us from drawing a firm conclusion about this response in child FL learning. On the other hand, the LPC consistently found in our study is not reported in the previous studies (Friedrich & Friederici, 2004, 2005). We speculate that child-directed, slowly spoken speech used in these previous studies may leave enough time for the early priming effect to appear clearly before the N400, while speech spoken at a normal speed may enable precise enough time-locking for a late-latency, post-N400 component (i.e., LPC) to appear in averaged data, as in our study. One future task may be to find optimal speaking speeds in stimulus recording which enable the elicitation of both an early priming effect and an LPC.

In conclusion, our ERP data collected longitudinally from more than 300 children over a period of 3 years vividly show neural correlates of FL learning in childhood, which remained largely unknown until now. As FL proficiency increased, distinct stages characterized by specific ERP components appeared. Both the stages themselves and the order of their appearance were identical to those in L1 acquisition. Thus, our ERP data are consistent with the hypothesis that child FL learning reproduces developmental stages in L1 acquisition, suggesting that the nature of the child's brain itself may determine the normal course of FL learning.

APPENDIX. WORDS USED AS EXPERIMENTAL STIMULI

ERPs were elicited by 80 English (Foreign Language) words and 80 Japanese (Mother Tongue) words with corresponding meanings. The Japanese words in the list are written in Romaji script.


English Japanese
1. baby akachan 
2. back senaka 
3. bag kaban 
4. bathroom furoba 
5. beach kaigan 
6. bear kuma 
7. bicycle jitensha 
8. bird tori 
9. blackboard kokuban 
10. book hon 
11. bottle bin 
12. box hako 
13. boy otokonoko 
14. bridge hashi 
15. car kuruma 
16. carrot ninjin 
17. cat neko 
18. chair isu 
19. children kodomo 
20. classroom kyoushitsu 
21. clock tokei 
22. cow ushi 
23. desk tsukue 
24. dish sara 
25. doctor isha 
26. dog inu 
27. doll ningyou 
28. face kao 
29. feet ashi 
30. finger yubi 
31. fire hi 
32. fish sakana 
33. flag hata 
34. flower hana 
35. forest mori 
36. fruits kudamono 
37. gate mon 
38. gift okurimono 
39. girl onnanoko 
40. gloves tebukuro 
41. grapes budou 
42. hand te 
43. hat boushi 
44. head atama 
45. hill oka 
46. horse uma 
47. hospital byouin 
48. house ie 
49. key kagi 
50. king ousama 
51. kitchen daidokoro 
52. lake mizuumi 
53. leaf happa 
54. letter tegami 
55. lips kuchibiru 
56. map chizu 
57. milk gyuunyuu 
58. mirror kagami 
59. monkey saru 
60. mountain yama 
61. mouth kuchi 
62. newspaper shinbun 
63. nurse kangoshi 
64. park kouen 
65. peach momo 
66. pencil enpitsu 
67. picture shashin 
68. plane hikouki 
69. policeman keisatsukan 
70. postcard hagaki 
71. potato jagaimo 
72. queen jouou 
73. ring yubiwa 
74. rose bara 
75. sheep hitsuji 
76. shoes kutsu 
77. teeth ha 
78. train densha 
79. watch udedokei 
80. window mado 

Acknowledgments

We thank all the children and their families for taking part; the Society for Testing English Proficiency (STEP) for providing us with English proficiency tests; Ryuichiro Hashimoto, Fumitaka Homae, Ayumi Koso, Norihiro Sadato, Lisa Sugiura, and Satoshi Tanaka for comments; Kenji Itoh for assisting with audio recording; and Hideaki Koizumi for continuous support. This work is supported by Grant-in-aid for the promotion of “Brain Science and Education, Type II” from the Research Institute of Science and Technology for Society, Japan Science and Technology Agency (RISTEX/JST) to H. H.

Reprint requests should be sent to Hiroko Hagiwara, Department of Language Sciences, Graduate School of Humanities, Tokyo Metropolitan University, 1-1 Minami-Osawa, Hachioji, Tokyo, 192-0397, Japan, or via e-mail: hagiwara@tmu.ac.jp.

Notes

1. The meaning comprehension test of spoken English administered each year was provided by courtesy of The Society for Testing English Proficiency (Japan's largest testing body, hereafter STEP). STEP compiled child-friendly multiple-choice tests which can measure diverse levels of English proficiency of children younger than 12 years, although the public versions of their tests target particular levels. STEP provided us with a different version of the test each year, strictly controlling the difficulties by Item Response Theory.

2. In addition to the English test tailored for Japanese children younger than 12 years (see the above note), we also administered a higher-level English test which is intended to measure the listening comprehension ability of junior and senior high school Japanese students. We are currently integrating the scores obtained from the two tests; the integrated scores will reveal the English proficiency of High more accurately.

3. The print frequencies of the stimulus words were studied using the most standard corpuses of English (Kučera & Francis, 1967) and Japanese (Amano & Kondo, 2000), which nonetheless differed greatly in the sources of texts; the English corpus used various sources (e.g., press, religion, hobbies, science), whereas the Japanese corpus used only newspapers. Hence, we first determined the arithmetic relationship between the two corpuses, using several hundred pairs of basic English words and their Japanese counterparts. This revealed a 3.85 times higher frequency (per million) in the English than in the Japanese corpus for the pairs of basic words used. Hence, we selected 80 pairs of English and Japanese words as stimuli which did not statistically differ when the frequencies of the Japanese words were multiplied by 3.85.
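The frequency adjustment described in this note amounts to multiplying each Japanese per-million frequency by 3.85 before comparing it with its English counterpart. The Python sketch below illustrates only that arithmetic: the function names and all frequency values are hypothetical, and the statistical test used to compare the two sets of frequencies is not specified in the note and is not shown.

```python
from statistics import mean

SCALE = 3.85  # English-to-Japanese frequency ratio reported in the note


def scale_japanese(freqs_per_million, scale=SCALE):
    """Rescale Japanese corpus frequencies onto the English corpus scale."""
    return [f * scale for f in freqs_per_million]


def mean_difference(english, japanese_scaled):
    """Mean pairwise difference (English minus scaled Japanese)."""
    if len(english) != len(japanese_scaled):
        raise ValueError("frequencies must come in English-Japanese pairs")
    return mean(e - j for e, j in zip(english, japanese_scaled))


# Hypothetical per-million frequencies for three word pairs (not corpus data)
english_freq = [77.0, 38.5, 154.0]
japanese_freq = [20.0, 10.0, 40.0]

scaled = scale_japanese(japanese_freq)
print(scaled, mean_difference(english_freq, scaled))
```

In this toy case, the scaled Japanese frequencies coincide with the English ones, so the mean difference is essentially zero; real word pairs would scatter around zero if the matching succeeded.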

REFERENCES

Amano, N., & Kondo, M. (2000). NTT database series: Nihongo-no goitokusei [Lexical properties of Japanese]. Tokyo: Sanseido.
Brauer, J., & Friederici, A. D. (2007). Functional neural networks of semantic and syntactic processes in the developing brain. Journal of Cognitive Neuroscience, 19, 1609–1623.
Brown, C., & Hagoort, P. (1993). The processing nature of the N400: Evidence from masked priming. Journal of Cognitive Neuroscience, 5, 34–44.
Byrne, J. M., Connolly, J. F., MacLean, S. E., Dooley, J. M., Gordon, K. E., & Beattie, T. L. (1999). Brain activity and language assessment using event-related potentials: Development of a clinical protocol. Developmental Medicine and Child Neurology, 41, 740–747.
Connolly, J. F., Byrne, J. M., & Dywan, C. A. (1995). Assessing adult receptive vocabulary with event-related potentials: An investigation of cross-modal and cross-form priming. Journal of Clinical and Experimental Neuropsychology, 17, 548–565.
Crain, S., & Lillo-Martin, D. (1999). An introduction to linguistic theory and language acquisition. Malden, MA: Blackwell Publishers.
Daltrozzo, J., Wioland, N., & Kotchoubey, B. (2007). Sex differences in two event-related potentials components related to semantic priming. Archives of Sexual Behavior, 36, 555–568.
Deacon, D., Hewitt, S., Yang, C., & Nagata, M. (2000). Event-related potential indices of semantic priming using masked and unmasked words: Evidence that the N400 does not reflect a post-lexical process. Brain Research, Cognitive Brain Research, 9, 137–146.
Dulay, H., Burt, M., & Krashen, S. (1982). Language two. Oxford: Oxford University Press.
Ellis, R. (1994). The study of second language acquisition. Oxford: Oxford University Press.
Friedrich, M., & Friederici, A. D. (2004). N400-like semantic incongruity effect in 19-month-olds: Processing known words in picture contexts. Journal of Cognitive Neuroscience, 16, 1465–1477.
Friedrich
,
M.
, &
Friederici
,
A. D.
(
2005
).
Phonotactic knowledge and lexical–semantic processing in one-year-olds: Brain responses to words and nonsense words in picture contexts.
Journal of Cognitive Neuroscience
,
17
,
1785
1802
.
Giedd
,
J. N.
,
Blumenthal
,
J.
,
Jeffries
,
N. O.
,
Castellanos
,
F. X.
,
Liu
,
H.
,
Zijdenbos
,
A.
,
et al
(
1999
).
Brain development during childhood and adolescence: A longitudinal MRI study.
Nature Neuroscience
,
2
,
861
863
.
Hahne
,
A.
,
Eckstein
,
K.
, &
Friederici
,
A. D.
(
2004
).
Brain signatures of syntactic and semantic processes during children's language development.
Journal of Cognitive Neuroscience
,
16
,
1302
1318
.
Holcomb
,
P. J.
,
Coffey
,
S. A.
, &
Neville
,
H. J.
(
1992
).
Visual and auditory sentence processing: A developmental analysis using event-related brain potentials.
Developmental Neuropsychology
,
8
,
203
241
.
Iakimova
,
G.
,
Passerieux
,
C.
,
Foynard
,
M.
,
Fiori
,
N.
,
Besche
,
C.
,
Laurent
,
J. P.
,
et al
(
2009
).
Behavioral measures and event-related potentials reveal different aspects of sentence processing and comprehension in patients with major depression.
Journal of Affective Disorders
,
113
,
188
194
.
Johnson
,
J. S.
, &
Newport
,
E. L.
(
1989
).
Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language.
Cognitive Psychology
,
21
,
60
99
.
Juottonen
,
K.
,
Revonsuo
,
A.
, &
Lang
,
H.
(
1996
).
Dissimilar age influences on two ERP waveforms (LPC and N400) reflecting semantic context effect.
Brain Research, Cognitive Brain Research
,
4
,
99
107
.
Kim
,
K. H.
,
Relkin
,
N. R.
,
Lee
,
K. M.
, &
Hirsch
,
J.
(
1997
).
Distinct cortical areas associated with native and second languages.
Nature
,
388
,
171
174
.
Kučera
,
H.
, &
Francis
,
W. N.
(
1967
).
Computational analysis of present-day American English.
Providence, RI
:
Brown University Press
.
Kuperberg
,
G. R.
(
2007
).
Neural mechanisms of language comprehension: Challenges to syntax.
Brain Research
,
1146
,
23
49
.
Kutas
,
M.
, &
Hillyard
,
S. A.
(
1980
).
Reading senseless sentences: Brain potentials reflect semantic incongruity.
Science
,
207
,
203
205
.
McCallum
,
W. C.
,
Farmer
,
S. F.
, &
Pocock
,
P. V.
(
1984
).
The effects of physical and semantic incongruities on auditory event-related potentials.
Electroencephalography and Clinical Neurophysiology
,
59
,
477
488
.
McLaughlin
,
J.
,
Osterhout
,
L.
, &
Kim
,
A.
(
2004
).
Neural correlates of second-language word learning: Minimal instruction produces rapid change.
Nature Neuroscience
,
7
,
703
704
.
Muñoz
,
C.
(Ed.) (
2006
).
Age and the rate of foreign language learning.
Clevedon, UK
:
Multilingual Matters
.
Oldfield
,
R. C.
(
1971
).
The assessment and analysis of handedness: The Edinburgh inventory.
Neuropsychologia
,
9
,
97
113
.
Olesen
,
P. J.
,
Westerberg
,
H.
, &
Klingberg
,
T.
(
2004
).
Increased prefrontal and parietal activity after training of working memory.
Nature Neuroscience
,
7
,
75
79
.
Osterhout
,
L.
,
McLaughlin
,
J.
,
Pitkänen
,
I.
,
Frenck-Mestre
,
C.
, &
Molinaro
,
N.
(
2006
).
Novice learners, longitudinal designs, and event-related potentials: A means for exploring the neurocognition of second language processing.
Language Learning
,
56(Suppl. 1)
,
199
230
.
Osterhout
,
L.
,
Poliakov
,
A.
,
Inoue
,
K.
,
McLaughlin
,
J.
,
Valentine
,
G.
,
Pitkanen
,
I.
,
et al
(
2008
).
Second-language learning and changes in the brain.
Journal of Neurolinguistics
,
21
,
509
521
.
Revonsuo
,
A.
,
Portin
,
R.
,
Juottonen
,
K.
, &
Rinne
,
J. O.
(
1998
).
Semantic processing of spoken words in Alzheimer's disease: An electrophysiological study.
Journal of Cognitive Neuroscience
,
10
,
408
420
.
Rugg
,
M. D.
, &
Coles
,
M. G. H.
(Eds.) (
1995
).
Electrophysiology of mind: Event-related brain potentials and cognition.
Oxford, UK
:
Oxford University Press
.
Semlitsch
,
H. V.
,
Anderer
,
P.
,
Schuster
,
P.
, &
Presslich
,
O.
(
1986
).
A solution for reliable and valid reduction of ocular artifacts, applied to the P300 ERP.
Psychophysiology
,
23
,
695
703
.
Sharbrough
,
F.
,
Chatrian
,
G.-E.
,
Lesser
,
R. P.
,
Lüders
,
H.
,
Nuwer
,
M.
, &
Picton
,
T. W.
(
1991
).
American Electroencephalographic Society guidelines for standard electrode position nomenclature.
Journal of Clinical Neurophysiology
,
8
,
200
202
.
Shaw
,
P.
,
Greenstein
,
D.
,
Lerch
,
J.
,
Clasen
,
L.
,
Lenroot
,
R.
,
Gogtay
,
N.
,
et al
(
2006
).
Intellectual ability and cortical development in children and adolescents.
Nature
,
440
,
676
679
.
Silva-Pereyra
,
J.
,
Rivera-Gaxiola
,
M.
, &
Kuhl
,
P. K.
(
2005
).
An event-related brain potential study of sentence comprehension in preschoolers: Semantic and morphosyntactic processing.
Brain Research, Cognitive Brain Research
,
23
,
247
258
.
Van Petten
,
C.
, &
Luka
,
B. J.
(
2006
).
Neural localization of semantic context effects in electromagnetic and hemodynamic studies.
Brain and Language
,
97
,
279
293
.
Wartenburger
,
I.
,
Heekeren
,
H. R.
,
Abutalebi
,
J.
,
Cappa
,
S. F.
,
Villringer
,
A.
, &
Perani
,
D.
(
2003
).
Early setting of grammatical processing in the bilingual brain.
Neuron
,
37
,
159
170
.
Weber-Fox
,
C. M.
, &
Neville
,
H. J.
(
1996
).
Maturational constraints on functional specializations for language processing: ERP and behavioral evidence in bilingual speakers.
Journal of Cognitive Neuroscience
,
8
,
231
256
.