Infants' speech perception abilities change through the first year of life, from broad sensitivity to a wide range of speech contrasts to becoming more finely attuned to their native language. What remains unclear, however, is how this perceptual change relates to brain responses to native language contrasts in terms of the functional specialization of the left and right hemispheres. Here, to elucidate the developmental changes in functional lateralization accompanying this perceptual change, we conducted two experiments on Japanese infants using Japanese lexical pitch–accent, in which the pitch pattern within a word distinguishes word meanings. In the first behavioral experiment, using visual habituation, we confirmed that infants at both 4 and 10 months are sensitive to the lexical pitch–accent pattern change embedded in disyllabic words. In the second experiment, near-infrared spectroscopy was used to measure cortical hemodynamic responses in the left and right hemispheres to the same lexical pitch–accent pattern changes and their pure tone counterparts. We found that brain responses to the pitch change within words differed between 4- and 10-month-old infants in terms of functional lateralization: Left hemisphere dominance for the perception of the pitch change embedded in words was seen only in the 10-month-olds. These results suggest that the perceptual change in Japanese lexical pitch–accent may be related to a shift in functional lateralization from bilateral to left hemisphere dominance.
The ability to perceive speech sounds goes through significant changes during the first year of life, from broad sensitivity to becoming specifically attuned to the native language (for a review, see Saffran, Werker, & Werner, 2006; Werker & Yeung, 2005; Kuhl, 2004). Infants seem to begin life with the ability to discriminate most (but not all) segmental contrasts (Kuhl, 2004; Polka & Werker, 1994; Werker, Gilbert, Humphrey, & Tees, 1981; Eimas, Siqueland, Jusczyk, & Vigorito, 1971), but their sensitivity to nonnative contrasts starts declining at about 6 months of age for vowels and at roughly 10 months for consonants (Kuhl, 2004; Best & McRoberts, 2003; Polka & Werker, 1994; Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992; Werker & Tees, 1984). At the same time, sensitivity to native contrasts becomes enhanced (Kuhl et al., 2006; Polka, Colantonio, & Sundara, 2001). It has been argued that the observed changes in infants' behaviors reflect the way infants process speech stimuli. That is, whereas infants initially process speech stimuli via general auditory processing, they begin to process speech in their own language in a distinct, language-specific way by the second half of the first year (Werker & Tees, 1992, 2005; Kuhl, 2004). We will call this process “reorganization” (Werker & Tees, 1992). The proposal, thus far, has been built on the basis of cumulative cross-linguistic studies showing that infants begin to lose the ability to discriminate foreign contrasts during the second half of the first year. The implication drawn from behavioral data alone is indirect, however, because the loss of discrimination for foreign contrasts does not a priori mean that infants are processing native contrasts as “linguistically relevant.” Data from infants' brain activation patterns may help shed light on this process.
It is thought that reorganization is likely to be associated with underlying neural development for processing speech (Werker & Tees, 2005; Kuhl, 2004). According to ERP studies, younger infants show similar MMN patterns in response to both native and nonnative phonemic contrasts, whereas older infants show either smaller or no MMN responses to nonnative contrasts (Rivera-Gaxiola, Silva-Pereyra, & Kuhl, 2005; Cheour et al., 1998). Although the MMN results provide evidence that younger and older infants are processing the native and nonnative contrasts differently, they do not necessarily indicate that the older infants are distinguishing the native contrast as “linguistically relevant.”
Evidence for this is likely to come from imaging studies that test functional lateralization of speech processing. In adults, it is well known that the left and right cerebral hemispheres work differently for speech processing: The left hemisphere is more heavily involved in processing segmental contrasts in one's native language, and the right hemisphere typically processes prosodic cues including affective prosody. Bilateral activation is seen in the processing of nonspeech or nonnative contrasts (Schirmer & Kotz, 2006; Jacquemot, Pallier, LeBihan, Dehaene, & Dupoux, 2003; Vouloumanos, Kiehl, Werker, & Liddle, 2001; Buchanan et al., 2000; Tervaniemi et al., 1999; Näätänen et al., 1997; Zatorre, Evans, Meyer, & Gjedde, 1992; Ross, 1981; van Lancker, 1980). If, as the reorganization hypothesis predicts, infants begin to process linguistically relevant speech stimuli differently from other auditory stimuli after they go through the reorganization, we may observe a shift in hemispheric dominance between the younger and the older infants as they learn that a particular contrast is linguistically relevant in their language during the first year of life.
To date, however, no imaging data are available to indicate that the reorganization during infancy is associated with a shift in hemispheric dominance for processing a native phonemic contrast. This is not to say that the two hemispheres of young infants respond symmetrically to auditory or speech stimuli. A number of studies have reported differential involvement of the two hemispheres for speech stimuli from early in infancy. In neonates and 3-month-olds, stronger activation in the left hemisphere has been reported for regular speech than for backward speech or silence (Peña et al., 2003; Dehaene-Lambertz, Dehaene, & Hertz-Pannier, 2002), and stronger right-side activation has been reported in 3-month-olds when regular speech was compared with speech with flattened intonation (Homae, Watanabe, Nakano, Asakawa, & Taga, 2006). Some ERP studies have also shown an early (2 to 4 months) left dominance for some segmental processing, such as /ba/ versus /ga/ (Dehaene-Lambertz, 2000; Dehaene-Lambertz & Baillet, 1998; Dehaene-Lambertz & Dehaene, 1994).
These findings demonstrate that the hemispheric asymmetry for processing speech stimuli is already observable in early infancy. At present, however, the differences and similarities between this early asymmetry and full-fledged functional lateralization of language processing in adults are not yet well understood. As a working hypothesis for this article, we assumed that the behavioral difference accompanying reorganization in infants may be reflected in the functional lateralization for processing speech stimuli. Therefore, in discriminating linguistically relevant (native phonemic) and nonrelevant (nonnative) contrasts, younger infants should process both in the same way (i.e., no left hemisphere dominance expected), whereas older infants should show a left-side dominance only for linguistically relevant contrasts.
A recent near-infrared spectroscopy (NIRS) study suggests that this may in fact be the case: Stronger left-side activation for a Japanese vowel duration contrast was seen after 13 months of age, whereas bilateral activation was found in younger infants (Minagawa-Kawai, Mori, Naoi, & Kojima, 2007). The lack of behavioral data, however, makes the results of this study difficult to interpret. As discussed above, 13 months is significantly later than the age typically reported for the reorganization of vowel or consonant perception, but no behavioral data are available to determine when Japanese infants become capable of discriminating the vowel duration contrast. Consequently, we cannot determine whether the emergence of the left-side advantage corresponds to the time when vowel duration contrasts become linguistically relevant for Japanese infants.
The goal of the present study was to test the prediction that infants' processing of speech stimuli accompanying reorganization is associated with changes in functional lateralization. We used a lexical pitch–accent contrast for the stimuli. In lexical-level prosody, such as lexical pitch–accent in Japanese and tones in Chinese and Thai, prosodic acoustic cues (e.g., pitch changes) are used to distinguish lexical meaning. In Japanese, a pair of disyllabic homophones is distinguished by a pitch–accent that follows either a high-to-low (HL) or a low-to-high (LH) pitch pattern, such as ha'shi (HL: “chopstick”) versus hashi' (LH: “bridge”). Brain activation for these stimuli seems to be functionally determined; that is, when the pitch cues for the lexical prosody are processed as linguistically relevant, left-lateralized activations are found, whereas bilateral activation or activation without left dominance is seen when the same cue is processed nonlinguistically (Sato, Sogabe, & Mazuka, 2007; Wang, Sereno, Jongman, & Hirsch, 2003; Klein, Zatorre, Milner, & Zhao, 2001; Gandour et al., 2000; Gandour, Wong, & Hutchins, 1998). The use of lexical pitch–accent stimuli allowed us to test linguistic and nonlinguistic contrasts within one language by presenting the same pitch cue in word pairs and pure tones (PTs).
On the basis of a previous study showing that French neonates can discriminate the Japanese lexical pitch–accent difference between HL and LH (Nazzi, Floccia, & Bertoncini, 1998), we predict that, behaviorally, Japanese infants should be able to discriminate HL and LH words from early on. Recent studies showing that English- or French-learning infants become unable to discriminate Thai tones by 9 months of age (Mattock, Molnar, Polka, & Burnham, 2008; Mattock & Burnham, 2006) suggest that linguistic attunement for lexical-level prosody, like segmental discrimination, is likely to occur between 6 and 9 months of age. If so, then although Japanese infants should continue to discriminate the HL and LH pitch–accents behaviorally, how they process the contrast should change from a general, nonlinguistic distinction to a linguistically relevant one. We predict that older Japanese infants should show stronger left hemisphere processing for HL versus LH stimuli when they are presented in word forms and no left-side dominance when the equivalent pitch change is presented in PTs, whereas younger infants are expected to show responses without left-side dominance to both types of stimuli.
In Experiment 1, we studied 4- and 10-month-old Japanese infants in a behavioral discrimination task for lexical pitch–accent changes (HL vs. LH) embedded in Japanese disyllabic words. Note that it was necessary for us to confirm that Japanese infants at both ages are capable of discriminating the contrast behaviorally using the same stimuli as the NIRS testing in Experiment 2. We were interested in examining whether the way that infants process the contrast is correlated with lateralized activity in the brain, not whether they can detect the contrast. To our knowledge, no previous study has demonstrated that Japanese infants are capable of discriminating the lexical pitch–accent contrast as young as 4 months of age.
In Experiment 2, the same age groups were studied to examine developmental changes in the neural correlates of the perception of lexical pitch–accent pattern change. Using NIRS, a previous study revealed that adult Japanese speakers show different lateralization patterns for lexical pitch–accent change and nonlexical pitch change (Sato et al., 2007). NIRS noninvasively measures relative changes in the concentration of hemoglobin (Hb) in localized brain tissue without the loud noises associated with MRI. It places minimal constraints on participants and is therefore well suited for studying infants (Minagawa-Kawai et al., 2007; Homae et al., 2006; Peña et al., 2003).
Full-term healthy 4- and 10-month-old Japanese infants (n = 25 for each age group) participated in this experiment. They were recruited from Japanese-speaking homes in the Tokyo area and did not participate in Experiment 2. Ten infants were excluded from analysis for the following reasons: crying (n = 5), technical problems (n = 2), and experimenter error (n = 3). The final sample consisted of twenty 4-month-olds (10 girls, mean age = 4.1 months, age range = 3.5–4.5 months) and twenty 10-month-olds (8 girls, mean age = 10.1 months, age range = 9.5–10.5 months). Their parents had given written informed consent before the experiment. This study was approved by the ethical committees of RIKEN and Duke University.
The auditory stimuli were 14 existing disyllabic Japanese word pairs that minimally differ in pitch–accent (HL vs. LH; we used a subset of the stimuli used in Nazzi et al., 1998). These stimuli had been used in the previous NIRS study (Sato et al., 2007). Four lists of stimuli were produced by selecting words randomly in each pitch pattern (i.e., HL and LH). Each list contained 14 words with a 1-sec SOA, making each list 14 sec in duration. The word stimuli were recorded by a female Japanese native speaker.
The experiments were carried out in a sound-attenuated room and controlled by software (Habit X; Cohen, Atkinson, & Chaput, 2000) on a Macintosh computer. The infant sat on the parent's lap facing a 19-in. monitor (FlexScan1767, EIZO). A speaker (GX-77M, ONKYO) was located behind the monitor, from which stimuli were presented at approximately 60 dB sound pressure level. An experimenter monitored the infants' visual responses without audio feedback in an observation room through a video camera (VC-C50iR, Canon), and the responses were simultaneously recorded by a DV recorder (DVCAM DSR-11, SONY) for later video coding. The parent listened to music through a pair of headsets to mask the auditory stimuli presented to the infants.
The present experiment used a modified visual habituation paradigm (Stager & Werker, 1997), in which infants are habituated to one kind of stimulus in a habituation phase and are then presented with another kind of stimulus (change trial) and the same kind of stimulus (no-change trial) in a test phase. An attention getter appeared before each trial, and the experimenter pressed a key to start the next trial as soon as the infant looked at it. In each trial of the habituation phase, one of the four lists of either pitch–accent pattern was presented randomly with a visual stimulus (a checkerboard of red and black squares). During the trials, the experimenter held down a key on a computer keyboard for as long as the infant was looking at the monitor, and this time was recorded as “looking time.” The test phase, which contained two trials, began either after a maximum of 28 habituation trials had been completed or once the infant's looking times declined to a preset criterion: the average looking time on the last four habituation trials fell below 65% of that on the first four habituation trials. During the test phase, about half of the infants were presented with a list of words with the other pitch–accent pattern (change trial) followed by a list with the same pitch–accent pattern as the habituation list (no-change trial). The remaining infants were given the “no-change” and “change” trials in that order. Infants' looking times during the test trials were hand coded from the video frame by frame, and looking times during the “change” trials were compared with those during the “no-change” trials. In this paradigm, infants were expected to look longer at the visual stimulus on the change trial than on the no-change trial if they discriminated the change in the pitch contour pattern.
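The habituation criterion described above can be sketched as a simple check over the running record of looking times (a minimal illustration only; the function name and data layout are ours, not part of Habit X):

```python
def habituation_reached(looking_times, ratio=0.65, window=4, max_trials=28):
    """Decide whether the habituation phase should end.

    looking_times: per-trial looking times (sec), in presentation order.
    The phase ends when the mean of the last `window` trials falls below
    `ratio` of the mean of the first `window` trials, or when the maximum
    number of habituation trials has been presented.
    """
    if len(looking_times) >= max_trials:
        return True
    if len(looking_times) < 2 * window:
        return False  # not enough trials yet to apply the criterion
    first = sum(looking_times[:window]) / window
    last = sum(looking_times[-window:]) / window
    return last < ratio * first
```

For example, an infant whose looking times drop from around 10 sec to around 6 sec per trial meets the 65% criterion after eight trials, whereas a drop to only 8 sec does not.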
To assess habituation, we submitted infants' looking times in the first and the last four habituation trials to a 2 × 2 ANOVA, with age group as a between-subject variable (4- vs. 10-month-olds) and habituation as a within-subject variable (first four vs. last four trials). The results revealed only a main effect of habituation, F(1, 38) = 208.07, p < .001, indicating a significant decrease in looking times between the beginning and the end of the habituation phase for both age groups. Neither the main effect of age group, F(1, 38) = 1.97, p > .10, nor the interaction between age group and habituation, F(1, 38) = 0.33, p > .10, was significant, suggesting that the 4- and 10-month-old infants did not differ from each other in their looking times during the habituation phase.
Infants' looking times during the “change” and “no-change” trials were subjected to a two-way ANOVA with age group (4 and 10 months) as a between-subjects factor and trial type (no-change and change) as a within-subject factor. Figure 1 shows the averaged looking times of both age groups for the no-change and change trials. The ANOVA revealed a main effect of trial type, F(1, 38) = 11.14, p < .01, but no interaction between age group and trial type, F(1, 38) = 0.09, p > .10, and no main effect of age group, F(1, 38) = 0.62, p > .10. This finding indicates that infants' looking times during the change trials were significantly longer than those during the no-change trials in both age groups. Thus, this experiment confirmed that both age groups discriminated between the two pitch contours (HL vs. LH) involved in Japanese lexical pitch–accent.
Again, 4-month-old (n = 35) and 10-month-old (n = 56) infants from Japanese-speaking homes in the Tokyo area participated after their parents gave written informed consent. Infants were tested while they were awake (except for one 4-month-old). Forty-three infants were excluded from analysis for the following reasons: crying (n = 10), technical problems (n = 1), large motion artifacts (n = 20), and loose probe placement (n = 12). The final sample consisted of twenty-four 4-month-olds (13 girls, mean age = 4.1 months, age range = 3.5–4.5 months) and twenty-four 10-month-olds (14 girls, mean age = 10.2 months, age range = 9.5–10.5 months). This study was approved by the ethics review committees of RIKEN and Duke University.
Hemodynamic responses to lexical pitch–accent changes in Japanese infants were recorded with a multichannel NIRS system (ETG-4000, Hitachi Medical Co., Japan), which uses near-infrared lasers at two wavelengths (695 and 830 nm). The recording channels resided in the optical path in the brain between the nearest pairs of incident and detection probes, which were separated by 3 cm on the scalp surface (Fukui, Ajichi, & Okada, 2003). Five incident and four detection probes were placed on each lateral side of the head, making a total of 12 recording channels on each side (Peña et al., 2003). The middle probes in the lowest row on each side were located near the T3 and T4 positions (according to the international 10–20 system for EEG recording; Figure 2). To prevent hair from interfering with light emission, we attached the tips of the probes to the scalp after pushing the hair away from under the probes.
The experiments were carried out in a sound-attenuated room. A loudspeaker (Reveal, TANNOY, Scotland, UK) was located 70 cm from the infant's head, from which stimuli were presented at approximately 60 dB sound pressure level. The infant sat on the parent's lap. To reduce head movement, a silent movie, which was not synchronized to the auditory stimuli, was played continuously on a video monitor, or an experimenter entertained the infant with silent toys. The parent and the experimenter listened to music through a pair of headsets.
Each participant was tested in two conditions in a block design. In the word condition, the baseline block (20 or 25 sec) contained a sequence of either HL or LH words repeated approximately every 1.25 sec. During the test block (10 sec), participants were presented with both pitch pattern words. The HL and LH pattern words were presented in a pseudorandom order with equal probability. The baseline and test blocks were presented alternately, for a total of at least five test blocks. The baseline blocks were longer than the test blocks because NIRS measurements need sufficient poststimulus periods until the responses go back to baseline levels. The PT condition was similar to the word condition, except for the presentation of the PT stimuli. In the word condition, the Hb responses to the pitch pattern change (i.e., HL vs. LH) embedded in the words were estimated against the responses to either HL or LH pattern words alone (i.e., no change). In the PT condition, the responses to pitch pattern change within PTs were measured. The accent pattern presented in baseline (i.e., HL or LH) and the order of the two conditions were counterbalanced across subjects.
The Hb responses were sampled every 100 msec and smoothed with a 5-sec moving average. We focused on the oxygenated-Hb (Oxy-Hb) responses, which reflect cerebral blood oxygenation. For each response in the test blocks, we corrected the baseline using a linear fit computed between the mean signal of the 5-sec baseline period just before the onset of the test block and the mean of the 5-sec period between 5 and 10 sec after the end of the test block (Peña et al., 2003). The responses during the test blocks were averaged synchronously after manually excluding blocks with large and rapid motion artifacts in each condition (signal variations approximately >0.1 mmol/mm over two consecutive samples; Peña et al., 2003). Overall, we removed an average of 2.93 data blocks (SD = 1.14) due to artifacts and used 4.34 data blocks (SD = 1.07) for data analyses.
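The preprocessing chain described above can be sketched as follows (a schematic reconstruction with NumPy; the 10-Hz sampling rate follows from the 100-msec sampling interval, but the function names and array layout are ours):

```python
import numpy as np

FS = 10  # samples per second (Oxy-Hb sampled every 100 msec)

def moving_average(x, win_sec=5):
    """Smooth a single-channel signal with a 5-sec moving average."""
    w = win_sec * FS
    return np.convolve(x, np.ones(w) / w, mode="same")

def baseline_correct(x, onset, end):
    """Remove a linear trend anchored at two reference means: the 5 sec
    just before block onset and the 5-sec span 5-10 sec after the block
    ends (cf. Pena et al., 2003). `onset` and `end` are sample indices."""
    pre = np.arange(onset - 5 * FS, onset)
    post = np.arange(end + 5 * FS, end + 10 * FS)
    slope = (x[post].mean() - x[pre].mean()) / (post.mean() - pre.mean())
    trend = x[pre].mean() + slope * (np.arange(len(x)) - pre.mean())
    return x - trend

def has_motion_artifact(x, threshold=0.1):
    """Flag a block whose signal jumps by more than the ~0.1 threshold
    given in the text between consecutive samples."""
    return bool(np.any(np.abs(np.diff(x)) > threshold))
```

Blocks flagged by `has_motion_artifact` would be dropped before the remaining test blocks are averaged synchronously.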
According to the three-dimensional probabilistic anatomical craniocerebral correlation (Okamoto et al., 2004), T3 and T4 project onto the middle temporal gyrus in adults. Therefore, the left channels 4, 6, and 7 and their counterparts on the right (Figure 2) are presumed to approximately cover the superior temporal cortex, which we refer to as the ROI of the auditory regions. When the time courses of Oxy-Hb were examined for each channel (Figure 3), some peak responses appeared around the end of the test blocks; we therefore included an additional 5 sec after the end of the test blocks in the data analyses (Peña et al., 2003). For each infant, the mean change in concentration of Oxy-Hb over the 15 sec after the onset of the test block was calculated for each condition and for each ROI channel.
The values of Oxy-Hb changes obtained with NIRS lack a reference optical path length. Consequently, comparing or integrating data between different channels or across different subjects is difficult to validate. However, a recent study demonstrated that the optical path lengths are similar among nearby channels and between homologous regions of the left and right hemispheres within a subject (Katagiri et al., 2010). On the basis of these findings, we averaged the data among ROI channels in each hemisphere of individual participants to carry out the statistical analyses. The values of Oxy-Hb changes in the left and right ROIs were subjected to a two-way ANOVA, with condition (word and PT) and side (left and right) as within-subject factors in each group. It should be noted that the findings of Katagiri et al. do not extend to comparisons across different subjects; thus, data were submitted to within-subject ANOVAs for each age group. Note that analyses averaging across infants were conducted in some previous studies (Nakano, Watanabe, Homae, & Taga, 2009; Blasi et al., 2007; Peña et al., 2003).
In addition, paired t tests were conducted to compare the Oxy-Hb changes during the test blocks and the baseline blocks in each condition, side, and age group. This was done to determine whether the HL versus LH pitch changes in the test blocks elicited significantly larger Oxy-Hb changes than the baseline blocks.
We found that 4- and 10-month-olds exhibited hemodynamic responses to the pitch pattern changes in the test blocks under each condition (Figure 3). Whereas 4-month-olds showed similar Oxy-Hb responses under both conditions, the responses of 10-month-olds seemed to differ between the conditions.
Figure 4 shows the mean time course of Oxy-Hb concentration changes of both age groups for the left and right ROI channels in the two conditions. The y-axes are the grand averages of Oxy-Hb responses from all participants in each condition. We conducted paired t tests between the values of the 5-sec baseline period just before the test block and the values of the Oxy-Hb signals over the 15 sec after the onset of the test block for each condition and each side in both age groups (false discovery rate correction at q < 0.05). Significantly larger responses were found during the test blocks (including the 5-sec periods after the end of the test blocks) than during the baseline blocks (the 5-sec periods just before the test blocks) in every comparison except the left side under the PT condition in the 10-month-olds: 4 months (df = 23): PT, left, t = 3.62, p = .001 (<.03125); right, t = 3.96, p = .001 (<.025); word, left, t = 4.38, p < .001 (<.01875); right, t = 4.59, p < .001 (<.0125); 10 months (df = 23): PT, left, t = 1.79, p = .086 (>.05); right, t = 2.43, p = .023 (<.0375); word, left, t = 6.48, p < .001 (<.00625); right, t = 2.31, p = .030 (<.04375).
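The per-comparison cutoffs reported in parentheses are the rank-ordered thresholds of the Benjamini-Hochberg step-up procedure for controlling the false discovery rate, which can be sketched as follows (the function name is ours; the assumption that this is the FDR procedure used is based on the cutoffs matching i × q / m for m = 8 tests):

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up FDR procedure.

    The i-th smallest of the m p values is compared against i * q / m;
    all tests up to the largest rank that passes its cutoff are declared
    significant. Returns the set of significant p values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            k_max = rank  # largest passing rank so far
    return {p_values[i] for i in order[:k_max]}
```

With eight tests at q = .05, the cutoffs i × .05 / 8 are .00625, .0125, .01875, .025, .03125, .0375, .04375, and .05, so a comparison with p = .086 fails while the remaining seven pass.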
The lateralization pattern under the word condition differed between the two age groups. Figure 5 shows the averaged values of the Oxy-Hb responses of both age groups on the left and right sides under the two conditions. A two-way ANOVA, with condition (word and PT) and side (left and right) as within-subject factors in each group, showed that the 4-month-old infants exhibited no significant interaction, F(1, 23) = 0.55, p > .10, and no main effects of condition, F(1, 23) = 0.37, p > .10, or side, F(1, 23) = 0.00, p > .10. The 10-month-olds, in contrast, exhibited a significant interaction, F(1, 23) = 8.06, p < .01, but no main effects of side, F(1, 23) = 0.57, p > .10, or condition, F(1, 23) = 1.23, p > .10, suggesting that the left and right activations differed between the conditions in the 10-month-olds. The interaction between side and condition arose because the left-side response under the word condition was significantly larger than that under the PT condition (p < .05) and because, under the word condition, the left-side response was significantly larger than the right-side response (p < .05).
The fact that the main effect of condition (i.e., PT vs. word) was not significant in either age group showed that the Oxy-Hb changes elicited by the PT stimuli were, on average, comparable with those elicited by the word stimuli for these infants. The lack of a main effect of side (left or right) showed that, on average, the stimuli in the present experiment elicited comparable levels of Oxy-Hb changes on either side of the brain.
In the present study, we found that both 4- and 10-month-old Japanese infants behaviorally discriminated the lexical pitch–accent pattern change between HL and LH words. The neural responses to these stimuli also showed that infants in both age groups had larger hemodynamic responses in the test blocks than in the baseline blocks. Furthermore, the neural responses to the HL versus LH stimuli differed between the two age groups: 10-month-old infants showed stronger left hemisphere hemodynamic responses to the pitch changes embedded in word forms but no left-side dominance in responses to PT stimuli, whereas 4-month-olds showed bilateral responses to both types of stimuli. The older infants' responses showed patterns similar to those seen in Japanese adults in a previous study (Sato et al., 2007). The behavioral experiment confirmed that the changes we observed between 4 and 10 months of age are not due to changes in infants' ability to discriminate the stimuli; the change lies only in how the two hemispheres processed these contrasts.
We tested the prediction that the way infants process contrasts in speech accompanying “reorganization” is associated with functional lateralization of speech processing. Research on infant speech perception has revealed that although younger infants seem to be capable of discriminating the majority of segmental contrasts, they begin to lose this sensitivity by the second half of the first year unless the contrasts exist in their language (Werker & Yeung, 2005; Werker & Tees, 1984). It has been proposed that this change is not a loss of sensitivity but is better characterized as a “reorganization.” What changes is how infants process the stimuli—from general auditory processing to specifically attuned processing for linguistically relevant contrasts (Werker & Tees, 1992, 1999, 2005; Kuhl, 2004). On the basis of this proposal and strong evidence from adult imaging studies, we predicted that this reorganization may be linked to the functional lateralization of speech processing; namely, if older infants are processing speech contrasts as “linguistically relevant,” a left hemisphere dominance should be seen, as in adults, whereas younger infants should not show this left-side dominance, as the contrasts are yet to become linguistically relevant to them. Our current behavioral and NIRS findings are consistent with this prediction.
Since the early studies by Broca (1950/1861) and Wernicke (1969/1874), it has been well documented that the left hemisphere plays a more dominant role than the right in processing language and speech stimuli. Still, there has been an active debate as to what drives this asymmetry. The dominant view in the field has been that it is the linguistic function of the stimuli that drives the lateralization; we will call this the functional account. Important evidence for this account comes from the processing of lexical tones in Chinese and Thai (Wong, Parsons, Martinez, & Diehl, 2004; Gandour et al., 1998, 2000, 2002, 2003; Klein et al., 2001). In a PET study, for example, Gandour et al. (2000) found that Thai adults show left hemisphere dominance for Thai tones, whereas such asymmetry is not found in Chinese or English speakers, demonstrating that the linguistic function of the tone is sufficient to drive a left hemisphere advantage. On the other hand, it has been proposed that the observed asymmetry is not driven by the function of the speech signal but instead reflects the physical properties of speech signals: Auditory stimuli with slow acoustic transitions, such as pitch changes, are preferentially processed in the right hemisphere, whereas rapidly changing sounds, like consonants, are preferentially processed in the left (Zatorre, Belin, & Penhune, 2002; Zatorre & Belin, 2001). We will call this the acoustic account. A review of the literature suggests that both of these factors may contribute to the functional lateralization of language and speech stimuli in the adult brain. However, as Zatorre and Gandour (2008) pointed out, much remains to be discovered about how these factors interact.
Developmental research could contribute to this debate significantly. As discussed above, cumulative data from electrophysiological and imaging studies with young infants suggest that the two hemispheres are not symmetrical in processing speech and other auditory stimuli from early in infancy (Homae et al., 2006; Dehaene-Lambertz & Gliga, 2004; Peña et al., 2003; Dehaene-Lambertz et al., 2002; Dehaene-Lambertz, 2000; Dehaene-Lambertz & Baillet, 1998; Dehaene-Lambertz & Dehaene, 1994). Infants at this young age are not likely to have learned much about the specific characteristics of their native language segments, although they may have already learned some prosodic properties. Thus, these data show that at least some asymmetry is already present before the “reorganization” through which infants learn much about the segments of their language. Still, we cannot determine whether the observed asymmetry derives purely from the physical properties of the stimuli (e.g., fast or slow transitions) or is functionally linked to the fact that some of the stimuli were human speech. Moreover, because these studies have focused on very young infants, we do not know whether this asymmetry changes as infants learn the linguistic relevance of particular speech stimuli.
In the present study, we observed a change in left hemisphere dominance for pitch–accent embedded in word forms between 4 and 10 months of age. Compared with the early asymmetry data discussed above, the emergence of the left-side advantage in the current study is relatively late. This allows us to dissociate the acoustic and developmental factors that could contribute to the emergence of a left hemisphere advantage. The phonemic contrast that we used in the present study is the lexical pitch–accent of Japanese. As in the Gandour et al. (2000) study with tones, the acoustic cue for the lexical pitch–accent in Japanese is a slow pitch transition. This type of cue is typically associated with bilateral activation or a right hemisphere advantage. Had we used a typical phonemic contrast with consonants or vowels involving fast formant transitions, a left hemisphere advantage could have arisen from the inherent acoustic properties of the stimuli, independent of their linguistic function (Zatorre et al., 2002; Zatorre & Belin, 2001). We found that 4-month-old infants did not show a left hemisphere advantage for either the pure tone (PT) stimuli or the word form stimuli, indicating that the LH advantage we found in 10-month-old infants is not attributable to the acoustic properties of the pitch–accent stimuli. Instead, it should be due to developmental changes that occur between 4 and 10 months of age.
It is important to note, however, that the data from the current study are not, on their own, sufficient to determine which aspects of the developmental changes between 4 and 10 months of age caused the left-side dominance in processing the word form stimuli. As our working hypothesis predicted, the left-side dominance could have emerged as a consequence of learning the linguistic function of pitch–accent. However, many other developmental changes occur during this time, and we cannot rule out the possibility that other changes, for example, a maturational improvement of the left hemisphere, contributed significantly to the emergence of the LH advantage. In our study, we used PT stimuli as a control to rule out the possibility that older infants would show an LH advantage for pitch changes in general. The results of the ANOVA show that the hemodynamic changes for the PT stimuli were comparable to those for the word form stimuli, which allows us to rule out the possibility that the lack of an LH advantage in 4-month-olds was due to lower activation for the PT stimuli. The lack of a main effect of side (left or right) also shows that the LH did not selectively improve over the RH in responding to pitch changes between 4 and 10 months of age.
The PT and the word form stimuli differ in many other respects, however, such as bandwidth and formant complexity, and some of these differences could have influenced our results. For example, the maturation of the LH may selectively improve infants' sensitivity to pitch changes embedded in complex stimuli, such as syllables. Previous NIRS studies have reported that pitch changes embedded in words or speech do not automatically lead to an LH advantage (Homae, Watanabe, Nakano, & Taga, 2007; Sato et al., 2003). Still, these data hardly exhaust the alternative accounts of what could have contributed to the emergence of the LH advantage, and future studies will be needed to clarify the relationship among "reorganization" in language development, linguistically relevant processing, and brain responses.
In comparing adult and infant studies, one needs to consider that adults process linguistic stimuli for meaning, whereas infants are not likely to comprehend the semantic content of the words. For this reason, adult studies that show left hemisphere advantages for tone or pitch–accent discrimination are open to the criticism that the effect could be driven not only by tone or pitch–accent processing but also by lexical processing of the stimulus words (Wong, 2002). To address this, Xu et al. (2006) used tonal chimeras, in which Thai (and Chinese) tones are superimposed onto Chinese syllables. They found that both Thai and Chinese speakers show an overlapping left hemisphere advantage for the tones of their native language but not for foreign ones. This demonstrates that the tones of one's own language can produce a left hemisphere advantage independent of lexical processing. This finding is consistent with the results from the 10-month-old infants in our study, who demonstrated a left hemisphere advantage for lexical pitch–accent even though they were not familiar with the meanings of the lexical items.
Cross-linguistic comparisons with infants, similar to Xu et al. (2006), could provide valuable data. If the prediction of our working hypothesis is correct, we should find no left-dominant responses to Thai or Chinese tones in 4-month-old infants of any language background. At 10 months, in contrast, we should find an LH advantage for Thai tones but no left dominance for Chinese tones in Thai infants, and the opposite pattern in Chinese infants. At present, however, we know very little about whether infants learning various tone languages (e.g., Chinese, Thai, Vietnamese) can discriminate tones in other languages (cf. Mattock & Burnham, 2006) or the lexical pitch–accent of Japanese at any age, nor do we know whether Japanese infants can discriminate tones of different languages. As more studies reveal how infants learning languages with different lexical-level prosody acquire these contrasts, further cross-linguistic studies could provide critical data on whether the "reorganization" corresponds with the emergence of the LH advantage.
Many issues remain unanswered. For example, we were able to capture the developmental change between 4 and 10 months of age, ages that we argue fall before and after the reorganization, respectively. However, our data do not allow us to determine precisely when the shift occurs. A priori, the emergence of left hemisphere dominance need not coincide precisely with when infants lose their sensitivity to foreign contrasts or when infants become capable of discriminating the contrasts behaviorally. Infants may first learn the linguistic significance of a particular contrast, but there may be a lag before it is reflected as a functional difference between the hemispheres. Alternatively, it is also possible that electrophysiological responses, such as the MMN, or hemodynamic responses measured by NIRS may be observed for speech contrasts that infants fail to discriminate behaviorally. A careful comparison between behavioral and imaging studies will be necessary to determine whether the timing of left-lateralization corresponds closely to the reorganization of how infants process speech stimuli.
Behaviorally, at least three different paths are possible in how infants acquire the ability to discriminate the set of phonemic contrasts in their native language: an initial sensitivity to a wide range of contrasts is modified by progressive loss of sensitivity to nonnative ones (Best & McRoberts, 2003; Polka & Werker, 1994; Kuhl et al., 1992; Werker & Tees, 1984), an initially poor sensitivity gradually improves with experience (Kuhl et al., 2006), or an initial sensitivity undergoes no change (Sundara, Polka, & Genesee, 2006; Best & McRoberts, 2003; Polka et al., 2001). It will be interesting to discover whether similar lateralization is observed when infants learn phonemic contrasts in these different ways. Examining brain responses to the developmental changes in various phonemic contrasts could elucidate the mechanisms by which infants learn the sound system of their language. Note, however, that the developmental shift in hemispheric dominance may not occur for segmental contrasts, whether native or nonnative. This is because a left hemisphere advantage is predicted for segmental contrasts that involve fast transitions, independent of their linguistic function (Zatorre et al., 2002; Zatorre & Belin, 2001). Left hemisphere dominance may thus be observed for all such contrasts, native or nonnative, linguistically relevant or not, independent of infants' linguistic experience. Lexical-level prosody (i.e., tones, pitch–accent, and stress) therefore provides ideal stimuli for further examining the development of hemispheric dominance in speech processing.
To conclude, we should clarify that our study does not contradict the acoustic account of hemispheric dominance. We have demonstrated that, above and beyond the acoustic properties of the stimuli, the linguistic function of the stimuli is sufficient to drive a left hemisphere advantage. We do argue that the physical characteristics of the stimuli cannot be solely responsible for the left hemisphere advantage for certain types of speech stimuli. However, our data are consistent with the account that the inherent properties of the stimuli contribute significantly to how the stimuli are processed by the two hemispheres of the brain. Similarly, the present results are consistent with findings of hemispheric asymmetry in very young infants for various types of auditory and speech stimuli. If the acoustic properties of the stimuli can drive hemispheric dominance, they could contribute significantly to how very young infants process auditory stimuli in the two hemispheres. In addition, we know that much learning takes place before infants acquire the language-specific phonemic inventory. Such learning could also contribute to how speech stimuli are processed in the infant brain. What we have demonstrated is that 4- and 10-month-old Japanese infants differ in how the two hemispheres process pitch changes in word stimuli, even though the infants were not familiar with the words' meanings. This new information adds to our understanding of how linguistic function and acoustic characteristics interact in hemispheric dominance for speech and language processing.
The authors would like to thank Janet F. Werker, Ryoko Mugitani, Sachiyo Kajikawa, and Mitsuhiko Ota for their advice and assistance on the experiments reported in this article, and Hiromichi Hosoma, Kotaro Takeda, and Ippeita Dan for their technical advice and contributions. This work was supported in part by a Grant-in-Aid for Scientific Research to the first author (Y. S.) (Kakenhi, the Japanese Ministry of Education, Culture, Sports, Science and Technology, No. 19730474).
Reprint requests should be sent to Yutaka Sato, Laboratory for Language Development, RIKEN Brain Science Institute, 2-1 Hirosawa, Wako-shi, Saitama, Japan 351-0198, or via e-mail: firstname.lastname@example.org.