Abstract

A crucial aspect of bilingual communication is the ability to identify the language of an input. Yet, the neural and cognitive basis of this ability is largely unknown. Moreover, it cannot be easily incorporated into neuronal models of bilingualism, which posit that bilinguals rely on the same neural substrates for both languages and concurrently activate them even in monolingual settings. Here we hypothesized that bilinguals can employ language-specific sublexical (bigram frequency) and lexical (orthographic neighborhood size) statistics for language recognition. Moreover, we investigated the neural networks representing language-specific statistics and hypothesized that language identity is encoded in distributed activation patterns within these networks. To this end, German–English bilinguals made speeded language decisions on visually presented pseudowords during fMRI. Language attribution followed lexical neighborhood sizes both in first (L1) and second (L2) language. RTs revealed an overall tuning to L1 bigram statistics. Neuroimaging results demonstrated tuning to L1 statistics at sublexical (occipital lobe) and phonological (temporoparietal lobe) levels, whereas neural activation in the angular gyri reflected sensitivity to lexical similarity to both languages. Analysis of distributed activation patterns reflected language attribution as early as in the ventral stream of visual processing. We conclude that in language-ambiguous contexts visual word processing is dominated by L1 statistical structure at sublexical orthographic and phonological levels, whereas lexical search is determined by the structure of both languages. Moreover, our results demonstrate that language identity modulates distributed activation patterns throughout the reading network, providing a key to language identity representations within this shared network.

INTRODUCTION

Successful bilingual communication hinges on correct attribution of linguistic input to a language, which can facilitate word recognition by restricting it to one language (Schulpen, Dijkstra, Schriefers, & Hasper, 2003). Sometimes language attribution is possible based on prior sentential or source-related information (Li, Yang, Scherf, & Li, 2013; Schwartz & Kroll, 2006). In the frequent case of mixed-language settings (e.g., “I am at the Arbeitsamt [job office].”), however, language attribution must be based on single words. It is well established that once a word is mapped to its lexical word form representation, its language is easily recognized (Grainger & Dijkstra, 1992). Language attribution before complete lexical access, however, would be more beneficial, as it could restrict the lexical search to one language early on and thereby facilitate word recognition. This study was designed to investigate the cognitive and neural mechanisms of such early language attribution.

According to cognitive models of bilingual word recognition, language attribution before lexical access is only possible based on language-unique sublexical representations (BIA+ model; Van Kesteren, Dijkstra, & de Smedt, 2012; Dijkstra & van Heuven, 2002), such as, for example, the sublexical unit sh, which exists in English but not in German (Casaponsa, Carreiras, & Duñabeitia, 2014; Vaid & Frenck-Mestre, 2002). Most recently, compelling evidence was provided for a fast route to language membership assignment based on language-unique letters (i.e., the letter Ø in Norwegian vs. English; Van Kesteren et al., 2012) and bigrams (e.g., the bigram TX in Basque vs. Spanish; Casaponsa et al., 2014). Both studies found that words containing such language-unique cues were assigned to a language faster than words that were orthographically legal in both languages in a visual lexical decision task. Yet, the relevance of such language-unique sublexical letter sequences for language attribution in general is limited, as they are only contained in a rather restricted number of words. Our recent findings, instead, suggest that bilinguals rely on continuous differences between languages concerning orthographic patterns at sublexical (bigram frequencies, e.g., the bigram gi is more frequent in English than in German, although it is orthographically legal in both languages) and lexical (orthographic neighborhoods, e.g., the word cage has more English than German orthographic neighbors; Shook & Marian, 2013) levels to infer the language of a visually presented letter string (Oganian, Conrad, Aryani, Heekeren, & Spalek, 2015). However, although this study established bilinguals' sensitivity to distinctive language similarity, it did not test the contribution of similarity to each of the languages separately, as only proportional variables were employed (i.e., the difference between German (L1) and English (L2) orthographic neighborhood size). 
Thus, it remains unclear in which ways L1 and L2 statistics contribute to bilinguals' assessment of language identity.

The study of Oganian et al. (2015) offers first insights into the cognitive mechanisms of early language attribution. Yet, the neurobiological mechanisms underlying bilinguals' disentangling of language ambiguity remain largely unknown so far. Behavioral and neuroimaging studies report that bilinguals activate both of their languages nonselectively (Wu & Thierry, 2010; Kroll, Bobb, & Wodniecka, 2006), even in monolingual settings. Moreover, most fMRI studies find that proficient bilinguals process both of their languages with the same neural network (Consonni et al., 2013; Abutalebi, 2008; Klein et al., 2006). The processing of sublexical and lexical native language statistics within this network is well documented. Specifically, the visual word form area in the left ventral occipito-temporal cortex (vOT) is involved in encoding of visually presented letter strings and is preferentially active for letter strings that are characterized by sublexical statistics typical for the native language (L1; Dehaene & Cohen, 2011; Vinckier et al., 2007; Binder, Medler, Westbury, Liebenthal, & Buchanan, 2006; Dehaene, Cohen, Sigman, & Vinckier, 2005). Lexical similarity statistics of L1 affect neural activity in left angular gyrus (AG; Carreiras, Baayen, Perea, & Frost, 2014), which plays a role in the access to lexico-semantic representations during visual word recognition (Binder et al., 2003). Crucially, though, whether and how second language (L2) statistics are neurally represented is thus far unexplored.

In light of the above-mentioned findings that neural representations of both languages are activated concurrently, it is unknown how this network represents language identity. One possibility is that, although mean activation patterns (as analyzed in the above-mentioned studies) do not differ between languages, distributed activation patterns (Lee, Turkeltaub, Granger, & Raizada, 2012; Raizada, Tsao, Liu, & Kuhl, 2010) reflect the language similarity of an input. Such representations would then be the basis for attributing linguistic input to a specific language.

Here we used event-related fMRI to investigate the neural representation of sublexical and lexical statistics of L1 and L2 and their involvement in a German–English language decision task. Previous studies employed this task to investigate the effects of categorical cues on language attribution (Casaponsa et al., 2014; Van Kesteren et al., 2012; Vaid & Frenck-Mestre, 2002) or to manipulate the relative similarity to first and second language (Oganian et al., 2015). Here, we created a stimulus set of pseudowords (PWs), in which continuous similarities to L1 and L2 at sublexical and lexical levels were varied independently and parametrically. Because PW stimuli do not perfectly match any lexical entry in either language, their use avoids the strong effects of the unambiguous language membership of existing words (involving semantic processing), which would overshadow our attempt to explore the early processes of language assignment. This design allowed us to directly contrast the cognitive effects and neural representations of each variable. Moreover, we were interested in characterizing the involvement of regions in and outside the visual word processing network in language attribution and language identity representation. As previous findings have shown that the neural network involved in second language processing varies with age of acquisition and proficiency (Wartenburger et al., 2003), we chose to focus on a homogeneous group of highly proficient bilinguals with a relatively early age of onset (7–13 years, mean = 10). Importantly, Wartenburger et al. found no difference in the neural networks underlying lexico-semantic processing of first and second language in this population.

We expected, first, that participants' language attribution would follow the statistical similarity of stimuli to each language, as coded by the language-specific statistics. Second, based on the literature, we expected to find similarity to each language represented throughout the reading network. We focused, in particular, on the representation of sublexical statistics in bilateral vOT (Binder et al., 2006) and lexical statistics in left AG (Price, 2012). Third, we reasoned that language attribution would be most likely reflected in differences in activation patterns within the brain regions that encode language-specific statistics. We thus employed multivariate pattern analysis (MVPA) to decode language decisions.

METHODS

Participants

Twenty late, highly proficient German–English bilinguals (three men, ages 21–34 years, mean age = 25, age of L2 acquisition onset = 7–13 years, mean = 10) participated in the study. They were recruited through advertisements on campus and in mailing lists. All were native German speakers, had studied English as their first foreign language in school, and had spent at least 9 months living in an English-speaking country (Great Britain, United States, English-speaking Canada, Australia, or New Zealand). None suffered from a reading disability or other learning disorders. All participants completed an online language history questionnaire (adapted from Li, Sepanski, & Zhao, 2006) before the experiment and were admitted to the study only if they fulfilled the above criteria. Participants were right-handed and had normal or corrected-to-normal vision. Self-reports of L2 proficiency were collected on a 1–7 Likert scale, separately for reading, writing, speaking, and listening abilities (Table 1). General proficiency in German and English was assessed after the experiment using the LEXTALE tests of German and English proficiency (Lemhöfer & Broersma, 2011), which reflect knowledge of vocabulary and familiarity with lexical structures of the test language. We also assessed participants' reading speed using the single word and PW reading subtests of TOWRE for English (Torgesen, Wagner, & Rashotte, 1999) and SLRT-II for German (Moll & Landerl, 2010). All measures reflect a high mastery of English (Table 1). All participants completed an informed consent form before the experiment and were reimbursed either monetarily or with course credit. The experiment was approved by the ethics board of the Psychology Department of the Freie Universität Berlin.

Table 1. 

Participants' Language and Reading Profiles

| Measure | L1 (German) Mean | L1 (German) Range | L2 (English) Mean | L2 (English) Range |
|---|---|---|---|---|
| LEXTALE | 91.8 | 80–100 | 81.3 | 60–100 |
| Reading rates (words/min): Real words | 131.6 | 111–154 | 124.1 | 108–140 |
| Reading rates (words/min): Pseudowords | 84.8 | 58–110 | 77.8 | 43–107 |
| Self-report proficiency (1–7): Reading | – | – | 6.1 | 5–7 |
| Self-report proficiency (1–7): Writing | – | – | 5.8 | 4–7 |
| Self-report proficiency (1–7): Listening | – | – | 6.1 | 5–7 |
| Self-report proficiency (1–7): Speaking | – | – | 6.0 | 5–7 |
| Age of acquisition (years) | – | – | 9.8 | 7–13 |
| Accent (a) | – | – | 2.6 | 1–4 |

(a) 1 = none, 7 = strong accent.

Stimuli and Design

The stimulus set contained 432 PWs, which varied parametrically in their similarity to German and English at sublexical and lexical levels (see below). The use of PWs had several advantages for this study. First, PWs created a language-ambiguous context and thus facilitated simultaneous activation of both language systems. Second, although sublexical processing and lexical search for PWs and words are comparable, the use of PWs reduces the effects of semantics and precludes a lexical match, thus boosting the effects of sublexical information and lexical search, which are at the focus of the current study. Third, PWs can be carefully constructed to contain task-relevant information with considerably fewer constraints than real words. When constructing our stimulus set, we ensured that PWs contained only letter groups that exist in both languages, maximizing the potential effects of continuous statistics. Moreover, stimuli were carefully chosen to reduce the shared variance of psycholinguistic statistics (sublexical vs. lexical in L1 and L2) in our stimulus set. This allowed us to pinpoint potential unique effects of each variable on changes in BOLD—an undertaking that would not be possible with real words. Note that although the use of PWs reduces semantic effects to a minimum and makes full lexical access impossible, sublexical processing of word-like PWs (as in our study) and words recruit the same neural network (Mechelli, Gorno-Tempini, & Price, 2003). Importantly, during PW reading partially overlapping lexical representations are coactivated (Holcomb, Grainger, & O'Rourke, 2002; Peereman & Content, 1995), such that PWs are suited for the investigation of lexical neighborhood effects.

PWs were selected from the English lexicon project (Balota et al., 2007) and a German PW database provided by two of the coauthors (MC & AA), as well as a specially created PW set. All PWs had equal numbers of phonemes and syllables in both languages. They were orthographically legal and pronounceable in both languages as ensured through ratings by two highly proficient bilinguals. Only PWs that were rated as pronounceable in both languages by both reviewers were included in the stimulus list. Sublexical language similarity to German and English was computed as log10 mean positional bigram frequency in German (BFG) and English (BFE). Mean positional bigram frequency for each PW was computed by averaging the frequencies of all of its bigrams according to their position (word onset, word end, or word-middle bigram) based on the SUBTLEX corpora of German and English (Brysbaert et al., 2011). We chose to use position-specific, rather than nonpositional, bigram frequencies, as they are better suited to capture the orthotactic legality of certain bigrams in German and English (i.e., the bigram “ko” is illegal in English in word onsets but not in word-middle positions, whereas it is frequent in German at word onsets and word-middle positions; Oganian et al., 2015). To ensure comparability across languages, both variables were converted to z-scores, such that positive values reflect above average bigram frequency and negative values reflect below average bigram frequency in the respective language. Similarity to lexical word forms of each language was operationalized through orthographic neighborhood size, measured through the orthographic Levenshtein distance (Yarkoni, Balota, & Yap, 2008). The orthographic Levenshtein distance between two letter strings is determined by the minimal number of letter substitutions, deletions, or insertions necessary to transform them into each other. 
The orthographic Levenshtein neighborhood size (OLD) of a letter string in each language is computed as the mean distance to its respective 20 closest neighbors in this language. We refer to the orthographic neighborhood size in German as OLDG, and in English as OLDE. As OLD is highly correlated with length in both languages (German: −.84, English: −.87), with long letter strings having fewer close neighbors, we rendered it comparable across lengths by dividing it by stimulus length (correlations in German and English SUBTLEX after correction for length: German: .04, English: −.07). The resulting variables were then converted to z-scores, such that positive values reflect above average and negative values below average orthographic neighborhood sizes in the respective language. PWs were selected such that they covered the crucial range of similarity overlap between English and German for each of the four measures (BFG, BFE, OLDG, and OLDE), while at the same time keeping all pairwise correlations between the four measures as low as possible (Table 2). Note that all language similarities were positively correlated. To identify the unique contribution of each psycholinguistic variable to variations in dependent variables, it is thus crucial to simultaneously analyze the effects of all variables in a multiple regression model. Importantly, while varying the values of the four psycholinguistic variables in our stimulus set, we controlled for stimulus length and syllable number, such that there was no dependency between BFG, BFE, OLDG, and OLDE and letter or syllable number. Moreover, PWs were chosen such that equal numbers of PWs were more similar to German and English based on either lexical or sublexical language similarity.
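The stimulus measures described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the toy lexicon, and the layout of the frequency table (keys of the form `(position, bigram)`) are hypothetical, and two choices are assumptions the text leaves open: the log is taken of the mean frequency rather than averaging log frequencies, and z-scores use the population standard deviation.

```python
import math
from statistics import mean, pstdev


def levenshtein(a: str, b: str) -> int:
    """Minimal number of substitutions, deletions, or insertions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def old_per_letter(item, lexicon, n_neighbors=20):
    """Mean distance to the n closest lexicon entries, divided by item length."""
    distances = sorted(levenshtein(item, word) for word in lexicon)
    return mean(distances[:n_neighbors]) / len(item)


def log_mean_positional_bigram_frequency(item, frequencies):
    """log10 of the mean frequency of an item's bigrams, looked up by position.

    `frequencies` is a hypothetical table keyed by (position, bigram), with
    position being "onset", "middle", or "end".
    """
    bigrams = [item[i:i + 2] for i in range(len(item) - 1)]
    last = len(bigrams) - 1
    values = []
    for index, bigram in enumerate(bigrams):
        if index == 0:
            position = "onset"
        elif index == last:
            position = "end"
        else:
            position = "middle"
        values.append(frequencies[(position, bigram)])
    return math.log10(mean(values))


def z_scores(values):
    """Standardize values (population standard deviation; an assumption)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]
```

For example, `old_per_letter("cage", english_lexicon)` with a real English word list would give the length-normalized OLD value of that item in English; the same call with a German word list gives the German counterpart.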

Table 2. 

Statistical Properties of the Stimulus Set

(A)

| | BFG | BFE | OLDG | OLDE |
|---|---|---|---|---|
| Mean | 3.79 | 3.80 | 0.40 | 0.35 |
| SD | 0.25 | 0.17 | 0.06 | 0.06 |
| Range | [3.1, 4.3] | [3.2, 4.2] | [0.3, 0.6] | [0.2, 0.6] |

(B)

| | BFG | BFE | OLDG | OLDE |
|---|---|---|---|---|
| BFG | | .54 | .31 | .31 |
| BFE | | | .18 | .35 |
| OLDG | | | | .39 |

(A) Mean, range, and SD of mean bigram frequency (per million bigrams, log10) and orthographic neighborhood size (normalized by stimulus length) in stimulus set. (B) Pairwise correlations between mean bigram frequency and orthographic neighborhood size. All correlations are significant with p < .05. BFG = mean bigram frequency in German; BFE = mean bigram frequency in English; OLDG = Levenshtein orthographic neighborhood size in German; OLDE = Levenshtein orthographic neighborhood size in English.

Task

Participants categorized PWs as more German-like or more English-like in a speeded language decision task. Responses were given by button press on an MRI-compatible response box with the index and middle fingers of the left hand. Buttons were assigned to a language during training and remained constant throughout the experiment for each participant but were counterbalanced across participants. Participants were reminded of the button order before each run. On each trial, RTs and language decisions were recorded.

Procedure

Before entering the scanner, participants were familiarized with the task on a training set of 30 PWs with characteristics similar to those of the stimuli used in the main task. The task in the fMRI scanner was subdivided into eight runs, between which participants were allowed a self-paced break. After the scanning session, participants filled out a questionnaire about their response strategies and completed the proficiency tests described above.

In each experimental trial, a PW was presented on the screen for 2 sec, during which participants were required to respond. It was followed by a jittered ISI of 2–20 sec (event-related design was optimized with optseq2, Dale, 1999), during which a fixation cross was presented on the screen. Stimuli were presented using goggles (Arial, font size 40) with an experimental procedure programmed in Cogent 2000 for MATLAB (version 7.10.0, MathWorks, Inc., Natick, MA). Each run contained 54 PWs and 6 lower-level control trials, which consisted of four to seven backward or forward slashes. During these trials, participants were required to press the left button for backward slashes and the right button for forward slashes. Pseudorandomization ensured that runs were matched on stimulus length and contained equal numbers of more German-typical and more English-typical PWs with respect to lexical as well as sublexical variables.

Behavioral Data Analysis

The effects of mean bigram frequency in German (BFG) and English (BFE) and orthographic neighborhood size in German (OLDG) and English (OLDE) on language decisions and RT were analyzed in mixed-effects multiple regression models with a random factor structure for participants and items (Barr, Levy, Scheepers, & Tily, 2013; Baayen, Davidson, & Bates, 2008). Models were fitted in a two-stage procedure whereby, first, the fixed-effect structure was forward-fitted based on log-likelihood model comparisons starting with a model containing main effects only. In the second step, we backward-fitted the random factor structure. Importantly, to correct for the pairwise correlations among the independent variables, all reported analyses are based on simultaneous estimation of the optimal models identified in the fitting process. Thus, the percentage of variance explained by each predictor reflects unique variance explained after regressing out all other independent factors.

Language decisions were analyzed in a logistic mixed-effects model, where significance of fixed factors was assessed via chi-square Wald tests. For this analysis, response English was coded as 1 and response German as 0, resulting in positive b values for an increase in percentage of English responses.
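To make the response coding concrete, the following Python sketch shows how a positive coefficient in a logistic model translates into a higher probability of an English response. It covers the fixed-effects part only (random effects for participants and items are omitted), and the function name and all numbers are purely illustrative assumptions, not estimates from this study.

```python
import math


def p_english(intercept, coefficients, predictors):
    """Probability of response English (coded 1) from a logistic linear predictor.

    `coefficients` and `predictors` are dictionaries keyed by predictor name.
    """
    eta = intercept + sum(
        coefficients[name] * predictors[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-eta))


# Illustrative values only: a positive b raises the share of English responses.
toy_coefficients = {"x": 1.5}
low = p_english(0.0, toy_coefficients, {"x": -1.0})
high = p_english(0.0, toy_coefficients, {"x": 1.0})
```

With a linear predictor of zero the function returns .5, that is, both responses are equally likely.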

RTs were analyzed with a linear mixed-effects model (MEM) that also included effect-coded language decisions as an independent variable to investigate response-specific effects. As exact degrees of freedom cannot be determined for linear MEMs, we used the normal approximation of the t distribution to infer significance, as has been recommended for large sample sizes (Baayen, 2008).

After the final model for RTs was chosen by this procedure, we fitted this model for each participant. Trials were labeled as outliers if the RT of a trial deviated by more than 2.5 residual standard deviations from the predicted RT value for this trial based on the regression model. Outliers were excluded from behavioral analyses and were coded separately in the fMRI analysis (see below). The language decision task does not allow for a definition of error responses, as all PWs could be attributed to both languages based on their orthographic appearance. However, based on unpublished pilot data, we assumed that responses that strongly contradict the orthographic neighborhoods of a stimulus most probably reflect a mistake in the decision process and should be treated as outliers. Thus, trials with either (1) response German, but low OLDG and high OLDE, or (2) response English, but high OLDG and low OLDE, were defined as outliers. We regarded a PW as having high or low OLD values if the value was in the upper third or the lower third of all values, respectively. Three percent of all trials were excluded based on this criterion. Behavioral data were modeled both with and without these response-based outliers for comparison.
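The two outlier criteria can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the residual standard deviation is approximated by the sample standard deviation of the residuals, and tertile boundaries come from `statistics.quantiles` with its default method. The response-based rule is implemented literally as worded in the text.

```python
from statistics import quantiles, stdev


def residual_outliers(observed_rt, predicted_rt, criterion=2.5):
    """Flag trials whose RT lies more than `criterion` residual SDs from the prediction."""
    residuals = [o - p for o, p in zip(observed_rt, predicted_rt)]
    residual_sd = stdev(residuals)  # assumption: plain sample SD of the residuals
    return [abs(r) > criterion * residual_sd for r in residuals]


def tertile_labels(values):
    """Label each value as 'low' (lower third), 'high' (upper third), or 'mid'."""
    low_cut, high_cut = quantiles(values, n=3)
    return [
        "low" if v <= low_cut else "high" if v >= high_cut else "mid"
        for v in values
    ]


def response_outliers(responses, old_g, old_e):
    """Flag responses that contradict the OLD pattern, following the text's rule."""
    labels_g = tertile_labels(old_g)
    labels_e = tertile_labels(old_e)
    flags = []
    for response, label_g, label_e in zip(responses, labels_g, labels_e):
        rule_1 = response == "German" and label_g == "low" and label_e == "high"
        rule_2 = response == "English" and label_g == "high" and label_e == "low"
        flags.append(rule_1 or rule_2)
    return flags
```

Trials flagged by either function would then be dropped from the behavioral models and assigned to a separate regressor in the fMRI model.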

fMRI Procedures

fMRI Data Acquisition

MRI data were acquired on a 3-T scanner (Trio; Siemens, Berlin, Germany) at the Dahlem Institute for the Neuroimaging of Emotions using a 12-channel head coil. Functional images were acquired with a gradient-echo T2*-weighted echo-planar sequence (repetition time = 2000 msec, echo time = 30 msec, flip angle = 70°, 64 × 64 matrix, field of view = 192 mm, voxel size = 3 × 3 × 3 mm³). A total of 37 axial slices (3 mm thick, 10% gap) were sampled for whole-brain coverage with interleaved bottom–up slice order. Moreover, fat saturation was used. Imaging data were acquired in eight separate runs of 210 volumes (7 min) each. The first four volumes of each run were discarded to allow for T1 equilibration. A high-resolution T1-weighted anatomical scan of the whole brain (256 × 256 matrix, voxel size 1 × 1 × 1 mm³) was acquired at the end of the scanning session.

fMRI Data Preprocessing

All preprocessing was done in SPM8 for MATLAB (MathWorks, Natick, MA). Preprocessing steps were applied in the following order: realignment, coregistration to mean image, segmentation, normalization, and smoothing. Realignment was based on the two-pass procedure available in SPM8 using a six-parameter rigid body transformation. Subsequently, each participant's anatomical image was coregistered to the mean functional image. Coregistration was based on a normalized mutual information algorithm provided in SPM8. Gray and white matter were segmented, and segmentation results were used for normalization of functional images to SPM8's standard T1 template based on the Montreal Neurological Institute (MNI) reference brain with isotropic 3-mm voxels. For the univariate analyses, normalized images were smoothed with an isotropic 8-mm FWHM Gaussian kernel to ameliorate intersubject differences. Multivariate analyses were performed on the unsmoothed images.

fMRI Single Participant Analysis

For each participant, fMRI time series were regressed onto a general linear model (GLM) containing regressors representing experimental trials, high-level control trials, and effect-coded participants' responses. The four psycholinguistic variables BFG, BFE, OLDG, and OLDE were entered as parametric modulators for the experimental trials. Motivated by the behavioral results, we also added the interaction of OLDG and OLDE as a parametric modulator (see Results). All parametric modulators were mean centered and normalized before entering the model, such that we could directly compare beta-values for different predictors. All events were modeled with a 2-sec box-car function, convolved with the standard hemodynamic response function, and high-pass filtered at 128 sec. All predictors entered the design matrix independently, that is, without the default serial orthogonalization implemented in SPM (see Wilson, Isenberg, & Hickok, 2009; Hauk, Davis, & Pulvermüller, 2008, for a similar approach). This way, any correlation between a particular predictor and BOLD percent signal change reflects unique variance explained by this predictor, independently of regressor order and after all other factors have been accounted for. Outlier trials were coded in an additional regressor that was not further analyzed. To account for variance induced by participants' movement, the six realignment parameters estimated in the realignment preprocessing step were added as nuisance regressors. This first-level analysis produced a beta-estimate for each participant and each parametric predictor.

Univariate Group Analysis

In the univariate analysis, we asked in which brain regions unique variance would be explained by one of the psycholinguistic variables at the group level. Statistical inference at the group level was performed with independent t tests on beta-maps, estimated for each participant for each variable. Because we were particularly interested in potential differences between effects of first and second language similarity statistics at each representational level, we also computed the difference contrasts for orthographic neighborhood sizes (OLDG − OLDE) and bigram frequencies (BFG − BFE), masked by the respective main effects. We also assessed which brain regions would be preferentially activated for categorizations as German versus categorizations as English.

All reported results were significant after setting the uncorrected voxel-wise threshold at p < .001 and the whole-brain family-wise error (FWE)-corrected threshold to p < .05 at the cluster level. All univariate analyses were performed with SPM8. Anatomical labels are based on the AAL atlas (built into the WFU toolbox for MATLAB) and refer to the location of peak voxels.

ROI Analysis

ROIs for the left-hemispheric visual word form area and the homologous right-hemispheric regions were defined as spheres with an 8-mm radius around coordinates in bilateral vOT that are most often reported in the literature for the visual word form area (i.e., Cohen, Jobert, Le Bihan, & Dehaene, 2004: [±43, −57, −12]). Beta-estimates were extracted from these ROIs for all four psycholinguistic variables as the first eigenvector across all voxels and participants in each ROI using MarsBaR (marsbar.sourceforge.net/). The beta-estimates were then submitted to a 4 (contrasts) × 2 (ROIs) repeated-measures ANOVA. The main effect of contrasts was further analyzed in planned comparisons. The analysis was also repeated with small ROIs of 4-mm radius with similar results.
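As a rough illustration of what such spherical ROIs contain, the Python sketch below enumerates the voxels of a 3-mm isotropic grid whose centers lie within a given radius of a center coordinate. It assumes that the sphere center coincides with a voxel center, which need not hold for the actual images, and the function names are hypothetical; the ROIs in the study were built with MarsBaR.

```python
import math


def sphere_offsets(radius_mm, voxel_size_mm=3.0):
    """Voxel offsets (i, j, k) whose centers lie within radius_mm of the origin."""
    reach = int(radius_mm // voxel_size_mm)
    offsets = []
    for i in range(-reach, reach + 1):
        for j in range(-reach, reach + 1):
            for k in range(-reach, reach + 1):
                distance = voxel_size_mm * math.sqrt(i * i + j * j + k * k)
                if distance <= radius_mm:
                    offsets.append((i, j, k))
    return offsets


def sphere_coordinates(center_mm, radius_mm, voxel_size_mm=3.0):
    """Coordinates (in mm) of all voxel centers inside the sphere around center_mm."""
    return [
        tuple(c + voxel_size_mm * o for c, o in zip(center_mm, offset))
        for offset in sphere_offsets(radius_mm, voxel_size_mm)
    ]


# Left and right vOT centers taken from the coordinates cited in the text.
left_roi = sphere_coordinates((-43.0, -57.0, -12.0), radius_mm=8.0)
right_roi = sphere_coordinates((43.0, -57.0, -12.0), radius_mm=8.0)
```

Under these assumptions an 8-mm sphere covers 81 voxels and a 4-mm sphere covers 7, which gives a sense of how much smaller the control ROIs are.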

Multivariate Pattern Analyses

The aim of the MVPAs was to identify brain regions in which activation patterns contained information about participants' language decisions. Multivariate analysis methods are more sensitive to differences in BOLD signal change patterns than univariate analyses, as they are based on activation patterns across voxels rather than single voxels' behavior (Pereira, Mitchell, & Botvinick, 2009; Norman, Polyn, Detre, & Haxby, 2006). ROI MVPA was performed with a linear support vector machine classification algorithm on all voxels within clusters of activation as identified by the univariate analyses (see Results). Classifier performance was evaluated with a leave-one-run-out procedure. In this procedure, the classifier is repeatedly trained on data from all but one run, and then its accuracy is tested on the one run that was not used for training. Classifiers were trained to discriminate between trials with German responses and trials with English responses based on beta-value estimates for each trial type.
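The run-wise cross-validation scheme described above can be sketched in Python as follows. To keep the example dependency-free, a nearest-centroid classifier stands in for the linear support vector machine used in the study, so this illustrates the train/test logic only, not the actual classifier. The data layout (one list of `(pattern, label)` pairs per run) and all names are assumptions.

```python
def centroid(patterns):
    """Voxel-wise mean of a list of activation patterns."""
    return [sum(column) / len(patterns) for column in zip(*patterns)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def leave_one_run_out_accuracy(runs):
    """Mean test accuracy when each run is held out once.

    `runs` is a list of runs; each run is a list of (pattern, label) pairs,
    for example one beta pattern per response type and run.
    """
    fold_accuracies = []
    for test_index, test_run in enumerate(runs):
        # Train on all runs except the held-out one.
        training = [
            sample
            for index, run in enumerate(runs) if index != test_index
            for sample in run
        ]
        centroids = {
            label: centroid([p for p, lab in training if lab == label])
            for label in {lab for _, lab in training}
        }
        # Test on the held-out run: assign each pattern to the nearest centroid.
        correct = 0
        for pattern, label in test_run:
            predicted = min(
                centroids, key=lambda lab: squared_distance(pattern, centroids[lab])
            )
            correct += predicted == label
        fold_accuracies.append(correct / len(test_run))
    return sum(fold_accuracies) / len(fold_accuracies)
```

Because training and test patterns always come from different runs, the accuracy estimate is not inflated by within-run dependencies between patterns.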

Classifier performance was assessed separately for each ROI in all participants and then tested against chance level (50%) with permutation tests (with 10,000 permutations per test based on custom MATLAB scripts) across participants within each ROI. Significant results are reported after a Bonferroni correction for multiple comparisons based on the number of independent ROIs that entered the analysis.
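The text does not specify the permutation scheme, so the Python sketch below uses one common choice as an assumption: a one-sided sign-flip permutation test on the participants' deviations from chance, followed by a Bonferroni correction. The function names and the fixed random seed are illustrative; the original analysis relied on custom MATLAB scripts.

```python
import random


def sign_flip_p_value(accuracies, chance=0.5, n_permutations=10_000, seed=0):
    """One-sided permutation p value for mean accuracy exceeding chance.

    Under the null hypothesis, each participant's deviation from chance is
    equally likely to be positive or negative, so its sign is flipped at random.
    """
    rng = random.Random(seed)
    deviations = [a - chance for a in accuracies]
    observed = sum(deviations) / len(deviations)
    at_least_as_large = 0
    for _ in range(n_permutations):
        flipped = [d * rng.choice((-1, 1)) for d in deviations]
        if sum(flipped) / len(flipped) >= observed:
            at_least_as_large += 1
    # Add one to numerator and denominator so that p is never exactly zero.
    return (at_least_as_large + 1) / (n_permutations + 1)


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected p value, capped at 1."""
    return min(1.0, p_value * n_tests)
```

A corrected value for one ROI would then be obtained as `bonferroni(sign_flip_p_value(roi_accuracies), n_rois)`, where `roi_accuracies` holds one accuracy per participant and `n_rois` is the number of independent ROIs.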

Finally, we asked whether language decisions could be decoded from brain regions that showed no sensitivity to lexical or sublexical language similarity, as determined in the univariate analysis. To this end, we performed a searchlight MVPA (Kriegeskorte, Goebel, & Bandettini, 2006). For this analysis, the same procedure as described for ROI MVPA was performed successively for ROIs defined as a sphere of 7-mm radius around each voxel of the brain. For each participant, the performance of the classifier for each voxel was mean centered by subtracting that participant's mean accuracy computed across all voxels. This way, the variance in overall classifier performance across participants was taken into account (for a similar approach, see Lee et al., 2012). The resulting classifier performance maps were then entered into t tests at the group level and corrected for multiple comparisons based on random field theory (p uncorrected < .001, cluster-wise FWE-corrected p < .05).
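The mean-centering step and the subsequent group test at a single voxel can be sketched in Python as follows. The accuracy maps are represented as flat lists, the function names are hypothetical, and the random-field-theory correction is not reproduced here; the sketch only shows the arithmetic of the two steps.

```python
import math
from statistics import mean, stdev


def mean_center(accuracy_map):
    """Subtract a participant's mean accuracy (across all voxels) from each voxel."""
    participant_mean = mean(accuracy_map)
    return [accuracy - participant_mean for accuracy in accuracy_map]


def one_sample_t(values):
    """t statistic for the mean of `values` against zero."""
    return mean(values) / (stdev(values) / math.sqrt(len(values)))


def group_t_map(accuracy_maps):
    """Voxel-wise t statistics across participants' mean-centered accuracy maps."""
    centered = [mean_center(m) for m in accuracy_maps]
    return [one_sample_t(list(voxel_values)) for voxel_values in zip(*centered)]
```

After centering, a positive value at a voxel means that decoding there was better than that participant's brain-wide average, so the group test identifies regions that are consistently more informative than average across participants.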

RESULTS

Behavioral Results

Effects of Lexical and Sublexical Language Similarity on Language Decisions

To test whether categorization of PWs as German or English was affected by BFG, BFE, OLDG, and OLDE, we modeled participants' decisions as a function of the four psycholinguistic variables and their interactions in a logistic mixed-effects model. The results of the best-fitting model are summarized in Table 3. Participants more readily categorized PWs as English if they had a large English orthographic neighborhood, and analogously they also more often assigned PWs to German if they had a large German orthographic neighborhood. Additionally, there was an interaction effect between OLDG and OLDE. It was due to the effect of OLDG being most pronounced for items with small English neighborhoods, that is, for items that could not be discarded as German based on OLDE (Figure 1A). From another perspective, the effect of OLDE was most pronounced for PWs with a large German neighborhood, for which categorization as German could not be rejected based on OLDG. Overall, this interaction effect reflects a bias toward classification to L1 German, although participants were sensitive to lexical similarity statistics of both languages. There were no effects of bigram frequencies on language decisions.

Table 3. 

Summary of the Logistic Mixed-effects Model for Responses in the Language Decision Task

Predictor      Significance   Beta-estimate   SD      z-Value   p
(Intercept)                   0.08            0.14    0.54      .586
BFG                           −0.07           0.22    −0.34     .732
BFE                           −0.29           0.32    −0.90     .366
OLDE           ***            8.43            0.84    −9.99     <.001
OLDG           ***            −6.05           0.90    6.71      <.001
OLDE : OLDG    *              30.28           14.69   2.06      .039

BFG = mean bigram frequency in German; BFE = mean bigram frequency in English; OLDG = Levenshtein orthographic neighborhood size in German; OLDE = Levenshtein orthographic neighborhood size in English.

*p < .05.

**p < .01.

***p < .001.

Figure 1. 

Interaction effect of German (OLD_G) and English (OLD_E) orthographic neighborhood sizes in behavioral and neural data. (A) Language decision behavior: percentage of responses English increased with OLD_E and percentage of responses German increased with OLD_G. (B) Mean beta-estimates from fMRI second-level analysis in bilateral AG. Although there were no correlations with OLD_G or OLD_E, their interaction was significant. (C) Predicted percentage of BOLD signal change in bilateral AG for the linear interaction of OLD_G and OLD_E. To elucidate the linear interaction effect of OLD_G and OLD_E on changes in BOLD signal, we calculated the predicted percent BOLD signal change based on GLM beta-estimates at the single trial level and averaged it separately for trials with high and low OLD_G and OLD_E values (see below). Percentage signal change increased for stimuli that had similar neighborhood sizes in both languages (i.e., a large neighborhood in English and German or a comparably small neighborhood in both languages) and decreased if orthographic neighborhood in one language was larger than in the other. In all panels, “low” OLD_G/OLD_E values refer to values below average in the respective language and “high” OLD_G/OLD_E values refer to values above average in the respective language. Error bars reflect ±1 SEM.

Effects of Lexical and Sublexical Language Similarity on RTs

The best-fitting model for RTs is summarized in Table 4. English categorizations were faster than German categorizations (mean RTs English: 1017, German: 1075 msec). Responses were speeded by increases in sublexical similarity to German (BFG), but slowed with increases in sublexical similarity to English (BFE), with both these effects being independent of the response language. This suggests that decisions on PWs were easier if their orthographic structure resembled the native language, and more effortful if it was more similar to L2 orthographic structure, demonstrating a tuning to L1 orthography. There was also a main effect of lexical similarity to English (OLDE) because of faster responses for more positive OLDE values. Importantly, this effect was qualified by an interaction with response language, reflecting that increase in OLDE speeded categorizations as English, whereas OLDE had no effect on RT for categorizations as German. There was no main effect of lexical similarity to German (OLDG); however, OLDG interacted with response. Similarly to the effects of OLDE, this interaction was due to a speed-up of categorizations as German with an increase in German orthographic neighborhood size, whereas the RT for response English was unaffected by OLDG. This pattern parallels the main effects of lexical similarity on language decisions, as it shows that decisions were faster with increases in lexical evidence for that language, as reflected in orthographic neighborhood sizes.

Table 4. 

Summary of Linear Mixed-effects Model for RTs in the Language Decision Task

Predictor                 Significance   Beta-estimate   SD      t       pz
(Intercept)               ***            1040.51         26.55   39.19   <.001
BFG                       **             −14.33          3.80    −3.77   <.001
BFE                       *              9.47            3.76    2.52    .01
OLDE                      *              −8.96           3.62    −2.47   .01
OLDG                                     3.04            3.48    0.87    .38
Response                  **             −19.20          3.48    −5.52   <.001
BFG : Response                           −2.30           3.70    −0.62   .5
BFE : Response                           6.93            3.65    1.90    .06
OLDE : Response           **             −15.75          3.51    −4.49   <.001
OLDG : Response           *              6.96            3.38    2.06    .04
OLDE : OLDG                              2.74            3.22    0.85    .4
OLDE : OLDG : Response    a              −7.28           3.40    −2.14   .03

BFG = mean bigram frequency in German; BFE = mean bigram frequency in English; OLDG = Levenshtein orthographic neighborhood size in German; OLDE = Levenshtein orthographic neighborhood size in English.

pz denotes significance based on the normal approximation.

aEffect not significant after removal of response-based outliers.

*p < .05.

**p < .01.

***p < .001.

Finally, the three-way interaction of OLDG, OLDE, and response was significant. However, we suspected that this effect was due to outlier responses that did not comply with the orthographic neighborhood statistics. Indeed, the three-way interaction of OLDG, OLDE, and response was no longer significant after removal of such “response-based” outliers (see experimental procedures), suggesting that this effect resulted from outliers and did not reflect a pattern inherent to the complete stimulus set. All other effects were unaffected by removal of response-based outliers.

Overall, RT patterns converge with the analysis of language decisions in showing that orthographic neighborhood sizes in both languages contribute to the decision process. Additionally, they reveal a general L1 bias at the sublexical level, evident in faster responses with increasing sublexical similarity to German.

Whole-brain fMRI Results

We first aimed to characterize the brain network in which activation is modulated by the psycholinguistic variables describing sublexical (BFG, BFE) and lexical (OLDG, OLDE) similarity to the first and second language. On the basis of the behavioral results, in the univariate analysis of fMRI data, BOLD was regressed on response language (contrast-coded as German minus English), BFG, BFE, OLDG, OLDE, as well as the linear interaction of OLDG and OLDE.

General Task-related Brain Network

Making language decisions on PWs, as compared with the lower-level control condition, led to an increase in BOLD in a widespread, mostly left-lateralized network (Table 5) that comprised the bilateral occipital cortex and lingual gyri, the left inferior and middle frontal gyri, the left postcentral gyrus, as well as bilateral SMAs and the anterior cingulate. Activations in this network were independent of participants' responses.

Table 5. 

Coordinates of Activation Peaks for Each of the Univariate Contrasts Analyzed

Contrast     Cluster pFWE-c  Cluster size  Anatomic localization                             t     x     y     z
Task > LC    <.001           619           Left inferior and middle frontal lobe             9.54  −45   19
                                                                                             7.58  −48   35    13
                                                                                             7.28  −42   26    −2
             <.001           436           Left inferior and middle occipital lobe           8.98  −39   −88
                                                                                             8.67  −21   −94
                                                                                             8.3   −24   −100
             <.001           106           Left lingual gyrus                                8.94  −18   −52
                                                                                             5.86  −24   −61
                                                                                             5.71  −15   −67
             <.001           185           Left postcentral gyrus                            7.75  −51   −7    52
                                                                                             7.31  −57   −10   46
                                                                                             6.5   −45   −19   52
             <.001           270           Right cerebellum and inferior occipital lobe      7.63  21    −85   −5
                                                                                             7.32  27    −82   −17
                                                                                             6.66  30    −73   −23
             <.001           49            Right rolandic operculum                          7.45  36    −19   25
                                                                                             5.68  45    −22   19
             <.001           63            Right lingual gyrus                               6.31  18    −49
                                                                                             6.07  12    −52   −5
             <.001           99            Bilateral SMA and cingulate                       6.3   −6    17    43
                                                                                             5.69  −3    14    64
                                                                                             5.34  12    26    37
BFG          <.001           243           Left middle occipital gyrus                       6.58  −18   −94   −5
                                           Left inferior occipital gyrus                     5.94  −33   −85   −8
                                           Left middle occipital gyrus                       5.79  −24   −97
             .029            56            Left fusiform gyrus                               5.77  −45   −46   −20
                                           Left inferior temporal gyrus                      4.54  −51   −58   −11
             .011            70            Right inferior occipital gyrus                    4.68  33    −88   −11
                                           Right calcarine sulcus                            4.56  21    −94   −2
                                           Right inferior occipital gyrus                    4.26  27    −94
OLDG         <.001           121           Left rolandic operculum/left supramarginal gyrus  6.68  −36   −31   22
                                           Left inferior parietal lobule                     5.66  −57   −25   37
                                           Left postcentral gyrus                            5.48  −48   −25   31
             .021            64            Right lingual gyrus                               5.69  21    −85   −2
             .016            68            Right postcentral gyrus                           5.59  54    −19   55
                                           Right precentral gyrus                            5.43  48    −10   49
                                           Right postcentral gyrus                           4.31  42    −25   52
neg. OLDE    <.001           358           Left inferior occipital gyrus                     7.48  −42   −70   −5
                                           Left middle occipital gyrus                       6.04  −36   −85   −2
                                                                                             5.84  −45   −79   −2
             .010            81            Left SPL                                          6.81  −24   −61   52
                                                                                             4.49  −24   −49   46
             .021            68            Right inferior temporal gyrus                     5.28  42    −55   −8
             .053            53            Right inferior occipital gyrus                    4.72  30    −85
                                                                                             4.09  39    −85   −5
OLDG × OLDE  .003            95            Right AG                                          5.79  51    −58   34
             .013            71            Left inferior parietal lobule                     4.76  −51   −61   43
                                           Left AG                                           4.08  −39   −73   40

LC = lower level control; BFG = mean bigram frequency in German; BFE = mean bigram frequency in English; OLDG = Levenshtein orthographic neighborhood size in German; OLDE = Levenshtein orthographic neighborhood size in English.

All reported activations were significant at an uncorrected voxel-wise significance level of p < .001 and a whole-brain cluster-level FWE-corrected level of p < .05. We report up to three peaks, at least 8 mm apart, for each activation cluster.

Brain regions where BOLD was modulated by one of the psycholinguistic variables are listed in Table 5 and illustrated in Figure 2.

Figure 2. 

Maps of brain regions where the change in brain activation was modulated by one of the psycholinguistic variables independently of the response language. No significant activations or deactivations were found for English bigram frequency (BFE). All clusters were significant at a cluster-level FWE-corrected p < .05, with peak-level uncorrected p < .001, heat maps reflect t values. BF_G = German bigram frequency; OLD_G = German orthographic neighborhood; OLD_E = English orthographic neighborhood; OC = occipital lobe; TP = parieto-temporal junction; SMG = supramarginal gyrus; PC = postcentral gyrus.

Neural Correlates of Mean German Bigram Frequency (BFG)

BFG positively correlated with BOLD response in two clusters along the posterior–anterior axis of the left temporo-occipital cortex (Figure 2). The more anterior cluster was located in left vOT and corresponded in its location to the VWFA as commonly reported in the literature (e.g., Cohen & Dehaene, 2004: [−43, −57, −12]), whereas the posterior cluster extended along the ventral stream of visual processing (Milner & Goodale, 1995). Additionally, BFG was positively correlated with changes in BOLD signal in a small cluster located in the right posterior occipital cortex. These results replicate the previously reported correlation between activation in bilateral vOT and mean bigram frequency in the native language (Binder et al., 2006) and extend it to a language-ambiguous context. The asymmetry between left-hemispheric and right-hemispheric activations is also in line with previous results (Price, 2012; Vinckier et al., 2007).

These activations were independent of participants' language decisions. No significant activation clusters were found for negative correlations with BFG even after lowering the peak-level threshold to p < .01.

Moreover, we found no positive or negative correlations with mean English bigram frequency (BFE).

Neural Correlates of German Orthographic Neighborhood Size (OLDG)

OLDG positively correlated with changes in BOLD signal in a large cluster extending from the left supra-marginal gyrus and parieto-temporal junction to left inferior precentral gyrus, as well as a cluster in right motor cortex and another cluster in right occipital lobe (see Figure 2 and Table 5). We found no negative correlations of BOLD signal with OLDG. These activations were independent of participants' language decisions.

Neural Correlates of English Orthographic Neighborhood Size (OLDE)

There were no positive correlations with OLDE. However, we found that BOLD signal changes were negatively correlated with OLDE in a cluster in left occipital cortex, partially overlapping with the posterior left occipital cluster found for BFG, and in two clusters along the posterior–anterior axis of the right occipital cortex. The more anterior of these two right occipital clusters was homologous to the left vOT cluster, where activation was modulated by BFG (Figure 4). Moreover, OLDE was also negatively correlated with BOLD in the left superior parietal lobule (SPL). These activations were independent of participants' language decisions.

Neural Correlates of the Linear Interaction of Orthographic Neighborhood Sizes in German (OLDG) and English (OLDE)

The linear interaction of OLDG and OLDE was positively correlated with BOLD in the bilateral AG (Figure 1B, C), meaning that the correlation between BOLD and OLDE was most positive for large OLDG values and decreased as OLDG decreased. That is, for PWs with a large German orthographic neighborhood, the AG were more active if the English orthographic neighborhood was large as well (e.g., for the PW tald, which has many orthographic neighbors in both languages), which rendered language decisions more ambiguous, than if OLDE was small (e.g., egen, which has many neighbors in German and fewer in English), which made the attribution to German more plausible. Figure 1B displays mean beta-estimates from the GLM analysis for OLDG, OLDE, and their interaction. To further illustrate the interaction effect, we divided each of the OLD variables into two bins and calculated the average predicted percent BOLD signal change for each of the resulting four conditions based on the linear model used for the first-level fMRI analysis. The resulting Figure 1C shows how the change in predicted percent BOLD signal with one of the OLD variables depends on the values of the other variable. For example, predicted percent BOLD signal change increases with OLDE for high but not for low OLDG values.
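The binning procedure behind Figure 1C can be sketched as follows. This is a simplified illustration rather than the original analysis: the beta values, the toy trial list, and the function names are hypothetical, the predictors are assumed to be mean-centered, and the split at the mean follows the "below average" and "above average" definition given in the Figure 1 caption.

```python
from statistics import fmean


def predicted_signal(old_g, old_e, b_g, b_e, b_int):
    """Linear-model prediction from two predictors and their product term."""
    return b_g * old_g + b_e * old_e + b_int * old_g * old_e


def cell_means(trials, b_g, b_e, b_int):
    """Average the predicted signal within the four low/high OLD_G x OLD_E cells."""
    mean_g = fmean(g for g, _ in trials)
    mean_e = fmean(e for _, e in trials)
    cells = {}
    for g, e in trials:
        key = ("high" if g > mean_g else "low", "high" if e > mean_e else "low")
        cells.setdefault(key, []).append(predicted_signal(g, e, b_g, b_e, b_int))
    return {key: fmean(values) for key, values in cells.items()}


# Hypothetical mean-centered (OLD_G, OLD_E) values for eight trials.
trials = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0),
          (-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5)]

# Hypothetical betas: no main effects, positive interaction.
means = cell_means(trials, b_g=0.0, b_e=0.0, b_int=0.2)
```

With a positive interaction term and no main effects, the predicted signal is higher in the two cells where both neighborhoods are large or both are small than in the two mixed cells, which is the qualitative pattern described for the AG.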

In summary, activation in the AG appears to reflect the ambiguity of language attribution based on a comparison of coactivation of lexical representations from the two languages, as reflected by orthographic neighborhood sizes for each language. This result is consistent with the activation patterns of left AG in the L1 study of Binder and colleagues (2003), who found an increase in BOLD signal for letter strings that require more laborious mapping to lexical word forms, such as low-frequency compared to high-frequency words. Importantly, our findings extend these previous results to mixed-language settings and language-nonselective lexical search. Similar to the main effects of lexical neighborhood sizes, these activations were independent of participants' language decisions.

The above contrasts of beta-values against 0 describe the brain regions where each of the psycholinguistic variables explained unique variance. Additionally, for each region identified by the main effects, we asked whether the respective variable explained significantly more unique variance than each of the remaining three variables, by computing the respective difference contrasts masked by the main effects. The results showed for all main effects that beta-values for other variables significantly differed from the beta-value of the contrast-defining variable.

The contrast between responses German and responses English produced no significant activation clusters even when lowering the threshold to p < .01. Although the univariate analysis yielded no significant activations for the contrast between German and English categorizations, multivariate analyses (MVPA), which are sensitive to differences in activation patterns rather than activation levels in single voxels, may provide additional insights. Differences in activation patterns would point to an involvement of a region in the language categorization process, whereas an absence of differences (reflected in chance-level classification performance) would suggest that a region is not directly involved in the language decision process. We thus went on with a multivariate analysis to assess differences in patterns of neural activations between trials with German and English responses.

MVPAs—Decoding of Language Decisions

The aim of the MVPA was to identify differences in activation patterns, depending on whether participants labeled PWs as German or English, in regions modulated by psycholinguistic variables in the univariate analysis. The average accuracy of classifiers for each of the ROIs identified through the univariate analysis (Table 2) is summarized in Figure 3A and Table 6. Significant differences between activation patterns on trials with response German and trials with response English were found in clusters located in the left occipital lobe, the left SPL, the left AG, and the right postcentral gyrus. In other words, a classifier trained on a subset of data from each of these regions could identify the response of a participant with above-chance accuracy based on the activation pattern from an independent subset of data from the respective region.
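The logic of training on one subset of trials and testing on an independent subset, followed by a permutation test and Bonferroni correction, can be sketched in Python. This sketch rests on several assumptions: a nearest-centroid classifier stands in for whatever classifier was actually used, the leave-one-fold-out scheme, the toy two-voxel patterns, and all function names are hypothetical.

```python
import random
from statistics import fmean


def centroid(patterns):
    """Voxel-wise mean of a list of activation patterns."""
    return [fmean(col) for col in zip(*patterns)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cv_accuracy(patterns, labels, folds):
    """Leave-one-fold-out accuracy of a nearest-centroid classifier."""
    correct = 0
    for held_out in sorted(set(folds)):
        train = [(p, l) for p, l, f in zip(patterns, labels, folds) if f != held_out]
        test = [(p, l) for p, l, f in zip(patterns, labels, folds) if f == held_out]
        centroids = {lab: centroid([p for p, l in train if l == lab])
                     for lab in sorted({l for _, l in train})}
        for p, l in test:
            guess = min(centroids, key=lambda lab: sq_dist(p, centroids[lab]))
            correct += guess == l
    return correct / len(patterns)


def permutation_p(patterns, labels, folds, n_perm=200, seed=0):
    """Share of label shufflings that decode at least as well as the true labels."""
    rng = random.Random(seed)
    observed = cv_accuracy(patterns, labels, folds)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        hits += cv_accuracy(patterns, shuffled, folds) >= observed
    return observed, (hits + 1) / (n_perm + 1)


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p * n_tests)


# Hypothetical two-voxel patterns: 12 trials in 3 folds (e.g., runs).
patterns = [[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.9],
            [1.1, 0.2], [0.8, 0.1], [0.2, 1.1], [0.1, 0.8],
            [1.0, 0.0], [0.9, 0.2], [0.0, 1.0], [0.2, 0.9]]
labels = ["German", "German", "English", "English"] * 3
folds = [0] * 4 + [1] * 4 + [2] * 4

accuracy, p_value = permutation_p(patterns, labels, folds)
p_corrected = bonferroni(p_value, n_tests=12)  # 12 ROIs, as listed in Table 6
```

Because the held-out fold never contributes to the centroids, above-chance accuracy indicates that the response-related pattern generalizes across independent data rather than reflecting overfitting.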

Figure 3. 

Results of MVPAs. (A) Decoding accuracy for classifiers trained to discriminate between trials with English responses and trials with German responses in ROIs defined by whole-brain univariate contrasts. Activation patterns in clusters in left occipital cortex, left parietal lobe, and right motor cortex contained information about participants' decisions. *Bonferroni-corrected p < .05. (B) Results of MVPAs' seven-voxel radius searchlight decoding of participants' responses. Two clusters in the right hemisphere located in right supramarginal gyrus and SPL produced decoding accuracies above chance level. Clusters were significant at cluster-level FWE-corrected p < .05, with peak-level uncorrected p < .001. BF_G = mean bigram frequency in German; BF_E = mean bigram frequency in English; OLD_G = Levenshtein orthographic neighborhood size in German; OLD_E = Levenshtein orthographic neighborhood size in English; LING = lingual gyrus; SMG = supramarginal gyrus; PC = postcentral gyrus; OCC = occipital cortex. Error bars represent mean ± SEM.

Table 6. 

Results of ROI-based MVPA

ROI-defining Contrast   ROI          % Correct   SD     p        Significance
BFG                     Left OCC     56.9        11.5   .02      *
                        Left vOT     55.9        14.7   .28
                        Right OCC    56.6        11.7   .05
OLDG                    Left SMG     55.9        13.7   .11
                        Right Ling   51.9        12.8
                        Right PC     58.1        10.2   .004     *
neg. OLDE               Left SPL     63.1        17.0   .02      *
                        Left OCC     59.4        14.1   .02      *
                        Right OCC    52.8        8.5    .29
                        Right vOT    52.8        14.8
OLDG × OLDE             Left AG      61.3        13.7   < .001   *
                        Right AG     55.6        11.3   .07

p Values are based on permutation tests and a Bonferroni correction for multiple comparisons. BFG = mean bigram frequency in German; BFE = mean bigram frequency in English; OLDG = Levenshtein orthographic neighborhood size in German; OLDE = Levenshtein orthographic neighborhood size in English; Ling = lingual gyrus; SMG = supramarginal gyrus; PC = postcentral gyrus; OCC = occipital cortex.

*p ≤ .05.

To investigate whether language decisions were reflected in additional regions that were not identified in the univariate analysis, a searchlight classification across the whole brain was performed. The searchlight analysis revealed two right-hemispheric clusters with differences in activation patterns between trials with response German and trials with response English (Figure 3B): One cluster located at the border of the SPL (peak coordinates: [18, −46, 61], cluster size k = 47) and the precuneus, and another cluster in the inferior parietal lobule (peak coordinates: [45, −43, 28], k = 54), comprising parts of the supramarginal gyrus and the AG.

ROI Analysis in Bilateral vOT

On the basis of previous literature, we defined bilateral vOT ROIs as 8-mm spheres around the coordinates [±43, −57, −12], which is the typically reported location of the VWFA. The left-hemispheric ROI contained the occipito-temporal cluster activated by BFG, and the homologous right-hemispheric ROI contained the occipito-temporal cluster that was modulated by OLDE (Figure 4A). Beta-values were analyzed in a 2 (hemisphere) × 4 (contrast) repeated-measures ANOVA (see Figure 4B). The main effect of contrast (F(3, 57) = 7.1, p = .005) showed that beta-values for BFG were positive in both ROIs (left vOT: t(59) = 4.3, p < .0001; right vOT: t(59) = 3.4, p = .001), and that beta-values for OLDE were negative in both ROIs (left vOT: t(59) = −5.2, p < .001; right vOT: t(59) = −6.7, p < .001), whereas OLDG and BFE had no significant effects (p > .1). In summary, there was no difference between activation patterns in the two ROIs between trials with responses German and English. MVPA within the two ROIs showed that language decisions could be decoded with above-chance accuracy from the left-hemispheric ROI only (Figure 4C).
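A literature-based spherical ROI of this kind can be constructed as in the following sketch. Two assumptions are made here that the text does not state: "8-mm" is read as the sphere radius, and voxel centers are taken to lie on a 3-mm grid aligned with the MNI origin. The function name is hypothetical.

```python
import math


def sphere_voxels(center, radius_mm, voxel_mm):
    """Return grid points (multiples of voxel_mm) within radius_mm of center."""
    def axis_range(c):
        lo = math.ceil((c - radius_mm) / voxel_mm)
        hi = math.floor((c + radius_mm) / voxel_mm)
        return [i * voxel_mm for i in range(lo, hi + 1)]

    cx, cy, cz = center
    return [
        (x, y, z)
        for x in axis_range(cx)
        for y in axis_range(cy)
        for z in axis_range(cz)
        if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= radius_mm ** 2
    ]


# Assumed parameters: 8-mm radius, 3-mm isotropic voxel grid.
left_vot = sphere_voxels((-43, -57, -12), radius_mm=8, voxel_mm=3)
right_vot = sphere_voxels((43, -57, -12), radius_mm=8, voxel_mm=3)
```

Because the two centers are mirror images across the midline and the assumed grid is symmetric about x = 0, both ROIs contain the same number of voxels, so beta-values and decoding accuracies are compared across hemispheres on equally sized regions.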

Figure 4. 

Sensitivity of vOT to lexical and sublexical language similarity and results of MVPA. (A) Results of univariate analysis in occipital lobes and location of vOT literature-based ROIs. (B) Beta-estimates in vOT ROIs. (C) Results of MVPA in vOT ROIs with classifiers trained to discriminate between trials with responses in German and English. Error bars indicate mean ± SEM.

DISCUSSION

Our aim was to investigate the neural representations of continuous statistics at the sublexical (mean bigram frequencies) and lexical (orthographic neighborhood sizes) levels in first (L1) and second (L2) language and their involvement in language attribution of letter strings during visual word processing.

Behaviorally, sublexical statistics affected response speed, with faster responses to sublexically L1-similar letter strings and slower responses to sublexically L2-similar letter strings, whereas lexical similarity to both languages biased language attribution. These findings are in line with our previous study on the effects of the difference between similarity to L1 and L2 at the sublexical and lexical levels on bilinguals' language decisions: language decisions became faster as the difference between L1 and L2 mean bigram frequencies increased (Oganian et al., 2015). Moreover, our current results suggest that our previous results were indeed due to sensitivity to both an increase in sublexical similarity to the L1 and a decrease in sublexical similarity to the L2. Importantly, our findings converge with our previous finding of a lack of direct effects of mean bigram frequencies on the decision variable, which we attributed to the similarity in mean bigram frequencies between German and English words. At the lexical level, we previously found that the difference in orthographic neighborhood sizes between L1 and L2 guided the choice of a language assignment (Oganian et al., 2015). The increase in German categorizations with lexical similarity to German and the increase in English categorizations with lexical similarity to English in the current data set demonstrate that our previous findings were actually due to independent assessment of lexical similarity to each of the languages.

The most prominent model of bilingual visual word recognition, BIA+ (Van Kesteren et al., 2012; Dijkstra & van Heuven, 2002), incorporates effects of language-unique sublexical units on language attribution (Casaponsa et al., 2014; Vaid & Frenck-Mestre, 2002). In its current form, however, the model does not take effects of continuous sublexical L1 and L2 statistics into account, nor does it predict effects of orthographic neighborhood size on language attribution. To incorporate our findings, the model would need to be implemented with language-specific statistical information at the sublexical level, as well as mutual connections between orthographic neighbors and language identity representations (e.g., as in the representational network in Shook & Marian, 2013; see also Oganian et al., 2015, for a more detailed discussion of this matter).

Changes in brain activation with similarity to L1 and L2 in occipital lobes and parieto-temporal regions provide several novel insights regarding the brain organization of L1 and L2 and the representation of sublexical and lexical statistics of both languages. First, they suggest that sublexical orthographic and phonological processing is tuned to L1 statistics even in a language-ambiguous context. Second, activation patterns in the AG provide direct support for language nonselective lexical search. Finally, our neuroimaging results yielded no differences between trials with different responses, indicating that, despite the sublexical L1-bias, the same network was activated for all PWs independent of participants' language decisions. Crucially, however, multivariate analyses revealed that distributed activation patterns throughout the reading network did reflect language decisions, in particular in regions along the ventral visual stream implicated in sublexical processing. Information on language attribution was also encoded in brain regions associated with decision-making (right PC and precuneus).

The tuning of bilateral vOT to L1 sublexical statistics in our study extends previous monolingual studies (Vinckier et al., 2007; Binder et al., 2006) to a language-ambiguous context. Although previous studies found greater recruitment of right vOT for reading in L2 (Leonard et al., 2011; Leonard, Brown, & Travis, 2010; Perfetti et al., 2007), we found no evidence for neural encoding of L2 sublexical statistics in this or any other region. As L2 sublexical statistics affected RTs, it is unlikely that sublexical processing was exclusively determined by L1 statistics. Rather, the lack of distinct neural correlates for L2 bigram frequencies might be due to a spatially more distributed representation of this variable in bilinguals, along with considerable interindividual differences in L2 functional localization (Dehaene et al., 1997), in particular as our participants were late L1-dominant bilinguals.

When lexical similarity was not diagnostic for language attribution (e.g., for the PW tald, which has many orthographic neighbors in both languages), activation in the AG was increased and RTs were longer, implying more challenging attribution and a more demanding mapping effort. This result extends previous studies in L1 (Binder et al., 2003) to a concurrent manipulation of lexical neighborhoods in L1 and L2, providing a direct demonstration of the sensitivity of bilateral AG to lexical organization of both languages and neuronal evidence for bilinguals' nonselective activation of both languages. Moreover, we found the same activation patterns in left and right AG, providing novel evidence for bilateral AG involvement in reading (Carreiras et al., 2009), different from the traditional view implicating solely the left AG in reading processes (for a review, see, e.g., Carreiras et al., 2014).

Curiously, lexical similarity to L2 was inversely correlated with the BOLD signal in bilateral vOT, more posterior visual cortices, and the left SPL; that is, PWs with few orthographic neighbors in English induced greater activation. We interpret this pattern as additional sublexical processing of PWs with low lexical similarity to English. This interpretation is supported by the involvement of left SPL in reading tasks requiring increased attentional resources (Reilhac, Peyrin, Démonet, & Valdois, 2013; Segal & Petrides, 2012; Chen, Fu, Iversen, Smith, & Matthews, 2002). Recruitment of additional processing resources at the sublexical level for L2-dissimilar PWs is best interpreted in the framework of the grain size theory. The grain size theory posits that reading in irregular orthographies relies mainly on whole-word similarities to deduce the correct orthography-to-phonology mapping, whereas in regular orthographies single letters or small letter chunks can be mapped correctly to phonemes (Ziegler & Goswami, 2005). As such, reading of the irregular English orthography relies heavily on whole-word similarities (Koyama, Stein, Stoodley, & Hansen, 2013). When the number of similar words is not sufficient for a whole-word-based reading strategy, however, reliance on sublexical representations increases, leading to additional occipital lobe activity. Whether this analysis is conducted with respect to first or second language orthography cannot be determined from our data. Yet, this pattern does provide evidence for different neuronal consequences of first and second language orthographic patterns.

Lexical similarity to the L1 was positively correlated with activation in the left supramarginal gyrus and TPJ, as well as the right precentral cortex. Previous findings of positive correlations between lexical neighborhood size in the native language and activation of the left supramarginal gyrus (Prabhakaran, Blumstein, Myers, Hutchison, & Britton, 2006) were interpreted as evidence for phonological mapping based on lexical similarity. Similarly, the TPJ has been associated with the retrieval and encoding of phonology during visual word processing (Wilson et al., 2009; Hickok & Poeppel, 2007). Although these regions are typically reported in studies of real word naming, strong involvement of phonological encoding is very likely in this study as well, as a means of assessing language similarity. In our data, this effect was present for L1 but not L2, suggesting that despite the bilingual context our participants preferentially focused on German grapheme-to-phoneme conversion rules when processing language-ambiguous stimuli. This implies that the generally nonselective phonological access (Kroll et al., 2006) can be subject to a strong dominant language (L1) bias even in the processing of single words, and not only in sentential contexts (Schwartz & Kroll, 2006), in particular for L1-dominant bilinguals such as the participants in this study (Spalek, Hoshino, Wu, Damian, & Thierry, 2014). The positive correlation between L1 lexical similarity and activation in right motor cortex was not part of our hypotheses and is probably best explained as a consequence of the task environment. As our participants gave their responses with their left hand and lexical similarity to German was a strong predictor of participants' language decisions, this correlation most likely reflects evidence accumulation or decision preparation (Donner, Siegel, Fries, & Engel, 2009; Heekeren, Marrett, & Ungerleider, 2008).

Language decisions were associated with distributed activation patterns throughout the widespread network identified in the univariate analyses, as revealed by our multivariate analyses. In particular, activation patterns throughout the left-hemispheric ventral stream of visual word processing predicted participants' language decisions. The low temporal resolution of fMRI makes it challenging to determine whether this is the result of feedback connections from later processing stages or of local computations. Nevertheless, these data may be the key to resolving the discrepancy between a shared neural network for both languages and bilinguals' ability to make language decisions, as they show that activation patterns, but not mean activations, throughout the shared neural network reflect language decisions. Interestingly, although univariate results did not differ between left and right AG, activation patterns in left but not right AG were predictive of participants' decisions, suggesting that left AG encodes language identity, although both left and right AG are involved in the task. Furthermore, the searchlight analysis revealed additional encoding of language identity in right parietal cortex, including parts of AG and supramarginal gyrus, posterior to the left AG activation found in the univariate analysis. Although several previous reports associated this region with the correct use of multiple languages (Price, Green, & von Studnitz, 1999; Pötzl, 1930), the role of right temporoparietal regions in bilingual linguistic processing remains underspecified. Our results open up the intriguing possibility that specialized neural populations in the right parietal lobe represent abstract language identities.

The searchlight analysis and the ROI analysis operated on different spatial scales: the searchlight analysis with a 7-mm radius takes into account more local patterns (∼30 voxels), whereas the ROI analysis is based on more distributed patterns, as ROIs contained over 50 voxels. The difference between results from these two approaches suggests that language membership information was represented in local activations at a small spatial scale in the right parietal lobe and on a larger spatial scale in the left-hemispheric language processing network (Lee et al., 2012). Future research is required to investigate the task dependency of this network and to extend these findings to language attribution of auditorily presented words.
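
The relation between searchlight radius and pattern size can be illustrated with a short sketch that counts the voxels whose centers lie within a given radius of a center voxel. The isotropic voxel edge length is not restated in this section; the 3.5 mm used below is purely an assumption, chosen because it yields a voxel count close to the ∼30 voxels mentioned above, and the function name is hypothetical.

```python
from itertools import product


def searchlight_offsets(radius_mm, voxel_size_mm):
    """Return integer (dx, dy, dz) offsets of all voxels whose centers lie
    within `radius_mm` of the center voxel, assuming isotropic voxels."""
    reach = int(radius_mm // voxel_size_mm)  # furthest offset along one axis
    offsets = []
    for dx, dy, dz in product(range(-reach, reach + 1), repeat=3):
        squared_distance_mm = (dx * dx + dy * dy + dz * dz) * voxel_size_mm ** 2
        if squared_distance_mm <= radius_mm ** 2:
            offsets.append((dx, dy, dz))
    return offsets


if __name__ == "__main__":
    # Voxel sizes here are assumptions for illustration, not reported values.
    for voxel_size in (3.0, 3.5):
        n_voxels = len(searchlight_offsets(7, voxel_size))
        print(f"7-mm radius, {voxel_size} mm voxels: {n_voxels} voxels")
```

The sketch also shows how strongly the number of voxels per sphere depends on voxel size, which is why searchlight and ROI results are best compared in terms of voxel counts rather than millimeters.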

The significant classifier accuracies of 55–65% for the classification between trials with different behavioral responses match the typical accuracy levels of MVPA results in functional neuroimaging of cognitive functions (e.g., Albers et al., 2013; Raizada et al., 2010). Furthermore, accuracy was probably influenced by two factors. First, the stimuli used in this study were highly ambiguous, presumably rendering the decision process itself quite noisy. Second, language decisions in this task were based on several sources of evidence, represented through a widespread neural network as indicated by the univariate analyses. Accordingly, the highest decoding accuracy was found in right motor cortices, where the response (made with the left hand) was prepared. Crucially though, the classification analysis directly shows the involvement of a large subset of the reading network in the decision process.
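
To give a sense of scale for accuracies in this range, the sketch below computes a one-sided binomial tail probability for a given number of correct classifications against a 50% chance level. This is a generic illustration, not the statistical procedure used in this study (which is not restated in this section); the trial counts are hypothetical, and a binomial test assumes independent test trials, which cross-validated fMRI data only approximate.

```python
from math import comb


def binomial_p_above_chance(n_correct, n_trials, chance=0.5):
    """One-sided probability of observing at least `n_correct` correct
    classifications in `n_trials` if the classifier performed at chance."""
    return sum(
        comb(n_trials, k) * chance ** k * (1 - chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


if __name__ == "__main__":
    n_trials = 100  # hypothetical number of test trials
    for n_correct in (55, 60, 65):
        p_value = binomial_p_above_chance(n_correct, n_trials)
        print(f"{n_correct}/{n_trials} correct: p = {p_value:.4f}")
```

Whether a given accuracy differs reliably from chance therefore depends on the amount of data behind it, which is one reason why modest decoding accuracies can still be informative.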

A substantial body of research suggests that the basic structure of the neural network underlying reading in L1 and L2 does not depend on specific languages or populations under study (Chee, 2009; Klein et al., 2006; Kim, Relkin, Lee, & Hirsch, 1997). This is particularly important for research on language attribution, as the linguistic distance between languages probably dictates the ease of language attribution. Our multivariate results in two very similar languages provide strong evidence for neural separability of languages, suggesting that it is independent of linguistic distance. Nevertheless, an important point for future studies is whether language attribution is dealt with differently for more dissimilar languages.

Most previous studies of language decision focused on PW stimuli that contained combinations of letters that are typical for one language but illegal in the other, which were found to serve as strong cues to language decisions (Oganian et al., 2015; Casaponsa et al., 2014; Van Kesteren et al., 2012; Vaid & Frenck-Mestre, 2002). In this study, we extend this approach to letter strings that are all orthographically legal and pronounceable in both languages and for which language decisions must thus be based on more fine-grained differences in similarity to the languages in question. This setup allowed us to pinpoint the neural network involved in the processing of such fine-grained continuous frequency effects. However, our selection only partially covers real-world scenarios, where letter strings often contain language-unique bigrams. Future studies may investigate the neural effects of the interplay between more robust or dichotomous language cues (see, e.g., Casaponsa et al., 2014) on the one hand and more subtle or continuous manipulations like the one used in our study on the other hand (for a description of behavioral effects, see Oganian et al., 2015).

To summarize, our brain imaging results provide a novel description of the neural representation of language membership in and outside the classical language processing network. Some of our findings, most prominently the sensitivity of the AG to first and second language lexical structure, support the notion of language nonselective processing as advocated by common cognitive models of bilingual language processing (Dijkstra & van Heuven, 2002). On the other hand, our data also suggest that language nonselective processing might be confined to lexical search, whereas sublexical and phonological processing at the neural level seems to follow a strong bias toward the native language. Most importantly, our data provide evidence for the representation of language membership within the core language processing network, rather than only as separate abstract entities. The problem of reconciling this latter finding with the notion of external language nodes in the BIA+ model should be taken into account in future models of the bilingual visual word recognition process.

Conclusions

Research on the neural networks involved in first and second language processing is central to the neurobiology of language. It has repeatedly been shown that native and second language processing rely on the same network. It is fundamental, however, to understand how this one shared network supports bilinguals' ability to recognize the language of an input. Our data provide an important step forward by showing that L1 and L2 similarity statistics are handled differently by the language network, with preferential tuning to L1. We also provide evidence for sensitivity to continuous language-specific statistics at all stages of visual word processing and for their involvement in language attribution. Although both languages are processed by a shared neural network, distributed activation patterns in this network, including early visual processing in the left occipital lobe, differentiated between languages. This finding offers a neural mechanism for language attribution, which could not be pinpointed based on univariate analyses. Moreover, we describe the involvement of right parietal regions outside the classical language network in language attribution. Future studies should investigate the dependence of these representations on the combination of languages, bilingual populations, and tasks involved.

Acknowledgments

This research was supported by the Deutsche Forschungsgemeinschaft (GRK1589/1 fellowship to Y. O.). We thank Frederike Albers for assistance in data collection and Assaf Breska for fruitful discussions of the manuscript.

Reprint requests should be sent to Yulia Oganian, Institut für Psychologie, Freie Universitaet Berlin, Habelschwerter Allee 45, JK25/215, 14195 Berlin, Germany, or via e-mail: yulia.oganian@fu-berlin.de.

Note

1. We would like to note that such mixed-language sentences have become quite typical in bilinguals' communication not only in speech but also in informal written communication, such as email, SMS, or chat.

REFERENCES

Abutalebi, J. (2008). Neural aspects of second language representation and language control. Acta Psychologica, 128, 466–478.

Baayen, R. H. (2008). Analyzing linguistic data. A practical introduction to statistics. Cambridge, UK: Cambridge University Press.

Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59, 390–412.

Balota, D. A., Yap, M. J., Cortese, M. J., Hutchison, K. A., Kessler, B., Loftis, B., et al. (2007). The English Lexicon Project. Behavior Research Methods, 39, 445–459.

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68, 255–278.

Binder, J. R., McKiernan, K. A., Parsons, M. E., Westbury, C. F., Possing, E. T., Kaufman, J. N., et al. (2003). Neural correlates of lexical access during visual word recognition. Journal of Cognitive Neuroscience, 15, 372–393.

Binder, J. R., Medler, D. A., Westbury, C. F., Liebenthal, E., & Buchanan, L. (2006). Tuning of the human left fusiform gyrus to sublexical orthographic structure. Neuroimage, 33, 739–748.

Brysbaert, M., Buchmeier, M., Conrad, M., Jacobs, A. M., Bölte, J., & Böhl, A. (2011). The word frequency effect. Experimental Psychology, 58, 412–424.

Carreiras, M., Baayen, R. H., Perea, M., & Frost, R. (2014). The what, when, where, and how of visual word recognition. Trends in Cognitive Sciences, 18, 90–98.

Carreiras, M., Seghier, M. L., Baquero, S., Estévez, A., Lozano, A., Devlin, J. T., et al. (2009). An anatomical signature for literacy. Nature, 461, 983–986.

Casaponsa, A., Carreiras, M., & Duñabeitia, J. A. (2014). Discriminating languages in bilingual contexts: The impact of orthographic markedness. Frontiers in Psychology, 5, 424.

Chee, M. W. L. (2009). fMR-adaptation and the bilingual brain. Brain and Language, 109, 75–79.

Chen, Y., Fu, S., Iversen, S. D., Smith, S. M., & Matthews, P. M. (2002). Testing for dual brain processing routes in reading: A direct contrast of Chinese character and pinyin reading using fMRI. Journal of Cognitive Neuroscience, 14, 1088–1098.

Cohen, L., Jobert, A., Le Bihan, D., & Dehaene, S. (2004). Distinct unimodal and multimodal regions for word processing in the left temporal cortex. Neuroimage, 23, 1256–1270.

Consonni, M., Cafiero, R., Marin, D., Tettamanti, M., Iadanza, A., Fabbro, F., et al. (2013). Neural convergence for language comprehension and grammatical class production in highly proficient bilinguals is independent of age of acquisition. Cortex, 49, 1252–1258.

Dale, A. M. (1999). Optimal experimental design for event-related fMRI. Human Brain Mapping, 8, 109–114.

Dehaene, S., & Cohen, L. (2011). The unique role of the visual word form area in reading. Trends in Cognitive Sciences, 15, 254–262.

Dehaene, S., Cohen, L., Sigman, M., & Vinckier, F. (2005). The neural code for written words: A proposal. Trends in Cognitive Sciences, 9, 335–341.

Dehaene, S., Dupoux, E., Mehler, J., Cohen, L., Paulesu, E., Perani, D., et al. (1997). Anatomical variability in the cortical representation of first and second language. NeuroReport, 8, 3809–3815.

Dijkstra, T., & van Heuven, W. J. B. (2002). The architecture of the bilingual word recognition system: From identification to decision. Bilingualism: Language and Cognition, 5, 175–197.

Donner, T. H., Siegel, M., Fries, P., & Engel, A. K. (2009). Buildup of choice-predictive activity in human motor cortex during perceptual decision making. Current Biology, 19, 1581–1585.

Grainger, J., & Dijkstra, T. (1992). On the representation and use of language information in bilinguals. In R. J. Harris (Ed.), Cognitive processing in bilinguals (Vol. 83, pp. 207–220).

Hauk, O., Davis, M. H., & Pulvermüller, F. (2008). Modulation of brain activity by multiple lexical and word form variables in visual word recognition: A parametric fMRI study. Neuroimage, 42, 1185–1195.

Heekeren, H. R., Marrett, S., & Ungerleider, L. G. (2008). The neural systems that mediate human perceptual decision making. Nature Reviews Neuroscience, 9, 467–479.

Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8, 393–402.

Holcomb, P. J., Grainger, J., & O'Rourke, T. (2002). An electrophysiological study of the effects of orthographic neighborhood size on printed word perception. Journal of Cognitive Neuroscience, 14, 938–950.

Kim, K. H., Relkin, N. R., Lee, K. M., & Hirsch, J. (1997). Distinct cortical areas associated with native and second languages. Nature, 388, 171–174.

Klein, D., Zatorre, R. J., Chen, J.-K., Milner, B., Crane, J., Belin, P., et al. (2006). Bilingual brain organization: A functional magnetic resonance adaptation study. Neuroimage, 31, 366–375.

Koyama, M. S., Stein, J. F., Stoodley, C. J., & Hansen, P. C. (2013). Cerebral mechanisms for different second language writing systems. Neuropsychologia, 51, 2261–2270.

Kriegeskorte, N., Goebel, R., & Bandettini, P. (2006). Information-based functional brain mapping. Proceedings of the National Academy of Sciences, U.S.A., 103, 3863–3868.

Kroll, J. F., Bobb, S. C., & Wodniecka, Z. (2006). Language selectivity is the exception, not the rule: Arguments against a fixed locus of language selection in bilingual speech. Bilingualism: Language and Cognition, 9, 119.

Lee, Y.-S., Turkeltaub, P., Granger, R., & Raizada, R. D. S. (2012). Categorical speech processing in Broca's area: An fMRI study using multivariate pattern-based analysis. The Journal of Neuroscience, 32, 3942–3948.

Lemhöfer, K., & Broersma, M. (2011). Introducing LexTALE: A quick and valid Lexical Test for Advanced Learners of English. Behavior Research Methods, 44, 325–343.

Leonard, M. K., Brown, T. T., & Travis, K. E. (2010). Spatiotemporal dynamics of bilingual word processing. Neuroimage, 49, 3286–3294.

Leonard, M. K., Torres, C., Travis, K. E., Brown, T. T., Hagler, D. J., Dale, A. M., et al. (2011). Language proficiency modulates the recruitment of non-classical language areas in bilinguals. PLoS ONE, 6, e18240.

Li, P., Sepanski, S., & Zhao, X. (2006). Language history questionnaire: A web-based interface for bilingual research. Behavior Research Methods, 38, 202–210.

Li, Y., Yang, J., Scherf, S. K., & Li, P. (2013). Two faces, two languages: An fMRI study of bilingual picture naming. Brain and Language, 127, 452–462.

Mechelli, A., Gorno-Tempini, M. L., & Price, C. J. (2003). Neuroimaging studies of word and pseudoword reading: Consistencies, inconsistencies, and limitations. Journal of Cognitive Neuroscience, 15, 260–271.

Milner, A. D., & Goodale, M. A. (1995). The visual brain in action. Oxford: Oxford University Press.

Moll, K., & Landerl, K. (2010). SLRT-II–Verfahren zur Differentialdiagnose von Störungen der Teilkomponenten des Lesens und Schreibens. Bern: Hans Huber.

Norman, K. A., Polyn, S. M., Detre, G. J., & Haxby, J. V. (2006). Beyond mind-reading: Multi-voxel pattern analysis of fMRI data. Trends in Cognitive Sciences, 10, 424–430.

Oganian, Y., Conrad, M., Aryani, A., Heekeren, H. R., & Spalek, K. (2015). Interplay of bigram frequency and orthographic neighborhood statistics in language membership decision. Bilingualism: Language and Cognition.

Peereman, R., & Content, A. (1995). Neighborhood size effect in naming: Lexical activation or sublexical correspondences? Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 409–421.

Pereira, F., Mitchell, T., & Botvinick, M. (2009). Machine learning classifiers and fMRI: A tutorial overview. Neuroimage, 45(1 Suppl.), S199–S209.

Perfetti, C. A., Liu, Y., Fiez, J., Nelson, J., Bolger, D. J., & Tan, L.-H. (2007). Reading in two writing systems: Accommodation and assimilation of the brain's reading network. Bilingualism: Language and Cognition, 10, 131.

Pötzl, O. (1930). Aphasie und mehrsprachigkeit. Zeitschrift Für Die Gesamte Neurologie Und Psychiatrie, 651, 145–162.

Prabhakaran, R., Blumstein, S. E., Myers, E. B., Hutchison, E., & Britton, B. (2006). An event-related fMRI investigation of phonological-lexical competition. Neuropsychologia, 44, 2209–2221.

Price, C. J. (2012). A review and synthesis of the first 20 years of PET and fMRI studies of heard speech, spoken language and reading. Neuroimage, 62, 816–847.

Price, C. J., Green, D., & von Studnitz, R. E. (1999). A functional imaging study of translation and language switching. Brain, 122, 2221–2235.

Raizada, R. D. S., Tsao, F.-M., Liu, H.-M., & Kuhl, P. K. (2010). Quantifying the adequacy of neural representations for a cross-language phonetic discrimination task: Prediction of individual differences. Cerebral Cortex, 20, 1–12.

Reilhac, C., Peyrin, C., Démonet, J.-F., & Valdois, S. (2013). Role of the superior parietal lobules in letter-identity processing within strings: fMRI evidence from skilled and dyslexic readers. Neuropsychologia, 51, 601–612.

Schulpen, B., Dijkstra, T., Schriefers, H. J., & Hasper, M. (2003). Recognition of interlingual homophones in bilingual auditory word recognition. Journal of Experimental Psychology: Human Perception and Performance, 29, 1155–1178.

Schwartz, A. I., & Kroll, J. F. (2006). Bilingual lexical activation in sentence context. Journal of Memory and Language, 55, 197–212.

Segal, E., & Petrides, M. (2012). The anterior superior parietal lobule and its interactions with language and motor areas during writing. The European Journal of Neuroscience, 35, 309–322.

Shook, A., & Marian, V. (2013). The bilingual language interaction network for comprehension of speech. Bilingualism: Language and Cognition, 16, 304–324.

Spalek, K., Hoshino, N., Wu, Y. J., Damian, M., & Thierry, G. (2014). Speaking two languages at once: Unconscious native word form access in second language production. Cognition, 133, 226–231.

Torgesen, J. K., Wagner, R. K., & Rashotte, C. A. (1999). TOWRE-2 Test of Word Reading Efficiency. Austin, TX: Pro-ed Publishers.

Vaid, J., & Frenck-Mestre, C. (2002). Do orthographic cues aid language recognition? A laterality study with French–English bilinguals. Brain and Language, 82, 47–53.

Van Kesteren, R., Dijkstra, T., & de Smedt, K. (2012). Markedness effects in Norwegian–English bilinguals: Task-dependent use of language-specific letters and bigrams. The Quarterly Journal of Experimental Psychology, 65, 2129–2154.

Vinckier, F., Dehaene, S., Jobert, A., Dubus, J. P., Sigman, M., & Cohen, L. (2007). Hierarchical coding of letter strings in the ventral stream: Dissecting the inner organization of the visual word-form system. Neuron, 55, 143–156.

Wartenburger, I., Heekeren, H. R., Abutalebi, J., Cappa, S. F., Villringer, A., & Perani, D. (2003). Early setting of grammatical processing in the bilingual brain. Neuron, 37, 159–170.

Wilson, S. M., Isenberg, A. L., & Hickok, G. (2009). Neural correlates of word production stages delineated by parametric modulation of psycholinguistic variables. Human Brain Mapping, 30, 3596–3608.

Wu, Y. J., & Thierry, G. (2010). Chinese–English bilinguals reading English hear Chinese. The Journal of Neuroscience, 30, 7646–7651.

Yarkoni, T., Balota, D. A., & Yap, M. J. (2008). Moving beyond Coltheart's N: A new measure of orthographic similarity. Psychonomic Bulletin & Review, 15, 971–979.

Ziegler, J. C., & Goswami, U. (2005). Reading acquisition, developmental dyslexia, and skilled reading across languages: A psycholinguistic grain size theory. Psychological Bulletin, 131, 3–29.

Author notes

* K. S. and H. R. H. contributed equally to this work and share senior authorship.