Abstract
Language rehabilitation centers on modifying language use through experience-based neuroplasticity. Statistical learning of language is essential to its acquisition and likely its rehabilitation following brain injury, but its corresponding brain networks remain elusive. Coordinate-based meta-analyses were conducted to identify common and distinct brain activity across 25 studies coded for meta-data and experimental contrasts (Grammatical and Ungrammatical). The resultant brain regions served as seeds for profiling functional connectivity in large task-independent and task-dependent data sets. Hierarchical clustering of these profiles grouped brain regions into three subnetworks associated with statistical learning processes. Functional decoding clarified the mental operations associated with those subnetworks. Results support a left-dominant language subnetwork and two cognitive control networks as scaffolds for language rule identification, maintenance, and application in healthy adults. These data suggest that cognitive control is necessary to track regularities across stimuli and imperative for rule identification and application of grammar. Future empirical investigation of these brain networks for language learning in individuals with brain injury will clarify their prognostic role in language recovery.
1 Introduction
Rehabilitation for cognitive-communication impairments following left hemisphere stroke requires relearning, as behavioral interventions train restorative or compensatory skills to improve performance. However, little is known about how the damaged brain relearns. In a first step to investigating models of learning for rehabilitation of language, we identify language learning paradigms and their corresponding brain networks in neurotypical, healthy adults. We focus on learning that is known to be involved in language acquisition, that is, statistical learning (c.f., Perruchet & Pacton, 2006), given that this process is also utilized in adult language learning.
Most often, statistical language learning is investigated through experimental paradigms using artificial grammar learning in which participants are exposed to novel language-like stimuli and must delineate salient cues, or rules, in incoming stimuli without explicit instruction to do so. The learner may receive reinforcement when that salient stimulus is paired with a relevant response, but does not otherwise receive explicit feedback. Individuals with aphasia and traumatic brain injury also evidence this type of learning in statistical learning paradigms (Cope et al., 2017; Peñaloza et al., 2015, 2017; Schuchard et al., 2017; Schuchard & Thompson, 2017; Vallila-Rohter & Kiran, 2013a, 2013b), suggesting that this type of learning relies on brain systems that differ from those underlying the language impairment. However, there is no consensus on the brain regions or networks associated with statistical learning of language-like stimuli. Modeling statistical learning in the brain in healthy adults can provide a priori hypotheses for investigating relearning in individuals with acquired brain injury. With such a brain-based model, investigation may commence to determine whether and how a statistical learning brain network is impaired or may be optimized in response to intervention.
1.1 Statistical Learning (SL)
Statistical learning (SL) is the ability to learn the lawfulness of sequenced sounds, characters, or linguistic units to efficiently respond to those stimuli without explicit knowledge of the rules or conscious strategy use. SL is typically measured experimentally utilizing tasks that provide no explicit instruction, and yet an increase in accuracy implies that a participant has implicitly learned the rules of the task (Reber, 1967). SL may be more narrowly defined as the ability to detect transitional or distributed dependencies among elements in close temporal or spatial proximity of each other in a perceived stream of input (Christiansen, 2019; Perruchet & Pacton, 2006). SL has been investigated in language learning because individuals are known to utilize statistical regularities (e.g., predictive dependencies in word segmentation, Saffran, 2001) when learning to extract phrase structures, or syntax, that are organized hierarchically. Phrases are marked by dependencies, that is, one word class depends on another specific word class to follow it. For example, determiners such as “a” or “the” must be followed by a noun. However, the phrases and their constituent probabilities are organized in language such that they do not happen bidirectionally (e.g., a noun does not require a determiner) and thus are not based simply on co-occurrence. An example of a transitional probability may be: given element A, what is the likelihood that element B will follow? Or, given a determiner, what is the likelihood that a noun will follow? Learning of transitional probabilities is evident when learners can detect phrasal units using predictive dependencies above chance, even when all surface-level cues (e.g., intonation or lexical stress) are removed (Saffran, 2001; Saffran et al., 1996). The statistical dependencies between the word classes (e.g., nouns and verbs, or nouns and determiners) are as complex as surface-level cues and provide a framework for the syntactic rules that can be implicitly learned from natural language. Regardless of the considerable spectrum of complexity that may be explained by recognizing statistical dependencies, statistical learning may be considered a subtype of implicit learning as both are grounded in the basic processes of attending to salient cues, learning, and memory, with an uncontroversial overlap through chunking (Christiansen, 2019). That is, statistical learning typically involves rule application that is chunked (stored as a unit of information) once acquired, and is more efficient when employed for computation in the moment (Batterink et al., 2015).
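To make the computation concrete, the following minimal Python sketch (illustrative only; the token stream and function are hypothetical and not drawn from any of the cited studies) estimates forward transitional probabilities from a sequence of word-class tokens, showing how a determiner-to-noun dependency can be strong even though the reverse dependency is not.

```python
from collections import Counter, defaultdict

def transitional_probabilities(tokens):
    """Estimate P(next element | current element) from one token stream."""
    successor_counts = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        successor_counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(counts.values()) for nxt, n in counts.items()}
        for current, counts in successor_counts.items()
    }

# Toy word-class stream: determiners are always followed by nouns,
# but nouns are followed by a mix of word classes.
stream = ["det", "noun", "verb", "noun", "det", "noun", "verb", "det", "noun"]
tp = transitional_probabilities(stream)
print(tp["det"])   # {'noun': 1.0} -> a determiner strongly predicts a noun
print(tp["noun"])  # mixed continuations -> weaker forward dependency
```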
1.2 Artificial or non-native grammar learning
While SL has been studied using several experimental paradigms, artificial or non-native grammar tasks most closely mimic learning of natural language as they require SL of syntactic rule learning. A person is exposed to a language that requires segmentation of sound strings into words and/or organization of words based on statistically learned rules. Thus, learning requires the recognition of regularity in the input and subsequent extraction of rules from the stimuli. The rules in artificial grammars are embedded among strings of elements (letters, characters, sounds) and the learner is exposed to multiple exemplars of the strings (Reber, 1967; Vadinova et al., 2020). Non-native grammars, instead, are made up of pseudo words or characters arranged to mimic the syntactic structure of real languages, or may be real languages that are unfamiliar to the learner (Hauser et al., 2012; Morgan-Short et al., 2015; Opitz & Friederici, 2003; Turk-Browne et al., 2009; Yusa et al., 2011). Adults have been shown to reliably extract regularities in unfamiliar languages or artificial grammars, allowing them to make judgments about the grammaticality of an utterance and perform above chance (Bahlmann et al., 2008; Lieberman et al., 2004; Petersson et al., 2004, 2012; Plante et al., 2015; Reber, 1967). Artificial or non-native grammar tasks are composed of learning and testing phases. In the learning phase, participants are exposed to multiple repetitions of grammatical stimuli demonstrating permissible sentence composition in the grammar. Participants may or may not receive instructions for the task, but the rules are never explicitly taught. Participants then enter the testing phase during which they make grammaticality judgments about presented grammatical sentences, random word-ordered sentences, or sentences that partially follow a rule. Participants can efficiently and accurately judge the grammaticality of sentences created from an artificial grammar while being unaware of the underlying rules or of strategies they may have used.
1.3 Neural models of language learning through SL
Differing lines of research exist that contribute to understanding how the brain accomplishes SL. Some models center on distinguishing or clarifying the roles of domain-general or modality-specific processes necessary in different phases of learning. For example, Frost, Siegelman, Christiansen and colleagues (Christiansen, 2019; Christiansen et al., 2010; Christiansen & Chater, 2016; Frost et al., 2015; Siegelman et al., 2017, 2018; Siegelman & Frost, 2015) highlight the similarities across statistical learning tasks, that is, that learning requires extraction of salient features in multi-modal stimuli that may be patterned/rule-governed and integration of those patterns or rules to apply to other stimuli. In a review of neuroimaging findings in SL tasks, Frost et al. (2015) propose unimodal brain regions are dedicated to processing of modality-specific information (e.g., occipital cortex for visual stimuli, temporal cortex for auditory stimuli) and to recognizing statistical regularities in the stimuli to a greater extent than in random information. However, they also propose brain regions involved in SL regardless of stimulus modality that may be considered domain-general. These regions include the hippocampus, basal ganglia, thalamus, and prefrontal cortex that are thought to operationalize or modulate modality-specific information.
Other models of SL have been developed to explain word learning, specifically as it is observed in language acquisition in children, as well as for second language learning in adults. Incorporating several of the brain regions and pathways highlighted by Frost et al. (2015), Rodríguez-Fornells et al. (2009) developed a multi-faceted, functional neuroanatomical, and neurophysiological framework for statistical word learning. Their model posits three language learning interfaces—dorsal audio-motor, ventral meaning inference, and episodic-lexical—that are realized through integration of activity across several neurological structures and pathways. The dorsal audio-motor interface is primarily responsible for phonological processing involved in the retention and production of new words. This interface is made up of structures in the temporoparietal (e.g., the supramarginal and posterior temporal gyri) and frontal cortex (e.g., the posterior inferior frontal gyrus and ventral premotor cortex) and supports a dorsal rehearsal and processing route for word learning. The ventral meaning inference interface attaches meaning to new words via contextual linguistic and extralinguistic information, relying on integration of internal (e.g., previous knowledge) and external sources of information. The meaning inference interface engages brain regions responsible for storing and retrieving conceptual information (c.f., Winocur et al., 2010) that include the inferior, medial, and anterior temporal lobes and the ventral inferior frontal cortex. After repeated exposure to a new word, its episodic trace is proposed to be stored for long-term representation in the episodic-lexical interface, and at that point is independent of the medial temporal storage processes of meaning inference. Within this model, the basal ganglia are suggested to be responsible for integrating information received through each of the language learning interfaces. Specifically, the striatum receives and integrates input from multiple neocortical areas (c.f., Copland et al., 2021) that is sent back to the cortex through the thalamus, creating functional loops for cognitive processing as part of learning. Since the basal ganglia have been implicated in cognitive processes (e.g., executive functioning and attention), the model suggests that they are essential for language learning, beyond word learning to extend to rule learning (De Diego-Balaguer et al., 2008). As well, they pose a role of the medial temporal structures (hippocampus and parahippocampus) in recognition of the novelty and learning of meaning in this interface. Of course, integration across the interfaces is essential when learning occurs in environments where there are ambiguous or rule-violating language elements (Wahl et al., 2008). Ultimately, Rodríguez-Fornells and colleagues suggest that more research is necessary to clarify how the development and integration of the streams influence language learning, as well as to specify roles of the basal ganglia.
Another model of SL derives from the role of memory systems and their integral roles in language learning. Ullman and colleagues focus specifically on the type of memory, or memory system, engaged in learning language. The Declarative-Procedural Model (DPM, Ullman, 2004) proposes a one-to-one relationship between the type of linguistic information being learned (rule-governed or lexical) and its memory system. Like the episodic-lexical interface of Rodríguez-Fornells, the DPM distinguishes a mental lexicon that is learned explicitly, can be explicitly recalled, and is proposed to be subserved by the declarative memory system. This memory system involves the medial temporal lobes (hippocampus, dentate gyrus and subicular complex, parahippocampus, entorhinal and perirhinal cortex) for formation and consolidation of new memories (Winocur et al., 2010), and is distributed throughout the association cortex for long-term semantic and episodic memory storage (Batterink & Paller, 2019; Poldrack et al., 2001).
The DPM also proposes a mental grammar, a computational system that extracts regularities from language, and analyzes it based on stored knowledge of rules and constraints. The mental grammar is hypothesized as part of the procedural memory system, which is responsible for learning new skills and controlling established skills, habits, and procedures. Procedural memory utilizes brain regions similar or related to those for declarative memory, but also involves brain structures historically associated with motor skill learning (a frontal/basal ganglia network) or with conflict resolution/prediction of behavior (superior temporal lobe, parietal lobe, and cerebellum). Functionally, the basal ganglia are thought to be involved in implicit learning generally, but are specifically engaged for probabilistic rule learning, sequence learning, context-dependent rule selection, working memory maintenance, and attention shifting (Poldrack et al., 2001; Ullman, 2004). The striatal regions (caudate nucleus and putamen) are particularly involved in explicit and implicit language learning (Copland et al., 2021), but their connections with cortical regions appear to be gated based on the phase of learning. For example, Plante et al. (2015) found that activation in the right caudate nucleus is present immediately preceding behaviorally evidenced learning during a non-native grammar learning task, but this caudate activity is reduced once the task becomes more familiar. Similar findings have been reported in other artificial grammar learning studies (Bahlmann et al., 2008; Forkstam et al., 2006; Lieberman et al., 2004). The declarative mental lexicon subserves rapid learning following one stimulus presentation, which binds and associates it with arbitrarily related information. The procedural mental grammar evolves gradually given several presentations of stimuli, tends to be stable with limited flexibility, and exhibits largely unconscious or automatic application of the learned rules.
Specific to artificial grammar or non-native language learning, which differs from word learning in terms of the flexible application of grammatical rules, one region thought to be crucial is the left inferior frontal gyrus (IFG) (Tagarelli et al., 2019; Yang & Li, 2012). This region appears responsible for sequence learning of linguistic and non-linguistic stimuli (Forkstam et al., 2006; Karuza et al., 2013; Petersson et al., 2012; Ullman, 2004). In fact, Tagarelli and colleagues, through a series of coordinate-based meta-analyses contrasting grammar learning tasks, found the bilateral IFG to be involved in both word learning and grammar learning. While the left IFG is typically recognized for its involvement in speech and language production (i.e., Broca’s area), it may also serve a role more generally in learning and processing of rule-based sequences. It is also recognized for its role in maintaining information in working memory (Rogalsky et al., 2008). Thus, it is likely that the IFG’s role in working memory supports sequence learning by maintaining information to allow for its consideration or manipulation, which may be essential for the extraction and computation of underlying rules during learning (Arciuli, 2017; Rodríguez-Fornells et al., 2009; Thiessen et al., 2012).
This latter proposal, that the left IFG’s role in grammar learning may be for the general processing of rule-based sequences rather than specifically for language, is in line with the Frost and the Rodríguez-Fornells assertions that domain-general regions must be involved in SL regardless of the modality of the input. Along with the IFG, Tagarelli et al. (2019) found activity in a diffuse set of regions, including the premotor and supplementary motor cortex, anterior insula, as well as left-sided superior parietal lobule and angular gyrus during artificial grammar learning tasks whether the intent was to learn word meanings or to learn a grammar. A similar distribution of connections between the IFG and other brain regions is also proposed by Rodríguez-Fornells et al. (2009). These brain regions are also considered to be part of the dorsal attention (Corbetta & Shulman, 2002) and salience networks (Seeley et al., 2007), along with the cerebellum. For example, the superior and inferior parietal lobules and supramarginal gyrus, along with the anterior cingulate cortex, are engaged in non-native grammar learning tasks in individuals with higher rates of correct rejection of ungrammatical utterances, indicating more efficient learning (Plante et al., 2015). As well, the procedural memory system utilizes the cerebellar hemispheres (the dentate nucleus and the vermis) for error-based learning and error detection, two important aspects of grammaticality judgments in SL tasks (Ullman, 2004).
Thus, the brain activity engaged during artificial and non-native grammar learning tasks and other SL tasks converges on domain-general and modality-specific regions. Some evidence exists that certain brain systems may be engaged to a greater extent for specific features of language (i.e., lexical learning = medial temporal cortex; grammar learning = basal ganglia, Tagarelli et al., 2019). However, it may be that the inter-relationship between the regions is critical to understanding their roles or the processes necessary for SL of language, which Rodríguez-Fornells and colleagues suggest is mediated by the basal ganglia. In contrast, Tagarelli et al. (2019) found basal ganglia activity (specifically the left caudate head/body and anterior putamen) was unique to automatization of grammar skill.
The aim of the current study is to conduct a coordinate-based meta-analysis of artificial and non-native grammar learning in healthy adults. Specifically, this study will extend those meta-analytical findings to model functional connectivity profiles and to characterize them with behavioral decoding. There is not currently a meta-analysis that clarifies which brain regions are involved in SL of grammars, although two meta-analyses have examined language learning (Tagarelli et al., 2019) and non-linguistic sequence learning (Janacsek et al., 2020). While Tagarelli et al. (2019) examined artificial grammar learning in adults, their goal was to look at language learning in general and evaluate the DPM for distinctive regional activity associated with lexical versus grammatical learning. As a result, their meta-analysis excluded evaluation of transitional probabilities given that they rely on overlapping lexical and grammatical elements. Nonetheless, their data supported the DPM indicating declarative memory structure involvement in explicit language training and procedural memory structure involvement in statistical learning. A focus on the artificial and non-artificial grammars included in their analyses highlighted considerable overlap in the left inferior frontal gyrus for both. Therefore, the present study will extend the findings of Tagarelli et al. (2019) by identifying the brain regions that are commonly involved in SL during artificial and non-native grammar learning tasks. Further, the regions commonly active across studies in the meta-analysis will be used to establish functional connectivity profiles through leveraging of task-independent and task-dependent MRI datasets. These connectivity profiles will be grouped into subnetworks and associated with mental operations. This process will clarify not only the regions uniquely involved in SL of grammar, but also how those regions integrate with others that are aligned with memory or cognitive systems. We hypothesize that the regions identified as being essential for SL will be like those reported by Tagarelli et al. (2019) and hypothesized by the collective work of Frost, Rodríguez-Fornells, and Ullman. However, we further hypothesize that the evaluation of the functional connectivity profiles will clarify the domain-general networks, outside of the traditional language network, that are engaged.
2 Material and Methods
An overview of the methods utilized in the study is presented in Figure 1.
2.1 Literature search, filtering, and paper selection
A comprehensive literature search for peer-reviewed articles was conducted from March 27, 2020, to February 19, 2021. PubMed, PsycINFO, Google Scholar, and references of related articles were systematically searched. Additionally, article data from Tagarelli et al. (2019) provided by Ullman (personal communication, 2020) were searched to determine inclusion eligibility for the current study. Inclusion criteria for the studies were: study of healthy adults 18+ years, use of statistical language learning tasks, reporting of coordinates for whole-brain analyses, and reporting of experimental contrasts indicating rule learning. Search terms used to find relevant articles included: “implicit learning,” “implicit learning AND automaticity,” “implicit learning AND plasticity,” “implicit learning AND fMRI NOT disorder,” “implicit learning AND statistical learning AND MRI,” “artificial grammar,” “artificial grammar AND fMRI,” “statistical learning AND fMRI,” “artificial grammar learning AND fMRI,” “implicit learning grammar AND fMRI,” and “distributional learning AND fMRI.” Searches yielded over 800 papers. Titles and abstracts were then screened by the second author for studies involving language and including brain imaging. Secondary screening was for the following criteria: (1) fMRI or PET studies (not DTI, ERP, or structural imaging), (2) whole-brain activation analysis (not region-of-interest, resting-state functional connectivity, dynamic causal modeling, or independent components analysis), and (3) reporting results for healthy adult participants.
2.2 Meta-data coding
Experimental contrasts were categorized as Grammatical (rule-following) or Ungrammatical (rule-violating). Meta-data coding for each paper included task definition, stimulus type, stimulus presentation modality, response modality, amount and type of training/learning, feedback type and frequency, and the experimental contrasts. Contrasts from the coded papers included directional analyses (e.g., group differences or contrast comparisons), increasing activation, decreasing activation, correlation with task performance, and conjunction analyses. Only contrasts reporting (1) increased activation during either a rule-based learning (grammatical) or non-rule-based (ungrammatical) task, or (2) relevant comparisons between experimental conditions (e.g., rule-based > random) in healthy adults were included. Decreasing activations were excluded because, given that the data in the meta-analysis are derived from contrasts, decreasing activations are not clearly indicative of regions that are less active in a task as much as of regions that are not active in another task or condition (e.g., rest > task). Additionally, contrasts correlating functional activation during learning with test performance were excluded. Included grammatical contrasts were those for which activation during a grammatical task (learning or test phase) was contrasted with baseline or rest; or the comparison of activation in Grammatical > Ungrammatical conditions.
2.3 Modeled activation maps
Once the relevant papers were identified and experiments/contrasts chosen, the coordinates for brain activations were extracted. Coordinate-based meta-analyses were calculated in a revised version of the Activation Likelihood Estimation (ALE) algorithm (Eickhoff et al., 2009, 2012; Turkeltaub et al., 2011), conducted in the Neuroimaging Meta-Analysis Research Environment (NiMARE) v0.0.3, a centralized standard implementation of meta-analytic tools through Python (Salo et al., 2020). This step identified the brain regions most commonly active during statistical learning in imaging studies of healthy adult participants. This ALE algorithm modeled foci for each contrast with a spherical Gaussian blur with full width at half maximum determined by the number of subjects included in each experiment, representing uncertainty due to within-subject and across-lab variability. Any coordinates reported using the Talairach atlas (Talairach & Tournoux, 1988) were converted to the Montreal Neurological Institute (MNI) space (Collins et al., 1994; Evans et al., 1993) using tal2icbm (Laird et al., 2010; Lancaster et al., 2007). Then, a set of modeled activation maps for each experimental contrast was generated such that each voxel’s value corresponded to its maximum probability of activation. ALE values were calculated as the voxel-wise convolution of the modeled experimental contrasts, quantifying the spatial convergence across the brain. ALE values were transformed to p-values using a cumulative distribution function and thresholded at p < 0.001. A Monte Carlo approach was implemented to correct for multiple comparisons and determined a minimal cluster size through a set of 10,000 iterations. For every iteration, randomly selected gray matter mask coordinates replaced the foci, ALE values were calculated for the randomized dataset, and those values were transformed to p-values, thresholded at p < 0.001, and the maximum size of supra-threshold clusters was recorded. The maximum cluster sizes for each iteration were used to build a null distribution that was contrasted with the original ALE map so that those larger than the cluster size in the null distribution’s 95th percentile made up the family-wise error (FWE) corrected convergence maps (i.e., a cluster-forming threshold at voxel-level p < 0.001 and a cluster-extent threshold at FWE-corrected p < 0.05 were used to control for multiple comparisons). Surface-based and axial slice visualizations were derived with NiLearn plotting tools (Abraham et al., 2014) at https://github.com/mriedel56/surflay.git.
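As an illustration of how this pipeline can be scripted, the sketch below uses a recent NiMARE release; the class names and arguments are assumptions that may differ from the v0.0.3 interface cited above, and the Sleuth-format input file is a placeholder rather than the study's actual data.

```python
from nimare.io import convert_sleuth_to_dataset
from nimare.meta.cbma import ALE
from nimare.correct import FWECorrector

# Foci for each experimental contrast, exported in Sleuth text format
# ("pooled_foci.txt" is a placeholder file name).
dset = convert_sleuth_to_dataset("pooled_foci.txt")

# Subject-number-dependent Gaussian kernels and voxel-wise ALE values
result = ALE().fit(dset)

# Monte Carlo FWE correction: cluster-forming p < 0.001, 10,000 iterations
corrector = FWECorrector(method="montecarlo", voxel_thresh=0.001, n_iters=10000)
corrected = corrector.transform(result)
corrected.save_maps(output_dir="ale_pooled")  # corrected convergence maps
```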
The first meta-analysis assessed convergence across a dataset of the pooled artificial grammar contrasts, identifying brain regions consistently active regardless of the grammaticality of the stimuli. Two additional ALE meta-analyses were then performed to identify convergence of brain activity for foci obtained from contrasts of Grammatical (rule-following; activation when the test item correctly follows the implicitly learned rule) or Ungrammatical (rule-violating; activation when the test item incorrectly follows the rule) stimuli, thus elucidating convergence of activity based on grammaticality. Conjunction analyses examined the significant convergence, or overlap, common to both ALE groups. Subtraction analyses of the unthresholded Grammatical ALE and unthresholded Ungrammatical ALE helped to identify brain regions that were distinctly convergent in one versus the other task type. To do this, a null distribution of the ALE difference scores was created by pseudo-randomly permuting the experimental contrasts between ALE groups, calculating voxel-level difference scores, and repeating this procedure for 10,000 iterations. Experimental contrasts were shuffled, and an equal number of contrasts to those originally in the Grammatical or Ungrammatical conditions were assigned. These pseudo-ALE images were then subtracted. Voxel-level p-values were assigned based on a voxel’s observed difference score relative to its null distribution of pseudo-ALE difference scores (FWE-corrected p < 0.05). An additional extent-threshold of 100 contiguous voxels was also applied to exclude any small regions with spurious differences (c.f., Bartley et al., 2018; Laird et al., 2005; Poudel et al., 2020).
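A minimal sketch of the group-wise analyses is shown below, again assuming a recent NiMARE/nilearn API; the Dataset variables dset_gram and dset_ungram are hypothetical and would come from splitting the coded contrasts in the previous step.

```python
from nimare.meta.cbma import ALE, ALESubtraction
from nilearn.image import math_img

# Separate ALE analyses for the two contrast groups (dset_gram and
# dset_ungram are assumed NiMARE Datasets built from the coded contrasts).
res_gram = ALE().fit(dset_gram)      # Grammatical contrasts
res_ungram = ALE().fit(dset_ungram)  # Ungrammatical contrasts

# Conjunction: voxel-wise minimum across the two z maps (in practice the
# thresholded, FWE-corrected maps would be used here)
conjunction = math_img("np.minimum(img1, img2)",
                       img1=res_gram.get_map("z"),
                       img2=res_ungram.get_map("z"))

# Subtraction: permutation-based difference of the unthresholded ALE maps
sub_result = ALESubtraction(n_iters=10000).fit(dset_gram, dset_ungram)
```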
2.4 Functional connectivity profiles
Once ALE clusters were identified indicating common activity across studies for the pooled Grammatical + Ungrammatical meta-analyses, their connectivity was evaluated to define subnetworks, or cliques. Connectivity profiles were computed for the ALE-derived ROIs. This step helped to identify subgroups of functionally connected brain regions. To do this, a 6-mm radius spherical seed was generated at each local maximum from each ALE cluster of the meta-analysis maps using FSL’s cluster command. Only local maxima that were at least 20 mm from one another served as ROIs. These seeds were then used to identify task-independent and task-dependent connectivity between the average ROI time-course and all other brain voxels. Unthresholded meta-analytic maps for each ROI were cross-correlated with each other to generate an ROI x ROI correlation matrix. The same was done for unthresholded resting-state functional connectivity images for each map. ROI labels were assigned using AFNI’s whereami command with AAL atlas labels derived from the atlas reader python package (https://github.com/miykael/atlasreader).
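The ROI-by-ROI similarity step might look like the following sketch, which uses a recent version of nilearn to vectorize the unthresholded connectivity maps; the map file names and the brain mask are placeholders, not the study's actual files.

```python
import numpy as np
from nilearn.maskers import NiftiMasker

# Unthresholded whole-brain connectivity maps, one per ALE-derived ROI
# (placeholder file names; the same step is run for the MACM and rsFC maps).
roi_maps = ["roi01_conn.nii.gz", "roi02_conn.nii.gz", "roi03_conn.nii.gz"]

masker = NiftiMasker(mask_img="MNI152_T1_2mm_brain_mask.nii.gz").fit()
profiles = np.vstack([masker.transform(m) for m in roi_maps])  # ROIs x voxels

# ROI x ROI correlation of the whole-brain connectivity profiles
roi_corr = np.corrcoef(profiles)
```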
2.5 Task-independent functional connectivity using the resting-state fMRI (rsfMRI) data from the Human Connectome Project’s (HCP) Young Adult Study (Van Essen et al., 2013; S1200 Data Release)
Each ROI from the pooled ALE was used as a seed in a seed-based query of task-independent functional connectivity between the average of each ROI time course and all other brain voxels. The S1200 data release includes minimally pre-processed and denoised MRI data per the workflow detailed by Glasser et al. (2016). On November 12, 2019, data from 150 randomly selected participants (mean ± SD: 28.7 ± 3.9 years old) were downloaded via the HCP’s Amazon Web Services Simple Storage Solution repository. The sample included 77 females (30.3 ± 3.5 years old) and 73 males (27.1 ± 3.7 years old) who provided consent through the contributing investigators’ local Institutional Review Board or Ethics Committee. While the age difference for biological sex was significant (t[149] = 5.3, p < 0.001), it was consistent with the ages noted in the full S1200 Data Release (Van Essen et al., 2013). Each HCP participant underwent T1- and T2-weighted structural imaging and four rsfMRI (15 minutes each) acquisitions on a 3T Siemens Connectome MRI scanner with a 32-channel head coil. Acquisition parameters were, structural: 0.7-mm isotropic resolution; rsfMRI: TR = 720 ms, TE = 33.1 ms, in-plane FOV = 208 x 180 mm, 72 slices, 2.0 mm isotropic voxels, and multiband acceleration factor = 8 (Feinberg et al., 2010). The average time course for each ROI was extracted for each participant as well as the average time course for all brain voxels. Each ROI’s separate deconvolution (FSL’s FEAT, Jenkinson et al., 2012) included the global signal time course as a regressor of no interest, and data were spatially smoothed with a 6-mm FWHM kernel (https://github.com/NBCLab/niconn-hcp). This was done given that the use of such a regressor performs better than other commonly used motion-correction strategies for HCP rsfMRI data (Burgess et al., 2016). Each participant’s seed-based connectivity map was then averaged across the four rsfMRI runs using a fixed-effects analysis. A group-level rsFC map was then derived through a mixed-effects analysis (Woolrich et al., 2009). Gaussian Random Field theory-based maximum height thresholding (voxel-level FWE-corrected at p < 0.001) was used to create spatially specific rsFC maps (c.f., Woo et al., 2014).
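For readers less familiar with seed-based functional connectivity, a simplified nilearn-based sketch follows. It is not the FSL FEAT pipeline described above: the seed coordinate, file name, and omission of the global-signal regressor and group-level modeling are simplifying assumptions for illustration only.

```python
import numpy as np
from nilearn.maskers import NiftiSpheresMasker, NiftiMasker

func_img = "sub-01_task-rest_bold.nii.gz"  # placeholder for one rsfMRI run

# 6-mm spherical seed at one ALE peak (MNI coordinates from Table 2)
seed_masker = NiftiSpheresMasker(seeds=[(-44, 12, 20)], radius=6, standardize=True)
brain_masker = NiftiMasker(standardize=True, smoothing_fwhm=6)

seed_ts = seed_masker.fit_transform(func_img)    # time points x 1
voxel_ts = brain_masker.fit_transform(func_img)  # time points x voxels

# Seed-to-voxel correlation map for this run
seed_corr = np.dot(voxel_ts.T, seed_ts) / seed_ts.shape[0]
corr_img = brain_masker.inverse_transform(seed_corr.T)
```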
2.6 Task-dependent functional connectivity—meta-analytic connectivity modeling (MACM)
Each ROI from the pooled ALEs was also seeded in MACM analyses (https://github.com/NBCLab/niconn). These analyses reflect the coactivation patterns of spatially distinct brain regions with the seed from varied task-based neuroimaging studies and indicate the brain regions most likely to coactivate with a given ROI across tasks and behavioral domains (Laird et al., 2009). Neurosynth (Yarkoni et al., 2011) was searched for all studies reporting coordinates from each of the ALE ROIs output from NiMARE (Salimi-Khorshidi et al., 2009). A 15-mm FWHM kernel was applied to all study coordinates, and the resulting coactivation maps were thresholded with a voxel-level FWE correction of p < 0.001 to parallel the rsFC assessments.
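Conceptually, a MACM reduces to "select every study reporting a focus inside the seed, then run an ALE over those studies." The sketch below assumes a recent NiMARE API; the file names are placeholders, and the kernel argument name is an assumption rather than the study's documented configuration.

```python
from nimare.dataset import Dataset
from nimare.meta.cbma import ALE

# Placeholder inputs: a NiMARE Dataset built from the Neurosynth database and
# one 6-mm spherical seed image derived from the pooled ALE.
dset = Dataset.load("neurosynth_dataset.pkl.gz")
study_ids = dset.get_studies_by_mask("roi01_sphere.nii.gz")
roi_dset = dset.slice(study_ids)

# Coactivation (MACM) map: ALE across all studies with a focus in the seed,
# here using a fixed 15-mm FWHM kernel
macm_result = ALE(kernel__fwhm=15).fit(roi_dset)
```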
2.7 Hierarchical cluster analysis of the pooled ALE brain regions
Hierarchical clustering of the connectivity profiles grouped the brain regions into subnetworks that were associated with Grammatical/Ungrammatical processes. rsFC and MACM cross-correlation matrices were calculated separately using the unthresholded connectivity maps. Three-dimensional images that represented connectivity were vectorized and concatenated creating a matrix of the number of voxels (V, 2 mm MNI152 template) by the number of maps (M, ROIs). Each pair-wise combination of maps was correlated (Pearson) resulting in an MxM matrix. The similarity of the task-independent and task-dependent maps was then represented in an agglomerative hierarchical cluster tree, grouping ROIs with similar features into clusters that were distinct from one another. This is done using an algorithm that finds the two most similar clusters, merges them, and continues until all clusters are merged, with similarity measured by a standardized Euclidean distance metric and Ward’s minimum variance linkage (Eickhoff et al., 2011; Timm, 2002). These clusters, or cliques, demonstrate similar task-independent connectivity in the rsfMRI data and task-dependent coactivation patterns in the MACMs. However, given differences between the imaging modalities, clustering outcomes are likely to differ. Thus, an integrated multimodal correlation matrix that combines the averaged rsFC and MACM information was created. This multimodal correlation matrix then provided the final cluster solution/groupings as in Hill-Bowen et al. (2021).
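A compact SciPy sketch of this clustering step is given below; the two ROI x ROI correlation matrices (rsfc_corr, macm_corr) are assumed to exist from the preceding steps, and the variable names are hypothetical.

```python
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Average the rsFC and MACM ROI x ROI correlation matrices into one
# multimodal similarity matrix (rsfc_corr and macm_corr computed earlier).
multimodal = (rsfc_corr + macm_corr) / 2.0

# Ward's minimum variance linkage on standardized Euclidean distances
# between the ROI connectivity profiles
distances = pdist(multimodal, metric="seuclidean")
tree = linkage(distances, method="ward")

# Cut the dendrogram at the three-cluster solution chosen by inspection
clique_labels = fcluster(tree, t=3, criterion="maxclust")
```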
2.8 Functional decoding
Functional decoding further characterized the mental operations associated with those subnetworks. To do this, the averaged, unthresholded MACM map for each subnetwork was spatially correlated with the meta-analytic map for each of the 1,335 terms in Neurosynth. This produced a ranked list of psychological terms related to each cluster, thus providing a semi-quantitative interpretation of each map relative to the broader literature. Here, the top anatomical or functional terms (removing duplicates or synonyms) that correlated with an input map at r > 0.29 were considered to represent the functional profile of the subnetwork.
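In NiMARE, this kind of term-based decoding can be approximated with the CorrelationDecoder; the sketch below assumes a recent release, a Neurosynth-derived Dataset file, and a placeholder clique map, and the name of the output correlation column may differ by version.

```python
from nimare.dataset import Dataset
from nimare.decode.continuous import CorrelationDecoder

# Placeholder inputs: a Neurosynth Dataset with term annotations and the
# averaged unthresholded MACM map for one clique.
dset = Dataset.load("neurosynth_dataset.pkl.gz")

decoder = CorrelationDecoder(frequency_threshold=0.001)
decoder.fit(dset)  # builds a meta-analytic map per term

term_corrs = decoder.transform("clique1_macm_mean.nii.gz")
top_terms = term_corrs[term_corrs["r"] > 0.29].sort_values("r", ascending=False)
print(top_terms)
```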
3 Results
3.1 Outcomes of the literature search
The literature searches yielded over 800 papers published between 1995 and February 1, 2022. The title and abstract screening reduced the total to 228 papers. Of those, 88 papers were excluded for not utilizing fMRI or PET imaging, or not conducting whole-brain analyses. Thirteen papers were excluded due to not including or reporting results for healthy adults. An additional 32 papers were excluded because they administered serial reaction time task (SRT) experiments that did not involve a grammar.
A final screening was completed by one author (KC) for the remaining 150 whole-brain fMRI or PET language studies (95 identified through the literature search and 55 articles from Tagarelli et al., 2019) to determine whether each study fit inclusion criteria. Studies including language learning which referenced rules, grammar, regularities, statistical probability, chunk strength, dependencies, and/or adjacencies were included. These studies either utilized a word learning or a grammar learning paradigm. Studies that focused on word learning were reviewed to determine whether learning occurred statistically (no translations given) and if there was an underlying rule system. Studies that referenced explicit grammar learning were reviewed to clarify whether rules were explicitly taught (and examine the nature of feedback), and if so, they were excluded. As a result, 36 word learning studies were excluded that either did not involve rules or included explicitly taught words. For the grammar learning studies, one study was excluded (Musso et al., 2003) and four were retained (Fletcher et al., 1999; Skosnik et al., 2002; Yang & Li, 2012; Yusa et al., 2011). Finally, studies that explicitly assessed participants’ learning of words or grammar, such as recognition and word segmentation tasks, were included since the learning was determined to be statistical. Correlational analyses between task performance and brain activity were excluded due to the small number of experiments (n = 6), thus excluding Opitz and Friederici (2003) and Orpella et al. (2020), and leaving 25 papers.
A final set of 25 fMRI articles involved data from 506 participants (238 females, 268 males) and included 25 Grammatical and 14 Ungrammatical experiments/contrasts (Table 1). Twenty-three studies used artificial grammars or languages, and two studies used unfamiliar, non-native languages. The types of artificial grammar or languages included BROCANTO (2 papers), Brocanto2 (1 paper), finite-state grammar (10 papers using both Reber and Markovian grammars), transitional probabilities (5 papers), and the remainder included a mix of adjacent, non-adjacent, or pairwise dependencies, hierarchical rules, chunk-based rules, or phrase structure rules (7 papers). Chunk-based rules, also known as associative chunk strength, are a learning mechanism that relies on the frequency of pairs of letters that appear together to make grammaticality judgments (Meulemans & Van der Linden, 1997). Only eight of the studies provided some sort of feedback. Fifteen of the studies presented stimuli visually, six auditorily, and four both auditorily and visually. A subset of papers (n = 12) reported greater activation for Ungrammatical items than for Grammatical items and were included in a separate ALE group. See Table 1 for a summary of papers included in the meta-analyses and a summary of study characteristics. A summary of the literature search is presented as a PRISMA flowchart (Fig. 2).
Table 1. Papers included in the meta-analyses and their study characteristics.

| Author (Year) | n | F:M | Age | Type of grammar | Rule type | Feedback | Stimulus presentation | Included contrasts |
|---|---|---|---|---|---|---|---|---|
| Learning | | | | | | | | |
| Fletcher et al. (1999) | 7 | 3:4 | 28 | Artificial grammar | Linear (Finite state) | Yes | Visual | training > baseline |
| Karuza et al. (2013) | 25 | 17:8 | 20 | Artificial language | Transitional probabilities | No | Auditory + Visual | training > baseline |
| Karuza et al. (2016) | 16 | 10:6 | 21 | Artificial language | Transitional probabilities | No | Auditory | training > baseline |
| Opitz and Friederici (2003) | 14 | 7:7 | 25 | Artificial grammar | Phrase structure (BROCANTO) | Yes | Visual | training > baseline |
| Learning + Test | | | | | | | | |
| Goranskaya et al. (2016) | 61 | 31:30 | 27 | Artificial grammar | Pairwise dependencies | No | Auditory | learn + test > baseline |
| McNealy et al. (2006) | 27 | 13:14 | 27 | Artificial language | Transitional probabilities | No | Auditory | training > rest; test > baseline |
| Ordin et al. (2020) | 24 | 14:10 | 25 | Artificial language | Transitional probabilities | No | Auditory | test > test |
| Tettamanti et al. (2002) | 14 | 7:7 | 27 | Artificial grammar | Hierarchical and non-hierarchical | No | Visual | training > baseline |
| Turk-Browne et al. (2009) | 16 | 9:7 | 23 | Non-native language | Transitional probabilities | No | Visual | training > training |
| Test | | | | | | | | |
| Bahlmann et al. (2008) | 14 | 7:7 | 25 | Artificial grammar | Hierarchical and adjacent | Yes | Visual | test > test |
| Conway et al. (2020) | 21 | 12:9 | 22 | Artificial grammar | Adjacent and non-adjacent | No | Visual | test > test |
| Finn et al. (2013) | 10 | 5:5 | 24 | Artificial language | Phrase structure | No | Auditory + Visual | test > baseline |
| Folia and Petersson (2014) | 32 | 16:16 | 23 | Artificial grammar | Linear (Reber finite state) | No | Visual | test > test; test day 5 > day 1 |
| Forkstam et al. (2006) | 12 | 8:4 | 23 | Artificial grammar | Linear (Reber finite state) | No | Visual | test > baseline; test > test day 8 |
| Hauser et al. (2012) | 17 | 7:10 | 24 | Artificial grammar | Phrase structure (BROCANTO) | Yes | Visual | test > test |
| Lieberman et al. (2004) | 9 | 5:4 | 26 | Artificial grammar | Linear (Markovian finite state) | Yes | Visual | test > test |
| Morgan-Short et al. (2015) | 13 | 6:7 | 22 | Artificial grammar | Phrase structure (Brocanto2) | Yes | Auditory + Visual | posttest > pretest; test > baseline |
| Newman-Norlund et al. (2006) | 18 | 10:8 | 19 | Artificial language | Linear (Finite state) | Yes | Auditory | posttest > pretest |
| Petersson et al. (2004) | 12 | 3:9 | 24 | Artificial grammar | Linear (Reber finite state) | No | Visual | test > baseline; test > test |
| Seger et al. (2000) | 14 | 3:11 | nr | Artificial grammar | Linear (Finite state) | No | Visual | test > baseline |
| Skosnik et al. (2002) | 23 | 11:12 | nr | Artificial grammar | Linear (Markovian finite state) | No | Visual | test > test |
| Thiel et al. (2003) | 16 | 7:9 | 30 | Artificial grammar | Chunk-based | No | Visual | test > test |
| Wilson et al. (2015) | 12 | 6:6 | 23 | Artificial grammar | Linear (Finite state) | No | Auditory | test > test |
| Yang and Li (2012) | 43 | 21:22 | 21.59 | Artificial grammar | Linear (Markovian finite state) | No | Visual | test > rest |
| Yusa et al. (2011) | 36 | nr | 21.6 | Non-native language | English negative inversion rule | Yes | Auditory + Visual | test 2 > test 1 |

nr: not reported.
3.2 Meta-analytic outcomes
The Grammatical and Ungrammatical ALE groups were pooled to identify convergence across studies representing the recognition and endorsement/acceptance of rule-based stimuli and the rejection of random, non-rule following stimuli. The pooled ALE results identified significant activation in bilateral ROIs of the frontal, insular, parietal, and occipital cortex with six clusters of activation: left pars opercularis extending into the left insula, right pars triangularis extending into the right precentral gyrus, right middle cingulate gyrus extending to the left supplemental motor area, right insula, right middle occipital gyrus, and right inferior parietal lobule (Fig. 3A, Table 2). The statistically significant conjunction of the Grammatical and Ungrammatical ALE groups showed only three regions of convergence, organized in two main clusters including the left pars opercularis and triangularis and the left insula (Fig. 3A, Supplemental Table 1).
Table 2. Clusters of convergent activity from the pooled (Grammatical + Ungrammatical) ALE meta-analysis.

| Cluster | x | y | z | Volume (mm3) | ALE max | Label |
|---|---|---|---|---|---|---|
| 1 | -44 | 12 | 20 | 12352 | 7.36 | L Inferior frontal opercularis |
| | -34 | 22 | -2 | | 7.26 | L Insula |
| 2 | 48 | 26 | 20 | 5392 | 5.73 | R Inferior frontal triangularis |
| | 50 | 26 | 4 | | 4.69 | R Inferior frontal triangularis |
| | 48 | 4 | 30 | | 3.32 | R Precentral gyrus |
| 3 | 6 | 26 | 34 | 3952 | 5.79 | R Middle cingulate gyrus |
| | 2 | 22 | 50 | | 4.99 | L Supplemental motor area |
| 4 | 36 | 22 | -4 | 2584 | 7.29 | R Insula |
| 5 | 32 | -66 | 38 | 1264 | 5.52 | R Middle occipital gyrus |
| 6 | 36 | -50 | 48 | 1104 | 4.02 | R Inferior parietal lobule |
Beyond this conjunction, the Grammatical contrasts uniquely identified five clusters comprising seven regions across both hemispheres (Fig. 3A; z corrected, FWE thresholded at p < 0.05; Supplemental Table 1) including the left inferior frontal gyrus extending from the precentral gyrus to the pars triangularis, left insula, right middle occipital gyrus, right insula, and left superior temporal gyrus. The Ungrammatical stimuli uniquely activated four clusters comprising eight regions (Fig. 3C; z corrected, FWE thresholded at p < 0.05; Supplemental Table 1) in the left pars opercularis extending to the left insula, right middle frontal gyrus extending to the right pars orbitalis, right middle cingulate gyrus extending to the left supplemental motor areas, and the right insula. However, regional activity did not significantly differ when participants were distinguishing rule-following versus rule-violating stimuli (FDR-corrected p > 0.05). Visual inspection of the maps suggests subtly more left-lateralized activation as well as left superior temporal gyrus activation when the stimuli were rule-following (Grammatical).
3.3 Functional connectivity and hierarchical clustering analysis outcomes
Given that the Grammatical and Ungrammatical ALEs did not differ significantly, the ROIs from the pooled ALE were used as seeds to assess connectivity in the rsFC and MACM data (Fig. 4). Connectivity profiles indicated strong connections among the ROIs, as well as inclusion of subcortical structures (e.g., thalamus, basal ganglia), but the general pattern was consistent with structures of the frontoparietal network. Hierarchical clustering analysis of the connectivity profiles grouped brain regions into cliques with processes common across spatial and functional domains. Cross-correlations of the unthresholded MACMs and rsFC maps for each ROI generated correlation maps that were then averaged. Hierarchical clustering analysis was performed on each matrix, and visual inspection of the metrics and dendrograms (Fig. 5) indicated that a three-cluster solution was optimal. Brain maps of these clusters demonstrate considerable overlap in frontoparietal cortex, but with connectivity that includes subcortical, basal ganglia regions in one (red in Fig. 5).
3.4 Functional decoding
Spatial correlations between the map for each of the three hierarchical clusters and the terms in Neurosynth allowed for annotation of the terms most related to the activity or connectivity of each. This step distinguished the functional specialization of the clusters: one clique aligned with language, one with salience, and one with cognitive control and attention (Fig. 5). Thus, the following functional interpretations of the clusters were suggested:
Clique 1: A language (Fig. 3, blue) clique was composed of the left inferior frontal opercularis, the right inferior frontal gyrus, and right inferior frontal triangularis. Coordinates for the clusters of this clique and their anatomical and cytoarchitectonic labels are provided in Supplemental Table 2. This clique was strongly associated with the following Neurosynth terms (r > 0.29, Supplemental Table 5): phonological, language, word, demands, syntactic, linguistic, lexical, semantic. This decoding aligns the functional specialization of this clique with processing of linguistic stimuli regardless of the correctness of grammatical rules.
Clique 2: A salience (Fig. 3, red) clique was composed of bilateral insula, right mid-cingulate cortex, and the left supplemental motor area. It is, in part, consistent with the canonical salience (Seeley et al., 2007) or cingulo-opercular networks (Coste & Kleinschmidt, 2016). Coordinates for the clusters of this clique and their anatomical cytoarchitecture are provided in Supplemental Table 3. This clique was associated with the Neurosynth terms (r > 0.29, Supplemental Table 5) task, cognitive control, conflict, working memory, stop signal, working, response inhibition, demands, and error. Thus, this clique is associated with the recognition of multiple features present in the stimuli and the conflict associated with determining the correctness.
Clique 3: A cognitive control or attentional (Fig. 3, green) clique was composed of posterior regions in the right inferior parietal lobule and right mid-occipital gyrus, as well as the right precentral gyrus. Coordinates for the clusters of this clique and their anatomical cytoarchitecture are provided in Supplemental Table 4. This clique was associated with Neurosynth terms (r > 0.29, Supplemental Table 5) like those of clique 2 (i.e., tasks, working memory) as well as calculation, attentional, numbers, spatial, numerical, load, color, symbolic, object, visuospatial, arithmetic, interference, and irrelevant. The decoding of this clique revealed commonalities with clique 2 relative to working memory and demand, and with non-linguistic complex stimulus processing, but also aligns with the executive control (Seeley et al., 2007) or frontoparietal networks (Cole et al., 2013; Cole & Schneider, 2007).
4 Discussion
This study presents a common set of brain regions active across many neuroimaging investigations during artificial or non-native grammar learning and their functional connectivity profiles. Using coordinate-based meta-analysis methods to identify those common regions, large task-independent and task-dependent datasets were leveraged to identify the functional connectivity profiles for these regions across the brain. From those data, cliques of regions (subnetworks) were identified and associated with the mental operations frequently associated with them. These profiles suggest three cliques that are engaged for artificial and non-native grammar learning: a language network, a salience network, and a cognitive control network. The tasks included in the meta-analyses varied for modality of input (auditory, visual, or both) and thus the findings suggest that the engagement of these subnetworks for SL is domain-general, though future analyses contrasting modality of input may further delineate the role of modality-specificity and inform modeling like that of Frost et al. (2015).
4.1 Brain regions commonly active for artificial and non-native grammar tasks
As has been reported previously in SL tasks, the left inferior frontal gyrus was engaged for artificial and non-native grammar learning but was not uniquely involved for grammatically correct, or rule-following, stimuli. This suggests that the left IFG is necessary for both the extraction of rule-governed patterns and the resolution of conflict for stimuli not following the pattern. Along with the left IFG, there was activity in the right IFG, bilateral insula, right precentral gyrus, right mid-cingulate gyrus and supplementary motor area, right middle occipital gyrus, and right inferior parietal lobule. These findings support a left-dominant, but bilateral, group of regions involved largely in identification of rule-following and rule-violating grammatical stimuli, regardless of the modality of the input. Notably missing from the meta-analysis results were the basal ganglia, despite their proposed primary role in SL. This is somewhat consistent with the meta-analytic findings of Tagarelli et al. (2019), but they also investigated common activity in lexical learning and in non-declarative grammar learning, where they found left caudate and putamen activity. Plante et al. (2015) also found caudate activity, but only in the early phases of learning. Both Tagarelli et al. (2019) and Plante et al. (2015) note that caudate activity was specific to grammar learning and particularly to the early extraction of patterns or rules in the stimuli. As such, it may be that collapsing across learning phases precluded detection of this early, rule-learning-specific caudate activity in the present study.
4.2 Functional connectivity profiles
Utilizing the ALE regions that were commonly active in the grammar learning tasks as seeds to query connectivity in resting-state and task-dependent data sets, profiles of connectivity elucidated three unique cliques. These three cliques were further characterized through functional decoding to clarify that one was likely specialized in language processing, one in salience of features in stimuli, and one in cognitive control (Fig. 5).
These profiles extend from the specific regions active during artificial and non-native grammar learning tasks to include a more diffuse representation of each region’s participation in brain networks, thus providing evidence to build hypotheses for how these brain networks may contribute to task performance. For example, the left inferior frontal gyrus (pars Opercularis) that is proposed to be central to grammar learning in previous literature was found to be commonly active across the artificial grammar learning studies, and it is part of a network of brain regions associated with language processes (clique 1, blue in Fig. 3), in concert with the right inferior frontal gyrus (pars triangularis) and the left insula. This network of regions was associated with terms in the NeuroSynth database that clarified its role in language with specific terms like phonological, syntactic, semantic, and sentence, as well as to a lesser extent in general cognitive processing with specific terms including demands, domain-general, working memory, and cognitive control. This may indicate that the IFG’s role in language processing is paramount, but that it is also conjointly engaged with other networks when that processing requires maintenance and manipulation of incoming stimuli. While this network includes the left IFG, there is an equal or greater extent of the right IFG, suggesting that grammar learning utilizing auditory and visual stimuli may not be uniquely left dominant.
Clique 1 is akin to the audio-motor interface, or dorsal route, and the modality-specific regions identified by Frost and colleagues. While the left IFG region of interest clustered into clique 1, portions of this cortical region were represented in all of the cliques, suggesting that its role in SL is complex (Supplemental Fig. 1). Frost et al. (2015) proposed that the bilateral IFG, hippocampus, and basal ganglia are part of a shared SL system that facilitates modality-specific sensorimotor processing but also modulates the input for domain-general, higher-level cognition for learning. Rodríguez-Fornells et al. (2009) proposed that the posterior IFG is part of the audio-motor interface, whereas the ventral IFG is part of the ventral meaning inference interface, and that these two regions interact in early phases of SL. The ventral aspect of the IFG is represented in clique 2 in the present study, contiguous with the insula and caudate regions (Supplemental Fig. 2), which may confirm distinct roles of the IFG in SL.
Conceptually, cliques 2 (salience) and 3 (cognitive control) align with the Frost and Rodríguez-Fornells models as domain-general cognitive aspects of SL. However, none of the regions thought to be engaged for inferring meaning or for episodic storage of lexical information (i.e., the medial, anterior, and inferior temporal gyri) are present in our meta-analytic results. This is likely because the processes proposed to engage these regions relate to phases of learning, in which contextual information extracted over repeated exposures is eventually stored through medial temporal lobe-dependent processes (Rodríguez-Fornells et al., 2009). Most of the studies included in our meta-analyses acquired data in the testing phase of the grammar learning paradigms; therefore, they did not observe the evolution of SL.
While the cerebellum has been associated with error detection during SL, it was not commonly active across the studies included in our meta-analyses. However, it was identified as part of clique 3, a central executive, frontoparietal network. Prior evidence suggests that this cognitive control network flexibly shifts as the cognitive goals or demands of a given task change (Cole et al., 2013). Ullman and colleagues propose that the cerebellum is involved in conflict monitoring or error detection; thus, we assert that clique 3 is involved in grammar learning as a scaffold for the changing demands across grammar rule identification, maintenance, and rule application in healthy adults.
In cooperation with the network involving the left IFG (clique 1) and the network involving the cerebellum (clique 3), clique 2 was associated with behaviors such as working memory and attention and involved the basal ganglia (more specifically, the bilateral caudate nucleus). This network clustered with the insula and mid-cingulate cortex regions derived from the ALEs, and we propose that its role in SL is to register the salience of stimulus features and their relevance to rule learning. As noted above, the caudate nucleus has been found to be active during the early phases of grammar learning (Plante et al., 2015; Tagarelli et al., 2019), but not once the grammatical rule has been learned. This network would be necessary to track regularities across stimuli and imperative for rule identification and application of grammar, as has been predicted by others reporting frontoparietal activity during language learning tasks. The frontoparietal brain network is domain-general and largely overlaps with other cognitive control networks (e.g., multi-demand), and its integrity is known to be a positive prognostic sign for language recovery in individuals with post-stroke aphasia (Fedorenko et al., 2013). Future study will determine whether connectivity of additional subcortical structures, and other temporal structures, is also relevant to successful language relearning.
4.3 Limitations
The approach taken and interpretations made in this study are based on meta-analytic methods applied to large-scale data mining. The ALE conducted to identify regions of interest was limited to functional activation studies utilizing similar artificial grammar and non-native grammar learning tasks. This approach is susceptible to publication bias; that is, it is limited by the design and reporting of results in the primary published studies. Further, the coordinate-based ALE algorithm does not take into account the cluster size of the original studies, making it likely less precise than image-based meta-analyses (Salimi-Khorshidi et al., 2009).
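To make this limitation concrete, the core ALE computation can be sketched on a toy one-dimensional grid: each reported peak is modeled as a Gaussian, per-experiment modeled activation maps are combined through a probabilistic union, and the extents of the original clusters never enter the calculation. All coordinates and kernel settings below are invented for illustration only.

```python
# Toy sketch of the core ALE computation on a 1-D grid, illustrating that only
# reported peak coordinates (modeled as Gaussians) contribute, not cluster sizes.
# All values are invented for illustration.
import numpy as np

x = np.arange(100, dtype=float)        # voxel positions along one axis
experiments = [[20, 55], [22, 80]]     # hypothetical peak coordinates per experiment
sigma = 5.0                            # kernel width (in ALE, tied to sample size)

def modeled_activation(foci):
    """Per-experiment modeled activation map: max across Gaussian-modeled foci."""
    kernels = [np.exp(-(x - f) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))
               for f in foci]
    return np.max(kernels, axis=0)

# ALE map = probabilistic union of the per-experiment modeled activation maps
ma_maps = [modeled_activation(foci) for foci in experiments]
ale = 1.0 - np.prod([1.0 - ma for ma in ma_maps], axis=0)
print(int(ale.argmax()))               # convergence is highest between the shared peaks (~21)
```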
The 25 artificial grammar learning studies included in the meta-analyses all employed statistical learning paradigms and contrasts of rule-following or rule-violating stimuli. However, the studies differed in the grammatical rule types, the inclusion of feedback during learning trials, and whether the stimuli were auditory (n = 6), visual (n = 15), or both (n = 4) (see Table 1). In addition, papers were included that focused on word segmentation tasks involving the extraction of rules about word boundaries, which likely relate to different aspects of linguistic processing than grammar learning at the phrase and sentence levels. The common clustered activity across studies is the primary outcome of the ALE, but we recognize that there may be bias toward the most represented study types: finite-state artificial grammars learned without feedback and presented as visual stimuli. Future study may focus on artificial grammars that more closely resemble natural language, which may expand the set of ROIs and further clarify the specificity of, and interactions among, the subnetworks identified here when grammars are more complex.
A similar limitation is the small number of experiments that reported Grammatical/Ungrammatical results. Separate meta-analyses (ALEs) were conducted for the Grammatical and Ungrammatical data, but the contrast between the two was not significant (p > 0.05) despite appearing to differ spatially (Fig. 2 and Supplemental Table 1). The lack of significance between these two ALE groups is likely due to the small number of experiments in each (Eickhoff et al., 2016). As such, the data for the Grammatical and Ungrammatical ALE groups were pooled, increasing the number of experiments included in the hierarchical clustering analysis and functional decoding steps. Thus, the data represented and interpreted in Figure 5 do not differentiate the unique contributions of mental processes specific to identifying Grammatical or Ungrammatical stimuli. Future study with larger numbers of experiments is necessary before attributing specific aspects of grammar learning to one clique or another.
Last, the inclusion of both learning- and test-phase activations in the meta-analyses may conflate differing cognitive processes. Ideally, the two phases would be evaluated separately to determine whether specific network profiles exist for the extraction of salient features from the input during learning relative to the integration of those features to demonstrate that a rule has been learned. However, only 16 contrasts reported coordinates for activations specific to the test phase alone, which does not meet the minimum threshold for contrasts in coordinate-based meta-analytic work (Eickhoff et al., 2016; Yeung et al., 2023). Thus, future study is warranted to delineate the networks and mental processes for early versus late phases of learning. This may elucidate not only the role of the basal ganglia in learning, but also the role of the hippocampus. For example, using intracranial electrodes to evaluate changes in neural recruitment as participants learned transitional probabilities, Henin et al. (2021) and Ramos-Escobar et al. (2022) specified the role of the hippocampus in the early phases of SL, namely the extraction of salient features specific to words and their ordinal position in streams of syllables. This role was differentiated from the recruitment of the auditory cortex, which attuned to syllable structure. The precision of these types of studies, and the importance of considering the phase of learning, are paramount for improving understanding of the neural bases of SL.
5 Conclusions
By utilizing advanced neuroimaging meta-analytic techniques, this study elucidates the brain regions engaged during artificial grammar learning tasks. These included the oft-hypothesized left IFG as well as other frontal, insular, cingulate, and posterior parieto-occipital cortices, consistent with findings of other meta-analytic work. By seeding these regions in large task-dependent and task-independent datasets and applying hierarchical clustering, functionally connected cliques were identified and then linked to mental operations commonly attributed to their collective activity. In this way, three cliques were identified, suggesting bilateral connectivity profiles engaged for grammar learning that are associated with language, salience, and cognitive control. The language clique (clique 1 in Fig. 5) was not left-dominant, though the ALE results contrasting rule-following and rule-violating stimuli may indicate more left-sided activity for identifying grammatical correctness. The other two subnetworks had bilateral representation, and their associations with mental operations suggest they are engaged for domain-general processing.
These findings provide a brain network model of SL for artificial and non-native grammar learning. This model suggests recruitment of a left hemisphere-dominant language network as well as bilateral salience and cognitive control networks for learning rule-governed stimuli. The model thus presents three differentiated brain networks integral to learning, though the specific role that each plays in successful learning requires further investigation. These findings have potential implications for rehabilitative efforts. For example, the language network is likely damaged in individuals with post-stroke aphasia, whereas the integrity of the salience and cognitive control networks may be relatively spared, and spared integrity of these two networks may be evidence of a better prognosis (Fedorenko et al., 2013; Geranmayeh et al., 2014). Alternatively, if one of these networks is impaired in a clinical population such as aphasia, then designing intervention approaches that optimize the function of the remaining networks may be key to improving learning and, thereby, therapeutic outcomes.
Data and Code Availability
The data and code supporting the findings of this study are available on GitHub (https://github.com/NBCLab/meta-analysis_implicit-learning) and on NeuroVault (https://neurovault.org/collections/9597/). Tools used to perform the rsFC analyses can be found in these GitHub repositories: https://github.com/NBCLab/niconn; https://github.com/NBCLab/niconn-hcp. Code for plotting the results is available on GitHub (https://github.com/mriedel56/surflay).
Author Contributions
Conceptualization: A.E.R. and K.C.; Methodology: M.C.R., A.R.L., and A.E.R.; Formal Analysis: M.C.R., K.C., and A.R.L.; Writing: A.E.R., K.C., J.C.T., K.L., and A.R.L.; Visualization: M.C.R. and A.E.R.; Supervision: A.E.R. and A.R.L.
Funding
This study was supported, in part, by the UNH COVID Recovery Fund from the Senior Vice Provost for Research, Economic Engagement, and Outreach. Data were provided [in part] by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University.
Declaration of Competing Interest
The authors declare no conflicts of interest in relation to the publication of this paper.
Supplementary Materials
Supplementary material for this article is available with the online version here: https://doi.org/10.1162/imag_a_00355.