A Bayesian incorporated linear non-Gaussian acyclic model for multiple directed graph estimation to study brain emotion circuit development in adolescence

Abstract Emotion perception is essential to affective and cognitive development which involves distributed brain circuits. Emotion identification skills emerge in infancy and continue to develop throughout childhood and adolescence. Understanding the development of the brain’s emotion circuitry may help us explain the emotional changes during adolescence. In this work, we aim to deepen our understanding of emotion-related functional connectivity (FC) from association to causation. We proposed a Bayesian incorporated linear non-Gaussian acyclic model (BiLiNGAM), which incorporated association model into the estimation pipeline. Simulation results indicated stable and accurate performance over various settings, especially when the sample size was small. We used fMRI data from the Philadelphia Neurodevelopmental Cohort (PNC) to validate the approach. It included 855 individuals aged 8–22 years who were divided into five different adolescent stages. Our network analysis revealed the development of emotion-related intra- and intermodular connectivity and pinpointed several emotion-related hubs. We further categorized the hubs into two types: in-hubs and out-hubs, as the center of receiving and distributing information, respectively. In addition, several unique developmental hub structures and group-specific patterns were discovered. Our findings help provide a directed FC template of brain network organization underlying emotion processing during adolescence.


I. INTRODUCTION
Emotion perception is essential to affective and cognitive development and is thought to involve distributed brain circuits.Identification of distinct facial expressions, which is fundamental to recognizing the emotional state of others, begins in infancy and continues to develop throughout childhood and adolescence [1].Understanding the development of brain's emotion circuits may help us explain the emotional maturation seen in adolescence.In our previous work [2], we studied the fMRI images collected on the Philadelphia Neurodevelopmental Cohort (PNC) and delineated the trajectory of the brain functional connectivity (FC) from late childhood (preadolescence) to early adulthood (post-adolescence) during emotion identification tasks.The FC metrics that we used were defined by statistical associations (partial correlations, in specific) between measured brain regions.However, it has been pointed out that the statistical association may be problematic in that it only reveals the spatial connections but not causal information [3].Approaches that characterize statistical associations are likely a good starting point, but the causality of brain connectivity should be more promising.
Directed acyclic graph (DAG) models, also known as Bayesian networks, are designed to model causal relationships in complex systems.Current methods for DAG identification can be divided into four categories: constraint-based methods, score-based methods, non-Gaussian based methods, and hybrids of these categories.The constraint-based methods, such as the PC algorithm (named after its authors, Peter and Clark) [4], and the score-based methods, such as the greedy equivalent search (GES) have been found sub-optimal for predicting the direction of causal relationships but are accurate in identifying the causal skeleton (graph structure without directions) [5].The non-Gaussian based methods, the linear non-Gaussian acyclic models (LiNGAMs), have better performance since the non-gaussian data may contain more information to infer the directionalities [6].However, it requires a large number of data points in the relevant dimension to converge to the true graph [5].A recent review discussed that by leveraging insights from existing association studies, we can reduce the set of likely causal models, facilitating causal inferences despite major limitations [3].Methods like high-dimensional PC [7], fast GES [8] and ψ-LiNGAM [9] that incorporated association networks have all shown great improvements.
Currently, existing methods have focused on estimating a single directed graphical model.However, in many biomedical applications, we have access to data from related classes, like the multiple adolescent periods in our adolescence study.This raises an important statistical question, namely how to jointly estimate related graphical models in order to make full use of the available data [10].We thus developed a joint Bayesian-incorporating ψ-learning method to estimate multiple undirected graphical models as priors based on our previous study [11] and proposed a multiple DAG estimation for non-Gaussian data.We named it the Bayesian incorporated linear non-Gaussian acyclic model (BiLiNGAM) in the sense that we were inspired by the idea of Bayesian prior and implemented a joint Bayesian-incorporating estimation to acquire the priors.A series of simulation studies have been conducted to further illustrate the advantages of BiLiNGAM in terms of convergence speed and accuracy, especially in high dimensional datasets.Network analysis of brain connectivity development using the fMRI image from PNC revealed the development of emotion-related intra-and inter-modular connectivity and pinpointed a few emotion-related hubs.We further categorized them into two types: in-hubs and outhubs, as the center of receiving and distributing information.Some unique developmental hub structures and group-specific patterns have also been discovered.
The remainder of this paper is organized as follows.In Section II, we introduce the BiLiNGAM method step by step.A series of simulation studies are displayed in Section III.A detailed analysis of causal emotion circuit development of human brain in adolescence is illustrated in Section IV, followed by some concluding remarks in the last section.

II. METHODS
In this section, we introduce the Bayesian incorporated linear non-Gaussian acyclic model (BiLiNGAM) step by step.The proposed model is developed from the LiNGAM method and incorporate prior information for multiple groups that are distinct but related.We first introduce the general LiNGAM methods in Section II.A. Section II.B briefly describes the approach for prior knowledge estimation for multiple groups.Finally, we summarize the BiLiNGAM algorithm in Section II.C.

A. LiNGAM
The linear non-Gaussian acyclic model (LiNGAM) was first proposed to study Bayesian networks (BN) using a structural equation model (SEM) for non-Gaussian variants [12].LiNGAM assumes that the casual relationships of the variables can be represented graphically by a directed acyclic graph (DAG) G = (V, E), where the node set V = {1, 2, ..., p} represents the corresponding variables and E ∈ R p×p denotes the adjacency matrix of the directed edges.Let B = {b ij } ∈ R p×p be the weighted adjacency matrix specifying the edge weights of the underlying DAG G.The observed random vector X = (X 1 , X 2 , ..., X p ) ∈ R p is assumed to be generated from the following linear SEM, X = B T X + , where = ( 1 , 2 , ..., p ) is a continuous random vector; the i 's, ∀i = 1, 2, ..., p have non-Gaussian distributions with nonzero variances, and are independent of each other.
A property of acyclicity is that there exists at least one permutation π of p variables such that b ij = 0, ∀π(i) < π(j).
In other words, the weight matrix B can be reordered to a strictly lower triangular matrix according to the permutation π.The goal of LiNGAM is to find the correct permutation and estimate the weight matrix B. Since the components of are independent and non-Gaussian, Shimizu et al. [12] first proposed the independent component analysis (ICA) based algorithm known as ICA-LiNGAM.Later, another method, the direct LiNGAM [13], has been developed which estimates the causal order of variables by successively subtracting the effect of each independent component from given data.Compared to the ICA-based algorithm, the direct LiNGAM needs no initial guess or algorithmic parameters and has guaranteed convergence.
LiNGAM is designed for non-gaussian data, which contains more orientation information.Therefore it is possible to identify more of the causal graph structure than the traditional Gaussian setting [6].However, the identification of DAG models in general is nondeterministic polynomial time hard (NP-hard) [14].LiNGAM, especially, requires a larger number of data points in the relevant dimension to converge to the true graph [5].In biomedical applications, the datasets collected often have limited sample size or the problems we are facing are high dimensional cases (i.e., the number of variables/nodes greatly exceeds the number of samples/observations).Under these circumstances, the traditional LiNGAM methods could give questionable results or cannot even be applied to high dimensional cases.

B. Bayesian Incorporated Prior Estimation
Although investigating causal interactions should be central to studies of complex systems like the brain function, associations are a good starting point for estimating network interactions and various methods have been proposed.Current methods can be categorized into three groups based on the statistical associations to be calculated, which, namely, are the Pearson correlations, partial correlations and distance correlations.Pearson correlation describes the linear correlation of a pair of variables in a system, partial correlation measures the association between two variables removing the effect of other variables, and distance correlation can measure both linear and nonlinear association between two random variables.Based on [3], generalizing insights from existing association studies can facilitate causal inferences and overcome major limitations such as small sample size and computational inefficiency.As in statistical modelling, we can incorporate the established networks as prior information with existing causal models.Within a linear model, a partial correlation-based approach is more appropriate than the one using Pearson correlation.This is because in a complex system like the brain, the Pearson correlation is much weaker marginally [15], i.e., all nodes (variables) are directly or indirectly correlated and it is difficult to distinguish significant connections through a dense network constructed by Pearson correlations.Therefore, incorporating the Pearson correlation network as a prior may bring too much confounding information and the improvement of causal inferences is limited.On the contrary, partial correlations have been proposed to explore direct associations between two nodes, controlling for the confounding variables, which can help to infer causal interactions.In our previous causal study [9], we incorporated the partial correlation network using the ψ-learning method [15] as a prior with LiNGAM and showed its superiorities of convergence speed and accuracy.
Our goal in this paper is to jointly estimate multiple related DAGs.Therefore, we developed a joint Bayesian-incorporating ψ-learning method to estimate multiple undirected graphical models as priors.The approach consisted of three steps: Step 1, Gaussian transformation; Step 2, distinct and common graph construction; and Step 3, prior matrices acquisition.Plenty of the partial correlation based methods have been working on the Gaussian distributed data due to their mathematical simplicity [11].Liu et al. [16] have proposed a nonparanormal transformation which relax the Gaussian assumption to continuous.Thus in Step 1, we apply this Gaussian transformation.For Step 2, we take into consideration the distinct and common structure for each group.The distinct graph estimation for each group is implemented through the ψ-learning method as in [11].To strengthen the similarities over various groups, we adopt the same Bayesian incorporating joint estimation method as in our previous study of brain connectivity development in adolescence [2].The similarities are highlighted through proper Bayesian priors and a meta-analysis procedure.Finally, we use the union of the distinct estimated graph and joint estimated graph as the prior graph for each group.

C. BiLiNGAM
In the literature, most existing methods have focused on estimating a single directed graphical model.For multiple DAG estimation, Wang et al. [10] have proposed a highdimensional joint estimation of multiple directed graphical models for Gaussian distributed data.In this paper, we propose a multiple DAG estimation for non-Gaussian data, with the name of Bayesian incorporated linear non-Gaussian acyclic model (BiLiNGAM) in the sense that we were enlightened by the idea of Bayesian prior and implemented a joint Bayesianincorporating estimation to acquire the priors.The procedure of the BiLiNGAM algorithm is summarized in Algorithm 1.

III. SIMULATION STUDIES
In this section, we analyzed the performance of the joint estimation of K different DAGs where we varied K ∈ {3, 5}.For all experiments, we set the number of nodes p = 200 and the total number of observations N = 750.For each group, we set the number of samples equally, i.e.
The random DAG G can be simulated through the R package pcalg and density of the graph is controlled by the edge probability d/(p − 1), where d is an edge degree parameter with values {1, 2, 5}.The true DAGs generation procedure is illustrated as follows.We first used the pcalg to generate G 1 .Given G 1 , we assigned uniformly random weights to the edges to obtain the weighted adjacency matrix . ., K, we followed the same random edge deleting-adding procedure in a sequential manner.We randomly removed 5% edges in a.For k = 1, 2, . . ., K, use the nonparanormal transformation [16] to render X k normal (Gaussian).
b. Apply the ψ-learning method [15] to each group k, k = 1, 2, . . ., K separately for distinct estimation and acquire the adjacency matrix E d,k .c. Apply the Bayesian incorporating joint estimation [2] to strengthen the similarities among the groups and acquire the E c,k , ∀k.
d. Extract the prior matrix We considered three methods for comparison, which are the PC algorithm [4], the ICA-LiNGAM [12] and the ψ-LiNGAM we proposed previously [9].The prominent PC ("Peter and Clark") algorithm [4] is a type of constraint-based methods, that first learn an undirected graph from conditional independence relationships, which is called as the skeleton of the directed graph, and then orient the edges.Studies [5], [17] have shown that the PC algorithm is good at the skeleton estimation but not the edge orientation.The PC and ICA-LiNGAM were implemented through the R package pcalg and the code for ψ-LiNGAM is available at https://github.com/Aiying0512/psi-LiNGAM.We set the significance level α = 0.05 with FDR correction for the PC, ψ-LiNGAM and BiLiNGAM.For each scenario, 10 datasets were simulated independently.We assessed the performances of the four methods through the true positive rate (TPR), false discovery rate (FDR), and structural hamming distance (SHD) [18].TPR and FDR are two common measures of binary classification.Let us define an experiment from P positive instances and N negative instances for some conditions.In our case, the positive instance represents a directed edge from one node to the other.The four outcomes are summarized in SHD is a frequently used metric based on the number of operations needed to transform the estimated DAG into the true graph [19].In simple terms, SHD counts the total number of edge insertions, deletions or flips during the transformation.Fig. 1 gives the results for K = 3, where each group has n = 250 samples and p = 200 nodes.We compared the results of all four methods mentioned above with various graph densities controlled by the degree parameter d.The results from Fig. 1 (left column) validate the poor edge orientation ability of the PC algorithm.Although the TPR curve of ICA-LiNGAM indicates a decent performance, the high FDR and SHD prove that the ICA-LiNGAM need a larger number of observations to converge to the true graph.As we can see, the two methods incorporating association networks as prior information (ψ-LiNGAM and BiLiNGAM) performs similarly well.This shows that when the sample size is sufficient (n > p), both methods incorporating association networks to estimate multiple DAGs can improve causal inferences equivalently.Further, we compared the two methods under high dimensional cases (i.e., n < p).Fig. 1 (right column) shows the results for K = 5, where each group has n = 150 samples.The FDR curves of the two methods maintain at the same low level.The TPR of BiLiNGAM is always higher than the ψ-LiNGAM's and the SHD performs the opposite.In addition, as the degree parameter d increases, the differences in TPR and SHD also increase.Therefore, under high dimensional settings, both methods remain at a low FDR level, but BiLiNGAM outperforms ψ-LiNGAM in terms of TPR and SHD.Overall, BiLiNGAM has maintained a stable and accurate performance over various settings.Particularly, under high dimensional cases, the performance of BiLiNGAM is superior.When the sample size is adequate (i.e.n > p), BiLiNGAM performs at least as good as ψ-LiNGAM.

A. Materials
The dataset we used is publicly available from the Philadelphia Neurodevelopmental Cohort (PNC).It consists of fMRI images from 855 individuals using an emotion identification task.All MRI scans were acquired on a single 3T Siemens TIM Trio whole-body scanner.During the task, each subject was asked to label emotions displayed which include happy, angry, sad, fearful and neutral faces.The total scan duration was 10.5 min.Blood oxygenation level-dependent (BOLD) fMRI was acquired using a whole-brain, single-shot, multislice, gradient-echo (GE) echoplanar (EPI) sequence of 124 volumes (372s) with the following parameters TR/TE=3000/32 ms, ip = 90 • , FOV= 192 × 192 mm, matrix = 64 × 64, slice thickness/gap=3 mm/0 mm.The resulting nominal voxel size was 3.0×3.0×3.0mm[20].Standard preprocessing steps were applied using SPM12, including motion correction, spatial normalization to standard MNI space, and spatial smoothing with a 3mm full width at half max (FWHM) Gaussian kernel.Then multiple regression considering the influence of motion was performed and the stimulus on-off contrast maps for each subject were obtained.Finally, 264 functionally defined regions of interest (ROIs) were extracted based on the Power parcellation [21].The age range of the participating subjects was between 8 and 22 years.Due to physical and cognitive changes [2], we divided them into five groups, each representing a stage related to adolescence (Table II) as shown in Fig. 2.

B. Results
We first conducted the Darling-Anderson test for non-Gaussianity and then applied BiLiNGAM.The causal brain connectivity of each group is shown in Fig 3 A. We summarized the networks from the aspects of modular development and significant ROIs (hubs).
1) Development of emotion-related intra-and inter-module connectivity: We examined intramodular and intermodular connectivity over the 5 adolescent groups.

2) Development of emotion-related hubs:
To gain more insights into the affective emotion circuits and their development with age, we analyzed hub nodes for each group.Here we define hubs as the nodes with degrees at least two standard deviation higher than the mean degrees [22].We identified two types of hubs: in-hubs and out-hubs, which are selected through the in-and out-degrees, respectively.The in-degree of a node i is defined as the number of directed edges that end at node i (i.e.→ i), and the out-degree of a node i is the number of directed edges that begin from node i (i.e.i →).The in-hubs can be treated as centers that receive information, while out-hubs are central nodes that convey out information.We identified hubs for each adolescent group, separately, and tracked their changes across groups.III give the in-hub development and the detailed in-hub information.The ROI at SMA.L has a preadolescence specific pattern.The ROIs at PQ.R and ACG.R have increased activities of receiving messages, especially, the ROI at ACG.R only starts to develop from middle adolescence.The remaining in-hubs have fluctuating trajectories.From Fig. 3D.b and Table IV, several out-hubs start to develop in a fluctuating manner after pre-adolescence, whose anatomical locations are at AMY.L, SFG.L (ROI 9, 10), PG.R, PQ.R, LG.R, Q.R, MCG.R, IFGT.R, VA.R, STG.R and MTG (ROI 23,24).Some group-specific patterns have also been detected: the ROIs at SFGM.R for pre-adolescent group; the ROIs at IFGT.R and MTG.L for early adolescent group; the ROIs at SFG.L, MCG.R, VA.R for middle adolescent group; the ROIs at MCG.L (ROI: 5, 8), STG.R for late adolescent group; the ROIs at AMY.L and SFG.R for post-adolescent group.
3) Discussion: Our analysis revealed the developmental trajectory of directed brain circuitry during emotion identification tasks over various adolescent groups.We found intra-and inter-modular development, and pinpointed emotion-related hubs as well as several group-specific patterns.Our findings provide a causation template of emotion processing in the developing brain.
Intra-modular development: We found a developmentally stable intra-modular activation anchored in the default mode (DMN), subcortical (SCN) and cerebellum (CERE) networks.The default mode network is important for mentalizing and inferring emotional states of others ( [23]); subcortical regions have a pivotal role in cognitive, affective, and social functions in humans [24]; the cerebellum contributes prominently to processing emotional facial expression [25].The intramodular activities of CERE increased significantly in the postadolescent group, which may emerge in late puberty [26].The significance of intra-modular activities of the SCN appeared from pre-adolescence to middle adolescence, which has proven to be an important developmental period for subcortical brain maturation [27].
Inter-modular development: As age increases, more intermodular connectivity emerges during emotion-related processing.Starting from early adolescence, inter-connections start to build among VAN, CON, SN and SSN.Specifically, SSN exhibits significant inter-connections of receiving information from other funtional network modules after middle adolescence.VAN plays an important role of conveying information in late adolescent group, and SN is crucial for receiving information in post-adolescent group.Two stable causal influences from CON to SSN, DMN to MRN become established after the late adolescent period.CON facilitates the maintenance of task-relevant goals and the incorporation of error information to adjust behaviors [28] and SSN (includes somatosensory cotex, motor regions and extends to the supplementary motor areas) is involved in performing and coordinating motor-related tasks like finger tapping.Gehringer et al. [29] proposed that maturation of the somatosensory system during adolescence contributes to the improved motor control.They further discovered that altered attenuation of the somatosensory cortical oscillations might be central to the under-developed somatosensory processing and motor performance characteristics in adolescents.Our results agreed with their conclusion and may provide a possible explanation.Besides, Sestieri et al. [30] confirmed that responses in DMN regions peaked sooner than non-DMN regions during memory retrieval, and the parietal regions of DMN directly supported memory retrieval.
Emotion-related hubs: As shown in Table III & IV, most identified hubs were located in the right hemisphere since it is dominant in the perception of facial expression and important for processing primary emotions [31].Particularly, the hubs at PG, MCG, PQ, MTG play central roles in socioemotional processing.The precentral gyrus (PG) of the somatosensory cortex is related to recognition of facial and vocal expressions of emotion and a main effect of emotional valence on brain activity has been found in the PG.R [32].A previous study [33] verified that the mid-cingulate gyrus (MCG) is a hub linking incoming affective information with brain regions involved in goal-directed behavior.We further discovered that it is also a hub for distributing affective information.Precuneus (PQ) activation has been implicated in emotional and memoryloaded processes.The current study suggests that the PQ may play a direct role in the regulation of amygdala reactivity to emotional stimuli [34], which explains its prominence as a out-hub location.Studies of emotional face recognition [35], [36] identified the middle temporal gyrus (MTG) as a primary neural substrate for suprathreshold processing of the emotional expression of faces, which is consistent with our result of MTG as a central node to pass out information.
Developmental hub structures and group-specific hub patterns: The majority of networks during development fluctuate, except for the steady increase of the in-hub activities at the PQ.R and ACG.R. Group-specific patterns have also been identified: the in-hub at SMA.L and out-hub at SFGM.R for pre-adolescence; the out-hubs at IFGT.R and MTG.L for early adolescence; the out-hubs at SFG.L, MCG.R, VA.R for middle adolescence; the out-hubs at MCG.L, STG.R for late adolescence; the out-hubs at AMY.L and SFG.R for post-adolescence.Some of our results have been previously supported in the literature.The role of developmental centers at PG, PQ, ACG, LING and PHIG remains consistent with our previous study of brain connectivity development in adolescence [2].In this study, we further pinpointed their specific functions in the emotion circuit through directed graphical models.Another study of brain development from adolescence to adulthood [37] also brought attention to age-related changes in the PQ.In [38], fluctuating trajectories in the MCG during adolescence were discovered.
The effect of sample size and robustness: We examined the robustness of our proposed method and assessed the effect of sample size on robustness for each group.We randomly drew m percent of N participants (m ∈ {20%, 50%, 80%, 90%}, N = 855) from the PNC dataset, sample without replacement and with the proportion of the 5 adolescent groups.We then applied the BiLiNGAM method with the subsamples.For each sample size, we repeated the procedure 10 times.In Fig. 4, we showed the mean number of edges detected for each adolescent group under various sample sizes.From preadolescence to post-adolescence, the trajectory of each sample remains similar.However, we found that a limited number of edges were identified with small sample size (m = 20%).As the sample size increases, the identified number of edges for each group becomes steady and the variance decreases significantly.

V. CONCLUSION
In this paper, we propose a multiple DAG estimation for non-Gaussian data, with the name of Bayesian incorporated linear non-Gaussian acyclic model (BiLiNGAM) in the sense that we were enlightened by the idea of Bayesian prior and implemented a joint Bayesian-incorporating estimation to acquire the priors.The main contributions of our work can be summarized as follows.First, from a mathematical perspective, BiLiNGAM was, to the best of our knowledge, the first method to jointly estimate related DAG models in the high-dimensional setting for non-Gaussian data.Second, our proposed method accomplished the integration of undirected graph with directed acyclic graph, in the sense that we incorporated the undirected graph estimation as prior information into the direct LiNGAM model to perform DAG construction.In other words, we use the undirected graphs to mitigate the irrelevant information to facilitate casual inferences, which speeds up numerical convergence and computation.Third, the simulation results in Section III show that BiLiNGAM can maintain a stable and accurate performance over various settings.In particular, the proposed BiLiNGAM is superior for high dimensional cases.Finally, the analysis of brain's emotion circuit development revealed the trajectory of directed brain circuitry during emotion identification tasks over various adolescent groups.We have found several significant intraand inter-modular networks that change over developmental stages, and pinpointed emotion-related hubs as well as various group-specific patterns.Our findings provide a causation template of emotion processing in the developing brain.

Fig. 1 :
Fig. 1: Simulation results of the mean TPR, FDR and SHD over various graph densities.The performances of 4 methods for K = 3 are on the left column (A, C, E).The comparisons between BiLiNGAM and ψ-LiNGAM for K = 5 are on the right column (B, D, F).

Fig. 2 :
Fig. 2: 12 functional network modules based on the 264 nodes from the template defined by Power et al. [21].
Fig 3B visualized the average directed edge degrees within and across modules, where the rows indicate the beginning of the arrows and the columns indicate the end of the arrows.From Fig 3B, the intra-module connectivity of DMN, SCN and CERE are strongly activated for all 5 groups.As age increases, there is an increasing intermodular connectivity.We then conducted hypergeometric tests based on the number of edges module wise, and significant intra-and inter-causal connections are shown in Fig 3C at significance level α = 0.05 with FDR correction.The intramodular connectivity of the CERE were significantly activated of all adolescent groups for the emotion identification task, while the role of the SCN was only significant until the middle adolescent period.In addition, we found substantial intra-connectivity of SCN in the early adolescent group and CON in the late adolescent group.From the aspect of intermodular connectivity, no significant causal flows were found in the pre-adolescent group, 1 each was identified for the early (VAN → CON) and middle (DAN → SSN) adolescent groups, 5 were identified for the late adolescent group (CON → SSN, VAN → SSN, VAN → AUD, DMN → MRN, DAN → CERE), and 4 causal flows were discovered in the postadolescent group (CON → SSN, DMN → MRN, CON → SN, FPN → SN).

Fig. 3 :
Fig. 3: Causal brain connectivity development from pre-adolescence to post-adolescence.The number index (1 to 5) corresponds to the age category in Table II.A, Axial views of emotion-related node-level causal networks, where the arrows indicate the causal flow.B, Heatmaps of the mean edge degrees, module-wise.C, Identified intra-(blue arrows) and inter-(yellow arrows) module causal flows.D, Development of emotion-related hubs (D.a: in-hubs, D.b: out-hubs) over various adolescent groups with detailed ROI information in Table III (for D.a) and IV (for D.b).

Fig. 3D.a and Table
Fig. 3D.a and TableIIIgive the in-hub development and the detailed in-hub information.The ROI at SMA.L has a preadolescence specific pattern.The ROIs at PQ.R and ACG.R have increased activities of receiving messages, especially, the ROI at ACG.R only starts to develop from middle adolescence.The remaining in-hubs have fluctuating trajectories.From Fig.3D.b and TableIV, several out-hubs start to develop in a fluctuating manner after pre-adolescence, whose anatomical locations are at AMY.L, SFG.L (ROI 9, 10), PG.R, PQ.R, LG.R, Q.R, MCG.R, IFGT.R, VA.R, STG.R and MTG(ROI  23,24).Some group-specific patterns have also been detected: the ROIs at SFGM.R for pre-adolescent group; the ROIs at IFGT.R and MTG.L for early adolescent group; the ROIs at SFG.L, MCG.R, VA.R for middle adolescent group; the ROIs at MCG.L (ROI: 5, 8), STG.R for late adolescent group; the ROIs at AMY.L and SFG.R for post-adolescent group.3)Discussion: Our analysis revealed the developmental trajectory of directed brain circuitry during emotion identification tasks over various adolescent groups.We found intra-and inter-modular development, and pinpointed emotion-related hubs as well as several group-specific patterns.Our findings provide a causation template of emotion processing in the developing brain.Intra-modular development: We found a developmentally stable intra-modular activation anchored in the default mode

TABLE I .
The definitions of TPR and FDR are given as follows:

TABLE II :
Group division information.

TABLE III :
Anatomical location, functional network module and MNI coordinates of the identified in-hub ROIs.
* The ROI index corresponds to the row order from Fig.3 D.a.

TABLE IV :
Anatomical location, functional network module and MNI coordinates of the identified out-hub ROIs.
* The ROI index corresponds to the row order from Fig.3D.b.