Few sex differences in regional gray matter volume growth trajectories across early childhood

Abstract Sex-specific developmental differences in brain structure have been documented in older children and adolescents, with females generally showing smaller overall brain volumes and earlier peak ages than males. However, sex differences in gray matter structural development in early childhood are less studied. We characterized sex-specific trajectories of gray matter volume development in children aged 2–8 years. We acquired anatomical magnetic resonance imaging (MRI) of the brain at the Alberta Children's Hospital in 123 typically developing children. Most children were scanned multiple times, for a total of 393 scans (mean = 3.2 scans/subject). We segmented T1-weighted structural MRI with MaCRUISE to define 116 regions and measured both absolute volumes (mm³) and proportional volumes (percent of intracranial volume). We characterized growth trajectories of gray matter volume for these brain regions between 2 and 8 years using mixed-effects models, showing volume increases, with most posterior and temporo-parietal regions peaking before 8 years. We found widespread main effects of sex, with males having larger volumes in 86% of brain regions. However, there were no significant sex differences in trajectories (age or age² terms) for absolute volume. Proportional volumes of the right occipital fusiform gyrus and left medial postcentral gyrus showed significant age-by-sex interactions where females had steeper volume decreases than males. This study also confirms regional patterns observed in previous studies of older children, such as posterior-to-anterior timing of brain maturation. These results provide a comprehensive picture of gray matter volume development across early childhood, and suggest that sex differences do not emerge until later in development.


INTRODUCTION
Early childhood is a critical period of brain and cognitive development (Thomas & Johnson, 2008). Cross-sectional and longitudinal neuroimaging evidence in healthy humans has shown that from birth through adulthood, changes in whole-brain gray matter volume follow a non-linear trajectory with rapid increases in early life followed by gradual decreases during adolescence (Bethlehem et al., 2022; Brown et al., 2012; Lenroot et al., 2007; Mills et al., 2016; Narvacan et al., 2017; Remer et al., 2017; Rutherford et al., 2022). Increases in gray matter volume observed via magnetic resonance imaging (MRI) are thought to reflect biological processes such as neural proliferation, dendritic spine arborization, synaptogenesis, gliogenesis and maturation, maturation of the neuropil, and gyrification, while decreases are historically thought to reflect synaptic pruning and myelination, resulting in a change in the gray to white matter ratio (Deoni et al., 2015; Giedd & Rapoport, 2010; Raznahan et al., 2011; Remer et al., 2017). Gray matter maturation is spatiotemporally variable: parietal and inferior regions of the cortex reach peak volumes slightly earlier than frontal, temporal, superior, and subcortical regions (Bethlehem et al., 2022; Lenroot & Giedd, 2006; Lenroot et al., 2007; Tanaka et al., 2012; Uematsu et al., 2012; Zhang et al., 2021). However, only a few neuroimaging studies have been conducted spanning early childhood, which is a critical development period. Furthermore, there is a lack of evidence on brain development in early childhood from longitudinal studies, which are necessary for capturing true development trajectories (Di Biase et al., 2023). A more thorough understanding of typical gray matter development patterns in early childhood would help provide a basis for identifying atypical trajectories.
Sex differences in brain volumes have been widely reported in previous research (Cosgrove et al., 2007; De Bellis et al., 2001; DeCasien et al., 2022; Gennatas et al., 2017; Giedd et al., 2012; Goldstein et al., 2001; Gur et al., 2002; Kaczkurkin et al., 2019; Lenroot & Giedd, 2010; Lenroot et al., 2007; Lotze et al., 2019; Sanchis-Segura et al., 2019; Tanaka et al., 2012), which robustly show that males, on average, have larger gray matter, white matter, and total brain volumes than females across the lifespan. Previous research has also indicated that males have a more protracted development trajectory than females across late childhood and adolescence (De Bellis et al., 2001; Giedd et al., 2009; Lenroot et al., 2007; Tanaka et al., 2012). Notably, studies that examined gray matter volume as a proportion of total brain or intracranial volume suggest that females have proportionally more gray matter than males (Goldstein et al., 2001; Gur et al., 2002). Given the high degree of correlation between sex and total brain volume, both raw volume and normalized gray matter volume should be considered when testing for developmental sex differences (Giedd et al., 2012; Mills et al., 2016). While the presence of sex differences in brain structure during adolescence and adulthood is well established, it remains unclear whether those sex differences are already present in early childhood. Examining the timing with which brain sex differences emerge may help explain mechanisms underlying sex-differential prevalence rates for mental health and neurodevelopmental disorders (Copeland et al., 2014; Gogos et al., 2019; Halladay et al., 2015; Mathes et al., 2019; Maughan et al., 2013; Mazure & Swendsen, 2016; Merikangas & Almasy, 2020; Ramtekkar et al., 2010; Rutter et al., 2003; Smink et al., 2012).
The present study aimed to characterize longitudinal changes in regional gray matter volume across early childhood (ages 2-8 years) and compare development trajectories between males and females. We hypothesized that most group-level regional volume trajectories would show a quadratic increase in volume over time with a slowing of growth towards 8 years (inverted-U shaped trajectory). Additionally, we hypothesized that there would be significant effects of sex, both main effects (with males having larger total volumes) and interactions (with males having slower development than females). Regarding proportional volume, we hypothesized that global and regional gray matter volume would decrease over time and that sex effects would diminish when normalizing gray matter volume as a percent of total intracranial volume. While it is increasingly common to report additional gray matter measurements such as cortical thickness and surface area, which are more closely related to cytoarchitectural changes than gray matter volume alone, this manuscript focuses on gray matter volume for two reasons. First, we report development trajectories for both cortical and subcortical regions, and for the latter only volume can be reliably reported. Second, volume has been the most widely reported gray matter metric in previous studies, and thus reporting volume in this novel sample simplifies connecting our results to previous research.

Participants and longitudinal MRI acquisition
This study was approved by The University of Calgary Conjoint Health Research Ethics Board (CHREB; REB13-0020). The present sample included 123 children from the prospective longitudinal Calgary Preschool MRI Dataset (Reynolds et al., 2020). The cohort consists of healthy children with no physical, neurological, or psychiatric diagnoses who were born at >35 weeks gestation. Children were recruited to the study between ages 2 and 5 years and completed MRI scans at 6-12 month intervals (N = 393 scans, ~3.2 scans/subject; Fig. 1) for a full age range of 1.97-8.04 years. Ninety percent of parents were married/common law, 84% of families identified as White, 87% of mothers had an undergraduate degree or more, and 70% of family incomes were at or above the median income for the city of Calgary, Alberta, Canada (Table 1). MRI data were collected at the Alberta Children's Hospital on a research-dedicated 3 T GE MR750w MRI scanner with a 32-channel head coil. The MRI protocol included T1-weighted anatomical images (FSPGR BRAVO with 0.9 × 0.9 × 0.9 mm resolution, 210 axial slices, TR = 8.23 ms, TE = 3.76 ms, flip angle = 12 degrees, matrix size = 512 × 512, and inversion time = 540 ms). During T1 acquisition, children were awake viewing a movie of their choice or sleeping naturally.

MRI processing, segmentation, and quality control
Images were initially assessed for quality at the scanner during acquisition; sequences were repeated if necessary and if time permitted. Images were also examined for motion and those with major motion artifacts were excluded. During processing, N4 bias-corrected images (Cox, 1996; Cox & Hyde, 1997) were resampled to a voxel size of 1 mm³ in preparation for multi-atlas segmentation combined with cortical reconstruction using implicit surface evolution (MaCRUISE; Huo, Carass, et al., 2016; Huo, Plassard, et al., 2016; Huo et al., 2017). MaCRUISE integrates the processes of cortical reconstruction and multi-atlas segmentation to produce reliable and consistent cortical surface parcellations in anatomical agreement with brain segmentations (Huo, Carass, et al., 2016; Huo, Plassard, et al., 2016; Huo et al., 2017).
MaCRUISE has systematically been shown to reduce spatial inconsistencies between volumetric segmentation and surface parcellations in aging adult populations as compared to FreeSurfer (Huo, Plassard, et al., 2016). We observed similarly incorrect surfaces in our pediatric data when processed with FreeSurfer and subsequently decided to process all T1-weighted images used in this study with MaCRUISE. In the MaCRUISE pipeline (Huo, Plassard, et al., 2016), skull- and dura-stripped images are subject to both multi-atlas segmentation of 132 regions (Asman & Landman, 2012, 2013; Bermudez et al., 2020; Klein et al., 2010) and TOpology-preserving Anatomical Segmentation (TOADS) fuzzy membership segmentation (Bazin & Pham, 2008). MaCRUISE then fuses the rigid multi-atlas and TOADS segmentations, resulting in a full cerebrum segmentation comprising gray matter and white matter regions. To achieve a cortical reconstruction consistent with the segmentations, MaCRUISE applies multi-atlas anatomically consistent gray matter enhancement (MaACE; Han & Fischl, 2007; Sethian, 1999) to the gray matter component while applying a topology correction to the white matter component (Han et al., 2001, 2002). These refined gray and white matter segmentations form the outer and inner surfaces of the reconstructed cortex, respectively. Lastly, to resolve any remaining disagreements between the multi-atlas segmentation and reconstructed surfaces, MaCRUISE refines boundaries in the multi-atlas segmentation using the inner and outer cortical surfaces (Huo, Plassard, et al., 2016). While MaCRUISE is a surface-based volume pipeline (rather than voxel-wise), vertex-wise measurements for cortical thickness and surface area are not readily available as outputs from this pipeline and require further processing, which is outside the scope of the present study. We extracted the refined segmentations (in mm³) for analyses of absolute total gray matter volume and absolute regional volume. We additionally computed proportional volume by dividing each region's volume by total intracranial volume (ICV) and multiplying by 100, resulting in the regional volume expressed as a percent of ICV.
Trained raters checked each individual segmentation image for accuracy, overlaid in each scan's native T1 space, and assigned it a quality score of 1 (poor), 2 (unsatisfactory), 3 (satisfactory), or 4 (excellent). Segmentations with a quality score < 3 were manually edited and reintroduced to the MaCRUISE pipeline at the segmentation fusion step, in place of the original rigid multi-atlas segmentation. Edited segmentation outputs were reassessed for quality and included in the analysis if the resulting segmentation obtained a quality score of 3 or 4. Of the 452 successfully acquired T1-weighted images considered for this study, 393 (87%) had sufficient quality of both the T1-weighted image and resulting MaCRUISE segmentation. Of the excluded scans, 20 were the subject's first session, 15 were the subject's second session, 3 were the third, 4 were the fourth, 8 were the fifth, 4 were the sixth, and the remaining 4 scans were from the seventh timepoint or later. Only 1 subject was completely excluded from the study due to quality concerns. Spaghetti plots of raw volume measurements over time were visually inspected to ensure plausibility of intra-individual change in volume between scans.
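The proportional-volume calculation and the final quality-score inclusion rule described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the dictionary layout, and all numeric values are hypothetical, and the manual editing and re-fusion step is not modeled (the filter only reflects the final rule that a segmentation is analyzed when its score is 3 or 4).

```python
def percent_of_icv(region_volume_mm3, icv_mm3):
    """Express a regional volume as a percent of total intracranial volume (ICV)."""
    if icv_mm3 <= 0:
        raise ValueError("ICV must be positive")
    return 100.0 * region_volume_mm3 / icv_mm3


def passes_qc(quality_score):
    """Final inclusion rule from the text: keep scores of 3 (satisfactory) or 4 (excellent)."""
    return quality_score >= 3


# Hypothetical scans; the values are made up for illustration only.
scans = [
    {"scan_id": "s1", "quality": 4, "icv_mm3": 1_300_000.0, "region_mm3": 6_500.0},
    {"scan_id": "s2", "quality": 2, "icv_mm3": 1_250_000.0, "region_mm3": 6_000.0},
]

# Keep only scans meeting the quality threshold, then normalize by ICV.
retained = [s for s in scans if passes_qc(s["quality"])]
proportional = {
    s["scan_id"]: percent_of_icv(s["region_mm3"], s["icv_mm3"]) for s in retained
}
```

Normalizing per scan, rather than per subject, keeps each longitudinal measurement paired with the ICV measured in the same session.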

Fitting gray matter volume developmental trajectories
To determine the best fitting polynomial trajectory of development, we successively fit and compared nested mixed-effects models of increasing complexity with subjects allowed a random intercept (R package lme4; Bates et al., 2015). We compared the following three models (null, linear, and quadratic) for each region:
$$Y_{ij} = B_{0i} + \varepsilon_{ij}$$
$$Y_{ij} = B_{0i} + B_1 x_{ij} + \varepsilon_{ij}$$
$$Y_{ij} = B_{0i} + B_1 x_{ij} + B_2 x_{ij}^2 + \varepsilon_{ij}$$
where $Y_{ij}$ = $j$th volume measurement for the $i$th subject, $x_{ij}$ = the subject's age at time of scan, $B_1$ = coefficient for age, $B_2$ = coefficient for age², $B_{0i}$ = subject-specific y-intercept, and $\varepsilon_{ij}$ = random error.
We determined the best fitting model for each brain region by selecting the one with the lowest Akaike Information Criterion (AIC; Sakamoto et al., 1986) and Bayesian Information Criterion (BIC; Neath & Cavanaugh, 2012), two loss metrics for model selection which also account for model complexity (Cox & Vladescu, 2023). While both AIC and BIC penalize a model for every additional parameter to reduce the risk of over-fitting, the penalty in BIC is greater. In this study, with a limited set of models compared at once, we deemed the risk of over-fitting to be low; in cases where AIC and BIC disagreed, we preferred the model selected by AIC. AIC and BIC selected the same model in ~75% of regions. Additionally, we calculated the Akaike weight for each model tested. The Akaike weight is a probability between 0 and 1 that a given model is the best at minimizing Kullback-Leibler discrepancy among a set of models (Wagenmakers & Farrell, 2004). Akaike weights provide a means to compare the strength of evidence in favor of one model, as compared to another. We fit trajectory models for absolute total gray matter and absolute regional volumes (mm³) as well as for proportional total and regional gray matter volume expressed as a percent of ICV. For each best fitting model, we extracted the percent change in volume from baseline to peak, age at peak volume, and predicted volume at ages 2 and 8.5 years. We additionally determined partial eta squared (ηp²) for all age effects to examine the magnitude of effect that age/time has on developmental changes in gray matter volume. ηp² characterizes the proportion of variance in the dependent variable explained by an independent variable while accounting for other covariates in a model. Magnitude of effect size conveyed by ηp² is typically benchmarked as follows: small = 0.0099, medium = 0.0588, and large = 0.1379 (Cohen, 2013). Regions with non-null trajectories were retained for further analysis.
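As an illustration of how Akaike weights turn a set of AIC values into relative evidence, the sketch below applies the standard formula (each model's weight is exp(-Δ/2) normalized over the candidate set, where Δ is the AIC difference from the best model). The function names and the AIC values are hypothetical and chosen only for demonstration; they are not taken from this study.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC values for a set of candidate models into Akaike weights."""
    best = min(aic_values)
    # Relative likelihood of each model given its AIC difference from the best model.
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


def evidence_ratio(weight_a, weight_b):
    """How many times likelier model A is than model B to be the best of the set."""
    return weight_a / weight_b


# Hypothetical AIC values for the null, linear, and quadratic models of one region.
model_names = ["null", "linear", "quadratic"]
aics = [1004.0, 1001.4, 1000.0]

weights = akaike_weights(aics)
best_model = model_names[weights.index(max(weights))]
ratio_quadratic_vs_null = evidence_ratio(weights[2], weights[0])
```

An evidence ratio computed this way is the kind of quantity behind statements such as one model being "twice as likely" as another to be the best fitting model.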

Testing for effects of sex
In a second step, using the best fitting non-null regional trajectories described above (i.e., final model was linear or quadratic), we created mixed-effects models to characterize the effects of child sex on gray matter development.
$$Y_{ij} = B_{0i} + B_1 x_{ij} + B_2 x_{ij}^2 + B_3 S_{ij} + B_4 x_{ij} S_{ij} + B_5 x_{ij}^2 S_{ij} + \varepsilon_{ij}$$
where $Y_{ij}$ = $j$th volume measurement for the $i$th subject, $x_{ij}$ = the subject's age at time of scan, $S_{ij}$ = subject sex (male = 0, female = 1), $B_1$ = coefficient for age, $B_2$ = coefficient for age², $B_{0i}$ = subject-specific y-intercept, $B_3$ = coefficient for sex main effect, $B_4$ = coefficient for age-by-sex interaction, $B_5$ = coefficient for age²-by-sex interaction, and $\varepsilon_{ij}$ = random error.
To correct for multiple comparisons, we grouped p-values for like model components (i.e., all p-values for age-by-sex interaction term, all p-values for age²-by-sex interaction term, and all p-values for main effect of sex) and applied a false discovery rate (FDR; Benjamini & Hochberg, 1995) correction to each group of tests (absolute and proportional volume, main effect or interaction of sex) with a threshold of q < 0.05. When full models including age- or age²-by-sex interactions showed no significant effects of sex (q > 0.05), models were reduced (interactions removed, main effect of sex retained) and rerun. We examined residual distribution plots for all reported models to confirm adequacy of model fit.
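The grouped FDR correction described above can be sketched as a Benjamini-Hochberg adjustment applied separately to each family of like model terms. This is a minimal standard-library illustration rather than the authors' implementation; the group labels and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical p-values, grouped by model term as described in the text.
p_by_term = {
    "sex_main_effect": [0.01, 0.04, 0.03, 0.20],
    "age_by_sex": [0.30, 0.02, 0.60],
}

# Correct each group of like terms separately, then flag q < 0.05.
q_by_term = {term: benjamini_hochberg(ps) for term, ps in p_by_term.items()}
significant = {term: [q < 0.05 for q in qs] for term, qs in q_by_term.items()}
```

Correcting within each family of terms means the number of tests in the adjustment equals the number of regions tested for that term, not the total number of coefficients across all models.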

Total gray matter volume development
Absolute total gray matter volume had an inverted U-shaped quadratic trajectory, increasing non-linearly by 8% between 2 and 6.85 years, or between 1% and 4% annually (Tables 2, S1; Fig. 2). Proportional gray matter volume was 57.7% of ICV at age 2 and 55.8% of ICV at age 8 years. The trajectory followed a slightly inverted U-shape with a peak at 3.67 years (Tables 2, S1; Fig. 2). Males had significantly larger absolute total gray matter volume than females (Table S2; Fig. 2), although this effect disappeared when normalizing by ICV (Table S3; Fig. 2). This pattern was confirmed by evaluation of the AIC values and weights: the best fitting trajectory model for absolute total gray matter volume included a main effect of sex, and evidence strongly favored this model (AIC = 9281.99, w = 0.84), whereas proportional volume was best fit by a trajectory including no effects of sex (AIC = 1435.26, w = 0.64). We found no significant sex differences in trajectory shape for either absolute or proportional total gray matter volume (Tables S2, S3; Fig. 2).

Regional development trajectories for absolute volume (mm³)
The best fitting model (i.e., the lowest AIC) was quadratic for 88 of 116 regions (76%), linear for 20 regions (17%), and null for 8 regions (7%); see Tables 2, S1. In regions where the null trajectory was the best fit, null models were about twice as likely to be the best fitting model as the next-best fitting age trajectory model.
We generally observed earlier peaks in posterior regions and later peaks in anterior regions (Fig. 3; Table 2). The age at peak volume for quadratic trajectories ranged from 4.03 (left cuneus) to 7.96 years (right anterior cingulate gyrus). Most regions showed larger volumes at 8 years than 2 years, though rates of development differed (Fig. 4).

Proportional regional volume development (%ICV)
The best fit for 62 of 116 regions (53%) was the quadratic model, 30 regions (26%) were best fit by a linear model, and 24 regions (21%) were best fit by a null model (Table S2). In regions where the null trajectory was the best fit, null models were about twice as likely to be the best fitting model as the next-best fitting age trajectory model. 54% of regions decreased in proportional volume between 2 and 8 years, while 25% showed an increase (Fig. 4). The medial postcentral gyrus had the largest proportional decrease (right: -24%, ηp²(age) = 0.06; left: -22%, ηp²(age) = 0.06), while the left occipital pole had the largest proportional increase (21%, ηp²(age) = 0.02). Additional regions with strong age effects included the right ventral diencephalon (10.32% increase, ηp²(age) = 0.32) and the cuneus (right: 14.04% decrease, ηp²(age) = 0.22; left: 16.07% decrease, ηp²(age) = 0.25; Table 2).

Effects of sex
No regions had significant sex interactions for absolute volume. In reduced models (sex interactions removed), 86% of regions had a significant main effect of sex (Table S2); in all regions, males had larger gray matter volume than females. Most sex main effects were medium (ηp² > 0.0588) to large (ηp² > 0.1379; Cohen, 2013). In 89 of the regions best fit by a sex main effects model, Akaike weights indicated that a model with sex main effects was at least twice as likely to be the best fitting model as a model with no sex effects. In 66 of those regions, models with a main effect of sex were at least 50 times likelier to be the best fitting model than a model without a sex main effect.
For proportional volume, two regions showed significant sex interactions (development trajectory differences by sex). The left medial postcentral gyrus had a significant main effect of sex (q = 0.01, ηp² = 0.049) as well as significant sex-by-age (q < 0.04, ηp² = 0.039) and sex-by-age² (q = 0.048, ηp² = 0.034) interactions (Fig. 5, Table 3). Females showed a U-shaped trajectory with a rapid decrease from age ~2-6 years, while males showed a gradual, linear decrease. The interaction model was 133 times likelier than the main effects only model to be the best fit given the data, and 74 times likelier to be the best fitting model than the model with no sex effects. In the right occipital fusiform gyrus, there was a significant sex-by-age interaction (q < 0.04, ηp² = 0.034; Table 3), with females' volume decreasing more rapidly than males'. The interaction model was 225 times likelier to be the best-fitting model than the main effects only model and 1312 times likelier than the model with no sex effects, given the present data.
In reduced models, 15% of regions had a main effect of sex on proportional volumes (Table S3). Of those regions, females had larger proportional volume in 10 of them: bilaterally in the caudate, pallidum, and thalamus, the right ventral DC, the left hippocampus, postcentral gyrus, and gyrus rectus. Males had proportionally larger volume in four regions: the left anterior cingulate cortex, inferior occipital gyrus, and middle cingulate gyrus, as well as the right posterior insula. Evidence in favor of a model with main effects of sex over a model without sex effects was particularly strong for the bilateral caudate and pallidum, and the left inferior occipital gyrus and ventral diencephalon, whose sex main effects models were more than 400 times likelier to be the best-fitting model than models with no sex effects.
Fig. 3. Age at peak absolute volume (left) and proportional volume (right). For most regions, absolute volume reaches a peak around age 6 or later, consistent with our finding that absolute volumes primarily increase between ages 2 and 8 years. On the other hand, proportional volume was largest for most regions before age 5.5, consistent with the finding that proportional volume shows a net decrease between ages 2 and 8 years. For regions with a linearly decreasing trajectory, peak age is coded as 3 years and linearly increasing trajectories are coded as 7.5 years. Regions with a null trajectory are shown in gray.

DISCUSSION
In this study, using longitudinal sampling, we provide a detailed characterization of typical gray matter volume development between 2 and 8 years across 116 gray matter regions. The three key results are: 1) absolute gray matter volume development shows a spatiotemporal pattern where posterior regions typically reach peak volumes earlier than anterior regions; 2) absolute regional volumes generally increase between 2 and 8 years, while proportional volumes are stable or decrease; and 3) there are few sex differences in development trajectories.

Spatiotemporal development pattern
Our finding that posterior regions show absolute volume decreases earlier than anterior regions is broadly consistent with previous literature (Brown & Jernigan, 2012; Lenroot & Giedd, 2006; Lenroot et al., 2007; Tanaka et al., 2012; Uematsu et al., 2012), which used wider age ranges with relatively few children under 8 years who were scanned longitudinally. Thus, our study confirms the presence of an overall posterior-to-anterior developmental pattern in a sample of younger children, complementing what is already known about human neurodevelopment (Ouyang et al., 2019).
In contrast to prior studies, however, most regions in our study reached peak volume between 4 and 7 years, with global gray matter volume peaking around 7 years. Previous longitudinal studies showed peak gray matter volumes between ages 9 and 11 years for frontal gray matter, around 8 years for parietal gray matter, and between 10 and 11.5 years for temporal gray matter (Lenroot et al., 2007; Tanaka et al., 2012). Our observations are more similar to prior findings in occipital regions, which peaked around 7 years (Lenroot et al., 2007). A likely reason for the differences in peak ages identified between studies is the difference in age range from one sample to another. For example, the oldest child included in our sample was 8.04 years old, whereas Lenroot et al. (2007) included participants up to 23 years. It is well documented that fitting and interpreting quadratic models for biological data is highly influenced by the age range of the sample (Fjell et al., 2010), and our finding of an earlier peak may be attributable to model fit artifacts of a parabola within the range of ages we sampled. Additionally, quality control procedures for structural neuroimaging data can affect the outcome of the analysis (Ducharme et al., 2016). More recently available quality control methods may also be more stringent, therefore shifting the shape of the best fitting model towards earlier peaks. When development follows a parabolic trajectory, the slowest development occurs near the peak, which is what we observed in the present study for most regions. While the peaks we found do not precisely concur with findings from previous research including older children and adolescents, our study and previous studies both indicate that developmental changes are relatively slow around ages 7-8 years (Brown & Jernigan, 2012; Lenroot et al., 2007). While the nominal age at peak volume differs between the present study and others, a similar conclusion can be drawn from all of them: developmental changes in gray matter volume occur more slowly during middle to late childhood as absolute gray matter volume approaches a peak, compared to earlier or later childhood.
While a quadratic trajectory was the best fit for most regions for both absolute (mm³) and proportional volume (%ICV), roughly one-fifth of regions followed a linear trajectory. A quadratic fit indicates that the rate of change varies over the course of development from 2-8 years (i.e., development is faster at some ages and slower at others). For quadratic trajectories whose predicted peak falls within the age range, it also indicates a brief developmental period where volume is relatively stable. On the other hand, a linear fit indicates that the rate of change in volume is relatively constant and does not change direction during 2-8 years. It may also indicate that the region is not imminently approaching peak volume during this age range, as the rate of development is not slowing as would be expected nearing a peak.
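To make the interpretation of linear and quadratic fits concrete, the sketch below locates the vertex of a fitted quadratic (at age = -B1 / (2·B2)), checks whether a peak falls inside the sampled age range, and computes the percent change from baseline to peak. It is an illustrative sketch only: the function names are hypothetical, the coefficients are made up, and the fixed-effect coefficients are assumed to come from a model of the form B0 + B1·age + B2·age².

```python
def predicted_volume(b0, b1, b2, age):
    """Group-level predicted volume from fixed-effect coefficients."""
    return b0 + b1 * age + b2 * age ** 2


def summarize_trajectory(b0, b1, b2, age_min=2.0, age_max=8.0):
    """Describe trajectory shape, peak age, and percent change from baseline to peak."""
    if b2 == 0:
        # Constant rate of change: no turning point anywhere.
        return {"shape": "linear", "peak_age": None,
                "peak_in_range": False, "pct_change_to_peak": None}
    vertex_age = -b1 / (2.0 * b2)            # age where the slope is zero
    is_peak = b2 < 0                         # negative curvature gives an inverted U
    in_range = is_peak and age_min <= vertex_age <= age_max
    pct_change = None
    if in_range:
        baseline = predicted_volume(b0, b1, b2, age_min)
        peak = predicted_volume(b0, b1, b2, vertex_age)
        pct_change = 100.0 * (peak - baseline) / baseline
    return {"shape": "inverted-U" if is_peak else "U-shaped",
            "peak_age": vertex_age if is_peak else None,
            "peak_in_range": in_range,
            "pct_change_to_peak": pct_change}


# Hypothetical coefficients (arbitrary units), not estimates from this study.
quadratic_example = summarize_trajectory(b0=100.0, b1=12.0, b2=-1.0)
linear_example = summarize_trajectory(b0=100.0, b1=2.0, b2=0.0)
```

Because the slope of a parabola is smallest near its vertex, a peak inside the sampled range implies slow change around that age, whereas a vertex far outside the range behaves almost linearly within it.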
We detected earlier volume decreases in sensory regions compared to other brain areas. For example, the postcentral gyrus, the location of the somatosensory cortex (DiGuiseppi & Tadi, 2023), showed volume decreases from 2 years onward, while adjacent regions showed rapid increases in volume. The ability to detect this regional heterogeneity was made possible by the regional specificity of the atlas used in our segmentation pipeline (Huo, Carass, et al., 2016; Klein et al., 2010). The spatiotemporal pattern observed in this study is in concordance with previous studies that characterized cortical gray matter development in the preschool years with similar or better regional specificity (Brown & Jernigan, 2012; Deoni et al., 2015; Remer et al., 2017), although these studies did not use longitudinal data. Our longitudinal analysis confirms that sensory regions show some of the earliest volume decreases, likely in conjunction with early functional/behavioral development of sensory systems.
A limited number of absolute volume regions (8/116) throughout the brain were best fit by a null trajectory. The null fit could have one of several meanings. First, a null fit could indicate that change in volume between 2 and 8 years is very small and/or slow, resulting in the appearance of a null trajectory. Alternatively, if the direction and magnitude of volume change vary considerably between individuals, then a null model may be preferred by the AIC. In our study, AIC weights indicated that null fits were roughly twice as likely to be the best fitting model as the next-best fitting model, which bolsters our confidence in these null trajectories. However, it does not rule out the contribution of intra-individual variability to model selection.
Roughly 20% of regions were best fit by a null model when volume was expressed as a proportion of total ICV. Specifically, this indicates that the region remains the same percent of total ICV throughout the age range. In other words, these regions appear to be increasing in volume at roughly the same rate as ICV. Relatively large regions, such as the right inferior temporal gyrus, left middle frontal gyrus, and right and left superior frontal gyri, make up a substantial portion of total ICV and are thus large contributors to the total ICV trajectory.
An additional type of spatiotemporal variance observed in our study was medium to large age effects in most subcortical structures but not in most frontal cortical regions, particularly superior frontal regions. These larger effect sizes for age in subcortical structures indicate that the variance of the age effect is similar to the total variance of the model. In other words, changes in subcortical volumes are well explained by changes in age in the 2-8 year developmental period, and less so in superior frontal regions. The disconnect in the developmental pattern between subcortical and superior frontal regions observed here may offer support for the developmental mismatch hypothesis, wherein a mismatch in developmental timing of maturation in subcortical and prefrontal structures is related to developmentally appropriate behaviour (Mills et al., 2021). While this has predominantly been described in adolescents with regard to amygdala-prefrontal cortical development and risk-taking behavior, it stands to reason that a mismatch in frontal and subcortical development in young children may be related to rapidly developing executive functioning, and the networks underlying the development of a central executive (Clark et al., 2021; McKenna et al., 2017; Nachshon et al., 2020).

Absolute increases, proportional decreases
Absolute total gray matter volume showed a non-linear increase during early childhood, with rapid increases early, followed by slower increases, a peak, and slight decreases approaching 8 years. Research in infants has indicated large (100-150%) increases in total gray matter volume over the first year and an additional 14-18% increase over the second year (Fukami-Gartner et al., 2023; Gilmore et al., 2012; Kelly et al., 2023; Knickmeyer et al., 2008; Peterson et al., 2021). Here, we showed moderate increases until approximately 6.5 years (1-4% increase annually), followed by modest decreases (<1% decrease annually).
Rates of volume change were regionally heterogeneous and varied by age; some regions showed large (up to 12%) increases in volume early on while other regions concurrently showed large volume decreases (up to 7%). The regional heterogeneity we observed highlights the importance of using smaller regional units to describe developmental volume changes. Because some regions decrease while others increase, looking only at total gray matter volume dilutes developmental effects and is less informative than detailed vertex-, voxel-, or region-wise approaches.
Total proportional gray matter volume decreased between 2 and 8 years of age, as did the majority of individual regions when expressed as a percent of ICV. It has previously been understood that proportional decreases reflect concurrent volume increases in other tissue types, such as white matter; myelination of axons is ongoing during childhood (Williamson & Lyons, 2018), and an increase in myelinated axons could change the relative proportion of gray matter (Houdé & Borst, 2022; Natu et al., 2019; Sherin & Bartzokis, 2011; Walhovd et al., 2017). However, emerging evidence suggests that decreases in gray matter volume are not spatiotemporally synchronous with increases in white matter volume, calling the exact nature of this relationship between gray and white matter into question (Chad & Lebel, in review). Nevertheless, our study indicates that while absolute gray matter volume is broadly increasing in early childhood, the proportion of gray matter to ICV is broadly decreasing, suggesting that at least one other tissue type contributing to ICV is increasing in volume more rapidly than gray matter.

Few sex differences in trajectories
Our finding of larger absolute volume in males compared to females is consistent with the consensus across previous studies (see DeCasien et al., 2022; Kaczkurkin et al., 2019; Ruigrok et al., 2014 for an overview of relevant studies). The main effects of sex were tempered when volume was normalized by ICV, which is not surprising given the known relationship between sex and ICV.
Notably, none of the regional volumes expressed in absolute terms (mm³) had even nominally significant age-by-sex or age²-by-sex interaction effects. For regional volume expressed as percent of ICV, only 2 regional volumes had significant age-by-sex interactions after FDR correction. An additional 5 regions expressed as percent ICV had nominally significant age-by-sex interactions. As these 7 regions represent only 6% of the regions we examined, our conclusion remains that few sex differences in development trajectories are present between ages 2 and 8 years. Previous research in neonates suggested that growth rates for total brain volume differ by sex, with male brain volume increasing more rapidly than female brain volume (Holland et al., 2014), resulting in male brain volume being larger than that of females. In samples of children spanning the typical pubertal period, it has similarly been found that rates of regional gray matter volume development differ between males and females over the age of 8; Gennatas et al. (2017) found that caudate, frontal, and temporal gray matter volumes increase more rapidly in males until about age 15 years, after which rates of volume change become more similar for males and females. Additionally, they observed that rates of thalamic and insular volume change were consistently larger in males than females through age 23 years, while parietal and occipital volume change was initially larger in males before females showed increased rates of change after age 15. Only in the putamen did the study find that females had a faster rate of volume change than males (Gennatas et al., 2017). To our knowledge, no previous study has specifically reported sex differences in development rates in the occipital fusiform gyrus or medial postcentral gyrus; however, this is likely due to differences in the atlases used for segmentation or the ways in which segmentations are combined for analysis. What is clear, however, is that sex differences in rates of development appear to be global in infancy and in later childhood and adolescence (Gennatas et al., 2017; Holland et al., 2014; Lenroot et al., 2007), while in our study we found no sex differences in trajectories for total gray matter volume and only a few differences in regional volume development. This could be explained by the fact that the perinatal period and puberty (but not early childhood) are associated with surges in gonadal hormones, leading to brain sex differences (Breedlove et al., 2010; Sisk et al., 2013). That the medial postcentral gyrus and occipital fusiform gyrus show sex differences in development during the early childhood period suggests that these differences are maintained in these regions, even in the absence of an active surge in gonadal hormones.
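To make the distinction between nominally significant and FDR-corrected interactions concrete, the sketch below applies a Benjamini-Hochberg step-up procedure to 116 tests. This assumes Benjamini-Hochberg as the FDR method, which this section does not specify, and the p-values are hypothetical, constructed only to reproduce the counts reported above (7 nominal, 2 surviving correction).

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which tests survive Benjamini-Hochberg FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Every test ranked at or below max_rank is declared significant.
    significant = [False] * m
    for idx in order[:max_rank]:
        significant[idx] = True
    return significant


# Hypothetical p-values for 116 regional age-by-sex tests (illustrative only).
small = [0.0002, 0.0006, 0.012, 0.021, 0.030, 0.038, 0.046]
large = [0.05 + 0.95 * i / 109 for i in range(1, 110)]
p_values = small + large

n_nominal = sum(p < 0.05 for p in p_values)
n_fdr = sum(benjamini_hochberg(p_values))
share_nominal = 100.0 * n_nominal / len(p_values)

print(f"{n_nominal} nominal, {n_fdr} after FDR, {share_nominal:.0f}% of regions")
```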
Sex differences in human brain structure observed via MRI may stem from a myriad of factors, including differential exposure to sex hormones in utero, perinatally, and during puberty, sex chromosome complement, or interactions with environmental exposures (DeCasien et al., 2022; Giedd et al., 2006, 2012; Joel et al., 2015; Marrocco & McEwen, 2016; McCarthy, 2016). Given that previous findings in later childhood and adolescence showed sex-differentiated developmental gray matter volume trajectories, we asked whether this also happens during an earlier developmental period and expected to find widespread differences in rates of development between males and females. However, we found no sex differences in absolute volume trajectories, and only two regions (left medial postcentral gyrus and right occipital fusiform gyrus) showed sex differences in proportional volume trajectories, with females having steeper slopes than males. Previous studies of later childhood and adolescence have indicated earlier volume peaks for females (Lenroot et al., 2007; Tanaka et al., 2012; Uematsu et al., 2012), suggesting sex-dependent timing of developmental processes such as synaptogenesis and pruning.
However, these differences do not appear to be present before age 8 years. Our findings suggest that while rates of gray matter development can vary by sex, developmental sex differences in early childhood are less extensive than in older childhood and adolescence.
Again, a key difference between our study and prior work is the age range of the samples. In particular, given the young age of our sample (the oldest subject was scanned at age 8.04 years), it is likely that all subjects were prepubertal, though it should be noted that this was not measured directly. The young age range of our sample sets it apart from previous longitudinal studies of gray matter development, which found developmental sex differences in longitudinal samples with age ranges spanning from childhood through adolescence and adulthood (Lenroot et al., 2007; Tanaka et al., 2012; Uematsu et al., 2012). Previous studies have characterized the relationship between brain structure and puberty-related hormones, such as testosterone and estradiol, and found that gray matter volume is related to gonadal hormone levels, even when accounting for age-related volume effects, though few have examined this relationship longitudinally (see Peper et al., 2011 for a review). The relative lack of sex differences found in our early childhood sample suggests that the onset of puberty, and associated changes in gonadal hormones, may promote the emergence of more widespread sex differences in gray matter volume, whereas the few regions with developmental sex differences identified in this study may represent brain sex differences preserved from the perinatal hormonal surge. The longitudinal data and the large number of individual scans included in our study provide confidence that the lack of sex differences found in this study is not due to a lack of power, but rather that limited sex differences in gray matter development in early childhood may be a biological reality. Further longitudinal studies of brain development, which measure and adequately account for the effects of puberty, are needed to further elucidate the mechanisms underlying sexual differentiation in the human brain.

Study limitations
Our sample is primarily white, high-income, and highly educated, and is not representative of the Canadian population as a whole (Statistics Canada, 2021). Care should be taken in extrapolating these findings to samples with different demographic makeups, not only because socioeconomic status is related to trajectories of brain development (Tooley et al., 2021), but also because misapplying findings from a majority white sample to different ethnic groups, or assuming that a majority white sample is representative of the broader population, may inadvertently perpetuate racism in biomedical and neuroscientific research (Gilpin & Taffe, 2021; Henrich et al., 2010). Next, movement artifacts are common in pediatric neuroimaging studies. However, our lab has developed best practices for scanning unsedated young children, which support the success rate of scans (Thieba et al., 2018). Nonetheless, head motion artifacts are inevitably present in our dataset, so we visually inspected all T1-weighted images and excluded those scans with excessive movement. Furthermore, most processing and analysis pipelines for segmenting and measuring gray matter volume are developed on adult brain MRI data, which does not always translate to robust segmentations in pediatric MRI. The software we used to segment gray matter volumes, MaCRUISE, was likewise developed on adult brain data. However, the machine-learning approach to multi-atlas segmentation combined with internally consistent segmentations and surfaces yielded robust volume measurement in our sample of young, typically developing children (Bermudez et al., 2020; Huo, Plassard, et al., 2016). To ensure a high quality of segmentations, all processed images were visually inspected by trained raters. In the case of outlier values, segmented images were double-checked to ensure that only biologically plausible segmentations were included in the analysis. A second potential limitation of MaCRUISE is that the pipeline does not account for the repeated nature of scans, a feature available in other comparable pipelines such as FreeSurfer. Given that MaCRUISE yielded far more robust segmentations in our pediatric data than FreeSurfer, we considered use of MaCRUISE despite the lack of a longitudinal pipeline to be a worthy trade-off. Additionally, we visually inspected spaghetti plots of raw volume values to ensure biological plausibility of intra-individual change between scanning sessions, which should have ameliorated any significant issues posed by the lack of a longitudinal segmentation pipeline. We also acknowledge some of the limitations associated with quadratic trajectories, some of which could be addressed by more complex model fits such as spline or localized regression approaches. During preparation of this manuscript, we visualized the regional volume data over time using a localized regression (LOESS) model and observed that the shape of the LOESS fit did not differ much from the linear or quadratic fits in our data. Additionally, deriving metrics such as peak age and percent change can be non-trivial for splines, GAMMs, and other non-linear approaches, although this may be changing given the advent of new approaches such as linear estimation with non-linear inference (LENI; McCormick, 2024, OSF Preprint). In this trade-off between flexibility and interpretability, we sided with interpretability and utilized quadratic models as the most advanced fit.
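The interpretability of the quadratic form can be seen in how directly peak age follows from its coefficients: for volume = b0 + b1*age + b2*age^2 with b2 < 0, the peak lies at -b1 / (2*b2). The sketch below is an illustration under assumptions, not the analysis code used in this study; the function name and example coefficients are hypothetical, and the 1.97 and 8.04 year bounds are taken from the observed age range reported in Table 2.

```python
def peak_age_label(b1, b2, min_age=1.97, max_age=8.04):
    """Label the age at peak volume for volume = b0 + b1*age + b2*age**2.

    Returns None when the curve has no maximum (b2 >= 0), a bounded label
    such as ">8.04" when the vertex lies outside the observed age range,
    and the vertex formatted to two decimals otherwise.
    """
    if b2 >= 0:
        return None  # linear or U-shaped trajectory: no peak to report
    vertex = -b1 / (2.0 * b2)
    if vertex > max_age:
        return f">{max_age}"
    if vertex < min_age:
        return f"<{min_age}"
    return f"{vertex:.2f}"


# Hypothetical (b1, b2) pairs in mm^3/year and mm^3/year^2.
examples = {
    "peak inside range": (39_000.0, -3_000.0),
    "peak after range": (39_000.0, -2_000.0),
    "peak before range": (3_000.0, -3_000.0),
    "linear increase": (100.0, 0.0),
}
for name, (b1, b2) in examples.items():
    print(f"{name}: {peak_age_label(b1, b2)}")
```

An equivalent closed form is not generally available for spline or LOESS fits, where a peak would typically have to be located numerically on the fitted curve.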
Lastly, our study focused only on gray matter volume, which is a composite of two more specific measurements of the cortex: surface area and cortical thickness. For example, an increase in gray matter volume may be attributable to an increase in cortical thickness, surface area, or both. MaCRUISE provides region-based estimates of cortical thickness (i.e., mean thickness) and surface area; however, for surface-based measurements, a vertex-wise approach is preferred to a region of interest approach, because intra-regional variation in cortical thickness (Carey et al., 2018) is lost when averaged across a region of interest, as are changes in cortical thickness that cross regional anatomical boundaries (Backhausen et al., 2022). Future neuroimaging studies of brain development should report vertex-wise measurement of cortical thickness and surface area whenever possible.

CONCLUSIONS
Childhood is a key developmental period, and this study provides a detailed characterization of gray matter volume changes in young children, showing regional patterns consistent with studies in older children, but little evidence for sex differences in development rates at this young age. Our results constitute a basis for comparison of brain development in various clinical samples, where brain development patterns may differ.

Fig. 1. Age at the time of scanning for all subjects included in this sample. Each line represents a subject, and each dot is a single scan session used in the volume analysis.

Fig. 2. Total gray matter volume showed an absolute increase (left), but a proportional decrease when normalized by intracranial volume (right). Thick lines represent group-level model fit. Thin lines represent best fit lines for individual subjects; each dot is an individual data point. Males had significantly larger absolute total gray matter volume than females, but there were no significant sex-by-age interaction effects. No sex effects were present for proportional total gray matter volume.

Fig. 4. Percent change for absolute regional volume (left) and proportional volume normalized as %ICV (right). Most absolute volumes increased between 2 and 8 years, while proportional volume generally decreased. Development followed a rostro-caudal pattern with posterior regions decreasing in volume earlier than frontal regions. Regions with a null development trajectory are coded as a 0% change.

Fig. 5. Only two regions had significant sex-by-age effects on proportional volume. In the left medial segment of the postcentral gyrus, females showed a curvilinear decrease while males showed a gradual linear decrease. In the right occipital fusiform gyrus, males showed a more gradual volume decrease than females. Dots represent the proportional volume for a single subject at a single scanning timepoint. Thin lines represent individual subject best fit lines; thick lines represent the group-level trend.

Table 1. Sample demographics by sex.

Table 2. Trajectory model parameters for all regions and both absolute and proportional volume. For some regions, predicted age at peak volume was outside of the observed age range; in these cases, peak age is displayed as >8.04 or <1.97 years.

Table 3. Model summaries for regions with significant age-by-sex interactions. * indicates that the model component was statistically significant after correction for multiple comparisons.