Abstract
Does early exposure to cognitive and linguistic stimulation impact brain structure? Or do genetic predispositions account for the co-occurrence of certain neuroanatomical phenotypes and a tendency to engage children in cognitively stimulating activities? Low socioeconomic status infants were randomized to either 5 years of cognitively and linguistically stimulating center-based care or a comparison condition. The intervention resulted in large and statistically significant changes in brain structure measured in midlife, particularly for male individuals. These findings are the first to extend the large literature on cognitive enrichment effects on animal brains to humans, and to demonstrate the effects of uniquely human features such as linguistic stimulation.
INTRODUCTION
How does early life experience shape the human brain? The question is surprisingly difficult to answer, as it concerns the causes, rather than merely the correlates, of individual differences in human development. Studies of such differences are normally observational and thus silent on the subject of causality. Animal studies, in contrast, have demonstrated causal influence of environmental stimulation on brain structure using random assignment to physical environments with low or high complexity. However, they cannot tell us about the features of the environment that matter most for human development: linguistic and cognitive stimulation.
The role of the environment in shaping brain development is a central issue for neuroscience, and a significant open question concerns the impact of uniquely human features of the environment, namely, linguistic and cognitive stimulation (Lenroot & Giedd, 2011). Whereas a large animal literature shows that more complex cage environments lead to microscopic and macroscopic brain changes, including larger cortex (Diamond, 2001), such manipulations provide an incomplete model for the environmental differences that may matter most in human development. These include differences in complex forms of cognitive and linguistic experience.
Understanding how experience shapes human development is also a central issue for social science and policy. Does early experience drive socioeconomic stratification across generations? Can environmental interventions enhance the development of individuals of lower socioeconomic status (SES) and disrupt intergenerational cycles of disadvantage (Peterson, Loeb, & Chamberlain, 2018; Duncan, Magnuson, & Votruba-Drzal, 2014)? In an era of growing inequality and persistent child achievement gaps, the response of the human brain to early childhood cognitive and linguistic experiences has societal, as well as scientific, importance (Farah, 2018).
To address these questions an experiment is needed, with human infants randomly assigned to environments of high versus low cognitive and linguistic stimulation, ideally starting early in life and comprising a substantial portion of their early childhood years. Although it would be unethical and unfeasible to experimentally assign a group of children to low cognitive and linguistic stimulation, below what it would otherwise have been, there is an alternative way to achieve the equivalent contrast. It has long been reported that children growing up in lower SES families, on average, receive less cognitive and linguistic stimulation compared with their higher SES peers (Hoff, 2013; Bradley & Corwyn, 2002). By randomizing such infants into one group that continues to receive the expected low stimulation and one that receives higher linguistic and cognitive stimulation, the effect of randomly assigned high versus low stimulation can be observed.
This was the intervention design of the Abecedarian Project (Ramey et al., 2000). Starting between 3 and 21 weeks of age, and continuing through age 5 years, participants in the intervention group engaged in a program designed to promote linguistic interactions and age-appropriate learning opportunities. Randomization was constrained to equate the two groups for multiple poverty-associated risk factors, and the two groups eventually scanned four decades later remained well-matched on these factors as described in Methods section. Both the early intervention group and the comparison group received enhanced medical care and social services.
Participants were evaluated throughout the period of the intervention and over the subsequent decades. Cognitive benefits of the early intervention, assessed with IQ and academic achievement tests, were significant through the latest evaluation at age 21 years, although smaller than when measured in childhood (Campbell, Pungello, Miller-Johnson, Burchinal, & Ramey, 2001). Larger and enduring effects occurred in real-world behavioral achievements including additional years of education completed, greater likelihood of a 4-year college graduation, lower reliance on public assistance, older age at first child, and greater rates of full-time employment (Campbell, Pan, & Burchinal, 2019).
Sex shapes life trajectories and health in myriad ways, biological and social, in many cases with boys more affected by adverse environments than girls, even within the same family (Golding & Fitzgerald, 2017; Bale et al., 2010). Within the Abecedarian project, sex differences emerged over time in some but not all analyses. Different outcomes showed female advantage, male advantage, or no sex difference (Campbell et al., 2019), a mix of outcomes that has also been observed in other early childhood education programs (Magnuson et al., 2016). These differences in intervention effects have yet to be satisfactorily explained, but their existence motivates the inclusion of sex as a moderator in this study.
For the present research, structural MRI scans were obtained from 47 of the Abecedarian sample, 29 from the early intervention group and 18 from the comparison group. As shown in Table 1, the groups were closely matched on a number of characteristics that would be expected to correlate with brain structure, including mothers' IQ, educational attainment and age at birth, infant gestational age and head circumference at birth, composite risk index (see Methods section), sex (15/29 or 52% and 9/18 or 50% male in the early intervention and comparison groups, respectively), and race (all African American).
Variable | Comparison Group | Early Intervention Group |
---|---|---|
Maternal Characteristics at Enrollment | ||
Maternal educational attainment | 10.50 (2.04) y | 10.46 (1.53) y |
Maternal IQ | 85.5 (9.64) | 84.62 (9.02) |
Maternal age at birth | 21.28 (6.91) y | 18.72 (2.42) y |
Participant Characteristics at Enrollment | ||
Sex | 9/18, 50% male | 15/29, 52% male |
Race/ethnicity | 18/18, 100% AA | 29/29, 100% AA |
High-risk index (see Methods) | 19.83 (5.42) | 19.93 (5.91) |
Gestational age at birth | 39.44 (3.52) wks | 39.32 (2.50) wks |
Head circumference at birth | 34.06 (2.21) cm | 34.33 (1.52) cm |
Participant Characteristics at Time of Scan | ||
Age at scan | 41.22 (1.67) y | 41.38 (1.57) y |
SAI status | 9/18, 50% SAI | 15/29, 52% SAI |
AA = African American; SAI = School Age Intervention.
The size of the sample deserves comment. First, as detailed in Methods section, power analysis indicates that the sample is adequate under the assumption that the effect to be detected is large (by Cohen's classification of effect sizes; Cohen, 1992). Previous research, summarized in Methods section, suggests that the effects of sustained environmental stimulation will indeed be large. Second, even if sample size were a concern, the sustained randomized manipulation of cognitive and linguistic stimulation followed by brain imaging is unprecedented. The unique research opportunity presented by a full-time, 5 days/week intervention lasting the first 5 years of life warrants examination.
The brain measures of primary interest were the volumes of five specific ROIs, summed to create a primary summary measure, as well as the volume of cortex more generally. Four of the specific ROIs were selected for their a priori relevance to the intervention, which emphasized language for communication and as scaffolding for cognitive control (see Methods): left inferior frontal gyrus (LIFG) and left superior temporal gyrus (LSTG) relevant to language (Friederici, 2011), right inferior frontal gyrus and bilateral ACC, relevant to cognitive control (Aron, Robbins, & Poldrack, 2004). The fifth, bilateral hippocampus, was added because its volume is frequently associated with early life adversity including poverty (Hanson et al., 2015).
Unless otherwise specified, volumes were expressed as percentages of the mean comparison group volume for the same sex, allowing us to report intervention effects in percentage differences, a more meaningful measure than cubic centimeters.
The effects of the early intervention on brain structure were assessed using standard and permutation-based tests. Initially, we assessed the effects of the early intervention and possible moderation by sex, using standard and robust ANOVA on the two relatively global measures: the summed volumes of ROIs and total cortical volume. A later School-Age Intervention (SAI), not found to affect behavioral measures in the long term (Campbell et al., 2019) and balanced between the two early intervention groups, was also included as a covariate. The five a priori ROIs were then tested individually. Additional exploratory analyses included early intervention effects on the surface areas and thicknesses of the cortical ROIs and the relations of brain measures to selected psychological measures. Finally, the volumes, surface areas, and thicknesses of all brain areas from the Desikan–Killiany atlas (Desikan et al., 2006) were also assessed.
METHODS
Participants
The Abecedarian Project (Ramey et al., 2000) was established in North Carolina in the early 1970s and enrolled 112 predominantly (98%) African American infants from homes of very low SES (low income and maternal education) with multiple associated risk factors such as paternal absence, welfare receipt, and low parental IQ (Table 2), but free of neurodevelopmental disorder. In order to equate the early intervention and comparison groups on various demographic and risk factors, pairs of children with equivalent baseline measures were randomly allocated to each condition. One of the 112 infants initially randomized to early intervention later received a diagnosis of a congenital condition that was disqualifying based on the exclusionary criteria, resulting in 111 infants participating in the study. The intervention was a comprehensive program of developmentally appropriate cognitive and linguistic enrichment embedded within a positive and responsive, university-based childcare setting for five full days (6–8 hr) per week, 50 weeks per year.
Factor and Level | Weighted Contribution |
---|---|
Low Maternal Educational Attainment (Highest Grade Completed) | |
6 | 8 |
7 | 7 |
8 | 6 |
9 | 3 |
10 | 2 |
11 | 1 |
12 | 0 |
Low Paternal Educational Attainment (Highest Grade Completed) | |
6 | 8 |
7 | 7 |
8 | 6 |
9 | 3 |
10 | 2 |
11 | 1 |
12 | 0 |
Family Income (Per Year), dollars | |
≤1,000 | 8 |
1,001–2,000 | 7 |
2,001–3,000 | 6 |
3,001–4,000 | 5 |
4,001–5,000 | 4 |
5,001–6,000 | 0 |
Father absent for reasons other than health or death | 3 |
Absence of maternal relatives in local area (i.e., parents, grandparents, or brothers or sisters of majority age) | 3 |
Siblings of school age who are one or more grades behind age-appropriate grade or who score equivalently low on school-administered achievement test | 3 |
Payments received from welfare agencies within past 3 years | 3 |
Record of father's work indicated unstable and unskilled or semi-skilled labor | 3 |
Records of mother's or father's IQ indicate score of 90 or below | 3 |
Records of sibling's IQ indicate score of 90 or below | 3 |
Relevant social agencies in the community indicate that the family is in need of assistance | 3 |
One or more members of the family has sought counseling or professional help in the past 3 years | 1 |
Special circumstances not included in any of the above that are likely contributors to cultural or social disadvantage | 1 |
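The weighted contributions in Table 2 can be read as terms of a single composite score. The following is a minimal sketch of that reading, assuming the index is the plain sum of all weights that apply to a family (the table does not state the combination rule explicitly). The function names, the handling of out-of-range values, and the example family are hypothetical.

```python
# Weights transcribed from Table 2; everything else here is illustrative.
EDUCATION_WEIGHTS = {6: 8, 7: 7, 8: 6, 9: 3, 10: 2, 11: 1, 12: 0}

# Upper bound of each income bracket (dollars per year) and its weight.
# Assumption: the lowest bracket covers incomes up to $1,000.
INCOME_BRACKETS = [(1000, 8), (2000, 7), (3000, 6), (4000, 5), (5000, 4), (6000, 0)]


def education_weight(highest_grade):
    # Assumption: grades outside the tabulated 6-12 range are clamped to it.
    return EDUCATION_WEIGHTS[min(max(highest_grade, 6), 12)]


def income_weight(income):
    # Assumption: incomes above the last bracket contribute nothing.
    for upper, weight in INCOME_BRACKETS:
        if income <= upper:
            return weight
    return 0


def high_risk_index(mother_grade, father_grade, income, other_weights):
    """Sum the weighted contributions that apply to one family."""
    return (education_weight(mother_grade) + education_weight(father_grade)
            + income_weight(income) + sum(other_weights))


# Hypothetical family: both parents completed grade 10, income $2,500,
# father absent (3) and welfare payments within the past 3 years (3).
example_score = high_risk_index(10, 10, 2500, [3, 3])
print(example_score)  # 16
```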
The Abecedarian Early Intervention was designed to provide consistently high levels of individually paced cognitive and language experiences, of the kind more common in higher SES families. The program utilized the Learning Games curriculum (Sparling & Lewis, 1984), which is based on the Vygotskian view of the centrality of language in cognitive development—that children learn self-regulation by internalizing speech. Infant activities included talking to the child, playing with cause-and-effect toys or picture books, and offering infants an opportunity to react to sights and sounds in the environment. As children grew, the curriculum shifted toward more conceptual and skill-based learning games and interactions, always using language, even in motor skill activities, and eliciting language from the child.
Both the early intervention group and the comparison group were provided with free iron-fortified baby formula (because none were breastfed), and with social workers to facilitate access to free or low-cost healthcare for the first 5 years of life, as well as family social services. As a result, outcome differences would not be attributable to these factors and both groups correctly viewed themselves as part of a treatment group.
Over the ensuing decades, participants were evaluated via blinded assessment on their functioning in various important spheres of life including cognitive, educational, social–emotional, occupational, economic, and health outcomes. Cognition was assessed with IQ and academic achievement tests of reading and mathematical skill. Tests of more specific neurocognitive abilities such as executive function were not administered. Although the IQ and academic skills advantage faded with time, more enduring benefits were observed in other important behavioral outcomes including years of education completed, likelihood of college graduation, reliance on public assistance, age at first child, and continuity of employment (Campbell et al., 2019; Ramey et al., 2000).
Seventy-eight study participants (42 intervention and 36 comparison) traveled to Roanoke for follow-up testing between the ages of 38 and 44 years. Eighteen were not scanned because of anxiety or claustrophobia, in some cases related to their girth relative to the scanner bore (8 interventions and 10 comparisons); eight were not scanned because of metal in the body (three interventions and five comparisons); one was not scanned because of weight alone (comparison group); and one was not scanned because of recent neurological symptoms (intervention group). One intervention participant declined with no reason offered, and one comparison participant's scan failed because of hardware error. Finally, of the 48 completed scans, one was of poor quality (comparison), leaving a total of 47 images. Twenty-nine of these (15 male, 14 female) came from the intervention group, and 18 (9 male, 9 female) came from the comparison group. Mean age at time of scan was 41.4 and 41.2 years (SD = 1.6 and 1.7) for the intervention and comparison groups, respectively.
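The participant flow just described can be checked arithmetically. The short tally below uses only the counts given in the preceding paragraph; the dictionary layout and labels are illustrative choices.

```python
# Counts transcribed from the participant-flow paragraph above.
tested = {"intervention": 42, "comparison": 36}

# Reasons a tested participant contributed no usable image.
exclusions = {
    "anxiety or claustrophobia": {"intervention": 8, "comparison": 10},
    "metal in body": {"intervention": 3, "comparison": 5},
    "weight": {"intervention": 0, "comparison": 1},
    "recent neurological symptoms": {"intervention": 1, "comparison": 0},
    "declined": {"intervention": 1, "comparison": 0},
    "hardware error": {"intervention": 0, "comparison": 1},
    "poor image quality": {"intervention": 0, "comparison": 1},
}

# Usable images per group = tested minus every exclusion in that group.
usable = {
    group: n - sum(reason[group] for reason in exclusions.values())
    for group, n in tested.items()
}
print(usable)  # {'intervention': 29, 'comparison': 18}
```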
Imaging
Imaging was conducted on a 3.0-T Siemens Trio scanner. High-resolution T1-weighted scans (voxel size: 1.0 × 1.0 × 1.0 mm) were acquired using an MPRAGE sequence (Siemens). Each participant's T1 data were processed using Advanced Normalization Tools (ANTs; Avants, Epstein, Grossman, & Gee, 2008). The antsMultivariateTemplateConstruction.sh script was used to build a template using all participants' T1 images. The population-specific template was processed using the antsCorticalThickness.sh tool in order to obtain a set of tissue segmentation priors. Each scan was then processed using the population-specific template along with antsCorticalThickness.sh (Tustison et al., 2014). This pipeline produces a brain extraction mask and a six-tissue segmentation. Jacobian images of volume were calculated from the nonlinear warp fields that align each participant to the template. To obtain cortical labels for each participant, the antsJointLabelFusion.sh script was used along with an existing population of labeled images to perform multi-atlas label fusion, which provides both cortical labels as well as deep gray labels (Wang et al., 2013).
Behavioral Measures
Although not the primary focus of this research, behavioral data were also analyzed. These analyses were aimed at assessing the relation of the brain measures used in this study to individual psychological outcomes. Two behavioral measures were selected for these analyses. One, contemporaneous with the early intervention, was the Stanford–Binet intelligence test (Form L-M), administered at age 4 years by staff blind to group assignment. All but two of the scanned participants had taken this test and therefore had scores available. The other behavioral measure, obtained the day of scanning, was a midlife strengths and risk index. This was extracted from structured interviews conducted by research staff blind to participants' group assignments. The index was computed by adding together two checklists, each out of 10, of strengths (such as high school graduate and current full-time employment) and reverse-coded risk (such as unfavorable self-rated health and first child before age 20 years; Sonnier-Netto, 2018). We note that four other assessments were administered on this occasion but were not analyzed in relation to brain structure because they did not measure an ability or quality of performance along a quantitative dimension of better or worse. These were an open-ended interview by a staff member not blind to group assignment, an Ultimatum Game, a Multi-round Trust Game, and a Locus of Control questionnaire (see Luo et al., 2018, for a report on the economic games and Sonnier-Netto, 2018, for a report on locus of control).
Analyses
All statistical analyses were carried out in R (2019), with additional permutation analyses assisted by R package lmPerm (Wheeler & Torchiano, 2016), and bias-corrected and accelerated confidence intervals calculated by bootstrapping (Canty & Ripley, 2019; Peng, 2019; Weiss, 2016).
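To illustrate the bootstrapping idea in general terms, the sketch below computes a percentile bootstrap confidence interval for a difference in group means. It is a deliberately simpler technique than the bias-corrected and accelerated intervals cited above, it is written in Python rather than R, and the data, function name, and parameter defaults are hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci_mean_diff(group_a, group_b, n_resamples=5000, level=0.95, seed=0):
    """Percentile bootstrap CI for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    tail = (1 - level) / 2
    lower = diffs[int(tail * n_resamples)]
    upper = diffs[int((1 - tail) * n_resamples) - 1]
    return lower, upper


# Hypothetical normalized volumes (percent), not the study's data.
intervention = [12.0, 18.5, 9.7, 15.2, 21.0, 14.3]
comparison = [-3.1, 2.4, 0.8, -1.9, 4.0, -2.2]
ci_lower, ci_upper = bootstrap_ci_mean_diff(intervention, comparison)
print(ci_lower, ci_upper)
```

An interval that excludes zero is read as evidence of a group difference at the chosen level.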
For analyses reported here, unless otherwise noted, brain measures were normalized as percentages of the mean control participant of the corresponding sex as follows: Relative volume = $100\,(v - \bar{v})/\bar{v}$, where $v$ is the participant's volume and $\bar{v}$ is the corresponding same-sex comparison-group mean volume.
The resulting proportions provide a more intuitive measure of the intervention effect than absolute volumes measured in cc. In addition, by using the comparison mean from the same-sex participants, we eliminate the size difference between male and female brains from these percentage increase measures.
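As a concrete illustration of this normalization, the following sketch expresses one volume as a percentage difference from the mean of same-sex comparison participants. The function name and the example volumes are hypothetical.

```python
from statistics import mean


def relative_volume(volume, comparison_volumes):
    """Percent difference from the same-sex comparison-group mean."""
    reference = mean(comparison_volumes)
    return 100.0 * (volume - reference) / reference


# Hypothetical raw volumes (cc) for same-sex comparison participants.
comparison_volumes = [400.0, 420.0, 440.0]  # mean = 420.0
normalized = relative_volume(462.0, comparison_volumes)
print(normalized)  # 10.0
```

By construction, the comparison group of each sex has a mean normalized value of zero.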
To determine whether or not these measures would need to be corrected for other participant characteristics that could affect brain outcomes, we examined the rich array of baseline measures available on the study participants and their families that are shown in Table 1. These included age at scan, gestational age at birth, head circumference at birth, maternal IQ, maternal educational attainment, and overall “high risk” score. The groups were highly similar on all measures, as shown in Table 1, so that any difference in brain outcomes cannot be attributed to differences in these baseline measures.
The hypothesis testing sequence progressed from anatomically general to specific, that is, normalized cortex and summed normalized ROIs and then normalized individual ROIs. The analyses of cortex and summed ROIs consisted of standard and permutation-based ANOVA, including the following variables: Early Intervention, SAI, Sex, and all interactions among these variables. The effect of the early intervention on the five ROIs was then assessed separately for male and female individuals, with false discovery rate (FDR) correction for the 10 multiple comparisons.
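The text does not name the FDR procedure used. The sketch below assumes the Benjamini-Hochberg step-up procedure, which is a common choice, and applies it to ten hypothetical p-values standing in for the 5 ROIs × 2 sexes.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for 10 tests; not the study's data.
p = [0.001, 0.004, 0.012, 0.03, 0.2, 0.35, 0.5, 0.6, 0.8, 0.9]
q = benjamini_hochberg(p)
significant = [i for i, value in enumerate(q) if value < 0.05]
print(significant)  # [0, 1, 2]
```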
The sample size, although modest, provided adequate power for detecting a large effect by Cohen's classification. Specifically, power analysis for the present sample, with groups of 29 and 18 participants, indicates adequate power by the conventional criteria of 80% power and p < .05 for an effect size of d = 0.84 (G*Power 3; Faul, Erdfelder, Lang, & Buchner, 2007). The most similar research with humans, comparing the effects of Romanian versus UK orphanage environments, shows an effect of size d = 1.13 on gray matter volume (Mackes et al., 2020; Appendix). Research that varies sustained environmental stimulation experimentally has been carried out only with animals. Although contemporary animal research in this area focuses on molecular and cellular effects, some early work reported macroscopic differences roughly analogous to those studied here, focused on cortical weight, length, width, and thickness, largely in male rats (Diamond, 2001). Based on an early publication that included a table with data required for calculating a standardized effect size (Rosenzweig, 1966; Table 1), the increase in cortex weight for rats given environmental enrichment was d = 0.87.
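The minimum detectable effect size can be approximated by hand. The sketch below uses a normal approximation to the two-sided, two-sample test, not the noncentral t computation performed by G*Power, and takes the scanned group sizes of 29 and 18. The function name is hypothetical, and an exact calculation would be expected to give a slightly larger value.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_d(n1, n2, alpha=0.05, power=0.80):
    """Smallest Cohen's d detectable by a two-sided two-sample test,
    using the normal approximation rather than the noncentral t."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(1 / n1 + 1 / n2)


d_min = minimum_detectable_d(29, 18)
print(round(d_min, 2))  # 0.84
```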
Additional exploratory analyses were undertaken to learn as much as possible from this unique data set. First, effects of the early intervention with 95% confidence intervals were assessed with normalized measures of total surface area and mean cortical thickness of cortex as a whole and the four a priori cortical ROIs for male and female participants. Second, the normalized volumes of all 134 regions of the Desikan–Killiany atlas identified by joint label fusion were then analyzed in the same manner to gauge the effects of early intervention on them. Third, to assess the relation of brain measures to psychological outcomes, Pearson correlations were computed between normalized brain measures on the one hand, and IQ and midlife strengths and risk index on the other. Brain measures selected for testing were confined to the two relatively global measures, namely, the sum of ROIs and total cortex volume, to be augmented by any regional measure that showed significant effects of the intervention for participants of both sexes. The results of these correlations were FDR corrected. The goal of these analyses was to determine whether the brain measures studied here are related to psychological outcomes of interest.
RESULTS
Descriptive Overview of Anatomical Sequelae
Table 3 presents basic descriptive data concerning participants' brain volumes, separated by the manipulation of interest, the early intervention, as well as by sex, given other findings of sex differences in outcomes from early intervention programs. In addition to regions selected for a priori testing, included are also whole-brain volume and the remainders of cortical and brain volumes when a priori ROIs have been subtracted.
Volume (cc) | Female Comparison, Mean (SE) | Female Intervention, Mean (SE) | Male Comparison, Mean (SE) | Male Intervention, Mean (SE) |
---|---|---|---|---|
Areas of a priori Interest | ||||
Cortex | 400.77 (12.57) | 402.09 (6.56) | 433.78 (7.8) | 480.33 (9.29) |
Sum of ROIs | 25.59 (0.84) | 26.49 (0.63) | 25.72 (0.6) | 29.7 (.52) |
ACC (bilat) | 5.4 (.11) | 5.47 (0.28) | 4.79 (0.27) | 6.24 (0.19) |
HC (bilat) | 6.63 (0.48) | 6.59 (0.11) | 7.16 (0.28) | 7.61 (0.23) |
IFG (L) | 4.47 (0.17) | 5.11 (0.17) | 4.72 (0.13) | 5.39 (0.16) |
IFG (R) | 4.52 (0.16) | 4.91 (0.19) | 4.32 (0.15) | 5.36 (0.18) |
STG (L) | 4.57 (0.30) | 4.41 (0.13) | 4.71 (0.16) | 5.11 (0.15) |
Remaining Compartments | ||||
Total brain | 868.53 (27.91) | 870 (14.66) | 938.11 (14.84) | 1030.91 (20.45) |
Cortex net of ROIs | 381.81 (12.21) | 382.19 (6.18) | 415.21 (7.68) | 458.24 (9.13) |
Brain net of cortex and HC | 461.13 (16.77) | 461.32 (9.99) | 497.17 (8.81) | 542.97 (12.69) |
Abbreviations: ACC = anterior cingulate cortex; HC = hippocampus; IFG = inferior frontal gyrus; STG = superior temporal gyrus.
Table 4 presents the same results expressed as normalized volumes relative to the mean of same-sex comparison participants.
Percent Volume Increase | Female Comparison, Mean (SE) | Female Intervention, Mean (SE) | Male Comparison, Mean (SE) | Male Intervention, Mean (SE) |
---|---|---|---|---|
Areas of a priori Interest | ||||
Cortex | 0 (3.14) | +0.33 (1.64) | 0 (1.80) | +10.73 (2.14) |
Sum of ROIs | 0 (3.27) | +3.50 (2.47) | 0 (2.34) | +15.46 (2.04) |
ACC (bilat) | 0 (2.08) | +1.29 (5.12) | 0 (5.67) | +30.37 (4.04) |
HC (bilat) | 0 (7.22) | −0.67 (1.61) | 0 (3.96) | +6.27 (3.22) |
IFG (L) | 0 (3.80) | +14.24 (3.88) | 0 (2.74) | +14.19 (3.31) |
IFG (R) | 0 (3.58) | +8.61 (4.22) | 0 (3.51) | +23.88 (4.29) |
STG (L) | 0 (6.47) | −3.40 (2.84) | 0 (3.45) | +7.84 (3.22) |
Remaining Compartments | ||||
Total brain | 0 (3.21) | +0.17 (1.69) | 0 (1.58) | +9.89 (2.18) |
Cortex net of ROIs | 0 (3.20) | +0.10 (1.62) | 0 (1.85) | +10.36 (2.20) |
Brain net of cortex and HC | 0 (3.64) | +0.04 (2.17) | 0 (1.77) | +9.21 (2.55) |
Abbreviations as in Table 3.
Observe that 18 of the 20 entries in the Intervention columns of Table 4 are positive, indicating that the early intervention is associated with increased size of the whole brain, the cortex, and most of the ROIs. Observe also that, except for one region (the left inferior frontal gyrus), the group treatment effects for males were substantially greater than for females.
In order to visualize the distributions of these volume measurements over participants, we plotted raw (as opposed to normalized) volumes separated by sex and intervention group for the two relatively global measures of a priori interest: summed ROIs and total cortex. Figure 1 shows the effect of the intervention on the sum of ROIs, substantially more pronounced in males, as well as the expected sex differences in volume. Figure 2 displays the same relations for cortex volume, again showing sex differences in both volume and effect of intervention.
Size and Reliability of Intervention Effects
To assess the early intervention effects on the two most global of the a priori measures, namely, the sum of the predicted ROIs and total cortical volume, we conducted analyses of variance. Tables 5 and 6 show the results of these analyses for summed ROI volumes and cortex volume, respectively, with factors Early Intervention, Sex, SAI, and all interactions. In these tables, PConv values are based on conventional analyses of variance, using the F-distribution and the obtained values of the F-statistic; PPerm values are based on permutation-based ANOVA.
Source | Df | Mean Sq. | F Value | PConv | PPerm |
---|---|---|---|---|---|
Early Intervention | 1 | 1042 | 14.12 | 0.001 | 0.000 |
Sex | 1 | 638 | 8.65 | 0.006 | 0.023 |
School-Age Intervention (SAI) | 1 | 127 | 1.72 | 0.200 | 0.190 |
Early Int. × Sex | 1 | 431 | 5.84 | 0.020 | 0.025 |
Early Int. × SAI | 1 | 4 | 0.06 | 0.807 | 0.902 |
Sex × SAI | 1 | 0 | 0 | 0.986 | 0.583 |
Early Int. × Sex × SAI | 1 | 100 | 1.36 | 0.251 | 0.251 |
Residuals | 39 | 74 |
Source | Df | Mean Sq. | F Value | PConv | PPerm |
---|---|---|---|---|---|
Early Intervention | 1 | 362 | 6.69 | 0.014 | 0.019 |
Sex | 1 | 483 | 8.93 | 0.005 | 0.037 |
School-Age Intervention (SAI) | 1 | 52 | 0.96 | 0.334 | 0.121 |
Early Int. × Sex | 1 | 320 | 5.91 | 0.020 | 0.017 |
Early Int. × SAI | 1 | 19 | 0.35 | 0.560 | 0.902 |
Sex × SAI | 1 | 119 | 2.21 | 0.146 | 0.065 |
Early Int. × Sex × SAI | 1 | 71 | 1.32 | 0.258 | 0.295 |
Residuals | 39 | 54 |
Standardized effect sizes for the early intervention effect, expressed as Cohen's d based on F values (Thalheimer & Cook, 2002), are substantial: 1.16 for summed ROIs and 0.80 for cortex. These large effect sizes are for all participants combined, male and female.
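As a rough illustration of how an F value can be converted to Cohen's d, the sketch below implements the two-group formula described by Thalheimer and Cook (2002), d = sqrt(F × (n_t + n_c)/(n_t × n_c) × (n_t + n_c)/(n_t + n_c − 2)). The function name is hypothetical, and the group sizes are assumptions chosen only so that they sum to the 47 participants implied by the degrees of freedom in Tables 5 and 6; the actual group sizes are not given in this section.

```python
import math

def cohens_d_from_f(f_value, n_treatment, n_control):
    """Approximate Cohen's d from a 1-df F statistic and the two group sizes."""
    n_total = n_treatment + n_control
    size_term = n_total / (n_treatment * n_control)
    correction = n_total / (n_total - 2)
    return math.sqrt(f_value * size_term * correction)

# Hypothetical group sizes (assumption): 29 intervention, 18 comparison.
d_rois = cohens_d_from_f(14.12, n_treatment=29, n_control=18)
d_cortex = cohens_d_from_f(6.69, n_treatment=29, n_control=18)
print(round(d_rois, 2), round(d_cortex, 2))
```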
As Tables 5 and 6 show, for both summed ROI volumes and cortex volume, the standard and permutation-based ANOVAs agree on which differences are important and which are negligible. Specifically, Early Intervention, Sex, and their interaction are all significant, consistent with the means shown in Table 4. The later intervention, SAI, and all of its interactions are nonsignificant.
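The logic of a permutation-based test can be sketched for the simplest case, a single factor. The example below is not the procedure used for Tables 5 and 6, whose permutation scheme for the three-factor design is not described in this section; it is a minimal one-way version, with hypothetical function names, that shuffles group labels and asks how often the shuffled F statistic reaches the observed one.

```python
import random

def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (lists of numbers)."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def permutation_p_value(groups, n_permutations=5000, seed=0):
    """Share of label shufflings whose F is at least as large as the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            count += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (count + 1) / (n_permutations + 1)
```

Because the estimate is bounded below by 1/(n_permutations + 1), very small permutation p values require a correspondingly large number of shufflings.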
The size and reliability of the intervention effects in specific ROIs were then assessed for male and female participants. Correcting for multiple comparisons across the 10 tests, male participants showed significant increases in three of the five areas and female participants showed an increase in one.
Figure 3 depicts the percentage volume increase associated with the early intervention, separated by sex, for the five individual a priori ROIs. The 95% bootstrap confidence intervals show that, for male individuals, the intervention had a positive effect on bilateral ACC, LIFG, and RIFG, with smaller positive numerical differences observed on LSTG and bilateral hippocampus. For female individuals, only the LIFG shows a relationship that is comparable to that of the male participants. Applying FDR correction to the 10 tests together, the areas just noted were significant at q = 0.0025, 0.0025, 0.0233, and 0.0488, respectively.
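The text does not state which FDR procedure was applied. Assuming the widely used Benjamini-Hochberg step-up procedure, adjusted q values for a family of tests could be obtained as in the following sketch; the function name and the example p values are hypothetical and are not the p values from this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values

# Hypothetical p values for a family of four tests.
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(example_q)
```

A test is then declared significant at a chosen false discovery rate if its q value falls below that rate.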
Exploratory Analyses: Beyond the Volumes of Selected ROIs and Relations to Behavior
Cortical surface area and thickness index different developmental processes, with surface area assumed to reflect the development of cortical columns and cortical thickness reflecting the development of cells within a column as well as synapse formation, pruning, and myelination (Johnson & de Haan, 2015). As with SES effects (Noble & Giebler, 2020), the intervention effects were more pronounced for cortical surface area than thickness.
Table 7A provides the numerical values corresponding to the volume effects shown in Figure 3, for comparison with the cortical surface area and cortical thickness results reported next. Table 7B shows that males had significantly expanded surface areas for cortex, bilateral ACC, and RIFG, similar to the volume findings, as indicated by confidence intervals that did not cross zero, along with LSTG; for LIFG surface area, the confidence interval just crossed zero. Female participants showed surface area effects only for LIFG, similar to the findings for volume. In contrast, as shown in Table 7C, the intervention had little effect on cortical thickness, with only one confidence interval failing to cross zero, indicating thinning of LSTG for males.
Region | Female: Mean % Difference | Female: 95% CI Lower Limit | Female: 95% CI Upper Limit | Male: Mean % Difference | Male: 95% CI Lower Limit | Male: 95% CI Upper Limit |
---|---|---|---|---|---|---|
(A) Regional Volume | ||||||
Cortex | 0.33 | −6.03 | 7.36 | 10.73* | 5.02 | 15.73 |
Left Superior Temporal Gyrus | −3.40 | −16.05 | 11.12 | 7.84 | −0.34 | 18.02 |
Left Inferior Frontal Gyrus | 14.24* | 4.33 | 24.88 | 14.19* | 5.28 | 21.96 |
Right Inferior Frontal Gyrus | 8.61 | −1.77 | 19.74 | 23.88* | 12.88 | 33.98 |
Bilateral Anterior Cingulate Gyrus | 1.29 | −9.35 | 12.16 | 30.37* | 16.90 | 44.22 |
Bilateral Hippocampus | −0.67 | −4.62 | 15.32 | 6.27 | −4.62 | 15.32 |
(B) Regional Surface Area | ||||||
Cortex | 0.87 | −6.45 | 9.57 | 13.13* | 7.39 | 18.43 |
Left Superior Temporal Gyrus | −0.09 | −15.04 | 17.51 | 17.34* | 8.03 | 26.96 |
Left Inferior Frontal Gyrus | 20.04* | 4.17 | 38.52 | 12.63 | −0.39 | 24.85 |
Right Inferior Frontal Gyrus | 1.13 | −7.50 | 8.41 | 21.99* | 8.56 | 37.47 |
Bilateral Anterior Cingulate Gyrus | −5.33 | −18.66 | 7.75 | 30.41* | 18.62 | 42.59 |
(C) Regional Mean Thickness | ||||||
Cortex | −0.38 | −5.72 | 5.48 | −1.94 | −4.99 | 0.83 |
Left Superior Temporal Gyrus | −1.4 | −12.33 | 8.79 | −7.38* | −12.82 | −1.79 |
Left Inferior Frontal Gyrus | −4.97 | −15.67 | 5.55 | 1.04 | −5.93 | 8.61 |
Right Inferior Frontal Gyrus | 1.13 | −7.50 | 8.41 | 2.83 | −3.50 | 9.12 |
Bilateral Anterior Cingulate Gyrus | 6.08 | −3.03 | 14.36 | 0.39 | −5.81 | 6.93 |
An asterisk (*) indicates differences whose 95% confidence intervals do not cross zero.
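For readers who wish to see how a percentage difference and its bootstrap confidence interval of the kind reported in Table 7 might be computed, a minimal percentile-bootstrap sketch follows. It assumes that the percentage difference is expressed relative to the comparison group mean and that participants are resampled with replacement within each group; neither detail is specified in this section, and the function names are hypothetical.

```python
import random

def percent_difference(treated, control):
    """Difference in group means as a percentage of the comparison group mean."""
    mean_t = sum(treated) / len(treated)
    mean_c = sum(control) / len(control)
    return 100.0 * (mean_t - mean_c) / mean_c

def bootstrap_ci(treated, control, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the percentage difference."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        # Resample each group separately, with replacement.
        t = [rng.choice(treated) for _ in treated]
        c = [rng.choice(control) for _ in control]
        replicates.append(percent_difference(t, c))
    replicates.sort()
    lower = replicates[int((alpha / 2) * n_boot)]
    upper = replicates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

An interval whose lower and upper limits share the same sign corresponds to an asterisked entry in the table.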
An exploratory analysis sought to assess the relation of brain structure to behavioral measures in this sample. Brain anatomy has a priori relevance to psychological function, which is one reason to study it in animals and humans. Although further testing of this relation was not a goal of this study, we attempted a brief confirmation that the relation was present for the participants studied here. As detailed in the Methods section, two psychological outcomes were selected and examined in relation to the two relatively global brain volumes of interest, as well as the most reliably affected ROI, which was LIFG. Pearson correlations (and bootstrapped p values) of the six brain–behavior relationships are shown in Table 8, demonstrating that each brain measure was significantly associated in the expected direction with one or both of the two psychological outcomes.
Brain Region | Stanford–Binet | Midlife Strengths and Risks Index |
---|---|---|
Summed ROIs | +0.29 (0.045) | +0.21 (0.191) |
Cortex | +0.42 (0.015) | +0.23 (0.191) |
LIFG | +0.36 (0.015) | +0.35 (0.016) |
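The way the bootstrapped p values in Table 8 were derived is not detailed in this section. One common approach, sketched below under that caveat, resamples participants (pairs of brain and behavior scores) with replacement and takes twice the smaller share of bootstrap correlations falling on either side of zero. The function names are hypothetical, and the inputs are assumed to contain at least two distinct values in each variable.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def bootstrap_p_value(x, y, n_boot=5000, seed=0):
    """Two-sided bootstrap p value for the hypothesis of zero correlation."""
    rng = random.Random(seed)
    n = len(x)
    replicates = []
    while len(replicates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs, ys = [x[i] for i in idx], [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # skip degenerate resamples with zero variance
        replicates.append(pearson_r(xs, ys))
    share_nonpositive = sum(r <= 0 for r in replicates) / n_boot
    share_nonnegative = sum(r >= 0 for r in replicates) / n_boot
    return min(1.0, 2 * min(share_nonpositive, share_nonnegative))
```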
Finally, the relative volumes of all 134 regions delineated by the Desikan–Killiany atlas, as well as the surface areas and thicknesses of cortical regions, were compared for early intervention and comparison participants, separated by sex. The three large tables containing these results and their 95% confidence intervals are included with raw data at the Figshare link listed at the end of the paper. We offer these data for descriptive purposes to readers seeking additional information. Given the many regions tested, caution regarding potential false positives is warranted.
Of the regions subjected to exploratory analysis, a number showed substantial volume increases. More regions overall showed volume increases from the intervention in male than female participants, 42 (all positive in sign) versus six (of which five were positive). Similarly, surface area showed numerous positively signed differences for male individuals and many fewer such differences observed for female individuals. Intervention effects on cortical thickness were overall fewer in number for both sexes and included both positive and negative differences.
One question of interest addressed by the exploratory analyses is whether the sex difference observed with the a priori brain measures is specific to those measures. Before examining the different anatomical dimensions of the rest of the cortex and brain in all brain regions, one might have thought that anatomy is affected for both sexes equally, but with different regional distributions or different manifestations in volume, surface area, or thickness. The findings here indicate that this is not the case. Rather, the results obtained throughout the brain suggest that macroscopic brain structure is more affected by early life cognitive and linguistic stimulation in male than in female individuals.
DISCUSSION
Here we report the first evidence that normal variation in early life experience impacts human brain structure. Specifically, we show that the cognitive and linguistic environment of young humans affects macroscopic brain structure. Unlike previous observational research, which cannot address causality, the present data show that early life experience shapes brain structure, through its immediate causal effects and continuing chains of causal consequences.
Only one other randomized experimental study of early life experience in humans has reported brain measures, the Bucharest Early Intervention Project. It differs from this study in two ways. First, it could not shed light on the earliest years of human development, because its randomized component started at 2 years, as opposed to early infancy. Second, the manipulation involved a general and severe perturbation of childhood experience, including limited social, emotional, and motoric experience in addition to limited cognitive and linguistic experience in Romanian orphanages. At 2 years of age, children were randomly assigned to foster care or continued institutional care. Home rearing improved, but did not fully restore, later cognitive abilities, psychological adjustment, or brain structure (Mackes et al., 2020; Sheridan, Fox, Zeanah, McLaughlin, & Nelson, 2012). The inability of the fostering experience to “rescue” the brain from pathological treatment in the first 2 years does not address the question of interest here: whether experience changes brain structure in the context of normal human development, such that higher versus lower levels of cognitive and linguistic stimulation in the earliest years of life make a difference in brain structure. The present study therefore provides unique information about the causal relationship between early life experience and human brain structure, and the specific effect of cognitive and linguistic stimulation.
The present findings are also relevant to understanding the recently observed relation between brain structure and socioeconomic status (Noble & Giebler, 2020). Two general types of explanation have been put forward for this relation. On the one hand, it may be that environmental causes, such as the well-documented disparities in opportunities for cognitive stimulation and child-directed speech, are responsible, which is called a “social causation” account because the social environment causes the observed differences (Dunham, 1961). On the other hand, genetic inheritance of neural and cognitive differences may operate, and insofar as these differences influence SES, they could account for the relations between brain, cognition, and SES (Murray, 2020; Wax, 2017), which is called a “social selection” account because different levels of SES select individuals based on their innate capabilities (Dunham, 1961). In order for brain disparities to be accounted for by the first type of account, it must be the case that cognitive and linguistic experience impacts brain structure. The present results provide the first evidence that this is true.
The results showed a pronounced sex difference in the effect of the early intervention, with larger effects on males. The only a priori ROI for which the intervention benefitted females to the same degree as males was the left inferior frontal gyrus; female participants showed nonsignificant trends in some but not all other a priori areas. Of note, animal studies measuring gross anatomical effects of environmental stimulation frequently include only males, and a variety of differences have been reported when both sexes are included (Diamond, 2001).
For humans, it is not uncommon for childhood intervention studies to find differences in efficacy for the behavior of male and female individuals (García, Heckman, & Ziff, 2018; Chetty, Hendren, & Katz, 2016). A recent meta-analysis of sex differences in response to early childhood education across studies found small sex differences favoring female individuals for most outcomes, with a more pronounced effect of reduced grade retention and special education referral favoring male individuals (Magnuson et al., 2016). For the Abecedarian Early Intervention, many of the young adult behavioral measures showed more lasting effects for female than male participants (Campbell et al., 2019), with the exception that later cardiovascular and metabolic health indicators showed more benefits to male participants (Campbell et al., 2014).
The reasons for sex differences in the Abecedarian and other program outcomes are poorly understood. They could involve biological differences between the sexes or social differences in their lives or both. Furthermore, these differences could moderate the intervention effect through either their effects on sensitivity to environmental enrichment in the intervention group or to the conditions of poverty in the comparison group or both (García et al., 2018; Golding & Fitzgerald, 2017).
Limitations of this study include those intrinsic to the sample and those intrinsic to MRI. Regarding the sample, it is small compared to most current studies in cognitive neuroscience. The trend toward larger samples has been motivated in part by the realization that they reduce the risk of false positives, in addition to the more obvious reduction in risk of false negatives (Button et al., 2013). Sample size impacts false positives by its relation to statistical power, and power in turn depends on expected effect size. Crucially, power and replicability are not determined simply by sample size per se, but rather by sample size in relation to the size of the effect being tested. As discussed in the Methods section, our sample is adequately powered to detect a large effect, and a large effect is plausible given effect sizes from comparable studies in humans and animals. On this basis, it was appropriate to proceed with analyses of the sample. The effects we found were also large, and the possibility that they were false positives, by chance yielding p values below 0.05, is unlikely. As shown in Table 5, the main effect of the intervention on the a priori summary measure was highly significant by conventional and permutation testing; in the latter case, the precise value was 0.000000004, truncated to 0.000 in the table. In summary, although recent concerns about sample size in neuroscience are well-justified in general, they do not call into question this study and its findings.
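The dependence of power on both sample size and effect size can be made concrete with a small calculation. The sketch below uses a normal approximation to the power of a two-sided comparison of two group means; it is an illustration only and not the power analysis described in the Methods section. The function name and the group sizes in the usage lines are hypothetical.

```python
from statistics import NormalDist

def approx_power_two_groups(effect_size_d, n_group1, n_group2, alpha=0.05):
    """Normal-approximation power of a two-sided, two-sample comparison of means."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    noncentrality = effect_size_d * (n_group1 * n_group2 / (n_group1 + n_group2)) ** 0.5
    # Probability of exceeding the critical value in either tail.
    return normal.cdf(noncentrality - z_crit) + normal.cdf(-noncentrality - z_crit)

# Hypothetical group sizes: power rises with the assumed effect size.
for d in (0.2, 0.5, 0.8, 1.2):
    print(d, round(approx_power_two_groups(d, 24, 24), 2))
```

With the sample size held fixed, the same design that is underpowered for a small effect can be well powered for a large one, which is the point made in the paragraph above.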
Regarding MRI, it does not reveal changes in the brain at the cellular level. On the basis of our data, we cannot know whether the intervention affected size or number of neuronal or glial cell bodies, of dendrites, synapses, or other experience-dependent features of brain tissue documented by animal research. Furthermore, the study is limited by having images from just one stage of life, long after the conclusion of the intervention. While the enduring nature of the effect adds to its potential practical importance, it would have been ideal to scan participants longitudinally, starting in infancy, in order to further constrain the ways in which early childhood experience and the causal chain of its later life effects impact the brain.
In the absence of such data, we can nevertheless conclude that early life cognitive and linguistic stimulation impacts brain structure, in the form of larger volumes of brain regions associated with cognition. At present, only a small number of human beings in the world have ever undergone early, intensive, and sustained cognitive and linguistic intervention with random assignment, namely, the participants of the Abecedarian project. Their brain structure findings extend, in a qualitative way, our knowledge of experiential effects on the brain. They also argue for investment in future randomized intervention studies, with longitudinal, multimodal imaging and behavioral measures starting in infancy.
Acknowledgments
The authors thank Carrie Bynum and Laura Bateman for their assistance in data collection and Vincent Hurtubise for computer systems support.
Reprint requests should be sent to Martha J. Farah, Center for Neuroscience & Society, University of Pennsylvania, 3710 Hamilton Walk, Goddard Labs 506, Philadelphia, PA 19104, or via e-mail: [email protected].
Author Contributions
Martha J. Farah: Conceptualization; Formal analysis; Investigation; Methodology; Writing—Original draft. Saul Sternberg: Conceptualization; Formal analysis; Software; Visualization; Writing—Review & editing. Thomas A. Nichols: Data curation; Formal analysis; Writing—Original draft. Jeffrey T. Duda: Data curation; Writing—Original draft. Terry Lohrenz: Data curation; Investigation; Methodology; Writing—Review & editing. Yi Luo: Investigation; Methodology; Writing—Review & editing. Libbie Sonnier: Data curation; Writing—Review & editing. Sharon L. Ramey: Conceptualization; Resources; Writing—Review & editing. Read Montague: Conceptualization; Investigation; Methodology; Resources; Writing—Review & editing. Craig T. Ramey: Conceptualization; Resources; Writing—Review & editing.
Funding Information
This work was supported by a Principal Research Fellowship from the Wellcome Trust (R. M.), Virginia Tech (R. M.) and the School of Arts and Sciences Research Fund, University of Pennsylvania (M. J. F.).
Data and Materials Availability
Anonymized brain measures analyzed here with group membership, age, and sex, as well as analyzed regional differences in volume, surface area, and thickness, are available at https://figshare.com/articles/Early_Experience_Volume_Cortical_Thickness_Surface_Area_data/9161894.
Diversity in Citation Practices
A retrospective analysis of the citations in every article published in this journal from 2010 to 2020 has revealed a persistent pattern of gender imbalance: Although the proportions of authorship teams (categorized by estimated gender identification of first author/last author) publishing in the Journal of Cognitive Neuroscience (JoCN) during this period were M(an)/M = .408, W(oman)/M = .335, M/W = .108, and W/W = .149, the comparable proportions for the articles that these authorship teams cited were M/M = .579, W/M = .243, M/W = .102, and W/W = .076 (Fulvio et al., JoCN, 33:1, pp. 3–7). Consequently, JoCN encourages all authors to consider gender balance explicitly when selecting which articles to cite and gives them the opportunity to report their article's gender citation balance. The authors of this article report its proportions of citations by gender category to be as follows: M/M = .542, W/M = .125, M/W = .167, and W/W = .167.