Abstract
Children’s reading progress typically slows during extended breaks in formal education, such as summer vacations. This stagnation can be especially concerning for children with reading difficulties or disabilities, such as dyslexia, because of the potential to exacerbate the skills gap between them and their peers. Reading interventions can prevent skill loss and even lead to appreciable gains in reading ability during the summer. Longitudinal studies relating intervention response to brain changes can reveal educationally relevant insights into rapid learning-driven brain plasticity. The current work focused on reading outcomes and white matter connections, which enable communication among the brain regions required for proficient reading. We collected reading scores and diffusion-weighted images at the beginning and end of summer for 41 children with reading difficulties who had completed either 1st or 2nd grade. Children were randomly assigned to either receive an intensive reading intervention (n = 26; Seeing Stars from Lindamood-Bell which emphasizes orthographic fluency) or be deferred to a wait-list group (n = 15), enabling us to analyze how white matter properties varied across a wide spectrum of skill development and regression trajectories. On average, the intervention group had larger gains in reading compared to the non-intervention group, who declined in reading scores. Improvements on a proximal measure of orthographic processing (but not other more distal reading measures) were associated with decreases in mean diffusivity within core reading brain circuitry (left arcuate fasciculus and left inferior longitudinal fasciculus) and increases in fractional anisotropy in the left corticospinal tract. Our findings suggest that responses to intensive reading instruction are related predominantly to white matter plasticity in tracts most associated with reading.
1 Introduction
Reading disabilities are the most common learning disability (Shaywitz, 1998), impacting as many as 20% of children (Wagner et al., 2020). Formal reading instruction begins at school entry (around 6 years old) for most children in the United States, and readers continue developing their skills in and out of school contexts. However, during extended formal education breaks such as summer vacation, typically a two to three month period in U.S. schools, reading progress typically slows (Cooper et al., 1996; Entwisle et al., 1997; von Hippel et al., 2018). Extended suspension of formal schooling can exacerbate achievement gaps among vulnerable readers who do versus do not participate in reading instruction (Christodoulou et al., 2017) as well as between vulnerable readers and their typically reading peers, as was observed during COVID-19 school disruptions (Kuhfeld et al., 2023). Reading interventions over the summer can halt reading skill loss and even lead to appreciable gains in reading abilities for struggling readers (Christodoulou et al., 2017; Donnelly et al., 2019).
1.1 White matter supporting reading
Fluent reading is enabled by coordination of a network of regions in the brain. This system is typically left-lateralized (Murphy et al., 2019), consistent with the frequent left-lateralization of language processing (Enge et al., 2020; Fedorenko et al., 2011; Vigneau et al., 2006). A set of white matter tracts propagate information within this network (Ben-Shachar et al., 2007). Among these tracts are the inferior longitudinal fasciculus (ILF), which connects primary visual cortices to their ipsilateral anterior temporal lobes (Herbet et al., 2018), and the arcuate fasciculus (AF), which connects ipsilateral temporal, parietal, and frontal regions (Catani et al., 2005). Surgical resection and lesion mapping studies of these tracts support their necessity for fluent reading; individuals with damage to these regions exhibit dysfluent reading (Epelbaum et al., 2008; Herbet et al., 2018; Ng et al., 2021; Zemmoura et al., 2015). The ILF is predominantly involved in early processes of word reading, such as orthographic recognition, by carrying primary visual signals to the posterior visual word form area (VWFA; see Dehaene & Cohen, 2011) in the (typically left) ventral occipitotemporal cortex (Bouhali et al., 2014; Lerma-Usabiaga et al., 2018; Yeatman & White, 2021; Yeatman et al., 2013). The ILF’s anterior temporal projections also support semantic processing (Duffau et al., 2013; Herbet et al., 2018; Saur et al., 2008; Shin et al., 2019). The AF’s connections from the anterior VWFA to temporoparietal and frontal regions (e.g., linking Wernicke’s and Broca’s area) are involved in abstracting phonological and higher-order language representations from printed text (Catani & Mesulam, 2008; Catani et al., 2005; Weiner et al., 2017). Other bundles likely support reading as well, including the inferior fronto-occipital fasciculus (ipsilateral frontal-to-occipital connections), superior longitudinal fasciculus (ipsilateral frontal-to-parietal connections), ventral occipital fasciculus (ipsilateral dorsolateral-to-ventrolateral visual cortex connections), and splenium of the corpus callosum (providing interhemispheric visual cortices connections) (Ben-Shachar et al., 2007; Vandermosten et al., 2012; Yeatman et al., 2013), but these bundles have not been as frequent of a focus in reading-related neuroimaging literature compared to the AF and ILF.
1.2 Diffusion-weighted imaging and white matter plasticity
Learning is thought to drive long-term plasticity in white matter (Fields, 2015; Fields et al., 2014; Sampaio-Baptista & Johansen-Berg, 2017; Xin & Chan, 2020). Such changes may manifest as alterations to axonal geometry (e.g., diameter modulations or axonal pruning/branching), myelin remodeling driven by oligodendrocyte proliferation and differentiation, or variations to extra-axonal glial cells and vascular systems (Sampaio-Baptista & Johansen-Berg, 2017). White matter micro- and macro-structural properties can be inferred in vivo non-invasively with diffusion-weighted imaging (DWI; Basser et al., 1994). Longitudinal DWI studies have related skill acquisition to white matter changes in behavior-relevant bundles in animals (Blumenfeld-Katzir et al., 2011; Sampaio-Baptista et al., 2013) and humans (Metzler-Baddeley et al., 2017; Scholz et al., 2009), providing evidence for DWI’s utility in quantifying white matter plasticity. Most DWI studies of plasticity have used metrics from the diffusion tensor imaging (DTI) model, including fractional anisotropy (FA) and mean diffusivity (MD). FA measures the degree to which water molecule movement is directionally dependent (varies between 0—water moves equally as well in all directions, and 1—water moves only along a single axis), while MD is related to the total magnitude of water movement across all directions. Higher FA and lower MD are thought to indicate well-myelinated white matter (however, see the Discussion section for limitations surrounding these metrics). Given the high test-retest reliability of DTI measures and modest rates of developmental change (Behler et al., 2021; Wu et al., 2022; Yu et al., 2020), observing significant and rapid changes in these metrics is encouraging in suggesting that white matter alterations are occurring and manifest at a level resolvable by MRI.
1.3 Longitudinal relationships of reading performance and white matter properties
Reading is an appropriate and educationally relevant domain for investigations of learning-driven neural plasticity. First, reading is a skill that has been socio-culturally introduced too recently to be a product of evolution or natural selection pressure. Second, reading must be explicitly taught, and highly reliable measures exist to gauge reading performance (Torgesen et al., 2012; Woodcock, 2011). Most of the longitudinal neuroimaging literature in reading has been on the order of several months to years. Longitudinal DWI studies of long-term reading development have shown that trajectories of white matter properties and reading skills are significantly linked (Moulton et al., 2019), especially in the left AF, such that increases in tract volume (Myers et al., 2014) and FA (Roy et al., 2024; Van Der Auwera et al., 2021; Wang et al., 2017; Yeatman et al., 2012) accompany improvements in reading among children with diverse reading abilities, although the opposite trend for FA has also been reported for children with reading disabilities (Yeatman et al., 2012). A comparison of illiterate and ex-illiterate adults suggests that developing literacy is associated with higher FA in the left AF (Thiebaut de Schotten et al., 2014). Similarly, higher FA in core reading tracts have predicted subsequently better future reading outcomes in children (Borchers et al., 2019; Davison et al., 2023; Gullick & Booth, 2015), as well as better reading-adjacent skills, such as phonological awareness, among pre-readers (Saygin et al., 2013; Zuk, Yu, et al., 2021). These reports collectively suggest that reading outcomes are related to microstructural changes in white matter that are known to support reading.
In longitudinal studies of reading disabilities, there has not only been a focus on left-hemispheric core reading tracts, which have exhibited lower FA among pre-readers with familial risk of dyslexia (Langer et al., 2017; Vandermosten et al., 2015) and future diagnoses of dyslexia (Vanderauwera et al., 2017), but also on their right hemispheric homotopes. Higher FA in the right superior longitudinal fasciculus has predicted future reading outcomes in children with dyslexia (Hoeft et al., 2011) and longitudinal FA increases in this tract relates to positive reading development in children with familial risk for dyslexia (Wang et al., 2017). These studies suggest the right hemisphere may provide a compensatory mechanism in reading disabilities.
The extant literature implies that white matter infrastructure could have a causal and dynamic relationship with reading outcomes, as opposed to being a static genetically predisposed foundation that reflects individual differences in such outcomes. Indeed, longitudinal designs, as described above, have yielded stronger results than analogous high-powered cross-sectional studies that have suggested little-to-no relationship between DTI measures and individual differences in reading skills (Meisler & Gabrieli, 2022a; Moreau et al., 2018; Roy et al., 2024).
1.4 White matter plasticity in reading intervention
While longitudinal studies of long-term reading development seem to have converged on the importance of left-hemispheric reading circuitry in predicting and tracking reading outcomes, neuroimaging studies focusing on short-term intensive reading instruction (on the order of days-to-weeks) have yielded few and mixed findings on rapid anatomical correlates of reading remediation (as well as functional correlates—(Barquero et al., 2014; Braid & Richlan, 2022; Perdue et al., 2022)). Huber et al. (2018) found that decreases in MD across the brain, not limited to core reading circuitry, were related to better reading intervention benefits across participants. One study reported better intervention responses were related to increases in FA only in the anterior left centrum semiovale (Keller & Just, 2009), a broad term for white matter between the corpus callosum and cortical surface which is not considered a part of core reading circuitry. Another study found decreased MD in several left hemispheric regions after reading intervention, although right-hemispheric regions were not reported (Richards et al., 2017). However, a different study concluded that white matter microstructure did not change with reading intervention, but lower right-hemispheric dorsal white matter MD prior to intervention predicted better intervention outcomes (Partanen et al., 2021). In young pre-readers undergoing early literacy training, pre-to-post increases in FA in the left AF and ILF were observed, but these were ultimately attributed to developmental, as opposed to intervention-driven, processes (Economou et al., 2022). In summary, while rapid white matter changes may be observed in a short period of time in the context of intensive reading intervention, it is unclear whether these changes are reproducible, domain-specific (i.e., localized to tracts that typically support reading), or dissociable from underlying developmental trajectories. Inconsistent findings could be driven by a variety of factors, including publication bias to report positive findings, small sample sizes, and variation in participant characteristics, interventions, and neuroimaging acquisition and analysis protocols (Perdue et al., 2022; Roy et al., 2024; Schilling, Rheault, et al., 2021; Schilling, Tax, et al., 2021; Thornton & Lee, 2000). It is also a possibility that intervention-driven effects are not robust or generalizable, reflecting unique properties of the intervention used or cohort studied.
In the present study, we examined changes in reading skill and white matter microstructure over the course of a six-week summer reading intervention among children with reading disabilities. We focused on properties averaged across all white matter and specifically within seven white matter tracts: the left AF and ILF as core reading circuitry bundles, their right-sided homotopes as potential compensatory bundles, bilateral corticospinal tracts (CST) as bundles that are not thought to subserve reading, and the splenium of the corpus callosum, which may support reading but has been shown to be microstructurally stable during reading intervention (Huber et al., 2018). We hypothesized observing one of two outcomes: (1) decreases in MD and/or increases in FA in all tracts (besides the splenium) would be related to better intervention responses (e.g., Huber et al., 2018), or (2) this effect would be localized to just the left AF, consistent with multiple studies tracking reading development on longer time scales (Roy et al., 2024; Van Der Auwera et al., 2021; Yeatman et al., 2012).
2 Materials and Methods
2.1 Ethics statement
This project was approved by the Massachusetts Institute of Technology’s Committee on the Use of Humans as Experimental Subjects (protocol number: 1201004850). Informed written consent was obtained from parents or legal guardians, while informed written assent was obtained from the participants, who were all minors.
2.2 Participants
Participants included in the present study were recruited for a broader overarching study, for which reading (Christodoulou et al., 2017) and gray matter morphometric (Romeo et al., 2018) findings have been previously reported. Forty-one participants passed all inclusion and quality control criteria and were analyzed in the present study (see Data Inclusion and Quality Control section). All participants were between 7 and 9 years old at the time of enrollment and were entering the summer having completed grades 1 or 2. Inclusion criteria included a history of reading difficulty based on parental report and a manifestation of reading difficulty at study enrollment. In particular, to be included in the study, participants had to have scored “At Risk” or “Some Risk” on the Dynamic Indicators of Basic Early Literacy Skills test (DIBELS; Good et al., 2002) and below the 25th percentile on at least 3 of the 5 following measures: Elision and Nonword Repetition subtests from the Comprehensive Test of Phonological Processing, 2nd Edition (CTOPP-2; Wagner et al., 1999), and the Objects, Letters, and 2-set Letters and Numbers subtests of the Rapid Automatized Naming and Rapid Alternating Stimulus Tests (RAN/RAS; Wolf & Denckla, 2005). Additionally, participants had to score at or above the 16th percentile on the Matrices subtest of the Kaufman Brief Intelligence Test, 2nd Edition (KBIT-2; Kaufman & Kaufman, 2004), which is a measure of nonverbal cognitive ability. All children were native English speakers. Children were recruited from a local partner charter school and the Greater Boston area. Socioeconomic information was collected from parents, who completed the Barratt Simplified Measure of Social Status (Barratt, 2006).
2.3 Reading intervention
Participants were randomly assigned to either receive a reading intervention (n = 26) or be placed on a waiting-list (n = 15). Comprehensive details of the intervention have been previously described (Christodoulou et al., 2017). Intervention participants completed intensive reading instruction following the Seeing Stars: Symbol Imagery for Fluency, Orthography, Sight Words, and Spelling program (Bell, 1997). Instruction was delivered by trained Lindamood-Bell teachers, who rotated classrooms hourly. The program duration was 4 hours per day on 5 days per week for 6 weeks; intervention duration totaled between 100 and 120 hours. Students received small group instruction (3-to-5 students per group) to improve foundational reading skills including phonological and orthographic processing, with an emphasis on visual and orthographic skills. Children recruited from the local partner school received the intervention on-site at their school (n = 9), while children recruited from the community-at-large (n = 17) received the intervention at a dedicated space at the Massachusetts Institute of Technology.
2.4 Outcome measures
Standardized reading scores were collected from all participants before and after the intervention period, regardless of whether they participated in the intervention. A reading measure proximal to the intervention, the Symbol Imagery Test (SIT), measured orthographic processing in reading (Bell, 2010). During the SIT, participants briefly viewed cards with words or pseudowords for between 2 and 7 seconds and were then asked to report what they were shown. Cronbach’s α values range from 0.86 to 0.88, and the test–retest reliability is 0.95 (Bell, 2010). A relatively distal composite reading index was calculated at each time point by averaging the following four age-standardized reading scores: Sight Word Efficiency (SWE) and Phonemic Decoding Efficiency (PDE) from the Test of Word Reading Efficiency, 2nd Edition (TOWRE-2; Torgeson et al., 1999), and Word Identification (WID) and Word Attack (WA) from the Woodcock Reading Mastery Tests, 3rd Edition (WRMT-3; Woodcock, 2011). Although these measures included orthographic processing, they involved multiple other processes involved in word reading accuracy and fluency. Timed and untimed single word reading skills were measured by SWE and WID, respectively, while timed and untimed pseudoword reading skills were measured by PDE and WA, respectively. For all four subtests, Form A was administered at the beginning of the study, and Form B was administered at the end of the study to avoid practice or familiarity effects. High alternate form reliability has been reported for standardized tests scores on both the WRMT-3 subtests (Word ID: r = 0.93, Word Attack: r = 0.76; Woodcock, 2011) and the TOWRE-2 subtests (SWE: r = 0.90, PDE: r = 0.92; Torgesen et al., 2012). Age-normed scores for all tests were defined such that the population mean is 100, with a standard deviation of 15.
2.5 Neuroimaging acquisition
All participants, regardless of intervention site, were scanned at the Athinoula A. Martinos Imaging Center at the Massachusetts Institute of Technology using a 3 Tesla Siemens TimTrio scanner and standard 32 channel head coil. During each session, a T1-weighted (T1w) MPRAGE image was acquired with the following parameters: TR = 2.53 s, TE = 1.64 ms, Flip Angle = 7°, and 1 mm isotropic voxels. A diffusion-weighted image (DWI) was acquired with the following parameters: TR = 9.3 s, TE = 84 ms, Flip Angle = 90°, 2 mm isotropic voxels, and 10 b0 volumes followed by 30 non-collinear directions at b = 700 s/mm2. Age-appropriate movies were shown during these scans to increase scan engagement and reduce head motion (Greene et al., 2018). Functional MRI tasks were also collected but are not discussed here. Before the first MRI session, participants were introduced to the MRI by visiting the center’s pediatric mock scanner, which allows children to get acclimated with MRI noise and lying still in the machine, which improves scan compliance (de Bie et al., 2010; Gao et al., 2023).
2.6 MRI preprocessing and tract segmentation
MRI preprocessing and tract segmentation were performed according to the longitudinal TRActs Constrained by UnderLying Anatomy (TRACULA) pipeline (Maffei et al., 2021; Yendiki et al., 2011, 2016), as part of FreeSurfer version 7.2 (Fischl, 2012; Fischl et al., 2002; Reuter et al., 2012). This method uses longitudinal anatomical priors to produce more plausible tracts compared to creating independent segmentations at each time point (Yendiki et al., 2016), as well as leverages high-quality training data to help inform tract shapes on routine-quality DWI data (Maffei et al., 2021). To achieve this, FreeSurfer’s longitudinal processing pipeline (Reuter et al., 2012) was run on each participant’s pre and post T1w images to create an unbiased subject template image (Reuter & Fischl, 2011) using inverse consistent registration (Reuter et al., 2010). Information from this template was used to initialize several steps of the recon-all pipeline, such as skull-stripping and anatomical segmentation (Reuter et al., 2012).
DWI volumes from each image were aligned to the first b0 image in that scan. The b-matrix was rotated accordingly (Leemans & Jones, 2009). DWI images were corrected for motion and eddy currents with FSL’s eddy command (Andersson & Sotiropoulos, 2016). Information from this process was used to generate four measures of head motion and image quality that inform the “total motion index” (Yendiki et al., 2014): mean volume-by-volume head rotation, mean volume-by-volume head translation, proportion of slices with signal dropout, and severity of signal dropout. The diffusion tensor was fitted using FSL’s dtifit. Mean diffusivity (MD) and fractional anisotropy (FA) were derived from the tensor. A GPU-accelerated ball-and-stick model was fit for each DWI image (Behrens et al., 2007; Hernández et al., 2013; Jbabdi et al., 2012). At each time point, a registration was computed between the diffusion-weighted image and T1w image (native space) using an affine boundary-based registration algorithm (Greve & Fischl, 2009). This transformation was used to bring anatomical segmentations into DWI space. The DWI-to-T1w and T1w-to-template registrations were multiplied to get a DWI-to-template transformation. Information from high-resolution 7T training data (Maffei et al., 2021) was used to estimate endpoint ROIs and pathways for white matter tracts in each participant’s template space images. These data were then brought back into the native DWI space of each time point. The DWI ball-and-stick model and tract anatomical priors were used to calculate the probability density of each pathway. From these, we collected the average MD and FA from the cores of our tracts of interest ([MD|FA]_Avg_Center) to mitigate concerns of noise and partial volume effects from fibers branching towards the exterior and extremities of the bundles. These tracts included the bilateral AF, ILF, and CST, as well as the splenium of the corpus callosum (Fig. 1). Additionally, at each time point, we calculated the laterality index of microstructural measures among bilateral tracts ([L-R]/[L+R]), as well as the average of the microstructural measures within the FreeSurfer-produced white matter segmentation mask. This average, unlike a whole-brain average, does not covary with the proportions of gray-to-white matter.
2.7 Statistics and analysis
Analyses were prepared, run, and visualized using Python packages Pandas 1.3.2 (McKinney, 2011), Statsmodels 0.13.5 (Seabold & Perktold, 2010), and Seaborn 0.12.1 (Waskom, 2021), respectively. We made a dataframe that contained the following phenotypic fields for each subject: ages at each scan (in months), sex (binary categorical factor), and reading measures at pre and post time points (as well as their longitudinal pre-to-post differences). For each tract, the pre and post MD and FA metrics were added to the dataframe, along with their longitudinal pre-to-post differences. Laterality indexes of microstructural measures and their longitudinal differences for bilateral tracts were also added. Microstructural measures averaged across white matter and their longitudinal differences were additionally added. The total motion indices, calculated separately for each time point, were also added to the dataframe (see Data Inclusion and Quality Control section).
We used ordinary least squares linear models to run multiple regressions, allowing us to relate reading measures to white matter microstructure while controlling for confounds. For the primary analyses, we created models to relate pre-to-post changes in a given tract microstructural measure (dependent variable) to the longitudinal difference in a reading measure across all participants (independent variable), with nuisance regressors for sex, age at first scan, and motion indices at both time points. We also ran a series of related supplemental analyses to provide additional context for the analysis. These included: (1) models with radial and axial diffusivities (RD and AD, respectively) as the primary DTI metric (Fig. S1); (2) models using just participants who completed the reading intervention (Fig. S3); and (3) cross-sectional models at each time point (with age and motion confounds being derived from the particular time point) (Fig. S4). For each model, the effect size (ΔR2adj) was calculated as the difference in adjusted R2 coefficients between that from the full model and from a reduced model without the reading score predictor of interest. A family of tests was considered as the set of tests across tracts for a given reading measure and microstructural metric. This included up to 11 tests (3 bilateral tracts with their laterality indexes, the splenium, and the average white matter). Benjamini-Hochberg false-discovery rate (FDR) correction for multiple hypotheses (Benjamini & Hochberg, 1995) was performed within each family of tests.
2.8 Data inclusion and quality control
A total of 153 children were recruited as part of a broader study. Fifty-two participants had anatomical and DTI scans at both time points and were able to complete the neuroimaging processing pipeline without errors. Forty-four of the remaining participants had the necessary phenotypic data. As a quality assurance metric, we computed the total motion index (TMI; Yendiki et al., 2014). TMI is related to four measures: rotation, translation, signal dropout prevalence, and signal dropout severity. For each scan, we calculated each motion metric’s difference from the study population mean for the given time point, divided by the interquartile range of the metric. The TMI for each scan is the cumulative sum of these calculations across the four motion metrics. Three subjects had outliers in TMI at either time point and were excluded. For further quality assurance, we confirmed that no remaining participant had any tract-averaged FA lower than 0.3, which could indicate some combination of white matter disorganization and partial volume effects from a tract branching into significant amounts of gray matter or CSF. Thus, a total of 41 subjects (26 who received an intervention, and 15 in the non-intervention group) were analyzed in the present study.
We conducted a power analysis, in which we used an estimated effect size of |r| = 0.40. This corresponds to the relationship between mean diffusivity and changes in TOWRE reading scores observed in the left AF and ILF in a related study (see Table 1 in Huber et al., 2018 for reference). To resolve that effect at α = 0.05 and power of 0.8, one would need n = 34 participants, as calculated by the G*Power software version 3.1 (Faul et al., 2009), which our sample exceeds.
Participants . | All (n = 41) . | Intervention (n = 26) . | Non-intervention (n = 15) . | Effect size . |
---|---|---|---|---|
Age at first scan (months) | 94.85 (1.16) | 95.00 (1.32) | 94.60 (2.29) | d = 0.053 |
Sex (M/F) | 26/15 | 16/10 | 10/5 | Φ = 0.051 |
Handedness (L/R) | 7/34 | 7/19* | 0/15* | Φ = 0.345* |
SES (years of parental education) | 17.89 (0.41) | 18.00 (0.50) | 17.70 (0.75) | d = 0.112 |
KBIT Matrices | 101.7 (2.03) | 100.3 (2.41) | 104.2 (3.70) | d=0.301 |
SIT (Pre/Post/Diff) | 89.76 (1.77)/93.20 (1.69)/3.439 (2.06) | 88.62 (2.15) /97.07 (1.87)*/8.462 (2.15)* | 91.73 (3.11) /86.47 (2.53)*/-5.267 (3.21)* | d = 0.275/d = 1.102*/d = 1.191* |
Composite Reading Index (Pre/Post/Diff) | 83.21 (1.28)/81.11 (1.34)/-2.101 (0.957) | 82.37 (1.51)/82.34 (1.57)/-0.029 (1.27)* | 84.67 (2.34)/78.97 (2.42)/-5.694 (0.859)* | d = 0.280/d = 0.395/d = 1.102* |
TOWRE SWE (Pre/Post) | 83.49 (1.78)/80.46 (1.99) | 83.46 (2.23)/80.81 (2.61) | 83.53 (3.06)/79.87 (3.11) | d = 0.006/d = 0.073 |
TOWRE PDE (Pre/Post) | 79.88 (1.40)/77.63 (1.51) | 78.46 (1.64)/79.42 (1.74) | 82.33 (2.48)/74.29 (2.70) | d = 0.439/d = 0.552 |
WRMT Word ID (Pre/Post) | 83.68 (1.45)/82.68 (1.46) | 83.50 (1.79)/83.69 (1.89) | 84.00 (2.57)/80.93 (2.27) | d = 0.053/d = 0.295 |
WRMT Word Attack (Pre/Post) | 86.00 (1.70)/83.80 (1.43) | 84.32 (2.06)/85.42 (1.52) | 88.80 (2.89)/81.00 (2.84) | d = 0.421/d = 0.489 |
Participants . | All (n = 41) . | Intervention (n = 26) . | Non-intervention (n = 15) . | Effect size . |
---|---|---|---|---|
Age at first scan (months) | 94.85 (1.16) | 95.00 (1.32) | 94.60 (2.29) | d = 0.053 |
Sex (M/F) | 26/15 | 16/10 | 10/5 | Φ = 0.051 |
Handedness (L/R) | 7/34 | 7/19* | 0/15* | Φ = 0.345* |
SES (years of parental education) | 17.89 (0.41) | 18.00 (0.50) | 17.70 (0.75) | d = 0.112 |
KBIT Matrices | 101.7 (2.03) | 100.3 (2.41) | 104.2 (3.70) | d=0.301 |
SIT (Pre/Post/Diff) | 89.76 (1.77)/93.20 (1.69)/3.439 (2.06) | 88.62 (2.15) /97.07 (1.87)*/8.462 (2.15)* | 91.73 (3.11) /86.47 (2.53)*/-5.267 (3.21)* | d = 0.275/d = 1.102*/d = 1.191* |
Composite Reading Index (Pre/Post/Diff) | 83.21 (1.28)/81.11 (1.34)/-2.101 (0.957) | 82.37 (1.51)/82.34 (1.57)/-0.029 (1.27)* | 84.67 (2.34)/78.97 (2.42)/-5.694 (0.859)* | d = 0.280/d = 0.395/d = 1.102* |
TOWRE SWE (Pre/Post) | 83.49 (1.78)/80.46 (1.99) | 83.46 (2.23)/80.81 (2.61) | 83.53 (3.06)/79.87 (3.11) | d = 0.006/d = 0.073 |
TOWRE PDE (Pre/Post) | 79.88 (1.40)/77.63 (1.51) | 78.46 (1.64)/79.42 (1.74) | 82.33 (2.48)/74.29 (2.70) | d = 0.439/d = 0.552 |
WRMT Word ID (Pre/Post) | 83.68 (1.45)/82.68 (1.46) | 83.50 (1.79)/83.69 (1.89) | 84.00 (2.57)/80.93 (2.27) | d = 0.053/d = 0.295 |
WRMT Word Attack (Pre/Post) | 86.00 (1.70)/83.80 (1.43) | 84.32 (2.06)/85.42 (1.52) | 88.80 (2.89)/81.00 (2.84) | d = 0.421/d = 0.489 |
The italic d refers to the “Cohen’s d” effect size. The Φ refers to the “Cramer’s phi” effect size.
Values are provided as Mean (SEM). All cognitive measures are age-standardized. Two-sample t-tests were used to test for differences between groups, with the exception of χ2-tests for sex and handedness distributions. * denotes p < 0.05. Abbreviations: SES - Socioeconomic Status; KBIT - Kaufman Brief Test of Intelligence; SIT - Symbol Imagery Test; TOWRE - Test of Word Reading Efficiency; SWE - Sight Word Efficiency; PDE - Phonemic Decoding Efficiency; and WRMT - Woodcock Reading Mastery Tests.
3 Results
3.1 Cognitive and phenotypic data
Phenotypic summary statistics for the participant cohort are provided in Table 1. Of note, the intervention and non-intervention groups were matched in sex (χ2 test, p > 0.7), but not handedness (χ2 test, p < 0.05). We did not include handedness as a regressor in our models due to a lack of evidence of handedness-related asymmetry in white matter microstructure (Jang et al., 2017; López-Vicente et al., 2021). The two groups were also matched in age, non-verbal intelligence (KBIT), socioeconomic status, and reading performance across all subtests cross-sectionally at each time point (two-sample t-test, p > 0.1 across all tests), with the exception of the intervention group having a significantly higher SIT score post-intervention (two-sample t-test, p < 0.005). Demonstrating the efficacy of the reading intervention, the intervention group showed larger longitudinal pre-to-post differences in the SIT and composite reading index (two-sample t-test, p < 0.002 across both tests), driven by the non-intervention group regressing in both measures and the intervention group improving on the SIT and maintaining scores on the composite reading index (Fig. 2). These results are consistent with what was observed in the larger cohort from which the present subset was derived (Christodoulou et al., 2017).
3.2 Relationship between changes in white matter microstructure and reading scores
Across all participants, pre-to-post decreases of MD in the left AF and left ILF were related to improvements in SIT scores over the summer (Table 2, Fig. 3). SIT score trajectories accounted for ~9% of variance among MD changes in the left AF, and ~16% of MD difference variance in the left ILF. Additionally, longitudinal differences in SIT scores accounted for ~21% of variance in changes in ILF MD laterality, following a similar trend of decreasing leftward laterality of MD relating to improvements in reading. However, similar effects were not present when considering the composite reading index. Similar patterns to MD results were also observed when using radial diffusivity as the DTI metric of interest (Fig. S1). Decreasing splenium MD was marginally correlated with improvements in both reading measures (p < 0.1), with each test score accounting for ~5% of variance in microstructure. Increasing FA in the left CST (p < 0.05) and to a lesser extent the left AF (p < 0.1) were related to improvements in SIT scores, but not the composite reading index (Fig. 3, Table 3). Changes in white matter average FA and MD did not relate to changes in either reading measure (and these metrics were not different between groups before or after the intervention, Fig. S2). After multiple comparison correction, the models relating changes in SIT scores to differences in the ILF MD laterality (pFDR = 0.037) and left ILF MD (pFDR = 0.056) remained statistically significant or marginally significant.
Tract . | SIT score . | Composite reading index . | ||||||
---|---|---|---|---|---|---|---|---|
β [95% CI] . | ΔR2adj . | p-value . | pFDR . | β [95% CI] . | ΔR2adj . | p-value . | pFDR . | |
White Matter Average MD | -0.198 [-0.559, 0.163] | 0.006 | 0.274 | 0.381 | -0.022 [-0.338, 0.294] | -0.025 | 0.889 | 0.889 |
Left AF | -0.385 [-0.746, -0.024] | 0.092 | 0.037* | 0.130 | -0.219 [-0.541, 0.104] | 0.024 | 0.177 | 0.650 |
Right AF | 0.049 [-0.334, 0.431] | -0.026 | 0.798 | 0.810 | 0.080 [-0.249, 0.408] | -0.021 | 0.625 | 0.707 |
Laterality Index AF | -0.370 [-0.735, -0.005] | 0.083 | 0.047* | 0.130 | -0.260 [-0.581, 0.060] | 0.046 | 0.108 | 0.596 |
Left ILF | -0.480 [-0.839, -0.121] | 0.157 | 0.010* | 0.056† | -0.079 [-0.417, 0.260] | -0.023 | 0.641 | 0.707 |
Right ILF | 0.050 [-0.330, 0.430] | -0.026 | 0.792 | 0.810 | 0.104 [-0.222, 0.430] | -0.016 | 0.520 | 0.707 |
Laterality Index ILF | -0.540 [-0.889, -0.193] | 0.207 | 0.003* | 0.037* | -0.186 [-0.520, 0.148] | 0.008 | 0.267 | 0.655 |
Left CST | -0.197 [-0.558, 0.165] | 0.005 | 0.277 | 0.381 | -0.073 [-0.388, 0.243] | -0.201 | 0.643 | 0.707 |
Right CST | -0.193 [-0.539, 0.154] | 0.006 | 0.267 | 0.381 | -0.156 [-0.454, 0.143] | 0.002 | 0.298 | 0.655 |
Laterality Index CST | 0.046 [-0.338, 0.430] | -0.027 | 0.810 | 0.810 | 0.144 [-0.183, 0.471] | -0.006 | 0.378 | 0.692 |
Splenium | -0.292 [-0.624, 0.039] | 0.046 | 0.082† | 0.180 | -0.265 [-0.549, 0.019] | 0.054 | 0.066† | 0.596 |
Tract . | SIT score . | Composite reading index . | ||||||
---|---|---|---|---|---|---|---|---|
β [95% CI] . | ΔR2adj . | p-value . | pFDR . | β [95% CI] . | ΔR2adj . | p-value . | pFDR . | |
White Matter Average MD | -0.198 [-0.559, 0.163] | 0.006 | 0.274 | 0.381 | -0.022 [-0.338, 0.294] | -0.025 | 0.889 | 0.889 |
Left AF | -0.385 [-0.746, -0.024] | 0.092 | 0.037* | 0.130 | -0.219 [-0.541, 0.104] | 0.024 | 0.177 | 0.650 |
Right AF | 0.049 [-0.334, 0.431] | -0.026 | 0.798 | 0.810 | 0.080 [-0.249, 0.408] | -0.021 | 0.625 | 0.707 |
Laterality Index AF | -0.370 [-0.735, -0.005] | 0.083 | 0.047* | 0.130 | -0.260 [-0.581, 0.060] | 0.046 | 0.108 | 0.596 |
Left ILF | -0.480 [-0.839, -0.121] | 0.157 | 0.010* | 0.056† | -0.079 [-0.417, 0.260] | -0.023 | 0.641 | 0.707 |
Right ILF | 0.050 [-0.330, 0.430] | -0.026 | 0.792 | 0.810 | 0.104 [-0.222, 0.430] | -0.016 | 0.520 | 0.707 |
Laterality Index ILF | -0.540 [-0.889, -0.193] | 0.207 | 0.003* | 0.037* | -0.186 [-0.520, 0.148] | 0.008 | 0.267 | 0.655 |
Left CST | -0.197 [-0.558, 0.165] | 0.005 | 0.277 | 0.381 | -0.073 [-0.388, 0.243] | -0.201 | 0.643 | 0.707 |
Right CST | -0.193 [-0.539, 0.154] | 0.006 | 0.267 | 0.381 | -0.156 [-0.454, 0.143] | 0.002 | 0.298 | 0.655 |
Laterality Index CST | 0.046 [-0.338, 0.430] | -0.027 | 0.810 | 0.810 | 0.144 [-0.183, 0.471] | -0.006 | 0.378 | 0.692 |
Splenium | -0.292 [-0.624, 0.039] | 0.046 | 0.082† | 0.180 | -0.265 [-0.549, 0.019] | 0.054 | 0.066† | 0.596 |
denotes p < 0.05, † denotes p < 0.1. Abbreviations: AF - Arcuate Fasciculus; ILF - Inferior Longitudinal Fasciculus; CST - Corticospinal Tract; SIT - Symbol Imagery Test.
Tract . | SIT score . | Composite reading index . | ||||||
---|---|---|---|---|---|---|---|---|
β [95% CI] . | ΔR2adj . | p-value . | pFDR . | β [95% CI] . | ΔR2adj . | p-value . | pFDR . | |
White Matter Average FA | 0.108 [-0.276, 0.491] | -0.019 | 0.573 | 0.701 | -0.050 [-0.381, 0.282] | -0.026 | 0.763 | 0.839 |
Left AF | 0.292 [-0.052, 0.635] | 0.045 | 0.094† | 0.377 | 0.031 [-0.277, 0.339] | -0.024 | 0.839 | 0.839 |
Right AF | -0.063 [-0.449, 0.323] | -0.254 | 0.743 | 0.817 | 0.135 [-0.194, 0.464] | -0.009 | 0.411 | 0.828 |
Laterality Index AF | 0.267 [-0.107, 0.639] | 0.029 | 0.157 | 0.431 | -0.084 [-0.413, 0.246] | -0.021 | 0.609 | 0.828 |
Left ILF | 0.180 [-0.210, 0.569] | -0.004 | 0.356 | 0.507 | 0.101 [-0.237, 0.439] | -0.019 | 0.548 | 0.828 |
Right ILF | -0.172 [-0.532, 0.188] | -0.001 | 0.338 | 0.507 | -0.065 [-0.378, 0.248] | -0.021 | 0.677 | 0.828 |
Laterality Index ILF | 0.294 [-0.062, 0.651] | 0.044 | 0.103 | 0.377 | 0.0931 [-0.225, 0.411] | -0.017 | 0.556 | 0.828 |
Left CST | 0.3929 [0.031, 0.755] | 0.097 | 0.034* | 0.377 | 0.089 [-0.242, 0.421] | -0.020 | 0.587 | 0.828 |
Right CST | 0.155 [-0.191, 0.501] | -0.004 | 0.369 | 0.507 | 0.215 [-0.077, 0.507] | 0.027 | 0.144 | 0.828 |
Laterality Index CST | 0.171 [-0.181, 0.522] | -0.001 | 0.331 | 0.507 | -0.130 [-0.434, 0.174] | -0.006 | 0.391 | 0.828 |
Splenium | 0.036 [-0.340, 0.412] | -0.026 | 0.847 | 0.847 | 0.079 [-0.244, 0.402] | -0.020 | 0.621 | 0.828 |
Tract . | SIT score . | Composite reading index . | ||||||
---|---|---|---|---|---|---|---|---|
β [95% CI] . | ΔR2adj . | p-value . | pFDR . | β [95% CI] . | ΔR2adj . | p-value . | pFDR . | |
White Matter Average FA | 0.108 [-0.276, 0.491] | -0.019 | 0.573 | 0.701 | -0.050 [-0.381, 0.282] | -0.026 | 0.763 | 0.839 |
Left AF | 0.292 [-0.052, 0.635] | 0.045 | 0.094† | 0.377 | 0.031 [-0.277, 0.339] | -0.024 | 0.839 | 0.839 |
Right AF | -0.063 [-0.449, 0.323] | -0.254 | 0.743 | 0.817 | 0.135 [-0.194, 0.464] | -0.009 | 0.411 | 0.828 |
Laterality Index AF | 0.267 [-0.107, 0.639] | 0.029 | 0.157 | 0.431 | -0.084 [-0.413, 0.246] | -0.021 | 0.609 | 0.828 |
Left ILF | 0.180 [-0.210, 0.569] | -0.004 | 0.356 | 0.507 | 0.101 [-0.237, 0.439] | -0.019 | 0.548 | 0.828 |
Right ILF | -0.172 [-0.532, 0.188] | -0.001 | 0.338 | 0.507 | -0.065 [-0.378, 0.248] | -0.021 | 0.677 | 0.828 |
Laterality Index ILF | 0.294 [-0.062, 0.651] | 0.044 | 0.103 | 0.377 | 0.0931 [-0.225, 0.411] | -0.017 | 0.556 | 0.828 |
Left CST | 0.3929 [0.031, 0.755] | 0.097 | 0.034* | 0.377 | 0.089 [-0.242, 0.421] | -0.020 | 0.587 | 0.828 |
Right CST | 0.155 [-0.191, 0.501] | -0.004 | 0.369 | 0.507 | 0.215 [-0.077, 0.507] | 0.027 | 0.144 | 0.828 |
Laterality Index CST | 0.171 [-0.181, 0.522] | -0.001 | 0.331 | 0.507 | -0.130 [-0.434, 0.174] | -0.006 | 0.391 | 0.828 |
Splenium | 0.036 [-0.340, 0.412] | -0.026 | 0.847 | 0.847 | 0.079 [-0.244, 0.402] | -0.020 | 0.621 | 0.828 |
denotes p < 0.05, † denotes p < 0.1. Abbreviations: AF - Arcuate Fasciculus; ILF - Inferior Longitudinal Fasciculus; CST - Corticospinal Tract; SIT - Symbol Imagery Test.
When running the same models on only the 26 participants who completed the intervention (Fig. S3), significant relationships remained between pre-to-post decreases in MD in the left ILF and improvement in SIT scores (Fig. 3, p < 0.05, ΔR2adj = 0.238) and between pre-to-post decreases in MD in the splenium and improvement in composite reading index scores (p < 0.1, ΔR2adj = 0.093). Additionally, improvements in SIT scores were associated with decreases in white matter average MD (p < 0.05, ΔR2adj = 0.113) and with decreasing FA in the right ILF (p < 0.1, ΔR2adj = 0.123). The relationship between MD decreases in the left ILF and improvements in SIT scores remained marginally significant (p < 0.1; ΔR2adj = 0.098) after including white matter MD as an additional covariate post hoc. None of these supplementary models remained significant at α = 0.05 after FDR correction for multiple hypotheses.
4 Discussion
In the present study, we investigated whether changes in white matter microstructure were related to changes in reading skill during the summer among 41 children with reading disabilities. Reading ability trajectories varied on a wide spectrum, including score regression (the “summer slump”) and intervention-driven improvement. We focused on seven tracts within and outside of core reading circuitry, using two microstructural measures (FA and MD), and two reading measures. One reading measure, the SIT, was closely related to the orthographic focus of the intervention, while a separate composite reading index was more distal to the intervention and indexed a broader range of reading-related skills (e.g., phonological awareness and rapid automatized naming). We found that longitudinal decreases in MD (and leftward laterality of MD) were related to improved SIT scores in left-hemisphere core reading circuitry (the AF and ILF). Longitudinal increases in FA in the left CST were also related to improved SIT scores. Notably, none of these associations were present when considering the composite reading index, and only the relationship between improving SIT scores and decreasing leftward laterality of MD in the ILF was significant after FDR correction.
We originally hypothesized that we would see a relationship between improving white matter microstructure (lower MD and/or higher FA) and improving reading scores in either just the left AF or more globally. While neither of these hypotheses were supported, the pattern of results suggests that intervention effects were strongest (and in the hypothesized direction) within core reading circuitry when considering the reading measure most related to the intervention. The specificity of effects to core reading circuitry is supported by the use of the laterality index (which tends to be unrelated to global trends) and the mostly null results in models using white matter averaged microstructural measures. That ILF plasticity was most related to reading trajectories, compared to changes in the AF, might reflect the higher relative emphasis on orthographical and visual training in the intervention program (as opposed to processing phonological representations of print, which would be subserved by the AF). This might also explain changes observed in the splenium, which is thought to subserve more basic visual processes. The nature of the intervention may influence what brain structures are affected; for example, phonological-based instruction may have effects localized to brain structures supporting phonological processing (Perdue et al., 2022). Similarly, this pattern of results we observed might be representative of the relatively high reliance on visual and orthographic processing in early readers still learning the basics of decoding for reading (Badian, 2001).
The design of our study most closely resembles that of Huber et al. (2018), but with some important differences. The same intervention curriculum (Seeing Stars) was used in both studies, with similar instruction hours per week, but the present study had a shorter 6-week intervention period compared to the 8 weeks in Huber et al. (2018). Additionally, students in the present study were taught in small group settings, while a 1-on-1 approach was used in Huber et al. (2018). The longer intervention duration and more intense individualized instruction may have been factors that led to children in that study improving on their composite reading measure (composed of the same reading tests as in the present study), as opposed to only maintaining scores as found in the present study. Both studies had similar cohort sizes (41 and 43 children). However, the present study had participants within a narrower age-range of 7-9 years compared to 7-12 years old in Huber et al. (2018). Additionally, all participants in the present study were diagnosed with a reading disability, while 10 children in Huber et al. (2018) were typical readers. Both studies found that changes in MD in the left AF and ILF were related to reading outcomes over the course of the summer. However, Huber et al. (2018) found more widespread plasticity that did not include the splenium, while our cohort exhibited more domain-specific plasticity that included the splenium (p < 0.1). Beyond the contrasts between studies explained above, additional variation in data acquisition, processing, and statistical techniques likely contributed to differences in results between studies (Schilling, Rheault, et al., 2021; Schilling, Tax, et al., 2021).
It is encouraging that both the present study and Huber et al. (2018) found intervention effects in the left AF and ILF, which provides converging lines of evidence suggesting that intense educational instruction influences reading-relevant white matter tracts, albeit with different findings about plasticity occurring in a broader range of tracts. Future studies ought to address similar questions in different contexts to evaluate the reproducibility and generalizability of these findings. Presently, there are not enough extant studies on longitudinal neuroanatomical correlates of reading intervention (in either gray or white matter) to perform meaningful meta-analyses. In functional MRI, a meta-analysis of eight studies with longitudinal neuroimaging and cognitive scores concluded that there were no consistent locations where longitudinal changes in reading-invoked BOLD signal and intervention response covaried (Perdue et al., 2022). However, individual studies, including those that may have only contained one session of neuroimaging either prior or after intervention, have found intervention response both in putative reading regions as well as more globally (reviewed in Barquero et al., 2014; Braid & Richlan, 2022; Perdue et al., 2022). One of the few studies of gray matter morphometric correlates of intervention response was conducted on the same participant pool as in the present study (Romeo et al., 2018). This study concluded that children who improved exhibited significant cortical thickening in brain regions, including the left middle temporal gyrus, right superior temporal gyrus, and bilateral middle-inferior temporal cortex, inferior parietal lobule, precentral cortex, and posterior cingulate cortex. While some of these regions comprise left-lateralized core reading areas, others extend globally beyond the reading network. Considering these spatially distinct patterns of results for gray and white matter, it is unclear yet how to coincide different anatomical and functional measures of plasticity in response to reading intervention.
In exploratory analyses, we created cross-sectional models to investigate whether white matter microstructure and reading skills were associated at each time point (Fig. S4). Lower MD and higher FA in the left ILF were associated with better reading scores at the beginning of the summer, consistent with the left ILF’s critical role in supporting reading. We also found that the right ILF and right CST microstructure had significant associations with reading scores. Notably, this is not consistent with a previous report showing an inverse relationship between right ILF FA and reading outcomes among children with reading disabilities (Banfi et al., 2018), and in the present study, longitudinal microstructural trajectories in these right-sided homotopes were not linked with reading score changes. This might suggest that right-lateralized white matter serves as a static compensatory agent that reflects early reading outcomes in reading disabilities (e.g., Zuk, Dunstan, et al., 2021), but does not dynamically change with reading instruction. However, given the small sample size and limited power of cross-sectional designs, this should be interpreted with caution.
The significance of the models including the CST, both cross-sectionally and longitudinally, was unexpected given its seeming lack of a role in reading as a primary motor tract. Although effect sizes for these models were appreciably lower than those for the left ILF and AF, this suggests that intervention effects could still be detected to some extent outside of reading circuitry. The CST is not often focused in studies of reading abilities, but one study found that volumes of bilateral CST were informative in predicting dyslexia diagnoses in children (Cui et al., 2016). Additionally, other studies have found that FA in the left CST predicted future phonological skills (a critical pre-reading ability) in kindergarten (Zuk, Yu, et al., 2021), correlated with phonological processing in preschoolers (Walton et al., 2018), and corresponded with phonological encoding abilities in adults with brain damage (Han et al., 2016). This suggests that the left CST could be co-opted into reading circuitry in populations with deficient or not-fully developed language abilities, albeit its role in this context is not clear. This deviates from the more frequent focus of compensation from right-sided reading circuitry homotopes such as the right ILF and AF. However, speech and reading difficulties tend to co-occur (Catts, 1993; Hayiou-Thomas et al., 2010), and the left CST may be important for speech, evidenced by microstructural deficiencies in pre-term children with poor oromotor outcomes (Northam et al., 2012) and stuttering populations (Connally et al., 2014; Kronfeld-Duenias et al., 2016). It is also a possibility that the CST reconstructions largely intersected with the nearby corticobulbar projections, as tractography is prone to overlaps (Schilling et al., 2022). Corticobulbar projections innervate cranial nerves that support head and neck muscles, and properties of corticobulbar tracts have also been related to speech and language outcomes in preterm adolescents (Northam et al., 2019).
Associations between white matter microstructure and reading outcomes were found in relation to a proximal measure of orthographic skill, but not distal measures that included multiple processes related to reading words and pseudowords. The proximal outcome measure (SIT) involved rapid identification of letter sets and was thus directly aligned with the Seeing Stars intervention program’s emphasis on orthographic skills. We also expected but did not find statistically significant associations between white matter microstructure and the distal reading composite that included four single word reading measures (timed and untimed, real and pseudo-words). Previous research on this intervention reported the largest effect sizes (Cohen’s d) for the SIT measure (1.32), but more modest effect sizes for distal single word reading measures that constituted the reading composite score (Untimed word reading (WRMT WID): 0.96; untimed pseudoword reading (WRMT WA): 0.87; timed word reading (TOWRE SWE): 0.19; timed pseudoword reading (TOWRE PDE): 1.08; Christodoulou et al., 2017). The association between plasticity in white matter microstructure and change in SIT scores may reflect this specific substantial change, perhaps reflecting a minimal threshold of intervention impact that ties to structural plasticity.
The pattern of results in the present study favoring stronger effects in MD than FA imply that the microstructural changes accompanying reading development are related to extra-axonal factors, such as neurite density and CSF volume (Beaulieu, 2002; Genc et al., 2017), as opposed to axonal factors such as myelination, orientation coherence, and axonal density (Friedrich et al., 2020). This is consistent with other DWI studies of reading intervention (Huber et al., 2018, 2021). Our supplemental analyses, which showed that our results with MD were similar to models with radial (but not axial) diffusivity (Fig. S1), support the idea that improvements in reading might be limited to factors that restrict water movement in extracellular space, such as increased density of axons (Winklewski et al., 2018). However, multimodal research at various spatial and temporal resolutions will need to be reconciled to perform the nontrivial task of ascribing such changes to biophysical mechanisms (Jelescu et al., 2020). While higher FA and lower MD are often thought to reflect more “healthy” white matter, these metrics are biologically unspecific due to the variety of factors that can influence diffusion of water in voxel-sized regions and the confounding impact of crossing fibers (De Santis et al., 2014; Jones et al., 2013), which can impact as many as 90% of white matter voxels (Behrens et al., 2007; Jeurissen et al., 2013). Fiber-specific measures such as quantitative anisotropy (Yeh et al., 2013) and fixel-based metrics (Raffelt et al., 2017), and multicompartmental models such as NODDI (Zhang et al., 2012), can provide higher biological specificity and have shown promise in better resolving brain-behavior relationships in studies of reading abilities (Koirala et al., 2021; Meisler & Gabrieli, 2022b; Sihvonen et al., 2021). Unfortunately, the low angular resolution and weak single-shelled diffusion weighting of the present DTI acquisition scheme were not well-suited for these more novel approaches (Genc et al., 2020), effectively limiting us to using DTI metrics. Outside of DWI, related white matter neuroimaging sequences, such as myelin water imaging and quantitative T1 imaging, may provide more targeted insights into learning-driven plasticity in reading (Economou et al., 2023; Huber et al., 2021). Future longitudinal studies should consider incorporating these techniques.
Our results should be considered in the context of additional limitations. First, the only test that survived multiple comparison correction was the one that suggested decreasing leftward laterality of MD in the ILF was related to improving SIT scores (and the analogous model concerning only the left ILF was marginally significant at pFDR = 0.056). Despite this, we believe the strong effect sizes achieved by many of the models, even those that did not survive FDR correction, are noteworthy given the typically small effect sizes observed in analogous cross-sectional analyses (e.g., ~3% variance explained, as observed in Meisler & Gabrieli, 2022b). The small amount of within-participant data (two time points) precluded us from running more statistically sophisticated models, such as linear mixed-effect models as in Huber et al. (2018). This also limited our ability to infer when microstructural changes occurred over the course of the intervention and characterized the temporal relationship between changes in reading and tract properties (in other words, whether tract microstructural changes preceded changes in reading scores, or vice-versa). To address these points, future longitudinal studies of reading intervention should strive to contain at least three sessions of data collection (King et al., 2018). Our relatively small sample size of 41 reflects the challenges of collecting acceptable quality cognitive and multi-modal MRI data in children with learning disabilities undergoing intervention, which are particularly amplified for longitudinal studies (Davis et al., 2022). Even with these limitations, we found that white matter microstructural plasticity, predominantly in core reading circuitry, was related to changes in reading abilities over the summer in the context of short-term intensive educational intervention.
Data and Code Availability
Due to language used in the consenting process, we are not permitted to publicly share subject MRI images. Images may be privately distributed upon reasonable request. We share a CSV containing all necessary data to replicate the present results, as well as the code to recreate the analyses and figures. All instructions and code for processing data and running the statistical analyses can be found at https://github.com/smeisler/Meisler_ReadingInt_DWI. To execute the FreeSurfer workflows, we ran a Docker container containing FreeSurfer 7.2 and FSL 6.0.4 with Singularity (3.9.5) (Kurtzer et al., 2017). The container can be collected with either docker pull “amirro/tracula:latest” or singularity build tracula_container“.img docker://amirro/tracula:latest”. Development of these software may introduce improvements and bug fixes that should be used in future research, so we encourage using the latest stable releases.
Author Contributions
S.L.M.: Conceptualization, Formal analysis, Funding acquisition, Methodology, Software, Visualization, Writing—original draft, and Writing—review & editing. J.D.E.G.: Conceptualization, Funding acquisition, Supervision, and Writing—review & editing. J.A.C.: Conceptualization, Funding acquisition, Investigation, Supervision, and Writing—review & editing.
Funding
This work was supported by the National Institutes of Health (S.L.M.: 5T32DC000038 and 1F31HD111139; J.D.E.G. and J.A.C.: 1R01HD106122; JAC: 1R15HD102881), the Halis Family Foundation, and Reach Every Reader, a grant supported by the Chan Zuckerberg Foundation.
Declaration of Competing Interest
The authors declare no competing interests.
Acknowledgments
We thank all the participants and their families for volunteering their time to participate in the study. We also thank Atsushi Takahashi, Sheeba Arnold Anteraper, and Steven Shannon at the Athinoula A. Martinos Imaging Center at the McGovern Institute for Brain Research, Massachusetts Institute of Technology for technical assistance. We thank our colleagues and team members Rachel Romeo, Patricia Chang, Abigail Cyr, Pamela Hook, Jiayi Lin, Jack Murtagh, and Carly Schimmel.
Supplementary Materials
Supplementary material for this article is available with the online version here: https://doi.org/10.1162/imag_a_00108.