Abstract
Functional magnetic resonance imaging (fMRI) of the auditory and visual sensory systems of the human brain is an active area of investigation in the study of human health and disease. The medial geniculate nucleus (MGN) and lateral geniculate nucleus (LGN) are key thalamic nuclei involved in the processing and relay of auditory and visual information, respectively, and are the subject of blood-oxygen-level-dependent (BOLD) fMRI studies of neural activation and functional connectivity in human participants. However, localization of BOLD fMRI signal originating from neural activity in MGN and LGN remains a technical challenge, due, in part, to the poor definition of boundaries of these thalamic nuclei in standard T1-weighted and T2-weighted magnetic resonance imaging sequences. Here, we report the development and evaluation of an auditory and visual sensory thalamic localizer (TL) fMRI task that produces participant-specific functionally-defined regions of interest (fROIs) of both MGN and LGN, using 3 Tesla multiband fMRI and a clustered-sparse temporal acquisition sequence, in less than 16 minutes of scan time. We demonstrate the use of MGN and LGN fROIs obtained from the TL fMRI task in standard resting-state functional connectivity (RSFC) fMRI analyses in the same participants. In RSFC analyses, we validated the specificity of MGN and LGN fROIs for signals obtained from primary auditory and visual cortex, respectively, and benchmarked their performance against alternative atlas- and segmentation-based localization methods. The TL fMRI task and analysis code (written in Presentation and MATLAB, respectively) have been made freely available to the wider research community.
1 Introduction
The thalamus is a key neural processing hub in human sensory perception (Kandel, 2013; Mai & Paxinos, 2012; Sherman, 2016; Ward, 2013). Disruption of thalamocortical connectivity has been implicated in the pathogenesis of multiple neurological and psychiatric diseases (Anticevic et al., 2014, 2015; Chun et al., 2014, 2017; Giraldo-Chica et al., 2018; Karlsgodt, 2016; Kelly et al., 2018; Kim et al., 2014; Kubota et al., 2013; Murray & Anticevic, 2017; Pasternak et al., 2018; Peters et al., 2016; Pettersson-Yeo et al., 2011; Sheffield et al., 2020; Tamnes & Agartz, 2016; Tu et al., 2019; Woodward & Cascio, 2015; Woodward & Heckers, 2016; Woodward et al., 2012, 2017), including potential specific involvement of sensory thalamic nuclei (Chun et al., 2014, 2017; Giraldo-Chica et al., 2015; Huang et al., 2024). Accordingly, the study of neural activity in sensory thalamic nuclei and their functional connectivity with sensory cortex is an area of active investigation in the study of the human brain in health and disease (Almasabi et al., 2022; Bartlett, 2013; Chun et al., 2014, 2017; Erskine et al., 2016; Papadopoulou et al., 2019; Schmid et al., 2010; Truong et al., 2015), including through the use of blood-oxygen-level-dependent (BOLD) functional magnetic resonance imaging (fMRI) of human participants.
Within the thalamus, the medial geniculate nucleus (MGN) and lateral geniculate nucleus (LGN) are integral components of the human auditory and visual pathways, respectively, and play critical roles in the modulation and relay of incoming sensory information to the cortex (Kandel, 2013; Sherman, 2016; Ward, 2013). The MGN receives incoming sensory information from the cochlear nucleus, primarily through the inferior colliculus and superior olivary complex, and provides ascending inputs to the auditory cortex (AC; Bartlett, 2013; Brodal, 1981; Kandel, 2013; Lee, 2013). The AC, in turn, projects distally back to MGN, inferior colliculus, and superior olive (Bartlett, 2013; Lee, 2013), in order to facilitate integration of modulatory and contextual information, such as prediction error signals (Li & Ebner, 2016; Malmierca et al., 2015; Parras et al., 2017). Similarly, the LGN receives inputs from the retina, as well as the superior colliculus, and provides inputs to the visual cortex (VC), which projects back to LGN and superior colliculus (Gilbert & Li, 2013; Lee et al., 2010; Sherman, 2016).
BOLD fMRI studies that aim to measure activity within MGN and LGN hinge upon accurate localization of these structures. However, identifying regions of interest (ROIs) within the thalamus that contain BOLD activation patterns specific to auditory and visual perception using standard atlases or segmentation techniques is hindered by the relatively small size of these nuclei (e.g., estimates of approximately 120 mm3 and 60 mm3, respectively, for LGN (Muller-Axt et al., 2021) and MGN (Garcia-Gomar et al., 2019)), and a lack of distinct anatomical landmarks in the posterior thalamus that can be identified using T1-weighted (T1w) and T2-weighted (T2w) imaging sequences that causes segmentation algorithms to rely heavily on priors (relative to individual-specific anatomical information). Moreover, individual variability in the location and size of nuclei (Andrews et al., 1997; Garcia-Gomar et al., 2019; Giraldo-Chica & Schneider, 2018; Kiwitz et al., 2022; Rademacher et al., 2002) is exacerbated by both the difficulty of precise spatial normalization to standardized spaces, and the resulting imprecision in the coregistration between anatomical structural images and echo planar images (EPIs) measuring hemodynamic BOLD signal.
One broadly successful approach to localizing brain regions with specific neural functions that are otherwise difficult to localize based on anatomical scans alone is to develop and utilize a task that specifically elicits the function subserved by the targeted neural region, commonly known as a functional localizer task (Berman et al., 2010; Dodell-Feder et al., 2011; Fedorenko et al., 2010; Jacoby et al., 2016; Jiang et al., 2013; Kastner et al., 2004; Saxe et al., 2006). Consequently, BOLD fMRI sensory thalamic localizer (TL) tasks have been developed to measure human sensory thalamic responses in vivo, enabling group-wide and/or single-participant localization of geniculi experimentally (W. Chen et al., 1998; Denison et al., 2014; Hess et al., 2009; Jiang et al., 2013; Kastner et al., 2004; Schneider & Kastner, 2009; Sitek et al., 2019).
In this work, we developed and implemented an auditory and visual sensory TL fMRI task, using high-resolution multiband fMRI (Moeller et al., 2010), with the aim of producing functionally-defined ROIs (fROIs) optimal for extracting information from other BOLD images acquired from the same participant. To improve functional specificity, this task uses clustered-sparse temporal acquisition (Schmidt et al., 2008; Yang et al., 2000; Zaehle et al., 2007), through which volumes are collected only immediately after the presentation of task stimuli, thereby reducing the confounding effect of concurrent scanner noise, which is critical when studying auditory processing with fMRI (Bandettini et al., 1998; Schmidt et al., 2008). The task developed here builds on prior sensory TL tasks by (1) combining both visual and auditory stimulation in a single task to simultaneously identify fROIs for MGN and LGN; (2) leveraging multiband acceleration to improve spatial resolution and increase the number of data points (volumes) acquired with each acquisition cluster, which, in turn, facilitates (3) using a shorter (~15 minute) total task length while still obtaining high-quality fROIs; and (4) is developed and tested at 3 Tesla (as opposed to higher fields) in order to maximize availability to investigators.
We report the utility of this sensory TL task in producing participant-level MGN and LGN fROIs, validate their specificity for auditory and visual processing using resting-state (RS) fMRI scans acquired from the same participants, and demonstrate the performance of these fROIs relative to widely used atlas and segmentation-based techniques. Code for the TL fMRI task employed here, as well as participant-level analysis code for generating MGN and LGN fROIs, are also made publicly available for use by the wider research community.
2 Materials and Methods
2.1 Overview
This is a two-site study of healthy participants who completed task procedures at either the New York State Psychiatric Institute (NYSPI) or Stony Brook University (SBU). All research procedures at NYSPI were approved by the New York State Psychiatric Institute Institutional Review Board; all research procedures at SBU were approved by the Stony Brook University Institutional Review Board. Participants were recruited with advertisements, and all individuals provided written consent prior to their participation.
2.2 General inclusion and exclusion criteria
Data at each site were collected from healthy adult participants, ages 18–55, over the course of other studies, and thus inclusion and exclusion criteria varied slightly between sites. Full inclusion and exclusion criteria for each site are described in the Inclusion and Exclusion Criteria section of Supplementary Materials and Methods. All participants were free of any major neurological disorders, psychiatric disorders, current substance use disorders, and hearing impairment. Participants at SBU were additionally tested for normal hearing thresholds and speech recognition for monosyllabic words. Psychiatric diagnoses were determined at NYSPI according to the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision (DSM-IV-TR; American Psychiatric Association, 2000), and at SBU according to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5; American Psychiatric Association, 2013).
2.3 Demographic and clinical assessments
Handedness was assessed using the Edinburgh Handedness Inventory (EHI; Oldfield, 1971), and socioeconomic status using the Hollingshead Four Factor Index of Socioeconomic Status (Hollingshead, 1975). Family history of mental disorders was assessed using the Family History Screen (FHS; Milne et al., 2009). Clinical measures were assessed at NYSPI using the Structured Clinical Interview for the DSM-IV Axis I Disorders (SCID-I; First & Gibbon, 2004), and at SBU using the Structured Clinical Interview for DSM-5, Research Version (SCID-5-RV; First et al., 2015), and at both sites using the Positive and Negative Syndrome Scale (PANSS; Kay et al., 1987). At SBU, hearing thresholds were evaluated either using a GSI Automated Method for Testing Auditory Sensitivity Pro (Grason-Stadler, Eden Prairie, Minnesota) automated audiometry system, or by a licensed audiologist or otolaryngologist, and speech recognition for monosyllabic words using the Northwestern University Auditory Test No. 6 (Carhart & Tillman, 1966).
2.4 fMRI acquisition
Prior to magnetic resonance (MR) scanning procedures, participants completed an MRI clearance form to assess for metallic implants and past experiences with metal, and were administered a urine toxicology test, as well as a urine pregnancy screen for biologically female participants. MR scanning was carried out using a 3 Tesla General Electric (GE; Boston, MA) MR750 with a NOVA 32-channel head coil at NYSPI, and a 3 Tesla Siemens (Munich, Germany) MAGNETOM Prisma at the SBU Social, Cognitive, and Affective Neuroscience (SCAN) Center with a Siemens 64-channel head-and-neck coil (Keil et al., 2013). All T2* BOLD EPI data were acquired in alternating, opposing phase-encode (PE) directions, that is, two anterior-posterior (AP) PE runs and two posterior-anterior (PA) PE runs.
The following data were acquired during each session at SBU: (1) high-resolution (0.80 mm isotropic voxels) anatomical T1w and T2w scans; (2) four T2*-weighted BOLD EPI RS runs; (3) four T2*-weighted BOLD EPI TL runs; (4) two spin echo field maps, one AP and one PA; and (5) one B0 field map. BOLD fMRI data were acquired at NYSPI with a TR of 850 ms and field of view of 192 mm, while BOLD data acquired at SBU were acquired with a TR of 800 ms and field of view of 204 mm. BOLD data were acquired from both sites with 2 mm isotropic voxels, a flip angle of 60 degrees, echo time of 25 ms, and multiband factor of 6. The duration of each RS run was 7 minutes 34 seconds at NYSPI and 7 minutes 38 seconds at SBU. At both sites (NYSPI and SBU), each thalamic localizer run was 3 minutes 46 seconds in duration. Participants were asked to complete four runs each of the TL fMRI task and RS fMRI, with a minimum of two runs of each required for study completion and inclusion in the final analyzed dataset.
2.5 Sensory thalamic localizer fMRI task procedures
The goal of the sensory thalamic localizer task is to identify regions of the thalamus specifically responsive to auditory and visual stimuli in each participant, as has been done in prior work (Jiang et al., 2013; Kastner et al., 2004). This task uses clustered-sparse temporal acquisition (Schmidt et al., 2008; Yang et al., 2000; Zaehle et al., 2007), in which fMRI volumes are not acquired while task stimuli are presented. Instead, fMRI is acquired immediately after the presentation of stimuli, in order to minimize the effect of scanner noise from BOLD data acquired; otherwise, the sound produced by the MR scanner while volumes are being acquired could confound detected responses to stimuli (Schmidt et al., 2008). In this task, 3 volumes of BOLD fMRI are acquired at the end of each trial. Acquisition clusters were timed such that (1) the hemodynamic response to scanner noise produced by the prior acquisition cluster should have returned to baseline levels, and (2) the hemodynamic response to stimuli presented between two clusters is at or near its peak when the acquisition cluster is acquired.
A schematic diagram of the sensory TL fMRI task is shown in Figure 1. In each run (Fig. 1A), participants are presented with 16 trials of either visual or auditory stimulation in a pseudorandom order. Each trial consists first of a “gap period” of silence, 9 seconds of stimulation (sufficient to reach the peak of the hemodynamic response in sensory cortex; Inan et al., 2004; Jiang et al., 2013), a second “gap period” of silence equal in duration to the first, and then the clustered acquisition of 3 volumes of BOLD data (TR × 3 in duration). The duration of gap periods is calibrated such that the total trial time is 12 seconds in both NYSPI and SBU data, where the acquisition clusters last 3 TRs. In NYSPI data, this gap period is 225 ms (acquisition cluster duration of 2.55 seconds); in SBU data, this gap period is 300 ms (acquisition cluster duration of 2.4 seconds).
Auditory stimulation was presented as nine sets of 900 ms segments of music, followed by 100 ms of silence, in pseudorandom order (Fig. 1B), similar to prior work (Jiang et al., 2013), in order to reduce neural responses specific to the experience of listening to music, delivered binaurally using a SereneSound headset and transducer (Resonance Technology, Inc., Northridge, CA). Musical segments were obtained from the instrumental song, Transmission94 (Parts 1 & 2) by Bonobo. This song was chosen to meet the following criteria: (1) instrumental music, to avoid evoking responses specific to speech processing in segments containing perceptible speech fragments; (2) no breaks or pauses in the music; and (3) a long enough song to produce at least 7.5 minutes of audio segments after discarding the first and last 4.8 seconds of the song (to avoid fade in/fade out effects or segments in which the music begins late or ends early; also note that less than 7.5 minutes was ultimately used in the task version reported here). Musical segments were normalized to have the same mean amplitudes in order to avoid sudden changes in amplitude when the presentation switches from one 900 ms segment to another (which may have been selected from disparate portions of the song).
Visual stimulation was presented via a projector with a 60 Hz refresh rate as a circular black-and-white checkerboard with a central fixation cross, alternating between light and dark with a reversal frequency of 7.5 Hz (Fig. 1C), which was found in prior work to reliably activate both LGN and VC (Kastner et al., 2004, 2006; Rosa et al., 2010). The presentation of task stimuli was controlled using Presentation software (Version 22.1, Build 10.23.20, Neurobehavioral Systems, Inc., Berkeley, CA, http://www.neurobs.com) installed on a dedicated laptop. The sensory TL task Presentation code (Williams et al., 2024) is freely available from the Neurobehavioral Systems Archives of Neurobehavioral Experiments and Stimuli at http://www.neurobs.com/ex_files/expt_view?id=302.
2.6 Resting-state fMRI procedures
During RS scans, participants were instructed to look ahead at a fixation cross, remaining still and awake with their eyes open. One of the participants’ eyes was monitored by a research coordinator via an eye tracking camera (although eye tracking was not collected), and eye closure times were recorded to exclude time points from analysis during which participants may have been sleeping, as participants entering into and out of sleep states have been shown to measurably impact measured RSFC (Tagliazucchi & Laufs, 2014).
2.7 fMRI pre-processing and post-processing
All fMRI data were preprocessed through the following steps of the Human Connectome Project Minimal Preprocessing Pipelines (HCP MPP; Glasser et al., 2013), version 4.2.0 (https://github.com/Washington-University/HCPpipelines): (1) PreFreeSurfer, (2) FreeSurfer, (3) PostFreeSurfer, and (4) fMRIVolume. Surface registration was completed by cortical surface matching (MSMsulc; Robinson et al., 2014, 2018). Preprocessed structural (T1w) and functional (BOLD) images were obtained from fMRIVolume normalized to Montreal Neurological Institute (MNI) 152 non-linear 6th-generation space (MNI152NLin6; Grabner et al., 2006).
Nuisance signals for white matter (WM) and cerebro-spinal fluid (CSF) were calculated in unsmoothed data as the average signal in all compartment voxels that remain after an iterative erosion procedure (Power et al., 2014). In this iterative erosion procedure, masks are eroded up to three times, as long as one additional erosion would result in at least two remaining voxels. Six vectors of motion parameters (MPs), the estimated framewise motion in each of the three directions of translation and three directions of rotation, were also obtained from the HCP MPP. TL and RS BOLD fMRI data were smoothed using a 4 mm full-width-half-maximum Gaussian filter (note that unsmoothed data were retained and used in analyses where noted below).
2.7.1 Resting-state fMRI data post-processing
RS images (both smoothed and unsmoothed) were mode 1000 normalized (divided by the modal value of all in-brain voxels and multiplied by 1000; Power et al., 2012), linearly detrended, and mean centred. Volumes acquired during periods of excess participant motion were removed using volume censoring. Volume censoring (Power et al., 2012, 2014, 2015; Smyser et al., 2011; Williams et al., 2022) was performed using study-wide thresholds for participant motion, measured by low-pass-filtered framewise displacement (LPF-FD), and run-wise thresholds for whole-brain signal fluctuation, using generalized extreme value low-pass filtered temporal derivative root-mean-squared over voxels (GEV-DV) thresholding, as has been described in detail elsewhere (Williams et al., 2022). Censoring thresholds were determined using resting-state data available for participants with usable TL MGN and LGN fROIs using the Multiband Censoring Optimization Tool (Williams et al., 2022) (https://github.com/CNaP-Lab/MCOT/), resulting in an LPF-FD threshold (ΦF) of 0.07587 mm and GEV-DV parameter (dG) of 3.105 (arbitrary units).
Volumes acquired at times during which a participant eye closure lasted longer than 3 seconds in duration were additionally removed through volume censoring, as were volumes acquired between eye closure periods that occurred less than 30 seconds apart. Contiguous clusters of data that were less than 8 seconds in duration after censoring were discarded, as were resting-state runs with fewer than 1.5 minutes of remaining data. Participants with fewer than either two runs of data remaining, or fewer than 5 minutes of remaining data in total, were excluded from resting-state analyses.
RS runs were filtered using a 0.009–0.08 Hz band-pass second-order zero-phase Butterworth filter, with censored time points replaced by linear interpolation prior to band-pass filtering before being discarded from analysis. The first and last 22 seconds of each time series were discarded to remove discontinuity (edge) artifacts present after band-pass filtering; approximately 22 seconds of data were removed from the start and end of each run (slightly different between sites due to the different TRs: 22.1 seconds / 26 volumes at NYSPI, 22.4 seconds / 28 volumes at SBU).
2.8 Sensory thalamic localizer fMRI task analysis
2.8.1 Overview
The objective in analyzing the TL task is to identify voxels within defined regions of posterior thalamus that show coactivation with either auditory cortex or visual cortex, but not both, over the course of alternating auditory and visual stimulation. To achieve this, TL task data were analyzed to first identify a cluster within each hemisphere’s auditory cortex (AC) and visual cortex (VC) that was maximally responsive to either auditory or visual stimulation, respectively (section 2.8.3). Average AC and VC time series were extracted from these clusters (section 2.8.4).
Participant-specific hemispheric thalamic search region (TSR) masks for MGN and LGN (MGN-TSR and LGN-TSR, respectively) were generated (section 2.8.2). Voxels within the MGN-TSR and LGN-TSR were assessed for coactivation with the AC and VC time series during the TL task (using Pearson’s partial correlations; see section 2.8.5) and thresholded based on this coactivation. Voxels showing coactivation with both AC and VC time series were removed to improve specificity (section 2.8.5). The largest remaining contiguous cluster within each hemisphere’s MGN-TSR and LGN-TSR were extracted as each participant’s MGN and LGN, respectively.
2.8.2 A priori regions of interest
Auditory cortex (AC) and visual cortex (VC) search region masks were generated from Brodmann areas (BAs) using the Wake Forest University PickAtlas Toolbox (WFU; Maldjian et al., 2003, 2004). A temporal lobe AC mask was produced from BAs 41 and 42, and an occipital lobe VC mask from BAs 17 and 18, and each was finally dilated in 3D by one voxel.
Hemispheric search region masks for auditory thalamus (MGN-TSR) and visual thalamus (LGN-TSR) were generated for each participant by first applying the FreeSurfer thalamic nuclei segmentation (Iglesias et al., 2018) to each participant’s anterior commissure-posterior commissure (AC-PC) aligned, readout distortion and bias field corrected T1w image. The resulting segmentation output image was warped to MNI152NLin6 space using the warps calculated by the HCP MPP (acpc_dc2standard.nii.gz) and resampling to BOLD space, using Connectome Workbench 1.5.0 (Marcus et al., 2011) (wb_command -volume-resample). To create an appropriately large search space in posterior thalamus, FreeSurfer estimates of MGN were 3D dilated three voxels for the MGN-TSR, and FreeSurfer estimates of LGN were 3D dilated by one voxel for the LGN-TSR (as the FreeSurfer LGN ROIs were slightly larger).
To help ensure that fROIs were distinct from surrounding structures, TSR voxels estimated to comprise part of surrounding structures were removed to improve specificity and robustness across participants as follows. First, masks of pulvinar and mediodorsal nuclei were obtained from the FreeSurfer thalamic segmentation. Next, the following masks were obtained from the FreeSurfer segmentation atlas (Desikan et al., 2006; Fischl, 2012; Fischl et al., 2004) output by the HCP MPP (Atlas_wmparc.2.nii.gz): cortical gray matter, parahippocampal gyrus WM, insular cortex WM, choroid plexus, pulvinar, mediodorsal nucleus, putamen, pallidum, and hippocampus. The masks of mediodorsal nucleus, insular cortex WM, choroid plexus, pulvinar, mediodorsal nucleus, putamen, and pallidum were 3D dilated by one voxel. All voxels overlapping with masks of surrounding structures were then removed from MGN-TSR and LGN-TSR masks. Finally, the posterior-most two slices of each TSR mask were then removed to remove voxels in close proximity to midbrain, and any MGN-TSR voxels superior to the inferior-most slice of the pulvinar in either hemisphere were removed to constrain the search region to the inferior aspect of posterior thalamus and remove remaining voxels posterior to the pulvinar FreeSurfer ROI. Probability maps of final MGN-TSR and LGN-TSR masks are shown in Figure S1.
2.8.3 Generalized linear modeling and generation of participant-level contrasts
Smoothed TL task fMRI data were analyzed using a generalized linear modeling framework in SPM12 (Friston et al., 2007; Friston, Holmes, et al., 1994; Friston, Jezzard, et al., 1994), version 7771, in MATLAB R2018a (The MathWorks, Inc., Natick, MA), with betas estimated from a design matrix that contained, for each run, a column each for intercept, auditory stimulation, visual stimulation, 6 MPs and their squares, WM signal, and CSF signal, as well as two additional regressors to model the second and third volumes of each cluster. These volume-specific regressors were employed to regress out the effect of T1-relaxation on BOLD signal, as the clustered-sparse temporal acquisition scheme meant that BOLD data were not at steady-state during acquisition (Horga et al., 2014; Schmidt et al., 2008; Zaehle et al., 2007). Task regressors indicating trial type (auditory or visual) were not convolved with a hemodynamic response function (HRF), because the sparse temporal sampling precludes a true timeseries HRF model, as acquisition of volumes is not continuous.
Contrast images of activation from auditory stimulation versus visual stimulation (Auditory − Visual) were then generated for each participant. Each hemisphere’s AC search region masks were thresholded to include only voxels with the top 10% of [Auditory − Visual] contrast values, and VC search region masks were similarly thresholded to include only the bottom 10% of [Auditory − Visual] contrast values. Clusters smaller than 10 contiguous voxels in size were then removed. The single contiguous cluster within each AC mask and VC mask with the greatest magnitude peak contrast value was used for extraction of AC and VC task time series.
2.8.4 Extraction of time series
To obtain AC and VC time series for each participant, TL task BOLD activation was averaged within each hemisphere’s AC and VC clusters, as identified in section 2.8.3. These hemispheric AC and VC time series were then averaged across hemispheres, resulting in single AC and VC time series for each participant.
Local WM signal was additionally obtained for each TSR for use as nuisance regressors by first identifying WM in the FreeSurfer segmentation atlas located between one and five voxels from the TSR in any direction, and then extracting average time series from these voxels from unsmoothed BOLD data. Brain-wide gray matter signal was extracted by eroding the FreeSurfer segmentation atlas by one voxel and spatially averaging signal from unsmoothed BOLD data within the resultant mask.
2.8.5 Task coactivation analysis and identification of MGN and LGN regions of interest
We aimed to identify thalamic voxels in each TSR that coactivate, over the course of auditory and visual stimulation, with AC and VC. Note that this is distinct from task connectivity analysis methods such as general functional connectivity (Elliott et al., 2019) in that, rather than exploring connectivity after removing task effects, we aimed instead to identify shared patterns of task responsiveness (coactivation). This was done to capitalize on shared variance (connectivity) between sensory cortex and thalamic nuclei across volumes within a task condition, rather than merely identifying thalamic voxels that activate to task. That is, using only a task contrast [Auditory – Visual] would only pick up on differences driven by the task condition, while the coactivation method employed can be driven by these differences in addition to functional connectivity to sensory cortex that is detectable within each fMRI volume acquired within each condition.
Within each TSR (left and right MGN-TSR and LGN-TSR), coactivation between each voxel’s time series and activation in AC and VC (see section 2.8.4) was estimated using partial correlations, resulting in AC and VC coactivation maps. Note that this produces separate AC and VC coactivation maps for each TSR-MGN and each TSR-LGN search region. Partial correlations included nuisance regressors for average WM, CSF, gray matter signal, local WM (see section 2.8.4), and a regressor for the second and third image of each cluster.
AC and VC coactivation maps were then thresholded within each TSR to select the highest-ranking voxels (i.e., voxels with the greatest Pearson correlations). TSR-MGN and TSR-LGN upper percentile thresholds were determined separately for each participant, based on the average number of voxels comprising each TSR-MGN and TSR-LGN across hemispheres. The upper percentile threshold for TSR-MGN coactivation maps was 32 divided by the average number of voxels, and the upper percentile threshold for TSR-LGN coactivation maps was 20 divided by the average number of TSR-LGN voxels. These parameter values are arbitrary, and were selected in order to produce reasonably sized and shaped fROIs, with a stricter threshold in the larger TSR-LGN to accommodate overall greater LGN-VC observed coactivation values. Any voxels remaining in both AC and VC coactivation maps of any TSR after thresholding were then removed from both, in order to improve the specificity of resulting fROIs for sensory modality. The largest contiguous cluster remaining within each hemi-thalamus’s TSR-MGN-AC coactivation map, and TSR-LGN-VC coactivation map, were then identified as MGN and LGN, respectively.
Each MGN and LGN fROI was evaluated by multiple members of the study team (J.C.W., P.N.T., and Z.J.Z., and J.X.V.S.) by overlaying them over each participant’s high-resolution T1w anatomical images and temporally averaged BOLD TL task EPIs. If issues were noted by any reviewer, they were examined by two members of the study team (J.C.W. and J.X.V.S.) and a consensus was reached. We excluded fROIs that were considered to be anatomically implausible (e.g., an LGN that is either inferior or medial to MGN, or excessive asymmetry between left and right hemispheric MGN or LGN fROIs); data collected from participants with any fROI that failed this quality check (QC) procedure were excluded from further analyses.
2.9 Resting-state data analysis
2.9.1 Overview
After localization using the sensory TL task, functional connectivity from MGN and LGN fROIs was evaluated in resting-state data obtained from the same participants. Specificity for connectivity with auditory and visual regions was determined by comparing connectivity between primary AC and VC ROIs (obtained independently of the TL task), and compared with connectivity from alternative MGN and LGN ROIs derived from a standardized atlas, the WFU PickAtlas Toolbox (Maldjian et al., 2003, 2004), and two thalamic segmentations using T1w images, the FreeSurfer thalamic nuclei segmentation (Iglesias et al., 2018; Su et al., 2019) and the Thalamus Optimized Multi Atlas Segmentation (THOMAS) segmentation (Iglesias et al., 2018; Su et al., 2019) (see section 2.9.2. below). We then performed whole-brain seed connectivity analyses to determine functional relationships between signals within TL MGN and LGN fROIs, as well as the differences between functional connectivity maps derived from TL ROIs and those alternative ROIs obtained from the aforementioned atlas and T1w segmentations (Iglesias et al., 2018; Maldjian et al., 2003, 2004; Su et al., 2019).
2.9.2 A priori regions of interest
ROIs for primary AC and primary VC were obtained from the FreeSurfer Desikan-Killiany segmentation atlas (Desikan et al., 2006; Fischl, 2012; Fischl et al., 2004) output by the HCP MPP (Atlas_wmparc.2.nii.gz). AC was created from the transverse temporal gyrus parcel (#1034 and #2034), dilated by one voxel in 3D. The VC mask was created from the pericalcarine gray matter parcels (#1021 and #2021), 3D dilating once, removing any voxels not considered to be in gray matter, and then 3D dilating two more times. These ROIs are shown in Figure S2. As a benchmark for the potential for bias in RSFC analysis that would be produced by excessive overlap between primary AC and primary VC ROIs described here and the fROIs derived from the TL task, we additionally calculated the Dice-Sørensen similarity coefficient (Dice, 1945; Serrano-Sosa et al., 2021) for each participant’s TL fROI and a priori anatomical ROI, bilaterally, in AC and VC separately.
TL MGN and LGN fROIs were obtained from the thalamic localizer task as described above (see section 2.8.5). An alternative atlas-based set of left and right MGN and LGN ROIs were obtained from the WFU PickAtlas Toolbox (Maldjian et al., 2003, 2004); alternative T1w-segmentation based MGN and LGN ROIs for comparison to the thalamic localizer ROIs developed here were obtained from the FreeSurfer thalamic nuclei segmentation (Iglesias et al., 2018) and the Thalamus Optimized Multi Atlas Segmentation (THOMAS) segmentation (Su et al., 2019), shown in Figure S3.
2.9.3 Resting-state functional connectivity analysis
For each run, time series were extracted for each ROI (including fROIs) by spatially averaging unsmoothed volumetric resting-state data within ROI voxels. MGN-AC connectivity was estimated by calculating pairwise Pearson’s partial correlations between left MGN and left AC, left MGN and right AC, right MGN and left AC, and right MGN and right AC; these four values were then averaged within each run to create a single run-level MGN-AC connectivity estimate. Both ipsilateral and contralateral connectivity was evaluated in order to account for bilateral processing of sensory stimuli and commissural interhemispheric communication between sensory cortices (Bamiou et al., 2007; Z. Chen et al., 2018; Engel et al., 1991; Ito et al., 2016; Jones, 2007; Lee, 2013; Lee & Winer, 2008; Ramachandra et al., 2020; van der Knaap & van der Ham, 2011), as well as to maximize power. This process was then repeated to additionally calculate MGN-VC, LGN-AC, and LGN-VC connectivity for each run. This procedure was performed for MGN and LGN fROIs obtained from the sensory TL task, as well as the three alternative ROIs (WFU PickAtlas, FreeSurfer, and THOMAS). Differences between MGN-AC and LGN-VC connectivity estimates produced using TL fROIs and those produced using these alternative ROIs were then estimated for each participant.
Volumetric whole-brain seed connectivity images for each hemisphere’s MGN and LGN ROIs were calculated using Pearson’s partial correlations between each seed time series and each voxel in 4 mm full-width-half-maximum smoothed resting-state images. Connectivity for left and right MGN and LGN TL fROIs was estimated separately, and the two connectivity maps for each geniculi pair were averaged together to produce a single image of bilateral MGN connectivity and a single image of bilateral LGN connectivity. This was repeated for alternative MGN and LGN ROIs generated from WFU PickAtlas, FreeSurfer thalamic segmentation, and THOMAS segmentation. To compare seed connectivity maps derived from TL MGN and LGN fROIs with those derived from each of their three alternatives, six seed connectivity difference images were then calculated for each participant: (1) TL MGN minus WFU MGN, (2) TL MGN minus FreeSurfer MGN, (3) TL MGN minus THOMAS MGN, (4) TL LGN minus WFU LGN, (5) TL LGN minus FreeSurfer LGN, and (6) TL LGN minus THOMAS LGN.
All Pearson’s partial correlations were calculated using the following nuisance parameters: band-pass filtered MPs (using the same 0.009–0.08 Hz band-pass filter that was applied to the RS time series), the squares of the band-pass filtered MPs, the derivatives of the band-pass filtered MPs, the squares of the derivatives of the band-pass filtered MPs, the white matter signal and its derivative, and the CSF signal and its derivative.
2.9.4 Multi-site harmonization (ComBat)
ROI pair correlations and seed connectivity images were harmonized across SBU and NYSPI for site and scanner effects using ComBat (Fortin et al., 2017, 2018; Johnson et al., 2007; Yu et al., 2018) (https://github.com/Jfortin1/ComBatHarmonization/) in MATLAB, using non-parametric adjustments and including demographic covariates. Demographic covariates were included in order to preserve biological variability in the sample while removing the variability associated with sites/scanners. Included covariates included age, gender, handedness, race, ethnicity, and parental socioeconomic status. PANSS Positive Scale, Negative Scale, and General Scale assessment scores were also included, to retain potential variability in auditory and visual perception, as well as other aspects of cognition and affect that might be associated with these measures.
Three participants were missing data for parental socioeconomic status, and four participants were missing data for Positive and Negative Syndrome Scale assessments. Missing data were estimated in the full sample of participants who completed the study using SPSS Statistics (Version 29.0.0.0, Build 241, International Business Machines Corporation, Armonk, NY) as the aggregate median of 1000 imputations, each with a maximum of 100 Markov chain Monte Carlo iterations, using linear regression models for scale variables and two-way interactions among categorical predictors. Singularity tolerance was set to 10-12; maximum case draws and maximum parameter draws were both set to 1000. Missing data estimates determined via multiple imputation were only used as covariates in the ComBat harmonization procedure, and not in any other analyses.
2.10 Statistical analysis
2.10.1 Thalamic localizer fROIs
As an exploratory analysis, significance was assessed for group-wide interhemispheric differences in TL-derived LGN and MGN fROI sizes using two-tailed Mann–Whitney–Wilcoxon rank-sum tests (Gibbons & Chakraborti, 2021; Hollander et al., 2014). Relationships between fROI size and connectivity with sensory cortex were then assessed, in order to determine whether differences in intrinsic functional connectivity that are observable in task coactivation maps could be systematically driving differences in the sizes of obtained LGN and MGN fROIs, or, conversely, whether differences in fROI sizes could be significantly driving estimates of measured connectivity with sensory cortex. Relationships between fROI sizes and MGN-AC connectivity were assessed using Kendall rank correlation coefficients (tau-b, two-tailed; Kendall, 1990; van Doorn et al., 2021).
2.10.2 ROI pair correlation RSFC analyses
The selectivity of time series extracted from TL MGN and LGN fROIs was evaluated using MGN-AC, MGN-VC, LGN-AC, and LGN-VC connectivity estimates. Using one-tailed paired t-tests, the following relationships were assessed, corrected for false discovery rate (Benjamini & Hochberg, 1995): (1) MGN selectivity for AC over VC (MGN-AC > MGN-VC), (2) AC selectivity for MGN over LGN (MGN-AC > LGN-AC), (3) LGN selectivity for VC over AC (LGN-VC > LGN-AC), and VC selectivity for LGN over MGN (LGN-VC > MGN-VC). One-tailed tests were used because the conclusion drawn from a non-significant result would not differ materially from the conclusion drawn from a significant two-tailed result in the unexpected direction; in both cases, we would conclude that the TL task has failed to properly localize the appropriate thalamic nucleus.
Connectivity between TL fROIs and their appropriate sensory cortex (i.e., MGN-AC connectivity and LGN-VC connectivity) was then benchmarked against connectivity estimates calculated using alternative ROIs from WFU PickAtlas, the FreeSurfer thalamic nuclei segmentation, and the THOMAS segmentation, with the significance of differences assessed using two-tailed, paired t-tests.
2.10.3 Seed connectivity RSFC analyses
Whole-brain TL MGN and LGN seed connectivity was assessed for significance using Permutation Analysis of Linear Models (PALM; Alberton et al., 2020; Winkler et al., 2014, 2015; Winkler, Ridgway, et al., 2016; Winkler, Webster, et al., 2016), with a design matrix consisting of a single column of all 1s, using 10,000 sign-flips, built-in family-wise error rate correction (Holmes et al., 1996), and threshold-free cluster enhancement (TFCE; Smith & Nichols, 2009). All hypothesis testing was two-tailed (bi-sided), performed using two one-directional contrasts followed by multiple testing correction over contrasts (Alberton et al., 2020). This was repeated without TFCE as well, in order to improve the specificity of seed correlation results (at the expense of power). Exploratory analyses in PALM were then conducted identically for difference images comparing seed connectivity between TL MGN and LGN fROIs, and those MGN and LGN ROIs obtained from WFU PickAtlas, FreeSurfer, and THOMAS. T-statistic maps obtained from PALM were thresholded at α = 0.05 after correction for family-wise error rate and over contrasts.
3 Results
3.1 Sensory thalamic localizer auditory and visual thalamic fROIs (MGN and LGN)
A total of 52 participants (NYSPI n = 30; SBU n = 22) completed at least two runs each of the sensory TL and RS data. After evaluation of TL MGN and LGN fROIs, data from 47 participants (26 NYSPI; 21 SBU) were retained for subsequent analysis; fROIs that failed the QC procedure are shown in Figure S4. Sample demographics and clinical assessment data before and after exclusions for TL fROI failures are shown in Table 1.
All study completers . | |||
---|---|---|---|
Participant characteristic . | Full sample (NYSPI+SBU, N = 52) . | NYSPI (n = 30) . | SBU (n = 22) . |
Age, years (SD) | 30.5 (10.0) | 32.2 (9.5) | 28.2 (10.5) |
Gender, female (%) | 17 (32.7%) | 9 (30.0%) | 8 (36.4%) |
Ethnicity, Hispanic (%) | 14 (26.9%) | 8 (26.7%) | 6 (27.3%) |
Race | |||
Caucasian (%) | 17 (32.7%) | 9 (30.0%) | 8 (36.4%) |
African American (%) | 21 (40.4%) | 13 (43.3%) | 8 (36.4%) |
Asian (%) | 5 (9.6%) | 3 (10.0%) | 2 (9.1%) |
Other (%) | 7 (13.5%) | 5 (16.7%) | 2 (9.1%) |
More than one race (%) | 2 (3.8%) | 0 (0.0%) | 2 (9.1%) |
Handednessa | |||
Left (%) | 2 (3.8%) | 1 (3.3%) | 1 (4.5%) |
Right (%) | 48 (92.3%) | 29 (96.7%) | 19 (86.4%) |
Ambidextrous (%) | 2 (3.8%) | 0 (0.0%) | 2 (9.1%) |
Smokerb,c (%) | 3 (5.8%) | 3 (10.0%) | 0 (0.0%) |
Parental SESd,e (SD) | 44.2 (11.4) | 44.4 (11.2) | 44.0 (11.8) |
PANSSf total (SD) | 35.1 (6.9) | 32.8 (4.6) | 37.7 (8.2) |
PANSSf general (SD) | 18.4 (3.8) | 16.2 (3.9) | 20.1 (4.5) |
PANSSf positive (SD) | 7.4 (0.9) | 6.9 (1.5) | 7.7 (1.2) |
PANSSf negative (SD) | 9.3 (3.3) | 8.8 (2.7) | 9.9 (3.8) |
All study completers . | |||
---|---|---|---|
Participant characteristic . | Full sample (NYSPI+SBU, N = 52) . | NYSPI (n = 30) . | SBU (n = 22) . |
Age, years (SD) | 30.5 (10.0) | 32.2 (9.5) | 28.2 (10.5) |
Gender, female (%) | 17 (32.7%) | 9 (30.0%) | 8 (36.4%) |
Ethnicity, Hispanic (%) | 14 (26.9%) | 8 (26.7%) | 6 (27.3%) |
Race | |||
Caucasian (%) | 17 (32.7%) | 9 (30.0%) | 8 (36.4%) |
African American (%) | 21 (40.4%) | 13 (43.3%) | 8 (36.4%) |
Asian (%) | 5 (9.6%) | 3 (10.0%) | 2 (9.1%) |
Other (%) | 7 (13.5%) | 5 (16.7%) | 2 (9.1%) |
More than one race (%) | 2 (3.8%) | 0 (0.0%) | 2 (9.1%) |
Handednessa | |||
Left (%) | 2 (3.8%) | 1 (3.3%) | 1 (4.5%) |
Right (%) | 48 (92.3%) | 29 (96.7%) | 19 (86.4%) |
Ambidextrous (%) | 2 (3.8%) | 0 (0.0%) | 2 (9.1%) |
Smokerb,c (%) | 3 (5.8%) | 3 (10.0%) | 0 (0.0%) |
Parental SESd,e (SD) | 44.2 (11.4) | 44.4 (11.2) | 44.0 (11.8) |
PANSSf total (SD) | 35.1 (6.9) | 32.8 (4.6) | 37.7 (8.2) |
PANSSf general (SD) | 18.4 (3.8) | 16.2 (3.9) | 20.1 (4.5) |
PANSSf positive (SD) | 7.4 (0.9) | 6.9 (1.5) | 7.7 (1.2) |
PANSSf negative (SD) | 9.3 (3.3) | 8.8 (2.7) | 9.9 (3.8) |
Passed thalamic localizer fROI QC . | |||
---|---|---|---|
Participant characteristic . | Full sample (NYSPI+SBU, N = 47) . | NYSPI (n = 26) . | SBU (n = 21) . |
Age, years (SD) | 39.9 (9.9) | 31.5 (9.1) | 28.1 (10.6) |
Gender, female (%) | 15 (31.9%) | 7 (26.9%) | 8 (38.1%) |
Ethnicity, Hispanic (%) | 14 (29.8%) | 8 (30.8%) | 6 (28.6%) |
Race | |||
Caucasian (%) | 15 (31.9%) | 7 (26.9%) | 8 (38.1%) |
African American (%) | 18 (38.3%) | 11 (42.3%) | 7 (33.3%) |
Asian (%) | 5 (10.6%) | 3 (11.5%) | 2 (9.5%) |
Other (%) | 7 (14.9%) | 5 (19.2%) | 2 (9.5%) |
More than one race (%) | 2 (4.3%) | 0 (0.0%) | 2 (9.5%) |
Handednessa | |||
Left (%) | 2 (4.3%) | 1 (3.8%) | 1 (4.8%) |
Right (%) | 43 (91.5%) | 25 (96.2%) | 18 (85.7%) |
Ambidextrous (%) | 2 (4.3%) | 0 (0.0%) | 2 (9.5%) |
Smokerb (%) | 3 (6.4%) | 3 (11.5%) | 3 (0.0%) |
Parental SESd,e (SD) | 43.5 (11.6) | 43.2 (11.4) | 43.8 (11.8) |
PANSS totalf (SD) | 35.4 (7.1) | 32.9 (4.8) | 38.0 (8.3) |
PANSS generalf (SD) | 18.6 (4.0) | 17.0 (2.5) | 20.2 (4.6) |
PANSS positivef (SD) | 7.5 (0.9) | 7.2 (0.5) | 7.8 (1.2) |
PANSS negativef (SD) | 9.3 (3.3) | 8.7 (2.6) | 10.0 (3.9) |
Passed thalamic localizer fROI QC . | |||
---|---|---|---|
Participant characteristic . | Full sample (NYSPI+SBU, N = 47) . | NYSPI (n = 26) . | SBU (n = 21) . |
Age, years (SD) | 39.9 (9.9) | 31.5 (9.1) | 28.1 (10.6) |
Gender, female (%) | 15 (31.9%) | 7 (26.9%) | 8 (38.1%) |
Ethnicity, Hispanic (%) | 14 (29.8%) | 8 (30.8%) | 6 (28.6%) |
Race | |||
Caucasian (%) | 15 (31.9%) | 7 (26.9%) | 8 (38.1%) |
African American (%) | 18 (38.3%) | 11 (42.3%) | 7 (33.3%) |
Asian (%) | 5 (10.6%) | 3 (11.5%) | 2 (9.5%) |
Other (%) | 7 (14.9%) | 5 (19.2%) | 2 (9.5%) |
More than one race (%) | 2 (4.3%) | 0 (0.0%) | 2 (9.5%) |
Handednessa | |||
Left (%) | 2 (4.3%) | 1 (3.8%) | 1 (4.8%) |
Right (%) | 43 (91.5%) | 25 (96.2%) | 18 (85.7%) |
Ambidextrous (%) | 2 (4.3%) | 0 (0.0%) | 2 (9.5%) |
Smokerb (%) | 3 (6.4%) | 3 (11.5%) | 3 (0.0%) |
Parental SESd,e (SD) | 43.5 (11.6) | 43.2 (11.4) | 43.8 (11.8) |
PANSS totalf (SD) | 35.4 (7.1) | 32.9 (4.8) | 38.0 (8.3) |
PANSS generalf (SD) | 18.6 (4.0) | 17.0 (2.5) | 20.2 (4.6) |
PANSS positivef (SD) | 7.5 (0.9) | 7.2 (0.5) | 7.8 (1.2) |
PANSS negativef (SD) | 9.3 (3.3) | 8.7 (2.6) | 10.0 (3.9) |
RSFC analysis sample (passed TL fROI QC + RS motion denoising) . | |||
---|---|---|---|
Participant characteristic . | Full sample (NYSPI+SBU, N = 42) . | NYSPI (n = 23) . | SBU (n = 19) . |
Age, years (SD) | 29.6 (9.9) | 30.6 (8.8) | 28.4 (11.3) |
Gender, female (%) | 14 (33.3%) | 6 (26.1%) | 8 (42.1%) |
Ethnicity, Hispanic (%) | 12 (28.6%) | 6 (26.1%) | 6 (31.6%) |
Race | |||
Caucasian (%) | 15 (35.7%) | 7 (30.4%) | 8 (42.1%) |
African American (%) | 15 (35.7%) | 10 (43.5%) | 5 (26.3%) |
Asian (%) | 4 (9.5%) | 2 (8.7%) | 2 (10.5%) |
Other (%) | 6 (14.3%) | 4 (17.4%) | 2 (10.5%) |
More than one race (%) | 2 (4.8%) | 0 (0.0%) | 2 (10.5%) |
Handednessa | |||
Left (%) | 2 (4.8%) | 1 (4.3%) | 1 (5.3%) |
Right (%) | 39 (92.3%) | 22 (95.7%) | 17 (89.5%) |
Ambidextrous (%) | 1 (2.4%) | 0 (0.0%) | 1 (5.3%) |
Smokerb (%) | 3 (7.1%) | 3 (13.0%) | 0 (0.0%) |
Parental SESc,g (SD) | 43.7 (11.7) | 43.8 (11.4) | 43.6 (12.4) |
PANSS totalh (SD) | 34.9 (6.7) | 31.9 (2.4) | 37.8 (8.3) |
PANSS generalh (SD) | 18.2 (3.7) | 15.8 (3.7) | 19.9 (4.8) |
PANSS positiveh (SD) | 7.4 (1.0) | 6.7 (1.6) | 7.8 (1.2) |
PANSS negativeh (SD) | 9.2 (3.2) | 8.3 (2.1) | 10.1 (3.9) |
RSFC analysis sample (passed TL fROI QC + RS motion denoising) . | |||
---|---|---|---|
Participant characteristic . | Full sample (NYSPI+SBU, N = 42) . | NYSPI (n = 23) . | SBU (n = 19) . |
Age, years (SD) | 29.6 (9.9) | 30.6 (8.8) | 28.4 (11.3) |
Gender, female (%) | 14 (33.3%) | 6 (26.1%) | 8 (42.1%) |
Ethnicity, Hispanic (%) | 12 (28.6%) | 6 (26.1%) | 6 (31.6%) |
Race | |||
Caucasian (%) | 15 (35.7%) | 7 (30.4%) | 8 (42.1%) |
African American (%) | 15 (35.7%) | 10 (43.5%) | 5 (26.3%) |
Asian (%) | 4 (9.5%) | 2 (8.7%) | 2 (10.5%) |
Other (%) | 6 (14.3%) | 4 (17.4%) | 2 (10.5%) |
More than one race (%) | 2 (4.8%) | 0 (0.0%) | 2 (10.5%) |
Handednessa | |||
Left (%) | 2 (4.8%) | 1 (4.3%) | 1 (5.3%) |
Right (%) | 39 (92.3%) | 22 (95.7%) | 17 (89.5%) |
Ambidextrous (%) | 1 (2.4%) | 0 (0.0%) | 1 (5.3%) |
Smokerb (%) | 3 (7.1%) | 3 (13.0%) | 0 (0.0%) |
Parental SESc,g (SD) | 43.7 (11.7) | 43.8 (11.4) | 43.6 (12.4) |
PANSS totalh (SD) | 34.9 (6.7) | 31.9 (2.4) | 37.8 (8.3) |
PANSS generalh (SD) | 18.2 (3.7) | 15.8 (3.7) | 19.9 (4.8) |
PANSS positiveh (SD) | 7.4 (1.0) | 6.7 (1.6) | 7.8 (1.2) |
PANSS negativeh (SD) | 9.2 (3.2) | 8.3 (2.1) | 10.1 (3.9) |
Handedness was reported using the Edinburgh Handedness Inventory.
Smoking status defined as current daily nicotine use.
Smoking status data were unavailable for one NYSPI participant (data excluded after Thalamic Localizer QC).
Mean Parental SES was reported using the Hollingshead Four Factor Index Scale of Socioeconomic Status scale.
Mean Parental SES data were unavailable for three participants (two NYSPI and one SBU).
PANSS data were unavailable for four NYSPI participants.
Mean Parental SES data were unavailable for one NYSPI participant.
PANSS data were unavailable for four NYSPI participants.
Abbreviations: fROI: functionally-defined region of interest; NYSPI: New York State Psychiatric Institute; PANSS: Positive and Negative Syndrome Scale; QC: quality check; RS: resting-state; RSFC: resting-state functional connectivity; SBU: Stony Brook University; SD: standard deviation; SES: socioeconomic status.
Group-wide probability density maps and corresponding violin plots of TL fROI sizes across participants are shown in Figure 2 for left MGN (Fig. 2A-B), right MGN (Fig. 2C-D), left LGN (Fig. 2E-F), and right LGN (Fig. 2G-H). Median left MGN and right MGN sizes were both 42 voxels, median left LGN size was 46 voxels, and median right LGN size was 50 voxels. LGN size was significantly greater in the right than left hemisphere (p = 5.68 × 10-4), consistent with a recent finding using 7 Tesla structural quantitative MR imaging (Muller-Axt et al., 2021), but the interhemispheric difference in MGN size was not (p = 0.256). For visualization, probability density maps were thresholded (binarized) at 50% of the dataset, displayed together in Figure 3.
3.2 Resting-state functional connectivity validation of thalamic localizer fROIs
Five participants (two SBU, three NYSPI) were excluded from resting-state functional connectivity (RSFC) analyses due to insufficient data remaining after volume censoring, due to excessive participant motion. Descriptive statistics for the final RSFC analysis sample are shown in Table 1. An exploratory analysis assessing the association between the size of each hemisphere’s MGN and LGN size in each participant and connectivity with sensory cortex (AC and VC, respectively) using Kendall’s rank correlation coefficient are shown in Table S1 (all p > 0.05).
Connectivity measured between MGN and LGN fROIs obtained from TL task analysis and AC and VC ROIs obtained from the FreeSurfer atlas (Atlas_wmparc.2.nii.gz) is shown in Figure 4A. Specificity measurements for MGN and LGN fROIs and their significance are shown in Figure 4B (all p < 0.05). Dice-Sørensen similarity coefficients between FreeSurfer-derived primary AC and primary VC ROIs and TL-derived AC and VC fROIs used as connectivity seeds for identifying MGN and LGN fROIs were 0.0533 and 0.115, respectively, indicating minimal bias in these analyses.
Whole-brain seed connectivity T-statistic maps are shown for TL MGN and LGN seed correlations, two-sided, thresholded at α = 0.05, and family-wise error rate corrected in Figure S5. Broadly distributed whole-brain connectivity is apparent in analyses using TFCE (Fig. S5A-B), with increased specificity apparent in voxel-wise (no TFCE) analyses (Fig. S5C-D). In both images, significant connectivity is visible between MGN and AC, and between LGN and VC.
3.3 Resting-state functional connectivity benchmarking of thalamic localizer fROIs
MGN-AC connectivity from MGN fROIs derived from the TL task are shown alongside MGN-AC connectivity from MGN ROIs derived from the WFU atlas (Maldjian et al., 2003, 2004), the FreeSurfer segmentation (Iglesias et al., 2018), and THOMAS segmentation (Su et al., 2019) in Figure 5A, and differences between MGN-AC connectivity estimates from TL fROIs and each of the other ROIs are shown in Figure 5C, along with p-values for these connectivity differences. Similarly, LGN-VC connectivity estimates derived from TL task LGN fROIs are shown in Figure 5B, alongside LGN-VC connectivity estimates derived using WFU, THOMAS, and LGN ROIs. Differences in LGN-VC connectivity observed between TL LGN fROIs and the three alternatives are shown in Figure 5D. MGN-AC connectivity was significantly greater when using the TL task ROIs relative to the three alternatives (vs. WFU, p = 2.84 × 10-9; vs. THOMAS, p = 2.17 × 10-4; vs. FreeSurfer, p = 3.36 × 10-2). TL LGN-VC connectivity was observed to be greater relative to the three alternatives, significantly versus WFU (p = 9.00 × 10-14) and THOMAS (p = 6.10 × 10-8), but not significantly versus FreeSurfer (p = 0.584).
Exploratory analyses assessing for differences in seed connectivity between TL MGN and LGN fROIs, and the three evaluated alternatives (WFU, THOMAS, and FreeSurfer), are shown in Figure 6. Significant whole-brain increases in connectivity are visible relative to WFU ROIs (Fig. 6A and D). TL MGN fROIs show increased connectivity with areas of primary AC and superior temporal gyrus relative to all alternative ROIs (Fig. 6A-C). Similarly, TL LGN fROIs show increased connectivity compared to the WFU and THOMAS ROIs across the brain, with particularly increased connectivity with regions of primary VC around the calcarine sulcus (Fig. 6D-E). No significant differences in TL LGN seed connectivity and FreeSurfer seed connectivity were found after correction for family-wise error rate.
4 Discussion
Here, we detail a sensory thalamic localizer (TL) fMRI task (design shown in Fig. 1) capable of localizing auditory and visual thalamic nuclei in individual human participants (shown in Figs. 2 and 3). Building off of past work (W. Chen et al., 1998; Denison et al., 2014; Jiang et al., 2013; Kastner et al., 2004), in which localizer tasks were utilized in order to identify either medial or lateral geniculate nuclei, here we show identification of both MGN and LGN through the use of a single task. Additionally, this task provides estimates of these thalamic nuclei in 16 minutes 12 seconds total, across four runs (of 3 minutes 46 seconds each), which is substantially less time than in prior work. For example, a single set of auditory thalamic fROIs was acquired in 32 minutes, across six runs (of 5 minutes and 20 seconds each) (Jiang et al., 2013). This is likely at least partially due to improvements in data acquisition: data here were acquired using 3 Tesla MR scanners, using a head coil with 32 or more channels (i.e., 32-channel at NYSPI and 48 head coil channels in the 64-channel head-and-neck coil at SBU), a high-resolution simultaneous-multi-slice (multiband) acquisition sequence (Moeller et al., 2010), and clustered-sparse temporal acquisition (Schmidt et al., 2008; Yang et al., 2000; Zaehle et al., 2007) of 3 volumes per cluster.
Additionally, improvements in the ability to resolve the location of auditory and visual thalamic fROIs is potentially enabled, in part, due to novel elements of the task design, in which a contrast between auditory and visual stimuli was used to robustly detect responsive regions around auditory and visual cortices in each subject, and use these task time series to find thalamic voxels (within specified search regions) that are responsive to either, but not to both. The contrast produced by alternating task conditions serves as a sensory control condition, in which circuits involved in the conscious perception of stimulation generally are engaged in both conditions. Furthermore, this design increases the specificity of obtained thalamic nuclei: auditory and visual thalamic fROIs are optimized to represent regions specific to their sensory modality, and not general features of perception or cognition. This may be critical in studies that aim to relate human perceptual abnormalities to dysfunction in circuits specifically associated with either auditory or visual processing. We validated this specificity using resting-state functional connectivity (RSFC) data (Fig. 4) collected from the same participants.
Finally, we compared fROIs obtained from the TL task to three alternative sources of thalamic ROIs (Fig. 5): an atlas-based ROI from the Wake Forest University PickAtlas Toolbox (Maldjian et al., 2003, 2004), the Thalamus Optimized Multi Atlas Segmentation (THOMAS; Su et al., 2019), and the FreeSurfer thalamic nuclei segmentation (Iglesias et al., 2018). We found significantly greater connectivity with auditory cortex from TL MGN fROIs relative to the evaluated alternatives, and significantly greater connectivity with visual cortex from TL LGN fROIs relative to the WFU and THOMAS ROIs; average connectivity from the FreeSurfer LGN ROI to visual cortex was slightly increased, but not significantly so. This may be due, in part, to the fact that FreeSurfer MGN ROIs are utilized during TL task analysis as the starting point for building search regions for these fROIs, and may additionally represent that priors for LGN locations in the FreeSurfer segmentation are typically well-placed for detecting BOLD activation from visual thalamus. This pattern is also visible in exploratory whole-brain seed correlation analyses exploring regions showing significant differences in connectivity from TL MGN and LGN fROIs relative to the three aforementioned alternative ROIs (Fig. 6).
4.1 Caveats and limitations
One critical aspect of fROIs derived from the TL task described here is that these fROIs purport to represent the profile of the hemodynamic signature of geniculate nuclei’s neural responses during task stimulation as detected in BOLD echo planar images, including the point spread functions of the hemodynamic response and T2* contrast (Parkes et al., 2005; Shmuel et al., 2007; Ugurbil, 2016; Ugurbil et al., 2013; Vu et al., 2017), after pre-processing and post-processing (including spatial smoothing). Thus, they may not correspond fully to anatomically expected sizes observed in human tissue post-mortem, or in vivo using anatomical (as opposed to functional) imaging techniques (Garcia-Gomar et al., 2019; Muller-Axt et al., 2021; Oishi et al., 2023; Sitek et al., 2019; Tourdias et al., 2014). When using these fROIs in BOLD fMRI data analysis, this could be a substantial advantage, since they may better map onto specific spatial features of the thalamic hemodynamic response and its profile in BOLD images, which can be impacted by imperfectly corrected susceptibility distortions, errors in spatial normalization to MNI template space, and other errors in localization. However, depending on the application, this advantage may be a flaw, and other techniques may prove superior; for example, when attempting to evaluate specific features of individual anatomy that are too fine to be reliably detected with standard whole-brain 3 Tesla BOLD imaging techniques. Likewise, aspects of human perceptual processing that occur at faster timescales than the Nyquist frequency enabled by BOLD fMRI imaging techniques (e.g., 800 ms) are not detectable using fMRI, except through summation into a slower, aggregate hemodynamic response.
Additionally, TL task-derived fROIs aim to achieve a degree of specificity to sensory modality, which could preclude their use for aspects of perception that involve multisensory integration or context-dependent modulation, including the medial and dorsal subnuclei of the MGN (Lee, 2015; Wepsic, 1966). While this does not appear to be a significant issue given the smoothness of 3T BOLD fMRI, especially after Gaussian smoothing, and the relatively large voxel size of 2 mm (isotropic), this could require modification if adapting this task for more precise mapping of geniculi using higher field strengths or more specialized sequences.
The TL task requires dedicated data acquisition of slightly more than 15 minutes total; this is in addition to standard T1w and T2w sequences, which can be used to obtain ROIs from anatomical segmentations. Future work systematically exploring the impact of scan time on fROI quality is warranted in order to aid in balancing needs for data quality against the feasibility of adding the task to a scan protocol. In this study, task data acquired from 5 of 52 (9.6%) participants did not produce a full set of four valid thalamic fROIs, which may be a concern in studies with limited sample sizes or variable data quality, and the extent to which additional data collection could have reduced this failure rate is unknown. As this study did not include the evaluation of test-retest reliability of fROIs obtained from the sensory TL task, future work should aim to quantify this by obtaining multiple sessions of sensory TL task data for each participant and explore the relationship between reliability measures and scan time.
Finally, the task code is written in Presentation, and the analysis code in MATLAB. These are proprietary software tools that require paid licenses to use. Future work exploring the adaptation of the task code to utilize free alternatives, such as Python (Python Software Foundation, Wilmington, DE) and PsychoPy (Peirce, 2007, 2008; Peirce et al., 2019), would allow for greater accessibility to researchers. As this software is open-source and released under the GNU General Public License version 3, it may be translated to other platforms and programming languages, as well as extended or adapted for other purposes as desired by the wider research community.
4.2 Conclusion
This work details the development, implementation, validation, and benchmarking of a sensory thalamic localizer (TL) BOLD fMRI task capable of detecting auditory and visual thalamic nuclei in individual participants. Using resting-state functional connectivity analyses, we show that auditory and visual thalamic functionally-defined regions of interest (fROIs) obtained from the sensory TL task are modality specific, and benchmark them against existing atlas- and T1w segmentation-based alternatives. This task produces estimated thalamic medial and lateral geniculate fROIs suitable for resting-state functional connectivity (RSFC) studies and may be added to fMRI acquisition protocols with less than 16 minutes of dedicated additional acquisition time. The TL task, implemented in Presentation, and analysis code, implemented in MATLAB, are both publicly available to the research community.
Data and Code Availability
The sensory Thalamic Localizer Presentation task code (Williams et al., 2024) is available from GitHub at https://github.com/CNaP-Lab/Sensory-Thalamic-Localizer and the Neurobehavioral Systems Archives of Neurobehavioral Experiments and Stimuli at http://www.neurobs.com/ex_files/expt_view?id=302. The MATLAB task analysis code for producing MGN and LGN fROIs from acquired BOLD fMRI task data and Presentation task logs is additionally available from GitHub at https://github.com/CNaP-Lab/Sensory-Thalamic-Localizer. All software associated with this manuscript is released under the GNU General Public License version 3. Note that the sensory Thalamic Localizer task is designed to be deployed on a system with a display refresh rate of 60 Hz. Data acquired from human participants used in the analyses detailed in this manuscript are available upon request from the corresponding author through a formal data-sharing agreement.
Author Contributions
Conceptualization: G.H., A.A.-D., and J.X.V.S.; Data Curation: J.C.W., P.N.T., Z.J.Z., E.B.S.-F., D.T.P., N.K.H., S.K.A., G.H., and J.X.V.S.; Formal Analysis: J.C.W., P.N.T., and J.X.V.S.; Funding Acquisition: J.C.W., A.A.-D., G.H., and J.X.V.S.; Investigation: J.C.W., P.N.T., Z.J.Z., E.B.S.-F., D.T.P., N.K.H., S.K.A., A.A.-D., G.H., and J.X.V.S.; Methodology: J.C.W., P.N.T., G.H., and J.X.V.S.; Project Administration: J.C.W., N.K.H., A.A.-D., G.H., and J.X.V.S.; Resources: A.A.-D., G.H., and J.X.V.S.; Software: J.C.W., P.N.T., S.K.A., and J.X.V.S.; Supervision: N.K.H., A.A.-D., G.H., and J.X.V.S.; Validation: J.C.W., S.K.A, G.H., and J.X.V.S.; Visualization: J.C.W., P.N.T., D.T.P., and J.X.V.S.; Writing—Original Draft: J.C.W.; Writing—Review & Editing: J.C.W., P.N.T., Z.J.Z., A.A.-D., G.H., and J.X.V.S.
Funding
Research reported in this publication was supported by the National Institute of Mental Health of the National Institutes of Health (NIH) under award numbers K01MH107763 to J.X.V.S., K23MH101637 to G.H., R01MH109635 to A.A.-D., and F30MH122136 to J.C.W. J.C.W. was also supported by an NIH Research Supplement to Promote Diversity in Health-Related Research (3R01MH120293-04S1) and by the Stony Brook University Medical Scientist Training Program (NIH Award No. T32GM008444; Principal Investigator: Dr. Michael A. Frohman). P.N.T. was supported by a Stony Brook University Department of Biomedical Engineering Graduate Assistance in Areas of National Need Fellowship (United States Department of Education Award No. P200A210006; Director: Dr. David Rubenstein) and the Stony Brook University Scholars in Biomedical Sciences Program (NIH Award No. T32GM148331; PI: Dr. Styliani-Anna [Stella] E. Tsirka). The Stony Brook high-performance SeaWulf computing system was supported by National Science Foundation (NSF) Award Nos. 1531492 (PI: Dr. Robert Harrison; co-PI: Dr. Yuefan Deng) and 2215987 (PI: Dr. Robert Harrison; co-PIs: Dr. Yuefan Deng, Dr. Eva Siegmann, and David Cyrille), and matching funds from the Empire State Development’s Division of Science, Technology and Innovation (NYSTAR) program contract C210148. Magnetic resonance imaging at Stony Brook University was performed at the Social, Cognitive, and Affective Neuroscience (SCAN) Center, supported by NSF Award No. 0722874 (PI: Dr. Turhan Canli). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH, NSF, or NYSTAR.
Declaration of Competing Interest
A.A.-D. received consulting fees and/or honoraria from Sunovion Pharmaceuticals Inc., Otsuka Pharmaceutical Co., Ltd., Merck & Co., Inc., Neurocrine Biosciences, Inc., F. Hoffmann-La Roche AG, and C.H. Boehringer Sohn AG & Co. KG. A.A.-D. holds stock options in Herophilus, Inc. and in Terran Biosciences, Inc. All other authors declare that they have no known competing financial interests or personal relationships that could have influenced or appear to have influenced the work reported in this paper.
Acknowledgments
The authors would like to thank Jaeyop Jeong, Sam R. Luceno, Srineil Nizambad, and Yash Patel for contributions to fMRI data preprocessing, Kelly Bobchin for assistance with data organization, and Tram Ngoc Bao Nguyen for helpful discussions. Computing resources and technical assistance were provided by Stony Brook Medicine Research Computing, with substantial support from Allen Zawada and James Xikis. Access to and technical support for the high-performance SeaWulf computing system was provided by Stony Brook Research Computing and Cyberinfrastructure and the Institute for Advanced Computational Science at Stony Brook University, with notable support from Fırat Coşkun, Daniel Wood, and David Carlson.
Supplementary Materials
Supplementary material for this article is available with the online version here: https://doi.org/10.1162/imag_a_00360.