Abstract
Characterizing a particular neurodegenerative condition against other possible diseases remains a challenge at the clinical, biomarker, and neuroscientific levels. This is particularly the case for frontotemporal dementia (FTD) variants, whose specific characterization requires high levels of expertise and multidisciplinary teams to subtly distinguish among similar physiopathological processes. Here, we used a computational approach of multimodal brain networks to address simultaneous multiclass classification of 298 subjects (one group against all others), including five FTD variants: behavioral variant FTD, corticobasal syndrome, nonfluent variant primary progressive aphasia, progressive supranuclear palsy, and semantic variant primary progressive aphasia, with healthy controls. Fourteen machine learning classifiers were trained with functional and structural connectivity metrics calculated through different methods. Due to the large number of variables, dimensionality was reduced, employing statistical comparisons and progressive elimination to assess feature stability under nested cross-validation. The machine learning performance was measured through the area under the receiver operating characteristic curves, reaching 0.81 on average, with a standard deviation of 0.09. Furthermore, the contributions of demographic and cognitive data were also assessed via multifeatured classifiers. An accurate simultaneous multiclass classification of each FTD variant against other variants and controls was obtained based on the selection of an optimum set of features. The classifiers incorporating the brain’s network and cognitive assessment increased performance metrics. Multimodal classifiers evidenced specific variants’ compromise, across modalities and methods through feature importance analysis. If replicated and validated, this approach may help to support clinical decision tools aimed to detect specific affectations in the context of overlapping diseases.
Author Summary
The distinction of a neurodegenerative condition against multiple related diseases simultaneously, with overlapping features and high levels of heterogeneity in behavioral, clinical, and neuropathological markers, remains a challenge. Here, we combined structural and functional connectivity markers with diverse methods including graph theory and cognitive markers to perform a multiclass classification of five FTD variants. An optimum set of features was obtained through progressive feature elimination by removing redundant and uninformative variables. Our results for the simultaneous multiclass categorization of each FTD variant and healthy controls achieved a performance up to an area under the curve of 0.95. This approach can help develop clinical decision support tools to detect specific affectations in the context of overlapping neurodegenerative diseases.
INTRODUCTION
Distinguishing a single neurodegenerative condition against multiple related diseases simultaneously remains a challenge at clinical, biomarker, and neuroscientific levels. In particular, the clinical diagnosis of each frontotemporal dementia (FTD) variant requires high levels of expertise and multidisciplinary teams to distinguish among subtle phenotypes with similar underlying physiopathological processes (Bejanin et al., 2020; Staffaroni et al., 2019; Zetterberg et al., 2019). FTD presents with high levels of heterogeneity in behavioral, clinical, and neuropathological markers (Boeve et al., 2022; Peet et al., 2021; Rohan et al., 2019). Furthermore, biomarkers have been shown to be less sensitive to multiclass differentiation (i.e., classifying one condition against multiple other variants) across neurodegenerative diseases (Moral-Rubio et al., 2021; Tahmasian et al., 2016). Standard positron emission tomography (PET) biomarkers are relatively specific for Alzheimer’s disease (Ossenkoppele et al., 2021), but there are caveats when using PET to distinguish between FTD variants (Tsai et al., 2019). Plasma markers are promising for determining some FTD variants but are not yet widely accessible. Finally, the field of cognitive neuroscience has developed multiple metrics for use with machine learning (ML) classification of neurodegenerative conditions (Bachli et al., 2020; Ibañez et al., 2021a, 2021b; Moguilner et al., 2021), including some FTD variants in particular (Feis et al., 2018; Moral-Rubio et al., 2021; Premi et al., 2016). However, in most cases, the classification is based on binary comparisons without assessing the clinical classification of one condition compared to numerous other conditions. These studies have also usually been performed with small samples and generally consider unimodal brain features. Thus, multiclass characterizations across FTD variants remain scarce despite the different approaches available.
Robust, scalable, and affordable biomarkers segregating not only healthy status from disease but also among multiple variants of similar conditions could be assessed via brain network approaches. Functional and structural connectivity based on magnetic resonance imaging (MRI) can be relevant regarding molecular mechanisms, pathological alterations, and clinical symptoms (Meeter et al., 2017; Pievani et al., 2011). Previous research has demonstrated its usefulness in identifying neurodegenerative disorders (Hafkemeijer et al., 2017; Hohenfeld et al., 2018; Jalilianhasanpour et al., 2019) and FTD variants (Chen et al., 2020; Reyes et al., 2018; Whitwell, 2019). However, the currently available research has several limitations. Within-subject variability is usually excluded (Finn et al., 2015; Salvatore et al., 2014). The results are typically biased by unsystematic research, including different modalities (i.e., structural vs. functional connectivity) and methods (i.e., voxel connectivity, region of interest (ROI) analysis, graph theory). Most importantly, comprehensive frameworks integrating different connectivity metrics to simultaneously distinguish between multiple FTD variants have not yet been developed. Thus, a more systematic computational framework that incorporates different connectivity modalities and methods to characterize each FTD variant against multiple other variants and controls is still needed.
Recent artificial intelligence developments for multiclass classification (Churcher et al., 2021; Gao et al., 2021; Hou et al., 2021) using ML are well suited to combine different connectivity modalities and methods to test the power of multiclass classification. The ML framework requires minimal assumptions and is more robust than parametric approaches regarding data heterogeneity (Makridakis et al., 2018; Poldrack et al., 2019). Classification can be performed with many variables with nonlinear interactions (Bzdok et al., 2018; Bzdok et al., 2017). Combining ML with progressive feature elimination can identify the main predictors, enhance classification, and provide a top assortment of features to classify outcomes (Montavon et al., 2018; Nicholls et al., 2020). Moreover, these methods can account for additional sources of heterogeneity, such as demographic and cognitive measures, in combination with brain network features.
The general aim of this study was to optimize the number of multimodal sources of information and to develop an effective multifeatured and multiclass classification of FTD variants. To reduce dimensionality due to the large number of variables (i.e., voxel-wise information adding up to the scale of 10⁶ variables), group-level statistical analyses were employed before data-driven progressive elimination to overcome computational constraints. These statistical approaches enabled comparing our findings with previous studies in FTD variants (Agosta et al., 2012; Bharti et al., 2017; Iaccarino et al., 2015; Popal et al., 2020; Tovar-Moll et al., 2014; Upadhyay et al., 2016; Whitwell et al., 2009). Moreover, by complementing the statistical approaches with a subsequent ML classification, we followed hybrid methodologies reported in the literature (Dottori et al., 2017; Fittipaldi et al., 2020; Gonzalez Campo et al., 2020; Kassraian-Fard et al., 2016; McMillan et al., 2012). In total, 298 subjects were analyzed, comprising patients with one of five FTD variants, namely, behavioral variant FTD (bvFTD; n = 47), corticobasal syndrome (CBS; n = 38), nonfluent variant primary progressive aphasia (nfvPPA; n = 34), progressive supranuclear palsy (PSP; n = 42), and semantic variant primary progressive aphasia (svPPA; n = 38), and healthy controls (HC; n = 99). The ML classifier employed was the eXtreme Gradient Boosting (XGBoost) algorithm (Kaufmann et al., 2019), which was used to classify one group against each of the remaining others. Brain network features included two modalities (functional and structural) and three methods (voxel level or raw connectivity, ROI-to-ROI, and graphs). In addition, we incorporated basic demographics (sex, age, years of education) and cognitive measures (disease severity and cognitive screening) into a multifeatured classification.
Three hypotheses were advanced: (1) feature optimization will enable accurate simultaneous multiclass classification of each FTD variant relative to other variants and controls, (2) use of cognitive measures in conjunction with connectivity will increase classification accuracy, and (3) the model comprising all modalities and methods will outperform the models with single methods and modalities. By testing these hypotheses, we aimed to assess the robustness of a multiclass computational framework for characterizing each FTD variant simultaneously against all other variants and controls.
MATERIALS AND METHODS
Subjects
All the data were obtained from LONI’s databases (https://ida.loni.usc.edu), namely, Neuroimaging in Frontotemporal Dementia (NIFD) and the 4 Repeat Tauopathy Neuroimaging Initiative (4RTNI), both of which are part of the frontotemporal lobar degeneration neuroimaging initiative (https://4rtni-ftldni.ini.usc.edu). Clinical diagnosis of the FTD variants was based on current criteria (Armstrong et al., 2013; Gorno-Tempini et al., 2011; Rascovsky & Grossman, 2013; Rascovsky et al., 2011). Patients did not present any vascular, psychiatric, or other neurological disorders. The inclusion of healthy subjects required confirmation of normal cognitive function, the absence of any disease, and a brain MRI free of lesions or significant white matter changes.
In total, data from 298 subjects were analyzed, including HCs (n = 99) and individuals with one of five FTD variants: bvFTD (n = 47), CBS (n = 38), nfvPPA (n = 34), PSP (n = 42), and svPPA (n = 38). These variants constitute the core FTD spectrum disorders (Olney et al., 2017), and their typical atrophy patterns (Boxer et al., 2006; Lu et al., 2013; Seeley et al., 2009; Whitwell et al., 2010b; Whitwell et al., 2009) were confirmed via voxel-based morphometry (Supporting Information, Supplementary data 1). Due to the large variability within the clinical groups (where some values can represent outliers) and the nonnormal data distribution of some variables, the median and median absolute deviation (Wilcox, 2017) were used to analyze demographic and cognitive characteristics (Table 1). Additionally, to solve the pronounced between-group variability (Supporting Information, Supplementary data 2), it was necessary to match the clinical groups in age, sex, and education using two subsamples of the HCs, both with n = 50 (Supporting Information, Supplementary data 3): the first subsample (HCsub1) was matched with bvFTD and svPPA patients, and the second subsample (HCsub2) was matched with CBS, nfvPPA, and PSP patients. Based on the clinical dementia rating (CDR), bvFTD and svPPA patients were in the mild stage of the disease (CDR = 1), while those with the other variants were in the early disease stage (CDR = 0.5) (see Table 1). However, all variants were comparable regarding their cognitive status, as assessed by the Mini-Mental State Examination (MMSE). Additionally, all variant groups were significantly different from the HC group on the CDR and MMSE scores (Table 1; see details in Supporting Information, Supplementary data 2). Among the variant groups, analysis of disease severity and cognitive measures presented statistically significant differences only between the bvFTD and nfvPPA groups in the CDR index.
Group | n | Age | Sex (% of females) | Education | CDR | MMSE |
---|---|---|---|---|---|---|
HCtotal | 99 | 66.0 (4.45) | 54.5 | 18.0 (2.97) [4] | 0 (0.0) [31] | 30.0 (0) |
HCsub1 | 50 | 63.0 (5.19) | 46 | 16.0 (1.48) [1] | 0 (0.0) [31] | 29.0 (1.48) |
HCsub2 | 50 | 68.0 (5.93) | 58 | 17.0 (1.48) [2] | 0 (0.0) [11] | 29.5 (0.74) |
bvFTD | 47 | 62.0 (5.93) | 38.3 | 15.5 (3.71) | 1 (0.74) | 25.0 (4.45) |
CBS | 38 | 67.0 (8.15) | 52.6 | 16.0 (2.97) [2] | 0.5 (0.37) [2] | 25.0 (4.45) [4] |
nfvPPA | 34 | 69.5 (8.90) | 52.9 | 16.0 (2.97) | 0.5 (0.0) [1] | 26.0 (2.97) |
PSP | 42 | 68.5 (6.67) | 50 | 16.0 (2.97) [2] | 0.5 (0.74) [3] | 26.0 (2.97) [4] |
svPPA | 38 | 64.0 (7.41) | 44.7 | 16.0 (2.97) [1] | 1.0 (0.37) | 24.5 (3.71) [1] |
Shapiro–Wilk normality test | | p = 0.264 | — | p < 0.001 | p < 0.001 | p < 0.001 |
Significant between-group comparisons | | a, b, c | | | a, d | d |
Note. Descriptive statistics are presented as the median (median absolute deviation) [missing values]. The HCsub1 subsample is demographically matched with the bvFTD and svPPA groups, and the HCsub2 subsample is matched with the CBS, nfvPPA, and PSP variant groups. Group median comparisons were based on a 5,000 permutations test to deal with tied values (Wilcox, 2017); see details in Supporting Information Table S1. The p values were set at 0.05 and adjusted by the Bonferroni method. a: bvFTD vs. nfvPPA; b: bvFTD vs. PSP; c: PSP vs. svPPA; d: all variant groups presented statistically significant differences to the HC group. bvFTD = behavioral variant of FTD; CBS = corticobasal syndrome; CDR = Clinical Dementia Rating; HC = healthy controls; MMSE = Mini-Mental State Exam; nfvPPA = nonfluent variant primary progressive aphasia; PSP = progressive supranuclear palsy; svPPA = semantic variant primary progressive aphasia.
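The robust descriptive statistics and the permutation-based median comparison described in the note can be illustrated with a minimal Python sketch. It is not the authors' code. The scaling constant 1.4826 is an assumption (the tabulated deviations, such as 1.48, 2.97, and 4.45, are consistent with it), and the function names `median_mad` and `permutation_median_test` are hypothetical. Bonferroni adjustment across comparisons is not shown.

```python
import numpy as np

def median_mad(values, scale=1.4826):
    """Return (median, scaled median absolute deviation), ignoring NaNs.

    The scale factor 1.4826 is an assumption; it makes the MAD comparable
    to a standard deviation under normality.
    """
    x = np.asarray(values, dtype=float)
    x = x[~np.isnan(x)]
    med = np.median(x)
    mad = scale * np.median(np.abs(x - med))
    return med, mad

def permutation_median_test(a, b, n_perm=5000, seed=0):
    """Two-sided permutation p value for the difference between two medians."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = abs(np.median(a) - np.median(b))
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)  # random relabeling of subjects
        diff = abs(np.median(shuffled[:a.size]) - np.median(shuffled[a.size:]))
        count += int(diff >= observed)
    # Add-one correction so the p value is never exactly zero
    return (count + 1) / (n_perm + 1)
```

For example, `median_mad([60, 63, 66, 69, 72])` yields a median of 66 and a scaled deviation of about 4.45, in the same format as the table entries.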
MRI: Analysis
All MRI images were acquired from a 3.0 Tesla MR device following the protocols of the frontotemporal lobar degeneration neuroimaging initiative between 2008 and 2016. Although two different scanners were used (Siemens model Trio; scanner 1 = 285 subjects, 95.6 %; and scanner 2 = 13 subjects, 4.4 %), previous studies with this dataset have pointed out that this variability is negligible (Dickerson et al., 2008; Fox et al., 2012; Melzer et al., 2020; Noble et al., 2017a; Noble et al., 2017b; Zhou et al., 2018). In addition, only images with the same acquisition parameters for both scanners were selected to avoid additional sources of variability (Han et al., 2006; Mueller et al., 2011). Finally, additional procedures were performed to assess the potential effects of the scanner variability (see below). To enable a greater sample size, only cross-sectional data were included. We detail the study goal and the pipeline in Figure 1.
Resting-state fMRI (rs-fMRI): Connectivity matrices.
Three approaches were used to analyze the functional connectivity pattern in each variant group. First, connectivity was analyzed at the voxel level. We refer to this approach as raw connectivity, given that it is the most straightforward method to obtain functional connectivity maps (Nieto-Castanon, 2020). Second, at the ROI level, we analyzed the linear correlation between regions (ROI-to-ROI) because this is the gold standard for obtaining averaged voxel-wise associations (Fox & Raichle, 2007; Ibañez et al., 2021a, 2021b; Nieto-Castanon, 2020). Third, we employed graph connectivity measures to characterize brain network organization in a more comprehensive manner (Bressler & Menon, 2010; Rubinov & Sporns, 2010; Sporns, 2018). The three approaches were implemented to compare each variant with its respective HC subsample and between other variants based on t statistics. The multiple comparison problem was accounted for by using a threshold-free cluster enhancement (TFCE) method (Chen et al., 2018; Smith & Nichols, 2009), except for the graph measures, where we used the false discovery rate (FDR) method with p < 0.05.
The rs-fMRI signals were obtained from all subjects with an echo-planar pulse sequence and the following parameters: TR = 2,000 ms, TE = 27 ms, flip angle = 80 degrees, voxel size = 3 mm³, number of slices = 240. The preprocessing and functional connectivity analysis were implemented with the CONN 20.b toolbox (Whitfield-Gabrieli & Nieto-Castanon, 2012) running in SPM12 on MATLAB R2018b. For the functional images, the first five volumes were removed. The pipeline for preprocessing the images was set as the default of the CONN toolbox: (1) subject-motion estimation and correction, (2) automatic translations of the center to the (0, 0, 0) coordinates, (3) slice timing correction, (4) outlier detection (global signal z-value threshold = 5, subject-motion mm threshold = 0.9), (5) direct segmentation and normalization to Montreal Neurological Institute (MNI) space with a resolution of 3 mm³, (6) automatic translations of the center to the (0, 0, 0) coordinates of structural images, (7) direct segmentation and normalization to MNI of structural images (resolution of 1 mm³), (8) smoothing with a Gaussian kernel (8 mm × 8 mm × 8 mm), (9) denoising (linear regression of confounding effects of white matter, cerebrospinal fluid, realignment, and scrubbing), and (10) band-pass filter (0.001–0.09 Hz).
TFCE corrections were employed for raw and ROI-to-ROI connectivity because they can assess statistical significance at the voxel and ROI level without requiring one to set an arbitrary threshold, thus providing unbiased results (Chen et al., 2018; Smith & Nichols, 2009). Additionally, TFCE is more sensitive to both focal and peripheral effects than classical correction methods, reaching the best balance between family-wise error (FWE) rates and replicability (Chen et al., 2018). We calculated TFCE through 1,000 permutations and a significance level of p < 0.05 (FWE-corrected). Since the aggregation of areas within clusters (as in the TFCE method) can interfere with the calculation of graph theory metrics (Bullmore & Sporns, 2009; Sporns, 2010), we employed the FDR correction, which is the method of choice to correct for the multiple comparisons problem when using those measures (Agosta et al., 2014; Khazaee et al., 2015).
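As an illustration of the FDR correction applied to the graph measures, the following is a minimal Python sketch of the Benjamini-Hochberg step-up procedure. The text does not state which FDR variant was used, so the choice of Benjamini-Hochberg is an assumption, and the function name `fdr_bh` is hypothetical.

```python
import numpy as np

def fdr_bh(p_values, alpha=0.05):
    """Benjamini-Hochberg procedure: boolean mask of rejected hypotheses."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                          # ranks p values ascending
    thresholds = alpha * np.arange(1, m + 1) / m   # alpha * rank / m
    passed = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])  # largest rank meeting its threshold
        rejected[order[:k + 1]] = True     # reject everything up to that rank
    return rejected
```

Given one p value per node-level comparison, the returned mask marks the comparisons that remain significant at the chosen FDR level.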
At the voxel level, the raw functional connectivity analysis included four measures to characterize the brain’s complexity (Mohanty et al., 2020). First, we employed global correlation, which represents the average of the correlation coefficient between each voxel and all other voxels in the brain (Nieto-Castanon, 2020). Second, we employed local correlation, which is defined as the average of the correlation coefficients of every voxel and their neighboring voxels in a kernel size of 30 mm (Deshpande et al., 2009). Additionally, two measures that are not scale invariant were implemented to analyze BOLD signal power within a frequency window of interest (0.001–0.09 Hz). First, we used the amplitude of low-frequency fluctuations, considering the root-mean-square of the time series of each voxel after low- or band-pass filtering (Yang et al., 2007). Second, we used the fractional amplitude of low-frequency fluctuations to represent the power of the frequency band of interest (0.001–0.09 Hz) compared to the entire frequency spectrum. This measure represents the ratio of the root-mean-square of the BOLD signal at each voxel after vs. before the filtering (Zou et al., 2008).
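The two amplitude measures can be illustrated for a single voxel time series with a minimal Python sketch. It follows the definitions given above (root-mean-square after filtering, and the ratio of root-mean-square after vs. before filtering) but is not the CONN implementation. The ideal Fourier-domain filter, the demeaning of the unfiltered signal, and the function names are assumptions made for the example.

```python
import numpy as np

def bandpass_fft(ts, tr, low=0.001, high=0.09):
    """Ideal band-pass filter: zero Fourier components outside [low, high] Hz."""
    ts = np.asarray(ts, dtype=float)
    freqs = np.fft.rfftfreq(ts.size, d=tr)
    spectrum = np.fft.rfft(ts - ts.mean())
    spectrum[(freqs < low) | (freqs > high)] = 0.0
    return np.fft.irfft(spectrum, n=ts.size)

def alff_falff(ts, tr=2.0, low=0.001, high=0.09):
    """Return (ALFF, fALFF) for one voxel time series sampled every `tr` seconds."""
    ts = np.asarray(ts, dtype=float)
    unfiltered = ts - ts.mean()              # assumption: demean before comparing
    filtered = bandpass_fft(ts, tr, low, high)
    rms_filtered = np.sqrt(np.mean(filtered ** 2))
    rms_unfiltered = np.sqrt(np.mean(unfiltered ** 2))
    return rms_filtered, rms_filtered / rms_unfiltered
```

A signal whose power lies entirely inside the band gives a fractional amplitude close to 1, whereas high-frequency content lowers the ratio.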
At the ROI level, we extracted the mean time course of the BOLD signal of each one of the 116 regions according to the automated anatomical labeling (AAL) atlas (Tzourio-Mazoyer et al., 2002). An ROI-to-ROI connectivity matrix for each subject was calculated for all regions using the Fisher-transformed bivariate correlation coefficient between every pair of ROI time series (Nieto-Castanon, 2020). The TFCE clustering was based on a hierarchical algorithm method, where ROIs with similar effect patterns were grouped to achieve more meaningful results (Nieto-Castanon, 2020). Regarding graph measures, we included the positive values of the ROI-to-ROI connectivity matrices only, with a threshold of 0.3, to avoid small effect sizes (Cohen, 1988, 1992; Rosenthal, 1996). Thus, the AAL regions were considered the nodes, and the linear correlations greater than 0.3 the edges. The matrices obtained were used as undirected weighted inputs for the calculation of graph metrics in the brain connectivity toolbox (https://www.brain-connectivity-toolbox.net). We followed the approach of Sedeño et al. (2017) by employing weighted graph measures, given that the information about the connection strength is preserved (van Wijk et al., 2010).
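The construction of a subject-level ROI-to-ROI matrix and its thresholding for the graph analysis can be sketched in Python as follows. This is an illustrative example under stated assumptions rather than the CONN or brain connectivity toolbox code: the function names are hypothetical, and applying the 0.3 threshold to the values of the matrix passed in (here the Fisher-transformed one) is an assumption, since the text does not specify whether the cutoff was applied before or after the transform.

```python
import numpy as np

def roi_to_roi_fisher(timeseries):
    """timeseries: (n_timepoints, n_rois) array -> Fisher-z matrix (n_rois, n_rois)."""
    r = np.corrcoef(np.asarray(timeseries, dtype=float), rowvar=False)
    np.fill_diagonal(r, 0.0)               # drop self-connections before arctanh
    r = np.clip(r, -0.999999, 0.999999)    # keep arctanh finite
    return np.arctanh(r)                   # Fisher r-to-z transform

def threshold_positive(matrix, threshold=0.3):
    """Keep weights greater than `threshold`; all other entries become 0."""
    w = np.where(matrix > threshold, matrix, 0.0)
    np.fill_diagonal(w, 0.0)
    return w                               # undirected weighted adjacency
```

With 116 AAL time courses as columns, `threshold_positive(roi_to_roi_fisher(ts))` would give a symmetric 116 × 116 weighted matrix suitable as input for graph metrics.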
Following previous standards in this field (Rubinov & Sporns, 2010), seven graph connectivity metrics were analyzed for each ROI: (1) global efficiency: the average of inverse-shortest path (the minimum number of edges that must be traversed to go from one node to another) between this node and all other nodes in the network (Nieto-Castanon, 2020); (2) local efficiency: global efficiency computed on the neighborhood of the node (Nieto-Castanon, 2020); (3) degree: number of links connected to the node (Bullmore & Sporns, 2009); (4) strength: the sum of the weights of links connected to the node (Rubinov & Sporns, 2010); (5) clustering coefficient: fraction of node’s neighbors that are neighbors of each other (Bassett & Bullmore, 2006); (6) betweenness centrality: the fraction of all shortest paths in the network that contain a given node (Rubinov & Sporns, 2010).
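Three of the node-level metrics listed above (degree, strength, and global efficiency) can be illustrated with a minimal Python sketch. It is not the brain connectivity toolbox implementation. Path length is counted in edges, following the definition given in the text, which is an assumption about how the weighted matrices were handled; the function names are hypothetical.

```python
import numpy as np

def degree_and_strength(w):
    """Degree (number of links) and strength (sum of weights) per node."""
    w = np.asarray(w, dtype=float)
    return (w > 0).sum(axis=1), w.sum(axis=1)

def nodal_global_efficiency(w):
    """Mean inverse shortest-path length (in edges) from each node to all others."""
    adjacency = np.asarray(w) > 0
    n = adjacency.shape[0]
    dist = np.where(adjacency, 1.0, np.inf)
    np.fill_diagonal(dist, 0.0)
    for k in range(n):                      # Floyd-Warshall shortest paths
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    with np.errstate(divide="ignore"):
        inverse = 1.0 / dist                # unreachable pairs contribute 0
    np.fill_diagonal(inverse, 0.0)          # ignore self-distances
    return inverse.sum(axis=1) / (n - 1)
```

For a three-node chain, the middle node reaches both neighbors in one step and has an efficiency of 1, while each end node has an efficiency of 0.75.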
Diffusion-weighted images (DWI).
Structural connectivity was analyzed in a similar way to functional connectivity, using the three previous methods to characterize connectivity changes at different levels. The DWI sequences were acquired for 264 of the subjects, including 77 HCs (HCsub1 = 43, HCsub2 = 41), 45 with bvFTD, 35 with CBS, 34 with nfvPPA, 37 with PSP, and 36 with svPPA. As in the total sample, no sociodemographic differences were observed (Supporting Information, Supplementary data 3). The group comparison methods for the three DWI approaches were the same as those used for the functional data. The multiple comparisons problem for the raw and graph measures of connectivity was corrected by the FDR method (p < 0.05). The connections between ROIs were corrected by the TFCE method using 1,000 permutations with p < 0.05 (FWE-corrected).
All DWI were acquired with an echo-planar sequence in two dimensions and 64 diffusion sampling directions, with the following parameters: TR = 7,400 ms, TE = 86 ms, flip angle = 180 degrees, in-plane resolution = 2.2 mm, slice thickness = 2.2 mm, and b-value = 2,000 s/mm². The MRIToolkit toolbox (https://github.com/delucaal/MRIToolkit), running on MATLAB R2018b, was used for preprocessing, where the pipeline includes denoising based on principal components analysis (Veraart, Fieremans, & Novikov, 2016) and corrections for distortions due to head motion and eddy currents. The analysis was implemented with DSI Studio (version 2021, Jul 12, https://dsi-studio.labsolver.org). The quality of the DWIs was confirmed by an average of 0.89 (SD = 0.01) in the mean Pearson correlation coefficient of the “neighboring” DWI (Yeh et al., 2019b). To obtain the spin distribution function (Yeh et al., 2010), diffusion data were reconstructed in the MNI space using q-space diffeomorphic reconstruction (Yeh & Tseng, 2011) with a length ratio of 1.25 and a resolution of 2 mm isotropic. We used a deterministic fiber tracking algorithm (Yeh et al., 2013), which has been shown to achieve 92% valid connections, compared with an average of 54% for other algorithms (Maier-Hein et al., 2017).
The multiple comparison problem was tackled using a TFCE method (Chen et al., 2018; Smith & Nichols, 2009), except for the graph measures, in which we used the false discovery rate (FDR) method with p < 0.05. The reasons for selecting these methods were the same as with functional connectivity, but for raw structural connectivity, we used the FDR correction. The FDR method is the gold standard for tackling the multiple comparisons problem in tractography (Whitwell et al., 2010a; Yeh et al., 2019b; Zhang et al., 2013) because it enables a substantial improvement of statistical power for comparing individuals averaging tracts (Schwartzman et al., 2008). Moreover, tract-crossing fibers (Tournier, 2010) can affect clustering measures, thus discouraging the use of TFCE corrections.
First, we began the analysis of raw structural connectivity by taking into account all the tracts included in the structural connectome (Yeh et al., 2018). An automatic fiber tracking algorithm was used to calculate seven connectivity metrics for the subject’s tracts. The seeding region was placed at the track regions of the tractography atlas (Yeh et al., 2018) with a track-to-voxel ratio of 2. The anisotropy threshold, angular threshold, and step size were randomly selected, the last two between 15–90 degrees and 0.5–1.5 voxels, respectively (Yeh et al., 2018). Tracks with a length shorter than 30 or longer than 200 mm were discarded (Yeh et al., 2019a). Additionally, topology-informed pruning (Yeh et al., 2019a) was applied with 32 iterations to remove false connections. Seven raw connectivity measures were obtained for every tract using two models. On the one hand, we obtained four measures from the classical tensor model (Basser et al., 1994a, 1994b) with diffusion tractography image: (1) fractional anisotropy: the degree of anisotropy of the diffusion process; (2) axial diffusivity: quantifies diffusivity along the principal axis of the tensor; (3) radial diffusivity: quantifies the diffusivity perpendicular to the principal axis of the tensor; and (4) mean diffusivity: characterizes the overall mean squared displacement of the water molecules (for more details on tensor-based measures calculations, see Hecke et al., 2016).
On the other hand, from the q-space model (Callaghan, 1994) and the generalized q-sampling imaging method (Yeh et al., 2010, 2013), the following measures were obtained: (5) the normalized quantitative anisotropy: evaluates the most prominent fiber orientation, scaled so that the maximum of each subject is one (Yeh et al., 2010, 2013); (6) the isotropic diffusion component: represents the non-directional restricted diffusion (Yeh et al., 2010, 2013); and (7) the restricted diffusion imaging: quantifies the density of restricted diffusion (Yeh et al., 2017).
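The four tensor-based measures (1) to (4) follow directly from the three eigenvalues of the diffusion tensor. The following minimal Python sketch uses the standard textbook formulas; it is not the DSI Studio implementation, and the function name `tensor_measures` is hypothetical.

```python
import numpy as np

def tensor_measures(eigenvalues):
    """Return FA, AD, RD, and MD from the three diffusion tensor eigenvalues."""
    l1, l2, l3 = sorted(np.asarray(eigenvalues, dtype=float), reverse=True)
    md = (l1 + l2 + l3) / 3.0          # mean diffusivity
    ad = l1                            # axial: along the principal axis
    rd = (l2 + l3) / 2.0               # radial: perpendicular to the principal axis
    numerator = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    denominator = l1 ** 2 + l2 ** 2 + l3 ** 2
    fa = float(np.sqrt(1.5 * numerator / denominator)) if denominator > 0 else 0.0
    return {"FA": fa, "AD": float(ad), "RD": float(rd), "MD": float(md)}
```

Fractional anisotropy ranges from 0 for perfectly isotropic diffusion (equal eigenvalues) to 1 when diffusion occurs along a single axis.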
Second, an ROI-to-ROI structural connectivity analysis was used to assess the integrity of tracts between gray matter regions. To this end, the fractional anisotropy associations among the 116 ROIs of the AAL atlas were calculated (Tzourio-Mazoyer et al., 2002). A total of 10,000 seeds were placed in the whole brain (Yeh et al., 2010, 2013). The fractional anisotropy threshold was set at 0.2 for the elimination of voxels containing gray matter (Mori & van Zijl, 2002; Oishi et al., 2008). This value was chosen based on the existing literature on FTD structural connectivity characterization (Chen et al., 2020; Möller et al., 2015; Sheelakumari et al., 2020). Additionally, the angular threshold and step size were randomly selected from 15 to 90 degrees and 0.5 to 1.5 voxels, respectively (Yeh et al., 2018). The tracts that appeared repeatedly at a distance smaller than 1 mm, or that were shorter than 30 mm or longer than 200 mm, were discarded (Yeh et al., 2019a, 2019b). Finally, one connectivity matrix was calculated for every subject based on the mean fractional anisotropy of the tracts that ended in each of the 116 AAL ROIs. Furthermore, these matrices were used for the calculation of graph metrics, as explained for functional data.
Machine Learning
We employed ML to compare classification performance of functional and structural connectivity data alone or in combination with demographic and cognitive measures. With this aim, we evaluated the individual contributions of the three methods of connectivity in the two modalities (functional and structural) to characterize the pathological groups. Each algorithm was executed two times, with connectivity data alone and with the demographic and cognitive features. Additionally, we created one model combining all methods and modalities. In total, 14 data-driven models were used for the classification of one class relative to the remaining subjects, following best practices in ML (Müller & Guido, 2016; Poldrack et al., 2019).
Feature engineering and selection.
The total number of multimodal features was on the order of 10⁶ voxel-wise variables, entailing extensive computational time and memory requirements (Cohen et al., 2017; Huys et al., 2016). Using the full set of features might also induce adaptation of ML algorithms to the particularities of a specific dataset (overfitting), resulting in poor generalizability (Müller & Guido, 2016). Following previous procedures, we reduced dimensionality with group-level statistical analysis, also called a filter method (Kassraian-Fard et al., 2016). Unlike other methods for dimensionality reduction, such as principal component analysis, this approach allows a more direct interpretation of the results (Huys et al., 2016; Pereira et al., 2009). In addition, filter methods are computationally inexpensive and do not take classifier performance into consideration (Kassraian-Fard et al., 2016). Moreover, our group-level statistical findings can be compared with previous research (Agosta et al., 2012; Bharti et al., 2017; Iaccarino et al., 2015; Popal et al., 2020; Tovar-Moll et al., 2014; Upadhyay et al., 2016; Whitwell et al., 2009). This hybrid approach has been employed in previous studies (Dottori et al., 2017; Fittipaldi et al., 2020; Gonzalez Campo et al., 2020; Kassraian-Fard et al., 2016; McMillan et al., 2012).
Each possible group comparison was calculated for each modality and method, with a statistical power threshold of 0.80 for detecting medium effect sizes, as recommended (Cohen, 1988); see Supporting Information, Supplementary data 2 for details. The significant results from these comparisons were used as inputs for the ML algorithms in our training sample. In the case of raw functional connectivity, the results were averaged for significant clusters (with a size greater than 50 voxels) across the individual maps of connectivity. To extract the most relevant features, we performed a progressive feature elimination approach in the training set (80% of the total sample, further details below) to select the optimum set of features after stabilization (Donnelly-Kehoe et al., 2018) using a k-fold scheme (with k = 5) with nested training and validation. At each iteration, the Gini scores were used to eliminate the features of the lowest importance, while evaluating feature stability on each nested fold. The variability of the feature ranking in the importance list was evaluated across nested k-folds. To this end, we assessed if the confidence interval’s right tail of each feature was ranked in the same way as the feature mean (see Supporting Information, Supplementary data 4 for analysis output details). Finally, we kept the N first features in the ranking, where N is the optimal number of features such that using more than N features fails to improve classifier performance.
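The progressive feature elimination loop can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: scikit-learn's `RandomForestClassifier` (whose `feature_importances_` are Gini-based) stands in for XGBoost, the confidence-interval check of ranking stability is omitted, and the function name `progressive_elimination` and its parameters are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

def progressive_elimination(X, y, n_splits=5, drop_per_step=1, seed=0):
    """Return (best_feature_indices, history of (n_features, mean validation AUC))."""
    features = np.arange(X.shape[1])
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    history, best_auc, best_features = [], -np.inf, features
    while features.size >= 1:
        aucs, importances = [], []
        for train_idx, val_idx in cv.split(X, y):
            # Stand-in classifier (assumption); the study used XGBoost
            model = RandomForestClassifier(n_estimators=50, random_state=seed)
            model.fit(X[train_idx][:, features], y[train_idx])
            prob = model.predict_proba(X[val_idx][:, features])[:, 1]
            aucs.append(roc_auc_score(y[val_idx], prob))
            importances.append(model.feature_importances_)
        mean_auc = float(np.mean(aucs))
        history.append((int(features.size), mean_auc))
        if mean_auc >= best_auc:            # on ties, prefer the smaller set
            best_auc, best_features = mean_auc, features.copy()
        if features.size == 1:
            break
        # Drop the least important feature(s), averaged across folds
        keep = np.argsort(np.mean(importances, axis=0))[drop_per_step:]
        features = features[np.sort(keep)]
    return best_features, history
```

The returned history traces validation performance as features are removed, so the optimal number N corresponds to the smallest set beyond which adding features does not improve the mean AUC.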
Classification models.
Based on the selected features, we used the XGBoost algorithm (Kaufmann et al., 2019) to classify the different clinical groups. The XGBoost algorithm is a gradient boosting machine implementation that provides parallel computation tree boosting, enabling fast and accurate predictions and advanced regularization techniques to avoid overfitting (Torlay et al., 2017). This algorithm has proven successful in several diagnostic applications (Behravan et al., 2018; Torlay et al., 2017; Zheng et al., 2017). Gradient boosting machine is based on the gradient boosting technique, in which ensembles of decision trees iteratively attempt to correct the classification errors of their predecessors by minimizing a loss function (i.e., a function representing the difference between the estimated and true values) pointing in the negative gradient direction (Mason et al., 1999). When compared to other GBM algorithms, XGBoost provides regularized boosting, helping to reduce overfitting with more generalizable results (Torlay et al., 2017; Xuan et al., 2019).
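The idea of successive trees correcting their predecessors along the negative gradient can be sketched in a few lines. The example below is plain gradient boosting with one-split trees (stumps) and a squared loss on one-dimensional toy data; it is not XGBoost, omits regularization, and every name and parameter value in it is an illustrative assumption.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals by least squares."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left) + sum(
            (r - right_mean) ** 2 for r in right
        )
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # All inputs are identical: fall back to a constant prediction.
        constant = sum(residuals) / len(residuals)
        return lambda x: constant
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Return a predictor built by adding stumps fitted to successive residuals."""
    base = sum(ys) / len(ys)
    stumps = []

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in stumps)

    for _ in range(n_rounds):
        # For the squared loss (y - F)^2 / 2, the negative gradient is y - F.
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict


# Toy data: a step from 0 to 1 between x = 3 and x = 4.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = gradient_boost(xs, ys)
```

Each round shrinks the remaining error by a factor set by the learning rate, which is why small learning rates need more rounds to converge.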
Following best practices in ML (Müller & Guido, 2016; Poldrack et al., 2019), we used 80% of the sample for training and validation and 20% of the sample for testing. Within the training set, we performed k-fold cross-validation (with k = 5 nonoverlapping folds) in alternating nested training sets and validation sets to tune hyperparameters. The remaining 20% served as an independent test set for an unbiased performance estimation. XGBoost has several hyperparameters, including the number of subtrees to retain, maximum tree depth, learning rate, minimum loss reduction required to further partition a leaf node, maximum number of leaves, and regularization weights (Wade, 2020). To choose the best hyperparameter combination, we used Bayesian optimization, an approach with demonstrated applicability to different problem settings (Feurer & Hutter, 2019; Zeng & Luo, 2017). This is an iterative algorithm with two key components: a probabilistic surrogate model and an acquisition function to decide which point to evaluate next. At each step, the next point to explore in the hyperparameter space is the one that maximizes the acquisition function, which combines prior knowledge with uncertainty. As this optimization progresses, the chances of finding a better solution increase. Compared to other techniques, such as grid search (which is undermined by issues of dimensionality) or random search (where each guess is independent of the previous run), the Bayesian optimization algorithm is fast to compute, enabling a thorough optimization of the hyperparameters.
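The data partitioning described above can be sketched as follows. The sketch assumes that the 80/20 holdout is stratified by group, which the source does not state, and it uses hypothetical group sizes of 10 subjects each. Hyperparameter search would then run inside the five folds, with the held-out 20% used only once at the end.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Split subject indices into train and test, keeping class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test


def k_folds(indices, k=5, seed=0):
    """Yield (training, validation) index lists for k nonoverlapping folds."""
    shuffled = indices[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, folds[i]


# Hypothetical labels: six groups with 10 subjects each (not the real sizes).
groups = ["HC", "bvFTD", "CBS", "nfvPPA", "PSP", "svPPA"]
labels = [group for group in groups for _ in range(10)]

train_idx, test_idx = stratified_holdout(labels)
splits = list(k_folds(train_idx))
```

Because the validation folds are disjoint and together cover the training set, every training subject is used for validation exactly once.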
The performance of the classifiers was evaluated through receiver operating characteristic (ROC) curves (Fawcett, 2006), in which the sensitivity (true positive rate) and (1 − specificity) (i.e., false-positive rate) were used as the Y and X axes, respectively. These results were condensed using the area under the ROC curve (AUC) value, representing the probability that a randomly picked subject from the correct group will have a higher score according to the classifier than a randomly picked subject from the incorrect group (Müller & Guido, 2016). The pipeline divided the original dataset into six binary datasets, where the positive class in each of these datasets corresponded to the group of interest, and the negative class was composed of all the other groups. Then, for each dataset, performance was evaluated through the microaverage AUC by averaging classification performance with respect to the labels and samples, thus taking into account the imbalance between classes and producing an unbiased performance metric (Koyejo et al., 2015).
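The one-against-all binarization and the probabilistic reading of the AUC can be made concrete with a short sketch. It computes the AUC directly as the fraction of positive-negative pairs that are ranked correctly, and it obtains a micro-average by pooling the binarized labels and scores of all classes, which is a common convention and is assumed here. The labels and scores are hypothetical.

```python
def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties = 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def one_vs_rest(labels, target):
    """Binarize labels: 1 for the group of interest, 0 for all other groups."""
    return [1 if label == target else 0 for label in labels]


def micro_average_auc(labels, class_scores):
    """Pool binarized labels and scores over all classes, then compute one AUC."""
    pooled_labels, pooled_scores = [], []
    for target, scores in class_scores.items():
        pooled_labels.extend(one_vs_rest(labels, target))
        pooled_scores.extend(scores)
    return auc(pooled_labels, pooled_scores)


# Hypothetical example: four subjects, three classes, one score per class.
labels = ["A", "A", "B", "C"]
class_scores = {
    "A": [0.8, 0.6, 0.3, 0.1],
    "B": [0.1, 0.5, 0.4, 0.2],
    "C": [0.1, 0.1, 0.1, 0.7],
}
auc_a = auc(one_vs_rest(labels, "A"), class_scores["A"])
auc_b = auc(one_vs_rest(labels, "B"), class_scores["B"])
micro = micro_average_auc(labels, class_scores)
```

Pooling weights every subject-class decision equally, so classes with more subjects contribute more pairs to the micro-average than they would to a simple mean of per-class AUCs.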
RESULTS
Functional Connectivity
Depending on the method used, each variant group exhibited heterogeneous functional network alterations. The patients with bvFTD presented multiple changes in their frontal networks, as detected by raw (Figure 2), ROI-to-ROI (Figure 3A), and graph (Figure 4A) connectivity measures. Additionally, clusters in the insula and anterior temporal area were significantly disconnected from distant areas but not from nearby voxels (Figure 2). Temporal regions (ROI-to-ROI) were also less connected (Figure 3A), and cerebellar areas presented improved network organization (Figure 4A). Raw functional connectivity in the CBS group showed an increase with the left prefrontal areas (Figure 2). In contrast, the ROI-to-ROI method detected impaired connectivity of the cerebellum with the frontal regions (Figure 3A), while efficiency and strength graph properties of the orbitofrontal areas and cerebellum increased (Figure 4A). In the nfvPPA group, there was evidence of a decrease in the global correlation between the insula and anterior temporal region in the left hemisphere (Figure 2). Based on the ROI-level analysis, the same areas showed decreased connectivity with occipital and parietal regions (Figure 3A). The graph connectivity analysis showed a diminution of the global efficiency and strength values in the left Heschl gyrus and left superior temporal regions (ROI), with the same metrics increasing in the cerebellum (Figure 4A). Regarding the PSP group, raw functional connectivity showed alterations in the left insula, cerebellum, and inferior and superior temporal gyri (Figure 2 and Supporting Information, Supplementary data 5). ROI-to-ROI connectivity analysis showed increased cerebellar connections with the occipital lobe (Figure 3A). Graph analyses demonstrated better network organization in the bilateral occipital ROIs and a deterioration of network organization in the insula, Heschl’s area, and temporal superior area in the left hemisphere (Figure 4A).
Regarding the svPPA patients, raw functional connectivity analyses evidenced impaired connectivity of the bilateral anterior temporal gyri and the insula (Figure 2). The ROI-to-ROI connectivity showed decreases in multiple connections (intratemporal, tempo-occipital, tempo-central, and fronto-subcortical connections; see Figure 3A). Moreover, the graph connectivity analysis showed increases mainly in global efficiency, degree, and strength metrics in the bilateral frontal, superior, orbital, and cerebellar ROIs (Figure 4A).
Structural Connectivity
Similar to functional connectivity, structural networks evidenced variable results depending on the method employed. The patients with bvFTD showed several altered tracts based on the raw structural connectivity analysis (Figure 5). Similar findings were obtained in the ROI-to-ROI connectivity analysis (Figure 3B), indicating impairments primarily in the fronto-frontal connections but also in the fronto-cerebellar and fronto-central connections. Similarly, all frontal ROIs presented decreased network graph organization (Figure 4B). The CBS group had a similar pattern to the bvFTD group in raw structural connectivity (Figure 5) but with more extended changes across the cerebellar connections (Supporting Information, Supplementary data 5). The ROI-to-ROI connectivity analysis showed decreases in the precentral and frontal superior regions in the right hemisphere (Figure 3B). The analysis of the graph metrics presented diminished efficiency, degree, and strength metrics in the frontal and temporal regions (Figure 4B). In the nfvPPA group, reduced raw structural connectivity was observed in the corpus callosum, frontal aslant tract, superior longitudinal fasciculus, anterior thalamic radiations, superior thalamic radiations, and cerebellar connections (Figure 5 and Supporting Information, Supplementary data 5). ROI-to-ROI connectivity presented decreases between the left frontal and right central areas and between the superior and middle right frontal ROIs (Figure 3B). These ROIs showed reduced strength, while local efficiency and the clustering coefficient decreased in the left precentral and right mid temporal ROIs, respectively (Figure 4B). The patients with PSP presented a similar pattern to those with bvFTD in raw structural connectivity (Figure 5), with numerous additional alterations in the cerebellar connections (Supporting Information, Supplementary data 5). 
Additionally, this group presented disconnections between the right precentral ROI and frontal regions (Figure 3B). The network organization in the frontal, central, and temporal ROIs was reduced, mainly regarding efficiency and strength (Figure 4B). In the svPPA group, the three methods mainly evidenced a loss of connectivity of the temporal lobes with the rest of the brain. All raw connectivity indices showed significant differences in the uncinate fasciculus and cingulum parahippocampal tract (Figure 5 and Supporting Information, Supplementary data 5). ROI-to-ROI connectivity analyses detected a decrease in connectivity of the intrafrontal and intratemporal areas in the right hemisphere (Figure 3B). Global efficiency, degree, and strength values increased in the bilateral thalamic, left insular, and right hippocampal ROIs but decreased in the frontal and cerebellar regions (Figure 4B).
Machine Learning
Multiclass classification.
The multiclass classification among groups based on connectivity data from the raw functional method achieved an AUC of 0.92 (Figure 6A). The model with multimodal data, including raw functional connectivity and cognitive features, reached the highest classification performance (AUC = 0.95) (Figure 6C). The top-ranked features in both models included the amplitude of low-frequency fluctuations and global and local correlation metrics. Furthermore, in the multimodal approach, the MMSE score was the fourth ranked in the feature importance list (Figure 6B). The models with ROI-to-ROI functional connectivity features yielded AUCs of 0.75 and 0.80 when the connectivity data were analyzed individually (Figure 6D) or with the multimodal approach (Figure 6F), respectively. The top features in these models are shown in Figure 6E, in which the MMSE score ranked third in the multimodal model. The model with functional graph connectivity data only yielded an AUC of 0.63 (Figure 6G), a value that was increased in its multimodal counterpart to 0.89 (Figure 6I). Finally, Figure 6H shows the selected features for the graph models and their relative feature importance, where the MMSE score was the most important feature in this multimodal approach.
The model with raw structural data achieved an AUC of 0.84 (Figure 7A), while its multimodal counterpart reached an AUC of 0.85 (Figure 7C). The top features included in these models and their importance are shown in Figure 7B. The MMSE score feature ranked as one of the top features in the multimodal model. In the case of the model with ROI-to-ROI structural connectivity only, an AUC performance of 0.70 was obtained (Figure 7D), a value that increased to 0.80 when incorporating multimodal features (Figure 7F). The feature importance is shown in Figure 7E, where the MMSE score was the second most important feature in the multimodal version. The models with graph connectivity data presented AUC values of 0.73 and 0.83 using connectivity data only (Figure 7G) and when combined with multimodal data (Figure 7I), respectively. The top features of these models are presented in Figure 7H, where the MMSE score was the most relevant feature in this multimodal approach.
Finally, the combination of all connectivity data from both modalities and the three methods for multiclass classification reached an AUC of 0.80 (Figure 8A). When incorporating cognitive features, we obtained an AUC of 0.89 (Figure 8C). The top features are shown in Figure 8B. Of note, the MMSE score ranked fourth in the multimodal approach. Results for the average AUC for each class in the ROC curves showed varying mean performance across modalities and methods (Figures 6, 7, and 8). Nevertheless, performance variability was low on different folds as shown in the confidence intervals. Moreover, feature stability was assessed on nested k-folds during the validation step (see Supporting Information, Supplementary data 4 for details).
Comparison of metrics across modalities, methods, and techniques.
In total, 14 XGBoost data-driven models were used for the multiclass classification of the five variants of FTD and HCs based on the optimal feature sets after recursive optimization. The data were computed individually and combined with all modalities and methods (2 modalities × 3 methods + 1 multimodal). Furthermore, each of the models was calculated twice, with and without demographic and cognitive variables. Based on the average performance indicators (i.e., accuracy, sensitivity, specificity, F1, and AUC; Figure 9 and Supporting Information, Supplementary data 4), the top three performing models were the raw functional multifeatured model, followed by the raw functional connectivity model and the multimodal multifeatured model. To statistically compare the performance results, we employed nonparametric tests to assess statistically significant differences between the ROC curves (Venkatraman, 2000). In this approach, the equality of the curves is analyzed at all operating points, and a reference distribution is generated by permuting the pooled ranks of the test scores for each classification. We found that although the top performing model was the raw functional multifeatured model, the difference between this model and the two that followed (raw functional connectivity and multimodal multifeatured models) was not statistically significant (p > 0.05 in both cases).
To rule out any possible scanner effect, the classifiers were retrained using only data acquired on a single scanner, which accounted for 95.6% of the data (Supporting Information, Supplementary data 4). The AUC values of the classifiers trained on data from one scanner and from two scanners did not differ significantly (all p > 0.05). To check possible biases due to specific brain parcellations, we compared the performance of four of our models (graph connectivity and graph multifeatured in both modalities) using the Human Connectome Project (HCP) atlas (Glasser et al., 2016) with respect to our results with the AAL atlas. Figure 10 shows the performance of ML models based on HCP parcellation with connectivity data only (Figure 10A and D) and multifeatured data (Figure 10C and F). The top features are presented in Figure 10B and E. No significant differences were observed in the microaverage AUC for the functional graph connectivity (statistic = 1.62, p = 0.14), the functional graph multifeatured (statistic = 2.12, p = 0.11), the structural graph connectivity (statistic = 1.58, p = 0.17), or the structural graph multifeatured (statistic = 1.75, p = 0.12) classifiers. These comparisons were based on nonparametric tests to assess statistical differences between the ROC curves (Venkatraman, 2000). Moreover, the most important features were in line with the results obtained with the AAL atlas. Areas from the prefrontal cortex (inferior, dorsolateral, and anterior cingulate regions), temporal lobe (superior and middle regions), and occipital lobe (superior and inferior regions) were evidenced in both parcellations of the functional connectivity classifiers. Similarly, both atlases’ structural classifiers shared the main features, including areas of frontal (inferior region) and occipital lobes.
DISCUSSION
In this study, a simultaneous multiclass categorization of each FTD variant and healthy controls achieved a performance up to an AUC of 0.95. This was accomplished with a multifeatured strategy, where the classifiers combining brain network connectivity and cognitive assessments increased model performance. The multimodal classifiers evidenced the relative importance of specific domains for FTD variant characterization. Through progressive feature elimination, an optimum set of features was obtained by removing redundant and uninformative variables. The results address current calls for robust FTD variant multimodal marker classification. This approach, if further replicated and validated, may be translated into the development of future affordable clinical decision computational tools.
Our framework provided support for proposed hypotheses regarding the multiclass classification of FTD variants based on computational inference. First, we obtained highly accurate simultaneous multiclass classification of each FTD variant relative to other variants and controls after feature optimization. In line with previous studies, the functional ROI-to-ROI models showed alterations in fronto-temporal (Jastorff et al., 2016; Meijboom et al., 2017), intrafrontal (Dopper et al., 2014; Whitwell, 2019), and precuneus-insula (Whitwell et al., 2011b) connectivity. Moreover, the functional graph theory models captured node-degree differences in the left superior occipital area (Agosta et al., 2013; Reyes et al., 2018), left Heschl gyrus (Agosta et al., 2013), and left frontal inferior pars triangularis (Zhou et al., 2012). Regarding the raw structural connectivity models, and in agreement with previous research, we found alterations in the uncinate fasciculus (Agosta et al., 2012; Daianu et al., 2016; Iaccarino et al., 2015; Nguyen et al., 2013), superior longitudinal fasciculus (Agosta et al., 2012; Daianu et al., 2016), corpus callosum (Tovar-Moll et al., 2014), dentatorubrothalamic tract (Whitwell et al., 2011a), and inferior fronto-occipital fasciculus (Meijboom et al., 2017). Second, adding cognitive features (i.e., multifeatured approach) increased the averaged AUC performance metrics across all subject groups. The relevance of adding cognitive features was further evidenced in the feature importance lists of our models, where they ranked in the top four positions across the multifeatured models. Third, although the raw functional models ranked higher than the multimodal approach, the differences in performance were not statistically significant. This may be because the top features of the multimodal approaches were mainly functional connectivity features and adding information from other domains was not relevant for FTD variant characterization. 
Increased model complexity with a limited dataset may induce overfitting (Müller & Guido, 2016). Therefore, model performance may be lower. However, the multimodal approach helps to provide information on specific functional and structural alterations capturing differential patterns in FTD (Agosta et al., 2012; Nguyen et al., 2013; Reyes et al., 2018; Whitwell, 2019).
The feature importance lists from our ML analyses showed that the most relevant features for discriminating between FTD variants were generally in line with previous research. The subjects with bvFTD showed reduced connectivity in the prefrontal, insular, and temporal regions in terms of the functional (Agosta et al., 2013; Jalilianhasanpour et al., 2019; Seeley et al., 2009; Whitwell, 2019) and structural (Dopper et al., 2014; Mahoney et al., 2015; Tovar-Moll et al., 2014; Whitwell, 2019) networks, as well as a reduction in global network degree and efficiency (Filippi et al., 2017; Reyes et al., 2018; Saba et al., 2019; Sedeño et al., 2016). The primary alterations in the subjects with nfvPPA were observed in speech-language regions (Mandelli et al., 2016, 2018), with a predominance in the left hemisphere (Whitwell, 2019). The subjects with svPPA exhibited distinct patterns of disconnection in functional (Agosta et al., 2014; Popal et al., 2020; Whitwell, 2019) and structural (Agosta et al., 2012; Iaccarino et al., 2015; Zhang et al., 2013) connectivity in the temporal lobe. Last, for the subjects with CBS, we found alterations in motor/parietal areas (Tovar-Moll et al., 2014; Whitwell et al., 2014), while the subjects with PSP showed structural alterations in connections encompassing the thalamus (Borroni et al., 2014; Whitwell et al., 2011a). We also detected increased connectivity values in specific FTD variants, for example, in the connections involving the parietal lobe in bvFTD (Meijboom et al., 2017; Whitwell et al., 2011b), the frontal lobe in CBS (Wolpe et al., 2014), occipital lobe in PSP (Whitwell et al., 2011a), and structural thalamic tract connectivity in bvFTD and svPPA. These findings may reflect compensatory mechanisms as a result of the disconnection of critical brain regions specific to each pathology (Jalilianhasanpour et al., 2019; Saba et al., 2019). 
Overall, our ML approach was consistent with previous studies, while allowing the detection of specific alterations in distinct FTD variants with overlapping pathophysiological profiles, avoiding possible methodological biases. Furthermore, we compared, for the first time, the performance reached for multiclass classifications of FTD variants with data from different modalities of connectivity and using different methods.
Our approach provides a comprehensive computational framework that may be used in clinical settings after replication and external validation. Historically, ML research on the categorization of dementia has relied on binary comparisons and atrophy metrics. However, atrophy is associated with late-stage neurodegeneration (Lu et al., 2013; Seeley et al., 2008), while brain connectivity alterations may be present at early stages (Dopper et al., 2014; Meeter et al., 2017). The few studies with multiclass comparisons of FTD variants were conducted only with atrophy metrics (Kim et al., 2019) and tractography (Torso et al., 2020, 2021). Indeed, the literature examining binary comparisons of FTD variants is more extensive, individually assessing atrophy (Bachli et al., 2020; Bisenius et al., 2017; Meyer et al., 2017; Salvatore et al., 2014; Santillo et al., 2013) and functional (Moguilner et al., 2021) and structural (Mahoney et al., 2014; Santillo et al., 2013) connectivity measures. To the best of our knowledge, our approach is the first to enable multiclass model characterizations in a multimodal context. Moreover, this approach outperforms previous attempts for multiclass classification (Kim et al., 2019; Torso et al., 2020, 2021). Thus, our research lays the groundwork for the future creation of a useful clinical computational inference tool.
Limitations and Future Studies
Our work has some limitations. First, despite the larger sample size compared to similar previous studies in the literature (Kim et al., 2019; Torso et al., 2020), the sample was based on a single database. Future research may include multicentric samples from different consortia, with a variety of MRI acquisition protocols to assess the robustness of this method against heterogeneity. Additionally, to test the robustness of our results against sample heterogeneity, data from underrepresented populations with different genetic, demographic, and socioeconomic factors should be included (Ibañez et al., 2021a, 2021b). Second, we lacked histopathological diagnosis confirmation. However, this limitation is shared with previous similar work in the literature (Ma et al., 2020; Manera et al., 2019; Torso et al., 2020; Yu & Lee, 2019), and our approach can be extended to other datasets with histopathological confirmation. Third, our models should be compared in future studies to standard biomarkers such as PET and plasma indicators to evaluate potential synergistic biomarker combinations. Fourth, although the cognitive test employed in this study was the MMSE, additional specific assessments for FTD are available, such as particular executive function and language tasks (Custodio et al., 2016; Kramer, Alioto, & Kramer, 2020; Torralva et al., 2009; Younes & Miller, 2020), which may be added to the model. Fifth, despite the AAL being one of the most widely used atlases in dementia research (Elsheikh et al., 2021; Ibañez et al., 2021a, 2021b; Lee et al., 2018; Liu et al., 2012; Lord et al., 2016; Reyes et al., 2018; Saba et al., 2019; Sedeño et al., 2016, 2017), future research may compare classification performance across different brain parcellations in the dementia population.
As a starting point, we compared the AAL and the HCP atlas (Glasser et al., 2016) parcellations on a representative subset of models (graph connectivity and graph multifeatured in both modalities), and we did not find significant differences in the AUC across groups. Moreover, the pathophysiological profile evidenced in our feature importance analysis was similar. Models using both atlases prioritized hallmark-affected areas for FTD, such as the inferior and dorsolateral prefrontal cortex, anterior cingulum, and middle and superior temporal areas (Filippi et al., 2013; Meeter et al., 2017; Whitwell, 2019; Whitwell et al., 2011b; Zhang et al., 2009). Furthermore, the AAL’s models also prioritized other areas with extensive evidence of affectation in FTD, such as the insula and precuneus (Agosta et al., 2013; Baez et al., 2019; Popal et al., 2020; Reyes et al., 2018; Whitwell et al., 2011b). The selection of parcellation must rely on quantifiable factors such as reproducibility, clustering validity metrics, multimodal comparisons, and network analysis (Arslan et al., 2018). Importantly, no single parcellation consistently outperforms the others across all evaluation criteria (Arslan et al., 2018). Thus, by considering the various studies that used the AAL atlas in dementia research, even with ML methods (Asim et al., 2018; Bachli et al., 2020; Castellazzi et al., 2020; Hao et al., 2020; McMillan et al., 2014; Park et al., 2022; Sedeño et al., 2017), our choice of parcellation was based on the reproducibility criteria. A systematic comparison of multiple atlases across the different modalities is beyond the scope of this work. Finally, most of the patients in this study were in the early to moderate stages of the disease. Future longitudinal studies may determine the value of our approach for monitoring disease progression.
Conclusions
We developed a multiclass characterization of FTD variants combining hundreds of functional and structural network features, as well as demographic and cognitive variables. In contrast to previous studies, we optimized the variable space by eliminating uninformative features resulting in a highly accurate FTD variant characterization. This approach can help in the future development of clinical decision support tools aimed at detecting specific affectations in the context of overlapping neurodegenerative diseases.
ACKNOWLEDGMENTS
We thankfully acknowledge the participation of patients and controls, as well as the support of the patients’ families. Data used in preparation of this article were obtained from the Frontotemporal Lobar Degeneration Neuroimaging Initiative (FTLDNI) database (https://4rtni-ftldni.ini.usc.edu/). The investigators at NIFD/FTLDNI contributed to the design and implementation of FTLDNI and/or provided data but did not participate in analysis or writing of this report.
SUPPORTING INFORMATION
Supporting information for this article is available at https://doi.org/10.1162/netn_a_00285. Datasets are available in their own online repository: Neuroimaging In Frontotemporal Dementia (NIFD/LONI). The code for the data analysis of this study is available from the corresponding author on reasonable request.
AUTHOR CONTRIBUTIONS
Sebastian Moguilner: Conceptualization; Formal analysis; Investigation; Methodology; Software; Supervision; Visualization; Writing – original draft; Writing – review & editing. Raul Gonzalez-Gomez: Conceptualization; Data curation; Formal analysis; Investigation; Software; Validation; Visualization; Writing – original draft; Writing – review & editing. Agustín Ibañez: Conceptualization; Funding acquisition; Methodology; Supervision; Writing – original draft; Writing – review & editing.
FUNDING INFORMATION
Agustín Ibáñez, Takeda Pharmaceutical Company (https://dx.doi.org/10.13039/100008373), Award ID: CW2680521. This work is partially supported by grants from CONICET; ANID/FONDECYT Regular (1170010); FONCYT-PICT 2017-1820; Sistema General de Regalías (BPIN2018000100059), Universidad del Valle (CI 5316); Alzheimer’s Association GBHI ALZ UK-20-639295; and the Multi-Partner Consortium To Expand Dementia Research In Latin America: ReDLat, supported by National Institutes of Health, National Institutes of Aging (R01 AG057234), Alzheimer’s Association (SG-20-725707), Rainwater Charitable Foundation - Tau Consortium, and Global Brain Health Institute.
TECHNICAL TERMS
- Multiclass classification:
Classifying positive class label subjects against a negative class comprising several categories.
- eXtreme Gradient Boosting:
An optimized distributed library that implements Gradient Boosting machine learning algorithms providing parallel tree boosting.
- Multifeature:
Combining neuroimaging features with cognitive features.
- Raw connectivity:
Brain connectivity analyzed at the voxel’s BOLD time series level in the case of functional connectivity, and at the tract level for structural connectivity.
- Multimodal:
Combining several neuroimaging methods.
- Microaverage AUC:
Micro-average metrics aggregate the contributions of all classes to compute the average AUC metric. This method is preferred when class imbalance is present.
REFERENCES
Author notes
Competing Interests: The authors have declared that no competing interests exist.
First authors.
Handling Editor: Mikail Rubinov