Abstract
The UK Biobank study has produced thousands of brain imaging-derived phenotypes (IDPs) collected from more than 40,000 genotyped individuals so far, facilitating the investigation of genetic and imaging biomarkers for brain disorders. Motivated by efforts in genetics to integrate gene expression levels with genome-wide association studies (GWASs), recent methods in imaging genetics adopted an instrumental variable (IV) approach to identify causal IDPs for brain disorders. However, several methodological challenges arise with existing methods in achieving causality in imaging genetics, including horizontal pleiotropy and high dimensionality of candidate IVs. In this work, we propose testing the causality of each brain modality (i.e., structural, functional, and diffusion magnetic resonance imaging (MRI)) for each gene as a useful alternative, which offers an enhanced understanding of the roles of genetic variants and imaging features on behavior by controlling for the pleiotropic effects of IDPs from other imaging modalities. We demonstrate the utility of the proposed method by using Alzheimer’s GWAS data from the UK Biobank and the International Genomics of Alzheimer’s Project (IGAP) study. Our method is implemented using summary statistics, which is available on GitHub.
1 Introduction
Studies indicate that dementia, including Alzheimer’s disease (AD), is highly heritable, and genetic factors are estimated to play a role in around 80% of AD cases (Gatz et al., 2006; Tanzi, 2012). However, research on dementia still has significant room for developing genetics–brain pathways. Although genome-wide association studies (GWASs) have identified a number of genetic risk factors for AD, they are limited in testing for a direct association between the variants and the disease trait. GWASs do not provide a comprehensive understanding of how the genetically regulated structural and functional brain pathways drive AD progression, which is essential for characterizing the genetic mechanism of the disease.
The growing interest in the role of genetics in brain disorders has spawned a new field known as imaging genetics that integrates neuroimaging features (e.g., from brain magnetic resonance imaging (MRI)) with genetics using statistics and machine learning. Many neuroimaging features, including the brain’s anatomy (structural MRI (sMRI)), function (functional MRI (fMRI)), and microstructure (diffusion MRI (dMRI)), have been demonstrated to be heritable. For example, Lin et al. (2014) and Huang et al. (2015) use imaging features as GWASs traits to identify the associations between the imaging features and single nucleotide polymorphisms (SNPs) or a set of SNPs (e.g., a gene). At the same time, in recent years, there has been substantial evidence in the studies of brain disorders that brain MRI features, that is, imaging-derived phenotypes (IDPs), are useful biomarkers for AD and are modulated by genetic factors (Elliott et al., 2018). By integrating data from imaging scans and GWASs, researchers have attempted to better understand the heritability of AD and other brain-related diseases. Such integrative analyses have been promising in advancing imaging neuroscience that paces with an increased size of genetic and neuroimaging data (Shen & Thompson, 2019). However, due to its high dimensionality (e.g., thousands of IDPs available from UK Biobank (Elliott et al., 2018; Smith et al., 2021)) and complex correlation structure (e.g., correlations between imaging features), it is difficult to fully leverage brain imaging data in GWASs, especially in pinpointing causal imaging features affecting the heritability of brain-related diseases (Burgess et al., 2017).
The transcriptome-wide association study (TWAS), which leverages expression quantitative trait loci (eQTL) with GWAS, has become an important statistical tool in genetics research (Gamazon et al., 2015). TWAS identifies genes associated with diseases/phenotypes through genetic regulation of gene expression. TWAS is constructed by first using SNPs in a gene to predict the corresponding gene expression levels via machine learning methods (e.g., penalized regression), which represents the “genetically regulated” (or “genetically imputed”/“genetically predicted”) component of gene expression. The predicted gene expression levels are then associated with selected outcomes, adjusting for multiple comparisons. In neuroimaging literature, the imaging-wide association study (IWAS) extended TWAS by considering multiple IDPs as intermediate phenotypes rather than gene expression, which may offer improved mechanistic interpretation when studying brain disorders such as AD (Xu et al., 2017). These approaches showed significant improvement in localizing genes associated with the phenotype compared with GWAS.
In addition, TWAS offers a statistical interpretation from the lens of causal inference via instrumental variable analysis. Specifically, TWAS is equivalent to testing a causal relationship from eQTL to phenotype using 2-stage least squares (2SLS) by using SNPs as instrumental variables (Mai et al., 2023; Xue et al., 2020). In neuroimaging, BrainXcan takes this approach to identify IDPs leading psychiatric traits under the assumptions of Mendelian Randomization (MR) (Liang et al., 2025). The multivariate IWAS (MV-IWAS) extends the causal interpretation of TWAS in imaging genetics by accounting for horizontal pleiotropic effects of other IDPs (Knutson et al., 2020). We refer to Taschler et al. (2022) for a comprehensive review of conducting causal inference with neuroimaging data using MR. A key consideration in these approaches in imaging genetics is the high dimensionality of the IDPs. For example, the UK Biobank provides thousands of IDPs, which may increase dramatically based on preprocessing pipelines and analysis goals (e.g., ROI-level analysis versus whole-brain analysis). Because it increases the burden of multiple comparisons, BrainXcan and MV-IWAS consider the whole genome as potential instrumental variables and use polygenic scores (PGSs) to implement the first stage of 2SLS. However, such approaches would lose the original purpose of TWAS, which was to localize gene-level associations with diseases/phenotypes.
This paper aims to mitigate current challenges in making causal interpretations in imaging genetics and localizing corresponding genes. Rather than evaluating each IDP individually, we propose to make causal interpretations between diseases/phenotypes and each imaging modality (e.g., sMRI, fMRI, or dMRI). Each modality represents distinct aspects of brain structure and function, and therefore, aggregating the brain information in each modality can contribute to a more comprehensive understanding of the genetic mechanisms underlying brain disorders. At the same time, our method explicitly accounts for horizontal pleiotropy among imaging modalities, therefore, providing robustness of the causal direction between each modality and the progression of AD. With the MRI modalities through which genetics affect brain function in AD, we can better understand the underlying biological mechanism by which genetics causes changes in brain function, which in turn causes AD. We also make our methods implementable using GWAS summary statistics and reference panels, enhancing the practical utility.
The rest of the paper is organized as follows. In Section 2, we provide a methodological review of the statistical methods in imaging genetics that extended TWAS and develop the proposed method, including the implementation with summary statistics. In Section 3, we evaluate Type 1 error rate and power of the proposed method by simulations. Section 4 applies our methods to the AD GWAS data obtained by the UK Biobank and the International Genomics of Alzheimer’s Project (IGAP). A few points of discussion are made in Section 5.
2 Methods
2.1 Notations
We let denote the phenotype of our interest (e.g., AD status) from subjects. Also, let be the collection of the th SNP across subjects in a gene for . Similarly, we let be the th IDP for . Without loss of generality, we assume that each and is standardized to have the zero mean and unit variance.
2.2 Review of existing methods
We first review transcriptome-wide association study (TWAS) and relevant approaches in neuroimaging.
2.2.1 TWAS and UV-IWAS
With a single endophenotype , TWAS is a method that integrates endophenotypes (e.g., gene expression) in gene-based association study to test the genetically regulated effect of the endophenotype on the disease trait (Gusev et al., 2016). TWAS is conducted using a two-stage approach. In Stage 1, using a set of SNPs encoded in a gene , machine learning methods (e.g., linear models) are used to predict a corresponding gene expression level () based on the following model:
Here, is obtained by fitting a penalized linear regression with the ridge (Liang et al., 2025), LASSO, or elastic net penalty (Gusev et al., 2016). Then, a genetically imputed expression, , is used to test for its association with the phenotype () in Stage 2 working model:
where is the canonical link function (e.g., the logistic link for binary traits and the identity link for for quantitative traits). TWAS tests to identify genes whose genetically regulated gene expression is associated with the trait. UV-IWAS is conceptually equivalent to TWAS (Knutson et al., 2020), using an imaging-derived phenotype (IDP) instead for gene expression as an endophenotype. One insight of UV-IWAS is that AD is primarily a brain disorder, and using IDPs may be more effective than gene expression in AD studies.
TWAS (or UV-IWAS) provides a useful framework for understanding how genetic variation influences disease by modeling intermediate endophenotypes, such as gene expression or IDPs, which help capture key aspects of biological function at the transcriptomic or neuroanatomical level. It also offers several possible statistical interpretations. One interpretation is that it is essentially a weighted burden test for gene-based association testing where the weight for the th SNP in a gene is determined by in Stage 1. Because an optimal choice of weights would lead to improved power in the burden test, the choice of weights in TWAS, which is informed biologically, is expected to improve statistical power. Another interpretation is that it can be interpreted as an instrumental variable (IV) approach, where the 2-stage least squares (2SLS) is used to estimate and infer the causal effect of endophenotypes () on the phenotype (). In such a case, the SNPs that led the causal direction serve as composite IVs, provided that the following assumptions hold.
A1. There exist correlations between all SNPs selected in Stage 1 and the endophenotype;
A2. SNPs are uncorrelated with potential confounders;
A3. SNPs do not affect the disease trait, except through the endophenotype/IDP being tested.
A3 indicates that the SNPs do not affect the disease trait via genetic direct effect. Since most SNPs are unrelated to endophenotypes, certain thresholds are typically used in TWAS Stage 1, such as predictive or value, to ensure that A1 is met.
2.2.2 Multivariate IWAS (MV-IWAS)
A unique challenge in imaging genetics is that there are multiple IDPs each may have distinct genetic regulatory pathways and potential pleiotropic effects to AD. However, multiple significant IDPs have been identified (Xu et al., 2017); as discussed by Knutson et al. (2020) and Elliott et al. (2018), the presence of horizontal pleiotropy can lead to violations of the IV assumptions, resulting in invalid causal inference using univariate IV analysis and ultimately inflating Type I error in the hypothesis testing for causal effects from an IDP to a phenotype, especially when there is high correlation between the IDPs.
To address the potentially inflated Type I error rate due to pleiotropy, Knutson et al. (2020) proposed using multiple linear regression in Stage 2 to adjust for possible pleiotropic effects of genetically regulated IDPs from gene to disease. If there are imaging features in total, MV-IWAS considers the working model the second stage:
Stage 1 of MV-IWAS is equivalent to Stage 1 of TWAS/UV-IWAS, where each IDP is predicted separately by the same set of SNPs across all chromosomes. In Stage 2, a general linear model (GLM) is applied using all predicted endophenotypes, replacing the univariate regression in TWAS/UV-IWAS. In MV-IWAS, whether the th IDP causes the phenotype is achieved by testing in the Stage 2 model, which controls the pleiotropic effects of the other IDPs. Furthermore, to account for remaining genetic direct effects, Knutson et al. (2020) extended the Stage 2 model to MV-IWAS-Egger to account for direct genetic effects on the phenotype. The Stage 2 model of MV-IWAS-Egger is formulated by
with the same null hypothesis to test the causal effect of the th IDP on the phenotype. We note that, from its construction, the power of MV-IWAS and MV-IWAS-Egger is severely affected by the multicollinearity of the s (and for MV-IWAS-Egger) (Burgess et al., 2017).
2.2.3 Remaining challenges
While MV-IWAS provides a possible causal interpretation, it could be underpowered due to the following factors:
1. Multicollinearity: Multicollinearity arises when two genetically regulated components of the IDPs (e.g., and ) are highly correlated, leading to a challenge in point estimation (i.e., an increase in the standard error) in Stage 2 of MV-IWAS and a decrease in statistical power. We refer to Chan et al. (2024) for the challenges with applying Mendelian randomization in case of high dimensional exposures (IDPs).
We empirically evaluated the multicollinearity by computing the genetic correlations of the IDPs. Specifically, we performed Linkage Disequilibrium Score Regression (LDSC) analyses (Duncan et al., 2018) using UKB IDP GWAS summary statistics available from Smith et al. (2021). For computational reasons, we included top 50 features with highest heritability estimates from different imaging modalities. As shown in Figure 1, 28.5% of sMRI and 60.3% of dMRI within modality IDP pairs show genetic correlation greater than , which implies that including such highly correlated IDPs in MV-IWAS could suffer from power loss.
Genetic correlations among the top 50 heritable IDPs (according to Smith et al. (2021)) from structural MRI (sMRI), diffusion MRI (dMRI), and functional MRI (fMRI). The correlations were estimated using LDSC, using all SNPs across all chromosomes with summary statistics from UKB IDP GWAS and reference panel from 1000G, with a minor allele frequency (MAF) filter of 1%. IDPs within each modality were reordered by hierarchical clustering for better visualization.
Genetic correlations among the top 50 heritable IDPs (according to Smith et al. (2021)) from structural MRI (sMRI), diffusion MRI (dMRI), and functional MRI (fMRI). The correlations were estimated using LDSC, using all SNPs across all chromosomes with summary statistics from UKB IDP GWAS and reference panel from 1000G, with a minor allele frequency (MAF) filter of 1%. IDPs within each modality were reordered by hierarchical clustering for better visualization.
2. Multiple testing: Multiple testing becomes an issue when multiple highly correlated IDPs are associated with a gene, increasing the number of comparisons in Stage 2 and making a stringent Bonferroni correction necessary, which can substantially reduce statistical power.
2.3 Proposed method: testing modality-specific pathways to AD
To mitigate potential power loss in analyzing multiple IDPs in an IV framework, we propose testing all IDPs in each MRI modality as a unitary entity while controlling for pleiotropic effects from other modalities. Referring back to Figure 1, we largely observed high genetic correlations within each MRI modality, including such high correlated IPDs in MV-IWAS can compromise inference accuracy, which motivated us to test all IDPs at the modality level. Besides the high within-modality correlations, we also noted moderate cross-modality correlations, for example, between sMRI and dMRI IDPs (Fig. 1), indicating a need to adjust for cross-modality pleiotropic effects. Beyond statistical considerations, grouping IDPs by modality is also scientifically informative: each MRI modality characterizes distinct aspects of brain structure and function, and, therefore, aggregating the brain information in each modality can contribute to a more comprehensive understanding of the genetic mechanisms underlying neurological diseases.
To the end, we specifically aim to answer the question: how does a gene lead to AD progression through which imaging modality? To handle possible bi-directional genetic effects, for example, bidirectional effect of dMRI IDPs in Figure 1, we propose MV-Modality-IWAS (multivariate modality global testing in IWAS), which bridges the extremes of testing each IDP individually in MV-IWAS (Knutson et al., 2020) and testing all IDPs together through a global testing in IWAS (Xu et al., 2017). By testing all IDPS within each MRI modality, we can assess the genetic regulatory effects of entire sets of imaging features, rather than individual IDPs, which may capture a range of subtle and complex brain processes. Ultimately, this methodology has the potential to uncover novel genetic variants or regions linked to brain structure and function, providing new insights into the underlying mechanisms of AD. A graphical illustration is provided in Figure 2.
An illustration of a data-generating model for simulation studies, using a directed acyclic graph (DAG) representation. Different colors in IDP denote different imaging modalities, where the green one is the modality of our interest.
An illustration of a data-generating model for simulation studies, using a directed acyclic graph (DAG) representation. Different colors in IDP denote different imaging modalities, where the green one is the modality of our interest.
Our Stage 2 model considers the testing for each gene from the following working model:
Equation (1) provides an interpretable decomposition of genetic pathways to phenotype: (i) the direct effect (adjusted by ) and (ii) genetically regulated effects (modeled by ) through the th imaging modality. With the assumed model, we test whether the specified modality contributes to the phenotype , that is, test the existence of casual modality-specific genetic pathways.
We propose to test the genetically regulated effects of a given modality using score-based testing while adjusting genetically regulated effects from other modalities, as well as residual genetic direct effects, as fixed effects, with the test statistic ,
where , and s are estimated under the null hypothesis. Under the null hypothesis, asymptotically follows a mixture distribution, which is approximated by a single non-central chi-square distribution with matching moments (Davies, 1980). In the proposed modality-level testing, the fixed effects of genetically regulated effects from other modalities are adjusted, similarly to the format of the Sequence Kernel Association Test (SKAT) (Wu et al., 2011). The proposed modality-level testing is motivated here well because (i) each IDP contributes only a small proportion to the progression of AD and (ii) the directions (sign) of the effects are expected to be non-uniform. In this scenario, different IDPs may drive AD in varying effect directions and magnitudes, in which SKAT is known to be powerful in this context. We note, however, that other test statistics could also be considered, such as score-based tests (Pan et al., 2014) (when accounting for sparsity in ), or -type tests (Knutson et al., 2020) (under the normality assumption for ).
The proposed test is an extension of several previous methods including UV-IWAS and multivariate model in MV-IWAS. If only one IDP in the modality being tested is included, it is equivalent to UV-IWAS. However, the proposed test differs from UV-IWAS in that it adjusts for other modalities, which could lead to horizontal pleiotropy. If the modality has one feature only, the proposed test is also equivalent to MV-IWAS or MV-IWAS-Egger. Our proposed method can be viewed as an extension of previous methods but can handle the challenges of testing genetically regulated effects of imaging features in the modality of interest while adjusting for genetic direct effects and effects from other modalities.
2.4 Implementation using GWAS summary statistics
Our method can be implemented using reference GWAS summary statistics, making it more useful in practice. For simplicity in notations, we define and let be a coefficient matrix for the modality (of the interest) from the Stage 1 working model for and . We define for all other modalities, including direct effects.
Then the Stage 2 model is summarized by testing in
and the test statistic in Equation (2) is equivalent to
Here,
where is the vector of z-scores of the GWAS between the phenotype and a gene, and is the pairwise Pearson correlation matrix of SNPs in the gene, following Kwak and Pan (2016). Note that could also be approximated by a reference panel (e.g., 1000 Genomes Project (The 1000 Genomes Project Consortium, 2010)). Similarly, we get
Since follows a multivariate normal distribution with the zero mean under the null hypothesis, it is sufficient to use plug-in estimate to derive the null distribution of (mixture ). We use the CompQuadForm R package to obtain the value when and use the Monte Carlo method to obtain the value otherwise to improve suboptimal Davies approximation methods to the mixture distribution of the test statistic.
We note that our summary statistics approach implicitly assumes that the phenotype of interest is continuous, which is violated when working with binary disease status. However, such “linearization” is a well-accepted method when the effect sizes of the covariates are tiny, which is also the case when evaluating the effects of imputed IDPs to AD (Knutson et al., 2020; Xue et al., 2020). To evaluate the robustness, our simulation studies in Section 3 are implemented by the summary statistics approach in this section, while the logistic model is used. The R functions to implement the proposed summary statistics approach are available at https://github.com/junjypark/MV_Modality_IWAS.
3 Simulation Studies
3.1 Simulation designs
We conducted simulation studies to evaluate Type 1 error rates and statistical power under linear model assumptions in both Stages 1 and 2, extending simulations of Knutson et al. (2020). We generated subjects and used SNPs with their correlation structures obtained from a gene on chromosome 16. The SNP data were generated by using the rmvbin function from the bindata R package.
Two imaging modalities were generated (), each with 10 features. A graphical model in Figure 2 was used to generate simulated imaging features and binary phenotypes. In our simulations, all SNPs were assumed to be associated with all imaging features. There is an unobserved confounding variable which is used to generate imaging features and the phenotype (but not used to compute values). Therefore, the imaging features were generated by
where is the residual noise. The binary phenotype was generated via (i) effects from imaging features, (ii) direct effects from genotypes, and (iii) effects from the unobserved confounding variable. Specifically,
The parameters in our simulation studies were chosen to reflect gene-level effects on imaging features and phenotypes. Specifically, we used , , , , and . The variance of noise of the IDPs () and the continuous trait were set to be and , respectively. In both continuous and binary traits, the average for each of (i) genotype–phenotype and (ii) genotype–IDP was set to be no larger than 0.10 on average under the null model. The average for IDP–phenotype was also set to be no larger than 0.10 in the continuous trait setting. Also, in binary traits, the prevalence of AD was approximately 18%.
For the power analysis, we consider two different scenarios. In Scenario 1, all IDPs in the first imaging modality cause the trait and where controls the degree of signals. In Scenario 2, only the first IDP was causal and the remaining IDPs were not (i.e., .
Denoting our method as Method 1 (Proposed), we compare our methods with three competitors. Method 2 (IDP-specific) is the univariate version of the proposed method, where the pleiotropic effect of the other modalities is adjusted but each feature of the modality of interest is tested separately. The global Type 1 error is controlled by applying Bonferroni correction (i.e., for 10 features in the modality). Method 3 (Proposed unadjusted) is our method without adjusting for the pleiotropic effect of the other modalities. Method 4 (UV-IWAS) is UV-IWAS (i.e., massive-univariate analysis) without adjusting for the pleiotropic effect, with the Bonferroni correction applied.
Since we implemented our data analysis with summary statistics only, we converted each simulated data into necessary summary statistics. For binary traits, GWAS summary statistics were obtained under the true model (e.g., logistic regression), but the assumed stage 2 model was linear regression (Eq. (3)). Each simulation was based on 5,000 replicated datasets.
3.2 Simulation results
We first evaluated the empirical Type 1 error rates of the methods when the first imaging modality does not have a causal effect on the trait. We considered both 5% and 1% as the theoretical rates. The results shown in Figure 3 imply that methods 1 and 2, which consider the pleiotropic effects, controlled Type 1 error rates well, in a slightly conservative manner. Method 2 was more conservative than our methods. It could be explained because each univariate analysis is already conservative (as discussed in Knutson et al. (2020)) and the Bonferroni correction would make it more conservative. Methods 3 and 4 constantly produced inflated Type 1 errors as they do not adjust for pleiotropic effects in their models. Therefore, we only considered methods 1 and 2 in subsequent power evaluations.
Type 1 error rates of different methods considered in this paper. The red horizontal line denotes the theoretical Type 1 error rates. The proposed method controls Type 1 error rate at the nominal rates of 5% and 1% in both continuous and binary traits. Methods 3 and 4 suffered from inflated Type 1 error rates as they do not consider pleiotropic effects. For visualizations, the y-axis of the plot was truncated at 0.2 when 1% Type 1 error is applied.
Type 1 error rates of different methods considered in this paper. The red horizontal line denotes the theoretical Type 1 error rates. The proposed method controls Type 1 error rate at the nominal rates of 5% and 1% in both continuous and binary traits. Methods 3 and 4 suffered from inflated Type 1 error rates as they do not consider pleiotropic effects. For visualizations, the y-axis of the plot was truncated at 0.2 when 1% Type 1 error is applied.
The summaries of power evaluations are provided in Figure 4. We considered Type 1 error rate of 5% when computing power in each scenario. The results were qualitatively similar for both continuous and binary traits. As expected, when all IDPs in the modality of interest have causal effects on the phenotype (“dense signal”), our method achieved higher power than Method 2, and the power increased as the degree of signal increased. Also, since we only used 10 IDPs from each modality, it is expected that the differences between the methods would be more prominent when signals are dense and the number of IDPs increases. When only one IDP had a causal effect on the phenotype, then the power between the two methods was similar. Although the latter result is unexpected because it is a setting that the min method is expected to perform better, it might be explained by the strict Bonferroni correction applied in Method 2 (which was shown in the Type 1 error simulation).
Summary of power analysis. specifies the degree of signals where IDPs in the first modality cause the phenotype. “Dense signal” refers to the case where all IDPs in the first modality cause the phenotype, and “sparse signal” refers to the case where only one of the IDPs in the first modality is causal.
Summary of power analysis. specifies the degree of signals where IDPs in the first modality cause the phenotype. “Dense signal” refers to the case where all IDPs in the first modality cause the phenotype, and “sparse signal” refers to the case where only one of the IDPs in the first modality is causal.
4 Data Analysis
4.1 Data availability
4.1.1 UK Biobank: GWAS IDP summary statistics
We analyzed 1,436 T1-weighted sMRI features and 690 dMRI features (the IDs and descriptions of these features are available in the Github repository) and 210 resting-state functional connectivity metrics from the rfMRI full correlation matrix (data field: 25750). As of 2020, UK Biobank collected approximately 40,000 subjects whose genotype and brain imaging data are both available. Among them, the GWAS summary statistics for each IDP (after adjusting for age, sex, 40 genetic principal components, scanning sites, head size, etc) is provided by Oxford Brain Imaging Genetics Server - BIG40 (https://open.win.ox.ac.uk/ukbiobank/big40/, led by Dr. Lloyd T. Elliott and Dr. Stephen Smith) (Elliott et al., 2018; Smith et al., 2021). The sMRI IDPs are categorized to regional and tissue volume (636 IDPs), cortical area (372 IDPs), cortical thickness (306 IDPs), cortical gray–white contrast (70 IDPs), regional and tissue intensity (41 IDPs). Also, the dMRI IDPs are categorized to WM tract FA (75 IDPs), WM tract MO (75 IDPs), WM tract diffusivity (300 IDPs), WM tract ICVF (75 IDPs), WM tract OD (75 IDPs), WM tract ISOVF (75 IDPs). Genotypes were imputed using the Haplotype Reference Consortium (HRC) as a reference panel (Bycroft et al., 2018).
We used the processing guidelines provided by Knutson et al. (2020) to choose SNPs to be used in our analysis. We used GWAS summary statistics on 2,310 heritable IDPs from 39,691 participants following (Smith et al., 2021), and used 1000G as reference genome for LD clumping. We employed a clumping radius of 1 M and set an cutoff value of 0.5, as recommended by Privé et al. (2019). This approach allowed us to remove loci with high linkage disequilibrium (LD) and keep the most significant SNP in each clump window based on the IDP GWAS results. When dealing with SNPs for a specific gene that are identified from different IDP GWAS clumps, we merged the SNP sets and treated them as a single SNP set for the gene. We refer to https://github.com/kathalexknuts/MVIWAS for more detailed procedures. Then, in our analysis, we only considered variants whose missing rate is less than 1%, and imputed all missing values with median values for each variant. Following Gusev et al. (2016), we considered 1 Mb window around each gene to collect SNPs. Then, for each gene, we fitted a general linear model (GLM) to each preadjusted imaging feature to obtain weights. Following implications from Knutson et al. (2020), we only considered gene–IDP pairs whose values from the test are less than .
After applying LD clumping, the marginal GWAS vector for each gene was converted to regression coefficients for SNPs IDP (i.e., ) by first transforming the marginal (univariate) GWAS score with (marginal regression coefficient) and multiplying it with the inverse of obtained from 1000 Genomes project.
4.1.2 UK Biobank: GWAS AD summary statistics
In UK Biobank, various health statuses, including AD, have been phenotyped by ICD 10 codes (data field: 41270). The GWAS summary statistics for AD is provided by the Neale laboratory (https://www.nealelab.is/uk-biobank), where genotypes are imputed by HRC, UK10K, and 1000 Genomes. The summary statistics were obtained after adjusting for (inferred) sex, age, age2, sex age, sex age2, and 20 genetic principal components. Among 361,194 genotyped subjects, there were approximately 4,000 subjects diagnosed with AD.
4.1.3 IGAP: GWAS AD summary statistics
International Genomics of Alzheimer’s Project (IGAP) (Lambert et al., 2013) is a large two-stage study based upon genome-wide association studies (GWAS) on individuals of European ancestry. In stage 1, IGAP used genotyped and imputed data on 7,055,881 SNPs to meta-analyze four previously published GWAS datasets consisting of 17,008 Alzheimer’s disease cases and 37,154 controls (The European Alzheimer’s disease Initiative – EADI the Alzheimer Disease Genetics Consortium – ADGC The Cohorts for Heart and Aging Research in Genomic Epidemiology consortium – CHARGE The Genetic and Environmental Risk in AD consortium – GERAD). In stage 2, 11,632 SNPs were genotyped and tested for association in an independent set of 8,572 Alzheimer’s disease cases and 11,312 controls. Finally, a meta-analysis was performed combining results from stages 1 and 2. GWAS summary statistics for gene–AD associations were obtained from https://www.niagads.org/datasets/ng00036 based on 17,008 AD cases and 37,154 healthy controls, after adjusting for sex, age, and (at least) 4 genetic principal components (Lambert et al., 2013).
4.2 Analysis results
4.2.1 IGAP: instrumental genes and causal IDPs using IGAP GWAS AD summary statistics
As shown in Figure 5, the instrument genes identified by MV-Modality-IWAS were found on chromosomes 8 (4 genes, 9 genes) and 19 (39 genes, 38 genes) for sMRI and dMRI, respectively, but none was found in fMRI, at the Bonferroni-adjusted significance level of . Among the genes on chromosome 8, 3 instrumental genes are shared between dMRI and sMRI, 37 instrumental genes are shared between dMRI and sMRI. No instrumental genes were identified for fMRI as expected, as the fMRI Independent Component Analysis (ICA) was performed at 25 dimensions, summarizing major patterns of brain activity. This approach of data summarizing may result in low () IDP heritability explained by genotype (Smith et al., 2021), which affects the Stage 1 model and subsequently then led to insignificant results in the Stage 2 testing.
MV-Modality-IWAS analysis results using AD GWAS summary statistics from IGAP and UKB. Panels A–C show -values for testing causality of each modality when adjusting for pleiotropic effects of other modalities and using each gene as instrumental variables. Panel D shows Venn diagram of the statistically significant genes identified in A–C, summarized by chromosomes 8 and 19. The analysis was performed at a Bonferroni-adjusted significance level, red dashed lines in panels A, B, and C represent the global significant threshold, that is, ; red dashed lines in panels A, B, and C represent the modality-specific significant threshold, that is, , and , respectively. Panels E–H show the results when using UK Biobank AD GWAS summary statistics. Still, UKB analysis was performed at a Bonferroni-adjusted significance level, red dashed lines in panels E, F, and G represent the global significant threshold ; blue dashed lines in panels E, F, and G represent the modality-specific significance level, that is, , and , respectively.
MV-Modality-IWAS analysis results using AD GWAS summary statistics from IGAP and UKB. Panels A–C show -values for testing causality of each modality when adjusting for pleiotropic effects of other modalities and using each gene as instrumental variables. Panel D shows Venn diagram of the statistically significant genes identified in A–C, summarized by chromosomes 8 and 19. The analysis was performed at a Bonferroni-adjusted significance level, red dashed lines in panels A, B, and C represent the global significant threshold, that is, ; red dashed lines in panels A, B, and C represent the modality-specific significant threshold, that is, , and , respectively. Panels E–H show the results when using UK Biobank AD GWAS summary statistics. Still, UKB analysis was performed at a Bonferroni-adjusted significance level, red dashed lines in panels E, F, and G represent the global significant threshold ; blue dashed lines in panels E, F, and G represent the modality-specific significance level, that is, , and , respectively.
For dMRI, MV-Modality-IWAS identified 71 unique causal IDPs with instrumental genes on chromosomes 8 and 19, where 44 IDPs fall in the category of weighted-mean (WM) tract diffusivity, 6 IDPs were WM tract fractional anisotropy (FA), 15 IDPs falls in the category of WM tract intracellular volume fraction (ICVF), 3 IDPs fall in the category of WM tract isotropic or free water volume fraction (ISOVF), and 3 IDPs fall in the category of WM tract orientation dispersion index (OD).
For sMRI, MV-Modality-IWAS identified 135 unique causal IDPs with instrumental genes on chromosomes 8 and 19. Among these, 31 IDPs were cortical area, 4 IDPs were cortical gray–white contrasts, 54 IDPs were cortical thickness, 4 IDPs were regional and tissue intensities, and 42 IDPs were regional and tissue volume.
4.2.2 UKB: instrumental genes and causal IDPs using UKB GWAS AD summary statistics
With UKB summary statistics, similar to using IGAP summary statistics, we found instrument genes on chromosomes 8 (0 gene, 6 genes) and 19 (27 genes, 34 genes) for sMRI and dMRIs, respectively. However, using UKB summary statistics, fewer instrumental genes were identified overall, particularly for sMRI where no instrumental genes were found on chromosome 8, compared with 4 genes identified using IGAP summary statistics. This discrepancy may be due to the median age difference between the datasets: the median age in UKB is 56.5 years, whereas IGAP considered late-onset AD cases with onset age , suggesting potentially stronger signals in IGAP with a larger number of cases used in generating its GWAS summary statistics. Additionally, it is also noteworthy that all instrumental genes identified on chromosome 19 for both dMRI and sMRI using UKB summary statistics are also identified using IGAP summary statistics. However, none of the 6 genes found by UKB for dMRI on chromosome 8 is identified using IGAP summary statistics.
With instrumental genes on chromosomes 8 and 19, MV-Modality-IWAS identified 71 unique causal dMRI IDPs for the UKB dataset. Among these 71 unique dMRI IDPs, 36 IDPs fall in the category of WM tract diffusivity, 2 IDPs in WM tract FA, 21 IDPs fall in the category of WM tract ICVF, 6 IDPs fall in the category of WM tract ISOVF, 1 IDP falls in the category of WM tract mean orientation (MO), and 5 IDPs fall in the category of WM tract OD. All causal IDP categories found in IGAP summary statistics were also identified with UKB, with the one additional category: WM tract MO. Additionally, a similar pattern was observed in both IGAP and UKB causal IDP results, where the majority of causal IDPs fell into the categories of WM tract diffusivity and WM tract ICVF.
With instrumental genes on chromosome 19, MV-Modality-IWAS identified 35 unique causal sMRI IDPs from the UKB dataset. Among these, 8 IDPs fall in the category of cortical area, 9 IDPs fall in the category of cortical thickness, 1 IDP falls in the category of regional and tissue intensity, and 17 IDPs fall in the category of regional and tissue volume. The causal IDP categories identified by UKB summary statistics are similar to those of IGAP, but with significantly fewer causal IDPs across all categories. Notably, no causal IDPs were found in cortical gray–white contrasts with UKB data.
Taken altogether, while IGAP and UKB GWAS summary statistics differ by many factors including preprocessing pipelines, the number of samples, the proportion of AD cases, age range, and others, V-Modality-IWAS identified genes on chromosomes 8 and 19 for sMRI and dMRI and no genes for fMRI in both datasets. These results show potential utility of the proposed method, warranting further investigations.
4.3 Comparisons
To test alternative methods of identifying causal IDPs, we compared our proposed approach with two additional strategies. With these comparisons, we aim to explore different underlying model assumptions and potential biases in identifying causal MRI–AD relationships.
4.3.1 No adjustment for horizontal pleiotropic effects
In this comparison, we tested the same hypothesis without adjusting for pleiotropic effects from other modalities. The summaries of values are provided in Figure 6. In the sMRI analysis, ignoring pleiotropy led to the identification of nine additional genes on chromosome 2, but failed to detect the loci on chromosome 8 that were identified by MV-Modality-IWAS in the IGAP dataset. In UKB, however, no instrumental genes were detected, which yielded heterogeneity in results. In the dMRI analysis, ignoring pleiotropy led to the identification of 8 genes on chromosome 1 and 9 genes on chromosome 2, but failed to detect the loci on chromosome 8 that were identified by MV-Modality-IWAS in the IGAP dataset. In UKB, the loci on chromosomes 1 and 2 did not show any significance, but there was a loci (with 3 genes) on chromosome 22. In fMRI analysis, while UKB did not show any significant gene (which agreed with MV-Modality-IWAS results), IGAP detected 1 instrumental gene on chromosome 1 and 35 instrumental genes on chromosome 19. These noticeable differences in results indicate potential false positive genes from the lens of causal inference.
Analysis results using AD GWAS summary statistics without adjusting for other modalities. Panels A–C show -values for testing causality of each modality when no pleiotropic effects are adjusted and using each gene as instrumental variables. The analysis was performed at a Bonferroni-adjusted significance level as described in Figure 5. Panels D–F show the results when using UK Biobank AD GWAS summary statistics and no pleiotropic effects are adjusted. UKB analysis was performed at a Bonferroni-adjusted significance level as described in Figure 5.
Analysis results using AD GWAS summary statistics without adjusting for other modalities. Panels A–C show -values for testing causality of each modality when no pleiotropic effects are adjusted and using each gene as instrumental variables. The analysis was performed at a Bonferroni-adjusted significance level as described in Figure 5. Panels D–F show the results when using UK Biobank AD GWAS summary statistics and no pleiotropic effects are adjusted. UKB analysis was performed at a Bonferroni-adjusted significance level as described in Figure 5.
4.3.2 MV-IWAS
In this comparison, we test each IDP individually while adjusting for all other IDPs within the same modality as well as IDPs from other modalities. To control for multiple comparisons in modality level testing, a Bonferroni correction was applied based on the number of tests conducted within each modality. Specifically, there were 699,007 tests in total being conducted in IGAP GWAS and 615,315 tests in total conducted in UKB GWAS. To keep MV-IWAS comparable with the proposed method, we applied MV-IWAS for each gene (rather than using all SNPs from 22 chromosomes). We did not find any statistical significance in both analyses. It is also important to note that our approach to performing MV-IWAS differs from that of Knutson et al. (2020) in two ways: (1) in the original approach, all SNPs across all chromosomes were used in the Stage 1 model to impute each IDP, whereas we used only the SNPs within each gene being tested; (2) in the original MV-IWAS, only 14 IDPs were tested, so the Bonferroni correction had a much milder impact than in our case, where a substantially larger number of IDPs were tested individually.
These results support our proposal of testing MRI modalities as a whole unit. While testing each IDP individually offers better interpretation regarding causality in identifying causal IDPs, it is too restrictive that no statistically significant results were found using both IGAP and UKB data after applying Bonferroni correction. Analyzing each IDP individually necessitates a much stronger signal to achieve significant findings, which underscores the challenge of power loss with this approach.
5 Discussion
In this study, we were motivated by the large amount of MRI data available in the UK Biobank and the power loss associated with analyzing a large number of individual IDPs using existing methods such as MV-IWAS. To address this issue, we proposed an association test for the genetically regulated effects of a given MRI modality (sMRI, dMRI, or fMRI) on the progression of Alzheimer’s disease with high power. Our method views each MRI modality as a unitary entity and tests the genetically regulated effects of the modality as a whole. Finally, since each IDP contributes only a small proportion to the progression of AD, we propose a variance component test to test whether the overall genetically regulated effects of one modality contribute to the progression of AD while adjusting for the effects of other modalities as well as the residual genetic direct effects.
The proposed method, MV-Modality-IWAS, provides a new perspective on how to handle high-dimensional complex imaging data in genetic association studies. Our method significantly reduces the number of tests needed, resulting in higher power and a reduced need for family-wise error correction, especially when dealing with a large number of IDPs. Similar to MV-IWAS, our method adjusts for other genetically regulated effects and residual genetic direct effects, thus providing better control of Type I error than UV-IWAS. Furthermore, by collapsing the information in the modality being tested, fewer parameters are being tested, allowing our method to achieve higher power while controlling the Type I error rate. Overall, our method provides a powerful and efficient approach to analyzing complex imaging data in genetic association studies. However, there is a trade-off between power and causal interpretability. While the proposed method can identify the modality of the genetically regulated effect, we cannot pinpoint the specific IDPs through which the effect is mediated. Therefore, if the goal is to pinpoint specific IDPs that are causal for AD, a post-analysis can be conducted within the causal modality identified by MV-Modality-IWAS using MV-IWAS or other approaches, such as the Sum of Single Effects (SuSiE) method (Chan et al., 2024; Karageorgiou et al., 2023).
In general, causal inference on observational data is not without assumptions and limitations, and imaging genetic studies may also be subject to violations on assumptions. First, we only listed brain imaging data as potential exposures. In practice, there could be environmental/behavioral factors or other physiological measures (including -omics or other imaging types such as heart MRI (Zhao et al., 2023) and retinal optical coherence tomography (Zhao et al., 2024)) that could serve as potential exposures. In this case, it is expected that our causal argument would be weaker. Second, we considered horizontal pleiotropy in our settings only but not vertical pleiotropy, which might be the case in neuroimaging when a certain modality (e.g., sMRI) leads changes in other modality (e.g., fMRI). It is also important to note that the proposed method is primarily focused on statistical inferences rather than effect size estimation. If modality-level effect sizes are of interest, one may consider estimating modality-specific partial , which provides a standardized measure of how much a linear approximation of variance in AD is uniquely explained by a specific modality. Despite these limitations, our proposed method makes one step closer in addressing high-dimensionality of pleiotropic effects in leveraging IDPs in studying genetic effects on AD. We also note that these are ongoing challenges in causal inference in genetics research where more research is needed to make valid inferences in such cases.
One possible extension is to use a dimension reduction method (e.g., principal component analysis or independent component analysis) to relax issues with multicollinearity when horizontal pleiotropy exists (Karageorgiou et al., 2023; Mo et al., 2022; Patel et al., 2024). It might require determination of dimensions in each modality. It could be a popular method such as principal component analysis or independent component analysis, or more flexible methods that decompose and factorize multimodal data into “shared and individual” variations (Lock et al., 2013, 2022; Park & Lock, 2020). Although it is not evaluated in this paper because it requires access to individual-level genotype and imaging data, as well as computational costs to train machine learning models for each gene–IDP pair, we believe such approaches would greatly enhance the scope of our method. Especially if there is availability of dimensionality-reduced composite score GWAS summary statistics, and with clear evidence that the genetic variation is captured by the composite scores, the dimensionality-reduced composite scores can be used to replace the IDPs in the analysis. Another potential extension of the proposed method is to the flexibility on IDP grouping scheme. In real data applications, grouping by modality is not required by the modeling approach, and alternative grouping schemes may be considered—such as organizing IDPs by feature type (e.g., surface area, volume, thickness). However, when doing so, it is important to carefully consider the impact of multicollinearity and multiple testing when selecting the groups.
Data and Code Availability
The GWAS summary statistics data used in this study are publicly available, with details provided in Section 4.1. Codes used in simulation studies and data analyses are available at https://github.com/junjypark/MV_Modality_IWAS.
Author Contributions
Y.T.: Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing—original draft. D.F.: Funding acquisition, Supervision, Writing—review & editing. J.G.: Funding acquisition, Supervision, Writing—review & editing. J.Y.P.: Conceptualization, Formal analysis, Funding acquisition, Methodology, Software, Supervision, Writing—original draft, Writing—review & editing.
Ethics Statement
No individual-level genetic data was accessed or analyzed in this study. Ethical approval for this study was reviewed and granted by the Office of Research Ethics, University of Toronto (protocol number 44674).
Declaration of Competing Interest
None.
Acknowledgments
We thank the editor, associate editor, and three anonymous reviewers for their valuable suggestions and comments, which improved the quality of the manuscript. We also thank Dr. Lisa Strug and Dr. Lei Sun (University of Toronto) for providing helpful suggestions and comments on this work. We thank Wellcome Centre for Integrative Neuroimaging (WIN/FMRIB), Oxford, UK, and Department of Statistics and Actuarial Science, Simon Fraser University, Canada, for providing GWAS summary statistics for imaging-derived phenotypes from the UK Biobank cohort through the Oxford Brain Imaging Genetics Server—BIG40. We also thank the Neale laboratory for providing GWAS summary statistics for Alzheimer’s disease from the UK Biobank cohort, and the International Genomics of Alzheimer’s Project (IGAP) for providing summary results data for these analyses. The investigators within IGAP contributed to the design and implementation of IGAP and/or provided data but did not participate in analysis or writing of this report. IGAP was made possible by the generous participation of the control subjects, the patients, and their families. The i–Select chips were funded by the French National Foundation on Alzheimer’s disease and related disorders. EADI was supported by the LABEX (laboratory of excellence program investment for the future) DISTALZ grant, Inserm, Institut Pasteur de Lille, Université de Lille 2, and the Lille University Hospital. GERAD was supported by the Medical Research Council (Grant no 503480), Alzheimer’s Research UK (Grant no 503176), the Wellcome Trust (Grant no 082604/2/07/Z), and German Federal Ministry of Education and Research (BMBF): Competence Network Dementia (CND) grant nos 01GI0102, 01GI0711, 01GI0420. CHARGE was partly supported by the NIH/NIA grant R01 AG033193 and the NIA AG081220 and AGES contract N01-AG-12100, the NHLBI grant R01 HL105756, the Icelandic Heart Association, and the Erasmus Medical Center and Erasmus University. ADGC was supported by the NIH/NIA grants: U01 AG032984, U24 AG021886, U01 AG016976, and the Alzheimer’s Association grant ADGC-10-196728.
This research was supported by the McLaughlin Centre (accelerator grant). D.F. was supported by The Koerner Family Foundation, The Krembil Foundation, The CAMH Foundation, and the Canadian Institutes of Health Research. J.G. was partially supported by the Natural Sciences and Engineering Research Council of Canada (RGPIN-2021-03734), and the University of Toronto’s Data Science Institute, and the Connaught Fund. J.Y.P. was partially supported by the Natural Sciences and Engineering Research Council of Canada (RGPIN-2022-04831), and the University of Toronto’s Data Science Institute, and the Connaught Fund.