Abstract
A cycle in a brain network is a subset of a connected component with redundant additional connections. If there are many cycles in a connected component, the connected component is more densely connected. Whereas the number of connected components represents the integration of the brain network, the number of cycles represents how strong the integration is. However, it is unclear how to perform statistical inference on the number of cycles in a brain network. In this study, we present a new statistical inference framework for determining the significance of the number of cycles through the Kolmogorov-Smirnov (KS) distance, which was recently introduced to measure the similarity between networks across different filtration values by using the zeroth Betti number. In this paper, we show how to extend the method to the first Betti number, which measures the number of cycles. The performance analysis was conducted using random network simulations with ground truths. Using a twin imaging study, which provides a biological ground truth, we apply the method to determine whether the number of cycles is a statistically significant heritable network feature in the resting-state functional connectivity of 217 twins obtained from the Human Connectome Project. The MATLAB codes as well as the connectivity matrices used in generating the results are provided at http://www.stat.wisc.edu/~mchung/TDA.
Author Summary
In this paper, we propose a new topological distance based on the Kolmogorov-Smirnov (KS) distance that is adapted for brain networks, and compare it against other topological network distances, including the Gromov-Hausdorff (GH) distance. The KS-distance was recently introduced to measure the similarity between networks across different filtration values by using the zeroth Betti number, which measures the number of connected components. In this paper, we show how to extend the method to the first Betti number, which measures the number of cycles. The performance analysis was conducted using random network simulations with ground truths. Using a twin imaging study, which provides a biological ground truth of network differences, we demonstrate that the KS-distances on the zeroth and first Betti numbers have the ability to determine heritability.
INTRODUCTION
The modular structure and connected components are the fundamental topological features of a brain network. Brain networks with a higher number of connected components have many disjointed clusters, and the transfer of information will likely be impeded. Modular structures are often studied through the Q-modularity in graph theory (Meunier, Lambiotte, Fornito, Ersche, & Bullmore, 2009; Newman, Barabasi, & Watts, 2006) and the zeroth Betti number in persistent homology (Carlsson & Memoli, 2008; Carlsson & Mémoli, 2010; Chung, Vilalta-Gil, Lee, Rathouz, Lahey, & Zald, 2017b; Chung, Luo, Leow, Adluru, Alexander, Richard, & Goldsmith, 2018b; Lee, Chung, & Lee, 2014).
Persistent homology provides a coherent framework for obtaining higher order topological features beyond modular structures (Edelsbrunner & Harer, 2008; Zomorodian & Carlsson, 2005). A brain network can be treated as the 1-skeleton of a simplicial complex, where the 0-dimensional hole is the connected component, and the 1-dimensional hole is a cycle. The number of k-dimensional holes is called the k-th Betti number and denoted as βk (Lee et al., 2014; Lee, Chung, Kang, Choi, Kim, & Lee, 2018; Petri, Expert, Turkheimer, Carhart-Harris, Nutt, Hellyer, & Vaccarino, 2014; Sizemore, Giusti, Kahn, Vettel, Betzel, & Bassett, 2018). In this study, we examine higher order topological changes of brain networks using cycles. The cycle structure in networks is important for information propagation, redundancy, and feedback loops (Lind, Gonzalez, & Herrmann, 2005). If a cycle exists in the network, information can be delivered along two different redundant paths, which can be interpreted as redundant connections. Alternatively, it can be viewed as diffusing the spread of information and creating information bottlenecks (Tarjan, 1972).
Although cycles in a network have been widely studied in graph theory, especially in path analysis, they are rarely used in brain network analysis (Sporns, 2003; Sporns, Tononi, & Edelman, 2000). Existing graph analysis packages such as Brain Connectivity (http://sites.google.com/site/bctnet) do not provide any tools related to cycles. Traditionally, cycles are often computed using the brute-force depth-first search algorithm (Tarjan, 1972). In standard graph theoretic approaches, network differences are measured mainly by determining the difference in graph theory features such as assortativity, betweenness centrality, small-worldness, and network homogeneity (Bullmore & Sporns, 2009; Rubinov & Sporns, 2010; Rubinov, Knock, Stam, Micheloyannis, Harris, Williams, & Breakspear, 2009; Uddin, Kelly, Biswal, Margulies, Shehzad, Shaw, Ghaffari, Rotrosen, Adler, Castellanos, & Milham, 2008). Comparison of graph theory features appears to reveal changes of structural or functional connectivity associated with different clinical populations (Rubinov & Sporns, 2010). Since weighted brain networks are difficult to interpret and visualize, they are often turned into binary networks by thresholding edge weights (He, Chen, & Evans, 2008; Wijk, Stam, & Daffertshofer, 2010). However, the thresholds for the edge weights are often chosen arbitrarily and produce results that could alter the network topology and thus make comparisons difficult. To obtain a proper optimal threshold at which comparisons can be made, multiple comparison correction over every possible edge has been proposed (Rubinov et al., 2009; Wijk et al., 2010). However, the resulting binary graph is extremely sensitive to the chosen p value or threshold value. Others have tried to control the sparsity of edges in the network in obtaining the binary network (Achard & Bullmore, 2007; Bassett, 2006; He et al., 2008; Lee, Kang, Chung, Kim, & Lee, 2012; Wijk et al., 2010). However, one then encounters the problem of thresholding the sparsity parameter. Thus, existing methods for binarizing weighted networks cannot escape the inherent problem of arbitrary thresholding.
There is currently no widely accepted criterion for thresholding networks. Instead of trying to find an optimal threshold that gives rise to a single network that may not be suitable for comparing clinical populations, cognitive conditions, or different studies, why not use each network produced from every threshold? Motivated by this simple question, a new multiscale hierarchical network modeling framework based on persistent homology has been proposed (Cassidy, Rae, & Solo, 2015; Chung, Hanson, Lee, Adluru, Alexander, Davidson, & Pollak, 2013; Giusti, Pastalkova, Curto, & Itskov, 2015; Lee, Chung, Kang, Kim, & Lee, 2011a, 2011b; Lee et al., 2012; Petri, Scolamiero, Donato, & Vaccarino, 2013; Petri et al., 2014; Sizemore, Giusti, & Bassett, 2016; Sizemore et al., 2018; Stolz, Harrington, & Porter, 2017). Persistent homology, a branch of computational topology (Carlsson & Memoli, 2008; Edelsbrunner & Harer, 2008; Edelsbrunner, Letscher, & Zomorodian, 2000), provides a more coherent mathematical framework for measuring network distance than the conventional method of simply taking the difference between graph theoretic features or the norm of the connectivity matrices. Instead of looking at networks at a fixed scale, as is usually done in many standard brain network analyses, persistent homology observes the changes of topological features of the network over multiple resolutions and scales (Edelsbrunner & Harer, 2008; Horak, Maletić, & Rajković, 2009; Zomorodian & Carlsson, 2005). In doing so, it reveals the most persistent topological features that are robust under noise perturbations. This robustness in performance across different scales is needed for most network distances, which are parameter and scale dependent.
In persistent homology–based brain network analysis, instead of analyzing networks at one fixed threshold that may not be optimal, we build the collection of nested networks over every possible threshold by using the graph filtration, a persistent homological construct (Chung et al., 2013; Lee et al., 2011a, 2012). The graph filtration is a threshold-free framework for analyzing a family of graphs but requires hierarchically building specific nested subgraph structures. The graph filtration shares similarities to the existing multithresholding or multiresolution network models that use many different arbitrary thresholds or scales (Achard, Salvador, Whitcher, Suckling, & Bullmore, 2006; He et al., 2008; Kim, Adluru, Chung, Okonkwo, Johnson, Bendlin, & Singh, 2015; Lee et al., 2012; Supekar, Menon, Rubin, Musen, & Greicius, 2008). Such approaches are mainly used to visually display the dynamic pattern of how graph theoretic features change over different thresholds, and the pattern of change is rarely quantified. Persistent homology can be used to quantify such dynamic patterns in a more coherent mathematical framework. Recently, various persistent homological network approaches have been proposed. In Giusti et al. (2015) and Sizemore et al. (2016, 2018), graph filtration was developed on cliques. In Petri et al. (2013), weighted clique rank homology was developed. In Petri et al. (2014), the concept of homological scaffolds was developed and applied to the resting-state fMRI.
In persistent homology, there are various metrics that have been proposed to measure similarity and distances, including the bottleneck, Gromov-Hausdorff (GH), and Wasserstein distances (Chazal, Cohen-Steiner, Guibas, Mémoli, & Oudot, 2009; Kerber, Morozov, & Nigmetov, 2017; Tuzhilin, 2016), the complex vector method (Di Fabio & Ferri, 2015), and the persistence kernel (Ibanez-Marcelo, Campioni, Manzoni, Santarcangelo, & Petri, 2018a; Ibanez-Marcelo, Campioni, Phinyomark, Petri, & Santarcangelo, 2018b; Kusano, Hiraoka, & Fukumizu, 2016). Among them, the bottleneck and GH distances are possibly the two most popular distances that were originally used to measure distance between two metric spaces (Tuzhilin, 2016). They were later adapted to measure distances in persistent homology, dendrograms (Carlsson & Memoli, 2008; Carlsson & Mémoli, 2010; Chazal et al., 2009), and brain networks (Lee et al., 2011b, 2012). The probability distributions of bottleneck and GH-distances are unknown. Thus, the statistical inference on them can only be done through resampling techniques such as permutations (Lee et al., 2012; Lee, Kang, Chung, Lim, Kim, & Lee, 2017), which often cause serious computational bottlenecks for large-scale networks.
To bypass the computational bottleneck associated with resampling large-scale networks, the Kolmogorov-Smirnov (KS) distance was introduced (Chung et al., 2013; Lee et al., 2017). The advantage of using the KS-distance is that it gives results that are easier to interpret than those obtained from less intuitive distances from persistent homology. Furthermore, because of its simplicity in construction, it is possible to determine its probability distribution exactly without resampling (Chung et al., 2017b). However, the KS-distance has only been applied to the number of connected components β0, and it is unclear how to apply it to the number of cycles β1 in graphs and networks. In this paper, for the first time, we show how to extend the KS-distance by performing statistical inference on β1. This is achieved by establishing the monotonic property of the number of cycles over the graph filtration. The monotonicity is then used in constructing the KS-distance for topologically differentiating two networks. Subsequently, the method is applied to a large-scale resting-state twin fMRI study in determining the heritability of the number of cycles.
CORRELATION BRAIN NETWORK
The edge weight, which measures the strength of a connection, is usually given by a similarity measure between the observed data on the nodes in brain networks. Various similarity measures have been proposed. The correlation or mutual information between measurements for the biological or metabolic network and the frequency of contact between actors for the social network have been used as edge weights (Bassett, Meyer-Lindenberg, Achard, Duke, & Bullmore, 2006; Bien & Tibshirani, 2011; Li, Liu, Li, Qin, Li, Yu, & Jiang, 2009; McIntosh & Gonzalez-Lima, 1994; Newman & Watts, 1999; Song, Havlin, & Makse, 2005). In particular, the Pearson correlation has been most widely used as edge weights in functional brain network modeling.
GRAPH FILTRATION
All topological network distances that will be introduced in later sections are based on filtrations on graphs by thresholding edge weights.
Any edge weight less than or equal to ϵ is made into zero, while edge weights larger than ϵ are made into one. Lee et al. (2011b) define the binary graphs by thresholding above, that is, wij,ϵ = 1 if wij ≤ ϵ, which is consistent with the definition of the Rips filtration. However, in brain imaging, a higher value of wij indicates stronger connectivity. Thus, we threshold below and keep the stronger connections (Chung et al., 2013).
The condition of having unique edge weights is not restrictive in practice. Assuming edge weights to follow some continuous distribution, the probability of any two edges being equal is zero. The finiteness and uniqueness of the filtration levels over finite graphs are intuitively clear by themselves and are implicitly assumed in software packages such as javaPlex (Adams, Tausz, & Vejdemo-Johansson, 2014).
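The thresholding scheme above can be illustrated with a short sketch. The snippet below is a minimal Python re-implementation (the paper's own code is in MATLAB), using hypothetical edge weights; an edge survives at filtration value ϵ only if its weight exceeds ϵ, so the binary graphs are nested and shrink as ϵ increases.

```python
def graph_filtration(weights, thresholds):
    """Binary graphs over a filtration: an edge survives at filtration
    value eps only if its weight is strictly larger than eps, so the
    resulting graphs are nested and shrink as eps increases."""
    return [{edge for edge, w in weights.items() if w > eps}
            for eps in thresholds]

# Hypothetical symmetric edge weights on 4 nodes (upper triangle only).
weights = {(0, 1): 0.9, (0, 2): 0.2, (1, 2): 0.5, (1, 3): 0.1, (2, 3): 0.7}

graphs = graph_filtration(weights, thresholds=[0.0, 0.3, 0.6])
edge_counts = [len(g) for g in graphs]  # decreases monotonically with eps
```

With these hypothetical weights the edge counts decrease from five to three to two, and each binary graph is a subgraph of the previous one, which is exactly the nested structure the graph filtration requires.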
BETTI NUMBERS
In persistent homology, the k-th Betti number is often referred to as the number of k-dimensional holes (Lee et al., 2014; Petri et al., 2014; Sizemore et al., 2018). In the network setting, the zeroth Betti number is the number of connected components and the first Betti number is the number of cycles. During the graph filtration, we can show that β0 and β1 change monotonically. Although this is not true in general (Bobrowski & Kahle, 2014), on the graph filtration (2), β0 and β1 exhibit very stable monotonic increases and decreases, respectively.
In a graph, Betti numbers β0 and β1 are monotone over graph filtration on edge weights.
Theorem 1 is related to the incremental Betti number computation over a simplicial complex (Boissonnat & Teillaud, 2006). Once β0 is computed, β1 is simply given by β0 − p + q, where p is the number of nodes and q is the number of edges, without additional computation. For the computation of β0, it is not necessary to perform the graph filtration over infinitely many possible filtration values. The maximum possible number of filtration levels needed for computing β0 is one plus the number of unique edge weights. In the case of trees, β0 can be computed exactly in closed form.
The proof is given in Chung et al. (2015). Note that a tree with p nodes has p − 1 edges. For a graph that is not a tree, it may not be possible to analytically represent β0 over a filtration as in Theorem 2. In general, β0 can be numerically computed using the single linkage dendrogram (SLD) (Lee et al., 2012), the Dulmage-Mendelsohn decomposition (Chung, Adluru, Dalton, Alexander, & Davidson, 2011; Pothen & Fan, 1990), or the simplicial complex method (Carlsson & Memoli, 2008; de Silva & Ghrist, 2007; Edelsbrunner, Letscher, & Zomorodian, 2002). In this study, we computed β0 over the filtration by using the Dulmage-Mendelsohn decomposition.
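The incremental computation β1 = β0 − p + q can be sketched in a few lines. The following is an illustrative Python sketch (not the authors' MATLAB implementation, which uses the Dulmage-Mendelsohn decomposition), counting connected components with a union-find structure and then reading off β1 from the Euler characteristic identity.

```python
def betti_numbers(p, edges):
    """beta0 = number of connected components (via union-find);
    beta1 = beta0 - p + q, where p is the number of nodes and
    q = len(edges), following the Euler characteristic identity."""
    parent = list(range(p))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    beta0 = p
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:        # the edge merges two components
            parent[ri] = rj
            beta0 -= 1
    beta1 = beta0 - p + len(edges)
    return beta0, beta1

# A triangle is one component with one cycle;
# a tree is one component with no cycle.
```

For example, a triangle (p = 3, q = 3) gives (β0, β1) = (1, 1), while any tree gives (1, 0), consistent with the p − 1 edge count of trees noted above.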
SINGLE LINKAGE CLUSTERING
Every edge connecting a node in R1 to a node in R2 has the same SLD. The SLD is then used to construct the single linkage matrix (SLM) S = (sij) (Figure 1). The SLM shows how connected components are merged locally and can be used in constructing a dendrogram over the filtration. If the single linkage distance sij is larger than the current filtration value ϵk but smaller than the next filtration value ϵk+1, that is, ϵk ≤ sij < ϵk+1, then components R1 and R2 will be connected at the next filtration value ϵk+1. The sequence in which components are merged during the graph filtration is identical to the sequence of merging in the dendrogram construction (Lee et al., 2012). By tracing how each of the connected components is merged, we can compute β0. In the single linkage clustering, instead of deleting edges, we are connecting nodes over increasing edge weights.
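Since the single linkage distance between two nodes can be characterized as the minimax path distance, i.e., the smallest, over all paths joining them, of the largest edge weight on the path (Lee et al., 2012), the SLM can be computed with a Floyd-Warshall-style recursion. The sketch below is an illustrative Python version with hypothetical weights (math.inf marks absent edges), not the authors' implementation.

```python
import math  # math.inf marks absent edges

def single_linkage_matrix(W):
    """SLM via the minimax-path characterization: s_ij is the smallest,
    over all paths from i to j, of the maximum edge weight on the path.
    W is a symmetric weight matrix; use math.inf for missing edges."""
    p = len(W)
    S = [[0.0 if i == j else W[i][j] for j in range(p)] for i in range(p)]
    for k in range(p):
        for i in range(p):
            for j in range(p):
                # A path through k costs the larger of its two halves.
                S[i][j] = min(S[i][j], max(S[i][k], S[k][j]))
    return S

# Hypothetical 3-node network: the direct edge (0, 2) has weight 0.9,
# but the path through node 1 never crosses a weight above 0.5.
W = [[0.0, 0.2, 0.9],
     [0.2, 0.0, 0.5],
     [0.9, 0.5, 0.0]]
S = single_linkage_matrix(W)
```

Here S[0][2] = 0.5 rather than the direct weight 0.9, reflecting that nodes 0 and 2 already merge into one component once the filtration admits all edges of weight up to 0.5.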
BOTTLENECK DISTANCE
The bottleneck distance does not directly measure the distance between two metric spaces 𝒳1 = (V1, w1) and 𝒳2 = (V2, w2), but measures the distance between their corresponding persistence diagrams 𝒫(𝒳1) and 𝒫(𝒳2). In practice, the bottleneck distance has often been used since it is a lower bound on the GH-distance and is easier to compute (Chazal et al., 2009). Since the brain regions that form the network nodes are matched across networks through predefined parcellations in brain network studies, the GH-distance can be computed easily. Thus, in this study, we will only use the GH-distance and not show the result of the bottleneck distance in the simulation study.
PERMUTATION TEST ON NETWORK DISTANCES
Statistical inference on network distances can be done using resampling techniques such as the permutation test (Chung et al., 2013; Efron, 1982; Lee et al., 2012). The permutation test is perhaps the most widely used nonparametric test procedure in the sciences (Chung et al., 2017b; Nichols & Holmes, 2002; Thompson, Cannon, Narr, van Erp, Poutanen, Huttunen, Lonnqvist, Standertskjold-Nordenstam, Kaprio, & Khaledy, 2001; Zalesky, Fornito, Harding, Cocchi, Yücel, Pantelis, & Bullmore, 2010). It is known as the exact test in brain imaging since the distribution of the test statistic under the null hypothesis can be exactly computed if we can calculate all possible values of the test statistic under every possible permutation.
Unfortunately, generating every possible permutation for whole images is still extremely time consuming even for a modest sample size. The number of permutations increases exponentially, and it is impractical to generate every possible permutation. In the permutation test, only a small fraction of possible permutations is generated, and the statistical significance is computed approximately. In most studies, on the order of 1% of the total permutations is used, mainly due to the computational bottleneck of generating permutations (Thompson et al., 2001; Zalesky et al., 2010). In Zalesky et al. (2010), 5,000 permutations out of a possible 17,383,860 permutations (0.029%) were used. In Thompson et al. (2001), 1 million permutations out of all possible permutations (0.07%) were generated using a supercomputer. In our study, we have 131 MZ and 77 DZ twins. The number of possible permutations is so large that we cannot exactly represent it in computing systems such as MATLAB and R. Even 1% of the total is about 1.96 × 10^56, which is still astronomically large and beyond the computing capability of most computers. On the other hand, the proposed KS-distance method accounts for all possible permutations combinatorially and completely bypasses this computational bottleneck. There is no resampling cost involved in the KS-distance, and the computation is done in a few seconds. Furthermore, the method computes p values exactly rather than approximately.
KOLMOGOROV-SMIRNOV DISTANCE
Recently, the Kolmogorov-Smirnov (KS) distance has been successfully applied in quantifying the change of β0 number over graph filtration as a way to quantify brain networks without thresholding (Chung et al., 2017a, 2017b). The main advantage of the method is that it avoids using the computationally costly and time consuming permutation test for large-scale networks. In this paper, we show how to apply KS-distance in quantifying the change of the β1 number over graph filtration as well.
The proof is given in Chung et al. (2017b).
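In this setting, the KS-distance between two networks is the largest vertical gap between their Betti curves over the filtration (the maximum gap used later in Figure 5). The following Python sketch, with hypothetical edge weights, combines a union-find count of connected components with the graph filtration to compute the KS-distance on β0; it is an illustration of the construction, not the authors' MATLAB code.

```python
def beta0(p, edges):
    """Number of connected components via union-find."""
    parent = list(range(p))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    b = p
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            b -= 1
    return b

def beta0_curve(p, weights, thresholds):
    """beta0 of the binary graph at each filtration value eps,
    keeping only edges whose weight exceeds eps."""
    return [beta0(p, [e for e, w in weights.items() if w > eps])
            for eps in thresholds]

def ks_distance(curve1, curve2):
    """KS-distance: maximum gap between two Betti curves on a common grid."""
    return max(abs(a - b) for a, b in zip(curve1, curve2))

# Two hypothetical 3-node networks: strongly vs. weakly connected.
w1 = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.85}
w2 = {(0, 1): 0.3, (1, 2): 0.2, (0, 2): 0.25}
eps_grid = [0.0, 0.5]
d = ks_distance(beta0_curve(3, w1, eps_grid), beta0_curve(3, w2, eps_grid))
```

At ϵ = 0.5 the strongly connected network still forms a single component while the weakly connected one has fallen apart into three, so the maximum gap d = 2 separates the two networks; the same construction applies to β1 curves via β1 = β0 − p + q.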
COMPARISONS
Six network distances (L1, L2, L∞, GH, and KS on β0 and β1) were compared in simulation studies. For a review of various brain network distances, refer to Chung et al. (2017a). We also used the popular Q-modularity function for community detection in graph theory (Girvan & Newman, 2002; Meunier et al., 2009; Newman et al., 2006). The difference in Q-modularity was used as the distance measure. The simulations below were independently performed 100 times. We used p = 20, 100, 500 nodes and n = 5 images in each group, which makes the total number of possible permutations exactly 252 (Figure 3). The small number of permutations enables us to compare the performance of the distances exactly. Throughout the simulations, σ = 0.1 was universally used as the network variability.
No Network Difference
No network difference was expected between networks generated using the same parameters and initial data vectors xi in the above model. For example, Figure 3 shows two simulated networks generated with the same parameters k = 4, 10. We compared networks with the same parameter k: 4 vs. 4, 5 vs. 5, and 10 vs. 10. It is expected that we should not be able to detect network differences. The performance results were given in terms of the false positive error rate, computed as the fraction of simulations that gave a p value below 0.05 (Table 1). For all the distances except the KS-distance, the permutation test was used. Since there were five samples in each group, the total number of permutations was 252, making the permutation test exact and the comparisons accurate. All the distances performed very well, including Q-modularity. The KS-distance was overly sensitive and produced up to 7% false positives. However, for a test at the 0.05 level, there is a 5% chance of producing false positives. Thus, the KS-distance produces an error rate only 2 percentage points above the expected rate.
p = 20 | L1 | L2 | L∞ | GH | KS (β0) | KS (β1) | Q |
---|---|---|---|---|---|---|---|
4 vs. 4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.04 | 0.01 | 0.05 |
5 vs. 5 | 0.00 | 0.00 | 0.00 | 0.00 | 0.07 | 0.01 | 0.06 |
10 vs. 10 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.04 |
4 vs. 5 | 0.63 | 0.40 | 0.33 | 0.15 | 0.27 | 0.06 | 0.9 |
2 vs. 4 | 0.71 | 0.48 | 0.42 | 0.53 | 0.18 | 0.00 | 0.95 |
5 vs. 10 | 0.94 | 0.80 | 0.78 | 0.72 | 0.44 | 0.24 | 0.96 |
The p = 20 simulation might involve too small a network to extract the topologically distinct features used by the topological distances. Thus, we increased the number of nodes to p = 100 (Table 2). All the network distances except the KS-distances performed reasonably well. The KS-distances seem to be overly sensitive to slight topological changes in the large topological structures that were present in the k = 2, 4, 5 cases. As k increases, the KS-distances seem to perform reasonably well.
p = 100 | L1 | L2 | L∞ | GH | KS (β0) | KS (β1) | Q |
---|---|---|---|---|---|---|---|
4 vs. 4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.26 | 0.54 | 0.03 |
5 vs. 5 | 0.00 | 0.00 | 0.00 | 0.00 | 0.14 | 0.43 | 0.05 |
10 vs. 10 | 0.00 | 0.00 | 0.00 | 0.00 | 0.05 | 0.05 | 0.05 |
4 vs. 5 | 0.51 | 0.37 | 0.35 | 0.16 | 0.11 | 0.00 | 0.93 |
2 vs. 4 | 0.66 | 0.45 | 0.57 | 0.61 | 0.03 | 0.00 | 0.91 |
5 vs. 10 | 0.94 | 0.86 | 0.79 | 0.72 | 0.11 | 0.00 | 0.98 |
Network Differences
We generated networks with parameters k = 2, 4, 5, 10 in the p = 20 nodes simulation (Figure 3). Since the topological structures were different, the distances were expected to differentiate the networks. The performance results were given in terms of the false negative error rate, computed as the fraction of simulations that gave a p value above 0.05 (Table 1). All the distances, including Q-modularity, performed badly, although the KS-distance performed the best. Since graph theory features are not explicitly designed to measure network distances, they do not usually perform well when there are large topological differences.
We increased the number of nodes to p = 100. All the network distances, including Q-modularity, still performed badly except the KS-distances (Table 2). The KS-distance on the number of cycles seems to be the best network distance to use when there are network topology differences, although it has a tendency to produce false positives when there is no difference.
In terms of computation, the distance methods based on the permutation test took about 950 seconds (16 minutes) for 100 nodes, while the KS-like test procedure took only about 20 seconds on a computer. The results given in Tables 1–3 may change slightly if different random networks are generated. We also performed the simulation study on 500 nodes to see the effect of increased network size (Table 3). The proposed KS-distances on both β0 and β1 do not necessarily perform well in the case of no network differences. Again, the KS-distance is too sensitive and detects minute network differences. On the other hand, in the case of actual network differences, the KS-distances perform exceptionally well compared with the other network distances.
p = 500 | L1 | L2 | L∞ | GH | KS (β0) | KS (β1) | Q |
---|---|---|---|---|---|---|---|
4 vs. 4 | 0.04 | 0.05 | 0.06 | 0.08 | 0.20 | 0.26 | 0.02 |
5 vs. 5 | 0.00 | 0.00 | 0.00 | 0.00 | 0.13 | 0.20 | 0.02 |
10 vs. 10 | 0.00 | 0.00 | 0.00 | 0.00 | 0.06 | 0.18 | 0.05 |
4 vs. 5 | 0.20 | 0.20 | 0.20 | 0.20 | 0.11 | 0.00 | 0.20 |
2 vs. 4 | 0.14 | 0.11 | 0.14 | 0.12 | 0.00 | 0.00 | 0.17 |
5 vs. 10 | 0.20 | 0.18 | 0.19 | 0.16 | 0.00 | 0.00 | 0.20 |
APPLICATION
As an application, we show how to apply KS-distances in understanding heritability of brain networks. Because of their unique relationship, twin imaging studies allow researchers to examine genetic and environmental influences easily in vivo (Blokland, McMahon, Thompson, Martin, de Zubicaray, & Wright, 2011; Chiang, McMahon, de Zubicaray, Martin, Hickie, Toga, Wright, & Thompson, 2011; Glahn, Winkler, Kochunov, Almasy, Duggirala, Carless, Curran, Olvera, Laird, Smith, Beckmann, Fox, & Blangero, 2010; McKay, Knowles, Winkler, Sprooten, Kochunov, Olvera, Curran, Kent Jr., Carless, Göring, Dyer, Duggirala, Almasy, Fox, Blangero, & Glahn, 2014; Smit, Stam, Posthuma, Boomsma, & De Geus, 2008). Monozygotic (MZ) twins share 100% of genes, whereas dizygotic (DZ) twins share 50% of genes (Chung et al., 2017b). The difference between MZ and DZ twins measures the degree of genetic and environmental influence. Twin imaging studies are very useful for understanding the extent to which brain networks are influenced by genetic factors. This information can then be later used to develop better ways to prevent and treat disorders and maladaptive behaviors.
Dataset and Image Preprocessing
We used the resting-state fMRI of 271 twin pairs from the Human Connectome Project (Van Essen, Ugurbil, Auerbach, Barch, Behrens, Bucholz, Chang, Chen, Corbetta, & Curtiss, 2012). Out of the total 271 twin pairs, we used only the 131 genetically confirmed MZ twin pairs (age 29.3 ± 3.3 years, 56M/75F) and 77 genetically confirmed same-sex DZ twin pairs (age 29.1 ± 3.5 years, 30M/47F) in this study. Since the discrepancy between self-reported and genotype-verified zygosity was fairly high, at 13% of all the available data, 19 MZ and 19 DZ twin pairs without genotyping were excluded. We additionally excluded 35 twin pairs with missing fMRI data.
fMRI data were collected on a customized Siemens 3T Connectome Skyra scanner, using a gradient echo-planar imaging (EPI) sequence with multiband factor = 8, TR = 720 ms, TE = 33.1 ms, flip angle = 52°, 104 × 90 (RO × PE) matrix size, 72 slices, and 2-mm isotropic voxels; 1,200 volumes were obtained over a 14 min 33 s scanning session. The fMRI data underwent spatial and temporal preprocessing, including motion and physiological noise removal (Smith et al., 2013). Using the resting-state fMRI, we employed the Automated Anatomical Labeling (AAL) brain template to parcellate the brain volume into 116 regions (Tzourio-Mazoyer, Landeau, Papathanassiou, Crivello, Etard, Delcroix, Mazoyer, & Joliot, 2002). The fMRI signals were then averaged across voxels in each brain region for each subject. The averaged fMRI signal in each parcellation was then temporally smoothed using the cosine series representation (Chung, Adluru, Lee, Lazar, Lainhart, & Alexander, 2010; Gritsenko, Lindquist, Kirk, & Chung, 2018).
Twin Correlations
The network differences between MZ and DZ twins are considered to be mainly attributable to heritability and can be used to determine the statistical significance of the heritability index (HI) (Chung et al., 2017, 2018). The KS-distance was computed by taking 1 − CMZ and 1 − CDZ as edge weights.
In most brain imaging studies, 5,000–1,000,000 permutations are often used, which puts the total number of generated permutations at usually less than 0.01% to 1% of all possible permutations. In Zalesky et al. (2010), 5,000 permutations out of a possible 17,383,860 permutations (0.029%) were used. In Thompson et al. (2001), 1 million permutations out of all possible permutations (0.07%) were generated using a supercomputer. In Lee et al. (2017), 5,000 permutations out of a possible 92,561,040 permutations (0.005%) were used. Since we have 131 MZ and 77 DZ pairs, the total number of possible permutations is larger than 10^80. Even if we generate only 0.01% of all possible permutations, 10^76 permutations are still too many for most desktop computers. Thus, we chose the KS-distance for measuring the network distance. Although the probability distribution of the KS-distance is based on the permutation test, the probability is computed combinatorially, bypassing the need for resampling. In our study, the KS-distance took only a few seconds to compute the p value.
Results
We used β0 and β1 in computing the KS-distances. Let ϕ ∘ C denote the entrywise application of a monotone function ϕ to a connectivity matrix C. Then the KS-distance between CMZ and CDZ is equivalent to the KS-distance between 1 − CMZ and 1 − CDZ, as well as between ϕ ∘ (1 − CMZ) and ϕ ∘ (1 − CDZ). Thus, we simply built filtrations over CMZ and CDZ and computed the KS-distance without using the square root of 1 − correlation. We used 101 filtration values between 0 and 1 at 0.01 increments (Figure 4). This gives a reasonably accurate estimate of the maximum gap in the βi-plots between the twins (Figure 5). For the β0-plots, the maximum gap is 82, which gives a p value smaller than 10^−24. For the β1-plots, the maximum gap is 3,647, which gives a p value smaller than 10^−32. At the same correlation value, MZ twins are more connected than DZ twins. MZ twins also have more cycles than DZ twins. Such huge topological differences are attributed to heritability.
Figure 6, which displays the HI thresholded at 100% heritability, shows that MZ twins are far more similar than DZ twins in many connections, suggesting that genes influence the development of these connections. The most heritable connections include the left frontal gyrus, left and right middle frontal gyri, left superior frontal gyrus, left parahippocampal gyrus, left and right thalami, and left and right caudate nuclei, among many other regions. Most regions overlap with highly heritable regions observed in other twin brain-imaging studies (Fan, Fossella, Sommer, Wu, & Posner, 2003; Glahn et al., 2010; Gritsenko et al., 2018). Moreover, the findings here are somewhat consistent with a previous diffusion tensor imaging study on twins from our group (Chung, Luo, Adluru, Alexander, Richard, & Goldsmith, 2018a; Chung et al., 2018b), showing that many regions of both resting-state functional and structural connections are heritable at the same time. The left and right caudate nuclei are identified as the most heritable hub nodes in our study.
The MATLAB codes for the simulation study as well as the connectivity matrices CMZ and CDZ used in generating results are given at http://www.stat.wisc.edu/∼mchung/TDA.
DISCUSSION
The Limitation of KS-distances
Currently, the KS-distance is applied to the Betti numbers β0 and β1 separately. It may be possible to construct a new, topologically more sensitive distance that combines both β0 and β1. One possible approach is the convex combination αD0 + (1 − α)D1, where Di is the KS-distance for βi and 0 ≤ α ≤ 1. This is beyond the scope of this paper and left as a future study.
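Such a combined distance would be trivial to evaluate once D0 and D1 are available. A hypothetical sketch, using the maximum gaps observed in this study (82 for β0 and 3,647 for β1) purely as example inputs; how to calibrate α, and the null distribution of the combined statistic, are the open questions:

```python
def combined_ks(d0, d1, alpha):
    """Hypothetical convex combination alpha*D0 + (1 - alpha)*D1 of the
    beta0- and beta1-based KS-distances, with alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * d0 + (1 - alpha) * d1

print(combined_ks(82.0, 3647.0, 0.5))  # → 1864.5, midpoint of the two observed gaps
```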
Other Network Distances
The network distances used in this study are not merely similarity measures but metrics. Since there is a virtually unlimited number of similarity measures and distances that can be defined on networks, the discriminative performance of the chosen distance matters, as we have shown in the simulation studies. Determining the optimal distance is related to metric learning, an area of supervised machine learning in which the goal is to learn from data an optimal similarity function that measures how similar two objects are (Ktena, Parisot, Ferrante, Rajchl, Lee, Glocker, & Rueckert, 2018; Lowe, 1995). This is left as a future study.
Computational Issues
The total number of permutations in permuting two groups of size q each is (2q choose q) ∼ 4^q/√(πq). Even for small q = 10, there are (20 choose 10) = 184,756 possible permutations, so tens of thousands of random permutations are needed for an accurate approximation of the p value. The main advantage of the KS-distance over other distance measures is that it avoids numerically performing the permutation test and generating tens of thousands of permutations. Although the probability distribution of the KS-distance is based on the permutation test, the probability is computed combinatorially. We believe that similar theoretical results can be developed for other distance measures, yielding methods for statistical inference that avoid resampling altogether.
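The combinatorial computation rests on a standard lattice-path argument: the exact null probability that the two-sample KS statistic exceeds a gap d equals the fraction of monotone paths from (0, 0) to (m, n) that leave the band |i/m − j/n| < d. A minimal Python sketch of this classical construction (not the paper's MATLAB code, and applied here to the generic KS statistic rather than to Betti plots):

```python
from fractions import Fraction
from math import comb

def ks_pvalue_exact(m, n, d):
    """Exact P(D >= d) for the two-sample KS statistic
    D = sup_x |F_m(x) - F_n(x)| under the permutation null.
    Counts monotone lattice paths from (0, 0) to (m, n) that stay
    strictly inside the band |i/m - j/n| < d; the complement of their
    fraction among all comb(m + n, m) paths is the p value."""
    inside = lambda i, j: abs(Fraction(i, m) - Fraction(j, n)) < d
    count = [[0] * (n + 1) for _ in range(m + 1)]
    count[0][0] = 1
    for i in range(m + 1):
        for j in range(n + 1):
            if (i, j) != (0, 0) and inside(i, j):
                count[i][j] = (count[i - 1][j] if i > 0 else 0) + \
                              (count[i][j - 1] if j > 0 else 0)
    return 1 - Fraction(count[m][n], comb(m + n, m))

# No resampling: the exact p value for two samples of size 10 and gap 0.5
print(float(ks_pvalue_exact(10, 10, Fraction(1, 2))))
```

Because the dynamic program runs over an (m + 1) × (n + 1) grid, the cost is O(mn) rather than the combinatorial cost of enumerating permutations, which is what makes the inference feasible in seconds.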
AUTHOR CONTRIBUTIONS
Moo Chung: Conceptualization; Data curation; Formal analysis; Funding acquisition; Investigation; Methodology; Project administration; Resources; Software; Supervision; Validation; Visualization; Writing - Original Draft; Writing - Review & Editing. Hyekyoung Lee: Investigation; Methodology; Validation; Visualization; Writing - Original Draft. Alex DiChristofano: Investigation. Hernando Ombao: Writing - Review & Editing. Victor Solo: Conceptualization; Methodology; Writing - Review & Editing.
FUNDING INFORMATION
Moo Chung, National Institutes of Health (http://dx.doi.org/10.13039/100000002), Award ID: EB022856. Hyekyoung Lee, National Research Foundation of Korea (http://dx.doi.org/10.13039/501100003725), Award ID: NRF-2016R1D1A1B03935463.
ACKNOWLEDGMENTS
We thank Yuan Wang of University of South Carolina, Peter Bubenik of University of Florida, Bala Krishnamoorthy of Washington State University, Dustin Pluta of University of California-Irvine, Alex Leow of University of Illinois-Chicago, and Martin Lindquist of Johns Hopkins University for valuable discussions. We also thank Andrey Gritsenko and Gregory Kirk of University of Wisconsin-Madison for logistic support and image preprocessing help.
TECHNICAL TERMS
- Persistent homology:
A topological data analysis technique for computing topological features at different spatial resolutions.
- Graph filtration:
A collection of nested graphs.
- Metric space:
A set with a metric defined on the set.
- Permutation test:
Determines the statistical significance by calculating all possible values of the test statistic under all possible rearrangements of the samples.
- Kolmogorov-Smirnov (KS) distance:
A distance between the empirical distributions of two samples.
- Mixed-effect model:
A model with both fixed and random effect terms.
- Heritability index:
A number between 0 and 1 that measures the amount of genetic contribution.
- Betti-plots:
Displays the change of Betti numbers over filtration values.
REFERENCES
Author notes
Competing Interests: The authors have declared that no competing interests exist.
Handling Editor: Paul Expert