Test-retest reliability of the human functional connectome over consecutive days: identifying highly reliable portions and assessing the impact of methodological choices

Countless studies have advanced our understanding of the human brain and its organization by using functional magnetic resonance imaging (fMRI) to derive network representations of human brain function. However, we do not know to what extent these “functional connectomes” are reliable over time. In a large public sample of healthy participants (N = 833) scanned on two consecutive days, we assessed the test-retest reliability of fMRI functional connectivity and the effect on reliability of three common sources of variation in analysis workflows: atlas choice, global signal regression, and thresholding. Using the intraclass correlation coefficient as a metric, we demonstrate that only a small portion of the functional connectome is characterized by good (6–8%) to excellent (0.08–0.14%) reliability. Connectivity between prefrontal, parietal, and temporal areas is especially reliable, and average connectivity within known networks also has good reliability. In general, while unreliable edges are weak, reliable edges are not necessarily strong. Methodologically, reliability of edges varies between atlases, global signal regression decreases reliability for networks and most edges (but increases it for some), and thresholding based on connection strength reduces reliability. Focusing on the reliable portion of the connectome could help quantify trait-like features of the brain and investigate individual differences using functional neuroimaging.

Effects of thresholding on edge retention in the Brainnetome atlas. For each absolute and relative threshold we show the proportion of edges that are consistently retained. As a measure of consistency, we use the number of participants in which the edge was retained at both timepoints divided by the number in which it was retained at least once. For convenience, we then use the values defined in (1) to compute the proportion of edges having poor (ratio<0.40), fair (ratio=0.40-0.60), good (ratio=0.60-0.75) or excellent (ratio>0.75) consistency. For absolute thresholds, all edges below the value are set to 0; for relative ones (right), only the top percent corresponding to the threshold is retained.
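
The consistency measure described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the function names and the boolean "retained" arrays (participants by edges) are hypothetical, edges never retained by any participant are marked as undefined, and values falling exactly on a category boundary are assigned to the higher category except for 0.75, which is treated as good because the caption defines excellent as ratio>0.75.

```python
import numpy as np

def retention_consistency(kept_day1, kept_day2):
    """Per-edge consistency: participants retaining the edge at both
    timepoints divided by participants retaining it at least once.
    Inputs are boolean arrays of shape (participants, edges)."""
    kept_day1 = np.asarray(kept_day1, dtype=bool)
    kept_day2 = np.asarray(kept_day2, dtype=bool)
    both = (kept_day1 & kept_day2).sum(axis=0)
    either = (kept_day1 | kept_day2).sum(axis=0)
    # Edges never retained have no defined ratio (assumption: use NaN).
    ratio = np.full(both.shape, np.nan)
    np.divide(both, either, out=ratio, where=either > 0)
    return ratio

def consistency_category(ratio):
    """Map ratios onto the poor/fair/good/excellent intervals of the caption."""
    ratio = np.asarray(ratio, dtype=float)
    conditions = [np.isnan(ratio), ratio < 0.40, ratio < 0.60, ratio <= 0.75]
    labels = ["undefined", "poor", "fair", "good"]
    return np.select(conditions, labels, default="excellent")

# Toy example: 4 participants, 3 edges (made-up values for illustration).
day1 = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [1, 0, 0]]
day2 = [[1, 0, 0], [1, 1, 0], [1, 1, 0], [1, 0, 0]]
ratios = retention_consistency(day1, day2)
categories = consistency_category(ratios)
```

The proportion of edges in each category is then the count of each label divided by the number of edges with a defined ratio.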

Reliability of edges in the functional connectome in unrelated participants.
We show the median, minimum and maximum ICC of functional connectomes computed using three different atlases, with or without global signal regression. We also show the proportion of edges having poor (ICC<0.40), fair (ICC=0.40-0.60), good (ICC=0.60-0.75) or excellent (ICC>0.75) reliability, defined in accordance with (1). Abbreviations: ICC=intraclass correlation coefficient, GSR-=no global signal regression, GSR+=global signal regression.

Effects of thresholding on edge retention in the Brainnetome atlas in unrelated participants.
For each absolute and relative threshold we show the proportion of edges that are consistently retained. As a measure of consistency, we use the number of participants in which the edge was retained at both timepoints divided by the number in which it was retained at least once. For convenience, we then use the values defined in (1) to compute the proportion of edges having poor (ratio<0.40), fair (ratio=0.40-0.60), good (ratio=0.60-0.75) or excellent (ratio>0.75) consistency. For absolute thresholds, all edges below the value are set to 0; for relative ones (right), only the top percent corresponding to the threshold is retained.

Effects of thresholding on edge retention in the Glasser atlas in unrelated participants.
For each absolute and relative threshold we show the proportion of edges that are consistently retained. As a measure of consistency, we use the number of participants in which the edge was retained at both timepoints divided by the number in which it was retained at least once. For convenience, we then use the values defined in (1) to compute the proportion of edges having poor (ratio<0.40), fair (ratio=0.40-0.60), good (ratio=0.60-0.75) or excellent (ratio>0.75) consistency. For absolute thresholds, all edges below the value are set to 0; for relative ones (right), only the top percent corresponding to the threshold is retained.

Effects of thresholding on edge retention in the Brainnetome atlas using different ICC intervals.
For each absolute and relative threshold we show the proportion of edges that are consistently retained. As a measure of consistency, we use the number of participants in which the edge was retained at both timepoints divided by the number in which it was retained at least once. For convenience, we then use the values defined in (3) to compute the proportion of edges having slight (<0.20), fair (0.20-0.40), moderate (0.40-0.60), substantial (0.60-0.80), and perfect (>0.80) consistency. For absolute thresholds, all edges below the value are set to 0; for relative ones (right), only the top percent corresponding to the threshold is retained.

Effects of thresholding on edge retention in the Glasser atlas using different ICC intervals.
For each absolute and relative threshold we show the proportion of edges that are consistently retained. As a measure of consistency, we use the number of participants in which the edge was retained at both timepoints divided by the number in which it was retained at least once. For convenience, we then use the values defined in (3) to compute the proportion of edges having slight (<0.20), fair (0.20-0.40), moderate (0.40-0.60), substantial (0.60-0.80), and perfect (>0.80) consistency. For absolute thresholds, all edges below the value are set to 0; for relative ones (right), only the top percent corresponding to the threshold is retained.

Figure S5
Effects of thresholding on edge retention in unrelated participants. In the Brainnetome, Glasser and Gordon atlases, for each absolute and relative threshold we show the proportion of edges that are consistently retained. As a measure of consistency, we use the number of participants in which the edge was retained at both timepoints divided by the number in which it was retained at least once. For convenience, we then use the values defined in (1) to plot the proportion of edges having poor (ratio<0.40), fair (ratio=0.40-0.60), good (ratio=0.60-0.75) or excellent (ratio>0.75) consistency. For absolute thresholds (left), all edges below the value are set to 0; for relative ones (right), only the top percent corresponding to the threshold is retained. Abbreviations: r=Pearson correlation coefficient.
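
The two thresholding schemes named in this caption can be illustrated as follows. This is a minimal sketch, assuming a symmetric matrix of Pearson correlations per participant and session; the function names are hypothetical, the diagonal is zeroed in both cases for convenience, and rounding the number of retained edges to the nearest integer is an assumption rather than a detail given in the source.

```python
import numpy as np

def absolute_threshold(conn, r_min):
    """Set every edge with a correlation below r_min to 0."""
    out = np.asarray(conn, dtype=float).copy()
    out[out < r_min] = 0.0
    np.fill_diagonal(out, 0.0)  # assumption: self-connections are ignored
    return out

def relative_threshold(conn, top_percent):
    """Retain only the strongest top_percent of edges; zero the rest."""
    conn = np.asarray(conn, dtype=float)
    rows, cols = np.triu_indices(conn.shape[0], k=1)  # each edge once
    weights = conn[rows, cols]
    n_keep = int(round(weights.size * top_percent / 100.0))
    out = np.zeros_like(conn)
    if n_keep > 0:
        strongest = np.argsort(weights)[::-1][:n_keep]
        r, c = rows[strongest], cols[strongest]
        out[r, c] = conn[r, c]
        out[c, r] = conn[r, c]  # keep the matrix symmetric
    return out

# Toy 4-node connectome (made-up values for illustration).
conn = np.array([
    [1.0, 0.8, 0.2, 0.5],
    [0.8, 1.0, 0.1, 0.3],
    [0.2, 0.1, 1.0, 0.6],
    [0.5, 0.3, 0.6, 1.0],
])
abs_thr = absolute_threshold(conn, r_min=0.5)
rel_thr = relative_threshold(conn, top_percent=50)
```

An edge counts as retained for a participant and session when its thresholded value is nonzero, which is the input needed for the consistency ratio described above.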

Figure S6
Effects of thresholding on reliability in unrelated participants. In the Brainnetome, Glasser and Gordon atlases, for each absolute and relative threshold we show the proportion of edges having poor (ICC<0.40), fair (ICC=0.40-0.60), good (ICC=0.60-0.75) or excellent (ICC>0.75) reliability. In this calculation, only subjects for which the edge was retained in both sessions were considered. For absolute thresholds (left), all edges below the value are set to 0; for relative ones (right), only the top percent corresponding to the threshold is retained. Abbreviations: r=Pearson correlation coefficient, ICC=intraclass correlation coefficient.
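
To make the restriction to subjects with the edge retained in both sessions concrete, the sketch below computes an ICC for one edge after an absolute threshold. Several choices here are assumptions, since the caption does not state them: the ICC form (a two-way random-effects, absolute-agreement, single-measure ICC, often written ICC(2,1)), the minimum of three subjects required to return a value, and the function names, which are hypothetical.

```python
import numpy as np

def icc_2_1(x):
    """ICC(2,1) for an array of shape (subjects, sessions).
    Assumption: this form is used for illustration only."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)   # per-subject means
    col_means = x.mean(axis=0)   # per-session means
    ms_rows = k * np.sum((row_means - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((col_means - grand) ** 2) / (k - 1)
    resid = x - row_means[:, None] - col_means[None, :] + grand
    ms_err = np.sum(resid ** 2) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

def icc_retained_in_both(day1, day2, r_min, min_subjects=3):
    """ICC of one edge, using only subjects whose edge survives the
    absolute threshold r_min in both sessions."""
    day1 = np.asarray(day1, dtype=float)
    day2 = np.asarray(day2, dtype=float)
    keep = (day1 >= r_min) & (day2 >= r_min)
    if keep.sum() < min_subjects:  # assumption: too few subjects -> NaN
        return np.nan
    return icc_2_1(np.column_stack([day1[keep], day2[keep]]))

# Toy edge values for 4 subjects on two days (made-up for illustration).
day1 = [0.1, 0.5, 0.6, 0.7]
day2 = [0.5, 0.5, 0.6, 0.7]
icc_value = icc_retained_in_both(day1, day2, r_min=0.3)
```

Because the set of contributing subjects can differ from edge to edge, ICC values obtained this way rest on different sample sizes and should be compared with that in mind.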

Figure S8
Effects of thresholding on edge retention using different ICC intervals. In the Brainnetome, Glasser and Gordon atlases, for each absolute and relative threshold we show the proportion of edges that are consistently retained. As a measure of consistency, we use the number of participants in which the edge was retained at both timepoints divided by the number in which it was retained at least once. For convenience, we then use the values defined in (1) to plot the proportion of edges having slight (<0.20), fair (0.20-0.40), moderate (0.40-0.60), substantial (0.60-0.80), and perfect (>0.80) consistency. For absolute thresholds (left), all edges below the value are set to 0; for relative ones (right), only the top percent corresponding to the threshold is retained. Abbreviations: r=Pearson correlation coefficient.

Figure S9
Effects of thresholding on reliability using different ICC intervals. In the Brainnetome, Glasser and Gordon atlases, for each absolute and relative threshold we show the proportion of edges having slight (<0.20), fair (0.20-0.40), moderate (0.40-0.60), substantial (0.60-0.80), and perfect (>0.80) reliability. In this calculation, only subjects for which the edge was retained in both sessions were considered. For absolute thresholds (left), all edges below the value are set to 0; for relative ones (right), only the top percent corresponding to the threshold is retained. Abbreviations: r=Pearson correlation coefficient, ICC=intraclass correlation coefficient.