With a growing interest in personalized medicine, functional neuroimaging research has recently shifted focus from the evaluation of group-level summaries to associating individual differences in brain function with behaviors. However, this new focus brings forth challenges related to accurately measuring the sources of individual variation in functional signals. In this perspective, we highlight the impact of within-individual variations and discuss the concept of measurement reliability as a critical tool for accounting for within- and between-individual variations when measuring individual differences in brain function.

“No man ever steps in the same river twice” - Heraclitus.

The human brain is a dynamic system, continuously updating itself in response to internally and externally generated stimuli and task demands in support of higher order cognition (Fox et al., 2005). From moment to moment, patterns of neural activity traverse the structural connectome, like waves moving along a riverbed that is being remodeled over time. This notion is reflected in classical (psychometric) test theory, which asserts that no two points in time are ever exactly the same for an individual—a reality that challenges efforts seeking to consistently and meaningfully measure brain function (e.g., task activation), particularly those aiming to understand differences among individuals (DeVellis, 2006; Seghier & Price, 2018). Any experimental measurement merely captures a snapshot of the function and behavior produced by an individual brain at a given moment. Arrays of measurements can be combined to test hypotheses across time scales (e.g., milliseconds, seconds, minutes, days, months, years) that are relevant to answering scientific questions. Considered in the present context, variation across measurements is the rule in the evaluation of brain function for individuals—not the exception. However, in the midst of this change is a dynamic equilibrium supported by homeostatic processes—while our brains and behaviors evolve, we remain, to an extent, the same individuals. It is this equilibrium that allows the brain to maintain order despite its dynamic nature. And, it is this equilibrium that makes it feasible for measurements of trait differences in brain function (e.g., functional connectivity) to approach repeatability over time (i.e., agreement between temporally independent test results), despite never being exactly the same (Gratton et al., 2018; Laumann et al., 2017; Poldrack et al., 2015; Yang et al., 2020).

This Perspective illustrates the impact of within-individual variations when measuring individual differences. In addition, we delineate challenges inherent to the measurement of trait differences in brain function that must be addressed for individual difference research to reach clinical and scientific utility. The challenges discussed are applicable regardless of whether the trait measures of brain function are attempting to capture the central tendency (e.g., static functional connectivity matrices) or the variation (e.g., time-varying properties for functional connectivity patterns) observed in individuals.

If one is attempting to obtain a stable estimate from a dynamic system, multiple measurements are required to estimate the central tendency. However, most studies are cross-sectional, without repeated measurements from the same individuals. As a result, within-individual variation is often overlooked or misinterpreted as interindividual variation when studying individual differences in brain function. We demonstrate this with an illustrative simulation in which individuals with known “ground truth” interindividual differences can generate distinct observed individual differences, driven by within-individual variation. First, we simulated a “ground truth” score for each of 10 individual subjects (marked with a cross “X” in Fig. 1a-b), which is the expected value of that individual’s score distribution. In an ideal scenario, we would obtain the true individual differences (i.e., the interindividual distance matrix, Fig. 1d). However, due to the measurement variability within each subject, the observed scores from each measurement instance (sample 1 vs. sample 2 in Fig. 1a-b) vary, resulting in divergent observed individual differences (Fig. 1c-e). As within-subject measurement variation increases (e.g., more variation between measurements, Fig. 1b), the observed individual differences vary even more (Fig. 1e, sample 1 vs. sample 2) and are likely to diverge further from the “ground truth”.
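To make the logic of this simulation concrete, the following is a minimal sketch in Python (the authors’ own simulation code is linked in the data availability statement below). The number of subjects and the two samples follow Figure 1; the score scale, the within-subject standard deviations (0.2 vs. 1.0), and the random seed are illustrative assumptions rather than the values used to generate the figure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects = 10  # 10 subjects, as in Fig. 1a-b

# "Ground truth" score per subject (the crosses in Fig. 1); the scale is an assumed value
ground_truth = rng.normal(loc=0.0, scale=1.0, size=n_subjects)

def observe(truth, within_sd):
    """One measurement occasion: the true score plus within-subject variation."""
    return truth + rng.normal(scale=within_sd, size=truth.shape)

def distance_matrix(scores):
    """Interindividual (absolute) distance matrix for a univariate score."""
    return np.abs(scores[:, None] - scores[None, :])

upper = np.triu_indices(n_subjects, k=1)  # unique subject pairs

for within_sd in (0.2, 1.0):  # low vs. high within-subject variation (assumed levels)
    sample1 = observe(ground_truth, within_sd)
    sample2 = observe(ground_truth, within_sd)
    d_truth = distance_matrix(ground_truth)[upper]
    d1, d2 = distance_matrix(sample1)[upper], distance_matrix(sample2)[upper]
    print(f"within-subject SD={within_sd}: "
          f"sample 1 vs. truth r={np.corrcoef(d1, d_truth)[0, 1]:.2f}, "
          f"sample 1 vs. sample 2 r={np.corrcoef(d1, d2)[0, 1]:.2f}")
```

Comparing the interindividual distance matrices of the two samples with each other and with the ground truth reproduces the qualitative pattern in Figure 1c-e: the larger the within-subject variation, the less the observed individual differences agree across samples and with the truth.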

Fig. 1.

Simulations demonstrating the effects of within-subject variation on measuring individual differences. The observed individual differences can vary due to within-subject variation. Simulations of observed individual differences from 10 individual subjects are generated from the same known “ground truth” (marked with a cross “X”). (a–b) Simulations of a single variable of interest, measured twice (sample 1: diamonds, sample 2: triangles) for these 10 individual subjects with low (a) and high (b) within-subject variation. (c–e) Interindividual distance matrices for the “ground truth” (d) and the two samples (c, e). (f) Simulations of correlations between two variables of interest (ground truth r = 0.3). The observed correlations between the two variables (e.g., thickness and IQ) were calculated within each simulation and vary due to within-subject variation (sample size = 100, 10,000 simulations). Greater within-subject variation yields noisier samples and more nonsignificant results. (g) Simulations of correlations between two variables of interest (ground truth r = 0.3), conducted with varying sample sizes and levels of within-subject variation (10,000 simulations per sample size). A larger sample is required to detect the association between the variables of interest when within-subject variation is sufficiently high.


Importantly, such variation of brain measurements within subjects also has an impact on studies of brain-wide associations (e.g., thickness and IQ). In Figure 1f-g, we simulated repeated measures of two variables from the same individuals and show that the observed correlations vary purely as a result of the measurement variation within each sampled individual. Even though the observed correlation between the two variables (e.g., thickness and IQ) is based on the same ground truth (r = 0.3 in the simulations), within-subject variation can either (1) produce inconsistent findings between samples (i.e., nonsignificant or significant results at a relatively reasonable sample size of N = 100; Fig. 1f) or (2) lower the statistical power, requiring a larger sample to detect the brain-wide association when within-subject variation is sufficiently high (Fig. 1g, N = 20–2000). Thus, although the “ground truth” remains the same, the observed between-individual differences can differ due to within-individual variation alone, thereby reducing the power to detect brain-wide associations (Hsu et al., 2022; Zuo et al., 2019).
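The same point can be sketched directly: adding within-subject measurement noise to two correlated latent variables attenuates the observed correlation and reduces power. The snippet below is a hedged illustration, assuming Gaussian true scores, a ground-truth correlation of r = 0.3, and within-subject noise levels chosen for demonstration; it is not the authors’ simulation code.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
r_true, n, n_sims = 0.3, 100, 10_000  # ground-truth r, sample size, simulations (as in Fig. 1f)

def simulate_observed_r(within_sd):
    """Draw correlated true scores, add within-subject measurement noise, return observed r and p."""
    truth = rng.multivariate_normal([0.0, 0.0], [[1.0, r_true], [r_true, 1.0]], size=n)
    observed = truth + rng.normal(scale=within_sd, size=truth.shape)  # one noisy measurement per variable
    return stats.pearsonr(observed[:, 0], observed[:, 1])

for within_sd in (0.0, 0.5, 1.0):  # assumed levels of within-subject variation
    results = [simulate_observed_r(within_sd) for _ in range(n_sims)]
    r_obs = np.array([r for r, _ in results])
    p_vals = np.array([p for _, p in results])
    print(f"within-subject SD={within_sd}: mean observed r={r_obs.mean():.2f}, "
          f"proportion significant (p < .05)={(p_vals < 0.05).mean():.2f}")
```

As the within-subject noise grows, the mean observed correlation shrinks below 0.3 and the proportion of significant results at N = 100 drops accordingly.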

Depending on the time scale of the data, sources of within-individual variation can arise at multiple levels: from moment to moment, day to day, month to month, or year to year. At each time scale, brain function can also vary with internal and external factors. Figure 2 (lower panel) summarizes common potential sources of variation in brain function that can arise, whether measuring the same individual on multiple occasions or different individuals. From moment to moment, brain functional organization varies with ongoing fluctuations within each individual (Garrett et al., 2013). Such dynamic activity reflects changes in brain state with internal awareness and/or external stimuli. For example, the brain state tends to change as the mind wanders during a resting-state scan, while an individual watching a movie or undertaking a specific task (e.g., the Wisconsin Card Sorting Test) will undergo state changes consistent with the ongoing stimulus (Smallwood & Schooler, 2015; Vatansever et al., 2017). On the scale of hours, diurnal rhythms can alter the brain state, as can the circadian rhythms that govern homeostatic metabolism, which have been shown to produce daily variations in functional connectomes (Hodkinson et al., 2014). These can be further affected by external factors (e.g., feeding, caffeination, sleep quality the day before the scan) (Blautzik et al., 2013; Poldrack et al., 2015). From weeks to months, seasonal effects and life events might change an individual’s mood and cognitive function, affecting the brain state during measurement (Di et al., 2021; Shine et al., 2016). Biological and psychological changes tied to an individual’s unique experiences also lead to variation in functional organization (Poldrack et al., 2015). From year to year, age effects of brain development combined with long-term environmental and social influences (e.g., education, income) contribute to the variation in the measurement and comparison of individuals (Bethlehem et al., 2022; Hackman & Farah, 2009; Kolb et al., 1998; Tooley et al., 2021).

Fig. 2.

Factors affecting the measurement of individual differences.


Regardless of the time scale one considers (e.g., seconds, minutes, days, months, years), numerous potential sources of within-individual variation are capable of individually or jointly impacting brain function and/or its measurements. The breadth of sources of variation can be overwhelming, though, consistent with prior suggestions, they can be categorized into subgroups, such as acquisition-, experimental-, environmental-, participant-, and analysis-related factors (Botvinik-Nezer et al., 2020; Nichols et al., 2017; Schilling et al., 2021; Yan et al., 2013). Beyond these sources, measurement errors are additional, pervasive contributors to sampling variability, affecting both intra- and interindividual variation. Such variation may stem from fluctuations in attention or minor disruptions during testing. It is worth noting that some factors can create opportunities for understanding the impact of biologically meaningful processes on brain function (e.g., time of day) when explicitly considered in the experimental design or analysis (e.g., the MyConnectome project) (Laumann et al., 2017; Poldrack et al., 2015). However, most of these factors are nuisances, creating undesirable perturbations in analysis and leading to potential biases in measuring individual differences, especially when the factors are systematic in nature (Glatard et al., 2015; Kiar et al., 2021). When the data are available, modeling the factors that may contribute to intraindividual variation as covariates at the individual level is advisable.

In addition to within-individual variation, there are diverse sources of variation in measuring individual differences in brain function (Fig. 2). The criteria and strategies employed for data collection play a crucial role in determining the extent to which the sample reflects the population of interest. This variability in recruitment methods (e.g., college, online, or community recruitment) may introduce unwanted between-individual variation into the experiments (Conley et al., 2021). Additionally, analytic variations such as pipeline, computational system, and batch effects (Fig. 2, green box) may also lead to contaminated estimates of individual differences (Botvinik-Nezer et al., 2020; Haddad et al., 2023; Hedges et al., 2022; Hu et al., 2023; Kiar et al., 2021; Li et al., 2024). While recent attention has focused predominantly on reproducibility across analytic variations, few studies have investigated how different analytic perturbations contribute to both within- and between-individual variation.

How do the different sources of variation impact the measurement of individual differences? In general, measurements that have relatively lower within-individual variation and higher between-individual variation lead to improved individual differentiation (Finn et al., 2017). It is important to note that neither the within- nor the between-individual variation alone determines the differences observed among individuals (Zuo et al., 2019). For example, there is always variation from session to session within each individual. However, when the functional connectivity of a given individual is relatively more similar from one session to the next as compared with other individuals, that individual can be recognized and differentiated from others. When measuring individual differences in the functional connectome on a finer time scale (e.g., functional dynamics over minutes or seconds), the dynamic changes from moment to moment may not be stable within individuals. Yet the dynamic characteristics (e.g., the principles that govern the dynamics, how the network configurations vary, etc.) may be relatively similar within individuals as compared with between individuals (Yang et al., 2020). Thus, it is important to decipher sources of variation—within- versus between-individual variation—as well as their proportional impact on overall variation, to improve individual differentiation (Gratton et al., 2018).

In neuroimaging studies, participants included in the analysis can be scanned at different times of the day or under different experimental states (e.g., rest, task, movie) (Blautzik et al., 2013; Vanderwal et al., 2017). Even for the same task, participants can adopt different cognitive strategies, resulting in variation between individuals (Seghier & Price, 2018). Although previous studies have demonstrated that functional networks are largely stable within individuals (that is, variation between sessions or tasks contributes less than between-individual variation; Gratton et al., 2018), few studies differentiate within-individual variation from the total observed variation. In particular, in cross-sectional studies, an individual’s connectome is often assumed to be stable and interpreted as part of the between-individual difference (Hsu et al., 2022). This is problematic because, as shown in Figure 1, within-individual variation can jeopardize the estimation of true interindividual differences when it is large relative to between-individual differences. Such contamination of between-individual differences by within-individual variation can compromise brain–behavior association discovery.

A critical prerequisite for individual difference research is that variation observed between individuals should not be assumed to reflect true individual differences without accounting for the underlying variation within individuals. In test theory, it is common practice to summarize the relative contributions of these two sources of variation in a single value, namely measurement reliability. For example, for continuous measurements, the ratio of between-individual variation to the sum of within- and between-individual variation, namely the intraclass correlation (ICC), is widely used to quantify how well a measure can characterize reliable individual differences (Brandmaier et al., 2018; Chen et al., 2018; McGraw & Wong, 1996). However, as modern neuroscience has increased the dimensionality of characterizations for an individual, the field has faced the challenge of how to achieve such indices for multivariate profiles (Bridgeford et al., 2021; Yoo et al., 2019). One solution that has emerged in neuroimaging is fingerprinting (i.e., identification rate), a nonparametric index that quantifies whether individuals can be matched with themselves across repetitions (Finn et al., 2015). Alternatively, three approaches to generalizing the ICC to multivariate formulations have emerged: (i) a parametric extension of the classic ICC formulation, the image intraclass correlation coefficient (I2C2) (Shou et al., 2013); (ii) the distance-based ICC (dbICC), which reformulates the ICC in terms of distances (M. Xu et al., 2021); and (iii) discriminability, a nonparametric index that assesses the degree to which an individual’s repetitions are relatively similar to one another (Bridgeford et al., 2021; Wang et al., 2020).
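For the univariate case, the variance-ratio definition of the ICC can be computed directly from a subjects-by-repetitions table using the classic one-way random-effects decomposition. The sketch below is a minimal illustration of that formula; the simulated data (10 subjects, 2 sessions, a within-subject standard deviation of 0.5) are assumed values for demonstration, not data from this article.

```python
import numpy as np

def icc_oneway(data):
    """One-way random-effects ICC: between-subject variance over total variance.
    `data` is an (n_subjects, n_repetitions) array of a univariate measure."""
    n, k = data.shape
    subject_means = data.mean(axis=1)
    # Mean squares from the one-way ANOVA decomposition
    ms_between = k * np.sum((subject_means - data.mean()) ** 2) / (n - 1)
    ms_within = np.sum((data - subject_means[:, None]) ** 2) / (n * (k - 1))
    # Variance components implied by the mean squares
    var_within = ms_within
    var_between = max((ms_between - ms_within) / k, 0.0)  # truncated to avoid a negative ICC
    return var_between / (var_between + var_within)

# Illustrative data: 10 subjects, 2 sessions, within-subject SD of 0.5 (assumed values)
rng = np.random.default_rng(2)
truth = rng.normal(size=(10, 1))
data = truth + rng.normal(scale=0.5, size=(10, 2))
print(round(icc_oneway(data), 2))
```

In this formulation, the ICC approaches 1 as between-subject variance dominates and falls toward 0 as within-subject variance grows, mirroring the verbal definition above.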

With many reliability indices emerging to take on the challenge of measuring individual differences and calculating reliability, understanding the advantages and limitations of each can aid in selecting the appropriate measure for a given application. First, we draw attention to the parametric assumptions for a given data set (e.g., Gaussian distribution, homogeneous variance). In cases where the data are not Gaussian distributed, ICC and dbICC can be misleading. Additionally, in some cases, which often occur in neuroimaging studies, ICC and dbICC estimates are negative because the difference between two mean-square terms in the computational formula is negative (Chen et al., 2018); although not inherently problematic, in practice negative ICC/dbICC values are not interpretable and can be avoided in some more recent ICC formulations (e.g., the restricted maximum likelihood method). Second, each index provides different sensitivity. High discriminability is required for fingerprinting (i.e., identification), but not vice versa; in some conditions, fingerprinting and discriminability will diverge, with fingerprinting potentially leading to the wrong conclusions (Milham et al., 2021). We previously illustrated such a situation, in which individual differences are relatively discriminable yet the fingerprinting score is zero; thus, fingerprinting may mislead its users with respect to the potential for optimization and eventual usage (Supplementary Materials; Milham et al., 2021).
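The divergence between the two indices is easy to reproduce in a toy setting. The sketch below implements a nearest-neighbor identification rate and a simplified two-session variant of discriminability (not the exact formulation of Bridgeford et al., 2021), and applies them to an assumed configuration in which a systematic shift between sessions defeats nearest-neighbor matching even though each subject’s repetitions remain closer to each other than to most other subjects.

```python
import numpy as np
from scipy.spatial.distance import cdist

def fingerprint_rate(sess1, sess2):
    """Identification rate: fraction of subjects whose nearest profile in the other
    session belongs to themselves (averaged over both matching directions)."""
    d = cdist(sess1, sess2)
    idx = np.arange(len(sess1))
    return ((d.argmin(axis=1) == idx).mean() + (d.argmin(axis=0) == idx).mean()) / 2

def discriminability_two_session(sess1, sess2):
    """Simplified two-session discriminability: average fraction of between-subject
    distances that exceed the corresponding within-subject distance."""
    d = cdist(sess1, sess2)
    within = np.diag(d)
    fractions = []
    for i in range(len(d)):
        between = np.concatenate([np.delete(d[i], i), np.delete(d[:, i], i)])
        fractions.append((between > within[i]).mean())
    return float(np.mean(fractions))

# Assumed toy configuration: a systematic session-to-session shift (0.6) large enough to
# defeat nearest-neighbor matching, yet small relative to most between-subject distances
# (subjects spaced 1 apart).
sess1 = np.arange(10, dtype=float).reshape(-1, 1)
sess2 = sess1 + 0.6
print(fingerprint_rate(sess1, sess2))              # low identification rate
print(discriminability_two_session(sess1, sess2))  # high discriminability
```

Here the identification rate is low while discriminability remains high, illustrating why a fingerprinting score near zero does not by itself rule out reliable individual differences.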

An important caveat to be aware of when using multivariate indices of reliability is that they do not guarantee the reliability of each univariate feature. It is well established that reliability differs between regions and connections in the brain (Jiang et al., 2023; Noble et al., 2019). Some features may contribute more to the detectability of individual differences than others (Hong et al., 2020), and it is possible that some features differentiate a subset of individuals but not all. As such, the reliability of a multivariate profile should not be assumed to reflect that of its individual features. A sensitivity test (e.g., a leave-one-out analysis) examining the contribution of each individual feature, or the univariate reliability of each feature, should be considered.
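As one way such a sensitivity test might look, the sketch below removes one feature at a time from a simulated multivariate profile and records how the identification rate changes. The profile dimensionality, the noise levels (including one deliberately unreliable feature), and the sample size are assumptions for illustration only.

```python
import numpy as np
from scipy.spatial.distance import cdist

def identification_rate(sess1, sess2):
    """Nearest-neighbor matching of session-1 profiles to session-2 profiles."""
    d = cdist(sess1, sess2)
    return (d.argmin(axis=1) == np.arange(len(sess1))).mean()

rng = np.random.default_rng(3)
n_subj, n_feat = 30, 8  # assumed sample size and profile dimensionality
truth = rng.normal(size=(n_subj, n_feat))
# Assumed feature-wise noise: the last feature is far noisier (less reliable) than the rest
noise_sd = np.r_[np.full(n_feat - 1, 0.3), 3.0]
sess1 = truth + rng.normal(scale=noise_sd, size=(n_subj, n_feat))
sess2 = truth + rng.normal(scale=noise_sd, size=(n_subj, n_feat))

baseline = identification_rate(sess1, sess2)
for f in range(n_feat):
    keep = np.delete(np.arange(n_feat), f)  # leave feature f out
    change = identification_rate(sess1[:, keep], sess2[:, keep]) - baseline
    print(f"feature {f}: change in identification rate when removed = {change:+.2f}")
```

Removing the noisy feature tends to raise the identification rate, whereas removing an informative feature tends to lower it, flagging which features drive or dilute the multivariate reliability.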

Additionally, while reliability is a prerequisite for measurements of interest, it is not sufficient for establishing validity or implying predictive power for other measurements. Reliability ensures that a brain or behavioral feature can be consistently measured within an individual across specific time scales. However, consistency alone does not inherently validate the feature as an accurate representation of the intended construct. A reliable feature may still lack relevance or utility in explaining individual differences in behavior or predicting specific outcomes. This raises a question: why not focus exclusively on predictive power in brain–behavior association studies, disregarding reliability? The answer lies in the fact that the confidence in observed associations between two measurements (x and y) is impacted by their reliability, as expressed by the attenuation equation $r_{\mathrm{true}(x,y)} = r_{\mathrm{observed}(x,y)} / \sqrt{\mathrm{reliability}_x \times \mathrm{reliability}_y}$ (Nikolaidis et al., 2022). High reliability strengthens confidence in the true relationship between variables of interest. Conversely, in cases of low predictive accuracy, knowing the reliability of the measurements can help determine whether the weak prediction reflects a truly low relationship or a possible artifact of measurement error and sampling variability. Therefore, establishing both reliability and validity is imperative, as this distinction is crucial for interpreting findings and understanding individual differences in biomarker discovery.
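As a small worked example of this correction (with assumed numbers, not values from the article):

```python
# Attenuation correction from classical test theory, with assumed example numbers.
r_observed = 0.21
reliability_x, reliability_y = 0.6, 0.8
r_true = r_observed / (reliability_x * reliability_y) ** 0.5
print(round(r_true, 2))  # ~0.30: low reliability attenuates the observed association
```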

When considering the challenges discussed above, it becomes evident that differentiating within-individual variance from the observed between-individual variance is crucial. Toward this end, studies of individual differences in brain and behavior would benefit from including repeated assessments in their design, at least for a subset of the collected sample, to allow for assessment of within-individual variation. Previous studies have highlighted that increasing the number of repetitions and the scan duration enhances reliability (Noble et al., 2017). Notably, acquiring and concatenating multiple shorter scans has shown a clear advantage over a single long scan within individuals for improving reliability (Cho et al., 2021). For brain–behavior prediction, sample size and scan time per individual appear to be broadly interchangeable for total quantities up to 20–30 min per individual, with prediction accuracy improving with the total amount of data following a logarithmic model (Ooi et al., 2024). Additionally, studies that assess variation across analytic methods and experimental protocols should be encouraged, to clarify the contributions of these sources of variation at both the within- and between-individual levels. In addition, changes in the brain at different time scales and variation across participant samples (e.g., education, socioeconomic status) need to be considered (e.g., included in the statistical model) in studies of individual differences. While there is an increasing focus on the standardization of measurements and procedures, enabling centralized assessments for researchers using standardized protocols, the feasibility and generalizability of this approach across population characteristics need to be established. This requires careful consideration to ensure that standardized approaches are applicable and meaningful across diverse participant demographics.

Regarding the quantification of individual differences and their sources of variation, we posit that standardizing methods for assessing reliability is as important as standardizing measurement protocols—a central focus of the past decade of work in the imaging community. We have recently developed ReX (Reliability eXplorer) as a tool for rapidly calculating, visualizing, and interrogating these variances, though advance consideration of data needs is critical to making such calculations possible (T. Xu et al., 2023). This is not to suggest that there can be only one measure of individual difference, but rather that there would be benefit in achieving consensus on the interpretation of these measures and on which may be optimal for differing situations, as well as agreement on their limitations.

Overall, individual difference research is increasingly highlighting the challenges of reliably measuring human brain function. In practice, within-individual variation is always embedded in the observed scores of interindividual differences. It is important to conceptualize reliability in terms of its component variances—particularly for studies of brain function, where the within-subject variance can never be zero and can be noise or meaningful variation of interest depending on the testing circumstances. Future work that studies sources of variation across experiments and analytic methods, as well as changes at different time scales within individuals, can facilitate the study of individual differences in brain and behavior in neuroscience and psychology.

The code used in the simulations for Figure 1 is available on GitHub: https://github.com/TingsterX/Reliability_Explorer/tree/main/simulation.

T.X. analyzed and wrote the original draft. G.K., X.-N.Z., J.T.V., M.P.M. reviewed and edited the manuscript.

The authors declare no competing interests.

This work is supported by gifts from Joseph P. Healey, Phyllis Green, and Randolph Cowen to the Child Mind Institute; National Institutes of Health funding (R24MH114806 to M.P.M.; RF1MH128696 to T.X.); additional grant support for J.T.V. from R01MH120482 (to M.P.M.) and funding from Microsoft Research; the STI 2030 major projects of the Brain Science and Brain-Inspired Intelligence Technology (2021ZD0200500 to X.-N.Z.); and the National Basic Science Data Center “Interdisciplinary Brain Database for In vivo Population Imaging” (ID-BRAIN to X.-N.Z.).

Supplementary material for this article is available with the online version here: https://doi.org/10.1162/imag_a_00430.

Bethlehem, R. A. I., Seidlitz, J., White, S. R., Vogel, J. W., Anderson, K. M., Adamson, C., Adler, S., Alexopoulos, G. S., Anagnostou, E., Areces-Gonzalez, A., Astle, D. E., Auyeung, B., Ayub, M., Ball, G., Baron-Cohen, S., Beare, R., Bedford, S. A., Benegal, V., Beyer, F., … Alexander-Bloch, A. F. (2022). Brain charts for the human lifespan. Nature, 604, 525–533. https://doi.org/10.1038/s41586-022-04554-y
Blautzik, J., Vetter, C., Peres, I., Gutyrchik, E., Keeser, D., Berman, A., Kirsch, V., Mueller, S., Pöppel, E., Reiser, M., Roenneberg, T., & Meindl, T. (2013). Classifying fMRI-derived resting-state connectivity patterns according to their daily rhythmicity. Neuroimage, 71, 298–306. https://doi.org/10.1016/j.neuroimage.2012.08.010
Botvinik-Nezer, R., Holzmeister, F., Camerer, C. F., Dreber, A., Huber, J., Johannesson, M., Kirchler, M., Iwanir, R., Mumford, J. A., Adcock, R. A., Avesani, P., Baczkowski, B. M., Bajracharya, A., Bakst, L., Ball, S., Barilari, M., Bault, N., Beaton, D., Beitner, J., … Schonberg, T. (2020). Variability in the analysis of a single neuroimaging dataset by many teams. Nature, 582, 84–88. https://doi.org/10.1101/843193
Brandmaier, A. M., Wenger, E., Bodammer, N. C., Kühn, S., Raz, N., & Lindenberger, U. (2018). Assessing reliability in neuroimaging research through intra-class effect decomposition (ICED). Elife, 7, e35718. https://doi.org/10.7554/eLife.35718
Bridgeford, E. W., Wang, S., Wang, Z., Xu, T., Craddock, C., Dey, J., Kiar, G., Gray-Roncal, W., Colantuoni, C., Douville, C., Noble, S., Priebe, C. E., Caffo, B., Milham, M., Zuo, X.-N., Consortium for Reliability and Reproducibility, & Vogelstein, J. T. (2021). Eliminating accidental deviations to minimize generalization error and maximize replicability: Applications in connectomics and genomics. PLoS Comput Biol, 17, e1009279. https://doi.org/10.1371/journal.pcbi.1009279
Chen, G., Taylor, P. A., Haller, S. P., Kircanski, K., Stoddard, J., Pine, D. S., Leibenluft, E., Brotman, M. A., & Cox, R. W. (2018). Intraclass correlation: Improved modeling approaches and applications for neuroimaging. Hum Brain Mapp, 39, 1187–1206. https://doi.org/10.1002/hbm.23909
Cho, J. W., Korchmaros, A., Vogelstein, J. T., Milham, M. P., & Xu, T. (2021). Impact of concatenating fMRI data on reliability for functional connectomics. Neuroimage, 226, 117549. https://doi.org/10.1016/j.neuroimage.2020.117549
Conley, S., O’Connell, M., Linsky, S., Moemeka, L., Darden, J. W., 4th, Gaiser, E. C., Jacoby, D., Yaggi, H., & Redeker, N. S. (2021). Evaluating recruitment strategies for a randomized clinical trial with heart failure patients. West J Nurs Res, 43, 785–790. https://doi.org/10.1177/0193945920970229
DeVellis, R. F. (2006). Classical test theory. Med Care, 44, S50–S59. https://doi.org/10.1097/01.mlr.0000245426.10853.30
Di, X., Wolfer, M., Kühn, S., Zhang, Z., & Biswal, B. B. (2021). Estimations of the weather effects on brain functions using functional MRI: A cautionary note. Hum Brain Mapp, 43(11), 3346–3356. https://doi.org/10.1002/hbm.25576
Finn, E. S., Scheinost, D., Finn, D. M., Shen, X., Papademetris, X., & Todd Constable, R. (2017). Can brain state be manipulated to emphasize individual differences in functional connectivity? NeuroImage, 160, 140–151. https://doi.org/10.1016/j.neuroimage.2017.03.064
Finn, E. S., Shen, X., Scheinost, D., Rosenberg, M. D., Huang, J., Chun, M. M., Papademetris, X., & Constable, R. T. (2015). Functional connectome fingerprinting: Identifying individuals using patterns of brain connectivity. Nat Neurosci, 18, 1664–1671. https://doi.org/10.1038/nn.4135
Fox, M. D., Snyder, A. Z., Vincent, J. L., Corbetta, M., Van Essen, D. C., & Raichle, M. E. (2005). The human brain is intrinsically organized into dynamic, anticorrelated functional networks. Proc Natl Acad Sci U S A, 102, 9673–9678. https://doi.org/10.1073/pnas.0504136102
Garrett, D. D., Samanez-Larkin, G. R., MacDonald, S. W. S., Lindenberger, U., McIntosh, A. R., & Grady, C. L. (2013). Moment-to-moment brain signal variability: A next frontier in human brain mapping? Neurosci Biobehav Rev, 37, 610–624. https://doi.org/10.1016/j.neubiorev.2013.02.015
Glatard, T., Lewis, L. B., Ferreira da Silva, R., Adalat, R., Beck, N., Lepage, C., Rioux, P., Rousseau, M.-E., Sherif, T., Deelman, E., Khalili-Mahani, N., & Evans, A. C. (2015). Reproducibility of neuroimaging analyses across operating systems. Front Neuroinform, 9, 12. https://doi.org/10.3389/fninf.2015.00012
Gratton, C., Laumann, T. O., Nielsen, A. N., Greene, D. J., Gordon, E. M., Gilmore, A. W., Nelson, S. M., Coalson, R. S., Snyder, A. Z., Schlaggar, B. L., Dosenbach, N. U. F., & Petersen, S. E. (2018). Functional brain networks are dominated by stable group and individual factors, not cognitive or daily variation. Neuron, 98, 439–452.e5. https://doi.org/10.1016/j.neuron.2018.03.035
Hackman, D. A., & Farah, M. J. (2009). Socioeconomic status and the developing brain. Trends Cogn Sci, 13, 65–73. https://doi.org/10.1016/j.tics.2008.11.003
Haddad, E., Pizzagalli, F., Zhu, A. H., Bhatt, R. R., Islam, T., Ba Gari, I., Dixon, D., Thomopoulos, S. I., Thompson, P. M., & Jahanshad, N. (2023). Multisite test-retest reliability and compatibility of brain metrics derived from FreeSurfer versions 7.1, 6.0, and 5.3. Hum Brain Mapp, 44, 1515–1532. https://doi.org/10.1002/hbm.26147
Hedges, E. P., Dimitrov, M., Zahid, U., Brito Vega, B., Si, S., Dickson, H., McGuire, P., Williams, S., Barker, G. J., & Kempton, M. J. (2022). Reliability of structural MRI measurements: The effects of scan session, head tilt, inter-scan interval, acquisition sequence, FreeSurfer version and processing stream. Neuroimage, 246, 118751. https://doi.org/10.1016/j.neuroimage.2021.118751
Hodkinson, D. J., O’Daly, O., Zunszain, P. A., Pariante, C. M., Lazurenko, V., Zelaya, F. O., Howard, M. A., & Williams, S. C. R. (2014). Circadian and homeostatic modulation of functional connectivity and regional cerebral blood flow in humans under normal entrained conditions. J Cereb Blood Flow Metab, 34, 1493–1499. https://doi.org/10.1038/jcbfm.2014.109
Hong, S.-J., Xu, T., Nikolaidis, A., Smallwood, J., Margulies, D. S., Bernhardt, B., Vogelstein, J., & Milham, M. P. (2020). Toward a connectivity gradient-based framework for reproducible biomarker discovery. Neuroimage, 223, 117322. https://doi.org/10.1016/j.neuroimage.2020.117322
Hsu, S., Poldrack, R., Ram, N., & Wagner, A. D. (2022). Observed correlations from cross-sectional individual differences research reflect both between-person and within-person correlations. PsyArXiv. https://doi.org/10.31234/osf.io/zq37h
Hu, F., Chen, A. A., Horng, H., Bashyam, V., Davatzikos, C., Alexander-Bloch, A., Li, M., Shou, H., Satterthwaite, T. D., Yu, M., & Shinohara, R. T. (2023). Image harmonization: A review of statistical and deep learning methods for removing batch effects and evaluation metrics for effective harmonization. Neuroimage, 274, 120125. https://doi.org/10.1016/j.neuroimage.2023.120125
Jiang, C., Betzel, R., He, Y., & Zuo, X.-N. (2023). Toward reliable network neuroscience for mapping individual differences. Network Neuroscience, 7(3), 1080–1108. https://doi.org/10.1162/netn_a_00315
Kiar, G., Chatelain, Y., De, O. C. P., Petit, E., Rokem, A., Varoquaux, G., Misic, B., Evans, A. C., & Glatard, T. (2021). Numerical uncertainty in analytical pipelines lead to impactful variability in brain networks. PLoS One, 16(11), e0250755. https://doi.org/10.1371/journal.pone.0250755
Kolb, B., Forgie, M., Gibb, R., Gorny, G., & Rowntree, S. (1998). Age, experience and the changing brain. Neurosci Biobehav Rev, 22, 143–159. https://doi.org/10.1016/s0149-7634(97)00008-0
Laumann, T. O., Snyder, A. Z., Mitra, A., Gordon, E. M., Gratton, C., Adeyemo, B., Gilmore, A. W., Nelson, S. M., Berg, J. J., Greene, D. J., McCarthy, J. E., Tagliazucchi, E., Laufs, H., Schlaggar, B. L., Dosenbach, N. U. F., & Petersen, S. E. (2017). On the stability of BOLD fMRI correlations. Cereb Cortex, 27, 4719–4732. https://doi.org/10.1093/cercor/bhw265
Li, X., Esper, N. B., Ai, L., Giavasis, S., Jin, H., Feczko, E., Xu, T., Clucas, J., Franco, A., Heinsfeld, A. S., Adebimpe, A., Vogelstein, J. T., Yan, C.-G., Esteban, O., Poldrack, R. A., Craddock, C., Fair, D., Satterthwaite, T., Kiar, G., & Milham, M. P. (2024). Moving beyond processing and analysis-related variation in neuroscience. Nat Hum Behav, 8(10), 2003–2017. https://doi.org/10.1101/2021.12.01.470790
McGraw, K. O., & Wong, S. P. (1996). Forming inferences about some intraclass correlation coefficients. Psychol Methods, 1, 30–46. https://doi.org/10.1037//1082-989x.1.1.30
Milham, M. P., Vogelstein, J., & Xu, T. (2021). Removing the reliability bottleneck in functional magnetic resonance imaging research to achieve clinical utility. JAMA Psychiatry, 78, 587–588. https://doi.org/10.1001/jamapsychiatry.2020.4272
Nichols, T. E., Das, S., Eickhoff, S. B., Evans, A. C., Glatard, T., Hanke, M., Kriegeskorte, N., Milham, M. P., Poldrack, R. A., Poline, J.-B., Proal, E., Thirion, B., Van Essen, D. C., White, T., & Yeo, B. T. T. (2017). Best practices in data analysis and sharing in neuroimaging using MRI. Nat Neurosci, 20, 299–303. https://doi.org/10.1038/nn.4500
Nikolaidis, A., Chen, A. A., He, X., Shinohara, R., Vogelstein, J., Milham, M., & Shou, H. (2022). Suboptimal phenotypic reliability impedes reproducible human neuroscience. bioRxiv. https://doi.org/10.1101/2022.07.22.501193
Noble, S., Scheinost, D., & Constable, R. T. (2019). A decade of test-retest reliability of functional connectivity: A systematic review and meta-analysis. Neuroimage, 203, 116157. https://doi.org/10.1016/j.neuroimage.2019.116157
Noble, S., Spann, M. N., Tokoglu, F., Shen, X., Constable, R. T., & Scheinost, D. (2017). Influences on the test–retest reliability of functional connectivity MRI and its relationship with behavioral utility. Cereb Cortex, 27, 5415–5429. https://doi.org/10.1093/cercor/bhx230
Ooi, L. Q. R., Orban, C., Zhang, S., Nichols, T. E., Tan, T. W. K., Kong, R., Marek, S., Dosenbach, N. U. F., Laumann, T., Gordon, E. M., Yap, K. H., Ji, F., Chong, J. S. X., Chen, C., An, L., Franzmeier, N., Roemer, S. N., Hu, Q., Ren, J., Liu, H., Chopra, S., Cocuzza, C. V., Baker, J. T., Zhou, J. H., Bzdok, D., Eickhoff, S. B., Holmes, A. J., & Yeo, B. T. T. (2024). MRI economics: Balancing sample size and scan duration in brain wide association studies. bioRxiv. https://doi.org/10.1101/2024.02.16.580448
Poldrack, R. A., Laumann, T. O., Koyejo, O., Gregory, B., Hover, A., Chen, M.-Y., Gorgolewski, K. J., Luci, J., Joo, S. J., Boyd, R. L., Hunicke-Smith, S., Simpson, Z. B., Caven, T., Sochat, V., Shine, J. M., Gordon, E., Snyder, A. Z., Adeyemo, B., Petersen, S. E., … Mumford, J. A. (2015). Long-term neural and physiological phenotyping of a single human. Nat Commun, 6, 8885. https://doi.org/10.1038/ncomms9885
Schilling, K. G., Rheault, F., Petit, L., Hansen, C. B., Nath, V., Yeh, F.-C., Girard, G., Barakovic, M., Rafael-Patino, J., Yu, T., Fischi-Gomez, E., Pizzolato, M., Ocampo-Pineda, M., Schiavi, S., Canales-Rodríguez, E. J., Daducci, A., Granziera, C., Innocenti, G., Thiran, J.-P., … Descoteaux, M. (2021). Tractography dissection variability: What happens when 42 groups dissect 14 white matter bundles on the same dataset? Neuroimage, 243, 118502. https://doi.org/10.1016/j.neuroimage.2021.118502
Seghier, M. L., & Price, C. J. (2018). Interpreting and utilising intersubject variability in brain function. Trends Cogn Sci, 22, 517–530. https://doi.org/10.1016/j.tics.2018.03.003
Shine, J. M., Koyejo, O., & Poldrack, R. A. (2016). Temporal metastates are associated with differential patterns of time-resolved connectivity, network topology, and attention. Proc Natl Acad Sci U S A, 113, 9888–9891. https://doi.org/10.1073/pnas.1604898113
Shou, H., Eloyan, A., Lee, S., Zipunnikov, V., Crainiceanu, A. N., Nebel, N. B., Caffo, B., Lindquist, M. A., & Crainiceanu, C. M. (2013). Quantifying the reliability of image replication studies: The image intraclass correlation coefficient (I2C2). Cogn Affect Behav Neurosci, 13, 714–724. https://doi.org/10.3758/s13415-013-0196-0
Smallwood, J., & Schooler, J. W. (2015). The science of mind wandering: Empirically navigating the stream of consciousness. Annu Rev Psychol, 66, 487–518. https://doi.org/10.1146/annurev-psych-010814-015331
Tooley, U. A., Bassett, D. S., & Mackey, A. P. (2021). Environmental influences on the pace of brain development. Nat Rev Neurosci, 22, 372–384. https://doi.org/10.1038/s41583-021-00457-5
Vanderwal, T., Eilbott, J., Finn, E. S., Cameron Craddock, R., Turnbull, A., & Xavier Castellanos, F. (2017). Individual differences in functional connectivity during naturalistic viewing conditions. NeuroImage, 157, 521–530. https://doi.org/10.1016/j.neuroimage.2017.06.027
Vatansever, D., Menon, D. K., & Stamatakis, E. A. (2017). Default mode contributions to automated information processing. Proc Natl Acad Sci U S A, 114, 12821–12826. https://doi.org/10.1073/pnas.1710521114
Wang, Z., Bridgeford, E., Wang, S., Vogelstein, J. T., & Caffo, B. (2020). Statistical analysis of data repeatability measures. arXiv [stat.AP]. https://doi.org/10.1111/insr.12591
Xu, M., Reiss, P. T., & Cribben, I. (2021). Generalized reliability based on distances. Biometrics, 77, 258–270. https://doi.org/10.1111/biom.13287
Xu, T., Kiar, G., Cho, J. W., Bridgeford, E. W., Nikolaidis, A., Vogelstein, J. T., & Milham, M. P. (2023). ReX: An integrative tool for quantifying and optimizing measurement reliability for the study of individual differences. Nat Methods, 20, 1025–1028. https://doi.org/10.1038/s41592-023-01901-3
Yan, C.-G., Craddock, R. C., Zuo, X.-N., Zang, Y.-F., & Milham, M. P. (2013). Standardizing the intrinsic brain: Towards robust measurement of inter-individual variation in 1000 functional connectomes. Neuroimage, 80, 246–262. https://doi.org/10.1016/j.neuroimage.2013.04.081
Yang, Z., Telesford, Q. K., Franco, A. R., Lim, R., Gu, S., Xu, T., Ai, L., Castellanos, F. X., Yan, C.-G., Colcombe, S., & Milham, M. P. (2020). Measurement reliability for individual differences in multilayer network dynamics: Cautions and considerations. NeuroImage, 225, 117489. https://doi.org/10.1016/j.neuroimage.2020.117489
Yoo, K., Rosenberg, M. D., Noble, S., Scheinost, D., Constable, R. T., & Chun, M. M. (2019). Multivariate approaches improve the reliability and validity of functional connectivity and prediction of individual behaviors. Neuroimage, 197, 212–223. https://doi.org/10.1016/j.neuroimage.2019.04.060
Zuo, X.-N., Xu, T., & Milham, M. P. (2019). Harnessing reliability for neuroscience research. Nat Hum Behav, 3, 768–771. https://doi.org/10.1038/s41562-019-0655-x
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.
