Abstract
In systems neuroscience, most models posit that brain regions communicate information under constraints of efficiency. Yet, evidence for efficient communication in structural brain networks characterized by hierarchical organization and highly connected hubs remains sparse. The principle of efficient coding proposes that the brain transmits maximal information in a metabolically economical or compressed form to improve future behavior. To determine how structural connectivity supports efficient coding, we develop a theory specifying minimum rates of message transmission between brain regions to achieve an expected fidelity, and we test five predictions from the theory based on random walk communication dynamics. In doing so, we introduce the metric of compression efficiency, which quantifies the trade-off between lossy compression and transmission fidelity in structural networks. In a large sample of youth (n = 1,042; age 8–23 years), we analyze structural networks derived from diffusion-weighted imaging and metabolic expenditure operationalized using cerebral blood flow. We show that structural networks strike compression efficiency trade-offs consistent with theoretical predictions. We find that compression efficiency prioritizes fidelity with development, heightens when metabolic resources and myelination guide communication, explains advantages of hierarchical organization, links higher input fidelity to disproportionate areal expansion, and shows that hubs integrate information by lossy compression. Lastly, compression efficiency is predictive of behavior—beyond the conventional network efficiency metric—for cognitive domains including executive function, memory, complex reasoning, and social cognition. Our findings elucidate how macroscale connectivity supports efficient coding and serve to foreground communication processes that utilize random walk dynamics constrained by network connectivity.
Author Summary
Macroscale communication between interconnected brain regions underpins most aspects of brain function and incurs substantial metabolic cost. Understanding efficient and behaviorally meaningful information transmission dependent on structural connectivity has remained challenging. We validate a model of communication dynamics atop the macroscale human structural connectome, finding that structural networks support dynamics that strike a balance between information transmission fidelity and lossy compression. Notably, this balance is predictive of behavior and explanatory of biology. In addition to challenging and reformulating the currently held view that communication occurs by routing dynamics along metabolically efficient direct anatomical pathways, our results suggest that connectome architecture and behavioral demands yield communication dynamics that accord to neurobiological and information theoretical principles of efficient coding and lossy compression.
INTRODUCTION
The principle of compensation states that “to spend on one side, nature is forced to economize on the other side” (West-Eberhard, 2003). In the economics of brain connectomics, natural selection optimizes network architecture for versatility, resilience, and efficiency under constraints of metabolism, materials, space, and time (Bullmore & Sporns, 2012; Laughlin, 2001; West-Eberhard, 2003). Brain networks—composed of nodes representing cortical regions and edges representing white matter tracts—strike evolutionary compromises between costs and adaptations (Avena-Koenigsberger, Goñi, Solé, & Sporns, 2015; Buckner & Krienen, 2013; Laughlin, 2001; Reardon et al., 2018; West-Eberhard, 2003; Whitaker et al., 2016). Further, network disruptions may contribute to the development of neuropsychiatric disorders (Crossley et al., 2014; Di Martino et al., 2014; Gollo et al., 2018; Kaczkurkin et al., 2018). Because the limits of computations are intertwined with the limits of communication between brain regions (Cover, 1999), to understand how the brain efficiently balances resource constraints with pressures of information processing, one must begin with models of information transmission in brain networks.
Principles of neurotransmission established at the cellular level suggest that biophysical constraints on information processing may apply to the macroscopic levels of brain regions and networks (Laughlin, 2001; Levy & Baxter, 1996; Sterling & Laughlin, 2015). The principle of efficient coding proposes that the brain transmits maximal information in a metabolically economical or compressed form to improve future behavior (Chalk, Marre, & Tkačik, 2018). Efficient coding at the macroscale offers a parsimonious principle of compression characterizing the dimensionality of neural representations (Mack, Preston, & Love, 2020; Shine et al., 2019; Stringer, Pachitariu, Steinmetz, Carandini, & Harris, 2019; Tang et al., 2019), as well as a parsimonious principle of transmission characterizing a spectrum of network communication mechanisms (Avena-Koenigsberger et al., 2015, 2017; Avena-Koenigsberger, Misic, & Sporns, 2018; Bullmore & Sporns, 2012; Goñi et al., 2013; Goñi et al., 2014; Mišić et al., 2015). However, it remains incompletely understood how this principle generalizes from cells and sensory systems to the macroscale connectome (Chalk et al., 2018; Sterling & Laughlin, 2015).
An unexplored link between the efficient coding of compressed transmissions and macroscale brain network communication dynamics is rate-distortion theory, a major branch of information theory that establishes the mathematical foundations of lossy data compression for any communication channel (Shannon, 1959). Rate-distortion theory formalizes the link between compression and communication by determining the minimum amount of information that a source should transmit (the rate) for a target to approximately receive the input signal without exceeding an expected amount of noise (the distortion) (Shannon, 1959). Lossy data compression reduces the amount of information transmitted (the rate) at the cost of some loss of data fidelity (the distortion). Using a rate-distortion model, we sought to explain how the macroscale connectome supports efficient coding from minimal assumptions.
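For reference, the rate-distortion function can be written in its standard textbook form. The notation here is assumed for illustration and is not taken from this paper: X is the source signal, X̂ is the signal reconstructed at the target, d is a distortion measure, and D is the tolerated expected distortion.

```latex
R(D) \;=\; \min_{\,p(\hat{x} \mid x)\,:\; \mathbb{E}\left[ d(X, \hat{X}) \right] \le D} \; I(X; \hat{X})
```

Here I(X; X̂) denotes the mutual information between source and reconstruction. R(D) is nonincreasing in D, so tolerating more distortion never requires a higher rate.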
We modeled information transmission as the passing of stochastic messages in parallel along the wiring of the human connectome (Figure 1A). More precisely, we modeled information transmission using a repetition code (Barlow, 1961; Cover, 1999). A repetition code uses redundancy—here by sending multiple copies of a message—to overcome errors in communication arising from the stochasticity of neural processes (Cover, 1999; Sterling & Laughlin, 2015). A natural trade-off emerges between the redundancy and efficiency of a message: while redundant messages are more robust to errors in transmission, they also incur greater cost (Barlow, 1961). Thus, given an allowed error rate (or, equivalently, an expected fidelity), maximizing the efficiency of information transmission requires minimizing the redundancy of messages (Cover, 1999). By quantifying the minimal number of repeated messages needed to achieve a given fidelity, questions of connectome computation and communication can be formulated as a tractable mathematical problem using stochastic processes and redundancy reduction (Avena-Koenigsberger et al., 2018; Sims, 2018; Sterling & Laughlin, 2015).
A parsimonious model of information transmission in connectomes emerges naturally from two key assumptions. The first assumption is that stochastic transmission entails an economy of discrete impulses where the immediate future state only depends on the current state (Barlow, 1961; Sterling & Laughlin, 2015). Mathematically, this assumption casts transmission as a linear process, or a random walk, wherein message copies (identical random walkers) propagate along structural connections with probabilities proportional to the connection’s microstructural integrity (Avena-Koenigsberger et al., 2018; Fornito, Zalesky, & Bullmore, 2016). Biologically, macroscale random walk models are supported by their ability to predict the transsynaptic spread of pathogens (Henderson et al., 2019; Raj et al., 2015; Zheng et al., 2019), as well as the directionality and spatial distribution of neural dynamics from structural connectivity (Abdelnour, Dayan, Devinsky, Thesen, & Raj, 2018; Goñi et al., 2014; Paquola et al., 2020; Seguin, Razi, & Zalesky, 2019). The second assumption is that the impulse can lose information but never generate additional information over successive steps of propagation (Amico et al., 2021). Mathematically, this assumption represents the data processing inequality, which states that a random walker can only lose (and never gain) information about an information source (Cover, 1999). Biologically, the assumption is supported by increasing temporal delay, signal mixing, and signal decay introduced by longer paths (Murray et al., 2014; Sterling & Laughlin, 2015).
Combining these two assumptions, if packets of information propagate along structural pathways and information can only be lost with each step, then the shortest pathway between two brain regions yields an upper bound on the fidelity with which they can communicate. This key conclusion allows one to formulate the probability that a message propagates along the shortest path as an effective fidelity for the communication between two regions, thereby operationalizing the notion of distortion. Moreover, by modeling messages as random walkers, one can operationalize the notion of rate by computing the number of messages that must be sent to ensure that at least one transmits along the shortest path, that is, to ensure that at least one message reaches a specified receiver with maximum fidelity (Goñi et al., 2013). We applied our model to 1,042 youth (aged 8–23 years) from the Philadelphia Neurodevelopmental Cohort who underwent diffusion-weighted imaging (DWI; see Supporting Information Figure S1) (Satterthwaite, Elliott, et al., 2014). To operationalize metabolic expenditure, we used arterial-spin labeling (ASL) MRI, which measures cerebral blood flow (CBF) and is correlated with glucose expenditure (Gur et al., 2008; Vaishnavi et al., 2010).
To evaluate the validity of the efficient coding model, we assessed five published predictions of any communication system adhering to rate-distortion theory, which we adapted to connectomes and distinguished from alternative explanations of brain network communication dynamics (Figure 1B) (Goñi et al., 2013; Goñi et al., 2014; Marzen & DeDeo, 2017; Sims, 2018; van den Heuvel et al., 2012). First, information transmission should produce a characteristic rate-distortion gradient in biological and artificial networks, where exponentially increasing information rates are required to minimize signal distortion. Second, transmission efficiency should improve with manipulations of the communication system designed to facilitate signal propagation, where information costs decrease when randomly walking messages are biased with regional differences in metabolic rates and intracortical myelin. Third, the information rate should vary as a function of the costs of error, with discounts when costs are low and premiums when costs are high. Fourth, brain network complexity should flexibly support communication regimes of varying fidelity, where a high-fidelity regime predicts information rates that monotonically increase as the network grows more complex, and a low-fidelity regime predicts asymptotic information rates indicative of lossy compression. Fifth and finally, structural hubs should integrate incoming signals to efficiently broadcast information, where hubs (compared to other brain regions) have more compressed input rates and higher transmission rates for equivalent input–output fidelity. As described below, this model advances the current understanding of how information processing is associated with behaviors in a range of cognitive domains, subject to constraints on metabolic resources and network architecture.
RESULTS
Macroscale Efficient Coding Can Be Understood by Communication Processes of Random Walks But Not the Alternative Model of Shortest Path Routing
To understand how the brain balances the transmission rate of stochastic messages and signal distortion across different network architectures, we developed a model positing random walk communication dynamics atop the structural connectome. We tested this model by comparing it to the alternative hypothesis of shortest path routing (see Supporting Information Modeling/Math Notes). The random walk model and shortest path model are currently viewed as opposing extremes of a spectrum of communication processes (Avena-Koenigsberger et al., 2015, 2018; Goñi et al., 2013). In network neuroscience, shortest path routing anchors metrics of communication dynamics and information integration (Avena-Koenigsberger et al., 2018; Seguin, van den Heuvel, & Zalesky, 2018; Sporns, 2013). In cognitive neuroscience, the neural circuit related to behavior is commonly depicted as a subset of brain regions communicating their specialized information to each other across shortest and direct anatomical connections (Saleeba, Dempsey, Le, Goodchild, & McMullan, 2019). Although shortest path routing has acknowledged shortcomings as a model of communication dynamics (see Supporting Information Modeling/Math Notes), a key extenuating hypothesis of the model is reduced metabolic cost (Avena-Koenigsberger et al., 2018; Bullmore & Sporns, 2012). Yet, existing macroscale evidence for this model remains sparse (Várkuti et al., 2011).
We sought to determine how brain metabolism is associated with structural signatures of shortest path versus random walk models. To quantify the extent to which a person’s brain is structured to support shortest path routing, we used a statistical quantity known as the global efficiency (Latora & Marchiori, 2001), a commonly used measure of the average shortest path strength between all pairs of brain regions. Intuitively, global efficiency represents the ease of routing information by shortest paths and is proportional to the strength of shortest paths in a network (Sporns, 2013). As an operationalization of metabolic running cost, we considered CBF, which is correlated with glucose consumption (Gur et al., 2008; Vaishnavi et al., 2010). To test the spatial correlation between CBF and glucose consumption, we used a spatial permutation test that generates a null distribution of randomly rotated brain maps that preserves the spatial covariance structure of the original data; we denote the p value that reflects significance as pSPIN (Materials and Methods). We observed a linear association between CBF and glucose consumption (Figure 1C; Pearson’s correlation coefficient r = 0.47, df = 358, pSPIN < 0.001).
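The global efficiency described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes a symmetric, nonnegative weight matrix and converts connection strength to path length as 1/weight, a common convention that this section does not itself specify. The function name `global_efficiency` is hypothetical.

```python
import numpy as np

def global_efficiency(weights: np.ndarray) -> float:
    """Mean inverse shortest-path length over all ordered node pairs."""
    n = weights.shape[0]
    # Assumption: stronger connections are "shorter", so length = 1 / weight.
    with np.errstate(divide="ignore"):
        dist = np.where(weights > 0, 1.0 / weights, np.inf)
    np.fill_diagonal(dist, 0.0)
    # Floyd-Warshall: allow each node k in turn as an intermediate stop.
    for k in range(n):
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    off_diagonal = ~np.eye(n, dtype=bool)
    # Disconnected pairs have infinite length and contribute 1/inf = 0.
    inverse_lengths = 1.0 / dist[off_diagonal]
    return float(inverse_lengths.mean())

# Three regions in a chain: 0 - 1 - 2, all connections of unit strength.
chain = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])
chain_efficiency = global_efficiency(chain)
```

In the chain, the two end regions communicate only through the middle region, which lowers the efficiency below that of a fully connected network.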
Next, we tested if shortest path routing is linked to a decrease in metabolic expenditure, operationalized as a negative correlation between global efficiency and CBF (Bullmore & Sporns, 2012; Várkuti et al., 2011). Controlling for mean gray matter density, sex, mean degree, network density, and in-scanner motion, we found that the global efficiency was negatively correlated with CBF (r = −0.20, df = 1039, p < 0.001), consistent with prior reports (Várkuti et al., 2011). Notably, we did not regress out age in the previous analysis in order to align with the prior analysis that we aimed to replicate (Várkuti et al., 2011). We were also interested in determining whether development had any effect on the relationship between global efficiency and CBF. Our interest was justified by the positive correlation between age and global efficiency (Figure 2A; F = 50, estimated df = 3.46, p < 2 × 10⁻¹⁶) and the negative correlation between age and CBF (F = 69.22, estimated df = 3.74, p < 2 × 10⁻¹⁶). After controlling for age, we did not find a significant relationship between global efficiency and CBF (r = 0.01, df = 1039, p = 0.79), suggesting that collinearity with age drove the initial observed association between CBF and global efficiency. This null result undermines the claim that shortest path routing is associated with reduced metabolic expenditure.
Rather than being driven by shortest path routing, metabolic expenditure could instead be associated with communication by random walks. Each brain region can reach every other brain region via random walks along paths of five connections (Figure 2B). A random walker will likely not take the most efficient paths and must instead rely on the structural strengths of longer paths. Hence, if brain metabolism is associated with communication by random walks, then CBF should correlate with the strength of the white matter paths greater than length 5. To evaluate this prediction, we computed the strength of connections across different path distances using the matrix exponent of the structural network (see Materials and Methods and Supporting Information Figure S1B). We then tested the association between longer paths and metabolic expenditure across individuals (Figure 2C) and across regions (Figure 2D). In first considering variation across individuals, we found that the average node strengths for walks of length 2 to 15 were negatively correlated with CBF (t = −1.59 to −2.81, estimated model df = 11.45, FDR-corrected p < 0.05), after controlling for age, sex, age-by-sex interaction, average node degree, network density, and in-scanner motion (Figure 2C). The negative correlations between CBF and the average connection strengths suggest that the greater the connection integrity, the lower the metabolic expenditure. In next considering variation across brain regions, we found that the average node strengths for walks of length 2 to 15 were positively correlated with CBF (Spearman’s rank correlation coefficient ρ = 0.12 to 0.14, df = 358, FDR-corrected p < 0.05), after controlling for age, sex, age-by-sex interaction, average node degree, network density, and in-scanner motion (Figure 2D). The positive correlations between CBF and the average connection strengths suggest that brain regions with greater path strengths tended to have higher metabolic expenditure. 
See Supporting Information Figures S2–S4 for metabolic costs associated with other connectivity metrics supporting random walks. Together, we found no evidence of metabolic expenditure associated with shortest path routing, whereas the convergent findings of an association between CBF and random walk path strengths across individuals and regions provided some evidence that metabolic running costs were linked to random walk communication dynamics.
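The walk-length analysis above can be illustrated with a short sketch. It assumes that the "matrix exponent" refers to raising the structural matrix to the k-th power, so that entry (i, j) of the k-th power sums the weight products of all walks of length k from i to j. Any normalization applied in Materials and Methods is not reproduced here, and the function name `walk_strengths` is hypothetical.

```python
import numpy as np

def walk_strengths(weights: np.ndarray, max_len: int = 15) -> np.ndarray:
    """Row k-1 holds each node's summed strength over walks of length k."""
    n = weights.shape[0]
    power = np.eye(n)
    strengths = np.empty((max_len, n))
    for k in range(1, max_len + 1):
        power = power @ weights              # k-th matrix power of the network
        strengths[k - 1] = power.sum(axis=1)  # node strength at walk length k
    return strengths

# Toy network: three fully connected regions with unit weights.
triangle = np.ones((3, 3)) - np.eye(3)
triangle_strengths = walk_strengths(triangle, max_len=3)
```

For empirical networks the entries can grow or shrink quickly with k, so in practice the weights would likely need rescaling before strengths at different walk lengths are compared.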
Using the random walk model, we formalized a rate-distortion model of efficient coding by assuming that the minimal amount of noise is achieved by messages that randomly walk along shortest paths (Figure 3A–D). We defined rate as the number of random walkers per transmission, and distortion as the probability of the random walkers not taking the shortest path. We evaluated the validity of the redundancy reduction implementation of efficient coding (Barlow, 1961). Reducing redundancy in repetition coding is equivalent to minimizing the number of random walkers (Barlow, 1961; Cover, 1999; Shannon, 1959). To understand how the brain balances information rate and distortion, we measured the number of random walkers that are required for at least one to randomly walk along the shortest path to a target cortical region, with an expected probability (Materials and Methods). This measure of random walk dynamics is based on a prior metric (Goñi et al., 2013). The number of random walkers can be used to calculate the transmission length of a neural message (in units of bits) or the information rate (in units of bits per second; see Materials and Methods and Supporting Information Figure S3). To evaluate the roles of random walk dynamics and rate-distortion theory in the brain, we assessed five previously published predictions of rate-distortion theory and information processing (Figure 1B) (Marzen & DeDeo, 2017; Sims, 2018; van den Heuvel et al., 2012).
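The walker count described above can be sketched under explicit assumptions that are not confirmed by this section: transition probabilities are edge weights normalized by node strength, the shortest path is found on edge lengths 1/weight, and walkers are independent. If a single walker follows the shortest path with probability p, then r walkers give a success probability of 1 − (1 − p)^r, and solving for a target fidelity η yields r = log(1 − η) / log(1 − p). The function names are hypothetical.

```python
import heapq
import math
import numpy as np

def shortest_path(weights, source, target):
    """Dijkstra on edge lengths 1/weight; returns the node sequence."""
    n = weights.shape[0]
    dist = [math.inf] * n
    prev = [None] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v in np.flatnonzero(weights[u]):
            v = int(v)
            candidate = d + 1.0 / float(weights[u, v])
            if candidate < dist[v]:
                dist[v], prev[v] = candidate, u
                heapq.heappush(heap, (candidate, v))
    if math.isinf(dist[target]):
        raise ValueError("target is unreachable from source")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def walkers_required(weights, source, target, fidelity):
    """Walkers needed so at least one follows the shortest path
    with probability >= fidelity (0 < fidelity < 1)."""
    transition = weights / weights.sum(axis=1, keepdims=True)
    path = shortest_path(weights, source, target)
    p_path = math.prod(float(transition[u, v]) for u, v in zip(path, path[1:]))
    if p_path >= 1.0:
        return 1.0  # a single walker always takes the shortest path
    return math.log(1.0 - fidelity) / math.log(1.0 - p_path)

triangle = np.ones((3, 3)) - np.eye(3)
walkers = walkers_required(triangle, source=0, target=1, fidelity=0.75)
```

The result is left as a real number; rounding up to a whole number of walkers would be a further modeling choice.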
Hypothesis 1: Rate-Distortion Gradient
The first prediction of rate-distortion theory is that communication systems including both brain networks and artificial random networks should produce an information rate that is an exponential function of distortion because biological and engineered systems are governed by the same information-theoretic trade-offs (Sims, 2018). To test this prediction, we computed the average number of random walkers over all nodes in the network for a given individual, with the probability of randomly walking along the shortest path ranging from 10% to 99.9% (Figure 3D). We compared the number of random walkers required of structural connections in the brain with that of random walkers required of connections in random Erdős-Rényi networks, which have larger probabilities of shortest path communication than other canonical random networks (Goñi et al., 2013; Latora & Marchiori, 2001). Hence, Erdős-Rényi networks serve as an optimal benchmark for the efficiency of random walk communication and shortest path routing (Materials and Methods). For each individual network, the information-theoretic trade-off between information rate and signal distortion was defined by a rate-distortion gradient (Figure 3E). The gradient shows that distortion increases as the information rate decreases, which is the hallmark feature of lossy compression. Next, we considered the extent to which the brain’s structural connectome prioritizes compression versus fidelity. We refer to this trade-off as the compression efficiency (Figure 3E), and define it as the slope of the rate-distortion gradient (Materials and Methods). With random walk communication dynamics, increased compression efficiency prioritizes lossy compression, while decreased compression efficiency prioritizes transmission fidelity.
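A rate-distortion gradient of the kind described above can be sketched for a single sender-receiver pair. Everything here is an illustrative assumption: the shortest-path probability `p_path = 0.05` is a made-up value, the rate follows from independent walkers as log(D) / log(1 − p) for distortion D, and the slope is taken from an ordinary least-squares line of rate against distortion. The exact regression and sign convention behind compression efficiency are defined in Materials and Methods and may differ.

```python
import numpy as np

def rate_distortion_curve(p_path, distortions):
    """Walkers needed at each distortion level D = 1 - fidelity."""
    distortions = np.asarray(distortions, dtype=float)
    return np.log(distortions) / np.log(1.0 - p_path)

def gradient_slope(distortions, rates):
    """Slope of the least-squares line through (distortion, rate) points."""
    slope, _intercept = np.polyfit(distortions, rates, 1)
    return float(slope)

# Distortion from 0.1% to 90%, i.e. fidelity from 99.9% down to 10%.
distortions = np.linspace(0.001, 0.90, 50)
rates = rate_distortion_curve(p_path=0.05, distortions=distortions)  # hypothetical p_path
slope = gradient_slope(distortions, rates)
```

The curve falls steeply at low distortion and flattens at high distortion, which is the qualitative shape the first prediction refers to. The study averages the walker count over all nodes of a network rather than using one pair.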
Consistent with the first prediction of rate-distortion theory, we observed an exponential gradient in every individual brain network and the Erdős-Rényi random networks. Furthermore, the random networks, which are composed of more short connections than empirical brain networks, required significantly fewer random walkers than the empirical brain networks (Figure 3D and Supporting Information Figure S3; F = 10 × 10⁵, df = 29120, p < 2 × 10⁻¹⁶), consistent with the intuition that a greater prevalence of short connections in the random network translates to greater likelihood of shortest path propagation (Goñi et al., 2013; Latora & Marchiori, 2001). Rate-distortion trade-offs varied as a function of age and sex, where compression efficiency (Figure 3E) was negatively correlated with age (F = 27.54, estimated df = 2.17, p < 0.001), suggesting that neurodevelopment places a premium on fidelity (Figure 3F). Compression efficiency was greater on average in females compared to males (t = 9.53, df = 996.82, p < 0.001). The data, therefore, indicate that random walk communication dynamics on biological brain networks differ from random walks on artificial networks, yet each accords well with the prediction of rate-distortion trade-offs governing all communication systems.
Hypothesis 2: Redundancy Reduction
The second prediction of rate-distortion theory is that manipulations to the physical communication system (the connectome) that are designed to facilitate information transmission will improve communication efficiency. The prediction stems from two key observations. First, between the rate-distortion gradients for structural connectomes and random networks (depicted in Figure 3D) exists a range of possible rate-distortion gradients produced by some other construction of communication networks (Shannon, 1959). Second, it is well known that brain signaling relies extensively on metabolic diffusion and devotes much of its metabolic resources to maintaining a chemical balance that supports neuron firing (Attwell & Laughlin, 2001; Sterling & Laughlin, 2015). At the longer distances of the connectome, myelin in the white matter and cerebral cortex supports the speed and efficiency of electrical signaling in subcortical fiber tracts and in cortico-cortical communication (Barbas & Rempel-Clower, 1997; Deco, Roland, & Hilgetag, 2014; Laughlin, 2001). Hence, modifying the connectome to bias random walk dynamics according to metabolic resources and myelin mimics biological investments in communication efficiency.
The second observation above leads to the hypothesis that including biological biases in random walk probabilities based on the strength of structural connections will improve the efficiency of information transmission. We hypothesized that biasing random walkers with metabolic resources and myelin would reduce the information rate required to communicate a message with a given fidelity compared to random walkers that propagate only by connectome topology. To test this hypothesis, we biased edge weights (Materials and Methods) representing structural connection strength by multiplying the edge weight by a bias term. Across pairs of connected brain regions, the bias term was either defined as the average, normalized metabolic rate using CBF or the average, normalized cortical myelin content using published maps of T2/T1w MRI measures with histological validation (Figure 4A) (Glasser et al., 2014). By modeling network communication dynamics, one can calculate directed patterns of transmission as inputs into (receiver) and outputs from (sender) brain regions (Seguin et al., 2019). We separately computed the send and receive compression efficiency of brain regions to better understand the biological relevance of transmitted information sent or received across the connectome (Figure 4B; Materials and Methods).
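The edge-weight biasing described above can be sketched as follows. This is a minimal illustration under assumptions not stated in this section: regional values (CBF or myelin) are normalized by their maximum, the bias for an edge is the mean of the two normalized regional values, and transition probabilities are the biased weights divided by node strength. The function names are hypothetical.

```python
import numpy as np

def bias_edge_weights(weights, regional_values):
    """Multiply each edge weight by the mean normalized value of its two regions."""
    values = np.asarray(regional_values, dtype=float)
    normalized = values / values.max()   # assumption: scale to (0, 1] by the maximum
    pair_mean = (normalized[:, None] + normalized[None, :]) / 2.0
    return weights * pair_mean

def transition_matrix(weights):
    """Row-normalize weights into random walk transition probabilities."""
    return weights / weights.sum(axis=1, keepdims=True)

# Three fully connected regions; region 2 has twice the resource level.
structural = np.ones((3, 3)) - np.eye(3)
biased = bias_edge_weights(structural, regional_values=[1.0, 1.0, 2.0])
biased_transitions = transition_matrix(biased)
```

In this toy case a walker leaving region 0 moves to region 2 with probability 0.6 instead of 0.5, so the bias steers walkers toward better-resourced regions without adding or removing connections.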
We found that brain regions that prioritized input fidelity and output compression tended to have greater myelin content (Figure 4B; sender r = 0.23, df = 358, pSPIN,Holm-Bonferroni = 0.02, receiver r = −0.14, df = 358, pSPIN,Holm-Bonferroni = 0.046), consistent with myelin’s function in neurotransmission efficiency and speed. For a channel communicating at 0.1% distortion (and across all distortions; Supporting Information Figure S6C), biasing structural edge weights by the biological properties of metabolic and electrical signaling resulted in more efficient communication by reducing the number of redundant random walkers required (Figure 4C; tmetabolic = 20.87, bootstrap 95% CI [18.72, 23.14], df = 1993.9; telectrical = 295.93, bootstrap 95% CI [281.19, 312.76], df = 1225.9, p < 0.001). While both metabolic and electrical signaling supported more efficient communication, electrical signaling was more efficient than metabolic signaling (Deco et al., 2014; Sterling & Laughlin, 2015). Compared to rewired null networks preserving the degree sequence (Supporting Information Figure S6D), structural topology and metabolic resources support communication that prioritizes fidelity (ttopological,degree-preserving(2074.4) = 121.02, p < 0.001; tmetabolic,degree-preserving(2025.8) = 87.78, p < 0.001), while myelination supports communication that prioritizes compression efficiency (telectrical,degree-preserving(1208.3) = −122.62, p < 0.001). The minimum number of random walkers required for distortion levels less than 60% was explained by the interaction of the distortion level with the type of biased random walk (see Figure 4C, F = 6 × 10⁵, df = 29120, pHolm-Bonferroni < 0.05). Together, these results support the prediction that biological investments can augment connectivity to support efficient communication, especially when transmission prioritizes fidelity.
Hypothesis 3: Error-Dependent Costs
The third prediction of rate-distortion theory is that the information rate should vary as a function of the costs of errors in communication systems that interact with their environment. If errors are more costly for networks operating at high fidelity, then we should observe an information rate surpassing the minimum predicted by rate-distortion theory. In contrast, if errors are less costly for networks operating at low fidelity, then we should observe no more than the minimum predicted information rate. In testing this prediction, we observed that brain networks commit more random walkers than required for very low levels of distortion, such as 0.1%, but allocate the predicted number of random walkers or fewer to guarantee levels of distortion between 2% and 60% (Figure 4D). Hence, the third prediction of rate-distortion theory was consistent with our observation of a premium placed on very low signal distortion and a discounted cost of greater distortion.
Hypothesis 4: Flexible Coding Regimes
The fourth prediction of rate-distortion theory proposes that communication systems, including the connectome, have distinct network properties that support information transmission in a flexibly high- or low-fidelity regime. With increasing information processing demands, a high-fidelity regime will continue to place a premium on accuracy, whereas a low-fidelity regime will tolerate noise in support of lossy compression. The operating regime depends on the behavioral demands of the environment, indicating the need for a flexible regime that can simultaneously support both high- and low-fidelity communication. In this section, we assess the fourth prediction of rate-distortion theory by testing the more precise hypotheses that large brain networks support communication in a high-fidelity regime, indirect pathways support a low-fidelity regime, and hierarchical organization supports a flexible regime. We explain and test each hypothesis in turn.
Large Networks Support a High-Fidelity Regime and Indirect Pathways Support a Low-Fidelity Regime
Rate-distortion theory predicts that, in a high-fidelity regime, the information rate will monotonically increase with the complexity of the communication system in order to continue to place a premium on accuracy (Figure 5A). To evaluate this prediction, we operationalized complexity as network size because size determines the number of possible states (or nodes) available to each random walker (Marzen & DeDeo, 2017). We re-parcellated each individual brain network at different spatial resolutions to generate brain and random null networks of five different sizes representing the original network. We compared the rate for brain networks to the rate for random networks of matched size. In testing this prediction, we observed that the minimum number of random walkers increased monotonically with network size, consistent with a high-fidelity regime, and at a rate different from that of random networks with matched sizes (Figure 5B). Larger brain networks support high-fidelity communication by placing a greater premium on accuracy than do larger random networks.
In a low-fidelity regime, by contrast, rate-distortion theory predicts that the information rate should plateau as a function of network complexity in order to tolerate noise in support of lossy compression. To evaluate this prediction, we operationalized network complexity supporting low-fidelity communication by using path transitivity. Path transitivity quantifies the number of indirect pathways along the shortest path that form a detour one connection longer than the direct step (Figure 5A, Materials and Methods). Along the shortest paths, the transitivity is the number of triangles formed by one edge in the shortest path and two connected edges composing an indirect pathway that exits and immediately returns to the shortest path. Operationalizing low fidelity with path transitivity stemmed from interpreting path transitivity using our model assumptions of random walk dynamics.
Applying our random walk model assumptions to path transitivity, if the shortest path represents the structure supporting highest fidelity because longer paths introduce information loss, then path transitivity’s indirect pathways, which are just one connection longer than the shortest path, are the next-best paths for fidelity. Thus, path transitivity quantifies indirect pathways that offer a random walker the best approximations of the highest fidelity path or the best lossy compression. To operationalize the complexity of the communication system contributing to low-fidelity transmission (better able to tolerate noise in support of lossy compression), we measured the number of nodes in the indirect pathways of path transitivity, which we termed shortest path complexity. In testing the prediction that information rate should plateau as a function of network complexity in low-fidelity regimes, we found that the number of random walkers plateaued nonlinearly as a function of shortest path complexity, consistent with low-fidelity communication (Figure 5C). Model selection criteria support the nonlinear form compared to a linear version of the same model (nonlinear AIC = 7902, linear AIC = 7915; nonlinear BIC = 7964, linear BIC = 7968). The nonlinear fit of these data suggests that path transitivity supports low-fidelity communication that is tolerant to noise.
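The triangle-counting idea behind path transitivity and shortest path complexity can be illustrated with a simplified sketch. It is not the normalized metric defined in Materials and Methods: it assumes a binary network without self-loops and a shortest path that has already been computed, and it simply counts, for each edge of the path, the nodes connected to both endpoints. The function name `detour_triangles` is hypothetical.

```python
import numpy as np

def detour_triangles(adjacency, path):
    """Count one-step detours along a path.

    Returns (number of triangles, number of distinct detour nodes).
    A detour node is connected to both endpoints of some edge on the path,
    so a walker can leave the path and return one step later.
    """
    adj = np.asarray(adjacency) > 0
    triangles = 0
    detour_nodes = set()
    for u, v in zip(path, path[1:]):
        common = np.flatnonzero(adj[u] & adj[v])  # neighbors shared by u and v
        triangles += int(common.size)
        detour_nodes.update(int(w) for w in common)
    return triangles, len(detour_nodes)

# Edges: 0-1, 1-2, 0-3, 1-3. The shortest path from 0 to 2 is 0 -> 1 -> 2,
# and node 3 offers a detour around the edge 0-1.
adjacency = np.array([[0, 1, 0, 1],
                      [1, 0, 1, 1],
                      [0, 1, 0, 0],
                      [1, 1, 0, 0]])
n_triangles, n_detour_nodes = detour_triangles(adjacency, path=[0, 1, 2])
```

The second returned value corresponds loosely to what the text calls shortest path complexity, the number of nodes in the indirect pathways.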
Hierarchical organization flexibly and efficiently supports both a high-fidelity and a low-fidelity regime.
Our findings suggest that high fidelity depends on network size while low fidelity depends on network transitivity. Prior work has shown that hierarchical organization supports the simultaneous presence of networks of large size and high transitivity (Ravasz & Barabási, 2003). Therefore, we hypothesized that if the brain network has hierarchical organization, then such organization may enable flexible switching between high- and low-fidelity regimes. Hierarchical organization, wherein submodules are nested into successively larger but less densely interconnected modules, is thought to support efficient spatial embedding as well as specialized information transfer (Figure 5D, left) (Bassett et al., 2010). We first sought to assess if brain networks exhibit hierarchical organization. Hierarchical organization imposes a strict scaling law between transitivity and degree; the slope of this relationship can be used to identify the presence of hierarchically modular organization in real networks. This strict scaling is distinctive of hierarchical networks because increasing the size of a generic nonhierarchical network containing highly connected hubs (degree) will tend to diminish transitivity (clustering); however, hierarchical organization is known to decouple the size and transitivity of a network, which allows each property to vary individually (Ravasz & Barabási, 2003). The characteristic scaling of reduced transitivity of brain regions with higher degree was present across the five brain network sizes (83 nodes: t = −15.31, bootstrap 95% CI [−19.37, −12.29], p < 0.001, R2 = 0.74; 129 nodes: t = −14.82, bootstrap 95% CI [−18.48, −11.93], p < 0.001, R2 = 0.63; 234 nodes: t = −16.96, bootstrap 95% CI [−20.43, −13.75], p < 0.001, R2 = 0.55; 463 nodes: t = −26.47, bootstrap 95% CI [−30.02, −23.16], p < 0.001, R2 = 0.60; 1015 nodes: t = −41.88, bootstrap 95% CI [−46.10, −38.11], p < 0.001, R2 = 0.63; F = 917.3, df = 1913, p < 0.001; Figure 5D, right). 
Hence, the connectome exhibits hierarchical organization.
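As an illustration of the scaling test, the sketch below computes each node's degree and clustering coefficient from a binarized network and fits the slope of clustering against degree. The log-log fit follows the form reported by Ravasz and Barabási (2003) and is an assumption here. The statistics reported above come from the study's own regression models, which this sketch does not reproduce.

```python
import numpy as np

def clustering_degree_slope(adj):
    """Slope of log(clustering) versus log(degree) for a binarized network.

    adj : (n, n) symmetric adjacency matrix; any positive weight is an edge.
    A negative slope indicates that high-degree nodes have lower clustering.
    """
    a = (np.asarray(adj) > 0).astype(float)
    np.fill_diagonal(a, 0.0)
    degree = a.sum(axis=1)
    # diag(A^3) counts closed walks of length 3; each triangle is counted twice.
    triangles = np.diag(a @ a @ a) / 2.0
    possible = degree * (degree - 1) / 2.0
    # Keep nodes with a defined, nonzero clustering coefficient (needed for logs).
    keep = (possible > 0) & (triangles > 0)
    clustering = triangles[keep] / possible[keep]
    slope, _intercept = np.polyfit(np.log(degree[keep]), np.log(clustering), 1)
    return slope
```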
We next assessed the hypothesis that hierarchical organization permits networks to have simultaneously large size and transitivity: two core network properties hypothesized to underlie high- and low-fidelity communication regimes (Ravasz & Barabási, 2003). Hence, we sought to test whether hierarchical organization exhibits the hallmarks of both high- and low-fidelity regimes. If hierarchical organization supports a high-fidelity regime, then the information rate will monotonically increase with the complexity of the communication system. In testing this hypothesis, we found that the scaling characteristic of hierarchical organization in brain networks was associated with monotonically greater information rate (F = 2002, bootstrap 95% CI [1946.6, 2061.4], df = 13389, p < 0.001, R2 = 0.62, bootstrap 95% CI [0.62, 0.63], Figure 5E). Considering the slope of the increasing rate, networks with greater hierarchical organization supported high fidelity communication more efficiently (t = 32.4, bootstrap 95% CI [29.96, 34.79], p < 0.001) than larger networks (t = 2640.32, bootstrap 95% CI [2556.3, 2725.07], p < 0.001). Taken together, our findings suggest that hierarchical organization, which is already known to be spatially efficient (Bassett et al., 2010), also supports high-fidelity network communication and does so more efficiently than large networks without hierarchical organization.
To enable flexible switching, hierarchical organization should also support a low-fidelity regime. If path transitivity supports a low-fidelity regime better able to tolerate noise, then hierarchical organization may support a low-fidelity regime by allowing for greater global transitivity in hierarchical networks compared to nonhierarchical networks. In testing this prediction, we found that the hierarchical organization of the connectome exhibited greater transitivity than nonhierarchical scale-free networks (generated as described in Materials and Methods), and this increased transitivity became more pronounced with increasing network size (size-by-hierarchical network type interaction t = −37.22, bootstrap 95% CI [−40.57, −34.12], p < 0.001; size-by-scale-free network type interaction t = −125.32, bootstrap 95% CI [−131.71, −118.70], p < 0.001; Figure 5F). Thus, hierarchical organization could contribute to the flexibility of efficient coding by preserving high network transitivity for low-fidelity communication that prioritizes noise tolerance and large network size for high-fidelity communication that prioritizes accuracy.
Brain regions disproportionately scale in relation to regional prioritization of network communication fidelity or lossy compression.
Having found that hierarchical organization preserves high transitivity despite large network size, we sought to understand how transitivity is associated with the areal expansion of brain regions. Evolutionarily new connections may support higher order and flexible information processing, emerging from disproportionate expansion of the association cortex (Buckner & Krienen, 2013). In contrast, brain regions that are disproportionately out-scaled by total brain expansion may save material, space, and metabolic resources. To explore how sender and receiver compression efficiency relates to cortical areal expansion, we used published maps of areal scaling; here, allometric scaling coefficients were defined by the nonlinear ratios of surface area change to total brain size change over development. Brain regions that disproportionately expanded in relation to total brain size during neurodevelopment tended to have greater transitivity (r = 0.26, df = 358, pSPIN,Holm-Bonferroni = 0.001), which may support noise-tolerant lossy compression with minimal loss of fidelity in association cortex regions thought to underpin higher order and flexible information processing. Disproportionately expanding brain regions tended to have greater sender (r = 0.14, df = 358, pSPIN,Holm-Bonferroni = 0.045; Figure 5H) and reduced receiver (r = −0.20, df = 358, pSPIN,Holm-Bonferroni = 0.03; Figure 5H) compression efficiency. While brain regions that disproportionately expanded tend to prioritize messages received with high fidelity, brain regions that are disproportionately out-scaled by total brain expansion tend to receive compressed messages.
Hypothesis 5: Integrative Hubs
The fifth and final hypothesis of our model posits that the structural hubs of the brain’s highly interconnected rich club support information integration of randomly walking signals (van den Heuvel et al., 2012). To explain the hypothesized information integration roles of rich-club structural hubs, we investigated the compression efficiency of messages randomly walking into and out of hub regions compared to that of other regions. In order to identify the rich-club hubs, we computed the normalized rich-club coefficient and identified 43 highly interconnected structural hubs (Figure 6A). Next, we computed the send and receive compression efficiency of rich-club hubs compared to all other regions. In support of their hypothesized function, we found that the rich-club hubs required receiving fewer random walkers compared to other regions (Wilcoxon rank-sum test, W = 12829, p < 0.001), suggesting prioritization of information compression (or integration). For the rich-club hub to transmit outgoing messages with a fidelity that is equivalent to the incoming messages, the rich-club hubs required sending more random walkers compared to other brain regions (W = 64, p < 0.001), supporting the notion that rich-club hubs serve as high-fidelity information broadcasting sources. The function of hubs apparently compensates for high metabolic cost (Collin, Sporns, Mandl, & van den Heuvel, 2013). However, we did not find evidence of greater CBF in rich-club hubs compared to nonhubs, suggesting that hubs may be more metabolically efficient than previously known (Supporting Information Figure S8). This may potentially be due to hubs being compression-efficient receivers (see Supporting Information Results and Discussion for additional explanations).
Though we did not find evidence of high metabolic cost, the contrasting roles of prioritizing input compression and output fidelity within rich-club hubs were consistent with our hypothesis and the current understanding of rich-club hubs as the information integration centers and broadcasters of the brain’s network (van den Heuvel et al., 2012).
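For readers unfamiliar with the rich-club coefficient, the sketch below shows one common unweighted form: the density of connections among nodes whose degree exceeds k, divided by the mean of the same quantity over null networks. The function names are hypothetical, the unweighted form is an assumption, and the study's exact settings (weighting, range of k, significance testing) are not reproduced.

```python
import numpy as np

def rich_club_coefficient(adj, k):
    """Density of edges among nodes with degree greater than k.

    adj : (n, n) symmetric adjacency matrix with a zero diagonal (assumed).
    Returns NaN when fewer than two nodes exceed the degree threshold.
    """
    a = np.asarray(adj) > 0
    degree = a.sum(axis=1)
    rich = np.flatnonzero(degree > k)
    n = rich.size
    if n < 2:
        return np.nan
    # Each undirected edge appears twice in the symmetric submatrix.
    edges = a[np.ix_(rich, rich)].sum() / 2
    return 2 * edges / (n * (n - 1))

def normalized_rich_club(adj, null_networks, k):
    """Empirical coefficient divided by the mean coefficient of null networks."""
    phi = rich_club_coefficient(adj, k)
    phi_null = np.mean([rich_club_coefficient(a, k) for a in null_networks])
    return phi / phi_null
```

A normalized value above 1 indicates that high-degree nodes are more densely interconnected than expected under the chosen null model.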
Compression efficiency predicts behavioral performance.
Efficient coding predicts that the brain should not only balance fidelity and information compression as quantified by compression efficiency, but should do so in a way that improves future behavior (Chalk et al., 2018; Niven, Anderson, & Laughlin, 2007). Thus, we next sought to evaluate the association between compression efficiency in rich-club hubs and cognitive performance in a diverse battery of tasks. In light of trade-offs between communication fidelity and information compression, we hypothesized that compression efficiency would correlate with cognitive efficiency, defined as the combined speed and accuracy of task performance. Compression efficiency prioritizing lossy compression and low-dimensionality should predict worse performance (Rigotti et al., 2013). To assess the relationships between compression efficiency and cognitive efficiency, we used four independent cognitive domains that have been established by confirmatory factor analysis to assess individual variation in tasks of complex reasoning, memory, executive function, and social cognition (Materials and Methods). 
Consistent with the hypothesis, we found that individuals having rich-club hubs with reduced compression efficiency, prioritizing transmission fidelity, tended to exhibit increased cognitive efficiency of complex reasoning (all p values corrected using the Holm–Bonferroni family-wise error method; t = −4.73, bootstrap 95% CI [−6.58, −2.82], model adjusted R2 = 0.21, bootstrap 95% CI [0.16, 0.26], estimated model df = 10.59, p = 2 × 10^−7), memory (t = −2.60, bootstrap 95% CI [−4.41, −0.79], model adjusted R2 = 0.24, bootstrap 95% CI [0.16, 0.26], estimated model df = 9.85, p = 0.03), executive function (t = −2.80, bootstrap 95% CI [−4.75, −0.87], model adjusted R2 = 0.50, bootstrap 95% CI [0.45, 0.56], estimated model df = 10.69, p = 0.03), and social cognition (t = −2.55, bootstrap 95% CI [−4.50, −0.73], model adjusted R2 = 0.22, bootstrap 95% CI [0.16, 0.25], estimated model df = 10.47, p = 0.02). See Supporting Information Figure S9 for similar nonhub correlations, as well as speed and accuracy modeled separately. Our finding that the compression efficiency of rich-club hubs was associated with cognitive efficiency is consistent with the notion that the integration and broadcasting function of hubs contributes to cognitive performance (van den Heuvel et al., 2012). Importantly, compression efficiency explained variation in cognitive efficiency even when controlling for the commonly used shortest path measure of global efficiency (Figure 6D; compression efficiency t = −4.95, estimated model df = 11.52, p < 0.001; global efficiency t = 2.68, estimated model df = 11.52, p < 0.01). Collectively, individuals with connectomes prioritizing fidelity tended to perform with greater cognitive efficiency in a diverse range of functions, consistent with the understanding of high-dimensionality neural representations predicting flexible behavioral performance (Rigotti et al., 2013; van den Heuvel et al., 2012).
DISCUSSION
The efficient coding principle and rate-distortion theory constrain models of brain network communication, explaining how connectome biology and architecture prioritize either communication fidelity or lossy compression (Achard & Bullmore, 2007; Avena-Koenigsberger et al., 2015, 2017, 2018; Bullmore & Sporns, 2012; Goñi et al., 2013; Johansen-Berg, 2010; Laughlin, 2001; Levy & Baxter, 1996; Palmer, Marre, Berry, & Bialek, 2015; Rubinov, 2016). Information compression can improve prediction and generalization, separate relevant features of information, and efficiently utilize limited capacity by balancing the unreliability of stochasticity with the redundancy in messages (Olshausen & Field, 2004; Palmer et al., 2015; Sims, 2016, 2018; Sterling & Laughlin, 2015). We observed a rate-distortion gradient for every individual, consistent with the conceptualization of the brain structural network as a communication channel with limited capacity. While each individual’s network adheres to rate-distortion theory, differences in structural connectivity lead to substantial variance in the compression efficiency. In addition to some of the variation being explained by age and sex, our results also indicate that variance could be driven by biological differences, such as in metabolic expenditure and myelin, as well as topological differences, such as transitivity, degree, and hierarchical organization. We found that biological investments of metabolic resources and myelin in connectome topology, hierarchically interconnected brain regions, and highly connected network hubs each supported efficient coding and compressed communication dynamics. We found evidence consistent with five predictions of rate-distortion theory adapted from prior literature, corroborating the validity of this theoretical model of macroscale efficient coding (Marzen & DeDeo, 2017; Sims, 2018; van den Heuvel et al., 2012).
In addition to these findings, we reported two important null results that could be explained in part by the macroscale efficient coding model. First, we did not find evidence for the hypothesis that metabolic savings are associated with shortest path routing communication dynamics, a key redeeming feature against acknowledged theoretical shortcomings of this routing model (Avena-Koenigsberger et al., 2018; Bullmore & Sporns, 2012; Várkuti et al., 2011). Rather, we found evidence of efficient coding, implemented using random walks, as in other areas of neuroscience (Barlow, 1961; Chalk et al., 2018; Denève, Alemi, & Bourdoukan, 2017; Laughlin, 2001; Laughlin, van Steveninck, & Anderson, 1998; Olshausen & Field, 2004; Palmer et al., 2015; Sterling & Laughlin, 2015; Weber, Krishnamurthy, & Fairhall, 2019; Wei & Stocker, 2015). Efficient coding generated unique predictions of communication as a stochastic process governed by trade-offs between lossy compression and transmission fidelity (MacKay, 2003). The evidence we presented was consistent with these predictions and would be difficult to explain with alternative models of shortest path routing (see Supporting Information Modeling/Math Notes and Discussion). This null result challenges the validity of the shortest path routing model as anchoring one end of a hypothesized spectrum of communication mechanisms and the usage of shortest path routing metrics to quantify the integrative capacity of particular brain areas or whole-brain connectivity (Avena-Koenigsberger et al., 2018). Rather, the macroscale efficient coding model implemented by random walk dynamics provides strong theoretical interpretation of information integration in hubs as lossy compression (Chanes & Barrett 2016; van den Heuvel et al., 2012).
Second, we did not find evidence for high metabolic cost associated with structural network hubs. Rather, we found that hubs are regions that receive inputs with greater compression efficiency; regions prioritizing input compression tend to be disproportionately out-scaled by total brain expansion during development, consistent with cortical consolidation (thinning and myelination) (Whitaker et al., 2016). In contrast, regions prioritizing input fidelity tend to disproportionately expand in relation to total brain growth during development, consistent with theories positing that evolutionary expansion and new connections support the flexibility of neural activity for higher-order cognition (Buckner & Krienen, 2013; Bullmore & Sporns, 2012; Glasser et al., 2014; Goldman-Rakic, 1988; Reardon et al., 2018; Scholtens, Schmidt, de Reus, & van den Heuvel, 2014; Theodoni et al., 2020; Vij, Nomi, Dajani, & Uddin, 2018). Structural network hubs support unique information integrative processes, apparently offsetting high metabolic, spatial, and material costs (Avena-Koenigsberger et al., 2019; Chanes & Barrett, 2016; Crossley et al., 2014; Oldham & Fornito, 2018; Scholtens et al., 2014; van den Heuvel et al., 2012; Vértes, Alexander-Bloch, & Bullmore, 2014). Hence, hub dysfunction may be particularly costly. Our null result motivates reconsideration of metabolic costs, at least for the simplest and most common definition of a structural connectivity hub using the measure of degree centrality (see Supporting Information Figure S8, Results, and Discussion) (Oldham & Fornito, 2018). An alternative hypothesis to high metabolic cost of hubs is that long-run metabolic savings depend on frequent usage of hubs to compensate for the expense of maintaining their size and connectivity (Harris & Attwell, 2012; S. S.-H. Wang et al., 2008). 
Although further investigation of the metabolic costs of rich-club hubs is warranted, our findings nevertheless reinforce a wealth of evidence emphasizing the importance of the development, resilience, and function of hubs in cognition and psychopathology (Chanes & Barrett, 2016; Crossley et al., 2014; Gollo et al., 2018; Liang, Zou, He, & Yang, 2013; Mišić et al., 2015; Scholtens et al., 2014; van den Heuvel et al., 2012; Whitaker et al., 2016).
Our work admits several theoretical and methodological limitations. First, regionally aggregated brain signals are not discrete Markovian messages and do not have goals like reaching specific targets. As in recent work, our model is a deliberately simplified but useful abstraction of macroscale brain network communication (Mišić et al., 2015). Second, although we modeled random walk dynamics in light of prior methodological decisions and information theory benchmarks (Goñi et al., 2013), compression efficiency can be implemented using alternative approaches. Several methodological limitations should also be considered. The accurate reconstruction of white matter pathways using diffusion imaging and tractography remains limited (Zalesky et al., 2016). While we chose to model structural connections using fractional anisotropy because it is one of the most widely used measures of white matter microstructure, future research could evaluate these same hypotheses in structural networks built from different measures and of other species (Chang et al., 2017; Jones, Knösche, & Turner, 2013; Teich et al., 2021). Moreover, noninvasive measurements of CBF with high sensitivity and spatial resolution remain challenging. However, we acquired images using an ASL sequence providing greater sensitivity and approximately four times higher spatial resolution than prior developmental studies of CBF (Satterthwaite, Shinohara, et al., 2014). Lastly, our data was cross-sectional, limiting the inferences that we could draw about neurodevelopmental processes.
In summary, our study advances understanding of how brain metabolism and architecture at the macroscale support communication dynamics in complex brain networks as efficient coding. Our model provides a simple framework for investigating efficient coding at the whole-brain network level and an important basis to build toward computational network models of special cases of efficient coding, such as robust, sparse, and predictive coding (Chalk et al., 2018). In quantifying integrative communication dynamics and lossy compression of the macroscale brain network, these results are relevant to future research on low-dimensional neural representations (Rigotti et al., 2013; Shine et al., 2019; Tang et al., 2019), efficient control of functional network dynamics (Srivastava et al., 2020; Tang et al., 2017), and hierarchical abstraction of behaviorally relevant cortical representations (Momennejad, 2020; Schapiro, Turk-Browne, Botvinick, & Norman, 2017; Stachenfeld, Botvinick, & Gershman, 2017). Future research could investigate whether other biological properties of the network support efficient coding, such as gradients in brain structure and function (Barbas, 1986; Charvet, Cahalane, & Finlay, 2015; Charvet & Finlay, 2014; Huntenburg, Bazin, & Margulies, 2018; Kingsbury & Finlay, 2001; Paquola et al., 2020; Vazquez-Rodriguez, Liu, Hagmann, & Misic, 2020). Neurodevelopmental processes vary with the naturalistic environment and socioeconomic background, suggesting that they are shaped by exposure to different experiences and expectations (Tooley, Bassett, & Mackey, 2021). We hypothesize that the neurodevelopmental range of compression efficiency reflects a relationship between source coding (statistics or compressibility of the naturalistic input information) and channel coding (compression efficiency), which demarcate zones of possible brain network communication (MacKay, 2003; Olshausen & Field, 2004).
Lastly, the compression efficiency metric offers a novel tool to test leading hypotheses of dysconnectivity (Di Martino et al., 2014), hubopathy (Crossley et al., 2014; Gollo et al., 2018), disrupted information integration (Chanes & Barrett, 2016; Hernandez, Rudie, Green, Bookheimer, & Dapretto, 2015), and neural noise (Dinstein et al., 2012) in neuropsychiatric disorders.
MATERIALS AND METHODS
Participants
As described in detail elsewhere (Satterthwaite, Elliott, et al., 2014), diffusion-weighted imaging (DWI) and arterial-spin labeling (ASL) data were acquired for the Philadelphia Neurodevelopmental Cohort (PNC), a large community-based study of neurodevelopment. The subjects used in this paper are a subset of the 1,601 subjects who completed the cross-sectional imaging protocol. We excluded participants with health-related exclusionary criteria (n = 154) and with scans that failed a rigorous quality assurance protocol for DWI (n = 162) (Roalf et al., 2016). We further excluded subjects with incomplete or poor ASL and field map scans (n = 60). Finally, participants with poor quality T1-weighted anatomical reconstructions (n = 10) were removed from the sample. The final sample contained 1042 subjects (mean age = 15.35, SD = 3.38 years; 467 males, 575 females). Study procedures were approved by the Institutional Review Board of the Children’s Hospital of Philadelphia and the University of Pennsylvania. All adult participants provided informed consent; all minors provided assent and their parent or guardian provided informed consent.
Cognitive Assessment
All participants were asked to complete the Penn Computerized Neurocognitive Battery (CNB). The battery consists of 14 tests adapted from tasks typically applied in functional neuroimaging, which measure cognitive performance in five broad domains (Satterthwaite, Elliott, et al., 2014). The domains included (1) executive control (i.e., abstraction and flexibility, attention, and working memory), (2) episodic memory (i.e., verbal, facial, and spatial), (3) complex cognition (i.e., verbal reasoning, nonverbal reasoning, and spatial processing), (4) social cognition (i.e., emotion identification, emotion intensity differentiation, and age differentiation), and (5) sensorimotor and motor speed. Performance was operationalized as z-transformed accuracy and speed. The speed scores were multiplied by −1 so that higher values indicate faster performance, and efficiency scores were calculated as the mean of these accuracy and speed z-scores. The efficiency scores were then z-transformed again, to achieve mean = 0 and SD = 1.0 for all scores. Confirmatory factor analysis supported a model of four latent factors corresponding to the cognitive efficiency of executive function, episodic memory, complex cognition, and social cognition (Moore, Reise, Gur, Hakonarson, & Gur, 2015). Hence, we used these four cognitive efficiency factors in our analyses. In a factor solution separately modeling accuracy and speed, the accuracy factors correspond to (1) executive and complex cognition, (2) social cognition, and (3) memory. The speed factors correspond to (1) fast speed (e.g., working memory and attention tasks requiring constant vigilance), (2) episodic memory speed, and (3) slow speed (e.g., tasks requiring complex reasoning).
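The efficiency score described above can be sketched in a few lines. The sketch assumes that speed is recorded as a response time (so that larger raw values mean slower responses) and uses the sample standard deviation; both choices, like the function names, are illustrative assumptions rather than details confirmed by the source.

```python
import numpy as np

def zscore(x):
    """Standardize to mean 0 and SD 1 (sample SD, an assumption)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)

def cognitive_efficiency(accuracy, response_time):
    """Combine accuracy and speed into a single standardized efficiency score."""
    acc_z = zscore(accuracy)
    # Multiply by -1 so that higher values indicate faster performance.
    speed_z = -1.0 * zscore(response_time)
    efficiency = (acc_z + speed_z) / 2.0
    # Standardize again so the final scores have mean 0 and SD 1.
    return zscore(efficiency)
```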
Image Acquisition, Preprocessing, and Network Construction
Neuroimaging acquisition and pre-processing were as previously described (Satterthwaite, Elliott, et al., 2014). We depict the overall workflow of the neuroimaging and network extraction pipeline in Supporting Information Figure S1.
Diffusion-weighted imaging.
As was previously described (Baum et al., 2017; Tang et al., 2017), diffusion imaging data and all other MRI data were acquired on the same 3T Siemens Tim Trio whole-body scanner and 32-channel head coil at the Hospital of the University of Pennsylvania. DWI scans were obtained using a twice-focused spin-echo (TRSE) single-shot EPI sequence (TR = 8,100 ms, TE = 82 ms, FOV = 240 mm²/240 mm²; Matrix = RL: 128/AP: 128/Slices: 70, in-plane resolution (x & y) 1.875 mm²; slice thickness = 2 mm, gap = 0; FlipAngle = 90°/180°/180°, volumes = 71, GRAPPA factor = 3, bandwidth = 2170 Hz/pixel, PE direction = AP). The sequence employs a four-lobed diffusion encoding gradient scheme combined with a 90-180-180 spin-echo sequence designed to minimize eddy current artifacts. The complete sequence consisted of 64 diffusion-weighted directions with b = 1,000 s/mm² and 7 interspersed scans where b = 0 s/mm². Scan time was about 11 min. The imaging volume was prescribed in axial orientation covering the entire cerebrum with the topmost slice just superior to the apex of the brain (Roalf et al., 2016).
Connectome construction.
Cortical gray matter was parcellated according to the Glasser atlas (Glasser et al., 2016), defining 360 brain regions as nodes for each subject’s structural brain network, denoted as the weighted adjacency matrix A. To assess multiple spatial scales, cortical and subcortical gray matter was parcellated according to the Lausanne atlas (Cammoun et al., 2012). Together, 83, 129, 234, 463, and 1,015 dilated brain regions defined the nodes for each subject’s structural brain network in the analyses of Figure 5.
DWI data were imported into DSI Studio software and the diffusion tensor was estimated at each voxel (Yeh, Verstynen, Wang, Fernández-Miranda, & Tseng, 2013). For deterministic tractography, whole-brain fiber tracking was implemented for each subject in DSI Studio using a modified fiber assessment by continuous tracking (FACT) algorithm with Euler interpolation, initiating 1,000,000 streamlines after removing all streamlines with length less than 10 mm or greater than 400 mm. Fiber tracking was performed with an angular threshold of 45°, a step size of 0.9375 mm, and a fractional anisotropy (FA) threshold determined empirically by Otsu’s method, which optimizes the contrast between foreground and background (Yeh et al., 2013). FA was calculated along the path of each reconstructed streamline. For each subject, edges of the structural network were defined where at least one streamline connected a pair of nodes. Edge weights were defined by the average FA along streamlines connecting any pair of nodes. The resulting structural connectivity matrices were not thresholded and contain edges weighted between 0 and 1.
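The edge definition above can be illustrated with a short sketch that averages FA over the streamlines connecting each pair of regions. The input format, a list of (region i, region j, mean FA along the streamline) tuples, and the function name are hypothetical and do not reflect the DSI Studio output format.

```python
import numpy as np

def build_fa_connectome(n_nodes, streamlines):
    """Weighted adjacency matrix with edges equal to the mean FA of streamlines.

    streamlines : iterable of (node_i, node_j, mean_fa) tuples (assumed format).
    An edge exists wherever at least one streamline connects a pair of nodes.
    """
    fa_sum = np.zeros((n_nodes, n_nodes))
    count = np.zeros((n_nodes, n_nodes))
    for i, j, fa in streamlines:
        if i == j:
            continue  # ignore streamlines that start and end in the same region
        fa_sum[i, j] += fa
        fa_sum[j, i] += fa
        count[i, j] += 1
        count[j, i] += 1
    weights = np.zeros_like(fa_sum)
    connected = count > 0
    weights[connected] = fa_sum[connected] / count[connected]
    return weights  # no thresholding is applied
```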
Arterial-spin labeling.
Because prior work has shown that the T1 relaxation time changes substantially in development and varies by sex, this parameter was set according to previously established methods, which enhance CBF estimation accuracy and reliability in pediatric populations (Jain et al., 2012; Wu et al., 2010). As in prior work, the global network CBF was calculated as the average CBF across all brain regions to obtain an individual participant’s CBF (Satterthwaite, Shinohara, et al., 2014).
Brain Maps
Cortical myelin.
Cortical areal scaling.
When β is 1, the scaling between total brain size and brain regions is linear. When β deviates greater or less than 1, scaling is nonlinearly and disproportionately expanding or contracting. We used the published atlas generated using the same data as in our study (Reardon et al., 2018; Satterthwaite, Elliott, et al., 2014).
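As an illustration of how a scaling coefficient of this kind can be estimated, the sketch below fits a power law, regional area = α × (total size)^β, by ordinary least squares on log-transformed values. This is a minimal form stated as an assumption; the published maps were generated with the models described in Reardon et al. (2018), which this sketch does not reproduce.

```python
import numpy as np

def allometric_scaling_coefficient(region_area, total_size):
    """Estimate beta in log(region_area) = log(alpha) + beta * log(total_size).

    region_area, total_size : positive values, one entry per participant.
    beta = 1 indicates linear scaling; beta > 1 indicates disproportionate
    expansion and beta < 1 disproportionate contraction relative to total size.
    """
    log_total = np.log(np.asarray(total_size, dtype=float))
    log_region = np.log(np.asarray(region_area, dtype=float))
    beta, _log_alpha = np.polyfit(log_total, log_region, 1)
    return beta
```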
Network Statistics
Global efficiency.
Path strengths.
Path transitivity.
Modularity.
Resource efficiency.
The number of random walkers r_ij has been referred to as resources in prior literature (Goñi et al., 2013). In our analyses, we calculated the resources r_ij over a range of values of η for each participant. The resource efficiency of an entire network was then taken to be 1/r_ij(η) averaged over all pairs of nodes i and j. Using a right stochastic transition matrix, the resource efficiency of brain regions as message senders is 1/r_ij(η) averaged over i, while that of brain regions as message receivers is 1/r_ji(η) averaged over j.
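The sketch below illustrates these quantities under one explicit assumption: that the resources follow the expression of Goñi et al. (2013), r_ij(η) = log(1 − η) / log(1 − π_ij), where π_ij is the probability that a single random walker travels from i to j along the shortest path. The function names are hypothetical, and the sender and receiver averages use one plausible reading of the index convention (rows as senders, columns as receivers).

```python
import numpy as np

def transition_matrix(adj):
    """Right stochastic matrix: each row of the adjacency matrix sums to 1."""
    adj = np.asarray(adj, dtype=float)
    return adj / adj.sum(axis=1, keepdims=True)

def path_probability(U, path):
    """Probability that a random walker follows the given path step by step."""
    return float(np.prod([U[u, v] for u, v in zip(path[:-1], path[1:])]))

def resources(pi, eta):
    """Walkers needed so that at least one takes the shortest path with
    probability eta (assumed form; requires 0 < pi < 1 and 0 < eta < 1)."""
    pi = np.asarray(pi, dtype=float)
    return np.log(1.0 - eta) / np.log(1.0 - pi)

def resource_efficiency(pi, eta):
    """Return (global, sender, receiver) averages of 1 / r_ij(eta).

    pi : (n, n) matrix of shortest-path probabilities; the diagonal is ignored.
    """
    pi = np.asarray(pi, dtype=float)
    inv = np.full(pi.shape, np.nan)
    off_diag = ~np.eye(pi.shape[0], dtype=bool)
    inv[off_diag] = 1.0 / resources(pi[off_diag], eta)
    return np.nanmean(inv), np.nanmean(inv, axis=1), np.nanmean(inv, axis=0)
```

For example, with π_ij = 0.5 and η = 0.75, two random walkers are required, giving a resource efficiency of 0.5 for that pair.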
Compression efficiency.
By minimizing the mutual information I(X, X̂), we arrive at a probabilistic map from the signal to the compressed representation, where the information gain between the signal and compression is as small as possible (i.e., high fidelity) to favor the most compact representations.
Similar to the mathematical framework of rate-distortion theory, we sought to specify a distortion function reflecting communication over the brain’s structural network. Prior work building models of perceptual and cognitive performance has inferred distortion functions through Bayesian inference of a loss function (Sims, 2016, 2018). For instance, the loss function could be the squared error of the residual between the compression and the true signal, L = (x̂ − x)² (Figure 3A). A neural rate-distortion theory has been theoretically developed (Marzen & DeDeo, 2017), but remains empirically untested due in part to a lack of methodological tools at the level of brain systems. Moreover, it has been difficult to define a distortion function that incorporates both true signals x and compressed signals x̂, in part because the measurement of these signals in human brain networks remains challenging. Here, we define an analogous framework of information transfer through capacity-limited channels in the structural network of the brain. Particularly, we build a distortion function from the simple intuition that the shortest path is the route that most reliably preserves signal fidelity, as depicted in Figure 3B.
While prior work calculates the bit rates that arise from the stochasticity of opening and closing ion channels or releasing a synaptic vesicle (Laughlin, 2001), in this article we specifically chose to focus on macroscale-level neuroimaging data. Our choice stemmed in part from the fact that the surrounding background literature in neuroimaging motivated many components of our theory, and in part from the fact that the theory of efficient coding had not yet been extended to this scale. We sought to understand the role of the whole connectome in supporting efficient communication and information processing.
To calculate numerical measures of bit rates at any spatial scale, we require a probabilistic information source. In the random walk model of interregional communication, each brain region is an information source that activates while sending a message—that is, a random walker—to another region along the wiring of the connectome. The information content of an individual message is determined by the probability of a message being sent, which, in our setting, is equivalent to the probability of a brain region changing its activity (activating or deactivating) in statistical association with behaviors and cognitive functions. At the level of macroscale neuroimaging data, individual studies describe localized neural activation and deactivation associated with specific cognitive tasks and behavior. Meta-analyses of these individual studies can help aggregate and summarize a large quantity of data on the general probability of brain region activation across a wide range of behaviors and cognitive functions (Sterling & Laughlin 2015; Yarkoni, Poldrack, Nichols, Essen, & Wager, 2011). We used the meta-analytic approach of NeuroSynth to obtain probability distributions across statistical maps of neural activity (Yarkoni et al., 2011). We then used this average probability that a brain region is active to calculate the information content of each communication event, or the transmission of one neural message.
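The conversion from an activation probability to the information content of one message can be sketched with the standard Shannon self-information, −log2(p). Treating this as the quantity intended above is an assumption, and the function name is hypothetical.

```python
import numpy as np

def information_content(p_active):
    """Shannon self-information, in bits, of an event with probability p.

    p_active : activation probability per brain region, with 0 < p <= 1.
    Rarer activations carry more bits per message.
    """
    p = np.asarray(p_active, dtype=float)
    return -np.log2(p)
```

Under this definition, a region that is active with probability 0.5 carries 1 bit per message, and a region active with probability 0.25 carries 2 bits.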
Biased random walk.
Rich club.
Hierarchical organization.
Network Null Models
Random graphs are commonly used in network science to test the statistical significance of the role of some network topology against null models. We used randomly rewired graphs generated by shuffling each individual’s empirical networks 20 times, as in prior work (Maslov & Sneppen, 2002). Furthermore, we generated Erdős-Rényi random networks for each individual brain network where the presence or absence of an edge was generated by a uniform probability calculated as the density of edges existing in the corresponding brain network. Edge weights were randomly sampled from the edge weight distribution of the brain network. While the randomly rewired graphs retain empirical properties such as the degree and edge weight distributions of the individual brain networks, the Erdős-Rényi networks do not. Hence, the randomly rewired null network was used in all analyses where the degree distribution should be retained (e.g., normalized rich-club coefficient), while the Erdős-Rényi network was used in analyses assessing the overall contribution of the brain network topology (e.g., compression efficiency). We used the Erdős-Rényi networks as the null network for two reasons. First, Erdős-Rényi networks are particularly appropriate for comparing the random walk model to the alternative shortest path routing model. Prior work found that Erdős-Rényi networks had very high global efficiency, though they were biologically implausible due to their connection costs (Sporns, 2013). Thus, the Erdős-Rényi networks serve as a benchmark for a communication architecture supporting highly efficient shortest path routing.
Second, we chose to use Erdős-Rényi networks because we sought to compare random walk communication on brain network connectivity to random walk communication on synthetic networks that optimally minimize our distortion function, that is, networks that maximize the shortest path probability (Goñi et al., 2013). In doing so, we can address the major criticism of the plausibility of random walk models: inefficiency (Avena-Koenigsberger et al., 2018). Prior work compared random walk dynamics on canonical graph models, finding that the Erdős-Rényi networks had the highest probability of shortest path propagation (Goñi et al., 2013). Thus, we expected that the Erdős-Rényi networks would approximate a limit on the efficiency for random walk communication dynamics according to the rate-distortion model. Compared to this approximate limit of efficiency, we aimed to show that biological investments in brain network communication modeled using random walks could indeed be efficient. Moreover, we aimed to show that inefficiency, when viewed in the light of the biologically established efficient coding framework, uses the redundancy of communication to overcome noise introduced by intrinsic stochasticity of neural processes to balance the transmission fidelity and lossy compression of information (Barlow, 1961).
Our tests using the randomly rewired network evaluate the null hypothesis that an apparent rich-club property of brain networks is a trivial result of topology characteristic of random networks with some empirical properties preserved, as in prior work (Colizza et al., 2006; van den Heuvel & Sporns, 2013). The alternative hypothesis is that the brain network has a rich-club organization beyond the level expected in the random networks. Our tests using the Erdős-Rényi network evaluate the null hypothesis that the rates in the rate-distortion function modeling information processing capacity in brain networks do not differ from the rates in the rate-distortion function of random networks. The alternative hypothesis is that the rate of the brain network’s rate-distortion function differs from that of random networks, consistent with the notion that Erdős-Rényi networks have a greater prevalence of shortest paths compared to brain networks. We additionally used the Erdős-Rényi network to assess the hypothesis of rate-distortion theory that synthetic networks should exhibit the same information processing trade-offs (the monotonic rate-distortion gradient) as empirical brain networks (Sims, 2018). We selected Erdős-Rényi networks to assess these hypotheses for two reasons. First, Erdős-Rényi networks do not retain core architectures of brain networks, such as modularity, and therefore reflect an extreme synthetic network. Second, Erdős-Rényi networks are commonly used as a benchmark for assessing shortest path prevalence due to the prominence of uniformly distributed direct pairwise connections (Avena-Koenigsberger et al., 2018; Sporns, 2013). In light of the central assumption that shortest paths represent the route of highest signal fidelity in our definition of distortion, we used Erdős-Rényi networks to verify our intuition that compression efficiency should be greater in the Erdős-Rényi network than in brain networks.
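The construction of the Erdős-Rényi null network described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the released analysis code: the network is assumed to be an undirected weighted adjacency matrix with no self-loops, edge weights are assumed to be sampled with replacement, and the function name and toy matrix are hypothetical. The degree-preserving rewiring null is not shown.

```python
import random

def erdos_renyi_null(weights, seed=None):
    """Build a density-matched random network from a symmetric weight matrix.

    weights: n x n list of lists; 0 means no edge; the diagonal is ignored.
    Returns the null matrix and the empirical edge density used as the edge probability.
    """
    rng = random.Random(seed)
    n = len(weights)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    empirical = [weights[i][j] for i, j in pairs if weights[i][j] != 0]
    density = len(empirical) / len(pairs)  # uniform edge probability
    null = [[0.0] * n for _ in range(n)]
    for i, j in pairs:
        if rng.random() < density:
            # Assumption: weights are drawn with replacement from the empirical weights.
            null[i][j] = null[j][i] = rng.choice(empirical)
    return null, density

# Toy 4-node network with three weighted edges (illustrative values only).
toy = [
    [0.0, 2.0, 0.0, 0.0],
    [2.0, 0.0, 3.0, 0.0],
    [0.0, 3.0, 0.0, 5.0],
    [0.0, 0.0, 5.0, 0.0],
]
null, density = erdos_renyi_null(toy, seed=0)
```

Because each edge is drawn independently, the null network matches the empirical density only in expectation and does not retain the degree distribution, which is the property that separates it from the randomly rewired null.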
Scale-free networks have a degree distribution resulting in a subset of highly connected hubs. Prior work established that global transitivity decreases exponentially with network size in a scale-free network, but that networks that are scale-free and have hierarchical organization can decouple transitivity from size. We created nonhierarchical scale-free networks using the Barabási-Albert algorithm (with the preferential attachment parameter set to 1 for linear preferential attachment) and reevaluated whether the structural brain networks with hierarchical organization similarly decoupled transitivity and size (Barabási & Albert, 1999). Decoupling transitivity and size allows for simultaneously high levels of transitivity despite large size. In the context of efficient coding theory, we chose to compare the global transitivity of structural brain networks with scale-free networks across five network sizes to examine if hierarchical organization in the human structural brain networks confers a similar outcome.
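The comparison above can be illustrated with a small sketch that grows a network by linear preferential attachment and measures its global transitivity. It is a simplified stand-in for the Barabási-Albert generator used in the analysis: the number of edges added per node (`m`), the five network sizes, and the function names are assumptions chosen only for illustration.

```python
import random
from itertools import combinations

def barabasi_albert(n, m, seed=None):
    """Grow an undirected graph by linear preferential attachment (adjacency sets)."""
    if not 1 <= m < n:
        raise ValueError("require 1 <= m < n")
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    targets = list(range(m))  # the first new node attaches to the m seed nodes
    degree_pool = []          # each node appears once per incident edge
    for new in range(m, n):
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
        degree_pool.extend(targets)
        degree_pool.extend([new] * m)
        chosen = set()
        while len(chosen) < m:  # sample m distinct nodes proportionally to degree
            chosen.add(rng.choice(degree_pool))
        targets = list(chosen)
    return adj

def global_transitivity(adj):
    """Fraction of connected triples that are closed (3 x triangles / triples)."""
    closed = 0   # closed triples; each triangle is counted once per corner
    triples = 0
    for nbrs in adj.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        closed += sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return closed / triples if triples else 0.0

# Hypothetical network sizes; the sizes used in the analysis are not stated here.
sizes = [100, 200, 300, 400, 500]
transitivity_by_size = {
    n: global_transitivity(barabasi_albert(n, m=2, seed=1)) for n in sizes
}
```

Comparing `transitivity_by_size` against the transitivity of empirical networks at matched sizes is one way to examine whether transitivity and size are decoupled.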
Statistical Analyses
To assess the covariation of our measurements across individuals and brain regions, we used generalized additive models (GAMs) with penalized splines. GAMs allow for statistically rigorous modeling of linear and nonlinear effects while minimizing over-fitting (Wood, 2004). Throughout, the potential for confounding effects was addressed in our model by including covariates for age, sex, age-by-sex interaction, network degree, network density, and in-scanner motion. Due to the likelihood of inflated estimates of brain–behavior associations despite well-powered analyses (Marek et al., 2020), we report bootstrap 95% confidence intervals for the test statistic and adjusted R2 of each model across 1,000 bootstrap samples.
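The bootstrap procedure can be sketched as a percentile interval over resampled individuals. In this minimal illustration a Pearson correlation stands in for the GAM test statistic and adjusted R2, which are not reimplemented here; the percentile method, the function names, and the toy data are assumptions.

```python
import math
import random

def pearson_r(rows):
    """Pearson correlation for a list of (x, y) pairs; stand-in for a model statistic."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    syy = sum((y - mean_y) ** 2 for _, y in rows)
    return sxy / math.sqrt(sxx * syy)

def bootstrap_ci(rows, statistic, n_boot=1000, alpha=0.05, seed=None):
    """Percentile bootstrap interval: resample individuals with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = sorted(
        statistic([rows[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    low = estimates[round((alpha / 2) * (n_boot - 1))]
    high = estimates[round((1 - alpha / 2) * (n_boot - 1))]
    return low, high

# Toy data: one (x, y) pair per hypothetical individual.
toy_rows = [(float(i), 2.0 * i + (7 * i) % 5) for i in range(40)]
ci_low, ci_high = bootstrap_ci(toy_rows, pearson_r, seed=0)
```

Resampling whole individuals keeps each person's covariates and outcome together, so the interval reflects sampling variability across the cohort.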
Metabolic running costs associated with brain network architectures.
To evaluate the importance of age as a confound for the relationship between global efficiency and CBF, we also performed sensitivity analyses by removing selected covariates and reassessing the model. In addition, for consistency with prior work (Várkuti et al., 2011), we performed the same analysis including covariates for gray matter volume and density.
Assessments of path strengths were corrected for false discovery rate across the statistical tests performed over the discrete path lengths.
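A false discovery rate correction across a family of tests can be sketched as below. The text does not name the procedure, so the Benjamini-Hochberg step-up method is assumed here; the function name and p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p values, one per discrete path length.
toy_p = [0.01, 0.04, 0.03, 0.005]
q_values = benjamini_hochberg(toy_p)
```

A test is then declared significant when its adjusted value falls below the chosen false discovery rate.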
Trade-offs between modularity and random walk architecture.
To visualize the landscape of CBF as a function of modularity and path transitivity, we plotted the GAM response function. We described the distribution of modularity and path transitivity across individuals using frequency histograms. We calculated a map of change in global CBF with respect to both modularity and path transitivity using first-order derivatives. A saddle point suggests that adaptive compromises in network architecture are constrained by dual objectives. To quantify the location of the saddle point within the change map, we performed a k-nearest neighbor search for the value 0 in the gradient of first-order derivatives, which indicates minima and maxima.
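The search for a stationary point can be illustrated on a gridded surface. The sketch below assumes central finite differences for the first-order derivatives and ranks grid points by gradient magnitude; the function names and the toy surface are hypothetical and do not reproduce the fitted GAM response surface.

```python
import math

def central_gradient(z, xs, ys):
    """Central-difference partial derivatives at interior grid points.

    z[i][j] is the surface value at (xs[i], ys[j]).
    """
    grads = {}
    for i in range(1, len(xs) - 1):
        for j in range(1, len(ys) - 1):
            dz_dx = (z[i + 1][j] - z[i - 1][j]) / (xs[i + 1] - xs[i - 1])
            dz_dy = (z[i][j + 1] - z[i][j - 1]) / (ys[j + 1] - ys[j - 1])
            grads[(i, j)] = (dz_dx, dz_dy)
    return grads

def nearest_to_zero_gradient(grads, k=1):
    """Return the k grid indices whose gradient is closest to the zero vector."""
    ranked = sorted(grads, key=lambda ij: math.hypot(*grads[ij]))
    return ranked[:k]

# Toy saddle surface z = x^2 - y^2, whose only stationary point is at (0, 0).
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [-1.0, -0.5, 0.0, 0.5, 1.0]
z = [[x ** 2 - y ** 2 for y in ys] for x in xs]
stationary = nearest_to_zero_gradient(central_gradient(z, xs, ys), k=1)
```

On this toy surface the search returns the grid index of the origin. Whether a stationary point is a saddle rather than an extremum would need an additional check of the second derivatives, which this sketch omits.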
Compression efficiency and development.
To quantify the differences between the resources calculated from resource efficiency and the compression efficiency across 14 levels of distortion, we performed two-sample t tests while controlling for family-wise error rate across multiple comparisons. For each level of distortion, we calculated the t-statistic comparing all individual resources to all individual resources predicted by the linear rate-distortion gradient.
Compression efficiency of biased random walks.
Compression efficiency in a low- or high-fidelity regime.
Compression efficiency and patterns of neurodevelopment.
To explore how compression efficiency might relate to patterns of cortical myelination and areal scaling, we assessed the Spearman’s correlation coefficient between myelination or scaling and send or receive compression efficiency. We also assessed the relationship between cortical areal scaling and nodal transitivity, in light of both our hypothesis that greater path transitivity may support greater fidelity and the efficient coding hypothesis, which states that the organization of the brain allocates neural resources according to the physical distribution of information. To further test correspondence between brain maps, we used a spatial permutation test, which generates a null distribution of randomly rotated brain maps that preserve the spatial covariance structure of the original data (Alexander-Bloch et al., 2018). Using spatially constrained null models is the state of the art for comparing brain maps (Alexander-Bloch et al., 2018). We used a spin test variant that reassigns parcels with no duplication. This method implements an iterative procedure that uniquely assigns parcels based on Euclidean distance of the rotated parcels, ignoring the medial wall and its location (Váša et al., 2018). Our choice of null model introduces a trade-off between permitting slightly more liberal critical thresholds than other spatially constrained null models and retaining the exact distribution of the original brain map to better test network-based statistics like compression efficiency and transitivity (Markello & Misic, 2021). We refer to the p value of this statistical test as pSPIN. Finally, we applied the conservative Holm–Bonferroni correction for family-wise error across these tests.
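The final correction step can be sketched as the Holm–Bonferroni step-down adjustment. This is a generic illustration of the procedure with hypothetical p values and function name; the spin test itself is not reproduced.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    # The smallest p value is multiplied by m, the next by m - 1, and so on.
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted

# Hypothetical p values from three brain map comparisons.
toy_p_spin = [0.01, 0.04, 0.03]
adjusted_p = holm_bonferroni(toy_p_spin)
```

The running maximum keeps the adjusted values monotone, so a test cannot be declared significant after a test with a smaller p value has failed.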
Compression efficiency and rich-club hubs.
Code Availability
Code can be found at https://github.com/dalejn/economicsConnectomics.
Data Availability
Neuroimaging and cognitive test data were acquired from the Philadelphia Neurodevelopmental Cohort. The data reported in this paper have been deposited in the database of Genotypes and Phenotypes under accession number dbGaP: phs000607.v2.p2 (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000607.v2.p2). The allometric cortical scaling maps were downloaded from a NeuroVault repository (Reardon et al., 2018) and the cortical myelin maps were downloaded from a public resource (Glasser & Van Essen, 2011).
Citation Diversity Statement
Recent work in several fields of science has identified a bias in citation practices such that papers from women and other minority scholars are undercited relative to the number of such papers in the field (Bertolero et al., 2020; Caplar, Tacchella, & Birrer, 2017; Chatterjee & Werner, 2021; Dion, Sumner, & Mitchell, 2018; Dworkin et al., 2020; Fulvio, Akinnola, & Postle, 2021; Maliniak, Powers, & Walter, 2013; Mitchell, Lange, & Brus, 2013; X. Wang et al., 2021). Here we sought to proactively consider choosing references that reflect the diversity of the field in thought, form of contribution, gender, race, ethnicity, and other factors. First, we obtained the predicted gender of the first and last author of each reference by using databases that store the probability of a first name being carried by a woman (Dworkin et al., 2020; Zhou et al., 2020). By this measure (and excluding self-citations to the first and last authors of our current paper), our references contain 18.07% woman (first)/woman (last), 5.75% man/woman, 23.33% woman/man, and 52.85% man/man. This method is limited in that (1) names, pronouns, and social media profiles used to construct the databases may not, in every case, be indicative of gender identity and (2) it cannot account for intersex, nonbinary, or transgender people. Second, we obtained the predicted racial/ethnic category of the first and last author of each reference by using databases that store the probability of a first and last name being carried by an author of color (Ambekar, Ward, Mohammed, Male, & Skiena, 2009; Sood & Laohaprapanon, 2018). By this measure (and excluding self-citations), our references contain 7.38% author of color (first)/author of color (last), 12.18% white author/author of color, 24.82% author of color/white author, and 55.62% white author/white author. This method is limited in that (1) the names and Florida Voter Data used to make the predictions may not be indicative of racial/ethnic identity, and (2) it cannot account for Indigenous and mixed-race authors, or those who may face differential biases due to the ambiguous racialization or ethnicization of their names. We look forward to future work that could help us to better understand how to support equitable practices in science.
ACKNOWLEDGMENTS
We acknowledge helpful discussions with Dr. Jennifer Stiso, Dr. Richard Betzel, Dr. David Lydon-Staley, Dr. Lorenzo Caciagli, Adon Rosen, and Dr. Bart Larsen.
SUPPORTING INFORMATION
Supporting information for this article is available at https://doi.org/10.1162/netn_a_00223.
AUTHOR CONTRIBUTIONS
Dale Zhou: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Visualization; Writing – original draft; Writing – review & editing. Christopher W. Lynn: Methodology; Writing – review & editing. Zaixu Cui: Data curation; Validation; Writing – review & editing. Rastko Ciric: Data curation; Writing – review & editing. Graham L. Baum: Data curation; Writing – review & editing. Tyler M. Moore: Methodology; Writing – review & editing. David R. Roalf: Data curation; Methodology; Writing – review & editing. John A. Detre: Data curation; Resources; Writing – review & editing. Ruben C. Gur: Funding acquisition; Resources; Writing – review & editing. Raquel E. Gur: Funding acquisition; Resources; Writing – review & editing. Theodore D. Satterthwaite: Conceptualization; Funding acquisition; Investigation; Project administration; Resources; Supervision; Writing – review & editing. Danielle S. Bassett: Conceptualization; Funding acquisition; Investigation; Methodology; Project administration; Resources; Supervision; Writing – review & editing.
FUNDING INFORMATION
The work was largely supported by the John D. and Catherine T. MacArthur Foundation, the ISI Foundation, the Paul G. Allen Family Foundation, the Alfred P. Sloan Foundation, the NSF CAREER award PHY-1554488, NIH R01MH113550, NIH R01MH112847, and NIH R21MH106799. Secondary support was also provided by the Army Research Office (Bassett-W911NF-14-1-0679, Grafton-W911NF-16-1-0474) and the Army Research Laboratory (W911NF-10-2-0022). D.Z. acknowledges support from the National Institute of Mental Health F31MH126569. C.W.L. acknowledges support from the James S. McDonnell Foundation 21st Century Science Initiative Understanding Dynamic and Multi-scale Systems - Postdoctoral Fellowship Award. The content is solely the responsibility of the authors and does not necessarily represent the official views of any of the funding agencies.
TECHNICAL TERMS
- Efficient coding:
A principle of neural signaling that predicts a trade-off between maximizing the amount of information conveyed and operating under constraints of limited resources.
- Rate-distortion theory:
A mathematical framework of information theory for communicating information given a tolerated level of information distortion in channels with limited capacity.
- Lossy data compression:
A procedure that sacrifices information fidelity to improve compact and economical transmission.
- Repetition code:
A simple strategy to overcome noise that corrupts a transmission by repeating the mechanisms sufficient to convey a message, so that multiple copies of the message are sent.
- Redundancy reduction:
A theory for how efficient coding may be implemented by exploiting statistical regularities of information inputs.
- Random walk:
A decentralized model of network communication wherein a brain region sends an input to a target region by discrete messages propagating across the structural network.
- Shortest path routing:
A model of network communication wherein a source brain region sends an input to a target region by selecting the shortest pathway between regions.
- Compression efficiency:
A metric that quantifies the trade-off between information compression and communication fidelity that is afforded by an individual’s structural connectivity.
REFERENCES
Author notes
Competing Interests: The authors have declared that no competing interests exist.
Co-senior authors.
Handling Editor: Petra Vertes