Persistence of information flow: A multiscale characterization of human brain

Information exchange in the human brain is crucial for vital tasks and to drive diseases. Neuroimaging techniques allow for the indirect measurement of information flows among brain areas and, consequently, for reconstructing connectomes analyzed through the lens of network science. However, standard analyses usually focus on a small set of network indicators and their joint probability distribution. Here, we propose an information-theoretic approach for the analysis of synthetic brain networks (based on generative models) and empirical brain networks, and to assess connectome’s information capacity at different stages of dementia. Remarkably, our framework accounts for the whole network state, overcoming limitations due to limited sets of descriptors, and is used to probe human connectomes at different scales. We find that the spectral entropy of empirical data lies between two generative models, indicating an interpolation between modular and geometry-driven structural features. In fact, we show that the mesoscale is suitable for characterizing the differences between brain networks and their generative models. Finally, from the analysis of connectomes obtained from healthy and unhealthy subjects, we demonstrate that significant differences between healthy individuals and the ones affected by Alzheimer’s disease arise at the microscale (max. posterior probability smaller than 1%) and at the mesoscale (max. posterior probability smaller than 10%).


INTRODUCTION
The human brain is usually referred to as an emblematic example of an efficient complex system, where neurons (i.e., the units) process a huge amount of information by signaling through synaptic transmission (i.e., the links). To better understand how information flows within the brain, imaging techniques are widely used to infer structural and functional relationships between distinct areas of the brain obtained according to some parcellation into regions of interest (Assaf & Pasternak, 2007; van den Heuvel & Pol, 2010). The resulting maps coarse-grain the brain into complex networks of much smaller size, typically of the order of a few hundred or a few thousand areas, that are analyzed through the lens of network science (Bassett & Sporns, 2017; Bullmore & Sporns, 2009; Deco, Jirsa, & McIntosh, 2010; Deco, Tononi, Boly, & Kringelbach, 2015; Fox et al., 2005; Meunier, Lambiotte, Fornito, Ersche, & Bullmore, 2009; Papo, Buldú, Boccaletti, & Bullmore, 2014; Schirner, McIntosh, Jirsa, Deco, & Ritter, 2018; Yamamoto et al., 2018; Zalesky, Fornito, & Bullmore, 2010).
In this study, we propose an information-theoretic approach for the analysis of synthetic and empirical brain networks with a twofold aim. The outcome of our procedure naturally accounts for the function of a system in terms of the interplay between the underlying structure and a dynamical process on top of it, at different temporal scales, measured in bits of information required to describe the connectome state (see Materials and Methods). To this aim, we use two distinct diffusive processes for exchanging information among units: (a) a classical random walk (CRW), where the walker has no global knowledge of the connectome and makes decisions based only on local knowledge of the connectivity, choosing each available connection with uniform probability (Noh & Rieger, 2004); (b) a maximal entropy random walk (MERW), where the walker has global knowledge of the connectome and jumps through a connection in such a way that every trajectory on the network is chosen with uniform probability (Burda, Duda, Luck, & Waclaw, 2009). Thanks to these distinct dynamics, we are able to describe the network state from two distinct perspectives: one where only local knowledge is used to explore the connectome (CRW) and one where global knowledge is used instead (MERW).
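To make the difference between the two walkers concrete, the following minimal sketch builds both transition matrices for a toy graph. It assumes an unweighted, undirected, connected adjacency matrix; the function names and the five-node example are hypothetical, and the exact definitions used in this work are those given in Materials and Methods.

```python
import numpy as np

def crw_transition(A):
    """Classical random walk: each neighbour is chosen with equal probability."""
    A = np.asarray(A, dtype=float)
    degree = A.sum(axis=1)
    return A / degree[:, None]

def merw_transition(A):
    """Maximal entropy random walk, built from the leading eigenpair of A.

    P_ij = A_ij * psi_j / (lambda * psi_i), with (lambda, psi) the largest
    eigenvalue of A and its eigenvector (positive for a connected graph).
    """
    A = np.asarray(A, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(A)   # symmetric A, eigenvalues ascending
    lam = eigvals[-1]
    psi = np.abs(eigvecs[:, -1])           # fix the arbitrary sign of the eigenvector
    return A * psi[None, :] / (lam * psi[:, None])

# Hypothetical toy graph: a star centred on node 0 plus a pendant node 4.
edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
A = np.zeros((5, 5))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

P_crw = crw_transition(A)
P_merw = merw_transition(A)
print(P_crw[0])    # uniform over the three neighbours of node 0
print(P_merw[0])   # biased toward neighbours that lead to more trajectories
```

Both matrices are row-stochastic and share the same sparsity pattern as the adjacency matrix; they differ only in how probability is distributed among the allowed jumps.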
On the one hand, we use our framework to compare a large set of real connectomes from healthy subjects against a selected pool of network models, characterized by distinct structural features and increasing complexity. Specifically, we built 400 synthetic brain networks based on four generative models, that is, the Erdős-Rényi model (Erdős & Rényi, 1959), the configuration model (Newman, 2017), the stochastic block model (Holland, Laskey, & Leinhardt, 1983), and the hyperbolic model (Papadopoulos, Kitsak, Serrano, Boguñá, & Krioukov, 2012). These generative models (Betzel & Bassett, 2017) allow for obtaining samples of synthetic data (networks) while maintaining some specific features of the empirical connectome (see Materials and Methods for a more detailed illustration of the generative models considered for this work). On the other hand, we compare the function in healthy subjects against the one in patients at different stages of dementia, namely mild cognitive impairment (MCI) and Alzheimer's disease (AD), from a network information theory perspective.
Here we use spectral entropy, grounded in statistical physics and information theory (the field devoted to the study of the transmission, processing, extraction, and utilization of information), to investigate the structure of human brain networks. In general, the spectral entropy captures the complexity of a system in terms of the mixedness of information flow through the network. This method provides a more comprehensive analysis for the comparison of brain networks than standard techniques, since it is not limited to considering a small set of network indicators (e.g., centrality, clustering, modularity, and so forth) and their joint probability distribution: it accounts for the contribution of the whole network state encoded into a density matrix (De Domenico & Biamonte, 2016; Ghavasieh, Nicolini, & De Domenico, 2020), a mathematical representation of the system that shares important physical and information-theoretic similarities with its counterpart widely used in quantum statistical physics (see Methods for details). Moreover, it has been systematically shown that the spectral entropy framework performs better than the traditional methods previously introduced to investigate information dynamics within complex structures in characterizing the global aspects of complex networks (Su, Chen, Pan, & Zeng, 2021).
We find that simple models, like the Erdős-Rényi and configuration models, have smaller spectral entropy at the mesoscale, where mid- or long-range communications between the nodes are considered, and, consequently, require a significantly smaller number of bits (up to 1.5 bits fewer) for their description than empirical human brains from healthy individuals. Conversely, degree-corrected stochastic block models and hyperbolic models (see Methods), accounting for the inferred modular structure of the connectome and its latent hyperbolic geometry, respectively, provide similar descriptions of the network state, with differences smaller than 1 bit. It is worth remarking that the geometry-driven model exhibits higher information entropy for increasing temporal scale (that is, moving from the mesoscale to the macroscale), denoting a larger persistence of information flow (that is, entropy tends to decay more slowly with Markov time) in this type of network, at variance with stochastic block models and empirical connectomes. Results are compatible when the two types of dynamics, CRW and MERW, are considered. In general, p values from statistical tests indicate the mesoscale as the suitable scale to highlight differences (and similarities) between empirical data and synthetic models. Moreover, we find that, under MERW dynamics, the stochastic block model is not significantly different from the empirical brain at any scale.
Spectral entropy: The mixedness of information streams determined by the von Neumann entropy of the density matrix defining the information content of the network.
When applied to connectomes obtained from healthy and unhealthy subjects, our framework identifies significant differences between healthy individuals and the ones affected by AD at the microscale (adjusted p value smaller than 0.1%; maximum posterior probability smaller than 1%) and at the mesoscale (adjusted p value smaller than 1%; maximum posterior probability smaller than 10%), in the case of CRW. The results are confirmed, at one order of magnitude larger, for MERW and only at the microscale. Remarkably, our approach is able to capture this difference despite the fact that the topologies of the two groups exhibit a certain amount of similarity with respect to more traditional network indicators.
In the final section, we describe the interpretation of our results from a neuroscience perspective, highlighting that our approach is well suited to capture the multiscale nature of neural dynamics that is embedded in the hierarchical modular organization of brain structure. Furthermore, our method allows us to identify precisely the scale at which abnormalities can alter information flows when studying brain network in patients with AD.

Information-Theoretic Analysis of Human Brain Networks
Before discussing our results, it is important to introduce a few basic concepts that will be used in the following. Let us frame our problem in terms of a communication process, where one encodes the description of a complex network to transmit it through some noiseless communication channel to a receiver, who has to decode the corresponding information, in bits, in order to reconstruct the original network. Since the channel is assumed to be noiseless, it has maximum information capacity, that is, the mutual information between the sent and received information is maximum. Note that the communication we are referring to should not be confused with signaling or communication among distinct areas of the brain.
One way to quantify the average number of bits required to describe a network state G is to build the corresponding density matrix ρ_τ(G) and calculate the spectral entropy S_τ(G), mathematically equivalent to the von Neumann entropy of an entangled quantum system (De Domenico & Biamonte, 2016). Here, the parameter τ indicates the Markov time of the dynamical process used to propagate information among nodes (see Materials and Methods for details): the idea is that an ensemble of signals, whose dynamics is governed by a propagator, is sent from each node to the others for a time τ and contributes to collect information about the underlying topology, therefore reducing uncertainty about the structure. In fact, at time τ = 0, no signal propagates and, consequently, the entropy is maximum because no information at all is available about network structure. Recently, it has been shown that the density matrix describes the trajectories of information flow through the network and the entropy provides a measure of diversity of information dynamics in the system (see Figure 1 for an illustration). Interestingly, one can show that a network's topological complexity, like the presence of modularity or hierarchy, can boost the diversity of information dynamics within the system and the functional diversity of nodes as senders of information (Ghavasieh et al., 2020): the modularity separates the groups of nodes from each other and the hierarchy differentiates between the groups, both creating asymmetries between the nodes as senders and receivers of information and diversifying the trajectories of information flow within the system. Here, we go beyond the analysis of synthetic networks and investigate real connectomes, in comparison to null and generative models. To avoid confusion, it is worth remarking that the entropy is a macroscopic descriptor of the system as a whole, that is, it does not quantify pairwise information transfer between nodes.
Information flow: Communication between brain areas happens through the propagation of (electrochemical) signals. Diffusion processes are used as a proxy to determine the flow of information in the connectome.
Information capacity: Amount of information allowed to pass through a communication channel in a given time period.
Density matrix: A matrix encoding the state of the system obtained from the superposition of information streams weighted by their activation probabilities.
Figure 1. Illustrating network information entropy in the case of a human connectome. (A) A schematic view of the human connectome is presented, as a fully connected network with six nodes. Information dynamics within the connectome is regulated by an ensemble of information streams, mathematically shown as σ^(ℓ), ℓ = 1, 2, …, 6. The way each stream contributes to the flow of information is represented as a diagram, where blue and red arrows, respectively, represent positive and negative fluxes and the size of each node represents the amount of field trapped on top of the node. (B) Snapshots of possible functional diversity in a schematic connectome at two different Markov times (i.e., τ = 1 above and τ = 2 below). From left to right spectral entropy increases as the overlap of information flows decreases. Colored nodes encode the sources of information, shaded areas encode the flows of information, colored arrows the directions of such flows.
Another desirable feature of this framework is that one can vary τ to characterize the network state at different scales, from microscopic (τ ∼ 1) to mesoscopic (τ ∼ √N) and macroscopic (τ ∼ N). Note that the scales we are referring to are topological, but the tunable parameter used to span from the microscopic to the macroscopic one is of temporal nature, since it is the time required by a dynamical process defined on top of the network, such as a random walk, to propagate information. More specifically, we use the temporal evolution of a statistical field to explore the topological scales of the connectome, a procedure successfully adopted for the analysis of other complex systems, from the human proteome (Ghavasieh, Bontorin, Artime, Verstraete, & De Domenico, 2021), to the human microbiome (De Domenico & Biamonte, 2016) and to social and transportation systems.
For instance, a network of size N with no connectivity at all would have an entropy equal to log₂ N bits, the maximum attainable value, whereas a fully connected network (i.e., a clique) would have the lowest possible entropy, tending to 0 bits in the limit of large τ.
We consider two distinct dynamical processes, namely CRW and MERW (see Materials and Methods for details), and use the variation of spectral entropy with Markov time τ to characterize synthetic and empirical human brain networks across multiple scales. The persistence of information flow is characterized by the decay of the spectral entropy: the slower the decay, the more persistent the flow through the network.
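The two limiting cases above can be checked numerically. The sketch below is a minimal illustration under an explicit assumption: it uses the density matrix ρ_τ = exp(−τL) / Tr[exp(−τL)] with the combinatorial Laplacian L = D − A, as in the original formulation of De Domenico & Biamonte (2016). The propagators actually used for CRW and MERW are those defined in Materials and Methods and may differ; the function name and the six-node examples are hypothetical.

```python
import numpy as np

def spectral_entropy(A, tau):
    """Von Neumann entropy (in bits) of rho_tau = exp(-tau L) / Tr exp(-tau L).

    Assumption: L is the combinatorial Laplacian D - A of an undirected graph.
    """
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    lam = np.linalg.eigvalsh(L)
    # Shifting by the smallest eigenvalue avoids overflow and cancels
    # out after normalisation.
    w = np.exp(-tau * (lam - lam.min()))
    p = w / w.sum()            # eigenvalues of the density matrix
    p = p[p > 0]               # use the convention 0 * log(0) = 0
    return float(-(p * np.log2(p)).sum())

n = 6
empty = np.zeros((n, n))                 # no connectivity at all
clique = np.ones((n, n)) - np.eye(n)     # fully connected network

s_empty = spectral_entropy(empty, tau=1.0)     # equals log2(n) for any tau
s_clique = spectral_entropy(clique, tau=50.0)  # approaches 0 for large tau
print(s_empty, np.log2(n), s_clique)
```

Evaluating `spectral_entropy` on a grid of τ values gives the entropy curve whose rate of decay is what the text calls the persistence of information flow.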

Probing Synthetic Models of the Human Brain
Our first analysis concerns quantifying the differences between empirical connectomes from healthy subjects, as measured from 196 individuals within the Nathan S. Kline Institute-Rockland Sample (see Materials and Methods), and synthetic networks obtained from a pool of generative models (see Materials and Methods for details).
Persistence of information flow is used to this aim: we calculate the average spectral entropy ⟨S_τ(G_data)⟩ over the whole set of subjects, as well as the average spectral entropy ⟨S_τ(G_model)⟩ over the ensemble of independent realizations of a generative model, for each generative model and for each of the two dynamical processes separately. Results are shown in Figure 2 and have been generated by considering a sample of 196 subjects. For each subject we generate 100 different realizations of each generative model, resulting in 19,600 samples that are later used to estimate each synthetic curve shown in the figure. As expected, when considering the values of spectral entropy varying with Markov time, τ, the generative models exhibit distinct behavior across scales, and their ordering with respect to the value of entropy allows one to rank them from the simplest to the most complex one. In fact, as can be seen in Figure 2A and 2C, the Erdős-Rényi model (ERM) and the configuration model (CM) require a smaller number of bits for their description than empirical connectomes, followed by the more complex stochastic block model (SBM) and finally by the hyperbolic model (HM), for both CRW and MERW dynamics. Interestingly, the spectral entropy of the empirical brain lies between these last two more complex generative models, providing an indication of its possible mixed nature, interpolating between the modular feature encoded by SBM and the latent geometry encoded by HM. This result is robust across the two types of dynamics considered, CRW and MERW.
It is worth noticing that differences between the spectral entropy of the connectomes and their synthetic counterpart obtained from generative models are visible while spanning from the micro- to the mesoscale (τ ∼ √N) and that these differences are amplified at the mesoscale. Here N is the number of nodes of each network and is equal to 188 (thus, microscale: τ < √188; mesoscale: √188 ≤ τ < 188; macroscale: τ ≥ 188). The fact that spectral entropy values remain higher for increasing Markov time denotes a larger persistence of information flow, that is, a slower entropy decay, as in the case of the hyperbolic model. To further appreciate the differences between the entropy of synthetic and empirical networks we compute the entropic ratio r_τ(G_model, G_data | RW) = ⟨S_τ(G_model)⟩ / ⟨S_τ(G_data)⟩ for each value of τ, generative model, and random walk (RW) process (see Figure 2B and 2D).
In this case, differences are visible at the mesoscale and are amplified at the macroscale (τ > N). CRW dynamics show an important difference between the empirical brain and the hyperbolic model, highlighted by a high entropic ratio. This difference also appears when considering MERW dynamics, which, in addition, reveals the deviation of the SBM from the empirical brain at the macroscale, highlighted by a high entropic ratio.
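As a small worked illustration of the scale boundaries and of the entropic ratio defined above, the sketch below classifies a Markov time as micro-, meso-, or macroscale for N = 188 and computes the ratio of two ensemble averages at a fixed τ. The function names and the entropy values are hypothetical placeholders, not results from this work.

```python
import math

def scale_of(tau, n_nodes):
    """Classify a Markov time using the boundaries sqrt(N) and N."""
    if tau < math.sqrt(n_nodes):
        return "micro"
    if tau < n_nodes:
        return "meso"
    return "macro"

def entropic_ratio(model_entropies, data_entropies):
    """Ratio of ensemble-averaged spectral entropies at one value of tau."""
    mean_model = sum(model_entropies) / len(model_entropies)
    mean_data = sum(data_entropies) / len(data_entropies)
    return mean_model / mean_data

N = 188  # number of nodes of each connectome in this analysis
for tau in (1, 13, 14, 187, 188, 1000):
    print(tau, scale_of(tau, N))

# Placeholder entropies (in bits) for two model realizations and two subjects.
ratio = entropic_ratio([2.0, 4.0], [1.5, 2.5])
print(ratio)
```

A ratio above 1 means that the model retains more entropy than the data at that Markov time, that is, a more persistent information flow; a ratio below 1 means the opposite.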
Leveraging the multiresolution nature of our information-theoretic approach, we tested whether there are significant differences between the empirical connectomes and their pool of generative models, by considering the values of spectral entropy at (and across) different scales defined by τ. Results of t tests between spectral entropy values coming from real data and synthetic models are provided in terms of adjusted p values and maximum posterior probability, and are reported in Figure 3. At the microscale, all generative models are significantly different from the human brain networks they attempt to reproduce when considering the CRW dynamics, except for a few values of spectral entropy in HM and SBM. Instead, MERW dynamics show significant similarity between real data and the stochastic block model not only at the microscale but also across the mesoscale and macroscale.
Figure 2. Persistence of information flow in human brain and generative models. In (A) and (C) we report the average value of spectral entropy varying the Markov time, τ, for the real data (blue line) and for all the considered generative models (encoded with colored lines), that is, Erdős-Rényi model (ERM), configuration model (CM), hyperbolic model (HM), and the stochastic block model (SBM), obtained from the classic random walk (CRW) and the maximal entropy random walk (MERW) dynamics, respectively. In (B) and (D) we report the ratio between the average value of spectral entropy, varying the Markov time, of each generative model and the value of spectral entropy of real data, by considering the classic random walk (CRW) and the maximal entropy random walk (MERW) dynamics, respectively. All the curves have been generated considering networks of 188 nodes, and each synthetic curve results from a sample of 19,600 realizations of the network. For all the plots, the x-axis is expressed in logarithmic scale. Shaded areas in (B) and (D) represent the error as one standard deviation.
For τ ≥ 30, above the mesoscale, the synthetic networks, except for the ones generated by HM, are significantly similar to the empirical ones when considering CRW. In the case of HM, the similarity is well established in the macroscale. In the case of MERW dynamics, the similarity with HM emerges slightly before the transition from the mesoscale to the macroscale, and across the macroscale. To further strengthen our results, we also report the values of maximum posterior probability, obtained by recalibrating the adjusted p values (see Materials and Methods), in Figure 3B and 3D. Under some specific assumptions (see Materials and Methods), this corresponds to the error probability in rejecting the null hypothesis H₀ (the generative models reproduce the real data) from a Bayesian perspective.
Figure 3. Identifying significant differences between human brain networks and generative models. In (A) and (C) we display the adjusted p value resulting from the t test between real data and all the considered generative models (encoded with colored lines), that is, Erdős-Rényi model (ERM), configuration model (CM), hyperbolic model (HM), and the stochastic block model (SBM), by considering the classic random walk (CRW) and the maximal entropy random walk (MERW) dynamics, respectively. In (B) and (D) we display the values of maximum posterior probability recalibrated from the adjusted p values. All plots are expressed in log-log scale. Shaded areas represent, in order, the micro-, meso-, and macroscale, with boundaries at √N (from micro- to mesoscale) and N (from meso- to macroscale), with N = 188. Note that the ERM and CM curves overlap.
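The recalibration formula is deferred to Materials and Methods and is not stated in this section. As a hedged illustration only, the sketch below implements one widely used calibration, the bound of Sellke, Bayarri, and Berger, which maps a p value to an error probability α(p) = (1 + [−e p ln p]⁻¹)⁻¹ for p < 1/e, assuming equal prior odds for the null and the alternative hypotheses. Whether this is the calibration adopted here is an assumption, and the function name is hypothetical.

```python
import math

def calibrated_error_probability(p):
    """Sellke-Bayarri-Berger calibration of a p value (equal prior odds).

    Assumption: this is an illustrative choice of recalibration, not
    necessarily the one used in this work.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p >= 1.0 / math.e:
        return 0.5                      # the bound is uninformative here
    bayes_factor_bound = -math.e * p * math.log(p)
    return bayes_factor_bound / (1.0 + bayes_factor_bound)

for p in (0.05, 0.01, 0.001):
    print(p, calibrated_error_probability(p))
```

Under this calibration, p = 0.05 corresponds to an error probability of about 0.29, which illustrates why the calibrated values are reported alongside the adjusted p values: they are considerably more conservative.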
To sum up, the mesoscale seems to be the suitable scale for distinguishing the differences (and the similarities) between the empirical brain and its possible generative models, since at the microscale the real data are significantly different from all the synthetic models, while at the macroscale there are no significant differences between data and models. Curiously, when considering the MERW dynamics, the empirical brain can resemble a stochastic block model across all scales, thus supporting the broad application of community detection algorithms and stressing their importance for the analysis and the understanding of the human brain.

Information Capacity at Different Stages of Dementia
Here, we wonder if we can use the same framework to identify differences between healthy subjects and patients at different stages of dementia, namely mild cognitive impairment (MCI) and Alzheimer's disease (AD). For details about the dataset used for this analysis, we refer to Materials and Methods.
To spatially characterize different diffusion processes (i.e., CRW and MERW) on top of the network in healthy brain (H) and at different stages of dementia (MCI and AD), we provide brain maps encoding the steady state of the two considered dynamics, corresponding to the leading eigenvector of the transition matrix defining the process (see Figure 4).
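The steady states shown in these maps can be sketched numerically as the left leading eigenvector of each transition matrix. The example below assumes an unweighted, undirected, connected toy graph (hypothetical, not a connectome) and checks the result against the known closed forms: node degree divided by total degree for CRW, and the squared leading eigenvector of the adjacency matrix for MERW.

```python
import numpy as np

def steady_state(P):
    """Stationary distribution: left leading eigenvector of P, summing to 1."""
    eigvals, eigvecs = np.linalg.eig(P.T)
    lead = np.argmax(eigvals.real)          # eigenvalue 1 for a stochastic matrix
    pi = np.abs(eigvecs[:, lead].real)      # remove the arbitrary sign
    return pi / pi.sum()

# Hypothetical toy graph with a triangle (0, 1, 2) and a short tail (0-3-4).
edges = [(0, 1), (0, 2), (1, 2), (0, 3), (3, 4)]
A = np.zeros((5, 5))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Classical random walk: uniform choice among neighbours.
degree = A.sum(axis=1)
P_crw = A / degree[:, None]

# Maximal entropy random walk: built from the leading eigenpair of A.
eigvals, eigvecs = np.linalg.eigh(A)
lam, psi = eigvals[-1], np.abs(eigvecs[:, -1])
P_merw = A * psi[None, :] / (lam * psi[:, None])

pi_crw = steady_state(P_crw)
pi_merw = steady_state(P_merw)
print(pi_crw)    # proportional to node degree
print(pi_merw)   # equal to psi squared (psi has unit norm)
```

The MERW steady state concentrates more strongly on the densely connected part of the graph than the CRW one, which only reflects node degree.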
Using the same approach as before, we show the results in Figure 5. At the turn of micro- and mesoscale, when considering CRW dynamics, the values of spectral entropy in the connectome of Alzheimer's disease patients show some differences from both healthy subjects and MCI patients. Intriguingly, the connectome of AD patients exhibits a (slightly) higher spectral entropy than the ones of healthy and MCI subjects, denoting a larger persistence of information flow. As in the previous case, to further highlight differences between healthy subjects and different stages of dementia, we compute the ratio between the corresponding values of spectral entropy (see plots of Figure 5B and 5D). Differences between AD patients and the other two considered groups of subjects (healthy and MCI) are visible at the mesoscale and are amplified at the macroscale, while no differences appear between MCI and healthy. These results are in agreement for the two dynamics, CRW and MERW.
Figure 4. Brain maps of the steady-state distribution for the CRW and MERW dynamics in healthy brain (H) and at different stages of dementia (MCI and AD). In (A) we report the steady state for the CRW dynamics, while in (B) the steady state for the MERW dynamics. The size of the node encodes the value of the steady state corresponding to the leading eigenvector of the process.
Also in this case, we tested the significance of the differences between the spectral entropy of the two stages of dementia and healthy controls by means of t tests; results are displayed in Figure 6. According to the adjusted p values obtained when considering the CRW dynamics (see plot of Figure 6A), the spectral entropy in AD patients is significantly different from the one of healthy controls for most of the Markov time values at the micro-, meso-, and macroscale. As for the MERW dynamics (see plot of Figure 6C), the same is true only at the microscale. Interestingly, the only significant differences between values of spectral entropy in MCI and in healthy controls arise at the microscale and only for the CRW dynamics. Finally, the resulting values of maximum posterior probability strengthen the results at the microscale for the AD patients (probability of error < 1%) when considering CRW dynamics.

DISCUSSION
We presented a multiresolution analysis of the human brain building on statistical physics and information theory of complex networks.
Patterns of distributed activity, or modes, of the brain are neural dynamics unfolding on anatomical connectivity structures (Honey, Kötter, Breakspear, & Sporns, 2007). The topological structure of the brain has been investigated using many functional and structural neuroimaging datasets (Bullmore & Sporns, 2009; K. Friston, Kahan, Razi, Stephan, & Sporns, 2014; K. J. Friston, 2009; Sporns, Chialvo, Kaiser, & Hilgetag, 2004) that have established that brain regions have a modular functional organization and are also connected in a way that permits the emergence of whole-brain processes like attention, cognition, and behavior. In other words, functionally distinct brain areas have a hierarchical organization that permits integration at different topological scales (Bassett et al., 2008; Meunier et al., 2009). A typical example of a task that is implemented at different scales is invariant visual object recognition, which relies on a hierarchically organized set of visual cortical areas whose competition, biased by attention, is implemented locally but gradually increases thanks to the hierarchical nature of the network (Deco & Rolls, 2004). In this context, simultaneously capturing the state of the network at different scales is crucial to fully understand a neural process. In this paper we focused on resting-state structural data, but further research investigating behavioral or cognitive functional tasks can benefit from our approach. In particular, we could investigate how the functional role of brain regions changes in resting state with respect to specific behavioral or cognitive tasks, or how functional alterations are displayed at multiple brain scales in nonhealthy subjects.
Figure 6. Identifying significant differences between human brain networks in health and disease. (A and C) We display the adjusted p value resulting from the t test between different stages of dementia (MCI in teal and Alzheimer's disease in red) and healthy subjects for each value of spectral entropy varying with Markov time, τ, by considering the classic random walk (CRW) and the maximal entropy random walk (MERW) dynamics, respectively. Horizontal lines mark the adjusted p value of 0.05. (B and D) We display the values of maximum posterior probability recalibrated from the adjusted p values. All the plots are expressed in log-log scale. Shaded areas represent, in order, the micro-, meso-, and macroscale, with boundaries at √N (from micro- to mesoscale) and N (from meso- to macroscale), with N = 90.
We exploited the information flow among system units restricted by the underlying connections to gain insights into the different functional roles of brain regions at multiple scales. Here we used classical and maximal entropy random walk processes to explore the topological scale of the networks, but other types of dynamical processes (such as synchronization processes) on top of the system may be considered in further works, since the framework is very flexible. We first used our method to compare network models that have been widely used in the literature to describe brain organization, and we found that, at the microscale, none of the tested models is suitable to describe real data. We hypothesize that, although MRI resolution precludes analysis of functional specialization within the dendritic tree or cortical macrocolumn (K. J. Friston, 2009), their layered structures could strongly affect the results at low spatial scale. On the contrary, the mesoscale is the most suitable resolution to compare network topologies. We found that real connectomes have features of both the stochastic block model and the hyperbolic model, where the former represents the aforementioned modular organization of brain areas and the latter takes into account latent geometry in the communication flows among them.
Several previous studies have explored the mechanisms that control communication dynamics in brain networks (Amico et al., 2021; Avena-Koenigsberger, Misic, & Sporns, 2018; Hahn, Ponce-Alvarez, Deco, Aertsen, & Kumar, 2019). Some models suggest that neural units have a knowledge of the whole network topology and convey information from a source to a predetermined target using (multiple) shortest paths (Avena-Koenigsberger et al., 2017; Goni et al., 2014). Other models, instead, do not make assumptions about global knowledge of network topology and propose that communication flows are ruled only by local knowledge of the distance between cortical regions (Seguin, van den Heuvel, & Zalesky, 2018). In our work we explore both frameworks using two different diffusive processes to describe information flows among units: in the classical random walk we assume that communication dynamics only require local knowledge of the connectivity, whereas in the maximal entropy random walk process we hypothesize that the walker uses global knowledge of the connectome to explore it. We found that the two approaches are compatible and that spectral entropy values are similar when we consider the two types of dynamics on top of the stochastic block model and the hyperbolic model. These results are in agreement with the aforementioned study that shows how specific brain topological and geometrical properties lead to comparable efficiency in network communication with or without centralized knowledge.
Finally, our approach was used to investigate alterations in brain network topology in patients with AD. Here we can avoid any assumption regarding the network generative model that better describes real data and we can focus on changes in information flows that characterize AD brain networks at different scales. Previous studies used standard network analysis indicators to report abnormalities in the connectivity between different brain areas, and specifically found an increased connectivity at spatial scales smaller than the brain lobe and postulated a mechanism of compensation associated with cognitive impairment (Penny, Iglesias-Fuster, Quiroz, Lopera, & Bobes, 2018; Reuter-Lorenz & Cappell, 2008; Sheng et al., 2021; Yao et al., 2010). Our results are in agreement with the aforementioned studies, since they show a significant increase in persistence of information flows at microscopic scales in AD patients with respect to healthy subjects. Furthermore, by taking simultaneously into account the whole network state, they suggest that compensation mechanisms may act at smaller topological scales than previously hypothesized.
We conclude by outlining some limitations of our approach. Statistical physics of complex information dynamics has been shown to be a powerful framework to attack a range of problems in the domain of complex systems. Yet, it is worth mentioning that the computational cost of calculating the spectral entropy is still relatively high, being of the same order as that of an eigenvalue problem. Therefore, our method would be slower than more standard techniques for very large networks, that is, for sizes above 10,000 nodes; for smaller networks, the method is fast enough. The robustness of our analysis should also be assessed on other empirical data sets, to further clarify how our entropy measures are affected by the definition of distinct brain areas and how they evolve when considering nonstationary (non-resting-state) brain activity. Finally, although the results presented in this work are promising for the investigation of dementia, the clinical usability of our approach requires further investigation.

Data
In this work we rely on two structural connectivity data sets. The first comprises healthy subjects only; the second comprises patients with Alzheimer's disease, 23 subjects affected by mild cognitive impairment, and 26 healthy controls.
The first data set, the NKI-Rockland Sample (NKI-RS; Nooner et al., 2012), consists of resting-state structural data, which represent the physical structure of brain networks, from 196 healthy subjects without any mental or physical disorder. The sample is made up of 114 males and 82 females and the age range is rather large: the youngest person is 4 and the eldest is 85. The first quartile of the age distribution is 20 and the third quartile is 47; for this reason, the sample can be considered representative of age variability. The NKI-RS has been designed as a community-ascertained sample and its representativeness is maximized according to the demographic characteristics of the United States. Brain structural networks are mapped with diffusion tensor imaging measures (137-direction, 2 mm isotropic), provided by the Center for Magnetic Resonance Research at the University of Minnesota for the Human Connectome Project. All data were publicly shared through the Collaborative Informatics and Neuroimaging Suite (COINS) developed by the Mind Research Network.
The second data set consists of structural networks reconstructed from diffusion tensor imaging data. In particular, to reconstruct the networks, Lin et al. applied a streamline-based fiber tracking algorithm on voxelwise diffusion tensors with these set parameters: random whole-brain seeding, 200,000 reconstructed streamlines, anisotropy threshold of 0.15, angular threshold of 45°, and streamline length between 30 and 300 mm (for more details about this data set please refer to Lin et al., 2019). Here, the nodes of the brain networks correspond to the 90 cerebral regions from the automatic anatomical labeling (AAL) template (Tzourio-Mazoyer et al., 2002), while the edges are quantified by computing the fractional anisotropy (FA) along the interconnected streamlines between two different AAL regions. According to Lin et al. (2019), measurements obtained from tract-specific metrics (e.g., fractional anisotropy and diffusivity) reveal themselves to be more sensitive and interpretable than those obtained from metrics based on streamline count. These findings motivate our choice to rely on data obtained from a fractional anisotropy tract-specific metric for building the empirical brain networks.
Structural connectivity: Topological interconnection of brain regions as identified through diffusion tensor imaging (DTI) techniques and summarized in adjacency matrices according to some specific indicator (e.g., fractional anisotropy).
Fractional anisotropy: Degree of anisotropy in a diffusion process bounded between 0 and 1, where 0 means isotropic diffusion and 1 represents a diffusion occurring only along one axis.
Fractional anisotropy ranges from 0 to 1, where 0 means that diffusion is isotropic and 1 that diffusion occurs along one axis only. To establish the presence of links in a binary way while avoiding arbitrary thresholds, the links of the networks used in this work are sampled by interpreting each FA value as the probability that the corresponding link exists. Specifically, we denote the FA weight of the link between regions i and j as $w_{ij}$, we extract a random number r from the uniform distribution U(0, 1), and we assign the binary link through the Heaviside step function $\Theta$ as $a_{ij} = \Theta(w_{ij} - r)$. In other words, the link exists when the FA value is greater than the corresponding random value; otherwise it is discarded. This sampling is well suited to the data: since FA values measure the degree of anisotropy of the diffusion occurring along a given tract and are bounded between 0 and 1, they can be safely interpreted as the probability of a structural connection between brain regions. The choice not to adopt a specific threshold is motivated by recent studies showing that, in the case of probabilistic or correlation networks, it is desirable to account for the intrinsic uncertainty in the existence of each link (Raimondo & Domenico, 2021).
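The sampling rule above can be illustrated with a minimal sketch. It assumes an undirected network with one uniform draw per unordered pair of regions and no self-loops; the function name `sample_binary_network` and the toy FA matrix are hypothetical and are not taken from the original analysis.

```python
import numpy as np

def sample_binary_network(fa, rng):
    """Draw one binary adjacency matrix from a symmetric FA matrix.

    Each FA value w_ij in [0, 1] is read as a link probability:
    a_ij = 1 if w_ij > r_ij, with r_ij drawn from U(0, 1), else 0.
    Assumption: one draw per unordered pair, so the result is symmetric.
    """
    fa = np.asarray(fa, dtype=float)
    n = fa.shape[0]
    iu = np.triu_indices(n, k=1)           # upper triangle, no diagonal
    r = rng.random(len(iu[0]))             # one uniform number per pair
    a = np.zeros((n, n), dtype=int)
    a[iu] = (fa[iu] > r).astype(int)       # Heaviside step of (w_ij - r)
    return a + a.T                         # mirror to the lower triangle

# Toy 3-region FA matrix (hypothetical values).
rng = np.random.default_rng(0)
fa = np.array([[0.0, 1.0, 0.0],
               [1.0, 0.0, 0.5],
               [0.0, 0.5, 0.0]])
a = sample_binary_network(fa, rng)
```

Repeating the draw with different seeds yields an ensemble of binary networks whose average link density follows the FA values, which is the point of avoiding a fixed threshold.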

Generative Models
Generative models are statistical processes that allow one to obtain a sample of synthetic data. The synthetic networks obtained through such processes share selected properties with the observed data; this is ensured by model parameters that are usually obtained by fitting the observed data. In this study, we consider four different types of generative models; for each type, we fit the underlying parameters to each empirical connectome separately and generate 100 independent realizations, to obtain an ensemble of synthetic networks for each connectome. Therefore, we have a total of 400 synthetic networks for each empirical network.
The Erdős-Rényi model (ERM) generates random graphs with the same number of vertices and the same number of edges as the real network. These topological features are preserved in each realization of the model, whereas the network structure changes randomly.
The configuration model (CM) reproduces the degree distribution of the real network, preserving the degree of the nodes while avoiding multiple edges; in the literature, this model is also known as the degree-preserving random rewiring model (Maslov & Sneppen, 2002). The parameters used to fit this data-driven model are as many as the number of nodes N, and each represents the corresponding node degree $k_i$ ($i = 1, 2, \ldots, N$). Each node i has $k_i$ stubs (edge halves), which are randomly reconnected to other nodes, yielding a network with no topological correlations that preserves the observed connectivity of each node.
The stochastic block model (SBM) allows one to define an ensemble of random models that reproduce the mesoscale organization present in the real network. A block consists of a group of nodes that are more likely to be connected to each other than to nodes from other groups. Here, we use graph-tool (Peixoto, 2019), an efficient Python module for statistical analysis of graphs and network manipulation, to fit the degree-corrected SBM.
The hyperbolic model (HM) is based on two important parameters, the popularity and the similarity (Papadopoulos et al., 2012), whose trade-off is responsible for the network structure: the two parameters are physically formalized and geometrically interpreted, and their product is optimized in order to obtain connections in the network. To fit this model, we use the Mercator method (García-Pérez, Allard, Serrano, & Boguñá, 2019), which maps real complex networks into a hyperbolic geometric space; this geometry is able to provide a more accurate interpretation of the connectome structure than Euclidean geometry. We use Mercator (Network Geometry - Mercator, 2020) to generate this class of synthetic networks.
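To make the two simplest null models concrete, the following is a minimal pure-Python sketch of an Erdős-Rényi graph with a fixed number of edges and of degree-preserving rewiring by double-edge swaps. It is an illustration under stated assumptions, not the procedure used to produce the ensembles in this work: the function names are hypothetical, the toy ring network stands in for an empirical connectome, and the SBM and HM fits (graph-tool and Mercator) are not covered.

```python
import random

def erdos_renyi_gnm(n, m, rng):
    """Random simple graph with n nodes and exactly m edges."""
    all_pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return set(rng.sample(all_pairs, m))

def degree_preserving_rewire(edges, n_swaps, rng):
    """Randomize a simple graph by double-edge swaps, keeping every degree."""
    edges = {tuple(sorted(e)) for e in edges}
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        (a, b), (c, d) = rng.sample(sorted(edges), 2)
        if rng.random() < 0.5:             # pick one of the two swap orientations
            c, d = d, c
        new1 = tuple(sorted((a, d)))       # proposed edges: a-d and c-b
        new2 = tuple(sorted((c, b)))
        # Reject swaps that would create self-loops or multiple edges.
        if a == d or c == b or new1 in edges or new2 in edges:
            continue
        edges.difference_update({(a, b), tuple(sorted((c, d)))})
        edges.update({new1, new2})
        done += 1
    return edges

def degree_sequence(edges, n):
    """Degree of each node 0..n-1 in an undirected edge set."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg

# Toy "empirical" network: a ring of 10 nodes (hypothetical stand-in).
rng = random.Random(42)
n = 10
ring = {tuple(sorted((i, (i + 1) % n))) for i in range(n)}
er_edges = erdos_renyi_gnm(n, len(ring), rng)                    # same N and number of edges
cm_edges = degree_preserving_rewire(ring, n_swaps=20, rng=rng)   # same degree sequence
```

The Erdős-Rényi realization keeps only the size and density of the toy network, whereas the rewired realization additionally keeps every node degree, which mirrors the increasing amount of structure retained by the ERM and the CM.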

Random Walks on Connectomes
Information flow in complex networks, such as human connectomes, has been modeled by diffusive processes like random walks (Masuda, Porter, & Lambiotte, 2017). As Markovian processes, different types of random walks are defined in terms of transition matrices encoding the probability of jumps from nodes to neighbors. In this work, we use two types: the classical random walk (CRW) (Noh & Rieger, 2004) and the maximal entropy random walk (MERW) (Burda et al., 2009).
For both types of dynamics, the Laplacian matrix is defined by $L = I - T$, where T is the transition matrix governing random walk dynamics and I is the identity matrix. Let the i-th component of the vector $p(\tau)$ indicate the probability of finding the random walker in node i at time $\tau$. The evolution of the probability vector is governed by the master equation
$$p(\tau + 1) = p(\tau)\,T. \tag{1}$$
In the continuous-time approximation, Equation 1 reduces to
$$\partial_\tau p(\tau) = -p(\tau)\,L, \tag{2}$$
with solution given by $p(\tau) = p(0)\,e^{-\tau L}$.
In a classical random walk on a binary network, the transition matrix is defined as $T^{(CRW)}_{ij} = A_{ij}/k_i$, where $k_i$ is the degree of the i-th node and A is the adjacency matrix. In MERW the transition matrix is defined in terms of the eigenvector associated with the largest eigenvalue of the adjacency matrix. Assume the eigenvalues of the adjacency matrix are ordered as $a_\ell$, $\ell = 1, 2, \ldots, N$, where $a_N$ has the maximum value, and their corresponding eigenvectors are given by $q^{(\ell)}$. The transition matrix, in this case, is given by
$$T^{(MERW)}_{ij} = \frac{A_{ij}}{a_N}\,\frac{q^{(N)}_j}{q^{(N)}_i}.$$
One of the interesting features of MERW is that the probability of a transition from one node to another within $\tau$ time steps is independent of the intermediate transitions: all trajectories of length $\tau$ from the i-th to the j-th node are equiprobable.
The Laplacian matrix for CRW is $L^{(CRW)} = I - T^{(CRW)}$, while for MERW it is defined by $L^{(MERW)} = I - T^{(MERW)}$. Therefore, each dynamical process can be obtained from Equation 2 by choosing the relevant Laplacian matrix.
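As an illustration, the two transition matrices and their Laplacians can be built with NumPy as sketched below. The sketch assumes a connected, undirected binary network without isolated nodes and uses the standard MERW form of Burda et al. (2009), $T_{ij} = (A_{ij}/a_N)\,(q^{(N)}_j/q^{(N)}_i)$; the function names and the 4-node toy network are hypothetical.

```python
import numpy as np

def crw_transition(A):
    """Classical random walk: T_ij = A_ij / k_i (rows sum to 1)."""
    k = A.sum(axis=1)
    return A / k[:, None]

def merw_transition(A):
    """Maximal entropy random walk: T_ij = (A_ij / a_N) * q_j / q_i,
    with a_N the largest adjacency eigenvalue and q its eigenvector."""
    vals, vecs = np.linalg.eigh(A)         # eigenvalues in ascending order
    a_max = vals[-1]
    q = np.abs(vecs[:, -1])                # leading eigenvector, sign fixed to positive
    return (A / a_max) * (q[None, :] / q[:, None])

def rw_laplacian(T):
    """Random walk Laplacian L = I - T."""
    return np.eye(T.shape[0]) - T

# Toy connected network on 4 nodes (hypothetical example).
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

T_crw = crw_transition(A)
T_merw = merw_transition(A)
L_crw = rw_laplacian(T_crw)
L_merw = rw_laplacian(T_merw)

# One discrete step of the master equation, p(tau + 1) = p(tau) T,
# starting with the walker on node 0.
p0 = np.array([1.0, 0.0, 0.0, 0.0])
p1 = p0 @ T_crw
```

Both matrices are row-stochastic, so they only redistribute probability; they differ in that the CRW weights each step using the local degree, whereas the MERW weights it using a global spectral property of the network.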

Statistical Physics of Information Dynamics
Characterizing the flow of information between nodes in a complex network is challenging, requiring a deep understanding of the network topology, relevant dynamical processes, and the interplay between them. Recently, a statistical field theory has been introduced to describe the information flow between the components of complex systems in terms of the dynamics of a field on top of the network, moving among nodes. In this framework, for a network denoted as G, the dynamical process governing the flow can be described in terms of a general differential equation, which, after linearization, reduces to a Schrödinger-like equation with a quasi-Hamiltonian $\hat{H}(G)$ and the propagator $e^{-\tau \hat{H}(G)}$ that determines the flow trajectories at time $\tau$. Furthermore, the propagator can be eigen-decomposed to obtain an ensemble of operators acting like information streams, directing the flow of the field from unit to unit. The topological factors, the type of the quasi-Hamiltonian, and $\tau$ affect the size of the streams and, consequently, each stream can be active or nonactive (i.e., having negligible size) under a specific system configuration.
To study the macroscopic properties of these microscopic interactions between the nodes, it has been shown that a superposition of the information streams, weighted by their activation probabilities, provides a Gibbsian-like density matrix describing the state of the system:
$$\rho_\tau(G) = \frac{e^{-\tau \hat{H}(G)}}{Z_\tau(G)},$$
where $Z_\tau(G) = \mathrm{Tr}[e^{-\tau \hat{H}(G)}]$ plays the role of the partition function and is related to the transport properties of the network. Using the above density matrix, one can quantify the mixedness of the information streams in terms of the von Neumann entropy as
$$S_\tau(G) = -\mathrm{Tr}\left[\rho_\tau(G) \log_2 \rho_\tau(G)\right],$$
which is also a measure of the diversity of the flow dynamics. The maximum value of the von Neumann entropy of the system is $\log_2 N$, corresponding to the state where all information streams are active with the same size, so that all the streams must be considered to capture the dynamics. At large temporal scales $\tau$, the entropy of a connected network is expected to decay, as the distribution of the field becomes less dependent on the initial conditions. Interestingly, it has been shown that the von Neumann entropy can be used to measure the functional diversity of nodes as senders of information. In fact, in a system where the overlap between the flow distributions originating from different nodes is high, we get lower values for the entropy.
It is worth noting that if the dynamical process is continuous diffusion governed by the combinatorial Laplacian, the von Neumann entropy obtained from the above statistical field theory is equal to the spectral entropy (De Domenico & Biamonte, 2016), introduced to analyze complex networks from an information-theoretic perspective.
Here, we consider random walk dynamics as a proxy for information transport in human connectomes. Therefore, the quasi-Hamiltonian equals $L^{(CRW)}$ for the classical random walk and $L^{(MERW)}$ for the maximal entropy random walk.
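A minimal numerical sketch of this entropy is given below. It assumes that the entropy is evaluated from the eigenvalues $\lambda_i$ of the random walk Laplacian, which are real for an undirected connected network because $I - T$ is similar to a symmetric matrix: the eigenvalues of the density matrix are then $\mu_i = e^{-\tau\lambda_i}/\sum_j e^{-\tau\lambda_j}$ and the entropy is $-\sum_i \mu_i \log_2 \mu_i$. The function name and the toy network are hypothetical.

```python
import numpy as np

def spectral_entropy(L, tau):
    """Von Neumann entropy (in bits) of rho = exp(-tau L) / Tr[exp(-tau L)].

    Assumption: L has a real spectrum, so the entropy can be computed
    from its eigenvalues alone.
    """
    lam = np.linalg.eigvals(L).real        # Laplacian eigenvalues
    w = np.exp(-tau * lam)                 # stream sizes
    mu = w / w.sum()                       # normalize by the partition function
    mu = mu[mu > 0]                        # skip terms that underflow to zero
    return float(-(mu * np.log2(mu)).sum())

# Toy connected network on 4 nodes and its classical random walk Laplacian.
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
L_crw = np.eye(4) - A / A.sum(axis=1)[:, None]

# Entropy across temporal scales: small tau probes the microscale,
# large tau the macroscale.
taus = [0.0, 1.0, 100.0]
entropies = [spectral_entropy(L_crw, tau) for tau in taus]
```

At $\tau = 0$ all streams have the same size and the entropy equals $\log_2 N$; as $\tau$ grows only the stream associated with the zero eigenvalue survives and the entropy decays toward zero, in line with the behavior expected for a connected network.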

Maximum Posterior Probability
To quantify the statistical significance of the differences (or similarities) between the connectomes coming from two distinct groups (e.g., empirical data vs. model, control vs. disease, etc.) considered for this work, we performed pairwise t tests, adequately corrected for multiple comparisons. In particular, we tested two distinct null hypotheses ($H_0$): (a) the generative models reproduce the real data (data set 1) and (b) the spectral entropies in healthy and nonhealthy brains are equal (data set 2). Since we are performing multiple tests, that is, we tested real data against the four generative models and the healthy controls against different stages of dementia, we adjusted the resulting p values by means of the Bonferroni-Holm method. All pairwise t tests are performed by considering a 95% confidence interval.
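For reference, the Bonferroni-Holm adjustment can be sketched in a few lines of pure Python. The function name `holm_adjust` and the example p values are hypothetical; the code only illustrates the step-down rule and is not the implementation used in this work.

```python
def holm_adjust(pvals):
    """Bonferroni-Holm adjusted p values, returned in the input order.

    The k-th smallest p value (k = 0, 1, ...) is multiplied by (m - k),
    capped at 1, and forced to be non-decreasing along the sorted order.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        value = min(1.0, (m - rank) * pvals[i])
        running_max = max(running_max, value)   # enforce monotonicity
        adjusted[i] = running_max
    return adjusted

# Three hypothetical raw p values from pairwise tests.
raw = [0.01, 0.04, 0.03]
adj = holm_adjust(raw)   # approximately [0.03, 0.06, 0.06]
```

The adjusted values, rather than the raw ones, are the quantities that enter any subsequent calibration.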
To avoid confusion in the interpretation of the p values, here we use a Bayesian approach for p value calibration proposed by Sellke, Bayarri, and Berger (2001), so that p values can be interpreted from both a frequentist and a Bayesian perspective. Specifically, we compute the Bayes factor as
$$B(p) = -e\,p\,\log(p) \tag{5}$$
for $p < 1/e$, which corresponds to the lower bound on the odds provided by the data for $H_0$ and $H_1$, the latter being the alternative hypothesis. If we consider the frequentist error probability of rejecting $H_0$ when it is true (type I error), the calibration is given by
$$\alpha(p) = \left[1 + \left(-e\,p\,\log(p)\right)^{-1}\right]^{-1},$$
where, in this case, p is the adjusted p value. Therefore, we have two possible interpretations for the outcome of this calibration. From a frequentist perspective, it precisely coincides with the error probability of rejecting a true null hypothesis. From a Bayesian perspective, it is the (maximum) posterior probability of $H_0$, provided that the Bayes factor corresponds to the one expressed in Equation 5 and assuming that $H_0$ and $H_1$ have equal prior probabilities of 1/2. The results of the t tests can thus be interpreted either as the probability of rejecting the null hypothesis when it is true or as the probability of the null hypothesis itself. In other words, lower recalibrated p values indicate lower accordance between the samples that we are testing (lower maximum posterior probability), while higher recalibrated p values can be interpreted as a higher probability of accordance between the samples under consideration.
Partition function: The summation of the stream sizes, encoding each stream contribution to the overall flow, that is used to normalize the density matrix.
Information streams: Operators (i.e., matrices) obtained from eigen-decomposition of propagator, responsible for directing information flow through the system.
P value calibration: Recomputing the p value so that it can be interpreted as the probability of H 0 , under some specific assumptions.
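The calibration of Sellke, Bayarri, and Berger (2001) is simple enough to sketch directly: the lower bound on the Bayes factor is $B(p) = -e\,p\,\ln p$ for $p < 1/e$, and the calibrated value is $\alpha(p) = [1 + B(p)^{-1}]^{-1}$. The function names below are hypothetical, and the natural logarithm is assumed.

```python
import math

def bayes_factor_bound(p):
    """Lower bound on the Bayes factor of H0 to H1: B(p) = -e * p * ln(p)."""
    if not 0.0 < p < 1.0 / math.e:
        raise ValueError("the bound is only defined for 0 < p < 1/e")
    return -math.e * p * math.log(p)

def calibrated_p(p):
    """Calibrated p value: alpha(p) = 1 / (1 + 1 / B(p)).

    With equal prior probabilities for H0 and H1, this is the maximum
    posterior probability of H0.
    """
    b = bayes_factor_bound(p)
    return 1.0 / (1.0 + 1.0 / b)

# A (possibly adjusted) p value of 0.05 calibrates to about 0.289,
# and a p value of 0.01 to about 0.111.
alpha_05 = calibrated_p(0.05)
alpha_01 = calibrated_p(0.01)
```

The example shows why the calibration matters for interpretation: a p value of 0.05 still leaves a maximum posterior probability of the null hypothesis close to 29%.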