Adaptive rewiring in nonuniform coupled oscillators

Abstract Structural plasticity of the brain can be represented in a highly simplified form as adaptive rewiring: the relaying of connections according to spontaneous dynamic synchronization in network activity. Adaptive rewiring, over time, leads from initially random networks to brain-like complex networks, that is, networks with modular small-world structures and a rich-club effect. Adaptive rewiring has only been studied, however, in networks of identical oscillators with uniform or random coupling strengths. To implement information-processing functions (e.g., stimulus selection or memory storage), it is necessary to consider symmetry-breaking perturbations of oscillator amplitudes and coupling strengths. We studied whether nonuniformities in amplitude or connection strength could operate in tandem with adaptive rewiring. Throughout network evolution, either the amplitude or the connection strength of a subset of oscillators was kept different from that of the rest. Under these extreme conditions, subsets might become isolated from the rest of the network or otherwise interfere with the development of network complexity. However, although these subsets form distinctive structural and functional communities, they generally maintain connectivity with the rest of the network and allow the development of network complexity. Pathological development was observed only in a small proportion of the models. These results suggest that adaptive rewiring can robustly operate alongside information processing in biological and artificial neural networks.


Matrix-vector Realization of Coupled Logistic Maps
In the main body of the manuscript, we stated that the dynamics of a network of $N$ coupled logistic maps, each described by

$$x_{i,t+1} = (1 - \varepsilon_i)\, f_i(x_{it}) + \frac{\varepsilon_i}{\sum_j a_{ij}} \sum_j a_{ij}\, f_j(x_{jt}), \qquad f_i(x) = 1 - \alpha_i x^2, \tag{S1}$$

may be rewritten in matrix notation as

$$\mathbf{x}_{t+1} = (\mathbf{1} - \boldsymbol{\varepsilon}) \odot f(\mathbf{x}_t) + \boldsymbol{\varepsilon} \odot \big( A_t f(\mathbf{x}_t) \big) \oslash \big( A_t \mathbf{1} \big). \tag{S2}$$

The equivalence of Equations S1 and S2 can be shown as follows. First, we write the activities of all nodes at time $t + 1$ as a vector of size $N$:

$$\mathbf{x}_{t+1} = \big[ x_{1,t+1}, x_{2,t+1}, \ldots, x_{N,t+1} \big]^\top. \tag{S3}$$

Denote the first and second terms on the right-hand side of Equation S1 by I and II, and the numerator and denominator of the fraction in II by III and IV. Term I is the element-wise (Hadamard) product of two vectors:

$$\mathrm{I} = (\mathbf{1} - \boldsymbol{\varepsilon}) \odot f(\mathbf{x}_t). \tag{S4}$$

Term III is the sum of the activities of the nodes connected to node $i$; we calculate it by left-multiplying the vector of node activities by the adjacency matrix $A_t$:

$$\mathrm{III} = A_t f(\mathbf{x}_t). \tag{S5}$$

This multiplication implies that $\mathrm{III}_i$ (the $i$th element of the vector) equals $\sum_j a_{ij} (1 - \alpha_j x_{jt}^2)$, in which $a_{ij}$, the element on row $i$ and column $j$ of $A_t$, is 1 if nodes $i$ and $j$ are connected and 0 otherwise. In other words, the multiplication selects the neighboring nodes and sums their values.
In a similar manner, IV is the vector whose $i$th element is the number of nodes connected to node $i$. This vector equals the row-wise sum of $A_t$, which can be calculated via matrix multiplication:

$$\mathrm{IV} = A_t \mathbf{1}. \tag{S6}$$

Term II can now be expressed with the Hadamard product ($\odot$) and division ($\oslash$) operators:

$$\mathrm{II} = \boldsymbol{\varepsilon} \odot \mathrm{III} \oslash \mathrm{IV}. \tag{S7}$$

Thus, given the commutative, associative, and distributive properties of the Hadamard operators, substituting Equations S5 and S6 yields

$$\mathrm{II} = \boldsymbol{\varepsilon} \odot \big( A_t f(\mathbf{x}_t) \big) \oslash \big( A_t \mathbf{1} \big). \tag{S8}$$

Putting Equations S4 and S8 together and rearranging terms proves the equivalence:

$$\mathbf{x}_{t+1} = \mathrm{I} + \mathrm{II} = (\mathbf{1} - \boldsymbol{\varepsilon}) \odot f(\mathbf{x}_t) + \boldsymbol{\varepsilon} \odot \big( A_t f(\mathbf{x}_t) \big) \oslash \big( A_t \mathbf{1} \big). \tag{S9}$$
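The matrix-vector update of Equation S2 maps directly onto array code. The following is a minimal NumPy sketch (the function name `step` and its calling convention are ours, not taken from the manuscript); `f`, `neigh_sum`, and `degree` correspond to the element-wise map, term III, and term IV, respectively.

```python
import numpy as np

def step(x, A, alpha, eps):
    """One synchronous update of N coupled logistic maps (Equation S2).

    x     : (N,) activities at time t
    A     : (N, N) 0/1 adjacency matrix A_t
    alpha : (N,) per-node amplitudes
    eps   : (N,) per-node coupling strengths in [0, 1]
    """
    f = 1.0 - alpha * x**2            # f(x_t), applied element-wise
    neigh_sum = A @ f                 # term III: summed activity of neighbors
    degree = A @ np.ones_like(x)      # term IV: node degrees (row sums of A_t)
    # term I + term II, with Hadamard product/division as plain array operations
    return (1.0 - eps) * f + eps * neigh_sum / degree
```

Note that a node of degree zero would make term IV zero and cause a division by zero, so the sketch assumes every node has at least one neighbor.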

Networks as Distributions
Network statistics can describe network structures either locally or globally. Local measures are suitable for node-wise (or clique-wise) comparisons, while global measures aggregate local properties into summary statistics for the whole structure. Local measures hardly lead to a holistic description of networks, as nodes are usually described in isolation from each other; in aggregating them into global measures, on the other hand, structural information is sacrificed. Therefore, neither local nor global measures are optimally suited for the comparison of networks. A possible solution is to use the distributions of several local measures. Berlingerio et al. (2012) suggest characterizing each node i of the network with a seven-dimensional feature vector consisting of the following local measures, which capture characteristics of the node and its first- and second-order neighbors:
1. d i = |N (i)|, the number of neighbors, or degree (N (i) is the set of first-order neighbors of node i, i.e., the nodes directly connected to i);
2. c i , the local clustering coefficient, which is the number of triangles connected to node i over the number of connected triples centered on node i;
3. the average degree of node i's neighbors;
4. the average clustering coefficient of node i's neighbors;
5. the number of edges in node i's egonet (the subgraph induced by i and its neighbors);
6. the number of edges leaving node i's egonet;
7. the number of neighbors of node i's egonet.
The distribution of local features enables the comparison of their summary statistics. In NetSimile (Berlingerio et al., 2012), the feature distribution (a nodes × features matrix) is aggregated into a 35-dimensional "signature vector" consisting of five summary statistics for each feature: median, mean, standard deviation, skewness, and kurtosis. The comparison of networks is thus reduced to calculating distances (or similarities) between the signature vectors. NetSimile compares favorably with other methods of inferring network similarity, as its computational complexity grows linearly with network size. More importantly, NetSimile allows the comparison of networks of different sizes.
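The aggregation of the nodes × features matrix into a signature vector can be sketched as follows. This is a minimal illustration assuming population (biased) moment estimators; the estimator conventions of the actual NetSimile implementation may differ.

```python
import numpy as np

def signature(features):
    """Aggregate a (nodes x features) matrix into a NetSimile-style signature
    vector: median, mean, standard deviation, skewness, and excess kurtosis
    of every feature column, concatenated feature by feature."""
    features = np.asarray(features, dtype=float)
    mu = features.mean(axis=0)
    sd = features.std(axis=0)                          # population standard deviation
    z = (features - mu) / np.where(sd == 0, 1.0, sd)   # guard zero-variance columns
    skew = (z**3).mean(axis=0)
    kurt = (z**4).mean(axis=0) - 3.0                   # excess kurtosis
    stats = np.vstack([np.median(features, axis=0), mu, sd, skew, kurt])
    return stats.T.ravel()                             # 5 statistics per feature
```

With seven features per node, the result is the 35-dimensional signature vector described above.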
Ranks and values of summary statistics characterize the overall shape of distributions and are thus highly diagnostic for their comparison (Berlingerio et al., 2012). Hence, the signature vectors are akin to ranked lists. The Canberra distance, defined in Equation S10 for two vectors V and W of size n,

$$d(V, W) = \sum_{i=1}^{n} \frac{|v_i - w_i|}{|v_i| + |w_i|}, \tag{S10}$$

has been shown to be an appropriate dissimilarity measure for ranked lists (Jurman et al., 2009), as it is sensitive to small deviations from zero and normalizes the pairwise differences of features by their absolute values. Moreover, Berlingerio et al. (2012) report that the Canberra distance outperforms other distance measures in discriminating signature vectors, a desirable property of a dissimilarity measure for the task at hand.
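The Canberra distance of Equation S10 is straightforward to compute; the sketch below adopts the common convention that a coordinate where both vectors are zero contributes zero to the sum.

```python
import numpy as np

def canberra(v, w):
    """Canberra distance (Equation S10): sum_i |v_i - w_i| / (|v_i| + |w_i|),
    with terms where v_i = w_i = 0 defined as zero."""
    v = np.asarray(v, dtype=float)
    w = np.asarray(w, dtype=float)
    num = np.abs(v - w)
    den = np.abs(v) + np.abs(w)
    terms = np.divide(num, den, out=np.zeros_like(num), where=den != 0)
    return float(terms.sum())
```

SciPy's `scipy.spatial.distance.canberra` provides an equivalent implementation.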
We use this dissimilarity metric in the pairwise comparison of the signature vectors derived from the NetSimile algorithm. However, NetSimile does not allow hypothesis testing to infer significance levels for the distances. Berlingerio et al. (2012) suggest testing for the independence of the distributions by pairwise comparison of the univariate distributions of the features and aggregating their p-values through averaging or taking the maximum. The authors report that neither the Mann-Whitney nor the Kolmogorov-Smirnov test (both nonparametric tests that make no assumptions about the distributions being compared) yields meaningful discrimination among networks. Notably, this approach to hypothesis testing ignores the multivariate dependencies among the features. Hence, we use another method to test the independence of distributions, which is discussed below.

Hypothesis Testing for Similarities of Network Distributions
The significance tests used by Berlingerio et al. (2012) posit multivariate independence among features and lack what the authors call "discrimination power." To tackle this issue, one must use multivariate dependence tests. Since parametric dependence tests make assumptions about the distributions being compared, we used the Heller-Heller-Gorfine (HHG) nonparametric permutation test of multivariate dependence (Heller et al., 2013), implemented in the HHG R package (Brill et al., 2018). HHG is a consistent omnibus test for the null hypothesis of distributional independence, i.e., that the joint distribution of two multivariate random variables X and Y equals the product of the marginal distributions of those variables. Equation S11 shows the null and alternative hypotheses:

$$H_0: F_{XY}(x, y) = F_X(x)\, F_Y(y), \ \forall x, y \qquad \text{vs.} \qquad H_1: F_{XY}(x, y) \neq F_X(x)\, F_Y(y) \ \text{for some } (x, y). \tag{S11}$$
HHG has reasonable computational complexity and operates on norm-based distance matrices computed separately from the samples of X and Y. The technical details of this method are beyond the scope of this paper. In short, HHG iteratively forms hyperspheres in the joint space of F XY (x, y) and, based on the implications of the null hypothesis, quantifies evidence against H 0 with likelihood-ratio or Pearson chi-square test statistics over contingency tables. From these tests, one can derive permutation p-values that can be interpreted as evidence against the null hypothesis of independence of the distributions. Hence, the lower the p-value, the more evidence favors the dependence of the distributions being compared. Loosely speaking, one can treat the p-values derived from the HHG method as a form of distributional dissimilarity: the lower the p-value, the more "similar" the distributions are to each other. This interpretation allows us to compare nonsignificant p-values as relative measures of resemblance.
The hhg.test() function in the HHG package runs the test for a given number of permutations on the distance matrices of the samples in X and Y. It outputs four different permutation p-values, based on the sums or the maxima of the likelihood-ratio or chi-square scores over all 2 × 2 contingency tables. We ran HHG for 2,000 permutations for each pairwise comparison and extracted the permutation p-value based on the maximum of the likelihood-ratio statistics, as it yielded higher discriminative power than the other test statistics.
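The permutation logic that turns a dependence statistic into a p-value can be illustrated with a much simpler, Mantel-style statistic (the correlation between the two distance matrices). The sketch below shows only the permutation mechanics shared with HHG; it is not the HHG test itself, whose statistic is far more elaborate.

```python
import numpy as np

def permutation_pvalue(Dx, Dy, n_perm=2000, seed=0):
    """Permutation p-value for dependence between two samples, given their
    pairwise distance matrices Dx and Dy. The statistic is Mantel-style:
    |correlation| of the upper-triangular distance entries."""
    rng = np.random.default_rng(seed)
    n = Dx.shape[0]
    iu = np.triu_indices(n, k=1)          # upper triangle, excluding the diagonal

    def stat(perm):
        Dp = Dy[np.ix_(perm, perm)]       # relabel the sample points of Y
        return abs(np.corrcoef(Dx[iu], Dp[iu])[0, 1])

    observed = stat(np.arange(n))
    hits = sum(stat(rng.permutation(n)) >= observed for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)      # add-one correction keeps p > 0
```

As in HHG, a small p-value indicates that shuffling the correspondence between the two samples rarely produces as much apparent dependence as the observed pairing.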

Figure S1
Dimension reduction of nonlinear neuronal dynamics. (A) Phase space attractor of a three-dimensional neural mass flow. This attractor is an illustration of the dynamics generated by the flow of a neural mass model (cf. Breakspear et al., 2003). The dynamical variables represent the mean membrane potential of pyramidal (V) and inhibitory (Z) neurons, and the average number of open potassium ion channels (W). (B) Poincaré first return map from the same attractor (cf. Breakspear et al., 2003); this map captures key features of the neural mass flow by following each trajectory from one intersection (V) of the attractor to the next (P(V)). (C) The quadratic logistic map, which has the same unimodal topology as the neural mass Poincaré return map. While the logistic map lacks the "thickness" of the neural mass map, it is several orders of magnitude faster to compute, hence allowing more detailed quantitative analysis. Figure and caption adapted from "Symbiotic relationship between brain structure and dynamics," by M. Rubinov, O. Sporns, C. van Leeuwen, and M. Breakspear, 2009, BMC Neuroscience, 10(1), 55. CC BY 2.0.

Figure S2
Feigenbaum diagram of 200 sampled values of the logistic map at each amplitude (after a burn-in period of 4,000 iterations). For the three levels of amplitude used in this study (i.e., α = 1.7, 1.8, or 1.9; vertical lines), logistic maps exhibit chaotic behavior, making them suitable dynamic elements in our models.
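The chaoticity of the map $x \mapsto 1 - \alpha x^2$ at a given amplitude can also be checked numerically by estimating its Lyapunov exponent, the orbit average of $\ln|f'(x)| = \ln|2\alpha x|$, which is positive for chaotic parameter values. A minimal sketch (the burn-in and iteration counts are illustrative):

```python
from math import log

def lyapunov(alpha, x0=0.1, burn_in=4000, n=20000):
    """Estimate the Lyapunov exponent of the logistic map x -> 1 - alpha*x**2
    as the orbit average of ln|f'(x)| = ln|2*alpha*x|."""
    x = x0
    for _ in range(burn_in):            # discard the transient
        x = 1.0 - alpha * x * x
    total = 0.0
    for _ in range(n):
        total += log(abs(2.0 * alpha * x))
        x = 1.0 - alpha * x * x
    return total / n
```

A positive estimate corresponds to the chaotic bands visible in the diagram, whereas amplitudes with stable fixed points or cycles yield negative values.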