## Abstract

Criticality is thought to be crucial for complex systems to adapt at the boundary between regimes with different dynamics, where the system may transition from one phase to another. Numerous systems, from sandpiles to gene regulatory networks to swarms to human brains, seem to work towards preserving a precarious balance right at their critical point. Understanding criticality therefore seems strongly related to a broad, fundamental theory for the physics of life as it could be, which still lacks a clear description of how life can arise and maintain itself in complex systems. In order to investigate this crucial question, we model populations of Ising agents competing for resources in a simple 2D environment subject to an evolutionary algorithm. We then compare its evolutionary dynamics under different experimental conditions. We demonstrate the utility that arises at a critical state and contrast it with the behaviors and dynamics that arise far from criticality. The results show compelling evidence that not only is a critical state remarkable in its ability to adapt and find solutions to the environment, but the evolving parameters in the agents tend to flow towards criticality if starting from a supercritical regime. We present simulations showing that a system in a supercritical state will tend to self-organize towards criticality, in contrast to a subcritical state, which remains subcritical though it is still capable of adapting and increasing its fitness.

## 1 Introduction

Critical systems are poised on the cusp between order and disorder and occur ubiquitously in nature [3], particularly in biological systems [13]. A growing body of research has shown that critical systems offer a plethora of desirable properties for complex, interacting, self-organizing systems that allow for long-range interactions enabled by modular and scale-free organization [10]. The phase transition can be sharp, as in the liquid-vapor transition for water, or gradual, as in superconductivity. Many properties are associated with criticality, including power-law divergences of some quantities such as magnetic susceptibility (described by critical exponents), ergodicity breaking, and fractal behavior [15].

Criticality has been studied in numerous systems, from sandpiles [4]; to gene networks [5, 11, 17]; to neural cultures [16]; to swarms, either natural [2, 7] or artificial [12, 19]; and even to the human brain [8, 18]. How such a wide range of systems across many levels of organization can self-organize towards criticality remains an unresolved question. However, new numerical methods allow researchers to simulate this process.

The 2D Ising model is well known for being an analytically solved model of criticality; however, the properties, or even the existence, of criticality for more general systems are not well understood. Recent work by Aguilera and Bedia [1] has demonstrated how arbitrary systems can be taught to be critical through the use of an inverse-Ising gradient descent algorithm on pairwise correlations. Criticality was learned by teaching the arbitrary model to generate the same distribution of correlations that a critical Ising model would. The critical correlation distribution was treated as an organizational invariant that any arbitrary system can be taught to replicate. The caveat, however, is that this arbitrary form of criticality is not necessarily functional in the context of some kind of objective function.

In this article, we explore the utility of critical learning by contextualizing our model within an artificial-life-style evolutionary simulation. Our model is a community of Ising neural agents composed of binarized neural networks. These agents react and compete for resources in a shared environment. The Ising model is chosen due to the extensive literature already written on the model in the context of criticality and learning [13, 20]. Our goal is to test whether an evolutionary algorithm will arrive at a critical state, in order to explore the relationship between the exploratory nature of critical systems and the pragmatic, solution-finding nature of evolutionary selection.

In contrast to sub- or supercritical systems, which are either too orderly or rigid to be sensitive to environmental perturbations, or too noisy to be able to propagate information, critical systems are maximally susceptible to perturbations and explore a broad dynamic range of behaviors. Critical systems are most commonly associated with the phenomena of diverging correlation lengths and are often detected by measuring thermodynamic properties or “order parameters” and observing power-law scalings near these divergences. While it is not clear what the cost of maintaining a critical system may be, it seems advantageous for living systems to be able to maintain such an integrated informational complex in order to maximally explore their behavioral state space. In the face of evolutionary selection pressures, critical systems may allow for more efficient and exploratory navigation of the solution state space. In this project, we seek to understand the interaction between critical systems and the selection pressures of life by simulating agents that can learn arbitrary forms of criticality but are also functionally competing for resources.

In our experiment, we subjected communities of Ising neural agents to an evolutionary algorithm where the edge weights of the neural network as well as the inverse-temperature (β) parameter of each organism could mutate and evolve. The temperature controled the propensity of the networks to minimize their energies. Three different simulations were run, initialized in a subcritical (β = 10), a supercritical (β = 0.1), and a critical (β = 1) regime by controlling the initial temperatures of the communities. We explored these distinct communities' evolutionary trajectories under these three different initial conditions.

## 2 Model

A community of 50 Ising neural agents were simulated via Glauber transitions [9] in a simple shared square 2D environment with periodic boundaries, where the community could interact with ever-spawning resources (food) (Figure 1). Each agent was composed of a network of neurons containing three sensory neurons, three hidden neurons, and four motor neurons for a total of ten neurons. The sensory neurons were sensitive to the distance to the closest food element, the angle to food, and a custom-made directional proximity sensor that was sensitive to the presence of other agents in front of the individual. The motor neurons controlled the linear and angular accelerations dv, . The hidden neurons bridged the sensory and motor neurons and were also interconnected with other hidden neurons. The agents did not take up space in the model, could overlap, and needed only to approach the food within a sufficiently small radius to eat it.

Figure 1.

A sample frame from the simulation. 50 agents (green circles) traverse a shared two-dimensional space and eat food (purple circles) by moving on top of the food parcels. The model visualization was inspired by the tutorial shared by Rooy [14].

Figure 1.

A sample frame from the simulation. 50 agents (green circles) traverse a shared two-dimensional space and eat food (purple circles) by moving on top of the food parcels. The model visualization was inspired by the tutorial shared by Rooy [14].

When simulating the Ising model, each neuron in an agent's brain can only be in one of two states, si = ±1, with the exception of the sensor neurons, which have been allowed to have the real values in (−1, 1). The sensor neurons are calculated as
$s0=θf180,s1=2tanhrsdf2+ϵ−1,s2=2tanhλp−1,$
(1)
where the boldface variables (θf, $df2$, λp) change in time and are functions of the relationship between the agent and its local environmental configuration. Constants rs = 0.05 and ϵ = 10−6 are, respectively, the radius of an individual agent and a small floating-point number to avoid singularities. θf is the angle the agent makes from its direction in space to the nearest unit of food. df is the distance between the agent and the nearest unit of food. λp is an experimental design for a proximity sensor, which is sensitive to the existence of other agents within a line of sight of the individual. It is calculated as
$λp=∑iNrscosθijdij2+ϵ,$
(2)
where θij = 0 if 360° > θij > 90° (when organisms are behind the cone of vision of the organism). The sensors are designed to remain bounded in (−1, 1); hence the use of the modified tanh functions. The rest of the neurons are updated according to Glauber transition rates
$pi=11+eΔEi/kBT,$
(3)
where kB is the Boltzmann constant, which we can set to unity; T is the temperature of the agent (more commonly referred to through the inverse temperature β = 1/T), an important control parameter; and ΔEi is the change in energy if a spin flip (si → −si) were to occur (therefore we can write Equation 3 as pi = 1/(eΔEiβ)). The energy E is defined as
$E=−∑ijJijsisj−∑ihisi,$
(4)
where hi is the local field at site i, which acts to bias the activity of that neuron, and Jij is the interaction strength (edge weight) between sites i and j and is defined by the artificial neural networks we generate. The summation {ij} is over all connected elements. Each individual in the community is initialized with randomly generated, sparse connectivity matrices uniformly distributed in [−1, 1] in a random binary configuration. The maximum edge weight is constrained to be in [−2, 2]. The change in energy ΔEi for flipping spin si is given by
$ΔEi=(−∑jJijsisj−hisi)−(−∑jJij−sisj−hi−si)=−2Ei,$
(5)
where Ei is the interaction energy of spin si, the first term in Equation 5. Finally, the four motor neurons of the agent decide the action of the agent at the end of each time step. In reality, there are only two motor neurons, split (digitized) into two parts each. This has been done to allow the option for the agent to do nothing. For example, if s9,10 = {+1, +1} or s9,10 = {−1, −1}, the agent will accelerate or decelerate, respectively. For all other mismatched configurations s9,10 = {±1, ∓1}, the agent neither accelerates nor decelerates. The same idea holds for the rotational motors s7,8 (see Figure 2).
Figure 2.

The architecture of the neural networks contained within each agent. The weights of these networks are subject to change according to the different evolution (learning) algorithms. The hidden neurons are interconnected. Sensor and motor neurons are not allowed to have connecting edges. Sensor neurons may have real values in [−1, 1], while all other neurons are binary and may only have values −1, 1.

Figure 2.

The architecture of the neural networks contained within each agent. The weights of these networks are subject to change according to the different evolution (learning) algorithms. The hidden neurons are interconnected. Sensor and motor neurons are not allowed to have connecting edges. Sensor neurons may have real values in [−1, 1], while all other neurons are binary and may only have values −1, 1.

### 2.1 Update

To update the state of an individual agent, first the sensors are updated according to the equations (1). Then the neurons are allowed some time to thermalize to the new sensors before the motor neurons activate action. In other words, the hidden and motor neurons are allowed to react to their new environment for a period, which we have set to 10 full cycles, wherein all neurons (in randomized order) are allowed to flip state according to the Glauber transition probability given by Equation 3.

Upon completion of 10 cycles of thermalization, the state of the motor neurons decides the action of the agent in its environment. For each time step in the simulation an agent is selected at random and is updated according to the methods outlined above until all agents are updated. Each generation is run for 2000 time steps.

### 2.2 Evolutionary Algorithm

An evolutionary algorithm governs the adaptation of these agents. The algorithm selects for the fittest agents; in this case the fitness of an agent is defined by the total of food it has eaten in its lifetime. At the end of each generation, the bottom 30 agents that acquired the least resources are culled, and the top 20 are selected for and processed through a genetic algorithm (GA) that uses a combination of elitist selection and roulette mating. The top 10 agents from the list of 20 are duplicated with a mutation rate mr = 0.1. From the now 30-long list of agents, pairs of agents are picked at random, and their connectivity matrices (Jij) and inverse temperature (β) are combined with weighted sums (Equation 6), where the crossover weight (P) between parent 1 (p1) and parent 2 (p2) is randomly selected from a uniform distribution in [0, 1). The inverse temperature of the child organism is also multiplied by a random number sampled from a folded normal distribution with mean μ = 1 and a mutation standard deviation of σβ = 0.02. The genotypes of the agents are thus defined completely by their connectivity matrices (and inverse temperature), which have the ability to be passed on and mutated through duplication and/or mating:
$Jijchild=P×Jijp1+1−PJijp2,βchild=P×βp1+1−Pβp2σβ.$
(6)

### 2.3 Experiment Design

The design of this experiment was inspired by, and a reaction to, the work done by Aguilera and Bedia [1], from which we wanted to extend the idea of critical learning by contextualizing the adaptation algorithm to a fitness function. In their work, an arbitrary form of criticality was learned by using a gradient-descent algorithm on an arbitrary distribution of correlations from a known critical model. Though this arbitrary form of criticality yielded some limited utility in the tasks their agents were subject to, the agents were still not adept at performing these tasks. Thus, in our experiment design we wanted to have a straightforward method for the agents to flow towards a critical state (if it is truly evolutionarily advantageous) while still learning useful edge weights to perform their task. We therefore introduced one extra parameter that can evolve, the inverse temperature (β). While the evolution of the edge weights is crucial for the fitness and environment-specific behavioral response of the agents, it may be too high-dimensional a space to find criticality in reasonable time. In response, by simply introducing evolutionary variation into the inverse-temperature parameter, this experiment was designed to see if our model would flow towards a critical state if we perturbed agents far from criticality in either direction. The justification for this arises from the fact that in the classical 2D Ising model with nearest-neighbor interactions, one uses the temperature as a control parameter to tune the model to criticality.

Three simulations were run, starting with three distinct initial inverse temperatures: β = 10, β = 1, and β = 0.1. Here we assumed that this range of β's would cover the subcritical, critical, and supercritical regimes of these agents, respectively. This assumption was proven correct when heat capacity measurements were taken (see Figure 3). The goal of this experiment was to see if the far-from-critical communities (β = 10, β = 0.1) would display any tendency towards criticality. Furthermore, the evolutionary dynamics of the community that was already close to criticality (β = 1) was of interest as well. We were interested in seeing how β tends to evolve in these different communities and how the statistics of these communities differ from each other, while considering their relative fitness levels. Measurements (see Section 2.4) and visualizations of the statistics of these communities are presented in Section 3.

Figure 3.

Specific heat evolution. The specific heat (C/N) of three communities in different adapting paradigms are plotted as a function of a global factor, denoted here simply as β. The dashed lines indicate the point corresponding to actual simulation conditions. The critical community starts, by definition, near criticality, as indicated by its peak coinciding with its simulation condition (red dots). Over time this curve becomes bimodal (blue dots), where the simulation condition is close to both peaks. The supercritical community starts far from criticality, but eventually enters a critical state. The subcritical community also starts far from criticality in the opposite direction, though this community never enters a critical state and in fact moves even farther away from criticality.

Figure 3.

Specific heat evolution. The specific heat (C/N) of three communities in different adapting paradigms are plotted as a function of a global factor, denoted here simply as β. The dashed lines indicate the point corresponding to actual simulation conditions. The critical community starts, by definition, near criticality, as indicated by its peak coinciding with its simulation condition (red dots). Over time this curve becomes bimodal (blue dots), where the simulation condition is close to both peaks. The supercritical community starts far from criticality, but eventually enters a critical state. The subcritical community also starts far from criticality in the opposite direction, though this community never enters a critical state and in fact moves even farther away from criticality.

### 2.4 Measurements

The heat capacity C(βg) = (βk · βg)2(〈E2〉 − 〈E2) for each agent (k) in the community was calculated for a range of hypothetical β values. By multiplying the βk's agentwise by a global factor βg we can see at what βg the entire community is critical; the divergence of the heat capacity roughly demarcates the onset of the critical point. The heat capacity of a system is a measure of how much its temperature changes when heat energy is added. Often near critical points this function diverges (small changes in temperature lead to large changes in energy, or conversely, large changes in energy lead to asymptotically small changes in temperature). The heat capacity here was measured with respect to the global factor βg and not the individual βk's of the agents. We required that the evolved community βk values (the simulation condition) coincide with the global factor βg = 1. Plots of heat capacity versus βg show the heat capacity versus the hypothetical inverse temperatures at which the agents are perturbed to operate. If an agent is critical, we expect the peaks of these functions to be located at βg = 1, so that the heat capacity of the community of agents is near a divergence without any multiplication by a global factor. This measure was calculated by simulating the community of agents in a dreamlike state for T = 1000 time points while measuring each individual's heat capacity. This dreamlike state allows the sensors to update according to Glauber transitions as well, which is normally not the case for actual simulations.

The motor response of each agent was calculated by fixing one of the three sensory neurons to a state, running the Monte Carlo simulation for T = 5000 time points, and calculating the average activation of the motor neurons dv, that control the linear and angular acceleration of the agent. Much like the heat capacity, the motor susceptibility was measured as σ2() = 〈2〉 − 〈2. In our results we only show the motor response of the angular acceleration motor () as a function of the angle-to-food sensor (and the corresponding susceptibilities of these response functions), as this seemed to be the only sensor and motor that the GA was sensitive to.

The hypothetical fitness of each agent is a measure of the fitness an agent would have if we rescaled the local βk's of a community by some global factor βg. Similarly to the heat capacity calculations, we swept through a range of global factors βg except this time we reran the full agent-environment simulation (as opposed to a dreamlike simulation), once again tallying the fitness levels of each agent (Figure 6).

## 3 Results

Three independent simulations were run, where the inverse temperatures of each individual for the respective communities were initially set to one of β = 10, 1, 0.1. For reference, each simulation will be referred to by its initial β value, though these values are subject to variation on application of the evolutionary algorithm. The fitness, heat capacity, motor response, and t-SNE projections of the genotypes for all simulations are visualized. Furthermore, hypothetical fitness values (Figure 6), the evolution of the edge weights, and the trajectories of the temperature parameter, as well as a combined parameter βJ, are visualized in Figure 9. These plots summarize the relationship between genotypes, the microstructure of the genotypes, the behavior, and the statistical measures necessary to detect criticality as a function of time.

### 3.1 Fitness

The fitness curves of the three simulations are plotted as a function of generation time in Figure 4. This plot demonstrates some notable characteristics of our different simulations at a glance. The β = 1 simulation is very clearly the quickest to peak in evolving foraging abilities. The β = 10 simulation, starting on the side of subcriticality, is much slower in finding successful agents, taking roughly two orders of magnitude more time to achieve notable success. Furthermore, this simulation exhibits very large variations in the fitness capabilities of its agents, even when agents are duplicates. As can be better seen in Figure 5, the β = 10 simulation is actually able to match the β = 1 simulation's peak fitness; however, the community averages are much lower, due to the large rates of failure in these communities. Finally, the β = 0.1, initially supercritical community simulation demonstrates a slow and steady approach to high fitness as it slowly converges towards critical statistics (as is corroborated in the other measures). The variations among the agents in this simulation were much lower, both in fitness and in genetic diversity.

Figure 4.

Average fitness line plot. The average fitness (food consumed per generation) as a function of generation for the three simulations (β = 10, 1, 0.1) corresponding to the supercritical, critical, and subcritical simulations. Dark shaded areas show the bounds for one standard deviation, and lightly shaded areas show min and max values. The critical community is first to plateau in its fitness curve and does so some orders of magnitude faster than other simulations. The supercritical community has a very slow and gradual increase in fitness, with low variability. The subcritical community evolves sporadically, with its fittest agents achieving similar fitness scores to the best critical agents; however, it has very large variability, and many organisms fail to achieve any fitness.

Figure 4.

Average fitness line plot. The average fitness (food consumed per generation) as a function of generation for the three simulations (β = 10, 1, 0.1) corresponding to the supercritical, critical, and subcritical simulations. Dark shaded areas show the bounds for one standard deviation, and lightly shaded areas show min and max values. The critical community is first to plateau in its fitness curve and does so some orders of magnitude faster than other simulations. The supercritical community has a very slow and gradual increase in fitness, with low variability. The subcritical community evolves sporadically, with its fittest agents achieving similar fitness scores to the best critical agents; however, it has very large variability, and many organisms fail to achieve any fitness.

Figure 5.

Fitness scatterplot. The individual fitness (food consumed per generation) as a function of generation number for the three simulations (β = 10, 1, 0.1) corresponding to the supercritical, critical, and subcritical simulations. Compared to Figure 3, this plot more clearly shows that the variability in the subcritical community arises from the large probability for an agent to catastrophically fail (fitness = 0). Some agents with fitness = 0 have identical genotypes to agents with maximum fitness, demonstrating the high sensitivity of the agents to the initial configurations of their neural networks and their susceptibility to being paralyzed in some minimal energy state.

Figure 5.

Fitness scatterplot. The individual fitness (food consumed per generation) as a function of generation number for the three simulations (β = 10, 1, 0.1) corresponding to the supercritical, critical, and subcritical simulations. Compared to Figure 3, this plot more clearly shows that the variability in the subcritical community arises from the large probability for an agent to catastrophically fail (fitness = 0). Some agents with fitness = 0 have identical genotypes to agents with maximum fitness, demonstrating the high sensitivity of the agents to the initial configurations of their neural networks and their susceptibility to being paralyzed in some minimal energy state.

### 3.2 Hypothetical Fitness

The hypothetical fitness curves of the different simulations are plotted as a function of a global factor βg in Figure 6. As expected, the generation 3999 results indicate that the fitness peaks when the global factor βg is unity, meaning the agents are fittest under the conditions they have been training under. However, it is interesting to note the shape of these curves and how they differ between generation 0 and generation 3999. In the initially β = 1, near-critical community at generation 0 we observe a branching in the fitness curves. This suggests that the behaviors of the Ising-neural agents also undergo a phase transition near β = 1, alongside their critical statistics. In the subcritical regime, behavior is dominated by catastrophic failure, where agents tend to have a fitness value of 0, though some few agents are capable of high levels of fitness. In the supercritical regime the variability among agents is smaller, and the minimum fitness is higher, but the maximum is also lower. As the community is scaled smoothly from one regime to the other, a branching occurs, indicating that the same agents are capable of oscillating between these two regimes. By generation 3999, all three communities have been able to peak their fitness at global β = 1; however, the subcritical community demonstrates a sharp cutoff as the global β decreases towards supercriticality. This cutoff may explain the tendency for the subcritical community to remain subcritical; however, the emergence of this cutoff is not yet understood.

Figure 6.

Fitness versus βg. The hypothetical fitness (food consumed per generation) as a function of a global factor βg for the three simulations (βg = 10, 1, 0.1) corresponding to the supercritical, critical, and subcritical simulations. The generation 0 plots (top row) show the fitness landscape as a function of βg, with a noticeable bifurcation in the βinitial = 1 plot. Variability characteristics of each simulation also agree with results from the fitness plots. The generation 4000 plots show how the GA ensures that at the hypothetical βg = 1 (the unscaled, actual β values of each simulation) the fitness curve peaks. Supercritical and critical communities evolve to be similar with smoothed-out bifurcations, whereas the subcritical community has a very different and discontinuous shape and variability characteristic.

Figure 6.

Fitness versus βg. The hypothetical fitness (food consumed per generation) as a function of a global factor βg for the three simulations (βg = 10, 1, 0.1) corresponding to the supercritical, critical, and subcritical simulations. The generation 0 plots (top row) show the fitness landscape as a function of βg, with a noticeable bifurcation in the βinitial = 1 plot. Variability characteristics of each simulation also agree with results from the fitness plots. The generation 4000 plots show how the GA ensures that at the hypothetical βg = 1 (the unscaled, actual β values of each simulation) the fitness curve peaks. Supercritical and critical communities evolve to be similar with smoothed-out bifurcations, whereas the subcritical community has a very different and discontinuous shape and variability characteristic.

### 3.3 Heat Capacity

The behavior of the specific heat over time is visualized by color-coding the generations and cumulatively plotting the functions on top of each other with transparency (Figure 7). The dotted red line at global βg = 1 indicates the location of the simulation condition. We note that the β = 0.1 and β = 1 communities converge upon and remain near criticality, respectively. This can be seen in that the peaks of these curves converge upon βg = 1, indicating that the simulation condition resides at criticality. Strangely, the β = 10 subcritical simulation seems to double down in its subcriticality, as we see the peaks of the heat capacity function move even farther away from βg = 1. In other words, the subcritical community is only becoming more orderly (rigid) rather than converging upon criticality. As suggested earlier, the mechanism of this tendency is thought to be due to the cutoff that emerges in the hypothetical fitness of these agents on the supercritical side of the subcritical community.

Figure 7.

Genotype t-SNE projections. t-SNE projections of the connectivity networks for the three simulations are plotted. Zoomed-in frames give a better look at the microstructure of genotype relationships as the well as the contrast in how these relationships evolve over time. Most notable is how the critical community tends to fill in the gaps in genotype space between old and new generations, whereas the other two communities start off tending to explore new areas of genotype space. Clusters begin to form as dominant and stable genotypes are discovered by the genetic algorithm.

Figure 7.

Genotype t-SNE projections. t-SNE projections of the connectivity networks for the three simulations are plotted. Zoomed-in frames give a better look at the microstructure of genotype relationships as the well as the contrast in how these relationships evolve over time. Most notable is how the critical community tends to fill in the gaps in genotype space between old and new generations, whereas the other two communities start off tending to explore new areas of genotype space. Clusters begin to form as dominant and stable genotypes are discovered by the genetic algorithm.

### 3.4 Motor Response

The angular motor response and susceptibility to the angle-to-food sensor of the three simulations are plotted in Figure 8. These functions give us insight into the behavior of these agents with respect to the sensory input they experience. Not plotted are the generation 0 response curves, which are largely centered at 0, indicating that the motor response has no preference towards the food yet. We see the convergence of the β = 0.1 supercritical community's motor response towards the β = 1 near-critical community. Both have essentially converged upon the solution to this foraging game: If the food is to the left, turn left; if it's to the right, turn right. The susceptibility of these two simulations also demonstrates the variation in the decision making of these agents, showing confidence in the extremes and higher sensitivity near the middle. The β = 10 subcritical community, however, has a much more idiosyncratic, perhaps conservative, behavioral pattern. Namely, an agent will not turn left or right for food that is too far left or right, instead only reacting to food sufficiently near its own direction. Not shown in this article are the results of other β = 10 simulation runs, which exhibited their own unique idiosyncratic behavior. However, the results of other critical and supercritical simulations (also not presented in this article) converged to the same shape shown in the results here.

Figure 8.

Motor response and susceptibility. The angular acceleration motor response () and susceptibility of the three communities are plotted as a function of the sensory activation of the angle to food. The critical and supercritical communities tend to converge to the same solutions (turn left if the food is to the left, turn right if the food is to the right), with a heightened sensitivity to food directly in front. The subcritical community exhibited a more idiosyncratic behavior (only turning for food that it roughly aligns with, and not bothering to turn for food too far out of the way).

Figure 8.

Motor response and susceptibility. The angular acceleration motor response () and susceptibility of the three communities are plotted as a function of the sensory activation of the angle to food. The critical and supercritical communities tend to converge to the same solutions (turn left if the food is to the left, turn right if the food is to the right), with a heightened sensitivity to food directly in front. The subcritical community exhibited a more idiosyncratic behavior (only turning for food that it roughly aligns with, and not bothering to turn for food too far out of the way).

### 3.5 t-SNE Projections

The t-SNE projections of the genotypes of these three communities are plotted in Figure 7. Each dot representing an agent is scaled with respect to its fitness and color-coded with respect to its generational age. The plots here corroborate some patterns we have already noted when looking at the fitness of these agents in Figure 4. Starting with β = 10, it can be seen that there is a long period where fit agents are not yet discovered and there does not seem to be any sense of a progressive lineage, or community. However, as the community eventually finds some solutions, clusters of agents begin to emerge that persist in time and exhibit a stable genetic island of stability for future genetic diversity. In the β = 1 simulation, it was already seen that a solution (dominant) genotype is discovered very early on. In the t-SNE plots this can be noted by the presence of clusters that span all the generations, including the very early ones. The drawback of this is that the dominant genotype that this simulation discovers is so dominant that not much other genetic diversity is discovered. This can be seen in the t-SNE plots from the lack of any continuity (lineage), which is very noticeable in the β = 0.1 supercritical simulation. Here a snaking pattern is observed between the old and new generations, demonstrating the distance between these generations in genotype space. No dominant agent really takes over this simulation until near generation 2000, where we finally begin to see some clusters forming around dominant agents.

### 3.6 Edge Weights and Temperature

Having looked at the t-SNE projections of the genotypes, now we would like to understand in higher resolution the evolutionary dynamics of the edge weights (along with the β evolution and ). The changes in the edge weights for all agents are plotted as a function of generational time in Figure 9. Starting with β = 0.1, a high variability in the edge weights for the first ≈1500 generations can be seen, after which point a relatively dominant agent begins to take root and give stability to the genetic variation. The quality of the edge weight dynamics begins to more closely resemble that of the β = 1 critical simulation, which shows relative edge weight stability from essentially its onset (see Figure 10 in  Appendix 1). Finally, the β = 10 subcritical community exhibits much more dramatic transitions in its edge weight evolution. A discontinuous transition near generation 400 gives rise to a dominant agent that dramatically stabilizes the genetic dynamics relative to the chaotic phase it was in before. It is at this point in the fitness curve that the subcritical simulation is able to make dramatic improvements to its fitness. A second, smoother transition can be seen near generation 1300, which once again demarcates a point in the fitness curve where the subcritical community makes big improvements. The genetic dynamics after this transition more closely resemble the stability observed in the critical simulation of β = 1. All simulations seem to show relative balance between positive and negative edge weights. For a broader perspective we can look at the plots, where we see that the β = 0.1 and β = 1 simulations seem to stabilize relatively close to 0. Conversely, the β = 10 simulation averages much higher near = 1, indicating a tendency for positive edge weights over negative ones.

Figure 9.

Genotype evolution. For each simulation, the edge weight values (J), the inverse temperature (β), and the product of these two parameters () are plotted. The combined parameter is useful in that the two individual parameters J and β show up together when calculating flip probabilities in Equation 3. This combined parameter gives insight into the tendency for the community to behave deterministically versus stochastically (large || versus small, respectively) as well as insights into the distribution of positive and negative edge weights. The amount of time it takes for each community to find a steady (or steadier) state is noticeable, with the critical community converging to a stable state within a few generations whereas the other two communities take a few hundred or thousand generations to stabilize.

Figure 9.

Genotype evolution. For each simulation, the edge weight values (J), the inverse temperature (β), and the product of these two parameters () are plotted. The combined parameter is useful in that the two individual parameters J and β show up together when calculating flip probabilities in Equation 3. This combined parameter gives insight into the tendency for the community to behave deterministically versus stochastically (large || versus small, respectively) as well as insights into the distribution of positive and negative edge weights. The amount of time it takes for each community to find a steady (or steadier) state is noticeable, with the critical community converging to a stable state within a few generations whereas the other two communities take a few hundred or thousand generations to stabilize.

## 4 Discussion

In this experiment our goal has been to explore the interaction and relationship between evolutionary dynamics and critical systems. We embodied Ising neural networks in agents placed in an environment they could interact with and compete in, and measured a variety of the statistics of these communities as they evolved and adapted. Three separate simulations were run, starting from different initial inverse-temperature conditions such that each simulation is forced into a different dynamical phase. Doing so allowed us to measure the evolutionary flows of the different simulations into the different dynamical regimes, where we could test the hypothesis that a critical regime acts as an evolutionary attractor by offering some kind of evolutionary utility.

We allowed the traditional control parameter in the Ising mode, the inverse temperature β, to evolve subject to the same evolutionary algorithm as the network edges. Since the function of β in the Ising model is to increase (decrease) the effects of noise (thermal fluctuations), changing β to extremely high values or extremely low values can push a system into sub- or supercriticality, as the system dynamics becomes dominated by edge weight (Jij) or thermal noise (β), respectively. By allowing β to vary with the GA, we gave the agents a single parameter that could, in principle, smoothly move them in and out of a phase transition. In contrast, much more work (and luck and time) is required for the GA to arrive at criticality through mutations in the edge weights alone, since in principle all (or most, if high-dimensional enough) Ising networks have a critical temperature if one can sweep through a large enough range of temperatures.

Therefore, three simulations were run, differing only in their initial inverse-temperature values (β = 0.1, 1, 10), representing communities that started off initially in a supercritical, critical, or subcritical state, respectively. We then allowed β to evolve (along with the rest of the edge-weight parameters) to see if criticality acts as a dynamical attractor in the parameter of our model. In Figure 4, it is explicitly clear how the three different communities differ in their progress towards increased fitness. The community that initially started the closest to criticality (β = 1) had the quickest approach to peak success, plateauing in its fitness curve within approximately 200 generations, roughly one or two orders of magnitude faster than the subcritical or the critical community. This result by itself is noteworthy in that one of the main goals of this project was to quantify the utility of criticality, and here we can already see that the critical community is the one that can adapt and find solutions the quickest in this foraging game environment.

However, just as noteworthy is the difference between the other two communities. As is better seen in the scatterplot version of the fitness curve in Figure 5, the initially subcritical community (β = 10) was able to match the fitness of the critical community's peak agents, doing so in spite of maintaining and even doubling down on its subcritical state (as seen in Figure 3). In other words, the elite of the subcritical community were capable of matching the fitnesses of the average and the elite of the critical community. However, the rate of failure was much higher in the subcritical community, which resulted in many agents with a fitness of 0, lowering the community average. In fact, some of the agents that have a fitness of 0 were exact duplicates of agents that were among the elite in the subcritical community, which goes to demonstrate the sensitivity of the subcritical community to the initial conditions of its neuron states. This sensitivity to initial conditions does not, however, translate into a dynamic susceptibility analogous to the critical community's, since the subcritical community's sensitivity to initial conditions seems relatively bimodal. If the agent's initial conditions resulted in a failure, it failed catastrophically and behaviorally acted paralyzed; otherwise, it was quite adept and susceptible to its environment, as measured in the motor response figures (Figure 8).

Turning finally to the supercritical community, we see a more slow and steady trend in its fitness curve. Community fitness values are relatively close to each other, with low variations across individuals. Towards the end of our results, near generation 4000, it can be seen that the supercritical community converges, perhaps asymptotically, towards the statistics of the initially critical β = 1 simulation. This trend can be observed essentially multimodally across the different measurements taken in this experiment; the specific heat curves (Figure 3) tend to converge to the same shapes; the t-SNE plots (Figure 7) and the edge-weight plots (Figure 9) demonstrate how stable clusters form in genotype space as the supercritical community transitions from a linear genetic progression (with high genetic diversity) to a more clustered genetic distribution that more closely resembles the pattern exhibited by the β = 1 critical community; and the motor response curves for the supercritical community asymptotically converge to the same shapes as those for the critical community.

### 4.1 The Phases of Evolutionary Dynamics

Overall, the experiment demonstrated a rich variety of behaviors and evolutionary dynamics as a function of only a single parameter, β in our model. All three communities were eventually able to (approximately) solve the foraging game they were embedded in with comparable fitness values. However, their trajectories to success were markedly different, highlighted by the relatively fast convergence to a solution exhibited by the critical community. True to the order-disorder duality, the sub- and supercritical communities arrived at their solutions with contrasting behaviors and dynamics as outlined above.

#### 4.1.1 Subcriticality

The subcritical community not only remained on the subcritical side of its dynamics; it seems to have become even more subcritical with time. Its genetic history is only ever punctuated by dominant agents, lacking any sort of linear progression in fitness or genotype space and maintaining a low genetic diversity. Behavioral variability is extremely high in the subcritical community, with high rates of failures even among identical agents.

#### 4.1.2 Supercriticality

In contrast, the supercritical community very smoothly transitions from the supercritical regime of dynamics towards a more critical state. This progression can be observed not only in the fitness as a function of time, but also in the projection of the genotype space in the t-SNE plots. The genetic diversity in these agents is high, and the fittest agents tend to be children of recent generations as opposed to the ancient agents in the subcritical community, whose genotypes would survive minimally changed for thousands of generations. The thermal fluctuations that the supercritical community is subject to act as an equalizing force among the different agents' behaviors, which tends to equalize the mating probabilities among the different agents, which ultimately allows a higher chance of success (reproduction) for newly generated agents. Therefore the behavioral variability among agents in the supercritical community is actually lower than the other two simulations (which can be thought of as a stabilizing force in the system).

#### 4.1.3 Criticality

Between these two extremes is the critical community, which seems capable of displaying some properties of both extremes. Unlike the supercritical community, the critical community seems capable of punctuated and dramatic changes in its gene pool. However, this event seems to occur at most once in the history of a simulation, near the beginning, when the community discovers a useful genotype, which is then followed by a long period of relative genetic stability where only small mutations are slowly accrued over time. The critical community is more welcoming to newly generated agents than is the subcritical community; however, it does not exhibit a directional progression in genotype space the same way the supercritical community does as it approaches a more critical state. Overall, while the critical community seems to have more stable genetic dynamics, it is capable of delivering both the spontaneous emergence of a massively dominant genotype (as seems to happen frequently in the subcritical community) and some genetic diversity across time (as seems to be the defining feature for the first 3500 generations of the supercritical community). All of this is contextualized in the fact that the critical community achieves peak fitness levels orders of magnitude faster than the other communities.

## 5 Conclusion

### 5.1 Summary

Understanding the utility of criticality in artificial life systems is important for understanding how complexity can self-organize into predictable but adaptive systems. This project measured how Ising neural agents perturbed to be either near, far from, or at criticality behave as they are subjected to evolutionary selection pressures.

In our simulations it was discovered that communities that start near criticality are orders of magnitude quicker to find solutions to maximize their fitness to the foraging task they were embedded in. Furthermore, unique evolutionary dynamics were measured across the three phases of the model (subcritical, supercritical, and critical). It was shown that as the supercritical community evolves, the system tends to flow towards a critical regime in parameter space. Once critical, the systems exhibited a state of relative genetic stability, where the flow in parameter space is less dramatic; some details will be mentioned in future works. In stark contrast to the other two simulations, the subcritical community demonstrated the ability to solve the task, albeit with high failure rates, with an erratic genetic history, and with comparatively idiosyncratic behavior.

Overall, this project demonstrated two important ways that criticality and evolutionary dynamics can be deeply intertwined: by demonstrating how critical systems are successful at discovering novel solutions, and by demonstrating a conditional flow in parameter space towards criticality. This gives further credence to the notion that complex systems can be fruitfully studied through the lens of the physics of critical systems.

### 5.2 Limitations and Future Work

#### 5.2.1 Critical Environment versus Critical Agent

One major limitation of this study has been the relative simplicity of the environment and task these agents are embedded in. This is an important limitation because the discussion of the utility and ubiquity of criticality in nature tends to involve complex systems embedded in open-ended astronomical, geological, chemical, or biological structures. Criticality is thought to be useful for these complex environments for allowing a high dynamic range of responses, a large informational bandwidth, and high informational transfer and fidelity. Making our foraging task simple or predictable means that the utility of criticality is bounded by the simplicity of our experimental design. Furthermore, we expect the full utility of criticality to be displayed only when the environment is actually complex enough to allow for critical systems to take advantage of their own informational properties [10]. One can easily imagine that there may exist an environment in which an appropriately designed subcritical agent may be able to compete equally, if not better, than a critical agent. Therefore, the problem of criticality in nature seems like the chicken-and-the-egg problem, in that the criticality we observe in life today is in response to the complexity that has already been existing in our universe, and only seems to be ever-growing. Future experiments using similar methods should explore how different evolutionary dynamical regimes react to changing environments. More generally, it may be insightful for any research project interested in open-ended evolution as a whole to consider the role that criticality can play in either generating open-ended environments or being embedded in such an environment.

#### 5.2.2 Bicritical Points

A trend observed most noticeably in the critical simulation, but also more subtly in the final generations of the supercritical simulation, is the transformation of the specific heat curves from a single peak towards a bimodal peak. This observation potentially implies the emergence of multiple critical points, though interpreting this trend is still difficult due to finite-size effects. One hypothesis for these results is that once the agents approach peak fitness as allowed by the environmental conditions, the competition changes from being dominated by agent-versus-environment interactions to agent-versus-agent interactions, also known as the Red Queen effect. The emergence of this bimodality may then be an adaptation to fluctuations in the environment that give rise to fluctuations in the effective temperature of the sensors of these agents, allowing the agents to have brains highly susceptible to both sub- and supercritical environmental configurations. Taking this interpretation seriously may have useful implications for understanding the emergence of bicamerality [6] in the nervous systems of the animals on earth, particularly our own human brains. However, more experiments and measurements are needed in order to talk more definitively about this trend, starting with larger-scale simulations with neural networks that are large enough to potentially specialize/modularize. To this end, the authors of this article plan to explore larger-scale simulations of subcritical, supercritical, and critical models using neuroevolutionary methods.

## Acknowledgments

This work was partially funded by the Earth-Life Science Institute Origins Network (EON) Research Fellow Program, supported by the John Templeton Foundation. The authors would also like to thank Miguel Aguilera, Manuel Baltieri, and Julien Hubert for their helpful advice.

## References

1
Aguilera
,
M.
, &
Bedia
,
M. G.
(
2017
).
Adaptation to criticality through organizational invariance in embodied agents
.
arXiv preprint arXiv:1712.05284
.
2
Attanasi
,
A.
,
Cavagna
,
A.
,
Del Castello
,
L.
,
Giardina
,
I.
,
Melillo
,
S.
,
Parisi
,
L.
,
Pohl
,
O.
,
Rossaro
,
B.
,
Shen
,
E.
,
Silvestri
,
E.
, et al
(
2014
).
Finite-size scaling as a way to probe near-criticality in natural swarms
.
Physical Review Letters
,
113
(
23
),
238102
.
3
Bak
,
P.
(
2013
).
How nature works: The science of self-organized criticality
.
New York
:
Springer-Verlag
.
4
Bak
,
P.
,
Tang
,
C.
, &
Wiesenfeld
,
K.
(
1987
).
Self-organized criticality: An explanation of the 1/f noise
.
Physical Review Letters
,
59
(
4
),
381
.
5
Balleza
,
E.
,
Alvarez-Buylla
,
E. R.
,
Chaos
,
A.
,
Kauffman
,
S.
,
Shmulevich
,
I.
, &
Aldana
,
M.
(
2008
).
Critical dynamics in genetic regulatory networks: Examples from four kingdoms
.
PLoS ONE
,
3
(
6
),
e2456
.
6
Cavanna
,
A. E.
,
Trimble
,
M.
,
Cinti
,
F.
, &
Monaco
,
F.
(
2007
).
The “bicameral mind” 30 years on: A critical reappraisal of Julian Jaynes' hypothesis
.
Functional Neurology
,
22
(
1
),
11
.
7
Chaté
,
H.
, &
Muñoz
,
M. A.
(
2014
).
Insect swarms go critical
.
Physics
,
7
,
120
.
8
Chialvo
,
D. R.
(
2012
).
Critical brain dynamics at large scale
.
arXiv preprint arXiv:1210.3632
.
9
Glauber
,
R. J.
(
1963
).
Time-dependent statistics of the Ising model
.
Journal of Mathematical Physics
,
4
(
2
),
294
307
.
10
Hidalgo
,
J.
,
Grilli
,
J.
,
Suweis
,
S.
,
Muñoz
,
M. A.
,
Banavar
,
J. R.
, &
Maritan
,
A.
(
2014
).
Information-based fitness and the emergence of criticality in living systems
.
Proceedings of the National Academy of Sciences of the U.S.A.
,
111
(
28
),
10095
10100
.
11
Krotov
,
D.
,
Dubuis
,
J. O.
,
Gregor
,
T.
, &
Bialek
,
W.
(
2014
).
Morphogenesis at criticality
.
Proceedings of the National Academy of Sciences of the U.S.A.
,
111
(
10
),
3683
3688
.
12
Lovbjerg
,
M.
, &
Krink
,
T.
(
2002
).
Extending particle swarm optimisers with self-organized criticality
. In
Proceedings of the 2002 Congress on Evolutionary Computation, CEC'02
(
vol. 2
, pp.
1588
1593
).
New York
:
IEEE
.
13
Mora
,
T.
, &
Bialek
,
W.
(
2011
).
Are biological systems poised at criticality?
Journal of Statistical Physics
,
144
(
2
),
268
302
.
14
Rooy
,
N.
(
2017
).
Evolving simple organisms using a genetic algorithm and deep learning from scratch with Python
.
Unpublished manuscript
.
15
Salinas
,
S. R.
(
2001
).
Scaling theories and the renormalization group
. In
Introduction to Statistical Physics
(pp.
277
304
).
New York
:
Springer
.
16
Schneidman
,
E.
,
Berry
II,
M. J.
,
Segev
,
R.
, &
Bialek
,
W.
(
2006
).
Weak pairwise correlations imply strongly correlated network states in a neural population
.
Nature
,
440
(
7087
),
1007
.
17
Torres-Sosa
,
C.
,
Huang
,
S.
, &
Aldana
,
M.
(
2012
).
Criticality is an emergent property of genetic networks that exhibit evolvability
.
PLoS Computational Biology
,
8
(
9
),
e1002669
.
18
Van Orden
,
G.
,
Hollis
,
G.
, &
Wallot
,
S.
(
2012
).
The blue-collar brain
.
Frontiers in Physiology
,
3
,
207
.
19
Witkowski
,
O.
, &
Ikegami
,
T.
(
2016
).
Emergence of swarming behavior: Foraging agents evolve collective motion based on signaling
.
PLoS ONE
,
11
(
4
),
e0152756
.
20
Zanoci
,
C.
,
Dehghani
,
N.
, &
Tegmark
,
M.
(
2018
).
Ensemble inhibition and excitation in the human cortex: An Ising model analysis with uncertainties
.
arXiv preprint arXiv:1810.07253
.

### Appendix 1: Comparing Genotypes

In order to quantify the differences and similarities between the three different simulations in the experiment, we applied a KS test to the distribution of edge weights of the Ising neural agents' networks. It should be noted that to obtain stronger statistics with this method (for example, to obtain error bars), ideally one would have more than just three simulations for each category (subcritical, supercritical, critical) and compare all possible pairs. However, though we are constrained by computational limitations, some basic interpretations can still be observed from these statistics. The lower the KS statistic, the less likely it is that the samples are from different distributions.

As argued in Section 3, the supercritical community tended towards critical statistics as measured across a number of different metrics (specific heat, motor response and susceptibility, fitness). With the KS test below (see Figure 10), it can be further demonstrated that the edge weight distribution of the supercritical community begins to reconverge towards the critical community's after the first few thousand generations following their initial divergence. In general, due to the critical community's unique ability to relatively quickly find fitness-optimizing genotypes, the KS statistic between it and both the subcritical and the supercritical communities diverges in the first few thousand generations. This is in contrast to the KS statistic between the subcritical and the supercritical communities, which remains relatively low even up to the final few hundred generations. The similarity between these two communities can be interpreted as due to similarities in their evolutionary dynamics in genotype space, since it has already been shown that in almost every other metric there was very little similarity between the subcritical and the supercritical communities. The main quality both these communities seem to share is their relative slowness in finding a stable genotype that ends up dominating its gene pool, in contrast with the critical community, which does so comparatively quickly.

Figure 10.

For each simulation β = 0.1, 1, 10 a two-sample KS test is used to compare the edge weight distributions for each generation. The KS statistic is plotted as a function of the number of generations. When the KS statistic is small, the null hypothesis that the two distributions are sampled from the same distribution cannot be rejected. At generation 0 the KS statistic is small, since all three simulations have their edge weights sampled from the same distribution. As the communities evolve, their edge weight distributions evolve and the KS statistics tend to grow. The distance between the critical community and the other two communities is larger than the distance between the sub- and supercritical communities. This trend is most notable starting from generation 1000 and begins to disappear near generation 3500 as the supercritical community begins to converge towards critical statistics.

Figure 10.

For each simulation β = 0.1, 1, 10 a two-sample KS test is used to compare the edge weight distributions for each generation. The KS statistic is plotted as a function of the number of generations. When the KS statistic is small, the null hypothesis that the two distributions are sampled from the same distribution cannot be rejected. At generation 0 the KS statistic is small, since all three simulations have their edge weights sampled from the same distribution. As the communities evolve, their edge weight distributions evolve and the KS statistics tend to grow. The distance between the critical community and the other two communities is larger than the distance between the sub- and supercritical communities. This trend is most notable starting from generation 1000 and begins to disappear near generation 3500 as the supercritical community begins to converge towards critical statistics.

In conclusion, this KS test confirms two important trends we observed in the simulation results. First, the critical community is superior to the subcritical and the supercritical communities in its ability to sample edge weights from a distribution, which offers more optimal fitness scores, as shown by the divergence in the KS statistic between the critical community and the other two communities (in view of its higher fitness scores). Second, the supercritical community eventually converges towards criticality, whereas the subcritical community remains distant, with no indication that it might converge. A larger number of simulations is needed to verify these comparisons, but in light of their agreement with our results using other measures, they can still be useful in quantifying qualitative observations.

### Appendix 2: Supplementary Materials

The source code is available at https://github.com/heysoos/CriticalForagingOrgs. Visualizations are available at https://github.com/heysoos/CriticalForagingOrgs/tree/master/figures.

The parameter values are given in Table 1.

Table 1.
Parameter list.
ParameterValue
Population size 50
agents to kill per generation 30
Food units 100
Food radius 0.03
agent size 0.05
Time steps per iteration 4000
Iterations per simulation 4000
Thermalization cycles 10
Simulation time step 0.2
Max rotational velocity 720
Max rotational acceleration 90
Max linear velocity 0.5
Max linear acceleration 0.05
Total number of neurons 10
Sensor neurons
Motor neurons
Sparsity of hidden neuron interconnectivity
Duplication ratio and mating reproduction 0.5
Duplication mutation rate 0.1
Temperature mutation σ 0.02
ParameterValue
Population size 50
agents to kill per generation 30
Food units 100
Food radius 0.03
agent size 0.05
Time steps per iteration 4000
Iterations per simulation 4000
Thermalization cycles 10
Simulation time step 0.2
Max rotational velocity 720
Max rotational acceleration 90
Max linear velocity 0.5
Max linear acceleration 0.05
Total number of neurons 10
Sensor neurons
Motor neurons
Sparsity of hidden neuron interconnectivity
Duplication ratio and mating reproduction 0.5
Duplication mutation rate 0.1
Temperature mutation σ 0.02