A grand challenge in the field of artificial life is to find a general theory of emergent self-organizing systems. In swarm systems, most of the observed complexity is based on the motion of simple entities. Similarly, statistical mechanics focuses on collective properties induced by the motion of many interacting particles. In this article we apply methods from statistical mechanics to swarm systems. We try to explain the emergent behavior of a simulated swarm by applying methods based on the fluctuation theorem. Empirical results indicate that swarms are able to produce negative entropy within an isolated subsystem due to frozen accidents. Individuals of a swarm are able to locally detect fluctuations of the global entropy measure and store them if they are negative entropy productions. By accumulating these stored fluctuations over time, the swarm as a whole produces negative entropy and the system ends up in an ordered state. We claim that this indicates the existence of an inverted fluctuation theorem for emergent self-organizing dissipative systems. This approach bears the potential of general applicability.
One characteristic of living organisms is their metabolism. Living beings require energy in order to maintain their internal order. This is determined by the second law of thermodynamics, which describes the ubiquitous decay of all things and does not allow the increase of order without the cost of dissipation. In the context of self-organizing systems one might cite Parunak and Brueckner: “Emergent self-organization in multi-agent systems appears to contradict the second law of thermodynamics.” This is of course not the case; as discussed by Parunak and Brueckner, one has to distinguish between two kinds of subsystems: one that hosts the self-organizing swarm, and one in which disorder is increased. Hence, a swarm can be thought of as a heat pump that decreases entropy (entropy is defined in Section 2, and entropy measures are investigated in Section 5.1) in one basin in favor of increased entropy in another basin. However, the question of how the swarm manages to do that still persists. Whether thermodynamic properties are relevant and helpful in understanding such systems is currently being discussed [11, 18].
The complexity of the animate world is explained by natural selection in combination with random events (natural evolution). It is one thing to select the adapted organism, but the mutation that results in improved adaptivity has to occur first. Concerning the genetic code, Crick coined the term frozen accident theory. Whereas Crick introduced this concept with a focus on genetics, Gell-Mann applied it to everything:
[…] the effective complexity [of the universe] receives only a small contribution from the fundamental laws. The rest comes from the numerous regularities resulting from ‘frozen accidents’.
While a heat pump has to work against the second law (e.g., diffusion of heat) by expending energy, limited violations of the second law without the expenditure of energy are also possible as, for example, indicated by Maxwell:
The truth of the second law is … a statistical, not a mathematical, truth, … for it depends on the fact that the bodies we deal with consist of millions of molecules….
The fluctuation theorem [5, 6] quantifies the probability of violations of the second law. For short intervals it can be said that nature was running in reverse, because the sign of entropy production is used to define the direction of the arrow of time. Even in reference to living systems this might be said. For example, small “machines” within a cell (e.g., mitochondria) are likely to run in reverse from time to time. A transfer of this concept to the macro world is typically denied categorically. In a review of Wang et al., Gerstner wrote: “For larger systems over normal periods of time, however, the second law of thermodynamics is absolutely rock solid.”
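For reference, the transient fluctuation theorem is commonly stated as a symmetry of the probability distribution of the entropy production Ω_t averaged over trajectories of duration t; the notation here is generic and chosen to convey the idea, not necessarily the exact form of the article's Equation 11:

```latex
\frac{P(\Omega_t = A)}{P(\Omega_t = -A)} = e^{A t}
```

In this form, trajectories with positive entropy production A become exponentially more probable than trajectories with the opposite production −A as t grows, which is why violations of the second law are observable only in small systems over short times.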
In this article, we report empirical evidence that the negative entropy production in emergent self-organizing systems is initially based on frozen accidents allowed by the original fluctuation theorem, which in the end leads to a global behavior described by an inversion of the fluctuation theorem in dissipative self-organizing systems. This concept might bear the potential of embedding the concept of emergent behavior in multi-agent systems (swarms, self-propelled particles, etc.) in a theoretical framework built on the sound foundations of theories from physics. Hence, we propose an approach to understanding emergent behavior through thermodynamics, which follows up our earlier-reported concept.
In addition, the relation to the fluctuation theorem might allow us to define preconditions for effective self-organizing systems in the future. For example, one can define minimum requirements for the agents of the system concerning their cognitive abilities in order to be able to leverage fluctuations. An agent needs sensors that allow it to estimate, at least probabilistically, whether the (local) entropy has just decreased. Furthermore, the system needs the ability to store such local fluctuations.
In the following we describe the fluctuation theorem and the investigated scenarios. We analyze the multi-agent system or swarm, discuss how the results could be viewed as obeying an inverted fluctuation theorem, and conclude by giving a short summary and outlook.
2 Fluctuation Theorem and Entropy Measures
But it is also possible that two experimenters assign different entropies […] to what is in fact the same microstate […] without either being in error. That is, they are contemplating a different set of possible macroscopic observations on the same system, embedding that microstate in two different reference classes […]. It is necessary to decide at the outset of a problem which macroscopic variables or degrees of freedom we shall measure and/or control.
In order to measure probabilities of microstates effectively, one needs to define a countable set of microstates, which is usually done by coarse-graining (an example in the context of multi-agent systems is given by Parunak and Brueckner ). In the context of entropy production it is also important to determine whether negative or positive entropy production is observed. Note that this sign of entropy production might be opposite to the actual mathematical sign of a certain entropy measure. According to the statistical physics interpretation, this sign of entropy production is defined relative to the entropy of that macrostate that has the highest number of microstates. According to thermodynamics it is defined relative to the entropy of that state with the lowest amount of energy that can be used to do thermodynamic work. That macrostate is the equilibrium state, which would be achieved in the context of this article by mere random behavior of the particles or agents, that is, a uniform distribution of agents. Thus, any process that leaves the equilibrium state and enters states of lower entropy shows negative entropy production.
3 Investigated Scenarios
We investigate two scenarios. In the first one the swarm is controlled by the BEECLUST algorithm. The second scenario is a simple clustering process and is introduced to give evidence that our approach has potential for general applicability.
3.1 BEECLUST Algorithm
The BEECLUST algorithm can be considered a model algorithm for swarms. It is based on observations of young honeybees, has been analyzed in many models [10, 12, 13, 20, 21], and has even been implemented in a swarm of robots.
This algorithm allows a swarm to aggregate at a maximum of a scalar field although individual agents do not perform a greedy gradient ascent. In addition, a BEECLUST-controlled swarm is able to break symmetries of equal maxima in the scalar field, as also observed here (e.g., see Figure 2). Hence, it might be justified to call this emergent behavior. Controlled by this algorithm, agents stop when at least three of them approach each other (i.e., the stopping threshold is set to three; in previous works two agents were typically enough to stop, but the actual choice of this threshold is irrelevant in this article), measure the local value of the scalar field, and wait for some time proportional to this measurement. Clusters form, and finally the swarm will be aggregated close to the global optimum of the scalar field (see the lower part of Figure 2). See Figure 3 for a definition of the BEECLUST algorithm.
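As an illustration of the behavior just described, the following sketch shows one decision step of a single agent. The state names, the stopping threshold of three, and the proportional waiting-time rule follow the text; all identifiers and the assumption that the scalar field is normalized to [0, 1] are our own, and the authoritative definition remains Figure 3.

```python
import random

MAX_WAIT = 660  # maximum waiting time [time units], Table 1

def beeclust_step(agent, neighbors_in_range, scalar_field_value):
    """One decision step of a single BEECLUST-controlled agent (sketch)."""
    if agent["state"] == "stopped":
        agent["wait"] -= 1
        if agent["wait"] <= 0:
            # Waiting time elapsed: resume moving in a random direction.
            agent["state"] = "moving"
            agent["heading"] = random.uniform(0.0, 360.0)
        return
    # Moving: stop only if enough agents (threshold three, counting
    # this agent itself) are mutually within sensor range.
    if len(neighbors_in_range) + 1 >= 3:
        agent["state"] = "stopped"
        # Waiting time proportional to the locally measured scalar field
        # (assumed normalized to [0, 1] here).
        agent["wait"] = MAX_WAIT * scalar_field_value
    # Otherwise keep moving straight; turning at walls is handled elsewhere.
```

Note that an agent in a cluster re-evaluates nothing while waiting; the stored fluctuation is released only when the waiting time has elapsed.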
The collective aggregation close to the global optimum is achieved via a positive feedback process. Clusters of three stopped agents will form by chance anywhere in the arena. The area covered by clusters grows with the number of contained agents, and clusters covering a bigger area are more likely to be approached, by chance, by moving agents. Hence, bigger clusters will grow faster. The intensity of this positive feedback is inhomogeneous in the arena. Agents in clusters closer to the global optimum have longer waiting times. These clusters will exist longer than those that are farther away from the global optimum. Hence, the chance of growing is bigger for clusters closer to the global optimum. This process, typically, generates one big cluster close to the global optimum. The agents interact only locally and, as noted above, a BEECLUST-controlled swarm breaks symmetries. Hence, this behavior is different from other aggregation processes, for example, star formation, which includes global interactions due to gravitation.
In the following experiments, the agents have initially random headings, are in the state moving, and are uniformly randomly distributed in the arena. The scalar field is bimodal with maxima of the same value and shape (see contours in Figure 2). See Table 1 for the standard parameters used.
Table 1. Standard parameters used.

Arena dimensions        150 × 50 [length units]²
Proximity sensor range  3.5 [length units]
Maximum waiting time    660 [time units]
Velocity                4 [length units]/[time unit]
Number of agents        25
The second scenario that we investigate is simple clustering of agents. It is similar to the BEECLUST-controlled swarm. However, the agents have constant waiting times independent of any scalar field, they move on a torus, and we introduce noisy sensors. The missing scalar field and missing wall effects remove from the system the preferred regions of cluster formation seen in the BEECLUST scenario. The parameters used are the same as given in Table 1. We implement noise in the sensors in a simple way by an agent-agent recognition probability γ; that is, even if two agents are mutually within sensor range, they might not perceive each other and might not stop. We set γ = 0.75; that is, an agent will perceive another agent in its sensor range 75% of the time. The noise is uncorrelated in time. Noise was introduced because deterministic clustering seems to introduce artifacts into the entropy production histograms.
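The recognition noise can be sketched in a few lines. Whether the simulation draws once per pair or once per agent is not specified here, so a single draw per pair per time step is assumed in this illustration:

```python
import random

GAMMA = 0.75  # agent-agent recognition probability from the text

def mutual_perception(rng, gamma=GAMMA):
    """Return True if two agents that are mutually within sensor range
    actually perceive each other in this time step. The draw is made
    independently each step, so the noise is uncorrelated in time."""
    return rng.random() < gamma
```

Over many encounters, roughly 75% of in-range meetings can lead to stopping; because the draws are independent per step, a pair that goes undetected in one step may still stop in the next.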
4 Analysis of Scenarios
4.1 Analysis of BEECLUST
The system dynamics takes place in a high-dimensional phase space (q0, q1, …, qN−1, p0, p1, …, pN−1) ∈ Γ. In the following we need to capture the essentials of these dynamics with an entropy measure. First, we present an advantageous choice of an entropy measure, and in Section 5.1 we give an alternative, which is, however, less beneficial.
In Figure 5 the data shown in Figure 4 is tested for whether it obeys Equation 11. The fluctuation theorem is satisfied for this system although the system produces negative entropy and actually abandons the equilibrium to which it was initialized. Hence, one could speak of an inverted fluctuation theorem that is satisfied here.
In the following we want to investigate how it is possible for this self-organizing system to produce negative entropy. We hypothesize that the negative entropy production is based on fluctuations and the stopping behavior of the agents, and hence is a process of frozen accidents.
We start our analysis with a measurement of the entropy production within a limited time interval [t0 = 15, t1 = 20] in the early transient. In addition, we classify for each measurement whether at least one agent changed its state from moving to stopped (agents starting to move again do not occur that early in the simulation). The entropy production distributions for these two classes are shown in Figure 6. For the measurements without a stopping agent the averaged change in the density modulation is about zero (〈(t1 − t0)Ωb〉 ≈ 0.06). In contrast, for those measurements with stopping agents the averaged change of density modulation is negative (〈(t1 − t0)Ωb〉 ≈ −3.09), indicating frozen accidents. For much later time intervals no difference between measurements with and without stopping agents is found. The negative value of 〈(t1 − t0)Ωb〉 demands clarification, because in the limit t → ∞ the average density modulation is positive.
The explanation is based on a special feature of the BEECLUST-controlled swarm in this scenario, which consists of three phases (see Figure 7). In the short period before the first cluster forms, the average entropy production is 〈Ωb〉 = 0, indicating that the original fluctuation theorem holds for this phase. The first cluster usually does not form close to the global optima, but relatively close to the middle of the arena; see Figure 7a. In this area the agent density modulation (Equation 8) contributes negatively. In a second phase the average density modulation is negative (〈Ωb〉 < 0) because the density close to the middle of the arena increases further; see Figure 7b. This is also indicated by the evolution of the agent density modulation over time as shown in Figure 8. Initially it stays close to 0, and only later does it clearly take a positive sign. The inset shows details of the first 50 time steps and indicates a negative slope for the time interval [15, 20] (i.e., the second phase) of Figure 6. Only later do the clusters move toward the ends of the arena, probably aided by wall effects (see Figure 7c; agents are more likely to approach the cluster again after having left it on the side of the cluster that is closer to one of the ends of the arena), and consequently the average density modulation is positive (〈Ωb〉 > 0).
4.2 Analysis of Clustering
The clustering scenario is simple, and it serves in this article as a second example to show the generality of our approach. Hence we reduce the analysis to a comparison of the entropy production and the test of the entropy production distribution with those obtained in the BEECLUST scenario.
The test of the entropy production distribution according to the fluctuation theorem and by analogy to Equation 11 is shown in Figure 9b. The data is close to linearity, but shows systematic deviations (too small for A < 0.01 and too large for 0.015 < A < 0.03). Given the fast convergence of the system to a majority of trajectories with negative entropy production, this is still a satisfactory result.
To support our claim that this approach bears the potential of general applicability, we report the influence of an alternative entropy measure and investigate the influence of temperature in the following.
5.1 Independence from the Entropy Measure
The entropy measure that might be the most obvious to apply in the context of this article is probably the information-theoretic entropy of Shannon. The Shannon entropy, however, is discrete, and our system has many continuous features apart from the discrete concept of agent numbers. The continuous extension of the Shannon entropy (differential entropy) is also not directly applicable, because we would have to sample a probability distribution from only N samples represented by agent positions. To get from the continuous state space to discrete microstates, we have to do some form of coarse-graining. Unfortunately there is an ambiguity, because there are many possible ways to implement coarse-graining. Still, at least two conditions can be fixed. First, the dynamics of the obtained entropy measure should parallel those of the system, and second, it should capture its main features.
Initially the agents are uniformly randomly distributed, which causes about as many tiles to be occupied as there are agents (for N ≪ 75), and we get p1(0) ≈ N/75. After some time, the agents cluster and the number of occupied tiles typically decreases (a typical value is p1(t) = 5/75 for N = 25). Hence, the average entropy production will be negative (〈tΩc〉 < 0).
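One plausible reading of this tile-based coarse-graining is a two-state Shannon entropy over occupied and empty tiles. The sketch below (75 tiles, natural logarithm) is our own illustration of that reading, not the article's exact formula:

```python
from math import log

N_TILES = 75  # number of tiles used in the coarse-graining

def coarse_entropy(occupied_tiles):
    """Shannon entropy (in nats) of the occupied/empty tile distribution.
    occupied_tiles is the number of tiles containing at least one agent."""
    p1 = occupied_tiles / N_TILES   # probability of a tile being occupied
    p0 = 1.0 - p1                   # probability of a tile being empty
    s = 0.0
    for p in (p0, p1):
        if p > 0.0:                 # 0 * log(0) is taken as 0
            s -= p * log(p)
    return s
```

With N = 25 agents, an initially uniform distribution (about 25 occupied tiles) yields a higher value than the typical clustered state (about 5 occupied tiles), consistent with 〈tΩc〉 < 0.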
The above experiments can be repeated with this entropy measure, and no qualitative change is detectable. However, there are some technical problems. Due to the discrete measure, abnormalities in the histograms occur, and the number of trajectories showing positive entropy production decreases quickly with increasing time, which complicates the statistical analysis, for which we need well-filled histogram bins on both the positive and the negative side. Hence, the obtained data will not fully converge to a line according to Equation 11, and we cannot sample within long time intervals. The results are shown in Figure 10. The systematic shapes of the curves are due to coarse-graining. Still, the overall trend to linearity is seen. We conclude that the entropy measure based on a density modulation used above was the better choice.
5.2 Influence of Temperature
Both investigated scenarios are certainly more complex than a mere physical multiparticle system, because they also incorporate autonomous stops of agents, sensor ranges, and waiting times in addition to particle velocities. Still, we can accommodate most of them with the concept of temperature as defined in Equation 10. When we reduce the agents' velocities, we alter the temperature. However, we have to extend the concept of temperature to allow the comparison of systems with different agent velocities. To scale the system behaviors at different agent velocities completely, it is clear that we also have to scale waiting times and sensor ranges, which are not considered in Equation 10. Hence, we have to apply an extended concept of temperature that includes these velocity-dependent properties of agents as well. However, the definition and analysis of a general temperature concept for multi-agent systems is beyond the scope of this article. Here, we just scale speeds (halved), waiting times (doubled), and sensor ranges (halved) and obtain two systems with temperatures T1 = 2T2. We compare the entropy production distributions of these two systems in Figure 11 for a late state at time t = 6,000. The system at half temperature (T2) shows systematic deviations because it converges more slowly. Still, the deviations are small enough to conclude that, at least for some scenarios, an extended concept of temperature can be found that allows one to scale these systems and compare them across different temperatures.
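The parameter scaling described above can be made explicit. The dictionary keys below are our own names for the Table 1 quantities; the mapping from the scaled parameters to a numeric temperature value (Equation 10) is deliberately left out, since the extended temperature concept is not defined here:

```python
def scale_to_half_temperature(params):
    """Return a copy of the parameter set scaled to the colder system T2:
    speeds halved, waiting times doubled, sensor ranges halved."""
    scaled = dict(params)
    scaled["velocity"] /= 2          # [length units]/[time unit]
    scaled["max_waiting_time"] *= 2  # [time units]
    scaled["sensor_range"] /= 2      # [length units]
    return scaled
```

Applied to the Table 1 values, this yields a velocity of 2, a maximum waiting time of 1320, and a sensor range of 1.75.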
Note again that ρb(k, t) = 0 corresponds to maximum entropy. Therefore, any ρb(k, t) ≠ 0 in Figure 8 indicates negative entropy production. We conclude that the negative entropy production of this system is initiated by entropy fluctuations, which are normally distributed and are negative and positive with about the same probability according to the original fluctuation theorem and as seen in Figure 6(a). Some of the negative-entropy-production events are locally observable by the agents themselves, because there are simultaneous agent-to-agent encounters of three agents with mutual perception. This local perception of the global measure of entropy is leveraged by stopping all three agents and consequently storing the local entropy fluctuation. Cascades of such stopping behaviors generate a positive feedback (self-amplification of fluctuations as in Rayleigh-Bénard convection). In the end a system dynamics is generated that can be described by an inverted fluctuation theorem, which dictates an exponentially increasing probability of low-entropy states. Hence, this emergent self-organizing swarm does indeed rely on frozen accidents. Note that the overall system (i.e., including the heat bath) still produces positive entropy (e.g., due to accelerations of the agents), while the agent-position-based entropies are reduced only in the self-organizing subsystem.
The effectiveness of the frozen-accidents concept can easily be made clear by constructing a simple model. We represent the entropy contribution of each agent i by a random process Xi(t). The total entropy is just the sum ∑_{i=0}^{N−1} Xi(t) over all N agents. The restriction of all random processes to the interval [−5, 5] is essential, and we define Xi(t) = 5 ∀t > t0, where t0 is the first time agent i achieved Xi(t0) = 5. That is, once a random process reaches Xi(t0) = 5 (a local property), it stays there forever—a frozen accident. As a consequence the number of active random processes Na will decrease monotonically. A sample run of this simple model for N = 25, based on Gaussian distributed Xi and initialization Xi(0) = 0, is shown in Figure 12. The bias in the otherwise random trajectory is noticeable. Note that the sum ∑_i Xi of Gaussian distributed random variables, each having a variance of σi², results in a random variable that is also Gaussian distributed with a variance of σ² = ∑_i σi². With a decreasing number Na of active processes, more and more variances vanish (σi² = 0). Hence, the variance of the sum will also decrease, which is the macroscopic effect of the frozen accidents and ensures that states of low entropy are much more likely to be maintained. The challenge in investigating self-organizing systems is, however, the reverse engineering of such a model. One would start with an observed macroscopic effect as shown in Figure 12 and then decompose it into a set of subsystems that accidentally achieve low-entropy states by noise and then preserve them by freezing.
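A minimal executable version of this model can be sketched as follows. The Gaussian step size σ and the treatment of the lower bound as reflecting (simple clipping) are assumptions, since the text only fixes the interval [−5, 5], the absorbing upper bound at 5, and the initialization Xi(0) = 0:

```python
import random

def run_model(n_agents=25, steps=1000, sigma=0.5, seed=1):
    """Frozen-accidents toy model: n_agents Gaussian random walks on
    [-5, 5]; a walk that reaches the upper bound 5 freezes there forever."""
    rng = random.Random(seed)
    x = [0.0] * n_agents            # X_i(0) = 0
    frozen = [False] * n_agents
    totals, n_active = [], []
    for _ in range(steps):
        for i in range(n_agents):
            if frozen[i]:
                continue            # frozen accident: X_i stays at 5
            x[i] = max(-5.0, min(5.0, x[i] + rng.gauss(0.0, sigma)))
            if x[i] >= 5.0:
                frozen[i] = True
        totals.append(sum(x))                       # the model "entropy"
        n_active.append(n_agents - sum(frozen))     # N_a
    return totals, n_active
```

In such a run the number of active processes Na never increases, and the total ∑ Xi drifts toward 5N as more processes freeze, reproducing the biased trajectory sketched in Figure 12.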
In this article, we have analyzed emergent self-organizing multi-agent (or swarm) systems with methods based on and suggested by the fluctuation theorem. The results provide empirical evidence for the existence of an inverted fluctuation theorem that could prepare a wide basis for the analysis of self-organizing systems. We claim these methods have a potential for general applicability. This claim of generality is supported by the reported results with an alternative entropy measure and the results using a broader interpretation of the concept of temperature.
Specific exemplary benefits of such a theory could be the definition of preconditions for self-organization, for example, concerning the cognitive abilities of the agents. Statistical properties of fluctuations describe the timescales on which negative entropy productions can be observed locally. The agents need to perceive local samples of this global property of negative entropy production and need to react within these timescales. Hence, conditions for controller sampling rates could be derived. The agents need appropriate sensors that allow local measurements of entropy with an accuracy that is sufficiently higher than the rate at which events of negative entropy production occur. Thus, preconditions for successfully generating positive feedback could be derived.
In particular, the biological origin of BEECLUST supports the possibility of applying the proposed methods to natural systems such as the clustering behavior of young honeybees or other social insects, as well as flocks, herds, and shoals. Hence, the same methods could be used for artificial and natural systems, which could, in turn, enrich primarily biological studies.
This work has shown again that thermodynamics and statistical physics offer many fully developed methods that can often be applied, even unmodified, to problems of emergent behavior (cf. Hamann et al.). Pursuing this research track might be a promising way of achieving general insights into still rather fuzzy concepts such as emergence or self-organization.
Finally, it is clear that the reported approach is truly interdisciplinary in combining methods and problems from physics, biology, and computer science. It is obvious that, at least in the field of artificial life, any future research success has to be founded on a combination of several scientific fields. In our future work, we hope to continue this approach by generalizing the concept of an inverted fluctuation theorem for emergent self-organizing multi-agent systems.
The authors thank Payam Zahadat and the anonymous reviewers for helpful comments that improved this article. This reported research is supported by EU-ICT CoCoRo, no. 270382; EU-IST-FET project SYMBRION, no. 216342; EU-ICT project REPLICATOR, no. 216240; FWF (Austrian Science Fund) Project REBODIMENT, no. P23943-N13; and the Austrian Federal Ministry of Science and Research (BM.W_F).
In any embodied implementation of this system (e.g., robots) the particles would be affected by friction and would consequently need a permanent acceleration to compensate for friction. This, in turn, means they would have an energy reservoir (cf. active particles) and would permanently dissipate heat, which would result in a situation as shown in Figure 1. Energy costs have to be paid to allow self-organization and to comply with the second law of thermodynamics. Hence, we carry out the separation between the two subsystems: the self-organizing subsystem containing the agents, and the subsystem represented by the heat reservoir. Due to its energy dissipation, the self-organizing subsystem does not have to obey the second law of thermodynamics.
University of Paderborn, Department of Computer Science, Zukunftsmeile 1, 33102 Paderborn, Germany. E-mail: email@example.com