## Abstract

A grand challenge in the field of artificial life is to find a general theory of emergent self-organizing systems. In swarm systems, most of the observed complexity is based on the motion of simple entities. Similarly, statistical mechanics focuses on collective properties induced by the motion of many interacting particles. In this article we apply methods from statistical mechanics to swarm systems. We try to explain the emergent behavior of a simulated swarm by applying methods based on the fluctuation theorem. Empirical results indicate that swarms are able to produce negative entropy within an isolated subsystem due to frozen accidents. Individuals of a swarm are able to locally detect fluctuations of the global entropy measure and to store them if they are negative. By accumulating these stored fluctuations over time, the swarm as a whole produces negative entropy and the system ends up in an ordered state. We claim that this indicates the existence of an inverted fluctuation theorem for emergent self-organizing dissipative systems. This approach bears the potential of general applicability.

## 1 Introduction

One characteristic of living organisms is their metabolism. Living beings require energy in order to maintain their internal order. This is determined by the second law of thermodynamics, which describes the ubiquitous decay of all things and does not allow the increase of order without the cost of dissipation. In the context of self-organizing systems one might cite Parunak and Brueckner [17]: “Emergent self-organization in multi-agent systems appears to contradict the second law of thermodynamics.” This is of course not the case; as discussed by Parunak and Brueckner [17], one has to distinguish between two kinds of subsystems: one that hosts the self-organizing swarm, and one in which disorder is increased. Hence, a swarm can be thought of as a heat pump that decreases entropy (entropy is defined in Section 2, and entropy measures are investigated in Section 5.1) in one basin in favor of increased entropy in another basin. However, the question of how the swarm manages to do that still persists. Whether thermodynamic properties are relevant and helpful in understanding such systems is currently being discussed [11, 18].

The complexity of the animate world is explained by natural selection in combination with random events (natural evolution). It is one thing to select the adapted organism, but the mutation that results in improved adaptivity has to occur first. Concerning the genetic code, Crick [3] coined the term *frozen accident theory*. Whereas Crick introduced this concept with a focus on genetics, Gell-Mann [7] applied it to everything:

[…] the effective complexity [of the universe] receives only a small contribution from the fundamental laws. The rest comes from the numerous regularities resulting from ‘frozen accidents’.

While a heat pump has to work against the second law (e.g., diffusion of heat) by expending energy, limited violations of the second law without the expenditure of energy [5] are also possible as, for example, indicated by Maxwell [15]:

The truth of the second law is … a statistical, not a mathematical, truth, … for it depends on the fact that the bodies we deal with consist of millions of molecules….

A self-organizing system might be driven by a *summation* of such violations of the second law. As already noted, the second law is only statistical and hence allows spontaneous decreases of entropy in isolated systems with nonzero probability. Say *x* is a thermodynamic variable (i.e., it describes a state of a thermodynamic system at a given time); then the probability distribution *f*(*x*) of this variable for a system at maximum entropy (i.e., at equilibrium) turns out to be approximately Gaussian with mean μ = 0:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{x^2}{2\sigma^2}\right),$$

where the variance is defined by the mean squared fluctuation σ^{2} = 〈*x*^{2}〉, which is an average over many ensembles (i.e., over many realizations of the system). Hence, the probabilities of observing negative fluctuations (∫_{−∞}^{0} *f*(*x*) *dx*) and positive fluctuations (∫_{0}^{+∞} *f*(*x*) *dx*) are equal at equilibrium.
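This symmetry is easy to check numerically; a minimal sketch (the sample size and variance are arbitrary choices):

```python
import random

def estimate_sign_probabilities(sigma=1.0, n_samples=100_000, seed=42):
    """Sample a zero-mean Gaussian thermodynamic variable x and estimate
    the probabilities of negative and positive fluctuations."""
    rng = random.Random(seed)
    negative = sum(1 for _ in range(n_samples) if rng.gauss(0.0, sigma) < 0.0)
    p_neg = negative / n_samples
    return p_neg, 1.0 - p_neg

p_neg, p_pos = estimate_sign_probabilities()
# at equilibrium both probabilities agree up to sampling error
```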

The fluctuation theorem [5, 6] quantifies the probability of violations of the second law. For short intervals it can be said that nature was running in reverse, because the sign of entropy production is used to define the direction of the arrow of time. Even in reference to living systems this might be said. For example, small “machines” within a cell (e.g., mitochondria) are likely to run in reverse from time to time. A transfer of this concept to the macro world is typically denied categorically. In a review of Wang et al. [27], Gerstner [8] wrote: “For larger systems over normal periods of time, however, the second law of thermodynamics is absolutely rock solid.”

Dissipative self-organization is accompanied by heat *dQ* rejected by the system to a thermal bath at temperature *T*. The second law requires *dQ* ≥ −*T* *dS* = *k*_{B}*T* *dI*, where *k*_{B} is the Boltzmann constant, −*dS* is the reduction in entropy of the (sub)system's internal state, and *dI* = −*dS*/*k*_{B} is the increase in information (note that Smith [25] defines information as “the reduction in some measure of entropy”). Note that the mere property of being dissipative is not sufficient to explain a self-organizing system. In addition to squandering energy, the system has to generate orderly structures. Dissipation is only a necessary condition for negative entropy production; sufficiency requires additional conditions. In the case of Rayleigh-Bénard convection [2], for example, initially fluctuating flows [29] occur that are amplified and trigger the formation of Bénard cells in spontaneous symmetry breaking (cf. also [9, 16]). We want to point out that the self-amplification of fluctuations as such is a sufficient condition here.

In this article, we report empirical evidence that the negative entropy production in emergent self-organizing systems is initially based on frozen accidents allowed by the original fluctuation theorem, which in the end leads to a global behavior described by an inversion of the fluctuation theorem in dissipative self-organizing systems. This concept bears the potential of embedding emergent behavior in multi-agent systems (swarms, self-propelled particles, etc.) in a theoretical framework built on the sound foundations of physics. Hence, we propose an approach to understanding emergent behavior through thermodynamics that follows up on our earlier-reported concept [11].

In addition, the relation to the fluctuation theorem might allow us to define preconditions for effective self-organizing systems in the future. For example, one can define minimum requirements for the agents of the system concerning their cognitive abilities in order to be able to leverage fluctuations. An agent needs sensors that allow it to estimate, at least probabilistically, whether the (local) entropy has just decreased. Furthermore, the system needs the ability to store such local fluctuations.

In the following we describe the fluctuation theorem and the investigated scenarios. We analyze the multi-agent system or swarm, discuss how the results could be viewed as obeying an inverted fluctuation theorem, and conclude by giving a short summary and outlook.

## 2 Fluctuation Theorem and Entropy Measures

Fluctuation theorems quantify statistical fluctuations in time-averaged properties of dissipative non-equilibrium systems. One of these theorems (the steady-state fluctuation theorem) applies to time-reversible, thermostatted, ergodic dynamical systems and yields the relation of fluctuations [6]

$$\frac{p(\overline{\Sigma}_t = A)}{p(\overline{\Sigma}_t = -A)} = e^{At} \tag{3}$$

for the time-averaged entropy production $\overline{\Sigma}_t$. The fluctuation theorem compares the probabilities of observing a certain time-averaged entropy production *A* and observing its negative −*A*. The numerator describes the probability of finding the system initially in those states that subsequently generate bundles of trajectory segments with the time-averaged value *A*. The theorem in Equation 3 predicts an exponential increase of the ratio with averaging time *t*. Hence, with increasing time, positive-entropy-producing trajectories become exponentially more likely than their negative-entropy-producing counterparts.
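A distribution that satisfies Equation 3 exactly is a Gaussian whose variance equals 2μ/*t* for mean μ. The following sketch (with illustrative values for μ, *t*, and *A*) verifies that the log ratio of the densities equals *A*·*t*:

```python
import math

def gaussian_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def ft_log_ratio(A, mu, t):
    """ln[p(A) / p(-A)] for a Gaussian distribution of the time-averaged
    entropy production with mean mu and variance 2*mu/t, a sufficient
    condition for the steady-state fluctuation theorem: the result
    equals A*t, as Equation 3 predicts."""
    var = 2.0 * mu / t
    return math.log(gaussian_pdf(A, mu, var) / gaussian_pdf(-A, mu, var))
```

For example, `ft_log_ratio(0.2, 0.3, 10.0)` evaluates to *A*·*t* = 2.0 up to floating-point error.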

Entropy can be defined via the probabilities *p*_{i} of the system's microstates:

$$S = -k_B \sum_i p_i \ln p_i.$$

A microstate is said to be a detailed description of the system's current configuration. A macrostate is a much shorter description of such a configuration, usually based on averages, and hence summarizes a set of microstates. Macrostates are defined with an application in mind, as stated by Jaynes [14]:

But it is also possible that two experimenters assign different entropies […] to what is in fact the same microstate […] without either being in error. That is, they are contemplating a different set of possible macroscopic observations on the same system, embedding that microstate in two different reference classes […]. It is necessary to decide at the outset of a problem which macroscopic variables or degrees of freedom we shall measure and/or control.

In order to measure probabilities of microstates effectively, one needs to define a countable set of microstates, which is usually done by coarse-graining (an example in the context of multi-agent systems is given by Parunak and Brueckner [17]). In the context of entropy production it is also important to determine whether negative or positive entropy production is observed. Note that this sign of entropy production might be opposite to the actual mathematical sign of a certain entropy measure. According to the statistical physics interpretation, this sign of entropy production is defined relative to the entropy of that macrostate that has the highest number of microstates. According to thermodynamics it is defined relative to the entropy of that state with the lowest amount of energy that can be used to do thermodynamic work. That macrostate is the equilibrium state, which would be achieved in the context of this article by mere random behavior of the particles or agents, that is, a uniform distribution of agents. Thus, any process that leaves the equilibrium state and enters states of lower entropy shows negative entropy production.

## 3 Investigated Scenarios

We investigate two scenarios. In the first one the swarm is controlled by the BEECLUST algorithm. The second scenario is a simple clustering process and is introduced to give evidence that our approach has potential for general applicability.

### 3.1 BEECLUST Algorithm

The BEECLUST algorithm can be considered a model algorithm for swarms. It is based on observations of young honeybees [26], has been analyzed in many models [10, 12, 13, 20, 21], and has even been implemented in a swarm of robots [22].

This algorithm allows a swarm to aggregate at a maximum of a scalar field although individual agents do not perform a greedy gradient ascent. In addition, a BEECLUST-controlled swarm is able to break symmetries [12] of equal maxima in the scalar field, as also observed here (e.g., see Figure 2). Hence, it might be justified to call this emergent behavior. Controlled by this algorithm, three agents will stop when they approach each other (i.e., the stopping threshold is set to three; note that in previous works typically two agents were enough for stopping; the actual choice of this threshold is, however, irrelevant in this article), measure the local value of the scalar field, and wait for a time proportional to this measurement. Clusters form, and finally the swarm will be aggregated close to the global optimum of the scalar field (see the lower part of Figure 2). See Figure 3 for a definition of the BEECLUST algorithm.
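The control loop can be sketched as follows; the dict-based agent structure, the normalized `sense_field` callback (returning a value in [0, 1]), and the uniform random turn are illustrative assumptions rather than the reference implementation:

```python
import random

def beeclust_step(agent, neighbors_in_range, sense_field, max_wait=660, rng=random):
    """One control step of a simplified BEECLUST agent.

    With the stopping threshold of three used in this article, an agent
    stops when it perceives at least two others within sensor range."""
    if agent['state'] == 'stopped':
        agent['wait'] -= 1
        if agent['wait'] <= 0:                  # waiting time over:
            agent['state'] = 'moving'           # turn and move on
            agent['heading'] = rng.uniform(0.0, 360.0)
    elif len(neighbors_in_range) >= 2:          # two neighbors + self = three
        agent['state'] = 'stopped'
        agent['wait'] = max_wait * sense_field()  # wait time ~ local field value
    return agent
```

`beeclust_step` would be called once per time unit for every agent; collision avoidance and the actual position update are outside this sketch.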

The collective aggregation close to the global optimum is achieved via a positive feedback process [12]. Clusters of three stopped agents will form by chance anywhere in the arena. The area covered by clusters grows with the number of contained agents, and clusters covering a bigger area are more likely to be approached, by chance, by moving agents. Hence, bigger clusters will grow faster. The intensity of this positive feedback is inhomogeneous in the arena. Agents in clusters closer to the global optimum have longer waiting times. These clusters will exist longer than those that are farther away from the global optimum. Hence, the chance of growing is bigger for clusters closer to the global optimum. This process, typically, generates one big cluster close to the global optimum. The agents interact only locally and, as noted above, a BEECLUST-controlled swarm breaks symmetries. Hence, this behavior is different from other aggregation processes, for example, star formation, which includes global interactions due to gravitation.

In the following experiments, the agents have initially random headings, are in the state *moving*, and are uniformly randomly distributed in the arena. The scalar field is bimodal with maxima of the same value and shape (see contours in Figure 2). See Table 1 for the standard parameters used.

| Parameter | Value |
| --- | --- |
| Arena dimensions | 150 × 50 [length units]^{2} |
| Proximity sensor range | 3.5 [length units] |
| Maximum waiting time | 660 [time units] |
| Velocity | 4 [length units]/[time unit] |
| Number of agents | 25 |


### 3.2 Clustering

The second scenario that we investigate is simple clustering of agents. It is similar to the BEECLUST-controlled swarm, but the agents have constant waiting times independent of any scalar field, they move on a torus, and we introduce noisy sensors. The missing scalar field and missing wall effects remove the preferred regions of cluster formation seen in the BEECLUST scenario. The parameters used are the same as given in Table 1. We implement sensor noise in a simple way by an agent-agent recognition probability γ: Even if two agents are mutually within sensor range, they might not perceive each other and hence might not stop. We set γ = 0.75, that is, an agent will perceive another agent within its sensor range 75% of the time. The noise is uncorrelated in time. Noise was introduced because deterministic clustering seems to introduce artifacts into the entropy production histograms.
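The recognition probability can be sketched as a per-step filter over the neighbors that are physically in range (the function name and the list-based interface are our choices):

```python
import random

def perceived(neighbors_in_range, gamma=0.75, rng=random):
    """Noisy sensing: each neighbor that is physically within sensor range
    is recognized only with probability gamma; the noise is uncorrelated
    in time, so recognition is redrawn at every step."""
    return [n for n in neighbors_in_range if rng.random() < gamma]
```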

## 4 Analysis of Scenarios

### 4.1 Analysis of BEECLUST

We interpret the swarm as a system of *N* agents that move in a two-dimensional box and scalar field. The particles are always found in one of two states: moving with constant velocity or stopped. We assume equations of motion of the form

$$\dot{\mathbf{q}}_i = \mathbf{p}_i, \qquad \dot{\mathbf{p}}_i = \mathbf{F}_i$$

for each moving agent *i* (unit mass assumed), with **p**_{i} = 0 while the agent is stopped and **p**_{i} restored to **p**_{i}′ when it starts moving again, where **q**_{i} = (*x*_{i}, *y*_{i})^{⊺} is the position of agent *i*, **p**_{i} is the momentum, and **p**_{i}′ is the value of **p**_{i} at the time the agent stopped. Whenever the agent moves undisturbed by walls or by other agents we have |**F**_{i}| = 0. If the agent bounces off the bounds or closely approaches another agent, we have a force |**F**_{i}| > 0 that separates agents from each other or, in the case of approaching a wall, implements the regular behavior of a billiard ball (angle of incidence equals angle of reflection). This can be implemented, for example, via a Weeks-Chandler-Andersen (WCA) potential (see [28]), which is purely repulsive. As the thermostat method we use velocity scaling: Based on the current **p**_{i}, we calculate a velocity scaling factor (see Equation 10 below). Consequently we can scale time, which is then governed by the number of stopped agents. In particular, the special periods of time in which all agents are stopped are converted to time periods of no extent. Note that this is only our method of measuring the self-organizing system; it is not intrinsic to the system, and the behavior of the agents is unaffected by it.

The system dynamics takes place in a high-dimensional phase space (**q**_{0}, **q**_{1},…, **q**_{N−1}, **p**_{0}, **p**_{1},…, **p**_{N−1}) ∈ Γ. In the following we need to detect the essentials of this dynamics by a measure of entropy. First, we present an advantageous choice of an entropy measure, and in Section 5.1 we give an alternative, which is, however, less beneficial.

For our entropy measure we ignore the momenta **p** and also the *y*-positions, because the main feature of the clusters is defined by the agents' *x*-positions (see Figure 2). Ignoring the momenta does not hide entropy: We start with all nonzero momenta, and during the experiments we have inhomogeneous momentum distributions, but the experiments typically end with almost all agents stopped (i.e., a homogeneous momentum distribution). Similarly to [6, Section 4.3], we measure the agent density modulation ρ_{b} of the BEECLUST scenario via

$$\rho_b(k, t) = \sum_{i=0}^{N-1} \sin\!\left(k\, x_i(t) + \frac{\pi}{2}\right), \tag{8}$$

where *x*_{i}(*t*) is the *x*-position of agent *i* at time *t*, *k* = 2π/*L*, and *L* = 150 is the box length. The applied sine function is shown in Figure 2. Agents in the leftmost and rightmost quarters of the arena contribute positively; agents in the middle contribute negatively. In equilibrium, *x*_{i} ∈ [0, *L*] is equally distributed when averaged over many ensembles, yielding 〈ρ_{b}〉 = 0. By the converse argument, averages 〈ρ_{b}〉 ≠ 0 correspond to unequal distributions of agents, with negative and positive values indicating whether the main cluster is in the middle or at the ends.
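The density modulation can be sketched as follows; the π/2 phase inside the sine is an assumption of this sketch, chosen so that the outer quarters of the arena contribute positively and the middle negatively, matching the description in the text:

```python
import math

def density_modulation(x_positions, L=150.0):
    """Agent density modulation rho_b over the agents' x-positions with
    wave number k = 2*pi/L; the pi/2 phase shift (an assumption of this
    sketch) makes the outer quarters of the arena contribute positively
    and the middle negatively."""
    k = 2.0 * math.pi / L
    return sum(math.sin(k * x + math.pi / 2.0) for x in x_positions)
```

A cluster in the middle of the arena gives a strongly negative value, while a uniform distribution averages out to roughly zero.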

Following [6], we define a *dissipation function* Ω_{b}(Γ) that gives the entropy production for a given phase space trajectory. We integrate changes of ρ_{b} over a time interval [0, *t*]:

$$t\,\Omega_b = \beta\bigl(\rho_b(k,t) - \rho_b(k,0)\bigr), \tag{9}$$

where

$$\beta = \frac{1}{k_B T}, \qquad T = \frac{1}{N_d k_B}\sum_{i=0}^{N-1} |\mathbf{p}_i|^2 \tag{10}$$

is the reciprocal temperature of the initial ensemble, with *k*_{B} the Boltzmann constant and *N*_{d} = 2*N* the number of degrees of freedom (unit mass assumed). The distribution of the entropy production for *N* = 25 agents controlled by the BEECLUST algorithm, which were initially uniformly distributed, is shown in Figure 4 for *t* = 1,500. The initial random uniform distribution yields 〈ρ_{b}(0)〉 = 0, which is the state of maximal entropy. Hence, any distribution of the entropy production with a mean of 〈*t*Ω_{b}〉 ≠ 0 indicates negative entropy production (i.e., averaged differences of the density modulation can have negative or positive signs, but imply negative entropy production if they are nonzero). The ensemble average is 〈*t*Ω_{b}〉 ≈ 15.77, which means that negative entropy is produced (starting from maximum entropy). Note that there is no direct influence by the scalar field on the entropy productions, which are based on the agents' *x*-positions. Furthermore, the waiting times, which are determined by the scalar field, vary only by a factor of five between the minimum and the maximum.
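The temperature-based prefactor and the dissipation can be sketched as two small functions; unit mass and *k*_{B} = 1 are simplifying assumptions of the sketch:

```python
def reciprocal_temperature(momenta, k_B=1.0):
    """beta = 1/(k_B*T), with T taken from the kinetic energy of the
    ensemble over N_d = 2N degrees of freedom (unit mass and k_B = 1
    are simplifying assumptions of this sketch)."""
    N_d = 2 * len(momenta)
    T = sum(px * px + py * py for px, py in momenta) / (N_d * k_B)
    return 1.0 / (k_B * T)

def dissipation(rho_t, rho_0, beta):
    """Integrated dissipation t*Omega_b = beta*(rho_b(k,t) - rho_b(k,0))."""
    return beta * (rho_t - rho_0)
```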

Special attention must be paid to the indistinguishability of ρ_{b}(*k*, *t*) − ρ_{b}(*k*, 0) = *A* and ρ_{b}(*k*, *t*) − ρ_{b}(*k*, 0) = −*A*, because initially both refer to negative entropy production. This indistinguishability, however, holds only for the initial phase. In later phases, ρ_{b}(*k*, *t*) − ρ_{b}(*k*, 0) > 0 and ρ_{b}(*k*, *t*) − ρ_{b}(*k*, 0) < 0 are to be distinguished, because the system has on average a bias toward nonzero values of ρ_{b}(*k*, *t*), in particular toward positive values (i.e., the system operates most of the time in the positive half of Figure 8, discussed later). Consequently, a change of ρ_{b}(*k*, *t*_{1}) − ρ_{b}(*k*, *t*_{0}) < 0 takes the system back toward the maximal-entropy state for ρ_{b}(*k*, *t*_{0}) > 0, corresponding to positive entropy production (and vice versa for ρ_{b}(*k*, *t*_{1}) − ρ_{b}(*k*, *t*_{0}) > 0), as discussed in Section 2. Applying the fluctuation theorem gives

$$\frac{p\bigl(\rho_b(k,t) - \rho_b(k,0) = A\bigr)}{p\bigl(\rho_b(k,t) - \rho_b(k,0) = -A\bigr)} = e^{\beta A}. \tag{11}$$

In Figure 5 the data shown in Figure 4 is tested for whether it obeys Equation 11. The fluctuation theorem is satisfied for this system although the system produces negative entropy and actually abandons the equilibrium to which it was initialized. Hence, one could speak of an inverted fluctuation theorem that is satisfied here.
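Such a linearity test can be reproduced on synthetic data. Taking β = 1, entropy productions drawn from a Gaussian with variance equal to twice its mean satisfy the fluctuation theorem, and the log ratio of counts in symmetric histogram bins is then linear in *A* (bin width and sample size are arbitrary choices of this sketch):

```python
import math
import random

def ft_test(samples, bin_width=0.5):
    """For symmetric histogram bins around 0, return points (A, ln[p(A)/p(-A)])
    computed from a sample of entropy productions; under the fluctuation
    theorem (with beta = 1) these points lie on the line ln-ratio = A."""
    counts = {}
    for s in samples:
        b = math.floor(s / bin_width)          # bin index of sample s
        counts[b] = counts.get(b, 0) + 1
    points = []
    for b in sorted(counts):
        if b < 0:
            continue
        n_neg = counts.get(-b - 1, 0)          # mirror bin covering -A
        if n_neg > 0:
            A = (b + 0.5) * bin_width          # bin center
            points.append((A, math.log(counts[b] / n_neg)))
    return points

# Gaussian entropy productions with variance = 2 * mean satisfy the theorem
rng = random.Random(1)
samples = [rng.gauss(1.0, math.sqrt(2.0)) for _ in range(200_000)]
points = ft_test(samples)
# the resulting (A, ln-ratio) points lie close to the diagonal
```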

In the following we want to investigate how it is possible for this self-organizing system to produce negative entropy. We hypothesize that the negative entropy production is based on fluctuations and the stopping behavior of the agents, and hence is a process of frozen accidents.

We start our analysis with a measurement of the entropy production within a limited time interval [*t*_{0} = 15, *t*_{1} = 20] in the early transient. In addition, we classify for each measurement whether at least one agent changed its state from moving to stopped (starting agents do not occur that early in the simulation). The entropy production distributions for these two classes are shown in Figure 6. For the measurements without a stopping agent the averaged change in the density modulation is about zero (〈(*t*_{1} − *t*_{0})Ω_{b}〉 ≈ 0.06). In contrast, for those measurements with stopping agents the averaged change of density modulation is negative (〈(*t*_{1} − *t*_{0})Ω_{b}〉 ≈ −3.09), indicating frozen accidents. For much later time intervals no difference is found between measurements with and without stopping agents. The negative value of 〈(*t*_{1} − *t*_{0})Ω_{b}〉 demands clarification, because in the limit *t* → ∞ the average density modulation is positive.

The explanation is based on a special feature of the BEECLUST-controlled swarm in this scenario, which passes through three phases (see Figure 7). In the short period before the first cluster forms, the average entropy production is 〈Ω_{b}〉 = 0, indicating that the original fluctuation theorem holds for this phase. The first cluster usually does not form close to the global optima, but relatively close to the middle of the arena; see Figure 7a. In this area the agent density modulation (Equation 8) contributes negatively. In a second phase the average density modulation is negative (〈Ω_{b}〉 < 0) because the density close to the middle of the arena increases further; see Figure 7b. This is also indicated by the evolution of the agent density modulation over time as shown in Figure 8. Initially it stays close to 0, and only later does it clearly take a positive sign. The inset shows details of the first 50 time steps and indicates a negative slope for the time interval [15, 20] (i.e., the second phase) of Figure 6. Only later do the clusters move toward the ends of the arena, probably aided by wall effects (see Figure 7c; agents are more likely to approach the cluster again after having left it on the side of the cluster that is closer to one of the ends of the arena), and consequently the average density modulation is positive (〈Ω_{b}〉 > 0).

### 4.2 Analysis of Clustering

The clustering scenario is simple, and it serves in this article as a second example to show the generality of our approach. Hence we reduce the analysis to a comparison of the entropy production with that of the BEECLUST scenario and to the corresponding test of the entropy production distribution.

For the clustering scenario we measure the density via normalized distances between all pairs of agents,

$$\rho_c(t) = \frac{2}{N(N-1)} \sum_{i=0}^{N-1} \sum_{j=i+1}^{N-1} \frac{|\mathbf{q}_i(t) - \mathbf{q}_j(t)|}{d_{max}},$$

where *d*_{max} is the maximal distance between two points on the torus. Consequently the dissipation function is

$$t\,\Omega_c = \rho_c(t) - \rho_c(0).$$

We ignore any additional constants for this scenario. The agents are initialized to uniformly randomly distributed positions on the torus. On average this uniform distribution corresponds to high entropy, which is reflected in high values of ρ_{c}. On average these high values cannot be increased by the defined swarm behavior. When clusters form, distances between agents decrease, and consequently ρ_{c} decreases. Hence, we expect negative values of *t*Ω_{c} for the majority of runs. After *t* = 80 time steps the entropy production *t*Ω_{c} is measured. We choose this relatively early state because the system converges fast, and consequently only few positive entropy productions are observed later, which complicates the statistical analysis. The obtained distribution is shown in Figure 9a. In contrast to Figure 4, the distribution is clearly asymmetrical, and the average 〈*t*Ω_{c}〉 ≈ −0.015 (median: −0.0091) is negative, indicating negative entropy production as expected.
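A sketch of this clustering measure; the pair-averaged form and the normalization by the torus half-diagonal are our assumptions:

```python
import math

def torus_distance(a, b, width=150.0, height=50.0):
    """Shortest distance between two points on the torus."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    return math.hypot(min(dx, width - dx), min(dy, height - dy))

def rho_c(positions, width=150.0, height=50.0):
    """Average pairwise agent distance normalized by d_max, the maximal
    distance between two points on the torus (half of each period along
    both axes); the exact normalization is an assumption of this sketch."""
    d_max = math.hypot(width / 2.0, height / 2.0)
    n = len(positions)
    total = sum(torus_distance(positions[i], positions[j], width, height)
                for i in range(n) for j in range(i + 1, n))
    return total / (n * (n - 1) / 2.0 * d_max)
```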

The test of the entropy production distribution according to the fluctuation theorem and by analogy to Equation 11 is shown in Figure 9b. The data is close to linearity, but shows systematic deviations (too small for *A* < 0.01 and too large for 0.015 < *A* < 0.03). Given the fast convergence of the system to a majority of trajectories with negative entropy productions, this is still a satisfactory result.

## 5 Generality

To support our claim that this approach bears the potential of general applicability, we report the influence of an alternative entropy measure and investigate the influence of temperature in the following.

### 5.1 Independence from the Entropy Measure

The entropy measure that might be the most obvious to apply in the context of this article is probably the information-theoretic entropy of Shannon [24]. The Shannon entropy, however, is discrete, and our system has many continuous features apart from the discrete concept of agent numbers. The continuous extension of the Shannon entropy (differential entropy) is also not directly applicable, because we would have to sample a probability distribution from only *N* samples represented by agent positions. To get from continuous state space to discrete microstates, we have to do some form of coarse-graining. Unfortunately there is an ambiguity because there are many possibilities to implement coarse-graining. Still, at least two conditions can be fixed. First, the dynamics of the obtained entropy measure should parallel those of the system, and second, it should capture its main features.

We coarse-grain the arena into 75 tiles based on the agents' positions; as above, we ignore the momenta. We count two microstates: unoccupied and occupied tiles (ignoring the actual number of agents in a tile). This way we sample two probabilities, *p*_{0}(*t*) and *p*_{1}(*t*) = 1 − *p*_{0}(*t*). The Shannon entropy is given by *S*(*t*) = −∑_{i} *p*_{i}(*t*) log_{2} *p*_{i}(*t*), and the entropy production during a time interval [0, *t*] is given by

$$t\,\Omega_s = S(t) - S(0).$$

Initially the agents are uniformly randomly distributed, which causes about as many tiles to be occupied as there are agents (for *N* ≪ 75), and we get *p*_{1}(0) ≈ *N*/75. After some time, the agents cluster and the number of occupied tiles typically decreases (a typical value is *p*_{1}(*t*) = 5/75 for *N* = 25). Hence, the average entropy production will be negative (〈*t*Ω_{s}〉 < 0).
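A sketch of this coarse-grained measure, assuming a 15 × 5 grid of 10 × 10 tiles over the 150 × 50 arena (the text fixes only the count of 75 tiles; the grid geometry is our assumption):

```python
import math

def shannon_entropy(positions, tile=10.0, width=150.0, height=50.0):
    """Coarse-grained Shannon entropy: the arena is split into 75 tiles
    (15 x 5 grid assumed); each tile is either occupied or unoccupied,
    yielding p_1 and p_0 = 1 - p_1."""
    n_tiles = int(width / tile) * int(height / tile)   # 15 * 5 = 75
    occupied = {(int(x // tile), int(y // tile)) for x, y in positions}
    p1 = len(occupied) / n_tiles
    return -sum(p * math.log2(p) for p in (p1, 1.0 - p1) if p > 0.0)

def entropy_production(pos_start, pos_end):
    """t*Omega_s = S(t) - S(0); negative when the swarm clusters."""
    return shannon_entropy(pos_end) - shannon_entropy(pos_start)
```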

The above experiments can be repeated with this entropy measure, and no qualitative change is detectable. However, there are some technical problems. Due to the discrete measure, abnormalities in the histograms occur, and the number of trajectories showing positive entropy production decreases fast with increasing time, which complicates the statistical analysis, for which we need well-filled histogram bins on both the positive and the negative side. Hence, the obtained data will not fully converge to a line according to Equation 11, and we cannot sample within long time intervals. The results are shown in Figure 10. The shapes of the curves are systematic, due to coarse-graining. Still, the overall trend to linearity is seen. We conclude that the above entropy measure based on a density modulation was a better choice.

### 5.2 Influence of Temperature

Both investigated scenarios are certainly more complex than a mere physical multiparticle system, because they also incorporate autonomous stops of agents, sensor ranges, and waiting times in addition to particle velocities. Still, we can accommodate most of them with the concept of temperature as defined in Equation 10. When we reduce the agents' velocities, we alter the temperature. However, we have to extend the concept of temperature to allow the comparison of systems with different agent velocities. To scale the system behaviors at different agent velocities completely, it is clear that we also have to scale waiting times and sensor ranges, which are not considered in Equation 10. Hence, we have to apply an extended concept of temperature that includes these velocity-dependent properties of agents as well. However, the definition and analysis of a general temperature concept for multi-agent systems is beyond the scope of this article. Here, we just scale speeds (halved), waiting times (doubled), and sensor ranges (halved) and obtain two systems with temperatures *T*_{1} = 2*T*_{2}. We compare the entropy production distributions of these two systems in Figure 11 for a late state at time *t* = 6,000. The system at half temperature (*T*_{2}) shows systematic deviations because it converges slower. Still, the deviations are small enough to conclude that, at least for some scenarios, an extended concept of temperature can be found that allows one to scale these systems and compare them across different temperatures.

## 6 Discussion

Note again that ρ_{b}(*k*, *t*) = 0 corresponds to maximum entropy. Therefore, any ρ_{b}(*k*, *t*) ≠ 0 in Figure 8 indicates negative entropy production. We conclude that the negative entropy production of this system is initiated by entropy fluctuations, which are normally distributed and are negative and positive with about the same probability according to the original fluctuation theorem and as seen in Figure 6(a). Some of the negative-entropy-production events are locally observable by the agents themselves, because there are simultaneous agent-to-agent encounters of three agents with mutual perception. This local perception of the global measure of entropy is leveraged by stopping all three agents and consequently storing the local entropy fluctuation. Cascades of such stopping behaviors generate a positive feedback (self-amplification of fluctuations as in Rayleigh-Bénard convection). In the end a system dynamics is generated that can be described by an inverted fluctuation theorem, which dictates an exponentially increasing probability of low-entropy states. Hence, this emergent self-organizing swarm does indeed rely on frozen accidents. Note that the overall system (i.e., including the heat bath) still produces positive entropy (e.g., due to accelerations of the agents), while the agent-position-based entropies are reduced only in the self-organizing subsystem.

The effectiveness of the frozen-accidents concept can easily be made clear by constructing a simple model. We represent the entropy contribution of each agent *i* by a random process *X*_{i}(*t*). The total entropy is just the sum ∑_{i=0}^{N−1} *X*_{i}(*t*) over all *N* agents. The restriction of all random processes to the interval [−5, 5] is essential, and we define *X*_{i}(*t*) = 5 for all *t* > *t*_{0}, where *t*_{0} is the first time agent *i* achieved *X*_{i}(*t*_{0}) = 5. That is, once a random process reaches *X*_{i}(*t*_{0}) = 5 (a local property), it stays there forever: a frozen accident. As a consequence the number of active random processes *N*_{a} decreases monotonically. A sample run of this simple model for *N* = 25, based on Gaussian distributed *X*_{i} and initialization *X*_{i}(0) = 0, is shown in Figure 12. The bias in the otherwise random trajectory is noticeable. Note that the sum ∑_{i} *X*_{i} of Gaussian distributed random variables, each having a variance of σ_{i}^{2}, is also Gaussian distributed, with a variance of σ^{2} = ∑_{i} σ_{i}^{2}. With decreasing number *N*_{a} of active processes, more and more variances vanish (σ_{i}^{2} = 0). Hence, the variance of the sum also decreases, which is the macroscopic effect of the frozen accidents and ensures that states of low entropy are much more likely to be maintained. The challenge in investigating self-organizing systems, however, lies in the reverse engineering of such a model: One would start with an observed macroscopic effect as shown in Figure 12 and then decompose it into a set of subsystems that accidentally achieve low-entropy states by noise and then preserve them by freezing.
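The model just described can be sketched directly; the choice of a clipped Gaussian random walk as the realization of the random processes (as well as step size, step count, and seed) is an illustrative assumption:

```python
import random

def frozen_accident_model(n_agents=25, steps=4000, bound=5.0,
                          step_sigma=0.25, seed=7):
    """Toy model of frozen accidents: each entropy contribution X_i is a
    random walk with Gaussian increments (an assumed realization of the
    'random process' in the text), clipped to [-bound, bound]. Once X_i
    reaches +bound it freezes there forever. Returns the trajectory of
    the sum over all agents and the number of frozen processes."""
    rng = random.Random(seed)
    x = [0.0] * n_agents
    frozen = [False] * n_agents
    totals = []
    for _ in range(steps):
        for i in range(n_agents):
            if frozen[i]:
                continue               # frozen accident: X_i stays at +bound
            x[i] = max(-bound, min(bound, x[i] + rng.gauss(0.0, step_sigma)))
            if x[i] >= bound:
                frozen[i] = True
        totals.append(sum(x))
    return totals, sum(frozen)
```

Plotting `totals` over time reproduces the kind of biased trajectory shown in Figure 12.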

A first consequence of this freezing is the existence of a *self-organization equilibrium* of lower entropy to which the system will converge. As a second consequence, the self-organizing entropy-reduction behavior is a transient phenomenon (cf. [19, p. 62]).

## 7 Conclusion

In this article, we have analyzed emergent self-organizing multi-agent (or swarm) systems with methods based on and suggested by the fluctuation theorem. The results provide empirical evidence for the existence of an inverted fluctuation theorem that could provide a broad basis for the analysis of self-organizing systems. We claim these methods have a potential for general applicability. This claim of generality is supported by the reported results with an alternative entropy measure and the results using a broader interpretation of the concept of temperature.

Specific exemplary benefits of such a theory could be the definition of preconditions for self-organization, for example, concerning the cognitive abilities of the agents. Statistical properties of fluctuations describe the timescales on which negative entropy productions can be observed locally. The agents need to perceive local samples of this global property of negative entropy production and need to react within these timescales. Hence, conditions for controller sampling rates could be derived. The agents need appropriate sensors that allow local measurements of entropy with sufficient accuracy, and at a sufficient rate, to detect events of negative entropy production as they occur. Thus, preconditions for successfully generating positive feedback could be derived.

In particular, the origin of BEECLUST suggests that the proposed methods can also be applied to natural systems, such as the clustering behavior of young honeybees [26] and other social insects, as well as flocks, herds, and shoals. Hence, the same methods could be used for artificial and natural systems, which could, in turn, enrich primarily biological studies.

This work has shown once more that thermodynamics and statistical physics offer many fully developed methods that can often be applied, even unmodified, to problems of emergent behavior (cf. Hamann et al. [11]). Pursuing this research track might be a promising way of achieving general insights into still rather fuzzy concepts such as emergence or self-organization.

Finally, it is clear that the reported approach is truly interdisciplinary in combining methods and problems from physics, biology, and computer science. It is obvious that, at least in the field of artificial life, any future research success has to be founded on a combination of several scientific fields. In our future work, we hope to continue this approach by generalizing the concept of an inverted fluctuation theorem for emergent self-organizing multi-agent systems.

## Acknowledgments

The authors thank Payam Zahadat and the anonymous reviewers for helpful comments that improved this article. This reported research is supported by EU-ICT CoCoRo, no. 270382; EU-IST-FET project SYMBRION, no. 216342; EU-ICT project REPLICATOR, no. 216240; FWF (Austrian Science Fund) Project REBODIMENT, no. P23943-N13; and the Austrian Federal Ministry of Science and Research (BM.W_F).

## Notes

In a thermostatted system the temperature is kept constant, for example, by rescaling the particles' velocities. The system can be thought of as being in contact with a large heat reservoir in order to thermostat the system [1]. A thermostat method is applied in Section 4.1.

In any embodied implementation of this system (e.g., robots) the particles would be affected by friction and would consequently need a permanent acceleration to compensate for it. This, in turn, means they would have an energy reservoir (cf. active particles [23]) and would permanently dissipate heat, which would result in a situation as shown in Figure 1. Energy costs have to be paid to allow self-organization and to comply with the second law of thermodynamics. Hence, we separate the system into two subsystems: the self-organizing subsystem containing the agents, and the subsystem typified by the heat reservoir. Due to its energy dissipation, the self-organizing subsystem does not have to obey the second law of thermodynamics.

## References

## Author notes

Contact author.

University of Paderborn, Department of Computer Science, Zukunftsmeile 1, 33102 Paderborn, Germany. E-mail: heiko.hamann@uni-paderborn.de

Artificial Life Lab of the Department of Zoology, Universitätsplatz 2, Karl-Franzens University Graz, 8010 Graz, Austria. E-mail: thomas.schmickl@uni-graz.at, karl.crailsheim@uni-graz.at