Abstract

A grand challenge in the field of artificial life is to find a general theory of emergent self-organizing systems. In swarm systems most of the observed complexity is based on motion of simple entities. Similarly, statistical mechanics focuses on collective properties induced by the motion of many interacting particles. In this article we apply methods from statistical mechanics to swarm systems. We try to explain the emergent behavior of a simulated swarm by applying methods based on the fluctuation theorem. Empirical results indicate that swarms are able to produce negative entropy within an isolated subsystem due to frozen accidents. Individuals of a swarm are able to locally detect fluctuations of the global entropy measure and store them if they correspond to negative entropy production. By accumulating these stored fluctuations over time, the swarm as a whole produces negative entropy and the system ends up in an ordered state. We claim that this indicates the existence of an inverted fluctuation theorem for emergent self-organizing dissipative systems. This approach bears the potential of general applicability.

1 Introduction

One characteristic of living organisms is their metabolism. Living beings require energy in order to maintain their internal order. This is determined by the second law of thermodynamics, which describes the ubiquitous decay of all things and does not allow the increase of order without the cost of dissipation. In the context of self-organizing systems one might cite Parunak and Brueckner [17]: “Emergent self-organization in multi-agent systems appears to contradict the second law of thermodynamics.” This is of course not the case; as discussed by Parunak and Brueckner [17], one has to distinguish between two kinds of subsystems: one that hosts the self-organizing swarm, and one in which disorder is increased. Hence, a swarm can be thought of as a heat pump that decreases entropy (entropy is defined in Section 2, and entropy measures are investigated in Section 5.1) in one basin in favor of increased entropy in another basin. However, the question of how the swarm manages to do that still persists. Whether thermodynamic properties are relevant and helpful in understanding such systems is currently being discussed [11, 18].

The complexity of the animate world is explained by natural selection in combination with random events (natural evolution). It is one thing to select the adapted organism, but the mutation that results in improved adaptivity has to occur first. Concerning the genetic code, Crick [3] coined the term frozen accident theory. Whereas Crick introduced this concept with a focus on genetics, Gell-Mann [7] applied it to everything:

[…] the effective complexity [of the universe] receives only a small contribution from the fundamental laws. The rest comes from the numerous regularities resulting from ‘frozen accidents’.

We want to define a concept of frozen accidents within emergent self-organizing multi-agent systems [4] that explains how they can work as heat pumps in the sense described above.

While a heat pump has to work against the second law (e.g., diffusion of heat) by expending energy, limited violations of the second law without the expenditure of energy [5] are also possible as, for example, indicated by Maxwell [15]:

The truth of the second law is … a statistical, not a mathematical, truth, … for it depends on the fact that the bodies we deal with consist of millions of molecules….

Violations of the second law are possible for small systems and short timescales, that is, at atomic and micron scales over short times (up to 2 s), and have been shown experimentally [27]. We claim that the reduction of entropy by emergent self-organizing systems could be explained by the summation of such violations of the second law. As already noted, the second law is only statistical and hence allows spontaneous decreases of entropy in isolated systems with nonzero probability.

The possibility of a time-limited decrease in entropy exists because a system at a temperature above absolute zero, according to statistical mechanics, always shows thermal fluctuations, which are random deviations of a system from its equilibrium. Say x is a thermodynamic variable (i.e., it describes a state of a thermodynamic system at a given time); then the probability distribution f(x) of this variable for a system at maximum entropy (i.e., at equilibrium state) turns out to be approximately Gaussian with mean μ = 0:
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{x^2}{2\sigma^2}\right), \tag{1}$$
where the variance is defined by the mean squared fluctuation σ2 = 〈x2〉, which is an average over many ensembles (i.e., an average over many realizations of the system). Hence, the probabilities of observing negative ($\int_{-\infty}^{0} f(x)\,dx$) and positive fluctuations ($\int_{0}^{+\infty} f(x)\,dx$) are equal at equilibrium.
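
As a quick numerical illustration of this symmetry (our own sketch, not part of the original analysis), sampling equilibrium fluctuations from such a Gaussian confirms that negative and positive deviations occur with equal frequency:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0                                 # assumed fluctuation scale
x = rng.normal(0.0, sigma, size=1_000_000)  # equilibrium fluctuations, mu = 0

# Both frequencies approach 0.5: negative and positive fluctuations
# are equally likely at equilibrium.
print((x < 0).mean(), (x > 0).mean())
```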

The fluctuation theorem [5, 6] quantifies the probability of violations of the second law. For short intervals it can be said that nature was running in reverse, because the sign of entropy production is used to define the direction of the arrow of time. Even in reference to living systems this might be said. For example, small “machines” within a cell (e.g., mitochondria) are likely to run in reverse from time to time. A transfer of this concept to the macro world is typically denied categorically. In a review of Wang et al. [27], Gerstner [8] wrote: “For larger systems over normal periods of time, however, the second law of thermodynamics is absolutely rock solid.”

Generally the fluctuation theorem is said to be applicable only to the micro world, where Brownian motion can be observed. Admittedly, this is a well-founded assumption. However, what if we allow dissipation of energy in the first place, separate the system into two subsystems consisting of the self-organizing part and a heat bath, and then observe only the behavior in the self-organizing half of the system? That way one could argue that we simulate the micro world by a macro system at the cost of lost heat. This concept (see Figure 1) is, for example, taken into account by Smith [25] when stating
$$\frac{dQ}{T} = -dS + k_B\, dI \tag{2}$$
for an increment of heat dQ rejected by the system to a thermal bath at temperature T, where kB is the Boltzmann constant, −dS is the reduction in entropy of the (sub)system's internal state, and dI is the increase in information (note that Smith [25] defines information as "the reduction in some measure of entropy"). Note that the mere property of being dissipative is not sufficient to explain a self-organizing system. In addition to squandering energy, the system has to generate orderly structures. Dissipation is only a necessary condition for negative entropy production; additional sufficient conditions exist. In the case of Rayleigh-Bénard convection [2], for example, initially fluctuating flows [29] occur that are enhanced and trigger the formation of Bénard cells in spontaneous symmetry breaking (cf. also [9, 16]). We want to point out that the self-amplification of fluctuations is such a sufficient condition here.
Figure 1. 

Schematic of a system divided into a heat bath with increasing entropy and a self-organizing, dissipative subsystem with decreasing entropy.

In this article, we report empirical evidence that the negative entropy production in emergent self-organizing systems is based initially on frozen accidents allowed by the original fluctuation theorem, which in turn leads in the end to a global behavior that is described by an inversion of the fluctuation theorem in dissipative self-organizing systems. This concept might bear the potential of embedding the concept of emergent behavior in multi-agent systems (swarms, self-propelled particles, etc.) in a theoretical framework built on sound foundations of theories from physics. Hence, we propose an approach to understand emergent behavior through thermodynamics which follows up our earlier-reported concept [11].

In addition, the relation to the fluctuation theorem might allow us to define preconditions for effective self-organizing systems in the future. For example, one can define minimum requirements for the agents of the system concerning their cognitive abilities in order to be able to leverage fluctuations. The agent needs sensors that allow it to estimate, at least probabilistically, whether the (local) entropy has just decreased. Furthermore, the system needs the ability to store such local fluctuations.

In the following we describe the fluctuation theorem and the investigated scenarios. We analyze the multi-agent system or swarm, discuss how the results could be viewed as obeying an inverted fluctuation theorem, and conclude by giving a short summary and outlook.

2 Fluctuation Theorem and Entropy Measures

According to Evans and Searles [6], the set of fluctuation theorems “gives an analytical expression for the probability of observing Second Law violating dynamical fluctuations in thermostatted¹ dissipative non-equilibrium systems.” One of these theorems (the steady state fluctuation theorem) applies to time-reversible, thermostatted, ergodic dynamical systems and yields the relation of fluctuations [6]
$$\frac{P(\bar{\Omega}_t = A)}{P(\bar{\Omega}_t = -A)} = e^{At} \tag{3}$$
for the time-averaged entropy production Ω̄t. The fluctuation theorem compares the probabilities of observing a certain time-averaged entropy production A and observing its negative −A. The numerator describes the probability of finding the system initially in those states that subsequently generate bundles of trajectory segments with the time-averaged value A. The theorem in Equation 3 predicts an exponential increase of the ratio P(Ω̄t = A)/P(Ω̄t = −A) with time. Hence, with increasing time, positive-entropy-producing trajectories become exponentially more likely than their negative-entropy-producing counterparts.
As a consequence of the fluctuation theorem one obtains the second-law inequality
$$\langle \bar{\Omega}_t \rangle \geq 0, \tag{4}$$
which states that the average over many ensembles, in which the time-averaged entropy productions were measured, is nonnegative. Hence, the fluctuation theorem is in accordance with the second law of thermodynamics.
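
To make this constraint concrete, consider a toy example of ours (not from the paper): if the total entropy production over time t is Gaussian with a variance of twice its mean, Equation 3 holds exactly, the log-ratio of the histogram recovers the linear law, and the ensemble average stays positive, consistent with Equation 4:

```python
import numpy as np

rng = np.random.default_rng(0)
mean_prod = 2.0  # assumed mean total entropy production
# A Gaussian with variance 2 * mean satisfies ln[P(A)/P(-A)] = A exactly.
omega = rng.normal(mean_prod, np.sqrt(2 * mean_prod), size=5_000_000)

counts, edges = np.histogram(omega, bins=80, range=(-8.0, 8.0))
centers = 0.5 * (edges[:-1] + edges[1:])
pos = centers > 0
log_ratio = np.log(counts[pos][:5] / counts[~pos][::-1][:5])
print(log_ratio / centers[pos][:5])  # ~ 1.0: the predicted linear slope
print(omega.mean())                  # > 0: the second-law inequality

```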
In the following we want to measure entropy productions and consequently entropy in swarm systems. It is important to note that there is not one generally applicable measure of entropy. Within statistical mechanics entropy is defined, in general, based on a set of microstates and their probabilities pi:
$$S = -k_B \sum_i p_i \ln p_i. \tag{5}$$
A microstate is said to be a detailed description of the system's current configuration. A macrostate is a much shorter description of such a configuration, usually based on averages, and hence summarizes a set of microstates. Macrostates are defined with an application in mind, as stated by Jaynes [14]:

But it is also possible that two experimenters assign different entropies […] to what is in fact the same microstate […] without either being in error. That is, they are contemplating a different set of possible macroscopic observations on the same system, embedding that microstate in two different reference classes […]. It is necessary to decide at the outset of a problem which macroscopic variables or degrees of freedom we shall measure and/or control.

In order to measure probabilities of microstates effectively, one needs to define a countable set of microstates, which is usually done by coarse-graining (an example in the context of multi-agent systems is given by Parunak and Brueckner [17]). In the context of entropy production it is also important to determine whether negative or positive entropy production is observed. Note that this sign of entropy production might be opposite to the actual mathematical sign of a certain entropy measure. According to the statistical physics interpretation, this sign of entropy production is defined relative to the entropy of that macrostate that has the highest number of microstates. According to thermodynamics it is defined relative to the entropy of that state with the lowest amount of energy that can be used to do thermodynamic work. That macrostate is the equilibrium state, which would be achieved in the context of this article by mere random behavior of the particles or agents, that is, a uniform distribution of agents. Thus, any process that leaves the equilibrium state and enters states of lower entropy shows negative entropy production.
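
As a minimal sketch of such an entropy computation (the helper name and example probabilities are ours), Equation 5 applied to coarse-grained microstate probabilities shows directly that the uniform (equilibrium) distribution maximizes entropy and that any concentration of probability lowers it:

```python
import numpy as np

def gibbs_entropy(p, k_b=1.0):
    """Equation 5 for a finite set of microstate probabilities p_i."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                         # convention: 0 * ln 0 = 0
    return -k_b * (p * np.log(p)).sum()

# Uniform probabilities (the equilibrium macrostate) maximize entropy;
# concentrating probability in a few microstates lowers it.
print(gibbs_entropy(np.full(8, 1 / 8)))      # ln 8 ~ 2.079
print(gibbs_entropy([0.9] + [0.1 / 7] * 7))  # ~ 0.52, lower
```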

3 Investigated Scenarios

We investigate two scenarios. In the first one the swarm is controlled by the BEECLUST algorithm. The second scenario is a simple clustering process and is introduced to give evidence that our approach has potential for general applicability.

3.1 BEECLUST Algorithm

The BEECLUST algorithm can be considered a model algorithm for swarms. It is based on observations of young honeybees [26], has been analyzed in many models [10, 12, 13, 20, 21], and has even been implemented in a swarm of robots [22].

This algorithm allows a swarm to aggregate at a maximum of a scalar field although individual agents do not perform a greedy gradient ascent. In addition, a BEECLUST-controlled swarm is able to break symmetries [12] of equal maxima in the scalar field, as also observed here (e.g., see Figure 2). Hence, it might be justified to call this emergent behavior. Under this algorithm, three agents will stop when they approach each other (i.e., the stopping threshold is set to three; note that in previous works typically two agents were enough for stopping; the actual choice of this stopping threshold is, however, irrelevant in this article), measure the local value of the scalar field, and wait for some time proportional to this measurement. Clusters form, and finally the swarm will be aggregated close to the global optimum of the scalar field (see the lower part of Figure 2). See Figure 3 for a definition of the BEECLUST algorithm, and the sketch below for one possible implementation.
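
The following is a minimal discrete-time sketch of one BEECLUST update step, our reading of Figure 3; the shape of the bimodal field, the exact waiting-time scaling, and all names are our assumptions, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
L, W, R, V, W_MAX, N = 150.0, 50.0, 3.5, 4.0, 660, 25  # parameters of Table 1

def scalar_field(x):
    # Hypothetical bimodal field with two equal maxima; the article only
    # states bimodality, so positions and shape here are assumptions.
    return np.exp(-(x - L / 4) ** 2 / 200.0) + np.exp(-(x - 3 * L / 4) ** 2 / 200.0)

pos = rng.uniform([0.0, 0.0], [L, W], size=(N, 2))  # uniform random start
ang = rng.uniform(0.0, 2 * np.pi, size=N)           # random initial headings
wait = np.zeros(N, dtype=int)                       # 0 = moving

def step():
    moving = wait == 0
    pos[moving] += V * np.column_stack((np.cos(ang[moving]), np.sin(ang[moving])))
    # Reflect off the walls (angle of incidence equals angle of reflection).
    for d, lim in ((0, L), (1, W)):
        below, above = pos[:, d] < 0, pos[:, d] > lim
        pos[below, d] *= -1.0
        pos[above, d] = 2 * lim - pos[above, d]
        hit = below | above
        ang[hit] = (np.pi - ang[hit]) if d == 0 else -ang[hit]
    # Stop when a moving agent perceives two other moving agents (threshold 3).
    d2 = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    near = (d2 < R ** 2) & moving[:, None] & moving[None, :]
    np.fill_diagonal(near, False)
    for i in np.where(moving)[0]:
        if wait[i] == 0 and near[i].sum() >= 2:
            group = np.append(np.where(near[i])[0], i)
            # Waiting time proportional to the locally measured field value.
            wait[group] = int(W_MAX * scalar_field(pos[i, 0]) / scalar_field(L / 4))
    # Count down waiting times; on wake-up, leave in a random direction.
    waking = wait == 1
    ang[waking] = rng.uniform(0.0, 2 * np.pi, waking.sum())
    wait[wait > 0] -= 1
```

Calling step() repeatedly reproduces the qualitative behavior: clusters form by chance anywhere, persist longer where the field is high, and eventually one large cluster dominates.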

Figure 2. 

Bottom: Typical state of a swarm controlled by BEECLUST; positions of stopped agents (circles) and moving agents (triangles) with trajectories of the last 20 time steps. Contours show levels of the scalar field. Top: Function used in Equation 8.

Figure 3. 

The BEECLUST algorithm (stop threshold of three agents).

The collective aggregation close to the global optimum is achieved via a positive feedback process [12]. Clusters of three stopped agents will form by chance anywhere in the arena. The area covered by clusters grows with the number of contained agents, and clusters covering a bigger area are more likely to be approached, by chance, by moving agents. Hence, bigger clusters will grow faster. The intensity of this positive feedback is inhomogeneous in the arena. Agents in clusters closer to the global optimum have longer waiting times. These clusters will exist longer than those that are farther away from the global optimum. Hence, the chance of growing is bigger for clusters closer to the global optimum. This process, typically, generates one big cluster close to the global optimum. The agents interact only locally and, as noted above, a BEECLUST-controlled swarm breaks symmetries. Hence, this behavior is different from other aggregation processes, for example, star formation, which includes global interactions due to gravitation.

In the following experiments, the agents have initially random headings, are in the state moving, and are uniformly randomly distributed in the arena. The scalar field is bimodal with maxima of the same value and shape (see contours in Figure 2). See Table 1 for the standard parameters used.

Table 1. 

Parameter settings used in this work.

Arena dimensions: 150 × 50 [length units]²
Proximity sensor range: 3.5 [length units]
Maximum waiting time: 660 [time units]
Velocity: 4 [length units]/[time unit]
Number of agents: 25

3.2 Clustering

The second scenario that we investigate is simple clustering of agents. It is similar to the BEECLUST-controlled swarm. However, the agents have constant waiting times independent of any scalar field, they move on a torus, and we introduce noisy sensors. The missing scalar field and missing wall effects remove the preferred regions of cluster formation that are seen in the BEECLUST scenario. The parameters used are the same as given in Table 1. We implement sensor noise in a simple way by an agent-agent recognition probability γ; that is, even if two agents are mutually within sensor range, they might not perceive each other and might not stop. We set γ = 0.75; that is, an agent will perceive another agent in its sensor range 75% of the time. The noise is uncorrelated in time. Noise was introduced because deterministic clustering seems to introduce artifacts into the entropy production histograms.
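
The noisy perception reduces to one short predicate (a sketch; the function name is ours):

```python
import numpy as np

rng = np.random.default_rng(0)
GAMMA = 0.75        # agent-agent recognition probability
SENSOR_RANGE = 3.5  # proximity sensor range from Table 1

def perceives(dist):
    """Even within sensor range, an agent detects a neighbor only with
    probability GAMMA; the noise is drawn independently at every time step."""
    return dist <= SENSOR_RANGE and rng.random() < GAMMA
```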

4 Analysis of Scenarios

4.1 Analysis of BEECLUST

We consider a system of N agents that move in a two-dimensional box and scalar field. The particles are always found in one of two states: moving with constant velocity or stopped.² We assume the following equations of motion for each agent i:
$$\dot{q}_i(t) = \begin{cases} p_i(t), & \text{if agent } i \text{ is moving,} \\ 0, & \text{if agent } i \text{ is stopped,} \end{cases} \tag{6}$$
$$\dot{p}_i(t) = F_i(t), \tag{7}$$
where qi = (xi, yi) is the position of agent i, pi is the momentum (for unit mass), and pi′ is the value of pi at the time the agent stopped, which is restored once the agent resumes moving. Whenever the agent moves undisturbed by walls or by other agents we have |Fi| = 0. If the agent bounces off the bounds or closely approaches another agent, we have a force |Fi| > 0 that separates agents from each other or, in the case of approaching a wall, implements the regular behavior of a billiard ball (angle of incidence equals angle of reflection). This can be implemented, for example, via a Weeks-Chandler-Andersen (WCA) potential (see [28]), which is a purely repulsive potential. As the thermostat method we use velocity scaling; that is, based on the current pi, we calculate a velocity scaling factor (see Equation 10 below). Consequently we can scale time accordingly, so that the scaled time is governed by the number of stopped agents. In particular, the special periods of time in which all agents are stopped are converted to time periods of no extent. Note that this is only our method of measuring the self-organizing system. It is not intrinsic to the system, and the behavior of the agents is unaffected by it.
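
The thermostat bookkeeping can be sketched as follows; the moving-fraction rule for the scaled time is our reading of this construction, not the authors' code:

```python
import numpy as np

K_B = 1.0  # Boltzmann constant in simulation units

def kinetic_temperature(p):
    """Temperature from the momenta (cf. Equation 10): p has shape (N, 2),
    N_d = 2N degrees of freedom, unit mass."""
    n_d = 2 * len(p)
    return (p ** 2).sum() / (n_d * K_B)

def scaled_time_increment(dt, n_stopped, n_total):
    """Advance the measurement clock in proportion to the moving agents,
    so intervals in which all agents are stopped have no extent."""
    return dt * (n_total - n_stopped) / n_total
```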

The system dynamics takes place in a high-dimensional phase space (q0, q1,…, qN−1, p0, p1,…, pN−1) ∈ Γ. In the following we need to detect the essentials of this dynamics by a measure of entropy. First, we present an advantageous choice of an entropy measure, and in Section 5.1 we give an alternative, which is, however, less beneficial.

We ignore the momenta p and also the y-positions, because the main feature of the clusters is defined by the agents' x-positions (see Figure 2). Ignoring the momenta does not hide entropy. We start with all nonzero momenta, and during the experiments we have inhomogeneous momentum distributions, but the experiments typically end with almost all agents stopped (i.e., a homogeneous momentum distribution). Similarly to [6, Section 4.3], we measure the agent density modulation ρb of the BEECLUST scenario via
$$\rho_b(k, t) = \sum_{i=0}^{N-1} \cos(k x_i(t)), \tag{8}$$
where xi(t) is the x-position of agent i at time t, k = 2π/L, and L = 150 is the box length. The applied sinusoidal function is shown in Figure 2 (top). Agents in the leftmost and rightmost quarters of the arena contribute positively; agents in the middle half contribute negatively. In equilibrium, xi ∈ [0, L] is uniformly distributed when averaged over many ensembles, yielding 〈ρb〉 = 0. Conversely, averages of 〈ρb〉 ≠ 0 correspond to unequal distributions of agents, with negative and positive values indicating whether the main cluster is in the middle or at the ends.
Following Evans and Searles [6], we define a dissipation function Ωb(Γ) that gives the entropy production for a given phase space trajectory. We integrate changes of ρb over a time interval [0, t]:
$$t\,\bar{\Omega}_b(t) = \beta \left( \rho_b(k, t) - \rho_b(k, 0) \right) \tag{9}$$
and
$$\beta = \frac{1}{k_B T} = \frac{N_d}{\sum_{i=0}^{N-1} p_i^2} \tag{10}$$
is the reciprocal temperature of the initial ensemble, with kB the Boltzmann constant and Nd = 2N the number of degrees of freedom. The distribution of the entropy production for N = 25 agents controlled by the BEECLUST algorithm, which were initially uniformly distributed, is shown in Figure 4 for t = 1,500. The initial random uniform distribution yields 〈ρb(0)〉 = 0, which is the state of maximal entropy. Hence, any distribution of the entropy production with a mean of 〈tΩb〉 ≠ 0 indicates negative entropy production (i.e., averaged differences of the density modulation can have negative or positive signs, but imply negative entropy production if they are nonzero). The ensemble average is 〈tΩb〉 ≈ 15.77, which means that negative entropy is produced (initially at maximum entropy). Note that there is no direct influence by the scalar field on the entropy productions, which are based on the agents' x-positions. Furthermore, the waiting times, which are determined by the scalar field, vary only by a factor of five between the minimum and the maximum.
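
The per-trajectory measurement can then be sketched as follows (assuming the cosine form of Equation 8 reconstructed above; function names are ours):

```python
import numpy as np

L = 150.0
K = 2 * np.pi / L

def rho_b(x):
    """Density modulation (Equation 8): agents near the arena ends
    contribute positively, agents near the middle negatively."""
    return np.cos(K * np.asarray(x)).sum()

def entropy_production(x_start, x_end, beta):
    """Integrated dissipation t * Omega_b over [0, t] (Equation 9)."""
    return beta * (rho_b(x_end) - rho_b(x_start))
```

Collecting this quantity over many independent runs yields histograms such as the one in Figure 4.
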
Figure 4. 

Distribution of the entropy production for a swarm controlled by the BEECLUST algorithm; t = 1,500, 〈tΩb〉 ≈ 15.77, T = 909.1; number of samples, n ≈ 5.0 × 106.

Now we want to apply the fluctuation theorem (Equation 3) to this system. For this purpose we have to assume time reversibility, which is problematic because BEECLUST-controlled systems are in general not reversible [11]. However, we argue that it is fair to assume approximate reversibility, because the irreversibility vanishes once the agents aggregate and experience almost equal scalar field values (typically the difference is only about ±10%), determining almost equal waiting times and almost equal wake-ups. Hence, in a time-reversed setting there would predominantly be valid cause-and-effect processes. It might appear counterintuitive to distinguish between ρb(k, t) − ρb(k, 0) = A and ρb(k, t) − ρb(k, 0) = −A, because here both refer initially to negative entropy production. This indistinguishability, however, holds only for the initial phase. In later phases, ρb(k, t) − ρb(k, 0) > 0 and ρb(k, t) − ρb(k, 0) < 0 are to be distinguished, because the system has on average a bias toward nonzero values of ρb(k, t), in particular toward positive values (i.e., the system operates most of the time in the positive half of Figure 8, discussed later). Consequently, a change of ρb(k, t1) − ρb(k, t0) < 0 takes the system back toward the maximal-entropy state for ρb(k, t0) > 0, corresponding to positive entropy production (and vice versa for ρb(k, t1) − ρb(k, t0) > 0), as discussed in Section 2. Applying the fluctuation theorem gives
$$\frac{P(t\bar{\Omega}_b = A)}{P(t\bar{\Omega}_b = -A)} = e^{A}. \tag{11}$$

In Figure 5 the data shown in Figure 4 is tested for whether it obeys Equation 11. The fluctuation theorem is satisfied for this system although the system produces negative entropy and actually abandons the equilibrium to which it was initialized. Hence, one could speak of an inverted fluctuation theorem that is satisfied here.
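
The test of Figure 5 amounts to mirroring the histogram of entropy productions and taking a log-ratio; a sketch (bin count and range are our choices):

```python
import numpy as np

def ft_test(samples, bins=80, lim=40.0):
    """Return (A, Y) with Y = ln[P(t*Omega_b = A) / P(t*Omega_b = -A)];
    Equation 11 predicts Y = A, i.e., a line of slope one."""
    counts, edges = np.histogram(samples, bins=bins, range=(-lim, lim))
    centers = 0.5 * (edges[:-1] + edges[1:])
    pos = centers > 0
    p_pos = counts[pos]
    p_neg = counts[~pos][::-1]         # the bin mirrored at -A
    ok = (p_pos > 0) & (p_neg > 0)     # need samples on both sides
    return centers[pos][ok], np.log(p_pos[ok] / p_neg[ok])
```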

Figure 5. 

Test of the entropy production distribution of the BEECLUST-controlled swarm shown in Figure 4 against the fluctuation theorem (Equation 11) with Y = ln[P(tΩ̄b = A)/P(tΩ̄b = −A)]. Note that any Y ≠ 0 corresponds to negative entropy production.

In the following we want to investigate how it is possible for this self-organizing system to produce negative entropy. We hypothesize that the negative entropy production is based on fluctuations and the stopping behavior of the agents, and hence is a process of frozen accidents.

We start our analysis with a measurement of the entropy production within a limited time interval [t0 = 15, t1 = 20] in the early transient. In addition, we classify for each measurement whether at least one agent changed its state from moving to stopped (starting agents do not occur that early in the simulation). The entropy production distributions for these two classes are shown in Figure 6. For the measurements without a stopping agent the averaged change in the density modulation is about zero (〈(t1 − t0)Ω̄b〉 ≈ 0.06). In contrast, for those measurements with stopping agents the averaged change of density modulation is negative (〈(t1 − t0)Ω̄b〉 ≈ −3.09), indicating frozen accidents. For much later time intervals no difference between measurements with and without stopping agents is found. The negative value of 〈(t1 − t0)Ω̄b〉 demands clarification, because in the limit t → ∞ the average density modulation is positive.

Figure 6. 

Distributions of the entropy production for the BEECLUST-controlled swarm at an early time interval during the transient (t0 = 15, t1 = 20, T = 909.1), classified according to whether a stopping agent was observed during the measurement. (a) No stopping, 〈(t1 − t0)Ω̄b〉 ≈ 0.06, n ≈ 8.6 × 106; (b) stopping, 〈(t1 − t0)Ω̄b〉 ≈ −3.09, n ≈ 4.1 × 105.

The explanation is based on a special feature of the BEECLUST-controlled swarm in this scenario, which consists of three phases (see Figure 7). In the short period before the first cluster forms, the average entropy production is 〈Ωb〉 = 0, indicating that the original fluctuation theorem holds for this phase. The first cluster usually does not form close to the global optima, but relatively close to the middle of the arena; see Figure 7a. In this area the agent density modulation (Equation 8) contributes negatively. In a second phase the average density modulation is negative (〈Ωb〉 < 0) because the density close to the middle of the arena increases further; see Figure 7b. This is also indicated by the evolution of the agent density modulation over time as shown in Figure 8. Initially it stays close to 0, and only later does it clearly take a positive sign. The inset shows details of the first 50 time steps and indicates a negative slope for the time interval [15, 20] (i.e., the second phase) of Figure 6. Only later do the clusters move toward the ends of the arena, probably aided by wall effects (see Figure 7c; agents are more likely to approach the cluster again after having left it on the side of the cluster that is closer to one of the ends of the arena), and consequently the average density modulation is positive (〈Ωb〉 > 0).

Figure 7. 

The three phases observed in the BEECLUST scenario, each with a representative entropy production histogram and a plot of the arena showing moving (triangles) and stopped agents (circles), with a line indicating their most recent trajectory (histograms are meant to be qualitative). (a) Typical state at the time before the first occurrence of an agent-to-agent collision with 〈Ωb〉 = 0, 0 < t < 15, ρb(k, 0) = 2.1, ρb(k, 15) = −0.9, 15Ω̄b = −3.0. In the electronic version, blue marks indicate the Ωb value in the histogram and just-stopped agents. (b) Typical state of the early transient with 〈Ωb〉 < 0 (cf. Figure 6b), 15 < t < 20, ρb(k, 15) = −1.7, ρb(k, 20) = −3.6, 5Ω̄b = −1.9. In the electronic version, blue marks indicate the Ωb value in the histogram and just-stopped agents. (c) Typical state when approaching the self-organization equilibrium (cf. Figure 4), 20 < t < 200, ρb(k, 20) = 0.1, ρb(k, 200) = 10.1, 200Ω̄b = 10.0. In the electronic version, blue marks indicate the Ωb value in the histogram and the main clusters.

Figure 8. 

Evolution of the agent density modulation over time in the BEECLUST scenario. Black line shows the ensemble average; gray lines show samples; the inset shows details of the ensemble average within the first 50 time steps.

4.2 Analysis of Clustering

The clustering scenario is simple, and it serves in this article as a second example to show the generality of our approach. Hence we reduce the analysis to a comparison of the entropy production and the test of the entropy production distribution with those obtained in the BEECLUST scenario.

By analogy to the above density modulation, we use a function
$$\rho_c(t) = \frac{1}{N^2 d_{\max}} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \operatorname{dist}\!\left(q_i(t), q_j(t)\right), \tag{12}$$
defined by a normalized sum of all entries of the distance matrix, to construct the dissipation function. The function dist(·, ·) gives the distance between two points on a torus, and dmax is the maximal distance between two points on the torus. Consequently the dissipation function is
$$t\,\bar{\Omega}_c(t) = \rho_c(t) - \rho_c(0). \tag{13}$$
We ignore any additional constants for this scenario. The agents are initialized to uniformly randomly distributed positions on the torus. On average this uniform distribution corresponds to high entropy, which is reflected in high values of ρc. On average these high values cannot be increased by the defined swarm behavior. On forming clusters, distances between several agents decrease, and consequently ρc will decrease. Hence, we expect negative values of tΩ̄c for the majority of runs. After t = 80 time steps the entropy production tΩ̄c is measured. We choose this relatively early state because the system converges fast, and consequently only a few positive entropy productions are observed later, which complicates the statistical analysis. The obtained distribution is shown in Figure 9a. In contrast to Figure 4, the distribution is clearly asymmetrical, and the average 〈tΩ̄c〉 ≈ −0.015 (median: −0.0091) is much closer to zero and negative, indicating negative entropy production as expected.
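
The torus measure of Equation 12 can be sketched as follows; the normalization by N²dmax is our reading of “normalized sum”:

```python
import numpy as np

def rho_c(pos, size=(150.0, 50.0)):
    """Normalized sum of all pairwise torus distances (Equation 12)."""
    size = np.asarray(size)
    d = np.abs(pos[:, None, :] - pos[None, :, :])
    d = np.minimum(d, size - d)               # wrap-around distance per axis
    dist = np.sqrt((d ** 2).sum(-1))
    d_max = np.sqrt(((size / 2) ** 2).sum())  # maximal distance on the torus
    n = len(pos)
    return dist.sum() / (n ** 2 * d_max)
```
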
Figure 9. 

Distribution and test of the entropy production for the clustering swarm on a torus (described in Section 3.2): t = 80; 〈tΩc〉 ≈ −0.015; T = 3.1; number of samples, n ≈ 8.5 × 106. (a) Distribution of the entropy production. (b) Test of the entropy production distribution (based on Equation 11).

The test of the entropy production distribution according to the fluctuation theorem and by analogy to Equation 11 is shown in Figure 9b. The data is close to linearity, but shows systematic deviations (too small for A < 0.01 and too large for 0.015 < A < 0.03). Given the fast convergence of the system to a majority of trajectories with negative entropy productions, this is still a satisfactory result.

5 Generality

To support our claim that this approach bears the potential of general applicability, we report the influence of an alternative entropy measure and investigate the influence of temperature in the following.

5.1 Independence from the Entropy Measure

The entropy measure that might be the most obvious to apply in the context of this article is probably the information-theoretic entropy of Shannon [24]. The Shannon entropy, however, is discrete, and our system has many continuous features apart from the discrete concept of agent numbers. The continuous extension of the Shannon entropy (differential entropy) is also not directly applicable, because we would have to sample a probability distribution from only N samples represented by agent positions. To get from continuous state space to discrete microstates, we have to do some form of coarse-graining. Unfortunately there is an ambiguity because there are many possibilities to implement coarse-graining. Still, at least two conditions can be fixed. First, the dynamics of the obtained entropy measure should parallel those of the system, and second, it should capture its main features.

We implement the coarse-graining by a standard method. We put a grid over the arena with tile side length 10, generating 75 tiles (cf. Table 1). Hence, we do not ignore the y-positions as above, but we still ignore the momenta. We count two microstates: unoccupied and occupied tiles (ignoring the actual number of agents in a tile). This way we sample two probabilities, p0(t) and p1(t) = 1 − p0(t). The Shannon entropy is given by S(t) = −∑ipi(t) log2pi(t), and the entropy production during a time interval [0, t] is given by
$$t\,\bar{\Omega}_S = S(t) - S(0). \tag{14}$$

Initially the agents are uniformly randomly distributed, which causes about as many tiles to be occupied as there are agents (for N ≪ 75), and we get p1(0) ≈ N/75. After some time, the agents cluster and the number of occupied tiles typically decreases (a typical value is p1(t) = 5/75 for N = 25). Hence, the average entropy production will be negative (〈tΩ̄S〉 < 0).
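
A sketch of this occupied/unoccupied coarse-graining (tile indexing and names are ours):

```python
import numpy as np

def tile_shannon_entropy(pos, tile=10.0, arena=(150.0, 50.0)):
    """Shannon entropy of the two coarse-grained microstates
    'tile occupied' / 'tile unoccupied' on a 15 x 5 grid (75 tiles)."""
    nx, ny = int(arena[0] // tile), int(arena[1] // tile)
    ix = np.clip((pos[:, 0] // tile).astype(int), 0, nx - 1)
    iy = np.clip((pos[:, 1] // tile).astype(int), 0, ny - 1)
    occupied = len(set(zip(ix.tolist(), iy.tolist())))
    p1 = occupied / (nx * ny)       # probability of an occupied tile
    p0 = 1.0 - p1
    return -sum(p * np.log2(p) for p in (p0, p1) if p > 0)
```

The production over [0, t] is then the difference of this quantity at times t and 0, as in Equation 14.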

The above experiments can be repeated with this entropy measure, and no qualitative change is detectable. However, there are some technical problems. Due to the discrete measure, abnormalities in the histograms occur, and the number of trajectories showing positive entropy production decreases fast with increasing time, which complicates the statistical analysis, for which we need well-filled histogram bins on both the positive and the negative side. Hence, the obtained data will not fully converge to a line according to Equation 11, and we cannot sample within long time intervals. The results are shown in Figure 10. The shapes of the curves are systematic, due to coarse-graining. Still, the overall trend to linearity is seen. We conclude that the above entropy measure based on a density modulation was a better choice.

Figure 10. 

Test of the entropy production distribution for Shannon entropy according to Equation 3 and for the BEECLUST scenario in analogy to Figure 5 for different times t ∈ {60, 100, 1,000, 2,000}; for t ≥ 1,000 only values for A ≤ 0.07 are given, because for A > 0.07 fast convergence does not allow a statistically sound analysis. Number of samples > 1.5 × 106 each.

5.2 Influence of Temperature

Both investigated scenarios are certainly more complex than a mere physical multiparticle system, because they also incorporate autonomous stops of agents, sensor ranges, and waiting times in addition to particle velocities. Still, we can accommodate most of them with the concept of temperature as defined in Equation 10. When we reduce the agents' velocities, we alter the temperature. However, we have to extend the concept of temperature to allow the comparison of systems with different agent velocities. To scale the system behaviors at different agent velocities completely, it is clear that we also have to scale waiting times and sensor ranges, which are not considered in Equation 10. Hence, we have to apply an extended concept of temperature that includes these velocity-dependent properties of agents as well. However, the definition and analysis of a general temperature concept for multi-agent systems is beyond the scope of this article. Here, we just scale speeds (halved), waiting times (doubled), and sensor ranges (halved) and obtain two systems with temperatures T1 = 2T2. We compare the entropy production distributions of these two systems in Figure 11 for a late state at time t = 6,000. The system at half temperature (T2) shows systematic deviations because it converges slower. Still, the deviations are small enough to conclude that, at least for some scenarios, an extended concept of temperature can be found that allows one to scale these systems and compare them across different temperatures.

Figure 11. 

Comparison of systems at two different temperatures T1 = 2T2 for the BEECLUST scenario. The first system is parametrized as in Table 1. The second system has halved speeds, doubled waiting times, and halved sensor ranges. The plot shows number of samples, n ≈ 2.5 × 106 each; t = 6,000.

6 Discussion

Note again that ρb(k, t) = 0 corresponds to maximum entropy. Therefore, any ρb(k, t) ≠ 0 in Figure 8 indicates negative entropy production. We conclude that the negative entropy production of this system is initiated by entropy fluctuations, which are normally distributed and are negative and positive with about the same probability according to the original fluctuation theorem and as seen in Figure 6(a). Some of the negative-entropy-production events are locally observable by the agents themselves, because there are simultaneous agent-to-agent encounters of three agents with mutual perception. This local perception of the global measure of entropy is leveraged by stopping all three agents and consequently storing the local entropy fluctuation. Cascades of such stopping behaviors generate a positive feedback (self-amplification of fluctuations as in Rayleigh-Bénard convection). In the end a system dynamics is generated that can be described by an inverted fluctuation theorem, which dictates an exponentially increasing probability of low-entropy states. Hence, this emergent self-organizing swarm does indeed rely on frozen accidents. Note that the overall system (i.e., including the heat bath) still produces positive entropy (e.g., due to accelerations of the agents), while the agent-position-based entropies are reduced only in the self-organizing subsystem.

The effectiveness of the frozen-accidents concept can easily be made clear by constructing a simple model. We represent the entropy contribution of each agent i by a random process Xi(t). The total entropy is just the sum ∑i Xi(t) over all N agents. The restriction of all random processes to the interval [−5, 5] is essential, and we define Xi(t) = 5 ∀t > t0, where t0 is the first time agent i achieved Xi(t0) = 5. That is, once a random process reaches Xi(t0) = 5 (a local property), it stays there forever—a frozen accident. As a consequence the number of active random processes Na will decrease monotonically. A sample run of this simple model for N = 25, based on Gaussian distributed Xi and initialization Xi(0) = 0, is shown in Figure 12. The bias in the otherwise random trajectory is noticeable. Note that the summation ∑i Xi of Gaussian distributed random variables, each having a variance of σi2, results in a random variable that is also Gaussian distributed with a variance of σ2 = ∑i σi2. With a decreasing number Na of active processes, more and more variances vanish (σi2 = 0). Hence, the variance of the sum will also decrease, which is the macroscopic effect of the frozen accidents and ensures that states of low entropy are much more likely to be maintained. A runnable sketch of this model is given below. The challenge in investigating self-organizing systems is, however, represented by the reverse engineering of such a model. One would start with an observed macroscopic effect as shown in Figure 12 and then decompose it into a set of subsystems that accidentally achieve low-entropy states by noise and then preserve them by freezing.
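
The sketch below reads the Xi as Gaussian random walks clipped to [−5, 5], which is one plausible reading of the model; the authors may have used a different random process:

```python
import numpy as np

rng = np.random.default_rng(1)
N, BOUND, T_MAX = 25, 5.0, 2000

x = np.zeros(N)                      # X_i(0) = 0
frozen = np.zeros(N, dtype=bool)
total = np.empty(T_MAX)
active = np.empty(T_MAX, dtype=int)

for t in range(T_MAX):
    dx = rng.normal(0.0, 1.0, size=N)                  # N(0, 1) increments
    x = np.where(frozen, x, np.clip(x + dx, -BOUND, BOUND))
    frozen |= x >= BOUND      # frozen accident: X_i stays at the bound forever
    total[t] = x.sum()        # the summed signal plotted in Figure 12
    active[t] = (~frozen).sum()  # N_a, the number of active processes

# The sum drifts toward N * BOUND while its fluctuations shrink,
# because every frozen process contributes zero variance.
print(f"final sum: {total[-1]:.1f}, active processes left: {active[-1]}")
```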

Figure 12. 

Sample run of a simple model based on summations of N = 25 random processes initialized to Xi(0) = 0, based on normally distributed random variables (μ = 0, σ2 = 1). A random process i becomes inactive once Xi(t) = 5 is reached. Na is the number of active random processes.

The reported results indicate that this emergent self-organized system obeys an inversion of the fluctuation theorem, which can be stated as
$$\frac{P(\bar{\Omega}_t = -A)}{P(\bar{\Omega}_t = A)} = e^{At}, \tag{15}$$
following Equation 3. We get an immediate interpretation of this self-organizing system by inverting the interpretation of the fluctuation theorem. A self-organizing system that is started with high entropy will produce negative entropy with an exponentially increasing probability over time. As a consequence there is a self-organization equilibrium of lower entropy to which the system will converge. As a second consequence, the self-organizing entropy-reduction behavior is a transient phenomenon (cf. [19, p. 62]).

7 Conclusion

In this article, we have analyzed emergent self-organizing multi-agent (or swarm) systems with methods based on and suggested by the fluctuation theorem. The results provide empirical evidence for the existence of an inverted fluctuation theorem that could prepare a wide basis for the analysis of self-organizing systems. We claim these methods have a potential for general applicability. This claim of generality is supported by the reported results with an alternative entropy measure and the results using a broader interpretation of the concept of temperature.

Specific exemplary benefits of such a theory could be the definition of preconditions for self-organization, for example, concerning the cognitive abilities of the agents. Statistical properties of fluctuations describe the timescales on which negative entropy productions can be observed locally. The agents need to perceive local samples of this global property of negative entropy production and need to react within these timescales. Hence, conditions for controller sampling rates could be derived. The agents need appropriate sensors that allow local measurements of entropy with an accuracy that is sufficiently higher than the rate at which events of negative entropy production occur. Thus, preconditions for successfully generating positive feedback could be derived.

In particular, the origin of BEECLUST confirms the possibility of applying the proposed methods to natural systems such as clustering behaviors in young honeybees [26] or other social insects, as well as flocks, herds, and shoals. Hence, the same methods could be used for artificial and natural systems, which could, in turn, enrich primarily biological studies.

This work has proved again that thermodynamics and statistical physics offer many fully developed methods that can often be applied even unmodified to problems of emergent behavior (cf. Hamann et al. [11]). Pursuing this research track might be a promising way of achieving general insights into still rather fuzzy concepts such as emergence or self-organization.

Finally, it is clear that the reported approach is truly interdisciplinary in combining methods and problems from physics, biology, and computer science. It is obvious that, at least in the field of artificial life, any future research success has to be founded on a combination of several scientific fields. In our future work, we hope to continue this approach by generalizing the concept of an inverted fluctuation theorem for emergent self-organizing multi-agent systems.

Acknowledgments

The authors thank Payam Zahadat and the anonymous reviewers for helpful comments that improved this article. This reported research is supported by EU-ICT CoCoRo, no. 270382; EU-IST-FET project SYMBRION, no. 216342; EU-ICT project REPLICATOR, no. 216240; FWF (Austrian Science Fund) Project REBODIMENT, no. P23943-N13; and the Austrian Federal Ministry of Science and Research (BM.W_F).

Notes

1 

In a thermostatted system the temperature is kept constant, for example, by rescaling the particles' velocities. The system can be thought of as being in contact with a large heat reservoir in order to thermostat the system [1]. A thermostat method is applied in Section 4.1.

2 

In any embodied implementation of this system (e.g., robots) the particles would be affected by friction and would consequently need a permanent acceleration that compensates for friction. This, in turn, means they would have an energy reservoir (cf. active particles [23]) and would permanently dissipate heat, which would result in a situation as shown in Figure 1. Energy costs have to be paid to allow self-organization and to comply with the second law of thermodynamics. Hence, we carry out the separation between the two subsystems: the self-organizing subsystem containing the agents, and the subsystem typified by the heat reservoir. Due to its energy dissipation, the self-organizing subsystem does not have to obey the second law of thermodynamics.

References

1. Andersen, H. C. (1980). Molecular dynamics simulations at constant pressure and/or temperature. Journal of Chemical Physics, 72(4), 2384–2393.
2. Bodenschatz, E., Pesch, W., & Ahlers, G. (2000). Recent developments in Rayleigh-Bénard convection. Annual Review of Fluid Mechanics, 32(1), 709–778.
3. Crick, F. H. (1968). The origin of the genetic code. Journal of Molecular Biology, 38(3), 367–379.
4. De Wolf, T., & Holvoet, T. (2005). Emergence versus self-organisation: Different concepts but promising when combined. In S. Brueckner, G. D. M. Serugendo, A. Karageorgos, & R. Nagpal (Eds.), Proceedings of the Workshop on Engineering Self Organising Applications (pp. 1–15). Berlin: Springer-Verlag.
5. Evans, D. J., Cohen, E. G. D., & Morriss, G. P. (1993). Probability of second law violations in shearing steady states. Physical Review Letters, 71, 2401–2404.
6. Evans, D. J., & Searles, D. J. (2002). The fluctuation theorem. Advances in Physics, 51(7), 1529–1585.
7. Gell-Mann, M. (1995). Plectics. In J. Brockman (Ed.), The third culture: Beyond the scientific revolution (pp. 316–332). New York: Touchstone Press.
8. Gerstner, E. (2002). Second law broken: Small-scale energy fluctuations could limit miniaturization. Nature Online News.
9. Haken, H. (1977). Synergetics—An introduction. Berlin: Springer-Verlag.
10. Hamann, H., Meyer, B., Schmickl, T., & Crailsheim, K. (2010). A model of symmetry breaking in collective decision-making. In S. Doncieux, B. Girard, A. Guillot, J. Hallam, J. Meyer, & J. Mouret (Eds.), From Animals to Animats 11 (pp. 639–648). Berlin: Springer-Verlag.
11. Hamann, H., Schmickl, T., & Crailsheim, K. (2011). Thermodynamics of emergence: Langton's ant meets Boltzmann. In IEEE Symposium on Artificial Life (IEEE ALIFE 2011) (pp. 62–69).
12. Hamann, H., Schmickl, T., Wörn, H., & Crailsheim, K. (2012). Analysis of emergent symmetry breaking in collective decision making. Neural Computing & Applications, 21(2), 207–218.
13. Hereford, J. M. (2011). Analysis of BEECLUST swarm algorithm. In Proceedings of the IEEE Symposium on Swarm Intelligence (SIS 2011) (pp. 192–198).
14. Jaynes, E. T. (1992). The Gibbs paradox. In C. R. Smith, G. J. Erickson, & P. O. Neudorfer (Eds.), Maximum entropy and Bayesian methods (pp. 1–22). Dordrecht: Kluwer Academic.
15. Maxwell, J. C. (1878). Tait's ‘Thermodynamics’ (I). Nature, 17, 257–259.
16. Nicolis, G., & Prigogine, I. (1977). Self-organization in nonequilibrium systems. New York: Wiley.
17. Parunak, H. V. D., & Brueckner, S. (2001). Entropy and self-organization in multi-agent systems. In AGENTS'01: Proceedings of the Fifth International Conference on Autonomous Agents (pp. 124–130). New York: ACM Press.
18. Polani, D. (2008). Foundations and formalizations of self-organization. In M. Prokopenko (Ed.), Advances in applied self-organizing systems. Berlin: Springer-Verlag.
19. Prigogine, I. (1997). The end of certainty: Time, chaos, and the new laws of nature. New York: Free Press.
20. Schmickl, T., & Hamann, H. (2011). BEECLUST: A swarm algorithm derived from honeybees. In Y. Xiao (Ed.), Bio-inspired computing and communication networks. Boca Raton, FL: CRC Press.
21. Schmickl, T., Hamann, H., Wörn, H., & Crailsheim, K. (2009). Two different approaches to a macroscopic model of a bio-inspired robotic swarm. Robotics and Autonomous Systems, 57(9), 913–921.
22. Schmickl, T., Thenius, R., Möslinger, C., Radspieler, G., Kernbach, S., & Crailsheim, K. (2008). Get in touch: Cooperative decision making based on robot-to-robot collisions. Autonomous Agents and Multi-Agent Systems, 18(1), 133–155.
23. Schweitzer, F. (2003). Brownian agents and active particles: On the emergence of complex behavior in the natural and social sciences. Berlin: Springer-Verlag.
24. Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379–423.
25. Smith, E. (2008). Thermodynamics of natural selection I: Energy flow and the limits on organization. Journal of Theoretical Biology, 252(2), 185–197.
26. Szopek, M., Radspieler, G., Schmickl, T., Thenius, R., & Crailsheim, K. (2008). Recording and tracking of locomotion and clustering behavior in young honeybees (Apis mellifera). In A. Spink, M. Ballintijn, N. Bogers, F. Grieco, L. Loijens, L. Noldus, G. Smit, & P. Zimmerman (Eds.), Proceedings of Measuring Behavior 2008, Vol. 6 (p. 327).
27. Wang, G. M., Sevick, E. M., Mittag, E., Searles, D. J., & Evans, D. J. (2002). Experimental demonstration of violations of the second law of thermodynamics for small systems and short time scales. Physical Review Letters, 89(5), 050601.
28. Weeks, J. D., Chandler, D., & Andersen, H. C. (1971). Role of repulsive forces in determining the equilibrium structure of simple liquids. Journal of Chemical Physics, 54(12), 5237.
29. Wu, M., Ahlers, G., & Cannell, D. (1995). Thermally induced fluctuations below the onset of Rayleigh-Bénard convection. Physical Review Letters, 75(9), 1743–1746.

Author notes

∗ Contact author.

∗∗ University of Paderborn, Department of Computer Science, Zukunftsmeile 1, 33102 Paderborn, Germany. E-mail: heiko.hamann@uni-paderborn.de

Artificial Life Lab of the Department of Zoology, Karl-Franzens University Graz, Universitätsplatz 2, 8010 Graz, Austria. E-mail: thomas.schmickl@uni-graz.at, karl.crailsheim@uni-graz.at