## Abstract

Finding the distribution of systems over their possible states is a mathematical problem. One possible solution is the method of the *most probable distribution* developed by Boltzmann. This method has been instrumental in developing statistical mechanics and explaining the origin of many thermodynamics concepts, like entropy or temperature, but is also applicable in many other fields like ecology or economics. Artificial ecosystems have many features in common with ecological or economic systems, but surprisingly the method does not appear to have been very successful in this field of application. The hypothesis of this article is that this failure is due to the incorrect interpretation of the method's concepts and mathematical tools. We propose to review and reinterpret the method so that it can be correctly applied and all its potential exploited in order to study and characterize the global behavior of an artificial multi-agent ecosystem.

## 1 Introduction

A primary goal for ecology is to predict numerical species densities [53]. A recent approach is to use mathematical tools and statistical-mechanistic concepts [10, 21–23, 34, 36, 54, 55, 64]. This has been somewhat successful and at the same time controversial [48, 59, 61, 62]. The approach adopts the statistical apparatus of statistical mechanics, which was originally conceived to describe the most probable distribution of the number of molecules in a system depending on its energy, in order to explain what the most probable species distribution is, depending on available resources and each individual's metabolism or genetic traits. Other initiatives are afoot to establish a more general model for use to explain evolution and other, more complex processes [6, 35], although they are basically theoretical models.

Another line of research in the field of artificial ecosystems composed of multi-agent systems aims to explain why some distributions, like the Pareto distribution, appear recurrently in the real economy. To do this, they simulate agent ecosystems that buy and sell to each other, and apply statistical-mechanistic techniques to predict the most probable distribution of the money or earnings among agents and their points of economic equilibrium [7, 13, 14, 19, 40, 41, 47, 67, 71].

Statistical mechanics tools are useful for predicting or deducing how a system composed of many individuals will behave globally, based on knowledge about the rules governing the behavior of each individual locally. These may or may not be mechano-physical rules, such as the two examples mentioned above whose rules are taken from biology and economics. Looking at physics, we find that statistical mechanics tools are powerful enough to deduce and explain why some distribution patterns of individuals appear more than others, why the population of some species of individuals in the system is larger than others, why some species segregate and others mix, how the intraspecies and interspecies equilibrium conditions depend on species traits and the environment, what conditions mark phase changes, why individuals are spatially distributed in one way or another, and why some individuals form groups and others do not. On these grounds, it would be reasonable to expect these tools to have been successful at analyzing agent systems like Echo [26], Tierra [56], Avida [51], or Sugarscape [16] and drawing conclusions about their global behavior that would constitute the system thermodynamics. This does not, however, appear to have been the case.

There have been, and still are, several general movements in this direction. One combines physical thermodynamics and Shannon's information entropy [8] to create a theoretical framework that relates the increase in the information that a system has about its environment to the increase in its complexity as the system evolves [3, 49]. Others try to directly equate physical energy with information, although it has been suggested that this is probably not the best approach [52]. And, finally, another group of authors have tried to establish a satisfactory interpretation of the general concepts of thermodynamics (energy, matter, work, entropy, temperature) for this type of system [12, 20, 28, 52, 66]. None of the above approaches is likely to qualify as the conceptual and mathematical groundwork of a general theory for artificial ecosystems. As Melanie Mitchell recently pointed out [49], “Having the right conceptual vocabulary and the right mathematics is essential for being able to understand, predict, and in some cases, direct or control self-organizing systems with emergent properties. Developing such concepts and mathematical tools has been, and remains, the greatest challenge facing the sciences of complex systems.” We believe that Mitchell is right, and that the concepts and mathematics used so far have been the wrong ones. In our view, this is due to a misinterpretation of the underlying statistical mechanics and thermodynamics concepts on three points: the association of the disorder concept with the second law of thermodynamics, the interpretation of entropy maximization as a physical law of nature, and the identification of the concept of entropy with Shannon's formula.

The association between the concept of disorder and entropy is a conceptual barrier that prevents the use of entropy to study systems that are self-organizing and also show other emergent properties closer to order than disorder. In fact, it is usually contended that these systems violate, or appear to violate, the second principle of thermodynamics because disorder rather than self-organization should emerge. The paradox is often resolved by resorting to Prigogine's theories [50], indicating that any loss of entropy within the system through self-organization is exported as disorder created in the environment. But the problem remains, because the export of entropy provides no explanation for how or why self-organization takes place [25]. The association of disorder with increased entropy is somewhat unfortunate, and there are many examples in nature that illustrate this point.

Let us place a box in interplanetary space where gravity is zero and inject a mixture of gases through a hole. The gases will certainly expand within the box and will not show a preference for any particular region within the box. This is the usual interpretation of disorder associated with entropy. But if we run a second experiment with the box now on the Earth's surface, we will observe that the gases concentrate in the parts of the box closest to the surface: Disorder has vanished; there is a greater concentration of gases in some areas than others. This is what happens in the atmosphere, a textbook example of entropy maximization [18, 70]. Let us run a third experiment where we add ink to a glass of water. The water and the ink will mix, which supports the usual interpretation of disorder associated with entropy. But if, in a fourth experiment, instead of ink we add oil to the water, we will find that the two liquids separate and do not mix; they are ordered, but entropy has again been maximized [11]. All four experiments end at their entropy maxima, but only two would be associated with the concept of disorder. In the other two, order has emerged without disorder increasing elsewhere, and the second law of thermodynamics has not been violated.

There are many other examples that demonstrate that the association of the disorder concept with the second principle of thermodynamics is unfortunate; accordingly, there is a tendency to explain the concept more carefully [1, 39, 42, 43] and a swing away from its use in physics and chemistry textbooks [44]. A priori, then, there is no conceptual obstacle to prevent a self-organizing system, or a system that exhibits emergent properties associated with some concept of order, from being studied by analyzing its entropy maximization.

The second impediment is the understanding that entropy maximization is a law of nature, such as the law of universal gravitation, an interpretation that immediately prompts a couple of questions. The first is whether the law in question is also applicable to an artificial system. The source of this question is primarily the definition of entropy based on thermodynamic quantities, like heat, work, or temperature, that neither provide a very straightforward explanation of entropy nor are easily translated to an artificial system. However, the equivalent statistical-mechanistic definition of entropy, *S* = *k* ln *W* (its quantities will be explained in Section 2) is really a mathematical definition based on counting combinations of any system property whatsoever, just as temperature, pressure, or chemical potential are also mathematically defined—in fact, partial derivatives of the entropy [11, 32]. On this ground, it should be applicable to any property or trait whose distribution among the agents of the artificial system is countable. The second question concerns the use of the number yielded by the entropy calculation, that is, what purpose it serves. The numerical value of entropy is really not much use at all. The reason for calculating the system entropy is to find what the most probable distribution of the properties of a system is and analyze its characteristics mathematically. Being the most probable distribution, it is the distribution toward which the system will tend, the distribution that we will observe. Therefore, by studying the mathematical origin of this distribution, we will be able to predict the properties and behavior of the system. It is not, therefore, its numerical value that is of interest, but its mathematical expression. We will then operate with and maximize this expression and examine the properties of the maximum. This is the reasoning behind the method, a mathematical reasoning that does not depend on any real physical trait. 
As Jaynes [30] points out, “In the problem of prediction, the maximization of entropy is not an application of a law of physics, but merely a method of reasoning which ensures that no unconscious arbitrary assumptions have been introduced.”

The third obstacle arises out of the understanding that the mathematical expression for the entropy, *S* = *k* ln *W*, is equivalent to Shannon's information entropy formula *H* = −∑*p*_{i} ln *p*_{i} (the quantities in this equation will be described in Section 3). In some cases, they are equivalent both in physics and in an artificial system, and the most probable distribution can be studied based on either formula. But this does not apply in many other cases. In physics, the concepts differ with the application of Bose-Einstein or Fermi-Dirac statistics or when working with polymer solutions [11]. The calculation of entropy is useful for analyzing the most probable distribution, and confinement to Shannon's formula resembles restriction to the form *F* = *mg* with respect to gravitational force. Under some circumstances, as on the Earth's surface, they are equivalent, but this is not generally the case. Shannon's formula is sometimes applied as a law, without pausing to consider whether its validity conditions are met or the purpose for which it is being calculated. Consequently, few useful conclusions can be drawn.

In Section 2, we explain the rationale for defining entropy as *S* = ln(*W*), and in Section 3 we then look at its connection to Shannon's expression in order to understand when this is and is not applicable. In Section 4, we present a basic artificial ecosystem, and in Sections 5 to 8 we calculate, with the help of entropy, its most probable distribution with respect to several agent characteristics under different conditions. In Section 9 we also deduce its points of equilibrium to get a description of its observable behavior in the form of laws (of thermodynamics). The aim of this research is not to create a new statistical mechanics or thermodynamics of agent ecosystems, because all the concepts and mathematical tools already exist; it is to demonstrate how they should be interpreted and applied correctly, that is, which method to follow. This method is known as the method of the most probable distribution [60] and is also known as the combinatorial method; entropy maximization is a mere arithmetical consequence of it [63].

The statement of the example is simple, and we explore points that are potentially applicable to other agent systems. The resulting laws are only applicable to the example ecosystem, but the important point of this article is not the laws but the explanation of how they have been found and the method that should be used. This is the major contribution of this article.

## 2 Entropy as *S* = ln(*W*) or *S* = ln(*P*)

Suppose that we have a closed system of *N* noninteracting agents, *A* = {*a*_{1},…, *a*_{N}}, which have a countable property *X* (which could be wealth, age, size, memory size, energy, money, CPU time, or position on a map, among others) taking a range of values *X* = {*x*_{1},…, *x*_{max}}. An agent will have a particular value of that property, *a*_{i} ← *x*_{j}, at any time. The term “noninteracting” refers to the system's feature that no value of that property can be assigned to the interaction of an agent with the others; there are no “interaction values,” so to speak. So, the system will be characterized by the set of values of that property taken by each one of the *N* agents, {(*a*_{1} ← *x*_{j}), (*a*_{2} ← *x*_{k}),…, (*a*_{N} ← *x*_{l})}, at any time. That set of values is what is known as a *microstate*; it will change as the rules of system operation alter the values of each agent at each time instant *t*. A microstate is a microscopic description of the system because it describes, for each individual agent, the value of its property.

Suppose that we want to investigate the most probable distribution of those values among agents. In other words, for a microstate, we count the *N*_{1} agents that have the value *x*_{1}, the *N*_{2} agents that have the value *x*_{2}, and so on, to find out what the most probable distribution ψ_{X} = {*N*_{1},…, *N*_{max}}, with *N* = ∑*N*_{i}, is. We will call the set ψ_{X} a system *macrostate*. This is a global description of the system, as it makes no distinction between individual agents. It makes no difference which agents are members of the group counted in an *N*_{i}; the important thing is the value of *N*_{i}.

A microstate is part of or belongs to a macrostate ψ_{X} if the values counted in its agents yield the same distribution {*N*_{1},…, *N*_{max}} that characterizes that macrostate. So, if two microstates differ only as to each agent's particular values but have the same global count {*N*_{1},…, *N*_{max}}, the two microstates will belong to the same macrostate ψ_{X}.

Let *W*(ψ_{X}) be the number of microstates that belong to the macrostate ψ_{X}, or, alternatively, let *W*(ψ_{X}) be the number of all possible combinations of values among all the agents that yield the distribution ψ_{X} = {*N*_{1},…, *N*_{max}}. Then the statistical probability of finding the system in any macrostate whatsoever is

*P*(ψ_{X}) = *W*(ψ_{X}) / ∑_{ψ} *W*(ψ).

The definition of entropy that we will give is equivalent in meaning to thermodynamic entropy. To express this, we will follow Planck's reasoning [38]. On the one hand, the most probable distribution of the system will be a distribution ψ_{X}∗ for which *W*(ψ_{X}∗) ≥ *W*(ψ_{X}). At that point, the system will be at equilibrium, as it will have reached its point of maximum probability, to which it will quickly return after any fluctuation. On the other hand, if the system is in a state of thermodynamic equilibrium, its entropy *S* will also be maximal, *S*(ψ_{X}∗), and it will hover around that point. Therefore, it is reasonable to think that *S*(ψ_{X}∗) and *W*(ψ_{X}∗) should be correlated (this simple idea was Boltzmann's stroke of genius [45]). They might be correlated directly, *S*(ψ_{X}∗) ∝ *W*(ψ_{X}∗), but this would be incorrect, because thermodynamic entropy is extensive, that is, the thermodynamic entropy of a system composed of two independent subsystems is the sum of the entropies of the subsystems, *S*_{total} = *S*_{A} + *S*_{B}. Calculating the combinations, however, the number that we get by combining two independent systems is *W*_{total} = *W*_{A}*W*_{B}, which involves a multiplication, meaning that the correlation cannot be direct. The correlation must be based on a function that grows and is maximized when *W*(ψ) grows or is maximized, but that is also extensive. The correct result is the well-known formula *S* = *k*ln(*W*), where the Boltzmann constant *k* is a normalization factor, because *S*_{total} = *k*ln(*W*_{total}) = *k*ln(*W*_{A}) + *k*ln(*W*_{B}) = *S*_{A} + *S*_{B}. Setting aside the units, we can use *S* = ln(*W*) as our definition of entropy. Thanks to the above definition of *P*(ψ_{X}), we obtain *S* = ln(*P*) + const, and to calculate the entropy we will be able to use either the number *W* of combinations that yield the distribution or its probability *P*.
In the remainder of the article, we will use whichever version simplifies the mathematics.
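The multiplicative-to-additive argument is easy to verify numerically. The following minimal Python sketch (our illustration; the macrostates are invented for the example) computes *W* as a multiset-permutation count and checks that *S* = ln(*W*) is additive over two independent subsystems:

```python
from math import factorial, log, prod

def multiplicity(macrostate):
    """W(psi): number of microstates (multiset permutations) yielding
    the occupation counts N_1, ..., N_max of the macrostate."""
    n = sum(macrostate)
    return factorial(n) // prod(factorial(ni) for ni in macrostate)

def entropy(macrostate):
    """S = ln(W), dimensionless (the constant k is set to 1)."""
    return log(multiplicity(macrostate))

# Two independent subsystems with invented occupation counts.
psi_A = [3, 2, 1]   # 6 agents spread over 3 property values
psi_B = [4, 4]      # 8 agents spread over 2 property values

W_A, W_B = multiplicity(psi_A), multiplicity(psi_B)
W_total = W_A * W_B                        # multiplicities multiply ...
S_total = entropy(psi_A) + entropy(psi_B)  # ... while entropies add
```

Because *W*_{total} = *W*_{A}*W*_{B}, the logarithm is precisely what turns the product of multiplicities into a sum of entropies.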

We want to study the most probable distribution, and we are not looking for equivalence with thermodynamic entropy, which was Boltzmann's purpose. Hence, we might have reservations about whether the definition of entropy as *S* = ln(*W*) is the best for use with artificial systems and whether it would not be better to work directly with probabilities rather than their logarithms. In actual fact, the only reason for using that definition is to make the mathematics more straightforward, as factorials very often appear in the calculation of possible combinations or probabilities, which are simplified with the logarithm using Stirling's approximation.

As we have seen, statistical mechanics interprets the increase of entropy as a consequence of the natural tendency to move from a less to a more probable state. This is the underlying concept of the second principle of thermodynamics expressed as *S*(*t* + Δ*t*) − *S*(*t*) = Δ*S* ≥ 0, as the system tends to move from a less probable to a more probable state, where the entropy is therefore greater. While it is true that there may be fluctuations and changes to less probable states, if the form of *W*(ψ_{X}) corresponds to multiset permutations, it can be demonstrated by means of the entropy concentration theorem [31, 32] that the distribution of *W*(ψ_{X}) is very sharp and is entirely concentrated, like a delta function, at the maximum, ψ_{X}∗, fluctuations being minimal. Hence, the system very quickly evolves in one direction toward the state of maximum probability or entropy. For this reason, this direction is associated with the direction of time [45], as, given two macrostates whose entropy or probability is calculable, we can confidently state which the start and end states are, simply by comparing their entropy or probability values. For the very same reason, such a change is said to be irreversible because the system is unlikely to switch back from a state of greater entropy or probability to the original state of lower entropy or probability, given that *W*(ψ_{X}) is sharp. When the system is already at ψ_{X}∗, the system will oscillate between microstates of ψ_{X}∗, and the changes are said to be reversible.
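The sharpness of *W*(ψ_{X}) around its maximum can also be checked directly. The sketch below (our illustration) takes *N* agents with two equiprobable values, so that the multiplicity of the macrostate with *N*_{1} agents holding the first value is the binomial coefficient, and measures how much of the total number of microstates lies within ±2% of the maximum at *N*_{1} = *N*/2:

```python
from math import comb

# N agents, each taking one of two equiprobable values, so that the
# multiplicity of the macrostate with N_1 agents in the first value is
# W(N_1) = C(N, N_1), with 2^N microstates in total.
N = 10_000
total_microstates = 2 ** N

# Microstates whose N_1 lies within 2% of the maximum at N/2:
window = range(int(0.49 * N), int(0.51 * N) + 1)
near_maximum = sum(comb(N, n1) for n1 in window)
fraction = near_maximum / total_microstates
```

Even for this modest *N*, well over 95% of all microstates fall inside the narrow window, which illustrates the concentration phenomenon described above.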

## 3 Boltzmann, Shannon, and Gibbs Formulations of Entropy

In the previous section, we defined the entropy as *S* = ln(*W*), without, however, specifying the formula that determines *W*. Its form will depend on the property that is being counted and the system conditions. It is merely a matter of counting combinations, and this is sometimes the hardest part of the problem. In the simplest combinatorial case, multiset permutations, the expression is

*W*(ψ_{X}) = *N*! / ∏_{i} *N*_{i}!.   (1)

This is the case of what is known as Maxwell-Boltzmann statistics. When *N* is large enough, the entropy can be calculated using Stirling's approximation ln(*y*!) ≈ *y* ln(*y*) − *y*:

*S* = ln *W*(ψ_{X}) ≈ *N* ln(*N*) − ∑_{i} *N*_{i} ln(*N*_{i}).

By dividing everything by *N* and making *P*_{i} = *N*_{i}/*N*, we get what is called the Shannon entropy *H*:

*S* ≈ −*N* ∑_{i} *P*_{i} ln(*P*_{i}) = *NH*.
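The quality of the Stirling approximation behind the identity *S* ≈ *NH* can be checked numerically. The following sketch (our illustration, with invented occupation numbers) compares the exact logarithm of the multiset-permutation count with *N* times the Shannon entropy:

```python
from math import factorial, log, prod

def S_exact(counts):
    """S = ln( N! / prod(N_i!) ), the exact multiset-permutation entropy."""
    n = sum(counts)
    return log(factorial(n) // prod(factorial(c) for c in counts))

def shannon_H(counts):
    """H = -sum(P_i ln P_i) with P_i = N_i / N."""
    n = sum(counts)
    return -sum((c / n) * log(c / n) for c in counts if c)

# Hypothetical occupation numbers with a large N; S should approach N * H.
counts = [4000, 3000, 2000, 1000]
N = sum(counts)
relative_error = abs(S_exact(counts) - N * shannon_H(counts)) / S_exact(counts)
```

For these counts the relative discrepancy is a fraction of a percent, and it shrinks as *N* grows.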

There is a relationship between *H* and thermodynamics, but this relationship no longer holds whenever *W*(ψ) in Equation 1 does not express the equation defining the number of combinations of a macrostate. A straightforward example is when, a priori, each value *x*_{i} has not the same but a different probability, *p*_{i} = *w*_{i}/*N*, of occurring, for some values of *w*_{i}. In this case, the probability associated with a macrostate is the known multinomial distribution, *P*(ψ_{X}) = (*N*!/∏*N*_{i}!)∏*p*_{i}^{Ni}, and the number of combinations is

*W*(ψ_{X}) = *N*! ∏_{i} *w*_{i}^{Ni} / ∏_{i} *N*_{i}!.   (2)

Calculated thus, entropy yields the result

*S* = ln *W*(ψ_{X}) ≈ −*N* ∑_{i} *P*_{i} ln(*P*_{i}/*w*_{i}).   (3)

Therefore, if *w*_{i} = 1 for every *i*, then *S* = *NH*. In physics, Equation 2 appears as soon as there are degenerate energy levels. But there are many other cases in physics where *W*(ψ) cannot be calculated with either Equation 1 or 2, such as, for example, when there are quantum effects and Bose-Einstein statistics apply; when the Pauli exclusion principle is applicable, yielding the Fermi-Dirac statistics; or when we have to calculate possible polymer organizations in a solution.
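The departure from the Shannon case can again be checked numerically. The sketch below (our illustration; the counts and the integer degeneracies *w*_{i} are invented) evaluates an Equation-2-style multiplicity exactly and compares it with the Stirling-form entropy −*N*∑*P*_{i} ln(*P*_{i}/*w*_{i}), which reduces to *NH* only when every *w*_{i} = 1:

```python
from math import factorial, log, prod

def W_weighted(counts, weights):
    """Equation-2-style multiplicity W = N! prod(w_i^{N_i}) / prod(N_i!),
    for integer degeneracies w_i of each value x_i."""
    n = sum(counts)
    num = factorial(n) * prod(w ** c for w, c in zip(weights, counts))
    return num // prod(factorial(c) for c in counts)

def S_weighted(counts, weights):
    """Stirling form: S ~ -N sum(P_i ln(P_i / w_i)), with P_i = N_i / N."""
    n = sum(counts)
    return -n * sum((c / n) * log((c / n) / w)
                    for c, w in zip(counts, weights) if c)

counts = [5000, 3000, 2000]   # hypothetical occupation numbers N_i
weights = [1, 2, 3]           # hypothetical degeneracies w_i
exact = log(W_weighted(counts, weights))
approx = S_weighted(counts, weights)
relative_error = abs(exact - approx) / exact
```

Setting all the weights to 1 recovers the plain multiset count, so the degeneracies can only add combinations, never remove them.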

Let us look at another example. Suppose that we have a map divided into *N*_{C} plots, each with capacity for *C*_{i} agents, where *C* = ∑*C*_{i} is the total capacity of the map. Suppose that we want to find out the number *n*_{i} of agents that occupy each plot, regardless of the identity or position of each agent in the plot. In other words, ψ_{C} = {*n*_{1},…, *n*_{NC}} defines the macrostates that we want to study. As there will be *n*_{i} agents and *C*_{i} − *n*_{i} empty spaces in each plot, *W*(*C*_{i}, *n*_{i}) = *C*_{i}!/[*n*_{i}!(*C*_{i} − *n*_{i})!] will be the number of possible distributions of those agents and spaces in a plot. The distribution of one plot is independent of the distribution in any other, and consequently the distribution across the whole map is calculated as

*W*(ψ_{C}) = ∏_{i} *C*_{i}!/[*n*_{i}!(*C*_{i} − *n*_{i})!].

The resulting entropy is

*S* = ln *W*(ψ_{C}) ≈ ∑_{i} [*C*_{i} ln(*C*_{i}) − *n*_{i} ln(*n*_{i}) − (*C*_{i} − *n*_{i}) ln(*C*_{i} − *n*_{i})].   (4)

Clearly, *S* ≠ *NH* once again.
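Because the plots are independent, the whole-map multiplicity factorizes and its logarithm is a sum of per-plot binomial terms. A minimal sketch (our illustration, with an invented map) makes this explicit:

```python
from math import comb, log

def plot_entropy(capacity, occupants):
    """ln of the binomial count C_i! / (n_i! (C_i - n_i)!) for one plot."""
    return log(comb(capacity, occupants))

def map_entropy(capacities, occupations):
    """S = ln W(psi_C): plots are independent, so the multiplicities
    multiply and the per-plot entropies simply add."""
    return sum(plot_entropy(c, n) for c, n in zip(capacities, occupations))

# Hypothetical map: four plots with capacities C_i and head counts n_i.
C = [10, 20, 30, 40]
n = [3, 10, 25, 8]
S_map = map_entropy(C, n)
per_plot = [plot_entropy(c_i, n_i) for c_i, n_i in zip(C, n)]
```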

If the agents' positions on the map and their values of the property *X* change simultaneously but independently, then the total number of possible combinations will be given by multiplying any possible combinations with respect to the agent positions by the combinations of their possible values of *X*, that is, *W*_{total} = *W*(ψ_{X})*W*(ψ_{C}). The total system entropy will then be given by *S*_{total} = ln(*W*_{total}) = ln *W*(ψ_{X}) + ln *W*(ψ_{C}), that is, the sum of the entropies calculated by Equations 3 and 4:

*S*_{total} = −*N* ∑_{i} *P*_{i} ln(*P*_{i}/*w*_{i}) + ∑_{i} [*C*_{i} ln(*C*_{i}) − *n*_{i} ln(*n*_{i}) − (*C*_{i} − *n*_{i}) ln(*C*_{i} − *n*_{i})].

Accordingly, we find that the Shannon entropy is just a special case, which is rather limited outside basic combinatorics, and we can draw few correct and useful conclusions from its calculation unless we are clear about the source of its formulation and its application scope.

So far, we have studied the distribution of the values of a property *X* in one closed system of *N* noninteracting and independent (statistically speaking) agents. Now let us consider that, due to some forces, the agents may interact, forming pairs, triplets, or larger collections of agents, and a value of *X* is also assigned to that interaction (as in physics we have interaction energies). In that case it may be difficult or impossible to calculate *W*(ψ_{X}) of Equation 1. But for that system we can easily calculate the average value of *X* per agent, x̄. Let us now change our focus from the problem of finding the distribution of the values of *X* among the *N* agents to a different problem: Calculate the probability that the system has an average value per agent, x̄_{j}, of the property *X*. To analyze this problem, we now create an ensemble of systems consisting of *M* mental copies of the *N*-agent system, each of them with a different distribution of the values of *X* among the *N* agents. If *M*_{j} is the number of all possible different copies of the system with an average value x̄_{j}, then *M* = ∑*M*_{j}, and in the simplest combinatorial case the number of different distributions of the ensemble considering all the possible values of x̄_{j} is

Ω = *M*! / ∏_{j} *M*_{j}!.

If we now define *S*_{G} = ln(Ω), following the mathematical procedure at the beginning of this section with ρ_{j} = *M*_{j}/*M*, we get

*S*_{G} ≈ −*M* ∑_{j} ρ_{j} ln(ρ_{j}).

This *S*_{G} is what is known as the Gibbs entropy, which in this combinatorial case is equivalent to the Shannon entropy *H*. Here *H* constitutes the expectation value or average of −ln(ρ_{j}). The Boltzmann formulation of entropy that we were using, *S*_{B} = ln(*W*), which takes the agents as the counting focus, has limited applicability in physics [29], because it applies only to dilute gases: Atoms and molecules in solids, liquids, or dense gases interact with one another and cannot simply be regarded as statistically independent; they cannot easily be separated into counting units, because of the forces between them. But the Gibbs entropy *S*_{G} = ln(Ω) uses the whole system as the focus, saying nothing about the interaction of the agents within the system. Therefore, it can be applied to any system, and it constitutes the entropy formulation currently used in physics (in any case, *S*_{B} can be derived from *S*_{G} [11, 29] for systems forming the so-called microcanonical ensemble, such as a perfect gas, where all the copies of the ensemble are equally probable). Unfortunately, the formulation of *S*_{G} shown above (equivalent to Shannon's *H*), which switches from a multiplicity (combinatorial) perspective to a probabilistic perspective, is very often taken as the starting definition of entropy, skipping the original counting-problem statement. The result is that the powerful idea of Boltzmann's method based on counting agents is lost. But artificial life systems are not gases, liquids, or solids: They are agent-based systems, where counting agents is not supposed to be a problem, so the idea of treating the problem as a mere combinatorial problem could be worth recovering. The rest of the article explores this idea and demonstrates that this is the case.

## 4 Incorporating Realistic Elements in an Artificial Ecosystem: The Groundwork of the Ecosystem to be Studied

A common property of artificial ecosystems is that agents possess a countable property or trait that changes over time. Agents accumulate money, gain energy, harvest food, consume CPU time or memory, or simply grow. To better illustrate the problem, imagine that we want to simulate a forest containing *N* trees (agents), and the variable that we are going to study and that changes or grows is their mass. Suppose that the trees grow over time according to a function *m*(*t*) = *F*(α*t*), where α is a parameter related to the growth rate. The constant α can be fixed, but in nature this growth rate varies from one time instant to another, depending on many factors [37]. Consequently, there is a tendency in ecological modeling to use stochastic models [27], meaning that random values α(τ) are used at each time instant τ, with mean μ_{α} and fixed variance σ_{α}^{2}.

For iterative modeling with a computer program varying α at each *t*, the above function must be transformable into *m*(*t* + 1) = *G*(α(*t*), *m*(*t*)), where α(*t*) is the value of the constant at each *t*, and *G* is found from *F* by replacing *t* by *t* + 1.

There are many functions that model growth and especially tree growth [17], but we will opt for the following four: the Verhulst logistic equation [65], *m*(*t*) = *K*/[1 + (*K*/*m*_{0} − 1)*e*^{−αt}]; the Weibull function [72], *m*(*t*) = *K*(1 − *e*^{−αt}) + *m*_{0}, where *K* denotes the maximum value and *m*_{0} its initial value in both equations; and a pair of simplifications that are likely to be more common in other types of artificial ecosystems and are also used in real systems, namely, the linear function, *m*(*t*) = α*t* + *m*_{0}, and the exponential function, *m*(*t*) = *m*_{0}(1 + α)^{t}. The forest will have four species of tree, and each of the above models will describe how each species grows. Table 1 shows the parameters and functions *F* and *G* of each model, and Figure 1 plots the form of *F*.
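For concreteness, the four growth models can be iterated as follows. This is only an illustrative Python sketch: the function names are ours, and the update rules for the logistic and Weibull species are our derivation of *G* from *F* by replacing *t* with *t* + 1 (parameter values follow Table 1):

```python
import math
import random

# One growth step m(t+1) = G(alpha, m(t)) per species (models of Table 1).
# Names are ours; k plays the role of the asymptote K, m0 the initial mass.
def step_linear(m, a):
    return m + a

def step_exponential(m, a):
    return m * (1 + a)

def step_logistic(m, a, k=10.0):
    # Derived from m(t) = k / (1 + (k/m0 - 1) e^{-alpha t}) by t -> t + 1.
    return k / (1 + (k / m - 1) * math.exp(-a))

def step_weibull(m, a, k=9.0, m0=1.0):
    # Derived from m(t) = k (1 - e^{-alpha t}) + m0 by t -> t + 1.
    return (k + m0) * (1 - math.exp(-a)) + m * math.exp(-a)

def grow(step, alpha_max, t_max, m0=1.0, seed=0):
    """Iterate one tree's mass, drawing a fresh alpha ~ U[0, alpha_max]
    at every time instant, as in the stochastic models above."""
    rng = random.Random(seed)
    m = m0
    for _ in range(t_max):
        m = step(m, rng.uniform(0.0, alpha_max))
    return m
```

Each call to `grow` simulates one tree; the bounded models stay strictly below their asymptotic mass however long they are iterated.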

| Species | Parameters | *m*(*t*) = *F*(α*t*) | *m*(*t* + 1) = *G*(α, *m*(*t*)) |
|---|---|---|---|
| A | *m*_{0} = 1.0; *m*_{max} = 9.99; α ∈ *U*[0, 0.1]; μ_{α} = 0.05 | *m*_{0} + α*t* | *m*(*t*) + α |
| B | *m*_{0} = 1.0; *m*_{max} = 9.99; α ∈ *U*[0, 0.026]; μ_{α} = 0.013 | *m*_{0}(1 + α)^{t} | *m*(*t*)(1 + α) |
| C | *m*_{0} = 1.0; *m*_{max} = 9.99; α ∈ *U*[0, 0.06]; μ_{α} = 0.03; *k* = 10 | *k*/[1 + (*k*/*m*_{0} − 1)*e*^{−αt}] | *k*/[1 + (*k*/*m*(*t*) − 1)*e*^{−α}] |
| D | *m*_{0} = 1.0; *m*_{max} = 9.99; α ∈ *U*[0, 0.04]; μ_{α} = 0.02; *k* = 9 | *k*(1 − *e*^{−αt}) + *m*_{0} | (*k* + *m*_{0})(1 − *e*^{−α}) + *m*(*t*)*e*^{−α} |


Basically, we want to find the final most probable distribution of the trees of each species, depending on their mass, if we have *N* trees grow from an initial mass of *m*_{0} until time *t*. In other words, given a particular mass value, we want to know how many trees with that mass there will be in the forest at time *t*.

## 5 The Most Probable Distribution in Growing Agents: Central Limit Theorem and Its Extension

Let us start by studying a forest populated by a single species of tree. To calculate the most probable distribution *P*(*m*, *t*) of trees or agents, depending on their mass, after they have been growing for some time *t*, we will analyze the sum χ(*t*) = ∑_{0}^{t} α(τ), which is a random number resulting from the sum of random numbers α(τ). It is worthwhile analyzing that sum because it is easy to check that *m*(*t*) = *F*(χ(*t*)) for the first three species. Those species depend on that sum, whereas the exponential function does not. However, applying logarithms to both sides of its exponential equation, we have that ln(*m*(*t*)) = ln(*m*_{0}) + ∑_{0}^{t} ln(1 + α(τ)), and *m*(*t*) = *m*_{0}*e*^{∑_{0}^{t} ln(1+α(τ))}, and hence we again have a sum of random numbers χ(*t*) = ∑_{0}^{t} α∗(τ), with α∗(τ) = ln(1 + α(τ)).
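The equivalence between the iterative product form and the sum-of-logarithms form is easy to verify numerically; the following sketch (our illustration, with species-B-like parameters) grows one exponential tree both ways:

```python
import math
import random

rng = random.Random(42)
m0, t_max = 1.0, 200
alphas = [rng.uniform(0.0, 0.026) for _ in range(t_max)]  # species-B range

# Iterative product form: m(t+1) = m(t) * (1 + alpha(t))
m_iter = m0
for a in alphas:
    m_iter *= 1 + a

# Sum-of-logs form: m(t) = m0 * exp(chi(t)), with alpha* = ln(1 + alpha)
chi = sum(math.log(1 + a) for a in alphas)
m_sum = m0 * math.exp(chi)
```

Both routes give the same mass up to floating-point rounding, which is why the exponential species can be folded into the same sum-of-random-numbers analysis as the others.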

The sum χ(t) can take any value, but we know that on average its value will be μ_{α}*t* and its variance σ_{α}^{2}*t*. This means that the distribution *P*(*m*, *t*) of trees depending on mass will have a mean μ_{m} = *F*(μ_{α}*t*) at time *t*. What will that distribution be like? That will depend on the distribution of the values of χ(t) at time instant *t*. But one distribution will be more probable than the others.

*t*can lead to several final values χ

_{i}because of the randomness of α(τ). Let us denote by

*N*

_{i}the number of trees that have reached the final value χ

_{i}at time instant

*t*, with

*N*= ∑

*N*

_{i}. Given any particular values

*N*

_{i}, there are many different ways in which

*N*trees can grow individually for there to be

*N*

_{i}trees that have reached the value χ

_{i}at time instant

*t*. If we are concerned not with the identity of each tree but just with the final value χ

_{i}of their sum, all these ways are calculated thus:

*N*_{i} could potentially take any integer value between 0 and *N* because the process is random, and *W* depends on those values. The most probable distribution will be the distribution that has the greatest number of ways, that is, the distribution for which *W* is the maximum. That maximum corresponds to some particular values *N*_{i}, which are what we want to find. This we will do by calculating the values that maximize the entropy *S* = ln(*W*), as this simplifies the mathematics. The problem is a typical case of constrained function optimization, namely, to find the values of *N*_{i} that maximize *S* subject to three constraints: (a) *N* = ∑*N*_{i}, (b) μ_{α}*t* is the mean, and (c) σ_{α}^{2}*t* is the variance of χ(*t*) for all trees. It is well known that the most probable distribution that maximizes this entropy subject to those constraints is a normal distribution [33, 57]:

*P*(χ) = (2πσ_{α}^{2}*t*)^{−1/2}*e*^{−(χ−μ_{α}t)^{2}/(2σ_{α}^{2}t)}

This is the same result as for the central limit theorem.
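The convergence of χ(*t*) to this normal distribution is easy to check numerically. The sketch below (in Python, with an assumed uniform distribution for α(τ); the article does not fix one) sums random growth increments for many trees and compares the sample mean and variance with μ_{α}*t* and σ_{α}^{2}*t*:

```python
import math
import random

random.seed(42)

def chi(t, alpha):
    """Final value of the sum chi(t) for one tree, drawing alpha(tau) each step."""
    return sum(alpha() for _ in range(t))

# Illustrative choice (an assumption, not from the article):
# alpha(tau) uniform on [0, 0.2], so mu_alpha = 0.1 and sigma_alpha^2 = 0.2**2/12.
alpha = lambda: random.uniform(0.0, 0.2)
mu_alpha, var_alpha = 0.1, 0.2 ** 2 / 12

t, N = 200, 5000
values = [chi(t, alpha) for _ in range(N)]

mean = sum(values) / N
var = sum((x - mean) ** 2 for x in values) / N

# The sample mean and variance approach mu_alpha*t and sigma_alpha^2*t.
print(mean, mu_alpha * t)   # ~20
print(var, var_alpha * t)   # ~0.67
```

A histogram of `values` would likewise approach the predicted Gaussian shape as *N* grows.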

The normal distribution gives us the probability *P*_{i} = *N*_{i}/*N* in terms of the values of χ, *P*(χ), whereas what we really set out to calculate is the probability in terms of mass, *P*(*m*). As *m* = *F*(χ), then applying the change-of-variables method well known for statistical distributions, we get

*P*(*m*) = *P*(χ)|*d*χ/*dm*|

As the expected value of χ is μ_{α}*t*, then *d*χ ≈ μ_{α}*dt*, and we will finally get that

*P*(*m*, *t*) = (μ_{α}/(*dm*/*dt*))(2πσ_{α}^{2}*t*)^{−1/2}*e*^{−(F^{−1}(m)−μ_{α}t)^{2}/(2σ_{α}^{2}t)}     (Equation 5)

That is, for a particular value of mass, *m*_{i}, the number of trees expected to end up with that mass at time *t* will be *N*_{i} = *NP*(*m*_{i}). Figure 2 shows the predictive value of Equation 5, having a population of 2,500 single-species trees grow over time according to the sigmoid rule of species C (see the formula for *p*_{i} for species C in Table 1), on repeating the experiment 100 times.

Equation 5 describes a population of trees that have all been growing for the same time *t*, but to simulate a real ecosystem we will have to grow a population containing trees of different ages. Suppose that the same number of individuals is born at each time iteration, and we have the population iterate until time τ. The total population will be described by the superposition of the distributions *P*(*m*) from *t* = 0 to *t* = τ:

*P*_{total}(*m*) ∝ ∑_{t=0}^{τ}*P*(*m*, *t*)

We will use an approximation to calculate the result of the summation. Making a change of variables, we find that the summands are really all the same Gaussian, albeit evaluated at different points along *t*. Save close to *m*_{0} and at the *m*_{max} endpoint (where mass is maximal), the sum will be more or less an integral of the Gaussian and hence equal to 1. Therefore, a good approximation to the distribution of the whole population far from the endpoints will be

*P*(*m*) ∝ μ_{α}(*dm*/*dt*)^{−1},

the reciprocal of the tree growth rate. This distribution has to be normalized to make it a probability density function, so cancelling μ_{α}:

*P*(*m*) = (1/*Z*)(*dm*/*dt*)^{−1}     (Equation 6)

where *Z* is a normalization constant. The fit is good except close to *m*_{0} and *m*_{max}, as pointed out above. The curve is the reciprocal of the growth rate, in Equation 6, and consequently there should be more trees at lower growth rates. As, in this case, the growth function is the sigmoid function (species C in Figure 1), the growth rate is much lower at the endpoints, *m*_{0} = 0.1 and *m*_{max} = 30.

## 6 The Most Probable Distribution When There Is a Shortage of Resources: Temperature and an Alternative Expression of the Second Principle of Thermodynamics

Section 5 illustrated how the forest grows if there are no constraints on resources. This is, however, neither a very interesting nor a realistic situation. Let us now look at what happens when resources are limited, when the sum of total system mass is constant, *U*.

We can assume that initially all mass is in the environment, in the ground, and the trees gradually absorb it as they are born and grow. There will come a time when all mass is located in the trees, *U* = ∑*m*_{i}*N*_{i}, and tree growth will stop because no more mass is available in the ground. Consequently, we have to decide how the system will work under those circumstances. One operational scheme is to kill a tree when there is no more mass left in the ground and return its mass to the environment so that other trees can grow. The decisions on how that tree is to be killed and on how trees are born will determine the form of the most probable distribution. We will consider another constraint, namely, that the number of trees, *N*, is also constant, and so when a tree dies another tree is automatically born with *m*_{0}.

One way of killing trees would be to select and remove one of the *N* trees at random. In this case, the most probable distribution is fairly easy to calculate, because, as the selection is made at random, we can analyze what happens with time. There will be trees of all ages in the forest. Let θ be the tree survival probability in each time step. We can assume that θ will be more or less constant for all ages. Therefore, the probability that a tree will reach age *t* is the probability that it does not die in *t* preceding generations, θ^{t}. Let *N*_{t} be the number of trees in the forest with age *t*; then *N*_{t} = *N*_{0}θ^{t}, where *N*_{0} is the number of trees that are born at each time instant. As *N*_{0} = (1 − θ)*N*, consequently *N*_{t} = (1 − θ)*N*θ^{t}. The final total mass will be *U* = ∑_{t=0}^{tmax}*m*_{t}(1 − θ)*N*θ^{t}. Accordingly, the constraint on the quantity of available resources can be interpreted as a specification of the survival rate θ.
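The geometric age distribution *N*_{t} = (1 − θ)*N*θ^{t} can be checked with a minimal simulation of this random-killing scheme (the values of θ, *N*, and the number of steps below are illustrative assumptions):

```python
import random

random.seed(7)

# Each step, every tree survives with probability theta; a death is replaced
# by a newborn of age 0, so the population size N stays constant.
theta, N, steps = 0.95, 5000, 300
ages = [0] * N
for _ in range(steps):
    ages = [a + 1 if random.random() < theta else 0 for a in ages]

# Predicted number of trees of age t: N_t = (1 - theta) * N * theta**t.
observed = sum(1 for a in ages if a == 10)
predicted = (1 - theta) * N * theta ** 10
print(observed, predicted)  # both close to 150
```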

The a priori probability that a tree has age *t* is thus *p*_{t} = *N*_{t}/*N* = (1 − θ)θ^{t}. Given this probability, what is the probability of obtaining any distribution whatsoever of trees of different ages? The probability mass function of the multinomial distribution answers this question:

*P* = (*N*!/∏_{t}*N*_{t}!)∏_{t}*p*_{t}^{N_{t}}

The most probable distribution is the distribution that maximizes *P* and therefore maximizes *S* = ln(*P*). The constraint on the available resources has already been specified within *p*_{t} by means of the survival rate. Therefore, we simply have to find the values that make ∂*S*/∂*N*_{t} = 0 in order to calculate the values of *N*_{t} that maximize the entropy. The calculation is straightforward, and the solution is *N*_{t}/*N* = *p*_{t}, which is precisely its a priori probability. Hence, we calculate how many trees of a particular mass there will be by substituting *t* = *F*^{−1}(*m*).

Another, more interesting way of killing trees that adds another realistic element to the ecosystem would be to eliminate a tree that does not have enough available mass in the environment to grow. This idea that a plant must grow in order to survive has quite a lot of support in ecological modeling [24].

The decision depends in this case on the growth rate at any time. If there is not much available mass, a tree that is in a very fast growth phase and needs to extract a lot of mass from the system will be more likely to die than a tree that is growing slowly at that point and has more modest needs.

In this case the a priori probability *p*_{i} is characterized by the inverse growth rate at each mass value *m*_{i}. Hence the probability of obtaining any combination of trees whatsoever of different masses is

*P* = (*N*!/∏_{i}*N*_{i}!)∏_{i}*p*_{i}^{N_{i}}

The most probable combination of trees is the combination that maximizes *S* = ln(*P*). There are two constraints on that maximum that have to be met, *N* = ∑*N*_{i} and *U* = ∑*m*_{i}*N*_{i}. If we make the mathematical definition 1/*T* = (∂*S*/∂*U*)_{N}, defining *T* as temperature, we can deduce (see Appendix) that the most probable distribution will be given by

*N*_{i} = (*N*/*Q*)*p*_{i}*e*^{−m_{i}/T}     (Equation 7)

where

*Q* = ∑_{i}*p*_{i}*e*^{−m_{i}/T}

and the value of the entropy will be given by

*S* = *N*ln(*Q*) + *U*/*T*     (Equation 8)

The temperature value in the above equations must be such that

*U* = ∑_{i}*m*_{i}*N*_{i} = (*N*/*Q*)∑_{i}*m*_{i}*p*_{i}*e*^{−m_{i}/T}     (Equation 9)

Depending on the form of *p*_{i}, the system temperature can sometimes be calculated analytically. Otherwise, a numerical function optimization method (e.g., Newton's method or the conjugate gradient method) can be used to search for the value of *T* such that Equation 9 holds.
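As a sketch of that numerical search, bisection is enough when *U* is monotone in *T* (which it is for positive temperatures); the mass levels and probabilities below are toy assumptions, not the article's species:

```python
import math

def boltzmann_counts(m, p, T, N):
    """Most probable N_i = (N/Q) * p_i * exp(-m_i/T) for given temperature T."""
    w = [pi * math.exp(-mi / T) for mi, pi in zip(m, p)]
    Q = sum(w)
    return [N * wi / Q for wi in w]

def total_mass(m, p, T, N):
    return sum(mi * Ni for mi, Ni in zip(m, boltzmann_counts(m, p, T, N)))

def solve_temperature(m, p, N, U, lo=0.1, hi=100.0, iters=200):
    """Bisection for the T that makes the total mass equal U (increasing in T)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total_mass(m, p, mid, N) < U:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Toy example (assumed values): 10 equally probable mass levels 1..10.
m = list(range(1, 11))
p = [1.0] * 10
N = 1000.0
U_target = total_mass(m, p, 2.0, N)   # total mass produced by T = 2
T = solve_temperature(m, p, N, U_target)
print(round(T, 6))  # recovers 2.0
```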

We have expressed the second principle of thermodynamics of our system as *S*(*t* + 1) − *S*(*t*) = Δ*S* ≥ 0 for each iteration. But Equation 8 provides another expression for the case where *U* is not constant and is changed from *U*_{1} to *U*_{2}. If the number of trees is constant, then an infinitesimal change in the limit of resources produces a change in entropy equal to *dS* = (∂*S*/∂*U*)_{N}*dU* = *dU*/*T*. By Equation 9, temperature is a parameter that depends on *U*, that is, *T* = *T*(*U*). Hence, if we change *U*_{1} to *U*_{2}, then Δ*S* = ∫_{U1}^{U2}*dU*/*T*(*U*). This equation is another expression of the second principle of thermodynamics for our system of agents.

Let us take, for example, a species with *p*_{i} = constant, as would be the case if the species were to grow according to *m*(*t*) = α*t* + *m*_{0}, and calculate the average mass per tree in the system supposing that there is a maximum value *m*_{max}:

m̄(*T*) = ∫_{m_{0}}^{m_{max}}*me*^{−m/T}*dm* / ∫_{m_{0}}^{m_{max}}*e*^{−m/T}*dm*

Figure 4 shows the graph of m̄ with temperature change and with *m*_{0} = 1 and *m*_{max} = 10. We find that some values of m̄ are only reachable if the temperature is negative. Note how the distribution of *N*_{i} changes with the temperature, according to Equation 7. Figure 5 shows two examples, one for *T* = +1 and another for *T* = −1. The distribution for the positive temperature is a typical exponential curve, where there are more trees or agents with small than with large mass. With negative temperature, however, the mean mass is greater, and, because we have specified a maximum value for mass, *m*_{max}, what is called *population inversion* occurs, where more agents have large mass values. If there were to be no such maximum value, any value of m̄ could be reached with positive temperatures by simply assigning large values of *m*_{i} to a few agents. This is equivalent to lengthening the exponential tail. As there is a maximum value, this is not possible, and the only option is for the system to move a large part of the population close to *m*_{max}. In nature, there are very few materials that have a maximum energy level (which is equivalent to mass in our system). Ordinary materials have practically infinite ladders of energy; they can absorb energy without saturation by distributing it in all its forms (translational, rotational, vibrational, and electronic energy) across their particles [11]. Consequently, positive temperatures are prevalent in nature and negative temperatures are rare.
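The population inversion can be illustrated numerically. This sketch evaluates the mean mass on assumed discrete mass levels between *m*_{0} = 1 and *m*_{max} = 10 with constant *p*_{i}:

```python
import math

def mean_mass(T, m0=1, mmax=10):
    """Average mass per tree for discrete levels m0..mmax with constant p_i
    (discrete levels are an assumption made for this sketch)."""
    masses = range(m0, mmax + 1)
    w = [math.exp(-m / T) for m in masses]
    return sum(m * wi for m, wi in zip(masses, w)) / sum(w)

pos = mean_mass(+1.0)   # exponential profile: most agents near m0
neg = mean_mass(-1.0)   # population inversion: most agents near mmax
mid = (1 + 10) / 2

print(pos, neg)
# pos < mid < neg: mean masses above the midpoint require negative temperature.
```

By the symmetry of the weights, `mean_mass(-T)` equals `11 - mean_mass(T)` for these levels, which is why the two distributions in Figure 5 mirror each other.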

The above explanation has taken into account only one variable, mass, and one constraint with respect to how much of that variable is available for the whole system. We have described the most probable distribution using some equations that depend on a parameter that we have referred to as temperature because its definition is identical to the definition of thermodynamic temperature in statistical mechanics. There is only one variable, mass, in our system. However, if there were more, like wealth, volume, energy, money, or CPU time, with similar constraints with respect to availability for the whole system, particular temperature parameters would appear for each of those variables, and we would refer to the system temperature with respect to mass, temperature with respect to wealth, temperature with respect to size, and so on, all of which would be different.

## 7 More than One Species: Temperature, Chemical Potential, and Chemical Reactions

So far we have studied a single species that grows according to *m*(*t*) = *F*(α*t*) and dies depending on its growth rate in an environment with limited resources. We will now study what happens if we have *L* species, each behaving according to a different rule *m*_{l}(*t*) = *F*_{l}(α*t*). The most straightforward case is when we simply have several separate subsystems, each with its species of trees and resources. Let *N*_{li} be the number of trees or agents of species *l* whose mass value is *m*_{i}; then *N*_{l} = ∑_{i}*N*_{li} is the total number of trees of that species in the respective subsystem, *U*_{l} = ∑_{i}*m*_{i}*N*_{li} is the total mass of that species in the subsystem, and *p*_{li} is the a priori probability characterized by the inverse growth rate of the species. In this case, the probability of there being any combination whatsoever of trees of different masses is

*P* = ∏_{l}(*N*_{l}!/∏_{i}*N*_{li}!)∏_{i}*p*_{li}^{N_{li}}

Using *S* = ln(*P*), with 2 × *L* constraints, *N*_{l} = ∑_{i}*N*_{li} and *U*_{l} = ∑_{i}*m*_{i}*N*_{li}, we can easily find that *S* = ∑_{l}*S*_{l}, where *S*_{l} is calculated with Equation 8. Consequently, the total entropy is the sum of the entropies of all subsystems.

But let us make a slight change whereby the species into which any new tree or agent is born is chosen at random, that is, when a tree of species *A* dies, a tree of species *B* may be born, or vice versa. The result is that one subsystem loses and the other gains an agent, agents move from one subsystem to another, and they can no longer be considered to be separate. Moreover, this exchange could be interpreted as a chemical reaction, as we will see later.

Instead of 2 × *L* constraints, we now have just 2 (*N* = ∑_{l}*N*_{l} and *U* = ∑_{l}*U*_{l}), and the probability of there being any combination of trees whatsoever of different masses is now calculated as for a single system:

*P* = (*N*!/∏_{l}∏_{i}*N*_{li}!)∏_{l}∏_{i}*p*_{li}^{N_{li}}

Using *S* = ln(*P*), and maximizing the entropy, the solution is found in a similar manner to the one in Section 6, which also dealt with a single system:

*N*_{li} = (*N*/*Q*)*p*_{li}*e*^{−m_{i}/T},  *Q* = ∑_{l}*Q*_{l},  *Q*_{l} = ∑_{i}*p*_{li}*e*^{−m_{i}/T}     (Equation 11)

*S* = ∑_{l}*S*_{l} + *N*ln(*N*) − ∑_{l}*N*_{l}ln(*N*_{l})     (Equation 12)

where *S*_{l} is the entropy of each species, supposing that they were in different and separate systems. As several species coexist in the same system, we find that the system entropy is equal to the sum of the entropies of the separate systems, plus a new term, *S*_{mixing} = *N*ln(*N*) − ∑_{l}*N*_{l}ln(*N*_{l}), called the *mixing entropy*. In other words, if we have a series of separate systems with total entropy ∑_{l}*S*_{l}, we have managed to increase the total system entropy by an amount *S*_{mixing} by simply allowing agents to be exchanged.
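As a quick numerical illustration, the mixing entropy is positive for any split of *N* agents into species; here it is evaluated for the observed species counts of Table 2 (for four species it is bounded above by *N*ln(4), reached only for equal counts):

```python
import math

# S_mixing = N ln(N) - sum_l N_l ln(N_l), using the observed N_l of Table 2.
N_l = [2168, 2838, 2727, 2267]
N = sum(N_l)  # 10,000
S_mix = N * math.log(N) - sum(n * math.log(n) for n in N_l)
print(round(S_mix, 1))  # positive, just below the N*ln(4) ≈ 13,863 bound
```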

If we define the chemical potential of each species as μ_{l}/*T* = −(∂*S*/∂*N*_{l})_{N_{j≠l},U}, we can deduce (see Appendix) that the most probable distribution will be evaluated by equating the temperatures and chemical potentials,

*T*_{1} = *T*_{2} = ⋯ = *T*_{L} = *T*,  μ_{1} = μ_{2} = ⋯ = μ_{L} = μ

where

μ_{l}/*T* = ln(*N*_{l}/(*NQ*_{l}))     (Equation 13)

Additionally, the number of trees or agents of each species will be given by

*N*_{l} = *KN*     (Equation 14)

where *K* is the equilibrium constant that is defined as follows:

*K* = *Q*_{l}/*Q*

Looking at Equations 11 and 14, we observe that both *N*_{li} and *N*_{l} finally depend on the following two parameters: the a priori probabilities *p*_{li} and the temperature. The probabilities *p*_{li} are determined by the properties of each species, are assumed to be part of the system specification, and are therefore fixed. Otherwise, we would have to examine what effect changes to them would have on the entropy maximization, which is outside the scope of this article. Therefore, *N*_{li} and *N*_{l} depend on only one parameter, the temperature. This parameter has to be calculated so that the baseline equation *U* = ∑_{l}*U*_{l} holds:

*U* = ∑_{l}∑_{i}*m*_{i}*N*_{li} = (*N*/*Q*)∑_{l}∑_{i}*m*_{i}*p*_{li}*e*^{−m_{i}/T}     (Equation 15)

If the sums in the equation can be approximated by integrals to find the temperature, it can be calculated analytically. Otherwise, which is more often the case, a numerical function optimization method has to be run on a computer. Once we have found the temperature value, the distribution of each species will be completely specified.

We have run a simulation with *N* = 10,000 trees, an environment of mass *U* = 50,000, and a mixture of four species as defined earlier, whose dynamics, parameters, and a priori probability are shown in Table 1 and Figure 1. At *t* = 0, all trees are initialized at *m*_{0} and assigned a random species with identical probability, and they grow according to their growth function *F*. All their mass, including their initial mass and the mass satisfying growth requirements, is taken from the environment until it is depleted. The trees die one by one, either because there is no mass available in the environment to grow, or because they reach their maximum mass, *m*_{max}. After they die, their mass is returned to the environment and they are reborn, at which point they are again assigned a mass of *m*_{0} and a random species.
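A simplified sketch of this birth/growth/death loop is given below. The linear growth rates per species are assumptions standing in for the functions of Table 1, and the population and mass values are scaled down; the mass bookkeeping (drawn from and returned to a shared environment) follows the text:

```python
import random

random.seed(0)

SPECIES_RATES = {"A": 0.5, "B": 1.0, "C": 1.5, "D": 2.0}  # hypothetical rates
M0, MMAX = 0.1, 30.0
N, U = 1000, 5000.0

def newborn(ground):
    """Birth draws the initial mass m0 from the environment."""
    species = random.choice("ABCD")
    return [species, M0], ground - M0

ground = U
trees = []
for _ in range(N):
    tree, ground = newborn(ground)
    trees.append(tree)

for _ in range(500):
    for tree in trees:
        species, m = tree
        need = SPECIES_RATES[species]
        if m + need > MMAX or need > ground:
            # Death: the tree's mass returns to the environment, and a tree
            # of a random species is immediately reborn with mass m0.
            ground += m
            new_tree, ground = newborn(ground)
            tree[:] = new_tree
        else:
            tree[1] = m + need
            ground -= need

total = ground + sum(m for _, m in trees)
print(round(total, 6))  # → 5000.0 (total mass is conserved and equals U)
```

The invariant checked at the end, environment mass plus tree mass equal to *U*, is the constraint under which the distributions of this section are derived.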

Figure 6 shows the result, once we have calculated that the temperature value satisfying Equation 15 is *T* = 5.32, with entropy *S* = 90,468. We find that the prediction is a good fit, except for very low mass values (as expected).

Table 2 shows the values of *N*_{l} observed and predicted by Equation 14. Again, we find that the chemical potential values calculated using Equation 13 are equalized, also as predicted.

Species | Observed N_{l} | Predicted N_{l} | Observed μ_{l}
---|---|---|---
A | 2,168 | 2,185 | −43.08
B | 2,838 | 2,819 | −43.16
C | 2,727 | 2,733 | −43.11
D | 2,267 | 2,263 | −43.13


The final distribution per species depends on the amount of mass available in the environment according to Equation 11. It depends on the temperature, and thus on the mass of the environment according to Equation 15. Let us look at what happens if there is a major shortage of resources compared with the forest's potential growth capacity. We have run the same simulation but lowered the availability of resources to *U* = 15,000. In this case, the temperature and entropy consequently drop (*T* = 0.61 and *S* = 72,935), and the distribution changes as shown in Figure 7. Note that, although we are not simulating a real forest, as it would have to include many more aspects that have not been considered (as in [5]), we do find the so-called *self-thinning rule*, which is considered to be one of the more general principles of plant population biology [46, 68]. This rule relates plant size to density when density-dependent mortality (self-thinning) is occurring, in such a way that populations decline in density as plant size increases with age. The forest or stand first accumulates biomass with little mortality, gradually increases in mortality, and finally comes to move parallel to a self-thinning line that behaves as a boundary, as shown in Figure 7.

*Allometries* are scaling relationships of body size to the anatomical, physiological, behavioral, and ecological characteristics of animals, and they have long been a focus of interest in zoology [9, 49]. One of the most studied and most general allometric rules or laws relates the population density of an animal or vegetable species to its average mass by means of a power law [15, 53] of the type *N*_{l} ∝ m̄_{l}^{−γ}, where γ is a coefficient to be determined and m̄_{l} is the average mass of the species. Note again that although our artificial forest of agents does not aim to simulate a real forest, it is interesting to see that the artificial system also appears in Figure 8 to obey the allometric law according to which the smaller animal or vegetable species are more plentiful than the larger species.

An explanation for both the allometric and self-thinning rules could be obtained by analyzing Equation 11 taken from the system thermodynamics, but it is beyond the scope of this article. However, a general understanding of how size and abundance are related in nature is still lacking [69].

## 8 The Agents Move: Chemical Potential and Pressure

Another property characterizing multi-agent systems that try to recreate ecosystems is position. In this section we will introduce this property and movement in order to calculate what is the most probable distribution of the agents on a map.

Suppose now that agents are located on a map. Suppose that this artificial world is subdivided into *N*_{C} plots or blocks that each have the capacity for *C*_{i} agents, with a total capacity of *C* = ∑*C*_{i}. Capacity can be determined by many factors, ranging from the resource availability to mere physical space. In order to illustrate how particular thermodynamic properties emerge, let us separate the availability of resources from plot capacity, which could be an independent property of the environment related, for example, to the volume occupied by agents. For simplicity's sake, let us consider the problem that the size or volume that an agent or tree occupies is always the same, *v*, irrespective of its mass. The volume occupied by the *n*_{i} trees that are located on a plot will be *v*_{i} = *vn*_{i}, the maximum available volume of a plot will be *V*_{i} = *vC*_{i}, and the total ecosystem volume will be *V* = ∑*V*_{i} = *vC*.

The agents in our system are trees and are therefore not moving. Nevertheless, we can simulate movement through the birth-death cycle by randomly placing trees in any position in any plot when they are born, provided that the plot capacity is not exceeded and the site is not occupied. Although mobility is realized at birth, the effect would be the same if the agents were to move as they grew, as if they were animals and not plants.

The most probable distribution of the agents over the plots is again found by maximizing the entropy *S* = ln(*W*). As illustrated in Section 3, the system entropy is given by Equation 4:

*S* = ∑_{i}ln(*C*_{i}!/(*n*_{i}!(*C*_{i} − *n*_{i})!))

If we define the chemical potential for each plot or volume unit as μ_{i}/*T* = −(∂*S*/∂*n*_{i})_{V_{i}} and the pressure as *p*_{i}/*T* = (∂*S*/∂*V*_{i})_{n_{i}}, with the parameter *T* now acting as a normalization constant in both cases, we can demonstrate (see Appendix) that, at the point of equilibrium,

μ_{1} = μ_{2} = ⋯ = μ_{N_{C}} = μ     (Equation 16)

and

*p*_{1} = *p*_{2} = ⋯ = *p*_{N_{C}} = *p*     (Equation 17)

that is, when the system is at its point of equilibrium, the pressures and chemical potentials of the plots are, like the temperatures and chemical potentials in Section 7, also equal.

If the plots are far from full (*n*_{i} ≪ *C*_{i}), the pressure equation reduces to *p*_{i}*V*_{i} ≈ *n*_{i}*T*; if *T* is equal to the temperature (as we will assume in Section 9), this is the ideal gas law. If we include the next term of the expansion, we derive the van der Waals equation. That is, if the artificial world is very large compared with the number of agents, agent population movement will behave like that of an ideal gas.

We have run a simulation with *N* = 10,000 trees on a 500 × 50 map, divided into 10 adjacent 50 × 50 plots. The capacity of plot *i* is established linearly as *C*_{i} = 250*i*, yielding *C* = 13,750. Figure 9 shows the result. As predicted, the number of trees per plot, *n*_{i}, grows as the plot capacity increases.

Table 3 shows the chemical potentials and partial pressures of each plot calculated at the final time instant. The global values are μ = −0.98 and *p* = 1.30, taking *T* = 1 and *v* = 1. We find that they are equalized values that oscillate around the global values. Note that they have been measured at a single time instant, and a sample of values taken at different time instants would have to be averaged to get a more precise measure.
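The predicted values in Table 3 can be reproduced from the plot capacities alone. The formulas for μ_{i} and *p*_{i} below (with *T* = *v* = 1) are inferred so as to be consistent with the appendix derivation and the tabulated values:

```python
import math

N, T, v = 10000, 1.0, 1.0
C = [250 * i for i in range(1, 11)]      # plot capacities C_i = 250*i
total_C = sum(C)                          # 13,750

# Equilibrium: equal densities across plots, n_i/C_i = N/C.
n_pred = [Ci * N / total_C for Ci in C]

def mu(n, Ci):
    return T * math.log((Ci - n) / n)     # chemical potential of a plot

def p(n, Ci):
    return (T / v) * math.log(Ci / (Ci - n))  # pressure of a plot

# Plot 4 of Table 3: C_4 = 1000 -> n_4 = 727, mu_4 = -0.98, p_4 = 1.30.
n4 = n_pred[3]
print(round(n4), round(mu(n4, 1000), 2), round(p(n4, 1000), 2))  # → 727 -0.98 1.3
```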

Plot | C_{i} | Predicted n_{i} | Observed n_{i} | μ_{i} | p_{i}
---|---|---|---|---|---
1 | 250 | 182 | 195 | −1.27 | 1.51
2 | 500 | 364 | 368 | −1.03 | 1.33
3 | 750 | 545 | 556 | −1.05 | 1.35
4 | 1,000 | 727 | 727 | −0.98 | 1.30
5 | 1,250 | 909 | 868 | −0.82 | 1.19
6 | 1,500 | 1,091 | 1,081 | −0.95 | 1.28
7 | 1,750 | 1,273 | 1,310 | −1.09 | 1.38
8 | 2,000 | 1,455 | 1,454 | −0.98 | 1.30
9 | 2,250 | 1,636 | 1,681 | −1.08 | 1.37
10 | 2,500 | 1,818 | 1,760 | −0.87 | 1.22


## 9 Temperature, Pressure, and Chemical Potential Balance the System: The Fundamental Thermodynamic Equation

Let us now bring together the whole system subject to all the constraints: *N* = ∑*N*_{li}, *U* = ∑*m*_{i}*N*_{li}, and *N* = ∑*n*_{j}. Let us make the final mathematical definition of temperature, chemical potential, and pressure:

1/*T* = (∂*S*/∂*U*)_{V,N},  μ/*T* = −(∂*S*/∂*N*)_{U,V},  *p*/*T* = (∂*S*/∂*V*)_{U,N}

We have seen in the previous sections that the result of applying the constraints and observing what happens in the most probable system state is that temperature, chemical potential, and pressure equalize across species and plots. With these definitions, the fundamental thermodynamic equation for the entropy *S*(*U*, *V*, *N*) in its most probable state can be written in differential form:

*dS* = (1/*T*)*dU* + (*p*/*T*)*dV* − (μ/*T*)*dN*     (Equation 18)

From Equation 18, we can derive the other fundamental thermodynamic equations of this system (Helmholtz function, Gibbs function, enthalpy). Temperature, chemical potential, and pressure act as system forces [11], leading the system to equalize. Temperature is a descriptor of the tendency of the system to exchange quantities of the mass property among species or with the environment. Chemical potential describes the tendency of a system to exchange agents among species or among plots in an artificial world. And, finally, pressure describes the tendency of a system to increase or reduce the space occupied by agents. By equalizing these parameters, ensuring that there are more agents of one or another species or more agents in some plots than others, the system has increased the multiplicity of the macrostates of the most probable distribution, *W* = *W*(ψ_{X}∗)*W*(ψ_{C}∗), thereby increasing their probability with respect to any other previous configuration and maximizing entropy.

## 10 Conclusions

We have created a multi-agent ecosystem that evolves in time with respect to two countable properties, mass and position, by means of simple local rules. We have used the term mass in order to better illustrate the example, although it is representative of other simulated properties (wealth, energy, food, CPU time, memory, etc.). The results of the system dynamics show some distinct stable patterns, summarized in Figures 6–7 and 9 with respect to each of the properties. These plots are informative, but not as informative as an analytical expression for the global behavior and properties of the system such as the ones given by Equations 11, 16, 17, and 18, obtained using the method of the most probable distribution.

These patterns are *emergent* in the sense that they pass the emergence test outlined in [58], in that (a) the description language with which the system has been designed, consisting of the equations listed in Table 1 and some simple birth, death, and movement rules, is completely different from the language of Equations 11, 16, 17, and 18, used to describe the global behavior and properties of the system, and (b) the relationship between the two languages is not evident.

On the other hand, these patterns obviously do not correspond to what we might associate with the term disorder, as disorder would be equivalent to a flat histogram in Figures 6–7 and a homogeneous distribution in Figure 9. Rather the contrary, we have found order. However, these distributions have been correctly predicted for the system's maximum-entropy point. This confirms the initial hypothesis that the two concepts, disorder and maximum entropy, should not be linked. The fact that order emerges can be explained by the fact that the system has been subjected to a series of constraints, and this reduces the achievable maximum entropy. The constraints have created order. This is not the same as considering that the system is in nonequilibrium conditions where order is created at the expense of creating disorder in another place, as occurs in Prigogine's open systems. They are two different sources of order.

The working methodology, the method of the most probable distribution, was selected to find a mathematical characterization of the most probable system state and then study its properties. To do this, we have to try to find a mathematical formulation that describes the states through the probability *P* of each state occurring, and then use it to find the most probable state subject to the environmental conditions and constraints. This is what one does in statistical mechanics. We can work either directly with that probability or with the enumeration *W* of its microstates, or with quantities such as their respective logarithms (entropy), whichever is easiest mathematically. No matter which approach we use, however, none of the enumeration or probability formulas can be considered generally applicable; they are proper to this system. In this article, the formulations of entropy used in Sections 6 and 7 are different, and neither of the two corresponds to its best-known formulation, which is Shannon's, as explained in Section 3. None of the laws are applicable to all systems, except one: The system will tend to move from less probable to more probable states.

The Echo system [26] reveals similar species distribution patterns by area to those observed in nature. Wealth distribution histograms among agents in the Sugarscape system [16] resemble patterns in the real economy, and there are patterns of movement and spatial concentration of agents around niches of wealth production. The Avida system [4] exhibits genotype abundance distribution patterns showing universal behavior in the form of a power law. All these patterns are observed in the same manner as the patterns shown in Figures 6–7 and 9, and in some cases the mathematical modeling of the curve that they approximate is similar. But none of the above three systems gives any explanation of their source, except Avida, where an approach based on statistical mechanics is also used to explain some of its aspects [2]. In this article, however, we have explained, through the study and characterization of its most probable state using mathematical statistical-mechanistic tools, how to generate the rules or laws described by Equations 11, 16, 17, and 18 that fully describe the observed patterns and systems dynamics generally. Additionally, we have found other general rules, such as the law of mass action and allometry, and others that occur under certain circumstances, such as the system behaving like an ideal gas or there also being a self-thinning rule. All of these rules have been explained by or could be inferred from the above equations. We have been able, thanks to the method of the most probable distribution, to describe the *nature* created in the ecosystem thermodynamically. But the final conclusion is that the method of the most probable distribution is a general method applicable to any other artificial system with any property or trait whose distribution among the agents is countable.
In any case in which this is feasible and the same stable patterns of distributions emerge every time the system is run, indicating that the system always reaches the same equilibrium point, the method is applicable. Our general recommendation is to build the system by incremental steps, incorporating new rules once their implications for the mathematical form of the entropy and for the equilibrium points are studied and known. This tactic prevents the use of rules whose emerging consequences cannot be analyzed and understood. Most of the time these rules can be replaced by others with similar local effects, but with understandable and predictable emerging properties.

## References

### APPENDIX

#### A.1 Deriving the Distribution from the Temperature

The probability of any combination of values *N*_{i} is given by the multinomial distribution, subject to the constraints *N* = ∑*N*_{i} and *U* = ∑*m*_{i}*N*_{i}. We have to find the maximum of *S* = ln(*P*), where ∂*S*/∂*N*_{i} = 0. We can use the method of Lagrange multipliers and work with the following function derived from the above:

*S* = *N*ln(*N*) + ∑_{i}(*N*_{i}ln(*p*_{i}) − *N*_{i}ln(*N*_{i})) + α(*N* − ∑_{i}*N*_{i}) + β(*U* − ∑_{i}*m*_{i}*N*_{i})

where α and β are the multipliers of the respective constraints. At the maximum, ∂*S*/∂*N*_{i} = 0 ∀*i* must hold; hence

ln(*p*_{i}) − ln(*N*_{i}) − 1 − α − β*m*_{i} = 0,  *N*_{i} = *p*_{i}*e*^{−1−α}*e*^{−βm_{i}}

We calculate *e*^{−1−α} as above: summing over all *i* and imposing *N* = ∑*N*_{i} gives

*e*^{−1−α} = *N*/∑_{i}*p*_{i}*e*^{−βm_{i}}

If we now make *Q* = ∑_{i}*p*_{i}*e*^{−βm_{i}}, then *e*^{−1−α} = *N*/*Q*, and finally

*N*_{i} = (*N*/*Q*)*p*_{i}*e*^{−βm_{i}}

The value of the second multiplier β remains to be calculated. This can be done analytically using the mean value of *X*:

*U*/*N* = m̄ = (1/*Q*)∑_{i}*m*_{i}*p*_{i}*e*^{−βm_{i}}

This value can be approximated by means of integrals:

m̄ ≈ ∫*mp*(*m*)*e*^{−βm}*dm* / ∫*p*(*m*)*e*^{−βm}*dm*

Depending on the form of *p*(*m*), the above integrals sometimes have known or analytically calculable solutions, and we get m̄ = *f*(β), which can be inverted to calculate β = *f*^{−1}(*U*/*N*), where *U* and *N* are known because they are system constraints. If this is not possible and there is no analytical method, the practical option is to use a numerical function optimization method (e.g., Newton's method or the conjugate gradient method) to find the value of β such that the equation holds on *U*.

*N*

_{i}, it is interesting to see what form the entropy takes:As ∑

*N*

_{i}=

*N*, canceling equal terms, we haveIf we define temperature mathematically as 1/

*T*= (∂

*S*/∂

*U*)

_{N}, we have thatAs , then , and so the first and second terms of the temperature equation cancel each other out, and we get that 1/

*T*= (∂

*S*/∂

*U*)

_{N}= β. And therefore

#### A.2 Deriving Equations for a Mixture with Species Interchange

Replacing each *S*_{l} in the total entropy *S* = ∑_{l}*S*_{l} + *S*_{mixing} by its value given by Equation 8 and equating to Equation 12, we have

*S* = ∑_{l}(*N*_{l}ln(*Q*_{l}) + *U*_{l}/*T*_{l}) + *S*_{mixing} = *N*ln(*Q*) + *U*/*T*

If we now calculate (∂*S*/∂*U*_{l})_{N_{l}} in the second and third members of the equality, we get that

1/*T*_{1} = 1/*T*_{2} = ⋯ = 1/*T*_{L} = 1/*T*

that is, the temperatures of all the species are equal in the system at equilibrium.

Let us now define the chemical potential of each species as μ_{l}/*T* = −(∂*S*/∂*N*_{l})_{N_{j≠l},U} and calculate it termwise for the equation, taking into account that *N*_{j≠l} and *U* are now constant:

μ_{l}/*T* = −(∂*S*_{l}/∂*N*_{l}) − (∂*S*_{mixing}/∂*N*_{l}) = −ln(*Q*_{l}) − ln(*N*/*N*_{l})

Hence

μ_{l}/*T* = ln(*N*_{l}/(*NQ*_{l}))

Reordering the entropy equation, we have that *S* = −∑_{l}*N*_{l}ln(*N*_{l}/(*NQ*_{l})) + *U*/*T*, and so

*S* = −∑_{l}*N*_{l}μ_{l}/*T* + *U*/*T*

And as, according to the mass action law, *N*_{l}/*Q*_{l} = *N*/*Q*, then μ_{l}/*T* has the same value for all the species, and consequently μ_{l}/*T* = ln(*N*_{l}/(*NQ*_{l})) = ln(1/*Q*). If we now define *Q* = ∑_{l}*Q*_{l} and calculate μ/*T* = −(∂*S*/∂*N*)_{U} for Equation 12, obtaining μ/*T* = −ln(*Q*), we will finally have

μ_{1} = μ_{2} = ⋯ = μ_{L} = μ

that is, when the system is balanced at its point of equilibrium, the chemical potentials of the species are, like the temperatures, also equal.

#### A.3 Deriving the Equations for Pressure and Chemical Potential in a Spatial Distribution

The entropy of the spatial distribution (Equation 4) is, applying Stirling's approximation,

*S* = ∑_{i}(*C*_{i}ln(*C*_{i}) − *n*_{i}ln(*n*_{i}) − (*C*_{i} − *n*_{i})ln(*C*_{i} − *n*_{i}))     (Equation A)

and at the maximum its total differential must be zero:

*dS* = ∑_{i}((∂*S*/∂*n*_{i})*dn*_{i} + (∂*S*/∂*V*_{i})*dV*_{i}) = 0     (Equation B)

If we define the chemical potential per plot as μ_{i}/*T* = −(∂*S*/∂*n*_{i})_{V_{i}} and differentiate Equation A, we get

μ_{i}/*T* = ln(*n*_{i}/(*C*_{i} − *n*_{i}))

Using this result in Equation B leaves

*dS* = −∑_{i}(μ_{i}/*T*)*dn*_{i} + ∑_{i}(∂*S*/∂*V*_{i})*dV*_{i}

Let us now define pressure mathematically as *p*_{i}/*T* = (∂*S*/∂*V*_{i})_{n_{i}}. We find that if *V*_{i} = *vC*_{i},

*p*_{i}/*T* = (1/*v*)(∂*S*/∂*C*_{i})_{n_{i}}

Differentiating Equation A, we have

(∂*S*/∂*C*_{i})_{n_{i}} = ln(*C*_{i}/(*C*_{i} − *n*_{i}))

Hence

*p*_{i}/*T* = (1/*v*)ln(*C*_{i}/(*C*_{i} − *n*_{i}))

which, substituted into Equation B, gives

*dS* = −∑_{i}(μ_{i}/*T*)*dn*_{i} + ∑_{i}(*p*_{i}/*T*)*dV*_{i}

At its maximum point, agent interchange or mobility between plots should have a zero effect on the entropy value, and so (∂*S*/∂*n*_{i})_{V} = 0 ∀*i*. To calculate the derivative, we have to use Equation A with a Lagrange multiplier for the constraint *N* = ∑*n*_{i}:

*S*′ = *S* + α(*N* − ∑_{i}*n*_{i})

Differentiating this equation, we have

ln((*C*_{i} − *n*_{i})/*n*_{i}) − α = 0

which yields *n*_{i} = *C*_{i}/(*e*^{α} + 1). Summing over all the plots, we have that *N* = *C*/(*e*^{α} + 1), and hence *e*^{α} + 1 = *C*/*N*. Hence

*n*_{i}/*C*_{i} = *N*/*C* ∀*i*,  μ_{1} = μ_{2} = ⋯ = μ_{N_{C}},  *p*_{1} = *p*_{2} = ⋯ = *p*_{N_{C}}

that is, when the system is balanced or at equilibrium, the densities of agents per plot or per volume unit, the chemical potentials, and the pressures of the plots are, like the temperatures and chemical potentials of Section 7, also equal.

## Author notes

Facultad de Informática, Universidad Politécnica de Madrid, 28660 Boadilla del Monte (Madrid), Spain. E-mail: fsegovia@fi.upm.es