## Abstract

To study open-ended coevolution, we define a complexity metric over interacting finite state machines playing formal language prediction games, and study the dynamics of populations under competitive and cooperative interactions. In the past, purely competitive and purely cooperative interactions have been studied extensively, but neither can successfully and continuously drive an arms race. We present quantitative results using this complexity metric and analyze the causes of varying rates of complexity growth across different types of interactions. We find that while both purely competitive and purely cooperative coevolution are able to drive complexity growth above the rate of genetic drift, mixed systems with both competitive and cooperative interactions achieve significantly higher evolved complexity.

## 1 Introduction

The field of artificial life has sought to bottle the magic of evolution and reproduce its font of complexity in silico. If simulated evolution could be made to exhibit the same properties as natural evolution, it too could produce a continuing supply of novel solutions to problems posed to it. However, the quest for open-ended artificial evolution poses a difficult question: Which aspects of natural evolution are necessary to make it work, and which are optional? Artificial evolution is an imitation of natural evolution, and an imperfect one at that. This is both because we cannot hope to perfectly match the conditions of natural evolution (they are too difficult to reconstruct, and too expensive to simulate), and because we wish to generate new forms and solutions, not merely replay the tape of life. To this end, it is helpful to try to isolate the core functionality of natural evolution in a simple test bed. In this way, we can identify the minimal set of criteria needed to achieve open-ended evolution.

Unfortunately, attempts at recreating the success of natural evolution have shown that the process is not as simple as a pithy Darwin quote might make it sound. The pitfalls that nature seems to effortlessly avoid are persistent thorns in the side of artificial life researchers. Simulations that appear to offer pathways to continuous evolution instead show limited progress and mediocre results. Where natural evolution generates a vast diversity of forms, simulated evolution often sees a single dominant form take over the entire population. Where natural evolution creates a steady stream of novel strategies, simulated evolution sees populations fall into an inescapable cycle of a few.

To make matters worse, it can be difficult to monitor whether these problems are even occurring. Consider a pair of players learning to play chess against one another—if each improves at the same rate, their winning percentages will remain constant, the same result as if neither had improved at all. So it is as well with the fitness of evolving populations; progress can be difficult to spot. On the other hand, if one player's win rate starts improving, it may not indicate that they have learned anything useful; it could just as well indicate that their opponent is getting worse.

These difficulties would be irrelevant if we could simply measure the thing we actually care about: complexity. However, complexity is difficult to define, let alone quantify. We recognize complexity when we see it, but the question of giving a universally applicable definition has remained open. Instead, there exist many different definitions of complexity, each applying to its own specialized class of objects.

In this work, we present evolutionary simulations using a representation for which a natural, quantitative complexity metric is available. While finite state machines are necessarily restricted in expressive power, they allow us to conduct experiments that would be difficult or impossible with more powerful (e.g., programmatic) models. Our goal is to derive insights from these experiments, which can then be applied to settings in which complexity measurement is not practical.

In particular, we will examine the role of cooperation and competition between species in the development of complex strategies. Both cooperation and competition have been hypothesized as potential sources of complexity growth and as potential roadblocks to it. Competition is envisioned as driving arms races of complexity growth, but also as stymieing growth by driving populations towards mediocre stable states. Cooperation offers the promise of symbiotic development, but also the threat of collapse to trivial strategies and lack of selection pressure.

We will demonstrate these undesirable behaviors in purely competitive and purely cooperative models, but show that combinations of both competition and cooperation are able to overcome them, generating continuous complexity growth. We will explore a variety of what we term *ecosystem topologies*, which define the structure of interactions between species, and we seek to understand what sorts of properties lead to successful ecosystems.

## 2 Complexity

In this work, we are concerned with achieving a continuous trend of complexity growth. Of course, “complexity” is a difficult concept to define.^{1}

Perhaps the simplest definition of complexity is to say that larger things are more complex. Longer genomes are more complex than shorter genomes, bigger organisms are more complex than smaller organisms, and so forth. At the very least, this seems to (superficially) match our intuition—we feel that humans are more complex than bacteria.^{2} This metric is employed by [13], in which the length of a strategy is taken as an indication of its complexity. While this metric may be imperfect, it at least serves as a reasonable rough guide.

Another important measure of complexity is *entropy*—more complex objects contain more information. By this metric, we can have very large genomes which contain very little information (for example, a sequence that repeats the same symbol a million times is very long, but has very low entropy). [18] employs an entropy metric to evaluate population-level complexity, in which diverse populations are taken to be more complex than uniform populations. A potential downside of entropy as a complexity measure is that completely random sequences will have high entropy, which does not match with our intuitive understanding of the concept. [3] argues for a refinement of entropy-as-complexity in which the complexity of a genome is the amount of information it stores *about its environment*. In other words, complex organisms are complex because they are specifically adapted to the environment in which they exist.
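The repeated-symbol example can be checked numerically. The following is a minimal sketch that assumes the simplest reading of entropy, namely the Shannon entropy of the symbol frequencies in a sequence; the function name `shannon_entropy` is illustrative and does not come from the cited works.

```python
from collections import Counter
from math import log2


def shannon_entropy(sequence) -> float:
    """Shannon entropy, in bits per symbol, of the symbol frequencies."""
    counts = Counter(sequence)
    n = len(sequence)
    # Sum of p * log2(1 / p) over the observed symbols.
    return sum((c / n) * log2(n / c) for c in counts.values())


long_but_simple = "0" * 1_000_000
print(shannon_entropy(long_but_simple))  # a single repeated symbol carries no information
```

Note that this frequency-based estimate ignores symbol order, so a perfectly regular sequence such as `"01" * 500` still scores one bit per symbol.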

Theory of computation provides a third set of complexity definitions, based on the difficulty of producing a sequence in a particular computational model. Kolmogorov complexity measures the complexity of a sequence (or other mathematical object) in terms of the size of the smallest computer program^{3} that generates that sequence [12]. A sequence of a single symbol repeated a million times can be generated by a very short program, whereas a complex sequence without repeating patterns or other regular structures may require a very long program. Unfortunately, Kolmogorov complexity cannot be computed in the general case, as determining whether a program is the absolutely minimal generator for a particular sequence is not decidable. Further, random sequences have very high Kolmogorov complexity, which does not match our intuitive understanding of biological complexity, as argued by [2]. Despite these difficulties, Kolmogorov complexity can be a useful baseline from which to develop other metrics, as in [14].

We can also define complexity in terms of other, less powerful computational models. For example, the *state complexity* of a regular language is the size of the minimal finite state machine that represents that language [15]. Unlike the task of finding a minimal computer program for Kolmogorov complexity, the task of finding a minimal finite state machine is not only computable, but tractable [11], and it removes both bloat and duplication, which are rampant in models like genetic programming. Further details are given in [20].

Our goal in this work will be to present a domain in which a natural, tractable complexity metric exists. We will describe experiments that measure this complexity metric in the hope of deriving principles that can then be extended to domains that do not admit a clear complexity metric.

## 3 Model

### 3.1 Linguistic Prediction Game

We present a simple prediction game, which we term the *linguistic prediction game*. In this game, two players iteratively generate symbols (in this work, 0 or 1), and each is awarded points in each round depending on their symbol and the symbol of the other player. Symbol generation in each round is simultaneous: neither player may know what symbol the other has chosen to produce. However, a player may condition their next symbol on the history of symbols played so far by both players in the current game.

Many different payoff matrices can be defined, but we concern ourselves with a few simple matrices that model cooperation and competition. In these games, each player either seeks to produce a symbol that matches that of the other (in other words, both produce zero or both produce one), or seeks to produce a symbol that mismatches that of the other (in other words, one player produces zero and the other produces one).^{4} If the goals of the two players are aligned, meaning they both wish to match or both wish to mismatch, then we say the game is cooperative, and if their goals are opposed, then we say the game is competitive.

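The total score of a player is the sum of its per-round rewards:

$$S(p) = \sum_{i=1}^{r} \mathit{reward}(p, i)$$

where *S*(*p*) indicates the total score of player *p*, *r* the number of rounds played, and *reward*(*p*, *i*) the reward received by player *p* in round *i*.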

We consider two specific types of game matrices, as shown in Figure 1, which we term coop and comp. coop is a cooperative game in which both players receive a point if they output the same symbol. In this game, players must learn to coordinate their outputs to receive a maximal score. Players seek to establish a shared convention (a language) to which they can both adhere. comp, on the other hand, is a competitive game in which one player receives a point if they output the same symbol, and the other receives a point if they output different symbols. Here each player must try to predict the pattern of their opponent while simultaneously preventing prediction of their own pattern.

Note that there are two additional isomorphic formulations of each game—coop also occurs when both players seek to output different symbols, and comp also occurs when the roles of the two players are swapped. In our experiments, the choice of which form of each game to use has had no effect on the results.
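The two game matrices can be written as small reward functions. This is a minimal sketch assuming one point for a satisfied goal and zero otherwise, since the text says only that a player "receives a point"; the function names are illustrative.

```python
def coop_rewards(a: int, b: int) -> tuple[int, int]:
    """Cooperative game: both players score a point when their symbols match."""
    point = 1 if a == b else 0
    return point, point


def comp_rewards(a: int, b: int) -> tuple[int, int]:
    """Competitive game: the first player scores on a match, the second on a mismatch."""
    return (1, 0) if a == b else (0, 1)


def coop_mismatch_rewards(a: int, b: int) -> tuple[int, int]:
    """Isomorphic form of coop in which both players seek to output different symbols."""
    point = 1 if a != b else 0
    return point, point
```

In the comp game exactly one player scores in every round, so the two rewards always sum to one.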

### 3.2 Multi-game Ecosystems

We extend this linguistic prediction game model by embedding multiple instances of it in a network of interacting species, which we term an *ecosystem*. An ecosystem consists of a set of species and a set of pairwise connections between them. Each connection is labeled with a game matrix, which governs the interaction between members of those two species. In this way, a single organism may interact with other organisms of multiple different species, and will potentially play a different version of the game with each of those species.

We examine several of these ecosystem models here, as shown in Figure 2. The first is a trivial baseline model with a single species and no games being played. This serves as a control data point that allows us to understand the complexity that tends to arise only from a random walk through mutations. The next two models are a pair of two-species models that are fully cooperative (playing the coop game) and fully competitive (playing the comp game). These two models allow us to examine the behavior that occurs when the players have either aligned or opposing goals. We also consider an ecosystem that features a mix of cooperative and competitive games, which we term *three-species mixed*. This ecosystem consists of three species, *A*, *B*, and *C*. *A* plays the comp game against *B* and the coop game against *C*, and there is no direct interaction between *B* and *C*. This system is inspired by the three-player pursuit-evasion game introduced by Ficici et al. [8]. For comparison with this three-species system, we also examine a purely competitive three-species system in which *A* competes with *B* and with *C*, but with opposite goals; that is, *A* attempts to match *B* but mismatch *C*. We term this system *three-species comp*. Finally, we consider a four-species extension of Ficici's game in which species *A* and *B* compete with each other, and each has a cooperative partner, *C* or *D* respectively. *A* and *C* play coop, as do *B* and *D*. We term this system *four-species mixed*.

Critically, in ecosystems in which one individual may interact with multiple other species, there is no signal or indication of which such other species is being interacted with. For example, in the three-species mixed system, individuals of species *A* will not know ahead of time whether they are interacting with a member of species *B*, and therefore playing comp, or a member of species *C*, and therefore playing coop. To the extent that a member of species *A* exhibits different behavior in these two situations, it must do so by learning to recognize the species of its opponent based on its opponent's outputs, and adjust accordingly.
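The ecosystem topologies can be encoded as labeled edge lists. The sketch below follows the description above, in which *A* plays comp against *B* and coop against *C* in the three-species mixed system. The dictionary layout, the key names, and the choice of which species is the matcher in the mixed systems are assumptions made for illustration.

```python
# Each game is (species_1, species_2, kind). For "comp", species_1 is assumed
# to be the matcher and species_2 the mismatcher; for "coop" order is irrelevant.
ECOSYSTEMS = {
    "control": [],
    "two_species_coop": [("A", "B", "coop")],
    "two_species_comp": [("A", "B", "comp")],
    "three_species_mixed": [("A", "B", "comp"), ("A", "C", "coop")],
    # A attempts to match B but to mismatch C.
    "three_species_comp": [("A", "B", "comp"), ("C", "A", "comp")],
    "four_species_mixed": [("A", "B", "comp"), ("A", "C", "coop"), ("B", "D", "coop")],
}


def games_faced(ecosystem: str, species: str) -> list[str]:
    """Kinds of game a species takes part in within the given ecosystem."""
    return [kind for s1, s2, kind in ECOSYSTEMS[ecosystem] if species in (s1, s2)]
```

An organism never receives the edge label as input, so a member of *A* in the mixed system has to infer which game it is playing from its opponent's outputs alone.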

### 3.3 Organism Model

Individual organisms are represented as finite state machines in which each state is labeled with a symbol to be produced (either zero or one), and each state has an outgoing transition link for each symbol. State transitions occur according to the symbols produced by the other player in the game. Additionally, each finite state machine has a special initial state in which it will start at the beginning of each game. Thus, an organism can be represented as a tuple (*S*, *T*, *L*, *s*_{0}), where *S* is a set of states, *T* : *S* × {0, 1} → *S* defines state transitions for each state in *S* for both 0 and 1, *L* : *S* → {0, 1} defines the output symbol for each state in *S*, and *s*_{0} indicates the initial state.
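One possible encoding of this tuple is sketched below. It assumes that states are numbered 0 to |*S*| − 1, so that *S* is implicit in the length of the label list; the class and field names are illustrative rather than taken from the original implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Organism:
    transitions: tuple[tuple[int, int], ...]  # T: transitions[s][x] is the next state
    labels: tuple[int, ...]                   # L: labels[s] is the symbol produced in s
    start: int = 0                            # s_0

    def output(self, state: int) -> int:
        return self.labels[state]

    def next_state(self, state: int, opponent_symbol: int) -> int:
        return self.transitions[state][opponent_symbol]


# A single state that outputs 0 and always returns to itself.
MINIMAL = Organism(transitions=((0, 0),), labels=(0,))
# Two states; after the first round it repeats the opponent's previous symbol.
COPIER = Organism(transitions=((0, 1), (0, 1)), labels=(0, 1))
```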

To simulate a game, each player is first set to its initial state *s*_{0} ∈ *S*.^{5} Then, each player produces the symbol indicated by the label on its current state, and performs the state transition indicated by the symbol produced by the opponent. To calculate the score received by each player, this process is repeated until the two players enter a loop (note that players are fully deterministic and possess a finite number of internal states, and therefore a pair of players must enter a finite-length loop after a finite number of rounds). To detect loops, we track the pair of states occupied by the two players in each round, and wait for the system to return to a pair of states it has visited before. Each player's score for the game is then its average score over the loop. Note that this means interactions before the loop is entered (i.e., in the transient state of the system prior to its stable cycle) do not contribute to the score, as their contribution vanishes in the limit of an infinite game. Because loop detection is performed at the level of states, a loop may include a repeating pattern of outputs with periodicity much smaller than the length of the loop itself. Thus, the length of the loop may not correlate with the complexity of the underlying players. Players that transition between many different states with the same output may enter long loops in which simple behavior is displayed.

As an example, consider the two organisms pictured in Figure 3. Let the first organism be *A*, with states *A*_{0} and *A*_{1}, and the second be *B*, with states *B*_{0} and *B*_{1}. A game between these two players will begin in states {*A*_{0}, *B*_{0}}, and will proceed as {*A*_{0}, *B*_{0}} → {*A*_{1}, *B*_{0}} → {*A*_{1}, *B*_{1}} → {*A*_{0}, *B*_{1}} → {*A*_{1}, *B*_{0}}. At this point, the game has entered a loop, and the score can be calculated using a game matrix.
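The simulation procedure can be sketched as follows. Because the machines of Figure 3 are not reproduced here, the example uses two hypothetical two-state players, `COPIER` and `INVERTER`. The plain `(transitions, labels, start)` tuple layout and all function names are assumptions made for illustration.

```python
def coop(a, b):
    return (1, 1) if a == b else (0, 0)


def comp(a, b):
    # The first player is the matcher, the second the mismatcher.
    return (1, 0) if a == b else (0, 1)


def play(first, second, reward):
    """Return each player's average reward over the loop that the game enters.

    Each player is (transitions, labels, start), where transitions[s][x] is
    the next state after the opponent plays symbol x while the player is in s.
    """
    (t1, l1, s1), (t2, l2, s2) = first, second
    first_seen = {}   # (state of first, state of second) -> round index
    rewards = []      # reward pair for every round played so far
    while (s1, s2) not in first_seen:
        first_seen[(s1, s2)] = len(rewards)
        x1, x2 = l1[s1], l2[s2]            # simultaneous outputs
        rewards.append(reward(x1, x2))
        s1, s2 = t1[s1][x2], t2[s2][x1]    # each reacts to the other's symbol
    loop = rewards[first_seen[(s1, s2)]:]  # drop the transient prefix
    n = len(loop)
    return sum(r[0] for r in loop) / n, sum(r[1] for r in loop) / n


COPIER = ([(0, 1), (0, 1)], [0, 1], 0)    # repeats the opponent's last symbol
INVERTER = ([(1, 0), (1, 0)], [0, 1], 0)  # plays the opposite of the opponent's last symbol
```

With these two players, `play(COPIER, INVERTER, comp)` enters a loop of four state pairs and evaluates to `(0.5, 0.5)`: the matcher and the mismatcher each win half of the rounds in the loop.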

#### 3.3.1 Mutation

We further define a set of mutation operators, which alter the structure of an organism.^{6} These are:

- Add a new state.
- Remove a state.
- Redirect a transition link.
- Change the symbol produced in a state.

Adding a new state results in a state with randomly assigned outgoing transitions (possibly pointing to itself), and with a random output symbol. Note that this new state will not yet be the target of any external transitions—it must later be linked by redirection mutation. Removing a state requires randomly selecting new destinations for every link that previously pointed to the deleted node. Further, if the deleted node was the initial state, a new initial state is randomly selected from the remaining nodes. If the organism has only one state, a deletion mutation instead has no effect. A redirection mutation simply selects a random transition link and selects for it a new destination. This may result in making some nodes unreachable, or making some previously unreachable nodes reachable. Changing the symbol produced by a state is self-explanatory.
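The four operators can be sketched as a single function. This sketch assumes states are list indices, that a redirection may happen to reselect the current destination, and that changing a symbol means flipping it; the function name and the list-based layout are illustrative.

```python
import random


def mutate(transitions, labels, start, rng):
    """Apply one uniformly chosen mutation; returns (transitions, labels, start)."""
    t = [list(row) for row in transitions]
    l = list(labels)
    n = len(l)
    kind = rng.choice(["add", "remove", "redirect", "relabel"])
    if kind == "add":
        # Random outgoing links (possibly to itself); nothing points to it yet.
        t.append([rng.randrange(n + 1), rng.randrange(n + 1)])
        l.append(rng.randrange(2))
    elif kind == "remove" and n > 1:       # no effect on a one-state organism
        dead = rng.randrange(n)
        del t[dead], l[dead]

        def fix(target):
            # Links into the deleted state get a random surviving destination;
            # indices above the deleted state shift down by one.
            if target == dead:
                return rng.randrange(n - 1)
            return target - 1 if target > dead else target

        t = [[fix(x) for x in row] for row in t]
        start = fix(start)                 # a deleted initial state is re-drawn
    elif kind == "redirect":
        t[rng.randrange(n)][rng.randrange(2)] = rng.randrange(n)
    elif kind == "relabel":
        s = rng.randrange(n)
        l[s] = 1 - l[s]
    return t, l, start
```

Passing an explicit `random.Random` instance as `rng` keeps runs reproducible.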

### 3.4 Evolutionary Simulation

An evolutionary simulation consists of the following steps. First, for each species in an ecosystem, a population of organisms is initialized with minimal finite state machines—a single state with a random label, and transition links leading back to that single state. We then simulate a series of generations. Each generation consists of an evaluation period, in which individuals interact according to the games specified by the ecosystem to accumulate fitness, and a reproduction period, in which individuals produce offspring, which form the populations of the next generation.

In the evaluation period, for each pair of interacting species, an all-versus-all set of games is played, as shown in Figure 4. That is, if species *A* plays the coop game with species *B*, each member of *A* will play that game with each member of *B*. For each organism, its total fitness across all interactions is tallied.

In the reproduction stage, for each species, a new population is generated using fitness-proportional selection. That is, a number of offspring are created equal to the population size of each species, where the parent of each offspring is chosen at random from the current population of that species, with probability proportional to each individual's fitness accumulated during the evaluation period. Offspring are then subjected to mutation as described previously. In this work, we use a constant mutation rate of one mutation per offspring, with all four types being equally likely. Once the new population has been generated, the previous generation is discarded.
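Fitness-proportional selection can be sketched with the standard library. The fallback to uniform choice when every fitness is zero is an assumption, since the text does not say how that case is handled; it also matches the control species, in which all organisms are equally likely to reproduce. The function name is illustrative.

```python
import random


def select_parents(population, fitnesses, rng):
    """Draw one parent per offspring slot, with probability proportional to fitness."""
    if sum(fitnesses) <= 0:
        # Assumed fallback: with no fitness signal, every parent is equally likely.
        return [rng.choice(population) for _ in population]
    return rng.choices(population, weights=fitnesses, k=len(population))
```

Each selected parent would then be copied and passed through one mutation before joining the next generation.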

### 3.5 Complexity Metric

Our goal in this work is to measure the growth of complexity of organisms over the course of an evolutionary simulation. To do so, we require a quantitative metric of complexity. A simple approach would be to count the number of nodes and transitions in a particular organism, and define those with more nodes to be more complex. However, consider the example of an organism with a great many nodes, all of which are labeled to produce the symbol 0. This simple metric would label this organism as very complex, but its behavior would be identical to that of an organism with a single node. This is clearly unsatisfactory. For an organism with a large network to be truly complex, all of its nodes should be essential to its strategy.

We take inspiration from the concept of Kolmogorov complexity, which (informally) defines the complexity of a string as the length of the shortest computer program that generates that string. We define the complexity of an organism as the size of the smallest organism that produces an identical strategy. Thus, the large organism in the example above would have a very small complexity, as its strategy can be produced by an organism with a single node. Similarly, unreachable nodes will not contribute to the complexity of an organism, as those nodes could be removed without altering the strategy.

To calculate the minimal organism that produces the same strategy, we turn to automata theory. A slight modification of Hopcroft's algorithm for minimization of deterministic finite state machines [11] allows us to efficiently compute the smallest equivalent organism. As in the base algorithm, we begin by removing all disconnected nodes. We then apply the partition-refinement process, but instead of starting with an initial partition of accepting and non-accepting states as in Hopcroft's algorithm, we begin with a partition of nodes by output label. From there, a straightforward application of the standard algorithm produces a minimal equivalent machine. The total number of nodes in this minimized machine is then used as our complexity metric for that organism.
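The metric can be sketched with a simpler relative of Hopcroft's algorithm. The code below uses Moore-style iterative partition refinement, which yields the same minimal state count but lacks Hopcroft's worklist optimization and is therefore asymptotically slower. The function name and the list-based organism layout are assumptions made for illustration.

```python
def complexity(transitions, labels, start):
    """Number of states in the smallest machine with the same strategy."""
    # Step 1: keep only states reachable from the initial state.
    reachable, stack = {start}, [start]
    while stack:
        s = stack.pop()
        for nxt in transitions[s]:
            if nxt not in reachable:
                reachable.add(nxt)
                stack.append(nxt)

    # Step 2: start from a partition of the states by output label.
    block = {s: labels[s] for s in reachable}
    while True:
        # Two states stay together only if they share a block and their
        # successors on symbols 0 and 1 fall into the same blocks.
        ids = {}
        refined = {}
        for s in reachable:
            signature = (block[s], block[transitions[s][0]], block[transitions[s][1]])
            refined[s] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)  # no block was split: the partition is stable
        block = refined
```

For example, a four-state machine whose states all output 0 has complexity 1, and states that cannot be reached from the initial state contribute nothing.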

## 4 Experiments and Results

For each of the six ecosystems described above, we perform 20 experiments of 10,000 generations each, with each species having a population size of 50. For each generation, the complexity (the size of the minimal equivalent machine) of each organism is computed, and we record the median complexity for each population. We use the median to avoid giving undue influence to a single complex mutant that does not manage to spread its complexity throughout the population. We then take the mean of these medians over the 20 experiments to determine the characteristic trajectory of complexity over time for each ecosystem. To analyze the trajectory of complexity growth within each species at the end of the simulation, we compute a line of best fit over the complexity values for the last 2,500 generations of each experiment, using the least-squares method, and report the slope of this line. Species that had sustained complexity growth through the end of the experiment display a positive slope, while those whose complexity growth had slowed show a near-zero slope. We will present results based on this trajectory, as well as the final level of complexity at the end of 10,000 generations. Figure 5 shows the trajectories of average complexity for each ecosystem. Table 1 summarizes final and maximal average complexity values as well as complexity trend-line slopes for each ecosystem.
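The summary statistics can be sketched as follows. The code assumes one complexity value per generation and expresses the slope per 1,000 generations; the function names and default arguments are illustrative.

```python
from statistics import mean, median


def mean_of_medians(runs):
    """runs[k][g] holds the complexities of one population in run k at generation g."""
    generations = len(runs[0])
    return [mean(median(run[g]) for run in runs) for g in range(generations)]


def trend_slope(values, window=2500, per=1000):
    """Least-squares slope over the last `window` values, scaled to `per` generations."""
    ys = values[-window:]
    n = len(ys)
    x_bar = (n - 1) / 2
    y_bar = mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(ys))
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    return per * sxy / sxx
```

A series that gains 0.002 nodes per generation has a trend of about 2 nodes per 1,000 generations, while a flat series has a trend of 0.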

| Species | Final complexity | Max. complexity | Complexity trend |
|---|---|---|---|
| Control A | 12.44 | 14.24 | 0.006 |
| Coop A | 18.28 | 19.78 | 1.380 |
| Coop B | 18.74 | 22.52 | 1.645 |
| Comp A | 16.00 | 19.88 | 0.483 |
| Comp B | 16.02 | 20.70 | −0.865 |
| 3-Mix A | 26.18 | 29.12 | −0.082 |
| 3-Mix B | 20.54 | 23.14 | 0.718 |
| 3-Mix C | **41.72** | **41.72** | 0.568 |
| 3-Comp A | 13.76 | 16.10 | 0.141 |
| 3-Comp B | 14.02 | 16.48 | −0.127 |
| 3-Comp C | 13.08 | 16.16 | −0.259 |
| 4-Mix A | 26.68 | 29.92 | 0.140 |
| 4-Mix B | 23.82 | 26.22 | 1.374 |
| 4-Mix C | 38.40 | 39.78 | 3.089 |
| 4-Mix D | 36.10 | 37.98 | **3.767** |

Notes. The largest values for each column are in bold. Complexity trend values are the rate of growth in complexity per 1,000 generations, and are calculated over the final 2,500 generations.

### 4.1 Single Species (Control)

In this ecosystem, we have only a single species, which experiences no selective pressure—all organisms are equally likely to reproduce. Thus, the only driver of increased complexity is the random walk of genetic drift. Because this walk is bounded below at 1, we should expect a slight upward trend over time. This is indeed what we observe, with the populations achieving an average median complexity of about 12 nodes after 10,000 generations, with a high-water mark of 14.24 reached after 7,589 generations. By the end of the simulation, the control species had reached a near-zero rate of complexity growth. This will serve as a baseline for comparison with other systems—complexity growth above this rate should be the result of selective pressure.

### 4.2 Two-Species Cooperative

This ecosystem shows the behavior of two species with a purely cooperative interaction. Individuals that are able to consistently match the output of the other species are favored for reproduction. We find that this system is more amenable to complexity growth than the control—both in terms of its final average complexity and in terms of its rate of growth. The two species reach final complexities of about 18 nodes after 10,000 generations, with highs of 19.78 and 22.52 reached after 6,388 and 8,507 generations, respectively. By the end of the simulation, both species showed sustained, although slow, complexity growth at a rate of around 1.5 additional nodes per 1,000 generations. In some sense, this growth of complexity is surprising—the fully cooperative game can be played perfectly by two organisms with a single state, so it seems that extra states would be superfluous. From manual inspection of organism controllers, we observe that there is pressure towards strategies that are tolerant of mutations of their opponents. For example, consider a population of single-node organisms that all output 0. In each generation, there will likely be some offspring that have undergone a label-flipping mutation and instead output 1. Interaction with these mutants would result in zero fitness for the all-0 players. However, a two-state organism that always copies the last move of its opponent would achieve maximum fitness against both the original all-0 players and the all-1 mutants. Similar interactions drive further complexity development.
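The mutation-tolerance argument can be checked directly. The sketch below assumes the coop game with one point per matching round and computes the long-run match rate between pairs of hypothetical machines; `match_rate` and the machine constants are illustrative names.

```python
def match_rate(a, b):
    """Fraction of rounds in the eventual loop in which both machines output the same symbol."""
    (ta, la, sa), (tb, lb, sb) = a, b
    first_seen, matches = {}, []
    while (sa, sb) not in first_seen:
        first_seen[(sa, sb)] = len(matches)
        matches.append(la[sa] == lb[sb])
        # Both right-hand sides use the states from before the update.
        sa, sb = ta[sa][lb[sb]], tb[sb][la[sa]]
    loop = matches[first_seen[(sa, sb)]:]
    return sum(loop) / len(loop)


ALL_ZERO = ([(0, 0)], [0], 0)
ALL_ONE = ([(0, 0)], [1], 0)         # the label-flipped mutant
COPIER = ([(0, 1), (0, 1)], [0, 1], 0)

for partner in (ALL_ZERO, ALL_ONE):
    print(match_rate(ALL_ZERO, partner), match_rate(COPIER, partner))
```

The all-0 player matches other all-0 players in every round but never matches the all-1 mutant, whereas the copier matches both in every round of the loop. Its single mismatched opening round against the mutant falls in the transient and is not scored.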

### 4.3 Two-Species Competitive

This ecosystem is the mirror of the cooperative environment above: the two species compete to achieve opposing goals, with one receiving fitness for matching output symbols and the other for mismatching. Here we observe a very fast initial growth of complexity, with species reaching an average complexity of 10 nodes after only about 400 generations. By comparison, the purely cooperative system took 1,400 generations to reach this level, and the control took over 3,200 generations. However, this fast initial growth is not sustained. The system reaches final complexity values of about 16 nodes, slightly lower than those achieved by the cooperative model, with highs of 19.88 and 20.70. The final complexity trend lines show a modestly increasing trajectory for one species (about 0.5 additional nodes per 1,000 generations) and a modestly decreasing trajectory for the other (about −0.9 nodes per 1,000 generations). This system appears to fall victim to a form of mediocre stable state as described by [7], in which the system ceases to make progress as measured by an external metric (in our case, complexity). Instead, the species are driven into a cyclical pattern of what Ficici terms *convention-chasing*. That is, species spend all of their available mutational change trying to keep up with the changes of their opponents, rather than becoming more complex. In particular, selection seems to favor cycles of label-changing mutations, which push the populations through variants of the same core strategies but with different output symbols.

### 4.4 Three-Species Mixed

This ecosystem contains three species with both cooperative and competitive interactions. Unlike the previously discussed systems, the roles of the species are not symmetric. Species *A* faces both cooperative and competitive interactions, species *B* faces only a competitive interaction (its interaction is locally equivalent to those in the two-species competitive model), and species *C* faces only a cooperative interaction (locally equivalent to those in the two-species cooperative model). We observe significantly higher complexity values in this system than found in any of the two-species models. Species *A* reaches a final complexity of about 26 nodes, with a high of 29.12. Species *B* reaches a final complexity of about 20 nodes, with a high of 23.14, and species *C* reaches a final complexity of about 41 nodes, with a high of 41.72. On the other hand, the final complexity trajectories show that growth has slowed to less than one additional node per 1,000 generations for all three species by the end of the simulation. Species *C* in particular appears to suffer a period of regression around the 8,000th generation that arrests its otherwise consistent complexity growth. The exact cause of this regression is unclear.

The complexity values for *B* and *C* are both higher than those found in the two-species competitive and cooperative games respectively, although the values for *B* are only slightly so. The bulk of the increase in complexity seems to occur from the interaction between *A* and *C*, with both significantly exceeding the complexity levels observed in the purely cooperative game. The mechanism driving this growth seems to be similar to that in the cooperative game—*C* is driven to develop tolerance against changes in the behavior of *A*. However, the changes in the behavior of *A* are no longer merely the result of genetic drift; they are driven by selection pressure exerted by *B*.

### 4.5 Three-Species Comp

It might fairly be asked whether the increase in complexity shown in the previous system was merely due to the introduction of a third species; perhaps more players result in more complexity. To test this hypothesis, we consider a system with the same interactions, except that the game between *A* and *C* has changed to comp. Here we find no increase in complexity above that observed in the two-species case. In fact, the convergence to mediocre stability is even stronger, with all three species quickly reaching a point of stagnant complexity. The three species reach final complexities of 13.8, 14.0, and 13.1, respectively, and highs of 16.1, 16.5, and 16.2. All of these values are slightly below those observed in the two-species competitive case, and well below the levels observed in the three-species mixed system. Further, all three species display final trajectory trend lines with slopes very close to zero. We therefore conclude that the interaction of competition and cooperation is driving the increase in complexity, not merely the presence of a third species.

### 4.6 Four-Species Mixed

The four-species mixed ecosystem is formed by adding an additional species to the three-species mixed system, such that both *A* and *B* have a cooperative partner. Here we find that *B* attains similar complexity to that of *A* in the three-species mixed system, and *D* attains similar complexity to that of *C*. In particular, *A* and *B* attain final complexities of 26.7 and 23.8 nodes, respectively, with highs of 29.9 and 26.2. *C* and *D* attain final complexities of 38.4 and 36.1 and highs of 39.8 and 38.0. We also see that species *C* and *D* sustain final complexity trajectories well above that observed in any other simulations, both with rates of more than 3 additional nodes per 1,000 generations.

## 5 Discussion

Our experiments reveal that coevolution is able to drive complexity growth above that expected from genetic drift—indeed, all of our coevolutionary systems achieve higher final and maximal average complexity values than the control. However, the trajectories of complexity growth vary significantly depending on the nature of the coevolutionary interaction. Competitive coevolution is able to drive extremely fast initial complexity growth, but then settles into stagnation. Cooperative coevolution, somewhat surprisingly, is also able to drive complexity growth, but does so at a slower, steadier pace than competition.

Extension to three-species systems reveals that a mix of competitive and cooperative relationships is a much more powerful driver of complexity growth than either competition or cooperation alone. Such a system seems to alleviate the phenomenon of mediocre stable states observed in two- and three-species purely competitive systems. Returning to the competing hypotheses of Gould and Dawkins, we find that genetic drift alone is able to produce complexity growth. However, this baseline growth is well below that observed in coevolutionary systems. The competitive arms races hypothesized by Dawkins are successful in creating complexity above that created by genetic drift, but fall short of other more successful systems.

## 6 Long-term Simulations

While 10,000 generations are sufficient to show clear differences between the various ecosystems, it is also important to consider the longer-term trajectories of these simulations. Does the initial complexity growth peter out, or continue unbounded? In this section we present additional results for simulations over 100,000 generations, ten times the length in previous sections. For computational efficiency, we log complexity values only once every 100 generations. As before, we present median complexity values for each population, averaged over 20 simulations.
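The logging and aggregation scheme described above can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the experiments: the function names, the nested-list data layout, and the choice to include generation 0 among the logged generations are all hypothetical.

```python
from statistics import mean, median

LOG_INTERVAL = 100  # complexity is logged once every 100 generations


def logged_generations(total_generations, interval=LOG_INTERVAL):
    """Generations at which complexity is recorded (assumes generation 0 is logged)."""
    return list(range(0, total_generations + 1, interval))


def aggregate(runs):
    """Median complexity per population, averaged over simulations.

    runs[r][t] is the list of individual complexities (node counts) in
    one population for simulation r at logged step t.
    """
    n_steps = len(runs[0])
    return [
        mean([median(run[t]) for run in runs])  # median within a run, mean across runs
        for t in range(n_steps)
    ]


# Toy input: 2 simulations, 2 logged steps, 3 individuals each.
toy_runs = [
    [[1, 2, 3], [4, 5, 6]],
    [[3, 4, 5], [6, 7, 8]],
]
curve = aggregate(toy_runs)
```

Taking the median within a population keeps a single unusually large automaton from dominating the value, while averaging across simulations smooths run-to-run variation.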

### 6.1 Control

Figure 6a shows the baseline results from the control simulation. As expected, complexity growth continues at a slow pace, reaching higher and higher levels as the simulation goes on. Qualitatively, the behavior is the same as in the 10,000-generation experiments. The final average complexity scores are approximately 50; this is the level we will compare the following results against.

### 6.2 Two-Species Cooperative

Figure 6b shows the results from the two-species cooperative ecosystem. As in the shorter experiments, this ecosystem is able to produce complexity values higher than those of the control, and maintains a slight upward slope at the end of the simulation. We also observe significant periods of relative stability, in which average complexity values remain nearly constant for tens of thousands of generations. During the period from roughly 40,000 to 60,000 generations, the two populations seem to diverge somewhat. This is unexpected, as the populations are exactly symmetrical. Most likely it merely reflects the larger variation that occurs over longer time frames, and would be eliminated by averaging over a larger number of simulations.

In general, this result is consistent with our previous results for this ecosystem. The final complexity values of roughly 80 are well above the final values for our control experiment, just as they were for the shorter experiments. Both species were still seeing a slow increase in complexity by the end of the experiment, suggesting that this growth would continue on even larger time scales.

### 6.3 Two-Species Competitive

In contrast to the cooperative ecosystem, the two-species competitive ecosystem reaches a very stable level of complexity after a few thousand generations, and does not progress past that point, as shown in Figure 6c. In fact, the stability is so strong that this ecosystem lags behind the control experiment at this time scale, having roughly the same complexity after 100,000 generations as it did after 10,000. Even the variation from generation to generation is significantly lower than in the other ecosystems, suggesting that the populations are pulled back towards the stable state if they start to drift away.

Whereas the results for 10,000 generations suggested the possibility of slow but continued complexity growth, these simulations make quite clear that no further growth is forthcoming. Whatever competitive arms race occurs in the first few generations very quickly dies out, and the populations are locked in a moderate-complexity stable state from which they will not escape.

### 6.4 Three-Species Competitive

Much like the two-species competitive ecosystem, the three-species version shows remarkable stability at this time scale, as seen in Figure 6d. The addition of a third competitive species is insufficient to drive the system to escape from the mediocre stable state in which it is locked. This result is unsurprising, given its similarity to the results observed at the 10,000-generation time scale.

### 6.5 Three-Species Mixed

Figure 6e shows the results of the three-species mixed ecosystem over 100,000 generations. This experiment shows the starkest contrast to the 10,000-generation time scale that we have observed so far. Species *C* displays steady complexity growth over the entire simulation, reaching average complexity values of approximately 175 by the end of the simulation. Meanwhile, species *A* and *B* show significant stability after an initial period of complexity growth.

In the shorter experiments, the behavior of the three species was roughly similar—all three show relatively quick initial growth, followed by a period of slower growth. While species *C* reached higher complexity peaks than the others, all three maintained upward trajectories over the course of the 10,000 generations. We now observe a clear delineation—species *C* is the only one that displays the same complexity growth over the entire simulation.

The performance of species *C* in this experiment is consistent with what we would expect to observe in a system displaying open-ended evolution. Its growth is sustained, is at a rate far above that of the control experiment, and is also far above that observed in other ecosystems. There is, of course, always the possibility that this growth would stop if we were to increase the simulation length by another (or several more) factors of ten. Confirmation of the open-endedness of this model will require a theoretical understanding of its behavior in addition to these empirical results.

### 6.6 Four-Species Mixed

As with the three-species mixed ecosystem, the clear result of this experiment is the sustained complexity growth in species *C* and *D*, which both continue the upward trend observed in the 10,000-generation experiments, as seen in Figure 6f. Species *A* and *B*, again similarly to their behavior in the three-species mixed ecosystem, see their complexity growth slow to a halt.

We note a clear distinction between species *C* and *D*, which was not apparent in the previous experiments, arising only after about 40,000 generations. These two species are nearly symmetrical, so this divergence may simply be the result of larger variation at these time scales.

### 6.7 Discussion

These extended experiments generally confirm our previous results, and demonstrate the robustness of our analysis across different time scales. Most interesting is the continued success of the mixed ecosystems, in which some species are able to sustain complexity growth throughout their evolution. As previously discussed, this is empirical evidence of open-ended evolution. If that is indeed the case, and this model does admit true open-endedness, even longer experiments will only replicate our current results. Conclusive proof of open-endedness will require a theoretical understanding of the dynamics of these systems.

While the stability of the purely competitive ecosystems is very clear from our experiments, the exact nature of the attractor is unknown. Explaining the failure of the species to escape this stable state will require careful hand analysis of the evolved automata. Initial analysis suggests that the species suffer from the problem of convention-chasing, as identified by [7], but more work is needed.

The extended time scales also reveal a difference in behavior between species in the mixed ecosystems. In particular, those species with purely cooperative interactions (namely, species *C* in the 3-mixed ecosystem, and species *C* and *D* in the 4-mixed ecosystem) develop vastly more complexity than those with competitive interactions, and sustain a much stronger upward complexity trend. An exact explanation for this distinction will require careful analysis of the evolved strategies and their interactions, which is a daunting task for automata with hundreds of states. However, anecdotal analysis has led us to the following hypothesis. Consider the competitive interaction in the 3-mixed ecosystem between species *A* and *B*. As seen in the purely competitive ecosystems, this interaction is characterized by a drive to constantly adapt to keep up with the opponent. This continual change in the behavior of species *A* also affects its interaction with species *C*. In purely cooperative interactions, species seem to evolve mechanisms to adapt to changes in their partners' behavior due to mutational noise. Here species *C* must evolve mechanisms to adapt to much faster changes, changes that are driven by competitive interaction.

## 7 Larger Ecosystems

In addition to larger time scales, another natural extension of our initial work is to examine larger ecosystems with more species and more links of interaction between them. Real-world ecosystems are much more intricate than the ones in our experiments, but do these larger systems yield additional complexity? Or is a simple mix of cooperation and competition sufficient to extract maximal complexity from this sort of iterated game?

As the number of species and interactions increases, the search space of all possible ecosystems quickly becomes intractable. When dealing with only two or three species, we were able to experiment with all possible combinations and select the most interesting. Doing so with many more species is too computationally expensive to be practical. Initial experiments with randomly generated ecosystems have not yielded promising results—it seems that there are a large number of degenerate ecosystems, which stifle evolution. In particular, clusters of mutually competing species tend to collapse to cycles of simple forms, a more extreme version of the behavior observed in 3-Comp above. Instead, we will rely on hand-generated ecosystems, focusing on applying the lessons from the three- and four-species models.

Figure 7 shows the 8-ladder and 8-ladder inverse ecosystems. The 8-ladder system is built by repeating the structure of the four-species mixed ecosystem. Each species acts both as a cooperator and as a competitor. The 8-ladder inverse system has the same topology, but exchanges the competitive and cooperative interactions. Whereas each species in 8-ladder had two cooperative interactions and one competitive, in 8-ladder inverse each has two competitive and one cooperative.

These new ecosystems will allow us to further test several aspects of our model. First, these systems have significantly more interactions than our previous experiments. Each has a total of 10 axes of interaction, compared to a maximum of 3 in the 4-mixed system. Will evolution be able to simultaneously optimize this many different interactions, or will the system collapse to simple forms due to too many competing objectives?
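To make the ladder topology concrete, the sketch below builds one layout that is consistent with the description above: four competitive rungs (*A*-*B*, *C*-*D*, *E*-*F*, *G*-*H*) joined by cooperative rails down each side. This layout is an assumption inferred from the text and from the four-species mixed system, not a transcription of Figure 7, and all function names are hypothetical.

```python
from collections import Counter

# ASSUMED layout: species on the left and right rails of the ladder.
LEFT = ["A", "C", "E", "G"]
RIGHT = ["B", "D", "F", "H"]


def build_ladder(rung_game="comp", rail_game="coop"):
    """Return {(species, species): game} for an 8-species ladder."""
    edges = {}
    # Rungs link each left species to its right-hand counterpart.
    for left, right in zip(LEFT, RIGHT):
        edges[(left, right)] = rung_game
    # Rails link consecutive species on the same side.
    for side in (LEFT, RIGHT):
        for upper, lower in zip(side, side[1:]):
            edges[(upper, lower)] = rail_game
    return edges


def invert(edges):
    """Same topology with competitive and cooperative games exchanged."""
    swap = {"comp": "coop", "coop": "comp"}
    return {pair: swap[game] for pair, game in edges.items()}


def game_counts(edges, species):
    """Count the games of each type that a given species takes part in."""
    return Counter(game for pair, game in edges.items() if species in pair)


ladder = build_ladder()
ladder_inverse = invert(ladder)
```

Under this assumed layout, both systems have 10 interactions. An interior species such as *C* plays two cooperative games and one competitive game in 8-ladder, and the reverse in 8-ladder inverse; the end species (*A*, *B*, *G*, *H*) have only two links each in this sketch.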

Initial experimental results for these systems are shown in Figure 8. We can observe a few interesting points about these results. First, the 8-ladder ecosystem generates complexity values significantly above those observed in any of the original experiments, with two species exceeding 50 nodes. This is promising evidence that larger ecosystems can create even stronger pressure to evolve complexity.

Also of interest in the 8-ladder system are the asymmetrical complexity values. In general, the nodes on the right side tend to generate higher complexity than their counterparts on the left, especially in the cases of *D* and *F* as compared to *C* and *E*. In previous experiments, the directionality of the comp game seemed to have little to no effect on the results: reversing which player was trying to match and which was trying to mismatch did not make any difference. These results suggest that this is not always the case, and we will need to be careful when building and testing ecosystems to consider the direction of interactions. In this case, the species that showed higher complexity have mixed goals: they seek to match in their cooperative interactions but seek to mismatch in their competitive interactions. Further experimentation is needed to determine whether this can consistently cause the evolution of higher complexity.

The 8-ladder inverse ecosystem, despite having the same overall topology, produces significantly less complexity than its counterpart (although still well above many of the simpler ecosystems). A possible explanation is that the 8-ladder inverse system has instances of one species having to compete simultaneously with two others. In the 3-Comp system, this seemed to suppress complexity growth. Further analysis of the similarities between these systems may be helpful in explaining these observations.

## 8 Conclusions and Future Work

We have presented experimental results for a limited selection of the possible multi-species ecosystems, at two different time scales, demonstrating that particular mixes of competitive and cooperative relationships generate more complexity growth than either alone. It may be worthwhile to perform a systematic analysis of all possible ecosystems of a given size, in the style of Wolfram's analysis of elementary cellular automata. Our prediction game is completely deterministic, and many systems, from cellular automata to the iterated prisoner's dilemma, have very different dynamics when noise is introduced. Thus we plan to look at the game modified so that a player's desired output is perturbed by noise, to see if complexity growth holds up. Finally, the dynamics of competitive systems have recently gained importance in the field of deep learning due to the success of generative adversarial networks (GANs) [10]. These systems often display the undesirable behavior observed in many purely competitive coevolutionary simulations, including those presented here. Much work has been done to coax these models towards more stable dynamics [5, 19]. We look towards applying the notion of mixed cooperative and competitive systems to the GAN architecture. A few works have successfully implemented systems that are approximately isomorphic to our three-species mixed model [1, 6], although with the aim of extending the capabilities of the system, rather than improving learning trajectories.

## Acknowledgments

The authors would like to acknowledge support for this work provided by the Alfred Schonwalter graduate fellowship in 2016. A shorter version of this article appeared in ECAL 2017, Lyon, France.

## Notes

[16] provides an overview of a variety of definitions from a wide array of fields.

Of course, we can also come up with counterexamples: A baby human is probably more complex than a large shrub.

In some fixed computer language.

Note that because we operate in a noiseless, deterministic setting, the choice of initial state may be critical in determining the long-term behavior of interacting players.

These mutation operators are similar to the class of structural mutators used in [4] and other network-based evolutionary algorithms.