## Abstract

Using the in silico experimental evolution platform Aevol, we have tested the existence of a complexity ratchet by evolving populations of digital organisms under environmental conditions in which simple organisms can very well thrive and reproduce. We observed that in most simulations, organisms become complex although such organisms are a lot less fit than simple ones and have no robustness or evolvability advantage. This excludes selection from the set of possible explanations for the evolution of complexity. However, complementary experiments showed that selection is nevertheless necessary for complexity to evolve, also excluding non-selective effects. Analyzing the long-term fate of complex organisms, we showed that complex organisms almost never switch back to simplicity despite the potential fitness benefit. On the contrary, they consistently accumulate complexity in the long term, meanwhile slowly increasing their fitness but never overtaking that of simple organisms. This suggests the existence of a complexity ratchet powered by negative epistasis: Mutations leading to simple solutions, which are favorable at the beginning of the simulation, become deleterious after other mutations—leading to complex solutions—have been fixed. This also suggests that this complexity ratchet cannot be beaten by selection, but that it can be overthrown by robustness because of the constraints it imposes on the coding capacity of the genome.

## 1 Introduction

Despite decades of deep interest by different scientific communities (including artificial life, population genetics, computational biology, and, of course, evolutionary biology), the question of the evolutionary origin of biological complexity is still controversial. While there is general agreement—tempered by the recognition that complexity has decreased in some organisms [4]—that biological complexity has globally increased during geological time, there is no general agreement on whether or not this is a general trend [19]. But the most discussed point is the ultimate causes of complexity increase. Roughly, two classes of theories are competing to explain this increase: those based upon selection and those invoking the variation process itself. According to theories of the former class, complexity rises because complex organisms are more likely to outcompete simple ones in a demanding environment (but the precise mechanisms vary among the authors). For theories belonging to the latter class, complexity is rooted in the properties of the variation process, which is supposed to be biased toward an increase in complexity (there again, the origin of the bias varies among the authors). Artificial life has provided many examples of the former [2, 35]. A famous adherent of the latter is Stephen Jay Gould, who proposed that, since complexity has a lower bound, it can only increase through a random variational process (the “drunkard's walk” model), whence the observed trend [12]. Following a similar neutrality hypothesis, Soyer and Bonhoeffer proposed that the complexification trend is due to duplication being less deleterious than deletion, an unbiased mutational process hence being likely to produce increasingly complex organisms in the long run [30]. 
More recently, McShea and Brandon have proposed the zero-force evolutionary law (ZFEL, [21]) stating that “In any evolutionary system in which there is variation and heredity, there is a tendency for diversity and complexity to increase, one that is always present but may be opposed or augmented by natural selection, other forces, or constraints acting on diversity or complexity” [21, chapter 1, p. 4]. According to the authors, the ZFEL spontaneously pushes evolving systems toward an increase in diversity and complexity even in the absence of selection and even when the mutational process is unbiased or when the considered system is far from its lower complexity bound, making it a strong universal mechanism.

There are many reasons why evolution of complexity is controversial [22]. Two of them are central: first, the lack of a universally accepted measure of complexity (although an elegant way to bypass this difficulty has been proposed by Adami, who considers complexity as equivalent to the quantity of information an organism integrates about its environment [1]); second, that biological organisms are multi-scale systems that can increase their complexity—or not—at different organization levels or even increase—or decrease—the number of organization levels (i.e., horizontal and vertical complexity, respectively [20]). A striking example is the strong loss of complexity undergone by endosymbionts, which is directly linked to the emergence of a new system through the association of a eukaryote and a bacterium [4]. Even when considering single organisms, there is no reason to suppose that the variations in complexity (or quantity of information) are homogeneous across the genome, transcriptome, proteome, and phenotype levels: Some well-known paradoxes such as the C-value paradox [31] and the G-value paradox [13] illustrate the fact that the quantity of information encoded in the genome may not be linked to the quantity of information at the phenotypic level. Hence, while most models used to investigate the evolution of complexity focus on a single organization level, it is necessary to consider the evolution of complexity at a given level in the context of the complexity needed at higher levels. Following this idea, in order to investigate whether or not the complexity increase is selected, one has to use a multi-scale model and let organisms evolve in an environment requiring only a simple phenotype (hence excluding the selection hypothesis). By observing whether this simple phenotype will be encoded by a simple or a complex functional organization, it is then possible to distinguish between passive and active trends towards complex structures.

Here we used the Aevol model [5, 14, 15] to implement this research program. Aevol is a digital evolution platform in which organisms are encoded at the genome level but with a decoding procedure directly inspired by the biological genotype-to-phenotype mapping and an abstract description of the functional levels (proteins and phenotype). Since this decoding procedure includes many degrees of freedom, Aevol allows the different organization levels (typically genome, proteome, and phenotype) to evolve different degrees of complexity. For instance a simple phenotypic function can be encoded either by a combination of many different genes or by one single gene. Similarly, the genome can evolve to be more or less compact, depending on the amount of noncoding sequence and depending on the sharing of sequences among multiple genes by, for example, operons or gene overlapping. This decoupling of complexity among the different organization levels makes Aevol perfectly suited to study the evolution of complexity. In the experiments described here, we used a slightly modified version of the model in which the environment allows for very simple organisms to thrive. We then studied a very large number of evolutionary trajectories to test whether or not these trajectories show an increase in complexity. Our results show that even though simple organisms are likely to have a higher fitness than complex ones, most lineages show a long-term increase in complexity during evolution. This suggests that even in simple environments there is a complexity ratchet that cannot be beaten by selection. We also show that, contrary to a widespread intuition, complex organisms are not more evolvable or more robust than simple ones and that, when selection is removed, all organisms quickly lose complexity, excluding the ZFEL from the set of possible candidates to explain the complexity trend observed in our experiments. 
Finally, our results show that while selection is not powerful enough to drive evolution toward simplicity, the need for mutational robustness is: When a complex organism experiences an increase in its mutation rate, its complexity is very likely to decrease, ultimately switching to a simple structure.

## 2 Methods

### 2.1 The Aevol Model

Aevol (www.aevol.fr and references therein) is an in silico experimental evolution platform developed by the Inria Beagle team (https://team.inria.fr/beagle). Figure 1 presents an overview of the model. Since Aevol has been extensively described in previous publications, we describe only its basic organization and focus on the structure of the information coding, as it is at the core of our experiments.

Figure 1.

The Aevol model. (a) Overview of the genotype-to-phenotype map. Note that the organism shown here is a real organism evolved within Aevol for 200,000 generations with a typical Aevol target (see main text and Figure 2 for the target used in the experiments presented here). Hence it contains many genes on both strands (left panel) and many proteins (central panel), and it is well adapted to its environment—its phenotypic function (black curve on the right panel) is very close to the target function (light red filled region). (b) Population on a grid and evolutionary loop. (c) Local selection and replication processes occur within a Moore neighborhood. (d) Variation operators include chromosomal rearrangements (duplications, deletions, translocations, and inversion—here a translocation and an inversion are shown) and local mutations (switches and InDels).

#### 2.1.1 Overview

The rationale of Aevol is that the structure of the fitness landscape of an organism is likely to be strongly determined by the structure of the biological information coding of this organism. Hence, Aevol closely mimics the biological genomic structure as well as the structure of the genotype-to-phenotype mapping. Organisms are then embedded in an evolutionary loop that includes classical selection operators and a large variety of mutational operators, including base switches (flipping the bits on both strands), small insertions and small deletions (called InDels), and large-scale chromosomal rearrangements (duplications, deletions, inversions, and translocations). Each mutation operator has its own rate, expressed in mutations per base pair per generation (mut·bp⁻¹·gen⁻¹).

#### 2.1.2 Information Coding in Aevol

In Aevol, each individual owns a genome containing its heritable information. The genome is a binary double-strand sequence. It is decoded in two steps: transcription and translation. The transcription process relies on consensus signals (promoters) and hairpin-like structures (terminators) for transcription initiation and termination respectively. The translation process involves consensus ribosome-binding sites (RBSs) and an artificial genetic code based on triplet codons (including Start and Stop codons). The sequence of codons of a gene then constitutes the primary structure of a protein. Importantly, this decoding process introduces degrees of freedom between the genome and the proteome: Complex genomes can encode for simple proteomes (e.g., if all genes have the same sequence), and complex proteomes can be encoded on small sequences (if genes share sequences through, e.g., polycistronic mRNAs or overlapping genes). These degrees of freedom are similar to those found in real organisms.

Given the primary structure of a protein, Aevol computes its functional contribution. Although mimicking biological processes at the sequence level is feasible, it is—at least to date—impossible to compute the function of a protein from its primary structure in a realistic way. That is why Aevol uses an abstract mathematical formalism to describe the functional levels (i.e., protein functional contribution and phenotype). In Aevol all functions are expressed in a one-dimensional continuous functional space (more precisely, on the interval [0, 1]) by an activation value in the interval [−1, 1] (upper and lower bounds corresponding to maximum activation and maximum inhibition respectively). In this space, proteins are described as isosceles-triangle-shaped kernel functions. These triangles can themselves be described by three parameters (their mean m, height h, and half width w), which are computed from three interlaced variable-length binary codes in the primary structure of the protein (hence the longer the gene, the more precise the m, w, and h values). Once all the kernels have been computed from the protein set (Figure 1(a), center), they are summed to compute the phenotype (Figure 1(a), right). Just as the transcription-translation process introduces degrees of freedom between the genome and the proteome, this step introduces degrees of freedom between the proteome and the phenotype. Indeed, the combination of different proteins can result in a simple functional shape, for example, if the proteins share the same m and w values (see Section 2.3 below).
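The triangle formalism described above can be sketched in a few lines of Python. This is an illustrative sketch, not Aevol's implementation: the `Protein` class, the function names, and the clipping of the summed phenotype to [−1, 1] are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    m: float  # mean: position of the triangle apex on the functional axis [0, 1]
    w: float  # half-width of the triangle
    h: float  # height: positive for activation, negative for inhibition


def kernel(p: Protein, x: float) -> float:
    """Isosceles triangle of height p.h, centered on p.m, with half-width p.w."""
    if p.w <= 0:
        return 0.0
    return p.h * max(0.0, 1.0 - abs(x - p.m) / p.w)


def phenotype(proteins, x: float) -> float:
    """Sum of all kernels at x, clipped to [-1, 1] (the clipping is an assumption)."""
    total = sum(kernel(p, x) for p in proteins)
    return max(-1.0, min(1.0, total))


# Two proteins sharing m and w sum to a single triangle of height h1 + h2.
proteins = [Protein(0.5, 0.1, 0.3), Protein(0.5, 0.1, 0.2)]
single = [Protein(0.5, 0.1, 0.5)]
xs = [i / 1000 for i in range(1001)]
max_diff = max(abs(phenotype(proteins, x) - phenotype(single, x)) for x in xs)
```

In this toy case `max_diff` is zero up to floating-point rounding, which illustrates how several proteins can encode a simple functional shape.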

Finally, in Aevol, the fitness is computed as a decreasing exponential of the difference between the phenotypic function and a target function that indirectly represents the abiotic conditions the organisms evolve in (in light red in Figure 1(a), right), so that a perfect fit yields the maximal fitness of 1. Classically, in Aevol the target function is defined by a sum of Gaussians, hence requiring a virtually infinite number of triangular kernels to be perfectly fitted. In the experiments described here, we used a modified version of Aevol (code available at http://www.aevol.fr/publications/ressources/Liard2018_ALife_src.tgz) in which the target function is described by triangles and can therefore be perfectly fitted by the phenotype (see below).
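As a rough illustration of this fitness computation, the sketch below assumes that the difference is measured as the area between the phenotype and the target on [0, 1] and that fitness decreases exponentially with it. The parameter `k`, its value, and the midpoint-rule integration are hypothetical choices for the example, not values taken from the model.

```python
import math


def triangle(x, m, w, h):
    """Isosceles triangle with apex at m, half-width w and height h."""
    return h * max(0.0, 1.0 - abs(x - m) / w)


def gap(phenotype, target, n=10_000):
    """Approximate the area between two curves on [0, 1] with the midpoint rule."""
    dx = 1.0 / n
    return sum(
        abs(phenotype((i + 0.5) * dx) - target((i + 0.5) * dx)) for i in range(n)
    ) * dx


def fitness(phenotype, target, k=1000.0):
    """Fitness as exp(-k * gap); k is a hypothetical selection-strength parameter."""
    return math.exp(-k * gap(phenotype, target))


def target(x):
    # Triangular target: m = 0.5, w = 0.1, h = 0.5
    return triangle(x, 0.5, 0.1, 0.5)


def perfect(x):
    # A single protein that matches the target exactly
    return triangle(x, 0.5, 0.1, 0.5)


def shifted(x):
    # The same protein with a slightly misplaced mean
    return triangle(x, 0.52, 0.1, 0.5)


f_perfect = fitness(perfect, target)
f_shifted = fitness(shifted, target)
```

Under these assumptions a perfect fit gives a fitness of exactly 1, and any mismatch gives a strictly lower value.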

### 2.2 Experimental Design

In order to test whether evolution has a spontaneous tendency to increase complexity or whether the complexity increase is due to environmental pressure, we let populations of 1,024 individuals evolve in Aevol in a null model where the environment is so simple that it requires neither a complex proteome nor a complex genome. To this aim, we designed an environmental target whose shape is an isosceles triangle (Figure 2(a), to be compared with the classical environment used in Aevol experiments, Figure 1(a), light red filled region). Hence, the target can be fitted by a single protein and thus a single gene. More precisely, the target is an isosceles triangle with mean m = 0.5, height h = 0.5, and half-width w = 0.1. Note that although this target can be fitted with a single gene, it is still hard to fit, since the corresponding gene must be long to reach sufficient precision (see the description of the model above).

Figure 2.

(a) Phenotypic target used during the experiment. (b), (c) Genome (top; black arcs represent the coding segments, i.e., the genes, on both strands of the circular chromosome) and proteome (bottom; the red dashed line indicates the target function, and black triangles indicate the protein functions) of a simple (b) and a complex (c) individual (both evolved in exactly the same conditions; μ = 10⁻⁵ mut·bp⁻¹·gen⁻¹). (d) Zoom on the proteome of the complex individual shown in (c).

All simulations are initialized with a random 5,000-bp genome containing at least one functional gene. We tested three mutation rates: μ = 10⁻⁴, μ = 10⁻⁵, and μ = 10⁻⁶ mut·bp⁻¹·gen⁻¹. We also tested μ = 10⁻⁷ mut·bp⁻¹·gen⁻¹, but evolution was too slow for the data to be usable over only 270,000 generations. Each population evolved for 270,000 generations. We then reconstructed the lineage of the best final individual and recorded the fitness, genome size, number of genes, and structure of the protein network along this line of descent from generation 0 to generation 250,000. All statistics are thus recorded on common ancestors of the final population; the last 20,000 generations are ignored because, due to coalescence time, there is no fixed lineage that close to the final generation. We repeated the experiment 100 times for each mutation rate, for a total of 300 simulated evolutions.
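The lineage reconstruction can be pictured as a backward walk through parent pointers. The sketch below assumes a hypothetical genealogy stored as a dictionary mapping (generation, individual index) to the parent's index in the previous generation; this data layout is an assumption for illustration and is not how Aevol stores its trees.

```python
def line_of_descent(parent_of, final_individual, last_generation):
    """Walk parent pointers back from the final individual to generation 0.

    parent_of[(generation, individual)] gives the index of that individual's
    parent in generation - 1. The result lists ancestor indices ordered from
    generation 0 up to last_generation.
    """
    lineage = [final_individual]
    current = final_individual
    for generation in range(last_generation, 0, -1):
        current = parent_of[(generation, current)]
        lineage.append(current)
    lineage.reverse()
    return lineage


# Toy genealogy: generations 0 to 3, two individuals per generation.
parent_of = {
    (1, 0): 1, (1, 1): 1,
    (2, 0): 0, (2, 1): 0,
    (3, 0): 1, (3, 1): 0,
}
lineage = line_of_descent(parent_of, final_individual=0, last_generation=3)
```

Statistics such as fitness or genome size would then be read along `lineage`, truncated before the final generations where the lineage is not yet fixed.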

### 2.3 Complexity Measures

Generally speaking, there is no consensus on complexity measures. Moreover, since Aevol is a multi-scale model, one has to choose different measures at the different levels (i.e., different measures of horizontal complexity [20], the vertical complexity—the number of levels—being constant in the model). Typically here we will measure complexity at the sequence level—the genome—and at the functional level—the proteome (complexity is not measured at the phenotypic level, since it is directly driven by selection, which requires that this level remain as simple as possible). We thus adopted two strategies. First, we adapted principles from [1] to Aevol in order to get quantitative measures at the genome and proteome levels by estimating the quantity of information stored in both structures. Second, we designed a qualitative classification of simple versus complex organisms based upon the structure of the model.

#### 2.3.1 Quantitative Measure at the Sequence Level

Aevol provides numerous statistics on the lineage of a given organism. In particular, it provides statistics about the number of essential base pairs (i.e., base pairs that, if mutated, change the phenotype of the organism). Hence, this measure can be directly used to estimate the quantity of information stored on the genome, that is, the genome complexity 𝒞G. Note that it may be very different from the genome size, since the genome can accumulate noncoding sequences. It can also be shorter than the sum of gene lengths, since genes can share sequences through gene overlapping on the same strand or on the opposite strand (see Figure 2(c) for examples of overlapping genes).
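The notion of an essential base pair can be illustrated with a brute-force scan: flip each position in turn and check whether the phenotype changes. In the sketch below, `phenotype_of` stands for any genome decoder, and `toy_phenotype` is a hypothetical stand-in, not Aevol's decoding procedure. Only point switches are considered here, which is an assumption.

```python
def genome_complexity(genome, phenotype_of):
    """Count positions whose flip changes the phenotype (essential base pairs)."""
    reference = phenotype_of(genome)
    essential = 0
    for i in range(len(genome)):
        # Build a mutant that differs from the genome at position i only.
        mutant = genome[:i] + [1 - genome[i]] + genome[i + 1:]
        if phenotype_of(mutant) != reference:
            essential += 1
    return essential


def toy_phenotype(genome):
    # Hypothetical decoder: only the first 4 bits are "coding".
    return tuple(genome[:4])


genome = [1, 0, 1, 1, 0, 0, 0, 1, 0, 1]
c_g = genome_complexity(genome, toy_phenotype)
```

In this toy genome, 4 of the 10 positions are essential, so the complexity estimate is smaller than the genome size, as happens when noncoding sequence accumulates.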

#### 2.3.2 Quantitative Measure at the Functional Level

While measuring complexity on the genome is relatively straightforward, measuring complexity at the proteome level (i.e., the functional complexity) is more difficult. Indeed, in a first approximation, one could consider that the proteome complexity is given by the number of non-degenerated proteins.¹ However, since different proteins can perform similar functions (e.g., in the case of gene duplication), this would overestimate the quantity of information contained in the proteome. Hence, we considered proteome information in a more precise way by estimating the number of different parameters in the proteome. 𝒞P, the functional complexity measure, is then the sum of the numbers of different m, different w, and different h values (all with a small tolerance ε = 0.001 to allow for rounding errors) used to encode the protein set. Note that this definition of functional complexity is close to McShea and Brandon's notion of “pure complexity” (i.e., the number of different part types within an organism [21]).
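A minimal sketch of this measure is given below, assuming that proteins are represented as (m, w, h) tuples and that values within ε of each other are merged greedily after sorting. The grouping rule is an assumption; the text only specifies the tolerance.

```python
def count_distinct(values, eps=0.001):
    """Number of distinct values, merging values that lie within eps of a
    reference value (greedy pass over the sorted values)."""
    count = 0
    reference = None
    for v in sorted(values):
        if reference is None or v - reference > eps:
            count += 1
            reference = v
    return count


def functional_complexity(proteins, eps=0.001):
    """C_P: number of distinct m values + distinct w values + distinct h values.

    proteins is an iterable of (m, w, h) tuples.
    """
    proteins = list(proteins)
    return sum(count_distinct([p[i] for p in proteins], eps) for i in range(3))


# Three proteins: two share m and w, two share h.
proteome = [(0.5, 0.1, 0.3), (0.5, 0.1, 0.2), (0.7, 0.05, 0.2)]
c_p = functional_complexity(proteome)
```

For this toy proteome there are two distinct m values, two distinct w values, and two distinct h values, so the measure equals 6, whereas a naive protein count would give 3.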

#### 2.3.3 Qualitative Classification

To study the long-term fate of simple versus complex organisms, we defined a qualitative classification procedure. Since the environmental target constrains the phenotypic level, the phenotypic function cannot be used to classify the organisms. We then chose to classify organisms according to their functional structure, hence focusing on the proteome level. A simple solution would have been to define a threshold on the quantitative measure, but this threshold would be arbitrary. To avoid this, we used knowledge from the model structure to define the two classes. In Aevol, if all the non-degenerated proteins of an organism have the same mean m and the same half-width w, then their functions linearly sum to produce a triangular phenotype with the same characteristics (in other words, all the proteins have the same function, but possibly with different levels of activity h). We used this property to define the two following classes:

• Simple organisms (simples) are organisms for which all the non-degenerated proteins have the same function (i.e., the same m and w values, both with an ε = 0.001 tolerance), possibly with different activity levels (h). Figure 2(b) shows an example of a simple individual. Note that all organisms owning a single protein are necessarily simple but that simples may contain many genes and many proteins (possibly differing in their h values). Hence, simples can have different levels of functional complexity 𝒞P.

• Complex organisms (complexes) are organisms owning at least two non-degenerated proteins for which either the triangle means m or the triangle half-widths w are different (with the same tolerance ε). Figure 2(c) and 2(d) show an example of a complex individual.
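The two classes above can be expressed as a short predicate. This sketch assumes a non-empty list of (m, w, h) tuples for the non-degenerated proteins and compares the spread of the m and w values to ε; the function name and the data layout are illustrative only.

```python
def classify(proteins, eps=0.001):
    """Return 'simple' if all proteins share m and w within eps, else 'complex'.

    proteins is a non-empty list of (m, w, h) tuples describing the
    non-degenerated proteins of one organism.
    """
    ms = [p[0] for p in proteins]
    ws = [p[1] for p in proteins]
    same_m = max(ms) - min(ms) <= eps
    same_w = max(ws) - min(ws) <= eps
    return "simple" if same_m and same_w else "complex"


one_protein = [(0.5, 0.1, 0.5)]
same_function = [(0.5, 0.1, 0.3), (0.5, 0.1, 0.2)]  # differ only in h
different_means = [(0.5, 0.1, 0.3), (0.6, 0.1, 0.2)]
labels = [classify(one_protein), classify(same_function), classify(different_means)]
```

Note that the heights h play no role in the classification, so an organism with many proteins can still be classified as simple.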

### 2.4 Measure of Robustness and Evolvability

To measure robustness and evolvability in Aevol, we used a Monte Carlo sampling procedure: Starting from the final populations, we first retrieved the common ancestor at generation 250,000. Then, we used Aevol to produce 10,000,000 offspring of this ancestor. By measuring the fitness of these offspring and comparing it with their parent's, we were able to measure the local curvature of the fitness landscape and hence to estimate robustness and evolvability of the ancestral clone.

#### 2.4.1 Measure of Robustness

Robustness can be defined in several ways. Here we considered replication robustness, that is, the ability of an organism to replicate neutrally. Replication robustness must not be confused with mutational robustness (the ability of an organism to conserve its fitness in spite of the mutations it has undergone), as replication robustness also takes account of the fraction of organisms replicating without undergoing any mutation.

From the 10,000,000 offspring, replication robustness can be estimated directly by measuring Fν, the fraction of offspring that retain the fitness of their parent [14].
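In code, this estimate reduces to a simple count. The sketch below assumes that offspring fitnesses are available as a list and that "retaining the fitness of the parent" means exact equality; both the function name and the toy numbers are illustrative.

```python
def replication_robustness(parent_fitness, offspring_fitnesses):
    """F_nu: fraction of offspring whose fitness equals the parent's."""
    neutral = sum(1 for f in offspring_fitnesses if f == parent_fitness)
    return neutral / len(offspring_fitnesses)


# Toy sample of 10 offspring from a parent of fitness 0.8.
offspring = [0.8, 0.8, 0.5, 0.8, 0.9, 0.0, 0.8, 0.8, 0.7, 0.8]
f_nu = replication_robustness(0.8, offspring)
```

Offspring that replicate without any mutation count as neutral here, which is what distinguishes replication robustness from mutational robustness.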

#### 2.4.2 Measure of Evolvability

Evolvability is a complex concept for which many definitions have been proposed in the literature [26]. Here we use a definition inspired by [34]: Evolvability (or evolutionary potential) is quantified as the expected fitness gain at the next generation. Hence, provided the sampling is large enough, evolvability can be estimated by the same procedure as robustness. However, such an evolvability measure depends strongly on the distance to the optimum (the closer to the optimum, the less evolvable an individual is likely to be). Hence, contrary to robustness, evolvability cannot be estimated directly from the naive replication of an ancestral clone: Since simple organisms are closer to the optimum than complex ones (see Section 3), their respective evolvability measures would not be directly comparable. To avoid this pitfall, we measured the evolvability of the ancestor in a modified environment (i.e., an environment whose target function is still an isosceles triangle but whose mean m has been slightly shifted, from 0.5 to 0.495). In this new environment, all the organisms are positioned away from the (new) optimum whatever their complexity, and the fitnesses of the complexes and of the simples become indistinguishable (data not shown). Then, from the 10,000,000 offspring (see above), we recorded the favorable mutants (i.e., offspring whose fitness is higher than their parent's) and computed evolvability as the expected fitness gain at the next generation, that is, the sum of the fitness gains of these mutants divided by the number of samples (10,000,000).
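The evolvability estimate described above can be sketched as follows, assuming the offspring fitnesses have been measured in the shifted environment and are available as a list. The function name and the toy numbers are illustrative.

```python
def evolvability(parent_fitness, offspring_fitnesses):
    """Sum of the fitness gains of favorable offspring divided by the total
    number of sampled offspring (favorable or not)."""
    gains = [f - parent_fitness for f in offspring_fitnesses if f > parent_fitness]
    return sum(gains) / len(offspring_fitnesses)


# Toy sample: two of five offspring are fitter than their parent.
offspring = [0.5, 0.6, 0.4, 0.7, 0.5]
e = evolvability(0.5, offspring)
```

Neutral and deleterious offspring contribute zero to the numerator but still count in the denominator, so a lineage with few favorable mutants has a low evolvability even if each of them brings a large gain.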

### 2.5 Evolution with No Selection

In order to test whether the ZFEL [21] could be a driver of the increase in diversity and complexity in our experiments, we evolved populations without selection. To this aim, we first selected two individuals, one simple and one complex, in the lineages at generation 250,000. These two individuals were chosen as follows: We first extracted the ancestor at generation 250,000 from each of the 100 populations that evolved under the low mutation rate (10⁻⁶ mut·bp⁻¹·gen⁻¹). These 100 individuals were then classified into simples and complexes. We computed the median functional complexity (𝒞P) of each group, and in each group selected the individual whose 𝒞P value was closest to this median. We thus selected the most representative individual in each complexity class.
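Selecting the individual closest to the group median is a one-liner in most languages. The sketch below assumes individuals are identified by hypothetical labels and that their 𝒞P values are available through a lookup function; the values are made up for the example.

```python
import statistics


def most_representative(individuals, complexity_of):
    """Return the individual whose complexity is closest to the group median."""
    median = statistics.median(complexity_of(ind) for ind in individuals)
    return min(individuals, key=lambda ind: abs(complexity_of(ind) - median))


# Hypothetical C_P values for five individuals of one class.
c_p_values = {"a": 3, "b": 9, "c": 12, "d": 40, "e": 15}
chosen = most_representative(list(c_p_values), c_p_values.get)
```

Using the median rather than the mean keeps the choice insensitive to a few very complex outliers such as individual "d" in this toy group.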

We used these two individuals to initialize 20 clonal populations (10 for the simple and 10 for the complex). These populations were then evolved for 10,000 generations under a mutation rate of 10⁻⁶ mut·bp⁻¹·gen⁻¹ with no selection (i.e., each individual has the same probability of replication, regardless of its distance to the environmental target). Meanwhile we measured (i) the fraction of complexes in each population, (ii) the diversity of genomic complexity (𝒞G) and functional complexity (𝒞P) in each population (estimated by 𝒞G and 𝒞P variances), and (iii) the mean genomic and functional complexity in each population.
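Removing selection amounts to drawing each parent uniformly at random. The sketch below shows only this replication step, under simplifying assumptions: the spatial grid is ignored, individuals are reduced to a class label, and the `mutate` argument is a hypothetical hook that defaults to no mutation.

```python
import random


def neutral_generation(population, rng, mutate=lambda individual: individual):
    """Build the next generation by picking each parent uniformly at random,
    ignoring fitness, then applying the (hypothetical) mutate function."""
    return [mutate(rng.choice(population)) for _ in population]


rng = random.Random(42)  # fixed seed so that the run is repeatable
population = ["complex"] * 512 + ["simple"] * 512
for _ in range(100):
    population = neutral_generation(population, rng)
fraction_complex = population.count("complex") / len(population)
```

Without mutation, the fraction of complexes in this toy run changes by drift only; in the experiment, the quantity of interest is how mutations alter complexity once selection no longer filters them.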

## 3 Results

Among the 300 simulations we analyzed (3 mutation rates with 100 repeats each), 229 were classified as complexes (see Section 2.3) at generation 250,000. Table 1 shows the distribution of simple and complex organisms for the three mutation rates we analyzed.

Table 1. Numbers of simple and complex lineages at generation 250,000 for the three tested mutation rates. Brackets show the 95% confidence intervals (CI95%) estimated from the number of samples in both classes using the Wilson method.

| Mutation rate μ (mut·bp⁻¹·gen⁻¹) | Number of simples | Number of complexes |
|---|---|---|
| 10⁻⁴ | 32 [24–43] | 68 [58–76] |
| 10⁻⁵ | 25 [18–34] | 75 [66–82] |
| 10⁻⁶ | 14 [9–22] | 86 [78–91] |
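For readers who want to compute such intervals, the Wilson score interval for a binomial proportion can be written as below. This sketch uses the standard formula without continuity correction and z = 1.96; whether the authors used exactly this variant and rounding is an assumption.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (no continuity correction)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Example: 25 simples out of 100 lineages.
low, high = wilson_interval(25, 100)
bounds_percent = (round(low * 100), round(high * 100))
```

For 25 simples out of 100 lineages, the bounds round to 18 and 34.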

We first verified that complex organisms are indeed those that accumulate information and that simple organisms are those that do not. Figure 3 shows the quantity of information stored on the genomes (genomic complexity, 𝒞G) and on the proteomes (functional complexity, 𝒞P) for simple and complex organisms and for all the mutation rates. Note that 𝒞G and 𝒞P cannot be quantitatively compared, since they take account of the information content in a binary sequence and in a set of real values, respectively.

Figure 3.

Distribution of complexity measures for the complexes (top) and simples (bottom) at generation 250,000. Left: Genomic complexity (𝒞G). Right: Functional complexity (𝒞P). Colors indicate the mutation rates: blue, 10⁻⁴ mut·bp⁻¹·gen⁻¹; red, 10⁻⁵ mut·bp⁻¹·gen⁻¹; green, 10⁻⁶ mut·bp⁻¹·gen⁻¹.

Figure 3 clearly shows that simples tend to accumulate less information in their proteome. The quantity of information stored on the genome also tends to be smaller for simples, although the difference is less pronounced (Figure 3, left). This is not surprising, given that our qualitative classification is based on the proteome structure and that Aevol allows degrees of freedom between the information coding in the genome and the information coding in the proteome (see model description in Section 2.1). Both measures also show a strong effect of mutation rates: The higher the mutational pressure, the lower 𝒞G and 𝒞P. This is not a surprise either, since the effect of mutation rate on genome structure has already been described in the literature [11, 14]. Contrary to the trend of the amount of information, this effect is more pronounced on the genome, probably because mutational effects directly affect the genome but only indirectly affect the proteome.

### 3.1 Simple Organisms are Fitter than Complex Ones

Having observed organisms evolving either simple or complex functional structure in the same simple environment, the decisive question is whether or not complexity is driven by selection. Figure 4 shows the fitness of the common ancestor at generation 250,000 (see Section 2.2) against 𝒞G (left) and 𝒞P (right). It clearly shows that simple organisms have a higher fitness than complex ones. This is confirmed by the fitness distribution between the two qualitative classes: Figure 5 shows that all simples reach a fitness that approaches 1, the best possible fitness in Aevol (mean fitness of simples: 0.99 ± 0.008). By contrast, the fitnesses of complexes range over the whole interval, most complexes having a fitness below 0.5 (mean fitness of complexes: 0.42 ± 0.32).

Figure 4.

Fitness of the common ancestor at generation 250,000 as a function of (left) genomic complexity 𝒞G and (right) functional complexity 𝒞P (log scales). Triangles and circles indicate lineages classified as simple or complex respectively. Same color code as in Figure 3.

Figure 5.

Distribution of fitness values at generation 250,000 for complexes (top) and simples (bottom). Same color code as in Figure 3.

This result demonstrates that in our simulations, it is not selection that drives the populations toward functional simplicity or complexity. On the contrary, here complex functional structures evolve in spite of selection.

### 3.2 Complex Organisms Are Neither More Robust nor More Evolvable than Simple Ones

It has been shown [33] that, in some situations, indirect selection (i.e., selection for robustness or evolvability) can be strong enough to overcome direct selection for fitness. We thus estimated the robustness and evolvability of simple and complex organisms at generation 250,000 (see Section 2.4) to check whether these properties could explain the evolution of complex functional structures.

Figure 6 displays the replication robustness (left) and evolvability (right) measures for the simple and complex organisms and for all mutation rates. It clearly shows that simple organisms are more robust than complex ones, whatever the mutation rate. In contrast, there is no clear trend in evolvability: Depending on the mutation rate, simples can be more evolvable than complexes, less evolvable, or indistinguishable from them.

Figure 6.

Estimate of the replication robustness (left) and evolvability (right) of the complex (red) and simple (blue) organisms at generation 250,000 for each tested mutation rate.

The difference in robustness between the simples and the complexes is straightforward to explain. Complex individuals have a larger genome, on which they accumulate more information (Figures 3 and 7). Consequently, they present a larger mutational target and, for a given mutation rate, undergo a larger number of mutational events. Hence, their replication robustness is lower [10, 11, 14].
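The mutational-target argument can be illustrated with a minimal sketch. It assumes a single per-base-pair mutation rate and independent events, which is a simplification of the Aevol mutation model; the function names and genome sizes below are hypothetical and are not values from the paper.

```python
import math


def expected_mutations(mu: float, genome_size: int) -> float:
    """Expected number of mutational events per replication
    (mu is a rate per base pair per generation)."""
    return mu * genome_size


def prob_unmutated(mu: float, genome_size: int) -> float:
    """Probability that a replication introduces no mutation at all,
    assuming independent events at each base pair (simplification)."""
    return (1.0 - mu) ** genome_size


# Hypothetical genome sizes, chosen only to contrast a small and a large genome.
for size in (2_000, 20_000):
    print(size, expected_mutations(1e-5, size), round(prob_unmutated(1e-5, size), 3))
```

At a fixed mutation rate, the larger genome receives proportionally more events and is less likely to be copied without change, which is the sense in which a larger mutational target lowers replication robustness.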

Figure 7.

Genome size for the complexes (red) and simples (blue) at generation 250,000 for each tested mutation rate. Genomes of complexes are significantly larger than genomes of simples.

Our results on evolvability deserve attention, as it is often assumed that complex structures are more evolvable than simple ones [29]. However, this common assumption is strongly rooted in modularity [8], a property that has no reason to evolve here. To better understand our results, we distinguished the two components of evolvability: the fraction of positive offspring, and the mean fitness gain of the positive offspring (Figure 8). This shows that simples have a significantly lower fraction of favorable offspring than complexes and that the fitness gain of the former is slightly higher than that of the latter (the difference between mutation rates being explained by the difference in fitness between simples that evolved under different mutation rates; see Figure 5). In other words, the complexes have more possibilities to increase their fitness, which is consistent with their lower robustness and larger mutational target, but these possibilities result in a lower fitness gain.
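The decomposition into two components can be sketched as follows. This is only an illustration: the function name and the fitness values are hypothetical, and combining the two components as a product is an assumption made here (the measure actually used is the one defined in Section 2.4).

```python
def evolvability_components(parent_fitness, offspring_fitnesses):
    """Return (fraction of positive offspring, mean gain of positive offspring,
    product of the two). The product is an assumed way to combine them."""
    gains = [f - parent_fitness for f in offspring_fitnesses if f > parent_fitness]
    fraction_positive = len(gains) / len(offspring_fitnesses)
    mean_gain = sum(gains) / len(gains) if gains else 0.0
    return fraction_positive, mean_gain, fraction_positive * mean_gain


# Hypothetical offspring samples (not data from the paper).
simple_like = evolvability_components(0.5, [0.5, 0.75, 0.25, 0.5])
complex_like = evolvability_components(0.25, [0.375, 0.3125, 0.25, 0.125])
print(simple_like)
print(complex_like)
```

In this toy example the first parent has fewer positive offspring but a larger mean gain, while the second has more positive offspring with a smaller mean gain, mirroring the qualitative pattern described above.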

Figure 8.

Fraction of positive offspring (left) and fitness gain among positive mutants (right) of the complex (red) and simple (blue) organisms at generation 250,000 for each tested mutation rate. Simple organisms have a lower fraction of positive mutants, but these mutants are often closer to the optimum than those issued from complex organisms.

These results show that, although the fitness landscape is the same in all our experiments, simples and complexes lie in very different parts of the landscape. Simples lie in high, steep regions with low connectivity, while complexes lie in low, flat, highly connected regions. This may seem to contradict the higher robustness of the simples (as mutational robustness is higher in flat regions of the landscape [33]), but one has to remember that we estimated replication robustness, a property that depends not only on the fitness landscape but also on the number of noncoding sequences [14]. Indeed, in our simulations, the mutational robustness of the complexes is approximately twice that of the simples (except for the highest mutation rates, where they are not significantly different; data not shown).

To conclude, our measures of robustness and evolvability demonstrate that in our simulations, the choice between functional simplicity and functional complexity is not driven by indirect selection for either robustness or evolvability. On the contrary, here again, complex functional structures evolve in spite of indirect selection.

### 3.3 In the Absence of Selection, Complexity Quickly Drops to Zero

Previous work has claimed that complexity increase can be driven by non-selective mechanisms (e.g., the zero-force evolutionary law, ZFEL [21]). We assessed the effects of selection by letting organisms evolve in new conditions where selection had been neutralized. Starting from two homogeneous populations (one consisting of complex individuals and the other of simple ones), we let them evolve for 10,000 generations in the absence of selective pressure and measured the diversity and complexity at both the genomic and the functional levels. This process was repeated ten times for each of the two initial conditions (see Section 2.5).

Figure 9 shows the proportion of complexes in these experiments (top: initial population of complexes; bottom: initial population of simples). It shows that, when a population of complex individuals replicates with mutations but without selection, the proportion of complexes quickly drops to zero. When a population of simples replicates in the same conditions, the proportion of complexes initially grows, up to 20% in the conditions of our experiment, which could seem to support the ZFEL. However, after a few hundred generations, the number of complexes starts to decrease, eventually reaching zero (i.e., virtually none of the individuals in the population are complex at generation 10,000).

Figure 9.

Fraction of complex individuals in populations evolving with no selection for a population initially composed of complex clones (top) or simple clones (bottom). Colors indicate the different repetitions.

To understand this result, we analyzed the variation of diversity and complexity in the populations, at both the genomic and the functional levels. Figures 10 and 11 show the evolution of the variance (left) and mean (right) of complexity measures among the populations for the ten repetitions initiated with a complex individual (Figure 10) or a simple individual (Figure 11), at the genomic (top) and functional (bottom) levels. In accordance with the ZFEL prediction, both figures show that the variance of complexity levels initially increases for the two complexity measures (left panels in Figures 10 and 11). This shows that, in the absence of selection, there is an increase of the level of diversity in the population. However, similarly to the proportion of complexes shown in Figure 9, this trend only lasts for a few hundred generations, after which the diversity at the genomic level slowly decreases, and that at the functional level quickly drops, eventually reaching zero. Both effects contradict the ZFEL prediction that, in the absence of selection, diversity should increase.

Figure 10.

Evolution of the variance (left) and mean (right) of complexity measures among the populations for the ten repetitions initiated with a complex individual, at the genomic (top) and functional (bottom) levels.

Figure 11.

Evolution of the variance (left) and mean (right) of complexity measures among the populations for the ten repetitions initiated with a simple individual, at the genomic (top) and functional (bottom) levels.

The contradiction is even clearer when looking at the mean levels of complexity in the populations (Figures 10 and 11, right panels). Whatever the initial conditions (simples or complexes) and the complexity measure (𝒞G or 𝒞P), in the absence of selection the complexity immediately drops, quickly reaching values close to zero.

Taken together, these results show that, in the absence of selection, populations quickly lose complexity, the initial increase of diversity being only due to the different individuals following different paths during this degradation process. Hence, contrary to the prediction of the ZFEL, selection appears as a necessary element to evolve—and actually maintain—horizontal complexity.

### 3.4 Complex Organisms Evolve Greater Complexity

So far we have analyzed only one time point: generation 250,000. To address the dynamics of the evolution of complexity, we analyzed the fate of simple and complex organisms between generations 10,000 and 250,000. Table 2 shows that most organisms classified as simples or complexes at generation 10,000 conserved this identity thereafter. These values are to be contrasted with the proportion of simples at generation 0 (99%, this high proportion being due to the initialization procedure—see Section 2.2), suggesting that most organisms switched from simple to complex between generation 0 and generation 10,000, but that the class they belong to at that time is then part of their identity.

Table 2.
Fraction of organisms that conserved their simple or complex identity between generation 10,000 and generation 250,000 (respectively, PSS or PCC). Brackets show the CI95% computed using the Wilson method; values in parentheses give the number of individuals with simple identity or complex identity at generations 250,000 and 10,000.

| | μ = 10⁻⁴ | μ = 10⁻⁵ | μ = 10⁻⁶ |
|---|---|---|---|
| PSS | 100% [100%–87.9%] (28/28) | 100% [100%–85.7%] (23/23) | 92.3% [98.6%–66.7%] (12/13) |
| PCC | 94.4% [97.8%–86.6%] (68/72) | 97.4% [99.3%–91%] (75/77) | 97.7% [99.4%–92%] (85/87) |
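For readers who want to recompute the bracketed intervals, here is a minimal sketch of the Wilson score interval. The function name is illustrative, and z = 1.96 is assumed for the 95% level; the counts in the usage lines are taken from Table 2.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval (lower, upper) for a binomial proportion.
    z = 1.96 is assumed here for an approximate 95% confidence level."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Counts from Table 2: PSS at the lowest mutation rate and PCC at the highest.
for k, n in ((12, 13), (68, 72)):
    low, high = wilson_interval(k, n)
    print(f"{k}/{n}: {low:.1%} to {high:.1%}")
```

Unlike the normal approximation, this interval stays within [0, 1] and remains informative when the observed proportion is 100%, as in the 28/28 and 23/23 cells.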

Figure 12 shows the evolution of 𝒞G (left) and 𝒞P (right) during the 250,000 generations of the experiment for the simple (blue) and the complex (red) organisms and for the three different mutation rates. It shows that the evolutionary trend is completely different between the two classes: Simples slightly decrease in 𝒞G and are almost constant in 𝒞P, whereas complexes increase in 𝒞P, and only those with a low mutation rate (10⁻⁶ mut·bp⁻¹·gen⁻¹) increase in 𝒞G. Indeed, on average, between generations 10,000 and 250,000, we have Δ𝒞G = −43.8 ± 2.2 in simple individuals (for all mutation rates) while Δ𝒞G = +25.3 ± 1.62 in complex ones. In the same period, Δ𝒞P = −0.32 ± 0.31 for simple individuals while Δ𝒞P = +3.58 ± 0.27 for complex ones (CI95% computed from the standard deviation $\sigma$ and the number of individuals $N_{I_{10k}}$: $\mathrm{CI}_{95\%} = 1.96\sqrt{\sigma^{2}/N_{I_{10k}}}$).
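The confidence half-width quoted after each ± sign can be sketched in a few lines. This assumes the standard form 1.96·sqrt(σ²/N) and uses the population variance; whether the population or the sample variance was used is not stated in the text, and the function name and input values are hypothetical.

```python
import math


def ci95_half_width(values):
    """Half-width of a 95% confidence interval on the mean: 1.96 * sqrt(var / N).
    Uses the population variance (an assumption; the text does not specify)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return 1.96 * math.sqrt(variance / n)


# Hypothetical per-lineage complexity changes, for illustration only.
deltas = [1.0, 2.0, 3.0, 4.0]
print(sum(deltas) / len(deltas), "+/-", ci95_half_width(deltas))
```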

Figure 12.

Evolution of genomic complexity (𝒞G, left) and functional complexity (𝒞P, right) between generations 0 and 250,000 for complex (red) and simple (blue) individuals. Solid curves: low mutation rate (10⁻⁶ mut·bp⁻¹·gen⁻¹); dotted curves: medium mutation rate (10⁻⁵ mut·bp⁻¹·gen⁻¹); dashed curves: high mutation rate (10⁻⁴ mut·bp⁻¹·gen⁻¹).

Figure 13 shows the evolution of fitness for the simples and the complexes for the three mutation rates. It shows that, whatever the mutation rates, the fitness of simples quickly reaches a very high value, close to the optimum—indirectly confirming the evolvability of simple organisms—and then almost plateaus for the rest of the experiment (mean fitness gain of the simples between generations 10,000 and 250,000: +0.09 ± 0.1). In contrast, complexes slowly grow in fitness all through the experiment, with sustained differences between the different mutation rates (the higher the mutation rate, the higher the fitness): Between generation 10,000 and generation 250,000 the mean fitness of the complexes has increased by ΔFitness = +0.16 ± 0.05.

Figure 13.

Evolution of fitness between generations 0 and 250,000 for complex (red) and simple (blue) individuals. Solid curves: low mutation rate (10⁻⁶ mut·bp⁻¹·gen⁻¹); dotted curves: medium mutation rate (10⁻⁵ mut·bp⁻¹·gen⁻¹); dashed curves: high mutation rate (10⁻⁴ mut·bp⁻¹·gen⁻¹).

The evolution of 𝒞G and 𝒞P for the simples can be easily understood by combining two factors. During the very first generations, their evolution is strongly driven by direct selection and their fitness quickly rises to nearly optimal values (Figure 13). However, after this initial period, direct selection becomes less efficient, as there is almost no room for further improvement. The evolution of the simples is then mainly driven by indirect selection for replication robustness. This leads to genome streamlining, especially in organisms subjected to a high mutation rate (Figure 12, left, blue dashed curve); hence the drop of 𝒞G. However, this mechanism has no clear effect on 𝒞P, because 𝒞P is already very low and because in the simples there is no selective pressure to increase functional complexity.

The evolution of 𝒞G and 𝒞P in the complexes can also be explained by a combination of the effect of direct and indirect selection, although the mechanism is different. Since complexes remain far from the optimum, direct selection is active all through the experiment, and 𝒞P continuously increases, although at a decreasing pace over evolutionary time (Figure 12, right, red curves). But this mechanism is constrained by the genomic level, which cannot accumulate too much information, because of robustness pressures (Figure 12, left, red curves; [10, 11, 14]). This robustness pressure imposes a bound on 𝒞G that strongly depends on the mutation rate. These bounds are clearly visible in Figure 12 (left, red curves), at least for high and medium mutation rates. However, since in Aevol 𝒞G and 𝒞P are only weakly linked (see model description in Section 2.1), these 𝒞G bounds still allow for the accumulation of functional complexity (Figure 12, right, red curves) and fitness improvement (Figure 13, red curves).

### 3.5 Effect of Harsh Robustness Constraints on Complexity

It is well known that under elevated mutational stress, robust lineages can be selected over fitter ones [33] and that genome compactness is a direct driver of mutational robustness [14]. Here, we have shown that, in our experiments, simple organisms have a higher robustness than complex ones (Figure 6). Hence, if fitness cannot drive evolution toward complexity reduction, as also shown previously, we hypothesized that robustness might do so, by imposing a strong complexity limit on the genome.

To test this hypothesis, we subjected the 300 final populations to a harsh mutation rate for 100,000 generations. Specifically, each population was further evolved with mutation rates μnew that were 10, 100, and 1,000 times greater than the initial rate (without exceeding the extreme rate μnew = 10⁻³). Table 3 shows, for the different levels of mutation rate increase, the percentage of organisms that, after being complex at generation 250,000, had switched to simple (C → S) at generation 350,000.
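The resulting experimental design can be enumerated with a short sketch. It works on base-10 exponents to avoid floating-point comparisons, assumes that increases exceeding 10⁻³ are simply not run (consistent with the empty cells of Table 3), and takes 100 populations per initial mutation rate from the totals in Table 2. All names are illustrative.

```python
POPULATIONS_PER_RATE = 100        # 300 final populations over three initial rates
INITIAL_EXPONENTS = (-4, -5, -6)  # initial rates 10^-4, 10^-5, 10^-6
FACTOR_EXPONENTS = (1, 2, 3)      # 10x, 100x, and 1,000x increases
MAX_EXPONENT = -3                 # no new rate above 10^-3


def harsh_rate_design():
    """Map each initial rate exponent to the exponents of the new rates tested."""
    return {
        e: [e + k for k in FACTOR_EXPONENTS if e + k <= MAX_EXPONENT]
        for e in INITIAL_EXPONENTS
    }


design = harsh_rate_design()
n_experiments = POPULATIONS_PER_RATE * sum(len(new) for new in design.values())
print(design)
print(n_experiments)
```

Under these assumptions the enumeration yields six (initial, new) rate pairs and 600 runs in total.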

Table 3.
Fraction of complex → simple transitions for all pairs of initial (columns) and final (rows) mutation rates. CI95% computed using the Wilson method; values in parentheses give the number of complex → simple transitions and the number of complexes at generation 250,000.

| | μ = 10⁻⁴ | μ = 10⁻⁵ | μ = 10⁻⁶ |
|---|---|---|---|
| μnew = 10⁻³ | 45.9% [58.3%–34%] (28/61) | 64.4% [74.4%–52.9%] (47/73) | 81.2% [88.1%–71.6%] (69/85) |
| μnew = 10⁻⁴ | – | 2.7% [9.3%–0.7%] (2/74) | 10.6% [18.9%–5.7%] (9/85) |
| μnew = 10⁻⁵ | – | – | 1.2% [6.4%–0.2%] (1/85) |

Among the 600 experiments, 463 started with complex organisms. 156 (33.7 ± 4.3%) of those switched from complex to simple (Table 3). Strikingly, while these complex → simple organisms experienced a harsh robustness constraint, their fitness strongly increased (mean variation: +0.69 ± 0.055) during the 100,000 generations of the experiment. In contrast, the 307 remaining complex → complex organisms experienced a fitness variation of +0.17 ± 0.34. Note that although they retained their complex identity, these organisms experienced a strong complexity decrease in reaction to the mutational pressure (𝒞G and 𝒞P mean variation: −126.1 ± 25.6 and −4.28 ± 0.93, respectively).
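The aggregate figures in this paragraph can be recomputed from the counts in Table 3. The sketch below assumes that the ± value is a normal-approximation 95% half-width, which is not stated explicitly in the text; the dictionary keys are only labels.

```python
import math

# (complex -> simple transitions, complexes at generation 250,000), from Table 3.
# Keys are (initial rate, new rate) labels.
TABLE_3 = {
    ("1e-4", "1e-3"): (28, 61),
    ("1e-5", "1e-3"): (47, 73),
    ("1e-6", "1e-3"): (69, 85),
    ("1e-5", "1e-4"): (2, 74),
    ("1e-6", "1e-4"): (9, 85),
    ("1e-6", "1e-5"): (1, 85),
}

switched = sum(s for s, _ in TABLE_3.values())
total = sum(n for _, n in TABLE_3.values())
proportion = switched / total
# Normal-approximation half-width (an assumption about how the +/- was obtained).
half_width = 1.96 * math.sqrt(proportion * (1 - proportion) / total)

print(switched, total, total - switched)
print(f"{proportion:.1%} +/- {half_width:.1%}")
```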

Compared to the proportion of complex → simple switches between generations 10,000 and 250,000 in the main experiment (8 individuals among 236; see Table 2), the complex → simple proportion in the robustness experiments is huge, especially for the extreme mutation rate μnew = 10⁻³. However, the robustness pressure needs to be very harsh to produce this effect (Table 3). This is probably due to selection for robustness already acting during the first part of the experiment (see above): At generation 250,000 the complex organisms we propagated in the robustness experiment were probably already robust enough to cope with a reasonable increase in the mutational pressure.

## 4 Discussion

By evolving populations of digital organisms, whose complexity can evolve at the genomic and functional levels independently, in a very simple environment, we were able to acquire important insights into the evolution of complexity. First, the continuous increase in complexity in such a non-demanding environment is a strong argument in favor of a complexity ratchet, that is, an irreversible mechanism that can add components (or information) to the evolving system but that cannot get rid of existing ones, even though that could be more favorable [7]. Indeed, one of the most astonishing observations is that the complexity ratchet clicks and goes on clicking despite the selective advantage of simple solutions over complex ones and despite their greater robustness. However, by submitting the same organisms to a harsh robustness constraint, we have shown that, unlike selection for fitness, selection for robustness, when severe, can overcome the ratchet and push complex organisms back toward simplicity.

These results may be interpreted as evidence of a non-selective mechanism such as the zero-force evolutionary law (ZFEL) [21]. However, we also showed that, in the absence of selection, all forms of complexity quickly vanish in our simulations, excluding the ZFEL as a possible explanation of the ratchet. Given the simplicity of the ZFEL formulation, this deserves a specific discussion. McShea and Brandon's argument in favor of the ZFEL is based on a few assumptions, in particular that complexities at the different organization levels are independent from one another and that an increase of diversity naturally results in an increase of complexity. But, as shown by our simulations, both assumptions are false: first because complexity at the functional level is encoded at the molecular (genetic) level, hence coupling the two levels, and second because an increase of diversity at the molecular level may result in a decrease of complexity at the functional level. The former observation is self-evident, but the latter deserves explanation. Molecular encoding on the genome is based on consensus signals (promoters, ribosome binding sites, start codons, etc.) that need to be recognized by molecular readers (polymerases, ribosomes, etc.). An increase of diversity at the molecular level not only increases the diversity of the genes' sequences (hence of the functional level, as is transiently observed in our simulations; see Figures 10 and 11, left panels), it also increases the diversity of the consensus signals, eventually hindering their ability to be recognized by their readers. A direct consequence is that the increase of diversity at the genomic level actually reduces the number of signals, and hence the number of encoded elements, leading to a decrease of complexity at both the genomic and the functional levels (see Figures 10 and 11, right panels).

In our experiments, simple organisms are fitter than complex ones. Previous results with Aevol showed that selection for robustness favors streamlined genomes [14, 24] and that the joint effect of duplications and deletions biases mutations toward reduction [11]. Then, if selection, robustness, and mutational biases all push in the same direction (simplicity), what is the force that counterbalances them all, hence leading to complexity increases? To answer this question, we first have to look back at the variation of fitness within the experiment (Figure 13). It shows that even though complexes remain far less fit than simples, they still substantially gain fitness throughout the simulation: Although complexity increases in spite of selection, its increase is nevertheless driven by selection.

This immediately points toward a negative epistatic phenomenon, namely, the differential effect of a mutation according to the genetic context in which it occurs. Epistasis is extensively documented theoretically and experimentally [25], and, interestingly, it has been shown that in natural populations, epistasis correlates with complexity [28]. Here, mutations that would have been beneficial in a given simple individual are deleterious in the genetic context of complex individuals (and reciprocally). Indeed, selection only acts on the basis of the local topology of the fitness landscape, which depends on the genetic background of the individuals. In a complex genetic context, negative epistasis forbids the acquisition of some genes that could be highly favorable in a simple context. Since gene deletion is obviously deleterious, the only available evolutionary path for an already complex organism seems to be a headlong rush toward increasing complexity by acquiring new genes. Hence the ratchet clicks, further widening the fitness valley that separates the current genome from a simple one, and soon making it so wide it is very unlikely to be crossed.

The geometric properties of the Aevol functional structure provide a good illustration of the ratchet mechanism. In our experiments, the phenotypic target can be fitted by a single triangular protein kernel. However, as soon as the proteome contains a protein with m ≠ 0.5 or w ≠ 0.1, this is no longer possible, because the function that remains to be fitted (viz., the target minus the protein kernels) becomes multilinear, and the ratchet starts clicking. In other words, each protein added to the proteome increases the complexity of the function that remains to be fitted, forbidding its fitting by a single triangle and triggering further gene recruitment.
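The geometric argument can be made concrete with a small sketch. It assumes that each protein contributes a triangular kernel of mean m, half-width w, and height h, that the phenotype is the sum of these kernels, and that the target is a single triangle with m = 0.5 and w = 0.1 as stated above; the target height of 1 and all names are hypothetical. The function lists the points where the slope of the residual (target minus protein kernels) changes; a function that a single triangle can fit has exactly three such points, so more than three rules out a single-triangle fit.

```python
from collections import defaultdict
from fractions import Fraction as F

# (m, w, h): m and w come from the text, the height h = 1 is an assumption.
TARGET = (F(1, 2), F(1, 10), F(1))


def residual_kinks(proteins, target=TARGET):
    """Return the x positions where the slope of (target - sum of proteins) changes.
    A triangle (m, w, h) changes slope by +h/w at m - w, -2h/w at m, +h/w at m + w."""
    change = defaultdict(F)  # Fraction() defaults to 0, keeping the arithmetic exact
    for sign, (m, w, h) in [(1, target)] + [(-1, p) for p in proteins]:
        slope = sign * h / w
        change[m - w] += slope
        change[m] -= 2 * slope
        change[m + w] += slope
    return sorted(x for x, c in change.items() if c != 0)


aligned = [(F(1, 2), F(1, 10), F(1, 2))]  # same m and w as the target
shifted = [(F(2, 5), F(1, 10), F(1, 2))]  # m = 0.4 instead of 0.5
print(len(residual_kinks([])), len(residual_kinks(aligned)), len(residual_kinks(shifted)))
```

With no protein, or with a protein sharing the target's m and w, the residual keeps three slope changes and remains a single triangle. A protein with a shifted mean adds a fourth slope change, so the remaining function can no longer be matched by one triangle.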

Now, if selection cannot overcome the ratchet, how can an increase in mutational pressure? It is known that severe robustness constraints can overcome selection by imposing an upper limit on the amount of information an organism can transmit to its offspring at the genetic [10] and at the genomic [11, 14] levels. In our experiments, raising the mutation rate strongly decreases the storage capacity of the genome, hence forcing gene elimination despite the fitness loss. This can lower epistatic constraints enough to allow the transition from complexity to simplicity.

Table 1 shows that the ratchet does not systematically start clicking: In nearly one-quarter of our simulations, evolution leads to simple solutions. Moreover, we saw that the path toward simplicity or complexity is taken very early in the simulations (often before generation 1,000; data not shown), which indirectly confirms that the ratchet is engaged when the organisms recruit their very first genes. But how is this initial direction determined? Starting with a single gene (see Section 2.2), the organisms can evolve in two ways: (1) optimizing this gene by mutation, (2) recruiting new genes, primarily through a duplication-divergence mechanism. Depending on this contingent alternative, akin to a “frozen accident” [9], evolution may be either more or less likely to lead to either simple or complex identity. However, selection can also play its role in the identity switch: Since the route to simplicity leads to higher fitnesses, clonal interference is likely to favor simplicity. Hence, if our explanation is correct, the fraction of simples should increase in very large populations (clonal interference being more frequent in large populations).

The two alternative evolutionary pathways described above also suggest that the negative epistatic interactions that lead to the accumulation of complexity could be due to rearrangement events (typically segmental duplications). We tested this hypothesis by evolving 100 populations in conditions where there is no rearrangement and where the mutation rate (for switches and InDels) is such that the total number of events is equivalent to the medium mutation rate in the main experiment (μ = 10⁻⁵ mut·bp⁻¹·gen⁻¹). We observed that, among these 100 populations, 98 evolved simple functional structures (to be compared with the 25 that evolved simple functional structures in the initial experiment; see Table 1). This confirms the strong role played by chromosomal rearrangements in this process. It also explains why the mechanism we identified here had not been observed previously. Indeed, to the best of our knowledge, Aevol is the sole model that is able to take precise account of the rearrangement mechanisms. Hence, our results also strongly call for a better accounting of these overlooked events in artificial life models.

Finally, if contingency can explain the initiation of the ratchet and epistasis its mechanisms, what about its long-term behavior? Will the ratchet click forever, thus reaching very high complexities? In our simulations the final complexity seems to be bounded, despite the great room for improvement in most of the complex organisms (Figures 4 and 12). Indeed, three effects can impose a bound on complexity: (1) As complexity grows, the advantage provided by new genes may become too small for selection to allow for their fixation. Indeed, it has been proposed that genome complexity could be mainly driven by population genetics effects [18]. However, this is unlikely to explain the apparent bound we observe, since complexes can still improve greatly (Figure 5). (2) Proteome complexity needs to be encoded in the genome, but there is an upper bound to the amount of information a genome (hence a proteome) can carry with given mutation [10] and rearrangement [11, 14] rates. (3) The waiting time to the next innovation increases as the organism becomes more complex. This is directly linked to the cost of complexity that slows adaptation down as the number of selected traits increases [23]. In our simulation, simples fit the target globally—as a single trait—while complexes virtually split the target into parts, which they fit more or less independently from one another. Hence complexes are likely to suffer from the cost of complexity: As complexity increases, evolution slows down in such a way that it would require virtually infinite waiting time to approach the two above-mentioned bounds.

When experimenting with models, a tricky question is always to tell evolutionary trends apart from model artefacts. Here, we used Aevol, a model that has already proven its consistency, but that nevertheless has its limits. Among them, three at least are likely to interfere with our results. First, as in most ALife models, we deal with very small populations compared to natural populations. As discussed previously, a larger population size may change the initial direction toward simplicity or complexity or the upper complexity bounds. However, since selection cannot invert the ratchet, we hypothesize that our general conclusions hold qualitatively whatever the size of the population. Second, the properties of our artificial chemistry may differ from real biochemistry. In particular, dosage effects are stronger in Aevol than they are in nature. However, this property is likely to limit the complexity increase, since gene duplications are more deleterious in the model than in nature. Thus, this should not alter our main conclusions. Last but not least, although Aevol is a multi-scale model, it lacks some scales that are likely to play a crucial role in the evolution of complexity. In particular it lacks a complex ecosystem and a gene network. Hence, we cannot observe here the effect of niche construction, which could act as an important player in the evolution of complexity [27].

On the gene network side, our results match very well those we obtained when using the RAevol version of the model to evolve genetic networks in constant versus variable environments [6, 16, 32]. Indeed, in these experiments the complexity of the network appeared to be driven by the mutation rate, and highly complex networks evolved even in constant environments. This opens the interesting prospect of replicating the present experiments in RAevol.

Our work opens many other prospects. Specifically, we would like to analyze the evolutionary dynamics of our populations at a finer grain. In particular, analyzing the effect of every single mutation on complexity, fitness, evolvability, and robustness, depending on the mutation type (point mutations versus rearrangements), would allow for a better characterization of the epistatic interactions in the model. Finally, the most engaging possibility would be to generalize the mechanisms observed here to other kinds of systems. Indeed, an open question is whether this complexity ratchet could contribute to open-ended evolution [3], hence opening the door for non-selectively-driven open-endedness. A difficult question here is whether epistasis (and negative epistasis) has an equivalent in other open-ended systems such as economics or innovation.

In conclusion, we would like to stress that our results, gathered from a null model, do not imply that there is no such thing as selection for complexity. But importantly, they show that selection for complexity is not mandatory for complexity to evolve. Hence, complex biological structures could flourish in conditions where complexity is not needed. Reciprocally, the global function of these complex structures could very well be simple. We think this result is greatly significant for both evolutionary biology and systems biology.

## Note

1

Degenerate proteins encode triangles whose area is equal to zero (i.e., h = 0 and/or w = 0). These proteins hence do not contribute to the phenotype.

## References

1. Adami, C. (2002). What is complexity? BioEssays, 24(12), 1085–1094.
2. Adami, C., Ofria, C., & Collier, T. C. (2000). Evolution of biological complexity. Proceedings of the National Academy of Sciences of the U.S.A., 97(9), 4463–4468.
3. Banzhaf, W., Baumgaertner, B., Beslon, G., Doursat, R., Foster, J. A., McMullin, B., De Melo, V. V., Miconi, T., Spector, L., Stepney, S., et al. (2016). Defining and simulating open-ended novelty: Requirements, guidelines, and challenges. Theory in Biosciences, 135(3), 131–161.
4. Batut, B., Knibbe, C., Marais, G., & Daubin, V. (2014). Reductive genome evolution at both ends of the bacterial population size spectrum. Nature Reviews Microbiology, 12(12), 841.
5. Batut, B., Parsons, D. P., Fischer, S., Beslon, G., & Knibbe, C. (2013). In silico experimental evolution: A tool to test evolutionary scenarios. BMC Bioinformatics, 14(S11).
6. Beslon, G., Parsons, D. P., Sanchez-Dehesa, Y., Pena, J.-M., & Knibbe, C. (2010). Scaling laws in bacterial genomes: A side-effect of selection of mutational robustness? Biosystems, 102(1), 32–40.
7. Cairns-Smith, A. (1995). The complexity ratchet. In G. S. Shostak (Ed.), Progress in the Search for Extraterrestrial Life (pp. 31–36). San Francisco: Astronomical Society of the Pacific.
8. Clune, J., Mouret, J.-B., & Lipson, H. (2013). The evolutionary origins of modularity. Proceedings of the Royal Society B, 280(1755), 20122863.
9. Crick, F. H. (1968). The origin of the genetic code. Journal of Molecular Biology, 38(3), 367–379.
10. Eigen, M., & Schuster, P. (1977). A principle of natural self-organization. Naturwissenschaften, 64(11), 541–565.
11. Fischer, S., Bernard, S., Beslon, G., & Knibbe, C. (2014). A model for genome size evolution. Bulletin of Mathematical Biology, 76(9), 2249–2291.
12
Gould
,
S. J.
(
1996
).
Full house: The spread of joy from Plato to Darwin
.
New York
:
Harmony Books
.
13
Hahn
,
M. W.
, &
Wray
,
G. A.
(
2002
).
.
Evolution and Development
,
4
(
2
),
73
75
.
14
Knibbe
,
C.
,
Coulon
,
A.
,
Mazet
,
O.
,
Fayard
,
J.-M.
, &
Beslon
,
G.
(
2007
).
A long-term evolutionary pressure on the amount of noncoding DNA
.
Molecular Biology and Evolution
,
24
(
10
),
2344
2353
.
15
Knibbe
,
C.
,
Fayard
,
J.-M.
, &
Beslon
,
G.
(
2008
).
The topology of the protein network influences the dynamics of gene order: From systems biology to a systemic understanding of evolution
.
Artificial Life
,
14
(
1
),
149
156
.
16
Knibbe
,
C.
,
Parsons
,
D. P.
, &
Beslon
,
G.
(
2011
).
Parsimonious modeling of scaling laws in genomes and transcriptomes
. In
T.
Lenaerts
,
M.
Giacobini
,
H.
Bersini
,
P.
Bourgine
,
M.
Dorigo
, &
R.
Doursat
(Eds.),
Proceedings of the Eleventh European Conference on the Synthesis and Simulation of Living Systems (ECAL 11)
(pp.
414
415
).
Cambridge, MA
:
MIT Press
.
17
Liard
,
V.
,
Parsons
,
D.
,
Rouzaud-Cornabas
,
J.
, &
Beslon
,
G.
(
2018
).
The complexity ratchet: Stronger than selection, weaker than robustness
. In
T.
Ikegami
,
N.
Virgo
,
O.
Witkowski
,
M.
Oka
,
R.
Suzuki
, &
H.
Iizuka
(Eds.),
Artificial Life Conference Proceedings
(pp.
250
257
).
Cambridge, MA
:
MIT Press
.
18
Lynch
,
M.
, &
Conery
,
J. S.
(
2003
).
The origins of genome complexity
.
Science
,
302
(
5649
),
1401
1404
.
19
McShea
,
D. W.
(
1996
).
Metazoan complexity and evolution: Is there a trend?
Evolution
,
50
(
2
),
477
492
.
20
McShea
,
D. W.
(
2017
).
Evolution of complexity
. In
L.
Nuno de la Rosa
&
G.
Müller
(Eds.),
Evolutionary Developmental Biology: A Reference Guide
(pp.
1
11
).
New York
:
Springer
.
21
McShea
,
D. W.
, &
Brandon
,
R. N.
(
2010
).
Biology's first law: The tendency for diversity and complexity to increase in evolutionary systems
.
Chicago
:
University of Chicago Press
.
22
Miconi
,
T.
(
2008
).
Evolution and complexity: The double-edged sword
.
Artificial Life
,
14
(
3
),
325
344
.
23
Orr
,
A. H.
(
2000
).
Adaptation and the cost of complexity
.
Evolution
,
54
(
1
),
13
20
.
24
Parsons
,
D. P.
,
Knibbe
,
C.
, &
Beslon
,
G.
(
2010
).
Importance of the rearrangement rates on the organization of transcription
. In
H.
Fellerman
,
M.
Dörr
,
M.
Hanczy
,
L. L.
Laursen
,
S.
Mauer
,
D.
Merkle
,
P.-A.
Monard
,
K.
Støy
, &
S.
Rasmussen
(Eds.),
Artificial Life XII: Proceedings of the Twelfth International Conference on the Synthesis and Simulation of Living Systems
(pp.
479
486
).
Cambridge, MA
:
MIT Press
.
25
Phillips
,
P. C.
(
2008
).
Epistasis—the essential role of gene interactions in the structure and evolution of genetic systems
.
Nature Reviews Genetics
,
9
(
11
),
855
867
.
26
Pigliucci
,
M.
(
2008
).
Is evolvability evolvable?
Nature Reviews Genetics
,
9
(
1
),
75
.
27
Rocabert
,
C.
,
Knibbe
,
C.
,
Consuegra
,
J.
,
Schneider
,
D.
, &
Beslon
,
G.
(
2017
).
Beware batch culture: Seasonality and niche construction predicted to favor bacterial adaptive diversification
.
PLoS Computational Biology
,
13
(
3
),
e1005459
.
28
Sanjuán
,
R.
, &
Elena
,
S. F.
(
2006
).
Epistasis correlates to genomic complexity
.
Proceedings of the National Academy of Sciences of the U.S.A.
,
103
(
39
),
14402
14405
.
29
Simon
,
H. A.
(
1962
).
The architecture of complexity
.
Proceedings of the American Philosophical Society
,
106
(
6
),
467
482
.
30
Soyer
,
O. S.
, &
Bonhoeffer
,
S.
(
2006
).
Evolution of complexity in signaling pathways
.
Proceedings of the National Academy of Sciences of the U.S.A.
,
103
(
44
),
16337
16342
.
31
Thomas
,
C. A. J.
(
1971
).
The genetic organization of chromosomes
.
Annual Review of Genetics
,
5
(
1
),
237
256
.
32
,
Y.
,
Rouzaud-Cornabas
,
J.
, &
Beslon
,
G.
(
2016
).
In silico experimental evolution suggests a complex intertwining of selection, robustness and drift in the evolution of genetic networks complexity
. In
C.
Gershenson
,
T.
Froese
,
J. M.
Siqueiros-García
,
W.
Aguilar
,
E. J.
Izquierdo
, &
H.
Sayama
(Eds.),
ALIFE 2016: The Fifteenth International Conference on the Synthesis and Simulation of Living Systems
(pp.
180
188
).
Cambridge, MA
:
MIT Press
.
33
Wilke
,
C. O.
,
Wang
,
J. L.
,
Ofria
,
C.
,
Lenski
,
R. E.
, &
,
C.
(
2001
).
Evolution of digital organisms at high mutation rates leads to survival of the flattest
.
Nature
,
412
(
6844
),
331
.
34
Woods
,
R. J.
,
Barrick
,
J. E.
,
Cooper
,
T. F.
,
Shrestha
,
U.
,
Kauth
,
M. R.
, &
Lenski
,
R. E.
(
2011
).
Second-order selection for evolvability in a large Escherichia coli population
.
Science
,
331
(
6023
),
1433
1436
.
35
Yaeger
,
L.
,
Griffith
,
V.
, &
Sporns
,
O.
(
2008
).
Passive and driven trends in the evolution of complexity
. In
S.
Bullock
,
J.
Noble
,
R.
Watson
, &
M. A.
Bedau
(Eds.),
Artificial Life XI: Proceedings of the Eleventh International Conference on the Simulation and Synthesis of Living Systems
(pp.
725
732
).
Cambridge, MA
:
MIT Press
.