## Abstract

Is undecidability a requirement for open-ended evolution (OEE)? Using methods derived from algorithmic complexity theory, we propose robust computational definitions of open-ended evolution and the adaptability of computable dynamical systems. Within this framework, we show that decidability imposes absolute limits on the stable growth of complexity in computable dynamical systems. Conversely, systems that exhibit (strong) open-ended evolution must be undecidable, establishing undecidability as a requirement for such systems. Complexity is assessed in terms of three measures: sophistication, coarse sophistication, and busy beaver logical depth. These three complexity measures assign low complexity values to random (incompressible) objects. As time grows, the stated complexity measures allow for the existence of complex states during the evolution of a computable dynamical system. We show, however, that finding these states involves undecidable computations. We conjecture that for similar complexity measures that assign low complexity values, decidability imposes comparable limits on the stable growth of complexity, and that such behavior is necessary for nontrivial evolutionary systems. We show that the undecidability of adapted states imposes novel and unpredictable behavior on the individuals or populations being modeled. Such behavior is irreducible. Finally, we offer an example of a system, first proposed by Chaitin, that exhibits strong OEE.

## 1 Introduction

An in-depth review of the preliminaries, as well as more information on selected topics, can be found in the extended version of this article [22].

Broadly speaking, a dynamical system is one that changes over time. Prediction of the future behavior of dynamical systems is a fundamental concern of science generally. Scientific theories are tested upon the accuracy of their predictions, and establishing invariable properties through the evolution of a system is an important goal. Limits to this predictability are known in science. For instance, chaos theory establishes the existence of systems in which small deficits in the information of the initial states make accurate predictions of future states unattainable. However, in this article we focus on systems for which we have unambiguous, finite (as to size and time), and complete descriptions of initial states and behavior: computable dynamical systems.

Since their formalization by Church and Turing, the class of computable systems has shown that, even without information deficits (i.e., with complete descriptions), there are future states that cannot be predicted, in particular the state known as the *halting state* [34]. We will use this result and others from algorithmic information theory to show how predictability imposes limits on the growth of complexity during the evolution of computable systems. In particular, we will show that random (incompressible) times tightly bound the complexity of the associated states.

The relationship between dynamical systems and computability has been studied before by Bournez [10], Blondel [9], Moore [29], and Fredkin, Margolus, and Toffoli [20, 28], among others. That emergence is a consequence of incomputability has been proposed by Cooper [17]. Complexity as a source of undecidability has been observed in logic by Calude and Jurgensen [11]. Delvenne, Kurka, and Blondel [19] have proposed robust definitions of computable (effective) dynamical systems and universality, generalizing Turing's halting states, while also setting forth the conditions and implications for universality and decidability and their relationship with chaos. The definitions and general approach used in this paper differ from those in the sources cited above, but are ultimately related.

We will denote by *K*(*x*|*y*) the algorithmic descriptive complexity of the string *x* with respect to the string *y*. The dynamical systems we are considering are deterministic, and each state must contain all the information needed to compute successive states. We are assuming an infinity of possible states for non-cyclical systems. Mechanisms and requirements for open-ended evolution in systems with a finite number of states (resource-bounded) have been studied by Adams et al. [3].

### 1.1 Open-Ended Evolution in Computable Dynamical Systems

Informally, *open-ended evolution* (OEE) has been characterized as “evolutionary dynamics in which new, surprising, and sometimes more complex organisms and interactions continue to appear” [34, p. 1]. Establishing and defining the properties required for a system to exhibit OEE is considered an open question
[7, 31, 32], and OEE has been proposed as a required property of evolutionary systems capable of producing life [30]. This has been implicitly verified by various experiments *in silico* [27, 2, 26, 5].

One line of thought posits that open-ended evolutionary systems tend to produce families of objects of increasing *complexity* [6, 5]. Furthermore, for a number of complexity measures, it can be shown that the objects belonging to a given level of complexity are finite in number (for instance *K*(*x*)). Therefore an increase of complexity is a requirement for the continued production of new objects. A related observation, proposed by Chaitin [16, 15], associates evolution with the search for *mathematical creativity*, which implies an increase of complexity, as more complex mathematical operations are needed in order to solve interesting problems, which are required to drive evolution.

Following the aforementioned lines of thought, we have chosen to characterize OEE in computable dynamical systems as a process that has the property of producing families of objects of increasing complexity. Formally, given a *complexity measure C*, we say that a computable dynamical system *S*(*M*_{0}, *t*), where *M*_{t} = *S*(*M*_{0}, *t*) = *S*(*M*_{t−1}, 1) = *S*(*M*_{t}, 0) is the state of the system at time *t*, exhibits *open-ended evolution* with respect to *C* if for every time *t* there exists a time *t*′ such that the complexity of the system at time *t*′ is greater than the complexity at time *t*, that is, *C*(*S*(*M*_{0}, *t*)) < *C*(*S*(*M*_{0}, *t*′)), where a complexity measure is a (not necessarily computable) function from the state space to a positive numerical space.

The existence of such systems is trivial for complexity measures on which any infinite set of natural numbers (not necessarily computable) contains a subset where the measure grows strictly:

*Let C be a complexity measure such that any infinite set of natural numbers has a subset where C grows strictly. Then a computable system S(M_{0}, t) produces an infinite number of different states if and only if it exhibits OEE for C.*

*Proof.*

Let *S*(*M*_{0}, *t*) be a system that does not exhibit OEE, and *C* a complexity measure as described. Then there exists a time *t* such that for any other time *t*′ we have *C*(*M*_{t′}) ≤ *C*(*M*_{t}), which holds true for any subset of states of the system. It follows that the set of states must be finite. Conversely, if the system exhibits OEE, then there exists an infinite subset of states on which *C* grows strictly, hence an infinity of different states.

Given the previous lemma, a trivial computable system that simply produces all the strings in order exhibits OEE on a class of complexity measures that includes algorithmic description complexity. However, we intuitively conjecture that such systems have a much simpler behavior than that observed in the natural world and the artificial life systems referenced. To avoid some of these issues we propose a stronger version of OEE.
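The trivial enumeration system just described can be sketched directly. Since *K* is uncomputable, bit length serves here as a computable stand-in for the complexity measure, and the OEE condition is checked on a finite prefix of the evolution only; all names and bounds are illustrative choices, not the paper's.

```python
# Toy illustration of the lemma: an enumeration "system" S(M0, t) that
# outputs successive naturals, with C(x) = bit length of x standing in
# for the (uncomputable) algorithmic complexity K.

def S(M0: int, t: int) -> int:
    """Enumeration machine: the state at time t is simply M0 + t."""
    return M0 + t

def C(x: int) -> int:
    """Proxy complexity measure: number of bits in x."""
    return x.bit_length()

def witnesses_oee(M0: int, horizon: int) -> bool:
    """Check the OEE condition on a finite prefix: for every t < horizon
    there is a later t' (searched up to 2 * horizon) with
    C(S(M0, t')) > C(S(M0, t))."""
    for t in range(horizon):
        if not any(C(S(M0, tp)) > C(S(M0, t))
                   for tp in range(t + 1, 2 * horizon)):
            return False
    return True

print(witnesses_oee(1, 50))  # True: the proxy complexity keeps growing
```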

A sequence *n*_{0}, *n*_{1}, …, *n*_{i}, … exhibits *strong open-ended evolution* (strong OEE) with respect to a complexity measure *C* if for every index *i* there exists an index *i*′ such that *C*(*n*_{i}) < *C*(*n*_{i′}), and the sequence of complexities *C*(*n*_{0}), *C*(*n*_{1}), …, *C*(*n*_{i}), … does not drop *significantly*; that is, there exists a γ such that *i* ≤ *j* implies *C*(*n*_{i}) ≤ *C*(*n*_{j}) + γ(*j*), where γ(*j*) is a positive function that does not grow *significantly*.

It is important to note that while the definition of OEE allows for significant drops in complexity during the evolution of a system, strong OEE requires that the complexity of the system not decrease *significantly* during its evolution. In particular we will require that the complexity drops as measured by γ not grow as fast as the complexity itself. Formally, *C*(*n*_{j}) − γ(*j*) should not be upper-bounded for any infinite subsequence, for the smallest γ for which the strong OEE inequality holds.

We will construe the concept of *speed* of growth of complexity in a comparative way: Given two sequences of natural numbers *n*_{i} and *m*_{i}, *n*_{i} *grows faster* than *m*_{i} if for every infinite subsequence and natural number *N*, there exists *j* such that *n*_{j} − *m*_{j} ≥ *N*. Conversely, a subsequence of indexes denoted by *i* grows faster than a subsequence of indexes denoted by *j* if for every natural *N*, there exist *i* and *j* with *i* < *j* such that *n*_{i} − *n*_{j} ≥ *N*.

If a complexity measure is sophisticated enough to depend on more than just the size of an object, significant drops in complexity are a feature that can be observed in trivial sequences such as the ones produced by enumeration machines. Whether this is also true for *nontrivial* sequences is open to debate. However, if we classify random strings as low-complexity objects and posit that nontrivial sequences must contain a limited number of random objects, then a nontrivial sequence must observe bounded drops in complexity in order to be capable of showing nontrivial OEE. This is the intuition behind the definition of strong OEE.
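The bounded-drop condition of strong OEE can be made concrete on a finite trace of complexity values. The allowance γ(*j*) = log₂(*j* + 2) below is a purely illustrative choice of a slowly growing function, not one taken from the paper.

```python
import math

# Sketch of the strong-OEE inequality on a finite trace Cs of
# complexity values: for all i <= j we need Cs[i] <= Cs[j] + gamma(j).
# Tracking the running maximum of earlier values makes the check O(n).

def satisfies_strong_oee(Cs, gamma):
    """True iff Cs[i] <= Cs[j] + gamma(j) for every i <= j."""
    running_max = float("-inf")
    for j, cj in enumerate(Cs):
        running_max = max(running_max, cj)
        if running_max > cj + gamma(j):   # an earlier value dropped too far
            return False
    return True

gamma = lambda j: math.log2(j + 2)        # illustrative drop allowance

print(satisfies_strong_oee([1, 2, 3, 4, 5], gamma))  # True: no drops
print(satisfies_strong_oee([1, 9, 1, 9, 1], gamma))  # False: large drops
```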

Now, in the literature on dynamical systems, random objects are often considered simple ([1, p. 1]), with complexity being taken to lie between regularity and randomness. Various complexity measures have been proposed that assign low complexity to random or incompressible natural numbers. Two examples of such measures are logical depth [8] and sophistication [25]. Classifying random naturals as low-complexity objects is a requirement for the results shown in Section 3.

## 2 A Computational Model for Adaptation

Let's start by describing the evolution of an organism or a population by a computable dynamical system. It has been argued that in order for adaptation and survival to be possible, an organism must contain an effective representation of the environment, so that, given a reading of the environment, the organism can choose a behavior accordingly [35]. The more faithful this representation, the better the adaptation. If the organism is computable, this information can be codified by a computable structure. We will denote this structure by *M*_{t}, where *t* stands for the time corresponding to each of the stages of the evolution of the organism. This information is then processed following a finitely specified unambiguous set of rules that, in finite time, will determine the adapted behavior of the organism according to the information codified by *M*_{t}. We will denote this behavior (or a theory explaining it) using the program*p*_{t}. An adapted system is one that produces an acceptable approximation of its environment. An environment can also be represented by a computable structure *E*. In other words, the system is adapted if *p*_{t}(*M*_{t}) produces *E*. Based on this idea, we propose a robust, formal characterization for adaptation, which presents a necessary (but not sufficient) condition:

Let *K* be the prefix-free descriptive complexity. We say that the system at the state *M*_{t} is ϵ-*adapted* to *E* if *K*(*E*|*S*(*M*_{0}, *E*, *t*)) ≤ ϵ.

The inequality states that the minimal amount of information needed to describe *E* from a complete description of *M*_{t} is ϵ or less. This information is provided in the form of a program *p* that produces *E* from the system at time *t*. We will define such a program *p* as the *adapted behavior* of the system. It is not required that *p* be unique.

The proposed structure for adapted systems is robust, since *K*(*E*|*S*(*M*_{0}, *E*, *t*)) is less than or equal to the number of characters needed to describe any computable method of producing *E* from the state of the system at time *t*, whether it be a computable theory for adaptation or a computable model for an organism that tries to predict *E*. It follows that any computable characterization of adaptation that can be described within ϵ bits meets the definition of ϵ-*adapted*, given a suitable choice of *E*, the *adaptation condition* for any given environment. It is important to note that, although inspired by a representationalist approach to adaptation, the proposed characterization of adaptation is not contingent on the organism's containing an actual codification of the environment, since any organism that can produce an adapted behavior that can be explained effectively (is computable in finite time) is ϵ-adapted for some ϵ.

As a simple example, we can think of an organism that must find food located at the coordinates (*j*, *k*) on a grid in order to survive. If the information in an organism is codified by a computable structure *M* (such as DNA), and there is a set of finitely specified, unambiguous rules that govern how this information is used (such as the ones specified by biochemistry and biological theories), codified by a program *p*, then we say that the organism finds the food if *p*(*M*) = (*j*, *k*). If |〈*p*〉| ≤ ϵ, then we say that the organism is adapted according to a behavior that can be described within ϵ characters. The proposed model for adaptation is not limited to such simple interactions. For a start, we can suppose that the organism *sees* a grid, denoted by *g*, of size *n* × *m* with food at the coordinates (*j*, *k*). The environment can be codified as a function *E* such that *E*(*g*) = (*j*, *k*), and being ϵ-adapted implies that the organism defined by the genetic code *M*, which is interpreted by a theory or behavior written in ϵ bits, is capable of finding the food upon seeing *g*. Similarly, more complex computational structures and interactions imply ϵ-adaptation.
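A minimal sketch of the grid example follows; the genome encoding, the decoding behavior, and the grid representation are all illustrative assumptions rather than constructions from the paper.

```python
# Toy version of the grid example: the "genome" M encodes the food
# coordinates, and the behavior p decodes them. The environment E maps
# a grid to the food position; the organism is adapted (in this toy
# sense) if p(M) == E(g). The fixed-width encoding is our own choice.

def E(grid):
    """Environment: where the food actually is on the grid."""
    for j, row in enumerate(grid):
        for k, cell in enumerate(row):
            if cell == "food":
                return (j, k)
    return None

def p(M):
    """Adapted behavior: decode coordinates from the genome M."""
    j, k = divmod(M, 100)          # hypothetical fixed-width encoding
    return (j, k)

grid = [["", "", ""], ["", "", "food"], ["", "", ""]]
M = 1 * 100 + 2                    # genome encoding the point (1, 2)
print(p(M) == E(grid))             # True: the organism finds the food
```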

Now, describing an evolutionary system that (eventually) produces an ϵ-*adapted* system is trivial via an enumeration machine (the program that produces all the natural numbers in order), as it will eventually produce *E* itself. Moreover, we require the output of our process to remain adapted. Therefore we propose a stronger condition called *convergence*:

Given the description of a computable dynamical system *S*(*M*_{0}, *E*, *t*), where *t* ∈ ℕ is the variable of time, *M*_{0} is an initial state, and *E* is an environment, we say that the system *S converges* towards *E* with degree ϵ if there exists δ such that *t* ≥ δ implies *K*(*E*|*S*(*M*_{0}, *E*, *t*)) ≤ ϵ.

For a fixed system *S* with initial state *M*_{0} and environment *E*, it is easy to see that the descriptive complexity of a state of the system depends mostly on *t*: We can describe a program that, given full descriptions of *S*, *E*, *M*_{0}, and *t*, finds *S*(*M*_{0}, *E*, *t*). Therefore, as *t* grows, *time becomes the main driver for the descriptive complexity within the system*.

### 2.1 Irreducibility of Descriptive Time Complexity

In the previous paragraph, it was established that time is the main factor in the descriptive complexity of the states within the evolution of a system. This result is expanded by the time complexity stability theorem (Theorem 2). This theorem establishes that, within an algorithmic descriptive complexity framework, similarly complex initial states must evolve into similarly complex future states over similarly complex time frames, effectively erasing the difference between the complexity of the state of the system and the complexity of the corresponding time, and establishing absolute limits to the reducibility of future states.

Let *F*(*t*) = *T*(*S*(*M*_{0}, *E*, *t*)) be the *real execution time* of the system at time *t*. Using our time counting machine *U*^{H}, it is easy to see that *F*(*t*) is computable and, given the uniqueness of the successor state, *F* increases strictly with *t*, and hence is injective. Consequently, *F* has a computable inverse *F*^{−1} over its image. Therefore we have that (up to a small constant) *K*(*F*(*t*)) ≤ *K*(*F*) + *K*(*t*) and *K*(*t*) ≤ *K*(*F*^{−1}) + *K*(*F*(*t*)). It follows that *K*(*t*) = *K*(*F*(*t*)) + *O*(*c*), where *c* is an integer independent of *t* (but that can depend on *S*). In other words, for a fixed system *S*, the execution time and the system time are *equally complex up to a constant*. From here on we will not differentiate between the complexities of the two times. A generalization of the previous equation is given by the following theorem:

*Let S and S′ be two computable systems, and t and t′ the first times at which the systems reach the states M_{t} and M′_{t′}, respectively. Then there exists c such that |K(M_{t}) − K(t)| ≤ c and |K(M_{t}) − K(M′_{t′})| ≤ c.*

*Proof.*

First, note that we can describe a program such that given *S*, *M*_{0}, and *E*, we have that *S*(*M*_{0}, *E*, *x*) runs for each *x* until it finds *t*. Therefore *K*(*t*) ≤ *K*(*S*(*M*_{0}, *E*, *t*)) + *K*(*S*) + *K*(*M*_{0}) + *K*(*E*) + *O*(1). Similarly for *t*′. By the inequality (1) and the hypothesized equalities we obtain *K*(*t*) − (*K*(*S*) + *K*(*M*_{0}) + *K*(*E*) + *O*(1)) ≤ *K*(*M*_{t}) ≤ *K*(*t*) + (*K*(*S*) + *K*(*E*) + *K*(*M*_{0}) + *O*(1)).
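The invertibility argument behind the equality of *K*(*t*) and *K*(*F*(*t*)) can be sketched concretely: a strictly increasing computable function is inverted over its image by plain search, so each of *t* and *F*(*t*) is computable from the other plus a constant-size program. The specific *F* below is an arbitrary increasing stand-in; the paper's *F*(*t*) = *T*(*S*(*M*_{0}, *E*, *t*)) depends on the actual system.

```python
# F is computable and strictly increasing, hence injective; its inverse
# over its image is computable by exhaustive search.

def F(t: int) -> int:
    """Stand-in for cumulative execution time: strictly increasing."""
    return t * t + 3 * t + 1

def F_inverse(y: int) -> int:
    """Recover t from F(t) by exhaustive search (F is injective)."""
    t = 0
    while F(t) < y:
        t += 1
    if F(t) != y:
        raise ValueError("y is not in the image of F")
    return t

# Each value computes the other via a fixed program, so descriptions of
# t and F(t) differ by at most the constant-size code of F / F_inverse.
assert F_inverse(F(42)) == 42
```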

The slow growth of time is a possible objection to the assertion that time is the dominating parameter for the descriptive complexity of evolving systems: The function *K*(*t*) grows within *O*(log *t*), which is very slow and often considered insignificant in the information theory literature. However, we have to consider the scale of time we are using. For instance, one second of *real time* in the system we are modeling may mean an exponential number of discrete time steps for our computable model (for instance, if we are modeling a genetic machine with current computer technology), yielding a potential polynomial growth in their descriptive complexity. However, if this time conversion is computable, then *K*(*t*) changes by at most a constant. This is an instance of *irreducibility*, as there exist infinite sequences of times that cannot be obtained by computable methods. In the upcoming subsections we will call such times *random times*, and the sequences containing them will be deemed *irreducible*.

### 2.2 Non-randomness of Decidable Convergence Times

One of the most important tasks for science is predicting the future behavior of dynamical systems. The prediction we will focus on concerns the first state of convergence (Definition 1): Will a system converge, and how long will it take? In this section we shall show the limit that decidability imposes on the complexity of the first convergent state. A consequence of this is the existence of undecidable adapted states.

For convergence of *S* with degree ϵ to be decidable, there must exist an algorithm *D*_{ϵ} such that *D*_{ϵ}(*S*, *M*_{0}, *E*, δ) = 1 if the system is convergent at time δ and 0 otherwise. Moreover, we can describe a machine *P* such that, given full descriptions of *D*_{ϵ}, *S*, and *M*_{0}, it runs *D*_{ϵ} with inputs *S* and *M*_{0} while running over all the possible times *t*, returning the first *t* for which the system converges. Note that δ = *P*(〈*D*_{ϵ}〉〈*S*〉〈*M*_{0}〉〈*E*〉). Hence we have a short description of δ, and therefore δ *cannot be random*: If *S*(*M*_{0}, *E*, *t*) is a convergent system, then

*K*(δ) ≤ *K*(*D*_{ϵ}) + *K*(*S*) + *K*(*M*_{0}) + *K*(*E*) + *O*(1). (2)

*Let S be a system convergent at time δ. If δ is considerably more descriptively complex than the system and the environment—that is, if for every reasonably large natural number d we have that K(δ) > K(S) + K(E) + K(M_{0}) + d—then δ cannot be found by an algorithm described within d characters.*

*Proof.*

It is a direct consequence of the inequality (2).

We call such times *random convergence times*, and the state of the system *M*_{δ} a *random state*. It is important to note that the descriptive complexity of a random state must also be high:

*Let S be a convergent system with a random convergence time δ and corresponding state S(M_{0}, E, δ). For every reasonably large d we have K(S(M_{0}, E, δ)) > K(S) + K(E) + K(M_{0}) + d.*

*Proof.*

Suppose the contrary to be true, that is, that there exists *d* small enough that *K*(*S*(*M*_{0}, *E*, δ)) ≤ *K*(*S*) + *K*(*E*) + *K*(*M*_{0}) + *d*. Let *q* be the program that, given *S*, *E*, *M*_{0}, and *S*(*M*_{0}, *E*, δ), runs *S*(*M*_{0}, *E*, *t*) in order for each *t* and compares the result with *S*(*M*_{0}, *E*, δ), returning the first time where the equality is reached. Therefore, given that the successor states are unique, we have δ = *q*(*S*, *M*_{0}, *E*, *S*(*M*_{0}, *E*, δ)) and *K*(δ) ≤ *K*(*S*) + *K*(*E*) + *K*(*M*_{0}) + *K*(*S*(*M*_{0}, *E*, δ)) + |*q*| ≤ *K*(*S*) + *K*(*E*) + *K*(*M*_{0}) + (*K*(*S*) + *K*(*E*) + *K*(*M*_{0}) + *d*) + *O*(1), which gives us a small upper bound to the random convergence time δ.

In other words, if δ has high descriptive complexity, then there does not exist a reasonable algorithm that finds it, even if we have a complete description of the system and its environment. It follows that the descriptive complexity of a computable convergent state cannot be much greater than the descriptive complexity of the system itself.

What a *reasonably large d* is has been handled so far with ambiguity, as it represents the descriptive complexity of any computable method *D*_{ϵ} that we may use to find convergence times, which intuitively cannot be arbitrarily large. It is easy to “cheat” on the inequality (2) by including in the description of the program *D*_{ϵ} the full description of the convergence time δ, which is why we ask for *reasonable* descriptions.

Another question left to be answered is whether complex convergence times do exist for a given limit *d*, considering that the limits imposed by the inequality (2) loosen up in direct relation to the descriptive complexity of *S*, *E*, and *M*_{0}.

The next result answers both questions by proving the existence of complex convergence times for a broad characterization of the size of *d*:

*Let F be a total computable function. For any ϵ there exists a system S(M_{0}, E, t) such that the convergence times are F(S, M_{0}, E)-random.*

*Proof.*

Let *E* and *s* be two natural numbers such that *K*(*E*|*s*) > ϵ. By reduction to the halting problem [34], it is easy to see the existence of *F*(*S*, *M*_{0}, *E*)-random convergence times: Let *T* be a Turing machine, and *S*_{t} the Turing machine that emulates *T* for *t* steps with input *M*_{0} and returns *E* for every time equal to or greater than the halting time, and *s* otherwise. Let us consider the system *S*(*M*_{0}, *E*, *t*) = *S*_{t}(〈*T*〉〈*M*_{0}〉〈*t*〉〈*E*〉).

If the convergence times are not *F*(*S*, *M*_{0}, *E*)-random, then there exists a constant *c* such that we can decide HP by running *S* for each *t* that meets the inequality |*t*| + 2 log|*t*| + *c* ≤ |*S*| + |〈*T*〉〈*M*_{0}〉〈*t*〉〈*E*〉| + *F*(*S*, *M*_{0}, *E*),^{1} which cannot be done, since HP is undecidable.

Let us focus on what the previous lemma is saying: *F* can be any computable function. It can be a polynomial or exponential function with respect to the length of a given description for *M*_{0} and*E*. It can also be any computable theory that we might propose for setting an upper limit to the size of an algorithm that finds convergence times given descriptions of the system's behavior, environment, and initial state. In other words, for a class of dynamical systems, finding convergence times, and therefore convergent states, is not decidable, even with complete information about the system and its initial state. Finally, by the proof of the lemma, adapted states can be seen as a generalization of halting states.
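The construction in the proof can be rendered as a toy, with the undecidable machine *T* replaced by a terminating stand-in (a Collatz iteration from a fixed seed), so that the "halting time" is merely unknown in advance rather than undecidable; every concrete choice below is illustrative.

```python
# Sketch of the proof's construction: S emulates a machine T for t
# steps and outputs the environment E from the halting time onward,
# and a filler state s before it. Deciding convergence of S for an
# arbitrary T would decide the halting problem.

E_STATE = "E"   # the environment the system converges to
S_STATE = "s"   # non-adapted filler state

def S(M0: int, t: int) -> str:
    """Emulate a toy machine T (Collatz iteration, halting at 1) for t
    steps; output E iff T has already halted."""
    x = M0
    for _ in range(t):
        if x == 1:
            break
        x = x // 2 if x % 2 == 0 else 3 * x + 1
    return E_STATE if x == 1 else S_STATE

# 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1: convergence time 8
print([S(6, t) for t in range(10)])  # eight 's' entries, then 'E', 'E'
```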

### 2.3 Randomness of Convergence in Dynamic Environments

So far we have limited the discussion to fixed environments. However, as observed in the physical world, the environment itself can change over time. We call such environments *dynamic environments*. In this subsection we extend the previous results to cover environments that change depending on time as well as on the initial state of the system. We also propose a weaker convergence condition, called *weak convergence*, and propose a necessary (but not sufficient) condition for the computability of convergence times, called *descriptive differentiability*.

We can think of an environment *E* as a dynamic computable system, a moving target that also changes with time and depends on the initial state *M*_{0}. In order for the system to be convergent, we propose the same criterion—there must exist δ such that *n* ≥ δ implies *K*(*E*(*M*_{0}, *n*)|*S*(*M*_{0}, *E*(*M*_{0}, *n*), *n*)) ≤ ϵ. A system with a dynamic environment also meets the inequality (2) and Lemmas 3 and 4, since we can describe a machine that runs both *S* and *E* for the same time *t*. Given that *E* is a moving target, it is convenient to consider an *adaptation period* for the new states of *E*:

*S converges weakly* to *E* if there exist an infinity of times δ_{i} such that *K*(*E*(*M*_{0}, δ_{i})|*S*(*M*_{0}, *E*(*M*_{0}, δ_{i}), δ_{i})) ≤ ϵ.

As a direct consequence of the inequality (2) and Lemma 4 we have the following lemma:

*Let S(M_{0}, E(M_{0}, t), t) be a weakly converging system. Any decision algorithm D_{ϵ}(S, M_{0}, E, δ_{i}) can only decide the first nonrandom time.*

As noted above, these results do not change when dynamic environments are considered. In fact, we can think of static environments as a special case of dynamic environments. However, with different targets of adaptability and convergence, it makes sense to generalize beyond the first convergence time. Also, it should be noted that specifying a convergence index adds additional information that a decision algorithm can potentially use.
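A moving-target environment and a lagging system can be sketched as follows; equality of outputs stands in for the (uncomputable) condition *K*(*E*|·) ≤ ϵ, and every concrete function below is an illustrative assumption, not a construction from the paper.

```python
# Toy moving-target setup: the environment drifts every three steps and
# the system copies the environment with a one-step lag, so it matches
# E at infinitely many times but not at all of them, in the spirit of
# weak convergence.

def E(M0: int, t: int) -> int:
    """Dynamic environment: drifts every 3 steps, seeded by M0."""
    return (M0 + t // 3) % 2

def S(M0: int, env, t: int) -> int:
    """System state: tracks the environment with a one-step lag."""
    return env(M0, t - 1) if t > 0 else M0

# "Adapted" times: the lagging copy agrees with the current target.
adapted_times = [t for t in range(1, 20) if S(0, E, t) == E(0, t)]
print(adapted_times)  # every t in 1..19 that is not a multiple of 3
```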

*Let S(M_{0}, E(M_{0}, t), t) be a weakly converging system with an infinity of random times such that k > j implies that K(δ_{k}) = K(δ_{j}) + ΔK_{δ}(j, k), where ΔK_{δ} is a (not necessarily computable) function with a range confined to the positive integers. If the function ΔK_{δ}(i, i + m) is unbounded with respect to i, then any decision algorithm D_{ϵ}(S, M_{0}, E, i), where i is the index of the i-th convergence time, can only decide a finite number of i's.*

*Proof.*

Suppose that *D*_{ϵ}(*S*, *M*_{0}, *E*, *i*) can decide an infinite number of instances. Let us consider two times δ_{i} and δ_{i+m}. Note that we can describe a program that, by using *D*_{ϵ}, *S*, *E*, *M*_{0}, and *i* together with the distance *m*, finds δ_{i+m}. The next inequality follows: *K*(δ_{i+m}) ≤ *K*(*D*_{ϵ}) + *K*(*i*) + *K*(*m*) + *O*(1). Next, note that we can describe another program that, given δ_{i} and using *D*_{ϵ}, *S*, *E*, and *M*_{0}, finds *i*, from which *K*(*i*) ≤ *K*(*D*_{ϵ}) + *K*(δ_{i}) + *O*(1) and −*K*(δ_{i}) ≤ *K*(*D*_{ϵ}) − *K*(*i*) + *O*(1). Therefore Δ*K*_{δ}(*i*, *i* + *m*) = *K*(δ_{i+m}) − *K*(δ_{i}) ≤ 2*K*(*D*_{ϵ}) + *K*(*m*) + *O*(1), and Δ*K*_{δ}(*i*, *i* + *m*) is bounded with respect to *i*.

We will say that a sequence of times δ_{1}, …, δ_{i}, … is *non-descriptively differentiable* if Δ*K*_{δ} is not a total function, which, as a consequence of the previous lemma, implies noncomputability of the sequence.

## 3 Beyond Halting States: Open-Ended Evolution

The inequality (2) states that being able to predict or recognize adaptation imposes a limit on the descriptive complexity of the first adapted state. A particular case is the halting state, as shown in the proof of Lemma 4. In this section we extend the lemma to continuously evolving systems, showing that computability of adapted times limits the complexity of adapted states beyond the first, imposing a limit on open-ended evolution for three complexity measures: sophistication, coarse sophistication, and busy beaver logical depth.

For a system in constant evolution converging to a dynamic environment, Lemma 6 imposes a limit on the growth of the descriptive complexity of a system with computable adapted states: *If the growth of the descriptive complexity of a sequence of convergent times is unbounded in the sense of Lemma 6, then all but a finite number of times are undecidable.* The converse would be convenient; however, it is not always true. Moreover, the next series of results shows that imposing such a limit would impede strong OEE:

*Let S be a non-cyclical computable system with initial state M_{0}, E a dynamic environment, and δ_{1}, …, δ_{i}, … a sequence of times such that for each δ_{i} there exists a total function p_{i} such that p_{i}(M_{δ_{i}}) = E(δ_{i}). If the function p : i ↦ p_{i} is computable, then the function δ : i ↦ δ_{i} is computable.*

*Proof.*

Assume that *p* is computable. We can describe a program *D*_{ϵ} that, given *S*, *M*_{0}, δ_{i}, and *E*, runs *p*_{i}(*M*_{t}) and *E*(*t*) for each time *t*, returning 1 if the δ_{i}th *t* is such that *p*_{i}(*M*_{t}) = *E*(*t*), and 0 otherwise. Therefore the sequence of δ_{i}'s is computable.

The last result can be applied naturally to weakly convergent systems (Definition 5): The way each adapted state approaches *E* is unpredictable; in other words, its *behavior* changes unpredictably over different stages. Formally:

*Let S(M_{0}, E, t) be a weakly converging system, with adapted states M_{δ_{1}}, …, M_{δ_{i}}, … and p_{1}, …, p_{i}, … its respective adapted behaviors. If the mapping δ : i ↦ δ_{i} is not computable, then the function p : i ↦ p_{i} is also not computable.*

*Proof.*

It is a direct consequence of applying Lemma 7 to the definition of weakly converging systems.

While asking for totality might look like an arbitrary limitation at first glance, the reader should recall that in weakly convergent systems the program *p*_{i} represents an organism, theory, or other computable system that uses *M*_{δ_{i}}'s information to predict the behavior of *E*(δ_{i}), and if this prediction does not process its environment in a sensible time frame, then it is hard to argue that it represents an adapted system or a useful theory.

The intuition behind classifying descriptively differentiable adapted time sequences as *less complex* is better explained by borrowing ideas developed by Bennett and Koppel within the frameworks of logical depth [8] and sophistication [25], respectively. Their argument states that random strings are as simple as very regular strings, given that there is no complex underlying structure in their minimal descriptions. The intuition that random objects contain no useful information leads us to the same conclusion. And, given Theorem 2, the states must retain a high degree of randomness for random times.

*Sophistication* is a measure of useful information within a string. Proposed by Koppel, the underlying approach consists in dividing the description of a string *x* into two parts: the program that represents the *underlying structure* of the object, and the input, which is the random or *structureless* component of the object. This function is denoted by soph_{c}(*x*), where *c* is a natural number representing the significance level.

The *sophistication* of a natural number *x* at the significance level *c*, *c* ∈ ℕ, is defined as

soph_{c}(*x*) = min{|〈*p*〉| : *p* is a total program and there exists *y* such that *p*(*y*) = *x* and |〈*p*〉| + |*y*| ≤ *K*(*x*) + *c*}.

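Sophistication itself is uncomputable, but its two-part split can be emulated over a tiny finite "program" space. The program set and the bit costs below are illustrative assumptions, serving only to show how a regular string's near-minimal descriptions concentrate their length in the structured program part.

```python
from itertools import product

# Each toy program: (name, cost in bits, total function on bit strings)
PROGRAMS = [
    ("identity", 1, lambda y: y),              # structureless copy
    ("repeat10", 6, lambda y: "10" * len(y)),  # captures the pattern
]

def bitstrings(n):
    """All bit strings of length 0..n."""
    for k in range(n + 1):
        for bits in product("01", repeat=k):
            yield "".join(bits)

def descriptions(x):
    """All (program cost, input length) pairs describing x."""
    for _, cost, prog in PROGRAMS:
        for y in bitstrings(len(x)):
            if prog(y) == x:
                yield cost, len(y)

def toy_K(x):
    """Length of a shortest two-part description of x."""
    return min(cost + n for cost, n in descriptions(x))

def toy_soph(x, c):
    """Smallest program part among descriptions within c of optimal."""
    k = toy_K(x)
    return min(cost for cost, n in descriptions(x) if cost + n <= k + c)

x = "10" * 8
print(toy_K(x), toy_soph(x, 0))  # 14 6: the structure part costs 6 bits
```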
Now, the images of a mapping δ : *i* ↦ δ_{i} already have the form δ(*i*), where δ and *i* represent the structure and the random component, respectively. Random strings should bind this structure strongly up to a logarithmic error, which is proven in the next lemma.

*Let δ_{1}, …, δ_{i}, … be a sequence of different natural numbers, and r a natural number. If the function δ : i ↦ δ_{i} is computable, then there exists an infinite subsequence where the sophistication is bounded up to a logarithm of a logarithmic term of their indexes.*

*Proof.*

Let δ be a computable function. Since δ is computable and the sequence is composed of different naturals, its inverse function δ^{−1} can be computed by a program *m* that, given a description of δ and δ_{i}, finds the first *i* that produces δ_{i} and returns it; therefore *K*(*i*) ≤ *K*(δ_{i}) + |〈*m*〉| + |〈δ〉| and *K*(δ) + *K*(*i*) ≤ *K*(δ_{i}) + |〈*m*〉| + 2|〈δ〉|. Now, if *i* is an *r*-random natural number, so that *K*(*i*) ≥ |*i*| − *r* holds tightly, we have that (*K*(δ) + *O*(log |*i*|)) + |*i*| − *r* ≤ *K*(δ_{i}) + |〈*m*〉| + 2|〈δ〉|, which implies, since δ is a total function, that soph_{|〈m〉|+2|〈δ〉|+r}(δ_{i}) ≤ *K*(δ) + *O*(log log *i*). Therefore the sophistication is bounded up to a logarithm of a logarithmic term, at a constant significance level, along an infinite subsequence, since there are infinitely many *r*-random naturals.

Small changes in the significance level of sophistication can have a large impact on the sophistication of a given string. Another possible problem is that the constant |〈*m*〉| + 2|〈δ〉| + *r* implied in Lemma 9 could appear to be large at first (though it becomes comparatively smaller as *i* grows). A *robust* variation of sophistication, called coarse sophistication [4], incorporates the significance level as a penalty. The definition presented here differs slightly from the one in [4] in order to maintain congruence with the chosen prefix-free universal machine and to avoid negative values. This measure is denoted by csoph(*x*).

The *coarse sophistication* of a natural number *x* is defined as csoph(*x*) = min{2|〈*p*〉| + |〈*y*〉| − *K*(*x*) : *p* is a total program and *U*(〈*p*〉〈*y*〉) = *x*}, where 〈*y*〉 is a computable unambiguous codification of *y*.
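Only one consequence of this definition is used below: coarse sophistication is dominated by sophistication at every significance level, since any two-part description witnessing soph_{c}(*x*) also witnesses csoph(*x*) with penalty *c*. Up to the additive constants absorbed by the codification:

```latex
\operatorname{csoph}(x) \;\le\; \min_{c \in \mathbb{N}} \bigl\{ \operatorname{soph}_c(x) + c \bigr\}.
```

This is the inequality invoked in the proof of the next lemma.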

With an argument similar to the one used to prove Lemma 9, it is easy to show that coarse sophistication is similarly bounded up to a logarithm of a logarithmic term.

**Lemma 10.** *Let δ_{1}, …, δ_{i}, … be a sequence of different natural numbers, and r a natural number. If the function δ : i ↦ δ_{i} is computable, then there exists an infinite subsequence where the coarse sophistication is bounded up to a logarithm of a logarithmic term.*

*Proof.*

If *i* is *r*-random, then by the definition of csoph and the inequalities presented in the proof of Lemma 9, we have that csoph(δ_{i}) ≤ soph_{|〈m〉|+2|〈δ〉|+r}(δ_{i}) + |〈*m*〉| + 2|〈δ〉| + *r* ≤ *K*(δ) + *O*(log log *i*).

Another proposed measure of complexity is Bennett's logical depth [8], which measures the minimum computational time required to compute an object from a nearly minimal description. Logical depth works under the assumption that complex, or *deep*, natural numbers take a long time to compute from near-minimal descriptions. Conversely, random or incompressible strings are *shallow*, since their minimal descriptions must contain the full description verbatim. For the next result we will use a related measure called busy beaver logical depth, denoted by depth_{bb}(*x*).

The *busy beaver logical depth* of the description of a natural number *x*, denoted by depth_{bb}(*x*), is defined as depth_{bb}(*x*) = min{|*p*| − *K*(*x*) + *j* : *U*(*p*) = *x* and *T*(*p*) ≤ BB(*j*)}, where *T*(*p*) is the halting time of the program *p*, and BB(*j*), known as the busy beaver function, is the halting time of the slowest-halting program that can be described within *j* bits [18].
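The busy beaver function in this definition can be made concrete at a toy scale. The sketch below (our own illustration, not from the text) exhaustively simulates every 2-state, 2-symbol Turing machine from a blank tape and reports the maximum halting time, Radó's classical value S(2) = 6. Already a few more states make this exhaustive strategy collapse, which is the uncomputability the definition leans on.

```python
from itertools import product

def run(tm, limit=50):
    """Simulate a 2-state, 2-symbol TM from a blank tape.
    Return the halting step count, or None if it exceeds the step limit."""
    tape, pos, state, steps = {}, 0, 0, 0
    while steps < limit:
        write, move, nxt = tm[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += move
        steps += 1
        if nxt == "H":          # the halting transition counts as a step
            return steps
        state = nxt
    return None

# A transition chooses: symbol to write, head move, and next state (or halt).
choices = list(product([0, 1], [-1, 1], [0, 1, "H"]))

best = 0
for t in product(choices, repeat=4):    # one transition per (state, symbol) pair
    tm = {(0, 0): t[0], (0, 1): t[1], (1, 0): t[2], (1, 1): t[3]}
    s = run(tm)
    if s is not None:
        best = max(best, s)
print(best)  # prints: 6
```

The step limit of 50 is safe here because S(2) = 6 is far below it; for larger machine classes no computable limit suffices, which is precisely why BB is uncomputable.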

The next result follows from a theorem formulated by Antunes and Fortnow [4] and from Lemma 10.

**Corollary 11.** *Let δ_{1}, …, δ_{i}, … be a sequence of different natural numbers, and r a natural number. If the function δ : i ↦ δ_{i} is computable, then there exists an infinite subsequence where the busy beaver logical depth is bounded up to a logarithm of a logarithmic term of their indexes.*

*Proof.*

By Theorem 5.2 in [4], for any *i* we have |csoph(δ_{i}) − depth_{bb}(δ_{i})| ≤ *O*(log |δ_{i}|). By Lemma 10 and Lemma 2, the result follows.

Let us focus on the consequences of Lemmas 9 and 10 and Corollary 11. Given the relationship established between descriptive time complexity and the corresponding state of a system (Lemma 2), these last results imply that either the complexity of the adapted states of a system (under any of the three complexity measures) grows very slowly over an infinite subsequence of times (which become increasingly common, with probability approaching 1 in the limit [13]), or the subsequence of adapted times is undecidable.

**Theorem 12.** *If S(M_{0}, E(t), t) is a weakly converging system with adaptation times δ_{1}, …, δ_{i}, … that exhibits strong OEE with respect to* csoph *and* depth_{bb}*, then the mapping δ : i ↦ δ_{i} is not computable. Also, there exists a constant c such that the result applies to* soph_{c}*.*

*Proof.*

We can see the sequence of adapted states as a function *M*_{δ} : *i* ↦ *M*_{δ_{i}}. By Lemmas 9 and 10 and Corollary 11, for the three stated measures of complexity there exists an infinite subsequence where the respective complexity is upper-bounded by *O*(log log *i*). It follows that if the complexity grows faster than *O*(log log *i*) for an infinite subsequence, then there must exist an infinity of indexes *j* in the bounded subsequence where γ(*j*) grows faster than *C*(*M*_{j}); therefore there exists an infinity of indexes *j* where *C*(*M*_{j}) − γ(*j*) is upper bounded. Finally, note that if a computable mapping δ : *i* ↦ δ_{i} allowed growth of order *O*(log log *i*), then the computable function δ′ : *i* ↦ δ_{2^{2^{i}}} would grow faster than the stated bound.

Now, in the absence of absolute solutions to the problem of finding adapted states in the presence of strong OEE, one might cast about for a partial solution or approximation that decides most (or at least some) of the adapted states. The following corollary shows that the problem is not even semicomputable: *Any algorithm one might propose can only decide a bounded number of adapted states*.

*If S(M_{0}, E, t) is a weakly converging system with adaptation times δ_{1}, …, δ_{i}, … whose adapted states show strong OEE, then the mapping δ : i ↦ δ_{i} is not even semicomputable.*

*Proof.*

Note that, for any subsequence of adaptation times δ_{j_{1}}, …, δ_{j_{k}}, …, the system must show strong OEE. Therefore, by Theorem 12, no subsequence can be computable. It follows that there cannot exist an algorithm that produces an infinity of elements of the sequence, since such an algorithm would yield a computable subsequence of adaptation times.

In short, Theorem 12 imposes undecidability on strong OEE, and, according to Theorem 8, the behavior and interpretation of the system evolve in an unpredictable way, establishing one path to emergence: *a set of rules for future states that cannot be reduced to an initial set of rules*. Recall that, for a given weakly converging dynamical system, the sequence of programs *p*_{i} represents the behavior or interpretation of each adapted state *M*_{i}. If a system exhibits strong OEE with respect to the complexity measure soph_{c}, csoph, or depth_{bb}, then by Theorems 8 and 12 the sequence of behaviors is uncomputable, and therefore irreducible to any function of the form *p* : *i* ↦ *p*_{i}, even when possessing complete descriptions of the behavior of the system, its environment, and its initial state. In other words, *the behavior of iterative adapted states cannot be obtained from the initial set of rules*. Furthermore, we conjecture that the results hold for all adequate measures of complexity:

**Conjecture 13.** *Computability bounds the rate of complexity growth to the order of the slowest-growing infinite subsequence with respect to any* adequate *complexity measure C.*

One way to understand Conjecture 13 is that *the information of the future states of a system is either contained in the initial state, in which case their complexity is bounded by that of the initial state, or is undecidable*. This would follow because, for any computable dynamical system, the randomness induced by time cannot be avoided.

### 3.1 A System Exhibiting OEE

With the aim of providing mathematical evidence for the adequacy of Darwinian evolution, Chaitin developed a mathematical model that converges to its environment significantly faster than exhaustive search, coming fairly close to an *intelligent* solution to a mathematical problem that requires *maximal creativity* [16, 15].

One of the solutions Chaitin proposes is to find digital organisms that approximate the busy beaver function BB(*n*) = max{*T*(*U*(*p*)) : |*p*| ≤ *n*}, which is equivalent (up to a constant) to asking for the largest natural number that can be named within *n* bits, and to asking for the first *n* bits of Chaitin's constant, defined as Ω_{U} = ∑_{p∈HP} 2^{−|p|}, where HP is the set of all programs that halt on the universal machine *U*. We will omit the subindex from Ω in the rest of this text.

Chaitin's evolutionary system searches nondeterministically through the space of Turing machines using a reference universal machine *U*′ with the property that all strings are valid programs. This random walk starts with the empty string *M*_{0} = “ ”, and each new state is defined as the output of a Turing machine, called a *mutation*, with the previous state as an input. These mutations are chosen stochastically according to the universal distribution [24]. If a mutation helps to more accurately approximate the digits of Ω, then its output becomes the new state *M*_{t+1}; otherwise we keep searching for new organisms. Chaitin demonstrates that the system approaches Ω *efficiently* (with quadratic overhead), arguing that this is evidence of the adequacy of Darwinian evolution [14].
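Chaitin's walk cannot be run as stated, since scoring organisms against Ω is uncomputable. The toy sketch below is entirely our own stand-in: the names, the mutation model, and the computable fitness "name a larger integer" are all illustrative substitutions. It preserves only the accept-if-improved random walk and a crude bias toward simple mutations in place of the universal distribution.

```python
import random

def mutate(bits, rng):
    """Apply a geometrically distributed number of single-bit edits, so that
    simple mutations dominate (a crude stand-in for the universal distribution)."""
    b = list(bits)
    n = 1
    while rng.random() < 0.5:   # P(n edits) = 2^-n
        n += 1
    for _ in range(n):
        i = rng.randrange(len(b) + 1)
        if i == len(b):
            b.append("1")       # occasionally grow the organism
        else:
            b[i] = "0" if b[i] == "1" else "1"
    return "".join(b)

def evolve(steps=2000, seed=0):
    """Accept-if-improved random walk: keep a mutant only if it names a
    strictly larger integer than the current organism."""
    rng = random.Random(seed)
    state, fitness = "0", 0
    history = [fitness]
    for _ in range(steps):
        cand = mutate(state, rng)
        f = int(cand, 2)
        if f > fitness:
            state, fitness = cand, f
            history.append(fitness)
    return history

history = evolve()
```

In Chaitin's actual system the fitness is the number of correctly approximated bits of Ω and mutations are weighted by the universal distribution; the toy keeps only the monotone-acceptance structure of the walk.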

Making use of the universal distribution [21], a deterministic version of Chaitin's system can be stated in terms of two quantities: *H*(*M*_{t−1}, *p*), the *distance* between the programs *M*_{t−1} and *p*, which quantifies the number of mutations needed to transform one string into the other; and *w*, a positive integer acting as an accumulator that resets to 1 whenever *M*_{t} increases in value, adding 1 otherwise.

Defining a computable environment or adaptation condition for this system is difficult, since the system seeks to approach an uncomputable function (BB) and the evolution rule itself is not computable, given the halting problem. The most direct way to define it is *E*(*t*) =
BB(*t*) or, equivalently, as the first *t* bits of Chaitin's constant Ω. Another way to define the environment is by an encoding of the property *larger than U*(*M*_{t−1}) for each time *t*. Given that we can compute *M*_{t−1} and its relationship with *M*_{t} given a description of the latter and a constant amount of information (ϵ), we find adaptation at the times *t* where the busy beaver function grows.

It is easy to see that the sequence of programs *i* ↦ *M*_{i} is precisely what generates the busy beaver sequence η_{i} = BB(*i*). Given that BB(*t*) is not a computable function, the evolution of the system, along with the respective adaptation times, is not computable. Furthermore, this sequence is composed of programs that compute, in order, the elements of a sequence that exhibits strong OEE with respect to depth_{bb}: Let η_{i} = BB(*i*) be the sequence of all busy beaver values; by definition, if *i* is the first value for which BB(*i*) is obtained, then depth_{bb}(BB(*i*)) = |*p*_{i}| − *K*(BB(*i*)) + *i*, where *p*_{i} is the minimal program, that is, *U*(*p*_{i}) = BB(*i*) with |*p*_{i}| = *K*(BB(*i*)). It follows that depth_{bb}(BB(*i*)) = *i*; otherwise *p*_{i} would not be the minimal program.

Computing the system described requires a solution for the halting problem, and the system itself might also seem unnatural at first glance. However, we can think of the biosphere as a huge parallel computer that is constantly approximating solutions to the adaptation problem by means of survivability, and just as Ω has been approximated [12], we claim that *just as we generally cannot know whether a Turing machine will halt until it does, we may not know if an organism will keep adapting and survive in the future, but we can know when it failed to do so (extinction)*.

One may consider biological evolution to be a very rough, but fundamental, analogue to Chaitin's system in the pursuit of the busy beaver values (longest-running TMs), equivalent to the search for the longest-surviving organisms. Within this framework, the periods of increase in complexity are a natural consequence of approximating BB. In this view, our assertion that adaptation is a generalization of the halting problem has a natural interpretation.

## 4 On Logical Depth

Although we conjecture that Theorem 12 must also hold for logical depth as defined by Bennett [8], extending the results to this measure is still a work in progress. Encompassing logical depth will require a deeper understanding of the internal structure of the relationship between system and computing time, beyond the time complexity stability result (Lemma 2), and might be related to open fundamental problems in computer science and mathematics. For instance, finding a *low* upper bound on the growth of logical depth for all computable series of natural numbers would suggest a negative answer to the question of the existence of an efficient way of generating deep strings, which Bennett relates to the *P* ≠ PSPACE problem.

The *logical depth* of a natural number *x* at the level of significance *c* is defined as depth_{c}(*x*) = min{*T*(*p*) : *U*(*p*) = *x* and |*p*| ≤ *K*(*x*) + *c*}, where *T*(*p*) is the halting time of the program *p*.

An algorithm χ(*n*, *T*) produces strings of length *n* with depth *T* for a significance level *n* − *K*(*T*) − *O*(log *n*), where *K*(*T*) must be smaller than *n*, and *n* must not be as large as (or larger than) *T*, in order to avoid shallow strings. One possible difficulty with this algorithm is that the significance level is not computable, and we can expect it to vary greatly with respect to *K*(*T*): For large *T* with small *K*(*T*) (such as a power tower *T*^{*T*^{*T*}}), the significance level is nearly *n*, which suggests that, for a *steady* significance level with respect to times *T* with large *K*(*T*), the growth in complexity might not be stable. This problem, along with an algorithm that consistently enumerates pairs of *n* and *T* such that *K*(*T*) < *n* ≪ *T* for growing *T*'s, will be explored in future work, and its solution would require a formal definition of *adequate* complexity measures. The fact that χ presents a challenge to Conjecture 13 would suggest an important difference from the three complexity measures used in this article.

## 5 Conclusions

We have presented a formal and general mathematical model for adaptation within the framework of computable dynamical systems. This model exhibits universal properties for all computable dynamical systems, of which Turing machines are a subset. Among other results, we have given formal definitions of open-ended evolution (OEE) and strong open-ended evolution and supported the latter on the basis that it allows us to differentiate between trivial and nontrivial systems.

We have also shown that decidability imposes universal limits on the growth of complexity in computable systems, as measured by sophistication, coarse sophistication, and busy beaver logical depth. We have shown that, as time dominates the descriptive algorithmic complexity of the states, the complexity of the evolution of a system tightly follows that of the natural numbers, implying the existence of nontrivial states but the nonexistence of an algorithm for finding these states or any subsequence of them; the computations needed to harness or identify such states are therefore undecidable.

Furthermore, as a direct implication of Theorems 8 and 12, the undecidability of adapted states and the unpredictability of the behavior of the system at each state are requirements for a system to exhibit strong open-ended evolution with respect to sophistication, coarse sophistication, and busy beaver logical depth. This provides rigorous proof that undecidability and irreducibility of future behavior are requirements for the growth of complexity in the class of computable dynamical systems. We conjecture that these results can be extended to any adequate complexity measure that assigns low complexity to random objects.

Finally, we have provided an example of a (noncomputable) evolutionary system that exhibits strong OEE, and supplied arguments for its adequacy as a model of evolution, which we claim supports our characterization of strong OEE. Also, we assert that adaptation can be seen as analogous to the problem of finding the values of the busy beaver function, producing complexity as a byproduct.

One possible reading of the undecidability results presented in this article is that the pursuit of strong OEE is a scientific dead end, as the sequence of adapted states is not even semicomputable. However, an algorithmic theory of evolution, as briefly introduced in Section 3.1, already exhibits a strong OEE system and opens avenues for further research in artificial and biological evolution within an algorithmic probability framework. Some results in this direction can already be found in [23].

## Note

For any string *s* there exists a self-delimited program (by a “print”) that takes a prefix-free input of the form 1^{log|s|}0|*s*|*s*.
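A minimal sketch of this coding (the function names are ours): the run of 1s announces how many bits the length field occupies, which makes the whole input self-delimiting and therefore usable by a prefix-free machine.

```python
def prefix_encode(s):
    """Encode bitstring s as 1^{|L|} 0 L s, where L is |s| written in binary."""
    L = bin(len(s))[2:]
    return "1" * len(L) + "0" + L + s

def prefix_decode(stream):
    """Read one encoded string off the front of stream; return (s, rest)."""
    k = 0
    while stream[k] == "1":                  # count the unary header
        k += 1
    L = int(stream[k + 1 : k + 1 + k], 2)    # k bits of length field after the 0
    start = k + 1 + k
    return stream[start : start + L], stream[start + L :]

print(prefix_decode(prefix_encode("101") + prefix_encode("0")))
# prints: ('101', '1010')
```

Because each encoding announces its own end, concatenated encodings can be peeled off one at a time, which is exactly the property a prefix-free universal machine requires of its inputs.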

## References
