The Limits of Decidable States on Open-Ended Evolution and Emergence

Is undecidability a requirement for open-ended evolution (OEE)? Using algorithmic complexity theory methods, we propose robust computational definitions for open-ended evolution and adaptability of computable dynamical systems. Within this framework, we show that decidability imposes absolute limits to the growth of complexity on computable dynamical systems up to a logarithm of a logarithmic term. Conversely, systems that exhibit open-ended evolution must be undecidable, establishing undecidability as a requirement for such systems. Complexity is assessed in terms of three measures: sophistication, coarse sophistication and busy beaver logical depth. These three complexity measures assign low complexity values to random (incompressible) objects. We conjecture that, for similar complexity measures that assign low complexity values, decidability imposes comparable limits to the stable growth of complexity and such behaviour is necessary for non-trivial evolutionary systems. Finally, we show that undecidability of adapted states imposes novel and unpredictable behaviour on the individuals or population being modelled. Such behaviour is irreducible.


Introduction and Preliminaries
Broadly speaking, a dynamical system is one that changes over time. Predicting the future behaviour of a dynamical system is a central issue for science in general: scientific theories are tested on the accuracy of their predictions, and establishing invariant properties through the evolution of a system is an important goal. Limits to this predictability are known in science. For instance, chaos theory establishes the existence of systems in which small deficits in the information about the initial states make accurate predictions of future states unattainable. However, in this document we focus on systems for which we have unambiguous, finite (in size and time) and complete descriptions of their initial states and their behaviour: computable dynamical systems.
Since its formalization by Church and Turing, the class of computable systems has shown that, even without information deficits (i.e., with complete descriptions), there are future states that cannot be predicted, in particular the state known as the halting state (Turing, 1936). We will use this result and others from algorithmic information theory to show how predictability imposes limits on the growth of complexity during the evolution of computable systems. In particular, we show that random (incompressible) times tightly bound the complexity of the associated states.
The relationship between dynamical systems and computability has been studied before by Bournez (Bournez et al., 2013b,a), Blondel (Blondel et al., 2000), Moore (Moore, 1991) and by Fredkin, Margolus and Toffoli (Fredkin and Toffoli, 1982; Margolus, 1984), among others. Emergence as a consequence of incomputability has been proposed by Cooper (Cooper, 2009). Complexity as a source of undecidability has been observed in logic by Calude and Jürgensen (Calude and Jürgensen, 2005). Delvenne, Kurka and Blondel (Delvenne et al., 2006) have proposed robust definitions for computable (effective) dynamical systems and universality, generalizing Turing's halting states, along with conditions and implications for universality, decidability and their relationship with chaos. The definitions and general approach used in this paper differ from those sources, but are ultimately related.

Computable Functions
In a broad sense, an object x is computable if it can be described by a Turing machine (Turing, 1936); for example, if there exists a Turing machine that produces x as an output. It is clear that any finite string over a finite alphabet is a computable object. Following Turing's tradition, we provide below a more formal definition.
A string p is a valid program for the Turing machine T if during the execution of T with p as input all the characters in p are read. We call T(p) the output of the machine, if it stops. A Turing machine is prefix-free if no valid program can be a proper prefix of another valid program (but it can be a suffix of one). We call a valid program a self-delimited object. Note that, given the relationship between natural numbers and binary strings, the set of all valid programs is an infinite proper subset of the natural numbers.
Formally, a function f : N → N is computable if there exists a Turing machine T such that f(x) = T(x). A Turing machine U is called universal if there exists a computable function g such that for every Turing machine T, computing a function f, there exists a string ⟨T⟩ ∈ B* such that f(x) = U(⟨T⟩g(x)), where ⟨T⟩g(x) is the concatenation of the strings ⟨T⟩ and g(x). In this case, ⟨T⟩ and g(x) are called codifications or representations of the function f and the natural number x, respectively. From now on we will denote by ⟨f⟩ and ⟨x⟩ the codifications of f and x. The codification g(x) is unambiguous if g is injective.
For functions with more than one variable, if x is a pair $x = (x_1, x_2)$, we say that the codification g(x) is unambiguous if it is injective and the inverse functions $g_1^{-1} : g(x) \mapsto x_1$ and $g_2^{-1} : g(x) \mapsto x_2$ are computable. If x is a tuple $(x_1, ..., x_i, ..., x_n)$, then the codification g(x) is unambiguous if the function $(x, i) \mapsto x_i$ is computable.
A sequence of strings $\delta_1, \delta_2, ..., \delta_i, ...$ is computable if the function $\delta : i \mapsto \delta_i$ is computable. A real number is computable if its decimal expansion is a computable sequence. Complex numbers and points of higher-dimensional spaces are computable if each of their coordinates is computable.
Finally, for each of the described objects, we call the representation of the associated Turing machine the representation of the object for the reference Turing machine U, and we define computability of further objects by considering their representations. For example, a function f : R → R is computable if the mapping $x_i \mapsto f(x_i)$ is computable, and we will denote by ⟨f⟩ the representation of the associated Turing machine, calling it the codification of f itself.

Algorithmic Descriptive Complexity
Given a prefix-free universal Turing machine U with alphabet Σ, the algorithmic descriptive complexity (also known as Kolmogorov complexity and Kolmogorov-Chaitin complexity (Kolmogorov, 1965; Chaitin, 1982)) of a string s ∈ Σ* is defined as
$$K_U(s) = \min\{|p| : U(p) = s\},$$
where U is a universal prefix-free Turing machine and |p| is the number of characters of p.
The algorithmic descriptive complexity measures the minimum amount of information needed to fully describe a computable object within the framework of a universal Turing machine U. If U(p) = s then the program p is called a description of s; the first of the smallest descriptions (in alphabetical order) is denoted by $s^*$, and ⟨s⟩ denotes a not necessarily minimal description that is computable over the class of objects. If M is a Turing machine, a program p is a description or codification of M for U if for every string s we have M(s) = U(p⟨s⟩). In the case of numbers, functions, sequences and other computable objects we consider the descriptive complexity of their smallest descriptions. For example, for a computable function f : R → R, K(f) is defined as $K(f^*)$, where $f^* \in B^*$ is the first of the minimal descriptions of f. Of particular importance for this document is the conditional descriptive complexity, which is defined as
$$K(s|r) = \min\{|p| : U(pr) = s\},$$
where pr is the concatenation of p and r. This measure can be interpreted as the smallest amount of information needed to describe s given a full description of r. We can think of p as a program with input r.
One of the most important properties of the descriptive complexity measure is its stability: the difference between the descriptive complexities of an object with respect to two universal Turing machines is at most a constant. Therefore the reference machine U is usually omitted in favor of the universal measure K. From now on we will omit the subscript from the measure.
Randomness
Given a natural number r, a string x is known as r-random or incompressible if K(x) ≥ |x| − r. This definition states that a string is random if it does not have a complete description significantly shorter than the string itself. A simple counting argument shows the existence of random strings. Now, it is easy to verify that every string s has a self-delimited, computable, unambiguous codification by a string of the form $1^{\log |s|}\,0\,|s|\,s$ (log |s| 1's followed by a 0, then the binary string corresponding to |s| concatenated with the string s itself (Li and Vitányi, 1997, section 1.4)). Therefore, there exists a natural r such that if x is r-random then K(x) = |x| − r + O(log |x|), where O(log |x|) is a positive term. We will say that such strings hold the randomness inequality tightly.
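The self-delimited codification mentioned above can be sketched in a few lines. The following is a minimal illustration, assuming binary strings are represented as Python `str` objects over the characters 0 and 1; the function names `encode` and `decode` are ours and do not come from the cited source. The unary header has as many 1's as the binary expansion of |s| has digits, which plays the role of the log |s| term.

```python
def encode(s: str) -> str:
    """Self-delimiting code: 1^{len(b)} 0 b s, where b is |s| written in binary."""
    if any(ch not in "01" for ch in s):
        raise ValueError("expected a binary string")
    length_bits = format(len(s), "b")            # binary expansion of |s|
    return "1" * len(length_bits) + "0" + length_bits + s


def decode(stream: str) -> tuple[str, str]:
    """Read one codeword from the front of `stream`; return (s, remaining stream)."""
    n_ones = stream.index("0")                    # unary header gives the width of the length field
    start = n_ones + 1                            # first character of the length field
    length = int(stream[start:start + n_ones], 2)
    end = start + n_ones + length
    return stream[start + n_ones:end], stream[end:]


# Two codewords written back to back can be separated without any extra marker.
first, rest = decode(encode("10110") + encode("0"))
second, _ = decode(rest)
```

The overhead of a codeword over |s| is twice the number of digits of |s| plus one, which is the logarithmic term appearing in the randomness inequality.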
Let M be a halting Turing machine with description ⟨M⟩ for the reference machine U. A simple argument shows that the halting time of M cannot be a large random number: let $U_H$ be a Turing machine that emulates U while counting the number of steps, returning the execution time upon halting; if r is a large random number then M cannot stop in time r, otherwise the program $U_H\langle M\rangle$ would give us a short description of r. This argument is summarized by the following inequality:
$$K(T(M)) \le K(M) + O(1), \tag{1}$$
where T(M) is the number of steps that the machine M took to reach the halting state, that is, the execution time of the machine M.
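The step-counting machine used in this argument can be illustrated with a small sketch. We assume, purely for illustration, that a machine is modelled as a Python generator function in which every `yield` counts as one computation step; `countdown` is a hypothetical toy machine and `halting_time` plays the role of the counting emulator.

```python
from typing import Callable, Generator

# Assumption: a "machine" is a generator function; each yield is one step.
Machine = Callable[[int], Generator[None, None, int]]


def countdown(n: int) -> Generator[None, None, int]:
    """Hypothetical toy machine: halts after exactly n steps and outputs 0."""
    while n > 0:
        n -= 1
        yield
    return 0


def halting_time(machine: Machine, inp: int) -> int:
    """Emulate `machine` on `inp`, count its steps and return the count on halting.

    Like the emulator in the text, this loops forever if the machine never halts.
    """
    steps = 0
    run = machine(inp)
    while True:
        try:
            next(run)
        except StopIteration:
            return steps
        steps += 1
```

Since `halting_time` is one fixed program, the halting time of any halting machine is described by that program together with a description of the machine and its input, which is the content of the inequality.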

Computable Dynamical Systems
Formally, a dynamical system is a rule of evolution in time within a state space, which is defined as the set of all possible states of the system (Meiss, 2007). In this work we will focus on a functional model for dynamical systems with a constant initial state and variables representing the previous state and the time of the system. This model allows us to set halting states for each time on a discrete scale in order to study the impact of the descriptive complexity of time during the evolution of a discrete computable system.
A deterministic discrete space system is defined by an evolution function (or rule) of the form $M_{t+1} = S(M_0, t)$, where $M_0$ is called the initial state and t is a positive integer called the time variable of the system. The sequence of states $M_0, M_1, ..., M_t, ...$ is called the evolution of the system. Given a reference universal Turing machine U, if S is a computable function and $M_0$ is a computable object, we will say that S is a computable dynamical system. An important property of computable dynamical systems is the uniqueness of the successor state, which implies that equal states must evolve equally given the same evolution function; in other words:
$$M_t = M_{t'} \implies M_{t+1} = M_{t'+1}. \tag{2}$$
The converse is not necessarily true.
Now, a complete description of a computable system $S(M_0, t)$ should contain enough information to compute the state of the system at any time, and hence it must entail the codification of its evolution function S and a description of the initial state $M_0$, which is denoted by $\langle M_0 \rangle$. As a consequence, if we only describe the system at time t by a codification of $M_t$, then we do not have enough information to compute the successive states of the system. So we will define the complete description of a computable system at the time t as an unambiguous codification of the ordered pair composed of S and $M_t$, i.e. $\langle (S, M_t) \rangle$, with $\langle (S, M_0) \rangle$ representing the initial state of the system. It is important to note that, for any computable and unambiguous codification function g of the stated pair, we have
$$K(g(S, M_t)) \le K(S) + K(M_0) + K(t) + O(1), \tag{3}$$
as we can write a program that uses the descriptions of S, $M_0$ and t to find the parameters and then evaluate $S(M_0, t)$, finally producing $M_t$.
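As a concrete picture of the functional model, the sketch below builds an evolution function from a one-step update rule and lists the first states of the evolution. The update rule (an affine map modulo 17) and the helper names `make_system` and `evolution` are assumptions chosen only for illustration; they are not part of the formal development.

```python
from typing import Callable, List


def make_system(step: Callable[[int], int]) -> Callable[[int, int], int]:
    """Build S with S(M_0, t) = M_{t+1} by iterating a one-step rule t + 1 times."""
    def S(m0: int, t: int) -> int:
        state = m0
        for _ in range(t + 1):
            state = step(state)
        return state
    return S


def evolution(S: Callable[[int, int], int], m0: int, horizon: int) -> List[int]:
    """Return the states M_0, M_1, ..., M_horizon."""
    return [m0] + [S(m0, t) for t in range(horizon)]


# Hypothetical toy rule: an affine map on the integers modulo 17.
S = make_system(lambda m: (3 * m + 1) % 17)
states = evolution(S, 1, 4)
```

Note that computing any state only requires the rule, the initial state and the time, which is the observation behind the bound on the descriptive complexity of the complete description. In this toy system the state at time 16 equals the initial state, so by uniqueness of the successor state the evolution repeats from there.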

Open-Ended Evolution in Computable Dynamical Systems
Informally, open-ended evolution (OEE) has been characterized as "evolutionary dynamics in which new, surprising, and sometimes more complex organisms and interactions continue to appear" (Taylor, 2015). Defining and establishing the properties required for a system to exhibit OEE is considered an open question (Bedau et al., 2000; Soros and Stanley, 2014; Standish, 2003), and OEE has been proposed as a required property of evolutionary systems capable of producing life (Ruiz-Mirazo et al., 2002). This has been implicitly verified by various experiments in silico (Lindgren, 1992; Adami and Brown, 1994; Lehman and Stanley, 2008; Auerbach and Bongard, 2014).
A line of thought posits that open-ended evolutionary systems tend to produce families of objects of increasing complexity (Bedau, 1998; Auerbach and Bongard, 2014). Furthermore, for a number of complexity measures (for instance K(x)), it can be shown that the number of objects belonging to a given level of complexity is finite; therefore an increase of complexity is a requirement to keep producing new objects. A related observation, proposed by Chaitin (Chaitin, 2009, 2013), associates evolution with the search for mathematical creativity, which implies an increase of complexity, as more complex mathematical operations are needed in order to solve interesting problems, which are required to drive evolution.
Following the previous ideas, we choose to characterize OEE in computable dynamical systems as a process that has the property of producing families of objects of increasing complexity. Formally, given a complexity measure C, we say that a computable dynamical system S exhibits open-ended evolution with respect to C if for every time t there exists a time t′ such that the complexity of the system at the time t′ is greater than the complexity at the time t, i.e. $C(S(M_0, t)) < C(S(M_0, t'))$, where a complexity measure is a (not necessarily computable) function that goes from the state space to a positive numeric space.
The existence of such systems is trivial for complexity measures for which any infinite set of natural numbers (not necessarily computable) contains a subset where the measure grows strictly:
Lemma 1. Let C be a complexity measure such that any infinite set of natural numbers has a subset where C grows strictly. Then a computable system $S(M_0, t)$ produces an infinite number of different states if and only if it exhibits OEE for C.
Proof. Let $S(M_0, t)$ be a system that does not exhibit OEE and C a complexity measure as described. Then there exists a time t such that for any other time t′ we have that $C(M_{t'}) \le C(M_t)$, which holds true for any subset of states of the system. It follows that the set of states must be finite, since an infinite set of states would contain a subset on which C grows strictly and therefore exceeds $C(M_t)$.
For the converse, if the system exhibits OEE, then there exists an infinite subset of states on which C grows strictly, so there exist infinitely many different states.
Given the previous lemma, a trivial computable system that simply produces all the strings in order exhibits OEE on a class of complexity measures that includes algorithmic descriptive complexity. However, intuitively, we conjecture that such systems have a much simpler behaviour compared to what we observe in the natural world and in the cited artificial life systems. We can avoid some of these issues with a stronger version of OEE.
Definition 2. A sequence of naturals $n_0, n_1, ..., n_i, ...$ exhibits strong open-ended evolution (strong OEE) with respect to a complexity measure C if for every index i there exists an index i′ such that $C(n_i) < C(n_{i'})$, and the complexity of the sequence $C(n_0), C(n_1), ..., C(n_i), ...$ does not drop significantly, i.e. there exists a γ such that i ≤ j implies $C(n_i) \le C(n_j) + \gamma(j)$, where γ(j) is a positive function that does not grow significantly.
It is important to note that, while the definition of OEE allows significant drops in complexity during the evolution of a system, strong OEE requires that the complexity of the system does not decrease significantly during its evolution. In particular, we will ask that the complexity drops, as measured by γ, do not grow as fast as the complexity itself. Formally, $C(n_j) - \gamma(j)$ should not be upper-bounded on any infinite subsequence, for the smallest γ for which the strong OEE inequality holds.
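Because OEE and strong OEE are properties of infinite sequences, a program can only inspect a finite prefix. The sketch below, under that stated limitation, computes for a prefix of complexity values the smallest γ satisfying the strong OEE inequality (the running maximum minus the current value) and the indices at which a new complexity record is reached. The function names and the sample values are hypothetical.

```python
from itertools import accumulate
from typing import List, Sequence


def smallest_gamma(c: Sequence[int]) -> List[int]:
    """Smallest gamma(j) with C(n_i) <= C(n_j) + gamma(j) for all i <= j on this prefix."""
    running_max = accumulate(c, max)              # max of C(n_0), ..., C(n_j)
    return [m - x for m, x in zip(running_max, c)]


def record_times(c: Sequence[int]) -> List[int]:
    """Indices where the complexity strictly exceeds every earlier value."""
    records, best = [], None
    for i, x in enumerate(c):
        if best is None or x > best:
            records.append(i)
            best = x
    return records


# Hypothetical complexity values C(n_0), ..., C(n_6) of a finite prefix.
prefix = [1, 3, 2, 5, 4, 4, 7]
gamma = smallest_gamma(prefix)
records = record_times(prefix)
```

On a prefix, an unbounded supply of records is what OEE would require in the limit, while a γ that stays small relative to the complexity values is what strong OEE adds.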
We will understand the concept of speed of the growth of complexity in a comparative way: given two sequences of natural numbers $n_i$ and $m_i$, $n_i$ grows faster than $m_i$ if for every infinite subsequence and natural number N, there exists j such that $n_i - m_j \ge N$. Conversely, a subsequence over indexes denoted by i grows faster than a subsequence over indexes denoted by j if for every natural N, there exist i, j, with i < j, such that $n_i - n_j \ge N$.
If a complexity measure is sophisticated enough to depend on more than just the size of an object, significant drops in complexity are a property that can be observed in trivial sequences such as the ones produced by enumeration machines. Whether this is also true for non-trivial sequences is open for debate. However, if we classify random strings as low complexity objects and posit that non-trivial sequences must contain a limited number of random objects, then a non-trivial sequence must observe bounded drops in complexity in order to be capable of showing non-trivial OEE. This is the intuition behind the definition of strong OEE. Now, various complexity measures have been proposed that assign low complexity to random or incompressible natural numbers. Two examples of such measures are logical depth (Bennett, 1988) and sophistication (Koppel, 1988). Classifying random naturals as low complexity objects is a requirement for the results shown in the section Beyond Halting States: Open-Ended Evolution, and a motivation for this behaviour is given in that section.
Nonetheless, if C is a complexity measure capable of measuring OEE then there must exist infinite sets where C grows strictly. A trivial counting argument shows that the algorithmic descriptive complexity is unbounded in any infinite set; therefore it is also unbounded in any set where any other complexity measure grows strictly. Formally:
Lemma 3. If a system S exhibits OEE (and strong OEE) for a complexity measure C then it also shows OEE with respect to the descriptive complexity K.
Given the previous lemma, the results shown in the next section can also be extended to any other complexity measure capable of showing OEE.

A Computational Model for Adaptation
Let us start by describing the evolution of an organism or a population by a computable dynamical system. It has been argued that, in order for adaptation and survival to be possible, an organism must contain an effective representation of the environment so that, given a reading of the environment, the organism can choose a behaviour accordingly (Zenil et al., 2012). The closer this representation, the better the adaptation. If the organism is computable, this information can be codified by a computable structure. We will denote this structure by $M_t$, where t stands for the time corresponding to each of the stages of the evolution of the organism. This information is then processed following a finitely specified, unambiguous set of rules that, in finite time, will determine the adapted behaviour of the organism according to the information codified by $M_t$. We will denote this behaviour (or a theory explaining it) by the program $p_t$. An adapted system is one that produces an acceptable approximation of its environment. An environment can also be represented by a computable structure E. In other words, the system is adapted if $p_t(M_t)$ produces E. Based on this idea we propose a robust, formal characterization of adaptation:
Definition 4. Let K be the prefix-free descriptive complexity. We say that the system at the state $M_n$ is ε-adapted to the environment E if:
$$K(E \mid S(M_0, E, n)) \le \epsilon.$$
The inequality states that the minimal amount of information that is needed to describe E from a complete description of $M_n$ is ε or less. This information is provided in the form of a program p that produces E from the system at the time n. We will define such programs p as the adapted behaviour of the system. Uniqueness of p is not required.
The proposed structure for adapted systems is robust, since $K(E \mid S(M_0, E, n))$ is less than or equal to the number of characters needed to describe any computable method of describing E from the state of the system at the time n, whether it be a computable theory for adaptation or a computable model for an organism that tries to predict E. It follows that any computable characterization of adaptation that can be described within ε bits meets the definition of ε-adapted given a suitable choice of E, the adaptation condition for any given environment. It is important to note that, although inspired by a representationalist approach to adaptation, the presented characterization of adaptation is not contingent on the organism containing an actual codification of the environment, since any organism that can produce adapted behaviour that can be explained effectively (computable in finite time) is ε-adapted for some ε.
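The quantity in the adaptation inequality is not computable, so it cannot be evaluated directly. As a rough illustration only, the sketch below replaces K by the length of a zlib-compressed string and the conditional complexity by a difference of compressed lengths. This is a heuristic stand-in chosen by us, not the measure used in the definition, and all names (`c`, `conditional_proxy`, `is_adapted`) and the threshold are hypothetical.

```python
import random
import zlib


def c(data: bytes) -> int:
    """Compressed length in bytes: a computable, heuristic stand-in for K."""
    return len(zlib.compress(data, 9))


def conditional_proxy(target: bytes, given: bytes) -> int:
    """Heuristic stand-in for K(target | given): C(given + target) - C(given)."""
    return c(given + target) - c(given)


def is_adapted(state: bytes, env: bytes, eps: int) -> bool:
    """Proxy for the adaptation inequality: describing env from state costs at most eps."""
    return conditional_proxy(env, state) <= eps


rng = random.Random(0)                            # fixed seed so the example is repeatable
env = bytes(rng.getrandbits(8) for _ in range(2000))
informed_state = env                              # a state that already encodes the environment
uninformed_state = bytes(rng.getrandbits(8) for _ in range(2000))
```

With these toy inputs, a state that encodes the environment needs only a short additional description to produce it, whereas an unrelated state needs roughly a full copy, which mirrors the intended reading of ε-adaptation.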
As a simple example, we can think of an organism that must find the food located at the coordinates (j, k) on a grid in order to survive. If the information of an organism is codified by a computable structure M (such as DNA), and there is a set of finitely specified, unambiguous rules that govern how this information is used (such as the ones specified by biochemistry and biological theories) codified by a program p, then we say that the organism finds the food if p(M) = (j, k). If |⟨p⟩| ≤ ε, then we say that the organism is adapted according to a behaviour that can be described within ε characters. The proposed model for adaptation is not limited to such simple interactions. For a start, we can suppose that the organism sees a grid, denoted by g, of size n × m with food at the coordinates (j, k). The environment can be codified as a function E such that E(g) = (j, k), and ε-adapted implies that the organism defined by the genetic code M, which is interpreted by a theory or behaviour written in ε bits, is capable of finding the food upon seeing g. Similarly, more complex computational structures and interactions imply ε-adaptation.
Now, describing an evolutionary system that (eventually) produces an ε-adapted system is trivial via an enumeration machine (the program that produces all the natural numbers in order), as it will eventually produce E itself; moreover, we want the output of our process to remain adapted. Therefore we propose a stronger condition called convergence:
Definition 5. Given the description of a computable dynamical system $S(M_0, E, t)$ where t ∈ N is the variable of time, $M_0$ is an initial state and E is an environment, we say that the system S converges towards E with degree ε if there exists δ such that t ≥ δ implies $K(E \mid S(M_0, E, t)) \le \epsilon$.
For a fixed initial state $M_0$ and environment E, it is easy to see that the descriptive complexity of a state of the system depends mostly on t: we can describe a program that, given full descriptions of S, E, $M_0$ and t, finds $S(M_0, E, t)$; therefore
$$K(S(M_0, E, t)) \le K(S) + K(E) + K(M_0) + K(t) + O(1), \tag{4}$$
where the constant term is the length of the program described. In other words, as the time t grows, time is the main driver of the descriptive complexity within the system.

Irreducibility of Descriptive Time Complexity
In the previous section, it was established that time is the main factor in the descriptive complexity of the states within the evolution of a system. This result is expanded by the time complexity stability theorem (6). This theorem establishes that, within an algorithmic descriptive complexity framework, similarly complex initial states must evolve into similarly complex future states over similarly complex time frames, effectively erasing the difference between the complexity of the state of the system and the complexity of the corresponding time, and establishing absolute limits to the reducibility of future states.
Let $F(t) = T(S(M_0, E, t))$ be the real execution time of the system at time t. By using our time counting machine $U_H$ it is easy to see that F(t) is computable and, by uniqueness of the successor state, F increases strictly with t, and hence it is injective. Consequently, F has a computable inverse $F^{-1}$ over its image. Therefore, we have that (up to a small constant)
$$|K(F(t)) - K(t)| \le c, \tag{5}$$
where c is an integer independent of t (but that can depend on S). In other words, for a fixed system S, the execution time and the system time are equally complex up to a constant. From here on we will not differentiate between the complexity of both times. A generalization of the previous equation is given by the following theorem:
Theorem 6 (Time Complexity Stability). Let S and S′ be two computable systems and t and t′ the first time where each system reaches the states $M_t$ and There exists a natural number c that depends on S and $M_0$, but not on t, such that then there exists a constant c that does not depend on t such that |K(t) − K(t′)| ≤ c, where t and t′ are the minimum times for which the corresponding state is reached.
iii) Let S and S′ be two dynamical systems with an infinite number of times $\alpha_i$ and $\delta_i$ that are equally descriptively complex up to a constant. For any infinite subsequence of times with strictly growing descriptive complexity, all but finitely many j, k such that j > k comply with the equation
$$K(\alpha_k) - K(\alpha_j) = K(\delta_k) - K(\delta_j).$$
Proof. First, note that we can describe a program that, given S, $M_0$, E and $M_t$, runs $S(M_0, E, x)$ for each x until finding t; therefore
$$K(t) \le K(S) + K(E) + K(M_0) + K(M_t) + O(1),$$
and similarly for t′. Combined with the inequality 4 and the hypothesized equalities, this implies the first part. The second part is a direct consequence.
For the third part, suppose that there exists an infinity of times such that $K(\alpha_k) - K(\alpha_j) > K(\delta_k) - K(\delta_j)$; therefore $K(\alpha_k) - K(\delta_k) > K(\alpha_j) - K(\delta_j)$, which implies that the difference is unbounded, a contradiction to the first part. Analogously, the other inequality yields the same contradiction.
A possible objection to the assertion that the descriptive complexity of system time is the dominating parameter for the descriptive complexity of the evolution of a system is its growth speed: the function K(t) grows within an order of O(log t), which is very slow and often considered insignificant in the information theory literature. However, we have to consider the scale of time we are using. For instance, one second of real time in the system we are modelling can correspond to an exponential number of discrete time steps in our computable model (such as when modelling a genetic machine with current computer technology), yielding a potential polynomial growth of its descriptive complexity. However, if this time conversion is computable, then K(t) grows by at most a constant value. This is a notion of irreducibility, as there exist infinite sequences of times, called random times in the upcoming sections, that cannot be obtained by computable methods. We call such sequences irreducible.

Non-Randomness of Decidable Convergence Times
One of the most important issues for science is the prediction of the future behaviour of dynamical systems. The prediction that we focus on is that of the first state of convergence (definition 5): will a system converge, and how long will it take? In this section we show the first limit that decidability imposes on the complexity of the first convergent state. A consequence of this is the existence of undecidable adapted states.
Formally, for the convergence of a system S with degree ε to be decidable there must exist an algorithm $D_\epsilon$ such that $D_\epsilon(S, M_0, E, \delta) = 1$ if the system is convergent at the time δ and 0 otherwise. Moreover, we can describe a machine P that, given full descriptions of $D_\epsilon$, S, $M_0$ and E, runs $D_\epsilon$ on these inputs over all the possible times t in order, returning the first t for which the system converges. Note that $\delta = P(\langle D_\epsilon \rangle \langle S \rangle \langle M_0 \rangle \langle E \rangle)$; hence we have a short description of δ, and therefore δ cannot be random: if $S(M_0, E, t)$ is a convergent system then
$$K(\delta) \le K(S) + K(E) + K(M_0) + K(D_\epsilon) + O(1), \tag{7}$$
where δ is the first time at which convergence is reached. Note that all the variables are known at the initial state of the system. This result can be summarized by the following lemma:
Lemma 7. Let S be a system convergent at the time δ. If δ is considerably more descriptively complex than the system and the environment, i.e. for every reasonably large natural number d we have that
$$K(\delta) > K(S) + K(E) + K(M_0) + d,$$
then δ cannot be found by an algorithm described within d characters.
Proof. It is a direct consequence of the inequality 7.
We call such times random convergence times and the state of the system $M_\delta$ a random state. It is important to notice that the descriptive complexity of a random state must also be high:
Lemma 8. Let S be a convergent system with a complex state $S(M_0, E, \delta)$. For every reasonably large d we have that
$$K(S(M_0, E, \delta)) > K(S) + K(E) + K(M_0) + d.$$
Proof. Suppose the contrary, i.e. there exists a small d such that $K(S(M_0, E, \delta)) \le K(S) + K(E) + K(M_0) + d$. Let q be the program that, given S, E, $M_0$ and $S(M_0, E, \delta)$, runs $S(M_0, E, t)$ in order for each t and compares the result to $S(M_0, E, \delta)$, returning the first time at which equality is reached. Therefore, by the uniqueness of the successor state (2), $\delta = q(S, M_0, E, S(M_0, E, \delta))$ and
$$K(\delta) \le K(S) + K(E) + K(M_0) + K(S(M_0, E, \delta)) + O(1) \le 2\,(K(S) + K(E) + K(M_0)) + d + O(1),$$
which gives us a small upper bound on the descriptive complexity of the random convergence time δ.
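The search machine P used to obtain the inequality 7 amounts to an unbounded search over times. A minimal sketch follows, assuming that the decision algorithm is available as a Python function returning 1 or 0; the toy system, the toy decision procedure (equality with the environment as a stand-in for the convergence condition) and all names are hypothetical.

```python
from itertools import count
from typing import Callable

System = Callable[[int, int, int], int]
Decider = Callable[[System, int, int, int], int]


def first_convergence_time(decide: Decider, S: System, m0: int, env: int) -> int:
    """Role of machine P: return the first t for which decide(S, m0, env, t) == 1.

    The search does not terminate if the system never converges.
    """
    for t in count():
        if decide(S, m0, env, t) == 1:
            return t


def toy_system(m0: int, env: int, t: int) -> int:
    """Hypothetical system whose state moves one unit per step towards env and stays there."""
    return min(m0 + t, env)


def toy_decider(S: System, m0: int, env: int, t: int) -> int:
    """Hypothetical decision procedure: the state reproduces the environment exactly."""
    return int(S(m0, env, t) == env)


delta = first_convergence_time(toy_decider, toy_system, 3, 10)
```

The point of the construction is that the search itself is a fixed, short program, so the first convergence time is described by the search plus descriptions of the decision algorithm, the system, the initial state and the environment.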
In other words, if δ has high descriptive complexity, then there does not exist a reasonable algorithm that finds it even if we have a complete description of the system and its environment.It follows that the descriptive complexity of a computable convergent state cannot be much greater than the descriptive complexity of the system itself.
What a reasonably large d is has been handled so far with ambiguity, as it represents the descriptive complexity of any computable method $D_\epsilon$. We might intend to find convergent times, which intuitively cannot be arbitrarily large. It is easy to 'cheat' on the inequality 7 by including in the description of the program $D_\epsilon$ the full description of the convergence time δ, which is why we ask for reasonable descriptions.
Another question left to answer is whether complex convergent times do exist for a given limit d, considering that the limits imposed by the inequality 7 loosen up in direct relation to the descriptive complexity of S, E and M 0 .
The next result answers both questions by proving the existence of complex convergent times for a broad characterization of the size of d:
Lemma 9 (Existence of Random Convergence Times). Let F be a total computable function. For any ε, there exists a system $S(M_0, E, t)$ such that the convergence times are $F(S, M_0, E)$-random.
Proof. Let E and s be two natural numbers such that K(E|s) > ε. By reduction to the Halting Problem (Turing, 1936) it is easy to see the existence of $F(S, M_0, E)$-random convergent times: let T be a Turing machine, and $S_t$ the Turing machine that emulates T for t steps with input $M_0$ and returns E for every time equal to or greater than the halting time and s otherwise. Let us consider the system $S(M_0, E, t) = S_t(\langle T \rangle \langle M_0 \rangle \langle t \rangle \langle E \rangle)$.
If the convergent times are not $F(S, M_0, E)$-random, then there exists a constant c such that we can decide HP by running S′ for each t that meets the following inequality: which cannot be done, since HP is undecidable.
Let us focus on what the previous lemma is saying: F can be any computable function. It can be a polynomial or exponential function with respect to the length of given descriptions of $M_0$ and E. It can also be any computable theory that we might propose for setting an upper limit to the size of an algorithm that finds convergence times given descriptions of the system behaviour, environment and initial state. In other words, for a class of dynamical systems, finding convergence times, and therefore convergent states, is not decidable even with complete information about the system and its initial state. Finally, by the proof of the lemma, adapted states can be seen as a generalization of halting states.
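The construction in the proof of lemma 9, where a system outputs the environment exactly from the halting time of an emulated machine onwards, can be sketched as follows. As an assumption for illustration, machines are modelled as Python generator functions with one `yield` per step; `countdown` and `loop_forever` are hypothetical toy machines, and the strings "E" and "s" stand in for the two natural numbers of the proof.

```python
from typing import Callable, Generator, Hashable

Machine = Callable[[int], Generator[None, None, int]]


def make_halting_system(machine: Machine, s: Hashable):
    """Build S(m0, env, t): env if `machine` halts on m0 within t steps, s otherwise."""
    def S(m0: int, env: Hashable, t: int) -> Hashable:
        run = machine(m0)
        for _ in range(t + 1):                    # a halting time h shows up on call h + 1
            try:
                next(run)
            except StopIteration:
                return env
        return s
    return S


def countdown(n: int) -> Generator[None, None, int]:
    """Hypothetical toy machine: halts after exactly n steps."""
    while n > 0:
        n -= 1
        yield
    return 0


def loop_forever(_m0: int) -> Generator[None, None, int]:
    """Hypothetical toy machine that never halts."""
    while True:
        yield


S_halts = make_halting_system(countdown, "s")
S_loops = make_halting_system(loop_forever, "s")
```

In this sketch the first convergence time of `S_halts` coincides with the halting time of the emulated machine, while `S_loops` never converges, which is the sense in which adapted states generalize halting states.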

Randomness of Convergence in Dynamic Environments
So far we have limited the discussion to fixed environments.However, as observed in the physical world, the environment itself can change over time.We call such environments dynamic environments.In this section we extend the previous results to cover environments that change depending on time as well as on the initial state of the system.We also propose a weaker convergence condition called weak convergence and propose a necessary (but not sufficient) condition for the computability of convergent times called descriptive differentiability.
We can think of an environment E as a dynamic computable system, a moving target that also changes with time and depends on the initial state M_0. In order for the system to be convergent, we propose the same criterion: there must exist δ such that n ≥ δ implies (8)
A system with a dynamic environment also meets inequality 7 and lemmas 7 and 9, since we can describe a machine that runs both S and E for the same time t. Now, with dynamic environments, E is a moving target and it is therefore convenient to consider an adaptation period for the new states of E:
Definition 10. We say that S converges weakly to E if there exists an infinity of times δ_i such that
As a direct consequence of inequality 7 and lemma 9 we have the following lemma:
Lemma 11. Let S(M_0, E(M_0, t), t) be a weakly converging system. Any decision algorithm D_ε(S, M_0, E, δ_i) can only decide the first non-random time.
As noted above, these results do not change when dynamic environments are considered. In fact, we can think of static environments as a special case of dynamic environments. However, with different targets of adaptability and convergence, it makes sense to generalize beyond the first convergence time. Also, it should be noted that specifying a convergence index adds information that a decision algorithm can potentially use.
Lemma 12. Let S(M_0, E(M_0, t), t) be a weakly converging system with an infinity of random times such that k > j implies that K(δ_k) = K(δ_j) + ΔK_δ(j, k), where ΔK_δ is a (not necessarily computable) function with range on the positive integers. If the function ΔK_δ(i, i + m) is unbounded with respect to i, then any decision algorithm D_ε(S, M_0, E, i), where i is the index of the i-th convergence time, can only decide a finite number of i's.
Proof. Suppose that D_ε(S, M_0, E, i) can decide an infinite number of instances. Let us consider two times δ_i and δ_{i+m}. Notice that we can describe a program that, using D_ε, S, E and M_0 and given i along with the distance m, finds δ_{i+m}. The next inequality follows:
Also, note that we can describe another program that, given δ_i and using D_ε, S, E and M_0, finds i, from which
and ΔK_δ(i, i + m) is bounded with respect to i.
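The two programs described in the proof suggest the following chain of bounds. This is only a sketch: the constants c_1 and c_2 are hypothetical names for the index-independent cost of describing D_ε, S, E and M_0 together with each search program, and the exact form of the original inequalities may differ.

```latex
\begin{align*}
K(\delta_{i+m}) &\le K(i) + K(m) + c_1, \\
K(i) &\le K(\delta_i) + c_2, \\
\Delta K_\delta(i, i+m) = K(\delta_{i+m}) - K(\delta_i) &\le K(m) + c_1 + c_2 .
\end{align*}
```

The right-hand side of the last line does not depend on i, which is the sense in which ΔK_δ(i, i + m) is bounded with respect to i once m is fixed.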
One direct consequence of the previous lemma is that if a sequence of times δ_1, δ_2, ..., δ_i, ... is decidable then for every m there exists a constant c_{δ,m} such that
which can be generalized as:
Definition 13. Let δ_1, δ_2, ..., δ_i, ... be a strictly growing sequence of natural numbers. We define the descriptive derivative of the natural mapping δ : i → δ_i as
As a direct consequence of lemma 12, the existence of a descriptive derivative is a necessary condition for the computability of δ; thus not meeting this property is sufficient, but not necessary, for undecidability. Therefore the non-existence of a descriptive derivative is a stronger condition than undecidability, which we will call being non-descriptively differentiable.
Definition 14. We say that a sequence of times δ_1, δ_2, ..., δ_i, ... is non-descriptively differentiable if ΔK_δ(m) is not a total function.

Beyond Halting States: Open-Ended Evolution
Inequality 7 states that being able to predict or recognize adaptation imposes a limit on the descriptive complexity of the first adapted state. A particular case is the halting state, as shown in the proof of lemma 9. By lemma 3 this result holds for any complexity measure. In this section we extend the lemma to continuously evolving systems, showing that computability of adapted times limits the complexity of adapted states beyond the first, imposing a limit on open-ended evolution for three complexity measures: sophistication, coarse sophistication and busy beaver logical depth.
For a system in constant evolution converging to a dynamic environment, lemma 12 imposes a limit on the growth of descriptive complexity of a system with computable adapted states: if the growth of the descriptive complexity of a sequence of convergent times is unbounded in the sense of definition 14, then all but a finite number of times are undecidable. The converse would be convenient; however, it is not always true. Moreover, the next series of results shows that imposing such a limit would impede strong OEE:
Theorem 15. Let S be a non-cyclical computable system with initial state M_0, E a dynamic environment and δ_1, ..., δ_i, ... a sequence of times such that for each δ_i there exists a total function
Proof. Assume that p is computable. We can describe a program D_ε that, given S, M_0, δ_i and E, for each time t runs p_{δ_i}(M_t) and E(t), returning 1 if the δ_i-th t is such that p_{δ_i}(t) = E(t) and 0 otherwise; therefore the sequence of δ_i's is computable.
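The scanning idea behind this proof can be sketched as follows. The sketch rests on an assumption that is not stated in the text: the i-th adapted time is taken to be the first time after the previous one at which the total program p(i), applied to the state M_t, agrees with E(t). The names adapted_times, step, env and max_t are hypothetical, and the toy system in the usage note is invented purely for illustration.

```python
from typing import Callable, List


def adapted_times(
    step: Callable[[int], int],
    M0: int,
    env: Callable[[int], int],
    p: Callable[[int], Callable[[int], int]],
    count: int,
    max_t: int = 10_000,
) -> List[int]:
    """Return the first `count` adapted times found by scanning t = 1, 2, ...

    ASSUMPTION: delta_i is the first time after delta_{i-1} at which the
    total program p(i), run on the state M_t, equals the environment E(t).
    """
    times: List[int] = []
    state, t = M0, 0
    for i in range(1, count + 1):
        while True:
            state, t = step(state), t + 1   # advance the system by one step
            if t > max_t:
                raise RuntimeError("no adapted time found within max_t")
            if p(i)(state) == env(t):       # p(i) is total, so this check halts
                times.append(t)
                break
    return times
```

For a toy system with M_t = t, E(t) = t // 3 and constant programs p_i(M) = i, the call adapted_times(lambda M: M + 1, 0, lambda t: t // 3, lambda i: (lambda M: i), 3) returns [3, 6, 9]. The check at each time halts only because each p_i is total, which is where the totality requirement enters.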
The last result can be applied naturally to weakly convergent systems (definition 10): the way each adapted state approaches E is unpredictable; in other words, its behaviour changes over different stages unpredictably. Formally:
Corollary 16. Let S(M_0, E, t) be a weakly converging system with adapted states M_{δ_1}, ..., M_{δ_i}, ... and p_1, ..., p_i, ... their respective adapted behaviours. If the mapping δ : i → δ_i is non-descriptively differentiable then the function p : i → p_i is not computable.
Proof. This is a direct consequence of applying theorem 15 to the definition of weakly converging systems.
While asking for totality might look like an arbitrary limitation at first glance, the reader should recall that in weakly convergent systems the program p_i represents an organism, a theory or another computable system that uses the information in M_{δ_i} to predict the behaviour of E(δ_i), and if this prediction does not process its environment in a sensible time frame then it is hard to argue that it represents an adapted system or a useful theory.
The intuition behind classifying descriptively differentiable adapted time sequences as less complex is better explained by borrowing ideas developed by Bennett and Koppel within the frameworks of logical depth (Bennett, 1988) and sophistication (Koppel, 1988), respectively. Their argument states that random strings are as simple as very regular strings, given that there is no complex underlying structure in their minimal descriptions. The intuition that random objects contain no useful information leads us to the same conclusion. And, given theorem 6, the states must retain a high degree of randomness for random times.
Sophistication is a measure of the useful information within a string, proposed by Koppel. The idea behind it is to divide the description of a string x into two parts: the program that represents the underlying structure of the object and the input, which is the random or structureless component of the object. This measure is denoted by soph_c(x), where c is a natural number representing the significance level.
Definition 17. The sophistication of a natural number x at the significance level c, c ∈ N, is defined by:
\[ \mathrm{soph}_c(x) = \min\{\, |p| : p \text{ is a total function and } \exists y.\, p(y) = x \text{ and } |p| + |y| \le K(x) + c \,\} \]
Now, the images of a mapping δ : i → δ_i already have the form δ(i), where δ and i represent the structure and the random component respectively. Random strings should hold this structure strongly, up to a logarithmic error, which is proven in the next lemma.
Lemma 18. Let δ_1, ..., δ_i, ... be a sequence of different natural numbers and r a natural number. If the function δ : i → δ_i is computable then there exists an infinite subsequence where the sophistication is bounded up to a logarithm of a logarithmic term of their indexes.
Proof. Let δ be a computable function. Note that, since δ is computable and the sequence is composed of different naturals, its inverse function δ^{−1} can be computed by a program m that, given a description of δ and δ_i, finds the first i that produces δ_i and returns it, therefore
Now, if i is an r-random natural where the inequality holds tightly we have that
which implies that, given that δ is a total function,
Therefore, the sophistication is bounded up to a logarithm of a logarithmic term for a constant significance level for an infinite subsequence.
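The inverse program m mentioned at the start of the proof is a plain unbounded search, which can be sketched as below. The function name inverse and the example mapping are hypothetical; the only properties used are that δ is total, computable and injective on its indexes.

```python
from itertools import count
from typing import Callable


def inverse(delta: Callable[[int], int], value: int) -> int:
    """Return the first index i >= 1 with delta(i) == value.

    Halts whenever `value` is in the image of delta; loops forever otherwise.
    """
    for i in count(1):
        if delta(i) == value:
            return i
```

For instance, with the illustrative mapping delta(i) = i * i + 1, inverse(delta, 26) returns 5. Because the search needs only δ and δ_i, a description of δ_i together with a description of δ is enough to recover the index i.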
Small changes in the significance level of sophistication can have a large impact on the sophistication of a given string. Another possible issue is that the constant proposed in lemma 18 could appear to be large at first (but it becomes comparatively smaller as i grows). A robust variation of sophistication called coarse sophistication (Antunes and Fortnow, 2003) incorporates the significance level as a penalty. The definition presented here differs slightly from theirs in order to maintain congruence with the chosen prefix-free universal machine and to avoid negative values. This measure is denoted by csoph(x).
Definition 19. The coarse sophistication of a natural number x is defined as:
\[ \mathrm{csoph}(x) = \min\{\, 2|p| + |y| - K(x) : p(y) = x \text{ and } p \text{ is total} \,\} \]
where |y| is the length of a computable unambiguous codification of y.
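One way to read the penalty is through the following identity. It assumes that sophistication and coarse sophistication take the standard forms used by Antunes and Fortnow, with K in place of plain complexity, so it is a sketch that may differ by constants from the variant adopted here.

```latex
\begin{align*}
\mathrm{csoph}(x) &= \min_{c}\,\{\, \mathrm{soph}_c(x) + c \,\} \\
&= \min\{\, 2|p| + |y| - K(x) : p(y) = x \text{ and } p \text{ is total} \,\}.
\end{align*}
```

The second line follows because the smallest significance level that admits a given pair (p, y) is c = |p| + |y| − K(x), so the penalized length |p| + c equals 2|p| + |y| − K(x).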
With an argument similar to the one used to prove lemma 18, it is easy to show that coarse sophistication is similarly bounded up to a logarithm of a logarithmic term.
Lemma 20. Let δ_1, ..., δ_i, ... be a sequence of different natural numbers and r a natural number. If the function δ : i → δ_i is computable then there exists an infinite subsequence where the coarse sophistication is bounded up to a logarithm of a logarithmic term.
Proof. If δ is computable and i is r-random then, by the definition of csoph and the inequalities presented in the proof of lemma 18, we have that
Another proposed measure of complexity is Bennett's logical depth (Bennett, 1988), which measures the minimum computational time required to compute an object from a nearly minimal description. Logical depth works under the assumption that complex or deep natural numbers take a long time to compute from near-minimal descriptions. Conversely, random or incompressible strings are shallow since their minimal descriptions must contain the full description verbatim. For the next result we will use a related measure called busy beaver logical depth, denoted by depth_bb(x).
Definition 21. The busy beaver logical depth of the description of a natural x, denoted by depth_bb(x), is defined as:
\[ \mathrm{depth}_{bb}(x) = \min\{\, |p| - K(x) + j : U(p) = x \text{ and } T(p) \le BB(j) \,\} \]
where T(p) is the halting time of the program p and BB(j), known as the busy beaver function, is the halting time of the slowest program that can be described within j bits (Daley, 1982).
The next result follows from a theorem found by Antunes and Fortnow (Antunes and Fortnow, 2003) and lemma 20.
Corollary 22. Let δ_1, ..., δ_i, ... be a sequence of different natural numbers and r a natural number. If the function δ : i → δ_i is computable then there exists an infinite subsequence where the busy beaver logical depth is bounded up to a logarithm of a logarithmic term of their indexes.
Proof. By theorem 5.2 in Antunes and Fortnow (2003), for any i we have that
By lemma 20 and theorem 6 the result follows.
Let us focus on the consequences of lemmas 18 and 20 and corollary 22. Given the relationship established between descriptive time complexity and the corresponding state of a system (theorem 6), these last results imply that either the complexity of the adapted states of a system (using any of the three complexity measures) grows very slowly for an infinite subsequence of times (becoming increasingly common up to a probability limit of 1 (Calude and Stay, 2006)) or the subsequence of adapted times is undecidable.
Theorem 23. If S(M_0, E, t) is a weakly converging system with adaptation times δ_1, ..., δ_i, ... then there exists a constant c such that, if csoph, depth_bb or soph_c show strong OEE that grows faster than O(log log i) on an infinite subsequence, then the mapping δ : i → δ_i is not computable.
Proof. We can see the sequence of adapted states as a function M_{δ_i} : i → M_{δ_i}. By lemmas 18 and 20 and corollary 22, for the three stated measures of complexity there exists an infinite subsequence where the respective complexity is upper bounded by O(log log i). It follows that, if the complexity grows faster than O(log log i) for an infinite subsequence, then there must exist an infinity of indexes j in the bounded sequence where γ(j) grows faster than C(M_j); therefore there exists an infinity of indexes j where C(M_j) − γ(j) is upper bounded.
Now, in the absence of an absolute solution to the problem of finding adapted states in the presence of strong OEE, one might ask for a partial solution or approximation that decides most (or at least some) of the adapted states. The following corollary shows that the problem is not even semi-computable: any algorithm one might propose can only decide a bounded number of adapted states.
Corollary 24. If S(M_0, E, t) is a weakly converging system with adapted states M_1, ..., M_i, ... that show strong OEE with speed greater than O(log log i) on an infinite subsequence for the three stated complexity measures, then the mapping δ : i → δ_i is not even semi-computable.
Proof. Notice that, for any subsequence of adaptation times δ_{j_1}, ..., δ_{j_k}, ..., the system must show strong OEE. Therefore, by theorem 23, any subsequence must also be non-computable. It follows that there cannot exist an algorithm that produces an infinity of elements of the sequence, since such an algorithm would allow the creation of a computable subsequence of adaptation times.
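The last step of this argument can be illustrated with a short sketch. Suppose, hypothetically, that some procedure enumerated infinitely many pairs (j, δ_j) in arbitrary order; the input pairs stands in for that procedure and the function name is an assumption made for illustration. Keeping only the pairs whose index exceeds every index kept so far yields a subsequence with strictly increasing indexes, which is the computable subsequence that theorem 23 rules out.

```python
from typing import Iterable, Iterator, Tuple


def increasing_subsequence(
    pairs: Iterable[Tuple[int, int]]
) -> Iterator[Tuple[int, int]]:
    """Keep only the pairs (j, delta_j) whose index exceeds every index kept so far."""
    last = 0
    for j, d in pairs:
        if j > last:
            last = j
            yield (j, d)
```

If the enumeration contains infinitely many distinct indexes, then infinitely many of them exceed any index kept so far, so the filtered output is itself infinite.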
Technically, theorem 23 does not impose undecidability on strong OEE. However, the growth rate that decidability imposes is extremely slow: a double logarithm of the index of the adapted states. If we disregard this increasingly insignificant growth rate, we can say that strong open-ended evolution implies undecidability of the adapted states. It is also important to note that, even though the upper bound is over the index of the sequence, this limit cannot be increased by any computable method by more than a constant value; otherwise we would have a computable sequence that contradicts theorem 23.
Furthermore, by corollary 16, the behaviour and interpretation of the system evolves in an unpredictable way, establishing one path for emergence: a set of rules for future states that cannot be reduced to an initial set of rules. Recall that for a given weakly converging dynamical system, the sequence of programs p_i represents the behaviour or interpretation of each adapted state M_i. If a system exhibits strong OEE with respect to the complexity measures soph_c, csoph or depth_bb, then by corollary 16 and theorem 23 the sequence of behaviours is uncomputable and therefore irreducible to any function of the form p : i → p_i, even when possessing complete descriptions of the behaviour of the system, its environment and its initial state. In other words, the behaviour of iterative adapted states cannot be obtained from the initial set of rules.
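To give a sense of how slowly a double logarithm grows, the following snippet evaluates log2(log2(i)) for a few indexes. It ignores the constant hidden in the O(·) notation, which the results above do not specify, and the helper name loglog2 is hypothetical.

```python
import math


def loglog2(i: int) -> float:
    """Return log2(log2(i)) for an integer i >= 2."""
    return math.log2(math.log2(i))


# Indexes of the form 2**k make the inner logarithm exact.
for k in (4, 16, 64, 256, 1024):
    print(f"i = 2**{k:<5} log2(log2(i)) = {loglog2(2 ** k):.1f}")
```

Even for an index as large as 2**1024 the value is only 10, so up to the unspecified constant the complexity growth that decidability allows stays negligible across any practically reachable index.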

A System Exhibiting OEE
With the aim of providing mathematical evidence for the adequacy of Darwinian evolution, Chaitin developed a mathematical model that converges to its environment significantly faster than exhaustive search, being fairly close to an intelligent solution to a mathematical problem that requires maximum creativity (Chaitin, 2009, 2013).
The problem Chaitin proposes is finding digital organisms that approximate the busy beaver function:
which is equivalent (up to a constant error) to asking for the largest natural number that can be named within n bits, along with the program that generates it. Let η_i = BB(i) be the sequence of all busy beaver values; note that this sequence shows strong OEE with respect to depth_bb: by definition, if i is the first value for which BB(i) was obtained,
where U(p_i) = K(BB(i)). It follows that K(BB(i)) = |p_i| and depth_bb(BB(i)) = i, otherwise p_i would not be the minimum program. Therefore, i ≤ j implies
Chaitin's evolutionary system searches nondeterministically through the space of Turing machines using a reference universal machine U′ with the property that all strings are valid programs. This random walk starts with the empty string, M_0 = "", and at each state stochastically mutates a number of bits of the program (changing a value, deleting a value, or inserting a new one), with probabilities that decay for larger numbers of changes and with the distance to the first bit; it then verifies whether the new program defines a larger integer. If it does, this program becomes the new state M_{t+1}; otherwise we keep searching for new organisms. For a more sophisticated version of the system, where new organisms are defined as the outputs of a randomly chosen machine called a mutation, Chaitin demonstrates that the system approaches BB(t) efficiently (with quadratic overhead), arguing that this is evidence of the adequacy of Darwinian evolution (Chaitin, 2012).
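The shape of this random walk can be sketched with a deliberately simplified toy. Two assumptions depart from the system described above and are made only so that the code is runnable: the integer named by a program is replaced by the plain binary value of the bit string (the real fitness requires running arbitrary programs and is not computable), and each step applies a single uniformly chosen point mutation instead of a decaying distribution over several changes. The names value, mutate and evolve are hypothetical.

```python
import random
from typing import List, Tuple


def value(bits: str) -> int:
    """Toy stand-in for the integer named by a program.

    ASSUMPTION: the bit string is read as a binary number; the empty string is 0.
    """
    return int(bits, 2) if bits else 0


def mutate(bits: str, rng: random.Random) -> str:
    """Apply one random point mutation: flip, delete or insert a bit."""
    op = rng.choice(("flip", "delete", "insert"))
    if not bits or op == "insert":
        pos = rng.randint(0, len(bits))
        return bits[:pos] + rng.choice("01") + bits[pos:]
    pos = rng.randrange(len(bits))
    if op == "flip":
        flipped = "1" if bits[pos] == "0" else "0"
        return bits[:pos] + flipped + bits[pos + 1:]
    return bits[:pos] + bits[pos + 1:]   # delete the bit at pos


def evolve(steps: int, seed: int = 0) -> Tuple[str, List[int]]:
    """Random walk from the empty string, keeping only mutants that name a larger integer."""
    rng = random.Random(seed)
    organism, history = "", [0]
    for _ in range(steps):
        candidate = mutate(organism, rng)
        if value(candidate) > value(organism):
            organism = candidate
            history.append(value(organism))
    return organism, history
```

Calling evolve(500) returns the final bit string together with the strictly increasing list of values reached along the way; each entry after the first marks a step at which the organism changed, the toy analogue of an adaptation time.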
A deterministic version of the system is the following:
where H(M_{t−1}, p) is the distance between the programs M_{t−1} and p, quantified as the number of mutations needed to transform one string into the other, and w is a positive integer acting as an accumulator that resets to 1 whenever M_t increases in value and increases by 1 otherwise.
Defining a computable environment or adaptation condition for this system is difficult, since Chaitin seeks to approach an uncomputable function (BB) and the evolution rule itself is not computable, given the halting problem. The most direct way to define it is E(t) = BB(t) or, equivalently, as the first t bits of Chaitin's constant Ω (Gardner, 1979). A less direct way that keeps the intuition of an aptitude condition is to define it, for each time t, as the proposition larger than U(M_{t−1}), noting that we can compute M_{t−1} and their relationship given M_t and a constant amount of information (ε); we therefore have adaptation at the times where the busy beaver function grows.
It is easy to see that the sequence of programs i → M_i is precisely the sequence of programs that generate the busy beaver sequence η_i = BB(i). Given that BB(t) is not a computable function, the evolution of the system, along with the respective adaptation times, is not computable. Furthermore, this sequence is composed of representations of objects whose respective sequence exhibits strong OEE.

Logical Depth and Future Work
Although we conjecture that theorem 23 must also hold for logical depth as defined by Bennett (Bennett, 1988), extending the results to this measure is still a work in progress. Encompassing logical depth will require a deeper understanding of the internal structure of the relationship between system and computing time, beyond the time complexity stability (theorem 6), and might be related to open fundamental problems in computer science and mathematics. For instance, finding a low upper bound on the growth of the logical depth of all computable series of natural numbers would suggest a negative answer to the question of the existence of an efficient way of generating deep strings, which Bennett relates to the P = PSPACE problem.
A strong version of the stated conjecture is the following:
Conjecture 25. Computability bounds the growth rate of complexity to the order of the slowest growing infinite subsequence with respect to any adequate complexity measure C.
An intuitive version of the previous conjecture is that the information of future states of a system is either contained in the initial state, and thus their complexity is bounded by that of the initial state, or is undecidable. This should follow given that, for any computable dynamical system, the randomness induced by time cannot be avoided.
Given that we intend to expand upon these questions in future work, it is important to address whether the diagonal algorithm that Bennett proposes to generate deep strings presents a contradiction to our conjecture. The logical depth of a natural x at the level of significance c is defined as:
The algorithm χ(n, T) produces strings of length n with depth T for a significance level n − K(T) − O(log n), where K(T) must be smaller than n, and n must not be as large as (or larger than) T, to avoid shallow strings. One possible issue with this algorithm is that the significance level is not computable and we can expect it to vary greatly with respect to K(T): for large T with small K(T) (such as T^{T^T}) the significance level is nearly n, which suggests that, for a steady significance level with respect to times T with large K(T), the growth in complexity might not be stable. This issue, along with an algorithm that consistently enumerates pairs of n and T such that K(T) < n ≪ T for growing T, will be explored in future work, and its solution would require a formal definition of adequate complexity measures.
The case of χ presenting a challenge to conjecture 25 would suggest a very important difference from the three complexity measures used in this article.
Another future course of research is extending the results to continuous time scales.

Conclusions
We have presented a formal and general mathematical model for adaptation within the framework of computable dynamical systems. This model exhibits universal properties for all computable dynamical systems, of which Turing machines are a subset.
Among other results, we have given formal definitions of open-ended evolution (OEE) and strong open-ended evolution. We also showed that decidability imposes universal limits on the growth of complexity of computable systems as measured by sophistication, coarse sophistication and busy beaver logical depth. Furthermore, as a direct implication of corollary 16 and theorem 23, undecidability of adapted states and unpredictability of the behaviour of the system at each state are requirements for a system to exhibit strong open-ended evolution (up to an O(log log t) term) with respect to these three measures. This establishes rigorously that undecidability and irreducibility of future behaviour are requirements for the growth of complexity in the class of computable dynamical systems.