One of the main goals of Artificial Life is to research the conditions for the emergence of life, not necessarily as it is, but as it could be. Artificial chemistries are one of the most important tools for this purpose because they provide us with a basic framework to investigate under which conditions metabolisms capable of reproducing themselves, and ultimately, of evolving, can emerge. While there have been successful attempts at producing examples of emergent self-reproducing metabolisms, the set of rules involved remains too complex to shed much light on the underlying principles at work. In this article, we hypothesize that the key property needed for self-reproducing metabolisms to emerge is the existence of an autocatalyzed subset of Turing-complete reactions. We validate this hypothesis with a minimalistic artificial chemistry with conservation laws, which is based on a Turing-complete rewriting system called combinatory logic. Our experiments show that a single run of this chemistry, starting from a tabula rasa state, discovers (with no external intervention) a wide range of emergent structures, including ones that self-reproduce in each cycle. All of these structures take the form of recursive algorithms that acquire basic constituents from the environment and decompose them in a process that is remarkably similar to biological metabolisms.

One central area of focus for Artificial Life is characterizing the conditions that lead to the emergence of living systems. More precisely, the goal is to understand under which conditions metabolisms capable of sustaining themselves in time, reproducing, and ultimately, evolving can emerge. Artificial chemistries can be used to reveal this process by simulating the properties of natural chemical systems at different levels of abstraction (see Dittrich et al., 2001, for a thorough review). The driving hypothesis is that complex organizations emerge from the interactions of simpler components thanks to self-organizing attractors in chemical networks (Kauffman, 1993; Walker & Ashby, 1966; Wuensche et al., 1992). While some artificial chemistries model as closely as possible the properties of the chemistry that gave rise to life on Earth (Flamm et al., 2010; Högerl, 2010; Young & Neshatian, 2013), others leave out the particularities of natural chemistry to focus only on their hypothesized core computational properties (Buliga & Kauffman, 2014; di Fenizio and Banzhaf, 2000; Fontana & Buss, 1994; Hutton, 2002; Sayama, 2018; Tominaga et al., 2007). Interestingly, some of these studies have described systems that produce emergent metabolic structures (Bagley & Farmer, 1992), and others feature self-reproducing structures as well (Hutton, 2002; Young & Neshatian, 2013), at times also capable of undergoing evolution (Hutton, 2007). However, it is still not clear which of the properties in these chemical systems are central to the emergence of these structures. Yet, gaining such insights is a crucial step for deriving more general biological theories grounded both in life-as-it-is and life-as-it-might-be (Langton, 1989).

Here, we hypothesize that self-reproducing metabolisms emerge as a subset of autocatalyzed reactions within a Turing-complete set. To validate this idea, we introduce combinatory chemistry, an artificial chemistry designed to capture the core computational properties of living systems. It consists of a set of autocatalyzed reactions that are based on the rewriting rules of combinatory logic (Curry et al., 1958; Schönfinkel, 1924). These reactions inherit from the original rewriting rules the capacity to perform universal computation. Furthermore, we adapted the rules so that the ensuing reactions would have conservative dynamics. Completing the set of possible reactions, there are random mixing rules that act at a far lower rate.

The resulting chemical system is strongly constructive (Fontana et al., 1993), which means that as the system evolves in time it can create—by chance or through the action of its own components—new components that can in turn modify its global dynamics. Furthermore, thanks to its universal computation capabilities, there is no theoretical limit to the complexity that emergent forms can have. On the other hand, because of the conservation dynamics, the memory cost of the system remains constant without needing to introduce external perturbations (such as randomly removing elements from the system), while possibly also providing a natural source of selective pressure.

In contrast to previous work that explicitly banned expressions that would not reduce to a normal form (di Fenizio and Banzhaf, 2000; Fontana & Buss, 1994), combinatory chemistry can handle them adequately by distributing their (potentially infinite) computation steps over time as individual reactions. Also, with respect to an earlier version of this work (Kruszewski & Mikolov, 2020), here we have further simplified the system, dropping the “reactant assemblage” mechanism through which we used to feed emergent structures with their required resources. This mechanism enabled them to grow their activity and thus be more easily spotted, but it also biased the evolution of the system. Instead, here we have devised a new metric to identify these structures in naturally occurring resource conditions. Moreover, we use Gillespie's algorithm (Gillespie, 1977) to simulate the time evolution of the system to obtain unbiased samples of the proposed process, instead of the more simplistic algorithm that was used before.

Starting from a tabula rasa state consisting of only elementary components, combinatory chemistry produces a diversity explosion, which then develops into a state dominated by self-organized emergent autopoietic structures (Maturana & Varela, 1973), including recursively growing and self-replicating ones. Notably, all these types of structures emerge during a single run of the system without requiring any external intervention. Furthermore, they preserve themselves in time by absorbing compounds from their environment and decomposing them step by step, in a process that has a striking resemblance to the metabolism of biological organisms. These structures take the form of recursive algorithms that never reach a normal form (i.e., halting point) as long as sufficient resources are present in their environment. Notably, we found that Turing-completeness was required in combinatory chemistry for representing self-replicating structures, but not for simple autopoietic or recursively growing ones.

This article is organized as follows. First, we describe earlier work in artificial chemistry that is most related to the approach introduced here. Then, we explain the basic workings of combinatory logic and how we adapted it in an artificial chemistry. Third, following earlier work, we discuss how autocatalytic sets can be used to detect emerging phenomena in this system, and propose a novel measure of emergent complexity, which is well adapted to the introduced system. Finally, we describe our experiments showcasing the emergence of complex structures in combinatory chemistry and discuss their implications.

Artificial chemistries are models inspired by natural chemical systems that are usually defined by three different components: a set of possible molecules, a set of reactions, and a reactor algorithm describing the reaction vessel and how molecules interact with each other (Dittrich et al., 2001). In the following discussion, we will focus on the algorithmic chemistries that are the closest to the present work.

AlChemy (Fontana & Buss, 1994) is an artificial chemistry where molecules are given by λ-calculus expressions. λ-calculus is a mathematical formalism that, like Turing machines, can describe any computable function. In AlChemy, pairs of randomly sampled expressions are joined through function application, evaluated, and the corresponding result is added back to the population. To keep the population size bounded, expressions are randomly discarded. Fontana and Buss showed that expressions that computed themselves quickly emerged in this system, which they called level 0 organizations. Furthermore, when these expressions were explicitly prohibited, a more complex organization emerged where every expression in a set was computed by other expressions within the same set (level 1 organizations). Finally, mixing level 1 organizations could lead to higher order interactions between them (level 2 organizations). Yet, this system had some limitations. First, each level of organization was only reached after external interventions. In addition, programs were evaluated using β reductions, which require that they reach a normal form, namely, that there are no more λ-calculus rules that can be applied. Thus, recursive programs, which never reach a normal form, are banned from the system. Here, we use weak reductions instead, allowing the system to compute the time evolution of programs that never reach a normal form. Interestingly, it is exactly in this way that emergent metabolisms are represented. Furthermore, in AlChemy, two processes were introduced as analogues of food and waste, respectively. First, when expressions are combined, they are not removed from the system, allowing the system to temporarily grow in size. Second, expressions that after being combined with existing expressions do not match any λ-calculus reduction rules are removed. Without these processes, complex organizations fail to emerge. Yet, it is not clear under which circumstances these external interventions would no longer be needed for the system to evolve autonomously. Finally, bounding the total number of expressions by randomly removing excess ones creates perturbations to the system that can arbitrarily affect the dynamics. Fontana and Buss (1996) later proposed MC2, a chemistry based on linear logic that addressed some of these limitations (notably, conservation of mass), although we are not aware of empirical work on it.

Here, we propose an artificial chemistry based on combinatory logic. This formalism has been explored before in the context of artificial chemistries by di Fenizio and Banzhaf (2000). This work also introduces conservation laws, even though they rely on decomposing expressions to their individual components, introducing some randomness into the dynamics that here we have avoided. Furthermore, like AlChemy, it reduces expressions until they reach their normal forms, explicitly forbidding recursive and other types of expressions that do not converge.

Other very related artificial chemistries are based on graph rewriting systems. Squirm3 (Hutton, 2002) is a chemistry in which atoms are placed in a 2D grid where they can react with each other, creating or breaking bonds. Interestingly, Hutton (2002) shows that self-reproducing evolvable chains can emerge in this environment when using the right set of reactions, which like in the artificial chemistry here introduced, have intrinsic conservation laws. Yet, it remains unclear which characteristics of those reactions make it possible for this emergence to occur. In this work, we study the hypothesis that self-reproducing metabolisms are linked to recursive programs expressed through a network of autocatalysed reactions endowed with universal computation capabilities. In a different vein, Chemlambda (Buliga, 2020; Buliga & Kauffman, 2014) is a Turing-complete graph rewriting artificial chemistry that allows the encoding of λ-calculus and combinatory logic operators. As such, it is complementary in many ways to the system proposed here. While the original Chemlambda did not consider conservation laws, an extension called Hapax is currently exploring them. Yet, emergent phenomena have not yet been explored under this formalism.

Combinatory logic is a minimalistic computational system that was independently invented by Moses Schönfinkel, John von Neumann, and Haskell Curry (Cardone & Hindley, 2006). Aside from its relevance to Computability Theory, it has also been applied in Cognitive Science as a model for a Language of Thought (Piantadosi, 2021). One of the main advantages of combinatory logic is its formal simplicity while capturing Turing-complete expressiveness. In contrast to other mathematical formalisms, such as λ-calculus, it dispenses with the notion of variables and all the necessary bookkeeping that comes with it. For instance, a function f(x) = 1 + x + y would be nonsensical, and a function-generating system based on λ-calculus would need to have explicit rules to avoid the formation of such expressions. Instead, combinatory logic expressions are built by composing elementary operators called combinators. Here, we restrict combinators to S, K and I, which form a Turing-complete basis.¹ A combinatory logic expression is defined either to be a singleton combinator or recursively, given two expressions x and y, by the application operation (x y). It is important to note that, by convention, application is left-associative and thus, (x y z) and ((x y) z) are equivalent. Given an expression e of the form e = (αXβ) where X is a well-formed subexpression and α and β are some arbitrary left and right context, it can be rewritten in combinatory logic, as follows:
$\alpha(If)\beta \triangleright \alpha f\beta$
$\alpha(Kfg)\beta \triangleright \alpha f\beta$
$\alpha(Sfgx)\beta \triangleright \alpha(fx(gx))\beta$
When (αXβ) matches the left-hand side of any of the rules above, the term X is called a reducible expression or redex. A single expression can contain multiple redexes. If no rule is matched, the expression is said to be in normal form. The application of these rules to rewrite any redex is called a (weak) reduction. For example, the expression (SII(SII)) could be reduced as follows (underlining the corresponding redexes being rewritten): $\underline{(SII(SII))} \triangleright (\underline{I(SII)}(I(SII))) \triangleright (SII\underline{(I(SII))}) \triangleright (SII(SII))$. Thus, this expression, also known as the omega combinator, reduces to itself. We will later see that expressions such as this one will be important for the self-organizing behavior of the system introduced here. In contrast, (SII) is not reducible because S requires three arguments.² Additionally, note that (I(SII)(I(SII))) has two redexes that can be rewritten, namely, the outermost or the innermost I combinators. Even though many different evaluation order strategies have been defined (Pierce, 2002), here we opt for picking a redex at random, both because this is more natural for a chemical system and to avoid limitations that would come from following a fixed deterministic evaluation order.
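
The three rewriting rules and the random choice of redex can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used for the experiments: the tuple encoding of expressions and the helper names `app`, `spine`, `contract`, `redex_paths`, `rewrite_at`, and `reduce_randomly` are assumptions introduced here.

```python
import random

# Assumed encoding: an expression is a combinator name ("S", "K", "I")
# or a pair (function, argument) standing for one application.

def app(*terms):
    """Left-associative application: app(a, b, c) == ((a, b), c)."""
    expr = terms[0]
    for term in terms[1:]:
        expr = (expr, term)
    return expr

def spine(expr):
    """Unwind nested applications into (head combinator, argument list)."""
    args = []
    while isinstance(expr, tuple):
        expr, arg = expr
        args.append(arg)
    return expr, args[::-1]

def contract(expr):
    """Rewrite expr if it is itself a redex; return None otherwise."""
    head, args = spine(expr)
    if head == "I" and len(args) == 1:
        return args[0]
    if head == "K" and len(args) == 2:
        return args[0]
    if head == "S" and len(args) == 3:
        f, g, x = args
        return app(f, x, app(g, x))
    return None

def redex_paths(expr, path=()):
    """Yield the position (tuple of 0/1 child indices) of every redex."""
    if contract(expr) is not None:
        yield path
    if isinstance(expr, tuple):
        for index, child in enumerate(expr):
            yield from redex_paths(child, path + (index,))

def rewrite_at(expr, path):
    """Apply one weak reduction to the redex found at the given position."""
    if not path:
        return contract(expr)
    children = list(expr)
    children[path[0]] = rewrite_at(children[path[0]], path[1:])
    return tuple(children)

def reduce_randomly(expr, steps, rng):
    """Rewrite a uniformly chosen redex at each step, stopping at normal form."""
    for _ in range(steps):
        paths = list(redex_paths(expr))
        if not paths:
            break
        expr = rewrite_at(expr, rng.choice(paths))
    return expr

A = app("S", "I", "I")
omega = app(A, A)  # (SII(SII))
print(reduce_randomly(omega, 3, random.Random(0)))
```

Because the redex is picked at random, three steps do not always lead back to (SII(SII)); the sequence shown in the text corresponds to rewriting the positions `()`, `(0,)`, and `(1,)` in that order with `rewrite_at`.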
One of our main contributions deals with reformulating these reduction rules as reactions in a chemical system with conservation laws. For this, we postulate the existence of a multiset of combinatory logic expressions $P$ that react following reduction rules, plus random condensations and cleavages. In principle, the application of a reduction rule to any expression involves removing one copy of the expression from the multiset and adding back the resulting product of the rule. Note that if we were to apply plain combinatory logic rules to reduce these expressions, the total number of combinators in the system would not be preserved (Lafont, 1997): first, because the application of a reduction rule always removes the first combinator in the redex from the resulting expression; and second, because while the K combinator discards a part of the expression (the argument g), S duplicates its third argument x. Thus, to make a chemical system with conservation laws, we posit that, on one hand, reduction operations can generate one or more by-product expressions. On the other hand, the reduction rules can be applied to more than one expression simultaneously. Therefore, using the + symbol to indicate that multiple expressions are being rewritten when it appears on the left-hand side, or more than one expression is being added back to the multiset when it is on the right-hand side, we define reduce reactions for an expression or substrate (αXβ), as follows:
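$\alpha(If)\beta \Rightarrow \alpha f\beta + I$
(1)
$\alpha(Kfg)\beta \Rightarrow \alpha f\beta + g + K$
(2)
$\underbrace{\alpha(Sfgx)\beta}_{\text{substrate}} + \underbrace{x}_{\text{reactant}} \Rightarrow \underbrace{\alpha(fx(gx))\beta}_{\text{product}} + \underbrace{S}_{\text{by-product}}$
(3)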
An expression in combinatory chemistry is said to be reducible if it contains a combinatory chemistry redex (CC-redex). A CC-redex is a plain combinatory logic redex, except when it involves the reduction of an S combinator, in which case a copy of its third argument x (the reactant) must also be present in the multiset $P$ for it to be a redex in combinatory chemistry. For example, the expression SII(SII) is reducible if and only if the third argument of the combinator S, namely (SII), is also present in the multiset. When a reduction operation is applied, the redex is rewritten following the rules of combinatory logic, removing any reactant from $P$ and adding back to it the product and all by-products, as specified on the right-hand side of the reaction. The type of combinator being reduced gives the reaction its name. For instance, the S-reaction operating on SII(SII) + (SII) removes these two elements from $P$, adding back I(SII)(I(SII)) and S to it. Notably, each of these reduction rules preserves the total number of combinators in the multiset, intrinsically enforcing conservation laws in this chemistry. It is also worth noting that each of these combinators plays a different role in the creation of novel compounds. While K-reactions split the expression, decreasing its total size and complexity, S-reactions create larger and possibly more complex expressions from smaller parts.
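
To make the conservation property concrete, the reduce reactions (1) to (3) can be sketched as updates of a multiset. This is a minimal sketch under stated assumptions: expressions are encoded as nested pairs, the multiset $P$ is a `collections.Counter`, and all function names (`app`, `spine`, `contract`, `reduce_reaction`, `total_combinators`) are hypothetical helpers rather than part of the original system.

```python
from collections import Counter

def app(*terms):
    """Left-associative application: app(a, b, c) == ((a, b), c)."""
    expr = terms[0]
    for term in terms[1:]:
        expr = (expr, term)
    return expr

def spine(expr):
    """Unwind nested applications into (head combinator, argument list)."""
    args = []
    while isinstance(expr, tuple):
        expr, arg = expr
        args.append(arg)
    return expr, args[::-1]

def contract(redex):
    """Return (rewritten, by_products, reactants) for a redex, else None."""
    head, args = spine(redex)
    if head == "I" and len(args) == 1:
        return args[0], ["I"], []
    if head == "K" and len(args) == 2:
        return args[0], [args[1], "K"], []
    if head == "S" and len(args) == 3:
        f, g, x = args
        return app(f, x, app(g, x)), ["S"], [x]
    return None

def subterm(expr, path):
    """Follow a tuple of 0/1 child indices down the expression tree."""
    for index in path:
        expr = expr[index]
    return expr

def replace(expr, path, new):
    """Return expr with the subterm at the given position swapped for new."""
    if not path:
        return new
    children = list(expr)
    children[path[0]] = replace(children[path[0]], path[1:], new)
    return tuple(children)

def reduce_reaction(pool, substrate, path=()):
    """Apply one reduce reaction to the redex at `path`, updating pool in place."""
    step = contract(subterm(substrate, path))
    if step is None:
        raise ValueError("no redex at this position")
    rewritten, by_products, reactants = step
    needed = Counter([substrate] + reactants)
    if any(pool[expr] < n for expr, n in needed.items()):
        raise ValueError("substrate or reactant missing from the multiset")
    pool.subtract(needed)
    pool.update([replace(substrate, path, rewritten)] + by_products)

def total_combinators(pool):
    """Count elementary combinators over the whole multiset."""
    def size(expr):
        return size(expr[0]) + size(expr[1]) if isinstance(expr, tuple) else 1
    return sum(size(expr) * n for expr, n in pool.items())

A = app("S", "I", "I")
pool = Counter({app(A, A): 1, A: 1})
before = total_combinators(pool)
reduce_reaction(pool, app(A, A))  # S-reaction on SII(SII) + (SII)
print(before, total_combinators(pool))
```

The combinator count is the same before and after the S-reaction, and calling `reduce_reaction` on SII(SII) without a free copy of (SII) in the pool raises an error, which mirrors the CC-redex condition.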

In contrast to previous attempts in which expressions were combined and then reduced to normal form (di Fenizio and Banzhaf, 2000; Fontana & Buss, 1994)—thus being forced to exclude expressions that did not reach a normal form—here each reduce reaction corresponds to a single reduction step that can always be computed. For this reason, we do not need to take any precautions to avoid recursive expressions, but instead allow these interesting programs to form part of our system's dynamics.

Completing the set of possible reactions in this chemistry, condensations and cleavages can generate novel expressions through random recombination:
$x+y↔(xy)$
(4)
Condensations correspond to the application operation between x and y, whereas cleavages are the inverse. Note that cleaving (xyz) can only result in (xy) + z because, otherwise, the tree structure would not be preserved.
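
Reaction (4) can be sketched in the same style. The snippet below is an illustration only, assuming expressions are nested pairs and the multiset is a `collections.Counter`; `condense` and `cleave` are hypothetical names.

```python
from collections import Counter

def condense(pool, x, y):
    """x + y -> (x y): remove one x and one y, add their application."""
    needed = Counter([x, y])
    if any(pool[expr] < n for expr, n in needed.items()):
        raise ValueError("both expressions must be present in the multiset")
    pool.subtract(needed)
    pool[(x, y)] += 1

def cleave(pool, expr):
    """(x y) -> x + y: split one copy of expr at its root application."""
    if not isinstance(expr, tuple) or pool[expr] < 1:
        raise ValueError("only a present, non-elementary expression can be cleaved")
    left, right = expr
    pool[expr] -= 1
    pool.update([left, right])

# (xyz) is ((x y) z) under left-associativity, so cleaving it gives (xy) + z.
xyz = (("S", "K"), "I")
pool = Counter({xyz: 1})
cleave(pool, xyz)
print(pool[("S", "K")], pool["I"])
```

Splitting at the root is what keeps the tree structure intact: the pair encoding has no way to detach x from (xyz) without first cleaving off z.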

In combinatory chemistry, computation takes precedence. This means that reduction reactions must happen at much higher rates than those of random recombination. The following section details how this is achieved.

### 4.1 Temporal Evolution

The system is initialized with a tabula rasa state containing only expressions with a single combinator S, K, or I. In this way, we can be sure that any emergent diversity is the consequence of the system's dynamics rather than the outcome of an external intervention. Then, it evolves by sampling reactions following Gillespie's algorithm (Gillespie, 1977, 2007). We note that the time evolution algorithm has changed from an earlier version of this work (Kruszewski & Mikolov, 2020) in favor of this more principled approach.

More precisely, we define a propensity function $a_j(\mathbf{x})$ for each reaction $j$, which computes the unnormalized probability that the reaction $j$ will occur within an infinitesimal time interval given the system's state vector $\mathbf{x}$. The component $x_x$ indicates the number of instances of an expression $x$ that are present in $P$. To define the propensity functions, we make use of reaction rate constants $k_X$, $k_\Pi$, $k_{A_S}$, $k_{A_K}$, $k_{A_I}$, for the cleavage, condensation, and S, K, and I reduction reactions, respectively. Importantly, because in combinatory chemistry computation takes precedence, reaction rates $k_j$ of reduction reactions must be significantly larger than those of random recombinations: $k_{A_Z} \gg k_B$ for $Z \in \{S, K, I\}$ and $B \in \{X, \Pi\}$. The propensity function takes different forms depending on whether the reaction is unimolecular or bimolecular, following the formulation of Gillespie (2007). For unimolecular reactions, such as I and K reductions, and cleavages, $a_j$ takes the form $a_j(\mathbf{x}) = c_j x_{x_1}$ where $x_{x_1}$ is the number of copies of the reaction's substrate $x_1$ in $\mathbf{x}$. For bimolecular reactions like S reductions and condensations with substrate $x_1$ and reactant $x_2$, it takes the form $a_j(\mathbf{x}) = c_j x_{x_1} x_{x_2}$ if $x_1 \neq x_2$ and the form $a_j(\mathbf{x}) = c_j \frac{1}{2} x_{x_1}(x_{x_1} - 1)$ if $x_1 = x_2$, where $x_{x_1}$ and $x_{x_2}$ are the number of expressions of $x_1$ and $x_2$, respectively. In the case of unimolecular reactions, $c_j$ is equal to a reaction rate constant $k_J$ where $J \in \{X, A_K, A_I\}$ is the type of reaction $j$, whereas for bimolecular reactions $c_j = k_J/\Omega$ if $x_1 \neq x_2$ and $2k_J/\Omega$ if $x_1 = x_2$, where $\Omega$ is the volume of the simulated container (Gillespie, 2007) and $J \in \{\Pi, A_S\}$.
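
The three cases of the propensity function can be written out directly. The following is a small sketch of the formulas in the paragraph above; the function names and the convention of passing `n2=None` for the same-species case are assumptions made for illustration.

```python
import math

def unimolecular_propensity(k, n_substrate):
    """a_j = c_j * x_{x1} with c_j = k_J (I and K reductions, cleavages)."""
    return k * n_substrate

def bimolecular_propensity(k, omega, n1, n2=None):
    """Propensity of an S reduction or condensation.

    n1 and n2 are the copy numbers of substrate and reactant. Passing
    n2=None signals that both molecules are copies of the same expression.
    """
    if n2 is None:
        c = 2.0 * k / omega               # c_j = 2 k_J / Omega when x1 = x2
        return c * 0.5 * n1 * (n1 - 1)    # a_j = c_j * (1/2) * x (x - 1)
    c = k / omega                         # c_j = k_J / Omega when x1 != x2
    return c * n1 * n2                    # a_j = c_j * x_{x1} * x_{x2}

# With k = 1 and Omega = 10: 4 x 5 distinct pairs, or 4 copies of one species.
print(bimolecular_propensity(1.0, 10.0, 4, 5))
print(math.isclose(bimolecular_propensity(1.0, 10.0, 4), 1.2))
```

Note that in the same-species case the factor 2 in $c_j$ cancels the factor 1/2 in the count of unordered pairs, so the propensity reduces to $(k_J/\Omega)\, x_{x_1}(x_{x_1} - 1)$.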

Then, the probability of the next reaction being $j$ is $p(j) = a_j(\mathbf{x})/a_0$ where $a_0 = \sum_j a_j(\mathbf{x})$, and the time interval until its occurrence is distributed as an exponential distribution with parameter $a_0$. To sample from this process, we follow the direct method (Gillespie, 1977). For efficiency reasons, we factorize the reaction probability by the kind of the next reaction $J$ being a condensation ($\Pi$), a cleavage ($X$), or a reduction ($A = A_S \cup A_K \cup A_I$), where $A_Z$ with $Z \in \{S, K, I\}$ stands for the set of all possible $Z$ reductions:
$p(j) = \sum_{J \in \{\Pi, X, A\}} p(j \mid J)\, p(J),$
(5)
where $p(J) = a_J(\mathbf{x})/a_0(\mathbf{x})$, $a_0(\mathbf{x}) = \sum_{J \in \{\Pi, X, A\}} a_J(\mathbf{x})$, and $a_J(\mathbf{x}) = \sum_{j \in J} a_j(\mathbf{x})$.
For cleavage and condensation reactions, $a_J$ takes a simpler form that can be efficiently computed by keeping track of the total number of expressions $\sum_x x_x$, whereas for reduce reactions we must explicitly sum over all reactions:
$a_X(\mathbf{x}) = k_X \left( \sum_x x_x - x_S - x_K - x_I \right),$
(6)
$a_\Pi(\mathbf{x}) = \frac{k_\Pi}{\Omega} \left( \sum_x x_x \right) \left( \sum_{x'} x_{x'} - 1 \right),$
(7)
$a_A(\mathbf{x}) = \sum_{j \in A_I} k_I x_{j_1} + \sum_{j \in A_K} k_K x_{j_1} + \sum_{j \in A_S} \frac{k_S}{\Omega} x_{j_1} x_{j_2},$
(8)
where $j_1$ is the substrate of the reduction reaction $j$ and $j_2$ is the reactant in the case of reducing an S combinator.
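
As a sanity check on Equations 6 and 7, the closed forms can be compared with an explicit sum of per-reaction propensities. The sketch below assumes the multiset is a `collections.Counter` keyed by expressions (nested pairs or the strings "S", "K", "I"); all function names are illustrative.

```python
import math
from collections import Counter
from itertools import product

def cleavage_total(pool, k_x):
    """Equation 6: every expression except the elementary combinators can cleave."""
    n_total = sum(pool.values())
    n_elementary = sum(pool[c] for c in ("S", "K", "I"))
    return k_x * (n_total - n_elementary)

def condensation_total(pool, k_pi, omega):
    """Equation 7: closed form that needs only the total number of expressions."""
    n_total = sum(pool.values())
    return k_pi / omega * n_total * (n_total - 1)

def condensation_explicit(pool, k_pi, omega):
    """Sum of per-reaction propensities over all ordered pairs (x, y) -> (x y)."""
    total = 0.0
    for x, y in product(pool, repeat=2):
        if x != y:
            total += (k_pi / omega) * pool[x] * pool[y]
        else:
            total += (2.0 * k_pi / omega) * 0.5 * pool[x] * (pool[x] - 1)
    return total

pool = Counter({"S": 3, "K": 2, ("S", "K"): 4})
print(cleavage_total(pool, 1.0))
print(math.isclose(condensation_total(pool, 1.0, 10.0),
                   condensation_explicit(pool, 1.0, 10.0)))
```

The agreement relies on treating x + y → (xy) and y + x → (yx) as distinct condensation reactions, which is how the ordered-pair sum above is set up.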

Thus, to sample a reaction, we first sample the reaction type J. If it is a cleavage, then we sample one expression according to its concentration, and cleave it into two subexpressions by dividing it at the root. If it is a condensation, then we sample two expressions according to their concentration (the second, after removing one element of the first one), and combine them through the application operator. Finally, if it is a reduction, then we sample one reduce reaction from the space of all possible reduce reactions, with probability proportional to its propensity. In practice, we just need to compute all possible reductions involving the expressions that are present in the system's state.³ The complete algorithm describing the temporal evolution of our system is summarized in Algorithm 1.⁴
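
One step of the direct method can be sketched generically as follows. This is not Algorithm 1 itself: the function name `gillespie_step` and the dictionary of propensities are assumptions, and the same routine could be used first over the three reaction types and then over the reactions within the chosen type.

```python
import random

def gillespie_step(propensities, rng):
    """Sample (waiting time, reaction) with the direct method.

    `propensities` maps each candidate reaction (or reaction type) to its
    propensity a_j. Returns None when no reaction can fire.
    """
    a0 = sum(propensities.values())
    if a0 <= 0.0:
        return None
    tau = rng.expovariate(a0)        # time to next event ~ Exponential(a0)
    threshold = rng.random() * a0    # pick j with probability a_j / a0
    accumulated = 0.0
    chosen = None
    for reaction, a_j in propensities.items():
        accumulated += a_j
        chosen = reaction
        if threshold < accumulated:
            break
    return tau, chosen

rng = random.Random(0)
# Hypothetical type-level propensities a_Pi, a_X, a_A.
tau, kind = gillespie_step({"condensation": 0.5, "cleavage": 0.2, "reduction": 40.0}, rng)
print(kind, tau >= 0.0)
```

With reduction propensities much larger than those of condensation and cleavage, almost every sampled event is a reduction, which is the sense in which computation takes precedence.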

Having described the dynamics of combinatory chemistry, we now turn to discuss how we can characterize emergent structures in this system. For this, we first discuss how autocatalytic sets can be applied for this purpose. Second, we observe that this formalism may not completely account for some emergent structures of interest, and thus, we propose to instead track reactant consumption rates as a proxy metric to uncover the presence of these structures. Finally, we enrich this metric to detect consumption levels that are above chance levels.

### 5.1 Autocatalytic Sets

Self-organized order in complex systems is hypothesized to be driven by the existence of attracting states in the system's dynamics (Kauffman, 1993; Walker & Ashby, 1966; Wuensche et al., 1992). Autocatalytic sets (Kauffman, 1993) were first introduced by Stuart Kauffman in 1971 as one type of such attractors that could help explain the emergence of life in chemical networks. (See Hordijk, 2019, for a comprehensive historical review on the topic.) Related notions are the concept of autopoiesis (Maturana & Varela, 1973), and the hypercycle model (Eigen & Schuster, 1978).

Autocatalytic sets are reaction networks that perpetuate in time by relying on a network of catalyzed reactions, where each reactant and catalyst of a reaction is either produced by at least some reaction in the network, or it is freely available in the environment. This notion was later formalized in mathematical form (Hordijk & Steel, 2004; Hordijk et al., 2015) with the name of reflexively autocatalytic food-generated sets (RAFs). Particularly, a chemical reaction system (CRS) is first defined to denote the set of possible molecules, the set of possible reactions, and a catalysis set indicating which reactions are catalyzed by which molecules. Furthermore, a set of freely available molecules in the environment, called the food set, is assumed to exist. Then, an autocatalytic set (or RAF set) $S$ of a CRS with associated food set F is a subset of reactions, which is:

1. reflexively autocatalytic (RA): Each reaction $r∈S$ is catalyzed by at least one molecule that is either present in F or can be formed from F by using a series of reactions in $S$ itself.

2. food-generated (F): Each reactant of each reaction in $S$ is either present in F or can be formed by using a series of reactions from $S$ itself.
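
The two conditions can be checked mechanically for a small reaction system. The sketch below follows the usual closure-based reading of the definition (the closure of the food set is computed from reactants alone, ignoring catalysts); the triple encoding of reactions and the names `closure` and `is_raf` are assumptions, and the toy molecules are invented for illustration.

```python
def closure(food, reactions):
    """Molecules obtainable from the food set using the given reactions."""
    available = set(food)
    changed = True
    while changed:
        changed = False
        for reactants, products, _catalysts in reactions:
            if set(reactants) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available

def is_raf(food, reactions):
    """Check conditions 1 (RA) and 2 (F) for a non-empty set of reactions."""
    available = closure(food, reactions)
    return bool(reactions) and all(
        set(reactants) <= available              # condition 2: food-generated
        and any(c in available for c in catalysts)  # condition 1: catalyzed
        for reactants, _products, catalysts in reactions
    )

# Toy system: each reaction is (reactants, products, catalysts).
food = {"a", "b"}
r1 = (("a", "b"), ("c",), ("d",))
r2 = (("a", "c"), ("d",), ("c",))
print(is_raf(food, [r1, r2]), is_raf(food, [r2]))
```

In the toy system, the pair of reactions is a RAF set because c and d are both reachable from the food set, whereas the second reaction alone is not, since its reactant c is neither food nor produced.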

### 5.2 Metabolic Structures in Combinatory Chemistry

In combinatory chemistry, all reducing reactions take precedence over random condensations and cleavages, and thus, they proceed at a higher rate than random reactions without requiring definition of a catalyst. Therefore, we say that they are autocatalyzed and note that they all trivially satisfy condition 1. Thus, autocatalytic sets in this system are defined in terms of subsets of reduce reactions in which every reactant is produced by a reduce reaction in the set or is freely available in the environment (condition 2). For example, if we assume that A = (SII) is in the food set, Figure 1 shows a simple autocatalytic set associated with the expression (AA) = (SII(SII)). As shown, a chain of reduce reactions keeps the expression in a self-sustaining loop: When the formula is first reduced by the reaction $r_1$, a reactant A is absorbed from the environment and one S combinator is released. Over the following steps, two I combinators are sequentially applied and released back into the multiset $P$, with the expression returning to its original form. In other words, in one cycle the expression (AA) absorbs one copy of A from the environment, releasing back into it the elementary components obtained as by-products of the reactions. We refer to this process as a metabolic cycle because of its strong resemblance to its natural counterpart. For convenience, we write the just described cycle as $(AA) + A \Rightarrow^* (AA) + \phi(A)$, where $\phi$ is a convenience function that allows us to succinctly represent the decomposition of A as elementary combinators by mapping its argument into $n_S S + n_K K + n_I I$ where $n_S$, $n_K$, and $n_I$ stand for the number of S, K, and I combinators in A, and the $\Rightarrow^*$ symbol indicates that there exists a pathway of reduction reactions from the reactants on the left-hand side to the products on the right-hand side.

Figure 1.

$r_1$–$r_5$ form an autocatalytic set, granted that (SII) belongs to the food set. (SII(SII))'s metabolic cycle starts with $r_1$ reducing the S combinator, while taking (SII) as reactant. Then, the cycle is completed by the reduction of the two identity combinators, in any of the possible orders. Original figure from Kruszewski & Mikolov, 2020, used with permission.
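
The metabolic cycle of (SII(SII)) can be traced programmatically. The sketch below is an illustration under assumptions: expressions are nested pairs, the environment is a `collections.Counter`, the tracked expression is kept outside the pool, and the simple policy "reduce I and K before S" stands in for the cycle described in the text. All function names are hypothetical.

```python
from collections import Counter

def app(*terms):
    """Left-associative application: app(a, b, c) == ((a, b), c)."""
    expr = terms[0]
    for term in terms[1:]:
        expr = (expr, term)
    return expr

def spine(expr):
    """Unwind nested applications into (head combinator, argument list)."""
    args = []
    while isinstance(expr, tuple):
        expr, arg = expr
        args.append(arg)
    return expr, args[::-1]

def contract(redex):
    """Return (rewritten, by_products, reactants) for a redex, else None."""
    head, args = spine(redex)
    if head == "I" and len(args) == 1:
        return args[0], ["I"], []
    if head == "K" and len(args) == 2:
        return args[0], [args[1], "K"], []
    if head == "S" and len(args) == 3:
        f, g, x = args
        return app(f, x, app(g, x)), ["S"], [x]
    return None

def steps(expr):
    """Yield (new_expr, by_products, reactants) for every redex inside expr."""
    own = contract(expr)
    if own is not None:
        yield own
    if isinstance(expr, tuple):
        left, right = expr
        for new, out, used in steps(left):
            yield (new, right), out, used
        for new, out, used in steps(right):
            yield (left, new), out, used

def phi(expr):
    """Decompose an expression into its multiset of elementary combinators."""
    if isinstance(expr, tuple):
        return phi(expr[0]) + phi(expr[1])
    return Counter([expr])

def metabolic_cycle(start, pool, max_steps=20):
    """Reduce I/K before S until `start` reappears; return (consumed, released)."""
    expr, consumed, released = start, Counter(), Counter()
    for _ in range(max_steps):
        options = [s for s in steps(expr) if all(pool[r] > 0 for r in s[2])]
        if not options:
            return None
        expr, out, used = min(options, key=lambda s: len(s[2]))
        pool.subtract(used)
        pool.update(out)
        consumed.update(used)
        released.update(out)
        if expr == start:
            return consumed, released
    return None

A = app("S", "I", "I")
consumed, released = metabolic_cycle(app(A, A), Counter({A: 1}))
print(consumed == Counter({A: 1}), released == phi(A))
```

Under these assumptions the trace consumes exactly one copy of A and releases one S and two I combinators, that is, $\phi(A)$, in agreement with $(AA) + A \Rightarrow^* (AA) + \phi(A)$.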

It should be noted that there are (infinitely) many possible pathways, some of them not necessarily closing the loop. For instance, instead of reducing (SII(I(SII))) through the only available I reduction ($r_4$ in Figure 1), it could also be possible to reduce the full expression by applying the S reduction rule as long as at least one copy of (I(SII)) is present in $P$, yielding (I(I(SII))(I(I(SII)))) as a result. We could continue reducing this expression, first through the two outermost I combinators, thus obtaining (SII(I(I(SII)))), and then through the first S combinator, provided that a copy of (I(I(SII))) is present in $P$, obtaining (I(I(I(SII)))(I(I(I(SII))))) as a result. This outermost-first reduction order could be followed ad infinitum, stacking ever more I combinators within the expression. Nonetheless, this also has a cost, as there must be copies of reactants (I(SII)), (I(I(SII))), …, (I(…(I(SII)))) in the multiset $P$ for these reactions to occur, and longer expressions are normally scarcer. For this reason, when we talk about the metabolic cycle we usually refer to the least effort cycle, in which I and K combinators are reduced before S ones, and where S combinators with shorter or naturally more frequent expressions in the third argument (the reactant) are reduced before longer ones.

While autocatalytic sets provide a compelling formalism to study emergent organization in artificial chemistry, they also leave some blind spots for detecting emergent structures of interest. Such is the case for recursively growing metabolisms. Consider, for instance, e = (S(SI)I(S(SI)I)). This expression is composed of two copies of A = (S(SI)I) applied to itself (AA). As shown in Figure 2, during its metabolic cycle, it will consume two copies of the element A, metabolizing one to perform its computation and appending the other one to itself, thus $(AA) + 2A \Rightarrow^* (A(AA)) + \phi(A)$. As time proceeds, the same computation will occur recursively, thus $(A(AA)) + 2A \Rightarrow^* (A(A(AA))) + \phi(A)$, and so on. While this particular behavior cannot be detected through autocatalytic sets, because the resulting expression is not exactly equal to the original one, it still involves a structure that preserves its functionality in time.

Figure 2.

One of the possible pathways in the reduction of the tail-recursive structure (AA) with A = (S(SI)I). It appends one A to itself by metabolizing another copy absorbed from the environment. Original figure from Kruszewski & Mikolov, 2020, used with permission.

Moreover, while the concept of autocatalytic set captures both patterns that perpetuate themselves in time and patterns that also multiply their numbers, it does not explicitly differentiate between them. A pattern with a metabolic cycle of the form $(AA) + A \Rightarrow^* (AA) + \phi(A)$ (as in Figure 1) keeps its own structure in time by metabolizing one A in the food set, but it does not self-reproduce. We call such patterns simple autopoietic (Maturana & Varela, 1973). In contrast, for a pattern to be self-reproducing it must create copies of itself that are later released as new expressions in the environment. For instance, consider a metabolic cycle in Figure 3 with the form $(AA) + 3A \Rightarrow^* 2(AA) + \phi(A)$. This structure creates a copy of itself from two freely available units of A and metabolizes a third one to carry out the process.

Figure 3.

Metabolic cycle (showing one of the possible pathways) of a self-reproducing structure that emerges from the dynamics of combinatory chemistry. Starting from (AA), where A = (SI(S(SK)I)), it acquires three copies of A from its environment and uses two to create a copy of itself, metabolizing the third one to carry out the process. Original figure from Kruszewski & Mikolov, 2020, used with permission.
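
The stoichiometry of the recursively growing and the self-reproducing structures can be checked with a short simulation. This is a sketch under assumptions, not the experimental code: expressions are nested pairs, the environment is a `collections.Counter` holding a fixed supply of A, the tracked expression is kept outside the pool, and reactions with fewer reactants are tried first as a stand-in for the least effort pathway. All names are hypothetical.

```python
from collections import Counter

def app(*terms):
    """Left-associative application: app(a, b, c) == ((a, b), c)."""
    expr = terms[0]
    for term in terms[1:]:
        expr = (expr, term)
    return expr

def spine(expr):
    """Unwind nested applications into (head combinator, argument list)."""
    args = []
    while isinstance(expr, tuple):
        expr, arg = expr
        args.append(arg)
    return expr, args[::-1]

def contract(redex):
    """Return (rewritten, by_products, reactants) for a redex, else None."""
    head, args = spine(redex)
    if head == "I" and len(args) == 1:
        return args[0], ["I"], []
    if head == "K" and len(args) == 2:
        return args[0], [args[1], "K"], []
    if head == "S" and len(args) == 3:
        f, g, x = args
        return app(f, x, app(g, x)), ["S"], [x]
    return None

def steps(expr):
    """Yield (new_expr, by_products, reactants) for every redex inside expr."""
    own = contract(expr)
    if own is not None:
        yield own
    if isinstance(expr, tuple):
        left, right = expr
        for new, out, used in steps(left):
            yield (new, right), out, used
        for new, out, used in steps(right):
            yield (left, new), out, used

def run(expr, pool, max_steps=50):
    """Reduce (I/K before S) until no reaction is possible with this pool."""
    consumed, released = Counter(), Counter()
    for _ in range(max_steps):
        options = [s for s in steps(expr) if all(pool[r] > 0 for r in s[2])]
        if not options:
            break
        expr, out, used = min(options, key=lambda s: len(s[2]))
        pool.subtract(used)
        pool.update(out)
        consumed.update(used)
        released.update(out)
    return expr, consumed, released

# Recursively growing structure (Figure 2): A = (S(SI)I), supplied with 2 A.
G = app("S", app("S", "I"), "I")
grown, _, by_products = run(app(G, G), Counter({G: 2}))
print(grown == (G, (G, G)), by_products == Counter({"S": 2, "I": 2}))

# Self-reproducing structure (Figure 3): A = (SI(S(SK)I)), supplied with 3 A.
R = app("S", "I", app("S", app("S", "K"), "I"))
pool = Counter({R: 3})
final, used, _ = run(app(R, R), pool)
print(final == (R, R), pool[(R, R)], used[R])
```

In this trace the growing structure ends as (A(AA)) after releasing $\phi(A)$, while the self-reproducing one returns to (AA) and leaves one extra copy of (AA) in the pool, released as a by-product of a K reduction.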

### 5.3 Metrics for Detecting Emergence

All structures identified in the previous section have in common the need to absorb reactants from the environment to preserve themselves in homeostasis. Furthermore, because they follow a cyclical process, they will continuously consume the same types of reactants. Thus, we propose tracking reactant consumption as a proxy metric for the emergence of structures. Specifically, we note that the only operation that allows an expression to incorporate a reactant into its own body is the reduction of the S combinator, and thus we count the number of reactants x consumed by expressions of the form α(Sfgx)β in the time interval [t : t + δ), normalized by the total number of reactants consumed in the same interval, and denote it $O_{t:t+\delta}(x)$, or simply O(x).
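As a rough illustration of how O(x) could be computed, the following Python sketch assumes a hypothetical event log holding one `(time, reactant)` pair per S reduction. The log format and the name `consumption_rates` are assumptions made for this example and are not taken from the released simulator.

```python
from collections import Counter

def consumption_rates(events, t, delta):
    """Normalized consumption rate O(x) over the half-open window [t, t + delta).

    `events` is a hypothetical log: an iterable of (time, reactant) pairs,
    one entry for every reactant consumed by an S reduction.
    """
    counts = Counter(x for time, x in events if t <= time < t + delta)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {x: n / total for x, n in counts.items()}

# Usage on a toy log: two I and one (SII) consumed inside [0, 1).
log = [(0.1, "I"), (0.5, "I"), (0.7, "(SII)"), (1.2, "K")]
rates = consumption_rates(log, t=0.0, delta=1.0)
```

Because the rates are normalized within the window, they sum to one over all reactants consumed in that interval.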

Indeed, this metric was used in a previous version of this work (Kruszewski & Mikolov, 2020) to detect emergent structures. Yet, it has the problem that it is also sensitive to reactants that are consumed at high rates just because they are very frequent, and as a consequence, expressions containing them as a third argument to an S combinator are expected to be common as well, inflating this metric for uninteresting reactions.

For this reason, we propose measuring the (positive) pointwise information$^{5}$ I(x) associated with observing a consumption rate O(x) for a reactant x in contrast to its consumption rate if the process were driven by chance only, R(x):
$I(x) \overset{\text{def}}{=} \max\left(\log \frac{O(x)}{R(x)},\; 0\right)$
(9)
We define R(x) as the relative frequency of any expression in a hypothetical process where no reduce reactions are present, but instead only random mixing given by random collisions and cleavages. To model this process, we follow the formulation of the reaction kinetic equations given by Fellermann et al. (2017):
$\frac{dx_x}{dt} = \frac{c_X}{c_0}\left[\sum_{\substack{y,z \\ (xy)=z \\ (yx)=z}} x_z - \sum_{\substack{y,z \\ (yz)=x}} x_x\right] + \frac{c_\Pi}{c_0}\left[\sum_{\substack{y,z \\ (yz)=x}} x_y x_z - \sum_{\substack{y,z \\ (xy)=z \\ (yx)=z}} x_x x_y\right],$
(10)
where $c_X = k_X$, $c_\Pi = k_\Pi/\Omega$, and $c_0 = a_0(x)$ is the partition function, and define R(x) as the normalized equilibrium concentration $x_x^*$ of expression x under these random kinetics:
$R(x) \overset{\text{def}}{=} \frac{x_x^*}{\sum_x x_x^*}.$
(11)
At equilibrium, Equation 10 admits the following solution:
$x_x^* = \frac{c_X}{c_\Pi} e^{-b|x|},$
(12)
where |x| is the length of expression x and b is a constant that depends on the boundary conditions. In particular, we ask for the initial mass of the system, represented by the dimensionless constant M, to be conserved in the equilibrium distribution by equating it to the total number of combinators:
$\sum_x x_x^* |x| = M.$
(13)
We can rewrite this equation by first substituting $x_x^*$ according to Equation 12, and then replacing the sum over expressions by a sum over lengths. Noting that there are $k^n C_{n-1}$ expressions of length n, where k = 3 is the number of different possible combinators and $C_n$ is the nth Catalan number, this equation becomes:
$\frac{c_X}{c_\Pi} \sum_{n=1}^{\infty} k^n C_{n-1} e^{-bn} n = M.$
(14)
After solving this equation for b, we obtain $b = \log\left(2k + 2k\sqrt{1 + \left(\frac{c_X}{c_\Pi}\right)^2 \frac{1}{4M^2}}\right)$. Finally, we compute the normalization constant for the distribution $x^*$:
$\sum_x x_x^* = \frac{c_X}{c_\Pi} \sum_{n=1}^{\infty} k^n C_{n-1} e^{-bn} = \frac{c_X}{2c_\Pi}\left(1 - \sqrt{1 - 4ke^{-b}}\right),$
(15)
completing the definition of I(x). We refer to Appendix 1 for all the computations verifying these claims. In the following section, we show its effects experimentally.
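To make the chain from Equation 9 to Equation 15 concrete, here is a minimal Python sketch that evaluates b, R(x), and I(x) from the closed forms above. The function names and the single parameter `c_ratio` (standing for the ratio of c_X to c_Π) are choices made for this example, not identifiers from the authors' code.

```python
from math import exp, log, sqrt

def decay_constant(k, M, c_ratio):
    """Closed form for b obtained by solving Equation 14 (c_ratio = c_X / c_Pi)."""
    return log(2 * k + 2 * k * sqrt(1 + c_ratio**2 / (4 * M**2)))

def chance_rate(length, k, M, c_ratio):
    """R(x) from Equations 11, 12 and 15 for an expression with `length` combinators."""
    b = decay_constant(k, M, c_ratio)
    x_star = c_ratio * exp(-b * length)                       # Equation 12
    norm = c_ratio / 2 * (1 - sqrt(1 - 4 * k * exp(-b)))      # Equation 15
    return x_star / norm

def pointwise_information(observed, length, k, M, c_ratio):
    """I(x) from Equation 9; an unobserved reactant (O(x) = 0) is mapped to 0."""
    if observed <= 0:
        return 0.0
    return max(log(observed / chance_rate(length, k, M, c_ratio)), 0.0)

# Usage: a three-combinator reactant observed at a 5% consumption rate,
# with k = 3 combinators, M = 10^4 and c_X / c_Pi = M.
info = pointwise_information(0.05, length=3, k=3, M=10**4, c_ratio=10**4)
```

Since R(x) depends on x only through its length, all reactants of equal size share the same chance baseline under this model.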

### 6.1 Metrics

We began by testing the effect of the proposed information metric on a system with $M = 10^4$ uniformly distributed S, K, and I combinators, simulated until it reached time T = 1,000. Candidate reactions are sampled by 10 threads working simultaneously. For all our experiments, we used $k_\Pi = k_X = 1$, $k_K = k_I = 10^4$, $k_S = 10^6$ and fixed the dimensionless constant Ω = M. We leave a thorough exploration of parameter values for future work. With these parameters, the expressions associated with random kinetic dynamics become $b = \log\left((2+\sqrt{5})k\right)$ and $R(x) = \frac{1}{2}(3+\sqrt{5})\left((2+\sqrt{5})k\right)^{-|x|}$. All curves are smoothed by locally averaging every data point at time t with those in the interval [t − 10, t + 10].
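The smoothing step can be sketched as a centered moving average. The snippet below assumes one sample per unit of time and truncates the window at both ends of the series; the text does not specify either detail, so both are assumptions of this example.

```python
def smooth(series, radius=10):
    """Average each point at index t with its neighbours in [t - radius, t + radius].

    Assumes `series` holds one value per unit of time; windows that would
    extend past either end of the series are truncated (an assumption).
    """
    smoothed = []
    for t in range(len(series)):
        lo = max(0, t - radius)
        hi = min(len(series), t + radius + 1)
        window = series[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed
```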

We first present the raw consumption rates O(x), which are displayed in Figure 4(a). As shown, some of the most frequently consumed reactants include the atomic combinators S, K, and I, and some of their binary compositions. Binary reactants such as (KI) and atomic ones such as I do not form part of any stable structure, and the expressions consuming them are produced by chance. Yet, they are used with considerable frequency because S combinators are more likely to be applied to shorter arguments than longer ones. For this reason, the consumption of I is considerably higher than the consumption of (KI). However, even though by the same argument the consumption rate of A = (SII) should be below that of binary reactants, self-organization into autopoietic patterns drives the usage of this reactant above what would be expected if chance were the only force at play. Indeed, the curve corresponding to the consumption of the reactant A = (SII) is associated with the autopoietic pattern (SII(SII)), composed of two copies of this reactant, and a metabolic cycle of the form (AA) + A ⇒* (AA) + ϕ(A), as shown in Figure 1. Nonetheless, (S(SI)I), used by the structure in Figure 2, is crowded at the bottom with the binary reactants in the consumption rate plot.

Figure 4.

Reactant consumption metrics for the most consumed reactants x in a simulation with 10k combinators.


When applying the proposed information metric I(x), the curves corresponding to emerging structures rise to the top of the graph, whereas all other reactants, which are mostly driven by random generation, are pushed to the bottom, as shown in Figure 4(b). This experiment therefore showcases the usefulness of this metric in separating consumption rates propelled by emergent structures from uninteresting fortuitous ones.

### 6.2 Emergent Metabolic Structures

For the next part, we studied some of the emergent structures in a system initialized with a tabula rasa state consisting of $M = 10^6$ evenly distributed S, K, and I combinators simulated for 1,000 units of time.

In an earlier version of this work (Kruszewski & Mikolov, 2020), we had used a supplementary mechanism called reactant assemblage, through which we “fed” emerging structures with their required reactants to allow them to be spotted through sheer reactant consumption rates in a relatively small system with only 10,000 combinators. Here, we simplified the model and dropped the need for this mechanism, thanks both to simulating a larger system with 1M combinators and to the usage of the re-weighting metric presented in Metrics for Detecting Emergence, above, which allowed us to spot emergent complex structures even at very low levels of reactant consumption rates.

We began by analyzing some general metrics on one given run of the system. First, and in agreement with previous work (Meyer et al., 2009), we find that there is a tendency for the system to create increasingly longer expressions, as shown by the length of the largest expression in the system (Figure 5(a)). We also count the number of distinct expressions present in the system at any given time, which we display in Figure 5(b). As can be seen, diversity explodes at the beginning, driven by the random recombination of the elementary combinators, peaking very early on, then decreasing fast at first, and then slower after about time 200. This behavior is consistent with a system that self-organizes into attracting states dominated by fewer, but more frequent expressions. Again, this result agrees with previous work that has shown that diversity decreases over time on a number of ACs (Banzhaf et al., 1996; Dittrich & Banzhaf, 1998; Dittrich et al., 2001). However, we note that in contrast with many of these systems that were initialized with random elements, thus, maximizing diversity from the start, here the initial diversity is an emergent property of the system dynamics as it is only initialized with three different elements, namely, the singleton expressions S, K, and I. Next, we note that the proportion of reducing vs. random recombination reactions is an emergent property of the system which depends on the number of reducible expressions that are present in its state at any given time. As can be seen in Figure 5(c), this rate, which necessarily starts at 0, increases sharply in the beginning, reaching slightly more than 30%. Then, it starts to slowly decrease, but remains always above 25% during the studied period.

Figure 5.

Global metrics for a system initialized with 1M uniformly distributed S, K, and I combinators.
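The three global quantities discussed above (longest expression, number of distinct expressions, and share of reduce reactions) could be computed along the following lines. This is only a sketch: it assumes the system state is available as a multiset of expression strings and that reaction counts are tracked separately, neither of which is specified in the text, and the name `global_metrics` is hypothetical.

```python
from collections import Counter

def length(expr):
    """Number of combinators in an expression written as a string, e.g. '(SII)' -> 3."""
    return sum(ch in "SKI" for ch in expr)

def global_metrics(state, n_reduce, n_total):
    """Summarize a state given as a Counter mapping expression strings to copy counts.

    `n_reduce` and `n_total` are the (assumed) counts of reduce reactions and
    of all reactions executed in the interval of interest.
    """
    present = [expr for expr, copies in state.items() if copies > 0]
    return {
        "max_length": max((length(e) for e in present), default=0),
        "diversity": len(present),
        "reduce_fraction": n_reduce / n_total if n_total else 0.0,
    }

# Usage on a toy state with two distinct expressions present.
state = Counter({"(SII)": 2, "S": 5, "(SII(SII))": 0})
metrics = global_metrics(state, n_reduce=30, n_total=100)
```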


However, it is unclear from these results whether there are emergent complex structures that act as attractors, or if a different explanation for these outcomes is at play. To answer this question, we turned to the reactant consumption rates weighted by Equation 9 to detect whether specific reactants were more prominently used by some emergent structures.

Results are shown in Figure 6(a) for a few selected reactants that highlight the emergence of different types of structures, including simple autopoietic, recursively growing, and self-reproducing ones. Interestingly, they can emerge at different points in time, co-exist, or be driven to extinction. In parallel, Figure 6(b) shows the number of copies of these reactants available at each discrete time interval of unit length.

Figure 6.

Simulation results for 1M uniformly selected S, K, and I combinators for 1,000 generations, on a set of manually chosen reactants.


While there are (infinitely) many possible expressions that can consume a given type of reactant, only a few of them will correspond to emergent metabolisms. In general, we observed that expressions that consume any given reactant A are typically composed of multiple juxtaposed copies of this reactant in an expression of the form (AA). This is linked to the fact that in order to express recursive functions in combinatory logic, the function (in this case denoted by A) must take itself as its own argument, but this particularity also confirms the old adage: “Tell me what you eat and I will tell you what you are.”

The first curve in Figure 6(a) (in the order of the legend) corresponds to the reactant A = (SII), associated with the simple autopoietic structure of Figure 1. The consumption rates are much more stable than those reported in Figure 4(b), which belonged to a system that was 100 times smaller. Furthermore, the number of copies of this reactant decreases sharply at the beginning, but then, intriguingly, slowly increases again until reaching a concentration of about 200 units in total.

The second reactant in the plot, (AA) (where A = (SII)) is not actually consumed by a metabolism. Instead, it is consumed by one of the two possible reductions of the expression (A(AA)) = (SII(SII(SII))), which reduces to (AA(AA)) = (SII(SII)(SII(SII))). This new structure is yet another autopoietic structure that is composed of two copies of (AA) = (SII(SII)) that persist in time by consuming the (SII) reactant independently of each other.

The next three reactants in Figure 6(a) correspond to recursively growing structures. The first one uses the reactant A = (S(SI)I) and follows a right-branching cycle that linearly increases the size of the structure: (AA) + 2A ⇒* (A(AA)) + ϕ(A) (Figure 2). The second one, with reactant A = (S(SII)I), also grows recursively, although with a left-branching structure: (AA) + 2A ⇒* (AAA) + ϕ(A). Third, there is the reactant A = (S(SSI)K), which is associated with a binary-branching recursive structure.$^{6}$

Finally, the curve of the reactant A = (SI(S(SK)I)) corresponds to the emergence of a self-reproducing structure, following a cycle of the form (AA) + 3A ⇒* 2(AA) + ϕ(A), thus duplicating itself after metabolizing one copy of the reactant in the process. Interestingly, this structure emerges not only through the effect of random recombination, but also thanks to self-organization. Instead of being a product of a random combination of two copies of the reactant A, it often emerges after the condensation of other reactants, such as (S(SI(SI))(SI)) and (S(SK)I), (SI(SI(SI))) and (S(SK)I), or (SII) and (SI(S(SK)I)), among other possibilities that induce a chain of reduction reactions that result in producing at least one copy of (AA).

It is worth noting that one cycle of this structure's metabolism requires three copies of the reactant. When there are none in the environment, the structure cannot proceed with its metabolism and this structure is vulnerable to being cleaved or being condensed with an expression that will cause it to stop functioning normally. Because of the rare supply of its six-combinator-long reactant, plus the fact that all existing structures compete with one another, the structure will fall into extinction at about time t = 200, although it will make a short comeback at time t = 400, when only two reproduction cycles are completed, before falling back into oblivion. Nonetheless, we speculate that in larger systems the population will recover from periods of resource scarcity by following the periodic dynamics originally proposed by Lotka (1910), thus allowing these structures to perpetuate in time.

Notably, recursive structures also experience low resource conditions starting at time t = 200. However, they might be able to cope with conditions of low resources more effectively because their repetitive structure allows them to be cleaved and still conserve their function. For instance, A(AA) ⇒ A + (AA) still leaves a functioning (AA) structure. When new copies of A become available either through the random condensation of combinators released by every computed reduction or by some other process, they can consume them and grow back again.

### 6.3 Other Bases

Thus far, we have explored emergent structures on systems composed of S, K, and I combinators. Because one of the main goals of this work is linking the emergence of metabolic structures to the core computational properties of the underlying chemistry, we explored other possible (smaller) bases.

First of all, we note that using K or I combinators by themselves would not produce any meaningful structure, nor would a combination thereof. Any expression formed out of these combinators can only decay into binary expressions at most. In the case of the S combinator, while it still allows expressions that can be reduced infinitely, such as (SSS(SSS)(SSS)), there are no expressions that do so by continually consuming the same reactant. This is because, as shown by Waldmann (2000), there are no expressions X composed only of the S combinator such that X ⇒* (αXβ). Therefore, autocatalytic sets are not possible in this environment. For all of these reasons, it is not meaningful to look at simulations containing only one of the S, K, or I combinators.

The only two possible remaining subsets of combinators are the pairs SI and SK, with only the latter being Turing-complete. We again ran experiments in which only one pair of combinators was present. We used $10^4$ combinators for the SI basis, and $10^6$ for the SK one. Figure 7(a) shows the information traces for the reactant (SII), consumed by the simple autopoietic structure in Figure 1, and (S(SI)I), consumed by the structure in Figure 2, corroborating that these structures also emerge when only S and I combinators are present. In Figure 7(b), we also note that a homologous autopoietic structure emerges with the SK basis, as shown by the consumption of the reactant A = (S(SK)(SKK)). This structure has a metabolic cycle of the form (AA) + 3A ⇒* (AA) + 2A + ϕ(A), thus needing to absorb two extra copies of the reactant to perform the computation, even if they are later released unchanged. Additionally, recursively growing structures can be spotted in the SK basis, as shown by the consumption of A = (S(SSK)), which is associated with a metabolic cycle of the form (AAA) + 3A + AA ⇒* (SSKA(AAA)) + 4S + K + AA, thus growing and incorporating SSK as a prefix in the process. Interestingly, AA is absorbed and released intact, so it could be construed as an emergent catalyst for the reaction: Even though we can interpret each reduce reaction to be autocatalyzed, reaction chains can have emergent properties, such as in this case, where a reactant is just used to complete the metabolic cycle and then released.

Figure 7.

Pointwise information for a selected set of reactants on different combinator bases.


Finally, we note that while we have not yet witnessed an emergent self-reproducing structure using the SK basis only, it can in fact be represented with the expression (AA) where A = (S(SK)(S(SK)(SKK))), and having as metabolic cycle (AA) + 5A ⇒* 2(AA) + A + KA + 5S + 3K. However, finding it requires discovering a considerably longer expression than the one in the SKI basis, and it consumes much longer reactants, which might explain why we have not yet found it to emerge in our simulations at the current scale. Nonetheless, it is worth noting that in the incomplete SI basis it would not be possible to represent a self-reproducing metabolic cycle of the form X ⇒* 2X + … for any X because this would ask for a reaction capable of producing as a by-product an arbitrarily long expression that reduces to X at some point during the cycle. Yet, S and I combinators have only themselves as by-products of their respective reactions and cannot fulfill this requisite.
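Because the chemistry conserves combinators, any proposed metabolic cycle can be sanity-checked by counting S, K, and I on both sides. The sketch below does this for the SK self-reproducing cycle quoted above; representing each side as a list of `(copies, expression)` pairs is an assumption of this example.

```python
from collections import Counter

def total_combinators(side):
    """Count S, K and I over one side of a reaction.

    `side` is a list of (copies, expression) pairs, where each expression
    is a plain string such as '(S(SK)(SKK))'.
    """
    totals = Counter()
    for copies, expr in side:
        for ch in expr:
            if ch in "SKI":
                totals[ch] += copies
    return totals

A = "(S(SK)(S(SK)(SKK)))"
AA = f"({A}{A})"

# (AA) + 5A  =>*  2(AA) + A + KA + 5S + 3K
lhs = [(1, AA), (5, A)]
rhs = [(2, AA), (1, A), (1, f"(K{A})"), (5, "S"), (3, "K")]
is_balanced = total_combinators(lhs) == total_combinators(rhs)
```

A balanced count is a necessary condition only: it confirms that mass is conserved, not that a reduction path between the two sides exists.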

We have introduced combinatory chemistry, an algorithmic artificial chemistry based on combinatory logic. Even though it has relatively simple dynamics, it gives rise to a wide range of autopoietic structures, including recursively growing and self-reproducing ones. These structures feature reaction cycles that bear a striking resemblance to natural metabolisms. All of them take the form of recursive algorithms that continually consume specific resources, incorporating them into their structure and decomposing them to perform their function. Thanks to combinatory logic being Turing-complete, the presented system can theoretically represent patterns of arbitrary complexity. Furthermore, we have argued that in the context of the SKI basis, this computational universality property is both necessary and sufficient to represent self-reproducing patterns, while also showing them to emerge at least in the case where all three combinators are present. On the other hand, a non-universal basis consisting only of S and I combinators can still give rise to simple autopoietic and recursively growing structures.

The proposed system does not need to start from a random set of initial expressions to kick-start diversity. Instead, this initial diversity is the product of the system's own dynamics, as it is only initialized with elementary combinators. In this way, we can expect that this first burst of diversity is not just a one-off event, but it is deeply embedded into the mechanics of the system, possibly allowing it to keep on developing novel structures continually.

To conclude, we have introduced a simple model of emergent complexity in which self-reproduction emerges autonomously from the system's own dynamics. In the future, we will seek to apply it to explain the emergence of evolvability, one of the central questions in Artificial Life. We believe that the simplicity of the model, along with the encouraging results presently obtained and the creativity obtained from balancing computation with random recombination to search for new forms, leaves it in good standing to tackle the many challenges that lie ahead.

We would like to thank the two anonymous reviewers for their thorough and insightful feedback, which significantly improved earlier versions of this work.

### A1.1 Equilibrium Distribution

Here we show that Equation 12 corresponds to the equilibrium distribution of the process defined by Equation 10. Plugging Equation 12 into Equation 10 we obtain:
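$\frac{dx_x^*}{dt} = \frac{c_X}{c_0}\left[\sum_{\substack{(xy)=z \\ (yx)=z}} \frac{c_X}{c_\Pi} e^{-b|z|} - \sum_{(yz)=x} \frac{c_X}{c_\Pi} e^{-b|x|}\right] + \frac{c_\Pi}{c_0}\left[\sum_{(yz)=x} \frac{c_X^2}{c_\Pi^2} e^{-b(|y|+|z|)} - \sum_{\substack{(xy)=z \\ (yx)=z}} \frac{c_X^2}{c_\Pi^2} e^{-b(|x|+|y|)}\right].$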
(16)
Noting that if z = (xy), then |z| = |x| + |y|; that an expression $x = (x_l x_r)$ can only be cleaved into $x_l$ and $x_r$, while, vice versa, only the condensation of $x_l$ and $x_r$ can form x; that there are $k^n C_{n-1}$ possible expressions of length n (where $C_n$ stands for the nth Catalan number and k = 3 is the number of combinators); and that we must sum twice the factors corresponding to expression z that can be formed either as (xy) or as (yx), we have:
$\frac{dx_x}{dt} = \frac{1}{c_0}\left[\sum_{n=1}^{\infty} \frac{c_X^2}{c_\Pi}\, 2 \times k^n C_{n-1} e^{-b(|x|+n)} - \frac{c_X^2}{c_\Pi} e^{-b|x|}\right] + \frac{1}{c_0}\left[\frac{c_X^2}{c_\Pi} e^{-b|x|} - \sum_{n=1}^{\infty} \frac{c_X^2}{c_\Pi}\, 2 \times k^n C_{n-1} e^{-b(|x|+n)}\right].$
(17)
This expression evaluates to 0 as long as the series converges, which we can assess using the ratio test, and the identity $C_{n+1} = \frac{2(2n+1)}{n+2} C_n$:
$\lim_{n\to\infty} \frac{k^{n+1} C_n \exp\left(-b(|x|+n+1)\right)}{k^n C_{n-1} \exp\left(-b(|x|+n)\right)} = \frac{2k(2n-1)}{n+1} e^{-b} < 1.$
(18)
Thus, if $b > \log(4k)$, then Equation 12 defines the equilibrium distribution.

### A1.2 Boundary Conditions

Here, we derive the value of b by solving Equation 14. We start by showing that the following identity holds:
$\sum_{n=1}^{\infty} k^n C_{n-1} e^{-bn} n = \frac{ke^{-b}}{\sqrt{1 - 4ke^{-b}}}.$
(19)
Rearranging the terms of the series, we have:
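$\sum_{n=1}^{\infty} k^n C_{n-1} e^{-bn} n = \sum_{n=1}^{\infty} C_{n-1} e^{-(b-\log k)n} n = \sum_{n=1}^{\infty} C_{n-1} a^n n = a \sum_{n=1}^{\infty} C_{n-1} a^{n-1} n = a \sum_{m=0}^{\infty} C_m a^m (m+1),$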
(20)
where $a = e^{-(b-\log k)}$, m = n − 1. Then, using first the definition $C_n = \frac{1}{n+1}\binom{2n}{n}$ and second the generating function for central binomial coefficients (Lehmer, 1985), we obtain:
$\sum_{m=0}^{\infty} C_m a^m (m+1) = \sum_{m=0}^{\infty} \binom{2m}{m} a^m = \frac{1}{\sqrt{1-4a}},$
(21)
with |a| < 1/4. Finally, substituting Equation 21 into Equation 20 and expanding a, we have:
$\sum_{n=1}^{\infty} k^n C_{n-1} e^{-bn} n = e^{-(b-\log k)} \frac{1}{\sqrt{1 - 4e^{-(b-\log k)}}} = \frac{ke^{-b}}{\sqrt{1 - 4ke^{-b}}},$
(22)
with $b > \log(4k)$. Next, we substitute Equation 22 into Equation 14,
$\frac{c_X}{c_\Pi} \frac{ke^{-b}}{\sqrt{1 - 4ke^{-b}}} = M$
(23)
and solve for b to obtain:
$b = \log\left(2k + 2k\sqrt{1 + \left(\frac{c_X}{c_\Pi}\right)^2 \frac{1}{4M^2}}\right).$
(24)
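The algebra between Equations 23 and 24 can be spelled out as follows. The shorthands $u = ke^{-b}$ and $r = c_X/c_\Pi$ are introduced only for this sketch and do not appear elsewhere in the article. Squaring Equation 23 gives a quadratic in u:

$r^2 u^2 = M^2 (1 - 4u) \;\Longrightarrow\; r^2 u^2 + 4M^2 u - M^2 = 0.$

Taking the positive root and rationalizing,

$u = \frac{-2M^2 + M\sqrt{4M^2 + r^2}}{r^2} = \frac{M}{2M + \sqrt{4M^2 + r^2}} = \frac{1}{2 + 2\sqrt{1 + \frac{r^2}{4M^2}}}.$

Since $b = \log(k/u)$, this yields $b = \log\left(2k + 2k\sqrt{1 + \frac{r^2}{4M^2}}\right)$, which is Equation 24. Note that $u < 1/4$ for any finite M, so the convergence condition $b > \log(4k)$ is satisfied.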

### A1.3 Normalizing Constant

Next, we compute the normalizing constant $\sum_x x_x^*$. We start by showing the derivation for the following identity:
$\sum_{n=1}^{\infty} k^n C_{n-1} e^{-bn} = \frac{1}{2}\left(1 - \sqrt{1 - 4ke^{-b}}\right).$
(25)
Using this time the generating function for Catalan numbers (Davis, 2006),
$\sum_{n=0}^{\infty} C_n a^n = \frac{1 - \sqrt{1-4a}}{2a},$
(26)
we follow an analogous argument to the one above:
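$\sum_{n=1}^{\infty} k^n C_{n-1} e^{-bn} = \sum_{n=1}^{\infty} C_{n-1} e^{-(b-\log k)n} = \sum_{n=1}^{\infty} C_{n-1} a^n = a \sum_{n=1}^{\infty} C_{n-1} a^{n-1} = a \sum_{m=0}^{\infty} C_m a^m = \frac{1 - \sqrt{1-4a}}{2} = \frac{1 - \sqrt{1 - 4ke^{-b}}}{2},$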
(27)
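where $a = e^{-(b-\log k)}$, m = n − 1. Thus, using again that there are $k^n C_{n-1}$ expressions of length n:
$\sum_x x_x^* = \frac{c_X}{c_\Pi} \sum_{n=1}^{\infty} k^n C_{n-1} e^{-bn} = \frac{c_X}{2c_\Pi}\left(1 - \sqrt{1 - 4ke^{-b}}\right).$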
(28)
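The counting argument and the two series identities used in this appendix can be checked numerically. The Python sketch below counts expressions by recursing over binary splits and sums the series term by term using the Catalan ratio $C_n / C_{n-1} = 2(2n-1)/(n+1)$; the helper names and the truncation at 5,000 terms are choices made for this example.

```python
from functools import lru_cache
from math import comb, sqrt

def catalan(n):
    """nth Catalan number."""
    return comb(2 * n, n) // (n + 1)

@lru_cache(maxsize=None)
def count_expressions(n, k=3):
    """Number of expressions with n combinators: an atom, or a pair of sub-expressions."""
    if n == 1:
        return k
    return sum(count_expressions(i, k) * count_expressions(n - i, k) for i in range(1, n))

def partial_sums(a, terms=5000):
    """Return truncated sums of C_{n-1} a^n and of n C_{n-1} a^n for n >= 1."""
    term = a  # n = 1: C_0 * a
    plain, weighted = 0.0, 0.0
    for n in range(1, terms + 1):
        plain += term
        weighted += n * term
        term *= a * 2 * (2 * n - 1) / (n + 1)  # multiply by a * C_n / C_{n-1}
    return plain, weighted

# Usage: a = k e^{-b}; the value sqrt(5) - 2 corresponds to b = log((2 + sqrt(5)) k).
a = sqrt(5) - 2
plain, weighted = partial_sums(a)
closed_plain = (1 - sqrt(1 - 4 * a)) / 2      # right-hand side of Equation 25
closed_weighted = a / sqrt(1 - 4 * a)         # right-hand side of Equation 19
```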

The following derivations show one of the possible pathways that each of the described structures can undertake as they develop. Whenever more than one reduction is possible, the “least effort” path is followed, namely, I and K combinators are reduced first, and then S combinators with the shortest reactant (i.e., third argument). Also, note that every expression written as ((fx)(gy)) can also be written simply as (fx(gy)), a fact that we often make use of when applying an S reduction.
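The "least effort" strategy described above can be sketched in Python as follows. Expressions are represented as nested pairs, with atoms as the strings 'S', 'K', and 'I'; this representation, the leftmost tie-breaking rule, and all function names are assumptions of this example rather than details of the authors' simulator. Each step returns the reactants it consumed and the by-products it released.

```python
from functools import reduce

def app(*terms):
    """Left-associative application: app(f, x, y) builds ((f x) y)."""
    return reduce(lambda f, x: (f, x), terms)

def size(t):
    """Number of combinators in a term."""
    return 1 if isinstance(t, str) else size(t[0]) + size(t[1])

def match(t):
    """Return (kind, args) if the term t itself is an I, K or S redex, else None."""
    if isinstance(t, str):
        return None
    f, x = t
    if f == 'I':
        return 'I', (x,)
    if isinstance(f, tuple):
        if f[0] == 'K':
            return 'K', (f[1], x)
        if isinstance(f[0], tuple) and f[0][0] == 'S':
            return 'S', (f[0][1], f[1], x)
    return None

def redexes(t, path=()):
    """Yield (path, kind, args) for every redex inside t, outermost and leftmost first."""
    found = match(t)
    if found:
        yield path, found[0], found[1]
    if isinstance(t, tuple):
        yield from redexes(t[0], path + (0,))
        yield from redexes(t[1], path + (1,))

def replace(t, path, new):
    """Return a copy of t with the subterm at `path` replaced by `new`."""
    if not path:
        return new
    children = list(t)
    children[path[0]] = replace(t[path[0]], path[1:], new)
    return tuple(children)

def step(t):
    """Apply one least-effort reduction; return (term, consumed, released) or None."""
    found = list(redexes(t))
    if not found:
        return None
    cheap = [r for r in found if r[1] in ('I', 'K')]
    # I and K first; otherwise the S redex with the shortest third argument.
    path, kind, args = cheap[0] if cheap else min(found, key=lambda r: size(r[2][2]))
    if kind == 'I':
        return replace(t, path, args[0]), [], ['I']
    if kind == 'K':
        return replace(t, path, args[0]), [], [args[1], 'K']
    f, g, x = args
    return replace(t, path, ((f, x), (g, x))), [x], ['S']  # S copies x from the environment

def run(t, steps):
    """Apply up to `steps` reductions, accumulating consumed reactants and by-products."""
    consumed, released = [], []
    for _ in range(steps):
        result = step(t)
        if result is None:
            break
        t, c, r = result
        consumed += c
        released += r
    return t, consumed, released

# Usage: three steps take (AA), with A = (SII), back to (AA) while consuming one A.
A = app('S', 'I', 'I')
final, consumed, released = run(app(A, A), 3)
```

The sketch always supplies the reactant an S reduction needs; in the chemistry itself, that reduction can only fire when a copy of the reactant is available in the environment.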

### A2.1 Metabolic Cycle of a Simple Autopoietic Pattern

Let A = (SII). Then,
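$\begin{aligned}
(\underline{AA}) + A &\Rightarrow ((IA)(IA)) + S \\
(\underline{IA}(IA)) &\Rightarrow (A(IA)) + I \\
(A(\underline{IA})) &\Rightarrow (AA) + I
\end{aligned}$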

### A2.2 Metabolic Cycle of a Right-Branching Recursively Growing Structure

Let A = (S(SI)I). Then,
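$\begin{aligned}
(\underline{AA}) + A &\Rightarrow (SIA(IA)) + S \\
(SIA(\underline{IA})) &\Rightarrow (SIAA) + I \\
(\underline{SIAA}) + A &\Rightarrow (IA(AA)) + S \\
(\underline{IA}(AA)) &\Rightarrow (A(AA)) + I
\end{aligned}$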

### A2.3 Metabolic Cycle of a Binary-Branching Structure

Let A = (S(SSI)K). Then (AA) can follow the metabolic pathway:
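$\begin{aligned}
(\underline{AA}) + A &\Rightarrow (SSIA(KA)) + S \\
(\underline{SSIA}(KA)) + A &\Rightarrow (SA(IA)(KA)) + S \\
(SA(\underline{IA})(KA)) &\Rightarrow (SAA(KA)) + I \\
(\underline{SAA(KA)}) + (KA) &\Rightarrow (A(KA)(A(KA))) + S
\end{aligned}$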
Then each copy of (A(KA)) can be reduced as follows
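$\begin{aligned}
(\underline{A(KA)}) + (KA) &\Rightarrow (SSI(KA)(K(KA))) + S \\
(\underline{SSI(KA)}(K(KA))) + (KA) &\Rightarrow (S(KA)(I(KA))(K(KA))) + S \\
(S(KA)(\underline{I(KA)})(K(KA))) &\Rightarrow (S(KA)(KA)(K(KA))) + I \\
(\underline{S(KA)(KA)(K(KA))}) + (K(KA)) &\Rightarrow (KA(K(KA))(KA(K(KA)))) + S \\
(\underline{KA(K(KA))}(KA(K(KA)))) &\Rightarrow (A(KA(K(KA)))) + (K(KA)) + K \\
(A(\underline{KA(K(KA))})) &\Rightarrow (AA) + (K(KA)) + K
\end{aligned}$
Thus, the complete pathway can be summarized as (AA) + 2A + 5(KA) + (K(KA)) ⇒* AA(AA) + 4(K(KA)) + 2ϕ(A).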

### A2.4 Metabolic Cycle of a Self-Reproducing Expression

Let A = (SI(S(SK)I)). Then,
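$\begin{aligned}
(\underline{AA}) + A &\Rightarrow (IA(S(SK)IA)) + S \\
(\underline{IA}(S(SK)IA)) &\Rightarrow (A(S(SK)IA)) + I \\
(A(\underline{S(SK)IA})) + A &\Rightarrow (A(SKA(IA))) + S \\
(A(SKA(\underline{IA}))) &\Rightarrow (A(SKAA)) + I \\
(A(\underline{SKAA})) + A &\Rightarrow (A(KA(AA))) + S \\
(A(\underline{KA(AA)})) &\Rightarrow (AA) + (AA) + K
\end{aligned}$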

### A2.5 Arrival of the Self-Reproducing Expression

In our simulations, we found that (AA) with A = (SI(S(SK)I)) often emerged from the condensation of two expressions leading to a chain of reactions that resulted in (AA). Here we show one simple path involving the condensation of (SI(SI)) and (S(SK)I) to produce (SI(SI)(S(SK)I)). Let us call B = (S(SK)I), and note that A = (SIB). The reduction chain that leads to (AA) proceeds as follows:
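$\begin{aligned}
(\underline{SI(SI)B}) + B &\Rightarrow (IBA) + S \\
(\underline{IB}A) &\Rightarrow (BA) + I \\
(\underline{S(SK)IA}) + A &\Rightarrow (SKA(IA)) + S \\
(SKA(\underline{IA})) &\Rightarrow (SKAA) + I \\
(\underline{SKAA}) + A &\Rightarrow (KA(AA)) + S \\
(\underline{KA(AA)}) &\Rightarrow AA + A + K
\end{aligned}$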

### A2.6 Metabolic Cycle of a Simple Autopoietic Pattern on the S − K Basis

Let A = (S(SK)(SKK)). Then,
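$\begin{aligned}
(\underline{AA}) + A &\Rightarrow ((SKA)(SKKA)) + S \\
(SKA(\underline{SKKA})) + A &\Rightarrow (SKA(KA(KA))) + S \\
(SKA(\underline{KA(KA)})) &\Rightarrow (SKAA) + KA + K \\
(\underline{SKAA}) + A &\Rightarrow (KA(AA)) + S \\
(\underline{KA(AA)}) &\Rightarrow A + (AA) + K
\end{aligned}$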

### A2.7 Metabolic Cycle of a Recursively-Growing Expression on the S − K Basis

Let A = (S(SSK)). Then,
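$\begin{aligned}
(\underline{AAA}) + A &\Rightarrow (SSKA(AA)) + S \\
(\underline{SSKA}(AA)) + A &\Rightarrow (SA(KA)(AA)) + S \\
(\underline{SA(KA)(AA)}) + AA &\Rightarrow (A(AA)(KA(AA))) + S \\
(A(AA)(\underline{KA(AA)})) &\Rightarrow (A(AA)A) + (AA) + K \\
(\underline{A(AA)A}) + A &\Rightarrow (SSKA(AAA)) + S
\end{aligned}$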

### A2.8 Metabolic Cycle of a Self-Reproducing Expression on the S − K Basis

Let A = (S(SK)(S(SK)(SKK))). Then,
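$\begin{aligned}
(\underline{AA}) + A &\Rightarrow (SKA(S(SK)(SKK)A)) + S \\
(SKA(\underline{S(SK)(SKK)A})) + A &\Rightarrow (SKA(SKA(SKKA))) + S \\
(SKA(SKA(\underline{SKKA}))) + A &\Rightarrow (SKA(SKA(KA(KA)))) + S \\
(SKA(SKA(\underline{KA(KA)}))) &\Rightarrow (SKA(SKAA)) + K + KA \\
(SKA(\underline{SKAA})) + A &\Rightarrow (SKA(KA(AA))) + S \\
(SKA(\underline{KA(AA)})) &\Rightarrow (SKAA) + AA + K \\
(\underline{SKAA}) + A &\Rightarrow (KA(AA)) + S \\
(\underline{KA(AA)}) &\Rightarrow A + AA + K
\end{aligned}$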

1. As a matter of fact, S and K suffice because I can be written as (SKK). The inclusion of I simply allows for expressing more complex programs with shorter expressions.

2. Also, I cannot be reduced with I as an argument because (SII) = ((SI)I) and thus, the second I is not an argument of the first I but of (SI).

3. Some expressions can participate in a very large number of reductions, considerably slowing the simulation. For this reason, the system is currently limited to computing up to 10 reductions per expression in no special order.

4. We make available the code to simulate combinatory chemistry at https://github.com/germank/combinatory-chemistry.

5. The name and the formula are related to the pointwise mutual information (PMI) metric (Church & Hanks, 1990) that has been extensively used in computational linguistics. PMI computes the log ratio between the empirical co-occurrence probability of two events x and y with respect to their expected probability if these two events were independent, as given by the product of the marginals. Like PMI, the proposed metric computes the log ratio between observed odds and chance-driven ones.

6. See Appendix 2 for more details on these derivations.

Bagley
,
R. J.
, &
Farmer
,
J. D.
(
1992
).
Spontaneous emergence of a metabolism
. In
C. G.
Langton
,
C.
Taylor
,
J. D.
Farmer
, &
S.
Rasmussen
(Eds.)
,
Artificial Life II: Proceedings of the interdisciplinary workshop on the synthesis and simulation of living systems
(pp.
93
140
).
Addison-Wesley;
.
Banzhaf
,
W.
,
Dittrich
,
P.
, &
Rauhe
,
H.
(
1996
).
Emergent computation by catalytic reactions
.
Nanotechnology
,
7
(
4
),
307
314
.
Buliga
,
M.
(
2020
).
Artificial chemistry experiments with chemlambda, lambda calculus, interaction combinators
.
ArXiv
. https://arxiv.org/abs/2003.14332
Buliga
,
M.
, &
Kauffman
,
L.
(
2014
).
Chemlambda, universality and self-multiplication
. In
Proceedings of the ALIFE 14: The fourteenth international conference on the synthesis and simulation of living systems
(pp.
490
497
).
MIT Press
. https://direct.mit.edu/isal/proceedings-pdf/alife2014/26/490/1901863/978-0-262-32621-6-ch079.pdf
Cardone
,
F.
, &
Hindley
,
J. R.
(
2006
).
Lambda-calculus and combinators in the 20th century
. In
D. M.
Gabbay
&
J.
Woods
(Eds.),
Logic from Russell to Church
(Vol.
5
,
Handbook of the History of Logic
, pp.
723
817
).
Elsevier
.
Church
,
K. W.
, &
Hanks
,
P.
(
1990
).
Word association norms, mutual information, and lexicography
.
Computational Linguistics
,
16
(
1
),
22
29
.
Curry
,
H. B.
,
Feys
,
R.
,
Craig
,
W.
,
Hindley
,
J. R.
, &
Seldin
,
J. P.
(
1958
).
Combinatory logic
(Vol.
1
).
North-Holland Publishing
.
di Fenizio
,
P. S.
, &
Banzhaf
,
W.
(
2000
).
A less abstract artificial chemistry
. In
M. A.
Bedau
,
J. S.
McCaskill
,
N. H.
Packard
, &
S.
Rasmussen
(Eds.),
Artificial Life VII: Proceedings of the seventh international conference on Artificial Life
(pp.
49
53
).
MIT Press
.
Dittrich
,
P.
, &
Banzhaf
,
W.
(
1998
).
Self-evolution in a constructive binary string system
.
Artificial Life
,
4
(
2
),
203
220
. ,
[PubMed]
Dittrich
,
P.
,
Ziegler
,
J.
, &
Banzhaf
,
W.
(
2001
).
Artificial chemistries: A review
.
Artificial Life
,
7
(
3
),
225
275
. ,
[PubMed]
Eigen
,
M.
, &
Schuster
,
P.
(
1978
).
The hypercycle
.
Naturwissenschaften
,
65
(
1
),
7
41
.
Fellermann
,
H.
,
Tanaka
,
S.
, &
Rasmussen
,
S.
(
2017
).
Sequence selection by dynamical symmetry breaking in an autocatalytic binary polymer model
.
Physical Review E
,
96
(
6
), Article
062407
. ,
[PubMed]
Flamm
,
C.
,
Ullrich
,
A.
,
Ekker
,
H.
,
Mann
,
M.
,
Högerl
,
D.
,
Rohrschneider
,
M.
,
Sauer
,
S.
,
Scheuermann
,
G.
,
Klemm
,
K.
,
Hofacker
,
I. L.
, &
Stadler
,
P. F.
(
2010
).
Evolution of metabolic networks: A computational frame-work
.
Journal of Systems Chemistry
,
1
(
1
), Article
4
.
Fontana
,
W.
, &
Buss
,
L. W.
(
1994
).
What would be conserved if “the tape were played twice”?
Proceedings of the National Academy of Sciences
,
91
(
2
),
757
761
. ,
[PubMed]
Fontana
,
W.
, &
Buss
,
L. W.
(
1996
).
The barrier of objects: From dynamical systems to bounded organizations
. In
J.
Casti
&
A.
Karlqvist
(Eds.)
,
Boundaries and barriers
(pp.
56
116
).
Addison-Wesley
.
Fontana
,
W.
,
Wagner
,
G.
, &
Buss
,
L. W.
(
1993
).
Beyond digital naturalism
.
Artificial Life
,
1
(
1–2
),
211
227
.
Gillespie
,
D. T.
(
1977
).
Exact stochastic simulation of coupled chemical reactions
.
Journal of Physical Chemistry
,
81
(
25
),
2340
2361
.
Gillespie
,
D. T.
(
2007
).
Stochastic simulation of chemical kinetics
.
Annual Review of Physical Chemistry
,
58
,
35
55
. ,
[PubMed]
Högerl
,
D.
(
2010
).
Simulation of prebiotic chemistries
(Master's thesis)
.
University of Vienna
.
Hordijk
,
W.
(
2019
).
A history of autocatalytic sets: A tribute to Stuart Kauffman
.
Biological Theory
,
14
(
4
),
224
246
.
Hordijk
,
W.
,
Smith
,
J. I.
, &
Steel
,
M.
(
2015
).
Algorithms for detecting and analysing autocatalytic sets
.
Algorithms for Molecular Biology
,
10
(
1
), Article
15
. ,
[PubMed]
Hordijk
,
W.
, &
Steel
,
M.
(
2004
).
Detecting autocatalytic, self-sustaining sets in chemical reaction systems
.
Journal of Theoretical Biology
,
227
(
4
),
451
461
. ,
[PubMed]
Hutton
,
T. J.
(
2002
).
Evolvable self-replicating molecules in an artificial chemistry
.
Artificial life
,
8
(
4
),
341
356
. ,
[PubMed]
Hutton
,
T. J.
(
2007
).
Evolvable self-reproducing cells in a two-dimensional artificial chemistry
.
Artificial Life
,
13
(
1
),
11
30
. ,
[PubMed]
Kauffman
,
S. A.
(
1993
).
The origins of order: Self-organization and selection in evolution
.
Oxford University Press
.
Kruszewski
,
G.
, &
Mikolov
,
T.
(
2020
).
Combinatory chemistry: Towards a simple model of emergent evolution
. In
ALIFE 2020: Proceedings of the 2020 conference on Artificial Life
(pp.
411
419
).
MIT Press
.
Lafont
,
Y.
(
1997
).
Interaction combinators
.
Information and Computation
,
137
(
1
),
69
101
.
Langton
,
C. G.
(Ed.) (
1989
).
Artificial Life: The proceedings of an interdisciplinary workshop on the synthesis and simulation of living systems, September 1987, in Los Alamos, New Mexico
.
Addison-Wesley
.
Lehmer, D. H. (1985). Interesting series involving the central binomial coefficient. The American Mathematical Monthly, 92(7), 449–457.
Lotka, A. J. (1910). Contribution to the theory of periodic reactions. Journal of Physical Chemistry, 14(3), 271–274.
Maturana, H. R., & Varela, F. J. (1973). De máquinas y seres vivos: Una teoría sobre la organización biológica [Of machines and living beings: A theory on biological organization]. Editorial Universitaria.
Meyer, T., Yamamoto, L., Banzhaf, W., & Tschudin, C. (2009). Elongation control in an algorithmic chemistry. In G. Kampis, I. Karsai, & E. Szathmáry (Eds.), ECAL'09: Proceedings of the 10th European conference on advances in Artificial Life: Darwin meets von Neumann (pp. 273–280). Springer-Verlag.
Piantadosi, S. T. (2021). The computational origin of representation. Minds and Machines: Journal for Artificial Intelligence, Philosophy and Cognitive Science, 31(1), 1–58.
Pierce, B. C. (2002). Types and programming languages. MIT Press.
Sayama, H. (2018). Seeking open-ended evolution in swarm chemistry II: Analyzing long-term dynamics via automated object harvesting. In Proceedings of the ALIFE 2018: The 2018 conference on Artificial Life (pp. 59–66). MIT Press.
Schönfinkel, M. (1924). Über die Bausteine der mathematischen Logik [About the building blocks of mathematical logic]. Mathematische Annalen, 92(3), 305–316.
Tominaga, K., Watanabe, T., Kobayashi, K., Nakamura, M., Kishi, K., & Kazuno, M. (2007). Modeling molecular computing systems by an artificial chemistry: Its expressive power and application. Artificial Life, 13(3), 223–247.
Waldmann, J. (2000). The combinator S. Information and Computation, 159(1–2), 2–21.
Walker, C., & Ashby, W. R. (1966). On temporal characteristics of behavior in certain complex systems. Kybernetik, 3(2), 100–108.
Wuensche, A., Lesser, M., & Lesser, M. J. (1992). Global dynamics of cellular automata: An atlas of basin of attraction fields of one-dimensional cellular automata (Vol. 1). Addison-Wesley.
Young, T. J., & Neshatian, K. (2013). A constructive artificial chemistry to explore open-ended evolution. In S. Cranefield & A. Nayak (Eds.), Australasian joint conference on artificial intelligence AI 2013: Advances in artificial intelligence (pp. 228–233). Springer.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.