## Abstract

One important sense of the term “adaptation” is the process by which an agent changes appropriately in response to new information provided by environmental stimuli. We propose a novel quantitative measure of this phenomenon, which extends a little-known definition of adaptation as “increased robustness to repeated perturbation” proposed by Klyubin (2002). Our proposed definition essentially corresponds to the average value (relative to some fitness function) of state changes that are caused by the environment (in some statistical ensemble of environments). We compute this value by comparing the agent's actual fitness with its fitness in a counterfactual world where the causal links between agent and environment are disrupted. The proposed measure is illustrated in a simple Markov chain model and also using a recent model of autopoietic agency in a simulated protocell.

## 1 Introduction

How might we understand the notion of adaptation? We propose a definition, inspired by Klyubin's definition of adaptation as “increased robustness to repeated perturbation” [8, p. ii]. Our approach essentially formalizes the notion of an organism displaying beneficial changes *in response to* interactions with its environment. In order to provide a more mathematically rigorous framework, we use Pearl's theory of causal probability [14] and Ay and Polani's information agent formalism [3], although prior knowledge of these is not needed to understand the intuitions in this article. The mathematically inclined reader can find more details in the Appendix.

The key intuition behind our approach is the following: Comparing the effect of an organism's action with the effect of other actions it could have taken is much more straightforward than trying to measure what its actions do to promote continued viability. Therefore, our approach to measuring adaptivity relies on the relative fitness gain an organism obtains after responding to a stimulus in a way that is particularly beneficial in its current environment, compared to the way it would have responded had it been in another environment. This essentially corresponds to drawing a distinction between adaptation to a specific environment and overall improvement. Furthermore, we distinguish two ways in which an organism can be adaptive to its environment, both of which differ from diachronic improvement.

In this article we provide mathematical tools to quantify various adaptivity measures following this paradigm, illustrate them in a recent model of a protocell, and discuss what benefits and insights can be obtained about the system using this approach.

These measures are formulated in an abstract manner, and are easily applicable to systems in other fields within artificial life research. For example, these tools can be used to measure the benefits of phenotypic plasticity in a given organism, or the adaptivity of certain behaviors exhibited by such an organism. Importantly, our approach moves beyond the standard survival-normative accounts of adaptivity and allows researchers to operationalize their definitions of fitness in any way that is suitable to study their system of interest.

### 1.1 Companion Articles

This article is the second of two companion articles considering the notion of adaptation from an abstract, formal perspective:

1. “Adaptation Is Not Just Improvement over Time” critiques an influential view of adaptation, under which an organism can be mechanically constituted so as to improve its future prospects over time.

2. “Measuring Fitness Effects of Agent-Environment Interactions” proposes a quantitative measure for the degree to which an agent's internal dynamics capture beneficial information from its sensory stimuli.^{1}

### 1.2 Structure of the Article

Section 2 introduces our approach to defining adaptation during an organism's lifetime: effectively, a quantification of the benefit the organism derives from responding to a specific environment. We briefly summarize the mathematical tools we are using (the causal information agent framework) and provide formal definitions in Section 3; and we apply them to some examples in Section 4, including a protocell model used recently to investigate a different notion of adaptation (discussed in more detail in our companion article). In Section 5 we briefly recap our proposal and relate it to other work in subjectivity, causality, and agency. Section 6 concludes the main body of the article.

## 2 Our Concept of Adaptation

The word “adaptive” has a variety of meanings in the scientific context. In particular, it refers to a specific technical concept within evolutionary theory, which differs from how the term is used informally in the non-evolutionary sciences. As a disclaimer, we note that here we address the problem of measuring within-lifetime adaptation, and we do *not* make any claims about adaptation in evolutionary biology. The companion article provides a fuller discussion of notions of lifetime adaptation in artificial life science.

Before we dive into the mathematical definitions, in this section we walk through an example that displays the basic reasoning behind our concept of adaptation. Although the organism in this example is arguably more complex than the organism we analyze later in Section 4, it serves as a good illustration of our intuitions.

### 2.1 The Rat in the Maze

Consider a rat navigating an unfamiliar maze *M*_{1}. The rat will get better at navigating this particular maze (and, indeed, mazes in general) as a consequence of exposure to the maze. We may be tempted to equate behavioral adaptation with a diachronic change in the animal's “objective” future prospects (i.e., a difference between the organism's viability at different time steps), but—as we argue in our companion article—this invites some mathematical pathologies, since a fully objective perspective would foresee the rat's adaptive capacities and factor them accurately into any evaluation of its future prospects.

Instead, we propose a different view of adaptation that is based on counterfactual reasoning and provides a more nuanced account of adaptive phenomena. This view is inspired by the definition of adaptation as “increased robustness to repeated perturbation” [8, p. ii, 9], which seems to capture an important aspect of adaptive processes. In particular, what this definition refers to is the agent's molding itself to a specific, temporally extended aspect of the environment. Let us consider two different aspects of this molding-to-environment phenomenon:

1. The organism's tendency to respond to a stimulus from its actual environment in ways that are better suited to that environment than its responses to stimuli from other possible environments.

2. The organism's tendency to respond to a stimulus from its actual environment such that its response is better suited to that environment than to other possible environments.

### 2.2 Causal Responsivity

This section illustrates the first aspect of molding to environment: By responding to stimuli from its actual environment, the rat improves its prospects within that environment relative to what they would be if it had been responding to stimuli from a different environment.

Suppose the rat, while familiarizing itself with maze *M*_{1}, were to be magically transported to a different maze *M*_{2} for a short time interval *T*, and then transported back to maze *M*_{1}. The rat would be confused by the unexpected stimuli it had encountered during *T*, and would require more time to reorient itself in maze *M*_{1}.

However, this would not be the case if the local configuration of maze *M*_{2} were identical, over *T*, to that of maze *M*_{1}: In such a case, the rat would be unable to distinguish between the two mazes, and its performance in maze *M*_{1} would be unaffected. We would not want to say that, over *T*, the rat had adjusted *specifically* to maze *M*_{1} if the rat's state changes would have been identical in maze *M*_{2}.

### 2.3 Selective Benefit

This subsection illustrates the second aspect of molding to environment: By interacting with its environment, the rat improves its prospects within its actual environment relative to its imagined prospects in another, counterfactual environment. We can illustrate this as follows.

Suppose the rat, after familiarizing itself with maze *M*_{1}, is magically transported to a different maze *M*_{2}. There was some change in the rat that occurred through interactions with maze *M*_{1}. This adaptation produced a propensity for behavior well suited for maze *M*_{1}, but less so for maze *M*_{2}; indeed, the rat may now fare worse in maze *M*_{2} than if it had never been exposed to maze *M*_{1}. (Imagine, for example, a case in which the spatial layout of *M*_{1} and *M*_{2} is identical, but the exit or goal in *M*_{2} is diametrically opposed to the one in *M*_{1}. In this case, when transported to *M*_{2} the rat would head in the wrong direction and end up further away from the goal.)

Imagine, for contrast, a scenario in which the rat has been allowed to familiarize itself with both maze *M*_{1} and maze *M*_{2}, and has now been drugged. The amount of time taken to traverse maze *M*_{1} will improve over time, just as for a naive rat that familiarizes itself over time with maze *M*_{1}. However, the reasons will be importantly different: In the new case, the rat's performance improves because the drugs are wearing off, and not because its behavior is becoming more tailored to maze *M*_{1} than to other possible environments. Indeed, as the drugs wear off, the same improvements will be observed in maze *M*_{2}. This is still improvement, but we would not call it adaptation to *M*_{1}, because the changes are not specific to *M*_{1}.

## 3 Quantifying Fitness Effects of Agent-Environment Interactions

### 3.1 Preliminaries

The measures of adaptivity we propose are inspired by and grounded in causal probability [15] and information theory [3], although detailed prior knowledge of these is not necessary to understand the intuitions behind this work. Here we describe the measures, and leave the technical details to the Appendix.

Imagine an ensemble of agents at time *t*_{0}, with a variety of different internal states and in a variety of different environments. We can represent this ensemble by a joint probability distribution over random variables *X*, the agent's state at time *t*_{0}, and *Y*, the environment's state at time *t*_{0}. The probability ℙ(*X* = *x*, *Y* = *y*) will correspond to the degree to which we consider the pair *x*, *y* to be typical of this class of agents in this class of environments. At time *t*_{1}, after one (discrete) time step, the state of the agent transitions to *X*′ and the state of the environment to *Y*′.

The state of the environment will in general affect the organism's dynamics, and vice versa. These dependencies can be compactly represented in the causal graph (or causal Bayesian network) shown in Figure 1. While causal graphs have a precise mathematical definition [15], for our purposes here it suffices to say that a link from *A* to *B* implies that in general *B* will respond to changes in *A* when everything else is held constant. Then, reading the graph, we can tell that in general *X*′ and *Y*′ depend on *X* and *Y* and are distributed as ℙ(*x*′, *y*′|*x*, *y*) = ℙ(*x*′|*x*, *y*) ℙ(*y*′|*x*, *y*). We assume that causal relations apply strictly forwards in time, and are temporally local, that is, that variables at each time step are directly causally affected only by variables at the immediately preceding time step.
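The factorization ℙ(*x*′, *y*′|*x*, *y*) = ℙ(*x*′|*x*, *y*) ℙ(*y*′|*x*, *y*) can be sketched in code. Below is a minimal illustration with a two-state agent and a two-state environment; all transition tables and names are invented for this sketch and come from no model in this article.

```python
# Hypothetical two-state agent X and two-state environment Y.
# P_X[x][y][x2] = P(x' = x2 | x, y); P_Y[x][y][y2] = P(y' = y2 | x, y).
# All numbers are invented for illustration.
P_X = {0: {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}},
       1: {0: {0: 0.6, 1: 0.4}, 1: {0: 0.3, 1: 0.7}}}
P_Y = {0: {0: {0: 0.8, 1: 0.2}, 1: {0: 0.1, 1: 0.9}},
       1: {0: {0: 0.7, 1: 0.3}, 1: {0: 0.4, 1: 0.6}}}

def joint_step(x, y):
    """Joint next-step distribution read off the causal graph:
    P(x', y' | x, y) = P(x'|x, y) * P(y'|x, y)."""
    return {(x2, y2): P_X[x][y][x2] * P_Y[x][y][y2]
            for x2 in (0, 1) for y2 in (0, 1)}

dist = joint_step(0, 1)
print(sum(dist.values()))  # a proper distribution over (x', y'): sums to 1
```

The factorization encodes the temporal locality assumption: each next-step variable depends only on the previous joint state.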

Finally, we must specify with respect to what metric the organism is adapting or improving. Treatments of adaptation [5, 8] in the artificial life literature often adopt a *survival-normative* approach, assuming that the only event relevant to an organism's concerns is its own survival or death [10]. However, we will relax this requirement and allow the organism's fitness to be an arbitrary function *F* : 𝒳 × 𝒴 → ℝ of the organism-environment state.

### 3.2 Causal Responsivity

Imagine that an agent in state *x*, situated in an environment in state *y*, undergoes the transition (*x*, *y*) → (*x*′, *y*′). The new fitness *F*(*x*′, *y*′) is a sample from some distribution with a population mean given by Equation 1. If the transition *x* → *x*′ would have been equally likely in any environment, it does not seem reasonable to identify (*x*, *y*) → (*x*′, *y*′) as a specific adaptation to *y* (see Section 2.2). Let's therefore imagine that we use some random environment *y*_{*}, consistent with *x*, to generate the transition (*x*, *y*_{*}) → (*x*_{*}^{′}, *y*_{*}^{′}), and evaluate the fitness *F*(*x*_{*}^{′}, *y*′). This is equivalent to imagining that, when making the transition *x* → *x*_{*}^{′}, the agent is influenced by a random environment *y*_{*} instead of its actual environment *y*, while the fitness *F* is evaluated on the true state of the environment *y*′ (i.e., the equivalent of transporting the rat from our previous example from maze *M*_{1} to maze *M*_{2} for a short time interval and then measuring its viability in *M*_{1}). The fitness *F*(*x*_{*}^{′}, *y*′) is then a sample drawn from a distribution with a population mean given by Equation 2, where *δ* is the Kronecker delta function. Intuitively, ℙ(*y*′|*x*, *y*) is the standard transition probability of the environment and, importantly, ℙ(*x*′|*x*) = ∑_{y*} ℙ(*x*′|*x*, *y*_{*}) ℙ(*y*_{*}|*x*) is the overall probability that *x* transitions into *x*′ averaged across all environments *y*_{*}.^{2}

We define the *causal responsivity* *ρ*(*x*, *y*) of a state *x* in an environment *y* as the difference between Equations 1 and 2; this difference constitutes Equation 3. An organism in state *x* in environment *y* will have a positive causal responsivity *ρ* if its response to *y* results in a higher fitness in *y* than its responses to other environments *y*_{*} would.

Alternatively, this quantity can be seen as representing the portion of expected fitness that depends on the organism receiving incoming information about its actual environment, rather than some random environment consistent with its state: in other words, how much worse off the organism would be in a counterfactual world with the same organism-environment dynamics but where the red arrow in Figure 1 did not exist. In fact, in the Appendix, section A.4, we show that, as expected, the causal responsivity is zero if there is no information flow along the relevant arrow in the causal graph. Additionally, it is possible for this quantity to be negative, which would indicate that the organism would actually perform better in a scenario where it was receiving information from a random environment—for example, if the organism's response to environment *y* were particularly harmful to fitness in the same environment *y*.
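Causal responsivity can be sketched numerically: compare the expected fitness under the agent's actual response ℙ(*x*′|*x*, *y*) with the expected fitness when the response is averaged over environments, ℙ(*x*′|*x*) = ∑_{y*} ℙ(*x*′|*x*, *y*_{*}) ℙ(*y*_{*}|*x*). The toy setup below is invented for illustration: the environment is static, the agent's next state depends only on *y* (not on *x*), and fitness rewards matching the environment.

```python
# Toy sketch of causal responsivity rho(x, y).  All distributions and the
# fitness function are invented; the environment is static (y' = y).
STATES = (0, 1)
P_ENV = {0: 0.5, 1: 0.5}  # P(y | x), taken independent of x for simplicity
# Agent tends to copy its environment: P(x' = y | x, y) = 0.9.
P_X = {y: {x2: (0.9 if x2 == y else 0.1) for x2 in STATES} for y in STATES}
F = lambda x2, y2: 1.0 if x2 == y2 else 0.0  # fitness: next state matches env

def rho(x, y):
    """Expected fitness of the actual response to y, minus expected fitness
    when x' is drawn from P(x'|x), i.e. with the Y -> X' arrow cut."""
    px_avg = {x2: sum(P_X[ys][x2] * P_ENV[ys] for ys in STATES)
              for x2 in STATES}
    actual = sum(P_X[y][x2] * F(x2, y) for x2 in STATES)
    cut = sum(px_avg[x2] * F(x2, y) for x2 in STATES)
    return actual - cut

print(rho(0, 0))  # ~0.4: responding to the true environment pays off
```

If the agent's transition table did not depend on *y*, `px_avg` would coincide with `P_X[y]` and *ρ* would vanish, matching the zero-information-flow property mentioned above.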

### 3.3 Selective Benefit

Some fitness changes are best characterized as *global improvements* rather than *adaptations*, in analogy to the rat recovering from being drugged in the example discussed above. Imagine, as in Section 3.2, that in addition to a transition (*x*, *y*) → (*x*′, *y*′) we generate a transition (*x*, *y*_{*}) → (*x*_{*}^{′}, *y*_{*}^{′}) starting in a random environment *y*_{*}, and evaluate the fitness *F*(*x*′, *y*_{*}^{′}). This is equivalent to considering how good the agent's actual transition under *y* would be if the agent were in a random environment *y*_{*}. We will define the *selective benefit* *σ*(*x*, *y*) of a state *x* in an environment *y* as the corresponding difference in expected fitness (Equation 4). Note that this compares *F*(*x*′, *y*′) to *F*(*x*′, *y*_{*}^{′}) instead of to *F*(*x*_{*}^{′}, *y*′). In other words, an organism will have high selective benefit if its response to environment *y* is more beneficial in environment *y* than in a random environment *y*_{*}^{′}.

Selective benefit can be seen as the portion of expected fitness (in the next time step) that depends on the organism being in its actual environment rather than a random environment compatible with its initial state: in other words, how much worse off the organism would be in a counterfactual world where the blue arrow in Figure 1 did not exist.
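Selective benefit mirrors causal responsivity, but the counterfactual now randomizes the environment in which the actual response is *evaluated*. The sketch below reuses the invented matching-game setup from the causal-responsivity sketch (static environments, copy-the-environment agent; none of these numbers come from the article).

```python
# Toy sketch of selective benefit sigma(x, y).  Invented numbers; the
# environment is static, so the "next" environment state equals y (or y*).
STATES = (0, 1)
P_ENV = {0: 0.5, 1: 0.5}  # P(y | x)
P_X = {y: {x2: (0.9 if x2 == y else 0.1) for x2 in STATES} for y in STATES}
F = lambda x2, y2: 1.0 if x2 == y2 else 0.0

def sigma(x, y):
    """Expected fitness of the actual transition in the actual environment,
    minus the expected fitness of that same transition evaluated in a random
    environment consistent with x (the Y -> Y' arrow is cut)."""
    actual = sum(P_X[y][x2] * F(x2, y) for x2 in STATES)
    cut = sum(P_X[y][x2] * P_ENV[ys] * F(x2, ys)
              for x2 in STATES for ys in STATES)
    return actual - cut

print(sigma(0, 0))  # ~0.4: the response to y is specifically suited to y
```

Note the symmetry with the *ρ* sketch: there the agent's transition is randomized and fitness is evaluated in the true environment; here the transition is the actual one and the evaluation environment is randomized.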

### 3.4 Environment Adaptivity

Given the definitions of *ρ*(*x*, *y*) and *σ*(*x*, *y*) above, we can prove the following interesting property: Both of them have the same expected value when averaged over *y*. Mathematically, ∑_{y} ℙ(*y*|*x*) *ρ*(*x*, *y*) = ∑_{y} ℙ(*y*|*x*) *σ*(*x*, *y*). This common average defines a quantity *α*(*x*), which we call the *environment adaptivity* of *x* (Equation 5).

The fact that *ρ*(*x*, *y*) and *σ*(*x*, *y*) yield the same result after averaging over *y* is somewhat surprising—after all, they were designed to measure two qualitatively different aspects of agent-environment interaction. This makes *α*(*x*) a natural candidate for the overall adaptivity of the organism in state *x*, as it can be interpreted from the two complementary views in Section 2.1.

### 3.5 Normalized Causal Responsivity and Selective Benefit

As an overall measure, expected adaptivity balances the benefit the organism obtains from environment-*y*-typical perturbations (as opposed to perturbations from a completely random environment). However, if we compare the organism's typical behavior in environment *y* with its typical behavior in a random environment *y*_{*}, there is in general a nonzero probability that *y*_{*} = *y* and hence that there will be no difference. The larger the prior probability of *y*, the higher this chance, and the lower the measures *ρ*(*x*, *y*) and *σ*(*x*, *y*) will be. This is unlikely to be what we want when we consider an organism's relationship to a specific environment.

Instead, we can quantify adaptation to a specific environment *y* by comparing the organism's behavior in *y* with its behavior in a random *other* environment *y*_{*}. In our notation, we will define the *normalized causal responsivity* *ρ*_{*}(*x*, *y*) and *normalized selective benefit* *σ*_{*}(*x*, *y*) of *x* in *y* as in Equations 6 and 7, where ℙ(*x*′|*x*, *Y* ≠ *y*) is the same distribution over the organism's future state as in the standard case, but normalized with respect to the assumption that *Y* ≠ *y*. Mathematically, ℙ(*x*′|*x*, *Y* ≠ *y*) = ∑_{y* ≠ y} ℙ(*x*′|*x*, *y*_{*}) ℙ(*y*_{*}|*x*) / ℙ(*Y* ≠ *y*|*x*), and similarly for ℙ(*y*′|*x*, *Y* ≠ *y*). After a few steps of algebra, we find that the normalized measures can be succinctly written in terms of their non-normalized counterparts as *ρ*_{*}(*x*, *y*) = *ρ*(*x*, *y*) / ℙ(*Y* ≠ *y*|*x*) and *σ*_{*}(*x*, *y*) = *σ*(*x*, *y*) / ℙ(*Y* ≠ *y*|*x*). In general, we recommend using the normalized measures *ρ*_{*} and *σ*_{*} when considering the organism's relationship to a specific environment, and the non-normalized average *α*(*x*) when considering the overall adaptivity of a state *x*.
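The shortcut relating normalized and raw measures, *ρ*_{*}(*x*, *y*) = *ρ*(*x*, *y*) / ℙ(*Y* ≠ *y*|*x*), can be sanity-checked with made-up numbers. In the sketch below the environments are static and the agent state is held fixed; every table is invented for illustration.

```python
# Check of the relation rho_*(x, y) = rho(x, y) / P(Y != y | x).
# Invented numbers; environments are static, agent state fixed.
Y2, X2 = (0, 1), (0, 1)
P_env = {0: 0.2, 1: 0.8}                          # P(y | x)
P_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}  # P(x' | x, y)
P_y = {0: {0: 1.0, 1: 0.0}, 1: {0: 0.0, 1: 1.0}}  # static: y' = y
F = {(0, 0): 2.0, (0, 1): 0.0, (1, 0): -1.0, (1, 1): 1.0}

def rho(y, exclude=False):
    """exclude=False averages the agent's transition over all environments
    (raw rho); exclude=True conditions the average on Y != y (normalized)."""
    pool = [ys for ys in Y2 if ys != y] if exclude else list(Y2)
    z = sum(P_env[ys] for ys in pool)
    px = {a: sum(P_x[ys][a] * P_env[ys] / z for ys in pool) for a in X2}
    return sum((P_x[y][a] - px[a]) * P_y[y][b] * F[a, b]
               for a in X2 for b in Y2)

for y in Y2:
    print(abs(rho(y, True) - rho(y) / (1 - P_env[y])) < 1e-12)  # True, True
```

The relation also makes the normalization's purpose visible: when ℙ(*y*|*x*) is close to 1, the raw measure is damped by the small factor ℙ(*Y* ≠ *y*|*x*), and dividing it out recovers the comparison against genuinely *other* environments.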

## 4 Examples

For the sake of simplicity, our examples will consider only the case where the environment's state is fixed. The more general case where the environment's state varies as a consequence of endogenous and interactional dynamics is little different, and omitted here only for reasons of space. In Section 5.3 we briefly discuss how our approach may also be used to address further complexities regarding the evaluation of active effects by the agent on the environment.

### 4.1 Simple Markov Chain

Let's start with a simple Markov process we can use to illustrate our concepts of causal responsivity and selective benefit. We will assume that the environment is fixed, or that its dynamics are much slower than those of the agent, so that ℙ(*Y*′|*X*, *Y*) = ℙ(*Y*′|*Y*).

Consider an agent with three states, labeled *x*_{0}, *x*_{1}, and *x*_{2}. The agent starts always in state *x*_{0}, and it can transition to either *x*_{1} or *x*_{2}. There are two possible environments, labeled *y*_{0} and *y*_{1}, with prior probabilities *γ*_{i} = ℙ(*y*_{i} | *x*_{0}). A graphical representation of the Markov process is shown in Figure 2. Edges represent transition probabilities, and the numbers next to each node represent fitness values. Edges and numbers in red apply in environment *y*_{0}, and edges and numbers in blue apply in environment *y*_{1}.

Parameters *λ*_{i} ∈ [0, 1], *i* ∈ {0, 1}, control the probability of internal state transitions in each environment *y*_{i}. If *λ*_{i} = 0, then the agent is certain to transition to state *x*_{1} in environment *y*_{i}; if *λ*_{i} = 1, then it is certain to transition to *x*_{2}.

Parameters *β*_{i} ∈ ℝ control the relative fitness of state *x*_{2} compared to *x*_{1} in the same environment (recall that states may have different fitness values in different environments). If *β*_{i} = 0, both states are equally good in environment *y*_{i}, whereas as |*β*_{i}| → ∞, one of the two states becomes infinitely preferable to the other in that environment.

Parameters *γ*_{i} ∈ (0, 1) represent the probability that the agent is in environment *y*_{i} given that its internal state is *x*_{0}—that is, how “typical” the environment *y*_{i} is for the initial state *x*_{0} of the agent. We require that every environment have a nonzero probability of occurring, *γ*_{i} > 0.

The causal responsivity *ρ*(*x*_{0}, *y*_{i}) and normalized causal responsivity *ρ*_{*}(*x*_{0}, *y*_{i}) of state *x*_{0} in environment *y*_{i} can be calculated using Equations 3 and 6:

*ρ*(*x*_{0}, *y*_{i}) = *γ*_{¬i} (*λ*_{i} − *λ*_{¬i}) *β*_{i},  *ρ*_{*}(*x*_{0}, *y*_{i}) = (*λ*_{i} − *λ*_{¬i}) *β*_{i},

where ¬*i* = 1 − *i*. Note that the “raw” causal responsivity *ρ*(*x*_{0}, *y*_{i}) vanishes when *γ*_{i} ≈ 1, which corresponds to the case in which one of the environments never occurs. This supports the use of the normalized adaptivity measures introduced in Section 3.5.

The normalized causal responsivity *ρ*_{*}(*x*_{0}, *y*_{i}), in turn, is proportional to the relative fitness of one state with respect to the other and to the internal state transition probabilities. Also note that it vanishes in two conditions: when there is no benefit to being in *x*_{2} rather than *x*_{1} (in environment *y*_{i}), or when *x* responds no differently to *y*_{0} than to *y*_{1}. This is consistent with quantifying the first aspect of molding to environment discussed in Section 2.1.

Similarly, we can calculate the selective benefit *σ*(*x*_{0}, *y*_{i}) and normalized selective benefit *σ*_{*}(*x*_{0}, *y*_{i}) using Equations 4 and 7:

*σ*(*x*_{0}, *y*_{i}) = *γ*_{¬i} *λ*_{i} (*β*_{i} − *β*_{¬i}),  *σ*_{*}(*x*_{0}, *y*_{i}) = *λ*_{i} (*β*_{i} − *β*_{¬i}).

These quantities vanish when the agent avoids the environment-dependent fitness differences in *y*_{i} by transitioning to the fixed-fitness state *x*_{1} (*λ*_{i} = 0), or when the next-step fitness does not depend on *y* (*β*_{0} = *β*_{1}). Notice the symmetry with the expression for *ρ*_{*}(*x*_{0}, *y*_{i}) above.

Finally, we can calculate the environment adaptivity *α*(*x*_{0}) using Equation 5:

*α*(*x*_{0}) = *γ*_{0} *γ*_{1} (*λ*_{0} − *λ*_{1}) (*β*_{0} − *β*_{1}).

Despite the factor *γ*_{¬i} in *ρ*(*x*_{0}, *y*_{i}) and *σ*(*x*_{0}, *y*_{i}), the quantity *α*(*x*_{0}) has highest magnitude when *γ*_{0} = *γ*_{1}.

### 4.2 Adaptation to Environment in a Protocell Model

As a more elaborated example, we applied our measure to a simulated protocell model by Agmon, Gates, and Beer (henceforth, AGB), used recently in the modeling of adaptation to perturbations [1]. This is a two-level hierarchical model simulated at two different time scales; the life span is defined in terms of time steps at the higher level of this model. The lower-level model is a deterministic reaction-diffusion-repulsion system that has a variety of attractor states; these attractor states resemble a central reservoir surrounded by a chemical membrane. In each higher-level time step, an attractor state is subjected to one of several different instantaneous perturbations, and then permitted to go to equilibrium;^{3} transitions at the higher level of dynamics are therefore from one lower-level stable state *x* to another lower-level stable state *x*′. By placing a probability distribution over the perturbations, the model can be formulated as an irreducible discrete Markov process. For technical details about the model we refer the reader to the original articles [1, 2].

In Agmon et al. [1], the authors simulated point perturbations to two different chemical species, at three different magnitudes, in nine distinct locations, for a total of 72 distinct perturbations. Beginning with a single initial stable micro state, they computed a network of other micro states reachable from this initial state by some series of perturbation-relaxation cycles. They arbitrarily terminated their search at 16 cycles, discovering 267 distinct attractors in total within this distance (including the death state), 113 of which had successors that were not computed. For the purposes of calculating viability, they assumed that the uncomputed successors of all 113 of these attractors were exactly the death state.

This produced a graph with 267 nodes, whose edges were labeled with one of the 72 perturbations. AGB provided us with the data for this graph; we used these data without running the underlying simulation.

We can apply our measures to the AGB model by supposing that the protocell finds itself in one of two different possible environments, each with a different probability distribution over possible perturbations. We can then ask whether the protocell tends to adapt to these different environments by shifting into states that are more robust to the particular set of perturbations found in its environment. In particular, we may consider how a protocell fares in a particular environment, after being exposed to the perturbations from that environment, compared to how it would fare if exposed to perturbations from a random (other) environment. This contrasts with Agmon et al.'s approach, which features only a single unified distribution over perturbations.

We considered two different environments *y*_{Mem} and *y*_{Aut} for the protocell, corresponding respectively to uniformly distributed perturbations of the membrane chemical, and uniformly distributed perturbations of the autocatalyst chemical. The protocell was assumed to begin in each environment with equal probability 0.5, and we identified the fitness *F*(*x*, *y*) of a state *x* in environment *y* as the expected life span *L*(*x*, *y*) of that state in that environment, that is, the expected number of transitions from *x* to the dead state when all perturbations are drawn from environment *y*.
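The fitness used here, the expected life span *L*(*x*, *y*), is the mean number of perturbation-relaxation steps before absorption in the dead state. For a fixed environment *y* it satisfies the recurrence *L*(*x*, *y*) = 1 + ∑_{x′} ℙ(*x*′|*x*, *y*) *L*(*x*′, *y*), with the sum over non-dead successors, which can be solved, for instance, by fixed-point iteration. A small sketch with an invented three-state chain (two live states plus a dead state; these transition numbers are not the protocell's actual data):

```python
# Expected life span (mean steps to absorption) for a toy absorbing chain.
# Q[i][j] = P(next state j | state i) restricted to the two non-dead states;
# the missing row mass is the probability of dying.  Numbers are invented.
Q = [[0.5, 0.3],   # from state a: stay 0.5, to b 0.3, die 0.2
     [0.2, 0.5]]   # from state b: to a 0.2, stay 0.5, die 0.3

def expected_lifespan(Q, iters=10000):
    """Iterate L = 1 + Q L to its fixed point; this converges because the
    chain is absorbing, so the iteration map is a contraction."""
    n = len(Q)
    L = [0.0] * n
    for _ in range(iters):
        L = [1.0 + sum(Q[i][j] * L[j] for j in range(n)) for i in range(n)]
    return L

L = expected_lifespan(Q)
print(round(L[0], 4), round(L[1], 4))  # exact solution: 0.8/0.19, 0.7/0.19
```

Each environment induces its own sub-stochastic matrix *Q* (one distribution over the 72 perturbations per environment), so the same state *x* generally has different expected life spans in *y*_{Mem} and *y*_{Aut}.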

#### 4.2.1 Measuring *α*, *ρ*_{*}, and *σ*_{*}

Because we are considering two fixed environments *y*_{Mem} and *y*_{Aut} that don't change over time, our measures *α*, *ρ*_{*}, and *σ*_{*} become trivial to calculate, and they take on intuitively straightforward meanings. Writing ȳ to denote the environment that is not *y*, we have

*ρ*_{*}(*x*, *y*) = ∑_{x′} [ℙ(*x*′|*x*, *y*) − ℙ(*x*′|*x*, ȳ)] *L*(*x*′, *y*),

*σ*_{*}(*x*, *y*) = ∑_{x′} ℙ(*x*′|*x*, *y*) [*L*(*x*′, *y*) − *L*(*x*′, ȳ)].

That is, *ρ*_{*}(*x*, *y*) is equal to the difference between the expected life span of *x*′ in *y* and the expected life span *x*′ would have in *y* if the transition *x* → *x*′ were driven by a perturbation from the other environment ȳ rather than from *y*. Similarly, *σ*_{*}(*x*, *y*) is equal to the difference between the expected life span of *x*′ in *y* and the expected life span *x*′ would have in ȳ, both based on a transition *x* → *x*′ driven by a perturbation from *y*.

Finally, we can calculate the expected life span for *x*′ in a randomly chosen environment *y* after a transition *x* → *x*′ driven by a perturbation from the same environment *y*, and compare it with the expected life span after a perturbation from the other environment ȳ. The difference between these two expected life spans is equal to 2*α*(*x*). The left-hand graph in Figure 3 plots these two values against each other, showing clearly that almost all states (94%) have a positive overall environment adaptivity. The average *α* value for all (non-dead) states was 0.28.

The right-hand graph in Figure 3 provides some further information: Those states with the highest *α* are the ones with the highest expected life span in the membrane perturbation environment *y*_{Mem} (*r* = 0.95, Pearson's product–moment correlation). Speaking roughly, autocatalyst perturbations appear to be more disruptive than membrane perturbations (as can be inferred from the same graph: Expected life spans in *y*_{Aut} are overall lower than in *y*_{Mem}). Hence, we can expect states with a high life span under repeated membrane perturbations to be the ones that benefit most (on average over both environments) from non-abnormal perturbations.

Figures 4 and 5 show the expected next life span of (non-dead) states *x* in the *y*_{Mem} and *y*_{Aut} environments, depending on which environment the next perturbation is drawn from. Figure 4 arranges this data to illustrate the *ρ*_{*} comparison, and Figure 5 arranges it to illustrate the *σ*_{*} comparison. Note that even though a regime of autocatalyst perturbations is more disruptive than a regime of membrane perturbations, a modest majority (57.1%) of states fare better after suffering a single autocatalyst perturbation than a single membrane perturbation, if they are to be subjected to a regime of autocatalyst perturbations (these are the blue and white circles on the right-hand graph in Figure 4).

A minority of states (white circles) are causally responsive to, or benefit selectively from, perturbations in both environments: These are the states that fall underneath the identity line on both left- and right-hand graphs in Figures 4 and 5. The majority of states, for both measures *ρ*_{*} and *σ*_{*}, have a positive value in one environment *y* and a negative value in the other environment ȳ. For instance, the red (blue) circles in Figure 4 represent states that “prefer” membrane (autocatalyst) perturbations respectively, regardless of which environment their viability is actually evaluated in. Similarly, the red (blue) circles in Figure 5 represent states that “prefer” repeated future perturbations to affect the membrane (autocatalyst) chemicals respectively, regardless of what sort of perturbation they happen to receive in the current time step.

Summarizing our findings in the protocell model: Autocatalyst perturbations are substantially more disruptive than membrane perturbations, but despite this asymmetry we measure positive environment adaptivity in 94% of states. We can also break down the environment effects and observe that the majority of states are suited to one specific environment (i.e., *ρ*_{*} and *σ*_{*} are positive in one environment and negative in the other). In the case of membrane perturbations, states with the highest expected life span under repeated perturbations are also the most adaptive.

## 5 Discussion

### 5.1 Fitness Improvement Relative to Counterfactual Scenarios

We have proposed that organism-level adaptation must be understood

1. in terms of the organism's relationship to its environment, and

2. within a context of other possible environments that the organism could have been in, that is, in terms of fitness improvements *relative to counterfactual scenarios*.

In our framework, an adaptive organism is one that tends to use information from its environment to tailor its internal state for that specific environment. Our causal responsivity measure *ρ*(*x*, *y*) quantifies the portion of an organism's fitness (in state *x*) that depends on receiving information from its specific environment *y*, while our selective benefit measure *σ*(*x*, *y*) quantifies the portion of an organism's fitness that comes from its next state being appropriate to the particular environment *y*. Considered over all typical environments, these two measures have the same average *α*(*x*), which we have identified with the environment adaptivity of state *x*.

Note that, according to our framework, even an organism whose fitness falls steadily over time (as in the case of a dying organism) can still be adaptive to its environment. A high *α* value in such a case corresponds to an organism whose fitness would fall even faster were it continually switched between random environments.

### 5.2 Adaptivity and Causal Interventions

This article has made use of Pearl's causal Bayesian network formalism [15] for our definitions in Section 3.2. Since we consider interventions directly on the variables *X*′, *Y*′ that feature in the fitness evaluation *F*(*X*′, *Y*′), no causal model is strictly necessary, and we can express our measures directly in terms of ordinary observational probabilities.

However, we have introduced the causal formalism because it helps to clarify the intuitive interpretation of our measures, and because of the important role that causation appears to play in formal treatments of cognition. For instance, Ortega and Braun [13] argue that for a system to be able to take desirable actions, it must regard its own decision-making process as causally independent from the physical world.

Our quantity *α*(*x*) is related to Ay and Polani's causal information flow [3], but has some important differences. We use a utility function *F* that allows us to distinguish between adaptive and maladaptive variation; causal information flow could tell us whether the environment affected the agent's state, but not whether it was beneficial or not. Along these lines, *α*(*x*) is also related to the “value of information” measures introduced by Howard [7]. In contrast to causal information flow, we also make finer-grained distinctions by considering what happens when intervened variables are set according to specific conditional distributions.

We prefer the use of causal Bayesian networks to observational proxies for causation such as transfer entropy [16] (and the special case of Granger causality [4]) because the intervention-based formalism more closely captures the semantics of causal relationships, giving correct results when observational measures fail [3].

### 5.3 Action as Information Flow from Agent to Environment

The definitions of *ρ*, *σ*, and *α* are no different in the general sensorimotor closed-loop case. In this article, we have only considered the effects of causal links from the environment: either to the agent (in the case of *ρ*(*x*, *y*)), or within the environment (in the case of *σ*(*x*, *y*))—recall the graph in Figure 1. However, a benefit of our framework is that we can also consider similar quantities measuring the efficacy of the agent's actions and internal state transitions. One could use the same method to determine the portion of fitness that depends on causal links from the agent, by considering an analogue of *ρ*(*x*, *y*) in which the intervention is applied to the agent's current state, and an analogue of *σ*(*x*, *y*) in which the intervention is applied to the agent's next state. Notice the symmetry of these expressions compared to the originals defined in Sections 3.2 and 3.3.

We believe this constitutes a promising research direction stemming from this work. Future research should clarify the meaning of these fitness effects of agent-to-environment information flows, and their implications for theories of action and agency.

## 6 Conclusion

We build on information theory and causal intervention theory to develop a set of new tools to study within-lifetime adaptivity in artificial and biological systems. What distinguishes our approach from previous work in the field is that we measure adaptivity by comparing an organism's fitness in its actual environment with its fitness in other counterfactual environments, instead of tracking the organism's fitness over time. In particular, we define two closely related measures—causal responsivity *ρ* and selective benefit *σ*—that quantify specific aspects of the causal information flow in the agent-environment system and how they affect the agent's expected fitness. In this framework, expected adaptivity *α* comes in as a natural way of quantifying the overall fitness effects of the causal links between agent and environment.

We illustrate our framework by giving specific numerical examples with a simulated protocell model. This constitutes a practical example of how to use the proposed measures to study the adaptivity of an organism in a variety of states and environments.

In summary, we advocate a view of adaptivity as a causal property of the agent-environment system. The contribution of this article is to propose measures of this kind, relying on causal interventions and counterfactual reasoning. These measures are shown to be of practical interest and to be solidly grounded in theory.

## Acknowledgments

The authors would like to thank Eran Agmon for providing parameters for the simulation of the protocell model used in Section 4. We would also like to thank anonymous reviewers who provided helpful feedback, and the editor of *Artificial Life* for permitting an unconventional submission in the form of dual companion articles.

## Notes

Both articles are found in this issue of *Artificial Life*.

The derivation of this expression is provided in the Appendix, sections A.2 and A.3.

They do not report finding any non-point attractors.

## References

## Appendix. Causal-Probabilistic Interpretations and Proofs

#### A.1 Mathematical Prerequisites

In order to formalize the ideas described in Section 2, some mathematical tools are required. Although we covered the basic intuitions in the text without appealing to mathematical details, we now describe those tools and make our formalism more precise.

Our proposed approach is best understood within Ay and Polani's information agent framework, which uses tools from causal probability and information theory to characterize typical properties of an agent's interactions with its environment in terms of information flow. This article is not meant as a comprehensive introduction to that framework, so only the key concepts will be summarized here.

Briefly, the information agent framework considers the statistical properties of agent and environment dynamics over a range of counterfactual possible situations. In contrast to well-known sample statistics, such as Pearson's product–moment correlation, which are simple to compute but fail to capture complex nonlinear relationships, the information agent framework considers information-theoretic quantities, which are maximally general (and correspondingly harder to compute or estimate from samples).
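The contrast with simple sample statistics can be made concrete with a standard example (our own illustration, not from the article): a deterministic but nonlinear dependence that has zero Pearson correlation yet high mutual information.

```python
import math

# X uniform on {-1, 0, 1}; Y = X**2 is a deterministic but nonlinear
# function of X (a toy example of ours, not from the article).
xs = [-1, 0, 1]
p = 1.0 / 3.0

# Pearson correlation vanishes: the relationship is symmetric about X = 0.
ex = sum(p * x for x in xs)
ey = sum(p * x * x for x in xs)
cov = sum(p * (x - ex) * (x * x - ey) for x in xs)
var_x = sum(p * (x - ex) ** 2 for x in xs)
var_y = sum(p * (x * x - ey) ** 2 for x in xs)
corr = cov / math.sqrt(var_x * var_y)

# Mutual information does not vanish: since Y is a function of X,
# H(Y|X) = 0, so I(X; Y) = H(Y), with P(Y=0) = 1/3 and P(Y=1) = 2/3.
p_y = {0: 1.0 / 3.0, 1: 2.0 / 3.0}
mi = -sum(q * math.log2(q) for q in p_y.values())

print(corr, mi)
```

Here the correlation is exactly 0 while *I*(*X*; *Y*) ≈ 0.918 bits, illustrating why the information agent framework favors information-theoretic quantities despite their greater estimation cost.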

Consider an agent-environment system whose state at an initial time *t*_{0} is described by the probability distribution ℙ(*X*_{0} = *x*, *Y*_{0} = *y*). Let's assume there is some (possibly stochastic) “physical law” governing the agent in its environment, which can be described by a probability function ℙ(*X*_{i+1} = *x*′, *Y*_{i+1} = *y*′ | *X*_{i} = *x*, *Y*_{i} = *y*). We may naturally ask questions about the typicality of the agent's and environment's possible states *X*_{1}, *Y*_{1} at the next time step *t*_{1}; these will correspond to applying the (possibly stochastic) “physical law” to the agent-environment state, and averaging our results over the initial typicality values:

ℙ(*X*_{1} = *x*′, *Y*_{1} = *y*′) = ∑_{x,y} ℙ(*X*_{1} = *x*′, *Y*_{1} = *y*′ | *X*_{0} = *x*, *Y*_{0} = *y*) ℙ(*X*_{0} = *x*, *Y*_{0} = *y*).

From such distributions we can compute quantities such as *I*(*X*_{i}; *Y*_{i+1}), the amount of information the agent's current state tells us about its next environment state; or *I*(*X*_{i+1}; *X*_{i} | *Y*_{i} = *y*), the amount of additional information the agent's current state provides about its next state if we already know that the current environment is in state *y*.

In general, agents simultaneously adapt themselves to their environment and adapt their environment to themselves. To use Friston's distinction, agents perform both inference and action [6]. Both of these processes tend to increase the mutual information between agent state and environment state, so it might seem that it would be difficult to disentangle them using purely statistical tools.^{4}

Within standard probability theory, this is true. However, the development of causal probability theory by Pearl [14] makes it possible to identify the direction of causal effects. Pearl's innovation was the concept of an *intervention*: an externally imposed change of a variable. Informally, when an intervention *X̂* = *x* is applied to a probability space, the term *X* in the equations for the probabilities of all other variables is replaced with the constant *x*.

This allows us to write expressions like ℙ(*X*_{i+1} = *x*′|*Ŷ*_{i} = *y*): the probability that the agent will next be in state *x*′, if we externally force the current environment into state *y* without directly affecting anything else. This is usually not equal to ℙ(*X*_{i+1} = *x*′|*Y*_{i} = *y*), that is, the probability that the agent will next be in state *x*′ if we happen to observe *y*.
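The observational/interventional gap can be checked numerically in a minimal confounded network (our own toy construction, not from the article): a hidden common cause *Z* drives both a reading *Y* and the agent's next state, so observing *Y* is informative about the agent while forcing *Y* is not.

```python
# Toy confounded network (our own example): hidden common cause Z drives
# both the environment reading Y and the agent's next state X'.
#   Z ~ Bernoulli(0.5);  Y = Z with probability 0.9;  X' = Z.
p_z1 = 0.5
fidelity = 0.9                       # P(Y = Z)

# Observational: P(X'=1 | Y=1) via Bayes. Seeing Y=1 makes Z=1 likely,
# and X' tracks Z, so the conditional probability is high.
p_y1_given_z = {0: 1.0 - fidelity, 1: fidelity}
p_y1 = sum(p_y1_given_z[z] * (p_z1 if z == 1 else 1.0 - p_z1) for z in (0, 1))
p_x1_obs = p_y1_given_z[1] * p_z1 / p_y1        # = P(Z=1 | Y=1)

# Interventional: P(X'=1 | do(Y=1)). Forcing Y severs its link from Z,
# so the intervention carries no information about Z, and X' keeps its prior.
p_x1_do = p_z1

print(p_x1_obs, p_x1_do)
```

In this network ℙ(*X*′ = 1 | *Y* = 1) = 0.9 while ℙ(*X*′ = 1 | *Ŷ* = 1) = 0.5, exactly the distinction between observing *y* and externally forcing it.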

#### A.2 Interventionized Expectations

We now define the *interventionized expectation* of some random variable *V*. These allow us to consider the causal effects of particular interactions on fitness, by imagining what would happen if the interaction were erased by an external intervention of some sort. We will require that *V* be produced by some causal stochastic mechanism, that is, that *V* is a node in a causal Bayesian network [15] *G*. We'll write the ordinary expectation operator, averaging over the distribution *P*(*V*|*a*), as 𝔼_{a}[*V*]:

𝔼_{a}[*V*] = ∑_{v} *v* *P*(*V* = *v* | *A* = *a*).

This is the expected value of *V*, given the observation *A* = *a*, when the stochastic process *q* described by *G* proceeds autonomously (i.e., without any intervention). Consider another variable *B* in *G*. Imagine that we were to intervene so as to set *B̂* = *b* via some independent external stochastic mechanism *m*, inducing another process *q*′. Clearly, the statistics of *V* under such circumstances will, in general, depend on the details of *m*.

Ay and Polani consider such a scenario in order to define conditional causal information, and stipulate that the distribution over intervened values must match the marginal observed under *q* (in our case the conditional marginal *P*(*B*|*a*)). However, we will in general want to consider the case in which *P*($B^$) varies according to some other (conditional) marginal *P*(*B*|*c*).

We write the expectation of *V* under this scenario as 𝔼_{a}^{*B̂*|c}[*V*] (pronounced “the expectation of *V* given *a*, but setting *B* according to *c*”):

𝔼_{a}^{*B̂*|c}[*V*] = ∑_{b} *P*(*B* = *b* | *C* = *c*) ∑_{v} *v* *P*(*V* = *v* | *A* = *a*, *B̂* = *b*).

In other words: if the causal links to *B* from its parents were severed, and instead a value *b* were externally imposed, by an external mechanism whose statistics match the original marginal distribution over *B* (conditioned on the observation *C* = *c*), then what would be the expectation of *V* (conditioned on the observation *A* = *a*)?
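This recipe, drawing the forced value of *B* from the observed conditional *P*(*B*|*c*) and averaging the resulting do-expectations, can be sketched in a few lines (a toy discrete model of our own; the mechanisms and names are illustrative assumptions):

```python
# Sketch of an interventionized expectation E_a^{B-hat|c}[V] in a toy
# discrete model (our own construction). For each forced value b we take
# the expectation of V in the mutilated network where B's parents are cut,
# then average under the observed conditional marginal P(B | c).

def e_v_do_b(b):
    # Mutilated-network expectation E[V | do(B=b)]. In this toy model
    # V = 2*B + 1 deterministically, so the do-expectation is 2*b + 1.
    return 2 * b + 1

p_b_given_c = {0: 0.25, 1: 0.75}     # assumed conditional marginal P(B | c)

# Interventionized expectation: mixture of do-expectations under P(B | c).
e_interv = sum(p_b_given_c[b] * e_v_do_b(b) for b in p_b_given_c)
print(e_interv)
```

With these numbers the mixture gives 0.25 × 1 + 0.75 × 3 = 2.5; in a network with confounding, this value would generally differ from the purely observational expectation 𝔼[*V* | *B*].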

#### A.3 Connections with *ρ*, *σ*, and *α*

In this section we draw connections among the three measures of adaptation we have proposed; these connections are particularly illuminating and straightforward when formulated in terms of causal probability theory and the interventionized expectations introduced above.

Using the interventionized expectations defined above, both *ρ*(*x*, *y*) and *σ*(*x*, *y*) can be rewritten in a common form, which makes it explicit that they are both expectations of the same quantity under interventions on different variables. Averaging over *y* using *P*(*Y*|*x*) then yields *α*(*x*), which can also be conveniently written in terms of interventionized expectations.

#### A.4 Connections with Mutual Information

In this section we draw a few useful connections between our adaptivity measures and (conditional) mutual information as commonly used in standard information theory.

The condition *I*(*Y*; *X*′|*x*) = 0 states that knowing the value of *Y* gives us no additional information about *X*′ when we already know *x*—that is, that *Y* is conditionally independent of *X*′ given that *X* = *x*. It can be shown that the condition *I*(*Y*; *X*′|*x*) = 0 implies that *ρ*(*x*, *y*) = 0 for all *y*. Using ⊥ to denote the conditional independence operator, we have

*I*(*Y*; *X*′|*x*) = 0 ⟺ (*Y* ⊥ *X*′ | *X* = *x*) ⟹ *ρ*(*x*, *y*) = 0 for all *y*.

Similarly, if *I*(*X*′; *Y*′|*x*) = 0, then ℙ(*X*′, *Y*′|*x*) = ℙ(*X*′|*x*)ℙ(*Y*′|*x*), and it follows that *α*(*x*) = 0.

Note that for the conditions above the converse does not hold: For example, when *F*(*x*′, *y*′) = *k* for all *x*′, *y*′ (i.e., fitness is constant regardless of organism-environment state), then *α*(*x*) must equal 0 even if *I*(*X*′; *Y*′|*x*) > 0.
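The constant-fitness counterexample is easy to verify numerically (again in a toy model of our own, not the article's): even when the agent's next state is perfectly dependent on the environment, a constant *F* makes the actual and counterfactual expected fitness coincide.

```python
# Converse-failure check (our own toy model): with constant fitness F = k,
# actual and counterfactual expected fitness coincide, so alpha = 0 even
# though X' and Y' are perfectly dependent.
k = 3.0
p_y = {0: 0.7, 1: 0.3}               # distribution over environments P(Y)

def F(x_next, y_next):
    return k                         # constant fitness, independent of state

# The agent copies its environment, so X' and Y' are maximally dependent.
actual = sum(p_y[y] * F(y, y) for y in p_y)
counterfactual = sum(p_y[y] * p_y[y_hat] * F(y_hat, y)
                     for y in p_y for y_hat in p_y)
alpha = actual - counterfactual
print(alpha)
```

Here *I*(*X*′; *Y*′|*x*) is positive (the agent tracks the environment exactly), yet *α* evaluates to 0 because no variation in fitness exists for the causal links to explain.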

#### A.5 Normalized Measures

The normalized measures involve the distribution ℙ(*y*′|*x*, *Y* ≠ *y*). To calculate *ρ*_{*}(*x*, *y*), we first use this expression to expand the corresponding interventionized expectation. Since *Y*′ is independent of *X*′ given *X* and *Y*, we have ℙ(*x*′|*x*, *y*, *y*′) = ℙ(*x*′|*x*, *y*), which allows us to simplify the resulting expression.