## Abstract

One important sense of the term “adaptation” is the process by which an agent changes appropriately in response to new information provided by environmental stimuli. We propose a novel quantitative measure of this phenomenon, which extends a little-known definition of adaptation as “increased robustness to repeated perturbation” proposed by Klyubin (2002). Our proposed definition essentially corresponds to the average value (relative to some fitness function) of state changes that are caused by the environment (in some statistical ensemble of environments). We compute this value by comparing the agent's actual fitness with its fitness in a counterfactual world where the causal links between agent and environment are disrupted. The proposed measure is illustrated in a simple Markov chain model and also using a recent model of autopoietic agency in a simulated protocell.

## 1 Introduction

How might we understand the notion of adaptation? We propose a definition, inspired by Klyubin's definition of adaptation as “increased robustness to repeated perturbation” [8, p. ii]. Our approach essentially formalizes the notion of an organism displaying beneficial changes in response to interactions with its environment. To provide a mathematically rigorous framework, we use Pearl's theory of causal probability [14] and the information agent formalism of Ay and Polani [3], although prior knowledge of these is not needed to understand the intuitions in this article. The mathematically inclined reader can find more details in the Appendix.

The key intuition behind our approach is the following: Comparing the effect of an organism's action with the effect of other actions it could have taken is much more straightforward than trying to measure what its actions do to promote continued viability. Therefore, our approach to measuring adaptivity relies on the relative fitness gain an organism obtains by responding to a stimulus in a way that is particularly beneficial in its current environment, compared to how it would have responded had it been in another environment. This essentially corresponds to drawing a distinction between adaptation to a specific environment and overall improvement. Furthermore, we distinguish two ways in which an organism can be adaptive to its environment, both of which differ from diachronic improvement.

In this article we provide mathematical tools to quantify various adaptivity measures following this paradigm, illustrate them in a recent model of a protocell, and discuss what benefits and insights can be obtained about the system using this approach.

These measures are formulated in an abstract manner, and are easily applicable to systems in other fields within artificial life research. For example, these tools can be used to measure the benefits of phenotypic plasticity in a given organism, or the adaptivity of certain behaviors exhibited by such an organism. Importantly, our approach moves beyond the standard survival-normative accounts of adaptivity and allows researchers to operationalize their definitions of fitness in any way that is suitable to study their system of interest.

### 1.1 Companion Articles

1. “Adaptation Is Not Just Improvement over Time” critiques an influential view of adaptation, under which an organism can be mechanically constituted so as to improve its future prospects over time.

2. “Measuring Fitness Effects of Agent-Environment Interactions” proposes a quantitative measure for the degree to which an agent's internal dynamics capture beneficial information from its sensory stimuli.

The two articles can be read independently, but deal with closely related themes.1

### 1.2 Structure of the Article

Section 2 introduces our approach to defining adaptation during an organism's lifetime: effectively, a quantification of the benefit the organism derives from responding to a specific environment. We briefly summarize the mathematical tools we are using (the causal information agent framework) and provide formal definitions in Section 3; and we apply them to some examples in Section 4, including a protocell model used recently to investigate a different notion of adaptation (discussed in more detail in our companion article). In Section 5 we briefly recap our proposal and relate it to other work in subjectivity, causality, and agency. Section 6 concludes the main body of the article.

## 2 Our Concept of Adaptation

The word “adaptive” has a variety of meanings in the scientific context. In particular, it refers to a specific technical concept within evolutionary theory, which differs from how the term is used informally in the non-evolutionary sciences. As a disclaimer, we note that here we address the problem of measuring within-lifetime adaptation, and we do not make any claims about adaptation in evolutionary biology. The companion article provides a fuller discussion of notions of lifetime adaptation in artificial life science.

Before we dive into the mathematical definitions, in this section we walk through an example that displays the basic reasoning behind our concept of adaptation. Although the organism in this example is arguably more complex than the organism we analyze later in Section 4, it serves as a good illustration of our intuitions.

### 2.1 The Rat in the Maze

Consider a rat navigating an unfamiliar maze M1. The rat will get better at navigating this particular maze (and, indeed, mazes in general) as a consequence of exposure to the maze. We may be tempted to equate behavioral adaptation with a diachronic change in the animal's “objective” future prospects (i.e., a difference between the organism's viability at different time steps), but—as we argue in our companion article—this invites some mathematical pathologies, since a fully objective perspective would foresee the rat's adaptive capacities and factor them accurately into any evaluation of its future prospects.

Instead, we propose a different view of adaptation that is based on counterfactual reasoning and provides a more nuanced account of adaptive phenomena. This view is inspired by the definition of adaptation as “increased robustness to repeated perturbation” [8, p. ii, 9], which seems to capture an important aspect of adaptive processes. In particular, what this definition refers to is the agent's molding itself to a specific, temporally extended aspect of the environment. Let us consider two different aspects of this molding-to-environment phenomenon:

1. The organism's tendency to respond to a stimulus from its actual environment, in ways that are better suited to that environment than its responses to stimuli from other possible environments.

2. The organism's tendency to respond to a stimulus from its actual environment, such that its response is better suited to that environment than to other possible environments.

We may use this insight to formulate a more general notion of adaptation and show it, in general, to be closely related to notions of information processing. The cost of this approach is that it renders adaptation a matter of theoretical perspective, since adaptation is considered relative to a range of possible counterfactual environments; and this range of environments is arbitrarily determined by the explanatory needs of the theorist.

### 2.2 Causal Responsivity

This subsection illustrates the first aspect of molding to environment: By responding to stimuli from its environment, the rat improves its prospects within its actual environment relative to what they would be if it had been responding to stimuli from a different environment.

Suppose the rat, while familiarizing itself with maze M1, were to be magically transported to a different maze M2 for a short time interval T, and then transported back to maze M1. The rat would be confused by the unexpected stimuli it had encountered during T, and would require more time to reorient itself in maze M1.

However, this would not be the case if the local configuration of maze M2 were identical, over T, to that of maze M1: In such a case, the rat would be unable to distinguish between the two mazes, and its performance in maze M1 would be unaffected. We would not want to say that, over T, the rat had adjusted specifically to maze M1 if the rat's state changes would have been identical in maze M2.

### 2.3 Selective Benefit

This subsection illustrates the second aspect of molding to environment: By interacting with its environment, the rat improves its prospects within its actual environment relative to its imagined prospects in another, counterfactual environment. We can illustrate this as follows.

Suppose the rat, after familiarizing itself with maze M1, is magically transported to a different maze M2. There was some change in the rat that occurred through interactions with maze M1. This adaptation produced a propensity for behavior well suited for maze M1, but less so for maze M2; indeed, the rat may now fare worse in maze M2 than if it had never been exposed to maze M1. (Imagine, for example, a case in which the spatial layout of M1 and M2 is identical, but the exit or goal in M2 is diametrically opposed to the one in M1. In this case, when transported to M2 the rat would head in the wrong direction and end up further away from the goal.)

Imagine, for contrast, a scenario in which the rat has been allowed to familiarize itself with both maze M1 and maze M2, and has now been drugged. The amount of time taken to traverse maze M1 will improve over time, just as for a naive rat that familiarizes itself over time with maze M1. However, the reasons will be importantly different: In the new case, the rat's performance improves because the drugs are wearing off, and not because its behavior is becoming more tailored to maze M1 than to other possible environments. Indeed, as the drugs wear off, the same improvements will be observed in maze M2. This is still improvement, but we would not call it adaptation to M1, because the changes are not specific to M1.

## 3 Quantifying Fitness Effects of Agent-Environment Interactions

### 3.1 Preliminaries

The measures of adaptivity we propose are inspired by and grounded in causal probability [15] and information theory [3], although detailed prior knowledge of these is not necessary to understand the intuitions behind this work. Here we describe the measures, and leave the technical details to the Appendix.

Imagine an ensemble of agents at time t0, with a variety of different internal states and in a variety of different environments. We can represent this ensemble by a joint probability distribution over random variables X, the agent's state at time t0, and Y, the environment's state at time t0. The probability ℙ(X = x, Y = y) will correspond to the degree to which we consider the pair x, y to be typical of this class of agents in this class of environments. At time t1, after one (discrete) time step, the state of the agent transitions to X′ and the state of the environment to Y′.

The state of the environment will in general affect the organism's dynamics, and vice versa. These dependences can be compactly represented in the causal graph (or causal Bayesian graph) shown in Figure 1. While causal graphs have a precise mathematical definition [15], for our purposes here it suffices to say that a link from A to B implies that in general B will respond to changes in A when everything else is held constant. Then, reading the graph, we can tell that in general X′ and Y′ depend on X and Y and are distributed as ℙ(x′, y′|x, y) = ℙ(x′|x, y) ℙ(y′|x, y). We assume that causal relations apply strictly forwards in time, and are temporally local, that is, that variables at each time step are directly causally affected only by variables at the immediately preceding time step.
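This factorization can be checked numerically. The sketch below is a minimal illustration, assuming numpy; the transition kernels are arbitrary random stand-ins for the mechanisms ℙ(x′|x, y) and ℙ(y′|x, y), not taken from any particular model.

```python
import numpy as np

# Hypothetical two-state agent and two-state environment; random stand-ins
# for the mechanisms P(x'|x, y) and P(y'|x, y).
rng = np.random.default_rng(0)
P_agent = rng.random((2, 2, 2))                  # indexed [x, y, x']
P_agent /= P_agent.sum(axis=-1, keepdims=True)   # normalize over x'
P_env = rng.random((2, 2, 2))                    # indexed [x, y, y']
P_env /= P_env.sum(axis=-1, keepdims=True)       # normalize over y'

# Joint next-state kernel implied by the causal graph of Figure 1:
# P(x', y' | x, y) = P(x'|x, y) * P(y'|x, y).
P_joint = P_agent[:, :, :, None] * P_env[:, :, None, :]  # [x, y, x', y']
```

Each slice `P_joint[x, y]` is then a proper joint distribution over the next agent-environment state (x′, y′).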

Figure 1.

Causal Bayesian graph showing the causal relations between the agent-environment's current state X, Y and the agent-environment's next state X′, Y′. Blue arrow shows causal effect of Y on Y′, and red arrow shows causal effect of Y on X′.


Finally, we must specify with respect to what metric the organism is adapting or improving. Treatments of adaptation [5, 8] in the artificial life literature often adopt a survival-normative approach, assuming that the only event relevant to an organism's concerns is its own survival or death [10]. However, we will relax this requirement and allow the organism's fitness to be an arbitrary function F : 𝒳 × 𝒴 → ℝ of the organism-environment state.

### 3.2 Causal Responsivity

Let us formulate a quantitative operationalization of the first aspect of molding to environment introduced in Section 2.1. Suppose we observe one transition (x, y) → (x′, y′). The new fitness F(x′, y′) is a sample from some distribution with a population mean of
$\sum_{x',y'} F(x',y')\,\mathbb{P}(X'=x', Y'=y' \mid X=x, Y=y).$
(1)
If the transition x → x′ would have been equally likely in any environment, it does not seem reasonable to identify (x, y) → (x′, y′) as a specific adaptation to y (see Section 2.2). Let's therefore imagine that we use some random environment y*, consistent with x, to generate the transition (x, y*) → (x*, y*), and evaluate the fitness F(x*, y′). This is equivalent to imagining that, when making the transition x → x*, the agent is influenced by a random environment y* instead of its actual environment y, and the fitness F is evaluated on the true state of the environment y′ (i.e., the equivalent of transporting the rat from our previous example from maze M1 to maze M2 for a short time interval and then measuring its viability in M1). The fitness F(x*, y′) is then a sample drawn from a distribution with population mean
$\sum_{x',y',x^{*\prime}} F(x',y')\,\mathbb{P}(X'=x', Y'=y' \mid X=x, Y=y, \hat{X}'=x^{*\prime})\,\mathbb{P}(X'=x^{*\prime} \mid X=x) = \sum_{x',y',x^{*\prime}} F(x',y')\,\delta_{x^{*\prime} x'}\,\mathbb{P}(y' \mid x, y)\,\mathbb{P}(x^{*\prime} \mid x) = \sum_{x',y'} F(x',y')\,\mathbb{P}(y' \mid x, y)\,\mathbb{P}(x' \mid x),$
(2)
where δ is the Kronecker delta function. Intuitively, ℙ(y′|x, y) is the standard transition probability of the environment and, importantly, ℙ(x′|x) = ∑y* ℙ(x′|x, y*)ℙ(y*|x) is the overall probability that x transitions into x′ averaged across all environments y*.2 We define the causal responsivity ρ(x, y) of a state x in an environment y as the difference between Equations 1 and 2:
$\rho(x,y) = \sum_{x',y'} F(x',y')\,\bigl(\mathbb{P}(x',y' \mid x,y) - \mathbb{P}(y' \mid x,y)\,\mathbb{P}(x' \mid x)\bigr).$
(3)
We can now see how this quantity corresponds to the first aspect of molding to environment in Section 2.1—an organism in state x in environment y will have a positive causal responsivity ρ if its response to y results in a higher fitness in y than its response to other environments y* would.

Alternatively, this quantity can be seen as representing the portion of expected fitness that depends on the organism receiving incoming information about its actual environment, rather than some random environment consistent with its state: in other words, how much worse off the organism would be in a counterfactual world with the same organism-environment dynamics but where the red arrow in Figure 1 did not exist. In fact, in the Appendix, section A.4, we show that, as expected, the causal responsivity is zero if there is no information flow along the relevant arrow in the causal graph. Additionally, it is possible for this quantity to be negative, which would indicate that the organism would actually perform better in a scenario where it was receiving information from a random environment—for example, if the organism's response to environment y were particularly harmful to fitness in the same environment y.
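For small discrete state spaces, the counterfactual comparison in Equation 3 is straightforward to compute. The following is a minimal sketch; all arrays and names are our own hypothetical stand-ins, not from the article.

```python
import numpy as np

rng = np.random.default_rng(1)
nx, ny = 3, 2
# Hypothetical ensemble and mechanisms (random stand-ins):
P_y = rng.random((nx, ny)); P_y /= P_y.sum(-1, keepdims=True)         # P(y|x)
P_xp = rng.random((nx, ny, nx)); P_xp /= P_xp.sum(-1, keepdims=True)  # P(x'|x,y)
P_yp = rng.random((nx, ny, ny)); P_yp /= P_yp.sum(-1, keepdims=True)  # P(y'|x,y)
F = rng.random((nx, ny))                                              # fitness F(x',y')

def rho(x, y):
    """Equation 3: expected fitness under the actual coupling, minus expected
    fitness when the agent's transition is driven by a random environment."""
    actual = np.einsum("a,b,ab->", P_xp[x, y], P_yp[x, y], F)
    P_xp_marg = P_y[x] @ P_xp[x]        # P(x'|x) = sum_y* P(x'|x,y*) P(y*|x)
    counterfactual = np.einsum("a,b,ab->", P_xp_marg, P_yp[x, y], F)
    return float(actual - counterfactual)

r = rho(0, 0)
```

As a sanity check, if the agent's transition kernel does not depend on y, the actual and counterfactual terms coincide and ρ vanishes, matching the zero-information-flow result mentioned above.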

### 3.3 Selective Benefit

In the spirit of the second aspect of molding to environment in Section 2.1, we want to identify transitions that are equally beneficial in every environment as global improvements rather than adaptations, in analogy to the rat recovering from being drugged in the example discussed above. Imagine, as in Section 3.2, that in addition to a transition (x, y) → (x′, y′) we generate a transition (x, y*) → (x*, y*) starting in a random environment y* and evaluate the fitness F(x′, y*). This is equivalent to considering how good the agent's actual transition under y would be if the agent were in a random environment y*. We will define the selective benefit σ(x, y) of a state x in an environment y as
$\sigma(x,y) = \sum_{x',y'} F(x',y')\,\bigl(\mathbb{P}(x',y' \mid x,y) - \mathbb{P}(x' \mid x,y)\,\mathbb{P}(y' \mid x)\bigr),$
(4)
which is identical to Equation 3, except that we are considering the relation of F(x′, y′) to F(x′, y*) instead of to F(x*, y′). In other words, an organism will have high selective benefit if its response to environment y is more beneficial in environment y than in a random environment y*.

Selective benefit can be seen as the portion of expected fitness (in the next time step) that depends on the organism being in its actual environment rather than a random environment compatible with its initial state: in other words, how much worse off the organism would be in a counterfactual world where the blue arrow in Figure 1 did not exist.

### 3.4 Environment Adaptivity

With the definitions of ρ(x, y) and σ(x, y) above, we can prove the following interesting property: Both of them have the same expected value when averaged over y. Mathematically,
$\sum_y \rho(x,y)\,\mathbb{P}(y \mid x) = \sum_y \sigma(x,y)\,\mathbb{P}(y \mid x).$
This average yields a quantity α(x), which we call the environment adaptivity of x:
$\alpha(x) = \sum_y \rho(x,y)\,\mathbb{P}(y \mid x) = \sum_y \sigma(x,y)\,\mathbb{P}(y \mid x) = \sum_{x',y'} F(x',y')\,\bigl(\mathbb{P}(x',y' \mid x) - \mathbb{P}(x' \mid x)\,\mathbb{P}(y' \mid x)\bigr).$
(5)
The fact that ρ(x, y) and σ(x, y) yield the same result after averaging over y is somewhat surprising—after all, they were designed to measure two qualitatively different aspects of agent-environment interaction. This makes α(x) a natural candidate for the overall adaptivity of the organism in state x, as it can be interpreted from the two complementary views in Section 2.1.
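The averaging identity is easy to verify numerically. The sketch below, with arbitrary random kernels and hypothetical names of our own, computes ρ, σ, and α for a small system and checks that the two averages coincide with the direct expression for α in Equation 5:

```python
import numpy as np

rng = np.random.default_rng(2)
nx, ny = 3, 2
P_y = rng.random((nx, ny)); P_y /= P_y.sum(-1, keepdims=True)         # P(y|x)
P_xp = rng.random((nx, ny, nx)); P_xp /= P_xp.sum(-1, keepdims=True)  # P(x'|x,y)
P_yp = rng.random((nx, ny, ny)); P_yp /= P_yp.sum(-1, keepdims=True)  # P(y'|x,y)
F = rng.random((nx, ny))                                              # F(x',y')

def rho(x, y):    # Equation 3
    marg = P_y[x] @ P_xp[x]                                   # P(x'|x)
    return np.einsum("a,b,ab->", P_xp[x, y], P_yp[x, y], F) \
         - np.einsum("a,b,ab->", marg, P_yp[x, y], F)

def sigma(x, y):  # Equation 4
    marg = P_y[x] @ P_yp[x]                                   # P(y'|x)
    return np.einsum("a,b,ab->", P_xp[x, y], P_yp[x, y], F) \
         - np.einsum("a,b,ab->", P_xp[x, y], marg, F)

def alpha(x):     # Equation 5, computed directly from P(x',y'|x)
    joint = np.einsum("y,ya,yb->ab", P_y[x], P_xp[x], P_yp[x])
    outer = np.outer(P_y[x] @ P_xp[x], P_y[x] @ P_yp[x])
    return np.einsum("ab,ab->", joint - outer, F)

x = 0
avg_rho = sum(rho(x, y) * P_y[x, y] for y in range(ny))
avg_sigma = sum(sigma(x, y) * P_y[x, y] for y in range(ny))
```

Although ρ(x, y) and σ(x, y) differ pointwise, their ℙ(y|x)-weighted averages agree to machine precision.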

### 3.5 Normalized Causal Responsivity and Selective Benefit

As an overall measure, the expected adaptivity captures the benefit the organism obtains from environment-y-typical perturbations (as opposed to perturbations from a completely random environment). However, if we compare the organism's typical behavior in environment y with its typical behavior in a random environment y*, there is in general a nonzero probability that y* = y, and hence that there will be no difference. The larger the prior probability of y, the higher this chance, and the lower the measures ρ(x, y) and σ(x, y) will be. This is unlikely to be what we want when we consider an organism's relationship to a specific environment.

We can avoid this dependence on the prior probability of y by comparing the organism's behavior in y with its behavior in a random other environment y*. In our notation, we will define the normalized causal responsivity ρ*(x, y) and normalized selective benefit σ*(x, y) of x in y as follows:
$\rho^*(x,y) = \sum_{x',y'} F(x',y')\,\bigl(\mathbb{P}(x',y' \mid x,y) - \mathbb{P}(y' \mid x,y)\,\mathbb{P}(x' \mid x, Y \neq y)\bigr),$
$\sigma^*(x,y) = \sum_{x',y'} F(x',y')\,\bigl(\mathbb{P}(x',y' \mid x,y) - \mathbb{P}(x' \mid x,y)\,\mathbb{P}(y' \mid x, Y \neq y)\bigr),$
where ℙ(x′|x, Yy) is the same distribution over the organism's future state as in the standard case, but conditioned on the assumption that Yy. Mathematically,
$\mathbb{P}(x' \mid x, Y \neq y) = \frac{\mathbb{P}(x' \mid x) - \mathbb{P}(x', y \mid x)}{1 - \mathbb{P}(y \mid x)}.$
The same consideration applies to ℙ(y′|x, Yy). After a few steps of algebra, we find that the normalized measures can be succinctly written in terms of their non-normalized counterparts as
$\rho^*(x,y) = \frac{\rho(x,y)}{1 - \mathbb{P}(y \mid x)},$
(6)
$\sigma^*(x,y) = \frac{\sigma(x,y)}{1 - \mathbb{P}(y \mid x)}.$
(7)
The derivation of these expressions can be found in Appendix A.5. In general, we suggest the use of the normalized measures ρ*, σ* when considering the organism's relationship to a specific environment, and the non-normalized average α(x) when considering the overall adaptivity of a state x.
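Equation 6 can also be checked numerically by computing ℙ(x′|x, Y ≠ y) explicitly and comparing the resulting ρ* with ρ/(1 − ℙ(y|x)). The arrays below are again arbitrary random stand-ins of our own:

```python
import numpy as np

rng = np.random.default_rng(3)
nx, ny = 3, 3
P_y = rng.random((nx, ny)); P_y /= P_y.sum(-1, keepdims=True)         # P(y|x)
P_xp = rng.random((nx, ny, nx)); P_xp /= P_xp.sum(-1, keepdims=True)  # P(x'|x,y)
P_yp = rng.random((nx, ny, ny)); P_yp /= P_yp.sum(-1, keepdims=True)  # P(y'|x,y)
F = rng.random((nx, ny))                                              # F(x',y')

x, y = 0, 1
marg = P_y[x] @ P_xp[x]                                       # P(x'|x)
# P(x'|x, Y != y), renormalized as in the text:
marg_not_y = (marg - P_y[x, y] * P_xp[x, y]) / (1 - P_y[x, y])

actual = np.einsum("a,b,ab->", P_xp[x, y], P_yp[x, y], F)
rho = actual - np.einsum("a,b,ab->", marg, P_yp[x, y], F)             # Eq. 3
rho_star = actual - np.einsum("a,b,ab->", marg_not_y, P_yp[x, y], F)  # rho* def.
```

The two routes to ρ* (direct definition vs. Equation 6) agree to machine precision.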

## 4 Examples

For the sake of simplicity, our examples will consider only the case where the environment's state is fixed. The more general case where the environment's state varies as a consequence of endogenous and interactional dynamics is little different, and omitted here only for reasons of space. In Section 5.3 we briefly discuss how our approach may also be used to address further complexities regarding the evaluation of active effects by the agent on the environment.

### 4.1 Simple Markov Chain

Let's start with a simple Markov process we can use to illustrate our concepts of causal responsivity and selective benefit. We will assume that the environment is fixed or its dynamics are much slower than those of the agent, so that ℙ(Y′|X, Y) = ℙ(Y′|X).

Consider an agent with three states, labeled x0, x1, and x2. The agent starts always in state x0, and it can transition to either x1 or x2. There are two possible environments, labeled y0 and y1, with prior probabilities γi = ℙ(yi | x0). A graphical representation of the Markov process is shown in Figure 2. Edges represent transition probabilities, and the numbers next to each node represent fitness values. Edges and numbers in red apply in environment y0, and edges and numbers in blue apply in environment y1.

Figure 2.

Simple Markov chain used to illustrate the concepts of causal responsivity and selective benefit. See text for details.


The figure shows a process with
$\mathbb{P}(y_i \mid x_0) = \gamma_i, \qquad \mathbb{P}(x', y' \mid y_i, x_0) = \begin{cases} \lambda_i & \text{if } x' = x_2 \text{ and } y' = y_i, \\ 1 - \lambda_i & \text{if } x' = x_1 \text{ and } y' = y_i, \\ 0 & \text{otherwise,} \end{cases} \qquad F(x', y_i) = \begin{cases} \beta_i & \text{if } x' = x_2, \\ 0 & \text{otherwise.} \end{cases}$
Parameters λi ∈ [0, 1] : i ∈ {0, 1} control the probability of internal state transitions in each environment yi. If λi = 0, then the agent is certain to transition to state x1 in environment yi; if λi = 1, then it is certain to transition to x2.

Parameters βi ∈ ℝ control the relative fitness of state x2 compared to x1 in environment yi (recall that states may have different fitness values in different environments). If βi = 0, both states are equally good in environment yi, whereas as |βi| → ∞, one state becomes infinitely preferable to the other.

Parameters γi ∈ (0, 1) represent the probability that the agent is in environment yi given that its internal state is x0—that is, how “typical” the environment yi is for the initial state x0 of the agent. We require that every environment have a nonzero probability of occurring, γi > 0.

The (non-normalized and normalized) causal responsivities ρ(x0, yi) and ρ*(x0, yi) of state x0 in environment yi can be calculated using Equations 3 and 6:
$\rho(x_0, y_i) = \sum_{x',y'} F(x',y')\,\bigl(\mathbb{P}(x',y' \mid x_0, y_i) - \mathbb{P}(y' \mid x_0, y_i)\,\mathbb{P}(x' \mid x_0)\bigr) = \beta_i(\lambda_i - \lambda_0\gamma_0 - \lambda_1\gamma_1) = \beta_i(\lambda_i - \lambda_i\gamma_i - \lambda_{\neg i}\gamma_{\neg i}) = \beta_i(\lambda_i - \lambda_{\neg i})\gamma_{\neg i}, \qquad \rho^*(x_0, y_i) = \beta_i(\lambda_i - \lambda_{\neg i}),$
where ¬i = 1 − i. Note that the “raw” causal responsivity ρ(x0, yi) vanishes as γi → 1, which corresponds to the case in which one of the environments never occurs. This supports the use of the normalized adaptivity measures introduced in Section 3.5.

The normalized causal responsivity ρ*(x0, yi), in turn, is proportional to the relative fitness of one state with respect to the other and to the internal state transition probabilities. Also note that it vanishes in two conditions: when there is no benefit to being in x2 rather than x1 (in environment yi), or when the agent responds no differently to y0 than to y1. This is consistent with quantifying the first aspect of molding to environment discussed in Section 2.1.

We can similarly calculate σ(x0, yi) and σ*(x0, yi) using Equations 4 and 7:
$\sigma(x_0, y_i) = \sum_{x',y'} F(x',y')\,\bigl(\mathbb{P}(x',y' \mid x_0, y_i) - \mathbb{P}(x' \mid x_0, y_i)\,\mathbb{P}(y' \mid x_0)\bigr) = \beta_i\lambda_i - \beta_0\lambda_i\gamma_0 - \beta_1\lambda_i\gamma_1 = \lambda_i(\beta_i - \beta_i\gamma_i - \beta_{\neg i}\gamma_{\neg i}) = (\beta_i - \beta_{\neg i})\lambda_i\gamma_{\neg i}, \qquad \sigma^*(x_0, y_i) = (\beta_i - \beta_{\neg i})\lambda_i.$
As before, the non-normalized measure vanishes when one environment never occurs. The normalized selective benefit also vanishes in two conditions: when the organism always responds to yi by transitioning to the fixed-fitness state x1, or when the next-step fitness does not depend on y. Notice the symmetry with the expression for ρ*(x0, yi) above.

Finally, we can calculate the expected adaptivity α(x0) using Equation 5:
$\alpha(x_0) = \sum_{x',y'} F(x',y')\,\bigl(\mathbb{P}(x',y' \mid x_0) - \mathbb{P}(x' \mid x_0)\,\mathbb{P}(y' \mid x_0)\bigr) = \sum_{x',y'} F(x',y')\,\bigl(\mathbb{P}(x' \mid x_0, y') - \mathbb{P}(x' \mid x_0)\bigr)\,\mathbb{P}(y' \mid x_0) = \beta_0(\lambda_0 - \lambda_0\gamma_0 - \lambda_1\gamma_1)\gamma_0 + \beta_1(\lambda_1 - \lambda_0\gamma_0 - \lambda_1\gamma_1)\gamma_1 = \beta_0(\lambda_0 - \lambda_1)\gamma_0\gamma_1 + \beta_1(\lambda_1 - \lambda_0)\gamma_0\gamma_1 = (\beta_1 - \beta_0)(\lambda_1 - \lambda_0)\gamma_0\gamma_1.$
(8)
Note that despite the proportional dependences on γ¬i in ρ(x0, yi) and σ(x0, yi), the quantity α(x0) has the highest magnitude when γ0 = γ1.
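The closed-form expressions above can be checked against direct evaluation of Equations 3-5 for this chain. The parameter values below are ours, chosen only for illustration:

```python
import numpy as np

lam = np.array([0.3, 0.8])    # lambda_i: P(x0 -> x2 | y_i)
beta = np.array([1.5, -0.4])  # beta_i: fitness of x2 in environment y_i
gam = np.array([0.6, 0.4])    # gamma_i = P(y_i | x0); must sum to 1

lam_bar = gam @ lam           # P(x' = x2 | x0), averaged over environments
beta_bar = gam @ beta         # environment-averaged fitness of x2

# Direct evaluation of Equations 3-5 for this chain:
rho = [beta[i] * (lam[i] - lam_bar) for i in (0, 1)]    # rho(x0, y_i)
sig = [lam[i] * (beta[i] - beta_bar) for i in (0, 1)]   # sigma(x0, y_i)
alpha = sum(gam[i] * rho[i] for i in (0, 1))            # alpha(x0)
```

These values match the closed forms ρ(x0, yi) = βi(λi − λ¬i)γ¬i, σ(x0, yi) = (βi − β¬i)λiγ¬i, and Equation 8 for α(x0).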

### 4.2 Adaptation to Environment in a Protocell Model

As a more elaborate example, we applied our measure to a simulated protocell model by Agmon, Gates, and Beer (henceforth, AGB), used recently in the modeling of adaptation to perturbations [1]. This is a two-level hierarchical model simulated at two different time scales; the life span is defined in terms of time steps at the higher level of this model. The lower-level model is a deterministic reaction-diffusion-repulsion system that has a variety of attractor states; these attractor states resemble a central reservoir surrounded by a chemical membrane. In each higher-level time step, an attractor state is subjected to one of several different instantaneous perturbations, and then permitted to go to equilibrium;3 transitions at the higher level of dynamics are therefore from one lower-level stable state x to another lower-level stable state x′. By placing a probability distribution over the perturbations, the model can be formulated as an irreducible discrete Markov process. For technical details about the model we refer the reader to the original articles [1, 2].

In Agmon et al. [1], the authors simulated point perturbations to two different chemical species, at three different magnitudes, in nine distinct locations, for a total of 72 distinct perturbations. Beginning with a single initial stable micro state, they computed a network of other micro states reachable from this initial state by some series of perturbation-relaxation cycles. They arbitrarily terminated their search at 16 cycles, discovering 267 distinct attractors within this distance (including the death state), 113 of which had successors that were not computed. For the purposes of calculating viability, they assumed that the uncomputed successors of these 113 attractors were exactly the death state.

This produced a graph with 267 nodes, whose edges were labeled with one of the 72 perturbations. AGB provided us with the data for this graph; we used this data without running the underlying simulation.

We can apply our measures to the AGB model by supposing that the protocell finds itself in one of two different possible environments, each with a different probability distribution over possible perturbations. We can then ask whether the protocell tends to adapt to these different environments by shifting into states that are more robust to the particular set of perturbations found in its environment. In particular, we may consider how a protocell fares in a particular environment, after being exposed to the perturbations from that environment, compared to how it would fare if exposed to perturbations from a random (other) environment. This contrasts with Agmon et al.'s approach, which features only a single unified distribution over perturbations.

We considered two different environments yMem and yAut for the protocell, corresponding respectively to uniformly distributed perturbations of the membrane chemical, and uniformly distributed perturbations of the autocatalyst chemical. The protocell was assumed to begin in each environment with equal probability 0.5, and we identified the fitness F(x, y) of a state x in environment y as the expected life span L(x, y) of that state in that environment, that is, the expected number of transitions from x to the dead state when all perturbations are drawn from environment y.

#### 4.2.1 Measuring α, ρ*, and σ*

In the case of two equally likely environments yMem and yAut that don't change over time, our measures α, ρ*, and σ* become trivial to calculate, and take on intuitively straightforward meanings. Writing $y¯$ to denote the environment that is not y, we have
$\rho^*(x,y) = 2\rho(x,y) = \sum_{x'} F(x',y)\,\bigl(\mathbb{P}(x' \mid x,y) - \mathbb{P}(x' \mid x,\bar{y})\bigr),$
$\sigma^*(x,y) = 2\sigma(x,y) = \sum_{x'} \bigl(F(x',y) - F(x',\bar{y})\bigr)\,\mathbb{P}(x' \mid x,y),$
$\alpha(x) = \sum_y \rho(x,y)\,\mathbb{P}(y \mid x) = \sum_y \sigma(x,y)\,\mathbb{P}(y \mid x) = \frac{1}{4}\sum_{x',y} F(x',y)\,\bigl(\mathbb{P}(x' \mid x,y) - \mathbb{P}(x' \mid x,\bar{y})\bigr).$
In other words, ρ*(x, y) is equal to the difference between the expected life span of x′ in y and the expected life span x′ would have in y if the transition x → x′ were driven by a perturbation from the other environment $y¯$, rather than y. Similarly, σ*(x, y) is equal to the difference between the expected life span of x′ in y and the expected life span x′ would have in $y¯$, both based on a transition x → x′ driven by a perturbation from y.

Finally, we can calculate the expected life span for x′ in a randomly chosen environment y after a transition xx′ driven by a perturbation from the same environment y, and compare it with the expected life span after a perturbation from the other environment $y¯$. The difference between these two expected life spans is equal to 2α(x). The left-hand graph in Figure 3 plots these two values against each other, showing clearly that almost all states (94%) have a positive overall environmental adaptivity. The average α value for all (non-dead) states was 0.28.
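A minimal numerical sketch of this two-environment simplification follows, using a random transition kernel and hypothetical fitness values of our own (not the AGB data). With two fixed, equally likely environments, ρ* computed from the simplified expression equals twice the general ρ of Equation 3:

```python
import numpy as np

rng = np.random.default_rng(4)
nx = 5
P_xp = rng.random((nx, 2, nx)); P_xp /= P_xp.sum(-1, keepdims=True)  # P(x'|x,y)
F = rng.random((nx, 2))           # F(x', y), e.g., expected life span L(x', y)

x, y = 0, 0
ybar = 1 - y
# rho* via the simplified two-environment expression:
rho_star = F[:, y] @ (P_xp[x, y] - P_xp[x, ybar])
# rho via the general Equation 3, with P(y|x) = 1/2 and the environment fixed:
marg = 0.5 * (P_xp[x, 0] + P_xp[x, 1])            # P(x'|x)
rho = F[:, y] @ (P_xp[x, y] - marg)
```

The same pattern (replacing the transition contrast with a fitness contrast F(x′, y) − F(x′, ȳ)) gives σ* = 2σ.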

Figure 3.

Left-hand graph: life span effects of perturbations, according to whether the perturbation comes from a class representing future perturbations, or a class not representing future perturbations. Identity line shown for reference. Most (94%) of the states lie below the identity line, showing that their fitness is higher in their actual environment than in the other environment, and therefore can be considered adaptive. Right-hand graph: environmental adaptivity α(x) plotted against expected life span in both environments.


The right-hand graph in Figure 3 provides some further information: Those states with the highest α are the ones with the highest expected life span in the membrane perturbation environment yMem (r = 0.95, Pearson's product–moment correlation). Speaking roughly, autocatalyst perturbations appear to be more disruptive than membrane perturbations (as can be inferred from the same graph: Expected life spans in yAut are overall lower than in yMem). Hence, we can expect states with a high life span under repeated membrane perturbations to be the ones that benefit most (on average over both environments) from non-abnormal perturbations.

Figures 4 and 5 show the expected next life span of (non-dead) states x in the yMem and yAut environments, depending on which environment the next perturbation is drawn from. Figure 4 arranges this data to illustrate the ρ* comparison, and Figure 5 arranges it to illustrate the σ* comparison. Note that even though a regime of autocatalyst perturbations is more disruptive than a regime of membrane perturbations, a modest majority (57.1%) of states fare better after suffering a single autocatalyst perturbation than a single membrane perturbation, if they are to be subjected to a regime of autocatalyst perturbations (these are the blue and white circles on the right-hand graph in Figure 4).

Figure 4.

Expected life span 𝔼[L(X′, y)], for all non-dead states x and both environments y, of the state x′ produced by a transition x → x′ typical either of y or of the other environment $y¯$. Marker colors indicate the environments y for which a state x has positive ρ*(x, y). 85.1% of states have a positive responsivity in one environment but not in the other, and 13.6% of them respond positively in both environments.


Figure 5.

Expected life spans 𝔼[L(X′, y)] and 𝔼[L(X′, $\bar{y}$)], for all states x and both environments y, of the state x′ produced by a transition x → x′ typical of y. Marker colors indicate the environments y for which a state x has positive σ*(x, y). As with ρ*, most states selectively benefit from one environment or the other, although the gap between yMem and yAut is larger due to the difference in severity of the perturbations.


A minority of states (white circles) are causally responsive to, or benefit selectively from, perturbations in both environments: These are the states that fall underneath the identity line on both left- and right-hand graphs in Figures 4 and 5. The majority of states, for both measures ρ* and σ*, have a positive value in one environment y and a negative value in the other environment $y¯$. For instance, the red (blue) circles in Figure 4 represent states that “prefer” membrane (autocatalyst) perturbations respectively, regardless of which environment their viability is actually evaluated in. Similarly, the red (blue) circles in Figure 5 represent states that “prefer” repeated future perturbations to affect the membrane (autocatalyst) chemicals respectively, regardless of what sort of perturbation they happen to receive in the current time step.

Summarizing our findings in the protocell model: Autocatalyst perturbations are substantially more disruptive than membrane perturbations, but despite this asymmetry we measure positive environment adaptivity in 94% of states. We can also break down the environment effects and observe that the majority of states are suited to one specific environment (i.e., ρ* and σ* are positive in one environment and negative in the other). In the case of membrane perturbations, the states with the highest expected life span under repeated perturbations are also the most adaptive.

## 5 Discussion

### 5.1 Fitness Improvement Relative to Counterfactual Scenarios

We have proposed that organism-level adaptation must be understood

1. in terms of the organism's relationship to its environment,

2. within a context of other possible environments that the organism could have been in, that is, in terms of fitness improvements relative to counterfactual scenarios.

This approach stems from the intuition that an organism whose fitness does not depend on its environment (either directly, or via the effects of that environment on its state) cannot reasonably be said to be adapting.

In our framework, an adaptive organism is one that tends to use information from its environment to tailor its internal state for that specific environment. Our causal responsivity measure ρ(x, y) quantifies the portion of an organism's fitness (in state x) that depends on receiving information from its specific environment y, while our selective benefit measure σ(x, y) quantifies the portion of an organism's fitness that comes from its next state being appropriate to the particular environment y. Considered over all typical environments, these two measures have the same average α(x), which we have identified with the environment adaptivity of state x.

Note that, according to our framework, even an organism whose fitness falls steadily over time (as in the case of a dying organism) can still be adaptive to its environment. A high α value in such a case corresponds to an organism whose fitness would fall even faster were it continually switched between random environments.
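To make these notions concrete, here is a minimal numerical sketch in Python (all model parameters are hypothetical and chosen purely for illustration; this is not the protocell model of Section 4). The agent has two states and tries to track a two-state environment, the environment evolves on its own, and fitness is 1 when the next agent state matches the next environment state:

```python
import numpy as np

# Hypothetical toy model: two agent states, two environment states.
# The environment evolves on its own via P(y'|y); the agent copies its
# current environment with probability 0.9 via P(x'|x,y).
P_xy = np.array([[0.3, 0.2],        # joint distribution P(X=x, Y=y)
                 [0.2, 0.3]])
P_yn = np.array([[0.8, 0.2],        # P(Y'=y' | Y=y)
                 [0.2, 0.8]])
P_xn = np.zeros((2, 2, 2))          # P(X'=x' | X=x, Y=y)
for x in range(2):
    for y in range(2):
        P_xn[x, y, y], P_xn[x, y, 1 - y] = 0.9, 0.1

F = np.eye(2)                       # fitness F(x',y'): 1 iff x' == y'

def P_y_given_x(x):
    return P_xy[x] / P_xy[x].sum()

def expected_fitness(P_xp, P_yp):
    return sum(F[a, b] * P_xp[a] * P_yp[b]
               for a in range(2) for b in range(2))

def rho(x, y):
    # sever the environment's influence on X': replace P(x'|x,y) by P(x'|x)
    P_xp_marg = P_y_given_x(x) @ P_xn[x]
    return (expected_fitness(P_xn[x, y], P_yn[y])
            - expected_fitness(P_xp_marg, P_yn[y]))

def sigma(x, y):
    # sever the environment's influence on Y': replace P(y'|y) by P(y'|x)
    P_yp_marg = P_y_given_x(x) @ P_yn
    return (expected_fitness(P_xn[x, y], P_yn[y])
            - expected_fitness(P_xn[x, y], P_yp_marg))

def alpha(x):
    # environment adaptivity: rho averaged over P(Y|x)
    return sum(P_y_given_x(x)[y] * rho(x, y) for y in range(2))
```

Averaging σ instead of ρ over ℙ(Y | x) yields the same α(x), in line with the identity derived in the Appendix.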

### 5.2 Adaptivity and Causal Interventions

This article has made use of Pearl's causal Bayesian network formalism [15] for our definitions in Section 3.2. Since we consider interventions directly on the variables X′, Y′ that feature in the fitness evaluation F(X′, Y′), no causal model is strictly necessary, and we can express our measures directly in terms of ordinary observational probabilities.

However, we have introduced the causal formalism because it helps to clarify the intuitive interpretation of our measures, and because of the important role that causation appears to play in formal treatments of cognition. For instance, Ortega and Braun [13] argue that for a system to be able to take desirable actions, it must regard its own decision-making process as causally independent from the physical world.

Our quantity α(x) is related to Ay and Polani's causal information flow [3], but has some important differences. We use a utility function F that allows us to distinguish between adaptive and maladaptive variation; causal information flow could tell us whether the environment affected the agent's state, but not whether it was beneficial or not. Along these lines, α(x) is also related to the “value of information” measures introduced by Howard [7]. In contrast to causal information flow, we also make finer-grained distinctions by considering what happens when intervened variables are set according to specific conditional distributions.

We prefer the use of causal Bayesian networks to observational proxies for causation such as transfer entropy [16] (and the special case of Granger causality [4]) because the intervention-based formalism more closely captures the semantics of causal relationships, giving correct results when observational measures fail [3].

### 5.3 Action as Information Flow from Agent to Environment

Although we have chosen to consider examples in which the organism's state has no effect on the environment, the equations for ρ, σ, and α are no different in the general sensorimotor closed-loop case. In this article, we have only considered the effects of causal links from the environment: either to the agent (in the case of ρ(x, y)), or within the environment (in the case of σ(x, y))—recall the graph in Figure 1. However, a benefit of our framework is that we can also consider similar quantities measuring the efficacy of the agent's actions and internal state transitions. One could use the same method to determine the portion of fitness that depends on causal links from the agent, by considering the measure
$\sum_{x',y'} F(x', y')\, \mathbb{P}(x' \mid x, y)\, \mathbb{P}(y' \mid y),$
which corresponds to the fitness the organism would have if it took a random action consistent with its environment (ignoring the needs of its particular state x), and the measure
$\sum_{x',y'} F(x', y')\, \mathbb{P}(y' \mid x, y)\, \mathbb{P}(x' \mid y),$
which corresponds to the fitness the organism would have if its transition came from a random initial state consistent with its environment, rather than its actual initial state x. Notice the symmetry of these expressions compared to the originals defined in Sections 3.2 and 3.3.
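These mirror-image quantities can also be sketched numerically. The toy model below (hypothetical numbers, not the protocell model) closes the sensorimotor loop by letting the environment copy the agent's state with probability 0.7; the two measures then quantify how much fitness the agent's influence on the environment, and its own state history, are worth:

```python
import numpy as np

# Hypothetical closed-loop toy: the agent copies the environment
# (P(x'|x,y) puts mass 0.8 on y, 0.2 on x) and the environment
# responds to the agent (P(y'|x,y) puts mass 0.7 on x, 0.3 elsewhere).
P_xy = np.array([[0.3, 0.2],
                 [0.2, 0.3]])       # joint P(X=x, Y=y)
P_xn = np.zeros((2, 2, 2))          # P(X'=x' | X=x, Y=y)
P_yn = np.zeros((2, 2, 2))          # P(Y'=y' | X=x, Y=y)
for x in range(2):
    for y in range(2):
        P_xn[x, y, x] += 0.2
        P_xn[x, y, y] += 0.8
        P_yn[x, y, x], P_yn[x, y, 1 - x] = 0.7, 0.3

F = np.eye(2)                       # fitness: 1 iff x' == y'

def P_x_given_y(y):
    return P_xy[:, y] / P_xy[:, y].sum()

def actual(x, y):
    return sum(F[a, b] * P_xn[x, y, a] * P_yn[x, y, b]
               for a in range(2) for b in range(2))

def fitness_random_action(x, y):
    # sever the agent's influence on Y': replace P(y'|x,y) by P(y'|y)
    P_yp = P_x_given_y(y) @ P_yn[:, y, :]
    return sum(F[a, b] * P_xn[x, y, a] * P_yp[b]
               for a in range(2) for b in range(2))

def fitness_random_state(x, y):
    # sever X' from the actual initial state: replace P(x'|x,y) by P(x'|y)
    P_xp = P_x_given_y(y) @ P_xn[:, y, :]
    return sum(F[a, b] * P_xp[a] * P_yn[x, y, b]
               for a in range(2) for b in range(2))
```

In this toy, both counterfactual fitnesses fall below the actual fitness, so part of the agent's fitness is attributable to each severed causal link.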

We believe this constitutes a promising research direction stemming from this work. Future research should clarify the meaning of these fitness effects of agent-to-environment information flows, and their possible implications for theories of action and agency.

## 6 Conclusion

We build on information theory and causal intervention theory to develop a set of new tools to study within-lifetime adaptivity in artificial and biological systems. What distinguishes our approach from previous work in the field is that we measure adaptivity by comparing an organism's fitness in its actual environment with its fitness in other counterfactual environments, instead of tracking the organism's fitness over time. In particular, we define two closely related measures—causal responsivity ρ and selective benefit σ—that quantify specific aspects of the causal information flow in the agent-environment system and how they affect the agent's expected fitness. In this framework, expected adaptivity α comes in as a natural way of quantifying the overall fitness effects of the causal links between agent and environment.

We illustrate our framework by giving specific numerical examples with a simulated protocell model. This constitutes a practical example of how to use the proposed measures to study the adaptivity of an organism in a variety of states and environments.

In summary, we advocate a view of adaptivity as a causal property of the agent-environment system. The contribution of this article is to propose a measure of this kind, one that relies on causal interventions and counterfactual reasoning. These measures have been shown to be of interest for practical applications, and to be solidly grounded in theory.

## Acknowledgments

The authors would like to thank Eran Agmon for providing parameters for the simulation of the protocell model used in Section 4. We would also like to thank anonymous reviewers who provided helpful feedback, and the editor of Artificial Life for permitting an unconventional submission in the form of dual companion articles.

## Notes

1. Both articles are found in this issue of Artificial Life.

2. The derivation of this expression is provided in the Appendix, sections A.2 and A.3.

3. They do not report finding any non-point attractors.

4. As a matter of fact, in most cases this can be proven to be impossible [11, 12].

## References

1. Agmon, E., Gates, A. J., & Beer, R. D. (2015). Ontogeny and adaptivity in a model protocell. In P. Andrews, L. Caves, R. Doursat, S. Hickinbotham, F. Polak, S. Stepney, T. Taylor, & J. Timmis (Eds.), Proceedings of the European Conference on Artificial Life 2015 (pp. 216–223). Cambridge, MA: MIT Press.

2. Agmon, E., Gates, A. J., Churavy, V., & Beer, R. D. (2016). Exploring the space of viable configurations in a model of metabolism–boundary co-construction. Artificial Life, 22(2), 153–171.

3. Ay, N., & Polani, D. (2008). Information flows in causal networks. Advances in Complex Systems, 11(1), 17–41.

4. Barnett, L., Barrett, A. B., & Seth, A. K. (2009). Granger causality and transfer entropy are equivalent for Gaussian variables. Physical Review Letters, 103(23), 238701.

5. Di Paolo, E. A. (2005). Autopoiesis, adaptivity, teleology, agency. Phenomenology and the Cognitive Sciences, 4(4), 429–452.

6. Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11, 127–138.

7. Howard, R. (1966). Information value theory. IEEE Transactions on Systems Science and Cybernetics, 2, 22–26.

8. Klyubin, A. (2002). Master's thesis, Tallinn Technical University, Tallinn, Estonia.

9. Klyubin, A. S., Polani, D., & Nehaniv, C. L. (2007). Representations of space and time in the maximization of information flow in the perception-action loop. Neural Computation, 19(9), 2387–2432.

10. McGregor, S. (2016). A more basic version of agency? As if! In L. Tuci, A. Giagkos, M. Wilson, & J. Hallam (Eds.), SAB 2016: Proceedings of the 14th International Conference on the Simulation of Adaptive Behavior. London: Springer.

11. Ortega, P., & Braun, D. (2011). Information, utility and bounded rationality. In International Conference on Artificial General Intelligence (pp. 269–274). London: Springer.

12. Ortega, P. A. (2011). Bayesian causal induction. In 2011 NIPS Workshop in Philosophy and Machine Learning.

13. Ortega, P. A., & Braun, D. A. (2010). A Bayesian rule for adaptive control based on causal interventions. In Third Conference on Artificial General Intelligence (pp. 121–126). Paris: Atlantis Press.

14. Pearl, J. (1993). Graphical models, causality and intervention. Statistical Science, 8(3), 266–269.

15. Pearl, J. (2000). Causality: Models, reasoning and inference. New York: Cambridge University Press.

16. Schreiber, T. (2000). Measuring information transfer. Physical Review Letters, 85(2), 461.

### Appendix. Causal-Probabilistic Interpretations and Proofs

#### A.1 Mathematical Prerequisites

In order to formalize the ideas described in Section 2, some mathematical tools are required. Although we covered the basic intuitions in the text without appealing to mathematical details, we now describe those tools and make our formalism more precise.

Our proposed approach is best understood within Ay and Polani's information agent framework, which uses tools from causal probability theory and information theory to characterize typical properties of an agent's interactions with its environment in terms of information flow. This article is not meant as a comprehensive introduction to the information agent framework, so only the key concepts will be summarized here.

Briefly, the information agent framework considers the statistical properties of agent and environment dynamics over a range of counterfactual possible situations. In contrast to well-known sample statistics, such as Pearson's product–moment correlation, which are simple to compute but fail to capture complex nonlinear relationships, the information agent framework considers information-theoretic quantities, which are maximally general (and correspondingly harder to compute or estimate from samples).

As in the main text, we consider an ensemble of agents in environments at t0 described by the probability distribution ℙ(X0 = x, Y0 = y). We assume there is some (possibly stochastic) “physical law” governing the agent in its environment, described by a probability function ℙ(Xi+1 = x′, Yi+1 = y′ | Xi = x, Yi = y). We may then ask questions about the typicality of the agent's and environment's possible states X1, Y1 at the next time step t1; answering them amounts to applying the physical law to the agent-environment state and averaging over the initial distribution:
$\mathbb{P}(X_1 = x', Y_1 = y') = \sum_{x,y} \mathbb{P}(X_1 = x', Y_1 = y' \mid X_0 = x, Y_0 = y)\, \mathbb{P}(X_0 = x, Y_0 = y).$
Standard statistical (Shannon) information theory then allows us to define quantities such as I(Xi; Yi+1), the amount of information the agent's current state tells us about its next environment state; or I(Xi+1; Xi | Yi = y), the amount of additional information the agent's current state provides about its next state if we already know that the current environment is in state y.
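A sketch of this bookkeeping in code (with hypothetical toy numbers) may help; the propagation below is exactly the sum above, and the mutual-information helper implements the standard Shannon definition:

```python
import numpy as np

# Toy ensemble: P0[x, y] is the initial joint P(X0=x, Y0=y); T[x, y, x', y']
# is the one-step law P(X1=x', Y1=y' | X0=x, Y0=y).
P0 = np.array([[0.4, 0.1],
               [0.1, 0.4]])
T = np.zeros((2, 2, 2, 2))
for x in range(2):
    for y in range(2):
        for xp in range(2):
            for yp in range(2):
                # agent tends to copy its environment; environment persists
                T[x, y, xp, yp] = ((0.9 if xp == y else 0.1)
                                   * (0.8 if yp == y else 0.2))

# P(X1, Y1) = sum over (x, y) of T * P0 -- the displayed equation
P1 = np.einsum('xyab,xy->ab', T, P0)

def mutual_information(J):
    """I(A;B) in bits for a joint distribution J[a, b]."""
    pa, pb = J.sum(axis=1), J.sum(axis=0)
    return sum(J[a, b] * np.log2(J[a, b] / (pa[a] * pb[b]))
               for a in range(J.shape[0]) for b in range(J.shape[1])
               if J[a, b] > 0)

# joint of X0 and Y1, marginalizing over Y0 and X1
J_X0_Y1 = np.einsum('xyab,xy->xb', T, P0)
```

For instance, `mutual_information(J_X0_Y1)` gives I(X0; Y1), which is positive in this toy because X0 is correlated with Y0 and Y1 depends on Y0.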

In general, agents simultaneously adapt themselves to their environment and adapt their environment to themselves. To use Friston's distinction, agents perform both inference and action [6]. Both of these processes tend to increase the mutual information between agent state and environment state, so it might seem that it would be difficult to disentangle them using purely statistical tools.4

Within standard probability theory, this is true. However, the development of causal probability theory by Pearl [14] makes it possible to identify the direction of causal effects. Pearl's innovation was the concept of an intervention: an externally imposed change of a variable. Informally, when an intervention $\hat{X} = x$ is applied to a probability space, the term X in the equations for the probabilities of all other variables is replaced with the constant x.

This allows us to write expressions like ℙ(Xi+1 = x′|Ŷi = y): the probability that the agent will next be in state x′, if we externally force the current environment into state y without directly affecting anything else. This is usually not equal to ℙ(Xi+1 = x′|Yi = y), that is, the probability that the agent will next be in state x′ if we happen to observe y.
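The distinction is easy to demonstrate numerically. In the hypothetical toy below, X and Y are correlated, so observing Y = y shifts our beliefs about X, whereas forcing Ŷ = y leaves the marginal of X untouched; the two resulting predictions about X′ differ:

```python
import numpy as np

# Hypothetical toy: correlated agent-environment joint, and a next-state
# law in which the agent mostly keeps its state, nudged toward y.
P_xy = np.array([[0.45, 0.05],      # P(X=x, Y=y)
                 [0.05, 0.45]])
P_xn = np.zeros((2, 2, 2))          # P(X'=x' | X=x, Y=y)
for x in range(2):
    for y in range(2):
        P_xn[x, y, x] += 0.6        # inertia toward the current x ...
        P_xn[x, y, y] += 0.4        # ... plus a pull toward y

y = 0
P_x_obs = P_xy[:, y] / P_xy[:, y].sum()  # P(x | Y=y): shifted by correlation
P_x_int = P_xy.sum(axis=1)               # P(x): intervening on Y leaves X alone
P_next_obs = P_x_obs @ P_xn[:, y, :]     # P(x' | Y=y)
P_next_int = P_x_int @ P_xn[:, y, :]     # P(x' | do(Y=y))
```

Here `P_next_obs` is [0.94, 0.06] while `P_next_int` is [0.7, 0.3]: observing y tells us that X was probably already 0, whereas forcing y tells us nothing about X.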

#### A.2 Interventionized Expectations

In this section we introduce a novel concept that will allow us to write our adaptivity measures more compactly: the interventionized expectation of a random variable V. This allows us to consider the causal effects of particular interactions on fitness, by imagining what would happen if the interaction were erased by an external intervention of some sort. We will require that V be produced by some causal stochastic mechanism, that is, that V is a node in a causal Bayesian network [15] G. We write the ordinary expectation operator, averaging over the distribution P(V|a), as 𝔼a[V]:
$\mathbb{E}_a[V] \equiv \sum_{v \in \mathcal{V}} v\, \mathbb{P}(v \mid a).$
(9)
This is the expectation of V, given the observation A = a, when the stochastic process q described by G proceeds autonomously (i.e., without any intervention). Consider another variable B in G. Imagine that we were to intervene so as to set $\hat{B} = b$ via some independent external stochastic mechanism m, inducing another process q′. Clearly, the statistics of V under such circumstances will, in general, depend on the details of m.

Ay and Polani consider such a scenario in order to define conditional causal information, and stipulate that the distribution over intervened values must match the marginal observed under q (in our case the conditional marginal P(B|a)). However, we will in general want to consider the case in which P($\hat{B}$) varies according to some other (conditional) marginal P(B|c).

We will write the expectation of V under this scenario as $\mathbb{E}_a^{\hat{B}|c}[V]$ (pronounced “the expectation of V given a, but setting B according to c”):
$\mathbb{E}_a^{\hat{B}|c}[V] \equiv \sum_{b,v} v\, \mathbb{P}(V = v \mid A = a, \hat{B} = b)\, \mathbb{P}(B = b \mid C = c).$
(10)
This can be interpreted as an answer to the following question: If the causal link to B from its parents were severed, and instead a value b were externally imposed, by an external mechanism whose statistics match the original marginal distribution over B (conditioned on the observation C = c), then what would be the expectation of V (conditioned on the observation A = a)?
We will also use a Δ operator on interventionized expectations to indicate the difference between an interventionized expectation and the non-interventionized expectation, for example,
$\Delta\mathbb{E}_a^{\hat{B}|c}[V] \equiv \mathbb{E}_a^{\hat{B}|c}[V] - \mathbb{E}_a[V].$
(11)
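Equation 10 can be sketched on the smallest nontrivial causal network, the chain A → B → V (all variables binary, numbers hypothetical). Intervening on B cuts the A → B edge, so ℙ(v | a, B̂ = b) reduces to ℙ(v | b), and the intervention distribution is chosen as the observational marginal ℙ(B | c):

```python
import numpy as np

# Chain A -> B -> V.  Since V depends only on B, P(v | a, do(B=b)) = P(v | b).
P_b_given_a = np.array([[0.9, 0.1],
                        [0.2, 0.8]])
P_v_given_b = np.array([[0.7, 0.3],
                        [0.1, 0.9]])
V_vals = np.array([0.0, 1.0])       # V is numeric so expectations make sense

def E(a):
    """Ordinary expectation E_a[V], the chain running autonomously (Eq. 9)."""
    return P_b_given_a[a] @ P_v_given_b @ V_vals

def E_int(a, c):
    """Interventionized expectation: draw B from P(B | A=c), then run B -> V.
    The observed a is irrelevant here because cutting A -> B removes A's only
    pathway to V."""
    return P_b_given_a[c] @ P_v_given_b @ V_vals
```

When c = a the intervention distribution matches the observational marginal (the Ay–Polani case) and, because V is independent of A given B in this chain, `E_int(a, a)` equals `E(a)`; choosing c ≠ a gives the more general Δ𝔼 of Equation 11.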

#### A.3 Connections with ρ, σ, and α

In this section we draw connections among the three measures of adaptation we have proposed; these connections are particularly illuminating and straightforward when formulated in terms of causal probability theory and the interventionized expectations introduced above.

First, we note that causal responsivity and selective benefit can be succinctly written using the 𝔼 notation:
$\rho(x, y) = -\Delta\mathbb{E}_{x,y}^{\hat{X}'|x}[F(X', Y')] = \sum_{x',y'} F(x', y')\, \big(\mathbb{P}(x', y' \mid x, y) - \mathbb{P}(y' \mid x, y)\, \mathbb{P}(x' \mid x)\big),$
(12)
$\sigma(x, y) = -\Delta\mathbb{E}_{x,y}^{\hat{Y}'|x}[F(X', Y')] = \sum_{x',y'} F(x', y')\, \big(\mathbb{P}(x', y' \mid x, y) - \mathbb{P}(y' \mid x)\, \mathbb{P}(x' \mid x, y)\big).$
(13)
Notice the symmetry in the 𝔼-expression for ρ(x, y) and σ(x, y). This makes it explicit that they are both expectations of the same quantity under interventions on different variables.
In order to calculate environment adaptivity, we observe that it is easy to verify that $\mathbb{E}_{x,y}^{\hat{X}'|x}[F(X', Y')]$ and $\mathbb{E}_{x,y}^{\hat{Y}'|x}[F(X', Y')]$ have the same expectation when averaged over y using P(Y|x):
$\mathbb{E}_x^{\hat{X}'|x}[F(X', Y')] = \mathbb{E}_x^{\hat{Y}'|x}[F(X', Y')] = \mathbb{E}_x^{(\hat{X}'|x),(\hat{Y}'|x)}[F(X', Y')] = \sum_{x',y'} F(x', y')\, \mathbb{P}(x' \mid x)\, \mathbb{P}(y' \mid x).$
(14)
Consequently, α(x) can also be conveniently written as
$\alpha(x) = \mathbb{E}_x[\rho(x, Y)] = \mathbb{E}_x[\sigma(x, Y)] = \sum_{x',y'} F(x', y')\, \big(\mathbb{P}(x', y' \mid x) - \mathbb{P}(x' \mid x)\, \mathbb{P}(y' \mid x)\big) = -\Delta\mathbb{E}_x^{\hat{X}'|x,\, \hat{Y}'|x}[F(X', Y')].$
(15)
Finally, as briefly mentioned in Section 5.3, the measures of fitness that depend on causal links from the agent to the environment can be similarly written as
$\mathbb{E}_{x,y}^{\hat{Y}'|y}[F(X', Y')] = \sum_{x',y'} F(x', y')\, \mathbb{P}(x' \mid x, y)\, \mathbb{P}(y' \mid y),$
$\mathbb{E}_{x,y}^{\hat{X}'|y}[F(X', Y')] = \sum_{x',y'} F(x', y')\, \mathbb{P}(y' \mid x, y)\, \mathbb{P}(x' \mid y).$

#### A.4 Connections with Mutual Information

In this section we draw a few useful connections between our adaptivity measures and (conditional) mutual information as commonly used in standard information theory.

The expression I(Y; X′|x) = 0 states that knowing the value of Y gives us no additional information about X′ when we already know x—that is, that Y is conditionally independent of X′ given that X = x. It can be shown that the condition I(Y; X′|x) = 0 implies that ρ(x, y) = 0 for all y. Using ⊥ to denote conditional independence, we have
$Y' \perp X' \mid X, Y \ \text{(from the Bayesian graph)} \;\Rightarrow\; I(Y, Y'; X' \mid x) = I(Y; X' \mid x) = 0 \;\Rightarrow\; \mathbb{P}(X' \mid x, Y, Y') = \mathbb{P}(X' \mid x) \;\Rightarrow\; \mathbb{P}(X', Y' \mid x, Y) = \mathbb{P}(Y' \mid x, Y)\, \mathbb{P}(X' \mid x) \;\Rightarrow\; \forall y \in \mathcal{Y} : \rho(x, y) = 0.$
(16)
And, analogously to Equation 16, for selective benefit we can show that
$I(Y; Y' \mid x) = 0 \;\Rightarrow\; \forall y \in \mathcal{Y} : \sigma(x, y) = 0.$
(17)
Finally, note that if the conditional mutual information I(X′; Y′|x) = 0, then ℙ(X′, Y′|x) = ℙ(X′|x)ℙ(Y′|x) and
$I(X'; Y' \mid x) = 0 \;\Rightarrow\; \alpha(x) = 0.$
(18)
These conditions effectively link our adaptivity measures to information flows between agent and environment, and provide a set of necessary conditions for an organism to be adaptive in a certain environment. They prove the desirable property that if there is no information flow between agent and environment, then the agent cannot adapt to that environment (although it could conceivably increase its fitness).

Note that for the conditions above the converse does not hold: For example, when F(x′, y′) = k for all x′, y′ (i.e., fitness is constant regardless of organism-environment state), then α(x) must equal 0 even if I(X′; Y′|x) > 0.
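These necessary conditions can be checked numerically. In the hypothetical toy below, the agent ignores its environment (X′ depends only on x) and the environment ignores the agent (Y′ depends only on y), so I(X′; Y′ | x) = 0 by construction and α(x) vanishes for any fitness function:

```python
import numpy as np

# Decoupled toy model: no information flows between agent and environment.
P_xy = np.array([[0.3, 0.2],
                 [0.2, 0.3]])       # joint P(X=x, Y=y)
P_xn = np.array([[0.7, 0.3],        # P(X'=x' | X=x): no dependence on Y
                 [0.4, 0.6]])
P_yn = np.array([[0.8, 0.2],        # P(Y'=y' | Y=y): no dependence on X
                 [0.2, 0.8]])
F = np.eye(2)                       # any fitness function would do here

def alpha(x):
    Py = P_xy[x] / P_xy[x].sum()                    # P(y | x)
    # joint P(x', y' | x), and the product of its marginals
    Pjoint = np.einsum('y,a,yb->ab', Py, P_xn[x], P_yn)
    Pprod = np.outer(Pjoint.sum(axis=1), Pjoint.sum(axis=0))
    return float(np.sum(F * (Pjoint - Pprod)))
```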

#### A.5 Normalized Measures

Following the logic in Section 3.5, the normalized causal responsivity and selective benefit are defined as
$\rho^*(x, y) = \sum_{x',y'} F(x', y')\, \big(\mathbb{P}(x', y' \mid x, y) - \mathbb{P}(y' \mid x, y)\, \mathbb{P}(x' \mid x, Y \neq y)\big),$
$\sigma^*(x, y) = \sum_{x',y'} F(x', y')\, \big(\mathbb{P}(x', y' \mid x, y) - \mathbb{P}(x' \mid x, y)\, \mathbb{P}(y' \mid x, Y \neq y)\big),$
where
$\mathbb{P}(x' \mid x, Y \neq y) = \frac{\mathbb{P}(x' \mid x) - \mathbb{P}(x', y \mid x)}{1 - \mathbb{P}(y \mid x)},$
and similarly for ℙ(y′|x, Yy). To calculate ρ*(x, y) we first use this expression to write
$\rho^*(x, y) = \sum_{x',y'} F(x', y')\, \frac{\mathbb{P}(x', y' \mid x, y) - \mathbb{P}(x', y' \mid x, y)\, \mathbb{P}(y \mid x) - \mathbb{P}(y' \mid x, y)\, \big(\mathbb{P}(x' \mid x) - \mathbb{P}(x', y \mid x)\big)}{1 - \mathbb{P}(y \mid x)}.$
Since, according to our causal graph in Figure 1, Y′ is independent of X′ given X and Y, we have ℙ(x′|x, y, y′) = ℙ(x′|x, y), allowing us to write
$\mathbb{P}(x', y' \mid x, y)\, \mathbb{P}(y \mid x) = \mathbb{P}(y' \mid x, y)\, \mathbb{P}(x' \mid x, y)\, \mathbb{P}(y \mid x) = \mathbb{P}(y' \mid x, y)\, \mathbb{P}(x', y \mid x),$
giving
$\rho^*(x, y) = \sum_{x',y'} F(x', y')\, \frac{\mathbb{P}(x', y' \mid x, y) - \mathbb{P}(y' \mid x, y)\, \mathbb{P}(x' \mid x)}{1 - \mathbb{P}(y \mid x)},$
$\rho^*(x, y) = \frac{\rho(x, y)}{1 - \mathbb{P}(y \mid x)},$
(19)
and similarly
$\sigma^*(x, y) = \frac{\sigma(x, y)}{1 - \mathbb{P}(y \mid x)},$
(20)
which are the expressions used in Section 4. Finally, using again the 𝔼 notation, the normalized measures can be written as
$\rho^*(x, y) = -\Delta\mathbb{E}_{x,y}^{\hat{X}'|x,\, Y \neq y}[F(X', Y')],$
(21)
$\sigma^*(x, y) = -\Delta\mathbb{E}_{x,y}^{\hat{Y}'|x,\, Y \neq y}[F(X', Y')].$
(22)
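Equations 19 and 20 can be verified numerically. The sketch below (a hypothetical two-state toy in the same style as before) computes ρ* once directly from its definition, using ℙ(x′ | x, Y ≠ y), and once via ρ(x, y)/(1 − ℙ(y | x)):

```python
import numpy as np

# Hypothetical toy: environment evolves on its own; agent copies it w.p. 0.9.
P_xy = np.array([[0.3, 0.2],
                 [0.2, 0.3]])       # joint P(X=x, Y=y)
P_yn = np.array([[0.8, 0.2],
                 [0.2, 0.8]])       # P(Y'=y' | Y=y)
P_xn = np.zeros((2, 2, 2))          # P(X'=x' | X=x, Y=y)
for x in range(2):
    for y in range(2):
        P_xn[x, y, y], P_xn[x, y, 1 - y] = 0.9, 0.1
F = np.eye(2)                       # fitness: 1 iff x' == y'

def P_y_given_x(x):
    return P_xy[x] / P_xy[x].sum()

def rho(x, y):
    P_xp = P_y_given_x(x) @ P_xn[x]                 # P(x' | x)
    return sum(F[a, b] * (P_xn[x, y, a] - P_xp[a]) * P_yn[y, b]
               for a in range(2) for b in range(2))

def rho_star_direct(x, y):
    Py = P_y_given_x(x)
    P_xp = Py @ P_xn[x]                             # P(x' | x)
    # P(x' | x, Y != y) = (P(x'|x) - P(x', y|x)) / (1 - P(y|x))
    P_xp_not_y = (P_xp - Py[y] * P_xn[x, y]) / (1 - Py[y])
    return sum(F[a, b] * (P_xn[x, y, a] - P_xp_not_y[a]) * P_yn[y, b]
               for a in range(2) for b in range(2))
```

Both routes agree, and ρ*(x, y) exceeds ρ(x, y) in magnitude by the factor 1/(1 − ℙ(y | x)), as Equation 19 states.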