One important sense of the term “adaptation” is the process by which an agent changes appropriately in response to new information provided by environmental stimuli. We propose a novel quantitative measure of this phenomenon, which extends a little-known definition of adaptation as “increased robustness to repeated perturbation” proposed by Klyubin (2002). Our proposed definition essentially corresponds to the average value (relative to some fitness function) of state changes that are caused by the environment (in some statistical ensemble of environments). We compute this value by comparing the agent's actual fitness with its fitness in a counterfactual world where the causal links between agent and environment are disrupted. The proposed measure is illustrated in a simple Markov chain model and also using a recent model of autopoietic agency in a simulated protocell.
How might we understand the notion of adaptation? We propose a definition, inspired by Klyubin's definition of adaptation as “increased robustness to repeated perturbation” [8, p. ii]. Our approach essentially formalizes the notion of an organism displaying beneficial changes in response to interactions with its environment. In order to provide a more mathematically rigorous framework, we use the causal probability theory by Pearl and the information agent formalism by Ay and Polani, although prior knowledge of those is not needed to understand the intuitions in this article. The mathematically inclined reader can find more details in the Appendix.
The key intuition behind our approach is the following: Comparing the effect of an organism's action with the effect of other actions it could have taken is much more straightforward than trying to measure what its actions do to promote continued viability. Therefore, our approach to measuring adaptivity relies on the relative fitness gain an organism obtains after responding to a stimulus in a way that is particularly beneficial in its current environment, compared to the way it could have responded had it been in another environment. This essentially corresponds to drawing a distinction between adaptation to a specific environment and overall improvement. Furthermore, we distinguish two ways in which an organism can be adaptive to its environment, both of which differ from diachronic improvement.
In this article we provide mathematical tools to quantify various adaptivity measures following this paradigm, illustrate them in a recent model of a protocell, and discuss what benefits and insights can be obtained about the system using this approach.
These measures are formulated in an abstract manner, and are easily applicable to systems in other fields within artificial life research. For example, these tools can be used to measure the benefits of phenotypic plasticity in a given organism, or the adaptivity of certain behaviors exhibited by such an organism. Importantly, our approach moves beyond the standard survival-normative accounts of adaptivity and allows researchers to operationalize their definitions of fitness in any way that is suitable to study their system of interest.
1.1 Companion Articles
This article is the second of two companion articles considering the notion of adaptation from an abstract, formal perspective:
“Adaptation Is Not Just Improvement over Time” critiques an influential view of adaptation, under which an organism can be mechanically constituted so as to improve its future prospects over time.
“Measuring Fitness Effects of Agent-Environment Interactions” proposes a quantitative measure for the degree to which an agent's internal dynamics capture beneficial information from its sensory stimuli.
1.2 Structure of the Article
Section 2 introduces our approach to defining adaptation during an organism's lifetime: effectively, a quantification of the benefit the organism derives from responding to a specific environment. We briefly summarize the mathematical tools we are using (the causal information agent framework) and provide formal definitions in Section 3; and we apply them to some examples in Section 4, including a protocell model used recently to investigate a different notion of adaptation (discussed in more detail in our companion article). In Section 5 we briefly recap our proposal and relate it to other work in subjectivity, causality, and agency. Section 6 concludes the main body of the article.
2 Our Concept of Adaptation
The word “adaptive” has a variety of meanings in the scientific context. In particular, it refers to a specific technical concept within evolutionary theory, which differs from how the term is used informally in the non-evolutionary sciences. As a disclaimer, we note that here we address the problem of measuring within-lifetime adaptation, and we do not make any claims about adaptation in evolutionary biology. The companion article provides a fuller discussion of notions of lifetime adaptation in artificial life science.
Before we dive into the mathematical definitions, in this section we walk through an example that displays the basic reasoning behind our concept of adaptation. Although the organism in this example is arguably more complex than the organism we analyze later in Section 4, it serves as a good illustration of our intuitions.
2.1 The Rat in the Maze
Consider a rat navigating an unfamiliar maze M1. The rat will get better at navigating this particular maze (and, indeed, mazes in general) as a consequence of exposure to the maze. We may be tempted to equate behavioral adaptation with a diachronic change in the animal's “objective” future prospects (i.e., a difference between the organism's viability at different time steps), but—as we argue in our companion article—this invites some mathematical pathologies, since a fully objective perspective would foresee the rat's adaptive capacities and factor them accurately into any evaluation of its future prospects.
Instead, we propose a different view of adaptation that is based on counterfactual reasoning and provides a more nuanced account of adaptive phenomena. This view is inspired by the definition of adaptation as “increased robustness to repeated perturbation” [8, p. ii, 9], which seems to capture an important aspect of adaptive processes. In particular, what this definition refers to is the agent's molding itself to a specific, temporally extended aspect of the environment. Let us consider two different aspects of this molding-to-environment phenomenon:
The organism's tendency to respond to a stimulus from its actual environment, in ways that are better suited to that environment than its responses to stimuli from other possible environments.
The organism's tendency to respond to a stimulus from its actual environment, such that its response is better suited to that environment than to other possible environments.
2.2 Causal Responsivity
This section illustrates the first aspect of molding to environment: In responding to stimuli from its environment, the rat's prospects within its actual environment have improved relative to what its prospects would be if it had been responding to stimuli from a different environment.
Suppose the rat, while familiarizing itself with maze M1, were to be magically transported to a different maze M2 for a short time interval T, and then transported back to maze M1. The rat would be confused by the unexpected stimuli it had encountered during T, and would require more time to reorient itself in maze M1.
However, this would not be the case if the local configuration of maze M2 were identical, over T, to that of maze M1: In such a case, the rat would be unable to distinguish between the two mazes, and its performance in maze M1 would be unaffected. We would not want to say that, over T, the rat had adjusted specifically to maze M1 if the rat's state changes would have been identical in maze M2.
2.3 Selective Benefit
This subsection illustrates the second aspect of molding to environment: By interacting with its environment, the rat's prospects within its actual environment have improved relative to its imagined prospects in another, counterfactual environment. We can illustrate this as follows.
Suppose the rat, after familiarizing itself with maze M1, is magically transported to a different maze M2. There was some change in the rat that occurred through interactions with maze M1. This adaptation produced a propensity for behavior well suited for maze M1, but less so for maze M2; indeed, the rat may now fare worse in maze M2 than if it had never been exposed to maze M1. (Imagine, for example, a case in which the spatial layout of M1 and M2 is identical, but the exit or goal in M2 is diametrically opposed to the one in M1. In this case, when transported to M2 the rat would head in the wrong direction and end up further away from the goal.)
Imagine, for contrast, a scenario in which the rat has been allowed to familiarize itself with both maze M1 and maze M2, and has now been drugged. The amount of time taken to traverse maze M1 will improve over time, just as for a naive rat that familiarizes itself over time with maze M1. However, the reasons will be importantly different: In the new case, the rat's performance improves because the drugs are wearing off, and not because its behavior is becoming more tailored to maze M1 than to other possible environments. Indeed, as the drugs wear off, the same improvements will be observed in maze M2. This is still improvement, but we would not call it adaptation to M1, because the changes are not specific to M1.
3 Quantifying Fitness Effects of Agent-Environment Interactions
The measures of adaptivity we propose are inspired by and grounded in causal probability and information theory, although detailed prior knowledge of these is not necessary to understand the intuitions behind this work. Here we describe the measures, and leave the technical details to the Appendix.
Imagine an ensemble of agents at time t0, with a variety of different internal states and in a variety of different environments. We can represent this ensemble by a joint probability distribution over random variables X, the agent's state at time t0, and Y, the environment's state at time t0. The probability ℙ(X = x, Y = y) will correspond to the degree to which we consider the pair x, y to be typical of this class of agents in this class of environments. At time t1, after one (discrete) time step, the state of the agent transitions to X′ and the state of the environment to Y′.
The state of the environment will in general affect the organism's dynamics, and vice versa. These dependences can be compactly represented in the causal graph (or causal Bayesian graph) shown in Figure 1. While causal graphs have a precise mathematical definition, for our purposes here it suffices to say that a link from A to B implies that in general B will respond to changes in A when everything else is held constant. Then, reading the graph, we can tell that in general X′ and Y′ depend on X and Y and are distributed as ℙ(x′, y′|x, y) = ℙ(x′|x, y) ℙ(y′|x, y). We assume that causal relations apply strictly forwards in time, and are temporally local, that is, that variables at each time step are directly causally affected only by variables at the immediately preceding time step.
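As a concrete illustration of this factorization, the following sketch builds the joint next-state distribution from the two conditional kernels read off the causal graph. The state-space sizes, ensemble distribution, and (randomly generated) transition kernels are arbitrary illustrative choices, not values from any model in this article:

```python
import numpy as np

rng = np.random.default_rng(0)

nx, ny = 2, 2  # illustrative sizes of agent and environment state spaces

# Joint distribution P(X = x, Y = y) over the ensemble at time t0.
P_xy = np.array([[0.4, 0.1],
                 [0.2, 0.3]])

# Conditional transition kernels read off the causal graph:
# P_xp[x, y] is the distribution P(x' | x, y); P_yp[x, y] is P(y' | x, y).
P_xp = rng.dirichlet(np.ones(nx), size=(nx, ny))
P_yp = rng.dirichlet(np.ones(ny), size=(nx, ny))

def joint_next(x, y):
    """P(x', y' | x, y) = P(x' | x, y) * P(y' | x, y), as in the text."""
    return np.outer(P_xp[x, y], P_yp[x, y])

# Ensemble marginal at t1: sum over (x, y) weighted by P(X = x, Y = y).
P_next = sum(P_xy[x, y] * joint_next(x, y)
             for x in range(nx) for y in range(ny))
```

Because each joint_next(x, y) is an outer product of two probability vectors, it sums to one, and so does the ensemble marginal P_next.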
Finally, we must specify with respect to what metric the organism is adapting or improving. Treatments of adaptation [5, 8] in the artificial life literature often adopt a survival-normative approach, assuming that the only event relevant to an organism's concerns is its own survival or death. However, we will relax this requirement and allow the organism's fitness to be an arbitrary function F : 𝒳 × 𝒴 → ℝ of the organism-environment state.
3.2 Causal Responsivity
Alternatively, this quantity can be seen as representing the portion of expected fitness that depends on the organism receiving incoming information about its actual environment, rather than some random environment consistent with its state: in other words, how much worse off the organism would be in a counterfactual world with the same organism-environment dynamics but where the red arrow in Figure 1 did not exist. In fact, in the Appendix, Section A.4, we show that, as expected, the causal responsivity is zero if there is no information flow along the relevant arrow in the causal graph. Additionally, it is possible for this quantity to be negative, which would indicate that the organism would actually perform better in a scenario where it was receiving information from a random environment—for example, if the organism's response to environment y were particularly harmful to fitness in the same environment y.
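This comparison admits a minimal numerical sketch under one plausible reading of the measure (the exact definition is given in the Appendix): in the counterfactual world, the agent's transition is driven by a random environment y* drawn from ℙ(y*|x) instead of the actual y. All numbers below are made up for illustration:

```python
import numpy as np

nx, ny = 2, 2
# P(y | x): which environments are typical for each agent state (made up).
P_y_given_x = np.array([[0.7, 0.3],
                        [0.5, 0.5]])
# P(x' | x, y): the agent's response to a stimulus from environment y.
P_xp = np.zeros((nx, ny, nx))
P_xp[0, 0] = [0.9, 0.1]; P_xp[0, 1] = [0.2, 0.8]
P_xp[1, 0] = [0.6, 0.4]; P_xp[1, 1] = [0.6, 0.4]  # state x1 ignores y
# F[x', y']: fitness of next agent state in environment y' (environment
# assumed fixed over one step, so y' = y).
F = np.array([[1.0, 0.0],
              [0.0, 1.0]])

def rho(x, y):
    """One plausible reading of causal responsivity (see Appendix)."""
    actual = P_xp[x, y] @ F[:, y]               # response to the actual y
    # Severed arrow: response driven by y* ~ P(y* | x) instead of y.
    cut = (P_y_given_x[x] @ P_xp[x]) @ F[:, y]
    return actual - cut
```

In this toy example ρ(x0, y0) > 0, while ρ(x1, y) = 0 for both environments: State x1 responds identically to y0 and y1, so severing the environment-to-agent arrow changes nothing.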
3.3 Selective Benefit
Selective benefit can be seen as the portion of expected fitness (in the next time step) that depends on the organism being in its actual environment rather than a random environment compatible with its initial state: in other words, how much worse off the organism would be in a counterfactual world where the blue arrow in Figure 1 did not exist.
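Under the same caveat (the precise definition is in the Appendix), this comparison can be sketched as follows: the agent's response to the actual environment y is kept, but its fitness is evaluated in a random environment y* ~ ℙ(y*|x). All numbers below are made up for illustration:

```python
import numpy as np

nx, ny = 2, 2
P_y_given_x = np.array([[0.7, 0.3],    # P(y | x): made-up prior over
                        [0.5, 0.5]])   # environments given the agent state
P_xp = np.zeros((nx, ny, nx))          # P(x' | x, y): made-up responses
P_xp[0, 0] = [0.9, 0.1]; P_xp[0, 1] = [0.2, 0.8]
P_xp[1, 0] = [0.6, 0.4]; P_xp[1, 1] = [0.6, 0.4]
F = np.array([[1.0, 0.0],              # F[x', y']: fitness of next agent
              [0.0, 1.0]])             # state evaluated in environment y'

def sigma(x, y):
    """One plausible reading of selective benefit (see Appendix)."""
    response = P_xp[x, y]                   # response to the actual y
    actual = response @ F[:, y]             # ... evaluated in the actual y
    cut = P_y_given_x[x] @ (response @ F)   # ... evaluated in y* ~ P(y*|x)
    return actual - cut
```

Here sigma(0, 0) > 0: the response tailored to y0 scores better in y0 than it would, on average, in a random environment compatible with x0.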
3.4 Environment Adaptivity
3.5 Normalized Causal Responsivity and Selective Benefit
As an overall measure, expected adaptivity balances the benefit the organism obtains from environment-y-typical perturbations (as opposed to perturbations from a completely random environment). However, if we compare the organism's typical behavior in environment y with its typical behavior in a random environment y*, there is in general a nonzero probability that y* = y and hence that there will be no difference. The larger the prior probability of y, the higher this chance, and the lower the measures ρ(x, y) and σ(x, y) will be. This is unlikely to be what we want when we consider an organism's relationship to a specific environment.
For the sake of simplicity, our examples will consider only the case where the environment's state is fixed. The more general case where the environment's state varies as a consequence of endogenous and interactional dynamics is little different, and omitted here only for reasons of space. In Section 5.3 we briefly discuss how our approach may also be used to address further complexities regarding the evaluation of active effects by the agent on the environment.
4.1 Simple Markov Chain
Let us start with a simple Markov process we can use to illustrate our concepts of causal responsivity and selective benefit. We will assume that the environment is fixed or that its dynamics are much slower than those of the agent, so that the environment's transition does not depend on the agent's state: ℙ(Y′|X, Y) = ℙ(Y′|Y).
Consider an agent with three states, labeled x0, x1, and x2. The agent starts always in state x0, and it can transition to either x1 or x2. There are two possible environments, labeled y0 and y1, with prior probabilities γi = ℙ(yi | x0). A graphical representation of the Markov process is shown in Figure 2. Edges represent transition probabilities, and the numbers next to each node represent fitness values. Edges and numbers in red apply in environment y0, and edges and numbers in blue apply in environment y1.
Parameters βi ∈ ℝ control the relative fitness of state x2 compared to x1 within a given environment (recall that states may have different fitness values in different environments). If βi = 0, both states are equally good in environment yi, whereas as |βi| → ∞, one state becomes infinitely preferable to the other in environment yi.
Parameters γi ∈ (0, 1) represent the probability that the agent is in environment yi given that its internal state is x0—that is, how “typical” the environment yi is for the initial state x0 of the agent. We require that every environment have a nonzero probability of occurring, γi > 0.
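The causal-responsivity calculation for this chain can be sketched numerically. Figure 2's transition probabilities are not reproduced here, so the values of pi = ℙ(x2 | x0, yi) below, the choice γ0 = γ1 = 0.5, and the convention that x1 has fitness 0 while x2 has fitness βi in environment yi are all illustrative assumptions:

```python
import numpy as np

gamma = np.array([0.5, 0.5])   # gamma_i = P(y_i | x0); both nonzero, as required
beta = np.array([2.0, -1.0])   # beta_i: fitness of x2 minus fitness of x1 in y_i
p = np.array([0.8, 0.3])       # assumed P(x2 | x0, y_i); x1 has prob. 1 - p_i

def expected_fitness(p_go_x2, env):
    # With the fitness of x1 taken as 0, E[F] = P(x2) * beta_env.
    return p_go_x2 * beta[env]

def rho(env):
    # Response to the actual y_env vs. response driven by a random
    # environment y* ~ gamma, both evaluated in y_env.
    p_cut = gamma @ p  # mixture transition probability under the cut
    return expected_fitness(p[env], env) - expected_fitness(p_cut, env)
```

With these numbers ρ(y0) and ρ(y1) are both positive: the agent favors x2 where β0 > 0 and x1 where β1 < 0. The quantity vanishes if βenv = 0 or if p0 = p1, matching the two vanishing conditions discussed in the text.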
The normalized causal responsivity ρ*(x0, yi), in turn, is proportional to the relative fitness of one state with respect to the other and to the internal state transition probabilities. Also note that it vanishes in two conditions: when there is no benefit to being in x2 rather than x1 (in environment yi), or when x responds no differently to y0 than to y1. This is consistent with quantifying the first aspect of molding to environment discussed in Section 2.1.
4.2 Adaptation to Environment in a Protocell Model
As a more elaborate example, we applied our measure to a simulated protocell model by Agmon, Gates, and Beer (henceforth, AGB), used recently in the modeling of adaptation to perturbations. This is a two-level hierarchical model simulated at two different time scales; the life span is defined in terms of time steps at the higher level of this model. The lower-level model is a deterministic reaction–diffusion–repulsion system that has a variety of attractor states; these attractor states resemble a central reservoir surrounded by a chemical membrane. In each higher-level time step, an attractor state is subjected to one of several different instantaneous perturbations, and then permitted to go to equilibrium;3 transitions at the higher level of dynamics are therefore from one lower-level stable state x to another lower-level stable state x′. By placing a probability distribution over the perturbations, the model can be formulated as an irreducible discrete Markov process. For technical details about the model we refer the reader to the original articles [1, 2].
In Agmon et al., the authors simulated point perturbations to two different chemical species, at three different magnitudes, in nine distinct locations, for a total of 72 distinct perturbations. Beginning with a single initial stable microstate, they computed a network of other microstates reachable from this initial state by some series of perturbation-relaxation cycles. They arbitrarily terminated their search at 16 cycles, discovering 267 distinct attractors in total within this distance (including the death state), among them 113 attractors whose successors were not computed. For the purposes of calculating viability, they assumed that every uncomputed successor of these 113 attractors was the death state.
This produced a graph with 267 nodes, whose edges were labeled with one of the 72 perturbations. AGB provided us with the data for this graph; we used this data without running the underlying simulation.
We can apply our measures to the AGB model by supposing that the protocell finds itself in one of two different possible environments, each with a different probability distribution over possible perturbations. We can then ask whether the protocell tends to adapt to these different environments by shifting into states that are more robust to the particular set of perturbations found in its environment. In particular, we may consider how a protocell fares in a particular environment, after being exposed to the perturbations from that environment, compared to how it would fare if exposed to perturbations from a random (other) environment. This contrasts with Agmon et al.'s approach, which features only a single unified distribution over perturbations.
We considered two different environments yMem and yAut for the protocell, corresponding respectively to uniformly distributed perturbations of the membrane chemical, and uniformly distributed perturbations of the autocatalyst chemical. The protocell was assumed to begin in each environment with equal probability 0.5, and we identified the fitness F(x, y) of a state x in environment y as the expected life span L(x, y) of that state in that environment, that is, the expected number of transitions from x to the dead state when all perturbations are drawn from environment y.
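The expected life span L(x, y) can be computed without simulation using the standard absorbing-Markov-chain identity: if Q holds the transition probabilities among non-dead states under a given environment's perturbation distribution (the dead state being absorbing), the vector of expected numbers of steps to absorption is (I − Q)⁻¹1. The 3 × 3 matrix below is a toy stand-in, not AGB data:

```python
import numpy as np

# Transition probabilities among three non-dead states under one environment's
# perturbation distribution; the mass missing from each row goes to the
# absorbing dead state. These numbers are illustrative only.
Q = np.array([[0.5, 0.3, 0.0],
              [0.1, 0.6, 0.2],
              [0.0, 0.2, 0.7]])

# Expected life spans: solve (I - Q) L = 1 rather than inverting explicitly.
L = np.linalg.solve(np.eye(3) - Q, np.ones(3))
```

Repeating this calculation with each environment's Q yields the fitness table F(x, y) = L(x, y) used by the measures.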
4.2.1 Measuring α, ρ*, and σ*
Finally, we can calculate the expected life span for x′ in a randomly chosen environment y after a transition x → x′ driven by a perturbation from the same environment y, and compare it with the expected life span after a perturbation from the other environment. The difference between these two expected life spans is equal to 2α(x). The left-hand graph in Figure 3 plots these two values against each other, showing clearly that almost all states (94%) have a positive overall environmental adaptivity. The average α value for all (non-dead) states was 0.28.
The right-hand graph in Figure 3 provides some further information: Those states with the highest α are the ones with the highest expected life span in the membrane perturbation environment yMem (r = 0.95, Pearson's product–moment correlation). Speaking roughly, autocatalyst perturbations appear to be more disruptive than membrane perturbations (as can be inferred from the same graph: Expected life spans in yAut are overall lower than in yMem). Hence, we can expect states with a high life span under repeated membrane perturbations to be the ones that benefit most (on average over both environments) from non-abnormal perturbations.
Figures 4 and 5 show the expected next life span of (non-dead) states x in the yMem and yAut environments, depending on which environment the next perturbation is drawn from. Figure 4 arranges this data to illustrate the ρ* comparison, and Figure 5 arranges it to illustrate the σ* comparison. Note that even though a regime of autocatalyst perturbations is more disruptive than a regime of membrane perturbations, a modest majority (57.1%) of states fare better after suffering a single autocatalyst perturbation than a single membrane perturbation, if they are to be subjected to a regime of autocatalyst perturbations (these are the blue and white circles on the right-hand graph in Figure 4).
A minority of states (white circles) are causally responsive to, or benefit selectively from, perturbations in both environments: These are the states that fall underneath the identity line on both left- and right-hand graphs in Figures 4 and 5. The majority of states, for both measures ρ* and σ*, have a positive value in one environment y and a negative value in the other environment. For instance, the red (blue) circles in Figure 4 represent states that “prefer” membrane (autocatalyst) perturbations respectively, regardless of which environment their viability is actually evaluated in. Similarly, the red (blue) circles in Figure 5 represent states that “prefer” repeated future perturbations to affect the membrane (autocatalyst) chemicals respectively, regardless of what sort of perturbation they happen to receive in the current time step.
Summarizing our findings in the protocell model: Autocatalyst perturbations are substantially more disruptive than membrane perturbations, but despite this asymmetry we measure positive environment adaptivity in 94% of states. We can also break down the environment effects, and observe that the majority of states are suited to one specific environment (i.e., ρ* and σ* are positive in one environment and negative in the other). In the case of membrane perturbations, states with the highest expected life span under repeated perturbations are also the most adaptive.
5.1 Fitness Improvement Relative to Counterfactual Scenarios
We have proposed that organism-level adaptation must be understood in terms of the organism's relationship to its environment, within a context of other possible environments that the organism could have been in, that is, in terms of fitness improvements relative to counterfactual scenarios.
In our framework, an adaptive organism is one that tends to use information from its environment to tailor its internal state for that specific environment. Our causal responsivity measure ρ(x, y) quantifies the portion of an organism's fitness (in state x) that depends on receiving information from its specific environment y, while our selective benefit measure σ(x, y) quantifies the portion of an organism's fitness that comes from its next state being appropriate to the particular environment y. Considered over all typical environments, these two measures have the same average α(x), which we have identified with the environment adaptivity of state x.
Note that, according to our framework, even an organism whose fitness falls steadily over time (as in the case of a dying organism) can still be adaptive to its environment. A high α value in such a case corresponds to an organism whose fitness would fall even faster were it continually switched between random environments.
5.2 Adaptivity and Causal Interventions
This article has made use of Pearl's causal Bayesian network formalism for our definitions in Section 3.2. Since we consider interventions directly on the variables X′, Y′ that feature in the fitness evaluation F(X′, Y′), no causal model is strictly necessary, and we can express our measures directly in terms of ordinary observational probabilities.
However, we have introduced the causal formalism because it helps to clarify the intuitive interpretation of our measures, and because of the important role that causation appears to play in formal treatments of cognition. For instance, Ortega and Braun argue that for a system to be able to take desirable actions, it must regard its own decision-making process as causally independent from the physical world.
Our quantity α(x) is related to Ay and Polani's causal information flow, but has some important differences. We use a utility function F that allows us to distinguish between adaptive and maladaptive variation; causal information flow could tell us whether the environment affected the agent's state, but not whether the effect was beneficial. Along these lines, α(x) is also related to the “value of information” measures introduced by Howard. In contrast to causal information flow, we also make finer-grained distinctions by considering what happens when intervened variables are set according to specific conditional distributions.
We prefer the use of causal Bayesian networks to observational proxies for causation such as transfer entropy (and the special case of Granger causality) because the intervention-based formalism more closely captures the semantics of causal relationships, giving correct results when observational measures fail.
5.3 Action as Information Flow from Agent to Environment
We believe that studying action as information flow from agent to environment constitutes a promising research direction stemming from this work. Future research should clarify the meaning of the fitness effects of agent-to-environment information flows, and their implications for theories of action and agency.
We build on information theory and causal intervention theory to develop a set of new tools to study within-lifetime adaptivity in artificial and biological systems. What distinguishes our approach from previous work in the field is that we measure adaptivity by comparing an organism's fitness in its actual environment with its fitness in other counterfactual environments, instead of tracking the organism's fitness over time. In particular, we define two closely related measures—causal responsivity ρ and selective benefit σ—that quantify specific aspects of the causal information flow in the agent-environment system and how they affect the agent's expected fitness. In this framework, expected adaptivity α comes in as a natural way of quantifying the overall fitness effects of the causal links between agent and environment.
We illustrate our framework by giving specific numerical examples with a simulated protocell model. This constitutes a practical example of how to use the proposed measures to study the adaptivity of an organism in a variety of states and environments.
In summary, we advocate a view of adaptivity as a causal property of the agent-environment system. The contribution of this article is to propose measures of this kind, relying on causal interventions and counterfactual reasoning. These measures are shown to be of practical interest and to be solidly grounded in theory.
The authors would like to thank Eran Agmon for providing parameters for the simulation of the protocell model used in Section 4. We would also like to thank anonymous reviewers who provided helpful feedback, and the editor of Artificial Life for permitting an unconventional submission in the form of dual companion articles.
Both articles are found in this issue of Artificial Life.
They do not report finding any non-point attractors.
Appendix. Causal-Probabilistic Interpretations and Proofs
A.1 Mathematical Prerequisites
In order to formalize the ideas described in Section 2, some mathematical tools are required. Although we covered the basic intuitions in the text without appealing to mathematical details, we now describe those tools and make our formalism more precise.
Our proposed approach is best made sense of within Ay and Polani's information agent framework. The information agent framework uses tools from causal probability and information theory to characterize typical properties of an agent's interactions with its environment, in terms of information flow. This article is not meant as a comprehensive introduction to the information agent framework, so only the key concepts will be summarized here.
Briefly, the information agent framework considers the statistical properties of agent and environment dynamics over a range of counterfactual possible situations. In contrast to well-known sample statistics, such as Pearson's product–moment correlation, which are simple to compute but fail to capture complex nonlinear relationships, the information agent framework considers information-theoretic quantities, which are maximally general (and correspondingly harder to compute or estimate from samples).
In general, agents simultaneously adapt themselves to their environment and adapt their environment to themselves. To use Friston's distinction, agents perform both inference and action. Both of these processes tend to increase the mutual information between agent state and environment state, so it might seem that it would be difficult to disentangle them using purely statistical tools.4
Within standard probability theory, this is true. However, the development of causal probability theory by Pearl makes it possible to identify the direction of causal effects. Pearl's innovation was the concept of an intervention: an externally imposed change of a variable. Informally, when an intervention X̂ = x is applied to a probability space, the term X in the equations for the probabilities of all other variables is replaced with the constant x.
This allows us to write expressions like ℙ(Xi+1 = x′|Ŷi = y): the probability that the agent will next be in state x′, if we externally force the current environment into state y without directly affecting anything else. This is usually not equal to ℙ(Xi+1 = x′|Yi = y), that is, the probability that the agent will next be in state x′ if we happen to observe y.
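The difference between the two quantities shows up whenever a common cause links the environment to the agent's next state. The following toy example (all numbers made up) has a hidden variable Z driving both Y and X′, so conditioning on Y = y carries information about Z that an intervention Ŷ = y does not:

```python
import numpy as np

P_z = np.array([0.5, 0.5])            # hidden common cause Z
P_y_given_z = np.array([[0.9, 0.1],   # P(y | z)
                        [0.2, 0.8]])
P_xp_given_zy = np.zeros((2, 2, 2))   # P(x' | z, y): here z alone drives x'
P_xp_given_zy[0] = [[0.9, 0.1], [0.9, 0.1]]
P_xp_given_zy[1] = [[0.3, 0.7], [0.3, 0.7]]

def observe(y):
    """P(x' | Y = y): conditioning lets y carry information about z."""
    joint = P_z[:, None] * P_y_given_z[:, y, None] * P_xp_given_zy[:, y, :]
    return joint.sum(axis=0) / joint.sum()

def do(y):
    """P(x' | do(Y = y)): z keeps its prior; y is forced from outside."""
    return np.einsum('z,zx->x', P_z, P_xp_given_zy[:, y, :])
```

Here observe(0) ≈ [0.79, 0.21] but do(0) = [0.6, 0.4]: forcing the environment into y0 does not make the favorable value of Z any more likely.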
A.2 Interventionized Expectations
Ay and Polani consider such a scenario in order to define conditional causal information, and stipulate that the distribution over intervened values must match the marginal observed under q (in our case the conditional marginal P(B|a)). However, we will in general want to consider the case in which P(B̂) varies according to some other (conditional) marginal P(B|c).
A.3 Connections with ρ, σ, and α
In this section we draw connections between the three measures of adaptation we have proposed that are particularly illuminating and straightforward when formulated in terms of causal probability theory and the interventionized expectations introduced above.
A.4 Connections with Mutual Information
In this section we draw a few useful connections between our adaptivity measures and (conditional) mutual information as commonly used in standard information theory.
Note that for the conditions above the converse does not hold: For example, when F(x′, y′) = k for all x′, y′ (i.e., fitness is constant regardless of organism-environment state), then α(x) must equal 0 even if I(X′; Y′|x) > 0.
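This one-directional relationship can be checked numerically. In the sketch below (made-up joint table, and an α-like quantity used purely as an illustrative proxy for the fitness-difference structure of our measures), a constant fitness forces the fitness difference to zero while the conditional mutual information remains strictly positive:

```python
import numpy as np

# Joint distribution P(x', y' | x) for some fixed x (made-up numbers).
P = np.array([[0.4, 0.1],
              [0.1, 0.4]])

def mutual_info(P):
    """I(X'; Y' | x) in bits for a 2-D joint probability table."""
    Px, Py = P.sum(axis=1), P.sum(axis=0)
    mask = P > 0
    ratio = P[mask] / np.outer(Px, Py)[mask]
    return float((P[mask] * np.log2(ratio)).sum())

mi = mutual_info(P)  # strictly positive: X' and Y' are coupled

# Constant fitness F(x', y') = k: any difference of F-expectations vanishes.
F = np.full((2, 2), 3.0)
alpha_like = (P * F).sum() - (np.outer(P.sum(1), P.sum(0)) * F).sum()
```

alpha_like compares the expected fitness under the actual coupling with that under an independent (product) coupling; with a constant F it is zero even though mi is strictly positive.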