## Abstract

Intermittency is ubiquitous in animal behavior. We describe a coordination problem that is part of the more general structure of intermittent adaptation: the *adjustment-deployment dilemma*. It captures the intricate compromise between the time spent in adjusting a response and the time used to deploy it: The adjustment process improves fitness with time, but during deployment the fitness of the solution decays as environmental conditions change. We provide a formal characterization of the dilemma, and solve it using computational methods. We find that the optimal solution always results in a high intermittency between adjustment and deployment around a non-maximal fitness value. Furthermore, we show that this non-maximal fitness value is directly determined by the ratio between the exponential coefficient of the fitness increase during adjustment and the decay coefficient during deployment. We compare the model results with experimental data obtained from observation and measurement of intermittent behavior in animals. Among other phenomena, the model is able to predict the uneven distribution of average duration of search and motion phases found among various species such as fishes, birds, and lizards. Despite the complexity of the problem, we show that it can be solved by relatively simple mechanisms. We find that a model of a single continuous-time recurrent neuron, with the same parametric configuration, is capable of solving the dilemma for a wide set of conditions. We finally hypothesize that many of the different patterns of intermittent behavior found in nature might correspond to optimal solutions of complexified versions of the adjustment-deployment dilemma under different constraints.

## 1 Introduction

Most models of biological behavior are based on steady state assumptions, considering that the processes governing organisms occur in a constant and sustained way. However, activity in living beings at many different levels often happens in bursts, in states of marginal instability in which pauses are alternated with brief activity. These forms of intermittency have been linked [17] with processes of adaptation to ever-changing and unpredictable environments, revealing the importance of continuous interaction with the world when dealing with unknown situations. More importantly, intermittency brings up the importance of timing and coordination in cognitive processes, connecting with dynamical perspectives that have had a major impact in behavioral and cognitive sciences [5, 13, 26] during the last two decades.

Intermittent locomotion is a widespread biological phenomenon. Many organisms' behavior (ranging from protozoans to mammals) is intermittent: They move, pause briefly, and move again. These pauses last from milliseconds to minutes, being part of a dynamical system by which organisms adjust their behavior to changing environments [17].

Despite the energy costs of acceleration and deceleration, a variety of benefits arise when pauses are alternated with action. Intermittent bounding and undulating flight modes in birds (which alternate periods of flapping with pauses where wings are either extended to permit gliding or held close to the body) save mechanical power compared to continuous flight over a broad range of speeds [21]. A similar effect takes place in fishes during *burst-coast* swimming [28]. Many species, when chasing a prey, alternate pauses and moves to stabilize their sensory field. Thus, while moves tend to be straight, both pursuits of a prey and changes of direction are initiated after pauses [14, 18, 25]. *Saltatory search* in foraging animals (from insects and lizards to mammals) minimizes the search time by alternating phases of fast motion and phases of intensive search [3, 8].

Additionally, intermittent behavior has benefits that are related not so much with locomotion dynamics as with the dynamics of sensorimotor processes such as attention to the visual field. For example, when examining the visual field, eye movement is not smooth but alternates rapid movements (saccades) with stable intervals (fixations) [22]. Other examples include primates pausing briefly while moving between trees in the canopy, the pauses being related to the requirement to identify a route for the next movement sequence [12]; or humans balancing a pole on their fingertips, displaying *on-off intermittency*, where most of the time the equilibration is stable, and corrective movements occur in quick bursts [10].

Broadly speaking, the nature of intermittent behavior can be considered to have two (non-exclusive, yet radically different) origins:

1. Intermittent behavior is just an epiphenomenon of embodied behavior; it is physical or dynamical constraints (e.g., the muscles needing to rest after some time of activation, or the refractory period of a neuron just after a spike, when it cannot fire again) that provoke intermittent behavior.

2. Intermittent behavior results from a strategy developed by organisms to face the challenge of dynamically adjusting their behavior to changing environments.

We may ask ourselves if the overabundance, and specific patterns, of intermittent modes of behavior in living beings are the result of a general strategy developed at different levels of biological organization to adapt to complex, ever-changing environments. This is the central question of this article. We do not deny that intermittent behavior is produced by physical or physiological constraints. But if these constraints appear so frequently, maybe it is because they play a fundamental role in organisms' adaptation. Neither do we suggest that all kinds of intermittency are caused by such a strategy. Most probably there are many instances of intermittent behavior that do not serve adaptive purposes. However, its presence over such a wide range of temporal scales and different types of organisms and forms of behavior suggests that there might be a more general reason for its existence than just the particular constraints that different organisms have to deal with. In this article we will try to show that there is in fact a general and wide adaptive problem, what we call the *adjustment-deployment dilemma* [1], for which intermittent behavior is an optimal solution.

We provide a formal characterization of the adjustment-deployment dilemma in which a system has to find an equilibrium between two complementary stages: adjustment of behavior to an environmental condition, and the deployment of that behavior. This dilemma captures the difficult compromise between the time spent in adjusting a response and the time used to deploy it: The adjustment process improves fitness with time, but does not directly benefit the organism or contribute to the task goal until the adjusted response is deployed. However, during deployment the benefit of the adjustment phase starts to decay as the environment changes. As a result, if you spend too little time adjusting your behavior, the results of your action are poor, but if you deploy for too long, the adjusted solution is no longer valid. As we shall see, the adjustment-deployment dilemma is able to offer a formal explanation of specific patterns of intermittent behavior in terms of adaptive efficiency. It shows how, under some conditions, the best response of a system (be it a neural ensemble, a control system, an organism, etc.) is to rapidly alternate between different modes of behavior.

The article is structured as follows: Sections 2 and 3 introduce a general minimal formal model of intermittent adaptation and the assumptions behind it. Section 4 compares its results with experimental data. In Section 5 we present a minimal implementation that is able to solve the dilemma for a wide range of environment dynamics. Finally, Section 6 suggests some directions for future research, and Section 7 discusses the implications of the presented model.

## 2 The Adjustment-Deployment Dilemma: Characterization, Scope, and Modeling Assumptions

The challenge we are proposing is to present a model that can explain different kinds of adaptive intermittency within a common framework. The scope is meant to be general and multi-scale, covering interspecies differences as well as various types of intermittent behavior within the same organism. Ultimately, we expect it to be applicable to intermittent functioning of different components within a behavior generating mechanism (although, throughout the article, we will favor a behavioral agent-environment interpretation).

We consider that some of the previous approaches to intermittency (see [1] for a review) might have failed to build this common framework because they were typically limited to the study of a particular case of intermittent behavior. It is also worth noting that there have been few, if any, attempts to describe intermittency as a general systemic property that results from agent-environment adaptive feedback dynamics with a specific temporal structure. Descriptions of intermittent behavior in organisms are typically concerned only with the actions of the agent, dismissing the role of the environment and the agent's coupling with it. Part of the problem lies in the fact that a systemic approach brings forth many problems due to the great complexity of the many levels of interaction present in living beings. One aspect of this complexity is the fact that some adaptive properties are not shown in a single instance of behavior, but in a wider context of recurrent agent-environment interactions. Examples of adaptive and yet counterintuitive (when studied in a single instance) forms of behavior are ubiquitous in nature—some examples include group foraging [23], host-parasite-predation interactions [16], and mating behavior [2]. We will favor a systemic approach, tackling the complexity of the task by reducing our model to a minimal expression that will still be able to capture the essence of the problem. Trying to reach the mathematical abstraction behind intermittent adaptation, we assume the simplest possible case of intermittency, where:

1. The system switches between only two possible behaviors. We do not consider situations where the system has a more complex repertoire of behaviors.

2. These behaviors cannot overlap in time; neither can there be a situation where the system is not performing either of them.

3. Transition times between states (modes of behavior) are considered to be small enough to be neglected.

4. Though many behaviors might adapt the system to its environment (adjustment) or improve the system's situation in that environment (deployment) simultaneously and to different degrees, we consider that the system cannot do these two things at the same time.

5. It is, in principle, possible to measure the fitness of the agent-environment relationship at any given time: potential or virtual fitness during adjustment, and real or effective fitness during deployment.

Needless to say, these assumptions do not cover all the possible instances of intermittent behavior found in nature. In some cases behaviors are going to overlap in time, or transition times between behaviors will not be negligible. However, what we intend here is to make a first approximation, taking a minimal example that preserves the essence of the intermittent phenomena. Despite its simplicity, this minimal model can represent many cases of intermittent behavior in animals. In order to illustrate the dilemma and its mathematical formulation, we shall make use of an example of the type of phenomena we are about to model: a predator having to decide whether to run chasing a prey or stop to stabilize its visual field and precisely locate the position of the prey (Figure 1).

In our model, we assume that an intermittent system generates a pattern that combines two mutually exclusive stages:

- *Adjustment* is a behavior that improves the position of the organism and increases its possibilities of achieving its goals by adjusting its possibilities for effective action in a specific task. In our example, adjustment corresponds to stopping during a pursuit to stabilize the visual field and localize the prey's position.
- *Deployment* is a behavior that takes advantage of the possibilities generated during the previous phase, executing an action that makes them effective (deploys them). Moving toward the chased prey corresponds to the deployment phase in our example.

The intermittency between adjustment and deployment is not a mere sequential ordering of phases of adjustment followed by phases of deployment, but poses a problem of functional coordination dynamics: How much time do I need to spend focusing on a prey before I move? What is the best ratio between stopping for sensory stabilization and moving during a pursuit? A correct dynamic equilibrium between adjustment and deployment is crucial in most cases and might change under different circumstances. We have coined the term *adjustment-deployment dilemma* to name a generic characterization of this problem. To our knowledge, no theoretical, mathematical, or simulation approach has yet explicitly addressed it.

There are, however, several models that address the nature of intermittent behavior in specific contexts. Interesting research has been developed in the field of intermittent search strategies, where kinetic models have been proposed in which intermittent behavior emerges as an optimal strategy for detecting prey [9, 11]. Likewise, intermittent behavior in gradient-climbing organisms based on sporadic cues and partial information has been modeled by the so-called *infotaxis* model, in which the searcher adopts a strategy of movement alternating exploration and exploitation phases in order to maximize the expected rate of information gain [27]. Intermittent flight modes in birds have also been modeled in terms of aerodynamic and energetic considerations [21].

Still, such models represent particular instances of intermittent behavior, which arise from particular constraints of the task the organism is facing. The adjustment-deployment model presented here aspires to be a general model of intermittent behavior, able to provide an explanation for the recurrence of intermittent solutions emerging as optimal strategies in a wide range of tasks. In the following sections, we try to define the simplest model that is able to capture the essence of different kinds of intermittent behavior, abstracting away some of the particularities of the specific task models mentioned above.

## 3 Formalization of the Adjustment-Deployment Dilemma

In order to explore the mathematical core of the adjustment-deployment dilemma, we have simplified the problem to its minimal form. In general terms we have a system adjusting its behavior (or solution to a problem) and then executing or deploying it. We can take as an example the case of a toad chasing a prey, having to alternate movement with pauses for stabilizing its visual field [18]. The toad cannot see while it is moving, because its visual field blurs. Thus, the toad has to stop for some instants to locate the position of the prey. In the absence of obstacles the toad moves toward the position where the prey was just the instant before the toad started to move. Prey velocity has no influence on the direction of the toad's movement. Also, while the toad is moving, it is not going to correct its course if the prey changes its position (it cannot perceive such a change). The distance the toad hops in a single bound depends on the initial separation between the toad and prey, and it is not altered if the prey vanishes or moves during the toad's approach. Both the distance moved and the direction of the toad are uncorrected by visual feedback until the toad stops its movement (Figure 1).

In terms of our adjustment-deployment dilemma, the toad has to alternate between a *move* state (deployment), where it can approach the prey, and a *stop* state (adjustment), in which it can stabilize the image it perceives and update the information about the prey's position. Thus, the toad has to find an equilibrium between how much time it spends adjusting its visualization of the prey and how much time it subsequently spends deploying a pursuit behavior. We can also see how the relative amount of time spent in either state is going to depend on the dynamics of the situation. When the prey moves slowly or is far away, the toad has less need to adjust its behavior and can move for longer periods; when the prey moves fast or is too close, the toad has to stop and adjust its orientation more frequently, leaving less time for effectively moving toward the prey.

More explicitly, we express the model in a set of mathematical terms, summarized in Table 1 and introduced below.

| Concept | Notation | Formulation | Description |
|---|---|---|---|
| Fitness | *f*(*t*) | Adjustment (virtual): *f*(*t*) = 1 − *e*^{−t/τ}; Deployment (effective): *f*(*t*) = *e*^{−t/ε} | Evolution of the virtual or effective fitness in relation to the task goal during a behavioral phase (increases during adjustment, decays during deployment). |
| Choice | γ(*t*) | Adjustment: γ(*t*) = γ_{0}; Deployment: γ(*t*) = γ_{1} | Binary exclusive choice of a system over time between adjustment and deployment. |
| Performance | | | Mean accumulated effective fitness during the task duration. |
| Optimal solution | *f*^{∗}(τ, ε) | | Optimal choice dynamics that maximizes mean effective fitness. |


### 3.1 Fitness

Fitness represents the mean ability of a system to maximize the chances of achieving its goals, that is, obtaining a successful solution for a given problem or situation. The fitness or quality of a solution at an instant *t* is denoted by a fitness function *f*(*t*) ∈ [0, 1]. Note that fitness here does not mean evolutionary or survival fitness directly. Rather it denotes fitness in relation to the task goal (e.g., following a prey), which shall in turn make a contribution to survival fitness (e.g., catching and eating the prey). We will assume that:

1. The system has an adjustment mechanism for improving its behavior with respect to the environment. We assume that the functional relation between the quality of a solution and time during adjustment is known, that it is nonlinear (the effort needed to obtain better results grows in relative terms with time), and that it is exponential: *f*(*t*) = 1 − *e*^{−t/τ}, where τ is the adjustment speed.

2. We assume that the solution degrades over time due to environmental changes during the deployment phase. The functional dependence between the quality of a solution and time is also taken as exponential: *f*(*t*) = *e*^{−t/ε}, where ε stands for the degradation rate.

It has to be noted that although the fitness is generated during the adjustment phase, it can only be exploited during the deployment phase. To stress this fact, we will refer to the fitness value of the solution as *virtual* while the organism is adjusting and *effective* when the agent deploys the solution.

In the case of the toad, the fitness will correspond to the alignment between the toad's orientation and the prey's azimuth: when the toad points directly at the prey, the fitness value is 1, and it decreases as the prey changes its position. In general, we are going to consider ε and τ as constants defining the dynamics of an environment. However, in Section 5 we will show how a system can adapt to an environment where it has to face different possible values of ε and τ.

At this point we have to make an important clarification. Even if we define *f*(*t*) as a deterministic function, it does not mean that the environment we are modeling is predictable. On the contrary, the adjustment-deployment dilemma describes a situation where a system has to adapt to an uncertain and changing environment. We can know how fast the system is changing (how fast the prey moves away from the toad, or how fast the toad stabilizes its visual field), but we do not know *how* the system is changing, that is, we cannot predict the future position of the prey. Knowing the adjustment and degradation rates of the changes in the system does not mean that we can predict the state of the environment, just the state of the current fitness or adaptation level. Throughout the article, when we refer to the uncertainty or unpredictability of the environment, we shall be referring to this impossibility of predicting future states of the environment.

### 3.2 Choice

The resolution structure of the dilemma can be captured with a single variable denoted by γ(*t*) ∈ {γ_{0}, γ_{1}}, that is, as the binary exclusive choice of the system over time, with γ_{0} representing adjustment and γ_{1} deployment.

### 3.3 Formulation

The previous formalization results in the following equations describing the behavior of the system:

- *Adjustment:* *f*(*t*) = 1 − *e*^{−t/τ}, γ(*t*) = γ_{0}
- *Deployment:* *f*(*t*) = *e*^{−t/ε}, γ(*t*) = γ_{1}

The structure of the dilemma can thus be reduced to finding the strategy (i.e., the form of γ(*t*)) that obtains the best results. However, the crucial point is that either adjustment or deployment requires a minimum duration to have an effect (e.g., a toad cannot perform half a jump). Therefore, after the value of γ(*t*) switches, it has to be maintained for a minimal time span. We will refer to the minimal adjustment and deployment periods as *h*_{A} and *h*_{D}, respectively. Thus, the function γ(*t*) can be reduced to a discrete sequence {γ_{k}}, where each value has to be maintained for its corresponding period, *h*_{A} or *h*_{D}. For example, each discrete value of γ_{k} could correspond to an action where the toad either performs a complete jump or stays still for a period of time.
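As an illustration of this discrete formulation, the fitness dynamics under a given choice sequence can be simulated directly. The following is a minimal Python sketch (function names, Euler step, and example sequence are ours, not part of the model):

```python
def simulate(gamma_seq, tau, eps, h_A=0.1, h_D=0.1, dt=0.001):
    """Integrate the fitness dynamics for a discrete choice sequence.

    Each entry of gamma_seq (0 = adjustment, 1 = deployment) is held
    for its minimal period h_A or h_D.  Returns the mean effective
    fitness, i.e., fitness accumulated only during deployment."""
    f, t, effective = 0.0, 0.0, 0.0
    for g in gamma_seq:
        period = h_D if g == 1 else h_A
        for _ in range(int(round(period / dt))):
            if g == 1:                        # deployment: f decays at rate 1/eps
                f += dt * (-f / eps)
                effective += dt * f           # fitness is only "cashed in" here
            else:                             # adjustment: f rises toward 1 at rate 1/tau
                f += dt * (1.0 - f) / tau
            t += dt
    return effective / t

# Strict alternation between adjustment and deployment:
perf = simulate([0, 1] * 100, tau=1.0, eps=1.0)
```

For τ = ε the oscillation settles around intermediate fitness values, and the mean effective fitness stays well below the instantaneous maximum, anticipating the results discussed below.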

### 3.4 Adjustment-Deployment Model

Given the choice function γ(*t*), we can specify the evolution of the fitness over time. The agent's performance will be obtained by integrating the fitness of the system during the deployment periods (as we said before, we assume that the system can only take advantage of its situation in the world during the deployment phase, where fitness becomes effective, unlike the virtual fitness obtained during adjustment periods). We will take γ_{0} = 0 and γ_{1} = 1, allowing us to combine both previous functions in a single equation describing the global behavior:

d*f*/d*t* = (1 − γ(*t*)) · (1 − *f*(*t*))/τ − γ(*t*) · *f*(*t*)/ε

Having the problem so defined, the best solution is the one that offers a maximum value of the mean effective fitness. Yet the mathematical solution to the problem is nontrivial: The cost of computing the evolution of *f*(*t*) for every possible combination of values of γ(*t*) is prohibitive. In order to circumvent this problem we have used the Bellman algorithm [7] for finding the optimal solution in a recursive way (Appendix 1 shows the mathematical derivation of the solution). The result is shown in Figure 2, where we can see the solution of the problem for given values of τ and ε. We see how the optimal strategy for solving the adjustment-deployment dilemma is not the one that maximizes the fitness at a given instant of time. Instead, the best solution is one that reaches a suboptimal fitness level and, instead of enhancing it, maintains it constant through time. Also, if the result is constrained to a limited time window, the final steps of the behavior exploit all the accumulated fitness in a final prolonged deployment phase. Our results show that the global solution crucially depends on the accurate coordination of the agent-environment interactions.
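The recursive optimization can be illustrated with a finite-horizon dynamic program over a discretized fitness grid. This is a sketch of the Bellman approach under our own discretization (grid resolution and function names are illustrative), not the derivation of Appendix 1:

```python
import math

def optimal_policy(tau, eps, h_A=0.1, h_D=0.1, K=100, bins=200):
    """Finite-horizon Bellman recursion on a discretized fitness grid.

    V[f] is the maximum effective fitness accumulable from fitness f;
    each decision either adjusts for h_A or deploys for h_D."""
    grid = [i / (bins - 1) for i in range(bins)]
    a = math.exp(-h_A / tau)        # adjustment map: f -> 1 - (1 - f) * a
    d = math.exp(-h_D / eps)        # deployment map: f -> f * d
    V = [0.0] * bins                # value with no decisions left
    policy = []
    for _ in range(K):
        new_V, choice = [], []
        for f in grid:
            f_adj = 1.0 - (1.0 - f) * a
            f_dep = f * d
            # effective fitness gathered while deploying for h_D:
            reward = f * eps * (1.0 - d)
            v_adj = V[min(bins - 1, int(f_adj * (bins - 1)))]
            v_dep = reward + V[min(bins - 1, int(f_dep * (bins - 1)))]
            new_V.append(max(v_adj, v_dep))
            choice.append(1 if v_dep >= v_adj else 0)
        V, policy = new_V, policy + [choice]
    return V, policy

V, policy = optimal_policy(tau=1.0, eps=1.0)
```

The recovered policy adjusts at low fitness and deploys at high fitness, so the optimal trajectory oscillates between the two around an intermediate fitness value.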

### 3.5 Intermittent Adaptation: Maximizing Interactions with the Environment

What do the previous results mean? As seen in Figure 2, the obtained optimal strategy for solving the adjustment-deployment dilemma tends:

- not to maximize fitness, but to reach an intermediate value *f*^{∗}(τ, ε), which is kept until the process is about to end;
- to maximize the number of behavioral changes (i.e., the alternation between adjustment and deployment).

Let's go back to the toad. We have an agent that has to act in a changing environment. When the toad is performing a task in this environment (chasing the prey), it does not know how the environment is changing. However, we suppose that the toad knows (i.e., can adapt to) how fast the environment (the position of the prey) is changing, so it can have a measure of how long it can move before the direction of its movement is no longer valid. In an intuitive first approach to the problem, we could think that what the toad has to do is just change its orientation until it is pointing toward the prey, and then move until it reaches a point where its orientation is no longer valid.

The optimal fitness value, *f*^{∗}(τ, ε), is going to depend on the relation between τ and ε. Appendix 2 shows that, under some assumptions, the value of *f*^{∗}(τ, ε) can be computed by the following equation (Figure 3):

*f*^{∗}(τ, ε) = ε/(τ + ε)

In a nutshell, the optimal solution to the adjustment-deployment dilemma can be captured in the following dictum: When the environment changes, the best behavior is the one that maximizes the number of interactions with the world, the optimal fitness level being determined by the ratio between the organism's adjustment speed and the environmental rate of change. The timing between interactions (i.e., the transitions between adjustment and deployment in the oscillations) is determined by physiological and environmental constraints (for a list of timings in different animals, see [17]). The last part of the conclusion is especially interesting, since it adds a new condition for adaptation by means of intermittent behavior. According to this result, adaptation to the environment is not always going to require well-adjusted solutions. Instead, suboptimal solutions combined in an intermittent way will be the best strategy to cope with changing environments. Moreover, this suboptimal fitness value is not going to be determined by either the agent or the environment alone, but will result from the dynamical coupling between both.

## 4 Comparison with Experimental Data

The optimal fitness value *f*^{∗}(τ, ε) determines the amount of time that a system spends in adjustment and deployment. Specifically, under some conditions (see Appendix 2), the relative time spent in deployment, *r*_{dep}, is going to be equal to the optimal fitness value:

*r*_{dep} = *f*^{∗}(τ, ε)

It follows that when adaptation is slower than environmental changes, an organism will need to spend more time in adjustment than in deployment. It will also be forced to develop strategies with poorer solution quality. This is consistent with empirical data:

- In adult viviparous lizards, *r*_{dep} is around 0.7 to 0.8 for general locomotion, while it is reduced to nearly 0.25 when the lizards are actively searching for prey [4]. That is, when an agent has enough time to exploit its adjustment, it can afford high-fitness strategies (Figure 2c), while low-fitness strategies will be developed by an agent when the available deployment time is smaller (Figure 2b).
- Several studies have pointed out behavioral changes in animals looking for prey as the search environment changes. When prey are more difficult to detect or when environments are visually more complex, the value of *r*_{dep} decreases [20, 24].

The percentage of the time spent in deployment varies greatly among different organisms. As seen in [17], *r*_{dep} ranges from 0.04 to 0.94 for different tasks and species. Also, according to experimental data [8] (Figure 4), *r*_{dep} follows a bimodal distribution in foraging animals, meaning that most foragers either spend more time searching than moving or spend more time moving than searching; very few foragers spend similar amounts of time searching and moving. Such results are consistent with Figure 3: if ε/τ is assumed to be log-uniformly distributed (i.e., if we assume that activity in nature occurs with similar probability at all temporal scales), in most cases *r*_{dep} will be either small or large, and only in a small percentage of cases will it have medium values. The sigmoidal relation between *r*_{dep} and ε/τ makes it likely that distributions of *r*_{dep} will be overrepresented at the extremes (for small and large values).
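This overrepresentation at the extremes can be checked numerically. As a stand-in assumption for the sigmoidal relation of Figure 3, we take *r*_{dep} to be a logistic sigmoid of log₂(ε/τ) and sample the ratio log-uniformly:

```python
import random

random.seed(1)
# Stand-in assumption: r_dep = sigmoid of log2(eps/tau), with
# log2(eps/tau) drawn uniformly from [-4, 4] (log-uniform ratio).
samples = [1.0 / (1.0 + 2.0 ** -random.uniform(-4, 4))
           for _ in range(100000)]

extreme = sum(r < 0.25 or r > 0.75 for r in samples) / len(samples)
middle = sum(0.4 <= r <= 0.6 for r in samples) / len(samples)
# Extreme values of r_dep turn out to be far more frequent than
# intermediate ones, i.e., the distribution is bimodal.
```

Roughly 60% of the samples fall below 0.25 or above 0.75, while only about 15% fall in the 0.4–0.6 band, reproducing the bimodal pattern qualitatively.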

## 5 A Minimal Model Implementation of the Adjustment-Deployment Dilemma

We have presented a formal characterization of the adjustment-deployment dilemma and a formal optimal solution for different parametric configurations of the dilemma. We can now ask the following questions: Can evolutionary, developmental, or learning processes lead to an organism that can find (or approximate) this solution? If so, what is the simplest mechanism that can match an optimal solution to the adjustment-deployment dilemma?

In order to answer these questions we have used artificial evolution to evolve a behavioral selection mechanism whose simplicity and biological plausibility could be assumed for a wide range of organisms. We have used continuous-time recurrent neural networks (CTRNNs) to implement a dynamical system capable of developing the optimal strategy for a wide range of possible situations (i.e., ε-values). CTRNNs have been among the most popular neural controllers for designing adaptive systems within a dynamical perspective [6]. They constitute a good choice for the proposed task because (1) they are the simplest nonlinear, continuous dynamical neural network model; and (2) despite their simplicity, they are universal dynamics approximators, in the sense that, for any finite interval of time, CTRNNs (provided that there is no constraint on the number of nodes) can approximate the trajectories of any smooth dynamical system [15].

The state equation of a CTRNN with *N* neurons is

α_{i} d*y*_{i}/d*t* = −*y*_{i} + ∑_{j=1}^{N} *w*_{ij} σ(*g*(*y*_{j} + θ)) + *I*_{i}

where *i* = 1, 2, …, *N*, *y*_{i} is the state of each neuron, α_{i} is its time constant (α_{i} > 0), *w*_{ij} is the strength of the connection from the *j*th to the *i*th neuron, θ is a bias term, *g* is a gain term, σ(*x*) = 1/(1 + *e*^{−x}) is the standard sigmoidal activation function, and *I*_{i} represents a constant external input. We allow each neuron to have external input information about three variables in the environment. Each neuron will have access to (1) the current quality of the solution being implemented (the value of the fitness *f*(*t*)), (2) how fast the virtual fitness improves over time in the adjustment phase, and (3) how fast the effective fitness decays during the deployment phase. We define the external input for each neuron as the weighted sum of these three variables:

*I* = *s*_{1} *f*(*t*) + *s*_{2} *f*′_{A}(*t*) + *s*_{3} *f*′_{D}(*t*)

where *f*′_{A}(*t*) is equal to the last value of *f*′(*t*) when the system was in the adjustment phase (γ(*t*) = 0), and *f*′_{D}(*t*) is equal to the last value of *f*′(*t*) when the system was in the deployment phase (γ(*t*) = 1); here *f*′(*t*) is the first derivative of *f*(*t*). Note that *f*′_{A} and *f*′_{D} cannot be simultaneously active, so *f*′_{D} = 0 in adjustment and *f*′_{A} = 0 in deployment.

The CTRNN runs with an Euler step of *h* = 0.01 s, and we initially assigned arbitrary values to the adjustment and deployment minimal steps, *h*_{A} = *h*_{D} = 10 · *h* = 0.1 s, which, according to [1], is approximately the mean duration of pauses and movement periods in intermittent animal behavior.

Once the neural networks are defined, we want to find the minimal configuration that successfully solves the adjustment-deployment dilemma. Also, we want the network to be able to adapt to different values of τ and ε.

In order to find appropriate values for the parameters of the network, we used artificial evolution: a rank-based genetic algorithm with elitism and binary encoding [28]. We ran different genetic algorithms on populations of 60 neural networks for 12 generations, each genetic algorithm having a population of networks of a different size, from one to six neurons. Each neural network was evaluated against an environment with changing parameters during trials of *T* = 200 s. The environmental parameters were changed every 10 s, generating new random values of τ and ε as 2^{x}, where *x* was uniformly distributed in the interval [−4, 4]. The fitness function of the genetic algorithm was the performance of the system in the adjustment-deployment dilemma, that is, its mean effective fitness (see Equation 3). The mutation probability was set to 0.01 for each binary digit of the chromosome.

### 5.1 Minimal Intermittent Adaptive Structure

The minimal structure able to solve the dilemma turned out to be a single neuron, with evolved input weights *s*_{1} = −10, *s*_{2} = 9.3548, and *s*_{3} = −4.8387.

As shown by the value of ω, the self-connection weight, the neuron has a negative-feedback self-connection. It behaves as a nonlinear oscillator when interacting with its environment. As Figure 5 shows, the neuron is able to adapt perfectly to different values of τ and ε by modulating its oscillations through interaction with the environment (Figure 6). Furthermore, the neuron is also able to adapt to different values of *h*_{A} and *h*_{D} without previous training (Figure 6). However, the dynamics of the neuron, concretely its time constant α, limits how fast intermittency can happen.

## 6 Discussion

The present model can be expanded and improved in several ways. Some of the underlying assumptions could be relaxed and the model complexified. For instance, many crucial temporal aspects of the adjustment-deployment dilemma were left aside in this study, and several of them suggest avenues for future research: forced perceptual delays, evaluation delays (organisms need some time to taste a food source, or to evaluate the outcome of their interactions), possible overlap between adjustment and deployment, constraints on deployment duration, and the like could be included in future developments. The measurement of fitness could also be enriched with additional cost functions associated with deployment (energy expenditure), adjustment (risk of being detected or hunted), or intermittency itself. It is also important to acknowledge the lack of embodiment of the current model. This was crucial to achieve a model of wide generality, but applications will have to include a variety of spatial and embodiment constraints. Along these lines, future development should also include reference to and modeling of specific examples of animal behavior that face different versions of the adjustment-deployment dilemma, in order to compare the model's predictions with experimental data and adjust the relevant parameters and dimensionality of the model.

Finally, for a more general model of intermittency, we should study cases with more than two possible behaviors, together with more complex dependencies of the fitness function on the world, allowing the agent to perform behaviors that simultaneously adapt to the environment (adjustment) and benefit from it (deployment) to different degrees.

Regarding the minimal mechanism capable of optimally solving the dilemma, its strongest limitation lies in the required input. Although the mechanism itself is simple (a unique and highly simplified neuron), it demands high-quality information about the problem (current fitness and indicators of current fitness change during deployment and adjustment) in order to perform the task. It is very unlikely that an organism has direct access to this information in any given task environment. Future work should relax this assumption and try to find mechanisms that can solve the problem with poor or partial information. Alternatively, the possibility of more complex mechanisms could be explored, including specific cognitive mechanisms that could process sensory information and deliver the required input to an “adjustment-deployment neuron.” It is more likely, however, that organisms do not modularize the problem and that alternative solutions emerge out of brain-body-environment dynamics that were not considered in this article. In order to explore this possibility, full agent-environment models could be developed where the dilemma and its solution might emerge as higher-order phenomena from lower-level behavioral/adaptive capacities and the recurrent sensorimotor coupling with the environment. In any case, the presented minimal mechanism provides a proof of concept showing that, despite the apparent mathematical complexity of the dilemma, relatively simple organisms could, at least in principle, find optimal solutions to it.

## 7 Conclusion

We have shown how characteristic patterns of intermittency result from solutions to the *adjustment-deployment dilemma*: the dynamic interplay between the time spent adjusting a solution to the changing environment and the execution time taken by the deployment of the solution.

Despite its ubiquity in biological behavior, to our knowledge this is the first characterization, formalization, and modeling approach to the adjustment-deployment dilemma. We have formalized mathematically the structure of the dilemma and numerically computed its optimal solution for different configurations. The problem-structuring parameter was found to be the ratio between the rate of adjustment and the rate of fitness decay while deployment takes place. The optimal solution always results in a high intermittency between adjustment and deployment around a non-maximal fitness value. Furthermore, we have shown that this non-maximal fitness value is directly determined by the ratio between the exponential coefficient of the fitness increase during adjustment and the decay coefficient during deployment.

Our hypothesis is that at least part of the intermittent behavior displayed by living organisms is a response to this dilemma, whose general solution can be captured by the motto “When the environment changes, the best behavior is the one that maximizes the number of interactions with the world, the optimal fitness level being determined by the dynamic ratio between adjustment speed and environmental change.”

The distribution of optimal strategies over the range of parameter values takes a sigmoidal shape. It follows that most solutions will be distributed over the two extremes of the solution space: one where very fast adjustment is followed by long periods of deployment, and the other the opposite, where long periods of adjustment are followed by quick deployment. It turns out that the distribution of intermittency patterns found in animal behavior matches our model's optimal-solution distribution.

A simple model composed of a single neuron with negative feedback is able to display this optimal behavior, assuming the following inputs: an indicator of the success of its deployment, and the current adjustment and degradation rates. The implemented mechanism shows a high degree of robustness, being able to adapt to a wide range of possible configurations of the dilemma. These results suggest that optimal solutions to the adjustment-deployment dilemma could, in principle, be instantiated by very simple mechanisms, given the appropriate input, and should therefore be accessible even to unicellular systems.

Our model also brings forth the need to include the temporal dimension of agents and environment into current modeling frameworks. Adjustment speed, decay rates, deployment duration, patterns of intermittency, and so on crucially matter when it comes to real-world problem solving. Computational and representational approaches to cognition are prone to neglect such time-dependent phenomena and might often fail to account for natural behavior. Furthermore, they can fail to provide models that solve adaptive problems by means of temporally rich and structured agent-environment coordination patterns. The adjustment-deployment dilemma might constitute one such case where, given strong cognitivist assumptions, one would be tempted to build models that first compute a near-maximum-fitness solution and only then deliver an output command. We have shown, however, that the optimal solution to the adjustment-deployment dilemma exploits non-maximal solutions by means of fast intermittent behavior in a manner that, in addition, requires only very simple control mechanisms. It could be further conjectured that, under certain constraints, intermittency and, perhaps more generally, recurrent agent-environment suboptimal interactions provide robust and simple solutions to many adaptive problems. We have shown that the adjustment-deployment dilemma is one such case and that its solutions' distribution matches the patterns of intermittent behavior found in animals.

## Acknowledgments

Miguel Aguilera, Manuel G. Bedia, and Francisco Serón were supported in part by the project TIN2011-24660 funded by the Spanish Ministerio de Ciencia e Innovación. Miguel Aguilera currently holds a FPU predoctoral fellowship from the Spanish Ministerio de Educación.

During the development of this article Dr. Xabier E. Barandiaran held a postdoctoral position funded by FP7 project eSMCs IST-270212. He also acknowledges funding from the research project Autonomía y Niveles de Organización financed by the Spanish Government (FFI2011-25665) and IAS research group funding IT590-13 from the Basque Government (in which M.B. and M.A. are also collaborators).

## References

### Appendix 1: Formal Solution of the Adjustment-Deployment Dilemma

We have γ(*t*) ∈ {0, 1}, and we want to find the set {γ_{k}} that maximizes *p*(*T*), where each γ_{k} corresponds to a period of adjustment or deployment, with duration *h*_{A} or *h*_{D}, respectively. For simplicity, we will take *h*_{A} = *h*_{D} = *h*, but the result is the same if the adjustment and deployment periods are different (with *h*_{A} ≠ *h*_{D}, the discretization step would be the greatest common divisor of the two). Discretizing the system with step *h* gives

*f*_{k+1} = *f*_{k} + *h* [(1 − γ_{k}) (1 − *f*_{k})/τ − γ_{k} ε *f*_{k}],

*p*_{k+1} = *p*_{k} + *h* γ_{k} *f*_{k},

where *h* is a temporal step and *k* = 0, 1, 2, …, *N*, so that *f*(0) = *f*_{0} and *p*(*T*) = *p*_{N}, given the sampling instants *T* = {*t*_{1}, *t*_{2}, …, *t*_{N}}. For the sampled version, the problem can be reformulated as follows (knowing that *h* is constant): “Find the set of decisions {γ_{k}(*t*_{k})} that maximizes *p*_{N}.” That is, the {γ_{k}(*t*_{k})} values must be computed subject to the discretized dynamics above; since the trajectory starts at the initial state *a*_{0}, the maximal value of *p*_{N} attainable from it will be denoted *V*(*a*_{0}).

We can obtain the optimal sequence of decisions (γ_{0}, γ_{1}, …, γ_{N}) in a recursive way and backward:

*V*_{k}(*f*_{k}, *p*_{k}) = max_{γ_{k} ∈ {0, 1}} *V*_{k+1}(*f*_{k+1}, *p*_{k+1}),

where *V*_{N}(*f*_{N}, *p*_{N}) = *p*_{N}. Iterating, we obtain the sequence of optimal decisions: once γ_{N}(*t*_{N}) is fixed, the decision at the previous instant, γ_{N−1}(*t*_{N−1}), is computed by applying the same equation one step earlier. We know that

*p*_{N} = *p*_{N−1} + *h* γ_{N−1} *f*_{N−1}.

Therefore, given γ_{N−1} ∈ {0, 1}, we only have to compute which one of the two cases, adjustment (γ_{N−1} = 0) or deployment (γ_{N−1} = 1), yields the larger value, and repeat the comparison backward for each earlier decision.
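The backward recursion can be sketched numerically. The block below assumes the continuous dynamics ḟ = (1 − *f*)/τ during adjustment and ḟ = −ε*f* during deployment, with payoff accrued at rate γ*f* (our reading of Equations 1–3); since *p* enters the objective additively, the value function can be written as *p* + *W*_{k}(*f*), and only *W*_{k} (the best future payoff from fitness *f* at step *k*) needs to be tabulated, here on a grid with linear interpolation.

```python
import numpy as np

def optimal_policy(tau, eps, h=0.1, N=100, n_grid=200):
    """Backward induction for the discretized adjustment-deployment dilemma.

    Returns the optimal decision table policy[k, i] (0 = adjust, 1 = deploy)
    and the future-payoff table W over the fitness grid at k = 0.
    """
    f_grid = np.linspace(0.0, 1.0, n_grid)
    W = np.zeros(n_grid)                 # W_N = 0: no payoff beyond the horizon
    policy = np.zeros((N, n_grid), dtype=int)
    for k in range(N - 1, -1, -1):
        f_adj = f_grid + h * (1.0 - f_grid) / tau    # next fitness if gamma = 0
        f_dep = f_grid - h * eps * f_grid            # next fitness if gamma = 1
        W_adj = np.interp(f_adj, f_grid, W)               # no immediate payoff
        W_dep = h * f_grid + np.interp(f_dep, f_grid, W)  # collect h*f deploying
        policy[k] = (W_dep > W_adj).astype(int)
        W = np.maximum(W_adj, W_dep)
    return policy, W

policy, W = optimal_policy(tau=1.0, eps=1.0)
```

As expected, at the final step the sketch deploys whenever fitness is strictly positive (adjustment can no longer pay off), and earlier decisions trade adjustment gains against forgone payoff.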

### Appendix 2: Approximation of the Optimal Adjustment in the Adjustment-Deployment Dilemma

In order to compute the optimal fitness level *f*^{∗}(τ, ε), we can assume some conditions that will allow us to accurately compute an approximation of its value. If we imagine a system performing an optimal behavior for the adjustment-deployment dilemma (as represented in Figure 2), the assumptions that allow us to compute an approximation of the optimal fitness level are the following:

- The time spent in the transitions at the start and the end of the performance (when the system is not around *f*^{∗}(τ, ε)) is negligible compared to the time when the system is intermittently alternating adjustment and deployment.
- The minimum durations of adjustment and deployment (*h*_{A} and *h*_{D}) are small enough to consider that the changes in *f*(*t*) during cycles of either *h*_{A} or *h*_{D} are negligible.

Under these assumptions, *f*(*t*) is always very close to the optimal level *f*^{∗}(τ, ε). In this situation, it is much easier to compute how effective the system is for given parameters. Thus, we can simplify Equation 3 in the following way:

p̄ ≈ *f*^{∗}(τ, ε) · *r*_{dep},

where *r*_{dep} represents the relative time spent in deployment. Likewise, if we just consider the time when the system is alternating adjustment and deployment, we can approximate *r*_{dep} as the value that equilibrates the effects of adjustment and deployment, reaching a steady value of *f*(*t*):

(1 − *r*_{dep}) (1 − *f*(*t*))/τ = *r*_{dep} ε *f*(*t*),

which, assuming that we can approximate the value of *f*(*t*) by *f*^{∗}(τ, ε), gives us

*r*_{dep} = (1 − *f*^{∗}(τ, ε)) / [(1 − *f*^{∗}(τ, ε)) + ε τ *f*^{∗}(τ, ε)].

This is an approximate solution, valid for small values of *h*_{A} and *h*_{D} and for short transition times before and after the intermittent phase; even when these conditions are not strictly met, however, it remains a good approximation.
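As a numeric sanity check of the equilibrium argument, the sketch below alternates the two phases with a deployment fraction *r*_{dep} of each cycle and verifies that fitness hovers near the assumed level. The exponential adjustment/decay dynamics are our reading of Equations 1 and 2, and the cycle period is an arbitrary small value standing in for *h*_{A} + *h*_{D}.

```python
def r_dep_equilibrium(f_star, tau, eps):
    """Deployment fraction balancing adjustment gains against decay losses
    at fitness level f_star: (1 - r)(1 - f)/tau = r * eps * f."""
    return (1.0 - f_star) / ((1.0 - f_star) + eps * tau * f_star)

def simulate_duty_cycle(f0, tau, eps, r_dep, period=0.1, h=1e-3, T=50.0):
    """Deploy for a fraction r_dep of each cycle, adjust for the rest;
    return the final fitness value after T seconds of Euler integration."""
    f = f0
    for k in range(int(T / h)):
        phase = (k * h % period) / period
        fdot = -eps * f if phase < r_dep else (1.0 - f) / tau
        f += h * fdot
    return f

# With tau = eps = 1 and f* = 0.5, the balance gives r_dep = 0.5, and the
# simulated fitness should settle into a small cycle around f* = 0.5.
r = r_dep_equilibrium(0.5, tau=1.0, eps=1.0)
f_end = simulate_duty_cycle(0.5, 1.0, 1.0, r)
```

The residual oscillation around *f*^{∗} shrinks with the cycle period, matching the assumption that changes in *f*(*t*) within a single cycle are negligible.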

## Author notes

Contact author.

Department of Informatics, University of Zaragoza, 50018 Zaragoza, Spain. E-mail: miguel.academic@maguilera.net (M.A.)

IAS–Research Centre for Life, Mind and Society & Department of Philosophy & University School of Social Work, UPV/EHU, University of the Basque Country, Av. de Tolosa 54, 20018 Donostia, Gipuzkoa, Spain.