Abstract

One of the main challenges in automatic controller synthesis is to develop methods that can successfully be applied for complex tasks. The difficulty is increased even more in the case of settings with multiple interacting agents. We apply the artificial homeostatic hormone system (AHHS) approach, which is inspired by the signaling network of unicellular organisms, to control a system of several independently acting agents decentrally. The approach is designed for evaluation-minimal, artificial evolution in order to be applicable to complex modular robotics scenarios. The performance of AHHS controllers is compared with neuroevolution of augmenting topologies (NEAT) in the coupled inverted pendulums benchmark. AHHS controllers are found to be better for multimodular settings. We analyze the evolved controllers with regard to the usage of sensory inputs and the emerging oscillations, and we give a nonlinear dynamics interpretation. The generalization of evolved controllers to initial conditions far from the original conditions is investigated and found to be good. Similarly, the performance of controllers scales well even with module numbers different from the original domain the controller was evolved for. Two reference implementations of a similar controller approach are reported and shown to have shortcomings. We discuss the related work and conclude by summarizing the main contributions of our work.

1  Introduction

The survival of natural creatures and the success of artificial creatures depend strongly on their cognitive abilities. An embodied, mobile, autonomous agent has to process sensory inputs and control its actuators appropriately in order to survive or to fulfill its assigned task. The number of actions that such an agent could potentially perform is generally high. In addition, the feasibility of actions is subject to many constraints, either due to limitations of the agent's body or induced by the environment.

The organs in natural agents, which select appropriate actions based on perceptions, emerged through processes of natural selection. The variety of naturally evolved control systems is vast and goes much beyond the frequently stated example of the central nervous system of vertebrates. This is most evident in the case of unicellular organisms that often show nontrivial behavior without possessing a single nerve cell, for example, Paramecium [2, 6].

Generally, nature has evolved several principles of communication that partially act in parallel within organisms. In unicellular organisms (microorganisms, microbes), receptors can alter the production of chemical cell signals, which diffuse within the cell and integrate. These signals are not solely a way to transport information through space, because they are sometimes part of complex biochemical cascades. Some of these signals also allow the organism to switch between different modes of operation (e.g., motion principles or physiological states), as, for example, in Paramecium [6]. As cells are internally structured by compartmentalization, these processes can be interpreted as biochemical computation in space and time [4–6, 48]. Generally, this sort of communication can be interpreted as broadcast communication, because there is no direct link between sender and receiver, nor is there a dedicated addressed message. However, as intracellular chemical signals usually affect specific organelles or actuators and as specific compositions of chemical signals are often correlated with specific cell states, an emitter-channel-receiver model can still be applied to this sort of communication.

Within cells of multicellular organisms (metazoa), several pathways of communication exist, which differ concerning the processes they exploit. Between neighboring cells, transmembrane proteins act as important mechanisms to transport chemical molecules against the concentration gradient, which makes this sort of communication significantly different from basal diffusion. Among the most prominent examples are ion pumps.

Generally, cell-to-cell signaling can be divided into three categories [1]: communication between cells based on direct contact and strong dependence on local morphology (juxtacrine signaling), communication over short distances with medium dependence on local morphology (paracrine signaling), and communication over large distances and/or scales with weak dependence on local morphology (endocrine signaling). Gap junctions and notch signaling, belonging to the first category, bind communication pathways strongly to the local morphology around the sender's location. Paracrine hormones and neurotransmitters, still having a strong linkage between morphology (location of the communicating cells) and communication functionality, belong to the second category. In the third category, endocrine hormones allow communication over greater distances and show only a weak coupling between morphology and communication functionality.

Returning to the main focus of this article, we interpret juxtacrine signaling and paracrine signaling as belonging to one group of communication mechanisms, in that they display a clear morphological (topology-mediated) coupling between sender and receiver. Also, far-reaching neural communication can be classified as belonging to this group, in that neural communication follows a dedicated pathway within the nervous system. We argue that the concept of artificial neural networks (ANNs) is mainly inspired by this group of directed communication (unicast), as is also evidently indicated by the directed edges between neurons.

In addition, we interpret intracellular signal processing and endocrine communication as belonging to a second group of communication mechanisms, in that these communication principles exhibit a rather loose morphological coupling (except for receptor-mediated linkages) between sender and receiver. These processes are built on the diffusion of chemical signals. The design principles of the artificial homeostatic hormone system (AHHS) clearly reflect this group of natural broadcast communications (undirected communications).

Note that presenting these two groups as distinct paradigms is a simplification: they are two extremes of a continuum of biological communication paradigms. For example, neuroendocrine cells and transmembrane proteins are features that add aspects of one group to the mechanisms of the other group. Also, other means of communication or communication cues, such as mechanical forces, temperatures, or external light inputs, might be important in intercellular and intracellular communication.

Reaching out for new sources of inspiration in the context of engineering controllers of modular technical devices is worthwhile, although the high capabilities (especially) of the human brain are unquestioned. However, the artificial synthesis of controllers of that complexity is out of reach. Actually, the synthesis of any controller that sufficiently completes the information processes for an agent living in a dynamic, unpredictable environment is challenging, and the non-neuronal information processing of unicellular organisms definitely generates nontrivial behaviors [4].

1.1  Challenges of Controller Synthesis

The complexity of controller synthesis can be rated, for example, by investigating the field of evolutionary robotics [19, 44, 69]. The (semi)automatic synthesis of robot controllers by applying artificial evolution belongs to the software section of evolutionary robotics [12]. The curse of complexity poses the main challenge in this field: an increase in the difficulty of the desired behavior seems to result in a significant increase in the complexity of its evolution. This is partially documented by the absence of complex benchmark tasks in the literature [42]. It is still an open question how to find either a basic principle of the controller design that is appropriate and universal or one that is optimal for a given scenario [7, 8, 21, 22, 74]. The latter seems to be more likely to succeed, because the existence of a true general problem solver is very questionable (cf. the no-free-lunch theorem [76]).

Translating the concept of controller design and general problem solving to the context of evolutionary algorithms [29, 34, 50, 58] is associated with the problem of generating general or scenario-dependent smooth fitness landscapes. The general controller design defines the designable fraction of the search space and the fitness landscape (non-designable fractions are induced, for example, by the environment or the task itself). While the density of acceptable solutions in the search space should be kept high, the fitness landscape should generally be smooth with a minimum number of local optima. Experience shows that these two criteria conflict. In addition, the search space is usually high-dimensional and might have unfavorable structures. We summarize this set of challenges by the aim to “strive for high evolvability.”

One focus of our research track is to design fitness landscapes by applying appropriate controller designs. We test whether it is useful to maximize the causality of the mutation operator (i.e., small causes have small effects) by reducing the maximal effect on the organism's behavior [24]. However, whether high causality is really desirable is questionable (e.g., Chouard [11]).

The challenge of appropriate, efficient, and safe behavioral control is brought to a new level whenever groups of agents interact [78]. This also holds for organisms that are built from autonomous entities that collectively establish (possibly several different) topologies of their bodies. Examples of natural systems that belong to this category are the slime molds Dictyostelium discoideum and Dictyostelium mucoroides [9] and the alga Volvox aureus [30]. Volvox aureus lives in spherical aggregates consisting of hundreds or thousands of cells. The colony is able to perform collective phototaxis to stay in favorable light conditions. This is achieved by modulating the flagellar activity of cells depending on the local light intensity [26]. The responses to the light stimuli vary around the sphere, forming a gradient of flagellar activity that suggests intercellular interaction (be it chemical, electrical, or mechanical). This allows global coordination throughout the colony. Such organisms and behaviors are good sources of inspiration for paradigms of decentral control.

1.2  Inspiration by Signaling Networks in Unicellular Organisms

The utilization of intercellular interactions is of particular interest here. Organisms such as Volvox aureus have not evolved explicit communication processes between a sender and a receiver (unicast), but rather implement implicit communication by chemical gradients (broadcast) or mechanical cues. Based on these considerations and inspired by the signaling network in unicellular organisms, we have proposed controllers based on AHHSs [23, 24, 54–57, 65, 66]. The word “hormone” is meant as a generic term for any kind of cell signal. These systems can be viewed as reaction-diffusion systems that are embedded within the autonomous agents. Sensory stimuli are converted into hormone secretions (cell signals), which in turn control the actuators. In addition, hormones interact linearly and nonlinearly in a way comparable to the hidden layer of an artificial neural network (ANN). Such systems show homeostatic processes in that they typically converge to trivial steady states for constant sensory input. Sensory stimuli trigger hormone secretion; hormone concentrations are essentially integrated (a form of memory) and decay over time (a form of forgetting). However, during a limited period of time (transient) after a stimulus they also show varied behavior, especially if nonlinear hormone-to-hormone interactions are applied. This way, the bootstrapping problem of how to generate many sensory-motor configurations initially is overcome, because the controllers explore many such configurations even without input.

The concept of AHHS is related to gene regulatory networks. However, here each edge has its own activation threshold, and redundant edges with different activations are allowed. Hormones may diffuse within virtual internal spatial structures in the agent, but also from one agent to neighboring agents.

The desired main application of AHHS is multimodular robotics [51, 67]. In this field, autonomous robotic modules are studied that are able to physically connect to each other, and they can also establish a communication and energy connection. Hence, they form a super-robot (“organism”) that is able to reconfigure its body shape; see, for example [39, 53, 59, 61]. Therefore, the underlying idea of diffusion in our reaction-diffusion system is that hormones diffuse from robot module to robot module and establish low-level communication. Following our maxim of trying to reach a maximum of plasticity, we use identical controllers in each module independent of their position within the robot organism, so there is neither a controller nor module specialization. This concept implements the focus of evolutionary robotics on modularity (among others) in terms of hardware and software [44]. Although we evolve cooperative behaviors by evolving a kind of self-organized role selection, there is no coevolution, because there are identical genomes in every module.

In the following we give a detailed description of AHHSs, and we discuss several design decisions and how the computational complexity can be limited. The benchmark based on inverted pendulums, which is applied throughout the article, is described and discussed. In the second half of the article, we investigate many properties of AHHSs, such as the resulting control networks, how they compare with controllers evolved by applying NEAT [63], the nonlinear dynamics, the oscillations in such systems, the generalization of evolved controllers to initial conditions far from the original domain, and their scaling with respect to increased and decreased complexity of the scenario. Finally, we report two reference implementations, discuss their shortcomings, and discuss related work.

2  Artificial Homeostatic Hormone Systems

The original version of the AHHS was introduced by Schmickl and Crailsheim [54] and Schmickl et al. [57]. Several publications followed that report applications of AHHSs to the evolution of controllers for simulated single robots [66] and hand-coded controllers for real single robots [56, 65]. General perspectives on the AHHS approach were reported by Schmickl et al. [55]. The authors reported an improved version of the AHHS in [23], which will be used in this work (the old version is not used here). An application to modular robotics is reported in [24].

The basic concept of the AHHS is described in the following (for a less detailed description, see [23]). An AHHS is defined by a set of hormone definitions, which specify all properties of the hormones, and a set of rules, which define how sensory input and hormone concentrations are converted into hormone concentration changes and actuator control signals. Each hormone has a base production rate, that is, its concentration is allowed to increase independently, and a decay rate, which describes its independent decrease of concentration. In addition, rules change hormone concentrations based on sensory input (sensor subrules) and based on the concentrations of other hormones or the hormone itself (hormone-to-hormone subrules). Finally, the actuators are controlled by rules that transform hormone concentrations into actuator control values.

The diffusion of hormones can be based on any spatial/modular topology. For example, the topology could be an abstract compartment structure within the agent, in analogy to compartmentalizations in biological cells [68], or a loosely coupled aggregate of several agents. Hence, diffusion generates a means of communication, which is, however, rather coarse. It implements a communication based on broadcasts, which can also be called implicit communication in order to distinguish it from direct communication (unicast or explicit communication) of a preserved message between sender and receiver.

2.1  Mathematical Model of an AHHS

Now we give a detailed description of an AHHS. For each hormone its special characteristics are specified, such as the decay rate, base production rate, and diffusion coefficient (see Table 1 for a summary). A hormone H is defined by a 3-tuple
$$H = (\alpha, \mu, D) \tag{1}$$
for base production rate α, decay rate μ, and diffusion coefficient D. This tuple is also called the hormone gene below, because we apply genetic algorithms to evolve the AHHS.
Table 1. The genome of the AHHS controller.

| Gene | Description | Range |
|---|---|---|
| Hormone chromosome | | |
| Base production rate α | Amount that is produced without sensory stimulation | 0 ≤ α ≤ Hmax |
| Decay rate μ | Equation 3 | 0 ≤ μ ≤ 1 |
| Diffusion coefficient D | Equation 3 | 0 ≤ D ≤ 1 |
| Max. (min.) value of hormone concentration, Hmax (Hmin) | Value at which a saturation (bottom) is forced | |
| Rule chromosome | | |
| Subrule type weight w | Weights of the three subrules and the idle subrule | |
| Trigger window center ζ | Defines (along with trigger window width η) the triggering condition and intensity | Hmin ≤ ζ ≤ Hmax |
| Trigger window width η | Defines (along with trigger window center ζ) the triggering condition and intensity | Hmin ≤ η ≤ Hmax |
| Dependent dose λ | Equations 8, 47 | |
| Fixed dose κ | Equations 8, 47 | |
| Sensory input s | Weighted IDs of the sensors that influence the hormone through sensor subrules | (limited by number of sensors) |
| Actuator output a | Weighted IDs of the actuators that are influenced by the hormone through actuator subrules | (limited by number of actuators) |
| Hormone input h | Weighted IDs of the hormones that are influenced through sensor subrule or that influence another hormone through hormone subrules | (limited by number of hormones) |
| Hormone output k | Weighted IDs of the influenced hormones (through hormone subrules) | (limited by number of hormones) |

The basic concept of rules is that each rule contains four subrules, that is, each rule comprises a package of subrules: actuator subrule, sensor subrule, linear hormone-to-hormone subrule, and nonlinear hormone-to-hormone subrule. Nonlinear subrules were introduced in addition to linear interactions “to allow intrinsic dynamics of higher complexity” [23]. Each rule is listening for one selected sensor and influences one selected actuator. However, a direct influence from sensor to actuator is generally excluded, because the hormones that are influenced by the sensor do not influence the actuator themselves (see Figure 1). Each rule influences a pair of hormones through sensory input, it influences an actuator through the concentration of a pair of hormones, and it influences one pair of hormones through the concentrations of another (or possibly the same) pair of hormones. The pairs of hormones are defined by weighted (floating, i.e., continuous) indices, that is, real numbers. Floating indices were introduced to optimize the local search when synthesizing an AHHS with evolutionary methods as explained in Section 2.3.4. For example, a floating index of 0.5 addresses with equal weights of 50% hormone H0 and hormone H1. A floating index of 1.9 would address H2 with a weight of 90% and H1 with 10%. Hence, a rule consists of a package of subrules that can be viewed as a (sub)graph or network as sketched in Figure 1.
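The floating-index addressing described above can be illustrated with a short Python sketch. It is not taken from the reference implementation: the function name `resolve_floating_index` is hypothetical, and the sketch assumes that the fractional part of the index is the weight of the upper neighbor, which is consistent with the two examples given in the text.

```python
import math
from typing import List, Tuple

def resolve_floating_index(index: float) -> List[Tuple[int, float]]:
    """Map a floating hormone index to (hormone_id, weight) pairs.

    Assumption: the fractional part of the index is the weight of the
    upper neighbor; the remainder goes to the lower neighbor.
    """
    lower = math.floor(index)
    frac = index - lower
    if frac == 0.0:
        # An integer index addresses a single hormone with full weight.
        return [(lower, 1.0)]
    return [(lower, 1.0 - frac), (lower + 1, frac)]

# Index 0.5 addresses H0 and H1 with equal weights.
print(resolve_floating_index(0.5))
# Index 1.9 addresses H1 with about 10% and H2 with about 90%.
print(resolve_floating_index(1.9))
```

Under this reading, a small mutation of the index shifts the two weights by the same small amount instead of switching abruptly from one hormone to another.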

Figure 1. 

Sensor-to-hormone, hormone-to-hormone, and hormone-to-actuator interactions defined by a single rule. The two arrows between actuator-hormone pair and sensor-hormone pair indicate the twofold interaction through the linear and the nonlinear subrule.

A rule is defined by a 12-tuple (also called a rule gene below)
$$R = (w^{S}, w^{L}, w^{N}, w^{A}, \zeta, \eta, \lambda, \kappa, s, a, h, k) \tag{2}$$
for subrule weights w, trigger window center ζ, trigger window width η, dependent dose λ, fixed dose κ, sensor ID s, actuator ID a, input hormone ID h for the sensor-hormone pair, and output hormone ID k for the actuator-hormone pair. Each of the subrules has a weight determining its influence. There is also an implicit weight defining a virtual “idle subrule” that corresponds to a subrule without effect. All subrule weights, including the idle weight, sum to one. Hence, an increasing influence of one subrule always comes with a decreasing influence of another subrule. It is possible to deactivate a rule by setting the idle weight to one, and it is also possible to define a specialized rule by setting one of the other weights to one. Such a rule would be fully dedicated to one of the subrules (e.g., a sensor rule with the sensor subrule weight set to one). This concept of weights allows a step-by-step (in principle, continuous) transition of rules between the rule types, which is important for high evolvability. Note that the parameters of the subrules cannot be separated into individual subrule tuples, because the subrules share several parameters (detailed motivation and discussion in Section 2.3).
In the following we define the hormone dynamics of an AHHS. The dynamics of hormone concentrations is influenced by a constant production of hormone, diffusion of hormones, decay, and the summation of influences by other hormones (and itself) and sensors as defined by rules. Note that diffusion of hormone concentrations within a certain spatial structure of compartments (see Section 2.3.3) establishes the only means of communication in this system, which is discussed in Section 2.3.2. The change of hormone concentration Hhc of hormone h in compartment c at time t is described in dimensionless units by
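$$\frac{\Delta H_h^c(t)}{\Delta t} = \alpha_h + D_h \nabla^2 H_h^c(t) - \mu_h H_h^c(t) + \sum_i \mathcal{S}_i(t) + \sum_i \mathcal{L}_i(t) + \sum_i \mathcal{N}_i(t) \tag{3}$$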
for the parameters of hormone Hh as defined in the hormone gene (Equation 1): the production rate αh, the diffusion coefficient Dh, the decay rate μh, and the summed influences of all applicable sensor subrules, applicable linear hormone subrules, and applicable nonlinear hormone subrules (all of which are introduced in the following). The rationale for defining the influence of sensors and hormones by sums of several “sub-influences” defined by subrules is discussed in Section 2.3.1. The diffusion term is taken as continuous to obtain a concise description, although it is, of course, discretized in the implementation. Hormones have common minimal values (Hmin) and maximal values (Hmax) (i.e., ∀h, t : Hmin ≤ Hh(t) ≤ Hmax) so as to have well-defined intervals for the dynamics of hormones.
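To make the roles of production, diffusion, decay, and rule influences concrete, the following Python sketch performs one discretized update of a single hormone across several compartments. It is an illustration under stated assumptions, not the authors' implementation: the function name `hormone_step`, the time step of 1, the one-dimensional chain of compartments, the zero-flux boundaries, and the discrete Laplacian are all assumptions made here. The summed subrule influences are passed in as one precomputed value per compartment.

```python
from typing import List

def hormone_step(H: List[float], alpha: float, mu: float, D: float,
                 rule_influence: List[float],
                 h_min: float = 0.0, h_max: float = 1.0) -> List[float]:
    """One explicit update of a single hormone across a chain of compartments.

    Assumptions (not from the source): time step of 1, a one-dimensional
    chain of compartments, zero-flux boundaries, and a discrete Laplacian
    as the discretized diffusion term.
    """
    n = len(H)
    new_H = []
    for c in range(n):
        # Missing neighbors at the ends are mirrored (zero-flux boundary).
        left = H[c - 1] if c > 0 else H[c]
        right = H[c + 1] if c < n - 1 else H[c]
        laplacian = left - 2.0 * H[c] + right
        # Production + diffusion - decay + summed subrule influences.
        dH = alpha + D * laplacian - mu * H[c] + rule_influence[c]
        # Saturate to the common interval [h_min, h_max].
        new_H.append(min(h_max, max(h_min, H[c] + dH)))
    return new_H

# Three compartments; a constant influence acts only on the first one.
H = [0.0, 0.0, 0.0]
for _ in range(50):
    H = hormone_step(H, alpha=0.0, mu=0.1, D=0.2,
                     rule_influence=[0.05, 0.0, 0.0])
print([round(x, 3) for x in H])
```

With a constant influence on the first compartment only, the concentrations settle into a gradient that falls off with distance from the stimulated compartment, which is the kind of coarse, broadcast-like signal that diffusion provides between modules.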
We define the sensor subrule, which specifies the influence on the concentration Hhc of a hormone (i.e., each sensor subrule is applied to two output hormones h1 and h2, because we use weighted IDs of hormones and get hormone pairs; cf. Figure 1) in compartment c (cf. Equation 3):
formula
for the sensor subrule weight, sensory input Ss(t) from sensor s (subscript specified by rule gene; cf. Equation 2), a linear sensor scaling constant σs, dependent dose λi, and fixed dose κi (described in the following). We use a tent function θ as a trigger function that determines whether and with what intensity the subrule is executed. This was introduced in preference to a mere threshold because experiments indicated it gave better performance [24]. It is defined by
formula
for trigger window center ζi and trigger window width ηi. This is a linear weighting depending on the distance to the trigger window center and its width. Examples in Figure 2 show the interaction of Equations 4 and 5.
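The interplay of trigger function and doses can be sketched in Python as follows. This is a hedged illustration rather than a transcription of Equations 4 and 5: the names `tent` and `sensor_subrule` are hypothetical, the tent is assumed to equal 1 at the window center ζ and to fall linearly to 0 at distance η, and the subrule contribution is assumed to be the product of weight, trigger value, and the straight line given by dependent and fixed dose.

```python
def tent(x: float, zeta: float, eta: float) -> float:
    """Tent-shaped trigger: 1 at the window center, falling linearly to 0.

    Assumption: eta acts as the half-width of the window; the exact
    normalization used in the source is not reproduced here.
    """
    if eta <= 0.0:
        return 0.0
    return max(0.0, 1.0 - abs(x - zeta) / eta)

def sensor_subrule(s: float, w: float, zeta: float, eta: float,
                   lam: float, kappa: float, sigma: float = 1.0) -> float:
    """Contribution of one sensor subrule to a hormone concentration change.

    Assumed form: weight * trigger * (dependent dose * scaled input + fixed dose).
    """
    x = sigma * s  # linearly scaled sensor value
    return w * tent(x, zeta, eta) * (lam * x + kappa)

# With lam = 0 and kappa = 1 the contribution reduces to the tent itself.
for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(s, sensor_subrule(s, w=1.0, zeta=0.5, eta=0.5, lam=0.0, kappa=1.0))
```

Compared with a hard threshold, the tent scales the contribution continuously with the distance of the input from the window center, so small parameter changes lead to small changes in the response.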
Figure 2. 

Examples of parameter settings for ζ, η, λ, κ defining trigger function θ and contribution of a sensor subrule (σs = 1, ). Parts b, c, and d are the result of multiplying the trigger function θ as shown in part a with each of the straight lines a′, b′, and c′ in part a (cf. Equations 4 and 5).

We define the linear hormone subrule
formula
which is applied to an output hormone concentration Hhc (to each of the sensor hormone pairs as shown in Figure 1) in each compartment c (cf. Equation 3). The input is the hormone concentration Hk. Note that h = k is allowed; thus self-referencing of a hormone is possible. This is an important feature, because it introduces a feedback that can be leveraged to generate complex dynamics (e.g., see Section 5.3). The other parameters are as defined above.
The nonlinear hormone subrule is characterized by the product of the input and the output hormone concentrations, Hk(t)Hh(t) (this choice is remotely inspired by chemical reactions and their rate equations) and defined by
formula
which is applied to an output hormone Hhc in each compartment c.
We define the actuator subrule
formula
which is applied to an actuator control Aa (cf. Equation 9). The current actuator control value Aa of actuator a in time step t is defined by
formula
for the actuator subrules, maximum actuator value Aamax, minimum actuator value Aamin, and actuator scaling constant σa that linearly scales hormone values to the relevant actuator control value interval. Equation 9 implements the control of the actuator by the summed influence of all actuator subrules and the limitation to the relevant interval.
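The summation and limitation described for the actuator control value can be sketched in Python. The function name `actuator_value` is hypothetical, and the sketch assumes that the scaling constant is applied to the summed subrule contributions before the result is limited to the actuator interval; the source's exact formulation of Equation 9 may differ.

```python
from typing import Iterable

def actuator_value(subrule_outputs: Iterable[float], sigma_a: float,
                   a_min: float, a_max: float) -> float:
    """Sum actuator subrule contributions, scale, and clamp to [a_min, a_max].

    Assumption: the scaling constant multiplies the summed contributions
    before the clamp is applied.
    """
    raw = sigma_a * sum(subrule_outputs)
    return min(a_max, max(a_min, raw))

# Two subrules push the actuator beyond its upper limit; the value saturates.
print(actuator_value([0.5, 0.5], sigma_a=2.0, a_min=-1.0, a_max=1.0))
```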

2.2  Encoding the AHHS in a Genome

A specific data structure, which parametrizes the AHHS controllers, is introduced for use by the genetic algorithm as genome. The genome is a pair Γ = (Ch, Cr) consisting of two logical entities: the hormone chromosome Ch and rule chromosome Cr. There is one hormone gene GH for each of the N hormones in the hormone chromosome Ch = (G1H, G2H,…, GNH). There is one rule gene GR for each of the M rules in the rule chromosome Cr = (G1R, G2R,…, GMR). Table 1 gives a listing of all genes in these two types of chromosomes. The hormone genes contain the actual parameters in a 3-tuple (see Equation 1). The rule genes contain the actual parameters in a 12-tuple (see Equation 2). Hence, the genome can be implemented, for example, as a mere array of floating-point numbers. A short summary of the AHHS is shown in Figure 3.
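As an illustration of the flat-array encoding, the following Python sketch decodes an array of floating-point numbers into N hormone genes of three values and M rule genes of twelve values. The class and function names are hypothetical, and the field order inside each gene (in particular, four subrule weights first) is an assumption made for this sketch, not a layout confirmed by the source.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

HORMONE_LEN = 3   # values per hormone gene
RULE_LEN = 12     # values per rule gene

@dataclass
class HormoneGene:
    alpha: float  # base production rate
    mu: float     # decay rate
    D: float      # diffusion coefficient

@dataclass
class RuleGene:
    weights: Tuple[float, ...]  # four subrule weights (assumed order)
    zeta: float    # trigger window center
    eta: float     # trigger window width
    lam: float     # dependent dose
    kappa: float   # fixed dose
    s: float       # sensor ID (floating index)
    a: float       # actuator ID (floating index)
    h: float       # hormone ID for the sensor-hormone pair (floating index)
    k: float       # hormone ID for the actuator-hormone pair (floating index)

def decode_genome(genome: Sequence[float], n_hormones: int,
                  n_rules: int) -> Tuple[List[HormoneGene], List[RuleGene]]:
    """Split a flat genome into hormone genes followed by rule genes."""
    expected = n_hormones * HORMONE_LEN + n_rules * RULE_LEN
    if len(genome) != expected:
        raise ValueError(f"expected {expected} values, got {len(genome)}")
    hormones = [HormoneGene(*genome[i * HORMONE_LEN:(i + 1) * HORMONE_LEN])
                for i in range(n_hormones)]
    offset = n_hormones * HORMONE_LEN
    rules = []
    for j in range(n_rules):
        g = genome[offset + j * RULE_LEN:offset + (j + 1) * RULE_LEN]
        rules.append(RuleGene(tuple(g[:4]), *g[4:]))
    return hormones, rules

# A toy genome with two hormones and one rule.
genome = [float(v) for v in range(2 * HORMONE_LEN + 1 * RULE_LEN)]
hormones, rules = decode_genome(genome, n_hormones=2, n_rules=1)
print(len(hormones), len(rules))
```

Keeping the genome as a flat array of floats means that standard real-valued mutation operators can be applied to every gene without special cases.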

Figure 3. 

Short summary of AHHS.

2.3  Discussion of Several Design Decisions

The proposed design of AHHS controllers includes several fundamental decisions that are discussed in the following.

2.3.1  Indirect Shaping of Sensor-to-Hormone, Hormone-to-Hormone, and Hormone-to-Actuator Mappings

One of the main features of the AHHS approach is the definition of behaviors through indirect shaping of mappings fs of sensor values to hormone concentration changes (sensor-to-hormone mappings; see Equation 3 and Figure 4):
formula
formula
based on the summation of several sensor-to-hormone functions (see Equation 4 and Figure 2), and similarly for mappings of hormone concentrations to hormone concentrations (hormone-to-hormone mappings) and hormone concentrations to actuator values (hormone-to-actuator mappings). Only the summands can be directly changed, for example, by methods of evolutionary computation. This seems to be a performance-sensitive feature of AHHS, as shown below in Section 7 by the comparison with reference implementations.
Figure 4. 

Summations of the influences by hormone-to-hormone subrules define overall hormone-to-hormone mappings (similar for sensor-to-hormone and hormone-to-actuator subrules).

Defining the controllers via summations of several functions (cf. Equation 3 and Figure 4) rather than using one function description alone has proved to be advantageous, as shown below by empirical evidence. The number of functions (i.e., the number of rules) that contribute, for example, to a hormone-to-hormone mapping varies with how many rules apply to the considered hormone concentration pair Hh and Hk (similarly for sensor-to-hormone and hormone-to-actuator subrules). Figure 4a shows such a situation where seven rules apply. The resulting hormone-to-hormone mapping defines the hormone concentration change ΔHout based on the hormone concentration Hin of an input hormone. Note that a small change of a single function also has only a small effect on the summation. In particular, qualitative, global changes of the summation are unlikely. Whether this is really desirable is questionable, as mentioned above [11].

If we investigate the summation of linear and nonlinear hormone-to-hormone subrules, we get two-dimensional functions for the case of h ≠ k (i.e., the input and output hormone indices are different), as shown in Figure 4b.

The summed influences of subrules can be interpreted as those entities that define the resulting behaviors and on which the artificial evolution acts. This is, for example, shown in Figure 4c. In a process over many generations, the general quality of the hormone-to-hormone mapping is preserved, but the quantities are changed (in this case almost linearly).

A comparison to ANN is self-evident because AHHS controllers can be interpreted as hormone reaction networks (cf. Figure 1 and Figure 9). The equivalent to the above-mentioned summed X-to-hormone mappings in ANN is not well defined, but could be a (rather exceptional) network function of the form
formula
where wij are weight functions depending on the outputs oi of connected neurons, instead of mere weight constants. These weight functions would be subject to evolution. The authors are not aware of any approach pursuing this method with methods of evolutionary computation. Still, this approach is representable with standard ANNs by implementing the functions wij through corresponding subtrees (or subgraphs) of neurons with regular weights. It is, however, questionable how such structures should emerge in artificial evolution—a process that is in principle blind to topological qualities of the network. Hence, one could argue that the AHHS approach predefines useful clusters of features that would correspond to independent subgraphs in an ANN.

Another difference between an AHHS and an ANN is the analogy to a conservation of mass in the AHHS. Hormone concentrations are produced, are preserved over time, and vanish only by a regular decay process. Hence, the emergence of processes based on memorized sensory inputs is likely. In contrast, the activations of neurons in an ANN are not explicitly preserved; memory of this kind can only be implemented by establishing feedback loops.

2.3.2  Fuzzification and Low-Level Communication

The internal state of an AHHS controller is determined by a few continuous values (one for each hormone). The agents' behaviors are determined by these values in combination with the hormone-to-actuator mappings. For small deviations, these mappings typically give similar results (see Figure 9) and introduce a kind of fuzzification into the system. Hence, diffusion of hormone concentrations implements a low-level communication (implicit communication), because hormone values are propagated through the compartment/module system, and similar hormone values have similar “meanings” within the controller. This might prove to be a powerful process because, on the one hand, communication is important in decentral control and, on the other hand, the alternative of evolving sender-receiver pairs would be a difficult problem [16, 17, 28].

2.3.3  Spatial Structure Due to Compartments

The idea of either compartmentalizing a single agent into abstract compartments or just using the natural compartmentalization (e.g., in modular robotics) is to allow a reflection of the agents' embodiment in the controller. This way sensors, actuators, and possibly even subrules (e.g., sensor subrules via their coupling with certain sensors) are associated with certain compartments and allow a body-based (embodied) modularization of the controller. In addition, the evolution of complex compartmentalizations consisting of dozens of compartments might enable behaviors that are otherwise difficult to evolve. This is, however, part of our future work agenda.

2.3.4  Floating Indices and Maximal Causality of Mutations

The idea of using floating indices to address hormones and the idea of using weights to specify the influence of subrules are introduced to maximize the causality of the mutation operator by reducing the impact of a single mutation on the agent's resulting behavior. Using discrete indices would correspond to implementing mutations as switches. Changing a rule from one hormone to another or changing from one subrule to another typically corresponds, however, to a radical change in the behavior. Radical changes are typically fatal in the evolutionary process. Having many fatal mutations complicates the search, and the genetic algorithm might be caught in local maxima more often. On the other hand, radical changes might be a relevant process of natural evolution [11]. Hence, a good choice should depend on the fitness landscape of the investigated domain. However, research on rugged fitness landscapes [45] that are more complex than Kauffman's NK model [33] is still pending, mainly due to the computational complexity of such empirical studies.

2.3.5  Dependent Parameters in Rules

The idea of packaging several subrule types in one rule is a tradeoff. Several parameters of a rule, such as the dependent dose λ or the fixed dose κ, are used for several different subrule types, that is, these values are correlated and cannot be optimized independently by the applied method of evolutionary computation unless the corresponding weights are decreased close to zero. The independent parameters (e.g., sensor ID or actuator ID), in turn, can become silent genes if the corresponding subrule weights are zero. An alternative would be a considerable increase of the genome size (e.g., a full set of parameters for each subrule). However, empirical evidence shows a tendency to local optima if the controller population is started with specialized rules having one weight with w = 1, which would be the equivalent to the full-set-of-parameters approach in the current implementation (data not shown). In turn, the initial uniform distribution of subrule weights implies an initial exploration phase, as there are many subrules that influence the behavior. Later an exploitation phase is characterized by diminishing the weights of counterproductive subrules. A more intensive investigation of these interconnections is, however, pending.

2.4  Reduction of Computational Costs

The computation of the hormone dynamics (Equation 3) is costly. In particular, the computational complexity of the summation of the influences of all rules,
formula
scales linearly with the number of rules M (in Landau notation, O(M)) and needs to be done in each update step of the controller. This can be avoided by approximating the summed rule influences (the mappings from sensory input to hormone, from hormone to hormone, and from hormone to actuator; cf. Section 4.3) for each hormone and, where applicable, each compartment with a stepwise linear function that is precalculated and stored in a lookup table. That way only the hormone production, decay, and diffusion need to be calculated at each time step, while the influence through hormones and sensors is determined by simply accessing this lookup table (i.e., in O(1)).

Note the shift of paradigm in that now there is a considerable difference between the genotype of the controller, which forms the basis on which it is evolved, and its phenotype, which is the actual representation in the agent at run time. A drawback is the increased memory cost, which scales exponentially with the number of hormones N due to the combinatorial possibilities of the hormone-to-hormone subrules. As we are typically using small numbers of hormones (N ≤ 3), memory is not an issue, and the computational complexity reduces to about that of the ANN or even below for the case of using only N = 1 hormone.
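The lookup-table idea can be illustrated with a minimal sketch. It assumes hormone concentrations on [0, 1], a uniform grid, and a toy rule shape (triangular windows); the names `summed_influence`, `build_table`, and `lookup` are hypothetical and do not reproduce the rule definitions of Equation 3.

```python
def summed_influence(h, rules):
    """Toy stand-in for the O(M) sum over all rules.

    Each rule is (weight, center, width) and contributes a triangular
    window; this shape is an assumption made for illustration only.
    """
    return sum(w * max(0.0, 1.0 - abs(h - c) / width) for w, c, width in rules)


def build_table(func, n=100):
    """Precompute func on a uniform grid of n + 1 points over [0, 1]."""
    return [func(i / n) for i in range(n + 1)]


def lookup(table, h):
    """Stepwise linear approximation; cost is independent of the rule count."""
    n = len(table) - 1
    h = min(max(h, 0.0), 1.0)      # clip to the tabulated range
    pos = h * n
    i = min(int(pos), n - 1)       # index of the left grid point
    frac = pos - i
    return table[i] + frac * (table[i + 1] - table[i])


rules = [(0.5, 0.3, 0.2), (-0.2, 0.7, 0.1)]
table = build_table(lambda h: summed_influence(h, rules))
approx = lookup(table, 0.33)       # replaces the per-step sum over all rules
```

The table is built once per genome (the phenotype), so the cost of the sum over M rules is paid at construction time rather than at every controller update.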

The speedup due to using lookup tables instead of computing Equation 3 in each time step is a factor of about 50; the speedup due to using the AHHS with lookup tables, compared to NEAT [63], is a factor of about 1.3 (data not shown) in the benchmark reported in the following.

3  Domain: Coupled Inverted Pendulums

The primary domain for AHHS controllers is modular robotics. However, in order to permit high numbers of evolutionary runs, we had to restrict our studies to a domain of much lower computational cost. As there is as yet no standard, abstract benchmark for modular robotics that would incorporate its typical requirements, we have defined the domain of coupled inverted pendulums [25].1

3.1  Description of the Coupled-Inverted-Pendulums Benchmark

Research on synthesizing controllers for a single inverted pendulum (broom balancing) dates back at least to Widrow and Smith [73]. Applying evolutionary algorithms to this problem dates back at least to Koza and Keane [35]. Over the past 20 years the problem has been successfully solved for even more complex scenarios, such as the double pendulum or the triple pendulum (this is, however, out of the scope of this article).

We apply several changes to the standard inverted-pendulum scenario to increase its complexity and to increase the similarity of its challenges to those of modular robotics scenarios. The pendulums are started in lower positions, that is, we include the nonlinear upswinging phase. We also restrict the cart track length, resulting in a scenario similar, for example, to that reported by Chatterjee et al. [10]. Because of that and the limited acceleration of the cart motor, the upswinging cannot be managed by just moving back and forth once.

In addition, we limited the sampling rates of all sensors. The sampling rates are low, which is documented by the relation between the preset cycle length τ of the controller and the maximal angular velocity of 0.05π[1/τ] = 9°[1/τ]. The pendulum can move up to 9° between two calls of the controller. The controller has little time to adapt to new configurations.

In order to adapt the sensor setting to those that are more typical in robotic scenarios, the sensors do not deliver actual angles and positions directly. These values are partitioned onto several sensors, and they are also relative rather than absolute (e.g., distance to wall instead of cart's position); see Table 2 for details. All sensor and actuator values are integers in the interval [0, 127]. The controllers have two outputs (left actuator A0 and right actuator A1), and the acceleration control of the cart is determined by their difference (see Table 2).
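The partitioning of one physical quantity onto several integer-valued sensors, as listed in Table 2 for the pendulum angle and the actuators, can be sketched as follows. The rounding mode and the handling of values that lie exactly on a shared interval boundary are assumptions; the function names are hypothetical.

```python
import math


def scale(value, lo, hi, out_lo, out_hi):
    """Map value in [lo, hi] linearly to an integer between out_lo and out_hi.

    Returns 0 outside the interval ("0 else" in Table 2). Rounding to the
    nearest integer is an assumption.
    """
    if not lo <= value <= hi:
        return 0
    frac = (value - lo) / (hi - lo)
    return round(out_lo + frac * (out_hi - out_lo))


def angle_sensors(phi):
    """Pendulum angle sensors S0 to S3 for an angle phi in radians."""
    phi = phi % (2 * math.pi)
    return {
        "S0": scale(phi, 0.0, 0.5 * math.pi, 127, 0),
        "S1": scale(phi, math.pi, 1.5 * math.pi, 0, 127),
        "S2": scale(phi, 0.5 * math.pi, math.pi, 127, 0),
        "S3": scale(phi, 1.5 * math.pi, 2 * math.pi, 0, 127),
    }


def acceleration_control(a0, a1):
    """Difference of the two actuator values, mapped to [-1, 1]."""
    return a0 / 127 - a1 / 127
```

For example, `angle_sensors(0.0)` activates only S0 at its maximum, and `acceleration_control(127, 0)` requests full acceleration in one direction.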

Table 2

Sensor and actuator settings.

| Sensor or actuator ID | Compartment (AHHS) | Sensor or actuator name | Mapping of system states to sensor or actuator values |
| --- | --- | --- | --- |
| S0 | Left | Pendulum angle sensor 1 | ϕ ∈ [0, 0.5π] → [127, 0], 0 else |
| S1 | Right | Pendulum angle sensor 2 | ϕ ∈ [π, 1.5π] → [0, 127], 0 else |
| S2 | Left | Pendulum angle sensor 3 | ϕ ∈ [0.5π, π] → [127, 0], 0 else |
| S3 | Right | Pendulum angle sensor 4 | ϕ ∈ [1.5π, 2π] → [0, 127], 0 else |
| S4 | Left | Proximity sensor 1 | Dist. to obstacle left (max. 1) → [0, 127] |
| S5 | Right | Proximity sensor 2 | Dist. to obstacle right (max. 1) → [0, 127] |
| S6 | Left | Cart velocity sensor 1 | v ∈ [−2, 0] → [127, 0], 0 else |
| S7 | Right | Cart velocity sensor 2 | v ∈ [0, 2] → [0, 127], 0 else |
| S8 | Left | Pend. angular vel. sensor 1 | ω ∈ [−5π, 0] → [0, 127], 0 else |
| S9 | Right | Pend. angular vel. sensor 2 | ω ∈ [0, 5π] → [0, 127], 0 else |
| A0 | Left | Actuator left | A0 ∈ [0, 127] |
| A1 | Right | Actuator right | A1 ∈ [0, 127] |

Both actuators together: A0/127 − A1/127 → [−1, 1].

The most important difference from the standard inverted pendulum is that we couple several carts (or modules) using chains (see Figure 5). Carts can move independently as long as they do not pull a chain or run into each other. Hence, each cart has to avoid other carts and walls (cart track ends) and has to balance its pendulum at the same time. Note the difference of this domain from others that mount several pendulums on the same cart; for example, see Xin and Kaneda [77]. This would correspond to a chain length of 0. However, here we are able to define degrees of coupling continuously [25].

Figure 5. 

Coupled-inverted-pendulum benchmark with two carts. Pendulums, free to move a full 360°, are mounted on the carts, which move in one dimension (left-right) bounded by walls (track ends) and other carts. Marked angle is pendulum angle ϕ.

In the following experiments we increase the number of modules (i.e., carts) without changing the track length. Hence, with an increasing number of modules, the module density is also increased, which increases the difficulty even more. The modules are controlled locally without global information, ID, or positional information, and all modules are controlled by identical controllers.

We use an aggregate fitness function [42], which is basically the percentage of time steps that all pendulums spent in the upper equilibrium position (ϕ = 0). Deviations from ϕ = 0 are linearly scaled, that is, ϕ = 0.5π, for example, is evaluated as “50% in upper position.” A fitness of 1 means all pendulums spent all the time in the upper position, a fitness of 0.5 means the pendulums spent half the time in the upper position, and a fitness of 0 means all pendulums spent all the time in the lower equilibrium. If any constraint is violated (e.g., cart runs into other cart, cart runs into chain, cart runs into wall, pendulum velocity too high), the evaluation run is aborted and the fitness is reduced proportionally to the elapsed time.
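A minimal sketch of such an aggregate fitness is given below. It assumes that time steps after an aborted run simply contribute zero, which is one possible reading of "reduced proportionally to the elapsed time"; the function names and the trajectory format are hypothetical.

```python
import math


def upright_score(phi):
    """1 at phi = 0 (upper equilibrium), 0 at phi = pi, linear in between."""
    # Angular distance to phi = 0, wrapped into [0, pi].
    d = abs((phi + math.pi) % (2 * math.pi) - math.pi)
    return 1.0 - d / math.pi


def aggregate_fitness(trajectory, total_steps):
    """Mean upright score over all pendulums and all scheduled time steps.

    trajectory holds one list of pendulum angles per simulated time step.
    If the run was aborted, it is shorter than total_steps and the missing
    steps count as zero (an assumption).
    """
    total = 0.0
    for angles in trajectory:
        total += sum(upright_score(p) for p in angles) / len(angles)
    return total / total_steps
```

With this reading, two pendulums that stay perfectly upright but whose run is aborted after half of the scheduled time steps receive a fitness of 0.5.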

The cart-pole dynamics are integrated numerically with a third-order Runge-Kutta method [49] using a discrete time step of size Δt = 0.01.
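For illustration, one step of a third-order Runge-Kutta scheme can be sketched as follows. The sketch uses Kutta's classical third-order coefficients, which may differ from the variant in [49], and a frictionless single pendulum as a toy right-hand side rather than the coupled cart-pole equations of the benchmark.

```python
import math


def rk3_step(f, t, y, h):
    """One step of Kutta's third-order method for y' = f(t, y); y is a list."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * a for yi, a in zip(y, k1)])
    k3 = f(t + h, [yi - h * a + 2 * h * b for yi, a, b in zip(y, k1, k2)])
    return [yi + h / 6 * (a + 4 * b + c) for yi, a, b, c in zip(y, k1, k2, k3)]


def pendulum(t, state, g=9.81, length=1.0):
    """Toy pendulum with phi = 0 as the upper equilibrium (assumed model)."""
    phi, omega = state
    return [omega, (g / length) * math.sin(phi)]


# Integrate for 100 steps with the time step used in the benchmark.
state, t, dt = [0.9 * math.pi, 0.0], 0.0, 0.01
for _ in range(100):
    state = rk3_step(pendulum, t, state, dt)
    t += dt
```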

3.2  Discussion of Properties of the Coupled-Pendulums Benchmark

A detailed discussion of this benchmark is reported in [25]. Search algorithms operating on fitness landscapes related to this benchmark seem to be prone to local optima. Early in an evolutionary run, fast motion of the carts earns good fitness improvements. Subsequently, further improvements in the fitness can be reached by spinning the pendulums quickly. In a third phase of fitness increase the controller might manage to slow down the pendulums' speed when approaching ϕ = 0, but the pendulums still spin. Finally, in the absence of noise, an evolved controller could generate deterministic cart trajectories that end up with all pendulums at ϕ = 0. These solutions can be seen as local optima before a fully reactive controller is evolved that can actually control the pendulums in noisy conditions.

Note also that conflicting interpretations of sensory inputs exist. A close-by neighboring cart might be a safe condition if it moved in the same direction. In contrast, a close-by wall might be a dangerous condition.

4  Results

In the following, we report several results of applying the AHHS approach to the coupled inverted pendulums benchmark. We compare the evolution of AHHS controllers with that in NEAT [62], we investigate how the performance of the AHHS can be improved, we investigate several internal processes of AHHS controllers, and we analyze the scaling behavior of AHHS controllers to a higher number of carts.

The initial settings as shown in Table 3 (cart and pendulum positions, etc.) are maintained except for the explicit investigation of varied initial settings. In the case of AHHS we use a mutation rate of 0.4 per hormone and per rule, and values are changed by at most ±60% of the maximal value (randomly uniform). We use a mere linearly proportional selection with elitism of three (the three best individuals are guaranteed to stay in the population).
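The mutation and selection scheme can be sketched as follows. The genome layout (a list of rules, each a list of parameters), the choice to change one randomly picked parameter per mutated rule, and the clipping to [0, maximal value] are assumptions made for illustration; recombination (rate 0.01 in Table 3) is omitted.

```python
import random


def mutate(genome, max_values, rate=0.4, max_change=0.6, rng=random):
    """Mutate each rule with probability `rate`.

    A mutated rule gets one parameter shifted by a uniformly random amount
    of at most +/- max_change times that parameter's maximal value.
    """
    child = [list(rule) for rule in genome]
    for rule in child:
        if rng.random() < rate:
            i = rng.randrange(len(rule))
            delta = rng.uniform(-max_change, max_change) * max_values[i]
            rule[i] = min(max(rule[i] + delta, 0.0), max_values[i])
    return child


def next_generation(population, fitnesses, max_values, elite=3, rng=random):
    """Fitness-proportional selection with elitism."""
    ranked = sorted(zip(fitnesses, population), key=lambda fp: fp[0], reverse=True)
    new_pop = [genome for _, genome in ranked[:elite]]   # elites kept unchanged
    weights = [f + 1e-9 for f in fitnesses]              # avoid an all-zero wheel
    while len(new_pop) < len(population):
        parent = rng.choices(population, weights=weights, k=1)[0]
        new_pop.append(mutate(parent, max_values, rng=rng))
    return new_pop
```

The small constant added to the selection weights only guards against a population in which every fitness is zero; it is not part of the described method.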

Table 3. 

Settings of the evolutionary experiments.

| Parameter | Value |
| --- | --- |
| Min. dist. between carts, d_min | 0.05 |
| Max. dist. between carts, d_max | 0.35 |
| Track length w | 2 (on the interval [−1, 1]) |
| Initial pos. x_i(0) of cart i | x(0) = (−0.4, −0.2, 0, 0.2, 0.4) |
| Initial pos. ϕ_i(0) of pendulum i | ϕ(0) = (0.8π, 0.9π, π, 1.1π, 1.2π) |
| Initial cart velocities v_i(0) | v(0) = (0, 0, 0, 0, 0) |
| Initial pendulum angular vel. ω_i(0) | ω(0) = (0, 0, 0, 0, 0) |
| Population size | 100 |
| Number of generations | 200 |
| Mutation rate | 0.4 |
| Recombination rate | 0.01 |
| Selection | Linearly proportional |
| Elitism | Three best individuals |

All of this research is done with a focus on modular robotics applications [24, 67, 51]. The main application is the offline evolution of controllers for modular robots based on simulations. In these simulations, the application of a physics engine (simulation of friction, inertia, etc.) is important because the evolved behaviors often rely, for example, on friction [75]. A drawback is the very high computational cost, which considerably reduces the number of feasible evaluations. Therefore, we have to limit both the population size and the number of generations. The experiments in this article are based on the coupled-inverted-pendulums benchmark, which has low computational costs, and we would have the resources to increase the evaluation number. However, we want to investigate the performance of the AHHS approach with few evaluations in order to transfer the results to the application of offline evolution of controllers for modular robotics. Hence, the population size is limited to 100 genomes and the number of generations is 200 (see Table 3), which results in 2 × 10^4 evaluations.

4.1  Comparison of AHHS with NEAT

We compare the AHHS approach with NEAT [63]. Our implementation of NEAT is based on rtNEAT C++ v1.0.1 by Kenneth Stanley.2 The optimization of the parameters used by NEAT is complex. Several settings were tested. The best results were achieved using the parameter settings reported by Whiteson and Stone [72] for the mountain car task. The initial network topology was chosen analogous to the sensor and actuator setting of the AHHS except for two extra sensors and two extra actuators that implement two communication channels (one to a possible left neighbor and one to a possible right neighbor). The communication channel is implemented by coupling the output nodes with input nodes of neighboring modules and vice versa. This is necessary to allow similar synchronization possibilities as in case of the AHHS approach through diffusion of hormones. However, the experiments did not show significant differences in the performance of NEAT with or without such communication channels (data not shown).

The AHHS controllers are set to N = 1 hormone and M = 30 rules, which proved to be a good setting for this domain. For N > 1 we observed worse performance (data not shown), and an analysis of changing numbers of rules (M) is reported in Section 5.1. The modules have a simple, virtual compartmentalization. They have a left and a right compartment, and the sole hormone H0 occurs in two concentrations, H00 (left) and H01 (right), accordingly. The hormone diffuses between the two compartments of the module as well as between neighboring compartments of neighboring modules with a fixed diffusion coefficient D0 (cf. Equation 3; see [24] for a more detailed discussion of module-to-module diffusion, e.g., Figure 11b in [24]). Sensors and actuators are also associated with one of the two compartments (see Table 2). The population is initialized with random controllers (uniformly distributed subrule weights, fixed and dependent doses, etc.).

The comparison was done for one through five modules with n = 30 evolutionary runs each. The results are shown in Figure 6. For the case of one module, NEAT outperforms the AHHS significantly. However, for all multi-module settings the AHHS significantly outperforms NEAT. From these results we conclude that AHHS is at least competitive with other state-of-the-art approaches in the case of decentrally controlled multi-module systems.

Figure 6. 

Box-and-whisker plots of the fitnesses of the best evolved controllers for the performance comparison between AHHS and NEAT with several numbers of modules and n = 30 runs each. Asterisks show significance of p < 0.05 using the Wilcoxon rank-sum test.
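A Wilcoxon rank-sum comparison of two sets of best fitnesses can be sketched in a few lines. The sketch uses the normal approximation without tie correction of the variance, which is a simplification; the sample values in the usage example are synthetic and are not the data behind Figure 6.

```python
import math


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p). Ties receive average ranks; the variance is not
    tie-corrected, which is acceptable only for few ties.
    """
    pooled = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1      # average rank of the tie group
        i = j + 1
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))    # two-sided p-value
    return z, p


# Synthetic example with n = 30 runs per method.
method_a = [0.20 + i / 100 for i in range(30)]
method_b = [0.35 + i / 100 for i in range(30)]
z, p = rank_sum_test(method_a, method_b)
significant = p < 0.05
```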

4.2  Applying Noise During Evolution for Better Performance

In the following experiment we applied noise to the hormone during the evolution. Uniformly distributed random noise on the interval [−0.05, 0.05] was added to the hormone concentrations during generations [0, 30], [60, 100], and [140, 160]. During these generations the fitness was based on the minimum fitness out of three evaluations. The idea of introducing phases of noisy hormone concentrations is to perform an additional exploration in the search space and to find more robust solutions. The phases without noise are important to find the best controllers for the deterministic domain, which is our benchmark here. The results are shown in Figure 7; the performance achieved by applying noise is compared with the AHHS performance shown in Figure 6. Significantly better performance was achieved for five modules, while significantly worse performance was achieved for one module. For two, three, and four modules only a trend seems to indicate better performance on applying noise. Hence, it is questionable whether the higher computational cost due to the larger number of evaluations is justified.
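The noise schedule and the worst-of-three evaluation can be sketched as follows. The callback `run_once`, the inclusive reading of the generation intervals, and the clipping of noisy concentrations to [0, 1] are assumptions made for illustration.

```python
import random

NOISE_PHASES = [(0, 30), (60, 100), (140, 160)]   # generations with noise


def noise_active(generation):
    """True if the generation lies in one of the noisy phases (inclusive)."""
    return any(lo <= generation <= hi for lo, hi in NOISE_PHASES)


def noisy(h, rng, amplitude=0.05):
    """Add uniform noise from [-amplitude, amplitude]; clipping is assumed."""
    return min(max(h + rng.uniform(-amplitude, amplitude), 0.0), 1.0)


def evaluate_genome(genome, generation, run_once, rng, repeats=3):
    """Fitness of a genome in a given generation.

    run_once(genome, perturb) is a hypothetical simulator callback that
    applies perturb to every hormone concentration and returns a fitness.
    """
    if not noise_active(generation):
        return run_once(genome, lambda h: h)
    return min(run_once(genome, lambda h: noisy(h, rng)) for _ in range(repeats))
```

Taking the minimum of three noisy evaluations triples the number of evaluations in the noisy phases, which is the additional cost discussed above.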

Figure 7. 

Box-and-whisker plots of the fitnesses of the best evolved controllers for the method that applies noise to the hormone, compared with the performance of the standard AHHS method as shown in Figure 6.

4.3  Analysis of an Evolved Controller

In the following we investigate the best evolved controller that was achieved using the noise method described above for the two-module scenario. The behavior of the evolved controller in a regular evaluation run is shown in Figure 8. For this initial state it achieves a fitness of about 0.85, which is equivalent to keeping both pendulums upwards balanced for 3,400 time steps (the actual value is smaller because during the upswinging phase pendulum angles ϕ ≠ 0 occur). Both carts show only small movements and move in synchrony, which is obviously a good behavior to avoid collisions. The carts keep moving towards x = −1, which indicates that the pendulums probably will not be balanced for much longer than the evaluation period. However, keeping the carts around one position was not part of the fitness function. Note the complex dynamics of the hormone, which is an oscillation over three time steps of the form 0, 1, x, 0, 1, x,…. Hormone concentrations of 0 and 1 correspond to actuator control values of 0. Hence, the actuator actually is active only every third time step.

Figure 8. 

System states for a run of an evolved controller for two modules. Note the complex dynamics of the hormone, which is an oscillation of period 3 time steps; cf. Figure 9.

One great benefit of the AHHS approach is that a complete qualitative representation of the evolved controllers is possible as shown in Figure 9. This representation is based on lookup tables that map sensory input to hormone concentrations, hormone concentrations to hormone concentrations, and hormone concentrations to actuator control values, and can be generated automatically. In this graphical description of the controller it is, for example, easy to see that sensors S0 to S6 and actuator A1 are basically turned off. Hence, no collision avoidance behavior was evolved, because input from the proximity sensors is ignored. The sensors of the pendulum angles are also not used.

Figure 9. 

Complete qualitative representation of the best evolved AHHS controller. The upper row of nodes represents the 10 sensors. The 10 small diagrams show the sensor-to-hormone mappings. The node in the center represents the hormone concentration, and the neighboring arrow loop represents the hormone-to-hormone mapping. The two lower nodes represent the actuator outputs with the corresponding hormone-to-actuator mappings.

The internal dynamics (i.e., without dynamic sensory input) is determined by the feedback loop through hormone-to-hormone rules. A bifurcation diagram of the hormone-to-hormone mapping function shown in Figure 9 is given in Figure 10. In order to simplify the dynamics of the complete system, the overall influence by sensors is assumed to be constant and is added to the hormone concentration in every time step. That yields the following map for the hormone dynamics (Equation 14):
formula
In Figure 10 this constant is used as bifurcation parameter. For each value of the constant there are infinitely many combinations of associated sensor values that are not individually considered. The diagram shows three qualitatively different regions: a period-two region, a period-three region, and a fixed-point region. The controller mainly operates in the period-three region (data not shown), that is, on attractors based on oscillations between three different hormone concentrations with a period length of three time steps.

Figure 10. 

Bifurcation diagram for the internal hormone concentration dynamics of the evolved controller shown in Figure 9. The overall influence of the sensory input is used as bifurcation parameter (see Equation 14). The controller mainly operates in the period-three region (cf. dynamics of the hormone concentration in Figure 8f).
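The procedure behind such a bifurcation diagram can be sketched as follows: for each value of the constant sensory influence, the one-dimensional map is iterated, the transient is discarded, and the period of the remaining orbit is determined. The form of the map (hormone-to-hormone function plus a constant s, clipped to [0, 1]) and the toy function `toy_g` are assumptions; `toy_g` is not the evolved mapping of Figure 9.

```python
def attractor_period(step, h0, transient=1000, max_period=16, tol=1e-9):
    """Smallest period p <= max_period of the orbit after the transient."""
    h = h0
    for _ in range(transient):
        h = step(h)
    orbit = [h]
    for _ in range(2 * max_period):
        orbit.append(step(orbit[-1]))
    for p in range(1, max_period + 1):
        if all(abs(orbit[i + p] - orbit[i]) < tol for i in range(max_period)):
            return p
    return None                                  # no short period detected


def make_step(g, s):
    """Hormone map with constant sensory influence s (assumed form)."""
    return lambda h: min(max(g(h) + s, 0.0), 1.0)


def toy_g(h):
    """Hypothetical step-like hormone-to-hormone mapping."""
    if h < 0.25:
        return 1.0
    if h > 0.75:
        return 0.5
    return 0.0


def scan(g, s_values, h0=0.3):
    """Attractor period for each value of the bifurcation parameter s."""
    return [(s, attractor_period(make_step(g, s), h0)) for s in s_values]


periods = scan(toy_g, [0.0, 0.3])
```

Plotting the visited hormone concentrations instead of the period for a fine grid of s values yields a diagram of the kind shown in Figure 10.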

5  AHHS—Mode of Operation

In the following we investigate the mode of operation of AHHS controllers. We test the evolved selection of sensors and the dependence on sensory input. The hormone dynamics is interpreted in a nonlinear dynamics context in extension of the results shown in Figure 10. Finally, we investigate the emergence of oscillations and different time scales in the hormone dynamics.

5.1  Influence of the Number of Rules

We begin by investigating the influence of the allowed number of rules, M, for the setting with two modules. The results are shown in Figure 11. On the one hand, a certain number of rules (from Figure 11 one can tell that M = 30 would most likely be a good choice here) is necessary to reach good performance. On the other hand, larger numbers seem not to decrease performance significantly. Note that the dependence on the number of rules is not simply a question of whether a function (as shown in Figure 9) is representable by few rules, but rather of how much of the space of all possible behaviors is covered by the initial population. Hence, a well-chosen number of rules would maximize the heterogeneity of behavior in the initial population without increasing the search space unnecessarily. This relation, however, needs more investigation and will be future work, together with the option of evolving the number of rules (i.e., dynamic genome lengths).

Figure 11. 

Box-and-whisker plots of the fitnesses of the best evolved controllers for several numbers of rules, M ∈ {2, 4, 6, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200} (significance not displayed).

5.2  Most Influential Sensors

The approach with just one hormone proposed in this work allows a straightforward time series analysis of the internal states (described by a single hormone concentration) compared with the sensory input. We use Pearson's product-moment correlation coefficient (PMCC) for the temporal differences (first derivatives) for pairs of the hormone concentration and sensor values. By this investigation we want to detect the sensors that mainly influence the controllers. This way, we regard the evolved controllers as black boxes and analyze their behavior statistically.

The correlation coefficient r gives a good estimate of how relevant the particular sensor is. Values of r = 1 or r = −1 indicate a positively linear or negatively linear relationship. Values of |r| < 1 indicate increasing noise until the maximum noise is reached for r = 0. Note that nonlinear relationships cannot be discovered with the PMCC.
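The analysis can be sketched as follows: both time series are differenced, and the PMCC is computed on the differences. The function names and the convention of returning 0 for a constant series are assumptions.

```python
import math


def diff(series):
    """Temporal differences (discrete first derivative) of a time series."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson(x, y):
    """Pearson's product-moment correlation coefficient of two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0          # constant series: treated as uncorrelated (assumed)
    return sxy / math.sqrt(sxx * syy)


def sensor_relevance(hormone, sensors):
    """Correlate hormone changes with the changes of each sensor."""
    dh = diff(hormone)
    return {name: pearson(dh, diff(values)) for name, values in sensors.items()}
```

A sensor whose changes are always mirrored by proportional hormone changes yields |r| close to 1, whereas a sensor that the controller ignores yields r close to 0.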

The results for the best controllers that were evolved applying the noise method (see Figure 7) are shown in Figure 12. Interestingly, no controller listens mainly to the pendulum angles (pendulum angle sensors 1 through 4, i.e., S0 through S3 in Table 2), and those controllers listening to the proximity sensors correlate almost always positively. The best evolved controllers listen only to sensors S6 through S9, representing the cart velocity and pendulum velocity. These sensors are commonly regarded as the most relevant in inverted-pendulum scenarios [46].

Figure 12. 

Correlation coefficients for correlations between changes of hormone value (first derivative) and the changes of the 10 sensors in the left and right compartments; dotted lines show thresholds of high significance (p < 0.01) for this data set (outside the two lines, all data is significant; inside, only two data points).

5.3  Nonlinear Dynamics Interpretation

The bifurcation diagram shown above (Figure 10) already indicated that AHHS controllers can be investigated by applying methods of nonlinear dynamics. The hormone-to-hormone subrules define the internal dynamics by a feedback loop. This internal dynamics is then disturbed by sensory input. The dynamics of the hormone concentrations can be interpreted as trajectories in a state space. A demonstration (based on a controller with several hormones, in contrast to those reported above) in the form of a vector field of such a state space, focusing on one hormone-to-hormone relationship, is shown in Figure 13a. The vector field has a fixed point at about (0.64,0.93). The occurring states of a regular evaluation run, shown in Figure 13a, indicate, however, that the system almost never stays at the fixed point. In fact, the sensory input and possibly also the influence of other hormones disturb the system and drive it temporarily away from the fixed point.

Figure 13. 

Hormone dynamics: (a) vector field as defined by the linear and nonlinear hormone-to-hormone rules for two selected hormones; (b) dynamics of these hormones, and corresponding sensory input and angular velocity of the pendulum of one module.

Figure 13b shows the hormone concentration H0, the most influential sensor S9 (negative angular velocities), and the corresponding pendulum angular velocity ω for the initial 2,500 time steps. Without any disturbance by sensory input, hormone 0 would relax to the steady state of about H0 = 0.93. The sensory input S9 adds to the hormone and increases the hormone concentration to H0 = 1. Without input from S9, the hormone oscillates irregularly between about 0.75 and 1 due to other inputs. The actuator is activated in these phases, resulting in short periods of acceleration of the cart that keep the pendulum swinging.

5.4  Control Based on Oscillations

An interesting property of all evolved controllers is the oscillation of all hormone concentrations. Certainly the domain of oscillating pendulums introduces oscillations into the system, but the controllers also evolved much faster oscillations. Even fast-swinging pendulums typically have periods of more than 200 time steps, whereas hormones often show periods of 2 to 7 time steps. Obviously, controllers whose functionality is based on such oscillations are favored by the evolutionary algorithm in this domain. In fact, these oscillations can be interpreted as a means of storing system states. This is clearly indicated by Figure 14, which shows a hormone with two different kinds of oscillation behavior. Hence, one can argue that in addition to those oscillations of the pendulum a second time scale emerges. The emergence of time scales and their significance as forms of memory and for robustness have been reported regularly [20, 36, 52]. Similar dynamics were found in single neurons [15]. In this context, note also the discussion of Moioli et al. [38] in Section 8.2 below. Considering these fast oscillations, the similarity to neurons, and our inspiration by hormones, one could argue that there is a contradiction. However, we argue that the dynamics of hormones in unicellular organisms is also fast compared to changes in their environments (e.g., nutrition gradients).
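The separation of time scales can be made concrete by estimating the period of the fast component of a recorded hormone trace. The following is a minimal sketch using a plain autocorrelation estimate; the synthetic signal (a fast oscillation with period 5 superimposed on a slow one with period 250), the function name `dominant_period`, and the search bound `max_lag` are assumptions for illustration and do not come from the evolved controllers.

```python
import math


def dominant_period(signal, max_lag):
    """Return the lag in [2, max_lag] with the highest autocorrelation."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    best_lag, best_score = None, float("-inf")
    for lag in range(2, max_lag + 1):
        # Average product of the signal with a copy of itself shifted by `lag`.
        score = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n - lag)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic stand-in for a hormone trace: a fast oscillation (period 5)
# on top of a slow, pendulum-like oscillation (period 250).
hormone = [
    0.85 + 0.10 * math.sin(2 * math.pi * t / 5) + 0.03 * math.sin(2 * math.pi * t / 250)
    for t in range(1000)
]

# Search only short lags, matching the reported hormone periods of 2 to 7 steps.
print("estimated fast period:", dominant_period(hormone, max_lag=8))
```

Restricting the search to short lags isolates the fast time scale; the slow pendulum time scale would require a much longer lag window and a longer recording.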

Figure 14. 

Controller states represented by different oscillations: hormone dynamics and angle of the pendulum of one module in a two-pendulum scenario.

6  Generalization and Scaling

All controllers reported in this article were evolved for only one initialization of module numbers and cart and pendulum positions. In the following we investigate how these controllers generalize to other initializations and to other module numbers. Both properties are desirable because they would conserve resources: the controllers could be evolved in scenarios of lower complexity (e.g., only three modules and only one initialization of cart and pendulum positions) than the final application scenario.

6.1  Generalization to Initial Conditions Far from the Original Domain

In the following we investigate how evolved controllers generalize to initial angles of pendulums for which the controllers were not evolved. This was done for the best controllers of the two-module scenario using the AHHS-noise approach that is shown in Figure 7. Without re-evolving the controllers, they were tested for all pairs of initial pendulum angles (ϕ0(0), ϕ1(0)) on the full interval ϕi(0) ∈ [0, 2π] for i ∈ {0, 1}, with a resolution of Δϕ = 0.05π (i.e., 41 values per pendulum and 41 × 41 = 1681 tested pairs; cf. Figure 15).
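The sweep over initial angles can be sketched as follows. Figure 15 reports 1681 tested initializations, which corresponds to a 41 × 41 grid and hence a step of 2π/40 = 0.05π on [0, 2π]. The function names and, in particular, `toy_fitness` are hypothetical: the latter merely stands in for a full simulation run of an evolved controller and is shaped so that fitness drops near ϕ = π.

```python
import math
import statistics


def angle_grid(points_per_axis=41):
    """Evenly spaced initial angles on [0, 2*pi], both endpoints included."""
    step = 2 * math.pi / (points_per_axis - 1)  # 0.05 * pi for 41 points
    return [i * step for i in range(points_per_axis)]


def sweep(evaluate, points_per_axis=41):
    """Evaluate a controller on every pair (phi0, phi1) of initial angles."""
    grid = angle_grid(points_per_axis)
    return {(p0, p1): evaluate(p0, p1) for p0 in grid for p1 in grid}


def toy_fitness(phi0, phi1):
    """Hypothetical stand-in for a simulation run (not the real benchmark).

    Fitness is low when either pendulum starts near phi = pi, i.e. hanging
    at rest, which mimics the cross shape described for Figure 15a.
    """
    return min(abs(math.cos(phi0 / 2)), abs(math.cos(phi1 / 2)))


results = sweep(toy_fitness)
q1, median, q3 = statistics.quantiles(list(results.values()), n=4)
print(f"{len(results)} initializations, median {median:.2f} (Q1 {q1:.2f}, Q3 {q3:.2f})")
```

Replacing `toy_fitness` with a call into the actual simulation would yield the per-controller median and quartiles reported in this section.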

The median fitness over all 30 controllers and all tested initial angles was 0.61 (first quartile: 0.23; third quartile: 0.70). This is a good result compared to the NEAT experiments, which were obtained by explicitly evolving for one initialization and reached only a median of 0.59 (see Figure 6).

Figure 15a shows the results for a controller that achieved a fitness of 0.86 for the original initialization. The median over all initializations for this controller was 0.88 (first quartile: 0.28; third quartile: 0.92), which is above the fitness for the original initialization. The white cross shape in Figure 15a is obtained for 28 of the 30 evaluated controllers and indicates low fitness for initializations of ϕi(0) ≈ π, which is close to the lower equilibrium of the pendulum. These initializations are difficult for the controllers because they are not part of the original initialization of the two-module scenario (in contrast to the three-, four-, and five-module scenarios; see Table 3). For ϕ = π the pendulum is at rest and the important sensory stimulus from the angular velocity sensors is missing. In similar experiments for the three-module scenario (i.e., controllers evolved in the three-module scenario were evaluated on the full interval of ϕi) the cross shapes vanish (see Figure 15b) because ϕi(0) = π is part of the initialization for which the controllers are evolved.

Figure 15. 

Fitness for 1681 initial pendulum angles of the first two pendulums, ϕ0 and ϕ1. The crosses at (0.8π, 0.9π) mark the initialization for which the controller was evolved (in the case of the three-module scenario we have additionally ϕ2(0) = π).

Similar results were reported before by Pasemann [46] for an ANN and a single inverted pendulum. Adaptive controllers that also worked well on initial conditions far from the original domain emerged for the single pendulum.

6.2  Scaling of Module Numbers

In the following we investigate how evolved controllers cope with numbers of modules that are higher and lower than in the scenario they have been evolved for. The best controller of each of the 30 runs for the three-module scenario using the AHHS-noise approach was evaluated in one-, two-, four-, and five-module scenarios. The result is shown in Figure 16. There is no significant difference between the five settings. Hence, the controllers scale up and down equally well. Effective scaling opens up the possibility of evolving controllers in simpler scenarios that consume fewer resources and applying the evolved controllers to more complex scenarios later. The feasibility of this approach is also supported by the good scaling in individual cases of controllers that were evolved for the two-module scenario and that showed high fitness in the five-module scenario, close to the all-time record.

Figure 16. 

Scaling of module numbers; the controllers were evolved for the three-module setting (data shown in Figure 7). The same controllers were evaluated in the one-, two-, four-, and five-module settings without re-evolving them. There is no significant difference in the performance.

7  Comparison with Similar Implementations

In order to test the usefulness of defining the sensor-to-hormone, hormone-to-hormone, and hormone-to-actuator mappings using the AHHS concept of rules, we have built two reference implementations. We call the first one the direct table (DT) approach. It uses the lookup tables of the AHHS implementation, but the table entries are filled directly, without the help of summations of functions defined by subrules. Initially the tables are filled with random numbers, which are directly mutated by the genetic algorithm. In contrast to the AHHS approach, neighboring values in the table are not correlated.

The second reference implementation is called the Fourier approach. It implements functions using Fourier series to fill the lookup tables. This way, neighboring values in the table are correlated because Fourier series are differentiable; the correlation should also be reflected in discrete samplings of Fourier series. The Fourier coefficients are initialized randomly and mutated by the genetic algorithm.

7.1  Direct Encoding of Lookup Tables

This reference implementation uses the lookup tables of the AHHS implementation and thus preserves the general network topology shown in Figure 9. However, instead of being filled by the summation of several subrules, the tables are filled directly: initially with random numbers, and afterwards by mutations that act on the table entries themselves. The following results were obtained with tables of 128 bins (i.e., the intervals of sensor values and the intervals of hormone concentrations are discretized with a resolution of 128 discrete steps) and with an average of 320 mutations per controller and generation. The results are shown in Figure 17. The performance of the DT approach seems to be independent of the module number. This is explained by the local optimum that is typically found, which is a fast pendulum-swinging behavior without any interaction between the carts. Besides the quantitatively small difference between the DT and AHHS medians of the four- and five-module scenarios, there is a qualitative difference: the DT behaviors show only fast pendulum swinging, while the AHHS behaviors use more promising strategies (e.g., a single pendulum is balanced).
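The DT idea can be sketched in a few lines. Only the table resolution of 128 bins and the figure of 320 mutations are taken from the text; the value range of the entries, the Gaussian mutation with `sigma = 0.1`, the number of tables per controller, and the use of a fixed rather than an average mutation count are assumptions made to keep the sketch short.

```python
import random

BINS = 128  # table resolution reported in the text


def random_table(rng):
    """A lookup table whose entries are independent random numbers."""
    return [rng.uniform(-1.0, 1.0) for _ in range(BINS)]  # range is an assumption


def lookup(table, value, low=0.0, high=1.0):
    """Map a sensor or hormone value in [low, high] to its table entry."""
    fraction = (value - low) / (high - low)
    index = min(BINS - 1, max(0, int(fraction * BINS)))
    return table[index]


def mutate(tables, rng, mutations=320, sigma=0.1):
    """Perturb randomly chosen entries across all tables of one controller.

    A fixed count is used here for simplicity; the text reports 320 as an average.
    """
    mutated = [list(t) for t in tables]
    for _ in range(mutations):
        t = rng.randrange(len(mutated))
        i = rng.randrange(BINS)
        mutated[t][i] += rng.gauss(0.0, sigma)
    return mutated


rng = random.Random(1)
controller = [random_table(rng) for _ in range(4)]  # table count is an assumption
child = mutate(controller, rng)
print("output for input 0.5:", lookup(child[0], 0.5))
```

Because every entry is drawn and mutated independently, neighboring bins carry no shared structure, which is the property that distinguishes this encoding from the subrule-based AHHS tables.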

Figure 17. 

Box-and-whisker plots of the fitnesses of the best evolved controllers for the performance comparison between the direct encoding of lookup tables (DT) and the best AHHS approach (AHHS-noise approach for several modules, and standard approach for one module).

In addition, this simple approach shows significantly better performance for three and five modules than does NEAT. This might be seen as an indicator that shaping sensor-to-hormone and hormone-to-actuator mappings is a generally beneficial approach.

7.2  Lookup Tables Based on Fourier Series

In this second reference implementation we use Fourier series of order 24. Such a high order might be counterintuitive, but proved empirically to be a better approach than using lower orders. Apparently, short wavelengths in the functions are advantageous. The lookup tables are filled by functions of the form
\[ f(x) = \frac{a_0}{2} + \sum_{k=1}^{24} \left( a_k \cos(kx) + b_k \sin(kx) \right) \]
with x ∈ [0, 2π] representing the scaled sensor or hormone value associated with the considered table entry. The values a0, ak, bk (k ∈ {1, …, 24}) are initialized randomly and mutated by the genetic algorithm.
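Filling a table from Fourier coefficients can be sketched as follows, assuming the standard truncated Fourier series with a constant term a0/2 and cosine and sine terms up to order 24. The table resolution of 128 bins is borrowed from the DT experiments and is an assumption here, as are the coefficient range and the function names.

```python
import math
import random

ORDER = 24   # order of the Fourier series used in the text
BINS = 128   # assumed table resolution (value taken from the DT experiments)


def fourier_value(x, a0, a, b):
    """Evaluate a0/2 + sum over k of (a[k-1]*cos(k*x) + b[k-1]*sin(k*x))."""
    total = a0 / 2.0
    for k in range(1, len(a) + 1):
        total += a[k - 1] * math.cos(k * x) + b[k - 1] * math.sin(k * x)
    return total


def fill_table(a0, a, b, bins=BINS):
    """Sample the series at evenly spaced x in [0, 2*pi], one sample per bin."""
    return [fourier_value(2 * math.pi * i / (bins - 1), a0, a, b) for i in range(bins)]


rng = random.Random(0)
a0 = rng.uniform(-1.0, 1.0)                       # coefficient range is an assumption
a = [rng.uniform(-1.0, 1.0) for _ in range(ORDER)]
b = [rng.uniform(-1.0, 1.0) for _ in range(ORDER)]
table = fill_table(a0, a, b)
print("first three entries:", [round(v, 3) for v in table[:3]])
```

In such a scheme the genetic algorithm would mutate the 49 coefficients rather than the 128 table entries, so a single mutation changes the whole table smoothly instead of one isolated bin.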

The results shown in Figure 18 indicate that the rule-based AHHS approach reported in this article is not the single best approach for all scenarios. The Fourier series approach reaches significantly better performance for the one- and two-module scenarios. However, the Fourier series approach has higher computational cost: our implementation ran slower by a factor of about two than the AHHS-noise approach reported in Section 4.2. Reducing the order of the Fourier series would lower the computational cost but also the performance. Hence we conclude that the AHHS approach is still the better choice for this domain.

Figure 18. 

Box-and-whisker plots of the fitnesses of the best evolved controllers for the performance comparison between the approach based on Fourier series and the best AHHS approach (AHHS-noise approach for multiple modules, and standard approach for one module).

8  Related Work

The proposed AHHS controller approach has methodological similarities to many other approaches; however, the unrestricted evolution of hormone-to-hormone reactions as well as sensor-to-hormone and hormone-to-actuator mappings seems to be a unique feature. In the following we compare our approach with neural network approaches, gene regulatory networks, and reaction-diffusion approaches.

8.1  Artificial Neural Networks, NEAT, HyperNEAT, and GasNet

It seems justified to call ANNs (in particular, continuous-time recurrent neural networks) the standard approach in the automatic synthesis of controllers, especially in the field of evolutionary robotics [14, 27, 31, 43, 44, 69]. However, the standard ANN approach fails for complex tasks, a failure that is, as mentioned in the introduction, documented by the absence of complex tasks in the literature [42]. The AHHS approach differs from the ANN mainly in the concept of hormone diffusion and, as mentioned above, in the X-to-hormone mappings, which can be interpreted as network functions with weights depending on the outputs of connected neurons.

In contrast to the standard ANN approach, NEAT [63] and HyperNEAT [64] have verifiably higher evolvability. NEAT differs in the sophisticated methods of evolutionary computation, while the underlying ANN is that of the standard approach.

AHHSs can be compared with ANNs in the following way. Assume (as is shown empirically for one example in this article) that function fitting for sensor-to-hormone, hormone-to-hormone, and hormone-to-actuator mappings, as reported in this work, is a good way to solve the problem of controller synthesis. ANNs are general enough to solve the problem, in principle, in the same way. However, it is improbable that separate function-fitting clusters implementing these mappings will emerge within the network, even if NEAT is used.

HyperNEAT differs from NEAT by its generation of connectivity patterns that are comparable to the spatiality introduced by the compartmentalization using AHHS. Otherwise there do not seem to be many similarities between these two approaches.

Many more similarities are found in comparing the AHHS with GasNet [31], because in addition to the standard ANN they both introduce diffusing hormones and hence represent hybrid approaches. The related artificial homeostatic systems [41, 70, 71] emphasize a homeostatic aspect similar to the internal dynamics of the AHHS.

8.2  Gene Regulatory Networks and Reaction-Diffusion Systems

The most similarities are found by comparing the AHHS with other approaches that do not rely on ANNs. Gene regulatory networks (GRNs) [3, 18] are inspired by biological gene networks, implementing an implicit encoding of networks. Although superficially there seem to be many similarities due to the similar biological inspiration, there are still many differences between GRNs and AHHSs. A difference in the genotypes is that AHHSs do not distinguish between coding and noncoding regions. The AHHS genome is rather predefined and static. Concerning the resulting networks and their functioning, each edge has its own activation threshold, and redundant edges with different activations are allowed in the AHHS approach.

Probably the most similarities of all are found in the comparison with the reaction-diffusion controller approach [13]. Among these similarities are the definition of controllers based on reactions and the essential influence of diffusion processes. Differences include the predefined reactions (Gray-Scott reaction-diffusion system) in [13]. In addition, spatiality plays a much more important role there, as Dale and Husbands use 128 compartments, while our agents have only two compartments in this work. Their approach relies on this high resolution and hence shows spatial qualities that are not observed in the AHHS approach.

Furthermore, similarities are found in comparison with the controller based on the Kuramoto model by Moioli et al. [38]. In that work, the focus is on synchronization based on predefined oscillations. In the AHHS approach, oscillations and synchronization emerge by themselves.

The evolution of sensor-to-hormone mappings can be interpreted as a direct function-fitting approach related to Jang [32]. However, the AHHS approach is much more sophisticated and defines these mappings not directly but in a form that shows high evolvability.

Other approaches that are inspired by hormones are very different from ours. For example, Shen et al. [59] report a message-based system, and Shen et al. [60] report a pheromone-like implementation (comparable to that of Payton et al. [47]). Furthermore, see Meng et al. [37] for a modular robotics approach inspired by a mechanochemical model for cell morphogenesis—an aspect that we do not address here but do consider elsewhere; see Schmickl et al. [55].

9  Conclusion

In this article we have reported an application and intensive analysis of the AHHS controller approach, which is inspired by signaling networks in unicellular organisms. We have reported a comparison with NEAT on the coupled-inverted-pendulums benchmark that revealed a significantly better performance of the AHHS controller for multi-module settings with a limited number of evaluations. It can be considered a main result of this work that the AHHS approach can be interpreted as an ANN with a rather exceptional network function based on weight functions depending on the outputs of connected neurons. This concept might also be a promising approach for ANNs. The lookup-table implementation of the AHHS approach introduces an interesting difference between the genotype and the phenotype of the controllers. The genotype is designed for high evolvability, while the phenotype is optimized for low computational complexity.

An advantage of the AHHS over, for example, the ANN, NEAT, or HyperNEAT, is that a full but simple representation of an AHHS controller is possible as shown in Figure 9. Note that the networks evolved by the NEAT approach are much more complex and contain between 25 and 40 nodes.

A nonlinear dynamics analysis of the evolved controllers is possible, especially in view of the one-dimensionality due to our using only one hormone in this work. This analysis showed that the controllers are mostly based on periodic attractors that emerge during evolution.

The investigation of two reference implementations showed that evolving X-to-hormone mappings based on summations of functions is advantageous over direct encoding of such mappings. Hence, our approach of evolving the mappings indirectly via groups of tunable subrules seems to be particularly suitable for methods of artificial evolution.

Generally the AHHS approach does not require predefined topologies. In this work the network topology was not predefined, because the controllers were initialized with fully connected networks. Unnecessary connections had to be turned off explicitly by evolution, as is seen in Figure 9 (zero lines for sensors 0 through 6). The probability of fully connected networks is reduced either by decreasing the number of rules (M) or by increasing the number of hormones (N). The relation between M and N also changes the representable shapes of the X-to-hormone mappings (cf. Figure 4a). Initialization with almost no connections would also be possible if the addition of rules (i.e., growing genome lengths) were allowed.

The application of the AHHS approach to other domains has been shown previously. The evolution of controllers for single robots and for modular robotics has been shown in [23, 24, 66]. The analysis of single robot controllers was reported in [54]. The implementation of a simple AHHS controller on hardware was reported in [56, 65]. Still, further investigations of additional domains would be desirable, as well as intensive comparisons with other methods such as HyperNEAT, GasNets, and GRN-based methods.

The analysis has shown that the proposed approach is superior to state-of-the-art approaches in the considered domain with local control, in that evolved controllers generalize to other initializations and scale with the number of modules. This opens up many possibilities. For example, controllers can be evolved within simpler scenarios in order to save resources. Furthermore, changes in the number of coupled modules could also be managed dynamically by the same controller without external adaptations. The good scaling and generalization properties could probably even be improved by re-evolving the controllers in new or more complex domains. In addition, this method could be combined with environmental incremental evolution as reported by Nakamura et al. [40]. Our future work will mainly be the application of the proposed approach to the multimodular robotics domain.

Acknowledgments

We thank the anonymous referees for helpful comments. This work is supported by EU-IST-FET project SYMBRION, no. 216342, and by EU-ICT project REPLICATOR, no. 216240.

References

1. 
Alberts
,
B.
(
1989
).
Molecular biology of the cell.
New York
:
Garland Publications
.
2. 
Armus
,
H.
,
Montgomery
,
A.
, &
Jellison
,
J.
(
2006
).
Discrimination learning in Paramecia (P. caudatum).
The Psychological Record
,
56
,
489
498
.
3. 
Bongard
,
J. C.
(
2002
).
Evolving modular genetic regulatory networks.
In
Proceedings of the 2002 Congress on Evolutionary Computation (CEC'02)
(pp.
17
21
).
4. 
Bray
,
D.
(
1990
).
Intracellular signalling as a parallel distributed process.
Journal of Theoretical Biology
,
143
(
2
),
215
231
.
5. 
Bray
,
D.
(
1995
).
Protein molecules as computational elements in living cells.
Nature
,
376
,
307
312
.
6. 
Bray
,
D.
(
2009
).
Wetware: A computer in every living cell.
New Haven, CT
:
Yale University Press
.
7. 
Brooks
,
R.
(
1986
).
A robust layered control system for a mobile robot.
IEEE Journal of Robotics and Automation
,
2
(
1
),
14
23
.
8. 
Brooks
,
R.
(
1997
).
From earwigs to humans.
Robotics and Autonomous Systems
,
20
(
2–4
),
291
304
.
9. 
Camazine
,
S.
,
Deneubourg
,
J.-L.
,
Franks
,
N. R.
,
Sneyd
,
J.
,
Theraulaz
,
G.
, &
Bonabeau
,
E.
(
2001
).
Self-organization in biological systems.
Princeton, NJ
:
Princeton University Press
.
10. 
Chatterjee
,
D.
,
Patra
,
A.
, &
Joglekar
,
H. K.
(
2002
).
Swing-up and stabilization of a cart-pendulum system under restricted cart track length.
Systems & Control Letters
,
47
(
4
),
355
364
.
11. 
Chouard
,
T.
(
2010
).
Revenge of the hopeful monster.
Nature
,
463
,
864
867
.
12. 
Cliff
,
D.
,
Harvey
,
I.
, &
Husbands
,
P.
(
1993
).
Explorations in evolutionary robotics.
Adaptive Behavior
,
2
(
1
),
71
108
.
13. 
Dale
,
K.
, &
Husbands
,
P.
(
2010
).
The evolution of reaction-diffusion controllers for minimally cognitive agents.
Artificial Life
,
16
(
1
),
1
19
.
14. 
Dorigo
,
M.
,
Trianni
,
V.
,
S¸ahin
,
E.
,
Groß
,
R.
,
Labella
,
T. H.
,
Baldassarre
,
G.
,
Nolfi
,
S.
,
Deneubourg
,
J.-L.
,
Mondada
,
F.
,
Floreano
,
D.
, &
Gambardella
,
L. M.
(
2004
).
Evolving self-organizing behaviors for a swarm-bot.
Autonomous Robots
,
17
(
2–3
),
223
245
.
15. 
Egorov
,
A. V.
,
Hamam
,
B. N.
,
Hasselmo
,
E. F. M. E.
, &
Alonso
,
A. A.
(
2002
).
Graded persistent activity in entorhinal cortex neurons.
Nature
,
420
,
173
178
.
16. 
Endler
,
J. A.
(
1993
).
Some general comments on the evolution and design of animal communication systems.
Philosophical Transactions of the Royal Society B
,
340
,
215
225
.
17. 
Floreano
,
D.
,
Mitri
,
S.
,
Magnenat
,
S.
, &
Keller
,
L.
(
2007
).
Evolutionary conditions for the emergence of communication in robots.
Current Biology
,
17
,
514
519
.
18. 
Floreano
,
D.
,
Dürr
,
P.
, &
Mattiussi
,
C.
(
2008
).
Neuroevolution: From architectures to learning.
Evolutionary Intelligence
,
1
,
47
62
.
19. 
Floreano
,
D.
,
Husbands
,
P.
, &
Nolfi
,
S.
(
2008
).
Evolutionary robotics.
In B. Siciliano & K. Oussama (Eds.)
,
Handbook of robotics
(pp.
1423
1452
).
Berlin
:
Springer-Verlag
.
20. 
Fujimoto
,
K.
, &
Kaneko
,
K.
(
2003
).
How fast elements can affect slow dynamics.
Physica D
:
Nonlinear Phenomena
,
180
(
1–2
),
1
16
.
21. 
Grey Walter
,
W.
(
1950
).
An imitation of life.
Scientific American
,
182
(
5
),
42
45
.
22. 
Grey Walter
,
W.
(
1951
).
A machine that learns.
Scientific American
,
185
(
2
),
60
63
.
23. 
Hamann
,
H.
,
Stradner
,
J.
,
Schmickl
,
T.
, &
Crailsheim
,
K.
(
2010
).
A hormone-based controller for evolutionary multi-modular robotics: From single modules to gait learning.
In
Proceedings of the IEEE Congress on Evolutionary Computation (CEC'10)
(pp.
224
251
).
24. 
Hamann
,
H.
,
Stradner
,
J.
,
Schmickl
,
T.
, &
Crailsheim
,
K.
(
2010
).
Artificial hormone reaction networks: Towards higher evolvability in evolutionary multimodular robotics.
In H. Fellermann, M. Dörr, M. M. Hanczyc, L. L. Laursen, S. Maurer, D. Merkle, P.-A. Monnard, K. Støy, & S. Rasmussen (Eds.)
,
Proceedings of the ALife XII Conference
(pp.
773
780
).
Cambridge, MA
:
MIT Press
.
25. 
Hamann
,
H.
,
Schmickl
,
T.
, &
Crailsheim
,
K.
(
2011
).
Coupled inverted pendulums: A benchmark for evolving decentral controllers in modular robotics.
In N. Krasnogor & P. L. Lanzi (Eds.)
,
Proceedings of the 13th Annual Genetic and Evolutionary Computation Conference, GECCO 2011
(pp.
195
202
).
ACM
. .
26. 
Hand
,
W.
, &
Haupt
,
W.
(
1971
).
Flagellar activity of the colony members of Volvox aureus Ehrbg. during light stimulation.
Journal of Eukaryotic Microbiology
,
18
(
3
),
361
364
.
27. 
Harvey
,
I.
,
Husbands
,
P.
,
Cliff
,
D.
,
Thompson
,
A.
, &
Jakobi
,
N.
(
1997
).
Evolutionary robotics: The Sussex approach.
Robotics and Autonomous Systems
,
20
(
2–4
),
205
224
.
28. 
Hauser
,
M. D.
(
1996
).
The evolution of communication.
Cambridge, MA
:
MIT Press
.
29. 
Holland
,
J. H.
(
1975
).
Adaptation in natural and artificial systems.
Ann Arbor, MI
:
University of Michigan Press
.
30. 
Holmes
,
S.
(
1903
).
Phototaxis in Volvox.
Biological Bulletin
,
4
(
6
),
319
326
.
31. 
Husbands
,
P.
(
1998
).
Evolving robot behaviours with diffusing gas networks.
In
Evolutionary Robotics
(pp.
71
86
).
Berlin
:
Springer
.
32. 
Jang
,
J.-S. R.
(
2002
).
Self-learning fuzzy controllers based on temporal backpropagation.
IEEE Transactions on Neural Networks
,
3
(
5
),
714
723
.
33. 
Kauffman
,
S. A.
, &
Levin
,
S.
(
1987
).
Towards a general theory of adaptive walks on rugged landscapes.
Journal of Theoretical Biology
,
128
(
1
),
11
45
.
34. 
Koza
,
J.
(
1992
).
Genetic programming: On the programming of computers by means of natural selection.
Cambridge, MA
:
MIT Press
.
35. 
Koza
,
J. R.
, &
Keane
,
M. A.
(
1990
).
Genetic breeding of non-linear optimal control strategies for broom balancing.
In A. Bensoussan & J. Lions (Eds.)
,
Analysis and optimization of systems
(pp.
47
56
).
Berlin
:
Springer-Verlag
.
36. 
Kremling
,
A.
,
Fischer
,
S.
,
Sauter
,
T.
,
Bettenbrock
,
K.
, &
Gilles
,
E.
(
2004
).
Time hierarchies in the Escherichia coli carbohydrate uptake and metabolism.
BioSystems
,
73
(
1
),
57
71
.
37. 
Meng
,
Y.
,
Zhang
,
Y.
, &
Jin
,
Y.
(
2011
).
Autonomous self-reconfiguration of modular robots by evolving a hierarchical mechanochemical model.
IEEE Computational Intelligence Magazine
,
6
(
1
),
43
54
.
38. 
Moioli
,
R.
,
Vargas
,
P. A.
, &
Husbands
,
P.
(
2010
).
Exploring the Kuramoto model of coupled oscillators in minimally cognitive evolutionary robotics tasks.
In
WCCI 2010 IEEE World Congress on Computational Intelligence—CEC IEEE
(pp.
2483
2490
).
39. 
Murata
,
S.
,
Kakomura
,
K.
, &
Kurokawa
,
H.
(
2008
).
Toward a scalable modular robotic system—Navigation, docking, and integration of M-TRAN.
IEEE Robotics & Automation Magazine
,
14
(
4
),
56
63
.
40. 
Nakamura
,
H.
,
Ishiguro
,
A.
, &
Uchilkawa
,
Y.
(
2000
).
Evolutionary construction of behavior arbitration mechanisms based on dynamically-rearranging neural networks.
In
Proceedings of the 2000 Congress on Evolutionary Computation
,
Vol. 1
(pp.
158
165
).
Piscataway, NJ
:
IEEE
.
41. 
Neal
,
M.
, &
Timmis
,
J.
(
2003
).
Timidity: A useful mechanism for robot control?
Informatica
,
4
(
27
),
197
204
.
42. 
Nelson
,
A. L.
,
Barlow
,
G. J.
, &
Doitsidis
,
L.
(
2009
).
Fitness functions in evolutionary robotics: A survey and analysis.
Robotics and Autonomous Systems
,
57
,
345
370
.
43. 
Nolfi
,
S.
, &
Floreano
,
D.
(
1999
).
Learning and evolution.
Autonomous Robots
,
7
,
89
113
.
44. 
Nolfi
,
S.
, &
Floreano
,
D.
(
2004
).
Evolutionary robotics: The biology, intelligence, and technology of self-organizing machines.
Cambridge, MA
:
MIT Press
.
45. 
Østman
,
B.
,
Hintze
,
A.
, &
Adami
,
C.
(
2010
).
Critical properties of complex fitness landscapes.
In H. Fellermann, M. Dörr, M. M. Hanczyc, L. L. Laursen, S. Maurer, D. Merkle, P.-A. Monnard, K. Støy, & S. Rasmussen(Eds.)
,
Proceedings of the ALife XII Conference
(pp.
126
132
).
Cambridge, MA
:
MIT Press
.
46. 
Pasemann
,
F.
(
1998
).
Evolving neurocontrollers for balancing an inverted pendulum.
Network: Computation in Neural Systems
,
9
(
4
),
495
511
.
47. 
Payton
,
D.
,
Daily
,
M.
,
Estowski
,
R.
,
Howard
,
M.
, &
Lee
,
C.
(
2001
).
Pheromone robotics.
Autonomous Robots
,
11
(
3
),
319
324
.
48. 
Peak
,
D.
,
West
,
J. D.
,
Messinger
,
S. M.
, &
Mott
,
K. A.
(
2004
).
Evidence for complex, collective dynamics and emergent, distributed computation in plants.
Proceedings of the National Academy of Science
,
101
(
4
),
918
922
.
49. 
Press
,
W. H.
,
Teukolsky
,
S. A.
,
Vetterling
,
W. T.
, &
Flannery
,
B. P.
(
2002
).
Numerical recipes in C++.
Cambridge, UK
:
Cambridge University Press
.
50. 
Rechenberg
,
I.
(
1994
).
Evolutionsstrategie '94.
Frommann Holzboog
.
51. 
REPLICATOR
. (
2011
).
Project Web site.
http://www.replicators.eu.
52. 
Rojdestvenski
,
I.
,
Cottam
,
M.
,
Park
,
Y.-I.
, &
Öquist
,
G.
(
1999
).
Robustness and time-scale hierarchy in biological systems.
BioSystems
,
50
(
1
),
71
82
.
53. 
Rubenstein
,
M.
, &
Shen
,
W.-M.
(
2009
).
Scalable self-assembly and self-repair in a collective of robots
. In
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), St. Louis, Missouri, USA
.
54. 
Schmickl
,
T.
, &
Crailsheim
,
K.
(
2009
).
Modelling a hormone-based robot controller.
In
MATHMOD 2009—6th Vienna International Conference on Mathematical Modelling
.
55. 
Schmickl
,
T.
,
Hamann
,
H.
,
Stradner
,
J.
, &
Crailsheim
,
K.
(
2010
).
Hormone-based control for multi-modular robotics.
In P. Levi & S. Kernbach (Eds.)
,
Symbiotic multi-robot organisms: Reliability, adaptability, evolution.
Berlin
:
Springer-Verlag
.
56. 
Schmickl
,
T.
,
Hamann
,
H.
,
Stradner
,
J.
,
Mayet
,
R.
, &
Crailsheim
,
K.
(
2010
).
Complex taxis-behaviour in a novel bio-inspired robot controller.
In H. Fellermann, M. Dörr, M. M. Hanczyc, L. L. Laursen, S. Maurer, D. Merkle, P.-A. Monnard, K. Støy, & S. Rasmussen (Eds.)
,
Proceedings of the ALife XII Conference
(pp.
648
655
).
Cambridge, MA
:
MIT Press
.
57. 
Schmickl
,
T.
,
Hamann
,
H.
, &
Crailsheim
,
K.
(
2011
).
Modelling a hormone-inspired controller for individual- and multi-modular robotic systems.
Mathematical and Computer Modelling of Dynamical Systems
,
17
(
3
),
221
242
.
58. 
Schwefel
,
H.-P.
(
1995
).
Evolution and optimum seeking.
New York
:
Wiley
.
59. 
Shen
,
W.-M.
,
Salemi
,
B.
, &
Will
,
P.
(
2002
).
Hormone-inspired adaptive communication and distributed control for CONRO self-reconfigurable robots.
IEEE Transactions on Robotics and Automation
,
18
(
5
),
700
712
.
60. 
Shen
,
W.-M.
,
Will
,
P.
,
Galstyan
,
A.
, &
Chuong
,
C.-M.
(
2004
).
Hormone-inspired self-organization and distributed control of robotic swarms.
Autonomous Robots
,
17
,
93
105
.
61. 
Shen
,
W.-M.
,
Krivokon
,
M.
,
Chiu
,
H.
,
Everist
,
J.
,
Rubenstein
,
M.
, &
Venkatesh
,
J.
(
2006
).
Multimode locomotion via SuperBot reconfigurable robots.
Autonomous Robots
,
20
(
2
),
165
177
.
62. 
Stanley
,
K. O.
, &
Miikkulainen
,
R.
(
2002
).
Evolving neural networks through augmenting topologies
(Technical Report AI2001-290). Department of Computer Sciences, The University of Texas at Austin, 2002. http://nn.cs.utexas.edu/?stanley:ec02
.
63. 
Stanley
,
K. O.
, &
Miikkulainen
,
R.
(
2004
).
Competitive coevolution through evolutionary complexification.
Journal of Artificial Intelligence Research
,
21
(
1
),
63
100
.
64. 
Stanley
,
K. O.
,
D'Ambrosio
,
D. B.
, &
Gauci
,
J.
(
2009
).
A hypercube-based encoding for evolving large-scale neural networks.
Artificial Life
,
15
(
2
),
185
212
.
65. Stradner, J., Hamann, H., Schmickl, T., & Crailsheim, K. (2009). Analysis and implementation of an artificial homeostatic hormone system: A first case study in robotic hardware. In The 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'09) (pp. 595–600). Piscataway, NJ: IEEE Press.
66. Stradner, J., Hamann, H., Schmickl, T., Thenius, R., & Crailsheim, K. (2011). Evolving a novel bio-inspired controller in reconfigurable robots. In G. Kampis, I. Karsai, & E. Szathmáry (Eds.), Advances in Artificial Life, 10th European Conference, ECAL 2009 (pp. 132–139). Berlin: Springer-Verlag.
67. SYMBRION. (2011). Project Web site.
68. Takahashi, K., Arjunan, S. N. V., & Tomita, M. (2005). Space in systems biology of signaling pathways towards intracellular molecular crowding in silico. FEBS Letters, 579, 1783–1788.
69. Trianni, V. (2008). Evolutionary swarm robotics—Evolving self-organising behaviours in groups of autonomous robots. Berlin: Springer-Verlag.
70. Vargas, P. A., Moioli, R. C., de Castro, L. N., Timmis, J., Neal, M., & von Zuben, F. J. (2005). Artificial homeostatic system: A novel approach. In M. S. Capcarrère, A. A. Freitas, P. J. Bentley, C. G. Johnson, & J. Timmis (Eds.), 8th European Conference on Artificial Life (ECAL'05) (pp. 754–764). Berlin: Springer-Verlag.
71. Vargas, P. A., Moioli, R. C., von Zuben, F. J., & Husbands, P. (2009). Homeostasis and evolution together dealing with novelties and managing disruptions. International Journal of Intelligent Computing and Cybernetics, 2(3), 435–454.
72. Whiteson, S., & Stone, P. (2006). Evolutionary function approximation for reinforcement learning. Journal of Machine Learning Research, 7, 877–917.
73. Widrow, B., & Smith, F. W. (1964). Pattern recognizing control systems. In J. T. Tou & R. H. Wilcox (Eds.), Computer and information sciences (pp. 288–317). Washington, DC: Clever Hume Press.
74. Wiener, N. (1948). Cybernetics: Or control and communication in the animal and the machine. Cambridge, MA: MIT Press.
75. Winkler, L., & Wörn, H. (2009). Symbricator3D—A distributed simulation environment for modular robots. In M. Xie, Y. Xiong, C. Xiong, H. Liu, & Z. Hu (Eds.), ICIRA (pp. 1266–1277). Berlin: Springer-Verlag.
76. Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82.
77. Xin, X., & Kaneda, M. (2005). Analysis of the energy based control for swinging up two pendulums. IEEE Transactions on Automatic Control, 50(5), 679–684.
78. Yim, M., Shen, W., Salemi, B., Rus, D., Moll, M., Lipson, H., Klavins, E., & Chirikjian, G. (2007). Modular self-reconfigurable robot systems. IEEE Robotics and Automation Magazine, 14(1), 43–52.

Author notes

∗ Contact author.

∗∗ Artificial Life Lab of the Department of Zoology, Universitätsplatz 2, Karl-Franzens University Graz, 8010 Graz, Austria. E-mail: heiko.hamann@uni-graz.at (H.H.); thomas.schmickl@uni-graz.at (T.S.); karl.crailsheim@uni-graz.at (K.C.)