Abstract

Recent approaches in evolutionary robotics (ER) propose to generate behavioral diversity in order to evolve desired behaviors more easily. These approaches require the definition of a behavioral distance, which often includes task-specific features and hence a priori knowledge. Alternative methods, which do not explicitly force selective pressure towards diversity (SPTD) but still generate it, are known from the field of artificial life, such as in artificial ecologies (AEs). In this study, we investigate how SPTD is generated without task-specific behavioral features or other forms of a priori knowledge, and we determine how methods of generating SPTD can be transferred from the domain of AE to ER. A promising finding is that in both types of systems, namely ER systems that generate behavioral diversity and the investigated speciation model, selective pressure is generated towards unpopulated regions of search space. In a simple case study we investigate the practical implications of these findings and point to options for transferring the idea of self-organizing SPTD in AEs to the domain of ER.

1 Introduction

Methods of evolutionary computation [10, 25, 29] have been successful as optimization techniques for many years. The optimization of behaviors, which can justifiably be called generation of behaviors, has also proven to be effective in the field of evolutionary robotics (ER) [22]. However, the next step in this research, towards more complex behaviors and tasks, seems to be particularly difficult. Such a complex task could involve, for example, several successive subtasks, where earlier-learned subtasks have no utility before later subtasks are learned as well. The relative simplicity of investigated tasks in ER, especially when compared to natural systems, is, for example, discussed by Nelson et al. [19].

Evolving robot behaviors becomes even more challenging if the required a priori knowledge is minimized, which is necessary to achieve generally applicable approaches. Notably this concerns the fitness function and how elaborate it is. Nelson et al. [19] define several fitness function classes such as the behavioral fitness functions and the aggregate fitness functions. Behavioral fitness functions incorporate a lot of a priori knowledge because the fitness function “selects for behavioral features of a presupposed solution to a given task” [19], that is, how the task is accomplished. In contrast, aggregate fitness functions incorporate a very low degree of a priori knowledge because the fitness function measures what the robot has accomplished and not how it was accomplished. Behavioral fitness functions are often applied in ER because otherwise behaviors of certain complexities cannot be evolved with a reasonable commitment of resources. The main cause for these difficulties seems to be early convergence on unsatisfactory behaviors. To the best of the author's knowledge, an in-depth analysis of negative results in evolutionary robotics that investigates limiting factors is not available.

One of the common symptoms that indicate a defect in an evolutionary approach is early convergence on suboptimal solutions, that is, lack of exploration in the search algorithm. Early convergence within the classical field of optimization means convergence on local optima. A different approach is open-ended evolution, which is well known in artificial life [1, 13]. The idea of open-ended evolution is to create an artificial system that generates perpetual novelty, which seems to be the case for natural evolution. Clearly the standard approach to the search for an optimum of a static objective function cannot satisfy such needs. Hence, it is worth trying to integrate methods from open-ended evolution with the approaches of evolutionary robotics.

1.1 Generating Diversity in Artificial Evolution

Premature convergence can be avoided by generating and keeping populations that have a high diversity of different potential solutions to the problem. An option is to increase the diversity in the population, for example, by using fitness sharing as described by Sareni and Krähenbühl [26]. However, these methods require the measurement of a distance between genotypes, which is computationally intractable for common encodings in ER, such as artificial neural networks (ANNs) [3, 18]. Instead, promising recent results suggest increasing the behavioral diversity during the search or within the current population and measuring the distances between behaviors, which can usually be done efficiently. Probably the most prominent examples are novelty search [14] and the approach by Mouret and Doncieux [17, 18]. Related concepts are those of curiosity in reinforcement learning [28] and of self-organized explorativity [15].

Novelty search [14] operates without an actual objective function. Instead, selective pressure is generated towards behaviors that have not been seen before in the evolutionary run. The desired behaviors maximize their behavioral distance to all known behaviors. For illustration and later use this is shown schematically in Figure 1a. The circles represent behaviors that were found during the evolutionary run (graded colors represent different generations) and their position in search space. Behaviors that are close to known behaviors are undesirable, which is represented by low values of the fitness function F around circles. Consequently, selective pressure towards unpopulated regions in search space is generated, which is represented accordingly by steep, upward slopes around circles. Multi-objective behavior diversity (MOBD [17]) includes the behavioral distance only as a component in the multi-objective fitness function. In contrast to novelty search, it only accounts for the behaviors in the current population. This is sketched in Figure 1b.

Figure 1. 

Schematic representation of the fitness functions generated by novelty search [14] and MOBD [17]. Circles represent known behaviors; selective pressure is towards bigger values of F.
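
The difference between the two schemes can be made concrete with a minimal Python sketch. It assumes that behaviors are already encoded as numeric vectors and scores a candidate by its mean Euclidean distance to the k nearest known behaviors. The function name diversity_score, the value of k, the Euclidean metric, and the example data are illustrative assumptions and not the definitions used in [14] or [17].

```python
import math

def diversity_score(behavior, known_behaviors, k=3):
    """Mean Euclidean distance from `behavior` to its k nearest known behaviors."""
    distances = sorted(math.dist(behavior, other) for other in known_behaviors)
    nearest = distances[:k]
    return sum(nearest) / len(nearest) if nearest else 0.0

# Hypothetical 2D behavior descriptors (for example, final robot positions).
archive = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]  # behaviors seen earlier in the run
population = [(0.0, 0.1), (2.0, 2.0)]           # behaviors of the current generation

candidate = (1.0, 1.0)
# Novelty-search style: compare against every behavior seen during the run.
novelty = diversity_score(candidate, archive + population)
# MOBD style: compare against the current population only; the result would be
# one objective among several in a multi-objective fitness function.
mobd_diversity = diversity_score(candidate, population)
```

In both cases a candidate that lies far from the known behaviors receives a higher score, which is the upward slope around the circles in the schematic.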

For approaches using behavioral diversity, a distance between behaviors needs to be defined, which typically is task-specific. Hence, this is a process similar to defining an appropriate fitness function for a given task in standard ER. It is argued that even naive behavioral distance definitions can improve the evolutionary process, such as definitions based on the final position of the robot or of movable objects at the end of the evaluation [13, 17]. However, if we follow the analogy to fitness functions, it seems likely that for more complex tasks the measure of behavioral distance also needs to be more complex [11, 12]. This connection is also plausible with reference to the above-mentioned comment by Nelson et al. [19] about behavioral fitness functions that select for behavioral features. In the design of the behavioral distance measure not only what is accomplished is relevant, but also how it is accomplished. In a simple exploration task, this is not really apparent if the behavioral distance is defined on the robot's final position [17], but it addresses at least the order in which the maze is explored. In a more complex task such as the locomotion of a biped robot [14], it becomes more apparent, as the behavioral distance is defined by the trajectory of the center of mass (i.e., how the locomotion is implemented) instead of just the total offset (i.e., what is achieved). Hence we state the hypothesis that methods based on behavioral distances also run into problems similar to those seen in fitness function design, such as having to define “task-specific hand-formulated functions that contain various types of selection metrics” [19] and consequently having to involve a high degree of a priori knowledge.

1.2 Diversity in Natural Evolution

Based on our hypothesis, we assume that neither genotypic distances nor behavioral distances are able to generate sustainable diversity in ER. What could be candidate solutions? Natural evolution represents a perfect standard for the generation of diversity. Particularly we are interested in evolutionary radiation, which is an increase in taxonomic diversity. A typical example of a radiation is the Cambrian explosion. It generated a diversity that is comprehensible when looking at the corresponding phylogenetic tree, which gets literally bushy within a comparatively short period of time. Each branching corresponds to an event of speciation. Hence, to understand diversity we want to understand speciation and the underlying process that generates SPTD. Our objectives are (1) to detect how SPTD is generated in a self-organizing system (e.g., speciation) and (2) how to transfer this knowledge to ER.

Next we summarize the knowledge on how speciation operates. Coyne and Orr [4] ask: “Why are there species?” (p. 48) and they comment: “We regard it as one of the most important unanswered questions in evolutionary biology” (p. 49). Accordingly, they do not answer but only discuss the question. They point to Maynard Smith and Szathmáry [16], who consider three explanations. One, species are discrete “stable states” formed by a self-organizing system. However, this option lacks a mechanism that would explain the origin of species. Two, species fill discrete ecological niches. Three, reproductive isolation is an inevitable result of evolutionary divergence. The latter two are dependent on reproductive isolation and are not mutually exclusive [4]. A conclusive concept that connects both is that of adaptive peaks [5]. Species are adaptive peaks that are separated by adaptive valleys, which are genotypes that are unfit for survival. Whether these adaptive valleys are due to ecological or environmental effects is left open. Another aspect is whether asexual or sexual reproduction is considered. While in the case of asexual reproduction ecological niches seem to be indispensable to create a reproductive barrier, in the case of sexual reproduction the reproductive barrier might be generated without explicit ecological niches, especially because sexual selection could be effective.

To summarize, we note that evolutionary radiation and speciation are still difficult to understand, as also pointed out by Venditti et al. [34]: “Attempts to understand species-radiations […] should look to the size of the catalogue of potential causes of speciation shared by a group of closely related organisms rather than to how those causes combine” (p. 352). Due to the limited knowledge about the natural system itself, the search for inspirations for new methods to generate diversity in artificial evolution, especially ER, regrettably has to remain inconclusive for now. However, it is the starting point of the following investigations.

1.3 Prerequisites to Generate Diversity

Instead of detecting methods to generate speciation, what are prerequisites for evolutionary radiation, that is, for active speciation? A prerequisite could be the existence of a complex environment that favors the formation of adaptive peaks and that initiates specialization in the organisms. Another prerequisite could be the existence of an ecology that creates complexity in the interaction of different organisms and species. The suitability of providing a complex environment within applications of ER is limited, notably if a complex environment is not part of the desired task.

On the contrary, a considerable amount of research on the creation of AEs in order to evolve behaviors has been reported. For example, the minimal ecology, that of just two species, is investigated in studies of coevolution [7, 21], and AEs, possibly with many species, are popular in the field of artificial life [24, 27, 35]. The studies of coevolution have an emphasis on the actually evolved behaviors combined with considerations about their utility and complexity. It turns out that “the co-evolutionary process tends to [fall] into dynamical attractors in which the same solutions are adopted by both populations over and over” [20, pp. 35–36]. In AE studies the actual behaviors are of less interest; instead, the evolutionary process as a whole is usually investigated in more detail. In addition it is also unclear how these AEs would have to be designed to generate desired behaviors for a given task.

Common to both AEs and ER is their high sensitivity to parameters and the challenge of creating actively progressing evolutionary processes. However, they are interesting examples of how SPTD is generated in a self-organizing process without incorporating a priori knowledge about how a certain behavior is accomplished.

Summing up, the prerequisite to generate diversity is a minimum of complexity that provokes the emergence of adaptive peaks. The trigger could be ecological or environmental features, but also effects of sexual selection. The goal of this study is (1) to detect and measure SPTD that is generated by a self-organizing process in AEs, and (2) to determine how methods from the domain of AEs could be transferred to ER. To the knowledge of the author, published studies on behavioral diversity tend to focus either on self-organizing diversity without aiming for the solution of a given task (e.g., Ray [24]) or on explicitly imposed SPTD while searching for the solution of a particular task (e.g., Lehman and Stanley [14]). In the following we investigate a speciation model due to Woehrer et al. [36] and an extension to it in order to investigate how SPTD is generated in this self-organizing system. The results are then compared with approaches that explicitly impose SPTD. The practical implications of these findings for the approaches in evolutionary robotics are then investigated with the help of a simple case study.¹

2 A Model of Speciation

One distinguishes several types of speciation, such as allopatric speciation and sympatric speciation. Previously it was thought that speciation happens mostly allopatrically, that is, by spatial separation of populations, which then develop reproductive isolation. More recent results [4] suggest that sympatric speciation, which is speciation within the same geographic region, might be more common than expected. Here, sympatric speciation is of more interest because it is a self-organizing, evolutionary process, while allopatric speciation occurs due to external forces (arguably except for migration). In terms of the application of speciation to increase the behavioral diversity in ER, sympatric speciation is preferred, as it does not need a priori knowledge, while allopatric speciation would need an implementation of a cause.

2.1 Fixed Assortative Mating

Woehrer et al. [36] report an artificial life model of sympatric speciation based on sexual selection—in particular, assortative mating, which is a mating pattern where mating between individuals with similar genotypes or phenotypes is more likely. The model is inspired by the natural system of finches on the Galápagos Islands. Although an island setting might let allopatric speciation appear as a good and exclusive explanation, this does not seem to be the case on the Galápagos Islands [36]. Woehrer et al. [36] point to the speciality of the proposed system that combines natural selection and sexual selection acting on the same trait, which is directly related to so-called magic traits [30]. Here we reproduce their results, report an extension of the model, and perform additional measurements in simulations.

The artificial system models a bird population of dynamic size. A bird is modeled by age, beak size, energy level, and gender. The birds have to forage (implemented as random search) for seeds to survive. The seeds are modeled by energy and by uniformly distributed size, and they are distributed in discrete space of size 100 units × 100 units (see Table 1 for the parameter settings used). Selective pressure is imposed by the limited resource of seeds. Initially in the dry season a number of seeds are placed in the world; the number decreases consecutively over a period of 30 or 43 simulated days (depending on which setting is used) as the birds forage from it. The birds' search for seeds is limited by their beak size s, because they can only feed on seeds of size [s − 1, s + 1]. The search costs energy on each day, and eaten seeds add to the bird's energy. If a bird runs out of energy during the dry season, it dies. Those that survive may attempt to reproduce. Reproduction is based on sexual selection and assortative mating (with random mating, no speciation was observed [36]). Females with beak size s_f select only mates with beak size s_m within the female's beak-size interval, s_m ∈ [s_f − Δ, s_f + Δ], for an assortative range fixed to Δ = 0.5. The offspring has a beak size averaged over its parents' plus Gaussian noise and random gender. For all remaining details see [36] and Table 1.
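
The feeding and mating rules of this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the implementation of [36]: the function names are hypothetical, the Table 1 entry "variance of offspring prop." is read as the variance of the Gaussian noise, and clamping the offspring to the beak-size interval [1, 10] is our own choice.

```python
import random

DELTA = 0.5                 # fixed assortative range (Table 1, FAM)
OFFSPRING_VARIANCE = 0.03   # read here as the noise variance (assumption)
BEAK_MIN, BEAK_MAX = 1.0, 10.0

def can_eat(beak_size, seed_size):
    """A bird with beak size s feeds only on seeds of size [s - 1, s + 1]."""
    return beak_size - 1.0 <= seed_size <= beak_size + 1.0

def acceptable_mates(female_beak, male_beaks, delta=DELTA):
    """Males whose beak size s_m lies in [s_f - delta, s_f + delta]."""
    return [s_m for s_m in male_beaks if abs(s_m - female_beak) <= delta]

def offspring_beak(female_beak, male_beak, rng, variance=OFFSPRING_VARIANCE):
    """Parental average plus Gaussian noise."""
    mean = 0.5 * (female_beak + male_beak)
    child = rng.gauss(mean, variance ** 0.5)
    # Clamping to the allowed beak-size interval is an assumption of this sketch.
    return min(BEAK_MAX, max(BEAK_MIN, child))

rng = random.Random(1)
mates = acceptable_mates(5.0, [4.4, 4.6, 5.5, 5.6])
child = offspring_beak(5.0, rng.choice(mates), rng)
```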

Table 1. 

Parameter settings for fixed assortative mating (FAM) and for evolved assortative mating (EAM); some parameters differ from [36].

Parameter                      FAM        EAM
Max. age                       4 years    4 years
Max. energy                    2.0        2.0
Max. energy per seed           2.0        2.0
Search energy cost             0.1        0.1
Max. num. male matings
Max. num. generations          1000       1000
Initial number of birds        400        150
Initial beak size mean         5.5        5.5
Initial beak size variance     0.5        3.5
Beak size interval             [1, 10]    [1, 10]
Dry season length              43 days    30 days
Initial number of seeds        5000       6300
Feeding square size            10         10
World size                     100        100
Variance of offspring prop.    0.03       0.01
Assortative range Δ            0.5        [0.01, 10]
Seed width W                   n.a.       [0.01, 10]

A typical run is shown in Figure 2, which is a plot of all beak sizes that occur in the population over 375 generations. The resemblance to a phylogenetic tree is obvious. Also, the drift of species, branching into two species, and the extinction of species can be noticed. Hence the model of Woehrer et al. [36] is a simple model of self-organized speciation and can be used as an easy-to-handle analogy to the studies on behavioral diversity in ER. The interval of allowed beak sizes s ∈ [1, 10] is the equivalent of the behavior space, and a bird's beak size would be the 1D equivalent of the behavior defined by an ANN. The extreme difference between high-dimensional ANNs and the simplistic 1D beak size interval is not a limiting factor of this analogy, because the speciation model possesses the one qualitative feature that is relevant for this study, namely self-organized generation of diversity. The assortative mating corresponds to allowing recombination only for ANNs that share a magic trait, which is defined by sexual selection and could be a behavioral feature. In addition, this way the emergence of speciation relies crucially on a predefined parameter (the assortative range Δ). In order to avoid such a predefined measure for the above-mentioned reasons, we can allow the evolutionary algorithm to vary features of sexual selection. The perfect solution would be to evolve the full process of sexual selection, but that is beyond the scope of this article. Instead we restrict the following investigations to the evolution of the allowed difference between females' and males' beak sizes, defined by the assortative range Δ in this particular case of assortative mating. While this solves the problem of having predefined parameters, it still corresponds to a predefined process of sexual selection. That way we are able to investigate whether the evolutionary process can self-organize towards a higher degree of diversity without being forced to do so by a parameter setting.

Figure 2. 

Beak sizes s over generations t for fixed assortative mating; for parameters see Table 1.

2.2 Evolved Assortative Mating

It turns out that the model is very sensitive to settings of the assortative range Δ; that is typical for such systems, as pointed out above (Section 1.3). If Δ is set too low, species extend over a narrow interval of beak sizes, feed from a small set of seeds that they are able to eat, and become extinct often (data not shown). If Δ is set too high, no speciation is observed, because one big connected component of birds in beak-size space emerges. Still we proceed and allow the assortative range Δ to be evolved. The extension of the above model is described in the following. The parameter Δ that defines sexual selection by setting the beak-size range is defined now as an individual bird property. It is passed on by an average over the parents plus Gaussian noise (see Table 1 for parameters). Furthermore, we introduce an evolved parameter of each individual bird, called the seed width W, to increase the attractiveness of being a specialist. In addition to the beak size, it also determines the interval of seed sizes a bird is able to feed on ([s − W, s + W]; priority is with the more restrictive interval), and the energy of a seed is scaled by 1/W² (for W < 1 a seed's energy is increased quadratically in 1/W). The system also shows speciation without this additional feature of the seed width W, but its inclusion stimulates speciation (data not shown).
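
The extension can be summarized in a short Python sketch. The function names are hypothetical, and two points are interpretations rather than statements from the model description: "priority is with the more restrictive interval" is read as intersecting [s − 1, s + 1] with [s − W, s + W], and evolved values are clamped to the range [0.01, 10] given in Table 1.

```python
import random

def feeding_interval(beak_size, seed_width):
    """Seed sizes a bird can eat; the narrower of the two intervals applies."""
    half_width = min(1.0, seed_width)
    return beak_size - half_width, beak_size + half_width

def seed_energy(base_energy, seed_width):
    """Energy gained from a seed, scaled by 1/W^2 (specialists with W < 1 gain more)."""
    return base_energy / seed_width ** 2

def inherit_trait(mother_value, father_value, rng, variance=0.01, bounds=(0.01, 10.0)):
    """Evolved trait (assortative range or seed width): parental average plus noise."""
    child = rng.gauss(0.5 * (mother_value + father_value), variance ** 0.5)
    # Clamping to the interval from Table 1 is an assumption of this sketch.
    return min(bounds[1], max(bounds[0], child))

rng = random.Random(7)
child_delta = inherit_trait(0.4, 0.6, rng)
specialist_gain = seed_energy(1.0, 0.5)   # a seed is worth four times as much
```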

For the following presentation and analysis of our results we define a technical concept of species in this simple model. We interpret the distribution of a population's beak sizes as a graph wherein each bird's beak size represents a node and two such nodes are connected to each other if they are within each other's individual assortative range of sexual selection defined by Δ. A species is defined by the graph-theoretic concept of connected components. A connected component is a subset of nodes, and for each pair of such nodes there is a path connecting them. With this definition we are able to implement an automatic classifier that determines where and how many species exist in a given configuration.
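
Such a classifier can be sketched with a union-find structure over all pairs of birds. The function name find_species is hypothetical, and the edge rule (the beak-size gap must not exceed either bird's individual Δ) is our reading of "within each other's individual assortative range". Because Δ differs between individuals, birds that are neighbors in beak-size space are not necessarily linked, so the sketch checks all pairs instead of scanning a sorted list.

```python
def find_species(beak_sizes, ranges):
    """Group birds into species, i.e., connected components of the mating graph."""
    n = len(beak_sizes)
    parent = list(range(n))

    def find(i):
        # Root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            gap = abs(beak_sizes[i] - beak_sizes[j])
            # Edge only if each bird lies within the other's assortative range.
            if gap <= ranges[i] and gap <= ranges[j]:
                parent[find(i)] = find(j)

    species = {}
    for i in range(n):
        species.setdefault(find(i), []).append(beak_sizes[i])
    return sorted(sorted(members) for members in species.values())

# Two clusters in beak-size space, all birds with the same range of 0.5.
clusters = find_species([2.0, 2.4, 2.8, 7.0, 7.3], [0.5] * 5)
```

The pairwise check costs quadratic time in the population size, which is unproblematic for populations of a few hundred birds.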

3 Speciation Model Results

In the following we investigate whether speciation is observed, and we investigate the distribution of beak sizes over time to find what is beneficial for speciation in this extended model, the distribution of branch lengths, and the dynamics of species.

3.1 Distribution of Beak Sizes

With this extended model, different random initializations yield results (for parameters see Table 1) that fall into three classes: showing speciation (see Figure 3a), intermediate (see Figure 3b), and not showing speciation (see Figure 3c). Following our connected-component definition, Figure 3a shows clearly separated species within limited beak-size intervals. In Figure 3b species are not clearly separated at all times. For example, at t ≈ 600 one species spans almost the whole beak-size interval. In Figure 3c, a single species covers the whole interval. Hence this system does not reliably self-organize towards diversity, at least for the tested parameters.

Figure 3. 

Beak sizes s over generations t for evolved assortative mating; for parameters see Table 1. These three examples differ only in their random initialization.

For the following investigations we classify the occurring configurations. It turns out that four configurations are frequent: species only in the left part of the beak-size interval (s < 4.5, called left; frequency: 3.24%), species only in the right part (s > 6.5, called right; frequency: 3.14%), species distributed over the whole interval (called all-over; frequency: 88.37%), and species in the two outer parts but not in the middle (called symmetrical; frequency: 5.24%). Concerning the evolved assortative range Δ, the all-over configuration is distinguishable from the three other configurations. In Figure 4a the distributions of all occurring assortative ranges over a number of evolutionary runs are compared. The mean for all-over configurations is 1.9 and bigger than those of the others (about 1.3). With a bigger assortative range a species spans big intervals more easily. Consequently, big assortative ranges counter diversity in speciation. The populations that spread over the whole interval are bigger than those showing diversity (mean of about 200 birds, compared to about 100), because they fully exploit the energy provided by seeds. Consequently they are less prone to fluctuations and have a smaller risk of extinction. At the same time they make sure to exploit seeds of all sizes. Hence the low-diversity solution actually seems to be the evolutionarily more robust approach, which raises the question of how speciation could be further stimulated (a possible target of an investigation beyond the scope of this article would be the tradeoff between a generalist's advantages and costs due to the seed width W). In turn it is possible to force the system into speciation by forcing bi- or multimodal distributions of seed sizes [36]. However, this is tweaking the environment, which is not a good option for our application in ER.

Figure 4. 

The distribution of the assortative range Δ for different configurations, and the distribution of branch lengths in the phylogenetic tree with fitted Weibull distribution.

3.2 Branch Length Distribution

Generally, populations with big assortative ranges seem to be more stable, and there is also a tendency to spread over the whole interval with increasing time. We are able to support this claim by investigating the distribution of branch lengths in the phylogenetic trees. The branch length is the time period between split-ups of species or their extinction. Based on an automatic check for connected components, species and the corresponding branch lengths are determined in the implementation of the model. The distribution of branch lengths based on independent evolutionary runs (4.4 × 10⁵ samples) is shown in Figure 4b (squares). In the analysis of this data we follow Venditti et al. [34], who discuss and interpret different branch length distributions of phylogenetic trees. They apply their methods to natural systems, but we found that they can also be applied to this artificial system. The best fit we found is the Weibull distribution,² which Venditti et al. [34] interpret in the following way: “the Weibull density can accommodate the probability of speciation changing according to the amount of divergence from the ancestral species. This model will fit the data if, for example, species are either more or less likely to speciate the older they get” (p. 349). This supports our finding that old species tend to speciate less, as they tend to span the whole interval.
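
For reference, the standard two-parameter Weibull density and its hazard rate are given below, with branch length t, shape parameter k, and scale parameter λ. The symbols are our own choice and the fitted values are not reproduced here. The hazard rate h(t), read here as the instantaneous probability of a branch ending at age t, decreases with t for k < 1, is constant for k = 1 (the exponential case), and increases for k > 1. A shape parameter below one would thus correspond to old species being less likely to speciate.

```latex
f(t; k, \lambda) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1}
                   \exp\!\left[-\left(\frac{t}{\lambda}\right)^{k}\right],
\quad t \ge 0,
\qquad
h(t) = \frac{f(t; k, \lambda)}{1 - F(t; k, \lambda)}
     = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1}.
```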

3.3 Dynamics of Speciation

Despite these shortcomings in the sustainability of the evolution of species, we are able to investigate the dynamics of speciation in this system. Our aim is to detect and measure how SPTD is generated without explicitly forcing it. The evolution of species in space and time based on the above connected-component definition can be interpreted as a discrete birth-death process combined with drifting motion. For the above-defined four classes of configurations the birth and death rates depending on the beak size were measured; see Figure 5a and b (qualitative, no errors shown). The birth rate is bigger by about an order of magnitude because in our implementation the merger of species was not classified as death; only actual extinction was classified as death. Interestingly, the birth rate for the all-over configuration is almost homogeneous, while the other configurations have a dip at s ≈ 3 and s ≈ 8. This is most likely because species in the all-over configuration are on average almost evenly distributed, while in the other configurations species are more likely to be positioned at s ≈ 3 and/or s ≈ 8. Once this niche is covered, a birth at the same position is not possible. The death rates of all configurations have peaks at the bounds, which is explained by the smaller number of available seeds once a bird's feeding interval extends to seed sizes that do not occur (<1 or >10). Accordingly, beak sizes at a distance from the bounds support survival for all configurations.

Figure 5. 

Birth and death rates of species over beak size for different configurations.

In the following we want to measure the dynamics of species, that is, the average movement of species in beak-size space in certain configurations. A measurement of how the species drift for the four configurations is shown in Figure 6a. Δx gives the average displacement of a species from one generation to the next for 1.9 × 10⁷ samples. Positive values Δx > 0 describe the drift of a species towards bigger beak sizes, and negative values Δx < 0 describe drift towards smaller beak sizes. For the configurations left, right, and all-over the average motion of species indicates spreading, and species keep moving towards the bounds even when already approaching them, which means they move into regions of high death rate (Figure 5b). The configuration symmetrical is different because it has a stabilizing effect at s ≈ 4 and s ≈ 6.5, which are maxima. This is, however, a direct effect of the classification and due to situations when an all-over configuration turns into a symmetrical configuration. Still, it is valid data within our classification scheme.

Figure 6. 

Averaged movement of species and implied potentials over beak size for different configurations.

To draw a direct connection to applications in ER we determine the potentials that are implied by the average movement of species (for example, similar to gravitational potentials), which here are given merely by integration over the average movement. Figure 6b shows the four potentials of the four configurations (normalized to similar scales). These potentials are important findings for this study, because they are emergent fitness functions with selective pressure towards bigger values in the same way as in the above schematic representations of the fitness functions for novelty search and MOBD (see Figure 1). Here also, currently populated regions in beak-size space are less desirable, and there is pressure towards unpopulated regions. For example, the potential for the left configuration has a minimum at s = 3, which corresponds to the typically populated position in this configuration as determined by low birth and death rates around s = 3 (Figure 5a and b) and an average motion of Δx = 0 (Figure 6a). These potentials are the confirmation that this self-organizing system generates SPTD. While, for example, in novelty search that pressure is explicitly enforced by pushing towards behavioral diversity, here the selective pressure is a feature of the ecological system. Unpopulated regions in beak-size space correspond to big resources of energy in the form of seeds no one forages for. Birds that manage to push into these regions compete with few fellows, gather plenty of energy, and increase their fitness for survival. This analogy shows, on the one hand, that approaches based on behavioral distance should not be considered just as engineered abstractions, but rather as bio-inspired approaches that have a direct connection to ecological features of speciation. On the other hand, it shows an option for generating a self-organizing SPTD, as discussed in the following.
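
The step from measured drift to an implied potential can be sketched as follows: displacements are averaged per beak-size bin, and the binned averages are integrated along the beak-size axis. The function names, the bin layout, the trapezoidal rule, and the synthetic drift data are assumptions of this sketch; the normalization applied in Figure 6b is omitted.

```python
def mean_drift_per_bin(positions, displacements, lo=1.0, hi=10.0, n_bins=9):
    """Average displacement per generation of species, binned by beak size."""
    width = (hi - lo) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for s, dx in zip(positions, displacements):
        b = min(n_bins - 1, max(0, int((s - lo) / width)))
        sums[b] += dx
        counts[b] += 1
    centers = [lo + (b + 0.5) * width for b in range(n_bins)]
    # Empty bins are reported as zero drift (a simplification).
    means = [sums[b] / counts[b] if counts[b] else 0.0 for b in range(n_bins)]
    return centers, means

def implied_potential(centers, mean_drift):
    """Cumulative trapezoidal integral of the mean drift over beak size."""
    potential = [0.0]
    for i in range(1, len(centers)):
        step = centers[i] - centers[i - 1]
        potential.append(potential[-1] + 0.5 * (mean_drift[i] + mean_drift[i - 1]) * step)
    return potential

# Synthetic example: species drift away from a populated position at s = 3.
centers = [float(s) for s in range(1, 11)]
drift = [0.1 * (s - 3.0) for s in centers]
potential = implied_potential(centers, drift)
```

With this sign convention the populated position, where the drift changes from negative to positive, appears as a minimum of the potential, and species move uphill towards bigger values.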

4 Practical Implications and a Simple Case Study

An application of the above-presented insights to a problem of evolutionary robotics raises a few questions about design decisions. First, the above beak-size space is now the behavior space of our robots, which is typically high-dimensional. Second, we need to implement a mechanism that allows for self-organized SPTD in analogy to the seeds of different size in the above example. Third, a mechanism of inheritance needs to be implemented that creates a locality property, that is, the offspring should occupy approximately the same area of the behavior space as its parents.

A complete description of a behavior is a list of configurations Ct for each time step t, that is, its trajectory in configuration space. A configuration Ct contains all sensor inputs, all actuator outputs, the robot's complete internal state, the robot's position and orientation, and so on. For an implementation a discrete approximation Ĉt of Ct is necessary. Then a particular behavior as generated in an evaluation of a given ANN can be described as a histogram that counts configurations Ct and ignores the chronology. However, the size of the configuration space might be too big for an actual implementation, due to limited memory. A solution is to ignore certain features of the configurations, such as the sensor and actuator values. Selecting the appropriate features might be challenging, and the effectiveness of the approach might depend on certain features. These dependences are likely to be task-specific, creating challenges like those seen in the application of novelty search [11, 12]. The development of appropriate selection methods and of task classifications is certainly a challenge for future work.
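A reduced version of this histogram description might look as follows. The sketch assumes that a configuration is reduced to position and heading, that coordinates lie in [0, arena], and that equal-width bins are used; the names `discretize` and `behavior_histogram` are hypothetical.

```python
import math
from collections import Counter


def discretize(x, y, heading, bins=100, arena=1.0):
    """Map a continuous pose to a discrete configuration (ix, iy, ih)."""
    def to_bin(value, span):
        # Clamp so that value == span still falls into the last bin.
        return min(int(value / span * bins), bins - 1)
    return (to_bin(x, arena), to_bin(y, arena),
            to_bin(heading % (2 * math.pi), 2 * math.pi))


def behavior_histogram(trajectory, bins=100, arena=1.0):
    """Count visited discrete configurations, ignoring their chronology."""
    return Counter(discretize(x, y, h, bins, arena) for x, y, h in trajectory)


# Hypothetical three-step trajectory of (x, y, heading) tuples.
trajectory = [(0.105, 0.25, 0.0), (0.105, 0.25, 0.01), (0.5, 0.5, math.pi)]
hist = behavior_histogram(trajectory, bins=10)
```

Storing only the visited configurations in a `Counter` keeps memory proportional to the number of distinct configurations actually reached rather than to the size of the full grid.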

Our fitness function should not depend on task-specific features and should not introduce a bias as to how the task is solved by the robot. Instead, in similarity to the distribution of seeds in the above model, we can keep track of the configurations that are visited by a particular behavior. The fitness function then rewards behaviors that visit neglected configurations.

Guaranteeing an appropriate degree of locality in the evolutionary operators might be challenging. The recombination of ANNs is difficult because merging two networks with possibly different topologies is ambiguous [23, 31]. Although this article focuses on speciation based on sexual reproduction, the implementation of a recombination operator might not be necessary for generating SPTD. Following the above considerations, using asexual reproduction exclusively means that ecological niches, which create reproductive barriers, seem indispensable. However, the minimal requirements for our approach are locality (an offspring covers about the same area in behavior space as its parent) and a pressure to abandon populated areas of the behavior space. These two requirements are met by a mutation operator combined with a fitness function that rewards visiting previously neglected configurations.

4.1 Case Study

To show and discuss the practical implications of the above findings, we investigate an example scenario. A robot is positioned in a simple maze and should explore as much area as possible within a limited time (see Figure 7). The robot has three proximity sensors on the front, has a differential drive, and is controlled by a standard ANN with three input neurons, three neurons in the hidden layer, and two output neurons. The weights of the ANN are mutated with a probability of 0.05. We use simple proportionate selection with elitism set to one, a population size of 30, and 400 generations per evolutionary run.
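One generation of such an evolutionary algorithm can be sketched as below. The source specifies only the mutation probability, proportionate selection, elitism of one, and the population size; the per-weight Gaussian perturbation, its width `sigma`, the genome length, and the name `next_generation` are assumptions made for illustration.

```python
import random


def next_generation(population, fitnesses, rng, mutation_prob=0.05, sigma=0.1):
    """One generation: elitism of one, proportionate selection, mutation."""
    best = max(range(len(population)), key=lambda i: fitnesses[i])
    offspring = [list(population[best])]      # the elite is copied unchanged
    total = sum(fitnesses)
    # Fall back to uniform selection if all fitness values are zero.
    weights = fitnesses if total > 0 else [1.0] * len(population)
    while len(offspring) < len(population):
        parent = rng.choices(population, weights=weights, k=1)[0]
        # Assumption: each weight mutates independently with Gaussian noise.
        child = [w + rng.gauss(0.0, sigma) if rng.random() < mutation_prob else w
                 for w in parent]
        offspring.append(child)
    return offspring


rng = random.Random(1)
# Hypothetical genomes: 30 individuals with 15 weights each.
population = [[rng.uniform(-1.0, 1.0) for _ in range(15)] for _ in range(30)]
fitnesses = [float(i) for i in range(30)]
new_population = next_generation(population, fitnesses, rng)
```

Because offspring are produced by mutation only, each child stays close to its parent in weight space, which is one simple way to approximate the locality property discussed above.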

Figure 7. 

Setting of the exploration task with a typical trajectory. The robot is initially positioned randomly within the lower left room with a random heading.

A standard approach to define a fitness function for this task is to define a grid over the whole area and to give a reward proportional to the number of grid cells that were visited during an evaluation. This was done with a 100 × 100 grid, and the best fitness of 100 independent evolutionary runs is shown in Figure 8a, boxplot a. We are interested in how much of the configuration space was explored during evolution. Here we define a simplified configuration space that ignores the chronology, sensor inputs, and actuator outputs. Only discrete robot positions along with discrete headings are considered. This was implemented by a 100 × 100 × 100 grid, and the percentage of visited configurations for the standard approach is shown in Figure 8b, boxplot a.

Figure 8. 

Performance of best evaluation and percentage of explored configuration space for different settings. See Table 2 and the text for the settings of experiments a–h (100 evolutionary runs for each setting).

Next we investigate a number of options inspired by the above findings (see Table 2 for an overview). The fitness function can reward the exploration of the area (i.e., be task-specific; see the second column in Table 2) and/or the exploration of the configuration space (i.e., be task-independent; see the third column). Reward for exploring the configuration space is the main implication of the above investigations of the speciation model, because it generates selective pressure towards unpopulated regions of configuration space and hence SPTD. In the above-discussed AE, the supply of seeds is replenished periodically within an evolutionary run. In analogy we can choose to reward the exploration of the area on an individual level, that is, within each evaluation, or on a population level, that is, only visits to areas that have not yet been visited by other individuals within the current generation (sixth column in Table 2). The exploration of configuration space, in turn, is rewarded on the level of the full evolutionary run, that is, only true innovations are rewarded (configurations that have not been seen before during the whole evolutionary run). For the settings that use both fitness components (b, c, e, g) the rewards for exploring the configuration space are weighted 100 times higher than those for the exploration of area.
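The combination of the two fitness components can be illustrated with a small bookkeeping sketch. The factor 100 and the three reward levels (individual, generation, whole run) follow the text; the function name `evaluate_fitness`, its parameters, and the use of sets are hypothetical choices.

```python
def evaluate_fitness(cells, configs, seen_configs, generation_cells=None,
                     area_weight=1.0, config_weight=100.0):
    """Score one evaluation and update the shared bookkeeping sets.

    cells            -- area grid cells visited in this evaluation
    configs          -- discrete configurations visited in this evaluation
    seen_configs     -- configurations seen so far in the whole run (updated)
    generation_cells -- cells visited earlier in this generation (updated),
                        or None to reward area on the individual level
    """
    cells, configs = set(cells), set(configs)
    if generation_cells is None:
        area_reward = len(cells)                     # individual level
    else:
        area_reward = len(cells - generation_cells)  # population level
        generation_cells |= cells
    innovations = configs - seen_configs             # true innovations only
    seen_configs |= configs
    return area_weight * area_reward + config_weight * len(innovations)


seen, gen = set(), set()
f1 = evaluate_fitness({(0, 0), (0, 1)}, {(0, 0, 0), (0, 1, 0)}, seen, gen)
f2 = evaluate_fitness({(0, 1), (0, 2)}, {(0, 1, 0), (0, 2, 0)}, seen, gen)
f3 = evaluate_fitness({(0, 1), (0, 2)}, {(0, 1, 0)}, seen)
```

In this sketch `generation_cells` would be reset at the start of every generation, whereas `seen_configs` persists for the whole evolutionary run. Setting `area_weight=0.0` gives a purely task-independent variant.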

Table 2. 

Tested combinations of fitness functions, evaluation times, and configuration spaces.

Setting  Fitness, area  Fitness, config. space  Dynamic evaluation times  Config. space incl. robot headings  Visited area per evaluation
a        ✓              ×                       ×                         ✓                                   ✓
b        ✓              ✓                       ×                         ×                                   ×
c        ✓              ✓                       ×                         ✓                                   ×
d        ×              ✓                       ×                         ✓                                   ×
e        ✓              ✓                       ✓                         ✓                                   ×
f        ×              ✓                       ✓                         ✓                                   ×
g        ✓              ✓                       ×                         ✓                                   ✓
h        ×              ✓                       ×                         ✓                                   ✓

Rewarding the exploration of the area on the population level gives a big advantage to those controllers that are evaluated early on, because the whole area is unexplored at the beginning of the evaluations for a new generation. An option could be to execute the controllers of a population in a quasi-parallel manner, similar to the foraging procedure in the above AE. However, here we implement the option of a competitive procedure to determine the order of the evaluations (fourth column in Table 2). In addition to the weights of the ANN, each individual picks a time ticket T ∈ [0, 1] (an inherited and mutated property of each individual). The individuals are evaluated in the order of their time tickets (smallest first); however, the evaluation time scales with the time ticket as 1/T². Hence, there is a tradeoff between being allowed to start early and having less time to explore. To allow for a fair comparison, the overall number of simulated time steps per evolutionary run is kept constant.

The last option that we investigate is to reduce the configuration space even further by ignoring the robots' headings (fifth column in Table 2). Defining configurations based on robot positions only means that the reward for explored area and the reward for new configurations operate within the same space. However, the explored area is rewarded on the time scale of generations, and the exploration of the configuration space is rewarded on the time scale of the whole evolutionary run.

The percentage of covered area (PCA) by the best controller and the percentage of the covered configuration space (PCC) by the whole evolutionary run are averaged over 100 independent evolutionary runs for each of the eight tested configurations a–h, as shown in Figure 8. The PCA of the standard approach (option a) and the approach with a limited configuration space (option b) show a large variance. The PCC of b is high (mean of 80.8%) because the much smaller configuration space is easily explored. However, the mean PCA of 24.6% is much lower. Options c and d (constant evaluation times) achieve a good PCA, and also their PCC is significantly higher than that of the standard approach. Options e and f (dynamic evaluation times based on time tickets) achieve extremely low PCAs, possibly because most of the evaluations are considerably shorter than the maximally allowed evaluation time. However, their PCCs are significantly higher than the others (except for option b with reduced configuration space), with extreme outliers of up to 52.5%. Options g and h represent a hybrid approach. The visited area is rewarded on an individual level as in the standard approach a, but also innovations in the exploration of the configuration space are rewarded. Options g and h achieve a PCA that is significantly higher than all the others, while their PCC is similar to that of options c and d (constant evaluation times, visited area rewarded on population level).

Rewarding innovations in terms of explorations of the configuration space seems advantageous for the investigated task. However, including a task-specific component in the fitness function improves the results even more.

Due to the high dimensionality of the behavior and configuration spaces, it is difficult to estimate whether rewarding the exploration of the configuration space generates a higher diversity or even speciation. Simple diagrams like those for the speciation model and all the beak sizes of a population cannot be drawn. At least a relative and pairwise comparison based on a simple distance measure is feasible. The distance measure is defined as the Hamming distance between the grid representations of the covered configuration space of the considered individuals. A histogram of the distances between all members of the population over 100 evolutionary runs of option c is shown in Figure 9. The distribution is bimodal, which might indicate a certain degree of speciation; however, it might also be a trivial separation between ineffective (e.g., circling) controllers and explorative controllers. The development and application of more sophisticated methods are left for future work.
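The pairwise comparison can be sketched as follows, assuming that the covered configuration space of each individual is stored as a set of visited grid cells, so that the Hamming distance between two binary grids equals the size of the symmetric difference of the two sets. The function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def hamming_distance(covered_a, covered_b):
    """Hamming distance between two binary grids stored as sets of cells."""
    return len(set(covered_a) ^ set(covered_b))


def distance_histogram(population_coverage):
    """Histogram of pairwise distances between all members of a population."""
    return Counter(hamming_distance(a, b)
                   for a, b in combinations(population_coverage, 2))


# Toy population (assumption): three individuals, cells given as integers.
coverage = [{1, 2, 3}, {2, 3, 4}, {1, 2, 3}]
hist = distance_histogram(coverage)
```

The set representation avoids allocating the full 100 × 100 × 100 grid per individual, since only visited cells contribute to the distance.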

Figure 9. 

Histogram of distances between the explored regions of configuration space for all individuals (accumulated over 100 evolutionary runs).

5 Discussion and Conclusion

In this article we have detected self-organizing generation of SPTD in an AE and we have motivated the need for methods that generate diversity, particularly behavioral diversity in the field of ER. Standard methods to increase the diversity in a population, but also recent methods based on behavioral distances, include a considerable amount of a priori knowledge, because they (tend to) include behavioral features in the definition of the behavioral distance that measure how the robot accomplishes a task. Mouret and Doncieux [18] discuss the analogy between fitness function design and behavioral distance measure design:

More importantly, novelty search critically depends on a good behavior characterization to create a gradient. Researchers in ER used to craft fitness function to create a perfect fitness gradient; novelty search users have to craft the behavior distance to create a similar gradient. This last option may be easier for some problems but eventually some distances will be hard to define. [18, p. 123]

Hence, even with methods not relying on fitness functions, the former problem of including a priori knowledge about the task persists. Mouret and Doncieux [18] discuss what the ideal setting for ER might look like:

In an ideal ER setup, ER researchers would only define a high-level fitness function and let the generic evolutionary process do the rest. This goal could be achieved with a generic behavioral distance function that could be used with most ER tasks while still improving the evolutionary process. [18, p. 100]

Whether such a generic measure of behavioral distances can be found is unknown. Instead we study diversity in natural evolution that is generated in a self-organizing process, and our focus is on diversity by speciation. Due to limited biological knowledge about why species exist, it is also difficult to determine prerequisites for the generation of diversity in artificial systems. Therefore we investigated the simple speciation model by Woehrer et al. [36] and extended it to allow the evolution of features that define the process of sexual selection based on assortative mating. Our findings indicate that self-organizing speciation is achievable but the system is sensitive to parameter settings, as is known from other AEs.

It seems difficult to provoke evolutionary dynamics that favor diversity over uniform solutions without imposing diversity by predefined environmental features (e.g., multimodal distributions of seeds; see [36]). In fact, it seems questionable whether sustainable generation of diversity is possible without stimulating influences from a dynamic environment or a complex ecology with intensive interspecies interaction. A beneficial finding is the analogy between methods based on behavioral distances in ER and the investigated self-organizing ecology in terms of the selective pressure that is generated. In both systems selective pressure is generated towards unpopulated regions of the search space (cf. Figures 1 and 6b). Hence we conclude that the two systems are related to each other and that there might be a way of transferring the methods of generating selective pressure in AEs to ER.

In the speciation model this selective pressure is an ecological feature because unpopulated regions of search space hold plenty of seeds, which increase the fitness for survival once they are foraged. While this is easily implemented in this speciation model by defining a beak-size search space, it is unknown how distributing seeds in search space can be transferred by analogy to the behavior space of ER. This raises an important research question: How can AEs be defined in the context of ER so that they generate SPTD without explicitly addressing particular task-specific behavioral features and without including a priori knowledge? In principle, this can be done with a data structure that covers the full behavior space and keeps track of which regions of this abstract space have been visited (following the analogy: regions where most of the seeds have been eaten up). However, for a typical application in ER this approach is intractable, and only some considerable parts of the behavior and configuration space can be represented, as presented in the case study of Section 4. The selection of certain features of behavior space introduces task-specific considerations. Other options to solve this dimensionality reduction problem using heuristics possibly face similar difficulties, as does, for example, function approximation in reinforcement learning [33]. Presumably, any method that maps behaviors from the actual behavior space into a smaller feature space will suffer from being either task-specific or of limited benefit, although very simple mappings were shown to be beneficial in simple tasks [13, 17]. In natural systems, the statistics about the frequencies of behaviors seem to be embedded into the environment, similar to the concept of stigmergy in swarm intelligence [8]. A candidate solution would be to evolve behaviors in an embodied system that allows for embedding behavioral statistics in the environment. Fortunately, an embodied approach is feasible in ER [2, 6, 32].

As reported in the case study of Section 4, simple approaches towards a self-organizing SPTD can be implemented that are efficient due to a reward to controllers that are innovative in visiting previously neglected configurations. However, the case study also shows that hybrid approaches, which combine rewards for innovations with task-specific fitness components similarly to approaches of multi-objective evolution [17], are more successful. A tradeoff between the application of a priori knowledge and novelty-based methods seems unavoidable.

Furthermore, we have reported measurements (Figures 5 and 6) that allow for modeling speciation as a discrete birth-death process combined with a spatial feature determined by drifting motion towards unpopulated regions. That way speciation truly is linked “to rare stochastic events that cause reproductive isolation” [34, p. 349]. In addition we have shown that the analysis of branch length distributions by Venditti et al. [34] is also applicable in artificial systems and might prove to be instrumental in classifying artificial systems of speciation. In future work we plan to continue investigations of how to integrate AEs into behavior space that is not task-specific and that generates SPTD in complex tasks for ER. In addition it might be desirable to investigate also models of speciation that themselves evolve features of sexual selection that generate diversity.

Notes

1 

This is an extended version of reference [9]; in particular, the case study presented in Section 4 was added.

2 

f(x) = (ca/b) x^(a−1) exp(−(x/b)^a), with a = 0.32407, b = 0.064919, c = 29827.

References

1. Bedau, M. A., Snyder, E., & Packard, N. H. (1998). A classification of long-term evolutionary dynamics. In Artificial life VI (pp. 228–237). Cambridge, MA: MIT Press.
2. Bredeche, N., Montanier, J.-M., Liu, W., & Winfield, A. F. (2012). Environment-driven distributed evolutionary adaptation in a population of autonomous robotic agents. Mathematical and Computer Modelling of Dynamical Systems, 18(1), 101–129.
3. Bunke, H., & Shearer, K. (1998). A graph distance metric based on the maximal common subgraph. Pattern Recognition Letters, 19(3), 255–259.
4. Coyne, J. A., & Orr, H. A. (2004). Speciation. Sunderland, MA: Sinauer Associates.
5. Dobzhansky, T. (1951). Genetics and the origin of species. New York: Columbia University Press.
6. Eiben, Á. E., Haasdijk, E., & Bredeche, N. (2010). Embodied, on-line, on-board evolution for autonomous robotics. In P. Levi & S. Kernbach (Eds.), Symbiotic multi-robot organisms: Reliability, adaptability, evolution (pp. 362–384). London: Springer.
7. Floreano, D., & Nolfi, S. (1997). Adaptive behavior in competing co-evolving species. In Proceedings of the Fourth European Conference on Artificial Life (pp. 378–387). Cambridge, MA: MIT Press.
8. Grassé, P.-P. (1959). La reconstruction du nid et les coordinations interindividuelles chez Bellicositermes natalensis et Cubitermes sp. La théorie de la stigmergie: Essai d'interprétation du comportement des termites constructeurs. Insectes Sociaux, 6, 41–83.
9. Hamann, H. (2013). Speciation dynamics: Generating selective pressure towards diversity. In P. Liò, O. Miglino, G. Nicosia, S. Nolfi, & M. Pavone (Eds.), 12th European Conference on Artificial Life (ECAL 2013) (pp. 947–954). Cambridge, MA: MIT Press.
10. Holland, J. H. (1975). Adaptation in natural and artificial systems. Ann Arbor: University of Michigan Press.
11. Kistemaker, S., & Whiteson, S. (2011). Critical factors in the performance of novelty search. In Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation (GECCO'11) (pp. 965–972). ACM.
12. Krčah, P. (2012). Solving deceptive tasks in robot body-brain co-evolution by searching for behavioral novelty. In T. Gulrez & A. E. Hassanien (Eds.), Advances in robotics and virtual reality (pp. 167–186). London: Springer.
13. Lehman, J., & Stanley, K. O. (2008). Exploiting open-endedness to solve problems through the search for novelty. In S. Bullock, J. Noble, R. Watson, & M. A. Bedau (Eds.), Artificial life XI: Proceedings of the Eleventh International Conference on the Simulation and Synthesis of Living Systems (pp. 329–336). Cambridge, MA: MIT Press.
14. Lehman, J., & Stanley, K. O. (2011). Improving evolvability through novelty search and self-adaptation. In Proceedings of the 2011 IEEE Congress on Evolutionary Computation (CEC'11) (pp. 2693–2700). IEEE.
15. Martius, G., & Herrmann, J. M. (2010). Taming the beast: Guided self-organization of behavior in autonomous robots. In S. Doncieux, B. Girard, A. Guillot, J. Hallam, J. A. Meyer, & J. B. Mouret (Eds.), From animals to animats 11 (pp. 50–61). London: Springer. http://dx.doi.org/10.1007/978-3-642-15193-4_5.
16. Maynard Smith, J., & Szathmáry, E. (1998). The major transitions in evolution. Oxford, UK: Oxford University Press.
17. Mouret, J.-B., & Doncieux, S. (2009). Using behavioral exploration objectives to solve deceptive problems in neuro-evolution. In Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation (GECCO'09) (pp. 627–634). ACM.
18. Mouret, J.-B., & Doncieux, S. (2012). Encouraging behavioral diversity in evolutionary robotics: An empirical study. Evolutionary Computation, 20(1), 91–133.
19. Nelson, A. L., Barlow, G. J., & Doitsidis, L. (2009). Fitness functions in evolutionary robotics: A survey and analysis. Robotics and Autonomous Systems, 57, 345–370.
20. Nolfi, S., & Floreano, D. (1998). How co-evolution can enhance the adaptive power of artificial evolution: Implications for evolutionary robotics. In P. Husbands & J.-A. Meyer (Eds.), Evolutionary robotics: First European Workshop, EvoRobot98 (pp. 22–38).
21. Nolfi, S., & Floreano, D. (1998). Co-evolving predator and prey robots: Do ‘arm races’ arise in artificial evolution? Artificial Life, 4(4), 311–335.
22. Nolfi, S., & Floreano, D. (2000). Evolutionary robotics: The biology, intelligence, and technology of self-organizing machines. Cambridge, MA: MIT Press.
23. Radcliffe, N. J. (1993). Genetic set recombination and its application to neural network topology optimisation. Neural Computing & Applications, 1(1), 67–90.
24. Ray, T. S. (1991). Evolution and optimization of digital organisms. In Proceedings of the 1990 IBM Supercomputing Competition: Large Scale Computing Analysis and Modeling Conference (pp. 489–531). Baldwin Press.
25. Rechenberg, I. (1973). Evolutionsstrategie. Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Frommann-Holzboog.
26. Sareni, B., & Krähenbühl, L. (1998). Fitness sharing and niching methods revisited. IEEE Transactions on Evolutionary Computation, 2(3), 97–106.
27. Schmickl, T., & Crailsheim, K. (2006). Bubbleworld.Evo: Artificial evolution of behavioral decisions in a simulated predator-prey ecosystem. In From animals to animats 9 (pp. 594–605). London: Springer.
28. Schmidhuber, J. (2006). Developmental robotics, optimal artificial curiosity, creativity, music, and the fine arts. Connection Science, 18(2), 173–187.
29. Schwefel, H.-P. (1995). Evolution and optimum seeking. New York: Wiley.
30. Servedio, M. R., Doorn, G. S. V., Kopp, M., Frame, A. M., & Nosil, P. (2011). Magic traits in speciation: ‘magic’ but not rare? Trends in Ecology and Evolution, 26(8), 389–397.
31. Stanley, K. O., & Miikkulainen, R. (2004). Competitive coevolution through evolutionary complexification. Journal of Artificial Intelligence Research, 21(1), 63–100.
32. Stradner, J., Hamann, H., Zahadat, P., Schmickl, T., & Crailsheim, K. (2012). Online, on-board evolution of reaction-diffusion control for self-adaptation. In C. Adami, D. M. Bryson, C. Ofria, & R. T. Pennock (Eds.), Alife XIII (pp. 597–598). Cambridge, MA: MIT Press.
33. Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
34. Venditti, C., Meade, A., & Pagel, M. (2010). Phylogenies reveal new interpretation of speciation and the Red Queen. Nature, 463, 349–352. doi:10.1038/nature08630.
35. Ward, C. R., Gobet, F., & Kendall, G. (2001). Evolving collective behavior in an artificial ecology. Artificial Life, 7(2), 191–209.
36. Woehrer, M., Hougen, D., & Schlupp, I. (2012). Sexual selection, resource distribution, and population size in synthetic sympatric speciation. In C. Adami, D. M. Bryson, C. Ofria, & R. T. Pennock (Eds.), Proceedings of the Thirteenth International Conference on the Simulation and Synthesis of Living Systems (Alife13) (pp. 137–144). Cambridge, MA: MIT Press.

Author notes

Heinz Nixdorf Institute, Department of Computer Science, Fürstenallee 11, 33102 Paderborn, Germany. E-mail: heiko.hamann@uni-paderborn.de