Abstract
The structure and performance of neural networks are intimately connected, and by use of evolutionary algorithms, neural network structures optimally adapted to a given task can be explored. Guiding such neuroevolution with additional objectives related to network structure has been shown to improve performance in some cases, especially when modular neural networks are beneficial. However, apart from objectives aiming to make networks more modular, such structural objectives have not been widely explored. We propose two new structural objectives and test their ability to guide evolving neural networks on two problems which can benefit from decomposition into subtasks. The first structural objective guides evolution to align neural networks with a user-recommended decomposition pattern. Intuitively, this should be a powerful guiding target for problems where human users can easily identify a structure. The second structural objective guides evolution towards a population with a high diversity in decomposition patterns. This results in exploration of many different ways to decompose a problem, allowing evolution to find good decompositions faster. Tests on our target problems reveal that both methods perform well on a problem with a very clear and decomposable structure. However, on a problem where the optimal decomposition is less obvious, the structural diversity objective is found to outcompete other structural objectives—and this technique can even increase performance on problems without any decomposable structure at all.
1 Introduction
The structure and performance of neural networks are closely related; yet the most common technique for training neural networks does not allow structures to change: Only the weights of existing connections are modified (LeCun et al., 2015). The field of neuroevolution, where neural networks are optimized with evolutionary algorithms, offers an alternative where both connections and structures can be modified (Xin, 1999). However, there have been only a limited number of studies of how the structure and performance of evolving neural networks are related, and fewer still on the potential for objectives related to network structure to guide the evolutionary search. One structural feature that has gained some attention, and been shown to guide evolution when applied as an objective, is modularity.
Modularity in evolving neural networks has been demonstrated to improve performance on complex tasks, as it allows problem decomposition, hierarchical knowledge structures, and multimodal behavior. There is therefore a growing interest in techniques for increasing the functional modularity of evolving neural networks (Clune et al., 2013; Mengistu and Clune, 2016; Schrum and Miikkulainen, 2016b; Velez and Clune, 2017). Most techniques for increasing modularity in neuroevolution belong to one of two extremes.
On one extreme are techniques that explicitly form separate modules for solving separate parts of a problem (Togelius, 2004; Cardamone et al., 2009; Schrum and Miikkulainen, 2016b). Common to these methods is that there is a clear task division, where it is always known which module solves which subproblem. Examples include layered evolution (Togelius, 2004) and multitask networks (Schrum and Miikkulainen, 2012). We refer to such techniques as explicit modularity guidance.
On the other extreme are methods that encourage general modularity, where it is not always obvious which module solves which subproblem, or where a very clear modularity may not emerge at all. Examples are methods that evolve modularity by imposing connection costs during evolution (Clune et al., 2013) and methods where the genotype-phenotype mapping tends to lead to high levels of modularity (Mouret and Doncieux, 2008; Verbancsics and Stanley, 2011; Huizinga et al., 2014; Gruau, 1994). These methods tend to give evolution more freedom to explore different network topologies, at the cost of guiding evolution less towards promising solutions. We refer to these techniques as implicit modularity guidance.
In this article, we suggest two new ways to guide evolution towards promising modular decompositions, both relying on a new distance measure quantifying the difference between the modular decompositions in two neural networks (see Figure 1). The first, user-defined modularity, attempts to combine the free-form evolution and problem decomposition of the implicit modularity methods with the ability, offered by explicit modularity guidance, to guide evolution with the user's knowledge of problem decomposition. It does this by adding a recommended problem decomposition as an evolutionary objective, guiding the search without constraining it to this specific decomposition pattern.
The second technique we propose takes the opposite approach: Rather than guiding evolution towards a single modular decomposition, it guides evolution towards as many different decompositions as possible, by adding modular diversity as an objective.
We compare neuroevolution guided by user-defined modularity, general modularity, and modular diversity on two problems where finding the best modular decomposition can benefit evolution (see Figure 2). Our results indicate that searching for one specific decomposition can work well for problems with a very clear modular structure, but fails when tasks have a less clear structure. However, modular diversity is demonstrated to be a good guide for evolution for both kinds of problems, even increasing performance on a problem without any modular structure at all.
2 Related Work
2.1 Evolution of Neural Modularity
Modularity is here understood as the tendency for a network to have multiple densely connected clusters, each with only a limited connection to other clusters (Clune et al., 2013). Such modularity is an important organizing principle in many biological networks, including the neural networks that make up the brains of humans and animals (Alon, 2006; Mountcastle, 1997). Understanding why modularity evolved in such networks has therefore been a focus of much research, resulting in several different hypotheses on which factors promote the evolution of modularity.
A leading hypothesis has been that modularity evolves when the overall evolutionary goal changes rapidly, but subgoals remain fixed (Kashtan and Alon, 2005). Such conditions may have been present in biological evolution in environments that change, but require different combinations of some basic skills or functions. Several other hypotheses have been suggested and demonstrated to lead to the evolution of modularity in simulation, including modularity emerging as a way to reduce interference between different patterns of network activity (Espinosa-Soto and Wagner, 2010), modularity emerging due to a noisy genotype-phenotype mapping (Høverstad, 2011), and modularity emerging due to the costs of building and maintaining neural connections (Clune et al., 2013).
Unlike the work introduced above, our structurally guiding objectives are not meant to offer a biologically plausible explanation for how modularity evolves in neural networks. Rather, they represent tools which can be used to study the relationship between structure and performance in evolving neural networks, as well as to achieve better-performing networks by aligning with the problem structure or exploring a wider range of modular decompositions.
2.1.1 Techniques for Leveraging Modularity in Neuroevolution
Several researchers have studied how modularity can increase the performance of evolving neural networks. Particularly on complex, decomposable problems, modularity has been demonstrated to allow evolution to find good solutions faster. Evolving modular neural networks can be done in a variety of ways, ranging from explicitly decomposing the problem using domain knowledge, to gently guiding evolution towards modular structures.
An example of explicit decomposition is layered evolution (Togelius, 2004), inspired by the subsumption architecture from behavior-based robotics (Brooks, 1986). Layered evolution evolves neural networks incrementally and modularly, beginning with networks for low-level behaviors, and evolving more complex behaviors on top of them as they stabilize. User-knowledge is required for defining the sub-behaviors each module should learn, but the selection among modules during execution is optimized by evolution.
In a series of experiments, Schrum and Miikkulainen (2012, 2014, 2016a,b) studied the role of modular neural networks in game scenarios which require multiple skills. In these experiments, separate output modules allowed single neural networks to show multiple different behaviors. The researchers studied different ways to perform this modular decomposition, with different degrees of hand design—ranging from manually specifying which module to use in which situation to giving evolution full control both over the number of different modules and how they are used.
Schrum and Miikkulainen (2016a) also discuss the importance of task divisions at the input level, highlighting two different ways one can influence neural network architectures. The first method, called “split sensors,” is to have separate inputs for sensory information that needs to be processed differently (e.g., different inputs for poisonous and edible objects), thereby biasing learning towards one particular task division. An alternative to this is “conflict sensors,” which means that these neural-network inputs carry information about multiple different types of events (e.g., a single input that signals both poisonous and edible). The latter makes learning harder, but it is important to be able to learn with conflict sensors, since split sensors are not available in all domains. Schrum and Miikkulainen (2016a) go on to demonstrate that evolving modular neural networks is a way to learn multimodal behaviors with conflict sensors. We test our methods on a task with a very explicit task decomposition at the input level, similar to “split sensors,” and on tasks with much less obvious mappings from network inputs/outputs to modular decompositions. This lets us study the effect of structural objectives also on tasks closer to real-world scenarios, where we are not always sure to which module a neuron belongs.
An alternative to the methods above is to increase modularity in neural networks by letting it emerge without any explicit human design. One such technique adds as an evolutionary objective the reduction of connection costs (Clune et al., 2013)—resulting in evolved neural networks with increased modularity. This technique has also been found to improve performance on modularity-demanding tasks, such as tasks requiring the learning and retention of multiple skills (Ellefsen et al., 2015), tasks with multiple subproblems (Huizinga et al., 2014) and tasks with hierarchical structure (Huizinga et al., 2014; Mengistu and Clune, 2016). A related idea is to add network modularity itself as an objective. This has also been demonstrated to increase modularity, and in some cases performance, of evolving neural networks (Huizinga et al., 2016).
A final way to increase modularity in evolving neural networks is to apply genotype-phenotype mappings with modularity-inducing properties (Mouret and Doncieux, 2008; Verbancsics and Stanley, 2011; Huizinga et al., 2014; Gruau, 1994). An example is applying a developmental process as mapping, which can produce modularity by recursive repetition of developmental rules (Gruau, 1994). Similarly to the addition of an objective guiding evolution towards modularity, these techniques encourage modularity in general, but typically do not apply any problem-specific domain knowledge.
The most explicit methods for evolving modular networks have the advantage of having a clear task division, where it is always clear which module is responsible for each action. On the other hand, the techniques producing modularity by applying guiding objectives or modularity-inducing genotype-phenotype mappings may give evolution more freedom to explore unconventional modular decompositions, with the disadvantage that these decompositions may be difficult to interpret, and that we do not exploit the user's knowledge about the problem structure.
2.2 Encouraging Diversity in Evolving Neural Networks
Maintaining diversity in evolutionary algorithms is a commonly used technique to encourage exploration and avoid convergence to local optima (Eiben and Smith, 2003). Generally, diversity may be encouraged at the genotype or phenotype level, both requiring an application-specific distance measure for individuals. An example is fitness sharing (Goldberg and Richardson, 1987), wherein similar individuals share their fitness value, resulting in a lower selection pressure on solutions that are very different from others. The popular neuroevolution algorithm NEAT (Stanley and Miikkulainen, 2002) applies this technique to evolving neural networks, by imposing fitness sharing according to the number of identical genes (which in turn indicate identical nodes or connections). More recently, multiobjective evolution has emerged as a way to add diversity as a separate objective, to be optimized together with performance (Mouret and Doncieux, 2009).
A key challenge in encouraging diversity in evolutionary algorithms is finding an appropriate way to measure the distance between two individuals. In addition to uncovering interesting differences between two individuals, this measure should be efficient to calculate: All individuals in the population need to be compared to all others, so the number of distance calculations in each generation grows quadratically with the population size. In general, computing the distance between graphs (an example of which is neural networks) is NP-hard, ruling out a complete structural distance calculation (Mouret and Doncieux, 2009). Applying approximate structural distance as a guiding objective has been tested as a way to encourage structural diversity in a population of evolving neural networks (Mouret and Doncieux, 2009). However, this was not found to improve the evolutionary search.
Since structural differences are difficult to calculate, and may not necessarily lead to interesting differences in the functionality of neural networks, a more common technique in evolving neural networks is to apply behavioral diversity as an objective (Mouret and Doncieux, 2012; Risi et al., 2009). Behavioral diversity techniques rely on quantifying the difference in how evolved individuals actually behave. For instance, in a robot navigation task, this could include some information about where the evolved robot tends to navigate to. Encouraging behavioral diversity has been demonstrated to substantially improve the performance of evolutionary algorithms on a variety of different tasks (Mouret and Doncieux, 2012), and to outperform structural diversification (Mouret and Doncieux, 2009).
Unlike previous techniques, our structural diversity measure encompasses the idea that the interesting differences between evolving networks lie in their higher-level modular structures, and not in the exact patterns of connectivity. This higher level of abstraction in measuring structural diversity has the additional benefit that high-level differences are faster to compute. We demonstrate that evolution guided by our structural diversity measure performs similarly to evolution guided by behavioral diversity and that structural diversity may even lead to faster convergence. A further advantage of structural diversity is that the calculation is independent of the problem, whereas behavioral diversity techniques typically need some adaptation to a given task (Mouret and Doncieux, 2012).
3 Targeted Problems
We apply structurally guided neuroevolution to two problems that are expected to give different insights into the role of structural objectives. The first, the retina problem, has a very clear structure and can clearly benefit from one specific decomposition pattern. The second problem, a robot locomotion problem, also has a modular structure, but it is not obvious which modular decomposition would work best, or if such a decomposition is necessary.
3.1 The Retina Problem
The retina problem (see Figure 3a) is a pattern-recognition task that has been the focus of several previous studies on the evolution of modular neural network structures (Kashtan and Alon, 2005; Clune et al., 2010; Høverstad, 2011; Clune et al., 2013; Huizinga et al., 2014). In this task, an 8-bit input is to be classified as 1 or 0. The task is modular because the input patterns have two independent parts (left and right), both of which should contain one of several target patterns for the classification to be a 1. We can think of this as abstracting the left and right half of a retina, seeing independent parts of the visual scene.
Interestingly, even though the retina problem has a modular structure, and evolution can benefit from dividing it into separate parts, evolving neural networks for this problem tends to produce non-modular solutions (Kashtan and Alon, 2005). Variants of this problem have therefore been used to gain a better understanding of the environmental pressures that encourage the evolution of modularity (Kashtan and Alon, 2005; Høverstad, 2011; Clune et al., 2013).
3.1.1 Problem Setup
Our setup of the retina problem follows the “left AND right” setup used in previous studies (Kashtan and Alon, 2005; Clune et al., 2013). The left and right half of the retina both consist of 4 inputs, yielding a total of 16 potential binary patterns on each half of the retina. 8 of these are classified as target patterns (see Figure 3b).
To explore the relationship between problem structure and structural objectives, we also test the techniques on a nonmodular version of the retina problem. In this experiment, patterns are distributed randomly across all inputs, eliminating any decomposable problem structure. To keep the problem difficulty similar to the modular retina experiment, we define the same number of target patterns for the nonmodular retina. That is, 64 of the 256 patterns are randomly chosen to be targets.
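The two labelling schemes described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the actual target patterns are those of Figure 3b, which are not reproduced here, so `LEFT_TARGETS` and `RIGHT_TARGETS` are placeholder sets, and all function names are hypothetical.

```python
from itertools import product
import random

HALF_PATTERNS = list(product((0, 1), repeat=4))  # 16 patterns per retina half

# Placeholder target sets (assumption): any 8 of the 16 patterns per half.
# The real targets are given in Figure 3b of the article.
LEFT_TARGETS = set(HALF_PATTERNS[:8])
RIGHT_TARGETS = set(HALF_PATTERNS[8:])


def modular_label(pattern):
    """'Left AND right': 1 only if both halves show one of their target patterns."""
    left, right = tuple(pattern[:4]), tuple(pattern[4:])
    return int(left in LEFT_TARGETS and right in RIGHT_TARGETS)


def make_nonmodular_targets(rng):
    """Nonmodular variant: 64 of the 256 full patterns, chosen at random."""
    all_patterns = list(product((0, 1), repeat=8))
    return set(rng.sample(all_patterns, 64))


all_patterns = list(product((0, 1), repeat=8))
n_modular_targets = sum(modular_label(p) for p in all_patterns)
nonmodular_targets = make_nonmodular_targets(random.Random(0))
print(n_modular_targets, len(nonmodular_targets))  # 64 64
```

With 8 targets on each half, the modular task has 8 × 8 = 64 positive patterns, which is why 64 random targets keep the class balance of the nonmodular variant the same.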
3.1.2 Neural Network Details
The neural network setup replicates recent work on evolving modular neural networks for the retina problem (Clune et al., 2013). Evolution optimized the connectivity and connection weights of feed-forward neural networks with a layered structure, only allowing connections between neighboring layers. The output of each neuron, $y_j$, was calculated as the following function of its inputs: $y_j = \tanh\left(\lambda\left(\sum_{i \in I_j} w_{ij} y_i + b_j\right)\right)$. Here, $I_j$ is the set of all inputs to node $j$, $b_j$ is a bias input, and $w_{ij}$ is the weight of the connection between node $i$ and $j$. The $\tanh$ function ensures an output of each neuron in the range $[-1, 1]$, and $\lambda$ determines the slope of the activation function between the limits. Identically to Clune et al. (2013), we set $\lambda$ to 20, making the activation function very steep, resembling a step function. The evolving neural networks had 5 layers, with a maximum of 8/4/2 nodes in each hidden layer. Following Clune et al. (2013), evolution chose from a discrete set of values for weights and biases (the values −2, −1, 1, 2, and −2, −1, 0, 1, 2, respectively).
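As a small illustration of the neuron model, the sketch below computes a single neuron's output. It assumes the squashing function is the hyperbolic tangent used by Clune et al. (2013), which matches the stated output range and slope parameter; the function and argument names are hypothetical.

```python
import math

SLOPE = 20.0  # slope parameter; the text sets it to 20


def neuron_output(inputs, weights, bias, slope=SLOPE):
    """Output of one neuron: tanh(slope * (weighted input sum + bias)).

    Assumption: tanh is the squashing function, following Clune et al. (2013).
    """
    weighted_sum = sum(w * y for w, y in zip(weights, inputs))
    return math.tanh(slope * (weighted_sum + bias))


# With a slope of 20, any non-zero net input is pushed almost to -1 or 1,
# so the neuron behaves nearly like a step function.
on = neuron_output([1, 0], weights=[2, -1], bias=-1)   # net input  1
off = neuron_output([0, 1], weights=[2, -1], bias=0)   # net input -1
```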
3.1.3 Interface towards Neural Networks
The neural networks apply the same input/output structure as previous work on this problem (Kashtan and Alon, 2005; Clune et al., 2013). The eight binary-valued pixels of the retina are sent to 8 separate input neurons, resulting in 4 neurons receiving the “left-half” retina stimuli and 4 others receiving the “right-half” stimuli. Output from the network is a single number, with positive output values being interpreted as true, and negative outputs as false.
The structural distance between neural networks is measured on input neurons, since these mirror the modular structure of the problem. The recommended decomposition pattern reflects the obvious modular decomposition (see Figure 4a).
3.2 Robot Locomotion
Robot locomotion is a problem that has received significant attention in studies of neuroevolution, including recent experiments on deep neuroevolution (Conti et al., 2018), and studies on the relationship between structure and performance in evolving neural networks (Huizinga et al., 2016). We test our proposed structural objectives on the robot-locomotion task from Huizinga et al. (2016) to measure their effect on a practical problem, which, unlike the retina problem, does not have a very clear mapping between the structure of the problem and the evolving neural networks.
3.2.1 Problem Setup
3.2.2 Neural Network Details
The network is a Continuous-Time Recurrent Neural Network (CTRNN) (Beer and Gallagher, 1992; see Supplementary Material Table 4 for the equations), with its parameters specified by the HyperNEAT encoding (Stanley et al., 2009). In the HyperNEAT encoding, the genotype of the network is a Compositional Pattern Producing Network (CPPN) (Stanley, 2007), which is effectively a neural network that takes as input the coordinates of two neurons and that outputs the weight of the connection between those neurons. Here, the CPPN is extended with a Link Expression Output (Verbancsics and Stanley, 2011), meaning connections are not expressed at all if the value of this output is smaller than zero, and it implements the multispatial substrate method (Pugh and Stanley, 2013; see Supplementary Material Figure 1 for an explanation), which is recommended for robotics problems with different input and output modalities. The CPPN is further extended with two additional outputs that specify the bias and time constant of each neuron. To encode these neuron-specific parameters, the CPPN is presented with the coordinates of the relevant neuron as its first inputs, while its other inputs are set to zero. Afterwards, the output of the CPPN is scaled to the desired range, with the CPPN weight and bias outputs scaled to and the time-constant outputs scaled to . For details about the aforementioned methods, we refer the reader to the cited papers. CPPNs are initialized as minimal, fully-connected networks without hidden nodes, weights drawn uniformly from and activation functions are drawn uniformly from the available set of sine, identity, Gaussian, and sigmoid.
The spatial coordinates of the neurons in the CTRNN controlling the spider robot are as depicted in Figure 4b, where the neurons are shown inside a cube with sides of length 2, centered around the origin such that it extends from −1 to 1 in all dimensions. The extreme neurons shown in this picture all lie at the edge of this cube.
3.2.3 Interface towards Neural Networks
As seen in Figure 4b, the CTRNNs controlling the spider robot have six inputs and 18 outputs. The inputs represent the spider body's velocity along the X, Y, and Z axes, as well as the robot's heading compared to each axis. The heading takes values between 1 and −1, indicating that the robot is facing exactly in the direction of the relevant axis, and exactly in the opposite direction, respectively.
The robot has 6 legs, each with 1 hip joint with 2 degrees of freedom (up-down, front-back) and one knee joint with 1 degree of freedom (see Figure 5). The neural network outputs are rescaled to span the feasible range of their respective actuators, and the resulting value is interpreted as the desired angle for that actuator (see Supplementary Material Table 5 for the actuator ranges and velocity calculation).
This recurrent network can generate rhythmic patterns of activation without any inputs, and initial experiments indicated that many good robot controllers choose to disconnect all inputs to the network. The structural decomposition of inputs is uninformative for such networks, and we therefore use the modular decomposition of outputs to calculate the structural distance measure for this task. It is less obvious how to select a user-recommended decomposition pattern here. One clear structural feature of this task is that each of the robot's six legs has the same three degrees of freedom (see Figure 5), which could potentially benefit from somewhat similar patterns of movement. We therefore recommend a decomposition that divides the output neurons into three groups, one for each of the three degrees of freedom (see Figure 4b).
4 Methods
4.1 Evolving Neural Networks
Evolution begins with a population of randomly generated neural networks (for the retina task) or CPPNs (for the robot locomotion task), and works towards better performance by allowing the fittest individuals to have more offspring, and applying random mutations to those offspring. Following previous studies on neuroevolution guided by additional objectives (e.g., Clune et al., 2013; Ellefsen and Torresen, 2017), we apply the multiobjective optimization algorithm NSGA-II (Deb et al., 2002). All individuals have the primary objective of solving the target problem (retina or robot locomotion) as well as possible. Different experimental treatments apply different additional guiding objectives, as outlined in Figure 2. The experiments were carried out in the Sferes evolutionary algorithm software package (Mouret and Doncieux, 2010). Experimental parameters are given in Supplementary Material Table 1, available at http://www.mitpressjournals.org/doi/suppl/10.1162/evco_a_00250.
Treatment | Structural Objective |
---|---|
PA | None |
UserMod | Maximizing match with user-defined modularity pattern (see Figure 4) |
Q-Mod | Maximizing modularity as measured with the Q-metric |
ModDiv | Maximizing diversity of modular decompositions in the population |
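To make the interplay between the performance objective and a structural objective concrete, the sketch below shows the Pareto-dominance relation on which NSGA-II builds. It is only an illustration under the assumption that both objectives are maximized; the full algorithm (Deb et al., 2002) additionally sorts the population into successive fronts and uses crowding distance, neither of which is shown. The function names and the example values are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated_front(population):
    """Individuals that no other individual in the population dominates."""
    return [p for p in population
            if not any(dominates(q, p) for q in population)]


# Each tuple is (task performance, structural objective) for one individual.
population = [(0.9, 0.2), (0.8, 0.8), (0.7, 0.7), (0.5, 0.9)]
front = nondominated_front(population)
print(front)  # [(0.9, 0.2), (0.8, 0.8), (0.5, 0.9)]
```

An individual with a poor structural score can thus survive if its performance is unmatched, and vice versa, which is how a structural objective can guide the search without overriding task performance.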
Variation in evolving neural networks and CPPNs is created via mutations. For the directly encoded networks in the retina problem, mutations have a small chance of adding connections, removing connections, moving connections, changing the weights of connections, and changing bias-inputs to nodes (see Supplementary Material Table 2). For the evolving CPPNs in the robot locomotion task, mutations have the potential of modifying connectivity and connection weights, as well as randomly replacing activation functions (see Supplementary Material Table 3).
4.2 User-Defined Modularity
Our user-defined modularity technique allows the user to influence the direction of evolutionary search by defining a modular decomposition that could help solve the target problem. In practice, this is implemented by the user defining a list of lists, where each list corresponds to a module and each element of a list corresponds to a neuron. For instance, the decomposition [[1, 2], [3, 4]] corresponds to a network where input neurons 1 and 2 belong to one module (A), input neurons 3 and 4 belong to a different module (B), and any other input/output neurons are unspecified. Unspecified neurons indicate we do not care which module they belong to: They could belong to module A, module B, or a different, separate module.
The user-defined modularity pattern, specified in the array-format described above, is given to the multiobjective evolutionary algorithm, which now has the objectives of (1) maximizing task performance and (2) maximizing the degree of match with the guiding modularity pattern.
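The list-of-lists format can be handled with a simple lookup, sketched below. The helper name `module_lookup` is hypothetical, and treating a neuron listed in two modules as an error is an assumption made for this illustration.

```python
def module_lookup(decomposition):
    """Map each listed neuron ID to the index of the module that contains it.

    Neurons that are not listed are simply absent from the mapping,
    mirroring the "unspecified" neurons described in the text.
    """
    mapping = {}
    for module_index, module in enumerate(decomposition):
        for neuron in module:
            if neuron in mapping:
                # Assumption: a neuron may belong to at most one module.
                raise ValueError(f"neuron {neuron} is listed in two modules")
            mapping[neuron] = module_index
    return mapping


recommended = [[1, 2], [3, 4]]  # modules A and B from the example in the text
lookup = module_lookup(recommended)
print(lookup.get(1), lookup.get(3), lookup.get(5))  # 0 1 None
```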
4.3 Quantifying the Distance between Two Modular Decompositions
With the user-defined modularity technique, it is necessary to evaluate how well each evolved network matches the recommended modularity pattern, and with the modularity-diversity technique, it is necessary to determine how well the modularity patterns in all pairs of evolved networks match. The same distance measure (see Figure 1) is applied in both cases. In the discussion below, we refer to the modularity pattern in the evolved network we are currently evaluating as the evolved decomposition, and the pattern we are comparing it to as the compared decomposition. The compared decomposition will thus be a user-defined pattern for the user-defined modularity technique, and the pattern of a different evolved neural network for the modularity-diversity technique. The calculation of the distance has two steps: (1) estimate the modular decomposition of the evolved network(s), and (2) calculate how well this decomposition matches the compared decomposition.
4.3.1 Calculating the Modular Decomposition of Evolved Networks
To evaluate and visualize which are the main modules in an evolved network, we follow a technique applied in previous papers on evolving modular neural networks (e.g., Clune et al., 2013). This technique approximates the best modular decomposition of a network, and simultaneously calculates the modularity score of this decomposition. This modularity calculation estimates the network division which maximizes the Q-metric (Newman, 2006b; Leicht and Newman, 2008). The Q-metric measures modularity as the difference between the number of connections inside each module and the expected number of such connections for random networks with the same number of edges. In other words, it reflects how “unexpectedly modular” a given network is. Maximizing Q is an NP-hard problem, and we therefore apply an approximate optimization algorithm to find the most modular division (Fortunato, 2010). More details on this technique can be found in Ellefsen et al. (2015).
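For a fixed division of a network into modules, the Q-score can be computed directly. The sketch below uses the directed form of the metric (Leicht and Newman, 2008): the fraction of connections that fall inside modules, minus the fraction expected given each neuron's in- and out-degree. It only scores a given division; the approximate search for the division that maximizes Q is not shown. The function name and the example network are hypothetical, and the edge list is assumed to contain no duplicate connections.

```python
from collections import defaultdict


def directed_modularity(edges, module_index):
    """Q-score of a fixed division for a directed network.

    edges: list of (source, target) pairs.
    module_index: dict mapping every neuron to its module.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    out_by_module = defaultdict(int)  # outgoing connections per module
    in_by_module = defaultdict(int)   # incoming connections per module
    inside = 0                        # connections with both ends in one module
    for source, target in edges:
        out_by_module[module_index[source]] += 1
        in_by_module[module_index[target]] += 1
        if module_index[source] == module_index[target]:
            inside += 1
    modules = set(module_index.values())
    expected = sum(out_by_module[c] * in_by_module[c] for c in modules) / m
    return (inside - expected) / m


# Two disconnected chains: 0 -> 1 -> 2 and 3 -> 4 -> 5.
edges = [(0, 1), (1, 2), (3, 4), (4, 5)]
two_modules = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
one_module = {n: 0 for n in range(6)}
print(directed_modularity(edges, two_modules))  # 0.5
print(directed_modularity(edges, one_module))   # 0.0
```

Placing every neuron in a single module always yields Q = 0, since all connections are then "inside" exactly as often as expected.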
The result is an estimate of which are the most prominent modules in our evolved neural networks, and the modularity Q-score associated with this modular decomposition. In visualizations (e.g., see Figure 1) we color the different discovered modules in different colors.
4.3.2 Comparing the Evolved and Compared Decompositions
The match between an evolved modular decomposition and a different decomposition (either a user recommendation or a different evolved network) is reflected in the distance measure. Since our evolving neural networks can choose to connect or disconnect internal neurons, we limit this calculation to input and/or output neurons, depending on where we find it most relevant to measure modular decompositions. For the retina problem, input neurons mirror the modular structure of the task, whereas for the robot locomotion problem, the output neurons have the clearest modular decomposition (and many good solutions do not connect the inputs at all). We therefore measure the distance on inputs for the former, and on outputs for the latter (see Figure 4). For simplicity, we discuss measurements on ANN inputs below, but the same calculations apply to measuring on outputs, or even on internal neurons.
When comparing decompositions, we are interested in which neurons belong to the same, and which belong to different modules. Other than their constituent neurons, modules have no identity: the color we display to tell modules apart has no special meaning. For this reason, we cannot compare two neural networks by counting whether their neurons agree on which module they belong to. For instance, in Figure 6, it does not matter whether a given neuron belongs to the “blue” module in both decompositions. However, it does matter whether neurons that share a module in one decomposition also share a module in the other, and whether neurons that are separated in one are also separated in the other. We therefore need a measure that reflects to which degree neurons in the two decompositions are grouped together in the same way.
There are two separate issues that are important when comparing the evolved and compared decompositions. The first is that the neurons belonging to the same module in the compared decomposition should as far as possible also do so in the evolved decomposition. For instance, for the recommended pattern in Figure 6a, evolved networks will have the lowest distance if the neurons of the first recommended module belong to the same module, and the neurons of the second recommended module are also grouped together. We call this measure uniformity, as it reflects to what degree neurons that were intended to belong to the same module actually do so.
Note that having a high uniformity is not enough for two decompositions to be a good match. For instance, if the evolved decomposition has all neurons belonging to the same module, it will score maximally on uniformity no matter how the compared decomposition looks: All the modules in the compared decomposition are 100% uniform in the evolved decomposition. We therefore need to measure also how frequently pairs of neurons in the evolved decomposition belong to the same module, but their counterparts in the compared decomposition do not. This is for instance the case for two of the neurons in Figure 6b: They belong to the same module, but were recommended not to do so. We call such a situation a conflict. To evaluate how well aligned the two decompositions are, the distance measure needs to reflect the degree of uniformity inside recommended modules, and the degree of conflict between them.
We facilitate some explanations below by discussing the “color” of modules. As discussed in Section 4.2, modular decompositions are lists of lists of neuron IDs. When we discuss modules or neurons of different “color,” we simply mean that these belong to different sublists.
4.3.3 Calculating Uniformity
Algorithm 1 calculates the uniformity of two decompositions. The inputs are the evolved and compared modular structures, both presented as lists of lists of neuron IDs, as seen in Figure 6. The algorithm goes through each module in the compared decomposition, and calculates the uniformity of the corresponding nodes in the evolved network (to which degree they also belong to a single module).
For each module in the compared decomposition, the method first extracts the IDs of all neurons in that module. In the evolved network, we then check to which module(s) those neurons belong (which color they have in the modular decomposition). We identify the most common color among these neurons, and counting how many of the module's neurons have that color gives us an indication of the module's uniformity. The uniformity is summed over all modules in the compared decomposition, and normalized to be in the range [0,1], where higher values indicate more uniformity.
An example of maximum uniformity is shown in Figure 6b. The recommended decomposition (see Figure 6a) has 2 modules, consisting of three and two neurons, respectively. The uniformity calculation processes these two sequentially. First, it is found that the three neurons of the first module indeed all belong to the same module in the evolved network, adding 3 to the uniformity sum. Next, the same is found for the two neurons of the second module, adding 2 to the sum. The final uniformity is therefore 5/5 = 1. By similar reasoning, the network in Figure 6c is also found to be fully uniform with respect to the recommended decomposition. These two figures illustrate why uniformity alone is not a sufficient measure of the match between decompositions: Both have full uniformity with the recommendation, but only the evolved network in Figure 6c has the intended structure.
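The uniformity calculation described above can be illustrated with a minimal Python sketch. It is not a reproduction of Algorithm 1: the function names, the neuron IDs 0 to 4, and the normalization by the total number of neurons (chosen to agree with the 5/5 example) are assumptions made for illustration.

```python
from collections import Counter


def module_of(decomposition):
    """Map each neuron ID to the index ("color") of the sublist containing it."""
    return {neuron: color
            for color, module in enumerate(decomposition)
            for neuron in module}


def uniformity(evolved, compared):
    """Fraction of neurons that share the most common evolved color
    within their module of the compared decomposition (assumed normalization)."""
    color = module_of(evolved)
    matched = 0
    total = 0
    for module in compared:
        colors = [color[neuron] for neuron in module]
        # Size of the largest same-colored group among this module's neurons.
        matched += Counter(colors).most_common(1)[0][1]
        total += len(module)
    return matched / total


# Hypothetical neuron IDs: one module of three neurons, one of two.
recommended = [[0, 1, 2], [3, 4]]
all_in_one = [[0, 1, 2, 3, 4]]      # every neuron in a single module
print(uniformity(all_in_one, recommended))            # 1.0
print(uniformity(recommended, recommended))           # 1.0
print(uniformity([[0, 1], [2, 3, 4]], recommended))   # 0.8
```

The first two calls both return 1, which mirrors the observation that a network placing all neurons in one module is indistinguishable from a perfect match when uniformity is used alone.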
4.3.4 Calculating Conflicts
Algorithm 2 calculates the number of conflicts between the evolved and the compared decomposition. A conflict exists when neurons from a single module in the evolved network belong to several different modules in the compared decomposition. The algorithm goes through all modules in the compared decomposition, and extracts the colors of the corresponding neurons in the evolved network. The inner loop goes through all other modules in the compared decomposition to see if any of the same colors can be found on their corresponding neurons in the evolved network. For each such match, we count one conflict. The conflict measure is normalized to lie in the range [0,1], where higher numbers mean fewer conflicts. This is done to make 1 indicate the highest level of agreement for both uniformity and conflicts.
To give an example, in Figure 6a, there are two recommended modules. Algorithm 2 starts by taking the first (three neurons) as the current module and the second (two neurons) as the module to compare against. In the evolved network in Figure 6b, the colors in the first are [blue, blue, blue] and in the second [blue, blue]. The algorithm goes through all the neurons in the first module, and counts how many of the neurons in the second match their color. In this case, each of the 3 neurons matches both of the 2 neurons, giving 3 × 2 = 6 matches. The normalizing factor is incremented by the maximum number of conflicts between these two modules, which also happens to be 6. For this network, the number of conflicts is equal to the normalizing factor, resulting in a conflict measure of 0 (the worst possible). A similar calculation on the evolved network in Figure 6c reveals it has the best possible conflicts-score of 1.
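The conflict calculation can be sketched in Python in the same spirit. This is an illustrative reading of the description above and not a reproduction of Algorithm 2: the function name, the neuron IDs, the choice to visit each pair of compared modules once, and the return value of 1 when only one module exists are all assumptions.

```python
from itertools import combinations


def conflict_score(evolved, compared):
    """Return 1 - conflicts / max_conflicts, so that 1 means no conflicts."""
    color = {neuron: c
             for c, module in enumerate(evolved)
             for neuron in module}
    conflicts = 0
    max_conflicts = 0
    # Visit every pair of distinct modules in the compared decomposition.
    for first, second in combinations(compared, 2):
        for a in first:
            # Neurons that should be separated from `a` but share its evolved color.
            conflicts += sum(1 for b in second if color[b] == color[a])
        max_conflicts += len(first) * len(second)
    if max_conflicts == 0:
        return 1.0  # a single compared module cannot produce conflicts
    return 1.0 - conflicts / max_conflicts


recommended = [[0, 1, 2], [3, 4]]
print(conflict_score([[0, 1, 2, 3, 4]], recommended))    # 0.0 (6 of 6 possible conflicts)
print(conflict_score([[0, 1, 2], [3, 4]], recommended))  # 1.0 (no conflicts)
print(conflict_score([[0, 1], [2, 3, 4]], recommended))  # about 0.67 (2 of 6)
```

Unlike uniformity, this score separates the all-in-one-module network (score 0) from the network that matches the recommendation (score 1).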
4.3.5 Calculating the Distance Metric
Combining uniformity and conflicts, the resulting distance ranges from 0 to 1, where 0 indicates a perfect match between the compared decompositions, and 1 indicates the worst possible match.
4.4 Diversity Measurement
4.4.1 Behavioral Distance
In one experiment, we compare the use of behavioral and modular diversity in evolving neural networks. A key difference in these two approaches is that behavioral differences typically have to be calculated with problem-specific methods, whereas, using our metric, modular diversity can be calculated the same way for any problem with neural network phenotypes.
A generic behavior distance metric, which has been found to work well for several problems in evolutionary robotics, is the Hamming distance between sensory-motor vectors (Mouret and Doncieux, 2012). The idea in this approach is to store all inputs and outputs of a neural network in a large binary vector (a process which may require some problem-specific adaptation), and calculate the Hamming distance (the number of positions at which the bits are different) between pairs of networks.
Inspired by this, the behavioral descriptors for both our tasks reflect the idea that the behavior of a network is considered different if its response to a particular input is different from the response of the rest of the population. Both diversity measurements are based on representing the history of network outputs as binary vectors, by converting positive outputs to 1 and zero-valued or negative outputs to 0. For the retina task, inputs are always presented to the neural networks in the same order, and simply appending each binary output to a vector generates a description of how the network “behaves” as it sees each unique input.
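As a minimal sketch of the binary behavior descriptor and the Hamming distance described above, the following Python snippet thresholds raw outputs and compares two networks. The function names and the example output values are hypothetical, and averaging the distance over the rest of the population is an assumption about how a per-individual diversity score could be formed.

```python
def binarize(outputs):
    """Convert positive outputs to 1 and zero-valued or negative outputs to 0."""
    return [1 if value > 0 else 0 for value in outputs]


def hamming(a, b):
    """Number of positions at which two equal-length binary vectors differ."""
    if len(a) != len(b):
        raise ValueError("behavior vectors must have equal length")
    return sum(x != y for x, y in zip(a, b))


def mean_distance_to_others(index, descriptors):
    """Average Hamming distance from one descriptor to all others (assumed score)."""
    others = [d for i, d in enumerate(descriptors) if i != index]
    return sum(hamming(descriptors[index], d) for d in others) / len(others)


# Hypothetical raw outputs recorded for the same sequence of inputs.
net_a = binarize([0.7, -0.2, 0.0, 1.3])   # [1, 0, 0, 1]
net_b = binarize([0.1, 0.4, -0.9, 0.0])   # [1, 1, 0, 0]
print(hamming(net_a, net_b))               # 2
```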
For the robot locomotion problem, behaviors are more complex, since the input to the neural network depends on the previous motions of the robot. To characterize network behaviors here, we use a measure of behavioral diversity similar to the one presented in Huizinga et al. (2016): We give each network a collection of predefined inputs, and measure its response as follows. Setting one of the inputs to 1 and all others to 0, we record the output of the network over 5 time steps, converting it to a binary vector with length equal to the number of outputs. This process is repeated for each input, and the resulting vectors together form the behavioral descriptor. Note that even though successful networks for this task sometimes do not connect to the inputs, this method can capture behavioral differences, since the pattern of outputs varies depending on the evolved CTRNN, even without any inputs.
4.4.2 Measuring Diversity against the Population
4.5 Experimental Treatments
Our main experiment compared three different ways of guiding neuroevolution with structural objectives (see Table 1). The baseline treatment is “Performance Alone” (PA), where evolution is guided only by performance on the target problem. In this single-objective case, NSGA-II reduces to an elitist evolutionary algorithm with tournament-based selection. UserMod applies the user-defined modularity technique, inserting knowledge about the recommended problem decomposition in the evolutionary search. Q-Mod guides evolution towards more modular neural networks, but without applying any problem-specific knowledge. Previous work has shown such general modularity pressure to form more modular (Huizinga et al., 2016) and better performing (Clune et al., 2013) neural networks when applied to modularly decomposable tasks. Finally, ModDiv applies the modularity-diversity technique, selecting for networks with different modular decompositions than the rest of the population.
4.6 Metrics and Visualizations
When calculating the structural modularity of evolved networks, we apply the widely used Q-score (Newman, 2006b). In visualizations, we follow Clune et al. (2013) in first moving nodes to the position that minimizes the total connection length of the neural network, while holding inputs and outputs fixed. This shows structural modularity more clearly, while not changing the functionality or modularity score of the network. Also following Clune et al. (2013), in our visualizations, we estimate the most modular split of the network, and color each neuron according to which module it belongs to.
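To make the Q-score concrete, the following Python sketch evaluates Newman's modularity for a given split of a small graph. It is a simplification for illustration only: it treats connections as undirected and unweighted, and it scores a split that is supplied by hand rather than estimating the most modular split. The function name and the example graph are hypothetical.

```python
def q_score(edges, modules):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over modules of (internal_edges / m) - (degree_sum / (2 * m)) ** 2,
    where m is the total number of edges.
    """
    m = len(edges)
    color = {node: c for c, module in enumerate(modules) for node in module}
    internal = [0] * len(modules)   # edges with both ends inside the module
    degree = [0] * len(modules)     # summed degree of the module's nodes
    for u, v in edges:
        degree[color[u]] += 1
        degree[color[v]] += 1
        if color[u] == color[v]:
            internal[color[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2
               for c in range(len(modules)))


# Two triangles joined by a single connection.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(q_score(edges, [[0, 1, 2], [3, 4, 5]]))   # about 0.357 (5/14)
print(q_score(edges, [[0, 1, 2, 3, 4, 5]]))     # 0.0
```

Placing all nodes in one module always yields Q = 0, so a positive score indicates more within-module connections than expected by chance.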
All experimental treatments were repeated 50 times with different stochastic events (i.e., using different random seeds). Analyses of evolved networks focus on the best performing network (with regard to the primary objective, with the secondary objective used to break ties) at the end of each trial. All tests of statistical significance apply the Mann-Whitney U test.
5 Results and Discussion
To understand the role of the structural objectives outlined in Figure 1 in guiding neuroevolution, we measure performance, modularity, and diversity in evolving populations guided by each structural objective. We also compare the modularity diversity objective to the powerful technique of encouraging behavioral diversity. Finally, we test the techniques on a non-modular problem, demonstrating that even problems without any obvious structure can benefit from a structurally diverse population. Figures present medians, bootstrapped 95% confidence intervals, and markers where there are significant differences between an indicated treatment and the others.
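For readers unfamiliar with bootstrapped confidence intervals of the median, the Python sketch below shows one common variant, the percentile bootstrap. The number of resamples, the seed, the function name, and the example scores are assumptions for illustration; the exact bootstrap procedure used for the figures is not specified here.

```python
import random
import statistics


def bootstrap_median_ci(samples, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the median of `samples`."""
    rng = random.Random(seed)
    # Median of each resample drawn with replacement, sorted ascending.
    medians = sorted(
        statistics.median(rng.choices(samples, k=len(samples)))
        for _ in range(n_resamples)
    )
    lower_index = round(alpha / 2 * n_resamples)
    upper_index = n_resamples - lower_index - 1
    return medians[lower_index], medians[upper_index]


# Hypothetical final performance scores from repeated runs of one treatment.
scores = [0.81, 0.93, 0.88, 1.00, 0.79, 0.95, 0.90, 0.86, 0.97, 0.84]
low, high = bootstrap_median_ci(scores)
print(statistics.median(scores), low, high)
```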
5.1 Performance
5.1.1 The Retina Problem
We compared neuroevolution guided only by the performance of evolving networks (PA) to that guided by each of the structural objectives outlined in Table 1. On the clearly decomposable retina problem, the treatments converging fastest are the two opposites of (1) searching for a diverse set of modular decompositions (ModDiv) and (2) searching for a single, user-defined modular decomposition (UserMod) (see Figure 7a).
As expected, the general modularity objective (Q-Mod) produces significantly more modular networks than all other treatments (see Figure 7b). We also observe that the user-defined structural guidance has a dramatic effect on the problem decomposition in evolving networks: Within a few hundred generations this objective leads the average network to be fully aligned with the user-defined problem decomposition (see Figure 7c).
Finally, we measure the average structural diversity in the population. The result (see Figure 7d) confirms the ordering we suggested in Figure 2: Of the three structural objectives, user-defined modularity results in the least structural diversity, since it exploits a single modular decomposition. Structural diversity as a guiding objective (ModDiv) has the opposite effect: Exploring a wide variety of ways to decompose the problem. Searching for general modularity (Q-Mod) results in an intermediate level of population diversity. Using performance as the only objective results in the lowest level of structural diversity overall, since networks without any structural objective tend to become very densely connected, leaving all input neurons in the same module (see Figure 9a).
Previous work on guiding evolution towards more modular networks on the retina problem has indicated that having an equally strong pressure on performance and structural objectives may lead evolution towards pathological, poorly performing structures (Clune et al., 2013). We therefore also tested applying the modularity-objective probabilistically, affecting selection only 25% of the time, as proposed in Clune et al. (2013). While this does improve the performance of the general modularity-objective, both UserMod and ModDiv still reach the optimal solution faster (see Supplementary Material Figure 1).
5.1.2 The Robot Locomotion Problem
The robot locomotion problem benefits the most from guidance by a structural diversity objective, which significantly outperforms all other treatments (see Figure 8a). The performance of the user-defined modularity pattern is weaker on this problem; we believe the reason is that the problem has a less obviously modular structure. However, it cannot be ruled out that a different recommended decomposition could improve the performance of the UserMod treatment. This highlights a limitation of the technique: It requires the user to correctly identify the right way to decompose the neural network. As on the retina problem, we see Q-Mod producing the most modular structures (see Figure 8b) and the same relative ordering of how diverse the generated solutions are (see Figure 8d).1
5.2 Neural Network Structures
In this section, we show and analyze the structure of final evolved networks. The presented ANNs are all “winners” of their respective evolutionary run, meaning they performed the best on the primary objective and, in case of ties, outperformed others on the secondary objective. We focus on median results from the 50 repetitions of each treatment, since they reveal the most interesting differences. All treatments occasionally reach very good performance—the main advantage of structural guidance is that very good performance is reached much more frequently.
5.2.1 The Retina Problem
Figure 9 shows final evolved networks for the retina problem. The Performance Alone treatment (see Figure 9a) results in entangled networks without any modular structure on input neurons. Both general modularity (Q-Mod) and user-defined modularity as a guiding objective result in networks frequently matching the recommended problem decomposition (see Figures 9b and 9c) but as one might expect, UserMod tends to do so more frequently (92% of UserMod networks match the recommended decomposition, vs 22% of Q-Mod networks). Guiding evolution towards a diverse collection of modularity patterns has the effect of producing networks with unexpected, yet well-performing problem decompositions (see Figure 9d).
5.2.2 The Robot Locomotion Problem
The neural networks evolved for the robot locomotion problem reveal interesting properties about the applied structural objectives. Guiding evolution with the user-defined modularity pattern from Figure 4b results in 42% of the winner-networks perfectly matching the recommended structure (see Supplementary Material Figures 9–10). On this problem, guiding evolution with general modularity as the objective (Q-Mod) never results in reaching the recommended pattern (see Supplementary Material Figures 11–12). Still, the performance scores of Q-Mod and UserMod are comparable. This indicates that there are more alternative decomposition patterns to exploit for this problem; there is a less clear relationship between modularity patterns and performance. Again we see the networks guided by modularity diversity outperform the others by reaching unexpected, well-performing decompositions (see Supplementary Material Figures 13–14).
5.3 Structural Versus Behavioral Diversity
To test how our structural diversity technique compares to the powerful technique of encouraging behavioral diversity, we evolved neural networks with the behavioral diversity measurement outlined in Section 4.4 as a guiding objective, and compared the results to networks evolved with the modularity diversity objective. Figure 10 shows the resulting performance on the retina and robot locomotion tasks. For the retina problem, structural diversity leads to the best-performing solutions significantly faster than behavioral diversity. For the locomotion problem, performance of the two is similar.
We hypothesize that the reason structural diversity does not outperform behavioral diversity on the robotic locomotion problem is that for this problem, the structure of evolving networks is not very indicative of their potential performance—this is supported by the finding that a general pressure towards modular networks never leads to the user-recommended structure for this problem (see Supplementary Material Figures 11–12). For the retina problem, structure and performance are more closely related, as indicated by the fact that the user-defined structure is reached almost immediately for the UserMod-treatment on this problem (see Figure 7c). This makes a population rich in structural diversity the best guiding objective for the retina problem.
We consider it an important topic of future research to work towards a better understanding of which kind of problems can gain the most from guidance from high-level structural objectives.
5.4 A Non-Modular Problem
By making all eight inputs to the retina problem a single “pattern detector,” the problem becomes non-modular (see Section 3.1.1). The recommended decomposition for this non-modular problem was the same pattern as before (see Figure 4a). Since there is no modular structure in this problem, there is no other recommended decomposition that we expect to be a good guide for evolution here. As in the modular version, the three modularity-inducing objectives have differing effects with regard to the amount of modularity (Q), diversity, and specific modularity patterns in evolved networks (see Figure 11). As one might expect, this non-modular problem no longer benefits from the guidance of the modularity-maximizing or user-defined modularity objective (see Figure 11a). However, the modularity diversity objective still improves performance significantly. Our interpretation is that a diverse set of high-level network structures helps guide evolution, independently of the structure of the target problem.
5.5 Scaling Up
One important direction for future experiments is to investigate the ability of the structural objectives to guide evolution on more complex problems, including more challenging simulated reinforcement learning problems and real-world tasks. Evolutionary algorithms have recently been demonstrated to be a viable technique for challenging reinforcement learning problems, rivaling the performance of popular backpropagation-based deep learning techniques (Such et al., 2017; Salimans et al., 2017). Further, it was recently demonstrated that techniques for guiding neuroevolution by encouraging novel behaviors (originally developed for small-scale evolved networks) are also valuable when scaling up to deep reinforcement learning tasks (Conti et al., 2018). It is well known that the structure of deep neural networks is very important for their performance, and evolutionary algorithms are emerging as a competitive way of finding effective architectures (Real et al., 2018). As such, it is likely that an evolutionary algorithm which searches not just for optimal performance, but which also explores many different ways of structurally organizing the network will find solutions that perform well in these deep neural networks.
Since detecting clusters of modules in a neural network is an NP-complete problem (Brandes et al., 2008), calculating the modular decomposition of a neural network may seem like an impediment to scaling up to very large networks. However, similar to previous papers applying modularity measurements as part of neuroevolution (e.g., Ellefsen et al., 2015; Clune et al., 2013), we apply the spectral optimization method in our modularity calculations, which gives good results in practice at a low computational cost (Newman, 2006a; Fortunato, 2010). Modularity calculation needs to be done only once per network, whereas measuring the performance of networks will usually require hundreds or thousands of passes of data through the network, as well as other computations, such as physics simulation (for robotics tasks) or training the neural network (when evolving network structures for supervised learning tasks). This performance measurement will in most cases be by far the most time-consuming part of neuroevolution.
The most successful and popular application of deep learning, including deep reinforcement learning, is solving difficult problems directly from pixel inputs with deep convolutional neural networks (LeCun et al., 2015; Mnih et al., 2015). These networks already have a very specific modular structure, inspired by visual processing in living creatures, and are very efficient at recognizing objects in images (Simonyan and Zisserman, 2014). Although evolutionary algorithms have been demonstrated to be a powerful technique for searching for high-level architectures for convolutional neural networks (Real et al., 2018), it is not likely that a neural network with a freely evolving structure (like the ones we study herein) would outcompete state-of-the-art convolutional networks. However, the networks applied in deep reinforcement learning usually have fully connected layers following the convolutions, which map high-level state representations to actions. While out of the scope of the current study, an intriguing opportunity is to apply structurally guided neuroevolution only to the latter part of the network—using, for example, a pretrained convolutional network as front-end (Poulsen et al., 2017). Exploring different neural network structures here could potentially aid evolution by guiding it towards networks grouping together states that require similar actions.
6 Conclusion
We have explored the ability of objectives related to the high-level structure of neural networks to act as guiding objectives for neuroevolution. Our results are in line with previous work demonstrating that modularity-encouraging objectives can guide neuroevolution (Clune et al., 2013), and add to that work by (1) showing that applying specific modular decompositions as guiding objectives aids evolution on tasks with very clear, modular structure and (2) showing that guiding evolution towards a population with a diverse set of modular decompositions increases performance both on modular and non-modular problems. This modularity-diversity technique is even demonstrated to produce results comparable to the powerful and popular behavioral diversity technique.
We also demonstrated that evolution guided towards a single user-defined decomposition does not perform well for tasks that do not have a very obvious structure. This agrees with previous work demonstrating that evolving neural networks often end up with unexpected decomposition patterns not agreeing with human intuition (Huizinga et al., 2016; Schrum and Miikkulainen, 2016b; Ellefsen et al., 2015). The technique of guiding evolving neural networks towards a diversity of decomposition patterns presents a way to take advantage of unexpected, creative solutions, allowing an automatic way to discover many functional problem decompositions. The fact that the modularity-diversity technique showed good performance on two very different types of neural network, and genotype–phenotype mappings, strengthens our confidence that it will be a valuable asset on a large range of neuroevolution problems.
Central to these findings is our new technique for measuring the distance between modularity patterns in pairs of neural networks. A key to this technique is that it compares high-level structures of neural networks, rather than their exact patterns of connectivity. Previous work has shown such lower-level structural distance measures to be a poor guide for neuroevolution (Mouret and Doncieux, 2009). Presumably, low-level structure in neural networks is not very indicative of how the network decomposes and solves a problem. Thus, by applying structural comparisons on a higher level, we both maintain computational complexity within reasonable limits, and capture the most important structural differences between networks.
This article has presented initial evidence of the power of guiding evolution with high-level structural objectives, and there are many important issues to address for future work. First, the structural distance measure we propose is calculated by analyzing how input or output neurons of networks are modularly separated. For some tasks, we may not expect the problem decomposition to be present at the input/output level, but only at intermediate stages of processing. In this case, we would need a way to compare high-level structure based on the modularity pattern of internal neurons. This could be facilitated by giving each internal neuron a separate ID, and keeping the number of neurons at each layer of the neural network fixed (but allowing connections to/from them to appear and disappear). Another interesting direction for further research is to better understand the relationship between high-level structural diversity and behavioral diversity as guiding objectives. Our results indicate that the former may work best for problems that have a clearly decomposable structure which is reflected also in the structure of high-performing neural networks, whereas the two methods performed similarly on a problem with a less obvious structure. Further systematic tests of the two with different problem types and behavior descriptors will help uncover the strengths and limitations of each method.
Acknowledgments
This work is supported by The Research Council of Norway as part of the Engineering Predictability with Embodied Cognition (EPEC) project 240862, the Collaboration on Intelligent Machines (COINMAC) project, under grant agreement 261645 and the Centres of Excellence scheme, project 262762. The experiments were performed on the Abel Cluster, owned by the University of Oslo and the Norwegian Metacenter for High Performance Computing (NOTUR), and operated by the Department for Research Computing at USIT, the University of Oslo IT Department, http://www.hpc.uio.no/. We thank Roby Velez for valuable feedback on the manuscript.
Note
Videos of the best and median resulting robot gaits across 50 replications of each treatment can be seen at https://youtu.be/ZbP1JgQffLI and https://youtu.be/cPS-7g65YwY, respectively.