This short article presents an abstract, tunable model of genomic structural change within the cell life cycle and explores its use with simulated evolution. A well-known Boolean model of genetic regulatory networks is extended to include changes in node connectivity based upon the current cell state to begin to capture some of the effects of transposable elements. The evolvability of such networks is explored using a version of the NK model of fitness landscapes with both synchronous and asynchronous updating. Structural dynamism is found to be selected for in nonstationary environments with both update schemes and subsequently shown capable of providing a mechanism for evolutionary innovation when such reorganizations are inherited. This is also found to be the case in stationary environments with asynchronous updating.
Numerous mechanisms have been identified through which changes in DNA sequences can occur in ways other than copy errors, particularly via transposable elements (e.g., see  for an overview). The significance of such mechanisms with respect to evolutionary innovation has recently been highlighted . This article presents an abstract model to begin consideration of transposable elements within a simple genetic regulatory network (GRN)—an extension of a well-known Boolean GRN formalism. The generic scheme is used to examine how such modification processes affect evolvability, using a recently presented extension to a well-known tunable fitness landscape model.
Random Boolean networks (RBN models)  were introduced as abstract models by which to explore aspects of genetic regulatory networks. RBN models consist of R genetic loci (nodes), each connected to B other randomly chosen nodes, with each performing a randomly assigned Boolean update function based upon the current state of those nodes. The emergent behavior of these discrete dynamic systems has been explored widely (e.g., see [10, 15] for overviews). Some of this work has included the use of simulated evolution to induce specific network-wide behavior (e.g., ) and to enable their use for computation (e.g., see  for an overview).
With the aim of enabling systematic exploration of the evolvability of such GRNs, RBN models have recently been combined with the NK model of fitness landscapes. The NK model  considers sets of N (binary) genes or traits, each of which depends upon K others within the set. As the degree of epistasis, K, is varied, features of the landscapes are affected. This abstract, tunable model has been used widely to explore aspects of evolution (e.g., see [15, 31] for overviews). In the combined form—termed the RBNK model —a simple relationship between the states of N randomly assigned nodes within an RBN model was assumed, such that their value is used within a given NK fitness landscape of trait dependences. This article adds dynamic connectivity based upon the current state of the GRN and its environment during its life cycle to the RBNK model to begin to explore the role of context-dependent feedbacks.
2 Adding Structural Dynamism to the RBN Model
2.1 Random Boolean Networks: A Simple GRN
As noted above, within the traditional form of RBN, a network of R nodes, each with B directed connections randomly assigned from other nodes in the network, all update synchronously based upon the current state of those B nodes. Hence those B nodes are seen to have a regulatory effect upon the given node, specified by the given Boolean function arbitrarily attributed to it; the details of RNA and/or protein actions are abstracted out. Since they have a finite number of possible states and they are deterministic, such networks eventually fall into an attractor. It is well established that the value of B affects the emergent behavior of RBN models in that attractors typically contain an increasing number of states with increasing B. Three phases of behavior were originally suggested through observation: ordered when B = 1, with attractors consisting of one or a few states; chaotic when B ≥ 3, with a very large number of states per attractor; and a critical regime around B = 2, where similar states lie on trajectories that tend to neither diverge nor converge (see  for discussions of this critical regime, e.g., with respect to perturbations). Subsequent formal analysis using an annealed approximation of behavior identified B = 2 as the critical value of connectivity for behavior change .
Traditional RBN models update synchronously, that is, a global clock signal is assumed to exist. It has long been suggested that this assumption is less than realistic for natural systems. Harvey and Bossomaier  presented an asynchronous form of RBN wherein a node is picked at random (with replacement) to be updated, with the process repeated R times per cycle to give equivalence to the synchronous case. The resulting loss of determinism means such networks no longer fall into regular cyclic attractors; rather they fall into either point attractors (one state) or so-called loose attractors where “the network passes indefinitely through a subset of its possible states” . Gershenson  reports that there are no loose attractors for B = 1, with the probability of one existing increasing thereafter for increasing B. Many forms of asynchronous updating are possible (e.g., see  for an overview), but the simple random scheme is used here.
2.2 Transposable Elements: Mobile DNA
Mobile genetic elements are DNA sequences that may be either copied or removed and then inserted at a new position in the genome . For example, retrotransposons use an RNA copy of themselves, whereas DNA transposons rely upon specific proteins for their “cutting and pasting” into new sites. The targeting of a new position ranges from the very specific, typically by exploiting sequence recognition proteins, to more or less arbitrary movement. These processes, insertion in particular, can be reliant upon proteins produced elsewhere within the genome. Transposons are found widely in both prokaryotes and eukaryotes, and they have been associated with many significant evolutionary innovations (e.g., see  for an overview). For example, analysis of the amplification dynamics of a class of retrotransposons indicates their involvement with the genetic changes that separate humans, chimpanzees, and gorillas (e.g., ). Mobile genetic elements function across phyla to various degrees. For example, current estimates are that human genomes typically contain around 100 active L-1 retrotransposons (e.g., ) at present and that at least 1 in every 50 humans carries a new L-1 insertion that occurred in parental germline cells (e.g., ).
Through movement within a genome, transposable elements can therefore change the regulation of a cell: Insertion into a gene will typically disrupt its coding sequence (i.e., it will be mutated); insertion next to a gene may affect its subsequent regulation (e.g., the mobile element's regulatory sequence may take control of the gene); the act of excision can leave behind DNA fragments, which cause a change in the sequence at that location; coding segments between transposons can be moved with them; and so on. The effects of such movement can be beneficial or detrimental to a cell, of course, and hence they are typically closely controlled—for example, by requiring non-encoded enzymes in the transposon itself as mentioned above, via silencing RNAs, or through changes in methylation status. It can be noted that methylation is also involved in epigenetics (e.g., see  for an overview), the other form of lifetime context-dependent genomic change—expressive as opposed to structural.
To add structural dynamism through transposon-like alterations to the traditional RBN model, some fraction of the R nodes (analogous to genetic loci) are extended to include a second set of B′ connections to defined nodes. Each such dynamic node also performs an assigned rewiring function based upon the current state of the B′ nodes, as shown in Figure 1. Hence on each cycle, each node updates its state according to the current state of the B nodes it is connected to, using the Boolean logic function assigned to it in the standard way. Then, if that node is also structurally dynamic, those B connections are altered according to the current state of the B′ nodes it is connected to, using its rewiring table. The moving of the B connections of a given node via the actions/states of the B′ nodes is therefore seen as an abstraction of the effect of a mobile element as discussed above, triggered by one or more of the B′ nodes, causing a change in the regulatory network, which affects the given node. Structural dynamism in general has previously been explored in related discrete systems (e.g., ).
In all known prior uses of transposons within simulated evolution, their role has only been considered at the point of reproduction—either as a form of recombination (e.g., ) or as DNA sequence duplication (e.g., )—and not as an ongoing, context-dependent restructuring processes during the life cycle of the parent.
3 Evolving RBN with Structural Dynamism
3.1 Evolving RBN
There is a small amount of prior work exploring the use of simulated evolution to design RBN models. Van den Broeck and Kawai  explored the use of a simulated-annealing approach to design feedforward RBN for the four-bit parity problem. Kauffman  evolved RBN to match a given attractor. Lemke et al.  did the same. The same approach has been used to explore attractor stability  and to model real regulatory network data (e.g., ). Sipper and Ruppin  evolved RBN for the well-known density task. Bull  has evolved synchronous and asynchronous RBN ensembles to solve simple machine learning problems. DiPaolo  evolved asynchronous RBN to match a given attractor, or more specifically, to exhibit defined rhythmic behavior. See  for a general overview.
3.2 The RBNK Model
As noted above, the RBN and NK models have recently been combined—into the RBNK model—to explore GRN and phenotype dependence . As shown in Figure 2, N nodes in the RBN are chosen as outputs, that is, their state determines fitness using the NK model. Kauffman and Levin  introduced the NK model to allow the systematic study of various aspects of fitness optimization (see ). In the standard model an individual is represented by a set of N (binary) genes or traits, each of which depends upon its own value and that of K randomly chosen others in the individual. Thus increasing K, with respect to N, increases the epistasis. This increases the ruggedness of the fitness landscapes by increasing the number of fitness peaks.
The NK model assumes all epistatic interactions are so complex that it is only appropriate to assign (uniform) random values to their effects on fitness. Therefore, for each of the possible K interactions, a table of 2K+1 fitnesses is created, with all entries in the range 0.0 to 1.0, such that there is one fitness value for each combination of traits. The fitness contribution of each trait is found from its individual table. These fitnesses are then summed and normalized by N to give the selective fitness of the individual. Exhaustive search of NK landscapes  suggests three general classes exist: unimodal when K = 0; uncorrelated, multi-peaked when K > 3; and a critical regime around 0 < K < 4, where multiple peaks are correlated. This differs slightly from Kauffman's  analysis, wherein it was suggested that correlation decreases as K → N—a difference perhaps caused by his use of non-exhaustive search to explore landscapes of increasing N. An element of correspondence between NK and RBN models with respect to the degree of connectivity and typical properties can therefore be noted.
The combination of the RBN and NK models enables a systematic exploration of the relationship between phenotypic traits and the genetic regulatory network by which they are produced. In this article, following , a simple scheme is adopted: N phenotypic traits are attributed to arbitrarily chosen nodes within the network of R genetic loci, but with environmental inputs applied to the first N loci (Figure 2). Hence the NK element creates a tunable component to the overall fitness landscape with behavior (potentially) influenced by the environment. Note that it has previously been shown how typical RBN behavior can be affected by external inputs (interference—e.g., ).
Following , the simple case of a greedy, genetic hillclimber is considered here. Each RBN model is represented as a list of integers to specify each node's start state, Boolean function, B connection IDs, B′ connection IDs, and connection-change table entries (see Figure 2), and whether it is a dynamic node or not. Mutation can therefore either (with equal probability): alter the Boolean function of a randomly chosen node; alter a randomly chosen B connection (used as the initial connectivity if a dynamic node); alter a node start state; turn a node into or out of being a dynamic rewiring node; alter one of the rewiring entries in the lookup table if it is a dynamic node; or alter a randomly chosen B′ connection, again only if it is a dynamic node. A single fitness evaluation of a given GRN is ascertained by updating each node for 100 cycles from the genome-defined start states. At each update cycle, the value of each of the N trait nodes in the GRN is used to calculate fitness on the given NK landscape. The final fitness assigned to the GRN is the average over 100 such updates here. A mutated GRN becomes the parent for the next generation if its fitness is higher than that of the original. In the case of fitness ties the number of dynamic nodes is considered, with the smaller number favored, the decision being arbitrary upon a further tie. Hence there is a slight selective pressure against structural dynamism here.
In the following, R = 100, N = 10, and results are the averages of 100 runs (10 runs on each of 10 landscapes per parameter configuration). Evolution is run for 50,000 generations—when fitness stasis is typically seen. As in , 0 < B ≤ 5 and 0 ≤ K < 5 are used.
4.1 Stationary and Nonstationary Fitness Landscapes
Figure 3 shows examples of the typical evolutionary behavior of the structurally dynamic, synchronous RBN model on two types of NK landscape. On the left, a traditional single fitness landscape is used and a constant input of N 0's applied to the RBN model. As can be seen, the fraction of dynamic nodes within the GRN stays at zero. On the right, a nonstationary version of the model is used such that, after 50 update cycles on one NK landscape, with all 0s applied as the input, fitness is then ascertained from a second (random) NK landscape for the remaining 50 cycles, with all 1s applied as the input. As can be seen, the percentage of dynamic nodes stabilizes at around 5%. Hence structural dynamism is selected for in the nonstationary case. This latter version of the model was somewhat motivated by the growing number of examples of environmentally triggered—typically under stress conditions—genomic rearrangements found in a wide variety of organisms (e.g., see ). Figure 4 shows further examples of how this result holds for all B < 3 GRNs regardless of the underlying topology of the fitness landscapes (T-test, p < 0.05). Analysis of the rewiring behavior in the low-B cases shows that the dynamic nodes typically fire for only the first few update cycles after both initialization and the switch in input halfway through the life cycle. Fitness is significantly decreased (T-test, p < 0.05) for B > 3 in all cases (i.e., those that are traditionally chaotic), as previously reported , where there is also no significant difference in structure (T-test, p ≥ 0.05). The same general results were found on varying the size of the networks (e.g., R = 50 or 200); the final percentage of dynamic nodes tends to vary proportionally with R (≈2.5% and 10%, respectively), but not with consistent statistical significance (not shown). Given that both the number and the size of attractors are known to be proportional to R , this is perhaps to be expected. Thus it appears that in low-connectivity networks the evolutionary process is exploiting structural dynamism to help shape the attractor space of the RBN model so that high fitness can be reliably reached, according to the environmental input and GRN state. The capacity for structural dynamism is not selected for in static environments with synchronous updating under the same conditions.
4.2 Inherited Rewiring: A Source of Evolutionary Innovation
In the above an offspring's nodes were initialized according to their genome specification, as in the previous work on evolving RBN, regardless of the final connectivity pattern of the parent due to the effects of any dynamic nodes it contained. That is, genomic rearrangements were not inherited. Not least because of the considerable efforts invested by cells to avoid copy errors, it has long been argued that transposon-mediated changes are a principal source of heritable variation (e.g., ). This can be explored with the model.
Figure 5 shows the evolutionary behavior of the dynamic RBN model when the parent's final network structure and node states are inherited by the offspring. The very first RBN are assigned random connectivity and node start states. The traditional random mutation operations were also reduced, so that they can either: alter the Boolean function of a randomly chosen node; turn a node into or out of being a dynamic rewiring node; alter one of the rewiring entries in the lookup table if it is a dynamic node; or alter a randomly chosen B′ connection, again only if it is a dynamic node. That is, the rewiring behavior becomes the principal source of connectivity variation for evolution, as opposed to random copy errors. The results indicate there is no significant change in fitness or the percentage of dynamic nodes (T-test, p ≥ 0.05) from the previous version for any of the B and K combinations used (compare with Figure 4). That is, there is no significant change in the evolutionary process when the effects of the transposable elements are inheritable. Moreover, while it is not consistently statistically significant, there is a trend to increased average percentages of dynamic nodes in the cases where they are positively selected for, namely, low B, for all K, in the nonstationary case.
4.3 Asynchronous RBN Models
The same experiments have also been undertaken using asynchronous node updating within the GRN described above. Figure 6 shows how the percentage of dynamic nodes stabilizes at around 5–10% in both the stationary and the nonstationary case. Figure 7 shows further examples of how this result holds for all combinations considered. Fitness is again significantly decreased (T-test, p < 0.05) for high B in all cases, as previously reported . It can also be seen that in the nonstationary case there are significantly more dynamic nodes for low-B networks (T-test, p < 0.05) than for B > 2. The percentage of dynamic nodes is significantly higher than in the equivalent B < 2 synchronous RBN model. Again in contrast to the synchronous RBN model above, analysis of the rewiring behavior in the low-B cases shows that the dynamic nodes fire throughout the life cycle. Hence structural dynamism becomes more beneficial in low-connectivity networks when the underlying dynamics are more complex—the rewiring behavior appears to be exploited to affect the loose attractors during the execution of the asynchronous RBN model so that high-fitness solutions are consistently found.
Figure 8 shows how, as with synchronous updating, performance is not significantly altered when the rewiring is inherited by offspring, again with a trend to an increase in the average percentage in dynamic nodes.
4.4 Random Dynamism
Figure 9 (left column) shows example results from further exploring the positive selection for dynamic nodes in a synchronous RBN model for nonstationary environments. Here the entries in the lookup tables for the rewiring of B connections in any dynamic nodes and their B′ connections were reassigned arbitrarily in offspring, not inherited from the parent. Fitness is typically significantly decreased (T-test, p < 0.05) when B < 3 and K < 4. This again reinforces the view that evolution is able to exploit rewiring for low-connectivity networks to shape the attractor space, with the caveat that it is most beneficial when the underlying fitness landscape is largely correlated, as perhaps might be expected (see Section 3.2); when the positions of fitness optima are structured, the rewiring is not simply a source of purely random variation that alters GRN behavior so that they are better suited to nonstationary environments. Note also that the rewiring is still selected for in the K > 3 cases, indicating that rewiring can also serve as a useful source of variation in less structured fitness landscapes as well.
Figure 9 (right column) also shows results from the same experiment with asynchronous updating. In contrast to the synchronous case, there is typically no statistically significant (T-test, p ≥ 0.05) difference in fitness reached when B < 3 for any K. The same was found for the stationary environment (not shown). Moreover, there is no significant difference in the percentage of dynamic nodes within the networks either (T-test, p < 0.05). That is, the evolutionary process actively selects for rewiring behavior to some degree, even when it is essentially random. This is perhaps somewhat unexpected, since the rewiring means a larger number of nodes will be changing state per update cycle than in the traditional case—for example, there will typically not be any point attractors, which might be expected to be easier than loose attractors for evolution to shape to maintain consistently high-fitness behavior. While the attractor space of standard asynchronous RBN models is less well understood (e.g., see ), it is known that the basins change and that network states can be in more than one basin . The increase in nodes changing state due to random alterations in regulatory structure might therefore be increasing the degree of separation between the basins, thereby easing the evolutionary search process.
These results indicate that structural changes in an evolving regulatory network can be beneficial in different ways, depending upon the environment and the underlying dynamics of the network, ranging from creating simple heuristic-like control of behavior to exploiting purely random variation.
For over 60 years, molecular biologists have been identifying a wide variety of mechanisms through which changes in DNA sequences occur in ways other than through copy errors. Specific biochemical processes generate novelty, via transposable elements, through targeted DNA restructuring, to varying degrees of specificity, depending upon the internal and external states of a GRN during the organismal life cycle. This article has presented an extended version of a well-known abstract GRN model to begin to explore such mechanisms, with respect to both their temporal behavior and their potential to act as a source of evolutionary innovation.
It has been shown that simple structural dynamism is positively selected for in nonstationary environments with low network connectivity regardless of the GRN updating scheme used, and in stationary environments with asynchronous updating. Moreover, any genomic rearrangements occurring during a parent's life cycle can be inherited by the offspring without detriment. This was shown to be the case even though the traditional source of direct (random) connection mutations was removed. The model still relies upon traditional copy error mutations to alter the structure regulatory circuits (table entries), and hence structure is still varied indirectly through its actions under selection. Further, randomly assigned rewiring capabilities were shown to be as beneficial as inheritable capabilities with asynchronous updating: Future work should explore how the attractors are changed.
Based upon these findings, a number of extensions can be envisaged in the near future, including (1) the use of retrotransposon-like copying, as opposed to the DNA transposon-like cutting and pasting here, thereby causing changes in the size of the GRN (see ), and (2) consideration of altering the binding process (i.e., the Boolean function) at each node. It can also be noted that a rewiring process has been explored that adjusts gene connections relative to their current position (e.g., within a range of ±5 nodes), as opposed to the explicit node changes used here with the same general behavior seen (not shown). Future work should further consider the degree and type of insertion site specificity.
Current work is exploring the use of these mechanisms within artificial GRNs for a number of benchmark computational tasks.
Department of Computer Science & Creative Technologies, University of the West of England, Bristol BS16 1QY, U.K. E-mail: Larry.Bull@uwe.ac.uk