Using Pictures to Visualize the Complexity of Gene Regulatory Networks

This paper proposes a new method to evaluate the complexity of a Gene Regulatory Network (GRN). It is based on the generation of pictures. In addition to being visually interesting, the pictures show the capacity of the GRN to produce smooth and/or sudden transitions, fractal-like complexity and regularities. We have also studied the influence of the size of the GRN on the complexity of the generated pictures.


Introduction
In nature, developmental processes are able to produce very large and very complex structures. Based on cells driven by a gene regulatory network, the growth process is able to produce organisms composed of billions of specialized cells organized so that they can act in their environment. Over the past years, many researchers in the field of artificial embryogenesis have proposed various developmental models that are more or less biologically plausible. These works are mainly based on gene regulation, with two leading models (Eggenberger, 1997; Banzhaf, 2003). However, if we focus only on the generation of morphologies (or shapes) with specialization, the results are limited in comparison to what nature is able to produce. Among the best results are 2-D or 3-D colored shapes in which the colors represent cell specialization (Joachimczak and Wróbel, 2008; Doursat, 2008; Cussat-Blanc et al., 2011).
Our main project is to use a cell-based developmental model to generate robot morphologies. A cellular model is used to develop an artificial organism that is evaluated in a physics simulator (Cussat-Blanc and Pollack, 2012). A gene regulatory network controls the behavior of the cells. It allows the cells to orient their division plane, to differentiate into a particular cell type, or to choose between a symmetric division (no cell specialization) and an asymmetric one (one cell is specialized whereas the second one is unspecialized). With this approach, we have already been able to generate interesting robot morphologies, as presented in figure 1, which are currently under construction with real robotic units.
In our opinion, a GRN is well suited for this range of problems because it is biologically plausible. Because nature proves that this approach works, we can expect GRNs to scale up better than other existing methods. However, for now, the morphologies are far from what nature is able to produce. To try to understand why an artificial Gene Regulatory Network (GRN) cannot produce shapes as complex as those of a real regulatory network, we propose in this paper to focus on the regulatory network itself and to remove the cell-based developmental model usually plugged into this system. Instead, the genotype-phenotype mapping translates pixel addresses to colors. We call it a pixel mapping. The earliest use of such imaging that we know of is the visualization of learning results on the Intertwined Spiral problem (Fahlman, 1990).
Many generative methods exist to produce pictures. They took inspiration from Karl Sims' work, in which he used a blind watchmaker to evolve symbolic expression rules that produce images (Sims, 1991). The closest approach to ours is probably the CPPN-based approach of Secretan et al. (Secretan et al., 2008). They propose an online tool to generate pictures. In a CPPN, the coordinates of a unit (here a pixel) are used to modify the weights of a NEAT network. For picture generation, the output of the neural network evolved by the NEAT algorithm is the pixel color. With the same objective, David Hart used genetic programming to generate interesting pictures (Hart, 2007). His approach is based on a set of predefined functions that an evolutionary algorithm combines. Once again, the coordinates of the pixels are used as inputs of the system. Romero and Machado provide a full survey of the state of the art in evolutionary art (Romero and Machado, 2007). In this review, many other approaches are presented.
DOI: http://dx.doi.org/10.7551/978-0-262-31050-5-ch064
In this work, we have used a GRN to generate pictures. The results we have obtained were unexpected: the pictures generated are very complex, with or without regularities, and are surprisingly aesthetic. The GRN can generate various complex structures in the same picture, producing smooth or sudden transitions between the colors. Some fractal-like properties have also been observed in many pictures.
This paper is organized as follows. The next section introduces the functioning of a real gene regulatory network. It also details our implementation of the regulatory network. Then, we propose a method to use the regulatory network to generate pictures. We also present the blind watchmaker approach used to evolve our regulatory network. Next, we present a set of pictures obtained with our system. The discussion describes the capacity of the GRN and proposes a study of the influence of the size of the GRN on the complexity of the pictures. Finally, the paper concludes with the future work opened up by this approach.

Background on artificial regulatory networks
Many current developmental models rely on artificial GRNs to simulate cell differentiation. These systems are more or less inspired by the gene regulation of living systems. In living systems, the cells of an organism have several functions. They are described in the organism's genome and their expression is controlled by a regulatory network (Davidson, 2006). Cells use external signals collected from protein sensors localized on the membrane to activate or inhibit the transcription of the genes. The gene expression determines the cells' behaviors.
Eggenberger first used a GRN to generate a 3-D organism able to move in its environment by modifying its morphology (Eggenberger, 1997). Reil then proposed a biologically plausible model with a genome defined as a vector of numbers (Reil, 1999). Here, each gene starts with the sequence (0101), named the "promoter". A graph is then used to visualize gene activations and inhibitions over time in randomly generated networks. Observations revealed the existence of various patterns such as gene activation sequencing, chaotic expressions and cyclic expressions. The author also pointed out that the system was resistant to random deterioration of the genomes. Banzhaf also described an artificial GRN model close to real-world gene regulation (Banzhaf, 2003), detailed further below. Starting from these seminal models, many variations have been explored in order to address various concerns and applications. Several works addressed artificial embryogeny problems with GRN models ranging from cellular automaton modeling (Chavoya and Duthen, 2008) to stripped-down versions of GRNs combined with complex developmental systems (Joachimczak and Wróbel, 2008; Doursat, 2008). Some works have also addressed control problems, using a GRN as a control function that maps a virtual robot's sensory inputs to its motor actuator values. This has been applied in various setups, from foraging agents (Joachimczak and Wróbel, 2010) to pole balancing (Nicolau et al., 2010).

Our implementation of the regulatory network
We have based our regulatory network on Banzhaf's model (Banzhaf, 2003). He designed it to be as close as possible to a real gene regulatory network. As DNA is composed of a sequence of nucleotides, Banzhaf's network is encoded within a sequence of bits. Just as a real gene starts with a particular sequence of nucleotides (e.g., TATA), a gene in Banzhaf's network starts with a particular sequence of 8 bits named the "promoter". A gene is then encoded next to this sequence by five 32-bit integers, named the "sites". This mechanism allows the generation of a variable number of genes in a fixed-size chromosome. However, as in nature, it also generates a certain amount of noncoding DNA, since the probability of a promoter occurring is very low ($2^{-8}$). This noncoding DNA is thought to be used in nature to protect the genome from mutation by lowering the probability that a mutation will affect a coding nucleotide.
Banzhaf's model was designed neither to be evolved nor to control any kind of agent. However, Nicolau et al. used an evolution strategy to evolve the GRN to control a pole-balancing cart (Nicolau et al., 2010). Even though the cart showed consistent behaviors, evolving the GRN was difficult. In our opinion, this difficulty is due to (1) the noncoding DNA and (2) the dynamics of the network. Following these observations, we have decided to modify the encoding of the regulatory network and its dynamics. In our model, a gene regulatory network is defined as a set of proteins. Each protein has the following properties:
• The protein identifier, coded as an integer between 0 and p. The upper value p of the domain can be changed in order to control the precision of the GRN. In Banzhaf's work, p is equivalent to the size of a site, which is 32 bits. We have kept the same precision by setting p to 32.
• The enhancer identifier, coded as an integer between 0 and p. The enhancer identifier is used to calculate the enhancing matching factor between two proteins.
• The inhibiter identifier, coded as an integer between 0 and p. The inhibiter identifier is used to calculate the inhibiting matching factor between two proteins.
• The type, which determines whether the protein is an input protein (whose concentration is given by the environment of the GRN and which regulates other proteins but is not regulated), an output protein (whose concentration is used as an output of the network and which is regulated but does not regulate other proteins) or a regulatory protein (an internal protein that regulates and is regulated by other proteins).
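The protein description above can be sketched as a small data structure. This is only an illustrative sketch: the names `Protein`, `ProteinType` and `random_protein`, and the numeric codes 0, 1 and 2 for the three types, are assumptions and are not taken from the paper.

```python
from dataclasses import dataclass
from enum import IntEnum
import random

P = 32  # upper bound of the identifiers, set to 32 in the text


class ProteinType(IntEnum):
    # The numeric codes are an assumption; the text only says the type lies in [0, 2].
    INPUT = 0       # concentration set by the environment; regulates, is not regulated
    OUTPUT = 1      # is regulated, does not regulate; read as a network output
    REGULATORY = 2  # regulates and is regulated


@dataclass
class Protein:
    id: int    # protein identifier, in [0, P]
    enh: int   # enhancer identifier, in [0, P]
    inh: int   # inhibiter identifier, in [0, P]
    type: ProteinType
    concentration: float = 0.0


def random_protein(rng, ptype=None):
    """Draw a protein with uniformly random identifiers (hypothetical helper)."""
    if ptype is None:
        ptype = ProteinType(rng.randint(0, 2))
    return Protein(rng.randint(0, P), rng.randint(0, P), rng.randint(0, P), ptype)
```

For instance, `random_protein(random.Random(0), ProteinType.INPUT)` draws an input protein whose three identifiers lie between 0 and 32.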
This encoding removes the problem of noncoding DNA in Banzhaf's approach. Each integer is used in the regulatory network, and a modification of any of them automatically implies a modification of the network. The dynamics of the GRN are calculated as follows. First, the affinity of a protein a with another protein b is given by the enhancing factor $u^{+}_{ab}$ and the inhibiting factor $u^{-}_{ab}$, which are computed from the identifiers: $id_x$ is the identifier, $enh_x$ is the enhancer identifier and $inh_x$ is the inhibiter identifier of protein x. The GRN's dynamics are calculated by comparing the proteins two by two using the enhancing and the inhibiting matching factors. For each protein i of the network, a global enhancing value $g_i$ and a global inhibiting value $h_i$ are computed from N, the number of proteins in the network, $c_j$, the concentration of protein j, and $u^{+}_{\max}$ (resp. $u^{-}_{\max}$), the maximum enhancing (resp. inhibiting) matching factor observed. β is a control parameter described hereafter.
The final modification of the concentration of protein i is given by a differential equation in which Φ is a function that keeps the sum of all protein concentrations equal to 1. β and δ are two constants that set the reaction speed of the regulatory network. The higher these values, the more sudden the transitions in the GRN; the lower they are, the smoother the transitions.
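The display equations for the three quantities described above do not appear in this text. The block below is a hedged reconstruction, assuming the form commonly used in Banzhaf-style regulatory networks. The constant p in the matching factors, the index order of $u_{ij}$ and the placement of β inside the exponent are assumptions and not the authors' confirmed formulas.

```latex
\begin{align}
u^{+}_{ab} &= p - \lvert enh_a - id_b \rvert, &
u^{-}_{ab} &= p - \lvert inh_a - id_b \rvert \\
g_i &= \frac{1}{N} \sum_{j=1}^{N} c_j \, e^{\beta \, (u^{+}_{ij} - u^{+}_{\max})}, &
h_i &= \frac{1}{N} \sum_{j=1}^{N} c_j \, e^{\beta \, (u^{-}_{ij} - u^{-}_{\max})} \\
\frac{dc_i}{dt} &= \frac{\delta \, (g_i - h_i)}{\Phi}
\end{align}
```

Under this reading, every exponent is at most zero, so each term of $g_i$ and $h_i$ is bounded by $c_j$, and larger values of β make the best-matching proteins dominate the sums more sharply.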
Whereas the input proteins of a GRN can be used to describe the current state of the environment, the output proteins select the level of application of each possible action. The network can also be easily encoded in a genome to be evolved by an evolutionary algorithm. The next section presents how the GRN is used to generate pictures and how it is encoded in a genome.
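One update of the dynamics described in this section could look like the following sketch. It rests on several assumptions that the text does not confirm: the matching factor is taken as p minus the distance between identifiers, the differential equation is integrated with an explicit Euler step of size 1, concentrations are clipped at zero, and Φ is implemented as a renormalization of the non-input proteins. The function name `step` and the numeric type codes are hypothetical.

```python
import math

P = 32  # identifier upper bound, as in the text
INPUT, OUTPUT, REGULATORY = 0, 1, 2  # assumed numeric codes for the protein types


def step(proteins, conc, beta, delta):
    """One assumed Euler step.

    proteins: list of (id, enh, inh, type) tuples
    conc:     list of concentrations, same order as proteins
    """
    n = len(proteins)
    # Inputs and regulatory proteins regulate; outputs and regulatory proteins are regulated.
    regulators = [j for j, p in enumerate(proteins) if p[3] != OUTPUT]
    regulated = [i for i, p in enumerate(proteins) if p[3] != INPUT]

    # Assumed matching factors: p minus the distance between identifiers.
    u_plus = {(i, j): P - abs(proteins[i][1] - proteins[j][0])
              for i in regulated for j in regulators}
    u_minus = {(i, j): P - abs(proteins[i][2] - proteins[j][0])
               for i in regulated for j in regulators}
    up_max = max(u_plus.values())
    um_max = max(u_minus.values())

    new = list(conc)
    for i in regulated:
        g = sum(conc[j] * math.exp(beta * (u_plus[i, j] - up_max)) for j in regulators) / n
        h = sum(conc[j] * math.exp(beta * (u_minus[i, j] - um_max)) for j in regulators) / n
        new[i] = max(0.0, conc[i] + delta * (g - h))

    # Assumed form of Phi: rescale the non-input proteins so that they sum to 1.
    total = sum(new[i] for i in regulated)
    if total > 0:
        for i in regulated:
            new[i] /= total
    return new
```

Input concentrations are left untouched by the update, which matches the statement that input proteins regulate but are not regulated.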

Picture generation

Binding between a GRN and a picture
To generate a picture with a GRN, the GRN calculates the RGB color of each pixel of the picture. To do so, the GRN has two inputs that correspond to the coordinates of the current pixel and three outputs, one for each color component. The coordinates (x, y) of a pixel are transformed into protein concentrations $c_x$ and $c_y$, scaled by the picture dimensions so that they do not overflow the network. Here $c_x$ (resp. $c_y$) is the concentration of the protein associated with the abscissa x (resp. the ordinate y) of the current pixel, and width and height define the size of the picture.
The resulting RGB component values are obtained from the output concentrations, normalized by their maxima over the picture. Here $out_r$ (resp. $out_g$ and $out_b$) is the value of the red (resp. green and blue) component for the current pixel, $c_r$ (resp. $c_g$ and $c_b$) is the concentration of the output protein associated with the red (resp. green and blue) component in the GRN (this concentration is always between 0 and 1) and $max_r$ (resp. $max_g$ and $max_b$) is the maximum concentration observed in the picture for the red (resp. green and blue) component.
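The exact scaling formulas are not recoverable from this text, so the following is only a sketch of the two mappings under stated assumptions: coordinates are divided by the picture dimensions, and each color component is the output concentration divided by its maximum over the picture and scaled to the 0 to 255 range. Both function names are hypothetical.

```python
def pixel_to_concentrations(x, y, width, height):
    """Map pixel coordinates to input concentrations (assumed plain normalization)."""
    return x / width, y / height


def concentrations_to_rgb(c_r, c_g, c_b, max_r, max_g, max_b):
    """Map output concentrations to 8-bit color components (assumed 0..255 scaling)."""
    def scale(c, m):
        # A channel whose maximum is zero is rendered as black.
        return 0 if m <= 0 else round(255 * c / m)
    return scale(c_r, max_r), scale(c_g, max_g), scale(c_b, max_b)
```

Dividing by the per-picture maximum means that each channel always uses its full range, whatever the absolute level of the output concentrations.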
Before the generation of the picture, the GRN is first run for 100 steps without any inputs in order to stabilize the concentrations. This is a very common technique because GRNs are known to oscillate during the first steps. After this initialization, the GRN is duplicated for each pixel of the picture and the duplicated GRNs are run for 25 more steps with the inputs corresponding to their pixels. The pixel colors are then calculated as explained before.
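The procedure above can be sketched as a generic rendering loop. The step counts (100 and 25) come from the text; everything else is an assumption made for illustration: the function name `render`, the callables `step_fn` and `read_outputs` that stand in for the GRN update and for reading the three output concentrations, the plain coordinate normalization and the 0 to 255 color scaling.

```python
import copy


def render(initial_state, step_fn, read_outputs, width, height,
           warmup_steps=100, pixel_steps=25):
    """Render a picture as a height x width grid of (r, g, b) tuples.

    step_fn(state, inputs) -> new state; inputs is None during warm-up.
    read_outputs(state)    -> (c_r, c_g, c_b) output concentrations.
    """
    # Warm-up: run the network without inputs so that concentrations stabilize.
    state = initial_state
    for _ in range(warmup_steps):
        state = step_fn(state, None)

    # One copy of the stabilized network per pixel, run with that pixel's inputs.
    raw = []
    for y in range(height):
        row = []
        for x in range(width):
            pixel_state = copy.deepcopy(state)
            inputs = (x / width, y / height)  # assumed normalization
            for _ in range(pixel_steps):
                pixel_state = step_fn(pixel_state, inputs)
            row.append(read_outputs(pixel_state))
        raw.append(row)

    # Normalize each channel by its maximum over the whole picture.
    maxima = [max(px[k] for row in raw for px in row) for k in range(3)]
    return [[tuple(0 if maxima[k] <= 0 else round(255 * px[k] / maxima[k])
                   for k in range(3))
             for px in row]
            for row in raw]
```

Because every pixel starts from the same stabilized state and is computed independently, the per-pixel loop is straightforward to parallelize.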

Encoding of the GRN
To be evolved by an evolutionary algorithm, the GRN is encoded into a genome with two independent chromosomes. The first chromosome encodes the set of proteins and the second one encodes the parameters of the dynamics, β and δ.
Because a GRN can have a variable number of proteins, the first chromosome is defined as a variable-length chromosome of indivisible proteins. Each protein is encoded with four integers: three between 0 and p for the three different identifiers and one in [0, 2] for the type of the protein.
If an evolutionary algorithm has to evolve this chromosome, the modification operators have to be redefined. First, the crossover consists in exchanging subparts of two different networks. Because proteins are indivisible, the crossover points have to be chosen between two proteins. This ensures the integrity of each sub-network. The local connectivity is thus kept, and only new links between the different sub-networks are created. The mutation can be applied in three equiprobable ways: mutating an existing protein by randomly changing one of its four integers, adding a new randomly generated protein, or removing one random protein from the network.
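The two operators described above can be sketched as follows. This is an illustration and not the authors' implementation: proteins are represented as 4-tuples, the crossover is assumed to be a one-point exchange with an independent cut point in each parent, and the function names are hypothetical.

```python
import random

P = 32  # identifier upper bound, as in the text


def random_protein(rng):
    """Three identifiers in [0, P] and a type in [0, 2]."""
    return (rng.randint(0, P), rng.randint(0, P), rng.randint(0, P), rng.randint(0, 2))


def crossover(parent_a, parent_b, rng):
    """Assumed one-point crossover; cut points fall between proteins, never inside one."""
    cut_a = rng.randint(0, len(parent_a))
    cut_b = rng.randint(0, len(parent_b))
    return parent_a[:cut_a] + parent_b[cut_b:]


def mutate(genome, rng):
    """Apply one of the three equiprobable mutations and return a new genome."""
    genome = list(genome)
    choice = rng.randrange(3)
    if choice == 0 and genome:
        # Change one of the four integers of an existing protein.
        i = rng.randrange(len(genome))
        fields = list(genome[i])
        k = rng.randrange(4)
        fields[k] = rng.randint(0, 2) if k == 3 else rng.randint(0, P)
        genome[i] = tuple(fields)
    elif choice == 1:
        # Add a new random protein at a random position.
        genome.insert(rng.randint(0, len(genome)), random_protein(rng))
    elif genome:
        # Remove one random protein.
        del genome[rng.randrange(len(genome))]
    return genome
```

Since whole proteins are the unit of exchange, a child only ever contains proteins that exist unchanged in one of its parents.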
In this work, the chromosome is ordered as follows: (1) the first two proteins are the two input proteins that correspond to the coordinates of the pixel, (2) the next three proteins are the three output proteins, one for the red component, one for the green and one for the blue, and (3) the remaining proteins are all regulatory proteins. Because one of the objectives is to study the impact of the size of the regulatory network on the complexity of its behavior, the size of this chromosome has been fixed and only the mutation of existing proteins is applied. All the experiments presented hereafter state the corresponding number of proteins.
The second chromosome only contains the constants β and δ. It is defined as a chromosome that contains 2 floating-point values. These values can evolve between 0.5 and 2. These bounds have been chosen empirically. If the values are less than 0.5, the GRN stays stationary. With high values, the GRN behavior is usually chaotic.
To evolve the GRN, we use a "Blind Watchmaker" interactive evolutionary algorithm, described in the next section.

Interactive evolution of the pictures
The blind watchmaker is a common name given to an interactive evolutionary method first proposed in 1986 by Richard Dawkins (Dawkins, 1986). He originally used this method to support the theory of natural evolution using a pedagogical model called biomorphs, fractal-like creatures generated with a small set of genes. More recently, this method gave birth to the field of interactive evolution. Many applications are nowadays based on this principle to solve various problems. For example, it has been used with genetic programming to generate realistic camouflage (Reynolds, 2011), or with HyperNEAT to generate 2-D pictures (Secretan et al., 2008) or 3-D shapes (Clune et al., 2010).
In this work, we first generate 9 random genomes. The 9 corresponding pictures are then produced and presented to the user. The user can then save the GRNs that have generated pictures the user likes and select one of the 9 pictures to be evolved. When a GRN is selected, the application generates 9 new pictures by mutating 10% of the selected GRN's genome. We have decided not to use the crossover operator in order to enhance the diversity of the generated pictures. For the same reason, the mutation rate has been deliberately chosen to be high. With this method, we have generated a pool of diversified pictures. The next section presents some of them and discusses the properties of the GRNs that generate these pictures.
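The interactive loop described above can be sketched as follows. The numbers 9 and 10% and the bounds 0.5 and 2 come from the text; the rest is assumed for illustration. In particular, "mutating 10% of the genome" is read here as resampling each value with probability 0.1, protein types are kept fixed to preserve the chromosome layout, and the `choose` callback stands in for the user's selection. All names are hypothetical.

```python
import random

P = 32                # identifier upper bound, as in the text
N_CANDIDATES = 9      # pictures shown to the user at each generation
MUTATION_RATE = 0.10  # assumed to be a per-value mutation probability


def random_genome(n_regulatory, rng):
    """Fixed layout: 2 input proteins, 3 output proteins, then regulatory proteins."""
    types = [0, 0, 1, 1, 1] + [2] * n_regulatory  # assumed type codes
    proteins = [[rng.randint(0, P) for _ in range(3)] + [t] for t in types]
    dynamics = [rng.uniform(0.5, 2.0), rng.uniform(0.5, 2.0)]  # beta, delta
    return proteins, dynamics


def mutate(genome, rng):
    """Resample each identifier and each dynamics constant with probability MUTATION_RATE."""
    proteins, dynamics = genome
    new_proteins = [[rng.randint(0, P) if k < 3 and rng.random() < MUTATION_RATE else v
                     for k, v in enumerate(p)]
                    for p in proteins]
    new_dynamics = [rng.uniform(0.5, 2.0) if rng.random() < MUTATION_RATE else v
                    for v in dynamics]
    return new_proteins, new_dynamics


def blind_watchmaker(choose, n_regulatory=12, generations=20, seed=None):
    """choose(population) returns the index of the genome picked by the user."""
    rng = random.Random(seed)
    population = [random_genome(n_regulatory, rng) for _ in range(N_CANDIDATES)]
    for _ in range(generations):
        selected = population[choose(population)]
        # No crossover: the next generation is made only of mutants of the selection.
        population = [mutate(selected, rng) for _ in range(N_CANDIDATES)]
    return population
```

In an interactive tool, `choose` would render the 9 genomes and wait for a click; in a test it can simply return a fixed index.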

Study of the complexity of the GRN
In order to visualize the complexity of the outputs generated by the GRN, we first used a GRN composed of 12 regulatory proteins (in addition to the 2 input proteins and the 3 output ones). With the blind watchmaker, we have evolved a set of pictures. Figure 2 shows some pictures obtained with this approach. These pictures have been selected from two runs of the blind watchmaker in the first two columns (one run per column) and from various runs in the last column.
First, we can observe the variety of the pictures obtained, both across different seeds (columns) and during the evolution of one seed (rows). Figure 3 shows the smooth changes generation after generation, even with a high mutation rate.
The complexity can also be visualized through the capacity of the GRN to produce smooth transitions between colors, as in pictures $A_3$ and $B_3$ of figure 2, or very sudden changes, as in picture $B_2$. Many pictures also present both types of transition, such as $A_1$, $B_4$ or $C_4$. This shows the capacity of the GRN to produce very different kinds of behaviors even with smooth modifications of the inputs, since a shift of one pixel in any direction produces only a very small modification of one input protein.
The GRN is also able to produce this complexity in very few generations (usually, about 15 to 20 generations are necessary to obtain very complex pictures). Once the first complex picture is obtained, the complexity does not increase, visually speaking. The high mutation rate allows a large diversity of generated pictures, even if the GRN seems to have converged: in a few generations, the blind watchmaker is able to generate new pictures completely different from those of the previous generations.
Finally, some pictures present regularities, such as pictures $C_3$ and $C_4$ of figure 2. The same patterns are repeated many times with few variations. For example, in picture $C_3$, the same stripes are repeated with varying widths but with close colors. In picture $C_4$, ovoid leaf-like shapes are repeated with a rotation around a central point. This property is very important because it can explain the capacity of a GRN to produce repeated sequences of actions with small variations. It shows how a GRN can produce multiple legs, branches or any kind of complex organ in living organisms.

Influence of the size of the GRN on the complexity
In the previous experiment, the number of regulatory proteins was arbitrarily set to 12. This value was chosen so that the generated pictures are interesting enough while keeping the GRN's size reasonable, in order to maintain interactivity with the user. To understand the importance of the size of the GRN, we have decided to generate pictures with GRNs that have 6 and 18 regulatory proteins. The most complex pictures obtained are presented in figure 4 for GRNs with 6 regulatory proteins and in figure 5 for GRNs with 18 regulatory proteins. Here, the pictures are taken from different runs. The complexity of the pictures obtained is comparable across the different tested sizes of GRN. However, with only 6 regulatory proteins, it was harder to generate images with … With 18 regulatory proteins, the same kind of pictures is generated as with 12 regulatory proteins. However, a bigger GRN seems to generate complexity faster than a smaller one: in all the runs we have made with 18 regulatory proteins, 5 to 10 generations were necessary to obtain interesting pictures, instead of 15 to 20 with 12 regulatory proteins.
Increasing the size of the GRN seems to reduce the time necessary to obtain complex behaviors. However, the computation time is also affected by an increase in the size of the GRN. As presented in figure 6, the CPU time increases both with the size of the pictures and with the number of regulatory proteins. In this experiment, we have used a 3.16GHz Intel Xeon CPU. The values presented here are an average of 50 runs made on randomly chosen GRNs obtained during different interactive evolution runs.
The main issue with the increase in computation time is the loss of interactivity of the software. It is important to find a good balance between the size of the pictures presented in the blind watchmaker and the number of regulatory proteins. In our experience, a GRN that contains 12 regulatory proteins is sufficient to generate interesting pictures. A GRN with 18 regulatory proteins generates the same kind of pictures but in fewer generations. Concerning the size of the picture, a 50x50 picture is hard to appreciate as a picture but is sufficient to judge its complexity. A 100x100 picture is already sufficient to observe some details.

Scalability of the approach
An interesting property of this approach is that the images are scalable: if a user likes a picture, it can be easily enlarged by running the same GRN at a higher resolution. The same picture will be generated with more details. This property is illustrated by figure 7, where we have zoomed in on a specific region of a picture generated by evolution. We zoomed in three steps (6, 12 and 3 times), for a total magnification of 216 times from the original picture (on the left) to the last picture (on the right).
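The three zoom steps multiply, which gives 6 × 12 × 3 = 216. The sketch below shows one way such a zoom could be computed, assuming the GRN inputs are normalized coordinates so that a zoom amounts to sampling a smaller window of input space at the same pixel resolution. The function names and the viewport representation are hypothetical.

```python
def zoom(viewport, factor, center):
    """Shrink a viewport (x0, y0, x1, y1) by `factor` around `center`."""
    x0, y0, x1, y1 = viewport
    cx, cy = center
    half_w = (x1 - x0) / (2 * factor)
    half_h = (y1 - y0) / (2 * factor)
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def sample_inputs(viewport, width, height):
    """Input coordinates fed to the GRN for each pixel of a width x height picture."""
    x0, y0, x1, y1 = viewport
    return [[(x0 + (x1 - x0) * i / width, y0 + (y1 - y0) * j / height)
             for i in range(width)]
            for j in range(height)]


# Three successive zooms as in figure 7; the zoom center is arbitrary here.
view = (0.0, 0.0, 1.0, 1.0)
for factor in (6, 12, 3):
    view = zoom(view, factor, (0.5, 0.5))
total_magnification = 1.0 / (view[2] - view[0])  # approximately 216
```

Each pixel of the zoomed picture is computed by the same GRN with inputs taken from the smaller window, which is why no detail has to be stored in advance.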
Zooming allows more and more details of the picture to appear. Transitory states of the regulatory network seem to be very complex. Some of them seem to have fractal properties, such as the top purple-yellow transition in the right-hand picture. Even zoomed 216 times, a lot of details are invisible on the transition, some red pixels appearing at different points of the transition.
This quantity of details has to be compared with the size of the GRN's encoding. Indeed, each protein is encoded with 4 integers (3 for the identifiers and 1 for the protein type). Because these integers are between 0 and 32, each fits in a single byte, so a protein can be encoded with 4 bytes. The size of a GRN is then 4 × nbProt + 16 bytes. The 16 additional bytes correspond to the two double-precision floating-point values that encode the constants β and δ used to control the GRN's dynamics. In this experiment, the GRN contains 17 proteins (2 inputs, 3 outputs and 12 regulatory proteins). Thus, the size of the GRN is 84 bytes (4 × 17 + 16), which is extremely small in comparison to all existing picture formats and to the level of detail generated by the GRNs. The GRN could be evolved to generate a given picture. This would produce a powerful compression algorithm, related to the IFS fractals of Barnsley (Barnsley, 1988).
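The size computation above can be checked with a small serialization sketch. The byte layout used here (four unsigned bytes per protein followed by two little-endian doubles for β and δ) is an assumption chosen to match the 4 × nbProt + 16 formula, and the function names are hypothetical.

```python
import struct


def pack_grn(proteins, beta, delta):
    """Serialize proteins as 4 unsigned bytes each, then beta and delta as doubles."""
    data = b"".join(struct.pack("<4B", *p) for p in proteins)
    return data + struct.pack("<2d", beta, delta)


def unpack_grn(data):
    """Inverse of pack_grn."""
    n = (len(data) - 16) // 4
    proteins = [struct.unpack_from("<4B", data, 4 * i) for i in range(n)]
    beta, delta = struct.unpack_from("<2d", data, 4 * n)
    return proteins, beta, delta


# 2 inputs, 3 outputs and 12 regulatory proteins, with arbitrary identifiers.
example = [(i, (2 * i) % 33, (3 * i) % 33, t)
           for i, t in enumerate([0, 0, 1, 1, 1] + [2] * 12)]
blob = pack_grn(example, 1.0, 1.5)
size_in_bytes = len(blob)  # 4 * 17 + 16 = 84
```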

Conclusion and perspectives
In this paper, we have used a gene regulatory network to generate pictures. We have used a direct encoding between the GRN and the pictures. The GRN provides the RGB values of each pixel of the picture according to its coordinates. This direct encoding is very common in the literature (Sims, 1991; Hart, 2007; Secretan et al., 2008). The interesting result of using a GRN instead of a CPPN or genetic programming is that the complexity of the generated pictures is inherent to the GRN. No function is used to control the input of the network. Moreover, the GRNs were able to produce fractal patterns and regularities in many pictures, which can be an interesting property when used to generate robot plans.
While there are other candidates for generative representations, such as grammars, L-systems or HyperNEAT, we believe that GRNs are the most authentic representation coming from nature. Due to their high non-linearity, they are impossible to design and must be evolved. We have shown that evolution can be effective in a blind watchmaker setting, and that artificial GRNs have utility both in generating robotic body plans and in generating interesting images.
The ease with which complex behaviors are obtained is surprising. Whereas most existing approaches need many generations to obtain them, few are necessary with the GRN. The excessive complexity generated by nonlinear dynamical systems like GRNs is both a blessing and a curse. It enables the evolution of highly complex and multifaceted structures in nature, but gaining control over the process computationally has proven to be fraught with difficulty.
If we want the system to be really usable for an artistic purpose, the generation time of the pictures has to be improved. Currently, only 50x50 thumbnails are generated to keep the evolution interactive. With a GRN that contains 12 regulatory proteins, it takes about 45 seconds to generate the 9 pictures. Even though the application is multithreaded, so that the generation time is divided by the number of cores, this is still the main limitation of the approach. However, the regulatory network could easily be expressed as matrix computations and then deployed on a graphics card. In this case, the computation time would be strongly reduced.
In conclusion, as a field, Artificial Life should reflect, algorithmically, on the various models which we take from Nature, such as Evolutionary Algorithms and Neural Networks. Gene Regulatory Networks are a newer instance of biologically inspired computational models, and so it behooves us to study them further to learn what their strengths and weaknesses are, especially when compared to other bio-inspired models. In this paper, we showed that GRNs can have complex, nonlinear behaviors, which nonetheless can be evolved fairly directly and can be measured using human perception on the combined output of tens of thousands of artificial cells. GRNs are the most plausible models for dealing with developmental processes, although L-systems, which are closer to symbolic AI, are probably more compact descriptions. Following work in interactive evolution using NEAT and HyperNEAT, we think that GRNs can be as useful, yet more biologically plausible, in the natural design of artificial life artifacts such as robots.

Figure 1: Examples of robot morphologies generated by the use of a cell-based developmental model controlled by a gene regulatory network.

Figure 3 :
Figure 2: Examples of generated pictures with 12 regulatory proteins in the GRN, taken in the same run (first 2 columns) or in various runs (last column).

Figure 4: Examples of pictures generated with GRNs that contain 6 regulatory proteins.

Figure 6: CPU time needed to generate an image with a GRN as a function of the size of the image and the number of regulatory proteins in the GRN.

Figure 5: Examples of pictures generated with GRN's that contain 18 regulatory proteins

Figure 7: Example of the scalability of generated pictures. The picture on the left side is the original one, evolved with 12 regulatory proteins in 28 generations. The second picture is an enlargement of the first one, magnified 6 times. The third picture is zoomed 12 times on the second picture, and the last one is zoomed 3 times on the penultimate one.