For the first time, a field programmable transistor array (FPTA) was used to evolve robot control circuits directly in analog hardware. Controllers were successfully incrementally evolved for a physical robot engaged in a series of visually guided behaviours, including finding a target in a complex environment where the goal was hidden from most locations. Circuits for recognising spoken commands were also evolved and these were used in conjunction with the controllers to enable voice control of the robot, triggering behavioural switching. Poor quality visual sensors were deliberately used to test the ability of evolved analog circuits to deal with noisy, uncertain data in real time. Visual features were coevolved with the controllers to automatically achieve dimensionality reduction and feature extraction and selection in an integrated way. An efficient new method was developed for simulating the robot in its visual environment. This allowed controllers to be evaluated in a simulation connected to the FPTA. The controllers then transferred seamlessly to the real world. The circuit replication issue was also addressed in experiments where circuits were evolved to be able to function correctly in multiple areas of the FPTA. A methodology was developed to analyse the evolved circuits which provided insights into their operation. Comparative experiments demonstrated the superior evolvability of the transistor array medium.
Evolvable hardware (EHW), or evolutionary electronics (Thompson, 1998)—the application of evolutionary search algorithms to the design of electronic circuits—developed rapidly during the 1990s into the 2000s, with landmarks including efficient circuit designs evolved in simulation (Koza et al., 1996), the first circuits evolved on real hardware (Thompson et al., 1996), and the first VLSI chip produced exclusively for evolution of electronic circuits at JPL, NASA (Stoica et al., 2000), and continues to flourish today (Howard et al., 2014; Trefzer and Tyrrell, 2015; Lopez et al., 2014; Ping et al., 2017; Campos et al., 2013). EHW has been applied to a range of practical problems including antenna design (Lohn et al., 2001; Lohn and Hornby, 2006), adaptive pattern recognition (Yasunaga et al., 2000; Cagnoni, 2009), image data compression (Sakanashi et al., 2004), self-regulating and fault tolerant circuits (Lopez et al., 2014; Ping et al., 2017; Campos et al., 2013), and self-checking circuits (Garvie and Thompson, 2003; Garvie, 2005) which outperformed the state of the art in the literature by several orders of magnitude. EHW spawned many highly unconventional circuit designs that operated in very different—often superior—ways to conventional hand designed systems (Thompson et al., 1999; Koza et al., 2003).
However, one EHW research area of considerable early promise that has received less attention in recent years is its application to the real-time processing of sensor data coupled to actuator control in physical systems. Application could include wearable and embedded devices, but traditionally the primary exemplar and testbed for such applications is autonomous robotics. With this article, we hope to reignite interest in that topic by demonstrating for the first time the successful evolution of component level analog electronic circuits for controlling a (physical) robot engaged in visually guided behaviours. By component level we refer to circuits at the level of basic electronic components such as transistors and resistors (in our case primarily transistors).
In fact one of the very earliest works in EHW was the evolution of a hardware control system for a physical robot (Thompson, 1995). A RAM chip-based dynamic state machine (DSM—a kind of generalisation of a finite state machine) was evolved directly in hardware by using a genetic algorithm to configure the RAM contents, with sensors and actuators connected directly to the DSM. Evolved digital circuits for controlling similar, or slightly more complex, robot behaviours soon followed (Keymeulen et al., 1996; Naito et al., 1996; Haddow and Tufte, 2000; Roggen et al., 2003). These later systems were all based around bespoke or proprietary (digital) reconfigurable gate-level circuitry (e.g. FPGAs).
However, an interesting aspect of the original DSM work was the fact that it pointed towards the potential power of evolved analog processing in the robotic context. Whether or not the DSM input, state, and output variables were clocked or free-running was genetically determined, as was the rate of the clock for any variables that used it (Thompson, 1995; Thompson et al., 1999). So although the circuit was not exactly analog in the conventional sense, it was not standardly digital either. The potentially rich dynamics of such a system could be exploited to mesh with the dynamics of the robot–environment interactions arising as the robot behaved in the world. Thus tight, efficient sensorimotor loops, running through the environment and the hardware, were evolved, illustrating the power of unconventional electronics unleashed by the EHW approach. This approach chimed with the movements within cognitive science and AI that emphasized dynamical systems understandings of behaviour and behaviour generation in animals and robots (Harvey, 1992b; Beer, 1997; Husbands et al., 1995; van Gelder, 1995).
Thus, when specialized boards that allowed component level evolution of analog circuits started to appear (Layzell, 1999; Zebulum et al., 1998; Thompson et al., 1999; Langeheine et al., 2000, 2002; Stoica et al., 2001), the possible application of such technology to robot control was often discussed (Sekanina and Zebulum, 2005). The potential dynamics of unconventional evolved analog circuits is particularly rich and evolved systems have the potential for online adaptation allowing recovery from faults (Thompson et al., 1999; Garvie and Thompson, 2003). This, combined with insights from evolutionary robotics (Nolfi et al., 2016; Vargas et al., 2014) where it was discovered that dynamically complex neural networks are highly evolvable (Husbands et al., 2010; Beer and Williams, 2015), suggested that analog EHW might be very well suited to evolving compact controllers operating with small numbers of components, even for visually guided behaviours which traditionally employed high levels of processing (Bekey, 2005). An interesting property of the dynamically complex evolved networks mentioned above is their ability to cope with noisy, poor quality sensory data, even when the networks have very few nodes (Husbands et al., 2010). This suggests that analog EHW might also be a useful approach for low cost real-time hardware applications requiring cheap sensors and simple circuits. Such areas include some applications of active wearable sensors (Mayol et al., 2002; Yang et al., 2017), embedded visual sensors (Bo et al., 2014), low cost consumer robotics and autonomous toys. In this article, we explore an autonomous robotics application, but in such a way (mainly through the deliberate use of low-grade sensors) that the study has some relevance to these other potential applications.
There were considerable technical challenges in applying these early analog EHW boards to robotic control, so no such work was attempted. More recently, Berenson et al. (2005) used field programmable analog arrays (FPAAs) as a substrate for the evolution, in hardware, of artificial neural networks (ANNs) for robot control. They were able to successfully evolve ANN controllers for legged locomotion in bipedal and quadrupedal robots, with evaluations occurring directly on the robots. The authors of that work reasoned that evolving at the lower, component level of analog circuits, rather than with the analog implementation of a neural network, would probably be too difficult within their methodology. The assumption was that the considerably expanded search space would likely mean that the time needed for on-robot evaluations before finding a good solution would become infeasible. Technical constraints on the commercial FPAAs used would have also created challenges for circuit-level evolution. Hence, to date there has been no investigation of component-level analog EHW applied to robot control, or any other application involving real-time coordination of sensory and motor processing in a physical system acting in the world.
Here we fill that gap by presenting the results of the first such investigation. We used a field programmable transistor array (FPTA) to evolve transistor level analog circuits to control a physical mobile robot engaged in various autonomous visually guided behaviours. We deliberately use a low-grade camera to explore the ability of analog EHW to deal with noisy, unreliable visual data. By coevolving visual features along with the analog circuits, feature extraction and selection and dimensionality reduction is automatically integrated into our approach. An incremental approach was used whereby the robot tasks become increasingly more challenging.
Recently there has been renewed interest in analog EHW in other application areas (Trefzer and Tyrrell, 2017; Howard et al., 2014) and of the use of other evolvable (continuous) physical substrates (including in simulated robot control) (Mohid et al., 2016; Miller et al., 2014; Adamatzky, 2013), which also suggests that the time is ripe to re-examine analog EHW in autonomous robotics.
A number of challenges had to be overcome that had perhaps been holding back research in analog EHW approaches to evolutionary robotics. Commercially available FPAAs tend to have relatively small numbers of reconfigurable cells and inputs and outputs, and strong constraints on access to low-level configuration. This renders them of limited use in EHW research, especially in robotic applications where more than a handful of sensor inputs may be required (using vision opens up a much more useful and interesting space of behaviours but requires higher numbers of inputs). A small number of research FPTAs were built (Langeheine et al., 2002; Stoica et al., 2001) but most of these are more than a decade old and rely on outdated software that is not easily compatible with most current computer systems. Somehow such hardware must be made to talk to a mobile robot and accept enough inputs to allow more than the most minimal of visual sensing. In addition, there is the challenge of either evaluating behaviours on the robot within a reasonable timeframe, or developing sophisticated enough simulations to allow general visually guided behaviours to evolve that transfer into the real world without loss in performance. While a number of approaches to this latter problem have been developed over the years (Jakobi, 1998; Nolfi et al., 2016), general simulation techniques that cross the reality gap (Jakobi et al., 1995) for any but the most low resolution cameras have proven elusive (Nolfi et al., 2016; Vargas et al., 2014).
In this article, we describe how these issues were addressed, enabling us to use an FPTA to evolve unconventional transistor circuits that successfully controlled a mobile robot engaged in various visually guided behaviours, including one involving responding to voice commands as well as camera and other sensor inputs. In order to achieve this we developed a new method for simulating the robot in its environment. This allowed us to evolve controllers offline by evaluating the robot in a simulation connected to the FPTA. The controllers then transferred seamlessly to the real world. The circuit replication issue was also addressed in experiments where control circuits were evolved to be able to function correctly in multiple areas of the FPTA rather than just a single area. In order to investigate a wider set of sensory modalities, along with the integration of multiple evolved circuits, FPTA circuits capable of voice recognition were also evolved. These were used in conjunction with the controllers to enable voice control of the robot, triggering behavioural switching. An analysis technique was developed to gain insight into the operation of the evolved circuits.
The central question being explored in this research can be simply stated. Can transistor level analog circuits be evolved to act as controllers for physical robots engaged in nontrivial visually guided behaviours? In this article, we show that the answer is yes.
The secondary questions we were interested in relate to whether the evolutionary process could exploit the dynamics of the analog EHW chip, in conjunction with the robot–environment dynamics, to produce robust behaviour even with noisy, uncertain low-grade visual sensors. We show that it can, making use of an integrated evolutionary visual feature extraction and selection method.
After describing our methodology and experimental setup, we present the results of a series of experiments where progressively more complex behaviours are evolved in an incremental way. Some of the evolved systems are analysed in detail. We also present a set of comparative experiments that dig deeper into which properties of the FPTA make it a reliable evolvable hardware medium. The article finishes with a discussion of our findings and reflections on future directions.
In this section, we describe the core methods and experimental setup used throughout this research. We tackled the problem of evaluating very large numbers of robot controller solutions in the real world by developing new methods for accurately simulating the robot acting in its environment. This allowed us to evolve behaviours using a robot simulation which successfully transferred to the real robot.
2.1 The FPTA
2.2 The FPTA Interface
The FPTA is hosted on a PCI FPGA board and communicates via a third party library called Windriver to the Heidelberg “darkgaqt” C++ software. A socket server extension was added to this software providing a JSON-RPC API into the chip, allowing FPTA access from any programming language and host through a simple API (program chip, get and set pin voltages). During evolution the interface is accessed by our Java-based custom robot simulator and evolutionary search software. At other times it is accessed by tools to visualise individual solutions, or by the Robot Runner, which facilitates live robot-FPTA Wi-Fi communication allowing sensory inputs and motor outputs to be transferred back and forth (Figure 2). During evolution the simulated robot is connected to the (real) FPTA in the same way as the physical robot is when evolved controllers are used in the real world.
The original Heidelberg FPTA setup allows a maximum of seven concurrent buffered IOBs (Langeheine, 2005), which was not sufficient for our robotics experiments, especially when using vision. A workaround was developed which effectively allowed simultaneous use of all 64 IOBs: all inputs whose values are to be maintained must be associated with a common sample line at all times. In order to update specific inputs, the corresponding inputs are moved to unique sample lines and then returned to the common line afterwards. For reading outputs, each cell is configured to a unique sample line and then back to nothing. A maximum of four inputs can be updated, or six outputs read, simultaneously. This is due to three of the seven sample lines having a different channel implementation through the host FPGA (as opposed to in the darkgaqt GUI). This workaround incurs a performance penalty as the I/O configuration is rewritten to the FPGA before every FPTA pin read/write. This is mitigated by keeping track of the current I/O configuration to optimise pin read/write scheduling. The maximum read/write cycle rate through the FPTA interface is 2.2 MHz. (See Supplementary Material for further details, available at https://www.mitpressjournals.org/doi/suppl/10.1162/evco_a_00272.)
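The configuration-tracking mitigation described above can be sketched as follows. This is our own illustrative model, not the darkgaqt API: all names (SampleLineScheduler, COMMON_LINE, the rewrite counter) are hypothetical, and the "rewrite" stands in for the expensive FPGA I/O configuration write.

```python
# Hypothetical sketch of the I/O scheduling mitigation: the current sample-line
# assignment is cached so the FPGA I/O configuration is only rewritten when a
# pin's assignment actually changes, and input updates are batched within the
# hardware limit of four simultaneous input writes.

MAX_INPUT_BATCH = 4
COMMON_LINE = 0          # inputs rest here so their values are maintained

class SampleLineScheduler:
    def __init__(self):
        self.assignment = {}     # pin -> current sample line
        self.rewrites = 0        # count of (expensive) config rewrites

    def _assign(self, pin, line):
        if self.assignment.get(pin) != line:
            self.assignment[pin] = line
            self.rewrites += 1   # stands in for an FPGA config rewrite

    def write_inputs(self, updates):
        """updates: dict pin -> voltage; applied in batches of <= 4."""
        pins = list(updates)
        for start in range(0, len(pins), MAX_INPUT_BATCH):
            batch = pins[start:start + MAX_INPUT_BATCH]
            # move each pin to its own unique sample line, "write" its value,
            # then return it to the common line so the value is maintained
            for i, pin in enumerate(batch):
                self._assign(pin, i + 1)         # unique line (1..4)
            for pin in batch:
                self._assign(pin, COMMON_LINE)
```

Tracking the assignment dictionary is what lets repeated writes to an unchanged configuration avoid redundant rewrites, which is the essence of the scheduling optimisation.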
2.3 The Robot
Communication between the robot and a computer (which in turn communicates with the FPTA) was set up as follows (Figure 2): IR and motor signals: A robot interface socket server was written in C to run on the SmartEvo turret providing the K-Junior serial API over Wi-Fi while accessing native K-Junior library calls. Vision: an MPG streamer was started with 188 × 120 resolution at 10 frames per second and accessed via Wi-Fi from the computer.
The 188 × 120 resolution was used due to limitations of the SmartEvo turret camera, which achieves less than one frame per second at its full 752 × 480 resolution. The camera's poor resolution, limited noisy colour range and slow rate made the visual tasks considerably more challenging, and provided the kind of test we desired to explore the capabilities of evolved transistor circuits to cope with low grade uncertain sensory data, as explained in the introduction. The proprietary camera turret communication system also introduced errors and noise into the reading of IR sensors and the writing of motor commands, which added to the challenges.
Figure 3 shows how the various robot sensors and motors are mapped to specific FPTA IO pins.
2.4 The Robot Arena
The robot arena used in all the experiments described later was an open-topped plywood box with various visual features stuck on the walls, most notably a red A4 sheet next to a black A4 sheet used as the goal during visual homing. A variety of obstacles were introduced within the arena during simulation and live tests.
2.5 The Robot Simulator
In order to evaluate robot behaviours in simulation, a carefully constructed simulation of the robot and its interactions with the environment was developed. The robot's kinematics and the response of its IR sensors were modelled using techniques that have a long and successful track record at Sussex and elsewhere. The most complex part of the simulation—modelling vision—employed novel techniques developed for this research.
An efficient physics-based simulator was written in the style of Jakobi et al. (1995) to model the kinematics, assuming a flat floor. The velocity achieved for a given motor actuation value was derived empirically from a set of careful repeated measurements of the robot. A geometrically accurate model of the robot resolved movement into linear and rotational components, using the empirically derived motor signal versus wheel velocity response curve. Motor noise was introduced at a level empirically determined to match the real behaviour. Each IR sensor was modelled using an empirically derived response function which represented the way IR reflects off plain wood surfaces (measured repeatedly at many angles and distances), along with the underlying properties of the sensors (reflective spread, etc.). Noise was introduced to the IR values at an empirically determined level. Each IR sensor was simulated by averaging beam reflectance over a spread of 5 traced rays 0.218 radians apart. A reflectance parameter could be altered to give a close approximation to the reflective properties of materials other than wood. Motor and IR noise levels were also parametrized. The simulator operated at a configurable discrete time step.
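The ray-averaged IR model above can be sketched in a few lines. This is a hedged illustration, not the simulator's actual code: `trace_ray` and the linear `response` curve are placeholders (the real response function was derived empirically from measurements against wood surfaces).

```python
# Minimal sketch of the IR sensor model: each reading averages reflectance
# over 5 rays spread 0.218 rad apart, scaled by a material reflectance
# parameter, with additive noise. trace_ray is a stand-in for the geometric
# intersection test against the arena walls and obstacles.

RAY_SPREAD = 0.218   # radians between adjacent rays
N_RAYS = 5

def ir_reading(pose, sensor_angle, trace_ray, reflectance=1.0,
               response=lambda d: max(0.0, 1.0 - d), noise=lambda: 0.0):
    """pose: (x, y, heading); trace_ray(pose, angle) -> distance to surface."""
    centre = pose[2] + sensor_angle
    offsets = [(i - (N_RAYS - 1) / 2) * RAY_SPREAD for i in range(N_RAYS)]
    total = 0.0
    for off in offsets:
        d = trace_ray(pose, centre + off)
        total += reflectance * response(d)   # empirically derived curve in reality
    return total / N_RAYS + noise()
```

Averaging over a small fan of rays approximates the reflective spread of a real IR beam far more cheaply than a full beam model.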
The method used to simulate vision employs an empirical sampling technique made feasible by the use of a 360° field of view. The technique builds on a method previously utilized in Baddeley et al. (2011), but with significant extensions and improvements. As far as we know, this is the first time such a technique has been used in an evolutionary robotics context. The basic idea was to divide the whole world (the robot arena) into a set of equally sized cells. The image seen by the robot was then sampled in each grid cell to build up a database representing the robot's visual world. Because the robot has 360° vision, the panoramic image at a given location is essentially the same for any orientation of the robot; it has just been rotated. Hence, instead of having to sample at each location for many orientations, a small number of samples is sufficient. The retrieved image can easily be rotated mathematically to match the actual robot orientation. It is this trick, which relies on the rotational symmetry of the 360° image, that makes the technique feasible; otherwise the number of samples needed would become too large.
The arena was sampled by capturing the world as seen by a north- and a south-facing robot at every location on a grid of cells. Sampling at two orientations helps account for slight variations in arena floor tilt and/or camera angle; also, because the camera is off centre on the robot, it provides a finer-grained sampling of the arena on the y axis. At any given moment during simulation, a north or south orientation is chosen at random. The image sampled in that direction is picked from the sampling cell closest to the current position of the simulated camera. The chosen 360° image is then rotated according to the simulated robot orientation. The simulator also added noise to the sampled image (at empirically determined levels). The addition of noise and the use of randomly chosen samples (north or south orientation) force the evolved controllers to be robust to a range of visual conditions rather than relying on a fixed set of values. Such robustness is essential for transferring to the real world and operating in realistic conditions. The discrete nature of the sampling, and the use of the nearest sample to the actual position of the simulated robot, adds further noise and coarseness, which increases the pressure to produce general, robust solutions.
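The rotational-symmetry trick can be made concrete with a tiny sketch, under the assumption (ours, for illustration) that a panoramic image is stored as a list of pixel columns covering the full 2π view; matching it to any robot heading is then just a circular shift.

```python
import math

# Sketch of the rotation trick: a 360-degree panorama stored as a list of
# columns can be re-oriented by circularly shifting the columns, so each grid
# cell only needs to be sampled at one (or two) orientations.

def rotate_panorama(columns, heading, sampled_heading=0.0):
    """Shift a panoramic image (columns spanning 2*pi) so that it appears as
    seen from `heading` instead of the heading it was sampled at."""
    n = len(columns)
    shift = round((heading - sampled_heading) / (2 * math.pi) * n) % n
    return columns[shift:] + columns[:shift]
```

Because the shift is a pure index operation, retrieving a view for an arbitrary orientation costs almost nothing compared to sampling every orientation physically.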
In some of the experiments using vision, additional features and obstacles were introduced into the modelled arena through Computer Generated Imagery (CGI) injection into the sampled images. Hence the original sampled world can become the basis of new, more complex environments. In this work, such injected obstacles were limited to vertical opaque cylinders with uniform colour reflectance at every angle. However, this aspect could fairly easily be generalised if desired.
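The cylinder-injection idea above can be sketched geometrically: from the robot pose and the cylinder's position and radius, the obstacle's bearing and angular width determine which panorama columns it occludes. All names here are our own illustrative stand-ins, not the simulator's API, and the painting model (uniform colour, full image height) matches the uniform-reflectance cylinders described in the text.

```python
import math

# Illustrative CGI injection of a vertical cylinder into a sampled panorama:
# compute the angular bearing and half-width of the obstacle as seen from the
# robot and overwrite the corresponding columns with its colour.

def inject_cylinder(columns, robot_xy, robot_heading, cyl_xy, cyl_radius, colour):
    n = len(columns)
    dx, dy = cyl_xy[0] - robot_xy[0], cyl_xy[1] - robot_xy[1]
    dist = math.hypot(dx, dy)
    if dist <= cyl_radius:
        return [colour] * n                      # robot inside the obstacle
    bearing = math.atan2(dy, dx) - robot_heading # angle relative to heading
    half_width = math.asin(cyl_radius / dist)    # angular half-width of cylinder
    out = list(columns)
    for i in range(n):
        ang = (i / n) * 2 * math.pi              # viewing angle of column i
        diff = (ang - bearing + math.pi) % (2 * math.pi) - math.pi
        if abs(diff) <= half_width:
            out[i] = colour                      # obstacle occludes background
    return out
```

Since the injected obstacle only overwrites columns, the original sampled world is preserved underneath, which is what lets one sampling pass serve as the basis for many synthetic environments.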
2.6 Core Evolutionary Search Algorithm
A generational genetic algorithm was used with a binary genotype, linear rank selection, single-point crossover, mutation, and elitism. After preliminary investigations, a population size of 30 was used. With only a single FPTA, which had to be used for each (expensive) evaluation, this population size proved a good compromise, providing quick enough evolution. Preliminary experiments indicated a crossover probability of 0.6 and the per-bit mutation rate as the best values to use. The GA started with a population of 30 random genotypes. Each genotype was a fixed-length binary string which encoded an FPTA configuration using 6144 bits describing transistor properties and connections for the array, as defined by the chip configuration protocol (Langeheine, 2005). This binary encoded protocol is hardwired into the chip design and must be used to configure the FPTA. Hence, it made sense to use it directly as the genetic encoding, as intended by the chip designers (Langeheine, 2005) (any other encoding would have to be translated into it to enable chip configuration). A statistical asymmetry in the transistor channel length encoding defined by this protocol (010, 011, 110, 111 all mapping to 8 μm) was mitigated by finding redundant length encodings in the genotype and replacing them with random non-redundant ones. Preliminary experiments showed that this encoding worked well and so it was adopted for all FPTA experiments described in this article. (See Supplementary Material for full details of the encoding.)
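A minimal sketch of a GA in this configuration follows. The rates, genome length, and placeholder fitness function are illustrative; in the actual system each evaluation configures the FPTA and runs the connected robot simulation, and this sketch makes no attempt to model that.

```python
import random

# Generational GA sketch: fixed-length binary genotypes, linear rank
# selection, single-point crossover, per-bit mutation, and elitism.

def evolve(fitness, genome_len=6144, pop_size=30, generations=50,
           p_cross=0.6, p_mut=1.0 / 6144, elite=1, rng=random.Random(0)):
    pop = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness)            # worst .. best
        weights = list(range(1, pop_size + 1))       # linear rank weights

        def pick():
            return rng.choices(ranked, weights=weights, k=1)[0]

        nxt = ranked[-elite:]                        # elitism: keep the best
        while len(nxt) < pop_size:
            a, b = pick(), pick()
            if rng.random() < p_cross:               # single-point crossover
                cut = rng.randrange(1, genome_len)
                a = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in a]  # bit-flip mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)
```

With elitism the best-so-far fitness is monotone non-decreasing, which matters when each evaluation is as expensive as a hardware-in-the-loop robot trial.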
For vision-based experiments an extra 120 bits were appended for visual sensor configurations (24 bits each for 5 evolved visual filters as described in Subsection 2.8).
An incremental approach to evolution (Harvey, 1992a) was used in the series of experiments described in this article. Task difficulty increased from stage to stage; each new stage was seeded with a population from the previous stage's final generation. Preliminary experiments indicated that this was the most efficient way to proceed as attempts to go straight to the final target behaviour failed; this is in keeping with previous explorations of this issue (Nolfi et al., 2016; Vargas et al., 2014).
2.7 Noise and Fitness Evaluation
Robotics is inherently noisy. Sensor and actuator noise and natural variations in environmental conditions (e.g., lighting) are always present (these are replicated in our simulations). In our case the physical medium used for the controller (unconventional FPTA circuits) can provide another source of inherent noise (e.g., parasitic capacitance build-up). Because of this, nominally identical fitness evaluations will result in different fitness values. Since we require the controllers to be robust to such variation, as well as to different initial conditions (e.g., position and orientation of the robot), the evaluation method must be carefully designed. Multiple trials must be used and these should be appropriately weighted in order to produce a selection pressure towards general and robust behaviours.
In the evolutionary robotics experiments detailed in the next section, multiple evaluations were made per individual and overall fitness was integrated over them using Equation 1. Each evaluation started from a different randomized initial location and orientation. The shape of the noise deliberately introduced in the simulations was dictated by a random number generator (RNG) as in Garvie (2005). In order to rank individuals correctly it is important to subject all members of the population to the same generated noise so that they are all evaluated in the same set of conditions, helping to ensure that their relative fitnesses (and hence rankings) are accurate. This keeps the rank-based selection mechanism from becoming too noisy, which matters because rank determines the relative contributions of individuals to the next generation; it also mitigates against some individuals receiving much more favourable evaluation conditions than others (e.g., all starting positions near the target) and hence an artificially high relative fitness (and ranking), which would likely have a deleterious effect on the next generation. RNG seeds were fixed per generation to create the sets of noise used for all individuals.
2.8 Vision: Processing and Genetic Representation
The image quality provided by the robot SmartEvo vision turret is poor: it is very washed out and lacking in colour information (Figure 4, right). To mitigate this to some extent, camera images were white balanced using the “Grey World” algorithm (Finlayson et al., 1998), and extra hand-picked values of 40, 45, and 30 were subtracted from the respective colour channels to calibrate black. Two channels were then extracted from this image:
Greyscale: a luminance channel computed from the calibrated image. Red: a redness measure derived from the colour components, with the result clipped to a minimum value.
The red channel responds most strongly to pure red and has low values for both white and black. In order to broaden the scope of possible evolved visual processing, it was genetically determined which of these channels any Haar-like feature detector used as raw input.
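The white-balancing step can be sketched as follows. The Grey World principle (scale each channel so its mean matches the overall mean intensity) is standard; the per-channel assignment of the 40/45/30 black offsets is our assumption, and the pixel representation is illustrative.

```python
# Minimal Grey World white-balance sketch: scale each colour channel so its
# mean equals the global mean intensity, then subtract fixed hand-picked
# offsets to calibrate black, clamping to the valid [0, 255] range.

def grey_world(pixels, offsets=(40, 45, 30)):
    """pixels: list of (r, g, b) tuples; returns balanced, black-calibrated pixels."""
    n = len(pixels)
    means = [sum(px[c] for px in pixels) / n for c in range(3)]
    grey = sum(means) / 3
    out = []
    for px in pixels:
        out.append(tuple(
            max(0, min(255, round(px[c] * grey / (means[c] or 1)) - offsets[c]))
            for c in range(3)))
    return out
```

Grey World assumes the average scene colour is achromatic, which holds well enough in the arena to correct the turret camera's colour cast.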
In order to allow evolution to operate in a very general, unconstrained way in the design of the visual filters, an extra 120 bits of the genotype encoded the configuration for the five Haar-like filters (24 bits for each), determining their sizes, positions, and other properties. Numerical values were represented using a Gray code (Gray, 1953) so that single bit mutations cause incremental changes in values, resulting in a smoother fitness landscape. Each filter effectively acted as a separate visual sensor, feeding into its own FPTA input. The configuration encoding was as follows:
bits 1-3: filter type as per Figure 4 (middle)
bits 4-9: filter centre x: range [0,64], with 32 = straight ahead on the panoramic image. Values increase clockwise; both 0 and 64 represent straight behind.
bits 10-14: filter centre y: 0 is at the top, with increasing values nearer the bottom.
bits 15-18: filter height, such that the maximum value fills the full height of the field of view and 0 is a single pixel. Width is scaled such that each filter section is a square.
bit 19: channel type: 0 for Greyscale and 1 for Red.
bits 20-24: thresholding value t, such that the visual sensor output is clamped to 0 if it falls below t.
The height of the field of view was constrained such that some of the floor was visible while nothing above the arena wall was available to the visual sensors, as controllers might otherwise evolve to depend on information external to the arena, which may be unreliable when transferred to the real robot: external objects may move around, the whole arena may be moved, lights may be switched on or off, and so on.
Once the thresholding function was applied to the output of a Haar-like filter, the result was amplified to a [0,5] V range for FPTA input.
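Decoding one 24-bit filter gene following the field layout above can be sketched as below. This is a hedged reconstruction: the field boundaries follow the list (reading the height field as bits 15-18), the Gray decode is standard, and the exact scalings used in the real system may differ.

```python
# Sketch of decoding a 24-bit Haar-like filter gene: Gray-coded numeric
# fields for type, centre, height, and threshold, plus one channel bit.

def gray_to_int(bits):
    """Decode a Gray-coded bit list (MSB first) to an integer."""
    value = bits[0]
    out = value
    for b in bits[1:]:
        value ^= b                   # Gray decode: running XOR of the bits
        out = (out << 1) | value
    return out

def decode_filter(gene):
    assert len(gene) == 24
    return {
        "type":      gray_to_int(gene[0:3]),     # bits 1-3: filter type
        "centre_x":  gray_to_int(gene[3:9]),     # bits 4-9: 32 = straight ahead
        "centre_y":  gray_to_int(gene[9:14]),    # bits 10-14: 0 = top of image
        "height":    gray_to_int(gene[14:18]),   # bits 15-18: filter height
        "channel":   "Red" if gene[18] else "Greyscale",   # bit 19
        "threshold": gray_to_int(gene[19:24]),   # bits 20-24: output threshold
    }
```

The Gray code matters here: a single mutated bit moves a decoded value by one step rather than jumping across the range, which smooths the fitness landscape as the text notes.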
3 Incremental Visual Target Finding Behaviours
In this section, we describe the core experiments carried out to explore the potential of our approach to evolving unconventional analog circuits for robot control. An incremental evolutionary approach was used to develop the following series of behaviours: obstacle avoidance, visual target approach in empty environment, visual target approach in complex cluttered environment. At each stage the previous behaviours were subsumed.
3.1 Obstacle Avoidance
The robot starts at position (30 cm, 15 cm) from the centre of the arena with a random orientation. Each of 6 trials ends when 200 simulated seconds have elapsed or the robot crashes into an obstacle, whichever is sooner. The simulation time step is 200 ms. The same sets of IR, motor and orientation noise shapes are used to evaluate all individuals in a generation (see Subsection 2.7).
In live robot performance evaluations (see Subsection 2.3), the maximum I/O loop frequency was found to vary around a mean of about 20 Hz but was limited to a steady 10 Hz to allow rigorous comparisons and replications.
3.2 Homing: Vision-Based Goal Approach in an Empty Arena
The live robot (see Subsection 2.3) maximum I/O loop frequency was about 5 Hz with a video stream latency of around 600 ms. No artificial delays were added.
3.3 Vision-Based Goal Approach in a Cluttered “Maze” Environment
A bug in the proprietary K-Junior & Evo stack resulted in extremely low front IR sensor readings, which mattered most when approaching cylinders rather than flat walls. A 600 ms video latency in the SmartEvo system (even at the lowest camera resolution) caused the robot to miss landmarks it would otherwise approach: by the time the controller was “aware” of the landmark the robot had already turned past it and would see something different in the next frame. Motor values below 5 did not make the wheels turn with the Evo turret on the robot. Hence, evolved behaviour in which the simulated robot would slowly explore its surroundings resulted in a stationary live robot.
All the robot shortcomings identified above were included in a modified version of the simulator, which was then used in all the experiments reported here. Specifically, the time step was set to 0.3 s, simulated video latency to 600 ms (two frames delay), the signal from the front IR sensor was clamped to 0, and motor values from −4 to 4 resulted in no wheel movement.
The evaluation procedure was exactly the same as in the previous task, except the maximum trial length was 200 time-steps. This task is more difficult than most previous ER visual behaviours and has not been attempted before (Nolfi et al., 2016).
4 Results of Incremental Evolution
Figure 5 (left) shows the summary statistics for ten incremental evolutionary runs. The height of the bars shows the mean number of generations to a robust, successful solution at each stage (that is, a high-scoring solution that remains the best in the population for 30 generations with re-evaluations on each generation, thus eliminating “lucky” individuals that scored well once; a high score for obstacle avoidance is 500, for both the visual tasks it is 1). The error bars show the standard error of the mean. All runs were successful at each stage, with moderate standard errors, indicating that the methodology is highly robust. All evolutionary runs started from fully random “primordial soup” populations except when seeded from a previous incremental evolutionary stage; that is, no hardcoding of genotypes was made at any point. Videos of all the behaviours described in this section can be found at www.sussex.ac.uk/easy/research/ehw-vids. At each stage very good transference to the real robot was achieved.
4.1 Obstacle Avoidance
After approximately 250 generations (mean ± s.e. = 261 ± 51), optimal avoiders evolved on each run, going full steam ahead as much as possible. A typical such avoider is illustrated in Figure 6. This one had a slight bias towards a constant left turn, which helped it lock into the optimal attractor around the central obstacle. When approaching a wall head-on, it usually turned left. Others had a corresponding right bias; these were the two classes of successful avoiders that emerged.
The controller shown in Figure 6 transferred very well onto the real robot, producing qualitatively almost identical behaviour to that in the simulator, as can be seen on the right hand side of the figure. Ten fitness evaluations of this controller (the best at the end of an evolutionary run) were measured in the real world with the evaluation function used during evolution. Its mean fitness was 95.1 ± 1.6% of the mean fitness in simulation over ten evaluations. The robot was slightly slower in some areas of the arena, where the floor was not perfectly flat, and sometimes rotated away from walls more slowly than in simulation, which accounts for the slightly lower overall fitness. It generalized very well, dealing with obstacles introduced in real time and coping with environments more complex and cluttered than the one used for evolution. When approaching a wedge-shaped corridor it slowed down and inched through the gap if it was wide enough, or stopped if not. (See videos at www.sussex.ac.uk/easy/research/ehw-vids.) All other successful avoiders transferred to reality equally well.
4.2 Vision-Based Goal Approach in Empty Arena
Seeded from the best obstacle avoider from the end of first stage, after approximately a further 1200 generations (1228 ± 381) robust successful visual goal finding (homing) controllers were evolved on each of the incremental runs. The behaviour of a typical successful evolved controller is shown in Figure 7. It begins by rotating clockwise until the goal is behind it. If it misses “locking the target” in one rotation (due to the low frame rate and distant target), it slows its rotation speed down until it catches it. At that point it starts approaching the goal in reverse with a very slight left turn. This left turn causes it to occasionally “unlock” the target which prompts a return to the clockwise rotation whereon the robot quickly locks back onto the target again: this amounts to an evolved calibration method which also allows it to deal with natural drift due to motor noise. Upon reaching the target, the robot maximizes its fitness score by moving back and forth from the target while staying within the (double scoring) goal zone: remarkably complex behaviour for considerably less than 256 transistors. Traces of the best individuals throughout this particular evolutionary run can be seen in Figure 7, giving an idea of how the behaviour developed over evolutionary time.
The best controller at the end of each evolutionary run transferred very well to the real arena (Figure 7) robustly identifying and approaching the target from all parts of the arena and qualitatively replicating the behaviour seen in the simulator. Ten fitness evaluations of the controller shown in Figure 7 were measured in the real world with the evaluation function used during evolution. Its mean fitness was 96.3 ± 1.8% of the mean fitness in simulation over ten evaluations. The robot was sometimes slightly slower in rotating and locking onto the target, and like the obstacle avoiders, was slower moving in some (unlevel) areas of the arena. Figure 7 (far right) shows the controller generalising to a fairly significant variation in the environment which was unseen during evolution, namely an obstacle obscuring the view of the target. The controller also generalized well to a moving “portable target” (a red plastic box) through its natural recalibration mechanism. On other runs the solutions found were very similar at the behavioural level; some rotated in the opposite direction, some moved forwards rather than backwards, some used slightly larger looping movements and tended to approach from more of an angle. The very fittest all produced behaviour very similar to that shown in Figure 7.
4.3 Vision-Based Goal Approach in Maze Environment
In each of the ten incremental runs, after approximately a further 5500 generations (5590 ± 986), seeded from the best individual from the previous stage, robust visual target finding behaviours evolved for a far more complex environment which included a maze in the arena in the form of cylinders of varying colours. Figure 8 shows plots of the best robot's paths to goal for a typical run. The robot approaches and stays at the target from any start position. With all visual sensors clamped to 0, this controller spins counterclockwise with a 12 cm radius whilst avoiding collisions when encountering a wall on either its left or right, demonstrating that it uses both visual and IR sensing. An initial investigation of the controller revealed that when sensor #4 (greyscale light detecting sensor in the front-right of the robot's visual field in Figure 10) fires, the robot straightens up. Given the location of the three interlocking white cylinders (appearing cyan and on the right in the simulated robot trajectories in Figure 8) and the size of the turning angle relative to the quadrants, the above is sufficient for the robot to arrive at the target reliably. The controller has adapted the robot's behavioural “morphology”, or natural turning angle, to the environment, and exploited efficient visual cues for navigation by using appropriate positions and properties for its visual sensors. As the robot reaches the target wall, it loses sight of the white cylinders and turns left, arriving with the target on its front right. Here the balance between sensor #1 and the others forces a clockwise turn, and this, together with the natural (and right-IR-induced) counterclockwise turn which happens once it has rotated past the target, creates a goal-sitting calibration mechanism. The amounts by which each wheel is made to go forwards or backwards given the target sensor and right IR values are precisely those required to accumulate maximum points during evolution by remaining in the goal zone.
The results of this run in a complex environment are an example of evolution harnessing complexity (and asymmetry) in an environment in order to structure its behaviour and produce high fitness. By allowing evolution to shape and then exploit the robot-environment dynamics, along with the visual sensor morphology and properties, an elegant, and remarkably resource efficient, controller emerged. Although quite subtle, the evolved visual processing used to solve the task is extraordinarily minimal.
The evolved successful controllers transferred very well to the real robot, producing qualitatively very similar behaviour to the simulation; a trace of the live robot extracted from an overhead video can be seen in Figure 8. Ten fitness evaluations of the controller shown in Figure 8 were measured in the real world with the evaluation function used during evolution, giving a mean fitness of 93.1 ± 2.1% of the mean fitness in simulation. Even though the controllers were evolved to produce very efficient behaviour in the environment shown in Figure 8 (with variation during evaluation trials as explained in Subsection 3.3), and exploited robot-environment dynamics particular to this environment, our methodology still made them general enough to successfully perform the task (find the target and stay at it) in unseen variations of the environment where the target had been moved to a different location or the shape of the environment had been altered, as shown in Figure 9. This demonstrates that the successful controllers were processing sensory information to generate the behaviour, rather than using some trick to blindly learn the shape of the environment and location of the target.
All in all, the limitations of a robot which can only move fast, and with a 600 ms camera latency and no front IR sensor (see Subsection 3.3), might appear damning, and would lead many a conventional designer to throw up their hands in despair, especially when the task is navigating to a target in a complex environment using a maximum of 256 transistors. The success of our EHW approach shows the power of evolution to adapt under any circumstances, with implications for the robustness and adaptability of systems in the field. Evolution is neutral to “faulty” systems; it will strive to find a path towards a solution no matter what the medium.
5 Evolved Circuit Analysis
A circuit reduction and simplification methodology was developed to analyse the evolved circuits. This method proved very useful in giving a “first pass” understanding of the evolved circuits. Discrepancies between behaviours generated by the original and simplified controllers gave indirect evidence that the evolved controllers used active and subtle dynamics at both the FPTA and robot-environment levels. (Full details can be found in the Supplementary Material.)
6 Comparative Evolutionary Runs
The results described in Section 4 suggest the FPTA is a suitably evolvable medium for developing robust sensorimotor behaviours, even when the sensors and motors are noisy and unreliable. The observed, rather subtle, dynamics of the evolved behaviours (especially the visually guided ones) suggest that the potentially rich dynamics of the FPTA medium are being exploited. The analysis of evolved circuits (see Section 5 and Supplementary Material) further supports the importance of exploiting analog FPTA dynamics. In this section this idea is explored further by comparing evolutionary results of the unconstrained FPTA with a setup where the FPTA dynamics are greatly constrained, with other analog arrays that lack dynamics, and with a system with rich dynamics.
It is possible to effectively bypass all the transistors in the FPTA by fixing all routing to be “pass through” so that the chip becomes an evolvable routing network. By doing this, the FPTA becomes a different kind of medium which can be used to evolve routing networks that can connect sensors and actuators in potentially complex (or relatively simple) ways but which no longer make any use of the transistors and the potential dynamics they can impart.
Although all of the FPTA and CTRNN runs were successful (as defined at the start of Section 4), it can be seen from Figure 11 (middle right) that both the average and standard error of the FPTA runs (617 ± 264) are lower than those of the CTRNN runs (1806 ± 513). A non-parametric Mann--Whitney U test revealed that the FPTA is significantly better than the CTRNN at the 95% confidence level. Clearly the DynA runs were much worse, with half of them not completing successfully.
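The comparison above rests on the Mann--Whitney U statistic, which can be sketched in a few lines (an illustrative pure-Python version; the runs' actual generation counts are not reproduced here, and no p-value or tie-corrected normal approximation is computed):

```python
def mann_whitney_u(a, b):
    """U statistic for sample `a` versus `b`: the number of pairs
    (x, y) with x from a, y from b, and x < y, counting ties as half.
    Suitable for comparing generations-to-solution samples without
    assuming any particular distribution."""
    return sum(1.0 if x < y else 0.5 if x == y else 0.0
               for x in a for y in b)
    # U for the other direction is len(a) * len(b) minus this value.
```

Because the test only uses rank order, it is appropriate for small samples of generation counts whose distribution is unknown, as here.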
Having established that the transistors play a crucial role in the evolved FPTA controllers, and appear to make the medium suitably evolvable, a further set of comparative experiments was performed to gain some insight into which, if any, of the transistor properties were most important during the evolutionary search in terms of evolvability, as measured by average speed to a good solution. Figure 11 (bottom) shows the results of evolutionary runs on the avoider task under different constraints on the way evolution could change the properties of the transistors in the FPTA. It shows ten runs under each of four conditions: no constraints, transistor width (W) fixed at the mid value, transistor length (L) fixed at the mid value, and W and L both fixed at their mid values. All runs were successful. Pairwise Mann--Whitney U tests with the Bonferroni correction for multiple comparisons showed that the fixed W condition is significantly less evolvable than both the fixed L condition and the no-constraint condition. Runs with both W and L fixed have a much higher average and standard error than the fixed L and unconstrained conditions, but the difference in evolvability is only statistically significant at the 90% confidence level. The no-constraint condition has the lowest mean and standard error of the mean. These results suggest that the best approach is to allow evolution to control all transistor properties and routing connections (as was done in all other runs). The question remains: why does the fixed W condition perform so much worse than the fixed L and unconstrained conditions, and have a much higher mean than the runs with both W and L fixed? Fixed W runs get stuck in local optima much more often and for much longer.
We believe this relates to the way the fitness landscape is determined by the chip configuration protocol, which is built into the design of the chip (Langeheine, 2005) and which is used to map the binary genotypes to an FPTA circuit (see Supplementary Material for details). There are 16 widths to choose from, but only 5 lengths. The actual transistors use the same 5 fixed lengths, but width variation is accomplished by combining multiple transistors in the cell (i.e., setting up virtual transistors). So, for instance, changing from L = 2, W = 1 to L = 2, W = 15 means moving from a single L = 2, W = 1 transistor to four parallel transistors of widths 1, 2, 4, and 8 (all with L = 2), with their implied routing. This would likely cause a large behavioural variation given all the impedances of the routing and circuitry involved in hooking the transistors together. Hence, allowing W to vary provides a rich range of behavioural changes (some small, some big). There are far fewer possible values for L to take, and the way the configuration protocol works means that single bit flips in the L encoding mostly result in relatively large changes, as opposed to the often gradual changes in W. Compared to changes in L, changes in W usually result in smoother progress from one parasitic (circuitry) combination to another. All this means that constraining W to a single value produces a less smooth, less correlated fitness landscape: it is better to allow W to vary. Fixing L has much less of an effect as long as W is unconstrained (indeed there is no statistical difference between the unconstrained and fixed L conditions).
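The virtual-transistor width scheme described above amounts to a binary decomposition of the requested width, which can be sketched as follows (illustrative only; the chip's actual configuration protocol and routing are more involved):

```python
def virtual_width_transistors(w):
    """Decompose a requested virtual width W (1..15) into the parallel
    W = 1, 2, 4, 8 unit transistors that realise it. Each set bit of W
    switches one unit transistor into the parallel combination."""
    assert 1 <= w <= 15, "virtual widths range over 1..15"
    return [unit for unit in (1, 2, 4, 8) if w & unit]
```

A single bit flip in W adds or removes one parallel unit transistor (plus its routing parasitics), which is why mutations to W tend to produce a graded spread of behavioural changes.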
7 Evolved Voice Control of the Robot
In a live setting, software automatically recorded audio samples above a certain amplitude threshold and fed them to the chip as input, while the chip's output was monitored on a graphical display at a 10 Hz refresh rate.
Although tone discrimination circuits have previously been evolved (Thompson, 1998; Harding and Miller, 2004), this is the first time an EHW approach has been demonstrated to be capable of the more difficult task of word discrimination. For the purposes of the current article, we were interested in behaviour switching through the integration of multiple evolved circuits and multiple sensory modalities. This was successfully achieved as can be seen on the video mentioned above.
It should be noted that voice commands were simply used as a behavioural example of the sound modality requiring more complex processing of temporal features than simple tone discrimination, and were employed in order to further test the capabilities of the FPTA. Discriminating between two words in a fairly general way (as the FPTA managed) is not a difficult task for state-of-the-art word recognition techniques (which can discriminate between large numbers of words). However, those techniques are computationally very heavy. They typically employ multistage preprocessing of the sound waves, with approximately 240 FFT bins followed by a bank of 20–30 Mel filters which usually generate Mel Frequency Cepstral Coefficients for 30 ms overlapping time frames (Deng and Li, 2013). The outputs of this are then fed into a machine learning system (often a large deep-learning neural network). What is interesting about the FPTA system evolved here is that it used only very minimal preprocessing of the sound wave, just 6 FFT bins, and was able to achieve the task in 100 generations.
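As a rough illustration of how minimal such a front end is, a 6-bin feature extraction could be sketched as below (a naive DFT over one audio frame; the paper's exact windowing, frame length, and bin placement are assumptions not reproduced here):

```python
import math

def six_bin_features(samples, n_bins=6):
    """Magnitudes of the first n_bins DFT bins of one audio frame:
    a deliberately minimal front end, in contrast to the 240-bin FFT
    plus Mel filter bank pipelines of conventional recognisers."""
    n = len(samples)
    feats = []
    for k in range(n_bins):
        re = sum(s * math.cos(2 * math.pi * k * t / n)
                 for t, s in enumerate(samples))
        im = -sum(s * math.sin(2 * math.pi * k * t / n)
                  for t, s in enumerate(samples))
        feats.append(math.hypot(re, im))
    return feats
```

The six magnitudes would then be presented to the chip inputs; all further temporal processing is left to the evolved analog circuit.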
8 Circuit Replication and Transfer
Early work in EHW revealed that circuits evolved directly in hardware (in practice on digital FPGAs) would generally not work as well, or at all, when transferred to another chip, or to another part of the chip they were evolved on (Thompson et al., 1996). This was because such circuits exploited physical particularities of the chip, or chip area, they were evolved on. While demonstrating very interesting, highly unconventional designs, this was clearly a potential problem for practical applications of EHW. Later work explored techniques similar to those employed in evolutionary robotics to ensure greater generality: namely, evolving circuits under multiple conditions on each evaluation, which met with some success (Thompson and Layzell, 2000). An alternative approach, continuing/restarting evolution in situ for changed conditions/medium (e.g., to allow continued operation at higher temperatures), was also successful (Stoica et al., 2004).
A final set of preliminary experiments was conducted as an initial investigation into evolved circuit replication/transfer on the FPTA, in the context of robot control circuits. First, we aimed to see whether there was a replication problem with the kind of FPTA medium used here, as this issue had not previously been investigated for it. The goal was to evolve controllers independent of parametric particularities of specific chip components.
In the second part of the experiment, the controller was evolved with configuration settings for the top half, but evaluation was now performed in both halves of the chip. On each of its multiple fitness trials (as per Subsection 3.1) the controller was randomly (Subsection 2.7) either kept in the top half of the chip or transferred to the bottom half. Transfer was implemented by moving configuration bits to the second half of the array, keeping zeros in the first half, while I/O was moved to pins 32, 14, 35, 9, 39, 12 for IR and 37, 34 for motor signals. After 500 generations an avoider similar to the one described in Subsection 4.1 emerged, with a slight right turn resulting in a clockwise attractor around the arena. This controller worked exactly the same in both halves of the FPTA, as shown in Figure 13 (two rightmost plots). Evaluating in both halves during evolution produced a pressure to generalise over the chip so that the controller would work well in either half and could not rely on specific particularities of one region. This demonstrates that evolving for generality over the FPTA medium to secure replication and transfer of evolved circuits works well for intra-chip transfer in this context, and strongly suggests it would also work for inter-chip parametric variations. The same finding was repeated on ten runs of the experiment, all of which produced general controllers that worked in both halves of the chip.
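The bit-level transfer described above could be sketched as follows (the cell count and bits-per-cell figures are assumptions for illustration, and the I/O pin remapping is omitted):

```python
# Assumed geometry: a 16x16 cell array, so 128 cells per half, with an
# assumed (illustrative) number of configuration bits per cell.
HALF_CELLS = 128
BITS_PER_CELL = 75

def transfer_to_bottom_half(config_bits):
    """Shift the evolved top-half configuration into the bottom half
    of the array, leaving zeros (pass-through cells) in the top half."""
    half = HALF_CELLS * BITS_PER_CELL
    top = config_bits[:half]
    return [0] * half + top
```

During evolution each fitness trial would randomly apply this shift or not, so a controller only scores well if its behaviour survives relocation.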
The use of Equation 1 in evaluation provides a strong drive towards generalization over all the media/environments used in evaluation, but the relative ease of achieving this result suggests the analog FPTA medium, with its rich, fluid dynamics, might be particularly amenable to this kind of generalization.
Because only one FPTA chip was available, the transfer/replication issue could not be fully investigated; that will be the subject of future work when more chips are available. Ideally the circuits would be evaluated over many chips/areas of chips/conditions during evolution and then tested on unseen chips/areas of chips to confirm generalisation.
Component-level analog electronics have been successfully evolved to act as controllers for a physical robot engaged in nontrivial visually guided behaviours. The evolutionary process successfully exploited the dynamics of the analog EHW chip, in conjunction with the robot–environment dynamics, to produce robust behaviour even with noisy, uncertain low-grade visual sensors, making use of an integrated evolutionary visual feature extraction and selection method. Thus both the questions posed in the introduction have been answered in the affirmative. We also demonstrated, for the first time, evolved word recognition circuits that were used in a behaviour switching task that integrated several evolved circuits.
The novel methods developed to enable this work have expanded FPTA-based EHW into the realm of realtime processing of sensor data coupled to actuator control in physical systems. A new method for simulating robot vision during evolution was presented, with excellent transfer to the real robot. The evolved solutions to non-trivial visual navigation tasks are best viewed as dynamical systems with (behavioural) attractors that result in completion of the task regardless of start conditions (Husbands et al., 1995; van Gelder, 1995). The continuous analog medium of the FPTA seems a particularly good substrate to enable the evolution of such attractors. This work supports the view of cognitive processes such as learning and memory as the reconfiguration of behavioural attractors in an embodied dynamical system (Beer, 1997; Beer and Williams, 2015). The reconfiguration of the properties of the virtual transistors in each cell of the array is achieved by configuring combinations of actual (fixed-property) transistors within the cell. Perhaps this process, with the additional routing and potential interactions within the cells, contributes to the rich dynamics of the FPTA medium and the level of evolvability we have demonstrated, particularly in comparison to the alternatives with limited dynamics we tried.
Our overall approach involved incremental evolution, and some interesting preliminary experiments were run to probe this approach further. In a simple version of the maze cylinder task (Section 3) a cylinder was inserted into and removed from the arena centre during evolution. Visual homing controller populations that had evolved with a cylinder present in previous generations (since removed) adapted to its reintroduction much faster than populations evolved solely in an empty arena, quickly displaying very efficient navigation around it. This suggests that genetic material encoding the previous adaptation was still present in the population, ready to kick in, or at least that the population remained in an area of fitness space from which it was easier to rediscover this adaptation. A fuller investigation of this effect, particularly of how to exploit it for more adaptable and evolvable systems, will be the subject of further work.
Because of the methodology we used, involving multiple fitness evaluations under noisy conditions, most evolved controllers generalized well to variations of the environment. However, for more radical changes in environment, some more specifically targeted evolution was used. For instance, in a separate experiment black markers were added around the simulated arena walls in order to facilitate transfer of homing to a lab floor with several dark objects around. The resulting controller transferred well to a complex environment, approaching a red box from up to 160 cm (limited by camera resolution).
An interesting phenomenon was observed in early word recognition runs: after a while there would be a sudden drop in population fitness. This was caused by some configurations performing better after repeated evaluations, most likely through exploitation of parasitic capacitance build-up. As long as the capacitance was maintained, such solutions would be acceptable when deployed in the field. However, as they came to dominate the evolving population, fitness increased until the evaluation of a new variation that discharged the capacitance resulted in a sudden drop in population fitness. We also found that other (unchanging) configurations could lose fitness after many evaluations, causing population fitness to decline as they took over the population. Programming random FPTA chip reconfigurations between evaluations mitigated both scenarios, and all solutions found after this measure was introduced were confirmed to retain fitness over long evaluations. This is a reminder that while many of the unconventional mechanisms exploited in EHW can result in highly robust, efficient circuits, sometimes they need to be treated with care, and may have to be prevented from emerging.
Despite what turned out to be quite severe limitations in the robot when using the vision turret (high latency, a low-definition, poor-quality image, and a nonoperational front IR sensor), we were able to reliably evolve robust, high-performing controllers with our setup. This suggests that analog EHW might also be a useful approach for low-cost realtime hardware applications requiring cheap sensors and simple circuits. However, in future work we will also investigate the use of the methodology with better robotic equipment. Having only a single FPTA chip was a considerable restriction. Multiple FPTAs would allow use of a fully distributed GA (Muhlenbein et al., 1991; Garvie, 2005), greatly increasing evolution speed and efficacy, as well as enabling inter-chip transfer validation. A reconfigurable analog chip design with mid- and long-range routing, operational power and temperature monitoring, and on-the-fly self-reconfiguration would vastly open up the behavioural solution space, whilst allowing parametric optimisation for specific applications. Such chips would enable a move to the next level of system: those capable of continuous adaptation and self-organisation. Such systems have also been shown to be more evolvable (Husbands et al., 2010), so we believe the future of EHW lies in that direction.
Thanks to Martin Trefzer for help with initial setup of the FPTA, to Chris Johnson for help and discussion, and to the anonymous reviewers for helpful comments on an earlier draft of this article. This work was supported by Intel and EU ICT FET Open project INSIGHT.