Abstract
Inspired by natural biochemicals that perform complex information processing within living cells, we design and simulate a chemically implemented feedforward neural network, which learns by a novel chemical-reaction-based analogue of backpropagation. Our network is implemented in a simulated chemical system, where individual neurons are separated from each other by semipermeable cell-like membranes. Our compartmentalized, modular design allows a variety of network topologies to be constructed from the same building blocks. This brings us towards general-purpose, adaptive learning in chemico: wet machine learning in an embodied dynamical system.
1 Introduction
Just as neural networks were inspired by models of the brain and a desire to emulate and understand its computational power, chemical computation looks to the cell and other biological systems, hoping to comprehend and harness the information-processing dynamics that drive life [13, 15, 31]. In this view, each living cell is a powerful chemical computer, capable of dynamic resource allocation, repair, reproduction, and sometimes even motility. We wish to understand chemical computation, with the ultimate goal of building synthetic chemical networks capable of adaptation and learning.
Furthermore, a programmable wet computer could have myriad medical applications, particularly in smart drug delivery [17, 37, 59]: We imagine a tiny chemical computer injected into a patient, perhaps in its own cellular membrane, that dynamically reasons about the chemical composition of the patient's blood and releases drugs as appropriate. This is a particularly potent option, considering that such chemical computers would likely be implemented using biocompatible synthetic DNA strands [55].
There have been many projects on developing chemical computers in simulated environments over the past two decades [9, 15, 16, 29, 58]. Several models of chemical neural networks have been explored [11, 24, 25, 32], but so far these networks have been static in their behavior, incapable of learning.
In our previous work we proposed several binary and analogue chemical perceptrons [4, 6, 7]. These were the first simulated chemical systems capable of fully autonomous learning. We used chemical reaction networks (CRNs), which are unstructured macroscopic chemistries, where interactions between symbolic species follow realistic reaction-kinetic laws. In these CRNs, single perceptrons were able to learn two-input linearly separable functions, and we also tackled time series learning and prediction by introducing chemical delay lines, which store the past inputs (concentrations) and feed them to underlying perceptrons [3, 46].
Having developed a family of individual chemical perceptrons, we wish to design a method for connecting these in a more computationally powerful network. The network should be modular, such that networks with different topologies are constructed from different combinations of the same parts. As neural networks have been shown to be powerful machine learners, we hope that a network of single chemical perceptrons could be a step towards the first general-use reprogrammable chemical computer.
We achieve this goal with the feedforward chemical neural network (FCNN), a network of cell-like compartments, each containing a chemical neuron as a module. Communication between nodes in the network is achieved by permeation through the walls of these compartments, facilitating the network's feedforward and backpropagation mechanisms.
Like standard single-layer perceptrons, each of our individual chemical perceptrons can learn the linearly separable binary two-input logic functions, such as and and or, but they are incapable of learning the linearly inseparable functions xor and xnor [45]. Here, we demonstrate that the FCNN learns each of these functions. To our knowledge, it is the first chemical system able to learn a linearly inseparable function.
The FCNN has the simplest feedforward neural network topology: one hidden layer with two hidden neurons, the same as the first classical neural network to learn xor via backpropagation [53]. For more complex learning problems, we show how the FCNN's modular design can be applied to topologies with more, or larger, hidden layers. Using an FCNN is as simple as injecting a few species into its outermost chemical container and measuring the concentration of an output species. By modulating the concentrations of injected species, different inputs can be provided, the network can be trained to new tasks, and its learning rate can be annealed.
Built on a realistic model of chemical dynamics, the FCNN is a step towards reusable, reprogrammable chemical computers and a possible union of biochemistry and computer science. The project is also driven by a desire to better understand, through emulation and analogy, the information processing that occurs naturally in living systems. In this sense, the FCNN is an exploration of how relatively simple chemical reaction networks can embody computation, self-modification, and learning.
1.1 Related Work
Several projects have already implemented static, non-adapting neural networks in simulated chemical systems, especially using DNA-based models [10, 11, 25, 26, 32, 44, 51]. Hjelmfelt et al. [26] and Okamoto et al. [47], for example, both use schemes where connection weights are encoded in the concentrations of catalysts, which in turn speed up or slow down the relative consumption of the input chemical in parallel reactions. Most previous work encodes binary values by considering whether chemicals are present in the system above a certain threshold. We have taken the same approach here.
While Hjelmfelt et al. [26] and others have made important progress in implementing various components of the familiar feedforward neural network, such as a chemical activation function with sigmoidal properties, all of the above-mentioned projects focus only on the forward propagation of information, and not the backward propagation of the error. In each case, learning either is not considered, or is performed by an external, non-chemical system that computes the weights for a formal neural network before converting them to molecular concentrations to serve as parameters for the chemical implementation [32, 51]. For a system to be truly useful, the real challenge is to implement chemical adaptation, where a reinforcement signal triggers reactions that then modify the system's own behavior.
Several groups have designed chemical systems based on learning models other than neural networks, such as decision trees [48] and least-squares regression (LSR) [34]. McGregor et al. [42] demonstrated how in silico evolution can be used to develop artificial chemistries capable of learning some particular fixed function. Lakin et al.'s scheme for rudimentary learning in an enzymatic chemistry [34] is equivalent to non-negative least-squares regression and achieves learning within a chemical system (and similarly for our more recent learning scheme based on buffered DNA strand displacement chemistry [35, 36]). Our FCNN differs from these architectures in two important ways. First, it enjoys the conceptual advantages of the neural network model itself, which is perhaps the most widely used and well-researched machine learning paradigm, and therefore represents a good model to use as the backbone of a general-purpose adaptable computer. Second, the FCNN is built from modular and composable components, so that it or something similar could be the scalable basis of future networks that are much larger and capable of more complex computation, whereas previous chemical computers have generally been limited-purpose and not scalable.
Chiang et al. [12] proposed a construction method for chemical reaction networks to represent reconfigurable logic functions. As opposed to our work, the reconfiguration is performed manually by changing the concentrations of certain control species. Wu et al. [60] surveyed a wide range of DNA-based logic gates and their applications in biotechnology and biomedicine. Fernando et al. [21] proposed a system based on gene regulatory networks embedded in Escherichia coli and demonstrated that even single-celled organisms could carry out associative learning.
2 Chemical Formalism
2.1 The Chemical Reaction Network
Our simulated chemical system consists of chemical species and reactions between them, along with rate coefficients governing the speed of those reactions. In this model, also known as a chemical reaction network (CRN) [18, 26], there is no notion of space, as each species is considered only in terms of its concentration in a uniformly mixed chemical bath where pressure, volume, and temperature are constant. The chemical species are completely abstract, without any defined molecular structure or atomic components. Their behavior is fully described by the reaction network, whose dynamics are deterministic, given the species' concentrations and the reaction coefficients.
We allow annihilation reactions, which remove chemicals from the system. These reactions work like any other, but produce the null product λ, which does not signify the disappearance of matter but rather represents an inert species that does not further interact with other species in the system.
These different types of chemical reactions then give rise to a system of ordinary differential equations (ODEs) over the concentrations of species within our system. This system of ODEs mathematically defines our chemical reaction network.
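To make the mapping from reactions to ODEs concrete, the following sketch (our illustration, not the FCNN's reaction set; species, reactions, rate coefficients, and initial concentrations are hypothetical) builds the mass-action ODEs for a toy CRN with a catalyzed production reaction and an annihilation reaction, and integrates them numerically.

```python
# Illustrative sketch only (hypothetical toy CRN, not the FCNN's chemistry):
#   R1: X -> Y, catalyzed by W      rate = k1 * [X] * [W]
#   R2: X + Y -> lambda             rate = k2 * [X] * [Y]   (annihilation)
from scipy.integrate import solve_ivp

k1, k2 = 1.0, 0.5  # hypothetical rate coefficients

def crn_odes(t, c):
    x, w, y = c
    r1 = k1 * x * w          # catalytic production of Y (the catalyst W is unchanged)
    r2 = k2 * x * y          # annihilation of X and Y into the inert product
    return [-r1 - r2,        # d[X]/dt
            0.0,             # d[W]/dt (catalyst)
            r1 - r2]         # d[Y]/dt

sol = solve_ivp(crn_odes, (0.0, 50.0), [1.0, 0.8, 0.0], max_step=0.1)
print(f"[Y] at t = 50: {sol.y[2, -1]:.4f}")
```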
2.2 Cell-like Compartments
Since Aristotle, Pasteur, and more recently Maturana and Varela [41], cellular compartmentalization has been considered a central characteristic of living systems; this is a major motivation behind the modern study of “protocells” [52]. The analogy of the cell as a self-contained, self-sustaining regulatory machine is a familiar one, and the most basic requirement for such a machine is compartmentalization—separation from the outside world. Hence, much effort has gone into determining the salient characteristics of a cell and simulating cells in the field of artificial life [20, 28, 54, 56]. For these reasons, cells have been a common component of several significant accomplishments in chemical computing [8, 15, 57].
In our artificial chemistry, we introduce cell-like compartments, used as containers for each of the individual chemical neurons composing the FCNN. Interaction between neurons is facilitated by permeation across the walls of these compartments. We defined the permeation dynamics of these compartment walls with reactionlike rules that we call channels.
Channels can be either reactive or inert. In the first case, a species is allowed to enter the compartment wall, but a different species emerges on the other side. Chemically, we imagine a molecule that reacts with a cell wall as it passes through it. In the other case, a species passing through an inert channel is not changed; it simply travels from one side of the membrane to the other. A given compartment wall can have any number of channels of either type.
In modeling membrane permeation, it is common to consider the pressure inside the cell, or, in a similar vein, the numbers of particular molecules within it [30, 54]. In the latter case, a chemical species S permeates into a cell more slowly if there is already a high concentration of S in the cell. Furthermore, if a cell's total volume is constrained, species can permeate into it only up to a certain point, when the cell becomes “full.” More detailed models also consider solute permeability, viscosity, hydraulic conductivity [33], pH, and temperature.
We make several simplifying assumptions regarding the dynamics of our cell-like compartments. First, each compartment's internal pressure is assumed constant, meaning that the total volume of species entering or exiting through permeation is much smaller than the total volume of the container; in other words, there is a large amount of some inert solvent inside each compartment. Second, we assume that permeation rules are inherently one-way; if a species passes from side A of a container wall to side B, it does so aware only of the state on side A. This means that permeation is not osmotic, or equilibrium-seeking, but conceptually more akin to so-called active transport across cell walls. Third, since we consider chemical species only in terms of their concentrations, other physical parameters such as temperature and viscosity are beyond the scope of our model. Having thus simplified the chemical picture, channel permeation rates follow mass-action kinetics: The rate of permeation is exactly proportional to the concentration of the source species in the source container and a permeability constant k.
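In symbols, the permeation rule just described can be written as follows (our restatement of the mass-action assumption above; the channel notation C : (S2 ← S1) is explained in the next paragraph):

```latex
% Inert channel: S permeates unchanged from the source side to the destination side
\frac{d[S]_{\mathrm{dst}}}{dt} \;=\; -\,\frac{d[S]_{\mathrm{src}}}{dt} \;=\; k\,[S]_{\mathrm{src}}

% Reactive channel C : (S_2 \leftarrow S_1): S_1 outside C enters and emerges as S_2 inside C
\frac{d[S_2]_{C}}{dt} \;=\; -\,\frac{d[S_1]_{\mathrm{out}}}{dt} \;=\; k\,[S_1]_{\mathrm{out}}
```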
In the parentheses, the first species is always inside C, and the second outside. The arrow denotes the direction of permeation. Here C : (S2 ← S1) is a reactive channel in which S1 passes into C, turning into S2 in the process, while C : (S1′ → S2′) is a reactive channel in which S1′ passes out of C, turning into S2′.
It should be noted that the computational use of cells and their membranes is the central object of study in the field of P systems, or membrane systems, which are abstracted chemical systems where different reactions occur in different cells, with communication between cells via membrane permeation [49, 50]. The work presented here is somewhat similar to P systems, so it is important to understand key differences between the two models. Our approach to chemistry is less abstracted than that taken in P systems: The chemical objects of P systems are strings that “react” through rewriting rules akin to Chomsky's context-free grammars [49]. P systems successfully demonstrated the theoretical computational power of chemically inspired systems of a certain type, but the omission of features such as reaction-kinetic rate laws generally makes P systems unwieldy for the design of viable chemical computers.
Our goal is to use the systems of ODEs that describe our CRNs to construct autonomously learning neural networks. Each phase of the operation of a multilayered perceptron, such as “calculate linear sum,” and “update weights,” is only qualitatively emulated by our ODEs—we do not aim to reproduce the mathematics of neural networks in a one-to-one fashion in chemistry. Thus, our chemical perceptrons are mathematically distinct from perceptrons [22]. Moreover, chemical reactions representing each operation run continuously and in parallel, rather than discretely and in sequence. Because our network is a large, nonlinear system of ODEs, there are generally no analytic solutions for, say, what concentration of Y will be present in our system a set time after it is started with given initial conditions. We therefore use numerical ODE solvers as the backbone of our analysis (see Section 6).
Previous results on the theoretical power of perceptrons and neural networks (e.g., Hornik's proof that feedforward perceptrons are universal approximators [27]) start with the mathematical definitions of the models, and thus rely upon assumptions (as basic as y = ∑wixi) that are not valid in our chemical perceptrons. Furthermore, our scalable network construction is restricted to treelike topologies, a restriction that is generally not made in the study of classical perceptrons. Thus, this work demonstrates the robustness of the general conceptual framework behind neural networks, and that the mathematical particulars of a neural network's implementation are perhaps less important than the general, functional relationships between its parts.
In the spirit that drove the early history of neural networks, we consider in this article the same problem that is the focus of Rumelhart's classic exposition of backpropagation [53]: solving xor with a one-hidden-layer, two-hidden-unit feedforward perceptron. In Section 6 we will use this early result from classical multilayer perceptrons as a benchmark against our FCNN.
3 Chemical Neurons
The chemical neuron we chose as the building block for FCNNs is the analogue asymmetric signal perceptron (AASP) [4]. Two new variants of the AASP are used, one for hidden neurons and one for the output neuron. As a reference throughout this section and the rest of the article, Table 1 lists all chemical species in the FCNN. Tables listing all reactions, reaction rate constants, and permeation channels are provided in the Online Supplementary Material (http://www.mitpressjournals.org/doi/suppl/10.1162/ARTL_a_00233).
Table 1. Chemical species of the FCNN.

Function | Species |
Inputs | X1, X2 |
Output | Y |
Weights | W0, W1, W2 |
Input (clock) signal | Sin |
Learning signal | SL |
Production records | X1Y, X2Y, SinY |
Weight adjusters | W⊕, W⊖ |
Indiv. weight decreasers | W0⊖, W1⊖, W2⊖ |
Inert input transmit∗ | X1′, X2′, Sin′ |
Binary threshold∗ | T |
Penalty∗ | P |
Error signal∗ | E⊕, E⊖ |
Backprop. signals∗ | P1⊕, P1⊖, P2⊕, P2⊖ |
Feedforward signal# | SF |
Feedforward output# | F |
Notes. ∗: Output neuron only; #: hidden neuron only.
3.1 Our Chemical Neuron: The AASP
Our previous work [4] introduced the AASP and described its mechanism and motivation in depth. Because both our hidden and output neurons (described in Sections 3.2.1 and 3.2.2) are modified versions of the AASP, here we describe those features of the AASP that are relevant to the FCNN. We present the two-input AASP, but the model can be easily extended by treating additional inputs analogously to the first two.
3.1.1 Input
Each component of the input vector is represented by a species Xi. Although the AASP accepts continuous input values, it also operates very well as a binary-input perceptron. In this case, we inject only those Xi whose value is 1, at a preset input concentration, and do not inject the Xi whose value is 0.
Under this input encoding, the zero input (Xi = 0 for all i) corresponds to no chemical injection, which poses problems for a system that should react to such an input because the zero input is indistinguishable from no input at all. We therefore include a special clock signal Sin with every input, alerting the perceptron to the input event. Though Sin is necessary only to recognize the zero input, it is included with all inputs for the sake of consistency. We will see later that Sin is also useful in the weight integration.
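As an illustration of this encoding (a sketch only; the preset injection concentrations here are hypothetical, and the paper's actual values are given in its Online Supplementary Material):

```python
# Sketch of the binary input encoding described above (hypothetical concentrations).
X_CONC = 1.0     # hypothetical preset concentration representing a logical 1
S_IN_CONC = 1.0  # the clock species Sin accompanies every input event

def encode_input(bits):
    """Map a binary input vector, e.g. (0, 1), to species injections."""
    injection = {"Sin": S_IN_CONC}      # always inject the clock signal
    for i, b in enumerate(bits, start=1):
        if b:                           # inject Xi only when its bit is 1
            injection[f"X{i}"] = X_CONC
    return injection

print(encode_input((0, 0)))  # {'Sin': 1.0}  -- the zero input still signals an input event
print(encode_input((1, 0)))  # {'Sin': 1.0, 'X1': 1.0}
```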
3.1.2 Input-Weight Integration
Here we encounter a common problem in mapping mathematical structures to chemical systems, the representation of negative numbers. As chemical species cannot exist in negative concentrations, an arbitrary real number, such as a perceptron's input weight, cannot be represented by a straightforward mapping to a concentration of a single species.
In our earliest models of chemical perceptrons, we solved this problem by associating each weight with two species, one to encode positive weight values and one for negative [6]. To reduce the size and complexity of the chemical system, we have since used a scheme that requires only one species per weight. For this we must sacrifice the straightforward linear relationship between the weight of a formal perceptron and concentrations of weight species in a chemical system—hence the name asymmetric signal perceptron. Our ASP is not only simpler than its predecessors, but also learns more accurately [7].
The first reaction, which produces Y as a function of Xi and Wi, also produces a record-keeping species XiY. This species is later used in the weight-update stage of learning, as will be described in Section 3.1.3. XiY is merely another abstract species in our artificial chemistry, and should not be interpreted as a molecular complex containing both Xi and Y. The clock species Sin is already present in every injection, so it is used as the constant multiplicative coefficient of the bias W0. In terms of Equation 5, Sin = X0.
Note that upon input injection, each (Wi, Xi) pair is simultaneously producing and annihilating the same Y. A consequence of this is another difference between chemical and formal perceptrons: Weight strengths are relative, as each copy of the second reaction above will proceed at a rate proportional to [Y], which is constantly changing in a manner dependent on every input weight. This sharing of Y by the various inputs, as well as the two reactions above, is diagrammed in Figure 1.
Although we cannot determine the nonlinear dynamics of [Y] without running an ODE solver, the nature of the chemical reaction network enforces a lower bound of zero and an upper bound equal to the sum of the input concentrations, ∑i[Xi]. The upper bound is asymptotically reached as weights go to infinity.
3.1.3 Learning
Chemical implementations of the input-weight integration are not new [25, 32], but our previous work was the first to implement autonomous learning [6]. In an AASP, learning starts with the injection of a desired output species Ŷ, and ultimately updates each weight in the appropriate direction by a magnitude relative to that input dimension's influence on the most recent output Y combined with an analogue of an error. This scheme is described below, and illustrated in its entirety in Figure 2.
Once weight-adjustment signals are produced, all neuron types behave identically, adjusting their ith weight in proportion to both the adjustment signal and the ith input dimension's influence on the most recent production of Y. This qualitatively reproduces the so-called delta rule of classic perceptron learning [22].
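For reference, the classical delta rule [22] that this scheme qualitatively mirrors is (the standard formal expression, not the chemical rate law):

```latex
\Delta w_i \;=\; \alpha\,(\hat{y} - y)\,x_i
```

Here α is the learning rate, ŷ and y the desired and actual outputs, and x_i the ith input; in the chemical analogue, the adjustment signals W⊕ and W⊖ play the role of the signed error term, and the record species XiY track each input dimension's contribution to the most recent production of Y.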
Note that it takes an intermediate species and two reactions for W⊖ to annihilate Wi. This is simply because a hypothetical single-step reaction with two reactants (W⊖ and Wi) and a catalyst would require the collision of three molecules in the same instant. As such three-molecule interactions are unlikely, we do not use them in our designs.
3.2 Two Breeds of AASP
To accommodate communication between neurons in FCNNs, we had to modify the original AASP design. This resulted in two related breeds: one for hidden neurons, which has a modulated output schedule and accepts backpropagated error signals; and another for the output neuron, modified to initialize the cascading backpropagation reactions. We discuss the means by which these neurons are connected to each other to achieve feeding forward and backpropagation in Sections 4.2 and 4.3, but here we first discuss the details that distinguish our output and hidden neurons from each other and the AASP.
3.2.1 The Hidden Chemical Neuron
The hidden chemical neuron (HCN) is the modular component from which FCNNs of various topologies are constructed. It has two differences from the AASP as presented in Section 3.1, one each to facilitate feeding forward and backpropagation.
The AASP produces output Y constantly, as the input dimensions are gradually integrated through the series of reactions in Section 3.1.2. It is designed to receive input signals instantaneously, however, so in a network context we cannot simply feed forward a hidden neuron's output as it is produced. In practice, we have found that AASPs perform poorly with their input injected gradually over time.
In learning, an HCN has less work to do than an AASP. An AASP reasons about its output and a desired output to produce the weight-update species W⊕ and W⊖, but hidden neurons receive error signals through backpropagation, not through a direct comparison of the outputs. Thus, the W⊕- and W⊖-producing reactions of an AASP, illustrated in Figure 2a, are omitted from the HCN.
3.2.2 The Output Chemical Neuron
The output chemical neuron (OCN) has a different learning mechanism from the AASP. As the current FCNN is designed to learn binary functions, such as xor, inserting a desired-output species to instigate learning, as Figure 2a shows, is somewhat ineffective. For example, if the binary threshold is 0.5, the error will be minuscule if the actual output is, say, 0.499. This was not a serious problem in the single AASP, but the OCN's error signal must not only update its own weights, but propagate backwards as well. The reactions involving backpropagation will be discussed in Section 4.3; here, we explain the method by which weight-changing signals are produced within the OCN.
The OCN's learning phase begins with the external evaluation of its output. A penalty species P is injected only if the OCN's output is incorrect, that is, on the wrong side of the binary threshold that is used to interpret chemical concentrations as binary values. Along with P, the AASP's learning signal SL is injected. Further, a threshold species T is injected in concentration equivalent to the binary threshold. T communicates the binary threshold to the OCN, and behaves somewhat analogously to the desired-output species Ŷ in the AASP (Section 3.1.3).
The goal of the OCN's error-evaluating reactions, diagrammed in Figure 3, is to amplify small continuous-valued errors to emulate distinct binary-valued errors. This is done in two stages. First, Y and T annihilate in a fast reaction. Whichever is left over encodes whether Y was above or below the threshold, and whether the weights should be increased or decreased. Thus, SL catalyzes relatively slow reactions from Y and T to signed error species E⊖ and E⊕, respectively.
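Schematically (our restatement of the two stages just described; the species above an arrow denotes a catalyst):

```latex
Y + T \;\longrightarrow\; \lambda
  \qquad \text{(fast comparison; the species in excess survives)}

Y \;\xrightarrow{\;S_L\;}\; E^{\ominus},
\qquad
T \;\xrightarrow{\;S_L\;}\; E^{\oplus}
  \qquad \text{(slower, catalyzed by } S_L\text{)}
```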
The autocatalytic reactions ensure that the total amount of weight change is not limited by the initial difference between [Y] and [T], which is encoded in the [E]s. The total amount of weight change is bounded, however, by the injected concentration of P, as it is the only reactant creating W⊕ or W⊖. Thus, we can achieve annealing by decreasing the concentration of successive injections of P. To summarize the error-evaluating reactions: The initial difference between [Y] and [T] determines the sign of the weight adjustment, and [P] determines the magnitude.
4 Networking
This section discusses the methods by which AASPs are connected to make an FCNN. We first describe the network's topology, both as a neural network and as a series of chemical containers, and then its mechanisms of feeding forward signals and error backpropagation.
4.1 Constructing the Network Topology
With neurons in nested compartments and links across compartment walls, our networks are topologically trees, each wall being a branch and the outermost compartment being the root (Figure 4). The outermost compartment (1 in the figure) corresponds to the output layer in a feedforward multilayer perceptron, and for a given set of nested compartments, the deepest compartments (4–7 in the figure) correspond to the input layer. We cannot construct arbitrary feedforward networks, because in our treelike networks each node can feed forward to only one node in the next layer.
This nested architecture enables the FCNN's modularity. Each neuron is guaranteed to share a permeable wall with any neuron to which it is connected in the network, so messages that must pass from one neuron to another do not need to be “addressed”—they passively permeate from parent to child or vice versa. This makes the architecture scalable. The signal species between different pairs of neurons can be chemically identical because the signaling occurs in distinct compartments. For this reason, the number of species in the FCNN is sublinear in the size of the network.
If we wished to make arbitrary feedforward topologies, that would be possible provided we included distinct signal species for every linkage in the neural network. This could be achieved by placing all hidden neurons, regardless of their level in the network, as sibling subcompartments within the larger output neuron (Figure 5). Feedforward and backpropagation signals would travel between hidden neurons via this shared compartment. As long as there are unique signal species for each link in the network, this design allows arbitrary feedforward network topologies.
In the simulation results presented in this article, we used a simple two-hidden-neuron, one-hidden-layer topology to solve xor (Figure 6). This topology is consistent with either of the above paradigms. As we are more interested in the modularity afforded by a treelike topology, we implement backpropagation and feeding forward in ways generalizable to treelike designs.
4.2 The Forward Pass
4.2.1 Injection
Thus, SF plays two roles: It alerts the hidden neurons to feed forward their output, and then by the above permeation alerts the neurons in the next layer to begin processing input. With these reactions, the output Y of each hidden chemical neuron feeds forward. This process is illustrated in Figure 7, which shows (in silico) experimental concentration-versus-time data for the species in the feedforward process, in each neuron in a simple one-hidden-layer FCNN.
4.3 Backpropagation
Backpropagation is the algorithm that first learned xor and popularized the modern feedforward neural network [53]. Again, our chemical system does not exactly reproduce the classic formulae defining this process, and we focus instead on chemically reproducing the important conceptual relationships between parts.
To review, classical backpropagation proceeds in three steps:
1. The output perceptron's error e = Ŷ − y is calculated, where Ŷ and y are the desired and actual outputs.
2. Moving backwards, each perceptron's error is calculated as the sum of the errors that it has affected, weighted by the connecting weights. In our topologically restricted multilayer perceptrons, each node only feeds forward to one other, so this reduces to ei = ej wji, where wji is the weight from perceptron i to j, and ej is already known.
3. Each weight is updated proportionally to this error, the weight's own magnitude, and the most recent output of the source perceptron: Δwji ∝ ei wji yi, where yi is the last output of perceptron i.
The first step above is emulated by the OCN's internal reactions, described in Section 3.2.2. These reactions produce weight-update species that encode the sign and magnitude of the network's overall error.
Thus, in the one-hidden-layer case, the concentration of the weight-changing species W⊕ and W⊖ in the ith inner compartment is related to (a) W⊕ and W⊖ in the outer compartment, and (b) the weight Wi connecting the two compartments. This relationship is diagrammed in Figure 8.
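Glossing over the exact sign handling and kinetics (which are given by the backpropagation reactions listed in the Online Supplementary Material), one schematic reading of this relationship, mirroring step 2 above (ei = ej wji), is:

```latex
% Schematic gloss only, not the exact chemical rate law
[W^{\oplus}]_{\,i} \;\propto\; [W^{\oplus}]_{\mathrm{outer}}\cdot[W_i],
\qquad
[W^{\ominus}]_{\,i} \;\propto\; [W^{\ominus}]_{\mathrm{outer}}\cdot[W_i]
```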
5 Methodology
5.1 Rate and Permeation Constants
We have so far specified the reactants, products, and catalysts in each reaction, but every reaction also has at least one dimensionless rate coefficient, which describes its speed (k in Equations 1 and 2). Here we discuss how those rate coefficients have been set in our simulations. Tables listing all reaction rates can be found in the Online Supplementary Material.
As each AASP (Section 3.1) contains around twenty distinct rate constants, the rate parameter space is prohibitively large and complex to explore via exhaustive search. In our previous work [4] we searched for these constants with a standard genetic algorithm. As both the hidden and output chemical neurons share most of their reactions with the AASP, those reactions are set to the rates found by the method described in [4].
Unlike with the AASP, we found success in setting the rate constants introduced in this article by hand. Our intuition in doing so was based on the intended function of each reaction and on which reactions should occur faster or slower than others. To illustrate this intuition, consider the case in which two species annihilate in order to determine which had the larger initial concentration, and the remaining species is then transformed into another to perform a task dependent on that comparison. This is the case when Y and T are compared in the OCN's error-calculating mechanism (Figure 3): Whichever species remains after annihilation is meant to turn into an appropriately signed error species. In such cases, the comparison reaction should be faster than the follow-up reactions that depend on it; otherwise, the follow-up reactions would execute before the comparison had been meaningfully made. These manually set rate constants, as well as those set by the genetic algorithm, are listed in the Online Supplementary Material.
5.2 Simulation Details
The FCNN is a large system of ODEs. As such systems are generally unsolvable analytically, we use numerical Runge-Kutta-type ODE solvers to observe the FCNN's behavior. We used a fixed-time-step solver with a step size of 0.1, chosen for speed as well as stability. All simulations were run on the COEL web-based chemical modeling and simulation framework [2, 5].
In the interest of modularity and flexibility, COEL's ODE solvers do not consider the FCNN as a whole. Rather, the contents of each compartment are simulated semi-independently, each by a separate ODE solver. Several solvers then communicate with each other as needed. This allows us to use different solvers for different compartments, or even to have compartments solved by entirely different types of simulations. When using adaptive time-step-size ODE solvers, the compartments are synchronized by imposing the slowest compartment's time-step size on the other compartments, and so all of the chemical concentrations in the FCNN are updated in unison.
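The following sketch (our illustration, not COEL's actual code; the compartment and channel objects are hypothetical, and a simple fixed-step forward-Euler update stands in for the Runge-Kutta solvers) shows the compartment-wise strategy: each compartment advances its internal chemistry by one shared time step, then the channels exchange material across the walls, keeping all compartments synchronized.

```python
# Sketch of compartment-wise simulation with synchronized fixed steps.
# Our illustration under stated assumptions, not COEL's implementation.
from dataclasses import dataclass
from typing import Callable, Dict

DT = 0.1  # shared fixed time step, matching the step size quoted above

@dataclass
class Compartment:
    conc: Dict[str, float]                                  # species -> concentration
    odes: Callable[[Dict[str, float]], Dict[str, float]]    # internal reaction derivatives

    def step_internal(self, dt: float = DT) -> None:
        deriv = self.odes(self.conc)
        for s, d in deriv.items():                          # forward-Euler update (a simplification)
            self.conc[s] = max(0.0, self.conc.get(s, 0.0) + d * dt)

@dataclass
class Channel:
    src: Compartment
    dst: Compartment
    species_in: str                                         # species consumed on the source side
    species_out: str                                        # species produced on the destination side
    k: float                                                # permeability constant

    def step(self, dt: float = DT) -> None:
        flux = self.k * self.src.conc.get(self.species_in, 0.0) * dt   # mass-action permeation
        self.src.conc[self.species_in] = self.src.conc.get(self.species_in, 0.0) - flux
        self.dst.conc[self.species_out] = self.dst.conc.get(self.species_out, 0.0) + flux

def simulate(compartments, channels, t_end: float, dt: float = DT) -> None:
    t = 0.0
    while t < t_end:
        for c in compartments:      # advance each compartment's chemistry
            c.step_internal(dt)
        for ch in channels:         # then communicate across the compartment walls
            ch.step(dt)
        t += dt
```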
6 Results
As our goal is to learn linearly inseparable functions in a chemical neural network, we built the first FCNN in the classic one-hidden-layer, two-hidden-neuron network topology first shown to learn xor [53]. We will refer to this topology, with rate constants set as described in Section 5.1, simply as the FCNN in this section.
The FCNN successfully learns each of the 16 binary two-input logic functions. In a single learning run, it was trained to a given logic function with 2,000 random inputs and corresponding penalty injections (described in Section 4). We generated inputs randomly from {0, 1}² and later injected a threshold species T with concentration 0.6 and a penalty P as appropriate, 2,000 times consecutively. The concentration of the penalty species, which is analogous to a learning rate, was annealed by a factor of 0.0008 at each injection (see the Online Supplementary Material for further details).
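The following sketch restates this training protocol (our illustration only: fcnn_run and fcnn_inject are hypothetical stand-ins for the simulated chemistry, the SL concentration and initial penalty p0 are invented, the multiplicative annealing schedule is one reading of "annealed by a factor of 0.0008," and the injection schedule is one reading of Section 3.2.2; the exact values are in the Online Supplementary Material).

```python
# Sketch of the training protocol described above; hypothetical placeholders throughout.
import random

THRESHOLD = 0.6        # concentration of the threshold species T (the binary threshold)
ANNEAL_RATE = 0.0008   # per-injection annealing factor quoted in the text
SL_CONC = 1.0          # hypothetical learning-signal concentration

def train(target_fn, fcnn_run, fcnn_inject, iterations=2000, p0=1.0):
    p = p0  # hypothetical initial penalty concentration (analogous to a learning rate)
    for _ in range(iterations):
        x1, x2 = random.randint(0, 1), random.randint(0, 1)   # random input from {0, 1}^2
        y_conc = fcnn_run(x1, x2)                              # measured output concentration [Y]
        y_bit = 1 if y_conc >= THRESHOLD else 0                # interpret [Y] as a binary value
        if y_bit != target_fn(x1, x2):                         # wrong answer: trigger learning
            fcnn_inject(T=THRESHOLD, SL=SL_CONC, P=p)          # penalty, learning signal, threshold
        p *= (1.0 - ANNEAL_RATE)                               # anneal the penalty concentration

# Example: training toward xor
# train(lambda a, b: a ^ b, fcnn_run, fcnn_inject)
```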
The FCNN's accuracy at each of the 2,000 consecutive iterations was averaged over 10,000 training runs to produce accuracy-versus-time data for each logic function shown in Figure 9. Figure 10 shows an example of the FCNN converging to a solution of xor.
The most important results are the FCNN's performance on xor and xnor, which, because of their linear inseparability, cannot be learned by a single perceptron [45]. On the 2,000th iteration, the FCNN's accuracy on xor and xnor is 100.00% and 98.05%, respectively. Averaged over all 16 functions, the FCNN is 99.88% accurate by the 2,000th learning iteration.
As can be seen in Figure 10, the FCNN converges to learn inseparable functions in a relatively short period of time. The 14 separable functions are almost perfectly learned by the 600th learning iteration, but it takes until around iteration 1,200 to learn xor. xnor is not perfectly learned even by iteration 2,000. We see this as confirmation that linear inseparability is a challenging feature to learn. Nonetheless, the FCNN learns about as quickly as the original multilayer perceptrons that solved xor: Rumelhart's classic multilayer perceptron took roughly 1,000 learning iterations [53].
The difference in performance between xor and xnor can be explained by the FCNN's asymmetric treatment of the input values 0 and 1. As can be seen in Figure 10, the functions !X1 and !X2 are learned almost identically well by the FCNN. This is because the FCNN architecture is symmetric on input; each input dimension behaves according to the same logic. Its architecture is not symmetric on negation, however, because 1 and 0 values are treated fundamentally differently. Consider the fact that the output species Y is ultimately produced by the two input species X1 and X2, as well as the signal species Sin (Sections 3.1.1 and 4.2). For this reason, it is easier for the FCNN to learn to output 0 when it is given the input (0, 0) (i.e., when only Sin is present at injection) than to learn to output 1.
Unlike its building block, the AASP (Section 3.1), the FCNN in this context behaves as a binary perceptron. Thus, we compare its performance on the binary logic functions not with the AASP, but with our previous binary single chemical perceptrons: the SASP (standard ASP) and its automatically thresholded variant, the TASP (thresholded ASP) [7], as well as the WLP and the WRP [6]. The two more recent models, the SASP and the TASP, represent the state of the art in binary chemical perceptrons; although the TASP is more accurate, the SASP is more similar to the FCNN because of its lack of automatic output thresholding. As shown in Table 2, the FCNN is significantly more capable than any of them.
Table 2. Accuracy (%) at the 2,000th learning iteration, for the FCNN and our previous single binary chemical perceptrons.

Function | FCNN | WLP | WRP | SASP | TASP |
xor | 100.00 | 57.61 | 61.88 | 50.53 | 59.00 |
xnor | 98.05 | 57.79 | 61.12 | 51.02 | 57.86 |
16-function average | 99.88 | 94.71 | 95.18 | 93.40 | 94.80 |
Thus, the FCNN is the first simulated chemical system to autonomously learn linearly inseparable functions. As shown by Minsky and Papert [45], single perceptrons cannot learn such functions, because, as binary classifiers, they can only divide the input space along a hyperplane. In common with a multilayer perceptron, the FCNN does not have this constraint. To illustrate this, we mapped the response surface of an FCNN after it had successfully learned to compute xor, as shown in Figure 11.
7 Conclusion
We have presented a hierarchical, compartmentalized artificial chemical system that is capable of autonomous learning. The feedforward chemical neural network (FCNN) is, to our knowledge, the first simulated chemical system to learn a linearly inseparable function. It does so by a chemical analogue of backpropagation. This is demonstrated in the classic example of the xor binary function, which the FCNN learns perfectly in under 1,500 learning iterations, 100% of the time.
Each chemical neuron in the FCNN is a modified version of the AASP from our previous work. Neurons are distinguished from each other by their compartmentalization in nested cell-like containers. This nesting constrains FCNNs to treelike topologies, but allows modular design. Interneuron communication, facilitating feeding forward and backpropagation, is mediated by selective permeation of signal species through the compartment walls.
The hidden neurons are chemically identical in the species and reactions that occur within them. Thus, the FCNN brings us closer to a world where we might build a whole class of general-purpose chemical computers from the same scalable, composable components.
Because they follow textbook reaction rate laws, the chemical reaction networks we have designed (i.e., each individual chemical neuron) are directly translatable to wet chemistries thanks to advances in DNA strand-displacement circuits [55]. Translating our cell-like containers, along with their reactive channels, into a wet implementation would be more challenging, as the problem of constructing chemical membranes has no straightforward solution. We are hopeful that current work in synthesizing bilayer lipid membranes will develop such a solution.
One technical complication in the FCNN design worth addressing is the manual injection of the feedforward signal species SF. Ideally, a chemical neural network could operate completely independently once receiving input and a desired output signal. We are hopeful of developing mechanisms by which SF could be generated periodically within the FCNN, perhaps using chemical oscillators such as the Lotka-Volterra predator-prey system [38, 39].
Another direction for further research is optimizing the modules of the FCNN. In our previous work, we have developed a small family of single chemical perceptrons—we have already mentioned the (standard) asymmetric signal perceptron [7], the thresholded asymmetric signal perceptron [7], and the analogue ASP [3]. We have explored network architectures with many of these as building blocks, shown in our Online Supplementary Materials. Though we have presented the best-performing architecture we have tested, it is possible that there are yet better components for the FCNN than our current hidden and output neurons.
In our previous work we have developed chemical perceptrons that incorporate delay lines to learn patterns in time series [3, 46]. Integrating these into an FCNN would add a temporal dimension to the network's pattern-recognition capabilities, which would enable us to tackle standard neural network and reservoir computing benchmark time series such as NARMA [1], Mackey-Glass [19, 40], and the Hénon map [23].
Another logical extension of this work is to explore the FCNN model on larger networks, with either more hidden layers, more neurons per layer, or both. Larger FCNNs are expected to tackle more complex problems, and it is an open question how many layers and neurons per layer are necessary for the FCNN to solve certain tasks, as its behavior and performance will likely differ somewhat from classically implemented neural networks.
It bears repeating that the FCNN is not a chemical transcription of a neural network, but an analogy. As discussed with our results in Section 6, the FCNN we used here to solve xor converged about as quickly as Rumelhart, Hinton, and Williams's first xor-solving network [53]. It would be interesting to explore whether the FCNN generally converges faster or slower than classical neural nets, since our analogue of input-weight integration, written in the language of ODEs, is highly nonlinear and need not mirror the learning dynamics of the classical formulation.
The importance of this research is the hope of a reprogrammable chemical computer. Like the formal multilayered perceptron, the FCNN has two modes: simply processing output when given an input, and learning via backpropagation of errors. A wet implementation of the FCNN, once trained to learn a task, could perform that task reliably as long as desired. The chemical computer could then, at any time, be retrained to perform any other task of which it is capable.
As parallelization becomes ever more prominent in computing, it is difficult not to be tempted by the possibility of molecule-scale chemical computers. Such tiny machines could conceivably be combined in molar amounts (on the scale of 10²³) to perform naturally distributed, parallel computing.
Furthermore, the liquid nature of chemical computers and the possibility of constructing them from bio-compatible DNA strands [55] open profound possibilities for patient-embedded biochemical computers. A fleet of cell-size machines could monitor chemical concentrations in the bloodstream, modulating the release of some drug accordingly. With time-series integration, biochemical computers could keep a record of changes in a biological system and act as diagnostic aids and tools in preventative medicine.
There has already been significant research into medical uses of computational chemical networks. One recent result [61] presented a chemical logic circuit that tests for the presence of particular microRNA molecules and the absence of others, effectively identifying a HeLa cancer cell. The authors designed a bespoke chemical logic circuit for this purpose, but this is exactly the type of computation an FCNN excels at learning. A standardized model of a programmable chemical computer would trivialize the design of such medical machines, whose potential has only begun to be explored.
Acknowledgments
This material is based upon work supported by the National Science Foundation under grants 1318833, 1518833, and 1518861.
References
Author notes
Contact author.
Wild Me, 1726 North Terry Street, Portland, OR 97217.
Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Luxembourg.
Department of Electrical and Computer Engineering, Portland State University, Portland, OR 97201.
Department of Computer Science and Center for Biomedical Engineering, University of New Mexico, Albuquerque, NM 87131.