Abstract
The design and implementation of adaptive chemical reaction networks, capable of adjusting their behavior over time in response to experience, is a key goal for the fields of molecular computing and DNA nanotechnology. Mainstream machine learning research offers powerful tools for implementing learning behavior that could one day be realized in a wet chemistry system. Here we develop an abstract chemical reaction network model that implements the backpropagation learning algorithm for a feedforward neural network whose nodes employ the nonlinear “leaky rectified linear unit” transfer function. Our network directly implements the mathematics behind this well-studied learning algorithm, and we demonstrate its capabilities by training the system to learn a linearly inseparable decision surface, specifically, the XOR logic function. We show that this simulation quantitatively follows the definition of the underlying algorithm. To implement this system, we also report ProBioSim, a simulator that enables arbitrary training protocols for simulated chemical reaction networks to be straightforwardly defined using constructs from the host programming language. This work thus provides new insight into the capabilities of learning chemical reaction networks and also develops new computational tools to simulate their behavior, which could be applied in the design and implementations of adaptive artificial life.
1 Introduction
The field of molecular computing has made great strides in the implementation of sophisticated information processing systems at the nanoscale, including logic circuits (Qian & Winfree, 2011a; Seelig et al., 2006), distributed algorithms (Chen et al., 2013), and artificial neural networks (Cherry & Qian, 2018; Qian et al., 2011). Such systems often use DNA molecules to implement their computational processes because the sequence-specific nature of DNA chemistry makes it possible to encode structures and interactions based solely on the sequences of the constituent molecules that make up these systems. The ability to process information at the nanoscale has a range of enticing applications in biomedical diagnostics and therapeutics (Chatterjee et al., 2018; Chen et al., 2015; Groves et al., 2016; C. Zhang et al., 2020) and in the control of nanoscale robotic devices (Thubagere, Li, et al., 2017).
The fact that molecular computing devices can mimic the computational behavior of artificial neural networks is of particular interest because it hints at the possibility of implementing adaptive behavior in engineered biochemical systems, if those molecular systems could be made to learn. To date, this has not been accomplished in a molecular implementation of an artificial neural network. However, living systems exhibit a range of adaptive behaviors, and even single-celled organisms have been shown to adapt to changes in their environment (Dexter et al., 2019), with claims made in the literature that even these simple organisms are capable of learning (Hennessey et al., 1979; Wood, 1988). This implies that there may be a molecular basis for learning behaviors in biological systems and, furthermore, that engineered molecular computing systems might be able to replicate some of that behavioral complexity. The design and implementation of learning algorithms for molecular computing systems is of great interest and would enable the development of engineered nanoscale control systems capable of intelligently responding to changes in their environment. For example, smart biomedical devices endowed with molecular learning circuits could be programmed to learn and adapt to the specifics of a particular patient’s physiology.
In mainstream machine learning research, artificial neural networks have become established as a powerful mechanism for training models. Given that previous work has demonstrated the implementation of neural networks using DNA strand displacement reactions (Cherry & Qian, 2018; Qian et al., 2011), neural networks are a promising framework for implementing machine learning algorithms in a molecular computing system. Furthermore, mathematically well-defined learning algorithms exist for training, such as the well-established backpropagation algorithm (Rumelhart et al., 1986). Therefore, in this article, we show that a simple neural network learning algorithm based on backpropagation can be realized as an abstract chemical reacton network (CRN). CRNs are the underlying “programming language” of molecular computing circuits (Cook et al., 2009), and it has been shown that any such network can be compiled into a corresponding network in real wet chemistry using DNA strand displacement reactions (Soloveichik et al., 2010). This provides a potential route to a future laboratory implementation of our circuit designs. Furthermore, simulating CRNs is of great interest for molecular programming.
Here we outline the design of an abstract CRN that can implement backpropagation learning in a multilayer artificial neural network architecture. The neurons in our system utilize the leaky rectified linear unit (leaky ReLU) nonlinearity, which is widely used in modern machine learning models. We demonstrate the use of our system to train a simple neural network to learn a logic function that is not linearly separable, by simulating an deterministic ordinary differential equation (ODE) model of the CRN. We illustrate the correctness of this CRN design by comparing its output to that from a reference implementation of the underlying backpropagation algorithm. We develop a flexible, general-purpose simulator to enable these simulations to be carried out straightforwardly. Our work therefore demonstrates the first implementation of backpropagation learning in a multilayer nonlinear chemical neural network design, thereby opening the door to future development of chemical reaction network designs with advanced learning capabilities.
2 Related Work
Previous work by ourselves and others has developed and simulated designs for neural network–like chemical reaction systems capable of learning, from which we draw inspiration for the current work. In early work, we developed designs for systems capable of supervised learning in networks of DNAzyme-based reactions (Lakin et al., 2014) and toehold-mediated DNA strand displacement reactions (Lakin & Stefanovic, 2016). Blount et al. (2017) reported a feedforward chemical network that uses a backpropagation-style algorithm for learning; however, the system design does not implement a rigorous mathematical specification of the algorithm, and thus its behavior is somewhat hard to predict a priori. Other approaches to learning in chemical systems have also been investigated, including training via reinforcement learning (Banda et al., 2014) and learning via weight perturbation learning in a multilayer network of neurons using the tanh nonlinearity (Arredondo & Lakin, 2022). The crucial goal in all of this work has been to establish systems in which the learning process is carried out solely within the simulated chemistry and not via external computational mechanisms. Autonomous chemical learning systems could, at least in principle, be developed to function autonomously and be trained by an experimenter to implement specific computational functions. However, there remains a lack of circuit designs that can implement the standard and well-known algorithm of backpropagation for supervised learning in a well-understood and clearly specified manner; the current article aims to address this.
Theroretical work on chemical neural networks that can compute outputs but cannot learn has yielded promising approaches to use deep learning to program molecular circuits (Vasić et al., 2020). Previous work has discovered rate-independent chemical reaction systems for computing the nonleaky version of the rectified linear unit (ReLU) nonlinearity (Vasić, Soloveichik, & Khurshid, 2020b) and has also explored the potential for DNA-based implementations of binarized neural networks (Linder et al., 2021). However, while using neurons with binarized weights greatly simplifies the forward computation, training such neurons via gradient descent is challenging because gradient descent works by making many small changes to the weight values, which cannot be done if the weights are binarized. Thus an additional continuous representation of the weights must be maintained for training purposes, which is then binarized for carrying out the forward computations (Simons & Lee, 2019).
Much of the work discussed has focused on high-level abstract specifications of CRNs, in which the system design is expressed in terms of transformation rules between abstract chemical reactants and products. While abstract, such specifications can nevertheless be related to experimental science via translation into networks of toehold-mediated DNA strand displacement reactions (D. Y. Zhang & Seelig, 2011). This correspondence was established by Soloveichik et al. (2010) for arbitrary abstract CRNs, and previous work has studied a range of distinct translation schemes (Cardelli, 2010, 2013). Thus abstract CRNs are conveniently high level while also linked to low-level molecular implementations; therefore a range of computational tools to design and test these translations has been developed. Such tools specialized for DNA nanotechnology include the Nuskell and Peppercorn tools developed in the Winfree group (Badelt et al., 2017, 2020), which are also embedded within Python. The Visual DSD system for DNA strand displacement modeling (Lakin et al., 2011) has limited facilities for perturbing the system (Yordanov et al., 2014). With regard to tools more suited for simulations of adaptive and trainable chemical systems, other, more general simulators, such as StochKit2 (Sanft et al., 2011) and GillesPy (Abel et al., 2016), use both time-based and state-based event triggers for system perturbations that can be used to model external interference from an experimenter or other feedback from the chemical environment. The Kappa system includes perturbation events expressed in the Kappa DSL itself (Boutillier et al., 2018), along with alarms that check these periodically. The COEL cloud-based simulation framework (Banda & Teuscher, 2016) uses perturbation definitions based on the Java Math Expression Parser. To our knowledge, these related systems do not permit the use of arbitrary computation in a general-purpose host language to specify arbitrarily complex interventions in the system when those triggers are activated, as is possible with the simulator on which we report herein.
Finally, some experimental work on computational DNA strand displacement circuits has yielded exciting results in terms of neural network implementations. Most notably, Qian et al. (2011) implemented a recurrent Hopfield neural network using catalytic “seesaw” DNA strand displacement reactions (Qian & Winfree, 2011b). That work was subsequently scaled up to much larger networks that could carry out pattern recognition tasks on a version of the MNIST data set of handwritten digits (Cherry & Qian, 2018). These results hint at exciting future developments in molecular learning circuits. However, these systems are still very small and very simple by the standards of modern-day silicon-based computing hardware, and the amount of additional circuitry required to go from a one-shot neural network implementation to a version with built-in learning capabilities is significant. Thus computational work still has a part to play in laying out future paths for this work.
3 Chemical Reaction Networks
4 Results
4.1 Neural Network Structure and Definition
In this work, we use abstract CRNs to input trainable neural networks composed of leaky ReLU neurons. This nonlinearity is in widespread use in mainstream machine learning research because of its relative simplificity and good learning performance. Figure 1 outlines the schematics of single neurons and the network structure under study. In a single neuron, the input signals are multipled by weights before being passed through the leaky ReLU nonlinearity. The whole network under study accepts two input signals, X1 and X2, that are passed into two hidden layer neurons, N0 and N1. These produce output signals H0 and H1, respectively. These signals are fed into a single output layer neuron, N2, which produces the overall output signal, Y. Each neuron also has a third input, a bias signal that is clamped to −1. Importantly, this is a feedforward network, which means that we can straightforwardly apply the backpropagation algorithm to calculate the weight updates, as outlined in what follows.
4.2 Derivations of Loss Derivatives
4.3 Designing a CRN to Directly Compute the Loss Derivatives
Given the mathematical definition of our network’s desired operation, both for output generation and for learning, we can now start to build an abstract CRN implementation of this behavior. This is a key advantage of our approach; by grounding our system in a well-studied and well-understood network implementation and learning algorithm, we have a concrete target against which to test our CRN implementation so as to gauge its performance. In this section, we describe the various features of our CRN neural network structure and implementation. Although the various components of the CRN design are presented separately, our system is a single well-mixed solution in which all reactions are assumed to be occurring simultaneously within a one-pot reaction vessel. As outlined in what follows, the timing of reactions is mediated by the use of a chemical clock signal, but the reactions are all assumed to be occurring in parallel and competing for reactants; indeed, this competition is a key aspect of our system design.
4.3.1 Regulation by a Molecular Clock Signal
Clock phase . | Operations carried out . |
---|---|
C1 | Input fanout |
C3 | Input copying and weight calculations for hidden layer neurons (N0) and N1 |
C5 | Preparing leaky ReLU calculations for N0 and N1 |
C7 | Executing leaky ReLU calculations for N0 and N1 |
C9 | Input copying and weight calculations for output layer neuron (N2) |
C11 | Preparing leaky ReLU calculation for N2 |
C13 | Executing leaky ReLU calculation for N2 |
C15 | Calculation of network loss |
C17 | Calculation of backpropagation signal into N2 |
C19 | Calculation of weight updates for N2 and backpropagation signals into N1 and N0 |
C21 | Calculation of weight updates for N1 and N0 |
C23 | Application of weight updates |
C25 | Cleanup (degradation of various signals) in preparation for next training round |
Clock phase . | Operations carried out . |
---|---|
C1 | Input fanout |
C3 | Input copying and weight calculations for hidden layer neurons (N0) and N1 |
C5 | Preparing leaky ReLU calculations for N0 and N1 |
C7 | Executing leaky ReLU calculations for N0 and N1 |
C9 | Input copying and weight calculations for output layer neuron (N2) |
C11 | Preparing leaky ReLU calculation for N2 |
C13 | Executing leaky ReLU calculation for N2 |
C15 | Calculation of network loss |
C17 | Calculation of backpropagation signal into N2 |
C19 | Calculation of weight updates for N2 and backpropagation signals into N1 and N0 |
C21 | Calculation of weight updates for N1 and N0 |
C23 | Application of weight updates |
C25 | Cleanup (degradation of various signals) in preparation for next training round |
4.3.2 Dual-Rail Signal Representation
4.3.3 Input Fan-out
4.3.4 Input Weighting
4.3.5 Input Copying
4.3.6 Calculation of Leaky ReLU Nonlinearity
Having calculated the weighted sums of the inputs, and copied the input values as appropriate, the next step in the feedforward network computation is to pass the weighted inputs through the leaky ReLU nonlinearity. Previous work (Linder et al., 2021; Vasić, Soloveichik, & Khurshid, 2020b) has uncovered rate-independent CRNs for calculating the nonleaky variant of the ReLU nonlinearity; however, here we wish to use the leaky variant and thus must design our own, new CRN.
4.3.7 Calculation of Network Error
Having completed the feedforward network computation, before we can start the backpropagation learning stage, we must first calculate the network error (also known as loss) between the output signal Y produced by the output neuron and the expected output value TARGET, which we assume is supplied by the experimenter along with the input signals at the start of each training round.
In fact, what we actually calculate to serve as the input to the backpropagation algorithm is the partial derivative of the loss with respect to the network output, . Given that we define the loss as L = · (Y − TARGET)2, this partial derivative can be computed straightforwardly as = Y − TARGET. This obviates the need to calculate the square of the difference between the output Y and the target output TARGET, although this could in principle be computed by a CRN using previously reported techniques (Buisman et al., 2009).
4.3.8 Calculation of Loss Derivatives via Backpropagation
The loss derivative having been calculated as the dual-rail species E, this can serve as the input to the backpropagation calculations required to compute the weight updates for gradient descent learning. This is the crux of our CRN learning algorithm. The outline structure of the backpropagation calculations is summarized in Figure 2. Once the input signals and chosen leaky ReLU gradients have been copied as outlined earlier, the backpropagation phase is just a series of straightforward multiplications, following the derivations presented in section S1 of the Supplemental Material.
For each neuron, we divide the backpropagation calculation into two parts, the computing of the backpropagation signal “into” that neuron, which is the same for all weight updates associated with that neuron, and the computing of the individual weight updates, which differ between the various weights associated with a given neuron. To simplify the analysis of the CRN, these occur in distinct clock phases, although they could be combined if reducing clock phases were a concern. We use species Qi to represent the shared incoming value for neuron Ni in the backpropagation calculation.
4.3.9 Application of Weight Updates
Having calculated the partial loss derivatives, we can now apply the updates to the weights. Note that each weight cannot be updated immediately as the partial loss derivatives are calculated because the original weight values used in the feedforward network computation phase of each training cycle are required to compute the backpropagation signals; these computations are going on at the same time, as outlined previously. Thus we compute the partial loss derivatives as earlier for all weights, then apply the weight updates in the next clock phase (C23) after the backpropagation process has concluded.
4.3.10 Cleanup at the End of Training Cycle
4.3.11 Chemical Reaction Network Size
The numbers of species and reactions required are a common metric for CRN design complexity; the CRN design outlined herein includes a total of 135 species and 315 reactions. This is similar to the simpler of the two similar networks from our previous work that we trained via weight perturbation (Arredondo & Lakin, 2022); that CRN involved 144 species and 286 reactions. However, only one weight could be adjusted per training round in that version, while our backpropgation network adjusts every weight in each training cycle. The similar network that could adjust every weight in each training cycle via weight perturbation was significantly larger, involving 583 species and 1,213 reactions. Thus the design outlined herein improves on previous work to enable the entire network to be trained using a more compact design than previously available. A full listing of the backpropagation CRN species and reactions is presented in section S3 of the Supplemental Material.
4.3.12 Initial Network State
In the initial state of the CRN, the concentrations of the clock species C25 and C26 are set to 1.0, with all other clock species concentrations set to 10−6. Thus the system starts in the “last” cycle of the oscillator, and the first full training round will begin shortly after the simulation begins. The dual-rail weight species Wij are initialized with values corresponding to the initial values of all the corresponding weights in the untrained network. The DPOSi and DNEGi species for each neuron Ni are initialized with the values for the leaky ReLU gradient parameters α and β for each neuron. (These could, in principle, be different for each neuron.) Finally, the Alpha species is assigned a concentration corresponding to the value of the learning rate parameter for the network. The initial concentrations of all other CRN species are set to zero.
4.3.13 Training Implementation
The CRN is trained by increasing the concentrations of the species representing the dual-rail input signals (X1, X2, and BIAS) at the start of each C1 oscillator phase, which corresponds to the start of a new training cycle. The amount of the increase represents the value of that signal for this training cycle. An additional concentration of the TARGET species, representing the expected output of the network given these inputs, is also supplied. These additions represent the actions of an experimenter external to the system; thus, the system will undergo supervised learning. We assume that the volume increase associated with these additions is neglible so that we do not need to model dilution effects. This is repeated at the start of the next training cycle, and so on. These additions are modeled via a system of model perturbations implemented in our simulation engine, which we describe later.
4.4 Simulation Methodology
Having established a design for abstract CRNs capable of backpropagation learning, we need a methodology to easily simulate their behavior. To this end, we have developed ProBioSim, a simulator that incorporates features specifically to simulate such systems, in which the CRN component interacts with an environment that is specified externally to the CRN itself but that can modify the CRN in an arbitrarily complex, state- and time-dependent manner. Given that existing designs for molecular systems are currently beyond our experimental capabilities, many of these potential applications of DNA nanotechnology have been studied in silico. This motivates the development of software tools targeted specifically for the study of the interactions of engineered molecular systems with externally specified environments.
4.4.1 ProBioSim Simulator Outline
As outlined in Figure 3(a), ProBioSim is a Python library that incorporates domain-specific language (DSL) for molecular circuits (Lakin & Phillips, 2020). The DSL included within ProBioSim is a simple syntax for specifying the reactions, rate constants, and simulation settings required to define and simulate the behavior of abstract CRNs. This language also includes the capability to specify simple external interventions to add or remove quantities of certain chemical species at specified points in simulation time. This syntax can encode unconditional, deterministic perturbations of the system like those used in previous work on supervised learning in chemical systems (Lakin & Stefanovic, 2016) and allows for rapid prototyping and testing of candidate CRN designs. The fact that ProBioSim is based on Python means that it can be trivially integrated with systems for rapid prototyping and visual feedback, such as Jupyter Notebooks (Kluyver et al., 2016), which also enhances reproducibility of modeling and simulation workflows.
There are two types of perturbation in ProBioSim: perturbation actions and perturbation functions. In each case, the perturbation is scheduled to occur at a specific time point in the simulation, at which point the simulation (either stochastic or deterministic) is paused, the specified changes to the simulation state are applied, and the simulation is restarted, as outlined in Figure 3(b). Multiple perturbation actions and functions can be specified to execute at the same time point.
4.4.2 Perturbation Actions
Perturbation actions are explicitly specified modifications to the state of the simulation. They can be specified within the text-based CRN specification language included within ProBioSim and can be exported to and imported from text files that specify the model. The set of perturbation actions that can be specified in this way is limited to the following:
increment the concentration or count of a species by a specified amount,
decrement the concentration or count of a species by a specified amount, or
set the concentration or count of a species to a specified absolute amount.
perturbation X + = 10 @ 5.0
4.4.3 Arbitrary Perturbation Functions Expressed as Python Code
The real power of the ProBioSim system comes from its ability to define perturbations to the CRN system in terms of arbitrary code. This is possible because, in contrast to some other molecular design tools (Lakin et al., 2011), ProBioSim is embedded within a host language: the Python general-purpose programming language. This provides a ready-made language for expressing arbitrary responses to CRN behaviors, which can be used to simulate the interactions of the CRN with an experimenter or a responsive external environment. Such perturbation functions are modifications to the simulation state that are specified in terms of functions written in Python. As such, perturbation functions are significantly more powerful than perturbation actions, as they allow arbitrary computations to be carried out to determine how the state of the system should be updated. ProBioSim itself does not check the content of user-supplied perturbation functions. However, those perturbation functions must be written in a particular way to enable them to be called by the simulator as required. Specifically, the form of a perturbation function must be as follows:
def pfun(t, x0, get, set, adjust):
# … perturbation code here …
t is the simulation time at which the function is called;
x0 is the simulation state passed into the function;
get is a function that, when called as get(x0, sp), returns the current value associated with species sp in the current state x0;
set is a function that, when called as set(x0, sp, n), updates the state x0 by assigning the value n to the species sp; and
adjust is a function that, when called as adjust (x0, sp, n), updates the state x0 by adding n (which could be negative) to the value currently assigned to the species sp.
def pfun(t, x0, get, set, adjust):
print(‘>>> Value of X__before__perturbation at time’ +
str(t) + ‘=’ + str(get(x0, ‘X’)))
adjust(x0, ‘X’, 10.0)
print(‘>>> Value of X__after__perturbation at time’ +
str(t) + ‘=’ + str(get(x0, ‘X’)))
This approach to defining perturbations obviates the need to define a full DSL capable of implementing a desired set of environmental responses to the behavior of the simulated CRN. In addition, free variables in the perturbation function can be used to pass in additional information to the perturbation function. For example, random number generator objects can be straightforwardly defined in the simulation setup code and referenced as free variables in the perturbation function. In this way, randomized responses can be trivially implemented, as outlined in section S4 in the Supplemental Material. Other uses of this technique could include passing a file handle into the simulator to enable facile logging of the simulation state at each perturbation.
4.4.4 Applying Perturbations
Perturbations can be defined in any order in the input and are organized into time order by the simulator. If multiple perturbation actions or functions are specified at the same time point (or at time points that are indistinguishable up to the tolerance of floating-point arithmetic), then all of the perturbation actions will be applied to the simulation first, followed by all of the perturbation functions. Within these two categories, the actions and functions will be applied in the order in which they were added to the model. (Note that changing the order of addition could therefore change the overall result of the perturbations applied at a given time point.)
After the perturbation actions and functions are applied, the resulting state must be checked to ensure the integrity of the simulation. First, the resulting state of the system is checked to ensure that there are no negative values; if there are, these are set to zero (negative concentrations or species counts being unrealistic). Then, if the simulation is stochastic, any noninteger values are converted to integers, by rounding down if necessary. This prevents fractional species counts being recorded. The simulator is then restarted, beginning from the new state that resulted from the application of the perturbations.
4.4.5 Model Import and Export
ProBioSim models can be saved to a human-readable text file and subsequently reparsed back from that file (with the exception of perturbation functions, which must be added to the model programmatically). Models can also be typeset via generation of corresponding LaTeX code. However, all model construction and modification functions can also be carried out programmatically from a Python script, enabling the automated, modular construction of large-scale CRNs. See section S4 in the Supplemental Material for example model definitions specified partially as text and defined programmatically using Python code.
4.4.6 Simulating ProBioSim Models
Stochastic simulations are implemented using Gillespie’s (1977) direct method, which provides an exact implementation of stochastic mass-action kinetics. Briefly, for each reaction rj, the counts of all reactant species are multiplied together and by the reaction’s rate constant rate(rj) to calculate the propensity ρj of that reaction. The next reaction is chosen at random, with the probability of choosing rj given by ρj/(∑jρj), that is, with probability proportional to the corresponding reaction propensity. The time until the next reaction fires is drawn from an exponential distribution with mean 1/(∑jρj). The reaction is applied to update the state, the propensities are recalculated, and the simulation loop moves on to the next iteration. (Note, however, that our backpropagation learning simulations will only use the deterministic ODE solving simulation method.)
In the following section, we present results from learning simulations on our backpropagation- based chemical neural network design, carried out using the ProBioSim simulator. We refer the reader to section S4 of the Supplemental Material for further examples of ProBioSim model code and simulation outputs, including some examples involving state-dependent and probabilistic perturbations that show the full power of the ProBioSim approach, including some examples of associative learning in simple abstract chemical reaction networks.
4.5 Learning Simulation Results
To demonstrate the successful operation of our CRN, we simulated it learning the XOR logic function. The truth table for this function can be summarized as follows:
F xor F = F F xor T = T T xor F = T T xor T = F
The specific CRN instance that we simulated starts with the following initial weight values: W00 = −0.8, W01 = −0.8, W02 = −0.8, W10 = −2.0, W11 = −0.8, W12 = −1.0, W20 = −1.2, W21 = 1.6, and W22 = −1.0. For each neuron i, the values of DPOSi and DNEGi, representing the gradients for the positive and negative arms of the leaky ReLU transfer function, are 1.0 and 0.1, respectively. The value of the learning rate parameter was α = 0.5. To train the system, the following sequence of eight training instances was repeated three times, for a total of 24 training rounds:
X1 = −1, X2 = −1, TARGET = −1;
X1 = −1, X2 = 1, TARGET = 1;
X1 = 1, X2 = −1, TARGET = 1;
X1 = 1, X2 = 1, TARGET = −1;
X1 = −0.5, X2 = −0.5, TARGET = −1;
X1 = −0.5, X2 = 0.5, TARGET = 1;
X1 = 0.5, X2 = −0.5, TARGET = 1; and
X1 = 0.5, X2 = 0.5, TARGET = −1.
The learning CRN corresponding to this system was compiled into reactions and perturbations, and the resulting ProBioSim model is presented in section S3 of the Supplemental Material. This CRN was simulated by integrating a deterministic ODE model, and the resulting weights were used to calculate the decision surface after certain training rounds; these are plotted as heat maps in Figure 4. The heat maps show the decision surface induced by the weights shifting toward the desired form over the course of the training rounds; eventually, it ends up in the expected shape, with a band of high output running diagonally between the (−1, +1) and (+1, −1) corners of the space, with low output in the vicinity of the (−1, −1) and (+1, +1) corners. This qualitatively indicates that our CRN system is indeed able to learn the challenging XOR example using the backpropagation learning algorithm. To illustrate the sequence of key operations that takes place during training, section S2 of the Supplemental Material presents example time course figures from the first training round of this simulation, covering both the feedforward and backpropagation phases.
To assess the quantitative performance of our learning CRN, we compared the sequence of weights obtained in this simulation with those that would be expected to arise from a direct reference implementation of the backpropagation algorithm. We calculated the difference between the weight values observed at the end of each training round in our simulated learning CRN and the corresponding weight values obtained from a reference implementation written in Python. These values are presented in Table 2. We observe accuracy to at least two decimal places in all cases and to three or more in many cases. This indicates that our system is quantitatively correct in that its behavior closely matches that of the underlying learning algorithm that it aims to implement. We briefly outline some potential sources of error in the following pages.
Round . | W00 . | W01 . | W02 . | W10 . | W11 . | W12 . | W20 . | W21 . | W22 . |
---|---|---|---|---|---|---|---|---|---|
1 | −0.0029 | −0.0029 | −0.0029 | 0.0018 | 0.0018 | 0.0018 | −0.0010 | 0.0028 | 0.0043 |
2 | −0.0030 | −0.0030 | −0.0028 | −0.0013 | −0.0013 | 0.0049 | −0.0007 | 0.0028 | 0.0032 |
3 | −0.0031 | −0.0029 | −0.0029 | −0.0048 | 0.0022 | 0.0014 | −0.0003 | 0.0029 | 0.0007 |
4 | −0.0005 | −0.0055 | −0.0056 | −0.0045 | 0.0018 | 0.0011 | −0.0000 | 0.0045 | 0.0002 |
5 | −0.0005 | −0.0055 | −0.0056 | −0.0046 | 0.0018 | 0.0010 | −0.0000 | 0.0045 | 0.0003 |
6 | −0.0006 | −0.0056 | −0.0055 | −0.0054 | 0.0014 | 0.0014 | 0.0000 | 0.0045 | 0.0006 |
7 | −0.0006 | −0.0055 | −0.0056 | −0.0060 | 0.0017 | 0.0011 | 0.0000 | 0.0045 | 0.0010 |
8 | −0.0004 | −0.0057 | −0.0057 | −0.0060 | 0.0017 | 0.0010 | 0.0000 | 0.0047 | 0.0010 |
9 | −0.0004 | −0.0057 | −0.0057 | −0.0063 | 0.0014 | 0.0008 | 0.0001 | 0.0048 | 0.0010 |
10 | −0.0005 | −0.0057 | −0.0057 | −0.0067 | 0.0010 | 0.0012 | 0.0001 | 0.0048 | 0.0014 |
11 | −0.0005 | −0.0057 | −0.0057 | −0.0072 | 0.0015 | 0.0007 | 0.0001 | 0.0048 | 0.0018 |
12 | −0.0006 | −0.0056 | −0.0057 | −0.0072 | 0.0015 | 0.0007 | 0.0002 | 0.0047 | 0.0018 |
13 | −0.0005 | −0.0056 | −0.0057 | −0.0069 | 0.0017 | 0.0009 | 0.0001 | 0.0046 | 0.0020 |
14 | −0.0006 | −0.0056 | −0.0057 | −0.0072 | 0.0016 | 0.0011 | 0.0001 | 0.0046 | 0.0024 |
15 | −0.0006 | −0.0056 | −0.0057 | −0.0075 | 0.0017 | 0.0009 | 0.0000 | 0.0046 | 0.0028 |
16 | −0.0003 | −0.0058 | −0.0058 | −0.0075 | 0.0017 | 0.0009 | 0.0000 | 0.0048 | 0.0028 |
17 | −0.0003 | −0.0057 | −0.0058 | −0.0074 | 0.0018 | 0.0010 | 0.0000 | 0.0048 | 0.0029 |
18 | −0.0004 | −0.0058 | −0.0058 | −0.0077 | 0.0015 | 0.0013 | −0.0000 | 0.0048 | 0.0033 |
19 | −0.0004 | −0.0057 | −0.0058 | −0.0080 | 0.0018 | 0.0010 | −0.0001 | 0.0048 | 0.0037 |
20 | −0.0005 | −0.0056 | −0.0057 | −0.0081 | 0.0019 | 0.0010 | 0.0001 | 0.0046 | 0.0037 |
21 | −0.0005 | −0.0056 | −0.0057 | −0.0074 | 0.0022 | 0.0013 | −0.0000 | 0.0045 | 0.0037 |
22 | −0.0006 | −0.0057 | −0.0056 | −0.0067 | 0.0026 | 0.0010 | −0.0007 | 0.0045 | 0.0040 |
23 | −0.0006 | −0.0057 | −0.0056 | −0.0066 | 0.0025 | 0.0010 | −0.0015 | 0.0045 | 0.0042 |
24 | −0.0003 | −0.0058 | −0.0057 | −0.0065 | 0.0025 | 0.0010 | −0.0015 | 0.0047 | 0.0041 |
Round . | W00 . | W01 . | W02 . | W10 . | W11 . | W12 . | W20 . | W21 . | W22 . |
---|---|---|---|---|---|---|---|---|---|
1 | −0.0029 | −0.0029 | −0.0029 | 0.0018 | 0.0018 | 0.0018 | −0.0010 | 0.0028 | 0.0043 |
2 | −0.0030 | −0.0030 | −0.0028 | −0.0013 | −0.0013 | 0.0049 | −0.0007 | 0.0028 | 0.0032 |
3 | −0.0031 | −0.0029 | −0.0029 | −0.0048 | 0.0022 | 0.0014 | −0.0003 | 0.0029 | 0.0007 |
4 | −0.0005 | −0.0055 | −0.0056 | −0.0045 | 0.0018 | 0.0011 | −0.0000 | 0.0045 | 0.0002 |
5 | −0.0005 | −0.0055 | −0.0056 | −0.0046 | 0.0018 | 0.0010 | −0.0000 | 0.0045 | 0.0003 |
6 | −0.0006 | −0.0056 | −0.0055 | −0.0054 | 0.0014 | 0.0014 | 0.0000 | 0.0045 | 0.0006 |
7 | −0.0006 | −0.0055 | −0.0056 | −0.0060 | 0.0017 | 0.0011 | 0.0000 | 0.0045 | 0.0010 |
8 | −0.0004 | −0.0057 | −0.0057 | −0.0060 | 0.0017 | 0.0010 | 0.0000 | 0.0047 | 0.0010 |
9 | −0.0004 | −0.0057 | −0.0057 | −0.0063 | 0.0014 | 0.0008 | 0.0001 | 0.0048 | 0.0010 |
10 | −0.0005 | −0.0057 | −0.0057 | −0.0067 | 0.0010 | 0.0012 | 0.0001 | 0.0048 | 0.0014 |
11 | −0.0005 | −0.0057 | −0.0057 | −0.0072 | 0.0015 | 0.0007 | 0.0001 | 0.0048 | 0.0018 |
12 | −0.0006 | −0.0056 | −0.0057 | −0.0072 | 0.0015 | 0.0007 | 0.0002 | 0.0047 | 0.0018 |
13 | −0.0005 | −0.0056 | −0.0057 | −0.0069 | 0.0017 | 0.0009 | 0.0001 | 0.0046 | 0.0020 |
14 | −0.0006 | −0.0056 | −0.0057 | −0.0072 | 0.0016 | 0.0011 | 0.0001 | 0.0046 | 0.0024 |
15 | −0.0006 | −0.0056 | −0.0057 | −0.0075 | 0.0017 | 0.0009 | 0.0000 | 0.0046 | 0.0028 |
16 | −0.0003 | −0.0058 | −0.0058 | −0.0075 | 0.0017 | 0.0009 | 0.0000 | 0.0048 | 0.0028 |
17 | −0.0003 | −0.0057 | −0.0058 | −0.0074 | 0.0018 | 0.0010 | 0.0000 | 0.0048 | 0.0029 |
18 | −0.0004 | −0.0058 | −0.0058 | −0.0077 | 0.0015 | 0.0013 | −0.0000 | 0.0048 | 0.0033 |
19 | −0.0004 | −0.0057 | −0.0058 | −0.0080 | 0.0018 | 0.0010 | −0.0001 | 0.0048 | 0.0037 |
20 | −0.0005 | −0.0056 | −0.0057 | −0.0081 | 0.0019 | 0.0010 | 0.0001 | 0.0046 | 0.0037 |
21 | −0.0005 | −0.0056 | −0.0057 | −0.0074 | 0.0022 | 0.0013 | −0.0000 | 0.0045 | 0.0037 |
22 | −0.0006 | −0.0057 | −0.0056 | −0.0067 | 0.0026 | 0.0010 | −0.0007 | 0.0045 | 0.0040 |
23 | −0.0006 | −0.0057 | −0.0056 | −0.0066 | 0.0025 | 0.0010 | −0.0015 | 0.0045 | 0.0042 |
24 | −0.0003 | −0.0058 | −0.0057 | −0.0065 | 0.0025 | 0.0010 | −0.0015 | 0.0047 | 0.0041 |
Note. Error values were calculated as CRN output minus expected output and are shown to four decimal places. These values demonstrate that the learning calculations carried out by our CRN design produce results very close to those expected based on a direct mathematical calculation of the expected system behavior.
5 Discussion
5.1 CRN Learning Performance
The results presented in Table 2 demonstrate that our learning CRN design qualitatively implements the backpropagation learning algorithm for our leaky ReLU neural network with relatively small errors. However, although the errors are small, they are nonzero, which means there is some potential room for improvement in our designs. One posssible source of error is that the various reactions in our CRN design are scheduled via a molecular clock; though the off-cycle clock phases are low in the times between their correct cycles, they are not zero, which means that the reactions catalyzed by them are still happening in our model, albeit with very low fluxes through them. This introduces small errors into the computation that cannot be avoided if we wish to use an autonomous clock of this form to organize the reactions, as in previous work (Arredondo & Lakin, 2022; Vasić, Soloveichik, & Khurshid, 2020a). An additional potential source of inaccuracy is the fact that our leaky ReLU calculations have some inherent error around the discontinuity in the derivative when the input equals zero. This is because the species representing the input signal catalyzes copying of the species representing the gradient values into the corresponding species required for computing the partial loss derivatives during the backpropagation phase; this process might fail to complete before the associated clock cycle ends if the input value is very close to zero. This could be alleviated by increasing the value of the kCopy rate constant.
5.2 Learning CRN Design Considerations
The design of our CRN and neural network reflects certain limitations of the CRN formalism and the ability to train within that formalism. In particular, the basic arithmetic operations that our system must perform (addition, subtraction, and multiplication) can all be implemented straightforwardly in an abstract CRN using established techniques (Buisman et al., 2009). We use the molecular clock methodology to sequence reactions into discrete phases in large part because this makes it simpler to calculate the discontinuous derivatives of the leaky ReLU units by simply copying the appropriate value into a specific output species, before using that value for computation in a subsequent step. Other approaches, such as rate-independent calculations of the nonleaky ReLU function (Vasić, Soloveichik, & Khurshid, 2020b), are valid only for integer-valued signal representations in stochastic models; furthermore, as has been established, that nonlinearity suffers from the “dying ReLU” problem in that neurons can no longer learn if their input goes negative and the derivative is zero. Note also that in previous work (Arredondo & Lakin, 2022), we used a tanh nonlinearity and trained the system via weight perturbation rather than backpropagation; that was done because our approach to calculating the tanh(x) function could not be straightforwardly applied to compute its derivative, which is 1 − tanh2(x), because the second derivative at x = 0 is already zero. By not using a sigmoidal nonlinearity here, we avoid that problem, but there is the alternative possibility of unbounded output signals being produced by our network, which can cause training to fail. Thus the approach taken here is shaped by the desire to produce a system that can be trained within the chemistry using relatively well-known approaches, while still using a transfer function that is of practical relevance.
5.3 Future Directions for Work on Learning CRNs
Future work on learning CRNs such as ours could take a number of directions. For example, the approach taken here of using sequential phases to execute the various parts of the calculation could be adapted to these other approaches with discontinuities in their derivatives. Previous work (Linder et al., 2021) also used the “hardtanh” transfer function (a piecewise linear approximation to tanh); this could be amenable to our approach. However, that system suffers from some of the same issues as the nonleaky ReLU function due to having gradients of zero in both the positive and negative directions. In addition, the use of complex CRN designs such as those presented here to implement these learning systems would make them extremely challenging to realize in an actual wet chemistry experiment. Some of these challenges could be ameliorated in future iterations of our learning CRN design, for example, by trying to reduce the number of distinct clock phases required to drive the system, or perhaps by doing away with the autonomous molecular clock altogether and using manually controlled signals to regulate the operation of the circuit.
Ideally, chemical learning systems would leverage the massive parallelism inherent in chemistry to advantage, and possible approaches might eventually move beyond feedforward networks to recurrent networks. Related work on chemical Boltzmann machines (Poole et al., 2017) and message-passing inference algorithms (Napp & Adams, 2013) could offer inspiration here, as could work on chemical implementations of reservoir computing systems (Goudarzi et al., 2013), which use a highly recurrent fixed network attached to a linear readout layer that could be trained relatively straightforwardly.
Given the practical difficulties with building molecular computing networks in the laboratory, the development of adaptive and trainable molecular systems is an important goal for the field of molecular programming, not least because it would enable a “build now, train later” approach that could greatly simplify the development and deployment of molecular circuits with specific autonomous behaviors for practical tasks, such as the diagnosis of disease (Lopez et al., 2018; C. Zhang et al., 2020). Finally, molecular learning systems could find applications in Artificial Life, as they could serve as adaptive control programs for engineered lifelike systems and synthetic cells.
Thus a major challenge in the field of chemical learning networks is to find a formalism that both fits naturally with the CRN framework and is also tractable to simulate and, ultimately, build in the laboratory. The latter criterion would imply that the system would need to be very simple; it could well be that learning systems that go beyond supervised learning to embrace alternative frameworks, like reinforcement learning (Banda et al., 2014) or associative learning (Fernando et al., 2009), could hold the answer.
With regard to our particular design, the presence of trimolecular reactions is a possible sticking point for future experimental implementations; however, we note that CRN implementation frameworks such as DNA strand displacement reactions (Soloveichik et al., 2010; D. Y. Zhang & Seelig, 2011) are theoretically capable of realizing such systems via their multistep encodings. However, it is worth noting that the third reactant in the trimolecular reactions in our scheme is always a clock catalyst; this could make it easier to realize in an experimental system via alternative means, such as the presence or absence of some nonchemical signal that controls whether the reactions may proceed, such as an optical signal (Prokup et al., 2012). Furthermore, given that our design relies on competition for input species to drive the computation, rate constants do need to be well balanced to ensure that the results are accurate; this is a fundamental difficulty in analog computation. Recent work on rate-independent chemical implementations of ReLU networks (Vasić et al., 2022) offers a possible way forward here; alternatively, for an experimental implementation using DNA strand displacement, models now exist to predict strand displacement kinetics based on sequence (J. X. Zhang et al., 2021). Experimentalists are also able to tune the concentrations of other components as necessary to recover the desired kinetic balance (Thubagere, Thachuk, et al., 2017).
5.4 Future Directions for Work on the ProBioSim Simulator
Future work to further develop the ProBioSim simulator may include performance optimizations as well as the implementation of alternative simulation algorithms, such as tau-leaping (Gillespie, 2001), or the implementation of additional rate laws, such as Michaelis–Menten kinetics. To enhance the expressiveness of the system, the format of the perturbation functions outlined herein could be generalized to modify other simulation parameters. In particular, the ability to perturb rate constant values would allow ProBioSim to straightforwardly model physical changes in the system, such as changes in temperature or pH (Idili et al., 2014) or the activation of optical control signals (Prokup et al., 2012). These are of interest because they could be used to modulate the behavior of an experimental implementation of the learning CRNs outlined earlier. The simulator could also be modified to explicitly model dilution effects when species are added to perturb the system, as these are not currently modeled explicitly. Our current assumption is reasonable if one makes the assumption that the volume of new sample added during the perturbation is negligible compared to the overall reaction volume. However, it is worth noting that the flexibility of our perturbation function approach could already be used to model these effects; any changes in this regard would be purely for convenience. One could also imagine implementing triggers for events based on state-based predicates, such as the counts or concentrations of various species. These could be implemented (to some extent) within the perturbation function framework, although such conditional perturbations could only possibly be applied at the prespecified times at which the perturbation function is applied. Finally, the ProBioSim simulator could in principle be integrated with techniques for parameter estimation to infer rate constants and other model parameters based on experimental data, although this would only be of real use if such data were available: this is not yet the case for experimental implementations of learning in chemical reaction networks.
6 Conclusions
In conclusion, we have reported a design for novel abstract CRNs that can implement supervised learning via the standard backpropagation training algorithm. This CRN design directly implements the mathematics of backpropagation for a class of nonlinear neurons using the leaky ReLU nonlinear transfer function. We have demonstrated the capabilities of this system to carry out learning tasks by training it to learn the two-input XOR logic function, a function that is not linearly separable and thus provably requires the use of a multilayer network. In addition, we have reported ProBioSim, a simulator for CRNs that incorporates the novel capability to define perturbations to the concentrations (or counts) of species as the results of arbitrary pieces of code expressed in the host programming language, Python. This enables the straightforward modeling of nontrivial environmental responses to the behavior of the CRN component of the system, such as our training protocols. These would otherwise require an ad hoc infrastructure for calculating and implementing these perturbations to be constructed around the underlying simulator. Our approach allows such calculations to be integrated directly into the simulation engine via a straightforward mechanism, enabling facile programming of complex interdependencies between an abstract CRN and the simulated training environment. ProBioSim is released under an open source license, and the source code is available online at https://github.com/matthewlakin/ProBioSim/.
Acknowledgments
This material is based on work supported by the National Science Foundation under grants 1518861 and 1935087.