Unconventional computing devices operating on nonlinear chemical media offer an interesting alternative to standard, semiconductor-based computers. In this work we study in silico a chemical medium composed of communicating droplets that functions as a database classifier. The droplet network can be “programmed” by an externally provided illumination pattern. The complex relationship between the illumination pattern and the droplet behavior makes manual programming hard. We introduce an evolutionary algorithm that automatically finds the optimal illumination pattern for a given classification problem. Notably, our approach does not require us to prespecify the signals that represent the output classes of the classification problem, which is achieved by using a fitness function that measures the mutual information between chemical oscillation patterns and desired output classes. We illustrate the feasibility of our approach in computer simulations by evolving droplet classifiers for three machine learning datasets. We demonstrate that the same medium composed of 25 droplets located on a square lattice can be successfully used for different classification tasks by applying different illumination patterns as its externally supplied program.
There is no doubt that nature equipped living organisms with a powerful information processing potential. Algorithms and computing media inspired by biology are intensively studied to find new computational strategies. Emerging techniques such as molecular computing (Conrad and Zauner, 1990), DNA computing (Adleman, 1994), molecular electronics (Joachim et al., 2000; Azov et al., 2006; Gould, 2005), chemical robotics (Lund et al., 2010; Cejkova et al., 2014), or quantum computing (DiVincenzo, 1995, 2000) benefit from high computational parallelism and component miniaturization. It can be expected that new generations of unconventional computers, apart from offering better performance, can be applied in areas where the potential of silicon computers cannot be fully exploited. Chemistry-based computing seems important for new functional, intelligent materials (Jani et al., 2014), controlled drug delivery (Kahan et al., 2008), or programmable molecular factories for synthesizing individual materials (Stephanopoulos et al., 2005; Li et al., 2015). Apart from their novelty, chemical computers are attractive due to their simplicity in construction and low production cost, e.g., through microfluidic devices (Guzowski et al., 2016; Thutupalli and Herminghaus, 2013).
Many studies and implementations of chemical information processing are concerned with reaction-diffusion phenomena and, in particular, with the Belousov–Zhabotinsky (BZ) reaction (Zaikin and Zhabotinsky, 1970). The BZ reaction is a chemical process in which an organic substrate is oxidized in an acidic environment in the presence of a transient metal catalyst. The reaction is famous because, at specific concentrations of substrates, the concentrations of intermediate reagents change periodically in time. The reactions taken into account in the basic model of the BZ reaction, the Field–Koros–Noyes (FKN) model (Field and Noyes, 1974), are illustrated in Figure 1. Oscillations can be easily observed because the colors of the solution with reduced and oxidized catalyst differ. Therefore, the state of a system with the BZ reaction can be read from the medium color. Bromous acid, HBrO2, is the activating agent of the BZ reaction because, under specific conditions, it is autocatalytically produced. One can distinguish three phases during an oscillation cycle. The rapid production of activator yields oxidation of the catalyst, seen as the change of droplet color from red to blue, named below the excitation phase. A refractory phase begins just after excitation and is characterized by a high level of inhibitor. In this state the medium becomes insensitive to external excitations coming from its neighborhood. Next, the inhibitor concentration decreases with time, and after crossing some threshold value the reaction enters the responsive phase, in which an incoming stimulus can lead to an excitation. If no external triggering occurs, then the amount of inhibitor drops to a level at which self-excitation occurs. In a spatially distributed system, molecules diffuse out of the region of their high concentration and can trigger autocatalytic production in the neighborhood. Such local excitations are responsible for a rich variety of spatio-temporal structures (Epstein and Pojman, 1998).
The waves of oxidized catalyst moving in the system can be interpreted as propagating signals used for reaction-diffusion information processing (Gorecki and Gorecka, 2009). With a structure pre-prepared by an experimentalist, reaction-diffusion computing has been successfully applied for basic information processing tasks (Kuhnert et al., 1989; Steinbock et al., 1996; Adamatzky et al., 2005, 2008; Gorecki et al., 2003, 2009).
Recently, an increasing interest in media composed of droplets containing a solution of the reagents of the BZ reaction can be observed (Gruenert et al., 2013, 2015; Rossi et al., 2015; Thutupalli et al., 2011; Thutupalli and Herminghaus, 2013; Tomasi et al., 2014). Properly adjusted droplet networks exhibit neural-network-like spiking behavior (Howard et al., 2014). Furthermore, interacting droplets are more flexible than pre-prepared excitable and non-excitable channels (Gorecki and Gorecka, 2009), because they can be formed via self-organization. Under selected conditions the system appears to be scalable, since the droplet size can vary from micrometers (Vanag and Epstein, 2008; Vanag, 2011; Toiya et al., 2008) to millimeters (Szymanski et al., 2011; Rossi et al., 2015).
Droplets are formed when a small amount of BZ medium is immersed into an organic oil phase containing lipids or surfactants. The lipid molecules dissolved in the organic phase cover the surface of a droplet and stabilize it mechanically (Szymanski et al., 2011). Therefore, droplets can be arranged into larger structures that remain stable for a long time. When two droplets come in contact, lipids form a bilayer at the connection surface (King et al., 2014). Molecules of BZ activator can diffuse through this membrane and excite the medium behind, triggering a chemical wave in the neighboring droplet, as illustrated by Figure 2 (Szymanski et al., 2011). In most cases it is observed that one of the droplets functions as a pacemaker and forces excitations in the neighboring ones (Guzowski et al., 2016).
Here we discuss a new approach to designing chemical information processing devices. We demonstrate that evolutionary algorithms can be used to find a teaching procedure that transforms the network of oscillating BZ droplets, controlled by an external factor, into a database classifier. Our arguments are based on computer simulations. The algorithms describing the time evolution of the network elements and their interactions are simplified. Yet, the parameters of our models were selected to agree with experimental results. Therefore, the presented classifier design can be verified in experiments.
1.1 Our Approach
We consider a classifier as a geometrical structure of 25 nodes located on a square lattice (see Fig. 3). Each node is occupied by a BZ droplet in which the chemical medium oscillates. The droplets are interconnected with neighbors according to a von-Neumann neighborhood.
Furthermore, we assume that oscillations in each droplet can be individually controlled. This control can be achieved using external factors like a magnetic field (Nishikiori et al., 2011), an electric field (Pornprompanya et al., 2002), or temperature (Masia et al., 2001). Having in mind the experimental realization of droplet classifiers, the photosensitive BZ reaction is the most suitable (Kadar et al., 1997). If a ruthenium catalyst is used, illumination with blue light produces bromide ions that inhibit the BZ reaction. As shown for two uncoupled chemical droplets in Figure 4, their self-excitations are suppressed during illumination. After the illumination is switched off, chemical oscillations are restored immediately.
Given a droplet network, a single droplet is considered as the output and a number of droplets are used as inputs (see Fig. 3). The remaining droplets, called “normal,” transmit and integrate excitations. Furthermore, they are used to program the network by a sequence of illumination patterns. For different inputs, the input droplets are illuminated differently according to the input data, while the normal droplets are always illuminated by the same sequence. The pattern of the normal droplets' illumination can be regarded as the compiled code of the classifier program.
Because the manual programming of nonlinear media is hard and time consuming, we introduce an evolutionary algorithm (EA) to find the optimal illumination sequence automatically: here, for each droplet, the time interval during which it is illuminated. Note that classical approaches that combine an EA with an artificial neural network (ANN) usually optimize the network's static parameters (Fogel et al., 1990) or its structure (Cliff et al., 1993; Yao and Liu, 1997). Here, however, we evolve, for a given network, a temporal control pattern (the illumination pattern), which is applied together with each input.
The network fitness is evaluated by numerical simulations based on a stochastic model of droplet interactions. Notably, the fitness function does not require us to pre-specify how output classes are represented by chemical signals. This is achieved by measuring the mutual information (Shannon, 1948) between chemical oscillation patterns and the desired output classes. The node with the highest mutual information is taken as the one that produces the output.
The method's applicability and the functionality of droplet-based classifiers are illustrated by applying the idea to two datasets from the UCI Machine Learning Repository: CANCER (Wisconsin Breast Cancer Data Set; Bache and Lichman, 2013a) and CAR (Car Evaluation Set; Bache and Lichman, 2013b). The third classification problem, SPHERE, is an artificial, randomly generated dataset. The results show that the same droplet architecture, composed of 25 nodes, can successfully implement different classifiers by applying different evolved illumination patterns.
The classification problems are more general than logic gates; thus, our results significantly extend the reported applications of reaction-diffusion computing. Unlike the definition of a logic gate, which lists all possible input states, a dataset considered in a typical classification problem contains only a fraction of all possible cases. Each test case contains a number of input variables (predictors) and an assigned category (output class). The aim is to find a generalizing algorithm or device that correctly classifies the test cases. It is believed that it also gives correct answers in situations that are not contained in the dataset.
2 Chemical Prototype and Simulation Methods
Our chemical prototype consists of interconnected BZ-medium-filled droplets inside an oil phase, stabilized by lipid membranes (Section 2.1). We model this system with a discrete-event, stochastic simulation model (Section 2.2). We measure the information flow, depending on the input and illumination pattern (program), by mutual information (Section 2.3). To find an illumination pattern automatically, an evolutionary algorithm is introduced (Section 2.4). The fitness function measures the mutual information between a droplet’s spike pattern and the desired output class, thus avoiding the necessity to define the representation beforehand (Section 2.5).
2.1 The Chemical Medium
External triggering, that is, a transfer of a chemical excitation from one droplet to another, occurs when an activator pulse reaches the connection point with neighboring droplets and the influenced droplet is in the responsive phase. In our theoretical analysis we assume that a droplet can be triggered only by the four closest neighbors (von Neumann neighborhood). The experimental results, however, indicate that for millimeter-size droplets the excitation does not occur uniformly in the whole volume (Guzowski et al., 2016). If a droplet self-excites, then usually the ignition point is located in its geometrical center and the excitation spreads out toward the boundaries. If a droplet is activated by another one, then the excitation wave originates at the connection point and propagates through the droplet in the form of a directional wave (see Fig. 2).
The process of droplet activation and the excitation transfer can be described by kinetic equations with diffusion terms. However, even the most simplified model, such as the two-variable model (one variable is the concentration of bromous acid, which plays the role of the reaction activator, and the other is the concentration of the catalyst in its oxidized form, which works as the inhibitor) described in Gorecki et al. (2011), is computationally expensive, especially if it is applied to a two-dimensional, multidroplet structure. To reduce the computation time required to calculate the time evolution of the network, we have introduced an event-based model (Gruenert et al., 2013, 2015), described next.
2.2 Event-Based Model
The evolved networks are simulated using a stochastic, time-continuous model described in Gruenert et al. (2013). We assume that all droplets that form a network have the same chemical composition. Excitation transfer between two interconnected droplets occurs only if one droplet is in the excited phase while the other is in the responsive state. Having in mind experimental results on interactions between droplets, we assumed in the model that the durations of the excitation, refractory, and responsive phases are 1 s, 10 s, and 19 s, respectively. These numbers sum up to a typical oscillation period of 30 s (Szymanski et al., 2011). Since a photosensitive reaction is considered, oscillations can be suppressed in a droplet when it is illuminated with sufficient light intensity (see Fig. 4). Then neither self-excitation nor external triggering occurs, so these droplets remain in the refractory phase. When the illumination is switched off, the droplet starts to oscillate again. The mechanism of photoinhibition of oscillations is used here to evolve an illumination pattern for the network in order to optimize its functionality. We assume that each droplet can be illuminated separately; therefore, the illumination pattern is described by a vector of illumination times (t_1, …, t_25), where [0, t_i] is the time interval, measured from the beginning of the simulation (t = 0), during which the i-th droplet is illuminated. When the illumination time for a droplet is as long as the total time of the simulation, the droplet remains inactive during the whole simulation, and it can be regarded as an empty slot in the network.
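The phase cycle described above can be sketched as a minimal event-based model. The function below is a simplified illustration, with the stochastic noise omitted and the droplet assumed to begin its cycle with an excitation; it returns the phase of a single, isolated droplet at a given time:

```python
# Phase durations in seconds, matching the values assumed in the model.
T_EXCITED, T_REFRACTORY, T_RESPONSIVE = 1.0, 10.0, 19.0
PERIOD = T_EXCITED + T_REFRACTORY + T_RESPONSIVE  # 30 s oscillation period

def phase_at(t, illum_end=0.0):
    """Return the phase of an isolated droplet at time t.

    While illuminated (t < illum_end) the droplet is held in the
    refractory phase; afterwards it free-runs through the 1/10/19 s
    cycle, starting with an excitation once the light is switched off.
    """
    if t < illum_end:
        return "refractory"
    tau = (t - illum_end) % PERIOD
    if tau < T_EXCITED:
        return "excited"
    if tau < T_EXCITED + T_REFRACTORY:
        return "refractory"
    return "responsive"
```

In a network simulation, an excitation of a neighbor during the responsive phase would reset the cycle; here only the free-running and illuminated behavior is shown.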
Note that in the case of self-excitation, the time in which the wave travels from the ignition point to the boundaries of the droplet, where external triggering of a neighboring droplet occurs, depends only on the droplet diameter (compare Fig. 2). On the other hand, when a droplet is triggered externally, the distance covered by the excitation wave before it reaches another connection point can differ. Wave propagation in a self-excited droplet is shown schematically in Figure 2 (droplet A), whereas external triggering is visualized for droplets B and C. In a real system a triggered activation wave excites some neighbors first (droplet C) and others later (droplet D; excitation not shown). Here, for the simplicity of simulation, we introduce a propagation time parameter t_prop = 1 s, identical for all droplets, regardless of the excitation geometry.
Standard computers are built with the assumption that all components are fabricated without any defects and that the fluctuations present in the system do not affect the results of the executed programs. However, a highly nonlinear chemical medium, such as the BZ reaction, is very sensitive to internal and external fluctuations. In order to take this randomness into account, stochastic effects are introduced into the simulations in the form of normally distributed noise with a standard deviation of 0.05 added to the propagation time t_prop and to the times for which a droplet remains in a given phase.
In a single simulation, we register the time evolution of oscillations in the system by recording the times at which excitations occur in each droplet. We assume that a signal represented in this way provides complete information necessary to analyze the network dynamics.
2.3 Information Inflow and Outflow
We consider three droplet types: input, normal, and output (see Fig. 3). They are distinguished according to their functionality in the network. Each droplet is assigned either the input or the normal type. A droplet of the normal type is inhibited during its illumination time t_i. Illumination times of normal droplets remain the same for all test cases. For an input droplet the corresponding parameter is not used, and the illumination procedure is described below.
Input droplets are used to introduce the predictor values of each input case from the dataset into the network. In order to evaluate a single network, all test cases have to be simulated separately. Since the predictor values change according to the test case, the illumination times of input droplets depend on the case. If a dataset contains k predictors, then we distinguish k different input types. If information from a smaller number of inputs is sufficient to obtain a high mutual information between the database and the droplet evolution, then we can reduce the number of used predictors. For example, only 3 out of 10 predictors of the CANCER dataset are used (see Section 3.1).
There are many ways in which a predictor value can be transferred into the medium. Here, if the predictor value is x, then the control factor is applied to all droplets selected as inputs of this predictor for a time proportional to x. Let us consider a dataset with k predictors, where all of them are used as inputs to the network. For a test case from the database, we fix x_s as the value of the predictor s, where s = 1, …, k. If a droplet is of the input type corresponding to the predictor s, then during a simulation of this test case it is illuminated in the time interval [t_start, t_start + x_s (t_end - t_start)]. The times t_start and t_end limit the time interval coevolved with the set of initial illuminations (see Section 2.4, and Figures 5 and 6). The interval is the same for all input droplets in a given network, and its limits are subjects of optimization.
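With the encoding described above, the mapping from a predictor value to an illumination interval might be implemented as follows. This is a hedged sketch: the function name is hypothetical, and the exact form of the interval (illumination length proportional to x within the coevolved window) is an illustrative assumption:

```python
def input_illumination(x, t_start, t_end):
    """Map a predictor value x in [0, 1] to the illumination interval
    of the corresponding input droplet.

    The droplet is illuminated for a time proportional to x inside the
    coevolved window [t_start, t_end]; this particular mapping is one
    possible encoding, not necessarily the only one.
    """
    assert 0.0 <= x <= 1.0 and t_start <= t_end
    return (t_start, t_start + x * (t_end - t_start))
```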
The types of the droplets are also subject to evolution, and thus the positions of the inputs may vary from generation to generation. We also do not introduce any constraints on the number of input droplets in the network. For example, it is possible that a newly generated network contains three droplets with input type 3 and no droplet with input type 1, as seen in Figure 6. In this situation, predictor 3 is provided to all corresponding input droplets and no information about predictor 1 is transferred to the network.
Exactly one droplet of the network is selected as the output droplet. The mutual information with the output class distribution is checked separately for each droplet in the network during the fitness evaluation procedure, and the one with the highest value is selected as the output droplet. Since the mutual information in the droplets changes during evolution, the position of the output is not fixed and can change from generation to generation. Note that under certain conditions an input droplet may also serve as the output one.
2.4 Evolution of BZ Networks
The evolution scheme is based on the approach presented in Beyer and Schwefel (2002); however, the individuals contain no endogenous (i.e., evolvable) strategy parameters. The population is composed of μ parents and λ offspring. The number of generations is selected based on the complexity of the dataset and is specific for each experiment. Depending on the noise influence, either the COMMA or the PLUS strategy was applied (Schwefel, 1977, 1981). The first option assumes that only newly generated offspring are transferred to the next generation. This process is similar to natural procreation in the sense that the parents give birth to the offspring population and die afterwards. In contrast, when the PLUS strategy is applied, a number of the best parents are copied to the next generation along with the offspring. In this case, an individual with a high fitness can survive for many generations.
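The difference between the two selection schemes can be summarized in a short routine. This is a generic sketch of evolution-strategy selection, not the authors' exact implementation; individuals are represented here as (fitness, genome) pairs:

```python
def select_parents(parents, offspring, mu, strategy="COMMA"):
    """One selection step of a (mu, lambda) or (mu + lambda) evolution
    strategy. Individuals are (fitness, genome) pairs; higher fitness wins.
    """
    if strategy == "COMMA":
        pool = offspring            # parents die; only offspring compete
    elif strategy == "PLUS":
        pool = parents + offspring  # elitist: good parents may survive
    else:
        raise ValueError(strategy)
    return sorted(pool, key=lambda ind: ind[0], reverse=True)[:mu]
```

Under the PLUS strategy a high-fitness parent can persist indefinitely, which reduces the effect of evaluation noise but can also cause premature convergence; this trade-off motivates the hybrid scheme used later for the SPHERE and CAR experiments.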
Following the typical evolution strategy notation, our evolution hence resembles an (8/2, 30)-strategy for COMMA selection and a (5/2 + 25)-strategy for PLUS selection. Following standard recombination techniques (Goldberg, 1989), two parents are involved in the procreation of one offspring. A rectangular subgrid of droplets in Parent 2, constrained by two randomly selected points A and B, is replaced with the corresponding subgrid of Parent 1 (as illustrated in Figure 6) to yield a new individual. The illumination interval of input droplets coevolved with the network is copied from Parent 1 to the offspring. The newly generated individual is subjected to three subsequent mutation operators with fixed mutation rates (see Fig. 6), applied in the following order:
- Input illumination interval mutation: The times determining the illumination interval of droplets with input types, that is, t_start and t_end, are mutated as follows: the new value of t_start (and, analogously, of t_end) is a random number selected from the normal distribution with the average equal to the parent's value and a fixed variance. If t_start > t_end, both times are swapped; that is, t_start is replaced with t_end and vice versa.
- Droplet type mutation: We assume that input droplets can change into normal droplets and vice versa. The probability of droplet transformation is p_type. Regardless of the droplet type, the probability of obtaining an input droplet is p_input, and for the normal one it is 1 - p_input.
- Illumination time mutation: If a droplet is of the normal type, then with a probability p_illum its illumination time is mutated. The new illumination time of the offspring is generated from the normal distribution with the average equal to the parent's illumination time and a fixed variance.
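The recombination step that precedes these mutations can be sketched as follows. This is an illustrative implementation assuming grids are stored as dictionaries mapping (row, col) to a droplet description; the function and parameter names are hypothetical:

```python
import random

def subgrid_crossover(parent1, parent2, size=5):
    """Rectangular-subgrid recombination on a size x size droplet grid.

    A random rectangle spanned by two points A and B is copied from
    parent1 into a copy of parent2, yielding the offspring. Coevolved
    scalar parameters (e.g., the input illumination interval) would be
    copied from parent1 separately and are not shown here.
    """
    (r1, c1), (r2, c2) = (
        (random.randrange(size), random.randrange(size)) for _ in range(2)
    )
    rlo, rhi = sorted((r1, r2))
    clo, chi = sorted((c1, c2))
    child = dict(parent2)  # start from parent2's grid
    for r in range(rlo, rhi + 1):
        for c in range(clo, chi + 1):
            child[(r, c)] = parent1[(r, c)]  # paste parent1's rectangle
    return child
```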
The values of all probabilities were selected arbitrarily as being reasonably small, but still large enough to produce noticeable changes in network functionality after mutation. They should have little influence on the final, optimized classifier, but they do determine the rate of convergence toward the optimal solution. Independent of the mutation operators, the fitness function always chooses the output droplet that generates the highest mutual information with the desired output class.
2.5 Fitness Evaluation
Evaluation of individuals is based on techniques offered by information theory. Using the mutual information, we measure the dependence between spikes in droplets and the expected output. For a single network, the mutual information with respect to the desired output class is calculated for each droplet separately, and the droplet with the largest mutual information becomes the output. To keep the classification rule simple, we assume that only this single droplet serves as the output. Even though larger structures of droplets could be used for classification, their dynamics seem more difficult to interpret as a classification output.
Since a continuous-time model is considered, a droplet can spike at any moment of time during the simulation. A series of these spikes defines a continuous spike pattern for a single droplet. In this representation a large number of different patterns can appear. The mutual information is calculated from a probability distribution of the patterns appearing in the output signal. In order to limit the number of patterns, we discretize the continuous spike patterns for all droplets in the network, dividing the total simulation time into 25 frames, each of length t_frame. Note that at most one oscillation per frame is possible because the refractory time is longer than t_frame. Next, the series of frames is converted into a discrete spike pattern P, that is, a binary string where the symbol 1 corresponds to a frame containing a spike and 0 otherwise (see Fig. 7).
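The discretization step can be sketched in a few lines; the frame indexing and the handling of a spike at exactly the final instant are illustrative choices:

```python
def discretize(spike_times, total_time, n_frames=25):
    """Convert continuous spike times into a binary spike pattern.

    The simulation time is split into n_frames equal frames; a frame
    becomes '1' if it contains at least one spike. With a frame shorter
    than the refractory time, at most one spike per frame can occur.
    """
    frame_len = total_time / n_frames
    pattern = ["0"] * n_frames
    for t in spike_times:
        idx = min(int(t // frame_len), n_frames - 1)  # clamp t == total_time
        pattern[idx] = "1"
    return "".join(pattern)
```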
Two fitness calculators based on mutual information are used: Spike Pattern Mutual Information (SPMI) and Number of Spikes Mutual Information (NSMI). The fitness function in the first case corresponds to the amount of mutual information between the distribution of discrete spike patterns and the output class. For the NSMI, all spikes from a single pattern are summed up and finally the distribution of the number of spikes is used to calculate the fitness.
The mutual information between the spike patterns of a droplet i and the output classes is

I(P_i; C) = H(P_i) + H(C) - H(P_i, C),

where H(P_i) and H(C) are the entropies of the spike pattern distribution in droplet i and of the distribution of output classes in the dataset, respectively. H(P_i, C) is the joint entropy of both distributions; it can be obtained by building a new distribution from the spike patterns by appending the appropriate output class to each spike pattern.
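The mutual information used for fitness evaluation can be estimated directly from the empirical distributions. The sketch below applies equally to SPMI (binary strings as patterns) and to NSMI (spike counts as patterns); function names are illustrative:

```python
from collections import Counter
from math import log2

def entropy(counts):
    """Shannon entropy in bits of a distribution given by raw counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)

def mutual_information(patterns, classes):
    """Estimate I(P; C) = H(P) + H(C) - H(P, C) from paired samples
    of a droplet's spike patterns and the desired output classes."""
    h_p = entropy(Counter(patterns).values())
    h_c = entropy(Counter(classes).values())
    # Joint distribution: each pattern tagged with its output class.
    h_pc = entropy(Counter(zip(patterns, classes)).values())
    return h_p + h_c - h_pc
```

A pattern that perfectly determines the class yields I(P; C) = H(C), matching the observation that the best CANCER networks approach the 0.93-bit entropy of that dataset.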
We evaluate a network’s fitness using the mutual information between either the oscillatory patterns or the total number of oscillations in the output droplet and the distribution of output classes in the dataset. In other words, this mutual information, and hence the fitness value, can be interpreted as the reduction of uncertainty about the output class when observing a particular spike pattern or spike number.
3 Results and Discussion
3.1 Classification Problems
Three standard classification datasets are considered as test problems for the evolved droplet computer. Each dataset contains a number of test cases (instances), where a single case is composed of predictors and an output that allows for a discrete classification into one of the output classes.
The CANCER dataset has 699 instances, with 458 (65.5%) benign and 241 (34.5%) malignant cases, with a total entropy of ca. 0.93 bits. Each instance is described by 10 integer-valued predictors and a binary class label. Since the mutual information of the predictors 1, 2, and 6 with the desired output is high and close to 0.92 bits (almost the whole information is contained in three predictors), only these channels are used as inputs in the experiments. As a result of the dataset reduction, four test cases from each of the output classes have the same values of the predictors and thus cannot be distinguished by the classifier. Furthermore, the numbers of unique test cases for the benign and malignant output classes are reduced to 61 and 143, respectively. Note that the duplicated instances are kept in the dataset and classified.
The two classes of the CANCER dataset are almost linearly separable. Thus, we used Support Vector Machine (SVM) methods (Chang and Lin, 2011) to find a hyperplane in a 3-dimensional space that separates the classes with the highest accuracy. The SVM-based classifier with a linear kernel yields a performance of 95.99%.
The CAR dataset contains 1728 instances with the following numbers of outcomes: 1210 (70%), 384 (22.2%), 69 (3.9%), and 65 (3.7%) for classes 0, 1, 2, and 3, respectively, where a higher output class value indicates a better evaluation. All six predictors are used as inputs, and thus the information entropy of the dataset is the same as the mutual information between the inputs and the output, equal to 1.21 bits. For both repository datasets, the data were standardized and the predictor values were scaled to the interval [0, 1].
3.2 CANCER Dataset
3.2.1 SPMI Fitness
We performed 50 computer experiments with the COMMA strategy, each with 50 generations. A typical fitness trajectory as observed in experiments is shown in Figure 8a. The fitness stabilizes after about 20 generations and approaches the maximum value of 0.9 bits, close to the entropy of the dataset. That means that almost the whole information is encoded in the spike patterns appearing at the output droplet. The evolved light pattern and the position of inputs and the output droplet for the discussed network are presented in Figure 9a.
The distribution of spike patterns in the output droplet is presented in Figure 10a. A signal combined from all time frames can be too complex to build an intuitive classification rule; however, a certain part of the patterns exhibits visibly different behavior for different output classes. For the time frame marked with an arrow, an excitation of the output droplet was observed in 93.2% of simulations with the benign class and only in 17.8% for the malignant cases. The accuracy, measured as the number of correctly classified cases divided by the total number of instances, for the classification based on spike presence or absence in the marked frame is equal to 89.4%. The majority of the incorrect classifications occur in the area of class separation.
3.2.2 NSMI Fitness
For the NSMI fitness calculator, we carried out 25 evolution runs with the COMMA strategy, and the number of generations was increased to 250. Stabilization of the fitness value for the best individual was observed after about 200 generations, with a maximum information close to 0.82 bits. The summed number of spikes for all test cases presented in Figure 10b indicates that in simulations with the benign class, the output droplet oscillated more frequently than for the other class. If we introduce a threshold value (marked with a dashed line) and build a classification based on the number of spikes present in the output droplet during the simulation, then 97% accuracy is obtained.
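A threshold rule of this kind can be written in a few lines. The class names and the example threshold below are illustrative; in the experiment the threshold corresponds to the dashed line in Figure 10b:

```python
def classify_by_spike_count(n_spikes, threshold):
    """Threshold rule on the output droplet's total spike count: cases
    with more than `threshold` spikes are assigned the benign class
    (the more frequently oscillating one), the rest the malignant class."""
    return "benign" if n_spikes > threshold else "malignant"

def accuracy(counts_and_labels, threshold):
    """Fraction of (spike_count, true_label) pairs classified correctly."""
    hits = sum(
        classify_by_spike_count(n, threshold) == label
        for n, label in counts_and_labels
    )
    return hits / len(counts_and_labels)
```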
A comparison of fitness convergence in the SPMI and NSMI cases (see Fig. 8) reveals that in the latter case a longer evolution run is necessary to find a network with a high mutual information value. After 250 generations, the fitness for the NSMI evaluation was still about 0.08 bits lower than for the SPMI experiment with 50 generations. Note, however, that despite the lower information contained in the network evolved with the NSMI calculator, the classification in this case is more accurate. For a network evolved with SPMI fitness we used only a single (11th) time frame to construct the classification rule, whereas the information from the other 24 frames is discarded. In contrast, the classification rule applied to the network evolved with NSMI fitness derives benefit from spikes present in all time frames. Thus, we can expect that for a more complex, multiple-frame-based classification rule, the performance of the SPMI network could be improved. On the other hand, if we allow for a slightly lower accuracy of the classifier, then the time needed for a decision is reduced by a factor of 2.
3.3 SPHERE Dataset
For the SPHERE dataset we performed 50 simulations (25 with SPMI and 25 with NSMI fitness evaluation), each with 500 generations. The test cases in this dataset were created from a uniform random distribution separately for each generation, as described in Section 3.1. That resulted in an increased noise level in the system. In order to reduce the noise, we employed a hybrid evolution scheme. For the first 50 generations, we let the evolution explore the fitness landscape more thoroughly using the COMMA strategy. For the remaining 450 generations, a PLUS strategy was applied. The moment of the evolution run at which the strategy was changed is marked with dashed lines in Figure 11.
3.3.1 SPMI Fitness
Fitness of the best network with SPMI evaluation reached a maximum value of 0.93 bits after 241 generations as presented in Figure 11a. The distribution of mutual information and evolved illumination pattern for this network is illustrated in Figure 12a. In the optimized structure the output droplet is in direct contact with three inputs (in1, in2, and in3).
In this approach, test cases having spikes in any of frames 2, 3, 4, 23, or 24 are classified into class 0, whereas all the other cases are classified as class 1. Using this rule, an accuracy of 70% can be obtained.
3.3.2 NSMI Fitness
The maximum fitness value in the simulations with the NSMI calculator is 0.48 bits. The fitness evolution here differs significantly from the networks presented previously, as shown in Figure 11b. In this case, after an initial stagnation of fitness at a level of about 0.2 bits, a significant increase to 0.45 bits can be observed at generation 330. Note that switching to the PLUS strategy for the NSMI evaluation does not reduce the noise level, whereas we see this effect for the network evolved with the SPMI rule (see Fig. 11a).
The corresponding network with its illumination pattern is presented in Figure 12b. Analysis of the distribution of the number of spikes at the output of this network (see Fig. 13b) reveals that in all experiments the output droplet oscillated either 5, 6, or 7 times. If a threshold is set to 5 (marked with a dashed line), an accuracy of 79% can be obtained, even though the mutual information with the output class distribution equals only 0.48 bits.
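The resulting decision rule is a one-line threshold on the spike count. A sketch, using the threshold of 5 reported above; which side of the threshold maps to which class is our illustrative assumption, as the article does not state it here:

```python
def classify_nsmi(n_spikes, threshold=5):
    """NSMI-style binary rule on the total spike count at the output droplet.
    At most `threshold` spikes -> class 0, more -> class 1 (assignment of
    the two sides to classes is an illustrative assumption)."""
    return 0 if n_spikes <= threshold else 1
```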
In both evolved networks (see Figures 12a and 12b), the output droplet is located close to the inputs. Therefore, we found it interesting to verify whether the core of the network, consisting only of the output droplet (marked with a dashed rectangle) surrounded by inputs, has a classification accuracy similar to that of the original network. For the SPMI and NSMI classifiers, we considered the reduced networks shown in Figures 12c and 12d, respectively. In order to deactivate droplets that were not assigned as inputs or the output, we assumed that they remain illuminated during the entire simulation. In both reduced networks the mutual information at the output was significantly lower than for the original networks. Moreover, in the case of the reduced NSMI network, the mutual information in the original output droplet becomes lower than the mutual information in droplet in2 (see Fig. 12d). As a consequence, the in2 droplet is selected as the output of the reduced network.
This result means that interactions between all droplets in an optimized network are important for its functionality. Even when the output droplet is in direct contact with all available inputs, reducing the network can decrease its classification ability.
3.4 CAR Dataset
For each evaluation method (SPMI and NSMI), 25 simulations with 500 generations were performed. In all simulations the strategy was switched from COMMA to PLUS after 50 generations, as for the previously discussed dataset (SPHERE).
3.4.1 SPMI Fitness
The network with the highest fitness value, equal to 1.04 bits, was observed in generation 496, as presented in Figure 14a. The strategy change significantly decreased the stochastic noise. The distribution of mutual information and the evolved illumination pattern for this network are shown in Figure 15a. The optimized structure contains a single droplet (the output one) with mutual information significantly higher than that of the other droplets. The number of inputs selected during network optimization is lower than the total number of predictors in the dataset. As a result, the maximal mutual information that can be observed for such a network is lower than for a network with all inputs. Since the evolved network contains inputs 1, 2, 4, 5, and 6, only the five corresponding predictors are used for classification. The calculated maximal mutual information between the reduced set of predictors and the output class distribution is equal to 1.10 bits.
The time distribution of spikes from all test cases for this classifier is shown in Figure 16a. In most of the time frames we observe spikes corresponding to different output classes; however, in frame 10 (marked with an arrow) only spikes from class 0 are present. A classification rule based on the presence of a spike in this frame correctly classifies 960 cases from class 0 without any errors. Since most of the test cases belong to class 0, it seems reasonable to apply this rule first and reduce the dataset. The spike distribution of the cases from the reduced dataset is presented in Figure 16b.
By sequentially applying further classification rules and reducing the test dataset, one can obtain the chain of rules presented in Table 1. However, designing the rules for each step in the chain to achieve optimal classification accuracy is a complicated task, since correlations between spikes for all output classes have to be followed. Therefore, we introduced an automatic rule selection system that compares all rules for a given subset of test instances and selects the one with the highest accuracy ratio (the ratio of correct to incorrect classifications). Using this approach we obtained a classifier containing the 47 rules presented in Table 1, which yields an accuracy of 93.4%. The number of cases from a given class classified at each step is illustrated in Figure 17.
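The automatic selection loop can be sketched as a greedy procedure. The rule representation (a predicate on a case plus a predicted class), the function names, and the score used here (fraction of matched cases classified correctly, rather than the article's correct-to-incorrect ratio) are our own illustrative choices:

```python
def rule_score(rule, cases):
    """Fraction of the cases matched by `rule` that it classifies correctly."""
    predicate, predicted_class = rule
    matched = [c for c in cases if predicate(c)]
    if not matched:
        return 0.0
    correct = sum(1 for c in matched if c["label"] == predicted_class)
    return correct / len(matched)

def build_rule_chain(rules, cases):
    """Greedily pick the best-scoring rule, remove the cases it matches,
    and repeat until no rule matches any remaining case."""
    chain, remaining = [], list(cases)
    while remaining:
        best = max(rules, key=lambda r: rule_score(r, remaining))
        predicate, _ = best
        matched = [c for c in remaining if predicate(c)]
        if not matched:
            break
        chain.append(best)
        remaining = [c for c in remaining if not predicate(c)]
    return chain

# Toy usage: one frame-specific rule and one catch-all fallback rule.
cases = [{"frames": {10}, "label": 0}, {"frames": {12}, "label": 1}]
rules = [(lambda c: 10 in c["frames"], 0), (lambda c: True, 1)]
chain = build_rule_chain(rules, cases)
```

At classification time the chain is applied in order, and the first rule whose predicate matches a case determines its class.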
[Table 1: the classification rules of the evolved classifier, listed in two column groups with columns Rule no., Rule, and Class; the rule bodies are not reproduced here.]
3.4.2 NSMI Fitness
The fitness evolution for the NSMI evaluation method is presented in Figure 14b. In the first part of the simulation, with the COMMA strategy, one can observe large changes in the fitness value of the best individual. After switching the strategy to PLUS, the fitness stabilizes and only two steep changes occur, at generations 100 and 180. Note that after switching to the PLUS strategy, the fitness of the best individual changes almost in a stepwise manner. This means that despite the noise associated with the stochasticity of the model, the same individual reevaluated multiple times in different generations has the same fitness value each time. We attribute this constancy to the NSMI evaluation scheme and to selection for robustness: since for each simulation the number of spikes can take only eight different discrete values, the stochastic noise of the model evaluation is most probably absorbed in the discretization to the spike number.
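The fitness in both evaluation schemes is a mutual information between a discrete output observable (here, the spike count) and the class label. A generic plug-in estimator from paired samples, written in plain Python (the function name is ours):

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """I(X;Y) in bits, estimated from paired samples of two discrete variables."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# Spike counts that separate a binary class perfectly carry one full bit;
# constant spike counts carry none.
counts = [5, 5, 7, 7]   # spikes at the output droplet
labels = [0, 0, 1, 1]   # desired classes
```

Because the spike count is coarse, small stochastic perturbations of the spike timing usually leave the count, and hence this fitness value, unchanged, which is consistent with the stepwise fitness traces described above.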
The corresponding illumination pattern for the NSMI classifier is shown in Figure 15b. Note that, as in the case of SPMI evaluation, not all inputs are present in the evolved structure. The calculated maximal mutual information between the reduced set of predictors (1, 3, 4, 6) and the output class distribution is equal to 0.70 bits.
The distribution of the number of spikes registered for the test cases from all output classes is presented in Figure 16c. For all cases, the number of spikes was between 4 and 8. In almost all simulations of cases from classes 2 and 3, the number of oscillations was the same and equal to six. Therefore, these two classes cannot be distinguished by the NSMI classifier. Moreover, six oscillations at the output droplet were also observed for almost half of the cases from class 1, whose total number of test cases is five times larger than that of classes 2 or 3. Thus, in order to obtain high performance of the NSMI classifier, only classes 0 and 1 should be considered.
A strategy that assigns all cases with 5 or fewer spikes to class 1, all cases with 6 spikes to class 2, and all cases with 7 or more spikes to class 0 gives a lower overall accuracy.
In this article, we have presented a novel approach for designing a chemical classifier composed of spiking droplets containing the reagents of the oscillatory BZ reaction. Because the time evolution of signals in such media is complex, manual programming is difficult. The medium programming described in our work is based on four steps: identification of input droplets, selection of the method used to input predictor values into the network, control of oscillations in normal droplets, and selection of the output droplet whose time evolution delivers maximum information about the classification problem. We demonstrate that evolutionary algorithms, together with a representation-independent fitness function using mutual information, can be successfully used for automated programming.
The classifier is programmed by the temporal control of illumination patterns, which are optimized by the evolutionary algorithm to solve a given classification problem. The selection of light as a network control parameter is advantageous for the implementation of chemical classifiers because a modification of the droplet structure would require a much larger effort compared to adjusting the illumination of individual droplets. Future experiments are needed to validate whether the presented solutions can be transferred directly to a real system.
We have shown that the same simple network of 25 oscillating droplets located on a square lattice can be successfully used for three different classification problems. For a binary dataset with almost linearly separable output classes (here the CANCER dataset), very high accuracy can be obtained using a simple classification rule based on oscillation frequency. In this case, a classification based on the presence of a spike in a single time frame also yields a good performance. In contrast, for a more complicated binary dataset (the SPHERE dataset) the accuracy of the evolved classifier is smaller regardless of the evaluation method (cf. Table 2).
[Table 2: classification results, listing the numbers of correct and incorrect classifications for each output class (0-3) and the overall accuracy; the table body is not reproduced here.]
Classification of datasets with a large number of inputs and multiple output classes with unequal distribution (here the CAR dataset) is more difficult. At the expense of simplicity and classification time, a set of rules can be developed that significantly increases the classification accuracy. One can expect that classifier performance can be improved by extending the size of the networks or the experimental time. Alternatively, one can consider separate networks for each output class to reduce the number of different output classes per network. Furthermore, additional droplet properties such as the internal chemical dynamics can be put under evolutionary control (Khan et al., 2011).
While the NSMI fitness function generates networks with relatively simple outputs that can be easily interpreted, the SPMI fitness function leads to droplet networks with potentially very complex spike patterns. Here, we have demonstrated the manual and automatic extraction of useful classification rules from complex patterns found by the SPMI fitness function. Further work will have to be done to automatically extract useful rules that allow a classification based on spike patterns that carry a high mutual information with the output class.
When allowing very complex output patterns, the computing droplet system can become simpler and export more of the actual computation to the observer interpreting the output signal. In the extreme case, all inputs are simply mapped onto the output spike pattern without any modification of the information. By continuously reducing the allowed complexity of the spike patterns, that is, the number of considered time frames, we can continuously increase the amount of computation that has to be done in the network itself.
There are many parameters in our simulations, and they can be divided into four groups. First, there are the parameters of the physical model of the reaction. In this article, they are reduced to the times the system spends in the refractory, responsive, and excitable states and the time in which an excitation transmits between neighboring droplets. These parameters should match the experiment as closely as possible. The size and geometry of the studied network belong to another group. It can be expected that larger and more complex networks should give better classifiers. However, the selected size (25 nodes) and square geometry are sufficient to support our statement that evolutionary algorithms are useful as a tool for designing chemical classifiers. The third group encompasses the parameters describing the evolutionary strategy (listed in Section 2.4), which can have an important impact on algorithm efficiency. However, even the values used in our simulations lead to reasonably good classifiers; therefore, we think that precise tuning of these parameters plays a secondary role. Finally, there are the parameters involved in the transformation of the observed time evolution into discrete signals. We were successful in selecting values of these parameters for which the fitness function, based on mutual information, is a good estimate of classifier quality.
Evolutionary algorithms and optimization based on mutual information can be linked with a real chemical network to design a chemical classifier. In our opinion, direct coupling of an evolution strategy with experiment is not efficient, because a long time would be necessary to test all cases from the database in the real system. Therefore, the classifier should first be designed in silico and then verified in experiments on a selected subset of the database cases.
In conclusion, we would like to add that the presented methods can be easily generalized to other types of structured nonlinear media, such as chemical reactors connected by an inflow of reagents. We believe that chemical classifiers can find applications where information processing systems are expected to be small and to operate without an external power supply.
This work was supported in part by the NEUNEU project sponsored by the European Community within the FP7-ICT-2009-4 ICT-4-8.3—Future and Emergent Technologies Proactive 3: Biochemistry-based Information Technology (CHEM-IT) program (project No. 248992) and by a supplementary grant from the Polish Ministry of Science (2187/7.PRUE/2011/2). The work of K.G. was financed within the International PhD Projects Program of the Foundation for Polish Science (MPD/2009/1/styp14). The final stage of the work was supported by the Polish National Science Centre grant UMO-2014/15/B/ST4/04954.