## Abstract

Error backpropagation in networks of spiking neurons (SpikeProp) shows promise for the supervised learning of temporal patterns. However, its widespread use is hindered by its computational load and occasional convergence failures. In this letter, we show that the neuronal firing time equation at the core of SpikeProp can be solved analytically using the Lambert *W* function, offering a marked reduction in execution time over the step-based method used in the literature. Applying this analytical method to SpikeProp, we find that training time per epoch can be reduced by 12% to 56% under different experimental conditions. Finally, this work opens the way for further investigations of SpikeProp’s convergence behavior.

## 1 Introduction

SpikeProp adapts the classical error backpropagation algorithm to networks of spiking neurons (Bohte, Kok, & La Poutré, 2000, 2002). Although improvements have been proposed (Schrauwen & Van Campenhout, 2004; McKennoch, Liu, & Bushnell, 2006; Ghosh-Dastidar & Adeli, 2007), SpikeProp's running time and convergence remain unsatisfactory in comparison with backpropagation in networks of sigmoid perceptrons (Fujita, Takase, Kita, & Hayashi, 2008; Takase et al., 2009; Thiruvarudchelvan, Crane, & Bossomaier, 2013; Thiruvarudchelvan, Moore, & Antolovich, 2013).

In a formal spiking neuron model, the neuron's response to input stimuli is governed by the equation $u(t) = \vartheta$, the solutions of which are the firing times of spikes (Gerstner, 1995). The function $u(t)$ represents the neuron's membrane potential at time $t$, and $\vartheta$ is a constant threshold.

Under a response coding scheme called time-to-first-spike, a single spike is emitted at time $t^{*} = \inf\{t \mid u(t) \ge \vartheta\}$. The single firing time $t^{*}$ is computed by evaluating $u(t)$ sequentially with a fixed time step $\Delta t$ until obtaining a value of $u(t)$ greater than or equal to $\vartheta$. When no solution is found, the neuron emits no spike.

In this letter, we show that the firing time equation $u(t) = \vartheta$, analyzed piecewise, can be solved analytically using the Lambert *W* function.^{1} To motivate this novel result, we present a performance comparison of the analytical and step-based solutions, and we experiment with SpikeProp to determine how analytical performance affects the training time of spiking neural networks.

## 2 Analytical Solution of the Firing Time Equation

The membrane potential $u(t)$ of a spiking neuron receiving $n$ postsynaptic potentials (PSPs) from antecedent neurons in a network is expressed as

$$u(t) = \sum_{k=1}^{n} w_k\,\varepsilon(t - t_k), \tag{2.1}$$

with the PSP profile

$$\varepsilon(s) = \begin{cases} \dfrac{s}{\tau}\,e^{1 - s/\tau} & \text{if } s > 0, \\ 0 & \text{otherwise,} \end{cases}$$

where $w_k$ is the weight of the $k$th PSP, $t_k$ is the onset time of the $k$th PSP, and $\tau$ is a decay constant. Without loss of generality, we now consider that $\tau = 1$, since time can be measured relative to this constant.

We now derive the analytical solution to the firing time equation using the principal branch of the Lambert *W* function, noted $W_0$. We illustrate this function in Figure 1.

Figure 1: The principal branch $W_0$ of the Lambert *W* function.
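Because the Lambert *W* function is not elementary, it must be evaluated numerically; section 3 uses Halley's method for this purpose. The sketch below is a minimal Python implementation of $W_0$ via Halley's iteration on $f(w) = we^{w} - x$ (starting points and tolerances are our own choices, not the letter's original code):

```python
import math

def lambertw0(x, tol=1e-12, max_iter=50):
    """Principal branch W_0 of the Lambert W function, real for x >= -1/e,
    computed with Halley's method applied to f(w) = w*exp(w) - x."""
    if x < -1.0 / math.e:
        raise ValueError("W_0(x) is real only for x >= -1/e")
    if x == -1.0 / math.e:
        return -1.0  # branch point: W_0(-1/e) = -1
    # Starting point: series-based near the branch point, log-based elsewhere.
    w = -1.0 + math.sqrt(2.0 * (1.0 + math.e * x)) if x < -0.25 else math.log1p(x)
    for _ in range(max_iter):
        ew = math.exp(w)
        f = w * ew - x
        # Halley step for f(w) = w e^w - x (see Corless et al., 1996).
        w -= f / (ew * (w + 1.0) - (w + 2.0) * f / (2.0 * w + 2.0))
        if abs(f) <= tol * (1.0 + abs(x)):
            break
    return w
```

Halley's iteration converges cubically away from the branch point $x = -1/e$, which is why a handful of iterations suffices in practice.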

To solve $u(t) = \vartheta$ analytically, we start from $t_1$ and search for the earliest interval during which $u$ reaches $\vartheta$. Let us consider an interval $I_k = [t_k, t_{k+1})$, with $t_{k+1}$ taken as infinity for $k = n$, for which we have

$$u(t) = e^{-t}(A_k t + B_k), \qquad A_k = \sum_{j=1}^{k} w_j\,e^{1 + t_j}, \qquad B_k = -\sum_{j=1}^{k} w_j\,t_j\,e^{1 + t_j}.$$

On $I_k$, the membrane potential has the first derivative $u'(t) = e^{-t} A_k (T_k - t)$ with a single root at $T_k = 1 - B_k/A_k$ (note that this root might not be in $I_k$). Thus, if $T_k \ge t_{k+1}$ and $u(t_{k+1}) \ge \vartheta$, the function must increase on $I_k$, and there exists a solution to the firing time equation. In this case, the search ends with the interval $I_k$. Otherwise, the function must have a maximum greater than or equal to $\vartheta$ on $I_k$ in order to find a solution. This happens if $T_k \in I_k$ and $u(T_k) \ge \vartheta$, and in this case, the search ends with the interval $[t_k, T_k]$. If no solution is to be found on $I_k$, we continue the search with $I_{k+1}$, for which $A_{k+1} = A_k + w_{k+1}\,e^{1 + t_{k+1}}$ and $B_{k+1} = B_k - w_{k+1}\,t_{k+1}\,e^{1 + t_{k+1}}$.

Algorithm 1 describes this process. Its outputs are the firing interval $[a, b]$, with $a = t_k$ and $b = \min(t_{k+1}, T_k)$, and the coefficients $A_k$ and $B_k$, or the empty set if $\vartheta$ is never reached.

Note that we do not have to search for a solution when $A_k \le 0$. First, if $A_k = 0$, we have $u(t) = B_k e^{-t}$, in which case the membrane potential is driven toward zero, thus precluding the neuron from firing. Second, $T_k$ is the location of a maximum if $u''(T_k) < 0$, which is equivalent to $-A_k e^{-T_k} < 0$, such that $A_k$ must be positive.

For the firing interval returned by algorithm 1, the firing time equation reads $e^{-t}(A_k t + B_k) = \vartheta$, which can be rearranged into $-\left(t + \frac{B_k}{A_k}\right)e^{-(t + B_k/A_k)} = -\frac{\vartheta}{A_k}\,e^{-B_k/A_k}$. Applying $W_0$ yields the firing time

$$t^{*} = -\frac{B_k}{A_k} - W_0\!\left(-\frac{\vartheta}{A_k}\,e^{-B_k/A_k}\right),$$

where the principal branch selects the earliest crossing, which lies on the rising side of the potential. For a general value of the decay constant $\tau$, the firing time is $\tau t^{*}$, with $t^{*}$ computed from onset times expressed in units of $\tau$.

If algorithm 1 returns the empty set, the firing time $t^{*}$ does not exist (it is infinite by definition of the infimum).

The explicit condition $A_k > 0$ can be dropped. As per algorithm 1, $u$ cannot reach $\vartheta$ at iteration $k$ with $A_k \le 0$.
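To make the procedure concrete, the following Python sketch combines the interval search of algorithm 1 with the closed-form Lambert *W* solution for $\tau = 1$ (a minimal illustration under our own naming; the helper `lambert_w0` is a plain Newton iteration standing in for any evaluator of the principal branch):

```python
import math

def lambert_w0(x, iters=80):
    # Newton iteration for the principal branch of W; adequate for x in [-1/e, 0).
    if x <= -1.0 / math.e:
        return -1.0  # branch point (threshold exactly at the maximum)
    w = 0.0 if x < 1.0 else math.log(x)
    for _ in range(iters):
        ew = math.exp(w)
        w -= (w * ew - x) / (ew * (w + 1.0))
    return w

def firing_time(weights, onsets, theta):
    """Earliest t with u(t) = theta, where u(t) = sum_k w_k (t - t_k) e^(1-(t-t_k))
    for t > t_k (tau = 1), or None if the threshold is never reached."""
    events = sorted(zip(onsets, weights))
    A = B = 0.0
    for k, (t_k, w_k) in enumerate(events):
        A += w_k * math.exp(1.0 + t_k)
        B -= w_k * t_k * math.exp(1.0 + t_k)
        if A <= 0.0:
            continue  # u cannot reach theta on this interval (see text)
        t_next = events[k + 1][0] if k + 1 < len(events) else math.inf
        T = 1.0 - B / A                      # location of this piece's maximum
        u_next = math.exp(-t_next) * (A * t_next + B) if t_next < math.inf else 0.0
        u_max = A * math.exp(B / A - 1.0)    # u(T) in closed form
        if (T >= t_next and u_next >= theta) or (t_k <= T < t_next and u_max >= theta):
            r = B / A
            return -r - lambert_w0(-(theta / A) * math.exp(-r))
    return None
```

For example, a single PSP of weight 1 and onset 0 peaks at $u(1) = 1$, so `firing_time([1.0], [0.0], 1.0)` returns 1.0, while a threshold above 1 yields no spike.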

## 3 Analytical Performance and Step-Based Solutions

We now compare the performance of the analytical and step-based solutions, implemented in Matlab R2015b (64 bits) on Ubuntu 14.04 with a Pentium Dual-Core T4500 2.3 GHz processor.^{2} We generated 100,000 random instances of spiking neurons similar to those used in Bohte et al. (2000, 2002). The input stimuli are composed of five trains of 16 PSPs. Each PSP has a random weight between −1 and 2, the first PSP of each train has a random onset between 1 and 7, and subsequent onsets of each train are each delayed by a value of 1. We set the decay constant to $\tau = 7$. We computed the Lambert *W* function numerically with Halley's method (Corless, Gonnet, Hare, Jeffrey, & Knuth, 1996). We measured the execution time required to evaluate the analytical solutions and the step-based solutions (with $\Delta t = 0.1$ and 0.01) for the 100,000 neuron instances using the threshold values $\vartheta \in \{1, 5, 10, 15\}$, and we computed the corresponding mean execution time per evaluation. The results are presented in Table 1.
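The instance generation just described can be sketched as follows (function name, seeding, and return layout are our own choices, not the letter's original Matlab code):

```python
import random

def random_instance(n_trains=5, n_psps=16, seed=None):
    """One random neuron instance per the setup above: five trains of 16 PSPs;
    weights drawn from [-1, 2]; the first onset of each train drawn from
    [1, 7]; each subsequent PSP of a train delayed by 1 unit of time."""
    rng = random.Random(seed)
    weights, onsets = [], []
    for _ in range(n_trains):
        start = rng.uniform(1.0, 7.0)
        for i in range(n_psps):
            weights.append(rng.uniform(-1.0, 2.0))
            onsets.append(start + i)
    return weights, onsets
```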

Table 1: Mean Execution Time per Evaluation.

| $\vartheta$ | 1 | 5 | 10 | 15 |
|---|---|---|---|---|
| Analytical solution | | | | |
| Step-based solution ($\Delta t = 0.1$) | | | | |
| Step-based solution ($\Delta t = 0.01$) | | | | |


Notes: Mean execution time (in seconds) over 100,000 instances. Time was measured using the Matlab tic and toc commands.

Overall, the analytical solutions were evaluated faster than the step-based solutions (from 1.45 to 101.25 times faster, depending on $\Delta t$ and $\vartheta$). We also found that the step-based method could occasionally miss the firing time found analytically, as $u$ remained over $\vartheta$ for less than 0.1 units of time between two consecutive steps. When this happens, either the analytical solution is missed, as $u$ never rises again to reach $\vartheta$, or there is a step-based solution that is at least one time step late. Table 2 lists the type of errors for each value of $\vartheta$. Using $\Delta t = 0.01$ prevented these errors, at the cost of more execution time. Note that $\Delta t = 0.1$ was used most often in previous work with SpikeProp.
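This failure mode is easy to reproduce with a construction of our own (not an instance from the experiment above): a single PSP whose peak barely exceeds the threshold keeps $u$ above $\vartheta$ for less than 0.1 units of time, so a scan with $\Delta t = 0.1$ steps over the crossing entirely while $\Delta t = 0.01$ detects it:

```python
import math

def u(t, weights, onsets):
    # Membrane potential: sum of alpha-shaped PSPs with tau = 1.
    return sum(w * (t - s) * math.exp(1.0 - (t - s))
               for w, s in zip(weights, onsets) if t > s)

def step_based_firing_time(weights, onsets, theta, dt, t_max=10.0):
    # Step-based scan: advance by dt until u reaches the threshold.
    t = 0.0
    while t <= t_max:
        if u(t, weights, onsets) >= theta:
            return t
        t += dt
    return None

# One PSP (weight 1, onset 0.05) whose peak barely exceeds theta = 0.999:
# u stays above theta only on roughly (1.006, 1.095), a window narrower
# than 0.1 that falls between the grid points 1.0 and 1.1.
print(step_based_firing_time([1.0], [0.05], 0.999, dt=0.1))   # None: spike missed
print(step_based_firing_time([1.0], [0.05], 0.999, dt=0.01))  # detected near t = 1.01
```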

## 4 Application to SpikeProp

### 4.1 Network Architecture and Weight Update Rules

We now describe the feedforward network architecture in SpikeProp (Bohte et al., 2000, 2002). Neurons in successive network layers are linked with connections comprising multiple synaptic terminals. A neuron in network layer $L$ fires a single action potential (or spike) when its membrane potential crosses a fixed threshold $\vartheta$. This spike generates postsynaptic potentials (PSPs) in the output connections of the neuron (one PSP per synaptic terminal). Figure 2 illustrates this network configuration. The membrane potential of each neuron in layer $L + 1$ stems from the spatial and temporal summation of incoming PSPs from layer $L$, following equation 2.1. As before, we assume that the PSPs of a connection are each delayed by one unit of time. We also assume that, following a spike, the onset of the first PSP is delayed by one unit of time. Furthermore, the decay constant $\tau$ is usually set to the same value for every neuron in the network.

Suppose there are $M$ neurons in the layer connected to the output neuron. In that layer, the update rule for the $k$th weight $w_{jk}$ connecting the $j$th neuron (with firing time $t_j$) to the output neuron is given by

$$\Delta w_{jk} = -\eta\,\varepsilon(t^{a} - t_j - d_k)\,\delta,$$

with

$$\delta = \frac{t^{d} - t^{a}}{\sum_{j=1}^{M}\sum_{k} w_{jk}\,\varepsilon'(t^{a} - t_j - d_k)},$$

where $t^{a}$ and $t^{d}$ are the actual and desired firing times of the output neuron, $d_k$ is the delay of the $k$th synaptic terminal, and $\eta$ is a step size hyperparameter.

Consider now a hidden layer $L$. Suppose layer $L + 1$ has $M$ neurons and layer $L - 1$ has $M_1$ neurons. The update rule for the $k$th weight $w_{ijk}$ connecting the $i$th neuron of layer $L - 1$ to the $j$th neuron of layer $L$ is given by

$$\Delta w_{ijk} = -\eta\,\varepsilon(t_j - t_i - d_k)\,\delta_j,$$

with

$$\delta_j = \frac{\sum_{m=1}^{M}\delta_m \sum_{k} w_{jmk}\,\varepsilon'(t_m - t_j - d_k)}{\sum_{i=1}^{M_1}\sum_{k} w_{ijk}\,\varepsilon'(t_j - t_i - d_k)},$$

where the terms $\delta_m$ are computed recursively from the last layers of the network toward the first layers. When layer $L$ is the layer connected to the output neuron, we have $M = 1$ and $\delta_1 = \delta$.

When training over $C$ patterns, the error is expressed by the sum of squares $E = \frac{1}{2}\sum_{c=1}^{C}\left(t^{a}_{c} - t^{d}_{c}\right)^2$, where $t^{d}_{c}$ and $t^{a}_{c}$ are, respectively, the desired and actual firing times of the output neuron for the $c$th pattern.
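For illustration, the output-layer update can be sketched in Python, assuming the standard SpikeProp rule of Bohte et al. (2002) with $\alpha$-shaped PSPs and $\tau = 1$ (the function name, array layout, and the symbols `t_pre` and `delays` are our own):

```python
import math

def eps(s):
    # Alpha-shaped PSP profile with tau = 1 (zero before onset).
    return s * math.exp(1.0 - s) if s > 0 else 0.0

def d_eps(s):
    # Time derivative of the PSP profile.
    return (1.0 - s) * math.exp(1.0 - s) if s > 0 else 0.0

def output_layer_updates(w, t_pre, delays, t_a, t_d, eta):
    """Weight updates for w[j][k], the k-th terminal (delay delays[k]) connecting
    hidden neuron j (firing time t_pre[j]) to the single output neuron.
    t_a, t_d: actual and desired output firing times; eta: step size."""
    denom = sum(w[j][k] * d_eps(t_a - t_pre[j] - delays[k])
                for j in range(len(w)) for k in range(len(delays)))
    delta = (t_d - t_a) / denom
    dw = [[-eta * eps(t_a - t_pre[j] - delays[k]) * delta
           for k in range(len(delays))] for j in range(len(w))]
    return dw, delta
```

If the output neuron fires later than desired ($t^{a} > t^{d}$), $\delta$ is negative and the update is positive wherever the PSP contribution is positive, raising the potential so that the neuron fires earlier.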

### 4.2 XOR Problem

Table 3: Encoding of the XOR Problem.

| Input 1 | Input 2 | Input 3 | Class | Firing Time |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 16 |
| 0 | 6 | 0 | 1 | 10 |
| 6 | 0 | 0 | 1 | 10 |
| 6 | 6 | 0 | 0 | 16 |

Notes: Input values correspond to the firing times of spikes that generate PSPs in the first layer of the network. Inputs 1 and 2 can represent an early spike fired at time 0 or a late spike fired at time 6. Input 3 is a reference spike always fired at time 0. Class 1 represents an early response of the network’s output neuron, and class 0 represents a late response.
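The temporal encoding of Table 3 can be written down directly as data (a plain literal in our own layout):

```python
# (input spike times, class, target output firing time), one tuple per
# pattern of Table 3.  A late spike at time 6 on input 1 or 2 encodes a
# logical 1; the output fires early (time 10) for class 1, late (16) for 0.
XOR_PATTERNS = [
    ((0.0, 0.0, 0.0), 0, 16.0),
    ((0.0, 6.0, 0.0), 1, 10.0),
    ((6.0, 0.0, 0.0), 1, 10.0),
    ((6.0, 6.0, 0.0), 0, 16.0),
]

# Sanity check: the class is the exclusive-or of the "late spike" indicators.
for (in1, in2, _ref), label, target in XOR_PATTERNS:
    assert label == int((in1 == 6.0) != (in2 == 6.0))
    assert target == (10.0 if label == 1 else 16.0)
```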

To compare analytical and step-based solutions, we generated 10 spiking neural network instances having 3 inputs, 3 hidden neurons, and 1 output neuron, plus 10 additional instances with 5 hidden neurons. All network weights were initialized to a random value between −1 and 2. We used fixed values for the threshold $\vartheta$, the decay constant $\tau$, and the time step $\Delta t$. A low value is used for $\vartheta$ so that, initially, it is very likely that all neurons fire for all input patterns; this was found to facilitate the training process in previous work with SpikeProp. We trained each network with SpikeProp using the values 0.01, 0.05, and 0.10 for the hyperparameter $\eta$ until the error $E$ fell under a value of 1, or until we reached a maximum of 20,000 epochs. We call a run of SpikeProp convergent if it ended with $E < 1$ in fewer than 20,000 epochs (otherwise it is nonconvergent).

For each run, we measured the number of epochs and the training time, and we computed the corresponding means for each experimental condition (determined by solution method, number of hidden neurons, and value of ). We also counted the number of convergent runs and derived the training time per epoch. The results are presented in Tables 4 and 5. Note that we excluded nonconvergent runs when computing mean training epochs and mean training time. Over all 120 runs, the largest number of epochs for a convergent run was 9240.

Table 4: Training Results for Networks with Three Hidden Neurons.

| | $\eta = 0.01$ | $\eta = 0.05$ | $\eta = 0.10$ |
|---|---|---|---|
| Analytical solution | 332 epochs | 1979 epochs | 291 epochs |
| | 4.1 s | 24.1 s | 3.6 s |
| | 10 convergent runs | 10 convergent runs | 8 convergent runs |
| | 12.3 ms/epoch | 12.2 ms/epoch | 11.8 ms/epoch |
| Step-based solution | 744 epochs | 1993 epochs | 597 epochs |
| | 11.2 s | 31.5 s | 9.6 s |
| | 10 convergent runs | 9 convergent runs | 7 convergent runs |
| | 15.0 ms/epoch | 18.3 ms/epoch | 26.5 ms/epoch |


Notes: Each cell presents the mean number of training epochs and mean training time (in seconds) over convergent runs, the number of convergent runs, and the mean training time per epoch (in milliseconds) over all 10 runs. Time was measured using the Matlab tic and toc commands.

Table 5: Training Results for Networks with Five Hidden Neurons.

| | $\eta = 0.01$ | $\eta = 0.05$ | $\eta = 0.10$ |
|---|---|---|---|
| Analytical solution | 312 epochs | 154 epochs | 192 epochs |
| | 6.2 s | 3.1 s | 3.8 s |
| | 10 convergent runs | 10 convergent runs | 10 convergent runs |
| | 19.9 ms/epoch | 19.9 ms/epoch | 19.9 ms/epoch |
| Step-based solution | 334 epochs | 125 epochs | 394 epochs |
| | 7.6 s | 2.9 s | 10.8 s |
| | 10 convergent runs | 10 convergent runs | 10 convergent runs |
| | 22.7 ms/epoch | 23.0 ms/epoch | 27.3 ms/epoch |


Notes: Mean number of training epochs and mean training time (in seconds) over convergent runs, number of convergent runs, and mean training time per epoch (in milliseconds) over all 10 runs. Time was measured using the Matlab tic and toc commands.

Our results show that the hyperparameter and the number of hidden neurons can have much more impact on the training process than the solution method. With one exception (five hidden neurons, ), training took less time and fewer epochs on average with the analytical solution. For specific network instances, using the analytical solution instead of the step-based solution does not guarantee a reduction in the number of epochs and training time. Still, we obtained a 12% to 56% reduction of training time per epoch across different conditions (see Table 6). Note that we observed no occurrence of missed analytical solutions or late step-based solutions while training these networks.

Table 6: Reduction of Mean Training Time per Epoch with the Analytical Solution.

| | $\eta = 0.01$ | $\eta = 0.05$ | $\eta = 0.10$ |
|---|---|---|---|
| Three hidden neurons | | | |
| Five hidden neurons | | | |


Finally, we obtained some convergence failures with networks having three hidden neurons. However, with no errors in step-based solutions and with nonconvergent runs involving both analytical and step-based solutions, there is no indication that step-based solutions cause additional convergence failures.

## 5 Discussion

### 5.1 Scope of the Analytical Solution

Our results, while promising, are limited by the small scale of our experiments. A model of how analytical performance scales with PSP count and network size could help inform the use of our method on large-scale problems. However, we found that deriving such a model is challenging because the firing process of spiking neurons itself affects the PSP count in a network. Approximations of the firing process, such as those used in Thiruvarudchelvan, Crane, et al. (2013), could help toward that goal.

It is of interest to extend our analytical solution to other PSP profiles and to multispiking neural networks that add exponential terms to the membrane potential to represent refractoriness (Booij & Nguyen, 2005; Ghosh-Dastidar & Adeli, 2009). Changing the firing time equation can preclude its solution via the Lambert *W* function. Note, however, that this function is not elementary either: a numerical method is required for its evaluation, and Halley's method could very well be adapted to more general forms of the membrane potential. Still, when possible, it is desirable to solve the firing time equation via special functions because of their well-understood behavior, which facilitates their use.

Furthermore, it would be worthwhile to adapt our analytical solution to recurrent spiking neural networks. In such networks, the onset times of PSPs are not known in advance, and our method should be modified to compute firing times simultaneously in all neurons with a list of known onset times that could be appended as new spikes are fired.

### 5.2 Significance of the Analytical Performance

On average, analytical solutions reduced the training time per epoch compared with step-based solutions under several experimental conditions. Those conditions can have a greater impact on performance than the solution method itself, but identifying optimal conditions is a time-consuming and largely problem-dependent process. Analytical solutions could help speed up this process, and further experiments are needed to confirm their performance across different problems. Also, for large-scale spiking neural networks, parallel implementations should be employed as well to further improve training time (Nuno-Maganda, Arias-Estrada, Torres-Huitzil, & Girau, 2009).

We found that solving the firing time equation is subject to errors with the step-based method. These errors are very rare, and they did not occur during our experiment with SpikeProp. Their impact on convergence remains uncertain, which warrants further study of the properties of SpikeProp.

## 6 Conclusion

In this letter, we have shown that the firing time of spiking neurons in SpikeProp can be determined analytically via the Lambert *W* function. This analytical solution offers better performance than the step-based solution employed in the literature. In certain situations, it can even correct erroneous results of the step-based solution. Furthermore, we showed that the analytical solution reduced the training time per epoch of SpikeProp.

Further work involves the development of a model of analytical performance scale-up to improve our understanding of analytical and step-based solutions, the extension of our analytical method to other variants of spiking neural networks and to larger scales to solve real-world classification problems, and additional experimentation to characterize the convergence behavior of SpikeProp.

## Acknowledgments

This research is supported by the Bill and Melinda Gates Foundation, award OPP1110049. We are grateful to the reviewers for their constructive comments, which helped improve this work.

## References

Barry, D. A., Parlange, J.-Y., Li, L., Prommer, H., Cunningham, C. J., & Stagnitti, F. (2000). Analytical approximations for real values of the Lambert *W*-function. *Mathematics and Computers in Simulation*, 53, 95–103.

Corless, R. M., Gonnet, G. H., Hare, D. E. G., Jeffrey, D. J., & Knuth, D. E. (1996). On the Lambert *W* function. *Advances in Computational Mathematics*, 5, 329–359.

## Notes

^{1}

This special function is often available in numerical analysis software, and analytical approximations also exist (Barry et al., 2000).

^{2}

We developed our own Matlab scripts and functions, with no particular run-time optimizations.