## Abstract

Using methods from nonlinear dynamics and interpolation techniques from applied mathematics, we show how to use data alone to construct discrete time dynamical rules that forecast observed neuron properties. These data may come from simulations of a Hodgkin-Huxley (HH) neuron model or from laboratory current clamp experiments. In each case, the reduced-dimension, data-driven forecasting (DDF) models are shown to predict accurately for times after the training period.

When the available observations for neuron preparations are, for example, membrane voltage V(t) only, we use the technique of time delay embedding from nonlinear dynamics to generate an appropriate space in which the full dynamics can be realized.

The DDF constructions are reduced-dimension models relative to HH models as they are built on and forecast only observables such as V(t). They do not require detailed specification of ion channels, their gating variables, and the many parameters that accompany an HH model for laboratory measurements, yet all of this important information is encoded in the DDF model. As the DDF models use and forecast only voltage data, they can be used in building networks with biophysical connections. Both gap junction connections and ligand gated synaptic connections among neurons involve presynaptic voltages and induce postsynaptic voltage response. Biophysically based DDF neuron models can replace other reduced-dimension neuron models, say, of the integrate-and-fire type, in developing and analyzing large networks of neurons.

When one does have detailed HH model neurons for network components, a reduced-dimension DDF realization of the HH voltage dynamics may be used in network computations to achieve computational efficiency and the exploration of larger biological networks.

## 1 Introduction

### 1.1 General Setting

When we have observed data generated by a complex, nonlinear system, neurons and their networks are a prime example of this, but we do not have a model for the detailed neurodynamics of the system, it is possible to use those measured data alone to create a dynamical rule that forecasts the future of the observed quantities beyond the window in time where the data were measured. We call this data-driven forecasting (DDF).

Our goal in this article is to demonstrate how to achieve and then use DDF in the context of neuroscience.

DDF is to be viewed as a way to capture observed aspects of neurobiological data in contrast to the usual method of creating a detailed biophysical model, often of the Hodgkin-Huxley (HH) type, which may or may not be correct, specifying all of the relevant ion currents, the required gating variables, and the concentrations of relevant quantities such as $[Ca2+](t)$ (Johnston & Wu, 1995; Sterratt, Graham, Gillies, & Willshaw, 2011).

Unknown quantities in such an HH model may be estimated using data assimilation (DA) (Toth et al., 2011; Kostuk et al., 2012; Meliza et al., 2014; Nogaret et al., 2016; Abarbanel, 2022). DA requires a model of the observed complex system and, of course, data from observing that system. The data are used to train items in the model such as fixed parameters and unobserved state variables.

DDF does not require a model of all these often unobservable details. Nonetheless, as it is built on observed data, it encodes those details while forecasting only the observable properties of the neuron activity. Both DA and DDF may be seen as a form of supervised learning (Abarbanel, Rozdeba, & Shirman, 2018). In this regard, they may also be regarded as methods of machine learning.

If the construction and analysis of a biophysically detailed HH model have been achieved, perhaps employing DA, using HH models in large networks of biological interest may prove computationally quite challenging. DDF may be utilized to accurately forecast the voltage time course of this HH model, thus replacing it in the network of interest. Only the voltages across neuron cell membranes are used in the communication among neurons in a network; thus, DDF neurons are nicely suited for use as the dynamical elements of functional biological networks. Using a reduced-dimensional model in network studies can result in significantly decreased computational tasks.

### 1.2 Data of Interest

We start with the formulation of DDF models for individual neurons.

We have in mind data where a neuron is stimulated by a known forcing via an injected current $Istim(t)$, and its membrane voltage response V(t) is measured, namely, current clamp experiments. One observes V(t) at discrete times $tn=t0+nh;n=0,1,\u2026,N$. We use these data to build a biophysically based, nonlinear discrete time map that takes $V(tn)\u2192V(tn+1)$ for any selected stimulating current. Importantly, the map must predict well for $n\u2265N+1$.

We demonstrate that this is accomplished in the analysis of numerically generated data from a standard neuron model and in the analysis of current clamp data collected in a laboratory environment.

When this is successful, we will have created a DDF dynamical rule moving V(t) forward in time without regard for gating variable time courses—parameters such as maximal conductances or reversal potentials, chemical reaction rates, or any of the other detailed biophysical characteristics of the HH neuron dynamics. Yet built from observed data, the biophysical information is embedded in the DDF model.

As we shall demonstrate here, one is able to accomplish this but, not surprisingly, must give up some things that are found in the use of a detailed model. The method is restricted to forecasting only what is observed.

### 1.3 Useful Attributes of DDF Neurons

This forecasting construction $V(tn)\u2192V(tn+1)$, which we call a DDF neuron, may be used in network models of interest. DDF neurons would be located at the nodes of the network, leaving only the network connectivity to be determined, possibly by DA from observed network activity data (Abarbanel, 2022).

In biological networks, the individual neurons are driven by external currents, if present, and by the currents received from other neurons presynaptic to it. The gap junction and synaptic current connections are described by the presynaptic voltage and the postsynaptic voltages, allowing DDF neurons to be valuable, efficient network nodes in computational models of functional neural networks.

If one has the goal of controlling aspects of functional neural networks to achieve desired goals, DDF neurons provide a computationally inexpensive way of incorporating the observable properties of such a network. This is significant as observables are the attributes that control forces may affect.

## 2 Plan for This Article

1. To begin, we describe the DDF method as applied to individual neuron data and give an example using numerical data from solving a basic Hodgkin-Huxley (HH) neuron model (Hodgkin & Huxley, 1952; Johnston & Wu, 1995; Sterratt et al., 2011) with Na, K, and Leak channels. For brevity, we designate this HH model as an NaKL neuron.

2. The time courses for the voltage and the Na and K channel gating variables ${V(t),m(t),h(t),n(t)}$ from the model are used as “data” and analyzed in two settings:

The first setting is a confidence-building exercise,

*not biologically realistic*, that assumes we have observed data on the membrane voltage V(t), as well as on all three of the HH gating variables ${m(t),h(t),n(t)}$ in the model.The second setting conforms to what one can actually do in a current clamp experiment, namely, observe only the membrane voltage V(t) given the stimulating current $Istim(t)$. This requires us to add to the basic DDF formulation the idea of constructing enlarged state spaces from the observed variables and their time delays (Takens, 1981; Aeyels, 1981a, 1981b; Abarbanel, 1996; Kantz & Schreiber, 2004). This method is familiar and essential in the study of nonlinear dynamics and will be explained in the present context.

3. We next turn to the DDF analysis of laboratory current clamp data acquired in the Margoliash laboratory at the University of Chicago.

4. An analysis is then made of how DDF neurons can be used in the construction and study of networks of neurons. In this article, we first direct our attention to a quite simple two-neuron example with gap junction connections.

5. Then we present a study of a network segment where a presynaptic neuron, driven by a stimulating current $Istim(t)$, drives a postsynaptic neuron via a ligand gated synapse.

6. A summary and discussion complete the article.

7. We include three appendixes:

Appendix A is a brief manual discussing how one builds DDF models in practice.

Appendix B has the formulation for incorporating ligand gated synaptic connections into a DDF-based network model.

Appendix C contains a short biophysical discussion on the choice of stimulating currents $Istim(t)$ selected to permit the DDF model to generalize its response to a wide class of stimulating currents. This appendix was developed following questions from two reviewers of an earlier draft of this article.

## 3 Discussion of the Methods of DDF

Our discussion begins with a broader formulation of DDF than will be required in neurobiology, where only V(t) is observed in laboratory current clamp experiments. We return to the realistic scenario of observing only V(t) after providing the broad perspective. An example of data, beyond membrane voltage, where DDF will permit accurate forecasting, includes fluorescence associated with Ca concentration variation $(Ca2+)(t)$ in the presence of a Ca indicator (Smetters, Majewska, & Yuste, 1999; Vogelstein et al., 2009; MacLean & Yuste, 2009), $(Ca2+)(t)$. The DDF formulation for $[Ca2+](t)$ is described in section 7.

The idea of DDF is to start with the collection of D-dimensional observed data $u(tn)=u(n)={ua(n)};a=1,2,\u2026,D$, sampled at discrete times $tn$ over an observation window $[t0\u2264tn\u2264tN];tn=t0+nh;n=0,1,2,\u2026,N$ in time steps of size $h$.

Next, using only these data, we discuss how to construct a discrete time map $u(n+1)=u(n)+f(u(n),\chi )$ for forecasting the future of those data. The $\chi $ are parameters in $f(u(n),\chi )$ that we will estimate (train) on the observed data. The nonlinear function $f(u,\chi )$ is called the *vector field* (Strogatz, 2015) of the discrete time map.

When we do not have a model of the biophysical processes generating $u(t)$, we ask: Can we build a representation of $f(u,\chi )$ and use observed data to determine the unknown, time-independent quantities $\chi $ in that representation?

We will answer this in the affirmative using applicable developments in applied mathematics (Hardy, 1971; Broomhead & Lowe, 1988; Casdagli, 1989; Judd & Mees, 1995; Schaback, 1995; Powell, 2002) explored in depth over many years.

The general idea was well investigated in the context of autonomous systems where there is no external forcing. This is not the situation that one addresses in neurobiology, where neurons are stimulated by external sources and, in functional networks, by the activity of other neurons in the network.

The first step is to select a parameterized representation for $f(u,\chi )$ using $u(n+1)=u(n)+f(u(n),\chi )$ for $0\u2264n\u2264N$, to estimate (train) the parameters $\chi $. Once the $\chi $ are known, we are able to use this trained discrete time dynamical rule $u(n+1)=u(n)+f(u(n),\chi )$ to forecast the behavior of the observed quantities $u(n)$ for $n\u2265N+1$ in time steps of size h.

In equation 3.1, the $pj(u)$ are multivariate polynomials of order j, The functions $\psi ((u-uc(q))2,\sigma )$ are called *radial basis functions* (RBFs). The $Nc$${uc(q)}$ are denoted as *centers*, and they are selected from the observed data: $Nc\u2264N$.

In developing this representation of the vector field as a function on $u$ space, we may think of the ${u(n);n=0,\u2026,N}$ as samples of a distribution in $u$. The $f(u,\chi )$ are designed to interpolate among these samples.

The training of the ${caj,waq}$ is a linear algebra problem (Press, Teukolsky, Vetterling, & Flannery, 2007). The linear algebra problem for $Nc\u2260N$ requires regularization, and that means we must specify a Tikhonov regularization parameter, which we call $\beta $. This is also called ridge regression.

We must also specify any parameters appearing in the functions $\psi ((u-uc(q))2,\sigma )$. A guide to how we select all parameters of the DDF formulation, in practice, including ${caj,waq,\sigma ,\beta}$, is given in appendix A. When we use time delay coordinates, as we do in section 6, two more parameters enter: the time delay $\tau $ and the dimension of the time delay vector $DE$. So $\chi ={caj,waq,\sigma ,\beta ,\tau ,DE}$ is the full set of parameters that we must estimate from the given data.

The representation of the vector field $fa(u,\chi )$ in equation 3.1 has both what one often finds described as RBFs by themselves (the second term on the right) and polynomials in $u$. We first tried to work with the polynomials alone, but found that when $J$ was only 3 or 4, they were not able to represent the kind of nonlinearities found in biophysical models of neurons. As $J$ increases the number of coefficients $caj$ grows more or less as $J!$, and even the linear algebra problem becomes difficult. The more general case in equation 3.1 is discussed in Schaback (1995) and Powell (2002). In practice we only used the monomial $u$.

Once the linear algebra problem of determining the ${waq}$ is completed, equation 3.2 becomes our discrete time (in steps of size h) dynamical, nonlinear forecasting rule for times $tn\u2265tN$.

There are many choices for these RBFs (Hardy, 1971; Powell, 2002; Schaback, 1995; Buhmann, 2009). Our RBF choice has been the gaussian: $\psi G((u-uc(q))2)=exp[-R(u-uc(q))2)]$, $R=1/\sigma 2$. Other choices, and there are many (see Table I in Schaback, 1995), have given equivalent results in practice.

## 4 Using DDF in Neurobiology

### 4.1 Data Assimilation

In the study of the ingredients of functional neuronal networks, one is able to measure the time series of voltage V(t) across the cell membrane of individual neurons in a routine manner. Using observed values of V(t) along with knowledge of the forcing by a stimulating current $Istim(t)$ presented to the neurons, it is often possible to estimate the detailed electrophysiological properties of a Hodgkin-Huxley model of an individual neuron (Hodgkin & Huxley, 1952; Sterratt et al., 2011; Johnston & Wu, 1995) using data assimilation (Abarbanel, 2022).

The data assimilation estimation involves inferring the unmeasured state variables, including ionic gating variables, and unknown parameters such as maximal conductances of ion channels. A model developed and completed in this manner is validated by comparing its voltage time course when driven by stimulating currents after the observation window. Only in the observation window is information passed to the model from measurements of V(t).

### 4.2 DDF

The DDF perspective selects a related, but also biophysically grounded, path that emphasizes what can actually be measured and forecasts only those aspects of complex neuronal activity.

Approaching the question of biophysical models for the dynamics of neurons from a DDF perspective provides a way to make predictions or forecasts of V(t) without the details of the biophysical model. This, as noted, sets aside the knowledge of the details of the model, but for purposes of building up the dynamics of voltage activity in a network, it may be of great utility.

One attractive feature of DDF is that it results in significant model reduction from the many state variables and proliferation of parameters present in the neuron dynamics (Nogaret, Meliza, Margoiliash, & Abarbanel, 2016) by focusing on those state variables that can be measured.

### 4.3 Hodgkin-Huxley Structure for Driven Neurobiological Dynamics

We continue our discussion of DDF in neurodynamics by attending to how one can work with numerical and simulated data from a basic, well-studied, Hodgkin-Huxley (HH) model neuron. Our analysis of experimental current clamp data will follow this set of numerical examples.

In the HH formalism, the equation for the voltage is driven by $Istim(t)$ in an additive manner. We use this in formulating our DDF protocol for biophysical neuron models. What we do not know from the data alone are the vector fields ${Fintrinsic(V(t),A(t)),FA(V(t),A(t))}$, whose specification yields the detailed HH biophysical model.

### 4.4 The Basic NaKL HH Neuron as an Example

We now analyze the DDF representation of the flow, 4.2 in the NaKL HH model neuron, in two cases: (1) when we observe all state variables ${V(t),A(t)}$ and (2) when we observe only V(t). Case 1 is not a realistic scenario in current clamp experiments, but we include it as a confidence-building exercise in the construction of DDF neurons. Case 2 is the realistic scenario in a current clamp experiment where a simulating current $Istim(t)$ drives a neuron and only the membrane voltage V(t) is observed.

Our first step is to work with numerically generated data from the HH NaKL model neuron (Johnston & Wu, 1995; Sterratt et al., 2011).

This basic, well-studied, detailed biophysical HH neuron model has 4 state variables and the order of 20 parameters (Johnston & Wu, 1995; Toth et al., 2011; Kostuk et al., 2012). DDF will replace this with a forecasting equation for V(t), the experimentally observable quantity, alone. As we have argued, much is gained by this model reduction.

In these equations, we use the parameters given in Toth et al. (2011) and Kostuk et al. (2012). Data are generated by solving equation 4.3 using a fourth-order Runge-Kutta method (Press et al., 2007).

*not*the realistic biological feature of a current clamp experiment, is the following ($w(tn)=w(n)$):

We use the trapezoidal rule (Press et al., 2007) for the integration over $Istim(t)$. We are taking steps of size h in time. There are many improvements over the trapezoidal rule for the integration from $tn$ to $tn$+h over $Istim(t)$. Those improvements require sampling or estimating values at points between $tn$ and $tn+h$. We do not have these quantities in our data set.

The task for DDF is to select representations for the vector fields $fV(w)$ and $fA(w)$, which we do below. Interestingly, the constant C, the membrane capacitance, can be estimated in the DDF training protocol as the time course of the driving force, $Istim(tn)$, must be specified by the user.

## 5 Observing Only V(t): Time-Delay Methods

What is actually measured in current clamp laboratory experiments is V(t) alone. The neuron dynamics resides in a higher-dimensional space than the one-dimensional V(t) that is measured. What we observe is the operation of the full dynamics projected down to the single dimension V(t). To proceed, we must effectively “unproject” the dynamics back to a “proxy space,” comprising the voltage and its time delays (Takens, 1981; Aeyels, 1981a, 1981b; Abarbanel, 1996; Kantz & Schreiber, 2004), which is equivalent to the original state space of V(t) and the gating variables for the ion channels.

The physics behind the time-delay construction is that as the observed system moves from time $tn-\tau n+1\u2192tn-\tau n$, the dynamics of the system incorporates information about the activity of all other variables beyond the voltage alone. When the quantity $V(tn-\tau n)$ is approximately statistically independent of the quantity $V(tn-\tau n+1)$, each can be used as the components in a “proxy vector” $S(n)$ representing the system dynamics as it develops in dimensions higher than V(t) alone.

Using the average mutual information (Fraser & Swinney, 1986; Abarbanel, 1996; Kantz & Schreiber, 2004) between $Sj$ and $Sk\u2260j$, and choosing time lags ${\tau j,\tau k}$ giving a minimum of this average mutual information, we achieve an approximate information-theoretic, nonlinear independence of the $DE$ components of $S(n)$ with respect to each other.

In principle in the discussion of Takens's work, if one has an infinite amount of noise-free data, any time delay or set of time delays (Hirata, Suzuki, & Aihara, 2006) would give a proxy state vector $S(n)$ that is equivalent to the original dynamical space of the data source. Of course, we never have that, so a guide was devised by Fraser and Swinney (1986) suggesting that choosing the components of $S(n)$ to be nonlinearly independent of each other, using average mutual information as a “nonlinear correlation” among the components, would provide a good measure of the ability of the components of $S(n)$ not to be parallel to each other. If that is achieved, then they would span the $DE$-dimensional space of $S(n)$ in a numerically useful manner. If there are multiple time scales in the data, the method of Hirata et al. (2006) could be a useful method to implement.

To estimate $DE$, one may use the method of false nearest neighbors (Abarbanel, 1996; Kantz & Schreiber, 2004).

It is typical, but not required, to select $\tau a=(a-1)\tau ;a=1,2,\u2026,DE$. In this standard choice of delay coordinates, there are two parameters that determine the vectors $S(t):{\tau ,DE}$. $\tau $ is conveniently taken to be an integer multiple of h; $DE$ is an integer about twice the dimension of the system generating V(t). More precise results for these criteria are given in Sauer, Yorke, and Casdagli (1991); Abarbanel (1996); and Kantz & Schreiber (2004).

In equation 5.3 we have introduced $DE$ vector fields $fa(S,\chi )$ whose parameters $\chi $ must be estimated by requiring equation 5.3 to be true over a training set with ${tn};n=1,2,\u2026,N-1$. The trained dynamical map $S(n)\u2192S(n+1)$ is used to forecast all components of $S(k)$ for $k\u2265N$.

## 6 Results When Only V(t) Is Observed

### 6.1 DDF Analysis of Numerical V(t) Data from an NaKL Neuron

Still using data from the numerical solution of the basic HH equation, equation 4.3, we now train a gaussian RBF via equation 5.3, with 125 ms of data for V(t) alone employed in the estimation of the parameters $\chi $ of the RBF.

The trained DDF is used to predict the subsequent 500 ms of the V(t) time course.

Note that one cannot forecast the gating variables ${A(t)}$ of the HH NaKL model as we have no observed information about them. Through the time-delay vector $S(tn)$, the biophysical information in the ${A(t)}$ is encoded in the trained parameters $\chi $.

#### 6.1.1 Comparing DDF Forecasting and NaKL Forecasting Times

To assess the effectiveness of using a trained DDF to forecast V(t) data, for example, for the efficiency of computational demands on a DDF neuron in a circuit where it replaces a HH model, we compared the computation time for solving our HH NaKL model to the forecasting time of a DDF trained on the V(t) from that HH NaKL model.

We generated HH NaKL data by solving equation 4.3 using a standard fourth-order Runge-Kutta method (Press et al., 2007; Olver, 2017) with a time step of h $=$ 0.02 ms. The times taken by the generation of the NaKL data in a forecast window of 2000 ms (10$5$ time steps) were 8.9 s for either CPU time or wall clock time.

We then forecast in the same window using the same $Istim(t)$ as for the HH NaKL model but using a trained DDF, trained on V(t) from the HH NaKL model and forecasting only V(t).

The choice of the number of centers $Nc$ in the training and forecasting for the DDF is important to the forecasting time of the DDF. If we choose $Nc=500$, then the CPU time for forecasting the 2000 ms with h $=$ 0.02 is 3 s, while the wall clock time is 2.4 s. If we decrease the number of centers to $Nc=100$, then the CPU time during forecasting is reduced to 2 s, while the wall clock time drops to 1.5 s.

The training time for the DDF with $Nc=500$, on V(t) from the HH NaKL models is 1.1 s for CPU time and 0.63 s for wall clock time. This decreases to 135 ms CPU time and 142 ms wall clock time for $Nc=100$.

These times will vary as the complexity of the HH model neuron increases from the minimalist NaKL model to a model for observed laboratory observations. One expects the V(t) trained and forecasting DDF to become relatively more efficient than our results on the simple NaKL model. The training times for a DDF on V(t) data alone are quite fast. In the scenarios where we substitute DDF V(t) neurons for HH neurons in a circuit, the computational efficiency is what will be of central importance.

### 6.2 DDF Analysis of Laboratory Data from an HVC Neuron in the Avian Song System

With these DDF results on numerical data generated by the solution of the NaKL HH equation in hand, we turn to the use of DDF when presented with experimental current clamp data.

### 6.3 Only V(t) Is Observed in Laboratory Current Clamp Data

Current clamp data were collected by C. D. Meliza at the University of Chicago laboratory of Daniel Margoliash from presenting various stimulating currents $Istim(t)$ to isolated HVC neurons in a zebra finch in vitro preparation. The data were organized into epochs of length about 2 to 6 seconds observed over several hours.

### 6.4 Training a DDF Neuron in One Epoch and Using It to Forecast in Another Epoch on Experimental Current Clamp Data

Another informative test of the DDF approach is to train a DDF neuron on neuron data from one time epoch with a selected $Istim(t)$, then using the same parameters $\chi $ from the first epoch to forecast the V(t) response to different $Istim(t)$ presented in a second time epoch. This is a test of the DDF neuron ability to correctly respond to different stimulating currents.

In Figure 5 (top), we show $Istim(t)$ in epoch 25 and the resulting, $VData(t)$ (bottom). In Figure 6 (top), we show $Istim(t)$ in epoch 26 and the resulting, $VData(t)$ (bottom). These data are from two epochs of a current clamp experiment from the Margoliash laboratory (University of Chicago) on an isolated neuron from the zebra finch HVC nucleus.

### 6.5 Comments on the HVC Current Clamp Experiments

There is a large library of current clamp data from this preparation. The observation window for the data used here was about 1650 to 3500 ms. Many entries in the library have a longer window of time over which data were collected, and this would allow longer training windows to be used. In previous work with these kinds of data (Kostuk et al., 2012; Nogaret et al., 2016) longer estimation windows typically result in better forecasting.

The stimulating currents in these data were designed with three biophysical principles in mind. First, the amplitude variations of $Istim(t)$ must be large enough to generate many action potentials as well as substantial periods of subthreshold behavior. This guarantees that the full dynamic range of neuron response is well represented. Second, the observation window must be long enough in time to ensure the same goals as in the third principle. Third, the frequencies in $Istim(t)$ should be low enough so that properties of the stimulating signal are not filtered out by the RC low-pass characteristic of the cell membrane. If these criteria are not met, aspects of $Istim(t)$ are filtered out by the cell membrane, and the training is likely to be insufficiently well informed. V(t) data collected with $Istim(t)$ chosen employing these guidlines were regularly successful in using DA to estimate the properties of rich HH models (Toth et al., 2011; Kostuk et al., 2012; Nogaret et al., 2016) from laboratory data.

The context of this discussion is expanded in appendix C.

## 7 Adding Other Observables

In many neurobiological investigations, more observables than just V(t) may be available. We examine how these may be combined with observations of V(t) in a DDF framework or used on their own.

The sources of Ca ions are attributed to voltage gated Ca channels with various properties and release of and uptake of Ca from intracellular stores such as the endoplasmic reticulum (Houart, Dupont, & Goldbeter, 1999, 2003; Dupont & Goldbeter, 1993; Ye et al., 2014).

$y\xb1=1\xb1h2\tau c$ appears here as we identify the appearance of $\Delta (n)$ and $\Delta (n+1)$ on both sides of the equation for the flow of $Ca(t)$. The integral over $\Delta (t)$ uses the trapezoidal approximation, and the unknown dynamics for the sources and sinks of $Ca(t)$ are represented within the RBF vector field $fCa(D(t),\chi )$.

## 8 Using DDF Neurons in a Network

One important goal of using neuron models trained by data alone (e.g., DDF neurons) is to provide a reduced model based on biophysical observations to employ in building network models.

Note that it is only the presynaptic and postsynaptic voltages that convey information from any neuron in this circuit to others in the circuit. Either the HH model NaKL neuron or the NaKL-trained DDF neuron may be used in this small network. Each produces voltage signals that couple the neurons.

### 8.1 Discrete Time Gap Junction Dynamics

## 9 Dynamics of a Simple Two-Neuron Network

We are now prepared to use the dynamical discrete time maps for circuits such as the one in Figure 9.

This proceeds as follows:

Determine the neuron RBFs for the two neurons $fV1(S1)$ and $fV2(S2)$. In the simplest case, which we adopt here, the neurons are the same, and the RBFs, $fV(S)$, are the same function as their respective multivariate arguments $S1(t)$ and $S2(t)$.

Select a stimulating current $Istim(t)$.

Select excitatory or inhibitory synaptic connections or gap junction connections.

Using the DDF neurons in place of the HH neurons appearing in the circuit, use the trained DDF neurons.

Evaluate the network behavior using the discrete time map for the coupled $V1(tn)$ and $V2(tn)$, equations 8.2 and 8.3, when DDF neurons are at the nodes of the network.

### 9.1 A Two-Neuron Circuit: HH NaKL Neurons or DDF, NaKL Trained, Neurons; Gap Junction Connections

Using the HH-NaKL neuron at both nodes of the network (see Figure 9), we generated the time course of $V1(t)$ and $V2(t)$. Then replacing the HH-NaKL neurons with the trained DDF-NaKL neuron, we generated another set of $V1(t)$ and $V2(t)$ time courses.

This result is, as noted, for gap junction couplings between the two neurons. The way one introduces ligand gated synaptic connections into a network is discussed in appendix B.

### 9.2 One DDF Neuron Driving a Second DDF Neuron through a Synaptic Connection

For the excitatory synaptic connection, we selected $gsyn=0.5,Erev=50mV,\tau A=0.1ms,A1=9/8,V0=-50mV$ and $dV0=10mV$.

The performance of this network segment was evaluated using the simple NaKL HH neuron as both the presynaptic neuron and the postsynaptic neuron. Along with the dynamics of the synaptic gating variable $A(t,V(t))$, this is a nine-dimensional dynamical system. Selecting $Istim(t)$ as we have earlier, we solved for $V1(t)$ and $V2(t)$ and stored these data for later use.

Next, using just the V(t) from an isolated NaKL neuron driven by the selected $Istim(t)$, we built a DDF neuron using the time-delay method described earlier. In the construction of the DDF neuron model, we used $Nc=500$, $DE=4$, $\tau =2$, and h $=$ 0.02 ms in the data.

Two of these DDF neurons were then used in the network segment in place of the simple NaKL HH neurons, and the same network segment was driven by the same $Istim(t)$ presented to neuron 1 connected by the same synaptic dynamics to DDF neuron 2. The DDF neurons were trained with 500 ms of HH model V(t) data.

This network segment result indicates that, indeed, we may replace the more complex HH neuron voltage activity with the reduced dimension, biophysically trained, V(t) DDF neuron in synaptic connections occurring in a network of neurons.

## 10 Summary and Discussion

This article is a melding of many ideas from nonlinear dynamics and applied mathematics to the goal of constructing biophysically based models of observables in neurobiology. These data-driven models encode the full information in experimental observations on the complex system, the neuron, which is observed in a current clamp experiment, that is, one with a given $Istim(t)$ with membrane voltage V(t) observed, and permit forecasting and predicting the voltage response of the observed neuron to other stimuli.

The method is called data-driven forecasting (DDF), and the model construction produces a discrete time, nonlinear map $V(t)\u2192V(t+\Delta t)=V(t+h)$ that may be used in a network with synaptic or gap junction connections, as the dynamical function of each is determined by the voltages of the presynaptic and the postsynaptic cells. We have shown this in a simple network for gap junction neuronal connections and for a network segment where a presynaptic neuron, stimulated by $Istim(t)$, drives a postynaptic neuron via an excitatory synapse. The formulation of the required dynamical map for the gating variables of synaptic connections is given in appendix B.

The DDF-based network permits a computationally efficient network model where the trained DDF discrete time map, trained on biophysical data, replaces complex Hodgkin-Huxley models typically used in contemporary research. The computational advantage is achieved in two ways: (1) no differential equations at the network nodes must be solved, and (2) the nodal models are substantially reduced in complexity as it is only the observable, V(t) that is forecast.

In the timing comparisons we performed on the use of an HH NaKL model neuron solved by a standard fourth-order Runge-Kutta ordinary differential equation solver compared to the forecasting efficiency of the V(t) alone DDF for the same time period with the same $Istim(t)$, we found a computational improvement by a factor of 3.7. As the HH NaKL model evaluates four state variables ${V(t),m(t),h(t),n(t)}$ while the V(t)-trained DDF neuron gives only the time course of V(t), $V(t)\u2192V(t+h)$, a factor of about four in execution of the DDF is sensible.

For the data in Figure 3, a detailed Hodgkin-Huxley model with 14 state variables, V(t), and 13 gating variables was constructed using methods of data assimilation (Nogaret et al., 2016). The improvement in computational efficiency in forecasting V(t) using the DDF neuron trained on these data over the HH model was approximately 10. This suggests the idea that for an HH neuron with $Ngating$ variables, the computational efficiency of a DDF neuron accurately forecasting the membrane voltage will be about $Ngating$ faster than solving the HH model at each node of a functional network.

One must recognize that while much is gained by the DDF neuron construction, something is set aside, and that is the knowledge of the biophysics in detailed HH models of individual neurons, including the operation of gating variables for the ion channels selected for the model, parameters determining the dynamics and strength of those ion channels, and other unobserved yet relevant biophysical properties of the neurons.

In this article, we have presented the formulation of the DDF method. Many familiar with its ingredients will recognize the provenance of such a strategy. We have also shown how this approach can be applied to experimental data from current clamp experiments from an avian songbird preparation.

The challenges ahead in building large, realistic functional network models in neurobiology are significant. Again, as the DDF models are based on observations of V(t) from experimental data, the biophysical content of each representation of neuron dynamics is equivalent.

While we certainly have not shown this in this article, it is not inappropriate to state that the connectivity of a realistic functional network with DDF neuron models at the nodes can be established using methods of data assimilation (Toth et al., 2011; Kostuk et al., 2012; Nogaret et al., 2016; Abarbanel, 2022).

Data assimilation for such a network requires a model of the connectivity among the component neurons, along with data on the operation of many neurons in the intact network. The ability to use Ca fluorescence experimental results (Smetters et al., 1999; MacLean & Yuste, 2009; Vogelstein et al., 2009), where a large number of neurons are observed in a data assimilation environment suggests a path forward for this kind of network connectivity construction.

Another potential use of DDF arises when one does have detailed HH model neurons for network components. This may be of value in answering questions more intricate than computationally efficient neurons at the nodes of large networks (Daou & Margoliash, 2020; Abarbanel, 2022, chap. 9). It is likely that data assimilation will have been used to create such detailed biophysical models. Using V(t) from such models, one can construct a reduced dimension DDF $V(t)\u2192V(t+h)$ realization of the biophysics in the HH model, and then, as we showed, one may replace the complex HH model with the reduced-dimension DDF construct. One can expect to achieve computational efficiency and allow the exploration of larger biological networks when using the DDF construction to capture the neuron biophysics.

## Appendix A: A Guide for the Implementation of DDF

### A.1 Guide to the Use of Radial Basis Functions in Data-Driven Forecasting

To provide readers an overview of how we construct and use the representations of the vector field for the DDF discrete time maps constructed from observed data, we present in this appendix a “manual.” Our manual is more general than the applications in the main body of the article, which has as its focus the use of the methods in neuroscience. Here, we identify what is special for neuroscience and what is useful in general:

1. Collect D-dimensional data for the observations on your system of interest. We call these data $u(tn)=u(n)\u2208RD;tn=t0+(n-1)h,n=1,2,\u2026N$. We wish to use these data to construct a discrete time map $u(n)\u2192u(n+1)$, which we then use for $n>N$: to forecast in the dynamical development of the observations beyond the data that have been collected.

2. Select a training subset of the data ${u(n)};n=1,2,\u2026,NT;NT<N$.

3. Select a subset of the observed data to use as “centers” $uc(q)q=1,2,\u2026,Nc;Nc\u2264NT$. The RBF was designed to be an accurate interpolating function in $u$ space, and the centers tell us which samples of the distribution in $u$ space we utilize.

4. Select a radial basis function (RBF) $\psi ([u-uc(q)]2,\sigma )$. All of the RBFs have a shape factor $\sigma $ that is associated with the distance in D-dimensional $u$ space over which the influence of the observed data is important (Wu, Wang, Zhang, & Du, 2012).

5. Represent the vector field $f(u(n),\chi )$ for the discrete time dynamics $u(n+1)=u(n)+f(u(n),\chi )$. The $\chi $ are a set of parameters that we will learn while training the RBF.

*ridge regression*in the literature. The regularization consists of realizing the determination of the ${waq}$ as the minimization of

To implement this protocol, we have proceeded in the following way:

### A.2 Including Polynomial Terms in the Representation of $f(u)$

which is what we see in equation 3.1.

The addition of polynomials is indicated if we know on other grounds something about the underlying dynamic of the source of the data that suggests polynomial terms in the vector fields are present. That is not the case in applying these methods to neuroscience, so except for the term $u(n)$ in equation A.2, we do not use the freedom of further polynomials in $u$.

### A.3 How to Choose Centers

The choice of centers, the $uc$ vectors in the RBF, is constrained by the data points in the training window $1\u2264n\u2264NT$. We used K-means clustering (Du & Swamy, 2006) to select the $uc$, and this is a way of selecting centers where they are needed to sample the distribution in $u$ space. While there may be more advanced methods, we have found K-means to work well enough and have used the method in finding all of the results in this article.

Another aspect to consider is the number of centers $Nc$. A general rule of thumb is that the more centers, the better, as the distribution in $u$ space is better sampled. However, this comes at the cost of both memory and computational time. In this article, we typically chose about 5000 centers for training lengths $NT\u224825-50\xd7104$.

There is a possible strategy with regard the selecting $Nc$ that balances a finely grained sampling of the distribution in $u$ space against the increased computing cost in evaluating the dynamical map $u(n)\u2192u(n+1)$ when $Nc$. When a data set that is noisy is presented, the resolution in $u$ space is coarsely grained when the data arrive. The idea is to recognize that limitation on how well any DDF can perform by evaluating a metric on the match between the predictions of the trained DDF ${uDDF(n)$ compared to the known, noisy knowledge we have of the data $udata(n)$ for $n\u2265NT$. This could be something like $\u2211n>NT[uDDF(n)-udata(n)]2$. Noise will limit the improvements in forecasting quality as $Nc$ increases, and the performance of the metric should lead to a modest $Nc$ larger than which no increase in forecasting skill is seen.

#### A.3.1 Centers and RBFs

We found that the obvious choices for DDF worked out well, namely, choosing gaussian RBFs and K-means clustering. These strategies are commonly used in the RBF literature.

However, these are far from the only good working ideas there that could possibly work better. The centers could also be chosen by emphasizing places where the first or second derivatives in the data $u(n)$ are greatest. This could work well because it could select more centers where data might be sparse—for example, when the neuron data are spiking and regions are traversed at higher speed than in subthreshold regions with smaller derivatives in $u(n)$ (Sánchez, 1995).

### A.4 Working with Imposed Driving Forces Acting on the Source of the Data ${u(n)}$: Stimulating Currents $Istim(t)$

We have not found a general formulation for DDF in the case of a driven dynamical system. One natural idea is to include those forces—call them $Kext(t)$—on the list of observed quantities in the data: ${u(n)}\u2192{u(n),Kext(n)}$ and enter this into the arguments of the relevant RBFs.

In many interesting situations in physics and neurobiology, however, the forces are additive, so the formulation presented in this article may be applicable to many interesting problems.

If we were analyzing the underlying detailed dynamical equations of the processes producing the data, the $Kext(t)$ would have to be provided as they are not given a differential equation of their own but specified by the user. This is true when we solve the differential equations for a model of the driven systems and remains true for DDF.

In the case of the dynamics of neurons, we are in a fortunate situation. In this instance, the forcing is from currents external to the intrinsic currents associated with the ion channel dynamics. The voltage HH equation is a statement of current conservation, and currents are additive.

and is an example where some knowledge of the biophysics or other physics of the data source is of value.

### A.5 Evaluating the $\chi $

when the parameters $\Xi ={R,\beta ,\tau ,DE}$ are fixed.

We proceeded by selecting a set of $\Xi $, making a coarse grid search over the $\Xi $, performing the regularized linear algebra to train the coefficients $\phi ={caj,waj}$ for each choice of the $\Xi $, and using this trained representation to perform a validation test for each choice. $\chi ={\phi ,\xi}$.

Our practice has been to examine the quality of $uV(n)\u2248u(n)$, locate regions of the $\Xi $ where they match well, and refine the grid search in the $\Xi $ as required.

#### A.5.1 Differential Evolution to Search for the minimum of $C(\Xi )$

Differential evolution is a genetic algorithm that could be implemented in our DDF construction to improve our ability to create DDF models by more precisely choosing sets of parameters $\Xi ={R,\beta ,\tau ,DE}$. The method is described in Storn and Price (1997).

It works by initiating a parent set of parameters from a user-defined uniform distribution. From that start, new “children” sets of $\Xi $ are made from combining the parents in an algorithmic way and comparing the new validation forecast ${uV(j)}$ to the old one. A user-defined cost function, such as equation A.12, is used to compare the children to their parents.

As generations go by, parents will gradually be replaced by children until all of the parents converge on a minimum in the cost function. A user who is able to define a clever cost function could get a result that surpasses those from the simple grid searching employed so far.

### A.6 Code for Our Implementation of DDF

We did all of our testing and programming in Python. We give links to the code we used published in GitHub. In the links is a DDF program that was written purely for radial basis functions; in this repository, we include code for both gaussian and multiquadric forms of the RBF. There is a link to a DDF neuron repository that includes code that was built specifically for the study of an NaKL neuron, where only the voltage and the stimulus are known; this means that the code is built to perform time-delay embedding on a single dimension of voltage. The treatment of the stimulus current in this code p. 66: a list of 4 URLs is as we described earlier. The third and final attached link is to a method of using DDF with polynomial terms in the representation of $f(u,\chi )$ as an additional resource:

### A.7 Memory Management

An important note to consider is the size of the matrices involved in training, for they can grow to the size of gigabytes. For example, the largest matrix involved in training is that of all values of the RBFs at all times; $NT$ by $Nc$ can have a standard length of 25,000 data points with 5000 centers. Typically we use float64 for all our values, resulting in this matrix having a size $25,000(NT)\xd75000(Nc)\xd78=1$ gigabyte. We need another gigabyte for its transpose used in the regularized matrix inversion calculation. This could be a limiting factor if one is running multiple tests in parallel on a CPU or on a cluster with limited storage.

### A.8 Parallelizing DDF Calculations

This section serves as a suggestion to take advantage of the ability to run DDF code in parallel. The grid searching method discussed above lends itself readily to parallel programming. Even the differential evolution method can be run in parallel. A single trial of DDF does not have much potential for parallel operations as the training is just matrix multiplication, and the forecasting is a step-by-step process that relies on the result of the previous step.

## Appendix B: Synaptic Dynamics

In the main text, we implemented a simple two-neuron circuit with gap junction connections between the neurons. A richer network consists of more neurons and allows for synaptic connections as well. Starting with an equation for synaptic currents $Isynaptic(t)$, we provide a discrete time map to update the time dependence in a map for V(t).

### B.1 Discrete Time Synaptic Dynamics

In the discrete time dynamical rules for the membrane voltages of the neurons in a network, we require the gating variable at discrete time $tn$ and $tn+h=tn+1$.

In equation B.1, we have time constants $\tau (A1-A0(Vpre(t))$ and a function $A0(Vpre(t))$ to specify. The function $A0(V)$ describes the transition of the gating variable from $A(t)\u22480$, namely a closed synaptic channel, to $A(t)\u22481$, an open synaptic channel.

When the width of this transition $dV0$ is small, $A0(V)$ is essentially a step function in voltage.

### B.2 Excitatory and Inhibitory Synaptic Formulation

In equation B.1, we have two time constants ${\tau AA1,\tau A(A1-1)}$ corresponding to the time it takes for a neurotransmitter to undock from a postsynaptic receptor and dock at the receptor, respectively.

For an *excitatory* AMPA receptor, mediated by the neurotransmitter glutamate, the approximate value of $\tau AA1\u22483$ ms, and of $\tau A(A1-1)\u22481$ ms. So for this excitatory synaptic connection $\tau A=2$ ms and $A1=1.5$. For an inhibitory synaptic connection, mediated by GABA, $\tau AA1\u22488$ ms, and of $\tau A(A1-1)\u22482$ ms, leading to $\tau A=6$ ms and $A1=4/3$.

## Appendix C: A Comment on Injected Currents

### C.1 What Waveforms Should Be Chosen for $Istim(t)$?

This question arose in the valuable comments of the reviewers and is certainly an important issue in training a DDF model neuron so that it will respond accurately to any stimulating current it may encounter in a network or from environmental stimulation.

The fact that the stimulating current enters the biophysical (HH) equations in an additive manner, reflecting the way current conservation works in all electric circuits, suggests that a judicious choice of $Istim(t)$ is possible.

We do not have a mathematical statement about what stimulating currents should be used to train a DDF or, for that matter, an HH neuron model via data assimilation. However, we do have a biophysical argument.

If one wanted to learn how a DDF would respond when parameter value is changed, then presenting data to the DDF training protocol from a range of parameter values would permit the DDF to interpolate through that range. DDFs and radial basis functions (RBFs) are interpolating mathematical devices by construction. With enough data sampling enough of the dynamical system attractor, the RBF can accurately interpolate among the given data points.

As we have no differential equation for the stimulating current $Istim(tk)$), $tk=t0+hk$, $k=0,1,2,\u2026,N$ are just like parameters in the voltage or any other DDF. If we choose a training I(t), which evokes a V(t) over the dynamical range expected of a neuron, say, $-$100 mV to 100 mV, then we will be able to interpolate to new V(t) values within the training range. That is likely not to be quite enough as $Istim(t)$ is also a time series, albeit prescribed by the user, so we have found that the strength of $Istim(t)$ in its Fourier power spectrum must lie in a range that can be determined by the data presented. Biophysically, if the $Istim(t)$ frequencies are too high, they would be filtered out by the cell membranes RC time constant, as that acts as a low-pass filter.

Just this line of thinking led to the stimulating currents one sees in each of the experimental time courses presented in this article. Before we used this biophysical heuristic, data assimilation—training an HH model with data—was unsuccessful. With this heuristic guiding the choices of $Istim(t)$, successful DA is now routine.

## Acknowledgments

We acknowledge support from the Neurological Disorders and Stroke Institute, National Institutes of Health, “From Ion Channels to Graph Theory in Sensorimotor Learning,” grant U01 NS115821-01, and the Office of Naval Research, grants N00014-20-1-2580 and N00014-19-1-2522. Discussions with Erik Bollt and Daniel Gauthier were instrumental in initiating this work. C. Dan Meliza (University of Virginia) and Daniel Margoliash (University of Chicago) kindly provided us with the current clamp data from experiments on neurons within the nucleus HVC of zebra finch songbirds. Discussions with Stefan Luther and Ulrich Parlitz gave us insights into how DDF may have broad applications in biophysics. Two anonymous referees assisted in improving earlier drafts of this article.

## References

*The analysis of observed chaotic data*

*The statistical physics of data assimilation and machine learning*

*Neural Computation*

*SIAM J. Control Optim.*

*Systems Control Lett.*

*Complex Systems*

*Radial basis functions: Theory and implementations*

*Physica D*

*Nature Communications*

*Neural networks in a softcomputing framework*

*Cell Calcium*

*Physical Review A*

*Journal of Geophysical Research*

*Physical Review E*

*Journal of Physiology*

*Bull. Math. Biol.*

*Understanding calcium dynamics: Experiments and theory*

*Foundations of cellular neurophysiology*

*Physica D*

*Nonlinear time series analysis*

*Biological Cybernetics*

*Cold Spring Harbor Protocols*

*Science*

*Biological Cybernetics*

*Constructive Approximation*

*Scientific Reports*

*Nonlinear ordinary differential equations*

*Fifth Hellenic-European Conference on Computer Mathematics and Its Applications*

*Numerical recipes: The art of scientific computing*

*Neurocomputing*

*Journal of Statistical Physics*

*Advances in Computational Mathematics*

*Methods: A Companion to Methods in Enzymology*

*Principles of computational modelling in neuroscience*

*Journal of Global Optimization*

*Nonlinear dynamics and chaos: With applications to physics, biology, chemistry, and engineering*

*Lecture Notes in Math.*

*Biological Cybernetics*

*Biophysical Journal*

*ISRN Applied Mathematics*

*Physical Review E*