Abstract
Decision formation in perceptual decision making involves sensory evidence accumulation instantiated by the temporal integration of an internal decision variable toward some decision criterion or threshold, as described by sequential sampling theoretical models. The decision variable can be represented in the form of experimentally observable neural activities. Hence, elucidating the appropriate theoretical model becomes crucial to understanding the mechanisms underlying perceptual decision formation. Existing computational methods are limited to either fitting of choice behavioral data or linear model estimation from neural activity data. In this work, we made use of sparse identification of nonlinear dynamics (SINDy), a data-driven approach, to elucidate the deterministic linear and nonlinear components of often-used stochastic decision models within reaction time task paradigms. Based on the simulated decision variable activities of the models and assuming the noise coefficient term is known beforehand, SINDy, enhanced with approaches using multiple trials, could readily estimate the deterministic terms in the dynamical equations, choice accuracy, and decision time of the models across a range of signal-to-noise ratio values. In particular, SINDy performed the best using the more memory-intensive multi-trial approach while trial-averaging of parameters performed more moderately. The single-trial approach, although expectedly not performing as well, may be useful for real-time modeling. Taken together, our work offers alternative approaches for SINDy to uncover the dynamics in perceptual decision making and, more generally, for first-passage time problems.
1 Introduction
Decision making is the keystone to unlocking higher cognitive functions (Dreher & Tremblay, 2016). More specifically, perceptual decision making involves transforming sensory information into motor actions, in which the latter overtly provides choice information such as choice accuracy and reaction time (Luce, 1986; Roitman & Shadlen, 2002). Its neural correlates have also gradually been revealed (Gold & Shadlen, 2007; Dreher & Tremblay, 2016; Hanks & Summerfield, 2017; Najafi & Churchland, 2018; O'Connell et al., 2018; O'Connell & Kelly, 2021).
Previous studies have indicated that the perceptual decision-making process can be described by the sequential sampling modeling framework, in which noisy sensory evidence is accumulated over time until it reaches some internal decision threshold, at which point a choice is committed (Gold & Shadlen, 2007; O'Connell et al., 2018). Popular cognitive computational models of perceptual decision making include the drift-diffusion model (DDM) and its variants (Ratcliff, 1978; Ditterich, 2006; Ratcliff et al., 2016; Asadpour et al., 2024). Connectionist or neural network representations of the sequential sampling models include the leaky competing accumulator (LCA) model (Usher & McClelland, 2001), which was shown to approximate the classic DDM process (LCA-DDM) under finely tuned conditions (Bogacz et al., 2006). In reaction-time tasks, the decision formation process can be mathematically described as a first-passage time problem (Gillespie, 1992; Bogacz et al., 2006; Kampen, 2007; Shinn et al., 2020). Although simple to implement and understand, and mathematically tractable, these models' variables and parameters may not directly relate to neurophysiology and may account for only limited neural data (Ditterich, 2006; Gold & Shadlen, 2007; Kelly & O'Connell, 2013).
More biologically based models of perceptual decision making have been developed to bridge from neural to behavioral data while providing additional model constraints (Wang, 2002; Wong & Wang, 2006; Roxin & Ledberg, 2008; O'Connell et al., 2018). Models informed by neural activities can better delineate underlying decision processes and reduce model mimicry based only on choice behavior (Ratcliff et al., 2016; O'Connell et al., 2018; Bose et al., 2020). These models are often accompanied by intrinsic nonlinear behaviors that may allow better replication of neural activity profile while accounting for choice behavior (Niyogi & Wong-Lin, 2013). Perhaps the simplest nonlinear dynamical model for two-choice tasks can be mathematically described by a normal form with pitchfork bifurcation, a nonlinear bistable (NLB) model (Zhou et al., 2009) that resembles biologically grounded mean-field models (Wong & Wang, 2006; Roxin & Ledberg, 2008).
Several model-fitting algorithms have been used to identify decision model parameters to account for behavioral data. For decision network modeling, model parameters were fitted to either neural (Friston et al., 2003) or behavioral data (Bogacz & Cohen, 2004; Pinto et al., 2019; Shinn et al., 2020) or did not employ automated learning algorithms (Wong & Wang, 2006; Niyogi & Wong-Lin, 2013). More complex and high-dimensional recurrent neural network models have also been trained on choice behavioral data before comparing their neural activities to those of recorded neurons (Mante et al., 2013). However, there are limited studies that link these high-dimensional models to lower-dimensional models (Roach et al., 2023). Importantly, there is limited data-driven modeling work that automatically and directly uncovers low-dimensional nonlinear dynamical equations underlying decision formation from internally noisy decision variable data, which in turn predicts choice behavior.
In this letter, we adopt a data-driven approach for elucidating the governing dynamical equations of stochastic decision models. Some existing data-driven methods are sensitive to noise, while others are computationally expensive (Raissi et al., 2019) or limited to specific fields (Schmid, 2010). Other methods are black boxes that do not provide explicit equations or explanations (Lipton, 2018). Importantly, none have been tested on stochastic decision dynamical models in reaction-time tasks, which can be mathematically described as first-passage time problems.
Here, we use the sparse identification of nonlinear dynamics (SINDy) (Brunton et al., 2016; de Silva et al., 2020) to identify parsimonious linear or nonlinear equations of low-dimensional single-trial decision dynamics, given existing evidence suggesting that decision neural dynamics are embedded in low-dimensional space (Ganguli et al., 2008; Shenoy & Kao, 2021; Steinemann et al., 2023). We evaluate the SINDy algorithm on simulated noisy data generated from often-used decision models: DDM, linear LCA, LCA-DDM, and NLB. We compare decision variable activity, choice behavior, and parameters of the original models at single trials and across trials with that estimated by SINDy. Our focus is on identifying the deterministic components of these models. In addition to the standard single-trial SINDy approach, we make use of approaches using multiple trials to enhance SINDy's performance. Our general findings show that SINDy can readily elucidate the dynamical equations and account for decision models’ variable profile and choice behavior within reaction-time tasks.
2 Methods
2.1 Model Description
Four sequential sampling two-choice decision-making models were simulated, and their decision variable activities and choice behaviors were compared with the respective estimated models elucidated by SINDy. Specifically, we used the standard DDM (Ratcliff, 1978); the linear version of the LCA model (Usher & McClelland, 2001); the LCA model configured to approximate the DDM process (LCA-DDM; Usher & McClelland, 2001; Bogacz et al., 2006); and the NLB model (Zhou et al., 2009).
x(t + Δt) = x(t) + AΔt + cη√Δt, (2.1)

where x denotes some internal decision variable, A is the drift rate determined by the input signal or stimulus, η is a random variable that follows a gaussian distribution with a mean of zero and a standard deviation of one, c is the noise size, and t is time with time step Δt. Values of A were varied from 0 upward (to mimic varying choice task difficulty) while c was held fixed.
During decision formation, the decision variable started with an initial value of 0 and was integrated over time via equation 2.1 until it reached either a prescribed upper or lower decision threshold, indicating one of the two choices being made. The upper threshold for a correct choice (for positive drift rates) was set at 1, while the lower threshold for an error choice was −1. Once a threshold was reached, the integration process stopped (i.e., an absorbing threshold), and this time from stimulus onset is defined as the decision time.
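As a concrete illustration, the integration scheme of equation 2.1 with absorbing thresholds can be sketched as follows. This is a minimal NumPy sketch; the drift rate, noise size, time step, and trial count used here are illustrative placeholders, not the exact settings used in this study.

```python
import numpy as np

def simulate_ddm_trial(A, c=0.1, dt=0.001, z=1.0, t_max=10.0, rng=None):
    """Euler-Maruyama simulation of one DDM trial (hypothetical parameter
    values). Returns (correct, decision_time): correct is True if the
    upper threshold +z was reached first, False for the lower threshold
    -z, and None if no threshold was crossed within t_max."""
    rng = np.random.default_rng() if rng is None else rng
    x, t = 0.0, 0.0                       # decision variable starts at 0
    while t < t_max:
        # x(t + dt) = x(t) + A*dt + c*eta*sqrt(dt), eta ~ N(0, 1)
        x += A * dt + c * rng.standard_normal() * np.sqrt(dt)
        t += dt
        if x >= z:                        # absorbing upper threshold
            return True, t
        if x <= -z:                       # absorbing lower threshold
            return False, t
    return None, t

# Example: a strongly positive drift rate mostly yields correct choices.
rng = np.random.default_rng(0)
outcomes = [simulate_ddm_trial(A=1.0, rng=rng) for _ in range(200)]
accuracy = np.mean([o[0] is True for o in outcomes])
```

With a high drift-to-noise ratio, as in this example, nearly all trials terminate at the upper (correct) threshold around t ≈ 1.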
With appropriately configured parameter values, the linear LCA can lead to runaway ramping (acceleration) over time for one of the decision variables due to the existence of a metastable saddle steady state (Usher & McClelland, 2001), or it may converge toward some stable steady state (a fixed-point attractor; Usher & McClelland, 2001). We focus on the metastable version of the linear LCA as the stable steady-state version will be investigated using the NLB model (see below).
Importantly, the DDM process can be approximated from the linear LCA model when the decay rate (k) and mutual inhibition (w) are equal and high, that is, finely tuned parameters (Bogacz et al., 2006). For the linear LCA model, the mutual inhibition w and decay rate k were set at 4 and 3, respectively, while the input to one unit was varied over a range of values and the input to the other unit was held fixed. For the LCA-DDM model, k and w were both set at 10, while the input to one unit was varied upward from 3 and the input to the other was fixed at 3. Both models had the same fixed noise size.
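A corresponding sketch of the two-unit linear LCA under Euler-Maruyama, with decay rate k = 3 and mutual inhibition w = 4 as in the metastable configuration above. The inputs, noise size, threshold, time step, and trial count here are hypothetical illustrative values, not the source's exact settings.

```python
import numpy as np

def simulate_lca_trial(I1, I2, k=3.0, w=4.0, c=0.1, dt=0.001, z=1.0,
                       t_max=10.0, rng=None):
    """One trial of the two-unit linear LCA: each unit integrates its
    input minus self-decay (k) and mutual inhibition (w) from the other
    unit, plus noise. Returns (choice, time): 1 or 2 for the unit that
    reached threshold z first, 0 on timeout."""
    rng = np.random.default_rng() if rng is None else rng
    y1 = y2 = 0.0
    t = 0.0
    sqdt = np.sqrt(dt)
    while t < t_max:
        dy1 = (I1 - k * y1 - w * y2) * dt + c * sqdt * rng.standard_normal()
        dy2 = (I2 - k * y2 - w * y1) * dt + c * sqdt * rng.standard_normal()
        y1 = max(0.0, y1 + dy1)          # activities are non-negative
        y2 = max(0.0, y2 + dy2)
        t += dt
        if y1 >= z or y2 >= z:           # first unit to threshold wins
            return (1 if y1 >= y2 else 2), t
    return 0, t

# With w > k, the difference between units is unstable (metastable saddle),
# so the more strongly driven unit usually races away and wins.
rng = np.random.default_rng(1)
choices = [simulate_lca_trial(I1=4.0, I2=3.0, rng=rng)[0] for _ in range(100)]
frac_correct = np.mean([ch == 1 for ch in choices])
```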
2.2 Model Simulations
A wide range of signal-to-noise ratio (SNR) values was used for each model. The number of trials for each model was chosen such that the confidence intervals of the predicted choice behavior met the 90% confidence interval criterion. Each set of model parameters was simulated over 10,000 trials. To compare across models, we normalized the decision times of both the original and SINDy-derived models with min-max normalization.
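The min-max normalization of decision times can be written compactly (a small sketch; the decision times below are made-up values):

```python
import numpy as np

def minmax_normalize(times):
    """Rescale a set of decision times linearly onto [0, 1]:
    (t - min) / (max - min)."""
    t = np.asarray(times, dtype=float)
    return (t - t.min()) / (t.max() - t.min())

# Example with arbitrary decision times.
norm_times = minmax_normalize([0.4, 1.0, 2.8])
```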
To numerically integrate the stochastic differential equations in all the models, we employed the Euler-Maruyama method (Higham, 2001). One time-step size, in arbitrary units (a.u.), was used for the DDM, and another for the LCA, LCA-DDM, and NLB models. Smaller time steps did not affect the results.
2.3 Sparse Identification of Nonlinear Dynamics
Here, we used the Python version of SINDy (PySINDy; de Silva et al., 2020). We used only polynomials, as all the considered decision models can be described by a combination of polynomials. We used the Savitzky-Golay filter (Press & Teukolsky, 1990) in PySINDy to smooth out noise. As decision models entail more noise than in previous SINDy studies, we also adopted approaches that made use of multiple trials. Specifically, to handle noise effects, we used the multi-trial approach available in SINDy (Brunton et al., 2016), which estimates a model based on trajectories from multiple trials. We also employed a less memory-intensive method that averages, across trials, the model parameters deduced from single trials. It should be noted that SINDy elucidates only the deterministic part of a dynamical system.
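The core computation behind SINDy, sparse regression of estimated derivatives onto a library of polynomial candidate functions, can be sketched in plain NumPy with sequentially thresholded least squares as the optimizer. This is a simplified, noise-free stand-in for the PySINDy calls used in this study, and the pitchfork normal form dx/dt = x − x³ is an illustrative NLB-like example, not the fitted model from the text.

```python
import numpy as np

def polynomial_library(x, order):
    """Candidate functions [1, x, x^2, ...] for a scalar decision variable."""
    return np.column_stack([x ** p for p in range(order + 1)])

def stlsq(theta, dxdt, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: the sparse regression at
    the heart of SINDy (Brunton et al., 2016). Coefficients smaller than
    the threshold are pruned, and the rest are refit."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if big.any():
            xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi

# Noise-free sanity check on pitchfork-normal-form dynamics dx/dt = x - x^3.
x = np.arange(-0.9, 0.9, 0.01)
dxdt = x - x ** 3
theta = polynomial_library(x, order=3)
xi = stlsq(theta, dxdt)
# xi recovers the sparse coefficients of [1, x, x^2, x^3]: [0, 1, 0, -1]
```

PySINDy wraps this same pipeline (library construction, derivative estimation with smoothing, sparse regression) behind a scikit-learn-style fit interface.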
Schematic of workflow for SINDy processing and comparison with original models. DV: decision variable.
When using SINDy, it is important to set the polynomial order, as this dictates the complexity of the estimated model. To identify the best polynomial order, we ran simulations with various polynomial orders on the trial-averaged parameters and multi-trial approaches to determine the order that best fits the choice behavior. For multi-trial DDM, LCA-DDM, and LCA, SINDy parameter estimation was performed using all 10,000 time courses for all 41 SNRs; however, for NLB, due to the memory-intensive operations resulting from the time step and SINDy fitting, we could use only 4,000 of the time courses for parameter estimation (see online supplementary note 5).
It is worth noting that fitting each model by trial-averaging the time course of truncated decision variables will produce an artifact that may look like a stable steady state (fixed point) lying beneath the decision thresholds, even if the considered models are perfect neural integrators such as the DDM, or actually have a stable steady state beyond the decision thresholds (as in NLB). This artifact confuses SINDy and degrades its performance. To avoid it, one may consider applying SINDy only to an earlier time epoch of the trial-averaged decision variable; however, this again leads to low SINDy performance because fewer data are available to SINDy. Hence, we did not pursue these directions.
2.4 Statistical Analysis
To assess the choice behavior of simulated decision models versus SINDy's derived decision models, we employed the Kolmogorov-Smirnov (KS) test (Smirnov, 1948). We used Cliff's Delta (Cliff, 1993) to quantify the effect size of the differences between choice behavior distributions.
For both single-trial condition and conditions over multiple trials, we employed a nonparametric bootstrap resampling technique approach (Davison & Hinkley, 1997) to calculate confidence intervals, using a predefined number of bootstrap samples (1,000) and aiming for a confidence level of 90%. We derived the lower and upper bounds of the confidence intervals based on the 10th and 90th percentiles of the bootstrap mean distributions (Efron & Tibshirani, 1994).
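Both nonparametric measures admit short NumPy implementations. This is a minimal sketch: the pairwise computation of Cliff's delta assumes modest sample sizes (it forms an n-by-m difference matrix), and the KS test itself is available as scipy.stats.ks_2samp.

```python
import numpy as np

def cliffs_delta(a, b):
    """Cliff's delta effect size: P(a > b) - P(a < b) over all pairs,
    ranging from -1 to 1 (0 means full overlap)."""
    a, b = np.asarray(a), np.asarray(b)
    diff = a[:, None] - b[None, :]
    return (np.sum(diff > 0) - np.sum(diff < 0)) / (a.size * b.size)

def bootstrap_ci(data, n_boot=1000, lo=10, hi=90, rng=None):
    """Percentile bootstrap confidence interval for the mean, using the
    10th/90th percentile bounds of the bootstrap mean distribution as in
    the text."""
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data)
    means = [rng.choice(data, size=data.size, replace=True).mean()
             for _ in range(n_boot)]
    return np.percentile(means, [lo, hi])

# Identical samples give zero effect size; a fully shifted sample gives 1.
x = np.arange(100.0)
delta_same = cliffs_delta(x, x)
delta_shifted = cliffs_delta(x + 200, x)
```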
To compare the averaged decision variable activities between the original models and the corresponding SINDy-derived models, we categorized the SNR into four values: zero, low, medium, and high. First, we identified the high SNR based on two conditions: the choice accuracy of both the original and SINDy-derived models attained a criterion level, and their difference fell within a small tolerance. We then divided the range between zero SNR and this high SNR into three equal intervals to define the low and medium SNR values.
To quantify the differences between the model parameters obtained by SINDy and those of the corresponding original models, we computed the root mean squared error (RMSE) of the parameters for each condition per trial and then aggregated the values (see supplementary Figure S3).
2.5 Software and Hardware
The simulations and analyses used Jupyter Notebook running on Python 3. We used a Windows machine with four memory cores, Intel i7-7300HQ, and 32GB RAM, and the Northern Ireland High Performance Computing (NI-HPC) facility (www.ni-hpc.ac.uk). Source codes are available at https://github.com/Lenfesty-b/SINDy.
3 Results
We first investigate how SINDy can estimate single-trial decision variable dynamics simulated from the stochastic decision models considered in the reaction-time task. We then determine how well the single-trial estimated dynamics can predict the aggregated choices for specific SNR. This is followed by SINDy approaches using multiple trials.
3.1 Mixed Results of Choice Prediction from SINDy's Single-Trial Parameter Estimation
Single-trial parameter estimation by SINDy produced mixed results. (A–D) Models from top to bottom: DDM, LCA-DDM, LCA, and NLB. Same within-trial random seed used for original and estimated data. Reaction-time task (i.e., first-passage time process). Blue (orange): original (estimated) model's data. Left: SINDy-generated decision variable dynamics compared to those of the original models. Time from stimulus onset. Dashed: decision threshold. (B, C, left) Faded: activity of losing competing units. Middle: choice accuracy. Right: normalized decision time. Middle, right: color label as in left column; (B, middle, left): gold for the optimal zero-order SINDy-derived model. Confidence interval (90%) shown. Asterisks at bottom: statistically significant differences between original and derived models.
The samples presented here had the recreated dynamics reaching the same decision threshold as that of the original models; that is, they correctly predicted the trials. There were also trials where the recreated dynamics reached a different decision threshold, incorrectly predicting those trials (not shown). SINDy's recreated dynamics replicated the original time course well, especially at the beginning, before diverging and hence reaching the decision threshold at a different time point (i.e., a different decision/first-passage time) than the original model, despite having the same seeded noise. This suggested that SINDy estimated the model parameters slightly differently from those of the original models. This was indeed the case for the DDM in Figure 2A (see supplementary note 2 for details). For the LCA-DDM, assuming the underlying model to be of polynomial order one, we found that the SINDy-derived model was not as asymmetrical as the original LCA-DDM (see supplementary note 2). SINDy's derived model of the linear LCA provided a closer approximation, and hence better prediction of the dynamics, than the LCA-DDM-derived model (see supplementary note 2). Finally, the SINDy-derived NLB model replicated the activity well despite not capturing the original model parameters as closely as for the LCA (see supplementary note 2).
We next aggregated the trials for specific SNRs. From this, we compared the choice accuracy and decision time between the original models and the corresponding SINDy-derived models. We found that as SNR increased from zero, the choice accuracy for DDM, LCA-DDM, and NLB deviated from that of SINDy's models (see Figure 2B, middle; compare blue with orange). Higher SNR generally led to faster decisions and, hence, fewer generated data revealed to SINDy. This made it more difficult to elucidate the model parameters well. However, there are some nuances. First, for DDM, sufficiently high SNR led to better SINDy prediction (see Figure 2A, middle, right). Second, SINDy predicted LCA better with higher SNR and attained the best prediction among the models (see Figure 2C, middle, right). This was due to the more unstable (runaway) dynamics of LCA, such that noise contributed relatively less to the overall dynamics. Third, SINDy generally did not predict LCA-DDM choice behavior well when a first-order polynomial was used (see Figure 2B, orange, middle, right). However, when a zero-order polynomial was used, the prediction improved substantially (see Figure 2B, gold). This is not surprising, given that the LCA-DDM behaves similarly to the DDM through model parameter fine-tuning (Bogacz et al., 2006). Interestingly, SINDy predicted NLB reasonably well despite its multistable states and the presence of noise.
Overall, SINDy predicted noisy single-trial decision variable dynamics and choice behavior with mixed results, performing well only for certain models and SNR ranges. We next investigate whether trial-averaged estimated model parameters and multi-trial SINDy can improve the prediction of the averaged decision variable dynamics and choice behavior.
3.2 Trial-Averaged and Multi-Trial Parameter Estimation Enhanced Choice Prediction for SINDy
Improved SINDy-derived model predictions of choice behavior with trial-averaging and multi-trial approaches. (B) Only for the model with zero-order polynomial. Labels as in Figure 2. Choice behavior shown only for the optimal polynomial order.
Trial-averaging-of-parameters approach's decision variable activities for different SNRs explain choice behavioral fits. Zero, low, medium, and high SNRs; high SNR based on the choice accuracy of both original and SINDy-derived models attaining a criterion level and their differences falling within a small tolerance. Low and medium SNRs determined from equally divided intervals between these two SNR values. Higher SNRs improved SINDy's model prediction. Labels as in Figure 2.
For the LCA model, this approach proved most effective; SINDy readily captured the decision variable dynamics (see Figure 4C) in addition to the choice behavior of the model (see Figure 3C). For the NLB model, SINDy's predictive performance also improved, especially with higher SNR (see Figure 4D).
Multi-trial approach's decision variable activities for different SNRs explain choice behavioral fits. Decision variable dynamics averaged over the same SNRs for comparison with those for the trial-averaged approach (compared with Figure 4). Blue (red): original (estimated) averaged decision variable activities.
Thus, with the trial-averaging approach, SINDy can generally handle the noisy dynamics better than the single-trial SINDy approach and therefore predict choice behavior with greater confidence compared to that of the single-trial approach. The multi-trial SINDy further enhances SINDy's performance, with predicted choice behaviors and decision variable dynamics even closer to those of the original models. These results were also validated based on the differences between the model parameters (see supplementary notes 1 to 3 for the highest SNR) and quantified by their RMSE values (see supplementary Figure S2).
4 Discussion
Identifying the right decision-making model is important for understanding underlying decision mechanisms. We have shown that the SINDy algorithm, especially when using multiple trials, performs reasonably well at recovering the underlying governing equations for various often-used sequential-sampling-based decision-making models. Specifically, this was achieved by uncovering the dynamical equations from the DDM, the linear and metastable LCA, its approximation to the DDM (LCA-DDM), and the NLB model (Ratcliff, 1978; Usher & McClelland, 2001; Bogacz et al., 2006; Zhou et al., 2009). We showed the potential of SINDy's utility in replicating the deterministic portion of the decision variable profiles, which aids in predicting choice behaviors. This not only advances our computational tool kit but also enriches our understanding of underlying decision-making processes, extending the foundational work by Brunton et al. (2016). Further, as the reaction-time task in the decision-making models is akin to first-passage-time problems, wherein the systems' dynamics may be only partially observable, our results show that the application of SINDy may be extended to similar problems in other fields, such as engineering, physics, mathematics, and biology (Gillespie, 1992; Kampen, 2011).
The adaptability of SINDy to predict choice accuracy and decision times across different SNR underscores its versatility. Moreover, SINDy can readily handle complex, nonlinear, and multistable models (NLB) and finely tuned LCA-DDM. Additional spurious terms of higher orders for SINDy-derived NLB model may be minimized by considering smaller decision variable values, for instance, closer to the beginning of trials.
We investigated three different SINDy approaches in this study. The multi-trial SINDy approach demonstrated the best performance in elucidating the choice behaviors and decision variable dynamics of all the original models. It also ran fastest but was more memory intensive (see supplementary note 4). This is because the multi-trial SINDy approach used all the data and created a large library of candidate functions due to the large data size and variability. This became memory intensive when the large number of nonzero candidate functions was fed into the least-squares regression algorithm, which involves inverting or factoring large matrices, further increasing the memory requirements.
More modest performance was obtained by trial-averaging model parameters across trials, which performed especially well for the LCA model. This approach took much longer than the multi-trial approach but was less memory intensive. Although the single-trial SINDy approach's performance was not as high as that of the above two methods, its utility might lie in modeling neural data in real time (Raza et al., 2020). Compared with the other two approaches, which fit models using all trials at once, its memory requirements were lower, but its run time was longer than that of the multi-trial approach (see supplementary note 4). Hence, our work offers alternative SINDy approaches depending on the needs.
Although the methods that used multiple trials were able to handle noise, the assumption of known noise characteristics may present some limitation, particularly when translating these models to analyze real neural data where noise information may not be readily available. This challenge is accentuated in LCA-DDM, where SINDy's performance varied, highlighting the algorithm's sensitivity to finely tuned parameters.
With some assumptions on the characteristics of the noise (e.g., additive and white noise like), perhaps more exact determination can be obtained through optimal fitting of the choice behavior of the original respective model, after the deterministic portion of the model is identified using the methods shown in this study. Future work should also evaluate SINDy on other nonlinear decision-making models (Marshall et al., 2022; Pirrone et al., 2022; Asadpour et al., 2024). Other similar data-driven methods could be explored and compared with SINDy (Pandarinath et al., 2018) within decision making or first-passage-time processes.
Although applying SINDy to empirical neural data is beyond the scope of our investigation, neural correlates of evidence accumulation for decision making in the form of time-series data are well known and have been identified in invasive and noninvasive recordings across species (Hanks & Summerfield, 2017; O'Connell et al., 2018). For large-scale or brain-wide neural data, dimensional reduction may be required before further analysis or modeling (Cunningham & Yu, 2014). This is especially the case for decision making, which often resides in lower-dimensional neural space (Ganguli et al., 2008; Shenoy & Kao, 2021; Steinemann et al., 2023). In fact, one of the advantages of SINDy is its ability to handle high-dimensional data (Brunton et al., 2016). Thus, this would be an interesting future direction to apply our developed methods.
In conclusion, while we have successfully demonstrated the SINDy algorithm's performance on cognitive computational models, its areas for improvement have also been revealed in the process. For instance, improving the algorithm's handling of noise and its applicability to more complex models will be crucial for its successful integration into decision neuroscience research. Such advancements will provide deeper insights into the computational basis of decision making and improve our ability to interpret and predict neural activity underlying cognitive processes.
Author Contributions
B.L. and K.W.-L. designed and conceptualized the analyses. B.L. performed the simulations and analyses. S.B. and K.W.-L. validated the codes, data, and analyses. B.L., S.B., and K.W.-L. wrote the letter. K.W.-L. supervised the research.
Acknowledgments
We thank Abdoreza Asadpour and Cian O'Donnell for useful discussions. B.L. was supported by Ulster University via the Northern Ireland Department for the Economy. K.W.-L. was supported by Health and Social Care Research and Development (HSC R&D; STL/5540/19) and UK Research and Innovation (UKRI) Medical Research Council (MRC) (MC_OC_20020). We are grateful for access to the Tier 2 High Performance Computing resources provided by the Northern Ireland High Performance Computing facility funded by the UK Engineering and Physical Sciences Research Council, grant EP/T022175/1.