The brain continuously estimates the state of body and environment, with specific regions that are thought to act as a Bayesian estimator, optimally integrating noisy and delayed sensory feedback with sensory predictions generated by the cerebellum. In control theory, Bayesian estimators are usually implemented using high-level representations. In this work, we designed a new spike-based computational model of a Bayesian estimator. The state estimator receives spiking activity from two neural populations encoding the sensory feedback and the cerebellar prediction, and it continuously computes the spike variability within each population as a reliability index of the signal these populations encode. The state estimator output encodes the current state estimate. We simulated a reaching task at different stages of cerebellar learning. The activity of the sensory feedback neurons encoded a noisy version of the trajectory after actual movement, with an almost constant intrapopulation spiking variability. Conversely, the activity of the cerebellar output neurons depended on the phase of the learning process. Before learning, they fired at their baseline rate, not encoding any relevant information, and the variability was set to be higher than that of the sensory feedback (more reliable, albeit delayed). When learning was complete, their activity encoded the trajectory before the actual execution, providing an accurate sensory prediction; in this case, the variability was set to be lower than that of the sensory feedback. The state estimator model optimally integrated the neural activities of the afferent populations, so that the output state estimate was primarily driven by sensory feedback in prelearning and by the cerebellar prediction in postlearning. It was able to deal even with more complex scenarios, for example, by shifting the dominant source during the movement execution if information availability suddenly changed.
The proposed tool will be a critical block within integrated spiking, brain-inspired control systems for simulations of sensorimotor tasks.
Humans can perform complex movements that require the coordination of many muscles and joints, automatically and unthinkingly (Thach, 1998). Even in noisy conditions (e.g., in foggy or dark environments) our brain can integrate multiple sources of sensory information with previous knowledge in order to estimate the current state of the body and the environment and to use this estimate to generate appropriate motor commands (Alessandro et al., 2016; Kawato, 1999; Shadmehr & Krakauer, 2008a; Wolpert, Goodbody, & Husain, 1998). This process of state estimation is essential to deal with the inherent delays in our sensory systems. As an example, during a tennis match, the information about the position of the ball extracted from the visual input becomes available to the central nervous system (CNS) with a delay of about 100 ms (Wolpert & Ghahramani, 2000). If not appropriately compensated, such noisy and delayed information would lead to inappropriate motor commands, eventually resulting in unsatisfactory movement execution. Instead, the CNS can filter the noisy sensory feedback and combine this information with fast sensory predictions that compensate for delays due to movement execution and feedback reafferences (Wolpert et al., 1998). In the previous example, the tennis player can successfully estimate the spin of the ball using reliable predictions generated by internal models of the ball and of the body dynamics, ultimately generating appropriate motor commands, not only compensating for delays but also coping with uncertain sensory feedback (e.g., on a foggy day) (Wolpert & Ghahramani, 2000).
Several brain areas are involved in this process. There is substantial evidence that the CNS computes sensory predictions by means of the cerebellum (Kawato, 1999; Kawato & Gomi, 1992; Shadmehr & Krakauer, 2008a; Wolpert et al., 1998). The cerebellum, acting as a forward model, learns to predict sensory consequences of actions using the planned motor commands received from the motor cortex through the efference copy (Miall & Wolpert, 1996; Popa & Ebner, 2019; Stein, 2009). Therefore, it facilitates fast and smooth coordination and aids accurate and well-timed sensorimotor execution and adaptation (Miall & Wolpert, 1996; Thach, 1998). Cerebellar predictions are integrated with the actual sensory information in a process that computes reliable estimates of the state of the body interacting with the environment. This process is thought to be carried out by the parietal cortex (Shadmehr & Krakauer, 2008a), which receives, through the thalamus, projections from both the deep cerebellar nuclei (output of the cerebellum; Palesi et al., 2014) and peripheral sensory structures (Dum, Levinthal, & Strick, 2009). Accordingly, it has been suggested that damage to the parietal cortex causes performance errors compatible with the inability to compute state estimates (Wolpert et al., 1998; Wolpert & Ghahramani, 2000).
Over the years, several theories have been proposed to explain how the CNS combines sensory information originating from the periphery and the cerebellum. It has been suggested that the CNS integrates these information sources through a process of Kalman filtering or Bayesian integration (Körding & Wolpert, 2004; Shadmehr & Krakauer, 2008a). These ideas have been investigated using computational models (de Xivry, Coppe, Blohm, & Lefèvre, 2013; Körding & Wolpert, 2004). However, given their abstract and high-level computational nature, these models provide little insight into the biological features of the underlying neural mechanisms. A more biologically plausible model of the neuromotor control system has been developed by Deneve et al. (2007), who implemented an optimal sensorimotor integration model for state estimation using recurrent neural networks of cortical circuits. However, how these processes may be implemented in spike-based systems remains an unresolved issue.
Here, we designed a spike-based state estimator model that operates with the naturalistic time coding representations typical of neuronal activity. The model receives spiking signals from two afferent neuronal populations, emulating the cerebellar output and the sensory feedback, and computes the spike synchronicity within each of these populations. These measures are used to define relative reliability of the incoming signals, allowing the computation of an optimal Bayesian estimation of the body state (Barak, 2017). The state estimator model was tested at different stages of the cerebellar learning process, emulated by modulating the spiking variability of the cerebellar output. Indeed, recent studies suggest that a strong increase of within-population synchronicity (i.e., a decrease of within-population variability) is a neural correlate of learning (Chervyakov et al., 2016; Sedaghat-Nejad, Pi, Hage, Fakharian, & Shadmehr, 2022). Accordingly, the synchronicity among cerebellar neurons is maximal at the end of the adaptation (Wagner et al., 2019).
Furthermore, we tested our model in more complex scenarios: (1) with a sudden interruption of the sensory feedback during movement execution in a middle-learning condition and (2) with an unexpected perturbation in postlearning. In all of these contexts, our model generated appropriate state estimates. In the future, we will integrate the state estimator proposed here with whole-brain models, including a realistic spiking network of the plastic cerebellar circuit (de Schepper et al., 2021; Geminiani et al., 2018; Geminiani, Pedrocchi, D'Angelo, & Casellato, 2019), paving the way for bio-inspired multiarea brain simulations in closed-loop controllers.
2 Materials and Methods
2.1 System Design
2.2 Neural Populations and Signal Encoding
The three neural populations (cerebellar output neurons, sensory feedback neurons, and state estimator neurons) were implemented as spiking neural networks (SNN) (Ghosh-Dastidar & Adeli, 2009) in NEST (Eppler et al., 2009; Jordan et al., 2019). The sensory feedback and cerebellar output neurons were modeled as Poisson single-point neurons whose spike trains were generated by applying a frequency coding strategy based on a Poisson distribution of spikes (Brette, Roland, Panzeri, & Graham, 2015). The controlled body was a point mass that could move over a bidimensional space: position at time t was identified by two variables x(t), y(t). Consistent with the foundational concept of neuronal population coding of movement direction and the neural representation of variables with vectorial attributes (Georgopoulos, Schwartz, & Kettner, 1986), each population was subdivided into two subpopulations (x and y) and further divided into two groups (positive and negative), encoding signals of opposite signs. The firing rates were linearly proportional to the position of the point mass along the specific axis and direction, plus a baseline firing rate of 50 Hz. This basal discharge corresponded to the initial neutral configuration and was considered the physiological minimum of neural activity (i.e., neurons functionally silent). Therefore, the position of the point mass on each axis could be decoded by computing the net firing rate (i.e., the difference between the firing rates of the positive and the negative groups) and then dividing this net rate by the gain factor. Each of these groups consisted of 100 neurons. In summary, there were 200 neurons for each axis, for a total of 400 sensory feedback neurons. The cerebellar output and state estimator populations were analogously organized. Therefore, the overall system consisted of 1200 neurons.
This system is a proof-of-concept for the spiking implementation of the state estimator; in principle, it can be easily scaled in terms of multijoint tasks by increasing the number of neurons, subpopulations, and groups, accordingly.
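The rate coding described above can be illustrated with a minimal sketch for a single axis. It is not the released implementation: the gain value (GAIN_HZ_PER_M) and the function names are hypothetical, and the rectified split between the positive and negative groups is an assumption consistent with the text; only the 50 Hz baseline is taken from the description.

```python
BASELINE_HZ = 50.0      # baseline firing rate stated in the text
GAIN_HZ_PER_M = 100.0   # hypothetical gain; the text does not report its value


def encode_position(position_m, gain=GAIN_HZ_PER_M, baseline=BASELINE_HZ):
    """Return (positive_rate, negative_rate) in Hz for one axis.

    Assumption: each group adds a rate proportional to the position
    component of its own sign on top of the baseline.
    """
    positive_rate = baseline + gain * max(position_m, 0.0)
    negative_rate = baseline + gain * max(-position_m, 0.0)
    return positive_rate, negative_rate


def decode_position(positive_rate, negative_rate, gain=GAIN_HZ_PER_M):
    """Net firing rate divided by the gain recovers the position."""
    return (positive_rate - negative_rate) / gain


pos_rate, neg_rate = encode_position(0.4)
decoded = decode_position(pos_rate, neg_rate)
```

Because the baseline is common to both groups, it cancels in the net rate, so decoding depends only on the gain.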
2.3 State Estimation
Finally, these variabilities were used to compute the reliabilities of the afferent populations and the weighted sum described in equation 2.3. The obtained signal was then converted into spike patterns by the state estimator Poisson single-point neurons. The estimate can be read out as the net difference in firing rates between the positive and negative groups of the state estimator population.
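As an illustration of the weighting step, the sketch below combines the two decoded signals with weights proportional to their reliabilities. It assumes that reliability is the inverse of the population variability and that the weights are normalized to sum to one, which is the standard form of Bayesian cue integration; the exact expression of equation 2.3 is not reproduced here, and the function names are hypothetical.

```python
import math


def reliability(variability):
    """Assumed reliability: the inverse of the population variability."""
    if variability <= 0.0:
        raise ValueError("variability must be positive")
    return 1.0 / variability  # an infinite variability gives 0.0


def combine(feedback, prediction, var_feedback, var_prediction):
    """Reliability-weighted sum of sensory feedback and cerebellar prediction."""
    r_feedback = reliability(var_feedback)
    r_prediction = reliability(var_prediction)
    total = r_feedback + r_prediction
    if total == 0.0:
        raise ValueError("at least one source must have finite variability")
    w_feedback = r_feedback / total
    return w_feedback * feedback + (1.0 - w_feedback) * prediction


# Feedback four times less variable than the prediction: it dominates.
estimate = combine(1.0, 0.0, var_feedback=1.0, var_prediction=4.0)

# Feedback switched off (infinite variability): only the prediction counts.
estimate_no_feedback = combine(0.2, 0.6, math.inf, 2.0)
```

Under these assumptions, the source with the lower variability dominates the estimate, and a source with infinite variability is ignored.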
2.4 Task Design and Tests
The state estimator model was evaluated in the context of a reaching movement of a point mass along the x-direction, from (0, 0) to (1, 0) m in 500 ms. The trajectory of the point mass was defined as a minimum-jerk fifth-order polynomial with bell-shaped velocity profile, the typical trajectory observed in humans during reaching tasks (Flash & Hogan, 1985). We decided to simulate these 1D movements for simplicity without loss of generality. However, our model allows the simulation of 2D movements, as illustrated in the supplementary material.
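For reference, the minimum-jerk profile of Flash and Hogan (1985) has the closed form x(t) = x0 + (xf − x0)(10τ³ − 15τ⁴ + 6τ⁵), with τ = t/T. The sketch below evaluates it for the 1 m, 500 ms reach described above; the function names are illustrative and are not taken from the released code.

```python
def minimum_jerk(t, duration=0.5, start=0.0, goal=1.0):
    """Position (m) at time t (s) along a minimum-jerk reach."""
    tau = min(max(t / duration, 0.0), 1.0)  # normalized time in [0, 1]
    shape = 10.0 * tau**3 - 15.0 * tau**4 + 6.0 * tau**5
    return start + (goal - start) * shape


def minimum_jerk_velocity(t, duration=0.5, start=0.0, goal=1.0):
    """Velocity (m/s): time derivative of the polynomial above."""
    tau = min(max(t / duration, 0.0), 1.0)
    shape_rate = 30.0 * tau**2 - 60.0 * tau**3 + 30.0 * tau**4
    return (goal - start) / duration * shape_rate


# Sample the trajectory every 50 ms over the 500 ms movement.
samples = [minimum_jerk(0.05 * k) for k in range(11)]
peak_velocity = minimum_jerk_velocity(0.25)  # velocity peaks at mid-movement
```

The velocity is zero at both ends and peaks at mid-movement, which gives the bell-shaped profile mentioned in the text.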
The variability among sensory feedback neurons was defined by the level of noise and the amount of the delay (Wolpert & Ghahramani, 2000), both of which affect the reliability of the sensory feedback (Wagner et al., 2021). In particular, each neuron was corrupted by Gaussian noise with zero mean and standard deviation dependent on the amplitude of the transmitted signal (i.e., signal-dependent noise; Clamann, 1969; Harris & Wolpert, 1998), such that their firing activity was slightly desynchronized, as occurs in real biological networks (Wagner et al., 2019). The amount of variability of the sensory feedback neurons was not altered depending on the stage of cerebellar learning. In contrast, it has been observed that the synchronicity among cerebellar neurons varies with learning (Casellato et al., 2014), showing its maximum at the end of the adaptation (Wagner et al., 2019). Therefore, we considered two scenarios: high variability to simulate inaccurate cerebellar predictions (prelearning), and low variability to simulate accurate cerebellar predictions (postlearning). Indeed, in the prelearning condition, the cerebellar neurons did not encode any task-relevant information, while in the postlearning condition, they encoded the planned trajectory, the way that the sensory feedback neurons do but without delay. Due to this lack of delay, postlearning cerebellar predictions should be more accurate than the sensory feedback (Shadmehr & Krakauer, 2008b); hence, we set the variability of the sensory feedback neurons to be higher than that of the cerebellar neurons. On the contrary, in the prelearning condition, we set the variability of the sensory feedback neurons to be lower than that of the cerebellar neurons.
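A minimal sketch of signal-dependent noise across a group of neurons follows. The noise scale, the gain, and the use of the standard deviation of the rates as a variability measure are assumptions made for illustration; the text specifies only zero-mean Gaussian noise whose standard deviation depends on the signal amplitude, 100 neurons per group, and a 50 Hz baseline.

```python
import random
import statistics


def noisy_rates(signal, n_neurons=100, noise_scale=0.1,
                gain=100.0, baseline=50.0, seed=0):
    """Per-neuron firing rates (Hz) of one group for a nonnegative signal.

    Assumption: the noise standard deviation is noise_scale * |signal|,
    so a larger signal yields a wider spread of rates across neurons.
    noise_scale and gain are hypothetical values.
    """
    rng = random.Random(seed)
    sigma = noise_scale * abs(signal)
    rates = []
    for _ in range(n_neurons):
        corrupted = signal + rng.gauss(0.0, sigma)
        rates.append(max(baseline + gain * corrupted, 0.0))  # rates cannot be negative
    return rates


# Spread of rates across the group, used here as a simple variability index.
spread_small = statistics.pstdev(noisy_rates(0.5))
spread_large = statistics.pstdev(noisy_rates(1.0))
```

With a zero signal, every neuron fires at the baseline rate and the spread vanishes, so in this sketch the variability grows only as the encoded signal grows.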
Two further tests were carried out in order to challenge the state estimator model in the face of unexpected perturbations. First, a sudden switch-off of the sensory feedback during the movement was simulated. A reaching trajectory from (0, 0) to (1, 0) m was simulated over a period of 2 seconds. In this simulation, a scenario of intermediate cerebellar adaptation was defined by setting the cerebellar output neuron variability to be slightly higher than the feedback neuron variability. At half movement (t = 1 s), the sensory feedback was switched off (simulated with null input trajectory and variability set to infinite), leaving the cerebellar prediction, still not completely reliable as in postlearning, as the only source of information. This simulation allowed us to test the robustness of the state estimator output when the reliability of the afferent populations changed during the movement.
Second, a mismatch between the planned and the executed trajectory was simulated in postlearning (e.g., visuo-motor rotation after cerebellar adaptation). Since the postlearning condition simulates a scenario in which the cerebellum was already adapted, the cerebellar output neurons accurately (i.e., low interneuron variability) encoded the planned trajectory, from (0, 0) to (1, 0) m in 0.5 s. At the same time, however, the sensory feedback neurons encoded the actual movement under the perturbation: from (0, 0) to (−1, 0) m in 0.5 s. Therefore, the state estimator received inconsistent information from the afferent neural populations, both with a good level of reliability.
3.1 Neural Coding of the Afferent Populations to the State Estimator
It is known that the spike patterns of the cerebellar output neurons (namely, deep cerebellar nuclei cells) are modulated according to the learning stage in terms of timing, amplitude, and synchronization (Antonietti, Martina, Casellato, D'Angelo, & Pedrocchi, 2019; Casellato et al., 2014). At prelearning, when the cerebellum had not yet encoded any task-relevant information, all neurons fired at the constant background rate (see Figure 2B). On the contrary, at postlearning, the cerebellar output neurons encoded the predicted movement, with the positive group gradually increasing its activity during movement trial and the negative group maintaining a constant background activity (see Figure 2C; Casellato et al., 2014; Naveros et al., 2019). Similar results were obtained on both subpopulations (x- and y-axes) during bidimensional movement (see Figure S1).
If there are no perturbations, the net firing rate between the positive and the negative cerebellar output neurons is therefore predictive of the executed trajectory without delay.
3.2 Reliability of the Afferent Signals to the State Estimator
3.3 Output of the State Estimator in Stable and Perturbed Environments
A closer look at the net firing rate of the state estimator output allows us to better describe the behavior of the model in the two learning conditions. In the prelearning condition, when the cerebellum did not provide task-relevant information, the output of the state estimator preferentially followed the sensory feedback (see Figure 4A, right), which was more reliable than the cerebellar prediction despite its inherent delay. As a result, the generated estimate was delayed with respect to the planned trajectory. In the postlearning condition, the state estimator mainly relied on the cerebellar prediction. As a result, the firing rate of the state estimator neurons increased earlier than in the prelearning condition (see Figure 4B, left), and the generated estimate matched the planned trajectory with no delay (see Figure 4B, right).
In this work, we implemented a spiking neural network model of a state estimator based on Bayesian integration theory. Our model is a proof-of-concept and was tested here as an isolated block, receiving spike-encoded input signals. When no reliable cerebellar predictions were available (prelearning), the model estimated body states mainly relying on (delayed) sensory feedback. In this scenario, the cerebellar network, acting as a forward internal model, is still untrained and its output is unreliable and noisy. In more advanced stages of the learning process, the cerebellum acquires an accurate internal model of the body and environment, providing a reliable prediction of the planned movement with negligible noise and delay (Freeman, 2014; Ito, 2000). Accordingly, in the postlearning condition, our model estimated the body state using cerebellar prediction as the dominant source of information. This work provides a useful tool that can be used within spiking bio-inspired sensorimotor controllers for the simulation of motor tasks and could be easily integrated into existing brain models.
Here, the functionalities of the proposed state estimator model were tested in isolation from the other brain areas (e.g., the cerebellum and the sensory feedback). To do so, we had to make assumptions on the statistics of these areas’ neural activity based on literature: in the prelearning condition, the reliability of the (delayed) sensory feedback was higher than that of the cerebellar prediction (because before learning, the cerebellum provides erroneous predictions; Shadmehr & Krakauer, 2008b); in the postlearning condition, on the contrary, the reliability of the sensory feedback was set to be lower than that of the cerebellum, which provides accurate predictions with no delay, showing a strong interneuron synchronicity (Wagner et al., 2019). These assumptions allowed us to illustrate that the proposed model indeed worked in accordance with the Bayesian integration theory, considering the dominant source of information as that with the highest reliability.
In biological sensorimotor systems, sensory feedback is typically noisy and delayed due to latencies in information processing (Wolpert & Ghahramani, 2000) or even absent due to the lack of the appropriate sensory receptors and excessive movement speed (e.g., in ballistic movements like saccades; Abrams, Meyer, & Kornblum, 1989). Compensating for the delay or the lack of sensory feedback requires a process of state estimation that uses internal prediction about the time-varying body state. Based on the hypothesis that an optimal state estimation should exploit the variability of the incoming spike trains (Scott, 2002), the state estimator model implemented here extracted such variability metrics from sensory feedback and cerebellar output neurons to evaluate the reliability of the signals encoded by these two neural populations. This implies that along the learning process, the cerebellar output signal should acquire an advantageous signal-to-noise ratio (Wagner et al., 2021). This coherent spike activity of the cerebellar output neurons occurs when the cerebellum becomes able, throughout the repetition of several movement trials, to translate the motor efference copy it received into accurate and well-timed predictions of the corresponding sensory consequences (Shadmehr, 2017; Tseng et al., 2007).
By integrating the two source signals by means of their relative reliabilities, the spiking state estimator model was able to properly move from an initial state estimation relying only on the noisy and delayed sensory feedback in the prelearning condition (when the cerebellar spike trains were strongly desynchronized), to an accurate predictive estimate of the upcoming state of the moving body in the postlearning condition. It is worth noting that even after learning, the overall 100 ms delay was not completely compensated since a nonzero weight was set for the sensory feedback signal to face any possible further unpredictable event. Importantly, the continuous computation of the reliability measures during movement execution allows the system to deal with unexpected changes of environmental or internal contexts, such as a sudden interruption of the sensory feedback or an unpredictable perturbation (Haith & Krakauer, 2013).
It could be claimed that activity correlations may be associated with a reduction of the encoded stimulus information (e.g., impaired perceptual discrimination). However, as discussed in Valente et al. (2021), correlations are higher when correct choices and movements are made, thus showing that the effects of correlations in enhancing decoding of behavioral choices from sensory information overcome their detrimental information-limiting effects.
Accordingly, the basic assumption of our work is that correlation is associated with consistency of information across neurons and time, and therefore it is maximal when proper motor responses are learned (Valente et al., 2021). Synchronization of spikes among a group of neurons is indeed a special form of temporal coding (Sedaghat-Nejad et al., 2022).
In our model, signal amplitude was encoded in the firing rates of direction-dependent neural subpopulations (Georgopoulos et al., 1986). Each subpopulation was in turn divided into positive and negative groups, encoding signals of opposite signs (e.g., agonist and antagonist signals). The neuron baseline value represents the background activity of a functionally silent neuron (ten Brinke et al., 2017).
In our simulations, the baseline firing rate was set to 50 Hz for all neurons to ensure sustained neural activity, hence guaranteeing a good resolution in signal encoding even with a relatively low number of neurons. However, in future implementations with large-scale brain models and more realistic population sizes, this baseline rate could be flexibly set to match in vivo recordings, differentiating each neuronal population.
4.2 Future Work
The results obtained here provide a solid basis for future investigations on how spiking neural mechanisms and interaction of different brain areas generate accurate and timely motor commands. Indeed, the state estimator block has been implemented as a spiking processing unit, complementing current models based on high-level representations of cortical computations or artificial neural networks (Lanillos & van Gerven, 2021; Parrell, Ramanarayanan, Nagarajan, & Houde, 2019; Xu, Hu, Han, & Zhang, 2021). The spiking approach increases the biological realism of the model and paves the way to future studies on how the brain integrates sensory feedback and sensory prediction, in which the level of neuronal firing variability and synchronicity is strongly informative and depends on the learning stage and the environmental context.
In future work, a full spiking model of the cerebellar microcircuit endowed with plasticity rules, which was previously tuned and validated on experimental data (Casali, Marenzi, Medini, Casellato, & D'Angelo, 2019; de Schepper et al., 2021; Geminiani et al., 2019), could be connected to the spiking state estimator. In this construct, the cerebellar prediction will emerge from adaptive circuit processing throughout task repetition (D'Angelo et al., 2016). This will let the spiking variability of cerebellar neuronal populations evolve along with the acquisition of an accurate internal dynamic model. Indeed, during the formation of internal predictions, the cerebellar circuit undergoes an adaptation process based on the error between the predicted body movement and the actual movement revealed by sensory afferences. This “sensory prediction error” would be conveyed to the cerebellum through the inferior olive circuit.
We showed that in the face of unexpected perturbations in the postlearning condition, our model does not generate an estimate that corresponds to the actual executed movement. If the model were embedded in a complete sensorimotor control loop (Shadmehr & Krakauer, 2008b), then in such a situation the difference between the sensory feedback and cerebellar prediction (i.e., sensory prediction error; Tseng, Diedrichsen, Krakauer, Shadmehr, & Bastian, 2007) would generate activity in the inferior olive, signaling inaccurate predictions and triggering plastic processes that would eventually allow compensating for the perturbation. This error would increase the variability of the cerebellar output neurons, potentially due to the olivo-DCN collaterals (Lu, Yang, & Jaeger, 2016). As a result, the state estimator would start to preferentially follow the sensory feedback until the cerebellum provided new, accurate predictions (i.e., until its reliability was restored). Experimental recordings from parietal cortex and cerebellum neurons in behaving mice could be fundamental to validating this process.
The spiking state estimator could also be introduced in a control system embedding spiking models of different brain areas wired using connectome data (Oh et al., 2014). This would generate sensorimotor behaviors and allow monitoring the underlying dynamics of all involved neuronal populations. This modular system could embed blocks at different scales and levels of neuronal detail and could be used to control more complex dynamic bodies, with several degrees of freedom interacting with the environment in realistic scenarios. Using this general system to control detailed models of the musculoskeletal apparatus will be instrumental to investigating open issues in the intricate relationship between neural control and musculoskeletal biomechanics (e.g., the emergence of muscle synergies; Alessandro, Carbajal, & d'Avella, 2012), and the sensory integration for the regulation of internal joint loading (Barroso, Alessandro, & Tresch, 2019). Finally, the implemented system could be used to control neurorobots, exploiting the already available interfaces with the software MUSIC to synchronize the brain controller and the actuation of a robotic body (Weidel, Djurfeldt, Duarte, & Morrison, 2016), and could be embedded into neurorobotic environments like the Neurorobotics Platform (Falotico et al., 2017) or real robots (Antonietti et al., 2019; Casellato et al., 2014).
5 Release of the Code
The spiking state estimator code implemented here is available as open source at the following repository: https://github.com/dbbs-lab/state-estimator. Starting from the proof-of-concept applications tested here, it could be generalized to more complex scenarios (with additional encoded variables and increased number of neurons) and/or embedded in closed-loop control systems to simulate the full control of actions resulting from the coordinated activity of multiple brain areas.
This research has received funding from the European Union's Horizon 2020 Framework Programme for Research and Innovation under the specific grant agreement 945539 (Human Brain Project SGA3) and was supported by the EBRAINS platform and the ICEI-FENIX research infrastructure.
Alessandra Pedrocchi and Claudia Casellato are co–last authors.