Abstract

We present a computational model that highlights the role of basal ganglia (BG) in generating simple reaching movements. The model is cast within the reinforcement learning (RL) framework with correspondence between RL components and neuroanatomy as follows: dopamine signal of substantia nigra pars compacta as the temporal difference error, striatum as the substrate for the critic, and the motor cortex as the actor. A key feature of this neurobiological interpretation is our hypothesis that the indirect pathway is the explorer. Chaotic activity, originating from the indirect pathway part of the model, drives the wandering, exploratory movements of the arm. Thus, the direct pathway subserves exploitation, while the indirect pathway subserves exploration. The motor cortex becomes more and more independent of the corrective influence of BG as training progresses. Reaching trajectories show diminishing variability with training. Reaching movements associated with Parkinson's disease (PD) are simulated by reducing dopamine and degrading the complexity of indirect pathway dynamics by switching it from chaotic to periodic behavior. Under the simulated PD conditions, the arm exhibits PD motor symptoms like tremor, bradykinesia and undershooting. The model echoes the notion that PD is a dynamical disease.

1.  Introduction

Reaching movements are to motor function what the simple pendulum is to classical mechanics. Although reaching movements are straightforward to understand, they are interesting to researchers of motor function since they are supported by a full array of motor areas in the brain. From a clinical point of view, reaching movements have diagnostic value since various motor disorders, like Parkinson's disease (PD), manifest characteristic changes in reaching. PD patients are known to exhibit slower reaction times and movement times in simple movements aimed at targets (Brown & Jahanshahi, 1996). The first agonist burst in PD reaching movements is also observed to be weaker than in normals, resulting in longer and multistaged reaching movements requiring multiple agonist bursts. It has been suggested that part of the reason behind the longer movement times (MTs) in PD movements is that patients adopt a closed-loop strategy to execute movements that normal subjects execute in fast, open-loop fashion (Flowers, 1976). Though PD patients are capable of making fast ballistic movements, such performance results in impaired accuracy (Sheridan & Flowers, 1990). Another aspect of PD movement is the greater variability in movement end point for larger movements. Thus, bradykinesia, which refers to the relative slowness of Parkinsonian movement and the closed-loop mode of operation, seems to be the strategy that PD patients adopt to compensate for their inability to make consistent, large-amplitude movements.

Understanding PD reaching movements requires understanding the causal relationship between PD-related dopamine deficiency in the basal ganglia (BG) and arm movements. Such understanding is best formulated in terms of a computational model. Bischoff (1998) presents a model of Parkinsonian arm control involving a simple reaching task and a reciprocal aiming task. Under PD conditions, the model exhibits bradykinesia and impaired ability to make sequential movements. Cutsuridis and Perantonis (2006) have modeled PD bradykinesia with an extensive model that includes, in addition to dopamine projections to BG and cortex, dopamine projections to the spinal cord. The model successfully reproduces aspects of PD bradykinesia in terms of electromyographic (EMG) and movement parameters. Weaknesses of this model include the absence of explicit representations of BG nuclei and the absence of a mechanism for learning reaching movements.

A perspective of BG function that has been gaining strength for over a decade is the idea that BG forms a neural substrate for reinforcement learning (RL), a branch of machine learning inspired by instrumental conditioning (Joel, Niv, & Ruppin, 2002). Although a good number of BG models are not RL based, most of them address only specific aspects of the multitudinous functions of BG. Efforts are underway to explain the rich variety of BG functions solely within the RL framework (Chakravarthy, Joseph, & Bapi, in press).

A key feature of the proposed BG model is its interpretation of the role of the BG indirect pathway, which in the past has been given a varied and tentative interpretation, including withholding of action (Albin, Reiner, Anderson, Penney, & Young, 1990; Frank, 2005), focusing and sequencing (Hikosaka, Takikawa, & Kawagoe, 2000), action selection (Redgrave, Prescott, & Gurney, 1999), and switching (Isoda & Hikosaka, 2008). We have been developing a line of modeling that hypothesizes that the indirect pathway subserves exploratory behavior (Sridharan, Prashanth, & Chakravarthy, 2006). Thus, the direct pathway and indirect pathway play complementary roles, whereby the direct pathway subserves exploitation while the indirect pathway supports exploration. The presence of complex dynamics in the indirect pathway justifies its putative role in exploration, and degradation of such complex activity to more regular forms of activity like synchronized bursts is hypothesized to contribute to impaired movement. Experimental evidence that is consistent with such a hypothesis is reviewed in section 5. As a departure from the traditional description of BG functional anatomy according to which the direct pathway and indirect pathway support the go and no-go regimes, respectively, we propose a third regime: the explore regime, which comes between the go and no-go regimes. This explore regime is also supported by the indirect pathway.

In this letter, we describe a model of BG that essentially belongs to the RL class of BG models. In this model, the dopamine (DA) signal is related to incremental changes in error between the target position and the position of the end effector of the arm. The DA level switches the transmission between the direct pathway and indirect pathway in BG (Clark, Boutros, & Mendez, 2005). BG output is used as a corrective signal to the motor cortex (MC). This combined output of MC and BG is used to control the arm. As learning in MC progresses, the motor cortex gradually becomes independent of BG and begins to perform relatively independently of the modulatory influence of BG. Parkinsonian pathology is also captured naturally by the model through reduction of the dopamine level (temporal difference (TD) error) and by degrading the complex dynamics of the indirect pathway.

The letter is organized as follows. Section 2 presents a brief background of BG structure and function. Section 3 describes the model architecture, including training dynamics and measures of performance evaluation. Numerical simulations with the model training and testing under normal and Parkinsonian conditions are described in section 4. A discussion of the work is given in the final section.

2.  Background

2.1.  Basal Ganglia Circuitry.

The BG comprises a group of subcortical nuclei that form a highly interconnected network of modules. The caudate nucleus and the putamen, together referred to as the striatum (STR), are the major input nuclei. The striatum receives inputs from a number of cortical areas and the thalamus. Another input port of BG, which is not, however, typically considered so, is the subthalamic nucleus (STN), which, like the striatum, also receives inputs from the cortex. An internal module of BG, which is not directly connected to cortex or thalamus, is the globus pallidus externa (GPe), which is thought to play a central role in BG according to more recent perspectives on BG circuitry (Obeso, Rodriguez-Oroz, Blesa, & Guridi, 2006; Nambu, 2008). The BG has two output nuclei: the globus pallidus interna (GPi) and substantia nigra pars reticulata (SNr). The output nuclei mainly target three nuclei: the thalamus, pedunculopontine nucleus, and superior colliculus. The activity of basal ganglia is modulated through constant feeds of the neurotransmitter dopamine (DA) from substantia nigra pars compacta (SNc) via the nigrostriatal pathway. The degeneration of neurons in SNc, whose axons form this pathway, is known to cause idiopathic Parkinson's disease. Traditionally, signal propagation through the BG is thought to occur via two alternative pathways: the direct pathway, which includes STR → GPi/SNr, and the indirect pathway, which consists of STR → GPe → STN → GPi. Dopaminergic transmission from SNc has a differential effect on striatal neurons according to striatum dopamine level: at smaller dopamine levels, the indirect pathway is selected, and an increase in striatal dopamine shifts the balance toward the direct pathway, thereby increasing overall motor activity. Thus, the indirect pathway is the normally active pathway. The balance is switched just before movement onset, when dopamine release to striatum activates the direct pathway (Clark et al., 2005).

Although for a long time, BG were thought to support motor functions exclusively, it is now recognized that BG also have a role in cognitive, affective, and autonomous functions. BG circuitry is involved in a great range of functions, including (1) reward-based learning (Schultz, 1998), (2) exploratory and navigational behavior (Packard & Knowlton, 2002), (3) goal-oriented behavior (Cohen, Braver, & Brown, 2002), (4) motor preparation (Alexander, 1987), (5) working memory (Cohen et al., 2002), (6) timing (Buhusi & Meck, 2005), (7) action gating, (8) action selection (Redgrave et al., 1999), (9) fatigue (Chaudhuri & Behan, 2000) and (10) apathy (Levy & Dubois, 2005). In spite of the significant progress in our knowledge of BG at several levels, it is still not clear how such an overwhelming range of functions is supported by the same subcortical circuit.

2.2.  Reinforcement Learning—Dopamine.

A key idea that opens doors to understanding BG function is the idea that the activity of dopaminergic cells in BG represents reward signaling (Schultz, 1998). More precisely, dopamine neurons are activated by rewarding events that are better than predicted, remain unaffected by events that are as good as predicted, and are depressed by events that are worse than predicted. Thus, the dopamine signal seems to represent the error between predicted future reward and actual reward (Montague, Dayan, & Sejnowski, 1996).

Interestingly, a quantity known as temporal difference error, analogous to the error between predicted and actual future rewards, plays a key role in reinforcement learning (RL), a branch of machine learning. This conceptual association enabled the application of RL concepts (Sutton & Barto, 1998) to BG research (Joel et al., 2002). RL studies how an agent learns to respond to stimuli optimally without an explicit teacher; the agent's learning process is driven by reward or punishment signals that come from the environment in response to the agent's actions. Responses that result in rewards are reinforced, and those that lead to punishment are avoided. Actor, critic, and explorer are key components in a typical RL framework. The critic is a module that estimates the reward-giving potential, the value, V, of the current state. The actor uses the gradient in V to choose actions that increase V. In the model examined here, when the gradient is absent or too weak, the choice of actions becomes increasingly stochastic. This stochasticity in choice of actions is identified with the explorer.

3.  The Complete Model: Arm, Basal Ganglia, and Motor Cortex

Figure 1 depicts the architecture of the arm control system including the BG circuit, the motor cortex (MC), and the two-link arm model (AM). The motor task on which the system is trained consists of commanding the end effector of the arm from the initial central position to one of the four surrounding targets. Information corresponding to the ith target is coded in the target selection vector such that its ith component is set to 1, while all the other components equal 0. The target selection vector is presented to both MC and BG (see Figure 1). Outputs of MC and BG are combined to produce g, which represents the activations given to the four muscles of the two-link arm. The output of the BG may be regarded as a correction to the output of MC in controlling the arm. The basis of this correction is the error information associated with the relative position of the arm with respect to the target; this error is coded as the dopamine signal available to the BG. Thus, in the model, the role of BG is twofold: (1) to provide real-time corrective information to MC based on error information conveyed by the nigrostriatal dopamine signal and (2) to use this corrective signal to train the cortex on the motor task at hand.

Figure 1:

A schematic of the model components.

The proposed model consists of three components: MC, BG, and the arm. The arm has to reach one of the four target locations. Each target is specified by a target selection vector, which is given as input to both the MC and the BG. In response to the input, the outputs produced by MC and BG are gm and gbg, respectively. These outputs are linearly combined to produce g, which is given as activations to the muscles of the arm, calculated as
$$g = \alpha\, g_m + \beta\, g_{bg} \qquad (3.1)$$
where α and β are coefficients that control the relative contributions of MC and BG to movement, as described below. A given value of g puts the arm in a unique configuration. For a given target selection vector, BG output gbg is a highly labile quantity, which perturbs gm until the arm makes a successful reach. The value of gbg that results in a successful reach is used by MC for training itself. These processes are described below.

3.1.  Inputs and Outputs of BG Model

3.1.1.  Motor Cortex.

The motor cortex is modeled as a perceptron with the target selection vector as input and gm as output. Since the arm has four muscles (see Figure 1), gm is a four-dimensional vector given as
formula
3.2
where W is the weight matrix and b is the vector of biases.

3.1.2.  Basal Ganglia Model.

The BG part of the model has four key components: the critic, which is implemented in the striatum; the direct pathway; the indirect pathway; and the TD error, δ, which represents the dopamine signal arising out of SNc.

3.1.3.  Critic.

For a given target position and the current position, X, of the end effector of the arm (“hand”), the critic computes the value of the current position as
formula
3.3
where R represents the distance from the target over which the value is nonzero.

3.1.4.  Dopamine Signal.

During the exploration of the arm in its planar work space, if the arm accidentally strays close enough to the desired target, the BG receives a reward signal, r. In line with the standard RL literature, we define the TD error, δ, which represents the phasic activity of DA cells of SNc, as follows:
$$\delta(t) = r(t) + \gamma V(t+1) - V(t) \qquad (3.4)$$
where γ is the discount factor. The reward, r, is a sharp gaussian of d, with mean 0, standard deviation σ, and amplitude A, where d is the distance between the arm and the target. Thus, the reward is given when the arm comes very close to the target. δ is thought to be computed within the loop striatum → SNc → striatum (see Figure 1).
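
The reward and TD error described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the gaussian is assumed to have the unnormalized form A·exp(−d²/2σ²), the TD error is assumed to take the standard form from the RL literature, and the default amplitude and discount factor are placeholder values (only the standard deviation of 0.03 is taken from Table 1). The function names are hypothetical.

```python
import math


def reward(d, amplitude=1.0, sigma=0.03):
    """Sharp gaussian reward of the hand-target distance d (mean 0).

    amplitude is an assumed placeholder; sigma = 0.03 follows Table 1.
    """
    return amplitude * math.exp(-(d ** 2) / (2.0 * sigma ** 2))


def td_error(r, v_next, v_now, gamma=0.9):
    """Standard TD error: delta = r + gamma * V(next) - V(now).

    gamma = 0.9 is an assumed placeholder for the discount factor.
    """
    return r + gamma * v_next - v_now


if __name__ == "__main__":
    # The reward is essentially zero unless the hand is very close to the target.
    for d in (0.0, 0.03, 0.1, 0.3):
        print(f"d = {d:4.2f}  reward = {reward(d):.3e}")
    # Moving toward higher value gives a positive TD error, and vice versa.
    print(td_error(0.0, v_next=0.5, v_now=0.4))
    print(td_error(0.0, v_next=0.4, v_now=0.5))
```

With these assumptions, a step that raises the value estimate yields a positive δ even without reward, which is what lets the dopamine signal guide the arm before it reaches the target.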

3.2.  Direct and Indirect Pathways.

It is well known from the functional anatomy of the BG that the striatal dopamine switches the transmission between the direct and the indirect pathways: the direct pathway is selected at higher values of dopamine and the indirect pathway for lower values (Clark et al., 2005). Selection of the direct pathway is thought to facilitate movement (“go”) and selection of indirect pathway to withhold movement (“no-go”). Between the high and low ranges of dopamine, which correspond to the classical go and no-go regimes, we posit an intermediate range, which corresponds to the explore regime. These three regimes operate in the current model as follows. In the go case, the direct pathway is activated, and g is updated such that the arm continues to move a little in the previous direction. In the no-go case, the indirect pathway is activated, and gbg is updated such that the arm shows a tendency to move a little in the direction opposite to the previous direction. In the explore case, again the indirect pathway is activated, and gbg is updated in a random fashion unrelated to the previous increment in g. These mechanisms are embodied in the following equations:
formula
3.5
where the random term is a four-dimensional vector calculated as in equation 3.12b. The upper and lower DA threshold values, DAhi and DAlo, vary with training as
formula
3.6
where a = 0.1 and is defined in equation 3.9 below, and is updated such that
formula
3.7
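
The three regimes can be illustrated with a short Python sketch. It rests on several assumptions that are not taken from the model equations: the two thresholds are passed in as `da_hi` and `da_lo` (hypothetical names), the go regime simply repeats the previous increment, the no-go regime reverses it, and the explore regime draws a uniform random increment as a stand-in for the logistic-map vector described in section 3.5. The scale of 0.04 mirrors the logistic map scaling factor in Table 1.

```python
import random


def bg_increment(delta, prev_dg, da_hi, da_lo, scale=0.04, rng=random):
    """Return the next increment to gbg for the current TD error delta.

    prev_dg is the previous increment, one entry per muscle.
    """
    if delta > da_hi:
        # Go: keep moving a little in the previous direction.
        return [x for x in prev_dg]
    if delta < da_lo:
        # No-go: move a little opposite to the previous direction.
        return [-x for x in prev_dg]
    # Explore: increment unrelated to the previous one (uniform stand-in).
    return [scale * rng.uniform(-1.0, 1.0) for _ in prev_dg]


if __name__ == "__main__":
    prev = [0.1, -0.2, 0.0, 0.3]
    print("go     :", bg_increment(0.5, prev, da_hi=0.1, da_lo=-0.1))
    print("no-go  :", bg_increment(-0.5, prev, da_hi=0.1, da_lo=-0.1))
    print("explore:", bg_increment(0.0, prev, da_hi=0.1, da_lo=-0.1))
```

Widening the gap between the two thresholds enlarges the explore band, which is consistent with the remark in section 4.1 that larger values of a increase the time spent in exploration.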

3.3.  Training MC.

The reaching movement driven by BG output as described by equations 3.1 to 3.7 proceeds until the end effector comes within a radius of Rtol from the target location. The value of g, which results in this successful reach, is used as a target output of MC. Thus, MC is trained by the delta rule as follows:
formula
3.8
Two more quantities, the MC coefficient α and the BG coefficient β (see equation 3.1), are updated as training proceeds. As MC becomes more and more skilled at the movement, the contribution of MC to movement increases, while that of BG decreases. This is achieved by making the coefficients depend on the average reaching error, E, over all targets, when MC alone is used to drive the arm. Thus,
formula
3.9
Since E decreases with training, the contribution of MC, represented by α, increases, and the contribution of BG, denoted by β, decreases, with training.
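
The training step can be sketched as follows. The sketch assumes a linear perceptron (output W·input + b) with a one-hot target selection vector, and applies the delta rule with the MC learning rate of 0.2 from Table 1. Here `alpha` and `beta` stand for the two coefficients of equation 3.1; their dependence on E (equation 3.9) is not modeled, and all function names are hypothetical.

```python
def mc_forward(W, b, xi):
    """Linear perceptron (assumed form): gm = W xi + b, one output per muscle."""
    return [sum(w * x for w, x in zip(row, xi)) + bi for row, bi in zip(W, b)]


def delta_rule_step(W, b, xi, g_target, eta=0.2):
    """One in-place delta-rule update pulling gm toward the g of a successful reach."""
    gm = mc_forward(W, b, xi)
    err = [t - y for t, y in zip(g_target, gm)]
    for i, e in enumerate(err):
        for j, x in enumerate(xi):
            W[i][j] += eta * e * x
        b[i] += eta * e
    return err


def combine(gm, gbg, alpha, beta):
    """Linear combination of MC and BG outputs sent to the muscles."""
    return [alpha * m + beta * c for m, c in zip(gm, gbg)]


if __name__ == "__main__":
    W = [[0.0] * 4 for _ in range(4)]
    b = [0.0] * 4
    xi = [1.0, 0.0, 0.0, 0.0]           # target 1 selected
    g_success = [0.2, 0.4, 0.6, 0.8]    # illustrative activations only
    for _ in range(50):
        err = delta_rule_step(W, b, xi, g_success)
    print("residual error:", max(abs(e) for e in err))
    print("MC-only output:", combine(mc_forward(W, b, xi), [0.0] * 4, 1.0, 0.0))
```

In this toy setting the error shrinks geometrically because the target output is fixed; in the model the target changes from reach to reach, which is one reason the MC error fluctuates around a small positive value instead of reaching 0.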

3.4.  Arm Model.

Since BG dynamics is the focus of the letter, we chose an extremely simple model of arm dynamics. The arm consists of two joints with four muscles. The muscles are activated by g, a four-dimensional vector: g1 and g2 activate the agonist and antagonist of the “shoulder,” respectively, while g3 and g4 activate the agonist and antagonist of the “forearm,” respectively. The “shoulder” and “forearm” joint angles are given by:
formula
3.10a
formula
3.10b
Thus, in our simple arm model, the relationship between muscle activations and arm configuration is static.
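
To make the idea of a static map from muscle activations to arm configuration concrete, the following Python sketch pairs an assumed activation-to-angle rule with standard planar two-link forward kinematics. The rule used here (joint angle proportional to agonist minus antagonist activation), the gain, and the unit link lengths are all assumptions made for illustration; they are not the expressions of equations 3.10a and 3.10b.

```python
import math


def joint_angles(g, gain=math.pi / 2):
    """Assumed static map: angle = gain * (agonist - antagonist activation)."""
    shoulder = gain * (g[0] - g[1])
    forearm = gain * (g[2] - g[3])
    return shoulder, forearm


def hand_position(g, l1=1.0, l2=1.0):
    """Planar two-link forward kinematics for the end effector."""
    th1, th2 = joint_angles(g)
    x = l1 * math.cos(th1) + l2 * math.cos(th1 + th2)
    y = l1 * math.sin(th1) + l2 * math.sin(th1 + th2)
    return x, y


if __name__ == "__main__":
    print(hand_position([0.0, 0.0, 0.0, 0.0]))  # arm fully extended along x
    print(hand_position([1.0, 0.0, 0.0, 0.0]))  # shoulder rotated by 90 degrees
```

Because the map has no internal state, each value of g corresponds to exactly one hand position, so any change in gbg shows up immediately as a displacement of the hand.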

3.5.  Modeling Parkinsonian Dynamics.

Two mechanisms are used in the model examined here to simulate Parkinsonian conditions. The first is reduced dopamine. Reduced dopamine conditions are simulated by putting a ceiling, Dceil, on the DA level (i.e., on δ). This ceiling is imposed on δ, which is governed by equation 3.4, as follows:
$$\delta \leftarrow \min(\delta, D_{ceil}) \qquad (3.11)$$
where min(x, y) is defined as
$$\min(x, y) = \begin{cases} x, & x \le y \\ y, & \text{otherwise} \end{cases}$$
Parkinsonian disease progression is simulated by gradually decreasing Dceil. For every value of Dceil, MC is trained for several epochs before reducing Dceil further. In the simulations of Parkinsonian conditions described in the following section, we start with Dceil = 0.5 and reduce it in steps to −0.5.
The second mechanism is reduced complexity in the explore regime. In order to vary the complexity of indirect pathway dynamics during the explore regime, we use a chaotic map to calculate the random vector used in equation 3.5. Each component of this vector is calculated using a logistic map as follows:
formula
3.12a
formula
3.12b
where B denotes a scaling parameter.

The map parameter K controls the transitions between fixed point (K < 3), periodic (for 3 < K < 3.57, approximately), and chaotic (K > 3.57) behaviors of the logistic map (May, 1976). Note, however, that even in the chaotic regime (K > 3.57), there are the so-called islands of stability: small ranges of K where the map exhibits periodic behavior. These islands are not likely to be detected in our simulations since we have not scanned the space of K at sufficiently high resolution. In the simulations in the following section, we designate K = 4 to correspond to the normal condition and use smaller values of K down to 3 to simulate PD-related degeneration.
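
The dependence of the logistic map on K can be checked numerically with a short Python sketch. It iterates the standard logistic map x ← K·x·(1 − x), discards a transient, and counts how many distinct values the orbit visits. The initial condition, transient length, and tolerance are arbitrary choices, the function names are hypothetical, and the scaling by B that converts the map output into the exploratory vector (equation 3.12b) is not reproduced here.

```python
def logistic_orbit(K, x0=0.3, n_transient=1000, n_keep=64):
    """Iterate x <- K x (1 - x), discard a transient, return the next n_keep values."""
    x = x0
    for _ in range(n_transient):
        x = K * x * (1.0 - x)
    orbit = []
    for _ in range(n_keep):
        x = K * x * (1.0 - x)
        orbit.append(x)
    return orbit


def distinct_values(orbit, tol=1e-6):
    """Count values in the orbit that differ by more than tol from those already seen."""
    seen = []
    for x in orbit:
        if all(abs(x - s) > tol for s in seen):
            seen.append(x)
    return len(seen)


if __name__ == "__main__":
    for K in (2.8, 3.2, 3.5, 3.9, 4.0):
        n = distinct_values(logistic_orbit(K))
        print(f"K = {K}: {n} distinct values in 64 iterates")
```

A count of 1 indicates a fixed point, a small count such as 2 or 4 indicates a periodic orbit, and a count close to the number of retained iterates indicates aperiodic behavior. Lowering K from 4 toward 3 therefore replaces an irregular exploratory signal with a regular oscillation.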

4.  Simulations: Normal and PD Reaching Movements

4.1.  Normal Reaching Movements.

The model described in the previous section is used to reach the four targets shown in Figure 1. Training simulations are run for 20 epochs, where each epoch consists of reaching (or making time-limited attempts to reach) all four targets. The weights of MC are randomly initialized between −0.5 and 0.5. Each reaching movement lasts for at most 100 time steps, or until the arm serendipitously comes sufficiently close to the target. Thus, reaching movements are made once toward each of the four targets in one epoch. This process is repeated for 20 epochs, at the end of which MC is almost completely trained. Even if training is continued beyond 20 epochs, MC error does not reach 0 but fluctuates around a small, positive value. The labile influences coming from BG to train MC play a dual, and mutually conflicting, role. On the one hand, this variability is necessary to explore the output space and discover rewarding increments to muscle activations. On the other hand, the same variability prevents the MC from learning further once a low error value is reached. It is for this reason that we increase α (MC's contribution to movement) and decrease β (BG's contribution), as a function of training error, as training progresses. Numerical values of various parameters used in the model are listed in Table 1.

Table 1:
Numerical Values of Parameters Used in the Model Equations of Section 3.
Parameter   Value   Description
A                   Amplitude of the reward and value functions
                    Spread of the value function
γ                   Discount factor
Rtol        0.3     Radius of the tolerance circle
a           0.1     Scaling factor for DAhi and DAlo in BG function
            0.2     Learning rate of MC
σ           0.03    Standard deviation of the gaussian used in reward calculation
B           0.04    Scaling factor for logistic map function

The parameters of Table 1 are chosen by experimentation keeping in view the various trade-offs involved. For example, a controls the DA thresholds in equation 3.6. Larger values of a increase the time spent in exploration. Similarly, B controls the amplitude of exploration. A larger value of Rtol increases the probability of a successful reach but worsens reaching error. Variation of these parameters within a small range around the currently used values did not exhibit any sudden unexpected changes in system behavior. However, a systematic sensitivity analysis of the above parameters could form part of a separate study.

The evolution of the reaching performance over the epochs is characterized by three metrics: the MC performance error, reaching duration, and path variability. Figure 2 shows the trajectories of the arm in the first epoch. Since the MC is untrained, the arm makes long, wandering movements to reach the target. Arm trajectories are nearly straight in the last (twentieth) epoch (see Figure 3). Figure 4 shows the reaching movements made by the arm under the sole influence of MC, without the BG contribution. This is done by setting α = 1 and β = 0. Note that the perturbative influence of BG is absent in this case. Although the average reaching duration appears to decrease in the mean with learning, the trend does not seem to be significant when the error bars are considered (see Figure 5). Note that the error bars in all figures denote standard deviation. As the MC learns to reach, the initial movement, which is driven by the MC, arrives closer and closer to the target; thus the time-consuming wandering search for the target is reduced as learning progresses. For the same reason, path variability is also reduced as training progresses (see Figure 6). Naturally, since the goal of training is to train the MC to reach, MC reaching error decreases with epochs (see Figures 7 and 8).

Figure 2:

Trajectories of reaching movements during the first epoch.

Figure 3:

Trajectories of reaching movements made during the last (twentieth) epoch.

Figure 4:

Reaching movements made to the four targets after training (MC alone, no BG contribution).

Figure 5:

Variation of average reaching duration with learning in normal conditions.

Figure 6:

Variation of path variability with learning in normal conditions.

Figure 7:

Variation of output error of MC with learning in normal conditions.

Figure 8:

Average time spent in various BG regimes.

The classical description of the function of the direct and indirect pathways associates the direct pathway with movement facilitation and the indirect pathway with movement inhibition. In the model here, we propose that the dynamics of the go and no-go regimes are opposite to each other: the respective changes in BG output in the two regimes have opposite signs. However, it may be argued that a simpler way to implement the no-go regime is to let the BG output remain unaltered. We implemented this variation of the regime and found that the results were qualitatively the same (see appendix B). Therefore, we continue with the formulation of regimes as depicted in equation 3.5 and consider their consequences in Parkinsonian conditions in the next section.

4.2.  PD Reaching Movements.

Simulations of PD-related pathology are based on three types of models. In the type A PD model, both dopamine reduction (−0.5 ≤ Dceil ≤ 0.5) and reduced complexity of indirect pathway dynamics (3 ≤ K ≤ 4) are incorporated. In the type B PD model, only dopamine reduction is implemented (−0.5 ≤ Dceil ≤ 0.5, and K = 4). In the type C PD model, only the reduced complexity of indirect pathway dynamics (3 ≤ K ≤ 4) is incorporated, with no reduction in dopamine (Dceil = 0.5).

We define a few metrics to characterize reaching performance in PD conditions:

  • The undershoot factor, which quantifies the extent by which the final position of the arm undershoots the target

  • The tremor factor, which quantifies the tremor seen in arm movements

  • Average velocity, which quantifies bradykinesia

Formal definitions of these metrics are given in Appendix A.

Since the loss of dopaminergic cells in SNc is the etiology of idiopathic PD, it would be natural to describe the degree of PD by the percentage loss of DA cells. In this simulation, the degree of PD pathology is expressed by the quantity Dceil, which clamps the DA signal δ. We now define a quantity, PDA, which represents the percentage of DA cell loss, and relate it to Dceil. Note that δ can take both positive and negative values, typically varying between −0.5 and 0.5 in the simulations. When PDA = 0, Dceil takes its highest value of 0.5, and when PDA = 1 (i.e., 100%), Dceil takes its lowest value of −0.5. Thus, we have Dceil = 0.5 − PDA. For a given trial, there are 20 epochs for each 5% of DA loss, and each epoch lasts a maximum of 100 iterations; if the arm freezes for more than 10 time steps, the reach is terminated. There are 10 trials for each DA level. Trials represent repeated simulation for the same DA level. Such repetition is necessary to examine the level of variability in reaching.
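
The relation between DA cell loss and the clamp on the DA signal can be sketched as below. The sketch assumes that PDA is expressed as a fraction between 0 and 1 and that Dceil varies linearly between the two endpoints stated above; the function names are hypothetical.

```python
def da_ceiling(p_da):
    """Map fractional DA cell loss (0..1) to the ceiling on delta.

    Assumes linear interpolation between 0.5 (no loss) and -0.5 (total loss).
    """
    if not 0.0 <= p_da <= 1.0:
        raise ValueError("p_da must lie between 0 and 1")
    return 0.5 - p_da


def clamp_td_error(delta, d_ceil):
    """Impose the ceiling on the TD error (the simulated dopamine signal)."""
    return min(delta, d_ceil)


if __name__ == "__main__":
    # Sweep DA loss in 5% steps, as in the simulations described above.
    for step in range(0, 21, 5):
        p = step * 0.05
        ceil = da_ceiling(p)
        print(f"PDA = {p:4.2f}  Dceil = {ceil:+.2f}  "
              f"clamped delta for delta = 0.3: {clamp_td_error(0.3, ceil):+.2f}")
```

Under this assumption the ceiling crosses 0 at 50% cell loss, beyond which δ can no longer be positive. That would be consistent with the change in behavior reported around PDA = 50%.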

In the three types of PD simulations, we start with the MC fully trained under normal conditions (as in section 4.1) and continue to train it under the pathological conditions by varying PDA and K. Variation of various metrics like undershooting presents three possible scenarios of PD disease progression.

4.2.1.  PD Model, Type A

In this PD model, both dopamine reduction (−0.5 ≤ Dceil ≤ 0.5) and reduced complexity of indirect pathway dynamics (3 ≤ K ≤ 4) are incorporated. Dceil and K are both tied to PDA: as PDA increases from 0 to 100%, Dceil falls from 0.5 to −0.5 and K falls from 4 to 3.

PD patients are known to often undershoot targets in reaching performance (van Gemmert, Adler, & Stelmach, 2003). This is clearly seen in Figure 9, where undershooting worsens with the increasing loss of DA cells (PDA). At around 50% loss of DA cells, undershooting reaches nearly its minimum and does not change significantly henceforth. Figure 10 shows a snapshot of reaching trajectories with undershooting. Note that apart from undershooting the target, there is also a large error in reaching direction.

Figure 9:

Variation of undershooting factor with PDA, for a type A PD model.

Figure 10:

Instance of undershooting with a type A PD model (PDA = 100%).

Tremor also increases with increasing PDA up to about PDA = 50%; thereafter, it quickly drops to 0 at PDA = 60% and remains at 0 for larger values of PDA (see Figure 11). This development may be accounted for as follows. As PDA is increased, Dceil is reduced, and δ spends more and more time in the explore and no-go regimes. Such exaggerated exploration, occurring in place of a straight target pursuit corresponding to the go regime, seems to manifest as tremor. As PDA is increased further, δ is always confined to the no-go regime, regardless of the actual performance of the arm. Thus, the arm enters a relatively frozen state with no tremor. Ramifications of this change can be seen in average velocity also. Average velocity decreases with increasing PDA, reaching a small value and remaining there for larger values of PDA (see Figure 12). Although the disease pathology is confined to BG, the performance error of MC gradually increases with increasing PDA and then saturates (see Figure 13). Thus, undershooting and average velocity seem to show a common pattern: a nearly gradual worsening followed by a relatively frozen condition marked by a paucity of movement. However, tremor gradually increases to a peak value before falling rapidly. These patterns seem to be reflected in the variation of time spent in various regimes (see Figure 14). Variation of time spent in the explore regime seems to resemble the variation of tremor, while undershooting and average velocity seem to follow the variation of time spent in the go regime (see Figure 14).

Figure 11:

Tremor factor in a case of type A PD.

Figure 12:

Average velocity of the arm in a case of type A PD.

Figure 13:

Actor or MC error in a case of type A PD.

Figure 14:

Average time spent by BG in various regimes for type A PD.

Normal reaching is accompanied by a large, initial agonist burst, followed by an antagonist burst, which sometimes is followed by a second, smaller agonist burst (see Figure 15, top). Note that all agonist burst plots correspond to the variation of the first component of the muscle activation vector (representing the shoulder) as the arm reaches target 1. We have not included other components, since they show similar behavior. Time zero in all agonist burst plots corresponds to the start of the reaching movement. Such biphasic response seen in the normal case does not appear in type A PD results, which show a nearly monotonic build-up of activity (see Figure 15, bottom).

Figure 15:

Agonist burst. Normal (top) and PD model type A (bottom). The normal plot corresponds to target number 1 and epoch number 16. The PD plot corresponds to DA loss of 30% and target number 1.

4.2.2.  PD Model Type B.

In this PD model, only dopamine reduction (−0.5 ≤ DA ≤ 0.5; K = 4) is incorporated. DA is related to PDA as follows: .

Although the general trends are similar to those seen in the type A PD model, there is an important difference. Consider, for example, the variation of undershooting with PDA: in the type A model, there is a gradual reduction followed by saturation. In the type B model, however, undershooting remains nearly constant and falls drastically at a PDA value of about 50%, without much subsequent variation (see Figure 16). Figure 17 shows a snapshot of undershooting in this case. Tremor also remains nearly constant until 50%, falling abruptly to 0 thereafter (see Figure 18), although it exhibits a sharp transient rise at 50% before it falls to 0. A similar step-like change is observed in average velocity (see Figure 19). However, actor (MC) error shows insignificant variation with PDA (see Figure 20). In this case, too, the variation of symptoms reflects the variation of time spent in the various regimes. For instance, as in the previous case, the variation of average velocity and undershooting resembles the variation of time spent in the go regime, while the variation of tremor resembles that of the explore regime (see Figure 21). Here too, the agonist burst shows a monotonic variation (see Figure 22, bottom), compared to the biphasic response of the normal case (see Figure 22, top).

Figure 16:

Variation of the undershooting factor with percentage DA cell loss in a type B PD model.

Figure 17:

A snapshot of the undershooting reaching movement in a type B PD model (PDA = 100%).

Figure 18:

Variation of tremor with percentage DA cell loss in a type B PD model.

Figure 19:

Variation of average velocity with percentage DA cell loss in a type B PD model.

Figure 20:

Variation of actor or MC error with percentage DA cell loss in a type B PD model.

Figure 21:

Average time spent by BG in various regimes for a type B PD model.

Figure 22:

Agonist burst. Normal (top) and type B PD model (bottom). The normal plot corresponds to target number 1 and epoch number 16. The PD plot corresponds to DA loss of 30% and target number 1.

Thus, unlike the type A model, symptoms in the type B model show a step-like variation: they remain constant up to a critical PDA value of 50% and thereafter transition to a permanently worse state. This can perhaps be accounted for as follows: the loss of DA neurons might be compensated for by the intact indirect pathway dynamics, and this balance is perhaps disturbed when PDA reaches a critical level (in this case, 50%).

4.2.3.  PD Model Type C.

In this PD model, only reduced complexity of indirect pathway dynamics (a reduction in K) is incorporated, with no reduction in dopamine (DA = 0.5). DA is fixed at 0.5, and K is related to PDA as: .

In type A simulations, we have seen a gradual variation of symptoms, followed by saturation at a PDA of about 50%. In type B, we have seen a nearly constant profile up to a PDA of about 50%, followed by a sudden shift to another plateau. In type C, we see a generally gradual variation of symptoms (undershooting factor, Figures 23 and 24; tremor factor, Figure 25; average velocity, Figure 26) with no sharp transition at a PDA of about 50%. Since DA is fixed, the DA signal is allowed a full, unconstrained variation. Thus, the sharp transitions among the three regimes do not occur here. Whatever impairment is observed is due to the degradation of the complexity of exploration (reduction in K). Note that the boundaries between regimes are not fixed but vary as a function of actor error (see equation 3.6): the span of the explore regime is wider for a higher actor error. Actor error increases as PDA increases (see Figure 27). As a consequence, for larger values of PDA, the system spends more time in the explore regime than in the go regime (see Figure 28). Since the DA signal never or rarely drops too low, the no-go regime is rarely selected (see Figure 28). In this case too, the agonist burst shows a monotonic variation (see Figure 29, bottom) compared to the biphasic response of the normal case (see Figure 29, top).

Figure 23:

Variation of undershooting factor with percentage DA cell loss in a type C PD model.

Figure 24:

A snapshot of an undershot reaching movement in a type C PD model (PDA = 100%).

Figure 25:

Variation of tremor with percentage DA cell loss in a type C PD model.

Figure 26:

Variation of average velocity with percentage DA cell loss in a type C PD model.

Figure 27:

Variation of actor or MC error with percentage DA cell loss in a type C PD model.

Figure 28:

Average time spent by BG in various regimes for a type C PD model.

Figure 29:

Agonist burst. Normal (top) and type C PD model (bottom). The normal plot corresponds to target number 1 and epoch number 16. The PD plot corresponds to DA loss of 30% and target number 1.

5.  Discussion

We have presented a model of Parkinsonian reaching dynamics. The model consists of MC, BG, and a two-link arm. The BG model is cast essentially in the framework of reinforcement learning, though we depart radically from the usual interpretation of the neural substrates of the various RL components. In line with actor-critic models of BG, we interpret the temporal difference error as the DA signal. The value function, which is thought to be computed in the striatum, is not learned but predefined in terms of the distance between the arm's end effector and the target. The DA signal switches transmission between the direct and indirect pathways. Thus, BG output is dominated by direct or indirect pathway activity, depending on the magnitude of the DA signal. BG output in combination with MC output controls the arm. Thus, the perturbative corrections from the BG and the DA signal together help MC learn to reach. MC's dependence on BG gradually diminishes as training progresses. Thus, in the model, BG discovers the correct output by reward-related dynamics and transfers the knowledge to MC. A similar scenario of sequential learning between BG and cortex has been described in the experimental literature: studies of the different time courses of learning in basal ganglia and prefrontal areas show a similar sequencing (first basal ganglia and then prefrontal) in saccade-related behavior in monkeys (Pasupathy & Miller, 2005).
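
To make the idea of a predefined, distance-based value function concrete, the following is a minimal sketch and not the letter's actual formulation. It assumes that the value is the negative Euclidean distance between end effector and target and that the TD-style signal is the discounted change in value between successive steps, with no explicit reward term. The function names and the discount parameter `gamma` are hypothetical.

```python
import math

def value(position, target):
    """Predefined value (assumption): less negative when closer to the target."""
    return -math.dist(position, target)

def td_signal(prev_position, position, target, gamma=1.0):
    """TD-style signal from the change in value; no reward term (assumption)."""
    return gamma * value(position, target) - value(prev_position, target)

target = (1.0, 1.0)
approaching = td_signal((0.0, 0.0), (0.5, 0.5), target)  # step toward the target
receding = td_signal((0.5, 0.5), (0.0, 0.0), target)     # step away from the target
print(approaching > 0, receding < 0)
```

Under these assumptions, the signal is positive when the arm approaches the target and negative when it moves away, which is the property the regime switching relies on.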

The DA signal in this letter does not distinguish between the two forms of DA release from mesencephalic dopamine centers reported in the experimental literature: phasic release, which acts on a timescale of seconds, and tonic release, which acts over a few minutes (Dreher & Burnod, 2002). Phasic release is linked to the difference between expected future reward and actual reward, a quantity described in the RL literature as the temporal difference error. Tonic and phasic dopamine releases are thought to have differential roles in the efficient updating of working memory information in the prefrontal cortex (PFC). Tonic DA is thought to increase the stability of maintained information in the PFC by increasing the signal-to-noise ratio of the pattern with respect to background noise. By contrast, phasic DA is thought to control when an activity has to be maintained and when it must be updated (Cohen et al., 2002). The possibility of interaction between tonic and phasic dopamine has also been considered. It has been suggested that tonic dopamine can regulate the intensity of phasic dopamine through its effect on extracellular dopamine levels (Grace, 1991). On the whole, although an extensive literature exists on the specific roles of phasic and tonic DA, a comprehensive theory of the role of these two forms of release on various cortical and subcortical targets continues to be elusive. Although we do not distinguish between the two forms, the single DA variable used in this letter is similar to the TD error and is therefore closer to phasic DA than to tonic DA.

In simulations of normal reaching in section 4.1, there is an early stage when the arm exhibits prolonged wandering movements before it reaches the target. When the arm approaches the target accidentally, a reward is delivered, which is used to train the MC. Reaching movements become more direct and briefer as training proceeds. Thus, variability in reaching trajectories decreases with learning (see Figure 6). Studies of primate reaching patterns reveal an exponential reduction of variability with learning (Georgopoulos, Kalaska, & Massey, 1981). The exploratory movements of the arm, driven by the chaotic activity of the indirect pathway, are reminiscent of the notion of motor babbling proposed in the context of imitation learning in infants (Meltzoff & Moore, 1997). Infants are thought to make random movements and, through confirmatory feedback from an adult in the environment, learn to relate the movements initiated to the end states of the body. A similar learning of the articulatory-auditory relation also seems to be driven by the more familiar vocal babbling (Kuhl & Meltzoff, 1996).

In the PD reaching simulations of section 4.2, we considered three types of models. Typically PD pathophysiology is modeled purely in terms of a reduction in dopamine. However, in the type A PD model, we incorporated two factors related to PD pathology: dopamine reduction and reduction in the complexity of indirect pathway dynamics. Measures of impaired reaching movement like (longer) reaching duration indicating bradykinesia, tremor, and undershooting are calculated. All three types of PD models (A, B, and C) showed longer reaching duration, more tremor, and greater undershooting compared to normal. In type A, as disease progressed (increased loss of DA cells (PDA) and decreased complexity of indirect pathway dynamics), these measures gradually approached an extreme value before they saturated. In type B, the measures showed a step-like variation. Type C exhibits a smooth variation of symptoms, except for tremor, which exhibits a nearly flat peak.

The pattern of variation of symptoms seems to be reflected in the variation of time spent in the various regimes. With increasing DA loss, the time spent in the go regime falls, and the DA signal spends more time within the explore range. Thus, time spent in the explore regime increases. This increase continues until the DA signal falls below a critical value, enters the no-go regime, and remains mostly confined there. Thus, we see that as DA loss increases, the time spent in the explore regime gradually approaches a peak and subsequently falls to near zero (in types A and B). Since there is no restriction on the DA signal in type C, the time spent in the explore regime increases monotonically. In general, the variation of time spent in the explore regime seems to resemble the variation of tremor, while the variation of undershooting and average velocity seems to follow the variation of time spent in the go regime. Experimental studies have linked tremor to changes in GP (Hurtado, Gray, Tamas, & Sigvardt, 1999) and STN (Hamani, Saint-Cyr, Fraser, Kaplitt, & Lozano, 2004), in other words, to changes in the indirect pathway. In cases of akinetic rigidity, Albin et al. (1990) found a profound loss of striatal cells projecting to GPi, which constitutes the direct pathway. Based on the features of disease progression in Huntington's disease, another neurodegenerative disorder, it was suggested that degeneration of the direct pathway is responsible for rigidity and bradykinesia (Berardelli et al., 1999). In all three types in our model, the frozen, or rigid-like, state is associated with drastically reduced time spent in the go regime, whose substrate is the direct pathway.

Increased movement duration in PD patients is a well-known clinical fact. In a study in which patients were asked to look and point to visual targets on a screen, PD patients took 24% more time to execute the movement than control subjects (Desmurget, Grafton, Vindras, Gréa, & Turner, 2003). Bradykinesia is thought to occur due to the failure of BG output to reinforce the cortical mechanisms that prepare and execute the commands to move (Berardelli, Rothwell, Thompson, & Hallett, 2001). In our model too, bradykinesia is a result of impaired interaction between BG and MC, which is caused by DA cell loss.

Undershooting the target is another prominent feature of goal-oriented PD movements. In a study in which PD patients were asked to copy target lines of fixed size, patients, compared to controls, undershot the required size when the target size is greater than or equal to 2 cm (van Gemmert et al., 2003). It is noteworthy that in PD patients, saccadic movements also typically undershoot targets, particularly in the vertical direction (White, Saint-Cyr, Tomlinson, & Sharpe, 1983). In the simulations of the previous section, error in reaching includes both undershooting and error in direction. However, reaching error in PD patients is typically dominated by undershooting with no significant error in direction. This discrepancy in model performance related to differential error in reaching direction and undershooting will have to be investigated further.

Tremor is another classic symptom of PD motor impairment. Our model too exhibits tremor, which might be different from the way it is characterized in the experimental literature. Tremor in the movement disorder literature is marked by the presence of strong oscillatory components in electromyogram (EMG), and PD tremor is sometimes found to correlate with abnormal neural activity in GP (Hurtado et al., 1999) and STN (Hamani et al., 2004). In our model, tremor is quantified as the root mean square (RMS) value of acceleration of the arm's end effector (see appendix A). Thus what we refer to as tremor, strictly speaking, denotes fluctuations in movement velocity, which is higher in the PD version of the model than in the normal condition. Furthermore, the tremor described in our model emerges during reaching and is therefore akin to action tremor. Although action tremor is found in PD patients, resting tremor is found more often than action tremor. One way of extending the current model to address the problem of resting tremor is to treat the resting state as another possible target location. Since a typical hand may be assumed to spend more time in the resting state than in any other state, this feature can be incorporated in the simulations. It would be interesting to note the differences in the action tremor and resting tremor that emerge from such a model.

In this work, we use a simple measure of tremor based on the RMS value of acceleration. Considering the suggested link with degradation of chaos in the indirect pathway, it would perhaps be more appropriate to perform chaotic time-series analysis on tremor and check whether chaoticity is reduced with disease progression. Such analyses will have a bearing on clinical data, since it was reported that PD tremor displays reduced chaoticity due to the effect of treatment (Yulmetyev et al., 2006). These alternative measures of tremor will be the subject of future work.
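
One standard way to quantify chaoticity is the largest Lyapunov exponent. The sketch below estimates it for the logistic map, which is used here purely as a stand-in system (an assumption for illustration; it is neither the tremor signal nor the indirect pathway model of this letter). The parameter name `r` and the function name are hypothetical. A positive exponent indicates chaotic behavior, and a negative one indicates periodic behavior.

```python
import math

def logistic_lyapunov(r, x0=0.3, n_transient=1000, n_steps=20000):
    """Estimate the Lyapunov exponent of x -> r * x * (1 - x)."""
    x = x0
    # Discard the transient so the estimate reflects the long-run behavior.
    for _ in range(n_transient):
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_steps):
        x = r * x * (1.0 - x)
        # Local stretching rate is |f'(x)| = |r * (1 - 2x)|; guard against log(0).
        total += math.log(max(abs(r * (1.0 - 2.0 * x)), 1e-300))
    return total / n_steps

chaotic = logistic_lyapunov(4.0)   # chaotic regime of the map
periodic = logistic_lyapunov(3.2)  # period-2 regime of the map
print(chaotic > 0, periodic < 0)
```

For a measured tremor time series, the map would not be known, so the exponent would have to be estimated from the data by a time-series method rather than from an analytic derivative as in this sketch.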

Dounskaia, Fradet, Lee, Leis, and Adler (2009) characterize movement irregularities in PD reaching using a measure called the normalized jerk score (NJS) and show that this score in PD patients is greater than in normal subjects. Since PD movements are known to have abnormal fluctuations in velocity and acceleration, the magnitude of jerk, which refers to the temporal derivative of acceleration, is understandably higher in PD patients than in normals.

The proposed model is a lumped model of BG, which aims to embody the essence of BG dynamics. It is meant to present a picture of BG in which the direct pathway subserves exploitation and the indirect pathway subserves exploration (see Figure 30), and in this respect it departs radically from the existing BG modeling literature. It attempts to present only the large-scale picture and is not meant to be a detailed, network-level, or biophysical model of BG function. It is a systems-level model that aims to link PD pathology at a circuit level with its behavioral manifestations in reaching. To achieve such a wide scope, model components have been simplified. The arm and the muscles involved are static models, and therefore arm dynamics are produced purely by temporal variation in muscle activation. A more realistic model would use a dynamic arm and also incorporate a forward model necessary to control the arm. The actor/MC is also a static model, a perceptron, which happens to be adequate to the problem at hand. The value function is precalculated and is not trained by the TD error, as it should be in a full RL framework. There is also no explicit representation of the striatum or of corticostriatal connections modifiable by DA signals. A novel feature of the model presented here is to represent part of the indirect pathway dynamics using a chaotic system and to suppress its chaoticity to represent PD pathology. We envision two stages of future development of this model. In the first, each model component is replaced by networks of abstract neurons with appropriate dynamics. The second stage would consist of biophysical neuron models, with the model architecture closely complying with BG anatomy.

Figure 30:

A hypothetical schema of BG function in which the direct pathway subserves exploitation and indirect pathway exploration.

Another novel feature of the proposed model is that it attempts to capture disease progression in a neurodegenerative disorder (NDD) like PD, as opposed to contrasting normal function with the disease state at a particular level. NDDs, whose incidence seems to be increasing dramatically, are marked by a progressive impairment in function. Understanding the nature of this progressive impairment is complicated by the fact that the impairment is usually associated with high short-term fluctuation in symptoms (Walker et al., 2000). Although neurological deficits in such cases are thought to be related to neuronal cell loss, more recent findings suggest that the situation could be more complicated (Terry et al., 1991). Behavioral impairment in NDDs is associated with the formation of abnormal protein assemblies (plaques, tangles, and inclusion bodies), neuronal cell loss, and network dysfunction (Palop, Chin, & Mucke, 2006). An integral understanding of disease progression entails progress in understanding at all of these levels. Another difficulty in NDDs is distinguishing between a co-pathologic and a compensatory change. Only an integral understanding of NDD progression at the network level will help the development of effective therapies for NDD. The model presented here marks a step in that direction for the specific case of PD.

5.1.  STN-GPe and Exploration.

A key idea embodied in our model, developed from earlier work (Sridharan et al., 2006; Gangadhar, Joseph, & Chakravarthy, 2008), is that the STN-GPe system, which constitutes the indirect pathway, plays the role of the explorer in BG dynamics. RL-based or actor-critic models are an important class of models describing BG function. Of the three key components of RL (actor, critic, and explorer), substrates for both actor and critic have been located within the BG nuclei; however, no subcortical substrate for the explorer has been discovered in experimental work or suggested in modeling studies. Functional imaging studies identify two cortical substrates of exploration, the anterior frontopolar cortex and the intraparietal sulcus, but no subcortical counterpart has been found (Daw, O'Doherty, Seymour, Dayan, & Dolan, 2006). On the other hand, the roles attributed to the STN-GPe have been variable and tentative, ranging from movement inhibition (Albin et al., 1990; Frank, 2005) to focusing and sequencing (Hikosaka et al., 2000), action selection (Gurney, Redgrave, & Prescott, 2001), and switching (Isoda & Hikosaka, 2008). Thus, we try to fit the peg of a missing subcortical substrate for exploration into the hole of a tentative understanding of STN-GPe function and propose that the STN-GPe system is the subcortical substrate for exploration.

The STN-GPe loop is often studied as a single unit, perhaps since oscillations produced by this loop have fascinated many researchers (Terman, Rubin, Yew, & Wilson, 2002). Based on their studies of BG organotypic tissue cultures, Plenz and Kitai (1999) have proposed that correlated activity can arise in both STN and GPe structures and is caused by the interaction of the two structures rather than being driven by an external source. Recent experimental studies have revealed prominent low-frequency periodicity (4–30 Hz) of firing and dramatically increased correlations among neurons in the GPe and the STN, though there were no significant changes in firing rates (Bergman, Wichmann, Karmon, & DeLong, 1994; Nini, Feingold, Slovin, & Bergman, 1995; Magnin, Morel, & Jeanmonod, 2000; Brown et al., 2001). Under dopamine-deficient conditions associated with PD, recordings from STN neurons of PD animals and patients revealed synchronized oscillations (Magnin et al., 2000; Nini et al., 1995).

Thus, we propose a functional role for the presence of complex oscillations in STN-GPe in normal conditions and explain the pathological consequences of the loss of complex dynamics in that structure. The idea of explaining PD symptoms in terms of reduction in complexity of dynamics of relevant neural structures has existed for some time (Edwards, Beuter, & Glass, 1999). It is from such considerations that PD has been dubbed a “dynamical disease” (Beuter & Vasilakos, 1995). Accordingly, fixed-point dynamics have been linked to akinetic rigidity of PD and limit cycles to PD tremor. Several instances have been discovered in physiology, particularly cardiac physiology, where the chaotic activity of a system is essential for its normal function (Goldberger, Rigney, & West, 1990). There may be a similar situation in the STN-GPe system: complex activity may correspond to normal function and loss of complexity to disease.

Appendix A:  Definitions

A.1.  Normal.

  • Average velocity: The rate of change of displacement at the end of a reach.

  • Actor error: The magnitude of the vector connecting the target and the end point reached by the arm with the sole contribution of the actor (see Figure 31).

  • Path variability: The standard deviation of the lengths of the normals drawn from the line connecting the extremities of the trajectory to the trajectory itself (see Figure 32).

  • Parkinson's disease: The undershooting factor is the ratio of the magnitude of the projection of the actual displacement vector of the reach onto the vector connecting the origin to the target, to the magnitude of the vector connecting the origin to the target (see Figure 33). The tremor factor is defined as the root mean square value of the acceleration of the arm during the reaching task. Average velocity is defined as in the normal case.
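
The definitions above can be sketched in code as follows. This is a minimal illustration under stated assumptions: positions are 2D end-effector coordinates sampled at a fixed time step `dt`, acceleration is approximated by second finite differences, the "origin" is taken to be the starting point of the reach, and the population form of the standard deviation is used. All function and parameter names are hypothetical.

```python
import math

def undershooting_factor(start, end, target):
    """Projection of the actual displacement onto start->target, over |start->target|."""
    tx, ty = target[0] - start[0], target[1] - start[1]
    dx, dy = end[0] - start[0], end[1] - start[1]
    norm_t = math.hypot(tx, ty)
    projection = (dx * tx + dy * ty) / norm_t
    return abs(projection) / norm_t

def tremor_factor(positions, dt):
    """RMS end-effector acceleration, using second finite differences (assumption)."""
    acc_sq = []
    for p0, p1, p2 in zip(positions, positions[1:], positions[2:]):
        ax = (p2[0] - 2 * p1[0] + p0[0]) / dt ** 2
        ay = (p2[1] - 2 * p1[1] + p0[1]) / dt ** 2
        acc_sq.append(ax * ax + ay * ay)
    return math.sqrt(sum(acc_sq) / len(acc_sq))

def path_variability(positions):
    """Standard deviation of perpendicular distances from the path to its chord."""
    (x0, y0), (x1, y1) = positions[0], positions[-1]
    chord = math.hypot(x1 - x0, y1 - y0)
    dists = [abs((x1 - x0) * (y0 - y) - (x0 - x) * (y1 - y0)) / chord
             for x, y in positions]
    mean = sum(dists) / len(dists)
    return math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))

# A straight, evenly sampled reach that stops halfway to the target.
path = [(0.0, 0.0), (0.25, 0.25), (0.5, 0.5)]
print(undershooting_factor(path[0], path[-1], (1.0, 1.0)))
print(tremor_factor(path, dt=0.01), path_variability(path))
```

For this straight, evenly sampled path, the tremor factor and path variability are both zero, and the undershooting factor is one half, since the reach covers half the distance to the target.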

Figure 31:

Calculating actor error.

Figure 32:

Calculating path variability.

Appendix B:  An Altered Form of Dynamics in the No-Go Regime

Of the three regimes described by equations 3.5, in the no-go regime, the BG output at time t is changed such that the change is the negative of the change that occurred in the previous step (t − 1). Since the traditional definition of no-go is to withhold movement and not to reverse movement, we consider an alternative form of no-go dynamics in which there is no change in BG output. Thus, the new regimes (equations B.1) retain the go and explore dynamics of equations 3.5 and set the change in BG output to zero in the no-go regime, with all other quantities defined as before in equations 3.5. With this variation of the no-go regime, we repeat the simulations of the normal case described in section 4.1. The results obtained are shown in Figures 34 through 39. Comparing Figures 34 to 39 with Figures 2 to 8 of section 4.1, we note that most results are similar, except for one significant difference: the reaching time depicted in Figure 36 is somewhat longer than the reaching time depicted in Figure 5. Therefore, for the pathological studies, we use only the no-go regime as it is depicted in equations 3.5 and do not pursue the no-go regime as it is depicted in equations B.1.

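
The two forms of no-go dynamics can be contrasted in a minimal sketch. Only the no-go branches follow the description in the text; the go branch (continuing the previous change) and the explore branch (applying a supplied exploratory term) are assumptions made to keep the example self-contained, and the function and parameter names are hypothetical.

```python
def bg_update(regime, prev_change, explore_term, nogo_mode="reverse"):
    """Return the change in BG output for one time step.

    nogo_mode="reverse": negate the previous change (no-go as described for equations 3.5).
    nogo_mode="hold":    no change in BG output (the altered form, equations B.1).
    """
    if regime == "go":
        return prev_change       # assumption: keep moving in the same direction
    if regime == "explore":
        return explore_term      # assumption: exploratory perturbation
    if regime == "no-go":
        return -prev_change if nogo_mode == "reverse" else 0.0
    raise ValueError(f"unknown regime: {regime}")

# The same no-go step under the two forms of dynamics.
print(bg_update("no-go", 0.2, 0.0, nogo_mode="reverse"))
print(bg_update("no-go", 0.2, 0.0, nogo_mode="hold"))
```

In the reversing form, a no-go step undoes the previous change, whereas in the holding form the BG output simply stays where it is, which is consistent with the somewhat longer reaching times reported for the altered dynamics.
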
Figure 33:

Calculating undershooting.

Figure 34:

Trajectory during the first epoch (under altered no-go form of dynamics). Trajectory at the last epoch (under altered no-go form of dynamics).

Figure 35:

Trajectory after training (MC alone, no BG contribution, under altered no-go form of dynamics).

Figure 36:

Variation of average reaching duration with learning in normal conditions (under altered no-go form of dynamics).

Figure 37:

Variation of path variability with learning in normal conditions (under altered no-go form of dynamics).

Figure 38:

Variation of output error of MC with learning in normal conditions (under altered no-go form of dynamics).

Figure 39:

Average time spent in various BG regimes (under altered no-go form of dynamics).

References

Albin
,
R. L.
,
Reiner
,
A.
,
Anderson
,
K. D.
,
Penney
,
J. B.
, &
Young
,
A. B.
(
1990
).
Striatal and nigral neuron subpopulations in rigid Huntington's disease: implications for the functional anatomy of chorea and rigidity—akinesia
.
Ann. Neurol.
,
27
,
357
365
.
Alexander
,
G. E.
(
1987
).
Selective neuronal discharge in monkey putamen reflects intended direction of planned limb movements
.
Experimental Brain Research
,
67
,
623
634
.
Berardelli
,
A.
,
Noth
,
J.
,
Thompson
,
P. D.
,
Bollen
,
E. L. E. M
,
Curra
,
A.
, &
Deuschl
,
G.
, et al
(
1999
).
Pathophysiology of chorea and bradykinesia in Huntington's disease
.
Movement Disorders
,
14
,
398
403
.
Berardelli
,
A.
,
Rothwell
,
J. C.
,
Thompson
,
P. D.
, &
Hallett
,
M.
(
2001
).
Pathophysiology of bradykinesia in Parkinson's disease
.
Brain
,
124
,
2131
2146
.
Bergman
,
H.
,
Wichmann
,
T.
,
Karmon
,
B.
, &
DeLong
,
M. R.
(
1994
).
The primate subthalamic nucleus. II. Neuronal activity in the MPTP model of Parkinsonism
.
Journal of Neurophysiology
,
72
,
507
520
.
Beuter
,
A.
, &
Vasilakos
,
K.
(
1995
).
Is Parkinson's disease a dynamical disease?
Chaos
,
5
,
35
42
.
Bischoff
,
A.
(
1998
).
Modeling the basal ganglia in the control of arm movements
.
Unpublished doctoral dissertation, University of Southern California
.
Brown
,
R. G.
, &
Jahanshahi
,
M.
(
1996
).
Cognitive-motor function in Parkinson's disease
.
European J. Neurology.
,
36
,
24
31
.
Brown
,
P.
,
Oliviero
,
A.
,
Mazzone
,
P.
,
Insola
,
A.
,
Tonali
,
P.
, &
Di Lazzaro
,
V.
(
2001
).
Dopamine dependency of oscillations between subthalamic nucleus and pallidum in Parkinson's disease
.
Journal of Neuroscience
,
21
,
1033
1038
.
Buhusi
,
C. V.
, &
Meck
,
W. H.
(
2005
).
What makes us tick? Functional and neural mechanisms of interval timing
.
Nature Reviews Neuroscience
,
6
,
755
756
.
Chakravarthy
,
V. S.
,
Joseph
,
D.
, &
Bapi
,
S. R.
(
in press
).
What do the basal ganglia do? A modeling perspective
.
Biological Cybernetics
.
Chaudhuri
,
A.
&
Behan
,
P. O.
(
2000
).
Fatigue in neurological disorders
,
Lancet
,
179
,
34
42
.
Clark
,
D.
,
Boutros
,
N.
, &
Mendez
,
M.
(
2005
).
The brain and behavior
.
Cambridge
:
Cambridge University Press
.
Cohen
,
J. D.
,
Braver
,
T. S.
, &
Brown
,
J. W.
(
2002
).
Computational perspectives on dopamine function in prefrontal cortex
.
Current Opinion in Neurobiology
,
12
,
223
229
.
Cutsuridis
,
V.
, &
Perantonis
,
S.
(
2006
).
A neural network model of Parkinson's disease bradykinesia
.
Neural Networks
,
19
,
354
374
.
Daw
,
N. D.
,
O'Doherty
,
J. P.
,
Seymour
,
B.
,
Dayan
,
P.
, &
Dolan
R. J.
, (
2006
).
Cortical substrates for exploratory decisions in humans
.
Nature
,
441
,
876
879
.
Desmurget
,
M.
,
Grafton
,
S.T.
,
Vindras
,
P.
,
Gréa
,
H.
, &
Turner
,
R. S.
(
2003
).
Basal ganglia network mediates the control of movement amplitude
.
Exp. Brain Res.
,
153
,
197
209
.
Dounskaia
,
N.
,
Fradet
,
L.
,
Lee
,
G.
,
Leis
,
B. C.
, &
Adler
,
C. H.
(
2009
).
Submovements during pointing movements in Parkinson's disease
.
Exp Brain Res.
,
193
,
529
544
.
Dreher
,
J. C.
, &
Burnod
,
Y.
(
2002
).
An integrative theory of the phasic and tonic modes of dopamine modulation in the prefrontal cortex
.
Neural Networks
,
15
,
583
602
.
Edwards
,
R.
,
Beuter
,
A.
, &
Glass
,
L.
(
1999
).
Parkinsonian tremor and simplification in network dynamics
.
Bulletin of Mathematical Biology
,
61
,
157
177
.
Flowers
,
K. A.
(
1976
).
Visual “closed-loop” and “open-loop” characteristics of voluntary movement in patients with Parkinsonism and intention tremor
.
Brain
,
104
,
167
186
.
Frank
,
M. J.
(
2005
).
Dynamic dopamine modulation in the basal ganglia: A neurocomputational account of cognitive deficits in medicated and non-medicated Parkinsonism
.
Journal of Cognitive Neuroscience
,
17
,
51
72
.
Gangadhar
,
G.
,
Joseph
,
D.
, &
Chakravarthy
,
V. S.
(
2008
).
Understanding Parkinsonian handwriting using a computational model of basal ganglia
.
Neural Computation
,
20
,
1
35
.
Georgopoulos
,
A. P.
,
Kalaska
,
J. F.
, &
Massey
,
J. T.
(
1981
).
Spatial trajectories and reaction times of aimed movements: Effects of practice, uncertainty, and change in target location
.
J. Neurophysiology
,
46
,
725
743
.
Goldberger
,
A. L.
,
Rigney
,
D. R.
, &
West
,
B. J.
(
1990
, February).
Chaos and fractals in human Physiology
.
Scientific American
, p.
34
.
Grace, A. A. (1991). Phasic versus tonic dopamine release and the modulation of dopamine system responsivity: A hypothesis for the etiology of schizophrenia. Neuroscience, 41, 1-24.
Gurney, K., Redgrave, P., & Prescott, T. J. (2001). A computational model of action selection in the basal ganglia II. Analysis and simulation of behaviour. Biological Cybernetics, 84, 411-423.
Hamani, C., Saint-Cyr, J. A., Fraser, J., Kaplitt, M., & Lozano, A. M. (2004). The subthalamic nucleus in the context of movement disorders. Brain, 127, 4-20.
Hikosaka, O., Takikawa, Y., & Kawagoe, R. (2000). Role of the basal ganglia in the control of purposive saccadic eye movements. Physiol. Rev., 80, 953-978.
Hurtado, J. M., Gray, C. M., Tamas, L. B., & Sigvardt, K. A. (1999). Dynamics of tremor-related oscillations in the human globus pallidus: A single case study. Neurobiology, 96, 1674-1679.
Isoda, M., & Hikosaka, O. (2008). Role for subthalamic nucleus neurons in switching from automatic to controlled eye movement. Journal of Neuroscience, 28, 7209-7218.
Joel, D., Niv, Y., & Ruppin, E. (2002). Actor-critic models of the basal ganglia: New anatomical and computational perspectives. Neural Networks, 15, 535-547.
Kuhl, P. K., & Meltzoff, A. N. (1996). Infant vocalizations in response to speech: Vocal imitation and developmental change. Journal of the Acoustical Society of America, 100, 2425-2438.
Levy, R., & Dubois, B. (2005). Apathy and the functional anatomy of the prefrontal cortex: Basal ganglia circuits. Cerebral Cortex, 16, 916-928.
Magnin, M., Morel, A., & Jeanmonod, D. (2000). Single-unit analysis of the pallidum, thalamus and subthalamic nucleus in Parkinsonian patients. Neuroscience, 96, 549-564.
May, R. M. (1976). Simple mathematical models with very complicated dynamics. Nature, 261, 459.
Meltzoff, A. N., & Moore, M. K. (1997). Explaining facial imitation: A theoretical model. Early Development and Parenting, 6, 179-192.
Montague, P. R., Dayan, P., & Sejnowski, T. J. (1996). A framework for mesencephalic dopamine systems based on predictive Hebbian learning. Journal of Neuroscience, 16, 1936-1947.
Nambu, A. (2008). Seven problems on the basal ganglia. Current Opinion in Neurobiology, 18, 1-10.
Nini, A., Feingold, A., Slovin, H., & Bergman, H. (1995). Neurons in the globus pallidus do not show correlated activity in the normal monkey, but phase-locked oscillations appear in the MPTP model of Parkinsonism. Journal of Neurophysiology, 74, 1800-1805.
Obeso, J. A., Rodriguez-Oroz, M. C., Blesa, F. J., & Guridi, J. (2006). The globus pallidus pars externa and Parkinson's disease. Ready for prime time? Experimental Neurology, 202, 1-7.
Packard, M. G., & Knowlton, B. J. (2002). Learning and memory functions of the basal ganglia. Annu. Rev. Neurosci., 25, 563-593.
Palop, J. J., Chin, J., & Mucke, L. (2006). A network dysfunction perspective on neurodegenerative diseases. Nature, 443, 768-773.
Pasupathy, A., & Miller, E. K. (2005). Different time courses of learning-related activity in the prefrontal cortex and striatum. Nature, 433, 873-876.
Plenz, D., & Kitai, S. T. (1999). A basal ganglia pacemaker formed by the subthalamic nucleus and external globus pallidus. Nature, 400, 677-682.
Redgrave, P., Prescott, T. J., & Gurney, K. (1999). The basal ganglia: A vertebrate solution to the selection problem? Neuroscience, 89, 1009-1023.
Schultz, W. (1998). Predictive reward signal of dopamine neurons. J. Neurophysiol., 80, 1-27.
Sheridan, M. R., & Flowers, K. A. (1990). Movement variability and bradykinesia in Parkinson's disease. Brain, 113, 1149-1161.
Sridharan, D., Prashanth, P. S., & Chakravarthy, V. S. (2006). The role of the basal ganglia in exploration in a neural model based on reinforcement learning. International Journal of Neural Systems, 16, 111-124.
Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
Terman, D., Rubin, J. E., Yew, A. C., & Wilson, C. J. (2002). Activity patterns in a model for the subthalamopallidal network of the basal ganglia. J. Neurosci., 22, 2963-2976.
Terry, R. D., Masliah, E., Salmon, D. P., Butters, N., DeTeresa, R., Hill, R., et al. (1991). Physical basis of cognitive alterations in Alzheimer's disease: Synapse loss is the major correlate of cognitive impairment. Annals of Neurology, 30(4), 572-580.
van Gemmert, A. W. A., Adler, C. H., & Stelmach, G. E. (2003). Parkinson's disease patients undershoot target size in handwriting and similar tasks. Journal of Neurology, Neurosurgery, and Psychiatry, 74, 1502-1508.
Walker, M. P., Ayre, G. A., Cummings, J. L., Wesnes, K., McKeith, G., O'Brien, J. T., et al. (2000). Quantifying fluctuation in dementia with Lewy bodies, Alzheimer's disease, and vascular dementia. Neurology, 54, 1616-1625.
White, O. B., Saint-Cyr, J. A., Tomlinson, R. D., & Sharpe, J. (1983). Ocular motor deficits in Parkinson's disease. II: Control of saccadic and smooth pursuit systems. Brain, 106, 571-587.
Yulmetyev, R. M., Demin, S. A., Panischev, O., Yu, P. H., Timashev, S. F., & Vstovsky, G. V. (2006). Regular and stochastic behavior of Parkinsonian pathological tremor signals. Physica A, 369, 655-678.