Impulsivity, i.e. irresistibility in the execution of actions, may be prominent in Parkinson's disease (PD) patients who are treated with dopamine precursors or dopamine receptor agonists. In this study, we combine clinical investigations with computational modeling to explore whether impulsivity in PD patients on medication may arise as a result of abnormalities in risk, reward and punishment learning. In order to empirically assess learning outcomes involving risk, reward and punishment, four subject groups were examined: healthy controls, ON medication PD patients with impulse control disorder (PD-ON ICD) or without ICD (PD-ON non-ICD), and OFF medication PD patients (PD-OFF). A neural network model of the Basal Ganglia (BG) that has the capacity to predict the dysfunction of both the dopaminergic (DA) and the serotonergic (5HT) neuromodulator systems was developed and used to facilitate the interpretation of experimental results. In the model, the BG action selection dynamics were mimicked using a utility function based decision making framework, with DA controlling reward prediction and 5HT controlling punishment and risk predictions. The striatal model included three pools of Medium Spiny Neurons (MSNs), with D1 receptor (R) alone, D2R alone and co-expressing D1R-D2R. Empirical studies showed that reward optimality was increased in PD-ON ICD patients while punishment optimality was increased in PD-OFF patients. Empirical studies also revealed that PD-ON ICD subjects had lower reaction times (RT) compared to that of the PD-ON non-ICD patients. Computational modeling suggested that PD-OFF patients have higher punishment sensitivity, while healthy controls showed comparatively higher risk sensitivity. A significant decrease in sensitivity to punishment and risk was crucial for explaining behavioral changes observed in PD-ON ICD patients. 
Our results highlight the power of computational modelling for identifying neuronal circuitry implicated in learning, and its impairment in PD. The results presented here not only show that computational modelling can be used as a valuable tool for understanding and interpreting clinical data, but they also show that computational modeling has the potential to become an invaluable tool to predict the onset of behavioral changes during disease progression.
Citation: Balasubramani PP, Chakravarthy VS, Ali M, Ravindran B, Moustafa AA (2015) Identifying the Basal Ganglia Network Model Markers for Medication-Induced Impulsivity in Parkinson's Disease Patients. PLoS ONE 10(6): e0127542. https://doi.org/10.1371/journal.pone.0127542
Academic Editor: Osama Ali Abulseoud, Mayo Clinic, UNITED STATES
Received: December 19, 2014; Accepted: April 16, 2015; Published: June 4, 2015
Copyright: © 2015 Balasubramani et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: The authors have no support or funding to report.
Competing interests: The authors have declared that no competing interests exist.
Impulsivity is a multi-factorial problem that is assessed based on the capacity of an individual to accurately perform a goal-directed action, and their ability to inhibit action impulses from interfering with the execution of that action [1–3]. It is also defined as the tendency to act prematurely, and has been linked to motor and cognitive disorders. Tests for impulsiveness include action selection paradigms such as Go/NoGo tasks, and activities assessing response alternation due to delays, contingency degradation, or devaluation [4,5]. All of these tests measure the subject's capacity to optimise the trade-off between speed and accuracy. Impulsive behaviors appear in these tasks as shorter reaction times, less behavioral inhibition of non-optimal actions, less perseveration, and higher delay discounting [6–8]. Impulsivity is also the hallmark of several other psychiatric disorders such as attention deficit hyperactivity disorder, aggression, substance abuse, and obsessive compulsive disorder.
Impulsivity in Parkinson's disease
Parkinson's disease (PD) is characterised by the loss of dopaminergic (DA) neurons in the substantia nigra pars compacta (SNc) [9,10]. The key motor symptoms that mark PD are tremor, rigidity, and akinesia; advanced cases may also exhibit freezing of gait [11,12]. However, non-motor symptoms such as cognitive dysfunction, behavioral and sleep disorders, dysautonomia, and psychiatric disorders such as depression and anxiety are also common in these patients. A class of patients suffers from an inability to resist an inappropriate hedonic drive, eventually resulting in the performance of unfavorable actions with harmful consequences. This inability is termed impulse control disorder (ICD), and is displayed in around 14% of ON medication PD (PD-ON) patients, who are mostly treated with DA agonists. ICDs include pathological gambling, compulsive shopping, binge eating, punding, overuse of dopaminergic medication, and over-engaging in meaningless hobby-like activities. Reducing the medication can induce withdrawal symptoms, thus demanding an optimal therapy that ameliorates both the motor and the non-motor symptoms in PD.
Neural substrates identified for impulsivity
Reported neural substrates of impulsivity include cortical structures such as the prefrontal cortex and orbito-frontal cortex, as well as subcortical structures such as the striatum, subthalamic nucleus (STN), and globus pallidus externa and interna (GPe and GPi) of the basal ganglia (BG) [8,16]. In-vivo neurochemical analysis in rats performing a serial reaction time task revealed that dysfunction of neuromodulators such as DA and serotonin (5HT) in the fronto-striatal circuitry is associated with impulsivity. Specifically, receptors such as DA D2 and 5HT 1, 2, and 6 have been shown to contribute significantly to impulse control disorder [14,17,18]. Computational modelling can be used to better understand the contribution of these structures and neurochemicals to impulsive decision making, as described below.
Computational modelling of neural substrates of impulsivity
Since PD-ON ICD is primarily linked to impairment in DA signalling and BG function, several contemporary models of PD-ON ICD have focused on the role of the BG, often using a reinforcement learning (RL) framework [19,20]. In this framework, learning is driven by rewards and punishments obtained as a result of executing actions [21,22]. The prediction error, which is the difference between expected and received rewards, is signalled by DA. There is evidence that mesencephalic DA signalling codes for the temporal prediction error of the reinforcement learning framework [23–25]. Such a prediction error facilitates the computation of "goodness" measures such as the value function associated with an action. The value function refers to the expected sum of the future rewards obtained on executing actions. Functional imaging studies suggest that value is computed in the striatum of the BG [26,27]. This computation is thought to be achieved by combining the reward prediction error information sent from the SNc to the striatum with the cortical state conveyed to the striatum by corticostriatal projections [20,26–29].
PD-ON ICD patients are reported to display exaggerated reward learning and attenuated punishment learning. This is in contrast to OFF-medication PD patients (PD-OFF), who are more sensitive to punishments than to rewards. Evidence suggests that phasic DA signals are necessary for reward and punishment learning. While positive phasic DA signals are necessary for reward learning, negative phasic DA signals (and the duration of the phasic dip) are needed for punishment learning. The loss of dopaminergic neurons and decreased levels of DA are known to amplify phasic dips of DA and hence promote punishment learning. In contrast, medication increases the basal firing of the dopaminergic neurons and the availability of tonic DA, thereby promoting reward learning. The opponency between the direct and indirect pathways of the BG, mediated by the DA available for a particular subject type, is used by several models to explain ICD behavior [20,32–34].
Some models account for these differences in reward / punishment learning between PD-ON and PD-OFF patients by invoking different learning rates for learning from positive and negative feedback. According to one model, ICD is an effect of automaticity of the stimulus-response relationship, which becomes insensitive to the outcome; thus ICD is thought to be a form of habitual action. Another model, which belongs to the actor-critic family of BG models, localizes the critic module (which evaluates the rewards associated with an action) to the ventral striatum, and the actor module (which provides an executable plan for performing actions) to the dorsal striatum. A dysfunction in the critic module has been proposed to explain the impaired stimulus-response relationship in PD-ON ICD cases. Some other models use the matching law to relate the probability of selecting one of two given alternatives to both the relative magnitudes and the relative delays of the reinforcers associated with the alternatives. Preference for a choice increased with the magnitude of the associated reinforcer, but decreased with the delay associated with the reinforcer. Increased sensitivity to delays was predicted to increase impulsive behavior in that study.
Our modeling approach
In the case of medication-induced impulsivity in PD patients, many experiments report a non-significant role of DA in medication-induced forms of impulsivity, for example in delay discounting tasks [35–38]. Some experiments suggest that an impaired balance between 5HT and DA is the root of impulsivity [39–42]. Additionally, several experimental studies relate central 5HT and functional polymorphisms of the 5HT transporter gene to impulsivity. Thus the aetiology of ICD in PD should involve dysfunction in both the 5HT and DA systems [7,8]. Therefore a modeling approach that is based solely on DA-mediated dynamics in the BG should ideally be expanded to include the 5HT system for a better representation of the experimentally observed behavior. Most of the models reviewed above consider only DA dysfunction to explain impulsive behavior. There is clearly a need for a model that unifies the contributions of other neuromodulators such as 5HT in addition to DA, in order to gain a comprehensive understanding of impulsivity.
In this study, we propose a unified computational network model of the BG that can mimic impulsivity disorder. The model is cast in the RL framework. It explicitly includes anatomical modules such as the striatum, GPe, GPi and STN [43–45]. In addition to these anatomical components, the model also incorporates the roles of two key neuromodulators implicated in ICD: DA and 5HT. In line with classical RL-based models of the BG, the DA signal corresponds to the reward prediction error in the present model. Invoking the natural relationship between impulsivity and risk-seeking [46,47], we borrow elements from a recent model that links 5HT and risk-based decision making, and incorporate them in the proposed model.
The paper is outlined as follows: Section 2 deals with the materials and methods along with the model approach. Section 3 is concerned with the experimental and the modelling results, which are then discussed in Section 4.
Materials and Methods
This study was part of a larger project conducted at Ain Shams University Hospital, Cairo, Egypt. Seventy-six participants were recruited for the project, which contained 160 trials of a probabilistic learning task. The subjects included (1) PD patients tested OFF medication (PD-OFF, n = 26, 6 females); (2) PD patients without ICD tested ON medication (PD-ON non-ICD, n = 14, 3 females); (3) PD patients with ICD tested ON medication (PD-ON ICD, n = 16, 2 females); and (4) healthy controls (n = 20, 3 females). The healthy control participants did not have any history of neurological or psychiatric disorders. The PD-OFF group was withdrawn from medications for a period of at least 18 hours. The majority of ON-medication patients were taking dopamine precursors (levodopa-containing medications) and D2 receptor agonists, specifically, Requip, Mirapex, Stalevo, Kepra, and C-Dopa. The mean disease duration was 8.35, 9.56, and 9.8 years for PD-ON non-ICD, PD-ON ICD, and PD-OFF patients, respectively. All participants gave written informed consent and the study was approved by the ethical board of Ain Shams University.
The Unified Parkinson’s Disease Rating Scale (UPDRS) was used to measure the severity of PD. UPDRS scores for all patients were measured ON medication. All participants were also tested for intact cognitive function and absence of dementia with the Mini-Mental Status Exam (MMSE). There was no significant difference among the patient groups in their UPDRS scores (F(2,63) = 0.5432, p = 0.5836) or their MMSE scores (F(2,63) = 0.5432, p = 0.5836). Furthermore, there were no significant differences between the patient groups on the North American Adult Reading Test, the Beck Depression Inventory, or the forward and backward digit span tasks (p > 0.05 in each case using one-factor ANOVA). The scores of the patient groups on the Barratt impulsiveness scale differed significantly (F(2,63) = 9.3264, p = 0.0003). A post hoc two-tailed t-test showed that the ICD patients contributed most to the differences observed in the scores.
The experimental paradigm encompasses probabilistic reward and punishment learning. There were 160 trials; in each trial, one of four different stimuli (I1, I2, I3, and I4) was presented in a pseudorandomized manner. The participants were asked to categorise each stimulus as response A or B. Two stimuli (I1 and I2) were used for testing reward learning, and the other two stimuli (I3 and I4) were used for testing punishment learning. An outcome followed every response, and an optimal response is the one maximising the observed outcome. In reward trials, an optimal response led to +25 points 80% of the time and no reward on the remaining 20% of trials. In contrast, a non-optimal response resulted in +25 points only 20% of the time. In punishment trials, an optimal response resulted in no reward 80% of the time and -25 points 20% of the time, whereas a non-optimal response resulted in -25 points 80% of the time (Table 1, Fig 1). This experiment has previously been performed with PD patients and healthy control subjects, but the present study extends the same experimental setup to analyse the subjects' reaction times.
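The outcome contingencies described above can be sketched in a few lines of Python. This is only an illustration of the stated probabilities; the function and constant names are hypothetical and not taken from the study's materials.

```python
import random

# Stimulus roles as described in the text (names are illustrative).
REWARD_STIMULI = ("I1", "I2")
PUNISHMENT_STIMULI = ("I3", "I4")


def sample_outcome(stimulus, optimal, rng):
    """Sample the points delivered on one trial.

    Reward stimuli give +25 with probability 0.8 (optimal response)
    or 0.2 (non-optimal response), otherwise 0.
    Punishment stimuli give -25 with probability 0.2 (optimal response)
    or 0.8 (non-optimal response), otherwise 0.
    """
    if stimulus in REWARD_STIMULI:
        p_win = 0.8 if optimal else 0.2
        return 25 if rng.random() < p_win else 0
    if stimulus in PUNISHMENT_STIMULI:
        p_loss = 0.2 if optimal else 0.8
        return -25 if rng.random() < p_loss else 0
    raise ValueError(f"unknown stimulus: {stimulus}")


def expected_points(stimulus, optimal):
    """Expected points per trial implied by the contingencies above."""
    if stimulus in REWARD_STIMULI:
        return 25 * (0.8 if optimal else 0.2)
    if stimulus in PUNISHMENT_STIMULI:
        return -25 * (0.2 if optimal else 0.8)
    raise ValueError(f"unknown stimulus: {stimulus}")
```

Under these contingencies the optimal response is worth 15 points more per trial, in expectation, than the non-optimal response for both reward and punishment stimuli.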
The highlighted circles denote instances of the response selected for receiving an outcome. The images are represented by I1, I2, I3, I4, whose details are provided in Table 1. The outcomes are presented to the subjects as "You Lose 25 Points", "You Win 25 Points", or none.
Our earlier modeling study showed that the role of the BG in risk-based decision making can be efficiently modeled using utility-based learning, rather than value-based learning alone [19,20,52]. In utility-based learning, the utility of a state and action pair is a combination of its value function and risk function. The state referred to here is the cortical state that forms the input to the BG, and the action refers to the behavioral response. The striatum of the BG receives input from wide areas of cortex including the pre-frontal cortex, orbito-frontal cortex, and sensory-motor cortices. These nuclei also receive numerous DA and 5HT projections that are proposed to control the perception of the value and the variance / risk associated with the sampled rewards, respectively. The striatal neurons then project to the GPe, STN and GPi through the direct or indirect pathways, which together contribute to the action selection dynamics. The framework used in this study is adapted from classical BG models as proposed in [53–56]. A detailed schematic representation of the current model is provided in Fig 2.
The BG model components shown are striatum, GPe, GPi, and STN along with SNc, DRN, and Thalamus. The schematic also denotes various DA and 5HT model correlates, as described in the Section: Model framework. The inset details the notations used in model section for representing cortico-striatal weights (w) and responses (y) of various kinds of MSNs (D1R expressing, D2R expressing, and D1R-D2R co-expressing) in the striatum, with a sample cortical state size of 4, and maximum number of action choices available for performing selection in every state as 2.
While the value function represents the expected reward, the risk function tracks the variance of rewards and reward prediction errors over time [28,45,57,58]. Using a utility-based approach that combines value and risk, it was possible to model experiments on reward-punishment learning, the time scale of reward prediction, and risk-based learning. Moreover, the study also reconciles the multifarious roles of 5HT in the BG, as instantiated in these experiments [59–61], within a single framework. The seemingly unrelated roles of 5HT in controlling behavioral inhibition, the time scale of reward / punishment predictions that controls sensitivity to delays in receiving outcomes, and risk learning were captured in our model of utility-based decision making, in which 5HT is modeled as a parameter affecting the risk prediction error. Hence the current study models 5HT as controlling the risk function, and uses the classical representation of DA as controlling the reward prediction error. We borrow these key ideas from Balasubramani et al. (2014) and present here a detailed network model of the BG (Fig 2) to understand the behavioral data collected from PD patients and healthy controls.
The utility function proposed in Balasubramani et al. (2014) is given in Eq (1), where U, Q, and h are respectively the utility, action value, and risk functions associated with a state 's' and action 'a' at time 't'. Risk sensitivity is controlled by the parameter α in Eq (1), which is proposed to represent the neuromodulator 5HT.
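As a rough illustration of how a utility of this kind combines value and risk, the sketch below assumes the form U = Q - α · sign(Q) · sqrt(h). The square-root risk term and the sign(Q) factor are assumptions based on the cited utility framework and on the sign(Q) term discussed for the network model; the body of Eq (1) is not reproduced here, so this should not be read as the authors' exact expression.

```python
import math


def utility(q, h, alpha):
    """Assumed utility: value minus a risk penalty weighted by alpha.

    q     : action value Q(s, a)
    h     : risk (variance-like, non-negative) h(s, a)
    alpha : risk sensitivity, the model correlate of 5HT
    """
    sign_q = (q > 0) - (q < 0)  # +1, 0 or -1
    return q - alpha * sign_q * math.sqrt(h)
```

With α > 0, a positive-valued option is discounted by its risk (risk aversion for gains), while a negative-valued option is made less negative by its risk (risk seeking for losses).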
This lumped model has been extended to the BG network model, with the value and the risk functions computed by the medium spiny neurons (MSNs) in the striatum. Our earlier study proposed that striatal MSNs expressing the DA D1 receptor (D1R) code for the value function, while the MSNs co-expressing both D1R and D2R (D1R-D2R) code for the risk function. Whereas the D1R MSNs project via the direct pathway (DP) to GPi, the D2R and the D1R-D2R co-expressing MSNs project to the GPe in the indirect pathway (IP) (Fig 2).
The outputs of the different kinds of MSNs (D1R expressing, D2R expressing, and D1R-D2R co-expressing) are represented by the variables yD1, yD2, and yD1D2, respectively, in Eq (2). The subscript t denotes the time of response.
In the above equations, 'x' is a logical variable equal to 1 for the current state, st, i.e., x(si) = 1 if si = st (see Fig 2 inset). The utility, U, is then obtained from the network model as described in Eq (3). In Eq (3), the risk sensitivity parameter is αD1D2, which denotes the specific modulation by 5HT of the D1R-D2R co-expressing MSNs coding the risk function. The model DA parameter is used for updating the cortico-striatal weights, and also for controlling the switching at GPi. The model thereby postulates multiple forms of DA and 5HT signals, each of which has a differential action on the D1R, D2R, and D1R-D2R MSNs, as detailed later in this section. The bi-directional connectivity in the STN-GPe system, which facilitates complex oscillations and "exploratory" behavior, is also captured in this model. We now present equations for the individual modules of the proposed network model of the BG. The reader is referred to our earlier studies for more details [43,45].
Model components: Striatum.
The striatum is proposed to have three types of MSNs (D1R expressing, D2R expressing, and D1R-D2R co-expressing), all of which have gain functions (λ) as described in Eq (4). The constants c1, c2, c3 vary with the receptor type. The value function (Q) requires a gain that increases continuously as a function of DA, which is shown to occur in the D1R-containing MSNs. The risk function (h) [28,45,57,58] requires a gain that increases with the magnitude of DA, i.e. a 'U' shaped gain function that gives an increased response with increasing δ². It is plausible that such risk-type gain functions are exhibited by neurons that co-express both the D1R-like gain function, which increases as a function of DA, and the D2R-like gain function, which decreases as a function of DA [63–66], as identified in a recent experimental study. The gain function of the D2R MSNs, whose activity decreases as a function of DA, makes them suitable for punishment computation, in opposition to the D1R MSNs, which respond positively to the reward prediction error (DA).
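The argument that co-expression could yield a 'U' shaped gain can be made concrete with a toy example. The sketch below uses shifted logistic curves as stand-ins for the D1R-like and D2R-like gains; the logistic form and the `slope` and `shift` values are assumptions chosen for illustration only, and are not the gain functions or constants of Eq (4).

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def gain_d1(delta, slope=4.0, shift=1.0):
    """D1R-like gain: increases with the DA signal delta."""
    return logistic(slope * (delta - shift))


def gain_d2(delta, slope=4.0, shift=1.0):
    """D2R-like gain: decreases with the DA signal delta."""
    return logistic(-slope * (delta + shift))


def gain_d1d2(delta, slope=4.0, shift=1.0):
    """Co-expressing gain: sum of the two, low near delta = 0 and
    high for large |delta|, i.e. 'U' shaped in delta."""
    return gain_d1(delta, slope, shift) + gain_d2(delta, slope, shift)
```

In this toy version the summed gain is symmetric in δ, so it responds to the magnitude of the prediction error rather than its sign, which is the property required of a risk-coding unit.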
The weight update equations for a given (state, action) pair in the different kinds of MSNs are provided in Eq (5).
The δ in the weight update equations is computed for the immediate reward condition as provided in Eq (6). It represents the form of DA activity that updates the cortico-striatal weights and is the classical temporal difference (TD) error [21,68].
In the network model of the STN-GPe system, the STN and GPe layers have an equal number of neurons, with each neuron in the STN uniquely and bidirectionally connected to a neuron in the GPe. Both the STN and GPe layers are assumed to have weak lateral connections within the layer. The number of neurons in the STN (or GPe) (Fig 2) is taken to be equal to the number of possible actions for any given state, n [69,70]. The dynamics of the STN-GPe network are given in Eq (7), with the following notation:
- internal state (same as the output) representation of ith neuron in GPe;
- internal state representation of ith neuron in STN, with the output represented by ;
WGPe - lateral connections within GPe, equated to a small negative number ϵg for both the self (i = j) and non-self (i ≠ j) connections for every GPe neuron i.
WSTN - lateral connections within STN, equated to a small positive number ϵs for all non-self (i ≠ j) lateral connections, while the weight of self-connection (i = j) is equal to 1+ ϵs, for each STN neuron i.
Both STN and GPe are modeled to have complete internal connectivity with every neuron in a layer connected to every other neuron in that layer with the same connection strength. That common lateral connection strength is ϵs for STN, and ϵg for GPe. Likewise, STN and GPe neurons are connected in a one-to-one fashion–ith neuron in STN is connected to ith neuron in GPe and vice-versa. For all the simulations presented below, we set ϵg = -ϵs; the learning rates 1 / τS = 0.1; 1 / τg = 0.033; and the slope λSTN = 3; ϵs = 0.12.
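The lateral connectivity just described can be written out explicitly. The following sketch builds the two weight matrices from the stated rules and the stated value ϵs = 0.12 (with ϵg = -ϵs); the function name and the use of nested lists are illustrative choices, and the network dynamics of Eq (7) are not implemented here.

```python
def stn_gpe_lateral_weights(n, eps_s=0.12):
    """Build the lateral weight matrices for n STN and n GPe neurons.

    W_GPe: eps_g for every pair (i, j), including self-connections.
    W_STN: eps_s for i != j, and 1 + eps_s on the diagonal (i == j).
    eps_g is set to -eps_s, as in the simulations described above.
    """
    eps_g = -eps_s
    w_gpe = [[eps_g for _ in range(n)] for _ in range(n)]
    w_stn = [
        [(1.0 + eps_s) if i == j else eps_s for j in range(n)]
        for i in range(n)
    ]
    return w_stn, w_gpe
```

For the two-action task used here (n = 2), each matrix is 2 × 2: the STN matrix has 1.12 on the diagonal and 0.12 elsewhere, and every GPe entry is -0.12.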
The DP and IP projections to GPi.
Description of the parameters αD1, αD2, αD1D2 in Eqs (8 and 9): The specificity of expression of the neuromodulator 5HT with respect to a particular type of MSN is not known [71–74]. In the present model, 5HT is thought to modulate the activity of all three kinds of MSNs (D1R expressing, D2R expressing and D1R-D2R co-expressing). Hence the modeling correlates of 5HT are the parameters αD1 (Eq (8)), αD2, and αD1D2 (Eq (9)), which modulate the output of the D1R, D2R and D1R-D2R MSNs respectively, and they represent the tonic 5HT modulation exerted by the dorsal raphe nucleus (DRN) [75–77]. The utility function described in Eq (3) specifically involves the 5HT parameter αD1D2, which represents the selective modulation by 5HT of the risk-coding D1R-D2R MSNs; it does not involve the αD2 parameter, which represents the effect of 5HT on D2R MSNs in the striatum.
The variables yD1,t, yD2,t, yD1D2,t as a function of state (s) and action (a) at time, t, are obtained from Eq (2).
Description of the parameters λD1, λD2, λD1D2 in Eqs (8 and 9): The D2R and the D1R-D2R MSNs form part of the striatal matrisomes known to project to the IP, while the D1R MSNs project to the DP [69,71,72,78,79]. It should also be noted that the λs used as gain factors in Eqs (8 and 9) have different parameters from the λs used in Eq (5). The gain functions in Eqs (8 and 9) are a function of the DA form that represents the temporal difference in the utility function, δU (Eq (10)). This is different from the DA form, δ, described in Eq (6).
Another correlate of DA (Fig 2) affecting the model is the sign(Q) term in Eqs (3 and 9), which is a form of the value function, Q [22,24]. This term ensures the non-linear risk sensitivity observed in subjects based on the nature of the outcomes: risk-averse for rewards and risk-seeking for punishments [28,46]. The utility difference form of DA (Eq (10)) is proposed to be computed in the SNc using the value inputs from the D1R MSNs and the risk inputs from the D1R-D2R MSNs, for a particular (state, action) pair. Hence, both the D1R and the D1R-D2R MSNs form a part of the striatal striosomes that contribute to the computation of the DA error signal in the SNc [69,71,72,78,79]. A summary of the different mathematical forms of DA and 5HT used in the present model is given in Table 2. Utility is thought to be computed in the SNc, where the projections from D1R and D1R-D2R MSNs converge; D2R MSNs are not modeled to project to the SNc [69,71,72,78,79] (Fig 2). Therefore the utility in Eq (3) is constructed as a summation of the value function computed by the D1R MSNs and the risk function computed by the D1R-D2R MSNs. The action selection dynamics at GPi, however, involve all three types of MSNs (D1R, D2R and D1R-D2R) through Eqs (8 and 9).
Action Selection at GPi.
Since D1R is activated at increased dopamine levels, higher dopamine levels favour activating DP (constituted by the projections of D1R MSNs) over IP. This is consistent with the nature of switching facilitated by DA in the striatum [81–84]. The relative weightage of the STN projections to GPi is represented by wSTN-GPi, and is set to 1 for all the GPi neurons in the current study.
Action Selection at Thalamus.
GPi neurons project to the thalamus through inhibitory connections. Hence the thalamic afferents can be expressed simply as a modified form of Eq (11), given in Eq (12).
These afferents in Eq (12) activate the thalamic neurons as described in Eq (13), in which the state of the ith thalamic neuron integrates its afferent input. The action selected is simply the 'i' (i = 1,2,..,n) whose thalamic state first crosses the threshold on integration. If several actions cross the threshold at the same time, the action with the maximum thalamic state at that time is selected. The reaction time (RT) associated with the trial is the number of iterations required for the thalamic state of the selected action to reach the threshold [85–87]. The threshold value used in the current simulation is 1.815.
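The selection rule and the RT readout can be sketched as a simple race to threshold. The version below assumes that each thalamic state accumulates a constant afferent input with a fixed step size; this plain accumulator and the `dt` and `max_iters` values are assumptions, since the integration dynamics of Eq (13) are not reproduced here. Only the threshold of 1.815, the tie-breaking rule, and the definition of RT as an iteration count come from the text.

```python
def select_action(afferents, threshold=1.815, dt=0.1, max_iters=10000):
    """Race-to-threshold action selection.

    afferents : list of constant inputs, one per thalamic neuron (action)
    Returns (selected_action_index, reaction_time_in_iterations),
    or (None, max_iters) if no state reaches the threshold.
    """
    state = [0.0] * len(afferents)
    for step in range(1, max_iters + 1):
        # Assumed integration rule: simple accumulation of the afferent.
        state = [s + dt * a for s, a in zip(state, afferents)]
        crossed = [i for i, s in enumerate(state) if s >= threshold]
        if crossed:
            # Ties at the same iteration go to the largest state.
            winner = max(crossed, key=lambda i: state[i])
            return winner, step
    return None, max_iters
```

In this sketch a stronger afferent for one action yields both a higher chance of selection and a shorter RT, which is how a single mechanism can account for accuracy and reaction time together.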
Modeling Parkinson’s disease.
The PD version of the proposed model has the following features (Eq (14)) for the OFF and ON medication conditions. PD pathology is associated with a substantial loss of SNc dopaminergic neurons. Since DA levels are lower in PD than in healthy controls, δ (Eq (6)) is clamped to an upper bound (δLim), and this marks the PD-OFF case. In the PD-ON case, there is a higher level of tonic DA available due to medication. This is modeled by simply adding a fixed constant (δMed, denoting the medication level) to the clamped δ [89–93].
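The two manipulations described for the PD model, clamping δ to an upper bound and adding a medication constant, can be sketched as follows. The function name and the condition labels are illustrative, and any numerical values passed for `delta_lim` and `delta_med` are placeholders rather than the fitted values of the model.

```python
def dopamine_signal(delta, condition, delta_lim=None, delta_med=0.0):
    """Return the DA signal used for learning under each condition.

    healthy : delta is passed through unchanged.
    PD-OFF  : delta is clamped to the upper bound delta_lim.
    PD-ON   : the clamped delta is shifted up by the constant delta_med.
    """
    if condition == "healthy":
        return delta
    if delta_lim is None:
        raise ValueError("delta_lim is required for PD conditions")
    clamped = min(delta, delta_lim)
    if condition == "PD-OFF":
        return clamped
    if condition == "PD-ON":
        return clamped + delta_med
    raise ValueError(f"unknown condition: {condition}")
```

Because only the upper end of δ is clamped, negative prediction errors pass through unchanged in the PD-OFF case, which is consistent with the relatively preserved punishment learning described for that group.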
Behavioural performance was assessed by analysing the optimality of participant responses and their reaction times. First, the proportions of optimal responses to reward and punishment stimuli were calculated for each participant. A one-way ANOVA revealed significant group differences in optimizing rewards (F(3,72) = 12.12, p = 1.64 × 10^-6) and punishments (F(3,72) = 3.76, p = 0.01) (Table 3). Post hoc analysis showed marked differences between the response distributions of PD-OFF and PD-ON ICD patients, both with optimality in reward learning (stimuli I1 and I2) as the factor of analysis (p = 2.23 × 10^-7) and with optimality in punishment learning (stimuli I3 and I4) as the factor of analysis (p = 0.003). That is, PD-ON ICD patients showed increased reward optimisation and decreased punishment optimisation relative to PD-OFF patients. The PD-ON non-ICD patients and healthy controls showed comparatively equal reward-based and punishment-based optimality.
A similar analysis was conducted on reaction times, revealing overall significant group differences (F(3,72) = 11.63, p = 2.65 × 10^-6), as shown in Table 3. The post hoc analysis showed this difference to be driven by the PD-ON non-ICD group, which had significantly longer RTs than the PD-OFF group (p = 7.39 × 10^-6), whilst the PD-ON ICD group did not differ significantly from healthy controls.
The network model described in the previous section is now applied to the experimental data. The reward of 25 points is simulated as r = +1, the punishment of -25 points as r = -1, and 0 points is simulated by r = 0. The four kinds of images (I1, I2, I3, I4) are simulated as states (s), and the two kinds of responses (choosing A or B) for a given image are simulated as actions (a) (Figs 1 and 2).
The experimental and simulation results showing the selection optimality in the task setup for the different subject groups are shown in Fig 3A. The experimental reaction time analysis for every subject group is provided in Fig 3B and is matched by our proposed model. The RT results from the simulation are shown in Fig 3C and 3D.
(a) The percentage optimality is depicted for the various subject categories as obtained from the experimental data and the simulations (run for 100 instances). The reaction times (RT) in msec across trials are also shown for (b) the experimental data and (c) the simulation. (d) The average RTs in msec across the subject groups are provided for both experiment and simulation. Outliers were removed beforehand using the iterative Grubbs test with p = 0.05. The similarity between the experiment and the simulation was analysed using a one-way ANOVA, with reward valence, punishment valence, and RT as factors of analysis. The analysis showed significant differences among the subject groups, as seen in the experimental data, but no significant difference between the simulation and the experiment. The subject categories healthy controls, PD-ON ICD, PD-ON non-ICD and PD-OFF are represented as HC, ON-ICD, ON-non-ICD, and OFF in the figures.
The modeling study suggests that optimising the parameters (Tables 4 and 5) related to DA (δLim and δMed in Eq (14)) and to 5HT (αD1, αD2, αD1D2 in Eqs (3, 8 and 9)) is essential to model ICD behavior in PD patients. The following are the key modeling results:
- Increased reward sensitivity in the PD-ON cases and increased punishment sensitivity in the PD-OFF cases are seen (Fig 3A).
- Decreased reaction times are seen in the ICD group of the PD-ON cases compared to the non-ICD PD-ON group (experiment: Fig 3B; simulation: Fig 3C; group averages: Fig 3D).
- The model correlates of both 5HT and DA have to be optimized to capture the reward-punishment sensitivity of PD patients. The 5HT+DA model (αD1D2 > 0) captures the experimental profile better than a DA-only model of the BG (αD1 = 1, αD2 = 1, αD1D2 = 0) (Table 5, S2 File, S3 File).
- The PD-ON ICD case required significantly reduced 5HT modulation of the striatal D2R (αD2) and D1R-D2R (αD1D2) MSNs.
- The PD-ON non-ICD case is explained in our model by an increased 5HT modulation of the D2R MSNs (αD2) and a decreased 5HT modulation of the D1R-D2R MSNs (αD1D2).
- A significant increase in the modulation of the D2R MSNs (αD2) characterizes the PD-OFF case of the model. All of the above comparisons are made with respect to the healthy controls.
Details of optimization.
To test whether the model can genuinely predict differences in reaction time among the four groups, given the selection accuracy alone, we performed the following steps:
- Step 1: First, we identified parameter sets that are optimal for a cost function based only on reward-punishment action selection optimality.
- Step 2: We then selected solutions from Step 1 that can also explain the desired RT measures. The resulting parameter set is then taken as the optimal solution to the problem for a specific group.
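The two-step selection above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure as implemented for the study: the candidate records, the field names `accuracy_cost` and `rt_error`, and the tolerance `accuracy_tol` are hypothetical, and the numbers are made up.

```python
from typing import Dict, List


def select_parameter_set(candidates: List[Dict], accuracy_tol: float) -> Dict:
    """Two-step selection: filter on accuracy cost, then choose on RT fit."""
    if not candidates:
        raise ValueError("no candidate parameter sets supplied")
    # Step 1: keep candidates whose accuracy cost is within a tolerance of the best.
    best_cost = min(c["accuracy_cost"] for c in candidates)
    step1 = [c for c in candidates if c["accuracy_cost"] <= best_cost + accuracy_tol]
    # Step 2: among the survivors, pick the one with the smallest RT mismatch.
    return min(step1, key=lambda c: c["rt_error"])


# Hypothetical candidates: alpha = (alpha_D1, alpha_D2, alpha_D1D2).
candidates = [
    {"alpha": (1.0, 1.0, 0.0), "accuracy_cost": 0.30, "rt_error": 5.0},
    {"alpha": (1.0, 0.5, 0.2), "accuracy_cost": 0.05, "rt_error": 40.0},
    {"alpha": (1.0, 0.4, 0.3), "accuracy_cost": 0.06, "rt_error": 12.0},
]
chosen = select_parameter_set(candidates, accuracy_tol=0.02)
```

The first candidate fits RT best but is discarded in Step 1 because it does not explain selection accuracy, which mirrors the ordering of the two steps.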
The parameters for each experiment are initially selected using grid search and then optimized using a genetic algorithm (GA) (details of the GA option set are given in S1 File). The optimized parameter set that explains the behavioral data in each subject group is provided in Table 5. The procedure followed for optimizing the key parameters in Table 5 using grid search is as follows:
- The parameters αD1, αD2, and αD1D2 are optimized in the model of healthy controls.
- For a model of PD-OFF, the parameters αD1, αD2, αD1D2, and δLim are optimized to match the experimental results. Setting the parameter δLim is a key addition to the PD-OFF model when compared to the healthy controls. This constraint reflects the deficit in DA availability in the model.
- Then, to explain the action selection accuracy and reaction times of the PD-ON ICD case, αD1, αD2, αD1D2 and δMed are optimized. The δLim value denoting the DA deficit is kept the same as that obtained for the OFF medication case.
- Finally, the behavior of the non-ICD PD-ON patients is captured in the model by optimising only the parameters [αD1, αD2, αD1D2]. As mentioned above, δLim is set to be the same in the PD-ON (ICD and non-ICD) and PD-OFF cases. Similarly, the medication level (δMed) is kept the same across the ICD and non-ICD categories of the PD-ON patients. Hence the parameters differentiating the PD-ON ICD and non-ICD subjects are [αD1, αD2, αD1D2].
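The staging of free and inherited parameters described above can be made explicit in code. The sketch below covers only the grid search stage (not the GA refinement), and everything named in it is an assumption for illustration: the identifiers `alpha_D1`, `delta_Lim` and so on, the grid values, and the placeholder `toy_cost`, which stands in for the model's actual cost function.

```python
from itertools import product

# Parameters that are free in each group's fit, following the staged procedure.
FREE_PARAMS = {
    "HC": ("alpha_D1", "alpha_D2", "alpha_D1D2"),
    "PD-OFF": ("alpha_D1", "alpha_D2", "alpha_D1D2", "delta_Lim"),
    "PD-ON ICD": ("alpha_D1", "alpha_D2", "alpha_D1D2", "delta_Med"),
    "PD-ON non-ICD": ("alpha_D1", "alpha_D2", "alpha_D1D2"),
}
# Parameters held fixed, and the earlier fit each one is copied from.
INHERITED = {
    "HC": {},
    "PD-OFF": {},
    "PD-ON ICD": {"delta_Lim": "PD-OFF"},
    "PD-ON non-ICD": {"delta_Lim": "PD-OFF", "delta_Med": "PD-ON ICD"},
}


def grid_search(group, grids, fitted, cost):
    """Exhaustive search over the free parameters of one group."""
    fixed = {name: fitted[src][name] for name, src in INHERITED[group].items()}
    names = FREE_PARAMS[group]
    best = None
    for values in product(*(grids[n] for n in names)):
        params = {**fixed, **dict(zip(names, values))}
        c = cost(group, params)
        if best is None or c < best[0]:
            best = (c, params)
    return best[1]


def toy_cost(group, params):
    # Placeholder cost (hypothetical): NOT the model's optimality/RT cost.
    return sum((v - 0.5) ** 2 for v in params.values())


all_names = ("alpha_D1", "alpha_D2", "alpha_D1D2", "delta_Lim", "delta_Med")
grids = {n: [0.0, 0.5, 1.0] for n in all_names}
fitted = {}
for group in ("HC", "PD-OFF", "PD-ON ICD", "PD-ON non-ICD"):  # order matters
    fitted[group] = grid_search(group, grids, fitted, toy_cost)
```

Fitting the groups in this order guarantees that δLim and δMed are available before a later group inherits them, so the ICD and non-ICD fits can differ only in the three α parameters.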
The aim of this study is to understand ICD in PD patients. Our experimental results suggest that the PD-ON ICD patients are more sensitive to rewards than to punishments. The PD-ON non-ICD patients showed no significant difference between reward and punishment learning, similar to the healthy controls. The PD-OFF patients, on the contrary, showed significantly higher learning from punitive outcomes than from rewarding outcomes (Fig 3A). Within the PD-ON group, the ICD patients showed shorter RTs than the non-ICD patients. The PD-OFF subjects were observed to have the shortest RT. Such trends in RT and in reward-punishment based action selection accuracy have been reported previously in similar studies on PD patients [20,30].
Application of the proposed network model to the experimental data suggests how impaired actions of DA and 5HT in the BG contribute to ICD behavior in the PD patients.
The proposed BG model uses a utility function framework to model action selection and the associated reaction times [85,87,95–98]. It extends the classical BG models proposed in [53–56]. The oscillatory dynamics of the STN-GPe is modeled using a simple Lienard oscillator [43,62,99]. In the model, the BG system is thought to compute the value and risk functions necessary for decision making [28,43]. Specifically, the D1R-containing MSNs compute the value function, whilst the MSNs co-expressing D1R-D2R compute the risk function. Anatomical studies in primates report that D1R-D2R co-expressing MSNs form a significant proportion of the striatal MSNs [71,72,78,100]. The MSNs of the striatum project through the direct and the indirect pathways to the BG's output nucleus, GPi. The GPi then relays to the thalamus. The time taken for the activity of the winning thalamic neuron to reach a threshold corresponds to the RT, while the index of the winning thalamic neuron corresponds to the action selected.
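The read-out rule at the end of this pathway (winning unit gives the action, time to threshold gives the RT) can be illustrated with a toy race between leaky integrators. This is a sketch under assumptions, not the model's GPi-thalamus dynamics: the integrator equation, the time constant `tau`, the threshold and the input values are all hypothetical.

```python
def race_to_threshold(inputs, threshold=0.8, dt=0.001, tau=0.05, max_time=2.0):
    """Integrate dx/dt = (-x + input) / tau for each unit (forward Euler).

    Returns (index of the first unit to reach threshold, time in seconds),
    or (None, None) if no unit reaches threshold within max_time.
    """
    x = [0.0] * len(inputs)
    steps = int(max_time / dt)
    for step in range(1, steps + 1):
        x = [xi + dt * (-xi + inp) / tau for xi, inp in zip(x, inputs)]
        winner = max(range(len(x)), key=lambda i: x[i])
        if x[winner] >= threshold:
            return winner, step * dt
    return None, None


# Two competing actions; unit 0 receives the stronger drive.
action_weak, rt_weak = race_to_threshold([1.0, 0.9])
action_strong, rt_strong = race_to_threshold([2.0, 0.9])
no_action, no_rt = race_to_threshold([0.5, 0.5])
```

In this toy setting a stronger drive to the winning unit shortens the time to threshold, and a drive that saturates below threshold produces no response at all.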
The neuromodulators DA and 5HT affect BG dynamics in the model via different mechanisms as mentioned in Table 2. The variables that represent DA in the model are:
- the temporal difference error, δ, that updates the cortico-striatal weights [21,68],
- the temporal difference of utility, δU, that aids the action selection at the GPi level, and
- the sign(value function) term controlling the response of D1R-D2R MSNs [22,24,45].
Likewise, 5HT differentially affects the D1R, D2R and D1R-D2R co-expressing MSNs, which is represented by the model parameters αD1, αD2, and αD1D2 respectively. Serotonin is proposed to control risk sensitivity in the action selection performance of the BG. In particular, 5HT is shown to affect the D2R MSNs and the co-expressing D1R-D2R MSNs (S2 File).
Significance of the current study
Our previous study has shown a similarity between the effects of the discount factor, used to control the myopicity of reward prediction, and the risk sensitivity factor (α) of Eq (1) in a delay discounting task. Some models relate impulsivity to the discount factor, i.e., increased discounting and myopicity in reward prediction is related to impulsive behavior [19,60,102]. We show that such effects can be captured in the proposed model by the risk sensitivity term (αD1D2) of Eq (3). Furthermore, earlier models of ICD in PD take only the DA deficiency in the striatum into account, leaving out other potential factors such as 5HT.
In some other models, reduced learning from negative consequences in PD-ON ICD patients was modeled using an explicitly reduced learning rate parameter associated with negative prediction errors. The proposed model, in contrast, naturally takes the nonlinearity in reward-punishment learning into consideration through the sign() term in the risk function computation (Eq (3)). The nonlinearity towards rewards and punishments mediated by the α.sign() term causes the PD-ON ICD case to learn more from rewarding outcomes, and the PD-OFF case to be more sensitive to punitive outcomes. The lower availability of DA leads to a devaluation of the reward-associated choices that is greater than that of the punishment-associated choices in the PD-OFF case (Fig 3A), which favors punishment learning. Similarly, in the PD-ON cases, the punishment-linked choices are overvalued, which reduces the optimality of punishment learning.
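The effect of the α.sign() term can be made concrete with a small numerical sketch. Eqs (1) and (3) are not reproduced in this section, so the form used here, U = Q − α·sign(Q)·√h with h a non-negative risk (variance) estimate, is an assumption in the spirit of the risk-sensitive utility framework the model builds on [28]; the numerical values are made up.

```python
import math


def utility(q, h, alpha):
    """Assumed utility form: U = Q - alpha * sign(Q) * sqrt(h).

    q     : value estimate of the action (positive for gains, negative for losses)
    h     : risk estimate (non-negative), e.g. variance of the prediction error
    alpha : risk sensitivity (plays the role of alpha_D1D2 in the text)
    """
    sign_q = (q > 0) - (q < 0)  # sign(Q) as -1, 0 or +1
    return q - alpha * sign_q * math.sqrt(h)


# Same risk, opposite valence, positive risk sensitivity.
u_gain = utility(q=1.0, h=0.25, alpha=0.5)    # risk lowers the utility of a gain
u_loss = utility(q=-1.0, h=0.25, alpha=0.5)   # risk raises the utility of a loss
u_flat = utility(q=1.0, h=0.25, alpha=0.0)    # no risk sensitivity: U equals Q
```

With α > 0 the same amount of risk penalises a gain but discounts a loss, so a single parameter yields valence-dependent behavior without a separate learning rate for negative prediction errors. Setting α close to zero removes this asymmetry.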
Our model finds that modulation of both DA and 5HT in the BG model is necessary to explain the aspects of impulsive behaviour observed in our experiment. See S2 File for computations showing the necessity of optimizing αD1, αD2 and αD1D2 to explain the experimental data, and S3 File for computations showing that DA-related parameters alone cannot explain the experimental data. Using only the effects of the D1R MSNs and D2R MSNs (αD1 = 1; αD2 = 1), without the co-expressing D1R-D2R MSNs and their 5HT modulation (αD1D2 = 0), does not explain the experimental results (S2 File). This differentiates our model from those that invoke only the opponency between the DA-mediated activity of D1R MSNs and D2R MSNs to explain the PD-ON ICD behavior [20,32–34]. The main results from the modeling of striatal MSNs are included in Table 6.
By investigating the functioning of the neuromodulators DA and 5HT in this study, we find that the utility computation driven by DA and 5HT is sub-optimal in PD patients. The clamping of DA availability (Eq (14)) represents reduced DA availability, DA receptor density or dopaminergic projections to the BG in the PD-OFF case [103,104]. In the PD-ON case, an increased tonic level of DA is modeled by the addition of a medication constant (δMed) [89–93]. Our model also predicts a lower availability of 5HT in the BG for both the PD-OFF and PD-ON cases, as previously reported by various experimental studies [9,105–107]. Specifically, based on 5HT modulation in the model, a lowered sensitivity of the D2R MSNs and the D1R-D2R MSNs is observed in the ICD case. This case exhibits significantly reduced inhibition of actions along with risk-seeking behavior. Thus extremely low αD2 and αD1D2 values efficiently differentiate the ICD group among the PD-ON cases. The model also shows that the PD-OFF patients would have very high sensitivity to punishment (αD2) and increased behavioral inhibition, while the healthy controls have a higher sensitivity to risk (αD1D2).
In summary, the model characterizes medication-induced ICD in PD patients by limited DA and altered 5HT modulation, particularly of the D2R and D1R-D2R MSNs.
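The way the DA signal is treated in the three conditions (unchanged in controls, clamped in PD-OFF, clamped and offset by medication in PD-ON) can be sketched as follows. The exact form of Eq (14) is not reproduced in this section, so the upper clamp via `min` and the additive medication term are assumptions consistent with the verbal description; the function name, group labels and numbers are hypothetical.

```python
def effective_delta(delta, group, delta_lim, delta_med):
    """Return the DA signal used by the model for a given condition.

    Assumed scheme (illustrative, not a transcription of Eq (14)):
      HC     : delta is used unchanged
      PD-OFF : delta is clamped from above at delta_lim (reduced DA availability)
      PD-ON  : the clamped signal is shifted by the medication constant delta_med
    """
    if group == "HC":
        return delta
    clamped = min(delta, delta_lim)
    if group == "PD-OFF":
        return clamped
    if group == "PD-ON":
        return clamped + delta_med
    raise ValueError(f"unknown group: {group}")


# A large positive prediction error under the three conditions.
d_hc = effective_delta(1.0, "HC", delta_lim=0.25, delta_med=0.5)
d_off = effective_delta(1.0, "PD-OFF", delta_lim=0.25, delta_med=0.5)
d_on = effective_delta(1.0, "PD-ON", delta_lim=0.25, delta_med=0.5)
# A negative prediction error passes through the clamp but is still shifted ON medication.
d_on_neg = effective_delta(-1.0, "PD-ON", delta_lim=0.25, delta_med=0.5)
```

Under this assumed scheme, positive prediction errors are truncated in PD-OFF, while in PD-ON the medication offset makes negative prediction errors less negative, which is one way to picture the overvaluation of punishment-linked choices described above.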
Limitations of the study and future work
The co-expressing D1R-D2R MSNs are experimentally shown to contribute significantly to both the direct and the indirect pathways of the BG [71,72,108]. These two distinct pools of D1R-D2R MSNs, one following the DP that controls exploitation and the other following the IP that controls exploration [43,44,62], might be used for modeling the non-linearity in risk sensitivity based on outcomes (i.e., risk aversion during gains and risk seeking during losses). The inherent opponency between the DP and the IP [55,109] would allow the projections of the corresponding D1R-D2R MSNs to show contrasting risk-sensitive behavior. Each of the neuronal pools computing the risk function should then be weighted by appropriate sensitivity coefficients (representing the neuromodulators DA and 5HT) to capture the non-linear risk-sensitive behavior based on the valence of outcomes (Eq (3)). This is simplified in the present modeling study by considering the projections of D1R-D2R MSNs to the IP alone, multiplied by an (α sign(Q)) term. Moreover, an increased magnitude of risk associated with an action is experimentally found to enhance exploration [110–112]. This is made possible in the model by routing the co-expressing D1R-D2R MSN activity to the IP, since in the present model it is the IP that predominantly controls the level of exploration [43,44,62]. There is also evidence supporting the involvement of the STN in controlling impulsivity, as STN lesions are shown to decrease RT and increase premature responding [95,114–116]. In addition, the level of synchronisation in the STN-GPe contributes to cognitive symptoms such as impulsivity [98,117], similar to its contribution to motor symptoms in PD such as tremor, postural instability and gait disturbances [118–120]. In PD, markedly depleted levels of DA are associated with a highly synchronised firing pattern and a slight increase in firing rates in the STN [121,122].
These are the motivations for specifically considering the projections of D1R-D2R MSNs to the IP only. Expanding the framework to include the projections of D1R-D2R MSNs to GPi (in the DP) will be addressed in our future work. We will also include detailed neuronal modeling of the STN-GPe system in future work, to understand the possible role of the oscillatory activity of the STN in PD-related impulsivity [98,117].
Projections from GPe to GPi are found in primates [56,123,124]. GPe projections to GPi are thought to be more focused than the more diffuse projections of STN to GPi. These GPe-GPi connections bypass the GPe-STN-GPi connectivity: the former are thought to perform a focused suppression of the GPi response to a particular action, whereas the latter impose a global NoGo influence [56,125]. Though the functional significance of these connections is not known, not accounting for this connectivity (STN-GPe-GPi) is a limitation of the modeling study. However, since we do not differentiate between a global and a local NoGo in our study, the proposed minimal model adapted from classical BG models [53–56] is demonstrated to capture the required experimental results at the neural network level.
The STN also receives extensive norepinephrine (NE) afferents from the locus ceruleus (LC) [125,126]. Furthermore, since the dynamics of the STN-GPe is strongly controlled by the neuromodulator NE [127,128], we would like to explore the possible role of NE in the BG dynamics. In particular, NE is expected to control the lateral connection strengths in the STN-GPe and the gain of the cortical input [110,129,130] to the striatum and the STN. The control of response inhibition through the STN is thought to be established through NE activity in the STN, and a dysfunction in such control could be related to ICD [131,132]. A detailed model of STN-GPe dynamics and the effect of NE could help us better understand the role of the STN-GPe system in impulsivity and design better deep brain stimulation protocols to cure ICD.
Although DA, 5HT and NE, along with the STN-GPe dynamics, figure prominently in experimental studies on impulsivity, computational models that closely reflect the neurobiological data supporting all of these factors do not exist. Our model is the first of its kind to include the contributions of both DA and 5HT to the ICD pathology, and it presents a better "bench to bedside" proposal.
S2 File. Analysing the effect of parameters (αD1, αD2, αD1D2) in various conditions (healthy controls, PD-ON ICD, PD-ON non-ICD, PD-OFF).
S3 File. Sensitivity analysis of the parameters controlling the model DA parameter.
Conceived and designed the experiments: PPB VSC AAM MA BR. Performed the experiments: PPB VSC AAM MA. Analyzed the data: PPB VSC AAM. Wrote the paper: PPB VSC AAM.
- 1. Ahlskog JE (2010) Think before you leap: Donepezil reduces falls? Neurology 75: 1226–1227. pmid:20826715
- 2. Ridderinkhof KR (2002) Activation and suppression in conflict tasks: Empirical clarification through distributional analyses.
- 3. Wylie SA, Ridderinkhof KR, Bashore TR, van den Wildenberg WP (2010) The effect of Parkinson's disease on the dynamics of on-line and proactive cognitive control during action selection. J Cogn Neurosci 22: 2058–2073. pmid:19702465
- 4. Nombela C, Rittman T, Robbins TW, Rowe JB (2014) Multiple modes of impulsivity in Parkinson's disease. PLoS One 9: e85747. pmid:24465678
- 5. Dougherty DM, Mathias CW, Marsh DM, Jagar AA (2005) Laboratory behavioral measures of impulsivity. Behavior Research Methods 37: 82–90. pmid:16097347
- 6. Evenden JL (1999) Varieties of impulsivity. Psychopharmacology (Berl) 146: 348–361. pmid:10550486
- 7. Dalley JW, Everitt BJ, Robbins TW (2011) Impulsivity, compulsivity, and top-down cognitive control. Neuron 69: 680–694. pmid:21338879
- 8. Dalley JW, Mar AC, Economidou D, Robbins TW (2008) Neurobehavioral mechanisms of impulsivity: fronto-striatal systems and functional neurochemistry. Pharmacology Biochemistry and Behavior 90: 250–260. pmid:18272211
- 9. Fahn S, Libsch LR, Cutler RW (1971) Monoamines in the human neostriatum: topographic distribution in normals and in Parkinson's disease and their role in akinesia, rigidity, chorea, and tremor. J Neurol Sci 14: 427–455. pmid:5125758
- 10. Kordower JH, Olanow CW, Dodiya HB, Chu Y, Beach TG, Adler CH, et al. (2013) Disease duration and the integrity of the nigrostriatal system in Parkinson’s disease. Brain 136: 2419–2431. pmid:23884810
- 11. Cools AR, van den Bercken JH, Horstink MW, van Spaendonck KP, Berger HJ (1984) Cognitive and motor shifting aptitude disorder in Parkinson's disease. J Neurol Neurosurg Psychiatry 47: 443–453. pmid:6736974
- 12. Schneider JS, Diamond SG, Markham CH (1987) Parkinson's disease: sensory and motor problems in arms and hands. Neurology 37: 951–956. pmid:3587646
- 13. Chaudhuri K, Healy DG, Schapira AH (2006) Non-motor symptoms of Parkinson's disease: diagnosis and management. The Lancet Neurology 5: 235–245. pmid:16488379
- 14. Bugalho P, Oliveira-Maia AJ (2013) Impulse control disorders in Parkinson’s disease: crossroads between neurology, psychiatry and neuroscience. Behav Neurol 27: 547–557. pmid:23242359
- 15. Djamshidian A, Averbeck BB, Lees AJ, O'Sullivan SS (2011) Clinical aspects of impulsive compulsive behaviours in Parkinson's disease. J Neurol Sci 310: 183–188. pmid:21839478
- 16. Ray N, Antonelli F, Strafella AP (2011) Imaging impulsivity in Parkinson's disease and the contribution of the subthalamic nucleus. Parkinsons Dis 2011.
- 17. Averbeck B, O'Sullivan S, Djamshidian A (2014) Impulsive and Compulsive Behaviors in Parkinson's Disease. Annual review of clinical psychology 10: 553–580. pmid:24313567
- 18. Evans AH, Strafella AP, Weintraub D, Stacy M (2009) Impulsive and compulsive behaviors in Parkinson's disease. Movement Disorders 24: 1561–1570. pmid:19526584
- 19. Doya K (2002) Metalearning and neuromodulation. Neural Netw 15: 495–506. pmid:12371507
- 20. Frank MJ, Samanta J, Moustafa AA, Sherman SJ (2007) Hold your horses: impulsivity, deep brain stimulation, and medication in parkinsonism. Science 318: 1309–1312. pmid:17962524
- 21. Houk JC, Bastianen C, Fansler D, Fishbach A, Fraser D, Reber PJ, et al. (2007) Action selection and refinement in subcortical loops through basal ganglia and cerebellum. Philos Trans R Soc Lond B Biol Sci 362: 1573–1583. pmid:17428771
- 22. Schultz W (2010) Dopamine signals for reward value and risk: basic and recent data. Behav Brain Funct 6: 24. pmid:20416052
- 23. Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. Cambridge, Mass.: MIT Press. xviii, 322 p.
- 24. Schultz W (2010) Subjective neuronal coding of reward: temporal value discounting and risk. European Journal of Neuroscience 31: 2124–2135. pmid:20497474
- 25. Schultz W (2013) Updating dopamine reward signals. Curr Opin Neurobiol 23: 229–238. pmid:23267662
- 26. O'Doherty J, Dayan P, Schultz J, Deichmann R, Friston K, Dolan RJ, et al. (2004) Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science 304: 452–454. pmid:15087550
- 27. O'Doherty JP, Buchanan TW, Seymour B, Dolan RJ (2006) Predictive neural coding of reward preference involves dissociable responses in human ventral midbrain and ventral striatum. Neuron 49: 157–166. pmid:16387647
- 28. Balasubramani PP, Chakravarthy S, Ravindran B, Moustafa AA (2014) An extended reinforcement learning model of basal ganglia to understand the contributions of serotonin and dopamine in risk-based decision making, reward prediction, and punishment learning. Frontiers in Computational Neuroscience 8: 47. pmid:24795614
- 29. Kalva SK, Rengaswamy M, Chakravarthy V, Gupte N (2012) On the neural substrates for exploratory dynamics in basal ganglia: A model. Neural Networks.
- 30. Piray P, Zeighami Y, Bahrami F, Eissa AM, Hewedi DH, Moustafa AA, et al. (2014) Impulse Control Disorders in Parkinson's Disease Are Associated with Dysfunction in Stimulus Valuation But Not Action Valuation. The Journal of neuroscience 34: 7814–7824. pmid:24899705
- 31. Harden DG, Grace AA (1995) Activation of dopamine cell firing by repeated L-DOPA administration to dopamine-depleted rats: its potential role in mediating the therapeutic response to L-DOPA treatment. The Journal of neuroscience 15: 6157–6166. pmid:7666198
- 32. Cohen MX, Frank MJ (2009) Neurocomputational models of basal ganglia function in learning, memory and choice. Behav Brain Res 199: 141–156. pmid:18950662
- 33. Frank MJ, Scheres A, Sherman SJ (2007) Understanding decision-making deficits in neurological conditions: insights from models of natural action selection. Philosophical Transactions of the Royal Society B: Biological Sciences 362: 1641–1654. pmid:17428775
- 34. Frank MJ, Seeberger LC, O'Reilly RC (2004) By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science 306: 1940–1943. pmid:15528409
- 35. Hamidovic A, Kang UJ, de Wit H (2008) Effects of low to moderate acute doses of pramipexole on impulsivity and cognition in healthy volunteers. J Clin Psychopharmacol 28: 45–51. pmid:18204340
- 36. Avanzi M, Baratti M, Cabrini S, Uber E, Brighetti G, Bonfa F (2006) Prevalence of pathological gambling in patients with Parkinson's disease. Movement Disorders 21: 2068–2072. pmid:17044068
- 37. Voon V, Hassan K, Zurowski M, De Souza M, Thomsen T, Fox S, et al. (2006) Prevalence of repetitive and reward-seeking behaviors in Parkinson disease. Neurology 67: 1254–1257. pmid:16957130
- 38. Weintraub D, Siderowf AD, Potenza MN, Goveas J, Morales KH, Duda JE, et al. (2006) Association of dopamine agonist use with impulse control disorders in Parkinson disease. Arch Neurol 63: 969–973. pmid:16831966
- 39. Oades RD (2002) Dopamine may be ‘hyper’ with respect to noradrenaline metabolism, but ‘hypo’ with respect to serotonin metabolism in children with attention-deficit hyperactivity disorder. Behavioural brain research 130: 97–102. pmid:11864724
- 40. Winstanley CA, Theobald DE, Dalley JW, Glennon JC, Robbins TW (2004) 5-HT2A and 5-HT2C receptor antagonists have opposing effects on a measure of impulsivity: interactions with global 5-HT depletion. Psychopharmacology (Berl) 176: 376–385. pmid:15232674
- 41. Winstanley CA, Theobald DE, Dalley JW, Robbins TW (2005) Interactions between serotonin and dopamine in the control of impulsive choice in rats: therapeutic implications for impulse control disorders. Neuropsychopharmacology 30: 669–682. pmid:15688093
- 42. Fox SH, Chuang R, Brotchie JM (2009) Serotonin and Parkinson's disease: On movement, mood, and madness. Mov Disord 24: 1255–1266. pmid:19412960
- 43. Chakravarthy VS, Balasubramani PP (2013) Basal Ganglia System as an Engine for Exploration. In: Jaeger D. JR, editor. Encyclopedia of Computational Neuroscience. Berlin Heidelberg: SpringerReference (www.springerreference.com). Springer-Verlag
- 44. Chakravarthy VS, Joseph D, Bapi RS (2010) What do the basal ganglia do? A modeling perspective. Biol Cybern 103: 237–253. pmid:20644953
- 45. Balasubramani PP, Chakravarthy S, Ravindran B, Moustafa AA (Not published) A network model of basal ganglia for understanding the roles of dopamine and serotonin in reward-punishment-risk based decision making.
- 46. Kahneman D, Tversky A (1979) Prospect theory: an analysis of decision under risk. Econometrica 47: 263–292.
- 47. Kalenscher T (2007) Decision making: Don't risk a delay. Current biology 17: R58–R61. pmid:17240330
- 48. Lang A, Fahn S (1989) Assessment of Parkinson's disease. In: Munsat TL, editor. Quantification of neurologic deficit. pp. 285–309.
- 49. Folstein MF, Folstein SE, McHugh PR (1975) “Mini-mental state”: a practical method for grading the cognitive state of patients for the clinician. Journal of psychiatric research 12: 189–198. pmid:1202204
- 50. Uttl B (2002) North American Adult Reading Test: age norms, reliability, and validity. J Clin Exp Neuropsychol 24: 1123–1137. pmid:12650237
- 51. Beck AT, Steer RA, Brown GK (2005) Beck Depression Inventory. Manual, Swedish version. Sandviken: Psykologiförlaget.
- 52. Krishnan R, Ratnadurai S, Subramanian D, Chakravarthy VS, Rengaswamy M (2011) Modeling the role of basal ganglia in saccade generation: is the indirect pathway the explorer? Neural Netw 24: 801–813. pmid:21726978
- 53. Albin RL, Young AB, Penney JB (1989) The functional anatomy of basal ganglia disorders. Trends Neurosci 12: 366–375. pmid:2479133
- 54. Bar-Gad I, Bergman H (2001) Stepping out of the box: information processing in the neural networks of the basal ganglia. Curr Opin Neurobiol 11: 689–695. pmid:11741019
- 55. DeLong MR (1990) Primate models of movement disorders of basal ganglia origin. Trends Neurosci 13: 281–285. pmid:1695404
- 56. Mink JW (1996) The basal ganglia: focused selection and inhibition of competing motor programs. Progress in neurobiology 50: 381. pmid:9004351
- 57. Bell DE (1995) Risk, return, and utility. Management Science 41: 23–30.
- 58. d'Acremont M, Lu ZL, Li X, Van der Linden M, Bechara A (2009) Neural correlates of risk prediction error during reinforcement learning in humans. Neuroimage 47: 1929–1939. pmid:19442744
- 59. Cools R, Robinson OJ, Sahakian B (2008) Acute tryptophan depletion in healthy volunteers enhances punishment prediction but does not affect reward prediction. Neuropsychopharmacology 33: 2291–2299. pmid:17940553
- 60. Tanaka SC, Schweighofer N, Asahi S, Shishida K, Okamoto Y, Yamawaki S, et al. (2007) Serotonin differentially regulates short- and long-term prediction of rewards in the ventral and dorsal striatum. PLoS One 2: e1333. pmid:18091999
- 61. Long AB, Kuhn CM, Platt ML (2009) Serotonin shapes risky decision making in monkeys. Soc Cogn Affect Neurosci 4: 346–356. pmid:19553236
- 62. Kalva SK, Rengaswamy M, Chakravarthy VS, Gupte N (2012) On the neural substrates for exploratory dynamics in basal ganglia: a model. Neural Netw 32: 65–73. pmid:22386780
- 63. Humphries MD, Lepora N, Wood R, Gurney K (2009) Capturing dopaminergic modulation and bimodal membrane behaviour of striatal medium spiny neurons in accurate, reduced models. Frontiers in computational neuroscience 3.
- 64. Moyer JT, Wolf JA, Finkel LH (2007) Effects of dopaminergic modulation on the integrative properties of the ventral striatal medium spiny neuron. Journal of neurophysiology 98: 3731–3748. pmid:17913980
- 65. Servan-Schreiber D, Printz H, Cohen JD (1990) A network model of catecholamine effects: gain, signal-to-noise ratio, and behavior. Science 249: 892–895. pmid:2392679
- 66. Thurley K, Senn W, Lüscher H-R (2008) Dopamine increases the gain of the input-output response of rat prefrontal pyramidal neurons. Journal of neurophysiology 99: 2985–2997. pmid:18400958
- 67. Allen AT, Maher KN, Wani KA, Betts KE, Chase DL (2011) Coexpressed D1-and D2-like dopamine receptors antagonistically modulate acetylcholine release in Caenorhabditis elegans. Genetics 188: 579–590. pmid:21515580
- 68. Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275: 1593–1599. pmid:9054347
- 69. Amemori K, Gibb LG, Graybiel AM (2011) Shifting responsibly: the importance of striatal modularity to reinforcement learning in uncertain environments. Front Hum Neurosci 5: 47. pmid:21660099
- 70. Sarvestani IK, Lindahl M, Hellgren-Kotaleski J, Ekeberg Ö (2011) The arbitration–extension hypothesis: a hierarchical interpretation of the functional organization of the basal ganglia. Frontiers in systems neuroscience 5.
- 71. Nadjar A, Brotchie JM, Guigoni C, Li Q, Zhou S-B, Gui-Jie Wang, et al. (2006) Phenotype of striatofugal medium spiny neurons in parkinsonian and dyskinetic nonhuman primates: a call for a reappraisal of the functional organization of the basal ganglia. The Journal of neuroscience 26: 8653–8661. pmid:16928853
- 72. Surmeier DJ, Song WJ, Yan Z (1996) Coordinated expression of dopamine receptors in neostriatal medium spiny neurons. J Neurosci 16: 6579–6591. pmid:8815934
- 73. Eberle‐Wang K, Mikeladze Z, Uryu K, Chesselet MF (1997) Pattern of expression of the serotonin2C receptor messenger RNA in the basal ganglia of adult rats. Journal of Comparative Neurology 384: 233–247. pmid:9215720
- 74. Ward RP, Dorsa DM (1996) Colocalization of serotonin receptor subtypes 5‐HT2A, 5‐HT2C, and 5‐HT6 with neuropeptides in rat striatum. Journal of Comparative Neurology 370: 405–414. pmid:8799865
- 75. Nakamura K (2013) The role of the dorsal raphé nucleus in reward-seeking behavior. Front Integr Neurosci 7.
- 76. Alex KD, Pehek EA (2007) Pharmacologic mechanisms of serotonergic regulation of dopamine neurotransmission. Pharmacol Ther 113: 296–320. pmid:17049611
- 77. Jiang LH, Ashby CR Jr, Kasser RJ, Wang RY (1990) The effect of intraventricular administration of the 5-HT3 receptor agonist 2-methylserotonin on the release of dopamine in the nucleus accumbens: an in vivo chronocoulometric study. Brain Res 513: 156–160. pmid:2112416
- 78. Calabresi P, Picconi B, Tozzi A, Ghiglieri V, Di Filippo M (2014) Direct and indirect pathways of basal ganglia: a critical reappraisal. Nat Neurosci 17: 1022–1030. pmid:25065439
- 79. Jakab RL, Hazrati LN, Goldman‐Rakic P (1996) Distribution and neurochemical character of substance P receptor (SPR)‐immunoreactive striatal neurons of the macaque monkey: Accumulation of SP fibers and SPR neurons and dendrites in “striocapsules” encircling striosomes. Journal of Comparative Neurology 369: 137–149. pmid:8723708
- 80. Stauffer WR, Lak A, Schultz W (2014) Dopamine Reward Prediction Error Responses Reflect Marginal Utility. Current biology 24: 2491–2500. pmid:25283778
- 81. Chang J, Chen L, Luo F, Shi L-H, Woodward D (2002) Neuronal responses in the frontal cortico-basal ganglia system during delayed matching-to-sample task: ensemble recording in freely moving rats. Experimental Brain Research 142: 67–80. pmid:11797085
- 82. Frank MJ, Claus ED (2006) Anatomy of a decision: striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal. Psychol Rev 113: 300. pmid:16637763
- 83. Lauwereyns J, Watanabe K, Coe B, Hikosaka O (2002) A neural correlate of response bias in monkey caudate nucleus. Nature 418: 413–417. pmid:12140557
- 84. Tanaka SC, Samejima K, Okada G, Ueda K, Okamoto Y, Yamawaki S, et al. (2006) Brain mechanism of reward prediction under predictable and unpredictable environmental dynamics. Neural Networks 19: 1233–1241. pmid:16979871
- 85. Amalric M, Moukhles H, Nieoullon A, Daszuta A (1995) Complex Deficits on Reaction Time Performance following Bilateral Intrastriatal 6‐OHDA Infusion in the Rat. European Journal of Neuroscience 7: 972–980. pmid:7613632
- 86. Bogacz R, Gurney K (2007) The basal ganglia and cortex implement optimal decision making between alternative actions. Neural Comput 19: 442–477. pmid:17206871
- 87. Lo C-C, Wang X-J (2006) Cortico–basal ganglia circuit mechanism for a decision threshold in reaction time tasks. Nat Neurosci 9: 956–963. pmid:16767089
- 88. Kish SJ, Shannak K, Hornykiewicz O (1988) Uneven pattern of dopamine loss in the striatum of patients with idiopathic Parkinson's disease. Pathophysiologic and clinical implications. N Engl J Med 318: 876–880. pmid:3352672
- 89. Dauer W, Przedborski S (2003) Parkinson's disease: mechanisms and models. Neuron 39: 889–909. pmid:12971891
- 90. Foley P, Gerlach M, Double KL, Riederer P (2004) Dopamine receptor agonists in the therapy of Parkinson's disease. J Neural Transm 111: 1375–1446. pmid:15480844
- 91. Gupta A, Balasubramani PP, Chakravarthy S (2013) Computational model of precision grip in Parkinson’s disease: A Utility based approach. Frontiers in Computational Neuroscience 7.
- 92. Magdoom KN, Subramanian D, Chakravarthy VS, Ravindran B, Amari S, Meenakshisundaram N, et al. (2011) Modeling basal ganglia for understanding Parkinsonian reaching movements. Neural Comput 23: 477–516. pmid:21105828
- 93. Muralidharan V, Balasubramani PP, Chakravarthy VS, Lewis SJ, Moustafa AA (2014) A computational model of altered gait patterns in parkinson's disease patients negotiating narrow doorways. Front Comput Neurosci 7: 190. pmid:24409137
- 94. Goldberg DE (1989) Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley Longman Publishing Co.
- 95. Baunez C, Nieoullon A, Amalric M (1995) In a rat model of parkinsonism, lesions of the subthalamic nucleus reverse increases of reaction time but induce a dramatic premature responding deficit. The Journal of neuroscience 15: 6531–6541. pmid:7472415
- 96. Bellebaum C, Koch B, Schwarz M, Daum I (2008) Focal basal ganglia lesions are associated with impairments in reward-based reversal learning. Brain 131: 829–841. pmid:18263624
- 97. Bloxham C, Dick D, Moore M (1987) Reaction times and attention in Parkinson's disease. Journal of Neurology, Neurosurgery & Psychiatry 50: 1178–1183.
- 98. Williams D, Kühn A, Kupsch A, Tijssen M, Van Bruggen G, Speelman H, et al. (2005) The relationship between oscillatory activity and motor reaction time in the parkinsonian subthalamic nucleus. European Journal of Neuroscience 21: 249–258. pmid:15654862
- 99. Liénard A (1928) Etude des oscillations entretenues. Revue générale de l’électricité 23: 901–912. pmid:21842191
- 100. Bertran-Gonzalez J, Hervé D, Girault J-A, Valjent E (2010) What is the degree of segregation between striatonigral and striatopallidal projections? Front Neuroanat 4.
- 101. Sutton RS, Barto AG (1998) Reinforcement Learning: An Introduction. Adaptive Computation and Machine Learning: MIT Press/Bradford.
- 102. Doya K (2008) Modulators of decision making. Nat Neurosci 11: 410–416. pmid:18368048
- 103. Evans AH, Pavese N, Lawrence AD, Tai YF, Appel S, Doder M, et al. (2006) Compulsive drug use linked to sensitized ventral striatal dopamine transmission. Ann Neurol 59: 852–858. pmid:16557571
- 104. Steeves T, Miyasaki J, Zurowski M, Lang A, Pellecchia G, Van Eimeren T, et al. (2009) Increased striatal dopamine release in Parkinsonian patients with pathological gambling: a [11C] raclopride PET study. Brain 132: 1376–1385. pmid:19346328
- 105. Bedard C, Wallman MJ, Pourcher E, Gould PV, Parent A, Parent M, et al. (2011) Serotonin and dopamine striatal innervation in Parkinson's disease and Huntington's chorea. Parkinsonism Relat Disord 17: 593–598. pmid:21664855
- 106. Fahn S, Snider S, Prasad AL, Lane E, Makadon H (1975) Normalization of brain serotonin by L-tryptophan in levodopa-treated rats. Neurology 25: 861–865. pmid:1172210
- 107. Halliday GM, Blumbergs PC, Cotton RG, Blessing WW, Geffen LB (1990) Loss of brainstem serotonin- and substance P-containing neurons in Parkinson's disease. Brain Res 510: 104–107. pmid:1691042
- 108. Perreault ML, Hasbi A, O'Dowd BF, George SR (2011) The dopamine D1-D2 receptor heteromer in striatal medium spiny neurons: evidence for a third distinct neuronal pathway in basal ganglia. Front Neuroanat 5: 31. pmid:21747759
- 109. Albin RL (1998) Fuch's corneal dystrophy in a patient with mitochondrial DNA mutations. J Med Genet 35: 258–259. pmid:9541117
- 110. Cohen JD, McClure SM, Yu AJ (2007) Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration. Philosophical Transactions of the Royal Society B: Biological Sciences 362: 933–942. pmid:17395573
- 111. Daw ND, O'Doherty JP, Dayan P, Seymour B, Dolan RJ (2006) Cortical substrates for exploratory decisions in humans. Nature 441: 876–879. pmid:16778890
- 112. Frank MJ, Doll BB, Oas-Terpstra J, Moreno F (2009) Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation. Nat Neurosci 12: 1062–1068. pmid:19620978
- 113. Ballanger B, van Eimeren T, Moro E, Lozano AM, Hamani C, Boulinguez P, et al. (2009) Stimulation of the subthalamic nucleus and impulsivity: release your horses. Ann Neurol 66: 817–824. pmid:20035509
- 114. Baunez C, Robbins TW (1997) Bilateral lesions of the subthalamic nucleus induce multiple deficits in an attentional task in rats. European Journal of Neuroscience 9: 2086–2099. pmid:9421169
- 115. Florio T, Capozzo A, Cellini R, Pizzuti G, Staderini E, Scarnati E (2001) Unilateral lesions of the pedunculopontine nucleus do not alleviate subthalamic nucleus-mediated anticipatory responding in a delayed sensorimotor task in the rat. Behavioural brain research 126: 93–103. pmid:11704255
- 116. Phillips JM, Brown VJ (1999) Reaction time performance following unilateral striatal dopamine depletion and lesions of the subthalamic nucleus in the rat. European Journal of Neuroscience 11: 1003–1010. pmid:10223809
- 117. Wylie SA, van den Wildenberg W, Ridderinkhof KR, Claassen DO, Wooten GF, Manning CA (2012) Differential susceptibility to motor impulsivity among functional subtypes of Parkinson's disease. Journal of Neurology, Neurosurgery & Psychiatry: jnnp-2012-303056.
- 118. Kühn AA, Doyle L, Pogosyan A, Yarrow K, Kupsch A, Schneider GH, et al. (2006) Modulation of beta oscillations in the subthalamic area during motor imagery in Parkinson's disease. Brain 129: 695–706. pmid:16364953
- 119. Kühn AA, Tsui A, Aziz T, Ray N, Brücke C, Kupsch A, et al. (2009) Pathological synchronisation in the subthalamic nucleus of patients with Parkinson's disease relates to both bradykinesia and rigidity. Exp Neurol 215: 380–387. pmid:19070616
- 120. Levy R, Hutchison WD, Lozano AM, Dostrovsky JO (2002) Synchronized neuronal discharge in the basal ganglia of parkinsonian patients is limited to oscillatory activity. The Journal of neuroscience 22: 2855–2861. pmid:11923450
- 121. Plenz D, Kitai ST (1999) A basal ganglia pacemaker formed by the subthalamic nucleus and external globus pallidus. Nature 400: 677–682. pmid:10458164
- 122. Park C, Rubchinsky LL (2012) Potential mechanisms for imperfect synchronization in parkinsonian basal ganglia. PLoS One 7: e51530. pmid:23284707
- 123. Gerfen CR, Wilson CJ (1996) Chapter II The basal ganglia. Handbook of chemical neuroanatomy 12: 371–468.
- 124. Kawaguchi Y, Wilson CJ, Emson PC (1990) Projection subtypes of rat neostriatal matrix cells revealed by intracellular injection of biocytin. The Journal of neuroscience 10: 3421–3438. pmid:1698947
- 125. Parent A, Hazrati L-N (1995) Functional anatomy of the basal ganglia. II. The place of subthalamic nucleus and external pallidum in basal ganglia circuitry. Brain Res Rev 20: 128–154. pmid:7711765
- 126. Wang R, Macmillan L, Fremeau R Jr, Magnuson M, Lindner J, Limbird LE (1996) Expression of α2-adrenergic receptor subtypes in the mouse brain: evaluation of spatial and temporal information imparted by 3 kb of 5′ regulatory sequence for the α2A AR-receptor gene in transgenic animals. Neuroscience 74: 199–218. pmid:8843087
- 127. Belujon P, Bezard E, Taupignon A, Bioulac B, Benazzouz A (2007) Noradrenergic modulation of subthalamic nucleus activity: behavioral and electrophysiological evidence in intact and 6-hydroxydopamine-lesioned rats. The Journal of neuroscience 27: 9595–9606. pmid:17804620
- 128. Delaville C, Zapata J, Cardoit L, Benazzouz A (2012) Activation of subthalamic alpha 2 noradrenergic receptors induces motor deficits as a consequence of neuronal burst firing. Neurobiol Dis 47: 322–330. pmid:22668781
- 129. Aston-Jones G, Cohen JD (2005) An integrative theory of locus coeruleus-norepinephrine function: adaptive gain and optimal performance. Annu Rev Neurosci 28: 403–450. pmid:16022602
- 130. Dayan P, Yu AJ (2006) Phasic norepinephrine: a neural interrupt signal for unexpected events. Network: Computation in Neural Systems 17: 335–350. pmid:17162459
- 131. Economidou D, Theobald DE, Robbins TW, Everitt BJ, Dalley JW (2012) Norepinephrine and dopamine modulate impulsivity on the five-choice serial reaction time task through opponent actions in the shell and core sub-regions of the nucleus accumbens. Neuropsychopharmacology 37: 2057–2066. pmid:22510726
- 132. Swann AC, Lijffijt M, Lane SD, Cox B, Steinberg JL, Moeller FG (2013) Norepinephrine and impulsivity: effects of acute yohimbine. Psychopharmacology (Berl) 229: 83–94. pmid:23559222
- 133. Grubbs FE (1969) Procedures for detecting outlying observations in samples. Technometrics 11: 1–21.