Animals remember temporal links between their actions and subsequent rewards. We previously discovered a synaptic mechanism underlying such reward learning in D1 receptor (D1R)-expressing spiny projection neurons (D1 SPN) of the striatum. Dopamine (DA) bursts promote dendritic spine enlargement in a time window of only a few seconds after paired pre- and post-synaptic spiking (pre–post pairing), which is termed as reinforcement plasticity (RP). The previous study has also identified underlying signaling pathways; however, it still remains unclear how the signaling dynamics results in RP. In the present study, we first developed a computational model of signaling dynamics of D1 SPNs. The D1 RP model successfully reproduced experimentally observed protein kinase A (PKA) activity, including its critical time window. In this model, adenylate cyclase type 1 (AC1) in the spines/thin dendrites played a pivotal role as a coincidence detector against pre–post pairing and DA burst. In particular, pre–post pairing (Ca2+ signal) stimulated AC1 with a delay, and the Ca2+-stimulated AC1 was activated by the DA burst for the asymmetric time window. Moreover, the smallness of the spines/thin dendrites is crucial to the short time window for the PKA activity. We then developed a RP model for D2 SPNs, which also predicted the critical time window for RP that depended on the timing of pre–post pairing and phasic DA dip. AC1 worked for the coincidence detector in the D2 RP model as well. We further simulated the signaling pathway leading to Ca2+/calmodulin-dependent protein kinase II (CaMKII) activation and clarified the role of the downstream molecules of AC1 as the integrators that turn transient input signals into persistent spine enlargement. Finally, we discuss how such timing windows guide animals’ reward learning.
How do animals associate their actions with subsequent rewards (e.g., food)? We previously found a synaptic basis for this association. Synaptic plasticity in the striatum occurs only if the action signal comes just before dopamine (DA) signal, which is known to encode reward/regret information in the brain. Through developing an intracellular signaling model of this synaptic plasticity, we here elucidated the importance of a specific molecule, adenylate cyclase type 1 (AC1). AC1 works as a coincidence detector to associate temporally proximal action and DA signals. The importance of AC1 was shared by two major types of striatal projection neurons: one encodes positive reward-prediction error (RPE) into volume change of the spines, while the other encodes negative RPE. The signaling model also highlighted the importance of the smallness of the spines/thin dendrites for the coincidence detection, and the role of downstream signaling molecules as signal integrators. Taken together, computational modeling of the intracellular signaling strongly promoted our understanding of the mechanisms of this important synaptic plasticity.
Citation: Urakubo H, Yagishita S, Kasai H, Ishii S (2020) Signaling models for dopamine-dependent temporal contiguity in striatal synaptic plasticity. PLoS Comput Biol 16(7): e1008078. https://doi.org/10.1371/journal.pcbi.1008078
Editor: Joanna Jędrzejewska-Szmek, Instytut Biologii Doswiadczalnej im M Nenckiego Polskiej Akademii Nauk, POLAND
Received: October 1, 2019; Accepted: June 19, 2020; Published: July 23, 2020
Copyright: © 2020 Urakubo et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Developed MATLAB code and their SBML-style files are available at the public repository GitHub (https://github.com/urakubo/ModelRP.git).
Funding: This work was supported partly by the Strategic Research Program for Brain Sciences ("Bioinformatics for Brain Sciences") from the Ministry of Education, Culture, Sports, Science and Technology (MEXT), CREST (JPMJCR1652 to SI and HK) from the Japan Science and Technology Agency (JST), the Brain Mapping by Integrated Neurotechnologies for Disease Studies (Brain/MINDS; JP19dm0207001 to SI) from the Japanese Agency for Medical Research and Development (AMED), the International Research Center for Neurointelligence (WPI-IRCN) at The University of Tokyo Institutes for Advanced Study, and JSPS KAKENHI (17K00404 to HU, 17H06310 to SI, and 26221001 to HK). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Animals behave in their environments to obtain larger rewards . During such reward learning, a specific brain region—the striatum—plays a central role in associating the situation/behavioral action with reward signals [2,3]. Unexpected reward presentation leads to a phasic dopamine (DA) burst (0.3–1 s, DA burst), whereas unexpected reward omission leads to a phasic DA decrease (0.4–2 s, DA dip). To explore the neural mechanism in decoding such “reward prediction error” (RPE) signals, we previously targeted dopamine D1 receptor (D1R)-expressing spiny projection neurons (D1 SPNs) of the striatum, in particular the nucleus accumbens (NAc) core, and found that the DA burst promotes dendritic spine enlargement within a short (~2 s) time window after paired pre- and post-synaptic spiking (pre–post pairing) . The timing dependence of DA burst explains delayed reinforcement , so we have designated this form of spike-timing-dependent plasticity (STDP) as reinforcement plasticity (RP).
In D1 and D2 SPNs, RP occurs in response to the two types of inputs: pre–post pairing and DA signal [4,6]. Pre–post pairing mediates large Ca2+ influx through voltage-gated Ca2+ channels (VGCCs), N-methyl-D-aspartic acid receptors (NMDARs), and/or Ca2+-permeable α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptors (AMPARs) [4,6–8]. DA signal activates D1Rs or dopamine D2 receptors (D2Rs), Golf/Gi/o proteins, and then adenylate cyclases (ACs) [4,6]. The interaction between pre–post pairing and DA signal determines the amplitude of and timing dependence in RP. Studies have shown that type 5 AC (AC5) plays predominant roles in both cAMP production and synaptic plasticity in the striatum [9–11]. Thus, two signaling models based on AC5 have been proposed to explain how those signaling dynamics result in timing dependence. One explores the direct effect of DA signal on NMDARs and VGCCs , and the other models DA-delay dependence in Ca2+ dynamics through a Ca2+ buffer . However, those models predict 100-ms and 20-s time windows, respectively, and it still remains obscure about the mechanisms of the in-between time window of ~2 s, which is required for DA-mediated reward conditioning . Further, a pharmacological experiment showed that RP depends on the other type of AC, Ca2+-sensitive type 1 AC (AC1) , which is also expressed in the striatum .
We thus developed the RP models that are based on AC1 to explore signaling dynamics that shapes the time window. In D1 SPNs, the protein kinase A (PKA) activity encodes the input-timing information in RP . Thus, we first focused on the PKA signaling, and elucidated the role of AC1 as a coincidence detector of pre–post pairing and DA burst. We next developed a D2 RP model, which detected the relative timing of pre–post pairing and phasic DA dip. The detection of phasic DA dip was realized by the molecules specifically expressed in D2 SPNs: D2R and adenosine A2A receptors (A2ARs), as suggested by precedent studies [16,17]. We further simulated the downstream signaling molecules as far as Ca2+/calmodulin (CaM)-dependent protein kinase II (CaMKII), and clarified their roles as signal integrators. The association of pre–post pairing and DA signal is important for animals' reward learning, which is formalized as the theory of reinforcement learning. We finally discussed the roles of RP in the reinforcement learning based on our computational study.
Overview of modeling
We built computational models of intracellular signaling in RP. The model targeted D1R and D2R SPNs in the NAc core of the striatum, referred to as the D1 and D2 RP models, respectively (Fig 1A, S1 Fig). We aimed to simulate the temporal dynamics of molecular interactions as simple as possible (see Binding and enzymatic reactions). Thus, both the D1 and D2 RP models had only a single compartment. Some of parameters in the RP models were based on the preceding models [18–22], and many other parameters were updated according to recent experimental evidence (Tables in S1 Appendix).
(A) Signaling mechanisms for RP. The indicated signals were modeled in the present study. Membreane molecules are indicated by the gray shaded areas, whereas the others are cytosolic or extracellular molecules. (B) Enlargement of stimulated spines . Left: time lapse images. The target spine (red and white arrowheads) was stimulated with pre–post pairing (pairing of glutamate uncaging and action potential) and the DA burst with a 0.6-s delay (inset, also see Methods). Right: quantification of the volume change (ΔVH). From Yagishita et al., Science 26 Sep 2014:Vol. 345, Issue 6204, pp. 1616–1620 (DOI: 10.1126/science.1255514). Reprinted with permission from AAAS. (C) Critical time window for RP . Top and middle: time courses of pre–post pairing (top, blue arrowheads; 10Hz, 10 times) and DA burst (middle, red lines; 30Hz, 10 times). Bottom: increase in spine volume (ΔVH) against the time delay of DA burst from the onset of pre–post pairing. From Yagishita et al., Science 26 Sep 2014:Vol. 345, Issue 6204, pp. 1616–1620 (DOI:10.1126/science.1255514). Reprinted with permission from AAAS.
The D1 and D2 RP models had two types of inputs: pre–post pairing and DA burst/dip (see Stimulation). The pre–post pairing led to Ca2+ signal (Tables B and D in S1 Appendix), and the DA burst/dip led to Golf/Gi/o signal (Table C in S1 Appendix). Both of them converged on AC1 (Table E in S1 Appendix). The activated AC1 produced cyclic adenosine monophosphate (cAMP), which led to PKA activation, and the free catalytic (C) subunit of PKA phosphorylated DA- and cAMP-regulated phosphoprotein 32 kDa (DARPP32) at T34 (Table F in S1 Appendix). Phosphorylated DARPP32 inhibited protein phosphatase 1 (PP1) and thus led to CaMKII activation (Table G in S1 Appendix). In the experiment, the levels of PKA and CaMKII activities were observed using fluorescence resonance energy transfer (FRET) signals. Those experimental observations were compared with the activities of PKA (free C unit concentration) and CaMKII in the D1 RP model.
Computer simulation was carried out using MATLAB SimBiology (R2018a; MathWorks) under the Suite of Nonlinear and Differential/Algebraic Equation Solvers (SUNDIALS). The developed MATLAB code and its SBML-style files are available at the public repository GitHub (https://github.com/urakubo/ModelRP.git).
Binding and enzymatic reactions
All molecular interactions in the D1 and D2 RP models were represented by binding and enzymatic reactions in a deterministic manner [23,24]. For example, a binding reaction in which A binds to B to form AB is expressed by the following equation: where kf and kb are the rate constants for the forward and backward reactions, respectively. All higher order binding reactions (> 3) were decomposed into sets of second-order reactions. This is important because the approximation of higher order binding reactions is inappropriate to simulate temporal dynamics of molecules. Enzymatic reactions were modeled based on the Michaelis-Menten formulation: where S, E and P denote substrates, enzymes and products, respectively, and Km and kcat are the Michaelis constant and product turnover rate, respectively, and we did not consider E–S complexes for simplicity.
The D1 and D2 RP models had two type of inputs: pre–post pairing and DA burst/dip. In the experiment, the pre–post pairing constituted of 10-consective elemental pairs at 10 Hz, and each pair comprised a single glutamate uncaging and subsequent 3-consective postsynaptic spiking . We modeled the pre–post pairing by incrementing the level of Ca2+ channel opening ([CaChannel]), that represents NMDAR- and VGCC-mediated Ca2+ influxes as follows: where kdeact,CaChannel is the deactivation rate constant of the Ca2+ channels, which mainly represents VGCCs (Table B in S1 Appendix), and δ is the Dirac δ-function. Similarly, DA burst (10 DA spikes at 30 Hz) was modeled by incrementing the current DA level ([DA]): where kdec,DA is the decrease rate constant (Table C in S1 Appendix), DAmax is the amplitude of the DA signal (Table A in S1 Appendix), and tDA delay is the DA delay from the pre–post pairing. Finally, DA dip, which is a pause of tonic DA release from many DA fibers, was modeled by
Where DAbasal is the basal DA level (0.5 μM), DAtarg is the target DA level (0 μM), and tDA dip is the duration of DA dip (0.4 s). All the parameters are described in Tables A and C in S1 Appendix.
Modeling AC1 activation kinetics
Ca2+/CaM is known to stimulate AC1 with two types of time lags: first-order time constant and dead time [25,26], although the mechanism of the dead time is unknown. We thus assumed two intermediate inactive states of an AC1– Ca2+/CaM complex (AC1sub1, AC1sub2) as one of the simplest ways to introduce an effective dead time:
Where kon,AC,CaM and koff,AC,CaM are, respectively, the binding and unbinding rate constants of AC1 to Ca2+/CaM, and kup,AC and kdown,AC are the state transition rate constants, and x is the allosteric factor (see Table D in S1 Appendix). The active AC1– Ca2+/CaM complex showed enzymatic activity only when it further bound to Golf or A2AR. Overall scheme and parameters are described in Table D in S1 Appendix. The inactive states also introduced the first order time constant.
Modeling domain radius dependence
In the D1 RP model, the spine and soma were considered to be spheres with radiuses of 0.1 μm and 10 μm, respectively , which gave the surface-to-volume ratios (SVRs) of SVRspine = ~ 0.3 /μm, and SVRsoma = 30 /μm, because SVRspine = (4πrspine2) /(4πrspine3/3). Effective cytosolic concentrations of membrane molecules were determined by the multiplication of the SVR to keep the consistent molecular density per membrane area (Table A in S1 Appendix).
Fitting simulation to Epac-SH150 response
DA-concentration dependence of cAMP levels in the D1 and D2 RP models were fitted to experimentally observed responses of a cAMP sensor, Epac-SH150 . Yapo et al. have shown that Epac-SH150 did not affect cAMP dynamics . Thus, we simply modeled the Epac-SH150 activity, [Epac]bound, that obeys cAMP level as: where [Epac]total denotes the total Epac concentration, and KdEpac = 11 μM . Peak [Epac]bound levels were normalized for the fit.
Fitting exponential decays to the responses of AKAR and Camui
Experimentally observed time courses of AKAR and Camui activities were fitted with exponential decay functions. One thousand least-square fits were conducted against bootstrap replicates of 50 (AKAR) or 43 (Camui) observations, and the decay rate constants (τ) of the 1000 fits were used to obtain the means and 95% confidence intervals (CIs).
Requirement of AC1 in the D1 RP model
In Yagishita et al. (2014) , two kinds of stimuli were presented to D1 SPNs: glutamate uncaging paired with current injection that elicits post-synaptic spiking (pre–post pairing), and laser irradiation of channelrhodopsin-2 (ChR2)-expressing DA fibers to achieve pre-synaptic DA release (DA burst) (Fig 1B). The pre–post pairing alone did not induce any plastic changes of spines (without (w/o) DA, Fig 1C), but spine volume and synaptic efficacy were both increased when the pre–post pairing further coincided with the DA bursts within a narrow (~2 s) and asymmetric time window (Fig 1B and 1C). In the same study, we also explored the underlying signaling mechanisms in RP (Fig 1A); we found that the PKA signal codes the input timing-dependence (Fig 2D, left), as it shows high correlation with spine enlargement (Spearman’s correlation coefficient: 0.94; Fig 2B). Further, the PKA inhibitor PKI completely inhibited the spine volume change . Thus, not only does PKA hold the timing information of the two types of inputs, but also that information is necessary for the expression of RP .
(A) Schematics of PKA signaling in the somas and spines. The symbol ⨂ denotes the product of two inputs. (B) Spine PKA activity (AKAR2-CR) highly correlates with spine enlargement (r = 0.94, Spearman's correlation coefficient) . From Yagishita et al., Science 26 Sep 2014:Vol. 345, Issue 6204, pp. 1616–1620 (DOI:10.1126/science.1255514). Reprinted with permission from AAAS. (C) Time courses of PKA activity in the experiment (left; DA delay: 0.6 s)  and model (right; DA delay: 1 s). From Yagishita et al., Science 26 Sep 2014:Vol. 345, Issue 6204, pp. 1616–1620 (DOI:10.1126/science.1255514). Reprinted with permission from AAAS. (D) Critical time windows for PKA activity in the experiment and model (left and right, respectively). The peak amplitudes of active AKAR2-CR (left) and PKA-free catalytic subunits (right) were plotted. (E) DA-delay dependence of PKA activity. Gray shaded area denotes the periods of pre–post pairing, and red shaded area indicates the periods of DA bursts.
The coincidence detection for RP should be realized by an upstream of PKA, presumably ACs [4,29]. Literature showed that AC5 works in the striatum [9,15], and accordingly it has been involved in many computational models [19–21,30]. In our experiments, however, an AC1 inhibitor NB001 blocked the PKA activity of the spines [4,31]. AC1 is known to be activated only when it binds to Golf and Ca2+/CaM simultaneously [32,33]. The Golf signal can be caused by DA burst, and the Ca2+/CaM signal is mediated by pre–post pairing. The interaction of those two signals was expected to produce the critical time window for the PKA involvement in RP. We thus built a D1 RP model based on the kinetics of AC1 (Fig 2A).
In the D1 RP model, we simulated the dynamics of molecular interactions based on mass assumption (Figs 1C and 2A, right; Methods). DA burst caused D1R activation, and the activated D1R facilitated GDP–GTP exchange on Golf proteins, followed by the binding of the resultant free Golf to AC1. On the other hand, pre–post pairing caused VGCC/NMDAR-mediated Ca2+ influx. The Ca2+ was bound to CaM, and Ca2+/CaM was further bound to AC1. Golf and Ca2+/CaM synergistically activated AC1, and the activated AC1 produced cAMP, leading to PKA activation. As expected, the combination of pre–post pairing and DA burst led to PKA activation, consistent with the experiments (Fig 2C), and the delay of the DA burst reproduced the critical time window for the PKA activity (Fig 2D and 2E) . Thus, in the D1 RP model, AC1 indeed worked for the coincidence detection in RP.
We then examined the D1 RP model that has AC5 but not AC1, because existing studies showed that AC5 dominantly works in the striatum [9,11,15]. In this model, AC5 was activated by DA burst alone (Fig 2D, right; open circle), but additional pre–post pairing did not further amplify AC5 activation, let alone PKA activation (Fig 2D, right; thick dotted line). This result was consistent with the PKA activity in the somas of our experiment (Fig 2D, left; open circles ) and other experiments [16,34]. It is thus plausible that AC5 works in the somas whereas AC1 works in the spines and thin dendrites (Fig 2A).
Temporal contiguity detection by AC1
D1 RP has the asymmetric time window for the DA burst (Fig 2D), which is suitable for detection of a delayed reinforcer. We then addressed how the asymmetry of the time window is shaped (Fig 3A). In the D1 RP model, pre–post pairing resulted in the increase in Ca2+ level (Fig 3A, left) . The Ca2+ bound to CaM, and the Ca2+/CaM stimulated AC1. A precedent study has discovered the involvement of two types of time lags in this process, i.e., first-order time constant and dead time [25,36]. Here, the first-order time constant represents the primary delay of AC1 response against Ca2+ signal, where the AC1 response can be fitted by an exponential function (Fig 3A, left; red dotted line, left). The first-order time constant also determines the time-scale of AC1 activity after the end of Ca2+ signal (Fig 3A, left; red dotted line, right). The dead time denotes a latent time before AC1 response, and its mechanism is currently unknown. We thus assumed two intermediate “inactive” states of an AC1–Ca2+/CaM complex to introduce an effective dead time (see Methods; Table D in S1 Appendix). Note that those inactive states also introduce the first-order time constant. The related model parameters were set to give the dead time and first-order time constant of ~0.3 s (Fig 3A, left, periods between blue dotted lines) and ~2 s, respectively (Fig 3A, right, red dotted lines). On the other hand, DA burst led to the rapid binding and subsequent release of Golf to AC1 (Fig 3A, right). Taken together, rapid Golf binding to AC1 occurs during the slow and delayed time couse of the active AC1–Ca2+/CaM complex formation, resulting in AC1 activation (Fig 3A, bottom). This shaped an asymmetric time window for the temporal contiguity detection of pre–post pairing and DA signal (Fig 2D). Here, the first-order time constant extended the time window for DA signal by ~2 s beyond the end of pre–post pairing, whereas the dead time ensured no RP by the simultaneous stimulation of pre–post pairing (Ca2+ signal) and DA signal.
(A) Pre–post pairing stimulated AC1 through Ca2+/CaM signal with two types of time lags, i.e., dead time (period between blue dotted lines) and first-order time constant (red dotted lines), and Ca2+-stimulated AC1 was activated only during DA burst through Golf binding (right). DA bursts at particular times are plotted (red arrows, 0 s, 1 s, or 2 s). (B) Time window in a simple control model of AC1. Ca2+ signal [Ca(t) = 1 (0 < t ≤ 1), 0 (otherwise)] was first passed through a first-order time constant [1/(1+τs)] then a dead time [exp(−Ts)] before AC1 was stimulated. DA signal [DA(t) = 1 (tdelay < t ≤ tdelay + 0.3), 0 (otherwise)] directly stimulated AC1. Here, T is the dead time (0.3 s), τ is the first-order time constant (2 s), and tdelay is the DA delay. The time integration of AC(t) corresponds to PKA Free-C or ΔSynaptic strength. (C) Pre–post pairing (Ca2+-signal) duration dependence of the peak amplitudes of PKA-free C in the detailed D1 RP model. The peak PKA free-C values at the red horizontal lines are shown in the bottom panel.
While the detailed D1 RP model clarified that molecular dynamics can explain the time window, its complexity makes it difficult to understand the roles of AC1 intuitively. We thus built a simple control model (Fig 3B). In Laplace space, a first-order time constant is represented by 1/(1−τ s), and a dead time is represented by exp(−T s), where s is the Laplace variable, T is the dead time, and τ is the time constant. Using these equations, the response of AC1 can be described by:
Where L−1 (t) is the inverse Laplacian, Ca(s) is the Ca2+ signal in Laplace space, DA(t, tdelay) is the DA signal in time space, and tdelay is the DA delay. If the Ca2+ signal is represented by a square wave (Fig 3B, top), the AC1 response can be solved as: where H(t) is the Heaviside step function, and d is the duration of the square-wave Ca2+ signal. The final readout PKA Free-C or ΔSynaptic strength was defined by:
When the square waves of Ca2+ and DA signals were given to the simple model (Fig 3B, top), it produced the time window that was nearly the same as that in the detailed D1 RP model (Fig 3B, bottom). Therefore, the simple D1 RP model successfully captured the mechanism of the temporal contiguity detection.
Because the time window appeared along DA delay (Fig 3B, bottom), one may think that the time lags in the DA signal also lead to the similar asymmetric time window. However, in reality, the DA-signal time lags move the time window in the negative direction. For example, if the dead time in DA signal, TDA, is introduced, ΔSynaptic strength(tdelay) is changed to ΔSynaptic strength(tdelay+TDA), i.e., ΔSynaptic strength(0) appears at tdelay = −TDA. In RP, the longer time lags in the Ca2+ signal is essential for the detection of delayed DA signal.
Then, going back to the detailed RP model, we varied the duration of pre–post pairing, and the prolonged time window for the PKA activity was observed when the duration was extended (Fig 3C). This was caused by the progressive increase in Ca2+/CaM-bound AC1, which was not saturated at the 1-s pre–post pairing (Fig 3A, left). The simultaneous stimulation of the two inputs did not lead to PKA activation regardless of the duration of the pre–post pairing. Varieties of durations of pre–post spiking are observed in behavioral experiments (e.g., Isomura et al. ). Even in such in vivo conditions, the D1 RP model robustly detected the temporal order of pre–post pairing and DA signal, suggesting the central role of RP in the temporal contiguity detection in classical conditioning.
The detailed D1 RP model was built on many parameters (number of kinetic constants, 58; number of molecular species, 15), some of which may regulate the time window in addition to the AC1 coincidence detection. We thus conducted a sensitivity analysis to clarify what parameters contributed to forming the PKA time window (Fig 4). Here we defined three characteristics of the time window, i.e., the peak amplitude, peak delay, and the full widths at half maximum (FWHM) (Fig 4A), and top twelve sensitive parameters were checked if each of the parameters was changed by ±10% of its original value (Fig 4B and S3 Fig).
(A) Target characteristics of the parameter sensitivity analysis. The changes of amplitude, delay, and FWHM of the time window were quantified against ±10% changes of each of the model parameters, i.e., kinetic constants and molecular concentrations. (B) Top twelve parameters sensitive to the amplitude, delay and FWHM. Parameters regarding Ca2+, DA, and PKA signals were denoted by blue, orange, and green, respectively. When a model parameter was changed by −10% and +10%, the corresponding changes of a target characteristic were obtained as α1 and α2, respectively, and the average of their absolute values, i.e., (|α1|+|α2|)/2 was plotted. Sensitivities against all parameters were shown in S2 Fig.
In the sensitivity analysis, the amplitude of the time window was mainly affected by the parameters related to Ca2+ signal (Fig 4B, left; parameter names colored blue). For example, kinflux, CaChannel denotes the absolute rate of Ca2+ influx through NMDARs/VGCCs. This is because the 1-s Ca2+ signal was not sufficient to saturate PKA activity, and the increase in the amplitudes of the Ca2+ signal further increased the peak PKA activity. Next, the delay of the peak time was mostly governed by the parameter, kdown, AC1, which represents a transition rate between the states of AC1– Ca2+/CaM complex (Fig 4B, center; see Methods). This indicates that the time lags for AC1 activation was primarily important for the delay of the time window. The FWHM was also governed primarily by the parameter, kdown, AC1, in addition to DA-related parameters (Fig 4B, right; parameter names colored orange). This is because the FWHM was determined based on the convolution of Ca2+ and DA signals, and the DA signal is also important as the Ca2+ signal. Together, the sensitivity analysis clarified the importance of AC1 delay and validated the effectiveness of the simple model.
Domain-size dependence of PKA activity
In Yagishita et al. (2014), PKA activity in the somas persisted for longer periods than in the spines or thin dendrites (Fig 5B, filled circles). This persistence was insensitive to a phosphodiesterases (PDE) inhibitor, papaverine (Fig 5B, bottom; red crosses). This observation is explained with the detailed RP model as follows. ACs and phosphodiesterases (PDE) are membrane proteins, and PKA and cAMP are cytosolic molecules (Fig 4A). If a domain is as small as the spines, ACs and PDE in the membranes can rapidly produce and degrade cytosolic cAMP (Fig 5A, top), while they can only slowly produce or degrade cAMP if the domain is as large as the somas (Fig 4A, bottom). More precisely, if we can assume that ACs and PDE are uniformly distributed in the membrane, the number of membrane proteins per domain volume, which is proportional to the SVR, of the somas (SVRspine = ~ 0.3 /μm where radius rspine = ~0.1 μm; Methods) is 100-times smaller than that of the spines (SVRsoma 30 /μm where rspine = ~10 μm; Methods). Indeed, when we introduced the small SVR in the somas, the simulated PKA activity decreased much slower than in the spines (Fig 5C, solid lines). The PKA decrease was also insensitive to the PDE blockage (80% decrease in PDE concentration), because of the smaller contribution of PDE in the somas (Fig 5C, bottom; red dotted lines). Thus, the D1 RP model successfully recapitulated the characteristics of PKA activities in the somas and spines by taking their SVRs into account.
(A) Domain size affects cAMP dynamics. Spines/thin dendrites are small (radius r = ~0.1 μm), thus having small cytosolic volumes per membrane area (top). The membrane AC1 activity rapidly increases cytosolic cAMP level (the number density of black points), which is rapidly decreased by the membrane PDE. In the somas, cAMP level is increased by the membrane AC5, and slowly decreased by the membrane PDE (bottom). The cAMP levels (the number density of black points per cytosolic volume) are detected by PKA. (B) PKA activities (AKAR2-CR) in the spines and somas in the experiment . Here, DA denotes DA burst, AP designates postsynaptic burst, and papaverine is a PDE inhibitor. The data were taken from the experiment by Yagishita et al. . (C) The simulated PKA activity dynamics with (red dotted lines) and without (black lines) an 80% decrease in PDE. (D–G) Small domain size is necessary for the short time window for PKA activity. (D) A spherical domain with the radius (r). In this domain, pre–post pairing activated VGCCs and NMDARs, increasing cytosolic Ca2+ level for AC1 activation, while DA burst activated AC1 via membrane Golf. (E) Time windows for the PKA activity in the domain with the indicated radiuses r. (F) Ca2+ dynamics with the radiuses r indicated in (E). (G) Radius dependence of FWHMs of the time windows. The FWHMs under a fixed Ca2+ dynamics (same in the case of r = 0.1 μm) were also plotted (dashed line).
We also examined the dependence of the time window for PKA activity on the domain size. A spherical domain was modeled with the uniform densities of membrane AC1 as well as other membrane molecules (Fig 5D; Methods) in order to quantify the radius dependence of the time window. The FWHM of the time window was small (~1 s) when the radius r was less than 1 μm, comparable to those of the dendrites (r = 0.1~1.0 μm) or spines (r = ~0.1 μm) (Fig 5E and 5G). However, the FWHMs of the time window became large (> 2 s) if the domain radius was larger than 1 μm (Fig 5E and 5G), like those of the somas (r = ~10 μm). The broad time window was caused by the temporal broadening of cytosolic Ca2+ signal, an upstream signal of AC1 (Fig 5F), as in the case of cytosolic cAMP (Fig 5B and 5C). The FWHMs of the time windows remained unchanged when the Ca2+ dynamics of the domains was kept identical to that of the spines (Fig 5G, open squares). Together, the domain size critically affected the PKA activity through the interaction of cytosolic and membrane molecules, and the smallness of the spine and thin dendrites was important for shaping the short time window.
DA-dip detection in the D2 RP model
While D1 SPNs detect phasic DA burst for RP, D2 SPNs detect transient DA dip (0.4–3 s) [16,29,38–40]. We thus built a D2 RP model to examine its DA-dip detectability (Fig 6). The D2 RP model did not have D1R, but D2R and A2ARs (Fig 6A, right) [41,42]. A2AR is selectively expressed in D2 SPNs in the striatum [6,43,44] (Fig 6A, right), and A2AR is continuously activated by extracellular adenosine, counteracting with D2R . Details of the D2 RP model are described in Tables A–G of S1 Appendix.
(A) Schematics of signaling pathways in the D1 and D2 RP models (left and right, respectively). In D2 SPNs (right), AC1 is inhibited by D2R via Gi/o, and activated by A2AR and VGCC via Golf and Ca2+, respectively. They interact multiplicatively, resulting in PKA activity . (B) PKA responses of the D1 and D2 RP models against pre–post pairing (1-s duration) together with 1-s delayed DA signal (0.4-s duration) with the indicated concentrations. (C) DA concentration dependence of PKA response in the D1 and D2 RP models. (D) The curves show DA concentration dependence of Epac response in the D1 and D2 RP models (Methods). The overlaid data points were taken from the experiments by Yapo et al. . (E) Time window for PKA activity in the D2 RP model. Thick line: 0-μM DA dip (0.4-s duration). Dashed line: 2-μM DA increase (0.4-s duration).
The D1 and D2 RP models were used to simulate pre–post pairing (1-s duration) together with DA signal (0.4-s duration) lagging by 1 s, and the PKA responses were quantified (Fig 6B). To check the DA-dip detectability, we introduced 0.5 μM basal DA signal only to the D2 RP model (Fig 6B, right; see Discussion) . In the D1 RP model, the larger DA amplitude led to the larger PKA signal (Figs 6B, left, and 4C), because D1R stimulates AC1 via Golf (Fig 6A). In the D2 RP model, the larger DA amplitude conversely resulted in PKA suppression, whereas the transient DA dip activated PKA (Figs 6B, right and 6C). This is because the DA-dip decreased D2R-mediated Gi/o activity, resulting in the relief from the Gi/o inhibition of AC1, and the disinhibited AC1 was activated by the stimulation of Ca2+/CaM and A2AR-mediated Golf signals (Fig 6A). Note that Gi/o, Golf, and Ca2+/CaM multiplicatively regulate AC1 . The DA dose responses of peak cAMP levels in the models showed good fits with those in the precedent experiment using a cAMP biosensor Epac-SH150 (Fig 6A and 6D; Methods) . This clear fitting supported the opposite PKA responses between the D1 and D2 RP models.
Finally, we found that the D2 RP model detected the time lag between the pre–post pairing (1-s duration) and the DA dip (0 μM DA, 0.4-s duration), forming a time window for the PKA activation (Fig 6E). The time window was temporally asymmetric as in the case of the D1 RP model, suitable for detecting the temporal order of the two signals. If the DA-dip was replaced with 0.4-s DA increase, the PKA activation was conversely suppressed (Fig 6E, dotted line), showing the possible bidirectional regulation by DA bursts/dip [6,46]. Thus, the D2 RP model, which satisfied a variety of experimental constraints (see Tables A–G in S1 Appendix), successfully detected phasic DA-dip in an input-timing dependent manner.
Determinants of the time window in the D2 RP model
We also clarified the signaling dynamics transduced to AC1 in the D2 RP model (Fig 7A). As in the case of the D1 RP model, pre–post pairing resulted in the increase of Ca2+ level  (Fig 7A, top, left), leading to AC1 activation with a dead time and first-order time constant (Fig 7A, middle, left; Table D in S1 Appendix; see Methods). On the other hand, DA-dip signal resulted in rapid unbinding of Gi/o from AC1, which in turn activated AC1, due to the continuous Golf signal by A2AR (Figs 6A, right, and 7A, right). The time course of the AC1 activation was thus governed by the Gi/o-AC1 unbinding process, such as the GTP hydrolysis on Gi/o and the Gi/o-GDP unbinding from AC1 (see Discussion). The time window was asymmetrical with the DA dip lasting for 0.4 s (Fig 6E); the asymmetry allows detection of the temporal order of pre–post pairing and DA dip. We then built a simple model of D2 RP (Fig 7B). The square wave of Ca2+ signal was delayed with the two types of time lags (Fig 7B, middle, left) as in the case of the simple D1 model (Fig 3B), and DA-dip disinhibited AC1 due to the relief from D2R-activated Gi/o, which was represented by a first-order reaction. The total response of AC1 was formalized by: where K is the effective affinity of DA with D2R (0.3 μM). The final readout PKA Free-C or ΔSynaptic strength was the time integration of ACD2(t). The square-wave Ca2+ and DA signals (Fig 7B, top) produced the time window (Fig 7B, bottom) similar to that in the detailed D2 RP model (Fig 6E). Thus, the simple D2 model successfully recapitulated the characteristics of the detailed D2 RP model.
(A) Time lag between Ca2+ signal (pre–post pairing) and DA-dip (delay: −1 s, 1 s, or 2 s) was also detected by the AC1 coincidence detector in the D2 RP model. DA dips triggered at red arrow times (top, right) led to relief of AC1 from the Gi/o inhibition, and the relieved AC1 was activated by Ca2+ burst (left). The rapid unbinding of Gi/o activated the active Ca2+/CaM-bound AC1, resulting in the asymmetric time window (bottom). (B) A simple model of D2 RP. Ca2+ signal [Ca(t) = 1 (0 < t < 1), 0 (otherwise)] was simulated as in the case of D1 SPNs (see Fig 3A), whereas DA dips [DA(t) = 0 (Δt < t < Δt + 0.4), 0.5 (otherwise)] disinhibited AC1, which was formulated by 1 / (1+ DA(t)/K). Here, T was 0.3 s, τ was 2 s, and K is the effective affinity of DA with D2R (0.3 μM). (C) Pairing duration dependence of the peak amplitudes of PKA-free C in the detailed D2 RP model. The peak PKA free-C values at the red horizontal lines are shown in the bottom panel.
We also examined the duration dependence of pre–post pairing in the detailed D2 RP model (Fig 7C). Even when the duration of pre–post pairing was extended, the D2 RP model detects the temporal order between the pairing and DA-dip (Fig 7C). It should be noted that the basal level of PKA activation increased by the longer pre–post pairing, because continuous (tonic) baseline DA signal incompletely inhibited AC1 through Gi/o (Fig 7A, right) [6,46], and because the partially-uninhibited AC1 was activated by conjunctive stimulation of A2AR and Ca2+ signals, regardless of DA-dip delay. This PKA activity depended on the duration of pre–post pairing or Ca2+ signal, which resulted in DA-dip independent baseline PKA signal (Fig 7C, bottom). This suggests the requirement of unknown compensation or adaptation mechanisms for D2 SPNs (see Discussion).
Downstream signals of AC1 as signal integrators
We then examined the downstream signaling of AC1 as far as CaMKII to clarify their roles in RP (Fig 8A). In the D1 RP model, two input signals synergistically activated AC1. The AC1 produced cAMP, and thus activated PKA. The activated PKA phosphorylated DARPP32 at threonine 34 (T34). Subsequently, the phosphorylated DARPP32 bound to PP1 and inhibited its enzyme activity, which led to the activation of CaMKII (Fig 8B). Requirement of this signaling pathway has been validated by pharmacological experiments . The FWHMs of the durations of the AC1 activation, cAMP increase, PKA activation, DARPP32 phosphorylation, PP1 inhibition, and CaMKII activation were progressively increased in the model (Fig 8D, filled circles and square), while the similar difference had been observed in the decay rate constants between AKAR2 (PKA) and Camui (CaMKII) in the experiment (Fig 8C, left and right, respectively; Fig 8D, open circles) . The downstream signaling of AC1 thus worked as signal integrators to transform the 1-s short signal into persistent spine enlargement. Note that the timing information of pre–post pairing and DA burst was successfully maintained even after the signal integration process; The active CaMKII levels 30 s after the stimulation still showed the clear time windows for DA delay both in D1 and D2 RP models (Fig 8E). The CaMKII activation is considered to be tightly connected to the synaptic plasticity , although the signaling pathways downstream of CaMKII are still under investigation .
(A) Signaling downstream of AC1 at the spines. (B) Activities of the indicated molecules. Pre–post pairing together with 1-s delayed DA burst was given to the D1 RP model. Inset in the active CaMKII shows an overlay of the results of discrete sampling of CaMKII activity (filled circles, DA + pairing; open circles, paring only) as used for the experimental results (C, right; 0.2 Hz). (C) Experimentally-observed activities of PKA (AKAR2-CR) and CaMKII (Camui-CR) in D1 SPNs . The AKAR2-CR corresponds to the PKA-free C (B, left, bottom), and the Camui-CR corresponds to the active CaMKII (B, right, bottom). Exponential fits were overlaid. From Yagishita et al., Science 26 Sep 2014:Vol. 345, Issue 6204, pp. 1616–1620 (DOI:10.1126/science.1255514). Reprinted with permission from AAAS. (D) FWHM of the molecular activities in (B) (filled symbols). The CaMKII activity was separated into the short and sustained components (square and circle, respectively). Decay rate constants of the activities of PKA (AKAR2-CR) and CaMKII (Camui-CR) in (C) were overlaid (open circles; mean ± 95% CI, *p < 0.05). (E) The downstream molecule CaMKII maintains the time windows for DA delay. The CaMKII activities were taken at 30 s after the stimulation.
D1 SPNs memorize the association between situation/action signal (pre–post pairing) and DA bursts, whereas D2 SPNs retain the association between the situation/action signal and DA dip. This scenario was hypothesized in a modeling study  and has been validated experimentally [4,6,39,49]. We here computationally demonstrated that, in both D1 and D2 SPNs, AC1 plays important roles in associating pre–post pairing and such DA signals. The asymmetric time windows support a concept in classical conditioning, the temporal contiguity detection between unconditioned and conditioned stimuli (US and CS, respectively) . The theory of temporal difference (TD) learning predicts the association between situation/action (pre–post pairing) and RPE (DA signals) for reward expectation . Indeed, RP has recently been reported in vivo [40,52]. RP should thus play pivotal roles in such reinforcement learning, all of which are important for animals’ survival.
In the scheme of reinforcement learning, RP in D1 SPNs should work for solving credit assignment problem, i.e., discovering which choices are responsible for rewards in multiple situation/action signals with multiple delay time [53,54]. Situation/action events activate specific corticostriatal and thalamostriatal fibers, some of which are selectively strengthened if their activation coincides with phasic DA bursts (Fig 1). This strengthening results in the reinforcement of specific situation/action signals that are specifically associated with DA bursts. When situation/action signals are prolonged, such events can be decomposed into a series of phasic signals to the striatum (delay lines), owing to other neuronal systems, e.g., the hippocampus [55,56]. Selective strengthening of some of the delay lines is indeed equivalent to the process of original TD learning . Thus, RP in D1 SPNs is a plausible neural implementation of the TD learning.
In the simple model, we introduced two types of lag times for Ca2+-induced AC1 activation, i.e., first-order time constant (τ) and dead time (T). The first-order time constant formed the time window for delayed DA signal (Figs 2D and 3A). The dead time was also necessary to ensure no RP by the simultaneous stimulation of Ca2+ and DA (Figs 1B and 2D); if DA and Ca2+ signals are both given at onset time 0, the DA signal rapidly activates AC1, while the Ca2+-induced AC1 activation is postponed during the DA-activation period (Fig 3A). Onyike et al. originally discovered those time lags (τ = 13 s and T = ~2.5 s) , but they are too long for RP in the NAc. Because their experimental environment (artificially expressed AC1 in isolated insect cell membranes) is completely different from SPNs in the NAc, we set these rate constants so as to produce the time window (τ = 2 s and T = 0.3 s). The present study highlights the importance of identification of the time lags because they are directly related to animals' behaviors in classical conditioning.
In addition to the NAc, RP has been reported in multiple brain regions [54,57,58]. Shindou et al. discovered similar RP in the dorsomedial striatum, and its 2-s time window should be based on the similar mechanism . He et al. also discovered RP in the visual cortex that depends on serotonin (5HT) or norepinephrine (NE) signals . NE binds to β2A receptors, activating Gs (similar to Golf). Thus, NE-dependent LTP may also depend on AC1 coincidence detection. On the other hand, 5HT binds to 5HT2C receptors, which does not couple Golf, Gs, or Gi, indicating the involvement of other coincidence-detection mechanisms. Also, Brzosko et al. reported hippocampal RP that has a 30 min time window . The long-time window cannot be achieved by AC1 kinetics, but by other mechanisms, such as cAMP response element binding protein (CREB)-regulated gene expression. Thus, multiple mechanisms work for coincidence detection or eligibility trace in RP.
The peak of PKA activity in the D1 RP model appeared at the DA delay of 1 s (Fig 2D), whereas the peak in the experiment appeared at 0.6 s (Figs 1B and 2D). This difference can be resolved by introducing an amplification factor of AC1 activity during pre–post pairing (S3 Fig, top, left). DA fiber stimulation was given for 0.3 s (red arrows in S3 Fig, top, right); thus, Golf-bound AC1 also appeared for 0.3 s (curve in S3 Fig, top, right). The Golf binding is a prerequisite for AC1 activation. When it was convoluted with the amplification factor (S3 Fig, top, left), the time window showed the amplification ranging between −0.3 s and 0.7 s. The subsequent multiplication with Ca2+ signal resulted in the modified PKA time window with a 0.6-s peak as observed in the experiment (S3 Fig, bottom). Indeed, AC1 is reported to be amplified by membrane depolarization (postsynaptic spiking) through the increase in extracellular K+ level . Further studies are necessary to confirm the cause of the 0.6-s peak more clearly, but this suggests that such modulatory factors also affect DA-delay dependence in RP.
In the D1 RP model, cAMP levels in dendrites were increased by AC1, and those in somas were increased by AC5 (Figs 2D and 5A–5C). Those settings explain the time window for DA delay (Fig 2D) as well as the time course of PKA activation (Fig 5A–5C). However, this does not denote the actual localization of AC1 and AC5 in dendrites and somas, respectively. In the striatum, AC5 gives a ~80% in the total cAMP levels of neurons , is expressed not only in the somas but also in the dendrites of neurons , and contributes to synaptic plasticity . Considering our studies, AC1 and AC5 should rather be functionally separated. In particular, upon optogenetic stimulation of DA fibers to give phasic DA signals, the short stimulation may give spatiotemporally limited G-protein signaling only for AC1 activation. Indeed, AC1 is known to be sensitive to transient Ca2+ signal . Given that AC5 should also play roles in synaptic plasticity, AC5 may contribute to the time window in other forms of RP .
The multiple roles of D2 SPNs have been described. One study reports that the spiking of D2 SPNs encodes no-reward outcome and next-action selection , and another shows that the artificial activation of D2 SPNs leads to freezing and escape behaviors of mice . Together, D2 SPNs work for the change of animal’s current behavior into more favorable one, and this process requires detection of DA-dip as a regret signal . In the present study, the phasic DA dip was successfully detected by the D2 RP model with a set of plausible parameters (see Tables A–G of S1 Appendix). Although we did not address its parameter dependence in the present study, the DA-dip detection was determined not by activation rate constants, but by unbinding and deactivation rate constants of the Gi/o signaling. This is because the sudden decrease of DA triggers D2R deactivation and the subsequent Gi/o deactivation process. The Gi/o-GTP hydrolysis rate (deactivation rate) should be set within a certain range, and the deactivated Gi/o-GDP must be unbound from AC1 rapidly. Such parameter requirements for DA-dip detection will be addressed in the future study.
Several research groups have already proposed signaling models of RP. First, Yarali et al. developed a model of Drosophila aversive learning . In this model, AC detects the temporal order of US/CS-activated molecules, producing a 100-s time window. The wider time window is mainly due to the slower dynamics of G-protein and Ca2+ signals, e.g., their activation time constants are 7 s and 10 s, respectively, but further studies are necessary to clarify such species-dependent difference in RP signaling. Nakano et al. have also developed a model of RP . They considered acute DA effects on NMDARs and VGCCs within several hundred milliseconds, and predicted larger Ca2+ signals when the DA signal precedes postsynaptic spiking. In our experimental conditions, we did not see any DA signal-dependent change of Ca2+ level , but such receptor/channel modulation may affect the excitability of SPNs . Nair et al. have also simulated Ca2+- and DA-timing-dependent CaMKII activity , linking to RP. They showed the time window for CaMKII activation that is caused by input-timing dependent phosphorylation of DARPP32 and a Ca2+ buffer. However, their model, in principle, does not show any Ca2+ and DA timing dependence in PKA activity, going against the clear dependence shown in our experiments (Fig 2D). Together, only the current model has successfully described RP that is dependent on pre–post pairing and DA burst/dip, taking timing dependence as well as signaling mechanisms into account.
The current D1 and D2 RP models were single-compartment models, and no-spatial factors were considered (Methods). This simplification facilitates conceptual understanding, but it leaves important issues that need to be addressed in future studies: spatial interaction between dendrite-wide PKA signal (Fig 2C) and stimulated synapse-specific spine enlargement (Fig 1B). In Yagishita et al., PKA was activated in a dendrite-wide manner , which was presumably the result of dendritic AC1 activation in response to dendritic Ca2+ and Golf /Gi/o signals. The dendritic Ca2+ is mediated by VGCC through postsynaptic spiking, and DA fiber activation gives spatially-nonspecific DA signal , activating dendritic D1R/D2R for the dendritic Golf /Gi/o signals . On the other hand, focal uncaging of glutamate triggers stimulated synapse-specific CaMKII activation, resulting in specific spine enlargement. The interaction between those two signals may determine the spatial extent of RP in the dendritic space. This issue will be targeted by reaction-diffusion models of intracellular signaling .
In vivo, DA fibers in mice tonically deliver spikes at ~5 Hz (e.g., Cohen et al. ), and the tonic DA firing gives basal DA signal. While measured basal DA concentrations differ depending on literature, ranging from 10 nM to 1 μM [69–71], we set the basal DA concentration at 0.5 μM on the basis of a microdialysis experiment  (Fig 4B–4E). Some researchers consider the basal DA concentration to be a few-dozen nanomolar [29,72], which is close to D2R’s affinity for DA in the high affinity state of D2R (Kd: ~ 40 nM) , i.e., in which the D2R is pre-coupled with Gi/o . The basal DA is within the dynamic range of the D2R; however, the high affinity state of D2R has slow unbinding rate constant of DA (t1/2 = 80 s), unsuitable for detecting 0.4-s DA dip . Rather, the dynamics of striatal DA signal in vivo can be observed using genetically-encoded fluorescent DA sensors that have Kds of 0.3–1.6 μM , and D1R and D2R SPNs show cAMP responses against sub-micromolar DA signals [16,77] (Fig 4D). Thus, basal DA concentration seems to lie within a sub-micromolar range, and D1R and D2R may operate under their low affinity states (Kd of D1R: ~ 5 μM; Kd of D2R: ~ 2 μM) . Indeed, fitting our models to the DA-dose responses of cAMP (Fig 4) gave the Kds of 2 μM (D1R) and 10 μM (D2R), both of which are rather close to those of their low affinity states. In the low affinity state, D1R and D2R do not form a stable complex with G-protein, but activate many surrounding G-proteins . Although our results support the low affinity states of D1R and D2R, much controversy exists over this topic, and further studies will be necessary to draw the conclusion.
In the present study, we did not simulate the D1 RP model under the basal DA signal. This was because it was difficult to build a quantitative model of the adaptation mechanisms of D1R. Most prominently, while the DA–D1R binding results in PKA activation (Fig 2A), intracellular PKA activity in turn leads to a 100-fold decrease of D1R’s affinity for DA and D1R’s maximal activity also decreases by 100-fold with a time constant of 10 min . Thus, the D1R–ACs–PKA signaling constitutes a negative feedback system to limit PKA response solely to phasic DA signal. Similar adaptations also exist in D2R SPNs [79,80], and part of them should be regulated by immediate early genes, which are beyond the scope of the present study. In short, the DA sensitivity of PKA is modifiable both in D1R and D2R SPNs, especially by the basal DA signal itself. Quantitative modeling of such adaptations is an important future direction, because they could be related to Parkinson’s disease and drug abuse. Symptoms of Parkinson’s disease appear after over 50% loss of DA fibers , and the robustness against the DA loss should partly come from the above mentioned adaptations. Chronic cocaine intake increases DA level in the NAc  as well as the levels of cAMP and CREB , all of which are expected to affect RP. If those factors are successfully taken into consideration, the RP models presented here will become more powerful to give a comprehensive view of the whole DA system for RP.
S1 Fig. Overview of the D1 and D2 RP models.
Left: D1 RP model. Pre−post pairing and DA burst/dip were inputs, and PKA activity, CaMKII activity, and synaptic volume change were readouts in the experiment. Right: D2 RP model. Unlike the D1 RP model, DA dip disinhibited AC1 via the relief from Gi/o inhibition, while pre−post pairing and A2AR activated AC1.
S2 Fig. Overall results of the parameter sensitivity analysis on the D1 RP model.
The changes of amplitude (A), delay (B), and FWHM (C) of the time window were quantified when each target kinetics constant or molecular concentration was changed by ±10%. The mean changes of the top twelve changes were plotted in Fig 4.
S3 Fig. Amplification of AC1 during pre-post pairing resulted in the peak of PKA at a 0.6-s DA delay.
The amplification factor of AC1 activity during pre-post pairing (top, left) was convoluted with DA-dependent Golf stimulation for 0.3 s (top, right). This convolution resulted in the amplification of AC1 between −0.3 s and 0.7 s (middle). The amplification was further multiplied with Ca2+ signal, which resulted in a PKA time window with a 0.6-s peak, as observed in the experiment (bottom, blue). Red arrows denote the times of DA-fiber stimulation (top, right). The amplification level was set to be ×1.45 to give a best fit with the experiment.
We would like to thank Yoshihisa Fujita and Kouichi C. Nakamura for their helpful comments.
- 1. Schultz W. Neuronal reward and decision signals: from theories to data. Physiol Rev. 2015; 95(3): 853–951. pmid:26109341.
- 2. Meredith GE, Baldo BA, Andrezjewski ME, Kelley AE. The structural basis for mapping behavior onto the ventral striatum and its subdivisions. Brain Struct Funct. 2008; 213(1–2): 17–27. pmid:18256852.
- 3. Sesack SR, Grace AA. Cortico-Basal Ganglia reward network: microcircuitry. Neuropsychopharmacology. 2010; 35(1): 27–47. pmid:19675534
- 4. Yagishita S, Hayashi-Takagi A, Ellis-Davies GCR, Urakubo H, Ishii S, Kasai H. A critical time window for dopamine actions on the structural plasticity of dendritic spines. Science. 2014; 345(6204): 1616–1620. ISI:000342164500047. pmid:25258080
- 5. Black J, Belluzzi JD, Stein L. Reinforcement delay of one second severely impairs acquisition of brain self-stimulation. Brain Res. 1985; 359(1–2): 113–119. pmid:4075139.
- 6. Shen W, Flajolet M, Greengard P, Surmeier DJ. Dichotomous dopaminergic control of striatal synaptic plasticity. Science. 2008; 321(5890): 848–851. pmid:18687967.
- 7. Fino E, Paille V, Cui Y, Morera-Herreras T, Deniau JM, Venance L. Distinct coincidence detectors govern the corticostriatal spike timing-dependent plasticity. J Physiol. 2010; 588(Pt 16): 3045–3062. pmid:20603333.
- 8. Caporale N, Dan Y. Spike timing-dependent plasticity: a Hebbian learning rule. Annu Rev Neurosci. 2008; 31: 25–46. pmid:18275283.
- 9. Lee KW, Hong JH, Choi IY, Che Y, Lee JK, Yang SD, et al. Impaired D2 dopamine receptor function in mice lacking type 5 adenylyl cyclase. J Neurosci. 2002; 22(18): 7931–7940. pmid:12223546.
- 10. Xie K, Masuho I, Shih CC, Cao Y, Sasaki K, Lai CW, et al. Stable G protein-effector complexes in striatal neurons: mechanism of assembly and role in neurotransmitter signaling. Elife. 2015; 4. Epub 2015/11/28. pmid:26613416.
- 11. Kheirbek MA, Britt JP, Beeler JA, Ishikawa Y, McGehee DS, Zhuang X. Adenylyl cyclase type 5 contributes to corticostriatal plasticity and striatum-dependent learning. J Neurosci. 2009; 29(39): 12115–12124. pmid:19793969.
- 12. Nakano T, Yoshimoto J, Doya K. A model-based prediction of the calcium responses in the striatal synaptic spines depending on the timing of cortical and dopaminergic inputs and post-synaptic spikes. Front Comput Neurosci. 2013; 7: 119. pmid:24062681.
- 13. Nair AG, Bhalla US, Hellgren Kotaleski J. Role of DARPP-32 and ARPP-21 in the Emergence of Temporal Constraints on Striatal Calcium and Dopamine Integration. PLoS Comput Biol. 2016; 12(9): e1005080. pmid:27584878.
- 14. Lee K, Claar LD, Hachisuka A, Bakhurin KI, Nguyen J, Trott JM, et al. Temporally restricted dopaminergic control of reward-conditioned movements. Nat Neurosci. 2020; 23(2): 209–216. pmid:31932769.
- 15. Visel A, Alvarez-Bolado G, Thaller C, Eichele G. Comprehensive analysis of the expression patterns of the adenylate cyclase gene family in the developing and adult mouse brain. J Comp Neurol. 2006; 496(5): 684–697. pmid:16615126.
- 16. Yapo C, Nair AG, Clement L, Castro LR, Hellgren Kotaleski J, Vincent P. Detection of phasic dopamine by D1 and D2 striatal medium spiny neurons. J Physiol. 2017; 595(24): 7451–7475. pmid:28782235.
- 17. Zhang X, Nagai T, Ahammad RU, Kuroda K, Nakamuta S, Nakano T, et al. Balance between dopamine and adenosine signals regulates the PKA/Rap1 pathway in striatal medium spiny neurons. Neurochem Int. 2019; 122: 8–18. pmid:30336179.
- 18. Kotter R. Postsynaptic integration of glutamatergic and dopaminergic signals in the striatum. Prog Neurobiol. 1994; 44(2): 163–196. Epub 1994/10/01. pmid:7831476
- 19. Lindskog M, Kim M, Wikstrom MA, Blackwell KT, Kotaleski JH. Transient calcium and dopamine increase PKA activity and DARPP-32 phosphorylation. PLoS Comput Biol. 2006; 2(9): e119. Epub 2006/09/13. pmid:16965177.
- 20. Nakano T, Doi T, Yoshimoto J, Doya K. A kinetic model of dopamine- and calcium-dependent striatal synaptic plasticity. PLoS Comput Biol. 2010; 6(2): e1000670. Epub 2010/02/20. pmid:20169176.
- 21. Fernandez E, Schiappa R, Girault JA, Le Novere N. DARPP-32 is a robust integrator of dopamine and glutamate signals. PLoS Computational Biology. 2006; 2(12): e176. pmid:17194217
- 22. Li L, Gervasi N, Girault JA. Dendritic geometry shapes neuronal cAMP signalling to the nucleus. Nature Communications. 2015; 6. ISI:000350290900002.
- 23. Bhalla US, Iyengar R. Emergent properties of networks of biological signaling pathways. Science. 1999; 283(5400): 381–387. pmid:9888852.
- 24. Urakubo H, Honda M, Froemke RC, Kuroda S. Requirement of an allosteric kinetics of NMDA receptors for spike timing-dependent plasticity. J Neurosci. 2008; 28(13): 3310–3323. Epub 2008/03/28. pmid:18367598.
- 25. Onyike CU, Lin AH, Abrams TW. Persistence of the interaction of calmodulin with adenylyl cyclase: implications for integration of transient calcium stimuli. J Neurochem. 1998; 71(3): 1298–1306. pmid:9721756.
- 26. Masada N, Ciruela A, Macdougall DA, Cooper DM. Distinct mechanisms of regulation by Ca2+/calmodulin of type 1 and 8 adenylyl cyclases support their different physiological roles. J Biol Chem. 2009; 284(7): 4451–4463. Epub 2008/11/26. pmid:19029295.
- 27. Stuart G, Spruston N, Hausser M. Dendrites: Oxford Unversity Press; 1999.
- 28. Polito M, Klarenbeek J, Jalink K, Paupardin-Tritsch D, Vincent P, Castro LR. The NO/cGMP pathway inhibits transient cAMP signals through the activation of PDE2 in striatal neurons. Front Cell Neurosci. 2013; 7: 211. pmid:24302895.
- 29. Nair AG, Gutierrez-Arenas O, Eriksson O, Vincent P, Hellgren Kotaleski J. Sensing Positive versus Negative Reward Signals through Adenylyl Cyclase-Coupled GPCRs in Direct and Indirect Pathway Striatal Medium Spiny Neurons. J Neurosci. 2015; 35(41): 14017–14030. pmid:26468202.
- 30. Glatt CE, Snyder SH. Cloning and expression of an adenylyl cyclase localized to the corpus striatum. Nature. 1993; 361(6412): 536–538. pmid:8429907.
- 31. Wang H, Xu H, Wu LJ, Kim SS, Chen T, Koga K, et al. Identification of an adenylyl cyclase inhibitor for treating neuropathic and inflammatory pain. Sci Transl Med. 2011; 3(65): 65ra63. pmid:21228397.
- 32. Wayman GA, Impey S, Wu Z, Kindsvogel W, Prichard L, Storm DR. Synergistic activation of the type I adenylyl cyclase by Ca2+ and Gs-coupled receptors in vivo. J Biol Chem. 1994; 269(41): 25400–25405. Epub 1994/10/14. pmid:7929237.
- 33. Ferguson GD, Storm DR. Why calcium-stimulated adenylyl cyclases? Physiology (Bethesda). 2004; 19: 271–276. Epub 2004/09/24. pmid:15381755.
- 34. Castro LR, Brito M, Guiot E, Polito M, Korn CW, Herve D, et al. Striatal neurones have a specific ability to respond to phasic dopamine release. J Physiol. 2013; 591(Pt 13): 3197–3214. Epub 2013/04/05. pmid:23551948.
- 35. Carter AG, Sabatini BL. State-dependent calcium signaling in dendritic spines of striatal medium spiny neurons. Neuron. 2004; 44(3): 483–493. ISI:000224771700010. pmid:15504328
- 36. Yarali A, Nehrkorn J, Tanimoto H, Herz AV. Event timing in associative learning: from biochemical reaction dynamics to behavioural observations. PLoS One. 2012; 7(3): e32885. pmid:22493657.
- 37. Isomura Y, Takekawa T, Harukuni R, Handa T, Aizawa H, Takada M, et al. Reward-modulated motor information in identified striatum neurons. J Neurosci. 2013; 33(25): 10209–10220. pmid:23785137.
- 38. Frank MJ. Dynamic dopamine modulation in the basal ganglia: a neurocomputational account of cognitive deficits in medicated and nonmedicated Parkinsonism. J Cogn Neurosci. 2005; 17(1): 51–72. pmid:15701239.
- 39. Danjo T, Yoshimi K, Funabiki K, Yawata S, Nakanishi S. Aversive behavior induced by optogenetic inactivation of ventral tegmental area dopamine neurons is mediated by dopamine D2 receptors in the nucleus accumbens. Proc Natl Acad Sci U S A. 2014; 111(17): 6455–6460. pmid:24737889.
- 40. Iino Y, Sawada T, Yamaguchi K, Tajiri M, Ishii S, Kasai H, et al. Dopamine D2 receptors in discrimination learning and spine enlargement. Nature. 2020; 579(7800): 555–560. Epub 2020/03/28. pmid:32214250.
- 41. Fink JS, Weaver DR, Rivkees SA, Peterfreund RA, Pollack AE, Adler EM, et al. Molecular cloning of the rat A2 adenosine receptor: selective co-expression with D2 dopamine receptors in rat striatum. Brain Res Mol Brain Res. 1992; 14(3): 186–195. pmid:1279342.
- 42. Schiffmann SN, Fisone G, Moresco R, Cunha RA, Ferre S. Adenosine A2A receptors and basal ganglia physiology. Prog Neurobiol. 2007; 83(5): 277–292. pmid:17646043.
- 43. Svenningsson P, Le Moine C, Fisone G, Fredholm BB. Distribution, biochemistry and function of striatal adenosine A2A receptors. Prog Neurobiol. 1999; 59(4): 355–396. pmid:10501634.
- 44. Kull B, Svenningsson P, Fredholm BB. Adenosine A(2A) receptors are colocalized with and activate g(olf) in rat striatum. Mol Pharmacol. 2000; 58(4): 771–777. pmid:10999947
- 45. Taussig R, Tang WJ, Hepler JR, Gilman AG. Distinct patterns of bidirectional regulation of mammalian adenylyl cyclases. J Biol Chem. 1994; 269(8): 6093–6100. pmid:8119955.
- 46. Bateup HS, Svenningsson P, Kuroiwa M, Gong S, Nishi A, Heintz N, et al. Cell type-specific regulation of DARPP-32 phosphorylation by psychostimulant and antipsychotic drugs. Nat Neurosci. 2008; 11(8): 932–939. pmid:18622401.
- 47. Lisman J, Yasuda R, Raghavachari S. Mechanisms of CaMKII action in long-term potentiation. Nat Rev Neurosci. 2012; 13(3): 169–182. pmid:22334212.
- 48. Murakoshi H, Yasuda R. Postsynaptic signaling during plasticity of dendritic spines. Trends Neurosci. 2012; 35(2): 135–143. pmid:22222350.
- 49. Cox SM, Frank MJ, Larcher K, Fellows LK, Clark CA, Leyton M, et al. Striatal D1 and D2 signaling differentially predict learning from positive and negative outcomes. Neuroimage. 2015; 109: 95–101. pmid:25562824.
- 50. Abrams TW, Kandel ER. Is contiguity detection in classical conditioning a system or a cellular property? Learning in Aplysia suggests a possible molecular site. Trends Neurosci. 1988; 11(4): 128–135. pmid:2469180.
- 51. Schultz W, Dayan P, Montague PR. A neural substrate of prediction and reward. Science. 1997; 275(5306): 1593–1599. pmid:9054347.
- 52. Fisher SD, Robertson PB, Black MJ, Redgrave P, Sagar MA, Abraham WC, et al. Reinforcement determines the timing dependence of corticostriatal synaptic plasticity in vivo. Nat Commun. 2017; 8(1): 334. pmid:28839128.
- 53. Stolyarova A. Solving the Credit Assignment Problem With the Prefrontal Cortex. Front Neurosci. 2018; 12: 182. pmid:29636659.
- 54. Perrin E, Venance L. Bridging the gap between striatal plasticity and learning. Curr Opin Neurobiol. 2019; 54: 104–112. pmid:30321866.
- 55. Foster DJ, Wilson MA. Reverse replay of behavioural sequences in hippocampal place cells during the awake state. Nature. 2006; 440(7084): 680–683. pmid:16474382.
- 56. van der Meer MA, Redish AD. pmid:21221409. Front Neurosci. 2010; 4: 6.
- 57. Gerstner W, Lehmann M, Liakoni V, Corneil D, Brea J. Eligibility Traces and Plasticity on Behavioral Time Scales: Experimental Support of NeoHebbian Three-Factor Learning Rules. Front Neural Circuits. 2018; 12: 53. pmid:30108488.
- 58. Brzosko Z, Schultz W, Paulsen O. Retroactive modulation of spike timing-dependent plasticity by dopamine. Elife. 2015; 4. pmid:26516682.
- 59. Shindou T, Shindou M, Watanabe S, Wickens J. A silent eligibility trace enables dopamine-dependent synaptic plasticity for reinforcement learning in the mouse striatum. Eur J Neurosci. 2019; 49(5): 726–736. pmid:29603470.
- 60. He K, Huertas M, Hong SZ, Tie X, Hell JW, Shouval H, et al. Distinct Eligibility Traces for LTP and LTD in Cortical Synapses. Neuron. 2015; 88(3): 528–538. pmid:26593091.
- 61. Cooper DM, Schell MJ, Thorn P, Irvine RF. Regulation of adenylyl cyclase by membrane potential. J Biol Chem. 1998; 273(42): 27703–27707. pmid:9765307.
- 62. Wang SC, Lai HL, Chiu YT, Ou R, Huang CL, Chern Y. Regulation of type V adenylate cyclase by Ric8a, a guanine nucleotide exchange factor. Biochem J. 2007; 406(3): 383–388. pmid:17593019.
- 63. Nonomura S, Nishizawa K, Sakai Y, Kawaguchi Y, Kato S, Uchigashima M, et al. Monitoring and Updating of Action Selection for Goal-Directed Behavior through the Striatal Direct and Indirect Pathways. Neuron. 2018; 99(6): 1302–1314 e1305. pmid:30146299.
- 64. Kravitz AV, Tye LD, Kreitzer AC. Distinct roles for direct and indirect pathway striatal neurons in reinforcement. Nat Neurosci. 2012; 15(6): 816–818. pmid:22544310.
- 65. Arbuthnott GW, Wickens J. Space, time and dopamine. Trends Neurosci. 2007; 30(2): 62–69. Epub 2006/12/19. pmid:17173981.
- 66. Uchigashima M, Ohtsuka T, Kobayashi K, Watanabe M. Dopamine synapse is a neuroligin-2-mediated contact between dopaminergic presynaptic and GABAergic postsynaptic structures. Proc Natl Acad Sci U S A. 2016; 113(15): 4206–4211. pmid:27035941.
- 67. Blackwell KT, Salinas AG, Tewatia P, English B, Hellgren Kotaleski J, Lovinger DM. Molecular mechanisms underlying striatal synaptic plasticity: relevance to chronic alcohol consumption and seeking. Eur J Neurosci. 2019; 49(6): 768–783. pmid:29602186.
- 68. Cohen JY, Haesler S, Vong L, Lowell BB, Uchida N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature. 2012; 482(7383): 85–88. pmid:22258508.
- 69. Abercrombie ED, Keefe KA, DiFrischia DS, Zigmond MJ. Differential effect of stress on in vivo dopamine release in striatum, nucleus accumbens, and medial frontal cortex. J Neurochem. 1989; 52(5): 1655–1658. pmid:2709017.
- 70. Chen KC, Budygin EA. Extracting the basal extracellular dopamine concentrations from the evoked responses: re-analysis of the dopamine kinetics. J Neurosci Methods. 2007; 164(1): 27–42. pmid:17498808.
- 71. Lindefors N, Amberg G, Ungerstedt U. Intracerebral microdialysis: I. Experimental studies of diffusion kinetics. J Pharmacol Methods. 1989; 22(3): 141–156. pmid:2586111.
- 72. Schultz W. Multiple dopamine functions at different time courses. Annu Rev Neurosci. 2007; 30: 259–288. pmid:17600522.
- 73. Richfield EK, Penney JB, Young AB. Anatomical and affinity state comparisons between dopamine D1 and D2 receptors in the rat central nervous system. Neuroscience. 1989; 30(3): 767–777. pmid:2528080.
- 74. Skinbjerg M, Sibley DR, Javitch JA, Abi-Dargham A. Imaging the high-affinity state of the dopamine D2 receptor in vivo: fact or fiction? Biochem Pharmacol. 2012; 83(2): 193–198. pmid:21945484.
- 75. Hunger L, Kumar A, Schmidt R. Abundance compensates kinetics: Similar effect of dopamine signals on D1 and D2 receptor populations. bioRxiv. 2019: 444984.
- 76. Patriarchi T, Cho JR, Merten K, Howe MW, Marley A, Xiong WH, et al. Ultrafast neuronal imaging of dopamine dynamics with designed genetically encoded sensors. Science. 2018; 360(6396). pmid:29853555.
- 77. Marcott PF, Mamaligas AA, Ford CP. Phasic dopamine release drives rapid activation of striatal D2-receptors. Neuron. 2014; 84(1): 164–176. pmid:25242218
- 78. Jiang D, Sibley DR. pmid:10496949. Mol Pharmacol. 1999; 56(4): 675–683.
- 79. Watts VJ, Neve KA. Sensitization of adenylate pmid:15922020 by Galpha i/o-coupled receptors. Pharmacol Ther. 2005; 106(3): 405–421.
- 80. Bonaventura J, Navarro G, Casado-Anguera V, Azdad K, Rea W, Moreno E, et al. Allosteric interactions between agonists and antagonists within the adenosine A2A receptor-dopamine D2 receptor heterotetramer. Proc Natl Acad Sci U S A. 2015; 112(27): E3609–3618. pmid:26100888.
- 81. Burke RE, O'Malley K. Axon degeneration in Parkinson's disease. Exp Neurol. 2013; 246: 72–83. pmid:22285449.
- 82. Petitti DB, Coleman C. Cocaine and the risk of low birth weight. Am J Public Health. 1990; 80(1): 25–28. pmid:2293798.
- 83. Nestler EJ. Historical review: Molecular and cellular mechanisms of opiate and cocaine addiction. Trends Pharmacol Sci. 2004; 25(4): 210–218. pmid:15063085.