Suppression of Dopamine Neurons Mediates Reward

Massive activation of dopamine neurons is critical for natural reward and drug abuse. In contrast, the significance of their spontaneous activity remains elusive. In Drosophila melanogaster, depolarization of the protocerebral anterior medial (PAM) cluster dopamine neurons en masse signals reward to the mushroom body (MB) and drives appetitive memory. Focusing on the functional heterogeneity of PAM cluster neurons, we identified that a single class of PAM neurons, PAM-γ3, mediates sugar reward by suppressing their own activity. PAM-γ3 is selectively required for appetitive olfactory learning, while activation of these neurons in turn induces aversive memory. Ongoing activity of PAM-γ3 gets suppressed upon sugar ingestion. Strikingly, transient inactivation of basal PAM-γ3 activity can substitute for reward and induces appetitive memory. Furthermore, we identified the satiety-signaling neuropeptide Allatostatin A (AstA) as a key mediator that conveys inhibitory input onto PAM-γ3. Our results suggest the significance of basal dopamine release in reward signaling and reveal a circuit mechanism for negative regulation.


Author Summary
Dopamine neurons in the midbrain of mammals fire action potentials in response to rewarding stimuli, while punitive stimuli or omission of reward suppress their activity. Different signs in the activity of dopamine neurons thus can encode appetitive and aversive values; however, how these bidirectional activities directly relate to behavior has yet to be elucidated. In fruit flies Drosophila, en masse activation of dopaminergic neurons in the protocerebral anterior medial (PAM) cluster has been shown to signal reward. Here, we demonstrate that a specific sub-class of these dopaminergic neurons, called PAM-γ3, mediates both aversive and appetitive reinforcement through activation and suppression of their activity, respectively. Notably, transient inactivation of the basal activity of PAM-γ3 neurons substitutes for reward and induces appetitive memory formation. Interestingly, we found that Allatostatin A, a neuropeptide that signals satiety, conveys inhibitory input onto PAM-γ3 neurons. Our results highlight the bidirectional activity of defined dopaminergic neurons, which underlies encoding of behaviorally relevant appetitive and aversive values.
Besides phasic neurotransmission, recent studies revealed that valence-coding dopamine neurons have basal activity with fluctuating Ca2+ transients at the presynaptic terminals in the MB [17-21]. Ongoing dopamine release has been shown to control state-dependent consolidation of associative memory [18-21]. Considering the functional heterogeneity of the adult PAM cluster neurons [7,12,17], excitation and inhibition of dopamine neurons may signal appetitive and aversive values to drive bidirectional associative memories.
By characterizing PAM-γ3, a single class of dopamine neurons projecting to the γ3 region of the MB, we here show that sugar ingestion drives appetitive memory by suppressing the baseline activity of PAM-γ3. Furthermore, we searched for feeding-related signal molecules that inhibit PAM-γ3 and identified the neurons expressing the neuropeptide Allatostatin A (AstA). These results point to the importance of basal dopamine release and its negative regulation in reward processing.

Results
PAM cluster dopamine neurons are heterogeneous both morphologically and functionally and project to distinct domains of the MB [7,12,17,22,23]. PAM-γ3 neurons extend their dendritic arbors in the brain area surrounding the MB medial lobes (crepine) and project specifically to the γ3 compartment of the MB (Fig 1A and 1B). While the majority of PAM neurons convey reward, previous studies implied that PAM-γ3 mediates aversive reinforcement [6,12]. However, additional GAL4 expression in other PAM neurons and nondopaminergic neurons of the driver line precluded identifying the responsible cells. We thus employed recently established split-GAL4 drivers MB441B-GAL4 and MB195B-GAL4 to specifically label 9 and 5 PAM-γ3 neurons, respectively [22] (Fig 1A).
To examine the role of PAM-γ3 neurons in learning, we activated them by directing the expression of dTrpA1, a temperature-sensitive cation channel [24], using MB441B-GAL4 and MB195B-GAL4. Simultaneous presentation of an odor with thermoactivation of the PAM-γ3 neurons induced robust conditioned avoidance of the paired odor (Fig 2A). Aversive memory established by thermoactivation of PAM-γ3 suggests their role in aversive reinforcement. As electric shock, compared to other aversive reinforcers, recruits the broadest set of dopamine neurons [6,10,14], we examined the requirement of the PAM-γ3 neurons in shock-induced aversive olfactory memory using targeted expression of shibire ts1 (shi ts1), a temperature-sensitive dominant negative form of the dynamin GTPase that inhibits vesicle endocytosis [25]. Surprisingly, thermal blockade of PAM-γ3 using MB441B and MB195B did not significantly impair electric shock-reinforced aversive memory (Fig 2B). The result was the same with the Shi ts1 blockade by another driver, R58E02-GAL4, which strongly drives transgene expression in the majority of the PAM cluster neurons, including PAM-γ3 [7] (S1 Fig).
As multiple classes of adult PAM neurons contribute to reward signaling differently [7,12,17], we examined the requirement of PAM-γ3 for sugar-induced appetitive memory. Contrary to aversive memory induced by depolarizing the PAM-γ3 neurons, thermal blockade of PAM-γ3 with MB441B-GAL4/UAS-shi ts1 and MB195B-GAL4/UAS-shi ts1 flies significantly impaired appetitive memory (Fig 2C). Moreover, blocking PAM-γ3 only during the acquisition of appetitive memory revealed a similar impairment, suggesting the role of PAM-γ3 in reward processing (Fig 2D). Their memory performance at a permissive temperature was not significantly different from those of the genetic controls (Fig 2E). The blockade of PAM-γ3 did not impair innate sugar preference at the concentration used for the learning experiment or lower (Figs 2F and S3A). These results suggest the selective requirement of PAM-γ3 for mediating sugar reward.
As PAM-γ3 activation drives aversive memory [7], inhibition of the basal activity might be important for processing sugar reward. To examine this hypothesis, we imaged the Ca2+ response of PAM-γ3 by expressing GCaMP5 [26], a genetically encoded fluorescent calcium sensor, under the control of MB441B-GAL4. The baseline activity was fluctuating without stimulation (Fig 3A and 3B). Strikingly, sugar ingestion immediately silenced the baseline activity (Fig 3A, 3B and 3C). The Ca2+ level of PAM-γ3 neurons remained suppressed even after the ingestion (Fig 3B), which recovered to the baseline level approximately 20 s after the stimulus offset (Fig 3B). These data are compatible with the idea that PAM-γ3 neurons mediate appetitive reinforcement by acutely suppressing their baseline activity.
We hypothesized that transient inactivation of the PAM-γ3 dopamine neurons may be sufficient to signal appetitive reinforcement. Similar to the reinforcement substitution experiment with dTrpA1-mediated depolarization (Fig 1), we paired the Shi ts1 blockade of PAM-γ3 with one of the two odors; temperature was shifted only during the presentation of the conditioned odor (Fig 4A). This protocol is different from the former Shi ts1 experiments (Fig 2C and 2D), in which PAM-γ3 was blocked during both CS+ and CS-. The paired blockade of PAM-γ3 in MB441B-GAL4/UAS-shi ts1 flies indeed induced significant appetitive odor memory (Fig 4B). To confirm this appetitive memory, we established an optogenetic silencing approach by using engineered halorhodopsin (eNpHR) [27], a light-gated chloride ion pump. Transient blockade of the PAM-γ3 output by applying yellow light (591 nm) during the odor presentation resulted in an induction of appetitive memory (Fig 4C and 4D). We thus conclude that the suppression of PAM-γ3 baseline activity is sufficient to signal appetitive reinforcement.
How does sugar ingestion suppress PAM-γ3 activity? Many neuropeptides are known to reflect feeding states and inhibit target cells through their receptors coupled with inhibitory G proteins [28]. We thus examined the expression patterns of a series of neuropeptide-related GAL4 drivers for their potential connection with the PAM-γ3 dendrites in silico [12]. Image registration of confocal stacks of the expression of neuropeptide GAL4 lines and MB441B-GAL4 into a standard brain revealed a spatial overlap between the processes of AstA and the PAM-γ3 neurons (Fig 5A and 5B). This putative connection was experimentally confirmed using reconstituted GFP signal between PAM-γ3 and AstA neurons by the GFP reconstitution across synaptic partners (GRASP) technique [29] (Fig 5C).
AstA was shown to signal satiation in Drosophila [30] and to inhibit target neurons [31]. We therefore asked whether the AstA neurons are involved in mediating sugar reward. Thermoactivation of dTrpA1 with AstA-GAL4, which exclusively labels a subset of AstA immunopositive neurons [30], resulted in the formation of appetitive odor memory (Fig 6A). The blockade of AstA neurons during learning significantly lowered appetitive memory of both sucrose (Fig 6B) and nonnutritive sugar arabinose (Fig 6E). The sugar preference of AstA-GAL4/UAS-shi ts1 flies and their memory performance at a permissive temperature were unimpaired (Figs 6C and 6D and S3B). Thus, AstA-expressing neurons are necessary and sufficient for mediating the reinforcement property of sugar reward, likely sweetness.
To confirm that the AstA protein is the underlying modulatory signal, we generated multiple null alleles of AstA using the CRISPR/Cas9 system. Appetitive memory of these mutant flies was significantly impaired (Fig 7A) while leaving their innate sugar preference unaffected (Fig 7B). Moreover, we also generated an RNA interference (RNAi) fly line against AstA based on the small hairpin RNA (shRNA) technique [32]. Down-regulation of AstA using AstA-GAL4 resulted in impaired appetitive learning (Fig 7C) while leaving sugar preference intact (Fig 7D). Given that AstA is an inhibitory neuropeptide [31], these results suggest that the AstA release conveys sugar reward by inhibiting the PAM-γ3 dopamine neurons.
In order to visualize the distribution of an AstA receptor, we inserted the GAL4 transgene into the C-terminus of the Allatostatin A receptor 1 (DAR-1) coding region [33] by means of the CRISPR-Cas9 system. Confocal examination of DAR-1-GAL4 expression revealed positive labelling in the dopamine neurons projecting to the MB, including PAM-γ3 (Figs 8A and S4).
If AstA/DAR-1 signaling also works in an inhibitory manner in the PAM-γ3, down-regulation of DAR-1 may weaken the PAM-γ3 suppression. To test this, we generated RNAi fly strains against DAR-1 based on the shRNA technique. Strikingly, knocking down DAR-1 in the PAM-γ3 significantly attenuated the suppression of the baseline activity (Fig 8B and 8C). Altogether, we propose that AstA provides an inhibitory signal to PAM-γ3 upon the ingestion of rewarding substances.
To examine the behavioral effect of AstA/DAR-1 signaling in the PAM-γ3, we down-regulated DAR-1 expression in the PAM-γ3 neurons and examined their appetitive memory. Consistent with our proposal, the knockdown significantly impaired sugar learning (Fig 9A and 9B) while leaving the innate sugar preference intact (Fig 9C and 9D).
AstA/DAR-1 signaling was shown to suppress neuronal activity in receiving cells through Gαi/o signaling [31,34]. To examine the intracellular mechanism of AstA/DAR-1 signaling in PAM-γ3, we inhibited the Gαo subunit by expressing the pertussis toxin with MB441-GAL4 [35]. As these flies had defective sugar memory (Fig 10A) but unimpaired sugar preference (Fig 10B), we suggest that DAR-1 inhibits PAM-γ3 activity by recruiting Gαo.

Discussion
Sugar ingestion triggers multiple reward signals in the fly brain [7,12]. We here provided lines of evidence that part of the reward is signaled by inactivating dopamine neurons (Figs 1-4). The role of PAM-γ3 highlights the striking functional heterogeneity of PAM cluster dopamine neurons. The decrease and increase of dopamine can convey reward to the adjacent compartments of the same MB lobe, γ3 and γ4 (Figs 3 and 4) [9,17]. The reward signal by the transient decrease of dopamine is in stark contrast to the widely acknowledged role of dopamine [36,37]. Midbrain dopamine neurons in mammals were shown to be suppressed upon the presentation of aversive stimuli [38] or the omission of an expected reward, implying valence coding by the bidirectional activity [39]. As depolarization of PAM-γ3 can signal aversive reinforcement (Fig 2), these neurons convey the opposite modulatory signals to the specific MB domain by the sign of their activity. Intriguingly, the presentation and cessation of electric shock act as punishment and reward, respectively [40]. Such bidirectional activity of PAM-γ3 may represent the presentation and omission of reward (Figs 1-4). While thermoactivation of PAM-γ3 induced robust aversive memory, blocking their synaptic transmission did not affect shock learning, leaving a question regarding their role in the endogenous aversive memory process. PAM-γ3 may only be involved in processing aversive reinforcement different from electric shock, like heat [10] or bitter taste [11,41], or respond only to the omission of a reward as pointed out above [40,42]. However, two studies show that dopamine neurons mediating aversive reinforcement of high temperature and bitter N,N-Diethyl-3-methylbenzamide (DEET) are part of those for electric shock. Identification of such aversive stimuli that are signaled by PAM-γ3 activation is certainly interesting, as it is perceived as the opposite of sugar reward and thus provides the whole picture of the valence spectrum.
Another scenario where sufficiency and necessity do not match is the compensation of the reinforcing effect by other dopamine cell types (e.g. MB-M3 [6]). The lack of PAM-γ3 requirements for electric shock memory may be explained by a similar mechanism.
How can the suppression of PAM-γ3 modulate the downstream cell and drive appetitive memory? Optogenetic activation of the MB output neurons from the γ3 compartment induces approach behavior [43]. This suggests that the suppression of the PAM-γ3 neurons upon reward leads to local potentiation of Kenyon cell output. This model is supported by recent studies showing the depression of MB output synapses during associative learning [17,44-46]. A likely molecular mechanism is the de-repression of inhibitory D2-like dopamine receptors, DD2R [47]. As D2R signaling is a widely conserved mechanism [48], it may be one of the most ancestral modes of neuromodulation.
Furthermore, recent anatomical and physiological studies demonstrated that different MB-projecting dopamine neurons are connected to each other and act in coordination to respond to sugar or shock [17,43]. Therefore, memories induced by activation or inhibition of PAM-γ3 may well involve the activity of other dopamine cell types.
Our finding that appetitive reinforcement is encoded by both activation and suppression of dopamine neurons raises the question as to the complexity of reward processing circuits (Fig 11). It is, however, reasonable to implement a component like PAM-γ3 as a target of the satiety-signaling inhibitory neuropeptide AstA. Intriguingly, the visualization of AstA receptor distribution by DAR-1-GAL4 revealed expression in two types of MB-projecting dopamine neurons: PAM-γ3 and MB-MV1 (also named PPL1-γ2α'1). Given the roles of MB-MV1 in aversive reinforcement and locomotion arrest [6,10,17,19], AstA/DAR-1 signaling may also inhibit a punishment pathway upon feeding. We thus speculate that this complex dopamine reward circuit may be configured to make use of bidirectional appetitive signals in the brain (Fig 11).

Image Registration
Landmark matching-based affine and nonrigid registration of whole brains was performed as previously described [12]. Confocal images of entire brains of GAL4/UAS-mCD8::GFP flies were scanned with n-cadherin (n-Cad) counterstaining and registered into the standardized brain by referring to the n-Cad channel. The transformations computed with the n-Cad channel were then applied to the mCD8::GFP channel. The registered images were assigned into the standardized brain and represented as different colors using ImageJ.

Behavioral Assays
The conditioning and testing protocol was as described previously [7,12]. Briefly, for sugar learning and the US substitution experiment by dTrpA1-mediated thermoactivation, a group of approximately 50 flies in a training tube alternately received octan-3-ol (OCT; Merck) and 4-methylcyclohexanol (MCH; Sigma-Aldrich) for 1 min in a constant air stream with or without dried sucrose paper or 30˚C heat. For the US substitution experiment by Shi ts1-mediated thermoinactivation (Fig 4), a group of approximately 50 flies in a training tube alternately received OCT and MCH for 2 min in a constant air stream with or without 33˚C heat. For the US substitution experiment by eNpHR3-mediated light-inactivation (Fig 4), a group of approximately 50 flies was put into a custom-made LED-embedded aluminum tube and alternately received OCT and MCH for 1 min twice in a constant air stream with or without a continuous light exposure (591 nm). The light intensity was approximately 100 mW/mm2 at a distance of 10 mm from the LED, measured with the Laser Power Meter Console (Thorlabs, PM100A). Flies were fed food containing all-trans-retinal (2.5 mM) for at least 3 d before the experiments. OCT and MCH were diluted 10% in paraffin oil (Sigma-Aldrich) and placed in a cup with a diameter of 3 mm or 5 mm, respectively. After a given retention time, the conditioned response of the trained flies was measured with a choice between CS+ and CS- for 2 min in a T maze. The memories were tested immediately after training unless otherwise stated. The restrictive temperature for the experiments with UAS-shi ts1 was 33˚C and the permissive temperature was 24˚C, measured with the VC-960 digital multimeter (Voltcraft). For memory retention, trained flies were kept in a vial with moistened filter paper. After a given retention time, the trained flies were allowed to choose between MCH and OCT for 2 min in a T maze.
A learning index was then calculated by taking the mean preference of the two reciprocally trained groups. Half of the trained groups received reinforcement together with the first presented odor and the other half with the second odor to cancel the effect of the order of reinforcement.
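The learning index calculation can be illustrated with a minimal sketch. The text states only that the index is the mean preference of the two reciprocally trained groups; the per-group preference formula used here, (flies choosing CS+ minus flies choosing CS-) divided by the total, is an assumption based on common practice in T-maze assays, and the function names and fly counts are hypothetical.

```python
def preference_index(n_cs_plus, n_cs_minus):
    """Preference of one trained group for the reinforced odor (CS+).

    ASSUMPTION: the formula (CS+ - CS-) / total is a common convention;
    the text itself does not spell out how preference is computed.
    """
    total = n_cs_plus + n_cs_minus
    if total == 0:
        raise ValueError("no flies made a choice")
    return (n_cs_plus - n_cs_minus) / total


def learning_index(group_a, group_b):
    """Mean preference of two reciprocally trained groups.

    group_a: (n_cs_plus, n_cs_minus) for flies with, e.g., OCT reinforced.
    group_b: (n_cs_plus, n_cs_minus) for flies with, e.g., MCH reinforced.
    Averaging the reciprocal groups cancels any innate odor bias.
    """
    return (preference_index(*group_a) + preference_index(*group_b)) / 2


# Hypothetical counts for illustration only (not data from the study).
li = learning_index((35, 15), (30, 20))
print(round(li, 3))
```

Under this convention a positive index indicates approach to the reinforced odor (appetitive memory) and a negative index indicates avoidance (aversive memory).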

Statistics
Statistical analyses were performed with Prism5 (GraphPad). Most of the data did not violate the assumption of normal distribution and homogeneity of variance. Therefore, the data were analyzed with parametric statistics: one-way analysis of variance followed by the planned pairwise multiple comparisons (Bonferroni two-tailed test). Figs 2B, 2F, 6C, 6D, 7B, 9A, S1 and S2 were analyzed with nonparametric statistics: Kruskal-Wallis one-way analysis of variance followed by the planned pairwise multiple comparisons (Dunn's test). The significance level of statistical tests was set to 0.05. For detailed results for statistical tests, see S1 Table. The numerical data used in all figures are included in S1 Data.
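As a rough illustration of the parametric procedure described above, the following sketch computes a one-way ANOVA F statistic and the Bonferroni-adjusted per-comparison significance level. It is not the Prism5 implementation; the function names and the example values are hypothetical, and the choice of two planned comparisons (an experimental genotype against each of two genetic controls) is an assumption about the design.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variability of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical learning indices for three genotypes (illustration only).
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)
# ASSUMPTION: two planned comparisons, experimental group vs each control.
alpha_per_test = bonferroni_alpha(0.05, 2)
```

Converting the F statistic to a p-value requires the F distribution, which a statistics package would supply; the sketch stops at the test statistic to stay dependency-free.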

In Vivo Calcium Imaging
The expression of GCaMP5 [26] calcium reporter was targeted to PAM-γ3 neurons by crossing MB441B-GAL4 to mb247-dsRed, UAS-GCaMP5, or UAS-V20-DAR-1-RNAi; UAS-GCaMP5 flies. Flies that were 2 to 3 d old from the offspring were starved at 25˚C for 24 h on a Kimwipe soaked with water. For DAR-1 knockdown experiments, flies were aged to 8-12 d after eclosion. Flies were then prepared for in vivo imaging by confocal microscopy as previously described [55]. Fluorescence was recorded in a transverse section of the brain. Recordings were made with a frame rate of 2 Hz in two animals, 10 Hz in two animals, and for the rest at 5 Hz, which did not alter the results. Each fly was presented with a droplet of 500 mM sucrose. The fly had access to the gustatory stimulus for 10 s. Image analysis was performed essentially as described previously [55]. Briefly, an object in each recording was stabilized by phase correlation-based image alignment using dsRed signal, then GCaMP5 signal was used as a fluorescent F value. In each animal, a region of interest in the left hemisphere was used. The baseline value of fluorescence F_mean was calculated as the average of ΔF/F_mean over 15 s before the start of the stimulation.

S1 Table. List of crosses and statistics for behavior experiments (DOCX)

S1 Data. Excel spreadsheet containing, in separate sheets, the underlying numerical data and statistical analysis for Fig panels 2A, 2B, 2C, 2D, 2E, 2F, 3C, 4B, 4D, 6A, 6B, 6C, 6D, 6E, 7A, 7B, 7C, 7D, 8C, 9A, 9B, 9C, 9D, 10A and 10B, S1, S2, S3A and S3B. (XLSX)
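The baseline normalization used in the calcium imaging analysis can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes that the baseline is the mean raw fluorescence over the 15 s before stimulus onset and that each frame is then expressed as (F - baseline) / baseline. The function name, parameters, and the synthetic trace are hypothetical.

```python
def delta_f_over_f(trace, frame_rate_hz, stim_onset_s, baseline_s=15.0):
    """Normalize a fluorescence trace to its pre-stimulus baseline.

    trace: raw fluorescence values from one region of interest.
    frame_rate_hz: acquisition rate (2, 5 or 10 Hz in the recordings).
    stim_onset_s: time of stimulus onset from the start of the trace.
    baseline_s: length of the pre-stimulus baseline window (15 s).
    ASSUMPTION: baseline = mean raw F over the window before onset.
    """
    onset_idx = int(round(stim_onset_s * frame_rate_hz))
    n_base = int(round(baseline_s * frame_rate_hz))
    start = onset_idx - n_base
    if start < 0 or onset_idx > len(trace):
        raise ValueError("trace does not cover the baseline window")
    baseline = trace[start:onset_idx]
    f_mean = sum(baseline) / len(baseline)
    if f_mean == 0:
        raise ValueError("baseline fluorescence is zero")
    return [(f - f_mean) / f_mean for f in trace]


# Synthetic 5 Hz trace: 15 s of baseline, then 10 s at half the baseline.
trace = [100.0] * 75 + [50.0] * 50
dff = delta_f_over_f(trace, frame_rate_hz=5, stim_onset_s=15.0)
```

Expressing the window in seconds rather than frames lets recordings made at different frame rates be normalized with the same call.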