Conceived and designed the experiments: AH LL SW MF TW. Performed the experiments: LL SW MF TW. Analyzed the data: LL AH SW MF TW. Contributed reagents/materials/analysis tools: TJB LMS HSC. Wrote the paper: LL SW TW MF TJB HSC AH JLB.
The authors have declared that no competing interests exist.
The mouse has emerged as a uniquely valuable species for studying the molecular and genetic basis of complex behaviors and modeling neuropsychiatric disease states. While valid and reliable preclinical assays for reward-related behaviors are critical to understanding addiction-related processes, and various behavioral procedures have been developed and characterized in rats and primates, there have been relatively few studies using operant-based addiction-relevant behavioral paradigms in the mouse. Here we describe the performance of the C57BL/6J inbred mouse strain on three major reward-related paradigms, and replicate the same procedures in two other commonly used inbred strains (DBA/2J, BALB/cJ). We examined Pavlovian-instrumental transfer (PIT) by measuring the ability of an auditory cue associated with food reward to promote an instrumental (lever press) response. In a separate experiment, we assessed the acquisition and extinction of a simple stimulus-reward instrumental behavior on a touchscreen-based task. Reinstatement of this behavior was then examined following either continuous exposure to cues (conditioned reinforcers, CRs) associated with reward, brief reward and CR exposure, or brief reward exposure followed by continuous CR exposure. The third paradigm examined sensitivity of an instrumental (lever press) response to devaluation of food reward (a probe for outcome insensitive, habitual behavior) by repeated pairing with malaise. Results showed that C57BL/6J mice displayed robust PIT, as well as clear extinction and reinstatement, but were insensitive to reinforcer devaluation. DBA/2J mice showed good PIT and (rewarded) reinstatement, but were slow to extinguish and did not show reinforcer devaluation or significant CR-reinstatement. 
BALB/cJ mice also displayed good PIT, extinction and reinstatement, and retained instrumental responding following devaluation, but, unlike the other strains, demonstrated reduced Pavlovian approach behavior (food magazine head entries). Overall, these assays provide robust paradigms for future studies using the mouse to elucidate the neural, molecular and genetic factors underpinning reward-related behaviors relevant to addiction research.
The availability of valid and reliable methods for studying incentive learning and other reward-related behaviors in experimental animals is essential to furthering our understanding of the neural, genetic and molecular basis of addiction. To this end, various rodent behavioral assays have been developed to probe core processes that underlie the natural motivation to seek reward, and that are theorized to go awry in drug abuse and addiction
Amongst the rodent paradigms relevant to addiction are those which measure Pavlovian-instrumental transfer (PIT) – a process by which, through their association with reward, previously neutral stimuli can instigate or energize an existing instrumental reward-seeking response
Maintenance and relapse in drug addiction are most often modeled in rodents using extinction and reinstatement procedures. Extinction occurs when the frequency of an instrumental response is reduced by removing the previous response-contingent reward. The extinguished response can be subsequently reinstated by various events, including stressors, presentation of reward-associated cues (‘conditioned reinforcers’) or brief exposure to the reward (‘priming’)
More recently, and spurred by theoretical developments, there has been a growing interest in understanding the role of habit learning and behavioral flexibility in compulsive drug use and addiction
In this context, the mouse is a uniquely informative model species for elucidating the molecular and genetic basis of motivated behavior, addictions and other neuropsychiatric disease states
Subjects were male C57BL/6J, DBA/2J and BALB/cJ mice obtained from The Jackson Laboratory (Bar Harbor, ME). These strains of inbred mice were selected on the basis of 1) their frequent use in behavioral neuroscience as the genetic backgrounds for mouse mutant lines, 2) their inclusion as “group A” priority strains in the Mouse Phenome Project - an international effort to provide the biomedical research community with phenotypic data on the most commonly used mouse strains (
Mice were aged 8–9 wks at the start of the experiments and housed in pairs in a temperature- (72±5°F) and humidity- (45±15%) controlled
Testing was conducted in 21.6×17.8×12.7 cm operant chambers (model #ENV-307W, Med Associates, St. Albans, VT) housed within sound and light attenuating enclosures (Med Associates model #ENV-022MD). The grid floor of the chamber was covered with solid Plexiglas to facilitate ambulation. A pellet dispenser delivering a 14-mg reward pellet (catalogue #F05684; BioServ) into a food magazine was located at one end of the chamber. An infrared photo-beam was located inside the receptacle to detect head entries (HEs) into the magazine. Ultra-sensitive response levers (model # ENV-310W) were located ∼5 cm to each side of the magazine. Speakers emitting either a ∼85 dB broadband white-noise cue (Med Associates model # ENV-325SW) or a 3 kHz pure tone cue (Med Associates model # ENV-324W) were positioned ∼5 cm above the levers. MED-PC software (Med Associates) controlled cue presentation and reward delivery and recorded HEs and lever presses.
Testing began with a single 30 min session to habituate mice to the chamber and to the intermittent availability (on a random interval (RI) 60 sec schedule) of food pellets (unconditioned stimulus or US) in the recessed magazine (levers were unavailable). Mice then underwent daily Pavlovian discrimination training sessions to associate one auditory cue (conditioned stimulus, CS+) with the delivery of the US, and a second cue (CS−) with the absence of reward (
Next, mice were trained to press one of 2 levers to receive response-contingent delivery of the US (
After instrumental training was completed, a Pavlovian-instrumental transfer (PIT) probe test (
This behavioral assay had 3 components: 1) acquisition of a simple instrumental S-R association reinforced by food (US) reward, 2) extinction of instrumental responding by removal of reinforcement, and 3) assessment of reinstatement of instrumental responding in the presence of the primary (i.e., food) reinforcer, the food-associated light and tone cues (conditioned reinforcers), or a combination of the primary + conditioned reinforcer. The acquisition and extinction procedures have been described previously
The apparatus was the same as for the PIT assay except that instead of levers, the response device consisted of a touch-sensitive LCD screen located at the opposite end of the operant chamber from the food magazine. The touchscreen (Light Industrial Metal Cased TFT LCD Monitor, Craft Data Limited, Chesham, U.K.) was covered by an opaque Plexiglas panel with 2×5 cm² ‘cut-outs’ 6.5 cm above the chamber floor that outlined 2 discrete stimulus presentation windows separated by 0.5 cm (
Daily testing began with a single 30 min habituation session to the chamber and intermittent delivery of food pellets into the magazine. This was followed by a 3-phase pre-training procedure to shape the instrumental response. During phase 1, visual stimuli (shape randomly varied) were presented pseudorandomly in 1 of the touchscreen windows for 10 sec, on average every 15 sec, immediately followed by delivery of a single US. Reward delivery was concomitant with the compound presentation of 2 cues, consisting of a 2-sec 65 dB auditory tone and illumination of the food magazine, that were designed to serve as explicit secondary or conditioned reinforcers during the tests for reinstatement (see below). Food reward retrieval was detected by the first HE following delivery, and this also initiated the next trial. In phase 2, delivery of the food US was made contingent on the mouse making physical contact with the touch-sensitive LCD screen in the window (pseudorandomly determined) displaying the randomly-shaped visual stimulus, and the response also initiated the subsequent trial. Phase 3 was the same as phase 2 with two additions: trial initiation (after the first) required the mouse to make an additional HE into the magazine after reward retrieval and, to discourage indiscriminate responding, responses into a blank window were followed by a 5 sec lights-out time-out period. To progress through each of the 3 phases, mice were required to retrieve 30 pellets within a 30-min session period.
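The phase-progression rule for pre-training can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used in the study: the function name and the input format (one pellet count per completed 30-min session) are assumptions, as is the reading that a single session with 30 retrieved pellets is sufficient to advance a phase.

```python
from typing import Sequence

N_PHASES = 3           # pre-training phases described in the text
PELLETS_REQUIRED = 30  # pellets to retrieve within one 30-min session


def phase_after_sessions(pellets_per_session: Sequence[int]) -> int:
    """Return the pre-training phase (1-3) a mouse is in after the given
    sessions, or 4 once all three phases have been passed.

    Assumption: each entry is the pellet count for one full 30-min session,
    and one passing session advances the mouse to the next phase.
    """
    phase = 1
    for pellets in pellets_per_session:
        if phase > N_PHASES:
            break  # pre-training already complete; ignore later sessions
        if pellets >= PELLETS_REQUIRED:
            phase += 1
    return phase
```

A mouse with hypothetical session counts of 10, 30, 30, 12 and 30 pellets would, under these assumptions, complete pre-training on its fifth session.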
For the acquisition task proper, mice were required to initiate and respond to either of 2 stimuli (1×2.8 cm² white square per window) over 30 trials (5 sec ITI). Stimuli remained on the screen until a response was made. A response produced a single reward and the CRs. Acquisition criterion was making 30 responses within 12.5 min on each of 5 consecutive sessions.
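The acquisition criterion (30 responses within 12.5 min on each of 5 consecutive sessions) amounts to finding the first run of 5 qualifying sessions. The following Python sketch shows one way to score it from per-session records. The `Session` record, its field names and the function names are hypothetical and are not taken from the original analysis; `duration_min` is assumed to be the time the mouse needed to complete its responses in that session.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Session:
    responses: int       # touchscreen responses completed in the session
    duration_min: float  # minutes taken to complete them (assumed measure)


def meets_session_criterion(s: Session,
                            required_responses: int = 30,
                            max_minutes: float = 12.5) -> bool:
    """True if the session had all required responses within the time limit."""
    return s.responses >= required_responses and s.duration_min <= max_minutes


def sessions_to_criterion(sessions: Sequence[Session],
                          run_length: int = 5) -> Optional[int]:
    """Return the 1-based session number on which criterion was first met,
    i.e. the last session of the first run of `run_length` consecutive
    qualifying sessions, or None if criterion was never met."""
    streak = 0
    for number, session in enumerate(sessions, start=1):
        # A non-qualifying session resets the run of consecutive passes.
        streak = streak + 1 if meets_session_criterion(session) else 0
        if streak >= run_length:
            return number
    return None
```

Because a single slow or incomplete session resets the count, the value returned is the total number of sessions run up to and including the one that completes the qualifying streak.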
Extinction was assessed beginning the session after acquisition criterion was met, by monitoring (previously food-reinforced) instrumental responding in the absence of food reinforcement. The visual touchscreen panel stimuli were presented over 30 trials and remained on-screen for 9 sec or until a response was made. A response produced neither the food US nor the explicit conditioned reinforcers (
The session after attaining extinction criterion, mice were assessed for reinstatement of instrumental responding. Separate groups of mice were tested on 1 of 3 reinstatement procedures (
For assaying habit-learning using the devaluation procedure, the apparatus was the same as that used to test for PIT. Procedures were based upon those previously used to test for reinforcer devaluation in rats
Beginning the day after completing instrumental training, the US was devalued by repeatedly pairing it with sickness (
The day after the 4 day devaluation phase, mice were probed for the effects of devaluation on instrumental responding by measuring the number of lever presses and HEs over a 5-min test session conducted in the absence of food reinforcement, i.e. under extinction conditions (
C57BL/6J mice took on average ∼15 sessions to reach criterion for Pavlovian discriminated approach, although ∼20% of the original sample failed to attain criterion even with extensive training (>45 sessions) and were excluded from further testing. C57BL/6J mice passed through instrumental training in another ∼15 sessions, with the VI60 schedule accounting for over half of the training sessions (
C57BL/6J mice rapidly acquired the simple stimulus-controlled instrumental responding in ∼12 sessions (
C57BL/6J mice showed increasing rates of lever pressing from 1–2 presses/min on the FR schedule, to ∼8 and ∼15 presses/min by the end of RR10 and RR20 training, respectively (
While ∼8% of DBA/2J mice failed to reach criterion for Pavlovian discriminated approach even with extensive training, the majority of mice of this strain took ∼15 sessions to show robust Pavlovian discrimination (not shown). A similar number of sessions were needed to complete the instrumental training, largely accounted for by instrumental sessions on the VI60 schedule of reinforcement (
DBA/2J mice rapidly acquired and reached the instrumental response criterion in ∼9 sessions (
Lever pressing gradually increased, most clearly during the RR schedules, over instrumental training to ∼12 presses/min (
The BALB/cJ strain took ∼13 sessions on average to attain criterion for Pavlovian discriminated approach, although ∼20% of the mice tested were unable to reach the criterion even with prolonged training. Mice also passed through instrumental training within ∼13 sessions (
BALB/cJ mice attained acquisition criterion in ∼11 sessions (
BALB/cJ mice monotonically increased lever pressing across instrumental sessions to a rate of ∼15 presses/min (
C57BL/6J is one of the most commonly used inbred strains of mice, especially as a genetic background in mutant mouse lines
This conclusion was bolstered by the performance of this strain in our instrumental stimulus-response paradigm where C57BL/6J mice readily learned to correctly respond to a visual stimulus on a touchscreen to obtain a food reward. In turn, when reinforcement was omitted, instrumental responding efficiently extinguished, indicating that a S-R association had been established during training. We have reported similar patterns in C57BL/6J and 129/SvImJ inbred mice
An interesting and somewhat surprising finding was that C57BL/6J mice were insensitive to outcome devaluation caused by repeated pairing of the food US with sickness induced by LiCl injection. This was indicated by the fact that neither instrumental responding (lever pressing) nor discriminated approach or goal-tracking behavior (magazine head-entries) was reduced in mice that had undergone LiCl-food paired devaluation, relative to non-devalued control mice. It is important to emphasize, however, that the absence of any instrumental devaluation effect was not simply an artifact of a failure of mice to form an association between the food US and the experience of malaise, because mice in the devalued group clearly showed a persistent aversion to consuming the freely available reward in the home cage and in the operant chambers themselves. By definition then, these findings suggest that the instrumental reward-seeking response is, at least in part, habitual and driven by processes that are independent of the outcome.
On the other hand, the absence of a devaluation effect contrasts with previous reports of clear outcome devaluation induced using sensory-specific satiety, rather than LiCl pairings, in C57BL/6J-background mice
This conclusion was not limited to C57BL/6J mice, as we found that DBA/2J mice also failed to show altered responding following reinforcer devaluation (again, despite clear evidence of the formation of a successful food-sickness pairing). To our knowledge, this is the first published report of a reinforcer devaluation procedure used with this mouse strain. DBA/2J mice have been widely used in behavioral neuroscience and, within addiction research, have been heavily studied because of an aversion (relative to C57BL/6J) to orally consumed alcohol
In addition to insensitivity to malaise-induced reward devaluation, we found that DBA/2J mice were similar to C57BL/6J mice in that they by and large showed robust PIT, extinction and reinstatement. The purpose of the current study was not to cross-compare strains as strains were not tested in a fully counterbalanced design, precluding direct statistical comparison. However, testing was done under identical conditions and informal visual comparison of the data suggests that DBA/2J mice were similar to C57BL/6J mice on the majority of behavioral measures. One exception was that DBA/2J mice seemed slower to extinguish the instrumental response – requiring more sessions to extinguish this response than to acquire it (C57BL/6J mice showed the opposite pattern). The DBA/2J strain also failed to show significant reinstatement of this response when exposed to conditioned reinforcers alone. While this suggests weak conditioned reinforcement in this paradigm, it cannot simply be explained by a more general failure to form cue-reward relationships,
The third strain we characterized on these tasks was BALB/cJ. BALB/cJ has been well-studied for its heightened anxiety-like behavior and stress reactivity in comparison to, for example, C57BL/6J
As with the other two strains, instrumental lever pressing was undiminished by reinforcer devaluation. Interestingly, however, there was a significant decrease in magazine entries in devalued BALB/cJ mice (although the strength of the LiCl-induced illness-food pairing seemed relatively weak in these mice and was not expressed on the long-term retention test). This indicates that at least one component of the behavioral repertoire of BALB/cJ mice in this test was sensitive to the current reward value. But, again, instrumental lever pressing and Pavlovian-approach behaviors are dissociable processes
In summary, the current study describes a set of paradigms for assaying various operant-based reward-related behaviors in three of the most commonly used inbred mouse strains. We describe a procedure for demonstrating PIT, and a method for studying acquisition, extinction and multiple forms of reinstatement of an instrumental touchscreen response. While we were unable to demonstrate malaise-induced devaluation of an instrumental response in any of our strains (although Pavlovian approach responses were sensitive to diminished outcome value in BALB/cJ mice), we found no reason to conclude that mice were unable to form the necessary food-malaise association, and the negative results more likely point to more complex factors needing additional study. These procedures provide a useful platform for future studies using the mouse as a model species to elucidate the critical neural, molecular and genetic factors subserving reward-related behaviors, and ultimately provide new insights into maladaptive manifestations of motivated behaviors such as drug addiction.