Carrots for the donkey: Influence of evaluative conditioning and training on self-paced exercise intensity and delay discounting of exercise in healthy adults

To choose exercise over alternative behaviours, subjective reward evaluation of the potential choices is a principal step in decision making. However, the selection of exercise intensity might integrate acute visceral responses (i.e. pleasant or unpleasant feelings) and motives related to goals (i.e. enjoyment, competition, health). To understand the factors that determine the selection of exercise intensity and the evaluation of exercise as a modality, we conducted a study combining exercise training and evaluative conditioning. Evaluative conditioning was performed with a novel technique that used a primary reinforcer (sweetness) as the unconditioned stimulus and physical strain, i.e. heart rate elevation, as the conditioned stimulus during interval training, in a randomized controlled design (N = 58). Before training, after three weeks of interval training with or without conditioning, and after a 4-week follow-up, participants were tested on self-paced speed selection on a treadmill, with measurement of heart rate, subjective pleasantness, and effort levels, as well as on delay discounting of exercise and food rewards. Results revealed that the selected exercise intensity was significantly increased by adaptation to training and by evaluative conditioning, indicating the importance of visceral factors as well as learned expected rewards. Delay discounting rates of self-paced exercise were transiently reduced by training but not affected by evaluative conditioning. In conclusion, the results suggest that exercise decisions separate into a modality-specific cognitive evaluation of exercise and an exercise intensity selection based on acute visceral experience that integrates effort, pleasantness, and learned rewards.


Introduction
Exercise and physical activity are strongly associated with physical and mental wellbeing [1][2][3], but only a small proportion of the population meets the required recommendations in physical visual and appetitive stimuli as unconditional stimuli [38][39][40]. In the context of physical activity, only a few studies performed EC using visual stimuli for conditioning [41,42]. However, sweet rewards are often used in animal conditioning paradigms [43][44][45] but less in humans [39,40,46]. The rewarding nature of sweetness and the strength of its addictive potential have been confirmed numerous times [43,44,47], and even non-caloric sweeteners have the potential to be used in conditioning paradigms [43,48].
In this study, we used a novel evaluative conditioning (EC) paradigm, pairing a primary reinforcer (sweet solution) with cardiovascular strain during exercise interval training over three weeks. A further training-only group received a neutral saline solution during training, and a control group received no training and no conditioning. Before and after training, as well as 4 weeks after training (follow-up), subjects performed sessions for self-selection of speed and delay discounting of the selected exercise. Subjects adjusted the exercise intensity to maximize pleasantness (Feeling Scale) and also reported their rate of perceived exertion/effort (RPE). In addition, heart rate, body characteristics, and a battery of psychological questionnaires were assessed, and the delay discounting rates of the self-selected exercise, favorite food, and money were measured with a computer paradigm [31].
We hypothesized that evaluative conditioning would increase the self-selected speed with a concomitant increase in heart rate and RPE levels after training and follow-up above the level induced by training only, assuming that the conditioned reward would be integrated into the exercise reward. Exercise training alone would lead to a transient increase of self-selected speed due to transient physiological adaptations and a decline at the follow-up; training adaptations would lead to changes in physiological strain perceived as rewarding.
Moreover, the elevation of the reward value after training and conditioning would induce a reduction of delay discounting of exercise due to a magnitude effect [49].

Participants
After ethical approval by the ethics committee of the School of Sport, Health and Exercise Science, Bangor University (ethics number: P05-16/17), 62 subjects (32 females) were recruited from students and the general public in Bangor, UK, and 58 finished the study; two participants dropped out without stating reasons, and two were excluded because of missed training sessions. Eligible for participation were healthy female and male subjects aged 18-50 years who did not engage in regular physical activity or dieting. Subjects with medical conditions that contraindicated regular high-intensity interval training were excluded. Participants received £100 as reimbursement for their time. The sample size was calculated based on the study by Antoniewicz and Brand [42], who used EC with visual stimuli and observed an increase in exercise intensity selection. The power analysis aimed to detect a significant difference in exercise intensities between groups using G*Power 3.1.9.2 (ANOVA: Repeated measures, within-between interaction) at a significance level of 5%; a sample size of 16 (8 in each group) would have 95% power to detect an effect based on a partial eta squared of 0.290 between groups. The sample size for the effect of exercise training on discounting rates, based on Albelwi et al. [31], aimed to detect a significant difference between groups (ANOVA: Fixed effects, special, main effects and interactions) at a significance level of 5%; a sample size of 54 (divided into 2 groups) would have 90% power to detect an effect size of 0.204 between groups.

Design
A between- and within-subjects experimental design was used to investigate self-selected exercise intensity, heart rate, RPE, and delay discounting of exercise over the three study phases: baseline, post-training, and follow-up. Fifty-two participants were randomly assigned to two groups: a training group (TR), who received an unflavoured electrolyte mouth rinse, and a conditioning plus training group (COTR), who received a sweet mouth rinse, during all interval training sessions. A no-training, no-conditioning group (NTR) was recruited separately from the same population to test training effects on parameters (n = 10) (Fig 1).

Physical characteristics and physiological parameters
Weight and body composition (i.e. percent body fat) were assessed via bioelectrical impedance measurement using a Tanita BC-418 MA system. Participants' height was measured using a standard stadiometer. Heart rate was measured during all exercise sessions (self-selected exercise sessions and interval training sessions) using a Polar heart rate monitor in connection with a bespoke computerized system for EC.

Self-report measures
Participants were asked to fill out selected questionnaires for better characterization of the population sample and to measure exercise motivation (full list and details, see S1 File).
The Exercise Motivation Inventory (EMI 2) [52] assesses exercise participation motives applicable to both exercisers and non-exercisers. We used the items challenge, affiliation, revitalization, and enjoyment as intrinsic factors (this study, Cronbach's α: 0.788), and appearance, weight management, positive health, health pressure, ill-health avoidance, and strength/endurance as extrinsic factors (this study, Cronbach's α: 0.713), according to Egli [12]. Further motives (e.g., social recognition, stress management) are more difficult to classify along dichotomous categories and were included in the total exercise motivation score (this study, Cronbach's α: 0.883).
Procedure
Phase 1: Baseline measurements of self-selected exercise
Visit 1. Participants were instructed to wear comfortable clothing suitable for exercising on all visits. In addition, participants had to abstain from alcohol and caffeine consumption for at least 12 hours and from strenuous exercise for at least 48 hours. Visits were scheduled at consistent times for each individual (morning/afternoon). Participants were introduced to the protocol, gave consent, and were asked to fill out self-report questionnaires (see Self-report measures), followed by measurement of body characteristics. Then, participants were asked to walk for about 3 to 5 minutes on the treadmill to become familiar with the exercise and the manual settings of the treadmill for further visits.
The goal of the exercise trials (1st and 2nd visits) was to establish the most pleasant/enjoyable exercise intensity possible for each participant. This specific exercise experience would later serve as the exercise imagined during the delay discounting task. The exercise intensity was set by repeatedly self-adjusting the treadmill speed during the trials (see below). Social desirability and demand characteristics were minimized by emphasizing the aim of finding the optimal exercise intensity for the participant's enjoyment, by using the same verbal protocol for all assessments, and by involving five different experimenters to reduce bias and interpersonal contact.
All trials were performed using the same treadmill; during the exercise period, a nature soundtrack consisting of bird and forest sounds was played through speakers while the participants were facing a natural scenery through a wide window. This was performed to reduce possible negative effects of the technical environment on participants' perception. The heart rate (HR) was monitored throughout and after the end of the exercise sessions with a HR monitor (Polar RS800CX). Exercise trials were terminated after 30 minutes or whenever the participant chose to end it earlier. The sessions were separated by at least 48 hours and maximally by one week. Before each exercise trial, each participant warmed up on a cycle ergometer (Lode Excalibur) for 3 minutes before stepping on the treadmill.
For the first exercise trial, participants started walking on the treadmill at 3 km/h; the display of speed and time was concealed from participants in all trials. They were instructed to find the most pleasant exercise intensity by modifying the speed of the treadmill using the control panel; it was emphasized that the experiment was not about fitness or performance. Participants were told that the exercise would last 30 minutes at most. After 2 minutes of exercise, participants were asked to rate the pleasantness of exercise using the 11-point Likert Feeling Scale (FS), which ranges from -5 to 5; anchors are provided at zero ('Neutral') and all odd integers, ranging from 'Very Good' (5) to 'Very Bad' (-5) [53]. For the rating, the participants were asked 'How do you rate the current exercise of being pleasant?'. Participants could modify the treadmill speed every two minutes to optimise pleasantness, i.e. by increasing or decreasing the speed (the selection could be made during the first 30 sec of each two-minute period). A rating of current pleasantness was requested after each two-minute period had elapsed; the set values were not visible to participants. After cooling down, the participants were free to leave.
Visit 2. The second exercise trial aimed to reconfirm the setting and experience of the self-selected exercise. After warming up, participants walked on the treadmill at 3 km/h, and the experimenter gradually raised the speed to the preferred speed selected in the first exercise trial and held it for a further 2 minutes. The researcher then increased and decreased the speed by 10% of the preferred speed level, each for a period of 2 minutes, while pleasantness was rated every 2 minutes to confirm the optimal setting of the preferred speed [54].
Thereafter, a 5-minute rest was given, and participants performed a further 5-10 minute bout at the preferred speed to validate perception and to reinforce the feeling regarding this exercise bout.
Visit 3. The purpose of this exercise trial was to explore the perception of exertion/effort at the preferred exercise speed. This measure was not introduced during the previous two trials to avoid any cross-over effects between the two perceptual modalities (pleasantness and effort). However, the RPE scale was thoroughly explained verbally and anchored by memory. After the warm-up, participants started walking on the treadmill, and the experimenter gradually increased the speed to the formerly self-selected preferred exercise speed; the speed was then increased and decreased according to the protocol of visit 2. Participants were asked to rate their perception of exertion/effort every 2 minutes using the Rate of Perceived Exertion (RPE) scale [55], which ranges from 6 (no exertion at all) to 20 (maximal exertion). Thereafter, a 5-minute rest was given, and participants performed a further 5-10 minute bout at the preferred speed to validate effort ratings. RPE values were taken from the periods at self-selected speed and from the bout after rest; the most frequent value of RPE at self-selected speed was used.
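The selection of the most frequent RPE value can be illustrated with a minimal Python sketch. The function name `representative_rpe` and the example ratings are hypothetical, and the handling of ties (taking the lower value) is an assumption, since the text does not state how ties were resolved.

```python
from statistics import multimode


def representative_rpe(ratings):
    """Most frequent RPE rating; ties go to the lower value (an assumption)."""
    if not ratings:
        raise ValueError("no ratings recorded")
    if any(r < 6 or r > 20 for r in ratings):
        raise ValueError("RPE ratings must lie on the 6-20 scale")
    # multimode returns every value that shares the highest count
    return min(multimode(ratings))


# Hypothetical ratings collected every 2 minutes at the self-selected speed
rpe_at_preferred_speed = representative_rpe([11, 12, 12, 13, 12])
```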
After a resting period of about 5 minutes, participants were introduced to the delay discounting task on the computer.
Delay Discounting (DD) tasks. Tests were performed according to Albelwi [31]; in brief, each participant was verbally introduced to the task, read the introduction, and followed the instructions on the computer screen. The DD tasks were generated using a specially designed computer programme based on the paradigm described in [56], implemented in the Inquisit™ program (developed by Millisecond Software). The indifference points (IP) for each time delay of rewards for the tested commodities were obtained by randomization between delays and the size of rewards. The sooner, smaller hypothetical reward was offered 'at the end of this session' as an immediate choice, and after delays of 1, 7, 30, 60, and 180 days. The values of the three commodity rewards were adjusted based on their monetary value [57,58]. The adjustment of the rewards was masked by randomization between delays and amounts of rewards, and as the test progressed, distractors were displayed to prevent the subject from predicting the questions and unmasking the underlying technique of the test, as recommended in [56]. The program terminated automatically and saved the experimental data after the IP criteria had been achieved. Each computer task took about 4-6 minutes to complete.
Exercise DD task. For exercise, the hypothetical exercise sessions offered were based on the formerly established treadmill exercise sessions (see above) and were fragmented into 5, 10, 15, 20, and 30 minutes (1 gym session), 60 minutes (2 gym sessions), 90 minutes (3 gym sessions), 120 minutes (4 gym sessions), and 150 minutes (5 gym sessions), assuming that 1 gym session = 30 minutes of exercise. The complete script can be found in the S1 File.
The taste test was administered at the end of this phase for the COTR subjects (see S1 File).

Phase 2: Interval training sessions with and without evaluative conditioning
Evaluative conditioning paradigm. Syringe pump system. Two 60 ml syringes, filled with either 2 x 60 ml NEUTRAL SOLUTION (see S1 File) for the TR group, or 1 x 60 ml NEUTRAL SOLUTION and 1 x 60 ml 100% SWEET SOLUTION (see S1 File) for the COTR group, were attached to two New Era programmable syringe pumps (Model NE-4000). The combined flow rate of 2 ml/minute was infused into a double tubing system and released through a double-barrel mouthpiece onto the participant's tongue/oral cavity. The selected flow rate of 2 ml/minute is within the range of the normal stimulated saliva flow rate in adults, which is 1-3 ml/minute [59], and was chosen to minimize swallowing during exercise. The TR group subjects received a constant NEUTRAL SOLUTION injection into the mouth during interval training; COTR subjects received NEUTRAL SOLUTION with a SWEET SOLUTION admixture depending on HR. The admixture was controlled by a bespoke computer program that used the HR measure to adjust the sweetness of the mouth rinse solution of the COTR group: HR ≥ 85% of calculated HRmax received 100% SWEET SOLUTION, while HR at the individually self-selected speed (baseline) received 0% sweetness (= NEUTRAL SOLUTION). While the total liquid rate was kept constant (2 ml/minute), any increase in HR above baseline increased the sweetness admixture in a quadratic exponential manner, reaching 100% sweetness at 85% HRmax (see Eq 1).
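The mixing rule can be sketched as follows. Eq 1 is not reproduced in this text, so the simple quadratic ramp used here is an assumption that only matches the two stated end points (0% sweetness at baseline HR, 100% at 85% of HRmax); it is not the authors' control program. The function names `sweet_fraction` and `pump_rates` are hypothetical.

```python
TOTAL_FLOW_ML_MIN = 2.0  # combined flow rate stated in the text


def sweet_fraction(hr, hr_baseline, hr_max_est):
    """Assumed quadratic ramp: 0 at baseline HR, 1 at 85% of estimated HRmax."""
    hr_full = 0.85 * hr_max_est
    if hr_full <= hr_baseline:
        raise ValueError("baseline HR must lie below 85% of HRmax")
    x = (hr - hr_baseline) / (hr_full - hr_baseline)
    x = min(max(x, 0.0), 1.0)  # clamp below baseline and above the 85% ceiling
    return x ** 2


def pump_rates(hr, hr_baseline, hr_max_est):
    """Return (neutral_ml_min, sweet_ml_min); the two rates sum to 2 ml/min."""
    s = sweet_fraction(hr, hr_baseline, hr_max_est)
    return TOTAL_FLOW_ML_MIN * (1.0 - s), TOTAL_FLOW_ML_MIN * s
```

With hypothetical values (baseline HR 110 bpm, estimated HRmax 200 bpm), a heart rate halfway between baseline and 170 bpm yields a sweet fraction of 0.25 under this assumed ramp, while the summed pump rate stays at 2 ml/minute.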
The pairing between heart rate change (conditioned stimulus, CS) and the sweet reward (unconditioned stimulus, US) depended on exercise intensity increments during interval training. Six pairing periods were applied per training session; a total of 54 pairings over 9 conditioning sessions were applied during interval training sessions over 3 weeks [60]. Pairings were induced during the interval training sessions in the COTR group, while the TR group received the same interval training with NEUTRAL SOLUTION injection. All participants attended all nine interval training sessions over three weeks.
Interval training sessions. The interval exercise training consisted of three sessions per week over three weeks for the TR and COTR groups. The exercise training was performed using an interval training protocol on a treadmill consisting of progressive peak training intensities between 60-85% of the estimated HRmax. HRmax was calculated as HRmax = 220 - age, which is suitable for the recruited age group [61]. Target velocities for the treadmill were calculated by using the HR/treadmill speed relationship from the assessment trials for self-selected speed. Speed was adjusted manually by the researcher if the target HR was not achieved. Subjects were verbally informed about oncoming increases or decreases in speed to avoid accidents. The training protocol was identical for the TR and COTR groups.
After a warm-up on a cycle ergometer (Lode Excalibur) for 3 minutes, subjects started exercising on a treadmill (computer-controlled treadmill, h/p-Cosmos), and the speed was gradually increased to the participant's preferred speed (baseline). Subjects were blinded to all data of the treadmill settings. This was followed by intervals ramped from baseline to 60% HRmax over 10 sec, then held for 2 minutes, followed by slowing down to baseline speed over about 10 sec; baseline speed was then held for a further 1.5 minutes, followed by the next cycle, which increased intensity by 5% of HRmax. Six cycles were performed per training session (i.e. 60%, 65%, 70%, 75%, 80%, 85% of HRmax), followed by a cool-down at walking speed. Total exercise time was about 30 minutes.
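The interval protocol can be summarized in a short Python sketch that derives the six target heart rates and the duration of the interval block from the figures given above. The function names and the example age of 24 years are hypothetical, and the sketch assumes that the 1.5-minute baseline period also follows the final cycle, which the text does not state.

```python
def hr_max(age_years):
    """Estimated maximal heart rate via the 220 - age rule used in the text."""
    return 220 - age_years


def interval_targets(age_years, start_pct=60, step_pct=5, cycles=6):
    """Peak targets for each cycle as (percent of HRmax, beats per minute)."""
    peak = hr_max(age_years)
    return [(start_pct + i * step_pct, (start_pct + i * step_pct) * peak / 100)
            for i in range(cycles)]


def interval_block_seconds(cycles=6, ramp_up=10, hold=120, ramp_down=10, recovery=90):
    """Duration of the interval block only (warm-up and cool-down excluded)."""
    return cycles * (ramp_up + hold + ramp_down + recovery)


targets = interval_targets(age_years=24)        # 24 is an illustrative age
block_minutes = interval_block_seconds() / 60   # minutes spent in the six cycles
```

Under these assumptions the six cycles account for 23 minutes; the remainder of the roughly 30 minutes of total exercise time would then fall to the initial ramp to preferred speed and the cool-down.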

Phase 3: Post-training assessments
This phase included three sessions, carried out during the week following the interval training phase for the TR and COTR groups and, for the control group (NTR), three weeks after phase 1 (no training). All sessions and measurements were performed in the same order and manner as described for the baseline measurements, except that the taste test was omitted.

Phase 4: Follow-up assessments
These were carried out 4 weeks after post-training assessments (phase 3) for TR and COTR groups. During this period participants were instructed not to engage in any physical training that was out of their usual former (before the intervention) daily routine. This phase included three sessions, using the same protocol as for phase 3. Subjects were reimbursed after this session and debriefed; participants were initially informed that the study was aimed to investigate the influence of oral cavity rinse to avoid dry mouth during the exercise.

Analysis
All variables were tested on assumptions for parametric testing (i.e. mixed model ANOVA and ANCOVA, t-test); parameters that were not normally distributed were transformed (log: delay discounting constants k, speed, heart rate; x²: rate of perceived exertion (RPE)) to comply with ANOVA/ANCOVA test assumptions. ANOVA and ANCOVA were applied according to the recommendations by Van Breukelen [62] with Bonferroni correction. Parameters that could not be successfully transformed were analyzed using non-parametric analysis, e.g. Kruskal-Wallis tests followed by Mann-Whitney U tests, as indicated in the results section. Multiple regression analysis of selected parameters was performed using the enter method, as indicated in the results section. For model fitting of the hyperbolic effort discounting equation [63] on selected data, the Microsoft Excel Solver program with the least-squares fit method was used to obtain the effort discounting constant k. The fit of model parameters was tested using the Wilcoxon sign test. In addition, delay discounting constants k for money, food, and exercise were calculated using Mazur's equation [64] by fitting participants' indifference point (IP) data to hyperbolic functions with the least-squares fit method in the Microsoft Excel Solver program. Data sets were removed if they fitted the hyperbolic model poorly (R² < 0.7). Correlation analysis was performed using Spearman's and Pearson's correlation analysis. Data are displayed as mean and standard deviation, or as median with 25th and 75th percentiles. Significance was reported at p<0.05. Data were analyzed using the Statistical Package for the Social Sciences (IBM SPSS) version 25.
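The least-squares fit of a hyperbolic discounting curve and the R² exclusion rule can be illustrated with a minimal Python sketch. It stands in for the Excel Solver procedure and is not the authors' implementation: it assumes the common form of Mazur's equation, V = A/(1 + kD), and uses a coarse grid over log10(k) followed by a golden-section refinement. The reward amount of 100, the value k = 0.05, and the function names are hypothetical; the delays are those listed for the DD task.

```python
import math


def hyperbolic_value(amount, k, delay_days):
    """Hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay_days)


def sse(k, amount, delays, ips):
    """Sum of squared errors between indifference points and the model."""
    return sum((ip - hyperbolic_value(amount, k, d)) ** 2
               for d, ip in zip(delays, ips))


def fit_k(amount, delays, ips, lo=-6.0, hi=2.0, steps=801, iters=60):
    """Least-squares k: coarse log10 grid, then golden-section refinement."""
    width = (hi - lo) / (steps - 1)
    grid = [lo + i * width for i in range(steps)]
    best = min(grid, key=lambda e: sse(10 ** e, amount, delays, ips))
    a, b = best - width, best + width
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if sse(10 ** c, amount, delays, ips) < sse(10 ** d, amount, delays, ips):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 10 ** ((a + b) / 2)


def r_squared(k, amount, delays, ips):
    """Coefficient of determination of the fitted curve."""
    mean_ip = sum(ips) / len(ips)
    ss_tot = sum((ip - mean_ip) ** 2 for ip in ips)
    return 1.0 - sse(k, amount, delays, ips) / ss_tot


delays = [1, 7, 30, 60, 180]      # delays in days, as listed for the DD task
true_k = 0.05                     # hypothetical value used to synthesize demo data
ips = [hyperbolic_value(100.0, true_k, d) for d in delays]
k_hat = fit_k(100.0, delays, ips)
keep = r_squared(k_hat, 100.0, delays, ips) >= 0.7   # data sets with R² < 0.7 are removed
```

Searching over log10(k) rather than k keeps the search well scaled across several orders of magnitude, which is also why k values are typically log-transformed before statistical testing.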

Results
Physiological and psychological characteristics
Fifty-eight participants, out of 62 recruited, completed the study (32 females). Forty-eight participants completed the randomized control trial: training group (TR) (n = 24) and training plus conditioning group (COTR) (n = 24); in the no-training group (NTR), included to control for time effects over the training period, ten participants completed the study (n = 10). Body characteristics and psychological self-report parameters are shown in Table 1 and in S1 Table in S1 File. Participants were young adults (24.3 (5.2) yrs.) with a wide range of BMI (18.5-40.5) but mostly eutrophic (53%); they mainly reported moderate to high physical activity (86%) and more than medium exercise motivation (EMI 2). Participants were not aware of the EC process; only 3 out of 24 participants of the COTR group reported contingency awareness of higher speed with sweetness after the intervention (debriefing). Accordingly, we assume that any conditioning effects were produced without awareness.

Effects of training and conditioning on self-selected speed, heart rate, and RPE
Training improved participants' cardiovascular exercise efficiency; heart rate per speed (HR/Speed) (Table 2) was significantly reduced after training in both TR and COTR groups (mixed model ANOVA; main effect of time: F = 21.87, p<0.0001, η² = 0.504; contrast baseline versus post-training: F = 43.48, p<0.0001, η² = 0.497; no significant interaction of group x time); no changes were detected in the NTR group.
Analysis of the self-selected speed, heart rate, and RPE data of the randomized control trial showed that self-selected speed was significantly higher after training than at baseline in both the TR and COTR groups. The ANCOVA with baseline speed as a covariate reported a main effect of time (F = 6.65, df = 2, p = 0.003, η² = 0.236); contrasts detected a significant increase in speed at post-training (T0 versus T1: F = 9.64, df = 1, p = 0.003, η² = 0.180). The self-selected speed was about 1 km/h faster after training than at baseline in both groups (see Table 2); no significant interaction of group x time was reported between baseline (T0) and post-training (T1). The ANCOVA, however, reported a significant interaction of group x time (F = 7.70, df = 2, p = 0.001, η² = 0.264), whereby contrasts revealed that the significant interaction lay between post-training (T1) and 4 weeks follow-up (T2) (T2 versus T1: F = 13.32, p = 0.001, η² = 0.232) (Fig 2). Pairwise comparison showed that the COTR group selected a significantly higher speed than the TR group at 4 weeks follow-up (T2) (t = -3.05, df = 45, p = 0.004) (Table 2, Fig 2). The self-selected speed at T2 did not differ from baseline (T0) in the TR group. Concomitant heart rate measurements at self-selected speeds showed that a higher level of cardiovascular strain was selected after training (T1) than at baseline (T0) (Fig 2, Table 2); the ANCOVA with baseline heart rate as a covariate reported a main effect of time (F = 8.67, df = 2, p = 0.001, η² = 0.292), where contrasts revealed that heart rate was significantly higher at post-training compared with baseline (T1 versus T0: F = 7.16, df = 1, p = 0.01, η² = 0.143) and with 4 weeks follow-up (T2 versus T1: F = 6.42, df = 1, η² = 0.130) in both groups.
Moreover, a significant interaction of time x group was reported (F = 5.43, df = 2, p = 0.008, η² = 0.205), where the interaction was based on the difference between groups in the change of heart rate between T2 and T1 (F = 6.55, df = 1, p = 0.014, η² = 0.132), showing that the COTR group maintained a higher cardiovascular strain at 4 weeks follow-up compared with TR, which is consistent with the conditioning effect in self-selected speed (COTR).
Furthermore, ANCOVA with baseline as a covariate showed that RPE levels (Table 2, Fig 2) at self-selected speed were increased after training (main effect of time: F = 17.58, df = 2,

PLOS ONE
Influence of evaluative conditioning and training on self-paced exercise intensity

Effects of training and conditioning on reward discounting
Computer-based assessments of reward discounting of money, food, and exercise at baseline showed that the decay constant (k) for money (k_m) was significantly lower than k for food (k_fo) and exercise (k_ex) across groups (k-values were log-transformed due to skewed distribution; repeated measures ANOVA: F = 69.96, df = 2, p<0.0001, η² = 0.714; contrast k_m versus k_ex: F = 130.2, df = 1, p<0.0001, η² = 0.696); no significant difference between k_fo and k_ex was found. These outcomes demonstrate that exercise was discounted faster than money, like a non-transferable reward, similar to food (Table 3).
To assess the hypothesized specific exercise training effect on discounting rates of exercise, ANOVAs of the change between baseline and post-training values were performed on the log-transformed decay constants k_ex and k_fo using the data of the NTR and TR groups. This method was preferred to a mixed model ANOVA because discounting rate constants were significantly lower in the NTR group than in the TR group at baseline (t-test: k_ex: t = 3.02, df = 31.39, p = 0.0002; k_fo: t = 2.68, df = 25.52, p = 0.013). Results revealed a significant difference between the TR and NTR groups in the change of k_ex from baseline to after the training and no-training periods, respectively (Δk_ex: F = 13.80, df = 1, p = 0.001); k_ex was significantly reduced after training, while k_ex in the NTR group was unaltered (Table 3). Moreover, this effect of exercise training was specific to k_ex; the changes in k_fo from baseline to after the training/no-training period were not significantly different between groups, and no change over time was reported within groups. Consequently, these results show that exercise training specifically reduced discounting rates of exercise, with no effect on k_fo, and that discounting of both exercise and food rewards was not affected by time (no change in the NTR group).
For the hypothesis of an influence of EC on discounting rates of exercise, we used the data from the randomized control trial (TR and COTR groups). The groups showed no significant difference in k_ex and k_fo at baseline, and a mixed model ANOVA of log-transformed data over three time points (baseline, T0; post-exercise, T1; 4 weeks follow-up, T2) was performed. Results (Table 3, Fig 3) showed a main effect of time (F = 24.56, df = 2, p<0.0001, η² = 0.522), where the post-training k_ex values (T1) were significantly reduced compared with baseline (T0) (contrast T1 versus T0: F = 45.78, df = 1, p<0.0001, η² = 0.499). Moreover, contrasts revealed that k_ex values returned towards baseline levels in both groups at 4 weeks follow-up (T2 versus T1: F = 27.94, df = 1, p<0.0001, η² = 0.378), with no significant differences between baseline and follow-up. Moreover, no significant effects of group and no interaction of group x time were found. In contrast, k_fo values were not affected by training or conditioning (Table 3, Fig 3); no significant effects of time, group, or interaction were reported, revealing stable levels of food reward discounting over time in both groups.
[Figure caption fragment: interaction of group x time; the figure shows means and SE. Data are depicted and used in the form of meeting statistical test assumptions. Untransformed data are presented in Table 2.]

Multiple regression analysis
We assumed that the self-selection of exercise intensity results from perceptual and evaluative processes involving training status, effort perception, and evaluation of exercise as reward and cost. Consequently, we used the changes in the variables training adaptation (HR/speed), exercise delay discounting (k_ex), and effort perception (RPE), as well as evaluative conditioning (group), to explain the variance in exercise intensity change over the study time points. We performed multiple regression analysis using the enter method, adding the alterations of these parameters from baseline to either post-training or follow-up as explanatory variables. In addition, we used age as a further explanatory variable because of the known age-dependent response to training. The first model (Table 4) analyzed the period from baseline to post-training (T0 to T1). The model explained about 60% of the variance of self-selected speed alterations, where the alteration in HR/speed, as a measure of training adaptation, explained most of the variance (beta = -0.600, p<0.0001), next to the alteration in RPE (beta = 0.396, p = 0.001); exercise discounting change and group did not contribute to the model. Higher increases in self-selected speed over this period were associated with stronger training adaptation (reduction in HR/speed) and larger effort acceptance (higher RPE). For the self-selected speed alterations from baseline to follow-up (T0 to T2), the model (Table 4, model 2) explained about 76% of the variance; the significant variables were the grouping variable, showing the influence of conditioning (beta = 0.428, p<0.0001), and the alteration of HR/speed (beta = -0.487, p<0.0001), next to age. Higher speed was therefore selected under the influence of EC and a better training effect (HR/Speed) over this period.

Effort model application
To integrate our findings into a concept that entails the perceptual and evaluative parameters which seemed to explain exercise intensity selection, we suggested that choices were made according to effort discounting models [25,65,66]. To test this, we used the hyperbolic effort discounting model Vp = M/(1+kC) [63], where the subjective value (Vp) is a function of the reward value (M) and perceived costs (C) in connection with the constant k. For our exercise intensity question, we assumed that participants' pleasantness scores (Feeling Scale (FS), see Table 3), recorded while selecting their self-selected speed, would be a measure of the subjective value (Vp) of the exercise. Moreover, we assumed that the rate of perceived exertion (RPE) given at self-selected speed would be a measure of perceived costs (C), and the parameter HR/Speed a measure that would determine the reward value (M) perceived during exercise. This reward value could be influenced by expected rewards through learning. Using the data from time points T0, T1, and T2, we assumed that, if the model is valid for the combination of data, the Vp values from the least-squares fit would not be significantly different from the measured FS values (Vp). Also, we expected that the k values at T0, T1, and T2-TR (training-only group) would be on the same level, while the k value at T2-COTR of the conditioning group should be strongly affected by the added conditioned reward, which is not accounted for in the model. Because k is connected to costs, an unaccounted reward value (conditioning) would need to reduce the cost term by reducing k to adjust to the subjective value measured. Firstly, fitting of the hyperbolic model to data at baseline (T0), after training (T1), and follow-up (T2) produced Vp values which were not significantly different from the measured FS values (Table 4) Table 5), model fitting was improved (n = 24; T2-TR: Z = -0.629, p = 0.530; T2-COTR: Z = -0.743, p = 0.458). The k values of the fitted models reveal that k is consistent over the periods of baseline, after training, and follow-up (T2, TR only): k-T0 = 0.204; k-T1 = 0.204; k-T2 (TR) = 0.212. However, k-T2 (COTR) = 0.042 revealed that the k value in the conditioning group (COTR) was adjusted about five times lower to accommodate the unaccounted reward value from the conditioning effect. The effort discounting model, therefore, fitted the observed data of our experiments reasonably well and represents a possible explanation for the decision-making process.
[Fig 3: https://doi.org/10.1371/journal.pone.0257953.g003]
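The size of the adjustment in k can be checked with a short worked calculation based on the fitted constants reported above. The cost value C = 12 is a hypothetical RPE rating chosen only for illustration; it is not a value taken from the data.

```latex
% Ratio of the fitted constants at follow-up (values from the text)
\frac{k_{T_2}(\mathrm{TR})}{k_{T_2}(\mathrm{COTR})} = \frac{0.212}{0.042} \approx 5.05

% Effect on the subjective value for equal M and an illustrative cost C = 12
V_p = \frac{M}{1 + kC}, \qquad
\frac{V_p(\mathrm{COTR})}{V_p(\mathrm{TR})}
  = \frac{1 + 0.212 \cdot 12}{1 + 0.042 \cdot 12}
  = \frac{3.544}{1.504} \approx 2.36
```

Under this assumption, the lower k in the COTR group corresponds to a subjective value about 2.4 times higher for the same reward value and perceived cost, which is how the model absorbs the conditioned reward that it does not represent explicitly.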

Alterations in self-selected speed: Effects of training and evaluative conditioning
We hypothesized that exercise training would lead to a transient increase in self-selected speed due to transient physiological adaptations, followed by a decline at follow-up. Evaluative conditioning would increase self-selected speed through the integration of the reward into the visceral reward of exercising. The hypotheses were driven by the assumption that self-selected exercise intensity could be perceived as a reward at an individual level of cardiovascular strain, influenced by transient training adaptation and balanced against the perceived effort at a given intensity. Indeed, self-selected speed was significantly increased after training (TR), whereas no changes were detected in the control group (NTR) over time, showing that the effect on self-selected speed was specific to training. Multiple regression analysis (Table 4) showed that the speed changes could be attributed foremost to training adaptation, i.e. changes in heart rate per speed (HR/Speed) and changes in RPE scores. At submaximal levels, HR is linearly associated with running speed over a wide range of intensities and has been used for monitoring training and as a predictor of endurance performance in connection with cardiovascular fitness [67,68]. Alterations in HR/Speed are connected to physiological adaptations to training that enable a lower heart rate at a set speed after training [68]; however, the transient nature of these adaptations was evident at follow-up four weeks after training. Heart rate changes with aerobic training and detraining are commonly observed and were an expected outcome [69,70].
To understand why a specific speed was selected, and why training as well as conditioning altered it, we assumed that the selection of exercise intensity would follow models generally suggested for choice decisions that include costs and rewards [25,27,71]. Indeed, most behavioural models assume that the subjective value of a utility is a function of rewards and costs [27,63,72,73]. Distinct choice paradigms in which subjects work for an external reward, with varied imagined or received rewards (e.g. money), have shown that a reward value is discounted against effort or work, resulting in a subjective value for the rewarding utility [25,65,66]. In confirmation, many animal studies have shown that rewards and costs are similarly discounted in behavioural choices [25,26,28,30]. In this regard, various models have been suggested, from hyperbolic [63] and sigmoidal [74] to parabolic [27]; however, the principal rule applies that effort carries a negative value, which is used as the reference against which a reward is evaluated [25].
If we assume that the adjustment of self-selected speed is a function of cost and reward, it is conceivable that training either reduced the perceived costs for a set speed or increased the perceived reward. Indeed, the greater an individual's reduction in HR/Speed after training, the higher the selected speed, which could indicate a reduction in perceived costs. However, subjects selected a higher speed despite even higher RPE scores; because RPE is undoubtedly a measure of costs, this weakens the former argument. RPE is often associated with individuals' heart rate in aerobic exercise [75], and scales are partially attuned to heart rate levels, e.g. the 6-20 scale [76]. However, the integration of the cardiovascular response (heart rate) into effort level perception does not exclude the possibility that fitness, i.e. heart rate per workload, could determine the reward perceived at a certain workload. Indeed, heart rate is not always associated with effort perception: attention allocation influences the association, and at higher heart rates with increased workloads, attention shifts more closely towards physiological sensations, i.e. heart rate [77]. Accordingly, we propose that HR/Speed, i.e. cardiovascular fitness, might determine the workload intensity an individual selects by limiting what can be perceived as rewarding. In our paradigm, subjects do not discount effort against an external reward but against a visceral reward made available by exercising at a distinct level of cardiovascular strain.
Additional support for this assumption of a modifiable reward in exercise linked to cardiovascular strain comes from our experiment using EC. We performed EC using our new paradigm, in which the receipt of the sweet solution as a reward was associated with the elevation of heart rate during exercise training. We hypothesized that EC would increase self-selected speed after training and at follow-up, assuming that the conditioned reward would be integrated into the exercise reward, enabling higher speed selection with a concomitant increase in RPE. Indeed, conditioning (COTR group) resulted in a significantly higher self-selected speed at follow-up compared with the training group (TR), while the earlier training effect on speed selection was not maintained over the follow-up period in the TR group. In the conditioning group (COTR), self-selected speed was preserved at a significantly higher level (about 1 km/h higher than baseline); the higher self-selected speed in COTR was associated with significantly higher heart rate and RPE than in the TR group. However, the EC effect was not significant directly after training (T1), which could be due to the strong training effect on speed selection at this time point.
In terms of the conditioning process, there is no doubt about the rewarding nature of sweetness in humans [78,79], and our participants rated the pleasantness of the tastant used in a taste test. Moreover, the brain areas known to be concerned with reward are strongly activated in response to sweetness [80], even with non-caloric sweeteners [78,79]. The selection of higher speed, cardiovascular strain (HR), and RPE in the conditioning group (COTR) shows that higher costs were chosen, which can only be explained by the integration of the sweet reward into the processes relevant for speed selection. If exercise intensity were selected only on the basis of minimizing the costs of 'travel', EC would have no effect, because rewards would not be integrated into the selection of speed in the first place.
Evaluative conditioning paradigms in humans usually use visual representations of an object or behaviour for the pairing with the unconditioned stimulus [37]. In connection with exercise, only two studies have applied EC to exercise behaviour, using visual stimuli for both the exercise representation and the unconditioned stimulus [41,42]; however, only the study by Antoniewicz and Brand [42] observed acute changes in exercise intensity after the EC procedure. To our knowledge, our study is the first to pair a physiological parameter during the performance of the behaviour with a primary reinforcer as UCS. Moreover, we associated the intensity of a tastant reward with the intensity of physical strain (heart rate), to direct the effect of EC towards the selection of higher intensity in our self-selected exercise task. The use of primary reinforcers, i.e. tastants (rewarding and aversive), as unconditioned stimuli in connection with visual representations of food items and other objects, to modify food choices or implicit attitudes towards selected food items, has been reported before, but only in a limited number of studies [81,82].
To integrate our findings on training and conditioning, we further explored our data using the hyperbolic effort discounting model to calculate the effort discounting constant k [63]. Using the feeling scale values as a measure of subjective value, the RPE values as costs, and the HR/Speed values as a determinant of reward, the model predicted the feeling scores successfully at all three study time points (Table 5). Indeed, the fitting produced a consistent k value for effort discounting at baseline, after training, and at follow-up for the TR group, while the k value of the COTR group deviated strongly at the follow-up time point, where significant conditioning effects were detected for speed and RPE. The five times smaller k value for the COTR group at follow-up could be explained by an additional reward that was not entered into the equation, consistent with our assumption that EC added a reward apart from the one determined by the HR/Speed values. Usually, paradigms in effort discounting use an external reward to be gained by various degrees of workload [25,29,66,74,83], and k values within a range similar to that of our data have been reported [29,84]. However, in our case the reward is the self-selected exercise itself, which is determined by the cardiovascular fitness (i.e. HR/Speed) of the individual and the conditioned reward; concurrently, the effort invested (RPE) is adjusted to the paradigm's demand of maximizing pleasantness (i.e. subjective value). Consequently, subjects do not adjust the speed to the lowest possible effort (RPE) or maintain the level. In our opinion, this interpretation also makes evolutionary sense if we integrate the idea of foraging and hunting; humans with elevated physical fitness would 'travel' larger distances, increasing the probability of success in their foraging or hunting.
A speed or workload intensity would thus be selected as rewarding based on the specific capacity of an individual, apart from the exercise itself. Our conditioning experiment also suggests that learned or predicted rewards can be integrated with visceral (i.e. fitness-dependent) rewards. It is possible that reported associations of the volume of elevated-intensity exercise with intrinsic motivation could be indirectly attributed to this mechanism [85,86].
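The argument that an unaccounted reward shows up as a smaller fitted k can be made explicit with a short worked derivation. It rests on an assumption that is ours for illustration only, namely that the conditioned reward R enters the reward term additively; the model in the text does not specify this.

```latex
% Assumption (illustrative): the conditioned reward R >= 0 adds to M.
% True subjective value with the extra reward:
V_p = \frac{M + R}{1 + kC}
% Model fitted without R, with an adjusted constant k':
V_p = \frac{M}{1 + k'C}
% Equating both expressions and solving for k':
k' = \frac{kCM - R}{C\,(M + R)}
   = k - \frac{R\,(1 + kC)}{C\,(M + R)}
% Since R, C, M > 0 and k >= 0, the subtracted term is positive, so k' < k.
```

Under this assumption, any positive R forces the fitted constant below the true k, and the gap widens as R grows relative to M, which is the direction of the deviation observed for the COTR group at follow-up.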

Alterations in delay discounting: Effects of training and evaluative conditioning
Our study shows that taking part in exercise training shifted the choice preference for exercise towards the delayed option in the discounting paradigm; k_ex was significantly reduced in the training group (TR) but remained unaltered in the no-training group (NTR) after the three-week intervention period. This effect was specific to exercise discounting; no alterations in food discounting were detected over time in either group. Moreover, the effect on exercise discounting was not sustained after the four-week follow-up period, during which participants stopped training and returned to their habitual physical activity; k_ex returned to baseline at follow-up. Furthermore, k_ex values were not influenced by EC, and no interaction of group and time was found, although a reduction was expected due to a magnitude effect that would reduce k values [49]. However, there is a caveat for this expectation: the paradigm asked people to optimize pleasantness, based on the feeling scale (i.e. to optimize subjective value), by adjusting the speed. Consequently, participants perceived the same pleasantness, producing matching subjective values across time points and groups. Therefore, the reward added by conditioning might not have revealed itself in the discounting paradigm, while it was apparent in the selection of speed at the follow-up time point. Moreover, Loewenstein [35] pointed to the problem of using past and likely future visceral factors in decision making. In general, the capability of visceral factors to influence behaviour is underestimated when it comes to decision making that involves cognitive evaluation and planning of future behaviour, as is their actual impact on behaviour. It seems that the information about a visceral change towards a higher reward, which resulted in higher speed selection, was not included in the processing of information for the delay discounting of exercise.
Indeed, if the visceral changes in exercise experience were integrated, we would expect an association between changes in k_ex and changes in speed; however, the alterations in k_ex were not correlated with the speed changes (not shown). This again suggests that the actual selection of speed was not driven by the same factors as the delay discounting of exercise.
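To make the direction of the discounting effect concrete, the following minimal sketch shows how a lower discounting constant shifts a choice towards the delayed option. It assumes the commonly used hyperbolic delay discounting form V = A/(1 + kD); this section does not restate the exact model behind the exercise discounting constant, and the amounts, delay, and function names are hypothetical.

```python
def discounted_value(amount, delay, k):
    """Hyperbolic delay discounting (assumed form): V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def prefers_delayed(immediate, delayed, delay, k):
    """True if the discounted delayed option is worth more than the immediate one."""
    return discounted_value(delayed, delay, k) > immediate


def indifference_k(immediate, delayed, delay):
    """Value of k at which both options are valued equally."""
    return (delayed / immediate - 1.0) / delay


# Hypothetical choice: 30 min of exercise now versus 60 min in 7 days.
immediate, delayed, delay_days = 30.0, 60.0, 7.0

for k in (0.1, 0.2):
    choice = "delayed" if prefers_delayed(immediate, delayed, delay_days, k) else "immediate"
    print(f"k = {k}: prefers the {choice} option")

print("indifference k:", indifference_k(immediate, delayed, delay_days))
```

In this toy choice the preference flips between the two k values, so a reduction of the constant after training would be read as a shift towards the delayed exercise option.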
We previously showed that motivation towards delayed extrinsic goals, i.e. goals related to health and fitness, was associated with k_ex [31]. However, context-specific valuation, i.e. the contextual relevance of delayed exercise training goals, could play a dominant role in the alteration of k_ex after training. This is also supported by the return of k_ex to baseline levels at follow-up, where the contextualization of delayed goals is expected to decline. In agreement with this interpretation, discounting was shown to be context-sensitive in gamblers, whose k values were higher in a gambling environment than in a neutral environment [87]. When a situation or state is no longer directly experienced, it becomes more psychologically distant and requires a more abstract cognitive representation [88]. The mental representation of delayed exercise training outcomes and goals might become more psychologically distant and decontextualized with increasing temporal distance from the training period. Alterations in temporal discounting through manipulations of construal level (concrete or abstract) and psychological distance have been shown experimentally [89]. Further support for this interpretation comes from the finding that the discounting alterations were specific to exercise and not seen in food discounting, revealing no generalized effect on discounting.
Our study has limitations. In particular, our paradigm limits generalization to other exercise types, which might be connected to other rewarding stimuli (group exercise, competitions, etc.). Moreover, self-selection of intensity might be limited in many team and competitive sports, reducing the relevance of our findings in those areas. In addition, our participants already had high exercise motivation at baseline, limiting the generalization of our findings to groups with low motivation, e.g. sedentary individuals. Finally, we did not formally investigate sex differences in this study; no sex differences in self-selected speed, RPE, or avHR were detected, while slightly lower feeling scale values were detected for women (data not shown).
In conclusion, our study suggests that self-selected exercise can be perceived and evaluated as a reward. However, the intensity of exercise and exercise as a modality are evaluated differently in the context of delay and effort discounting. The self-selected intensity of exercise, which can be perceived as rewarding, is determined by cardiovascular fitness and learned rewards and is discounted against perceived effort. In contrast, delay discounting of self-selected exercise as a modality seemed to be strongly influenced by contextual factors and exercise motivation.