## Abstract

Reward probability and uncertainty are two fundamental parameters of decision making. Whereas reward probability indicates the prospect of winning, reward uncertainty, measured as the variance of probability, indicates the degree of risk. Several lines of evidence have suggested that the anterior cingulate cortex (ACC) plays an important role in reward processing. What is lacking is a quantitative analysis of the encoding of reward probability and uncertainty in the human ACC. In this study, we addressed this issue by analyzing the feedback-related negativity (FRN), an event-related potential (ERP) component that reflects the ACC activity, in a simple gambling task in which reward probability and uncertainty were parametrically manipulated through predicting cues. Results showed that at the outcome evaluation phase, while both win and loss-related FRN amplitudes increased as the probability of win or loss decreased, only the win-related FRN was modulated by reward uncertainty. This study demonstrates the rapid encoding of reward probability and uncertainty in the human ACC and offers new insights into the functions of the ACC.

**Citation: **Yu R, Zhou W, Zhou X (2011) Rapid Processing of Both Reward Probability and Reward Uncertainty in the Human Anterior Cingulate Cortex. PLoS ONE 6(12): e29633. https://doi.org/10.1371/journal.pone.0029633

**Editor: **Jason Jeremy Sinclair Barton, University of British Columbia, Canada

**Received: **April 30, 2011; **Accepted: **December 2, 2011; **Published: ** December 27, 2011

**Copyright: ** © 2011 Yu et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Funding: **This study was supported by grants from the Natural Science Foundation of China (60435010, 30770712). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests: ** The authors have declared that no competing interests exist.

## Introduction

Reward probability and uncertainty are essential parameters in the computation of the utility function of a behavior choice [1], [2]. Whereas reward probability crucially determines the expected reward value associated with a behavior choice, reward uncertainty, i.e., the variance of the probability distribution, provides an estimate of the risk associated with the same choice. In non-human primates, substantial evidence indicates that the midbrain dopamine neurons encode the reward prediction signal that is based on reward probability, as well as the reward prediction error signal that is the difference between the actual and expected reward [3]–[5]. The cues that predict higher reward probabilities evoke larger phasic activations in the midbrain dopamine neurons. Whereas the outcomes that are better than predicted (positive prediction errors) evoke phasic activations in the dopamine neurons, the outcomes that are worse than predicted (negative prediction errors) evoke phasic inhibitions. In a seminal study, Fiorillo et al. (2003) further showed that the midbrain dopamine neurons encode reward uncertainty in their tonic discharges. Recent fMRI studies reported similar encoding of reward probability and uncertainty in the human midbrain regions [6], [7].

The anterior cingulate cortex (ACC) receives projections from the midbrain dopaminergic regions and has been proposed to play an important role in reward processing. Event-related potential (ERP) studies in humans found that an ERP component, called the feedback-related negativity (FRN), is sensitive to reward expectation error. The FRN, which peaks at around 300 ms and is maximal at frontal-central scalp electrode sites, is likely generated in the ACC [8]–[10]. Consistent with this account, fMRI studies of the ACC have shown that ACC activity can reflect reward prediction errors [11], [12]. A recent fMRI study also found that ACC activity is modulated by the uncertainty of the reward environment during feedback monitoring and that the degree of such modulation predicts the learning rate across individuals [13], suggesting that the ACC may track reward uncertainty.

The goal of this ERP study is to use the FRN amplitude as a measure of the ACC activity and perform a quantitative analysis of the encoding of both reward probability and uncertainty in the ACC. As the uncertainty is derived and calculated from the probability [14], in most circumstances these two factors are highly correlated. Increasing the probability of a win from 75% to 100% not only increases reward probability but also decreases uncertainty (i.e., a 100% win is most certain). On the other hand, decreasing the probability of a win from 25% to 0% not only decreases reward probability but also decreases the uncertainty (i.e., a 0% win is most certain). The uncertainty reaches its maximum when reward probability is 50%. Above 50%, it decreases as reward probability increases, whereas below 50%, it decreases as reward probability decreases. Given these opposite directions of correlation, the correlation between probability and uncertainty will be close to zero if the win probability varies from 0 to 100%. In this study, to ensure that reward probability and uncertainty could be dissociated, reward probability was varied over a wide range of probabilities with a sufficient number of intermediate values (every 12.5% from 0 to 100%). Given the evidence that the ACC encodes both reward probability and uncertainty in fMRI and the evidence for the link between the FRN and the ACC [8]–[13], we predicted that the FRN amplitude would be modulated by both reward probability and uncertainty.
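The claim that the two factors decorrelate over the full range can be checked numerically. The following is a minimal sketch, assuming the Bernoulli variance p(1 − p) as the uncertainty measure and nine evenly spaced probability levels; the helper `pearson_r` and the variable names are illustrative and not part of the original analysis.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Nine win probabilities in steps of 12.5%, from 0 to 100%.
probabilities = [k / 8 for k in range(9)]
# Assumed uncertainty measure: variance of a Bernoulli outcome.
variances = [p * (1 - p) for p in probabilities]

r_full = pearson_r(probabilities, variances)            # whole range
r_lower = pearson_r(probabilities[:5], variances[:5])   # 0% to 50% only
r_upper = pearson_r(probabilities[4:], variances[4:])   # 50% to 100% only

print(f"full range: {r_full:.3f}, lower half: {r_lower:.3f}, upper half: {r_upper:.3f}")
```

Under these assumptions the correlation is strongly positive below 50%, strongly negative above 50%, and cancels to approximately zero across the full range, which is why sampling the whole range allows the two factors to be entered together in one regression.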

## Materials and Methods

### Participants

Sixteen undergraduate students (8 male; mean age 22±2.5 years) participated in the gambling experiment. They were told that their performance in the gambling task determined how much they would be awarded or penalized on top of a base payment of 40 yuan (about US $6). Written, informed consent was obtained from each participant, and the study was approved by the Academic Committee of the Department of Psychology at Peking University.

### Experimental design

We used a modified version of a gambling task in which reward probability and uncertainty were manipulated parametrically [14]–[16] (Fig. 1). In each trial, participants were first presented with the back side of two cards that were drawn randomly without replacement from a deck of nine cards numbered from 2 to 10. They were asked to guess within 3000 ms which card had the larger number in order to win 0.5 yuan. A 0.5 yuan penalty was imposed for a late response. Participants were explicitly informed about this rule, and the visual feedback “too late, lose 0.5 yuan” was presented if they failed to respond within 3000 ms. At 700 ms after the participant's response, the chosen card (called the cue card) was presented for 1000 ms. The winning probability was indicated by the number of the cue card ranging from 2 to 10, which corresponded to winning probabilities of 0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, and 1, respectively. Participants were explicitly informed of these probabilities. At 700 ms after the offset of the cue card, a sign of “+50” or “−50” was presented for 1000 ms to indicate a win (and 0.5 yuan reward) or loss (and 0.5 yuan penalty) trial, respectively. We presented only the numeric feedback without showing the original two cards in order to control for the visual properties of the feedback stimuli. The next trial began 1000 ms after the offset of the feedback in the previous trial. The experiment consisted of 9 blocks of 96 trials, with each cue card being presented a total of 96 times. For each cue condition, the proportion of trials with the win or loss outcome followed exactly the probability indicated by the cue number. For example, for the cue card 3, 12.5% of trials gave the win feedback and 87.5% of trials the loss feedback. There was a short break between blocks.
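The trial arithmetic implied by this design can be sketched as follows. The sketch assumes that the cue-to-probability mapping is evenly spaced, i.e. (cue − 2)/8, as implied by the 12.5% steps; the function names are hypothetical and chosen for illustration only.

```python
TRIALS_PER_CUE = 96          # each cue card is presented 96 times
CUE_CARDS = range(2, 11)     # cards numbered 2 to 10


def win_probability(cue):
    """Winning probability signalled by a cue card (assumed evenly spaced mapping)."""
    return (cue - 2) / 8


def outcome_counts(cue):
    """Number of win and loss trials for one cue card, as a (wins, losses) pair."""
    wins = (cue - 2) * TRIALS_PER_CUE // 8   # exact, since 96 is divisible by 8
    return wins, TRIALS_PER_CUE - wins


for cue in CUE_CARDS:
    wins, losses = outcome_counts(cue)
    print(f"cue {cue}: p(win) = {win_probability(cue):.3f}, wins = {wins}, losses = {losses}")

total_trials = sum(sum(outcome_counts(cue)) for cue in CUE_CARDS)
```

For cue card 3 this gives 12 win trials and 84 loss trials, and the total across all cue cards equals the 9 × 96 trials of the experiment. It also makes explicit that the number of trials per outcome differs across conditions, a point taken up in the limitations.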

For each condition, reward probability was indicated by the number in the cue card, as we pointed out earlier. There are several measures of uncertainty that are all maximal at P = 0.5 and minimal at P = 0 or 1. In this study, uncertainty was defined as reward variance, which is an inversely quadratic function of probability [14], [16]. Thus, reward uncertainty has a value of 0, 0.44, 0.75, 0.94, 1, 0.94, 0.75, 0.44 and 0 for reward probability value of 0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875 and 1, respectively (Table 1). We used this measure in order to be consistent with previous neuroimaging studies [14], [16]. Uncertainty was also measured as entropy [4] and similar effects were observed. At the outcome stage, positive prediction error elicited by actual win feedback was measured as 1 minus probability of winning at the cue stage, whereas negative prediction error elicited by actual loss feedback was measured as 1 minus probability of losing at the cue stage (see Table 1). The uncertainty prediction error was measured as the uncertainty at the cue stage minus 0 as there was no uncertainty at the outcome stage (uncertainty resolved). At the cue stage, two analyses were carried out: a one-factor ANOVA analysis with 9 levels of probability and a regression analysis with mean FRN amplitudes across participants as dependent variable and reward probability and reward uncertainty as two independent variables. Repeated measures ANOVA analyses tested whether the FRN amplitude showed significant linear or quadratic relationship with reward probability. Since uncertainty, measured as reward variance, is an inversely quadratic function of probability that is minimal at P = 0 and P = 1 and maximal at P = 0.5, a significant quadratic effect would suggest a significant relationship between uncertainty and FRN amplitude. 
Degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity when Mauchly's test indicated that the assumption of sphericity had been violated. Note that because uncertainty was calculated from probability, it was impossible in ANOVA to examine both factors together, for example, controlling for probability while examining the effect of uncertainty. While the ANOVA analyses examine the effect of probability or uncertainty separately, linear regression analyses examine the effect of one factor (e.g., probability) after controlling for all other factors in the model. Similar data analyses were carried out for the FRN at the outcome stage.
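The regressors described above can be reproduced with a short sketch. It assumes that the listed uncertainty values correspond to the Bernoulli variance rescaled so that its maximum is 1, i.e. 4p(1 − p), since this reproduces the values given in the text; the entropy function uses the standard binary entropy in bits. All function names are illustrative.

```python
from math import log2


def reward_uncertainty(p):
    """Reward variance p(1 - p), rescaled by 4 so the maximum at p = 0.5 equals 1 (assumed scaling)."""
    return 4 * p * (1 - p)


def entropy(p):
    """Binary entropy in bits, an alternative uncertainty measure."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def prediction_errors(p_win):
    """Return (positive PE after a win, negative PE after a loss) for a cue with win probability p_win."""
    positive_pe = 1 - p_win          # 1 minus probability of winning
    negative_pe = 1 - (1 - p_win)    # 1 minus probability of losing
    return positive_pe, negative_pe


def uncertainty_prediction_error(p_win):
    """Cue-stage uncertainty minus 0, because uncertainty is resolved at the outcome."""
    return reward_uncertainty(p_win) - 0.0


probabilities = [k / 8 for k in range(9)]
uncertainty_table = [round(reward_uncertainty(p), 2) for p in probabilities]
print(uncertainty_table)
```

Both measures are symmetric around P = 0.5 and vanish at P = 0 and P = 1, which is consistent with the statement that the entropy-based analysis yielded similar effects.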

### ERP recording and analysis

EEGs were recorded from 64 scalp sites using tin electrodes mounted in an elastic cap according to the International 10/20 system (NeuroScan Inc., Herndon, Virginia, USA). The impedance of electrodes was maintained below 5 kΩ. Eye blinks were recorded from the left supraorbital and infraorbital electrodes. The horizontal electro-oculogram (EOG) was recorded from electrodes placed 1.5 cm lateral to the left and right external canthi. All electrode recordings were referenced to an electrode placed on the left mastoid. The EEG and EOG were band-pass filtered (0.05–70 Hz), sampled at 500 Hz and stored on hard disk for off-line analysis.

Ocular artifacts were corrected with an eye-movement correction algorithm (Gratton et al., 1983). All trials in which EEG voltages exceeded a threshold of ±70 µV during the recording epoch were excluded from analysis. The EEG data were re-referenced offline to linked-mastoid electrodes by subtracting 50% of the signal in the right mastoid electrode from the signal in each channel. The EEG signal was baseline corrected and further band-pass filtered from 2–20 Hz (24 dB/octave roll-off). This was to minimize the overlap between the FRN and other reward-sensitive ERP components, particularly the P300, since the P300 is known to be a closely associated slow-wave ERP response [17], [18]. Epochs of 800 ms (with a 200 ms pre-stimulus baseline) from each electrode were sorted by experimental condition.
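The re-referencing, artifact rejection, and baseline correction steps can be illustrated with a minimal sketch on plain lists of voltages (in µV). This is not the software pipeline used in the study; the function names are hypothetical, and filtering and ocular correction are omitted. The factor of 0.5 follows from the recording reference: each channel is recorded as V − LM and the right mastoid as RM − LM, so (V − LM) − 0.5(RM − LM) = V − (LM + RM)/2.

```python
def rereference_linked_mastoids(channel, right_mastoid):
    """Re-reference a left-mastoid-referenced channel to the average of both mastoids."""
    return [v - 0.5 * m for v, m in zip(channel, right_mastoid)]


def exceeds_threshold(epoch, threshold_uv=70.0):
    """True if any sample in the epoch exceeds +/- threshold_uv, marking the trial for exclusion."""
    return any(abs(v) > threshold_uv for v in epoch)


def baseline_correct(epoch, n_baseline):
    """Subtract the mean of the first n_baseline (pre-stimulus) samples from every sample."""
    baseline = sum(epoch[:n_baseline]) / n_baseline
    return [v - baseline for v in epoch]


# At 500 Hz, the 200 ms pre-stimulus baseline corresponds to 100 samples.
SAMPLING_RATE_HZ = 500
N_BASELINE = 200 * SAMPLING_RATE_HZ // 1000

# Toy check of the re-referencing algebra with made-up absolute potentials.
v_abs, lm_abs, rm_abs = 10.0, 2.0, 6.0
recorded_channel = [v_abs - lm_abs]      # channel as recorded against the left mastoid
recorded_rm = [rm_abs - lm_abs]          # right mastoid as recorded against the left mastoid
linked = rereference_linked_mastoids(recorded_channel, recorded_rm)
```

In the toy check, the re-referenced value equals the absolute potential minus the mean of the two mastoid potentials, as expected for a linked-mastoid reference.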

At both cue and feedback phases, the FRN was measured as the mean amplitude at Fz, where there was a maximal effect of valence (loss minus win), during the interval 275–325 ms after stimulus presentation [19], [20]. To confirm that our findings were not affected by the particular time window we selected for the FRN, we also report the mean FRN amplitude during the interval 250–325 ms post-stimulus for cue conditions. The FRN was also measured as the base-to-peak amplitude and a similar pattern of effects was observed. We did not use the difference wave approach since our aim was to quantitatively evaluate the relationship between the FRN amplitudes and probability or uncertainty rather than simply compare the FRN amplitudes in two experimental conditions [21], [22]. To assess the coding of reward probability and uncertainty by the FRN, linear regressions were performed using the mean FRN amplitude (in each condition) as the dependent variable and reward probability and uncertainty as independent variables.
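The mean-amplitude measure and the two-predictor regression can be sketched as below. This is an illustrative reimplementation under stated assumptions, not the original analysis code: the epoch is assumed to span −200 to 598 ms at 2 ms per sample, the regression is an ordinary least-squares fit solved through the normal equations, and the amplitudes fed to it are synthetic values constructed from known coefficients, not data from the study.

```python
def mean_amplitude(epoch, times_ms, start_ms=275, end_ms=325):
    """Mean voltage over all samples whose time stamp lies within [start_ms, end_ms]."""
    window = [v for v, t in zip(epoch, times_ms) if start_ms <= t <= end_ms]
    return sum(window) / len(window)


def ols_two_predictors(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns [b0, b1, b2]."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    # Normal equations: (X'X) b = X'y, stored as an augmented 3 x 4 matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]


# Assumed epoch time axis: 800 ms at 500 Hz, starting 200 ms before the stimulus.
times_ms = [-200 + 2 * i for i in range(400)]

probabilities = [k / 8 for k in range(9)]
uncertainties = [4 * p * (1 - p) for p in probabilities]
# Synthetic amplitudes with known coefficients (NOT the study's data).
synthetic_frn = [1.0 + 2.0 * p - 3.0 * u for p, u in zip(probabilities, uncertainties)]

intercept, b_probability, b_uncertainty = ols_two_predictors(
    synthetic_frn, probabilities, uncertainties
)
```

Because probability and uncertainty are uncorrelated across the nine levels, each coefficient can be estimated without the collinearity that would arise if only one half of the probability range were sampled.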

### Dipole Analysis

An attempt was made to localize the dipole sources of the ERP components at the cue phase and the feedback phase. The cue ERP waveform was generated by averaging all cue-locked ERP waveforms across all conditions. The win or loss ERP waveform was generated by averaging all feedback-locked ERP waveforms across all win or loss conditions. Source localization was carried out with the Brain Electrical Source Analysis program (BESA, Version 5.0) using a four-shell ellipsoidal head model. As suggested [23], data were high-pass filtered (2 Hz) before dipole fitting in order to remove slow drifts which could bias the resulting solution.

For both cue- and feedback-locked ERP components, time windows of 75 to 125, 150 to 200, and 250 to 350 ms post-response were chosen for the localization analysis of the N1, P2, and FRN components, respectively. We used symmetric dipoles for the localization analysis of the N1 and P2 components since early sensory processes were likely to occur in both hemispheres. The dipoles were fitted with no restriction on their direction and location for each component and then fitted with fixed location for the 0 to 350 ms interval covering all the ERP components.

## Results

### Cue-evoked FRN

For the cue-evoked FRN in the interval of 275–325 ms post-cue presentation (Table 1 and Fig. 2A), ANOVA with 9 levels of probability revealed a significant main effect of probability, F(8,120) = 3.57, *p* = 0.016, a significant linear main effect, F(1,15) = 5.33, *p* = 0.036, and a marginally significant quadratic effect, F(1,15) = 3.28, *p* = 0.09. For the cue-evoked FRN in the interval 250–325 ms after cue presentation, a similar ANOVA revealed a significant main effect of probability, F(8,120) = 5.874, *p* = 0.001, a marginally significant linear main effect, F(1,15) = 3.977, *p* = 0.065, and a significant quadratic effect, F(1,15) = 10.657, *p* = 0.005. These results suggest that the FRN encodes reward probability, such that smaller reward probability was associated with larger FRN amplitude, as well as reward uncertainty, although these effects are not robust.

**Figure 2.** Note that the outcome probability used in this figure refers to the actual outcome frequency. Thus low probability indicates that the outcome is infrequent. For example, 25% probability in the win condition refers to ‘actual win after the prediction of 25% winning probability’, whereas 25% probability in the loss condition refers to ‘actual loss after the prediction of 75% winning probability’. For clarity, only waveforms for probabilities of 25%, 50%, 75%, and 100% are presented. The topographic maps of the mean FRN at 300 ms in the cue, win, and loss conditions are also shown. (D) Coding of reward probability and reward uncertainty in cue-evoked FRN, and (E) outcome-evoked FRN. The regression lines were computed based on the regression equations for each condition.

Regression analysis on mean FRN amplitudes revealed that the regression coefficient (Beta value) associated with reward probability was 0.745±0.24 and the coefficient associated with reward uncertainty was −0.556±0.21. T tests revealed that both coefficients were significantly different from zero (*t* = 3.08, *p* = 0.022 for probability, and *t* = −2.64, *p* = 0.038 for uncertainty), suggesting that the cue-evoked FRN was modulated by both reward probability and uncertainty. The coefficients indicated that the FRN had larger amplitudes for smaller reward probabilities and higher uncertainties (Fig. 2D). The proportion of the variance explained by the model was high, with *R*^{2} = 0.73, *p* = 0.019. Note that the uncertainty effect should be interpreted with caution, as the effect may be predominantly driven by the P = 1 condition. After removing the P = 1 condition, there was no significant correlation between FRN amplitude and reward probability or uncertainty (P values >0.05).

For the interval 250–325 ms post-cue (Table 2), regression analysis revealed that the uncertainty coefficient (−0.789±0.23) was significantly different from zero (t = −3.33, p = 0.014), whereas the probability coefficient (0.565±0.26) was only marginally significant (t = 2.17, p = 0.073). The explanatory power was the same as that of the model on FRN data in the interval of 275–325 ms post-cue.

### Outcome-evoked FRN

ANOVA with two types of outcomes (win/loss) and 8 levels of probability revealed a significant main effect of valence, F(1,15) = 16.39, P = 0.001, a significant main effect of probability, F(7,105) = 12.91, P<0.001, and a significant interaction between valence and probability, F(7,105) = 5.37, P = 0.002, suggesting that the effects of outcome probability on FRN amplitude differ between the win and loss domains.

For win outcomes, tests of within-subjects contrasts revealed a significant linear main effect, F(1,15) = 32.90, P<0.001, and a significant quadratic effect, F(1,15) = 7.56, P = 0.015, suggesting that the win-evoked FRN encodes both reward probability and uncertainty, when examined separately. Consistent with the ANOVA analysis, regression analysis revealed that the win-evoked FRN (Fig. 2B) was significantly modulated by positive prediction error, *t(7)* = −8.20, *p*<0.001, and uncertainty prediction error, *t(7)* = 7.89, *p* = 0.001, with a coefficient of −2.596±0.32 and 2.234±0.28 for positive prediction error and uncertainty prediction error, respectively (Fig. 2E, in blue. Note, the outcome probability in this figure refers to the actual outcome frequency, as explained in the figure caption). The regression coefficient associated with positive prediction error indicated that the FRN had larger amplitudes for infrequent win feedback, whereas the regression coefficient associated with uncertainty prediction error indicated that FRN amplitudes were larger for win outcomes with lower reward uncertainty. The proportion of the variance explained by the model was very high, with *R*^{2} = 0.947, *p* = 0.001.

In the loss condition, tests of within-subjects contrasts revealed a significant linear main effect, F(1,15) = 9.71, P = 0.007, and a non-significant quadratic effect, F(1,15) = 2.94, P = 0.107, suggesting that the loss-associated FRN encodes reward probability but not uncertainty. Consistent with the ANOVA analysis, regression analysis revealed that the loss-evoked FRN (Fig. 2C) was significantly modulated by negative prediction error, *t(7)* = 7.70, *p* = 0.001, with a coefficient of −4.795±0.62, but not by uncertainty prediction error (the coefficient was 1.011±0.56, *t(7)* = 1.81, p = 0.130). The proportion of the variance explained by the model was high, with *R*^{2} = 0.93, *p* = 0.001 (Fig. 2E, in red). Note that the regression coefficients associated with reward prediction error were negative for both the win-evoked FRN and the loss-evoked FRN, suggesting that infrequent outcomes evoked a stronger negative-going FRN in both win and loss domains.

### Source analysis of the FRN

In the cue condition, the resulting five-source model accounts for the data with a residual variance of 4.86% (Fig. 3A) and the source of the cue-evoked FRN was located in the site of ACC (x = 10, y = 5, z = 37). In the win outcome condition, the resulting five-source model accounts for the data in the period 0 to 350 ms post onset of win feedback with a residual variance of 4.85% and the source of the win-evoked FRN was also located in the site of ACC (x = 5, y = −2, z = 37). The same model for the win condition also accounts for the ERP data in the loss condition with a residual variance of 4.74%, suggesting that win and loss ERPs have the same sources (Fig. 3B). Thus the dipole source analysis further indicated an involvement of the ACC in the rapid processing of reward probability and uncertainty signals.

**Figure 3.** Dipoles were superimposed on MRI-based head models for grand-average ERP waveforms in (A) cue phase and (B) outcome (win/loss) phase.

## Discussion

In this study, the FRN, as an indicator of the ACC activity, was measured in a simple gambling task in which reward probability and uncertainty could be dissociated. We provided, for the first time to our knowledge, a quantitative analysis of the encoding of reward probability and uncertainty in the human ACC. Our results suggest that the cue-evoked FRN may encode reward probability and uncertainty. While both win and loss-related FRN amplitudes decreased as a function of outcome probability, only the win-related FRN but not the loss-related FRN was modulated by reward uncertainty. These results provide new insights into the functions of the ACC in reward decision making.

Previous ERP studies have examined the encoding of reward probability in the ACC. They used only a limited number of probability values (i.e., 25%, 50%, and 75%) and yielded inconsistent findings [22], [24]–[26]. Two studies found that negative prediction errors evoked larger FRN amplitudes than positive prediction errors [22], [24]. While one study found that reward probability only modulated the win-evoked FRN, but not the loss-evoked FRN [25], another study found that reward probability modulated neither the win-evoked FRN nor the loss-evoked FRN [26]. The present study has two unique features that may allow us to overcome the limitations of previous studies and provide a more comprehensive analysis of the encoding of reward parameters in the ACC. First, the reward probability information was explicitly provided with a cue card and the feedback could not be used to optimize decisions. Thus our design minimized the possible influence of asymmetric sensitivity to positive and negative outcomes in learning [27], [28]. Second, reward probability was varied over a wide range of probabilities with a sufficient number of intermediate values (every 12.5% from 0 to 100%) to ensure that reward probability and uncertainty were dissociated.

The first main finding of this study was that the amplitudes of the win- and loss-evoked FRNs all increased as outcome probability decreased, indicating that positive and negative prediction errors were similarly encoded in the ACC. This finding challenges the hypothesis that the ACC activity mirrors the activity of the midbrain dopamine neurons in the encoding of reward prediction error in win and loss conditions [29], [30]. The reinforcement learning theory of the FRN proposes that the FRN reflects the impact of the midbrain dopamine signals on the ACC [29], [30]; the phasic changes in the midbrain dopamine activity are associated with fluctuations in the FRN amplitude, and negative and positive prediction errors increase and decrease the FRN amplitude, respectively [29], [30]. The phasic decreases in dopamine inputs elicited by negative prediction errors give rise to increased ACC activity that is reflected as larger FRN amplitudes. The phasic increases in dopamine signals elicited by positive prediction errors give rise to decreased ACC activity that is reflected as smaller FRN amplitudes. While a linear correlation between the negative prediction error and the FRN amplitude in this study is consistent with earlier ERP studies [22], [24], [31]–[33], the linear association of larger FRN amplitude with larger (rather than smaller) positive prediction error is a novel finding, which suggests that positive prediction errors evoke a linear increase rather than a decrease in the ACC activity. It has been found that negative feedback elicited a large FRN only when participants estimated they had responded correctly but not when they estimated they had responded erroneously [34]. Further, false-positive feedback presented after participants made large errors after erroneous trials elicited a significantly larger FRN than negative feedback [34]. Our study extends these previous findings by further showing a linear relationship between probability and FRN amplitude. 
Violation of reward magnitude expectation was also found to elicit larger FRN [35]. These results are consistent with recent single unit recording studies in monkeys and humans that found two groups of ACC neurons sensitive either to unexpected wins or losses [36]–[38]. Taken together, these findings support the notion that the ACC generally monitors violations in expectancy rather than negative feedback per se [34].

The second main finding was that reward uncertainty was encoded in the cue-evoked FRN and the win-evoked FRN. Uncertainty is crucial to decision making and attention based learning [1]. Different from monkeys' midbrain dopamine neurons that encode reward uncertainty by sustained and delayed signals [4], the present study showed that reward uncertainty signals were rapidly processed in the human ACC. This rapid encoding of uncertainty may reflect the need for a rapid motivational evaluation of the informativeness of stimuli. We found that larger cue-evoked FRN amplitudes were elicited by cues indicating high uncertainty in making reward prediction. This finding is consistent with the notion that the FRN reflects motivational evaluation of outcome since high uncertainty cues are less informative and thus less rewarding to participants. This finding is also consistent with an earlier fMRI study that also found stronger ACC activity to the uncertainty of reward cues during reward anticipation [15]. The uncertainty is resolved when outcomes are presented. The resolution of high uncertainty should be more informative and more rewarding than the resolution of low uncertainty. Indeed, for the win outcome, compared with wins following more certain cues, wins following uncertain cues are evaluated more positively, indicated by the decreased FRN amplitudes (i.e., more positive deflection). Our informativeness account is supported by the evidence that human ACC activity in the outcome monitoring phase is modulated by the volatility or uncertainty of the reward environment [13]. Taken together, these findings highlight the contribution of ACC in encoding uncertainty.

Some limitations of the present study are worth mentioning. First, in the outcome phase, the numbers of trials change with experimental conditions, raising the possibility that trial numbers may contribute to the FRN patterns. However, a recent study found that the FRN component rapidly stabilizes at 20 trials (or even 10 trials in one experiment) in healthy populations [39], indicating that increasing the number of trials after that would not significantly change the FRN amplitude. Second, although the objective reward probability associated with each cue card is explicit, different participants might perceive them differently. Moreover, participants may hold the irrational belief that their actions could influence outcomes [40], [41] and they may be overoptimistic about the chances of winning [34]. How the subjective probabilities might differ from objective probabilities is an interesting question for future studies. Third, our interpretation of associations between uncertainty and FRN amplitudes is speculative. The exact mechanisms reflected in the FRN amplitude/ACC activity are largely unknown. It is also currently unknown why the uncertainty effect was significant for the FRN in the win condition but not in the loss condition. Also, the informativeness account cannot explain other FRN findings, such as why the FRN is more negative for losses than for wins. Further computational model-based studies are needed to resolve this issue. Fourth, although the FRN has primarily been localized to the ACC [8], [42], [43], there is no direct evidence to link the ACC with the FRN. In fact, some studies have localized the FRN to the striatum [44], [45]. The mesocorticolimbic dopamine system, which includes the midbrain, striatum, orbital frontal cortex, and medial prefrontal cortex (e.g., ACC), has long been implicated in reward processing [46]. It is possible that the FRN also reflects reward processing in reward regions beyond the ACC [45]. 
Fifth, recent studies suggest that modulation of the FRN amplitudes results from the superposition on correct trials of a positive-going deflection, known as reward positivity [21], [47]–[49]. The reduction in the FRN amplitude could have resulted from superposition of the reward positivity that cancels out the FRN. Given that the present study was not designed to test these possibilities, further studies are necessary to examine the FRN using advanced methods such as principal components analysis (PCA) [49]. Our FRN findings at the cue phase could be driven by some peculiar experimental conditions, and they are in need of replication before conclusive arguments are made.

In summary, we demonstrate that reward probability and reward uncertainty can be processed rapidly and discretely in the human ACC at about 300 ms after stimulus presentation. An integrated processing of uncertainty and probability enables optimal inference and learning in a noisy and changeable environment. Current models of the FRN should thus be modified to take into account the uncertainty signal in the ACC.

## Author Contributions

Conceived and designed the experiments: RY XZ. Performed the experiments: RY. Analyzed the data: RY. Contributed reagents/materials/analysis tools: RY. Wrote the paper: RY XZ WZ.

## References

- 1. Levy H, Markowitz HM (1979) Approximating expected utility by a function of mean and variance. American Economic Review 69: 308–317.
- 2. Real LA (1991) Animal choice behavior and the evolution of cognitive architecture. Science 253: 980–986.
- 3. Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275: 1593–1599.
- 4. Fiorillo CD, Tobler PN, Schultz W (2003) Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299: 1898–1902.
- 5. Tobler PN, O'Doherty JP, Dolan RJ, Schultz W (2007) Reward value coding distinct from risk attitude-related uncertainty coding in human reward systems. J Neurophysiol 97: 1621–1632.
- 6. Aron AR, Shohamy D, Clark J, Myers C, Gluck MA, et al. (2004) Human midbrain sensitivity to cognitive feedback and uncertainty during classification learning. J Neurophysiol 92: 1144–1152.
- 7. Dreher JC, Kohn P, Berman KF (2006) Neural coding of distinct statistical properties of reward information in humans. Cereb Cortex 16: 561–573.
- 8. Gehring WJ, Willoughby AR (2002) The medial frontal cortex and the rapid processing of monetary gains and losses. Science 295: 2279–2282.
- 9. van Veen V, Holroyd CB, Cohen JD, Stenger VA, Carter CS (2004) Errors without conflict: implications for performance monitoring theories of anterior cingulate cortex. Brain Cogn 56: 267–276.
- 10. Miltner WHR, Braun CH, Coles MGH (1997) Event-related brain potentials following incorrect feedback in a time-estimation task: Evidence for a “generic” neural system for error detection. J Cogn Neurosci. pp. 788–798.
- 11. Rolls ET, McCabe C, Redoute J (2008) Expected value, reward outcome, and temporal difference error representations in a probabilistic decision task. Cerebral Cortex 18: 652–663.
- 12. Jessup RK, Busemeyer JR, Brown JW (2010) Error effects in anterior cingulate cortex reverse when error likelihood is high. J Neurosci 30: 3467–3472.
- 13. Behrens TE, Woolrich MW, Walton ME, Rushworth MF (2007) Learning the value of information in an uncertain world. Nat Neurosci 10: 1214–1221.
- 14. Preuschoff K, Bossaerts P, Quartz SR (2006) Neural differentiation of expected reward and risk in human subcortical structures. Neuron 51: 381–390.
- 15. Critchley HD, Mathias CJ, Dolan RJ (2001) Neural activity in the human brain relating to uncertainty and arousal during anticipation. Neuron 29: 537–545.
- 16. Preuschoff K, Quartz SR, Bossaerts P (2008) Human insula activation reflects risk prediction errors as well as risk. J Neurosci 28: 2745–2752.
- 17. Donkers FC, Nieuwenhuis S, van Boxtel GJ (2005) Mediofrontal negativities in the absence of responding. Brain Res Cogn Brain Res 25: 777–787.
- 18. Luu P, Tucker DM, Derryberry D, Reed M, Poulsen C (2003) Electrophysiological responses to errors and feedback in the process of action regulation. Psychol Sci 14: 47–53.
- 19. Zhou Z, Yu R, Zhou X (2010) To do or not to do? Action enlarges the FRN and P300 effects in outcome evaluation. Neuropsychologia 48: 3606–3613.
- 20. Yu RJ, Zhou XL (2009) To Bet or Not to Bet? The Error Negativity or Error-related Negativity Associated with Risk-taking Choices. Journal of Cognitive Neuroscience 21: 684–696.
- 21. Hajcak G, Moser JS, Holroyd CB, Simons RF (2006) The feedback-related negativity reflects the binary evaluation of good versus bad outcomes. Biological Psychology 71: 148–154.
- 22. Holroyd CB, Nieuwenhuis S, Yeung N, Cohen JD (2003) Errors in reward prediction are reflected in the event-related brain potential. Neuroreport 14: 2481–2484.
- 23. Scherg M, Berg P (1990) BESA-Brain electric source analysis handbook. Munich: Max-Planck Institute for Psychiatry.
- 24. Yasuda A, Sato A, Miyawaki K, Kumano H, Kuboki T (2004) Error-related negativity reflects detection of negative reward prediction error. Neuroreport 15: 2561–2565.
- 25. Cohen MX, Elger CE, Ranganath C (2007) Reward expectation modulates feedback-related negativity and EEG spectra. Neuroimage 35: 968–978.
- 26. Hajcak G, Holroyd CB, Moser JS, Simons RF (2005) Brain potentials associated with expected and unexpected good and bad outcomes. Psychophysiology 42: 161–170.
- 27. Frank MJ, Seeberger LC, O'Reilly RC (2004) By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science 306: 1940–1943.
- 28. Frank MJ, Woroch BS, Curran T (2005) Error-related negativity predicts reinforcement learning and conflict biases. Neuron 47: 495–501.
- 29. Holroyd CB, Coles MGH (2002) The neural basis of human error processing: Reinforcement learning, dopamine, and the error-related negativity. Psychological Review 109: 679–709.
- 30. Nieuwenhuis S, Holroyd CB, Mol N, Coles MG (2004) Reinforcement-related brain potentials from medial frontal cortex: origins and functional significance. Neurosci Biobehav Rev 28: 441–448.
- 31. Bellebaum C, Polezzi D, Daum I (2010) It is less than you expected: the feedback-related negativity reflects violations of reward magnitude expectations. Neuropsychologia 48: 3343–3350.
- 32. Bellebaum C, Daum I (2008) Learning-related changes in reward expectancy are reflected in the feedback-related negativity. Eur J Neurosci 27: 1823–1835.
- 33. Holroyd CB, Hajcak G, Larsen JT (2006) The good, the bad and the neutral: electrophysiological responses to feedback stimuli. Brain Res 1105: 93–101.
- 34. Oliveira FT, McDonald JJ, Goodman D (2007) Performance monitoring in the anterior cingulate is not all error related: expectancy deviation and the representation of action-outcome associations. J Cogn Neurosci 19: 1994–2004.
- 35. Wu Y, Zhou X (2009) The P300 and reward valence, magnitude, and expectancy in outcome evaluation. Brain Res 1286: 114–122.
- 36. Matsumoto M, Matsumoto K, Abe H, Tanaka K (2007) Medial prefrontal cell activity signaling prediction errors of action values. Nat Neurosci 10: 647–656.
- 37. Williams ZM, Bush G, Rauch SL, Cosgrove GR, Eskandar EN (2004) Human anterior cingulate neurons and the integration of monetary reward with motor responses. Nat Neurosci 7: 1370–1375.
- 38. Sallet J, Quilodran R, Rothe M, Vezoli J, Joseph JP, et al. (2007) Expectations, gains, and losses in the anterior cingulate cortex. Cogn Affect Behav Neurosci 7: 327–336.
- 39. Marco-Pallares J, Cucurell D, Munte TF, Strien N, Rodriguez-Fornells A (2010) On the number of trials needed for a stable feedback-related negativity. Psychophysiology.
- 40. Moser JS, Simons RF (2009) The neural consequences of flip-flopping: the feedback-related negativity and salience of reward prediction. Psychophysiology 46: 313–320.
- 41. Yeung N, Holroyd CB, Cohen JD (2005) ERP correlates of feedback and reward processing in the presence and absence of response choice. Cereb Cortex 15: 535–544.
- 42. Miltner WHR, Braun CH, Coles MGH (1997) Event-related brain potentials following incorrect feedback in a time-estimation task: Evidence for a “generic” neural system for error detection. Journal of Cognitive Neuroscience 9: 788–798.
- 43. Potts GF, Martin LE, Burton P, Montague PR (2006) When things are better or worse than expected: the medial frontal cortex and the allocation of processing resources. J Cogn Neurosci 18: 1112–1119.
- 44. Martin LE, Potts GF, Burton PC, Montague PR (2009) Electrophysiological and hemodynamic responses to reward prediction violation. Neuroreport 20: 1140–1143.
- 45. Carlson JM, Foti D, Mujica-Parodi LR, Harmon-Jones E, Hajcak G (2011) Ventral striatal and medial prefrontal BOLD activation is correlated with reward-related electrocortical activity: a combined ERP and fMRI study. Neuroimage 57: 1608–1616.
- 46. Han MH, Cao JL, Covington HE, Friedman AK, Wilkinson MB, et al. (2010) Mesolimbic Dopamine Neurons in the Brain Reward Circuit Mediate Susceptibility to Social Defeat and Antidepressant Action. Journal of Neuroscience 30: 16453–16458.
- 47. Holroyd CB, Pakzad-Vaezi KL, Krigolson OE (2008) The feedback correct-related positivity: Sensitivity of the event-related brain potential to unexpected positive feedback. Psychophysiology 45: 688–697.
- 48. Donkers FCL, van Boxtel GJM, Nieuwenhuis S (2005) Mediofrontal negativities in the absence of responding. Cognitive Brain Research 25: 777–787.
- 49. Holroyd CB, Krigolson OE, Seung L (2011) Reward positivity elicited by predictive cues. Neuroreport 22: 249–252.