The value of confidence: Confidence prediction errors drive value-based learning in the absence of external feedback

doi:10.1371/journal.pcbi.1010580

Fig 1.

Experimental design.

(A) Block structure. In each block, participants learned the values of five CS from the feedback provided in phases 1 and 3. The critical phase was a period of 5–15 trials between phases 1 and 3 in which participants received no feedback (phase 2). Before and after phase 2, participants rated the value of each CS on a continuous scale. (B) Trial structure. In each trial, participants chose between two CS and indicated their confidence on a scale from 0 to 10. In phases 1 and 3, the reward outcome for the chosen option was presented as a scratch card with 50 fields, each of which could contain a 1 EUR coin or a blank. In phase 2, the scratch card was not revealed; however, participants were instructed that they would receive the hidden reward on the scratch card at the end of the experiment.

Fig 2.

Performance and confidence.

Block-averaged time courses are separated according to the duration of phase 1 (9–18 trials) and aligned to the beginning of phase 2. Shaded areas indicate the standard error of the mean. (A) Value-based learning. Choice accuracy gradually increased across the phases with feedback (phases 1 and 3), indicating that participants successfully learned the task. (B) Confidence. Reported confidence (normalized to [0, 1]) likewise increased over the course of a block. Black lines indicate averages across CS value levels. (C) Confidence increase in phase 2 as a function of CS value level. The parameter estimate β and the p-value are based on a linear model with value level as the independent variable and the average confidence slope in phase 2 as the dependent variable.
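
The analysis in panel (C) can be illustrated with a minimal sketch: fit a slope to each phase-2 confidence time course, then regress those slopes on value level. The function name `ols_slope`, the noise-free time courses, and the value-level coding 1 to 4 are assumptions made for illustration; they are not taken from the paper, and the sketch does not compute the p-value.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical data: one confidence time course per value level,
# constructed so that the slope grows with the value level.
value_levels = [1, 2, 3, 4]
trials = list(range(10))
courses = [[0.5 + 0.01 * level * t for t in trials] for level in value_levels]

# Step 1: confidence slope in phase 2 for each value level.
slopes = [ols_slope(trials, course) for course in courses]

# Step 2: value level as independent variable, slope as dependent variable.
beta = ols_slope(value_levels, slopes)
```

With real data, each slope would presumably be averaged across blocks and participants before the second regression.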

Fig 3.

Changes in choice consistency and subjective value ratings in phase 2.

(A) Choice consistency between the first and second choices (blue) and between the second and third choices (orange) for identical CS pairs in phase 2. (B) Subjective value ratings. Changes in subjective value ratings (post-phase-2 minus pre-phase-2) are shown separately for each of the four CS value levels within a block.

Table 1.

Models.

Table 2.

Free model parameters.

Fig 4.

Model comparison.

Models were compared by means of the Akaike information criterion (AIC). Each value represents the average AIC of a model across participants (± SEM). The number in parentheses indicates the number of model parameters.
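
As a rough illustration of the comparison described above, the sketch below computes the AIC per participant from a maximized log-likelihood and then averages it across participants with its SEM. The log-likelihood values, the parameter counts, and the labels `model_a` and `model_b` are invented for the example and do not correspond to any model in the paper.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood

def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)

# Hypothetical per-participant log-likelihoods and parameter counts.
models = {
    "model_a": {"n_params": 3, "log_likelihoods": [-50.0, -60.0, -55.0]},
    "model_b": {"n_params": 4, "log_likelihoods": [-45.0, -52.0, -50.0]},
}

summary = {}
for name, spec in models.items():
    per_participant = [aic(ll, spec["n_params"]) for ll in spec["log_likelihoods"]]
    summary[name] = mean_and_sem(per_participant)

# The model with the lowest average AIC is preferred.
best_model = min(summary, key=lambda name: summary[name][0])
```

Because the AIC adds a penalty of 2 per parameter, a model with more parameters is preferred only if its log-likelihood improves by more than the number of added parameters.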

Fig 5.

Latent variables and posterior predictive fits of model ConfUnspec.

All time courses represent averages across blocks and subjects, split according to the duration of phase 1 (line styles) and the four CS value levels within a block (colors). (A) Expected values indicate current beliefs about the value of each stimulus. (B) Posterior predictive fit for model performance: expected proportion of correct responses based on choice probabilities. (C) Posterior predictive fit for model confidence. Model confidence is computed from the choice probability for the chosen CS (normalized to the range 0–1). Black lines indicate averages across value levels. (D) Confidence slopes of (C) in phase 2 as a function of CS value level. (E) Expected confidence corresponds to an integration of past confidence experiences using a Rescorla-Wagner-type learning rule. (F) Confidence prediction errors indicate the deviation of a momentary confidence experience from expected confidence. (G) Absolute confidence prediction error.
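
Panels (E) to (G) can be read as a standard delta-rule update, sketched below under stated assumptions. The function name `track_expected_confidence`, the initial expectation of 0.5, and the example confidence values are hypothetical; `alpha_c` stands for the confidence learning rate α_c. The sketch covers only the tracking of expected confidence and does not show how confidence prediction errors feed into stimulus values in the paper's models.

```python
def track_expected_confidence(confidences, alpha_c, initial_expectation=0.5):
    """Rescorla-Wagner-type tracking of expected confidence.

    Returns the expectation held before each trial and the confidence
    prediction error (CPE) generated on that trial.
    """
    expected = initial_expectation  # assumed starting point, not from the paper
    expectations, cpes = [], []
    for confidence in confidences:
        cpe = confidence - expected   # deviation of momentary from expected confidence
        expectations.append(expected)
        cpes.append(cpe)
        expected += alpha_c * cpe     # delta-rule update toward the new experience
    return expectations, cpes

# Hypothetical normalized confidence reports on three consecutive trials.
expectations, cpes = track_expected_confidence([0.5, 0.7, 0.9], alpha_c=0.5)

# Absolute confidence prediction errors, as in panel (G).
abs_cpes = [abs(cpe) for cpe in cpes]
```

With `alpha_c` close to 0 the expectation barely moves, so prediction errors persist; with `alpha_c` equal to 1 the expectation always equals the most recent confidence report.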

Fig 6.

Model parameters for the winning model ConfUnspec.

Blue solid lines indicate parameter means; green dashed lines indicate parameter medians. (A) Histogram of the reward learning rate α_r. (B) Histogram of the inverse decision noise parameter β. (C) Histogram of the confidence transfer parameter γ. (D) Histogram of the confidence learning rate α_c. (E) Scatter plot of the confidence transfer parameter γ against the reward learning rate α_r. The black line indicates a linear fit to the data; the correlation coefficient is based on a Pearson correlation.
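
For panel (E), a Pearson correlation and a least-squares line could be computed as in the following sketch. The helper names and the parameter values in `alpha_r` and `gamma` are made up for illustration and are not the fitted parameters from the paper.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def linear_fit(x, y):
    """Least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical per-participant parameter estimates.
alpha_r = [0.10, 0.20, 0.30, 0.40]
gamma = [0.05, 0.15, 0.10, 0.30]

r = pearson_r(alpha_r, gamma)
slope, intercept = linear_fit(alpha_r, gamma)
```

The sign of `r` always matches the sign of the fitted slope, since both are determined by the same cross-product term.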
