Fig 1.
The subject-level distributions of reported probabilities and the ‘true’ underlying probabilities in the dataset of Khaw, Stevens, and Woodford [27].
In contrast to the distributions of probabilities reported by individual subjects, the underlying probabilities were drawn uniformly from the unit interval. Deviations from the uniform distribution in the underlying probabilities reflect finite sampling.
Fig 2.
Pooled and individual-level probability estimates.
(A) The median estimate across the pooled sample tracks the true probability in each bin. The error bars denote interquartile ranges. (B) The median response of each participant, across trials and sessions, generates varying S-shaped response functions. Participants are color-coded by their degree of conservatism, with darker colors indicating more conservatism (as defined in the Computational Models section).
Fig 3.
The Bayesian forecast and true probabilities.
(A) The median Bayesian estimate closely tracks the true underlying probability in each bin. (B) The median Bayesian responses corresponding to the sessions completed by each individual show very little dispersion around the diagonal. (C) and (D) Much of the deviation of the subjective probability forecasts from the true probabilities reflects deviations from the Bayesian forecast.
Fig 4.
The two types of estimation biases accommodated by the free parameters of the tested models.
(A) The additive bias implied by non-zero values of α. (B) The non-linear distortion toward or away from 0.5, implied by β different from 1.
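As a rough illustration of the two biases shown in Fig 4, the sketch below assumes a linear-in-log-odds form, logit(p̂) = α + β·logit(p). This functional form and the function names (`logit`, `inv_logit`, `distort`) are assumptions made for illustration only; the exact specification is the one given in Eq (1) of the paper.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def distort(p, alpha=0.0, beta=1.0):
    """Hypothetical distortion (assumed form, not the paper's Eq (1)).

    alpha shifts the log-odds additively; beta rescales them, which in this
    sketch pulls estimates toward 0.5 when beta < 1 and pushes them away
    from 0.5 when beta > 1.
    """
    return inv_logit(alpha + beta * logit(p))


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(p, round(distort(p, alpha=0.5), 3), round(distort(p, beta=0.5), 3))
```

Under this assumed form, setting α = 0 and β = 1 returns the input probability unchanged, so each parameter can be varied on its own to reproduce the qualitative shape of each panel.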
Table 1.
Model estimates.
Table 2.
In-sample and out-of-sample measures of goodness-of-fit compared for the fully heterogeneous model and close alternatives.
Fig 5.
Bias parameter distributions and subjects’ response densities.
(A) Dot plots showing the distributions of best-fitting parameters (α, β, σ) across subjects. (B) The density of responses for subjects grouped by tercile of β values. Bold colored lines indicate the output produced by the model in Eq (1) using the median model parameters from each tercile.
Table 3.
Individual parameter estimates.
Fig 6.
Examination of within- and between-subject variability in session-wise parameter estimates.
(A) Average between-subject variance is greater than within-subject variance by a factor of around two for parameters β and σ, with the latter difference being statistically significant. (B) Non-parametric bootstrap tests reproduce the observed departures of the variance ratios from the null benchmark.
Table 4.
ANOVA results on median-split parameter estimates across sessions.
Fig 7.
Bias parameter magnitudes across sessions.
(A) For each sub-group, none of the three average parameter values differs significantly across the 10 experimental sessions. (B) Subjects’ parameters β and σ estimated from early and late sessions (comprising the first and second halves of the study) are positively correlated.
Fig 8.
Timing-related behaviors and individual biases.
(A) Higher levels of the repulsion parameter are associated with shorter average response times (in requesting the next ring sample). A similar negative correlation holds for the average adjustment lag (the number of rings before an adjustment to the slider is made) exhibited by each subject. (B) An internal replication of both relations. Negative correlations are also observed using session-wise parameter estimates and averages.
Table 5.
Comparison of parameter values in alternative forecasting models.
Fig 9.
Comparisons between alternative models of probability encoding and reporting.
(A) The difference in BIC values across the three different types of models, relative to the distorted Bayesian benchmark. The smallest differences are for subject 6, for whom the QB BIC is 1 point higher, the delta rule BIC is 11 points higher, and the PTN BIC is 18 points lower than the benchmark. Estimation results are based on the data conditional on adjustment for all models. The BIC levels vary across subjects depending on how frequently each subject adjusted their forecast. (B) The difference in the standard deviation of the noise associated with each model, relative to the distorted Bayesian benchmark.
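The BIC differences in panel (A) can be illustrated with the standard definition BIC = k·ln(n) − 2·ln(L). The sketch below is a minimal illustration, not the authors' estimation code; the function names and all numbers in the usage example are hypothetical. It assumes that n counts the observations entering the likelihood, which would explain why BIC levels vary with how often each subject adjusted their forecast.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Standard Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def bic_difference(model_bic, benchmark_bic):
    """Positive values mean the model fits worse than the benchmark."""
    return model_bic - benchmark_bic


if __name__ == "__main__":
    # Hypothetical values for one subject (not taken from the paper).
    benchmark = bic(log_likelihood=-520.0, n_params=3, n_obs=400)
    alternative = bic(log_likelihood=-528.0, n_params=2, n_obs=400)
    print(bic_difference(alternative, benchmark))
```

Because the penalty term scales with ln(n), comparing differences relative to a common benchmark, rather than raw BIC levels, keeps subjects with different numbers of adjustments on a comparable footing.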