A nonlinear relationship between prediction errors and learning rates in human reinforcement-learning

doi:10.1371/journal.pcbi.1013445

Fig 6

Learning rate results from the reanalysis of the bucket task.

(A) Overview of the bucket task. (B) The analysis of the data with the novel reinforcement-learning models demonstrated that, among models that can estimate trialwise learning rates, the exponential-logarithmic model exhibits greater flexibility to account for participant behaviour (i.e., it is the model minimising the deviation from participants' bucket placement) relative to the cubic model, which assumes a strictly parabolic relationship. The values are normalised from 360° to [0,1]. ***p < .001. (C) The model estimates steep learning rate increases over the prediction error (PE) space, and the results mostly align with those reported by Vaghi et al. (2017). Average model-free learning rates reported by Vaghi and colleagues are indicated by circular markers; the curve with grey shading denotes the trajectory estimated by the exponential-logarithmic model. Here, the population average of learning rates estimated by the RW model is 0.968 ± 0.034 (mean ± SD; mean shown by the blue horizontal dashed line). Consequently, the exponential-logarithmic model offers greater flexibility at the lower PE values, covering the range where most of the actual PEs in this task fell, whereas learning rates from these competing models somewhat converge at the higher PE values (i.e., top right of panel C).
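As an illustration of the quantities plotted in panel C, the following is a minimal sketch of how model-free trialwise learning rates are commonly computed in tasks of this kind: the bucket update divided by the preceding prediction error, with angles handled on a circle. It is not the authors' analysis code. The function names `circular_diff` and `model_free_learning_rates`, the input arrays, and the choice to normalise by dividing absolute PEs by 360 are all assumptions made for this example.

```python
import numpy as np

def circular_diff(a, b):
    """Signed shortest angular difference a - b in degrees, in [-180, 180)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (a - b + 180.0) % 360.0 - 180.0

def model_free_learning_rates(bucket, outcome):
    """Hypothetical helper: trialwise learning rate = update / prediction error.

    bucket  : bucket positions chosen on each trial (degrees)
    outcome : observed outcome positions on each trial (degrees)
    Returns normalised absolute PEs and learning rates for trials 0..n-2.
    """
    bucket = np.asarray(bucket, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    pe = circular_diff(outcome[:-1], bucket[:-1])     # prediction error on trial t
    update = circular_diff(bucket[1:], bucket[:-1])   # bucket movement from t to t+1
    with np.errstate(divide="ignore", invalid="ignore"):
        lr = np.where(pe != 0, update / pe, np.nan)   # undefined when PE is zero
    pe_norm = np.abs(pe) / 360.0                      # assumed normalisation of degrees
    return pe_norm, lr

# Toy data (invented for illustration, not from the study)
pe_norm, lr = model_free_learning_rates([10, 40, 350], [50, 350, 0])
print(pe_norm, lr)
```

A learning rate of 1 means the bucket was moved all the way to the last outcome, which is consistent with the near-unity RW average reported in the caption; values below 1 indicate partial updating.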

doi: https://doi.org/10.1371/journal.pcbi.1013445.g006