Information uncertainty influences learning strategy from sequentially delayed rewards
Fig 3
Behavioral signatures of learning and model predictions.
(A) Schematic of the logical structure of a delayed choice that reappears after three trials, categorized as either ‘stay’ (S) or ‘switch’ (W). The schematic shows feedback valence for the immediate and the two-trials-forward choices. Correctly using the two-trials-forward information (curved arrow) implies staying with the initial choice. (B) The probability of staying with a delayed choice three trials later was modeled using multilevel logistic regression as a function of condition (conjoint or disjoint), time (immediate, 0F, or two trials forward, 2F), and reward (positive, +, or negative, -), as well as their interactions (equation 1). The regression was run on data from participants (red) and, for comparison, on data generated by the tabular model (green) and by the eligibility model (blue). Estimated marginal means from the logistic regression are shown with 95% confidence intervals for each condition. Text labels indicate where the participants’ confidence interval overlaps with that of the eligibility model [E,-], the tabular model [-,T], or both [E,T]. (C-E) Collapsed two-way interactions for each pairing of the three variables: time*reward collapsed across condition (C), condition*time collapsed across reward (D), and reward*condition collapsed across time (E). Bars represent marginal means; error bars represent 95% confidence intervals.
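The overlap labels in panel B can be read as a simple interval comparison. The sketch below is illustrative only and is not taken from the authors' analysis code: the function names, the closed-interval overlap rule, and the `[-,-]` label for the case of no overlap are all assumptions, and the numeric intervals in the usage lines are hypothetical values rather than results from the study.

```python
from typing import Tuple

# A confidence interval as (lower bound, upper bound) on the probability scale.
Interval = Tuple[float, float]


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """Return True if two closed intervals share at least one point (assumed rule)."""
    return a[0] <= b[1] and b[0] <= a[1]


def overlap_label(participant: Interval, eligibility: Interval, tabular: Interval) -> str:
    """Build a panel-B style label from three 95% confidence intervals.

    'E' marks overlap with the eligibility model, 'T' with the tabular model.
    The '[-,-]' result for no overlap is a hypothetical convention.
    """
    e = "E" if intervals_overlap(participant, eligibility) else "-"
    t = "T" if intervals_overlap(participant, tabular) else "-"
    return f"[{e},{t}]"


# Hypothetical intervals, for illustration only.
print(overlap_label((0.55, 0.65), (0.60, 0.70), (0.30, 0.40)))
print(overlap_label((0.55, 0.65), (0.10, 0.20), (0.50, 0.58)))
```

Overlapping confidence intervals are only a visual heuristic for agreement between participants and a model; they are not a formal test of equivalence.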