Fig 1.
A. Participants start with empty slots in which to collect tokens for each of the three available suits (cat, hat, car). B. Each suit is mapped to two distinct cards, which stand in for the suit in the game. C. A sample round of the task: 1. Participants see the suit collection progress for all three goals. 2. Participants pick a card corresponding to one of the three goals. 3. Participants see the outcome of their action. 4. If a token is received, the slots are updated. 5. Periodically, participants report the suit they are pursuing. D. Conditional probabilities. Probabilities for different token-card combinations are shown for the 80–20 block. Selecting the basket or the mat card yields the cat token with a high probability of 0.8, making cat the dominant suit in the block. E. Block shifts. Each block has fixed probabilities, with one dominant suit and two inferior suits of identical probability. The dominant suit switches across adjacent blocks (cat is the most abundant in block N, while hat is the most abundant in the next one). F. Token probabilities. Two experimental variants (experiments 1 and 2), with the probability configuration specified for each suit. Figure credits. We used images from https://openclipart.org/ to generate the figures. See the Task design section in Methods for full image credits.
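The block structure in panels D and E can be illustrated with a minimal simulation sketch. It rests on assumptions that the caption does not confirm: each card yields either its own suit's token or nothing, the two inferior suits pay out with probability 0.2 (inferred from the "80–20" label), and a block lasts 30 rounds (two halves of 15 rounds, as described for Fig 2D). The names `BLOCK_80_20`, `play_round`, and `simulate_block` are hypothetical.

```python
import random

# Assumed reading of the 80-20 block: the dominant suit (cat) pays out with
# probability 0.8; each inferior suit is assumed to pay out with 0.2.
BLOCK_80_20 = {"cat": 0.8, "hat": 0.2, "car": 0.2}


def play_round(suit, token_probs, rng):
    """Return True if picking a card of `suit` yields a token this round."""
    return rng.random() < token_probs[suit]


def simulate_block(suit, token_probs, n_rounds=30, seed=0):
    """Count tokens collected when the same suit is chosen on every round."""
    rng = random.Random(seed)
    return sum(play_round(suit, token_probs, rng) for _ in range(n_rounds))


if __name__ == "__main__":
    for suit in BLOCK_80_20:
        print(suit, simulate_block(suit, BLOCK_80_20))
```

Always choosing one suit is only a baseline for illustration; it is not a model of participant behavior.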
Table 1.
for the 80-20 block. The dominant token is
with
Fig 2.
A. Participants show a preference for the progress accrued over the current rate of progress.
Participants primarily chose between the dominant suit (the one with the highest probability within the block) and the max progress suit (the one that had made the greatest progress towards the target), while choosing the third option relatively infrequently. B. The proportion of times participants chose the max progress suit over the dominant suit (when the two diverge), broken down by block type. The preference for the max progress option increases with the difference in progress between the dominant and max progress suits. C. Participants show a preference for the retrospective suit over the prospective suit (when the two choices diverge). Participants chose the retrospective suit (the one with the greatest progress) over the prospective suit (the optimal choice prescribed by the task-optimized prospective agent). D. Learning effects within blocks. Panel shows the dominant suit choice probability in the first and second halves of the block (15 rounds each). The proportion of dominant choices increases in the second half, indicating learning effects within blocks. The choice patterns are contrasted with those of the task-optimized prospective agent: participants' preference for the dominant suit is significantly lower in both halves of the block, indicating a suboptimal retrospective bias in goal selection.
Fig 3.
A. Participants show greater persistence towards the retrospective suit.
Participants make more switches away from the prospective option than from the retrospective option. Switches are split by suit type (prospective/retrospective), block condition, and whether the round yields a token. Significance markers indicate paired-sample t-tests (). B. Switches split by progress difference between the suits. H diff and L diff correspond to higher and lower progress differences between the suits. Increased persistence with the retrospective suit is observed when the progress difference between the suits is high.
Fig 4.
A, B. Human performance contrasted against task-optimized prospective and retrospective agents.
A. Human behavior falls significantly short of the optimal prospective agent and marginally beats the retrospective agent in task performance (total number of suits collected). Agents' parameters are optimized for performance in the task. Performance is broken down by task condition, and the distribution of performance is shown. B. Preference towards accrued progress shown by the two agents, contrasted with human participants. The retrospective agent has the highest proportion of retrospective choices and the prospective agent the lowest, while the human retrospective preference lies between these extremes.
Fig 5.
A, B. Effects of explicit instruction about the prospects of alternative goals on behavior.
A. Task performance improves with instructions yet falls significantly short of the optimal prospective agent. B. The retrospective choice proportion decreases with instructions yet remains significantly inflated relative to the prospective agent. C. Learning effects within blocks. Panel shows the dominant suit choice probability in the first and second halves of the block (15 rounds each) in the presence and absence of explicit instructions. Preference towards the dominant choice increases in both halves of the block, indicating a shift towards optimal goal selection in the presence of instructions.
Fig 6.
A, B. Features of the TD-momentum algorithm.
A. The relation of TD-momentum value estimates with progress is determined by the discount factor, step size of progress, and the probability of making unit progress. In the simulations, the probability of unit progress is 0.6, and the step size of unit progress is 0.14. B. Comparison between the value computations from prospective and TD-momentum algorithms. The black line indicates the shift in probability of unit progress from 0.6 to 0.2 after 15 trials. TD-momentum provides stable value estimates, as opposed to the prospective model which exhibits fluctuations in response to individual reinforcements. Step size of unit progress is 0.05, learning rate is 0.4, and is 0.9 for both algorithms.
Fig 7.
TD-momentum accounts for the switching choices in the task better than the prospective, hybrid, and TD-persistence models according to AIC, BIC, and 3-fold cross-validation metrics in experiments 1 and 2. Means and standard error bars of the metrics are shown. Significance markers correspond to two-tailed paired-sample t-tests (). C. TD-momentum also offers the best account of behavior in the variant with explicit instructions. D. Incidence of the competing hypothesis of TD-persistence increases in the version with instructions.
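For reference, the two information criteria named in this caption follow standard definitions: AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of free parameters, n the number of observations, and ln L the maximized log-likelihood. The sketch below only restates these textbook formulas; the function names and the example numbers are hypothetical and are not taken from the paper's fits.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


if __name__ == "__main__":
    # Illustrative values only: a 3-parameter model fit to 100 choices.
    print(aic(-100.0, 3))
    print(bic(-100.0, 3, 100))
```

Both criteria penalize extra parameters; BIC penalizes them more heavily once n exceeds about 7 observations, since ln n then exceeds 2.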
Fig 8.
TD-momentum: model predictions
The TD-momentum agent captures the general trends in preference for the maximum progress suit and in switching patterns in the task.
Fig 9.
Experiment 4: Variable targets for goals
A. Slot configuration for experiment 4. B. Suit selections divided by high disparity (H disp) and low disparity (L disp) conditions and subdivided by the dominant token type (cat/hat/car). C, D, E. Behavioral signatures (suit selections, retrospective choice, switching patterns) replicated from experiments 1 and 2. Figure credits. We used images from https://openclipart.org/ to generate this figure. See the Task Design section in Methods for full image credits.
Fig 10.
Experiment 4: Model comparisons
A. TD-momentum outperforms the other models on AIC, BIC, and cross-validation metrics. B. The prospective agent shows a bias towards the cat suit, while TD-momentum closely captures the distribution of goal selections. C. TD-momentum captures retrospective choice patterns better than the prospective agent.
Fig 11.
Normative account of momentum in goal pursuit.
Optimal performance attained by the TD-momentum agent contrasted with that of the prospective agent when the token-outcome probabilities change as Gaussian random walks (left) and when they change in abrupt probability reversals (right). Each point in the plot corresponds to a variant of the task with randomly generated token contingencies; a sample task design is illustrated for both random walks and probability reversals.
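The two kinds of non-stationary contingencies named in this caption can be sketched as simple probability schedules. This is an illustration under assumptions, not the generator used in the paper: the step standard deviation, the clipping to [0, 1], the 0.8/0.2 levels, and the 30-trial block length are all placeholder values, and the function names are hypothetical.

```python
import random


def random_walk_probs(n_trials, start=0.5, sd=0.05, seed=0):
    """Token probability drifting as a Gaussian random walk, clipped to [0, 1]."""
    rng = random.Random(seed)
    p, schedule = start, []
    for _ in range(n_trials):
        schedule.append(p)
        # Add Gaussian noise, then clip so the value stays a valid probability.
        p = min(1.0, max(0.0, p + rng.gauss(0.0, sd)))
    return schedule


def reversal_probs(n_trials, p_high=0.8, p_low=0.2, block_len=30):
    """Token probability that flips abruptly between two levels every block."""
    return [p_high if (t // block_len) % 2 == 0 else p_low
            for t in range(n_trials)]


if __name__ == "__main__":
    print(random_walk_probs(5))
    print(reversal_probs(90)[28:32])
```

A random walk changes contingencies gradually on every trial, whereas a reversal schedule keeps them fixed within a block and changes them all at once at block boundaries.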
Table 2.
Participant recruitment.