Humans combine value learning and hypothesis testing strategically in multi-dimensional probabilistic reward learning
Table 1
The reward probability of a stimulus in each game type (1D, 2D, and 3D-relevant games) was determined by the number of rewarding features in the stimulus.
Each row corresponds to one game type. Across all game types, the reward probability was 20% if the stimulus contained no rewarding features, 80% if it contained all rewarding features, and linearly interpolated between 20% and 80% if it contained a subset of the rewarding features. For example, in a 3D-relevant game, a stimulus containing two of the three rewarding features had a reward probability of 60%. These probabilities guarantee that a participant who chooses randomly has a 40% probability of obtaining a reward in every game type. This can be verified by computing, for each game type, the probability of randomly choosing a stimulus with each possible number of rewarding features, multiplying it by the corresponding reward probability, and summing over the possible numbers. Equating the chance-level reward probability across game types ensured that chance behavior would not be informative about the number of relevant dimensions in unknown games.
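The chance-level calculation described above can be sketched as a short script. This is an illustrative check, not the authors' code. It assumes that each dimension has three equally likely features, so that a random choice contains any given rewarding feature with probability 1/3, and that features on different dimensions are chosen independently. Neither assumption is stated in this caption, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

# From the text: 20% with no rewarding features, 80% with all of them.
P_MIN, P_MAX = Fraction(1, 5), Fraction(4, 5)

# ASSUMPTION (not stated in the caption): three equally likely features per
# dimension, so a random choice hits a given rewarding feature w.p. 1/3.
P_FEATURE = Fraction(1, 3)


def reward_probability(n_rewarding, n_relevant):
    """Linear interpolation between 20% and 80% in the fraction of rewarding features."""
    return P_MIN + (P_MAX - P_MIN) * Fraction(n_rewarding, n_relevant)


def chance_reward_probability(n_relevant):
    """Expected reward probability of a random choice in an n_relevant-D game.

    Sums, over k = 0..n_relevant, the binomial probability of randomly
    choosing k rewarding features times the reward probability for k.
    ASSUMPTION: features on different dimensions are chosen independently.
    """
    total = Fraction(0)
    for k in range(n_relevant + 1):
        p_k = comb(n_relevant, k) * P_FEATURE**k * (1 - P_FEATURE) ** (n_relevant - k)
        total += p_k * reward_probability(k, n_relevant)
    return total


for d in (1, 2, 3):
    row = [float(reward_probability(k, d)) for k in range(d + 1)]
    print(f"{d}D-relevant: reward probabilities {row}, "
          f"chance level {float(chance_reward_probability(d))}")
```

Under these assumptions the expected reward probability of a random choice is exactly 2/5 for all three game types. The reason is that the interpolation is linear in the fraction of rewarding features chosen, and the expected value of that fraction is 1/3 regardless of the number of relevant dimensions.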