Hierarchical Bayesian inference for concurrent model fitting and comparison for group studies
Fig 10
Performance of the HBI t-test for making inference at the population level.
RL agents with a bias parameter were generated with different mean values of that parameter (effect sizes) in two simulations in which A) there is only one model in the model space (scenario 1); or B) there are two models in the model space (scenario 2). The HBI makes inference using the HBI t-test, the NHI makes inference by performing a t-test on its estimated parameters, and the HPE makes inference by comparing the full fit with the null fit (in which the group-level prior mean for the bias parameter is fixed). The sensitivity (power) of each test, i.e. its true positive rate in detecting true effects at P < 0.05, is plotted across a range of effect sizes. For the HPE, a log-evidence of at least 3 was considered significant. The HPE shows lower sensitivity than the other methods in both scenarios. Moreover, the HBI shows higher sensitivity than the NHI in scenario 2.
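The notion of sensitivity used in this caption can be illustrated with a minimal simulation sketch. It is not the authors' pipeline: it skips model fitting entirely and assumes that per-subject parameter estimates are drawn directly from a unit-variance normal distribution whose mean equals the effect size. The group size of 20, the number of simulated groups, and the names `t_statistic` and `sensitivity` are all hypothetical choices made for illustration. It estimates how often a plain one-sample t-test, comparable in spirit to the test the NHI applies to its estimated parameters, rejects the null at P < 0.05.

```python
import math
import random
import statistics

# Illustrative assumptions (not taken from the paper): 20 subjects per group.
N_SUBJECTS = 20
# Two-sided critical value of Student's t for alpha = 0.05 and df = 19.
T_CRIT = 2.093


def t_statistic(samples):
    """One-sample t statistic against a null mean of zero."""
    n = len(samples)
    return statistics.mean(samples) / (statistics.stdev(samples) / math.sqrt(n))


def sensitivity(effect_size, n_sims=2000, seed=0):
    """Fraction of simulated groups in which the t-test rejects the null."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Hypothetical per-subject parameter estimates: unit variance,
        # mean equal to the effect size.
        params = [rng.gauss(effect_size, 1.0) for _ in range(N_SUBJECTS)]
        if abs(t_statistic(params)) > T_CRIT:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    for d in (0.0, 0.2, 0.5, 0.8):
        print(f"effect size {d:.1f}: sensitivity {sensitivity(d):.3f}")
```

At an effect size of zero the rejection rate is the false positive rate and should stay near 0.05; as the effect size grows, the rejection rate approaches 1, which is the shape of curve a sensitivity plot of this kind displays.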