Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts
Fig 14
Alternatives to state-independent action hysteresis.
Compare to Fig 12. To falsify alternative hypotheses concerning the origins of the apparent effects of state-independent action hysteresis Ht(a) (“2CE1”), the model comparison was first extended to test substitution of state-dependent action hysteresis Ht(st,a) (“sE1+2C”), state-independent action value Qt(a) (“Qa+2C”), confirmation bias in learning with the constraint αN < αP (“cLR+2C”), or asymmetric learning rates αN ≠ αP with no constraint on their ordering (“LR+2C”). As expected, none of these alternatives could generate the original action-history curves, which only state-independent action hysteresis could produce.
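The contrast among these substituted mechanisms can be illustrated with a minimal Python sketch. It is not the authors' implementation. The exponentially decaying trace, the softmax combination of value and hysteresis, and every name used here (`update_q`, `update_trace_state_independent`, `update_trace_state_dependent`, `action_probs`, `decay`, `beta`, `beta_h`) are assumptions made for illustration only. The learning rates `alpha_pos` and `alpha_neg` stand in for αP and αN.

```python
import numpy as np

def update_q(q, s, a, reward, alpha_pos, alpha_neg):
    """Delta-rule update with separate learning rates for positive and
    negative prediction errors (assumed form of asymmetric learning)."""
    delta = reward - q[s, a]
    alpha = alpha_pos if delta >= 0 else alpha_neg
    q[s, a] += alpha * delta
    return delta

def update_trace_state_independent(h, a, decay):
    """h has shape (n_actions,): one trace per action, shared by all states."""
    h *= decay          # assumed exponential decay of the action history
    h[a] += 1.0         # mark the action just taken

def update_trace_state_dependent(h, s, a, decay):
    """h has shape (n_states, n_actions): only the visited state's row changes."""
    h[s] *= decay
    h[s, a] += 1.0

def action_probs(q_row, h_row, beta, beta_h):
    """Softmax over value plus hysteresis bias (assumed combination rule).
    beta_h > 0 favors repetition; beta_h < 0 favors alternation."""
    logits = beta * q_row + beta_h * h_row
    logits = logits - logits.max()   # numerical stability
    p = np.exp(logits)
    return p / p.sum()
```

In this sketch, a state-independent trace carries the previous action's influence into whatever state comes next, whereas a state-dependent trace affects choice only when the same state recurs. Asymmetric learning rates alter the learned values rather than adding a bias tied to action history.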