Reward Optimization in the Primate Brain: A Probabilistic Model of Decision Making under Uncertainty

Figure 3

Relationship between Model and Neural Activity.

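The input to the model is a random-dot motion sequence. At each time step, neurons in MT emit spikes according to their tuning curves, and these spikes constitute the observation in the POMDP model. The animal maintains the belief state by computing the posterior over the hidden state given the observations so far (the belief can be parameterized by two quantities; see text). The optimal policy is implemented by selecting the rightward eye movement when a condition on the belief state is met, or equivalently when a corresponding condition on its parameters is met (see text), and likewise for the leftward eye movement.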
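The pipeline in this caption (spikes as observations, a belief update, a belief-dependent choice) can be illustrated with a minimal sketch. It is not the paper's model: it assumes only two hypotheses (rightward vs. leftward motion), independent Poisson spike counts, and a fixed belief threshold. The function names, the rates, and the threshold value of 0.9 are all hypothetical and chosen for illustration only.

```python
import math

def update_log_odds(log_odds, spike_counts, rates_right, rates_left):
    """One Bayesian update of log[P(right)/P(left)] from spike counts.

    Assumption: each neuron's count in a time step is Poisson, with an
    expected count given by its tuning to rightward or leftward motion.
    """
    for n, r_right, r_left in zip(spike_counts, rates_right, rates_left):
        # Poisson log-likelihood ratio; the n! term cancels between hypotheses.
        log_odds += n * math.log(r_right / r_left) - (r_right - r_left)
    return log_odds

def belief_right(log_odds):
    """Convert log-odds into the posterior probability of rightward motion."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def select_action(p_right, threshold=0.9):
    """Hypothetical threshold policy: commit once the belief is extreme enough."""
    if p_right > threshold:
        return "right"
    if p_right < 1.0 - threshold:
        return "left"
    return "sample"  # keep observing

# Two illustrative neurons with opposite direction preferences.
rates_right = [2.0, 1.0]  # expected counts per step under rightward motion
rates_left = [1.0, 2.0]   # expected counts per step under leftward motion

log_odds = 0.0  # flat prior over the two directions
for counts in ([3, 0], [3, 0]):
    log_odds = update_log_odds(log_odds, counts, rates_right, rates_left)
    print(select_action(belief_right(log_odds)))
```

Working in log-odds keeps each update additive across neurons and time steps. In this toy setting, a threshold on the posterior probability is the same as a threshold on the accumulated log-likelihood ratio, which is one simple sense in which a decision rule can be stated in two equivalent forms.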

