A Unifying Probabilistic View of Associative Learning

doi:10.1371/journal.pcbi.1004567

Fig 1.

Organizing Bayesian and reinforcement learning theories.

Point estimation algorithms learn the expected reward or value, while Bayesian algorithms learn a posterior distribution over reward or value. The columns show what is learned, and the rows show how it is learned.

Fig 2.

Kalman filter simulation of latent inhibition.

(A) Reward expectation following pre-exposure (Pre) and no pre-exposure (No-Pre) conditions. (B) The Kalman gain as a function of pre-exposure trial.
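The effect described in this caption can be illustrated with a minimal single-stimulus Kalman filter. This is a sketch under stated assumptions, not the simulation code behind the figure: the function names (`kalman_step`, `simulate`), the parameter values (prior variance 1, diffusion variance `tau2 = 0.01`, reward noise variance `sigma_r2 = 1`), and the trial counts are all hypothetical choices made for illustration.

```python
def kalman_step(w, var, x, r, tau2=0.01, sigma_r2=1.0):
    """One Kalman filter update for a single stimulus weight.

    All parameter values here are illustrative assumptions.
    """
    var = var + tau2                              # weight diffusion inflates uncertainty
    delta = r - w * x                             # prediction error
    gain = var * x / (x * var * x + sigma_r2)     # Kalman gain (per-trial learning rate)
    w = w + gain * delta                          # posterior mean update
    var = var - gain * x * var                    # posterior variance update
    return w, var, gain


def simulate(n_pre, n_cond=10, prior_var=1.0):
    """Pre-exposure (stimulus alone, no reward) followed by conditioning."""
    w, var = 0.0, prior_var
    pre_gains = []
    for _ in range(n_pre):                        # stimulus present, reward absent
        w, var, gain = kalman_step(w, var, x=1.0, r=0.0)
        pre_gains.append(gain)
    for _ in range(n_cond):                       # stimulus paired with reward
        w, var, _ = kalman_step(w, var, x=1.0, r=1.0)
    return w, pre_gains


w_pre, gains = simulate(n_pre=10)     # Pre condition
w_nopre, _ = simulate(n_pre=0)        # No-Pre condition
print(f"Pre: {w_pre:.3f}  No-Pre: {w_nopre:.3f}")
print("Kalman gain over pre-exposure trials:", [round(g, 3) for g in gains])
```

Under these assumptions, pre-exposure lowers the posterior variance, so the Kalman gain shrinks across pre-exposure trials and the pre-exposed stimulus acquires reward expectation more slowly during conditioning.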

Fig 3.

Kalman filter simulation of recovery phenomena.

(A) Overshadowing and unovershadowing by extinction of the overshadowing stimulus. (B) Forward blocking and unblocking by extinction of the blocking stimulus. (C) Overexpectation and unoverexpectation by extinction of one element. (D) Conditioned inhibition and uninhibition by extinction of the excitatory stimulus.
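Panel A can be illustrated with a two-stimulus Kalman filter in which extinguishing the overshadowing stimulus A raises the weight of the overshadowed stimulus X. This is a rough sketch under assumed settings, not the code used for the figure: the helper name `kalman_update`, the parameter values, and the choice of 10 trials per phase are hypothetical.

```python
def kalman_update(w, S, x, r, tau2=0.01, sigma_r2=1.0):
    """One multivariate Kalman filter update using plain lists.

    w: weight means, S: symmetric covariance matrix, x: stimulus vector, r: reward.
    Parameter values are illustrative assumptions.
    """
    n = len(w)
    # Diffusion adds variance to the diagonal only.
    S = [[S[i][j] + (tau2 if i == j else 0.0) for j in range(n)] for i in range(n)]
    Sx = [sum(S[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = sum(x[i] * Sx[i] for i in range(n)) + sigma_r2
    gain = [Sx[i] / denom for i in range(n)]          # Kalman gain vector
    delta = r - sum(w[i] * x[i] for i in range(n))    # prediction error
    w = [w[i] + gain[i] * delta for i in range(n)]
    # S - k x^T S, using symmetry of S so that x^T S equals Sx transposed.
    S = [[S[i][j] - gain[i] * Sx[j] for j in range(n)] for i in range(n)]
    return w, S


w = [0.0, 0.0]                     # weights for stimuli A and X
S = [[1.0, 0.0], [0.0, 1.0]]       # assumed prior covariance

for _ in range(10):                # overshadowing: compound AX paired with reward
    w, S = kalman_update(w, S, x=[1.0, 1.0], r=1.0)
w_x_before, cov_ax = w[1], S[0][1]

for _ in range(10):                # extinction of A alone
    w, S = kalman_update(w, S, x=[1.0, 0.0], r=0.0)
w_x_after = w[1]

print(f"cov(A, X) after compound training: {cov_ax:.3f}")
print(f"w_X before extinction of A: {w_x_before:.3f}, after: {w_x_after:.3f}")
```

Compound training leaves the two weights negatively correlated in the posterior, so the negative prediction error on A-alone trials is applied to X with a negative gain, which increases the weight for X even though X is never presented during extinction.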

Fig 4.

Overshadowing and second-order conditioning.

(A) Experimental design [55]. Note that two control groups have been omitted here for simplicity. (B) Simulated value of stimulus Z computed by Kalman TD (left) and TD (right). Only Kalman TD correctly predicts that extinguishing an overshadowing stimulus will allow the overshadowed stimulus to support second-order conditioning. (C) Posterior covariance between weights for stimuli A and X (left) and Kalman gain for stimulus X (right) as a function of Phase 1 trial. (D) Posterior covariance between weights for stimuli A and X (left) and Kalman gain for stimulus X (right) as a function of Phase 2 trial.

Fig 5.

Second-order extinction.

(A) Experimental design [56]. (B) Simulated value of stimulus Z computed by Kalman TD (left) and TD (right).

Fig 6.

Serial compound extinction.

(A) Experimental design [61]. (B) Simulated value of stimulus Z computed by Kalman TD (left) and TD (right). (C) Posterior covariance between the weights for stimuli Z and X as a function of conditioning trial.

Fig 7.

Serial compound latent inhibition.

(A) Experimental design [61]. (B) Simulated value of stimulus Z computed by Kalman TD (left) and TD (right). (C) Posterior variance (left) and Kalman gain (right) of stimulus X as a function of pre-exposure trial.

Fig 8.

Recovery from overshadowing.

(A) Experimental design [62]. (B) Simulated value of stimulus X and stimulus Y computed by Kalman TD (left) and TD (right).
