Predictive reward-prediction errors of climbing fiber inputs integrate modular reinforcement learning with supervised learning
Fig 6
Spiking neural network model of the cerebellum with 5,000 neurons in Go/No-go tasks.
A: The model consists of two groups of neurons in the PC–CN–IO circuitry, corresponding to TC1 & TC3 (TCGo: PCGo–CNGo–IOGo) and TC2 & TC4 (TCNogo: PCNogo–CNNogo–IONogo), respectively. Sensory input to PCGo and PCNogo was transmitted via mossy fibers (MFs) to granule cells for Go (GrCGo) and No-go (GrCNogo) cues. Note that the two neuronal groups received shared mossy fiber input, which is represented by the equal connections from GrCGo and GrCNogo to both PCGo and PCNogo. In this model, LTP and LTD are assumed to occur at PF–PC synapses of TCGo and TCNogo when IO firing is lower and higher than the threshold, respectively. For each group, the PC, CN, and IO populations (green, yellow, and blue discs) contained 100 simulated neurons each, and 2,000 GrCs were prepared for both Go (GrCGo) and No-go (GrCNogo) cues. B: The lick rate is modeled as a sigmoid function of the combined firing rates of CNGo and CNNogo neurons, with the maximum lick rate (ratemax) set at 6 Hz. C: The error rates of Go and No-go trials, defined as the difference between the target lick rate (ratemax for Go and 0 for No-go trials) and the actual lick rate, are transformed into the rates of the Poisson spike generator inputs ErrGo and ErrNogo to IOGo and IONogo neurons, respectively. This reproduces the established negative correlations between δQ and CSs in Go trials for TCGo (blue region) and No-go trials for TCNogo (red region). D: A lattice of 10×10 IO neurons is modeled for each of TCGo and TCNogo, where the effective coupling strength between neurons is proportional to their relative distance. In each trial, the effective coupling strength is determined by the firing rate of CN neurons (see Methods for details).
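The relations described in panels A–C can be illustrated with a minimal Python sketch. It is not the authors' implementation; the actual equations and parameter values are given in Methods. Only the 6 Hz maximum lick rate, the sigmoid form, the target lick rates, the Poisson input, and the threshold rule for LTP/LTD come from the caption. The way the CN rates are combined (`w_go`, `w_nogo`), the sigmoid `gain` and `midpoint_hz`, the linear error-to-rate `scale`, and all function names are assumptions made for illustration.

```python
import math
import random

RATE_MAX_HZ = 6.0  # maximum lick rate (ratemax) stated in the caption


def lick_rate(cn_go_hz, cn_nogo_hz, w_go=1.0, w_nogo=-1.0,
              gain=0.1, midpoint_hz=0.0):
    """Panel B: sigmoid of the combined CN firing rates.

    ASSUMPTION: the rates are combined as a weighted sum with
    hypothetical weights, gain, and midpoint.
    """
    drive = w_go * cn_go_hz + w_nogo * cn_nogo_hz
    # Clamp the exponent so math.exp cannot overflow for extreme drives.
    z = max(-60.0, min(60.0, gain * (drive - midpoint_hz)))
    return RATE_MAX_HZ / (1.0 + math.exp(-z))


def error_input_rate(is_go_trial, actual_lick_hz, scale=10.0):
    """Panel C: map the lick-rate error to a Poisson input rate (Hz).

    The target is ratemax on Go trials and 0 on No-go trials (caption).
    ASSUMPTION: the rate grows linearly with the absolute error.
    """
    target = RATE_MAX_HZ if is_go_trial else 0.0
    return scale * abs(target - actual_lick_hz)


def poisson_spike_times(rate_hz, duration_s, rng):
    """Homogeneous Poisson spike train via exponential inter-spike intervals."""
    if rate_hz <= 0.0:
        return []
    times = []
    t = rng.expovariate(rate_hz)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_hz)
    return times


def plasticity_sign(io_rate_hz, threshold_hz):
    """Panel A: +1 (LTP) below the IO firing threshold, -1 (LTD) above it."""
    if io_rate_hz < threshold_hz:
        return 1
    if io_rate_hz > threshold_hz:
        return -1
    return 0


if __name__ == "__main__":
    rng = random.Random(0)
    licks = lick_rate(cn_go_hz=60.0, cn_nogo_hz=20.0)
    err_go = error_input_rate(True, licks)
    spikes = poisson_spike_times(err_go, duration_s=1.0, rng=rng)
    print(f"lick rate {licks:.2f} Hz, ErrGo {err_go:.2f} Hz, {len(spikes)} spikes")
```

With these assumed defaults, equal CNGo and CNNogo rates give a lick rate of half the maximum, and the error input to IOGo shrinks as the lick rate on Go trials approaches ratemax.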