Learning with sparse reward in a gap junction network inspired by the insect mushroom body

doi:10.1371/journal.pcbi.1012086

Fig 10

Results of Q-learning on Voronoi World under a similar training configuration.

The configuration used a maximum of 10000 steps per episode and a sparse reward. Q-learning did not converge under this configuration. (A) Reward per episode. (B) Reward per step.
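
For readers unfamiliar with the baseline, the following is a minimal sketch of tabular Q-learning under a per-episode step cap and a sparse reward. It is not the authors' implementation and does not model Voronoi World: the environment is a hypothetical 1-D chain, and the function name, learning rate, discount factor, exploration rate and state count are all illustrative assumptions. Only the 10000-step cap is taken from the caption. The two returned lists are the same kind of quantities as those plotted in panels (A) and (B).

```python
import random


def q_learning_sparse(n_states=6, episodes=200, max_steps=10000,
                      alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain (not Voronoi World).

    The agent starts at state 0 and receives a reward of 1 only on the
    transition into the last state; every other transition gives 0.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    # One row per state, one column per action: 0 = left, 1 = right.
    q = [[0.0, 0.0] for _ in range(n_states)]
    reward_per_episode, reward_per_step = [], []

    for _ in range(episodes):
        state, total, steps = 0, 0.0, 0
        while steps < max_steps and state != goal:
            # Epsilon-greedy action choice; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Moving left at state 0 leaves the agent in place.
            nxt = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if nxt == goal else 0.0

            # Q-learning update; the goal state has no future value.
            future = 0.0 if nxt == goal else gamma * max(q[nxt])
            q[state][action] += alpha * (reward + future - q[state][action])

            state, total, steps = nxt, total + reward, steps + 1

        reward_per_episode.append(total)         # analogue of panel (A)
        reward_per_step.append(total / steps)    # analogue of panel (B)

    return q, reward_per_episode, reward_per_step
```

On a chain this small Q-learning readily finds the goal, so the sketch illustrates the training setup (step cap, sparse reward, the two reward measures) and not the non-convergence reported for Voronoi World.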

doi: https://doi.org/10.1371/journal.pcbi.1012086.g010