Learning with sparse reward in a gap junction network inspired by the insect mushroom body

doi:10.1371/journal.pcbi.1012086

Fig 6

Number of steps per episode in the Taxi-v3 task.

(A) Number of steps per episode with the dynamic routing model. The inset is a zoomed-in view of the outer plot, showing convergence to around 20 steps. (B) Number of steps per episode taken by Q-learning under the same training configuration as in Fig 5. Q-learning does not converge.

doi: https://doi.org/10.1371/journal.pcbi.1012086.g006