Learning with sparse reward in a gap junction network inspired by the insect mushroom body

doi:10.1371/journal.pcbi.1012086

Fig 2

The state of taxi locations.

(A) The environment has 25 locations, numbered 0 to 24. (B) The car can move only to an adjacent location that is not blocked by an obstacle, which constrains the possible transitions between locations. (C) After training, our model has learned the possible transitions between locations. Arrow width and direction show the conductance strength and direction between nodes. State nodes with high potential (e.g. the current state in the example, 16) are shown in yellow, and state nodes with low potential (e.g. the goal state, 20) in purple. Arrow brightness shows the resulting current strength; here the current flows from 16 to 20, guiding the car along this route.
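
The idea in panel (C) can be illustrated with a minimal sketch. It is not the model from the paper: it assumes row-by-row numbering of the 25 states, obstacle positions borrowed from the standard Taxi layout (`WALLS`), and a uniform, symmetric conductance of 1 on every allowed transition instead of learned, directed conductances. The helper names `neighbours`, `solve_potentials` and `route` are hypothetical. The current state is clamped to a high potential and the goal to a low one; the remaining potentials are relaxed until each node equals the mean of its neighbours, and the car then follows the strongest outgoing current.

```python
SIZE = 5  # 5 x 5 grid, states numbered 0..24 row by row (assumed numbering)

# ASSUMPTION: obstacle positions follow the standard Taxi layout; each pair
# (a, b) blocks the move between horizontally adjacent states a and b.
WALLS = {(1, 2), (6, 7), (15, 16), (20, 21), (17, 18), (22, 23)}


def neighbours(state):
    """Return the states reachable from `state` in one move."""
    row, col = divmod(state, SIZE)
    result = []
    if row > 0:
        result.append(state - SIZE)  # up
    if row < SIZE - 1:
        result.append(state + SIZE)  # down
    if col > 0 and (state - 1, state) not in WALLS:
        result.append(state - 1)  # left
    if col < SIZE - 1 and (state, state + 1) not in WALLS:
        result.append(state + 1)  # right
    return result


def solve_potentials(current, goal, sweeps=2000):
    """Clamp current=1 and goal=0, then relax all other nodes (Gauss-Seidel).

    With unit conductances, zero net current at a node means its potential
    is the mean of its neighbours' potentials.
    """
    potential = [0.5] * (SIZE * SIZE)
    potential[current], potential[goal] = 1.0, 0.0
    for _ in range(sweeps):
        for s in range(SIZE * SIZE):
            if s not in (current, goal):
                nbrs = neighbours(s)
                potential[s] = sum(potential[n] for n in nbrs) / len(nbrs)
    return potential


def route(current, goal):
    """Follow the largest outgoing current (potential drop) until the goal."""
    potential = solve_potentials(current, goal)
    path = [current]
    while path[-1] != goal and len(path) <= SIZE * SIZE:
        here = path[-1]
        # Current along an edge = conductance (1) * potential difference.
        path.append(max(neighbours(here),
                        key=lambda n: potential[here] - potential[n]))
    return path


print(route(16, 20))
```

Under these assumed walls the direct move from 16 to 15 is blocked, so the current leaves 16 upwards and the route runs through 11, 10 and 15 before reaching 20. Replacing the uniform conductances with learned, directed ones would change the current magnitudes but not the principle of following the current from high to low potential.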

doi: https://doi.org/10.1371/journal.pcbi.1012086.g002