Maynard Smith revisited: A multi-agent reinforcement learning approach to the coevolution of signalling behaviour
Fig 5
Q-values for Case 2. The darker line is the average across all states; the faint lines are the averages for each individual state.