Maynard Smith revisited: A multi-agent reinforcement learning approach to the coevolution of signalling behaviour

Table 5

Strategies most often learned by the Beneficiary for parameters S = 0.2, V = 0.2, r = 0.5 (Case 1; see Fig 3).