Maynard Smith revisited: A multi-agent reinforcement learning approach to the coevolution of signalling behaviour
Table 4
Resulting strategies of learning agents, showing the top strategy learned and the proportion of runs in which it was the resulting strategy, for the parameters LR=0.9, DR=0.1.
For expectation, means the Beneficiary does not signal, and the Donor keeps the resource. Valid for any value of p.