## Abstract

Many problems of cooperation involve repeated interactions among the same groups of individuals. When collective action is at stake, groups often engage in *Public Goods Games* (**PGG**), where individuals contribute (or not) to a common pool, subsequently sharing the resources. Such scenarios of repeated group interactions embody situations in which direct reciprocation towards groups may be at work. Here we study direct group reciprocity considering the complete set of reactive strategies, in which individuals behave conditionally on what they observed in the previous round. We study, both analytically and by computer simulations, the evolutionary dynamics over this extensive strategy space, witnessing the emergence of a surprisingly simple strategy that we call *All-Or-None* (**AoN**). **AoN** consists in cooperating only after a round of unanimous group behavior (cooperation or defection), and proves robust in the presence of errors, thus fostering cooperation across a wide range of group sizes. The principles encapsulated in this strategy exhibit a level of complexity reminiscent of that already found in 2-person games under direct and indirect reciprocity; indeed, **AoN** reduces to the well-known *Win-Stay-Lose-Shift* strategy in the limit of the repeated 2-person *Prisoner's Dilemma*.

## Author Summary

The problem of cooperation has been the target of many studies, and some of the most complex dilemmas arise when groups interact repeatedly by means of a *Public Goods Game* (**PGG**), where individuals may contribute to a common pool, subsequently sharing the resources. Here we study generalized direct group reciprocity by incorporating the complete set of reactive strategies, in which an individual's action is dictated by what happened in the previous round. We compute the pervasiveness in time of each possible reactive strategy, and find a ubiquitous strategy profile that prevails throughout evolution, independently of group size and of the specific **PGG** parameters, proving robust also in the presence of errors. This strategy, which we call *All-Or-None* (**AoN**), consists in cooperating only after a round of unanimous group behavior (cooperation or defection); not only is it conceptually very simple, it also ensures that cooperation can self-sustain in a population. **AoN** contains core principles found, e.g., in the repeated 2-person Prisoner's Dilemma, in which case it reduces to the famous *Win-Stay-Lose-Shift* strategy.

**Citation:** Pinheiro FL, Vasconcelos VV, Santos FC, Pacheco JM (2014) Evolution of All-or-None Strategies in Repeated Public Goods Dilemmas. PLoS Comput Biol 10(11): e1003945. https://doi.org/10.1371/journal.pcbi.1003945

**Editor:** Jean Daunizeau, Brain and Spine Institute (ICM), France

**Received:** May 27, 2014; **Accepted:** September 27, 2014; **Published:** November 13, 2014

**Copyright:** © 2014 Pinheiro et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability:** The authors confirm that all data underlying the findings are fully available without restriction. All relevant data are within the paper and its Supporting Information files.

**Funding:** This research was supported by FEDER through POFC – COMPETE and by FCT-Portugal through fellowships SFRH/BD/77389/2011 and SFRH/BD/86465/2012, by grants PTDC/MAT/122897/2010 and EXPL/EEI-SII/2556/2013, by multi-annual funding of CMAF-UL, CBMA-UM and INESC-ID (under the projects PEst-OE/MAT/UI0209/2013, PEst-OE/BIA/UI4050/2014 and PEst-OE/EEI/LA0021/2013) provided by FCT-Portugal, and by Fundação Calouste Gulbenkian through the “Stimulus to Research” program for young researchers. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests:** The authors have declared that no competing interests exist.

## Introduction

The emergence and sustainability of cooperation constitutes one of the most important problems in social and biological sciences [1]. It revolves around the clash between individual and collective interest, which becomes particularly clear when one considers the evolution of collective action involving *Public Goods Games* (**PGG**), such as the stereotypical *N-person Prisoner's Dilemma* (**NPD**) [2], [3]. In the absence of additional mechanisms, such as the presence of *thresholds* [4], [5], *risk* [6], an *embedding network of interactions* [7]–[12], *institutions* [13]–[15], *punishment* or *voluntary participation* [16]–[19], evolutionary game theory predicts a population fated to fall into a tragedy of the commons [20].

Collective action problems, however, often involve repeated interactions between members of the same group [21]–[23], as exemplified by the repeated attempts of country leaders to cooperate in reducing emissions of greenhouse gases [6], [24]–[29] or in finding a solution to the Euro monetary crisis [30]–[32]. In such scenarios, where collective action is more difficult to achieve in larger groups [6], one is naturally led to ask whether a generalization of the direct reciprocity [33] mechanism to problems of collective action may provide an escape hatch from the aforementioned tragedy of the commons. Moreover, *N*-player interactions pose many additional difficulties, in particular concerning the emergence of reciprocation: if one interacts repeatedly in a group of *N* players, it is hard to identify toward whom one should reciprocate [3]. In fact, only recently has direct reciprocity been generalized to **PGG**s [22], [23], through the study of the co-evolution of unconditional defectors with generalized reciprocators, that is, individuals who, in a group of size *N*, only cooperate if there were at least *M* (0≤*M*≤*N*) individuals who cooperated in the previous round. Results show [22], [23] that generalized reciprocators are very successful in promoting cooperation. Moreover, for a given group size *N*, there is a critical threshold level of fairness, *M^{*}*, at which reciprocation optimizes the emergence of cooperation [22].

Generalized reciprocators [22] provide an intuitive generalization of the **TFT** strategy to repeated *N*-player games. However, and despite the underlying intuition, they constitute but a small subset of all possible individual (reactive) strategies one can envisage in a group of size *N*.

Here we explore the complete set of reactive strategies that individuals may adopt when engaging in repeated *Public Goods Games* with *N*-1 other individuals, assuming that the decision to cooperate or not is based on the behavioral decisions of the group in the previous round (see below). We find that, in the context of *Public Goods Games*, a reactive strategy not belonging to the set of generalized reciprocators emerges as ubiquitous, ensuring the emergence and sustainability of cooperation.

## Models

Let us consider a finite and *well-mixed* population of *Z* individuals, who assemble in randomly formed groups of size *N* and play a repeated version of the **NPD** [34]. In each round individuals either *cooperate* (**C**) by contributing an amount *c* to a public good or *defect* (**D**) by not doing so. The aggregated contributions of the group are multiplied by an enhancement factor *F* and equally divided among the *N* individuals of the group. Hence, in each round, **D**s achieve a payoff of *kcF*/*N*, while **C**s attain *kcF*/*N* − *c*, where *k* is the number of contributions in that round. We consider a repeated **PGG** with an undetermined number of rounds, such that at the end of each round another round takes place with probability *w* [3], leading to an average number of rounds — *m* — given by *m* = (1−*w*)^{−1}. At the beginning of each round (with the exception of the first), each individual decides to contribute (*i.e.* to play **C**) or not (*i.e.* to play **D**), depending on the total number of contributions that took place in the previous round.

Each strategy *S_{i}* defines how an individual behaves in each round (*i.e.* whether she/he cooperates or defects) and is encoded in a string of *N*+2 bits (*b^{−1}b^{0}b^{1}…b^{N−1}b^{N}*). The first bit (*b^{−1}*) dictates the behavior in the initial round, while the remaining *N*+1 bits (*b^{0}b^{1}…b^{N−1}b^{N}*) correspond, in sequence, to the player's behavior depending on the number of **C**s in the previous round. In this definition, a bit 1 corresponds to a *cooperative* act and a bit 0 to a *defective* one. Hence, one obtains a maximum of 2^{*N*+2} strategies, corresponding to all possible combinations of 0s and 1s in a string of size *N*+2.
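The bit-string encoding just described can be sketched in a few lines of Python (our illustration, not the authors' code; the function names are ours). It builds a decision function from a string of *N*+2 bits and instantiates, as an example, the generalized reciprocator discussed in the Introduction:

```python
# A minimal sketch (ours, not the authors' code) of the (N+2)-bit strategy
# encoding described above: b[-1] fixes the first-round move, and b[k]
# (k = 0..N) gives the move after observing k cooperators in the previous
# round, with 1 = cooperate and 0 = defect.

def make_strategy(bits):
    """bits = (b_minus1, b_0, b_1, ..., b_N); returns a decision function."""
    first, *reactive = bits

    def decide(prev_cooperators=None):
        # prev_cooperators is None only in the very first round
        return first if prev_cooperators is None else reactive[prev_cooperators]

    return decide

def generalized_reciprocator_bits(N, M):
    """Generalized reciprocator: cooperate in the first round and whenever
    at least M of the N players cooperated in the previous round."""
    return (1,) + tuple(1 if k >= M else 0 for k in range(N + 1))

# For N = 4 and M = 3 the string reads 1 0 0 0 1 1 (b^-1, b^0, ..., b^4)
strategy = make_strategy(generalized_reciprocator_bits(4, 3))
```

With this encoding, the full strategy space for group size *N* has 2^{*N*+2} elements, matching the count given above.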

We consider groups of *N* individuals, randomly sampled from a finite population of size *Z*, playing a repeated **NPD**. Individuals revise their strategies through the Fermi update rule [35]–[38], a stochastic birth-death process with mutations. At each time step a randomly selected individual *A* (with strategy *S_{A}* and fitness *f_{A}*) may adopt a different strategy *i*) by mutation, with probability *μ*, or *ii*) by imitating a random member *B* of the population (with strategy *S_{B}* and fitness *f_{B}*) with probability *p* = [1 + e^{−*β*(*f_{B}* − *f_{A}*)}]^{−1}, where *β* is the *intensity of selection* that regulates the randomness of the decision process. The fitness of each strategy *S_{i}* is the average payoff attained over all rounds and possible groups by individuals adopting that strategy. It is well known that execution errors profoundly affect the evolutionary dynamics of repeated 2-person games [39]–[45]. Consequently, we shall also consider that, in each round, after deciding to contribute or not according to *b^{q}*, an individual may act with the opposite behavior (1−*b^{q}*) with probability *ε*, thus making an execution error.

## Results/Discussion

Let us start by investigating the evolutionary dynamics of the population in the *small mutation limit* approximation [46]. This allows us to compute analytically the relative pervasiveness of each strategy in time. It is noteworthy, however, that the results obtained through this approximation remain valid for a wide range of mutation probabilities, as we show explicitly in the Supporting Information (**SI**) via comparison with results from computer simulations. In a nutshell, whenever mutations are rare, a new mutant that appears in the population will either go extinct or invade the entire population before the occurrence of the next mutation. Hence, at each time step there will be at most 2 strategies present in the population, which allows one to describe the evolutionary dynamics of the population in terms of an embedded (and reduced) Markov chain whose size equals the number of available strategies. Each state represents a monomorphic population adopting a given strategy, whereas transitions are defined by the fixation probabilities of a single mutant [47]. The resulting stationary distribution *τ_{i}* then indicates the fraction of time the population spends in each of the 2^{*N*+2} states (or strategies *S_{i}*). We shall also make use of *τ_{i}* to compute the fraction of time the population spends in a configuration/strategy with *b^{q}* = 1, a quantity we call the *stationary bit strategy*, defined as *B^{q}* = Σ_{i} *τ_{i}* *b_{i}^{q}*, where *b_{i}^{q}* corresponds to bit *q* of strategy *S_{i}*. The *stationary bit strategy* allows us to easily quantify the relative dominance of each behavior and extract the most pervasive strategic profiles.
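The small-mutation-limit machinery can be sketched as follows (our illustration; in the paper the fixation probabilities `rho` follow from the repeated-game payoffs, which we leave abstract, and the values used in the example are placeholders):

```python
# Sketch (ours) of the small-mutation-limit computation described above.
# rho[i][j] is the fixation probability of a single mutant with strategy j
# in a monomorphic population of strategy i.

def stationary_distribution(rho, iters=5000):
    """Stationary distribution of the embedded Markov chain whose states
    are monomorphic populations and whose transitions are mutant fixations."""
    n = len(rho)
    # Each rare mutation introduces one of the n-1 other strategies
    # uniformly; the self-loop absorbs the remaining probability.
    T = [[rho[i][j] / (n - 1) if j != i else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        T[i][i] = 1.0 - sum(T[i])
    tau = [1.0 / n] * n
    for _ in range(iters):  # power iteration toward the left eigenvector
        tau = [sum(tau[i] * T[i][j] for i in range(n)) for j in range(n)]
    return tau

def stationary_bit_strategy(tau, strategies):
    """B^q = sum_i tau_i * b_i^q for each bit position q, with strategies
    given as (N+2)-bit tuples as in the encoding of the Models section."""
    L = len(strategies[0])
    return [sum(t * s[q] for t, s in zip(tau, strategies)) for q in range(L)]
```

For two strategies where the second fixates much more easily than the first, the chain spends most of its time in the second state, which is the kind of asymmetry the stationary bit strategy aggregates over all 2^{*N*+2} states.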

Figure 1 shows the *stationary bit distribution*, *B^{q}*, for different group sizes. Colored cells highlight those bits (*b^{q}*) that retain the same value more than 75% of the time, with *B^{q}* ≥ 0.75 (blue) and *B^{q}* ≤ 0.25 (red). For simplicity, we call such a bit a *dominant bit*.

**Figure 1.** Each bit (square) corresponds to the weighted sum of the fraction of time (*i.e.* the analytically computed stationary distribution) the population spends in strategy configurations in which *b^{q}* = 1. Blue (red) cells identify those bits that take the value *b^{q}* = 1.0 (*b^{q}* = 0.0) at least ¾ of the time. The analysis covers group sizes (*N*) between 2 and 10 (rows). Other model parameters: *Z* = 100, *β* = 1.0, *F*/*N* = 0.85, *w* = 0.96, *ε* = 0.05, *μ* ≪ 1/*Z*.

Analysis of the *stationary bit distributions* for different group sizes under small error probabilities puts into evidence the overall evolutionary success of strategies that conform to a particular profile: *b^{0}* = *b^{N}* = 1 and *b^{q}* = 0 for 0 < *q* < *N*. A similar trend is obtained if instead we analyze the stationary distribution *τ_{i}* over all possible strategies *S_{i}*: this strategy — or minor variations on this profile (see below) — shows the highest prevalence for a wide range of parameters, even in the absence of errors of execution (see **SI**). The philosophy encapsulated in this strategy is a simple yet efficient one: cooperate only after a round of unanimous group behavior (cooperation or defection). Hence we refer to this strategy as *All-Or-None* (**AoN**), highlighting the two situations in which these individuals are prone to cooperate. As group size increases, so does the number of expected errors per round, which leads to an overall reduction of the number of *dominant bits* found in the intermediate sector (*i.e.* *b^{q}* for 0 < *q* < *N*) without affecting the “edge bits”, which again reveals the prevalence of **AoN** behavior in the population.

A monomorphic population of **AoN** players can easily sustain unanimous group cooperation, even in the presence of errors. Indeed, after an occasional individual defection, a round of full defection ensues, and unanimous cooperation resumes in the following round. **AoN** thus allows a prompt recovery from errors of execution, a key feature that allows cooperation to thrive.
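This recovery mechanism can be verified with a toy simulation (our sketch, with hypothetical function names) of a single group of **AoN** players:

```python
# Toy check (ours) of the recovery property described above: in a group of
# AoN players, one execution error causes a single round of full defection,
# after which unanimous cooperation resumes.

def aon_move(prev_cooperators, N):
    """AoN: cooperate only after a unanimous round (0 or N cooperators)."""
    return 1 if prev_cooperators in (0, N) else 0

def play_rounds(N, rounds, error_round=None):
    """Return the number of cooperators per round, starting from a state of
    full cooperation; at most one forced execution error is injected."""
    history, prev = [], N
    for r in range(rounds):
        moves = [aon_move(prev, N) for _ in range(N)]
        if r == error_round:
            moves[0] = 1 - moves[0]  # one player misexecutes her intention
        prev = sum(moves)
        history.append(prev)
    return history

# An error in round 0 for N = 5 yields [4, 0, 5, 5]: one defection, one
# round of full defection, then unanimous cooperation again.
```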

To investigate the robustness of **AoN** we show, in Figure 2, the effect of execution errors on the *stationary bit distribution* (*B^{q}*) for a fixed group size (here *N* = 5): clearly, both *b^{0}* and *b^{N}* remain associated with cooperation for a wide range of error probabilities (*ε* ≤ 0.2). The internal bits, in turn, remain qualitatively close to the **AoN** profile (*i.e.* *b^{q}* = 0 for 0 < *q* < *N*), undergoing changes as the error rate increases that allow an efficient return to full cooperation after (at least) one behavioral error. In particular, for 0.01 < *ε* < 0.1, evolution selects for defection in bits *b^{1}* to *b^{N−1}*, with particular incidence on the bits adjacent to *b^{0}* and *b^{N}*, allowing a fast error recovery. This feature gets enhanced with increasing *ε*. For larger values of *ε* (*ε* > 0.1), unanimity becomes less likely and we witness an adaptation of the predominant strategy that acts to reduce the interval of bits in which defection prevails. In other words, it is as if execution errors redefine the notion of unanimity itself or, alternatively, individuals become more tolerant as execution errors become more likely. It is also noteworthy that the non-monotonous response to errors shown in Figure 2 has been previously observed in other evolutionary models of cooperation [48], where intermediate degrees of stochasticity emerge as maximizers of cooperation. We confirmed that the results remain qualitatively equivalent for different group sizes.

**Figure 2.** We plot (log-linear scale) the fraction of time the population spends in a strategy with *b^{q}* = 1, for a broad range of error probabilities *ε*. Circles on the left indicate the values obtained for *ε* = 0.0; gray areas show the range of values for which bits were defined to have a dominant behavior. Note that for *ε* = 0.5 all strategies behave randomly. The bar plot on the right shows the results for *ε* = 0.06 (vertical dashed line). Other model parameters: *Z* = 100, *β* = 1.0, *N* = 5, *F*/*N* = 0.85, *w* = 0.96 and *μ* ≪ 1/*Z*.

In the following we investigate whether the introduction of this strategy can efficiently promote the average fraction of cooperative actions. The level of cooperation, *η*, may be defined as the average number of contributions per round divided by the maximum number of contributions possible. Denoting by *C_{i}* the average number of contributions per round associated with strategy *S_{i}*, *η* reads *η* = (1/*N*) Σ_{i} *τ_{i}* *C_{i}*, where *τ_{i}* is the fraction of time the population spends in the configuration *S_{i}* and *N* is the group size. As shown in Figure 3, the overall levels of cooperation remain high as long as the average number of rounds is sizeable (left panel, for different values of the **PGG** enhancement factor *F*).
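As a small worked example of the formula for *η* (our sketch; the numbers below are illustrative, not the paper's):

```python
# Sketch (ours) of the cooperation level defined above:
# eta = (1/N) * sum_i tau_i * C_i.

def cooperation_level(tau, contributions, N):
    """tau: stationary distribution over strategies; contributions[i]: the
    average number of contributions per round under strategy S_i; N: group
    size. Returns eta in [0, 1]."""
    return sum(t * c for t, c in zip(tau, contributions)) / N

# A population splitting its time evenly between a fully cooperative
# configuration (C_i = N) and a fully defective one (C_i = 0) yields
# eta = 0.5 for N = 5.
```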

**Figure 3. Left:** Levels of cooperation *η* as a function of the average number of rounds *m*, for three different values of the enhancement factor *F* (4, 3 and 2), with *N* = 5 and in the absence of behavioral errors. **Right:** Gradients of selection [5] for the evolutionary game between **AllD** and **AoN** (*b^{−1}* = 0, *N* = 5, *w* = 0.96 or *m* = 25; other model parameters: *Z* = 100 and *β* = 1.0).

The success of **AoN** can also be inferred by assessing its evolutionary chances when interacting with unconditional defectors (**AllD**). To do so, we compute the gradient of selection [5] — *G*(*k*) — which provides information on the most likely direction of change of the population configuration over time. It is given by the difference between the probabilities of increasing and decreasing the number of **AoN** players in a population of *k* **AoN**s and *Z*−*k* **AllD**s. The result, depicted in the right panel of Figure 3, shows a profile characteristic of a coordination game, in which the **AoN** strategy dominates whenever the population accumulates a critical fraction of **AoN** players. Moreover, the size of the coordination barrier decreases with increasing values of the enhancement factor *F*. In the **SI** we further show that the location of the coordination point is rather insensitive to other game parameters, in particular when the number of rounds is large. Notably, the evolutionary chances of the **AoN** strategy remain qualitatively independent of alterations of the first bit (*b^{−1}*). Similarly, we have checked the robustness of the **AoN** strategy when interacting with random strategists (**RS**), i.e., individuals who cooperate or defect with equal probability. It can be shown that both **AoN** and **AllD** are advantageous with respect to **RS** strategists (regardless of their prevalence in the population), while these should drive **AllC** to extinction. Finally, contrary to generalized versions of **TFT** strategies, in the presence of errors the **AoN** strategy is robust against invasion by unconditional cooperators (**AllC**) through random drift, as the former can efficiently exploit the latter.
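The gradient-of-selection computation can be sketched as below (our illustration; the fitness functions are abstract placeholders, whereas in the paper they follow from the average repeated-game payoffs of **AoN** and **AllD** players in randomly sampled groups):

```python
import math

# Sketch (ours) of the gradient of selection G(k) = T+(k) - T-(k) under
# pairwise Fermi imitation. f_aon and f_alld are caller-supplied fitness
# functions of k, the number of AoN players in the population.

def gradient_of_selection(k, Z, beta, f_aon, f_alld):
    if k == 0 or k == Z:
        return 0.0  # monomorphic states: imitation cannot change anything
    pair = (k / Z) * ((Z - k) / Z)  # prob. of drawing one player of each type
    diff = f_aon(k) - f_alld(k)
    t_plus = pair / (1.0 + math.exp(-beta * diff))   # an AllD imitates an AoN
    t_minus = pair / (1.0 + math.exp(beta * diff))   # an AoN imitates an AllD
    return t_plus - t_minus
```

With equal fitness the gradient vanishes everywhere; a coordination profile such as the one in the right panel of Figure 3 arises when *f*\_AoN(*k*) − *f*\_AllD(*k*) changes sign at an interior *k*.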

To sum up, we have shown that the strategy **AoN** emerges as the most viable strategy that leads to the emergence of cooperation under repeated **PGG**s. This strategy, despite its remarkable simplicity, cannot be encoded within the subspace of generalized reciprocators studied before in this context [22]. When we consider individuals capable of making behavioral errors, **AoN** is dominant as suggested by analyzing both the *stationary bit strategy* (Figures 1 and 2) and the stationary distribution in the monomorphic configuration space (**SI**). More importantly, our results suggest that **AoN** dominates independently of the group size and over a wide range of error rates.

Previous works have identified similar strategy principles in different contexts. For instance, the *Win-Stay-Lose-Shift* [39]–[41], [49] strategy discovered in the context of the repeated 2-person Prisoner's Dilemma constitutes the *N* = 2 limit of **AoN**. In the context of repeated N-Person games on the multiverse [34], the strategy entitled *generic Pavlov* [50] encapsulates a behavioral principle which is similar to that underlying **AoN**. In fact, one may argue that the principles underlying **AoN** may very well be ubiquitous: The simplicity of this strategy can be seen as equivalent — in the context of problems of collective action [5], [6], [14] involving *Public Goods Games* — to the simplicity of *tit-for-tat* or *Win-Stay-Lose-Shift* strategies discovered in the context of 2-person direct reciprocity, or the *stern-judging* social norm of indirect reciprocity [51]. In these cases, we observe a fine balance between strict replies towards defective actions and prompt forgiving moves, allowing the emergence of unambiguous decision rules (or norms) that may efficiently recover from past mistakes. Thus, despite the inherent complexity of N-person interactions and the individual capacity to develop complex strategies, it is remarkable how evolution still selects simple key principles that lead to widespread cooperative behaviors.
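The *N* = 2 limit mentioned above is easy to verify explicitly (our sketch): **AoN**'s reactive bits for *N* = 2 prescribe cooperation exactly after mutual cooperation or mutual defection, which coincides with *Win-Stay-Lose-Shift*:

```python
# Quick check (ours) that AoN restricted to N = 2 reproduces
# Win-Stay-Lose-Shift (WSLS) in the repeated Prisoner's Dilemma.

def aon_response(k, N):
    """AoN: cooperate only after a unanimous round (k = 0 or k = N)."""
    return 1 if k in (0, N) else 0

def wsls_response(my_last, opp_last):
    """WSLS: keep your last move after a 'win' (the opponent cooperated),
    switch after a 'loss' (the opponent defected)."""
    return my_last if opp_last == 1 else 1 - my_last

# The two rules agree on all four possible one-round histories:
agreement = all(wsls_response(m, o) == aon_response(m + o, 2)
                for m in (0, 1) for o in (0, 1))
```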

## Supporting Information

### Text S1.

**Supporting text** (containing 4 additional figures) provides additional details concerning the methodology adopted and investigates the impact of *i*) mutation rates and *ii*) the absence of execution errors on the evolutionary dynamics of the repeated *N-person Prisoner's Dilemma*.

https://doi.org/10.1371/journal.pcbi.1003945.s001

(RAR)

## Author Contributions

Conceived and designed the experiments: FLP VVV FCS JMP. Performed the experiments: FLP VVV FCS JMP. Analyzed the data: FLP VVV FCS JMP. Contributed reagents/materials/analysis tools: FLP VVV FCS JMP. Wrote the paper: FLP VVV FCS JMP.

## References

- 1.
Sigmund K (1993) Games of life: explorations in ecology, evolution and behaviour. Oxford, UK: Oxford University Press, Inc.
- 2. Kollock P (1998) Social dilemmas: The anatomy of cooperation. Annu Rev Sociol 24: 183–214.
- 3.
Sigmund K (2010) The calculus of selfishness. Princeton, USA: Princeton University Press.
- 4. Souza MO, Pacheco JM, Santos FC (2009) Evolution of cooperation under N-person snowdrift games. J Theor Biol 260(4): 581–588.
- 5. Pacheco JM, Santos FC, Souza MO, Skyrms B (2009) Evolutionary dynamics of collective action in N-person stag hunt dilemmas. Proc R Soc B 276(1655): 315–321.
- 6. Santos FC, Pacheco JM (2011) Risk of collective failure provides an escape from the tragedy of the commons. Proc Natl Acad Sci USA 108(26): 10421–10425.
- 7. Santos FC, Santos MD, Pacheco JM (2008) Social diversity promotes the emergence of cooperation in public goods games. Nature 454(7201): 213–216.
- 8. Santos MD, Pinheiro FL, Santos FC, Pacheco JM (2012) Dynamics of N-person snowdrift games in structured populations. J Theor Biol 315: 81–86.
- 9. Perc M, Szolnoki A (2010) Coevolutionary games — a mini review. BioSystems 99(2): 109–125.
- 10. Perc M, Gómez-Gardeñes J, Szolnoki A, Floría LM, Moreno Y (2013) Evolutionary dynamics of group interactions on structured populations: a review. J R Soc Interface 10(80): 20120997.
- 11. Wang Z, Szolnoki A, Perc M (2013) Interdependent network reciprocity in evolutionary games. Sci Rep 3: 1183.
- 12. Szolnoki A, Vukov J, Perc M (2014) From pairwise to group interactions in games of cyclic dominance. Phys Rev E 89(6): 062125.
- 13. Sasaki T, Brännström Å, Dieckmann U, Sigmund K (2012) The take-it-or-leave-it option allows small penalties to overcome social dilemmas. Proc Natl Acad Sci USA 109(4): 1165–1169.
- 14. Vasconcelos VV, Santos FC, Pacheco JM (2013) A bottom-up institutional approach to cooperative governance of risky commons. Nature Clim Change 3(9): 797–801.
- 15. Sigmund K, De Silva H, Traulsen A, Hauert C (2010) Social learning promotes institutions for governing the commons. Nature 466(7308): 861–863.
- 16. Hauert C, De Monte S, Hofbauer J, Sigmund K (2002) Volunteering as red queen mechanism for cooperation in public goods games. Science 296(5570): 1129–1132.
- 17. Brandt H, Hauert C, Sigmund K (2006) Punishing and abstaining for public goods. Proc Natl Acad Sci USA 103(2): 495–497.
- 18. Fehr E, Gächter S (2002) Altruistic punishment in humans. Nature 415(6868): 137–140.
- 19. Szabó G, Fáth G (2007) Evolutionary games on graphs. Phys Rep 446(4): 97–216.
- 20. Hardin G (1968) The Tragedy of the Commons. Science 162(3859): 1243–1248.
- 21. Boyd R, Richerson PJ (1988) The evolution of reciprocity in sizable groups. J Theor Biol 132(3): 337–356.
- 22. Van Segbroeck S, Pacheco JM, Lenaerts T, Santos FC (2012) Emergence of fairness in repeated group interactions. Phys Rev Lett 108(15): 158104.
- 23. Kurokawa S, Ihara Y (2009) Emergence of cooperation in public goods games. Proc R Soc B 276(1660): 1379–1384.
- 24. Milinski M, Sommerfeld RD, Krambeck H-J, Reed FA, Marotzke J (2008) The collective-risk social dilemma and the prevention of simulated dangerous climate change. Proc Natl Acad Sci USA 105(7): 2291–2294.
- 25. Bosetti V, Carraro C, Duval R, Tavoni M (2011) What should we expect from innovation? A model-based assessment of the environmental and mitigation cost implications of climate-related R&D. Energ Econ 33(6): 1313–1320.
- 26.
Barrett S (2007) Why cooperate? The incentive to supply global public goods. Oxford, UK: Oxford University Press.
- 27. Barrett S, Dannenberg A (2012) Climate negotiations under scientific uncertainty. Proc Natl Acad Sci USA 109(43): 17372–17376.
- 28.
IPCC (2013) http://www.ipcc.ch/: WMO and UNEP.
- 29. Vasconcelos VV, Santos FC, Pacheco JM, Levin SA (2014) Climate policies under wealth inequality. Proc Natl Acad Sci USA 111: 2212–2216.
- 30.
Klau T (2011) Two challenges for Europe's politicians. http://www.ecfr.eu/content/entry/commentary_two_challenges_for_europes_politician: European Council of Foreign Relations
- 31.
Stiglitz J (2011) Eurozone's problems are political, not economic. The A List. http://blogs.ft.com/the-a-list/2011/07/20/eurozones-problems-are-political-not-economic/- axzz2Rrr973yy: Financial Times.
- 32. Soros G (2013) A European Solution To The Eurozone's Problem. Social Europe Journal http://www.social-europe.eu/2013/04/a-european-solution-to-the-eurozones-problem/: Social Europe Journal.
- 33. Trivers RL (1971) The evolution of reciprocal altruism. Q Rev Biol 35–57.
- 34. Gokhale CS, Traulsen A (2010) Evolutionary games in the multiverse. Proc Natl Acad Sci USA 107(12): 5500–5504.
- 35. Traulsen A, Nowak MA, Pacheco JM (2006) Stochastic dynamics of invasion and fixation. Phys Rev E 74(1): 011909.
- 36. Blume LE (1993) The statistical mechanics of strategic interaction. Game Econ Behav 5(3): 387–424.
- 37. Szabó G, Tőke C (1998) Evolutionary prisoner's dilemma game on a square lattice. Phys Rev E 58(1): 69.
- 38.
Sandholm WH (2010) Population games and evolutionary dynamics. Cambridge, MA, USA: MIT press.
- 39. Nowak M, Sigmund K (1993) A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's Dilemma game. Nature 364(6432): 56–58.
- 40.
Posch M (1997) Win Stay – Lose Shift: An Elementary Learning Rule for Normal Form Games. Working Paper No. 97-06-056e, Santa Fe Institute.
- 41. Imhof LA, Fudenberg D, Nowak MA (2007) Tit-for-tat or win-stay, lose-shift? J Theor Biol 247(3): 574–580.
- 42. Fundenberg D, Maskin E (1990) Evolution and cooperation in noisy repeated games. Am Econ Rev 80(2): 274–279.
- 43. Gale J, Binmore KG, Samuelson L (1995) Learning to be imperfect: The ultimatum game. Game Econ Behav 8(1): 56–90.
- 44. Boyd R (1989) Mistakes allow evolutionary stability in the repeated prisoner's dilemma game. J Theor Biol 136(1): 47–56.
- 45. Nowak MA, Sigmund K, El-Sedy E (1995) Automata, repeated games and noise. J Math Biol 33(7): 703–722.
- 46. Fudenberg D, Imhof L (2005) Imitation Processes with Small Mutations. J Econ Theory 131: 251–262.
- 47. Nowak MA, Sasaki A, Taylor C, Fudenberg D (2004) Emergence of cooperation and evolutionary stability in finite populations. Nature 428(6983): 646–650.
- 48. Pinheiro FL, Santos FC, Pacheco JM (2012) How selection pressure changes the nature of social dilemmas in structured populations. New J Phys 14(7): 073035.
- 49. Kraines D, Kraines V (1995) Evolution of learning among Pavlov strategies in a competitive environment with noise. J Conflict Resolut 39(3): 439–466.
- 50. Hauert C, Schuster HG (1997) Effects of increasing the number of players and memory size in the iterated Prisoner's Dilemma: a numerical approach. Proc R Soc B 264(1381): 513–519.
- 51. Pacheco JM, Santos FC, Chalub FAC (2006) Stern-judging: A simple, successful norm which promotes cooperation under indirect reciprocity. PLoS Comput Biol 2(12): e178.