
Evolution of All-or-None Strategies in Repeated Public Goods Dilemmas

  • Flávio L. Pinheiro ,

    Contributed equally to this work with: Flávio L. Pinheiro, Vítor V. Vasconcelos, Francisco C. Santos, Jorge M. Pacheco

    Affiliations: Centro de Biologia Molecular e Ambiental da Universidade do Minho, Braga, Portugal, INESC-ID & Instituto Superior Técnico, Universidade de Lisboa, Taguspark, Porto Salvo, Portugal, Centro de Física da Universidade do Minho, Braga, Portugal, ATP-group, CMAF, Instituto para a Investigação Interdisciplinar, Lisboa, Portugal

  • Vítor V. Vasconcelos ,


    Affiliations: Centro de Biologia Molecular e Ambiental da Universidade do Minho, Braga, Portugal, INESC-ID & Instituto Superior Técnico, Universidade de Lisboa, Taguspark, Porto Salvo, Portugal, Centro de Física da Universidade do Minho, Braga, Portugal, ATP-group, CMAF, Instituto para a Investigação Interdisciplinar, Lisboa, Portugal

  • Francisco C. Santos ,


    Affiliations: INESC-ID & Instituto Superior Técnico, Universidade de Lisboa, Taguspark, Porto Salvo, Portugal, ATP-group, CMAF, Instituto para a Investigação Interdisciplinar, Lisboa, Portugal

  • Jorge M. Pacheco


    jmpacheco@math.uminho.pt, jorgem.pacheco@gmail.com

    Affiliations: Centro de Biologia Molecular e Ambiental da Universidade do Minho, Braga, Portugal, ATP-group, CMAF, Instituto para a Investigação Interdisciplinar, Lisboa, Portugal, Departamento de Matemática e Aplicações da Universidade do Minho, Braga, Portugal


Abstract

Many problems of cooperation involve repeated interactions among the same groups of individuals. When collective action is at stake, groups often engage in Public Goods Games (PGG), where individuals contribute (or not) to a common pool, subsequently sharing the resources. Such scenarios of repeated group interactions materialize situations in which direct reciprocation to groups may be at work. Here we study direct group reciprocity considering the complete set of reactive strategies, where individuals behave conditionally on what they observed in the previous round. We study both analytically and by computer simulations the evolutionary dynamics encompassing this extensive strategy space, witnessing the emergence of a surprisingly simple strategy that we call All-Or-None (AoN). AoN consists in cooperating only after a round of unanimous group behavior (cooperation or defection), and proves robust in the presence of errors, thus fostering cooperation in a wide range of group sizes. The principles encapsulated in this strategy share a level of complexity reminiscent of that found already in 2-person games under direct and indirect reciprocity, reducing, in fact, to the well-known Win-Stay-Lose-Shift strategy in the limit of the repeated 2-person Prisoner's Dilemma.

Author Summary

The problem of cooperation has been the target of many studies, and some of the most complex dilemmas arise when groups interact repeatedly by means of a Public Goods Game (PGG), where individuals may contribute to a common pool, subsequently sharing the resources. Here we study generalized direct group reciprocity by incorporating the complete set of reactive strategies, where action is dictated by what happened in the previous round. We compute the pervasiveness in time of each possible reactive strategy, and find a ubiquitous strategy profile that prevails throughout evolution, independently of group size and specific PGG parameters, and that also proves robust in the presence of errors. This strategy, which we call All-Or-None (AoN), consists in cooperating only after a round of unanimous group behavior (cooperation or defection); not only is it conceptually very simple, it also ensures that cooperation can self-sustain in a population. AoN contains core principles found, e.g., in the repeated 2-person Prisoner's Dilemma, in which case it reduces to the famous Win-Stay-Lose-Shift strategy.

Introduction

The emergence and sustainability of cooperation constitutes one of the most important problems in social and biological sciences [1]. It revolves around the clash between individual and collective interest, which becomes particularly clear when one considers the evolution of collective action involving Public Goods Games (PGG), such as the stereotypical N-person Prisoner's Dilemma (NPD) [2], [3]. In the absence of additional mechanisms, such as the presence of thresholds [4], [5], risk [6], an embedding network of interactions [7]–[12], institutions [13]–[15], punishment or voluntary participation [16]–[19], evolutionary game theory predicts a population fated to fall into a tragedy of the commons [20].

Collective action problems, however, often involve repeated interactions between members of the same group [21]–[23], as exemplified by the repeated attempts from country leaders to cooperate in reducing emissions of greenhouse gases [6], [24]–[29] or in finding a solution to the Euro monetary crisis [30]–[32]. In such scenarios, where collective action is more difficult to achieve in larger groups [6], one is naturally led to question whether a generalization of the direct reciprocity [33] mechanism to problems of collective action may provide an escape hatch to the aforementioned tragedy of the commons. Moreover, N-player interactions pose many additional difficulties, in particular in what concerns the emergence of reciprocation: if one interacts repeatedly in a group of N players, it is hard to identify towards whom one should reciprocate [3]. In fact, only recently has direct reciprocity been generalized to PGGs [22], [23], by studying the co-evolution of unconditional defectors with generalized reciprocators, that is, individuals who, in a group of size N, only cooperate if there were at least M (0≤M≤N) individuals who cooperated in the previous round. Results show [22], [23] that generalized reciprocators are very successful in promoting cooperation. Moreover, for a given group size N, there is a critical threshold level of fairness, M*, at which reciprocation optimizes the emergence of cooperation [22].

Generalized reciprocators [22] provide an intuitive generalization of the TFT strategy to repeated N-player games. However, and despite the underlying intuition, they constitute but a small subset of all possible individual (reactive) strategies one can envisage in a group of size N.

Here we explore the complete set of reactive strategies that individuals may adopt when engaging in repeated Public Goods Games with N-1 other individuals, assuming that the decision to cooperate or not is based on the behavioral decisions of the group in the previous round (see below). We find that, in the context of Public Goods Games, a reactive strategy not belonging to the set of generalized reciprocators emerges as ubiquitous, ensuring the emergence and sustainability of cooperation.

Models

Let us consider a finite and well-mixed population of Z individuals, who assemble in randomly formed groups of size N and play a repeated version of the NPD [34]. In each round individuals either cooperate (C) by contributing an amount c to a public good or defect (D) by not doing so. The aggregated contributions of the group are multiplied by an enhancement factor F and equally divided among the N individuals of the group. Hence, in each round, Ds achieve a payoff of $\Pi_D(k) = Fck/N$, while Cs attain $\Pi_C(k) = \Pi_D(k) - c$, where k is the number of contributions in that round. We consider a repeated PGG with an undetermined number of rounds, such that at the end of each round, another round will take place with probability w [3], leading to an average number of rounds, m, given by $m = (1-w)^{-1}$. At the beginning of each round (with the exception of the first), each individual decides to contribute (i.e. to play C) or not (i.e. to play D), depending on the total number of contributions that took place in the previous round.
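As a minimal sketch of the per-round payoffs and of the expected game length described above, one could write the following Python functions. The function names and the default contribution c = 1 are illustrative assumptions rather than part of the original model description.

```python
def payoff_defector(k, N, F, c=1.0):
    """Per-round payoff of a defector when k of the N group members contribute."""
    return F * c * k / N


def payoff_cooperator(k, N, F, c=1.0):
    """Per-round payoff of a cooperator: same share of the pool, minus the cost c."""
    return payoff_defector(k, N, F, c) - c


def expected_rounds(w):
    """Average number of rounds when each round is followed by another with probability w."""
    return 1.0 / (1.0 - w)
```

For any F < N, a player who switches from C to D gains c(1 − F/N) > 0 in that round, whatever the others do, which is the source of the dilemma in the one-shot game.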

Each strategy $S_i$ defines how an individual behaves in each round (i.e. if she/he decides to cooperate or defect) and is encoded in a string with N+2 bits ($b_{-1} b_0 b_1 \ldots b_{N-1} b_N$). The first bit ($b_{-1}$) dictates the behavior in the initial round, while the remaining N+1 bits ($b_0 b_1 \ldots b_{N-1} b_N$) correspond in sequence to the player's behavior depending on the number of Cs in the previous round. In this definition a bit 1 corresponds to a cooperative act and a bit 0 to a defective one. Hence, one obtains a maximum of $2^{N+2}$ strategies, corresponding to all possible combinations of 0s and 1s in a string of size N+2.
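The bit-string encoding can be illustrated with a short Python sketch. The tuple layout (index 0 holds the first-round bit, index q+1 holds the bit used after q contributions) and all function names are assumptions made for illustration; the AoN profile used as an example is the one named in the Abstract.

```python
from itertools import product


def all_strategies(N):
    """All 2**(N+2) bit strings (b_-1, b_0, ..., b_N) for group size N."""
    return list(product((0, 1), repeat=N + 2))


def aon_strategy(N, first_move=1):
    """All-Or-None profile: cooperate only after 0 or N contributions."""
    return (first_move, 1) + (0,) * (N - 1) + (1,)


def intended_action(strategy, k_prev=None):
    """Intended move (1 = C, 0 = D); k_prev is None in the first round."""
    if k_prev is None:
        return strategy[0]
    return strategy[k_prev + 1]
```

For N = 5 this yields 128 strategies, and `aon_strategy(5)` is the string 1 1 0 0 0 0 1.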

We consider groups of N individuals, randomly sampled from a finite population of size Z, playing a repeated NPD. Individuals revise their strategies through the Fermi update rule [35]–[38], a stochastic birth-death process with mutations. At each time step a randomly selected individual A (with strategy $S_A$ and fitness $f_A$) may adopt a different strategy i) by mutation with probability μ or ii) by imitating a random member B of the population (with strategy $S_B$ and fitness $f_B$) with probability $p = [1 + e^{-\beta (f_B - f_A)}]^{-1}$, where β is the intensity of selection that regulates the randomness of the decision process. The fitness of each strategy is the average payoff attained over all rounds and possible groups by individuals adopting strategy $S_i$. It is well known that execution errors profoundly affect the evolutionary dynamics of repeated 2-person games [39]–[45]. Consequently, we shall also consider that, in each round, and after deciding to contribute or not according to $b_q$, an individual may act with the opposite behavior ($1 - b_q$) with a probability ε, thus making an execution error.
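The update rule and the execution error can be sketched in Python as follows. This is only an illustration under stated assumptions: `fitness_of` is a hypothetical callable returning the fitness of a strategy, a mutation draws uniformly from `strategy_space` (and may therefore redraw the current strategy), and the imitation probability uses the standard Fermi function.

```python
import math
import random


def fermi_probability(f_a, f_b, beta):
    """Probability that A (fitness f_a) imitates B (fitness f_b)."""
    return 1.0 / (1.0 + math.exp(-beta * (f_b - f_a)))


def realized_action(intended, eps, rng=random):
    """Flip the intended move (1 = C, 0 = D) with probability eps."""
    return 1 - intended if rng.random() < eps else intended


def update_step(population, fitness_of, beta, mu, strategy_space, rng=random):
    """One time step: individual A mutates (prob. mu) or may imitate a random B."""
    a = rng.randrange(len(population))
    if rng.random() < mu:
        population[a] = rng.choice(strategy_space)  # assumed uniform mutation
        return
    b = rng.randrange(len(population))
    p = fermi_probability(fitness_of(population[a]), fitness_of(population[b]), beta)
    if rng.random() < p:
        population[a] = population[b]
```

For β close to 0 the imitation probability approaches 1/2 regardless of fitness, while for large β the fitter strategy is copied almost deterministically.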

Results/Discussion

Let us start by investigating the evolutionary dynamics of the population in the small mutation limit approximation [46]. This allows us to compute analytically the relative pervasiveness of each strategy in time. It is noteworthy, however, that the results obtained through this approximation remain valid for a wide range of mutation probabilities, as we show explicitly in the Supporting Information (SI) via comparison with results from computer simulations. In a nutshell, whenever mutations are rare, a new mutant that appears in the population will either go extinct or invade the entire population before the occurrence of the next mutation. Hence, in each time step there will be, at most, 2 strategies present in the population, which allows one to describe the evolutionary dynamics of the population in terms of an embedded (and reduced) Markov chain with a size equal to the number of strategies available. Each state represents a monomorphic population adopting a given strategy, whereas transitions are defined by the fixation probabilities of a single mutant [47]. The resulting stationary distribution $\tau_i$ will then indicate the fraction of time the population spends in each of the $2^{N+2}$ states (or strategies $S_i$). We shall also make use of $\tau_i$ to compute the fraction of time the population spends in a configuration/strategy with $b_q^i = 1$, a quantity we call stationary bit strategy, defined as $\bar{b}_q = \sum_i \tau_i b_q^i$, where $b_q^i$ corresponds to the bit q of strategy i. The stationary bit strategy allows us to easily quantify the relative dominance of each behavior and extract the most pervasive strategic profiles.
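A compact Python sketch of the small-mutation-limit calculation is given below. It is not the authors' code: it assumes the standard closed form for the fixation probability under the Fermi rule, assumes that a mutant is drawn uniformly from the other strategies, and obtains the stationary distribution by plain power iteration. The callables `f_mutant(k)` and `f_resident(k)` (fitness when k mutants are present) are hypothetical inputs.

```python
import math


def fixation_probability(f_mutant, f_resident, Z, beta):
    """Fixation probability of a single mutant in a resident population of size Z."""
    total, prod = 1.0, 1.0
    for k in range(1, Z):
        # Ratio T-(k)/T+(k) under the Fermi rule.
        prod *= math.exp(-beta * (f_mutant(k) - f_resident(k)))
        total += prod
    return 1.0 / total


def stationary_distribution(rho, iterations=2000):
    """rho[i][j]: fixation probability of one j-mutant among i-residents."""
    n = len(rho)
    T = [[rho[i][j] / (n - 1) if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        T[i][i] = 1.0 - sum(T[i])  # probability of staying in state i
    tau = [1.0 / n] * n
    for _ in range(iterations):
        tau = [sum(tau[i] * T[i][j] for i in range(n)) for j in range(n)]
    return tau


def stationary_bits(tau, strategies):
    """Time-weighted frequency of bit value 1 at each position of the bit string."""
    length = len(strategies[0])
    return [sum(t * s[q] for t, s in zip(tau, strategies)) for q in range(length)]
```

Under neutrality (equal fitness) the first function returns 1/Z, as expected. For the full strategy space a linear-algebra eigenvector routine would be faster than power iteration; the loop is kept here only to stay dependency-free.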

Figure 1 shows the stationary bit distribution, $\bar{b}_q$, for different group sizes. Colored cells highlight those bits (bq) that retain the same value more than 75% of the time, with $\bar{b}_q \geq 0.75$ (blue) and $\bar{b}_q \leq 0.25$ (red). For simplicity, we call such a bit a dominant bit.

Figure 1. Stationary bit distribution as a function of N.

Each bit (square) corresponds to the weighted sum of the fraction of time (i.e. the analytically computed stationary distribution) the population spends in strategy configurations in which bq = 1. Blue (red) cells identify those bits that are employed at least ¾ of the time with value bq = 1 (bq = 0). The analysis covers group sizes (N) between 2 and 10 (rows). Other model parameters: Z = 100, β = 1.0, F/N = 0.85, w = 0.96, ε = 0.05, μ≪1/Z.

http://dx.doi.org/10.1371/journal.pcbi.1003945.g001

Analysis of the stationary bit distributions for different group sizes under small error probabilities highlights the overall evolutionary success of strategies that conform with a particular profile: b0 = bN = 1 and bq = 0 for 0<q<N. A similar trend is obtained if instead we analyze the stationary distribution τi for all possible strategies Si: this strategy, or minor variations on this profile (see below), shows the highest prevalence for a wide range of parameters even in the absence of errors of execution (see SI). The philosophy encapsulated in this strategy is a simple yet efficient one: cooperating only after a round of unanimous group behavior (cooperation or defection). Hence we refer to this strategy as All-Or-None (AoN), highlighting the two situations in which these individuals are prone to cooperate. As group size increases, so does the number of expected errors per round, which leads to an overall reduction of the number of dominant bits found in the intermediate sector (i.e. bq for 0<q<N) without affecting the “edge bits”, which again reveals the prevalence of AoN behaviour in the population.

A monomorphic population of AoN players can easily sustain unanimous group cooperation, even in the presence of errors. Indeed, after an occasional individual defection the group is no longer unanimous, so a round of full defection ensues; this restores unanimity, and the group returns to full cooperation in the following round. Therefore, AoN allows a prompt recovery from errors of execution, which constitutes a key feature that allows cooperation to thrive.
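The recovery sequence just described can be traced with a small deterministic Python sketch. It assumes a group made only of AoN players, a preceding round of unanimous cooperation, and a single injected error; the function names and the `error_at` argument are illustrative.

```python
def aon_move(k_prev, N):
    """AoN: cooperate (1) only if the previous round had 0 or N contributions."""
    return 1 if k_prev in (0, N) else 0


def play_rounds(N, rounds, error_at=None):
    """Number of contributions per round in a group of N AoN players.

    error_at = (round_index, player_index) flips that single move.
    """
    history = []
    k_prev = N  # assumption: the group starts from unanimous cooperation
    for r in range(rounds):
        moves = [aon_move(k_prev, N) for _ in range(N)]
        if error_at is not None and error_at[0] == r:
            moves[error_at[1]] = 1 - moves[error_at[1]]
        k_prev = sum(moves)
        history.append(k_prev)
    return history


print(play_rounds(5, 5, error_at=(1, 0)))  # prints [5, 4, 0, 5, 5]
```

A single mistaken defection in round 1 costs the group one round of full defection, after which unanimous cooperation resumes.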

To investigate the robustness of AoN we show, in Figure 2, the effect of execution errors on the stationary bit distribution ($\bar{b}_q$) for a fixed group size (here N = 5). Clearly, both b0 and bN remain associated with cooperation for a wide range of error probabilities (ε≤0.2). The internal bits, in turn, remain qualitatively close to the AoN profile (i.e. bq = 0 for 0<q<N), undergoing changes as the error rate increases that allow an efficient return to full cooperation after (at least) one behavioral error. In particular, for 0.01<ε<0.1, evolution selects for defection in bits b1 to bN−1, most strongly in the bits adjacent to b0 and bN, allowing a fast error recovery. This feature gets enhanced with increasing ε. For larger values of ε (ε>0.1), unanimity becomes less likely and we witness an adaptation of the predominant strategy that acts to reduce the interval of bits in which defection prevails. In other words, it is as if execution errors redefine the notion of unanimity itself or, alternatively, individuals become more tolerant as execution errors become more likely. It is also noteworthy that the non-monotonic response to errors shown in Figure 2 has been previously observed in other evolutionary models of cooperation [48], where intermediate degrees of stochasticity emerge as maximizers of cooperation. We confirmed that the results remain qualitatively equivalent for different group sizes.

Figure 2. Stationary bit distribution as a function of the error rate.

We plot (log-linear scale) the fraction of time the population spends in a strategy with bq = 1 for a broad range of error probabilities ε. Circles on the left indicate the values obtained for ε = 0.0, gray areas show the range of values for which bits were defined to have a dominant behavior. Note that for ε = 0.5 all strategies behave randomly. The bar plot on the right shows the results for ε = 0.06 (vertical dashed line). Other model parameters: Z = 100, β = 1.0, N = 5, F/N = 0.85, w = 0.96 and μ≪1/Z.

http://dx.doi.org/10.1371/journal.pcbi.1003945.g002

In the following we assess whether the introduction of this strategy can efficiently promote the average fraction of cooperative actions. The level of cooperation, η, may be defined as the average number of contributions per round divided by the maximum number of contributions possible. Denoting by $C_i$ the average number of contributions per round associated with strategy $S_i$, η reads $\eta = \frac{1}{N} \sum_i \tau_i C_i$, where $\tau_i$ is the fraction of time the population spends in the configuration $S_i$ and N is the group size. As shown in Figure 3, the overall levels of cooperation remain high as long as the average number of rounds is sizeable (left panel, for different values of the PGG enhancement factor F).
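One possible way to evaluate the level of cooperation numerically is sketched below in Python. Several choices here are assumptions rather than the authors' stated procedure: the group is taken to be monomorphic, round t is weighted by (1 − w)w^t (the probability that the game ends exactly after that round), the sum is truncated at `max_rounds`, and all names are illustrative. Strategies are tuples (b_-1, b_0, ..., b_N).

```python
from math import comb


def binomial_pmf(n, p):
    """Probabilities of 0..n successes in n independent trials."""
    return [comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(n + 1)]


def mean_contributions(strategy, N, w, eps, max_rounds=2000):
    """Weighted average number of contributions per round in a monomorphic group."""
    def coop_prob(bit):
        return 1 - eps if bit == 1 else eps  # execution error flips the move

    dist = binomial_pmf(N, coop_prob(strategy[0]))  # first round
    step = [binomial_pmf(N, coop_prob(strategy[k + 1])) for k in range(N + 1)]
    total, weight = 0.0, 1.0 - w
    for _ in range(max_rounds):
        total += weight * sum(k * pk for k, pk in enumerate(dist))
        dist = [sum(dist[k] * step[k][j] for k in range(N + 1)) for j in range(N + 1)]
        weight *= w
    return total


def cooperation_level(tau, contributions, N):
    """eta = (1/N) * sum_i tau_i * C_i."""
    return sum(t * c for t, c in zip(tau, contributions)) / N
```

Without errors, an unconditional cooperator contributes in every round, so its mean is N, while an unconditional defector gives 0; weighting these by the stationary distribution yields η.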

Figure 3. Level of cooperation as a function of the average number of rounds (left) and gradients of selection (right).

Left: Level of cooperation as a function of the average number of rounds m for three different values of the enhancement factor F (4, 3 and 2) with N = 5 and in the absence of behavioral errors. Right: Gradients of selection [5] for the evolutionary game between AllD and AoN ($b_{-1} = 0$, N = 5, w = 0.96 or m = 25; other model parameters: Z = 100 and β = 1.0).

http://dx.doi.org/10.1371/journal.pcbi.1003945.g003

The success of AoN can also be inferred by assessing its evolutionary chances when interacting with unconditional defectors (AllD). To do so, we compute the gradient of selection G(k) [5], which provides information on the most likely direction of change of the population configuration with time. This is given by the difference between the probabilities of increasing and decreasing the number of AoN players in a population of k AoNs and Z−k AllDs. The result is depicted in the right panel of Figure 3, a profile characteristic of a coordination game, in which case the AoN strategy dominates whenever the population accumulates a critical fraction of AoN players. Moreover, the size of the coordination barrier decreases with increasing values of the enhancement factor F. In the SI we further show that the location of the coordination point is rather insensitive to other game parameters, in particular when the number of rounds is large. Notably, the evolutionary chances of the AoN strategy remain qualitatively independent of alterations to the first bit ($b_{-1}$). Similarly, we have checked the robustness of the AoN strategy when interacting with random strategists (RS), i.e., individuals that cooperate or defect with equal probability. It can be shown that both AoN and AllD are advantageous with respect to RS strategists (regardless of their prevalence in the population), while these should drive AllC to extinction. Finally, contrary to the generalized versions of TFT strategies, in the presence of errors, the AoN strategy is robust to invasion of unconditional cooperators (AllC) by random drift, as the former can efficiently exploit the latter.
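The gradient of selection can be sketched in Python as the difference between the two transition probabilities. The sketch assumes imitation only (no mutation), sampling of the two individuals with replacement, and the Fermi imitation probability; `f_aon(k)` and `f_alld(k)` are hypothetical callables giving the average fitness of each strategy when k AoN players are present.

```python
import math


def fermi(f_self, f_model, beta):
    """Probability of imitating a model with fitness f_model."""
    return 1.0 / (1.0 + math.exp(-beta * (f_model - f_self)))


def gradient_of_selection(k, Z, beta, f_aon, f_alld):
    """G(k) = T+(k) - T-(k) for k AoN players and Z - k AllD players."""
    if k == 0 or k == Z:
        return 0.0  # monomorphic states: nobody to imitate
    pair = (k / Z) * ((Z - k) / Z)  # chance of picking one player of each type
    t_plus = pair * fermi(f_alld(k), f_aon(k), beta)   # an AllD adopts AoN
    t_minus = pair * fermi(f_aon(k), f_alld(k), beta)  # an AoN adopts AllD
    return t_plus - t_minus


def coordination_point(Z, beta, f_aon, f_alld):
    """Smallest k at which G changes from negative to non-negative, else None."""
    for k in range(2, Z):
        before = gradient_of_selection(k - 1, Z, beta, f_aon, f_alld)
        here = gradient_of_selection(k, Z, beta, f_aon, f_alld)
        if before < 0 <= here:
            return k
    return None
```

With toy fitness functions such as `f_aon = lambda k: 2.0 * k / 100` and `f_alld = lambda k: 1.0`, G(k) is negative below k = 50 and positive above it, the signature of a coordination game.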

To sum up, we have shown that the strategy AoN emerges as the most viable strategy that leads to the emergence of cooperation under repeated PGGs. This strategy, despite its remarkable simplicity, cannot be encoded within the subspace of generalized reciprocators studied before in this context [22]. When we consider individuals capable of making behavioral errors, AoN is dominant as suggested by analyzing both the stationary bit strategy (Figures 1 and 2) and the stationary distribution in the monomorphic configuration space (SI). More importantly, our results suggest that AoN dominates independently of the group size and over a wide range of error rates.

Previous works have identified similar strategy principles in different contexts. For instance, the Win-Stay-Lose-Shift [39]–[41], [49] strategy discovered in the context of the repeated 2-person Prisoner's Dilemma constitutes the N = 2 limit of AoN. In the context of repeated N-Person games on the multiverse [34], the strategy entitled generic Pavlov [50] encapsulates a behavioral principle which is similar to that underlying AoN. In fact, one may argue that the principles underlying AoN may very well be ubiquitous: the simplicity of this strategy can be seen as equivalent, in the context of problems of collective action [5], [6], [14] involving Public Goods Games, to the simplicity of tit-for-tat or Win-Stay-Lose-Shift strategies discovered in the context of 2-person direct reciprocity, or the stern-judging social norm of indirect reciprocity [51]. In these cases, we observe a fine balance between strict replies towards defective actions and prompt forgiving moves, allowing the emergence of unambiguous decision rules (or norms) that may efficiently recover from past mistakes. Thus, despite the inherent complexity of N-person interactions and the individual capacity to develop complex strategies, it is remarkable how evolution still selects simple key principles that lead to widespread cooperative behaviors.
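The N = 2 correspondence with Win-Stay-Lose-Shift can be checked exhaustively with a few lines of Python. The sketch uses the usual reading of Win-Stay-Lose-Shift in the Prisoner's Dilemma (keep the previous move if the co-player cooperated, switch otherwise); the function names are illustrative.

```python
def aon_move(k_prev, N):
    """AoN: cooperate (1) only after 0 or N contributions in the previous round."""
    return 1 if k_prev in (0, N) else 0


def wsls_move(my_prev, other_prev):
    """Win-Stay-Lose-Shift: stay if the co-player cooperated, shift otherwise."""
    return my_prev if other_prev == 1 else 1 - my_prev


def aon_matches_wsls():
    """Compare both rules over the four possible previous rounds of a 2-person game."""
    return all(
        aon_move(me + other, 2) == wsls_move(me, other)
        for me in (0, 1)
        for other in (0, 1)
    )


print(aon_matches_wsls())  # prints True
```

Both rules cooperate after mutual cooperation or mutual defection and defect after a mixed round.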

Supporting Information

Text S1.

Supporting text. The supporting text (containing 4 additional figures) provides additional details concerning the methodology adopted and investigates the impact of i) mutation rates and ii) the absence of execution errors on the evolutionary dynamics of the repeated N-person Prisoner's Dilemma.

doi:10.1371/journal.pcbi.1003945.s001

(RAR)

Author Contributions

Conceived and designed the experiments: FLP VVV FCS JMP. Performed the experiments: FLP VVV FCS JMP. Analyzed the data: FLP VVV FCS JMP. Contributed reagents/materials/analysis tools: FLP VVV FCS JMP. Wrote the paper: FLP VVV FCS JMP.

References

  1. Sigmund K (1993) Games of life: explorations in ecology, evolution and behaviour. Oxford, UK: Oxford University Press, Inc.
  2. Kollock P (1998) Social dilemmas: The anatomy of cooperation. Annu Rev Sociol 24: 183–214. doi: 10.1146/annurev.soc.24.1.183
  3. Sigmund K (2010) The calculus of selfishness. Princeton, USA: Princeton University Press.
  4. Souza MO, Pacheco JM, Santos FC (2009) Evolution of cooperation under N-person snowdrift games. J Theor Biol 260(4): 581–588. doi: 10.1016/j.jtbi.2009.07.010
  5. Pacheco JM, Santos FC, Souza MO, Skyrms B (2009) Evolutionary dynamics of collective action in N-person stag hunt dilemmas. Proc R Soc B 276(1655): 315–321. doi: 10.1098/rspb.2008.1126
  6. Santos FC, Pacheco JM (2011) Risk of collective failure provides an escape from the tragedy of the commons. Proc Natl Acad Sci USA 108(26): 10421–10425. doi: 10.1073/pnas.1015648108
  7. Santos FC, Santos MD, Pacheco JM (2008) Social diversity promotes the emergence of cooperation in public goods games. Nature 454(7201): 213–216. doi: 10.1038/nature06940
  8. Santos MD, Pinheiro FL, Santos FC, Pacheco JM (2012) Dynamics of N-person snowdrift games in structured populations. J Theor Biol 315: 81–86. doi: 10.1016/j.jtbi.2012.09.001
  9. Perc M, Szolnoki A (2010) Coevolutionary games — a mini review. BioSystems 99(2): 109–125. doi: 10.1016/j.biosystems.2009.10.003
  10. Perc M, Gómez-Gardeñes J, Szolnoki A, Floría LM, Moreno Y (2013) Evolutionary dynamics of group interactions on structured populations: a review. J R Soc Interface 10(80): 20120997. doi: 10.1098/rsif.2012.0997
  11. Wang Z, Szolnoki A, Perc M (2013) Interdependent network reciprocity in evolutionary games. Sci Rep 3: 1183. doi: 10.1038/srep01183
  12. Szolnoki A, Vukov J, Perc M (2014) From pairwise to group interactions in games of cyclic dominance. Phys Rev E 89(6): 062125. doi: 10.1103/physreve.89.062125
  13. Sasaki T, Brännström Å, Dieckmann U, Sigmund K (2012) The take-it-or-leave-it option allows small penalties to overcome social dilemmas. Proc Natl Acad Sci USA 109(4): 1165–1169. doi: 10.1073/pnas.1115219109
  14. Vasconcelos VV, Santos FC, Pacheco JM (2013) A bottom-up institutional approach to cooperative governance of risky commons. Nature Clim Change 3(9): 797–801. doi: 10.1038/nclimate1927
  15. Sigmund K, De Silva H, Traulsen A, Hauert C (2010) Social learning promotes institutions for governing the commons. Nature 466(7308): 861–863. doi: 10.1038/nature09203
  16. Hauert C, De Monte S, Hofbauer J, Sigmund K (2002) Volunteering as red queen mechanism for cooperation in public goods games. Science 296(5570): 1129–1132. doi: 10.1126/science.1070582
  17. Brandt H, Hauert C, Sigmund K (2006) Punishing and abstaining for public goods. Proc Natl Acad Sci USA 103(2): 495–497. doi: 10.1073/pnas.0507229103
  18. Fehr E, Gächter S (2002) Altruistic punishment in humans. Nature 415(6868): 137–140. doi: 10.1038/415137a
  19. Szabó G, Fáth G (2007) Evolutionary games on graphs. Phys Rep 446(4): 97–216. doi: 10.1016/j.physrep.2007.04.004
  20. Hardin G (1968) The Tragedy of the Commons. Science 162(3859): 1243–1248. doi: 10.1126/science.162.3859.1243
  21. Boyd R, Richerson PJ (1988) The evolution of reciprocity in sizable groups. J Theor Biol 132(3): 337–356. doi: 10.1016/s0022-5193(88)80219-4
  22. Van Segbroeck S, Pacheco JM, Lenaerts T, Santos FC (2012) Emergence of fairness in repeated group interactions. Phys Rev Lett 108(15): 158104. doi: 10.1103/physrevlett.108.158104
  23. Kurokawa S, Ihara Y (2009) Emergence of cooperation in public goods games. Proc R Soc B 276(1660): 1379–1384. doi: 10.1098/rspb.2008.1546
  24. Milinski M, Sommerfeld RD, Krambeck H-J, Reed FA, Marotzke J (2008) The collective-risk social dilemma and the prevention of simulated dangerous climate change. Proc Natl Acad Sci USA 105(7): 2291–2294. doi: 10.1073/pnas.0709546105
  25. Bosetti V, Carraro C, Duval R, Tavoni M (2011) What should we expect from innovation? A model-based assessment of the environmental and mitigation cost implications of climate-related R&D. Energ Econ 33(6): 1313–1320. doi: 10.1016/j.eneco.2011.02.010
  26. Barrett S (2007) Why cooperate? The incentive to supply global public goods. Oxford, UK: Oxford University Press.
  27. Barrett S, Dannenberg A (2012) Climate negotiations under scientific uncertainty. Proc Natl Acad Sci USA 109(43): 17372–17376. doi: 10.1073/pnas.1208417109
  28. IPCC (2013) http://www.ipcc.ch/: WMO and UNEP.
  29. Vasconcelos VV, Santos FC, Pacheco JM, Levin SA (2014) Climate policies under wealth inequality. Proc Natl Acad Sci USA 111: 2212–2216. doi: 10.1073/pnas.1323479111
  30. Klau T (2011) Two challenges for Europe's politicians. http://www.ecfr.eu/content/entry/commentary_two_challenges_for_europes_politician: European Council of Foreign Relations
  31. Stiglitz J (2011) Eurozone's problems are political, not economic. The A List. http://blogs.ft.com/the-a-list/2011/07/20/eurozones-problems-are-political-not-economic/- axzz2Rrr973yy: Financial Times.
  32. Soros G (2013) A European Solution To The Eurozone's Problem. Social Europe Journal http://www.social-europe.eu/2013/04/a-european-solution-to-the-eurozones-problem/: Social Europe Journal.
  33. Trivers RL (1971) The evolution of reciprocal altruism. Q Rev Biol 35–57. doi: 10.1086/406755
  34. Gokhale CS, Traulsen A (2010) Evolutionary games in the multiverse. Proc Natl Acad Sci USA 107(12): 5500–5504. doi: 10.1073/pnas.0912214107
  35. Traulsen A, Nowak MA, Pacheco JM (2006) Stochastic dynamics of invasion and fixation. Phys Rev E 74(1): 011909. doi: 10.1103/physreve.74.011909
  36. Blume LE (1993) The statistical mechanics of strategic interaction. Game Econ Behav 5(3): 387–424. doi: 10.1006/game.1993.1023
  37. Szabó G, Tőke C (1998) Evolutionary prisoner's dilemma game on a square lattice. Phys Rev E 58(1): 69. doi: 10.1103/physreve.58.69
  38. Sandholm WH (2010) Population games and evolutionary dynamics. Cambridge, MA, USA: MIT press.
  39. Nowak M, Sigmund K (1993) A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's Dilemma game. Nature 364(6432): 56–58. doi: 10.1038/364056a0
  40. Posch M (1997) Win Stay – Lose Shift: An Elementary Learning Rule for Normal Form Games. Working Paper No. 97-06-056e, Santa Fe Institute.
  41. Imhof LA, Fudenberg D, Nowak MA (2007) Tit-for-tat or win-stay, lose-shift? J Theor Biol 247(3): 574–580. doi: 10.1016/j.jtbi.2007.03.027
  42. Fudenberg D, Maskin E (1990) Evolution and cooperation in noisy repeated games. Am Econ Rev 80(2): 274–279.
  43. Gale J, Binmore KG, Samuelson L (1995) Learning to be imperfect: The ultimatum game. Game Econ Behav 8(1): 56–90. doi: 10.1016/s0899-8256(05)80017-x
  44. Boyd R (1989) Mistakes allow evolutionary stability in the repeated prisoner's dilemma game. J Theor Biol 136(1): 47–56. doi: 10.1016/s0022-5193(89)80188-2
  45. Nowak MA, Sigmund K, El-Sedy E (1995) Automata, repeated games and noise. J Math Biol 33(7): 703–722. doi: 10.1007/bf00184645
  46. Fudenberg D, Imhof L (2005) Imitation Processes with Small Mutations. J Econ Theory 131: 251–262. doi: 10.1016/j.jet.2005.04.006
  47. Nowak MA, Sasaki A, Taylor C, Fudenberg D (2004) Emergence of cooperation and evolutionary stability in finite populations. Nature 428(6983): 646–650. doi: 10.1038/nature02414
  48. Pinheiro FL, Santos FC, Pacheco JM (2012) How selection pressure changes the nature of social dilemmas in structured populations. New J Phys 14(7): 073035. doi: 10.1088/1367-2630/14/7/073035
  49. Kraines D, Kraines V (1995) Evolution of learning among Pavlov strategies in a competitive environment with noise. J Conflict Resolut 39(3): 439–466. doi: 10.1177/0022002795039003003
  50. Hauert C, Schuster HG (1997) Effects of increasing the number of players and memory size in the iterated Prisoner's Dilemma: a numerical approach. Proc R Soc B 264(1381): 513–519. doi: 10.1098/rspb.1997.0073
  51. Pacheco JM, Santos FC, Chalub FAC (2006) Stern-judging: A simple, successful norm which promotes cooperation under indirect reciprocity. PLoS Comput Biol 2(12): e178. doi: 10.1371/journal.pcbi.0020178