Indirect punishment can outperform direct punishment in promoting cooperation in structured populations

Yujia Wen; Zhixue He; Chen Shen; Jun Tanimoto

doi:10.1371/journal.pcbi.1013068

Abstract

Indirect punishment traditionally sustains cooperation in social systems through reputation or norms, often by reducing defectors’ payoffs indirectly. In this study, we redefine indirect punishment for structured populations as a spatially explicit mechanism, where individuals on a square lattice target second-order defectors—those harming their neighbors—rather than their own immediate defectors, guided by the principle: “I help you by punishing those who defect against you”. Using evolutionary simulations, we compare this adapted indirect punishment to direct punishment, where individuals punish immediate defectors. Results show that within a narrow range of low punishment costs and fines, adapted indirect punishment outperforms direct punishment in promoting cooperation. However, outside this cost-fine region, outcomes vary: direct punishment may excel, both may be equally effective, or neither improves cooperation, depending on the parameter values. These findings hold even when network reciprocity alone does not support cooperation. Notably, when adapted indirect punishment outperforms direct punishment in promoting cooperation, defectors face stricter penalties without appreciably increasing punishers’ costs, making it more efficient than direct punishment. Overall, our findings provide insights into the role of indirect punishment in structured populations and highlight its importance in understanding the evolution of cooperation.

Author summary

Punishment is often considered a key mechanism for maintaining cooperation in human and biological systems. Traditionally, indirect punishment relies on reputation or social norms, often by withholding cooperation, whereas we adapt this concept to penalize second-order defectors in a spatial setting. In our study, individuals on a square lattice punish those who harm their neighbors, rather than their own direct defectors. Through evolutionary simulations, we compare this adapted indirect punishment strategy with a direct punishment strategy, in which individuals punish their own immediate defecting neighbors. We find that under certain conditions, adapted indirect punishment is more effective at sustaining cooperation while incurring lower costs for punishers. These results enhance our understanding of how different forms of punishment influence cooperation and offer insights into their optimal use in resolving social dilemmas.

Citation: Wen Y, He Z, Shen C, Tanimoto J (2025) Indirect punishment can outperform direct punishment in promoting cooperation in structured populations. PLoS Comput Biol 21(6): e1013068. https://doi.org/10.1371/journal.pcbi.1013068

Editor: Feng Fu, Dartmouth College, UNITED STATES OF AMERICA

Received: October 2, 2024; Accepted: April 22, 2025; Published: June 2, 2025

Copyright: © 2025 Wen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The simulation code used to obtain the results of this work is openly available on GitHub: https://github.com/Yujia-WEN/Indirect-reward-outperform-direct-punishment.

Funding: This work was supported by the Japan Society for the Promotion of Science (JSPS) through the Grant-in-Aid for Scientific Research (KAKENHI), under Grant Numbers JP23H03499 (to C.S. and J.T.) and JP20H02314 (to J.T.), as well as by the Kakihara Science and Technology Foundation (FY2017–2018) and the Pfizer Health Research Foundation (FY2024–2025), both to J.T. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

The advancement of society relies on cooperation among individuals to maximize collective benefits [1–4]. However, cooperation imposes individual costs while benefiting others, making it vulnerable to exploitation by selfish free-riders [5]. Understanding how cooperation emerges among strangers, particularly in one-shot and anonymous encounters [6], is crucial because these settings artificially remove key mechanisms that typically sustain cooperation. Such mechanisms include repeated interactions [7–10], reputation effects [11,12], and other reciprocity mechanisms [13], creating the most challenging conditions for cooperation to emerge. Yet, despite these obstacles, economic game experiments consistently show that people do cooperate, with cooperation levels stabilizing at lower but persistent rates [14,15]. Evidence suggests that cooperation in such scenarios may result from prosocial preferences or confusion about the game structure [16–19]. Nevertheless, researchers continue to explore mechanisms to sustain cooperation, particularly through costly punishment [6,20]. Beyond human interactions, punishment is widely observed in nature [21,22]. In animal populations, individuals often use punishment to constrain non-cooperative or deceptive behaviors, thereby promoting population stability [23]. These findings suggest that punishment is a fundamental enforcement mechanism shaping cooperative behavior across biological and social systems.

Costly punishment generally takes two forms: direct punishment and indirect punishment. Direct punishment occurs when an individual penalizes a defector who has personally harmed them [20,24–26]. While this strategy can deter defection, it imposes a cost on the punishers and risks retaliation, making it evolutionarily unstable unless supported by additional mechanisms such as reputation effects [27], voluntary participation [28], or network reciprocity [29–35]. By contrast, indirect punishment occurs when an individual punishes a defector who has harmed someone else, even though the punisher was not personally affected. This enforcement mechanism helps maintain cooperation by discouraging selfish behavior on a broader social scale. A well-documented example is indirect reciprocity, where individuals refuse to cooperate with those who have previously defected against someone else [12,36–38]. Experimental studies suggest that even when indirect punishment is rarely used, its mere availability can sustain cooperative behavior by reinforcing prosocial norms [37]. Unlike direct punishment, which requires personal retaliation, indirect punishment typically relies on social mechanisms such as reputation systems to track defectors’ past actions and coordinate sanctions [11,12,38]. While it may lower retaliation risks, it still incurs costs, such as monitoring effort and the risk of misjudging defectors.

Owing to these differences, the evolutionary dynamics of direct punishment have been extensively studied in structured populations, where network structures stabilize it by balancing second-order free-riding (cooperators who do not punish) and antisocial punishment (targeting cooperators), thus ensuring their survival [29,30,32]. By contrast, indirect punishment has received less attention in spatial networks, as it traditionally relies on reputation or norms to sanction defectors across broader social scales, often among strangers [37,38]. In structured populations where interactions are localized, direct punishment can naturally be applied to nearby defectors. However, monitoring reputations and enforcing norms can be challenging, potentially weakening the effectiveness of the aforementioned reputation-based indirect punishment within such populations. Despite these constraints, the fundamental principle of indirect punishment, namely sanctioning defectors for others’ sake, suggests that alternative spatially adapted mechanisms, independent of global reputation, may still facilitate cooperation. This contrast raises a pivotal question: in structured populations, how does spatially adapted indirect punishment compare to direct punishment in sustaining cooperation, and under what conditions might it prove more effective?

To address this gap, we examined the effectiveness of direct and indirect punishment in structured populations using an extended evolutionary prisoner’s dilemma game on a square lattice. The game consists of two stages: in the first stage, players decide whether to cooperate with their direct neighbors, and in the second stage, they choose whether to impose fines on defectors at a personal cost. Direct punishment is implemented by allowing players to punish their immediate neighbors (depicted as the purple-shaded area in Fig 1). Meanwhile, indirect punishment is adapted to the square lattice, requiring focal players to punish their indirect neighbors, specifically the neighbors of their neighbors (illustrated as the cyan-shaded area in Fig 1). Although this adaptation differs from traditional reputation-based indirect punishment models, it maintains the core idea of indirect punishment, where defectors are sanctioned even if they have not directly harmed the punisher.

Fig 1. Schematic diagram of direct and indirect punishment in the model.

In the first stage of the prisoner’s dilemma game, a focal player interacts with their eight direct neighbors (purple-shaded area). In the second stage, punishment targets differ: direct punishment imposes penalties on defective direct neighbors, while indirect punishment targets defective second-order neighbors (up to sixteen, cyan-shaded area), excluding direct neighbors.

https://doi.org/10.1371/journal.pcbi.1013068.g001

Through evolutionary simulations, we show that compared to scenarios without punishment, indirect punishment promotes cooperation more effectively than direct punishment under conditions of low punishment costs and fines. Outside this region, when costs are low but fines are high, both forms of punishment are equally effective. When both the cost and fine are relatively high, direct punishment becomes more favorable. In the remaining cases, neither punishment strategy significantly enhances cooperation. These results hold both when network reciprocity alone can sustain cooperation and when it cannot. In scenarios where indirect punishment outperforms direct punishment, defectors encounter stricter penalties without substantially increasing the costs borne by punishers, making it a more efficient enforcement mechanism. These findings suggest that under specific low-cost and low-fine conditions, indirect punishment serves as a more effective method for promoting cooperation than direct punishment, offering valuable insights into optimizing punishment mechanisms within structured populations.

Model and method

Consider a population of size N placed on a square lattice with a Moore neighborhood and periodic boundaries, in which players play a two-stage prisoner’s dilemma game (PDG) with their neighbors [39]. In the first stage of the PDG, paired players simultaneously choose to either cooperate or defect, with the outcomes determined by the actions of both parties. Specifically, players obtain payoff R from mutual cooperation and payoff P from mutual defection. When unilateral defection occurs, the cooperator receives S while the defector receives T. The social dilemma arises when the payoff elements satisfy T > R > P > S and 2R > T + S, indicating that while mutual cooperation maximizes collective benefits, defection provides the highest individual benefit. To simplify the model without loss of generality, we set R = 1 and P = 0, and follow ref. [40] to introduce the dilemma strength parameter r, which characterizes the relative benefit of defection over cooperation. Thus, T = 1 + r and S = −r.
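The parameterization above can be illustrated with a short sketch. The function names below are hypothetical and are not taken from the authors' code; the sketch only builds the four first-stage payoffs from r and checks them against the two dilemma conditions stated in the text.

```python
def pd_payoffs(r):
    """Return (R, S, T, P) for dilemma strength r, with R = 1 and P = 0."""
    R, P = 1.0, 0.0
    T = 1.0 + r   # payoff to a defector facing a cooperator
    S = -r        # payoff to a cooperator facing a defector
    return R, S, T, P


def is_prisoners_dilemma(R, S, T, P):
    """Check the conditions T > R > P > S and 2R > T + S."""
    return T > R > P > S and 2 * R > T + S


# The two dilemma strengths examined in the Results section.
for r in (0.05, 0.2):
    print(r, pd_payoffs(r), is_prisoners_dilemma(*pd_payoffs(r)))
```

With this parameterization, 2R > T + S holds for every r because T + S = 1, so the dilemma conditions reduce to r > 0.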

In the second stage, players decide whether to punish defectors based on first-stage outcomes. Punishment involves punishers incurring a cost () to impose a fine () to others. Based on first-stage actions, we define four strategies: cooperating without punishing (CN), defecting without punishing (DN), cooperating and punishing defectors (CP), and defecting while punishing defectors (DP). The payoffs for these four strategies are summarized in Table 1.

Table 1. The payoff matrix for the two-stage Prisoner’s Dilemma game. (a) The first-stage interaction involves players deciding whether to cooperate; (b) The second stage involves players deciding whether to punish, with the interaction outcomes under different strategies representing the row player’s payoff in that stage.

https://doi.org/10.1371/journal.pcbi.1013068.t001

In direct punishment, a focal player interacts with their direct neighbors in both stages (highlighted in purple in Fig 1). In contrast, indirect punishment differs in that second-stage interactions target second-order neighbors (i.e., neighbors of first-order neighbors, highlighted in cyan in Fig 1). Notably, punishing second-order defectors ultimately benefits first-order neighbors, embodying the principle: “I help you by punishing those who defect against you”. From this perspective, this form of punishment can be viewed as an indirect reward to direct neighbors.
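The two interaction ranges can be enumerated explicitly. The sketch below is illustrative (the helper names are hypothetical): it assumes that the second-order set is the ring of sites at Chebyshev distance exactly 2 from the focal player, which matches the description of up to sixteen second-order neighbors excluding direct neighbors.

```python
def moore_offsets():
    """Eight first-order (Moore) neighbor offsets around the focal site."""
    return [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]


def second_order_offsets():
    """Sixteen offsets at Chebyshev distance exactly 2: neighbors of
    neighbors, excluding the focal site and its direct neighbors."""
    return [(dx, dy) for dx in range(-2, 3) for dy in range(-2, 3)
            if max(abs(dx), abs(dy)) == 2]


def neighbors(x, y, L, offsets):
    """Apply offsets on an L x L lattice with periodic boundaries."""
    return [((x + dx) % L, (y + dy) % L) for dx, dy in offsets]


print(len(moore_offsets()), len(second_order_offsets()))
```

The modulo operation implements the periodic boundaries, so a site on the edge of the lattice has the same number of neighbors as one in the interior (for lattices of side 5 or more).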

The total payoff of a player is the sum of the outcomes from the two-stage interactions. Specifically, in the case of direct punishment, the focal player’s payoffs in the first and second stages are calculated according to Table 1(a) and 1(b), respectively, based on interactions with eight direct neighbors. In the case of indirect punishment, the focal player’s payoff in the first stage is calculated through interactions with eight immediate neighbors according to Table 1(a), while the second-stage payoff is obtained through interactions with sixteen second-order neighbors according to Table 1(b).
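The payoff accumulation described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: Table 1(b) is not reproduced here, so the sketch assumes that every punisher (CP or DP) pays `cost` for each defector (DN or DP) in its punishment range and that every defector pays `fine` for each punisher whose range contains it. The names `cost`, `fine`, `total_payoff`, and the integer strategy codes are hypothetical.

```python
import numpy as np

CN, DN, CP, DP = 0, 1, 2, 3  # illustrative strategy codes

MOORE = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
         if (dx, dy) != (0, 0)]
RING2 = [(dx, dy) for dx in range(-2, 3) for dy in range(-2, 3)
         if max(abs(dx), abs(dy)) == 2]


def total_payoff(grid, x, y, r, cost, fine, indirect):
    """Two-stage payoff of the player at (x, y) on a periodic square lattice."""
    L = grid.shape[0]
    me = grid[x, y]
    me_defects = me in (DN, DP)
    me_punishes = me in (CP, DP)
    payoff = 0.0
    # Stage 1: prisoner's dilemma with the eight direct neighbors.
    for dx, dy in MOORE:
        other_defects = grid[(x + dx) % L, (y + dy) % L] in (DN, DP)
        if me_defects:
            payoff += 0.0 if other_defects else 1.0 + r   # P or T
        else:
            payoff += -r if other_defects else 1.0        # S or R
    # Stage 2: punishment over direct or second-order neighbors.
    for dx, dy in (RING2 if indirect else MOORE):
        other = grid[(x + dx) % L, (y + dy) % L]
        if me_punishes and other in (DN, DP):
            payoff -= cost   # assumed: pay once per defector in range
        if me_defects and other in (CP, DP):
            payoff -= fine   # assumed: fined once per punisher in range
    return payoff


grid = np.full((7, 7), CP)
grid[3, 3] = DN  # a lone defector surrounded by punishing cooperators
print(total_payoff(grid, 3, 3, r=0.2, cost=0.1, fine=0.3, indirect=False))
print(total_payoff(grid, 3, 3, r=0.2, cost=0.1, fine=0.3, indirect=True))
```

In this toy configuration the lone defector is fined by eight punishers under direct punishment and by sixteen under indirect punishment, while each punisher in range pays the cost only once.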

We use Monte Carlo simulations to obtain the results for both the direct punishment and indirect punishment models. Initially, players randomly adopt one of the four strategies. In each time step of the Monte Carlo simulation, N updates occur. During each update, a focal player is randomly selected and their payoff is calculated according to the above rules; the player then updates their strategy through pairwise imitation. Thus, on average, each player updates their strategy once per time step. The focal player i mimics the strategy of a randomly selected direct neighbor j with a Fermi probability [41]:

(1)

where represents the player x’s total payoffs. The parameter denotes the imitation strength; as , imitation approaches randomness, while as , imitation is driven by payoff differences. In this study, we set to investigate a strong imitation scenario. The network sizes N range from 100² to 400² in simulation, and the final results are averaged from over 50 independent simulations, with each simulation producing average values from the last 2000 time steps out of a total exceeding time steps.
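The pairwise imitation step can be sketched with the standard Fermi function from the literature, W = 1 / (1 + exp[s (Π_i − Π_j)]), where Π_i and Π_j are the total payoffs of the focal player and the chosen neighbor. This form and the symbol s (called `strength` in the code) are assumptions chosen to match the limits described above, not necessarily the authors' exact expression: s → 0 gives a coin flip, and large s makes imitation follow the payoff difference almost deterministically.

```python
import math
import random


def fermi_probability(payoff_i, payoff_j, strength):
    """Probability that focal player i copies neighbor j (assumed Fermi form)."""
    z = strength * (payoff_i - payoff_j)
    # Guard against overflow in exp when the focal player is far ahead.
    if z > 700:
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


def imitate(strategy_i, strategy_j, payoff_i, payoff_j, strength, rng=random):
    """Return i's strategy after one pairwise-imitation attempt."""
    if rng.random() < fermi_probability(payoff_i, payoff_j, strength):
        return strategy_j
    return strategy_i


# Equal payoffs give probability 0.5 regardless of the imitation strength.
print(fermi_probability(1.0, 1.0, strength=10.0))
```

One Monte Carlo time step would repeat this attempt N times, each time drawing a random focal player and a random direct neighbor, as described in the text.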

Results

In structured populations, network reciprocity can enable cooperative clusters to resist defectors [3,42], thereby sustaining cooperation. However, under high dilemma strength, network reciprocity fails to prevent defectors from infiltrating cooperative clusters, making cooperation difficult to establish. To investigate the role of punishment in promoting cooperation, we examine two representative scenarios: low dilemma strength (r = 0.05), where cooperation coexists with defection at 56% in the absence of punishment, and high dilemma strength (r = 0.2), where cooperation vanishes without punishment. These two scenarios serve as benchmarks for comparing the effectiveness of punishment, and our model reverts to these baselines when punishment costs substantially exceed fines.

Interestingly, indirect punishment can promote cooperation more effectively than direct punishment within a narrow region of punishment parameters. The results in Fig 2 indicate four distinct regions in the parameter space of punishment costs and fines, revealing the differences between indirect and direct punishment in promoting cooperation, regardless of dilemma strength. In region I, characterized by low costs and low fines relative to the payoff of mutual cooperation, with the fine slightly higher than the punishment cost, indirect punishment can outperform direct punishment, as shown in the red circle of Fig 2. In region II, where both costs and fines are higher than in region I, direct punishment is more effective than indirect punishment. In region III, characterized by low costs and high fines, both indirect and direct punishment perform similarly, enabling full cooperation, as seen in Figs A and B in S1 Appendix. In region IV, where costs typically exceed fines, neither punishment method improves cooperation significantly, and outcomes revert to those observed without punishment.

Fig 2. Indirect punishment can promote cooperation more effectively than direct punishment under conditions of low costs and low fines (marked with dark red circles).

Color codes illustrate the difference in cooperation levels (i.e., the fraction of CN + CP within the population) between direct punishment and indirect punishment as functions of punishment cost and fine for low dilemma strength (left panel) and high dilemma strength (right panel). Based on the comparison of indirect punishment to direct punishment in promoting cooperation, the parameter space can be divided into the following regions: Region I, where indirect punishment is superior to direct punishment; Region II, where direct punishment is superior to indirect punishment; Region III and region IV, where the difference between direct and indirect punishment is negligible.

https://doi.org/10.1371/journal.pcbi.1013068.g002
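The quantity plotted in Fig 2 can be computed from two final lattice states, as in the sketch below. The function names and strategy codes are hypothetical, and the sign convention (indirect minus direct, so that positive values favor indirect punishment) is an assumption, since the caption does not state the order of subtraction.

```python
import numpy as np

CN, DN, CP, DP = 0, 1, 2, 3  # illustrative strategy codes


def cooperation_level(grid):
    """Fraction of cooperators (CN + CP) on the lattice."""
    return float(np.isin(grid, (CN, CP)).mean())


def cooperation_gap(grid_direct, grid_indirect):
    """Assumed convention: positive when indirect punishment does better."""
    return cooperation_level(grid_indirect) - cooperation_level(grid_direct)


direct = np.array([[CN, DN], [DN, DP]])
indirect = np.array([[CP, CP], [CN, DN]])
print(cooperation_level(direct), cooperation_level(indirect))
```

In a full study this difference would be averaged over the final time steps and over independent runs for each pair of cost and fine values.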

Indirect punishment is more sensitive to changes in punishment costs than direct punishment, which affects the overall efficiency of the punishment mechanism. At a certain fine level (e.g., ), an increasing punishment cost typically reduces the effectiveness of direct punishment. This is evident in the upper left of Fig 3, where the proportion of CP gradually declines as increases, ultimately resulting in the collapse of cooperation. For , the proportion of cooperators (i.e., CN + CP) under indirect punishment can exceed that under direct punishment, as shown in the lower left of Fig 3. Specifically, indirect punishment induces pure cooperation within the population for . However, as continues to rise, the efficiency of indirect punishment declines sharply, leading to the disappearance of cooperation when . In contrast, cooperation in the direct punishment scenario only disappears when .

Fig 3. The effectiveness of indirect punishment in promoting cooperation is sensitive to the costs of punishment.

Depicted are the fraction of CN, DN, CP and DP as functions of punishment cost (left panels) and fine (right panels) for direct punishment and indirect punishment. Results are obtained under high dilemma strength. Other parameters are set to for left panels, and for right panels.

https://doi.org/10.1371/journal.pcbi.1013068.g003

At a given punishment cost level (e.g., ), as the fine increases, direct punishment induces the spontaneous emergence of CP and gradually dominates the population when , as shown in the upper-right panel of Fig 3. In contrast, under indirect punishment, cooperation can be established when . However, as increases, second-order free riders (i.e., CN, who benefit from others’ punishments without bearing the cost) gradually replace CP. Cooperation reemerges only when . Additionally, in indirect punishment, the fine required to eliminate defection is , significantly higher than the required for direct punishment. This suggests that indirect punishment requires a larger fine to achieve the same effect as direct punishment.

We then investigate why indirect punishment can be more or less effective than direct punishment in promoting cooperation by analyzing the characteristics of population evolution. To do so, we fix and employ a block-like initial distribution to better understand the competitive dynamics between strategies, as shown in Fig 4. Under low-cost conditions (e.g., ), under either direct or indirect punishment, DN can easily invade CN and DP, as DN individuals free-ride without bearing the costs of punishment. The proportion of CP steadily increases from the early stages of evolution, indicating that the punitive measures implemented by CP allow them to resist invasion by DN and contribute to their dominance. Eventually, a coexistence of CP and DN emerges under direct punishment, as shown in the first row of Fig 4. In contrast, indirect punishment can support the further expansion of CP clusters, as shown in the second row of Fig 4. When DN clusters are surrounded by CP, the DN individuals on the edge of these clusters encounter more punishing second-order neighbors and therefore suffer more punishment, as shown in the upper panels of Fig 5. Although CP individuals impose more punishment under indirect punishment, their total punishment cost does not increase significantly because each punishment action is cheap. By increasing the frequency of punishment on defectors without substantially raising punishers’ costs at low , indirect punishment allows CP to outperform DN. As a result, indirect punishment can be more efficient than direct punishment in promoting cooperation.
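The range argument can be made concrete with a simple bound. The symbols below are illustrative assumptions rather than the paper's notation: c is the cost of one punishment act, f is the fine it imposes, n_P is the number of punishers whose range contains a given defector, and n_D is the number of defectors within a given punisher's range. Assuming every punisher sanctions every defector in its range, the fines F received by one defector and the cost C paid by one punisher satisfy

```latex
\begin{align}
F_{\text{direct}}   &= f\, n_P \le 8f,  & C_{\text{direct}}   &= c\, n_D \le 8c, \\
F_{\text{indirect}} &= f\, n_P \le 16f, & C_{\text{indirect}} &= c\, n_D \le 16c.
\end{align}
```

The larger range doubles both ceilings. When c is small, the extra cost borne by a punisher stays small in absolute terms while the fines that can accumulate on a defector grow, which is consistent with the efficiency advantage described above. When c is large, the same doubling applies to the punisher's burden, in line with the collapse of CP clusters at high cost.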

Fig 4. Indirect punishment promotes the expansion of CP clusters when the punishment cost is low, but leads to their collapse when the cost becomes high.

The leftmost panels illustrate the proportion of the four strategies as a function of time steps, while the right panels present typical evolutionary snapshots starting from a block-like initial distribution for direct punishment and indirect punishment. Light red, dark red, light blue, and dark blue represent CN, CP, DN, and DP, respectively. The results are obtained under fixed and r = 0.2.

https://doi.org/10.1371/journal.pcbi.1013068.g004
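The two kinds of initial condition mentioned in the text, random assignment and a block-like distribution, can be sketched as follows. The exact block layout used for Fig 4 is not specified here, so the four equal quadrants below are an assumption made purely for illustration, as are the function names and strategy codes.

```python
import numpy as np

CN, DN, CP, DP = 0, 1, 2, 3  # illustrative strategy codes


def random_init(L, seed=None):
    """Each player independently adopts one of the four strategies."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 4, size=(L, L))


def block_init(L):
    """Four equal quadrant blocks, one per strategy (assumed layout)."""
    half = L // 2
    grid = np.empty((L, L), dtype=int)
    grid[:half, :half] = CN
    grid[:half, half:] = CP
    grid[half:, :half] = DN
    grid[half:, half:] = DP
    return grid


print(block_init(4))
```

Starting from blocks rather than a random mixture makes the invasion fronts between pairs of strategies visible from the first time steps.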

Fig 5. Indirect punishment can lead to defectors being punished more without significantly increasing the total cost for punishers when the punishment cost is low, making it more effective than direct punishment in promoting cooperation. However, high punishment costs result in a greater total punishment cost in the case of indirect punishment, which diminishes the payoff advantage for punishers, making it less effective than direct punishment.

Depicted are the frequency distributions of the total punishment cost for punishers (first column panels from the left), the total fines imposed on the defectors (second column panels), the number of punishments imposed by punishers (third column panels), and the number of punishments received by defectors who are punished (fourth column panels) for direct punishment and indirect punishment. To analyze the evolutionary characteristics of punishers, the distribution data include all evolutionary time steps prior to the stabilization of strategy proportions within the population, and the evolutionary processes correspond to the results presented in Fig 4.

https://doi.org/10.1371/journal.pcbi.1013068.g005

For a high cost (e.g., ), the size of CP clusters shrinks under direct punishment, but CP can still resist invasion by DN, as shown in the third row of Fig 4. In contrast, indirect punishment fails to prevent the invasion of DN, as shown in the fourth row of Fig 4. When CP clusters are surrounded by DN, CP individuals have more second-order defecting neighbors, leading to an increase in punitive actions. However, the rise in the total punishment cost limits the expansion of CP. Furthermore, despite the increase in punishment imposed by CP, these punishments are not effectively concentrated on defectors at the cluster boundaries. As shown in the lower panel of Fig 5, the total fines imposed on DN do not increase significantly. Consequently, indirect punishment fails to sustain cooperation.

Conclusion and discussion

To conclude, in this study, we introduced a novel form of indirect punishment on a square lattice, where players impose penalties solely on defective second-order neighbors (the neighbors of their direct neighbors). This spatially adapted strategy indirectly benefits direct neighbors through a spillover effect, enhancing cooperation within the network. Our results demonstrate that indirect punishment outperforms direct punishment—which targets defective direct neighbors—in fostering cooperation under low-cost and low-fine parameter conditions (see region I in Fig 2). In low-cost, high-fine conditions (see region III in Fig 2), both strategies are equally effective, while direct punishment proves superior under high-cost, high-fine conditions (see region II in Fig 2). These findings remain robust across different dilemma strengths, providing valuable insights into the comparative effectiveness of punishment strategies and highlighting the critical role of the cost-to-fine ratio in determining their success within structured populations.

By exploring the comparative effectiveness of direct punishment and indirect punishment, we offer new insights into how different punishment forms promote cooperation in structured populations. Most existing evolutionary models focus on direct punishment because it is unlikely to be evolutionarily stable and requires explanation. Behavioral experiments further support this, showing that individuals who achieve the highest total payoff tend not to use costly (direct) punishment, suggesting this behavior may have evolved for other reasons [20]. In contrast, a form of indirect punishment—characterized by withholding cooperation—does not face the same issues as direct punishment, such as second-order free riding. Consequently, evolutionary game literature often questions why people continue to use direct punishment when they could opt to withhold cooperation [43,44].

Moving beyond these evolutionary considerations, we identified a narrow parameter region in structured populations where indirect punishment is more effective than direct punishment. This finding is somewhat surprising because direct punishment directly undermines defectors’ payoffs at the boundaries of cooperative clusters—a critical mechanism for network reciprocity. In contrast, indirect punishment targets defective second-order neighbors, encouraging them to cooperate for the benefit of their direct neighbors, yet it does not directly alter the competitive dynamics at the cluster boundaries. These results highlight the nuanced roles that direct and indirect punishment play in fostering cooperation and suggest that their relative effectiveness is closely linked to the underlying network structure.

The unexpected effectiveness of indirect punishment, particularly under conditions of low costs and fines, calls for a deeper explanation of the mechanisms driving this outcome. We propose a novel explanation: under these conditions, indirect punishment leads to significant accumulated fines on defectors without significantly increasing the punishers’ total costs. This is due to the broader punishment scope on indirect neighbors, allowing defectors to be penalized from multiple sources, which amplifies the overall punitive effect, as illustrated in Figs 4 and 5. In contrast, direct punishment targets defectors within a limited range, reducing its impact. This explanation deepens our understanding of how network reciprocity supports altruistic behavior, traditionally focused on the formation and maintenance of cooperative clusters [42,45,46].

In evolutionary game theory, direct punishment refers to a mechanism where an individual actively and intentionally penalizes another for their behavior, typically at a personal cost, such as reducing an immediate neighbor’s payoff in a spatial lattice. Conversely, indirect punishment involves enforcing cooperation through social norms, reputation, or third-party systems, extending influence beyond direct individual action. In our adaptation for a square lattice, we redefine indirect punishment to target second-degree neighbors, capturing its broader spatial scope. While some might note that this retains elements of direct punishment—like individual costs or retaliation risks—these features stem from the lattice’s localized structure and do not undermine the distinction. The critical difference lies in the punishment range: immediate for direct, extended for indirect. This approach preserves the spirit of indirect punishment while enabling a precise analysis of how punishment range influences cooperation in structured populations, enhancing the study of spatial evolutionary dynamics.

Although our adapted form of indirect punishment does not fully align with the previous definition, which emphasizes withholding cooperation rather than imposing sanctions [20,36], our model remains valid as it captures the core concept of sanctioning defectors who harm others. In spatial structures, interactions are limited to local neighbors, prompting us to devise a novel form of indirect punishment to assess its influence on cooperation in networked populations. Implementing the traditional definition would require higher-order networks to capture group interactions beyond the capabilities of a square lattice [47,48], as indirect punishment derived from indirect reciprocity involves ternary or higher-order relationships. Future work could explore higher-order networks to more accurately reflect the dynamics of withholding cooperation.

Additionally, extending the current model by considering broader punishment scopes beyond indirect neighbors could test the robustness of indirect punishment’s effectiveness compared to direct punishment. Such extensions are meaningful because they can quantify the potential future returns for costly punishers on the effectiveness of cooperation and may reveal additional evolutionary mechanisms behind the evolution of costly punishment. Incorporating reinforcement learning algorithms [49] or large language models [50] could also provide a framework for agents to adapt their punishment strategies based on learned experiences, offering insights into how these strategies evolve over time. Finally, the competition between different forms of punishment also deserves attention, as the competitive outcomes may help explain how various forms of punishment evolve; the mechanisms that drive preferences among punishment forms warrant further investigation.

Supporting information

S1 Appendix.

This file contains two figures, each showing the fractions of the CN, DN, CP, and DP strategies as functions of the punishment cost and fine, under both direct punishment (top row) and indirect punishment (bottom row) scenarios. Figs A and B present results for low dilemma strength (r = 0.05) and high dilemma strength (r = 0.2), respectively.

https://doi.org/10.1371/journal.pcbi.1013068.s001

(PDF)

References

1. Axelrod R, Hamilton WD. The evolution of cooperation. Science. 1981;211(4489):1390–6. pmid:7466396
- View Article
- PubMed/NCBI
- Google Scholar
2. Ohtsuki H, Hauert C, Lieberman E, Nowak MA. A simple rule for the evolution of cooperation on graphs and social networks. Nature. 2006;441(7092):502–5. pmid:16724065
- View Article
- PubMed/NCBI
- Google Scholar
3. Nowak MA. Five rules for the evolution of cooperation. Science. 2006;314(5805):1560–3. pmid:17158317
- View Article
- PubMed/NCBI
- Google Scholar
4. Boyd R, Richerson PJ. Culture and the evolution of human cooperation. Philos Trans R Soc Lond B Biol Sci. 2009;364(1533):3281–8. pmid:19805434
- View Article
- PubMed/NCBI
- Google Scholar
5. Hardin G. The tragedy of the commons. Classic papers in natural resource economics revisited. New York: Routledge. 2018. p. 145–56.
6. Fehr E, Gächter S. Altruistic punishment in humans. Nature. 2002;415(6868):137–40. pmid:11805825
7. Camerer CF. Behavioral game theory: experiments in strategic interaction. Princeton University Press; 2011.
8. Hilbe C, Chatterjee K, Nowak MA. Partners and rivals in direct reciprocity. Nat Hum Behav. 2018;2(7):469–77. pmid:31097794
9. Hu S, Leung C, Leung H. Modelling the dynamics of multiagent q-learning in repeated symmetric games: a mean field theoretic approach. Adv Neural Inf Process Syst. 2019.
10. Zhu Y, Xia C, Chen Z. Nash equilibrium in iterated multiplayer games under asynchronous best-response dynamics. IEEE Trans Automat Contr. 2023;68(9):5798–805.
11. Xia C, Wang J, Perc M, Wang Z. Reputation and reciprocity. Phys Life Rev. 2023;46:8–45. pmid:37244154
12. Nowak MA, Sigmund K. Evolution of indirect reciprocity by image scoring. Nature. 1998;393(6685):573–7. pmid:9634232
13. Rand DG, Nowak MA. Human cooperation. Trends Cogn Sci. 2013;17(8):413–25. pmid:23856025
14. Chaudhuri A. Sustaining cooperation in laboratory public goods experiments: a selective survey of the literature. Exp Econ. 2011;14:47–83.
15. Andreoni J, Varian H. Preplay contracting in the prisoners’ dilemma. Proc Natl Acad Sci U S A. 1999;96(19):10933–8.
16. Fehr E, Fischbacher U, Gächter S. Strong reciprocity, human cooperation, and the enforcement of social norms. Hum Nat. 2002;13(1):1–25. pmid:26192593
17. Wang G, Li J, Wang W, Niu X, Wang Y. Confusion cannot explain cooperative behavior in public goods games. Proc Natl Acad Sci U S A. 2024;121(10):e2310109121. pmid:38412126
18. Burton-Chellew MN, West SA. Payoff-based learning best explains the rate of decline in cooperation across 237 public-goods games. Nat Hum Behav. 2021;5(10):1330–8. pmid:33941909
19. Shen C, He Z, Guo H, Hu S, Tanimoto J, Shi L, et al. Beyond a binary theorizing of prosociality. Proc Natl Acad Sci U S A. 2024;121(49):e2412195121. pmid:39602256
20. Dreber A, Rand DG, Fudenberg D, Nowak MA. Winners don’t punish. Nature. 2008;452(7185):348–51. pmid:18354481
21. Clutton-Brock TH, Parker GA. Punishment in animal societies. Nature. 1995;373(6511):209–16. pmid:7816134
22. Raihani NJ, Thornton A, Bshary R. Punishment and cooperation in nature. Trends Ecol Evol. 2012;27(5):288–95. pmid:22284810
23. Riehl C, Frederickson ME. Cheating and punishment in cooperative animal societies. Philos Trans R Soc Lond B Biol Sci. 2016;371(1687):20150090. pmid:26729930
24. Wu J-J, Zhang B-Y, Zhou Z-X, He Q-Q, Zheng X-D, Cressman R, et al. Costly punishment does not always increase cooperation. Proc Natl Acad Sci U S A. 2009;106(41):17448–51. pmid:19805085
25. Li X, Jusup M, Wang Z, Li H, Shi L, Podobnik B, et al. Punishment diminishes the benefits of network reciprocity in social dilemma experiments. Proc Natl Acad Sci U S A. 2018;115(1):30–5. pmid:29259113
26. Zhu Y, Zhang Z, Xia C, Chen Z. Equilibrium analysis and incentive-based control of the anticoordinating networked game dynamics. Automatica. 2023;147:110707.
27. dos Santos M, Rankin DJ, Wedekind C. The evolution of punishment through reputation. Proc Biol Sci. 2011;278(1704):371–7. pmid:20719773
28. Fowler JH. Altruistic punishment and the origin of cooperation. Proc Natl Acad Sci U S A. 2005;102(19):7047–9. pmid:15857950
29. Perc M, Szolnoki A. Self-organization of punishment in structured populations. New J Phys. 2012;14(4):043013.
30. Szolnoki A, Perc M. Second-order free-riding on antisocial punishment restores the effectiveness of prosocial punishment. Phys Rev X. 2017;7(4):041027.
31. Guo T, He Z, Shi L. Self-organization in mobile populations promotes the evolution of altruistic punishment. Phys A: Statist Mech Appl. 2023;630:129282.
32. Helbing D, Szolnoki A, Perc M, Szabó G. Evolutionary establishment of moral and double moral standards through spatial interactions. PLoS Comput Biol. 2010;6(4):e1000758. pmid:20454464
33. Fu F, Chen X. Leveraging statistical physics to improve understanding of cooperation in multiplex networks. New J Phys. 2017;19(7):071002. pmid:29606900
34. Wang X, Fu F, Wang L. Deterministic theory of evolutionary games on temporal networks. J R Soc Interface. 2024;21(214):20240055. pmid:38807526
35. Chen Z, Geng Y, Chen X, Fu F. Unbending strategies shepherd cooperation and suppress extortion in spatial populations. New J Phys. 2024;26(7):073047.
36. Rockenbach B, Milinski M. The efficient interaction of indirect reciprocity and costly punishment. Nature. 2006;444(7120):718–23. pmid:17151660
37. Ule A, Schram A, Riedl A, Cason TN. Indirect punishment and generosity toward strangers. Science. 2009;326(5960):1701–4. pmid:20019287
38. Balafoutas L, Nikiforakis N, Rockenbach B. Direct and indirect punishment among strangers in the field. Proc Natl Acad Sci U S A. 2014;111(45):15924–7. pmid:25349390
39. Szabó G, Fath G. Evolutionary games on graphs. Phys Rep. 2007;446(4–6):97–216.
40. Wang Z, Kokubo S, Jusup M, Tanimoto J. Universal scaling for the dilemma strength in evolutionary games. Phys Life Rev. 2015;14:1–30. pmid:25979121
41. Sigmund K, De Silva H, Traulsen A, Hauert C. Social learning promotes institutions for governing the commons. Nature. 2010;466(7308):861–3. pmid:20631710
42. Wang Z, Kokubo S, Tanimoto J, Fukuda E, Shigaki K. Insight into the so-called spatial reciprocity. Phys Rev E Stat Nonlin Soft Matter Phys. 2013;88(4):042145. pmid:24229153
43. Ohtsuki H, Iwasa Y, Nowak MA. Indirect reciprocity provides only a narrow margin of efficiency for costly punishment. Nature. 2009;457(7225):79–82. pmid:19122640
44. Panchanathan K, Boyd R. Indirect reciprocity can stabilize cooperation without the second-order free rider problem. Nature. 2004;432(7016):499–502. pmid:15565153
45. Nowak MA, May RM. The spatial dilemmas of evolution. Int J Bifurcation Chaos. 1993;03(01):35–78.
46. Perc M, Jordan JJ, Rand DG, Wang Z, Boccaletti S, Szolnoki A. Statistical physics of human cooperation. Phys Rep. 2017;687:1–51.
47. Guo H, Jia D, Sendiña-Nadal I, Zhang M, Wang Z, Li X, et al. Evolutionary games on simplicial complexes. Chaos Solitons Fract. 2021;150:111103.
48. Civilini A, Sadekar O, Battiston F, Gómez-Gardeñes J, Latora V. Explosive cooperation in social dilemmas on higher-order networks. Phys Rev Lett. 2024;132(16):167401. pmid:38701463
49. Hu S, Soh H, Piliouras G. The best of both worlds in network population games: reaching consensus and convergence to equilibrium. Adv Neural Inf Process Syst. 2024.
50. Ren S, Cui Z, Song R, Wang Z, Hu S. Emergence of social norms in generative agent societies: principles and architecture. In: Proceedings of the 33rd International Joint Conference on Artificial Intelligence. 2024.
