Abstract
This study incorporates historical performance into the traditional imitation rule and proposes a moderated strategy update rule. In this framework, an individual's historical performance over time is computed with the Bush-Mosteller (BM) model. The parameter δ controls the influence of historical performance on strategy learning, and the resulting evolution of cooperation is observed. Results show that the proposed update rule promotes cooperation more effectively than the traditional version, and that systemic cooperation is further enhanced as δ increases. The proposed rule enhances cooperation because it amplifies the evaluation of cooperative behavior while compressing the evaluation of defective behavior. Although establishing system objectives may hinder the diffusion of cooperative behavior, appropriate performance evaluation mechanisms can mitigate this adverse effect. Our results indicate that multidimensional evaluation can provide a theoretical basis for explaining cooperative behavior in complex environments.
Citation: Wang Y, Yu X, Lu S (2026) Strategies for updating rules driven by reinforcement learning to solve social dilemmas. PLoS One 21(3): e0341925. https://doi.org/10.1371/journal.pone.0341925
Editor: Xiaojie Chen, University of Electronic Science and Technology of China, CHINA
Received: June 28, 2025; Accepted: January 14, 2026; Published: March 10, 2026
Copyright: © 2026 Wang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting information files.
Funding: 1. Fundamental Research Funds for the Central Universities (Grant No. 3142024036); 2. Self-funded Project of Langfang Science and Technology Research and Development Program (Grant No. 2024011066).
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
Cooperation is fundamental to the stability of biological ecosystems and human social systems. Yet, reconciling cooperative behaviors with Darwinian principles of natural selection [1] presents a persistent theoretical challenge. This has prompted many researchers to explore the principles underlying cooperation. Nowak’s seminal work summarizes five key mechanisms facilitating cooperation: kin selection, direct reciprocity, indirect reciprocity, network reciprocity, and group selection [2]. Crucially, network reciprocity exploits underlying population structure, enabling cooperators to cluster and thereby sustain cooperation [3]. Subsequent research has extended this foundational framework, incorporating diverse mechanisms such as punishment and reward [4–7], environmental feedback [8,9], social diversity [10], teaching activities [11,12], reputation [13], among others [14].
While numerous studies posit strategy learning in cooperative evolution primarily through imitation rules [3]—where benefits serve as the key driver—the reality involves complex, multidimensional influences. Strategy learning behavior results from the interplay of factors like self-learning and social learning [15]. Scholars have explored the impact of diverse strategy updating rules on the emergence and maintenance of cooperation. For instance, Bi and Yang demonstrated that integrating reputation mechanisms significantly enhances cooperation [16]. Similar conclusions were reached by Zhang et al. and He et al. [17,18]. Lu and Wang, incorporating past performance into learning rules, found that increasing the weight of historical outcomes progressively strengthens system-wide cooperation [19]. Other investigated rules include popularity-driven [20] and experience-driven updates [21]. Collectively, these findings indicate that multi-factor-dependent learning rules generally foster cooperative evolution.
Despite this extensive exploration, the predominant focus remains on extrinsic social attributes, such as reputation. Consequently, a critical gap persists: the evolutionary patterns and outcomes of system cooperation under composite strategy learning rules driven primarily by intrinsic individual attributes remain unexplored.
In addition, most existing studies focus on the immediate benefits of individual behavior within a single interaction round, overlooking the accumulated experience from prior games. This approach fails to capture the natural phenomenon in which organisms adapt their social strategies based on environmental cues, including feedback from past experiences. Reinforcement learning (RL) rules, however, effectively incorporate the cumulative influence of such memory effects [22–24]. Consequently, researchers have increasingly explored RL in evolutionary cooperation studies. For instance, Jia et al. demonstrated that incorporating RL enhances system-wide cooperation [25]. However, Lu et al. [19] focused on the impact of reinforcement-learning-based relationship-strength adjustment on cooperation, neglecting the dual effects of internal and external factors in strategy learning. The studies of Jia et al. [25] and Geng et al. [26] also overlooked the role of individual intrinsic factors in strategy learning. Moreover, although Zhang et al. combined reinforcement learning with conformity learning rules to study how different strategy update mechanisms affect cooperative evolution, their model assumes that individuals are rational [27,28], which does not reflect real cooperative evolution. Notably, recent findings suggest RL not only accounts for conditional cooperation but also explains patterns of emotional reciprocity [29,30].
Accordingly, we conceptualize the system’s consistency goal [31–35] as an intrinsic driver of individual behavior. Achieving this goal serves as one criterion for evaluating behavioral performance: success in the preceding round raises the current performance score, while failure lowers it. Drawing on reinforcement learning principles, we accumulate behavioral information across successive rounds to assess historical performance. From a global perspective, we evaluate individual performance and use the resulting assessment as a measure of social evaluation. By taking the interactive payoffs among individuals as the basis for mutual assessment, the strategy learning process is systematically guided through the integration of both social and individual evaluations. On this basis, we examine the evolution of cooperation within the system. This update rule incorporates both real-time game payoffs and historical behavior to govern strategy revisions. Our results show that this modified update mechanism significantly promotes the emergence of prosocial behaviors in the system.
2. Model
In this work, the weak Prisoner’s Dilemma is used [3]. Without loss of generality, the payoffs are set to T = b (b > 1), R = 1, and P = S = 0, satisfying T > R > P = S and 2R > T + S. The corresponding payoff matrix M is given in Eq. (1).
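As a concrete illustration, the weak Prisoner’s Dilemma payoff matrix of Eq. (1) can be written out as follows. This is a minimal sketch; the value b = 1.1 is chosen for illustration only.

```python
import numpy as np

# Weak Prisoner's Dilemma payoffs (Eq. 1): T = b > 1, R = 1, P = S = 0.
b = 1.1  # temptation to defect (illustrative value)

# Rows index the focal player's strategy, columns the opponent's (0 = C, 1 = D).
M = np.array([[1.0, 0.0],    # C vs C -> R = 1;  C vs D -> S = 0
              [b,   0.0]])   # D vs C -> T = b;  D vs D -> P = 0
```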
We then construct a two-dimensional square lattice with periodic boundaries to depict the relationships between individuals in the system. Initially, each individual adopts cooperation (Si = C) or defection (Si = D) with equal probability p, as specified in Eq. (2), interacts with its four nearest neighbors, and accumulates payoff Pi according to Eq. (3), where Ωi is the set of neighbors of individual i.
Subsequently, the BM model [22–24] is employed within a reinforcement learning framework to compute and assess an individual’s historical performance. In the performance evaluation, the system’s aspiration level for consistency serves as the benchmark: if an individual’s cumulative payoff during the evaluation period reaches or exceeds this benchmark, their score increases; otherwise, it decreases. This adjustment mechanism operates persistently. Within BM reinforcement learning, performance evaluation proceeds in two steps. First, performance is scored according to the deviation of the cumulative payoff from the expected system consistency goal, as in Eq. (4), where β (β ≥ 0) is the sensitivity to the reinforcement stimulus (Pi − A). Second, the score is evaluated with respect to the individual’s strategy, as in Eq. (5), where gi represents the player’s satisfaction with the difference between Pi and A. The global evaluation Ei of an individual’s historical behavioral performance can then be quantified as follows.
Here, the parameter A represents the consistency goal, or expected level, of the system, defined as A = kiα, where ki = 4 is player i’s degree (the number of nearest neighbors) [36,37], and α denotes the system’s consistency aspiration or goal level.
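The two-step BM evaluation can be sketched as below. This is a hedged approximation: the tanh-shaped stimulus and the bounded reinforcement update follow the standard Bush-Mosteller form used in Refs. [22–24], and the function name and exact update are assumptions since Eqs. (4)–(5) are not reproduced here.

```python
import math

def bm_update(E_prev, payoff, strategy, A, beta=2.0):
    """One BM-style update of the historical-performance score E_i in [0, 1].

    Hedged sketch of Eqs. (4)-(5); the paper's exact forms may differ.
    """
    # Eq. (4)-style stimulus: deviation of payoff P_i from the consistency
    # goal A, with sensitivity beta.
    s = math.tanh(beta * (payoff - A))
    # Eq. (5)-style strategy-dependent satisfaction g_i: a cooperator who
    # meets the goal (or a defector who misses it) is scored up, and vice versa.
    g = s if strategy == 'C' else -s
    # Bounded reinforcement keeps E within [0, 1].
    return E_prev + (1.0 - E_prev) * g if g >= 0 else E_prev + E_prev * g
```

Under this form, a cooperator whose payoff exceeds A sees its evaluation rise toward 1, while the same surplus earned by a defector pushes its evaluation toward 0.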
Finally, to refine strategy learning, this study proposes a moderated update rule. This rule integrates individual historical performance assessment with imitation dynamics, combining game payoff and historical performance linearly to guide strategy updates, following previous strategy-learning rule settings [38,39]. The parameter δ (δ ∈ [0, 1]) modulates the weight of historical performance in learning. When δ = 0, the update rule reduces to its traditional version [3]; when δ > 0, game payoffs and historical performance jointly shape strategy adaptation. Specifically, during the strategy update, the focal individual i randomly selects a nearest neighbor j and adopts the strategy of j with probability W given by Eq. (6). Here, the parameter K quantifies the stochastic noise that permits irrational decisions: as K → 0, agent i deterministically imitates a better-performing neighbor j, whereas as K → ∞, imitation occurs at random. Following Ref. [40], we set K = 0.5.
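The adoption probability of Eq. (6) and the linear payoff/performance combination can be sketched as follows. The utility form payoff + δ·E is an assumption inferred from the linear addition described above; the paper's exact weighting in Eq. (6) may differ.

```python
import math

def combined_utility(payoff, E, delta):
    # Assumed linear combination of game payoff and historical performance E;
    # delta in [0, 1] weights the historical-performance term.
    return payoff + delta * E

def adoption_probability(U_i, U_j, K=0.5):
    """Fermi rule (Eq. 6): probability that focal i adopts neighbor j's strategy."""
    return 1.0 / (1.0 + math.exp((U_i - U_j) / K))
```

When δ = 0 the utility reduces to the bare payoff and the rule recovers traditional imitation dynamics; as K shrinks, the choice becomes nearly deterministic in favor of the higher-utility neighbor.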
To evaluate the effectiveness of the proposed cooperation-enhancing mechanism, Monte Carlo simulations were performed on a 200 × 200 lattice network. Initially, p = 0.5 and Ei(0) = 0.5, and each individual updated their strategy once per Monte Carlo step (MCS) on average. The equilibrium cooperation frequency fc was measured after 1 × 10⁴ MCS, with data averaged over the final 3 × 10³ MCS to minimize fluctuations. Results are averaged over 20 independent trials.
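The simulation protocol can be sketched at toy scale as below. This is a hedged, synchronous approximation for illustration only: the paper uses a 200 × 200 lattice, 10⁴ MCS and on-average asynchronous updating, and the BM evaluation here is the assumed tanh form of Refs. [22–24].

```python
import numpy as np

def simulate(L=20, b=1.1, delta=1.0, alpha=0.5, beta=2.0, K=0.5, steps=100, seed=0):
    """Toy-scale Monte Carlo sketch of the model; returns the cooperation frequency fc."""
    rng = np.random.default_rng(seed)
    S = rng.integers(0, 2, size=(L, L))        # strategies: 0 = C, 1 = D (p = 0.5)
    E = np.full((L, L), 0.5)                   # initial evaluations E_i(0) = 0.5
    M = np.array([[1.0, 0.0], [b, 0.0]])       # weak PD payoff matrix (Eq. 1)
    A = 4 * alpha                              # consistency goal A = k_i * alpha, k_i = 4
    shifts = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    for _ in range(steps):
        # Accumulate payoffs against the four nearest neighbors (periodic boundary).
        P = np.zeros((L, L))
        for sh in shifts:
            P += M[S, np.roll(S, sh, axis=(0, 1))]
        # BM-style evaluation update (assumed form of Eqs. 4-5).
        s = np.tanh(beta * (P - A))
        g = np.where(S == 0, s, -s)
        E = np.where(g >= 0, E + (1 - E) * g, E + E * g)
        # Each site compares with one random neighbor via the Fermi rule (Eq. 6),
        # using the assumed combined utility P + delta * E.
        sh = shifts[rng.integers(4)]
        Pj, Sj, Ej = (np.roll(a, sh, axis=(0, 1)) for a in (P, S, E))
        W = 1.0 / (1.0 + np.exp(((P + delta * E) - (Pj + delta * Ej)) / K))
        S = np.where(rng.random((L, L)) < W, Sj, S)
    return 1.0 - S.mean()
```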
3. Result
Fig 1 illustrates the evolution of the cooperation level fc across defection temptation values b for varying δ. When δ = 0, strategy updates revert to the conventional imitation rule, causing cooperators to rapidly disappear even at low b. For δ > 0, however, agents incorporate historical performance into strategy evaluation. This modification substantially elevates cooperation levels, with higher δ values further amplifying cooperative behavior. In particular, when δ = 1, the payoff dimension becomes consistent with that of historical performance, promoting a greater degree of system cooperation. Consequently, the moderated update rule extends the critical threshold b for cooperation extinction beyond that of conventional imitation, thereby promoting both the emergence and sustainability of cooperation.
Vertically, with increasing δ, the system evolves to a stable state with a greater degree of cooperation, showing that the proposed mechanism promotes cooperation and raises the extinction threshold b. Parameters are set to β = 2 and α = 0.5 based on Ref. [24], with L = 200.
We further investigated the impact of the parameter δ on strategic evolution from the perspective of population dynamics. Fig 2 shows the temporal evolution of cooperation for b = 1.1 under different δ values. The process exhibits two distinct stages: an initial decline followed by a rise. After reaching a minimum, the cooperation level increases steadily until it stabilizes at an evolutionary equilibrium. This characteristic dip-and-rise pattern reflects the intense competition between cooperators and defectors, which is typical of network reciprocity [41–44]. The proposed moderated strategy-update rule significantly enhances cooperation. Notably, higher δ values drive the system toward a more cooperative equilibrium, accelerate the evolution of cooperation, and shorten the duration of the second phase. These results confirm that the modified rule effectively promotes cooperation.
With increasing δ, the system evolves to a stable state with a greater degree of cooperation, indicating that a higher δ promotes cooperation. Parameters are set to β = 2, α = 0.5 and L = 200.
Moving forward, we analyze the evolutionary dynamics by examining the spatial distribution of individual strategies to observe the competition between cooperative and defective behaviors. As shown in Fig 3, when cooperators are initially placed in distinct positions, they quickly disappear under the traditional strategy update rule. However, when the moderated strategy update rule is applied, cooperators gradually spread within clusters of defectors. They gain a competitive advantage over defectors, which strengthens as the parameter δ increases. Ultimately, this advantage allows cooperators to dominate, leading to a higher level of system cooperation at equilibrium. Notably, in the stable state, cooperators survive within numerous small, compact clusters.
Vertically, the diffusion of cooperative behavior accelerates with increasing δ. Simulations indicate that larger δ significantly promotes cooperation. Parameters are set to b = 1.1, α = 0.5, β = 2 and L = 200.
Next, we analyze the impact of the sensitivity parameter β on cooperative evolution for δ = 1.0 and α = 0.5, as well as the impact of the system consistency aspiration (goal) level α for β = 2 and δ = 1.0.
In Fig 4(a), at δ = 1.0 and α = 0.5, the system cooperation level increases with the sensitivity parameter β, demonstrating its impact on cooperative evolution. This may be because a larger sensitivity parameter amplifies the gap between game payoffs and the consistency aspiration or goal. Under this evaluation criterion, the evaluation of cooperative payoffs is amplified while that of defective payoffs is compressed, accelerating the propagation of cooperative behavior and explaining the strengthening effect of the proposed mechanism. However, as β increases further, it suppresses cooperation in systems characterized by strong temptation. Under high-intensity social dilemmas (large b), defectors obtain greater payoffs and can more easily reach the system goal. This overestimation of defection’s benefits facilitates its spread, thereby reducing overall cooperation. Fig 4(b) reveals the nonlinear dynamic influence of β on the evolution of cooperation. Thus, a moderate sensitivity parameter β best promotes cooperation.
Panel (b) demonstrates the dynamic impact of β on the evolution of cooperation, revealing that while higher β values promote cooperation, the effect is significant only up to a certain threshold. The results indicate that a moderate sensitivity parameter β can promote cooperation. Parameters are set to α = 0.5, δ = 1.0 and L = 200.
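The amplification-and-saturation effect of β discussed above can be illustrated numerically, assuming the tanh-shaped BM stimulus: the same payoff surplus over the goal A produces a much sharper evaluation signal under a larger β, while the signal saturates toward its bound of 1. The values below are illustrative only.

```python
import math

gap = 1.0  # payoff surplus over the goal A (illustrative)
for beta in (0.5, 2.0, 8.0):
    # Larger beta sharpens the evaluation of the same gap toward its bound of 1.
    print(beta, round(math.tanh(beta * gap), 3))
```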
Fig 5(a) further examines the influence of the aspiration level α on the evolution of cooperation. The results show that lower system goals improve goal fulfillment among cooperators, thus promoting cooperation. However, an increase in the temptation b triggers an invasion of defectors, leading to a monotonic decline in system-wide cooperation. Despite this decrease, cooperation remains substantially higher than in traditional scenarios. Furthermore, cooperation diminishes as α rises. This is likely because higher α values make it more difficult for cooperative individuals to meet expectations. Under such evaluation criteria, the performance of individuals who fail to achieve α is accentuated, which hinders the propagation of cooperation. Therefore, as illustrated in Fig 5(b), setting lower system aspiration levels (or goals) can facilitate the spread of cooperation within this framework.
The results indicate that lower system consistency aspirations (goals) can promote cooperation. Parameters are set to β = 2, δ = 1.0 and L = 200.
Further analysis in Fig 6 explores the joint influence of the parameters β and α on the evolution of cooperation. Horizontally, increasing the sensitivity parameter β alleviates the negative effect of the aspiration level on cooperation, although the extent of this enhancement remains limited. Vertically, setting an appropriate aspiration level is crucial for both initiating and maintaining cooperation. These parameters thus jointly shape the evolution of cooperation in the system.
Horizontally, system cooperation increases as β increases, whereas vertically, it decreases as α increases. The results indicate that the parameters β and α jointly determine cooperative evolution in the system. Parameters are set to δ = 1.0 and L = 200.
To better illustrate the impact of the parameter K on cooperative evolution, we calculated the evolutionary dynamics of the system for several values of K. As shown in Fig 7, the system maintains robustness under small noise fluctuations; however, when the noise amplitude becomes large, cooperative evolution is significantly disrupted. This finding is consistent with the conclusion reported in Ref. [40]. Moreover, analysis of the evolution of system cooperation across different network sizes (L = 50–300) yields consistent results, indicating that the system is robust to network scale.
Panel (b) plots fc against b for different L values. The results are consistent, indicating that the system is robust against minor noise interference and differences in network scale. Parameters are set to β = 2, α = 0.5, δ = 0.2 and L = 200 in Panel (a), and β = 2, α = 0.5, δ = 0.2 and K = 0.5 in Panel (b).
4. Conclusion
Cooperation is recognized as a foundation for sustaining socio-economic development. Building upon prior research and empirical observations of individual behavior within real social systems, this study introduces a novel strategy update rule. This rule incorporates a system-wide consensus objective and employs an individual’s historical attainment of this objective as a key performance metric, calculated using the Bush-Mosteller (BM) model within a reinforcement learning framework.
Results demonstrate that, compared to conventional update mechanisms, the proposed rule significantly amplifies cooperative behavior within the system. This enhancement stems from the rule’s systematic devaluation of defection payoffs while concurrently amplifying the perceived benefits of cooperation. This direct mechanism effectively suppresses the proliferation of defection strategies and accelerates the diffusion of cooperative ones. Furthermore, the inherent multidimensionality of the composite performance metric constrains opportunities for defectors by selectively filtering potential imitators. This aligns with empirical evidence indicating that social evaluations rarely hinge on a singular criterion. Critically, establishing an appropriate evaluative benchmark directly fosters cooperative outcomes within groups, and the methodology for behavioral assessment relative to this benchmark is paramount for optimizing group management. Collectively, our findings suggest that multidimensional evaluation creates more favorable conditions for the emergence and persistence of cooperation within complex environmental systems. This research offers theoretical insights into the mechanisms underpinning cooperative behavior in collective settings.
Supporting information
S1 File. This code is used to calculate Fig 1.
https://doi.org/10.1371/journal.pone.0341925.s001
(TXT)
S2 File. This code is used to calculate Fig 2.
https://doi.org/10.1371/journal.pone.0341925.s002
(TXT)
S3 File. This code is used to calculate Fig 3.
https://doi.org/10.1371/journal.pone.0341925.s003
(TXT)
S4 File. This code is used to calculate Fig 4(a).
https://doi.org/10.1371/journal.pone.0341925.s004
(TXT)
S5 File. This code is used to calculate Fig 4(b).
https://doi.org/10.1371/journal.pone.0341925.s005
(TXT)
S6 File. This code is used to calculate Fig 5(a).
https://doi.org/10.1371/journal.pone.0341925.s006
(TXT)
S7 File. This code is used to calculate Fig 5(b).
https://doi.org/10.1371/journal.pone.0341925.s007
(TXT)
S8 File. This code is used to calculate Fig 6.
https://doi.org/10.1371/journal.pone.0341925.s008
(TXT)
S9 File. This code is used to calculate Fig 7(a).
https://doi.org/10.1371/journal.pone.0341925.s009
(TXT)
S10 File. This code is used to calculate Fig 7(b).
https://doi.org/10.1371/journal.pone.0341925.s010
(TXT)
References
- 1. Hauert C, Szabó G. Game theory and physics. Am J Phys. 2005;73(5):405–14.
- 2. Nowak MA. Five rules for the evolution of cooperation. Science. 2006;314(5805):1560–3. pmid:17158317
- 3. Nowak MA, May RM. Evolutionary games and spatial chaos. Nature. 1992;359(6398):826–9.
- 4. Cressman R, Wu J-J, Li C, Tao Y. Game experiments on cooperation through reward and punishment. Biol Theory. 2013;8(2):158–66.
- 5. Sigmund K, Hauert C, Nowak MA. Reward and punishment. Proc Natl Acad Sci. 2001;98(19):10757–62.
- 6. Liu L, Wang L, Niu W, Hua S. Dynamic sanctioning mechanism for cooperative multi-agent systems. Exp Syst Appl. 2026;296:128873.
- 7. Hua S, Liu L. Coevolutionary dynamics of population and institutional rewards in public goods games. Exp Syst Appl. 2024;237:121579.
- 8. Ding R, Wang X, Liu Y, Zhao J, Gu C. Evolutionary games with environmental feedbacks under an external incentive mechanism. Chaos Solitons Fractals. 2023;169:113318.
- 9. Chen Y-D, Guan J-Y, Wu Z-X. Coevolutionary game dynamics with localized environmental resource feedback. Phys Rev E. 2025;111(2–1):024305. pmid:40103166
- 10. Perc M, Szolnoki A. Social diversity and promotion of cooperation in the spatial prisoner’s dilemma game. Phys Rev E Stat Nonlin Soft Matter Phys. 2008;77(1 Pt 1):011904. pmid:18351873
- 11. Szolnoki A, Perc M. Coevolution of teaching activity promotes cooperation. New J Phys. 2008;10(4):043036.
- 12. Szolnoki A, Szabó G. Cooperation enhanced by inhomogeneous activity of teaching for evolutionary Prisoner’s Dilemma games. Europhys Lett. 2007;77(3):30004.
- 13. Wang Z, Wang L, Yin Z-Y, Xia C-Y. Inferring reputation promotes the evolution of cooperation in spatial social dilemma games. PLoS One. 2012;7(7):e40218. pmid:22808120
- 14. Liu L, Chen X, Szolnoki A. Coevolutionary dynamics via adaptive feedback in collective-risk social dilemma game. Elife. 2023;12:e82954. pmid:37204305
- 15. Han X, Zhao X, Xia H. Hybrid learning promotes cooperation in the spatial prisoner’s dilemma game. Chaos Solitons Fractals. 2022;164:112684.
- 16. Bi Y, Yang H. Based on reputation consistent strategy times promotes cooperation in spatial prisoner’s dilemma game. Appl Math Comput. 2023;444:127818.
- 17. Zhang H, An T, Wang J, Wang L, An J, Zhao J, et al. Reputation-based adaptive strategy persistence can promote cooperation considering the actual influence of individual behavior. Phys Lett A. 2024;508:129495.
- 18. He J, Wang J, Yu F, Zheng L. Reputation-based strategy persistence promotes cooperation in spatial social dilemma. Phys Lett A. 2020;384(27):126703.
- 19. Lu S, Wang Y. Past-performance-driven strategy updating promote cooperation in the spatial prisoner’s dilemma game. Appl Math Comput. 2025;491:129220.
- 20. Xu J, Deng Z, Gao B, Song Q, Tian Z, Wang Q, et al. Popularity-driven strategy updating rule promotes cooperation in the spatial prisoner’s dilemma game. Appl Math Comput. 2019;353:82–7.
- 21. Lu S, Wang Y. Experience-driven learning and interactive rules under link weight adjustment promote cooperation in spatial prisoner’s dilemma game. Appl Math Comput. 2025;497:129381.
- 22. Jia D, Guo H, Song Z, Shi L, Deng X, Perc M, et al. Local and global stimuli in reinforcement learning. New J Phys. 2021;23(8):083020.
- 23. Masuda N, Nakamura M. Numerical analysis of a reinforcement learning model with the dynamic aspiration level in the iterated Prisoner’s dilemma. J Theor Biol. 2011;278(1):55–62. pmid:21397610
- 24. Ezaki T, Horita Y, Takezawa M, Masuda N. Reinforcement learning explains conditional cooperation and its moody cousin. PLoS Comput Biol. 2016;12(7):e1005034. pmid:27438888
- 25. Jia D, Li T, Zhao Y, Zhang X, Wang Z. Empty nodes affect conditional cooperation under reinforcement learning. Appl Math Comput. 2022;413:126658.
- 26. Geng Y, Liu Y, Lu Y, Shen C, Shi L. Reinforcement learning explains various conditional cooperation. Appl Math Comput. 2022;427:127182.
- 27. Zhang L, Li Y, Xie Y, Feng Y, Huang C. The combined effects of conformity and reinforcement learning on the evolution of cooperation in public goods games. Chaos Solitons Fractals. 2025;193:116071.
- 28. Horita Y, Takezawa M, Inukai K, Kita T, Masuda N. Reinforcement learning accounts for moody conditional cooperation behavior: experimental results. Sci Rep. 2017;7:39275. pmid:28071646
- 29. Tampuu A, Matiisen T, Kodelja D, Kuzovkin I, Korjus K, Aru J, et al. Multiagent cooperation and competition with deep reinforcement learning. PLoS One. 2017;12(4):e0172395. pmid:28380078
- 30. Ding Z-W, Zheng G-Z, Cai C-R, Cai W-R, Chen L, Zhang J-Q, et al. Emergence of cooperation in two-agent repeated games with reinforcement learning. Chaos Solitons Fractals. 2023;175:114032.
- 31. Bendor J, Mookherjee D, Ray D. Aspiration-based reinforcement learning in repeated interaction games: an overview. Int Game Theory Rev. 2001;03(02n03):159–74.
- 32. Zhang L, Huang C, Li H, Dai Q, Yang J. Cooperation guided by imitation, aspiration and conformity-driven dynamics in evolutionary games. Phys A: Stat Mech Appl. 2021;561:125260.
- 33. Perc M, Wang Z. Heterogeneous aspirations promote cooperation in the prisoner’s dilemma game. PLoS One. 2010;5(12):e15117. pmid:21151898
- 34. Li Z, Yang Z, Wu T, Wang L. Aspiration-based partner switching boosts cooperation in social dilemmas. PLoS One. 2014;9(6):e97866. pmid:24896269
- 35. Liu X, He M, Kang Y, Pan Q. Aspiration promotes cooperation in the prisoner’s dilemma game with the imitation rule. Phys Rev E. 2016;94(1–1):012124. pmid:27575094
- 36. You T, Shi L, Wang X, Mengibaev M, Zhang Y, Zhang P. The effects of aspiration under multiple strategy updating rules on cooperation in prisoner’s dilemma game. Appl Math Comput. 2021;394:125770.
- 37. Chen Y-S, Yang H-X, Guo W-Z, Liu G-G. Promotion of cooperation based on swarm intelligence in spatial public goods games. Appl Math Comput. 2018;320:614–20.
- 38. Lu S, Dai J, Zhu G, Guo L. Investigating the effectiveness of interaction-efficiency-driven strategy updating under progressive-interaction for the evolution of the prisoner’s dilemma game. Chaos Solitons Fractals. 2023;172:113493.
- 39. Wang J, He J, Yu F. Heterogeneity of reputation increment driven by individual influence promotes cooperation in spatial social dilemma. Chaos Solitons Fractals. 2021;146:110887.
- 40. Szabó G, Vukov J, Szolnoki A. Phase diagrams for an evolutionary prisoner’s dilemma game on two-dimensional lattices. Phys Rev E Stat Nonlin Soft Matter Phys. 2005;72(4 Pt 2):047107. pmid:16383580
- 41. Perc M, Szolnoki A, Szabó G. Restricted connections among distinguished players support cooperation. Phys Rev E Stat Nonlin Soft Matter Phys. 2008;78(6 Pt 2):066101. pmid:19256899
- 42. Wang Z, Szolnoki A, Perc M. Interdependent network reciprocity in evolutionary games. Sci Rep. 2013;3:1183. pmid:23378915
- 43. Szolnoki A, Perc M. Promoting cooperation in social dilemmas via simple coevolutionary rules. Eur Phys J B. 2008;67(3):337–44.
- 44. Perc M, Szolnoki A. Social diversity and promotion of cooperation in the spatial prisoner’s dilemma game. Phys Rev E Stat Nonlin Soft Matter Phys. 2008;77(1 Pt 1):011904.