Gradual reputation dynamics evolve and sustain cooperation in indirect reciprocity

  • Hitoshi Yamamoto ,

    Contributed equally to this work with: Hitoshi Yamamoto, Isamu Okada

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Writing – original draft, Writing – review & editing

    hitoshi@ris.ac.jp

    Affiliation Department of Business Administration, Rissho University, Tokyo, Japan

  • Isamu Okada ,

    Contributed equally to this work with: Hitoshi Yamamoto, Isamu Okada

    Roles Formal analysis, Funding acquisition, Methodology, Writing – original draft

    Affiliation Department of Business Administration, Soka University, Tokyo, Japan

  • Takahisa Suzuki

    Roles Data curation, Formal analysis, Visualization, Writing – review & editing

    Affiliation College of Policy Studies, Tsuda University, Tokyo, Japan

Abstract

Humans have achieved widespread cooperation, largely sustained by mechanisms such as indirect reciprocity, which relies on reputation and social norms. People are highly motivated to maintain a good reputation, and social norms play a critical role in reputation systems by defining acceptable behavior, helping prevent exploitation by free-riders. However, there is a gap between theory and experiment in handling reputation information, with experiments often failing to capture the complexity that theoretical models propose. Here, we address two key issues: what kind of information is needed to define reputation as a social norm and the appropriate level of granularity required for reputation information to function effectively. This paper combines scenario-based experiments and evolutionary game theory to investigate the social norms individuals adopt in real-world settings, aiming to uncover the stability of these norms. Our results show that reputations should be categorized into three levels: good, neutral, and bad. The results also suggest gradual reputation dynamics, in which reputation rises or falls by one level following cooperation or defection. The exception is defection against a person with a bad reputation, which leaves the defector’s reputation unchanged. Our experimental and theoretical results support critical insights into the dynamics of reputation and social norms within indirect reciprocity, challenging traditional binary reputational evaluations. The gradual nature of reputation updating and the use of nuanced evaluations provide a more realistic model of reputation dynamics.

Introduction

While various species exhibit cooperative behaviors [1, 2], no species has achieved the same level of widespread cooperation as humans [3]. Among the mechanisms identified for the evolution of cooperation [4], indirect reciprocity [5, 6] stands out as a particularly powerful factor in sustaining cooperation within large and dynamic societies. Indirect reciprocity, where individuals help others and receive help in return from different individuals, hinges on the crucial elements of reputation and the social norms that govern it.

Social norms, which define what is considered “good” or “bad” behavior, are integral to the functioning of reputational systems. These norms help individuals discern who deserves cooperation and who does not, thereby preventing exploitation by free-riders. Extensive research across various fields, including biology, physics, economics, and psychology, has sought to identify which norms can sustain a stable cooperative society. For instance, norms such as “cooperation is good and defection is bad” [7, 8], “cooperating with a bad person is bad, and defecting against a bad person is good” [9–12], and “even cooperating with a bad person is good, and defecting against a bad person is also good” [5, 13, 14] have been extensively studied.

In this paper, we address two key issues in indirect reciprocity. The first concerns the role of higher-order information, which involves the complexity of reputation information used to evaluate social interactions. The second focuses on the granularity of reputation information, specifically how detailed or simplified reputation categories need to be for effective social norms to emerge.

The importance of higher-order information has been emphasized in theoretical studies, which have proposed complex models that consider not only first-order information (the actor’s behavior) but also second-order (the recipient’s reputation) and even third-order information (the actor’s reputation). These models suggest that cooperation becomes more robust as higher-order information is incorporated [12, 15–17]. However, there is a significant gap between theory and experiment: while theoretical studies highlight the potential benefits of higher-order information, experimental studies often neglect these complexities. Some experimental findings suggest that humans prefer simpler, first-order information due to cognitive limitations [18], while others indicate that individuals are willing to use higher-order information despite its complexity and associated costs [19, 20]. However, existing studies have primarily focused on whether second-order information is used, without examining in detail how individuals incorporate such information into their reputation evaluations and decision-making processes. The gap underscores the need to further investigate how humans evaluate and utilize higher-order information in practice, particularly in scenarios where cooperation depends on such nuanced assessments.

The granularity of reputation information refers to the degree of detail or resolution in the categories used to represent reputation. While theoretical models often simplify reputation into binary categories (e.g., good or bad) for analytical tractability [15], real-world interactions suggest a more nuanced approach is necessary. For example, recent studies indicate that people may use multi-valued reputations, such as including a “neutral” category alongside “good” and “bad” [21]. Despite this, experimental research on the granularity of reputation information remains limited, with few studies providing empirical evidence to determine the appropriate level of granularity that can inform theoretical models.

Moreover, from a methodological perspective, there is a significant gap between experimental and theoretical approaches. Although extensive literature has examined indirect reciprocity, most studies have employed experimental [18, 22–26] or theoretical [14–16, 27–30] approaches independently. Experimental studies often focus on how people evaluate others’ actions and assign reputations, yet these evaluations are not always grounded in theoretical analysis, making their evolutionary stability difficult to assess. Conversely, theoretical studies comprehensively analyze cooperation’s evolutionary stability but typically neglect the applicability of these findings to real-world behaviors. This disconnection limits our understanding of how social norms function and persist in human societies.

In this paper, we address these gaps by combining experimental and theoretical approaches to investigate the social norms that individuals adopt and their evolutionary stability. First, we conduct a scenario-based experiment in which reputation-determining information includes both the actor’s reputation (third-order information) and the recipient’s reputation (second-order information), with all reputation information being three-valued, i.e., good, neutral, and bad. This experiment investigates the social norms individuals adopt in real-world contexts, and the experimental scenarios systematically vary the actor’s behavior (cooperative or defective), the reputation of recipients (good, neutral, or bad), and the reputation of actors (good, neutral, or bad). After clarifying the social norms employed in real-world contexts, we rigorously assess their evolutionary stability using mathematical analysis. Specifically, we analyze replicator dynamics within the framework of evolutionary game theory to determine whether the norm discovered in the experiment is evolutionarily stable. By combining these approaches, we aim to identify the social norms prevalent in real societies and provide a comprehensive understanding of why these norms emerge and persist.

Results

Experimental exploration of social norm

We designed a scenario-based experiment to elucidate how individuals apply evaluation rules in indirect reciprocity scenarios. The scenario involved two characters: the recipient, who requested cooperation from the donor, and the donor, who decided whether to cooperate with the recipient. We characterized the donor and recipient through prior information for three types of reputation: “good reputation,” “bad reputation,” and “neutral (no reputation information).” In this study, neutral reputation conditions were represented by the absence of reputation information. However, the absence of reputation information does not necessarily equate to a neutral reputation. To determine whether having no reputation information corresponds to having a neutral reputation, we analyzed participants’ evaluations of recipients without reputation information. The results indicated that participants assigned neutral evaluations to recipients without reputation information (S2 Fig). The experiment featured 18 different scenarios, combining variations in donor behavior (cooperate or not), donor reputation (good, bad, or neutral), and recipient reputation (good, bad, or neutral), and it was executed using a between-subjects design. Our experimental setting builds upon previous research [21], which analyzed four specific cases in which a donor with a neutral reputation either cooperates with or defects against a recipient with a good or bad reputation. Participants were randomly assigned to one of the 18 scenarios and gave their impressions of the donor’s behavior.

The experimental results indicate that all 18 scenes can be categorized into three types of distributions: a right-skewed distribution (evaluated as bad), a left-skewed distribution (evaluated as good), and a distribution with a central peak (evaluated as neutral) (Fig 1). First, we conducted a cluster analysis using the Wasserstein distance to classify participants’ evaluations of the 18 scenes, resulting in three distinct clusters (S1 Fig). Next, to identify the characteristics of each cluster, we fitted three distributions, a normal distribution (interpreted as neutral), an exponential distribution (interpreted as bad), and an exponentially reversed distribution (interpreted as good), using the maximum likelihood estimation method. We determined the best-fitting distribution for each cluster using the Bayesian Information Criterion (BIC). The results (Table S2) indicated that the clusters could be characterized as a right-skewed distribution (bad), a left-skewed distribution (good), and a distribution with a central peak (neutral). Table 1 presents the classification results for each scene, indicating that the reputation values were classified as “good”, “bad”, and “neutral”. These findings reveal that, irrespective of the donor’s reputation, cooperative behavior consistently improved the rating by one level. In contrast, defection generally lowered the evaluation by one level. However, defection against bad recipients was an exception, as the donor’s reputation remained unchanged. In other words, when a neutral donor engaged in justified defection, their reputation stayed neutral, whereas justified defection by a good donor was rated as good. These results suggest that justified defection is only acceptable when carried out by donors with a good reputation.
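
The clustering step relies on a distance between rating distributions. As a minimal sketch, assuming ratings on a five-point scale and using made-up counts rather than the experimental data, the one-dimensional Wasserstein distance between two scenes can be computed from the difference between their cumulative distributions. The function name and the example histograms are hypothetical and serve only as illustration.

```python
from itertools import accumulate


def wasserstein_1d(counts_a, counts_b):
    """1-Wasserstein distance between two histograms on the same
    equally spaced rating scale (spacing of 1 between ratings)."""
    if len(counts_a) != len(counts_b):
        raise ValueError("histograms must share the same rating scale")
    total_a, total_b = sum(counts_a), sum(counts_b)
    # Empirical cumulative distribution functions of both histograms.
    cdf_a = [c / total_a for c in accumulate(counts_a)]
    cdf_b = [c / total_b for c in accumulate(counts_b)]
    # On a unit-spaced grid, the distance is the summed absolute CDF gap.
    return sum(abs(x - y) for x, y in zip(cdf_a, cdf_b))


# Hypothetical rating counts for ratings 1..5 (not data from the study).
mostly_bad = [50, 30, 10, 5, 5]
mostly_good = [5, 5, 10, 30, 50]
central = [5, 20, 50, 20, 5]

print(wasserstein_1d(mostly_bad, mostly_good))
print(wasserstein_1d(mostly_bad, central))
```

In this toy example the two skewed histograms lie farther from each other than either does from the centrally peaked one, which is the kind of separation a clustering procedure could exploit.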

Fig 1. Distribution of evaluations of donor’s behavior:

Violin plots show distribution of donors’ behavior evaluations for each of 18 scenes. Red and blue correspond to defection and cooperation, respectively.

https://doi.org/10.1371/journal.pone.0329742.g001

Table 1. Evaluation rule considering third-order information and multiple reputation values: Based on the cluster analysis and the best-fitting distribution selected for each cluster by BIC, reputation values are classified as good, bad, and neutral.

https://doi.org/10.1371/journal.pone.0329742.t001

The results in Table 1 allow us to construct a state transition diagram that represents the dynamics of reputation (Fig 2). Participants assign three types of reputation to others, good (G), neutral (N), and bad (B), and these reputations are updated by one level on the basis of observed behavior. Generally, cooperation and defection are evaluated as positive and negative, respectively. However, contrary to prior research assumptions, individuals apply a more moderate reputation updating rule. Specifically, defection against a bad recipient does not alter the donor’s reputation. Justified defection is considered acceptable only for donors with a good reputation. Reputation updating occurs gradually, with no dramatic shifts, such as bad becoming good or good becoming bad. We refer to the social norm revealed by this experiment as “gradating”. This norm is characterized by gradated reputation updates and a neutral attitude towards justified defection, where reputation remains unchanged. These features can be viewed as an extension of the L1 norm from the “leading eight” [15], applied to a multi-valued reputation system.

Fig 2. State transition dynamics of reputation considering third-order information and multiple reputation values:

Reputation improves by one step for cooperation and deteriorates by one step for defection. However, when defection occurs against a bad recipient (justified defection), the donor’s reputation remains unchanged.

https://doi.org/10.1371/journal.pone.0329742.g002
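
The transition rule described above can be written as a small function. This is a minimal sketch: the integer encoding (0 = bad, 1 = neutral, 2 = good) follows the labels used in the theoretical analysis, and the function and constant names are hypothetical.

```python
BAD, NEUTRAL, GOOD = 0, 1, 2


def gradating_update(donor_rep, cooperated, recipient_rep):
    """Return the donor's new reputation under the gradating norm."""
    if cooperated:
        # Cooperation raises reputation by one step (good stays good).
        return min(donor_rep + 1, GOOD)
    if recipient_rep == BAD:
        # Justified defection leaves the donor's reputation unchanged.
        return donor_rep
    # Any other defection lowers reputation by one step (bad stays bad).
    return max(donor_rep - 1, BAD)


# Enumerate all 18 combinations of donor reputation, action, and recipient reputation.
names = {BAD: "bad", NEUTRAL: "neutral", GOOD: "good"}
for donor in (GOOD, NEUTRAL, BAD):
    for cooperated in (True, False):
        for recipient in (GOOD, NEUTRAL, BAD):
            new_rep = gradating_update(donor, cooperated, recipient)
            action = "cooperates with" if cooperated else "defects against"
            print(f"{names[donor]} donor {action} {names[recipient]} recipient -> {names[new_rep]}")
```

Because every update moves at most one step, a bad donor can never become good, and a good donor can never become bad, after a single observation.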

Overall, the results suggest that reputation updates in social interactions are gradual. Justified defection is conditionally accepted on the basis of the donor’s given reputation, highlighting the importance of a given reputation status in evaluating subsequent actions. Although the cooperation of a donor with a good reputation is naturally rated as good, the five-point scale used in this experiment makes it unclear whether the evaluation has improved or simply remained good. The same applies to the defection of donors with a bad reputation. Theoretical research has explored various levels of reputational granularity [7, 31], but the appropriate granularity of reputations remains controversial in experimental research. Future research should address whether reputation should be expressed using three values or extended to a more nuanced scale.

Our experimental setting can be regarded as private in that observers independently evaluate the donor, and as public in that the donor’s and recipient’s reputations are presented as shared information. The analytical model employs a public information structure. On the other hand, recent studies of indirect reciprocity have extensively employed private assessment models [28, 29, 32–34]. The results of this study align with findings from private evaluation research, demonstrating that cooperation is consistently evaluated positively even when reputation is public. Future work should examine whether similar patterns emerge in experiments and simulations based on fully private assessment systems.

Theoretical analysis of social norm

We now examine whether the “gradating” social norm discovered in our experimental study is viable from a theoretical point of view. To do so, we conduct an evolutionary game theoretical analysis using replicator dynamics to look for adaptive action rules that sustain cooperative regimes, assuming that all players adopt the discovered social norm.

First, we model the gradating social norm. The gradating norm uses a ternary reputation label, so we define the set of labels as {0, 1, 2}, where the labels 0, 1, and 2 correspond to bad, neutral, and good, respectively. Additionally, any action is binary, so we define an action as 1 when one cooperates and as 0 when one defects. In a social norm using a ternary reputation label, the number of possible action rules is eight: a0, a1, …, a7. The index of each of these eight rules is written as a three-letter binary string. For example, the index of action rule a3 is written as [011]. The meanings of the letters are as follows. The first letter is the action toward someone with a 0 reputation label. The second and third letters are, respectively, the actions toward someone with a 1 and a 2 reputation label. For example, a player who uses the action rule a3 cooperates only with someone who has either a 1 or 2 reputation label. Note that an unconditional cooperator corresponds to a player who adopts a7, while an unconditional defector corresponds to an a0 player.
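
The encoding of action rules can be illustrated with a short sketch that converts a rule index into its three-letter string. The function name is hypothetical.

```python
def action_rule(i):
    """Return the actions of rule a_i toward labels 0 (bad), 1 (neutral),
    and 2 (good), where 1 means cooperate and 0 means defect."""
    if not 0 <= i <= 7:
        raise ValueError("rule index must be between 0 and 7")
    # Three-digit binary expansion of the index, most significant digit first.
    return tuple(int(ch) for ch in format(i, "03b"))


for i in range(8):
    print(f"a{i}: {action_rule(i)}")
```

For instance, `action_rule(3)` yields `(0, 1, 1)`: defect against bad, cooperate with neutral and good.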

Here, we model the evolutionary game for an infinite, well-mixed population of players. Let population p = (p0, p1, …, p7) be the state in which the proportion of players adopting action rule ai is pi, where each pi is non-negative and the proportions sum to 1. To explore the evolutionary dynamics of the reputation labels, we assume that the time scale for natural selection is much slower than that for social interactions and label updating [35]. Therefore, we can assume that the frequency of labels is always at an equilibrium value, i.e., the expected probability of a player’s reputation label converges to a steady state. For each action rule ai and each reputation label r, we consider the fraction of players with action rule ai having an r reputation label in population p. The system of equations for computing the equilibrium values of these fractions and the results are shown in Methods.
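
To make the idea of an equilibrium label distribution concrete, the following sketch iterates the label fractions in a population where everyone uses the same action rule. It is an illustration under stated assumptions, not the system of equations given in Methods: a donor meets a uniformly random recipient, an intended cooperation fails with probability e and is then assessed as a defection, and labels are updated by the gradating norm without assessment errors. All function names and the starting distribution are hypothetical.

```python
def label_step(h, rule, e=0.0):
    """One update of label fractions h = (bad, neutral, good) when every
    player uses the same action rule (rule[r] = 1 means cooperate with label r)."""
    # Probability that a random donor actually cooperates (label goes up).
    up = (1 - e) * sum(h[r] * rule[r] for r in range(3))
    # Probability that a donor defects against a neutral or good recipient
    # (label goes down); defection against a bad recipient changes nothing.
    down = sum(h[r] * (1 - rule[r] * (1 - e)) for r in (1, 2))
    stay = 1 - up - down
    return (
        h[0] * (1 - up) + h[1] * down,             # bad: floor at the lowest label
        h[0] * up + h[1] * stay + h[2] * down,     # neutral
        h[1] * up + h[2] * (1 - down),             # good: cap at the highest label
    )


def iterate_labels(rule, e=0.0, h=(0.5, 0.3, 0.2), steps=1000):
    """Apply label_step repeatedly from an assumed starting distribution."""
    for _ in range(steps):
        h = label_step(h, rule, e)
    return h


a2 = (0, 1, 0)  # cooperates only with neutral
a3 = (0, 1, 1)  # cooperates with neutral and good

print(label_step((1 / 3, 1 / 3, 1 / 3), a2))  # uniform labels are a fixed point for a2
print(iterate_labels(a3))                      # without errors, labels drift toward good
```

Under these assumptions, the uniform distribution over the three labels is a fixed point for a population of a2 players, which is consistent with a cooperation rate of about 1/3, whereas an error-free population of a3 players ends up almost entirely good.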

To explore adaptive action rules, all players change their own action rules following an evolutionary process. Replicator dynamics [36] models the natural assumption that players who obtain higher-than-average (expected) payoffs are more likely to increase their proportion, and it is suitable for exploring adaptive strategies through natural selection in biology and other fields. In addition to natural selection through replicator dynamics, here we consider a more realistic analysis by analyzing replicator dynamics with perturbations [37] that models the random entry of free strategies as a small percentage of mutations.

We introduce two parameters to generalize the model and to reflect realistic situations. The first is an implementation error: with probability e, a player who intends to cooperate fails to do so. To calculate expected payoffs, we use two game parameters: the cost of cooperation (c) and the benefit to the recipient (b), where b > c > 0. We can now define the expected payoff for a player using action rule ai in population p:

(1)

where the indices i and j enter through their binary transformations, i.e., the three-letter strings that encode action rules ai and aj.
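
As an illustration of how such a payoff can be computed, the sketch below implements a standard donation-game payoff under random matching. It rests on assumptions and may differ in detail from Eq (1): each player acts once as donor and once as recipient per round, the label fractions h are given (for example, from their equilibrium values), and the same error probability e applies to every intended cooperation. The names `expected_payoff`, `p`, `h`, and `rules` are hypothetical.

```python
def expected_payoff(i, p, h, rules, b, c, e):
    """Expected payoff per round of a player using rule index i.

    p[j]        : proportion of players using rule j
    h[j][r]     : fraction of rule-j players holding label r
    rules[j][r] : 1 if rule j cooperates with a label-r recipient, else 0
    """
    n = len(p)
    # Chance of being helped: a rule-j donor cooperates with my label r.
    received = sum(
        p[j] * sum(h[i][r] * rules[j][r] for r in range(3)) for j in range(n)
    )
    # Chance of helping: my rule cooperates with a rule-j recipient's label r.
    paid = sum(
        p[j] * sum(h[j][r] * rules[i][r] for r in range(3)) for j in range(n)
    )
    # Intended cooperation succeeds with probability 1 - e on both sides.
    return (1 - e) * (b * received - c * paid)


rules = [tuple(int(ch) for ch in format(k, "03b")) for k in range(8)]
p = [0, 0, 0, 0, 0, 0, 0, 1]         # everyone is an unconditional cooperator (a7)
h = [(0, 0, 1)] * 8                   # assume every player is labeled good
print(expected_payoff(7, p, h, rules, b=2, c=1, e=0))  # resident cooperator
print(expected_payoff(0, p, h, rules, b=2, c=1, e=0))  # rare unconditional defector
```

In this toy setting the resident cooperator earns b − c while a rare defector earns b, which shows why label-dependent action rules are needed to resist free-riders.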

In this system, for a population p consisting of eight types of action rules A = {ai}, the proportions of reputation labels among the players of each action rule enter the payoffs of each action rule, so it is extremely difficult to obtain an exact analytical solution and determine which action rule is dominant for each population. However, what matters for the question of whether the social norm is theoretically viable is whether a stable population can exist that maintains a high cooperation rate. Therefore, we first consider evolutionary stability (whether a rule is an evolutionarily stable strategy, ESS) for a single population, that is, a population in which everyone adopts the same single action rule. As shown in Methods, the analysis when there are no errors (that is, when e = 0) revealed that a2 is an ESS. However, because the cooperation rate of a group consisting only of a2 reaches only about one-third, this is not a desirable outcome in the sense of achieving full cooperation. A more important problem is that the action rule a2 is generally difficult to interpret. This action rule is denoted as [010]: players choose to cooperate only with others whose reputation label is 1 (neutral) and do not cooperate with others whose reputation is good or bad. Note, however, that there are possible interpretations that can justify such an action rule. For example, if a player has a good reputation, she or he may choose not to cooperate because her or his reputation will not be damaged immediately.

To overcome this point, the analysis needs to be extended to a more realistic setting. Thus, in addition to introducing errors, we assume that the population is always exposed to new entrants. We therefore introduce the mutation rate μ as the second parameter: with probability μ, players with randomly chosen action rules invade the population in the replicator dynamics with perturbations. The frequency pi of players with action rule ai in population p changes over time in accordance with the replicator dynamics with perturbations as follows:

(2)

where each action rule gains a fixed inflow of mutants in every period and, in turn, each action rule loses a corresponding share of its current frequency so that the mutants replace existing players.
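
A discrete-time sketch of such dynamics is given below. It uses one common form of replicator dynamics with uniform mutation, in which every rule receives an inflow of μ/n and loses μ times its current frequency. This form and the Euler step size `dt` are assumptions for illustration and may differ from Eq (2); the function name is hypothetical.

```python
def replicator_mutation_step(p, payoffs, mu, dt=0.01):
    """One Euler step of replicator dynamics with uniform mutation.

    p       : current frequencies of the action rules (sums to 1)
    payoffs : expected payoff of each action rule in population p
    mu      : mutation rate
    """
    n = len(p)
    average = sum(pi * fi for pi, fi in zip(p, payoffs))
    return [
        # Selection term plus mutant inflow (mu / n) minus outflow (mu * pi).
        pi + dt * (pi * (fi - average) + mu * (1 / n - pi))
        for pi, fi in zip(p, payoffs)
    ]


# Toy example with two rules and fixed, made-up payoffs.
p = [0.5, 0.5]
for _ in range(3):
    p = replicator_mutation_step(p, payoffs=[1.0, 0.0], mu=0.01, dt=0.1)
    print(p)
```

Because the inflow and outflow terms cancel when summed over all rules, the frequencies continue to sum to 1, and no rule with a positive mutation rate can vanish completely.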

We are now ready to analyze the model using numerical simulations. Fig 3 shows the time series of the action-rule frequencies and the cooperation rate under the replicator dynamics with perturbations. The left panel shows that the dynamics pass through four phases. In the first phase, a0 (unconditional defectors) gains power and drives many action rules (especially a4 to a7) close to extinction. Note that our dynamics are perturbed so that a certain number of mutants are always introduced, so no rule goes completely extinct. In the second phase, a2, an odd action rule that cooperates only with neutral players, forms the majority. At this time, the cooperation rate slowly rises to about 1/3. This is because, in a population consisting only of a2, the good, neutral, and bad labels are held in equal shares of 1/3 each (see the “fractions of reputation labels with each action rule” in Methods for details). During this second phase, a1, a strict rule that cooperates only with good players, can invade because it has the same expected payoff (see the ESS analysis for a single population in Methods for details). When the frequency of a1 eventually exceeds that of a2, the short third phase begins. At this time, a1 suddenly becomes the majority, and a2 is driven close to extinction. In this third phase, the a3 rule, which dominates a1, rapidly increases in frequency, and the cooperation rate rises to nearly full cooperation. Then, in the final phase, the a3 rule forms the majority, and the society stabilizes as a cooperative society. The a3 rule can be regarded as a tolerant rule in terms of prosocial behavior because it cooperates with both good and neutral players.

Fig 3. Replicator dynamics on action rules and cooperation rate

Each panel shows how the fraction of each action rule and the cooperation rate change over generations. The horizontal axis represents generations in which strategies are updated. The left and right panels use different parameter settings. The dotted black line shows the cooperation rate, and the solid colored lines show the population ratio of each action rule.

https://doi.org/10.1371/journal.pone.0329742.g003

The two panels of Fig 3 show the dynamics under different parameters. Apart from slower phase transitions, the dynamics generally follow the same four phases as in the left panel. In the right panel of Fig 3, however, only the first two phases occur. This is because, when a2 is the majority, a1 can invade only if the population ratio of a2 is not too high. In the right panel of Fig 3, the population ratio of a2 is too high, so the system converges before the third phase. In other words, depending on the parameters, there are two outcomes: the dynamics stop at the second phase and converge with a2 as the majority, or they reach the fourth phase and converge with a3 as the majority. The point here is that when a3 becomes the majority, the cooperation rate is nearly full, whereas when a2 becomes the majority, the cooperation rate remains at about 1/3. For the system to maintain a cooperative regime, a3 needs to form the majority. Fig 4 shows which rule becomes the majority depending on the parameters e and μ. While the simulations reported here start from a specific initial population, the results are quite robust to the composition of the initial population, as shown in Methods.

Fig 4. Action rule that forms the majority in equilibrium:

As shown in Fig 3, the equilibrium state depends on the model parameters, particularly the error rate (e) and the mutation rate (μ). The system converges either to the a3 regime, where the cooperation rate is nearly full and a3 dominates the population (left panel of Fig 3), or to the a2 regime, where the cooperation rate is approximately one-third and a2 is most frequent (right panel of Fig 3). The figure indicates which regime emerges at equilibrium across different combinations of e and μ, with yellow representing the a2 regime and purple representing the a3 regime.

https://doi.org/10.1371/journal.pone.0329742.g004

Discussion

In his famous Othello, which premiered in 1604, Shakespeare made Cassio say, “Reputation, reputation, reputation! O, I have lost my reputation!” As depicted in such classics, people are eager to acquire a good reputation and feel great pain over losing one [38]. It is well recognized that a significant portion of human communication centers on the exchange of reputational information [39, 40]. Reputation undeniably serves as a fundamental pillar of human society, yet research on reputation dynamics has often lacked a cohesive integration of theory and experimentation. Here, we clarify the dynamics of reputation that individuals adopt through experimental investigations, aiming to construct a theoretical model strongly supported by experimental evidence.

We designed an experimental and theoretical framework that considers both higher-order information and a multi-value reputation. Our experiment categorized all cases containing third-order information into three reputation levels: good, bad, and neutral. The experimental results show that cooperative behavior always increases the donor’s reputation by one level, while defection generally decreases it by one level. However, defection against a bad recipient does not alter the donor’s reputation. Justified defection maintains the donor’s reputation, and if performed by someone with a good reputation, it continues to be evaluated positively. These findings suggest that reputation updates are gradual and that evaluations of cooperation and defection are influenced by situational factors.

The results also align conceptually with previous theoretical studies that introduced ternary reputation systems [41, 42], both of which demonstrated the evolutionary stability of cooperation under third-order norms within a ternary reputation framework. Our experimental results indicate that individuals adopt a more tolerant norm than that in the conventional ternary image scoring system [41]. Consistent with previous findings [42], the neutral evaluation appears to function as a buffer, allowing cooperation to stabilize more flexibly than in binary reputation systems. Furthermore, we demonstrate that individuals adopt a neutral stance toward justified defection and stabilize cooperation under a tolerant ternary reputation framework.

Our theoretical model based on evolutionary game theory has elucidated the mechanisms through which different action rules evolve and dominate the population in the gradating norm. Our analysis demonstrated that the system stabilizes a cooperative regime in the gradating norm. Additionally, the dominant behavioral rule was that of a tolerant discriminator, who only defects against individuals with bad reputations and cooperates with those with good or neutral reputations.

The results of our study provide valuable insights into the complexities of reputation dynamics and social norms in indirect reciprocity. Our findings reveal several critical patterns in how individuals evaluate and update reputations, challenging some of the prevailing assumptions in the literature. First, our experiment confirmed that reputational evaluations are not strictly binary but can be categorized into good, bad, and neutral. This nuanced classification reflects a more sophisticated understanding of reputation, aligning with recent research that suggests individuals perceive reputations with greater granularity [21]. The recognition of a neutral category alongside good and bad reputations provides a richer framework for understanding how people assess and react to others’ behaviors in real-world scenarios. Second, our findings reveal a discrepancy between theoretical predictions and experimental observations regarding the updating rule of reputation. While theories propose that observers swiftly update a donor’s reputation to good or bad based on a single observation, our results indicate that individuals apply a gradual updating rule. Notably, we observe an intriguing result: individuals do not update their ratings in cases of justified defection. While a theoretical study suggests that people disregard all actions directed at recipients with bad reputations [43], our findings demonstrate that individuals perceive cooperation with bad recipients as a positive action.

The fact that justified defection is not considered good may have important implications for punishment in society. Punishment is recognized as a straightforward yet effective mechanism for sustaining cooperation in social dilemmas [44, 45]. In indirect reciprocity, justified defection serves as a punishment that removes free riders. The second-order norm, which can maintain cooperation robustly, has often been associated with Stern Judging [9–12] and Simple Standing [5, 13, 14], both of which evaluate justified defection as good. By evaluating justified defection positively, these norms encourage punitive actions against free riders, thereby supporting the maintenance of cooperation. Anthropological research has further suggested that early-developing cognitive mechanisms may naturally lead individuals to view justified defection as morally acceptable [46]. However, recent studies have shown that people tend to adopt a robustly neutral stance toward justified defection [21, 47]. This tendency reflects a psychological reluctance to assign moral approval to punitive behavior, even when such behavior is arguably warranted. Our findings are consistent with earlier work, which indicates that individuals who administer punishment, even when it is normatively justified, do not necessarily enjoy reputational benefits in human societies [48–52]. Further research is needed to investigate the cultural specificity, contextual sensitivity, and developmental trajectory of attitudes toward justified defection.

Several limitations of this study warrant careful consideration. First, we recruited participants from a Japanese crowdsourcing platform, resulting in a sample predominantly composed of Japanese adults. This raises the possibility that the social norms observed in this study may partly reflect cultural values specific to Japanese society [53]. To assess the generalizability of the tolerant ternary norm identified here, future research should include cross-cultural samples, particularly from Western contexts. Second, in the present version of the theoretical analysis, we do not introduce assessment errors, and addressing this limitation should be a focus of future analyses. In fact, a system without assessment errors may fail to converge to a unique stationary state. Introducing a small assessment error could help ensure convergence to a unique stationary state, irrespective of the initial conditions. Third, while aggregating reputation into three values is statistically reasonable (S1 Fig), the granularity lost through this aggregation needs to be further investigated. A computational approach, such as modeling reputation with more finely graded scales, could provide valuable insights into how more detailed reputation information influences social evaluations and cooperative behavior.

This study makes significant theoretical contributions by integrating higher-order information and multi-value reputation systems into the analysis of indirect reciprocity. Our results bridge the gap between experimental and theoretical approaches to reputation and social norm dynamics, offering a more comprehensive understanding of how cooperation can be sustained in complex social systems. Future research could explore the implications of these findings in real-world scenarios, where reputations are often multifaceted and influenced by a range of contextual factors.

Methods

Experimental settings

In the experiment, all 18 conditions formed by crossing three factors, the donor’s reputation (good/neutral/bad), the donor’s action (cooperation/defection), and the recipient’s reputation (good/neutral/bad), were manipulated in a between-subjects design. The experiment was conducted on 8 November 2024, and a total of 1850 responses were collected. Participants were recruited through the Japanese website “Yahoo! Crowdsourcing”.

An example of one of the scenarios is as follows. The participants were asked to imagine that they were workers in a restaurant. A colleague, Bob (recipient), asks another colleague, Alice (donor), to take over the night shift, and Alice agrees (cooperation) or refuses (defection). We also controlled Alice’s and Bob’s reputations. Below is an example of a good-reputation donor defecting against a bad-reputation recipient. For additional cases, refer to the Supplementary Information.

Alice works hard and is always willing to take over when others cannot come to do the night shift. That is why Alice is liked very much by colleagues in the restaurant including you. On the other hand, Bob is not serious about his work. Even when other employees ask him to cover for them on night shifts, he rarely agrees, even when he has the time. For this reason, Bob is not well thought of by colleagues in the restaurant including you. One day, Bob asked Alice to cover for him on the night shift because he wanted to go to a concert of his favorite singer. Although Alice had plenty of time, she declined Bob’s request.

After reading the scenario, participants rated how they assessed the donor’s behavior from three viewpoints using a 5-point scale: “Alice is a reliable person”, “Do you like Alice?”, and “Alice is approachable.” The evaluation scores for the donor’s action were added and normalized after checking the one-factor structure, and we used them as the participants’ evaluation scores for the donor. In the actual experiment, the names of the donor and recipient were converted into common Japanese names. A score of 0.0 means that the participant rated the donor’s behavior most negatively, and 1.0 means that the participant rated the donor’s behavior most positively.
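The score construction described above can be sketched as follows. This is a minimal illustration, assuming each of the three items is coded from 1 to 5 and that "normalized" means min-max rescaling of the summed score to the range 0.0 to 1.0. The function name `normalized_evaluation` and its argument names are hypothetical, and the authors' exact procedure may differ.

```python
def normalized_evaluation(reliable: int, likable: int, approachable: int) -> float:
    """Sum three 5-point ratings and rescale the total to the range [0, 1]."""
    items = (reliable, likable, approachable)
    for rating in items:
        if not 1 <= rating <= 5:
            raise ValueError("each rating must be an integer from 1 to 5")
    total = sum(items)        # ranges from 3 (all 1s) to 15 (all 5s)
    return (total - 3) / 12   # assumed min-max normalization


print(normalized_evaluation(1, 1, 1))  # most negative evaluation
print(normalized_evaluation(3, 3, 3))  # midpoint of the scale
print(normalized_evaluation(5, 5, 5))  # most positive evaluation
```

Under this assumed coding, a participant who answers 1 on all items maps to 0.0 and one who answers 5 on all items maps to 1.0, matching the interpretation of the endpoints given above.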

Ethics

The present series of experiments was approved by the Ethics Committee of Rissho University (Ethics approval number: 06-02) and conducted in accordance with the requirements of the Declaration of Helsinki. All participants were informed about the purpose of the study as well as the ways the data would be used. They agreed that the data would be used only for scientific research, that all data would be anonymized, and that they had the right to stop responding at any time. Informed consent was obtained from all participants.

Fractions of reputation labels with each action rule

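Let the new reputation label be the label assigned to a donor who currently holds a given reputation label and takes action a toward a recipient with reputation label r. In the social norm discovered in our experimental study, the donor’s label moves up by one level (or stays at the highest level) when a = 1. If a = 0, the label is unchanged when r = 0 and moves down by one level (or stays at the lowest level) when r>0. We call the social norm “gradating” because a single action never changes the label by more than one level.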
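The update rule of the gradating norm can be written as a small function. This is a sketch under stated assumptions, not the authors' notation: labels are coded 0 (bad), 1 (neutral), and 2 (good), the action is 1 for cooperation and 0 for defection, and labels saturate at 0 and 2. The function name `gradating_norm` is hypothetical.

```python
def gradating_norm(donor_label: int, action: int, recipient_label: int) -> int:
    """Return the donor's new label under the (assumed) gradating norm.

    Labels: 0 = bad, 1 = neutral, 2 = good (assumed coding).
    action: 1 = cooperate, 0 = defect.
    """
    if action == 1:
        return min(donor_label + 1, 2)   # cooperation raises the label by one level
    if recipient_label == 0:
        return donor_label               # defecting against a bad recipient changes nothing
    return max(donor_label - 1, 0)       # any other defection lowers the label by one level


# Print the full assessment table: 3 donor labels x 2 actions x 3 recipient labels.
for donor in (0, 1, 2):
    for action in (1, 0):
        for recipient in (0, 1, 2):
            new = gradating_norm(donor, action, recipient)
            print(f"donor={donor} action={action} recipient={recipient} -> {new}")
```

Enumerating all 18 combinations mirrors the 18 experimental conditions, and every transition changes the label by at most one level.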

We calculate $g_r(a_i)$ defined above. The simultaneous equations for $g_r(a_i)$ are given in Eq (3).

(3)

where

We explain the equation for $g_1(a_i)$ as an example. The first term covers the case in which the player’s reputation label is 0, which occurs with probability $g_0(a_i)$. In that case, the label is updated to 1 if and only if the player cooperates. Because the player’s action rule prescribes $i_r$ toward a recipient with reputation label $r$, the probability that the player cooperates with such a recipient is $(1-e)\,i_r$: when $i_r = 1$, the player cooperates if and only if no implementation error occurs, and when $i_r = 0$, the player defects without error. The potential recipient’s action rule is $a_j$ with probability $p_j$, and the recipient’s reputation label is $r$ with probability $g_r(a_j)$. The second term covers the case in which the player’s reputation label is 1. In that case, the label remains 1 if and only if the player defects against someone with a 0 reputation label. If $i_0 = 1$, the player intends to cooperate and defects only when an implementation error occurs, so the probability of defecting is $e$. If $i_0 = 0$, the player always defects. The two cases combine into a defection probability of $1-(1-e)\,i_0$. The explanation of the third term is omitted.

The 24 values of $g_r(a_i)$ are determined by the 24 simultaneous equations defined above (note that $\sum_r g_r(a_i) = 1$ for each $a_i$), but solving them analytically is generally difficult. Therefore, we introduce discrete time. Evaluating the right side of Eq (3) at time $t$ and treating the result as the value at time $t+1$ gives a difference equation whose solution can be found by numerical simulation.

ESS analysis in single population with no errors

We analyze a special case where all players adopt the same action rule with no error, i.e., let , where (pi = 1), and e = 0. Table 2 shows the values of g(ai) which satisfy Eq (3) for each action rule ai in A.

Table 2. Values of g(ai) satisfying Eq (2) for each action rule in A

https://doi.org/10.1371/journal.pone.0329742.t002

In the cases of a2 and a3, there is another solution , but this solution has an unstable equilibrium, and we will show the proof in the case of a3. If g0(t)<1 then g0(t) decreases over time t because . Note that in the case of a4, the definition of g0 yields , and the definition of g2 yields . Substituting them into , is yielded. There is only one solution, , of this cubic function in .
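The two special cases mentioned above can be checked by hand under stated assumptions. The sketch below is a reconstruction, not the authors' derivation. It assumes labels 0, 1, 2 for bad, neutral, good; that a3 cooperates only with recipients labeled 1 or 2 and a4 only with recipients labeled 0; that e = 0; and that the label update follows the gradating norm (cooperation raises the label by one level, defection against a label-0 recipient leaves it unchanged, and any other defection lowers it by one level).

```latex
% Assumed: monomorphic population, e = 0, labels 0/1/2, g_0 + g_1 + g_2 = 1.

% Case a_3 (defects only against label-0 recipients):
% a label-0 player stays at 0 only by meeting another label-0 player.
g_0(t+1) = g_0(t)^2
\quad\Rightarrow\quad
g_0(t+1) < g_0(t) \ \text{whenever } 0 < g_0(t) < 1 .

% Case a_4 (cooperates only with label-0 recipients), stationary state:
g_0 = (g_0 + g_1)(1 - g_0), \qquad g_2 = (g_1 + g_2)\, g_0 = g_0 (1 - g_0) .

% Using g_0 + g_1 = 1 - g_2 and substituting g_2:
g_0 = \bigl(1 - g_0 + g_0^2\bigr)(1 - g_0)
\quad\Longleftrightarrow\quad
g_0^3 - 2 g_0^2 + 3 g_0 - 1 = 0 .

% The left side has derivative 3 g_0^2 - 4 g_0 + 3 > 0, so it is strictly
% increasing and has a single real root, g_0 \approx 0.430, which lies in [0, 1].
```

Under these assumptions, the fixed point with g0 = 1 in the a3 case is unstable because any perturbation below 1 shrinks further at every step, and the a4 case reduces to a cubic with exactly one admissible root, in line with the statements in the text.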

Next, we consider a reputation label of an x player (a player adopting the action rule) who invades a y population (a population consisting of players who all adopt the action rule), which is defined as . Let , where is the fraction of the r reputation label given to the invader (x) in the case of . The definitions of are presented in Eq (4). Compared with the definitions of gr(y), all are the same except that the player’s reputation label is revised to from gr(y). Thus, the definitions of are

(4)

where

We are ready to calculate their expected payoffs. In the case of , the invader’s expected payoff, denoted as Px(x|y), and the resident’s expected payoff, denoted as Py(x|y), are defined as

(5)(6)

This is why a player with an x action rule invades a population of players with a y action rule if and only if . We then define that an action rule x is evolutionarily stable if , and for, in the case of , any , and .

Table 3 shows the expected payoffs of Px(x|y) in . Note that in , the equation system for does not determine any value, and thus, for any , and thus, we set b/2 in that case. In Table 3, we use the values of . The values in the cases of are approximated.

Therefore, we prove that no action rules are evolutionarily stable except for a2. Note that, when a2 forms the majority in the population, a1 can invade isolatedly because . In the situation , , so a different strategy is required for a1 to become the majority. The key point is a situation in which a2 is the majority in the population, but a1, who has gained citizenship as a neutral mutant, is present at a certain rate. In this situation, a3 can invade because it can increase expected payoffs in the a1 world. This is possible if mutants always invade. If there is a regime of a3, both a1 and a3 can invade rapidly. This effect first allows the hegemony of the population to shift from a2 to a1. Then, this regime change promotes the rapid proliferation of a3, and like a phase transition, the a2 hegemony era will be replaced by the a3 hegemony era. This will realize a stable cooperative regime. When a3 forms the majority of the population, a5 and a7 (unconditional cooperators) can invade as neutral mutants (hence, a3 is not an ESS). However, they are driven out by a0 (unconditional traitors) who always invade in small numbers due to the effect of random drift, so only a small number can survive in the population at any one time.

Robustness check on the performance of the initial population

In Results, we analyzed whether there exists an evolutionarily stable action rule that achieves a high cooperation rate, in order to show that the social norm referred to as gradating is theoretically viable. For this purpose, it is not necessary to show that the discovered action rule a3 can invade any initial population. However, because the model we used always introduces new invaders by mutation, the frequency of no strategy can fall to 0, so the dynamics can only have stable points in the interior of the simplex. Empirically, this makes it easier to eliminate unstable equilibrium points on the boundary and to reach a stable rest point. For these reasons, although we do not provide a rigorous proof in this paper owing to analytical difficulties, the action rule a3 is highly likely to be globally stable. To support this point, we additionally confirmed that a3 eventually forms the majority even when the simulation starts from a different initial population. The specific simulation method is as follows. Parameters are set to . For each of the eight action rules, an initial population consisting of that single action rule is prepared, giving eight simulations in total. In all of them, a3 was confirmed to be the final winner. The simulations also show that a3 is reached even when a population generated during a simulation is used as the initial population, suggesting that a3 is highly likely to be globally asymptotically stable.

Supporting information

S1 File. This file includes descriptions of experimental scenarios and statistical analyses, including S1–S4 Tables and S1–S3 Figs.

https://doi.org/10.1371/journal.pone.0329742.s001

(DOCX)

References

  1. Carter GG, Wilkinson GS. Food sharing in vampire bats: reciprocal help predicts donations more than relatedness or harassment. Proc Biol Sci. 2013;280(1753):20122573. pmid:23282995
  2. Dolivo V, Taborsky M. Norway rats reciprocate help according to the quality of help they received. Biology Letters. 2015;11(2):20140959. pmid:25716088
  3. Nowak M, Highfield R. Supercooperators: Altruism, evolution, and why we need each other to succeed. Simon and Schuster; 2012.
  4. Nowak MA. Five rules for the evolution of cooperation. Science. 2006;314(5805):1560–3. pmid:17158317
  5. Sugden R. The economics of rights, cooperation and welfare. Oxford: Basil Blackwell; 1986.
  6. Alexander R. The biology of moral systems. New York: Aldine de Gruyter; 1987.
  7. Nowak MA, Sigmund K. Evolution of indirect reciprocity by image scoring. Nature. 1998;393(6685):573–7. pmid:9634232
  8. Nowak MA, Sigmund K. The dynamics of indirect reciprocity. J Theor Biol. 1998;194(4):561–74. pmid:9790830
  9. Kandori M. Social norms and community enforcement. Rev Econ Stud. 1992;59(1):63–80.
  10. Pacheco JM, Santos FC, Chalub FACC. Stern-judging: A simple, successful norm which promotes cooperation under indirect reciprocity. PLoS Comput Biol. 2006;2(12):e178. pmid:17196034
  11. Santos FP, Santos FC, Pacheco JM. Social norms of cooperation in small-scale societies. PLoS Comput Biol. 2016;12(1):e1004709. pmid:26808261
  12. Santos FP, Santos FC, Pacheco JM. Social norm complexity and past reputations in the evolution of cooperation. Nature. 2018;555(7695):242–5. pmid:29516999
  13. Leimar O, Hammerstein P. Evolution of cooperation through indirect reciprocity. Proc Biol Sci. 2001;268(1468):745–53. pmid:11321064
  14. Panchanathan K, Boyd R. A tale of two defectors: the importance of standing for evolution of indirect reciprocity. J Theor Biol. 2003;224(1):115–26. pmid:12900209
  15. Ohtsuki H, Iwasa Y. The leading eight: social norms that can maintain cooperation by indirect reciprocity. Journal of Theoretical Biology. 2006;239(4):435–44. pmid:16174521
  16. Ohtsuki H, Iwasa Y. Global analyses of evolutionary dynamics and exhaustive search for social norms that maintain cooperation by reputation. J Theor Biol. 2007;244(3):518–31. pmid:17030041
  17. Uchida S, Yamamoto H, Okada I, Sasaki T. A theoretical approach to norm ecosystems: two adaptive architectures of indirect reciprocity show different paths to the evolution of cooperation. Front Phys. 2018;6.
  18. Milinski M, Semmann D, Bakker TC, Krambeck HJ. Cooperation through indirect reciprocity: image scoring or standing strategy? Proc Biol Sci. 2001;268(1484):2495–501. pmid:11747570
  19. Swakman V, Molleman L, Ule A, Egas M. Reputation-based cooperation: empirical evidence for behavioral strategies. Evol Hum Behav. 2016;37(3):230–235.
  20. Okada I, Yamamoto H, Sato Y, Uchida S, Sasaki T. Experimental evidence of selective inattention in reputation-based cooperation. Sci Rep. 2018;8(1):14813. pmid:30287848
  21. Yamamoto H, Suzuki T, Umetani R. Justified defection is neither justified nor unjustified in indirect reciprocity. PLoS One. 2020;15(6):e0235137. pmid:32603367
  22. Wedekind C, Milinski M. Cooperation through image scoring in humans. Science. 2000;288(5467):850–2. pmid:10797005
  23. Milinski M, Semmann D, Krambeck H-J. Reputation helps solve the “tragedy of the commons”. Nature. 2002;415(6870):424–6. pmid:11807552
  24. Bolton GE, Katok E, Ockenfels A. Cooperation among strangers with limited information about reputation. Journal of Public Economics. 2005;89(8):1457–68.
  25. Ule A, Schram A, Riedl A, Cason TN. Indirect punishment and generosity toward strangers. Science. 2009;326(5960):1701–4. pmid:20019287
  26. Yoeli E, Hoffman M, Rand DG, Nowak MA. Powering up with indirect reciprocity in a large-scale field experiment. Proc Natl Acad Sci U S A. 2013;110 Suppl 2(Suppl 2):10424–9. pmid:23754399
  27. Brandt H, Sigmund K. Indirect reciprocity, image scoring, and moral hazard. Proc Natl Acad Sci U S A. 2005;102(7):2666–70. pmid:15695589
  28. Hilbe C, Schmid L, Tkadlec J, Chatterjee K, Nowak MA. Indirect reciprocity with private, noisy, and incomplete information. Proc Natl Acad Sci U S A. 2018;115(48):12241–6. pmid:30429320
  29. Okada I. Two ways to overcome the three social dilemmas of indirect reciprocity. Sci Rep. 2020;10(1):16799. pmid:33033279
  30. Yamamoto H, Okada I, Sasaki T, Uchida S. Clarifying social norms which have robustness against reputation costs and defector invasion in indirect reciprocity. Sci Rep. 2024;14(1):25073. pmid:39443609
  31. Berger U, Grüne A. On the stability of cooperation under indirect reciprocity with first-order information. Games and Economic Behavior. 2016;98:19–33.
  32. Fujimoto Y, Ohtsuki H. Evolutionary stability of cooperation in indirect reciprocity under noisy and private assessment. Proc Natl Acad Sci U S A. 2023;120(20):e2300544120. pmid:37155910
  33. Fujimoto Y, Ohtsuki H. Who is a Leader in the leading eight? indirect reciprocity under private assessment. PRX Life. 2024;2(2).
  34. Murase Y, Hilbe C. Computational evolution of social norms in well-mixed and group-structured populations. Proc Natl Acad Sci U S A. 2024;121(33):e2406885121. pmid:39116135
  35. Ohtsuki H, Iwasa Y, Nowak MA. Reputation effects in public and private interactions. PLoS Comput Biol. 2015;11(11):e1004527. pmid:26606239
  36. Hofbauer J, Sigmund K. Evolutionary games and population dynamics. Cambridge University Press; 1998.
  37. Okada I, Yamamoto H. Mathematical description and analysis of adaptive risk choice behavior. ACM Trans Intell Syst Technol. 2013;4(1):1–21.
  38. Vonasch AJ, Reynolds T, Winegard BM, Baumeister RF. Death before dishonor. Social Psychological and Personality Science. 2017;9(5):604–13.
  39. Dunbar RI, Marriott A, Duncan ND. Human conversational behavior. Hum Nat. 1997;8(3):231–46. pmid:26196965
  40. Robbins ML, Karan A. Who gossips and how in everyday life? Social Psychological and Personality Science. 2019;11(2):185–95.
  41. Tanabe S, Suzuki H, Masuda N. Indirect reciprocity with trinary reputations. J Theor Biol. 2013;317:338–47. pmid:23123557
  42. Murase Y, Kim M, Baek SK. Social norms in indirect reciprocity with ternary reputations. Sci Rep. 2022;12(1):455. pmid:35013393
  43. Sasaki T, Okada I, Nakai Y. The evolution of conditional moral assessment in indirect reciprocity. Sci Rep. 2017;7:41870. pmid:28150808
  44. Fehr E, Gächter S. Altruistic punishment in humans. Nature. 2002;415(6868):137–40. pmid:11805825
  45. Sefton M, Shupp R, Walker JM. The effect of rewards and sanctions in provision of public goods. Economic Inquiry. 2007;45(4):671–90.
  46. Hamlin JK, Wynn K, Bloom P, Mahajan N. How infants and toddlers react to antisocial others. Proc Natl Acad Sci U S A. 2011;108(50):19931–6. pmid:22123953
  47. Yamamoto H, Suzuki T. Exploring condition in which people accept AI over human judgements on justified defection. Sci Rep. 2025;15(1):3339. pmid:39870713
  48. Kiyonari T, Barclay P. Cooperation in social dilemmas: free riding may be thwarted by second-order reward rather than by punishment. J Pers Soc Psychol. 2008;95(4):826–42. pmid:18808262
  49. Ozono H, Watabe M. Reputational benefit of punishment: comparison among the punisher, rewarder, and non-sanctioner. Letters on Evolutionary Behavioral Science. 2012;3(2):21–4.
  50. Raihani NJ, Bshary R. The reputation of punishers. Trends Ecol Evol. 2015;30(2):98–103. pmid:25577128
  51. Mifune N, Li Y, Okuda N. The evaluation of second- and third-party punishers. Lett Evol Behav Sci. 2020;11(1):6–9.
  52. Li Y, Mifune N. Punishment in the public goods game is evaluated negatively irrespective of non-cooperators’ motivation. Front Psychol. 2023;14:1198797. pmid:37457072
  53. Yates JF, Ji LJ, Oka T, Lee JW, Shinotsuka H, Sieck WR. Indecisiveness and culture: incidence, values, and thoroughness. Journal of Cross-Cultural Psychology. 2010;41(3):428–44.