A Model of Human Cooperation in Social Dilemmas

Social dilemmas are situations in which collective interests are at odds with private interests: pollution, depletion of natural resources, and intergroup conflicts are, at their core, social dilemmas. Because of their multidisciplinarity and their importance, social dilemmas have been studied by economists, biologists, psychologists, sociologists, and political scientists. These studies typically explain the tendency to cooperate by dividing people into proself and prosocial types, by appealing to forms of external control or, in iterated social dilemmas, to long-term strategies. But recent experiments have shown that cooperation is possible even in one-shot social dilemmas without forms of external control, and that the rate of cooperation typically depends on the payoffs. This makes a predictive division between proself and prosocial people impossible and proves that people have a natural attitude toward cooperation. The key innovation of this article is in fact to postulate that humans have a natural attitude toward cooperation and consequently do not act a priori as single agents, as assumed by standard economic models; instead, they forecast how a social dilemma would evolve if they formed coalitions and then act according to their most optimistic forecast. Formalizing this idea, we propose the first predictive model of human cooperation able to organize a number of different experimental findings that are not explained by the standard model. We also show that the model makes satisfactorily accurate quantitative predictions of population average behavior in one-shot social dilemmas.


Introduction
Social dilemmas are situations in which collective interests are at odds with private interests [1]. In other words, they describe situations in which fully selfish and rational behavior leads to an outcome smaller than the one the individuals would obtain if they acted collectively. Social dilemmas thus create a tension between private interests and public interests, between selfishness and cooperation. Classically, several different social dilemmas have been distinguished, including the Prisoner's dilemma, Chicken, Assurance, Public Goods, the Tragedy of the Commons [2], and, more recently, the Traveler's dilemma [3], [4]. Each of these games has been studied by researchers from different disciplines, such as economists, biologists, psychologists, sociologists, and political scientists, because of the intrinsic philosophical interest in understanding human nature and because many concrete and important situations, such as pollution, depletion of natural resources, and intergroup conflict, can be modelled as social dilemmas.
Consequently, the problem of making a predictive division into proself and prosocial types becomes extremely difficult, if not impossible.
From these experiments, we can draw two conclusions: first, the observation of cooperation in one-shot social dilemmas without external controls suggests that the origin of cooperation lies in human nature; second, the fact that the rate of cooperation depends on the payoffs suggests that it could be computed, at least approximately, using only the payoffs. The word approximately reflects the fact that numerous experimental studies have shown that cooperation depends on a number of factors, such as family history, age, culture, gender, even university course [27], religious beliefs [19], and decision time [28]. Therefore, we cannot expect a theory that predicts, given only the payoffs, the individual-level rate of cooperation in a social dilemma. We can expect instead a model that predicts population average behaviour quite accurately, using the mean value of parameters that could in principle be updated at the individual level.
In this article we take the first step in this direction: (1) we develop the first predictive model of cooperation; (2) we show that it explains a number of puzzling experimental findings that are not explained by the standard economic model, such as the fact that the rate of cooperation in the Prisoner's dilemma increases when the cost-benefit ratio decreases, the rate of cooperation in the Traveler's dilemma increases when the bonus/penalty decreases, the rate of cooperation in the Public Goods game increases when the per-capita marginal return increases, and the rate of cooperation in the Chicken game is larger than the rate of cooperation in the Prisoner's dilemma with similar payoffs; (3) we show that it makes satisfactorily accurate quantitative predictions of population average behaviour in social dilemmas.
We mention that there are many other models that can be applied to explain deviation towards cooperation in social dilemmas, including the cognitive hierarchy model [29], the quantal level-k theory [30], the level-k theory [31], the quantal response equilibrium [32], the inequity aversion models [33], [34] and the noisy introspection model [35]. Nevertheless, all these models use free parameters, so they are descriptive rather than predictive.
The key idea behind the model is very simple: since experimental data suggest that humans have a natural attitude toward cooperation, we formalize the intuition that people do not act a priori as single agents, but forecast how the game would be played if they formed coalitions and then act according to their most optimistic forecast.
We anticipate that forecasts will be defined by comparing the incentive and the risk for an agent i to deviate from the collective interest. This comparison associates a probability with the event "agent i defects". As mentioned, we will show that this procedure works satisfactorily well in predicting population average behavior. The problem in passing to individual-level predictions is that the event "player i defects", given only the payoffs, is not measurable at the individual level in any universal and objective sense; the dream is to use the factors mentioned above (family history, age, culture, incentives, iterations, etc.) to define parameters that update the measure of the event "player i defects" at the individual level. In fact, an attempt to extend the present model to iterated social dilemmas has been made in [36], leading to promising results: predictions tend to get closer to experimental data as the number of iterations increases. Even though our model is very general and can be applied to every symmetric game, we treat explicitly only four very relevant and widely studied social dilemmas: the Prisoner's dilemma, the Traveler's dilemma, the Public Goods game, and the Tragedy of the Commons. We begin with a short review of these games.

Prisoner's Dilemma
Two players can choose to either "Cooperate" or "Defect". If both players cooperate, they both receive the monetary reward, $R$, for cooperating. If one player defects and the other cooperates, then the defector receives the temptation payoff, $T$, while the other receives the sucker payoff, $S$. If both players defect, they both receive the punishment payoff, $P$. Payoffs are subject to the condition $T > R > P > S$.

Traveler's Dilemma
Fix a bonus/penalty $b \ge 2$. Two travelers have to claim a reimbursement between 180 and 300 monetary units for their (identical) luggage, which has been lost by the same airline. The airline wants to prevent the travelers from asking for unreasonably high reimbursements, so it adopts the following rule: the traveler who claims the lower amount, say $m$, gets a reimbursement of $m + b$ monetary units, and the other one gets a reimbursement of only $m - b$ monetary units. If both players claim the same amount, $m$, then they are both reimbursed $m$ monetary units.

Public Goods Game
$N$ agents receive an initial endowment of $y > 0$ monetary units and simultaneously choose an amount $0 \le x_i \le y$ to contribute to a public pool. The total amount in the pot is multiplied by $a_0$ and then divided equally among all group members. So agent $i$ receives a payoff of $u_i(x_1,\ldots,x_N) = y - x_i + a\,(x_1 + \ldots + x_N)$, where $a = a_0/N$. The number $a$ is assumed to belong to the interval $(1/N, 1)$ and is called the constant marginal return.
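As a minimal sketch of the game just described, the following Python function evaluates every agent's payoff under the standard linear form $u_i = y - x_i + a\,(x_1+\ldots+x_N)$, which is assumed here. The function name and the numerical values in the example (four agents, $y = 10$, $a = 0.5$) are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def public_goods_payoffs(contributions: Sequence[float], y: float, a: float) -> List[float]:
    """Payoff of every agent: endowment, minus own contribution, plus a times the pot."""
    n = len(contributions)
    if not (1.0 / n < a < 1.0):
        raise ValueError("a must lie in the open interval (1/N, 1)")
    if any(x < 0 or x > y for x in contributions):
        raise ValueError("each contribution must satisfy 0 <= x_i <= y")
    pot = sum(contributions)
    return [y - x + a * pot for x in contributions]


# Hypothetical example: N = 4 agents, endowment y = 10, marginal return a = 0.5.
nobody = public_goods_payoffs([0, 0, 0, 0], y=10, a=0.5)
everybody = public_goods_payoffs([10, 10, 10, 10], y=10, a=0.5)
free_rider = public_goods_payoffs([0, 10, 10, 10], y=10, a=0.5)
print(nobody, everybody, free_rider)
```

In this example the free rider earns more than under full contribution while the contributors earn less, and everybody is better off under full contribution than under no contribution, which is the tension the game is meant to capture.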

Tragedy of the Commons
Consider a village with $N$ farmers that has limited grassland. Each of the $N$ farmers has the option to keep a sheep or not. Let the monetary utility of milk and wool from the sheep be $h > 0$. Let the monetary damage to the environment from one sheep grazing over the grassland be denoted by $k_0 > 0$. Assume $h < k_0 < hN$ and let $k = k_0/N$. Let $x_i$ be a variable that takes values 0 or 1 and denotes whether farmer $i$ keeps the sheep or not. The payoff of farmer $i$ is $u_i(x_1,\ldots,x_N) = h\,x_i - k\,(x_1 + \ldots + x_N)$.
All these games share the same feature: selfish and rational behavior leads to suboptimal outcomes. In the Prisoner's dilemma, the unique Nash equilibrium is to defect, while both players would be better off if they both cooperated; in the Traveler's dilemma, the unique Nash equilibrium is to claim the lowest possible amount, producing an outcome smaller than the one they would obtain if they both claimed the largest possible amount; in the Public Goods game, the unique Nash equilibrium is not to contribute anything, while all players would be better off if they all contributed everything; in the Tragedy of the Commons, the unique Nash equilibrium is to keep the sheep, while all farmers would be better off if they all agreed not to keep the sheep.

An Informal Description of the Model
Before introducing the model in general, we describe it informally in a particular case. Consider the Prisoner's dilemma (recently tested experimentally using MTurk in [20]) with monetary outcomes (expressed in dollars) $T = 0.20$, $R = 0.15$, $P = 0.05$, $S = 0$. The idea is that players forecast how the game would be played if they formed coalitions. In a two-player game there are only two possible coalition structures: in the selfish coalition structure $p_s$ players are supposed to follow their private interests, and in the cooperative coalition structure $p_c$ they are supposed to follow the collective interest. The analysis of these two coalition structures proceeds as follows:
- In $p_s$ players follow their private interest and therefore, by definition, they play the Nash equilibrium $(D,D)$. Since there is no incentive to deviate from a Nash equilibrium, each player gets 0.05 for sure, and we say that the value of $p_s$ is 0.05 and write $v(p_s) = 0.05$.
- To define the value of $p_c$ we argue as follows. If the players follow the collective interest, their largest possible payoff is 0.15, in correspondence to the profile of strategies $(C,C)$. Since this profile of strategies is not stable (i.e., each player has a nonzero incentive to deviate from it), we introduce a probability to measure how likely such deviations are. To define this probability, we observe that the incentive to deviate from $(C,C)$ is $D(p_c) = T - R = 0.05$, and the risk in deviating, that is, the loss incurred if the other player deviates as well, is $R(p_c) = R - P = 0.10$. We define the prior probability that a player abandons the coalition structure $p_c$ as a sort of proportion between incentive and risk. Specifically, we define the probability that a player abandons $p_c$ to be $D(p_c)\,[D(p_c) + R(p_c)]^{-1} = 1/3$. Now, note that the smallest payoff achievable by a player when she follows $p_c$ but the other player does not is the sucker payoff $S = 0$. Therefore, we define $v(p_c) = 0.15 \cdot \frac{2}{3} + 0 \cdot \frac{1}{3} = 0.10$.
The numbers $v(p_s)$ and $v(p_c)$ are interpreted as forecasts of the expected payoff for an agent playing according to $p_s$ and $p_c$, respectively. Since $v(p_s) = 0.05$ and $v(p_c) = 0.10$, the most optimistic forecast corresponds to the cooperative coalition structure $p_c$. We use this best forecast to generate common beliefs or, in other words, to make a tacit binding between the players: to play only strategies which give a payoff of at least 0.10 to both players. More formally, we restrict the set of profiles of strategies and allow only profiles $s = (s_1, s_2)$ such that $u_i(s) \ge 0.10$ for all $i$. We define the cooperative equilibrium to be the unique Nash equilibrium of this restricted game.
From Fig. 1, it is clear that the cooperative equilibrium corresponds to the point in the red set that is closest to $(D,D)$. This point can be computed directly by finding the smallest $\lambda$ such that $0.15\lambda^2 + 0.2\lambda(1-\lambda) + 0.05(1-\lambda)^2 \ge 0.1$, that is, $\lambda = \frac{1}{2}$. Consequently, the cooperative equilibrium of this variant of the Prisoner's dilemma is $\frac{1}{2}C + \frac{1}{2}D$ for both players. Notice that in [20] it has been reported that players cooperated with probability 58 per cent in one treatment and 65 per cent in another treatment, and the over-cooperation in the second experiment was explained in terms of a framing effect due to the different ways in which the same game was presented.
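The computation in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the incentive is taken to be $T - R$ and the risk $R - P$ (values that reproduce the forecast 0.10 of the example), and the cooperative equilibrium is searched among symmetric mixed profiles, as in the computation of the smallest mixing weight above. The function names are hypothetical.

```python
import math


def pd_forecasts(T, R, P, S):
    """Forecast values (v_s, v_c) of the selfish and cooperative coalition structures."""
    incentive = T - R   # assumption: gain from defecting against a cooperator
    risk = R - P        # assumption: loss when both players end up defecting
    t = incentive / (incentive + risk)   # prior probability of abandoning p_c
    v_c = R * (1 - t) + S * t            # R if nobody deviates, S otherwise
    v_s = P                              # payoff at the Nash equilibrium (D, D)
    return v_s, v_c


def cooperation_probability(T, R, P, S):
    """Smallest symmetric probability of cooperating whose payoff reaches the best forecast."""
    v_s, v_c = pd_forecasts(T, R, P, S)
    target = max(v_s, v_c)
    # Expected payoff when both players cooperate with probability lam:
    # R*lam^2 + (T + S)*lam*(1 - lam) + P*(1 - lam)^2 = A*lam^2 + B*lam + P
    A = R - T - S + P
    B = T + S - 2 * P
    C = P - target
    if C >= 0:              # the selfish forecast is already the best one
        return 0.0
    if abs(A) < 1e-12:      # the quadratic term vanishes: solve a linear equation
        return -C / B
    disc = math.sqrt(B * B - 4 * A * C)
    roots = sorted([(-B - disc) / (2 * A), (-B + disc) / (2 * A)])
    return next(r for r in roots if 0.0 <= r <= 1.0)


lam = cooperation_probability(T=0.20, R=0.15, P=0.05, S=0.0)
print(round(lam, 6))
```

For the payoffs of this section the quadratic term cancels, so the threshold condition is linear in the mixing weight; the general branch is included only so that the sketch does not silently fail on other payoff matrices.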

The Model
We now describe the general model. We recall that, motivated by the observation that an attitude toward cooperation seems to be intrinsic to human nature, our main idea is to assume that players do not act a priori as single agents, but forecast how the game would be played if they formed coalitions and then play according to their most optimistic forecast. The only technical difficulty in formalizing this idea is to define the forecasts. Following the example described in the previous section, they will be defined by assigning to each player $i$ and to each partition $p$ of the player set $P$, interpreted as a possible coalition structure, a number $v_i(p)$ which represents the expected payoff of player $i$ when she plays according to the coalition structure $p$. This value will indeed be defined as an average
$$v_i(p) = \sum_{J \subseteq P \setminus \{i\}} e_{i,J}(p)\, t_{i,J}(p),$$
where $t_{i,J}(p)$ represents the prior probability that player $i$ assigns to the event "players in $J$ abandon the coalition structure $p$" and $e_{i,J}(p)$ is the infimum of payoffs of player $i$ when she plays according to the coalition structure $p$ and players in $J$ abandon the coalition.
This idea is very general and indeed, in a long-term working paper, we are developing the theory for every normal form game [37]. In the case of the classical social dilemmas under consideration the theory is much easier, because of their symmetry.
Coming to the description of the model, let $G$ be a symmetric game and denote by $P$ the set of players, each of which has pure strategy set $S_i$, mixed strategies $\mathcal{P}(S_i)$ and payoff function $u_i$. We start by assuming, for simplicity, that $P = \{1,2\}$, and we will explain, at the end of this section, how the model generalizes to $N$-player games.
A coalition structure is a partition $p$ of the set of players, that is, a collection of pairwise disjoint subsets of $P$ whose union covers $P$. Every set in the partition is called a coalition. Given a coalition structure $p$, we denote by $G_p$ the game associated to $p$, in which the players in the same coalition play as a single player whose payoff is the sum of the payoffs of the players belonging to that coalition. Call $\mathrm{Nash}(G_p)$ the set of Nash equilibria of the game $G_p$. Now fix $i \in P$ and let $-i$ denote the other player. We denote by $D_{-i}(p)$ the maximal payoff that player $-i$ can obtain by leaving the coalition structure $p$; $D_{-i}(p)$ will be called the incentive of player $-i$ to abandon the coalition structure $p$.
Given a profile of strategies $(s_1, s_2)$, a strategy $s'$ [...]. We denote by $R_{-i}(p)$ the maximal loss that player $-i$ can incur if she decides to leave the coalition structure $p$ to try to achieve her maximal possible gain, but player $i$ also deviates from the coalition structure $p$, either to follow her selfish interests or to anticipate player $-i$'s deviation. We define the probability of deviating from the coalition structure $p$ by comparing incentive and risk. There are certainly many ways to make such a comparison. In this paper we use a quite intuitive and seemingly natural one; in future research, it would be important to investigate others. Specifically, we define
$$t_{i,-i}(p) := \frac{D_{-i}(p)}{D_{-i}(p) + R_{-i}(p)}$$
and we interpret this number as the prior probability that player $i$ assigns to the event "player $-i$ abandons the coalition structure $p$". Therefore $t_{i,\emptyset}(p) := 1 - t_{i,-i}(p)$ is interpreted as the prior probability that nobody abandons the coalition structure $p$. Now, let $e_{i,\emptyset}(p)$ be the infimum of payoffs for player $i$ if nobody abandons the coalition structure $p$, that is, the infimum of payoffs for player $i$ when each player plays according to a Nash equilibrium of $G_p$, and let $e_{i,-i}(p)$ be the infimum of payoffs of player $i$ when she plays according to a Nash equilibrium of $G_p$ and $-i$ plays a $(-i)$-deviation from a Nash equilibrium of $G_p$. The value for player $i$ of the coalition structure $p$ is by definition
$$v_i(p) := e_{i,\emptyset}(p)\, t_{i,\emptyset}(p) + e_{i,-i}(p)\, t_{i,-i}(p).$$
In a symmetric game this value does not depend on $i$, and we write $v(p)$. Consequently, there is a coalition structure $\bar{p}$ (independent of $i$) which maximizes $v(p)$. We use the number $v(\bar{p})$ to define common beliefs or, in other words, to make a tacit binding among the players.
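The two-player value of a coalition structure can be sketched as a short Python helper. This is a sketch under assumptions: the deviation probability is taken to be incentive divided by incentive plus risk, in line with the expression used for $p_c$ in the informal description, and a zero incentive is treated as a zero deviation probability, as for a Nash equilibrium. Function and argument names are hypothetical; exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def deviation_probability(incentive: Fraction, risk: Fraction) -> Fraction:
    """Prior probability that the other player abandons the coalition structure."""
    if incentive == 0:
        # No incentive to deviate (e.g. at a Nash equilibrium): nobody abandons.
        return Fraction(0)
    return Fraction(incentive) / (Fraction(incentive) + Fraction(risk))


def coalition_value(e_stay: Fraction, e_deviate: Fraction,
                    incentive: Fraction, risk: Fraction) -> Fraction:
    """Average of the two worst-case payoffs, weighted by the prior probabilities."""
    t_dev = deviation_probability(incentive, risk)
    t_stay = 1 - t_dev
    return Fraction(e_stay) * t_stay + Fraction(e_deviate) * t_dev


# Prisoner's dilemma with T = 10, R = 7, P = 3, S = 0 (assumed incentive T - R, risk R - P).
v_c = coalition_value(e_stay=Fraction(7), e_deviate=Fraction(0),
                      incentive=Fraction(3), risk=Fraction(4))
v_s = coalition_value(e_stay=Fraction(3), e_deviate=Fraction(3),
                      incentive=Fraction(0), risk=Fraction(0))
print(v_c, v_s)
```

The coalition structure with the larger value is the one that fixes the payoff threshold of the induced game.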

Definition 0.1
The induced game $\mathrm{Ind}(G, \bar{p})$ is the same game as $G$ except for the set of allowed profiles of strategies: in the induced game only profiles of strategies $s = (s_1, s_2)$ such that $u_i(s) \ge v(\bar{p})$, for all $i$, are allowed.
Observe that the induced game does not depend on the maximizing coalition structure: in case of multiple coalition structures maximizing the value, one can choose any one of them to define the induced game, and the game does not depend on this choice.
Since the set of allowed strategies in the induced game is convex and compact (and non-empty) one can compute Nash equilibria of the induced game.

Definition 0.2
A cooperative equilibrium for $G$ is a Nash equilibrium of the game $\mathrm{Ind}(G, \bar{p})$.
Observe that this model implicitly assumes that it is common knowledge that both players apply the same method of reasoning, that is, each player knows that the other player thinks about coalitions when making her decision. As we elaborate later, we believe that this assumption is not unreasonable and may provide a realistic picture of the mental processes that real subjects perform during the game.
In the case of $N$-player games the idea is to define $t_{i,j}(p)$ for every single player $j \ne i$ and then use the law of total probability to extend this measure to a probability measure on the set $P \setminus \{i\}$. To use the law of total probability we need to know the probabilities that two or more given players deviate from $p$. This is easy in situations of perfect anonymity: one can just assume that the events "player $j$ deviates" and "player $k$ deviates" are independent and then multiply the respective probabilities. The situation where a player may influence the choice of another player is much more interesting and worth exploring.
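Under the independence assumption just described, the probability attached to each set of deviating players can be sketched as follows. The reading adopted here is an assumption: each subset $J$ gets the probability that exactly the players in $J$ deviate, obtained by multiplying the individual probabilities, so that the weights sum to one. The function name, the player labels and the numerical values are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet


def deviation_distribution(t: Dict[str, float]) -> Dict[FrozenSet[str], float]:
    """Probability that exactly the players in J deviate, for every subset J.

    t maps each other player j to the prior probability that j deviates;
    the events are assumed to be independent.
    """
    players = sorted(t)
    dist: Dict[FrozenSet[str], float] = {}
    for size in range(len(players) + 1):
        for subset in combinations(players, size):
            prob = 1.0
            for j in players:
                prob *= t[j] if j in subset else 1.0 - t[j]
            dist[frozenset(subset)] = prob
    return dist


# Hypothetical example with two other players, j and k.
dist = deviation_distribution({"j": 0.25, "k": 0.5})
for subset, prob in dist.items():
    print(sorted(subset), prob)
```

With such a distribution, the value of a coalition structure is the sum over subsets of the worst-case payoff for that subset times its probability.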
Finally, we observe that the $N$-person classical social dilemmas under consideration are computationally very simple, since it is enough to study only the fully selfish coalition structure $p_s$ (in which all players play according to a Nash equilibrium of the original game) and the fully cooperative coalition structure $p_c$ (in which all players play collectively). More formally, given a coalition structure $p \ne p_s, p_c$, one has $v(p) \le v(p_c)$. Therefore, in order to find a coalition structure that maximizes the value, it is enough to know the values $v(p_s)$ and $v(p_c)$.

Prisoner's Dilemma
We compute the cooperative equilibrium of the Prisoner's dilemma in two variants, starting from the one already discussed in the informal description of the model, with monetary outcomes (expressed in dollars) $T = 0.20$, $R = 0.15$, $P = 0.05$, $S = 0$. In this case, the reader can easily check, following the computation sketched there, that the cooperative equilibrium is $\frac{1}{2}C + \frac{1}{2}D$ for both players. Notice that in [20] it has been reported that players cooperated with probability 58 per cent in one treatment and 65 per cent in another treatment, and the over-cooperation in the second experiment was explained in terms of a framing effect due to the different ways in which the same game was presented.
Similar results can be obtained by comparing the experimental data reported in [19] on the one-shot Prisoner's dilemma with $T = 10$, $R = 7$, $P = 3$, $S = 0$ and its cooperative equilibrium: 37 per cent of subjects cooperated in the laboratory, while the cooperative equilibrium is $\frac{1}{4}C + \frac{3}{4}D$. We mention that the same experiment was repeated using MTurk and ten times smaller outcomes, giving a slightly larger percentage of cooperation (47 per cent). Nevertheless, it was shown in [19] that this difference was not statistically significant.
Now we consider a parametric Prisoner's dilemma. Fix $k > 0$ and consider the following monetary outcomes: $T = k + 2$, $R = k + 1$, $P = 1$, $S = 0$. Intuition suggests that people should be perfectly selfish for $k = 0$, should get more cooperative as $k$ increases, and should tend to be perfectly cooperative as $k$ approaches infinity. This qualitative behavior was indeed observed in iterated treatments in [18].
We show that this is in fact the behavior of the cooperative equilibrium. Indeed, one obtains that the cooperative equilibrium coincides with the Nash equilibrium for $k \le 1$, while, for $k > 1$, it is $\frac{k-1}{k}C + \frac{1}{k}D$, which moves continuously and monotonically from defection to cooperation as $k$ increases and tends to cooperation as $k$ tends to infinity. Note that the fact that the cooperative equilibrium coincides with the Nash equilibrium for $k \le 1$ also shows that Nash equilibrium and cooperative equilibrium are not disjoint solution concepts. Colloquially speaking, players get selfish when they understand that cooperating is not fruitful.
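The dependence on $k$ can be checked with a small exact computation. The sketch below assumes, as in the informal example, that the incentive is $T - R$ and the risk is $R - P$; under these assumptions the cooperation probability of the parametric game works out to $(k-1)/k$ for $k > 1$ and to zero otherwise. The function name is hypothetical.

```python
from fractions import Fraction


def parametric_pd_cooperation(k) -> Fraction:
    """Cooperation probability for T = k + 2, R = k + 1, P = 1, S = 0."""
    k = Fraction(k)
    if k <= 0:
        raise ValueError("k must be positive")
    T, R, P, S = k + 2, k + 1, Fraction(1), Fraction(0)
    incentive = T - R                  # assumption: equals 1 for every k
    risk = R - P                       # assumption: equals k
    t = incentive / (incentive + risk)
    v_c = R * (1 - t) + S * t          # cooperative forecast, equals k
    v_s = P                            # selfish forecast, equals 1
    if v_c <= v_s:
        return Fraction(0)             # coincides with the Nash equilibrium
    # Here T + S = R + P, so the symmetric payoff is linear in the
    # cooperation probability lam: P + (T + S - 2P) * lam = 1 + k * lam.
    return (v_c - P) / (T + S - 2 * P)


for k in ["1/2", 1, 2, 4, 100]:
    print(k, parametric_pd_cooperation(k))
```

The loop prints the cooperation probability for a few values of $k$, which makes the monotone move from defection toward cooperation easy to inspect.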
Consequently the cooperative equilibrium strongly depends on $b$: the predicted claims get smaller as $b$ gets larger. In other words, cooperation is more difficult as the bonus/penalty increases. This behaviour has indeed been qualitatively observed both in one-shot and iterated games [38], [16], [17], and [39]. We are aware of only two experimental studies devoted to one-shot Traveler's dilemmas. For these experiments, the predictions of the cooperative equilibrium are even quantitatively close. Indeed, (1) for $b = 5$ one finds that the unique cooperative equilibrium is a suitable convex combination of the strategies 296 and 297. This agrees with the experimental data reported in [16], where about 80 per cent of the subjects played a strategy between 290 and 300, with an average of 295; (2) for $b = 180$, one has $v(p_c) < v(p_s)$, and then the cooperative equilibrium coincides with the Nash equilibrium. This matches the experimental data reported in [16], where about 80 per cent of the players played the Nash equilibrium; (3) for $b = 2$ and strategy sets $\{2, 3, \ldots, 100\}$, in [17] it has been reported that 38 out of 45 game theorists chose a strategy between 90 and 100 and 28 of them chose a strategy between 97 and 100. In this case $v(p_c) = 99.2$ and therefore the cooperative equilibrium is close to the pure strategy 99.

Public Goods Game
The unique Nash equilibrium is $x_i = 0$, for all $i$, in correspondence of which each player gets $y$. Consequently $v(p_s) = y$. On the other hand, one has
$$v(p_c) = 2ay \cdot \frac{2a-1}{a} + ay \cdot \frac{1-a}{a} = (3a-1)\,y.$$
Therefore, $v(p_c) \le v(p_s)$ if and only if $a \le \frac{2}{3}$. In other words, when $a$ is small (recall that $a$ is assumed to belong to the interval $(\frac{1}{2}, 1)$) the cooperative equilibrium reduces to the Nash equilibrium, and the larger $a$ is, the larger the rate of cooperation predicted by the cooperative equilibrium. The fact that human behavior depends on $a$ in this way has indeed been observed several times (see, e.g., [14], [40]). As a quantitative comparison, we consider the experimental data reported in [41], with $a = 0.8$. We normalize $y$ to be equal to 1 (in the experiment $y = 0.04$ dollars). In this case the cooperative equilibrium is supported between 0.66 and 0.67. In [41] it has been reported that the average of contributions was 0.50, but the mode was 0.60 (6 out of 32 times) followed by 0.80 (5 out of 32 times).
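The two-player computation above can be reproduced with exact arithmetic. The sketch below is an illustration under assumptions: it uses the linear payoff $y - x_i + a\,(x_1 + x_2)$, takes the incentive to be the gain from withholding one's contribution while the other contributes, takes the risk to be the loss when both withhold, and looks for the smallest symmetric contribution whose payoff reaches the best forecast. Function names are hypothetical.

```python
from fractions import Fraction


def public_goods_forecasts(a, y=1):
    """Forecasts (v_s, v_c) for the two-player Public Goods game."""
    a, y = Fraction(a), Fraction(y)
    if not (Fraction(1, 2) < a < 1):
        raise ValueError("a must lie in the open interval (1/2, 1)")
    v_s = y                              # nobody contributes
    full = 2 * a * y                     # both contribute everything
    incentive = (y + a * y) - full       # assumption: withhold while the other contributes
    risk = full - y                      # assumption: loss when both withhold
    t = incentive / (incentive + risk)   # equals (1 - a) / a
    sucker = a * y                       # contribute y while the other contributes 0
    v_c = full * (1 - t) + sucker * t    # equals (3a - 1) * y
    return v_s, v_c


def predicted_contribution(a, y=1) -> Fraction:
    """Smallest symmetric contribution whose payoff reaches the best forecast."""
    a, y = Fraction(a), Fraction(y)
    v_s, v_c = public_goods_forecasts(a, y)
    if v_c <= v_s:
        return Fraction(0)               # reduces to the Nash equilibrium
    # A symmetric contribution x yields y - x + 2*a*x = y + (2a - 1) * x.
    return (v_c - y) / (2 * a - 1)


x = predicted_contribution("0.8")
print(x, float(x))
```

Passing the marginal return as a string such as "0.8" keeps the arithmetic exact; a binary float would introduce rounding in the fraction.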

Tragedy of the Commons
One easily sees that the Tragedy of the Commons and the Public Goods game represent the same strategic situation, just by

Comparison between the Prisoner's Dilemma and Chicken
We recall that the Chicken game is basically the same as the Prisoner's dilemma except for the fact that payoffs are subject to the condition $T > R > S > P$. The Chicken game has two pure Nash equilibria, $(C,D)$ and $(D,C)$, and a symmetric evolutionarily stable mixed Nash equilibrium depending on the payoffs. Observe that $e_{i,\emptyset}(p_s) = P$, since it is the infimum of payoffs of player $i$ when each player plays according to a Nash equilibrium. Such infimum is attained in correspondence to the profile of strategies $(D,D)$.
It has been observed in [42] that the rate of cooperation in the iterated Prisoner's dilemma is significantly lower than the rate of cooperation in the iterated Chicken game with similar payoffs, that is, with payoffs such that the average payoff across outcomes is the same in both games.
We now show that this behavior is predicted by the cooperative equilibrium in one-shot games, giving a qualitative explanation of why we observe more cooperation in the iterated Chicken game than in the iterated Prisoner's dilemma. The expression qualitative explanation reflects the fact that, of course, a direct comparison between iterated and one-shot games cannot be made, since the former have a much richer set of strategies. Nevertheless, we find it quite remarkable that this difference in behavior observed in iterated treatments is predicted for one-shot treatments: we believe that this connection is not coincidental and deserves further investigation.
The payoffs used in [42] are $T = 400$, $R = 300$, $P = 0$, $S = -100$ for the Prisoner's dilemma and $T = 300$, $R = 200$, $S = 100$, $P = 0$ for the Chicken game. One finds that the cooperative equilibrium of this variant of the Prisoner's dilemma is $\frac{2}{3}C + \frac{1}{3}D$ and the cooperative equilibrium of this variant of the Chicken game coincides with the evolutionarily stable strategy $\frac{6}{7}C + \frac{1}{7}D$. So the rate of cooperation predicted by the cooperative equilibrium is significantly higher in the Chicken game.

Conclusions
Many experiments over the years have shown that humans may act cooperatively even in one-shot social dilemmas without forms of external control, and that the rate of cooperation depends on the payoffs. This suggests that humans have a natural attitude toward cooperation and therefore do not act a priori as single players, as typically assumed in economics, but forecast how the game would be played if they formed coalitions and then play according to their most optimistic forecast.
We have formalized this idea by assuming that each player evaluates the probability that another player abandons the collective interest to follow her private interest. This probability is defined by comparing the incentive and the risk to deviate from the collective interest, and it gives rise to common beliefs that, mathematically, correspond to defining a suitable restriction of the original game. On the one hand, this procedure seems qualitatively reasonable and we believe it provides a realistic picture of the mental processes that real subjects perform during the game. On the other hand, the formalization of this process, that is, the definitions of the risk, incentive, probabilities, and the induced game, is mathematically simple and seemingly natural, but certainly deserves to be investigated further and possibly improved in future research.
However, the current model makes us optimistic about this direction of research, being the first predictive model able to: (1) make satisfactorily accurate predictions of population average behavior in social dilemmas; (2) explain a number of experimental findings, such as the fact that the rate of cooperation in the Prisoner's dilemma increases when the cost-benefit ratio decreases, the rate of cooperation in the Traveler's dilemma increases when the bonus/penalty decreases, the rate of cooperation in the Public Goods game increases when the per-capita marginal return increases, and the rate of cooperation in the Chicken game is larger than the rate of cooperation in the Prisoner's dilemma with similar payoffs.
The dream is to incorporate other components (such as family history, age, culture, incentives, iterations, etc.) into the model in order to make individual-level predictions.