Group size can have positive, negative, or even curvilinear effect on cooperation depending on how the benefit for full cooperation varies as a function of the group size

In a world in which many pressing global issues require large scale cooperation, understanding the group size effect on cooperative behavior is a topic of central importance. Yet, the nature of this effect remains largely unknown, with lab experiments insisting that it is either positive or negative or null, and field experiments suggesting that it is instead curvilinear. Here we shed light on this apparent contradiction by showing that one can recreate all these effects in the lab by varying a single parameter. Specifically, if the benefit for full cooperation remains constant as a function of the group size, then larger groups are less cooperative; if it increases linearly with the size of the group, then larger groups are more cooperative; however, in the more realistic scenario in which the natural output limits of the public good imply that the benefit of cooperation increases fast for early contributions and then decelerates, one may get a curvilinear effect according to which intermediate-size groups cooperate more than smaller groups and more than larger groups. Our findings help fill the gap between lab experiments and field experiments and suggest concrete ways to promote large scale cooperation among people.


Introduction
Cooperation played a fundamental role in the early evolution of our societies [1,2] and continues to play a major role today. From the individual level, where we cooperate with romantic partners, friends, and co-workers to handle our individual problems, up to the global level, where countries cooperate with other countries to handle global problems, our entire life is based on cooperation.
Since the resolution of many pressing global issues, such as global climate change and depletion of natural resources, requires cooperation among many actors, one of the most relevant questions about cooperation regards the effect of the size of the group on cooperative behavior. Indeed, since the influential work by Olson [33], scholars have recognized that the size of a group can have an effect on cooperative decision-making. However, the nature of this effect remains one of the most mysterious areas in the literature, with some scholars arguing that it is negative [33][34][35][36][37][38][39][40], others that it is positive [41][42][43][44][45][46][47], and yet others that it is ambiguous [48][49][50][51] or non-significant [52][53][54]. Interestingly, the majority of field experiments seem to agree on yet another possibility, namely that group size has a curvilinear effect on cooperative behavior, according to which intermediate-size groups cooperate more than smaller groups and more than larger groups [55][56][57][58][59]. The emergence of a curvilinear effect of group size on cooperation in real-life situations is also supported by data concerning academic research, which support the hypothesis that the research quality of a research group is highest for medium-sized groups [60][61][62].
Here we aim at shedding light on this debate, by providing evidence that a single parameter can be responsible for all the different and apparently contradictory effects that have been reported in the literature. Specifically, we show that the effect of the size of the group on cooperative decision-making depends critically on a parameter taking into account different ways in which the notion of cooperation itself can be defined when there are more than two agents.
Indeed, while in case of only two agents a cooperator can be simply defined as a person willing to pay a cost c to give a greater benefit b to the other person [6], the same definition, when transferred to situations where there are more than two agents, is subject to multiple interpretations. If cooperation, from the point of view of the cooperator, means paying a cost c to create a benefit b, what does it mean from the point of view of the other players? Does b get earned by each of the other players or does it get shared among all other players, or none of them? In other words, what is the marginal return for cooperation?
Of course, there is no general answer: the choice of a particular marginal return over the others depends on the realistic scenario one is trying to formalize. For instance, in the standard Public Goods game it is assumed that b gets earned by each player (including the cooperator); in the N-person Prisoner's dilemma (as defined in [63]) it is instead assumed that b gets shared among all players; the Volunteer's dilemma [64] and its variants using critical mass [65] lie somewhere in between: one or more cooperators are needed to generate a benefit that gets earned by each player, but, after the critical mass is reached, new cooperators do not generate any more benefit; finally, it has been pointed out [66,67] that a number of realistic situations can be characterized by a marginal return which increases linearly for early contributions and then decelerates, reflecting the natural decrease of marginal returns that occurs when output limits are approached.
In order to take into account this variety of possibilities, we consider a class of social dilemmas parametrized by a function β = β(Γ, N) describing the marginal return for cooperation when Γ people cooperate in a group of size N. More precisely, our general Public Goods game is the N-person game in which N people have to simultaneously decide whether to cooperate (C) or defect (D). In the presence of a total of Γ cooperators, the payoff of a cooperator is defined as β(Γ, N) − c (where c > 0 represents the cost of cooperation) and the payoff of a defector is defined as β(Γ, N). In order to have a social dilemma, we require that:
• Full cooperation pays more than full defection, that is, β(N, N) − c > β(0, N), for all N;
• Defecting is individually optimal, regardless of the number of cooperators, that is, for all Γ < N, one has β(Γ, N) − c < β(Γ − 1, N).
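The definition above can be sketched in a few lines of Python. This is only an illustration, not code from the study: the names `payoff`, `is_social_dilemma`, `linear_beta`, and `shared_beta` are hypothetical, the numerical values (a benefit of 2 and a cost of 1) are chosen for the example, and the defection condition is checked for every Γ from 1 to N, which is an assumption about the intended range.

```python
from typing import Callable

# beta(gamma, n): marginal return when gamma people cooperate in a group of n
Beta = Callable[[int, int], float]


def payoff(beta: Beta, gamma: int, n: int, cost: float, cooperates: bool) -> float:
    """Payoff of one player when a total of gamma players cooperate."""
    return beta(gamma, n) - cost if cooperates else beta(gamma, n)


def is_social_dilemma(beta: Beta, n: int, cost: float) -> bool:
    """Check the two social-dilemma conditions for a group of size n."""
    full_cooperation_pays = beta(n, n) - cost > beta(0, n)
    # A cooperator among gamma cooperators would earn more by defecting
    # (assumption: checked for every gamma from 1 to n).
    defecting_is_optimal = all(
        beta(gamma, n) - cost < beta(gamma - 1, n) for gamma in range(1, n + 1)
    )
    return full_cooperation_pays and defecting_is_optimal


def linear_beta(gamma: int, n: int) -> float:
    """Hypothetical example in which beta(N, N) grows linearly with N."""
    return 0.5 * gamma


def shared_beta(gamma: int, n: int) -> float:
    """Hypothetical example in which beta(N, N) is constant in N."""
    return 2.0 * gamma / n


for n in (2, 3, 4, 10):
    print(n, is_social_dilemma(linear_beta, n, 1.0), is_social_dilemma(shared_beta, n, 1.0))
```

With these illustrative numbers, neither example is a social dilemma for N = 2, while both are for the larger group sizes, which shows that the two conditions constrain β and c jointly.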
The aim of this paper is to provide evidence that the function β might be responsible for the confusion in the literature about group size effect on cooperation.
We made a first step in that direction with our previous work [63], in which we showed that the size of the group has a positive effect on cooperation in the standard Public Goods game and a negative effect on cooperation in the N-person Prisoner's dilemma. A reinterpretation of these results is that, if β(N, N) increases linearly with N (standard Public Goods game), then the size of the group has a positive effect on cooperation; and, if β(N, N) is constant in N (N-person Prisoner's dilemma), then the size of the group has a negative effect on cooperation. This reinterpretation suggests that, in the more realistic situations in which the benefit for full cooperation increases linearly for early contributions and then decelerates once the output limits of the public good are approached, we may observe a curvilinear effect of the group size, according to which intermediate-size groups cooperate more than smaller groups and more than larger groups. If found, such an effect would complete our program of showing that different β's may give rise to all sorts of group size effects on cooperation and would in turn explain why field experiments using real public goods games tend to agree on the curvilinear effect rather than either of the two linear effects [55][56][57][58][59].
To test this hypothesis, we have conducted a lab experiment using a general public goods game with a piecewise function β, which increases linearly up to a certain number of cooperators, after which it remains constant. While it is likely that realistic scenarios would be better described by a smoother function, this is a good approximation of all those situations in which the natural output limits of a public good imply that the increase in the marginal return for cooperation tends to zero as the number of contributors grows very large. The upside of choosing a piecewise function β is that, in this way, we could present the instructions of the experiment in a very simple way, thus minimizing random noise due to participants not understanding the decision problem at hand (see Method).
Our results indeed support the hypothesis of a curvilinear effect of the size of the group on cooperative decision-making. In doing so, they (i) shed light on the confusion regarding the group size effect on cooperation, by pointing out that different values of a single parameter might give rise to qualitatively different group size effects; and (ii) fill the gap between lab experiments and field experiments. Indeed, while lab experiments use either the standard Public Goods game or the N-person Prisoner's dilemma, real public goods games are mostly characterized by a marginal return that increases fast for early contributions and then approaches a constant function as the number of cooperators grows very large; our results provide evidence that these three situations give rise to three different group size effects.

Method
We have recruited participants through the online labour market Amazon Mechanical Turk (AMT) [68][69][70]. After entering their TurkID, participants were directed to the following instruction screen.
Welcome to this HIT.
This HIT will take about 5 minutes and you will earn 20c for participating.
This HIT consists of a decision problem followed by a few demographic questions.
You can earn an additional bonus depending on the decisions that you and the participants in your cohort will make.
We will tell you the exact number of participants in your cohort later.
Each one of you will have to decide to join either Group A or Group B.
Your bonus depends on the group you decide to join and on the size of the two groups, A and B, as follows:
• If the size of Group A is 0 (that is, everybody chooses to join Group B), then everybody gets 10c
• If the size of Group A is 1, then the person in Group A gets 5c and each person in Group B gets 15c
• If the size of Group A is 2, then each person in Group A gets 10c and each person in Group B gets 20c
• If the size of Group A is 3, then each person in Group A gets 15c and each person in Group B gets 25c
• If the size of Group A is 4, then each person in Group A gets 20c and each person in Group B gets 30c
• And so on, up to 10: If the size of Group A is 10, then each person in Group A gets 50c and each person in Group B gets 60c
• However, if the size of Group A is larger than 10, then, independently of the size of the two groups, each person in group A will still get 50c and each person in group B will still get 60c.
After reading the instructions, participants were randomly assigned to one of 12 conditions, differing only in the size of the cohort (N = 3, 5, 10, 15, 20, 25, 30, 40, 50, 60, 80, 100). For instance, the decision screen for the participants in the condition where the size of the cohort is 3 was:
You are part of a cohort of 3 participants.
Which group do you want to join?
By using appropriate buttons, participants could select either Group A or Group B.
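The bonus schedule on the instruction screen can be written compactly. The sketch below is a reading of that screen rather than the code used in the experiment; the function name `bonus_cents` and the constant `CAP` are hypothetical. In the notation of the Introduction, it corresponds to β(Γ, N) = 10 + 5·min(Γ, 10) and c = 10, in cents, where Γ is the size of Group A.

```python
CAP = 10  # size of Group A beyond which bonuses stop growing


def bonus_cents(size_a: int, in_group_a: bool) -> int:
    """Bonus in cents implied by the instruction screen, given the size of Group A."""
    group_b_bonus = 10 + 5 * min(size_a, CAP)
    # Members of Group A (cooperators) earn 10c less than members of Group B.
    return group_b_bonus - 10 if in_group_a else group_b_bonus


# Reproduce the rows listed in the instructions, plus one case above the cap.
for size_a in (1, 2, 3, 4, 10, 25):
    print(size_a, bonus_cents(size_a, True), bonus_cents(size_a, False))
```

Written this way, it is easy to see that joining Group B always pays 10c more than joining Group A for a given size of Group A, while each additional member of Group A raises everyone's bonus by 5c only up to the cap.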
We opted not to ask any comprehension questions. We made this choice for two reasons. First, with the current design, it is impossible to ask general comprehension questions such as "what is the strategy that benefits the group as a whole", since this strategy depends on the strategy played by the other players. Second, we did not want to ask particular questions about the payoff structure, since this may anchor the participants' reasoning on the examples presented. Of course, a downside of our choice is that we could not avoid random noise. However, as will be discussed in the Results section, random noise cannot be responsible for our findings. On the contrary, our results would have been even cleaner without random noise, since the initial increase of cooperation and its subsequent decline would have been more pronounced (see Results section for more details).
After making their decisions, participants were asked to fill in a standard demographic questionnaire, after which they received the "survey code" needed to claim their payment. After collecting all the results, bonuses were computed and paid on top of the participation fee, which was $0.20. In case the number of participants in a particular condition was not divisible by the size of the cohort (it is virtually impossible, in AMT experiments, to decide the exact number of participants playing a particular condition), in order to compute the bonus of the remaining people we formed an additional cohort where these people were grouped with a random choice of people for whom the bonus had already been computed.
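The matching procedure described above can be sketched as follows. This is a hypothetical reconstruction, not the script used for the payments: the function name `form_cohorts` is invented for the example, participants are assumed to be matched at random, and at least one full cohort is assumed to exist in each condition.

```python
import random


def form_cohorts(participants: list[str], cohort_size: int, rng: random.Random) -> list[list[str]]:
    """Split participants into cohorts; top up a short final cohort with already-matched people."""
    shuffled = participants[:]
    rng.shuffle(shuffled)
    n_full = len(shuffled) // cohort_size
    cohorts = [shuffled[i * cohort_size:(i + 1) * cohort_size] for i in range(n_full)]
    leftover = shuffled[n_full * cohort_size:]
    if leftover:
        # Assumption: fillers are drawn at random from people in full cohorts;
        # their own bonuses are not recomputed, only the leftover people are paid
        # on the basis of this additional cohort.
        already_matched = shuffled[:n_full * cohort_size]
        fillers = rng.sample(already_matched, cohort_size - len(leftover))
        cohorts.append(leftover + fillers)
    return cohorts


cohorts = form_cohorts([f"p{i}" for i in range(7)], 3, random.Random(0))
print(cohorts)
```

In the example, seven participants in a condition with cohorts of size 3 yield two full cohorts plus one additional cohort containing the remaining participant and two people drawn from the full cohorts.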

Results
A total of 1,195 distinct subjects located in the US participated in our experiment. Distinct subjects means that, in case two or more subjects were characterized by either the same TurkID or the same IP address, we kept only the first decision made by the corresponding participant and eliminated the rest. These multiple identities
We observe not only that random noise cannot explain our results, but also that, without random noise, the effect would have been even stronger. Indeed, subtracting a binary distribution with average 0.5 from a binary distribution with average µ > 0.5, one would obtain a distribution with average µ₀ > µ. Similarly, subtracting a binary distribution with average 0.5 from a binary distribution with average µ < 0.5, one would obtain a distribution with average µ₀ < µ. Thus, if the µ's are the averages that we have found (containing random noise) and the µ₀'s are the true averages (without random noise), the previous inequalities allow us to conclude that the initial increase of cooperation and its subsequent decrease would have been stronger in the absence of random noise.
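The noise argument can be made concrete with a simple mixture model. The sketch below assumes, purely for illustration, that a share q of participants chooses at random (cooperating with probability 0.5) while the rest behave according to the true average; the observed average is then (1 − q)·(true average) + 0.5·q. The function name `denoised_mean`, the 20% noise share, and the observed rates are hypothetical and are not estimates from the data.

```python
def denoised_mean(observed_mean: float, noise_share: float) -> float:
    """True average implied by an observed average when a share of choices is random (mean 0.5)."""
    if not 0.0 <= noise_share < 1.0:
        raise ValueError("noise_share must be in [0, 1)")
    # Invert: observed = (1 - q) * true + 0.5 * q
    return (observed_mean - 0.5 * noise_share) / (1.0 - noise_share)


# Hypothetical observed cooperation rates with a hypothetical 20% noise share.
for observed in (0.6, 0.5, 0.4):
    print(observed, round(denoised_mean(observed, 0.2), 3))
```

Under this model, the difference between the true and the observed average equals q·(observed − 0.5)/(1 − q), so removing noise pushes averages above 0.5 further up and averages below 0.5 further down, while an average of exactly 0.5 is unchanged.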

Discussion
Figure 1: Proportion of cooperators (people choosing to join Group A) divided by group size. Error bars represent the standard errors of the means. Group size has initially a positive effect on cooperation, which increases and reaches its maximum in groups of size 15, followed by a gradual decrease. Linear regression predicting cooperation using group size as independent variable confirms that both the initial increase of cooperation and its subsequent decline are highly significant (from N = 3 to N = 15: coeff = 0.0187553, p = 0.00042; from N = 15 to N = 100: coeff = −0.00177618, p = 0.00390).
Here we have reported on a lab experiment providing evidence that the size of a group can have a curvilinear effect on cooperation, with intermediate-size groups cooperating more than smaller groups and more than larger groups. Joining the current results with those of a previously published study of ours [63], we can conclude that group size can have qualitatively different effects on cooperation, ranging from positive to negative to curvilinear, depending on the particular decision problem at hand. Interestingly, our findings suggest that different group size effects are ultimately due to different values of a single parameter, the number β(N, N), describing the benefit for full cooperation.
If β(N, N) is constant in N, then group size has a negative effect on cooperation; if β(N, N) increases linearly with N, then group size has a positive effect on cooperation; in between, all sorts of things may a priori happen. In particular, in the realistic situation in which β(N, N) is a piecewise function that increases linearly with N up to a certain N₀ and then remains constant, group size has a curvilinear effect, according to which intermediate-size groups cooperate more than smaller groups and more than larger groups.
To the best of our knowledge, ours is the first study reporting a curvilinear effect of the group size on cooperation in an experiment conducted in the ideal setting of a lab, in which confounding factors are minimized. Previous studies reporting a qualitatively similar effect [55][56][57][58] used field experiments, in which it is difficult to isolate the effect of the group size from possibly confounding effects. In our case, the only possibly confounding factor is random noise due to a proportion of people who may not have understood the rules of the decision problem. As we have shown, our results cannot be driven by random noise and, in fact, the curvilinear effect would have been even stronger without random noise. Moreover, our experimental design was inspired by an attempt to mimic all those real public goods games in which the natural output limits of the public good imply that the increase of the marginal return for cooperation tends to zero as the number of cooperators diverges. Our results might therefore explain the apparent contradiction that field experiments tend to converge on a curvilinear effect of the group size, while lab experiments tend to converge on either of the two linear effects.
Our contribution is also conceptual, since we have provided evidence that a single parameter might be responsible for different group size effects. We believe that this is a relevant contribution in light of possible applications of our work. Indeed, the difference between β(N, N) and the total cost of full cooperation cN can be interpreted as the incentive that an institution needs to pay to the contributors in order to make them cooperate. Since institutions are interested in minimizing their costs and, at the same time, maximizing the number of cooperators, it is crucial to understand what is the "lowest" β such that the resulting effect of the group size on cooperation is positive.
This seems to be a non-trivial question. For instance, does β(Γ, N) = (Γ/N) log₂(N + 1) give rise to a positive effect, or is it still curvilinear or even negative? The technical difficulty here is that it is hard to design an experiment to test people's behavior in these situations, since one cannot expect that an average person would understand the rules of the game when they are presented using logarithmic functions.
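To see where this example sits between the constant and the linear case, one can tabulate β(N, N) and the range of costs c for which the game is a social dilemma. The sketch below takes β(Γ, N) = (Γ/N)·log2(N + 1) as the intended form, which is an assumption; the function names `log_beta` and `dilemma_cost_bounds` are hypothetical.

```python
import math


def log_beta(gamma: int, n: int) -> float:
    """Assumed form of the example: beta(gamma, n) = (gamma / n) * log2(n + 1)."""
    return gamma / n * math.log2(n + 1)


def dilemma_cost_bounds(n: int) -> tuple[float, float]:
    """Open interval of costs c for which log_beta defines a social dilemma in a group of n."""
    # Each extra cooperator adds log2(n + 1) / n: defecting is optimal if c exceeds it.
    lower = math.log2(n + 1) / n
    # beta(n, n) - beta(0, n) = log2(n + 1): full cooperation pays if c is below it.
    upper = math.log2(n + 1)
    return lower, upper


for n in (3, 10, 100):
    lower, upper = dilemma_cost_bounds(n)
    print(n, round(log_beta(n, n), 3), round(lower, 3), round(upper, 3))
```

Under this reading, the benefit for full cooperation grows with N but more slowly than linearly, while the contribution of each single cooperator shrinks as the group grows, which is why the resulting group size effect is not obvious a priori.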
In terms of economic models, our results are consistent with utilitarian models such as the Charness & Rabin model [71] and the novel cooperative equilibrium model [10,63,72]. Both these models indeed predict that, in our experiment, cooperation initially (i.e., for N ≤ 10) increases with N (see [63] for the details), and then starts decreasing. This behavioral transition follows from the simple observation that free riding when there are more than 10 cooperators costs zero to each of the other players and benefits the free-rider. Thus, cooperation in larger groups is not supported by utilitarian models, which then predict a decrease in cooperative behavior whose speed depends on the particular parameters of the model, such as the extent to which people care about the group payoff versus their individual payoff, and people's beliefs about the behavior of the other players. Thus our results add to the growing body of literature showing that utilitarian models are qualitatively good descriptors of cooperative behavior in social dilemmas.
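The observation about free riding can be checked directly on the bonus schedule from the Method section. The sketch below reads that schedule as a Group B bonus of 10 + 5·min(size of Group A, 10) cents, with Group A members earning 10c less; the function names are hypothetical and the code is illustrative only.

```python
def group_b_bonus(size_a: int) -> int:
    """Bonus in cents of a Group B member, given the size of Group A."""
    return 10 + 5 * min(size_a, 10)


def cost_to_each_other_player(size_a: int) -> int:
    """Cents lost by every other player when one member of Group A switches to Group B."""
    return group_b_bonus(size_a) - group_b_bonus(size_a - 1)


def gain_to_free_rider(size_a: int) -> int:
    """Cents gained by the member of Group A who switches to Group B."""
    bonus_if_staying_in_a = group_b_bonus(size_a) - 10
    return group_b_bonus(size_a - 1) - bonus_if_staying_in_a


for size_a in (5, 10, 11, 30):
    print(size_a, cost_to_each_other_player(size_a), gain_to_free_rider(size_a))
```

Up to 10 members in Group A, a switch costs every other player 5c and earns the free-rider 5c; with more than 10 members, the switch costs the others nothing and earns the free-rider 10c.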