Dilemma of Dilemmas: How Collective and Individual Perspectives Can Clarify the Size Dilemma in Voluntary Linear Public Goods Dilemmas

Empirical findings on public goods dilemmas indicate an unresolved dilemma: that increasing size—the number of people in the dilemma—sometimes increases, decreases, or does not influence cooperation. We clarify this dilemma by first classifying public goods dilemma properties that specify individual outcomes as individual properties (e.g., Marginal Per Capita Return) and group outcomes as group properties (e.g., public good multiplier), mathematically showing how only one set of properties can remain constant as the dilemma size increases. Underpinning decision-making regarding individual and group properties, we propose that individuals are motivated by both individual and group preferences based on a theory of collective rationality. We use Van Lange's integrated model of social value orientations to operationalize these preferences as an amalgamation of outcomes for self, outcomes for others, and equality of outcomes. Based on this model, we then predict how the public good's benefit and size, combined with controlling individual versus group properties, produce different levels of cooperation in public goods dilemmas. A two (low vs. high benefit) by three (2-person baseline vs. 5-person holding constant individual properties vs. 5-person holding constant group properties) factorial experiment (group n = 99; participant n = 390) confirms our hypotheses. The results indicate that when holding constant group properties, size decreases cooperation. Yet when holding constant individual properties, size increases cooperation when benefit is low and does not affect cooperation when benefit is high. Using agent-based simulations of individual and group preferences vis-à-vis the integrative model, we fit a weighted simulation model to the empirical data. This fitted model is sufficient to reproduce the empirical results, but only when both individual (self-interest) and group (other-interest and equality) preference are included. Our research contributes to understanding how people's motivations and behaviors within public goods dilemmas interact with the properties of the dilemma to lead to collective outcomes.

Introduction their behavior. Whereas matching the empirical results shows that the integrative model is sufficient to account for the different specifications, the parameter estimation also is suggestive of how individuals might integrate individual and group perspectives.

The Multiplayer Public Goods Dilemma
A voluntary linear multiple-player public goods dilemma differs from common resource dilemmas and step-function public goods dilemmas in several important ways. Public goods dilemmas focus on people pooling resources or effort to provide a good for all, whereas common resource dilemmas relate to taking resources from a common pool. For linear public goods dilemmas every contribution creates additional benefit, whereas step-function public goods dilemmas the public good is provided only if a particular step threshold is obtained. Example of step-function dilemmas include fund-raising to build a new library or a certain number of people needed to serve on a committee [21], whereas examples of linear public goods include the environment, a clean house, a potluck dinner, or reviews on a website such that additional contributions continue to increase the benefit of that good for everyone.
We defined a linear public goods dilemma such that the dilemma's participants have a binary choice to contribute (cooperate) or not contribute (defect) each round over multiple rounds of interaction. For each round, an individual's outcome is defined by the following: where n is the number of total participants (size), c is the total number of contributors (c s is 1 if self contributes; c O is number of others who contribute, such that c = c s +c o ), z is each participant's endowment, and α is the dilemma multiplier. Explained discursively, each renewed endowment of z that is contributed to the group is multiplied by α, then divided among the group members n regardless of who contributed, whereas any endowment z not contributed is kept towards one's own outcome. Breaking down the effects based on what a participant can control, on any given round the portion that participants receive from their own decision is za n for contributing (if c S = 1 then c is 1 greater) or z for not contributing (c S = 0). Therefore, a participant's temptation to defect [5], the profit one will earn by not contributing, is z À za n . Furthermore, participants earn za n per additional group member, c O , that contributes, meaning a potential earnings for full cooperation is zα when c = n. The value za n must always remain less than z to be classified as a public goods dilemma; otherwise one's payout would be largest by always contributing to the public good. Consequently, the multiplier is constrained to 1 α < n.

Properties that Vary with Group Size
A central issue for investigating group size in voluntary public goods dilemmas involves two types of properties from the dilemma that vary differently with the group size. The first type relates to the entire group's calculus or outcomes for the social dilemma, such as the dilemma's multiplier or the earning potential if all participants cooperate. These group properties represent the group-focused aspects of the dilemma, addressing the question: what is most beneficial for the entire group? Therefore an actor with only a group perspective would interpret the dilemma in terms of these group properties. The second type of property relates to the individual participants' reward calculus and outcomes, such as their temptation to defect or marginal per capita return (MPCR). The MPCR, za n , is a ratio of the individuals' marginal benefit, za n , to the individuals' marginal cost, z, for contributing. These individual properties remain structured by the overall specification of the public goods dilemma, but are psychologically understood from the vantage of one individual asking the question: what is most beneficial for me? Therefore, an actor with only an individual perspective would interpret the dilemma using these individual properties.
It is critical to note that either the group or individual properties can be held constant as the dilemma size increases, but not both. The multiplier, α, and earning potential for full cooperation, zα, do not contain the group size variable n. In contrast, the marginal per capita return, a n , and the temptation to defect, z À za n , both vary based on the group size. Examining the formulas for these different properties, we see the simplest group and individual properties are the multiplier and the MPCR because they exclude the endowment. Whereas the endowment is important for baseline levels of cooperation behavior, if it is held constant it is not essential for a comparison of how group and individual properties differ by dilemma size (see Table 1). Note that MPCR ¼ a n and therefore the individual property MPCR decreases with size if the multiplier, a group property, is held constant. However, the same equation rewritten as α = MPCR Ã n shows that group properties such as the multiplier α increase with size when the MPCR, an individual property, is held constant (see Fig. 1 for their relationship and S1

Individual and Group Perspectives and Properties
Whereas it is often assumed that people reason from an individual perspective to increase their individual gain or decrease their individual loss, contemporary theorizing suggests that people also reason from a team or collective viewpoint, namely, act to increase the collective gain, to decrease the collective loss, or to increase equality [14,15,17]. Colman et al. [15] suggests that collective rationality, that is a rationality that occurs individually but includes reasoning about a team or group, is fundamental in many decision making contexts. Applying it to game theory and vignettes, they find that collective rationality predicts a large percentage of individual's decisions, in some cases better than an individual rationality, or self-interest.
In a similar vein, Van Lange [22] has proposed an integrative model of social value orientations whereby prosocially oriented individuals attempt to not only jointly maximize their own and other's outcomes, but also the equality of outcomes as shown in Equation 2.
Utility ¼ W1 ðOutcomes for Self Þþ W2 ðOutcomes for OtherÞþ W3 ðEquality in outcomesÞð2Þ This model was supported empirically in prosocials' decision making [22]. Because Coleman et al.'s standard for collective rationality was maximizing joint benefit [16], Van Lange [14] showed that this integrative model can account for the findings of Colman et al.'s collective rationality. Joint benefit is simply an equal focus on the outcomes to self and others, or applying W 1 = W 2 = .5 and W 3 = 0 to the integrative model equation. Therefore, we note that the integrative model represents collective rationality in terms of maximum joint benefit and extends it to include other group-based preferences, namely equality. Equality is an important factor for considering collective rationality because of its presence in social value orientations [22] and as an explanation for group behavior [23]. Using the integrative model to capture both individual and group preferences, we reason about how its different components could alter behavior in public goods dilemmas that vary on size and individual or group properties. In public goods dilemmas the best Outcomes for Self is always produced by defecting, however the strength of this varies based on the dilemma's individual properties. A higher Marginal Per Capita Return, for example, means that cooperative behavior returns a larger proportion of the contribution, thereby mitigating the appeal of defection. Conversely, the Outcomes for Others (which we pluralize to account for multiple other participants) is always maximized by contributing in a public goods dilemma, yet the strength of this also varies based on individual properties of the dilemma. The greater the MPCR, the  more others benefit from a participant's cooperation. Whereas maximizing outcomes for oneself versus others evoke opposite strategies (e.g., cooperation or defection), a higher MPCR enhances the utility of cooperation in both cases. Therefore, because MPCR decreases with size when group properties are held constant, we expect (H1) cooperation should decrease with group size if the group properties are held constant, yet (H2) cooperation should not change with group size if the individual properties are held constant.
Maximizing equality in outcomes suggests a preference for full cooperation or full defection as these produce the same-that is equal-outcomes for everyone. For a 2-person group, an individual's decision always alters the Equality of Outcomes because matching the other participant's choice with cooperation or defection produces full equality and opposing it produces maximum inequality (i.e., the outcomes of one defector and one cooperator). For a larger group, say five people, a participant's decision to cooperate or defect has different effects on equality outcomes depending on the decisions of others, as we later mathematically demonstrate. If two players cooperate and the other two defect, the inequality of the outcomes cannot be reduced by either cooperation or defection (i.e., either there will be two cooperators and three defectors, or vice versa). However, if four players cooperate, then cooperation produces fully equal outcomes whereas defection produces less equal outcomes. This line of reasoning leads to the counterintuitive conclusion that pursuing equality among outcomes can lead to polarized collective decisions. It also indicates that that equality will have a stronger polarization effect for smaller groups and weaker polarization effect for larger groups.
Whether equality concerns will push a group towards full defection or full cooperation depend on the base levels of cooperation. In groups where there is a higher benefit from the public good, the equality of outcomes will be more likely to promote cooperation, whereas in groups with a lower benefit from the public good the equality of outcomes will be more likely to promote defection. Therefore, we expect (H3) cooperation levels should differ between levels of benefit and be most pronounced in the 2-person groups. As we will see next, the empirical findings appear to be consistent with these theoretical hypotheses.

Previous Research on Properties and Group Size
The literature on linear public goods dilemmas has found that size can increase [4,6] or decrease [20,24,25] cooperation, sometimes depending on the condition [5,26]. Field study data has also shown under different conditions group size increase [27,28] or does not influence [29] contributions and cooperation. Because of the variation in results, a meta-analysis of linear public goods studies found that size did not have a statistically significant impact on contributions [30].
Nonetheless, when group properties are held constant, cooperation decreases with group size. For example, Yamagishi [20] compared group sizes ranging from 2 to 501 while holding the multiplier constant in three experiments. He found that as group size increased, cooperation levels decreased, however this pattern was primarily for groups of smaller sizes below 10. This supports our first hypothesis that cooperation should decrease with group size when group properties are held constant, as well as our third hypothesis that cooperation levels should be most pronounced in smaller size groups.
There is also evidence that when the individual properties are held constant, cooperation increases or does not change with group size. For example, Isaac and Walker [4] compared group sizes of 4 and 10 and held constant the MPCR at .30 or .75. They altered initial endowments by condition so that the complete cooperation earning potential remained constant. They found that when the MPCR was high, size had little effect on cooperation; however cooperation increased with group size when the MPCR was lower. Their conclusion was that at high MPCR, group size no longer matters as it does at low levels. A later study [31] compared groups of size 4 to 100 making a similar conclusion, that at high levels of MPCR, size did not alter cooperation, whereas it did at lower levels of MPCR. Other research [6] supports this position by finding that even differences in extremely low levels of MPCR can alter cooperation in large groups (60 and 100).
Isaac and colleagues interpret their results as cooperation increasing with size at low benefit, but they can be reasonably reinterpreted as cooperation increasing with benefit at small sizes. Similarly, they conclude that under conditions of greater benefit, size does not alter cooperation, which can be reasonably reinterpreted as under conditions of greater size, benefit does not alter cooperation. Therefore, the pattern of findings of this literature dovetails with both our second and third hypothesis. The compatibility of this previous research with our predictions gives support to our argument about connecting individual and group preference with individual and group properties through the integrative model; however none of these experiments examined group size while holding constant each type of property.

Experiment
We conduct an experiment according to our earlier dilemma specifications crossing group size while either holding individual or group properties constant with high or low public goods benefit. We hold the endowment z constant at 100 points for all conditions. We consider the difference between group and individual properties by keeping either the multiplier or MPCR constant as the group size increases (Fig. 1). We create two baseline experimental conditions where the group size is 2. Based on previous literature, one has lower public goods benefit with a multiplier of 1.2 and a MPCR of .60 and the second has a higher benefit with a multiplier of 1.6 and a MPCR of .80. From each of these baseline conditions we create comparison conditions by increasing the group size to 5 either holding the multiplier or the MPCR constant (see Table 2). This research was approved by the Department Human Ethics Advisory Group, Melbourne School of Psychological Sciences, University of Melbourne (#1340741). All participants read a plain language statement detailing the research and checked a consent box as written consent before being allowed to participate.

Participants
US participants on Amazon Mechanical Turk (mturk.com) chose to participate in our study for financial compensation (2.50 USD plus a bonus of approximately 1.00 to 4.00 USD based on points earned). They were then directed to a web-based program especially designed for this research. The instructions explained to participants their two choices of "keeping" (defecting) or "contributing" (cooperating) their endowment of 100 points each round. The program presented a table with the number of points they would earn based on their own contribution choice and others' contribution choices. This table of point outcomes differed based on the participants' condition, but was displayed each time the participants selected a choice. Additionally, details about how the table was derived from the public goods dilemma multiplier and the multiplier itself were also presented. To ensure adequate understanding, participants had to correctly answer quiz questions about a table of point outcomes before playing the game. Participants were assigned to a dilemma group and completed 31 rounds with that group. After all group members made a selection each round, a feedback screen would indicate to each participant their choice, their groups' total choices (e.g., number who contributed and number who kept their endowment), and the points they earned. No information was presented from previous rounds except the participant's personal cumulative point total. Participants had no information about the others they were interacting with except the total number who contributed or kept the endowment each round. Additional measurements were collected before and after the dilemma.
To ensure that participants could not be held up indefinitely by others in their group, there was a time limit for each round. If a selection was not submitted within that time the program selected contribute for that participant. Participants who failed to select a choice on any three rounds were timed out of the program and we excluded that entire group from our dataset. We continued collecting data until we had at least 15 groups per condition where no participants timed out of the program (99 groups with 390 participants).

Excluded Data
Programmed responses accounted for 21 (0.2%) of the responses, but no group had more than the individual's maximum of two programmed responses in total. Even though programmed responses were always cooperative, there is no indication they had a substantial or systematic impact on our results (see S1 File for further explanation and analysis). Programmed responses are excluded for all analyses. Because the data can be analyzed at multiple levels, we first examine data with groups as the unit of analysis to present an overall picture and simple statistics. Then we examine data with the individual as the unit of analysis using more complex statistics to control for the group-level interdependence. The group-level analyses based on 99 groups do not have the statistical power of the individual analyses based on 390 participants and therefore the group-level analyses are overly conservative.

Group-Level Analyses
For group-level analyses, cooperation for each group was calculated as the average cooperative decisions across all rounds for all individuals in the group. The two-person public goods groups produced different levels of cooperation based on the benefit level. The lower benefit condition (α = 1.2; MPCR = .60) led to less cooperation (.354), whereas the higher benefit conditions (α = 1.6; MPCR = .80) led to greater cooperation (.629; Means in Table 3), t(33) = -2.04, p = .050. Additionally, neither condition produced near complete cooperation or defection, thereby allowing an increase in size to either increase or decrease cooperation.
Both five-person lower benefit conditions matched the two-person by either holding constant the multiplier or the MPCR (Fig. 2). When the multiplier was held constant the mean cooperation decreased from .354 to .162, which was marginally significant, t(20.11) = −1.94, p = .066 (equal variances not assumed), whereas when the MPCR was held constant the mean cooperation significantly increased from .354 to .602, t(28.83) = 2.19, p = .037, equal variances not assumed. Furthermore, the multiplier-constant condition produced more cooperation than the MPCR-constant condition, t(21.16) = -6.40, p .001, equal variances not assumed.
The same comparisons were made for the higher benefit conditions. Holding the multiplier constant when increasing the size from two to five people led to cooperation decreasing from .629 to .237, t(21.36) = −3.76, p .001 (equal variances not assumed), whereas holding the MPCR constant led to no significant change in cooperation levels from .629 to .611, t(24.63) = −.17, p = .866, equal variances not assumed. The multiplier-constant condition also had greater cooperation than the MPCR-constant condition, t(29) = −5.65, p .001.
To examine the overall effects of condition, we created dummy variables for the benefit level (low or high), and the multiplier-constant conditions and the MPCR-constant conditions (with 2-person groups as the reference). When conducting an ANOVA with the dummy variables and an intercept, all three dummy variables were significant (Table 4: Model 1; all p's .05) such that high benefit increased cooperation, holding the multiplier constant size negatively affects cooperation, and holding the MPCR constant size positively affects cooperation. When including the interaction effects in the model (Model 2), the multiplier-constant dummy still decreases cooperation (H1), but the other two dummy variables no longer influence cooperation (H2). There is a marginally significant effect, F(1,93) = 3.67, p = .059, for Benefit X MPCR-constant interaction, suggesting that at low-benefit constant MPCR, size positively affects cooperation; however at higher levels of benefit this increase is no longer present (H3). To determine if these tendencies hold for a more rigorous higher power analysis, we now turn to analyzing the data by individual.

Individual-Level Analyses
Individual cooperation was calculated as each participant's average cooperation over all rounds. When analyzing data at the individual level it is important to control for the interdependence between individuals as individuals in the same group can affect each other's cooperation rates over time. Specifically, the interclass correlation rate is .6336, suggesting that over half of the individual variance in cooperation is explained by group-level parameters (which in this case includes the conditions as groups always functioned within a set of condition parameters). We conducted mixed models analyses that included group intercepts as a random effect covariant parameter with the design factors as fixed effects. The null model (Table 5  we find similar patterns to our group-level analysis (H1) except that the MPCR-constant dummy variable is only marginally significant (.119) F = 3.170, p = .078, and remains so in Model 3 (H2). Adding the interaction effects in Model 3, this analysis show the interaction of High Benefit X MPCR-Constant conditions has a negative and significant effect on cooperation (-2.67) F = 4.099, p = .046. We interpret this to mean that when MPCR is held constant, benefit has a stronger influence on cooperation between 2-person dilemmas than between 5-person dilemmas (H3). Thus, the interaction effect of our third hypothesis qualifies the simple effect of MPCR-constant size changes on cooperation, our second hypothesis.

Discussion
Overall, we found some support for all three hypotheses, with the marginal support of the second hypothesis in the individual analyses being due primarily to its qualification by the third hypothesis. Most notably our results were in line with previous studies. First, for the 2 person public goods dilemmas the benefit level was the major predictor of cooperation levels. Consistent with prior work, increased size either promoted more, less, or the same amount of cooperation depending on whether group or individual properties were held constant and benefit level [4,5,20]. This interaction between size and benefit dovetails with previous studies that have found at high levels of MPCR group size no longer has discernible influence on cooperation [31]. This was also our surprising and counterintuitive prediction based on reasoning about the dilemma's properties and the group-preference of equality from the integrated model of social orientations. We now turn to agent-based modeling to further explore whether the group and individual preferences of the integrated model are able to be fit to these empirical results.

Simulation using a Model of Individual and Group Preferences
We conduct a simulation of agents that uses Van Lange's [14,22] integrative model (Equation 2) to show that the pattern of our cooperation results, which accounts for the size dilemma, can be produced using this implementation of individual and group preferences without the need of additional complex assumptions about individual actors. In the integrative model's weights (W) allow it to be reduced to individual preferences such as self-interest (W 1 = 1, W 2 = 0, W 3 = 0), group preferences such as altruism (W 1 = 0, W 2 = 1, W 3 = 0), and combined preferences such as cooperation (W 1 = .5, W 2 = .5, W 3 = 0) or egalitarianism (W 1 = .5, W 2 = 0, W 3 = .5). Any set of estimated weight parameters that are able to reproduce the empirical results will indicate a preference combination sufficient to reproduce our size dilemma conditions. We created a public goods dilemma simulation in R according to the same specifications as our empirical study. For each run of the simulation, agents were selected randomly from our population. Based on the previously reported levels of freeriding behavior in social dilemmas [32], we included free-riders-that is complete defectors-as one-third of the population. The other two-thirds were mixed motive agents based on the integrative model. To ensure the simulations were stochastic, every agent on every round had a 10% chance of behaving randomly (50% probability of cooperating).

Mathematical Specification of the Integrated Model Applied to Linear Public Goods Dilemmas
The mixed motive agents behave randomly on the first round (50% probability of cooperating). Thereafter, they use the utility function to calculate the utility of cooperating or defecting premised on all others making the same decisions as the previous round. Assuming others will make the same choice as before seems to us to be the simplest model of others' behavior. Based on others' previous choice, the agent calculates the utility for the current round. If the utility for cooperating exceed the utility for defecting the agent cooperates. Outcomes for Self is operationalized as one's own points earned on one round according to Equation 1. Because all individual outcomes on a given round are either cza n for cooperation or cza n þ z for defection, Equation 1 can be rewritten as one equation: The difference in Outcomes for Self from switching from cooperation to defection, meaning c S = 0 and therefore c decreases by 1, is always an increase by the Temptation to Defect, z À za n . Outcomes for Others is calculated as the average outcome all other players receive per round: In our empirical study and most public goods research with groups larger than 2, participants are not aware of the contributions or cooperation levels of specific individuals. Instead they can see a total level of cooperation and from this it is easy to calculate the points received by others, as everyone in the dilemma receives either cza n for cooperation or cza n þ z for defection (Equation 1). While the average points others received from the dilemma can vary greatly, the utility difference for others between a participant's choice to defect or cooperate is always za n , which is the MPCR x the endowment. Therefore, when switching from cooperation to defection-that is c decreases by 1, but C O does not change-Outcomes for Others decreases by za n points (Equation 4). Finally, Equality in Outcomes is calculated as the difference between one's own outcome and the average outcome of all other players (Equations 3 and 4). The negative absolute value of this difference is added to the utility equation, so full equality in outcomes is 0, with negative numbers representing more inequality: When everyone cooperates, c S = 1; c O = n-1, or defects, c S = 0; c O = 0, Equality of Outcomes is 0, but when the individual defects and the all others cooperate (or vice versa), c S = 0; c O = n-1, Equality of Outcomes becomes the negative of the endowment-z (see S3 Fig. for a full example). We want to note some interesting properties about these functions. The utility difference between cooperation and defection for both self and others outcomes does not change dynamically within the dilemma. The temptation to defect and the MPCR X the endowment will be the same throughout a particular dilemma regardless of the decisions of any player in it. That means that individuals relying only on these strategies should not deviate in their decisions throughout the session. While this may trivialize simulation results based on only these two functions, the model itself still has profound implications for the overall levels of cooperation. For example, if outcomes to self and others are equally weighted, then the agents in the multiplier-constant five person conditions (those with MPCRs < .5) will choose full defection, whereas the agents in the other four conditions will choose full cooperation. The response to the parameters of the dilemma (Table 6) can be altered with a different integrative model weighting scheme, but will not be affected by players' round-to-round decisions.
Equality of Outcomes, however, can dynamically change over rounds within the dilemma and can additionally create threshold points based on others behavior. As more players cooperate in a dilemma, the utility for the Equality of Outcomes is improved by additional cooperation and as more players defect, the utility of Equality of Outcomes is improved by additional defection. In this way, since one cannot alter the outcome of specific others, this equality function simply suggests going along with the crowd in either defecting or cooperating. In fiveperson dilemmas this creates a threshold when two others cooperate and two others defect (S3 Fig.) in which there is no equality advantage for selecting one option over the other. Whereas the threshold points are stable within a dilemma's specific parameters, combining it with the self and other interest in the weighted utility function can create alternate threshold points. For this we conduct simulations. allowing for us to examine a range of utility weight combinations. We conducted 500 public good dilemma simulation trials for each of 402 conditions: 67 weight conditions Ã 6 empirical experimental conditions ( Table 2). The results indicate that only one weight condition, W 1 = .2, W 2 = .5, W 3 = .3, reproduced the cooperation level pattern seen in the empirical results ( Table 7). The other weight conditions, including the more pure models (e.g., purely self-interested) do not reproduce the empirical pattern (Table 7). We conducted several variations of the simulations to ascertain the robustness of the model. First, we simulated all conditions with a population of only mixed motive agents, that is we excluded the 1/3 defectors from the population. Second, we repeated the simulation of all conditions with a random normal distribution term (M = 0; SD = 50) added to the integrated utility function allowing the utility of cooperation versus defection to be selected probabilistically. Third, we simulated all conditions with a random normal distribution term (M = 0; SD = .1) added to all three weights of each individual agent. In all three variations the W 1 = .2, W 2 = .5, W 3 = .3 weight combination continued to produce the closest pattern to the empirical results, albeit with much higher cooperation levels in the no-defector simulation (S1 Table). Fourth, we conducted simulations using the parameters from the best fit model (W 1 = .2, W 2 = .5, W 3 = .3) with agent memory adapted from Kashima, Woolcock, and Kashima [33]. The temporal utility function, U M , at time t is U M,t = (W M )(U t ) + (1-W M )(U M,t-1 ) where W M is the weight of utility for the most recent outcome of the game. We varied W M from .1 to .9 in steps of .1, and found that these varied only slightly and all replicated the pattern in the empirical data with W M = .8 producing the closest match (S1 Table). Finally, we conducted 10 trials of a genetic algorithm to optimize these different robustness parameters (S2 File), and found that . As the first of these is nearly identical to our previous best solution, this gives us further confidence in this parameter selection. The initial simulations and the subsequent tests of robustness indicate to us the integrative model without additional assumptions is sufficient to produce the results we observe empirically. Furthermore, the weighting scheme that produced the closest match to empirical results suggests decision-making that accounts for both individual and group preferences, including outcomes to self and others as well as equality of outcomes. Outcomes for Self received the smallest weight of all three parameters; however, these weights may be only appropriate to with regard to the parameters selected for these experimental conditions.

Further Explanation: Collective Response to Low Levels of Cooperation
One phenomenon present in the empirical data is a difference in how groups collectively respond to low levels of cooperation. Because cooperation is relatively more favorable for both Outcomes to Self and Outcomes to Others as MPCR increases, groups with higher MPCR should return to higher levels of cooperation after dropping to a lower level. Comparing all four 5-person group conditions, we observe patterns in which group with higher MPCRs reach lower levels of cooperation less often and when they do immediately return to higher levels of cooperation, whereas groups with lower MPCRs always reach with low levels of cooperation and rarely increase it afterwards. To examine this response pattern we counted all 5-person groups which had at least one round where the number of contributors was 0 or 1. The first of these rounds that occur we refer to as the drop point round, as it must either be a drop from a higher level of contribution unless it is the first round of interaction. For cases that have the drop point round, the next round is a chance to rebound. On this round we find that in the groups with MPCRs of .24 and .34 have average contributions by less than one person, .71 and .75 people, respectively, indicating a failure to rebound. In contrast, the groups with MPCRs of .60 and .80 obtain an average of 1.67 and 2.33 people contributing, respectively, indicating a rebound from the drop point. We selected the drop point of 0 or only 1 player cooperating because it represents-for at least that round-a clear dominance of self-interest. Yet if multiple people contribute in the following round it indicates both collective rationality-that people are behaving motivated by group preferences-as well as collective action, that people while individually making choices produce a group-level result. We compare this empirical trend to our simulated data from the best fit model ( Table 8). The proportion that contains a drop-point round across all conditions is lower for the simulated data than for the empirical data. Also compared to the empirical data, the simulated data indicated even fewer contributors in a rebound round for the low MPCR conditions (around .25) while a similar number of contributors for the high MPCR conditions. Most importantly, it accurately predicts whether there will be a rebound or not after the drop-points in terms of whether the number of contributors exceeds 1. Whereas this is not proof of a mechanism, it does suggest how a model of individual and group preferences in addition to reproducing levels in overall cooperation levels can also reproduce collective action.

General Discussion
This paper has two major contributions. First, we clarified the size dilemma in public goods research: that the size of the group sometimes increases, sometimes decreases, and sometimes does not affect cooperation levels. We took a three-pronged approach to addressing this issue.
(1) We classified the properties used in public goods dilemmas as individual (MPCR, Temptation to Defect) and group (Multiplier, Earnings for full cooperation) properties, showing the relationship of those property types to group size and cooperation levels. (2) We connected those properties to individual and group preferences and ultimately to Van Lange's integrated model of social value orientations both through hypothesis testing and simulations. (3) We experimentally tested and compared both holding individual and holding group properties constant while size varies. This one experiment with a limited number of conditions and groups resulting in several results close to the traditional significance cut off of .05 should be verified by similar future studies. However, our confidence in the results is bolstered given the grouplevel analyses, individual-level analyses, and simulations converged in line with the theoretical hypotheses and prior literature.
Consistent with our predictions and the previous literature, holding group properties constant decreased cooperation with size, whereas holding individual properties constant did not affect cooperation for public goods dilemmas with a higher level of benefit. For a lower level of benefit and constant individual properties, size increased cooperation which is in line with our counterintuitive hypothesis about equality concerns leading to polarization in smaller groups. Alternatively, this could be interpreted as a ceiling effect, where lower benefit groups increased cooperation with size when individual properties were held constant, but higher benefit groups did hit a cooperation ceiling, potentially caused by a percentage of defectors in the population Collective and Individual Perspectives on Public Goods Dilemmas [32]. Both explanations are reasonable given this data, but the former is theoretically derived from the integrative model of social orientations. Our second major contribution was applying an explanation of individual decision making to public goods dilemmas based on current research in collective rationality or team-based reasoning, which is reasoning about group preferences in addition to individual preferences. We implemented this theoretical perspective in terms of Van Lange's integrative model where individuals attend to and are motivated by a combination of self-interest, interest for others, and equality for all. Van Lange has shown that this model appropriately represents the behaviors and expectations of prosocial individuals [22] and is a sufficient explanation for collective rationality [14]. Applying this model to public goods dilemmas of different sizes, we find that the inclusion of an equality term leads to novel predictions as well as dynamic decision making. Not only are predictions based on this model consistent with our results, but the agent-based simulation further confirms that the outcomes and processes from the model to the data are coherent. Using agent based modeling rooted in a theoretical basis and with an empirical benchmark can enhance our understanding of an issue [34], in this case connecting people's motivations with their individual and collective action, and interactions, within public goods dilemmas.
In regards to continuing work in the area of public goods dilemmas we hope our parameterization will sensitize other researchers to the interrelation between size, properties, and cooperation levels. Our finding of an interaction effect between the change in size and benefit level indicate to some degree that changing the size of a public goods dilemma alters it in fundamental ways instead of always having a simple orthogonal effect. Furthermore, our study shows how the size-based individual and group properties alter cooperation for the particular parameterization we chose. Future research should consider different formulations of the public goods dilemma, such as step-function, one-shot, partial contribution dilemmas, and groups larger than five. Groups of two and five, while substantively different from each other, are relatively small and therefore the explanations we proposed are limited in that they may not scale to larger groups including the largest groups, such as societies. Research has shown that large groups have a pronounced effect on cooperation, but above a certain point there are few effects of further increases in size on cooperation [4,5,20,26,31].
Whereas we presented individual's motivation vis-à-vis individual and group preferences as a sufficient explanation, it is neither exclusive nor complete. We hope this research will spawn comparisons of competing and complementary explanations. Does collective rationality better explain differences in social dilemmas, especially enigmas in social dilemmas such as the size problem, than explanations based on affective, learning, efficacy, or normative processes? Collective rationality, focused on motivation for decision making, could also be combined with other explanations that detail the mechanisms within one's decision-making process. A limitation of this study was that we did not measure individual-level properties such as social value orientations [18,33] or intentions to cooperate. This data would be fundamental to tease apart the individual decision making process in social dilemma settings, and to explore the presence and ordering of potential structural, cognitive, affective and behavioral mechanisms.
The extant literature indicates several possible mechanisms that could function separately or in tandem with collective rationality. In step-level dilemmas, self-efficacy and collective-efficacy decrease with group size as people feel like their contribution matters less and therefore it justifies their uncooperative actions [7,35], however it unclear if this would generalize to linear public goods dilemmas [21]. Against the rationality of self-interested behavior, social norms develop in situations of collective action [36] which then can facilitate collective behavior through cohesion or sanctioning [37,38]. Either self-efficacy or social norms could function as a mechanism by which individual and group preferences are enacted, or serve as alternative explanations to preferences based on social value orientations. For example, do individuals who feel empowered focus more on collective outcomes or do collective preferences lead to behavior which enhances self-efficacy? Or does size trigger a lower level of self-efficacy [7], which then changes individual and collective preferences irrespective of the other dilemma properties? We leave it up to future research to see how these and other internal mechanisms might interplay with individual and collective preferences.
Our simulated results spotlight additional research questions. For instance, how do social value orientations differently operate in situations with multiple others? Is prosociality directed toward multiple others individually, to the group as a whole, or is it conditioned on the separability of other's contributions? We implemented our agent-based model with all other players as a single collective, thereby not increasing the weight of the prosocial terms for additional players. However, one could imagine a pure altruist being more interested in helping the most people, as opposed to helping one individual most. As size within public goods dilemmas is differently structured according to group and individual parameters, size within collective rationality could be applied with different formulations.
A related limitation of our study involves our smallest group containing two people only, whereby the collective outside of oneself is only one other person. To fully understand the implications of collective rationality in multiple actor dilemmas it is necessary to recognize how joint outcomes or equality are interpreted for a "collective" of one other person versus a group collective in which the individual members are less separable [39]. Georg Simmel perhaps was the first to suggest the fundamental difference between a dyad, where groupness is fully dependent on each person, and a triad, where the group can exist without any other single individual. Similarly, as others have noted [19], a social dilemma of two people compared to those with more than two people can lead to different outcomes. Both the Outcome for Others and the Equality of Outcomes depend on others, meaning when the total group is greater than two people, these others' properties are combinations of multiple people's behaviors, not just a single individual's behavior. Future research should consider this in more detail.
In conclusion, this research highlights important issues of theoretical and practical concern. Theoretically, the application of social value orientations into game theoretic dilemmas is not novel, but integration of collectively oriented social values through team-based reasoning introduces a rationality that is not only interesting, but contributes to extant issues. One such issue is the dilemma of dilemmas that is the dilemma of the divergent specifications and findings regarding size and public goods dilemmas. Our specification and findings have illuminated this issue and hopefully will encourage researchers across disciplines to carefully specify public goods dilemmas in light of both individual and group properties. Practical applications from this line of research abound as a fundamental difference between the sizes of many realworld public goods dilemmas. An implication for the largest dilemmas, such as the preservation of the natural environment, includes not only focusing on the personal benefits which are often minimal, but also others' benefits and the equality of them. An integrated focus on all these concepts may be more effective than focusing on one concept to the neglect of the others.  Table. Robustness checks for the simulated results: the models that match the empirical result pattern. (PDF)