Citizen Social Lab: A digital platform for human behavior experimentation within a citizen science framework

  • Julián Vicens,

    Roles Methodology, Software, Validation, Writing – original draft, Writing – review & editing

    Affiliations Departament d’Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, Tarragona, Spain, Universitat de Barcelona Institute of Complex Systems UBICS, Universitat de Barcelona, Barcelona, Spain, Departament de Física de la Matèria Condensada, Universitat de Barcelona, Barcelona, Spain

  • Josep Perelló,

    Roles Conceptualization, Methodology, Validation, Writing – original draft, Writing – review & editing

    Affiliations Universitat de Barcelona Institute of Complex Systems UBICS, Universitat de Barcelona, Barcelona, Spain, Departament de Física de la Matèria Condensada, Universitat de Barcelona, Barcelona, Spain

  • Jordi Duch

    Roles Conceptualization, Methodology, Software, Validation, Writing – original draft, Writing – review & editing

    Affiliation Departament d’Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, Tarragona, Spain


Cooperation is one of the behavioral traits that define human beings; however, we are still trying to understand why humans cooperate. Behavioral experiments have been widely conducted to shed light on the mechanisms behind cooperation and other behavioral traits. However, most of these experiments have been conducted in laboratories with highly controlled experimental protocols but with limitations in terms of subject pool or decision context, which limits the reproducibility and the generalization of the results obtained. In an attempt to overcome these limitations, some experimental approaches have moved human behavior experimentation from laboratories to public spaces, where behaviors occur naturally, and have opened participation to the general public within the citizen science framework. Given the open nature of these environments, it is critical to establish appropriate data collection protocols to maintain the same data quality that can be obtained in laboratories. In this article we introduce Citizen Social Lab, a software platform designed to be used in the wild using citizen science practices. The platform allows researchers to collect data in a more realistic context while maintaining scientific rigor, and it is structured in a modular and scalable way so it can also be easily adapted for online or brick-and-mortar experimental laboratories. Following citizen science guidelines, the platform is designed to motivate a more general population to participate, and also to promote engagement with and learning about the scientific research process. We also review the main results of the experiments performed using the platform up to now, and the set of games that each experiment includes. Finally, we evaluate some properties of the platform, such as the heterogeneity of the samples of the experiments, the satisfaction level of participants, and the technical parameters that demonstrate the robustness of the platform and the quality of the data collected.


Social dilemmas modeled as behavioral games are important tools to study the general principles of human behavior and to understand social interactions. Social dilemmas occur when individual interests conflict with other individual or collective interests [1–3]. Behavioral experimentation thus yields relevant scientific outcomes that have been used to test theories and to refine models, providing experimental data for simulations [4] and advancing the understanding of human behavior. But the impact of the experimental insights goes beyond the scientific theories, because social dilemmas describe interactions and conflicts in real-life situations such as climate change mitigation, refugee repatriation, use of public space, social inclusion, gender discrimination, community care in mental health or resource depletion, and results obtained from behavioral research can be translated to improve all these areas.

Traditionally, most experiments have been conducted in laboratories with highly controlled experimental protocols but with limitations in terms of subject pool or decision context [5–8]. There is reasonable doubt and a long-standing concern about the reliability of experimental insights beyond laboratories and student populations [9, 10]. Studies comparing students with non-students differ in their conclusions about whether students are an appropriate subject pool. Some studies found differences in behavior between students and other populations, with students being less prosocial [11–13], while other studies report that students and non-students behave similarly in social preference games [14]. Moreover, the generalizability of laboratory results is also affected by the physical context in which the experiments are performed. The situations of social interaction that are studied do not happen in laboratories, but in real-life scenarios where participants face dilemmas and make decisions. This leads participants in laboratories to engage not in real-world behaviors, but in behaviors that are biased by the experimental conditions.

Furthermore, social experimentation has recently been affected by the general crisis of replicability and reproducibility in science, issues that concern the main actors in science [15–17]. Some efforts have been made to address this situation by promoting transparency in the statistical and methodological aspects of laboratory work, and also by promoting the publication of more detailed methods, the data sources and the code used in the experiments and in the analysis [18]. Scientists are encouraged to conduct replication studies [19] and, in general, to pursue a more open research culture [16, 20].

In recent years, Computational Social Science has emerged as a multidisciplinary field that studies complex social systems and provides new insights about social behaviors, combining tools and methods from social and computer sciences [21–24]. Along these lines, a large number of studies have been conducted, generally exploiting large amounts of social data, mostly collected from online social platforms (Twitter, Facebook, etc.) [25]. Within the same field, some researchers have started to use online services such as Amazon Mechanical Turk as platforms to recruit participants and run their behavioral experiments [26, 27]. Many experiments have been successfully deployed on these services, providing new insights into social problems from another perspective [28]; however, experiments on these services also suffer from some known limitations [29].

There is a gap between the studies conducted with large-scale data from online platforms (which come from less controlled samples and protocols) and the small-scale data collected from experimentation in behavioral science labs (collected with more robust protocols). New platforms fill this gap, providing opportunities for the design of mid- and large-scale behavioral experiments in online labs that guarantee the quality of the data collection [30–32]. These more flexible platforms have great advantages: (1) they facilitate the recruitment of more diverse sociodemographic profiles, or of very specific communities, according to the needs of the experiment, (2) they make it possible to carry out experiments in a way that is distributed in space and time, and (3) they are more cost-efficient, since the infrastructure is much lighter. Other limitations arise in these platforms, such as the identification of the experimental participants or the economic incentives, to mention only a few.

Our scenario of experimentation is described in the context of lab-in-the-field experiments based on the pop-up guidelines [33], an intermediate situation between traditional behavioral experimentation and big data analysis. The basic idea is to move the experiments outside the lab into real contexts, and to open participation to new and more diverse audiences. More importantly, the experiments are not only built by taking into account the researcher’s interests and motivations, but also by considering the perspective of citizen participation and its social impact, in terms of providing the right knowledge for public administrations to design new evidence-based policies and of empowering participants to trigger civic actions. This is framed within the citizen science approach [34–37], which promotes the participation and inclusion of non-expert audiences in real research processes in different ways [38, 39] (co-creating projects, collecting data, interpreting and analyzing data, and proposing actions based on the evidence collectively gathered). Citizen science helps us to involve the general public in behavioral experimentation and impacts the participants themselves [39–43], for instance by increasing their disposition toward science [44].

To carry out these experiments interactively, we designed and implemented Citizen Social Lab, a platform with a collection of decision-making and behavioral games based on a light infrastructure that can be installed and executed in real-life contexts in a simple but robust way. Depending on the goal of the experiment and the behavioral variables to be studied, the researcher can select and parametrize one or several games, and also define the general dynamics of each experimental session. The platform registers all the behavioral actions taken by the participants, and also provides surveys to collect sociodemographic data and information about the participants’ experience or their decision-making process. The platform does not allow the intervention of uncontrolled participants, and it registers data accurately without alterations of any kind.

In contrast to other existing platforms, this platform has been designed to follow citizen science guidelines and to be used in experimental settings where participants are recruited using opportunistic sampling. For these two reasons, both the experimental staging and the platform include features to attract the attention of participants and, once they are enrolled, to improve their focus and engagement within the experiment. Potential participants have no prior knowledge of game theory or of social dilemmas. They are simply curious about the public intervention, which takes the form of a pop-up experiment, and they receive no detailed explanation in advance of what it is really about. In most of the cases analyzed here, the participants are neighbors passing by the street, boulevard or square where the citizen science experiment is placed, or visitors to a cultural festival to which the experiments were attached. Participants therefore cannot be regarded as informed citizens in any case, and the open and public approach provided by citizen science practices has allowed us to recruit a quite generic citizen profile. Different approaches are thus used to engage them, one of them being the gamification of the experience [45, 46], which consists in presenting the experiment as a game and a scientific investigation at the same time. Another important feature is the feedback and knowledge obtained by the participants after the experience, for instance through personalized reports for each participant or through public lectures that summarize the results once a paper has been published. These efforts also add new dimensions to the mandatory open data access, ethical and transparency requirements of citizen science approaches.

The experimentation platform has been active since 2013 and, within that time, it has been used successfully in more than 15 experiments to study different aspects of human behavior. To date, more than 2 821 people have contributed, making around 45 200 valid decisions. We have developed it as an open-source project so that software developers and the scientific community can help grow the platform, and also to facilitate the reproducibility of the experiments and to foster the use of platforms like this one as an alternative method to conduct behavioral experiments in all types of environments and settings.

Materials and methods

The platform

Citizen Social Lab is a platform designed to assist in the deployment of human behavioral experiments. It has been created with three important goals in mind that foster versatility. First, the platform is based on light and portable technologies, so it can be used in open and diverse environments following the guidelines of pop-up experiments [33], but also purely online or in more “classical” experimental laboratories. Second, it has been designed with a friendly user interface to facilitate the participation of a broader population, and to engage and motivate participants to solve the tasks proposed in the experiment while having an enjoyable experience. And third, it is structured so that it is easy to incorporate any type of social dilemma or behavioral game, as well as any type of interaction: individual/computer, individual/individual and individual/collective.

The platform allows researchers to carry out a suite of dilemmas or behavioral games, which compose the core of the system. The system already contains several dilemmas, which are described in the next section, and this number is expected to increase as new experiments are developed and deployed using the platform. Moreover, beyond the data collected from the participants’ decisions, the system is designed to collect complementary data (about sociodemographics, user experience or experiment-related questions) through surveys before and/or after the social experiment takes place. It also registers all the activity of the participants when using the platform, which can be used to infer other parameters (e.g. response time).
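As an illustration of how a parameter such as response time could be inferred from the registered activity, the following is a minimal sketch. It assumes a hypothetical log of `(timestamp, kind)` events, where `"shown"` marks the moment a decision screen is displayed and `"decided"` the moment the participant answers; the event names and the `response_times` helper are illustrative and are not taken from the platform's code.

```python
from datetime import datetime


def response_times(events):
    """Return the seconds elapsed between each "shown" event and the next "decided" event.

    `events` is a list of (timestamp, kind) tuples; this log format is a
    hypothetical one chosen for illustration.
    """
    times = []
    shown_at = None
    for timestamp, kind in sorted(events):
        if kind == "shown":
            shown_at = timestamp
        elif kind == "decided" and shown_at is not None:
            times.append((timestamp - shown_at).total_seconds())
            shown_at = None  # wait for the next screen before timing again
    return times


example_log = [
    (datetime(2013, 12, 1, 10, 0, 0), "shown"),
    (datetime(2013, 12, 1, 10, 0, 4), "decided"),
    (datetime(2013, 12, 1, 10, 0, 10), "shown"),
    (datetime(2013, 12, 1, 10, 0, 12, 500000), "decided"),
]
example_times = response_times(example_log)
```

Pairing each decision with the most recent display event keeps the measure robust to unrelated activity logged in between.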

The platform architecture is highly modular and allows researchers to construct personalized environments by combining and parametrizing the modules they require for their particular experimental setting. The basic client modules currently available are the following. (i) Introductory interfaces, with brief but detailed information about the topic and goals of the experiment, and legal information including the privacy policy. (ii) Questionnaires, which can be used to collect sociodemographic information and also to present specific questions related to the experiment topic or setting. Questionnaires can be used before and/or after the main experiment. (iii) Tutorial and instructions, so participants can learn the rules and the mechanics of the experiment by themselves (even though researchers are always present at the physical location to provide support if any question arises) and practice a few test rounds to familiarize themselves with the game interface. (iv) Games and/or dilemmas, the core of the platform: the module that runs the experiment and collects the decisions of the participants. An experiment can incorporate only one game or a collection of them. (v) Results, a set of interfaces designed to provide feedback to the participants on the outcome of their decisions in the experiment. This is crucial to increase the positive return that they obtain from participating in the experience. Finally, (vi) the administration interface, composed of a set of pages that let the researcher control the parameters of each session, monitor the evolution of a game, and review the general performance of the experiment in real time.

The modules are combined and configured to define what we call the participant’s flow through the experiment (see Fig 1). The system is designed to automatically guide the participants through all the stages without the need of interacting with a researcher (unless otherwise required by the participant), and it allows the existence of simultaneous games at different stages of the experiment.

Fig 1. Block diagram of a participant’s flow through one experimental setup.

The participant goes through three stages: the first stage contains the pre-game module with preliminary instructions about the experiment and surveys, the second stage contains the core game mechanics (which implements the suite of decision-making and behavioral games), and the third stage consists of the post-game module with the final feedback of the experiment and surveys about the experience and the topic of the experiment. Not all these modules and interfaces are present in all the experimental setups.
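The participant's flow can be pictured as an ordered list of modules grouped into the three stages of Fig 1. The sketch below is only an illustration under that reading: the `Module` class, the `build_flow` helper and the module identifiers are hypothetical and do not reproduce the platform's actual experiment description files.

```python
from dataclasses import dataclass

# Stage labels mirroring the three stages of Fig 1 (names are illustrative).
STAGES = ("pre-game", "game", "post-game")


@dataclass(frozen=True)
class Module:
    name: str   # hypothetical identifier, e.g. "tutorial"
    stage: str  # one of STAGES


def build_flow(modules):
    """Return module names ordered by stage, keeping the given order within a stage."""
    for module in modules:
        if module.stage not in STAGES:
            raise ValueError(f"unknown stage: {module.stage}")
    # sorted() is stable, so the relative order inside each stage is preserved.
    ordered = sorted(modules, key=lambda module: STAGES.index(module.stage))
    return [module.name for module in ordered]


example_flow = build_flow([
    Module("introduction", "pre-game"),
    Module("sociodemographic_survey", "pre-game"),
    Module("tutorial", "pre-game"),
    Module("prisoners_dilemma", "game"),
    Module("results", "post-game"),
    Module("experience_survey", "post-game"),
])
```

Since not every setup uses every module, a setup without surveys would simply omit those entries from the list.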

Games module.

The main goal of the platform is to collect the decisions of the participants when they face different types of dilemmas that are analogies of real-life situations. Most of the dilemmas included up to now are social and interactive, and require synchronized interaction with other individuals; however, the platform can also be used to study individual decision-making situations that do not require real-time interaction with other participants.

The first social dilemma implemented is a generalized version of a simple dyadic game, where two people have to decide simultaneously which of two actions to select, and the outcome is the result of the combination of both choices. Depending on the values presented to the participants, they can face different types of games: a Prisoner’s Dilemma [47, 48], a Stag Hunt [49], a Hawk-Dove/Snowdrift [50–52] or a Harmony game [53]. These dilemmas can be used to measure two important features of social interaction, namely the temptation to free-ride and the risk associated with cooperation.
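To make the dependence on the payoff values concrete, the following sketch classifies a symmetric two-action game with the textbook labels R (both cooperate), S (cooperate against a defector), T (defect against a cooperator) and P (both defect), assuming R > P. These labels and the `classify_dyadic_game` function are assumptions introduced here for illustration; the article does not specify how the platform encodes the payoff matrix.

```python
def classify_dyadic_game(R, S, T, P):
    """Classify a symmetric 2x2 game from its payoffs, assuming R > P.

    T > R means there is a temptation to free-ride; S < P means that
    cooperating is risky. The four combinations give the four games.
    """
    if R <= P:
        raise ValueError("this sketch assumes R > P")
    if T == R or S == P:
        return "boundary"  # payoffs sit exactly between two games
    temptation = T > R
    risk = S < P
    if temptation and risk:
        return "Prisoner's Dilemma"
    if risk:
        return "Stag Hunt"
    if temptation:
        return "Snowdrift"
    return "Harmony"


# Illustrative payoff values only; they are not taken from the experiments.
example_game = classify_dyadic_game(R=10, S=0, T=15, P=5)
```

Fixing R and P and varying only T and S is enough to move between the four games, which is why a single generalized implementation can cover all of them.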

The second type of social dilemma, the trust game (TG), also called the investment game, is used to measure trust and reciprocity in social interactions [54]. In the TG, two players are given a quantity of money. The first player sends an amount of money to the second player, having been informed that the amount sent will be multiplied by a factor (e.g. three). The second player then decides how much of the multiplied money to give back to the first player, and both receive their final outcome.
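The payoff arithmetic of the trust game can be written down in a few lines. This is a sketch under two assumptions that go beyond the description above: both players start with the same endowment, and the second player can return at most the multiplied amount. The function name and parameters are illustrative.

```python
def trust_game_payoffs(endowment, sent, returned, factor=3):
    """Return (first_player_payoff, second_player_payoff) for one trust game.

    Assumes both players start with the same endowment and that the second
    player can give back at most the multiplied amount received.
    """
    if not 0 <= sent <= endowment:
        raise ValueError("sent must be between 0 and the endowment")
    received = factor * sent
    if not 0 <= returned <= received:
        raise ValueError("returned must be between 0 and the multiplied amount")
    first = endowment - sent + returned
    second = endowment + received - returned
    return first, second


# With an endowment of 10, sending 5 (tripled to 15) and getting 6 back:
example_payoffs = trust_game_payoffs(endowment=10, sent=5, returned=6)
```

The amount sent is usually read as a measure of trust and the amount returned as a measure of reciprocity.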

The third type of social dilemma, the Dictator game (DG), can be used to measure generosity, altruism or distributional fairness [55]. In this game, the first player, “the dictator”, splits an endowment between himself and the second player, “the recipient”. Whatever amount the dictator offers to the second player is accepted; the recipient is therefore passive and cannot punish the dictator’s decision. The DG is not formally a game because the outcome depends only on the action of one player; in game theory such games are known as degenerate games. However, there is a modified version of the DG which includes a third player who observes the decision of the dictator and has the option to punish the dictator’s choice [56]. The third player receives an endowment that they can choose to spend on punishing the dictator, so that punishing has a cost for the punisher.

The fourth type of social dilemma is a variant of the public goods game, a collective game in which the players decide, through their contributions, whether to invest in public goods or keep their private goods. This particular version is known as the collective-risk dilemma [57, 58], and consists of a group of people who must reach a common goal by making contributions from an initial endowment. If the goal is reached, every individual receives the part of the money they did not contribute. If not, a catastrophe occurs with a certain probability, and all participants lose all the money they had kept.
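The outcome rule of the collective-risk dilemma can be sketched as follows. The function and parameter names are hypothetical, and the sketch settles the game in a single step from the total contributions, leaving aside how contributions are spread over rounds.

```python
import random


def collective_risk_payoffs(endowments, contributions, target, risk, rng=None):
    """Return each player's final payoff in a collective-risk dilemma.

    `risk` is the probability of the catastrophe when the target is missed.
    `rng` can be a seeded random.Random instance for reproducible runs.
    """
    if len(endowments) != len(contributions):
        raise ValueError("one contribution per player is required")
    if any(c < 0 or c > e for e, c in zip(endowments, contributions)):
        raise ValueError("contributions must lie between 0 and the endowment")
    rng = rng or random.Random()
    kept = [e - c for e, c in zip(endowments, contributions)]
    if sum(contributions) >= target:
        return kept  # goal reached: everyone keeps what they did not contribute
    if rng.random() < risk:
        return [0 for _ in kept]  # catastrophe: all the kept money is lost
    return kept


# Illustrative values: two players who reach the target together.
example_outcome = collective_risk_payoffs([40, 40], [20, 20], target=40, risk=0.9)
```

Unequal initial endowments can be passed directly in `endowments`, which is the kind of variation a study of inequality would need.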

The platform also includes a non-social decision-making game, where participants have to make decisions having uncertain and/or incomplete information [59]. This game is played individually so there are no interactions with other players during the game. With this game we can study decision making strategies by controlling the type and amount of information that can be accessed by the participants.

All the dilemmas described above can be parametrized to allow for different types of studies (for instance, controlling the values of the payoff matrix) or extended to include different variations when they are available. Starting from the implemented interaction structures (Fig 2), new dilemmas can also be constructed and added to the platform following a simple set of guidelines described within the code of the platform.

Fig 2. Interaction types included in the platform.

The platform currently implements four different types of interaction that cover individual-computer (a), individual-individual (b, c) and individual-collective (d) types of coordination. The numbers on the arrows indicate the order in which each interaction takes place; black arrows are interactions from individuals to the computer, and red arrows are interactions from the computer to the participants.

Participation and motivations.

Moving the experiments out of the laboratories means that participants are usually not captive in advance, but it also opens the opportunity to attract new audiences from a broader population. The recruiting process in open environments, such as cultural events or public spaces, is substantially different from recruitment in laboratories, and is usually based on opportunistic sampling [60], where the sample is selected from the people who are available at the location during the experiment, using the selection criteria defined in the experimental setup (for instance, checking the age limits for participants). This type of recruitment, taken from citizen science practices and strategies, presents new challenges, since the interest of the population has to be attracted through other types of incentives, closely related to the impact on their lives, that allow them to reflect on certain topics and on their own actions during the game. In the pop-up experimental framework we usually include a narrative context and performative elements to capture the attention of the participants. However, once the attention of potential participants has been attracted, it is even more important to present the experiment in a motivating way to guarantee their participation until the end of the session.

We use gamification techniques to the extent that the experimental settings allow while ensuring the scientific rigor of the experiments. Behavioral games and dilemmas per se already have elements and mechanisms of games, such as challenges, objectives, rules, reward, punishment, interaction, competition, collaboration and call-to-action, among others. Based on them, we create an experience where we present some of the experiments as games, with a narrative setting that creates a story surrounding the experiment. In some experiments, mainly the ones that took place within the DAU Barcelona Festival (a festival of games, including board games, popular and traditional games, as well as historical simulations, role play or miniatures, organized by the Barcelona Institute of Culture), we created a main character to capture the participants’ attention (Mr. Banks, Dr. Brain and Climate Game, see S1 Fig), who presents a challenge that can be overcome by participating in the experiment.

The experiments are designed to enhance the motivations of the participants, not only from the perspective of games, but also in terms of their impact on the participants’ disposition toward science, their understanding of science or their effect on social issues. This is the particular case of the framed experiments: The Climate Game, Games for Mental Health, the games for social change within the STEM4Youth project and the street art performance called urGENTestimar. All of them are focused on real social concerns: collective climate action, the promotion of in-community mental health care services, or the concerns raised by several school groups related to social inclusion, use of public space and gender violence. Furthermore, beyond the economic incentive to participate (which depends on their performance in the game), participants also receive feedback on how their decisions and contributions could be translated into scientific research.

In our case, there are two types of participation according to the experiment context. Most of the experiments have been carried out in uncontrolled environments in terms of recruitment, without captive participants (e.g. festivals or public spaces). In specific cases, where the experiments were carried out in collaboration with local communities, the need to apply special recruitment techniques is not so important since the communities are usually involved in the design and the deployment of the experiment. In any case, to support the game-based approach, the platform allows the introduction of resources to include the narrative, always preserving the scientific rigor, and also provides features that can be used to create a gamified experience.

Technical details.

Some of the dilemmas explained above require individuals to interact in different ways. For instance, in games where two individuals participate there are at least two possible interaction styles: one where the two individuals make a simultaneous decision without knowing the other’s choice, and after that receive the outcome; and another where one player makes a decision while the other player waits and, once the first has decided, the second makes her decision knowing the other’s choice, after which both get their final feedback (see Fig 2). Also, experiments can have different evolution mechanics: from one-shot games, in which the players make a single decision, to iterated games, in which the players make several decisions consecutively with the same or different participants. Finally, we also have to consider the possibility that the interaction between the players is constrained by an underlying structure that defines the relationship between the players, which can range from an all-connected-to-all structure to a specific network structure.
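The simultaneous interaction style implies that no decision is revealed until every player in the round has decided. The class below is a minimal in-memory sketch of that rule, not the platform's synchronization code; the `SimultaneousRound` name and its methods are hypothetical.

```python
class SimultaneousRound:
    """Hold the decisions of one round and reveal them only when all are in."""

    def __init__(self, players):
        self.players = tuple(players)
        self._decisions = {}

    def submit(self, player, action):
        """Record one player's action; each player may decide only once."""
        if player not in self.players:
            raise ValueError(f"unknown player: {player}")
        if player in self._decisions:
            raise ValueError(f"{player} has already decided")
        self._decisions[player] = action

    def is_complete(self):
        return len(self._decisions) == len(self.players)

    def reveal(self):
        """Return a copy of all decisions, or None while someone is still deciding."""
        if not self.is_complete():
            return None
        return dict(self._decisions)


example_round = SimultaneousRound(["player_a", "player_b"])
example_round.submit("player_a", "cooperate")
pending = example_round.reveal()   # still waiting for player_b
example_round.submit("player_b", "defect")
outcome = example_round.reveal()
```

A sequential style would differ only in that the second player is shown the first decision before submitting, and an iterated game would create one such round per repetition.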

Taking all these points into consideration, we designed a client-server architecture that controls the flow of the experiment according to the needs of the researchers. On the one hand, the server manages the pace of the experiment, and implements all the core games and the synchronization methods between players. It is based on a Python/Django backend, combined with a database that stores the information generated by each experimental setup separately. The server can be run online, to allow experimentation on the internet, or it can be installed on a local server to run experiments in local area networks.

On the other hand, the client contains the user interface that the participants use to interact with the experiment. The technology on the client side is composed of HTML and JavaScript files that are generated dynamically from the experiment description files. The user interface has been designed to fit the resolution of a tablet device, but also works on any computer with a standard browser. It is also structured in a way that can be easily translated into other languages.

Most of the experiments have used the same infrastructure, consisting of a laptop that acted as a server and a collection of tablets that allowed up to 30 participants to take part in the experiment simultaneously. In Fig 3 we present a diagram of this infrastructure. Data is collected and stored in a database (which may be relational or not), and personal information is stored separately from the experimental data to follow the privacy guidelines required by this type of experiment.

Fig 3. Example of the platform infrastructure.

This is the basic technological infrastructure used in the majority of experiments. It is designed to be rapidly deployed in any environment.

Finally, live control of an experiment is critical to guarantee its correct development. For this reason, experiments can be controlled using an administration webpage that provides two features: it allows the researcher to configure the parameters that will be used in each iteration of the experiment (e.g. select whether a certain group will be intervention or control), and it presents interfaces with the status of the experiment. Live monitoring can be done at two different scales: at the level of a particular game, where researchers have real-time detailed information about the evolution of that game (rounds played, decisions made, earnings, connection status, etc.), or from a more general point of view, to obtain a summary of the status of the experiment (demographics, games played, global earnings, etc.).

The experiments

The platform has been in use since December 2013 in 6 different experimental setups focused on the analysis of human behavior. Some of them have been repeated in different situations, which adds up to a total of 15 experiments carried out. All participants in the experiments signed an informed consent to participate. In agreement with the Spanish Law for Personal Data Protection, no association was ever made between their real names and the results. Experiment procedures were checked and approved by the Viceprovost of Research of Universidad Carlos III de Madrid (Dr. Brain, The Climate Game) or by the Ethics Committee of Universitat de Barcelona (Mr. Banks, Games for Mental Health, STEM4Youth and urGENTestimar).

In this section we describe the main goals and results of the six research projects based on this platform, which are also summarized in Table 1.

  1. The first experimental setup based on the platform is “Mr. Banks: The Stock Market Game”, designed to study how people make decisions when they have limited and incomplete information. This setup emulated a stock market environment in which people had to decide whether the market would rise or fall. It allowed us to study the emerging strategies and the relevant use of information when making decisions under uncertainty, and the results are published in Ref. [59]. Three experiments based on this setup have been carried out in different locations, and the game is now available online.
  2. Next, we created another experimental setup entitled “Dr. Brain” to study the existence of cooperation phenotypes. The games played by the participants were based on a broad set of dyadic games and allowed us to deepen our understanding of human cooperation and to discover five different types of actors according to their behaviors [61].
  3. The following experimental setup included in the platform was “Dr. Brain: The Climate Game”, which was based on a collective-risk dilemma experiment to study the effect of inequality when participants face a common challenge [62]. Results showed that even though the collective goal was always achieved regardless of the heterogeneity of the initial capital distribution, the effort distribution was highly inequitable. Specifically, participants with fewer resources contributed significantly more (in relative terms) to the public goods than the richer—sometimes twice as much.
  4. The fourth experimental setup implemented in the platform was called “Games for Mental Health”, which was repeated in 4 different locations. The goal of this project was to evaluate the importance of communities for effective mental health care by studying different behavioral traits of the different roles of the ecosystem. The results presented in Ref. [63] reinforce the idea of community social capital, with caregivers and professionals playing a leading role.
  5. In the context of the EU project STEM4Youth we performed three experiments, which were co-designed with high-schools of Barcelona, Badalona and Viladecans. They addressed topics raised in workshops with students: gender inequalities, use of public space and integration of immigrants. The experiments combined a set of games that included Trust Game, Dictator’s Game, Prisoner’s Dilemma and Public Goods games.
  6. Finally, we performed two experiments named “urGENTestimar” in the context of artistic performances in Tàrrega and Poblenou (a Barcelona neighborhood), in which the participants took part in a set of behavioral games which included Prisoner’s Dilemma, Dictator’s Game or Snowdrift, and which were framed around different concerns of local communities.

Table 1. Summary of experiments performed thus far.

The suite of games is formed by: Decision-Making Game (DM), Harmony Game (HG), Snowdrift Game (SG), Stag-Hunt Game (SH), Prisoner’s Dilemma (PD), Trust Game (TG), Dictator’s Game (DG) and Collective-Risk Dilemma (CRD). The numbers of participants and decisions reported are the valid ones.

Platform evaluation

In this section we analyze the versatility and the robustness of the platform by reviewing some of the results obtained by its use in different experimental setups. Mainly we focus on the sociodemographic diversity, the experience of participation, the time response data collected in the iterative experiments, and finally the robustness in the replicability of experiments.


To start, we review some demographic data of the participants in the different experiments. We already stated that one of the main goals of the platform was to open the experiments to a more general population. To this end, Fig 4 presents an overview of the 2821 people who took part in the behavioral experiments run with this platform. We observe a combination of participants from a wide range of ages, especially from 10 to 50 but also older, and diverse educational levels, with a predominance of those with higher education. Gender is also balanced (45.73% females) compared with other similar experiments, which are usually performed with students and carry a sociodemographic bias.

Fig 4. Diversity of the participants pool.

(Left) The proportion of participants in all the experiments (n = 2821) regarding gender is 54.27% males and 45.73% females. (Center) Distribution of participants according to their ages in all the experiments (n = 2821). (Right) Educational level of participants in all the experiments except “urGENTestimar”, which didn’t ask this question to participants (n = 1993).

Response times

The platform allows for the collection of very precise parameters about the participation in the experiments. One of them is the timestamp at which participants perform an action. In iterated experiments, where participants make several decisions consecutively, the decision times are collected in each round so that we can calculate how long each participant takes to make a decision, an important measure to understand the strategic risk of a situation [64]. An interesting parameter in behavioral experimentation is the learning time, or in other words, the evolution of the decision time across the game.
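As an illustration of how decision times can be derived from logged timestamps, the following is a minimal sketch and not the platform's actual code. The log layout (one list of timestamps per participant, in seconds since the session started) and the function names `decision_times` and `mean_time_per_round` are assumptions made for this example.

```python
from statistics import mean


def decision_times(timestamps):
    """Seconds elapsed between consecutive actions of one participant."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def mean_time_per_round(all_timestamps):
    """Average decision time in each round across participants.

    Participants who stopped early only contribute to the rounds they played.
    """
    per_participant = [decision_times(ts) for ts in all_timestamps]
    n_rounds = max(len(times) for times in per_participant)
    return [
        mean(times[r] for times in per_participant if len(times) > r)
        for r in range(n_rounds)
    ]


# Hypothetical logs: the first timestamp marks when round 1 was shown,
# each later one marks a decision (seconds since the session started).
logs = [
    [0.0, 12.0, 18.0, 22.0],
    [0.0, 10.0, 14.0, 17.0],
    [0.0, 14.0, 19.0],  # this participant left after two rounds
]
round_means = mean_time_per_round(logs)
print(round_means)
```

Plotting `round_means` against the round index would give a learning curve of the kind discussed in this section.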

In Fig 5 we can see the evolution of the decision-making time across rounds. On the one hand, Mr. Banks presents the evolution of the three experiments that were carried out, the main one (DAU) and the two replicas (CAPS and Sonar+D). The evolution of the time response during the three experiments shows very similar trends. In the first round the time is substantially higher than the rest of the rounds and we see that from the 5th round the slope softens and stays more or less constant until the end. In this experiment, the variables that come into play to make a decision are the same round after round, so the trend is maintained during the game. The three experiments show similar trends but slightly different asymptotic values; the context, size and heterogeneity of the sample may be the cause of this variation, which confirms the accuracy of the data collected.

Fig 5. Time of response in different games.

(Left) Time response evolution across rounds in Mr. Banks experiments for the main performance in DAU (n = 283) and the two replicas CAPS (n = 37) and Sonar+D (n = 20). (Right) Time response evolution across rounds in The Climate Game experiment in both performances, DAU (n = 320) and City (n = 100).

On the other hand, in the case of The Climate Game the evolution of the game is somewhat different. The game starts with long times that go down gradually; however, depending on the point of the game in which the participants are (i.e. the distance to the goal) the times increase or decrease. In this case, unlike the previous one, the decision at each moment is given by the distance to the final goal, so that, as participants approach the end of the game, the times increase again. Therefore, here we observe two kinds of behavior: the learning at the beginning of the game and the uncertainty as the participants reach the last rounds. The trends of the two climate change experiments are similar; however, the absolute value of time is slightly higher in the “City” context.


Robustness of replicability

We also measure the consistency and the robustness of the results across different repetitions of the same experiment. Some of the six experimental setups described in the previous section were repeated in different environments and locations, in some cases with similar populations (e.g. the mental health experiment) and in other cases with different populations (e.g. the Mr. Banks experiment). We focus on Mental Health and Mr. Banks to examine whether the platform collects data of sufficient quality to allow replication in different situations. The Mental Health experiments took place in Catalunya, in four different locations and social events (popular lunch, snack, etc.), with around 270 participants in total. We analyze the differences between the four events in cooperation and expected cooperation (Prisoner’s Dilemma), and in trust and reciprocity (Trust Game). The differences among the experiments in the four locations are not statistically significant, so the data can be aggregated and analyzed as a whole (Fig 6).

Fig 6. Robustness of generalization in mental health experiments.

Levels of cooperation, cooperation expectation, trust and reciprocity in the four experiments: Lleida (n = 120), Girona (n = 60), Sabadell (n = 48) and Valls (n = 42). The average level is shown with a 95% CI in each case. The dashed line represents the total average levels. There is no significant variation in the level of cooperation (Kruskal-Wallis, H = 2.38, p = 0.50), cooperation expectations (Kruskal-Wallis, H = 0.38, p = 0.94), trust (Kruskal-Wallis, H = 2.67, p = 0.45) or reciprocity (Kruskal-Wallis, H = 3.02, p = 0.39). See Ref. [63] for further details.
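For readers who want to see how a Kruskal-Wallis comparison of four locations works, here is a minimal pure-Python sketch. It is not the authors' analysis code, and the toy groups are invented. The function names are assumptions for this example, and the p-value helper only covers the case of four groups (3 degrees of freedom), using the closed-form upper tail of the chi-square distribution.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard tie correction."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1.0 - ties / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0


def p_value_df3(h):
    """Upper tail of a chi-square distribution with 3 degrees of freedom."""
    return math.erfc(math.sqrt(h / 2)) + math.sqrt(2 * h / math.pi) * math.exp(-h / 2)


# Invented toy data for four locations (not the experimental data).
toy_h = kruskal_wallis_h([[1, 2], [3, 4], [5, 6], [7, 8]])

# H values reported in the caption of Fig 6; the p-values follow from them.
reported = {"cooperation": 2.38, "expectation": 0.38, "trust": 2.67, "reciprocity": 3.02}
p_values = {name: p_value_df3(h) for name, h in reported.items()}
```

Under these assumptions, feeding the reported H values to `p_value_df3` gives p-values that agree with those in the caption to two decimals, which is consistent with a test over four groups.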

The Mr. Banks experiment was performed in a main location, the DAU Festival, with large participation (306 people, 283 of them valid participants) and robust results. Two strategies emerged from the analysis of the decisions: Market Imitation and Win-Stay Lose-Shift. We compare the main result with two replicas that took place at two different events, in Brussels (CAPS conference) and Barcelona (Sonar+D), with narrower demographic populations and much smaller samples than the main experiment. There are no significant differences (i.e. larger than 1.96 SD) between the main experiment and the replicas, except in Market Imitation Up/Up between DAU and Sonar+D and in the Lose-Switch strategy between DAU and CAPS, as Fig 7 shows. Therefore, the behavioral patterns observed when we repeat the experiment with different samples are consistent, which helps us consolidate the conclusions reached in the main experiment. Given the existence of this baseline, when we observe one strategy that deviates from the baseline in the repetitions, we can focus on understanding the reason behind this particular result.

Fig 7. Stability of strategies in Mr. Banks replication experiments.

Ratio of following the Market Imitation and Win-Stay Lose-Shift strategies in the experiments: DAU (n = 283), CAPS (n = 37) and Sonar+D (n = 20). There are no significant differences in the Market Imitation strategies except for the probability of Up/Up between the DAU and Sonar+D experiments (-2.53 SD). There are no significant differences in Win-Stay Lose-Shift except in the last case (Lose-Switch) between the DAU and CAPS experiments (2.35 SD). See S1 and S2 Tables for further details.

This means that the platform captures data accurately: the behavioral patterns found in the replicas are consistent with the main results, with only a small part of the comparisons showing significant differences and the rest not depending on the conditions of each experiment.
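The comparison against a 1.96 SD threshold can be sketched as follows. The text does not state exactly which statistic was used, so this example assumes a standard two-sample z-score for the difference between two proportions; the function names and the counts are hypothetical and are not taken from the experiments.

```python
import math


def z_difference(successes_a, n_a, successes_b, n_b):
    """Difference between two proportions in units of its standard error."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    standard_error = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return (p_a - p_b) / standard_error


def is_significant(z, threshold=1.96):
    """True when the difference exceeds the threshold (about a 95% level)."""
    return abs(z) > threshold


# Hypothetical counts of decisions that follow a given strategy:
# a large main experiment compared with two small replicas.
z_replica_1 = z_difference(600, 1000, 70, 100)  # 60% vs 70%
z_replica_2 = z_difference(600, 1000, 62, 100)  # 60% vs 62%

print(is_significant(z_replica_1), is_significant(z_replica_2))
```

With small replicas the standard error is dominated by the replica sample, so only fairly large gaps in the observed proportions cross the threshold.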


Finally, another important aspect that we measured is the overall satisfaction of the participants after they finish the experiment. In the post-game survey of three games (Mr. Banks, Dr. Brain and The Climate Game) we asked the participants about their level of satisfaction with the overall experience. Results of this question are presented in Fig 8. Across these experiments, 82.77% of the participants were very satisfied or satisfied after the experience. The complete set of results about the experience in each game is presented in S3 Table.

Fig 8. Participants experience.

Experience of participation in Mr. Banks, Dr. Brain and The Climate Game (n = 1178). Most participants (82.77%) had a positive experience, a small group (9.51%) had a negative experience, and the rest (7.72%) had an indifferent experience.


With Citizen Social Lab we present a platform that combines human behavioral experiments with a citizen science approach for the sake of bringing science to a broader audience and performing social experiments beyond the laboratory. The platform is designed to be versatile, easy to use and robust, and to be used in open and diverse environments. It has already been used in several experiments by thousands of participants from a wide range of demographics, who mostly rated the experience as very positive. The results obtained in some of the 15 experiments carried out with the platform have also shown the scientific validity of the data it collects, leading to several scientific contributions [59, 61–63].

In order to maximize participation and make it much more diverse than in usual social experiments, we move the laboratory to the field, where behaviors occur naturally. In this unfriendly context we use the pop-up experimental setup to draw the attention of potential participants (everyone in the surroundings) with different techniques described in Ref. [33]. Then, we benefit from the lure of the game-based mechanics included in the platform to draw them into the experience and guide them through all the tasks required by the experimental setup. This approach has proved successful in environments where people are likely to play (as in the case of a games festival), leading to experiments with high participation. But the platform can also extend its utility to new scenarios, such as private and public organizations, where behavioral experimentation can help to understand and manage people within organizations [65].

Hence, it is important to emphasize the need to adapt the experiments to the environment where they take place (e.g. organizations, cultural festivals of any kind, scientific conferences, and so on), especially the experimental design and the interaction with the platform, because this is the way to increase empathy with potential participants. To achieve this, all the mechanisms of behavioral games and social dilemmas can be used to convert the interaction with the platform into a game (or another mechanism that fits the context), always within the constraints imposed by scientific rigor. It is also important to remark that the interface of the platform has to be friendly and adapted to the latest usability standards to overcome the “technological barrier” that might appear for certain age groups or social backgrounds. For instance, in the experiments where kids are involved (which have been approved and designed accordingly), a friendly and visually appealing interface based on tablets provides extra motivation to participate and reduces the time they need to learn the basics of the experiment.

After all the experiments and their repetitions we consider that the platform has already reached a high maturity level, but there are several points that still need some work to keep improving the technical and experimental parts. First, the platform has been largely tested within the pop-up experimental setting in physical environments. However, even though it has been designed to be easily integrated with online recruiting systems (e.g. Amazon Mechanical Turk), it has not been properly tested and validated in these environments. There is an opportunity to repeat some of the experiments to extend the consistency of the results when the dilemmas are presented to a purely online community, and to evaluate the effect of different payment methods on the participants’ performance.

Moreover, the platform is also constantly improving to provide new features and social dilemmas for the researchers. For example, we are creating the capacity for participants to create a unique profile and join in different environments. The long-term goal is to create a community of volunteers that participate in the experiments, and that can receive alerts when new opportunities to participate are open. We are also extending the number of available dilemmas within the platform as new research projects emerge which, once programmed and tested, are included in the main collection of available dilemmas.

The conceptual designs of both types of experiments, the pop-up ones that have been done so far and the large-scale ones that are planned for the future, have in common that the motivations of participants and scientific rigor are at the center of the participatory design. The platform has room for improvement in motivating the participants and in offering rewards at the level of learning and participation. On the one hand, it is necessary to improve the mechanisms through which participants learn about the scientific topic of the experiment while taking part in it, and also about the nature of their contributions and their positive impact in carrying scientific knowledge forward. In this regard, many experiments are framed within a context of social impact, so participation can also be associated with a call to action to solve social concerns. In the most recent experiments, these types of actions have been carried out outside the context of the platform; however, the online version can also contribute to this mission.

On the other hand, the participants’ experience at the end of the experiment can be improved, not only by providing the necessary economic incentive but also by offering on-site feedback expanded with real-time information about the research process in which they have participated. They can also obtain an improved experience by remotely following the evolution of the scientific research and participating in more phases of the scientific process. Another possible avenue to improve the platform is to build effective, real-time tools attached to experiments. Participants could in this way provide more feedback and actively contribute to the data interpretation and knowledge building process at both individual and aggregated levels. This effort appears to be meaningful to increase the participants’ sense of ownership of the knowledge being produced by means of citizen science strategies.

Last but not least, all the experiments done with this platform have followed open principles: the articles have been published as open access, and the data generated in all the experiments are also available in public repositories (properly anonymized) [66–73]. In the same vein, we are releasing the source code of Citizen Social Lab, including the core of the platform and the code of all the experiments done to date, to the research community so that they can use it to create their own experiments using the templates and guidelines already established in the platform. The project code is going to be released under a CC BY-NC-SA license, which allows others to share and adapt the platform. Users must give appropriate credit, provide a link to the license, and indicate if changes were made. The license does not allow use for commercial purposes. If users remix, transform, or build upon the material, the new contributions must be distributed under the same license as the original. The license carries no additional restrictions: we may not apply legal terms or technological measures that legally restrict others from doing anything the license permits ( In the end, if we aim to practice citizen science, it is also necessary to advocate opening the platform by all means: releasing data and code and opening up the results to make them accessible and understandable for anyone.

Supporting information

S1 Fig. Home screen of Mr. Banks, Dr. Brain and The Climate Game.

Screenshots of the initial screen of three experiments (a) Mr. Banks, (b) Dr. Brain and (c) The Climate Game. In this screen we introduce a character and a narrative to attract the attention of the public and to motivate them to participate.


S2 Fig. Decision-making interface of Mr. Banks, Dr. Brain and The Climate Game.

Screenshots of the main user interface of three experiments (a) Mr. Banks, (b) Dr. Brain and (c) The Climate Game where the participants respond to the dilemmas.


S3 Fig. Tutorial interface of The Climate Game.

Screenshots of the tutorial shown before The Climate Game experiment, where the participants learn the game mechanics and familiarize themselves with the user interface.


S1 Table. Market imitation.

Biases with respect to the market (Participant/Market).


S2 Table. Win-Stay Lose-Shift strategy.

Decision conditioned to performance (Strategy/Decision).


S3 Table. Satisfaction.

Satisfaction of participants in Mr. Banks (n = 234), Dr. Brain (n = 524) and The Climate Game (n = 420).


S1 File. Supporting information.

Supplementary notes, figures and tables.



We acknowledge the participation of 2821 anonymous volunteers who made the experiments based on the platform possible. We are grateful to N. Bueno-Guerra, A. Cigarini, J. Gomez-Gardeñes, C. Gracia-Lázaro, M. Gutiérrez-Roig, J. Poncela-Casasnovas, Y. Moreno, and A. Sánchez for their work on the experiments and for useful discussions and comments about the platform and this article. We also thank Mensula Studio for providing the graphical design for some of the experiments.


  1. Schroeder D. A. (Ed.). Social dilemmas: Perspectives on individuals and groups. Greenwood Publishing Group. 1995.
  2. Kollock P. Social dilemmas: The anatomy of cooperation. Annual review of sociology. 1998. 24.1: 183–214.
  3. Nowak MA. Evolutionary dynamics. Harvard University Press.
  4. Sánchez A. Physics of human cooperation: experimental evidence and theoretical models. J Stat Mech Theory Exp. 2018(2):24001.
  5. Levitt SD, List JA. Homo economicus Evolves. Science. 2008; 319(5865):909–10.
  6. List JA. An introduction to field experiments in economics. Journal of Economic Behavior and Organization. 2009; 70(3):439–42.
  7. Levitt SD, List JA. What Do Laboratory Experiments Measuring Social Preferences Reveal about the Real? J Econ Perspect. 2007; 21(2):153–74.
  8. Levitt SD, List JA. Viewpoint: On the generalizability of lab behaviour to the field. Canadian Journal of Economics. 2007. Vol. 40, p. 347–70.
  9. Rosenthal R, Rosnow RL. Artifact in behavioral research. New York: Academic Press; 1969. 400 pp.
  10. Orne MT. On the social psychology of the psychological experiment: With particular reference to demand characteristics and their implications. Am Psychol. 1962; 17(11):776–83.
  11. Falk A., Meier S., & Zehnder C. Do lab experiments misrepresent social preferences? The case of self-selected student samples. Journal of the European Economic Association. 2013; 11(4), 839–852.
  12. Carpenter J., Connolly C., & Myers C. K. Altruistic behavior in a representative dictator experiment. Experimental Economics. 2008; 11(3), 282–298.
  13. Fehr E, List JA. The Hidden Costs and Returns of Incentives–Trust and Trustworthiness among Ceos. J Eur Econ Assoc. 2004; 2(5):743–71.
  14. Exadaktylos F., Espín A. M., & Brañas-Garza P. Experimental subjects are not different. Scientific Reports. 2013; 3 (1213), 1–6.
  15. Chang AC, Li P. Is Economics Research Replicable? Sixty Published Papers from Thirteen Journals Say “Usually Not”. Financ Econ Discuss Ser. 2015;(83):1–26.
  16. Open Science Collaboration. Estimating the reproducibility of psychological science. Science. 2015;349(6251):aac4716. pmid:26315443
  17. Munafò MR, Nosek BA, Bishop DVM, Button KS, Chambers CD, Percie du Sert N, et al. A manifesto for reproducible science. Nat Hum Behav. 2017;1(1):21.
  18. Nature. Announcement: Transparency upgrade for Nature journals. Nature. 2017;543(7645):288.
  19. Nature Editorial. Why researchers should resolve to engage in 2017. Nature. 2017;541(7635):5–5. pmid:28054620
  20. Nosek BA, Alter G, Banks GC, Borsboom D, Bowman SD, Breckler SJ, et al. Promoting an open research culture. Science. 2015;348 (6242).
  21. Lazer D, Pentland A, Adamic L, Aral S, Barabasi A-L, Brewer D, et al. Computational social science. Science. 2009;323: 721–3. pmid:19197046
  22. Cioffi-Revilla C. Computational social science. Wiley Interdisciplinary Reviews: Computational Statistics. 2010. pp. 259–271.
  23. Conte R, Gilbert N, Bonelli G, Cioffi-Revilla C, Deffuant G, Kertesz J, et al. Manifesto of computational social science. Eur Phys J Spec Top. 2012;214: 325–346.
  24. Mann A. Core Concept: Computational social science. Proc Natl Acad Sci. 2016;113: 468–470. pmid:26787844
  25. Bond R. M., Fariss C. J., Jones J. J., Adam D. I., Kramer A. D. I., Marlow C., et al. A 61-million-person experiment in social influence and political mobilization. Nature. 2012. 489, 295–298. pmid:22972300
  26. Mason W. and Siddharth S. Conducting behavioral research on Amazon’s Mechanical Turk. Behavior research methods. 2012. 44.1: 1–23. pmid:21717266
  27. Rand D. G. The promise of Mechanical Turk: How online labor markets can help theorists run behavioral experiments. Journal of theoretical biology. 2012. 299, 172–179. pmid:21402081
  28. Shirado H., Christakis N. A. Locally noisy autonomous agents improve global human coordination in network experiments. Nature. 2017. 545 (7654), 370. pmid:28516927
  29. Stewart N., Chandler J., Paolacci G. Crowdsourcing Samples in Cognitive Science. Trends in cognitive sciences. 2017. pmid:28803699
  30. Chen DL, Schonger M, Wickens C. oTree-An open-source platform for laboratory, online, and field experiments. J Behav Exp Financ. 2016;9:88–97.
  31. Radford J, Pilny A, Reichelmann A, Keegan B, Welles BF, Hoye J, et al. Volunteer Science. Soc Psychol Q. 2016;79(4):376–96.
  32. Holt C. Vecon Lab: Last date accessed: Feb 18th, 2018.
  33. Sagarra O, Gutiérrez-Roig M, Bonhoure I, Perelló J. Citizen Science Practices for Computational Social Science Research: The Conceptualization of Pop-Up Experiments. Front Phys. 2016; 3(January):1–19.
  34. Bonney R, Shirk JL, Phillips TB, Wiggins A, Ballard HL, Miller-Rushing AJ, et al. Citizen science: Next steps for citizen science. Science. 2014;343: 1436–1437. pmid:24675940
  35. Gura T. Citizen science: amateur experts. Nature. 2013;496: 259–261. pmid:23586092
  36. Hand E. People Power. Nature. 2010;466: 685–687. pmid:20686547
  37. Silvertown J. A new dawn for citizen science. Trends Ecol Evol. 2009;24: 467–471. pmid:19586682
  38. Kullenberg C, Kasperowski D. What is citizen science?—A scientometric meta-analysis. PLOS One. 2016;11: 1–16.
  39. Bonney R, Phillips TB, Ballard HL, Enck JW. Can citizen science enhance public understanding of science? Public Underst Sci. 2015;25: 2–16. pmid:26445860
  40. Price CA, Lee HS. Changes in participants’ scientific attitudes and epistemological beliefs during an astronomical citizen science project. J Res Sci Teach. 2013;50: 773–801.
  41. Bonney R, Cooper CB, Dickinson J, Kelling S, Phillips T, Rosenberg K V., et al. Citizen Science: A Developing Tool for Expanding Science Knowledge and Scientific Literacy. Bioscience. 2009;59: 977–984.
  42. Bonney R, Ballard H, Jordan R, McCallie E, Phillips T, Shirk J, et al. Public Participation in Scientific Research: Defining the Field and Assessing Its Potential for Informal Science Education Research: Defining the Field and Science Education. Sci Educ. 2009.
  43. Senabre E, Ferran-Ferrer N, Perelló J. Participatory design of citizen science experiments. Comunicar. 2018;26(54):29–38.
  44. Perelló J, Ferran-Ferrer N, Ferré S, Pou T, Bonhoure I. High motivation and relevant scientific competencies through the introduction of citizen science at Secondary schools: An assessment using a rubric model. In Citizen Inquiry Synthesising Science and Inquiry Learning. Edited by Herodotou Christothea, Sharples Mike, Scanlon Eileen (pp. 150–175). Oxon, United Kingdom: Routledge 2018.
  45. Ponti M, Hillman T, Kullenberg C, Kasperowski D. Getting it Right or Being Top Rank: Games in Citizen Science. Citiz Sci Theory Pract. 2018;3: 1–12.
  46. Bowser A, Hansen D, Preece J, He Y, Boston C, Gunnell L, et al. Using gamification to inspire new citizen science volunteers. Proc First Int Conf Gameful Des Res Appl. 2013; 18–25.
  47. Rapoport A, Chammah AM. Prisoner’s Dilemma. The University of Michigan Press; 1965.
  48. Axelrod R, Hamilton WD. The evolution of cooperation. Science. 1981;211(4489):1390–6. pmid:7466396
  49. Skyrms B. The stag hunt and the evolution of social structure. Cambridge Univ. Press, Cambridge, UK.; 2003. 1–149 p.
  50. Rapoport A, Chammah AM. The Game of Chicken. Am Behav Sci. 1966; 10(3):10–28.
  51. Maynard Smith J. Evolution and the Theory of Games. Cambridge University Press; 1982. 224 p.
  52. Sugden R. The economics of rights, co-operation and welfare. Palgrave Macmillan, London, UK.; 2004. 1–243 p.
  53. Licht AN. Games Commissions Play: 2x2 Games of International Securities Regulation. Yale J Int Law. 1999;24: 61–125.
  54. Berg J, Dickhaut J, McCabe K. Trust, reciprocity and social history. Games Econ Behav. 1995; 10:122–42.
  55. Forsythe R., Horowitz J. L., Savin N. E., & Sefton M. Fairness in simple bargaining experiments. Games and Economic Behavior. 1994; 6(3):347–369.
  56. Fehr E., & Fischbacher U. Third-party punishment and social norms. Evolution and Human Behavior. 2004; 25(2), 63–87.
  57. Milinski M, Sommerfeld RD, Krambeck HJ, Reed FA, Marotzke J. The collective-risk social dilemma and the prevention of simulated dangerous climate change. PNAS. 2008;105(7):2291–4. pmid:18287081
  58. Tavoni A, Dannenberg A, Kallis G, Loschel A. Inequality, communication, and the avoidance of disastrous climate change in a public goods game. PNAS. 2011;108(29):11825–9. pmid:21730154
  59. Gutiérrez-Roig M, Segura C, Duch J, Perelló J. Market imitation and win-stay lose-shift strategies emerge as unintended patterns in market direction guesses. PLOS One. 2016; 11(8). pmid:27532219
  60. Orcher L. T. Conducting research: Social and behavioral science methods. Routledge. 2016.
  61. Poncela-Casasnovas J, Gutierrez-Roig M, Gracia-Lazaro C, Vicens J, Gomez-Gardenes J, Perello J, et al. Humans display a reduced set of consistent behavioral phenotypes in dyadic games. Sci Adv. 2016; 2(8). pmid:27532047
  62. Vicens J, Bueno-Guerra N, Gutiérrez-Roig M, Gracia-Lázaro C, Gómez-Gardenes J, Perelló J, et al. Resource heterogeneity leads to unjust effort distribution in climate change mitigation. PLOS One 13(10): e0204369. 2018. pmid:30379845
  63. Cigarini A, Vicens J, Duch J, Sánchez A, Perelló J. Quantitative account of social interactions in a mental health care ecosystem: cooperation, trust and collective action. Sci Rep. 2018;8: 3794. pmid:29491363
  64. Brañas-Garza P., Meloso D., & Miller L. Strategic risk and response time across games. International Journal of Game Theory. 2017; 46(2), 511–523.
  65. Espin A. M., Reyes-Pereira F., & Ciria L. F. Organizations should know their people: A behavioral economics approach. Journal of Behavioral Economics for Policy. 2017; 1, 41–48.
  66. Gutiérrez-Roig M, Segura C, Duch J, Perelló J. Dataset—Mr. Banks Experiment 2013. 2016;
  67. Poncela-Casasnovas J, Gutiérrez-Roig M, Gracia-Lázaro C, Vicens J, Gómez-Gardenes J, Perelló J, et al. Dataset—Humans display a reduced set of consistent behavioral phenotypes in dyadic games. 2017.
  68. Vicens J, Bueno-Guerra N, Gutiérrez-Roig M, Gracia-Lázaro C, Gómez-Gardenes J, Perelló J, et al. Dataset—Resource heterogeneity leads to unjust effort distribution in climate change mitigation. 2018.
  69. Cigarini A, Vicens J, Duch J, Sánchez A, Perelló J. Dataset—Quantitative account of social interactions in a mental health care ecosystem: cooperation, trust and collective action. 2018.
  70. Vicens J, Cigarini A, Perelló J. Dataset—STEM4Youth: Games Barcelona. 2018.
  71. Vicens J, Cigarini A, Perelló J. Dataset—STEM4Youth: Games Badalona. 2018.
  72. Vicens J, Cigarini A, Perelló J. Dataset—STEM4Youth: Games Viladecans. 2018.
  73. Vicens J, Cigarini A, Perelló J. Dataset—urGENTestimar. 2018.