Care to share? Experimental evidence on code sharing behavior in the social sciences

  • Daniel Krähmer,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing

    daniel.kraehmer@soziologie.uni-muenchen.de

    Affiliation Department of Sociology, University of Munich (LMU), Munich, Germany

  • Laura Schächtele,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Writing – original draft, Writing – review & editing

    Affiliation Department of Sociology, University of Munich (LMU), Munich, Germany

  • Andreas Schneck

    Roles Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Writing – review & editing

    Affiliation Department of Sociology, University of Munich (LMU), Munich, Germany

Abstract

Transparency and peer control are cornerstones of good scientific practice and entail the replication and reproduction of findings. The feasibility of replications, however, hinges on the premise that original researchers make their data and research code publicly available. This applies in particular to large-N observational studies, where analysis code is complex and may involve several ambiguous analytical decisions. To investigate which specific factors influence researchers’ code sharing behavior upon request, we emailed code requests to 1,206 authors who published research articles based on data from the European Social Survey between 2015 and 2020. In this preregistered multifactorial field experiment, we randomly varied three aspects of our code request’s wording in a 2x4x2 factorial design: the overall framing of our request (enhancement of social science research, response to replication crisis), the appeal as to why researchers should share their code (FAIR principles, academic altruism, prospect of citation, no information), and the perceived effort associated with code sharing (no code cleaning required, no information). Overall, 37.5% of successfully contacted authors supplied their analysis code. Of our experimental treatments, only framing affected researchers’ code sharing behavior, though in the direction opposite to what we expected: Scientists who received the negative wording alluding to the replication crisis were more likely to share their research code. Taken together, our results highlight that the availability of research code will hardly be enhanced by small-scale individual interventions but instead requires large-scale institutional norms.

Introduction

Transparency and openness are hallmarks of science. They are vital for the scientific enterprise to unfold its self-correcting capabilities and enable researchers to engage in “organized skepticism” [1] by evaluating and replicating scientific findings originally published by others. Such skepticism is warranted, given evidence of a replication crisis. Across disciplines, researchers have struggled to replicate key findings [2–6], meaning that central claims or discoveries could not be repeated independently. Against this backdrop, both conceptual and direct replications (the latter sometimes called reproductions [7]) have been deemed crucial for determining whether findings are credible and deserve to enter the stock of scientific knowledge [8].

In practice, reproductions hinge on the availability of an original study’s code and data. Performing a replication solely based on the body of an article is tedious at best and impossible at worst. This goes without assuming ill intent or incompetence on the authors’ side: Space in research articles is limited, and even the most diligent researcher will struggle to document all relevant details of their analysis within an article.

To boost transparency, publishing research data has become increasingly common, not least due to requirements imposed by funders (e.g. the European Research Council). While publishing research data is clearly a step towards more open science, it is only half the battle. Given the complexity of analyses, a dataset may yield very different conclusions depending on how it is processed and analyzed. This has been demonstrated by a number of many-analysts studies [9–11], which sparked lively debate across the social sciences [12, 13]. Researchers’ degrees of freedom imply that, in extreme cases, open data may provide little to no extra transparency if replicators cannot reconstruct data preparation and analysis procedures. This applies in particular to large-N observational studies, where analytical flexibility and errors typically unfold downstream of data collection (e.g. sample restrictions, outlier management, coding of missing values). To facilitate effective peer control, open data thus needs to be accompanied by open code (e.g. Stata do-files, SPSS syntax, R scripts).

Open code alleviates two problems that currently compromise the credibility of research: Errors in data preparation and misspecifications of statistical models [8]. While model misspecification should, in principle, be discernible from the body of an article, errors in data preparation are downright impossible to spot without access to authors’ code. In fact, even wrong model specifications may be hard to detect, given that authors rarely justify their statistical models sufficiently for replicators to exert effective scrutiny [14]. Open code advances research integrity by exposing errors and lifting the lid on opaque model descriptions.

Fostering research transparency is not the only benefit of open code. Code quality itself may benefit from being subject to public scrutiny, if only through the short-term motivation of making one’s syntax readable and understandable to others [15]. Code sharing may also increase efficiency, as not all scientific endeavors require reinventing the wheel [16, 17]. Ultimately, re-using and adapting peers’ code saves time and valuable resources, thereby facilitating peer learning and advancing the progress of science.

State of the art and research question

Despite the emergence and rapid dissemination of online repositories such as OSF and GitHub [18], open code remains the exception rather than the rule for published research articles. As a consequence, replicators routinely find themselves forced to contact original authors, relying on the latter’s spontaneous willingness to cooperate. Critically, researchers have repeatedly proven reluctant to comply with such requests. In an early study, Wolins [19] found that out of 37 psychologists, only 24% responded positively to a student’s query for raw data and analysis material. Recent studies report similar sharing rates, usually ranging from around 20% [4, 20] to 45% [21, 22]. To tackle the unavailability of research material, some journals (e.g. those of the American Economic Association) have recently started requiring authors to provide all data and code files upon submission. These replication packages are then checked by a designated data editor to ensure reproducibility [23].

Fig 1 summarizes the existing literature on data and code sharing (see also Table A in S1 File). Three details are noteworthy. First, even the most optimistic studies [20, 24] do not yield sharing rates above 60%, underpinning the assertion that current research falls far short of full transparency. Second, studies outside the common 20–45% sharing range rely on small samples (see marker size), yielding less precise estimates. Third, researchers’ reluctance to share materials transcends disciplinary boundaries. Providing data and code upon request has proven equally unpopular in the life sciences, economics, psychology, and various other disciplines.

Fig 1. Researchers’ willingness to share data/code in the literature.

Note: Dot-plot of data/code sharing rates in the literature [4, 19–21, 24–38]. Sample sizes of the respective sharing studies are denoted by dot size. The discipline under study is represented by different dot colors.

https://doi.org/10.1371/journal.pone.0289380.g001

Although there is convincing evidence that scientists’ willingness to share research material is generally low, little is known about why researchers choose to withhold or release their code. Correlational studies have established associations between authors’ willingness to share research material and a study’s age [20] as well as its strength of evidence [22]. While both findings suggest tempting explanations (code and data get lost over time; authors actively obstruct attempts to verify shaky results), neither allows for a straightforward causal interpretation: The seeming age effect might simply mirror the tightening of journals’ data policies over time. Similarly, the link between an article’s statistical properties and material availability may be confounded by researchers’ self-selection into particular fields and topics.

In a rare attempt to provide causal evidence on knowledge sharing in academia, Krawczyk and Reuben [21] conducted a field experiment asking 200 authors to share supplementary material for published articles. Requests were sent from either Krawczyk’s or Reuben’s institutional email account and included the requestor’s name and affiliation (University of Warsaw or Columbia University, respectively). Additionally, half of all emails identified the sender as an assistant professor. Exploiting this experimental variation, Krawczyk and Reuben found that neither the requestor’s affiliation nor their academic position strongly affected response and compliance among contacted authors. While these results are reassuring from an egalitarian perspective, they come with two major limitations. Regardless of their experimental condition, participants in the experiment could easily check the requestor’s academic position online, potentially weakening the treatment effect. Furthermore, email signatures always included both the requestor’s name and institutional affiliation, possibly conflating two analytically distinct treatments.

To this day, the factors influencing scientists’ code sharing behavior remain unknown. What facets of a code request may favor or impede original authors’ willingness to share code? And, by extension, (how) can replicators leverage this knowledge to increase the availability of research code through micro-interventions? Our analysis presents a theory-guided attempt to answer these questions.

Theoretical background

Invoking the fundamentals of rational choice and game theory, we conceptualize code sharing in academia as a public good game [39–41]. Collectively, open code is desirable as it fosters the expansion and consolidation of common knowledge by increasing research efficiency and facilitating peer control. Individually, however, researchers are subject to academia’s “publish-or-perish” precept [42, 43], which as yet gives little to no credit for subsidiary research output, such as analysis code, in the pursuit of tenure and reputation [44, 45]. Rational researchers who aim to maximize personal benefits while minimizing costs (i.e., the investment of time and resources and the risk of repercussions from peer control) will refrain from contributing to the public good, running headfirst into a social dilemma [39, 46, 47]. As a result, the collective good remains underprovided, leaving all researchers worse off.

We build upon this understanding of code sharing as a social dilemma and examine whether nudges can increase cooperation in the academic public good game [48–50]. Nudges are deliberate yet subtle modifications to an individual’s “choice architecture”, i.e., the environment in which they make a decision [51]. They are low-cost, easy-to-miss micro-interventions that do not restrict the nudgee’s freedom of choice but may steer them towards specific decisions. While most prominently discussed as a method to nudge individuals towards acting in their own self-interest (e.g., by making healthier food choices), nudges have also been discussed as a means to foster prosocial behavior in social dilemmas, such as vaccine uptake [52] and pro-environmental behavior [53, 54].

We focus on three types of nudges that may reasonably affect researchers’ behavior: the framing of our code request, salience nudges towards its implied appeal or benefit, and the friction associated with compliance. The concept of framing assumes that even seemingly trivial changes in the formulation of choice problems can significantly alter people’s preferences and behavior [55]. For public good games, evidence suggests that participants are generally more likely to cooperate if a choice situation is framed positively [56–58]. We expect that this also applies to code sharing and formulate a framing hypothesis.

  1. H1: Authors are more likely to share code if the request is framed positively (i.e., to advance science) compared to a negative framing (i.e., in response to the replication crisis).

Salience nudges are commonly employed to steer attention to factually correct aspects of a decision that are potentially relevant yet easily overlooked [59]. Against the backdrop of rational choice theory, we expect that the salience of specific benefits may increase the likelihood of code sharing among researchers. As code sharing has rarely been investigated as a distinct phenomenon, we draw on survey research on attitudes towards data sharing to identify three potential appeals: compliance with common norms [43, 60], altruistic preferences [43, 61], and actual career benefits via rightful acknowledgment [43, 62–64]. We propose that, compared to the untreated control group, researchers should be more likely to share code if our request highlights…

  1. H2.1: …that code sharing complies with institutional-level open science norms, such as the FAIR Guiding Principles [65].
  2. H2.2: …the importance of authors’ participation for the scientific community (i.e., academic altruism).
  3. H2.3: …positive effects of code sharing on authors’ own careers, e.g., through future citation.

Considering the opposite end of researchers’ cost-benefit analysis, we argue that short-term costs associated with code sharing (i.e., locating, commenting, and preparing code) are crucial for researchers’ decisions. Indeed, previous research on data sharing has repeatedly pointed towards the investment of time and effort as major disincentives [66, 67]. Assuming rational researchers strive to minimize costs, we posit an effort hypothesis.

  1. H3: Researchers are more likely to share code if we point out that code cleaning is not required (compared to the untreated control group).

The present study

The article at hand presents the results of a large-scale, fully randomized field experiment. We aim to investigate which facets of a code request influence researchers’ code sharing behavior and thereby extend previous research in three fundamental ways. First, whereas prior studies mainly provided correlational evidence, our experimental design enables us to identify causal predictors of code sharing among peers. Second, we are among the first to acknowledge code sharing as a distinct phenomenon, separate from data sharing. This differentiation is crucial as code sharing involves a very different set of practical, ethical, and legal considerations (e.g. file size, privacy issues, copyright). Third, our study descriptively sheds light on the availability of research code in the social sciences. Despite a considerable body of research from psychology and economics, little is known about code sharing in neighboring disciplines such as sociology and political science.

All hypotheses and the research design of this study have been preregistered (https://osf.io/bqjcz). Deviations from the preregistration occurred in four very minor instances (e.g. adapting the language of correspondence based on authors’ reply). All deviations are listed, described and justified in Table B in S1 File.

Materials and methods

We sent code requests to 1,206 authors who published research articles based on data from the European Social Survey (ESS) between 2015 and 2020. The ESS is a high-quality, biennial cross-national survey commonly used in the social sciences. Restricting our sample to ESS analyses offers three advantages. First, ESS data is publicly available, meaning researchers cannot invoke privacy concerns as a justification for not sharing their analysis code. Second, ESS users stem from various social science domains, strengthening the external validity of our results. Third, the ESS administrative team curates a list of research articles that rely on ESS data, offering a comprehensive and clear-cut target population for our study.

Sampling procedure

The complete ESS bibliographic database obtained from the ESS administrative team contained 5,429 entries. We restricted our sample to journal articles published between 2015 and 2020 to ensure that original authors could still be expected to have the code for their analyses available. We excluded working papers and theses, which are predominantly written by students and early career researchers, as well as book sections. Focusing on journal articles also helps to separate code sharing from research quality, as journal articles have passed at least some quality control during peer review. Furthermore, we excluded duplicate entries from the database and articles for which full texts were not available. We assumed the corresponding author to be the most natural point of reference for our code request. If this information was not included, we used the available contact details of any author, in order of appearance. If an author in our contact database had published multiple eligible articles, we randomly chose one publication to minimize individual burden and avoid reactivity, as researchers might otherwise have discovered that they were taking part in an experiment. Implementing all sample restrictions left us with 1,206 cases. Fig 2 illustrates the sample selection process in detail.
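
To make the per-author selection step concrete, the following minimal sketch illustrates how one eligible article can be drawn at random for each contacted author; the column names, example data, and random seed are hypothetical and not taken from our actual pipeline.

```python
import pandas as pd

# Hypothetical bibliographic extract: one row per eligible article,
# keyed by the contact author's email address.
articles = pd.DataFrame({
    "contact_email": ["a@uni.example", "a@uni.example", "b@uni.example", "c@uni.example"],
    "article_id": ["A1", "A2", "B1", "C1"],
    "year": [2016, 2019, 2018, 2020],
})

# Randomly keep exactly one article per author to minimize individual
# burden and reduce the risk that authors notice the experiment.
one_per_author = (
    articles
    .sample(frac=1, random_state=42)          # shuffle rows
    .drop_duplicates(subset="contact_email")  # keep one article per author
    .sort_values("contact_email")
    .reset_index(drop=True)
)
print(one_per_author)
```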

Fig 2. Sampling procedure starting from the ESS bibliographic database.

Note: Sankey-plot of the sampling procedure. Conceptual as well as ex ante restrictions are described in the preregistration (N = 1,206). Post hoc restrictions were deemed necessary during the field phase to reduce overcoverage and correct for sample-neutral failures (N = 1,028). This graph has been created using a modified version of the Stata ado sankey [68].

https://doi.org/10.1371/journal.pone.0289380.g002

Given the increasing popularity of online repositories, we expected some authors to have made their code publicly available prior to our request. To check this, we automatically screened all articles in our final sample for hyperlinks to scientific repositories. We found references to online material in 39 cases (3.2%). Upon manual inspection, some of these readily available replication packages seemed fragmentary or insufficiently documented, which led us to contact the authors nonetheless.
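
As an illustration of such an automated screen, the sketch below searches plain-text full texts for links to common repositories; the domain list, directory layout, and file format are assumptions for the example rather than a description of our actual tooling.

```python
import re
from pathlib import Path

# Domains of common scientific repositories (illustrative, not exhaustive).
REPO_PATTERN = re.compile(
    r"https?://(?:www\.)?(?:osf\.io|github\.com|zenodo\.org|dataverse\.[\w.]+)\S*",
    re.IGNORECASE,
)

def find_repo_links(fulltext_dir):
    """Return repository URLs found in each plain-text full text."""
    hits = {}
    for path in Path(fulltext_dir).glob("*.txt"):
        text = path.read_text(errors="ignore")
        links = REPO_PATTERN.findall(text)
        if links:
            hits[path.stem] = links
    return hits

# Articles flagged here would still be inspected manually, since a linked
# repository may be fragmentary or insufficiently documented.
# print(find_repo_links("fulltexts/"))
```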

Field phase

On 6 and 7 July 2022, we reached out to all 1,206 authors in our final sample via email. Requests were sent from an institutional email address affiliated with the Department of Sociology at LMU Munich and were signed by D.K. In total, 221 emails (18.3%) bounced due to invalid email addresses. In such cases, we manually searched for alternative email addresses online and contacted the authors the next day. If no such address could be obtained or if the second email, too, resulted in a bounce, we ultimately deemed that request unsuccessful (n = 95). To increase response, we sent up to three reminders at two-week intervals, the first and second of which reiterated our experimental treatments. Whenever any of our emails triggered an out-of-office notification, we postponed all further dispatches until the author’s return. If researchers referred us to one of their coauthors, all subsequent correspondence was redirected accordingly. The field phase ended on 17 January 2023. Approval to conduct this study was granted by the Institutional Review Board of the Faculty of Social Sciences at LMU Munich (GZ 22–03). Due to the study’s field experimental setting, informed consent could not be obtained prior to individual study participation [69]. Following the recommendations of the ethics board, all authors who had responded to any of our emails were debriefed on 17 January 2023 regarding the experimental nature of our request unless they had explicitly objected to receiving any further emails. Fig A in S1 File provides a breakdown of the daily inflow and outflow of emails between 6 July 2022 and 17 January 2023.

Experimental design

To investigate the determinants of researchers’ code sharing behavior, we experimentally varied three aspects of our request’s wording.

  • Framing of the project (2 levels): The positively worded version of our email stated that researchers’ cooperation would enhance the quality, relevance, and success of social science research. The negative version framed our project in light of the replication crisis and its ramifications for the credibility of social science research.
  • Appeal of code sharing (4 levels): The baseline version of our request did not include any specific appeal on why researchers should share their code. Other versions stated that by sharing code, researchers would either i) honor the FAIR Guiding Principles [65], ii) commit an act of academic altruism by helping other researchers, or iii) increase their chances of being cited as part of our replication attempt.
  • Perceived effort (2 levels): The reduced effort version of our request emphasized that authors were not expected to clean their code before sharing. No such statement was included in the baseline request.

This 2x4x2 factorial design resulted in 16 randomly assigned treatment conditions (n ≈ 75 each). Wherever possible, we kept the wording of treatments at a comparable length to avoid confounding (e.g. by lengthy requests lowering researchers’ willingness to share). We did not vary the order of the three treatment dimensions to ensure readability and coherence. Besides the varied treatment dimensions, the request provided general background information on our research project and stated our interest in replicating published findings. Fig B in S1 File provides an exemplary email; all email templates are available in the preregistration form (https://osf.io/bqjcz).
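
For illustration, a 2x4x2 design like this can be implemented by enumerating the 16 treatment combinations and assigning them to authors at random in roughly equal blocks; the labels and seed below are hypothetical and only sketch the logic of the randomization.

```python
import itertools
import random

FRAMING = ["positive", "negative"]
APPEAL = ["none", "fair_principles", "altruism", "citation"]
EFFORT = ["baseline", "no_cleaning_required"]

# All 16 unique treatment combinations of the 2x4x2 design.
CONDITIONS = list(itertools.product(FRAMING, APPEAL, EFFORT))

def assign_conditions(author_ids, seed=2022):
    """Randomly assign authors to the 16 conditions in roughly equal shares."""
    rng = random.Random(seed)
    # Repeat the block of 16 conditions until it covers all authors,
    # then shuffle to break any ordering in the author list.
    pool = (CONDITIONS * (len(author_ids) // len(CONDITIONS) + 1))[: len(author_ids)]
    rng.shuffle(pool)
    return dict(zip(author_ids, pool))

# With 1,206 authors, each of the 16 conditions receives roughly 75 cases.
assignment = assign_conditions([f"author_{i}" for i in range(1206)])
```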

The central outcome variable was dummy coded based on the correspondence with the authors (0: Code not shared; 1: Code shared via email/link to online repository). All analyses are based on the authors’ sharing behavior upon completion of the field experiment in January 2023. Researchers who promised to share their code but had not done so by our deadline were coded as non-compliant (n = 28). This seems reasonable, given that potential replicators should not have to wait more than six months to receive code from the original authors. As the research design required repeated correspondence, it was inevitable that we were able to identify individual participants during data collection and analysis. To safeguard participants’ privacy, all publicly available data have been fully anonymized.

We employed two-sample z-tests of proportions to test our experimental treatments. Due to the experimental design and the random allocation of treatments, this approach yields unbiased estimates of our treatment effects. Assuming a moderately large effect, i.e., a treatment that increases code sharing rates by 15 percentage points (d = 0.3), the statistical power for both the two-level and the four-level treatments is more than adequate (96.25% and 99.93%, respectively; see Text A in S1 File for a detailed discussion of statistical power). As all our preregistered hypotheses are directional, we used one-tailed tests to evaluate the z-scores. All calculations were performed using Stata version 17.0.
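
The tests were run in Stata; the sketch below shows comparable calculations in Python, namely a two-sample z-test of proportions with a one-tailed p-value and an approximate power calculation via Cohen's h. The counts and proportions are placeholders, not the study's data.

```python
from math import asin, sqrt
from scipy.stats import norm

def prop_ztest(x1, n1, x2, n2):
    """Two-sample z-test of proportions with a pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, norm.sf(z)  # one-tailed p-value for H1: p1 > p2

def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate one-tailed power using Cohen's h and the normal approximation."""
    h = abs(2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2)))  # Cohen's h
    return norm.cdf(h * sqrt(n_per_group / 2) - norm.ppf(1 - alpha))

# Placeholder example: 45% vs. 30% sharing in two arms of 500 authors each.
z, p_one_tailed = prop_ztest(225, 500, 150, 500)
power = power_two_proportions(0.45, 0.30, n_per_group=500)
```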

Results

Descriptive

Overall, 658 (59.2%) out of 1,111 successfully delivered emails (final sample net of bounced emails) triggered a response to our request. Some authors claimed that their article fell outside our sampling frame because their analyses did not substantially rely on ESS data. If this proved true upon manual inspection, cases were coded as ineligible due to overcoverage and excluded from our analyses (n = 66). If an article did report results based on ESS data despite an author’s claim to the contrary, we followed up and affirmed our interest in the research code. In certain instances, we detected author duplicates after the initial emails had already been sent (e.g. if an email was forwarded to an author already included in our sample). To minimize the workload for each author and reduce reactivity, we randomly selected one article from these duplicates and requested replication material for this selected article in subsequent emails (n = 16). In one case, we inadvertently contacted a namesake. Applying all post hoc restrictions left us with a refined sample of 1,028 cases. Among these eligible cases, we obtained a response from 56.7% of the authors (n = 583).

To our surprise, a sizable share of researchers indicated that they were unfamiliar with the concept of research code. This finding may, on the one hand, be attributable to terminological differences across disciplines and software packages (“code”, “syntax”, “script”, etc.). It might, on the other hand, reflect the persistence of point-and-click solutions, which do not require writing reproducible code in the first place. If authors indicated confusion about what we meant by research code (n = 22), we sent a follow-up email providing examples and clarification.

Upon completion of our field experiment, we obtained research code for 385 articles. This amounts to an aggregate sharing rate of 37.5% (of 1,028 eligible cases), which is largely in line with findings from previous research (see Fig 1). While 43.3% of researchers (n = 445) ignored our repeated attempts, 12.9% of authors (n = 133) stated upfront that they were unable or unwilling to share code. As Fig 3 illustrates, we received most code in response to our initial request (n = 177). Over the course of the three subsequent reminders, the number of shared code packages declined steadily (nR1 = 77, nR2 = 70, nR3 = 61). Among those who shared, the median timespan until we received the code files was 16 days (mean = 27; SD = 29.9).

Fig 3. (Non-)Response to our code request.

Note: Stacked-bar graph of sharing outcomes. Purplish colors denote the proportion of shared code in the refined sample (N = 1,028). Greenish colors denote the proportion of requests where no code was shared.

https://doi.org/10.1371/journal.pone.0289380.g003

Even on the surface, the code packages we received differ widely. An obvious, though shallow, indicator of this is the number of files per replication package. While most packages consisted of up to five files, 10.1% included 50 or more files (see Fig 4, Panel A). In two extreme cases, packages contained more than 10,000 files, suggesting that some authors interpreted our request more broadly and shared not only research code but also automatically generated output files (e.g. from simulation studies). Beyond such differences in interpretation, the observed variance in file counts may reflect genuine differences in coding practices (e.g. bundling all code in one file vs. breaking it into task-specific files).

Fig 4. Content of replication packages shared upon request.

Note: Descriptives for shared replication code. Panel A shows the number of files per replication package. Panel B shows the proportion of articles using each statistical software package.

https://doi.org/10.1371/journal.pone.0289380.g004

To assess software preferences, we automatically extracted the file extensions (e.g. “.dta”, “.txt”) of all shared files and created binary indicator variables for those unequivocally associated with certain statistical software (e.g. “.do” for Stata, “.r”, “.sps”, etc.). Analyzing these indicators reveals that the majority (58.4%) of sharing authors relies on Stata for statistical analyses, making it by far the most popular software in our sample (see Fig 4, Panel B). Furthermore, 46 studies (11.9%) appear to use more than just one software, which is surprising given that obtaining proficiency in multiple coding languages requires a greater time investment from researchers. Although we can only speculate about the reasons for this pattern, two plausible explanations come to mind. Either researchers deliberately combine different software to exploit the strong points of every single program, or the mix of statistical software stems from collaborations of differently trained researchers.
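
A minimal sketch of the extension-based classification described above; the mapping from file extensions to software is illustrative and, as in the study, only covers extensions that point unambiguously to one package.

```python
from collections import Counter
from pathlib import Path

# Extensions unambiguously associated with a statistical package (illustrative).
EXTENSION_TO_SOFTWARE = {
    ".do": "Stata", ".ado": "Stata",
    ".r": "R", ".rmd": "R",
    ".sps": "SPSS",
    ".sas": "SAS",
    ".py": "Python", ".ipynb": "Python",
    ".m": "MATLAB",
}

def detect_software(package_dir):
    """Return the set of statistical packages indicated by file extensions."""
    found = set()
    for path in Path(package_dir).rglob("*"):
        software = EXTENSION_TO_SOFTWARE.get(path.suffix.lower())
        if software:
            found.add(software)
    return found

def tally_software(package_dirs):
    """Count how many replication packages contain code for each package."""
    counts = Counter()
    for directory in package_dirs:
        counts.update(detect_software(directory))
    return counts

# Example: tally_software(["packages/article_001", "packages/article_002"])
```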

Authors’ replies to our code request can also be leveraged to gain insight into obstacles commonly associated with sharing one’s analysis code. To obtain information on such hurdles, our third reminder explicitly asked non-compliant researchers to clarify why they could not share their research code with us. While some justifications were generic (“this research was done a long time ago”), others pointed towards concrete and systemic problems: 23 researchers admitted not having created any permanent code in the first place; 20 authors had already lost access to their research code, mostly due to changes in affiliation; 16 researchers reported hardware problems, ranging from corrupt hard drives to large-scale hacker attacks; 15 authors confessed not being able to locate their code; and 12 individuals reported a lack of time. Although this snapshot is selective, it highlights that many obstacles to code sharing could be overcome by embracing a more institutional, centralized approach to archiving research code. Importantly, only a tiny minority of researchers (n = 4) reported a principled objection to sharing research code. In most cases, authors’ responses were remarkably favorable, if at times helpless.

Main results

Fig 5 depicts code sharing rates across the main experimental treatments. Contrary to our hypothesis, we find a lower proportion of shared code among researchers who received the positively framed request. Invoking the replication crisis instead of emphasizing the quality and success of social science research appears to increase authors’ inclination to share research code (z = 2.287, p = 0.989). Referencing the FAIR data principles (z = -0.269, p = 0.394), calling on researchers’ altruism (z = 1.180, p = 0.881), and mentioning the prospect of future citation (z = -0.077, p = 0.470) do not impact code sharing behavior compared to the neutral baseline condition. Similarly, no differences arise when authors are exempted from code cleaning compared to the neutral control condition without such a statement (z = -0.064, p = 0.474). Full results for all confirmatory tests are reported in Table C in S1 File.

Fig 5. Code sharing rate by main experimental treatments.

Note: Bar chart for code sharing rates across main experimental treatment effects. Brackets indicate differences and corresponding p-values across levels.

https://doi.org/10.1371/journal.pone.0289380.g005

The seemingly positive effect of the replication-crisis framing on code sharing behavior is also visible in Fig 6. As indicated by the first and second rows of the figure’s bottom panel, there is a clear separation between conditions with high code returns (negative framing) and conditions with low code returns (positive framing). No such pattern emerges for the other experimental treatments. Comparing code sharing rates across all 16 unique treatment combinations reveals considerable variation in authors’ behavior. As Fig 6 shows, code sharing rates ranged from about 20% to 50% across unique treatment conditions (for a depiction of response rates across treatment conditions, see Fig C in S1 File). For instance, positively framed requests that appealed to authors’ altruism and mentioned that code cleaning was not required prompted only 21.9% of authors to share their code with us (compared to 52.4% positive replies among those who received a negatively framed request that highlighted the FAIR principles and did not comment on code cleaning requirements).

Fig 6. Code sharing rates across all 16 unique experimental conditions.

Note: Code sharing rates for all 16 distinct treatment conditions with 95% confidence intervals. The treatment conditions are sorted in ascending order. Black/gray squares represent active/passive treatment conditions, respectively. Results from a linear probability model with all two-way interactions are reported in Table D in S1 File. This graph has been created using the Stata ado mfcurve [70].

https://doi.org/10.1371/journal.pone.0289380.g006

Discussion

This study provides a large-scale assessment of researchers’ code sharing behavior upon request. With an overall sharing rate of 37.5%, our descriptive results align with previous research and demonstrate that code sharing is not common even among researchers who use publicly available observational data. Contrary to our preregistered hypotheses, framing our request positively did not increase code sharing among researchers. Conversely, we find higher code returns among researchers who received the negatively framed request alluding to the ramifications of the replication crisis. Originating from psychology, the replication crisis has sparked discussion on good research practice across disciplines, emphasizing the need for open science [71, 72]. As such, it might have become a buzzword triggering researchers to comply with our request. Neither nudges towards potential appeals nor a reduction of the perceived effort had an effect on authors’ likelihood of sharing their code. While these factors may influence researchers’ attitudes towards sharing their material [43, 66, 67], actual behavior seems to hinge on the prevailing incentive structures in academia. Considering the persistent lack of acknowledgment for the publication of subsidiary research output and the resulting unfavorable cost-benefit ratio, simple nudges are clearly not sufficient to influence researchers’ code sharing behavior. Our estimates might represent a somewhat lower bound of the true nudging effect, though, as some participants could have been impervious to nudging per se. This applies, for instance, to researchers engaging in questionable research practices who would have ignored our request regardless of its wording. We expect the number of duplicitous researchers in our sample to be low but have no way of testing this assumption. Importantly, in some cases the provision of code turned out to be impeded by the simple fact that such code was never written in the first place. This may be due either to the use of graphical user interfaces (GUI) in statistical analysis tools or to a lack of knowledge about data and code management. Unlike ignorance towards best practices of data and code management, the use of GUI solutions is not inherently problematic as long as sufficient additional information is provided to maintain analytic transparency.

Building on prior research findings, the main advantage of our study lies in its experimental design, which enables us to identify the causal effects of specific nudges on researchers’ actual code sharing behavior. Choosing a large-N, multidisciplinary sample of journal articles using ESS data bolstered the external validity of our experiment. As ESS data is publicly available, it also enabled us to investigate code sharing as a distinct phenomenon. Nonetheless, our approach has its limitations. As our data rely entirely on our email correspondence with authors, we were only privy to information that authors chose to share with us. Thus, some of our requests might not have been actively ignored but rather have fallen victim to spam filters. Reassuringly, this should affect all treatment conditions equally and therefore should not bias our experimental results. It may, however, lead to a slight underestimation of our descriptive sharing rate. Given the high response rate to our request (56.7% among eligible researchers), we remain confident, though, that such distortion is small. As the ESS bibliographic database does not provide information beyond the individual bibliographic entry, we furthermore cannot assess the sample distribution of properties such as authors’ career level or regional affiliation, both of which have been linked to attitudes and practices regarding open science [64, 73–75]. Therefore, we cannot rule out treatment heterogeneity in our experimental results (e.g., more experienced researchers being more strongly affected by the effort treatment). Authors who received multiple requests from our project (e.g. due to forwarding from other researchers) pose another potential threat to the validity of our conclusions, as these individuals might have been affected by reactivity. Excluding such cases from our analysis does, however, leave results unchanged (see Table E in S1 File).

Our findings provide fertile ground for further research on code sharing in academia. Most likely, authors’ readiness to share code depends on several contextual factors, such as career status and affiliation, as well as journal properties, such as impact factor and policy enforcement. While such context variables are interesting to consider, none of them is exogenous and, as such, they fall outside the scope of our experimental research design. Overall, the present study points towards the limited effectiveness of individual micro-interventions and highlights the dire need for institutional solutions regarding code availability. To ensure transparency and bolster scientific credibility, requiring open code as part of any research submission should become the institutional standard [32, 72, 76, 77].

Acknowledgments

We thank Katrin Auspurg, who has been intimately involved in planning and conducting the experiment. Thanks to two anonymous reviewers for their helpful comments and to Richard Vielberg for superb research assistance. Lastly, we are indebted to all researchers who participated in our experiment and went to great lengths to share their research code with us.

References

  1. Merton RK. The Sociology of Science: Theoretical and Empirical Investigations. Chicago: University of Chicago Press; 1973.
  2. Camerer CF, Dreber A, Forsell E, Ho TH, Huber J, Johannesson M, et al. Evaluating Replicability of Laboratory Experiments in Economics. Science. 2016;351(6280):1433–1436. pmid:26940865
  3. Camerer CF, Dreber A, Holzmeister F, Ho TH, Huber J, Johannesson M, et al. Evaluating the Replicability of Social Science Experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour. 2018;2(9):637–644. pmid:31346273
  4. Errington TM, Mathur M, Soderberg CK, Denis A, Perfito N, Iorns E, et al. Investigating the Replicability of Preclinical Cancer Biology. eLife. 2021;10:e71601. pmid:34874005
  5. Open Science Collaboration. Estimating the Reproducibility of Psychological Science. Science. 2015;349(6251):aac4716.
  6. Prinz F, Schlange T, Asadullah K. Believe It or Not: How Much Can We Rely on Published Data on Potential Drug Targets? Nature Reviews Drug Discovery. 2011;10(9):712. pmid:21892149
  7. Barba LA. Terminologies for Reproducible Research. ArXiv [Preprint]. 2018 [cited 2023 June 30]. Available from: https://doi.org/10.48550/arXiv.1802.03311.
  8. Auspurg K, Brüderl J. How to Increase Reproducibility and Credibility of Sociological Research. In: Gërxhani K, de Graaf N, Raub W, editors. Handbook of Sociological Science. Cheltenham: Edward Elgar Publishing; 2022. p. 512–527.
  9. Silberzahn R, Uhlmann EL. Crowdsourced Research: Many Hands Make Tight Work. Nature. 2015;526(7572):189–191. pmid:26450041
  10. Silberzahn R, Uhlmann EL, Martin DP, Anselmi P, Aust F, Awtrey E, et al. Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results. Advances in Methods and Practices in Psychological Science. 2018;1(3):337–356.
  11. Breznau N, Rinke EM, Wuttke A, Nguyen HHV, Adem M, Adriaans J, et al. Observing Many Researchers Using the Same Data and Hypothesis Reveals a Hidden Universe of Uncertainty. Proceedings of the National Academy of Sciences of the United States of America. 2022;119(44):e2203150119. pmid:36306328
  12. Auspurg K, Brüderl J. Has the Credibility of the Social Sciences Been Credibly Destroyed? Reanalyzing the “Many Analysts, One Data Set” Project. Socius: Sociological Research for a Dynamic World. 2021;7:1–14.
  13. Auspurg K, Brüderl J. Is Social Research Really Not Better Than Alchemy? How Many-Analysts Studies Produce “A Hidden Universe of Uncertainty” by Not Following Meta-Analytical Standards. MetaArXiv [Preprint]. 2023 [posted 2023 June 5; revised 2023 June 8; cited 2023 June 30]. Available from: https://doi.org/10.31222/osf.io/uc84k.
  14. Kohler U, Class F, Sawert T. Control Variable Selection in Applied Quantitative Sociology: A Critical Review. European Sociological Review. 2023; jcac078.
  15. Easterbrook SM. Open Code for Open Science? Nature Geoscience. 2014;7(11):779–781.
  16. Quigley TJ, Hill AD, Blake A, Petrenko O. Improving Our Field Through Code and Data Sharing. Journal of Management. 2023;49(3):875–880.
  17. Woelfle M, Olliaro P, Todd MH. Open Science Is a Research Accelerator. Nature Chemistry. 2011;3(10):745–748. pmid:21941234
  18. Escamilla E, Klein M, Cooper T, Rampin V, Weigle MC, Nelson ML. The Rise of GitHub in Scholarly Publications. ArXiv [Preprint]. 2022 [cited 2023 June 30]. Available from: https://doi.org/10.48550/arXiv.2208.04895.
  19. Wolins L. Responsibility for Raw Data. American Psychologist. 1962;17(9):657–658.
  20. Vines TH, Albert AYK, Andrew RL, Débarre F, Bock DG, Franklin MT, et al. The Availability of Research Data Declines Rapidly with Article Age. Current Biology. 2014;24(1):94–97. pmid:24361065
  21. Krawczyk M, Reuben E. (Un)Available upon Request: Field Experiment on Researchers’ Willingness to Share Supplementary Materials. Accountability in Research. 2012;19(3):175–186. pmid:22686633
  22. Wicherts JM, Bakker M, Molenaar D. Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results. PLOS ONE. 2011;6(11):e26828. pmid:22073203
  23. Vilhuber L. Report by the AEA Data Editor. AEA Papers and Proceedings. 2019;109:718–729.
  24. Kyzas PA, Loizou KT, Ioannidis JPA. Selective Reporting Biases in Cancer Prognostic Factor Studies. JNCI: Journal of the National Cancer Institute. 2005;97(14):1043–1055. pmid:16030302
  25. Vines TH, Andrew RL, Bock DG, Franklin MT, Gilbert KJ, Kane NC, et al. Mandated Data Archiving Greatly Improves Access to Research Data. The FASEB Journal. 2013;27(4):1304–1308. pmid:23288929
  26. Stockemer D, Koehler S, Lentz T. Data Access, Transparency, and Replication: New Insights from the Political Behavior Literature. PS: Political Science & Politics. 2018;51(4):799–803.
  27. McCullough BD, Vinod HD. Verifying the Solution from a Nonlinear Solver: A Case Study. American Economic Review. 2003;93(3):873–892.
  28. Reid LN, Rotfeld HJ, Wimmer RD. How Researchers Respond to Replication Requests. Journal of Consumer Research. 1982;9(2):216–218.
  29. Tedersoo L, Küngas R, Oras E, Köster K, Eenmaa H, Leijen Ä, et al. Data Sharing Practices and Data Availability upon Request Differ across Scientific Disciplines. Scientific Data. 2021;8(1):192. pmid:34315906
  30. Craig JR, Reese SC. Retention of Raw Data: A Problem Revisited. American Psychologist. 1973;28(8):723.
  31. Vanpaemel W, Vermorgen M, Deriemaecker L, Storms G. Are We Wasting a Good Crisis? The Availability of Psychological Research Data after the Storm. Collabra. 2015;1(1):3.
  32. Stodden V, Seiler J, Ma Z. An Empirical Analysis of Journal Policy Effectiveness for Computational Reproducibility. Proceedings of the National Academy of Sciences. 2018;115(11):2584–2589. pmid:29531050
  33. Dewald WG, Thursby JG, Anderson RG. Replication in Empirical Economics: The Journal of Money, Credit and Banking Project. The American Economic Review. 1986;76(4):587–603.
  34. Collberg C, Proebsting TA. Repeatability in Computer Systems Research. Communications of the ACM. 2016;59(3):62–69.
  35. Wicherts JM, Borsboom D, Kats J, Molenaar D. The Poor Availability of Psychological Research Data for Reanalysis. American Psychologist. 2006;61(7):726–728. pmid:17032082
  36. Hardwicke TE, Ioannidis JPA. Populating the Data Ark: An Attempt to Retrieve, Preserve, and Liberate Data from the Most Highly-Cited Psychology and Psychiatry Articles. PLOS ONE. 2018;13(8):e0201856. pmid:30071110
  37. Savage CJ, Vickers AJ. Empirical Study of Data Sharing by Authors Publishing in PLoS Journals. PLOS ONE. 2009;4(9):e7078. pmid:19763261
  38. Leberg PL, Neigel JE. Enhancing The Retrievability Of Population Genetic Survey Data? An Assessment Of Animal Mitochondrial DNA Studies. Evolution. 1999;53(6):1961–1965. pmid:28565455
  39. Linek SB, Fecher B, Friesike S, Hebing M. Data Sharing as Social Dilemma: Influence of the Researcher’s Personality. PLOS ONE. 2017;12(8):e0183216. pmid:28817642
  40. Auspurg K, Hinz T. What Fuels Publication Bias?: Theoretical and Empirical Analyses of Risk Factors Using the Caliper Test. Jahrbücher für Nationalökonomie und Statistik. 2011;231(5-6):636–660.
  41. Auspurg K, Hinz T. Social Dilemmas in Science: Detecting Misconduct and Finding Institutional Solutions. In: Jann B, Przepiorka W, editors. Social Dilemmas, Institutions, and the Evolution of Cooperation. Berlin/Boston: De Gruyter; 2017. p. 189–214.
  42. van Dalen HP. How the Publish-or-Perish Principle Divides a Science: The Case of Economists. Scientometrics. 2021;126(2):1675–1694.
  43. Kim Y, Stanton JM. Institutional and Individual Factors Affecting Scientists’ Data-Sharing Behaviors: A Multilevel Analysis. Journal of the Association for Information Science and Technology. 2016;67(4):776–799.
  44. Acord SK, Harley D. Credit, Time, and Personality: The Human Challenges to Sharing Scholarly Work Using Web 2.0. New Media & Society. 2013;15(3):379–397.
  45. Borgman CL. The Conundrum of Sharing Research Data. Journal of the American Society for Information Science and Technology. 2012;63(6):1059–1078.
  46. Dawes RM. Social Dilemmas. Annual Review of Psychology. 1980;31(1):169–193.
  47. Kraft-Todd GT, Rand DG. Practice What You Preach: Credibility-enhancing Displays and the Growth of Open Science. Organizational Behavior and Human Decision Processes. 2021;164:1–10.
  48. Barron K, Nurminen T. Nudging Cooperation in Public Goods Provision. Journal of Behavioral and Experimental Economics. 2020;88:101542.
  49. Korn L, Betsch C, Böhm R, Meier NW. Social Nudging: The Effect of Social Feedback Interventions on Vaccine Uptake. Health Psychology. 2018;37(11):1045–1054. pmid:30221969
  50. Nagatsu M. Social Nudges: Their Mechanisms and Justification. Review of Philosophy and Psychology. 2015;6(3):481–494.
  51. Thaler RH, Sunstein CR. Nudge: Improving Decisions about Health, Wealth, and Happiness. New Haven: Yale University Press; 2008.
  52. Reñosa MDC, Landicho J, Wachinger J, Dalglish SL, Bärnighausen K, Bärnighausen T, et al. Nudging toward Vaccination: A Systematic Review. BMJ Global Health. 2021;6(9):e006237. pmid:34593513
  53. Costa DL, Kahn ME. Energy Conservation “Nudges” and Environmentalist Ideology: Evidence from a Randomized Residential Electricity Field Experiment. Journal of the European Economic Association. 2013;11(3):680–702.
  54. Kallbekken S, Sælen H. ‘Nudging’ Hotel Guests to Reduce Food Waste as a Win–Win Environmental Measure. Economics Letters. 2013;119(3):325–327.
  55. Tversky A, Kahneman D. The Framing of Decisions and the Psychology of Choice. Science. 1981;211(4481):453–458. pmid:7455683
  56. Andreoni J. Warm-Glow versus Cold-Prickle: The Effects of Positive and Negative Framing on Cooperation in Experiments. The Quarterly Journal of Economics. 1995;110(1):1–21.
  57. Fujimoto H, Park ES. Framing Effects and Gender Differences in Voluntary Public Goods Provision Experiments. The Journal of Socio-Economics. 2010;39(4):455–457.
  58. Park ES. Warm-Glow versus Cold-Prickle: A Further Experimental Study of Framing Effects on Free-Riding. Journal of Economic Behavior & Organization. 2000;43(4):405–421.
  59. Noggle R. Manipulation, Salience, and Nudges. Bioethics. 2018;32(3):164–170. pmid:29283190
  60. Kim Y, Nah S. Internet Researchers’ Data Sharing Behaviors: An Integration of Data Reuse Experience, Attitudinal Beliefs, Social Norms, and Resource Factors. Online Information Review. 2018;42(1):124–142.
  61. Kim J. Motivating and Impeding Factors Affecting Faculty Contribution to Institutional Repositories. Journal of Digital Information. 2007;8(2).
  62. Cheah PY, Tangseefa D, Somsaman A, Chunsuttiwat T, Nosten F, Day NPJ, et al. Perceived Benefits, Harms, and Views About How to Share Data Responsibly: A Qualitative Study of Experiences With and Attitudes Toward Data Sharing Among Research Staff and Community Representatives in Thailand. Journal of Empirical Research on Human Research Ethics. 2015;10(3):278–289. pmid:26297749
  63. Christensen G, Dafoe A, Miguel E, Moore DA, Rose AK. A Study of the Impact of Data Sharing on Article Citations Using Journal Policies as a Natural Experiment. PLOS ONE. 2019;14(12):e0225883. pmid:31851689
  64. Tenopir C, Christian L, Allard S, Borycz J. Research Data Sharing: Practices and Attitudes of Geophysicists. Earth and Space Science. 2018;5(12):891–902.
  65. Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, et al. The FAIR Guiding Principles for Scientific Data Management and Stewardship. Scientific Data. 2016;3(1):160018. pmid:26978244
  66. Enke N, Thessen A, Bach K, Bendix J, Seeger B, Gemeinholzer B. The User’s View on Biodiversity Data Sharing—Investigating Facts of Acceptance and Requirements to Realize a Sustainable Use of Research Data. Ecological Informatics. 2012;11:25–33.
  67. Fecher B, Friesike S, Hebing M. What Drives Academic Data Sharing? PLOS ONE. 2015;10(2):e0118053. pmid:25714752
  68. Naqvi A. SANKEY: Stata Module for Sankey Diagrams (Version 1.2); 2022. Statistical Software Components S459154. Boston College Department of Economics.
  69. Baldassarri D, Abascal M. Field Experiments Across the Social Sciences. Annual Review of Sociology. 2017;43(1):41–73.
  70. Krähmer D. MFCURVE: Stata Module for Plotting Results From Multifactorial Research Designs (Version 1.0); 2023. Statistical Software Components S459224. Boston College Department of Economics.
  71. Baker M. 1,500 Scientists Lift the Lid on Reproducibility. Nature. 2016;533(7604):452–454. pmid:27225100
  72. Nosek BA, Alter G, Banks GC, Borsboom D, Bowman SD, Breckler SJ, et al. Promoting an Open Research Culture. Science. 2015;348(6242):1422–1425. pmid:26113702
  73. Abele-Brehm AE, Gollwitzer M, Steinberg U, Schönbrodt FD. Attitudes Toward Open Science and Public Data Sharing: A Survey Among Members of the German Psychological Society. Social Psychology. 2019;50(4):252–260.
  74. Campbell HA, Micheli-Campbell MA, Udyawer V. Early Career Researchers Embrace Data Sharing. Trends in Ecology & Evolution. 2019;34(2):95–98. pmid:30573193
  75. Tenopir C, Dalton ED, Allard S, Frame M, Pjesivac I, Birch B, et al. Changes in Data Sharing and Data Reuse Practices and Perceptions among Scientists Worldwide. PLOS ONE. 2015;10(8):e0134826. pmid:26308551
  76. Freese J. Replication Standards for Quantitative Social Science: Why Not Sociology? Sociological Methods & Research. 2007;36(2):153–172.
  77. Goldacre B, Morton CE, DeVito NJ. Why Researchers Should Share Their Analytic Code. BMJ. 2019; l6365. pmid:31753846