Citation: Schwab S, Janiaud P, Dayan M, Amrhein V, Panczak R, Palagi PM, et al. (2022) Ten simple rules for good research practice. PLoS Comput Biol 18(6): e1010139. https://doi.org/10.1371/journal.pcbi.1010139
Published: June 23, 2022
Copyright: © 2022 Schwab et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: S.S. received funding from SfwF (Stiftung für wissenschaftliche Forschung an der Universität Zürich; grant no. STWF-19-007). The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
The lack of research reproducibility has caused growing concern across various scientific fields [1–5]. Today, there is widespread agreement, within and outside academia, that scientific research is suffering from a reproducibility crisis [6,7]. Researchers reach different conclusions—even when the same data have been processed—simply due to varied analytical procedures [8,9]. As we continue to recognize this problematic situation, some major causes of irreproducible research have been identified. This, in turn, provides the foundation for improvement by identifying and advocating for good research practices (GRPs). Indeed, powerful solutions are available, for example, preregistration of study protocols and statistical analysis plans, sharing of data and analysis code, and adherence to reporting guidelines. Although these and other best practices may facilitate reproducible research and increase trust in science, it remains the responsibility of researchers themselves to actively integrate them into their everyday research practices.
Contrary to ubiquitous specialized training, cross-disciplinary courses focusing on best practices to enhance the quality of research are lacking at universities and are urgently needed. The intersections between disciplines offer a space for peer evaluation, mutual learning, and sharing of best practices. In medical research, interdisciplinary work is inevitable. For example, conducting clinical trials requires experts with diverse backgrounds, including clinical medicine, pharmacology, biostatistics, evidence synthesis, nursing, and implementation science. Bringing researchers with diverse backgrounds and levels of experience together to exchange knowledge and learn about problems and solutions adds value and improves the quality of research.
The present selection of rules was based on our experiences with teaching GRP courses at the University of Zurich, our course participants’ feedback, and the views of a cross-disciplinary group of experts from within the Swiss Reproducibility Network (www.swissrn.org). The list is neither exhaustive, nor does it aim to address and systematically summarize the wide spectrum of issues including research ethics and legal aspects (e.g., related to misconduct, conflicts of interest, and scientific integrity). Instead, we focused on practical advice at the different stages of everyday research: from planning and execution to reporting of research. For a more comprehensive overview of GRPs, we point to the United Kingdom’s Medical Research Council’s guidelines [10] and the Swedish Research Council’s report [11]. While the discussion of the rules may predominantly focus on clinical research, much applies, in principle, to basic biomedical research and research in other domains as well.
The 10 proposed rules can serve multiple purposes: an introduction for researchers to relevant concepts to improve research quality, a primer for early-career researchers who participate in our GRP courses, or a starting point for lecturers who plan a GRP course at their own institutions. The 10 rules are grouped according to planning (5 rules), execution (3 rules), and reporting of research (2 rules); see Fig 1. These principles can (and should) be implemented as a habit in everyday research, just like toothbrushing.
Rule 1: Specify your research question
Coming up with a research question is not always simple and may take time. A successful study requires a narrow and clear research question. In evidence-based research, prior studies are assessed in a systematic and transparent way to identify a research gap for a new study that answers a question that matters [12]. Papers that provide a comprehensive overview of the current state of research in the field, such as systematic reviews, are particularly helpful. Perspective papers may also be useful; for example, there is a paper with the title “SARS-CoV-2 and COVID-19: The most important research questions.” However, a systematic assessment of research gaps deserves more attention than opinion-based publications.
In the next step, a vague research question should be further developed and refined. In clinical research and evidence-based medicine, there is an approach called population, intervention, comparator, outcome, and time frame (PICOT) with a set of criteria that can help frame a research question [13]. From a well-developed research question, subsequent steps will follow, which may include the exact definition of the population, the outcome, the data to be collected, and the sample size that is required. It may be useful to find out if other researchers find the idea interesting as well and whether it might promise a valuable contribution to the field. However, actively involving the public or the patients can be a more effective way to determine what research questions matter.
The level of detail in a research question also depends on whether the planned research is confirmatory or exploratory. In contrast to confirmatory research, exploratory research does not require a well-defined hypothesis from the start. Some examples of exploratory experiments are those based on omics and multi-omics experiments (genomics, bulk RNA-Seq, single-cell, etc.) in systems biology and connectomics and whole-brain analyses in brain imaging. Both exploration and confirmation are needed in science, and it is helpful to understand their strengths and limitations [14,15].
Rule 2: Write and register a study protocol
In clinical research, registration of clinical trials has become a standard since the late 1990s and is now a legal requirement in many countries. Such studies require a study protocol to be registered, for example, with ClinicalTrials.gov, the European Clinical Trials Register, or the World Health Organization’s International Clinical Trials Registry Platform. A similar effort exists for the registration of systematic reviews (PROSPERO). Study registration has also been proposed for observational studies [16] and more recently in preclinical animal research [17] and is now being advocated across disciplines under the term “preregistration” [18,19].
Study protocols typically document at minimum the research question and hypothesis, a description of the population, the targeted sample size, the inclusion/exclusion criteria, the study design, the data collection, the data processing and transformation, and the planned statistical analyses. The registration of study protocols reduces publication bias and hindsight bias and can safeguard honest research and minimize waste of research [20–22]. Registration ensures that studies can be scrutinized by comparing the reported research with what was actually planned and written in the protocol, and any discrepancies may indicate serious problems (e.g., outcome switching).
Note that registration does not mean that researchers have no flexibility to adapt the plan as needed. Indeed, new or more appropriate procedures may become available or known only after registration of a study. Therefore, a more detailed statistical analysis plan can be added to the protocol as an amendment before the data are observed or unblinded [23,24]. Likewise, registration does not exclude the possibility of conducting exploratory data analyses; however, they must be clearly reported as such.
To go even further, registered reports are a novel article type that incentivizes high-quality research irrespective of the ultimate study outcome [25,26]. With registered reports, peer reviewers decide on acceptance before anyone knows the results of the study, and they have a more active role in influencing the design and analysis of the study. Journals from various disciplines increasingly support registered reports .
Naturally, preregistration and registered reports also have their limitations and may not be appropriate in a purely hypothesis-generating (explorative) framework. Reports of exploratory studies should indeed not be molded into a confirmatory framework; appropriate rigorous reporting alternatives have been suggested and are starting to be implemented [28,29].
Rule 3: Justify your sample size
Early-career researchers in our GRP courses often identify sample size as an issue in their research. For example, they say that they work with a low number of samples due to slow growth of cells, or they have a limited number of patient tumor samples due to a rare disease. But if your sample size is too low, your study has a high risk of providing a false negative result (type II error). In other words, you are unlikely to find an effect even if there truly was an effect.
Unfortunately, there is more bad news with small studies. When an effect from a small study was selected for drawing conclusions because it was statistically significant, low power increases the probability that an effect size is overestimated [30,31]. The reason is that with low power, studies that due to sampling variation find larger (overestimated) effects are much more likely to be statistically significant than those that happen to find smaller (more realistic) effects [30,32,33]. Thus, in such situations, effect sizes are often overestimated. For the phenomenon that small studies often report more extreme results (in meta-analyses), the term “small-study effect” was introduced . In any case, an underpowered study is a problematic study, no matter the outcome.
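The overestimation of effects in underpowered studies can be illustrated with a small simulation. The following is a minimal sketch under simplifying assumptions that are ours, not the cited studies': two groups with known standard deviation 1, a z-test on the standardized mean difference, and hypothetical values for the true effect (0.2) and the group size (20). It draws many study estimates and compares the average estimate among statistically significant studies with the true effect.

```python
import random
from statistics import NormalDist, mean


def simulate_significant_estimates(true_effect, n_per_group,
                                   n_studies=20000, alpha=0.05, seed=1):
    """Return (mean estimate among significant studies, share of significant studies)."""
    rng = random.Random(seed)
    # Standard error of a standardized mean difference with known SD = 1.
    se = (2 / n_per_group) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    significant = []
    for _ in range(n_studies):
        # Each simulated study yields one noisy estimate of the true effect.
        estimate = rng.gauss(true_effect, se)
        if abs(estimate / se) > z_crit:
            significant.append(estimate)
    return mean(significant), len(significant) / n_studies


# Hypothetical scenario: small true effect, small groups, hence low power.
mean_sig, empirical_power = simulate_significant_estimates(true_effect=0.2, n_per_group=20)
print(f"empirical power: {empirical_power:.2f}")
print(f"mean estimate among significant studies: {mean_sig:.2f} (true effect: 0.20)")
```

In this setup, only estimates far from zero can cross the significance threshold, so the significant studies are a selected subset whose average lies well above the true effect.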
In conclusion, small sample sizes can undermine research, but when is a study too small? For one study, a total of 50 patients may be fine, but for another, 1,000 patients may be required. How large a study needs to be is determined by an appropriate sample size calculation, which ensures that enough data are collected to achieve sufficient statistical power (the probability of rejecting the null hypothesis when it is in fact false).
Low-powered studies can be avoided by performing a sample size calculation to find out the required sample size of the study. This requires specifying a primary outcome variable and the magnitude of effect you are interested in (among some other factors); in clinical research, this is often the minimal clinically relevant difference. The statistical power is often set at 80% or larger. A comprehensive list of packages for sample size calculation is available , among them the R package “pwr” . There are also many online calculators available, for example, the University of Zurich’s “SampleSizeR” .
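To make the ingredients of such a calculation concrete, here is a minimal sketch for comparing the means of two equally sized groups. It uses the textbook normal approximation rather than any of the tools named above, and the function name and the example effect size (a standardized difference of 0.5) are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample comparison of means.

    effect_size is the standardized difference (difference in means divided by the SD).
    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) / effect_size)^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # quantile for the two-sided significance level
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


print(n_per_group(0.5))              # 63 per group under this approximation
print(n_per_group(0.5, power=0.90))  # higher power requires more participants
```

Dedicated tools that work with the t distribution instead of the normal approximation typically return a slightly larger number, so this sketch is best read as a plausibility check and not as a replacement for a proper calculation.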
A worthwhile alternative for planning the sample size that puts less emphasis on null hypothesis testing is based on the desired precision of the study; for example, one can calculate the sample size that is necessary to obtain a desired width of a confidence interval for the targeted effect [38–40]. A general framework to sample size justification beyond a calculation-only approach has been proposed . It is also worth mentioning that some study types have other requirements or need specific methods. In diagnostic testing, one would need to determine the anticipated minimal sensitivity or specificity; in prognostic research, the number of parameters that can be used to fit a prediction model given a fixed sample size should be specified. Designs can also be so complex that a simulation (Monte Carlo method) may be required.
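The precision-based approach can be sketched in the same spirit. The example below assumes a difference in means between two equally sized groups with a known common standard deviation and uses a normal approximation; the function name and the target half-width are hypothetical choices for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_for_precision(sd, half_width, level=0.95):
    """Sample size per group so that the confidence interval for a difference
    in means has (approximately) the desired half-width.

    Based on half_width = z * sd * sqrt(2 / n), solved for n.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return ceil(2 * (z * sd / half_width) ** 2)


# Hypothetical target: 95% interval of +/- 0.25 SD around the estimated difference.
print(n_per_group_for_precision(sd=1.0, half_width=0.25))  # 123 per group
```

Halving the desired half-width roughly quadruples the required sample size, which is why the target precision should be justified as carefully as a target effect size.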
Sample size calculations should be done under different assumptions, and the largest estimated sample size is often a safer bet than a best-case scenario. The calculated sample size should further be adjusted to allow for possible missing data. Due to the complexity of accurately calculating sample size, researchers should strongly consider consulting a statistician early in the study design process.
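Both recommendations, trying several assumptions and allowing for missing data, can be sketched as follows. The scenario labels, effect sizes, and the 10% missing-data proportion are hypothetical, and the sample size formula is the normal approximation for a two-group comparison of means, not a specific tool's method.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sided comparison of means."""
    z = NormalDist()
    return ceil(2 * ((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / effect_size) ** 2)


def inflate_for_missing(n, expected_missing):
    """Inflate n so that about n observations remain after the expected loss."""
    return ceil(n / (1 - expected_missing))


# Hypothetical scenarios for the standardized effect size.
scenarios = {"optimistic": 0.6, "expected": 0.5, "pessimistic": 0.4}
for label, effect_size in scenarios.items():
    n = n_per_group(effect_size)
    n_recruit = inflate_for_missing(n, expected_missing=0.10)
    print(f"{label:<12} effect={effect_size:.1f}  analyzable n={n}  recruit n={n_recruit}")
```

The spread between the optimistic and the pessimistic row shows how sensitive the required sample size is to the assumed effect, which is one reason to discuss these assumptions with a statistician.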
Rule 4: Write a data management plan
In 2020, 2 Coronavirus Disease 2019 (COVID-19) papers in leading medical journals were retracted after major concerns about the data were raised . Today, raw data are more often recognized as a key outcome of research along with the paper. Therefore, it is important to develop a strategy for the life cycle of data, including suitable infrastructure for long-term storage.
The data life cycle is described in a data management plan: a document that describes what data will be collected and how the data will be organized, stored, handled, and protected during and after the end of the research project. Several funders require a data management plan in grant submissions, and publishers like PLOS encourage authors to do so as well. The Wellcome Trust provides guidance in the development of a data management plan, including real examples from neuroimaging, genomics, and social sciences . However, projects do not always allocate funding and resources to the actual implementation of the data management plan.
The Findable, Accessible, Interoperable, and Reusable (FAIR) data principles promote maximal use of data and enable machines to access and reuse data with minimal human intervention . FAIR principles require the data to be retained, preserved, and shared preferably with an immutable unique identifier and a clear usage license. Appropriate metadata will help other researchers (or machines) to discover, process, and understand the data. However, requesting researchers to fully comply with the FAIR data principles in every detail is an ambitious goal.
Multidisciplinary data repositories that support FAIR are, for example, Dryad (https://datadryad.org/), EUDAT (www.eudat.eu), OSF (https://osf.io/), and Zenodo (https://zenodo.org/). A number of institutional and field-specific repositories may also be suitable. However, sometimes, authors may not be able to make their data publicly available for legal or ethical reasons. In such cases, a data user agreement can indicate the conditions required to access the data. Journals highlight what are acceptable and what are unacceptable data access restrictions and often require a data availability statement.
Organizing the study artifacts in a structured way greatly facilitates the reuse of data and code within and outside the lab, enhancing collaborations and maximizing the research investment. Support and courses for data management plans are sometimes available at universities. Another 10 simple rules paper for creating a good data management plan is dedicated to this topic .
Rule 5: Reduce bias
Bias is a distorted view in favor of or against a particular idea. In statistics, bias is a systematic deviation of a statistical estimate from the (true) quantity it estimates. Bias can invalidate our conclusions, and the more bias there is, the less valid they are. For example, in clinical studies, bias may mislead us into reaching a causal conclusion that the difference in the outcomes was due to the intervention or the exposure. This is a big concern, and, therefore, the risk of bias is assessed in clinical trials  as well as in observational studies [47,48].
There are many different forms of bias that can occur in a study, and they may overlap (e.g., allocation bias and confounding bias) . Bias can occur at different stages, for example, immortal time bias in the design of the study, information bias in the execution of the study, and publication bias in the reporting of research. Understanding bias allows us as researchers to remain vigilant of potential sources of bias when peer-reviewing and designing our own studies. We summarized some common types of bias and some preventive steps in Table 1, but many other forms of bias exist; for a comprehensive overview, see the Oxford University’s Catalogue of Bias .
Here are some noteworthy examples of study bias from the literature: An example of information bias was observed when in 1998 an alleged association between the measles, mumps, and rubella (MMR) vaccine and autism was reported. Recall bias (a subtype of information bias) emerged when parents of autistic children recalled the onset of autism after an MMR vaccination more often than parents of similar children who were diagnosed prior to the media coverage of that controversial and meanwhile retracted study . A study from 2001 showed better survival for Academy Award-winning actors, but this was due to immortal time bias that favors the treatment or exposure group [52,53]. A study systematically investigated self-reports about musculoskeletal symptoms and found the presence of information bias. The reason was that participants with little computer time overestimated their computer usage, whereas participants with a lot of computer time underestimated it .
Information bias can be mitigated by using objective rather than subjective measurements. Standard operating procedures (SOPs) and electronic lab notebooks additionally help to follow well-designed protocols for data collection and handling . Even when bias cannot be fully mitigated in a study, complete descriptions of data and methods at least allow the risk of bias to be assessed.
Rule 6: Avoid questionable research practices
Questionable research practices (QRPs) can lead to exaggerated findings and false conclusions and thus lead to irreproducible research. Often, QRPs are used with no bad intentions. This becomes evident when methods sections explicitly describe such procedures, for example, to increase the number of samples until statistical significance is reached that supports the hypothesis. Therefore, it is important that researchers know about QRPs in order to recognize and avoid them.
Several QRPs have been named [56,57]. Among them are low statistical power, pseudoreplication, repeated inspection of data, p-hacking , selective reporting, and hypothesizing after the results are known (HARKing).
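Why repeated inspection of data is problematic, including adding samples until statistical significance is reached, can be shown with a short simulation. This is a minimal sketch under assumptions of our own choosing: the null hypothesis is true, observations are standard normal, a one-sample z-test with known variance is used, and the data are inspected after every 10 observations up to 100.

```python
import random
from statistics import NormalDist


def false_positive_rate(looks, n_sims=2000, alpha=0.05, seed=1):
    """Share of simulated null studies declared significant at any of the given looks."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(n_sims):
        total, n = 0.0, 0
        for target_n in looks:
            # Collect more observations until the next planned inspection.
            while n < target_n:
                total += rng.gauss(0.0, 1.0)  # the true mean is 0, so the null holds
                n += 1
            z = total / n ** 0.5  # z-statistic for the mean with known SD = 1
            if abs(z) > z_crit:
                hits += 1
                break  # stop collecting data as soon as "significance" is reached
    return hits / n_sims


single_look = false_positive_rate(looks=[100])
repeated_looks = false_positive_rate(looks=range(10, 101, 10))
print(f"one analysis at n=100:       {single_look:.3f}")
print(f"inspection every 10 samples: {repeated_looks:.3f}")
```

With a single planned analysis, the false positive rate stays close to the nominal 5%, whereas stopping at the first significant look inflates it considerably. Planned interim analyses are possible, but they require methods designed for that purpose and should be specified in the protocol.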
The first 2 QRPs, low statistical power and pseudoreplication, can be prevented by proper planning and designing of studies, including sample size calculation and appropriate statistical methodology to avoid treating data as independent when in fact they are not. Statistical power is not equal to reproducibility, but statistical power is a precondition of reproducibility as the lack thereof can result in false negative as well as false positive findings (see Rule 3).
In fact, many QRPs can be avoided with a study protocol and statistical analysis plan. Preregistration, as described in Rule 2, is considered best practice for this purpose. However, many of these issues can additionally be rooted in institutional incentives and rewards. Both funding and promotion are often tied to the quantity rather than the quality of the research output. At universities, few or no rewards are given for writing and registering protocols, sharing data, publishing negative findings, and conducting replication studies. Thus, a wider “culture change” is needed.
Rule 7: Be cautious with interpretations of statistical significance
It would help if more researchers were familiar with correct interpretations and possible misinterpretations of statistical tests, p-values, confidence intervals, and statistical power [59,60]. A statistically significant p-value does not necessarily mean that there is a clinically or biologically relevant effect. Specifically, the traditional dichotomization into statistically significant (p < 0.05) versus statistically nonsignificant (p ≥ 0.05) results is seldom appropriate, can lead to cherry-picking of results and may eventually corrupt science . We instead recommend reporting exact p-values and interpreting them in a graded way in terms of the compatibility of the null hypothesis with the data [62,63]. Moreover, a p-value around 0.05 (e.g., 0.047 or 0.055) provides only little information, as is best illustrated by the associated replication power: The probability that a hypothetical replication study of the same design will lead to a statistically significant result is only 50%  and is even lower in the presence of publication bias and regression to the mean (the phenomenon that effect estimates in replication studies are often smaller than the estimates in the original study) . Claims of novel discoveries should therefore be based on a smaller p-value threshold (e.g., p < 0.005) , but this really depends on the discipline (genome-wide screenings or studies in particle physics often apply much lower thresholds).
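The figure of 50% can be reproduced with a short calculation. The sketch below rests on assumptions that we state explicitly: the test statistic is approximately normal, the replication has the same design and sample size, and the true effect equals the estimate of the original study. The function name is illustrative.

```python
from statistics import NormalDist


def replication_power(p_original, alpha=0.05):
    """Probability that an identical replication is significant at level alpha,
    assuming the true effect equals the original estimate (two-sided z-test)."""
    z = NormalDist()
    z_obs = z.inv_cdf(1 - p_original / 2)  # z-value of the original study
    z_crit = z.inv_cdf(1 - alpha / 2)      # significance threshold
    # The replication z-value is then distributed as N(z_obs, 1).
    return (1 - z.cdf(z_crit - z_obs)) + z.cdf(-z_crit - z_obs)


for p in (0.05, 0.005):
    print(f"original p = {p}: replication power = {replication_power(p):.2f}")
```

For an original p-value of exactly 0.05, the original z-value sits on the threshold, so half of the replications fall above it and half below. A smaller original p-value moves the expected replication result away from the threshold and raises the replication power under the same assumptions.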
Generally, there is often too much emphasis on p-values. A statistical index such as the p-value is just the final product of an analysis, the tip of the iceberg . Statistical analyses often include many complex stages, from data processing, cleaning, transformation, addressing missing data, modeling, to statistical inference. Errors and pitfalls can creep in at any stage, and even a tiny error can have a big impact on the result . Also, when many hypothesis tests are conducted (multiple testing), false positive rates may need to be controlled to protect against wrong conclusions, although adjustments for multiple testing are debated [69–71].
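To give a sense of the multiple testing problem, the following sketch computes the probability of at least one false positive among independent tests and applies two common adjustments, Bonferroni and Holm. These are standard textbook procedures chosen for illustration; the p-values are made up, and whether and how to adjust remains a judgment that depends on the study.

```python
def familywise_error_rate(m, alpha=0.05):
    """Probability of at least one false positive among m independent true-null tests."""
    return 1 - (1 - alpha) ** m


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply each p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


print(f"20 independent tests: {familywise_error_rate(20):.2f}")  # about 0.64

p_values = [0.01, 0.04, 0.03]  # made-up example values
print("Bonferroni:", [round(p, 3) for p in bonferroni(p_values)])
print("Holm:      ", [round(p, 3) for p in holm(p_values)])
```

Holm's procedure controls the same error rate as Bonferroni but is never more conservative, which is visible in the smaller adjusted values for the larger p-values.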
Thus, a p-value alone is not a measure of how credible a scientific finding is . Instead, the quality of the research must be considered, including the study design, the quality of the measurement, and the validity of the assumptions that underlie the data analysis [60,73]. Frameworks exist that help to systematically and transparently assess the certainty in evidence; the most established and widely used one is Grading of Recommendations, Assessment, Development and Evaluations (GRADE; www.gradeworkinggroup.org) .
Training in basic statistics, statistical programming, and reproducible analyses and better involvement of data professionals in academia are necessary. University departments sometimes have statisticians who can support researchers. Importantly, statisticians need to be involved early in the process and on an equal footing and not just at the end of a project to perform the final data analysis.
Rule 8: Make your research open
In reality, science often lacks transparency. Open science makes the process of producing evidence and claims transparent and accessible to others . Several universities and research funders have already implemented open science roadmaps to advocate free and public science as well as open access to scientific knowledge, with the aim of further developing the credibility of research. Open research allows more eyes to see it and critique it, a principle similar to “Linus’s law” in software development, which says that if there are enough people to test a piece of software, most bugs will be discovered.
As science often progresses incrementally, writing and sharing a study protocol and making data and methods readily available is crucial to facilitate knowledge building. The Open Science Framework (osf.io) is a free and open-source project management tool that supports researchers throughout the entire project life cycle. OSF enables preregistration of study protocols and sharing of documents, data, analysis code, supplementary materials, and preprints.
To facilitate reproducibility, a research paper can link to data and analysis code deposited on OSF. Computational notebooks are now readily available that unite data processing, data transformations, statistical analyses, figures and tables in a single document (e.g., R Markdown, Jupyter); see also the 10 simple rules for reproducible computational research . Making both data and code open thus minimizes waste of funding resources and accelerates science.
Open science can also advance researchers’ careers, especially for early-career researchers. The increased visibility, retrievability, and citations of datasets can all help with career building . Therefore, institutions should provide necessary training, and hiring committees and journals should align their core values with open science, to attract researchers who aim for transparent and credible research .
Rule 9: Report all findings
Publication bias occurs when the outcome of a study influences the decision whether to publish it. Researchers, reviewers, and publishers often find nonsignificant study results not interesting or worth publishing. As a consequence, outcomes and analyses are only selectively reported in the literature , also known as the file drawer effect .
The extent of publication bias in the literature is illustrated by the overwhelming frequency of statistically significant findings . A study extracted p-values from MEDLINE and PubMed Central and showed that 96% of the records reported at least 1 statistically significant p-value , which seems implausible in the real world. Another study plotted the distribution of more than 1 million z-values from Medline, revealing a huge gap from −2 to 2 . Positive studies (i.e., statistically significant, perceived as striking or showing a beneficial effect) were 4 times more likely to get published than negative studies .
Often a statistically nonsignificant result is interpreted as a “null” finding. But a nonsignificant finding does not necessarily mean a null effect; absence of evidence is not evidence of absence . An individual study may be underpowered, resulting in a nonsignificant finding, but the cumulative evidence from multiple studies may indeed provide sufficient evidence in a meta-analysis. Another argument is that a confidence interval that contains the null value often also contains non-null values that may be of high practical importance. Only if all the values inside the interval are deemed unimportant from a practical perspective may it be fair to describe a result as a null finding . We should thus never report “no difference” or “no association” just because a p-value is larger than 0.05 or, equivalently, because a confidence interval includes the “null” .
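The reasoning about confidence intervals can be written down as a simple check. The sketch below is an illustration under our own assumptions: a normal-approximation interval for a difference whose null value is 0, and a hypothetical threshold for the smallest practically relevant effect that must be justified on subject-matter grounds. The wording of the labels is ours.

```python
from statistics import NormalDist


def interpret_interval(estimate, se, smallest_relevant_effect, level=0.95):
    """Return the confidence interval and a cautious verbal summary of it."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower, upper = estimate - z * se, estimate + z * se
    if lower > 0 or upper < 0:
        label = "interval excludes the null value"
    elif -smallest_relevant_effect < lower and upper < smallest_relevant_effect:
        label = "all compatible values are practically unimportant"
    else:
        label = "inconclusive: the null and practically important values are both compatible"
    return (lower, upper), label


# Two hypothetical nonsignificant results that deserve different conclusions.
for estimate, se in [(0.30, 0.20), (0.02, 0.05)]:
    (lower, upper), label = interpret_interval(estimate, se, smallest_relevant_effect=0.25)
    print(f"estimate={estimate:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]: {label}")
```

Both examples have an interval that includes zero, yet only the second one supports a statement that any effect is too small to matter; the first is simply imprecise.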
On the other hand, studies sometimes report statistically nonsignificant results with “spin” to claim that the experimental treatment is beneficial, often by focusing their conclusions on statistically significant differences on secondary outcomes despite a statistically nonsignificant difference for the primary outcome [86,87].
Findings that are not being published have a tremendous impact on the research ecosystem, distorting our knowledge of the scientific landscape by perpetuating misconceptions, and jeopardizing judgment of researchers and the public trust in science. In clinical research, publication bias can mislead care decisions and harm patients, for example, when treatments appear useful despite only minimal or even absent benefits reported in studies that were not published and thus are unknown to physicians . Moreover, publication bias also directly affects the formulation and proliferation of scientific theories, which are taught to students and early-career researchers, thereby perpetuating biased research from the core. It has been shown in modeling studies that unless a sufficient proportion of negative studies are published, a false claim can become an accepted fact  and the false positive rates influence trustworthiness in a given field .
In sum, negative findings are undervalued. They need to be more consistently reported at the study level or be systematically investigated at the systematic review level. Researchers have their share of responsibilities, but there is clearly a lack of incentives from promotion and tenure committees, journals, and funders.
Rule 10: Follow reporting guidelines
Study reports need to faithfully describe the aim of the study and what was done, including potential deviations from the original protocol, as well as what was found. Yet, there is ample evidence of discrepancies between protocols and research reports, and of insufficient quality of reporting [79,91–95]. Reporting deficiencies threaten our ability to clearly communicate findings, replicate studies, make informed decisions, and build on existing evidence, wasting time and resources invested in the research .
Reporting guidelines aim to provide the minimum information needed on key design features and analysis decisions, ensuring that findings can be adequately used and studies replicated. In 2008, the Enhancing the QUAlity and Transparency Of Health Research (EQUATOR) network was initiated to provide reporting guidelines for a variety of study designs along with guidelines for education and training on how to enhance quality and transparency of health research. Currently, there are 468 reporting guidelines listed in the network; see the most prominent guidelines in Table 2. Furthermore, following the ICMJE recommendations, medical journals are increasingly endorsing reporting guidelines , in some cases making it mandatory to submit the appropriate reporting checklist along with the manuscript.
The use of reporting guidelines and journal endorsement has led to a positive impact on the quality and transparency of research reporting, but improvement is still needed to maximize the value of research [98,99].
Originally, this paper targeted early-career researchers; however, throughout the development of the rules, it became clear that the present recommendations can serve all researchers irrespective of their seniority. We focused on practical guidelines for planning, conducting, and reporting of research. Others have aligned GRP with similar topics [100,101]. Even though we provide 10 simple rules, the word “simple” should not be taken lightly. Putting the rules into practice usually requires effort and time, especially at the beginning of a research project. However, time can also be redeemed, for example, when certain choices can be justified to reviewers by providing a study protocol or when data can be quickly reanalyzed by using computational notebooks and dynamic reports.
Researchers have field-specific research skills, but sometimes are not aware of best practices in other fields that can be useful. Universities should offer cross-disciplinary GRP courses across faculties to train the next generation of scientists. Such courses are an important building block to improve the reproducibility of science.
This article was written along the Good Research Practice (GRP) courses at the University of Zurich provided by the Center of Reproducible Science (www.crs.uzh.ch). All materials from the course are available at https://osf.io/t9rqm/. We appreciated the discussion, development, and refinement of this article within the working group “training” of the SwissRN (www.swissrn.org). We are grateful to Philip Bourne for a lot of valuable comments on the earlier versions of the manuscript.
- 1. Errington TM, Mathur M, Soderberg CK, Denis A, Perfito N, Iorns E, et al. Investigating the replicability of preclinical cancer biology. Elife. 2021;10. pmid:34874005
- 2. Camerer CF, Dreber A, Holzmeister F, Ho T-H, Huber J, Johannesson M, et al. Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nat Hum Behav. 2018;2:637–44. pmid:31346273
- 3. Camerer CF, Dreber A, Forsell E, Ho T-H, Huber J, Johannesson M, et al. Evaluating replicability of laboratory experiments in economics. Science. 2016;351:1433–6. pmid:26940865
- 4. Open Science Collaboration. Estimating the reproducibility of psychological science. Science. 2015;349:aac4716. pmid:26315443
- 5. Prinz F, Schlange T, Asadullah K. Believe it or not: how much can we rely on published data on potential drug targets? Nat Rev Drug Discov. 2011:712. pmid:21892149
- 6. Bespalov A, Barnett AG, Begley CG. Industry is more alarmed about reproducibility than academia. Nature. 2018;563:626. pmid:30487623
- 7. Baker M. 1,500 scientists lift the lid on reproducibility. Nature. 2016;533:452–4. pmid:27225100
- 8. Botvinik-Nezer R, Holzmeister F, Camerer CF, Dreber A, Huber J, Johannesson M, et al. Variability in the analysis of a single neuroimaging dataset by many teams. Nature. 2020. pmid:32483374
- 9. Silberzahn R, Uhlmann EL, Martin DP, Anselmi P, Aust F, Awtrey E, et al. Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results. Adv Methods Pract Psychol Sci. 2018;1:337–56.
- 10. Medical Research Council, MRC. Good research practice: principles and guidelines. 2012. Available: https://mrc.ukri.org/research/policies-and-guidance-for-researchers/good-research-practice/.
- 11. Swedish Research Council. Good Research Practice–What Is It? 2006. Available: https://lincs.gu.se/digitalAssets/1268/1268493_Good_Research_Practice.pdf.
- 12. Robinson KA, Brunnhuber K, Ciliska D, Juhl CB, Christensen R, Lund H, et al. Evidence-Based Research Series-Paper 1: What Evidence-Based Research is and why is it important? J Clin Epidemiol. 2021;129:151–7. pmid:32979491
- 13. Riva JJ, Malik KMP, Burnie SJ, Endicott AR, Busse JW. What is your research question? An introduction to the PICOT format for clinicians. J Can Chiropr Assoc. 2012;56:167–71. Available: https://www.ncbi.nlm.nih.gov/pubmed/22997465. pmid:22997465
- 14. Schwab S, Held L. Different worlds: Confirmatory versus exploratory research. Significance. 2020;17:8–9.
- 15. Tukey JW. We need both exploratory and confirmatory. Am Stat. 1980;34:23–5.
- 16. Loder E, Groves T, MacAuley D. Registration of observational studies. BMJ. 2010:c950. pmid:20167643
- 17. van der Naald M, Wenker S, Doevendans PA, Wever KE, Chamuleau SAJ. Publication rate in preclinical research: a plea for preregistration. BMJ Open Science. 2020;4:e100051. pmid:35047690
- 18. Nosek BA, Beck ED, Campbell L, Flake JK, Hardwicke TE, Mellor DT, et al. Preregistration Is Hard, And Worthwhile. Trends Cogn Sci. 2019;23:815–8. pmid:31421987
- 19. Nosek BA, Ebersole CR, DeHaven AC, Mellor DT. The preregistration revolution. Proc Natl Acad Sci U S A. 2018;115:2600–6. pmid:29531091
- 20. Bradley SH, DeVito NJ, Lloyd KE, Richards GC, Rombey T, Wayant C, et al. Reducing bias and improving transparency in medical research: a critical overview of the problems, progress and suggested next steps. J R Soc Med. 2020;113:433–43. pmid:33167771
- 21. Macleod MR, Michie S, Roberts I, Dirnagl U, Chalmers I, Ioannidis JPA, et al. Biomedical research: increasing value, reducing waste. Lancet. 2014;383:101–4. pmid:24411643
- 22. Chalmers I, Glasziou P. Avoidable waste in the production and reporting of research evidence. Lancet. 2009;374:86–9. pmid:19525005
- 23. Yuan I, Topjian AA, Kurth CD, Kirschen MP, Ward CG, Zhang B, et al. Guide to the statistical analysis plan. Paediatr Anaesth. 2019;29:237–42. pmid:30609103
- 24. Thomas L, Peterson ED. The value of statistical analysis plans in observational research: defining high-quality research from the start. JAMA. 2012;308:773–4. pmid:22910753
- 25. Soderberg CK, Errington TM, Schiavone SR, Bottesini J, Thorn FS, Vazire S, et al. Initial evidence of research quality of registered reports compared with the standard publishing model. Nat Hum Behav. 2021:1–8.
- 26. Chambers C. What’s next for Registered Reports? Nature. 2019;573:187–9. pmid:31506624
- 27. Chambers CD, Mellor DT. Protocol transparency is vital for registered reports. Nat Hum Behav. 2018:791–2. pmid:31558811
- 28. Dirnagl U. Preregistration of exploratory research: Learning from the golden age of discovery. PLoS Biol. 2020;18:e3000690. pmid:32214315
- 29. McIntosh RD. Exploratory reports: A new article type for Cortex. Cortex. 2017;96:A1–4. pmid:29110814
- 30. Button KS, Ioannidis JPA, Mokrysz C, Nosek BA, Flint J, Robinson ESJ, et al. Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci. 2013;14:365–76. pmid:23571845
- 31. Ioannidis JPA. Why most published research findings are false. PLoS Med. 2005;2:e124. pmid:16060722
- 32. van Zwet E, Schwab S, Greenland S. Addressing exaggeration of effects from single RCTs. Significance. 2021;18:16–21.
- 33. van Zwet E, Schwab S, Senn S. The statistical properties of RCTs and a proposal for shrinkage. Stat Med. 2021;40:6107–17. pmid:34425632
- 34. Sterne JA, Gavaghan D, Egger M. Publication and related bias in meta-analysis: power of statistical tests and prevalence in the literature. J Clin Epidemiol. 2000;53:1119–29. pmid:11106885
- 35. H. G. Zhang EZ. CRAN task view: Clinical trial design, monitoring, and analysis. 20 Jun 2021 [cited 3 Mar 2022]. Available: https://CRAN.R-project.org/view=ClinicalTrials.
- 36. Champely S. pwr: Basic Functions for Power Analysis. 2020. Available: https://CRAN.R-project.org/package=pwr.
- 37. Tarigan B, Furrer R, Cherneva K. SampleSizeR: calculate sample sizes within completely randomized design. Open Science Framework. 2021.
- 38. Rothman KJ, Greenland S. Planning Study Size Based on Precision Rather Than Power. Epidemiology. 2018;29:599–603. pmid:29912015
- 39. Bland JM. The tyranny of power: is there a better way to calculate sample size? BMJ. 2009;339:b3985. pmid:19808754
- 40. Haynes A, Lenz A, Stalder O, Limacher A. presize: An R-package for precision-based sample size calculation in clinical research. J Open Source Softw. 2021;6:3118.
- 41. Lakens D. Sample Size Justification. 2021.
- 42. Ledford H, Van Noorden R. High-profile coronavirus retractions raise concerns about data oversight. Nature. 2020;582:160. pmid:32504025
- 43. Outputs Management Plan—Grant Funding. In: Wellcome [Internet]. [cited 13 Feb 2022]. Available: https://wellcome.org/grant-funding/guidance/how-complete-outputs-management-plan.
- 44. Wilkinson MD, Dumontier M, Aalbersberg IJJ, Appleton G, Axton M, Baak A, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016;3:160018. pmid:26978244
- 45. Michener WK. Ten Simple Rules for Creating a Good Data Management Plan. PLoS Comput Biol. 2015;11:e1004525. pmid:26492633
- 46. Higgins JPT, Altman DG, Gøtzsche PC, Jüni P, Moher D, Oxman AD, et al. The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials. BMJ. 2011;343:d5928. pmid:22008217
- 47. Sterne JA, Hernán MA, Reeves BC, Savović J, Berkman ND, Viswanathan M, et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ. 2016;355:i4919. pmid:27733354
- 48. Bero L, Chartres N, Diong J, Fabbri A, Ghersi D, Lam J, et al. The risk of bias in observational studies of exposures (ROBINS-E) tool: concerns arising from application to observational studies of exposures. Syst Rev. 2018;7:242. pmid:30577874
- 49. Sackett DL. Bias in analytic research. J Chronic Dis. 1979;32:51–63. pmid:447779
- 50. Catalogue of bias collaboration. Catalogue of Bias. 2019. Available: https://catalogofbias.org/.
- 51. Andrews N, Miller E, Taylor B, Lingam R, Simmons A, Stowe J, et al. Recall bias, MMR, and autism. Arch Dis Child. 2002;87:493–4. pmid:12456546
- 52. Sylvestre M-P, Huszti E, Hanley JA. Do OSCAR winners live longer than less successful peers? A reanalysis of the evidence. Ann Intern Med. 2006:361–3; discussion 392. pmid:16954361
- 53. Yadav K, Lewis RJ. Immortal Time Bias in Observational Studies. JAMA. 2021;325:686–7. pmid:33591334
- 54. Chang C-HJ, Menéndez CC, Robertson MM, Amick BC 3rd, Johnson PW, del Pino RJ, et al. Daily self-reports resulted in information bias when assessing exposure duration to computer use. Am J Ind Med. 2010;53:1142–9. pmid:20632313
- 55. Kwok R. How to pick an electronic laboratory notebook. Nature. 2018;560:269–70. pmid:30082695
- 56. Bishop D. Rein in the four horsemen of irreproducibility. Nature. 2019;568:435. pmid:31019328
- 57. Held L, Schwab S. Improving the reproducibility of science. Significance. 2020;17:10–1.
- 58. Simmons JP, Nelson LD, Simonsohn U. False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychol Sci. 2011;22:1359–66. pmid:22006061
- 59. Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, et al. Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Eur J Epidemiol. 2016;31:337–50. pmid:27209009
- 60. Wasserstein RL, Lazar NA. The ASA’s Statement on p-Values: Context, Process, and Purpose. Am Stat. 2016;70:129–33.
- 61. Amrhein V, Greenland S, McShane B. Retire statistical significance. Nature. 2019;567:305–7. pmid:30894741
- 62. Cox DR, Donnelly CA. Principles of Applied Statistics. Cambridge University Press; 2011. Available: https://play.google.com/store/books/details?id=BMLMGr4kbQYC.
- 63. Amrhein V, Greenland S. Rewriting results in the language of compatibility. Trends Ecol Evol. 2022;0. pmid:35227533
- 64. Goodman SN. A comment on replication, p-values and evidence. Stat Med. 1992;11:875–9. pmid:1604067
- 65. Held L, Pawel S, Schwab S. Replication power and regression to the mean. Significance. 2020;17:10–1.
- 66. Benjamin DJ, Berger JO, Johannesson M, Nosek BA, et al. Redefine statistical significance. Nat Hum Behav. 2017;2:6–10. pmid:30980045
- 67. Leek JT, Peng RD. Statistics: P values are just the tip of the iceberg. Nature. 2015;520:612. pmid:25925460
- 68. Schwab S, Held L. Statistical programming: Small mistakes, big impacts. Significance. 2021;18:6–7.
- 69. Althouse AD. Adjust for Multiple Comparisons? It’s Not That Simple. Ann Thorac Surg. 2016;101:1644–5. pmid:27106412
- 70. Bender R, Lange S. Adjusting for multiple testing—when and how? J Clin Epidemiol. 2001;54:343–9. pmid:11297884
- 71. Greenland S. Analysis goals, error-cost sensitivity, and analysis hacking: Essential considerations in hypothesis testing and multiple comparisons. Paediatr Perinat Epidemiol. 2021;35:8–23. pmid:33269490
- 72. Nuzzo R. Scientific method: statistical errors. Nature. 2014;506:150–2. pmid:24522584
- 73. Goodman SN. Aligning statistical and scientific reasoning. Science. 2016;352:1180–1.
- 74. GRADE approach. [cited 3 Mar 2022]. Available: https://training.cochrane.org/grade-approach.
- 75. Munafò MR, Nosek BA, Bishop DVM, Button KS, Chambers CD, Percie du Sert N, et al. A manifesto for reproducible science. Nat Hum Behav. 2017;1:0021. pmid:33954258
- 76. Sandve GK, Nekrutenko A, Taylor J, Hovig E. Ten simple rules for reproducible computational research. PLoS Comput Biol. 2013;9:e1003285. pmid:24204232
- 77. McKiernan EC, Bourne PE, Brown CT, Buck S, Kenall A, Lin J, et al. How open science helps researchers succeed. Elife. 2016;5. pmid:27387362
- 78. Schönbrodt F. Training students for the Open Science future. Nat Hum Behav. 2019;3:1031. pmid:31602034
- 79. Chan A-W, Hróbjartsson A, Haahr MT, Gøtzsche PC, Altman DG. Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA. 2004;291:2457–65. pmid:15161896
- 80. Rosenthal R. The file drawer problem and tolerance for null results. Psychol Bull. 1979;86:638–41.
- 81. Fanelli D. Negative results are disappearing from most disciplines and countries. Scientometrics. 2012;90:891–904.
- 82. Chavalarias D, Wallach JD, Li AHT, Ioannidis JPA. Evolution of Reporting P Values in the Biomedical Literature, 1990–2015. JAMA. 2016;315:1141–8. pmid:26978209
- 83. van Zwet EW, Cator EA. The significance filter, the winner’s curse and the need to shrink. Stat Neerl. 2021;75:437–52.
- 84. Hopewell S, Loudon K, Clarke MJ, Oxman AD, Dickersin K. Publication bias in clinical trials due to statistical significance or direction of trial results. Cochrane Database Syst Rev. 2009:MR000006. pmid:19160345
- 85. Altman DG, Martin BJ. Statistics notes: Absence of evidence is not evidence of absence. BMJ. 1995;311:485.
- 86. Boutron I, Dutton S, Ravaud P, Altman DG. Reporting and interpretation of randomized controlled trials with statistically nonsignificant results for primary outcomes. JAMA. 2010;303:2058–64. pmid:20501928
- 87. Khan MS, Lateef N, Siddiqi TJ, Rehman KA, Alnaimat S, Khan SU, et al. Level and Prevalence of Spin in Published Cardiovascular Randomized Clinical Trial Reports With Statistically Nonsignificant Primary Outcomes: A Systematic Review. JAMA Netw Open. 2019;2:e192622. pmid:31050775
- 88. Egger M, Smith GD. Bias in location and selection of studies. BMJ. 1998;316:61–6. pmid:9451274
- 89. Nissen SB, Magidson T, Gross K, Bergstrom CT. Publication bias and the canonization of false facts. Elife. 2016;5. pmid:27995896
- 90. Grimes DR, Bauch CT, Ioannidis JPA. Modelling science trustworthiness under publish or perish pressure. R Soc Open Sci. 2018;5:171511. pmid:29410855
- 91. Goldacre B, Drysdale H, Dale A, Milosevic I, Slade E, Hartley P, et al. COMPare: a prospective cohort study correcting and monitoring 58 misreported trials in real time. Trials. 2019;20:118. pmid:30760329
- 92. Pildal J, Chan A-W, Hróbjartsson A, Forfang E, Altman DG, Gøtzsche PC. Comparison of descriptions of allocation concealment in trial protocols and the published reports: cohort study. BMJ. 2005;330:1049. pmid:15817527
- 93. Koensgen N, Rombey T, Allers K, Mathes T, Hoffmann F, Pieper D. Comparison of non-Cochrane systematic reviews and their published protocols: differences occurred frequently but were seldom explained. J Clin Epidemiol. 2019;110:34–41. pmid:30822507
- 94. Pocock SJ, Collier TJ, Dandreo KJ, de Stavola BL, Goldman MB, Kalish LA, et al. Issues in the reporting of epidemiological studies: a survey of recent practice. BMJ. 2004;329:883. pmid:15469946
- 95. Li G, Abbade LPF, Nwosu I, Jin Y, Leenus A, Maaz M, et al. A systematic review of comparisons between protocols or registrations and full reports in primary biomedical research. BMC Med Res Methodol. 2018;18:9. pmid:29325533
- 96. Glasziou P, Altman DG, Bossuyt P, Boutron I, Clarke M, Julious S, et al. Reducing waste from incomplete or unusable reports of biomedical research. Lancet. 2014;383:267–76. pmid:24411647
- 97. Shamseer L, Hopewell S, Altman DG, Moher D, Schulz KF. Update on the endorsement of CONSORT by high impact factor journals: a survey of journal “Instructions to Authors” in 2014. Trials. 2016;17:301. pmid:27343072
- 98. Turner L, Shamseer L, Altman DG, Schulz KF, Moher D. Does use of the CONSORT Statement impact the completeness of reporting of randomised controlled trials published in medical journals? A Cochrane review. Syst Rev. 2012;1:60. pmid:23194585
- 99. Stevens A, Shamseer L, Weinstein E, Yazdi F, Turner L, Thielman J, et al. Relation of completeness of reporting of health research to journals’ endorsement of reporting guidelines: systematic review. BMJ. 2014;348:g3804. pmid:24965222
- 100. Sarafoglou A, Hoogeveen S, Matzke D, Wagenmakers E-J. Teaching Good Research Practices: Protocol of a Research Master Course. Psychology Learning & Teaching. 2020;19:46–59.
- 101. Kabitzke P, Cheng KM, Altevogt B. Guidelines and Initiatives for Good Research Practice. Handb Exp Pharmacol. 2019. pmid:31696346