General medical publications during COVID-19 show increased dissemination despite lower validation

Nan Gai; Kazuyoshi Aoyama; David Faraoni; Neil M. Goldenberg; David N. Levin; Jason T. Maynes; Mark J. McVey; Farrukh Munshey; Asad Siddiqui; Timothy Switzer; Benjamin E. Steinberg

doi:10.1371/journal.pone.0246427

Abstract

Background

The COVID-19 pandemic has yielded an unprecedented quantity of new publications, contributing to an overwhelming quantity of information and leading to the rapid dissemination of less stringently validated information. Yet, a formal analysis of how the medical literature has changed during the pandemic is lacking. In this analysis, we aimed to quantify how scientific publications changed at the outset of the COVID-19 pandemic.

Methods

We performed a cross-sectional bibliometric study of published studies in four high-impact medical journals to identify differences in the characteristics of COVID-19 related publications compared to non-pandemic studies. Original investigations related to SARS-CoV-2 and COVID-19 published in March and April 2020 were identified and compared to non-COVID-19 research publications over the same two-month period in 2019 and 2020. Extracted data included publication characteristics, study characteristics, author characteristics, and impact metrics. Our primary measure was principal component analysis (PCA) of publication characteristics and impact metrics across groups.

Results

We identified 402 publications that met inclusion criteria: 76 were related to COVID-19; 154 and 172 were non-COVID publications over the same period in 2020 and 2019, respectively. PCA utilizing the collected bibliometric data revealed segregation of the COVID-19 literature subset from both groups of non-COVID literature (2019 and 2020). COVID-19 publications were more likely to describe prospective observational (31.6%) or case series (41.8%) studies without industry funding as compared with non-COVID articles, which were represented primarily by randomized controlled trials (32.5% and 36.6% in the non-COVID literature from 2020 and 2019, respectively).

Conclusions

In this cross-sectional study of publications in four general medical journals, COVID-related articles were significantly different from non-COVID articles based on article characteristics and impact metrics. COVID-related studies were generally shorter articles reporting observational studies with less literature cited and fewer study sites, suggestive of more limited scientific support. They nevertheless had much higher dissemination.

Citation: Gai N, Aoyama K, Faraoni D, Goldenberg NM, Levin DN, Maynes JT, et al. (2021) General medical publications during COVID-19 show increased dissemination despite lower validation. PLoS ONE 16(2): e0246427. https://doi.org/10.1371/journal.pone.0246427

Editor: Itamar Ashkenazi, Technion - Israel Institute of Technology, ISRAEL

Received: September 15, 2020; Accepted: January 20, 2021; Published: February 2, 2021

Copyright: © 2021 Gai et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting Information files.

Funding: Funding was provided through departmental funds from the Department of Anesthesia and Pain Medicine at the Hospital for Sick Children.

Competing interests: The authors have declared that no competing interests exist.

Introduction

The coronavirus disease 2019 (COVID-19) pandemic has given rise to an unprecedented quantity of publications in a short period of time as researchers worldwide attempt to report their experiences to better understand this new disease and identify promising treatments [1]. This has contributed to a COVID-19 “infodemic”–an overwhelming quantity of information, leading to the rapid dissemination of less stringently validated information [2].

Given the devastating severity of COVID-19, there is an understandable urgency to disseminate new findings. However, the rush to publish has potentially led to the compromise of scientific integrity [3]. This has led to advocacy for quality over quantity, cautioning that a crisis is no excuse for lowering scientific standards [3–5]. Yet, the COVID-19 pandemic has magnified traditional problems of “uninformative” clinical trials–those whose results are not useful to patients, clinicians, researchers, or policy makers [6, 7].

While specific concerns about COVID-19-related publications have been expressed [8], a formal analysis of the extent to which the medical literature has shifted during the pandemic is lacking. In this analysis, we aimed to quantify how scientific publications changed at the outset of the COVID-19 pandemic by performing a cross-sectional bibliometric study of published studies in four high-impact medical journals to identify differences in the characteristics of COVID-19 related publications compared to non-pandemic related studies.

Methods

This is a cross-sectional bibliometric study of original COVID-19 related research publications in the four general medical journals with the highest impact factors [9]–The Journal of the American Medical Association (JAMA), New England Journal of Medicine (NEJM), The Lancet, and Nature Medicine. This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guidelines [10].

We searched for original investigations related to SARS-CoV-2 and COVID-19 published in March and April 2020 through MEDLINE. MEDLINE alone was used because it contained entries for all publications within our four journals of interest. Accordingly, other databases were not consulted. As comparison groups, we retrieved all non-COVID-19 research publications over the same two-month period in 2019 and 2020. We included original scientific research, and excluded opinion, news, and educational pieces. Two reviewers verified studies for inclusion and two reviewers audited extracted data. Any discrepancies in eligibility assessment and data collection were resolved by consensus. Extracted data included publication characteristics, study characteristics, author characteristics, and impact metrics. Impact metrics (numbers of reads, citations, and tweets) were not normalized to the time since publication.

Categorical data are presented as counts and percentages and continuous data as medians and interquartile ranges (IQRs). Our primary measure was principal component analysis (PCA) of publication characteristics and impact metrics across groups. In our study, we sought to discover any differences in multiple article metrics between the 2020 COVID period and historical controls. Principal component analysis allows for the determination of the largest contributors to the variance in the data across all article metrics, in an unsupervised fashion without biasing data segregation [11]. Using PCA allows us to identify the most important features that capture the maximum information about the dataset, reducing dimensionality without any significant loss of information. Comparisons between groups were conducted using Chi-square or Fisher’s exact tests for proportions and non-parametric Kruskal-Wallis tests with Dunn’s multiple comparison for continuous data. Data for each journal were aggregated for analysis. P values less than 0.05 were considered statistically significant. Analyses were performed using GraphPad PRISM software version 7.0 and RStudio version 1.3.1056.
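
As an illustration only, the kind of PCA described above can be sketched in Python with NumPy. This is not the authors' GraphPad Prism or RStudio workflow: the function name `pca`, the four column meanings, and the synthetic data are all assumptions made for the example, and the study's S1 Dataset is not used.

```python
import numpy as np


def pca(X, n_components=3):
    """PCA via singular value decomposition of the standardized data matrix.

    Returns (scores, components, explained_ratio), each truncated to
    n_components.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each column so metrics on very different scales
    # (e.g. word counts vs. tweets) contribute comparably.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal axes; squared singular values are
    # proportional to the variance captured by each axis.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained_ratio = S**2 / np.sum(S**2)
    components = Vt[:n_components]
    scores = Z @ components.T
    return scores, components, explained_ratio[:n_components]


# Synthetic stand-in data (NOT the study's S1 Dataset). Hypothetical columns:
# word_count, n_references, reads, tweets.
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 4))
is_covid = np.arange(90) < 30   # first 30 rows play the "COVID" group
X[is_covid, 2:] += 3.0          # assumed: higher dissemination metrics
X[is_covid, :2] -= 1.0          # assumed: shorter articles, fewer references

scores, components, explained_ratio = pca(X)
pc1_gap = abs(scores[is_covid, 0].mean() - scores[~is_covid, 0].mean())
print("explained variance ratio:", np.round(explained_ratio, 3))
print("gap between group means on PC1:", round(float(pc1_gap), 2))
```

Because the group labels are never passed to `pca`, any separation of the groups along PC1 arises from the data alone, which is the sense in which the analysis is unsupervised.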

Results

The initial MEDLINE literature search identified 1,119 total articles for consideration (262 COVID-related). We identified 402 publications that met inclusion criteria: 76 were related to COVID-19; 154 and 172 were non-COVID publications over the same period in 2020 and 2019, respectively (data available in S1 Dataset). Principal component analysis utilizing the collected bibliometric data revealed segregation of the COVID-19 literature subset from both groups of non-COVID literature (2019 and 2020), verifying that the bibliometric characteristics capture a change in publication metrics (Fig 1). The most significant contributions to the PCA came from metrics representing article dissemination: reads, tweets, and citations contributed 57%, 54%, and 43%, respectively, towards the first principal component (PC1). The two non-COVID subsets of data nearly overlap in the PCA, indicating a strong consistency between the two years analyzed and emphasizing the uniqueness of the COVID-related literature.

Fig 1. Principal component analysis of COVID and non-COVID publication characteristics and impact metrics.

Each point in the plot corresponds to a single characteristic provided in Table 1 for COVID (green square) and non-COVID publications from 2019 (purple circle) and 2020 (gray triangle). Principal component 1 (PC1) is shown plotted against (A) PC2 and (B) PC3. PC1, PC2, and PC3 account for 32.4%, 24.8%, and 16.4% of the variability, respectively. The clusters of non-COVID publications from 2019 and 2020 overlap, whereas COVID publications cluster separately. This unbiased analysis suggests COVID-related publications differ from both concurrent and historic non-COVID publications.

https://doi.org/10.1371/journal.pone.0246427.g001

Table 1. Publication characteristics and impact.

https://doi.org/10.1371/journal.pone.0246427.t001

To further evaluate how the published COVID-19 research literature differed from non-COVID-19 investigations, we first compared their publication characteristics (Table 1). Publication characteristics segregated by individual journal are provided in S1 Table. COVID-19 publications were more likely to describe prospective observational (31.6%) or case series (41.8%) studies without industry funding as compared with non-COVID articles, which were represented primarily by randomized controlled trials (32.5% and 36.6% in the non-COVID literature from 2020 and 2019, respectively). Moreover, COVID-related publications had lower word counts and cited less of the medical literature. While the number of authors was unchanged, the number of author affiliations was lower, suggesting fewer collaborative or multi-institutional studies. There was no observed difference in the proportion of female first or corresponding authors. For Nature Medicine, the only evaluated journal to report submission dates, COVID-related submissions were published in a much shorter amount of time (35.1 days versus 288.3 and 305.3 days for 2020 and 2019 non-COVID publications, respectively).
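
For readers unfamiliar with the non-parametric comparison named in the Methods, the following is a minimal sketch of a three-group Kruskal-Wallis test in plain Python. It is not the authors' GraphPad Prism analysis; the function name `kruskal_wallis` and the word-count values are hypothetical, and Dunn's multiple-comparison step is not shown. With exactly three groups the chi-square approximation has two degrees of freedom, for which the upper-tail probability is exp(-H/2).

```python
import math
from itertools import chain


def kruskal_wallis(*groups):
    """Return (H, p) for a three-group Kruskal-Wallis test with tie correction.

    The p-value uses the chi-square approximation with df = 2, where
    P(X > h) = exp(-h / 2), so exactly three groups are required.
    """
    if len(groups) != 3:
        raise ValueError("this sketch handles exactly three groups")
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)
    ranks = {}      # value -> average rank among the pooled observations
    tie_term = 0    # sum of (t^3 - t) over runs of t tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i
        ranks[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t**3 - t
        i = j
    correction = 1 - tie_term / (n**3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    rank_sums = sum(sum(ranks[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sums - 3 * (n + 1)
    h /= correction
    return h, math.exp(-h / 2)


# Hypothetical word counts for three groups of articles (illustrative only).
covid_2020 = [1450, 1800, 2100, 1600, 1950]
non_covid_2020 = [3100, 3400, 2900, 3600, 3300]
non_covid_2019 = [3000, 3500, 3250, 2950, 3700]

h_stat, p_value = kruskal_wallis(covid_2020, non_covid_2020, non_covid_2019)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

A significant result indicates only that at least one group differs; a post hoc procedure such as Dunn's test is needed to identify which pairs differ.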

The observed differences in publication characteristics presumably represent the initial effort to quickly provide clinicians and policymakers with information in the early phase of the pandemic, regardless of quality. To objectively evaluate the extent to which the COVID-19 literature was disseminated, we analyzed the number of accesses, tweets, and citations within our bibliometric dataset. Publications related to COVID had an order of magnitude more accesses, tweets, and citations than non-COVID publications from the same period in both 2019 and 2020 (Table 1). This absolute difference does not account for the longer time since publication of the 2019 articles and may therefore underestimate the unparalleled rate at which observational data spread across the international medical community.

Discussion

Using an unbiased approach, our PCA suggests that published pandemic-related studies have different article characteristics and impact metrics compared with non-COVID studies. They generally consist of shorter articles reporting observational studies with less literature cited and fewer study sites, suggestive of more limited scientific support. Yet, pandemic-related research is associated with greater reach in terms of readership, citations, and tweets, which speaks to the strong appetite for pandemic-related findings.

The publication characteristics described in our analysis reflect the urgency with which the medical, scientific, and lay communities sought information as the pandemic evolved. This ongoing need, however, should be tempered with scientific and ethical oversight that is at least as rigorous as in normal times, with a focus on well-designed trials rather than rapid dissemination of low-quality data. The potential harms of producing multiple iterations of lower-quality studies have been identified, including wasting of resources, lapses in the ethical standard of scientific reporting, delaying the conduct of higher-level evidence trials, diluting the quality of available evidence, and endangering the ethical responsibility to patients who enroll in trials with the expectation of assisting in medical and scientific advancement [6, 12, 13]. Researchers should endeavour to maintain high-quality research methods by increasing collaboration across multiple centres, helping to overcome limitations that may exist from single-centre efforts [3, 14]. International teams working in concert and not in competition on well-designed studies would greatly improve the capacity to detect clinically meaningful effects to inform the international health system’s efforts against COVID-19. For example, research consortia could establish research priorities and promote the implementation of master protocols with adaptive platforms [15–17]. This type of approach is designed for the perpetual investigation of multiple interventions with timely adaptation, an ideal framework for our evolving COVID-19 health crisis that would facilitate wider collaboration and mitigate the production of low-quality evidence and poor scientific reporting.

Efforts have also focused on the expanding COVID-19 literature itself using both manual and automated methods. Content experts have been vetting the published literature to provide health care workers and policymakers with curated digital compendiums of high-quality research papers, such as the 2019 Novel Coronavirus Research Compendium [18]. Computational approaches are being used to mine the published COVID-19 literature to answer key questions related to the pandemic [19]. As these resources continue to grow, increasing effort will be required to ensure that the medical, scientific, and lay communities can engage with the resulting data and analyses in a meaningful way.

Our analysis, however, has limitations. We focus on the earliest phase of the pandemic in order to capture how the medical community first pivoted to acquire and disseminate COVID-19-related knowledge. This potentially biases our results towards observational studies as there would be limited time to advance and report more rigorous study designs, such as randomized controlled trials. Moreover, to efficiently disseminate medical knowledge, the included journals made pandemic-related content freely available, which may have contributed to the observed increase in impact metrics. Lastly, our bibliometric analysis does not consider the root cause of the disparity between COVID and non-COVID publications. This is likely multifactorial but could, in part, reflect the feasibility of a timely study completion, variable adherence to reporting standards, and a strained peer review system. Ongoing evaluations of the publication process over the entirety of the pandemic will inform how the scientific community can most effectively, safely, and ethically disseminate valuable medical knowledge in a time of acute crisis.

Conclusion

COVID-19 led to a significant change in the characteristics of research studies across high-impact general medical journals. During this pandemic, the rapid and broad dissemination of research findings, regardless of underlying quality, was amplified and potentially contributed to the infodemic of misinformation at a time when best evidence needs to be emphasized. Ultimately, relaxing the rigorous standards for scientific research, although tempting for many altruistic reasons during a pandemic, may not actually achieve the objective of producing a solid evidence-based foundation upon which patients, clinicians, and policymakers can make meaningful decisions. The scientific and medical communities must strongly advocate for the thoughtful selection of high-quality research that will ensure the generation of meaningful knowledge and that participants of scientific trials who volunteer their health experience do not do so in vain.

Supporting information

S1 Dataset. The dataset used for the analyses in this study.

https://doi.org/10.1371/journal.pone.0246427.s001

(XLSX)

S1 Table. Publication characteristics and impact by journal.

https://doi.org/10.1371/journal.pone.0246427.s002

(DOCX)

References

1. Balaphas A, Gkoufa K, Daly M-J, de Valence T. Flattening the curve of new publications on COVID-19. J Epidemiol Community Health. 2020 Jul 3;jech–2020-214617. pmid:32631844
2. Tangcharoensathien V, Calleja N, Nguyen T, Purnat T, D’Agostino M, Garcia-Saiso S, et al. Framework for Managing the COVID-19 Infodemic: Methods and Results of an Online, Crowdsourced WHO Technical Consultation. J Med Internet Res. 2020;22(6):e19659. pmid:32558655
3. London AJ, Kimmelman J. Against pandemic research exceptionalism. Science. 2020 May 1;368(6490):476–7. pmid:32327600
4. Bauchner H, Fontanarosa PB. Randomized Clinical Trials and COVID-19: Managing Expectations. JAMA. 2020;323(22):2262–3. pmid:32364561
5. McDermott MM, Newman AB. Preserving Clinical Trial Integrity During the Coronavirus Pandemic. JAMA. 2020 Jun 2;323(21):2135–6. pmid:32211830
6. Zarin DA, Goodman SN, Kimmelman J. Harms From Uninformative Clinical Trials. JAMA. 2019 Sep 3;322(9):813–4. pmid:31343666
7. Pundi K, Perino AC, Harrington RA, Krumholz HM, Turakhia MP. Characteristics and Strength of Evidence of COVID-19 Studies Registered on ClinicalTrials.gov. JAMA Intern Med. 2020 Jul 27; pmid:32730617
8. Salazar JW, McWilliams Jr JM, Wang TY. Setting Expectations for Clinical Research During the COVID-19 Pandemic. JAMA Intern Med. 2020 Jul 27; pmid:32716474
9. Clarivate Web of Science. Web of Science Journal Citation Reports [Internet]. 2020 [cited 2020 Aug 4]. Available from: https://clarivate.com/webofsciencegroup/web-of-science-journal-citation-reports-2020-infographic
10. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Lancet. 2007 Oct 20;370(9596):1453–7. pmid:18064739
11. Jolliffe IT, Cadima J. Principal component analysis: a review and recent developments. Philos Trans R Soc A Math Phys Eng Sci. 2016 Apr 13;374(2065):20150202. pmid:26953178
12. Bauchner H, Golub RM, Zylke J. Editorial Concern—Possible Reporting of the Same Patients With COVID-19 in Different Reports. JAMA. 2020 Apr 7;323(13):1256. pmid:32176775
13. Califf RM, Hernandez AF, Landray M. Weighing the Benefits and Risks of Proliferating Observational Treatment Assessments: Observational Cacophony, Randomized Harmony. JAMA. 2020 Jul 31; pmid:32735313
14. Cheng MP, Lee TC, Tan DHS, Murthy S. Generating randomized trial evidence to optimize treatment in the COVID-19 pandemic. CMAJ. 2020;192(15):E405–7. pmid:32336678
15. Angus DC, Alexander BM, Berry S, Buxton M, Lewis R, Paoloni M, et al. Adaptive platform trials: definition, design, conduct and reporting considerations. Nat Rev Drug Discov. 2019;18(10):797–807. pmid:31462747
16. Dean NE, Gsell P-S, Brookmeyer R, Crawford FW, Donnelly CA, Ellenberg SS, et al. Creating a Framework for Conducting Randomized Clinical Trials during Disease Outbreaks. N Engl J Med. 2020 Apr 1;382(14):1366–9. pmid:32242365
17. Woodcock J, LaVange LM. Master Protocols to Study Multiple Therapies, Multiple Diseases, or Both. N Engl J Med. 2017 Jul 5;377(1):62–70. pmid:28679092
18. The Johns Hopkins University. 2019 Novel Coronavirus Research Compendium (NCRC) [Internet]. 2020. [cited 2020 Aug 25]. Available from: https://ncrc.jhsph.edu
19. Wang LL, Lo K, Chandrasekhar Y, Reas R, Yang J, Burdick D, et al. CORD-19: The COVID-19 Open Research Dataset [Preprint]. arXiv:2004.10706. 2020 Apr [cited 2020 Aug 25]. pmid:32510522
