Comparison of citation rates between Covid-19 and non-Covid-19 articles across 24 major scientific journals

Covid-19 has been front and center in the global landscape since the beginning of 2020. In response, the scientific community has dedicated enormous resources to researching the virus and its effects. The number of times Covid-19 publications are cited throughout the literature appears remarkably high but has not been directly compared to that of non-Covid-19 papers in the same journals over an extended period. In our study, we use Clarivate’s Web of Science Science Citation Index Expanded™ database to identify Covid-19 papers published in 24 major scientific journals over the 24 months from January 1, 2020 to December 31, 2021. We conduct our search using the keywords “Covid-19”, “coronavirus”, and “sars-cov-2” to locate publications with these words in the title. We then quantify the number of citations these papers have received and compare the rates to those of non-Covid-19 papers in the same journals over the same time frame. We find that, across 24 open-access and subscription-based scientific journals, Covid-19 papers published in the past 2 years currently have a median citation rate of 120.79 compared to 21.63 for non-Covid-19 papers. When negative binomial regression is used to minimize the influence of other variables such as article number variation and field of research, Covid-19 papers still show a more than 80% increase in citations relative to non-Covid-19 papers. These novel findings demonstrate that Covid-19 papers are being cited at remarkably higher rates than non-Covid-19 articles contained within the same journals. This suggests that journal impact factor, which is derived from the number of citations that recently published articles receive, will likely be drastically influenced by the number of Covid-19 papers that a journal has published in the previous years.


Introduction
In the early days of January 2020, the World Health Organization (WHO) published a bulletin that brought attention to 44 new cases of pneumonia of unknown etiology that had appeared at the end of December 2019 in Wuhan City, Hubei Province of China [1]. Regional cases increased rapidly over the next weeks, and scientists worldwide began to take notice. By the end of January, more than 50 research papers had been published about the outbreak [2]. In the coming months, while researchers scrambled to learn more about the coronavirus that was now spreading internationally, the WHO again released a bulletin on March 11, 2020, officially classifying Covid-19 as a global pandemic [3]. For months after, cases and fatalities rose at staggering clips as governing bodies worldwide grappled with the best measures to contain the virus. Eventually, strict regulations and vaccine rollouts were implemented with enough success to begin to slow case growth [4]. Amidst all of this, there has been an explosion of peer-reviewed literature about Covid-19 as researchers work to uncover details such as structure, infectivity, spread, effects, prevention, and treatment of this novel virus. According to the Web of Science Science Citation Index Expanded™ (WOS), which includes more than 8,300 journals across 150 scientific disciplines, nearly 200,000 Covid-19 related papers were published or presented in 24 months between January 1, 2020 and December 31, 2021 [5].
Unsurprisingly, several seminal Covid-19 publications have been cited at incredibly high rates as researchers have turned to these papers to help guide their next steps [6][7][8]. However, this citation trend in Covid-19 literature has not been limited to only a select few articles. As Covid-19 remains in the spotlight, the existing Covid-19 body of literature continues to be heavily leaned upon by scientists hoping to expand the existing knowledge further.
The extent to which Covid-19 papers are being cited in selected top journals relative to non-Covid-19 papers in the same journals has yet to be explored in detail. To address this gap in the literature, we use WOS to examine citation rates for Covid-19-related articles published between January 1, 2020 and December 31, 2021 across 24 major scientific journals. We compare these rates with citation rates for non-Covid-19 articles published in the same 24 journals over an identical time frame. We anticipate that these findings could be of value to journal editors and researchers when considering efforts for future publications. In addition, we expect these findings to shed light on the influence that the current influx of Covid-19 literature will have on Journal Impact Factor (JIF) in the years ahead.

Journal selection
Using information provided in Clarivate's 2020 Journal Citation Reports [9], published on June 30, 2021, the top three journals by impact factor were selected from each of eight scientific categories as defined by WOS. Disciplines were selected based on the likelihood of having high relevance to Covid-19 and, therefore, a sufficient volume of Covid-19-related papers in that field's premier journals. A baseline of 15 Covid-19 articles was set as a requirement for a journal to be included. If a top three journal did not meet this baseline, the next highest journal by impact factor from that field was included. The disciplines chosen were respiratory, cardiology, immunology, radiology, microbiology, gastroenterology and hepatology, general and internal medicine, and multidisciplinary sciences. With three journals each from eight fields, a total of 24 journals were selected (Table 1).

Data collection and sample randomization
The WOS search criteria were customized to include documents categorized as "Article" published between January 1, 2020 and December 31, 2021. The database tags papers as articles if they meet the following criteria: "Reports of research on original works. Includes research papers, features, brief communications, case reports, technical notes, chronology, and full papers that were published in a journal and/or presented at a symposium or conference" [5]. In addition, we included papers categorized by Web of Science as "Review" to capture meta-analyses and systematic reviews as well. All articles published within the 24-month time frame were selected to produce an average citation rate for the journal itself over that period. Next, articles without the terms "Covid-19", "coronavirus", or "SARS-CoV-2" in the title were selected to provide an average citation rate for only non-Covid-19 articles in the journal. Finally, articles containing the keywords "Covid-19", "coronavirus", or "SARS-CoV-2" in the title were selected to provide an average citation rate for Covid-19-related articles. In this way, three separate citation averages were gathered for each journal: 1) all articles, 2) non-Covid-19 articles, and 3) only Covid-19 articles. In many journals, there was a discrepancy between the number of Covid-19 and non-Covid-19 articles. To establish a comparison that was more evenly matched in terms of article volume, the randomization function in Microsoft Excel was used to select a sample of non-Covid-19 papers from each journal, creating a 1:1 comparison of citation rates between non-Covid-19 and Covid-19 papers. A single average citation rate was obtained from this smaller sample as well.
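The title-based split and the 1:1 random sample described above can be sketched in a few lines. This is an illustration only: the study used Microsoft Excel for randomization, so the Python functions below (`is_covid_title`, `split_and_match`), the fixed seed, and the case-insensitive matching are all assumptions made here, not the authors' procedure.

```python
import random

# Title keywords named in the text; lower-cased for case-insensitive matching
# (case-insensitivity is an assumption of this sketch).
KEYWORDS = ("covid-19", "coronavirus", "sars-cov-2")


def is_covid_title(title):
    """Return True if the title contains any of the search keywords."""
    lowered = title.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def split_and_match(titles, seed=0):
    """Split titles into Covid and non-Covid groups, then draw a random
    non-Covid sample of the same size as the Covid group (1:1 matching)."""
    covid = [t for t in titles if is_covid_title(t)]
    non_covid = [t for t in titles if not is_covid_title(t)]
    rng = random.Random(seed)  # seeded so the draw is reproducible
    sample_size = min(len(covid), len(non_covid))
    matched = rng.sample(non_covid, sample_size)  # sampling without replacement
    return covid, non_covid, matched
```

Applied to one journal's list of titles, `split_and_match` returns the Covid-19 group, the full non-Covid-19 group, and the size-matched non-Covid-19 sample from which a separate average could be computed.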

Data analysis
Data were analyzed with R software version 4.1.2 [10] using the packages Rcmdr [11] and glm2 [12]. Results were considered statistically significant when the P-value was < 0.05. Median and range were used to represent continuous variables (not normally distributed), while frequencies and percentages were used to represent categorical variables. Skewness and kurtosis tests were used to assess the normality of continuous variables. We estimated the effect of Covid-19 subject on citation counts using a negative binomial regression model. The negative binomial regression model was selected over a linear regression model because it resulted in a better fit to the data and was more appropriate for count data. The negative binomial regression model is similar to the Poisson regression model (for count data) except that it performs better with over-dispersed data [13,14]. The model was also used to assess any differences in citation rates attributed to the field category. Finally, the model was adjusted to account for the discrepancy in volume between non-Covid-19 papers and Covid-19 papers.
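For orientation, a negative binomial regression of this kind is commonly written as below. This is a sketch under assumptions: the text does not state the link function, the parameterization, or the exact covariates, so the log link, the quadratic variance form, and the symbols $\beta_1$, $\gamma_k$, and $\alpha$ are notation introduced here for illustration. The adjustment for article volume is not represented.

```latex
\begin{align}
\log \mu_i &= \beta_0 + \beta_1\,\mathrm{Covid}_i + \sum_{k} \gamma_k\,\mathrm{Field}_{ik}, \\
\operatorname{Var}(Y_i) &= \mu_i + \alpha\,\mu_i^{2}, \\
\text{percent change in citations} &= 100\,\bigl(e^{\beta_1} - 1\bigr).
\end{align}
```

Here $Y_i$ is the citation count of article $i$ and $\mu_i$ its expected value, $\mathrm{Covid}_i$ is 1 for a Covid-19 article and 0 otherwise, and $\mathrm{Field}_{ik}$ are indicator variables for field category. Setting $\alpha = 0$ recovers the Poisson model, which is why the negative binomial form handles over-dispersion better. Under this form, an increase in citations of more than 80% corresponds to $e^{\beta_1} > 1.8$.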

Results
Comparisons between non-Covid-19 and Covid-19 articles including all fields are shown in Table 2. The median citation rate at two years for Covid-19 articles in the top journals across all eight fields is 120.79 (p < 0.001). For non-Covid-19 articles, the median citation rate is 21.63 (p < 0.001). This equates to a Covid-19 to non-Covid-19 citation ratio of 5.58. When comparing Covid-19 papers and the 1:1 randomized sample of non-Covid-19 papers, the median citation rate for non-Covid-19 papers decreases from 21.63 to 20.1, and the citation ratio between Covid-19 and non-Covid-19 papers climbs to 6.01 (p < 0.001).
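The two ratios follow directly from the reported medians, as the short check below shows. The variable and function names are chosen here for illustration; only the three median values come from the text.

```python
# Median citation rates reported in the Results.
covid_median = 120.79
non_covid_median_all = 21.63      # all non-Covid-19 articles
non_covid_median_matched = 20.1   # 1:1 randomized non-Covid-19 sample


def citation_ratio(covid, non_covid):
    """Ratio of Covid-19 to non-Covid-19 median citation rates, to 2 decimals."""
    return round(covid / non_covid, 2)


ratio_all = citation_ratio(covid_median, non_covid_median_all)
ratio_matched = citation_ratio(covid_median, non_covid_median_matched)
print(ratio_all, ratio_matched)  # 5.58 6.01
```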
A negative binomial regression model was used to assess for potential confounding variables (

Discussion
We used WOS to determine the difference in rates at which Covid-19 and non-Covid-19 articles from 24 top medical journals are being cited. Looking at all categories and journals that were included, Covid-19 articles published between January 1, 2020 and December 31, 2021 are being cited at considerably higher rates than the non-Covid-19 papers. Across eight selected fields, the median citation rate for a Covid-19 paper approaches six times that of a non-Covid-19 paper within the same journals. This holds even when article volume is equated using a 1:1 sample of non-Covid-19 to Covid-19 articles, as the median citation rate is six times greater for the Covid-19 articles. Using a negative binomial regression model to analyze the entire data set, Covid-19 papers have 84% more citations than non-Covid-19 papers when controlling for field and article number discrepancies. This number dips only slightly to 82% when the smaller, randomized sample is compared. Amongst all fields, only articles in the WOS category "Medicine, General & Internal" see a bump (28%) in citation rates that can be attributed to the field itself. Thus, an article's focus on Covid-19 seems to be a primary driver of increased citations for these articles compared to those that do not deal with a Covid-19-related topic.
Reasons behind the major increase in citations for Covid-19 articles seem straightforward. Covid-19 has dominated global focus since the onset of 2020, affecting over 200 countries across the world [15]. As such, researchers are eager to add their contributions to what is known and what can be done to combat Covid-19 and, to do this, are citing earlier works to support their approaches. This eagerness for more knowledge is further bolstered by increased government funding for Covid-19 research. In the United States, the Coronavirus Aid, Relief, and Economic Security (CARES) Act was signed into law on March 27, 2020 and allocated 940 million dollars to the National Institutes of Health (NIH) to be used for funding Covid-19 research [16]. Thus, more funding opportunities are available at the researcher level for research that focuses on Covid-19. Another potential reason for the spike in Covid-19 citations is the exhibited capacity of the virus to mutate rapidly. As new variants appear, researchers are offered the chance to publish new epidemiological or variant characterization studies.
Previous studies have looked at citation rates for preprints of Covid-19 papers [17] and quality of evidence contained within published Covid-19 articles [18,19]. Additionally, citation rate for Covid-19 papers has been assessed previously without direct comparison to non-Covid-19 papers in the same journals [20]. To our knowledge, this study is the first to quantify the rate at which a large volume of peer-reviewed Covid-19 articles are being cited, on average, in major medical journals and compare these results to non-Covid-19 articles in the same journals. In doing this, we demonstrate the sharp contrast between citation rates for Covid-19 and non-Covid-19 articles published in 24 months during the rise and height of the global pandemic. While our research reveals the substantial degree to which Covid-19 articles are being cited in top journals relative to non-Covid-19 articles, the effect on journals themselves remains to be seen.
One way that these effects may be observed is through the influence that Covid-19 papers will have on JIF. JIF is a commonly used surrogate for journal excellence and is calculated by dividing the number of citations in the current year to articles published in the previous two years by the total number of articles published in that journal during the previous two years [21]. Therefore, it is a direct measure of how many citations recently published articles in a given journal receive. As we have shown, Covid-19 papers across all selected fields are being cited at a vastly increased rate compared to non-Covid-19 papers within the same journals. We would estimate that as Covid-19 papers continue to flood the literature across all scientific fields, and so long as Covid-19 continues to hold the global spotlight, Covid-19 papers will continue to be cited, on average, at much higher levels than non-Covid-19 papers. We anticipate that JIFs will be affected in coming years, which could change the landscape for how journal excellence is determined in the future.
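The JIF definition given above can be written compactly as follows. The symbols $C_y(t)$ and $N_t$ are notation introduced here for illustration and do not appear in the cited source.

```latex
\begin{equation}
\mathrm{JIF}_{y} = \frac{C_{y}(y-1) + C_{y}(y-2)}{N_{y-1} + N_{y-2}}
\end{equation}
```

Here $C_y(t)$ is the number of citations received in year $y$ by articles the journal published in year $t$, and $N_t$ is the number of articles the journal published in year $t$. Because highly cited papers raise the numerator while each adds only one unit to the denominator, a larger share of heavily cited Covid-19 papers among a journal's recent output would be expected to raise its JIF.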

Limitations
For the scope of this paper, we used papers categorized as "Article" or "Review" in our WOS search to ensure that we were focusing on original works and evidence-based literature. This means that other publications such as letters, editorials, perspectives, and opinions were not included in our search criteria, which leaves a large number of publications out of the study. Inclusion of these would undoubtedly have increased the volume of papers and affected the results, though the direction or magnitude of change is unclear. Another limitation of this study is that open-access versus subscription-based journals were not filtered separately during the WOS search. Thus, data include a combination of open-access and subscription-based publications in the non-Covid-19 articles. This would seemingly be an impactful factor in how often these articles are being cited. However, a recent meta-analysis showed that the advantage of open access is debatable, with many studies showing no difference in citation rates and with quality and heterogeneity concerns posing challenges for generalization [22].
Finally, though WOS is a comprehensive and highly regarded database, it is not without shortcomings. Most notably, its categorization of a publication as "Article" seems imperfect. To examine this more closely, we randomly selected three of the journals included in this paper and manually searched each journal's website for Covid-19-related articles that the journal itself had categorized as an original piece of research over a 3-month period. We compared these to the papers under "Article" that WOS captured over the same 3-month period for that journal. We found that the WOS database included 76% of the Covid-19 papers contained within the journals themselves. Further, 91% of the papers under the "Article" category in the database were classified as original works by the journals themselves. In our sample, the database captured about three-quarters of the papers that should have been included but was more precise in labeling only reports of research on original work as "Article". Though our method of locating Covid-19 and non-Covid-19 papers in the selected journals was more expedient, manual scouring of the journals themselves over the 24 months would have optimized both the sensitivity and specificity of locating and comparing the most comprehensive list of publications possible.
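The two percentages reported above correspond to what are often called capture rate (sensitivity) and precision. A minimal sketch of how such figures could be computed from two lists of paper identifiers is shown below; the function name `capture_and_precision` and the identifiers in the usage note are hypothetical, and the study's underlying counts are not given in the text.

```python
def capture_and_precision(journal_ids, database_ids):
    """Compare a manually compiled list of original-research papers
    (journal_ids) against the papers a database labels "Article"
    (database_ids).

    Returns (capture, precision):
      capture   = share of the journal's papers found in the database
      precision = share of the database's papers confirmed by the journal
    """
    journal_set, database_set = set(journal_ids), set(database_ids)
    if not journal_set or not database_set:
        raise ValueError("both lists must contain at least one paper")
    overlap = journal_set & database_set  # papers present in both lists
    capture = len(overlap) / len(journal_set)
    precision = len(overlap) / len(database_set)
    return capture, precision
```

For example, with hypothetical identifiers, `capture_and_precision(["p1", "p2", "p3", "p4", "p5"], ["p1", "p2", "p3", "p4", "x1"])` gives a capture rate of 0.8 and a precision of 0.8.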