The impact of the COVID-19 pandemic on scientific research in the life sciences

The COVID-19 outbreak has posed an unprecedented challenge to humanity and science. On the one hand, public and private incentives have been put in place to promptly allocate resources toward research areas strictly related to the COVID-19 emergency. On the other hand, research in many fields not directly related to the pandemic has been displaced. In this paper, we assess the impact of COVID-19 on world scientific production in the life sciences and find indications that the usage of medical subject headings (MeSH) has changed following the outbreak. Using a difference-in-differences approach, we estimate the impact of the start of the COVID-19 pandemic on scientific production in the PubMed database (3.6 million research papers). We find that COVID-19-related MeSH terms have experienced a 6.5-fold increase in output on average, while publications on unrelated MeSH terms dropped by 10 to 12%. The negative effect on impact-weighted publication output is even more pronounced (-16% to -19%). Moreover, COVID-19 has displaced clinical trial publications (-24%) and diverted grants from research areas not closely related to COVID-19. Note that since COVID-19 publications may have been fast-tracked, the sudden surge in COVID-19 publications might be driven by editorial policy.


Introduction
The COVID-19 pandemic mobilized the world scientific community in 2020, especially in the life sciences [1,2]. In the first three months after the outbreak, the number of scientific papers about COVID-19 was fivefold the number of articles on H1N1 swine influenza [3]. Similarly, the number of clinical trials related to COVID-19 prophylaxis and treatments skyrocketed [4]. Thanks to the rapid mobilization of the world scientific community, COVID-19 vaccines have been developed in record time. Despite this undeniable success, there is a rising concern about the negative consequences of COVID-19 on clinical trial research, with many projects being postponed [5][6][7]. According to Evaluate Pharma, clinical trials were one of the pandemic's first casualties, with a record number of 160 studies suspended for reasons related to COVID-19 in April 2020 and a total of 1,200 trials reported as suspended as of July 2020 [8,9]. As a consequence, clinical researchers have been impaired by reduced access to healthcare research infrastructures. In particular, the COVID-19 outbreak took a toll on women and early-

Medical subject headings. We rely on the Medical Subject Headings (MeSH) terminology to approximate narrowly defined biomedical research fields. This terminology is a curated medical vocabulary, which is manually added to papers in the PubMed corpus. The fact that MeSH terms are manually annotated makes this terminology ideal for classification purposes. However, there is a delay between publication and annotation, on the order of several months. To address this delay and obtain the most recent classification, we search for all 28,425 MeSH terms using PubMed's ESearch utility and classify papers by the results. The specific API endpoint is https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi; the relevant scripts are available with the code. For example, we assign the term 'Ageusia' (MeSH ID D000370) to all papers listed in the ESearch results for that term.
We apply this method to the whole period (January 2019-December 2020) and obtain a mapping from papers to MeSH terms. For every MeSH term, we keep track of the year in which it was established. For instance, COVID-19 terms were established in 2020 (see Table 1): in January 2020, the WHO recommended 2019-nCoV and 2019-nCoV acute respiratory disease as provisional names for the virus and disease. The WHO issued the official terms COVID-19 and SARS-CoV-2 at the beginning of February 2020. Through manual annotation, all publications referring to COVID-19 and SARS-CoV-2 since January 2020 have been labelled with the related MeSH terms. Other MeSH terms related to COVID-19, such as coronavirus, were established years before the pandemic (see Table 2). We proxy MeSH term usage via search terms using the PubMed EUtilities API; that is, we rely on the PubMed search results rather than on the hand-labelled MeSH terms. As a result, the MeSH terms we assign to a given paper are not perfectly accurate. In practice, we have assigned more MeSH terms to a given paper than a human annotator would have.
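As a rough illustration of the search-based labelling described above, the following Python sketch builds an ESearch request for one MeSH term and turns a JSON response into a paper-to-term mapping. The endpoint is the one quoted in the text. The choice of query parameters (a plain-text term, a publication-date window, JSON output), the helper names and the sample response are assumptions made for illustration; they are not taken from the authors' scripts, and no network request is made here.

```python
import json
from urllib.parse import urlencode

# Endpoint quoted in the text.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(mesh_term, mindate="2019/01", maxdate="2020/12", retmax=10000):
    """Build an ESearch query URL for one term (hypothetical helper)."""
    params = {
        "db": "pubmed",
        "term": mesh_term,      # assumption: the term is searched as plain text
        "datetype": "pdat",     # filter on publication date
        "mindate": mindate,
        "maxdate": maxdate,
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


def parse_pmids(response_text):
    """Extract the list of PubMed IDs from an ESearch JSON response."""
    payload = json.loads(response_text)
    return payload["esearchresult"]["idlist"]


# Made-up response with two PubMed IDs, standing in for a real API reply.
sample_response = '{"esearchresult": {"count": "2", "idlist": ["111", "222"]}}'

url = build_esearch_url("Ageusia")

# Assign the searched term to every paper returned by the query.
paper_to_terms = {}
for pmid in parse_pmids(sample_response):
    paper_to_terms.setdefault(pmid, set()).add("Ageusia")
```

Repeating this for every MeSH term and merging the per-term results would give the paper-to-term mapping used in the rest of the analysis.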
Clinical trials and publication types. We classify publications using PubMed's 'PublicationType' field in the XML baseline files (there are 187 publication types, see https://www.nlm.nih.gov/mesh/pubtypes.html). We consider a publication to be related to a clinical trial if it lists any of the following descriptors:
In our analysis of the impact of COVID-19 on publications related to clinical trials, we only consider MeSH terms that are associated at least once with a clinical trial publication over the two years. We apply this restriction to filter out MeSH terms that are very unlikely to be relevant for clinical trial research.
Open access. We treat a journal article as available to the public, i.e., open access, if it is available from PubMed Central. PubMed Central archives full-text journal articles and provides free access to the public. Note that the copyright license may vary across participating publishers. However, the text of the paper is for all intents and purposes freely available without requiring subscriptions or special affiliation.
Grants. We infer whether a publication has been funded by checking whether it lists any grants. We classify grants as either 'old', i.e., existing before 2019, or 'new', i.e., first observed afterwards. To do so, we collect all grant IDs for 11,122,017 papers from 2010 onwards and record their first appearance. This procedure indirectly infers the year in which the grant was awarded. The basic assumption is that if a grant number was not listed in any publication from 2010 to 2018, it is very likely a new grant. Specifically, an old grant is a grant listed in a publication since 2019 that was also observed at least once from 2010 to 2018.
Note that this procedure is only approximate and has a few shortcomings. Mistyped grant numbers (e.g., '1234-M JPN' and '1234-M-JPN') could appear as new grants even though they existed before, and new grants might be classified as old grants if they have a common ID (e.g., 'Grant 1'). Unfortunately, there is no central repository of grant numbers and the associated metadata; however, there are plans to assign DOI numbers to grants to alleviate this problem (see https://gitlab.com/crossref/open_funder_registry for the project).
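The first-appearance rule for grants can be sketched in a few lines of Python. The record layout (a `year` and a list of `grants` per paper), the function names and the toy data are hypothetical; the 2019 cutoff follows the definition in the text.

```python
def first_grant_years(papers):
    """Map each grant ID to the earliest publication year in which it is listed."""
    first_seen = {}
    for paper in papers:
        for grant_id in paper["grants"]:
            year = paper["year"]
            if grant_id not in first_seen or year < first_seen[grant_id]:
                first_seen[grant_id] = year
    return first_seen


def classify_grant(grant_id, first_seen, cutoff_year=2019):
    """Label a grant 'old' if it was first observed before the cutoff, else 'new'."""
    return "old" if first_seen[grant_id] < cutoff_year else "new"


# Toy records; grant IDs are compared as raw strings, so spelling variants
# such as '1234-M JPN' and '1234-M-JPN' are treated as different grants.
toy_papers = [
    {"year": 2015, "grants": ["1234-M JPN"]},
    {"year": 2020, "grants": ["1234-M JPN", "1234-M-JPN"]},
    {"year": 2020, "grants": ["R01-NEW"]},
]

first_seen = first_grant_years(toy_papers)
labels = {grant_id: classify_grant(grant_id, first_seen) for grant_id in first_seen}
```

In the toy data, the mistyped variant '1234-M-JPN' is labelled as new even though the underlying grant appeared in 2015, which is exactly the shortcoming noted above.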
Impact factor weighted publication numbers (IFWN). In our analysis, we consider two measures of scientific output. First, we simply count the number of publications by MeSH term. However, since journals vary considerably in terms of impact factor, we also weigh each publication by the impact factor of the venue (e.g., journal) where it was published. Specifically, we use the SCImago journal ranking statistics to weigh a paper by the impact factor of the journal it appears in. We use the 'citation per document in the past two years' for 45,230 ISSNs. Note that a journal may have, and often has, more than one ISSN, i.e., one for the printed edition and one for the online edition. SCImago applies the same score for a venue across linked ISSNs.
For the impact factor weighted number (IFWN) of publications per MeSH term, this means that each publication is replaced by the impact score of the journal it appears in, and these scores are summed up.
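A minimal sketch of the two output measures, assuming each paper record carries an ISSN and a list of MeSH terms, and that journal scores are available as a dictionary keyed by ISSN. The field names, the toy scores and the choice to give a weight of zero to papers whose ISSN has no score are assumptions for illustration.

```python
from collections import defaultdict


def output_per_mesh(papers, journal_score):
    """Return (plain counts, impact factor weighted counts) per MeSH term."""
    counts = defaultdict(int)
    ifwn = defaultdict(float)
    for paper in papers:
        # Assumption: a paper in a venue without a score contributes weight 0.
        score = journal_score.get(paper["issn"], 0.0)
        for term in paper["mesh"]:
            counts[term] += 1      # first measure: number of publications
            ifwn[term] += score    # second measure: sum of journal scores
    return dict(counts), dict(ifwn)


# Toy scores standing in for 'citations per document in the past two years'.
toy_scores = {"0000-0001": 2.0, "0000-0002": 5.5}
toy_papers = [
    {"issn": "0000-0001", "mesh": ["Ageusia", "Quarantine"]},
    {"issn": "0000-0002", "mesh": ["Ageusia"]},
    {"issn": "9999-9999", "mesh": ["Quarantine"]},  # venue without a score
]

counts, ifwn = output_per_mesh(toy_papers, toy_scores)
```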
COVID-19-relatedness. To measure how closely a MeSH term is related to COVID-19, we introduce an index of relatedness to COVID-19. First, we identify the focal COVID-19 terms, which appeared in the literature in 2020 (see Table 1). Next, for all other pre-existing MeSH terms, we measure how closely related to COVID-19 they end up being.
Our aim is to show that MeSH terms that existed before and are related have experienced a sudden increase in the number of (impact factor weighted) papers.
We define a MeSH term's COVID-19 relatedness as the conditional probability that, given its appearance on a paper, one of the focal COVID-19 terms listed in Table 1 is also present. In other words, the relatedness of a MeSH term is given by the probability that a COVID-19 MeSH term appears alongside it. Since the focal COVID-19 terms did not exist before 2020, we estimate this measure only using papers published since January 2020. Formally, we define COVID-19-relatedness (σ) as in Eq (1), where C is the set of papers listing a COVID-19 MeSH term and M_i is the set of papers listing MeSH term i.
Intuitively we can read this measure as: what is the probability in 2020 that a COVID-19 MeSH term is present given that we chose a paper with MeSH term i? For example, given that in 2020 we choose a paper dealing with "Ageusia" (i.e., Complete or severe loss of the subjective sense of taste), there is a 96% probability that this paper also lists COVID-19, see Table 1.
Note that a paper listing a related MeSH term does not imply that that paper is doing COVID-19 research, but it implies that one of the MeSH terms listed is often used in COVID-19 research.
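Read as a conditional probability, the relatedness of term i can be estimated as the share of 2020 papers listing term i that also list a focal COVID-19 term, i.e., σ_i = |C ∩ M_i| / |M_i|. The sketch below computes this share from a list of per-paper MeSH sets. The function name and the toy papers are hypothetical, and the formula is our reading of the verbal definition rather than a copy of Eq (1).

```python
def covid_relatedness(papers, focal_terms):
    """Estimate sigma_i = |C and M_i| / |M_i| for every non-focal MeSH term."""
    n_with_term = {}   # |M_i|: papers listing term i
    n_with_both = {}   # |C and M_i|: papers listing term i and a focal term
    for mesh_terms in papers:
        has_focal = bool(mesh_terms & focal_terms)
        for term in mesh_terms - focal_terms:
            n_with_term[term] = n_with_term.get(term, 0) + 1
            if has_focal:
                n_with_both[term] = n_with_both.get(term, 0) + 1
    return {term: n_with_both.get(term, 0) / n for term, n in n_with_term.items()}


# Toy 2020 papers, each represented by its set of MeSH terms.
focal = {"COVID-19"}
papers_2020 = [
    {"COVID-19", "Ageusia"},
    {"COVID-19", "Ageusia", "Quarantine"},
    {"Quarantine"},
    {"Vaping"},
]

sigma = covid_relatedness(papers_2020, focal)
```

In this toy corpus every 'Ageusia' paper also lists COVID-19, half of the 'Quarantine' papers do, and no 'Vaping' paper does.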
Variables. In sum, in our analysis, we use the following variables:

Difference-in-differences
The difference-in-differences (DiD) method is an econometric technique to imitate an experimental research design from observational data, sometimes referred to as a quasi-experimental setup. In a randomized controlled trial, subjects are randomly assigned either to the treated or the control group. Analogously, in this natural experiment, we assume that medical subject headings (MeSH) have been randomly assigned to be either treated (related) or not treated (unrelated) by the pandemic crisis. Before COVID-19, the set of medical knowledge that a future health crisis would affect was not predictable, since it depended on the specifics of the emergency. For instance, ageusia (loss of taste), a medical concept existing since 1991, became known as a specific symptom of COVID-19 only after the pandemic began.
Specifically, we exploit the COVID-19 pandemic as an unpredictable and exogenous shock that has deeply affected the publication priorities for biomedical scientific production, as compared to the situation before the pandemic. In this setting, COVID-19 is the treatment, and the identification of this new human coronavirus is the event. We claim that treated MeSH terms, i.e., MeSH terms related to COVID-19, have experienced a sudden increase in terms of scientific production and attention. In contrast, research on untreated MeSH terms, i.e., MeSH terms not related to COVID-19, has been displaced by COVID-19. Our analysis compares the scientific output of COVID-19 related and unrelated MeSH terms before and after January 2020.
Consider the simple regression model in Eq (2). We have an outcome Y and a dummy variable P identifying the period as before the event (P = 0) or after the event (P = 1). Additionally, we have a dummy variable G identifying whether an observation belongs to the treated group (G = 1) or the control group (G = 0).
In our case, some of the terms turn out to be related to COVID-19 in 2020, whereas most of the MeSH terms are not closely related to COVID-19.
Thus β_1 identifies the overall effect on the control group after the event, β_2 the difference between the treated and control groups before the event (i.e., the first difference in DiD), and β_3 the effect on the treated group after the event, net of the first difference. This last parameter identifies the treatment effect on the treated group, netting out the pre-treatment difference.
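Eq (2) is not reproduced in this copy of the text. The parameter description matches the standard two-group, two-period specification Y = β_0 + β_1 P + β_2 G + β_3 (P × G) + ε, and we assume that form here. In such a saturated model, the least-squares coefficients equal differences of the four cell means, which the following sketch computes directly. The function name and the toy observations are made up for illustration.

```python
from statistics import mean


def did_from_cell_means(observations):
    """Compute (beta0, beta1, beta2, beta3) from (G, P, Y) observations.

    G is the treatment dummy, P the post-event dummy and Y the outcome.
    """
    cells = {(0, 0): [], (0, 1): [], (1, 0): [], (1, 1): []}
    for g, p, y in observations:
        cells[(g, p)].append(y)
    m = {key: mean(values) for key, values in cells.items()}
    beta0 = m[(0, 0)]                    # control group, before the event
    beta1 = m[(0, 1)] - m[(0, 0)]        # change in the control group
    beta2 = m[(1, 0)] - m[(0, 0)]        # pre-event gap between the groups
    beta3 = (m[(1, 1)] - m[(1, 0)]) - beta1  # DiD: treated change net of control change
    return beta0, beta1, beta2, beta3


# Toy log-outcomes: the control group declines, the treated group rises.
toy = [
    (0, 0, 4.0), (0, 0, 4.0),
    (0, 1, 3.5), (0, 1, 3.5),
    (1, 0, 5.0), (1, 0, 5.0),
    (1, 1, 7.0), (1, 1, 7.0),
]

beta0, beta1, beta2, beta3 = did_from_cell_means(toy)
```

Here the treated group rises by 2.0 while the control group falls by 0.5, so the DiD estimate β_3 is 2.5.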
For the DiD to have a causal interpretation, the pre-event trends of the two groups must be parallel, i.e., the common trend assumption (CTA) must be satisfied. We will show that the CTA holds in the results section.
To specify the DiD model, we need to define a period before and after the event and assign a treatment status or level of exposure to each term.
Before and after. The pre-treatment period is defined as January 2019 to December 2019. The post-treatment period is defined as the months from January 2020 to December 2020. We argue that the state of biomedical research was similar in those two years, apart from the effect of the pandemic.
Treatment status and exposure. The treatment is determined by the COVID-19 relatedness index σ_i introduced earlier. Specifically, this number indicates the likelihood that COVID-19 will be a listed MeSH term, given that we observe the focal MeSH term i. To show that the effect becomes stronger the more closely related the subject is, and for ease of interpretation, we also discretize the relatedness value into three levels of treatment. Namely, we group MeSH terms by their σ into three ranges: 0% to 20%, 20% to 80%, and 80% to 100%. The choice of alternative grouping strategies does not significantly affect our results. Results for alternative thresholds of relatedness can be computed using the available source code. We complement the discretized analysis by using the treatment intensity (relatedness measure σ) to show that the result persists.
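The grouping into three treatment levels can be sketched as a small helper. The thresholds are those given in the text; which group a value exactly at 20% or 80% falls into is not stated, so assigning boundary values to the higher group is an assumption, as is the function name.

```python
def treatment_level(sigma, low=0.20, high=0.80):
    """Map a relatedness value in [0, 1] to treatment level 1, 2 or 3."""
    if not 0.0 <= sigma <= 1.0:
        raise ValueError("sigma must lie in [0, 1]")
    if sigma < low:
        return 1   # least related group (control)
    if sigma < high:
        return 2   # intermediate relatedness
    return 3       # most closely related group


levels = {term: treatment_level(s) for term, s in
          {"Vaping": 0.0, "Quarantine": 0.5, "Ageusia": 0.96}.items()}
```

Passing different values for `low` and `high` gives the alternative groupings mentioned above.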
Panel regression. In this work, we estimate a random effects panel regression where the units of analysis are 28,318 biomedical research fields (i.e., MeSH terms) observed over time before and after the COVID-19 pandemic. The time resolution is at the monthly level, meaning that for each MeSH term, we have 24 observations from January 2019 to December 2020.
The basic panel regression with continuous treatment follows a similar setup as Eq (2) but with MeSH term random effects and monthly fixed effects.
The outcome variable Y_it identifies the outcome at time t (i.e., month) for MeSH term i. As before, P_t identifies the period, with P_t = 0 if the month is before January 2020 and P_t = 1 if it is on or after this date. In Eq (3), the treatment level is measured by the relatedness to COVID-19 (σ_i), where again γ_1 identifies pre-trend (constant) differences and δ_1 the overall effect.
As mentioned before, to highlight that the effect is not linear but increases with relatedness, we split σ into three groups: from 0% to 20%, 20% to 80% and 80% to 100%. In the three-level treatment specification, the number of treatment levels (G_i) is 3; hence we have two γ parameters. Note that I(·) is the indicator function, which is 1 if the argument is true, and 0 otherwise.
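The displayed equations for the continuous and three-level specifications did not survive in this copy of the text. One way to write them that is consistent with the description above is sketched below. This is a reconstruction under assumptions, not the authors' exact notation: u_i denotes the MeSH term random effect, ε_it the error term, the least related group (level 1) is taken as the baseline so that the two γ and two δ parameters are indexed by l = 2, 3, and the monthly fixed effects mentioned in the text are omitted from the display.

```latex
\begin{align}
  % Continuous treatment, cf. Eq (3)
  Y_{it} &= \beta_0 + \beta_1 P_t + \gamma_1 \sigma_i
            + \delta_1 \, (P_t \times \sigma_i) + u_i + \varepsilon_{it} \\
  % Three-level treatment, cf. Eq (4)
  Y_{it} &= \beta_0 + \beta_1 P_t
            + \sum_{l=2}^{3} \gamma_l \, I(G_i = l)
            + \sum_{l=2}^{3} \delta_l \, P_t \, I(G_i = l)
            + u_i + \varepsilon_{it}
\end{align}
```

Under this reading, the second specification has six coefficients: β_0, β_1, γ_2, γ_3, δ_2 and δ_3.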
In total, we estimate six coefficients. As before, the δ_l coefficients identify the DiD effect.

Verifying the Common Trend Assumption (CTA). To show that the pre-event trends are parallel and that the effect on publication activity is only visible from January 2020, we estimate a panel regression with each month modelled as a different event. Specifically, we estimate the following model.
We show that the CTA holds for this model by comparing the pre-event trends of the control group to the treated groups (COVID-19 related MeSH terms). Namely, we show that the pre-event trends of the control group are the same as the pre-event trends of the treated group.

Co-occurrence analysis
To investigate whether the pandemic has caused a reconfiguration of research priorities, we look at the MeSH term co-occurrence network. Specifically, we extract the co-occurrence network of all 28,318 MeSH terms as they appear in the 3.3 million papers. We consider the co-occurrence networks of 2018, 2019 and 2020. In these networks, each node represents a MeSH term, and a link between two nodes indicates that the terms have been observed together at least once. The weight of the edge between two MeSH terms is given by the number of times those terms have been jointly observed in the same publications.
Medical language is hugely complicated, and this simple representation does not capture the intricacies, subtle nuances and, in fact, the meaning of the terms. Therefore, we do not claim that we can identify from this object how the actual usage of MeSH terms has changed, but rather that it has changed. Nevertheless, the co-occurrence graph captures rudimentary relations between concepts. We argue that, absent a shock to the system, the basic usage patterns of MeSH terms and their importance within the network would remain essentially the same from year to year. However, if we find that the importance of terms changes more than expected in 2020, it stands to reason that there have been some significant changes.
To show that MeSH usage has been affected, we compute the PageRank centrality [17] of each term in the years 2018, 2019 and 2020. The PageRank centrality tells us how likely a random walker traversing a network would be found at a given node if she follows the weights of the empirical edges (i.e., co-usage probability). Specifically, for the case of the MeSH co-occurrence network, this number represents how often an annotator at the National Library of Medicine would assign that MeSH term following the observed general usage patterns. It is a simplistic measure to capture the complexities of biomedical research. Nevertheless, it captures far-reaching interdependence across MeSH terms, as the measure uses the whole network to determine the centrality of every MeSH term. A sudden change in the rankings, and thus in the position of MeSH terms in this network, suggests that a given research subject has risen as it is used more often with other important MeSH terms (or vice versa).
To show that COVID-19-related research has profoundly impacted the way MeSH terms are used, we compute for each MeSH term the change in its PageRank centrality (p_it).
We then compare, for each MeSH term i, the growth before the COVID-19 pandemic, g_i(2019), with the growth after the event, g_i(2020).
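To make the construction concrete, the sketch below builds a weighted co-occurrence network from per-paper MeSH sets and computes PageRank by power iteration in plain Python. It is an illustration only: the damping factor of 0.85, the fixed number of iterations, the function names and the toy papers are assumptions, and the original analysis relied on a dedicated graph library rather than this code.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_weights(papers):
    """Edge weight = number of papers in which two MeSH terms appear together."""
    weights = defaultdict(lambda: defaultdict(int))
    for mesh_terms in papers:
        for a, b in combinations(sorted(mesh_terms), 2):
            weights[a][b] += 1
            weights[b][a] += 1
    return weights


def pagerank(weights, damping=0.85, iterations=100):
    """Weighted PageRank on an undirected graph via power iteration."""
    nodes = sorted(weights)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    # Total edge weight attached to each node; positive for every node here.
    strength = {v: sum(weights[v].values()) for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            for u, w in weights[v].items():
                # The walker moves from v to u in proportion to the edge weight.
                new_rank[u] += damping * rank[v] * w / strength[v]
        rank = new_rank
    return rank


# Toy corpus in which term 'A' co-occurs with every other term.
toy_papers = [{"A", "B"}, {"A", "C"}, {"A", "D"}]
weights = cooccurrence_weights(toy_papers)
ranks = pagerank(weights)
```

Terms that never co-occur with another term do not enter this toy network. Running the same computation on the networks of consecutive years would give the per-term centralities whose changes are compared.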

Publication growth
To estimate growth in scientific output, we compute the year-over-year growth in the impact weighted number of publications per MeSH term. Specifically, we measure the year-over-year growth as defined below, where m is the impact weighted number of publications at time t.

Changes in output and COVID-19 relatedness
Before we show the regression results, we provide descriptive evidence that publications have drastically increased from 2019 to 2020. By showing that this growth correlates strongly with a MeSH term's COVID-19 relatedness (σ), we (1) demonstrate that σ captures an essential aspect of the growth dynamics and (2) highlight the meteoric rise of highly related terms. We look at the year-over-year growth in the impact weighted number of publications per MeSH term from 2018 to 2019 and from 2019 to 2020, as defined in the methods section. Fig 1 shows the yearly growth of the impact weighted number of publications per MeSH term. By comparing the growth of the number of publications across the years 2018, 2019 and 2020, we find that the impact factor weighted number of publications has increased by up to a factor of 100 compared to the previous year for Betacoronavirus, one of the MeSH terms most closely related to COVID-19. Fig 1, first row, reveals how strongly the growth in the IFWN of publications is correlated with the term's COVID-19 relatedness. For instance, we see that the term 'Betacoronavirus' skyrocketed from 2019 to 2020, which is expected given that SARS-CoV-2 is a species of the genus. Conversely, the term 'Alphacoronavirus' has not experienced any growth: although it is a twin genus in the Coronaviridae family, SARS-CoV-2 is not one of its species. Note also the fast growth in the number of publications dealing with 'Quarantine'. Moreover, MeSH terms that grew significantly from 2018 to 2019 and were not closely related to COVID-19, like 'Vaping', slowed down in 2020. From the graph, the picture emerges that publication growth is correlated with COVID-19 relatedness σ and that the growth for less related terms slowed down.
To show that the usage pattern of MeSH terms has changed following the pandemic, we compute the PageRank centrality using graph-tool [18] as discussed in the Methods section. Fig 1, second row, shows the change in the PageRank centrality of the MeSH terms after the pandemic (2019 to 2020, right plot) and before (2018 to 2019, left plot). If there were no change in the general usage pattern, we would expect the variance in PageRank changes to be similarly narrow in both periods (see left plot). However, PageRank scores changed significantly more from 2019 to 2020 than from 2018 to 2019, suggesting that there has been a reconfiguration of the network.
To further support this argument, we carry out a DiD regression analysis.

Common trends assumption
As discussed in the Methods section, we need to show that the CTA holds for the DiD to be defined appropriately. We do this by estimating for each month the number of publications and comparing it across treatment groups. This exercise also serves the purpose of a placebo test. By assuming that each month could have potentially been the event's timing (i.e., the outbreak), we show that January 2020 is the most likely timing of the event. The regression table, as noted earlier, contains over 70 estimated coefficients; hence, for ease of reading, we only show the predicted outcome per month by group (see Fig 2). The full regression table with all coefficients is available in the S1 Table.

PLOS ONE
All outcome measures depict a similar trend per month. Before the event (i.e., January 2020), there is a common trend across all groups. In contrast, after the event, we observe a sudden rise in the outcomes of the COVID-19 related treated groups (green and red lines) and a decline in the outcomes of the unrelated group (blue line). Therefore, we can conclude that the CTA holds. Table 3 shows the DiD regression results (see Eq (3)) for the selected outcome measures: number of publications (Papers), impact factor weighted number of publications (Impact), open access (OA) publications, clinical trial related publications, and publications with existing grants. Table 4 shows results for the discrete treatment level version of the DiD model (see Eq (4)).

Regression results
Note that the outcome variable is in natural log scale; hence to get the effect of the independent variable, we need to exponentiate the coefficient. For values close to 0, the effect is well approximated by the percentage change of that magnitude.
In both specifications, we see that the number of publications in the least related group drops by between 10% and 13% (first row of Tables 3 and 4; for example, exp(−0.102) ≈ 0.90). In line with our expectations, the number of papers published per MeSH term increases with relatedness to COVID-19. In the discrete model (row 2), the number of documents with MeSH terms with a COVID-19 relatedness between 20% and 80% grows by 18%, and the number for highly related terms grows by a factor of approximately 6.6 (exp(1.88)). The same general pattern can be observed for the impact weighted publication number, i.e., Model (2). Note, however, that the drop in the impact factor weighted output is more pronounced, reaching -19% for COVID-19 unrelated publications, while related publications grow by a factor of 8.7. This difference suggests that there might be a bias toward publishing papers on COVID-19 related subjects in high impact factor journals. Looking at the number of open access publications (PMC), we note that the least related group has not been affected negatively by the pandemic. However, the number of open access publications has increased drastically for the most COVID-19 related group, by a factor of 6.2. Note that the substantial increase in the number of papers available through open access is in large part due to journal and editorial policies to preferentially make COVID-19 research immediately available to the public.
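Because the outcomes are in natural logs, a coefficient b corresponds to a multiplicative factor exp(b) and a percentage change of (exp(b) − 1) × 100. The short sketch below applies this conversion to the two coefficients quoted above; the helper name is hypothetical.

```python
import math


def log_coefficient_to_effect(coef):
    """Convert a coefficient on a log outcome to (factor, percent change)."""
    factor = math.exp(coef)
    return factor, (factor - 1.0) * 100.0


# Coefficients quoted in the text for the least and most related groups.
unrelated_factor, unrelated_pct = log_coefficient_to_effect(-0.102)
related_factor, related_pct = log_coefficient_to_effect(1.88)

print(f"-0.102 -> factor {unrelated_factor:.2f}, change {unrelated_pct:.1f}%")
print(f" 1.88  -> factor {related_factor:.2f}")
```

For the small coefficient the percentage change (about −9.7%) is close to the coefficient itself, whereas for the large coefficient the factor (about 6.55) must be obtained by exponentiation.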
Regarding the number of clinical trial publications, we note that the least related group has been affected negatively, with the number of publications on clinical trials dropping by a staggering 24%. At the same time, publications on clinical trials for COVID-19-related MeSH terms have increased by a factor of 2.1. Note, however, that the effect on clinical trials is not significant in the continuous regression. The discrepancy across Tables 3 and 4 highlights that, especially for trials, the effect is not linear: only the publications on clinical trials closely related to COVID-19 experience a boost.
It has been reported [19] that, while the number of clinical trials registered to treat or prevent COVID-19 has surged, with 179 new registrations in the second week of April 2020 alone, only a few of these have led to publishable results in the 12 months since [20]. On the other hand, we find that clinical trial publications, considering related MeSH terms (but not COVID-19 directly), have grown significantly since the beginning of the pandemic. These results are not contradictory. Indeed, counting the number of clinical trial publications listing the exact COVID-19 MeSH term (D000086382), we find 212 publications. While this might seem like a

[Regression table; columns: ln(Papers), ln(Impact), ln(PMC), ln(Trials), ln(Grants), ln(Old Grants); first row label: After]

Research funding, as proxied by the number of publications with grants, follows a similar pattern. Notably, COVID-19-related MeSH terms list the same proportion of grants established before 2019 as unrelated MeSH terms, suggesting that grants which were not designated for COVID-19 research have been used to support COVID-19-related research. Overall, the number of publications listing a grant has dropped. This is to be expected, since the overall number of publications in the unrelated group has dropped. However, the drop in publications is 10%, while the decline in publications with at least one grant is 15%. This difference suggests that publications listing grants, which should have more funding, are disproportionately COVID-19-related papers. To further investigate this aspect, we look at whether the grant was old (pre-2019) or appeared for the first time in or after 2019. It stands to reason that an old grant (pre-2019) would not have been awarded for a project dealing with the pandemic. Hence, we would expect COVID-19-related MeSH terms to have a lower proportion of old grants than the unrelated group. In model (6) of Table 4, we show that the number of old grants for the unrelated group drops by 13%. At the same time, the number of papers listing old grants (i.e., pre-2019) among the most related group increased by a factor of 3.1. Overall, these results suggest that COVID-19-related research has been funded largely by pre-existing grants, even though a specific mandate tied to the grants for this use is unlikely.

Discussion
The scientific community has swiftly reallocated research efforts to cope with the COVID-19 pandemic, mobilizing knowledge across disciplines to find innovative solutions in record time. We document this both in terms of changing trends in the biomedical scientific output and the usage of MeSH terms by the scientific community. The flip side of this sudden and energetic prioritization of effort to fight COVID-19 has been a sudden contraction of scientific production in other relevant research areas. All in all, we find strong support for the hypothesis that the COVID-19 crisis has induced a sudden increase of research output in COVID-19 related areas of biomedical research. Conversely, research in areas not related to COVID-19 has experienced a significant drop in overall publishing rates and funding.
Our paper contributes to the literature on the impact of COVID-19 on scientific research: we corroborate previous findings about the surge of COVID-19 related publications [1][2][3], which has partially displaced research in COVID-19 unrelated fields [4,14], particularly research related to clinical trials [5][6][7]. The drop in trial research might have severe consequences for patients affected by life-threatening diseases, since it will delay access to new and better treatments. We also confirm the impact of COVID-19 on open access publication output [1], although this effect is milder than for traditional outlets. On top of this, we provide more robust evidence on the impact weighted effect of COVID-19 and on grant financed research, highlighting the strong displacement effect of COVID-19 on the allocation of financial resources [15]. We document a substantial change in the usage patterns of MeSH terms, suggesting that there has been a reconfiguration in the way research terms are being combined. MeSH terms highly related to COVID-19 were peripheral in the MeSH usage networks before the pandemic but have become central since 2020. We conclude that the usage patterns have changed, with COVID-19 related MeSH terms occupying a much more prominent role in 2020 than they did in the previous years.
We also contribute to the literature by estimating the effect of COVID-19 on biomedical research in a natural experiment framework, isolating the specific effects of the COVID-19 pandemic on the biomedical scientific landscape. This is crucial to identify areas of public intervention to sustain areas of biomedical research which have been neglected during the COVID-19 crisis. Moreover, the exploratory analysis of the changes in usage patterns of MeSH terms points to an increase in the importance of COVID-19-related topics in the broader biomedical research landscape.
Our results provide compelling evidence that research related to COVID-19 has indeed displaced scientific production in other biomedical fields of research not related to COVID-19, with a significant drop in (impact weighted) scientific output in non-COVID-19 fields and a marked reduction of financial support for publications not related to COVID-19 [4,5,16]. The displacement effect persists to the end of 2020. As vaccination progresses, we highlight the urgent need for science policy to re-balance support for research activity that was put on pause because of the COVID-19 pandemic.
We find that COVID-19 dramatically impacted clinical research. Reactivation of clinical trial activities that have been postponed or suspended for reasons related to COVID-19 is a priority that should be considered in the national vaccination plans. Moreover, grants have been diverted and financial incentives have been targeted to sustain COVID-19 research, leading to excessive entry into COVID-19-related clinical trials and the 'covidisation' of research. There is therefore a need to reorient incentives toward basic research and otherwise neglected or temporarily abandoned areas of biomedical research. Without dedicated support in the recovery plans for neglected research of the COVID-19 era, there is a risk that more medical needs will be unmet in the future, possibly exacerbating the shortage of scientific research for orphan and neglected diseases, which do not belong to COVID-19-related research areas.

Limitations
Our empirical approach has some limits. First, we proxy MeSH term usage via search terms using the PubMed EUtilities API. This means that the accuracy of the MeSH terms we assign to a given paper is not fully validated. More time is needed for the completion of manually annotated MeSH terms. Second, the timing of publication is not the moment the research has been carried out. There is a lead time between inception, analysis, write-up, review, revision, and final publication. This delay varies across disciplines. Nevertheless, given that the surge in publications happens around the alleged event date, January 2020, we are confident that the publication date is a reasonable yet imperfect estimate of the timing of the research. Third, several journals have publicly declared that they fast-track COVID-19 research. This discrepancy in the speed of publication between COVID-19 related research and other research could affect our results. Specifically, a surge or displacement could be overestimated due to a lag in the publication of COVID-19 unrelated research. We alleviate this bias by estimating the effect over a considerable period after the event (January 2020 to December 2020). Fourth, on the one hand, clinical trials may lead to multiple publications. Therefore, we might overestimate the impact of COVID-19 on the number of clinical trials. On the other hand, COVID-19 publications on clinical trials lag behind, so the number of papers related to COVID-19 trials is likely underestimated. We therefore stress that the focus of this paper is scientific publications on clinical trials rather than actual clinical trials. Fifth, regarding grants, unfortunately, there is no unique centralized repository mapping grant numbers to years, so we have to proxy old grants with grants that appeared in publications from 2010 to 2018.
Besides, grant numbers are free-form entries, meaning that PubMed has no validation step to disambiguate or verify that the grant number has been entered correctly. This can lead to a grant being classified as new even though it has appeared before under a different spelling. We mitigate this problem by using a long period to collect grant numbers and catch many spellings of the same grant, thereby reducing the likelihood of misidentifying a grant as new when it existed before. Still, unless unique identifiers are widely used, there is no way to verify this.
So far, there is no conclusive evidence on whether entry into COVID-19 research has been excessive. However, there is a growing consensus that COVID-19 has displaced, at least temporarily, scientific research in COVID-19 unrelated biomedical research areas. Even though it is certainly expected that more attention will be devoted to the emergency during a pandemic, the displacement of biomedical research in other fields is concerning. Future research is needed to investigate the long-run structural consequences of the COVID-19 crisis on biomedical research.
Supporting information S1