Meta-analysis of nationwide SARS-CoV-2 infection fatality rates in India

Underreporting of deaths in India has been much discussed and debated in media articles and in the scientific literature. In this brief report, we meta-analyze the available or inferred estimates of infection fatality rates for SARS-CoV-2 in India based on the existing literature. These estimates account for uncaptured deaths and infections. We consider empirical excess death estimates based on all-cause mortality data as well as disease transmission-based estimates that rely on assumptions regarding infection transmission and ascertainment rates in India. We build on an initial systematic review (Zimmermann et al., 2021) that followed PRISMA guidelines and searched the databases PubMed, Embase, Global Index Medicus, as well as BioRxiv, MedRxiv, and SSRN for preprints (accessed through iSearch) on July 3, 2021, and we extended the search verification through May 26, 2022. The screening process yielded 15 studies that were qualitatively analyzed, of which 9 studies with 11 quantitative estimates were included in the meta-analysis. Using a random effects meta-analysis framework, we obtain a pooled estimate of the nationwide infection fatality rate (defined as the ratio of estimated deaths over estimated infections) and a corresponding confidence interval. Death underreporting from excess deaths studies varies by a factor of 6.1–13.0 with nationwide cumulative excess deaths ranging from 2.6–6.3 million, whereas the underreporting from disease transmission-based studies varies by a factor of 3.5–7.3 with SARS-CoV-2 related nationwide estimated total deaths ranging from 1.4–3.4 million, through June 2021, with some estimates extending to 31 December 2021. Infections were previously found (Zimmermann et al., 2021) to be underreported by a factor of 24.9 (relying on the latest 4th nationwide serosurvey from 14 June–6 July 2021 prior to launch of the vaccination program).
Conservatively, by considering the lower values of these available estimates, we infer that approximately 95% of infections and 71% of deaths were not accounted for in the reported figures in India. The nationwide pooled infection fatality rate estimate for India is 0.51% (95% confidence interval [CI]: 0.45%–0.58%). Countries across the world are often compared in terms of total reported cases and deaths. Although the US has the highest number of reported cumulative deaths globally, after accounting for underreporting, India appears to have the highest number of cumulative total deaths (reported + unreported). However, the large number of estimated infections in India leads to a lower infection fatality rate estimate than that of the US, which is due in part to the younger population in India. We emphasize that the age structure of different countries must be taken into consideration when making such comparisons. More granular data are needed to examine heterogeneities across various demographic groups and to identify at-risk and underserved populations with high COVID mortality; the hope is that such disaggregated mortality data will soon be made available for India.


Introduction
The second wave of SARS-CoV-2 in India, the second most populous country in the world, registered 414 thousand daily cases and 4.5 thousand daily deaths at its peak in May of 2021 [2], and led to a collapse of healthcare infrastructure [3]. Multiple studies indicate that the true numbers of infections and deaths are many times larger than those reported [1,4,5]. Considerable effort has been devoted to investigating the true number of SARS-CoV-2 attributed deaths and the inferred infection fatality rates (IFR) in India. This brief report systematically synthesizes the existing literature on the true SARS-CoV-2 IFR in India (as of 26 May 2022), through a meta-analysis of studies based on excess deaths and studies based on epidemiological disease transmission models that present relevant estimates through at least June 2021, capturing most of the second wave in India.

Methods
We briefly describe the systematic review framework, which has previously been detailed in full with the complete search strategy [1]. Adhering to PRISMA guidelines (Table A in S1 Text includes the PRISMA checklist), the databases PubMed, Embase, Global Index Medicus, as well as BioRxiv, MedRxiv, and SSRN for preprints (accessed through iSearch), were searched on July 3, 2021, and results were updated through May 26, 2022. Using this approach, 4,971 citations were screened, resulting in 15 studies classified into the following three groups: excess deaths studies (9 articles), disease transmission-based studies estimating unreported deaths (5 articles), and disease transmission-based studies using reported deaths only (1 article). Since the three groups are not directly comparable, among the 15 studies, the 9 excess deaths studies with 11 datapoints are included in the nationwide quantitative synthesis. We were unable to stratify and separately meta-analyze disease transmission-based estimates (fewer than 3 studies provided estimates through at least June 2021 in the search verification). Several measures of fatality have been used in the literature, as indicated in the glossary box. Using a random effects model with DerSimonian-Laird estimates and corresponding confidence intervals (CI), we meta-analyze IFR 2 (defined as the infection fatality rate that accounts for death underreporting, as well as case underreporting). We provide a pooled estimate of nationwide IFR 2 for SARS-CoV-2 in India with a corresponding 95% CI. While this meta-analysis focuses on nationwide studies in India, we summarize the 18 other subnational/regional studies (not meta-analyzed) in Table B in S1 Text. A detailed explanation of the meta-analysis framework is provided in Methods B in S1 Text and Methods C in S1 Text, including Fig A in S1 Text, which displays the process from data extraction to obtaining meta-analyzable IFRs.
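The pooling step can be sketched in a few lines. The following is a minimal illustration of the standard DerSimonian-Laird random effects estimator, not the authors' code (their full framework is described in Methods B and Methods C in S1 Text). The function name `dersimonian_laird`, the example inputs, and the choice to pool directly on the percentage scale are assumptions made for illustration only.

```python
import math


def dersimonian_laird(estimates, variances, z=1.96):
    """Pool study-level estimates with a DerSimonian-Laird random effects model.

    estimates: study-level effect estimates (here, hypothetical IFRs in percent)
    variances: within-study variances of those estimates
    z: normal quantile for the confidence interval (1.96 for 95%)
    """
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two estimates with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q and the method-of-moments between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)

    return {
        "pooled": pooled,
        "se": se,
        "tau2": tau2,
        "ci": (pooled - z * se, pooled + z * se),
    }


if __name__ == "__main__":
    # Hypothetical inputs for illustration; these are NOT the study data.
    example = dersimonian_laird([0.4, 0.6], [0.01, 0.01])
    print(example)
```

When the between-study variance estimate is zero, the random-effects weights reduce to the fixed-effect weights, so the two pooled estimates coincide.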
Lastly, ethical approval is not applicable to the present study. The research uses publicly available data and is IRB exempt.

Results
Considering estimates from excess deaths studies, the underreporting factor (URF) for deaths ranges from 6.1-13.0 for India, with nationwide cumulative excess deaths ranging from 2.6-6.3 million (see Table 1). Considering estimates from disease transmission-based studies, URF ranges from 3.5-7.3 for India, with total estimated deaths attributed to SARS-CoV-2 ranging from 1.4-3.4 million (see Table 1). As previously reported [1], URF for cases/infections (inferred from the most recent seroprevalence estimate) is 24.9 using the 4th nationwide serosurvey [6]. As such, the evidence suggests that, even by the lowest of these estimates, roughly 95% of cases (URF (Case) is reportedly 24.9) and 71% of deaths (URF (Death) is at least 3.5) were missed in India.
The nationwide pooled IFR 2 estimate for India is 0.51% (95% confidence interval [CI]: 0.45%-0.58%), as presented in Fig 2. This estimate attributes 100% of excess deaths to SARS-CoV-2 during 2020-2021. In actuality, COVID-19 is unlikely to account for all excess deaths during the pandemic period, and as such this estimate of 0.51% is likely an overestimate. By comparison, disease transmission-based studies give a nationwide pooled IFR 2 estimate of 0.34% (95% CI: 0.28%-0.41%), although we caution that this second estimate relies on fewer than 3 data points. Overall, comparing IFR 2 to the nationwide pooled IFR 1 (calculated based on reported deaths) of 0.10% (95% CI: 0.07%-0.14%) [1], we find that IFR 2 is roughly 5 times as large as IFR 1 (0.51%/0.10% ≈ 5.1). Lastly, Fig B in S1 Text presents a visualization of the publication bias assessment among the included studies, and the Egger and Begg tests for asymmetry, as well as the Joanna Briggs Institute (JBI) risk of bias results, are presented in the supplementary content (see Methods D in S1 Text and Methods E in S1 Text).
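The percentages of missed cases and deaths quoted in the Results follow from the underreporting factors: if true counts are URF times the reported counts, the share missed is 1 - 1/URF. The short sketch below reproduces that arithmetic and the ratio of the pooled IFR 2 to IFR 1. The function names are hypothetical helpers, not part of the authors' analysis.

```python
def missed_fraction(urf):
    """Share of true events absent from reported counts, given URF = true / reported."""
    if urf < 1:
        raise ValueError("an underreporting factor below 1 implies overreporting")
    return 1.0 - 1.0 / urf


def fold_ratio(numerator_pct, denominator_pct):
    """How many times as large one rate is compared with another."""
    return numerator_pct / denominator_pct


if __name__ == "__main__":
    # URF values taken from the text: 24.9 for infections, 3.5 (lowest) for deaths.
    print(f"infections missed: {missed_fraction(24.9):.1%}")
    print(f"deaths missed:     {missed_fraction(3.5):.1%}")
    # Pooled IFR 2 (0.51%) relative to pooled IFR 1 (0.10%), both from the text.
    print(f"IFR 2 / IFR 1:     {fold_ratio(0.51, 0.10):.1f}")
```

With a URF of 24.9 the missed share is just under 0.96, which the text reports as roughly 95%; with a URF of 3.5 it is about 0.71.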

Discussion and conclusions
Over two years since the start of the pandemic, numerous peer-reviewed studies have focused on understanding the actual death toll of SARS-CoV-2 in India, primarily either via excess deaths or disease transmission-focused modeling, enabling the meta-analysis herein of the 11 identified excess deaths estimates. When appropriately accounting for case and death underreporting, the cumulative SARS-CoV-2 infection fatality rate in India varies within a 95% CI of 0.45%-0.58%, which indicates that IFR 2 is 4-6 times higher than the IFR based on tabulated (reported) deaths due to COVID-19. The disease transmission-based estimates qualitatively appear to be more conservative than the ones that originated from excess deaths studies. One possible explanation is that most of the excess death studies are based on all-cause mortality data and do not quantify the proportion of the excess deaths attributable to COVID-19. The pooled IFR 2 estimate from COVID-specific transmission model-based studies is largely congruent with the estimated IFR of 0.3% (as of 14 November 2021) for India reported in the global IFR study by Barber et al. (2022) [7].

Table 1. Summary of nationwide mortality data from included studies in India from 2020-2021. Seroprevalence of 67.6% is used with 765 million infections [a] from an age-adjusted population as of 14 Jun-6 Jul 2021 from the 4th nationwide serosurvey [6].
(Table body not recoverable; surviving column headers: Study; Time Period.)
[a] Estimated total cumulative infections is calculated as the seroprevalence of 67.6% among ages ≥ 6 years from the latest 4th nationwide serosurvey study in India [6] multiplied by the age-adjusted population (additional details are included in Methods B in S1 Text and Fig A in S1 Text).
[2] Underreporting Factor is computed as Excess Deaths divided by COVID-19 Reported Deaths, unless otherwise noted.
[3] Underreporting Factor (URF), as well as COVID-19 Reported Deaths are directly reported in this study. Hence, the URF in this table is the precalculated estimate provided.
[4] Excess Deaths, as well as COVID-19 Reported Deaths, are directly reported in this study.
[5] The COVID-19 Reported Deaths provided in this study are across select states in the Civil Registration System (CRS).
[6] The precalculated Underreporting Factor and COVID-19 Reported Deaths reported in this study are through September 2021.
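The infection denominator described in the Table 1 notes, and the definition of IFR 2 as estimated deaths over estimated infections, can be illustrated with a small calculation. This is a rough sketch rather than the authors' extraction procedure (see Methods B in S1 Text): pairing the 2.6-6.3 million excess deaths range with the 765 million infections figure is an assumption made here for illustration, and the constant and function names are hypothetical.

```python
SEROPREVALENCE = 0.676          # 4th nationwide serosurvey, ages >= 6 years (from the text)
ESTIMATED_INFECTIONS = 765e6    # estimated cumulative infections (from the text)

# Age-adjusted population implied by infections = seroprevalence * population.
implied_population = ESTIMATED_INFECTIONS / SEROPREVALENCE


def ifr_percent(estimated_deaths, estimated_infections):
    """Infection fatality rate in percent: estimated deaths over estimated infections."""
    if estimated_infections <= 0:
        raise ValueError("estimated infections must be positive")
    return 100.0 * estimated_deaths / estimated_infections


if __name__ == "__main__":
    print(f"implied population (millions): {implied_population / 1e6:.0f}")
    # Lower and upper ends of the excess deaths range quoted in the text.
    for deaths in (2.6e6, 6.3e6):
        ifr = ifr_percent(deaths, ESTIMATED_INFECTIONS)
        print(f"{deaths / 1e6:.1f} million deaths -> IFR {ifr:.2f}%")
```

Under these assumptions the two ends of the excess deaths range bracket the pooled IFR 2 of 0.51%, which is a useful sanity check but not a substitute for the study-level estimates.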

PLOS GLOBAL PUBLIC HEALTH
Limitations of the meta-analysis are as follows. First, insufficient data on age- and sex-disaggregated mortality for India precluded investigation into heterogeneity by such demographics. Second, multiple studies rely on excess deaths derived from common sources, such as the civil registration system (CRS) data, as well as infections derived from nationwide serosurveys, which rules out independence between included studies and may bias the resulting pooled estimate. India recently released the CRS data for 2020, but most studies estimate the largest excess deaths during April-June of 2021, and no CRS data are available for this period. Moreover, the incompleteness of CRS data may hinder representativeness and thereby complicate the interpretation of excess deaths estimates relying on CRS data. The more nationally representative sample registration system (SRS) is often used to adjust for missing death information in CRS, but SRS data are not yet available for 2020 and 2021. Lastly, while we use the latest available nationwide serosurvey to obtain an age-adjusted infections estimate in computing the IFR for SARS-CoV-2, we acknowledge that this approach does not incorporate waning immunity and re-infections. If these could be accounted for, the denominator of the IFR (estimated infections) would likely be larger and the resulting IFR correspondingly smaller. Limitations inherent to sero-surveillance studies also include sero-reversion, in which SARS-CoV-2 antibodies become less detectable over time, leading to an upward bias in IFR estimates [8].
It is critical to contextualize the uncaptured SARS-CoV-2 infections and deaths in India, and how such underreporting could distort comparisons of disease spread and mortality across countries. Considering the three countries with the highest cumulative reported deaths (as of 31 December 2021), namely India, Brazil, and the United States (in ascending order), the IFR 2 (as of 14 November 2021) reported by Barber et al. (2022) appears to be the lowest in India (IFR 2 of 0.3%) compared to the US (IFR 2 of 0.9%) and Brazil (IFR 2 of 0.5%) [7]. This is due to the very large number of estimated cumulative infections in India (approximately 1 billion, through mid-November 2021 [7]). With respect to the total number of deaths, Wang et al. (2022) estimate deaths to be underreported by a factor of 8.3, 1.3, and 1.2 for India, the US, and Brazil, respectively [5]. This is qualitatively similar to the death underreporting factors based on WHO estimates (similarly through 31 December 2021) of 9.8, 1.1, and 1.1 for India (4.7 million excess deaths and 481,080 reported deaths), the US (933,547 excess deaths and 818,464 reported deaths), and Brazil (681,514 excess deaths and 618,817 reported deaths), respectively [9]. These rankings indicate that underreporting of deaths (through 31 December 2021) is particularly acute for India.
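The WHO-based underreporting factors quoted above are simply excess deaths divided by reported deaths. The sketch below reproduces them from the figures given in the text; the dictionary layout and function name are illustrative choices, not taken from the WHO or the cited studies.

```python
# (excess deaths, reported deaths) through 31 December 2021, as quoted in the text [9].
WHO_FIGURES = {
    "India": (4_700_000, 481_080),
    "US": (933_547, 818_464),
    "Brazil": (681_514, 618_817),
}


def underreporting_factor(excess_deaths, reported_deaths):
    """Death underreporting factor: excess deaths divided by reported deaths."""
    if reported_deaths <= 0:
        raise ValueError("reported deaths must be positive")
    return excess_deaths / reported_deaths


if __name__ == "__main__":
    for country, (excess, reported) in WHO_FIGURES.items():
        urf = underreporting_factor(excess, reported)
        print(f"{country}: URF = {urf:.1f}")
```

Rounded to one decimal place, the results match the factors of 9.8, 1.1, and 1.1 reported for India, the US, and Brazil.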
While metrics are useful for evaluating public health policies, we caution against crude comparisons based on a single metric. Although we use cumulative excess deaths as a measure of comparison in mortality ranking, population counts are not factored in, and deaths per million may be preferable in another context. In addition, such overall mortality comparisons must be placed in the context of the age structure of the different countries. India has a younger population (median age 28 years) than the US (median age 38 years) or Brazil (median age 34 years) [10]. Age-specific IFR 2 should be used, if possible, when examining COVID-19 mortality burden within and across countries and in subsequent decision making. Recent studies underscore the importance of adjusting for age structure when estimating related deaths. For example, The Economist recently made available an age-adjusted IFR source [11], which is further incorporated into their published estimates [12]. Disaggregated mortality data are necessary to validate these age-specific estimates for India.
Many of the included studies in this meta-analysis also sought to account for changes in mortality, and subsequently changes in IFR, over time, often by incorporating longitudinal data that were as granular as possible. This is important as the lethality of the virus is subject to multiple time-varying components, especially the roll-out of vaccines (starting in January 2021 within India), as well as the changing variant landscape wherein the milder SARS-CoV-2 variant Omicron and its sub-lineages became dominant.
We look forward to the release of timely, disaggregated data on SARS-CoV-2 deaths within India to assess the burden of COVID-19 among various demographic groups [13], as well as to enable targeted policy interventions. Once nationwide 2021 CRS reports are released, the findings with respect to the excess death estimates will be further validated. In the absence of data, we must rely on curated estimates computed by multiple teams of dispassionate scientists and a systematic review and synthesis of such evidence.
Supporting information
S1 Table. Results of risk of bias assessment for included articles.

Acknowledgments
[...] review that enabled this meta-analysis. The authors also wish to thank Maxwell Salvatore for providing technical advice and feedback regarding the graphics in this report.