Abstract
Objectives
The aim of this study was to quantify transmission trends in South Africa during the first four waves of the COVID-19 pandemic using estimates of the time-varying reproduction number (R) and to compare the robustness of R estimates based on three different data sources, and using data from public and private sector service providers.
Methods
R was estimated from March 2020 through April 2022, nationally and by province, based on time series of rt-PCR-confirmed cases, hospitalisations, and hospital-associated deaths, using a method that models daily incidence as a weighted sum of past incidence, as implemented in the R package EpiEstim. R was also estimated separately using public and private sector data.
Results
Nationally, the maximum case-based R following the introduction of lockdown measures was 1.55 (CI: 1.43–1.66), 1.56 (CI: 1.47–1.64), 1.46 (CI: 1.38–1.53) and 3.33 (CI: 2.84–3.97) during the first (Wuhan-Hu), second (Beta), third (Delta), and fourth (Omicron) waves, respectively. Estimates based on the three data sources (cases, hospitalisations, deaths) were generally similar during the first three waves, but higher during the fourth wave for case-based estimates. Public and private sector R estimates were generally similar except during the initial lockdowns and in case-based estimates during the fourth wave.
Conclusion
Agreement between R estimates using different data sources during the first three waves suggests that data from any of these sources could be used in the early stages of a future pandemic. The high R estimates for Omicron relative to earlier waves are interesting given a high level of exposure pre-Omicron. The agreement between public and private sector R estimates highlights that clients of the public and private sectors did not experience two separate epidemics, except perhaps to a limited extent during the strictest lockdowns in the first wave.
Citation: Bingham J, Tempia S, Moultrie H, Viboud C, Jassat W, Cohen C, et al. (2023) Estimating the time-varying reproduction number for COVID-19 in South Africa during the first four waves using multiple measures of incidence for public and private sectors across four waves. PLoS ONE 18(9): e0287026. https://doi.org/10.1371/journal.pone.0287026
Editor: AbdulAzeez Adeyemi Anjorin, Lagos State University, NIGERIA
Received: September 2, 2022; Accepted: May 30, 2023; Published: September 22, 2023
Copyright: © 2023 Bingham et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The study’s minimal underlying data set can be accessed at: https://zenodo.org/record/6948468#.ZEZ3tnZBxEZ (DOI: 10.5281/zenodo.6948468).
Funding: This work was supported by the Wellcome Trust [grant number 221003/Z/20/Z] in collaboration with the Foreign, Commonwealth and Development Office, United Kingdom. JB and JRCP are also supported by the Department of Science and Innovation and the National Research Foundation (NRF). Any opinion, finding, and conclusion or recommendation expressed in this material is that of the authors and the NRF does not accept any liability in this regard. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: CC has received grant support from Sanofi Pasteur, US CDC, Wellcome Trust, Programme for Applied Technologies in Health (PATH), Bill & Melinda Gates Foundation and South African Medical Research Council (SA-MRC). JRCP has received funding for COVID-related work from Bill & Melinda Gates Foundation, WHO AFRO, and Wellcome Trust and serves on the Ministerial Advisory Committee for COVID-19 for the South African National Department of Health. The other authors report no known conflicts of interest. This does not alter our adherence to PLOS ONE policies on sharing data and materials.
Introduction
As of April 2022, South Africa had experienced four waves of COVID-19, following the first reported case in early March 2020. In March 2020, while case numbers were still low, the South African government declared a national state of emergency and introduced strict lockdown legislation, with five lockdown levels [1–3]. Lockdown started at level five, which forced the temporary closure of all non-essential businesses, the closure of international and inter-provincial borders, an alcohol and tobacco prohibition, and a ban on leaving one’s home except to access essential services. Lockdown restrictions were relaxed early during the onset of the first wave, due to economic pressure, and were reintroduced in adjusted forms during the second and third waves. The first wave was associated with primarily wild-type SARS-CoV-2, the second wave with the Beta variant (B.1.351), the third wave with the Delta variant (B.1.617.2), and the fourth wave with the Omicron variant (B.1.1.529.1) [4]. During July 2021, civil unrest caused disruptions to laboratory and healthcare services in KwaZulu-Natal and Gauteng provinces.
In South Africa, roughly 17% of residents have access to private healthcare through health insurance, whereas the remaining 83% rely primarily on the public healthcare system [5]. Clients of the private sector tend to have a higher mean income and live in areas with lower population density [6, 7], while people with lower socioeconomic status are likely to be at a higher risk of SARS-CoV-2 infection [7–10]. As such, transmission patterns may be expected to differ between clients of private and public sector healthcare providers, due to the difficulty of adhering to lockdown measures in higher density areas [11].
The time-varying reproduction number R is the expected number of secondary cases caused by a single infected individual at a given point in time, assuming that conditions remain constant for the duration of the infectious period. R reflects inherent pathogen properties combined with environmental and social conditions such as population immunity, non-pharmaceutical interventions (NPIs), perceptions of disease, and access to healthcare. Reproduction number estimates are used to track transmission trends, assess the impacts of interventions, and parameterise epidemic models [12, 13].
R estimates are typically based on time series data of cases or deaths, although any measure representing an approximately constant proportion of total incidence may be used, in conjunction with data from which to estimate the generation interval [14]. For example, time series of deaths are sometimes used for reproduction number estimation, particularly in situations where the consistency of testing systems is called into question, as deaths are thought to be more systematically assessed and recorded than other measures of incidence.
We generated R estimates for COVID-19 in South Africa during the first four waves of the epidemic, nationally and by province. We also sought to examine whether R estimates differed according to data source (comparing the daily incidence of laboratory-confirmed COVID-19 cases, hospitalisations, and in-hospital deaths), and to compare transmission patterns between the public and private sectors.
Methods
Data
Our study utilised two primary data sets maintained by the South African National Institute for Communicable Diseases (NICD). Data on laboratory-confirmed cases were obtained from the national Notifiable Medical Conditions Surveillance System (NMC-SS) line list, to which all pathology laboratories are legally required to report positive COVID-19 test results [24], and included both primary infections and suspected reinfections. Case report dates ranged from March 5th, 2020, when the first case was reported, through April 28th, 2022. Positive tests were classified as associated with a reinfection if more than 90 days had elapsed since the most recent positive test for the same patient [15]. Data on laboratory-confirmed cases were filtered to include only cases confirmed via reverse-transcription polymerase chain reaction (rt-PCR) testing, due to substantial numbers of incorrectly entered reference dates for antigen tests. Data on hospital admissions and hospital-associated deaths were obtained from the national DATCOV dataset, to which all private (262) and public (407) hospitals report confirmed COVID-19-positive admissions and deaths in hospital [16]. The generation interval distribution was approximated using a gamma distribution fit to data from PHIRST-C, a community cohort study of COVID-19 transmission (μSI = 6.63 days, σμ-SI = 0.51 days; σSI = 3.28 days, σσ-SI = 0.27 days; see R estimation subsection below) [17]. Dates of symptom onset were available for 55% of hospitalised cases and 57% of in-hospital deaths. Thirty-one cases (0.007%) in the DATCOV database were missing both admission date and date of symptom onset and were excluded from the analyses based on admissions and deaths. The reinfections line list was linked with DATCOV to obtain dates of symptom onset for 5.6% of rt-PCR-confirmed cases.
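The 90-day reinfection rule described above can be illustrated with a minimal Python sketch. The function name and the input format (a list of positive test dates for one patient) are hypothetical and chosen for illustration only; the study's actual linkage procedure is not reproduced here.

```python
from datetime import date, timedelta

def flag_reinfections(test_dates, gap_days=90):
    """Flag each positive test as a suspected reinfection.

    A test is flagged when MORE than `gap_days` have elapsed since the most
    recent earlier positive test for the same patient (hypothetical helper).
    """
    flags = []
    previous = None
    for current in sorted(test_dates):
        is_reinfection = previous is not None and (current - previous).days > gap_days
        flags.append(is_reinfection)
        previous = current
    return flags

# One patient with three positive tests: only the last follows a gap of more than 90 days.
patient_tests = [date(2020, 6, 1), date(2020, 7, 1), date(2021, 1, 10)]
flags = flag_reinfections(patient_tests)
```

Note that a gap of exactly 90 days is not flagged under this reading of "more than 90 days".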
Imputation
Missing onset dates in the source datasets were imputed using the multiple imputation chained equations technique, as implemented in the R packages mice [18] and countimp [19]. The imputation procedure recursively estimates individual-level delays between symptom onset and hospital admission (for hospitalisations and deaths), or between symptom onset and date of reported case confirmation (for rt-PCR-confirmed cases). In each estimation step, a multivariate negative binomial model was fitted to predict delays based on 6 (lab-confirmed cases) or 7 (hospitalisations and deaths) other variables and used to predict delay values that were incurred but not reported. Poisson distributed generalised linear models were fitted to explore the relationships between the known delay values and individual predictor variables, in order to select the most relevant predictors. The variables used for imputations in the two line lists were: health sector where laboratory testing/hospital admission occurred (public or private), age group (in ten-year intervals), month of case report/hospital admission, case outcome (for admissions), day of hospital admission (for admissions), province (for rt-PCR-confirmed cases), and district (for admissions).
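To convey the general idea of generating several completed datasets, the sketch below uses a much simpler stand-in than the chained-equations approach described above: it draws each missing delay at random from observed delays in the same stratum (a hot-deck style draw). It is not the negative binomial model used in the study, and the record fields (`event_date`, `onset_date`, `stratum`) are hypothetical.

```python
import random
from datetime import date, timedelta

def impute_onsets(records, n_imputations, rng):
    """Return n_imputations lists of onset dates, one list per completed dataset.

    Simplified stand-in for multiple imputation: a missing onset date is filled
    by subtracting a delay sampled from observed delays in the same stratum.
    Assumes every stratum with a missing onset has at least one observed delay.
    """
    donors = {}
    for rec in records:
        if rec["onset_date"] is not None:
            delay = (rec["event_date"] - rec["onset_date"]).days
            donors.setdefault(rec["stratum"], []).append(delay)

    datasets = []
    for _ in range(n_imputations):
        completed = []
        for rec in records:
            onset = rec["onset_date"]
            if onset is None:
                delay = rng.choice(donors[rec["stratum"]])
                onset = rec["event_date"] - timedelta(days=delay)
            completed.append(onset)
        datasets.append(completed)
    return datasets

records = [
    {"event_date": date(2020, 7, 10), "onset_date": date(2020, 7, 7), "stratum": "public"},
    {"event_date": date(2020, 7, 12), "onset_date": date(2020, 7, 7), "stratum": "public"},
    {"event_date": date(2020, 7, 15), "onset_date": None, "stratum": "public"},
]
imputed = impute_onsets(records, n_imputations=25, rng=random.Random(0))
```

Each completed dataset would then be carried through the rest of the analysis separately, so that uncertainty in the imputed dates is reflected in the final estimates.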
Adjustment for right-censoring
Two mechanisms exist by which the data are right-censored. The first occurs because some case confirmations, hospital admissions, and in-hospital deaths correspond to dates of symptom onset that fall within the date range of the data, despite the events (case confirmation, hospital admission, or death) occurring outside the date range of the data. The second cause of right-censoring in the data comes from case confirmations, hospital admissions, and in-hospital deaths that occur within the date range of the data, but which are not yet included in the dataset.
The first source of right-censoring was accounted for by inflating the end of the time series according to the distributions of delays from symptom onset to reporting of test results, hospital admission, and death [14]. Specifically, counts for each day were divided by the proportion of the appropriate delay distribution with delay no larger than the difference between the day in question and the last date for which test results, hospital admissions, or deaths were reported. The second source of right-censoring was more difficult to adjust for rigorously, and was mitigated by truncating the last 3, 7, and 7 days of the case, hospital admission, and death time series, respectively.
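A minimal Python sketch of the two adjustments follows. It reads the inflation step as dividing each day's count by the probability that the onset-to-event delay is short enough for the event to have been observed by the last date in the data. The function names and the form of `delay_cdf` (a list where entry d is the probability that the delay is at most d days) are assumptions made for illustration.

```python
def inflate_recent_counts(onset_counts, delay_cdf):
    """Inflate counts by onset day to account for events not yet observed.

    onset_counts[i] is the count for onset day i; the final index is the last
    date with reported events. delay_cdf[d] = P(delay <= d days); delays beyond
    the end of the list are treated as fully observed (probability 1).
    """
    last_day = len(onset_counts) - 1
    adjusted = []
    for day, count in enumerate(onset_counts):
        max_observable_delay = last_day - day
        if max_observable_delay < len(delay_cdf):
            p_observed = delay_cdf[max_observable_delay]
        else:
            p_observed = 1.0
        adjusted.append(count / p_observed if p_observed > 0 else float("nan"))
    return adjusted

def truncate_tail(series, n_days):
    """Drop the last n_days entries, where late-arriving data are most likely."""
    return list(series[:len(series) - n_days]) if n_days > 0 else list(series)

counts = [10, 10, 10, 10]
cdf = [0.25, 0.5, 0.75, 1.0]
adjusted = inflate_recent_counts(counts, cdf)
trimmed = truncate_tail([1, 2, 3, 4, 5], 3)
```

In this toy example the most recent day is inflated the most, because only a quarter of its eventual events are expected to have been observed.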
R estimation
Daily time series of rt-PCR-confirmed COVID-19 cases, hospitalisations, and deaths, by dates of symptom onset, were computed using the imputed line lists and adjusted for right-censoring. R was estimated using the method described by Thompson et al. and implemented in the R package EpiEstim [14], identified in a recent review of R estimation techniques as a suitable method given the data available during this study [20]. The method assumes that current incidence results from the combined transmissibility of recently infected individuals, and can thus be calculated based on recent incidence values [12, 14]. In general terms, the number of new infections coming from individuals infected n days ago is assumed to be proportional to the product of the number of individuals infected n days ago, and the proportion of transmission events that occur between n and n + 1 days following infection. Specifically, the incidence $I_t$ (in time step $t$) is assumed to be Poisson distributed, with mean $R_t \sum_{s=1}^{t} I_{t-s} w_s$, where $R_t$ is the time-varying (instantaneous) reproduction number at time step $t$, and $w_s$ is the relative infectiousness of an individual $s$ time steps following infection, assumed to be constant with respect to $t$. Furthermore, R is assumed to be constant over some time window τ, the choice of which involves a tradeoff between the width of the resulting credible intervals and the sensitivity of the method to rapid changes in R. We estimated R using 7, 14, and 21 day sliding windows; estimates using 7-day sliding windows are presented in the main text. The likelihoods of hypothetical values for R are calculated, resulting in a gamma-distributed Bayesian posterior for R. The infectiousness profile ws was estimated using a gamma-distributed generation interval fit to data from the PHIRST-C community cohort study [17] (μSI = 6.63 days, σμ-SI = 0.51 days; σSI = 3.28 days, σσ-SI = 0.27 days).
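The estimation step can be sketched in plain Python. This is an illustration of the renewal-equation approach with a conjugate gamma prior, not the EpiEstim code used in the study. Two choices are assumptions made here: the infectiousness profile is discretised by normalising the gamma density at integer days, and the prior on R has shape 1 and scale 5. All function names are hypothetical.

```python
import math

def gamma_shape_scale(mean, sd):
    """Convert a gamma mean and standard deviation to (shape, scale)."""
    return (mean / sd) ** 2, sd ** 2 / mean

def discretised_gamma_weights(mean, sd, max_days=30):
    """Weights w_1..w_max_days proportional to the gamma density at integer days."""
    shape, scale = gamma_shape_scale(mean, sd)
    def log_pdf(x):
        return ((shape - 1) * math.log(x) - x / scale
                - math.lgamma(shape) - shape * math.log(scale))
    raw = [math.exp(log_pdf(s)) for s in range(1, max_days + 1)]
    total = sum(raw)
    return [value / total for value in raw]

def total_infectiousness(incidence, w, t):
    """Lambda_t = sum over s of I_{t-s} * w_s, using only days that exist."""
    return sum(incidence[t - s] * w[s - 1] for s in range(1, min(t, len(w)) + 1))

def estimate_r(incidence, w, window=7, prior_shape=1.0, prior_scale=5.0):
    """Gamma posterior for R over each sliding window ending on day t_end.

    Returns tuples (t_end, posterior_shape, posterior_scale, posterior_mean).
    """
    results = []
    for t_end in range(window, len(incidence)):
        days = range(t_end - window + 1, t_end + 1)
        post_shape = prior_shape + sum(incidence[t] for t in days)
        post_rate = 1.0 / prior_scale + sum(total_infectiousness(incidence, w, t) for t in days)
        results.append((t_end, post_shape, 1.0 / post_rate, post_shape / post_rate))
    return results

# Generation interval parameters quoted in the text (mean 6.63 days, sd 3.28 days).
w = discretised_gamma_weights(6.63, 3.28)
flat_epidemic = [100] * 40            # constant incidence should give R close to 1
estimates = estimate_r(flat_epidemic, w)
```

With constant incidence the posterior mean settles near 1 once the window lies beyond the first `max_days` days, which is a quick sanity check on the weighting. Shorter windows react faster to changes in R but give wider credible intervals, as noted above.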
In order to accommodate uncertainty in ws, n1 pairs of (μSI, σSI) are sampled from truncated normal distributions (with σSI < μSI). From each of these n1 pairs, n2 posterior R estimates are generated, resulting in n1 × n2 R estimates, per imputation, for each time window. We generated R estimates using n1 = n2 = 25, resulting in 625 R estimates per time window per imputation. Several parameter combinations were explored to determine suitable values for n1 and n2 (see section 5 in S1 Appendix). No explicit smoothing of the time series or R estimates was performed. The R estimation procedure was performed on each of 25 imputed datasets. We present median values, between imputations, of the median R estimates and 95% credible intervals (CI) arising from the 625 individual R estimates per estimation window.
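The sampling scheme for one time window can be sketched as follows. The sketch assumes the truncation amounts only to requiring 0 < σSI < μSI (any further bounds used in the study are not reproduced), and it takes the gamma posterior parameters for each sampled pair as given inputs. Function names are hypothetical.

```python
import random
import statistics

def sample_si_pairs(n1, mu_mean, mu_sd, sigma_mean, sigma_sd, rng):
    """Draw n1 (mean, sd) pairs for the generation interval by rejection sampling.

    Assumption: the only truncation applied is 0 < sd < mean.
    """
    pairs = []
    while len(pairs) < n1:
        mu = rng.gauss(mu_mean, mu_sd)
        sigma = rng.gauss(sigma_mean, sigma_sd)
        if 0 < sigma < mu:
            pairs.append((mu, sigma))
    return pairs

def pooled_summary(posteriors, n2, rng):
    """Pool n2 draws from each gamma posterior (shape, scale) and summarise.

    Returns the 2.5% quantile, the median, and the 97.5% quantile of the pool.
    """
    draws = [rng.gammavariate(shape, scale)
             for shape, scale in posteriors
             for _ in range(n2)]
    cuts = statistics.quantiles(draws, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    return cuts[0], statistics.median(draws), cuts[-1]

rng = random.Random(1)
pairs = sample_si_pairs(25, 6.63, 0.51, 3.28, 0.27, rng)
# Placeholder posteriors: in practice each pair would yield its own (shape, scale).
posteriors = [(701.0, 1.0 / 700.2)] * len(pairs)
lower, median, upper = pooled_summary(posteriors, 25, rng)
```

With n1 = n2 = 25 the pool holds 625 draws per window, matching the count stated above. Repeating this for each imputed dataset and taking the median of the resulting summaries gives the values reported.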
R was estimated based on rt-PCR-confirmed cases (Rcases), hospital admissions (Radmissions) and hospital-associated deaths (Rdeaths). R was also estimated separately using public and private sector data; rt-PCR-confirmed cases were classified according to the sector of the laboratory where samples were processed, whereas hospitalised cases and in-hospital deaths were classified according to the sector of the hospital where admission took place. This distinction is relevant because some patients who were admitted to public hospitals paid out of pocket to access testing services from private laboratories. Laboratory and healthcare service providers were disrupted during a period of civil unrest in KwaZulu-Natal and Gauteng provinces. We indicate the period of unrest, between 10 and 19 July 2021, and provide a simplistic indicator of its effects on R estimates by adding a shaded area to the graph, with opacity equal to the proportion of the backwards-looking generation time which occurs during or before the unrest, as this is the weight applied to past incidence values when estimating R.
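The shading rule can be written as a short Python function. This is one possible reading of the description: for a day k days after the unrest ended, the opacity is the share of the infectiousness profile at lags of k days or more, since those lags reach back to days during or before the unrest. The function name and the toy weights are hypothetical.

```python
def unrest_opacity(w, days_since_unrest_end):
    """Share of generation-interval weight falling on or before the last unrest day.

    w[s-1] is the weight for a lag of s days and is assumed to sum to 1.
    Days during the unrest itself (argument <= 0) get full opacity.
    """
    if days_since_unrest_end <= 0:
        return 1.0
    return sum(w[days_since_unrest_end - 1:])

toy_weights = [0.5, 0.3, 0.2]
opacities = [unrest_opacity(toy_weights, k) for k in range(0, 5)]
```

The opacity falls to zero once the whole infectiousness profile lies after the unrest, which is why the shaded area fades gradually.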
Ethics statement
This study received ethical clearance from the University of the Witwatersrand Human Research Ethics Committee (clearance certificate no. M210752, formerly M160667) and approval under reciprocal review from the Stellenbosch University Health Research Ethics Committee (project ID 19330, ethics reference no. N20/11/074_RECIP_WITS_M160667_COVID-19). All data were received in fully anonymized form. The need to obtain participant consent was waived by the ethics committees.
Results
National and provincial R estimates using time series of cases
Nationally, R based on cases (Rcases) dropped sharply following the closure of borders and schools in mid-March 2020 (Fig 1), followed by an increase in mid-April. During levels five and four lockdown, Rcases fluctuated around 1.25, with large credible intervals (Table 1 and Fig 1). Rcases remained steady through the level three lockdown in June 2020, with values between 1 and 1.5, then began to decrease in late June; the maximum Rcases in the first wave (after April 1st 2020) was 1.55 (CI: 1.43–1.66). Rcases crossed below 1 in mid-July and continued to decrease through early August. Rcases increased gradually during level two lockdown and through most of level one lockdown, then began to decrease in the first half of December while incidence in the second wave was still increasing; maximum Rcases in the second wave was 1.56 (CI: 1.47–1.64). The gradual decrease continued until the first week of January 2021, approximately one week after the introduction of adjusted level three lockdown, when Rcases declined sharply, reaching a value of 0.56 (CI: 0.53–0.60) by the end of January. Rcases then began to increase, in a pattern similar to that following the first wave, through May 2021, crossing above 1 in early April and peaking in mid-June; maximum Rcases in the third wave was 1.46 (CI: 1.38–1.53). Rcases dipped sharply from mid-June through late July, then climbed to a smaller peak in mid-to-late August. Rcases decreased through late August and September 2021, then remained approximately constant from late September through October 2021. Starting in early November 2021, Rcases increased rapidly throughout November 2021, reaching a maximum value of 3.33 (CI: 2.83–3.97), and then decreased rapidly through the end of December. Rcases increased slightly during January 2022, then remained constant during February and March. Rcases increased quickly in early to mid-April 2022 (Fig 1).
Fig 1. R estimates for each data endpoint (upper panel), South Africa, based on national daily time series of rt-PCR-confirmed cases, hospitalisations, and deaths (lower panel). R estimated using 7-day sliding windows, from early March 2020 through 25 April 2022. Results reflect median values (between imputations) of median R estimates and associated 2.5% and 97.5% credible intervals. L = Level. Red-shaded areas indicate the period during which civil unrest caused severe disruptions to surveillance in KwaZulu-Natal and Gauteng provinces; grey-shaded areas indicate gradually diminishing effects on R estimates.
Dates indicate the start of each period.
Trends in Rcases between provinces were generally similar, although several provinces (Northern Cape, North West, and Free State) exhibited extended first waves. Estimates for Western Cape, Gauteng, and to a lesser extent KwaZulu-Natal indicate transitory increases in transmission during October 2020, prior to the onset of the second wave (Fig 2 and section 1 in S1 Appendix).
Fig 2. R estimated using 7-day sliding windows. Results reflect median values (between imputations) of median R estimates and associated 2.5% and 97.5% credible intervals. L = Level. Red-shaded areas indicate the period during which civil unrest caused severe disruptions to surveillance in KwaZulu-Natal and Gauteng provinces; grey-shaded areas indicate gradually diminishing effects on R estimates.
While the timing of transmission trends varied substantially between provinces during the first wave, peak Rcases values during the second (Beta-dominated) wave occurred at similar times in most provinces, except in the Eastern Cape, where the second wave started approximately four weeks earlier than in other provinces; in addition, Limpopo, Mpumalanga, Northern Cape, and Free State experienced peak transmission slightly later than the more-densely populated provinces of Western Cape, Gauteng, and KwaZulu-Natal.
The timing and shape of the third (Delta-dominated) wave was more varied between provinces than the first two waves (Fig 2 and section 1 in S1 Appendix). Limpopo, Mpumalanga, North West, and Gauteng provinces experienced peak Rcases in late June or early July, while the Eastern Cape, KwaZulu-Natal, Western Cape, and Free State experienced peak Rcases in late July or August.
Comparing R estimates based on different data endpoints
Trends in R estimates based on cases (Rcases), hospitalisations (Radmissions), and in-hospital deaths (Rdeaths) were generally similar during the first three waves, but Rcases diverged from Radmissions and Rdeaths during the fourth wave, with the peak Rcases being higher than the peaks based on deaths and admissions. Throughout the epidemic, estimates based on deaths (and to a lesser extent admissions) were generally less stable and had wider credible intervals than estimates based on cases (see sections 2 and 3 in S1 Appendix). Ratios between the three endpoints varied considerably over the course of the epidemic (see section 6 in S1 Appendix).
Shortly after the peak of the first wave, in July 2020, Rcases dropped below Radmissions and Rdeaths; this occurred in several provinces as well as nationally. During the first wave, in mid-May 2020, Rdeaths was higher than Rcases, which was in turn higher than Radmissions. In November 2020, when Rcases and Radmissions dipped, as well as in late December 2020 and early January 2021, Rdeaths was higher than Rcases and Radmissions. In six out of nine provinces, Rcases exceeded Radmissions and Rdeaths during parts of June and July 2021.
In Gauteng, Radmissions had lower maxima in both waves than Rcases or Rdeaths; Rcases had similar maxima to Rdeaths but maintained these values for shorter periods. Rcases reached lower post-wave minima than Radmissions or Rdeaths. In KwaZulu-Natal, Radmissions changed less drastically during the first two waves, with lower maxima and higher minima, than Rcases or Rdeaths. Civil unrest in Gauteng and KwaZulu-Natal during July 2021 caused substantial disruptions to laboratory and healthcare services, with corresponding dips in R estimates followed by compensatory increases. All three endpoints were affected, although Rcases was the most affected.
During the fourth wave in late 2021 and early 2022, Rcases rose faster and higher than Radmissions or Rdeaths. R estimates based on the three data endpoints peaked within two days of one another in early December 2021, after which Rcases dropped more rapidly than Radmissions, which in turn dropped faster than Rdeaths. Estimates based on the three endpoints converged again in mid-February 2022.
Public versus private sector
Transmission patterns between clients of the public and private sectors were generally similar. Public and private sector R estimates differed most during the initial level five and four lockdowns, and in Rcases leading up to the peak of the fourth wave. At the end of the third wave in June/July 2021, and to a lesser extent at the end of the second wave, private-sector R dropped more rapidly than public-sector R, so that private-sector R was lower than public sector R (Fig 3). This difference appears in estimates using all three data endpoints for the national-level analysis, as well as in Gauteng, Free State, and KwaZulu-Natal, and appeared in estimates based on cases in all provinces except the Western Cape (see section 3 in S1 Appendix). During the fourth wave, Rcases in the private sector rose above Rcases in the public sector.
Fig 3. R estimates by sector, based on rt-PCR-confirmed COVID-19 cases (upper panel), hospitalisations (middle panel), and deaths (lower panel), South Africa. R estimates were generated using 7-day sliding windows. Results reflect median values (between imputations) of median R estimates and associated 2.5% and 97.5% credible intervals. L = Level. Red-shaded areas indicate the period during which civil unrest caused severe disruptions to surveillance in KwaZulu-Natal and Gauteng provinces; grey-shaded areas indicate gradually diminishing effects on R estimates.
Discussion
Overview
The initial level five lockdown in early 2020 had a substantial impact on transmission but was insufficient to bring R below one (Table 1). It is difficult to disentangle the effects of subsequent lockdowns from those of increasing population immunity and other drivers of behavioural change. Even given high levels of preceding population immunity [21, 22], the estimated peak R for the Omicron wave was substantially higher than previous waves. Average R values were lower in lower lockdown levels, probably because lockdown levels were increased in response to increasing transmission and lowered when transmission was deemed to be under control.
R estimates based on the three data endpoints were similar overall, although in practice hospitalisations and deaths provided slightly less timeous estimates. During our regular public reporting of R estimates [23], data updates on hospitalised cases and deaths were typically delayed by one to two weeks. Combined with longer truncation of hospitalised cases and deaths to account for late arriving data, this meant that estimates of Rcases were two to three weeks ahead of those for Radmissions, and three weeks ahead of those for Rdeaths. R followed generally similar patterns by province, with notable exceptions including extended waves in some provinces, an early second wave in the Eastern Cape, and unrest-related changes to R estimates in Gauteng and KwaZulu-Natal during July and August 2021. R estimates in the private and public sectors were similar prior to the fourth wave, except during the level five and four lockdowns of early 2020, but diverged during the fourth wave in late 2021 and early 2022.
Several studies featuring R estimates for South Africa have been published since the start of the COVID-19 pandemic. However, aside from the regular reports we released via the National Institute for Communicable Diseases (NICD) [23], the studies we identified relied on publicly reported time series data, which do not include symptom onset dates and which may be affected by backfilling and additional reporting delays, and used international estimates for the generation interval [1, 2, 24–30]; most studies also use a single measure of incidence and do not cover all of the first four waves. While currently available evidence suggests that clients of the public sector experienced higher levels of transmission during the first two waves [7, 31], we did not identify any studies comparing reproduction number estimates in different income groups, or between healthcare sectors.
Our national-level R estimates are consistent with estimates from two other South African studies using comparable methods [27, 32]. McCarthy et al. estimated R for the period before March 18th 2020 (when the first movement restrictions were enacted) at 4.15 (CI: 3.60–4.74), using a generation interval estimate of 5.7 ± 2.7 days, and ignoring importation status [32]; we estimated an R of 3.80 (CI: 3.03–4.92) for the same period. Roussouw obtained R trajectories consistent with ours, with R crossing 1 at approximately the same time as in our estimates, and peaking at similar values [27].
Provinces with drawn-out first waves (Northern Cape, North West, and Free State) are the three provinces with the lowest population density and smallest populations, possibly reflecting discontinuous epidemics in more isolated populations [33].
Events in some provinces, such as the civil unrest in KwaZulu-Natal and Gauteng provinces during July 2021, were associated with sudden drops in R estimates, followed by compensatory increases (see Figs 1–3). In general, short-term decreases in incidence observation processes (such as disruptions to testing facilities or decreases in healthcare-seeking behaviour) led to transitory dips in R estimates, followed by compensatory increases. This pattern of decline and increase is typical during public holidays, such as the Easter weekend in 2021, which coincided with a slight drop in both Radmissions and Rcases across most provinces (Fig 2). In KwaZulu-Natal, R estimates based on admissions (Radmissions) spiked following a nosocomial outbreak at a private hospital in early April [34] (Fig 2). Furthermore, a public hospital in Durban, KwaZulu-Natal, reported a sudden surge of approximately 100 admissions on May 26th 2020; this led to a spike, and subsequent dip, in R estimates for KwaZulu-Natal. All of these factors highlight the fact that R estimates can be affected by changes in testing and reporting practice, which could bias them, particularly over the short term.
While this work represents one lens through which to assess the effectiveness of lockdowns, thorough assessment of this important question (how effective were lockdowns?) would require additional sources of data, such as information on adherence to lockdown measures, and alternative modelling techniques beyond the scope of this work. Overall, the modelling approaches used here do not allow us to develop appropriate counter-factuals which would be necessary to examine these types of questions. Similarly, we did not have access to data which would allow us to clearly identify and differentiate between the causes of differences in R estimates based on different data endpoints.
Different data endpoints
R estimation using hospital admissions and deaths may prove valuable in scenarios where laboratory testing is limited or unavailable to the public or if there is changing access to testing throughout the pandemic, for example, policies restricting testing to severe cases during epidemic peaks such as those implemented in the Western Cape [35]. Numbers of hospitalisations should be more robust to changes in test availability, although changing hospital admission criteria may coincide with changes in the laboratory testing data; similarly, in-hospital deaths may change relative to incidence of infections due to improvements in treatment practices leading to reduced mortality over time. Overall, the general agreement of R estimates between endpoints prior to the fourth wave is encouraging, and points to the consistency of the imputation procedure. While agreement between endpoints would also suggest similarities in the levels of representativeness of each data endpoint, the ratios of incidence from the three time series varied substantially over the time period considered (see section 6 in S1 Appendix).
Stricter hospital admission practices near times of peak transmission, and corresponding relaxations in hospital admission practices following wave peaks, could have resulted in decreasing representativeness of hospitalisations (as a measure of incidence) during wave peaks and increasing representativeness following waves; this could explain the less extreme values and slower changes in Radmissions relative to Rcases. On the other hand, test-seeking behaviour may shift during waves, with heightened levels of concern at the beginning of waves leading to increases in test seeking and case confirmations. As wave peaks pass, people may be less concerned, leading to a reduction in test-seeking behaviour and a corresponding decrease in R estimates based on confirmed cases. Divergence between Rcases and Radmissions / Rdeaths during the fourth wave was likely caused by a combination of the above factors with reduced severity of outcomes for people infected with the Omicron variant relative to the previously dominant Delta variant, causing the proportion of underlying infections which resulted in hospital admissions to decrease [36, 37].
Were surveillance resources to be scaled back in the future and the reliability of one or two of the data endpoints called into question, our results suggest that the remaining data endpoints could still be used to monitor changes in transmission. However, robust surveillance of all three data endpoints should be maintained in some areas for validation purposes, particularly as increasing decoupling of COVID-19 cases, hospitalisations, and deaths is likely going forward.
Public versus private sector
The similarities between public- and private-sector R estimates are of particular interest in light of several seroprevalence studies which suggest that clients of the public sector experienced higher levels of SARS-CoV-2 transmission [7, 31, 38]. It is also important to note that clients of the public sector are under-represented in all three data endpoints relative to clients of the private sector: clients of the private sector represent approximately 17% of the South African population, yet only 52% of recorded cases came from public sector testing facilities.
The most straightforward reason for the similarity of public and private sector R estimates is that clients of the public and private sectors do not form two separate and isolated populations with regard to respiratory virus transmission, although there may have been less mixing during the initial highly restrictive lockdown levels five and four, leading to differing R estimates during this period. In addition, several biases could have made it more difficult to detect differences in transmission: for example, healthcare-seeking behaviour driven by heightened concern during periods of high incidence may have increased the proportion of infections that appeared in private sector data, inflating private sector R estimates near peaks in R. This explanation is consistent with the higher Rcases based on private sector data during the fourth wave. Capacity limitations of public sector healthcare providers and laboratory services (particularly during the first wave, when public sector testing experienced substantial processing delays and backlogs; see section 7 in S1 Appendix) may also have led to decreasing representativeness of public sector data near peaks in case and hospital admission numbers.
Unbiased R estimation requires that measures of disease incidence represent constant proportions of the true incidence of infections. In addition, the generation interval may have shifted over time with the appearance of new variants and changes in disease-related behaviours. Changes in laboratory capacity, including availability of test kits and reagents, levels of population immunity (whether pre-existing, infection-induced, or vaccine-induced), healthcare-seeking behaviour, treatment of COVID-19 disease, circulating SARS-CoV-2 strains, and data collection practices may all bias R estimates.
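The requirement of a constant reporting proportion can be illustrated with a minimal sketch. It is not the EpiEstim estimator used in this study: it uses a naive ratio of current incidence to a weighted sum of past incidence, and the weights, seed values, R values, and reporting fractions are all hypothetical choices made for illustration. Under these assumptions, a constant reporting fraction cancels out of the ratio, whereas a reporting fraction that declines over time pulls the estimates below the true values.

```python
# Illustrative sketch only: a naive renewal-equation ratio, not the EpiEstim
# estimator used in the study. All numbers below are hypothetical.

def infection_pressure(incidence, weights, t):
    """Weighted sum of past incidence: sum over s of w_s * I_{t-s}."""
    return sum(w * incidence[t - s] for s, w in enumerate(weights, start=1))

def simulate(seed, r_values, weights):
    """Deterministic renewal process: I_t = R_t * (weighted past incidence)."""
    incidence = list(seed)
    for r in r_values:
        t = len(incidence)
        incidence.append(r * infection_pressure(incidence, weights, t))
    return incidence

def naive_r(incidence, weights):
    """Point estimate R_t = I_t / (weighted past incidence) for each day."""
    return [
        incidence[t] / infection_pressure(incidence, weights, t)
        for t in range(len(weights), len(incidence))
    ]

# Hypothetical serial-interval weights (sum to 1) and true R trajectory.
weights = [0.2, 0.5, 0.3]
r_true = [1.5] * 20 + [0.8] * 20
true_incidence = simulate([10.0, 10.0, 10.0], r_true, weights)
n = len(true_incidence)

# Observed series: a constant 10% reporting fraction, and a fraction that
# declines linearly from 10% to 5% over the period.
constant = [0.10 * x for x in true_incidence]
drifting = [(0.10 - 0.05 * t / (n - 1)) * x for t, x in enumerate(true_incidence)]

est_true = naive_r(true_incidence, weights)
est_const = naive_r(constant, weights)
est_drift = naive_r(drifting, weights)

print("max |error|, constant reporting:",
      max(abs(a - b) for a, b in zip(est_const, est_true)))
print("max |error|, declining reporting:",
      max(abs(a - b) for a, b in zip(est_drift, est_true)))
```

In this sketch the direction of the bias follows the direction of the drift in reporting: a declining fraction lowers the estimates, and an increasing fraction would raise them.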
Furthermore, a number of factors may have altered mortality outcomes over time, including the introduction of dexamethasone treatment in mid-June 2020, the use of oxygen administration via high-flow nasal cannula, changes in the quality of healthcare provided if health systems are overwhelmed, changing levels of vaccine- and infection-induced immunity, and potential differences in severity between initially circulating viruses and the different variants that dominated the second, third, and fourth waves. Combined, these factors may lead to perturbations in the time series data that are unrelated to transmission. In addition, the distributions of delays between symptom onset and case report/admission/death may change over time, which would affect the accuracy of adjustments for right-censoring at the end of the time series. Because civil unrest led to reductions in testing rates in KwaZulu-Natal and Gauteng provinces in July 2021, trends in R estimates during that time for these provinces, as well as nationally, should be interpreted with caution.
The agreement of R estimates based on the three data endpoints prior to the fourth wave, along with the fact that many of the likely biases would affect only one or two of the three endpoints, suggests that the above biases were relatively small and/or together moved the three data endpoints in similar ways.
Early (pre-lockdown) incidence included a substantial portion of imported cases, and our early R estimates are likely not an accurate reflection of transmission trends; the sharp drop in R estimates following the initial border closures may be attributed in part to the sudden decrease in imported cases (see McCarthy et al. [32] for early R estimates incorporating data on importation status).
Future work could include comparison with other R estimation methods [20, 39], methods to correct for holidays and disruptions to surveillance processes (e.g. due to civil unrest), and disentangling the role of immunity from trends in R.
Conclusion
We conducted a robust process of R estimation for COVID-19 in South Africa. The use of high-resolution data allowed us to directly compare estimates based on different data endpoints, since all estimates were based on symptom onset dates, to compare estimates between the public and private sectors, and to exclude antigen testing due to issues with data quality and reporting completeness.
We found that different data endpoints yielded similar R estimates during the first three waves but diverged during the fourth wave, suggesting that, while useful R estimates could potentially be obtained from only one or two data endpoints, particularly if virus severity remained unchanged, decision-makers should where possible consider R estimates using multiple data endpoints. We found similar R estimates using public and private sector data, although clients of the public sector were heavily underrepresented in the data, and private sector Rcases was higher than public sector Rcases during the fourth wave.
Although R estimates provide limited resolution for understanding the drivers of transmission, the estimates presented here served a valuable role in the South African COVID-19 response efforts by providing routine monitoring of transmission trends.
Acknowledgments
We thank the numerous individuals and organizations involved in collecting and curating the DATCOV and NMC line list datasets (NMC epidemiology team: A. Moipone Shonhiwa, G. Ntshoe, J. Ebonwu, L. Motsuku, L. Shuping, M. Muchengeti, J. Kleynhans, G. Hunt, V. Odhiambo Olago, H. Ismail, N. Govender, A. Mathews, V. Essel, V. Msimang, T. Kufa-Chakezha, N. Villyen Motaze, N. Mayet, T. Mmaborwa Matjokotja, M. Neti, T. Arendse, T. Lamola, I. Matiea, D. Muganhiri, B. Ndlovu, K. Ravhuhali, E. Ramutshila, S. Mhlanga, A. Mzoneli, N. Naran, T. Whitbread, M. Moeti, C. Iwu, E. Mathatha, F. Gavhi, M. Makamu, M. Makhubele, S. Mdleleni, B. Chiger, and J. Kleynhans; information technology team: T. Mukange, T. Bell, L. Darwin, F. McKenna, N. Munava, M. Raza Bano, T. Ngobeni; DATCOV team: L. Blumberg, R. Kai, S. Dyasi, T. Arendse, M. Masha, B. Cowper, K. Skhosana, F. Malomane, M. Blom, A. Mzoneli, S. Mhlanga, B. Ali, C. Mudara, L. Ozougwu, R. Welch, N. Mfongeh, P. Manana, Y. Mangwane, M. Mokgosana, T. Buthelezi, P. Makwene, M. Dryden, C. Vika).
We thank Dr Yuri Munsamy, who provided editing services on behalf of SACEMA.
References
- 1. Childs SJ. Quantification of the South African Lockdown Regimes, for the SARS-CoV-2 Pandemic, and the Levels of Immunity They Require to Work. medRxiv. 2020; 2020.07.11.20151555.
- 2. Direct and Indirect Health Effects of Lockdown in South Africa. In: Center For Global Development [Internet]. [cited 27 May 2021]. Available: https://www.cgdev.org/publication/direct-and-indirect-health-effects-lockdown-south-africa
- 3. Regulations and Guidelines ‐ Coronavirus COVID-19 | South African Government. [cited 20 Jul 2021]. Available: https://www.gov.za/covid-19/resources/regulations-and-guidelines-coronavirus-covid-19
- 4. Network for Genomic Surveillance in South Africa (NGS-SA). SARS-CoV-2 Sequencing Update. National Institute for Communicable Diseases, South Africa; 2021. Available: https://www.nicd.ac.za/wp-content/uploads/2022/01/Update-of-SA-sequencing-data-from-GISAID-30-Dec-2021_dash.pdf
- 5. Statistics South Africa. General Household Survey 2018. Statistics South Africa; 2019 May p. 203. Report No.: 318. Available: https://www.statssa.gov.za/publications/P0302/P03022019.pdf
- 6. Söderlund N, Hansl B. Health insurance in South Africa: an empirical analysis of trends in risk-pooling and efficiency following deregulation.: 8.
- 7. Shaw JA, Meiring M, Cummins T, Chegou NN, Claassen C, Plessis ND, et al. Higher SARS-CoV-2 seroprevalence in workers with lower socioeconomic status in Cape Town, South Africa. PLOS ONE. 2021;16: e0247852. pmid:33630977
- 8. Chatterjee A, UNU-WIDER. Measuring wealth inequality in South Africa: An agenda. 45th ed. UNU-WIDER; 2019. https://doi.org/10.35188/UNU-WIDER/2019/679-1
- 9. Maphumulo WT, Bhengu BR. Challenges of quality improvement in the healthcare of South Africa post-apartheid: A critical review. Curationis. 2019;42: 1901. pmid:31170800
- 10. Hawkins RB, Charles EJ, Mehaffey JH. Socio-economic status and COVID-19–related cases and fatalities. Public Health. 2020;189: 129–134. pmid:33227595
- 11. JMIR Public Health and Surveillance ‐ Novel Coronavirus in Cape Town Informal Settlements: Feasibility of Using Informal Dwelling Outlines to Identify High Risk Areas for COVID-19 Transmission From A Social Distancing Perspective. [cited 24 Jun 2021]. Available: https://publichealth.jmir.org/2020/2/e18844/
- 12. Cori A, Ferguson NM, Fraser C, Cauchemez S. A New Framework and Software to Estimate Time-Varying Reproduction Numbers During Epidemics. American Journal of Epidemiology. 2013;178: 1505–1512. pmid:24043437
- 13. Fraser C. Estimating Individual and Household Reproduction Numbers in an Emerging Epidemic. PLOS ONE. 2007;2: e758. pmid:17712406
- 14. Thompson RN, Stockwin JE, van Gaalen RD, Polonsky JA, Kamvar ZN, Demarsh PA, et al. Improved inference of time-varying reproduction numbers during infectious disease outbreaks. Epidemics. 2019;29: 100356. pmid:31624039
- 15. Pulliam JRC, van Schalkwyk C, Govender N, von Gottberg A, Cohen C, Groome MJ, et al. Increased risk of SARS-CoV-2 reinfection associated with emergence of Omicron in South Africa. Science. 2022;376: eabn4947. pmid:35289632
- 16. NICD National COVID-19 Hospital Surveillance. National Institute for Communicable Diseases; 2021 Jun p. 5. Available: https://www.nicd.ac.za/wp-content/uploads/2021/07/DATCOV-National-report-20210630.pdf
- 17. Cohen C, Kleynhans J, Gottberg A von, McMorrow ML, Wolter N, Bhiman JN, et al. SARS-CoV-2 incidence, transmission and reinfection in a rural and an urban setting: results of the PHIRST-C cohort study, South Africa, 2020–2021. 2021 Jul p. 2021.07.20.21260855. pmid:34909794
- 18. Buuren S van, Groothuis-Oudshoorn K, Vink G, Schouten R, Robitzsch A, Rockenschaub P, et al. mice: Multivariate Imputation by Chained Equations. 2021. Available: https://CRAN.R-project.org/package=mice
- 19. Kleinke K. kkleinke/countimp. 2020. Available: https://github.com/kkleinke/countimp
- 20. Gostic KM, McGough L, Baskerville EB, Abbott S, Joshi K, Tedijanto C, et al. Practical considerations for measuring the effective reproductive number, Rt. PLOS Computational Biology. 2020;16: e1008409. pmid:33301457
- 21. Madhi SA, Kwatra G, Myers JE, Jassat W, Dhar N, Mukendi CK, et al. Population Immunity and Covid-19 Severity with Omicron Variant in South Africa. N Engl J Med. 2022;386: 1314–1326. pmid:35196424
- 22. Kleynhans J, Tempia S, Wolter N, von Gottberg A, Bhiman J, Buys A, et al. SARS-CoV-2 Seroprevalence in a Rural and Urban Household Cohort during First and Second Waves of Infections, South Africa, July 2020–March 2021. Emerging Infectious Disease journal. 2021;27. pmid:34477548
- 23. NICD. The Daily COVID-19 Effective Reproductive Number (R) in South Africa. National Institute for Communicable Diseases; 2021 May p. 13. Available: https://www.nicd.ac.za/wp-content/uploads/2021/08/COVID-19-Effective-Reproductive-Number-in-South-Africa-week-32.pdf
- 24. Garba SM, Lubuma JM-S, Tsanou B. Modeling the transmission dynamics of the COVID-19 Pandemic in South Africa. Mathematical Biosciences. 2020;328: 108441. pmid:32763338
- 25. Giandhari J, Pillay S, Wilkinson E, Tegally H, Sinayskiy I, Schuld M, et al. Early transmission of SARS-CoV-2 in South Africa: An epidemiological and phylogenetic report. medRxiv. 2020 [cited 27 May 2021]. pmid:32511505
- 26. Mbuvha R, Marwala T. Bayesian inference of COVID-19 spreading rates in South Africa. PLOS ONE. 2020;15: e0237126. pmid:32756608
- 27. Roussouw L. Estimating the Effective Reproduction Number of COVID-19 in South Africa. 2021. Available: https://unsupervised.online/static/covid-19/estimating_r_za.html
- 28. Olivier LE, Craig IK. An epidemiological model for the spread of COVID-19: A South African case study. arXiv e-prints. 2020;2005: arXiv:2005.08012.
- 29. Musa SS, Zhao S, Wang MH, Habib AG, Mustapha UT, He D. Estimation of exponential growth rate and basic reproduction number of the coronavirus disease 2019 (COVID-19) in Africa. Infectious Diseases of Poverty. 2020;9: 96. pmid:32678037
- 30. May 26 AAP, Doi 2021. Covid-19: Estimates for South Africa. In: Covid-19 [Internet]. [cited 27 May 2021]. Available: https://epiforecasts.io/covid/posts/national/south-africa/
- 31. George JA, Khoza S, Mayne E, Dlamini S, Kone N, Jassat W, et al. Sentinel seroprevalence of SARS-CoV-2 in the Gauteng province, South Africa August to October 2020. Infectious Diseases (except HIV/AIDS); 2021 Apr.
- 32. Mccarthy K, Tempia S, Kufa T, Kleynhans J, Wolter N, Jassat W, et al. The Importation and Establishment of Community Transmission of SARS-CoV-2 During the First Eight Weeks of the South African COVID-19 Epidemic. Rochester, NY: Social Science Research Network; 2021 Feb. Report No.: ID 3792114. pmid:34405139
- 33. Statistics South Africa. Mid-year population estimates 2020. Statistics South Africa; 2020 Jul p. 35. Report No.: 302. Available: http://www.statssa.gov.za/publications/P0302/P03022020.pdf
- 34. Ministerial Advisory Committee (MAC) on COVID-19. St Augustine Hospital Outbreak of COVID-19 ‐ Interim Report. National Department of Health, South Africa; 2020 May p. 2. Available: https://sacoronavirus.co.za/wp-content/uploads/2020/08/Memo_Advisory-St-Augustine-interim-report-final.pdf
- 35. Mahomed H, Gilson L, Boulle A, Davies M-A, Khan S, Carthy KM, et al. The evolution of the COVID-19 pandemic and health system responses in South Africa and the Western Cape Province–how decision-making was supported by data. District Health Barometer 2019/2020. Health Systems Trust; 2020. p. 18. Available: https://www.hst.org.za/publications/District%20Health%20Barometers/DHB%202019–20%20Section%20A,%20chapter%208%20-%20COVID-19%20pandemic.pdf
- 36. Wolter N, Jassat W, Walaza S, Welch R, Moultrie H, Groome M, et al. Early assessment of the clinical severity of the SARS-CoV-2 omicron variant in South Africa: a data linkage study. The Lancet. 2022;399: 437–446. pmid:35065011
- 37. Davies M-A, Kassanjee R, Rousseau P, Morden E, Johnson L, Solomon W, et al. Outcomes of laboratory-confirmed SARS-CoV-2 infection in the Omicron-driven fourth wave compared with previous waves in the Western Cape Province, South Africa. Tropical Medicine & International Health. 2022;27: 564–573. pmid:35411997
- 38. Vermeulen M, Mhlanga L, Sykes W, Coleman C, Pietersen N, Cable R, et al. Prevalence of anti-SARS-CoV-2 antibodies among blood donors in South Africa during the period January-May 2021. 2021.
- 39. Abbott S, Hellewell J, Sherratt K, Gostic K, Hickson J, Badr HS, et al. EpiNow2: Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters. 2020. Available: https://CRAN.R-project.org/package=EpiNow2