Seroprevalence of SARS-CoV-2 antibodies in Bali Province: Indonesia shows underdetection of COVID-19 cases by routine surveillance

The international tourist destination of Bali reported its first case of Coronavirus Disease 2019 or COVID-19 in March 2020. To better understand the extent of exposure of Bali’s 4.3 million inhabitants to the COVID-19 virus, we performed two repeated cross-sectional serosurveys stratified by urban and rural areas. We used a highly specific multiplex assay that detects antibodies to three different viral antigens. We also assessed demographic and social risk factors and history of symptoms. Our results show that the virus was widespread in Bali by late 2020, with 16.73% (95% CI 12.22–21.12) of the population having been infected by that time. We saw no differences in seroprevalence between urban and rural areas, possibly due to extensive population mixing, and similar levels of seroprevalence by gender and among age groups, except for lower seroprevalence in the very young. We observed no difference in seroprevalence between our two closely spaced surveys. Individuals reporting symptoms in the past six months were about twice as likely to be seropositive as those not reporting symptoms. Based upon official statistics for laboratory diagnosed cases for the six months prior to the survey, we estimate that for every reported case an additional 52 cases, at least, were undetected. Our results support the hypothesis that by late 2020 the virus was widespread in Bali, but largely undetected by surveillance.


Introduction
Coronavirus Disease 2019 or COVID-19 is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1]. Indonesia reported its first COVID-19

Cross-sectional serosurveys were conducted in two rounds between 14 October and 24 December 2020. Specimen collection for round 1 took place from 14 October to 14 November and for round 2 from 15 November to 24 December. The sample was stratified into urban (Denpasar) and rural (outside Denpasar) areas, and census enumeration areas were selected by multi-stage cluster random sampling.

Population and sample
Sample size and power. The survey was designed to provide an estimate of seroprevalence of SARS-CoV-2 in Bali Province, in urban Denpasar, and in rural areas outside of the capital. The minimum required sample size was calculated using this formula.
We assumed that the expected prevalence of antibodies to SARS-CoV-2 was higher in Denpasar (7%) than in the rest of the island (5%). We also assumed a conservative design effect (DE) of 2 to account for the expected increase in variance due to clustering, a response rate of 75%, a 95% confidence level (1-α = 0.95; Z1-α/2 = 1.96), and a 2% margin of error (d), resulting in a sample size of 1,555 in urban Denpasar and 1,138 in rural areas (total of 2,693 individuals) per survey. With an average expected number of persons per household of 3.46 in Denpasar and 3.89 in the non-Denpasar areas [21], the number of households expected to be visited was 450 in urban and 293 in rural areas (total of 743 households) per survey.
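The formula itself is not reproduced in this text. As a rough illustration only, the sketch below applies the standard single-proportion sample size formula, inflated by the design effect and divided by the response rate. The function name, the parameter names, and the choice to divide by the response rate (rather than inflate by a fixed percentage) are assumptions, not the authors' stated method.

```python
import math

def required_sample_size(p, d=0.02, z=1.96, design_effect=2.0, response_rate=0.75):
    """Persons to approach to estimate a prevalence p within +/- d.

    Assumed form (not taken from the article):
        n = DE * z^2 * p * (1 - p) / d^2 / response_rate
    """
    n_srs = (z ** 2) * p * (1 - p) / (d ** 2)    # simple random sampling
    n_cluster = n_srs * design_effect             # inflate for clustering
    return math.ceil(n_cluster / response_rate)   # inflate for non-response

urban = required_sample_size(0.07)  # assumed prevalence in Denpasar
rural = required_sample_size(0.05)  # assumed prevalence outside Denpasar
```

Under these assumptions the sketch returns 1,668 (urban) and 1,217 (rural), the same order of magnitude as the reported 1,555 and 1,138 but not identical, so the authors' calculation evidently differed in some detail that cannot be recovered from the text.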
Sample, household (HH) and participant selection. A household was defined as a group of persons who reside in the same place and prepare meals together. All members of selected households who were ≥1 year of age were eligible if informed consent was obtained from either the participant or the parent or guardian.
The National Statistics Office provided enumeration areas (census blocks based upon a March 2020 national social and economic survey) for Bali as the sampling frame. A total of 38 census blocks were selected, 23 from Denpasar and 15 from rural districts, and 20 HHs were selected per census block.
A two-stage systematic random sampling strategy was applied. In stage 1, census blocks were selected in urban and rural areas; in stage 2, 20 HHs in each census block were selected after stratification by educational level of the head of HH. The samples were mutually independent between surveys: census blocks with odd serial numbers were used as the sampling frame for the round 1 survey, while census blocks with even serial numbers were used for round 2. A sample reserve of 20%, or 4 HHs per census block, was prepared. Backup samples were selected by systematic sampling from the list of HHs not selected in the original procedure.
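The selection steps described above can be sketched as follows. This is a minimal illustration, not the National Statistics Office procedure: the function names are hypothetical, the household list is made up, and treating the household list as already ordered by education level of the head of HH (implicit stratification) is an assumption.

```python
import random

def split_blocks_by_round(block_serials):
    """Odd serial numbers form the round 1 frame, even ones the round 2 frame."""
    round1 = [b for b in block_serials if b % 2 == 1]
    round2 = [b for b in block_serials if b % 2 == 0]
    return round1, round2

def systematic_sample(units, n, rng):
    """Pick n units from an ordered list using a random start and a fixed interval."""
    if n > len(units):
        raise ValueError("cannot select more units than are listed")
    step = len(units) / n
    start = rng.random() * step
    last = len(units) - 1
    return [units[min(int(start + i * step), last)] for i in range(n)]

rng = random.Random(2020)
# Hypothetical block of 100 households, assumed sorted by education of the head of HH.
households = list(range(100))
selected = systematic_sample(households, 20, rng)       # 20 HHs per census block
not_selected = [h for h in households if h not in selected]
reserve = systematic_sample(not_selected, 4, rng)       # 20% reserve = 4 HHs
```

Drawing the reserve from the households left over after the main draw mirrors the description that backup samples came from the list of HHs not originally selected.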

Data collection
Training. A technical guide for fieldwork including all survey procedures was prepared and reviewed during two days of training for the eight survey teams. The survey teams were monitored by project staff throughout the survey.
Questionnaire. A structured questionnaire in the Indonesian language was developed based on previous behavioral and risk factor assessments for SARS-CoV-2 exposure [22][23][24]. Basic demographic and economic information (number of HH members, HH income, age, gender, education, and marital status) was obtained along with a history of symptoms during the last 6 months. We also developed a non-response form for those who refused. The interview questions were pilot tested and evaluated by local experts to ensure that they were culturally appropriate and could be understood by a layperson with a primary education. The questionnaire was revised after piloting in enumeration areas not selected for the survey.
Interview. Survey teams visited HHs in coordination with local leaders and guides from the district statistical office. Interview responses were recorded on a password protected handheld device that uploaded to a secure database upon completion of the questionnaire. Paper format was used in areas where internet access was limited.
Blood sampling and processing. Dried blood spot (DBS) samples were collected by finger prick and spotted onto Whatman 903 filter paper. An identification number was placed on each filter paper. Each participant's data included a unique identifier (barcoded label). Data collection in each census block was on average completed in four days.
DBS specimens were registered in an electronic log book for tracking. Blood spots were dried overnight at room temperature until uniformly dark brown with no red color visible. The DBS were then stored and shipped by air at 4-8˚C to the Eijkman Institute for Molecular Biology in Jakarta, where they were stored at -20˚C. SARS-CoV-2 antibody testing was performed on a Luminex MAGPIX instrument using the Tetracore FlexImmArray SARS-CoV-2 Human IgG Antibody assay [25]. This test uses multiplex bead technology to detect antibodies to three different antigenic sites on the virus: spike (S), nucleoprotein (NP) and hybrid. The test uses seven microspheres, three detecting antibodies to the different SARS-CoV-2 antigens and four serving as internal controls. Each 96-well test plate accommodated up to 90 samples per test run. Each test run included SARS-CoV-2 negative control serum, positive control serum, and calibrator, each in duplicate. Antibody response was determined qualitatively using the ratio of the mean fluorescence intensity (MFI) of the target antigen to the MFI of the calibrator. Results were considered SARS-CoV-2 human IgG positive if all three target antigens had a ratio of ≥1.2, negative if the ratio was ≤0.9, and indeterminate if the ratio fell between these values. Specimens with indeterminate results were re-tested; if the second test was also indeterminate, the specimen was classified as negative. A positive result implied past exposure to the virus.
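The decision rule above can be written as a short sketch. It reads the cut-offs as ratio ≥1.2 for positive and ≤0.9 for negative. The text does not say how mixed patterns (for example two antigens above 1.2 and one below 0.9) were handled; treating them as indeterminate, and requiring all three ratios to be ≤0.9 for a negative call, are assumptions made here. All names are hypothetical and this is not the assay vendor's software logic.

```python
POSITIVE_CUTOFF = 1.2   # ratio at or above this counts toward a positive call
NEGATIVE_CUTOFF = 0.9   # ratio at or below this counts toward a negative call

def classify_run(ratios):
    """Classify one test run from MFI(target)/MFI(calibrator) ratios for S, NP and hybrid."""
    if all(r >= POSITIVE_CUTOFF for r in ratios):
        return "positive"
    if all(r <= NEGATIVE_CUTOFF for r in ratios):   # assumption: all three must be low
        return "negative"
    return "indeterminate"                           # assumption: mixed patterns land here

def final_result(first_run, retest=None):
    """Apply the re-test rule: a second indeterminate result is classified as negative."""
    result = classify_run(first_run)
    if result != "indeterminate":
        return result
    if retest is None:
        return "retest required"
    second = classify_run(retest)
    return "negative" if second == "indeterminate" else second
```

For example, `final_result((1.5, 1.0, 2.0), retest=(1.4, 1.1, 1.9))` yields a negative classification because both runs are indeterminate.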

Data analysis
Weighting was performed by the National Statistical Office to account for sampling design, response rate, and stratification. Descriptive analysis was conducted for characteristics of the HHs and respondents for round 1, round 2, and both rounds combined. The overall prevalence of SARS-CoV-2 antibodies was calculated using weighted and unweighted data, per round and for both rounds combined. For each population estimate, we calculated a confidence interval, expressed as a percentage, within which the population value is expected to lie. Prevalence was calculated for rural and urban areas, by socio-demographic parameters, and by presence or absence of reported symptoms. Sensitivity analysis was not carried out, as an unpublished report showed that the Tetracore test kit had a sensitivity of 89% and specificity of 100% [25]. Raw data used for analysis are provided in S1 Data.
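As a minimal sketch of the two quantities described above, the code below computes a weighted prevalence and a normal-approximation (Wald) confidence interval. The article does not state which interval method or variance estimator was used, so the Wald form, the optional design-effect multiplier, and the function names are assumptions. The toy weights in the example are made up.

```python
import math

def weighted_prevalence(positives, weights):
    """Share of total survey weight carried by seropositive respondents."""
    total = sum(weights)
    return sum(w for is_pos, w in zip(positives, weights) if is_pos) / total

def wald_interval(p, n, z=1.96, design_effect=1.0):
    """Normal-approximation interval for a proportion, optionally widened for clustering."""
    se = math.sqrt(design_effect * p * (1 - p) / n)
    return p - z * se, p + z * se

# Toy data: four respondents with hypothetical sampling weights.
toy = weighted_prevalence([True, False, False, True], [2.0, 1.0, 1.0, 1.0])

# Unweighted combined estimate reported in the Results: 17.5%, N = 2,545.
low, high = wald_interval(0.175, 2545)
```

With the reported unweighted estimate, this simple interval comes out at roughly 16.0% to 19.0%, close to the published 16.01-18.96. The much wider weighted interval reflects the survey design, which a plain Wald interval does not capture.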
In Bali Province, monitoring of COVID-19 cases is performed through active and passive surveillance systems. The passive system records the number of confirmed infections, deaths, and recovered hospitalized COVID-19 patients from public health centers, local and central public hospitals, and some private hospitals in Bali Province. Active surveillance records cases found through contact tracing and screening at entry points such as the airport and harbour. Information from active and passive surveillance is entered into the single sign-on (SSO) system. The system was developed in mid-June 2020 and was used until September 2021. Since then, COVID-19 data have been transferred to a central information system called 'New All Record' (NAR). The New All Record system was established in April 2021 [26]. The possibility of a gap between detected and reported COVID-19 cases in this surveillance system is beyond the scope of our study.
Using the SSO data from 1 June-30 November 2020 (https://infocorona.baliprov.go.id/), a period for which we assumed that infections would have produced antibodies detectable at the time of our survey, we compared the estimated seroprevalence of SARS-CoV-2 with the reported cumulative COVID-19 cases in Bali Province and calculated the number of missed infections during this period.

Results
Table 1 presents the characteristics of the census blocks surveyed. Among 38 targeted census blocks, one census block was not surveyed during round 1 due to refusal by local authorities; in round 2, all 38 census blocks were successfully surveyed. The acceptance rate of HHs and respondents was slightly higher in rural than in urban areas, and was higher in round 2 than in round 1. The acceptance rate for DBS collection was low in both rounds, but was higher in rural areas and in round 2. Very few samples failed quality control (QC) for the lab assays, with only 1.8% failing in round 1 and 1.1% failing in round 2. Samples that did not pass QC were excluded from the analysis.
Table 2 presents the sociodemographic and household characteristics of respondents consenting to interview and blood sampling. Respondents in the 15-54 year age group were more likely to consent to interview than those younger or older. Gender and marital status were not associated with consent to interview. The median level of education was high school, and mean monthly household income in both surveys was about 3 million IDR (215 USD). While most households had ventilation, only 20% were air-conditioned. Blood samples were obtained from a low proportion of children (1-4 y.o. and 5-14 y.o.) in both rounds, with an increase in round 2.
Table 3 shows details of seroprevalence for each survey round. The combined overall prevalence for both surveys (N = 2,545) was 17.5% (95% CI 16.01-18.96) without weighting and 16.73% (95% CI 12.22-21.12) with weighting.
Contrary to expectation, neither survey round showed a statistically significant difference in prevalence between urban and rural areas. Prevalence based on socio-demographics also showed no significant difference by sex, age group, education level, or marital and employment status in either survey round. Respondents with symptoms were more likely to be seropositive than asymptomatic individuals. As shown in Fig 1, seroprevalence results were extrapolated to estimate the total number of infections in Bali and compared to the actual number of infections reported by the surveillance system of the Government of Bali. The results show that only about 2% of infections were reported, with 52 cases likely occurring for each laboratory-confirmed reported case.
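The extrapolation can be illustrated with a few lines of arithmetic. The population (4.3 million) and weighted seroprevalence (16.73%) come from the article. The official SSO case count is not reproduced in this text, so `reported_cases` below is a placeholder chosen only so that the arithmetic lands near the reported ratio; it is not the official figure. The function name is hypothetical.

```python
def underdetection(population, seroprevalence, reported_cases):
    """Compare infections implied by a serosurvey with cases reported by surveillance."""
    estimated_infections = population * seroprevalence
    undetected = estimated_infections - reported_cases
    return {
        "estimated_infections": estimated_infections,
        "fraction_reported": reported_cases / estimated_infections,
        "undetected_per_reported": undetected / reported_cases,
    }

summary = underdetection(
    population=4_300_000,      # Bali inhabitants, from the article
    seroprevalence=0.1673,     # weighted combined estimate, from the article
    reported_cases=13_500,     # PLACEHOLDER, not the official SSO count
)
```

With these inputs the estimated number of infections is about 719,000, the reported fraction is a little under 2%, and there are roughly 52 undetected infections per reported case, which shows how the two headline figures relate to each other.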

PLOS GLOBAL PUBLIC HEALTH
Nearly three-fourths of respondents reported no symptoms of SARS-CoV-2 infection in the last six months, as shown in Table 4. The most common symptoms reported were fever, headache, and rhinitis.

Discussion
This article reports results of a repeated cross-sectional SARS-CoV-2 serosurvey conducted in Bali Province in late 2020 [27]. The study was conducted relatively early in the pandemic, yet a high prevalence of SARS-CoV-2 antibodies was detected, with a round 1 estimate of 18.04%. The results of this survey are similar to those obtained in July 2020 in India [16,28], lower than results from a survey in Iran in April 2020 [29], but much higher than results from a systematic review for Asia and Southeast Asia for January-December 2020 showing seroprevalence of only 0.6% (0.3-1.4%) [27]. Within Indonesia, results from this survey were significantly lower than those obtained in Jakarta in March 2021, which showed seroprevalence of 44.6% using the same laboratory methods [30]. However, the prevalence reported in the Bali survey was higher than that reported during a comparable period in East Java (11%) [31].

The estimated level of SARS-CoV-2 prevalence from this survey implies that the surveillance system in Bali detected and reported an extremely low proportion of positive cases, as shown in Fig 1. In Jakarta, a larger testing effort was able to detect and report a higher proportion (8.1%) of infections [30]; however, this is still far below the level required by WHO [10,11]. This situation occurred not only in Indonesia but elsewhere, as one systematic review has reported that numbers of infections estimated by serosurveys are always higher than the actual number of cases detected [27]. The low proportion of positive cases detected by the surveillance system reflects the fact that neither Bali's testing rate nor its contact tracing efforts meet WHO standards [10,11]. Weakness of surveillance systems in detecting COVID-19 has also been reported in Africa [32]. Suboptimal reporting and monitoring of COVID-19 cases may create a false impression of decreasing COVID-19 incidence [32]. A critical assessment of COVID-19 data in Indonesia [33] showed that the surveillance system reported only confirmed, recovered and fatal cases and did not report suspected cases who died. It also lacked geographic and demographic details at the national, provincial and district levels, as has been reported in Africa [32]. Suboptimal surveillance may misdirect policymaking and control strategies [32], may diminish the effectiveness of policy initiatives at the local level [33], and in the worst case may adversely affect morbidity and mortality rates.
This survey was conducted using a standard methodology. Overestimation of prevalence due to clustering of cases within families was low, given that only 7% of families had more than one positive individual. As the laboratory test used requires positivity for antibodies to three different antigens, its specificity is high (100%) while sensitivity is 89% [15], leading to possible underestimation of seroprevalence.
Symptoms which were possibly related to COVID-19 were reported in 34.76% (95% CI 7.00-62.50) of respondents in round 1 and 26.45% (95% CI 8.35-44.45) of respondents in round 2. In contrast, seroprevalence in those not reporting symptoms was about half this level, though the differences were not statistically significant. In Iran, the proportion of seropositive asymptomatic individuals was much higher (57.2%) [17]. Even if the potential spread from asymptomatic cases is low [13], their relatively high numbers lead to significant transmission risk [34,35], particularly in the context of weak active case detection and contact tracing.
Results from this study showed that 72% of seropositive respondents reported no symptoms whatsoever. This figure is probably too high [13], as a recent meta-analysis estimated the proportion of asymptomatic cases at 17% (95% CI 14-20%) [14], though several other studies showed wider variation [17,36]. This study is subject to recall bias, as we asked about symptoms occurring in the past six months, and mild illness may have been forgotten or not regarded as illness at all. Further, at the time of the interview and DBS collection, some participants may have been subclinical or pre-symptomatic [14], states that have been shown to carry a high risk of transmission [13,36]. Therefore, results from this survey support the conclusion that by focusing on symptomatic individuals, programs lose the opportunity to prevent transmission from asymptomatic individuals [13,14].
No difference in seroprevalence was found between the two survey rounds. During the first round of data collection, the response rate for blood draw was very low for various reasons (e.g. fear of diagnosis with COVID-19, fear of needles, being too young for blood draw). We attempted to increase the response rate through renewed outreach to the mayor, heads of districts and sub-districts, and heads of villages and sub-villages, and by providing additional incentives for participation, such as a blood type test (for children) and blood glucose and blood pressure tests (for adult respondents). These measures helped to increase the acceptance of blood draw in the second survey, resulting in a narrower 95% confidence interval for SARS-CoV-2 seroprevalence in the second round compared to the first. Nevertheless, analysis of sociodemographic characteristics of respondents for the two rounds showed that acceptance of blood draw was significantly higher for females than for males (p = 0.00), and significantly higher for rural than for urban areas (p = 0.00). Regarding symptoms and acceptance of blood draw, the percentages reporting headache and dyspnea were higher among those providing blood samples than among those not providing them. However, for other symptoms, we found no differences between those with and without blood samples, making us confident that differences in blood collection success between survey rounds did not result in significant bias.
We found no difference in prevalence between urban Denpasar and the rest of rural Bali. This may be because road transport and access throughout Bali are relatively good [13]. In addition, movement between rural and urban areas is common, as most respondents (round 1 = 71.9% and round 2 = 69.7%) reported participating in traditional ceremonies in their home villages while working or residing elsewhere.
This study has some weaknesses. First, response rate was low for DBS collection, with responses particularly low among very young participants. Second, as symptoms were reported by recall, the results are subject to bias. Nonetheless, results do show that the virus had spread substantially in Bali by late 2020, in contrast to findings from official statistics.