Zika, dengue and chikungunya population prevalence in Rio de Janeiro city, Brazil, and the importance of seroprevalence studies to estimate the real number of infected individuals

In the last 40 years, Latin American countries, including Brazil, have suffered from the emergence and reemergence of arboviruses, first Dengue (DENV) and more recently Zika (ZIKV) and Chikungunya (CHIKV). All three arboviruses are currently endemic in Brazil and have caused major outbreaks in recent years. Rio de Janeiro city, host of the last Summer Olympic Games and the Football World Cup, has been especially affected by them. A surveillance system based on symptomatic reports is in place in Rio, but the true number of affected individuals is unknown because of the large number of asymptomatic Zika, Dengue and Chikungunya cases. Seroprevalence studies are more suitable to evaluate the real number of cases in a given population. We performed a population-based seroprevalence survey in Rio, recruiting a sample of volunteers of all ages and genders from July to October 2018 within randomly selected census tracts and households. A total of 2,120 volunteers were interviewed and tested with a rapid immunochromatographic test for ZIKV, DENV and CHIKV. Individuals with positive results for IgG and/or IgM from only one virus were classified accordingly, while those with test results positive for both ZIKV and DENV were classified as flaviviruses. We corrected for sample design and non-response in data analysis, and calculated point estimate prevalence and 95% confidence intervals for each virus. Arbovirus prevalence in Rio's population (n = 6,688,927) was estimated at 48.6% [95% CI 44.8–52.4] (n = 3,254,121) for flaviviruses and at 18.0% [95% CI 14.8–21.2] (n = 1,204,765) for CHIKV. Approximately 17.0% [95% CI 14.1–20.1] (n = 1,145,674) of Rio's population had no contact with any of the three arboviruses. The cases of Zika, Dengue and Chikungunya reported by the current surveillance system are insufficient to estimate their real numbers, and our data indicate that Zika seroprevalence could be at least five times and Chikungunya 45 times higher than reported. The high number of individuals never infected by any of the three arboviruses may indicate a favorable scenario for future outbreaks.


Introduction
Brazil has been significantly affected by arbovirus epidemics since dengue virus (DENV) reintroduction in 1986 [1]. Since then, the virus spread throughout the country and the first major epidemic was described in Rio de Janeiro in 1990 [1]. In 2014, the Brazilian National Surveillance System confirmed the autochthonous transmission of another arbovirus also transmitted by Aedes aegypti, chikungunya virus (CHIKV) [2]. In 2015, autochthonous transmission of a third arbovirus, zika virus (ZIKV), transmitted by the same vector, was confirmed [2].
ZIKV infection became a national epidemic in a short time, reaching 205,578 cases countrywide in 2016 [3]. During the epidemic, Brazilian health professionals reported an increased number of newborns with microcephaly, and the association between ZIKV and this condition was later proved causal [3]. By early 2020, 3,523 cases of Congenital Zika Syndrome had been reported in Brazil, resulting in a socioeconomic burden for the affected families [4,5]. Acute and chronic CHIKV disease has been related to mental health, and its burden in Latin America was estimated at 25.45 disability-adjusted life-years (DALY) per 100,000 population [6,7]. Dengue was estimated to be responsible for 1.14 million DALY in 2013 [8].
ZIKV, DENV and CHIKV (ZDC) epidemiologic studies are usually based on surveillance of reported cases among symptomatic patients, on the most affected subpopulations (such as cohorts of pregnant women for ZIKV), or on mathematical modeling approaches to estimate disease burden [9]. However, studies of symptomatic patients tend to underestimate the total number of cases because of the high proportion of asymptomatic infections [10]. Both DENV and ZIKV infections, and to a lesser extent CHIKV infection, tend to produce a high number of asymptomatic cases, but the true rates seem to vary from area to area and even within a single country [9,11].
After two years with a high number of cases, ZIKV reports in Brazil are currently scarce, and it is believed that the population has acquired a certain level of immunity. However, in a population of about 7 million inhabitants, only 40,431 cases have been officially reported in Rio (0.6%) [3]. Population-based cross-sectional designs relying on serological diagnosis can generate more reliable data to estimate the real disease burden and to inform modeling approaches. However, seroprevalence studies are uncommon because of their high cost and complex logistics, and are even rarer in middle- and low-income countries. In such countries, cross-reactivity due to co-circulation of multiple arboviruses makes serological diagnosis more challenging, increasing the cost of such study designs [9].
Considering the co-circulation of ZDC in Rio de Janeiro city and the rarity of population seroprevalence studies evaluating ZDC dynamics and burden in such a setting, our study aimed to estimate the real seroprevalence of all three arboviruses and possible co-infections by selecting a random sample of inhabitants geographically distributed over Rio's territory. We used the STROBE checklist to format the article [12].

Study design
We conducted a general population cross-sectional study in Rio de Janeiro city, Brazil, recruiting a sample of volunteers, symptomatic or asymptomatic, from randomly selected households based on the census tracts of the Brazilian Institute of Geography and Statistics (IBGE) ordered by Administrative Region (AR). Included volunteers were interviewed and tested for the presence of antibodies against ZDC between July and October 2018.

Setting
The city of Rio de Janeiro is located in the southeastern macro-region of Brazil and is bounded by the Atlantic Ocean to the south. The estimated population for 2018 was 6,688,927 (https://www.ibge.gov.br/). The climate in the city is tropical, hot and humid, with local variations due to differences in altitude, vegetation and proximity to the ocean (http://www.ipe.br/). The average annual temperature between 1981 and 2010 was 29˚C, with the highest daily temperature averages (from 30˚C to 32˚C) occurring in the summer. Summertime (December to March) is also the period in which the greatest precipitation is reported (average of 205 mm of precipitation in January between 1981 and 2010). The city is geographically divided into 33 Administrative Regions (AR) and 161 districts.
Recruitment took place during the low-risk season for acquiring ZDC, when dry conditions reduce breeding sites and vector proliferation, and was based entirely on home visits. Interviews were conducted on weekdays, and tests and blood collection were scheduled for a second visit on a weekend within 10 days of the first visit so that we could include volunteers who were not at home during the first visit.

Participants
Individuals of all ages and genders residing in permanent private households in Rio were defined as potential participants. Foreign adults living in Rio who did not speak Portuguese were not included. We also did not include adults living in collective households, such as hotels and pensions, or in improvised homes.
Households were randomly selected and all residents were invited to participate. However, only those who agreed to sign the consent form were included. For volunteers under 18, consent was given by legal guardians. When approaching a condominium, interviewers first contacted the building manager and explained the study objectives in order to gain access to the housing areas. Informative material prepared especially for the study was distributed during the field activities and made available on social network pages created exclusively for the project. In addition, project researchers participated in radio and television programs to raise awareness of the study in the city and to facilitate acceptance of the interviewers' entry into residences.

Variables and data measurement
The rapid diagnostic test (RDT) kits used are based on the dual-path immunochromatographic platform (DPP®) for ZIKV, CHIKV and DENV IgM/IgG. The RDTs were developed by Bio-Manguinhos (Fundação Oswaldo Cruz-Fiocruz), Brazil, in partnership with Chembio-Diagnostic System INC, USA, and use whole blood, serum or plasma from digital or venous puncture samples for the simultaneous detection of IgM and IgG against the three arboviruses. Results become available in 15 to 20 minutes. An innovation of the test is a digital instrument for reading, interpreting and storing results. Results are available as absolute numbers and as categories based on pre-established cut-offs (negative, positive, indeterminate and invalid). Tests performed with human blood samples indicated sensitivity (IgM and IgG) close to 100% and specificity of 95% for IgM and 98% for IgG, but a high level of cross-reaction between ZIKV and DENV was reported.
Every volunteer included in the study had to agree to undergo the ZDC RDT using a digital puncture sample. Because of the possibility of cross-reaction between the two viruses, venous blood was collected from volunteers who were RDT positive for both ZIKV and DENV, except children under five years of age, for future laboratory confirmation with plaque reduction neutralization tests (PRNT).
All consenting volunteers responded to a standardized questionnaire developed from the World Health Organization (WHO) Zika seroprevalence survey document, which aims to harmonize studies on the prevalence of arboviruses worldwide [13]. The questionnaire was completed on a portable electronic device (PED) to guarantee safe data storage and volunteer confidentiality. All study data were automatically and securely synchronized to a central storage system for data quality control. Each volunteer received a unique identifier on entering the study, and that was the only identification available in the electronic databases. Socio-demographic data (such as age, gender and race), epidemiologic data (such as housing conditions, travel to endemic areas and measures used to avoid contact with the vector) and clinical data (such as current symptoms, vaccination against yellow fever and report of previous episodes of Zika, Dengue or Chikungunya fever) were collected in the questionnaire.

Bias
We addressed the potential for selection bias by randomly selecting IBGE census tracts in Rio and by choosing the target households, also at random, within each tract. We visited the selected areas to update the household information from the 2010 IBGE national census. Once selected, a household was visited at least three times before being declared empty. For such cases, a list of possible substitute residences was available to the field interviewers. In each included household, we attempted to interview all consenting dwellers. To accomplish this, we scheduled a second visit to the residence during the weekend so that all RDTs and any interviews missed on weekdays could be performed.
Information bias was addressed not only by training both teams, interviewers and health professionals, in the protocol, but also by performing the interviews before the RDT. Field coordinators were responsible for returning to specific households to double-check the interviews. They were also readily available all week to respond to any problem in the field activities, such as PED and RDT malfunction.
The 17 pairs of field investigators had an equal number of vehicles with drivers at their disposal to get around the city on weekends, given the great distances to be traveled and Rio's intense traffic. All cars were identified with the Fiocruz logo so that we could reach any areas dominated by drug dealers. The collected blood was stored in thermal boxes with a digital thermometer to keep the transport temperature between 2 and 8˚C. At the end of the activities on Saturdays and Sundays, every collected specimen was centrifuged and stored in a laboratory at Fiocruz at -70˚C.

Sample size
Rio's census tracts from the 2010 IBGE national census were used as primary sampling units for sample selection. The secondary sampling units were the households, where all the residents were part of the target sample.
The sample size was guided by a prevalence of reported cases of the three arboviruses estimated at 1.5%. A minimum proportion (Pmin) of 1.5% was specified, for which the relative margin of error of the estimate (dR) should be at most 35%, with a confidence level 100×(1-α) of 95%. Because we used a two-stage cluster sampling plan, we accounted for the effect of this plan on the precision of the estimates by multiplying the sample size by an estimate of the effect of the sampling plan (EPA) for the dimensioning variable [14–16]. An arbitrary EPA of 2.15 was defined for the sample size calculation, as no EPA data were available from previous household surveys on the subject. Considering that the average number of dwellers per household in the city was 2.93 according to the 2010 census, the household sample size was equal to 1,535 (4,500 ÷ 2.93).
Based on field logistics, we defined that residents of 10 households should be interviewed in each selected census tract, leading to a sample of 152 census tracts (sectors) (minimum number of census tracts = 1511.096/10 = 151.11).
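The arithmetic behind these figures can be checked with a short sketch. The formula used below, n = z²(1 − P)/(dR² P) multiplied by the EPA, is the usual expression for a relative margin of error on a proportion and is assumed here, since the text does not spell it out. With z = 1.96 it gives roughly 4,428 individuals and about 1,511.1 households, in agreement with the 1511.096 quoted above; the 1,535 figure corresponds to the individual sample rounded up to 4,500. Function and variable names are illustrative.

```python
import math


def sample_size(p_min, rel_margin, design_effect, z=1.96):
    """Individuals needed to estimate a proportion p_min with a given
    relative margin of error, inflated by the design effect (EPA).
    Assumed formula: n = z^2 * (1 - P) / (dR^2 * P) * EPA."""
    n_srs = (z ** 2) * (1 - p_min) / (rel_margin ** 2 * p_min)
    return n_srs * design_effect


n_people = sample_size(p_min=0.015, rel_margin=0.35, design_effect=2.15)
households = n_people / 2.93          # 2.93 dwellers per household (2010 census)
tracts = math.ceil(households / 10)   # 10 households interviewed per census tract

print(f"individuals: {n_people:.1f}")
print(f"households:  {households:.3f}")
print(f"tracts:      {tracts}")
```

Rounding the tract count up rather than to the nearest integer is what turns 151.11 into the 152 tracts reported.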
The list of census tracts was first ordered by AR and then by the average income of the households in the tract. Tracts were then selected by systematic sampling with probabilities proportional to size (PPT), using the number of permanent private households in the tract according to the 2010 census as the measure of size. The prior ordering of tracts by average income within each AR, combined with the systematic selection, constitutes an implicit stratification of the tracts by income within the AR, ensuring the inclusion of households of all income levels. In each tract, the households were selected by inverse sampling [14], after preparation of an exhaustive list of the households or an update of the address register for the tract.
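A minimal sketch of systematic selection with probability proportional to size on a pre-sorted list may help make the implicit stratification concrete. It is a generic textbook version of the technique, not the procedure actually run for the study; the tract records, the helper name `systematic_pps` and the fixed seed are all hypothetical.

```python
import random
from itertools import accumulate


def systematic_pps(sizes, n_select, rng=None):
    """Return indices of units chosen by systematic PPS sampling.

    `sizes` must already be in the desired order (here: by AR, then income).
    A unit larger than the sampling interval can be selected more than once.
    """
    rng = rng or random.Random()
    cumulative = list(accumulate(sizes))
    interval = cumulative[-1] / n_select   # sampling interval on the size scale
    start = rng.random() * interval        # random start in [0, interval)
    selected, idx = [], 0
    for k in range(n_select):
        point = start + k * interval
        # Advance to the unit whose cumulative size range contains the point.
        while idx < len(cumulative) - 1 and cumulative[idx] <= point:
            idx += 1
        selected.append(idx)
    return selected


# Hypothetical tracts: (AR, average household income, number of households)
tracts = [("AR1", 900, 120), ("AR1", 2500, 300), ("AR2", 700, 250),
          ("AR2", 1800, 180), ("AR3", 1200, 150)]
tracts.sort(key=lambda t: (t[0], t[1]))    # order by AR, then by income
chosen = systematic_pps([t[2] for t in tracts], n_select=2, rng=random.Random(42))
print([tracts[i] for i in chosen])
```

Because the selection points are evenly spaced along the sorted list, the draws spread across ARs and income levels without any explicit strata being defined.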

Quantitative variables
Our main outcomes were defined according to the RDT results. We modified a WHO recommendation for laboratory testing for Zika (https://apps.who.int/iris/bitstream/handle/10665/204671/WHO_ZIKV_LAB_16.1_eng.pdf?sequence=1) to define the outcomes. Individuals with positive results for IgG and/or IgM from only one of the viruses were classified accordingly, while those with test results positive for both ZIKV and DENV (IgM and/or IgG) were classified as flaviviruses (FLAV) due to the possibility of cross-reactivity (FLAV = abDENV+ abZIKV+ positive sera). Coinfection was defined as having a serological response to more than one virus. We analyzed as final outcomes: Dengue (DENV + DENV/CHIKV); Zika (ZIKV + ZIKV/CHIKV); Chikungunya (CHIKV + DENV/CHIKV + ZIKV/CHIKV + FLAV/CHIKV); and Flaviviruses (FLAV + FLAV/CHIKV). Any volunteer negative for all three studied arboviruses was classified as "No arbovirus". All specimens classified as flavivirus are currently under laboratory analysis with plaque reduction neutralization tests (PRNT) to differentiate serotypes of DENV and ZIKV.
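The classification rules above can be written as a small decision function. This is a sketch under the assumption that each virus result has already been reduced to a single boolean (IgM and/or IgG positive); how indeterminate or invalid RDT readings were handled is not stated in the text and is not modeled here. The names `classify`, `OUTCOMES` and `final_outcomes` are illustrative.

```python
def classify(zikv, denv, chikv):
    """Map three boolean RDT results to a serological category."""
    if zikv and denv:
        base = "FLAV"      # possible cross-reactivity: cannot separate ZIKV from DENV
    elif zikv:
        base = "ZIKV"
    elif denv:
        base = "DENV"
    else:
        base = None
    if base and chikv:
        return base + "/CHIKV"
    if base:
        return base
    return "CHIKV" if chikv else "No arbovirus"


# Final outcomes as unions of categories, following the definitions in the text.
OUTCOMES = {
    "Dengue": {"DENV", "DENV/CHIKV"},
    "Zika": {"ZIKV", "ZIKV/CHIKV"},
    "Chikungunya": {"CHIKV", "DENV/CHIKV", "ZIKV/CHIKV", "FLAV/CHIKV"},
    "Flaviviruses": {"FLAV", "FLAV/CHIKV"},
}


def final_outcomes(category):
    """List every final outcome that a serological category contributes to."""
    return [name for name, members in OUTCOMES.items() if category in members]


category = classify(zikv=True, denv=True, chikv=True)
print(category, final_outcomes(category))
```

A volunteer positive for all three tests therefore counts toward both the Chikungunya and the Flaviviruses outcomes, but toward neither Dengue nor Zika until PRNT results resolve the flavivirus.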
Our current goal was to report the final ZDC prevalence estimates and describe them according to the few available variables already known to be related to infection. Gender was categorized as male, female or other. Race was self-reported and categorized as non-black (white/yellow) or black (including black and brown/mulatto), since no native Brazilians were in the sample. Age was categorized as 0 to 14, 15 to 29, 30 to 59, and 60 or more years. We used the number of years in school to create the literacy categories, which roughly coincide with the school stages currently in place in Brazil (0 to 04, 05 to 09, 10 to 12, 13 or more). We included yellow fever immunization (YFI; self-reported, yes or no) because of the recent increase in its coverage in Rio. Finally, previous dengue, Zika and chikungunya fever were self-reported based on signs and symptoms or on diagnosis by a health professional.

Statistical methods
The sample was stratified and clustered in several stages, with disproportionate allocation. Therefore, it was necessary to calculate and use sample weights for each of the eligible interviewed residents in order to estimate the population prevalence without bias. First, basic sample weights were obtained, corresponding to the inverse of the inclusion probabilities of the interviewed eligible residents. Then, these weights were calibrated to known population totals by gender and age group, seeking to correct any distortions in the sample distribution due to the differential non-response observed in the survey [17]. The solution adopted to correct for non-response was to model the probabilities of response using the information available on the variables collected in the baseline survey, such as age and sex. The individuals' new adjusted weights were calculated as the ratio between their calibrated weights and the predicted values of the estimated response probabilities [18,19].
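The three weighting steps can be illustrated with a toy example. This is a sketch only: calibration is reduced here to simple post-stratification on sex-by-age cells, the response probabilities are supplied directly rather than fitted by a model, and all numbers and function names are hypothetical. It yields point estimates only; design-based variances and confidence intervals require dedicated survey software.

```python
from collections import defaultdict


def base_weights(inclusion_probs):
    """Basic design weight: inverse of each respondent's inclusion probability."""
    return [1.0 / p for p in inclusion_probs]


def calibrate(weights, cells, population_totals):
    """Scale weights so each sex-by-age cell sums to its known population total."""
    cell_sums = defaultdict(float)
    for w, c in zip(weights, cells):
        cell_sums[c] += w
    return [w * population_totals[c] / cell_sums[c] for w, c in zip(weights, cells)]


def adjust_for_nonresponse(weights, response_probs):
    """Divide calibrated weights by the modeled probability of response."""
    return [w / r for w, r in zip(weights, response_probs)]


def weighted_prevalence(weights, positive):
    """Weighted proportion of respondents with a positive result."""
    return sum(w for w, y in zip(weights, positive) if y) / sum(weights)


# Toy data (all values hypothetical)
probs = [0.01, 0.02, 0.01, 0.02]
cells = ["F 30-59", "F 30-59", "M 30-59", "M 30-59"]
totals = {"F 30-59": 300.0, "M 30-59": 200.0}
response = [0.5, 0.8, 0.8, 0.8]
positive = [True, False, False, True]

w_cal = calibrate(base_weights(probs), cells, totals)
w_adj = adjust_for_nonresponse(w_cal, response)
print(round(sum(w_cal)), round(weighted_prevalence(w_adj, positive), 3))
```

Respondents with a low predicted probability of response receive larger final weights, so groups that were harder to recruit are not under-represented in the prevalence estimate.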
Our data analysis was both descriptive and bivariate. It is essential to incorporate the sample weights and the structure of the sampling plan in the analyses, not only in the descriptive analysis to obtain proportions and total prevalence estimates, but also in the bivariate analysis. It is also important to consider the effect of calibration on the sample weights. For this purpose, we used the survey package of the R software [20].
For the final results, we used point estimates for proportions, prevalence and ratios, always associated with their confidence intervals (CI) with a 95% confidence level.

Ethical aspects
This study is based on Resolution n. 466/2012, issued by the Brazilian National Ethics Research Committee, and was approved on April 6th, 2018 (CAAE 83186318.1.0000.5240).

Results
A total of 4,386 potential volunteers were approached for study participation during the four-month period of data collection (Fig 1). Of the 2,749 volunteers (63%) who signed the informed consent, the majority were women (1,624; 59%), and the average reported age was 43.7 years (SD 21.4). About 50% of the 2,120 volunteers who underwent the RDT had positive serology for both ZIKV and DENV and were eligible for venous blood collection. About 70% of those eligible volunteers provided venous blood.
Population estimates indicate that 17.1% of Rio's population was not infected with any of the three arboviruses, while 18.0% have had contact with CHIKV. Unambiguous (single-virus) detection of DENV and ZIKV indicated prevalences of 28.9% and 3.2%, respectively, but these figures may be higher depending on the pending PRNT results (Table 1).
Population estimates based on sample data indicated a predominance of women (53.1%), black race (black and brown/mulatto; 65.3%), and adults and middle-aged individuals (56.4% more than 30 years old). Self-reported YFI was estimated at 55.9% (Table 2).
Except for a higher prevalence (52.1 vs. 45.5%) and odds (OR 1.30; 95% CI 1.08-1.57) of Flavivirus among men (Tables 3 and 4, respectively), arbovirus infection seems to be equally distributed in both genders. Estimates for self-reported race indicated that the non-black group presented lower prevalence (14.8 vs. 19.8%) and odds of CHIKV (OR 0.70; 95% CI 0.51-0.97). There was no age difference for ZIKV and CHIKV, but Flavivirus prevalence increased with age from 27.2% (0-14) to 57.1% (60 or more), and an RDT negative for all arboviruses was more prevalent (29.9%) at young ages (0-14). There does not seem to be a literacy gradient for seroprevalence, but DENV was less prevalent among those with 0 to 4 years of study, and the same group presented higher figures for negative arbovirus serology (30.9%) (Table 3). Among the individuals who self-reported a previous diagnosis, 166/537 (30.9%) were seropositive for DENV, 11/165 (6.7%) were seropositive for ZIKV and 120/148 (81.1%) were seropositive for CHIKV (Tables 2 and 3). A previously reported CHIKV episode was associated with higher odds (OR 26.3; 95% CI 14.4-47.9) of a positive CHIKV serology (Table 4). A previous report of ZIKV, DENV or CHIKV was inversely associated with being negative for all tests (the no arbovirus group).

Discussion
Rio de Janeiro, the second largest city in Brazil and host of the latest Summer Olympic Games in 2016 and the Football World Cup in 2014, has been hit hard by reemerging and emerging arbovirus epidemics in recent years. We hypothesized that most of Rio's inhabitants have had previous contact with at least one of the three reported viruses. Our estimates indicate that more than 80% of Rio's inhabitants were exposed to arbovirus infection, and the most prevalent, as expected, was DENV (prevalence = 28.9%; 95% CI 25.7-32.0), which reemerged as a public health problem in Brazil in 1990 [1]. The least prevalent among the three target viruses was ZIKV (prevalence = 3.2%; 95% CI 2.2-4.2), which is somewhat surprising given the large ZIKV epidemic in Rio in 2016 [3]. However, these numbers could reach more than 50% if all infections categorized as Flaviviruses were assumed to be ZIKV, leading to more robust estimates compared to other international seroprevalence studies [9]. Prevalence of CHIKV was estimated to be 18% (95% CI 14.8-21.2) in Rio after four years of uninterrupted reporting of large numbers of cases. Due to the circulation of different viruses from the Flavivirus genus, we were not able to rule out DENV and ZIKV RDT cross-reactivity. Therefore, we opted to categorize this uncertainty as Flaviviruses (abDENV+ abZIKV+ positive sera). The prevalence estimate for Flaviviruses reached 48.6% (95% CI 44.8-52.4). Coinfection of Flaviviruses and CHIKV was common.

PLOS ONE
Zika, dengue and chikungunya population seroprevalence in Rio de Janeiro, Brazil
Overall, DENV seroprevalence is higher in the Americas than in Asia and Africa, but rates are not homogeneously distributed across countries [9,21–23]. ZIKV seroprevalence also varies widely, and rates from 36% in a pediatric group in Nicaragua to 66% among schoolchildren in French Polynesia have been reported [24,25]. Introduction of CHIKV in the Americas is recent and seroprevalence studies are still rare [26,27]. The reported numbers (13.1-20.0%) are lower than the reports from Africa and Asia [28,29]. Prevalence estimates were slightly higher for women, but gender was not found to be associated with confirmed virus serologic tests, except for Flaviviruses. Studies on gender-related differences are not conclusive: while some authors indicate a higher risk for men [30,31], others indicate a higher risk for women [25,32]. Gender differences are usually explained by the level of exposure to the vectors, which may be influenced by cultural particularities [33]. Age group was not associated with a positive RDT for DENV, ZIKV or CHIKV. However, the same data for Flaviviruses indicate a gradient of increasing seropositivity toward older groups. All age groups were associated with lower odds of no serologic response to the analyzed arboviruses. These results reinforce data from other studies showing no relationship between age and seroprevalence in a newly emerging disease scenario, and an age gradient in the endemic context [21,22,28]. Data on DENV, ZIKV and Flaviviruses are less reliable, and further analysis will be needed once the PRNT results become available. Our non-black population tends to have lower odds of arbovirus seropositivity, and the race disparity is more evident for CHIKV infection (OR 0.70; 95% CI 0.51-0.97). Although the difference was not statistically significant, the non-black population had higher odds of serological non-response to all studied arboviruses.
Our results for the literacy variable indicate higher prevalence not only for groups attending school for less than 9 years, but also for those who attended up to 12 years. The same groups had lower odds of being negative for all studied arboviruses. Black race and low literacy may not be directly related to seroprevalence, but may rather be markers of direct exposure to vectors due to lower socioeconomic status [9,21]. In the complex context of an epidemic scenario in which three arboviruses are circulating almost concurrently, recall of signs and symptoms seems to be related to the intensity of complaints, since Chikungunya is markedly associated with acute and chronic painful arthralgias. Those who reported a previous CHIKV diagnosis based on symptoms and serologic confirmation presented higher odds of being indeed seropositive for CHIKV (OR 26.28; 95% CI 14.40-47.95). A similar finding has been reported elsewhere [28].
Seroprevalence studies are important for estimating disease burden but, as in any other epidemiologic design, their results have to be interpreted with caution. We presented the final results of a ZDC cross-sectional serologic study in Rio, a major city with a large geographical territory and heavy traffic in which more than 6.5 million people live, almost 20% of them in slums with difficult access, poor coverage of urban infrastructure and high rates of violence (IPP: http://www.data.rio/pages/rio-em-sntese-2). The study sample size was calculated based on all of Rio's AR and stratified by SES, aiming at covering the entire territory and making our sample representative of Rio's population. Large distances, difficult access and the constant fear of violence among Rio's inhabitants may have affected our ability to have the approached individuals sign the informed consent and be tested. Several strategies were used to increase recruitment and the visibility of the study in Rio's communities, including social media, television and radio. Nevertheless, the refusal group differed in gender and age distribution, and we corrected for this in the final estimates using techniques already applied in national surveys. An accurate RDT to differentiate ZIKV and DENV infection was not available during the study period. Therefore, blood samples were collected for PRNT laboratory differentiation, and more than 70% of the eligible individuals accepted the procedure.
We presented a well-funded population survey performed over four months during 2018. The observed non-response can be considered compatible with this type of research and in line with the practice of successful household surveys in Brazil [34]. The final gender and age estimates were in line with the 2010 national census and the 2015 estimates produced by the Federal Government, indicating that our sample is representative of Rio's population (https://datasus.saude.gov.br/populacao-residente/). In spite of all adversities, we were able to approach the entire target sample and recruit 63% of them, resulting in estimates likely generalizable not only to Rio's population but also to populations with similar socioeconomic and epidemiological contexts.
Based on our estimates and the officially reported numbers until the end of 2018, only a few cases of ZIKV and CHIKV infection reached health units for care, and the great majority may have had mild disease or been asymptomatic. Our data accounted for 214,806 and 1,204,765 individuals in Rio in 2018 with a serologic marker of previous ZIKV and CHIKV infection, respectively. From 2015 to December 2018, the regular surveillance system in place in Rio reported 40,431 cases of Zika and 26,810 of Chikungunya. Our data therefore indicate that Zika seroprevalence could be at least five times higher than the cases reported to the regular surveillance system, while for Chikungunya the figure could be 45 times higher. The reason could be not only the high rates of asymptomatic cases, but also difficulties in reaching a correct diagnosis, given the limited knowledge of both diseases, which were only recently introduced in Rio. Moreover, the presented results also show that, even 30 years after the first DENV epidemic and four years after the emergence of ZIKV and CHIKV, 1,145,674 individuals had not been infected by any of the three arboviruses, which may indicate a favorable scenario for future outbreaks.
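The under-reporting factors quoted above follow directly from dividing the estimated number of seropositive individuals by the number of cases reported to surveillance. The short sketch below reproduces that arithmetic using only the figures given in the text; the function name is illustrative.

```python
def underreporting_factor(estimated_infected, reported_cases):
    """Ratio of seroprevalence-based infections to officially reported cases."""
    return estimated_infected / reported_cases


zika = underreporting_factor(214_806, 40_431)
chik = underreporting_factor(1_204_765, 26_810)

print(f"Zika: {zika:.1f}x, Chikungunya: {chik:.1f}x")
# prints "Zika: 5.3x, Chikungunya: 44.9x"
```

Because infections classified as Flaviviruses are excluded from the Zika count, the Zika ratio is best read as a lower bound, consistent with the "at least five times" wording.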

Conclusions
Herein, we presented a more direct measurement of the immunity scenario in the complex setting of a major city in which more than one arbovirus is circulating at the same time. Our findings reinforce the need for well-designed seroprevalence research to obtain the real burden of diseases that present moderate to high levels of asymptomatic cases. From a public health perspective, our data suggest that the current surveillance system is insufficient to estimate the real health and socioeconomic impacts of these diseases.