Epidemiological and genomic investigation of chikungunya virus in Rio de Janeiro state, Brazil, between 2015 and 2018

Since 2014, Brazil has experienced an unprecedented epidemic caused by chikungunya virus (CHIKV), with several waves of East-Central-South-African (ECSA) lineage transmission reported across the country. In 2018, Rio de Janeiro state, the third most populous state in Brazil, reported 41% of all chikungunya cases in the country. Here we use evolutionary and epidemiological analysis to estimate the timescale of CHIKV-ECSA-American lineage and its epidemiological patterns in Rio de Janeiro. We show that the CHIKV-ECSA outbreak in Rio de Janeiro derived from two distinct clades introduced from the Northeast region in mid-2015 (clade RJ1, n = 63/67 genomes from Rio de Janeiro) and mid-2017 (clade RJ2, n = 4/67). We detected evidence for positive selection in non-structural proteins linked with viral replication in the RJ1 clade (clade-defining: nsP4-A481D) and the RJ2 clade (nsP1-D531G). Finally, we estimate the CHIKV-ECSA’s basic reproduction number (R0) to be between 1.2 to 1.6 and show that its instantaneous reproduction number (Rt) displays a strong seasonal pattern with peaks in transmission coinciding with periods of high Aedes aegypti transmission potential. Our results highlight the need for continued genomic and epidemiological surveillance of CHIKV in Brazil, particularly during periods of high ecological suitability, and show that selective pressures underline the emergence and evolution of the large urban CHIKV-ECSA outbreak in Rio de Janeiro.


Introduction
Chikungunya virus (CHIKV) is an enveloped virus with a single-stranded positive-sense RNA 11.8 kb genome that belongs to the Alphavirus genus in the Togaviridae family [1].The CHIKV genome contains two open reading frames encoding four non-structural proteins (nsP1-nsP4), important for viral replication, and four structural proteins (capsid, E1-E3), necessary for virus assembly [2].CHIKV is transmitted by the bites of infected Aedes spp.mosquitoes, mainly from the Ae.aegypti species and, to a lesser extent, the Ae.albopictus species.In humans, clinical manifestations of chikungunya include arthralgia, high fever, myalgia, headache and often exanthema.However, chikungunya may also cause long-lasting debilitating polyarthralgia [1], and can also lead to neurological complications and fatal outcomes [3].Despite being an important public health threat with >1 billion people at risk for transmission [4], reporting of chikungunya infections often relies on syndromic surveillance, which is challenged by the co-circulation of mosquito-borne viruses that cause similar clinical symptoms, including dengue (DENV) and Zika (ZIKV) [5].
Since its first description in Tanzania in 1952, chikungunya has caused over 70 outbreaks in Africa, Asia, Americas, Europe and in Ocean Pacific Islands [4].CHIKV can be classified into four distinct lineages: the West-African lineage, the Asian lineage, the East-Central-South-African (ECSA) genotype [6], and the Indian Ocean lineage (IOL).Chikungunya's geographic distribution-particularly of Asian, IOL and ECSA lineages-has been expanding rapidly over the last 20 years probably due rapid urbanization, globalization of trade, and virus evolution and adaptation to local variation in the distribution of vector species [7][8][9][10].For example, a single amino acid change in the E1 protein (i.e., E1-A226V) of the CHIKV IOL lineage has been linked to increase transmissibility and infectivity in Ae. albopictus and was linked with a series of severe outbreaks in Indo-Pacific Asia [11][12][13].
In 2013, the CHIKV Asian lineage was first detected in St Martin islands in the Caribbean [14].Since then, this lineage has spread to >50 countries and territories in the Americas [14,15].In 2014, co-circulation of the Asian and ECSA lineages was detected in Brazil [16].Local transmission of the Asian genotype was first detected in Amapa ´state, North Brazil, on 31 July 2014 [16].On 25 August 2014, local transmission of the CHIKV ECSA lineage was identified in Bahia state, Northeast Brazil [16,17].Retrospective outbreak investigations revealed that the ECSA index case in Brazil arrived in Bahia in late May 2014 from Angola [17].This new ECSA-American lineage was then linked to large outbreaks in the Amazonas and Roraima states in North Brazil, despite earlier circulation of the Asian lineage in that region [18].Currently, circulation of ECSA-American lineage has been confirmed in all five geographic regions of Brazil [18][19][20][21][22][23], but also in Paraguay [24] and Haiti [25].
Rio de Janeiro (RJ) is the third most populous Brazilian state and a key business and touristic hub in the Americas.Due to air and fluvial connectivity and high mosquito climatic suitability (adequate levels of temperature, humidity and precipitation for mosquito occurrence) [26], the state has historically suffered large mosquito-borne viral outbreaks [27], including dengue, Zika, yellow fever and, more recently, chikungunya.Between 2013 and 2018, CHIKV has caused 403,828 confirmed infections in Brazil [28].In 2018, RJ became the epidemic center of CHIKV in Brazil, accounting for 41% of the cases reported in the country [28].Until 2017, only 20,251 cases had been confirmed in Rio de Janeiro, corresponding to 6% of the total case numbers reported in Brazil.The earliest autochthonous chikungunya cases were reported in November 2015, and since then two ECSA sub-lineages have been detected in Rio de Janeiro [20,21,29].Despite the burden imposed in Rio de Janeiro, chikungunya's evolution and transmission patterns remain poorly understood.The reliance of chikungunya diagnosis on clinical-epidemiological criteria and the scarcity of chikungunya sequence data from Rio de Janeiro, hamper our understanding of chikungunya transmission and evolution.Moreover, relevant epidemiological parameters, such as basic (R 0 ) and time varying (R t ) reproduction numbers, which measure virus transmissibility, have not been yet previously estimated for CHIKV in Rio de Janeiro, hindering comparisons with the virus' epidemiological dynamics in other settings.
Using genetic analysis of CHIKV newly generated and publicly available genome sequences and CHIKV traditional surveillance data, we investigated its evolution and transmission dynamics in Rio de Janeiro.We show that the large 2018 CHIKV outbreak in Rio de Janeiro was mainly caused by one dominant ECSA-American viral clade that likely persisted locally since mid-2015.In addition, we provide evidence of adaptive molecular evolution on nonstructural amino acid positions, including in a lineage-defining mutation related with the dominant ECSA viral clade associated with the 2018 outbreak in Rio de Janeiro.Finally, using epidemiological modeling, we estimate basic and instantaneous reproduction numbers for CHIKV ECSA-American in Brazil, further expanding our understanding of the epidemiology of this rapidly expanding lineage.

Ethics statement
Clinical samples were collected with formal written consent following approval of the ethical review board of Universidade do Grande Rio (approval number: 54544316.3.0000.5283).

Quantifying CHIKV transmissibility in Rio de Janeiro
The weekly aggregated epidemiological data of chikungunya laboratory-confirmed cases in RJ was obtained from the Brazilian Ministry of Health from 2014 to 2018, and then the results were presented in time series.Data was used to estimate weekly CHIKV incidence and transmissibility (reproduction number) for Rio de Janeiro between 2016 and 2018.We estimated the basic reproductive number (R 0 ), defined as the number of secondary infections caused by an infected individual in a completely susceptible population, using data from 3 January to 13 March of 2016.We employed the exponential growth method to estimate R 0 [27], which derives an estimate of R 0 from the generation time distribution along with an estimate of the exponential growth rate, calculated during the earliest weeks of the outbreak where epidemic growth was approximately exponential and factors such as susceptible depletion, behavioral change or vector control are negligible.To assess changes in transmission over time in subsequent epidemics when immunity due to prior natural infection is present, we also calculated the instantaneous reproduction number (R t ), defined as the average number of secondary cases caused by an infected individual during the epidemic, provided the conditions remained as they were at time t [30].We used an 8-week sliding-window to estimate R t , and conducted several sensitivity analyses considering different sliding-window intervals (3 to 7 weeks).We assumed a parametric gamma distribution for the generation time (GT) with a mean of 14 days and a standard deviation of 6.4 days, based on an earlier estimate obtained during an outbreak caused by the Indian Ocean lineage [31].We also performed sensitivity analyses with different mean GT values (10 and 20 days), keeping the standard deviation constant.All analyses were performed in R software v4.2.2 [32] with the R 0 [33] and EpiEstim [30] packages.

Aedes aegypti transmission potential in Rio de Janeiro
To estimate Aedes aegypti transmission potential for RJ between 2014 and 2018, we first extracted daily values (midday) of 2m air temperature, 2m dew point temperature and total precipitation for all locations across RJ from the ERA-5 ECMWF reanalysis product [34].We calculated relative humidity based on daily temperature and dew point estimates using the August-Roche-Magnus approximation [35][36][37].Each variable was then averaged over RJ and combined to calculate index P using the MVSE R package [38].We used priors related to Ae. aegypti ecology and entomology, as described elsewhere [38].We then aggregated resulting daily estimates to generate mean weekly estimates for index P between 2014 and 2018.Code and data used in these analyses are available in S1 File and on the project GitHub repository (https://github.com/filiperomero2/CHIKV_RJ_2015-2018,last accessed on 30 May 2023).

Sample collection, RNA extraction, and CHIKV molecular diagnosis
To assess the genetic diversity of CHIKV in Rio de Janeiro state, we generated new CHIKV genomes obtained from samples collected in the Duque de Caxias municipality, the third most populous city in Rio de Janeiro state (Fig 1A and 1B; https://cidades.ibge.gov.br/brasil/rj/duque-de-caxias/, last accessed 21 June 2022).A total of 179 samples, blood or urine, from arboviral suspected cases were collected in public health care units between July 2017 and June 2018.Viral nucleic acid extractions were performed with the QIAmp viral RNA mini kit (Qiagen, Germany), following manufacturer's instructions.We then tested extracted RNAs by RT-qPCR for detection of ZIKV, DENV, and CHIKV using the ZDC molecular kit (Bio-Manguinhos FIOCRUZ, Brazil).Samples presenting a cycle threshold (Ct) < 38 for a given target were considered positive for CHIKV.

Chikungunya virus whole-genome sequencing
We selected 34 CHIKV RNA-positive samples with Ct < 30 for viral whole-genome sequencing using the Illumina platform.Samples were distributed across the most sampled months of 2018 (March, April, and May), and one sample from 2017 was also sequenced.Amplicon target sequencing was conducted employing an amplicon-based sequencing protocol [39].Briefly, we converted viral RNA to cDNA with SuperScript IV (Thermo Fisher Scientific), which was amplified with the Q5 DNA polymerase (New England Biolabs, UK) in two PCR reactions.Each PCR reaction contained 22 pairs of primers and generated overlapping amplicons of approximately 400 bp [39].The amplicons covered the entire CHIKV genome and were used as input to the QIAseq 1-Step Library prep kit (Qiagen, Germany), following the manufacturer's protocol.Libraries were normalized and equimolarly pooled at 14 pMol with 10% phiX control.Sequencing was performed on an Illumina MiSeq instrument with a V3 (600 cycles) cartridge.

Maximum likelihood phylogenetic analyses
To contextualize the novel genome sequences within the global CHIKV diversity, we downloaded all CHIKV sequences with more than 8,000 nucleotides available on NCBI GenBank [45], as of 31 May 2022.Sequences without information on the country of sampling and date of collection were removed from subsequent analyses.The remaining 1,262 sequences were combined with the 34 new sequences and aligned to the CHIKV RefSeq genome (NCBI accession: NC_004162.2) with MAFFT v7.480 [46].After manually trimming gap-rich regions, we estimated maximum-likelihood (ML) phylogenetic trees using IQ-Tree v2.1.2[47].We used Model-Finder [48] to determine the best-fit substitution model and the Shimoidara-Hasegawa-like approximate likelihood ratio test (SH-aLRT) [49] to assess phylogenetic node support.A second dataset was assembled, comprehending only sequences from the ECSA-American sub-lineage.Only Brazilian sequences collected up 2018 were kept in this dataset, as a preliminary analysis indicated international sequences or those collected later were not related to the genome sequences described in this study.Sequences were aligned against the oldest available ECSA strain from Brazil (Accession number: KP164568, from Feira de Santana, Bahia state), and non-coding regions were trimmed out of the alignment.A novel ML tree was inferred, using the methods described above.Screening of recombination signals was performed using RDP4 [50].

Temporal structure of CHIKV ECSA-American nucleotide alignment
Root-to-tip regressions were used to evaluate the temporal signal available on genomes from the ECSA-American sub-lineage dataset, including only Brazilian strains.The estimated maximum likelihood phylogenetic tree was manually rooted in the oldest available ECSA strain from Brazil and regression between divergence and sampling dates was analyzed in TempEst v1.5.3 [51].To optimize the temporal signal, outliers were conservatively defined as sequences whose regression residuals exceeded more or less than two times the interquartile range of the residual distribution, and 51 sequences were removed.A novel ML tree inference and root-totip regression were performed on the filtered ECSA-American sub-lineage dataset (n = 148).

Bayesian phylogenetic analyses
To investigate the evolutionary origins of CHIKV in Rio de Janeiro, we used Bayesian coalescent and phylogeographic models available in BEAST v1.10.4 [52,53].Analyses were run using a HKY+G4 nucleotide substitution model [54,55], a strict molecular clock model, and a flexible skygrid tree prior [56].The cut-off for the skygrid model was set at 4.22 years, following a preliminary estimate of the time of the most recent common ancestor (tMRCA) obtained from TempEst (x-axis intercept in the regression).The number of grid points was set to match the number of months within the tree temporal span between the tMRCA and the earliest tip (n = 51).We also performed an alternative analysis with the uncorrelated relaxed clock model with lognormal rate distribution (UCLN) [57].We used default priors and operators, and ran at least two independent Markov Chain Monte Carlo (MCMC) chains with 40 to 100 million steps, sampling every 10,000th step, using a maximum-likelihood phylogeny as a starting tree.The BEAGLE library was used for accelerated computations [58].We assessed mixing of MCMC chains and convergence of all parameters using Tracer v1.7 [59].We used Logcombiner [52] to sample 1,000 trees from the combined posterior trees distribution, after removal of 10% burn-in.A discrete phylogeographic analysis was then performed with a set of 1,000 empirical dated trees obtained from the posterior distribution, as previously described [60].We used an asymmetric substitution model [53] and considered four geographic locations/ regions: Rio de Janeiro state (n = 67), North region (n = 22), Northeast region (n = 52), and Central-West region (n = 7).
Finally, we also investigated changes in CHIKV genetic diversity over time in RJ.We assembled a third dataset, comprehending all Rio de Janeiro sequences belonging to a dominant clade, identified in the previous analysis (n = 63).We used a skygrid prior, as described above.In this case, we also used an informative prior on the root age based on the estimate obtained from the analysis of the full dataset.BEAST analyses were run in duplicate for 20 million generations, sampling every 2,000 th state.BEAST XML files and outputs are available in S2 File and on the project GitHub repository (https://github.com/filiperomero2/CHIKV_RJ_2015-2018, last accessed on 30 May 2023).

Selection analyses and ancestral states reconstruction
Selection analyses were conducted on the ECSA-American alignment with Brazilian sequences described above using MEME (Mixed Effects Model of Evolution) [61] and FEL (fixed effects likelihood) [62] models to detect individual genome sites subjected to episodic and pervasive positive selection on the Datamonkey web server [63,64].A significance threshold of p < 0.01 was used for analyses.To map positively selected sites along the evolutionary history of Brazilian CHIKV ECSA-American sub-lineage, we performed ancestral sequence reconstruction using TreeTime [65].

State-level chikungunya incidence and transmissibility
The first chikungunya epidemic wave in Rio de Janeiro state occurred in 2016 (Fig 1C), with 16,245 laboratory-confirmed cases.In 2017, fewer cases were reported (n = 3,989), with a nearly constant number of infections in the first semester, decreasing by the end of the year.In early 2018, a sharp increase in chikungunya infections were observed, leading to a larger epidemic with a magnitude greater than the first (n = 24,990 cases).In both 2016 and 2018 waves, cases grew exponentially between January and March, reaching peaks in April.Most confirmed cases occurred in the state's capital, Rio de Janeiro city (54%; n = 24,636/45,241; S1 Fig) .This pattern was most evident in 2016 (87%; n = 14,204/16,245) than in 2017 (44%; n = 1,768/3,989) and 2018 (35%; n = 8,649/24,990).In 2018, while the state capital exhibited the highest incidence, a large number of infections were reported in other municipalities, such as Campos dos Goytacazes (24%; n = 5,968/24,990) and São Gonc ¸alo (21%, n = 5,323/24,990).
We then estimated the Rt to track the progress of the epidemic in Rio de Janeiro state (Fig 1D).Noticeably, we observed that between 2016 and mid-2018, high transmissibility periods broadly coincide with periods of relatively high Ae.aegypti transmission potential.This shows that during the study period, the timing of chikungunya recurrence in Rio de Janeiro was at least partly driven by climatic factors such as temperature and humidity that correlate with fluctuations in Ae. aegypti abundance (Figs 1D and S3).As expected, R t follows a seasonal pattern, reaching the peak values in early 2016 (mean: 2.14, 95% CI: 2.06-2.22),2017 (mean: 1.45, 95% CI: 1.35-1.56)and 2018 (mean: 1.76, 95% CI: 1.63-1.91).R t estimates obtained using different window lengths, and GT distributions revealed similar patterns (S4 Fig).

Molecular detection and whole-genome sequencing of CHIKV from Duque de Caxias municipality
Of the 179 clinical samples from patients presenting symptoms compatible with arbovirus infections in health-care units of the Duque de Caxias municipality, 48.6% (n = 87) tested positive for CHIKV RNA (mean cycle threshold, Ct, was 23.2, range: 14.7 to 33.1), 1.1% (n = 2) tested positive for DENV and 2.8% (n = 5) for ZIKV.We detected CHIKV coinfections with DENV (n = 1) and ZIKV (n = 3).For the CHIKV RNA-positive cases, clinical manifestations   CHIKV-ECSA-American clade spread to the North and Centre-West regions of Brazil, as well as to Rio de Janeiro state.RJ1 and RJ2 clades result from independent introductions from the Northeast region with location posterior probability support > 99%.In contrast, the tMRCA for RJ1 was estimated in mid-2015 (mean: 7 July, 95% BCI: 4 April to 24 September 2015; clade posterior probability: 0.99), RJ2 tMRCA was dated around mid-2017 (mean: 7 July 2017, 95% BCI: 12 March 2017 to 6 November 2017; clade posterior probability: 1.00).Although we detected a more recent introduction from the Northeast region, our data suggests that the 2018 chikungunya epidemic in Rio de Janeiro was driven mostly by the RJ1 clade that was circulating in Rio de Janeiro since mid-2015.

Phylogenetic analyses of the CHIKV ECSA-American lineage
The demographic pattern of the RJ1 clade inferred through coalescent non-parametric analysis, revealed that virus genetic diversity rose during the 2016 epidemic, declined through 2017, and rose again (

Discussion
We provide a detailed picture of the epidemic history of CHIKV ECSA-American in Rio de Janeiro state from 2016 to 2018, which could be divided into three epidemic phases.First, our analyses suggest that CHIKV ECSA was introduced in Rio de Janeiro around mid-2015, causing its first recorded epidemic in 2016, once ZIKV cases started decreasing in the state [67].We estimated a R 0 for CHIKV ECSA-American in Rio de Janeiro to be around 1.56, similar to those reported to CHIKV across other settings [18,[68][69][70].Additionally, our R 0 for CHIKV was consistent with estimates for DENV in Rio de Janeiro [66], and lower compared to estimates for Zika [66], which caused large epidemics in Rio de Janeiro between 2015-2016 [71].However, we then found unusually low CHIKV incidence and genetic diversity in 2017.The implementation of large-scale Wolbachia interventions in Ae. aegypti populations in Rio de Janeiro only started in late 2017 in a single neighborhood (Ilha do Governador); therefore, it is unlikely to explain the observed patterns [72].Low chikungunya incidences in 2017 were also observed in a large serosurveillance study of blood donors [73].Thus, we hypothesize that the low number of cases in 2017 can partially be explained by climatic factors since Ae.aegypti suitability, mean temperature, or humidity in 2017 was lower compared to previous and following years.Moreover, the large chikungunya epidemic in Rio de Janeiro in 2018 indicates that the trend observed in 2017 was unlikely to be caused primarily by population immunity.Indeed, surveys found that 18% of seroprevalence against CHIKV in >2,000 participants screened between July and October 2018 [74].Additional investigations into the dynamics of arbovirus population immunity, including accurate estimates of force of infection, and analysis of data on vector control policies and population behavior, could help better understand the dynamics and evolution of CHIKV transmission in the Americas.
Overall, our results show that CHIKV incidence, R t , and Ae.aegypti transmission potential display a seasonal pattern and are generally synchronized.Similar trends have been noticed for ZIKV [18,26], and further highlight the interplay between vector ecology, climate and arbovirus disease transmission.Given its major connectivity at the national and international level, Rio de Janeiro is a key hotspot for arbovirus transmission and a location of interest for epidemiological surveillance in Brazil.For instance, several events of dengue lineage replacement were caused by novel lineages initially detected in Rio de Janeiro [27].In 2018, three different arboviruses circulated in Rio de Janeiro, CHIKV, DENV, and ZIKV, with CHIKV causing the most infections.We noticed a similar scenario in the Duque de Caxias municipality, from which 48.6% were positive for CHIKV and much smaller proportions for ZIKV (2.79%) and DENV (1.11%) in 2018.Overall, clinical symptoms in our cohort were similar to those described elsewhere [75].However, among the CHIKV RNA-positive cases, we recorded one presenting neurological manifestations.A recent study has shown that neurological signs and symptoms were reported in 39% of CHIKV cases with a fatal outcome [3].Additional studies are needed to investigate host and viral factors associated with clinical manifestations.
Our genetic analyses confirm that the clade ECSA-American lineage emerged in Northeast Brazil in mid-2014, consistent with the arrival of the index case in Bahia state in late May [16,17].This lineage then spread to all other regions over the following couple of years, being introduced in Rio de Janeiro around mid-2015, and later in mid-2017.We show that both introductions occurred from the Northeast region, consistent with previous findings [20,21,29].As the first autochthonous cases of CHIKV in Rio de Janeiro were detected in November 2015, the period of cryptic transmission in the state was between two and seven months, consistent with previous estimates [20,29].
Our analyses show the acquisition of a range of non-synonymous mutations in both nonstructural and structural genes of the ECSA-American lineage.Although Ae. albopictus mosquitoes are abundant in Rio de Janeiro [76], none of the new sequences presented the mutations E1:A226V or E2:L210Q, previously associated with increased transmission competence by Ae. albopictus mosquitoes in the Indian Ocean Lineage [11,12].However, it remains unclear whether other mutations in CHIKV ECSA-American strains are associated with increased transmissibility in Ae. aegypti and/or Ae. albopictus.Notwithstanding, our analyses prove that at least two non-synonymous mutations associated with sequences from Brazil evolved through molecular adaptation: nsP1: D351G and nsP4: A481D.While evidence for positive selection in non-structural genes has already been identified for the CHIKV Asian genotype [77], this is the first report of this phenomenon for CHIKV ECSA.Signals of positive selection in non-structural proteins have also been detected for another alphavirus (Western equine encephalitis virus) in natural settings [78,79], and experiments with chimeric alphaviruses have shown that adaptive mutations in nonstructural proteins are essential for viral replication and infectivity [80].Nonstructural proteins have also been shown to be preferential targets for adaptive molecular evolution among different flaviviruses [81].As these new mutations may impact the virus' biological properties and epidemiology, further studies with cellular and animal models, including Ae. aegypti and Ae.albopictus vector competence surveys, should be performed to clarify their phenotypic impact.
In conclusion, our study sheds light on the epidemiological dynamics of chikungunya in the state of Rio de Janeiro up to 2018.We demonstrate that chikungunya incidence and transmissibility varied seasonally, following trends in Ae. aegypti transmission potential.We also provided first reports of CHIKV ECSA-American epidemiological parameters and identified mutations of interest that evolved under positive selection.We encourage scaling up diagnostic efforts for arboviruses and deployment of effective mosquito control strategies.Larger forthcoming genomic and epidemiological CHIKV-ECSA studies will help to disentangle the impact of virus molecular evolution and adaptation, host population behavior and immunity, and vector ecology to improve public health interventions against arboviruses.
Newly generated sequences clustered within the ECSA-American lineage with maximum statistical support (SH-aLRT = 100; Fig 3A).After removal of outliers, we found a strong correlation between sampling dates and genetic divergence (Fig 3B and 3C; R 2 = 0.72).No recombination was detected.Most of the new sequences (n = 28/31) clustered within a dominant clade of sequences from Rio de Janeiro state (named RJ1; n = 74, SH-aLRT = 88.4), while a few (n = 3/31) grouped in a smaller clade (RJ2; n = 4, SH-aLRT = 95.2) together with a publicly available sequence from Rio de Janeiro.Most sequences from Duque de Caxias clustered with sequences from other municipalities, such as Nitero ´i and São João de Meriti, suggesting frequent viral spread within state borders.Time scale inferences with both strict and UCLN models results were highly congruent.As the posterior distribution of the coefficient variation of the UCLN model displayed wide variation and included extremely low values (95% BCI: 6.68 x 10 −4 -0.43), we opted for focusing on the strict model (S2 File).Our time-scaled phylogenetic analysis estimated an evolutionary rate of around 5.72 x 10 −4 (95% Bayesian credible interval, BCI, 4.89 x 10 −4 -6.50 x 10 −4 ) substitutions per site per year and corroborated that the ECSA-American clade emerged in the Northeast region in mid-2014 (mean: 21 July, 95% BCI: 29 May to 25 August; Fig 4A).

Fig 2 .
Fig 2. Symptoms distribution, RT-qPCR Ct values dynamics and their correlation with genome sequencing coverage.(A) Distribution of symptoms exhibited by all CHIKV positive patients in the Duque de Caxias cohort.(B) The time lag between symptoms onset and sample collection dates exhibits correlation with RT-qPCR Ct values.As infection proceeds, viral loads decrease (Cts increase) likely due to immunological response.(C) Negative correlation between RT-qPCR Cts and genome sequencing coverage.Sequences characterized from samples with higher viral load (lower Cts) tend to exhibit higher coverage, although no strong statistical correlation was inferred on a linear model (p = 0.08).https://doi.org/10.1371/journal.pntd.0011536.g002 Fig 4B).Notably, epidemiological and genetic data independently suggest that 2017 had a much lower number of infections than 2016 and 2018, despite the circulation of locally established lineage (RJ1) and a window of climatic suitability for transmission (Fig 1D).

Fig 3 .
Fig 3. Maximum likelihood phylogenetic analysis of global and ECSA-American datasets.(A) Phylogenetic tree inferred from the global dataset.All new characterized genomes clustered within the ECSA-Br clade.Names of lineages and relevant clades are indicated.SH-aLRT statistical support values for these clades are shown close to their defining nodes (colored in red).(B) Phylogeny inferred from the filtered ECSA-American dataset.Tip shapes are colored according to sampling location (Centre-West region: purple, North region: yellow, Northeast region: red, Rio de Janeiro state: green).Sequences generated in this study are highlighted with red circles around the tip shapes.Clades composed mostly by RJ sequences are indicated along their SH-aLRT support values.(C) The root-to-tip regression plot, which indicates a strong temporal signal (R 2 = 0.72, slope = 5.19 x 10 −4 ).Scale bars represent substitutions per site (s/s).https://doi.org/10.1371/journal.pntd.0011536.g003

Fig 4 .
Fig 4. Bayesian time scaled phylogeographic reconstruction for the clade ECSA-Br.(A) The molecular clock phylogeny annotated with discrete trait reconstructions.Colors indicate estimated ancestral locations (Centre-West region: purple, North region: yellow, Northeast region: red, Rio de Janeiro state: green).Tip shapes mark sequences generated in this study.The x-axis depicts the timescale, while the density plots indicate the posterior distributions estimated for the age of clades ECSA-American, RJ1 and RJ2.Posterior probabilities for both RJ clades are shown.Positively selected mutations detected with MEME and FEL models (NS1: D351G and NS4: A481D) are exhibited on the branches where they occurred according to the ancestral states reconstruction performed (TreeTime).The inset marks the date of arrival of the index case in the Northeast region.(B) Skygrid reconstruction plot for the clade RJ1.A separate analysis was performed with only RJ sequences from the clade RJ1, allowing the reconstruction of the dynamics of variation of viral effective population size in the state (left y-axis).The analysis reveals CHIKV genetic diversity varied over time, with periods of high diversity matching peaks observed in incidence data (right y-axis).https://doi.org/10.1371/journal.pntd.0011536.g004

S3Fig.
Time series of climatic variables-(A) 2m temperature, (B) total precipitation and (C) relative humidity-for Rio de Janeiro state between 2014 and 2018.All climatic data was obtained from the ECMWF reanalysis product [34].(TIFF) S4 Fig. R t estimates and sensitivity analysis.(A) R t estimate performed with different generation time (GT) distributions (gamma distributions with means 10, 14 and 20, and constant standard deviation: 6.4 days).Colors indicate estimates for different GTs (10: gray, 14: yellow, 20: blue).(B) R t estimates performed with different sliding window lengths (3 to 8 weeks).Colors indicate estimates for different window lengths (3: salmon, 4: yellow, 5: green, 6: blue, 7: purple, 8: pink).Solid lines indicate mean values and ribbons indicate the 95% confidence intervals.The dashed lines denote the critical epidemic threshold (R t = 1).(TIFF) S5 Fig. TreeTime and HyPhy analysis.(A) The molecular clock tree had its branches colored according to the ancestral location's reconstructions.Tip shapes indicate sequences generated in this study.Light gray ticks along branches indicate synonymous mutations, while dark gray ticks mark non-synonymous mutations.Mutations for which evidence of positive selection was found (MEME and FEL models) are marked in red (NS4: A481D) and blue (NS1: D531G).(B) CHIKV genome scheme exhibiting all mutations reconstructed and associated with RJ clades.Non-synonymous mutations are numbered from 1 to 18 and their positions and corresponding amino acid replacements are indicated by black vertical lines.Similarly, vertical gray lines indicate the position of the synonymous mutations inferred.Non-structural and structural open reading frames are colored in dark green and light green, respectively.(TIFF) S1 File.Data and R code used in epidemiological analyses.(ZIP) S2 File.BEAST xml files and outputs generated in this study.(ZIP) S3 File.Clinical data from CHIKV positive patients.(XLSX) S4 File.Input and output files of selection analysis with Datamonkey.(ZIP)