Genomic, epidemiological and digital surveillance of Chikungunya virus in the Brazilian Amazon

Background Since its first detection in the Caribbean in late 2013, chikungunya virus (CHIKV) has affected 51 countries in the Americas. The CHIKV epidemic in the Americas was caused by the CHIKV-Asian genotype. In August 2014, local transmission of the CHIKV-Asian genotype was detected in the Brazilian Amazon region. However, a distinct lineage, the CHIKV-East-Central-South-African (ECSA) genotype, was detected nearly simultaneously in Feira de Santana, Bahia state, northeast Brazil. The genomic diversity and the dynamics of CHIKV in the Brazilian Amazon region remain poorly understood, despite their importance for understanding the epidemiological spread and public health impact of CHIKV in the country. Methodology/Principal findings We report a large CHIKV outbreak (5,928 notified cases between August 2014 and August 2018) in Boa Vista municipality, the capital city of Roraima state, located in the Brazilian Amazon region. We generated 20 novel CHIKV-ECSA genomes from the Brazilian Amazon region using MinION portable genome sequencing. Phylogenetic analyses revealed that despite an early introduction of the Asian genotype in 2015 in Roraima, the large CHIKV outbreak in 2017 in Boa Vista was caused by an ECSA lineage most likely introduced from northeastern Brazil. Epidemiological analyses suggest a basic reproductive number (R0) of 1.66, which translates into an estimated 39% (95% CI: 36 to 45%) of Roraima's population infected with CHIKV-ECSA. Finally, we find a strong association between Google search activity and the local laboratory-confirmed CHIKV cases in Roraima. Conclusions/Significance This study highlights the potential of combining traditional surveillance with portable genome sequencing technologies and digital epidemiology to inform public health surveillance in the Amazon region.
Our data reveal a large CHIKV-ECSA outbreak in Boa Vista, limited potential for future CHIKV outbreaks, and indicate a replacement of the Asian genotype by the ECSA genotype in the Amazon region.


Introduction
In August 2014, local transmission of chikungunya virus (CHIKV) was detected in Brazil for the first time, with cases being reported nearly simultaneously in Oiapoque (Amapá state, north Brazil) and Feira de Santana (Bahia state, northeast Brazil), two municipalities separated and yellow fever virus in the past [27]. Moreover, the Amazon region has recently been highlighted as a region with high transmission potential of vector-borne diseases [4] and, more generally, a region with high potential for virus zoonoses and emergence [28]. Due to its connectivity and the potential impact of vector-borne and zoonotic viruses from the Amazon basin on global epidemiology, it is important to improve genomic pathogen surveillance in Roraima. By August 2018, the public health laboratory of Boa Vista (capital city of Roraima state) had reported 5,928 CHIKV cases, 3,795 of which were laboratory-confirmed.
Here we use a combination of on-site portable virus genome sequencing and epidemiological analysis of case count and web search data to describe the circulation, genetic diversity, epidemic potential and attack rates of a large CHIKV outbreak in Boa Vista.

Connectivity in study area
Roraima is the northernmost of Brazil's 27 federal units (Fig 1A) and has an estimated population of 450,479, of whom 284,313 live in the capital city of Boa Vista (ibge.gov.br/). Despite being Brazil's least populated federal unit, Roraima is one of the best-connected Brazilian states in the Amazon basin [29]. Within Brazil, Roraima is connected to Amazonas state in the south via the road BR-174. This road also connects Roraima's capital city, Boa Vista, to the states of Bolivar and Amazonas in Venezuela in the north. Further, the road BR-401 links Boa Vista to Guyana in the east. There are four daily flights connecting Boa Vista with Brasília, capital of Brazil, as well as six weekly flights to Manaus, the capital city of Amazonas state and the biggest city in the north of the country, with connecting daily nonstop flights to all other Brazilian states/regions and international destinations, including important international airport hubs in Panamá City and Miami, USA. There are also less-commonly used seasonal fluvial networks that connect Boa Vista and Manaus via the Amazonas river. and September 2018, LACEN-RR notified 5,928 CHIKV cases in Boa Vista alone, 3,795 of these laboratory-confirmed, to the National Reportable Disease Information System (SINAN). Case count time series are available from Github (https://github.com/arbospread/chikamazon). We follow the Brazilian Ministry of Health's guidelines and define a notified CHIKV case as a suspected case characterized by (i) acute onset of fever >38.5˚C, (ii) severe arthralgia and/or arthritis not explained by other medical conditions, and (iii) residing or having visited epidemic areas within 15 days before onset of symptoms. 
A laboratory-confirmed case is a suspected case confirmed by laboratory methods such as (i) virus isolation in cell culture, (ii) detection of viral RNA, (iii) detection of virus-specific IgM antibodies in a single serum sample collected in the acute or convalescent stage of infection; or (iv) a four-fold rise of IgG titres in samples collected during the acute phase, in comparison with a sample collected in the convalescent period.

Ethics statement
Residual anonymized clinical samples were processed in accordance with the terms of Resolution 510/2016 of CONEP (National Ethical Committee for Research, Brazilian Ministry of Health), under the auspices of the ZiBRA project (http://www.zibraproject.org/). The project was approved by the Pan American Health Organization Ethics Review Committee (PAHOERC) no. PAHO-2016-08-0029.

Nucleic acid isolation and RT-qPCR
Residual anonymized clinical diagnostic samples were sent to Instituto Leônidas e Maria Deane, FIOCRUZ Manaus, Amazonas, Brazil, for molecular diagnostics as part of the ZiBRA-2 project. Total RNA extraction was performed with the QIAamp Viral RNA Mini kit (Qiagen), following the manufacturer's recommendations. Samples were first tested using a multiplexed qRT-PCR protocol against CHIKV, dengue virus (DENV1-4), yellow fever virus, Zika virus, Oropouche virus and Mayaro virus [30]. All qRT-PCR results were corroborated using a second protocol [31]; comparable Ct values were obtained with the two protocols. CHIKV-positive samples tested negative for all other arboviruses tested. Samples were selected for sequencing based on a Ct-value <30 (to maximize genome coverage of clinical samples by nanopore sequencing [32]) and on the availability of epidemiological metadata, such as date of onset of symptoms, date of sample collection, gender, municipality of residence, and symptoms (Table 1). We included a total of 13 samples from Roraima state plus 5 additional samples from patients visiting the LACEN-Amazonas in Manaus.

Complete genome MinION nanopore sequencing
Sequencing was attempted on samples with Ct-value ≤30 at Instituto Leônidas e Maria Deane, FIOCRUZ Manaus. We used an Oxford Nanopore MinION device with protocol chemistry R9.4, as previously described [33]. Sequencing statistics can be found in S1 Table. In brief, we employed a protocol with cDNA synthesis using random primers followed by strain-specific multiplex PCR [33]. Extracted RNA was converted to cDNA using the Protoscript II First Strand cDNA synthesis Kit (New England Biolabs, Hitchin, UK) and random hexamer priming. CHIKV genome amplification by multiplex PCR was attempted using the CHIKAsianECSA primer scheme and 35 cycles of PCR using Q5 High-Fidelity DNA polymerase (NEB) as described in [33]. PCR products were cleaned up using AmpureXP purification beads (Beckman Coulter, High Wycombe, UK) and quantified using fluorimetry with the Qubit dsDNA High Sensitivity assay on the Qubit 3.0 instrument (Life Technologies). PCR products for samples yielding sufficient material were barcoded and pooled in an equimolar fashion using the Native Barcoding Kit (Oxford Nanopore Technologies, Oxford, UK). Sequencing libraries were generated from the barcoded products using the Genomic DNA Sequencing Kit SQK-MAP007/SQK-LSK108 (Oxford Nanopore Technologies). Libraries were loaded onto an R9/R9.4 flow cell and sequencing data were collected for up to 48 h. Consensus genome sequences were produced by alignment of two-direction reads to a CHIKV reference genome (GenBank Accession number: N11602) as previously described in [33]. Positions with ≥20× genome coverage were used to produce consensus alleles, while regions with lower coverage and those in primer-binding regions were masked with N characters. Validation of the sequencing protocol was previously performed in [33].
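The masking rule described above (keep a consensus allele only at positions with at least 20× coverage, and mask low-coverage and primer-binding positions with N) can be illustrated with a minimal Python sketch. This is not the pipeline of [33]; the function names, the input layout (one called base and one depth value per position) and the primer interval format are assumptions made for illustration only.

```python
MIN_DEPTH = 20  # coverage threshold stated in the text (>= 20x)


def mask_consensus(bases, depths, primer_sites=(), min_depth=MIN_DEPTH):
    """Mask low-coverage and primer-binding positions with 'N'.

    bases        -- string of called consensus alleles, one per position
    depths       -- read depth at each position (same length as bases)
    primer_sites -- hypothetical (start, end) half-open, 0-based intervals
    """
    if len(bases) != len(depths):
        raise ValueError("bases and depths must have the same length")
    primer_positions = set()
    for start, end in primer_sites:
        primer_positions.update(range(start, end))
    masked = []
    for i, (base, depth) in enumerate(zip(bases, depths)):
        keep = depth >= min_depth and i not in primer_positions
        masked.append(base if keep else "N")
    return "".join(masked)


def coverage_fraction(consensus):
    """Fraction of positions that carry a called (non-N) allele."""
    return sum(base != "N" for base in consensus) / len(consensus)
```

Applied to a toy eight-base sequence with two low-depth positions and a primer site over the last two bases, `mask_consensus` leaves four called alleles, so `coverage_fraction` returns 0.5. The same fraction, computed per genome, is the kind of coverage statistic reported in the Results.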

Collation of CHIKV-ECSA complete genome datasets
Genotyping was first conducted using the phylogenetic arbovirus subtyping tool available at http://www.krisp.org.za/tools.php. Complete and near-complete sequences were retrieved from GenBank in June 2017 [34]. Two complete or near-complete CHIKV genome datasets were generated. Dataset [35]. Dataset 2 (ECSA-Br) included only the 29 Brazilian genome sequences. Using a robust nonparametric test [36], no evidence of recombination was found in either dataset.

Maximum likelihood analysis and temporal signal estimation
Maximum likelihood (ML) phylogenetic analyses were performed for each dataset using RAxML v8 [37]. We used a GTR nucleotide substitution model with 4 gamma categories (GTR+4Γ). In order to investigate the evolutionary temporal signal in each dataset, we regressed root-to-tip genetic distances against sample collection dates using TempEst [38]. For both datasets we obtained a strong linear correlation (dataset 1: r² = 0.93; dataset 2: r² = 0.84), suggesting that these alignments contain sufficient temporal information to justify a molecular clock approach. However, for dataset 1, the Angola/M2022/1962 strain was positioned substantially above the regression line. Previous investigations have suggested that this strain may have been the result of contamination or high passage in cell culture [9], so this sequence was removed from subsequent analyses.
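The root-to-tip regression used to assess temporal signal can be sketched as an ordinary least-squares fit of genetic distance against sampling date. The Python sketch below is not TempEst; it assumes the root-to-tip distances have already been extracted from the ML tree, and the function name and inputs are hypothetical. The slope is a crude estimate of the evolutionary rate (substitutions per site per year), the x-intercept approximates the date of the root, and r² measures the strength of the temporal signal.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares regression of root-to-tip distance on sampling date.

    dates     -- sampling dates as decimal years (e.g. 2017.21)
    distances -- root-to-tip genetic distances (substitutions per site)
    Returns (slope, x_intercept, r_squared).
    """
    n = len(dates)
    if n != len(distances) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0 or syy == 0:
        raise ValueError("dates and distances must both vary")
    slope = sxy / sxx                      # rate estimate, subs/site/year
    intercept = mean_y - slope * mean_x
    x_intercept = -intercept / slope       # approximate date of the root
    r_squared = sxy ** 2 / (sxx * syy)     # strength of temporal signal
    return slope, x_intercept, r_squared
```

A sequence that sits far above the fitted line, as described for Angola/M2022/1962, would show up here as a large positive residual relative to the other tips.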

Molecular clock phylogenetic analysis
To estimate time-calibrated phylogenies we used the BEAST v.1.10.1 software package [39].
To infer historical trends in effective population size from the genealogy we used several different coalescent models. Because preliminary analysis indicated oscillations in epidemic size through time (as also expected from national case report data), we used three flexible, nonparametric models: a) the standard Bayesian skyline plot (BSP; 10 groups) [40], b) the Bayesian skyride plot [41], and c) the Bayesian skygrid model [42], with 45 grid points equally spaced between the estimated TMRCA of the CHIKV-ECSA genotype in Brazil and the date of the most recent available isolate, collected on 18 March 2017 [42]. For comparison, we also used a constant population size coalescent model. We tested two molecular clock models: a) the strict molecular clock model, which assumes a single rate across all phylogeny branches, and b) the more flexible uncorrelated relaxed molecular clock model with a lognormal rate distribution (UCLN) [43]. Because the marginal posterior distribution of the coefficient of variation of the UCLN model did not exclude zero (most likely due to the small alignment size), we used a strict molecular clock model in all analyses. For each coalescent model, Markov Chain Monte Carlo analyses were run in duplicate for 10 million steps using an ML starting tree and the GTR+4Γ codon partition (CP)1+2,3 model [43].

Epidemiological analysis
The epidemic basic reproductive number (R0) was estimated from monthly confirmed cases, as previously described [32,44]. Because (i) the Asian genotype has been circulating in the north region of Brazil since 2014 [1], and (ii) we observed a relatively small number of cases both in the notified and confirmed time series, we assume that cases reported between June 2014 and December 2016 did not represent autochthonous transmission of CHIKV-ECSA. We assume a mean generation time of 14 days, as previously reported elsewhere for an outbreak caused by an Indian Ocean lineage (IOL), a subclade of the ECSA genotype [45]. We report R0 estimates for different values of the generation time (g) parameter, along with corresponding estimates of the epidemic exponential growth rate, per month (r).
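The text does not spell out the formula linking the growth rate r to R0; it refers to [32,44]. As a hedged illustration, the Python sketch below uses two standard relations between r and R0 that follow from different assumptions about the generation-time distribution: R0 = 1 + r·g for an exponentially distributed generation time, and R0 = exp(r·g) for a fixed generation time. Which relation (if either) matches the authors' analysis is an assumption here, as are the function names and the log-linear fit used to obtain r from monthly counts.

```python
import math

DAYS_PER_MONTH = 365.25 / 12  # used to convert g from days to months


def growth_rate_from_counts(monthly_counts):
    """Exponential growth rate per month from a log-linear least-squares fit.

    monthly_counts -- positive case counts for consecutive months taken
                      from the early exponential phase of the outbreak
    """
    n = len(monthly_counts)
    if n < 2 or any(c <= 0 for c in monthly_counts):
        raise ValueError("need two or more positive monthly counts")
    xs = range(n)
    ys = [math.log(c) for c in monthly_counts]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def r0_from_growth_rate(r_per_month, generation_time_days=14.0,
                        model="exponential"):
    """Convert a growth rate into R0 under an assumed generation-time model."""
    g = generation_time_days / DAYS_PER_MONTH  # generation time in months
    if model == "exponential":
        return 1.0 + r_per_month * g
    if model == "fixed":
        return math.exp(r_per_month * g)
    raise ValueError("model must be 'exponential' or 'fixed'")
```

The default of 14 days is the mean generation time assumed in the text. Calling `r0_from_growth_rate` over a range of `generation_time_days` values gives the kind of sensitivity table over g described above.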

Web search query data
Available in near-real time, disease-related Internet search activity has been shown to track disease activity (a) in seasonal mosquito-borne disease outbreaks, such as those caused by dengue [46], and (b) in unexpected and emerging mosquito-borne disease outbreaks such as the 2015-2016 Latin American Zika outbreak [47]. Here, we investigated whether we could find a meaningful relationship between Internet search activity and the local chikungunya outbreak in Roraima. Indeed, novel Internet-based data sources have the potential to complement traditional surveillance by capturing early increases in disease-related search activity that may signal an increase in the public's perception of a given public health threat and may additionally capture underlying increases in disease activity. Internet searches may be particularly important and indicative of changes in disease transmission early during an outbreak, when ongoing information on the virus transmission is obfuscated by a lack of medical surveillance. In addition, Internet search trends may also help track disease activity in populations that may not seek formal medical care. We used the Google Trends (GT) tool [46,47] to compile the monthly fraction of online searches for the term "Chikungunya", that originated from Boa Vista municipality (Roraima state), between January 2014 and July 2018. For comparison, GT search activity for the term "Chikungunya" was collected for the same time period for Manaus municipality (Amazonas state). The synchronicity of GT time series and notified and confirmed case counts from Boa Vista and Manaus was assessed using the Spearman's rank correlation test in the R software [48].
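The analysis above was run in R [48]. Purely as an illustration of what Spearman's rank correlation computes, the following standard-library Python sketch ranks both monthly series (averaging ranks across ties) and then takes the Pearson correlation of the ranks. The function names are hypothetical, and the sketch returns only the coefficient, not the p-value of the test.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with 2+ points")
    return pearson(average_ranks(x), average_ranks(y))
```

Here `x` would be the monthly Google Trends search fraction and `y` the monthly notified or confirmed case counts for the same months. Because the statistic depends only on ranks, it is insensitive to the arbitrary scaling of the Google Trends index.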

Results
Although most CHIKV notified cases in Brazil were reported in 2016 (Fig 1), the majority of notified and confirmed cases in Roraima state were reported in 2017 (5,027 notified cases and 3,720 laboratory-confirmed infections). The number of cases in Roraima started increasing exponentially in January 2017, and the outbreak peaked in July 2017. We selected 15 RT-qPCR+ virus isolates from autochthonous cases in Roraima state (11 from Boa Vista, 1 from Bonfim, and 1 from Iracema municipalities) (Table 1) with a cycle threshold (Ct) ≤30 (mean 20.3, range 13.7-27.41). We included two isolates from two infected travellers returning to Roraima in December 2014, and an additional five isolates from Amazonas state (all from Manaus municipality), sampled between July 2015 and March 2017. Genome sequence data were obtained for all selected isolates in less than 48 hours, and preliminary results were shared with local public health officials and the Brazilian Ministry of Health in less than 72 hours. A mean genome coverage of 86% (at 20× depth) was obtained for the sequenced data; mean coverage increased to 90% when focusing on samples with Ct<26 (Fig 2A). Coverage of individual sequences and epidemiological information for each sequenced isolate can be found in Table 1.
Identification of virus genotypes was conducted using phylogenetic analysis of full-length genome datasets (manual classification) and using an online phylogenetic analysis tool (automated classification). Both approaches identified the ECSA genotype as the dominant genotype circulating in both Roraima and Manaus between 2015 and 2017. However, two cases from late 2014 returning from Venezuela to Roraima (AMA294 and AMA295) were classified as Asian genotype, the dominant lineage circulating in Latin America.
ML and Bayesian phylogenetic analyses reveal that the ECSA sequences from Brazil form a single well-supported clade (bootstrap support = 100), hereafter named the ECSA-Br clade, which contains strong temporal signal (r² = 0.84) as measured by a regression of genetic divergence against sampling dates (Figs 2B and 3). Thus we estimated the evolutionary time-scale of the ECSA-Br lineage using several well-established molecular clock coalescent methods. Our substitution rate estimates indicate that the ECSA-Br lineage is evolving at 7.15 × 10⁻⁴ substitutions per site per year (s/s/y; 95% Bayesian credible interval: 5.04-9.55 × 10⁻⁴). This estimated rate is higher than that estimated for endemic lineages, and is similar to the evolutionary rates estimated for the epidemic lineage circulating in the Indian Ocean region (Fig 2C). A closer inspection of amino acid mutations indicates that the ECSA-Br strains lack both the A226V (E1 protein) and the L210Q (E2 protein) mutations that have been reported to increase virus transmissibility and persistence in Ae. albopictus populations in the Indian Ocean [49]. This is consistent with the establishment of the ECSA genotype in Brazil following the introduction of a single strain to the Americas [1]. The two isolates collected in late 2014 in Roraima cluster together and fall, as expected, within the diversity of other Asian genotype sequences from the Americas. Our phylogenetic reconstruction suggests at least five separate introductions of the Asian genotype into Brazil (S1 Fig), in contrast to a single introduction of the ECSA genotype followed by onward transmission. Moreover, all 13 ECSA isolates sampled in Roraima (node C) cluster together with maximum phylogenetic support (bootstrap support = 100; posterior probability = 1.00) (Fig 3).
We consistently estimate the date of the most recent common ancestor of the ECSA-Br Roraima clade to be mid-July 2016 (95% BCI: late March to late October 2016) (Fig 3); similar dating estimates were obtained under different coalescent models (S2 Fig). In contrast to the Roraima strains, sequences from Manaus were found to be interspersed with isolates from Bahia and Pernambuco (Fig 3), indicating separate introductions of the CHIKV-ECSA lineage, some in early 2015 (node B), possibly from the northeast region of Brazil. Interestingly, according to travel history reports, the first autochthonous transmission of CHIKV in Manaus was linked to an index patient who reported spending holidays in Feira de Santana (Bahia state) in early 2015, during a period when this city was experiencing a large CHIKV outbreak [5]. The date of node A was estimated to be around mid-July 2014 (95% BCI: early July to late August 2014), shortly after the arrival of the presumed index case in Feira de Santana, Bahia [5]. This is in line with a single introduction to Bahia (node A), followed by subsequent waves of transmission across the northeast and southeast regions of Brazil [5,50,51]. Our demographic reconstructions indicate that the outbreak in Roraima in 2017 probably represents the third epidemic wave spreading across Brazil (S3 Fig).
Fig 2. Sequencing statistics, temporal signal and evolutionary rates of the CHIKV-ECSA lineage. A. Genome coverage plotted against RT-qPCR Ct-values for the newly generated sequence data. B. Genetic divergence regressed against dates of sample collection for dataset 2 (CHIKV-ECSA-Br lineage). C. Evolutionary rate estimates for the CHIKV-ECSA-Br lineage obtained by this study (circle number 1) compared to published evolutionary rates obtained for other lineages. Circles numbered 2 to 8 represent point estimates reported in [1,9,80]. Horizontal bars represent 95% highest posterior density credible intervals for evolutionary rates. https://doi.org/10.1371/journal.pntd.0007065.g002
Next, we used notified case counts to estimate the basic reproductive number, R0, of the epidemic. R0 is the average number of secondary cases caused by an infected individual and can be estimated from epidemic growth rates during the early exponential phase of an epidemic [44]. We find that R0 ≈ 1.66 (95% CI: 1.51-1.83), in line with previous reports from other settings [52-54]. A sensitivity analysis considering different exponential growth phase periods resulted in a lower bound for R0 of around 1.23 (S4 Fig). To gain insights into the possible magnitude of the outbreak and local surveillance capacity we used the equilibrium end state of a simple susceptible-infected-recovered (SIR) model: N = S + I + R, S ≈ 1/R0, I ≈ 0, with N being the total population size of Roraima. Using this simple mathematical approach, we obtain an attack rate (R) of 0.39 (95% CI: 0.36-0.45), slightly lower than elsewhere in Brazil [13,16]. This corresponds to an estimated 110,882 (95% CI: 102,352-127,940) infected individuals, and a case detection rate of 5.34% (95% CI: 4.63-5.79). This implies that approximately 1 case was notified for every 19 infections. If we assume that 32.7-41.2% of the estimated infections are symptomatic, as previously reported in Bahia and Sergipe [55], then we estimate that the local observation success of symptomatic cases was between 12.8-16.1%. However, if we assume that 75-97% of people infected with CHIKV will develop symptomatic infections, as reported for the Indian Ocean lineage [11,56,57], then the chance of a symptomatic CHIKV case being reported decreases to 5-7% [10]. Case reports suggest that the beginning of the exponential phase of the outbreak was in December 2016 (S4 Fig), while genetic data suggest that the outbreak clade emerged around July 2016. However, between August 2014 and June 2016, 612 CHIKV notified cases and 40 confirmed cases were reported by the LACEN-RR. It is therefore likely that prior to January 2017, low but non-negligible transmission of the Asian genotype occurred in Roraima.
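The arithmetic behind these estimates can be retraced with a short Python sketch. With proportions in place of counts, the end state S ≈ 1/R0, I ≈ 0 gives an attack rate of R ≈ 1 − 1/R0, which for R0 = 1.66 is about 0.40, close to the reported 0.39. The function names are hypothetical. One observation that should be read as an inference rather than a statement from the text: the reported 110,882 infections equals 0.39 multiplied by 284,313, the Boa Vista population quoted in the study-area section, so that figure is used as the population size below.

```python
def final_attack_rate(r0):
    """Attack rate implied by the SIR end state S = 1/R0, I = 0 (proportions)."""
    if r0 <= 1.0:
        return 0.0  # no sustained outbreak is expected when R0 <= 1
    return 1.0 - 1.0 / r0


def outbreak_summary(attack_rate, population, notified_cases,
                     symptomatic_fraction=1.0):
    """Return (infections, detection rate, symptomatic detection rate).

    symptomatic_fraction -- assumed share of infections with symptoms;
                            1.0 means every infection counts as symptomatic
    """
    infections = attack_rate * population
    detection_rate = notified_cases / infections
    symptomatic_detection = notified_cases / (infections * symptomatic_fraction)
    return infections, detection_rate, symptomatic_detection


# Inputs taken from the text; the population choice is the inference above.
BOA_VISTA_POPULATION = 284_313
NOTIFIED_CASES = 5_928

infections, detection, _ = outbreak_summary(
    0.39, BOA_VISTA_POPULATION, NOTIFIED_CASES)
print(round(infections), round(100 * detection, 2), round(1 / detection, 1))
```

Passing `symptomatic_fraction=0.327` or `0.412` gives the detection rate among symptomatic infections under the Bahia and Sergipe assumption, and `0.75` or `0.97` gives the corresponding range under the Indian Ocean lineage assumption.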
We investigated the public's awareness of the chikungunya outbreak by retrospectively monitoring Google searches of the search term "chikungunya" in Roraima state from January 2014 to July 2018 (Fig 4). As a comparison, we performed a similar search focusing on the neighbouring state of Amazonas. We found that web search activity and CHIKV case counts in Roraima are highly correlated (notified cases: r = 0.89; confirmed cases: r = 0.92, Fig 4D-4E). Additionally, the timing of the peak of Google searches corresponds to that of notified and confirmed cases, with a peak in July 2017 (Fig 4A and 4C, Fig 4B and 4F). It is important to note that web search activity was available weeks or months before the final number of confirmed (and suspected) cases was made publicly available. This fact highlights the potential utility of monitoring disease-related searches during the outbreak. Interestingly, we find some web-search activity in Roraima before June 2016, particularly in September 2014, March 2015 and March 2016 (Fig 4F). These patterns are distinct from those in the neighbouring state of Amazonas (notified cases: r = 0.65; confirmed cases: r = 0.15), which shows an early peak in November 2014, soon after the estimated age of node B (Fig 3B), followed by a peak in February 2016 and another in March 2017 (Fig 4C). These multiple peaks in internet search queries are consistent with the timing of at least 3 introductions detected in our phylogenetic analyses (Fig 3B), each possibly resulting in small epidemic waves of CHIKV in Manaus and Amazonas state.

Discussion
In this study we characterized an outbreak caused by CHIKV in Boa Vista city, Roraima state, northern Brazil, using a combination of genetic, laboratory-confirmed and -suspected, and digital search data. Our findings show that an ECSA lineage was introduced in Roraima around July 2016, six months before the beginning of the exponential increase in case numbers. Using simple epidemiological models, we show that on average 1 in 17 (95% CI: 14-20) symptomatic CHIKV cases, a fraction of the 110,882 (95% CI: 102,352-127,940) estimated number of infections, sought medical care during the outbreak of CHIKV-ECSA in Roraima. Incidence of CHIKV notified cases was strongly associated with fluctuation in Google search activity in Roraima. Moreover, this study represents the first effort to generate on-site complete CHIKV genome sequences. Our results deliver a genomic and epidemiological description of the largest outbreak ever reported in north Brazil, revealing the circulation of the ECSA lineage in the Amazon region.
We estimate that 39% (95% CI: 36-45%) of Roraima's population was infected with CHIKV-ECSA-Br during the outbreak in 2017. Our estimates are higher than the 20% seropositivity observed in a rural community in Bahia [11], and slightly lower than the 45.7-57.1% observed in two serosurveys conducted in the same state [13], where the ECSA lineage also seems to predominate. The observed differences in the proportion of the population exposed to CHIKV in Roraima compared to previous estimates from the northeast region could result from partial protection resulting from low-level transmission of the CHIKV-Asian genotype during 2014-2016 in the north region. Alternatively, some level of cross-protection could have been conferred by previous exposure to Mayaro virus (MAYV); Mayaro is an antigenically related alphavirus that may provide some level of cross-reactivity [58,59] and is associated with Haemagogus spp. vectors [60], but has also been identified in Culex quinquefasciatus and Aedes aegypti mosquitoes [66]. MAYV has been detected in the north [61-65] and centre-west [22,66-70] regions of Brazil. Moderate to high prevalence of MAYV IgM has been found in urban northern areas [61], which could explain the limited spread of CHIKV in Manaus compared to Roraima. Finally, because CHIKV notified cases will be influenced by the apparent rate of infection associated with the genotype causing an outbreak [56], future comparisons of epidemiological parameters across regions for which no genotype data are available should be interpreted with caution. Given the rapid spread of different CHIKV lineages, novel diagnostic tools may be needed to evaluate the proportion of individuals infected by each genotype.
Different circulating CHIKV lineages may have remarkably different public health consequences. Lineage-specific clinical presentations were highlighted by a recent index cluster study, which showed that 82% of CHIKV infections caused by the ECSA lineage are symptomatic, compared with only 52% of infections caused by the Asian genotype [56]. While the Asian lineage seems to have circulated cryptically for 9 months before its first detection in the Caribbean [3], the faster detection of the ECSA lineage in Brazil could at least in part be a consequence of a higher ratio of symptomatic to asymptomatic infections for the ECSA lineage circulating in Brazil. The time lag between the phylogenetic estimate of the date of introduction of a virus lineage and the date of the first confirmed case in a given region enables us to identify surveillance gaps between the arrival and discovery of a virus in that region [71].
We used genomic data collected over a 3-year period to estimate the genetic history of the CHIKV-ECSA-Br lineage. We estimate that the CHIKV-ECSA-Br lineage arrived in Roraima around July 2016, whilst the first confirmed CHIKV cases in Roraima occurred earlier, in August 2014. That the discovery date precedes the estimated date of introduction can be explained by initial introduction(s) of the Asian lineage (from the north of Brazil or from other South American regions) resulting in only limited onward transmission, followed by the replacement of the Asian lineages by an epidemiologically successful ECSA lineage. Transmission of the Asian genotype during this period is in line with an increase in notified and confirmed cases, as well as in internet search query data, between August 2014 and June 2016. It is also possible that ecological conditions may have dampened the transmission of the Asian genotype between August 2014 (detection of autochthonous transmission of the Asian genotype in the north region of Brazil) and July 2016 (estimated arrival of the ECSA lineage in Roraima). In the future, fine-scaled, high-resolution measures of transmission potential that take into account daily changes in humidity and temperature will help address the impact of climatic changes on arbovirus epidemiology in the Brazilian Amazon. Nationwide molecular and seroprevalence studies combined with epidemiological modelling [72] will help to determine the proportion of cases caused by the ECSA compared to the Asian lineage in different geographic settings, and to identify which populations are still at risk of infection in Brazil.
We estimated high rates of nucleotide substitution for this lineage, which equates to around 8 (95% BCI: 6-11) nucleotide substitutions per year across the virus genome. Such rates are similar to the evolutionary rates estimated for the IOL lineage; these are typical of urban and epidemic transmission cycles in locations with an abundance of suitable hosts and lack of herd immunity [9]. None of the mutations associated previously with increased transmissibility of the IOL lineage in Ae. albopictus mosquitos in the Indian Ocean region were identified in this study. However, it is currently unclear whether we should expect the same mutations to be linked with increased transmission in Aedes spp. populations both from Brazil and from Southeast Asia. Further, it is possible that CHIKV in Brazil is transmitted mainly by the Ae. aegypti vector that is abundant throughout Brazil [73]. In line with this, CHIKV-ECSA was recently detected in Aedes aegypti from Maranhão [74] and Rio de Janeiro states [75].
The past dengue serotype 4 genotype II outbreak in Brazil ignited in the north of the country, and is inferred to have been introduced from Venezuela to Roraima, before spreading to the northeast and southeast regions of Brazil [76]. Our genetic analysis reveals at least four instances of ECSA-Br virus lineage migration in the opposite direction, i.e., from northeastern to northern Brazil. Such a pattern may not be surprising due to the year-round persistence of Aedes aegypti mosquitos in the northeast and the north areas [32]. Within-country transmission will be dictated by human mobility, climatic synchrony, and levels of population immunity. Moreover, international spread of the ECSA-Br lineage is expected to regions linked to Brazil. Previous analyses of dengue virus serotypes have identified a strong connectivity between north Brazil and Venezuela [26,77], and northeast Brazil and Haiti [32,78]. In addition, Angola and Brazil are linked by human mobility and synchronous climates that have facilitated the migration of CHIKV-ECSA [1] and Zika virus (http://virological.org/t/circulation-of-the-asian-lineage-zika-virus-in-angola/248).
Improving surveillance in the Amazon region may help anticipate transmission of vector-borne diseases and also spillover from wild mammals of zoonotic viruses of particular concern [28]. Portable genomic sequencing of vector-borne viral infections in the Amazon may be particularly important for the early identification of circulating strains newly (re)introduced from wildlife. For example, yellow fever strains collected in Roraima seem to be at the source of the 2016-2018 yellow fever virus outbreak in southeast Brazil, which has affected large urban centres in Minas Gerais, São Paulo and Rio de Janeiro [27]. In the near future, the increasing rapidity and decreasing cost of genome sequencing in poorly sampled areas, combined with emerging theoretical approaches [79], will facilitate the investigation of possible associations between arbovirus lineage diversity, mosquito vectors, reservoir species, and transmission potential.
Finally, the reported synchronicities between notified chikungunya case counts in Roraima and the chikungunya-related Internet searches originating in the region highlight the potential complementarity that Internet search activity may offer in future disease outbreaks. Specifically, given that disease-related search activity can be monitored in near-real time, early signals of increases in disease activity may be spotted weeks or months before lab-confirmed case counts become available in an unfolding outbreak.