Genomic Surveillance of Yellow Fever Virus Epizootic in São Paulo, Brazil, 2016 – 2018

São Paulo, a densely inhabited state in southeast Brazil that contains the fourth most populated city in the world, recently experienced its largest yellow fever virus (YFV) outbreak in decades. YFV does not normally circulate extensively in São Paulo, so most people were unvaccinated when the outbreak began. Surveillance in non-human primates (NHPs) is important for determining the magnitude and geographic extent of an epizootic, thereby helping to evaluate the risk of YFV spillover to humans. Data from infected NHPs can give more accurate insights into YFV spread than when using data from human cases alone. To contextualise human cases, identify epizootic foci and uncover the rate and direction of YFV spread in São Paulo, we generated and analysed virus genomic data and epizootic case data from NHPs in São Paulo. We report the occurrence of three spatiotemporally distinct phases of the outbreak in São Paulo prior to February 2018. We generated 51 new virus genomes from YFV positive cases identified in 23 different municipalities in São Paulo, mostly sampled from NHPs between October 2016 and January 2018. Although we observe substantial heterogeneity in lineage dispersal velocities between phylogenetic branches, continuous phylogeographic analyses of generated YFV genomes suggest that YFV lineages spread in São Paulo at a mean rate of approximately 1km per day during all phases of the outbreak. Viral lineages from the first epizootic phase in northern São Paulo subsequently dispersed towards the south of the state to cause the second and third epizootic phases there. This alters our understanding of how YFV was introduced into the densely populated south of São Paulo state. Our results shed light on the sylvatic transmission of YFV in highly fragmented forested regions in São Paulo state and highlight the importance of continued surveillance of zoonotic pathogens in sentinel species.

Introduction Yellow fever (YF) is an acute haemorrhagic disease caused by yellow fever virus (YFV), a single-stranded positive-sense RNA virus from the Flavivirus genus. Yellow fever is endemic to the American and African tropics, where nearly 400 million people are estimated to be at risk of infection [1]. Clinical manifestations of YF in humans range from inapparent or mild disease in up to 80% of infected cases, to severe hepatitis and haemorrhagic disease. The fatality rate among patients who develop visceral disease can range from 20% to 60% [2]. YF is preventable in humans by administration of a single dose of an extremely effective vaccine that provides life-long protection against the disease.
YFV is primarily transmitted in the Americas in a sylvatic cycle between non-human primates (NHPs) and tree-dwelling mosquitoes (primarily Haemagogus janthinomys, Haemagogus leucocelaenus and more rarely mosquitoes of the Sabethes genus) [3][4][5]. The urban cycle of YFV, in which YFV is transmitted between humans and Aedes aegypti mosquitoes, has not been observed in the Americas since the 1940s [2,6]. In Brazil, outbreaks typically initiate amongst NHPs in neotropical forests and cause subsequent human epidemics that mostly affect unvaccinated male adults in rural areas. YFV infection tends to be detected earlier in NHPs before occasional human spill over infections are observed, and infection location is typically more precisely ascertained for NHPs than for humans. Surveillance of primate epizootics is therefore critical to alert us to the possibility of future human infections [7], and to provide the clearest possible indicator of spatiotemporal disease spread. From July 2016, the southeast region of Brazil suffered the largest YFV outbreak observed in the Americas in decades. The epizootic/epidemic was caused by a rapidly-spreading lineage of the South American 1 (SA1) genotype that is thought to have originated in the Amazon basin [8]. Between December 2016 and June 2019, the Brazilian Ministry of Health confirmed at least 2251 human YFV cases and 1567 epizootic events in NHPs across Brazil [9]. São Paulo state is a densely populated state of southeast Brazil that contains one of the world's largest urban conurbations. Available reports from São Paulo suggest that the state health authorities confirmed around 875 cases of YFV in NHPs between July 2016 and 18 th November 2019 and 624 cases of YFV in humans between January 2017 and 18 th November 2019 [10][11][12]. YFV genomes from 36 human cases sampled during the outbreak have recently been reported from São Paulo [13]. However, despite the magnitude and expansion of the epizootic in São Paulo, remarkably little is known about the origins, spread or genetic diversity of YFV in NHP populations in the region.

PLOS PATHOGENS
Analytical insights into the spread of YFV among NHP populations can guide public health efforts more effectively than those based on human cases alone. This is because routes and rate of YFV spread estimated solely from human cases can be biased by substantial human travel away from the location of infection, or obscured by the lack of human cases in well-vaccinated regions where epizootics are nevertheless ongoing. Epizootic foci can be used to determine which human populations may be most urgently in need of vaccination. Understanding how YFV is introduced into new areas, particularly highly urbanised areas such as São Paulo city, is important for designing strategies that can interrupt these introductions. Estimates of the rate of YFV spread can help public health organisations ensure healthcare preparedness is achieved before the anticipated onset of local human cases. To address these questions, we report case data from NHP populations in São Paulo from January 2016-February 2018. We sequence YFV from NHPs in São Paulo using a previously-developed portable sequencing approach. We analyse case data and conducted virus phylogeographic analyses to characterize the spread of YFV in São Paulo during the first waves of the outbreak.

Identification of YFV positive cases
Reported NHP carcasses were tested for YFV. Not all reported carcasses were tested for YFV and therefore the number of positive animals reported here likely underestimates YFV incidence in NHPs in São Paulo. The proportion of all cases in our data changed over time, such that proportionally fewer reported NHPs were tested in late 2017 and early 2018 than at an early stage of outbreak emergence (S1 Fig). Despite this, the proportion of all carcasses that were tested following reporting was relatively consistent across each municipality (S2 Fig). YFV was confirmed by a positive result in at least one of the four methods described below; immunohistochemistry, immunofluorescence, reverse transcription quantitative PCR (RT-qPCR), or epidemiological linkage.

Histopathology and immunohistochemistry for NHP cases
All samples were subjected to histopathology and immunohistochemistry examination. Samples of brain, heart, lung, liver, spleen and kidney from NHPs were fixed in formaldehyde and embedded in paraffin. Histological sections of these tissues were stained with haematoxylin and eosin and examined on a microscope. Indirect immunohistochemistry was used to detect the presence of the YFV antigen. Sections of liver (0.3uM) were placed on slides coated with silane and were treated with an in-house polyclonal anti-YFV antibody that was produced in mice and used at 1/30,000 dilution. The slides were treated with anti-mouse secondary antibodies, linked to either horseradish peroxidase (Reveal HRP Spring polymer System, Spring) or to alkaline phosphate (Link MACH4 universal AP Polymer and Polymer MACH4 universal AP, Biocare). Chromogenic detection of the presence of YFV was subsequently conducted using the substrates 3,3'-diaminobenzidine or Fast Red (Warp Red, Biocare), respectively.

Indirect immunofluorescence
Samples of NHP blood or serum, and tissue material suspensions obtained from autopsies, were tested using a standardized indirect immunofluorescence technique [15]. An in-house polyclonal anti-flavivirus antibody (anti-DENV 1-4) and an anti-mouse IgG-FITC antibody (Sigma) were used. Positive samples were typed by indirect immunofluorescence with monoclonal antibodies for YFV (Biomanguinhos, Rio de Janeiro).

RT-qPCR
Total RNA was extracted from tissue and serum samples using two commercial kits according to the manufacturer's instructions: QIAamp RNA Blood Mini Kit for tissues and QIAamp Viral RNA Mini Kit for serum (Qiagen Inc., Germany). Viral RNA was detected using two previously published RT-qPCR techniques [16,17].

Epidemiological linkage
Where multiple NHPs from the same species were found dead at the same time and location, the number of carcasses was recorded but typically only one or a few animals were tested. In these instances, untested animals were on rare occasions considered as confirmed cases on the basis of their spatiotemporal association with tested, confirmed cases.

Sample collection of human cases
We randomly selected 5 samples from confirmed human cases collected in municipalities of São Paulo state (S1 Table) that had been tested using the RT-qPCR methods, above. These samples were collected in several municipalities of São Paulo state and sent for molecular diagnostics at Instituto Adolfo Lutz.

MinION genome sequencing
A selection of positive samples (S3 Fig and S1 Table) was sequenced using a rapid wholegenome sequencing protocol that has been previously validated and successfully applied in Brazil [8,18]. NHP samples were selected to have RT-qPCR cycle threshold (CT) values <25 to facilitate genomic amplification and sequencing. Because few samples were available, human samples were randomly selected from positive cases with CT <37. In brief, cDNA was produced from viral RNA using random hexamers and the Protoscript II First Strand cDNA synthesis kit (NEB). The genome was amplified using a multiplex PCR scheme designed to produce overlapping 500bp amplicons across the whole coding region of the recent South American genotype I outbreak clade [8,18]. PCR products were quantified, barcoded using the Oxford Nanopore Technologies (ONT) Native Barcoding Kit (NBD103, or NBD103 and NBD114, depending on sequencing date), and pooled in an equimolar fashion. Sequencing libraries consisting of 10-24 samples per library were constructed using a Ligation Sequencing kit (ONT, SQK-LSK108 or SQK-LSK109, depending on sequencing date). Sequencing was performed on a ONT FLO-MIN106 flow cell for up to 48h as described previously [8].

Generation of consensus sequences
Previously published approaches were used to produce YFV consensus sequences [18,19]. In brief, raw files were basecalled using Albacore or Guppy version 3.0.3 (Oxford Nanopore Technologies). Reads were demultiplexed and trimmed of adaptor and barcode sequences using Porechop (version 0.2.3, https://github.com.rrwick/Porechop). Trimmed reads for each barcoded sample were mapped to a reference genome (accession number JF912190) using bwa [20], and primer binding sites were trimmed from reads. Single nucleotide variants to the reference genome were detected using nanopolish variant calling (version 0.11.1) [21]. Consensus sequences were generated using identified variants. Sites with <20X coverage were replaced with ambiguity code N. Sequencing statistics can be found in S2 Table. Accession numbers of newly generated sequences can be found in S1 Table. Phylogenetic analyses The genome sequences generated here (n = 51) were combined with all previously published SA1 genotype YFV genomes available to 31 December 2019 (n = 173). This dataset was curated to remove all sequences shorter than 70% of the whole coding sequence length, those with missing metadata, and those with abnormal genomic sequences typical of sequencing or assembly errors. The full dataset analysed here therefore contains 224 YFV sequences, as shown in S4 Fig. Sequences were aligned using MAFFT v.7 [22]. The best-fit substitution model was selected using ModelTest-NG [23]. Maximum likelihood (ML) phylogenetic trees were estimated using RAxML v.8 [24] under a GTR + I + Γ 4 nucleotide substitution model. Statistical support for phylogenetic nodes was estimated using a ML bootstrap approach with 100 replicates.

Phylogeographic analyses
To investigate the spread of YFV in São Paulo, we analysed the São Paulo outbreak clade in more detail. We focus on two datasets; the 'full' São Paulo outbreak clade, represented by 99 sequences sampled from 2016-2018, and a 'geographically restricted' dataset in which 4 sequences are removed that appear to be clear spatial exports from the main São Paulo epizootic to distant locations in Goiás, Espírito Santo and Minas Gerais, with little phylogenetic evidence of extensive local transmission. These spatial outliers were removed so that rare, longdistance movements with little or no observed onwards transmission did not impact estimates of branch dispersal rate or maximal wavefront distance.
Geo-referenced and time-stamped sequences were analysed in BEAST v.1.10.4 [25] using the BEAGLE library to enhance computational speed [26]. The uncorrelated lognormally distributed relaxed clock model [27], a skygrid coalescent tree prior [28] and a continuous phylogeographic model that uses a relaxed random walk (RRW) to model the spatial diffusion of lineages were used. Contrary to a standard Brownian diffusion model that is based on a strong assumption of homogeneous dispersal velocity in all phylogenetic branches, the RRW model is a flexible model that specifically allows inference of heterogeneous, branch-specific dispersal velocities. Dispersal velocity variation among lineages was here represented using a Cauchy distribution [29,30]. The default settings were used for the diffusion rate. Uniform priors were used to allow date uncertainty for those sequences where exact sampling day was unknown, and a jitter added to tips with non-unique co-ordinates (window size of 0.2 decimal degrees).
Virus diffusion through time and space was summarised using 1000 phylogenies sampled at regular intervals from the posterior distribution (after exclusion of 10% burn-in). The R package "seraphim" [31] was used to extract the spatiotemporal information contained in these phylogenies, whose branches can be considered as movement vectors (each having start and end spatial coordinates, and start and end dates, in decimal units). The package was also used to estimate statistics of spatial dissemination, such as mean branch dispersal velocity, change in the maximal wavefront distance from epizootic origin, and an estimate of subsequent wavefront dispersal velocity [31]. The maximum wavefront distance is the distance between the epidemic origin (here approximated as the inferred tree root location) and the furthest location at which virus transmission has ever occurred. Wavefront velocity is the rate at which the maximum wavefront distance increases over time. In contrast, branch dispersal velocity captures the rate of movement of any lineage, regardless of whether or not it expands the epidemic wavefront. The spatiotemporal diffusion of YFV was visualised using the "spreadGraphic2" R function [31].
Our results reveal three main epizootic phases, including a phase of low reported NHP case counts in northern São Paulo state, followed by two phases where higher numbers of NHP cases were reported almost exclusively in the south the of state. Phase 1 runs from July 2016 to Jan 2017 (6%, 33/591 cases). Phase 2 runs from February to July 2017 (20%, 119/591 cases) and phase 3 from July 2017 to at least February 2018 (74%, 439/591 cases) (Fig 1A). These transition dates were chosen to capture the spatial shift between phase 1 and phase 2 (Fig 1B), and to occur centrally within the low case count period that occurs between phase 2 and 3 ( Fig  1A), but the exact dates adopted here are somewhat arbitrary. The peaks of epizootic phases 2 and 3 were mid-April 2017 and late November 2017, respectively. Proportionally fewer NHP carcasses were tested in phase 3 than in phase 1 or 2, and in January and February 2018 than in many earlier months (S1 Fig). It is therefore possible that phase 3 may have been of longer duration and proportionally greater magnitude than we report here.
NHP cases were geographically widespread across 57 municipalities of São Paulo to February 2018, with most reported cases concentrated around the southeast region of the state (Fig  2). The greatest number of confirmed cases were observed in the municipalities of Mairiporã (n = 88), Jundiaí (n = 68), São Paulo (n = 62), Bragança Paulista (n = 48), and Atibaia (n = 36). There is a clear distinction between the geographic location of cases in epizootic phase 1, and those in phases 2 and 3 (Fig 1B). Specifically, almost all cases that form part of phase 1 occur in a geographically distinct cluster of municipalities in the north of São Paulo state (Fig 1B;  Fig 2). On the other hand, nearly all cases of YFV detected after the first phase occurred in  (Fig 1B and  Fig 2).
To investigate the source and transmission of YFV during all three epizootic waves, and to characterise the genetic diversity of the virus circulating in NHP populations across São Paulo state, we generated whole YFV genomes for 46 RT-qPCR positive NHP samples (median Ctvalues of 14, range 9-25) from 20 municipalities using a previously described MinION sequencing protocol [8,18]. We also sequenced five whole genomes from human samples collected in four municipalities of São Paulo state (median Ct-values of 34, range 32-37). In total, this represents genomes from 51 samples collected across 23 municipalities between October 2016 and January 2018 (S3 Fig).
We first used maximum likelihood methods to estimate phylogenetic relationships amongst all 224 sequences available from SA1 in South America. Our genetic analyses indicate a strong phylogenetic spatial structure of the ongoing epizootic in São Paulo state, with 43 of 46 sequences from NHPs in São Paulo clustering in a single clade (bootstrap score = 98%), previously named as lineage YFV MG/SP [32]. This clade contains sequences sampled in São Paulo state, and an additional 8 sequences sampled in Minas Gerais that were all from within 100km of São Paulo state (Fig 3,  Sequences from 6 NHPs sampled during the first epizootic phase (October to December 2016) form two separate phylogenetic clades in the maximum likelihood tree, one of which appears basal to the 2016-2019 outbreak clade in southeast Brazil. Whilst bootstrap support for this structure is relatively poor (bootstrap score = 67%), its presence raises the possibility that two distinct YFV lineages may have co-circulated in northern São Paulo state during the first low-intensity epizootic phase in late 2016. Municipalities are shaded based on reporting and testing of NHPs. Circles depict locations of those cases from humans that were sequenced as part of this study, and red and orange triangles depict locations in São Paulo state from which NHP genome sequences were sequenced as part of this study. Eight samples from 7 municipalities in MG that form a clade with our sequences from São Paulo and that were sequenced by previous studies [8,43] are shown in purple triangles. Samples SA131 and Y37-244 are indicated. Shapefiles used to produce this maps are freely available from the Brazilian Institute of Geography and Statistics (IBGE) [42].  Nodes with >50% bootstrap support are coloured dark blue, and bootstrap scores are provided for highly supported nodes (> = 75%). Sequences generated in this study have black tip labels (n = 51). One clade (previously denoted lineage YFV MG/ES/RJ [32]) is collapsed for clarity. This clade contains three sequences that were sampled from humans in São Paulo, but for all three cases patient travel histories indicate that YFV was acquired elsewhere. Samples SA131 and Y37-244 are indicated with an asterisk and discussed in the Discussion. We used a continuous diffusion model to investigate the spatiotemporal spread of YFV. Our analyses were unable to accurately determine the route via which YFV was introduced to São Paulo state. Although the root of the MCC tree is placed in north-western MG (red dot ,  Fig 4), the spatial uncertainty in this root location is very high. In addition, posterior support for phylogenetic grouping of sequences from Goiás state with those from northern São Paulo is very weak (0.12), and as such this reconstructed lineage movement should be interpreted with caution. Whilst we were unable to identify the route of introduction to northern São Paulo, our analyses are better able to resolve subsequent viral dispersal trajectories from the north to the south of the state. Specifically, YFV appears to have disseminated from northern São Paulo into geographically neighbouring areas of western MG and into the south of the state.
Both those analyses including and discarding spatially-outlying sequences (see Fig 4 for locations of the discarded sequences) suggest that the mean branch velocity was~1 km/day (geographically restricted dataset: 0.83 km/day, 95% highest posterior density (HPD), 0.50-1.53, full dataset: 0.95 km/day, 95% HPD 0.53-2.42). The mean velocity did not vary significantly between the three epizootic phases observed in São Paulo (S3 Table). There is, however, heterogeneity in the dispersal velocity of different branches, such that most phylogenetic branches have lineage dispersal velocities that are somewhat lower than the mean, and a few are associated with rapid dispersal (Fig 5).
Considering all sequences, the wavefront velocity varies across time (Fig 6). However, rare distant outlier sequences can heavily inform the maximum wavefront distance, even if associated expansive virus lineages do no contribute substantially to the overall epizootic and only represent isolated cases in sampled outlier locations. Considering sequences in the spatially

PLOS PATHOGENS
Genomic surveillance of yellow fever in São Paulo during the 2016-2018 epizootic restricted area only (see Fig 4 for locations), the posterior intervals on the wavefront velocity are relatively wide and we therefore cannot exclude that the wavefront velocity of the majority of the sampled epizootic is homogeneous across time (Fig 6). In this dataset, the epidemic wavefront moved on average only 0.4 km/day (95% HPD, 0.30-0.53) during January 2015 to January 2017 (Fig 6).

Discussion
In this study we analyse data on the epizootic intensity and spatiotemporal distribution of YFV cases in NHPs in São Paulo state, Brazil, during 2016-2018. Our results demonstrate the existence of three distinct phases of the epizootic to early 2018. Specifically, we observe an

PLOS PATHOGENS
Genomic surveillance of yellow fever in São Paulo during the 2016-2018 epizootic initial phase in which a small number of cases were identified in the northern region of São Paulo state during late 2016. This period was followed by two larger epizootic phases; the first from February 2017 to July 2017, and the second from July 2017 to at least February 2018 (the date of the most recent data available for this study). During the second and third phases, almost all cases were observed in the south of São Paulo state. We generated 51 novel YFV genomes from samples collected in 23 different municipalities in São Paulo state. The majority of sequenced viruses were sampled from NHPs of the genus Alouatta from late 2016 to early 2018, and represent the first and third epizootic phases identified here (S3 Fig, Fig 1). Phylogeographic analyses of these data indicate that after being introduced to the north of São Paulo state, YFV lineages spread southwards and initiated epizootics in the south of the state. Lineages spread at a mean branch velocity of approximately 1km per day (Fig 4 and Fig 5), though there is substantial heterogeneity in branch dispersal velocity (Fig 5).
Previous analyses have hypothesised that the epizootic in the densely populated south of São Paulo originated in Minas Gerais [13], but those analyses were performed without any data originating from the north of São Paulo state. Here, we revise this hypothesis and suggest that the epizootic in the densely populated south of São Paulo (epizootic phase 2 and 3) may instead have originated in the north of the state. However, we note that few YFV-positive carcasses were identified in the centre-east of São Paulo (Fig 2) and genomes sampled from NHP in the north and south of São Paulo are connected by a single branch in the phylogenetic tree, with no genomes sampled in between. As such, it is impossible to identify the exact route or mode of YFV spread through the state. Several possibilities might explain the movement of YFV between northern and southern São Paulo: (i), YFV-positive NHPs were present but were not identified in a corridor in the eastern São Paulo region due to limited sampling (Fig 2),

PLOS PATHOGENS
Genomic surveillance of yellow fever in São Paulo during the 2016-2018 epizootic and it is therefore plausible that YFV will emerge again there in future. Understanding how and why the outbreak spread southwards more extensively from northern São Paulo during the 2016-2019 outbreak than in previous outbreaks could help inform future surveillance and control efforts.
Determining how YFV is introduced into highly urbanised areas, such as São Paulo city, is important for designing strategies that can effectively interrupt such introductions. One of our sequenced samples, SA131 (accession number MH193175), was collected from an Alouatta sp. individual at the Parque Estadual das Fontes do Ipiranga (PEFI) (Fig 2 and Fig 3). This park forms part of São Paulo city's zoo, and represents a 53 km 2 fragmented island of Atlantic forest within the urbanised city metropolis. Surprisingly, phylogenetic analyses show that the YFV genome recovered from sample SA131 clusters together with isolate Y37-244 (accession number MH030085) (bootstrap score for ML tree = 68%, 95% posterior support for MCC tree = 1), collected in Piracaia, 75 km from the park, and not with any of the isolates that were detected in the outskirts of São Paulo (Fig 2 and Fig 3). This result could be explained by: (i) incomplete sampling of a wave of continuous transmission among NHPs that live between the two locations; (ii) human-mediated transport of YFV infected NHPs [35], (iii) human-mediated transport of mosquitos, or (iv) introduction of the virus by an asymptomatic human visitor, who carried the virus from Piracaia to PEFI. Scenario (i) seems unlikely since PEFI is an isolated forested fragment with limited connectivity to other forested areas; at this stage, scenarios (ii), (iii) and (iv) all remain possible.
Using YFV genetic data, we estimate that virus lineages dispersed at a mean rate of~1 km per day during the outbreak in São Paulo (95% HPD, 0.50-1.53). We did not observe substantial differences in the mean lineage dispersal between epizootic phases (S3 Table), which might have indicated different impact of drivers in different epizootic phases or corresponding affected regions. However, our analyses demonstrated substantial heterogeneity between different phylogenetic branches (Fig 5). Most phylogenetic branches show low lineage velocity (<3 km/day) that are consistent with YFV spread by NHPs and mosquitos between contiguous forested patches. Far more rarely, we observe higher velocity lineage movements. Such movements could indicate human transport of infected NHPs or mosquitos, or long-distance movement of infected humans or mosquitos. We also cannot rule out that terminal phylogenetic branches with unusually high lineage dispersal velocity may have been caused by accidental attribution of incorrect metadata to samples, and efforts should be made to perform confirmatory sequencing of outlier samples and sequence YFV from other samples collected in their vicinity. Greater genomic sampling density is required to better investigate the drivers of such rapid YFV lineage movements.
Our mean estimate of the rate of YFV lineage movement is slightly slower than that those previously estimated for the states of Minas Gerais, Espírito Santo and Rio de Janeiro, and for the southeast region as a whole (~4km/day) [8,13]. Phylogenetic estimations of rate of spread often depend on sampling range and tend to be biased when rates are estimated for large geographic areas [36]. However, the relatively confined area of YFV circulation suggests that relaxed random walk models used here are a good approximation for viral diffusion in the southeast region of Brazil. Additional analyses of larger datasets are necessary to test whether spatial heterogeneity in NHP density, habitat fragmentation, or impact of human activities, may be responsible for the differences observed in the rate of YFV spread among different areas of southeast Brazil.
The NHP case data presented here (Fig 1 and Fig 2) suggest that the epizootic was larger in the southern parts of São Paulo than in the north. We hypothesise that this could be related to different NHP or vector species density or distribution in each area. Specifically, the southern part of São Paulo has large, connected forested areas that support a high diversity and density

PLOS PATHOGENS
Genomic surveillance of yellow fever in São Paulo during the 2016-2018 epizootic of NHPs [37,38]. In contrast, the north of São Paulo is covered by savannah and semi-deciduous forest [38], which may offer more restricted habitat availability for YFV vectors and amplifying NHP species.
Despite this, the magnitude of the YFV epizootic presented in Fig 1 and Fig 2 is based on epidemiological surveillance alone, and we caution that any bias in reporting or testing of NHP carcasses would also bias these results. Multiple municipalities reported small numbers of NHP carcasses that could not be tested, and/or tested relatively few carcasses in total. It is therefore possible that municipalities that did not report positive cases (Fig 1) nevertheless had local, undetected epizootics. YFV surveillance based on reporting of dead animals may have led to the appearance of a larger YFV epizootic in the south of São Paulo São Paulo, because animals from the Alouatta genus that are more likely to die of YFV infection than other genera [39] are more common in the south of São Paulo [37] than the north. The south of São Paulo is more densely human-populated than the north, so we also cannot exclude that dead NHPs would have been noticed and reported more frequently there. This could lead to the appearance that the outbreak was larger in the south. Reporting of NHP carcasses varies over time (S1 Fig). It is impossible to determine to what extent time-vary reporting of NHP carcasses was caused by consistent reporting of time-varying NHP deaths, or by changing frequency at which NHP deaths were reported by human observers. However, our observation that the epizootic in northern São Paulo was largely extinguished after phase 1 is unlikely to have been severely affected by such surveillance biases, as northern locations continue to test NHP carcasses after this time without detecting positive samples (Fig 1, S1 Fig). This finding is also consistent with the spatial distribution of YFV cases in humans [40]. Greater efforts to sample mosquito vectors or live NHP during epizootic periods would be important to determine whether reliance on reporting of NHP carcasses biases our understanding of YFV spatial distribution.
To facilitate sequencing, we preferentially studied samples showing low CTs during RT-qPCR. Our selected samples were therefore biased towards those obtained from animals with higher viremia, and hence likely towards animals with greater clinical disease and from the Alouatta genus. Further, all phylogeographic inference is conditioned upon sampling, such that we can only reconstruct lineage movement amongst those lineages that we sample. The potential sampling biases that we describe above may have resulted in major or minor differences between the actual viral spread and the inferred dispersal history of lineages [41]. Analysis of larger YFV genomic datasets incorporating more sequences from non-Alouatta genera (including humans with insubstantial travel history), animals with lower viremia, and animals from genomically unsampled locations is critical to determine whether this could have affected our phylogeographic reconstructions of YFV spread in southeast Brazil.
Our study sheds light on the spatial and temporal dynamics of YF in NHP hosts across São Paulo state. A better understanding of the vectors and host species involved in the persistence of the virus, denser genomic sampling of available positive samples, and targeted investigation into phylogenetic branches associated with long-distance or rapid movements, will be important to understand the drivers of YFV transmission and anticipate future outbreaks.

PLOS PATHOGENS
Genomic surveillance of yellow fever in São Paulo during the 2016-2018 epizootic phase 2 and 3, and is therefore not central in São Paulo. Colours indicate results of testing. Darker shades of each colour in each bar represent results in testing in any municipality that detected positive NHPs during any month, and paler shades represent cases from those municipalities that never detected positive cases during the displayed period. Shapefiles used to produce this map are freely available from the Brazilian Institute of Geography and Statistics (IBGE) [42].