Genetic Variability of West Nile Virus in U.S. Blood Donors from the 2012 Epidemic Season

West Nile virus (WNV) is an arbovirus maintained in nature in a bird-mosquito enzootic cycle which can also infect other vertebrates including humans. WNV is now endemic in the United States (U.S.), causing yearly outbreaks that have resulted in an estimated total of 4–5 million human infections. Over 41,700 cases of West Nile disease, including 18,810 neuroinvasive cases and 1,765 deaths, were reported to the CDC between 1999 and 2014. In 2012, the second largest West Nile outbreak in the U.S. was reported, which caused 5,674 cases and 286 deaths. WNV continues to evolve, and three major WNV lineage I genotypes (NY99, WN02, and SW/WN03) have been described in the U.S. since introduction of the virus in 1999. We report here the WNV sequences obtained from 19 human samples acquired during the 2012 U.S. outbreak and our examination of the evolutionary dynamics in WNV isolates sequenced from 1999–2012. Maximum-likelihood and Bayesian methods were used to perform the phylogenetic analyses. Selection pressure analyses were performed with the HyPhy package using the Datamonkey web-server. Using different codon-based and branch-site selection models, we detected a number of codons subjected to positive pressure in WNV genes. Thirteen of the 19 completely sequenced isolates from 10 U.S. states were genetically similar, sharing up to 55 nucleotide mutations and 4 amino acid substitutions when compared with the prototype isolate WN-NY99. Overall, these analyses showed that following a brief contraction in 2008–2009, WNV genetic divergence in the U.S. continued to increase in 2012, and that closely related variants were found across a broad geographic range of the U.S., coincident with the second-largest WNV outbreak in U.S. history.


Introduction
West Nile virus (WNV) emerged in the United States in 1999 and has become endemic, having caused annual outbreaks each subsequent year. WNV is a Flavivirus maintained in nature in an enzootic cycle between birds and mosquitoes. Other vertebrate hosts may be infected and develop disease, as occurs with humans and horses, which are considered dead-end hosts since they do not develop sufficient viremia to re-infect mosquitoes [1,2]. Transmission may also occur between humans via blood transfusion and transplantation of organs from infected individuals [3,4]. Since 2003, donated blood has been routinely screened for WNV by nucleic acid testing (NAT), and thousands of transmissions have been prevented [5]. Approximately 80% of humans infected with WNV develop no symptoms. Symptoms of WNV infections may vary from fever, rash and flu-like symptoms to severe neurological disease, which develops in less than 1% of cases and can result in death [6][7][8]. According to the U.S. Centers for Disease Control and Prevention (CDC), WNV poses an ongoing public health threat, having infected millions of people and caused 1,765 deaths in the U.S. through the end of 2014 [9].
WNV is the most widely geographically distributed Flavivirus in the world, present on every continent except Antarctica. WNV infection had been observed in Africa, Asia, Australia/Oceania, and southern Europe prior to 1999. In 1999, the first cases of WNV in the Americas were observed in the U.S. in New York City, and the virus has since spread westward across the 48 contiguous states and Canada, and southward into Mexico, the Caribbean islands, Central America and South America, where it has caused human disease as far south as Argentina [10][11][12].
In the U.S., WNV causes annual outbreaks of varying size and severity. Peaks of WNV activity have been observed in 2002, 2003, 2006 and 2012. Reduced WNV activity was observed from 2008-2011 compared to 2002-2007 [9]. Following this period of relatively low activity, a large outbreak of WNV disease occurred in the 48 contiguous states in 2012, with 5,674 reported cases including 2,873 neuroinvasive cases and 286 deaths, the largest numbers reported to ArboNET for any year since 2003 [9]. WNV disease cases peaked in late August 2012, with 5,199 (92%) cases having onset of illness during July-September. The incidence of WNV neuroinvasive disease increased in 2012 to 0.92 per 100,000. More than half of the neuroinvasive disease cases in 2012 were reported from four states: Texas (n = 844), California (n = 297), Illinois (n = 187), and Louisiana (n = 155) [9,14]. There are an estimated 30-70 non-neuroinvasive disease cases for every reported case of WNV neuroinvasive disease [6,8,13]. Therefore, an estimated 86,000-200,000 non-neuroinvasive disease cases might have occurred in 2012, but only 2,801 were diagnosed and reported [14]. The reason for the increased incidence of WNV disease in 2012 is unknown and may involve multiple environmental and ecological factors as well as selection and dissemination of genetically best-fitted viruses.
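The estimated range of non-neuroinvasive cases follows from multiplying the reported neuroinvasive case count by the 30-70 ratio. The short Python sketch below reproduces that arithmetic; the function name is illustrative, and it assumes the ratio is applied directly to the 2,873 reported neuroinvasive cases with no further adjustment.

```python
def estimate_non_neuroinvasive(neuroinvasive_cases, ratio_low=30, ratio_high=70):
    """Return (low, high) estimates of non-neuroinvasive disease cases.

    Assumes each reported neuroinvasive case corresponds to between
    ratio_low and ratio_high non-neuroinvasive cases.
    """
    return neuroinvasive_cases * ratio_low, neuroinvasive_cases * ratio_high


# Figures quoted in the text for the 2012 season.
low, high = estimate_non_neuroinvasive(2873)
print(low, high)  # 86190 201110

reported = 2801  # non-neuroinvasive cases actually diagnosed and reported
print(f"{reported / high:.1%} to {reported / low:.1%} of estimated cases were reported")
```

Under these assumptions, only a few percent of the estimated non-neuroinvasive cases appear in the surveillance counts.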
The spread of WNV in the Americas has offered a unique opportunity to observe evolution and genetic adaptation occurring in an arbovirus introduced to a new environment. The prototype strain from the 1999 New York outbreaks became known as the NY99 genotype, and is believed to share a common genetic origin with a 1998 Israeli isolate IS-98 [11,15]. In 2002, a new WNV genotype, WN02, appeared and was characterized by one amino acid substitution, E-V 159 A, and 13 conserved nucleotide mutations [16,17]. WN02 was found to be more efficiently transmitted by New World mosquitoes than NY99, and eventually completely replaced NY99 [18]. This genetic shift coincided in time with large U.S. outbreaks in 2002-2003 and may have contributed to WNV's spread across North America. Even with the genetic changes observed as WNV spread through North America, genetic variability of human isolates remained relatively low, increasing from 0.18% in 2002 to 0.37% in 2005 [19]. A second new genotype termed SW/WN03, defined by two additional fixed amino acid substitutions, NS4A-A 85 T and NS5-K 314 R, was first observed in isolates collected in 2003. WN02 and SW/WN03 genotypes displaced the ancestor NY99 genotype in the U.S. [20].
High WNV activity in the U.S. continued through 2006 and 2007, and during this period, further genetic diversification of WNV strains was observed. A new well-defined viral cluster occurring within genotype SW/WN03, termed MW/WN06, was observed in strains collected from blood donors in the Midwestern and Northwestern U.S. in 2006 and 2007 [21]. The number of genetic mutations appearing in U.S. WNV strains continued to increase over this period, but the number of conserved mutations decreased slightly. Some nucleotide mutations which were previously believed to have been fixed in WNV isolates occurring after 2003 appeared to revert to the NY99 sequence, but other mutations associated with the WN02 genotype remained fixed [21]. The increased virulence of the WN02 genotype in mosquitoes is believed to have facilitated westward spread in 2002-2003 with a dramatic increase in infections, causing the largest WNV outbreak ever recognized worldwide and the largest viral encephalitis outbreak ever recognized in North America. This spread highlighted the need to monitor mutations occurring in the WNV genome and the genetic relationships of viral isolates causing disease in the U.S. over time [10-12, 17, 21].
Here we report results obtained from sequencing and phylogenetic analysis of 19 human WNV isolates from 13 U.S. states: Arizona (AZ), California (CA), Georgia (GA), Illinois (IL), Louisiana (LA), Nebraska (NE), New Mexico (NM), North Dakota (ND), Mississippi (MS), Ohio (OH), South Dakota (SD), Texas (TX), and Wyoming (WY), from blood donations collected during the 2012 epidemic season. Thirteen of the 19 completely sequenced isolates from 10 U.S. states (ND, SD, WY, TX, MS, GA, NM, OH, NE, IL) were genetically similar, sharing up to 55 nucleotide mutations and 4 amino acid substitutions when compared with WN-NY99 (GenBank accession number AF196835). Phylogenetically, these 13 isolates clustered together with previously published 2012 isolates from TX [22,23] and some 2012 isolates from CO published in GenBank suggesting that this genetic variant was widely geographically distributed in 2012. Isolates from AZ and CA were different from these genetic variants and phylogenetically clustered within local clades.

Ethics statement
All human specimens used in this study were obtained from blood donors who signed the blood center's Institutional Review Board (IRB) approved informed consent. These specimens were anonymized (unlinked) before shipment. Use of these unlinked specimens has been approved as exempt by the U.S. Food and Drug Administration (FDA) IRB (Human Subjects Research-Exempt RIHSC Protocol #127B).

Samples
The study included 19 isolates obtained after cultivation of residual blood specimens from blood donors who tested reactive for WNV RNA by FDA-approved commercial nucleic acid test assays used to screen blood donations. These 19 samples were representative of 13 states of the U.S.: AZ, CA, GA, IL, LA, NE, NM, ND, MS, OH, SD, TX, and WY (Table 1).

Virus isolation, RNA extraction and Reverse Transcription-Polymerase Chain Reaction (RT-PCR)
A single passage in Vero cells (ATCC # CCL-81) was performed for virus isolation from each specimen as previously described by Grinev et al. [19]; cell culture supernatants were harvested within 7 days and used for viral RNA extraction by the QIAamp Viral Mini RNA extraction kit (Qiagen, Valencia, CA) according to the manufacturer's protocol. Reverse transcription reactions, PCR amplification and purification of amplicons were performed as described earlier [19].

DNA sequencing, assembly and alignment
Amplicons covering an entire WNV genome of each studied isolate were subjected to Sanger sequencing using the amplification primers and additional internal sequencing primers. Sequencing reactions were performed as described before [19]. Sequencing data were assembled and analyzed using the Vector NTI Advance 11.5 software package (Invitrogen). Nucleotide (nt) and deduced amino acid (aa) sequences from studied isolates were aligned using the Align X program and compared to the genomic sequence of the parental WNV isolate WN-NY99 (AF196835). Nucleotide sequences reported in this paper were deposited into the GenBank database and accession numbers are shown in Table 1 (KM012170-KM012188).

Phylogenetic analysis
For Maximum-likelihood phylogeny we used MEGA 6 [24]. The Maximum-likelihood method employing the General Time Reversible (GTR) + Γ + I model was used to produce phylogenetic trees. This model was determined using the selection tool available in MEGA 6. The parental strain WN-NY99 (AF196835) was used to root the trees. The 19 newly sequenced WNV strains from this study ( Table 1)

Selection pressure analysis
A selection analysis of ORFs of WNV strains isolated in 2012 (n = 77, S1 Table) was performed using the Datamonkey web-server (www.datamonkey.org). In addition to the Single-likelihood ancestor counting (SLAC), Internal Fixed effects likelihood (IFEL), Fixed effects likelihood (FEL), Random Effects likelihood (REL), Mixed Effects Model of Evolution (MEME), Fast, Unconstrained Bayesian Approximation for inferring selection (FUBAR) methods, and Evolutionary fingerprint, we also employed the Conant-Stadler Property Informed Models of Evolution (PRIME) method. We have used the PRIME method to study site-specific aa properties (e.g. chemical composition, charge, polarity) which are being conserved or altered by the evolutionary process. Because of Datamonkey server restrictions, the REL method was only used to evaluate 74 sequences, which was the largest dataset that could be successfully analyzed (KJ501432, KJ501434 and KJ501437 were excluded randomly).

Time-scale analysis
A Bayesian skyline plot (BSP) was used to estimate the viral effective population size through time. Evolutionary rates for the WNV ORF sequences (n = 870) were calculated using the Bayesian Markov-chain Monte Carlo (MCMC) approach employed by BEAST ver. 1.8.1 [25] and the BEAGLE library [26]. The dataset was analyzed using the TN93+Γ 4 substitution model and the non-parametric Bayesian Skyline plot model, under relaxed uncorrelated lognormal (UCLN) molecular clocks as described elsewhere [21]. Four independent MCMC chains were run on a Tesla K20 computing processor until convergence to the stationary distribution was achieved (~500-600 million states with sampling frequency of 50,000). Posterior distributions were examined in Tracer v1.6 [27] to ensure adequate mixing and convergence. All chains were combined in LogCombiner with a burn-in value set to 30% of generations. The maximum clade credibility tree (MCC) and BSP (after resampling to 100,000) were generated. The MCC tree was visualized using FigTree v1.4.2 [27].
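The chain length, sampling frequency and burn-in quoted above determine how many posterior samples remain after the chains are combined. The Python sketch below works through that bookkeeping. It is an illustration only: the function name is hypothetical, the initial state of each chain is ignored, and it does not reproduce how BEAST or LogCombiner count samples internally.

```python
def retained_samples(chain_length, sampling_every, burn_in_percent, n_chains):
    """Posterior samples kept after discarding burn-in from each chain.

    Integer arithmetic is used throughout to avoid floating-point rounding.
    """
    samples_per_chain = chain_length // sampling_every
    burn_in = samples_per_chain * burn_in_percent // 100
    return (samples_per_chain - burn_in) * n_chains


# Lower and upper chain lengths quoted in the text (~500-600 million states),
# sampled every 50,000 states, 30% burn-in, four independent chains.
for states in (500_000_000, 600_000_000):
    kept = retained_samples(states, 50_000, 30, 4)
    print(f"{states:,} states per chain -> {kept:,} combined samples")
```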

Nucleotide changes and amino acid substitutions
Complete genomic sequences from 19 studied isolates from the 2012 epidemic were compared to the prototype strain WN-NY99 (AF196835). Most mutations (~89%) were silent transitions (U↔C, A↔G). The total number of nt mutations ranged from 54 to 83. Shared nucleotide mutations identified in the studied WNV isolates are shown in Table 2. All 19 WNV 2012 isolates from this study shared 7 nt mutations (T 1442 C, C 2466 T, A 4146 G, C 4803 T, C 6426 T, C 6996 T and A 10851 G). Four mutations were shared by 18 of the 19 isolates: T 7938 C and T 8811 C (excepting BSL140); T 7015 C (excepting BSL178); and C 9352 T (excepting BSL85). In addition, seventeen isolates (all except BSL53 and BSL178) shared the transition C 6138 T. Thirteen of the 19 completely sequenced isolates from 10 U.S. states (ND, SD, WY, TX, MS, GA, NM, OH, NE, IL) shared more than 50 nucleotide mutations when compared with prototype strain WN-NY99 (Table 2 and S2 Table). Among 2012 WNV isolates from this study, the number of deduced aa substitutions ranged from 4 to 13 when compared to WN-NY99, most of which are conservative changes. The transition T 1442 C is a non-silent mutation leading to the aa substitution E-V 449 A (V 159 A, in the Envelope protein numeration). This substitution is common to all WNV isolates collected in the U.S. since 2003, and is therefore fixed in all strains of the WN02 and SW/WN03 genotypes. In addition to the aa substitution E-V 449 A, six WNV isolates shared NS2A-V 1201 I and 12 isolates shared the substitution NS2A-R 1331 K. Thirteen isolates reported here shared the substitution NS4B-I 2513 M (Table 3 and S3 Table). Analysis of the nt variation in the ORFs of the North American WNV dataset (n = 870, S1 Table) reveals increased evolutionary divergence from year to year (Fig 1). The estimated transition/transversion bias is 10.44 and the majority of the nt changes are transitions, with relative rates of 25.3 for U↔C and 7.2 for A↔G.
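The distinction between transitions and transversions used above can be made concrete with a small classifier. The Python sketch below is illustrative and is not the tool used in this study: it parses mutations written in the notation of this section (reference base, genome position, new base) and applies it to the seven mutations shared by all 19 isolates. The function name and parsing pattern are assumptions.

```python
import re
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(mutation):
    """Classify a mutation such as 'T 1442 C' as a transition or transversion."""
    match = re.fullmatch(r"\s*([ACGTU])\s*(\d+)\s*([ACGTU])\s*", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    ref, _position, alt = match.groups()
    # Treat RNA notation (U) and cDNA notation (T) as the same base.
    ref, alt = ref.replace("U", "T"), alt.replace("U", "T")
    if ref == alt:
        raise ValueError(f"reference and new base are identical: {mutation!r}")
    # A transition stays within purines or within pyrimidines.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# The seven nt mutations shared by all 19 isolates, as listed in the text.
shared = ["T 1442 C", "C 2466 T", "A 4146 G", "C 4803 T",
          "C 6426 T", "C 6996 T", "A 10851 G"]
counts = Counter(classify(m) for m in shared)
print(counts)
```

All seven shared mutations are transitions, in line with the strong transition bias reported for the full dataset.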

Phylogenetic analysis
Phylogenetic analysis was performed using the Maximum-likelihood method. In addition to the 19 WNV ORFs sequenced in this study, the North American WNV ORF sequences available from the GenBank database, as of September 2015, were included in the dataset (n = 870, S1 Table). We have analyzed the phylogeny of these sequences and identified, as expected, the presence of the common clades representing the North American WNV genotypes NY99, WN02 and SW/WN03, previously described in the course of study of WNV evolution in North America [11,16-39] (Fig 2 and S1 Fig). The 2012 WNV human isolates from this study are located within six nodes termed here "Node 1" to "Node 6". Node-specific aa substitutions and geographical origin of isolates are shown in Fig 2 and S4 Table. We have observed that all WNV isolates reported here except BSL53 (KM012172), which is clustered in Node 6 within the SW/WN03 genotype, belong to the WN02 genotype. All studied isolates carried the common North American WNV aa substitution E-V 159 A which is fixed in the WN02 and SW/WN03 genotypes and present in all WNV strains collected in the U.S. since 2003.
Analysis of the entire ORF of WNV isolates circulating in the U.S. has shown that two isolates from AZ, BSL05 (KM012170) and BSL80 (KM012174), clustered together with previously published isolates from AZ in Node 2, and an isolate from CA, BSL85 (KM012175), clustered with other isolates from CA in Node 5. We found that the WNV isolate BSL178 from LA (KM012181) was associated with Node 3, which mainly consisted of previously published WNV strains collected from TX in 2012 [22,23] and two 2012 isolates from CO. WNV strains presented in Node 3 shared up to nine aa substitutions. Surprisingly, many of the published 2012 isolates (n = 27) were clustered within Node 4 together with 13 genetically related WNV isolates from this study collected from 10 states: ND, SD, WY, TX, MS, GA, NM, OH, NE and IL. All WNV strains from Node 4, except KM012188 and KJ501532, shared the NS2A-R 188 K aa substitution in addition to the common E-V 159 A. Other node-specific aa substitutions are shown in S4 Table.
Table 2 (rows not recoverable from the source). All nt mutations are shown in S2 Table.

Selection pressure analysis
Fig 2 caption (partial). Node-specific amino acid substitutions are shown for each node (see also S4 Table). For each node, states shown in red in the adjacent U.S. map are those from which strains have been isolated.
Using different codon-based and branch-site approaches, we detected a number of codons subjected to positive pressure in WNV strains collected in 2012 (n = 74 for REL, n = 77 for all other methods). Analysis was done using the Datamonkey web-server (www.datamonkey.org). We found that eight codons (379, 1083, 1195, 1238, 1494, 2288, 2389 and 2842) were detected as positively selected by at least two methods. Site 2842, corresponding to the NS5-K 314 R aa substitution, was the only site identified as positively selected by all methods (Table 4). Eleven node-specific aa substitutions identified in the phylogenetic analysis and detected as positively selected by at least one method are shown in Table 5. We performed evolutionary fingerprint analysis, which models site-to-site variation in selection pressure across the ORF, for WNV isolates from 2012 (n = 77) (Fig 3). The colored pixels on this plot show the density of the posterior sample of the distribution for a given rate and ellipses reflect a Gaussian-approximated variance in each individual rate estimate. Points above the diagonal line corresponded to positive selection (ω>1), and points below the diagonal line corresponded to negative selection (ω<1). Most of the points are concentrated below the diagonal line which represents the idealized neutral evolution scenario (ω = 1). The results suggest that WNV strains collected in 2012 were subjected to strong negative (purifying) selection. In addition, we conducted a supplementary selection pressure analysis using PRIME to detect whether selection for amino acids with differing chemical properties is occurring within the 2012 dataset (n = 77). Conant-Stadler PRIME analysis allows the non-synonymous substitution rate β to depend not only on the site in question (like FEL and MEME), but also on which residues are being exchanged. Substitution rate analysis identified a single rate class, which suggests that across the 2012 WNV isolates, the rate of substitution between each residue was similar and no particular substitution was favored. PRIME analysis detected an overall substitution rate of 0.05 substitutions/codon site. One site, codon 2842, was negatively selected for volume and positively selected for changes in chemical composition. This codon, corresponding to the NS5-K 314 R aa substitution, was identified as positively selected by all methods used for study of selection pressure.
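For reference, the ratio ω used to interpret the evolutionary fingerprint can be written out explicitly. The expression below is the standard definition, not a formula taken from this study. The symbol α for the synonymous substitution rate is an assumption here, since the text names only the non-synonymous rate β. The diagonal of the fingerprint plot corresponds to β = α.

```latex
\omega = \frac{\beta}{\alpha}, \qquad
\begin{cases}
\omega > 1 & \text{positive (diversifying) selection} \\
\omega = 1 & \text{neutral evolution} \\
\omega < 1 & \text{negative (purifying) selection}
\end{cases}
```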

Time-scale analysis
The time-scale analysis was performed using the North American WNV dataset (n = 870, S1 Table) and the non-parametric Bayesian Skyline plot (BSP) model available in BEAST 1.8.1.
Previously we found that the BSP with the relaxed molecular clock (UCLN) was the best-fitted model [21]. The maximum clade credibility tree (MCC) was selected and the age for each node containing studied WNV isolates is shown on Fig 4A and

Discussion
WNV is now the most widespread and common cause of viral encephalitis in the U.S. and worldwide [11,12]. After six years of relatively low WNV incidence in the U.S., a large outbreak was observed in 2012 causing 5,674 total disease cases and 286 deaths, the largest number of deaths ever reported [9]. In this study we investigated the genetic variability of 19 WNV strains isolated from human samples collected in 2012 from 13 states of the U.S. (Table 1). Although humans are considered dead-end hosts for WNV, and are therefore not important for the WNV lifecycle, human isolates represent circulating viruses. Studying human WNV isolates is also important for public health and for the safety of the blood supply.
Multiple factors were potentially involved in the magnitude of the 2012 outbreak. In addition to ecological and environmental factors that have been shown to increase viral transmission [40,41], viral genetics and selection of new best-fitted variants may play a significant role in WNV outbreaks. Viral adaptation to domestic mosquitoes and birds has played a major role in the spread of WNV in the U.S. since its introduction in 1999. WNV has continued to evolve, as illustrated through the displacement of the ancestor genotype NY99 by the new genotype WN02 in 2002, followed by the appearance and co-circulation of genotype SW/WN03 in 2003 and subtype MW/WN06 in 2006 [11,16-39]. Analysis of nucleotide divergence of newly sequenced isolates from this study together with published North American WNV strains (n = 870) demonstrates increasing evolutionary divergence from year to year (Fig 1).
Previous phylogenetic analysis of WNV isolates showed that, with limited exceptions, WNV isolates from circulating genotypes in the U.S. were poorly differentiated spatially and temporally [21]. It has been postulated that WNV genetic variations in the U.S. have occurred in some geographic areas which function as distinct niches of evolution. In these areas, the genetic variant accumulates genetic changes while adapting to the local ecological conditions, hosts and vectors, and may either stay in that area or be disseminated to other regions by migrating birds [42]. We observed that isolate BSL178 from LA was grouped in Node 3 together with WNV strains collected from TX in 2012 [22,23] and two 2012 isolates from CO. Thirteen other genetically similar human isolates from samples collected in 10 U.S. states for this study clustered with 2012 mosquito and bird isolates from TX [22,23] and CO in Node 4 (Fig 2). Nodes 3 and 4 are good examples of strong temporal phylogenetic structures constituted by well temporally differentiated isolates, and they were composed predominantly of isolates collected in the 2012 epidemic season. In contrast, isolates from AZ and CA clustered within local Nodes 2 and 5. These nodes are good examples of strong spatial phylogenetic structures, which are supported by high bootstrapping values. The finding of similar isolates across a broad geographic area in 2012 suggests that closely related genetic variants of WNV represented in Node 4 spread over the Atlantic, Mississippi and Central bird flyways, but not the Pacific, and were identified coincident with the largest U.S. WNV outbreak since 2003. In CA and AZ, both of which are located on the Pacific bird flyway, specimens clustered with local circulating clades, suggesting predominantly local scale evolution in this area [21,39,43].
Fig 4 caption (part A, partial). Nodes 1 to 6 containing WNV isolates from this study are highlighted in green and shown in detail. The mean time to the most recent common ancestor (tMRCA) is shown in each principal node. The 95% highest probability densities (95% HPD) for each node age are shown as blue bars.
Previous studies of 2012 U.S. isolates have suggested that viral genetic composition was not a determinant of outbreak intensity at the local level. Duggal et al. noted that the genetic composition of viruses circulating in Texas in 2012 was similar between isolates from a county that experienced a large outbreak (Dallas County) and a county that did not (Montgomery County) [22]. Our data support this conclusion on a broader geographic basis, because WNV isolates from Nodes 3 and 4 circulated alongside isolates that were similar to those that circulated in 2008-2011, and high numbers of disease cases occurred in areas where isolates from these Nodes were not detected at all, such as CA. Rather, increased replication in a favorable environment may have provided opportunity for genetically related co-existing strains to circulate and spread over migratory bird flyways, as has been reported on a local scale in TX and AZ [22,23,39].
The degree of genetic diversity and fitness of a viral population reflects a balance between positive or negative selection and genetic drift, the accumulation of random neutral mutations [44]. Previous studies have shown a low level of positive selection in WNV isolates from the U.S. [21,34,36], suggesting that most aa changes were the result of genetic drift. In our study of WNV isolates from 2012, selection pressure analysis revealed only one site that was positively selected by all employed methods, codon 2842 (NS5 314 ). This site has been previously identified as subject to positive selection in other studies of North American WNV sequences [21,22,37]. We found that this site is associated with nodes 2, 4 and 6 (Tables 4, 5 and S4) and aa substitution NS5-K 314 R is involved in the emergence of the SW/WN03 genotype [20,21]. Site 1195 in NS2A was detected as positively selected by four methods. This site is associated with Node 3 aa substitution NS2A-T 52 I. Overall for the 2012 isolates, three aa substitutions in Node 3 and five in Node 4 were identified as positively selected by at least one method. These aa substitutions could potentially impact viral fitness and virulence, and the biological significance of those changes in viral proteins warrants further investigation. In general, our results are consistent with previous studies which have demonstrated that only limited positive selection is acting on the population of WNV circulating in the U.S., and purifying selection is predominant [21,22].
Fig 4 caption (part B). Bayesian coalescent inference of genetic diversity and population dynamics using the Bayesian Skyline plot. The X axis represents years of study and the Y axis, the relative genetic diversity product of the effective population size.
In previous studies, results of time-scale analysis were only reported for select genes of WNV or reduced datasets [17,21,34,35,45]. In this study we performed comprehensive time-scale analysis using 870 full-length ORFs of WNV strains isolated in the U.S. in 1999-2012 (Fig 4A). We found that the time to most recent common ancestor (tMRCA) for the whole dataset (n = 870) was 14.78 years (95% HPD = 13.87-15.49 years), which is consistent with the value of 15.57 years (95% HPD = 14.23-16.98 years) previously reported in the study of human isolates (n = 62) when strain IS-98 (AF481864) from 1998 was used to root the tree [21]. We calculated the mean nucleotide substitution rate (MNSR), using the BSP model with the relaxed molecular clock, to be 6.81 × 10⁻⁴ substitutions/site/year (s/s/y), which also correlates with published data [21,36]. Analysis of the BSP (Fig 4B) shows that genetic divergence had continued to slowly increase through 2012 following a brief period of contraction in 2008-2009, which also agrees with data published by us and others [21,36,45].
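To give a sense of scale for the substitution rate, it can be converted into an expected number of substitutions per ORF. The Python sketch below is a back-of-the-envelope conversion, not a reanalysis. The ORF length of 10,302 nt is an assumption that is not stated in this section, and the calculation treats the mean rate as constant along a single lineage, which a relaxed clock does not require.

```python
MEAN_RATE = 6.81e-4      # substitutions/site/year, as reported in the text
ORF_LENGTH_NT = 10_302   # ASSUMPTION: approximate WNV ORF length, not given in the text
YEARS = 2012 - 1999      # span between introduction and the 2012 season


def expected_substitutions(rate, sites, years):
    """Expected substitutions along one lineage under a constant rate."""
    return rate * sites * years


per_year = expected_substitutions(MEAN_RATE, ORF_LENGTH_NT, 1)
over_span = expected_substitutions(MEAN_RATE, ORF_LENGTH_NT, YEARS)
print(f"{per_year:.1f} substitutions per ORF per year")
print(f"{over_span:.0f} substitutions per ORF over {YEARS} years")
```

Under these assumptions the result is of the same order of magnitude as the 54 to 83 nt mutations observed relative to WN-NY99 in the 2012 isolates, although the two quantities are not strictly comparable.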
Overall, our findings in this study suggest that the patterns of WNV genetic evolution in the U.S. following the 2012 outbreak remained consistent with previous trends. Additionally, our observation of the broad geographic distribution of genetically similar isolates suggests that these WNV variants may have spread via migratory birds, and were detected coincident with the largest WNV outbreak since 2003. The emergence of this genetic variant may potentially mark the beginning of a new genetic shift and spread of a new WNV genotype after 10 years of steady drift.
Supporting Information
S1 Fig. WNV genotypes are color-coded in the branches of the tree as NY99 (black), WN02 (blue), SW/WN03 (purple) and cluster MW/WN06 (red). Nodes 1 to 6 containing WNV isolates from this study are highlighted in green. The mean time to the most recent common ancestor (tMRCA) is shown in each principal node. The 95% highest probability densities (95% HPD) for each node age are shown as blue bars. (TIFF)