Molecular Epidemiology of Influenza A/H3N2 Viruses Circulating in Mexico from 2003 to 2012

In this work, nineteen influenza A/H3N2 viruses isolated in Mexico between 2003 and 2012 were studied. Our findings show that different human A/H3N2 viral lineages co-circulate within the same season and can also persist locally between influenza seasons, increasing the chance of genetic reassortment events. A novel minor cluster, named here Korea, that circulated worldwide during 2003 was also identified. Frequently, phylogenetic characterization did not correlate with the determined antigenic identity, supporting the need to use molecular evolutionary tools in addition to antigenic data for the surveillance and characterization of viral diversity during each flu season. This work represents the first long-term molecular epidemiology study of influenza A/H3N2 viruses in Mexico based on complete genomic sequences and contributes to the monitoring of evolutionary trends of A/H3N2 influenza viruses within North and Central America.


Introduction
Influenza A virus is one of the most important pathogens of humans, responsible for 250,000 to 500,000 deaths annually and potentially millions during major pandemics [1,2]. These single-stranded, negative-sense RNA viruses of the family Orthomyxoviridae cause regular seasonal epidemics and occasional global pandemics [3]. Influenza viruses are antigenically variable pathogens with a high evolutionary rate that gives them the capacity to evade the immune system and keep circulating in human populations. The two main evolutionary mechanisms that allow influenza viruses to constantly evolve and re-infect their hosts are antigenic drift and antigenic shift [3]. Antigenic drift occurs as a result of the progressive accumulation of mutations that become fixed in the viral genome. Such mutations can confer minor changes in the viral proteins that may be advantageous for viral fitness, including the capacity to escape immune system recognition. During antigenic shift, an influenza A virus strain may acquire the HA segment, and possibly the NA segment as well, from an influenza virus of a different subtype, resulting in a viral strain with novel antigenic proteins [4].
During a given flu season multiple viral lineages may be introduced into a discrete population, and genetic reassortment may occur among co-circulating viruses, giving rise to novel viral strains with gene segments from different origins [5,6]. Moreover, whole influenza viruses of a different animal origin (usually avian or swine) can be introduced into an immunologically naive population [3]. While it has long been known that influenza viruses circulate globally, the existence, location and determinants of a common 'source' population from which genetic and antigenic variants might emerge remain a topic of great debate, as does the extent of viral persistence between epidemic seasons within individual localities [7,8,9,10,11].
Phylogenetic and antigenic characterization of circulating strains has identified distinct A/H3N2 viral lineages that have circulated in humans since 1968 [12,13]. Influenza strains for the yearly seasonal vaccine are chosen based on antigenic characterization and prevalence studies of the viruses circulating prior to the start of each influenza season [14]. A schematic representation of the main A/H3N2 antigenic groups circulating globally after the year 2000 is shown in Fig. S1 [15,16,17,18,19,20,21,22,23,24,25,26,27,28]. [18,19,20,21]. Previous observations suggest that the Fujian-like viruses arose from a reassortment event between viruses closely related to the earlier A/Panama/2007/99-like strains and the later circulating A/California/07/2004 viruses [29]. In 2006, a group of viruses with an antigenically A/Wisconsin/67/2005-like HA and an M2 protein bearing the S31N mutation, which confers amantadine resistance, started circulating worldwide, displacing the California-like viruses, and dominated until early 2008 (2007; 2008). This group of viruses, also named the N-lineage, again originated by a reassortment event, which most likely occurred in early 2005, generating a new lineage of adamantane-resistant A/H3N2 viruses [30,31]. The emergence of the N-lineage marked a turning point in the rise of amantadine-resistant virus prevalence, as the global spread and dominance of the N-lineage during the 2006 season contributed to the fixation of this mutation in the strains circulating in the following years [31]. In mid-2007, the divergent A/Brisbane/10/2007-like viruses started circulating worldwide and continued until mid-2009, followed by the A/Perth/16/2009-like viruses that dominated through early 2012 [23,24,25,26,27]. Later in 2012, the A/Victoria/361/2011-like viruses arose and circulated through 2013 [28].
In a global context, it has been proposed that A/H3N2 viruses that circulate in South-East Asia may function as a key source population for other locations, including the Americas, Europe, and Africa, where the virus typically dies out at the end of each flu season (sink-source model) [10]. However, it has been shown that viruses from outside Asia may also contribute to A/H3N2 evolution [8]. H3N2 influenza seasonality in Mexico typically follows the winter influenza season of the northern hemisphere temperate regions, spanning October through the end of April, with an observable phase shift of one or two months [32]. During some seasons, active influenza transmission can be noted in tropical and subtropical areas of Mexico in mid to late July, and may continue throughout the rest of the season and overlap with the start of the winter season of the temperate areas during late November, with the season being essentially over by January [32].
Despite Mexico's role during the 2009 H1N1 influenza pandemic, there is little information on the evolution of seasonal influenza A/H3N2 viruses in the country. To address this issue, this work explored the molecular epidemiology of the first set of whole-genome sequences of A/H3N2 human influenza isolates that circulated in Mexico between 2003 and 2012.

Ethics statement and viral sample collection
The use of human clinical samples for this study was approved by the Bioethics Committee at the Instituto de Biotecnología, UNAM. Samples were collected between 2003 and 2009 by personnel of the Ministry of Health, under the Mexican Official Norm NOM-017-SSA2-2012 for epidemiological surveillance. Informed consent was not obtained, as the analysis of samples for pathogens is part of the clinical testing mandate of the public health agency. The majority of samples were from Mexico City, with some being collected in other states of Mexico. The nineteen samples used in this study were chosen randomly, covering different regions and time periods (Table 1). Viruses from InDRE (Institute of Epidemiological Diagnosis and Reference of Mexico) were adapted to grow in MDCK cells, and 11 were antigenically characterized by the Centers for Disease Control and Prevention (CDC, Atlanta, USA) by hemagglutination inhibition assay using the WHO influenza reagent kits. Samples were antigenically identified prior to our analysis. Four additional clinical samples from 2011 and 2012 were collected by medical doctors in private practice in Veracruz State, Mexico, after signed consent of a parent or guardian (Table 1). Viral samples from 2011-2012 were not antigenically characterized and were used directly without cell culture adaptation.

Viral RNA extraction and whole genome amplification
Before extracting viral nucleic acids, external or host DNA was removed by treatment with Turbo DNase (Ambion). Total sample RNA was extracted with the PureLink Viral DNA/RNA kit (Invitrogen), using linear acrylamide as a carrier. Whole influenza genome amplification was done by multi-segment RT-PCR using a set of universal primers, MBTuni-12 (5′-ACGCGTGATCAGCAAAAGCAGG-3′) and MBTuni-13 (5′-ACGCGTGATCAGTAGAAACAAGG-3′), with the SuperScript III one-step RT-PCR platinum Taq HiFi kit (Invitrogen), as described previously [33]. All RT-PCR products were purified and concentrated with the DNA clean & concentrator kit (ZYMO) and visualized in 1.5% agarose gels.

Library preparation and sequencing
PCR products from whole genome RT-PCR were used as input material for preparation of sequencing libraries. DNA was sheared to approximately 300 base pairs by treatment with NEBNext Fragmentase (New England Biolabs) for 15 min at 37 °C. Sheared DNA was purified and concentrated with the DNA clean & concentrator kit (ZYMO) and used for preparation of the libraries using Illumina's Genomic DNA Sample Preparation kit with the Multiplex Sample Preparation Oligo kit, as described by the manufacturer.
High-throughput sequencing was performed at the next-generation sequencing core facility located at the Instituto de Biotecnología, UNAM in Cuernavaca, Morelos. Libraries of approximately 350 nucleotides were used to generate sequencing clusters, followed by 45 or 66 cycles of single base pair extensions in the Genome Analyzer IIx sequencer (Illumina, San Diego, CA), which was followed by multiplex barcode acquisition. Computational analysis was performed using the computational cluster at the Instituto de Biotecnología, UNAM. Image analysis was carried out using Genome Analyzer Pipeline Version 1.4. All reads were quality filtered and aligned to the reference genome A/H3N2/New York/392/2004 (GenBank accession numbers: CY002071, CY002070, CY002069, CY002064, CY002067, CY002066, CY002065, CY002068) with SMALT version 0.7.0.1 [34], and a consensus sequence was called with SAMtools version 0.1.18 [35]. To allow for sequence variability, two separate alignment rounds were done. In the first round, 15 and 20 mismatches were permitted within the 46 and 65 nucleotide long reads, respectively. In the second round, only 5 mismatches were allowed, using as a reference the consensus sequence generated during the first mapping round. Any reads that did not match the reference genome were not considered for further analysis.
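The two-round mismatch thresholds described above can be illustrated with a toy filter. This is a minimal sketch and not SMALT's algorithm: it assumes an ungapped comparison of a read against a reference window of equal length, and the names `count_mismatches`, `ROUND1_LIMITS`, `ROUND2_LIMIT` and `keep_read` are hypothetical.

```python
# Toy illustration of the mismatch limits quoted in the text.
# Assumption: ungapped, equal-length comparison (real mappers also handle indels).

ROUND1_LIMITS = {46: 15, 65: 20}  # read length -> mismatches allowed in round 1
ROUND2_LIMIT = 5                  # mismatches allowed in round 2, any read length


def count_mismatches(read: str, window: str) -> int:
    """Return the number of positions at which read and window differ."""
    if len(read) != len(window):
        raise ValueError("read and reference window must have the same length")
    return sum(a != b for a, b in zip(read.upper(), window.upper()))


def keep_read(read: str, window: str, mapping_round: int) -> bool:
    """Decide whether a read passes the mismatch limit of the given round."""
    if mapping_round == 1:
        limit = ROUND1_LIMITS.get(len(read))
        if limit is None:
            raise ValueError("round 1 limits are only given for 46 or 65 nt reads")
    elif mapping_round == 2:
        limit = ROUND2_LIMIT
    else:
        raise ValueError("mapping_round must be 1 or 2")
    return count_mismatches(read, window) <= limit
```

Under these assumptions, a 46-nucleotide read with 10 mismatches would be kept in the permissive first round but rejected in the second round unless the first-round consensus had already absorbed most of its differences from the original reference.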

Phylogenetic analysis
Complete influenza A/H3N2 viral genomes from North America collected from 1998 to 2013 were downloaded from the Influenza Virus Resource (NCBI) as of March 2013. The dataset was limited to northern hemisphere viruses following the sink-source model, in which, hypothetically, viruses circulating in Mexico are seeded from the northern hemisphere at the beginning of the season. Viruses from the database were chosen according to their year of collection (from 1999 to 2013) and whole genome availability.
Data sets for each viral gene segment were then obtained and curated by removing redundant sequences. When available, sequences of the antigenic reference strains selected for vaccination for the 1999-2013 flu seasons, as described in the literature and in the CDC information sheets, were added to each dataset (Fig. S1 and Table 1) [15,16,17,18,19,20,21,22,23,24,25,26,27,28]. Consistent with the information available on influenza seasonality in Mexico, the following criteria were chosen for assigning our isolates to seasons: viruses collected from January to June correspond to the season that began in the year before collection (e.g. 'A/Mexico/1/6/99' would belong to the 1998-1999 season), while viruses collected from July to December correspond to the season beginning in the year of collection (e.g. 'A/Mexico/1/7/99' would belong to the 1999-2000 season).
The phylogenetic clusters found in this work were labeled according to the presence of antigenic reference strains (Table 1 and Fig. S1), when available. Given this, the Panama cluster corresponds to the Panama-clustering viruses, the Fujian cluster to the Fujian-clustering viruses, the Korea cluster to the Korea-clustering viruses, the California cluster to the California-clustering viruses, the N-lineage to the Wisconsin-clustering viruses, the Brisbane cluster to the Brisbane-clustering viruses, the Perth cluster to the Perth-clustering viruses, and finally the Victoria cluster to the Victoria-clustering viruses.
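The season-assignment rule stated above can be written as a short function. This is a minimal sketch: the function name `assign_season` and the choice of a "YYYY-YYYY" label are assumptions made for illustration, and only the January to June versus July to December split comes from the text.

```python
def assign_season(year: int, month: int) -> str:
    """Map a collection date to an influenza season label such as '1998-1999'.

    January-June  -> season that started in the previous year.
    July-December -> season that starts in the collection year.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    start_year = year - 1 if month <= 6 else year
    return f"{start_year}-{start_year + 1}"


if __name__ == "__main__":
    print(assign_season(1999, 6))  # June 1999
    print(assign_season(1999, 7))  # July 1999
```

With this rule, a virus collected in June 1999 is labeled 1998-1999 and one collected in July 1999 is labeled 1999-2000, matching the two examples given in the text.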
To define the viral clusters, all eight individual gene trees were built. Data sets were individually aligned using MUSCLE [36] implemented in SeaView [37]. For the phylogenetic analysis, a total of 756 sequences were used for PB2, 759 for PB1, 752 for PA, 758 for HA, 627 for NP, 669 for NA, 461 for M, and 414 for the NS gene. Phylogenetic analysis was performed under the maximum likelihood criterion with PhyML 3.0 [38], using the GTR+G model with aLRT (SH-like) branch support. Clusters were
To further determine mutations in the previously described receptor binding and antigenic sites of the HA gene and in the amantadine resistance sites of the M gene [13,30,39,40], we manually searched for specific mutations in the sequence alignments. For the HA protein, HA1 numbering was used to detect the receptor binding sites, while HA2 numbering (+16 from amino acid one in the coding sequence) was used to assign antigenic sites [39,40]. The HA1 and HA2 numbering refers to the two polypeptides (HA1 and HA2) produced by proteolytic cleavage of the single polypeptide precursor (HA0). The five major antigenic sites in the HA protein have been mapped on the HA1 peptide [40,41]. For clarity, all amino acid positions in the text are given in HA1 numbering, which corresponds to the ORF of the HA gene.
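The 16-residue offset mentioned above can be expressed as a pair of conversion helpers. This is a sketch under a stated assumption: that position p in the mature HA1 numbering corresponds to position p + 16 counted from amino acid one of the coding sequence. The direction of the offset and the names `ha1_to_orf` and `orf_to_ha1` are not taken from the original and should be checked against the alignment actually used.

```python
OFFSET = 16  # assumed number of residues preceding mature HA1 in the coding sequence


def ha1_to_orf(ha1_position: int) -> int:
    """Convert a mature HA1 position to a position in the full coding sequence."""
    if ha1_position < 1:
        raise ValueError("HA1 positions start at 1")
    return ha1_position + OFFSET


def orf_to_ha1(orf_position: int) -> int:
    """Convert a coding-sequence position to mature HA1 numbering."""
    if orf_position <= OFFSET:
        raise ValueError("position lies before the start of mature HA1")
    return orf_position - OFFSET


if __name__ == "__main__":
    # Receptor binding positions 226 and 227 and antigenic site B position 189.
    for site in (189, 226, 227):
        print(site, "->", ha1_to_orf(site))
```

Keeping the offset in one named constant makes it straightforward to look up a site in an alignment of full-length coding sequences while reporting it in HA1 numbering.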

Results
To study the evolution of A/H3N2 influenza viruses circulating in Mexico, the complete genomic sequences of 19 isolates collected between 2003 and 2012 were used for phylogenetic analysis (Table 1). All sequences were deposited in GenBank under accession numbers KJ855328-KJ855479. Unfortunately, the number of samples obtained for this work was limited. Due to the small sample size, it is unlikely that the complete H3N2 viral diversity in Mexico from 2000 to 2013 is represented. However, the isolates used represent random samples taken in places where influenza outbreaks were detected during the peaks of the influenza seasons from 2000 to 2013. Therefore, the low number of viral isolates in this work could still represent the distribution and incidence of the major viral groups circulating in the collection areas at the time of sampling.
The HA and NA trees showed seven phylogenetic clusters matching previously defined antigenic groups (Panama, Fujian, California, N-lineage, Brisbane, Perth and Victoria) (Figs. 1 and 2). The clade 'B' phylogenetic cluster, which has no matching antigenic strain, was also detected. Clade 'B' has been characterized as a minor ancestral cluster formed by viruses from 2003-2004 closely related to the A/Panama/2007/99-like strains [29]. Further, a novel cluster, termed here Korea, was identified (Figs. 1 and 2).
The majority of the observed clusters were also evident in the PB2, PB1, PA, and NP gene trees (Figs. S2, S3, S4, S5), and to a lesser extent in the M and NS gene trees, which are not fully resolved due to the lower phylogenetic signal associated with the reduced variability in these gene segments (Figs. S6 and S7). The Mexican isolates used in this study grouped within five of the observed phylogenetic clusters: the new Korea cluster, the clade 'B', the N-lineage, Brisbane, and Victoria (Figs. 1 and 2, Table 1).

The Korea cluster
The newly observed cluster was provisionally named Korea in this study (as the antigenic strain A/Korea/770/2002 groups within this cluster in the HA gene tree), and is also observed in other gene trees (PB2, PB1, PA and NP) (Figs. S2, S3, S4, S5). Our phylogenetic analysis shows that the Korea viruses have an HA and a PB1 closely related to 'old' viruses belonging to the Panama and Fujian clusters and to clade 'B' (Fig. 1 and Fig. S3); in the HA tree, the Korea cluster is positioned close to clade 'B', while in the PB1 tree it is positioned outside clade 'B' and the Panama and Fujian clusters. On the other hand, in the NA, PB2, PA and NP gene trees, the Korea cluster is basal to the main tree trunk from which strains collected after 2004 diverge (Fig. 2, and Figs. S2, S4 and S5); in the NA and PB2 trees, the Korea cluster is closely related to the California viruses, while in the PA and NP trees it is close to the N-lineage. In the M tree, the Korea cluster is located close to the Brisbane cluster (Fig. S6), while in the NS tree the clusters cannot be defined (Fig. S7). These findings suggest that a complex reassortment gave rise to the Korea cluster viruses. Because the M gene is highly conserved, a discrete distribution of viruses according to their date of collection is not clearly visible in the M tree. Therefore, the position of the Korea cluster close to the Brisbane cluster could be biased by sequence similarity and does not necessarily reflect a close phylogenetic relationship between the Brisbane and Korea clusters.
Regarding the seasonality pattern observed in Mexico, the collection dates of the viruses used in this work span September to May (with the exception of only two viruses, A/Mexico/DIF835/2003 and A/Mexico/QUE2270/2005, collected in July and August, respectively) (Table 1). Overall, the observed pattern is the one expected given the geographical position of Mexico, with a delay of two months in the duration of the season. It is of interest that some Mexican isolates seemed to be ahead of or behind the rest of the North American viral diversity. In this sense, clear segregation according to seasonality for the Victoria cluster in these gene trees (Fig. 2, Figs. S4 and S5). (Table 1).

Phylogenetic/Antigenic Characterization and Variations in the HA gene
We further analyzed specific mutations in the receptor binding and antigenic sites of the HA protein sequences of our isolates and reference strains. We found that there is a clear distinction in the receptor binding sites of the Panama and Korea clusters (Table 2), although these two viruses had been characterized as antigenically equivalent [40,41].
Table 2 footnotes: 1 Determined by HA2 numbering (+16 from amino acid one in coding sequence) [39,40]. 2 Amantadine sensitivity given by mutation in position 31 of M2 protein. S = sensitivity, N = resistance. 3 Antigenic sites B and E are conformational epitopes. 4 Changes of interest are shown in bold and italics. doi:10.1371/journal.pone.0102453.t002
As expected, more variations were observed in the antigenic sites (Table 2). No changes were found between the antigenic sites of isolate A/Mexico/DIF2662/2003 from clade 'B' and the reference strain A/Korea/770/2002 (Table 2). The A/Panama/2007/99 and A/Korea/770/2002 reference strains share the amino acid S in position 189 of antigenic site B, evidence of the similarities between the HA of the Korea viruses and that of 'old' viruses from 2002 or before. Interestingly, among the Korea, Fujian and California clusters, the largest number of changes was found between the Fujian and Korea clusters, with two changes in receptor binding sites (226 and 227) and four changes in antigenic sites (A 145, B 188, B 189 and B 195) (Table 2).
Jin et al. have described that H155T and Q156H are the two amino acid changes responsible for the antigenic drift between Panama-like and Fujian-like viruses, and that a total of 13 amino acid differences between these groups of viruses can be observed [39]. We manually searched the HA alignment for differences among the viruses used in this study and found that, in contrast to the antigenic characterization results, isolate  Table 2). Changes associated with the A/Victoria/361/2011 reference strain were found in the Mexican isolates from 2011, which congruently grouped within the Victoria cluster (Tables 1 and 2). Three of these isolates presented an additional change in antigenic site C (196A) (Table 2).
Finally, sequence analysis of the M2 gene revealed that all influenza isolates detected after 2004 in Mexico are amantadine-resistant, having an asparagine in position 31 of the M2 protein, while isolates from 2003 or earlier (belonging to clade 'B' or the Korea cluster) are amantadine-sensitive, having a serine in this position (Table 2).
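The position-31 check described above can be sketched as follows. This is an illustrative helper, not the procedure used in the study (which relied on manual inspection of alignments): it assumes an ungapped M2 protein sequence starting at residue 1, the function name `amantadine_phenotype` is hypothetical, and the sequences in the usage example are synthetic placeholders rather than real M2 sequences.

```python
def amantadine_phenotype(m2_protein: str) -> str:
    """Classify an M2 protein sequence by the residue at position 31.

    S (serine) at position 31     -> 'sensitive'
    N (asparagine) at position 31 -> 'resistant'
    any other residue             -> 'undetermined'
    """
    if len(m2_protein) < 31:
        raise ValueError("sequence is shorter than 31 residues")
    residue = m2_protein[30].upper()  # position 31 is index 30 (0-based)
    if residue == "S":
        return "sensitive"
    if residue == "N":
        return "resistant"
    return "undetermined"


if __name__ == "__main__":
    # Synthetic placeholder sequences: 30 filler residues, then position 31.
    print(amantadine_phenotype("A" * 30 + "S" + "A" * 10))
    print(amantadine_phenotype("A" * 30 + "N" + "A" * 10))
```

Returning "undetermined" for any other residue avoids silently classifying sequences that carry a different substitution or an ambiguous call at this position.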

Discussion
In this work we analyzed all eight segments of 19 influenza A/H3N2 viruses isolated between 2003 and 2012 in Mexico and compared them to a set of A/H3N2 influenza virus strains isolated in North America from 1998 to 2013. Previous studies report frequent co-circulation of different lineages during single influenza seasons [8,29]. We observed co-circulation
It has been suggested that influenza A/H3N2 viruses are seeded each year from South-East Asia and disappear at the end of the season [10]. However, local persistence of viral lineages and other complex evolutionary patterns of A/H3N2 influenza viruses have also been detected [5], suggesting that strains from North America do contribute to A/H3N2 global evolution [8]. In a global context, numerous reassortment events have been detected among H3N2 viruses circulating in humans, and some of these reassortant variants have been found to persist over time [5,8,13,29,30,31]. The differences in clustering of the HA and NA gene segments within phylogenetic trees suggest that reassortment can occur relatively frequently over time, and it has been demonstrated for a limited number of influenza seasons that multiple lineages of influenza A (H3N2) viruses co-circulate, persist and reassort within the human population [13,29]. On a local scale, there are also studies that show the co-circulation of different viral lineages and the occurrence of multiple reassortment events in several countries [42,43,44,45,46]. In general, we observed that in Mexico, as in the rest of North America, viruses within each influenza season cluster according to their date of collection rather than by location. There is no evidence for in situ evolution of Mexican A/H3N2 viruses, supporting the hypothesis of global circulation of influenza [47].
We found evidence for possible local persistence of some viral lineages in Mexico between different influenza seasons. While persistence of the N-lineage cannot be fully supported, as it is well known that during each flu season there is an overlap in the circulation of lineages from the previous and the starting season, the presence of the Brisbane cluster during three influenza seasons indicates persistence, and not a simple overlap. Nonetheless, our observations on local persistence of lineages within different seasons in Mexico should be further evaluated, as it is difficult to prove persistence within a geographic region without serial sampling of the same location.
We also detected a novel cluster named here as  [48,49], but here we have shown that the Fujian cluster is clearly phylogenetically distinct from the smaller Korea cluster. Moreover, we have shown that the Korea cluster had global distribution.
As inferred from our phylogenetic analysis, the Korea cluster shows a complex genetic pattern suggesting that it may have originated by extensive reassortment. The HA and PB1 of the Korea viruses are related to 'old viruses' from 1999-2000 (Panama cluster) and from 2002-2003 (Fujian cluster and to clade 'B'), while the NA, PB2, PA and NP genes are basal to the main tree trunk from which strains collected after 2004 diverge, related either to the California cluster or to the N-lineage. With respect to the other lineages known to have circulated in North America, we did not observe any Mexican isolates belonging to the Fujian, California, or Perth cluster, but such observations could be a result of the low number of samples used or sampling bias.
Changes were observed in antigenic sites of the HA protein among our isolates and reference strains, but no clear variation pattern associated with antigenic/phylogenetic groups could be detected in viruses collected within a short temporal range (within 6 months to one year). Clear differences could be seen between viruses collected across a larger temporal range (from before 2003 and after 2004), given mainly by changes in receptor binding sites. Since the emergence of the N-lineage in 2005, no frequent changes in the receptor binding sites have been detected, suggesting that receptor binding sites are under functional constraint, and therefore only a few changes become fixed in the main viral population.
In conclusion, our findings show that different A/H3N2 viral lineages in Mexico can co-circulate and can also persist locally between influenza seasons, thus contributing to the local evolution of A/H3N2 viruses. A novel minor cluster, named here Korea, that circulated worldwide during 2003 was also identified. Our work contributes useful information on the epidemiological and evolutionary behavior of A/H3N2 viruses in Mexico that can be integrated with global data.

Figure S1. Prevalent A/H3N2 antigenic groups circulating in human populations from 2000 to 2012. Schematic representation of the predominant antigenic groups that have circulated since 2000, as reported [16,17,18,19,20,21,22,23,24,25,26,27,28]. The time line is shown at the top. S31N marks the introduction of the amantadine resistance mutation in circulating strains, with the broken line representing amantadine-sensitive strains and the continuous line representing the presence of resistant viruses [31]. Reference strains of the antigenic groups detected are shown.