Genetic Variability in L1 and L2 Genes of HPV-16 and HPV-58 in Southwest China

HPV accounts for most cases of cervical cancer. Approximately 90% of anal cancers and a smaller subset (<50%) of other cancers (oropharyngeal, penile, vaginal, vulvar) are also attributed to HPV. The L1 protein comprising HPV vaccine formulations elicits high-titre neutralizing antibodies and confers type-restricted protection. The L2 protein is a promising candidate for a broadly protective HPV vaccine. In our previous study, we found that the most prevalent high-risk HPV types among women in Southwest China were HPV-16 and HPV-58. To explore gene polymorphisms and intratypic variations of HPV-16 and HPV-58 L1/L2 genes originating in Southwest China, HPV-16 (L1: n = 31, L2: n = 28) and HPV-58 (L1: n = 21, L2: n = 21) L1/L2 genes were sequenced and compared to others described and submitted to GenBank. Phylogenetic trees were then constructed by the Neighbor-Joining and Kimura 2-parameter methods (MEGA software), followed by an analysis of the diversity of secondary structure. Selection pressures acting on the L1/L2 genes were then estimated with PAML software. Twenty-nine single nucleotide changes were observed in HPV-16 L1 sequences with 16/29 non-synonymous mutations and 13/29 synonymous mutations (six in alpha helix and two in beta turns). Seventeen single nucleotide changes were observed in HPV-16 L2 sequences with 8/17 non-synonymous mutations (one in beta turn) and 9/17 synonymous mutations. Twenty-four single nucleotide changes were observed in HPV-58 L1 sequences with 10/24 non-synonymous mutations and 14/24 synonymous mutations (eight in alpha helix and four in beta turn). Seven single nucleotide changes were observed in HPV-58 L2 sequences with 4/7 non-synonymous mutations and 3/7 synonymous mutations. The selective pressure analysis showed that most of these mutations were under positive selection.
This study may help understand the intrinsic geographical relatedness and biological differences of HPV-16/HPV-58 and contributes further to research on their infectivity, pathogenicity, and vaccine strategy.


Introduction
Human papillomavirus (HPV) is one of the most important causative agents of cervical cancer, which accounts for a worldwide cancer burden in women second only to breast cancer [1,2]. Approximately 90% of anal cancers and a smaller subset (<50%) of other cancers (oropharyngeal, penile, vaginal, and vulvar) are also attributed to HPV. In total, HPV accounts for 5.2% of the worldwide cancer burden. HPV-16 and HPV-18 are responsible for 70% of cervical cancer cases and, especially HPV-16, for a large proportion of other cancers [3]. On the basis of their oncogenic potential, HPV types that infect the genital tract are classified as low risk (LR) and high risk (HR) [4]. Low-risk HPVs (including HPV-6, 11, 42, 43, and 44) are mainly associated with benign genital warts, while high-risk HPVs (including HPV-16, 18, 31, 33, 39, 45, 51, 52, 56, 58, 59, and 68) are the etiological agents of cervical cancer, a disease that affects approximately 500,000 women worldwide [5]. In our previous study, we found that the most prevalent high-risk HPV types among women in Southwest China were HPV-16 and HPV-58.
The HPV genome is packaged within a non-enveloped, icosahedral capsid composed of 72 pentamers of the major capsid late protein (L1) and an unknown number of the minor capsid proteins L2 [6,7]. The pentamers of L1 expressed in heterologous systems that assemble into virus-like particles (VLPs) [8] are the components used in the design of prophylactic vaccines. The L1 protein comprising HPV vaccine formulations elicits high-titre neutralizing antibodies and confers type-specific and long-lasting protection against persistent infection and associated cervical neoplasia attributable to HPV vaccine types [8,9]. However, there has been no vaccine designed that can prevent all HPV infections owing to the lack of cross-reactivity between L1 proteins of different HPV types.
The inner conical hollow of L1 pentamers can be occluded with a monomer of L2 [6]. An N-terminal "external loop" of L2 contains cross-neutralizing epitopes, which can also be the target of neutralizing and cross-neutralizing antibodies [10][11][12]. Analysis of FUTURE I/II and PATRICIA data suggested cross-protective vaccine efficacy against infections and lesions associated with HPV-31, 33, and 45 [13]. Therefore, targeting L2 may be an acceptable approach for a candidate vaccine. However, its low abundance in natural capsids (only 12-72 molecules per 360 copies of L1) limits its immunogenicity [14]. Currently, there are no efficient L2-based vaccines to prevent infection with these high-risk HPV types.
Due to the high prevalence of HPV not only among asymptomatic women but also in samples of different neoplasias worldwide, the association between intratypical variants of HPV-16 L1 has been described in several papers. Nevertheless, data concerning molecular variants of HPV-16 and HPV-58 L2 are still limited, necessitating further studies that would be essential to expand knowledge of the different variants. Furthermore, there is little data regarding the intratypical variants of HPV-16 L1 in Southwest China.
The aim of this study is to detect the nucleotide variability, gene polymorphism and phylogeny in the L1 and L2 genes of the High-Risk HPV-16 (L1: n = 31, L2: n = 28) and HPV-58 (L1: n = 21, L2: n = 21) samples obtained from Southwest China. The most variable sequences were chosen for an analysis of the diversity of their secondary structure. Nucleotide and amino acid sequence alignments were used to evaluate variant clusters. Amino acid changes of L1 and L2 genes might affect immune responses to HPV-16 and HPV-58 capsid proteins and advance HPV vaccine strategies. The genomic characterization of HPV variants is pivotal for a deeper understanding of the intrinsic geographical relatedness and biological differences of these viruses and contributes further to research on their infectivity and pathogenicity.

Ethics Statement
Written consent was obtained from each participant. The study protocol was approved by the institutional ethics committee (Institute of Medical Biology, Chinese Academy of Medical

Clinical Specimens
Samples examined in this study were obtained from cervical scrapings of 3000 female volunteer outpatients in Southwest China from 2009 to 2011. After routine cytology and HC2 testing, a cell suspension from each sample was placed in a 1.5 mL Eppendorf tube and transferred to a laboratory at the Institute of Medical Biology for HPV DNA amplification. HPV typing of these specimens was performed using a nested multiplex PCR assay as described previously [15,16]. The HPV-16 (L1: n = 31; L2: n = 28) and HPV-58 (L1: n = 21; L2: n = 21) sequences thereby obtained were used for molecular characterization by sequence analysis of the L1 and L2 genes.

Nucleic acid extraction and sequencing amplification
Molecular characterization was performed by sequence analysis of L1 and L2 gene amplicons. The entire regions of the L1 and L2 genes of HPV-16 and HPV-58 were amplified using degenerate primer pairs. Two partially overlapping fragments were amplified for each virus. All primers were designed and synthesized by Sangon Biotech (Shanghai). The GenBank reference sequences used for primer design were HPV-16 (NC001526 [17]) and HPV-58 (HQ537759). The primer sequences and their relative positions are listed in Table 1. The fragments were amplified in 50 µL reaction volumes containing 5 µL extracted DNA (template), 2× Power Taq PCR MasterMix (TaKaRa), 25 pmol of each primer (Sangon Biotech) and deionized water. The cycling conditions were as follows: 94 °C for 5 min, followed by 30 cycles of 94 °C for 45 sec, 50 °C for 45 sec and 72 °C for 1 min, and a final extension at 72 °C for 7 min. Amplicons were visualized on 2% agarose gels stained with GoldView™ Nucleic Acid Stain. PCR products were then purified and sequenced by Sangon Biotech.

Results
Phylogenetic and amino acid mutations analysis of HPV-16 L1 sequences
HPV-16 L1 sequences were determined and analyzed by aligning the 1596-nucleotide L1 sequences from all viral strains (n = 31; including the reference sequences). The neighbor-joining phylogenetic tree is shown in Fig. 1.
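The abstract states that the trees were built with the Neighbor-Joining method on Kimura 2-parameter distances in MEGA. As an illustration only, the pairwise distance can be sketched as below, where P and Q are the proportions of transitional and transversional differences between two aligned sequences and d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The function name `k2p_distance` and the choice to skip gapped or ambiguous sites are assumptions of this sketch, not a description of MEGA's internals or of the authors' settings.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    Hypothetical helper for illustration; sites with gaps or ambiguous
    bases are skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: ignore this site
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # math.log raises ValueError if the sequences are too divergent (saturation).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A matrix of such pairwise distances is what a neighbor-joining routine would take as input; for intratypic variants that differ at only a few sites, the distances are small and close to the raw proportion of differing sites.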
Twenty-nine single nucleotide changes were identified among the sequences studied. Specifically, 13/29 (44.8%) were synonymous mutations and 16/29 (55.2%) were non-synonymous mutations. Of the six mutations observed in the sequences encoding the alpha helix, only one was non-synonymous. Two non-synonymous mutations were observed in the sequences encoding the beta turn (glycine to arginine) (http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.). One of the 31 sequences did not belong to any standard type branch (Fig. 1). Seven samples carried the same A to C mutation at position 979. The detected mutations are summarized in Table 2.
Compared to prototype HPV sequences, insertion and deletion events were not identified and there was no evidence of premature stop codons or nucleotide deletions in the L1 HPV-16 sequences analyzed.
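How a single nucleotide change is labelled synonymous or non-synonymous can be illustrated with the A979C change reported for HPV-16 L1. The sketch below assumes positions are numbered from the first base of the open reading frame and uses the standard genetic code; the helper names (`locate`, `classify`) and the example codon `ACA` are hypothetical, since the actual reference codon is not given in the text.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def locate(nt_pos: int) -> tuple[int, int]:
    """Map a 1-based ORF nucleotide position to (1-based residue, 0-based offset in codon)."""
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3


def classify(codon: str, offset: int, new_base: str) -> tuple[str, str, str]:
    """Return (label, reference amino acid, mutant amino acid) for one substitution."""
    mutant = codon[:offset] + new_base + codon[offset + 1:]
    ref_aa, mut_aa = CODON_TABLE[codon], CODON_TABLE[mutant]
    label = "synonymous" if ref_aa == mut_aa else "non-synonymous"
    return label, ref_aa, mut_aa


# Position 979 falls on the first base of codon 327.
residue, offset = locate(979)
# "ACA" is an assumed threonine codon used only for illustration.
print(residue, classify("ACA", offset, "C"))
```

Under these assumptions, position 979 maps to the first base of residue 327, and an A to C change there turns any threonine codon (ACN) into a proline codon (CCN), which is consistent with the T327P substitution named in the Discussion.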
Phylogenetic and amino acid mutations analysis of HPV-16 L2 sequences
HPV-16 L2 sequences were determined and analyzed by aligning the 1422-nucleotide L2 sequences from all viral strains (n = 28; including the reference sequences). The neighbor-joining phylogenetic tree is shown in Fig. 2.
Seventeen single nucleotide changes were identified among the sequences studied. Specifically, 9/17 (52.9%) were synonymous mutations and 8/17 (47.1%) were non-synonymous mutations. No amino acid changes were found at residues 65-71 and 112-120, which play an important role in inducing neutralizing antibodies [24]. No amino acid mutations occurred in the sequences encoding the alpha helix. Only one non-synonymous mutation was observed in the sequences encoding the beta turn (aspartic acid to glutamic acid). Both the JX313748 and JX313749 L2 sequences fell into the same branch as AY686579 (Fig. 2). The detected mutations are summarized in Table 3.
Insertion and deletion events were not present and there was no evidence of premature stop codons or nucleotide deletions within the HPV-16 L2 sequences analyzed.
Phylogenetic and amino acid mutations analysis of HPV-58 L1 sequences
HPV-58 L1 sequences were determined and analyzed by aligning the 1575-nucleotide L1 sequences from all viral strains (n = 21; including the reference sequences). The neighbor-joining phylogenetic tree is shown in Fig. 3.
Twenty-four single nucleotide changes were identified among the sequences studied. Specifically, 14/24 (58.3%) were synonymous mutations and 10/24 (41.7%) were non-synonymous mutations. Eight mutations (four non-synonymous) occurred in the sequences encoding the alpha helix, and four mutations (two non-synonymous) were observed in the sequences encoding the beta turn. Eight samples carried the same C to A mutation at position 1124. The detected mutations are summarized in Table 4.
Compared to prototype HPV sequences, no frame shifts, premature stop codons, insertions, or deletions were observed in the HPV-58 L1 sequences analyzed.
Phylogenetic and amino acid mutations analysis of HPV-58 L2 sequences
HPV-58 L2 sequences were determined and analyzed by aligning the 1419-nucleotide L2 sequences from all viral strains (n = 21; including the reference sequences). The neighbor-joining phylogenetic tree is shown in Fig. 4.
Seven single nucleotide changes were identified among the sequences studied. Specifically, 3/7 (42.9%) were synonymous mutations and 4/7 (57.1%) were non-synonymous mutations. No amino acid changes were observed at residues 65-71 and 112-120, which play an important role in inducing neutralizing antibodies. No mutations occurred in the sequences encoding the alpha helix or beta turn, but six mutations were found in the random coil and one in the extended strand. The detected mutations are summarized in Table 5.
Insertion and deletion events were not identified and there was no evidence of premature stop codons or nucleotide deletions in the L2 HPV-58 sequences analyzed.
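The synonymous and non-synonymous proportions reported for the four genes can be re-derived from the counts given above. The short tally below is only an arithmetic check; the dictionary layout and the function name `percentages` are choices made for this sketch.

```python
# (synonymous, non-synonymous) counts as reported in the Results text.
COUNTS = {
    "HPV-16 L1": (13, 16),
    "HPV-16 L2": (9, 8),
    "HPV-58 L1": (14, 10),
    "HPV-58 L2": (3, 4),
}


def percentages(synonymous: int, non_synonymous: int) -> tuple[float, float]:
    """Return the synonymous and non-synonymous shares as percentages (1 decimal)."""
    total = synonymous + non_synonymous
    return round(100 * synonymous / total, 1), round(100 * non_synonymous / total, 1)


for gene, (syn, nonsyn) in COUNTS.items():
    syn_pct, nonsyn_pct = percentages(syn, nonsyn)
    print(f"{gene}: {syn}/{syn + nonsyn} synonymous ({syn_pct}%), "
          f"{nonsyn}/{syn + nonsyn} non-synonymous ({nonsyn_pct}%)")
```

The computed shares agree with the percentages quoted in each subsection.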

Selective pressure analysis of all sequences
We tested for variable dN/dS rate ratios among lineages using PAML 4.0 [25]. There was no evidence of negative selection in the sequence alignments of the HPV-16 and HPV-58 L1 and L2 genes (P-value < 0.1). The selective pressure analysis results are summarized in Tables 6, 7, 8, and 9.
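Nested codon models of the kind fitted in PAML are usually compared with a likelihood ratio test, in which twice the log-likelihood difference between the two models is referred to a chi-square distribution. The sketch below assumes 2 degrees of freedom, as is common for paired site models that differ by two parameters; the text does not state which model pairs were used, so the degrees of freedom, the function name and the lnL values are illustrative assumptions rather than the authors' settings or results.

```python
import math


def likelihood_ratio_test(lnl_null: float, lnl_alt: float) -> tuple[float, float]:
    """Return (2 * delta lnL, p-value) for two nested models, assuming df = 2.

    For 2 degrees of freedom the chi-square survival function has the
    closed form exp(-x / 2), so no external statistics library is needed.
    """
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)  # guard against small negative noise
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical log-likelihoods, not values from Tables 6-9.
stat, p = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-995.0)
print(f"2*delta lnL = {stat:.2f}, p = {p:.4f}")
```

A small p-value favours the model that allows a class of sites with dN/dS > 1; individual sites are then ranked by posterior probability, as described in the table notes.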

Discussion
Human papillomavirus (HPV) vaccines based on L1 are now licensed in more than 100 countries. National and regional immunization programs aimed at young adolescent girls have been widely implemented, and include catch-up programs in some countries up to the age of 18 years or older. However, these vaccines target only two of the 15 high-risk HPV types, which are responsible for 70-80% of cervical cancer cases. The prevention of 96% of cervical cancer would require immunity against at least 7 high-risk HPV types (HPV-16, 18, 31, 33, 45, 52 and 58) [5]. L2 and the L1 proteins of other types are now being studied more broadly in vaccine research.
Among the 3000 volunteer outpatients investigated in our earlier study between 2009 and 2011, 646 were HPV positive, a rate of 21.5%. Among the 646 positive samples, 476 were of high-risk types (73.7% of the positive samples), while 170 were of low-risk types (26.3% of the positive samples). The most common high-risk HPV types in female reproductive tract infections in Southwest China were HPV-16, 58, 18, 31, 33, and 35. The most common low-risk types were HPV-6, 11, and 81. Among the 476 high-risk positive samples, HPV-16 and HPV-58 were the main types: HPV-16 comprised 217 cases (45.6% of the high-risk samples) and HPV-58 comprised 145 cases (30.5% of the high-risk samples). Together, HPV-16 and HPV-58 comprised 362 cases (76.1% of the high-risk samples), while all the other high-risk types, including HPV-33, 35, 18, 31, 59, and 66, comprised only 114 cases (24% of the high-risk samples) [26]. This may be attributed to the particular geographic location where the samples were taken and to interactions among different populations in the southwest border district of China.
HPV-16 is the most prevalent high-risk type of HPV worldwide, and also the type that is most frequently associated with cancer [27][28][29]. However, the prevalence of HPV-58 and its relative contribution to the development of cervical neoplasia vary greatly between areas worldwide. HPV-58 has previously been reported to be particularly prevalent in some areas of northeastern Asia (China, Korea, Japan) and some regions of Central and South America, with a significant trend to increased prevalence in line with the increasing severity of lesions [30][31][32][33][34][35][36][37][38]. These data differed from those of the international study reported by Bosch et al., which did not include Chinese women [27]. HPV-58 is rarely detected in the Americas, Europe and Africa. Worldwide, HPV-58 has been found in only 2% of cervical cancers [27]. In contrast, Chan et al. [39] reported that one-third of the women with cervical cancers in Hong Kong were positive for HPV-58, and similarly high rates have also been reported for Chinese populations living in Shanghai (East China) (30.4% in the cervical cancer subgroup) [40], Jiangxi (Central China) (18.4% in the cervical cancer subgroup) [40], and Taiwan (21.0% in the cervical cancer subgroup) [41]. Studies from different groups suggest that HPV-58 may play a more prominent role in the development of cervical cancer in Asia than HPV types 31, 33, and 45, which are more common on other continents [31,33,40,42,43]. HPV-58 variants carrying E7 C632T (T20I) and G760A (G63S) substitutions may be associated with an increased risk for cervical cancer [39,43].
Based on our previous research and that of other groups, we chose HPV-16 (L1: n = 31, L2: n = 28) and HPV-58 (L1: n = 21, L2: n = 21) samples in the current study to explore intratypic variations, construct phylogenetic trees and estimate the selection pressures acting on the L1 and L2 genes.
It has been reported that specific intratypic HPV genome variations may be related to virus infectivity, pathogenicity, progression to cervical cancer, viral particle assembly and host immune response [44,45]. However, there are still no data demonstrating whether immunity to one HPV variant can protect against infection with another variant. Thus, identification of HPV genetic diversity in specific clinical settings may prove important for the rational design of diagnostic, therapeutic, and vaccine strategies [46,47].
The neighbor-joining phylogenetic tree results showed that the L1 and L2 sequences of HPV-16 and HPV-58 are distributed across two or three standard branches rather than in one specific branch. Most HPV-58 L1 and L2 sequences fell into the branches of D90400 (Japan), EU918765, and HQ537760 (isolate AS347), which belong to the Asian and the European branches. Some mutations occurred in the sequences encoding the alpha helix and beta turn, and these may influence protein secondary structure. The functional effects of these mutations need further investigation. In this study, the most common mutation of HPV-16 L1 was A979C (T327P). This position is located in a major common B cell epitope peptide in both mice and humans, so the mutation might affect the immunogenicity of HPV-16 L1 [48].
From the results of the selective pressure analysis, we conclude that most mutations in HPV-16 and HPV-58 L1 and L2 were under positive selection, which suggests that these amino acid changes help the virus adapt to its environment.
The L1 and L2 of HPV-16 and HPV-58 had a low rate of nucleotide changes, something that could be attributed to the fact that HPV uses the host cell's DNA replication machinery, which is characterized by proofreading capacity and post replication repair mechanisms [17]. Moreover, many core functions of viral proteins are very important in the viral life cycle, and this may result in selection that restricts the actual number of possible evolutionary events. Some samples did not belong to any standard type branch because of distinctive mutations or the narrow scope on choosing standard types, requiring further efforts of analysis in the future.
Nucleotide substitutions in viral genomes may affect virus assembly, carcinogenic potential, and host immunologic responses [49]. HPV-16 and HPV-58 gene diversity may help us understand the oncogenic potential of these viral strains and how polymorphisms can affect the host response following infection or vaccination.

Notes to Tables 6, 7, 8, and 9: lnL, the log-likelihood difference between the two models; 2Δl, twice the log-likelihood difference between the two models. The positively selected sites were identified with posterior probability ≥0.9 using the Bayes empirical Bayes (BEB) approach. One asterisk indicates posterior probability ≥0.95, and two asterisks indicate posterior probability ≥0.99. NA means not allowed. NS means the sites under positive selection but not reaching the significance level of 0.9. doi:10.1371/journal.pone.0055204.t009