Evolving Molecular Epidemiological Profile of Human Immunodeficiency Virus 1 in the Southwest Border of China

Background We previously reported that in Xishuangbanna (Banna) Dai Autonomous Prefecture, a well-developed tourist destination on the southwest border of China, HIV-1 was transmitted predominantly through heterosexual contact, with limited genotypic diversity and few drug resistance mutations [1]. Because the number of newly diagnosed HIV-1 cases per year in Banna has increased rapidly in recent years, it is important to evaluate the evolution of HIV-1 molecular epidemiology to better understand the ongoing HIV-1 outbreak in this region. Methodology/Principal Findings By sequencing HIV-1 pol genes and performing phylogenetic analysis, we conducted a molecular epidemiological study of 352 HIV-1-seropositive, highly active antiretroviral treatment (HAART)-naïve individuals newly diagnosed at the Banna Center for Disease Control and Prevention between 2009 and 2011. Among the 283 samples with successful pol gene sequencing (84.1% from adults infected through heterosexual contact, 10.6% from needle-sharing drug users, 2.8% from men who have sex with men, 0.4% from children born to HIV-1-infected mothers, and 2.1% with an unknown transmission route), we identified 108 (38.2%) CRF08_BC, 101 (35.7%) CRF01_AE, 49 (17.3%) CRF07_BC, 5 (1.8%) C/CRF57_BC, 3 (1.1%) B', 1 (0.4%) B/CRF51_01B, and 16 (5.7%) unique recombinant forms. Of these infected individuals, 104 (36.7%) carried drug resistance or resistance-relevant mutations, and 4 of them carried mutations conferring high-level resistance to 3TC/FTC, EFV/NVP or NFV. Phylogenetic analysis revealed 21 clusters (2–7 sequences each) involving only 21.2% (60/283) of the sequences. Conclusion/Significance In contrast to our previous findings, CRF08_BC has replaced CRF01_AE as the dominant HIV-1 genotype in Banna prefecture. Viral strains with drug resistance mutations were detected frequently in newly diagnosed HIV-1-infected individuals in this region.


Introduction
The Xishuangbanna (Banna) Dai Autonomous Prefecture of China's Yunnan province, located on the southwestern border of China, comprises two counties (Menghai and Mengla) and one city (Jinghong), with an estimated population of 1.13 million (including 13 ethnic groups) in 2012. Owing to its distinctive biogeographic characteristics and diverse ethnic cultures, Banna prefecture has been a well-developed tourist destination in China since the 1990s.
Before 2005, HIV-1 transmission in Banna prefecture occurred mainly through heterosexual contact (77.3%), with limited genotypic diversity and uncommon drug resistance mutations (DRMs) [1]. Although multiple HIV-1 genotypes (including B, B', C, CRF07_BC, CRF08_BC, and CRF01_AE) had been identified in Yunnan province [2], only CRF01_AE and CRF08_BC were found to be dominant genotypes (62.2% and 33.3%, respectively). Moreover, in our previous study only one case, infected with CRF01_AE, was found to carry mutations conferring high-level resistance to NRTI and NNRTI antiretroviral drugs [1].
Because the number of newly diagnosed HIV-1 cases per year has increased rapidly in recent years (75 cases in 2005 and nearly 200 cases in 2011), it is important to evaluate the evolution of HIV-1 molecular epidemiology to better understand the ongoing HIV-1 outbreak in this region. In addition, since 2005 free HAART drugs have been provided to registered HIV-1-infected individuals with a CD4 count below 350 cells/µl under the national "free antiretroviral treatment program" (NFATP). Therefore, transmission of HIV-1 carrying drug-related mutations might have emerged, as we recently observed in Guangdong province [3]. In the present study, we conducted a molecular epidemiological investigation of 352 cases of newly diagnosed HIV-1 infection from July 2009 to June 2011 at the Banna Center for Disease Control and Prevention (CDC).

Participants and specimens
From July 2009 to June 2011, a total of 352 individuals newly diagnosed with HIV-1 infection at the Banna CDC were enrolled in this study. All individuals were required to complete standardized questionnaires (covering sex, age, risk factors, mode of transmission, occupation, and geographic location). An enzyme immunoassay for screening HIV-1/HIV-2/O antibodies in sera (Vironostika Uni-form II plus O kit; Organon Teknika BV, Turnhout, Belgium) was performed at the Banna CDC. Positive sera were confirmed by western blot (HIVBlot2.2 kit; Genelabs Diagnostics, Singapore).
After giving written informed consent, 322 HIV-1-infected individuals were recruited into the molecular epidemiological and drug resistance survey. Briefly, 5 ml of EDTA-treated whole blood was taken from each patient, and the samples were sent by air within 4 hours to our laboratory at the Tropical Medicine Institute, Guangzhou University of Chinese Medicine (TMI, GUCM). The blood samples were used immediately for routine blood counts and CD4 T-cell count measurements, as well as for plasma separation. Plasma samples were stored at −80 °C until use.

Ethics statement
The institutional ethics committee of TMI, GUCM approved the study protocol (No. 2009C015).

Viral load measurement
All plasma samples were thawed at the same time and used for viral RNA measurement with the Food and Drug Administration (FDA)-approved Amplicor HIV-1 Monitor Test kit (version 1.5) (Roche Molecular Systems, Inc., Branchburg, New Jersey, USA) according to the manufacturer's instructions.

Sequencing of HIV-1 pol gene
The pol gene of HIV-1 encodes the viral enzymes protease, reverse transcriptase, and integrase, and holds sufficient variability to permit the phylogenetic reconstruction of transmission events [4]. Although it is hard to distinguish CRF01_AE from CRF15_01B by sequencing the pol region, CRF15_01B has not yet been identified in Yunnan province [5], and only a few CRF15_01B cases have been reported, in Hebei province and Beijing city [6]. In addition, the newly identified circulating recombinant forms CRF57_BC [7]/CRF65_cpx [8] and CRF51_01B [9] cannot be distinguished from subtypes B and C by sequencing the pol region. However, considering that subtypes B and C had not yet been identified in this area, we performed HIV-1 genotyping by sequencing the pol region as described previously [1]. Briefly, viral RNA was extracted from each HIV-1-infected individual's plasma (150 µl) using the QIAamp Viral RNA Mini kit (Qiagen, Valencia, California, United States) according to the manufacturer's instructions. The viral RNA was then subjected to one-step reverse transcription polymerase chain reaction (RT-PCR) to generate a fragment of the pol gene (1864 base pairs) spanning the protease and reverse transcriptase regions, as previously described [1,3,10]. The PCR products were purified (Qiagen, Valencia, Spain) and directly sequenced. The resulting sequences were edited using the SeqMan II program from the DNAStar package v.5.08 (Lasergene, Madison, WI). To rule out potential contamination, all sequences obtained were first subjected to an HIV-1 BLAST search against related reference sequences in the HIV Databases, funded by the Division of AIDS of the National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH) (http://hivweb.lanl.gov/content/index). In total, 283 sequences were successfully obtained from the 322 blood samples.

Drug resistant mutations
Drug resistance mutations in the protease and reverse transcriptase genes were identified using the most recently updated online Stanford Resistance Database tool: HIVdb program-Genotypic Resistance Interpretation Algorithm (version 7.0, http://sierra2.stanford.edu/sierra/servlet/JSierra?action=sequenceInput). In addition, sequence data were also submitted and evaluated by using the last

HIV-1 genotyping and phylogenetic analysis
HIV-1 subtype and CRF designations were determined by uploading sequences to the REGA HIV-1 automated Subtyping Tool version 2.0 (http://www.bioafrica.net/rega-genotype/html/subtypinghiv.html) and confirmed by in-house phylogenetic analysis of pol nucleotide sequences as previously described [1,3,10]. For phylogenetic analysis, reference sequences representing the overall genetic variability of HIV-1 group M were obtained from the National Institutes of Health/National Institute of Allergy and Infectious Diseases (NIH/NIAID)-funded HIV database, including references for all subtypes, sub-subtypes, and circulating recombinant forms (CRFs). Recently identified CRFs, such as CRF62_BC [11], CRF64_BC [12], and CRF65_cpx [8], were also included. All reference sequences and our newly obtained nucleotide sequences were aligned using Muscle [13], followed by manual editing. A neighbor-joining (N-J) tree was constructed under the HKY model of evolution with 1000 bootstrap replicates in SeaView v4.3 [14]. Sequences assigned to specific HIV-1 subtypes were finally confirmed by constructing maximum likelihood (ML) phylogenetic sub-trees as described below. To identify recombinant viruses, similarity analysis and bootscanning were performed with Simplot version 3.5.1 [15].
Phylogenetic inter-relationships among viral sequences were estimated using maximum likelihood (ML) phylogenies in PHYML 3.0 [16] with the minimal number of reference sequences. The whole sequence alignment was split into 6 sub-datasets based on the neighbor-joining tree, followed by gap stripping and ML phylogenetic reconstruction. An approximate likelihood ratio test (SH-like) was used to assess confidence in topology. Phylogenies were inferred using a general time-reversible model of nucleotide substitution, an estimated proportion of invariant sites, and gamma-distributed rates among sites. The best of the SPR and NNI heuristic options was selected to search the tree space, and bootstrap values with 1000 replicates were used to assess confidence in topology. Transmission clusters were defined by the statistical robustness of the ML topologies, assessed by high bootstrap values (>98%) with 1000 re-samplings, together with short branch lengths (genetic distances <0.015) of the HIV-1 pol gene sequences [3,4]. The phylogenetic tree was drawn with FigTree v.1.42 (tree.bio.ed.ac.uk/software/figtree/).
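The two cluster criteria described above (bootstrap support above 98% and genetic distance below 0.015) can be illustrated with a minimal Python sketch. It assumes that candidate clades have already been extracted from the ML tree together with their support values and pairwise genetic distances. The `Clade` record, the `is_transmission_cluster` helper, the use of the maximum pairwise distance, and the sample values are all hypothetical illustrations, not part of the study's pipeline.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

BOOTSTRAP_MIN = 98.0  # support threshold in percent (text: > 98%)
DISTANCE_MAX = 0.015  # genetic distance threshold (text: < 0.015)


@dataclass
class Clade:
    """Hypothetical record for one candidate clade taken from an ML tree."""
    name: str
    bootstrap: float                         # support in percent
    distances: Dict[Tuple[str, str], float]  # pairwise distances among members


def is_transmission_cluster(clade: Clade) -> bool:
    """Apply both criteria; taking the maximum pairwise distance is an assumption."""
    if not clade.distances:
        return False  # a single sequence cannot form a cluster
    return (clade.bootstrap > BOOTSTRAP_MIN
            and max(clade.distances.values()) < DISTANCE_MAX)


def select_clusters(clades: List[Clade]) -> List[str]:
    """Return the names of clades that satisfy both thresholds."""
    return [c.name for c in clades if is_transmission_cluster(c)]


# Illustrative values only (not study data)
example = [
    Clade("A", 99.5, {("s1", "s2"): 0.008}),  # passes both criteria
    Clade("B", 99.0, {("s3", "s4"): 0.021}),  # distance too large
    Clade("C", 85.0, {("s5", "s6"): 0.004}),  # support too low
]
print(select_clusters(example))
```

Requiring both conditions keeps well-supported but genetically distant clades, which are more likely to reflect older shared ancestry than recent transmission, out of the cluster list.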

Statistical analysis
Differences in the CD4 T-cell count and viral load among different subgroups of HIV-1 infected individuals were determined using a Mann-Whitney nonparametric test.
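As an illustration of the test named above, the following Python sketch computes the Mann-Whitney U statistic with average ranks for ties and a two-sided p-value from the normal approximation (no tie or continuity correction). The source does not state which software was used, so the function names and the sample CD4 values are hypothetical. For small samples an exact method from a statistics package would be preferable.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u_x, n1 * n2 - u_x)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


# Hypothetical CD4 counts (cells/µl) for two subgroups, not study data
group_a = [210, 350, 420, 515]
group_b = [380, 460, 600, 640, 700]
u_stat, p_value = mann_whitney_u(group_a, group_b)
print(u_stat, p_value)
```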

Accession Numbers
All pol sequences analyzed in the present study are deposited in EMBL under the accession numbers HG421451 to HG421735.
(Figure 1B). The means (±SD) of CD4 counts and viral loads were 430±266 cells/µl and 3.76±1.35 log10 copies/ml, respectively. None of the patients had previously been treated with any antiretroviral drug.

HIV-1 Genotyping and Unique Recombinant Forms (URF) identification
Among the 322 infected individuals who had given written consent to participate in this study, we obtained complete sequences from 283 samples, while amplification failed for 39 samples. However, the characteristics of the 283 individuals with successful pol sequencing were comparable with those of the whole study group (352 infected individuals) (Figure 1B). Their means (±SD) of CD4 counts and viral loads were 426±257 cells/µl and 3.84±1.32 log10 copies/ml, respectively. Both values were similar to those of the whole study group (see above).

Drug Resistance Mutation Analysis (DRMs)
As shown in Table 2, a total of 33 DRM sites were identified in 104/283 (36.7%) infected individuals with the HIVdb Algorithm in our study, including 6 DRMs that were also identified by CPR ver.6.0. The 6 DRMs identified by both the HIVdb Algorithm and CPR were M184V/I, T215S, K219N (NRTI), K103N (NNRTI), and D30N, M46I/L (PI). M184V is selected by 3TC/FTC and reduces susceptibility to these drugs more than 100-fold. M184I usually emerges before M184V and has a similar resistance profile. T215S is a known revertant of the resistance mutation T215Y/F. K103N is selected by NVP and EFV and reduces susceptibility to them by about 50-fold and 20-fold, respectively. D30N is an NFV-selected substrate-cleft mutation that causes high-level resistance to NFV. M46I/L are selected primarily by NFV, ATV, LPV and other drugs, and reduce susceptibility to them. Other mutations, such as A98G, V179D/T and Q58E, also need to be considered. A98G and V179D reduce NVP and EFV susceptibility. Q58E is a major mutation for tipranavir.

Identification of Transmission Clusters
To evaluate the global profile of local HIV-1 transmission, we split the sequence alignment into 6 sub-datasets and performed maximum-likelihood phylogenetic reconstruction on each sub-dataset to obtain the genetic distances. These sub-datasets were designated by their representative references, such as CRF08_BC. Of the 21 clusters identified, 17 chains were supported by contact tracing (including the sex, age, place of residence, occupation, and risk factors of individuals in the same cluster), while four clusters (clusters 3, 6, 8 and 9) remained ambiguous. Clusters 6 and 8 each consisted of 2 women living in the same city who reported acquiring HIV-1 through heterosexual contact. Clusters 3 and 9 each comprised 2 men living in the same city who reported contracting HIV-1 either through heterosexual contact or by an uncertain route.
Of the 21 clusters, most (12 clusters) were composed of 2 sequences, 5 clusters were composed of 3 sequences, and 4 clusters were composed of more than 3 sequences (cluster 15 consisted of 4 sequences; clusters 5 and 13 each consisted of 5 sequences; and the largest, cluster 1, consisted of 7 sequences). The new B'/C recombinant virus spreading in cluster 1 represented a potential regional outbreak of a novel B'/C recombinant, as reported elsewhere in Yunnan province [17,18]. The characteristics of patients in the 21 transmission clusters are summarized in Table 3.
We further investigated whether drug-resistant strains were transmitted within these 21 clusters. Six of the 21 clusters (28.6%) consisted of individuals infected with strains carrying the same DRMs: V179T in cluster 4, V90I in cluster 5, E138A in cluster 9, Q58E/T74S in cluster 14, V90I/V179D in cluster 15, and A98G in cluster 18.
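The cluster counts reported in this section can be cross-checked with a few lines of arithmetic. The short Python sketch below uses only figures stated in the text (cluster sizes, 283 sequenced samples, and 6 clusters sharing DRMs); the variable names are illustrative.

```python
# Cluster sizes as reported in the text: size -> number of clusters
cluster_sizes = {2: 12, 3: 5, 4: 1, 5: 2, 7: 1}
TOTAL_SEQUENCES = 283           # samples with successful pol sequencing
CLUSTERS_WITH_SHARED_DRMS = 6   # clusters whose members carry the same DRMs

n_clusters = sum(cluster_sizes.values())
n_clustered = sum(size * count for size, count in cluster_sizes.items())
pct_clustered = round(100 * n_clustered / TOTAL_SEQUENCES, 1)
pct_drm_clusters = round(100 * CLUSTERS_WITH_SHARED_DRMS / n_clusters, 1)

print(n_clusters, n_clustered, pct_clustered, pct_drm_clusters)
```

The totals agree with the figures given in the abstract and in this section: 21 clusters, 60 clustered sequences (21.2% of 283), and 28.6% of clusters associated with shared DRMs.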

Discussion
In Yunnan province, there were a total of 104981 cases of HIV-1 infection and 7671 newly diagnosed HIV-1-infected individuals in 2012 (http://yn.yunnan.cn/html/2012-11/27/content_2510358.htm). Although the nearly 200 newly diagnosed HIV-1-infected individuals per year in Banna prefecture represent only about 2% of new infections per year in Yunnan, our findings raise serious concerns about the severity of regional HIV-1 transmission. Firstly, the number of newly diagnosed infected individuals increased rapidly (by an average of 18% per year, deduced from the cases in 2005 and 2011). Heterosexual transmission remained consistently dominant in Banna. In addition, unsafe sexual behavior among injecting drug users could further increase the risk of HIV-1 transmission in the general population [19].
Secondly, our results suggest that the HIV-1 population evolved rapidly in Banna and that new sources of infection further increased its genetic complexity. The profile of HIV-1 evolution in Yunnan province (including the emergence of numerous novel circulating recombinant forms) has been described in previous reports [5,20,21,22]. Compared with our previous study in 2008 [1], CRF08_BC has replaced CRF01_AE as the most prevalent genotype in this region. A shift in the dominant HIV-1 subtype has also been described in other regions of Yunnan province [5,20,21]. Moreover, in the present study we identified 6 common genotypes and 3 URF genotypes (undefined B'/C, CRF01_AE/C, and CRF01_B'), compared with 3 common genotypes and only 1 URF reported in our previous study in the same region [1]. These results indicate that local HIV-1 genotype diversity has increased rapidly in this region.
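The average annual increase quoted above can be reproduced as a compound annual growth rate. The sketch below is a minimal Python illustration that assumes exactly 75 cases in 2005 and 200 cases in 2011 (the text says "nearly 200"), so the result is approximate; the function name is hypothetical.

```python
def annual_growth_rate(start_cases: float, end_cases: float, years: int) -> float:
    """Compound annual growth rate between two yearly case counts."""
    return (end_cases / start_cases) ** (1 / years) - 1


# 75 cases in 2005 and roughly 200 cases in 2011 (figures from the Introduction)
rate = annual_growth_rate(75, 200, 2011 - 2005)
print(f"{rate:.1%}")
```

Under these assumptions the rate comes out at just under 18% per year, consistent with the figure given in the text.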
A series of novel circulating recombinant forms have recently been identified in China, especially in Yunnan province, including CRF55_01B (identified from MSM in China) [23], CRF57_BC [7], CRF59_01B (from MSM in northeast China) [24], CRF61_BC (a CRF found among the heterosexual population in different regions of China) [25], CRF62_BC [11], CRF64_BC [12], and CRF65_cpx (the first novel HIV-1 second-generation inter-CRF in China) [8]. In the present study, we included the CRF sequences available in the HIV-1 sequence database in our reference pool. The breakpoints of our URFs clearly differed from those of any known CRF, implying that unidentified novel CRFs have emerged in this region. The evolving profile of HIV-1 molecular epidemiology also poses a challenge for antiretroviral therapy. The detection of DRMs associated with free HAART drugs (supplied under the NFATP program) has increased rapidly in the past 5 years. The rapid genotypic evolution and transmission of HIV-1 with DRMs in Banna might increase the risk of treatment failure in this region. Of the 283 sequences analysed, 102 (36.0%) showed DRMs, even when the 6 DRMs from the IAS-USA 2013 mutation list were excluded (owing to the coexistence of multiple DRMs). This proportion of untreated HIV-1 patients with DRMs was clearly higher than that reported previously (13.3%) [1]. Moreover, not only the rapidly increasing prevalence of resistant strains, but also the emergence of mutations conferring high-level resistance to the primary antiretroviral drugs, represents a serious challenge for epidemic control in this region. In this study, 4 (1.4%) cases carried the mutations M184V (2 cases), K103N (1 case) and D30N (1 case), which confer high-level resistance to 3TC/FTC, EFV/NVP and NFV, respectively, the primary drugs composing the first-line antiretroviral therapy regimen recommended in China.
Moreover, a high proportion (28.6%) of transmission clusters were found to be associated with DRM transmission. Poor adherence and compliance may be the most important contributors to the increasing prevalence of DRMs in this region. The "HIV-1 Education and Training Programs" need to be strengthened in Banna to ensure that infected individuals take their medicine as advised. Since as many as 54.8% of infected individuals were farmers and 25.9% were unemployed (Figure 1B), such poor educational status would frustrate our efforts to prevent and control HIV-1 dissemination. Thus, efforts to include people of low social status in the NFATP program will be key to controlling the HIV-1 outbreak in this region.