Molecular Epidemiology of Human Immunodeficiency Virus Type 1 in Guangdong Province of Southern China

Background Although the outbreak of human immunodeficiency virus type 1 (HIV-1) in Guangdong has been documented for more than a decade, the molecular characteristics of this regional epidemic have remained unknown. Methodology/Principal Findings Using sequencing of the HIV-1 pol and env genes and phylogenetic analysis, we performed a molecular epidemiologic study in a representative subset (n = 200) of the 508 HIV-1-seropositive individuals followed up at the center for HIV/AIDS care and treatment of Guangzhou Hospital of Infectious Diseases. Both pol and env were successfully sequenced in 157 samples (54.1% from heterosexually infected adults, 20.4% from needle-sharing drug users, 5.7% from recipients of blood transfusion, 1.3% from men who have sex with men, and 18.5% with an unknown route of infection). Within these pol/env sequences we identified 105 (66.9%) CRF01_AE, 24 (15.3%) CRF07_BC, 9 (5.7%) B’, 5 (3.2%) CRF08_BC, 5 (3.2%) B, 1 (0.6%) C, 3 (1.9%) CRF02_AG, and 5 (3.2%) inter-region recombinants. Thirteen (8.3%) samples (3 from treatment-naïve patients, and 6 and 5 from patients who had received antiretroviral treatment [ART] for 1–21 weeks and ≥24 weeks, respectively) showed mutations conferring resistance to nucleoside/nonnucleoside reverse transcriptase inhibitors or protease inhibitors. Among 63 ART-naïve patients, 3 (4.8%) showed single or multiple drug resistance mutations. Phylogenetic analysis showed 8 small clusters (2–3 sequences/cluster) involving only 17 (10.8%) sequences. Conclusion/Significance This study confirms that sexual transmission of the dominant CRF01_AE strain is a major risk factor for the current HIV-1 outbreak in Guangdong’s general population. Transmission of drug-resistant variants is starting to emerge in this region.


Introduction
Guangdong province, located on the southern coast of China with 104.3 million registered permanent residents in 2011 (http://www.stats.gov.cn/zgrkpc/dlc/yw/t20110428_402722384.htm), was the first region of the country opened to the world, beginning in 1978. The first HIV-1 case was diagnosed in 1990 in a traveler infected overseas. HIV-1 infection was first confirmed in native intravenous drug users (IDU) in 1996 [1,2]. The HIV-1 epidemic then grew rapidly over the next 10 years (on average 160.3% each year from 1997 to 2007), followed by a significant decrease due to the prevention and control measures taken by the Chinese government [3]. According to statistics from the Department of Health of Guangdong province, AIDS had been the first factor causing death for consecutive 11 [4], and HIV-1-related mortality was as high as 27.9 per 100 person-years in Guangdong province in 2009 (http://www.gdwst.gov.cn/a/yiqingxx/201002147510.html). However, it remains unknown whether such a high regional AIDS mortality rate is caused by a particular HIV-1 outbreak (such as the emergence of new or drug-resistant variants) in Guangdong.
In the present study, we conducted a molecular epidemiological investigation in a subset of 508 HIV-1-seropositive individuals followed up from January to September 2009 at the center for AIDS prevention and treatment of Guangzhou Hospital of Infectious Diseases (GHID). This center is the only one authorized to implement the National Free Antiretroviral Treatment Program (NFATP) in Guangzhou city (the capital of Guangdong province) and has treated around 90% of HIV-1 patients in Guangdong province in recent years.

Participants and Specimens
From January to September 2009, a total of 508 patients (462 residents from 19 of the 21 cities of Guangdong province and 46 residents from cities outside Guangdong) (Fig. 1a) who visited the center for AIDS prevention and treatment of GHID participated in this study. All patients were required to complete standardized questionnaires (covering sex, age, risk factors, mode of transmission, occupation, geographic location, treatment, etc.) as specified by the national HIV/AIDS surveillance system and sentinel surveillance program [5]. Of the 508 patients recruited, 357 (70.3%) were currently receiving highly active antiretroviral therapy (HAART). The regimens combined any 2 of 4 NRTIs (zidovudine [AZT], didanosine [ddI], stavudine [d4T], or lamivudine [3TC]) with 1 NNRTI (nevirapine or efavirenz). Twenty patients (3.9%) with tuberculosis (TB) or opportunistic infections (OI) were receiving anti-TB or anti-infection therapy (they were not on HAART and were to start free HAART once the TB or OI was controlled). Finally, 4 patients (0.8%) with a higher CD4 count (>350 cells/µl) received free Chinese-medicine treatment covered by the national health insurance program.
As planned in the study design, 200 patients were selected randomly and enrolled in the molecular epidemiology and drug resistance survey after giving consent (written consent was obtained from all patients). The Institutional Ethics Committee of Guangzhou University of Chinese Medicine (GUCM) approved the study protocol. The investigations were conducted according to the principles expressed in the Declaration of Helsinki. A total of 5 ml of EDTA-treated whole blood was taken from each patient, and all samples were sent to our laboratory at GUCM. The blood samples were used immediately for routine blood count and CD4+ T-cell count measurements and to separate plasma and peripheral blood mononuclear cells (PBMC). Plasma samples were stored at −80°C, while PBMC samples were re-suspended in storage buffer containing 10% dimethyl sulfoxide (DMSO) (Sigma-Aldrich Corporation, St Louis, Missouri, USA) and 50% fetal bovine serum (Invitrogen Corporation, Grand Island, New York, USA), kept at −80°C for 48 hours, and then stored in liquid nitrogen.

Viral Load Measurement
All plasma samples were thawed at the same time and were used for viral RNA measurement using the Food and Drug Administration (FDA)-approved Amplicor HIV-1 Monitor Test kit (version 1.5) (Roche Molecular Systems, Inc., Branchburg, New Jersey, USA) according to the manufacturer's instructions.

Sequencing of HIV-1 pol and env Genes
In general, HIV-1 subtypes, drug resistance mutations, and transmission clusters can be detected by sequencing the pol gene. However, subtypes CRF01_AE and CRF15_01B can only be differentiated by the sequence of the env region. Moreover, sequencing pol and env simultaneously helps to identify potential new inter-subtype recombinants [6,7,8].
Initially, viral RNA was extracted from the patients' plasma (150 µl) using the QIAamp Viral RNA Mini kit (Qiagen, Valencia, California, USA) according to the manufacturer's instructions. The viral RNA was then subjected to reverse transcription polymerase chain reaction (RT-PCR). The sensitivity and specificity of our amplification system for the different HIV-1 subtypes had been validated previously [9,10]. By increasing the input sample volume to 1 ml and concentrating it by ultracentrifugation, the limit of detection of our PCR-based sequencing assay could reach 10 copies/ml of HIV-1 RNA. Because amplification failed frequently (>50%) even when 1 ml of plasma was concentrated by ultracentrifugation, patients' PBMC depleted of CD8+ cells with antibody-conjugated magnetic beads (Miltenyi Biotec Inc, Auburn, CA, USA) were used to isolate HIV-1 from all patients. About 100 µl of culture supernatant collected at the peak of viral production was used for HIV-1 RNA extraction. On the basis of the reference sequences obtained from the National Institutes of Health/National Institute of Allergy and Infectious Diseases (NIH/NIAID)-funded HIV database, the genetic subtypes were identified in the pol gene (1864 base pairs), spanning the protease and reverse transcriptase regions, and in the V3-V5 region of the env gene (660 base pairs) as previously described [7,11]. The newly obtained sequences were aligned using Muscle [12] as implemented in Seaview v4.3 [13] with reference sequences representing the overall genetic variability of HIV-1 group M. We included all pure subtypes, sub-subtypes, and Asian circulating recombinant form (CRF) references (CRF01_AE, CRF07_BC, CRF08_BC, CRF15_01B, CRF33_01B, CRF34_01B, CRF48_01B and CRF51_01B), as well as some other CRFs that are prevalent elsewhere but have been reported to circulate at low levels in Asia, such as CRF02_AG and CRF13_cpx. Phylogenetic analysis was first conducted for each new sequence individually.
The pol and env fragments were investigated for recombination using Simplot version 3.5.1 software [14]. Genotyping data were used to identify minor and major resistance mutations in the protease and reverse transcriptase genes, based on the most recently updated online Stanford Resistance Database tool: HIVdb program-Genotypic Resistance Interpretation Algorithm (version 6.1.0, http://sierra2.stanford.edu/sierra/servlet/JSierra?action=sequenceInput). Additionally, in accordance with the latest WHO recommendations, resistance mutations were also evaluated against the most recently updated list for surveillance of transmitted drug-resistant strains in untreated patients (ver. 6.0, http://cpr.stanford.edu/cpr.cgi).

Phylogenetic Reconstruction
Phylogenetic inter-relationships among viral sequences were estimated using PHYML with the minimal number of reference sequences, i.e., excluding the CRFs not represented among the samples of the present study [15,16]. Phylogenies were inferred using a general time-reversible model of nucleotide substitution, an estimated proportion of invariant sites, and gamma-distributed rates among sites. The best of the SPR and NNI heuristic options was selected to search the tree space, and bootstrap values from 1000 replicates were used to assess confidence in the topology. Transmission clusters were defined on the maximum likelihood topologies by high bootstrap values (>98%) with 1000 resamplings and short branch lengths (genetic distances <0.015) of HIV-1 pol gene sequences [6,7].
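The two cluster criteria (bootstrap support and genetic distance) can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the uncorrected pairwise p-distance stands in for the maximum likelihood branch lengths, the bootstrap supports are assumed to have been read from the tree beforehand, and the function names, toy sequences, and support values are all hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    This is a simple stand-in for the ML genetic distance, not the same quantity.
    """
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def candidate_pairs(aligned, bootstrap, max_distance=0.015, min_bootstrap=98.0):
    """Return (id_a, id_b, distance) for pairs meeting both thresholds.

    aligned:   dict mapping sample id -> aligned sequence
    bootstrap: dict mapping frozenset({id_a, id_b}) -> bootstrap support (%)
               of the node joining the pair (hypothetical input taken from the tree)
    """
    hits = []
    for a, b in combinations(sorted(aligned), 2):
        distance = p_distance(aligned[a], aligned[b])
        support = bootstrap.get(frozenset((a, b)), 0.0)
        if distance < max_distance and support > min_bootstrap:
            hits.append((a, b, round(distance, 4)))
    return hits


def mutate(seq, positions):
    """Toy helper: substitute the base at each given position."""
    out = list(seq)
    for i in positions:
        out[i] = "A" if out[i] != "A" else "C"
    return "".join(out)


# Invented 200-nt toy alignment: P2 differs from P1 at 2 sites, P3 at 10 sites.
base = "ACGT" * 50
toy = {"P1": base, "P2": mutate(base, [10, 120]), "P3": mutate(base, range(0, 200, 20))}
support = {frozenset(("P1", "P2")): 99.0, frozenset(("P1", "P3")): 99.0}
print(candidate_pairs(toy, support))
```

In this toy example only the P1/P2 pair passes both filters; P1/P3 has high support but a distance of 0.05, which is why both thresholds are applied together.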

Statistical Analysis
Differences in the CD4+ T-cell count and viral load among different subgroups of patients were assessed using the nonparametric Mann-Whitney test.
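For readers unfamiliar with the test, the following is a minimal standard-library Python sketch of a two-sided Mann-Whitney comparison. It uses the normal approximation with a tie correction and no continuity correction, which is an assumption of this sketch; the paper does not state which software or variant was used. The CD4 values in the usage example are invented and are not study data.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p) where U is the smaller of the two U statistics.
    """
    n1, n2 = len(x), len(y)
    # Pool the observations, remembering which group (0 or 1) each came from.
    pooled = sorted((value, group) for group, values in enumerate((x, y)) for value in values)

    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        t = j - i
        tie_term += t**3 - t
        i = j

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u_x, n1 * n2 - u_x)

    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p


# Invented CD4 counts (cells/µl) for two hypothetical subgroups.
short_treatment = [95, 120, 142, 160, 210]
long_treatment = [250, 310, 335, 372, 427]
u_stat, p_value = mann_whitney_u(short_treatment, long_treatment)
print(f"U = {u_stat}, p = {p_value:.4f}")
```

With small samples an exact test is generally preferable to the normal approximation; the approximation is used here only to keep the sketch short.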

Accession Numbers
All sequences analyzed in the present study are deposited in EMBL under the accession numbers HE590887 to HE591043 and HE591086 to HE591242 for the pol and env sequences, respectively.

Viral Loads and CD4+ T-cell Counts
The mean (±SD) CD4 count and the geometric mean (±SE) viral load in the 508 patients were 284 ± 213 cells/µl and 5.36 ± 2.66 log10 copies/ml, respectively. Significantly higher viral loads were observed in the 127 (25%) untreated patients (7.27 ± 1.89 log10 copies/ml) than in the 357 (70.3%) HAART-treated patients (4.49 ± 1.74 log10 copies/ml) (P < 0.01), while CD4 counts were similar in untreated (293 ± 122 cells/µl) and HAART-treated (283 ± 213 cells/µl) patients (P > 0.05). The highest viral load (8.89 ± 3.23 log10 copies/ml) and the lowest CD4 count were observed in the 20 patients receiving anti-TB or anti-infection therapy. The 357 HAART-treated patients had been on treatment for an average of 19.3 months (m) (range, 1 week to 68.6 m), and 35.9% (128/357) of them had been treated for less than 6 months. Among the 229 patients who had been treated for longer than 6 months, 12 (5.2%) had a plasma viral load higher than 200 copies/ml (indicating clinical virological failure of HAART). The CD4 count (mean ± SD) was significantly (P < 0.01) higher in patients with longer HAART duration (262 ± 169 cells/µl for 6-12 m, 335 ± 206 cells/µl for 12-24 m, 372 ± 201 cells/µl for 24-36 m, and 427 ± 231 cells/µl for >36 m) than in patients treated for less than 6 months (142 ± 133 cells/µl).

HIV-1 Genotyping and Drug Resistance Mutation Analysis
Of the 200 samples submitted to partial pol and env gene sequence analysis, 157 yielded both pol and env sequences. In addition, 21 samples yielded pol sequences but failed amplification by env-specific RT-PCR, another 9 yielded env sequences but failed amplification by pol-specific RT-PCR, and the remaining 13 failed amplification by both env- and pol-specific RT-PCR. The characteristics of these 157 participants, including their ages, risk factors, occupations, and treatment statuses, were similar to those of all 508 patients (Fig. 1b). Of the 157 samples analyzed, 152 (96.8%) had consistent subtypes in pol and env (66.9% CRF01_AE, 15.3% CRF07_BC, 5.7% B', 3.2% CRF08_BC, 3.2% B, 0.6% C, and 1.9% CRF02_AG), and 5 (3.2%) showed distinct genotypes in env and pol. Among these 5, 3 (1.9%) were inter-subtype recombinants (CRF02_AG/CRF01_AE, CRF07_BC/CRF01_AE, and CRF08_BC/B) and 2 (1.3%) were unique recombinant forms (URFs) in pol (Table 1). The characteristics of the patients showing a potential inter-subtype recombinant are summarized in Table 2. No obvious geographic correlation was observed (data not shown).
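The subtype percentages can be recomputed from the counts given in the abstract, which is a quick way to check that they are internally consistent. The short Python sketch below does only that arithmetic; the dictionary key "discordant pol/env" is a label chosen here for the 5 samples with distinct genotypes in the two genes.

```python
# Counts per genotype among the 157 samples with both pol and env sequences,
# as reported in the abstract.
counts = {
    "CRF01_AE": 105,
    "CRF07_BC": 24,
    "B'": 9,
    "CRF08_BC": 5,
    "B": 5,
    "C": 1,
    "CRF02_AG": 3,
    "discordant pol/env": 5,  # label chosen for this sketch
}

total = sum(counts.values())
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Samples whose pol and env genotypes agree.
concordant = total - counts["discordant pol/env"]
concordant_share = round(100 * concordant / total, 1)

for name, share in shares.items():
    print(f"{name:20s} {counts[name]:4d}  {share:5.1f}%")
print(f"{'concordant':20s} {concordant:4d}  {concordant_share:5.1f}%")
```

The counts sum to 157, and the 152 concordant samples correspond to 96.8% of the total.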

Identification of Transmission Clusters
Of the 157 pol sequences analyzed, only 17 (10.8%) segregated into 8 small transmission clusters defined by PhyML (Fig. 2). Each cluster comprised sequences from 2 patients, except cluster 4, which comprised 3. The only transmission cluster with drug resistance mutations (cluster 1; patients 440# and 437#) is shown in Table 3. Of the 8 clusters identified, 6 transmission chains were confirmed by contact tracing (taking into account the sex, age, residence, and occupation of patients in the same cluster), while two clusters (clusters 5 and 8) remained ambiguous: cluster 5 comprised one child (#456) who acquired HIV-1 from blood transfusion and one man (#408) who was not the source of the contaminated blood supply; cluster 8 comprised 2 men (patients 197# and 198#) who both reported acquiring HIV-1 by the heterosexual route in the same city (Guangzhou). The characteristics of the 17 patients whose pol sequences fell within the 8 transmission clusters are summarized in Table 4.

Discussion
By October 2011, 32,195 people had been reported as HIV-1-infected in Guangdong, of whom 9808 were diagnosed with AIDS and 7083 had died (http://www.gdwst.gov.cn/a/zwxw/201111309428.html). Although the NFATP has been implemented since 2002 (initially for former plasma donors and then for the whole population) [18], overall coverage of the treatment-eligible population reached only 63.4% by 2009, and it remained particularly low (42.7%) in IDUs compared with sexually infected patients (61.7%) and those infected through plasma donation and blood transfusion (80.2%) [4]. Since 18.7% of infected individuals were IDUs, as many as 66.1% were unemployed, and 17.9% were farmers (Fig. 1b), people of low social status clearly faced a higher risk and would find it more difficult to access the NFATP, which delays the start of HAART. The fact that new HIV-1 cases are diagnosed mainly at an advanced stage of infection (http://www.gdwst.gov.cn/a/yiqingxx/201002147510.html) could be the principal cause of the overall high mortality in this region.
Table 3. Drug resistance mutations in HAART-treated and drug-naïve patients in Guangdong, China.
Although the emergence of drug-resistant variants might account for the high regional mortality, this seems unlikely given our findings: 10 (10.6%) HAART-treated patients had mutations conferring resistance to the first-line NRTIs and NNRTIs, but these correlated with clinical failure to control the viral load in only 2 (2.1%) cases. Although repeated measurements should be conducted to exclude the possibility that we missed minority resistant strains, other factors such as poor drug adherence might also cause virological failure, as reported previously [19].
It is worth noting that infection with drug-resistant HIV-1 variants does not necessarily cause clinical virological failure: even when the virus is resistant to all administered drugs, some patients may maintain a low viral load (<200 copies/ml) [20]. Intriguingly, four patients (437, 182, 211, and 326) with high-level multiple drug resistance mutations had an undetectable viral load (<50 copies/ml). It also seems plausible that the replication of these multidrug-resistant strains was controlled in vivo by the patients' CD8+ T cells, since the viruses did replicate well in PBMCs after depletion of CD8+ T cells ex vivo. It would be interesting to follow up these patients to verify whether their viral loads remain undetectable after withdrawal of HAART.

We found 12 genotypes of HIV-1 strains, including 7 subtypes (B, B', C, CRF01_AE, CRF02_AG, CRF07_BC, CRF08_BC) and 5 inter-subtype recombinants, of which 3 were mosaic in pol. CRF01_AE, subtype B and subtype C are dominant in Asia [21]. Subtype C, found predominantly in India, is the most prevalent subtype globally. The subtype B' epidemic in China is thought to have been founded by a single lineage of pandemic subtype B around 1985 [22]. CRF07_BC and CRF08_BC have been spreading through IDUs and currently circulate in most of mainland China [23,24,25,26]. CRF02_AG is the predominant molecular form of HIV-1 in some western African countries [27] and has also been reported in China [28]. Several recombinants were also identified in our study, and their full-length genomes need to be sequenced, as they could potentially represent a new CRF. Although we cannot exclude the possibility that different fragments (pol or env) were amplified from distinct HIV-1 strains in an individual patient, this should be very rare, since any infection with dual or multiple HIV-1 quasi-species results in the outgrowth of a dominant HIV-1 strain during primary infection, which may be a new recombinant strain formed quickly after dual or multiple infection [29]. In fact, the genotyping results for the pol and env fragments were generated from dominant sequences amplified from the major HIV-1 strain of each patient. Dual or multiple HIV-1 infections can generally be confirmed by clonal sequence analysis of viral quasispecies in seropositive individuals. The high diversity of HIV-1 in Guangdong implies multiple introductions of HIV-1 strains into this region, one of the most active in world trade in China.
Phylogenetic analysis of viral gene sequences has been used successfully to establish direct or indirect epidemiological links in geographically defined populations with acute/primary or chronic HIV-1 infection [6,30,31,32,33]. In our study, heterosexual contact was the dominant route of HIV-1 transmission in Guangdong, as reported by others [3], and 8 small transmission chains were identified by phylogenetic analysis of pol, which is consistent with the characteristics of heterosexual transmission reported previously by our group [7,11]. Of the 8 clusters identified, 6 heterosexual transmission chains were confirmed by contact tracing. The two exceptions were cluster 8, in which both men firmly declared a heterosexual route of transmission, and cluster 5, in which one child (#456) acquired HIV-1 from blood transfusion while the man (#408) was not the source of the contaminated blood supply. In addition, we confirmed 1 transmission chain (cluster 1) involving 2 patients infected by HIV-1 with identical drug resistance mutations. Moreover, phylogenetic reconstruction of transmission may provide evidence of dual (or super) infection. For example, patient 399 (in cluster 7), in contrast to his heterosexual partner (patient 92), who remained infected with a pure subtype (CRF02_AG), was most likely super-infected by CRF01_AE (the dominant strain in the region), leading to a new CRF02_AG/CRF01_AE recombinant. Furthermore, patient 197 (in cluster 8) was also likely super-infected by diverse strains of HIV-1 (CRF08_BC/B) (confirmed by repeated PCR and sequencing), resulting in a new inter-subtype recombinant (Table 4). Thus, dual or super infections might contribute to the diversity of HIV-1 subtypes in the region.
Taken together, our findings demonstrate that HIV-1 CRF01_AE is the major subtype among HIV-1-infected patients in Guangdong. Although not common, transmission of drug-resistant strains does occur. The major risk factors for HIV-1-related mortality were most likely not receiving HAART and having a low CD4 count (<50 cells/µl) when first declared eligible for treatment, as reported earlier [4]. Given that only about 323,252 (43.7%) of the estimated 740,000 HIV-infected individuals living in China at the end of 2009 had been identified [34], it is possible that the high proportion (>50%) of undiagnosed people with advanced chronic HIV-1 infection (associated with high viral loads) is a major source of the current outbreak of sexually acquired HIV-1 in China. Thus, there is an urgent need for earlier HIV-1 diagnosis and better access to treatment, so as to decrease HIV-1-related mortality and limit the source of sexually transmitted virus.