Molecular Epidemiology of HIV-1 in Jilin Province, Northeastern China: Emergence of a New CRF07_BC Transmission Cluster and Intersubtype Recombinants

Objective To investigate the HIV-1 molecular epidemiology among newly diagnosed HIV-1 infected persons living in the Jilin province of northeastern China. Methods Plasma samples from 189 newly diagnosed HIV-1 infected patients were collected between June 2010 and August 2011 from all nine cities of Jilin province. HIV-1 nucleotide sequences of gag P17–P24 and env C2–C4 gene regions were amplified using a multiplex RT-PCR method and sequenced. Phylogenetic and recombination analyses were used to determine the HIV-1 genotypes. Results Based on all sequences generated, the subtype/CRF distribution was as follows: CRF01_AE (58.1%), CRF07_BC (13.2%), subtype B’ (13.2%), recombinant viruses (8.1%), subtype B (3.7%), CRF02_AG (2.9%), subtype C (0.7%). In addition to finding CRF01_AE strains from previously reported transmission clusters 1, 4 and 5, a new transmission cluster was described within the CRF07_BC radiation. Among 11 different recombinants identified, 10 contained portions of gene regions from the CRF01_AE lineage. CRF02_AG was found to form a transmission cluster of 4 among local Jilin residents. Conclusions Our study presents a molecular epidemiologic investigation describing the complex structure of HIV-1 strains co-circulating in Jilin province. The results highlight the critical importance of continuous monitoring of HIV infections, along with detailed socio-demographic data, in order to design appropriate prevention measures to limit the spread of new HIV infections.


Introduction
In China, it is estimated that 780,000 people were living with HIV by the end of 2011, according to the "China AIDS Response Progress Report" [1]. China is experiencing a dynamic and complex HIV/AIDS epidemic. The reported predominant co-circulating HIV-1 genotypes are subtype B' and the circulating recombinant forms (CRFs) CRF01_AE, CRF07_BC, and CRF08_BC [2,3]. These three CRFs and the B' subtype constituted 92.8% of reported HIV-1 infections in China in 2006 based on our nationwide molecular epidemiology survey, and were detected in all high-risk groups, including former plasma donors (FPDs), injecting drug users (IDUs), men who have sex with men (MSM) and heterosexual transmissions [2]. Co-circulation of strains from different HIV-1 subtypes, CRFs, and unique recombinant forms (URFs) in these risk groups can create opportunities for the emergence of new, hybrid recombinants [4][5][6][7].
Jilin province is located in the center of northeast China, bordering Russia to the east, the Democratic People's Republic of Korea (DPRK, also referred to as North Korea) across the Yalu and Tumen rivers to the southeast, Liaoning province to the southwest, Inner Mongolia Autonomous Region to the west, and Heilongjiang province to the north. Jilin province encompasses an area of 187,400 square kilometers, and is divided into nine regions: Changchun (the capital of Jilin province), Jilin (located in the center of Jilin province), Siping, Tonghua, Baishan, Liaoyuan, Baicheng, Songyuan and Yanbian Korean Autonomous Prefecture. According to the sixth nationwide population census of 2010 [8], Jilin province had a population of 27,462,297, with a total of 44 ethnicities, including Han, Manchu, Mongol and Hui. Jilin's central location contributes to a persistent influx and outflow of people, mainly due to trade, labor, tourism, and education; this increasingly mobile population provides greater opportunity for the importation of new HIV-1 strains and for increased transmission.
The first known AIDS case in Jilin province was in a laborer infected through heterosexual contact with a female commercial sex worker (CSW) in Mombasa, Kenya in 1993; however, the HIV genotype of the infected patient was unknown. Jilin province experienced a low level of new HIV infections between 1993 and 1994, and then the number of reported HIV infections increased yearly [8]. A total of 1,477 HIV infections had been reported by the end of 2010 among all risk groups [9].
The current distribution of HIV-1 subtypes, CRFs, and recombinants in Jilin province is largely unknown. Therefore, a detailed HIV-1 molecular epidemiologic investigation to determine the genotypic distribution and the emergence and spread of new subtypes and CRFs is of great importance for understanding the dynamics of the HIV-1 epidemic in this region. In the present study, we performed an HIV-1 molecular epidemiological investigation of 189 newly diagnosed HIV-infections in Jilin province.

Study subjects and dataset information
A total of 189 newly diagnosed HIV-infected persons identified between January 2008 and December 2010 at local voluntary counseling and testing (VCT) sites, sentinel surveillance sites, and medical institutions in Jilin province agreed to be enrolled in this study. The participants came from various cities and risk groups. Whole blood samples were collected in 2010 (n = 93) and 2011 (n = 96); plasma was separated and stored at -80°C. The study was approved by the institutional review board of the National Center for AIDS/STD Control and Prevention, China CDC. Written informed consent, as well as a socio-demographic questionnaire, was obtained from each participant in this study.
The socio-demographic data collected included sex, age, ethnicity, marital status, education background, year of diagnosis, year of sampling, site of sampling, CD4+ T cell count and risk group.

HIV-1 RNA extraction, amplification and sequencing
Plasma samples from 189 newly diagnosed HIV-infected participants were collected and submitted for RT-PCR and sequencing. Viral RNA was extracted from 280 µl of plasma using the QIAamp Viral RNA Mini kit (Qiagen, Valencia, California, USA) following the manufacturer's instructions [11,12]. The extracted viral RNA was subjected to a multiplex reverse transcription polymerase chain reaction (RT-PCR) to obtain the nucleotide sequences of the HIV-1 gag P17-P24 (HXB2: positions 781-1836, 1056 base pairs (bp)) and env C2-C4 (HXB2: positions 7002-7541, 540 bp) gene regions as previously described [13].
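The HXB2 coordinates quoted for the two amplicons are inclusive, so a fragment length is the end position minus the start position plus one. The following is a minimal sketch of that arithmetic; the function names are illustrative and are not taken from the cited protocol [13].

```python
def amplicon_length(start: int, end: int) -> int:
    """Length in bp of a region given inclusive, 1-based coordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def amplicon_end(start: int, length: int) -> int:
    """Inclusive end coordinate of a region of the given length."""
    if length < 1:
        raise ValueError("length must be positive")
    return start + length - 1


# gag P17-P24: HXB2 positions 781-1836
gag_len = amplicon_length(781, 1836)

# env C2-C4: a 540 bp fragment starting at HXB2 position 7002
env_end = amplicon_end(7002, 540)

print(gag_len)  # 1056
print(env_end)  # 7541
```

The gag coordinates reproduce the stated 1056 bp, and a 540 bp env fragment starting at position 7002 ends at position 7541.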
The positive PCR products were purified using the QIAquick Gel Extraction Kit (Qiagen, Valencia, California, USA) and sequenced directly on an ABI 3730XL automated sequencer using BigDye terminators (Applied Biosystems, Foster City, California, USA) by Beijing Biomed Technology Development Co., Ltd (Beijing, China). In addition to these 48 reference strains, we used 62 reference sequences from viruses that represent subtypes/CRFs commonly identified in China, as follows: subtypes A1 (1), B (6), B' (7) and C (2), and CRF07_BC (5), CRF01_AE (39) and CRF08_BC (2). Both gag and env sequences from these 110 reference strains were available. Following alignment, manual adjustments were made, taking protein coding sequences into consideration, using BioEdit software [14,15]. Neighbor-joining phylogenetic trees were constructed using the Kimura 2-parameter model of evolution, including both transitions and transversions [16], implemented in the MEGA 5.0 software package [17]. The reliability of the tree structure or branching order was evaluated by bootstrap analysis with 1000 replicates [18]. In order to better display the phylogenetic trees of the HIV-1 gag P17-P24 and env C2-C4 regions, the sequences were separated into two different neighbor-joining trees, one containing only the subtype A, CRF01_AE and CRF02_AG-related sequences and the other representing the B/B', C and CRF07_BC-related sequences. If there was evidence of recombination in any of the sequences (i.e., discordant gene regions or an outlier position in a tree), they were further analyzed using the jumping profile Hidden Markov Model program (jpHMM; http://jphmm.gobics.de/) [19]. In order to confirm the possible recombinant structures and identify the breakpoints of the potential HIV-1 recombinants, bootscanning analysis was performed using the Simplot 3.5.1 software package (window size of 300 bp, step size of 20 bp) [20].
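To illustrate the distance measure underlying the neighbor-joining trees, the Kimura 2-parameter distance between two aligned sequences can be computed from the proportion of transitional differences P and transversional differences Q as d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q). The sketch below is a simplified stand-alone illustration, not the MEGA implementation; the function name is hypothetical, and skipping any column with a gap or ambiguous base is an assumption made here for simplicity.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Assumption: columns containing a gap or ambiguous base in either
    sequence are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: ignore this column
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


# Toy example: one transition (C -> T) across 10 aligned sites
d = k2p_distance("ACGTACGTAC", "ACGTACGTAT")
print(round(d, 4))  # 0.1116
```

Because the correction weights transitions and transversions separately, a single transition across 10 sites yields a distance slightly above the raw 10% difference.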

Nucleotide sequence accession numbers
All the nucleotide sequences obtained in this study were submitted to GenBank under accession numbers KF818784-KF818917 for the HIV-1 gag P17-P24 gene region and ...

Table 2. HIV-1 genotype information of 12 subjects of Jilin province with new recombinants (columns: Sequence ID, Site of sampling, Year of sampling).

Results
Demographic and epidemiologic information on the study participants
Plasma samples from a total of 189 newly diagnosed HIV-1 infected persons were collected from local voluntary counseling and testing (VCT) sites, sentinel surveillance sites and medical institutions in Jilin province and were used for the HIV-1 genetic analysis. For each sample, amplification and sequencing of the gag P17-P24 and env C2-C4 gene regions were attempted. From the 189 plasma samples, 134 gag P17-P24 and 121 env C2-C4 gene sequences were obtained; 119 samples had both gag P17-P24 and env C2-C4 sequences. In total, 136 samples were genotyped, a success rate of 72.0% (136/189). The failure of PCR amplification or sequencing was likely related to factors such as poor transportation and storage conditions, low plasma volumes, low viral load, repeated freezing and thawing, and poor amplification and/or sequencing primer specificity.
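The number of genotyped samples follows from the sequence counts by inclusion-exclusion: samples with a gag sequence plus samples with an env sequence, minus those counted twice because both regions were obtained. A short sketch of the arithmetic, using only the figures reported above (the variable names are illustrative):

```python
total_samples = 189
gag_sequences = 134   # samples with a gag P17-P24 sequence
env_sequences = 121   # samples with an env C2-C4 sequence
both_regions = 119    # samples with both regions sequenced

# Inclusion-exclusion: samples with at least one region sequenced
genotyped = gag_sequences + env_sequences - both_regions

gag_only = gag_sequences - both_regions
env_only = env_sequences - both_regions
success_rate = round(100 * genotyped / total_samples, 1)

print(genotyped)      # 136
print(gag_only)       # 15
print(env_only)       # 2
print(success_rate)   # 72.0
```

By this count, 15 samples were genotyped from gag alone and 2 from env alone.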
The demographic and epidemiologic data are summarized in Table S1. Among the 189 participants, 81.5% were males. The mean age of the participants was 37.

Discussion
Here we describe the most comprehensive HIV-1 molecular epidemiologic investigation, to date, on the characteristics and trends of the HIV/AIDS epidemic in the Jilin province of northeastern China between 2010 and 2011. Jilin province, as well as most of China, is experiencing an increasingly complex HIV epidemic. We identified subtypes B, B', and C, CRF01_AE (including lineages CRF01-1, CRF01-4 and CRF01-5), CRF02_AG, CRF07_BC (CRF07-1 and the newly identified lineage CRF07-2), and recombinant viruses; CRF01_AE was the predominant genotype in both heterosexual and MSM sequences.
In a previous nationwide study [21], we identified 7 unique, strongly supported phylogenetic sub-clusters or lineages within the CRF01_AE radiation. These lineages were partially segregated by geographic regions of China and different risk groups. The first three lineages (CRF01-1 through CRF01-3) were prevalent among IDUs and heterosexuals from the south and southwest, lineages CRF01-4 and CRF01-5 were primarily found in MSM from northern cities, and sub-clusters CRF01-6 and CRF01-7 were found in heterosexuals from southern China. In the current study, we identified three of these CRF01_AE lineages. CRF01-4 and CRF01-5 were found among sexual transmissions, both heterosexual and MSM, while the one CRF01-1 virus was identified in a heterosexual male.
Previous studies have found a large and strongly supported transmission sub-cluster of CRF07_BC strains among MSM in Beijing, Liaoning, and Shijiazhuang within the CRF07 radiation [22][23][24]; we designated this CRF07_BC sub-cluster as CRF07-1. In the current study, we found that the routes of transmission for CRF07-1 had expanded to include heterosexuals as well as MSM in Jilin province. Furthermore, a new statistically supported, monophyletic transmission cluster of CRF07_BC (designated CRF07-2) was identified among seven MSM from Changchun and one MSM from Songyuan, which is geographically adjacent to the northwestern portion of Changchun. Although the sample sizes are still fairly small relative to the epidemic, CRF07-1 has spread throughout much of Jilin province and is transmitted by both heterosexuals and MSM, while CRF07-2 has only been found among MSM in Changchun and neighboring Songyuan.
Originally, subtype B' infections in China were identified primarily among IDUs in Yunnan province and subsequently among FPDs and heterosexuals in inland China, due to unhygienic commercial plasma collections during the early to mid-1990s, after which the practice was banned [25]. The majority of subtype B' infections are now found in the heterosexual population [2]. This shift in risk groups is most likely the result of national policies in China that strictly regulated blood donations, which began in the late 1990s [26], and the fact that HIV-infected FPDs were spreading subtype B' viruses through heterosexual transmission to their sex partners. Indeed, we found no subtype B' infections among our 83 MSM from Jilin province. While these comparisons of genotypes and risk groups are interesting, and appear to indicate certain trends, it must be highlighted that the samples were collected by convenience and do not represent a random sampling. An unknown degree of sampling bias is likely.
In the present study, we also detected a new statistically supported monophyletic cluster of 4 CRF02_AG sequences uniquely associated with Korean ethnicity and heterosexuals near the Jilin-North Korean border (Yanbian Korean Autonomous Prefecture). The exception was an IDU in Changchun, who was also a citizen of Yanbian Korean Autonomous Prefecture but was imprisoned in Changchun. A previous study also showed that CRF02_AG predominated among heterosexuals in China, accounting for at least 68.8% of all detected CRF02_AG in a nationwide cross-sectional study of the HIV-1 epidemic in China. In light of the fact that the four study participants were all of Korean ethnicity, it is tempting to speculate that the CRF02_AG cluster detected in the present study originated in Yanbian Korean Autonomous Prefecture. This is the first report of CRF02_AG forming a transmission cluster in a local resident population in China. In the past, CRF02_AG had only been found in migrant workers from West Africa [27,28].
Given the complex mix of subtypes and CRFs that co-circulate in China, it was not surprising to find the emergence of gag and gag/env recombinants. All but 1 (90.9%) of the 11 recombinants we found contained at least a portion of CRF01_AE, which is consistent with CRF01_AE being the predominant genotype among both heterosexual and MSM transmissions. Overall, the highest HIV-1 genotype diversity was observed among the heterosexual population; however, we only had three IDU samples. With a larger sampling of IDUs, we might have found greater genotype diversity in this risk group too, due to the presence of both needle sharing and sexual routes of infection. In fact, from the three Jilin IDU sequences, we found three different genotypes (subtype C, CRF07_BC, and CRF02_AG), supporting the hypothesis that if more samples were available from IDUs, we might have found a genotype diversity at least as high as that seen in heterosexual transmissions.
In conclusion, our study indicates that the HIV epidemic in Jilin province is very complex; effective control will require an understanding of the dynamics that drive the spread of HIV through social/sexual transmission networks [57,58]. Future molecular epidemiologic studies need to focus on collecting detailed behavioral and socio-demographic information that will allow the characterization of common behavioral risk factors for each of the identified social/sexual transmission clusters. Since transmission clusters significantly drive the HIV-1 epidemic in China, characterizing the specific, common risk behaviors for these networks will help target intervention strategies.