Identification and Characterization of a Novel HIV-1 Circulating Recombinant Form (CRF59_01B) Identified among Men-Who-Have-Sex-with-Men in China

The HIV-1 epidemic among men-who-have-sex-with-men (MSM) continues to expand in China. A large-scale national survey we conducted on HIV-1 strains among MSM in 11 provinces in China from 2008 to 2013 (n = 920) identified a novel transmission cluster consisting of six strains (0.7%) that belonged to a new circulating recombinant form (designated CRF59_01B). CRF59_01B contains two subtype B segments of U.S.-European origin (in the pol and vpu-env regions) in a CRF01_AE backbone. CRF59_01B is the second CRF (after CRF55_01B) circulating primarily among MSM in China. CRF59_01B occurs at a low frequency (less than 1%), but it was detected in four different provinces/regions in China: Liaoning (northeast China) (n = 3); Hunan (central China) (n = 1); Guangdong (south China) (n = 1); Yunnan (southwest China) (n = 1). One additional recombinant strain was detected in a heterosexual individual in Liaoning province but is not the focus of this paper. Bayesian molecular clock analyses indicate that CRF59_01B emerged as a result of recombination between CRF01_AE and subtype B around the year 2001. The emergence of multiple forms of recombinants and CRFs reflects the ever-increasing contribution of homosexual transmission in China's HIV epidemic and indicates an active HIV transmission network among MSM in China.


Introduction
A high level of genetic diversity is the hallmark of human immunodeficiency virus type 1 (HIV-1). HIV-1 is classified into four groups: M, O, N and P [1,2]. HIV-1 group M strains are responsible for the vast majority of HIV infections worldwide and consist of 11 subtypes and sub-subtypes, 58 circulating recombinant forms (CRFs) (www.hiv.lanl.gov), and various types of unique recombinant forms (URFs). A variety of CRFs and URFs continue to be detected worldwide. In particular, new CRF strains with serial numbers of 51 and above were all reported from Asia: CRF51_01B from Singapore [3]; CRF52_01B from Thailand and Malaysia [4]; CRF53_01B [5] and CRF54_01B [6] from Malaysia; and CRF55_01B [7] from China. All CRFs reported from Asia are recombinants of CRF01_AE and subtype B, except CRF07_BC [8] and CRF08_BC [9]. Widespread co-circulation and dual infection of CRF01_AE and subtype B in various regions in Asia have led to the emergence of a large number of CRFs comprising subtype B and CRF01_AE. Notably, CRF51_01B and CRF55_01B were identified mainly among men-who-have-sex-with-men (MSM).
The HIV-1 epidemic continues to expand rapidly among MSM in China [10-12]. In 2011, MSM accounted for 29.4% of all newly diagnosed HIV cases [10]. MSM have become one of the nation's most targeted populations for HIV prevention and care. Initially, the MSM population in China was predominantly infected with subtype B [13,14], the typical U.S.-European strains that are prevalent in western countries. However, in recent years, a dramatic shift in genotype distribution from subtype B to CRF01_AE and other virus lineages has been observed in China [15]. Furthermore, studies [16,17] have identified a number of distinct phylogenetic clusters uniquely associated with the epidemic among MSM in China: CRF01_AE clusters 1 and 2, and CRF07_BC cluster 3. These three lineages of HIV-1 strains account for approximately 80% of HIV-1 infections among MSM in China [17]. In addition, various new recombinant strains, mostly comprising CRF01_AE and subtype B, have been detected [18,19].
In the present study, we describe a new circulating recombinant form (CRF59_01B) that is uniquely associated with transmission among MSM in China. Additionally, we investigate its nationwide occurrence and the evolutionary history of its emergence.

Study Subjects, HIV-1 RNA Isolation and Screening of HIV-1 Genotypes
This study was performed as part of a nationwide molecular epidemiological survey of Chinese MSM. A total of 920 plasma samples were collected from HIV-1-seropositive MSM in 11 provinces/municipalities across China from 2008 to 2013: Jilin province (n = 8); Liaoning province (n = 263); Beijing (n = 163); Shandong province (n = 42); Jiangsu province (n = 49); Shanghai (n = 26); Anhui province (n = 136); Henan province (n = 58); Hunan province (n = 68); Guangdong province (n = 40); and Yunnan province (n = 67). This study was approved by the Institutional Review Board of the First Affiliated Hospital of China Medical University. Written informed consent was obtained from all participants before sample collection. HIV-1 RNA was extracted from participants' plasma using QIAamp Viral Mini Kits (Qiagen, Germany) and was used to amplify and determine the nucleotide sequences of the 1.1-kb protease-reverse transcriptase (pro-RT) region in the pol gene (HXB2: 2253-3318). HIV-1 genotypes were determined by neighbor-joining analysis based on the Kimura 2-parameter distance matrix with a transition-to-transversion ratio of 2.0, using MEGA software version 5.0.
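As an illustration of the distance measure underlying the genotype screening, the following minimal Python sketch computes the Kimura 2-parameter distance between two aligned sequences. It is not the MEGA implementation: the function name `k2p_distance`, the pairwise-deletion handling of gaps and ambiguity codes, and the toy sequences in the usage note are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: skip gaps and ambiguity codes (pairwise deletion).
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A transition stays within purines or within pyrimidines.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For example, `k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")` compares two toy sequences that differ by a single transition in ten sites. A full genotyping run would compute this distance for every pair of sequences in the alignment and pass the matrix to a neighbor-joining routine.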

Near Full-Length HIV-1 Nucleotide Sequencing
Near-full-length genome (NFLG) sequences of the strains of interest were determined using the single-genome amplification (SGA) method [20] to prevent any artificial recombination that might occur during nested polymerase chain reaction (PCR). Briefly, plasma HIV-1 RNA was reverse-transcribed into single-strand cDNA using the SuperScript III First-Strand Synthesis System (Invitrogen, USA) with the 5′-half reverse primer 07Rev8 (5′-CCTARTGGGATGTGTACTTCTGAACTT-3′; HXB2: 5193-5219 nt) and the 3′-half reverse primer 1.R3.B3R (5′-ACTACTTGAAGCACTCAAGGCAAGCTTTATTG-3′; HXB2: 9611-9642 nt) as described previously [21]. The 5′ and 3′ halves of the HIV-1 viral genome were independently amplified from cDNA by two rounds of nested PCR with specific primers [21]. Both PCR reactions were performed in a final volume of 20 µl containing 15.3 µl RNase-free water, 2 µl 10× High-Fidelity Platinum PCR buffer, 0.8 µl MgSO4 (50 mM), 0.4 µl dNTP (10 mM), 0.2 µl of each primer (20 pmol/µl), 0.1 µl Platinum Taq High-Fidelity polymerase (Invitrogen, USA), and 1 µl template. The first and second rounds of PCR were both performed under the following conditions: 94°C for 2 minutes; 35 cycles of 94°C for 10 seconds, 60°C for 30 seconds and 68°C for 4.5 minutes; and a final extension of 10 minutes at 68°C. The second-round PCR products were electrophoresed on a 0.7% TAE agarose gel to check for positive amplification. The positive amplification products were then purified and directly sequenced using internal walking primers on an ABI 3730XL Sanger-based genetic analyzer. All sequences were analyzed, edited and assembled by overlapping the sequences of the two half-genome fragments with Sequencher 4.10.1 and BioEdit version 5.0.
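The arithmetic in this protocol can be checked with a short Python sketch. It confirms that the listed components add up to the stated 20-µl reaction volume, that each reverse primer is as long as its HXB2 coordinate span, and it estimates the programmed cycling time per PCR round. The dictionary keys, the function names, the 10% pipetting overage in `master_mix`, and the reading of the buffer as a 10x stock are assumptions made for this example. The primer sequences are written without line-break hyphens, and ramp times are ignored.

```python
from fractions import Fraction

# Per-reaction volumes in microliters, as listed in the text.
REACTION_UL = {
    "RNase-free water": "15.3",
    "10x High-Fidelity Platinum PCR buffer": "2",
    "MgSO4 (50 mM)": "0.8",
    "dNTP (10 mM)": "0.4",
    "primer 1 (20 pmol/ul)": "0.2",
    "primer 2 (20 pmol/ul)": "0.2",
    "Platinum Taq High-Fidelity polymerase": "0.1",
    "template": "1",
}

# Reverse-transcription primers and their HXB2 coordinates from the text.
PRIMERS = {
    "07Rev8": ("CCTARTGGGATGTGTACTTCTGAACTT", 5193, 5219),
    "1.R3.B3R": ("ACTACTTGAAGCACTCAAGGCAAGCTTTATTG", 9611, 9642),
}


def total_volume(mix):
    """Exact sum of the per-reaction volumes."""
    return sum(Fraction(v) for v in mix.values())


def master_mix(mix, n_reactions, overage=Fraction(1, 10)):
    """Scale all components except template (overage is an assumed 10%)."""
    factor = n_reactions * (1 + overage)
    return {k: Fraction(v) * factor for k, v in mix.items() if k != "template"}


def span_length(start, end):
    """Number of positions in an inclusive coordinate range."""
    return end - start + 1


def cycling_seconds(cycles=35):
    """Programmed time for one PCR round, excluding ramping."""
    per_cycle = 10 + 30 + int(4.5 * 60)  # denature + anneal + extend
    return 2 * 60 + cycles * per_cycle + 10 * 60
```

Using `Fraction` rather than floating-point numbers keeps the volume sum exact, so `total_volume(REACTION_UL) == 20` can be tested directly.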

Recombination Breakpoint Analyses
The NFLG sequences were first analyzed using the Recombination Identification Program (RIP) and the jumping profile Hidden Markov Model (jpHMM) on the Los Alamos HIV Sequence Database (www.hiv.lanl.gov) to define the recombinant structures. Subsequently, all NFLG sequences were aligned with HIV-1 subtype/CRF reference sequences using HIVAlign (http://www.hiv.lanl.gov/content/sequence/VIRALIGN/viralign.html) and then manually edited with BioEdit 5.0. A phylogenetic tree of the NFLG sequences was constructed by the neighbor-joining method based on Kimura's two-parameter distance matrix with 1000 bootstrap replicates using MEGA 5.0. Subtype B (83FR.HXB2), CRF01_AE (90TH.CM240) and subtype C (95IN21068) were used as references in the bootscanning analysis with SimPlot version 3.5.1. The inserted segments were used to build subregion phylogenetic trees by the neighbor-joining method with bootstrapping to confirm the origin of the different segments. The Recombinant HIV-1 Drawing Tool (http://www.hiv.lanl.gov/content/sequence/DRAWCRF/recom_mapper.html) was used to draw the structure of the new HIV-1 recombinant forms (CRF01_AE/B).
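The idea behind scanning a genome for changes in parental lineage can be illustrated with a simplified Python sketch. It does not reproduce SimPlot bootscanning, which builds bootstrapped trees for each window. Instead, it substitutes plain percent identity per window and reports the closest reference for each window. The function names, the window and step sizes, and the toy sequences in the usage note are assumptions made for this example.

```python
def window_identity(query, reference, start, size):
    """Fraction of identical, ungapped sites in one alignment window."""
    pairs = [
        (q, r)
        for q, r in zip(query[start:start + size], reference[start:start + size])
        if q != "-" and r != "-"
    ]
    if not pairs:
        return 0.0
    return sum(q == r for q, r in pairs) / len(pairs)


def closest_reference_by_window(query, references, size, step):
    """Return (window_start, best_reference_name) for each window.

    `references` maps a reference name to a sequence aligned to `query`.
    Ties go to the reference listed first in the mapping.
    """
    results = []
    for start in range(0, len(query) - size + 1, step):
        scores = {
            name: window_identity(query, ref, start, size)
            for name, ref in references.items()
        }
        results.append((start, max(scores, key=scores.get)))
    return results
```

With toy references `{"CRF01_AE": "A" * 60, "B": "C" * 60}` and the query `"A" * 20 + "C" * 20 + "A" * 20`, calling `closest_reference_by_window(query, references, size=10, step=10)` assigns the two middle windows to B and the remaining windows to CRF01_AE. A change in the best-matching reference between neighboring windows marks a candidate breakpoint.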

Estimate of Appearance of Most Recent Common Ancestor of CRF59_01B
The evolutionary rates of the different segments of CRF59_01B were estimated from a set of subtype B and CRF01_AE reference sequences with known sampling dates using BEAST v.1.6.0. Dates were estimated using Bayesian Markov Chain Monte Carlo (MCMC) inference under both the general time-reversible (GTR) and Hasegawa-Kishino-Yano (HKY) nucleotide substitution models. The MCMC analysis was run for 20 million states, sampling every 1000 states, and the MCMC results were evaluated using the Tracer 1.5 program. All parameters were estimated with an effective sample size (ESS) > 200. The maximum clade credibility (MCC) trees were viewed and edited using FigTree v1.3.1.
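The sampling scheme and the ESS criterion can be illustrated with a minimal Python sketch. It counts the logged samples implied by the chain length and sampling interval, and it estimates ESS from a parameter trace by summing autocorrelations up to the first non-positive lag. This is a simple textbook estimator and is not claimed to match the one implemented in Tracer. The function names and the synthetic AR(1) trace are assumptions made for this example.

```python
import random


def n_logged_samples(chain_length, log_every):
    """Samples written to the log, not counting the initial state."""
    return chain_length // log_every


def effective_sample_size(trace):
    """ESS = n / (1 + 2 * sum of leading positive autocorrelations)."""
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    var = sum(c * c for c in centered) / n
    if var == 0:
        raise ValueError("ESS is undefined for a constant trace")
    rho_sum = 0.0
    for lag in range(1, n):
        acov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = acov / var
        if rho <= 0:  # truncate at the first non-positive autocorrelation
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


def ar1_trace(n, phi, seed):
    """Synthetic autocorrelated trace (phi = 0 gives independent draws)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out
```

Here `n_logged_samples(20_000_000, 1000)` gives 20,000 logged samples. A strongly autocorrelated trace such as `ar1_trace(2000, 0.9, seed=1)` yields a much smaller ESS than an independent one of the same length, which is why a threshold on ESS rather than on the raw number of samples is used to judge convergence.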
Within a group of other recombinant strains (n = 28), we found that six strains (0.7%) formed a distinct phylogenetic cluster that is different from any other known HIV-1 genotype: 3 from Liaoning province (northeast), and 1 each from Guangdong, Yunnan and Hunan provinces (Table 1, Figure 1B). Furthermore, we also identified one additional strain (11CN.LNSY300876) that belonged to this cluster from a heterosexual male in Liaoning province (Table 1). These seven study subjects share no obvious epidemiologic link. Recombination breakpoint analysis revealed that these seven strains contained a small subtype B segment in a CRF01_AE backbone in the 1.1-kb pol segment (data not shown).
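The frequencies quoted here can be checked against the per-site sample counts given in the study subjects section. The short Python sketch below sums the site counts and computes the overall and per-site detection frequencies among MSM. The variable and function names are assumptions made for this example, and the per-site percentages are simple proportions that are not reported in the text.

```python
# MSM plasma samples per site, as listed in the study subjects section.
SAMPLES_BY_SITE = {
    "Jilin": 8, "Liaoning": 263, "Beijing": 163, "Shandong": 42,
    "Jiangsu": 49, "Shanghai": 26, "Anhui": 136, "Henan": 58,
    "Hunan": 68, "Guangdong": 40, "Yunnan": 67,
}

# Cluster strains detected among MSM, by site.
CLUSTER_BY_SITE = {"Liaoning": 3, "Hunan": 1, "Guangdong": 1, "Yunnan": 1}


def percent(count, total):
    """Proportion expressed as a percentage."""
    return 100 * count / total


def overall_frequency():
    """Cluster strains as a percentage of all MSM samples."""
    return percent(sum(CLUSTER_BY_SITE.values()), sum(SAMPLES_BY_SITE.values()))


def frequency_by_site():
    """Per-site percentage for the sites where the cluster was detected."""
    return {
        site: percent(n, SAMPLES_BY_SITE[site])
        for site, n in CLUSTER_BY_SITE.items()
    }
```

The site counts sum to 920, and six cluster strains out of 920 round to the reported 0.7%.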
To define the detailed subtype structure of these strains, we determined the NFLG sequences from the plasma RNA samples. We successfully amplified and determined the NFLG sequences of six of the seven study subjects (all except 12CN.HNCS501137 from Hunan) (Table 1). As shown in Figure 2, the six NFLG sequences (five from MSM; one from a heterosexual male) (Table 1) formed a monophyletic cluster that is distinct from any other known subtype or CRF. Recombination breakpoint analysis revealed that these six NFLG sequences shared identical recombinant structures in which two subtype B regions (nucleotide positions 2570-2718 in the pol region and 6149-8243 in the vpu-env region, relative to the HXB2 genome) were located in a CRF01_AE backbone (Figure 3). The recombinant structure is distinct from that of any other known CRF comprising CRF01_AE and subtype B, including CRF15_01B [22], CRF33_01B [23], CRF34_01B [24], CRF48_01B [25], CRF51_01B [3], CRF52_01B [4], CRF53_01B [5], CRF54_01B [6], and CRF55_01B [7]. Subregion tree analyses further confirmed the parental origins of each region of the recombinant genome (Figure 3C): the subtype B regions were of U.S.-European origin and did not belong to the subtype B′ (Thai variant of subtype B) [26,27] lineage associated with blood-borne epidemics in Asia [28]. Additionally, the CRF01_AE regions were found in the Thailand CRF01_AE radiation and were not related to the CRF01_AE variants (clusters 1 and 2) that we recently identified among MSM in China [29]. We designated these novel CRF01_AE/B recombinants as CRF59_01B [30].
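The mosaic structure described above can be expressed as a small Python lookup. The sketch encodes the two subtype B segments by their HXB2 coordinates, reports their lengths, and checks which of them falls inside the 1.1-kb pro-RT screening region (HXB2: 2253-3318). The function names are assumptions made for this example. The lookup returns CRF01_AE for any position outside the two segments, including positions outside the sequenced genome, so it should only be queried within the NFLG range.

```python
# Subtype B segments of CRF59_01B in HXB2 coordinates (inclusive).
SUBTYPE_B_SEGMENTS = [(2570, 2718), (6149, 8243)]

# pro-RT region used for the initial genotype screening.
PRO_RT_REGION = (2253, 3318)


def parental_lineage(hxb2_pos):
    """Parental lineage at an HXB2 position within the sequenced genome."""
    for start, end in SUBTYPE_B_SEGMENTS:
        if start <= hxb2_pos <= end:
            return "B"
    return "CRF01_AE"


def segment_lengths():
    """Length in nucleotides of each subtype B segment."""
    return [end - start + 1 for start, end in SUBTYPE_B_SEGMENTS]


def segments_within(region):
    """Subtype B segments that lie entirely inside a coordinate range."""
    lo, hi = region
    return [(s, e) for s, e in SUBTYPE_B_SEGMENTS if lo <= s and e <= hi]
```

Only the short pol segment (149 nt) lies within the pro-RT screening region, which is consistent with the small subtype B segment seen in the initial 1.1-kb pol analysis. The longer segment spans 2095 nt.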

Evolutionary Characteristics of CRF59_01B
To estimate the timeline of emergence of CRF59_01B, we performed Bayesian molecular clock analyses on the CRF01_AE regions [Regions I (HXB2: 790-2569 nt), II (HXB2: 2719- [...] Figure 4C). This suggests that the recombination generating CRF59_01B from parental lineages of subtype B and CRF01_AE occurred around the year 2001. In contrast, the estimated tMRCAs for Chinese MSM CRF01_AE cluster 1 [1989.4 (1986.8-1992.0)] and cluster 2 [1996.7 (1993.6-1999.2)] are significantly earlier than those of CRF59_01B (Figure 4).
Figure legend (partial): [...] (Table 1). These strains are compared with the reference sequences of all known subtypes/sub-subtypes as well as CRFs relevant to this study, including CRF15_01B, CRF34_01B and CRF52_01B from Thailand; CRF51_01B from Singapore; CRF33_01B, CRF48_01B, CRF53_01B and CRF54_01B from Malaysia; and CRF55_01B from China (http://www.hiv.lanl.gov/content/index).

Discussion
The large-scale national survey we conducted on HIV-1 strains circulating among MSM in China (Figure 1) identified a new CRF that we designated CRF59_01B (Figures 2 and 3). CRF59_01B is the second CRF (after CRF55_01B [7]) to circulate primarily among MSM in China and the third CRF (after CRF51_01B from Singapore [3] and CRF55_01B [7]) to be identified among MSM in Asia. The appearance of this CRF reflects the recent upsurge of disease activity among MSM throughout the Chinese regions we studied [31]. As shown in the subregion trees (Figure 3C), the subtype B regions of CRF59_01B are of U.S.-European origin, not of subtype B′ (Thai variant of subtype B), which is associated with blood-borne epidemics in Asia. Additionally, the CRF01_AE regions were found in the Thai CRF01_AE radiation and are not related to the CRF01_AE variants (clusters 1 and 2) that were recently identified among MSM in China [17]. As demonstrated in previous studies [13], subtype B was initially the predominant strain among MSM in China, but CRF01_AE has shown increasing prevalence among Chinese MSM in recent years. The co-circulation of these two HIV-1 lineages has led to the generation of various recombinants between the Thai CRF01_AE and subtype B strains, including CRF55_01B and CRF59_01B. These recombinants are distinct from other known CRFs consisting of CRF01_AE and subtype B′, including CRF15_01B and CRF34_01B from Thailand; CRF33_01B, CRF48_01B, CRF53_01B and CRF54_01B from Malaysia; and CRF52_01B from Thailand and Malaysia.
[...] Figure 1B). These detections may suggest that an unknown focus of CRF59_01B is present outside the study sites. Alternatively, CRF59_01B may have very limited circulation, with the high mobility of the MSM population explaining the strain's sporadic and diffuse distribution across China.
According to behavioral studies [32,33], most MSM lack basic knowledge about HIV/AIDS and usually have multiple sexual partners with whom they engage in unprotected sex. Indeed, the prevalence of sexually transmitted diseases, such as syphilis, is remarkably high in many cities: 27.7% in Nanjing and Yangzhou [34], 31.1% in Shenyang [35], and 14.3% in Harbin [36]. These factors make it possible for MSM to experience multiple HIV infections and superinfections, facilitating the emergence of new recombinant strains and their rapid dissemination across China. The emergence of CRF55_01B and CRF59_01B suggests that new recombinant forms comprising the CRF01_AE and subtype B lineages are actively being generated among MSM in various regions of China.