HIV-1 genetic transmission networks among men who have sex with men in Kunming, China

Background Yunnan has the greatest share of reported human immunodeficiency virus (HIV)/acquired immunodeficiency syndrome (AIDS) cases in China. In recent years, HIV prevalence and incidence remained stubbornly high in men who have sex with men (MSM). To follow the dynamics of the HIV-1 epidemic among MSM, HIV-1 genetic characteristics and genetic transmission networks were investigated. Methods Blood samples from 190 newly diagnosed HIV-1 cases among MSM were continuously collected at fixed sites from January 2013 to December 2015 in Kunming City, Yunnan Province. Partial gag, pol and env genes were sequenced and used for phylogenetic and genotypic drug resistance analyses. The genetic characteristics of the predominant HIV-1 strains were analyzed by the Bayesian Markov Chain Monte Carlo (MCMC) method. The genetic transmission networks were identified with a genetic distance of 0.03 substitutions/site and 90% bootstrap support. Results Among the 190 HIV-1 positive MSM reported during 2013–2105, various genotypes were identified, including CRF01_AE (45.3%), CRF07_BC (35.8%), unique recombinant forms (URFs) (11.6%), CRF08_BC (3.2%), CRF55_01B (2.1%), subtype B (1.6%) and CRF59_01B (0.5%). The effective population sizes (EPS) for CRF01_AE and CRF07_BC increased exponentially from approximately 2001–2010 and 2005–2009, respectively. Genetic transmission networks were constructed with 308 pol sequences from MSM diagnosed during 2010–2015. Of the 308 MSM, 109 (35.4%) were identified in 38 distinct clusters. Having multiple male partners was associated with a high probability of identification in the genetic transmission networks. Of the 38 clusters, 27 (71.1%) contained individuals diagnosed in different years. Of the 109 individuals in the networks, 26 (23.9%) had ≥2 potential transmission partners (≥2 links). The proportion of MSM with ≥2 links was higher among those diagnosed from 2010–2012. The constituent ratios of their potential transmission partners by areas showed no significant difference among MSM from Kunming, other cities in Yunnan and other provinces. Additionally, surveillance drug resistance mutations (SDRMs) were identified in 5% of individuals. Conclusion This study revealed the various HIV-a genotypes circulating among MSM in Kunming. MSM with more partners were more easily detected in transmission networks, and early-diagnosed MSM remained active in transmission networks. These findings suggested that the routine interventions should be combined with HIV testing and linkage to care and early antiretroviral therapy among HIV-positive MSM.


Introduction
Despite substantial efforts to control human immunodeficiency virus-1 (HIV-1) in men who have sex with men (MSM), the HIV epidemic in MSM remains disproportionately severe in countries of low, middle and high income [1,2]. A comprehensive search showed that HIV prevalence in MSM ranged from 3.0% in the Middle East and North Africa region to 25.4% in the Caribbean, and HIV infection levels in MSM were substantially higher than those in non-MSM individuals [1]. The incidence of HIV in MSM remains high in many countries, even as HIV incidence seems to be decreasing in the general population in those countries [2].
In China, a fast-spreading HIV epidemic among MSM constitutes a challenge to efforts to control the HIV pandemic. The annual rate of newly reported HIV cases attributed to MSM in China has increased from 2.5% in 2006 to 28.3% in 2015 [3,4]. According to national sentinel surveillance data, the prevalence of HIV infection among MSM has increased from 0.9% in 2003 to 8.0% in 2015 [3,5]. A meta-analysis found that the national HIV incidence among Chinese MSM was 5.0 per 100 person-years (95% CI: 4.1%-5.8%) [6], which was much higher than that of any other population in China.
Yunnan is located in southwest China and is situated along the drug trafficking routes channeling heroin into China. Since the first HIV epidemic in China was identified among people who inject drugs (PWID) in Yunnan in 1989, Yunnan has been one of the areas hardest hit by HIV in China [7]. By the end of 2015, the number of people living with HIV/AIDS (PLWHA) in Yunnan was 86,483, ranking first in the nation. After 2006, the main transmission route changed from intravenous injection to sexual contact [8]. In addition to heterosexually transmission of the infection, MSM transmission also increased significantly. HIV prevalence (9.42-11.78%) and incidence (6.01-8.38%) remained stubbornly high in MSM during 2008-2010.
Because of stigma and social discrimination, Chinese MSM do not stay in their hometowns, where they are easily recognized by acquaintances [9,10]. Most of these individuals concentrate in urban areas, where close social and sexual networks can easily be constructed [11]. Larger networks provided more chances for exposure to varied sexual practices and potentially HIV-positive partners. HIV strains from epidemiologically linked individuals are more similar than HIV strains from epidemiologically unassociated individuals [12,13]. In a phylogenetic tree, a cluster represents a group of sequences from potential transmission partners. Thus, the genetic relatedness of HIV-1 can reflect the relationships between infected individuals, based on which the potential genetic transmission networks can be inferred [14,15]. Recently, with advances in molecular epidemiology, analyses of genetic transmission networks can help HIV researchers and public health professionals understand how HIV is spread within and between populations and further deliver efficient and effective interventions [11,16]. Thus, genetic transmission network analyses are useful for revealing HIV acquisition risk factors for MSM at the network level.
Kunming is the capital city of Yunnan Province, where both the estimated number of MSM and the reported number of HIV-positive MSM accounted for more than half of those in the whole province. During 2010-2012, we conducted an HIV molecular epidemiological survey among MSM in Kunming City [17]. In the present study, we updated the characteristics of the HIV molecular epidemics among MSM in Kunming during 2013-2015. Based on the accumulated data, we performed an analysis of the genetic transmission networks in this population. These findings should be valuable for better understanding the HIV epidemic and optimizing HIV health care services for MSM individuals in Yunnan.

Study participants and sample collection
A total of 190 newly confirmed HIV-1-positive MSM blood samples were continuously collected between January 2013 and December 2015 through fixed voluntary counseling and testing sites (VCTs) and non-government organizations (NGOs) in Kunming City, Yunnan Province. The corresponding demographic characteristics, including the total lifetime number of partners, were also collected by the staff working at these VCTs and NGOs. HIV-1 infection status was determined by an Enzyme-Linked Immunosorbent Assay (ELISA, Biomerieux, France) and confirmed by Western blot assay (HIV BLOT 2.2, MP Diagnostics, Singapore). All HIV tests were informed and voluntary. Written consent was obtained from all participants. The study was approved by the Biomedical Ethics Review Committee of Yunnan Province.

Amplification of HIV-1 gene fragments
Viral RNA was extracted from 140 μl of plasma using the QIAamp Viral RNA Mini Kit (Qiagen, Valencia, CA, United States) according to the manufacturer's instructions. RNA samples were directly subjected to nested polymerase chain reactions (PCR) to generate fragments of gag (HXB2: 865-1785; encoding portions of p17 and p24), pol (HXB2: 2141-3446; encoding the protease and the first 299 residues of reverse transcriptase) and env (HXB2: 7005-7526, encoding the V3-V4 region). The details of amplification and sequencing were as previously described [17].

Sequence analysis
The assembly of the different sequences generated from the same gene region of each sample was performed using the DNA sequence analysis software Sequencher 5.0 (Gene Codes, Ann Arbor, MI). ClustalW multiple alignments were performed using Bio-Edit 7.0 software. Reference sequences were obtained from the Los Alamos National Laboratory (LANL) database (http://hiv-web.lanl.gov/content/index), which covers the major HIV-1 subtypes/circulating recombinant forms (CRFs). Phylogenetic tree analyses were performed using the neighbor-joining method based on the Kimura 2-parameter model with 1000 bootstrap replicates, using MEGA version 6.0 [18]. To demonstrate possible intersubtype mosaicism, candidate sequences were analyzed using the Recombination Identification Program (RIP, version 3.0; http://hiv-web.lanl.gov).

Bayesian MCMC evolutionary analyses
The evolutionary rates and effective population sizes (EPS) of CRF01_AE and CRF07_BC were inferred for the pol gene using the Bayesian Markov Chain Monte Carlo (MCMC) method. The dataset included 167 CRF01_AE and 105 CRF07_BC pol sequences identified in both the present and previous studies [17]. Based on the previous study, the general time reversible (GTR) model plus a gamma distribution (Γ4) among site rate heterogeneity (I) model (GTR+I +4) was chosen as the nucleotide substitution model [17]. Bayesian MCMC analyses were performed using a Bayesian uncorrelated exponential relaxed molecular clock method in combination with the 'Bayesian skyline' coalescent tree priors under the selected nucleotide substitution model in the BEAST v1.7.4 package [19]. Each MCMC analysis was run for 50 million generations and sampled every 5,000 generations. For each CRF, the log and tree files from two independent runs were combined with LogCombiner. The combined log files were viewed with Tracer v1.5, which showed that the effective sampling sizes (ESS) of the parameters were no less than 200, and the runs converged. Combining the parameter log files with the appropriated trees log files, Bayesian skyline plots were constructed with Tracer v1.5.

Identification and analysis of genetic transmission networks
HIV-1 pol sequences hold sufficient variability for the reconstruction of transmission histories [20] and have been routinely used for transmission network analyses in the previous studies [11,[21][22][23]. In this study, the detection of meaningful transmission clusters depends on high support value (bootstrap or posterior probability) and low intra-cluster genetic distance. The phylogenetic tree was constructed with 308 MSM pol sequences identified in Kunming during 2010-2015, by using the neighbor-joining method based on the Tamura-Nei 93 model (S1 Fig). The transmission clusters were extracted from the phylogenetic tree using Cluster Picker software [14]. Transmission clusters were defined as those with node support thresholds greater than 90% and intra-cluster maximum pairwise genetic distances of less than 3.0% nucleotide substitutions per site [11,24]. The pairwise genetic distances of all sequences within the available clusters were calculated based on which minimum genetic distances algorithm was used to define the linkages within a cluster [11]. For visualizing and analyzing the networks, the data were processed using a custom R script utilizing the network package in R software [25].

Genotypic analysis of HIV-1 drug resistance
The nucleotide sequences of the pol gene, containing the full-length protease gene and the first 299 codons of the reverse transcriptase gene, were submitted to Stanford HIV Drug Resistance Database (http://hivdb.stanford.edu). The sequences with surveillance drug resistance mutations (SDRMs) was analyzed with the Calibrated Population Resistance (CPR) Tool (Version 6.0) [26].

Sequence data
All of the sequences obtained in this study were submitted to GenBank under the accession numbers MF036693-MF037203. The MSM pol sequences identified in Kunming during 2010-2012 were submitted to GenBank under the accession numbers MH028254-MH028382.

Statistical analysis
Statistical analyses were conducted using the SPSS 21.0 statistical analysis software package (SPSS Inc. Chicago, IL). Categorical variables were compared using χ 2 . All tests were twotailed, and p values <0.05 were considered significant.

Demographic characteristics of the study subjects
A total of 190 newly confirmed HIV-positive MSM samples were collected in Kunming City from 2013 to 2015 (Table 1)

Demographic history of the major HIV-1 genotypes
To further investigate the demographic history of the predominant HIV-1 strains among MSM in Yunnan, we examined the changes in effective population size (the number of infected individuals contributing to HIV transmission) by Bayesian skyline plots with 167 CRF01_AE and 105 CRF07_BC pol sequences from the current study and a previous study. For the CRF01_AE lineages (Fig 1A), the effective population size initially grew in the early 2000s, underwent exponential growth from approximately 2001 to 2010, and reached a stationary phase thereafter. For the CRF07_BC lineages (Fig 1B), the effective population size initially grew in the mid-2000s, underwent an exponential increase from approximately 2005 to 2009, maintained a slight increase from approximately 2010 to 2012, and became stationary

Identification and characterization of genetic transmission networks
A total of 308 MSM pol sequences identified in Kunming were used for genetic transmission network analysis. All sequences were amplified from the plasma samples obtained for HIV-1 diagnosis during 2010-2015. A total of 109 (35.4%) sequences were linked to at least one other sequence, and these segregated into 38 distinct clusters, with the number of sequences per cluster ranging from 2 to 9 (Fig 2A). Among all clusters, 24 (63.2%) were made up of 2 individuals. The genetic distances between the pairwise sequences in each cluster ranged from 0.000 to 0.0239 substitutions/site (Fig 2B). Pairs with a genetic distances of 0.015 substitutions/site accounted for 82.2% of all sequence pairs (120/146). The distributions of the individuals in networks by demographic characteristics (permanent residence, age, nationality, marital status and education) showed no differences (Table 2). However, with increasing number of partners, the probability that the subjects would be identified in the genetic transmission networks increased significantly ( Table 2, Trend χ 2 = 4.547, p = 0.038).
Among the clusters, 27 clusters (71.1%) contained individuals diagnosed in different years ( Fig 3A). Of the 38 clusters, 22 belonged to CRF01_AE, nine to CRF07_BC, three to subtype B, two to CRF08_BC, one to CRF59_01B and one to CRF01_AE/CRF07_BC (Fig 4B). For the different genotypes, the proportions of the individuals involved in the networks showed significant differences ( Table 2, χ 2 = 28.140, p<0.001).
Of the 109 individuals in the transmission networks, 83 (76.1%) were found to have only one potential transmission partner (one link), and 26 (23.9%) were found to have !2 potential transmission partners (!2 links) (Fig 2C). In transmission networks, an individual with more links would be more associated with HIV transmission than one with only one link. The 26 individuals with !2 links were associated with another 61 individuals in the transmission networks. Furthermore, the proportion of MSM with !2 links was higher among those diagnosed before 2013 (Table 3), which suggested that the individuals diagnosed during 2010-2012 had been active in the genetic transmission networks.
The genetic transmission networks contained MSM from different areas, including 39 (35.8%) permanent residents of Kunming City, 47 (43.1%) temporary residents from other cities in Yunnan Province and 23 (21.1%) temporary residents from other provinces (Fig 3C). For the three different sources of MSM, the constituent ratios of their potential transmission partners by areas showed no significant difference (Fig 4). This result suggested that these three different sources of MSM were randomly mixed in the genetic transmission networks.

Discussion
To track the changing genetic characteristics of HIV-1 among MSM, we conducted a crosssectional HIV-1 molecular epidemiological and transmitted drug resistance (TDR) study among newly diagnosed MSM in Kunming City of Yunnan Province in China. Based on the cumulative HIV-1 sequences data since our previous study [17], we further investigated the demographic history of the main HIV-1 strains and genetic transmission networks in MSM.
CRF55_01B and CRF59_01B are two 01B CRFs that were originally identified in the MSM population in China: CRF55_01B was identified among MSM in southern China [27], while CRF59_01B was identified among MSM in northeastern China [28]. Of the four individuals with CRF55_01B, two were from Guangdong and two were Kunming natives. These four individuals were not found in the genetic transmission networks, suggesting that these individuals were independently infected with CRF55_01B. By retrospective analysis, we found that CRF59_01B had already existed among MSM during 2010-2012. Furthermore, the five CRF59_01B pol sequences collected during 2010-2015 were all involved in one genetic transmission cluster, suggesting that they might have derived from one individual.
CRF08_BC appears to be a distinctive strain in Yunnan MSM; this strain was rarely found among MSM outside Yunnan. However, CRF08_BC was also the predominant genotype among heterosexually transmitted individuals in Yunnan [29]. Our previous work suggests that bisexual behavior might have mediated CRF08_BC transmission from heterosexually infected individuals into the MSM population [17]. Of the seven MSM infected with CRF08_BC, six were found in two genetic transmission clusters, suggesting that this strain was gradually disseminated into this population. However, a number of individuals infected with CRF01_AE and CRF07_BC among MSM have contributed to HIV transmission in this population. Bayesian skyline plots were used to examine EPS of CRF01_AE and CRF07_BC, which grew exponentially from 2001 to 2010 and 2005 to 2009, respectively. Notably, routine HIV/ AIDS intervention for MSM was initiated in 2008 in Yunnan Province, which may have had certain effects in preventing further growth and diversity.
HIV-1 transmission networks based on genetic relations can be utilized to explore the characteristics of HIV-1 transmission among a given population [11,[21][22][23]. In our study, a genetic distance of 0.03 substitutions/site and 90% bootstrap support were used to define transmission clusters. Many factors can influence the choice of genetic distance and branch support thresholds, such as spatial and temporal scales of analysis, HIV subtype, the underlying mode of transmission, and the viral genomic region(s) being analyzed [30]. Clusters that are defined by a short distance will reflect recent transmissions and frequent samplings [23,24,31]. When the goal is to detect distinct transmission events (i.e., two individuals) that may be separated by a long period of time during which the viral populations diverged, a higher genetic distance threshold should be used [11,30]. In our study, 273 subjects had the CD4 + T lymphocyte counts, of whom 44.3% (121/273) had CD4 <350 cells/μl, which suggested that our dataset included some individuals with long-term HIV infection. In a similar study, a genetic distance threshold of 0.03 substitutions/site was used to define transmission clusters for long-standing infection [11]. In a recent study, Wertheim et al. suggested that a genetic distance threshold can be estimated as total viral genetic diversity between two transmission partners within the first 10 years of infection [31], which means the accumulated evolution time within two transmission partners since the transmission event is no more than 20 years. For  HIV-1 genetic transmission networks among MSM in Kunming our sequence sets, the mean estimated evolutionary rates of CRF01_AE and CRF07_BC were 2.54×10 −3 and 1.71×10 −3 substitutions site -1 year -1 . Based on the abovementioned assumption, the genetic distance threshold for CRF01_AE tends to be less than 0.0508 substitutions/site, and that for CRF07_BC tends to be less than 0.0342 substitutions/site. Thus, the genetic distance of 0.03 substitutions/site used in our study is in a rational range. With this distance, we could describe the longitudinal trend in the HIV transmission network among local MSM. Among the MSM reported in Kunming during 2010-2015, 35.4% were involved in the genetic transmission networks. In theory, infectious diseases are transmitted in the form of a network, and thus, each "infector" should have at least one potential transmission partner and be involved in one cluster. However, two-thirds of the subjects not found to be linked to any other subject. The primary reason for this statistic may be the efficiency of sampling and the complexity of the epidemic, especially for some widely circulating genotypes, such as CRF01_AE and CRF07_BC. For these two CRFs, the proportion of subjects identified in the networks was lower than those for some newly introduced CRFs, such as CRF08_BC and CRF59_01B. However, as we found in this study, subjects with more partners were more likely to be identified in the genetic transmission networks.
We also found that more than 70% of genetic transmission clusters contained subjects reported in the different years. Among the earlier reported subjects (during 2010-2012), the proportion of subjects with !2 links was higher. Because of the issue of sampling density, the possibility cannot be completely excluded that the descendant viral strains indirectly links to their ancestor. However, an individual with multiple links could be more associated with HIV transmission than one with only one link. In other words, individuals with multiple links could have potentially higher transmission risk. Especially, we should pay attention to the earlier diagnosed HIV-infected subjects with multiple links, who might be still active in transmission networks and could be the potential intervention targets. Therefore, in addition to the routine interventions aimed at associated risk factors, interventions such as active care and early antiretroviral therapy targeting identified HIV-positive MSM should be promoted, as these methods could improve quality of life and decrease further HIV transmission by lowering community viral load [32,33]. The other option is pre-exposure prophylaxis (PrEP), which could potentially prevent HIV infection among HIV-seronegative MSM and their partners [34,35]. Recently, we have promoted antiretroviral treatment among newly diagnosed HIV-1 MSM, but the effect awaits evaluation. The analysis of genetic transmission networks may be a useful method to find growing clusters and screen out the potentially high-risk individuals, allowing us to deliver the targeted interventions. Thus, we should continue to perform regular analyses of genetic transmission among MSM.
In the genetic transmission networks, the proportions of MSM from Kunming City, other cities in Yunnan and other provinces were 35.8%, 43.1% and 21.1%, respectively. Among these three groups of MSM, the constituent ratios of their potential transmission partners by source areas showed no significant differences, which suggested that there was no regional preference when the three different groups of MSM chose their partners. In fact, with the development of mobile apps, the seeking of partners has become more convenient and will not be limited by location of residence. Our recent study showed that the internet and dating apps had become the most preferred way (66.3%) of making contacts among MSM [36]. In contrast, mobile technology presents an opportunity for innovative interventions for HIV prevention, by which we can directly provide sexual health information and HIV testing services to this population, which remains underdiagnosed.
The prevalence of SDRMs among antiretroviral therapy (ART)-naïve MSM was 5.0% during 2013-2015, which was not significantly higher than that in 2010-2012 (4.6%, χ 2 = 0.033, p = 1.000) [17]. In the present study, all the individuals carrying SDRMs came from Yunnan Province, and eight were single. This result suggested that the drug resistance transmission was localized. Strikingly, a strain harboring drug-resistant mutations to NRTIs, NNRTIs and PIs (T215S, Y188L and L90M) was confirmed to be transmitted in the genetic transmission networks. If ART is to be expanded among MSM, TDR surveillance will be necessary to prevent the spread of such strains.
In summary, we consecutively tracked the dynamic changes in HIV-1 genetic characteristics among MSM in Kunming. In addition to the genotypes previously detected in this population, two new CRFs, CRF55_01B and CRF59_01B, were found for the first time in this study. The demographic history of the main HIV-1 strains suggested that the 2000s were a rapidgrowth period for CRF01_AE and CRF07_BC. The genetic transmission network analysis suggested that MSM with more sexual partners were more easily detected in the networks and that early-diagnosed MSM with HIV were still active in HIV transmission. With the accumulation of genetic data, further genetic transmission network analyses should be performed, which could then help us target interventions to high-risk individuals.