A Comprehensive Mapping of HIV-1 Genotypes in Various Risk Groups and Regions across China Based on a Nationwide Molecular Epidemiologic Survey

Background China is experiencing a dynamic HIV/AIDS epidemic. While serology-based surveillance systems have reported the spread of HIV/AIDS, detailed tracking of its transmission in populations and regions is not possible without mapping it at the molecular level. We therefore conducted a nationwide molecular epidemiology survey across the country. Methods HIV-1 genotypes were determined from 1,408 HIV-positive persons newly diagnosed in 2006. The prevalence of each genotype was estimated by weighting the genotype's prevalence from each province- and risk-specific subpopulation with the number of reported cases in the corresponding subgroups in that year. Results CRF07_BC (35.5%), CRF01_AE (27.6%), CRF08_BC (20.1%), and subtype B' (9.6%) were the four main HIV-1 strains in China. CRF07_BC and CRF08_BC were the primary drivers of infection among injecting drug users in northwestern and southwestern China, respectively, and subtype B' remained dominant among former plasma donors in central China. In contrast, all four strains occurred in significant proportions among heterosexuals nationwide, pointing to an expansion of the HIV-1 epidemic from high-risk populations into the general population. CRF01_AE also replaced subtype B as the principal driver of infection among men who have sex with men. Conclusions Our study provides the first comprehensive baseline data on the diversity and characteristics of the HIV/AIDS epidemic in China, reflecting unique region- and risk group-specific transmission dynamics. The results provide information critical for designing effective prevention measures against HIV transmission.


Introduction
Since the first HIV/AIDS case in China was reported in 1985 [1], an estimated 740,000 individuals had contracted HIV in China by the end of 2009 [2]. In the last quarter century, the main drivers of China's HIV epidemic have shifted considerably, from blood transmission to sexual transmission. The proportions of injecting drug users (IDUs), heterosexuals, and men who have sex with men (MSM) among HIV-positive persons changed from 44.2%, 11.3% and 0.3% in 1985-2005 to 25.8%, 55.7% and 8.6% in 2009, respectively [2].
China has experienced several waves of HIV-1 epidemics since the onset in the late 1980s. The first epidemic, in 1989, was among IDUs in Yunnan province, on China's southwest border near Myanmar [3,4]. Subtype B' (Thailand's variant of subtype B) [3] and subtype C were almost concurrently transmitted to Yunnan from Thailand and India, respectively [5], later generating two new B/C circulating recombinant forms, CRF07_BC and CRF08_BC, most likely in Yunnan [6][7][8]. CRF07_BC spread to northwestern China along a drug traffic route in around 1993 [8][9][10], and CRF08_BC spread eastward to the southern coastal provinces, Guangxi and Guangdong, in around 1990 [6,8]. IDUs constituted about 70% of HIV cases reported in the early 1990s [11]. However, in the early-mid 1990s, unregulated and unsanitary commercial plasma collection in central China resulted in an explosion of HIV-1 subtype B' among plasma donors in Henan [12], which then dispersed to other provinces in central China [13][14][15]. HIV-1 CRF01_AE, in turn, was first detected among heterosexuals and IDUs in Guangdong, Guangxi and Yunnan around 1996-1997 [16,17], and soon spread through sexual routes to other provinces along the southeast coast [18,19]. More recently, HIV infections among MSM have increased rapidly [2].
As HIV-1 transmission patterns in China diversify, there is an increasing need for detailed, comprehensive analysis of the geographic and demographic distribution of viral genotypes. In this study, we estimated the likely prevalence and distribution of HIV-1 genotypes in China, using more than 1,500 specimens collected throughout China in conjunction with HIV case report data for various risk groups in the respective provinces. Our results provide the first comprehensive dataset on the characteristics and diversity of China's HIV/AIDS epidemic as related to regional and risk group-specific transmission patterns.

Study Population and Sample Collection
A total of 1,513 plasma specimens were collected from HIV-positive individuals newly diagnosed in 2006 from various risk groups in 30 of the 31 provinces of mainland China (all except Hainan; Table 1). Plasma samples were collected during routine follow-up visits at local Centers for Disease Control in 2006-2008. The study used a cross-sectional design with stratified random sampling by province. The sampling ratio [number of samples collected (b) over number of HIV case reports (a)] for each province is shown in Table 1 (column b/a). For provinces with fewer reported cases, higher sampling ratios were used to assure statistical confidence; the median sampling ratios for provinces with <200, 200-499, 500-999, 1,000-3,999, and >4,000 cases were 22.1%, 16.4%, 9.5%, 2.9% and 2.1%, respectively (Table 1). No HIV-2 infections were identified in this study. The study was approved by the institutional review boards of the National Center for AIDS/STD Control and Prevention. Written informed consent was obtained from all study participants.

Sequence Analysis and Subtype Determination
HIV-1 nucleotide sequences of the 1.1-kb gag (HXB2: 781-1836 nt) and 540-bp env (HXB2: 7002-7541 nt) regions were PCR-amplified and sequenced as described by Cheng et al. [26]. HIV-1 genotypes were determined by neighbor-joining tree analysis in comparison with the Los Alamos 2010 HIV-1 subtyping references (http://www.hiv.lanl.gov). Phylogenetic analysis was performed using MEGA4 software with 1,000 bootstrap replications [27]. Recombinants were analyzed and confirmed with Simplot version 3.5.1 [28]. The HIV-1 genotype of each patient was assigned based on the genotypes of both the gag and env genes; if only one gene region was available, the genotype of that region was assigned. Samples whose gag and env regions were assigned different genotypes were deemed discordant and labeled by their gag/env genotype designations (e.g. CRF01_AE/B).
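The assignment rule described above can be sketched in a few lines of Python. This is only an illustration of the stated logic, not the software used in the study; the function name `assign_genotype` and the use of `None` for a region that could not be genotyped are assumptions made for the example.

```python
from typing import Optional


def assign_genotype(gag: Optional[str], env: Optional[str]) -> Optional[str]:
    """Combine per-region genotype calls into one sample-level label.

    `None` stands for a region that could not be genotyped
    (a hypothetical convention chosen for this sketch).
    """
    if gag is None and env is None:
        return None            # neither region available: no genotype
    if gag is None:
        return env             # only env available
    if env is None:
        return gag             # only gag available
    if gag == env:
        return gag             # concordant calls
    return f"{gag}/{env}"      # discordant: label as gag/env


# Illustrative inputs only, not study data.
for gag_call, env_call in [("CRF07_BC", "CRF07_BC"),
                           ("CRF01_AE", None),
                           ("CRF01_AE", "B")]:
    print(gag_call, env_call, "->", assign_genotype(gag_call, env_call))
```

The discordant label follows the gag/env order used in the text (e.g. CRF01_AE/B).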

Data Stratification and Calculation
Since the transmission of HIV, and therefore the distribution of genotypes, is strongly associated with the geographic location and risk group status of infected individuals [29], we defined the basic unit of our analysis (a subgroup) as each particular risk population in a given province. We defined the proportion of an HIV-1 genotype within each subgroup as its proportion among the samples obtained for that subgroup. We then estimated the number of individuals with each genotype in a subgroup by multiplying the calculated proportion by the total number of HIV cases reported in the subgroup. For a subgroup in which no specimen was obtained, the number of reported cases was removed from the 2006 reported total. The adjusted totals and corresponding sampling ratios for each province are listed in the last two columns [(d) and (c/d)] of Table 1. The actual calculation process is shown in Table S1.
The estimated number of individuals infected with each genotype in each risk group nationwide was obtained by summing the subgroups across all provinces (Table 2). Similarly, the estimated proportions and numbers of individuals infected with each genotype in each geographic region and nationwide were obtained by summing the subgroups accordingly (Tables 2 and 3).
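The weighting procedure described in this section can be illustrated with a minimal Python sketch. All province names, risk groups, and counts below are hypothetical toy values, not data from Table S1, and the function name `estimate_genotype_counts` is an assumption made for the example.

```python
from collections import defaultdict

# Hypothetical toy inputs (NOT study data).
# (province, risk group) -> {genotype: number of genotyped samples}
samples = {
    ("P1", "IDU"): {"CRF07_BC": 8, "B'": 2},
    ("P1", "Hetero"): {"CRF01_AE": 3, "CRF07_BC": 1},
    ("P2", "IDU"): {"CRF08_BC": 5},
}
# (province, risk group) -> number of reported HIV cases in that subgroup
reported = {
    ("P1", "IDU"): 100,
    ("P1", "Hetero"): 40,
    ("P2", "IDU"): 50,
    ("P2", "MSM"): 10,   # no specimen obtained for this subgroup
}


def estimate_genotype_counts(samples, reported):
    """Weight within-subgroup genotype proportions by reported case counts.

    Subgroups without any genotyped specimen are dropped from the total,
    mirroring the adjustment described in the text.
    """
    estimated = defaultdict(float)
    adjusted_total = 0
    for subgroup, cases in reported.items():
        counts = samples.get(subgroup)
        if not counts:
            continue                      # excluded from the adjusted total
        n_sampled = sum(counts.values())
        adjusted_total += cases
        for genotype, k in counts.items():
            estimated[genotype] += cases * k / n_sampled
    return dict(estimated), adjusted_total


estimated, adjusted_total = estimate_genotype_counts(samples, reported)
for genotype, n in sorted(estimated.items(), key=lambda kv: -kv[1]):
    print(f"{genotype}: {n:.0f} of {adjusted_total} ({100 * n / adjusted_total:.1f}%)")
```

Summing the same per-subgroup estimates over a single risk group or a single region, instead of over all subgroups, would give the risk group-specific and region-specific figures in the same way.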

Distribution of HIV-1 Genotypes in China
A total of 1,404 HIV-1 genotypes of the 1.1-kb gag and/or the 540-bp env regions were determined from 1,513 plasma specimens collected from newly diagnosed HIV-positive persons in 2006 (Table 1). We obtained 1,290 gag genotypes (85.3% of samples) and 1,186 env genotypes (78.4%); both gag and env genotypes were available for 1,072 samples (70.9%). There appeared to be no significant bias in genotyping success rates due to preferential amplification of particular genotypes. In calculating the prevalence of HIV-1 genotypes, we weighted the sampled prevalences by the size of the corresponding risk group in each province; if no specimen was obtained for a risk- and province-specific subgroup, the case count for that subgroup was removed from the analysis (Methods, also see Table S1). This removed 5.6% (2,019 of 36,167) of all reported cases from the national total [(d) in Table 1].
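The genotyping coverage figures in this paragraph can be cross-checked with simple arithmetic. The short Python sketch below uses only the counts quoted in the text and assumes that samples with both regions genotyped are included in both the gag and the env totals; the variable names are chosen for the example.

```python
total_specimens = 1513   # plasma specimens collected
gag = 1290               # samples with a gag genotype
env = 1186               # samples with an env genotype
both = 1072              # samples with both gag and env genotypes

# Inclusion-exclusion: samples with at least one genotyped region.
either = gag + env - both


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print("gag:", pct(gag, total_specimens))                 # 85.3
print("env:", pct(env, total_specimens))                 # 78.4
print("both:", pct(both, total_specimens))               # 70.9
print("gag and/or env:", either, pct(either, total_specimens))  # 1404 92.8
```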

HIV-1 Genotype Distribution by Geographic Region
As summarized in Table 3, HIV-1 infections were most concentrated in the western region [(6,200 + 17,147) of 34,148, 68.4%], followed by the eastern (6,970 of 34,148, 20.4%), central (3,434 of 34,148, 10.1%), and northeastern (397 of 34,148, 1.2%) regions. Figure 1 illustrates the estimated region- and province-specific distribution of HIV-1 genotypes (also see Tables 3 and S2). The high proportion of MSM (26.4%) was a unique feature of the northeastern region. Consistent with the high concentration of MSM, subtype B was detected at a higher proportion (21.4%) in the northeast than in any other region, although CRF01_AE infections were now more prevalent among MSM than subtype B (54.3% and 37.1%, respectively; see Table 3).
In the southwestern region, IDUs (8,285 of 17,147, 48.3%) and heterosexuals (7,361 of 17,147, 42.9%) were the two main risk groups, contributing almost equally to the epidemic. CRF08_BC (Table S2). This is the region where we observed the highest viral diversity after the eastern region (Table S2). (Table S2, Fig. 2A). IDUs and heterosexuals were the two main risk groups for the spread of CRF07_BC, accounting for 68.8% (8,344 of 12,122) and 21.4% (2,595 of 12,122), respectively (Table 2).
HIV-1 subtype B' was identified in 28 provinces (all except Ningxia and Xizang [Tibet]) and was the most widespread HIV-1 strain in China in terms of geographical reach (Fig. 2D). The highest prevalence of subtype B' was observed in central and southwestern China, especially in Henan (1,391 of 3,289, 42.3%), Anhui (319 of 3,289, 9.7%) and Yunnan (295 of 3,289, 9.0%) (Table S2). Because of the history of subtype B' transmission in the early-mid 1990s, former plasma donors (FPDs) were considered to be a primary risk group infected with subtype B' [30]. However, as shown in Table 2, the proportion of subtype B' infections was now highest among heterosexuals (1,373 of 3,289, 41.7%), followed by FPD+BT (1,214 of 3,289, 36.9%). This strongly suggests that subtype B' has spread from FPDs into the general population [30,31].

Discussion
We conducted a nationwide cross-sectional study on the HIV-1 genotype distribution in China, combining molecular data with province-specific estimates of the number of HIV infections in each risk population. Our study revealed a high level of complexity of the HIV epidemic in China. We detected a total of 10 HIV-1 subtypes and CRFs (Table 2), which are found in circulation on all continents of the world. This likely reflects the increasing mobility of people [32]. Viral diversity varied considerably among risk populations and regions in China. Subtype B' predominated among FPD+BT (92.5%) (Table 2), consistent with the history of a single explosive HIV epidemic in central China caused by contamination in commercial plasma donation in the early-mid 1990s [30]. Similarly, CRF07_BC predominated among IDUs in northwestern China (99.5%) (Table 3). In contrast, we observed the highest viral diversity among heterosexuals: all 12 genotype categories were detected in this population (Table 2). Furthermore, subtype B' and CRF07_BC, the strains typical of central and northwestern China, respectively, were also prevalent among heterosexuals in those regions: B' (59.4%) in central China and CRF07_BC (92.2%) in northwestern China (Table 3). This suggests that the spread of HIV-1 from particular high-risk populations (IDUs and FPDs) into the general population was significantly facilitated by heterosexual transmission in those regions [30,31,33,34].
Differences in viral diversity between geographical regions also point to differences in the drivers of HIV-1 transmission. CRF07_BC predominated in the northwestern region in the vast majority of risk groups (over 90%) (Table 3). This was especially true for Xinjiang: CRF07_BC was the single predominant HIV-1 strain even among heterosexuals, a genotypically diverse risk group in other regions of the country. This uniformity indicates a strong regional founder effect in the local HIV epidemic and suggests a lack of incoming transmissions from other regions. In contrast, although subtype B' was the main HIV genotype (60.1%) in central China, its prevalence was highest in the FPD+BT population, lower in heterosexuals, and it was rarely found among local IDUs and MSM (Table 3). Henan province, the center of the FPD epidemic, was the exception: there, subtype B' predominated in all risk groups. Infections in all other geographical regions were driven by multiple HIV-1 genotypes, which suggests that, with a few exceptions, the impact of geographical communities on HIV transmission was weaker than that of behavioral communities (i.e. risk groups). Interventions for curbing new infections may therefore benefit more from implementing risk behavior-specific measures across geographical boundaries than from a more regionally focused approach.
Further, changes in the relative prevalence of genotypes nationwide and among populations of interest illustrate shifts in China's HIV transmission patterns. The proportion of subtype B' among all infections decreased from 47.5% in 1998 and 29.1% in 2002 [35] to 9.6% in 2006, in accordance with national policies that have strictly regulated blood donations since the late 1990s [36]. However, by 2006, heterosexual contact had overtaken blood transmission as the leading route of subtype B' infection nationwide (41.7% and 36.9%, respectively), and B' also predominated among heterosexual infections in the central region (59.4%; Table 3). At the same time, the prevalence of CRF01_AE infections increased from 9.6% in 1998 and 15.5% in 2002 [35] to 27.6% in 2006 (Table 2). This is likely related to the increasing predominance of sexual transmission across the country. CRF01_AE was the most prevalent HIV-1 strain among heterosexuals (39.8%) and MSM (55.8%) (Table 2), and was highly represented in all geographic regions except the northwestern and central regions, where local genotypes prevailed (Fig. 1). Notably, transmission among MSM in major cities of China was previously dominated by subtype B (of US and European origin) [2], which has now been eclipsed by the increasing prevalence of CRF01_AE. This is consistent with previous observations in MSM populations in Beijing [20], Liaoning [21] and Shijiazhuang [22]. These trends underscore the increasing significance of sexual transmission in China's HIV/AIDS epidemic.
Our study has several limitations. Sampling bias may have occurred due to limited sample size in provinces with low HIV-1 prevalence. Bias may also have been introduced during the adjustment of genotype prevalences for risk- and province-specific subgroups, which excluded some subgroups of very low prevalence from calculation (e.g. IDUs in Anhui; see Table S1). Additionally, risk assessment and attribution are subject to biases; in particular, some MSM, commercial sex workers, and commercial sex clients may be unwilling to disclose their risk status. Future iterations of the national molecular epidemiologic survey should aim to sample each province- and risk-specific subgroup more adequately and to assess risk status more accurately.
In summary, our results illustrate the recent state of China's HIV/AIDS epidemic, reflecting the country's unique region- and risk group-specific transmission patterns. Our data provide information critical for designing effective interventions to limit future HIV transmission in China. As current approaches in HIV vaccine development often remain limited in eliciting immune responses against wide-ranging viral subtypes [37,38], improved knowledge of genotype prevalences can help inform vaccine research and development efforts tailored towards the particular HIV epidemic profile in China and nearby regions. In addition, we propose that our method for estimating genotype prevalence (also see [29]), which links molecular genotyping data with the numbers of HIV case reports in region- and risk group-specific subpopulations, may be widely applied to future molecular epidemiologic analyses. We aim to continue monitoring the trends in HIV-1 subtype distribution in China to identify and characterize newly emerging properties of public health importance.

Supporting Information
Table S1 Estimated numbers and proportions of HIV-1 infections for region- and risk-specific subgroups in China. (XLS)