Detailed Molecular Epidemiologic Characterization of HIV-1 Infection in Bulgaria Reveals Broad Diversity and Evolving Phylodynamics

Limited information is available to describe the molecular epidemiology of HIV-1 in Bulgaria. To better understand the genetic diversity and the epidemiologic dynamics of HIV-1 we analyzed 125 new polymerase (pol) sequences from Bulgarians diagnosed through 2009 and 77 pol sequences available from our previous study from persons infected prior to 2007. Epidemiologic and demographic information was obtained from each participant and phylogenetic analysis was used to infer HIV-1 evolutionary histories. 120 (59.5%) persons were infected with one of five different HIV-1 subtypes (A1, B, C, F1 and H) and 63 (31.2%) persons were infected with one of six different circulating recombinant forms (CRFs; 01_AE, 02_AG, 04_cpx, 05_DF, 14_BG, and 36_cpx). We also for the first time identified infection with two different clusters of unique A-like and F-like sub-subtype variants in 12 persons (5.9%) and seven unique recombinant forms (3.5%), including a novel J/C recombinant. While subtype B was the major genotype identified and was more prevalent in MSM and increased between 2000–2005, most non-B subtypes were present in persons ≥45 years old. CRF01_AE was the most common non-B subtype and was higher in women and IDUs relative to other risk groups combined. Our results show that HIV-1 infection in Bulgaria reflects the shifting distribution of genotypes coincident with the changing epidemiology of the HIV-1 epidemic among different risk groups. Our data support increased public health interventions targeting IDUs and MSM. Furthermore, the substantial and increasing HIV-1 genetic heterogeneity, combined with fluctuating infection dynamics, highlights the importance of sustained and expanded surveillance to prevent and control HIV-1 infection in Bulgaria.


Introduction
The rapidly evolving human immunodeficiency virus type 1 (HIV-1) is characterized by enormous genetic heterogeneity and is divided phylogenetically into four major groups: M (major), N (new), O (outlier), and the recently identified group P from a crossspecies transmission from an SIV-infected gorilla [1]. Nine subtypes of HIV-1 group M (A-D, F-H, J and K) and few subsubtypes, e.g. F1, F2, and A1 -A4 are currently recognized. In addition, a great variety of circulating recombinant forms (CRFs) and unique recombinant forms (URFs) have been identified adding to the growing genetic complexity of HIV-1 [2]. The unequal worldwide distribution of the different HIV-1 genotypes results from the global transmission and spread of certain variants or the limited spread of local endemic strains [2]. Subtype B is predominant in the Americas, Western Europe, and Australia [3,4], subtype A prevails in Russia and the former Soviet Union (FSU) countries, and is also prevalent in Africa. Subtype C is the most abundant genetic form in South and Eastern Africa, South East Asia, and worldwide followed by subtypes A and B. CRFs and URFs are widely distributed in countries where different subtypes co-circulate [5,6]. In Eastern Europe, Russia, Ukraine, Belarus and Moldova the HIV-1 epidemic is dominated by subtype A, followed by subtype B and CRF03_AB [7][8][9][10]. In the Baltic countries, subtype B predominates in Lithuania, subtype A1 is more common in Latvia, and the rare genetic form CRF06_cpx prevails in Estonia [11][12][13].
The HIV-1 epidemic in the Balkan region has been affected by various historic and socio-economic factors. Recently, several studies of HIV-1 molecular epidemiology in this region have reported a wide variety of introduced and prevalent HIV-1 genotypes with specific subtypes predominating in each country.

Study population and specimen preparation
In addition to 77 patients from our previous study in 2007 [27], we collected blood samples from 125 new persons for a total of 202 (18.36%) patients from the 1,100 persons diagnosed with HIV-1 infection in Bulgaria from the beginning of the epidemic until 2009. Convenience specimens were collected from different vulnerable populations, including MSM, IDUs, offspring of infected mothers, and blood product recipients. The blood samples were collected at the National HIV Confirmatory Laboratory and in clinics in the capital city of Sofia and the other two main cities of the country, Plovdiv and Varna. All patients consented to the testing and data collection procedures. Following specimen collection and patient interviews to obtain demographic and clinical information, the blood samples were linked to the demographic and clinical data through an anonymous numerical code in accordance with the ethical standards of Bulgaria. Plasma was obtained from the blood specimens by centrifugation at the National HIV Confirmatory Laboratory in Sofia and stored at 80uC as previously described [27].
Polymerase (pol) gene RT-PCR and sequence analysis Viral RNA was extracted from one ml plasma using the QIAampH Ultra Sens TM Virus Kit 50 (QIAGEN, Cat. No.53704). Generation of the protease (PR) and reverse transcriptase (RT) sequences of the HIV-1 pol gene was performed using either the Viroseq HIV-1 Genotyping Test (Abbott) and/or TruGene DNA Sequencing System (Siemens Healthcare) RT-PCR kits at the National HIV Confirmatory Laboratory in Sofia, Bulgaria. Automated DNA sequencing was done using an Applied Biosystems model 310 sequencer or an OpenGene DNA sequencing system (Visible Genetics, Siemens) following the manufacturer's protocol.
All 125 new HIV-1 pol sequences were first analyzed using the internet-based REGA HIV-1&2 Automated Subtyping Tool (Version 2.0) (http://jose.med.kuleuven.be/genotypetool/html/ subtypinghiv.html) [30], the COntext-based Modeling for Expeditious Typing, version 0.2 (COMET HIV-1/2) -http://comet. retrovirology.lu/, and the Geno2pheno v3.3 (www.geno2pheno. org/) programs to obtain preliminary HIV-1 subtype classification. Possible genetic composition of URF pol sequences was inferred using the BLAST sliding window algorithm implemented at NCBI (www.ncbi.nlm.nih.gov/projects/genotyping) and used to guide selection of reference sequences in bootscanning analysis. Manual bootscan analysis was done with the program SimPlot to confirm selected subtypes using the F84 nucleotide substitution model and a sliding window of 200-bp, a 40-bp step, with the transition/ transversion ratio determined empirically [31]. Percent nucleotide identities and distances were calculated using alignments of selected sequences using the software Geneious Pro v5.5.5 and MEGA5 using TN93+G+I as a substitution model, respectively. Detection of recombination was also evaluated using the programs RDP, 3Seq, GENECON, MaxChi, and Chimaera, implemented in RDP v3 [32].
Individual BLAST searches of all 125 new sequences was performed and the most similar GenBank sequences for each subtype and CRF were downloaded from the HIV Los Alamos sequence database for further sequence analysis (http://www.hiv. lanl.gov/). To obtain a comprehensive description of the molecular epidemiology of HIV-1 in Bulgaria we also included in the analyses 77 HIV-1 sequences from Bulgaria previously reported by our group. Nucleotide alignments were prepared using Clustal W v1.6 in the MEGA5 and BioEdit software packages followed by manual editing [33,34]. Several alignments were made for phylogenetic analysis, including the complete data set consisting of 125 new and 77 previously reported Bulgarian sequences, phylogenetically similar sequences from the BLAST search and HIV-1 subtype reference sequences from 2011 available at the Los Alamos HIV database. To investigate further the molecular epidemiology of HIV-1 in Bulgaria, we also analyzed subsets of Bulgarian and reference sequences by using selected genotype clusters inferred by maximum likelihood (ML) Bayesian analysis.
The best fitting distance model of nucleotide substitution for each alignment was inferred using the ML method with goodness of fit measured by the Bayesian information criterion in MEGA5. The best fitting nucleotide substitution model for the phylogenetic alignments was inferred to be the general time reversible model (GTR) with discrete gamma and invariant among-site rate variation. Phylogenetic relationships were inferred using the ML method in MEGA5 and using Bayesian analysis with BEAST v1.6.2 [35] program. Stability of the ML tree topologies was tested using 1,000 nonparametric bootstrap replicates, whereas statistical support for the inferred Bayesian trees was assessed by posterior probabilities. For the Bayesian analyses, a relaxed molecular clock model was used and each run consisted of two independent 800610 6 Markov chain Monte Carlo (MCMC) generations with sampling every 800,000 th generation and a constant coalescent tree prior. Convergence of the MCMC was assessed by calculating the effective sampling size (ESS) of the runs using the program Tracer v1.5 (http://beast.bio.ed.ac.uk/Tracer). All parameter estimates showed significant ESSs .1,200. The tree with the maximum product of the posterior clade probabilities (maximum clade credibility tree) was chosen from the posterior distribution of 8,001 sampled trees after burning in the first 1,000 sampled trees with the program TreeAnnotator version 1.6.2 [36].
Potential epidemiologic clusters were defined using a stringent set of criteria and included those sequences grouping together in phylogenetic analysis with posterior probabilities $0.97, $96% ML bootstrap support, and sharing .90% nucleotide identity per total sampling period between related sequences. The latter estimate is based on the 10 23 substitution rate for HIV-1 pol sequences generating a divergence rate of 1-2%/year between founder and transmitted viruses [36]. For example, HIV sequences sampled 5 years after a transmission event would expect to be 5-10% divergent. Antiretroviral resistance-associated mutations were detected using the Stanford genotypic resistance algorithm (http://sierra2.stanford.edu) and stripped from partial alignments containing potential transmission clusters to minimize clustering artifacts.

Accession numbers
GenBank accession numbers for the 125 new pol sequences are JQ259060-JQ259184.

Statistical analysis
To assess the epidemiological associations with each of the major HIV-1 subtype groups, multivariable logistic regression models were constructed, controlling for the following demographic and risk mode variables: age at diagnosis, gender, transmission mode, infection abroad, and diagnosis period. Each subtype group (B, other major subtypes combined (C, A1, F1, H), 01_AE, 02_AG, rare CRFs, and URFs) was modeled relative to all other subtype groups collectively. Odds ratios and 95% confidence limits were calculated. Single variable logistic regression models were also performed and results are reported for findings that differed from those of the multivariable models. In addition, linear trends (degree of freedom = 1) in subtype prevalence and in proportion of infections by risk transmission category were assessed using the Cochran-Armitage trend test. Differences in subtype diversity by population subgroups were assessed using a Fisher-Freeman-Halton test for nominal independence.
The location of the various genotypes was plotted on a map of Bulgaria to examine trends in their geographic distribution ( Figure 7). The majority of infections and greatest heterogeneity was observed in the capital of Sofia, followed by the three other major cities of Plovdiv, Burgas, and Varna. The most common subtypes B and CRF01_AE showed the widest distribution across Bulgaria. The majority of the URFs were located in and around Sofia.

Identification of HIV-1 clusters within Bulgaria
Using stringent phylogenetic and nucleotide identity criteria as evidence of possible epidemiologically linked HIV-1 sequences, we identified 10 clusters of highly associated sequences ( Figure 1). The majority of the clusters occurred in the CRF02_AG and subtype B clades ( Figure 1). Within CRF02_AG there was one group of six, and three pairs of related sequences (Figures 1 and  5A). A monophyletic lineage with high posterior probability containing six sequences (patients 927, 859, 822, 836, 826,950 and 836) was obtained from specimens collected over about a year from male IDUs who all lived in Plovdiv and the neighboring city of Haskovo. All six sequences shared .97.4% nucleotide identity. A heterosexual pair (907 and 908) from Sofia reported contact with each other and was infected with subtype CRF02_AG and shared 99.4% nucleotide identity (Figures 1 and  5A). Heterosexual spouses (319 and 322) were both infected with subtype CRF02_AG pol sequences that clustered together with a 1.0 posterior probability, 99% bootstrap support, and which shared 98% nucleotide identity (Figures 1 and 5A). The fourth cluster contained sequences from three heterosexual females (138, 496, 1041) of which one (138) reported possible infection while in Ghana [37]. HIV-1 CRF02_AG in person 1041 was also highly related to that from an individual from Cyprus (Figures 1 and 5A). All four sequences in this cluster shared .95% nucleotide identity.
Seven sequences (133, 207, 210, 228, 376, 597, and 906) were unique and clustered together with strong support as sister clades of subtypes CRF05_DF, F1 and BF (Figures 1 and 4A). Sequence analysis showed that this F-like cluster shared 90.7-97.1% nucleotide identity which is greater than that within F1 (94-96.5%), F2 (95.8-97.1%), and CRF05_DF (94.9-96.6%) in this region of pol. Likewise, the F-like Bulgarian sequences were equidistant from the F1, F2, and CRF05_DF subtypes sharing 75-79% nucleotide identity which is greater than the range of sequence identity between F1, F2, and CRF-05_DF in this region (89-96%). Bootscan analysis showed the F-like sequences to be mostly composed of F1 sequences but with weak support suggesting they contain unidentified sequences (Figure S1A-G). Analysis of F2 sequences has also determined that they may contain fragments of an uncertain origin [38]. Additional phylogenetic analysis including sequences from BLAST analysis showed these seven sequences clustered with strong support with an F-variant from the former Soviet Union (FSU) ( Figure 4A) [39]. Thus, we are tentatively classifying these new Bulgarian sequences as F-like sub-subtype variants.
Similarly, five persons (434, 540, 544, 584, and 833) were infected with another novel sub-subtype that formed a distinct lineage between A1 and CRF01_AE in the phylogenetic analysis (Figures 1 and 4B). Bootscan analysis showed the genetic structure of these five sequences was similar and is composed of mostly A1 and unidentified sequences ( Figure S2A-E). The nucleotide identities within this group of five sequences (94-96.9%) was similar to that seen within A1 (94.2-95.3%), A2 (93.8-95.3%), A3 (93.8-98.6%), A4 (94.1-94.9%), and CRF01_AE (95.3-97.5%). In addition, this group of five Bulgarian sequences and the A1, A2, A3, A4, CRF01_AE reference sequences were nearly equidistant from each other sharing about 90-95%, which is similar to that between subtypes B and D in this region. The distinct relationship of these five A-like sequences to other subtype A sequences was also supported by phylogenetic analysis as were the A1, A2, A3, and A4 sub-subtypes ( Figure 4B). The five A-like sequences clustered with other pol sequences from HIV-1 infected persons from Africa, FSU, and persons from Cyprus who reported infection in FSU that have also been described as A variants or A sub-subtypes and were shown to originate from Africa [40]. These findings are consistent with our previous results showing an A clade from FSU but for which we classified these variants as A1 genotypes [27]. Interestingly, three of these five persons reported acquiring infection abroad; one in Russia (833), one in Greece (544), and one (540) who did not define a country and is consistent with the prevalence of these A-variants in FSU and Greece. Recombination was not detected in any of the seven F-like or five A-like pol sequences using the program RDP and the phylogenetic results were not influenced by the presence of antiretroviral drug resistant mutations (data not shown). Combined, and in accordance with the 1999 HIV-1 nomenclature proposal [41], these data suggest these Bulgarian A-and F-like sequences may represent sub-subtypes first reported in FSU but originating from Africa [40].
The remaining different but highly related sequence sets were composed of mostly subtype B infections (three sets) and one group of five subtype A1 sequences (326, 673, 687, 551 and 862) (Figures 1 and 4B). Male participants 326, 673, and 687 were all infected with HIV subtype A1 that shared .96.5% nucleotide  (Figures 1  and 2). The intrapair HIV-1 pol nucleotide identity for both MSM pairs was .99.8%. Contact information was not available for the remaining two sequences (553 and 564) to investigate further their potential epidemiologic links. These three sets of B sequences still clustered together after phylogenetic analysis of over 4,000 global subtype B sequences in the GenBank database ( Figure S3) further supporting their molecular and possible epidemiologic linkage. Association of major HIV-1 subtypes with demographic and risk groups in Bulgaria Subtype B is the most frequent HIV-1 present in our study population and was found in approximately half (104/202, 51.5%) of the patient samples ( Table 2, Figures 1 and 2). Nine patients (8.7%) reported acquiring HIV-1 outside of Bulgaria; one each in England, Greece, Italy, South Africa, Romania, Spain, two in Germany and one patient did not report a specific country where he was infected. In a multivariable model, controlling for age, gender, risk mode, country of infection, and calendar period of diagnosis, relative to other subtypes, subtype B was independently higher in MSM (odds ratio (OR) = 16.0, p-value,0.001) and significantly lower in IDU (OR = 0.2, p-value = 0.02) compared to heterosexual risk, and significantly lower in females (OR = 0.3, pvalue = 0.002) compared to males and persons $45 years of age at diagnosis (OR = 0.3, p-value 0.05) compared to persons 20-44 years of age (Table 3). There was a significant spike in the relative prevalence of subtype B during the 2000-05 period in our study population (OR = 3.5, p-value = 0.03). Phylogenetic analysis of over 4,000 global subtype B sequences showed that this subtype has been introduced into Bulgaria on multiple occasions followed by local expansion demonstrated by clustering of some Bulgarian sequences in this tree ( Figure S3).

Association of circulating and complex recombinant HIV-1 forms with demographic and risk groups in Bulgaria
Phylogenetic and subtype analysis classified 63/202 (31.2%) HIV-1 sequences from Bulgaria as CRFs (Table 2, Figures 1, 5,  6A and B). CRF01_AE is the most common HIV-1 recombinant form in Bulgaria with 40 (19.8%) of the 202 sequences in our study. Based upon multivariable regression, compared to all other subtypes, CRF01_AE prevalence is significantly higher among females (OR = 2.5, p = 0.02) and IDUs (OR = 4.0, p = 0.02) relative to other risk groups combined ( Table 3). Two of four newborns in the study were also infected with HIV-1 CRF01_AE reflective of more women with this subtype.
We found HIV-1 CRF02_AG to be the second most frequent CRF group in Bulgaria with 15/202 persons (7.4%) infected with this subtype ( Table 2, Figures 1 and 5A). In a single-variable regression analysis (data not shown), compared to other subtypes, CRF02_AG infection was significantly higher during later diagnosis periods. However, this finding was explained by the coincidental increase in IDU infection over time. In the multivariate analysis, CRF02_AG prevalence was higher among IDU (OR = 8.4, p = 0.003) and females (OR = 4.6, p = 0.02) ( Table 3) independent of other potential risk factors.
Only one sequence (062) genotyped as HIV-1 CRF04_cpx and which clustered with strong support with similar reference strains from Cyprus, Greece and France (Table 2, Figures 1, 6A and B), countries where this complex form has only been identified to date [42][43][44]. Phylogenetic analysis of all CRF04_cpx sequences available at Los Alamos and GenBank comparable in size to our sequence confirmed the genotype of sequence 062 ( Figure 6B).
One Bulgarian HIV-1 pol sequence (150) clustered with significant support with the reference CRF36_cpx sequence [2] ( Table 2, Figures 1 and 6A). This patient reported probable acquisition of infection while in Libya but this rare subtype has only been found in Cameroon [2].
Two HIV-1 pol sequences (534 and 669) clustered with the referent CRF14_BG sequence ( Table 2, Figures 1 and 6B). One of these patients (534) reported probable acquisition of infection while in Spain. CRF14_BG is widespread in Portugal and Spain, and in some African countries, and was also recently found among IDUs in Greece [45,46]. Collectively, these rare HIV-1 genotypes were significantly less likely to be acquired in Bulgaria (OR = 0.1, p = 0.006) and were more likely to be found in women (OR = 5.6, p = 0.04) based upon multivariable analysis.
In addition to the 12 A and F sub-subtype variants described above, seven other persons were infected with novel URFs (Table 2, Figure 1, 6A and 6C). HIV-1 pol sequences from two patients (004 and 297) clustered together weakly but formed a unique lineage between subtypes C and BC with a strong posterior probability (Figure 1). Bootscan analysis revealed that the pol sequence from person 004, was composed of unidentified and subtype C sequences, whereas the HIV-1 pol from person 297, who reported possible infection abroad, was composed of subtype J and C sequences with strong support ( Figure 6C and Figure S4A and B). Two sequences (021 and 452) clustered with genotype CRF34_01B phylogenetically but with weak bootstrap and posterior probability (Figure 1). Bootscan analysis showed sequence 021 was composed of A1, A2, and unidentified . Inferred phylogenetic relationships of Bulgarian HIV-1 subtypes C and H. Tree structure was inferred using maximum likelihood analysis of polymerase sequences implemented in MEGA5. Support for each node was determined using 1,000 bootstrap replications with only values $70 shown. Scale bar indicates the number of nucleotide substitutions per site. Antiretroviral resistanceassociated mutations were stripped from the alignments. Nearly identical tree topologies were also obtained with Bayesian analysis. The 690-bp alignment consisted of 9 HIV-1 C and H strains from Bulgaria and 29 Group M reference sequences from the Los Alamos HIV database. The tree was rooted by using an HIV-1 subtype J strain as the outgroup. Bulgarian sequences are shown using green branches and taxon names. doi:10.1371/journal.pone.0059666.g003 sequences, while sequence 452 was mostly a mixture of unidentified sequences ( Figure 6C and Figure S4C and D). Sequence 063 formed a lineage ancestral to the HIV-1 A1 and CRF02_AG genotypes (Figures 1 and 6C) and was composed of pol sequences from A1 and undefined fragments ( Figure S4E). Sequence 399 clustered with CRF45_cpx with weak support (Figure 1) but bootscan analysis revealed it was a complex of CRF45-cpx and CRF09-cpx sequences ( Figure S4F). Sequence 316 clustered with CRF43_02AG reference sequences with strong posterior probability (Figures 1 and 6A), but was composed of both CRF43_02G and G fragments ( Figure S4G). Based upon multivariable analysis, these URFs were significantly less likely to occur among persons infected while in Bulgaria (OR = 0.2, p = 0.0074) ( Table 3).

Trends in the HIV-1 subtype prevalence and total infections in high risk groups
To better understand the temporal trends of the epidemiology of HIV-1 infection in Bulgaria and for purposes of targeting prevention resources, we analyzed trends in the prevalence of HIV-1 subtypes, and trends in the major risk groups across four time periods (1986-1995, 1996-1999, 2000-2005 Figures 8A and B). CRF01_AE prevalence decreased after 1996-1999 even though this subtype was also strongly associated with IDU and IDU prevalence increased during the last time period (Table 3, Figures 8A and B). The overall CRF02_AG and CRF01_AE prevalences decreased during the last period because the larger proportion of heterosexual infections in our study (140/202) with these subtypes decreased during the last period which was greater than the number of new IDU infections with these CRFs. Subtype B did not significantly increase over time using multivariable or single variable analyses (data not shown) despite a significant spike in prevalence from 2000-2005 (OR = 3.5, p-value = 0.03) likely explained by overrepresentation of MSM in the study population and a simultaneous decrease in infected heterosexuals during that period ( Figures 8A and B). The rare subtypes 04_cpx, 05_DF, 14_BG, 36_cpx, were most prevalent during 1986-1999 (18.2%), but have declined to 3.9% in 2006-2009 (p = 0.03).
Proportions of younger infected persons, ages ,20, were significantly declining (trend test p,0.001). Persons infected through heterosexual contact and vertical transmission were declining, while those infected through MSM or IDU were increasing (p,0.001) ( Figure 8B  The 758-bp alignment was composed of 9 subtype F and F-like sub-subtype polymerase sequences from Bulgaria and 43 subtype F reference sequences from the Los Alamos HIV database and highly related sequences identified using BLAST. The tree was rooted by using three HIV-1 subtype J strains as the outgroup. (B) A-like sub-subtypes. The 754-bp alignment consisted of 10 subtype A and A-like sub-subtype polymerase sequences from Bulgaria and 121 subtype A reference sequences from the Los Alamos HIV database and highly related sequences identified using BLAST. Antiretroviral resistance-associated mutations were stripped from the alignment. The tree was rooted by using three HIV-1 J strains as the outgroup. doi:10.1371/journal.pone.0059666.g004

Discussion
In the present study we describe the expanded molecular surveillance and epidemiological assessment of the HIV-1 epidemic in Bulgaria. Our results extend those reported previously by our group that initially identified multiple subtypes in Bulgaria, which we proposed were sustained by viral inflow from European countries, USA and Africa [27]. In addition to our earlier results, current national data from ongoing HIV and AIDS surveillance suggest that the HIV/AIDS incidence in Bulgaria remains low overall. To obtain a broader picture of the HIV-1 epidemic in Bulgaria we analyzed new pol sequences from 125 HIV-1-infected persons and combined these with 77 pol sequences from patients reported in our previous study, altogether representing about 20% of infections identified to date. Our study shows that the HIV-1 epidemic in Bulgaria has several key characteristics relative to the genetic composition of circulating viruses in the country, including a high genetic diversity of more than 15 different HIV-1 subtypes, CRFs and URFs, two new sub-subtype variants, and the introduction of seven extremely rare HIV-1 forms, including CRF05_DF, CRF04_cpx CRF36_cpx, CRF14_BG, and a plethora of different URFs that are distinct from all reference sequences. The data suggest the presence of local networks in certain risk groups with specific HIV-1 genotypes [29], an unequal subtype distribution among different risk groups, with most MSM infected with subtype B, and the lack or low prevalence of certain genotypes in specific groups, and the fluctuation of different HIV-1 genotypes overtime and in certain risk groups.
In our current study we found a broader genetic heterogeneity of HIV-1 circulating in Bulgaria than that previously reported by our group [27]. Most infections were with the major HIV-1 subtypes followed by a variety of CRFs, URFs and sub-subtypes. Although subtype B was the most prevalent genotype in both studies and represents about half the HIV-1 infections in Bulgaria, our results differ from those in Western Europe where subtype B comprises the vast majority of infections [2,47,48]. Our results show that subtype B has been introduced into Bulgaria multiple times, but also with local expansion in certain populations. In contrast to our previous study [27,48], the second largest genotype in Bulgaria is now CRF01_AE, followed by CRF02_AG. We also identified seven new complex circulating forms not seen before in Bulgaria and which are typically found in Cameroon, Libya, Singapore, Thailand, Saudi Arabia, Greece, Cyprus, Portugal, Spain, or France. Notably, two of these CRFs have been associated with outbreaks in IDUs in Greece (CRF04_cpx and CRF14_BG) [43][44][45][46], but only one of these genotypes (CRF14_BG) was present in an IDU in Bulgaria and may represent a recent infection. The others were present in non-IDU populations (three heterosexual and one blood product recipient) suggesting that these rare variants may be spreading outside of IDUs. We also identified nine possible new URFs circulating in Bulgaria that may be the result of the mixing of so many genotypes in the population similar to the natural history of HIV-1 in Africa. These URFs were found to be associated with persons reporting travel abroad who may be facilitating the increase of viral diversity in Bulgaria. All of the URFs identified in our study, including the clusters of A-like and F-like sequences, will require further characterization using complete genomes to determine their genetic composition and final classification as was done for subtype K and the A and F sub-subtypes [41,[49][50][51].
The distribution of HIV-1 subtypes in our study population varied by age, sex, geography, and risk exposure. The greatest number of infections and broadest HIV-1 diversity occurred in major cities where most immigrants tend to live. We also observed the largest number of HIV subtypes and hence the greatest amount of HIV diversity in persons over 45 years old despite their being fewer total infections in this age group. Persons infected by heterosexual or IDU transmission had the greatest variety of subtypes, compared to MSM (p,0.0001). Furthermore, we also observed that closely related viral clades are increasing in circulation in geographically restricted IDU subgroups. The HIV-1 CRF02_AG seen in one IDU subgroup appears to have been transmitted within separate transmission clusters. Both women and IDUs were more likely to be infected with subtypes CRF01_AE and CRF02_AG, though the trend in IDUs may be due to an overall increase in IDU infections over time in Bulgaria. In contrast, with the exception of two persons, all MSM were infected with subtype B. Our results are consistent with several studies from different European countries that found a predominance of subtype B in MSM [22,47,48,52]. There is also recent Figure 6. Inferred phylogenetic relationships of Bulgarian HIV-1 subtypes. Tree structure was inferred using maximum likelihood analysis of polymerase sequences implemented in MEGA5. Support for each node was determined using 1,000 bootstrap replications with only values $70 shown. Scale bar indicates the number of nucleotide substitutions per site. Antiretroviral resistance-associated mutations were stripped from the alignments. Nearly identical tree topologies were also obtained with Bayesian analysis. (A) Multiple Circulating and unique recombinant forms (CRF and URF, respectively). The 696-bp alignment was composed of one CRF04_cpx, four CRF05_DF, two CRF14_BG, and one CRF36_cpx strains and one CRF43_02G-like strain from Bulgaria and 71 related reference sequences from the Los Alamos HIV database. The tree was rooted by using an HIV-1 subtype J strain as the outgroup. Bulgarian sequences are shown using green branches and taxon names. (B) CRF04_cpx. The 1043-bp alignment was composed of one HIV-1 CRF04_cpx1 strain from Bulgaria and 69 reference sequences from the Los Alamos HIV database and related sequences identified by BLAST. The tree was rooted by using an HIV-1 subtype J sequence as the outgroup. Bulgarian sequence is depicted using a green branch and taxon name. (C) Identification of novel HIV-1 URFs in Bulgaria. The 696-bp alignment was composed of six HIV-1 URF strains from Bulgaria and 68 reference sequences from the Los Alamos HIV database. The tree was rooted by using an HIV-1 subtype J strain as the outgroup. Bulgarian sequences are shown using green branches and taxon names. Bulgarian sequence #297 was shown by SimPlot analysis to be a novel recombinant of subtype J and C sequences. doi:10.1371/journal.pone.0059666.g006    independent evidence that subtype B in Bulgaria may be phylogenetically related to subtype B strains identified in Russia, Greece and some other neighboring Balkan countries [47]. In addition, our phylogenetic and epidemiologic analyses showed an absence or reduced prevalence of certain viral strains in specific populations. For example, neither MSM nor individuals infected by blood transfusion were infected with HIV-1 CRF01_AE, which was the second most common genotype in Bulgaria. These findings suggest a pattern of independent and limited transmission of some viral clades within certain population groups without viral exchange between them. The high level of HIV-1 diversity and wide distribution among various risk groups and demographic classifications are likely to influence various aspects of the epidemic in Bulgaria, including patient treatment, disease progression, laboratory monitoring, and vaccine development. Analysis of longitudinal epidemiologic and demographic data with HIV-1 genotype can provide important information regarding infection dynamics during different stages of the epidemic in Bulgaria. Using this strategy by year of HIV/AIDS diagnosis we found a dissimilar rate of introduction and spread of different HIV-1 clades over time during the history of the epidemic in Bulgaria. Some clades persist, such as subtype B, while others like CRF01_AE and subtype C have declined, and other clades appear to be emerging, like CRF02_AG. We also found that there was a significant increase in the number of infections in MSM and IDUs with a concomitant decline in heterosexual and vertical infections. Some of these observed changes in the dynamics of HIV-1 infection in Bulgaria can be explained by the initial introduction of specific HIV-1 clades in certain populations spreading afterwards within populations at increased risk for infection. We also found that during the early years of the epidemic HIV-1 was most likely found in Bulgarians returning from abroad and who then transmitted HIV-1 via heterosexual contact. However, in recent years the significant increase in transmission of certain HIV-1 subtypes to and among IDUs and MSM is most likely due to increased risky behavior and spread of subtypes within these groups but also coincided with a decrease in heterosexual infections.
Our results also suggest that the infection dynamics and HIV-1 genetic diversity of the HIV-1 epidemic in Bulgaria are fluctuating.
There are a few factors that seem relevant to help explain this phenomenon. Firstly, the central geographic location of Bulgaria at the crossing point between Western Europe, Eastern Europe, and the Middle East facilitates the introduction of HIV-1 into and beyond Bulgaria. In addition, an increase of West African immigrants to Europe to escape war, oppression and poverty has likely helped increase the HIV-1 diversity in the country [53]. The spread of HIV-1 within high risk groups like IDUs and MSM from 2006-2009 supports the changing epidemiological dynamics of HIV-1 infection which could indicate an increase of these populations in Bulgaria or increases in risky behavior or both. There has been a dramatic reduction and inversion of the socioeconomic status of the Bulgarian population from a planned to a market economy after 1989 associated with high unemployment rates of about 20% in 2001 prior to the period of our study and a subsequent fall in the gross domestic product by 50%, which may lead individuals to substance abuse with an increased risk for infection (www.google.com/publicdata). An underestimation of MSM transmission in the early years of the epidemic due to stigma associated with reporting of homosexual contacts could also explain the observed increase in MSM infections. Thus, control of the HIV-1 epidemic in these populations will require targeted interventions in conjunction with continuous and increased surveillance efforts of specific at-risk groups, especially IDUs and MSM and their contacts. These public health initiatives will be crucial to understanding the details of the epidemiology of HIV-1 in Bulgaria and to assess the efficacy of prevention strategies and control of the epidemic.
The cross-sectional design of our study, which included convenience specimens, collected as patients were diagnosed or came into clinics for care, may not truly represent the other 80% of reported cases in Bulgaria. The effect of this possible sampling error in our analyses is unknown but may influence genotype distribution over time and in different populations. For example, we determined that MSM were overrepresented and IDUs were underrepresented in our two studies combined which may affect the trend interpretations in these risk groups. Increased HIV-1 testing and surveillance, especially in IDUs, should help to overcome this potential bias and improve the accuracy of understanding the epidemic in Bulgaria and the proposed targeted interventions. Also, our phylogenetic results are based on a single genomic region, in some cases genotypes were only identified from limited numbers of patients. Thus, confirmation of the rare genotypes identified herein may require sequence analysis of additional regions or complete genomes and their detection in larger numbers of persons. Any potential association of infection with a geographic location is based on self-reporting and may be affected by recall or other biases and thus may require analysis of larger HIV-1 datasets from the purported country where the transmission was reported to confirm the origin. Minor differences between our results and those of our previous study most likely represent the use of more stringent criteria for genotype classification and the use of a larger set of reference sequences that included more CRFs in the current study.
In conclusion, we found a broad level of HIV-1 genetic diversity in Bulgaria, including most major subtypes and CRFs and the identification of novel URFs and sub-subtypes, whose composition fluctuates in different populations and during different phases of the epidemic. Our data also suggest the extremely dynamic nature of the Bulgarian epidemic is characterized by an unequal distribution of different HIV-1 genotypes among high risk populations. These findings emphasize the need for sustained and focused molecular epidemiological surveillance to identify transmission links that can be targeted by prevention strategies to control the HIV-1 epidemic in Bulgaria.