Genetic Analysis of HIV-1 Subtypes in Nairobi, Kenya

Background: Genetic analysis of a viral infection helps in following its spread in a given population, in tracking the routes of infection and, where applicable, in vaccine design. Additionally, sequence analysis of the viral genome provides information about patterns of genetic divergence that may have occurred during viral evolution.
Objective: In this study we analyzed the subtypes of Human Immunodeficiency Virus-1 (HIV-1) circulating in a diverse sample population of Nairobi, Kenya.
Methodology: Sixty-nine blood samples were collected from a diverse subject population attending the Aga Khan University Hospital in Nairobi, Kenya. Total DNA was extracted from peripheral blood mononuclear cells (PBMCs) and used in a Polymerase Chain Reaction (PCR) to amplify the HIV gag gene. The PCR amplimers were partially sequenced, and alignment and phylogenetic analysis of these sequences were performed using the Los Alamos HIV Database.
Results: Blood samples from 69 HIV-1 infected subjects from varying ethnic backgrounds were analyzed. Sequence alignment and phylogenetic analysis showed 39 isolates to be subtype A, 13 subtype D, 7 subtype C, 3 AD recombinants, 3 CRF01_AE, 2 subtype G, 1 AC recombinant and 1 AG recombinant. Deeper phylogenetic analysis revealed HIV subtype A sequences to be highly divergent compared to subtypes D and C.
Conclusion: Our analysis indicates that HIV-1 subtypes in the Nairobi province of Kenya are dominated by a genetically diverse clade A. Additionally, the prevalence of highly divergent, complex subtypes, intersubtypes, and recombinant forms indicates viral mixing in the Kenyan population, possibly as a result of dual infections.


Introduction
Human Immunodeficiency Virus (HIV) has several perplexing attributes that distinguish it from other viruses. An important determinant of the pathogenicity of this virus is its genetic heterogeneity, which results mainly from the error-prone reverse transcriptase activity during viral replication (introducing an average of one error per genome per replication cycle), the rapid turnover of HIV-1 in vivo, recombination (which occurs at a rate of about 2% per kilobase per replication cycle), and selective immune pressure by the host [1][2][3][4][5].
HIV has been primarily classified on the basis of geographical distribution and animal source of human infection into two types, HIV-1 and HIV-2. Both types can be transmitted through sexual contact and blood, as well as from mother to child, and both are capable of causing AIDS. The propensity of the virus towards genetic diversity leads to rapid emergence of new types and subtypes that evolve further into recombinant forms and quasispecies specific to infected groups and communities [6][7][8][9][10][11]. The tremendous genetic variation in HIV-1, caused by high mutation rates and recombination [12], has led to classification of the virus into three distantly related groups: Main group (M), Outlier group (O) and non-M-non-O group (N). The M group, which dominates the AIDS pandemic, has been subdivided into at least 12 distinct lineages, designated as subtypes and sub-subtypes (A1, A2, B, C, D, F1, F2, G, H, J, K and L), and almost 33 circulating recombinant forms (CRFs) [13][14][15]. Viral epidemiological studies therefore provide opportunities to monitor the global spread of HIV, track routes of infection and analyze patterns of the virus's genetic divergence [9].
Earlier studies have demonstrated that HIV-1 subtypes are not randomly distributed across the globe and show distinct geographical distribution [16,17]: subtypes A and D are predominant in Africa; subtype B in the USA, Europe, Australia, Thailand and Brazil; subtype C in South Africa, Ethiopia and India; subtype F in some regions of Central Africa and Eastern Europe; and CRF01_AE in southeast Asia [4,18,19,20,21]. With the increase in the prevalence of HIV, the geographic distribution of subtypes has diversified to a large extent. The greatest genetic variation in HIV-1 has been found in regions where the HIV epidemic is oldest, such as sub-Saharan Africa, where most of the HIV-1 subtypes and many of the CRFs have been identified [22]. Kenya is one of the many countries of sub-Saharan Africa where the HIV pandemic has had an overwhelming effect. By the end of 2005, an estimated 1.5 million people in Kenya were infected with HIV. In the same year, about 140,000 adults and children were estimated to have died from AIDS [23]. Epidemiological studies have indicated that the majority of HIV cases in Kenya belong to subtype A [24]. Although HIV-1 subtype A is dominant in Kenya, there is an increasing prevalence of other subtypes and recombinant viruses as well. For instance, the CRF10 strain was first identified in western Kenya [25]. Epidemiological studies in Kenya have also reported increased prevalence of subtypes C and D [26].
There are no recent published data regarding HIV subtypes in the general population of Nairobi, one of the main cities of Kenya. A few studies have tracked the HIV-1 subtypes in Nairobi, but these studies are relatively outdated, have dealt with limited sample sizes [26,27], or have targeted specific high-risk groups [28]. In the present study, we investigate the prevalence of circulating HIV-1 subtypes in a sample that is larger and more diverse than those studied previously.

Study subjects
Ethical approval for this study was obtained from the Ethical Research Council, Aga Khan University. For the study, HIV-infected residents of Nairobi were recruited at the Aga Khan Hospital, Nairobi, Kenya. Informed consent was obtained from all participants, along with data on gender, age, ethnic background, occupation, and marital status. The HIV-1 status of these subjects was previously known to be positive based on their HIV-1 antibody test. All subjects who gave consent were recruited into the study, and their blood samples were obtained during the year 2007.

Extraction of genomic DNA
3-4 ml of whole blood was collected from each subject. Extraction of DNA was carried out as described previously [29]. Briefly, to 0.5 ml of peripheral blood mononuclear cells (PBMCs), 0.9 ml of 1× RBC lysing solution (0.32 M sucrose, 1% Triton X-100, 5 mM MgCl2·6H2O, 12 mM Tris-HCl, pH 7.6) was added, and the mixture was centrifuged at 13,000 rpm for 1 min. After discarding the supernatant, the pellet was re-extracted with 0.9 ml of RBC lysing solution. After centrifugation at 13,000 rpm, the pellet was washed with 1 ml of water. To the pellet, 20 µl of 20% SDS, 80 µl of Proteinase K buffer (0.375 M NaCl, 0.12 M EDTA, pH 8.0) and 40 µl of 10 mg/ml Proteinase K were added, and the solution was incubated at 56°C for 1 hour. Subsequently, 200 µl of 6 M NaCl was added to the suspension, which was then centrifuged at 13,000 rpm for 5 min. The supernatant was transferred to a fresh tube and mixed with 400 µl of isopropanol. DNA was then pelleted by centrifugation at 13,000 rpm for 5 min. The DNA pellet was washed with 70% ethanol, air-dried, re-suspended in 100 µl of water, and stored at −20°C.

β-globin PCR
To ascertain the quantity and quality of the extracted DNA, β-globin PCR was carried out using the previously described [30] primers PC03 (5′-ACACAACTGTGTTCACTAGC-3′) and PC04 (5′-CAACTTCATCCACGTTCACC-3′). The final 25 µl PCR mixture contained 5 µl of sample, 1× PCR buffer (5× Green GoTaq® Flexi Buffer, pH 8.5), 1 mM MgCl2, 200 µM dNTPs, 0.2 pmol of each primer and 0.2 U of Taq polymerase. Thermocycling conditions were: denaturation at 94°C for 5 min, followed by 40 cycles of denaturation at 94°C for 30 sec, annealing at 51°C for 30 sec and extension at 72°C for 30 sec, with a final extension at 72°C for 5 min.
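As a rough plausibility check on the 51°C annealing step, the melting temperatures of the two β-globin primers can be estimated with the Wallace rule, Tm ≈ 2(A+T) + 4(G+C). This rule of thumb is not part of the protocol described here and is only approximate for primers of about 20 nt; the function name below is illustrative.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C via the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


# Primer sequences as given in the text (5' to 3').
PC03 = "ACACAACTGTGTTCACTAGC"
PC04 = "CAACTTCATCCACGTTCACC"

tm_pc03 = wallace_tm(PC03)  # 11 A/T and 9 G/C bases
tm_pc04 = wallace_tm(PC04)  # 10 A/T and 10 G/C bases
```

Under this estimate the primers come out at 58°C and 60°C. Annealing temperatures are commonly set several degrees below the lower estimated Tm, which is broadly in line with the 51°C used.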
For the nested gag PCR, the 25 µl reaction mixture for both the first and second rounds contained 1× PCR buffer (5× Green GoTaq® Flexi Buffer, pH 8.5), 2 mM MgCl2, 400 µM dNTPs and 0.3 U of Taq polymerase. The first round of PCR was performed with 0.48 pmol of primers GOPF and GOPR. Thermocycling conditions were: denaturation at 95°C for 5 min, followed by 35 cycles of denaturation at 95°C for 1 min, annealing at 58°C for 1 min and extension at 72°C for 1 min, with a final extension at 72°C for 15 min.
1 µl of the first-round PCR product, along with 0.48 pmol of the primers GIPF and GIPR, was used for the second-round PCR. Thermocycling conditions were: denaturation at 95°C for 5 min, followed by 35 cycles of denaturation at 95°C for 1 min, annealing at 60°C for 1 min and extension at 72°C for 1 min, with a final extension at 72°C for 15 min. The amplified products were electrophoresed on a 1.2% agarose gel, stained with ethidium bromide and visualized under ultraviolet light.
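The three cycling programs described above can be written down as data, which makes the step times easy to compare. The sketch below is only an illustration: the `Program` class and its field names are hypothetical, and the computed value is programmed hold time only, since ramping between temperatures depends on the instrument and is not reported.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Program:
    """One PCR program; all durations are in seconds."""
    initial_denaturation: int
    cycles: int
    cycle_steps: Tuple[int, int, int]  # denaturation, annealing, extension
    final_extension: int

    def hold_minutes(self) -> float:
        """Total programmed hold time in minutes (ramp time excluded)."""
        total = (self.initial_denaturation
                 + self.cycles * sum(self.cycle_steps)
                 + self.final_extension)
        return total / 60


# Beta-globin control: 5 min, 40 x (30 s, 30 s, 30 s), 5 min.
beta_globin = Program(300, 40, (30, 30, 30), 300)
# Each nested round: 5 min, 35 x (1 min, 1 min, 1 min), 15 min.
# The two rounds differ only in annealing temperature (58 vs 60 degrees C).
nested_round = Program(300, 35, (60, 60, 60), 900)
```

With these values, `beta_globin.hold_minutes()` gives 70 minutes and `nested_round.hold_minutes()` gives 125 minutes per round.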

Sequencing and Phylogenetic Analysis
Nested PCR products of the gag gene were partially sequenced by Macrogen Inc., Korea, using the primer GSP1 (5′-CCATCAATGAGGAAGCTGC-3′, nt 1400-1418, HXB2). For subtyping and further analysis, the nucleotide sequence spanning the p24 and p7 region of the gag gene, nt 1577-2040, HXB2 [31] (comprising 460-470 bp), was aligned with sequences from the Los Alamos HIV sequence database. This was accomplished by using the HIV BLAST Search (http://www.hiv.lanl.gov/). The samples were assigned subtypes based on the closest homology found with the subtype references in the Los Alamos database.
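The HXB2 coordinates quoted above can be checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive, which is an assumption about the numbering convention rather than something stated in the text; the helper name is illustrative.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# GSP1 primer as given in the text, written without hyphenation.
GSP1 = "CCATCAATGAGGAAGCTGC"

primer_span = span_length(1400, 1418)    # HXB2 position of GSP1
analysed_span = span_length(1577, 2040)  # p24-p7 region used for subtyping
```

Under that assumption the primer span (19 nt) matches the length of GSP1, and the analysed region is 464 nt in HXB2, which falls within the stated 460-470 bp.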
Using the same sequence, alignments were obtained with the Clustal X program (1.83) [32]. After alignment, the positions where gaps occurred were stripped and minor manual adjustments were made using MacClade [33]. From these alignments, phylogenetic relationships were determined by the neighbor-joining method using PAUP* [34]. Pairwise genetic distances were calculated with Kimura's two-parameter method [35]. SimPlot Version 3.5.1 was used for the analysis of recombinant subtypes [36]. In order to establish the geographic relationship between our strains and previously reported strains, reference gag sequences from different countries were selected from the Los Alamos HIV sequence database, and sequence alignment and phylogenetic analysis were performed. All of the sequences were analyzed for G→A hypermutation against the consensus sequence for the appropriate subtype using the Los Alamos Hypermut Program (http://www.hiv.lanl.gov/content/sequence/HYPERMUT/hypermut.html).
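For readers unfamiliar with Kimura's two-parameter distance, the formula is d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of sites showing transitions and transversions. The sketch below applies it to two pre-aligned sequences after dropping gapped or ambiguous columns. It is a minimal illustration of the formula only: the study used Clustal X and PAUP* for these steps, and the function names and toy sequences here are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ungapped_pairs(a: str, b: str):
    """Aligned column pairs where both sequences carry an unambiguous base."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return [(x, y) for x, y in zip(a.upper(), b.upper())
            if x in "ACGT" and y in "ACGT"]


def k2p_distance(a: str, b: str) -> float:
    """Kimura two-parameter distance between two aligned sequences."""
    pairs = ungapped_pairs(a, b)
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")
    differences = [(x, y) for x, y in pairs if x != y]
    # A transition stays within purines (A<->G) or within pyrimidines (C<->T).
    transitions = sum(1 for x, y in differences
                      if {x, y} <= PURINES or {x, y} <= PYRIMIDINES)
    transversions = len(differences) - transitions
    p, q = transitions / n, transversions / n
    arg1, arg2 = 1 - 2 * p - q, 1 - 2 * q
    if arg1 <= 0 or arg2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(arg1) - 0.25 * math.log(arg2)


# Toy example: one A<->G transition in ten sites.
d = k2p_distance("ACGTACGTAC", "GCGTACGTAC")
```

Because transitions and transversions are corrected separately, a single transition in ten sites gives a distance slightly above the raw 0.1 proportion of differing sites.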

Subject Profile
Samples collected in the year 2007 from 69 HIV-infected subjects who consented to the study were analyzed. Of these subjects, 32 (46.37%) were males and 37 (53.63%) were females (Table 1). Their ages ranged between 16 and 58 years (Table 1). Recruited subjects had varying ethnic backgrounds, with the majority being from the Luo and Kikuyu communities, followed by Kamba (Table 1). Distribution by occupation revealed a mixed picture: 10 subjects (14.49%) were involved in business, 10 (14.49%) were technicians, 6 (8.69%) were clerks and the rest had other occupations (Table 1). Distribution of risk groups was as follows: 76.81% had a history of unprotected sex, 15.94% had a history of blood transfusion and 4.34% had other risk factors (Table 1).

Subtyping and Phylogenetic Analysis
All 69 samples were successfully amplified for the gag gene in a nested PCR, followed by sequencing. The generated sequences (covering approximately 460-470 bp of the p24 and p7 region of the gag gene, nt 1577-2040, HXB2) were then aligned using Clustal X and used to construct phylogenetic trees by the neighbor-joining method with PAUP*. Alignment with reference sequences from the Los Alamos database and phylogenetic analysis revealed that 39 (56.52%) of the 69 HIV-1 isolates were subtype A, 13 (18.84%) were subtype D, 7 (10.14%) were subtype C and 2 (2.89%) were subtype G. SimPlot analysis revealed a few recombinant types in our study samples: 3 (4.34%) AD, 1 (1.44%) AC, 1 (1.44%) AG and 3 (4.34%) CRF01_AE (Table 2, Fig. 1).
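The subtype proportions follow directly from the counts and can be recomputed as a consistency check. The snippet below uses only the counts reported above; the variable names are illustrative.

```python
# Subtype counts as reported in the text.
counts = {
    "A": 39, "D": 13, "C": 7, "G": 2,
    "AD": 3, "AC": 1, "AG": 1, "CRF01_AE": 3,
}

total = sum(counts.values())

# Percentage of the total for each subtype, rounded to two decimals.
percentages = {subtype: round(100 * n / total, 2)
               for subtype, n in counts.items()}
```

The counts sum to 69. Conventional rounding gives 2.90%, 4.35% and 1.45% for counts of 2, 3 and 1, so the values 2.89%, 4.34% and 1.44% quoted in the text appear to have been truncated rather than rounded.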

Phylogenetic Analysis for Geographic Relationship
For the three most represented subtypes in our study, namely A, D and C, deeper phylogenetic analysis was performed to explore their geographic origin (Fig. 2A, B, C). Our subtype A sequences clustered with sequences from varied geographical regions, represented by Uganda, Kenya, Sweden, Rwanda, South Africa, Australia, India, China and the Democratic Republic of the Congo (Fig. 2A). These results indicate considerable diversity among our subtype A sequences, possibly reflecting an older HIV-1 infection in the Nairobi population. Conversely, phylogenetic analysis of subtype D sequences revealed a lesser degree of diversity (Fig. 2B). Sequences of 10 out of 12 strains for subtype D clustered together and with a reference sequence from Uganda, indicating a strong phylogenetic relationship and less genetic divergence than among the subtype A sequences. Two out of twelve subtype D sequences clustered independently with reference sequences from Kenya. Lastly, subtype C sequences, like our subtype D sequences, clustered closely together and with sequences from African countries (Fig. 2C). Four out of seven subtype C sequences clustered closely with a reference sequence from Ethiopia, while three sequences clustered with references from South Africa and Botswana. One sequence, 26KE, representing subtype A, was found to be hypermutated by the Hypermut Program at the Los Alamos Database (http://www.hiv.lanl.gov/content/sequence/HYPERMUT/hypermut.html).
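To give a sense of what a G→A hypermutation screen looks at, the sketch below counts how many G positions in a reference read as A in an aligned query. This is a deliberately simplified illustration and not the Hypermut algorithm, which this sketch does not attempt to reproduce; the function name and the toy sequences are hypothetical.

```python
def g_to_a_fraction(reference: str, query: str) -> float:
    """Fraction of reference G positions that read A in the aligned query."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    g_sites = 0
    g_to_a = 0
    for ref_base, query_base in zip(reference.upper(), query.upper()):
        # Only reference G positions with an unambiguous query base count.
        if ref_base != "G" or query_base not in "ACGT":
            continue
        g_sites += 1
        if query_base == "A":
            g_to_a += 1
    if g_sites == 0:
        raise ValueError("reference has no comparable G positions")
    return g_to_a / g_sites


# Toy aligned pair: the reference has four G positions, two read A in the query.
fraction = g_to_a_fraction("GGATGGCA", "GAATAGCA")
```

A sequence such as 26KE would be expected to show an unusually high fraction against its subtype consensus, but a real screen also has to account for sequence context and statistical significance.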

Discussion
In the present study we document the genetic diversity of HIV-1 strains in the general population of Nairobi, Kenya, by sequence subtyping and phylogenetic analysis of the gag region.

Subtype Distribution Profile
HIV-1 subtype A was found to be the predominant type circulating in the Nairobi population, followed by subtypes D, C and G. These findings suggest a consistent and continual spread of subtype A in Nairobi (Table 2). The data presented here illustrate a heterogeneous epidemic, which is consistent with previous epidemiological studies in Kenya [24,27,37,38,39,40,41]. The homology of Kenyan subtype A samples with sequences from multiple countries indicates that subtype A infection in these countries may have originated in Kenya.
We have also identified recombinants AD, AC, AG and CRF01_AE, which carry recombined subtype A genomic sequences. The occurrence of these mixed subtypes and CRFs indicates that subtype A is not only predominant in the Kenyan population but is also slowly evolving and diverging into new HIV strains. These recombinant forms might have developed from recombination between parental subtype strains within dually infected patients, or through transmission of already recombined strains to the subjects studied. In either case, the presence of these recombinants suggests that dual infections occur in this population.

HIV-1 subtype distribution according to gender and age
Previous reports have suggested that HIV in Kenya is more prevalent in women than in men [23]. In our study as well, the majority of the recruited HIV-seropositive subjects were female (37 women and 32 men). In female subjects the prevalent HIV-1 subtype was A (48.64%). The prevalence of subtypes D (21.62%) and C (13.51%) was higher in women than in men, in whom it was 15.62% and 6.25%, respectively.
Analysis of the patient profiles revealed that most of the study subjects belonged to the 31-50 year age group. Moreover, phylogenetic analysis of the samples from this age group indicated the prevalence of subtype G and the intersubtypes AD, AC and AG within the 31-40 year age group, whereas the prevalence of CRF01_AE was higher in the 41-50 year age group (66.67%) (Table 1).

HIV-1 subtype distribution according to Ethnicity
In our HIV-seropositive study group, Luo and Kikuyu, comprising 16 subjects each, were the most represented groups, followed by Kamba (13 subjects) (Table 1). In the Luo group, subtypes A and D and the intersubtype AD were found, while subtype C was notably absent. Interestingly, all subtypes found in this study were represented in the Kikuyu group. These ethnic groups exhibit distinct cultural behavior and practices, which may play a crucial role in the transmission of HIV within and from these groups. The distribution of ethnicities in Kenya is in the order: Kikuyu (22%), Luhya (14%), Luo (13%), Kalenjin (12%), Kamba (11%). In our HIV-positive subjects, however, the order of predominance was: Luo/Kikuyu, Kamba, Luhya, and Kalenjin. The high occurrence of HIV in the Luo and Kikuyu groups may be connected with cultural and social practices. Traditional male circumcision, which in certain trials has been shown to be protective against HIV [42], is, for instance, not practiced in the Luo ethnicity [43]. Polygamy and wife inheritance, on the other hand, are common in the Luo community [44], which may also play a role in increased transmission of HIV and confinement of specific subtypes within this group. Finally, CRF01_AE was found to be more prevalent in the Kamba ethnic group (66.67%) than in any other ethnic group. This was an interesting observation, since CRF01_AE, previously reported in the Central African Republic, Chad, Congo, and Egypt, has not been noted in the East African region (www.hiv.lanl.gov). The Kamba ethnicity is distributed in east-central and coastal Kenya. The prevalence of CRF01_AE in this group may be attributable to their interaction with Central African countries through trade and travel.

Phylogenetic relationship of Kenya subtypes with other countries
Phylogenetically, most of the HIV-1 subtypes in our study have shown their closest homology and association with African countries such as Kenya, Uganda, South Africa, Ethiopia, Rwanda, Botswana and Democratic Republic of Congo. These geographic associations suggest that the bulk of HIV-1 infection in Nairobi is likely to be a result of direct transmission of the virus within these countries possibly via travelers across borders. Certain viral strains bearing subtype A demonstrated closer homology to strains from India, Australia and Sweden which may indicate spread of these strains from or to Kenya. Closer relationship of our subtype D and C, but not A, sequences with African subtypes may indicate a propensity of transmission of these subtypes for specific genetic backdrops.

[Figure 2 caption, partially recovered: ... (Fig. 2C), depicting geographical associations of the studied sample group with other global regions. Reference sequences from the Los Alamos Database are indicated in bold. Out groups selected for Fig. 2A ...]

Conclusion
Among the sub-Saharan African countries, Kenya has one of the highest prevalence rates of HIV-1 infection [45]. HIV infection among adults in urban areas, such as Nairobi, is almost twice as high (10%) as in rural areas (5-6%). Studies previously conducted in Nairobi have targeted particular high-risk groups [28] and/or have relied on a limited number of samples [46]. Moreover, most of these studies date back a few years, necessitating an update of general as well as molecular epidemiology data [28,46]. Increasing variation among HIV subtypes may have implications for HIV prevention and treatment programmes. At this stage a few vaccine trials in Kenya are underway or in the planning stage [47,48], and there is a need to monitor the current extent of HIV-1 subtype divergence in the infected population in different parts of the country.

Author Contributions
Conceived and designed the experiments: SA. Performed the experiments: SK SK. Analyzed the data: SK SK SA. Contributed reagents/materials/analysis tools: PO NO RH SA. Wrote the paper: SK SA.