Conceived and designed the experiments: EMS BTB FAP. Performed the experiments: EMS WA NK. Analyzed the data: EMS GB BL. Contributed reagents/materials/analysis tools: ML PM RS JK CW KF. Wrote the paper: EMS.
The authors have declared that no competing interests exist.
To identify novel biomarkers for HIV-1 resistance, including pathways that may be critical in anti-HIV-1 vaccine design, we carried out a gene expression analysis on blood samples obtained from HIV-1 highly exposed seronegatives (HESN) from a commercial sex worker cohort in Nairobi and compared their profiles to those of HIV-1 negative controls. Whole blood samples were collected from 43 HIV-1 resistant sex workers and a similar number of controls. Total RNA was extracted and hybridized to Affymetrix HG-U133 Plus 2.0 microarrays (Affymetrix, Santa Clara CA). Output data were analysed with ArrayAssist software (Agilent, San Jose CA). More than 2,274 probe sets were differentially expressed in the HESN as compared to the control group (fold change ≥1.3; p value ≤0.0001, FDR <0.05). Unsupervised hierarchical clustering of the differentially expressed genes readily distinguished HESNs from controls. Pathway analysis through the KEGG signaling database revealed that a majority of the impacted pathways (13 of 15, 87%) had genes that were significantly down-regulated. The most strongly down-regulated pathways were glycolysis/gluconeogenesis, pentose phosphate, phosphatidylinositol, natural killer cell cytotoxicity and T-cell receptor signaling. Ribosomal protein synthesis and tight junction genes were up-regulated. We infer that the hallmark of HIV-1 resistance is down-regulation of genes in key signaling pathways that HIV-1 depends on for infection.
AIDS ranks as one of the most devastating scourges of mankind. Since it was identified in 1983, more than 30 million people have died and another 33 million currently live with the virus.
The presence of individuals who are highly exposed to HIV-1 but do not become infected provides hope for a better understanding of correlates of protection that may lead to a more effective vaccine strategy. Highly exposed seronegative (HESN) populations have been identified among intravenous drug users
Our primary objective was to investigate whether there is a predictive gene expression pattern that defines the HESN phenotype. Gene expression data from the 86 microarrays were normalized and all were included in the analysis. An unsupervised hierarchical clustering algorithm was run on the data to ascertain whether the two populations could be separated into two distinct groups based on their gene expression profiles.
Each row on the Y axis represents a single gene probe and the phylograms represent distinct signaling pathways. Red denotes up-regulated genes and green denotes down-regulated genes.
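The unsupervised clustering step described above can be illustrated with a minimal sketch. The text does not state which distance metric or linkage rule was used, so Euclidean distance and average linkage are assumptions here, and the sample names and expression values are hypothetical toy data rather than study data.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles, n_clusters=2):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    profiles: dict mapping sample name -> list of log2 expression values.
    Returns a list of sets of sample names.
    """
    clusters = [{name} for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Average pairwise distance between members of the two clusters.
            dists = [euclidean(profiles[a], profiles[b])
                     for a in clusters[i] for b in clusters[j]]
            d = sum(dists) / len(dists)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical toy data: three probe sets measured in four samples.
toy = {
    "HESN_1": [5.0, 9.1, 7.0],
    "HESN_2": [5.2, 9.0, 7.1],
    "CTRL_1": [8.0, 6.0, 7.0],
    "CTRL_2": [8.1, 6.2, 6.9],
}
groups = average_linkage(toy, n_clusters=2)
```

No group labels are passed to the algorithm, so any separation of HESN from control samples arises from the expression profiles alone, which is the sense in which the clustering is unsupervised.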
Of the 54,675 gene probe sets represented on the Affymetrix U133 Plus 2.0 gene chip, 2,274 probe sets were differentially expressed in the HESN group as compared with the control group (fold change ≥1.3; p value ≤0.05) after correction for multiple testing using the Benjamini-Hochberg false discovery rate (FDR <0.05). Of the total differentially expressed transcripts, 462 (20%) could be mapped onto the KEGG signaling database.
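The selection rule stated above (fold change ≥1.3 combined with a Benjamini-Hochberg FDR below 0.05) can be sketched as follows. This is an illustration of the standard Benjamini-Hochberg step-up adjustment, not the ArrayAssist implementation; the function names, probe identifiers and values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_probes(results, min_fold=1.3, max_fdr=0.05):
    """Keep probe sets passing both the fold-change and the FDR cut-off.

    results: list of (probe_id, signed_fold_change, p_value) tuples,
    where a 1.6-fold decrease is written as -1.6.
    """
    qvals = benjamini_hochberg([p for _, _, p in results])
    return [probe for (probe, fc, _), q in zip(results, qvals)
            if abs(fc) >= min_fold and q < max_fdr]


# Hypothetical values for illustration only (not taken from the study).
example = [
    ("probe_A", 2.0, 0.001),
    ("probe_B", -1.6, 0.010),
    ("probe_C", 1.1, 0.020),
    ("probe_D", 1.5, 0.400),
]
selected = select_probes(example)
```

In this toy input, probe_C is rejected on fold change despite a small adjusted p value, and probe_D is rejected on FDR despite a sufficient fold change, so only probe_A and probe_B are retained.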
More than 40% of genes that had earlier been reported to be associated with HIV-1 susceptibility in HESN populations were found to be differentially expressed in this dataset (
Gene | Regulation | Fold Change | P value |
APOBEC3G | up | 1.1 | 0.7 |
CCL3L1 | up | 1.2 | 0.4 |
CCL4 | up | 1.1 | 0.6 |
CCL5 | down | −1.1 | 0.6 |
CCR5 | down | −1.1 | 0.6 |
CD207(Langerin) | up | 1.2 | 0.4 |
CD209(DC-SIGN) | down | −1.2 | 0.02 |
CUL5 | up | 1.3 | 0.2 |
CXCL12(SDF-1) | down | −1.2 | 0.6 |
DEFA1 | up | 1.3 | 0.4 |
DEFA3 | up | 1.3 | 0.4 |
DEFB1 | up | 1.4 | 0.001 |
DEFB4 | up | 1.2 | 0.5 |
HLA-A | down | −1.1 | 0.01 |
HLA-B | down | −1.1 | 0.04 |
HLA-C | down | −1.1 | 0.03 |
IL4 | down | −1.1 | 0.9 |
KIR2DL1 | down | −1.9 | 0.02 |
KIR2DL2 | down | −1.7 | 0.01 |
KIR2DL3 | down | −1.8 | 0.07 |
KIR2DL4 | down | −1.6 | 0.02 |
KIR2DL5 | down | −1.6 | 0.02 |
KIR2DS1 | down | −1.5 | 0.2 |
KIR2DS2 | down | −1.8 | 0.1 |
KIR2DS3 | down | −2.2 | 0.04 |
KIR2DS4 | down | −1.6 | 0.1 |
KIR2DS5 | down | −2.03 | 0.02 |
KIR3DL1 | down | −1.9 | 0.01 |
KIR3DL2 | down | −1.8 | 0.02 |
KIR3DL3 | down | −1.8 | 0.08 |
PPIA(CyclophilinA) | up | 1.5 | 0.002 |
TRIM5alpha | down | −1.6 | 0.1 |
TSG101 | down | −1.1 | 0.3 |
TLR2 | down | −1.04 | 0.7 |
TLR4 | down | −1.2 | 0.07 |
TLR8 | down | −1.4 | 0.001 |
Affymetrix ID | Entrez | Gene Symbol | Chromosome | Regulation | P value |
237953_at | 1803 | DPP4 | Chr2 | up | 1.99E-08 |
1569312_at | 7705 | ZNF146 | Chr19 | up | 1.17E-07 |
237051_at | 10463 | SLC30A9 | Chr4 | up | 1.27E-07 |
1559848_at | 387338 | NSUN4 | Chr1 | up | 2.23E-05 |
241798_at | 23244 | SCC-112 | Chr4 | up | 2.23E-05 |
244515_at | 5713 | PSMD7 | Chr16 | up | 2.37E-05 |
243683_at | 9643 | MORF4L2 | ChrX | up | 2.91E-05 |
1557575_at | 8987 | GENX-3414 | Chr4 | up | 3.09E-05 |
232279_at | 23338 | PHF15 | Chr5 | up | 4.94E-05 |
233127_at | 55422 | ZNF331 | Chr19 | up | 4.93E-05 |
225354_s_at | 83699 | SH3BGRL2 | Chr6 | down | 5.32E-05 |
223683_at | 84225 | ZMYND15 | Chr17 | down | 6.42E-05 |
212667_at | 6678 | SPARC | Chr5 | down | 6.57E-05 |
242191_at | 200030 | NBPF11 | Chr1 | up | 6.60E-05 |
233302_at | 64919 | BCL11B | Chr14 | up | 7.06E-05 |
206494_s_at | 3674 | ITGA2B | Chr17 | down | 1.01E-04 |
213258_at | 7035 | TFPI | Chr2 | down | 1.01E-04 |
221942_s_at | 2982 | GUCY1A3 | Chr4 | down | 1.01E-04 |
201108_s_at | 7057 | THBS1 | Chr15 | down | 1.41E-04 |
207114_at | 80740 | LY6G6C | Chr6 | down | 1.72E-04 |
227088_at | 8654 | PDE5A | Chr4 | down | 1.77E-04 |
1559126_at | 23223 | RRP12 | Chr10 | up | 1.83E-04 |
216580_at | 120872 | RPL7 | Chr12 | up | 2.08E-04 |
215859_at | 56926 | NCLN | Chr19 | up | 2.20E-04 |
240263_at | 51106 | TFB1M | Chr6 | up | 2.62E-04 |
227461_at | 85439 | STON2 | Chr14 | down | 2.63E-04 |
201059_at | 2017 | CTTN | Chr11 | down | 2.69E-04 |
1560043_at | 51706 | CYB5R1 | Chr1 | up | 2.79E-04 |
202729_s_at | 4052 | LTBP1 | Chr2 | down | 3.04E-04 |
230014_at | 51646 | YPEL5 | Chr2 | up | 3.18E-04 |
236841_at | 374666 | FAM39DP | Chr15 | up | 4.28E-04 |
239805_at | 9058 | SLC13A2 | Chr17 | down | 4.75E-04 |
232570_s_at | 80332 | ADAM33 | Chr20 | up | 6.13E-04 |
233087_at | 64839 | FBXL17 | Chr5 | up | 6.13E-04 |
202275_at | 2539 | G6PD | ChrX | down | 6.13E-04 |
229991_s_at | 94121 | SYTL4 | ChrX | down | 6.24E-04 |
230945_at | 10120 | ACTR1B | Chr2 | up | 6.80E-04 |
1558975_at | 124402 | FAM100A | Chr16 | up | 6.86E-04 |
1552750_at | 117286 | CIB3 | Chr19 | up | 6.94E-04 |
209676_at | 7035 | TFPI | Chr2 | down | 7.28E-04 |
206254_at | 1950 | EGF1 | Chr4 | down | 1.10E-04 |
226303_at | 5239 | PGM5 | Chr9 | down | 1.33E-03 |
205524_s_at | 1404 | HAPLN1 | Chr5 | down | 1.36E-03 |
222319_at | 4676 | NAP1L4 | Chr11 | up | 1.41E-03 |
205409_at | 2355 | FOSL2 | Chr2 | down | 1.46E-03 |
200785_s_at | 4035 | LRP1 | Chr12 | down | 1.50E-03 |
212077_at | 800 | CALD1 | Chr7 | down | 1.62E-03 |
219232_s_at | 112399 | EGLN3 | Chr14 | down | 1.79E-03 |
1569542_at | 166647 | GP125 | Chr4 | up | 1.90E-03 |
200661_at | 5476 | CTSA | Chr20 | down | 2.00E-03 |
To validate the microarray results, representative down-regulated and up-regulated genes were selected for quantitative real-time RT-PCR (qRT-PCR). Genes from one of the most down-regulated signaling pathways, glycolysis/gluconeogenesis, were randomly picked for qRT-PCR analysis, while a random sample of zinc finger protein genes was selected to represent up-regulated pathways. To control for a possible effect of sex work, samples from HIV seronegative sex workers who had been involved in sex work for one year or less (New Negatives) were included in the qRT-PCR analysis.
The expression differences of the selected genes in HESN women as compared to the controls were similar to the microarray data (
Gene expression profiles from HESNs were compared with those of HIV uninfected non sex worker maternal child health clinic attendees (HIV negative) and HIV negative new entrants into the sex trade (New Negatives). Each assay value is the ratio of the gene quantity to that of 18S rRNA. G6PDH: glucose-6-phosphate dehydrogenase; G3PDH: glyceraldehyde-3-phosphate dehydrogenase; PGM-1: phosphoglucomutase-1; IRS-1: insulin receptor substrate-1; ZNF346: zinc finger protein 346; ZHX2: zinc finger and homeodomain protein 2.
Our primary objective in this study was to investigate whether there is a predictive gene expression signature that defines the HESN phenotype in the Pumwani sex worker cohort. Apart from identifying candidate gene biomarkers, we used a systems biology approach to take into account the inherent complexity that may not be elucidated through single gene observations. We also used whole blood in an attempt to unravel the probable interplay between systemic intercellular networks that may be lost through individual cell type approaches. Because selecting appropriate controls for HESN studies is often a challenge, we included data from HIV negative new entrants into the sex worker trade as an additional control group to lessen confounding factors such as frequent exposure to other sexually transmitted infections. The overall outcome shows a distinct difference in expression patterns between the HESN population and controls. Previous studies in this cohort have identified specific human leukocyte antigens (HLA) and interferon regulatory factors (IRF) as associated with resistance
The most up-regulated gene was dipeptidyl peptidase 4 (DPP4). The HESN group showed more than 2-fold overexpression of this gene compared with controls. DPP4, also known as CD26, is a 110 kDa protein that has been found to be one of the most ubiquitous soluble proteins. Its expression levels have been associated with cancer, diabetes and infectious disease
In conformity with the high expression of DPP4, we observed a down-regulation of the insulin signaling pathway among the HESN women. Though we could not, in our studies, directly attribute the low expression of insulin genes to DPP4, the association of very high DPP4 with diabetes is a confirmed phenomenon
The phosphatidylinositol pathway was a key impacted signaling system among the HIV-1 resistant women (
In concordance with earlier findings from our group using specific cell subsets
Up-regulation of DNA-binding genes that have been associated with gene editing and silencing was a key feature among the HESN. These included genes encoding zinc finger (ZFN) and SMAD proteins. Among the highly expressed ZFN genes were Zinc Finger and Homeoboxes 2 (ZHX2) and Zinc Finger and BTB Domain Containing 20 (ZBTB20). ZHX2 has been identified as a transcription repressor gene
In tandem with the overexpression of gene silencing factors at the transcriptional level, we observed dramatic down-regulation of host factors reported to be important for successful HIV-1 infection. These factors included integrins, a class of cell surface receptors that mediate linkages with the extracellular matrix and have been identified as HIV-1 receptors
A previous in
Our findings imply that a repertoire of genes acting in concert rather than a single determinant may be responsible for the observed HESN phenotypes in the Pumwani cohort. Our data also suggest that genes contributing to a lowered basal immune activation state, in tandem with a general repression of host HIV-1 dependency factors, may be a key contributor to the HIV-1 resistance phenomenon. Hence, the understanding of signaling and metabolic events that regulate immune activation may provide crucial information for the design of effective anti-HIV-1 preventive strategies.
This study was guided by the Helsinki Declaration on ethical principles for medical research involving human subjects. All studies were approved by the University of Nairobi/Kenyatta National Hospital Ethical Review Board and the University of Manitoba Health Research Ethics Board. All patients provided written informed consent for collection of samples and subsequent analysis.
The Nairobi (Pumwani) commercial sex worker cohort was established in 1985 and has provided vital data suggesting that there may be biologically mediated resistance to HIV-1 infection
Whole blood samples were collected in PAXgene blood tubes (PreAnalytiX GmbH) and total RNA was extracted in Nairobi following the manufacturer's protocol (Qiagen). Briefly, pelleted nucleic acids were digested with proteinase K and then passed through a shredder spin column to remove protein debris. The resulting supernatant was mixed with ethanol and RNA was selectively bound to a silica-based fibre matrix. RNA was eluted with RNase-free water after washing with saline buffers and digestion with DNase to remove DNA contamination. RNA quality, integrity and purity were assessed on a Bioanalyzer 2100 (Agilent Technologies, Palo Alto CA). RNA samples were considered for further analysis only if they had distinct 28S and 18S ribosomal peaks and had been processed or frozen at −70°C within 5 hours after collection.
RNA samples meeting the above criteria were shipped to The Centre for Applied Genomics, Hospital for Sick Children, Toronto for analysis. 100 ng of RNA was amplified using the Affymetrix two-cycle amplification kit (Affymetrix, Santa Clara, CA) incorporating the Ambion MEGAscript T7 amplification procedure. To minimize the challenges posed by abundant globin RNA transcripts, Nugen Whole Blood Solution (Nugen, San Carlos, CA) procedures were incorporated into the Affymetrix protocol. Amplification, labeling, hybridization onto the Affymetrix U133 Plus 2.0 Gene Chip, washing, and scanning were performed as per the manufacturer's protocol.
Background subtraction and probe signal summarization were performed using the R Bioconductor software. The resulting log2 signal values were retransformed to linear scale and analyzed using the Affymetrix MAS5 package. The program's output files represented the differences in intensities between the perfect match and mismatch probe sets, or a detection call of present, marginal or absent. Raw image DAT data were processed to CEL files and analyzed using the ArrayAssist software (Stratagene, La Jolla CA). The ratio of the geometric means of the expression values of the relevant gene fragments was computed to yield the fold change. Confidence intervals and corrected
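The fold change calculation described in this paragraph, a ratio of geometric means on linear-scale signals, can be sketched as below. Reporting ratios below 1 as negative reciprocals is an assumption made here to mirror the signed fold changes shown in the gene table; the function names and signal values are hypothetical.

```python
from math import exp, log


def geometric_mean(values):
    """Geometric mean of strictly positive linear-scale signals."""
    return exp(sum(log(v) for v in values) / len(values))


def signed_fold_change(group, reference):
    """Ratio of geometric means, reported as a signed fold change.

    Assumed convention: ratios below 1 are shown as negative
    reciprocals, so a ratio of 0.5 becomes -2.0.
    """
    ratio = geometric_mean(group) / geometric_mean(reference)
    return ratio if ratio >= 1 else -1 / ratio


# Hypothetical linear-scale signals for one probe set.
hesn_signal = [100.0, 400.0]   # geometric mean is 200
ctrl_signal = [400.0, 400.0]   # geometric mean is 400
fc = signed_fold_change(hesn_signal, ctrl_signal)
```

Because the geometric mean of linear values equals the antilog of the arithmetic mean of their logarithms, the same ratio can be obtained by subtracting group means of log2 signals and exponentiating the difference.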
Selected array results from HESN and controls were validated by quantitative PCR using LightCycler real-time PCR (Roche Diagnostics). To assess whether the gene expression profiles might have been confounded by sex work, blood samples from sex workers who had been involved in sex work for one year and were HIV negative (New Negatives) were randomly selected and included as additional controls in the qRT-PCR confirmation of the microarray data.
Samples were analysed with the Quantitect SYBR Green RT-PCR assay kit as per the manufacturer's protocol (Qiagen). Briefly, 100 ng of each sample was run in duplicate and normalized to 18S rRNA. Relative expression of each gene was determined from a standard curve of pooled cDNA from human peripheral blood mononuclear cells, with total RNA extracted in the same way, in a 10 series dilution to form the qRT-PCR standards
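The standard-curve quantification and 18S rRNA normalization described above can be sketched as follows. The sketch assumes a 10-fold dilution series and a linear relationship between threshold cycle (Ct) and log10 of the input quantity; the function names and all Ct values are hypothetical and are not study data.

```python
def fit_standard_curve(log10_quantities, ct_values):
    """Least-squares line: Ct = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to get a relative quantity."""
    return 10 ** ((ct - intercept) / slope)


def relative_expression(gene_cts, ref_cts, gene_curve, ref_curve):
    """Mean duplicate quantity of the gene divided by that of 18S rRNA."""
    gene_q = sum(quantity_from_ct(c, *gene_curve) for c in gene_cts) / len(gene_cts)
    ref_q = sum(quantity_from_ct(c, *ref_curve) for c in ref_cts) / len(ref_cts)
    return gene_q / ref_q


# Hypothetical 10-fold dilution series of pooled cDNA (relative units).
dilutions = [0.0, -1.0, -2.0, -3.0]      # log10 of relative quantity
standard_cts = [20.0, 23.3, 26.6, 29.9]  # hypothetical Ct readings
curve = fit_standard_curve(dilutions, standard_cts)

# Duplicate wells for a target gene and for 18S rRNA in one sample;
# the same curve is reused for both purely to keep the example short.
ratio = relative_expression([23.3, 23.3], [20.0, 20.0], curve, curve)
```

In practice each target and the 18S rRNA reference would have its own standard curve, which is why the two curves are passed as separate arguments.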
The microarray data have been deposited in the Gene Expression Omnibus (GEO) database under accession numbers GSE 30155–30240 and GSE 33580.
We thank the volunteers of the Pumwani sex worker cohort for their invaluable time and contribution to this project. We sincerely acknowledge the support from the clinic staff of the University of Nairobi/University of Manitoba Collaborative Program for sample and data collection. We are grateful to Mark Jordan and Brenda Oosterveen of Agriculture Canada for validating our microarray assays, Leslie Slaney and Ian Maclean of the University of Manitoba for logistical support, and Rose Odera and Salome Ngamau of the Kenya Medical Research Institute for secretarial services.