Association of Single Nucleotide Polymorphisms in the Lens Epithelium-Derived Growth Factor (LEDGF/p75) with HIV-1 Infection Outcomes in Brazilian HIV-1+ Individuals

The lens epithelium-derived growth factor p75 (LEDGF/p75), coded by the PSIP1 gene, is an important host co-factor that interacts with HIV-1 integrase to target integration of viral cDNA into active genes. The aim of this study was to investigate the association of SNPs in the PSIP1 gene with disease outcome in HIV-1 infected patients. We performed a genetic association study in a cohort of 171 HIV-1 seropositive Brazilian individuals classified as rapid progressors (RP, n = 69), typical progressors (TP, n = 79) and long-term nonprogressors (LTNP, n = 23). The exonic SNP rs61744944 and 9 tag SNPs were genotyped. A group of 192 healthy subjects was analyzed to determine the frequency of SNPs and haplotypes in the general population. Linkage disequilibrium (LD) analyses indicated that the SNPs analyzed were not in high LD (r² < 0.8). Logistic regression models suggested that patients carrying the T allele of rs61744944 (472L) were more likely to develop an LTNP phenotype (OR = 4.98; p = 0.05) compared to the TP group. The same trend was observed when LTNPs were compared to the RP group (OR = 3.26). Results of haplotype analyses reinforced this association, since the OR values obtained for the haplotype carrying allele T at rs61744944 also reflected an association with LTNP status (OR = 6.05; p = 0.08 and OR = 3.44; p = 0.12 for comparisons to TP and RP, respectively). The rare missense variations Ile436Ser and Thr473Ile were not identified in the patients enrolled in this study. Gene expression analyses showed lower LEDGF/p75 mRNA levels in peripheral blood mononuclear cells obtained from HIV-1 infected individuals. However, these levels were not influenced by any of the SNPs investigated. In spite of the limited number of LTNPs, these data suggest that the PSIP1 gene could be associated with the outcome of HIV-1 infection. Further analyses of this gene may guide the identification of causative variants to help predict disease course.

As integration of the viral genome is a prerequisite for HIV-1 replication, genetic variations in LEDGF/p75 could lead to different disease outcomes. Some studies have suggested that genetic variation in PSIP1 may influence the susceptibility to HIV-1 infection and disease progression. However, these results need to be validated in different populations [39][40][41]. In the current study, we performed a comprehensive analysis of the genetic variations in PSIP1 in a cohort of Brazilian HIV-1 infected individuals with different disease outcomes.

Ethics Statement
The present study was approved by the IPEC/FIOCRUZ Institutional Review Board (IRB), as well as by the Brazilian National Commission of Ethics in Research (151/01, 81/2008 and CONEP 14430). The LTNP patients signed written informed consent. For the other patients, the use of stored biological samples was approved by the IPEC IRB for an anonymous unlinked study. In all cases, personal identifiers were excluded to ensure patient anonymity. The present study has not been submitted or accepted for publication elsewhere.

Subjects and study design
The present study was conducted to characterize the frequencies of PSIP1 SNPs as well as to establish possible associations between these SNPs and AIDS progression profiles in a cohort of 171 HIV-1 infected Brazilian subjects. Patients were classified as rapid progressors (RP, n = 69), typical progressors (TP, n = 79) and long-term nonprogressors (LTNP, n = 23) according to clinical and laboratory information obtained from the database provided by the Statistics and Document Service of Instituto de Pesquisa Clínica Evandro Chagas (IPEC, Rio de Janeiro). The time of progression to AIDS was based on the time between HIV-1 infection and the occurrence of the first AIDS-defining event (CD4+ T cell counts <350 cells/mm³, occurrence of AIDS-defining disease, use of antiretroviral therapy or death related to AIDS). The criteria to classify these patients were as follows: a) RP are those patients who progressed to AIDS up to 3 years after HIV-1 infection (the date of HIV-1 infection was estimated as the mean between the last negative and the first positive HIV-1 serology, with a maximum interval of 1.5 years between the two tests); b) TP are those patients who progressed to AIDS within 4 to 10 years after HIV-1 infection (the date of HIV-1 infection was estimated as the mean between the last negative and the first positive HIV-1 serology; in the absence of a negative serology the date of infection was inferred to be 6 months before the first positive HIV-1 serology); and c) LTNP are those patients who maintained CD4+ T cell counts >500 cells/mm³ without the use of antiretroviral therapy and without AIDS-defining events for more than 10 years after HIV-1 infection. In addition, a group of 192 HIV-1 negative individuals from the same geographic region was analyzed. This group was used as a population control to characterize the frequency of PSIP1 SNPs and haplotypes. Therefore, these samples were not included in the association analysis.
In order to assess LEDGF/p75 mRNA expression in LTNPs, a group of 15 HIV-1+ untreated patients with viral loads higher than 10,000 copies/ml and a group of 20 uninfected healthy donors were included as controls. The level of LEDGF/p75 mRNA was quantified for 16 out of 23 LTNPs.
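The classification criteria above can be sketched in code. The following is a minimal illustration only, not the procedure actually used to build the cohort: the function names and arguments are hypothetical, the time to the first AIDS-defining event is assumed to be already known, and the 1.5-year maximum interval between serologies required for RP is not enforced. Cases the criteria do not cover (for example, progression between 3 and 4 years) are returned as unclassified.

```python
from datetime import date, timedelta
from typing import Optional


def estimate_infection_date(first_positive: date,
                            last_negative: Optional[date] = None) -> date:
    """Estimate the infection date (hypothetical helper).

    Midpoint between the last negative and first positive serology; when no
    negative serology exists, roughly 6 months before the first positive test.
    """
    if last_negative is None:
        return first_positive - timedelta(days=183)  # about 6 months
    return last_negative + (first_positive - last_negative) / 2


def classify_progression(years_to_aids: Optional[float],
                         years_followed: float) -> Optional[str]:
    """Assign RP, TP or LTNP from the time to the first AIDS-defining event.

    years_to_aids is None when no AIDS-defining event (CD4+ decline, disease,
    therapy or AIDS-related death) occurred during follow-up.
    """
    if years_to_aids is not None:
        if years_to_aids <= 3:
            return "RP"
        if 4 <= years_to_aids <= 10:
            return "TP"
        return None  # not covered by the stated criteria
    if years_followed > 10:
        return "LTNP"
    return None  # follow-up too short to classify


print(classify_progression(2.0, 5.0))    # RP
print(classify_progression(None, 12.0))  # LTNP
```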

DNA extraction and SNP genotyping
DNA samples were extracted from 200 µl of whole blood or from isolated PBMCs using the QIAamp DNA kit (Qiagen Inc., CA, USA), according to the manufacturer's protocol. PSIP1 exons 8-14 were amplified and sequenced in order to identify missense mutations and other genetic variations within and outside the exons coding for the integrase binding domain (IBD). The positions K364, I365, D366, F406 and V408, associated with the interaction with HIV-1 integrase (IN), and the positions I436 and T473, where variants have been identified in LTNPs, were investigated. The SNPs rs61744944 (Q472L), rs35678110 and one SNP located in intron 11 identified by Ballana and colleagues [39] (chromosome position 15469145 according to contig NT_008413.18) were also analyzed. A representation of PSIP1 genomic organization and LEDGF protein structure, as well as all the genetic markers evaluated in the present study, is depicted in Figure 1. PCR reactions were carried out with 100 ng of genomic DNA, 0.25 µM of each specific primer (designed using the Primer3 tool; Table S1), 1.25 U of Taq DNA polymerase (Promega, WI, USA), 0.3 mM of each dNTP (Invitrogen, CA, USA) and 1 mM of MgCl2 in a final reaction volume of 50 µl. Cycling conditions were as follows: 95 °C for 2 minutes, followed by 30 cycles of 95 °C (30 seconds), 58-60 °C (30 seconds) and 72 °C (1 minute 30 seconds), and 10 minutes at 72 °C for final extension. Samples were purified and sequenced using the BigDye Terminator Cycle Sequencing Reaction Kit (Life Technologies, CA, USA). Exons 8, 9, 10 and 14 were characterized for all LTNPs and 30 out of 69 RP, while exons 11-13 were sequenced for all HIV-1+ patients and healthy subjects.
In order to increase gene coverage, a total of 8 tag SNPs (rs7470146, rs10962048, rs10283923, rs12339417, rs10119931, rs2737829, rs1033056, rs17337140) were also selected from the HapMap data bank using data from the CEU population (minor allele frequency of at least 0.05 and r² ≥ 0.8 as linkage disequilibrium cutoff). The SNP rs2277191 [40] was also included in this analysis, as depicted in Figure 1. Genotyping of tag SNPs was performed using Real Time SNP Genotyping Assays (Table S2).

LEDGF/p75 mRNA expression
RNA was isolated from 10 million frozen PBMCs using Trizol reagent, according to the manufacturer's protocol (Life Technologies, CA, USA). All RNA samples were assessed for quality and quantity with a Synergy 2 multi-mode plate reader (BioTek Instruments, VT, USA), and RNA concentrations were further used to set up reactions. RNA (1 µg) was treated with DNase I (Life Technologies, CA, USA) in a 10 µl volume and then reverse transcribed in 20 µl reactions using Superscript II (Life Technologies, CA, USA) and Oligo(dT) (0.5 µg/µl). LEDGF/p75 mRNA levels were quantified using a Real Time Gene Expression Assay (Hs00253515_m1), according to the manufacturer's protocol. YWHAZ (Hs03044281_g1) [42] and PGK1 (Hs00943178_g1) [43] were used as reference genes for these analyses. Reactions were carried out in an ABI Prism 7500 (Life Technologies, CA, USA) thermal cycler. After PCR, the difference in threshold cycles (ΔCt) between LEDGF/p75 and the mean of the two reference genes (Ct reference = (Ct YWHAZ + Ct PGK1)/2) was calculated for all samples (ΔCt = Ct LEDGF/p75 - Ct reference). Relative LEDGF/p75 levels are shown as 2^(-ΔCt) values, which represent LEDGF/p75 mRNA expression relative to the reference genes.
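The ΔCt calculation described above can be written as a short function. This is a minimal sketch with a hypothetical function name and made-up Ct values; it only restates the formula given in the text and is not the software actually used for the analysis.

```python
def relative_expression(ct_ledgf: float, ct_ywhaz: float, ct_pgk1: float) -> float:
    """Return 2^(-dCt) for LEDGF/p75 against the mean of two reference genes."""
    ct_reference = (ct_ywhaz + ct_pgk1) / 2   # mean Ct of YWHAZ and PGK1
    delta_ct = ct_ledgf - ct_reference        # dCt = Ct target - Ct reference
    return 2 ** (-delta_ct)


# Illustrative Ct values only (not data from the study).
print(relative_expression(25.0, 24.0, 24.0))  # 0.5
print(relative_expression(23.0, 24.0, 26.0))  # 4.0
```

A target that crosses the threshold one cycle later than the reference mean yields a value of 0.5, so lower values correspond to lower relative mRNA levels.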

Statistical analyses
SNP genotyping data. All statistical analyses were carried out as previously described [44]. Frequencies of each genotype, allele and minor allele carriers were determined by direct counting. Deviations from Hardy-Weinberg equilibrium (HWE) were assessed by Fisher's exact test, and pairwise linkage disequilibrium (LD) patterns were determined using the r² statistic. Both analyses were performed using data from healthy subjects. Frequencies of each SNP were compared between patients and healthy subjects using chi-squared tests to assess whether there was any differential profile among HIV+ subjects. Comparisons of patients with different AIDS progression phenotypes were performed by unconditional logistic regression models using frequencies of genotypes, alleles and minor allele carriers. Odds ratios (ORs) were used as effect estimates. Haplotype frequencies were estimated by maximum likelihood and were also compared between the different patient groups. No adjustments for multiple comparisons were applied. All analyses were performed using R for Windows version 2.12.1 (R Development Core Team 2010) and the packages "genetics" and "haplo.stats".
Gene expression data. Comparisons of LEDGF/p75 mRNA levels among HIV-negative individuals, untreated HIV-1+ patients and LTNPs were performed using the Kruskal-Wallis test followed by Dunn's post hoc test. Mann-Whitney tests were applied to compare LEDGF/p75 mRNA levels between minor allele carriers and non-carriers for each SNP. All analyses were performed using the software GraphPad Prism for Windows (version 6.2).
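For readers unfamiliar with these quantities, the sketch below shows allele frequencies obtained by direct counting, the genotype counts expected under HWE, and the r² statistic for a pair of biallelic SNPs. It is illustrative only: the function names are hypothetical, the numbers are invented, and r² is computed here from known haplotype frequencies, whereas the study estimated LD from unphased genotypes and tested HWE with an exact test in R.

```python
from typing import Tuple


def allele_frequencies(n_major_hom: int, n_het: int, n_minor_hom: int) -> Tuple[float, float]:
    """Major and minor allele frequencies by direct counting of genotypes."""
    n = n_major_hom + n_het + n_minor_hom
    q = (2 * n_minor_hom + n_het) / (2 * n)   # each subject carries two alleles
    return 1 - q, q


def hwe_expected_counts(n_major_hom: int, n_het: int, n_minor_hom: int) -> Tuple[float, float, float]:
    """Genotype counts expected under Hardy-Weinberg equilibrium (p^2, 2pq, q^2)."""
    n = n_major_hom + n_het + n_minor_hom
    p, q = allele_frequencies(n_major_hom, n_het, n_minor_hom)
    return p * p * n, 2 * p * q * n, q * q * n


def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """r^2 between two SNPs from the AB haplotype frequency and allele frequencies."""
    d = p_ab - p_a * p_b                      # LD coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Invented genotype counts and frequencies, for illustration only.
print(allele_frequencies(81, 18, 1))
print(hwe_expected_counts(81, 18, 1))
print(ld_r2(0.5, 0.5, 0.5))
```

Two SNPs would be treated as redundant at the cutoff used in the study when r² reaches 0.8.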

Frequencies of PSIP1 SNPs in healthy subjects and HIV-1+ patients
The frequencies of PSIP1 SNPs were first described in healthy subjects and then compared to those of HIV-1+ patients, independent of progression profile, in order to identify possible associations with HIV-1 infection (Table S3). General characteristics such as age, gender and ethnicity of HIV-1 positive patients and healthy subjects are summarized in Table 1. The mean age of the HIV-1 positive and negative individuals included in the study was approximately 33-34 years. The distribution of both genders, as well as of the three classes of self-reported ethnicity (white, mestizo and black), was also similar in both groups.
We first investigated the occurrence of rare and new mutations in the PSIP1 exons 8-14 and boundaries (chromosome positions 15474384 to 15466523), as well as substitutions in functional residues in healthy subjects and HIV-1 infected individuals. After direct sequencing, no novel mutations were detected and no differences were observed in exons 8-10 between healthy and HIV-1+ subjects. The rare missense variations Ile436Ser and Thr473Ile were not identified, and the residues K364, I365, D366 and F406 were not mutated in the individuals enrolled in this study. The V408I mutation was identified in one HIV-1+ subject (classified as LTNP). Similarly, the SNP rs35678110 was identified in only one HIV-1+ patient (classified as rapid progressor). Among the 37 exon variations documented in dbSNP in the region spanning exons 8-14, the SNP rs61744944 was the only major variation observed in our samples. The new intron SNP (chromosome position 15469145, according to contig NT_008413.18) was also absent among our cases and healthy controls.
After genotyping all tag (rs7470146, rs10962048, rs10283923, rs12339417, rs10119931, rs2737829, rs1033056, rs17337140) and candidate (rs2277191, rs61744944) SNPs, data from the healthy controls were analyzed separately for quality control and also to determine the distribution of the 10 PSIP1 polymorphisms in the general population. All the genetic variations investigated in this study were present in healthy Brazilian subjects, with minor allele frequencies ranging from 0.01 (rs61744944) to 0.34 (rs7470146). The rs2277191 variant was also underrepresented (0.02). The frequencies of each genotype, allele and minor allele carriers are shown in Table S3. Two SNPs with previously reported association with AIDS phenotypes (rs12339417 and rs1033056) were excluded from further analysis due to deviations from HWE. Results of pairwise LD analyses showed no definite association between SNPs using an r² > 0.8 cutoff. Therefore, the remaining 8 SNPs were genotyped in the patient group and used for haplotype analyses (Table S4).
A total of 24 PSIP1 haplotypes were found in our samples, including both HIV-1 negative and positive subjects (Table S4). Of the 15 haplotype combinations detected among HIV-1 positive individuals, 7 were absent in the healthy subjects group. Comparisons including the four haplotypes with frequencies of at least 0.03 did not show differences between the HIV+ and HIV− groups (Table S4).

Association between PSIP1 SNPs and AIDS progression phenotypes
The HIV-1+ individuals were further analyzed according to their progressor status (RP, TP or LTNP), as outlined in Material and Methods. The frequencies of PSIP1 SNPs and haplotypes were compared among the groups to determine a possible association of PSIP1 gene variations with AIDS progression profiles. Results of univariate logistic regression models showed an association between allele T at rs61744944 (Q472L) and an LTNP phenotype, with an OR value of 4.98 and a borderline p-value of 0.05 when compared to TP under a codominant model (Table 2). The same trend was observed when LTNP patients were compared to RP (OR = 3.26). Analyses of the other 7 SNPs did not show any clear pattern of association with AIDS progression phenotypes (Table 2). Despite the lack of statistical significance, results of haplotype analyses reinforce the association between SNP rs61744944 and the LTNP phenotype, since the single haplotype carrying the allele T showed OR values of 6.05 (p = 0.08) and 3.44 (p = 0.12) in comparison to TP and RP phenotypes, respectively (Table 3).
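As a reminder of how such odds ratios relate to the underlying counts: for a single binary predictor (carrier versus non-carrier), the OR from a univariate logistic regression equals the cross-product ratio of the 2x2 table. The sketch below computes that ratio with a Wald 95% confidence interval. The function name is hypothetical and the counts are invented for illustration; they are not the counts reported in Table 2.

```python
import math
from typing import Tuple


def odds_ratio_wald(carriers_g1: int, noncarriers_g1: int,
                    carriers_g2: int, noncarriers_g2: int,
                    z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio of carrying the allele in group 1 versus group 2, with Wald CI."""
    odds_ratio = (carriers_g1 * noncarriers_g2) / (noncarriers_g1 * carriers_g2)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / carriers_g1 + 1 / noncarriers_g1 +
                          1 / carriers_g2 + 1 / noncarriers_g2)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts: 5 carriers / 20 non-carriers versus 4 carriers / 60 non-carriers.
or_value, ci_low, ci_high = odds_ratio_wald(5, 20, 4, 60)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

With small cell counts, as for a rare allele in a group of 23 LTNPs, the interval becomes wide, which is consistent with borderline p-values despite large point estimates.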

LEDGF/p75 mRNA expression analyses
LEDGF/p75 expression was significantly reduced in both untreated HIV-1+ (0.77 ± 0.47; p < 0.05) and LTNP (0.69 ± 0.50; p < 0.01) groups relative to uninfected individuals (1.59 ± 1), as depicted in Figure 2. No differences were observed between the untreated HIV-1 infected individuals and the LTNP group. In order to investigate a putative impact of the SNPs on the modulation of LEDGF/p75 gene expression in LTNPs, we further compared the relative expression between the minor allele carriers and non-carriers of each SNP. No differences in LEDGF/p75 expression levels were observed in LTNPs with distinct genetic backgrounds for any of the SNPs analyzed (data not shown).

Discussion
The present study aimed to characterize the genetic diversity of PSIP1 among Brazilian HIV-1 infected individuals and to investigate the association of these markers with AIDS progression. Exonic variations may have an important impact on protein function. The missense variation Q472L (rs61744944; A>T) is located in PSIP1 exon 13, in a region adjacent to the IBD. Our data show a trend towards an association between the T allele (472L) and LTNP status, suggesting that this variant might confer a protective effect. This SNP has not been reported to significantly affect LEDGF/p75 structure [41], IN binding affinity or HIV replication levels [40], [45]. However, data reported by Madlala and colleagues [40] suggested an association between rs61744944 T and lower viral loads in the acute phase of HIV-1 infection. This, together with the results described in the present study, suggests that this variation may have a subtle effect and, as such, would require a larger sample size to reach statistical significance. Alternatively, this SNP might be tagging the effect of a rare variant located in a functional region. Sequencing analysis did not show any other variation in the IBD or its boundaries. However, we cannot rule out the possibility that the SNP rs61744944 might be linked to a rare mutation in a regulatory region.
Resistance to HIV-1 infection and AIDS progression is a complex phenotype in which host genetics plays an important role. The first protective variation to be reported was the Δ32 deletion in the CCR5 gene coding region, followed by several candidate gene studies which described associations with other restriction factors such as CCR5 ligands and HLA alleles, especially class I [46]. The first genome-wide association study (GWAS) was developed to characterize factors associated with control of HIV-1 [23]. Despite the potential of GWAS, new candidates have not been identified. Aside from the HLA class I genes HLA-B and HLA-C, which were associated with AIDS-related outcomes such as control of viremia and the LTNP phenotype [22], [47], genes encoding known HIV-1 restriction factors were not associated with disease progression in those studies. Limitations such as the low power of GWAS to detect subtle effects may explain these results. Moreover, rare variations that might play a key role in outcomes such as the LTNP phenotype are also excluded from this approach. Therefore, the complete characterization of the genetic influence on complex phenotypes such as AIDS progression requires the concomitant development of genome-wide as well as candidate gene/region designs, as described in the present study. The interaction between LEDGF/p75 and HIV-1 IN has emerged as a promising new therapeutic target. Inhibitors of the LEDGF/p75-IN interaction (LEDGINs) might be considered for patients failing the currently available drug regimens [48]. In spite of the apparent stability of cellular targets, host genetic variations may not only impact antiviral therapy, but may also be involved in disease pathogenesis and outcome.
To date, a total of three studies have focused on the identification of PSIP1 polymorphisms in HIV-1 infected individuals [39][40][41]. Two studies were conducted with LTNPs to identify rare mutations associated with this phenotype [39], [41]. According to the data obtained from the control population, all the genetic variations investigated in the present study were present in Brazilian subjects, with allele frequencies similar to those observed in Caucasians (CEU) and Africans (AFR) according to the HapMap and 1000 Genomes Projects [49], [50]. Notably, the frequencies of SNPs rs10119931, rs2277191 and rs7470146 were close to the mean between those reported for CEU and AFR populations, reflecting the admixture of the Brazilian population [51]. SNP rs2277191 has been previously associated with HIV-1 acquisition and disease progression. Lower CD4+ T cell counts and a rapid decline of these cells were observed during the early phase of HIV-1 infection in South African women carrying this particular SNP [40]. Here, we observed similar frequencies of this SNP for both HIV-1 positive and negative individuals. However, we were unable to test the association between this SNP and disease progression due to the small number of subjects carrying allele A.
We have also investigated whether the positions K364, I365, D366, F406 and V408, associated with the interaction with HIV-1 IN, were mutated in our cohort. Mutations at I365, D366 or F406 disrupt the LEDGF/p75-IN interaction, while mutations at K364 or V408 result in an intermediate phenotype [31], [32]. V408 was the only one of these positions found to be mutated, in a single LTNP patient. A change at this position affects, but does not abolish, the interaction with IN [52]. In this particular case, the substitution was V408I, thus the hydrophobic side chain characteristic of this region of the IBD was maintained. A possible impact of this mutation and an association with the nonprogressor status should be further investigated in other LTNP and elite controller cohorts. The two rare mutations I436S and T473I identified by Ballana and colleagues [39] were not identified in the patients enrolled in this study.
The literature data regarding PSIP1 variations and gene expression are still controversial. Madlala and colleagues [40] identified an association of the SNP rs12339417 with reduced levels of LEDGF/p75, while Messiaen [41] did not find any association between LEDGF/p75 mRNA levels and disease progression, CD4 decline or viral load. According to our data, the levels of LEDGF/p75 mRNA produced by the LTNPs of our cohort were not significantly influenced by any of the SNPs investigated. However, we did observe a reduction in LEDGF/p75 expression in both untreated HIV-1+ and LTNP groups relative to HIV-negative individuals. These results are in agreement with previous reports [40], [53]. As demonstrated by Mous and colleagues [53], untreated HIV-1 infected individuals have significantly lower levels of LEDGF/p75 and higher levels of APOBEC3G, TRIM5α and tetherin than healthy controls, which might suggest a balance of host innate mechanisms to limit HIV replication by increasing the expression of antiviral restriction factors and decreasing the expression of factors that favor viral replication. In our study, no differences were observed between untreated HIV+ individuals and LTNPs, suggesting that LTNP status is not associated with a lower level of LEDGF/p75. Mous and colleagues [53] showed that while LEDGF/p75 mRNA expression was reduced in PBMCs, elevated protein levels were detected in monocytes relative to uninfected individuals. The mechanisms associated with the regulation of LEDGF/p75 expression in different cell types need to be investigated in HIV-1 infected individuals with different disease outcomes in order to elucidate the association of LEDGF expression with control of HIV-1 infection.
In conclusion, the results of the present work reinforce the association of the PSIP1 gene and the rs61744944 SNP with LTNP status. Since this variation has not been shown to alter protein function [40], [45], [54], the mechanisms involved in this protection need to be determined. Further studies with larger cohorts of patients with distinct disease outcomes are needed to confirm these trends and to clarify the role of these genetic variations in the function of LEDGF/p75, a potential target for anti-HIV therapy.