Distinct Genetic Loci Control Plasma HIV-RNA and Cellular HIV-DNA Levels in HIV-1 Infection: The ANRS Genome Wide Association 01 Study

Previous studies of the HIV-1 disease have shown that HLA and Chemokine receptor genetic variants influence disease progression and early viral load. We performed a Genome Wide Association study in a cohort of 605 HIV-1-infected seroconverters for detection of novel genetic factors that influence plasma HIV-RNA and cellular HIV-DNA levels. Most of the SNPs strongly associated with HIV-RNA levels were localised in the 6p21 major histocompatibility complex (MHC) region and were in the vicinity of class I and III genes. Moreover, protective alleles for four disease-associated SNPs in the MHC locus (rs2395029, rs13199524, rs12198173 and rs3093662) were strikingly over-represented among forty-five Long Term HIV controllers. Furthermore, we show that the HIV-DNA levels (reflecting the HIV reservoir) are associated with the same four SNPs, but also with two additional SNPs on chromosome 17 (rs6503919; intergenic region flanked by the DDX40 and YPEL2 genes) and chromosome 8 (rs2575735; within the Syndecan 2 gene). Our data provide evidence that the MHC controls both HIV replication and HIV reservoir. They also indicate that two additional genomic loci may influence the HIV reservoir.


Introduction
In the past ten years, candidate gene approaches were used to investigate host genetic variability in HIV-1 disease. Chemokine receptors such as CCR5 and CCR2 [1] and Major Histocompatibility Complex (MHC) class I genetic variation have been definitively associated with either clinical disease progression or HIV-RNA levels [2,3]. Recent advances in large-scale genotyping technologies enabled genome-wide association (GWA) strategies for detection of novel genetic variants that influence HIV-1 disease. In this respect, the first GWA study in HIV-1 disease [4] identified three Single Nucleotide Polymorphisms (SNPs): two of them, linked with the ability to control plasma HIV-RNA levels, were located within the MHC region (rs2395029 within HCP5 gene and rs9264942 near HLA-C), while a third SNP (rs9261174), located near ZNRD1, was only associated with disease progression. These data suggest that disease progression and HIV-RNA might be controlled by several loci of the human genome.
Plasma HIV-RNA levels differ widely between patients. Plasma HIV-RNA is the best-known virological marker of HIV disease progression; its predictive value has been demonstrated at different points in the time course of HIV infection [5][6][7]. The HIV-DNA level in peripheral blood mononuclear cells (PBMC) is another major predictive marker of HIV disease progression, and represents a phenotype not yet studied in either candidate gene or GWA approaches [8,9]. HIV-DNA and HIV-RNA levels measure two different forms of HIV-1. It is not clear whether HIV-DNA levels measure only the viral reservoir. Both phenotypes can be determined accurately and, even though they are correlated, they are distinct predictive factors for HIV disease progression. Indeed, plasma HIV-RNA is a composite marker reflecting viral fitness, replicative capacity and host control, while intracellular HIV-DNA levels reflect the establishment of HIV reservoirs and, therefore, the effect of numerous host proteins that may interact with the intracellular viral lifecycle [10].
We hypothesized that genetic variants that regulate HIV-DNA metabolism, in particular HIV latency, are different than those involved in HIV-RNA metabolism and HIV replication. For this purpose, we conducted a GWA study in the large ANRS PRIMO cohort of unselected HIV-1 seroconverters (observed median time since infection: 46 days) for identifying SNPs associated with early plasma HIV-RNA levels [11]. All 605 Caucasian subjects of this cohort were genotyped using Illumina Sentrix Human Hap300 Beadchip containing 317,139 tagging SNPs. We selected genetic variants having a high likelihood of being associated with plasma HIV-RNA levels for a false discovery rate (FDR) [12] of 25%.
In order to provide additional evidence to support these selected HIV-1 disease-related SNPs, we compared their allelic frequencies between this reference population (PRIMO) and an independent population of long-term HIV controllers [13]. The latter group of patients represents less than 0.5% of the general population of HIV-1 infected patients, and reflects an extreme phenotype compared to the PRIMO population. HIV controllers exhibit sustained spontaneous control of viral replication after at least 10 years of HIV infection (observed median time since HIV diagnosis 18 years). We hypothesized that true positive genetic variants obtained from the GWA will exhibit enrichment for protective alleles in long-term HIV controllers compared to patients from the PRIMO population. We also studied for the first time association signals for another viral phenotype, i.e., cellular HIV-DNA levels.
Taken together, our data redefine the regions controlling HIV-RNA within the MHC locus and indicate an independent effect of SNPs located nearby not only class I but also class III genes. Some of these SNPs are associated with HIV-RNA levels from primary infection and early seroconversion and are over-represented in patients showing long-term spontaneous control. Most importantly, our data provide evidence that the MHC locus controls both HIV replication and HIV-DNA levels while early control of HIV-DNA levels may be under the control of two additional genomic loci.

GWA and plasma HIV-RNA levels
A total of 308,222 SNPs on the autosomal chromosomes passed the quality-control filters. We tested for association between genotype and early plasma HIV-RNA levels using a linear regression model assuming an additive genetic effect, with adjustment for gender. The negative log10 P-values for the SNPs genotyped in the 605 patients from the PRIMO cohort are plotted in Figure 1.
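As an illustration only (this is not the authors' code), the additive model with adjustment for gender can be sketched in Python as follows. The function name and the toy data are hypothetical. The sketch relies on the Frisch-Waugh-Lovell result: with an intercept and a single categorical covariate, the adjusted slope equals the slope obtained after centring both genotype and phenotype within each covariate group.

```python
from collections import defaultdict


def additive_beta(genotypes, phenotypes, sexes):
    """Slope of phenotype on minor-allele count (0/1/2), adjusted for sex.

    Centring genotype and phenotype within each sex group is equivalent
    to fitting phenotype ~ intercept + genotype + sex by least squares.
    """
    # Per-group running sums: [sum of genotypes, sum of phenotypes, count]
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for g, y, s in zip(genotypes, phenotypes, sexes):
        acc = sums[s]
        acc[0] += g
        acc[1] += y
        acc[2] += 1

    sxy = 0.0  # cross-product of centred genotype and phenotype
    sxx = 0.0  # sum of squares of centred genotype
    for g, y, s in zip(genotypes, phenotypes, sexes):
        g_sum, y_sum, n = sums[s]
        dg = g - g_sum / n
        dy = y - y_sum / n
        sxy += dg * dy
        sxx += dg * dg
    return sxy / sxx


# Toy data (invented): phenotype = 5 - 0.5 * genotype + 0.3 * sex
toy_genotypes = [0, 1, 2, 0, 1, 2]
toy_sexes = [0, 0, 0, 1, 1, 1]
toy_phenotypes = [5 - 0.5 * g + 0.3 * s for g, s in zip(toy_genotypes, toy_sexes)]
beta = additive_beta(toy_genotypes, toy_phenotypes, toy_sexes)
```

A negative slope in such a model corresponds to a lower phenotype value per additional copy of the counted allele.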
The SNPs associated with plasma HIV-RNA levels and selected at an FDR threshold of 25% are shown in Table 1; 15 SNPs were located on chromosome 6, 3 each on chromosomes 12 and 16, and one on chromosome 4. Among the 15 SNPs located on chromosome 6, twelve were clustered within the 6p21 region harbouring the MHC (from 312 kb to 321 kb). The strongest association signal was observed for the rs10484554 SNP, which is located upstream from HLA-C. The rs10484554(T) protective allele had a frequency of 0.128 in our population. The second highest association signal was observed for the rs2523619 SNP, which lies within the HLA-B gene and is in strong linkage disequilibrium (r² = 0.57) with the rs10484554 SNP. The third highest association signal obtained in the present study was for rs2395029, which lies within the class I gene HCP5 and is in weak linkage disequilibrium with rs10484554 (r² = 0.13) and rs2523619 (r² = 0.07). This association signal for rs2395029 replicates the result previously reported from CHAVI [4]. Another three HLA-C neighbouring SNPs (rs130065, rs3130467 and rs3130473) showing strong linkage disequilibrium (r² > 0.5) and significant effects on viral load were among the top signals.
When comparing allelic frequencies between HIV controllers and the PRIMO seroconverters, four SNPs located within 6p21 (rs2395029, rs13199524, rs12198173 and rs3093662; in bold in Table 1) showed highly significant differences (p < 10⁻⁴) that correspond to a striking enrichment in HIV controllers for protective allelic variants associated with HIV-RNA levels. We further designated them as major SNPs. Moreover, these four major SNPs showed highly significant p-values (p < 10⁻⁷) in the CHAVI study (Table 1). It should be noted that our study on the PRIMO cohort used plasma HIV-RNA levels measured at the time of primary infection, while the CHAVI study used viral loads measured during the plateau phase (6 to 18 months following seroconversion). Overall these results support the idea that these major SNPs have an effect on viral load throughout the whole time course of HIV disease. The extent of linkage disequilibrium between selected SNPs (Table 1) lying in the 6p21 area is shown in Figure 2, Figure S1 and Figure S2.
Figure 1. Genome wide association (GWA) study for plasma HIV-RNA levels in the ANRS PRIMO cohort. Negative log10(P)-values for the score test for genome-wide association across the genome ordered from 1pter to 22qter. Adjacent chromosomes are shown in light blue and dark blue. The spacing between SNPs does not reflect actual physical distances. The false discovery rate was calculated based on a stratified approach (by chromosome) leading to chromosome-specific thresholds. doi:10.1371/journal.pone.0003907.g001
Table 1. The following information is given for each SNP: Beta is the slope parameter (the allele indicated in brackets is used as the reference group); CHAVI P-values are the P-values obtained from the CHAVI study; CHAVI rank is the rank of the P-values (obtained from the CHAVI study) according to the Illumina 317K whole-genome single-nucleotide polymorphism arrays. The rank assigned to tied values is calculated by taking the mean of the ranks.
The genomic region flanked by these SNPs contains both MHC class I and class III genes and extends from the PSORS1C1 gene to the TNXB gene. Supplementary Figures 1 and 2 display the linkage disequilibrium data matrix (based on HapMap CEU data) and the genes located within this 6p21 region (from 312 kb to 321 kb).
Among the class I genes, only the HLA-C and HLA-B genes are adjacent to SNPs associated with the control of plasma viral load (Figure 2). It should be noted that neither the PRIMO (Table 1) nor the CHAVI studies [4] provide evidence for association signals with HLA class II. Among the major SNPs, rs2395029 is in moderate linkage disequilibrium (r² = 0.21) with the rs3093662 SNP, which lies within the Tumor Necrosis Factor (TNF) gene. The rs12198173 and rs13199524 SNPs showed moderate linkage disequilibrium with SNP rs2395029 as well as with all the other SNPs in the class I MHC region. Finally, the rs12198173 and rs13199524 SNPs are in nearly complete linkage disequilibrium (r² = 0.97), are both located in the class III region of the MHC and both lie within introns of the TNXB gene. Although mutations in the TNXB gene are responsible for the Ehlers-Danlos Syndrome [14], it is unlikely that this gene is responsible for the observed effect on HIV-1 plasma viral load. Other neighbouring genes, particularly C4A and C4B (which show different types of polymorphisms such as duplications, gene recombinations, insertions, and SNPs inducing the appearance of premature stop codons), are located on the immediate telomeric side of TNXB, are linked to both autoimmune and infectious diseases, and are excellent candidate genes for this effect [15]. These results provide evidence that the region encompassing the HCP5 and MIC genes and the class III region encompassing the TNXB and C4 genes both contribute to the control of HIV-RNA levels.
Box plots for HIV-RNA levels measured during primary infection for the four major SNPs are shown in Figure 3. In patients homozygous for the rs2395029(T) allele, the median HIV-RNA value was 5.09 (range: 4.48-5.70), whereas in heterozygous patients the median was 4.20 (range: 3.25-5.09), illustrating that the minor allele G is associated with lower HIV-RNA levels. A similar trend for the minor allele G was observed for rs3093662. Despite minor allele frequencies of 0.083 (rs12198173) and 0.080 (rs13199524), with few observed patients homozygous for the protective alleles (A for rs12198173, T for rs13199524), a strong reduction (2.35 log10) in the median HIV-RNA level was observed. Both SNPs (rs12198173 and rs13199524) are located within the TNXB gene, but the two homozygous patients also carried one favourable HLA allele such as HLA-B27 or HLA-B57. It should be noted that patients homozygous for these protective alleles (rs12198173(A), rs13199524(T)) account for 4% of the HIV controllers versus 0.3% of the PRIMO seroconverters.
Another important result of this study was the strong association of the rs11725412 SNP located on chromosome 4 with plasma HIV-RNA levels in both the PRIMO and HIV controller cohorts (Table 1). Although we cannot exclude that this finding represents a false positive signal, the p-value in the PRIMO analysis was low (p < 6×10⁻⁶) and a significant (p < 6.58×10⁻⁴) enrichment was observed in HIV controllers. The rs11725412 SNP lies in an intergenic region in 4p14; its immediate functional neighbours are the TBC1D1 gene (telomeric) and the KLF3 gene (centromeric). TBC1D1, the founding member of a family of proteins sharing a 180 to 200 amino acid TBC domain, is presumed to have a role in regulating cell proliferation and differentiation [16]. The KLF3 protein has a characteristic zinc finger domain and is classified in the family of SP transcription factors, but its precise function remains unknown [17,18]. More distant genes in this region include several Toll-like receptors (TLR1, TLR6, TLR10) and PTTG2 (a gene belonging to the securin family) that plays an important role in cell division and survival [19,20].

GWA and cellular HIV-DNA levels
HIV-1 DNA levels in PBMCs are established at a very early stage of HIV infection and are representative of the HIV reservoir [21]. This marker showed a wide range (median = 3.30 log10 copies/million PBMC; IQR = 0.76) in patients from the PRIMO cohort; in contrast, HIV-1 DNA levels were very low (median = 1.77 log10 copies/million PBMC; IQR = 0.64) in long-term HIV controllers. We hypothesized that this phenotype may be controlled by genomic regions different from those controlling HIV replication, particularly in the pre-integration phases or in provirus transcription, which may vary among patients.
The SNPs associated with HIV-DNA levels in the PRIMO cohort and selected at an FDR threshold of 25% are shown in Table 2. Among the 28 SNPs, 21 were located on chromosome 6 (mainly in 6p21), 4 on chromosome 8, 1 on chromosome 17, 1 on chromosome 22 and 1 on chromosome 12. Some of these hits might represent false positives. However, the four major SNPs (i.e., rs2395029, rs13199524, rs12198173 and rs3093662; in bold in Table 2) previously found to be associated with HIV-RNA levels were also associated with HIV-DNA levels. This highlights the importance of the MHC region for both HIV replication and the HIV reservoir. The first three SNPs (with q-value < 2.3%, see Table 2) are located on three different chromosomes, specifically chromosomes 6, 17 and 8. The strongest association signal was observed for rs2395029, one of the four major SNPs associated with plasma HIV-RNA (Table 1) [4]. The second highest association signal corresponds to the rs6503919 SNP (chromosome 17). This SNP is found in an intergenic region flanked by DDX40 [22] (a member of the DExH/D box family of ATP-dependent RNA helicases; for a recent review see [23]) and YPEL2 (a putative zinc-binding protein gene) [24]. The rs2575735 SNP (chromosome 8) lies within the SDC2 gene, which encodes syndecan 2. Syndecan 2 is a trans HIV receptor which binds to HIV-1 gp120 through heparan sulfate chains [25,26]. Keeping in mind that HIV controllers were selected on the basis of a specific HIV-RNA phenotype (<400 copies/ml after 10 years of known HIV infection), we observed no significant enrichment (p > 0.1) for the protective alleles of the rs6503919 and rs2575735 SNPs in HIV controllers compared to the PRIMO patients. Nevertheless, we cannot rule out the possibility that some of the non-chromosome 6 hits may represent false positives.
Box plots for the HIV-DNA levels in the PRIMO seroconverters for the first three SNPs (rs2395029, rs6503919, and rs2575735, as listed in Table 2) are shown in Figure 4. The difference in median HIV-DNA levels between TT homozygotes and TG heterozygous patients for rs2395029 was −0.53, illustrating that the minor allele G is associated with lower HIV-DNA levels. A similar trend, i.e., a 0.36 difference in median HIV-DNA values according to genotype, was also observed among HIV controllers. For the SNP rs6503919 located on chromosome 17, we observed both a −0.62 log10 difference in median HIV-DNA values between AA and GG patients in the PRIMO cohort, and a −0.45 log10 difference in HIV controllers according to the same genotype. A −0.62 log10 HIV-DNA difference is clinically highly significant since it can be compared to the effect obtained within the first year following antiretroviral treatment initiation with a classical triple therapy [27]. For the rs2575735 SNP on chromosome 8, we observed a difference of −0.37 log10 between AA and GG homozygotes in the PRIMO patients. In contrast, such a difference was not observed in HIV controllers. This result may imply a temporal effect which disappears over time.

Discussion
The present GWA study identified SNPs highly associated with early viral replication in the ANRS PRIMO cohort. Although we performed the genotyping using an Illumina 317K chip, and gains in genetic coverage and transferability [28] could be obtained using more recent technologies, it should be noted that the top SNPs associated with the control of viral load are located in the MHC locus in both our study and that of Fellay et al., even though the latter used a SNP chip with higher genetic coverage (Illumina 550K). Furthermore, the definition of the viral phenotype differs between the two studies, which may further explain differences in the strength of the associations. Four of these SNPs exhibited a striking enrichment of their protective allele in long-term HIV controllers as compared to the unselected seroconverters from the PRIMO cohort. These four major SNPs are located in the 6p21 MHC in the vicinity of class I and III genes. The association signal observed for the rs2395029 SNP located in the HCP5 gene, which is in complete linkage disequilibrium with HLA-B*5701 [29], replicates the previous report from the CHAVI study [4]. Interestingly, the rs2395029*T allele which has a protective role in HIV infection was also recently shown to be associated with increased susceptibility to psoriasis and psoriatic arthritis [30]. One possible explanation for this MHC association signal is that variants of the classical HLA-B and HLA-C alleles optimally present HIV-derived peptides to CD8+ T-cells which, in turn, kill HIV-infected cells. The results of the present study add to reports in the literature regarding the importance of HLA-B27 and HLA-B57 in long-term nonprogression and in viral load control [31][32][33]. Another possible explanation relates to other MHC-related gene variants in linkage disequilibrium with rs2395029, such as SNPs in the TNF gene.
The fact that the MICB gene is flanked by the rs2395029 and rs3093662 SNPs also suggests its contribution to the control of HIV-1 RNA. MICs are polymorphic [34,35] and affect cell-mediated cytotoxicity through interaction with NKG2D [36,37], while TNF variants affect numerous steps of the immune response [38]. It is thus possible that HIV-infected patients who carry a particular set of classical HLA-B or HLA-C alleles together with TNF and MIC protective alleles have an increased capability to control HIV replication during the time course of HIV disease.
In addition, the present study highlights that some of the major SNPs located in the MHC region have an effect on HIV-RNA levels independently of SNP rs2395029. The two major SNPs rs12198173 and rs13199524, which are in nearly complete linkage disequilibrium, suggest that the class III region of the MHC plays an important role in HIV replication. In patients homozygous for the minor allele of these SNPs, plasma HIV-RNA was 2.35 log10 lower than in patients with other genotypes. To the best of our knowledge, this is the first study demonstrating that a rare genetic variant within the class III region of the MHC is associated with important viral load changes, which are comparable to those obtained during the first weeks following initiation of multiple antiretroviral therapy. Such an efficient control of viral replication, both in terms of strength and duration, provides new insights into underlying mechanisms and may lead to novel methodologies for HIV vaccine design.
Moreover, the present study highlights the importance of SNPs adjacent to HLA-C, such as rs10484554, which was the most highly associated SNP in our study. HLA-C has a dual function: it presents HIV-derived peptides to T lymphocytes but also interacts with either activating or inhibitory NK receptors. Unfortunately, very few studies have focused on the role of HLA-C alleles in HIV disease [39]. Recent reports [40] provided evidence of epistatic interactions between classical HLA-B alleles and KIR receptor polymorphisms, but again only the role of HLA-B allelic variants was studied. It is worth noting that the protective rs10484554*T allele was also recently shown to be associated with increased susceptibility to psoriasis and psoriatic arthritis [30]. These findings suggest that HLA-C deserves more attention as an HIV disease modifying gene.
This is the first study to explore the role of genetic variants in the early establishment of the HIV reservoir. Association signals for HIV-DNA levels revealed new genetic loci that may control early steps of HIV spreading. The four major SNPs associated with viral replication were also found to be strongly associated with HIV-DNA levels. The strongest association signal for HIV-DNA was again observed for the rs2395029 SNP located in the HCP5 gene, suggesting that structural variants of MHC genes also play a very important role in the HIV reservoir. Moreover, we found an association signal for the SNP rs6503919 on chromosome 17, which lies in an intergenic region flanked by the DDX40 and YPEL2 genes. One plausible explanation for this finding is that structural variants of the helicase DDX40 play an important role in the metabolism of HIV-RNA and/or in the efficiency of provirus transcription. DEAD helicases are generally thought to participate in many aspects of RNA metabolism including RNA transcription, mRNA export, translation and RNA stability. It is now well established that HIV-1 has the ability to use more than one cellular RNA helicase for its replicative life cycle (for a recent review see [23]). The other immediate neighbour gene of SNP rs6503919 is YPEL2, which is a member of the zinc-binding protein family. Besides the fact that it is associated with the mitotic apparatus, the exact functions of YPEL2 are not well known to date. It is possible that structural variants of YPEL2 affect the rate of division of immune competent cells, thus limiting expansion of latently infected cells and their spreading to lymphoid tissues.
We also report an association signal on chromosome 8 for the rs2575735 SNP lying within the Syndecan 2 gene. The role of Syndecans (which are trans HIV receptors) in the HIV reservoir has not been investigated to date. Syndecans play a role in HIV dissemination to T lymphocytes and macrophages [25,26]. Our findings support the idea that structural variants of Syndecans may be important in controlling HIV spreading and early establishment of the HIV reservoir. It should be noted that the protective variant was not over-represented in long-term HIV controllers, suggesting either a temporal effect or a lack of power due to the small number of HIV controllers tested. It is also important to keep in mind that the HIV controllers in the present study were selected on an HIV-RNA based phenotype (<400 copies/ml) but not on an HIV-DNA based phenotype. Further studies are necessary to confirm the role of Syndecan 2 genetic variants in HIV reservoirs.
Taken together, our data provide evidence that four major SNPs in the MHC locus are associated with control of both plasma HIV-RNA and cellular HIV-DNA levels. Most importantly, they are associated with viral replication at the time of primary infection, at the viral set point and also at late stages of HIV-1 disease. It is worth noting that here we performed the genotyping using an Illumina 317K chip and that gains in terms of genetic coverage and transferability could be obtained using more recent technologies [28]. Our study provides evidence that host genetic control of HIV-RNA over time involves both MHC class I and class III genes. Finally, this is the first GWA study to investigate the host determinants controlling the HIV-1 reservoir, and it identified two new loci on different chromosomes; this result implies that genes additional to and/or different from those controlling HIV replication control the establishment of the HIV reservoir.

Patients
Since November 1996, the ongoing French ANRS PRIMO CO 06 Cohort has enrolled patients during primary HIV-1 infection in 81 French hospitals [11]. The present study was approved by the Paris Cochin Ethics Committee. All subjects gave their informed written consent. Recent infection was confirmed by one of the following: an incomplete Western Blot pattern (absence of anti-p68 and anti-p34); a detectable plasma HIV-RNA or positive p24 antigenemia with a negative or weakly reactive ELISA test; an interval of less than 6 months between an authenticated negative and a positive ELISA test. The date of infection was estimated on the basis of one of the following: the date of symptom onset minus 15 days, the date of the incomplete Western Blot minus one month, or the mid-point between a negative and a positive ELISA test. The interval between the estimated date of infection and enrolment did not exceed 6 months (median observed interval in the cohort = 46 days). Patients had to be antiretroviral-naïve at enrolment. Clinical and biological examinations were conducted at enrolment and at each follow-up visit, and included CD4 cell and plasma HIV-RNA assays. Blood cells and plasma were drawn at enrolment and at follow-up visits and frozen until analysis.
Since May 2006, the ongoing French ANRS EP36 study has enrolled HIV controllers, defined as patients who had more than 90% of their viral loads <400 copies/ml after 10 years of known HIV infection, and were free of disease symptoms and of antiretroviral therapy. This study was approved by the Paris Bicêtre Ethics Committee. Patients were identified in hospitals throughout France, and were enrolled after written informed consent [13,41]. At enrolment, 10 ml of whole blood were collected and frozen for further virological and genetic analyses. A total of 45 Caucasian HIV controller patients with available DNA samples were selected for this study.

HIV-RNA and HIV-DNA measurements
Plasma HIV-RNA levels and HIV-DNA in PBMCs used in this paper were quantified in the first patient samples drawn at the time of enrolment in the PRIMO cohort and the HIV controllers study, prior to any antiretroviral therapy initiation. Median time from infection to enrolment was 47 days (IQR 37-69). HIV-RNA and HIV-DNA levels measured during the 0-6 months following infection predict HIV disease progression [8]. HIV-RNA levels measured at the time of primary infection are strongly associated with HIV-RNA levels at the set point [42].
Quantification of HIV-RNA in plasma was performed on site at the participating hospitals, while HIV-DNA quantification was performed at the Necker Virology laboratory using real-time PCR technology based on amplification of the HIV-1 LTR gene [43]. Briefly, after nucleic acid extraction, the total amount of DNA was quantified. The 95% detection threshold of the assay was 5 HIV-DNA copies/PCR, that is, 70 copies of HIV-DNA per million PBMC when using 0.5 µg of total DNA per PCR. For HIV controller samples, quadruplicate PCR tests were conducted in order to obtain numerical values for very low levels. Results were expressed as log10 copies per million PBMC.
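The conversion from copies per PCR to copies per million PBMC can be checked with a short calculation. The sketch below is illustrative only: it assumes an input of 0.5 µg of total DNA per PCR and a DNA content of about 6.6 pg per diploid human cell (the latter value is our assumption and is not stated in the text); the function and constant names are hypothetical.

```python
# Assumption (not from the source): approximate DNA mass of one human
# diploid cell, in picograms.
PG_DNA_PER_DIPLOID_CELL = 6.6


def copies_per_million_cells(copies_per_pcr, dna_input_ug,
                             pg_per_cell=PG_DNA_PER_DIPLOID_CELL):
    """Convert copies per PCR reaction to copies per million cells."""
    cells_per_pcr = dna_input_ug * 1e6 / pg_per_cell  # 1 ug = 1e6 pg
    return copies_per_pcr / cells_per_pcr * 1e6


# Detection threshold of 5 copies/PCR with 0.5 ug of total DNA per PCR
threshold = copies_per_million_cells(5, 0.5)
```

Under these assumptions the threshold comes out at roughly 66 copies per million cells, which is of the same order as the 70 copies per million PBMC reported above.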

Illumina SNP genotyping and quality-control filtering
The DNA samples from all patients were genotyped with Illumina HapMap300 Beadarrays containing 317,000 SNPs. The scanned images were processed using BeadStudio (version 3.2.23) and all genotypes were called using the Illumina proprietary algorithm with genotype clusters generated from the series of samples tested.
Samples with a call rate below 97% were discarded and SNPs with Gene Call < 90% were removed. SNPs with either a minor allele frequency below 0.01 or a significant (p < 10⁻⁷) deviation from Hardy-Weinberg equilibrium were excluded. After applying these filters, the average call rate across these samples was 98.3% with 308,222 SNPs on the autosomal chromosomes.
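The per-SNP filters can be illustrated with the following minimal Python sketch. It is not the pipeline used in this study. Two points are assumptions on our part: the 90% criterion is treated here as a per-SNP call rate, and the Hardy-Weinberg test is a one-degree-of-freedom chi-square goodness-of-fit test (the text does not specify which test was used). All function names are hypothetical.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from the three genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return min(p, 1.0 - p)


def hwe_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) goodness-of-fit P-value for Hardy-Weinberg."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(n_aa, n_ab, n_bb, n_samples,
              min_call_rate=0.90, min_maf=0.01, hwe_threshold=1e-7):
    """Apply call-rate, MAF and HWE filters to one SNP."""
    called = n_aa + n_ab + n_bb
    if called == 0 or called / n_samples < min_call_rate:
        return False
    if minor_allele_frequency(n_aa, n_ab, n_bb) < min_maf:
        return False
    return hwe_pvalue(n_aa, n_ab, n_bb) >= hwe_threshold
```

For example, with invented genotype counts, `passes_qc(500, 95, 5, 605)` keeps the SNP, whereas a SNP with no heterozygotes among 600 called samples (`passes_qc(300, 0, 300, 605)`) is rejected for its deviation from Hardy-Weinberg proportions.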

Statistical analysis
Population stratification. To detect, and control for, possible population stratification, we employed the genomic control approach [44] using all SNPs to estimate the genomic-control inflation factor λ.
The value of λ was small (λ = 1.005), indicating no population substructure in our cohort and no inflation of the test statistics under the null hypothesis. We therefore report all our analyses without any specific correction for population structure.
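For illustration, the genomic-control inflation factor is commonly estimated as the median of the observed one-degree-of-freedom chi-square association statistics divided by the median of the chi-square distribution with one degree of freedom (about 0.455). The Python sketch below assumes this median-based estimator; the exact estimator used in this study is not detailed in the text, and the function names are hypothetical.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Median-based estimate of the genomic-control inflation factor."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control_adjust(chi2_stats):
    """Deflate test statistics by the inflation factor when it exceeds 1."""
    lam = max(1.0, genomic_inflation(chi2_stats))
    return [x / lam for x in chi2_stats]
```

A value close to 1 means that the median test statistic matches its expectation under the null hypothesis, in which case the adjustment leaves the statistics essentially unchanged.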
Multiple testing. To address the multiple testing problem, we selected SNPs at a 25% level for the FDR, which is the expected proportion of false discoveries among all discoveries [12]. The q-value of a test, which is analogous to an adjusted P-value for the FDR criterion, is the minimum FDR incurred when calling that test significant. In the present study, we estimated the q-values using the procedure introduced by Dalmasso et al. [45], considering a stratified approach (by chromosome) as proposed by Sun et al. [46].
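The idea of chromosome-stratified q-values can be sketched as follows. Note that this sketch substitutes the simpler Benjamini-Hochberg step-up adjustment for the estimation procedure of Dalmasso et al. [45], which is not reproduced here; it is meant only to show how stratification by chromosome yields chromosome-specific thresholds. Function names and SNP identifiers are hypothetical.

```python
from collections import defaultdict


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def stratified_qvalues(snps):
    """snps: iterable of (snp_id, chromosome, pvalue) tuples.

    The adjustment is applied separately within each chromosome.
    """
    by_chrom = defaultdict(list)
    for snp_id, chrom, p in snps:
        by_chrom[chrom].append((snp_id, p))
    result = {}
    for items in by_chrom.values():
        qs = bh_qvalues([p for _, p in items])
        for (snp_id, _), q in zip(items, qs):
            result[snp_id] = q
    return result


def select_at_fdr(snps, level=0.25):
    """IDs of SNPs whose stratified q-value is at or below the FDR level."""
    return sorted(s for s, q in stratified_qvalues(snps).items() if q <= level)
```

Because each chromosome is adjusted on its own, a chromosome carrying many true signals tolerates a less stringent P-value cut-off than one carrying few, which is the motivation for the stratified approach.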
Comparison between PRIMO and HIV controller patients. We compared allele frequencies between the PRIMO and HIV controller patients using an algorithm dedicated to association studies using bi-allelic markers and allowing fast computation of unbiased exact P-values [47].
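As an illustration of an exact comparison of allele counts between two groups, the sketch below implements a standard two-sided Fisher exact test on a 2x2 table of allele counts. This is a stand-in chosen for simplicity and is not the dedicated algorithm of reference [47]; the function names and the counts in the comments are hypothetical.

```python
from math import comb


def allele_counts(n_aa, n_ab, n_bb):
    """Return (count of allele a, count of allele b) from genotype counts."""
    return 2 * n_aa + n_ab, 2 * n_bb + n_ab


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the table [[a, b], [c, d]].

    Rows are the two groups, columns are the two alleles.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum probabilities of all tables at least as extreme as the observed one
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))
```

In use, the first row would hold the allele counts of one group (for example, the output of `allele_counts` for the reference cohort) and the second row those of the comparison group.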

Assessment of linkage disequilibrium between SNPs. To assess the level of linkage disequilibrium between SNPs, we calculated the pairwise r² measure between consecutive pairs of markers throughout the genome, using the expectation-maximization algorithm to estimate 2-locus haplotype frequencies, as implemented in the R package snpMatrix [48].
Genome wide association tests. We tested for an association between SNPs and HIV-RNA and HIV-DNA levels according to a linear model with an additive genetic effect with adjustment for gender. Significance was determined using generalized score tests [49] with SNP genotypes as the dependent variables and either HIV-DNA or HIV-RNA levels as the independent variables.
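The expectation-maximization estimate of two-locus haplotype frequencies, and the resulting r², can be sketched in Python as follows. This is a minimal illustration written for this purpose and is not the snpMatrix implementation; the function name and the fixed number of iterations are our own choices. Only double heterozygotes have ambiguous phase, so the EM step reduces to updating the fraction of them assigned to the AB/ab configuration.

```python
def ld_r2(counts, n_iter=100):
    """r-squared between two biallelic SNPs from a 3x3 genotype table.

    counts[i][j] = number of individuals carrying i copies of allele A
    at SNP 1 and j copies of allele B at SNP 2.
    """
    n = sum(sum(row) for row in counts)
    c = counts
    # Haplotype counts that are unambiguous given the genotypes
    n_AB = 2 * c[2][2] + c[2][1] + c[1][2]
    n_Ab = 2 * c[2][0] + c[2][1] + c[1][0]
    n_aB = 2 * c[0][2] + c[1][2] + c[0][1]
    n_ab = 2 * c[0][0] + c[1][0] + c[0][1]
    dh = c[1][1]  # double heterozygotes: AB/ab or Ab/aB

    w = 0.5  # fraction of double heterozygotes assumed to be AB/ab
    for _ in range(n_iter):
        # M-step: haplotype frequencies given the current phase split
        p_AB = (n_AB + dh * w) / (2 * n)
        p_ab = (n_ab + dh * w) / (2 * n)
        p_Ab = (n_Ab + dh * (1 - w)) / (2 * n)
        p_aB = (n_aB + dh * (1 - w)) / (2 * n)
        # E-step: posterior probability of the AB/ab phase
        cis, trans = p_AB * p_ab, p_Ab * p_aB
        w = cis / (cis + trans) if cis + trans > 0 else 0.5

    p_A = p_AB + p_Ab
    p_B = p_AB + p_aB
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r2 is undefined when a SNP is monomorphic")
    d = p_AB - p_A * p_B
    return d * d / denom
```

With a toy table in which the two SNPs always co-occur (for example 10, 20 and 10 individuals on the diagonal), the estimate converges to r² = 1.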