Evidence for an Epistatic Effect between TP53 R72P and MDM2 T309G SNPs in HIV Infection: A Cross-Sectional Study in Women from South Brazil

Objective To investigate the associations of TP53 R72P and MDM2 T309G SNPs with HPV infection status, HPV oncogenic risk and HIV infection status. Design Cross-sectional study combining two groups (150 HIV-negative and 100 HIV-positive) of women. Methods Data were collected using a closed questionnaire. DNA was extracted from cervical samples. HPV infection status was determined by nested-PCR, and HPV oncogenic risk group by Sanger sequencing. Both SNPs were genotyped by PCR-RFLP. Crude and adjusted associations involving each exposure (R72P and T309G SNPs, as well as 13 models of epistasis) and each outcome (HPV status, HPV oncogenic risk group and HIV infection) were assessed using logistic regression. Results R72P SNP was protectively associated with HPV status (overdominant model), as was T309G SNP with HPV oncogenic risk (strongest in the overdominant model). No epistatic model was associated with HPV status, but a dominant (R72P over T309G) protective epistatic effect was observed for HPV oncogenic risk. HIV status was strongly associated (risk factor) with different epistatic models, especially in models based on a visual inspection of the results. Moreover, there was evidence that HIV status is an effect mediator of the associations involving HPV oncogenic risk. Conclusions We found evidence for a role of R72P and T309G SNPs in HPV status and HPV oncogenic risk (respectively), and strong associations were found for an epistatic effect in HIV status. Prospective studies in larger samples are warranted to validate our findings, which point to a novel role of these SNPs in HIV infection.


Introduction
Variability in susceptibility to infection among individuals is observable in many infectious diseases. Understanding such variation and identifying its causal factors may have important implications for clinical practice and population health. Although intra- and inter-population variation in susceptibility to viral infection is attributable to several factors, host genetics likely plays an important role. Regarding HIV infection, a common example of genetic resistance is the CCR5 32-bp deletion. This allele, when transcribed and translated, results in a non-functional receptor, thus providing resistance to HIV infection [1]. Importantly, these findings are currently being incorporated into therapeutic approaches [2,3], showing that understanding the genetic basis of viral infection susceptibility has practical implications for human health. In this regard, several genome-wide association studies (GWAS) have been conducted to investigate the roles of host genetics in HIV load and/or disease progression [4][5][6][7][8][9][10][11][12][13][14], and, more recently, some GWAS have focused on genetic factors associated with HIV acquisition in different populations [15][16][17][18][19][20]. Such efforts indicate the importance given to host genetics in HIV pathogenesis.
Because HIV infection is considered a major risk factor for HPV infection, the identification of host genetic factors involved in HIV infection may also have implications for HPV-related outcomes. Given the well-established roles of HPV in cancer, the majority of studies involving HPV and host genetics concern cancer development and/or progression associated with infection by oncogenic HPV strains. In this context, p53 pathway genes have been extensively studied, given that oncogenic HPV E6 orchestrates, in association with E6AP, p53 degradation by the ubiquitin-proteasome system [21]. The oncogenic effects of HPV have also been investigated epidemiologically (in different populations) using genetic variants in p53 pathway genes, including the TP53 R72P SNP (rs1042522) [22][23][24], which has been shown to interfere with p53 apoptotic and transcriptional functions [25][26][27]. More recently, the MDM2 T309G SNP (rs2279744) [28][29][30], which influences p53 activity by affecting MDM2 transcription, has also been investigated in similar contexts [31,32].
In a recent review, different biological mechanisms by which interference with the p53 pathway may impact viral infection (or virus persistence after exposure) were proposed [33]. In addition, there is a substantial amount of in vitro evidence supporting a functional relationship between p53 and different HIV proteins (with evidence for implications for p53-mediated apoptosis), as well as some evidence regarding MDM2. These functional relationships with the p53 pathway have been shown for the gp120 [34,35], Rev, Tat [36] and Vpu [37] proteins. Considering the roles of R72P and T309G SNPs in p53 apoptosis and regulation (respectively) and the evidence for a role of the p53 pathway in HPV and HIV biology, we aimed to investigate the association of these SNPs with HPV infection status, HPV oncogenic risk and HIV infection status.

Study Design and Participants
We performed a cross-sectional study, selecting two independent groups of women based on HIV status (determined by medical diagnosis). From May 2010 to May 2011, 250 women (150 HIV-negative and 100 HIV-positive) seeking gynecologic care at the gynecological ambulatory clinic of the Faculty of Medicine of the Federal University of Pelotas (South Brazil) who fulfilled the eligibility criteria (not pregnant, sexually active, and not menstruating) and agreed to participate were sequentially included in the study. Data were collected using an adapted version of a closed questionnaire [38], which was applied by a trained female interviewer. Routine gynecological exams (cervicitis indicators, visual inspection with acetic acid and Lugol's iodine) were performed and included in the questionnaire, as well as the patient's recorded information (last Pap test result).

Ethics Statement
The study was approved by the Ethics Committee of the Faculty of Medicine of Federal University of Pelotas (June 2009). Written informed consent was obtained from all participants. All procedures were performed in accordance with the Helsinki Declaration guidelines.

DNA Collection
Cervical samples were collected with a cytobrush and placed into 1.5 ml microtubes containing 300 µl of Cell Lysis Solution (Puregene™ DNA Extraction Kit, Gentra Systems, Minneapolis, MN). The material was enzymatically digested using 1.5 µl of proteinase K (10 mg/ml, New England Biolabs, MA) and incubated overnight at room temperature. DNA was extracted according to the manufacturer's specifications.

HPV Detection and Genotyping
HPV detection was performed using nested-PCR in two rounds: amplification of a 450 bp fragment using the MY09/11 primer pair [39] and amplification of a 140 bp fragment using the GP5/6 primer pair [40]. MY09/11 and GP5/6 PCRs (final reaction volume of 25 µl) were performed as follows: initial denaturation for 9 min at 95 °C; 40 cycles of denaturation (for 1 min at 95 °C and for 30 s at 94 °C, respectively), primer annealing (for 1 min at 55 °C and 30 s at 45 °C, respectively), and extension (for 1 min and for 30 s, respectively, at 72 °C); and final extension for 5 min at 72 °C [41,42]. PCR amplicons were visualized on 2.0% agarose gels stained with GelRed™ (Biotium Inc., CA). HPV-positive amplicons (from the second nested-PCR round) were purified using the Gel Band purification kit (GE Healthcare, USA) according to the manufacturer's instructions. To determine the HPV oncogenic risk group (i.e., HPV genotype), Sanger sequencing was performed in a MegaBACE 1000 DNA sequencer (GE Healthcare, USA) using Dynamic ET-terminator technology. Chromatograms were assembled and analyzed using the ContigExpress module of the Vector NTI 10.0 suite (Invitrogen, USA). The assembled sequences were submitted to BLAST alignment (www.nci.nlm.gov/BLAST) against sequences available in GenBank.
Both SNPs were genotyped by PCR-RFLP using GoTaq qPCR Master Mix (Promega, USA) (in 12 µl reactions) with primers, restriction enzymes and PCR conditions described previously [43][44][45]. Briefly, for the R72P SNP, the 199 bp amplicon was cleaved using BstUI (New England Biolabs, MA) and loaded on a 2.5% agarose gel stained with GelRed™ (Biotium Inc., CA). Genotypes were called as follows: one fragment of 199 bp corresponds to the P72P genotype; three fragments of 199 bp, 113 bp and 86 bp correspond to the R72P genotype; and two fragments of 113 bp and 86 bp correspond to the R72R genotype. For the T309G SNP, the 157 bp amplicon was cleaved by MspA1I (New England Biolabs, MA) and loaded on a 2.5% agarose gel stained with GelRed™. Genotypes were called as follows: one fragment of 157 bp corresponds to the T309T genotype; three fragments of 157 bp, 109 bp and 48 bp correspond to the T309G genotype; and two fragments of 109 bp and 48 bp correspond to the G309G genotype.
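The band-to-genotype rules above can be written as a small lookup. The following Python sketch is illustrative only: the fragment sizes come from the text, whereas the function name `call_genotype`, the table name `RFLP_PATTERNS` and the `tolerance` parameter (allowing for imprecise band-size readings on a gel) are hypothetical and not part of the original protocol.

```python
# Expected restriction fragments (bp) per genotype, as stated in the text.
RFLP_PATTERNS = {
    "R72P": {  # TP53 R72P: 199 bp amplicon cleaved with BstUI
        frozenset({199}): "P72P",
        frozenset({199, 113, 86}): "R72P",
        frozenset({113, 86}): "R72R",
    },
    "T309G": {  # MDM2 T309G: 157 bp amplicon cleaved with MspA1I
        frozenset({157}): "T309T",
        frozenset({157, 109, 48}): "T309G",
        frozenset({109, 48}): "G309G",
    },
}


def call_genotype(snp, observed_bands, tolerance=5):
    """Return the genotype matching the observed band sizes, or None.

    Each observed band is snapped to the nearest expected fragment size for
    this SNP; a band further than `tolerance` bp from every expected size
    (an assumed, illustrative threshold) makes the call fail.
    """
    patterns = RFLP_PATTERNS[snp]
    known_sizes = set().union(*patterns)
    snapped = set()
    for band in observed_bands:
        nearest = min(known_sizes, key=lambda size: abs(size - band))
        if abs(nearest - band) > tolerance:
            return None
        snapped.add(nearest)
    # Unknown combinations (e.g. an incomplete digest pattern) yield None.
    return patterns.get(frozenset(snapped))
```

For example, `call_genotype("R72P", [200, 112, 85])` resolves to the heterozygous R72P genotype, while a pattern that matches none of the three expected combinations returns `None` and would be flagged for repetition.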

Statistical Analyses
Analyses were performed in R (version 3.0.1, http://www.r-project.org/). Descriptive analyses were stratified according to HIV infection status for both SNPs [including assessing Hardy-Weinberg equilibrium (HWE) by Fisher's exact test using the "genetics" R package: http://cran.r-project.org/web/packages/genetics/], status of HPV infection (positive or negative), HPV oncogenic risk (high or low) and potential confounding variables (i.e., skin color, achieved schooling in years, family income in minimum salaries and age). Skin color was a categorical variable defined by the interviewer's observation. Achieved schooling in years was categorized as illiterate, 1-4, 5-8, 9-11 and 11 or more according to the formal educational system that the recruited women attended [composed of: 8 years of Primary Education, which changes substantially after the first 4 years; Secondary Education, composed of 3 years (in a total of 11 years); and Higher Education]. Age was categorized in groups of (approximately) 5 years (adjusting the age limits of some groups to avoid categories with very few individuals) for a more detailed comparison between HIV groups. Crude comparisons between HIV strata were performed by either chi-squared or Fisher's exact test (using the "gmodels" R package: http://cran.r-project.org/web/packages/gmodels/index.html).
Importantly, any confounding effect of age, education and family income (after adjusting for skin color) on the association between a genetic factor and a given trait is expected to occur by chance (since an individual's genotype is not influenced by environmental factors), which makes it difficult to establish an adequate conceptual framework for selecting confounding variables. Therefore, we used a statistically oriented approach, although skin color was invariably adjusted for. The reason for doing so is the well-known possibility of confounding in SNP-outcome analyses due to population stratification. Since both socioeconomic status (in several populations, including Brazil) and the genotypic frequencies of the vast majority of SNPs vary substantially according to skin color, the latter can create spurious associations between genetic factors and outcomes influenced by socioeconomic factors. This is relevant for this manuscript since socioeconomic factors are known to have profound implications for sexually transmitted diseases.
The confounding variable selection was performed by stepwise backwards selection (critical P = 0.2 according to the likelihood-ratio chi-squared test using the "car" R package: http://cran.r-project.org/web/packages/car/index.html). Since skin color was invariably adjusted for due to conceptual considerations, it was not subjected to removal. This process was performed having, as the dependent variable, each outcome of the study (i.e., HPV status, HPV oncogenic risk and HIV status) and each main exposure (R72P and T309G genotypes, as well as the genotypes combined). These analyses were performed by logistic and multinomial logistic regression (using the "nnet" R package: http://cran.r-project.org/web/packages/nnet/index.html), respectively. The variables that remained in the final models of both the outcome and the genetic exposure of interest were considered confounding factors for the associations involving that exposure-outcome pair.
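To make the HWE check concrete, the sketch below computes a two-sided exact P-value from the three genotype counts, using the standard conditional distribution of the heterozygote count given the allele counts. It is a Python illustration written for this text and is not the code of the "genetics" R package; the function name `hwe_exact_p` is hypothetical, and the two-sided rule used here (summing the probabilities of all heterozygote counts that are no more likely than the observed one) is an assumption about how the test is defined.

```python
from math import exp, lgamma, log


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Two-sided exact Hardy-Weinberg P-value from genotype counts.

    n_aa, n_bb: homozygote counts; n_ab: heterozygote count.
    """
    n = n_aa + n_ab + n_bb   # number of individuals
    n_a = 2 * n_aa + n_ab    # copies of allele A
    n_b = 2 * n_bb + n_ab    # copies of allele B

    def log_prob(het):
        # Probability of `het` heterozygotes given n and the allele counts:
        # 2^het * n! / (hom_a! het! hom_b!) * n_a! n_b! / (2n)!
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * log(2)
                + lgamma(n + 1) - lgamma(hom_a + 1)
                - lgamma(het + 1) - lgamma(hom_b + 1)
                + lgamma(n_a + 1) + lgamma(n_b + 1) - lgamma(2 * n + 1))

    observed = log_prob(n_ab)
    # Valid heterozygote counts share the parity of the allele counts.
    candidates = range(n_a % 2, min(n_a, n_b) + 1, 2)
    # Small slack so that floating-point ties with the observed value count.
    p_value = sum(exp(log_prob(h)) for h in candidates
                  if log_prob(h) <= observed + 1e-9)
    return min(1.0, p_value)
```

A P-value close to 1 indicates genotype counts that match Hardy-Weinberg proportions; a very small value indicates an excess or deficit of heterozygotes.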
Associations involving R72P and T309G SNPs and study outcomes were assessed in crude and adjusted logistic regression [estimating odds ratio (OR)] models. Five genetic models (obtained using the "SNPassoc" R package: http://cran.r-project.org/web/packages/SNPassoc/index.html) were tested: codominant or genotypic (i.e., each genotype is coded as a distinct category), additive (i.e., the SNPs are coded numerically according to the number of variant alleles), overdominant (i.e., homozygous genotypes = 0 and heterozygous genotype = 1), dominant (i.e., homozygous wild = 0; heterozygous and homozygous variant genotype = 1) and recessive (i.e., homozygous wild and heterozygous genotypes = 0 and homozygous variant genotype = 1), with the category corresponding to (or containing) the wild-type homozygous genotype as the reference group (as described elsewhere [46]). For the analyses of epistasis, a total of 13 epistatic models were tested using crude and adjusted logistic regression models. Eleven of them were described elsewhere [46]. Briefly, the following epistatic models were tested: dominant epistasis [both R72P over T309G (reference: R72R T309T; category 1: R72R G309_; category 2: P72_ _309_) and vice-versa (reference: R72R T309T;   . This visual method was described elsewhere [46] and is aimed at identifying patterns that may indicate an epistatic relationship not reflected in the other models. For the association analyses, P<0.05 was considered statistically significant and P<0.10 (but ≥0.05) was considered of marginal significance. Given our limited sample size and the practical/logistic impossibility of increasing it, power analyses were performed to estimate the statistical power of this study for different OR values. These analyses were performed by simulations (see Methods S1 for details).
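The genetic model codings described above can be illustrated with a short Python sketch (the analyses themselves were run in R with the "SNPassoc" package; this is not that package's code). Genotypes are represented by the number of variant alleles (0, 1 or 2), and the function names `code_genotype` and `dominant_epistasis_r72p_over_t309g` are hypothetical. The epistasis function implements only the one model whose categories are fully given in the text (R72P dominant over T309G).

```python
def code_genotype(variant_alleles, model):
    """Code a genotype (0, 1 or 2 variant alleles) under a genetic model.

    The wild-type homozygote is always the reference (coded 0 or "hom_wild").
    """
    if variant_alleles not in (0, 1, 2):
        raise ValueError("variant_alleles must be 0, 1 or 2")
    if model == "codominant":
        # Each genotype is a distinct category.
        return ("hom_wild", "het", "hom_variant")[variant_alleles]
    if model == "additive":
        return variant_alleles
    if model == "overdominant":
        return int(variant_alleles == 1)
    if model == "dominant":
        return int(variant_alleles >= 1)
    if model == "recessive":
        return int(variant_alleles == 2)
    raise ValueError("unknown model: " + model)


def dominant_epistasis_r72p_over_t309g(p72_alleles, g309_alleles):
    """Dominant epistasis with R72P over T309G, as categorized in the text.

    0: R72R T309T (reference); 1: R72R G309_; 2: P72_ _309_.
    """
    if p72_alleles >= 1:
        return 2  # any P72 allele masks the T309G genotype
    if g309_alleles >= 1:
        return 1
    return 0
```

Under this coding, for instance, a heterozygote is exposed (1) in the overdominant and dominant models but unexposed (0) in the recessive model, which is why the recessive model has the lowest exposure prevalence.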

Sample Description
The characteristics of the sample (stratified according to HIV infection status) are shown in table 1. There were statistically significant differences between HIV-positive and HIV-negative women regarding the two HPV-related outcomes, age, schooling (P<0.001) and skin color (P = 0.005). Family income did not significantly differ between HIV strata (P = 0.389). In this initial analysis, no significant differences were observed between HIV strata regarding the genotypic frequencies of R72P (P = 0.200) and T309G (P = 0.543) SNPs. Moreover, there was no evidence for departures from HWE regarding R72P and T309G SNPs in the total sample (P>0.999 and P = 0.102, respectively) and within skin color strata (white: P = 0.642 and P = 0.474; black: P>0.999 and P>0.999; brown: P = 0.797 and P = 0.331).

Associations of R72P and T309G SNPs with HPV Outcomes and HIV Status
For R72P SNP, only skin color was considered a covariate for all outcomes. For T309G SNP, in addition to skin color, age was also considered a covariate for HPV status and HIV status (tables S1 and S2). Crude and adjusted analyses for the associations of   Although marginal, these associations lead to the speculation that HIV status could be an (at least partial) effect mediator of the associations between T309G SNP and HPV oncogenic risk. In this regard, the analysis of association between this SNP and HPV oncogenic risk was repeated with the inclusion of HIV status as a covariate (table S3). By doing so, only the overdominant model remained statistically significant, although attenuated [OR (95% CI), 0.38 (0.14-0.96); P = 0.040]. The same was performed for the analysis of association between R72P SNP and HPV status, but no substantial differences were observed [overdominant model: OR (95% CI), 0.60 (0.38-0.96); P = 0.032], further supporting an effect mediation role of HIV status in the associations between T309G SNP and HPV oncogenic risk.
The epistatic models were numbered as described previously [46]. In addition to skin color, age was considered a covariate for HPV status and HIV status (table S4). Crude and adjusted analyses for the associations of epistatic models and study outcomes are shown in tables 5-7. No associations were observed for HPV status (P≥0.192 and P≥0.129, respectively). Regarding HPV oncogenic risk, a significant association was observed for dominant epistasis with R72P SNP overcoming the effects of T309G SNP  
Since there were significant associations of different epistatic models with HIV status, a visual inspection of OR (with 95% CI) resulting from the adjusted analyses between genotypic combinations of the two SNPs and HIV was performed to identify patterns and, possibly, elaborate further epistatic models.
Larger (and similar to one another) OR values were observed for the genotypic combinations R72R G309G and P72P T309T than for the rest (figure 1). This pattern indicates that a different epistatic effect may be underlying the observed associations. Therefore, an additional model was elaborated to reflect the following epistatic relationship: a double dominant effect that both affects disease risk and blocks a "hidden" additional recessive effect, which is manifested in the presence of a homozygous-wild homozygous-variant genotypic combination (which could be called double dominant epistasis with blocked recessive effects). Such a model was tested in categorical [assuming distinct effects for the following genotypic combinations: R72R T309T (reference); R72P _309G_, _72_

Discussion
We investigated, for the first time in the epidemiological setting, the evidence linking p53 and MDM2 with HIV, as well as potential implications of SNPs in the p53 pathway for HPV-related outcomes other than cancer development and/or progression. Analyses of epistatic models provided evidence for a novel mechanism linking the p53 pathway with HIV infection status. Interestingly, we observed large effect sizes for the epistatic models (achieving notably low P-values in a relatively small sample). This contrasts with a recent GWAS involving 6300 cases and 7200 controls that tested approximately 8 million common variants, which suggested that host genetic influences on HIV acquisition are either rare or have very small effects (not detectable given the study power) [20]. In our study, all genotypic combinations of the epistatic model 9.  [15,20], our effect sizes are larger than all of the OR of statistically significant SNPs reported in these studies, with one exception (OR, 9.51) that was observed in an exploratory phase of the analyses but was markedly decreased in the expanded and external validation analyses (2.67 and 1.69, respectively) [18]. This observation reinforces the importance of analyzing epistasis even in genome-wide-scale studies [47,48].
An important consideration is that we found relatively weak associations of epistatic models with HPV-related outcomes (especially for HPV status), which would not be expected given their strong associations with HIV status. While very strong associations were found between some epistatic models and HIV status, there were no associations of models 9.1 and 9.2 with HPV status and HPV oncogenic risk (data not shown). It is conceptually difficult to conceive of a factor causally involved in HIV status but not in HPV status when HIV status is not adjusted for in the analysis (unless HIV status does not have a causal effect on HPV status). However, this issue does not invalidate our findings, since they are unlikely to be caused by confounding (discussed below) or chance alone (the P-values for models 9.1 and 9.2 were very low).
An important limitation of our study was the sample size (which could not be increased for practical/logistic reasons). Power analyses showed that, as expected, power was generally low (<0.80) when HPV oncogenic risk was the dependent variable, for single-SNP and epistasis analyses alike (due to the reduced sample size). For the former (table S6), associations involving OR<2 and those performed under the recessive model (for which the exposure prevalence is smaller than for other genetic effects) were also generally underpowered. For the latter (table S7), power was also reduced for associations involving OR<3 and for several models when HIV status was the dependent variable. These results, in addition to the partial redundancy of some genetic/epistatic models, are reflected in the several marginal associations observed. Although these observations illustrate that our analyses were underpowered under some circumstances, they support the causality of our associations, since the effect sizes were large enough to achieve statistical significance even in a small sample. This is particularly illustrated by the epistatic model 9.1: while power analyses indicate low power when HIV status was the dependent variable, very strong associations were observed due to the large OR values.
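The general idea of a simulation-based power analysis can be sketched as follows. This is not the procedure of Methods S1, which is not reproduced here; it is a simplified Python illustration under stated assumptions: a single binary exposure, a fixed number of cases and controls, and a Wald test on the log odds ratio of the resulting 2x2 table (with a 0.5 continuity correction) in place of logistic regression. The function name `simulate_power` and all parameter names and defaults are hypothetical.

```python
import math
import random


def simulate_power(odds_ratio, exposure_prev, n_cases, n_controls,
                   n_sims=2000, seed=1):
    """Estimate power as the fraction of simulated studies with P < 0.05.

    exposure_prev: assumed prevalence of the exposure among controls.
    """
    rng = random.Random(seed)
    # Exposure prevalence among cases implied by the target odds ratio.
    control_odds = exposure_prev / (1 - exposure_prev)
    case_odds = odds_ratio * control_odds
    case_prev = case_odds / (1 + case_odds)
    z_crit = 1.959964  # two-sided 5% critical value of the standard normal
    significant = 0
    for _ in range(n_sims):
        exposed_cases = sum(rng.random() < case_prev
                            for _ in range(n_cases))
        exposed_controls = sum(rng.random() < exposure_prev
                               for _ in range(n_controls))
        # 2x2 table cells, each with 0.5 added to avoid division by zero.
        a = exposed_cases + 0.5
        b = n_cases - exposed_cases + 0.5
        c = exposed_controls + 0.5
        d = n_controls - exposed_controls + 0.5
        log_or = math.log(a * d / (b * c))
        std_err = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
        if abs(log_or / std_err) > z_crit:
            significant += 1
    return significant / n_sims
```

Calling, for example, `simulate_power(2.0, 0.3, 100, 150)` gives the estimated power to detect OR = 2 with 100 cases and 150 controls for an exposure present in 30% of controls (an arbitrary illustrative prevalence). Lowering `exposure_prev`, as happens under a recessive coding, reduces the estimated power for the same OR.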
Another important consideration is that the study design was not optimal for causal inference. This is particularly relevant considering the hypothesis that the SNPs (in combination) might influence virus establishment/persistence after exposure, which would be a rather dynamic mechanism. However, it is well known that associations involving germ-line genetic markers as independent variables are not subject to reverse causation and are generally robust against confounding, as reviewed in the context of Mendelian randomization [49]. To further reduce the possibility of residual confounding caused by population stratification, the analyses were adjusted for skin color and additional covariates. Nonetheless, the optimal design would be a prospective study to validate our findings and understand their underlying mechanisms. In addition, the results for models 9.1 and 9.2 were very similar, so additional studies are required to validate not only the associations, but also the actual epistatic model. Another issue is the true relationship between the genetic markers and the study outcomes: it cannot be determined from our study whether the SNPs are truly causal or are in linkage disequilibrium with the causal variants. Nevertheless, the value of our findings regarding genetic predisposition to HIV persistence after viral exposure is independent of this issue.
In summary, our results provided evidence for a role of the p53 pathway (involving R72P and T309G SNPs) in HPV status and HPV oncogenic risk, and strong associations were found for an epistatic effect on HIV status. Our results require validation in prospective cohort studies using larger samples, both to confirm the associations and to identify the best epistatic model. Applications of our findings are related to genetic testing for HIV susceptibility and to the understanding of HIV susceptibility differences among populations. Furthermore, findings of this nature can be explored in laboratory studies aiming at either elucidating HIV pathogenesis or developing new therapeutic/preventive strategies.

Supporting Information
Table S1 Likelihood-ratio chi-squared test P-values of the selection of confounders based on association with each outcome. * Skin color was included as a covariate regardless of meeting the selection criteria. (DOCX)
) and P-values], including HIV status as a covariate, between R72P SNP and HPV status and between T309G SNP and HPV oncogenic risk. * "A" and "a" correspond to wild-type (i.e., either R72 or T309) and variant alleles (i.e., either P72 or G309), respectively. (DOCX)
Table S4 Likelihood-ratio chi-squared test P-values of the selection of confounders based on association with combined genotypes. * Skin color was included as a covariate regardless of meeting the selection criteria. † Also associated with HPV and HIV status. (DOCX)
Table S5 Adjusted associations (including HIV status as a covariate) between 11 epistatic models and HPV oncogenic risk. * The epistatic models were numbered as described previously [46]. † The "_" indicates that the effect is irrespective of the allele. E.g., R72P G309_ represents the genotypic combinations R72P T309G -R72P G309G. ¥ Other: