Risk of Ovarian Cancer and Inherited Variants in Relapse-Associated Genes

  • Abraham Peedicayil, 
  • Robert A. Vierkant, 
  • Lynn C. Hartmann, 
  • Brooke L. Fridley, 
  • Zachary S. Fredericksen, 
  • Kristin L. White, 
  • Elaine A. Elliott, 
  • Catherine M. Phelan, 
  • Ya-Yu Tsai, 
  • Andrew Berchuck

Background



We previously identified a panel of genes associated with outcome of ovarian cancer. The purpose of the current study was to assess whether variants in these genes correlated with ovarian cancer risk.

Methods and Findings

Women with and without invasive ovarian cancer (749 cases, 1,041 controls) were genotyped at 136 single nucleotide polymorphisms (SNPs) within 13 candidate genes. Risk was estimated for each SNP and for overall variation within each gene. At the gene level, variation within MSL1 (male-specific lethal-1 homolog) was associated with risk of serous cancer (p = 0.03); haplotypes within PRPF31 (PRP31 pre-mRNA processing factor 31 homolog) were associated with risk of invasive disease (p = 0.03). MSL1 rs7211770 was associated with decreased risk of serous disease (OR 0.81, 95% CI 0.66–0.98; p = 0.03). SNPs in MFSD7, BTN3A3, ZNF200, PTPRS, and CC2D1A were inversely associated with risk (p<0.05), and there was increased risk at HEXIM1 rs1053578 (p = 0.04, OR 1.40, 95% CI 1.02–1.91).

Conclusions


Tumor studies can reveal novel genes worthy of follow-up for cancer susceptibility. Here, we found that inherited markers in the gene encoding MSL1, part of a complex that modifies the histone H4, may decrease risk of invasive serous ovarian cancer.

Introduction


Worldwide, there are approximately 125,000 deaths each year due to ovarian cancer [1]; increased understanding of factors related to its outcome and etiology should reduce the burden of this disease. We previously reported results of tumor mRNA expression studies which suggested that altered expression of a particular set of genes predicted response to chemotherapy among women with advanced-stage high-grade epithelial ovarian cancer [2]. These genes included SF3A3, MFSD7 (formerly known as FLJ22269), ID4, BTN3A3, OSGIN2 (formerly known as C8orf1), FARP1, PRKCH, C15orf15, ZNF200, MSL1 (formerly known as LOC339287), HEXIM1 (formerly known as HIS1), PTPRS, CC2D1A (formerly known as FLJ20241), and PRPF31. Expression levels differed among tumors from women with differing outcomes; namely, in combination, expression of these genes predicted early relapse (<21 months) after optimal surgery and platinum-paclitaxel chemotherapy with an accuracy of 86% and positive predictive value of 95% [3], [2].

The etiology of ovarian cancer is known to be complex and, at least in part, includes inherited susceptibility factors. Mutations in BRCA1, BRCA2, MLH1, and MSH2 account for approximately 50% of familial ovarian cancer [4], [5], and remaining cases with a family history are likely due to combinations of multiple alleles conferring low to moderate penetrant susceptibility [6], [7] such as variants in BNC2 [8] and, possibly, TP53 [9], CDKN2A [10], CDKN1B [10], and AURKA [11]. As a complement to genome-wide searches, a useful approach for the identification of additional low-risk alleles is the study of highly-informative inherited variants in candidate genes identified from tumor expression studies. To assess whether variation in genes with differing expression levels by outcome influenced risk of ovarian cancer, we conducted a case-control analysis of inherited variants in genes in the predictive model mentioned above [2] as well as in HTRA1 (encoding the serine protease HtrA1) which we have shown is down-regulated in a majority of ovarian tumors [12], [13] and has a key role in apoptosis [13]. As the first examination of germline variation in this novel set of genes (Table 1), we aimed to more broadly elucidate their role in epithelial ovarian carcinogenic processes.

Results


Demographic, reproductive, lifestyle, and tumor characteristics of 749 epithelial invasive ovarian cancer patients and 1,041 controls are described in Table 2. Generally, the expected distributions of risk factors were observed; a greater proportion of patients than controls had never used oral contraceptives (p<0.001), had used hormone therapy (p<0.001), were nulliparous (p = 0.01), and had a first or second degree family history of ovarian cancer (p<0.001). Such factors associated with risk of ovarian cancer were included as covariates in all genetic analyses.

Gene-level and selected SNP-level association-testing results are shown in Table 3 and Table 4, respectively. Associations for the full set of SNPs examined are displayed in Table S1. Of genes examined in relation to risk of invasive ovarian cancer and risk of invasive serous ovarian cancer, only global variation in the MSL1 and PRPF31 genes was associated at p<0.05. MSL1 gene-level principal components (summarized combinations of genotypes based on two SNPs) were associated with risk of invasive serous disease (p = 0.03). At one of the two SNPs in this gene, rs7211440 (r2 = 0.53 with rs17678694; Figure S1), carriage of the minor allele was associated with reduced risk of both invasive and invasive serous disease (OR 0.85, 95% CI 0.72–1.00, p = 0.05; OR 0.81, 95% CI 0.66–0.98, p = 0.03, respectively, Table 4), suggesting this SNP as the primary driver of MSL1's gene-level serous association.
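As a reading aid, a reported p-value can be checked approximately against its odds ratio and confidence interval, because a Wald interval is symmetric around the estimate on the log scale. The Python sketch below is not the authors' code (their analyses were run in SAS); the function name `wald_p_from_ci` is hypothetical, and it assumes a normal approximation applied to the rounded values printed in the text, so agreement with the published p-values is only approximate.

```python
import math

def wald_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided Wald p-value from an OR and its 95% CI."""
    # Standard error of log(OR), recovered from the CI width on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))

# MSL1 minor-allele association with invasive serous disease, as reported above.
p_serous = wald_p_from_ci(0.81, 0.66, 0.98)
print(f"approximate p = {p_serous:.3f}")
```

For the serous MSL1 estimate this gives a value slightly below 0.04, which is in line with the reported p = 0.03 once rounding of the interval limits is taken into account. The same check flags any interval that does not contain its own point estimate.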

PRPF31 haplotypes (consecutive series of alleles based on eight SNPs) were associated with risk of invasive disease (p = 0.03). As haplotype analysis can reveal hidden associations, we more closely examined the twenty-six common haplotypes within PRPF31 (Table S2). Compared to the most common haplotype 11111111 (major alleles at all PRPF31 SNPs), the haplotype 10101111 (minor alleles at intronic tagging single nucleotide polymorphisms (tagSNPs) rs12985735 and rs254272) conferred reduced risk of invasive ovarian cancer (OR 0.22, 95% CI 0.06–0.82, p = 0.02). These two SNPs were only modestly correlated (r2 = 0.23; Figure S1), and neither was independently associated with risk (rs12985735 per allele OR 1.13, 95% CI 0.98–1.30, p = 0.10; rs254272 per allele OR 1.05, 95% CI 0.89–1.25, p = 0.56), suggesting that an ungenotyped variant in close proximity to the PRPF31 haplotype may contribute to the association. As two other PRPF31 haplotypes were associated with ovarian cancer risk at p<0.10 (00111101 OR 2.81, 95% CI 0.85–9.22, p = 0.09; 10101101 OR 2.86, 95% CI 0.94–8.71, p = 0.06), other variants in the gene may also contribute to the observed global haplotype association.

Outside of the two gene-level associated genes (MSL1 and PRPF31), a small number of SNPs in six other genes were associated at the single-SNP level (p<0.05, Table 4), including three SNPs associated with reduced risk of both invasive and invasive serous disease (in CC2D1A, MFSD7, and ZNF200). The strongest SNP-level association was observed at the non-synonymous rs2305777 (T801M) in the NF-κB activating gene CC2D1A (invasive OR 0.84, 95% CI 0.72–0.99, p = 0.03; serous invasive OR 0.76, 95% CI 0.62–0.99, p = 0.005). Based on sequence conservation across species [14], this SNP is predicted to be relatively undamaging to protein function. Because this SNP was not correlated with other genotyped SNPs (r2<0.12, Figure S1) and did not tag other common HapMap SNPs at r2>0.9, sequencing may be required to clarify the meaning of this SNP association. Similarly, MFSD7 rs6840253 was associated with a reduction of ovarian cancer risk (invasive OR 0.81, 95% CI 0.66–1.0, p = 0.05; serous invasive OR 0.76, 95% CI 0.58–0.98, p = 0.03), as was ZNF200 rs186493 (invasive OR 0.83, 95% CI 0.71–0.97, p = 0.02; invasive serous OR 0.82, 95% CI 0.68–0.98, p = 0.03). Both are promoter region SNPs (within 5 kb 5′ upstream) and independent of other HapMap SNPs at r2>0.9. Because each is correlated at r2>0.6 with other genotyped SNPs, additional genotyping of modestly correlated SNPs could help elucidate these associations.

Three SNPs were associated only with reduced risk of invasive serous disease (in PTPRS and BTN3A3), and one SNP was associated with increased risk of invasive serous disease (in HEXIM1). Thus, while results were generally similar in both case groups, the often greater statistical significance of results among women with serous disease, despite reduced power due to a 40% smaller sample size, suggests that subtype analysis revealed heterogeneity by histology.

SNPs that were suggestive only in serous disease included two highly-correlated PTPRS intron 9 SNPs (r2 = 0.99; rs886936 OR 0.85, 95% CI 0.72–1.00, p = 0.05; rs11878779 OR 0.79, 95% CI 0.66–0.95, p = 0.01). Interestingly, these and three other highly-correlated intron 9 SNPs (Figure S1) were independent of each other in HapMap at r2<0.9, serving as an example of non-transferability of linkage disequilibrium (LD) across populations. The only SNP with minor alleles associated with increased risk was the potential HEXIM1 promoter region SNP rs1053578 (serous OR 1.40, 95% CI 1.02–1.91, p = 0.04), which was independent of other genotyped SNPs at r2<0.04 and of other HapMap SNPs at r2<0.9. BTN3A3's putative promoter region SNP rs12206812 was associated with reduced risk of serous invasive disease (OR 0.63, 95% CI 0.44–0.90, p = 0.01); however, we suspect genotype or genetic map error because of failure in WGA DNA and independence from all other genotyped BTN3A3 SNPs (r2 = 0; Figure S1). No gene-level or SNP-level associations at p<0.05 were observed for SF3A3, ID4, OSGIN2, HTRA1, or C15orf15.

Discussion


Knowledge about the genetics of ovarian cancer is in a rapid state of expansion. Because ovarian cancer is the most lethal gynecologic cancer, discovery of inherited factors related to its etiology and outcome may assist in the development of important targeted prediction and therapeutic strategies. Recent work has clarified the roles of long-standing candidate SNPs in the progesterone receptor, retinoblastoma, p53, and cell cycle genes [10], and enabled genome-wide association studies [8]. Analysis of highly-informative variants in selected sets of novel genes complements these candidate SNP and genome-wide association studies, providing improved coverage in high-priority regions based on known tumor biology [16]. Here, we selected novel candidate genes based on prior evidence of their association with time to relapse of ovarian cancer, and we chose comprehensive sets of variants [2], [13]. Our primary result is that variation in MSL1 was related to risk of serous invasive ovarian cancer; notably, minor alleles at rs7211440 correlated with decreased risk (OR 0.81, 95% CI 0.66–0.98, p = 0.03). Previously, increased expression of MSL1 was correlated with earlier time to relapse [2]; additional validation of the prognostic model and replication of the etiologic association are warranted.

MSL1 encodes one of five proteins that form the highly-conserved MSL complex with enzymatic capabilities as a histone acetyltransferase (HAT) [17]. HATs modify a variety of histone domains through acetylation, which, along with other coactivators, regulates histone and chromatin activation and influences gene expression [18]. The MSL complex specifically acetylates lysine residue 16 on histone H4 (H4-Lys16), which plays a crucial role in regulating chromatin folding and silencing of gene expression [19], [20], [17]. Knock-out models indicate that absence of the MSL complex leads to malfunctions during the S phase of the cell cycle, leading to errors in DNA replication [17]. Additionally, loss of monoacetylation of H4-Lys16 and aberrant functioning on H4 are hallmarks of cancer cells. Our data suggest that inherited variation in MSL1 may impact risk of invasive serous ovarian cancer and are consistent with findings that irregular H4 modifications may cause errors in chromatin folding and gene expression and are widespread in cancer phenotypes [21]. With suggestive SNPs in genes for p53 and CDKN2A [10], [9], which also regulate histone modification, evidence for a role of inherited risk factors related to histones is accumulating.

Strengths of this work include the use of two case-control study populations (from Mayo Clinic and Duke University), advanced SNP selection techniques (e.g., high level of required correlation among alleles, inclusion of putative-functional SNPs, and selection of multiple tagSNPs in large LD bins), and excellent genotyping quality. Our assessment of risk for serous invasive disease suggested a degree of genetic heterogeneity by histologic subtype; however, we suggest caution in interpretation of results (particularly single-SNP results in the absence of gene-level significance) due to the relatively large number of tests performed. We also note that no results are statistically significant after adjustment for multiple testing using a conservative Bonferroni correction. Thus, replication of our results is warranted to confirm these associations. Avenues for future research extending from this expression-based candidate gene work include analysis of other histone regulatory genes and detailed assessment of functional mechanisms. More immediately, examination of promising SNPs within MSL1 among a larger set of serous invasive patients and controls and additional fine-scale mapping [16] will assist in clarification of the importance of these genes to ovarian cancer susceptibility. This should, in turn, inform translation of such findings to the clinical management of women at increased risk for ovarian cancer.
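The Bonferroni statement above can be illustrated with a short calculation. The Python sketch below assumes that the number of tests equals the 136 genotyped SNPs, which reproduces the threshold of 3.7×10−4 quoted in the Statistical Methods; the dictionary of p-values simply restates nominal results from the text, and the variable names are illustrative only.

```python
N_SNPS = 136   # SNPs genotyped across the 13 candidate genes
ALPHA = 0.05   # nominal two-sided significance level

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

threshold = bonferroni_threshold(ALPHA, N_SNPS)

# Nominal p-values restated from the Results (not a complete list).
reported = {
    "MSL1 gene-level (serous)": 0.03,
    "PRPF31 haplotypes (invasive)": 0.03,
    "HEXIM1 rs1053578 (serous)": 0.04,
    "CC2D1A rs2305777 (serous)": 0.005,
}
survivors = [name for name, p in reported.items() if p <= threshold]

print(f"threshold = {threshold:.2e}")
print(f"results passing the corrected threshold: {survivors}")
```

Under this assumption even the smallest nominal p-value (0.005) is more than ten times larger than the corrected threshold, which is why replication is emphasized.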

Materials and Methods

Study Participants

Subjects participated in two ongoing case-control studies of epithelial ovarian cancer initiated in January 2000 at Mayo Clinic (Rochester, MN) and in May 1999 at Duke University (Durham, NC). The study design has been described in detail elsewhere [22]–[24]. Briefly, a total of 749 women with histologically-confirmed invasive epithelial ovarian cancer and 1,041 controls without ovarian cancer and without bilateral oophorectomy were recruited from the two study sites (Table 2; site-specific characteristics provided in Table S3). At Mayo Clinic, ovarian cancer cases (patients) were over 20 years of age with histologically confirmed incident epithelial ovarian cancer and enrolled in the study within one year after diagnosis. All cases seen in the gynecologic or medical oncology units who lived in the six-state region that defines the primary service population of Mayo Clinic (Minnesota, Iowa, Wisconsin, Illinois, North Dakota, and South Dakota) were invited to participate. Controls were recruited from among women seen for general medical examinations and frequency-matched to patients on age and region of residence (i.e., state, county). At Duke University, patients were women with histologically confirmed primary epithelial ovarian cancer, between 20 and 74 years of age, and identified within a 48-county region using the North Carolina Central Cancer Registry. Controls were identified using list-assisted random digit dialing and frequency-matched to patients on race, age, and county of residence. No exclusions based on ethnicity were made. Participants provided written informed consent, and protocols were approved by the Mayo Clinic and Duke University Institutional Review Boards.

Data and Biospecimen Collection

Information on potential risk factors was collected through in-person interviews at both sites using similar questionnaires. DNA was extracted from 10 to 15 mL fresh venous blood using the Gentra AutoPure LS Puregene salting out methodology (Gentra, Minneapolis, MN). DNA from Duke University participants was transferred to Mayo Clinic for whole-genome amplification (WGA) with the REPLI-G protocol (Qiagen Inc, Valencia CA), which we have shown yields highly-reproducible results [25]. DNA concentrations were adjusted to 50 ng/µl prior to genotyping and verified using the PicoGreen dsDNA Quantitation kit (Molecular Probes, Inc., Eugene OR). Samples were bar-coded to ensure accurate and reliable processing.

SNP Selection

We identified tagSNPs within five kb of each candidate gene using the algorithm of ldSelect [26] to bin pair-wise correlated SNPs at r2≥0.90 with minor allele frequency (MAF) ≥0.05 in the HapMap CEU population (Utah residents with ancestry from northern and western Europe) [27]. HapMap data were used because, in November 2007, they were more informative for these genes than data from Perlegen Sciences [28], Seattle SNPs, and NIEHS SNPs. One tagSNP per bin was selected if fewer than ten SNPs were in an LD bin, and two tagSNPs per bin were selected in LD bins with ten or more SNPs. Among tagSNPs, SNPs were chosen to maximize Illumina SNP score (a measure of predicted genotyping success) and then MAF. FARP1 (307 kb) and PRKCH (229 kb) required over 100 tagSNPs each and were excluded from the study for cost-efficiency. For the remaining genes (Table 1), 117 tagSNPs and 19 putative-functional SNPs (within 10 kb upstream, 5′ UTR, 3′ UTR, or non-synonymous from Ensembl version 34 with European-American MAF≥0.05 and Illumina SNP score≥0.6) were selected. Thus, a total of 136 SNPs in these 13 candidate genes were genotyped.
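The binning step described above can be sketched in a few lines of Python. This is a simplified greedy illustration in the spirit of ldSelect [26], not the ldSelect program itself: it only forms bins and does not determine which SNPs within a bin qualify as tags. The function names, SNP labels, and r2 values are hypothetical.

```python
def greedy_ld_bins(snps, r2, threshold=0.90):
    """Group SNPs into LD bins.

    snps: list of SNP identifiers (already filtered on MAF).
    r2:   dict mapping frozenset({snp_a, snp_b}) to pairwise r^2;
          missing pairs are treated as uncorrelated.
    """
    remaining = set(snps)
    bins = []

    def partners(snp):
        # Unbinned SNPs correlated with `snp` at or above the threshold.
        return {other for other in remaining
                if other != snp
                and r2.get(frozenset((snp, other)), 0.0) >= threshold}

    while remaining:
        # Seed the next bin with the SNP that has the most partners
        # (ties broken alphabetically so the result is deterministic).
        seed = max(sorted(remaining), key=lambda s: len(partners(s)))
        members = {seed} | partners(seed)
        bins.append(sorted(members))
        remaining -= members
    return bins

def tags_per_bin(bin_size):
    """Selection rule from the text: two tagSNPs for bins of ten or more."""
    return 2 if bin_size >= 10 else 1

# Hypothetical example data.
example_snps = ["snpA", "snpB", "snpC", "snpD"]
example_r2 = {
    frozenset(("snpA", "snpB")): 0.95,
    frozenset(("snpA", "snpC")): 0.92,
    frozenset(("snpB", "snpC")): 0.85,
    frozenset(("snpA", "snpD")): 0.10,
}
example_bins = greedy_ld_bins(example_snps, example_r2)
for ld_bin in example_bins:
    print(ld_bin, "tagSNPs to select:", tags_per_bin(len(ld_bin)))
```

In this toy example snpA is correlated with both snpB and snpC at r2≥0.90 and therefore seeds a three-SNP bin, while snpD forms a singleton bin.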


Genotyping

As part of a larger study, genotyping of 2,176 DNA samples (897 genomic and 1,279 WGA, including 129 duplicates) from 2,047 unique study participants was performed at Mayo Clinic along with 65 laboratory controls. We used the Illumina GoldenGate BeadArray assay and BeadStudio software for automated clustering and calling according to a standard protocol [29]. Of 2,047 participants genotyped, 44 samples (2.1%) failed (call rate <90%), and 213 participants (10.4%) were found to be ineligible or have borderline disease and were excluded; thus 1,790 participants (including 749 patients with invasive disease and 1,041 controls) were analyzed here. A total of 1,152 SNPs for a variety of projects were attempted; 25 failed SNPs included 15 (1.3%) with call rate <90%, nine (0.8%) with poor clustering, and one (<0.1%) with unresolved replicate or Mendelian errors in genomic DNA. We assessed departures from Hardy-Weinberg equilibrium (HWE) in self-reported white, non-Hispanic controls with a Pearson goodness-of-fit test or, in the case of SNPs with a MAF <5%, a Fisher exact test, and we excluded SNPs with MAF <0.01 (N = 64, 5.6%) or HWE p-value<0.0001 (N = 11, 1.0%), leaving 1,052 SNPs for analysis. For WGA DNA, an additional 20 SNPs (1.7%) were excluded due to one or more of the above criteria. Among the 13 candidate genes studied, 134 of 136 SNPs were successfully genotyped in genomic samples, and all but six of these were also genotyped successfully in WGA samples (Table S4).
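The Pearson goodness-of-fit test for HWE mentioned above can be illustrated as follows. This Python sketch is not the authors' implementation; the function name and genotype counts are hypothetical, it uses a 1-degree-of-freedom chi-square reference distribution, and it does not cover the Fisher exact test used for SNPs with MAF <5%.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson goodness-of-fit test for Hardy-Weinberg equilibrium.

    Arguments are counts of the three genotypes (AA, AB, BB).
    Returns the chi-square statistic and its p-value on 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

HWE_EXCLUSION_P = 0.0001  # exclusion threshold stated in the text

# Hypothetical control genotype counts.
chi2_ok, p_ok = hwe_chi_square(360, 480, 160)     # matches HWE expectations
chi2_bad, p_bad = hwe_chi_square(400, 200, 400)   # strong heterozygote deficit

for label, p_val in (("balanced SNP", p_ok), ("deficit SNP", p_bad)):
    status = "exclude" if p_val < HWE_EXCLUSION_P else "keep"
    print(f"{label}: HWE p = {p_val:.3g} -> {status}")
```

A heterozygote deficit of the kind shown in the second example is a typical signature of genotype-calling problems, which is why a stringent HWE filter is applied to controls before association testing.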

Statistical Methods

Distributions of demographic and clinical variables were compared between patients and controls using chi-square tests or t-tests, and estimates of pair-wise LD between SNPs were obtained using Haploview software, version 4.1 [30]. Genetic association analyses (described below) were adjusted for study site, age, body mass index, hormone therapy, oral contraceptive use, number of live births, age at first live birth, and population structure principal components which accounted for the possibility of population stratification using an approach similar to that described previously [31]. Population structure principal components were created using 2,517 SNPs from this and prior genotyping panels [23]; scatter plot matrices by self-reported race indicated that the first four population structure principal components reasonably approximated racial differences across individuals and were thus included as covariates in all models (Figure S2).

Associations with ovarian cancer risk were assessed using logistic regression of SNPs, gene-level principal components, and gene-level haplotypes. For SNPs, odds ratios (OR) and 95% confidence intervals (CI) were estimated separately for heterozygous and homozygous minor allele genotypes, using the homozygous major allele genotype as the referent group. We included eight SNPs with HWE p-value<0.05 (Table S4) because of acceptable genotype cluster plots, the large number of tests (Bonferroni corrected p-value≤3.7×10⁻⁴), and no assumption of HWE for single-SNP analysis. Formal genotypic tests of association were carried out assuming an ordinal (log-additive) effect using simple tests for trend. Within each gene, we used a principal component analysis to create orthogonal linear combinations of the SNP minor allele count variables (including genotypes imputed using the MACH software package [32]) to provide an alternate and equivalent representation of the collection of SNPs as a whole. The resulting smallest subset of gene-level principal components that accounted for at least 90% of the SNP variability was included in regression models, and gene-specific associations were evaluated using a multiple degree of freedom likelihood ratio test. Gene-centric haplotype-based association analyses were conducted using posterior probabilities of all possible haplotypes for an individual (excluding SNPs with HWE p-value<0.05), conditional on the observed genotypes. The expectation-maximization algorithm was used to estimate haplotypes [33] and create haplotype design variables ranging from 0 to 2. Because of the imprecision involved in low-frequency haplotypes, we excluded haplotypes with an estimated frequency of less than ten. Assessments of risk among common haplotypes tested the simultaneous effects of all haplotypes combined in logistic regression; individual haplotype associations used the most common haplotype as reference.
All statistical tests were two-sided, and unless otherwise indicated, were carried out using SAS software (SAS Institute, Inc., Cary, NC).
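The log-additive (per-allele) model referred to above can be sketched as an unadjusted logistic regression of case status on minor-allele count. The following Python example is illustrative only: the study's models were fit in SAS and included covariates such as study site, age, and population structure principal components, none of which appear here. The function name and genotype counts are hypothetical.

```python
import math

def fit_log_additive(cases, controls, max_iter=50, tol=1e-10):
    """Fit logit P(case) = a + b*g by Newton-Raphson, g = minor-allele count.

    cases, controls: genotype counts for g = 0, 1, 2.
    Returns the per-allele OR, its 95% Wald CI, and a two-sided Wald p-value.
    """
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        u_a = u_b = i_aa = i_ab = i_bb = 0.0
        for g in range(3):
            n = cases[g] + controls[g]
            p = 1.0 / (1.0 + math.exp(-(a + b * g)))
            resid = cases[g] - n * p      # observed minus expected cases
            w = n * p * (1.0 - p)         # binomial variance weight
            u_a += resid
            u_b += g * resid
            i_aa += w
            i_ab += g * w
            i_bb += g * g * w
        det = i_aa * i_bb - i_ab * i_ab
        # Newton step: inverse information matrix times score vector.
        step_a = (i_bb * u_a - i_ab * u_b) / det
        step_b = (i_aa * u_b - i_ab * u_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    # Variance of b from the information matrix of the final iteration.
    se_b = math.sqrt(i_aa / det)
    z = b / se_b
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    ci = (math.exp(b - 1.959964 * se_b), math.exp(b + 1.959964 * se_b))
    return math.exp(b), ci, p_value

# Hypothetical counts in which the odds of disease double with each minor allele.
odds_ratio, (ci_low, ci_high), p_trend = fit_log_additive(
    cases=(100, 200, 400), controls=(100, 100, 100))
print(f"per-allele OR = {odds_ratio:.2f} "
      f"(95% CI {ci_low:.2f}-{ci_high:.2f}), p = {p_trend:.2g}")
```

Because the example counts follow the log-additive pattern exactly, the fitted per-allele OR equals 2. With real data the model imposes this multiplicative structure, which is what gives the trend test a single degree of freedom.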

Supporting Information

Figure S1.

Linkage disequilibrium plots. Genes with gene-level or SNP-level p<0.05 are shown; Haploview 4.1 (Barrett et al., 2005) based on self-reported white-non-Hispanic controls; r2 = 0 = white and r2 = 1 = black; numbers represent r2 * 100.

(1.04 MB DOC)

Figure S2.

Matrix of scatterplots for four population structure principal components by self-reported race. Population structure principal components analysis based on 1,981 participants and 2,517 SNPs including imputed genotypes; for each scatterplot, vertical axis corresponds to the component listed in diagonal element to the left of the plot, and horizontal axis corresponds to the component listed in diagonal underneath the plot; results suggest that the first component differentiated white non-Hispanic and black non-Hispanic from other samples, while the fourth component helped to further differentiate Asian from other samples; these four population structure principal components were used as covariates in association testing.

(0.30 MB DOC)

Table S1.

SNPs and risk of ovarian cancer, OR (95% CI)

(0.43 MB DOC)

Table S2.

Haplotype results for PRPF31 (global p-value = 0.03)

(0.12 MB DOC)

Table S3.

Characteristics of study participants by study site

(0.14 MB DOC)

Table S4.

SNP and genotype information

(0.56 MB DOC)

Acknowledgments


We thank Ms. Karin Goodman and Ms. Ashley Pitzer for subject recruitment, Ms. Michele Schmidt and Mr. Sebastian Armasu for assistance with statistical analyses, and Ms. Katelyn Goodman for assistance with manuscript preparation.

Author Contributions

Analyzed the data: RAV BLF ZSF Y-YT MCL MLK DNR. Wrote the paper: AP ELG. Provided funding: ELG FJC TS. Assisted with recruitment of Mayo Clinic participants: LCH EAE KRK ML. Assisted with results interpretation: KLW CMP TAS. Oversaw Duke University study: AB JS. Recruited Duke University participants: ESI, Jr. Provided analytical advice: FJC. Assisted with gene selection: PP VS. Performed genotyping: JMC.

References


  1. Parkin DM, Bray F, Ferlay J, Pisani P (2005) Global cancer statistics, 2002. CA Cancer J Clin 55(2): 74–108.
  2. Hartmann LC, Lu KH, Linette GP, Cliby WA, Kalli KR, et al. (2005) Gene expression profiles predict early relapse in ovarian cancer after platinum-paclitaxel chemotherapy. Clin Cancer Res 11(6): 2149–2155.
  3. De Smet F, Pochet NL, De Moor BL, Van Gorp T, Timmerman D, et al. (2005) Independent test set performance in the prediction of early relapse in ovarian cancer with gene expression profiles. Clin Cancer Res 11(21): 7958–7959; author reply 7959.
  4. Ramus SJ, Harrington PA, Pye C, DiCioccio RA, Cox MJ, et al. (2007) Contribution of BRCA1 and BRCA2 mutations to inherited ovarian cancer. Hum Mutat 28(12): 1207–1215.
  5. Lynch HT, Casey MJ, Snyder CL, Bewtra C, Lynch JF, et al. (2009) Hereditary ovarian carcinoma: heterogeneity, molecular genetics, pathology, and management. Mol Oncol 3(2): 97–137.
  6. Lichtenstein P, Holm NV, Verkasalo PK, Iliadou A, Kaprio J, et al. (2000) Environmental and heritable factors in the causation of cancer–analyses of cohorts of twins from Sweden, Denmark, and Finland. N Engl J Med 343(2): 78–85.
  7. Pharoah PD, Ponder BA (2002) The genetics of ovarian cancer. Best Pract Res Clin Obstet Gynaecol 16(4): 449–468.
  8. Song H, Ramus SJ, Tyrer J, Bolton KL, Gentry-Maharaj A, et al. (2009) A genome-wide association study identifies a new ovarian cancer susceptibility locus on 9p22.2. Nat Genet 41(9): 996–1000.
  9. Schildkraut JM, Goode EL, Clyde MA, Iversen ES, Moorman PG, et al. (2009) Single nucleotide polymorphisms in the TP53 region and susceptibility to invasive epithelial ovarian cancer. Cancer Res 69(6): 2349–2357.
  10. Gayther SA, Song H, Ramus SJ, Kjaer SK, Whittemore AS, et al. (2007) Tagging single nucleotide polymorphisms in cell cycle control genes and susceptibility to invasive epithelial ovarian cancer. Cancer Res 67(7): 3027–3035.
  11. Ramus SJ, Vierkant RA, Johnatty SE, Pike MC, Van Den Berg DJ, et al. (2008) Consortium analysis of 7 candidate SNPs for ovarian cancer. Int J Cancer 123(2): 380–388.
  12. Zumbrunn J, Trueb B (1996) Primary structure of a putative serine protease specific for IGF-binding proteins. FEBS Lett 398(2-3): 187–192.
  13. Chien J, Aletti G, Baldi A, Catalano V, Muretto P, et al. (2006) Serine protease HtrA1 modulates chemotherapy-induced cytotoxicity. J Clin Invest 116(7): 1994–2004.
  14. Ng PC, Henikoff S (2003) SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res 31(13): 3812–3814.
  15. Pearce CL, Wu AH, Gayther SA, Bale AE, Beck PA, et al. (2008) Progesterone receptor variation and risk of ovarian cancer is limited to the invasive endometrioid subtype: results from the ovarian cancer association consortium pooled analysis. Br J Cancer 98(2): 282–288.
  16. Pittman AM, Naranjo S, Webb E, Broderick P, Lips EH, et al. (2009) The colorectal cancer risk at 18q21 is caused by a novel variant altering SMAD7 expression. Genome Res 19(6): 987–993.
  17. Smith ER, Cayrou C, Huang R, Lane WS, Cote J, et al. (2005) A human protein complex homologous to the Drosophila MSL complex is responsible for the majority of histone H4 acetylation at lysine 16. Mol Cell Biol 25(21): 9175–9188.
  18. Cooper GM, Hausman RE (2007) The Cell: A Molecular Approach. Sunderland: Sinauer Associates, Inc.
  19. Kelley RL, Solovyeva I, Lyman LM, Richman R, Solovyev V, et al. (1995) Expression of msl-2 causes assembly of dosage compensation regulators on the X chromosomes and female lethality in Drosophila. Cell 81(6): 867–877.
  20. Morales V, Straub T, Neumann MF, Mengus G, Akhtar A, et al. (2004) Functional integration of the histone acetyltransferase MOF into the dosage compensation complex. EMBO J 23(11): 2258–2268.
  21. Fraga MF, Ballestar E, Villar-Garea A, Boix-Chornet M, Espada J, et al. (2005) Loss of acetylation at Lys16 and trimethylation at Lys20 of histone H4 is a common hallmark of human cancer. Nat Genet 37(4): 391–400.
  22. Sellers TA, Schildkraut JM, Pankratz VS, Vierkant RA, Fredericksen ZS, et al. (2005) Estrogen bioactivation, genetic polymorphisms, and ovarian cancer. Cancer Epidemiol Biomarkers Prev 14(11 Pt 1): 2536–2543.
  23. Kelemen LE, Sellers TA, Schildkraut JM, Cunningham JM, Vierkant RA, et al. (2008) Genetic variation in the one-carbon transfer pathway and ovarian cancer risk. Cancer Res 68(7): 2498–2506.
  24. Sellers TA, Huang Y, Cunningham J, Goode EL, Sutphen R, et al. (2008) Association of single nucleotide polymorphisms in glycosylation genes with risk of epithelial ovarian cancer. Cancer Epidemiol Biomarkers Prev 17(2): 397–404.
  25. Cunningham JM, Sellers TA, Schildkraut JM, Fredericksen ZS, Vierkant RA, et al. (2008) Performance of amplified DNA in an Illumina GoldenGate BeadArray assay. Cancer Epidemiol Biomarkers Prev 17(7): 1781–1789.
  26. Carlson CS, Eberle MA, Rieder MJ, Yi Q, Kruglyak L, et al. (2004) Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am J Hum Genet 74(1): 106–120.
  27. Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, et al. (2007) A second generation human haplotype map of over 3.1 million SNPs. Nature 449(7164): 851–861.
  28. Hinds DA, Stuve LL, Nilsen GB, Halperin E, Eskin E, et al. (2005) Whole-genome patterns of common DNA variation in three human populations. Science 307(5712): 1072–1079.
  29. Oliphant A, Barker DL, Stuelpnagel JR, Chee MS (2002) BeadArray technology: enabling an accurate, cost-effective approach to high-throughput genotyping. Biotechniques Suppl: 56–58, 60–61.
  30. Barrett JC, Fry B, Maller J, Daly MJ (2005) Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21(2): 263–265.
  31. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, et al. (2006) Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 38(8): 904–909.
  32. Li Y, Ding J, Abecasis GR (2006) Mach 1.0: Rapid Haplotype Reconstruction and Missing Genotype Inference. Am J Hum Genet s79: 416.
  33. Schaid DJ, Rowland CM, Tines DE, Jacobson RM, Poland GA (2002) Score tests for association between traits and haplotypes when linkage phase is ambiguous. Am J Hum Genet 70(2): 425–434.