X-Chromosomal Maternal and Fetal SNPs and the Risk of Spontaneous Preterm Delivery in a Danish/Norwegian Genome-Wide Association Study

Background. Recent epidemiological studies suggest that the maternal genome is an important contributor to spontaneous preterm delivery (PTD). There is also a significant excess of males among preterm born infants, which may imply an X-linked mode of inheritance for a subset of cases. To explore this, we examined the effect of maternal and fetal X-chromosomal single nucleotide polymorphisms (SNPs) on the risk of PTD in two independent genome-wide association studies and one replication study.
Methods. Participants were recruited from the Danish National Birth Cohort and the Norwegian Mother and Child cohort studies. Data from these two populations were first analyzed independently, and then combined in a meta-analysis. Overall, we evaluated 12,211 SNPs in 1,535 case-mother dyads and 1,487 control-mother dyads. Analyses were done using a hybrid design that combines case-mother dyads and control-mother dyads, as implemented in the Haplin statistical software package. A sex-stratified analysis was performed for the fetal SNPs. In the replication study, 10 maternal and 16 fetal SNPs were analyzed using case-parent triads from independent studies of PTD in the United States, Argentina and Denmark.
Results. In the meta-analysis, the G allele at the maternal SNP rs2747022 in the FERM domain containing 7 gene (FRMD7) increased the risk of spontaneous PTD by a factor of 1.2 (95% confidence interval (CI): 1.1, 1.4). Although an association with this SNP was confirmed in the replication study, it was no longer statistically significant after a Bonferroni correction for multiple testing.
Conclusion. We did not find strong evidence in our data to implicate X-chromosomal SNPs in the etiology of spontaneous PTD. Although non-significant after correction for multiple testing, the mother’s G allele at rs2747022 in FRMD7 increased the risk of spontaneous PTD across all populations in this study, thus warranting further investigation in other populations.


Introduction
Preterm delivery (PTD), defined as delivery before 37 weeks of gestation, affected 15 million births in 2010 [1]. It is associated with a substantially increased risk of mortality, as well as short- and long-term morbidity [1,2]. The PTD rate ranges from 6% in Scandinavian countries [3] to 18% in some African populations [1]. PTD is routinely divided into two main groups according to clinical presentation: i) spontaneous PTD, in which delivery starts with either uterine contractions (preterm labor) or membrane rupture (preterm prelabor rupture of membranes (PPROM)), and ii) iatrogenic PTD, which is induced by medical or surgical intervention.
Spontaneous PTD is etiologically heterogeneous, involving both genetic and environmental risk factors. Twin studies have estimated the heritability of PTD at 17-36% [4,5], and that of parturition timing at 34% [6]. Although there is compelling evidence for a genetic component to PTD, no common genetic variants or mode of inheritance have yet been established [7]. Recent generational epidemiological studies suggest that the maternal genome is a key genetic contributor to PTD inheritance [7-9], and a personal history of PTD is considered the most important risk factor for PTD in multiparous women [7]. Women who were born preterm themselves, or who have sisters or maternal half-sisters with a history of PTD, are also at increased risk [7,8].
A small but significant excess of males has been observed among preterm-born infants in most populations, especially for spontaneous PTD [10]. Different hypotheses have been proposed to explain this excess, including a shorter gestational length due to higher average fetal weight [11], increased vulnerability to certain pregnancy complications [10], and the fact that biochemical processes, such as increased estrogen production from androgen precursors or higher levels of interleukin-1 in the amniotic fluid of males, could lead to uterine contractions and PTD [12,13]. It is also possible that this male excess is due to fetal X-linked risk alleles contributing to a subset of the PTD cases, in which hemizygosity would increase the risk compared to heterozygosity [12].
Several complex disorders have been associated with genetic variants on the X chromosome, including prostate cancer [14], type 2 diabetes [15], X-linked dystonia Parkinsonism [16] and psychiatric disorders such as schizophrenia and autism spectrum disorders [17]. Hypospadias, a congenital malformation of male external genitalia, is strongly associated with genetic variants on the X chromosome [18]. As males inherit only one X chromosome, they are hemizygous for X-linked genes, and although females inherit two X chromosomes, one of them is inactivated in each cell. A recent meta-analysis found a significant association between skewed X-chromosome inactivation in females and idiopathic recurrent spontaneous abortion [19]. Recurrent second-trimester spontaneous abortion is a well-known risk factor for spontaneous PTD [20,21], and it is plausible that some of the same mechanisms may be involved in both outcomes.
Most candidate-gene studies of PTD have used a case-control study design and focused primarily on autosomal markers. This may be partly due to limited knowledge about plausible X-linked candidate genes for PTD and a lack of appropriate statistical methodology for family-based association studies of X chromosome markers [22]. Nearly all methods that have subsequently been developed for handling X-linked markers are based on the transmission/disequilibrium test (TDT) [23-27]. A family-based likelihood ratio test for the X chromosome (X-LRT) [28] and Haplin [22] are among the few existing methods that can estimate genetic relative risks, in contrast to other methods [23-25,27,29] that only generate a p-value for hypothesis-testing.
Taking into consideration the strong evidence of a maternally mediated genetic effect in PTD and the higher proportion of males among infants born preterm, we examined the effects of maternal and fetal X-linked gene variants on the risk of PTD, using data from two genome-wide association studies (GWAS) in Scandinavia (Norway and Denmark). To verify our findings, we conducted a replication study using case-parent triads from independent studies in the United States, Argentina and Denmark.

Ethics Statement
All the studies outlined in this paper were approved by the regional ethics committees or institutional review boards (IRB) at each site, and a written informed consent was obtained from each participant.

Study Participants
This study was conducted using data from the Danish National Birth Cohort (DNBC) [30] and the Norwegian Mother and Child Cohort Study (MoBa) [31]. A replication study of 10 of the most significant maternal SNPs and 16 of the most significant fetal SNPs from the combined analysis of DNBC and MoBa data was performed in independent family studies of PTD in the United States (US), Argentina and Denmark.
DNBC. The DNBC includes approximately 100,000 pregnancies from 1996 to 2002. Women were invited to participate at their first prenatal visit with their general practitioner at gestational weeks 6-12 [30]. Information on exposures not registered in medical records was obtained from national registers, telephone interviews, and a food frequency questionnaire. Blood samples from mothers were collected by the general practitioner during the routine visit at gestational weeks 6-12 and 24, and a cord blood sample was taken at delivery. Blood samples were stored in the Danish National Biobank at the Statens Serum Institut in Copenhagen, Denmark.
A case was defined as a live, singleton spontaneous PTD occurring before 259 days (37 weeks) of gestation; a control was defined as a live, singleton full-term delivery (i.e. occurring at 280-286 days (40 weeks) of gestation). The exclusion criteria were: fetal malformations, preeclampsia/eclampsia, placenta previa, placental abruption, polyhydramnios, isoimmunization and placental insufficiency. In addition, the infant's parents and grandparents had to be of Nordic ancestry (i.e. born in Denmark or one of the other Nordic countries).
MoBa. MoBa is a nationwide Norwegian pregnancy cohort study administered by the Norwegian Institute of Public Health (NIPH). The study includes more than 107,000 pregnancies recruited from 1999 through 2008. Women were invited by postal invitation in connection with a routine ultrasound screening offered to all pregnant women in Norway at gestational weeks 17-19. Most of the pregnant women in Norway were invited and the participation rate was 42.7%. Participation rates for the first three questionnaires were 92-95% [31]. For the current study, cases and controls were selected from Version 4 of the MoBa cohort, which included a total of 71,669 pregnancies. This version was released in 2008 for research use.
Blood samples were drawn from the pregnant woman and the fetus' father during the ultrasound appointment. A new blood sample from the woman and a cord-blood sample from the infant were collected at delivery. All biological specimens were sent to the MoBa Biobank where DNA was extracted, processed and stored until retrieval [32].
A case was defined as a live, singleton spontaneous PTD occurring between 154 and 258 days of gestation (22 0/7 to 36 6/7 weeks); a control was defined as a live, singleton full-term delivery, i.e. occurring at 273-286 days of gestation (39 0/7 to 40 6/7 weeks). Gestational age was estimated by ultrasound at gestational weeks 17-19. In the few cases without ultrasound dating, gestational age was estimated using the date of the last menstrual period. Strict selection criteria were applied to both cases and controls in order to yield the clearest possible phenotype. As women aged <20 years and >35 years have an increased risk of spontaneous PTD [21], only women in the age group 20-34 years were selected in order to prevent the increased risk from affecting the results. Pregnancies involving pre-existing medical conditions, such as diabetes, hypertension, specific autoimmune diseases (inflammatory bowel disease, systemic lupus erythematosus, rheumatoid arthritis and scleroderma) and immune-compromised conditions, were excluded from the study. Lastly, pregnancies with complications such as preeclampsia, hypertension, gestational diabetes, placental abruption, placenta previa, cervical cerclage, small for gestational age and fetal malformation were also excluded, as were pregnancies conceived by in vitro fertilization.
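The gestational-age windows used to define cases and controls in the two cohorts can be summarized in a short sketch. This is an illustration only: the function name `classify_delivery` and the dictionary layout are hypothetical, and the sketch covers only the gestational-age criterion, not the other inclusion and exclusion criteria (live singleton birth, spontaneous onset, maternal age, medical conditions and so on).

```python
# Gestational-age windows in completed days, as stated in the text.
# None means the text gives no explicit lower bound (DNBC cases: "before 259 days").
WINDOWS = {
    "DNBC": {"case": (None, 258), "control": (280, 286)},
    "MoBa": {"case": (154, 258), "control": (273, 286)},
}


def classify_delivery(cohort, gestational_days):
    """Return 'case', 'control', or None based on gestational age alone."""
    for label, (low, high) in WINDOWS[cohort].items():
        above_low = low is None or gestational_days >= low
        if above_low and gestational_days <= high:
            return label
    return None  # falls in neither window, so not sampled


if __name__ == "__main__":
    for cohort, days in [("MoBa", 250), ("MoBa", 265), ("DNBC", 275), ("DNBC", 283)]:
        print(cohort, days, classify_delivery(cohort, days))
```

Deliveries between the case and control windows (for example 259-272 days in MoBa) are deliberately left unclassified, which reflects the aim of contrasting clearly preterm with clearly full-term deliveries.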
The US prematurity study. This study focuses on PTD cases and their families. Study participants (cases and both parents and grandparents when available) were enrolled at various locations in the US, including Iowa City (Iowa), Wake Forest (North Carolina), Pittsburgh (Pennsylvania) and Rochester (New York). DNA from each participant was extracted from saliva or blood samples. Cases were defined as PTD if gestational age was <37 weeks. In order to harmonize the phenotypes of the replication cohort with those of the original study populations, we excluded indicated deliveries without PPROM, multiple gestations, fetal malformations, preeclampsia or hypertension in pregnancy, placental abruption, placenta previa, conception via assisted reproductive technology, and maternal age <20 or >39 years. Only white, non-Hispanic individuals were included in the study. For the maternal replication, all ethnicities were included in one of the analyses, but after other inclusion criteria were applied, only white, non-Hispanic individuals remained for analysis.
The Argentina prematurity study. The Argentina Prematurity Study had enrollment criteria similar to those of the US Prematurity Study. Cases were defined as PTD if gestational age was <37 weeks. We excluded deliveries with multiple gestations, preeclampsia, maternal age <20 or >39 years, and indicated deliveries without PPROM. For the current study, maternal SNPs were replicated using triads consisting of the mother of a preterm-born infant and her parents.
The Denmark family study of PTD. This study includes mothers of preterm-born infants and their parents (the infant's maternal grandparents). Cases were defined as PTD if gestational age was <37 weeks. Indicated deliveries and congenital malformations were excluded.

Genotyping
The Illumina Human660W-Quad BeadChip platform (Illumina, San Diego, CA, USA) was used for genotyping in both study populations. The DNBC samples were genotyped by the Center for Inherited Disease Research (CIDR) at the Johns Hopkins University (Baltimore, MD, USA). The MoBa samples were genotyped at the genotyping core facility at Oslo University Hospital (Oslo, Norway). For replication, the 28 selected SNPs were genotyped at the University of Iowa using the TaqMan® chemistry genotyping system (Applied Biosystems, Foster City, CA, USA). All reactions were performed under standard conditions supplied by Applied Biosystems. Following thermocycling, fluorescence levels of the FAM and VIC dyes were measured and genotypes were scored using the proprietary Sequence Detection Systems 2.2 software (Applied Biosystems) and reviewed manually by at least two independent observers.

Quality Control
DNBC. In the Danish data, 2,035 mother-infant pairs were available for analysis (1,061 case pairs and 974 control pairs). Of these, 20 dyads were excluded because of a call rate <97% in either the mother or the infant. One dyad was excluded because of unknown gender in the infant, and a further nine dyads were excluded because either the mother or the infant had a sibling or half-sibling in the cohort. After quality control, 2,005 mother-infant dyads (1,046 case dyads and 959 control dyads) were available for the current analysis.
Before data processing, there were 14,441 SNPs on the X chromosome. After filtering out SNPs that had a call rate <95%, more than five Mendelian inconsistencies, or a minor allele frequency (MAF) of <1%, we were left with 12,345 SNPs for further analysis.
MoBa. In the Norwegian data, there were 1,086 complete mother-infant dyads eligible for analysis (529 case dyads and 557 control dyads). Sixty-two dyads were excluded because they had a genotype call rate <97% in either the mother or the infant. In addition, 7 mother-infant pairs were excluded because of inconsistencies in parenthood. This left 1,017 mother-infant dyads (489 case dyads and 528 control dyads) for the current analysis.
Overall, 14,441 SNPs on the X chromosome were available before data processing. SNPs that had a call rate <95%, more than five Mendelian inconsistencies, or a MAF <1% were excluded, leaving 12,361 SNPs for the current analysis.
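The SNP-level filters applied in both cohorts (call rate, Mendelian inconsistencies and MAF) amount to a simple per-SNP rule. The following is a minimal sketch of that rule, assuming per-SNP summary statistics are already available. The `SnpSummary` record, the function name and the example SNP identifiers are hypothetical and are not taken from the actual QC pipeline.

```python
from dataclasses import dataclass


@dataclass
class SnpSummary:
    snp_id: str
    call_rate: float        # fraction of samples with a genotype call
    mendelian_errors: int   # number of Mendelian inconsistencies
    maf: float              # minor allele frequency


def passes_snp_qc(snp, min_call_rate=0.95, max_mendelian_errors=5, min_maf=0.01):
    """Keep a SNP unless call rate <95%, >5 Mendelian errors, or MAF <1%."""
    return (
        snp.call_rate >= min_call_rate
        and snp.mendelian_errors <= max_mendelian_errors
        and snp.maf >= min_maf
    )


# Made-up example records, for illustration only.
snps = [
    SnpSummary("snp_ok", call_rate=0.99, mendelian_errors=0, maf=0.20),
    SnpSummary("snp_low_call", call_rate=0.93, mendelian_errors=0, maf=0.20),
    SnpSummary("snp_mendel", call_rate=0.99, mendelian_errors=6, maf=0.20),
    SnpSummary("snp_rare", call_rate=0.99, mendelian_errors=1, maf=0.005),
]
kept = [s.snp_id for s in snps if passes_snp_qc(s)]
print(kept)
```

Keeping the thresholds as keyword arguments makes it easy to apply the same rule with a different cut-off where needed.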
The US prematurity study and the Denmark family study of PTD. The fetal triads consisted of a preterm infant and his/her parents. For the fetal triads, 286 case families (787 individuals) were eligible for analysis. Of these, three families (15 individuals) were removed because of Mendelian inconsistencies, 25 individuals were removed because of a call rate <90%, 8 individuals were subsequently removed because there was only one person left in the family, and six mother-father pairs (12 individuals) were removed because no data/genotypes were available on the infant. This left 267 families (727 individuals) for further analysis. A set of 18 SNPs were chosen for replication. One SNP (rs5918890) failed assay design and another SNP (rs6524611) had a call rate <95%, leaving 16 SNPs for analysis.
The maternal triads consisted of the mother of a preterm infant and the infant's maternal grandparents. For analysis of the maternal triads, 98 case families (293 individuals) were eligible for analysis. Of these, 116 individuals were excluded because they had a call rate <90% and 45 individuals were subsequently excluded because there was only one person left in the family. This left 53 families (132 individuals) for further analysis. None of the SNPs had a call rate <95%, but one SNP was excluded because of low MAF. Thus, 9 SNPs were included in the analysis.

Data Analysis
The Norwegian and Danish case-mother dyads and control-mother dyads were analyzed separately using a hybrid approach in the R statistical package Haplin [22,33,34]. The software is freely downloadable at http://www.uib.no/smis/gjessing/genetics/software/haplin. We analyzed both maternal and fetal SNP effects. In addition, separate analyses were performed for male and female cases to identify possible sex-specific effects. A Bonferroni correction puts the significance threshold at 4.1 × 10^-6 for this study.
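As a check on the stated threshold, the Bonferroni correction divides the nominal significance level by the number of tests. The sketch below assumes a nominal level of 0.05 and uses the 12,211 SNPs evaluated in the combined analysis. The nominal level is an assumption, although it reproduces the threshold quoted above.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Assumed nominal level of 0.05; 12,211 SNPs as reported for the meta-analysis.
threshold = bonferroni_threshold(0.05, 12211)
print(f"{threshold:.1e}")  # 4.1e-06
```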
The results from the two populations were combined in a fixed-effects meta-analysis of 12,211 SNPs in 1,535 case-mother dyads and 1,487 control-mother dyads. Thus, for each SNP, the weighted average of the two log(RR) values was computed using inverse variance weights, and similarly for the standard error of the combined estimate. In addition, the two overall p-values were combined using Fisher's method [35], and quantile-quantile (QQ) plots were generated for the combined p-values for each of the different analyses (Figure S1). Regional association plots for the SNPs with the lowest uncorrected p-values were also made using a modified version of the R script available at http://www.broadinstitute.org/files/shared/diabetes/scandinavs/assocplot.R (Figure 1). With the meta-analysis approach, relative risks with opposing directions in the two populations may cancel each other out, whereas the Fisher method combines p-values irrespective of the direction of the effect.
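The two ways of combining results described above can be illustrated with a short sketch. It is not the code used in the study: the function names are hypothetical and the input values are made up. The Fisher p-value uses the closed-form survival function of the chi-square distribution with an even number of degrees of freedom, so only the standard library is needed.

```python
import math


def fixed_effects_meta(log_rrs, standard_errors):
    """Inverse-variance weighted average of log(RR) values and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_rrs)) / total
    return pooled, math.sqrt(1.0 / total)


def fisher_combined_p(p_values):
    """Fisher's method: -2 * sum(ln p) follows chi-square with 2k df under the null."""
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # statistic divided by 2
    # Survival function of chi-square with 2k df, evaluated in closed form.
    return math.exp(-half_stat) * sum(half_stat ** i / math.factorial(i) for i in range(k))


# Made-up example: equal-sized effects in opposite directions in two populations.
pooled, pooled_se = fixed_effects_meta([math.log(1.2), -math.log(1.2)], [0.07, 0.07])
rr = math.exp(pooled)
ci = (math.exp(pooled - 1.96 * pooled_se), math.exp(pooled + 1.96 * pooled_se))
combined_p = fisher_combined_p([0.01, 0.01])
print(rr, ci, combined_p)
```

In this example the pooled RR is 1 because the opposing effects cancel, whereas the Fisher p-value is small because it ignores direction, which is the contrast drawn in the paragraph above.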
The replication data from the US prematurity study were analyzed using the case-parent triad module in Haplin. In this setting, the unit of analysis is a preterm infant and his/her parents/siblings. The analyzed maternal triads consisted of the mother of a preterm infant and her parents (the infant's maternal grandparents).
Haplin was originally designed to analyze genetic and environmental risk factors using a case-parent triad approach, a case-control approach or a combination of the two [34]. In addition to fetal effects, Haplin can also estimate maternal effects unambiguously. Haplin is based on log-linear modeling and uses a full maximum likelihood (ML) model for estimation of relative risks. In addition, missing genotypes are imputed using the expectation maximization (EM) algorithm.
The relationship between male and female allele effects may be influenced by X inactivation in females. The analyses were therefore carried out using a parameterization in which boys and girls were assigned different baseline risks (B_B and B_G) and a shared relative risk (RR). By assuming separate baseline risks, confounding due to other effects that can influence sex differences can be avoided. The baseline risk applies when no risk alleles are present (i.e. A_1 in boys and A_1A_1 in girls, with A_2 representing the risk allele). The risk increases similarly in boys and girls (B_B × RR and B_G × RR) when they are hemizygous for the risk allele (A_2 in boys) or homozygous for the risk allele (A_2A_2 in girls). When girls are heterozygous (A_1A_2), the risk is the average of B_G and B_G × RR. X inactivation is thus taken into account. In addition to joint analyses of males and females, Haplin has an option for running sex-specific analyses [22].
Figure 1. Regional association plots. A) rs2747022 in mothers, B) rs4239992 in infants, C) rs6652393 in male infants. Large red diamond represents the association in the meta-analysis. Large blue diamond represents the association in the replication study. Small red, orange and yellow diamonds represent SNPs in different degrees of LD with the associated SNP. doi:10.1371/journal.pone.0061781.g001
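The X-chromosome risk parameterization described in the Data Analysis section (separate baseline risks for boys and girls, a shared RR, and an averaged risk for heterozygous girls) can be written out as a small function. This is a sketch of the risk model as described in the text, not of Haplin's internal implementation. The function name, argument names and numerical values are hypothetical.

```python
import math


def x_linked_risk(sex, n_risk_alleles, baseline_boys, baseline_girls, rr):
    """Risk of the outcome given sex and the number of A_2 (risk) alleles."""
    if sex == "male":
        if n_risk_alleles not in (0, 1):
            raise ValueError("males carry 0 or 1 X-linked alleles")
        # Hemizygous for the risk allele: B_B * RR; otherwise B_B.
        return baseline_boys * rr if n_risk_alleles == 1 else baseline_boys
    if sex == "female":
        if n_risk_alleles == 0:
            return baseline_girls                      # A_1 A_1
        if n_risk_alleles == 2:
            return baseline_girls * rr                 # A_2 A_2
        if n_risk_alleles == 1:
            # A_1 A_2: average of B_G and B_G * RR, reflecting X inactivation.
            return 0.5 * (baseline_girls + baseline_girls * rr)
        raise ValueError("females carry 0, 1 or 2 X-linked alleles")
    raise ValueError("sex must be 'male' or 'female'")


# Made-up baseline risks and RR, for illustration only.
risk_het_girl = x_linked_risk("female", 1, baseline_boys=0.06, baseline_girls=0.05, rr=1.2)
risk_hemi_boy = x_linked_risk("male", 1, baseline_boys=0.06, baseline_girls=0.05, rr=1.2)
print(risk_het_girl, risk_hemi_boy)
```

Under this model a heterozygous girl has a relative risk of (1 + RR)/2 compared with her own baseline, so a single risk allele in a girl carries half the excess risk of a hemizygous boy or a homozygous girl.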

Results
Demographic and pregnancy characteristics for cases and controls in DNBC and MoBa are outlined in Table 1. Table 2 shows the number of families in the replication study.

Maternal Effects
Of 12,211 analyzed maternal SNPs, 29 had a combined p-value <10^-3 before Bonferroni correction for multiple testing and 14 of these had an uncorrected p-value <0.05 in both the Norwegian and Danish population (Table S1). The best result was for rs7892483, with a combined p-value of 9.8 × 10^-6 and a relative risk (RR) of 1.7 (95% CI: 1.3, 2.1; Table 3). However, none of the SNPs remained significant after correcting for multiple testing.

Fetal Effects
Of the analyzed fetal SNPs, 19 had a combined p-value <10^-3 before Bonferroni correction and 9 of these had an uncorrected p-value <0.05 in both populations in the discovery phase (Table S2). None of the SNPs reached the chromosome-wide significance threshold of 4.1 × 10^-6, but rs2961403 was closest to significance, with a combined p-value of 1.2 × 10^-5 and a RR of 1.3 (95% CI: 1.1, 1.4) (Table 4). However, this SNP was only borderline significant in the Norwegian sample (p-value = 1.8 × 10^-6). In the Danish study, rs6528251 in the "DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 26B" (DDX26B) gene was the SNP closest to significance (p-value = 1.1 × 10^-4).

Sex-stratified Analyses
When girls and boys were analyzed separately, 10 SNPs had a combined, uncorrected p-value <10^-3 for boys and five of these SNPs had a p-value <0.05 in both studies (Table S3). The SNP closest to significance was rs6652393 in the interleukin-1 receptor-associated protein-like 2 gene (IL1RAPL2) (Table 5). In boys, the A-allele of this SNP had an overall p-value of 3.1 × 10^-5 and a RR of 1.2 (95% CI: 1.1, 1.3). Moreover, three of the other top SNPs were located in IL1RAPL2 (Table S3). For girls, there were 17 SNPs with a combined p-value <10^-3, only one of which had a p-value <0.05 in both studies (Table 6, Table S4).

Replication Analyses
In the maternal replication study, SNP rs2747022 located in FRMD7 had an uncorrected p-value of 0.01 (Table 7). After including the Argentinean families, this SNP had an uncorrected p-value of 0.03 (Table 8, Figure 1A). In the fetal replication, four SNPs had p-values <0.05: rs4239992 in MIR-505 (microRNA 505), rs17328647 in ATP11C (ATPase, class 6 type, 11C), and rs5953790 and rs2485729 located close to the ATP11C gene region (Table 9, Figure 1B). All of the SNPs were in strong linkage disequilibrium (LD) with each other and the G allele at rs4239992 increased the risk of spontaneous PTD by a factor of 2.3 (95% CI: 1.3, 4.1). In the sex-stratified analysis, there was suggestive evidence of association with these SNPs in boys, but not in girls.

Discussion
In this meta-analysis of two independent GWAS from two ancestrally and geographically closely related populations, many SNPs had p-values <0.05 but none remained significant after a Bonferroni adjustment for multiple testing. Replication was attempted for the SNPs closest to significance, but the results were inconclusive.

Maternal Effects
For the maternal SNPs, the most promising finding was rs2747022 in FRMD7. Before Bonferroni correction, the G allele at this SNP was associated with an increased risk of spontaneous PTD in the Norwegian and Danish data, the combined analysis and the replication study, also when the Argentinean families were included. Another SNP in this gene, rs7880476, was also associated in the combined analyses of the Norwegian and Danish data. But like rs2747022 above, the association was no longer significant after Bonferroni correction.
FRMD7 maps to Xq26.2 and is associated with X-linked idiopathic congenital nystagmus [36]. This condition is fully penetrant in males, but has incomplete penetrance in females [36]. Expression analyses show that mRNA is present at low levels in most human adult tissues. In embryos, there is expression in various parts of the brain [36], and it has been postulated that the FRMD7 protein is important for neurite development and neuronal differentiation [37]. It is unclear how this gene might be involved in the pathogenesis of spontaneous PTD. An associated SNP may be a surrogate for etiologic variants adjacent to the gene in which the most associated SNP is located. However, the genes immediately flanking FRMD7 (MST4, RAP2C, MBNL3) do not appear to be obvious candidates for PTD, although MBNL3 is involved in alternative splicing, a mechanism that would be consistent with altered developmental expression patterns.
The SNP closest to significance in the combined analysis was rs7892483. This finding was consistent across the Norwegian and Danish samples, but not in the replication study involving the US and Argentinean samples. Of the other most associated SNPs, four (rs5972070, rs5972071, rs5973734, and rs5973741; Table S1) were in strong LD with rs7892483. These SNPs lie in a gene desert and have not previously been associated with any disease. It is possible that they are in LD with a causal variant of unknown location. Several loci associated with disease have been localized to gene deserts, and it has been shown that these regions may harbor regulatory elements that can modulate gene expression over large distances on the chromosome [38].

Fetal Effects
Before Bonferroni correction, the closely linked SNPs rs4239992, rs17328647, rs5953790 and rs2485729 in the MIR505/ATP11C gene region had uncorrected p-values <0.05, both in the combined analysis and in the replication study. However, the relative risks in the Norwegian and Danish samples point in opposite directions, complicating the interpretation of these risk estimates. This could be explained by a genetic "flip-flop", a phenomenon characterized by opposite alleles being associated in two populations owing to heterogeneous effects of the same variant or to differences in LD [39]. However, this is unlikely given the similar allele frequencies of these SNPs in the Norwegian and Danish samples and the close ancestral origins of these two populations. Nevertheless, this is an interesting finding, particularly because rs4239992 is located in a microRNA gene. These genes code for small RNA molecules that can have important regulatory functions in gene expression [40], and might thus be involved in regulating genes important for PTD.
The six top hits in the combined analysis were also among the top hits in the Norwegian but not the Danish study. Since analyses based on the hybrid study design are not fully protected against population stratification because of the case-control component, some of the hits might be the result of population substructure or random false positives. However, none of these SNPs achieved chromosome-wide significance in the replication study.

Sex-stratified Analyses
In the combined study, the most promising SNP among males was rs6652393 in IL1RAPL2. Before Bonferroni correction this SNP was significant at p<0.05 in both populations, but not in the sex-stratified analysis of females, indicating a possible sex-specific effect. IL1RAPL2 maps to Xq22 and is a member of the IL-1 receptor family. In mice, IL1RAPL2 is specifically expressed in the central nervous system (CNS) from embryonic day 12.5 onwards [41]. A study by Born and co-workers [42] detected expression of IL1RAPL2 in skin, liver, placenta, and fetal brain tissues. Only weak expression has been detected in the adult brain [41]. IL1RAPL2 is closely related to IL1RAPL1, which has been associated with mental retardation, autism and psychiatric disorders [43], all of which are conditions associated with PTD [2]. Mechanisms leading to PTD may also be responsible for impaired neonatal outcome [44], with boys being more vulnerable to these outcomes than girls. Therefore, it is plausible that some sex-specific mechanism involved in both CNS development and spontaneous PTD is at play. As mentioned earlier, there is a higher level of the IL-1 receptor antagonist in the amniotic fluid of females compared with males. It has also been shown that IL-1 can induce PTD in mice [41], but it remains unclear how IL1RAPL2 is involved in the response to IL-1 [42].
In the sex-stratified analysis, there was only suggestive evidence of association with one SNP across both study samples in females. However, the respective relative risks were in opposite directions in the Norwegian and Danish samples. In the Norwegian sample, several SNPs in the dachshund homolog 2 gene (DACH2) were close to significance in females, whereas in the Danish study the SNPs closest to significance were in the glutamate receptor 3 gene (GRIA3). These findings were not replicated and are likely to be false positives. Further, none of the associations remained after correction for multiple testing. Taken together, our data do not provide strong evidence for a sex-specific effect in girls in the combined analysis.
In the sex-stratified replication, the same SNPs that were promising in the fetal replication study were also promising in males but not females, again indicating a possible sex-specific effect. However, this effect was not present in the original study population. Again, none of the replicated SNPs were significant in females.
Despite the negative results, this is the first study to look for associations between spontaneous PTD and SNPs along the X chromosome. The hybrid design used in this study is widely applicable to other perinatal disorders and has several attractive and novel features compared with other methods. It is less prone to population stratification than the case-control design, and since it involves more controls than the case-parent triad design alone, it not only provides more statistical power for detecting an effect, it also allows the main effects of an exposure (genetic or environmental) to be estimated. Furthermore, our study subjects come from two relatively homogeneous populations that share a common ancestry and geography, further protecting against population stratification. Because spontaneous PTD is a heterogeneous condition, we applied strict inclusion criteria in order to obtain a clearly defined phenotype.
Although this study is based on one of the largest collections of PTD samples to date, the power to detect small genetic effects in particular may still be limited. Several associations that were significant in the two independent discovery populations from Denmark and Norway failed to replicate in a third study population from the US. There are several plausible explanations for this. First of all, the associations did not remain significant after a Bonferroni correction for multiple testing, increasing the likelihood that they were false positives. Whereas the primary studies were based on a relatively homogeneous Scandinavian population, the replication study was based on a more heterogeneous/admixed US and Argentinean population. Also, some of the variables that were used as exclusion criteria in the primary studies (for example diabetes and cerclage) were not available for the replication population. Furthermore, the replication study had a smaller sample size than the combined study, in particular for the evaluation of maternal SNPs. Finally, the case-parent triad study design in the replication study offers better protection against population stratification than the hybrid design in the primary study, but the statistical power in the former is lower because fewer control alleles are available for comparison.
In conclusion, our data did not provide any strong evidence for the involvement of X-chromosomal SNPs in the risk of spontaneous PTD. However, there were several interesting findings, such as the maternal SNP rs2747022 in FRMD7, which represents a particularly attractive candidate for further investigations in other large studies. The hybrid study design described here and the analytic opportunities provided by Haplin should prove valuable not only for the replication but also for exploring other perinatal disorders.