Differential Gene Expression at the Maternal-Fetal Interface in Preeclampsia Is Influenced by Gestational Age

Genome-wide transcription data of utero-placental tissue has been used to identify altered gene expression associated with preeclampsia (PE). As many women with PE deliver preterm, there is often a difference in gestational age between PE women and healthy pregnant controls. This may pose a potential bias since gestational age has been shown to dramatically influence gene expression in utero-placental tissue. By pooling data from three genome-wide transcription studies of the maternal-fetal interface, we have evaluated the relative effect of gestational age and PE on gene expression. A total of 18,180 transcripts were evaluated in 49 PE cases and 105 controls, with gestational age ranging from week 14 to 42. A total of 22 transcripts were associated with PE, whereas 92 transcripts were associated with gestational age (nominal P value <1.51*10−6, Bonferroni adjusted P value <0.05). Our results indicate that gestational age has a great influence on gene expression both in normal and PE-complicated pregnancies. This effect might introduce serious bias in data analyses and needs to be carefully assessed in future genome-wide transcription studies.


Introduction
Preeclampsia (PE) is one of the leading causes of perinatal mortality and deaths of pregnant women worldwide [1,2]. PE is a pregnancy-specific disorder, diagnosed by de novo onset of hypertension and proteinuria in the latter half of pregnancy [2][3][4]. To date, there are few reliable predictive tests or any effective treatment available, except delivery of the baby and the placenta. Consequently, PE accounts for approximately 20% of all preterm births [5].
The aetiology of PE is not completely understood, but it is generally considered that disturbed interactions between the invading (fetal) trophoblasts and maternal cells causing defective trophoblast invasion are important pathophysiological events. The subsequent impaired spiral artery remodelling and reduced placental perfusion is proposed to create oxidative stress and a release of inflammatory factors into the maternal circulation, causing overt PE [6]. Gene expression analyses may provide further insight in mechanisms of disease and function as preventive, predictive or therapeutic measures. As the molecular mechanisms behind impaired trophoblast invasion are preferentially reflected at the maternal-fetal interface, attempts to identify aberrant gene expression associated with preeclamptic pregnancies at this site have been made. So far, a number of genome-wide transcription analyses of decidual and placental bed tissue have been performed using a limited number of samples [7][8][9]. Findings in these studies have been inconsistent, probably reflecting lack of power in each individual study in combination with the complexity of the disease.
Women with PE often deliver preterm, due to medical indications or the condition itself. Transcriptional comparisons of gene expression in utero-placental tissue from women with PE and women with normal pregnancies are therefore often hampered by relatively large differences in gestational age. It has been shown that gene expression in utero-placental tissue differs dramatically over gestation [10,11]. However, most studies aiming to identify altered utero-placental expression in PE have failed to properly assess changes in transcription levels caused by such differences in gestational age.
In this study, we have pooled data from three different genome-wide transcription studies [8,9,11] of tissue from the maternal-fetal interface to assess differential gene expression associated with both PE and gestational age. This study is the first to include both variables (PE and gestational age), but also the first to combine data from different genome-wide platforms to analyse transcription profiles. In addition, with 154 samples analysed, this is so far the largest study performed to identify differences in expression patterns at the maternal-fetal interface.

Microarray Datasets
In this study, we have utilised three publicly available gene expression datasets from tissue from the maternal-fetal interface (Table 1). The first dataset consists of analyses of basal plate biopsies from 36 second- and third-trimester singleton pregnancies, including elective surgical terminations and normal, uncomplicated term pregnancies [11]. None of these 36 samples were from women with pregnancies complicated by PE, fetal anomalies, hypertension, diabetes, infections or other significant maternal health issues. The dataset is available at the Gene Expression Omnibus (www.ncbi.nlm.nih.gov/geo/), accession no. GSE5999. The second dataset consists of 23 basal plate biopsies from women who developed severe PE (n = 12) or had preterm labour (n = 11) [9]. Approximately one third of the cases with preterm labour resulted from cervical insufficiency, and 5 of the PE pregnancies were complicated by fetal growth restriction. Pregnancies complicated by fetal anomalies, premature rupture of the membranes, infections, diabetes, autoimmune diseases or pregnancies with multiple gestations were excluded. The dataset is available at the Gene Expression Omnibus (www.ncbi.nlm.nih.gov/geo/), accession no. GSE14722. The third dataset consists of 95 decidua basalis samples from women who developed PE (n = 37) and women with normal, uncomplicated term pregnancies (n = 58) [8]. Among the 37 PE cases, 30 had severe PE or PE complicated by small for gestational age (SGA) and seven cases had mild PE not complicated by SGA. Pregnancies with multiple gestations, or fetal and placental anomalies (such as placenta accreta, placenta membranacea, placentas from fetuses with chromosomal anomalies or developmental anomalies, or macroscopic and microscopic signs of infection) were excluded. The dataset is available via ArrayExpress (www.ebi.ac.uk/microarray-as/ae/), accession no. E-TABM-682. The criteria to diagnose PE as well as severe PE in dataset #2 and #3 were similar [12,13].
All data used for this study have been retrieved from www.ncbi.nlm.nih.gov/geo/ or www.ebi.ac.uk/microarray-as/ae/, and no additional ethical approvals were obtained. A more comprehensive description of the tissue sampling and study population characteristics can be found in the original papers for the respective studies [8,9,11].

Probe-mapping and Construction of Probe Pairs between Microarray Platforms
The transcription data from the three studies used in this work were produced on two different microarray platforms. Dataset #1 and #2 were analysed on Affymetrix HG-U133A&B GeneChips (Affymetrix, Santa Clara, CA, USA) including 44,928 probesets, and dataset #3 was analysed on Illumina HumanWG-6 v2 Expression BeadChips (Illumina Inc., San Diego, CA, USA) including 48,095 probes. Affymetrix interrogates mRNA expression using a panel of different 25mer probes per transcript (probesets), whereas Illumina uses multiple, identical 50mer probes per transcript. Both platforms provide probe annotations, but since the sequence of the human genome is constantly updated, these annotations do not always match the latest updated sequence [14]. Having accurate knowledge of which transcripts the probes and probesets are measuring is essential to ensure accurate biological interpretation of the results in downstream analyses. Therefore, probes and probesets from the two platforms were remapped to Ensembl transcript predictions using the available Ensembl annotation for expression microarrays (Ensembl release 57). This Ensembl annotation was produced by first aligning the probe sequences to the corresponding genome sequence (Ensembl release 57), using the exonerate alignment tool [15]. A default of one base pair mismatch was permitted between the probe and the genome sequence assembly. Probes that matched at 100 or more locations (e.g. suspected Alu repeats) were discarded. The remaining probes or probesets were associated with Ensembl transcript predictions. For Affymetrix probesets, it was required that >50% of the probes matched a given transcript sequence. This mapping procedure is described in more detail at the Ensembl webpage (www.ensembl.org/info/docs/microarray_probe_set_mapping.html). For the current study, Illumina-Affymetrix probe pairs were constructed if an Affymetrix probeset and an Illumina probe mapped to the same Ensembl transcript.
This resulted in multiple probe pairs from some transcripts. E.g., if two different Affymetrix probesets (A1 and A2) and two Illumina probes (I1 and I2) mapped to the same transcript, this resulted in four possible probe pairs (A1-I1, A1-I2, A2-I1 and A2-I2).
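The pairing rule described above can be illustrated with a minimal sketch. The input format (a dictionary from probe identifier to mapped transcript identifiers), the function name `build_probe_pairs` and the identifiers `ENST_X` and `ENST_Y` are hypothetical and chosen only for illustration; they are not part of the original pipeline.

```python
from collections import defaultdict
from itertools import product


def build_probe_pairs(affy_map, illumina_map):
    """Pair every Affymetrix probeset with every Illumina probe that maps
    to the same transcript.

    Both arguments are dicts: probe ID -> list of transcript IDs
    (a hypothetical input format assumed for this sketch).
    Returns a list of (transcript, affy_probeset, illumina_probe) tuples.
    """
    affy_by_tx = defaultdict(list)
    for probeset, transcripts in affy_map.items():
        for tx in transcripts:
            affy_by_tx[tx].append(probeset)

    illumina_by_tx = defaultdict(list)
    for probe, transcripts in illumina_map.items():
        for tx in transcripts:
            illumina_by_tx[tx].append(probe)

    pairs = []
    for tx in sorted(affy_by_tx):
        if tx not in illumina_by_tx:
            continue  # transcript covered by one platform only: no pair
        # Cartesian product: every probeset with every probe on this transcript
        for a, i in product(sorted(affy_by_tx[tx]), sorted(illumina_by_tx[tx])):
            pairs.append((tx, a, i))
    return pairs


# Toy example mirroring the text: A1 and A2 share a transcript with I1 and I2.
affy = {"A1": ["ENST_X"], "A2": ["ENST_X"], "A3": ["ENST_Y"]}
illumina = {"I1": ["ENST_X"], "I2": ["ENST_X"]}
pairs = build_probe_pairs(affy, illumina)
for tx, a, i in pairs:
    print(tx, a + "-" + i)
```

In this toy input, the transcript covered only by the Affymetrix probeset A3 contributes no pair, while the shared transcript yields the four pairs A1-I1, A1-I2, A2-I1 and A2-I2.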

Pre-processing of Microarray Data
All pre-processing procedures were performed using the open source software R, available via www.bioconductor.org. Affymetrix gene expression values from dataset #1 and #2 were imported into R and extracted using the Robust Multichip Average (RMA) method [16] implemented in the affy R library. Illumina gene expression values from dataset #3 were imported and extracted using the lumi R library. Both methods for extracting expression values included quantile normalisation. Next, transcription values were inverse normal transformed, for each dataset separately, to obtain normally distributed values with mean zero and standard deviation one, as described previously [17]. Briefly, transcription values for all probes were first inverse normal transformed for each individual separately to adjust for variation between samples (e.g. RNA quantity). Second, the transcription values for all individuals within each substudy were inverse normal transformed for each probe, to adjust for variation between probes (e.g. probe specificity). This normalisation procedure produces comparable transcription values across individuals and transcripts (independent of RNA quality, tissue sampling method, platform etc.).
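The two-step transformation can be sketched as follows. This is an illustration under stated assumptions, not the procedure of [17]: the rank-to-quantile rule (rank − 0.5)/n and the handling of ties (broken by position) are assumptions, and the function names and toy matrix are hypothetical.

```python
from statistics import NormalDist


def inverse_normal_transform(values):
    """Replace each value by the standard normal quantile of its rank.

    Assumption: quantile = (rank - 0.5) / n; ties are broken by position.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    standard_normal = NormalDist()
    out = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        out[idx] = standard_normal.inv_cdf((rank - 0.5) / n)
    return out


def normalise_dataset(matrix):
    """matrix[p][s] holds the expression of probe p in sample s (one dataset)."""
    n_probes, n_samples = len(matrix), len(matrix[0])
    # Step 1: within each sample, across all probes (between-sample variation)
    by_sample = [
        inverse_normal_transform([matrix[p][s] for p in range(n_probes)])
        for s in range(n_samples)
    ]
    # Step 2: within each probe, across all samples (between-probe variation)
    return [
        inverse_normal_transform([by_sample[s][p] for s in range(n_samples)])
        for p in range(n_probes)
    ]


# Toy dataset: 5 probes x 4 samples, with a per-sample offset (0.3 * s)
# standing in for differences in RNA quantity.
matrix = [[2.0 + 1.5 * ((p + s) % 5) + 0.3 * s for s in range(4)] for p in range(5)]
normalised = normalise_dataset(matrix)
for row in normalised:
    print([round(v, 3) for v in row])
```

Because both steps depend only on ranks, the per-sample offset in the toy data has no effect on the output, which is the property that makes values comparable across samples and platforms.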

Statistical Analyses of Microarray Data
The normalised transcription values from the three datasets were pooled and analysed using the moderated t-test implemented in the limma R library [18,19], available via the bioconductor project (www.bioconductor.org). First, the transcription values were fitted to a linear model with PE-status and gestational age as covariables, as well as a factor with values (1, 2 or 3) to separate the three studies. This factor allows the mean expression level for each transcript to differ between studies, which is expected if the samples included in the three studies differed in terms of gestational age, rate of PE incidence and sampling methods. Second, an empirical Bayesian method [18] was applied to the fitted model object. To correct for multiple testing, a Bonferroni adjusted P value of 0.05/33,088 (number of Illumina-Affymetrix probe pairs) = 1.51*10−6 was used as significance threshold in all analyses. To evaluate if the majority of the transcripts were regulated by gestational age or PE-status, the proportion of non-differentially expressed genes was estimated, as described previously [20]. In addition to the variables described above, we assessed the interaction between PE and gestational age by including an interaction term (PE*gestational age) in the linear model.
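The structure of the linear model and the significance threshold can be illustrated with a small sketch. It fits an ordinary least-squares model to one transcript; the empirical Bayes moderation applied by limma is not reproduced here. The helper names, the coding of the study factor as two indicator columns, and the synthetic samples and coefficients are all assumptions made for illustration.

```python
def design_row(pe, ga_weeks, study):
    """Intercept, PE-status (0/1), gestational age (weeks), and two
    indicator columns for studies 2 and 3 (study 1 is the reference)."""
    return [1.0, float(pe), float(ga_weeks),
            1.0 if study == 2 else 0.0,
            1.0 if study == 3 else 0.0]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic samples: (PE-status, gestational age in weeks, study).
samples = [(0, 14, 1), (0, 40, 1), (0, 20, 1),
           (1, 30, 2), (0, 28, 2), (1, 33, 2),
           (1, 36, 3), (0, 40, 3), (0, 39, 3), (1, 38, 3)]
true_beta = [0.5, -1.0, 0.05, 0.3, -0.2]  # made-up effects, no noise added
X = [design_row(*s) for s in samples]
y = [sum(b * x for b, x in zip(true_beta, row)) for row in X]
beta = fit_ols(X, y)

# Bonferroni threshold for 33,088 probe pairs at a family-wise level of 0.05.
threshold = 0.05 / 33088
print([round(b, 4) for b in beta], threshold)
```

The second and third coefficients are the PE and gestational age effects for the transcript, each adjusted for the other and for study. An interaction term could be added as one further column holding the product of PE-status and gestational age.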

Linear Model Fitting for Gestational Age
Gene expression levels are known to vary over gestation, but whether this relationship is linear throughout pregnancy is not known. In a previous study of gestational age-related transcriptional changes at the maternal-fetal interface [11] (dataset #1, Table 1), gestational age was categorised as either mid-gestation or term, whereas we have assumed a linear relationship between gestational age and transcription levels. To evaluate if these two methods for estimating gestational age effects provided similar results, we first reanalysed dataset #1 using the moderated t-test two times. The first time we used gestational age as a continuous variable, and the second time we dichotomised gestational age into mid-gestation and term. We then compared the log2 fold change (log2FC) values between analyses for each transcript.

Ingenuity Pathway Analyses
We used Ingenuity Pathway Analysis (IPA) v7.5 (Ingenuity Systems, Redwood City, CA, USA) to study the biological function of the genes that were differentially expressed in association with PE or gestational age. Fisher's exact test was used to investigate if any biological functions were over-represented among these genes. No adjustment for multiple testing was done in the IPA analyses.
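As an illustration of this kind of over-representation test, the sketch below computes a one-sided Fisher's exact P value from the hypergeometric distribution. It does not reproduce IPA itself; the use of a one-sided (enrichment) test is an assumption, and the counts in the example (500 annotated genes, 5 hits) are invented. Only the list size of 22 and the background of 14,678 genes are taken from this study.

```python
from math import comb


def enrichment_p(hits, list_size, annotated, background):
    """One-sided Fisher's exact test for over-representation.

    Probability of observing at least `hits` annotated genes when
    `list_size` genes are drawn without replacement from `background`
    genes, of which `annotated` carry the functional annotation.
    """
    upper = min(list_size, annotated)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, list_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(background, list_size)


# Hypothetical example: 5 of 22 significant genes carry an annotation
# shared by 500 of the 14,678 genes in the background.
p = enrichment_p(5, 22, 500, 14678)
print(p)
```

Since no adjustment for multiple testing is applied across the functional categories, P values of this kind are best read as nominal.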

Concordance with Previous Studies
To investigate whether our assessment of PE- and gestational age-related transcripts was similar to the individual three substudies included, we compared our results to those previously published [8,9,11]. However, as the three sub-studies originally were analysed using different methods and settings, results were not directly comparable. We therefore reanalysed each dataset (#1-#3) separately using the moderated t-test, and compared these results to the results when the three datasets were pooled. For each transcript, the log2FCs corresponding to PE and gestational age were compared across datasets. In addition, we compared the top 100 most significant results (ranked by P value) for PE- and gestational age-related transcripts between the datasets.
Figure 1. A comparison of two methods for identifying gestational age-related transcripts; using gestational age as a linear variable or dichotomising it to mid-gestation and term. The correlation coefficient between the log2 fold change (log2FC) values in the two methods (Pearson's product-moment correlation coefficient) is 0.94 (P value <2.2*10−16). doi:10.1371/journal.pone.0069848.g001

Study Group Characteristics and Microarray Datasets
Genome-wide transcription data from 154 tissue samples from the maternal-fetal interface, originating from three different datasets (Table 1), were used in this work. A total of 105 non-preeclamptic (NP) controls with gestational age ranging from week 14 to 42, in addition to 49 PE cases with gestational age ranging from week 24 to 39, were included in the analyses (Table 1).

Linear Model Fitting for Gestational Age
The comparison of the two different methods for identifying gestational age-related transcripts (using gestational age as a quantitative variable versus dichotomising it to mid-gestation and term) showed that these two methods gave similar results (Figure 1, Pearson's product-moment correlation coefficient 0.94, P<2.2*10−16). Consequently, we conclude that gestational age can be used as a linear variable as well as being dichotomised into mid-gestation and term. Using a linear relationship allows for interpolation of the effect of gestational age throughout the total range of our data and compensates for the discrepancy in gestational age between datasets (Figure 2).

PE-and Gestational Age-related Transcripts
Of the 33,088 Illumina-Affymetrix probe pairs analysed, 29 were differentially expressed between PE and NP after correction for multiple testing (Table 2). Together, these probe pairs represent 22 different transcripts. In contrast, as many as 174 probe pairs, representing 92 different transcripts, were significantly associated with gestational age (Table 3). For the most significant observations, there were clear changes in transcription levels associated with either PE or gestational age (Figure 3, Figure S1 and S2). We did not detect any significant interactions between PE and gestational age. Two transcripts were associated with both PE-status and gestational age: fibroblast activation protein (FAP) and corticotropin releasing hormone (CRH). FAP was down-regulated in PE and its expression decreased over gestation, whereas CRH was up-regulated in PE and its expression increased over gestation. The estimated fraction of differentially expressed transcripts [20] was 49% for gestational age and 30% for PE. Since transcription values for the three datasets were standardised for each transcript prior to merging, all between-study-effects such as tissue source would be excluded. We did not allow for interaction between the study/tissue source and the effect of PE or gestational age on transcription values. To evaluate if the inclusion of controls with very low gestational age affected the results, we performed the same analyses excluding controls (n = 15) with gestational age <20 weeks. The results were very similar to the original results, with a correlation between the -log10(P values) of r = 0.96 for the PE-associated transcripts and r = 0.88 for the gestational age-associated transcripts. However, removing these 15 individuals decreased the number of significantly associated probe pairs from 174 to 72 for gestational age, and from 29 to 18 for PE. Still, the ranking of the P values in the two sub-analyses was largely the same (as indicated by the high correlation coefficients), and the decrease in the number of significant findings most likely reflects the decrease in power caused by the smaller sample size. Similarly, to test if the inclusion of cases with mild PE influenced the results, we performed the same analyses excluding the seven cases with mild PE from dataset #3. These results were also very similar to the primary results, with a correlation coefficient between the log2FC values for PE-associated transcripts of r = 0.99.

Ingenuity Pathway Analyses
IPA analysis of the 22 PE-associated transcripts ( Table 2) identified in our analyses demonstrated an over-representation of biological functions such as cellular growth and proliferation, cell death, endocrine system disorders and metabolic disease (Table 4). IPA analysis of the 92 gestational age-associated transcripts ( Table 3) demonstrated over-representation of the biological functions cellular assembly and organisation, tissue development, cellular movement, cardiovascular system development and function, cellular growth and proliferation, connective tissue development, cell-to-cell signalling and interaction, and cell cycle ( Table 5).

Concordance with Previous Studies
Since the number of significant transcripts in the previous publications of the included datasets was very limited when applying a stringent threshold for multiple testing (Bonferroni), no rigorous comparison between datasets could be performed. However, lists of transcripts/probes that were claimed to be differentially expressed in the three original studies are included (Tables S1-S3). A comparison of our reanalysis of dataset #1 with 36 NP samples and our analyses of all three datasets combined with 154 samples showed that the log2FC values for gestational age-associated transcripts were highly correlated (Pearson's product-moment correlation coefficient 0.94, P<2.2*10−16). In addition, the top 100 transcripts from the reanalysis of dataset #1 were all nominally significant (P<0.05) in our analysis of the pooled datasets and 20 transcripts were shared between the top 100 transcripts from dataset #1 and the pooled analyses.
Comparing our separate reanalyses of dataset #2 and #3 (with 12 or 37 PE samples and 11 or 58 NP samples, respectively) with our analysis of the pooled dataset (49 PE and 105 NP), we found that the correlation between the log2FC values for PE in dataset #2 and the pooled dataset (Pearson's product-moment correlation coefficient 0.57, P<2.2*10−16) was similar to the correlation between the PE log2FC values for dataset #3 and the pooled dataset (Pearson's product-moment correlation coefficient 0.71, P<2.2*10−16). For the sub-analysis of PE-associated transcripts in dataset #2, 78 of the top 100 transcripts were nominally significant (P<0.05) in our pooled analyses of PE-associated transcripts, but only two of these were significant in the pooled analyses after correction for multiple testing, and four transcripts were shared between the top 100 transcripts from dataset #2 and the pooled analyses. For the reanalysis of dataset #3, 93 of the top 100 transcripts were nominally significant (P<0.05) in our total analysis, of which nine were significant in the pooled dataset after correction for multiple testing. Seven transcripts were shared between the top 100 transcripts from dataset #3 and the pooled analyses. The correlation between dataset #2 and #3 log2FC values from the PE-associated transcripts was low (Pearson's product-moment correlation coefficient 0.04, P<1.04*10−13). This is not surprising, since none of the PE-associated transcripts in dataset #2 and only two in dataset #3 were significant after correcting for multiple testing (Bonferroni).
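The two concordance measures used above, the correlation of log2FC values and the overlap between top-ranked lists, can be sketched as follows. The function names and the toy values are hypothetical; in the analyses above the lists contained the 100 most significant transcripts rather than two.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def top_overlap(p_values_a, p_values_b, n=100):
    """Transcripts found among the n smallest P values in both analyses.

    Each argument is a dict: transcript ID -> P value.
    """
    top_a = set(sorted(p_values_a, key=p_values_a.get)[:n])
    top_b = set(sorted(p_values_b, key=p_values_b.get)[:n])
    return top_a & top_b


# Toy log2FC values for the same transcripts in a single dataset and in
# the pooled analysis (invented numbers).
log2fc_single = [0.8, -1.2, 0.1, 0.4, -0.3]
log2fc_pooled = [0.6, -0.9, 0.2, 0.5, -0.1]
r = pearson(log2fc_single, log2fc_pooled)

# Toy P values; with n=2 only the two best-ranked transcripts are compared.
p_single = {"t1": 0.001, "t2": 0.01, "t3": 0.5}
p_pooled = {"t1": 0.2, "t2": 0.0001, "t3": 0.003}
shared = top_overlap(p_single, p_pooled, n=2)
print(round(r, 3), shared)
```

A high correlation of effect sizes together with a small top-list overlap, as seen for dataset #2, is what would be expected when effects point in the same direction but each individual analysis has limited power to rank them.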

Discussion
In this work, we have pooled data from three different genome-wide transcription analyses of samples from the maternal-fetal interface to generate a dataset consisting of 154 samples. To the best of our knowledge, this is the largest dataset used to assess changes in transcription levels associated with either PE or gestational age. A total of 92 gestational age-related and 22 PE-associated transcripts were identified. These numbers by themselves indicate that a large fraction of variance in transcription levels at the maternal-fetal interface can be attributed to gestational age rather than PE-status. The large sample size achieved by pooling datasets from three different studies enabled us to apply a stringent significance threshold (Bonferroni adjusted P values) in order to minimise the probability of false positives. Using a Bonferroni cut-off of 0.05 corresponds to false discovery rate (FDR) cut-offs of 2.8*10−4 for gestational age and 1.7*10−3 for PE [21]. Previous genome-wide transcription studies on PE have not used such a stringent threshold for significance, and those findings should be interpreted with care. In accordance with this, only a few PE- or gestational age-associated transcripts from previous publications have been replicated in other studies or in ours.
The most significant finding among the PE-associated transcripts was the down-regulation (P = 2.43*10−9, >2-fold down-regulated) of angiopoietin-like 2 (ANGPTL2). The ANGPTL2 protein is a secreted glycoprotein with homology to the angiopoietins, which are important angiogenic factors. Although the angiopoietin-like proteins do not bind to the angiopoietin receptor, they are believed to play a role in angiogenesis via induction of endothelial cell sprouting in blood vessels [22]. We also observed a down-regulation of RARRES2 (retinoic acid receptor responder 2, also called chemerin) among the PE-associated transcripts. It was recently demonstrated that chemerin could induce angiogenesis in vitro [23,24]. Down-regulation of these angiogenic factors may be linked to the pathogenesis of PE through abnormal vascular morphology, as placentas from women with severe PE, especially in combination with fetal growth restriction, are characterised by decreased capillary volume and surface area [25].
The expression of both CRH and FAP was significantly associated with PE-status and gestational age. While the expression of CRH increased with gestational age and was up-regulated in PE, the expression of FAP decreased with gestational age and was down-regulated in PE. The transcriptional changes of FAP and CRH are probably linked to both gestational age and PE. During pregnancy, CRH is produced by decidual and placental tissue [26], and released into the fetal and maternal circulation. Maternal plasma CRH levels increase over gestation [27], consistent with our observation of CRH among the gestational age-related transcripts. A further elevation of maternal plasma CRH levels has been shown in PE compared to normal pregnancies at the same gestational age [28]. Increased levels of CRH may contribute to the pathogenesis of PE through its role in regulation of vascular resistance and blood flow in utero-placental tissue [29]. Among the transcripts that were up- or down-regulated due to gestational age, we noted decreased expression of galectin 3 (LGALS3) and increased expression of galectin 8 (LGALS8). Galectins are highly expressed at the maternal-fetal interface, and regarded as multifunctional regulators of fundamental cellular processes due to their capacity to modulate functions such as cell-extracellular matrix interactions, proliferation, adhesion, and invasion [30]. Galectin 3 is expressed in placental cell columns, but not in invasive extravillous trophoblasts [31]. The negative correlation between expression and trophoblast invasiveness is in accordance with our finding of decreased galectin 3 expression over gestation. Galectin 8 is expressed by decidual cells, villous and extravillous trophoblasts [32], but its role is less clearly understood.
Figure 3. In A, the transcription level is significantly associated with both PE-status and gestational age. In B, the transcription level is associated with gestational age only, and in C with PE-status only. doi:10.1371/journal.pone.0069848.g003
The IPA analyses of the 22 PE-associated transcripts demonstrated an over-representation of genes associated with metabolic disease (Table 4). PE shares several metabolic abnormalities with cardiovascular diseases and diabetes, and having a PE-complicated pregnancy is associated with increased risk of type 2 diabetes later in life [33]. This agrees with pregnancy acting as a stress factor which could reveal a pre-existing disposition to later life metabolic disease [34]. IPA analysis of the 92 transcripts that were associated with gestational age revealed an over-representation of genes involved in cell assembly and organisation, tissue development, cellular movement, tissue morphology, and connective tissue development and function (Table 5). These findings are in agreement with known biological processes taking place at the maternal-fetal interface during pregnancy, such as trophoblast proliferation, differentiation, invasion and extracellular matrix remodelling.
It is important to consider that the data used in our study was produced on two different microarray platforms, and that tissue sampling procedures differed between the three sub-studies included in our dataset. In dataset #1 and #2, basal plate biopsies were used for transcriptional analyses, whereas in dataset #3, decidual tissue was collected by vacuum suction of the entire placental bed. These differences may pose a potential bias, as gene expression has been shown to differ depending on tissue sampling site [35]. Inter- and intra-platform reproducibility has been shown to be good in terms of detecting differentially expressed genes [36][37][38]. However, the reproducibility of absolute transcription levels is poor, and pre-processing is required before comparisons of transcription values across platforms can be made. To deal with these limitations, and to be able to make both inter- and intra-platform comparisons of transcription values, we performed inverse normal transformation. After this transformation, the distribution of transcription values is assumed equal for all probes, independent of study, sampling method and platform. However, PE-status, gestational age and tissue sampling method differed between studies, and consequently we had to include study as a factor with a separate level for each study in the linear regression model. In our analyses, we exclusively searched for observations that agreed between datasets, and the limitations mentioned above will reduce the power of identifying differentially expressed genes rather than introduce false positive results. Combined with the fact that our pooled dataset only targets 14,678 genes, and that we are using a very stringent threshold for significance, our results likely represent only a small fraction of the total number of transcripts that are influenced by either PE or gestational age.
Had all samples been collected by the same method and analysed on the same arrays, it is likely that a much larger number of differentially expressed transcripts would be identified. In our analyses, we did not allow for interaction between the study/tissue source and the effect of PE or gestational age on transcription values. The main reason for this is that the cause of such interaction would be impossible to determine (e.g. sampling method, microarray type, PE heterogeneity etc.). Instead, we focused on identifying shared effects across studies.
Another limitation in our dataset is the heterogeneity of the PE group. In dataset #2, all cases had severe PE, of which 5 were complicated by FGR. In dataset #3, 30 of 37 cases had severe PE or PE complicated by SGA, and seven had mild PE. We recognise that some of our PE-associated transcripts may therefore be linked to SGA pathogenesis or restricted to severe PE, and that it would have been preferable to use a more homogeneous case group. The sample size of PE cases was too small to allow for any stratification in the analyses. It is therefore important to consider that our results might not be generalisable to all kinds of PE. However, PE is a complex disorder. Classifying individuals into e.g. severe and mild PE does not necessarily mean that these are different disorders. Rather, the diagnoses of both mild and severe PE are based on combinations of different quantitative characteristics, and at some pre-defined cut-off, the disorder is regarded as severe. This means that the underlying causes of PE might be as complex within the patients with severe PE as between cases with severe and mild PE. The NP group is also heterogeneous, including samples from preterm labour without infection (1/3 of these due to cervical insufficiency), elective terminations (of which the future pregnancy outcome is unknown), as well as normal pregnant women. Including cases with preterm labour may have influenced our results, as this condition with or without infection is likely to result from some sort of pathology that could result in gene expression changes. A possible shared pathology between preterm birth and PE might also result in similar changes in gene expression, which would reduce our power to identify PE-specific differentially expressed genes. However, it is almost impossible to match severe PE cases with regards to gestational age by using completely healthy individuals, as normal pregnancies in this gestational age range (week 24-36) are rare.
The inclusion of samples with gestational age <20 weeks may also have confounded our results, as the future pregnancy outcome is unknown, and they may have developed PE later in pregnancy. However, excluding these samples from the analyses did not change the results, but rather decreased the power due to the smaller sample size. Unfortunately, we were not able to validate the microarray results, e.g. through quantitative real-time (qRT-) PCR, for all our significant findings, mainly due to lack of RNA. However, validations have previously been performed for a subset of our genes: ANGPTL2, CRH, TCF7L2, PLA2G7 and TCF7L2 [8,11] with good results. Even though more comprehensive corroborative studies might have been useful for verifying the results, there are many examples where this is not feasible, e.g. due to lack of RNA [39]. It is also worth considering that microarray data are generally much more accurate compared to qRT-PCR. For qRT-PCR, housekeeping genes are assumed to be equally expressed in all samples, and commonly used for normalisation procedures. However, it has been shown that the expression of housekeeping genes varies dramatically between individuals and that the heritability is as high as 0.56 for some housekeeping genes [17]. Consequently, housekeeping genes are not ideal for internal standardisation. The normalisation procedure performed for microarray experiments, based on the average expression level per individual, is likely to generate more precise estimates.
To our knowledge, this is the first study to simultaneously assess the effects of gestational age and PE-status on gene expression at the maternal-fetal interface. We found that a much larger number of transcripts were influenced by gestational age compared to PE status. Based on this, we strongly recommend that adjustments for gestational age should be performed in similar studies. The large sample size achieved by pooling different datasets allowed us to apply a stringent threshold for significance, which has not been feasible in most previous studies. The transcripts identified in this study are likely to be influenced by gestational age or disease status, or even play a direct role in the development of PE.
Figure S1 Variation in transcription values depending on gestational age for all transcripts that were significantly associated with preeclampsia (PE). Normalised transcription values are plotted for different gestational ages (weeks). The red points represent PE pregnancies and blue points normal pregnancies (NP). The lines are the estimated regression lines for gestational age (red line for PE and blue line for NP), separated by the regression coefficient for PE-status. (PDF)
Figure S2 Variation in transcription values depending on gestational age for all transcripts that were significantly associated with gestational age. Normalised transcription values are plotted for different gestational ages (weeks). The red points represent preeclamptic (PE) pregnancies, and blue points normal pregnancies (NP). The lines are the estimated regression lines for gestational age (red line for PE and blue line for NP), separated by the regression coefficient for PE-status. (PDF)