Genome-Wide Transcriptome Directed Pathway Analysis of Maternal Pre-Eclampsia Susceptibility Genes

Background Preeclampsia (PE) is a serious hypertensive pregnancy disorder with a significant genetic component. Numerous genetic studies, including our own, have yielded many susceptibility genes from distinct functional groups. Additionally, transcriptome profiling of tissues at the maternal-fetal interface has likewise yielded many differentially expressed genes. Often there is little overlap between these two approaches, although genes identified in both approaches are significantly associated with PE. We have thus taken a novel integrative bioinformatics approach of analysing pathways common to the susceptibility genes and the PE transcriptome. Methods Using Illumina Human Ht12v4 and Wg6v3 BeadChips, transcriptome profiling was conducted on n = 65 normotensive and n = 60 PE decidua basalis tissues collected at delivery. The R software package libraries lumi and limma were used to preprocess transcript data for pathway analysis. Pathways were analysed and constructed using Pathway Studio. We examined ten candidate genes, which are from these functional groups: activin/inhibin signalling—ACVR1, ACVR1C, ACVR2A, INHA, INHBB; structural components—COL4A1, COL4A2 and M1 family aminopeptidases—ERAP1, ERAP2 and LNPEP. Results/Conclusion Major common regulators/targets of these susceptibility genes identified were AGT, IFNG, IL6, INHBA, SERPINE1, TGFB1 and VEGFA. The top two categories of pathways associated with the susceptibility genes, which were significantly altered in the PE decidual transcriptome, were apoptosis and cell signaling (p < 0.001). Thus, susceptibility genes from distinct functional groups share similar downstream pathways through common regulators/targets, some of which are altered in PE. This study contributes to a better understanding of how susceptibility genes may interact in the development of PE. With this knowledge, more targeted functional analyses of PE susceptibility genes in these key pathways can be performed to examine their contributions to the pathogenesis and severity of PE.


Introduction
Preeclampsia (PE) is a common and serious pregnancy complication characterised by newonset hypertension and proteinuria after 20 weeks' gestation and affects between 2-8% of all pregnancies worldwide [1]. Although the disorder has been known since antiquity, the cause of PE remains elusive with the only known cure being the removal of the placenta [2]. Due to the severity of the mother's condition, there is often an urgent need to deliver the fetus prematurely. PE accounts for over 40% of medically indicated pre-term births [3]. Preterm births are associated with greater neonatal morbidity and mortality in the short term, as well as high economic, health and social costs later in life [4,5]. A PE mother is also at increased long term risk of developing ischemic heart disease, stroke and cardiovascular disease [6][7][8]. Therefore, the consequences of PE are not merely short term but may have persistent, long term effects on the mother and child.
PE susceptibility is influenced in part by a strong genetic component. Heritability estimates range from 0.31 to 0.54 [9][10][11], and numerous candidate genes have been implicated either by linkage or association methods. The ACVR2A [12] and STOX1 [13] genes were identified by genetic linkage methods. Several genes (AGT, ACE, AGTR1, AGTR2, FV, MTHFR, NOS3 and TNFα) have been the focus of candidate gene association studies [14]. Genome-wide association scans have implicated multiple genetic loci [15], including the INHBB [16], and PSG11 loci [17]. It is widely accepted that PE does not follow a Mendelian inheritance except in a few families [18]. Instead, PE is the result of complex interactions between the maternal and fetal genotypes and environment factors.
Another approach used to investigate the pathogenesis of PE is a microarray study design to identify differentially expressed genes in tissues at the maternal-fetal interface and thereby gain an insight into possible disease mechanisms. Placental [19,20] and decidual [21][22][23][24][25] tissue microarray studies have reported numerous differentially expressed genes. However, there is often little concordance between microarray studies, likely due to factors such as insufficient power and heterogeneity of tissue samples [20,26]. To date, most transcriptome studies have been conducted in the fetal-derived placenta with only a few using maternal-derived decidua [19][20][21][22][23][24][25]27]. Conversely, most genetic linkage/association studies have focused on the maternal genotype [11-13, 16, 28-31]. Taken together, this discordance in study design strategies partly explains the very small overlap between genetic association/linkage and expression studies.
Hence, further expression studies should focus on the maternal tissue in addition to the fetal tissue.
Whilst the aforementioned study designs have reported numerous PE candidate genes, there is frustratingly little overlap in the genes identified. A recent study by Founds [32], linking placental global gene expression with PE susceptibility loci, showed that 40% of genes altered in first trimester placental chorionic villus sampling resided in known PE susceptibility loci. However, these account for only 13 genes, a small fraction of known PE candidate genes. A review by Jebbink et al. [33] showed at least 178 genes associated with PE have been identified by both the candidate gene and genome-wide study approaches, with many more genes identified since then. It is unlikely that there will be a single, universal causative gene for PE. Given the large diversity of genes identified from genetic studies and microarray studies, we have taken a novel integrative bioinformatics approach to bridge this gap by analysing pathways common between susceptibility genes identified by the genetic association approach and the PE transcriptome. Identifying common underlying biological pathways will allow us to perform more specific and targeted functional analyses to address the complexities of PE.
Our earlier studies showed that maternal susceptibility genes, which are from various functional groups: activin/inhibin signalling-ACVR1, ACVR1C, ACVR2A, INHA, INHBB; structural components-COL4A1, COL4A2 and M1 family aminopeptidases-ERAP1, ERAP2 and LNPEP, were differentially expressed in tissues at the maternal-fetal interface [12,34]. Functional studies of these genes are presently limited and based on the assumptions of how PE is thought to develop. Hence, to identify unbiased, novel functional roles of these susceptibility genes, we performed a genome-wide transcriptome directed pathway analysis of maternal PE susceptibility genes. By taking into account the underlying pathways of the whole genome instead of focussing specifically on differentially expressed genes, we aimed to use an integrative bioinformatics approach identify novel biological processes involved in the development of PE.

Patient Samples
Decidual basalis samples were collected from a total of n = 65 normotensive and n = 60 PE patients at Caesarean section as described previously [12]. Normotensive patients underwent Caesarean section due to breech presentation, maternal request or previous history. PE was defined according to the Australasian Society for the Study of Hypertension in Pregnancy criteria [35,36]. Exclusion criteria for PE patients included diabetes and systemic lupus erythematosus. Blood pressures of normotensive patients were recorded as <140/90 mmHg. A non-treating obstetrician independently verified patient clinical records. Tissue samples were verified as decidual by hospital pathologists. Each patient gave written informed consent for samples to be used for the study. Research and ethics approval was granted by The Royal Women's Hospital Research and Ethics Committees, Melbourne, Australia and the Institutional Review Board of the University of Texas Health Science Center at San Antonio, Texas, USA.

Sample Processing
Harvested decidual tissue was placed into an appropriate volume of RNA-later (Qiagen, Hilden, Germany) and kept at 4°C for at least 24 hrs. Up to 250 mg of decidual tissue was then removed from the RNA-later and stored at -80°C. Total RNA was extracted from decidual samples using RNeasy Midi kits (Qiagen) with yield and quality determined as described previously [12]. Complementary RNA synthesis, amplification and purification were performed as described previously [24].

Transcriptional Profiling
Microarray interrogation of decidual complementary RNA was performed in two batches. The first batch of n = 23 normotensive and n = 25 PE samples were hybridised onto Illumina HumanWG-6 v3 Expression BeadChips (Illumina Inc., San Diego, CA, USA), while the second batch of n = 42 normotensive and n = 35 PE samples were hybridised onto Illumina HumanHT-12 v4 Expression BeadChips (Illumina Inc.) in accordance with Illumina's Whole-Genome Gene Expression Direct Hybridisation assay protocol. All samples were scanned on the Illumina iScan System with iScan Control Software (v3.2.45). Illumina's GenomeStudio software (v2010.2), Gene Expression Module (v1.7.0), was used to generate a control summary report to assess assay performance and quality control metrics. Updates in array content, from one BeadChip version to another, often results in changes in transcript probe identifiers. We therefore utilised PROBE_SEQUENCE information as the unique identifier to highlight transcript probes common to both the HumanWG-6 and HumanHT-12 BeadChips. A total of 39,426 common probes were identified for data pre-processing. To account for batch effects, the data from each batch was analysed independently. The raw microarray data are accessible at the Gene Expression Omnibus repository with the Accession Number GSE60438 (National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, USA).

Data Pre-processing
Background noise was subtracted from transcript data for analysis using Illumina's GenomeStudio software (v2010.2), Gene Expression Module (v1.7.0). The data from each batch were then pre-processed independently with the open source software R version 3.0.2 available via www.bioconductor.org. The lumi R package [37] was used to log2-transform and quantile normalise the data. The limma R package [38] was then used to rank differential gene expression with moderated t tests.

Pathway Analyses
The list of ranked genes for each microarray batch and the list of susceptibility genes were imported into Pathway Studio 9.0 (Elsevier, Amsterdam, Netherlands) for pathway analysis. Gene set enrichment analysis (GSEA) was performed on gene expression data to identify altered pathways throughout the genome. To determine the pathways associated with the susceptibility genes, a sub-enrichment analysis was performed on the list of susceptibility genes. Pathways are represented by the Gene Ontology (GO) set class of biological processes. Data files were then exported as databases in Microsoft Access 2010 (Microsoft Corp., Redmond, WA, USA) to first determine the consistently altered pathways between the two transcriptome profiling batches and then the concordant pathways between susceptibility genes and the PE transcriptome.

Gene Network Construction
To determine the interactions between susceptibility genes from the different functional groups, gene networks were constructed. The literature-based ResNet Mammalian 9.0 Database in Pathway Studio 9.0 (Elsevier) was used to determine common pathway targets and regulators of the susceptibility genes. Pathways were selected using their expression relationship. Each pathway link is supported by at least one published reference. The references for each link were manually cross-checked to remove any erroneous links.

Statistical Analyses
Student's t test with Welch's Correction and 2 × 2 contingency table with Fisher's Exact Test were used for analysing patient characteristics where appropriate on GraphPad Prism 5 (GraphPad Software Inc., La Jolla, CA, USA). A value of p < 0.05 was considered statistically significant for patient characteristics. Mann-Whitney U test was used for enriching significant pathways in the GSEA on Pathway Studio 9.0. For the pathway analyses, a semi-conservative value of p < 0.001 was selected as the statistical cut-off to maximise the identification of novel pathways, while minimising the number of potential false positives in multiple testing.

Patient Characteristics
Gestational age, infant birth weight, birth weight percentiles, gravidity and parity between the n = 65 normotensive and n = 60 PE patients were significantly different (Table 1). No significant difference was observed for maternal age or infant sex. The significant differences for gravidity and parity were expected given that PE is more common in first pregnancies. The lower birth weights and gestational age at delivery for the PE patients are consistent with earlier delivery due to the severity of the disease.

Common pathways of susceptibility genes
Pathways and interactions between susceptibility genes from the various functional groups were determined by an inbuilt literature-based database search in Pathway Studio 9.0. The major common pathway regulators and targets of susceptibility genes, with four or more connections, are AGT, IFNG, IL6, INHBA, SERPINE1, TGFB1 and VEGFA (Fig 1). A similar analysis of the pathways and interactions between these major regulator and targets was then performed to identify their downstream genes that could serve as novel PE biomarkers. In total, 13 genes (CDH1, EDN1, ENG, FLT1, IL10, INS, KDR, MMP2, MMP9, NOS2, NOS3, PTGS2 and TNF) downstream of these major regulators and targets were identified (Fig 2). Enrichment of the pathways associated with the susceptibility genes identified a total of 114 GO sets in 15 pathway categories ( Table 2). The top three pathway categories were in the areas of reproduction, cell signalling and liver function. There were 10 pathway categories that were associated with at least two functional groups of susceptibility genes. All three functional groups of susceptibility genes were present in the pathway categories of neural function, differentiation and angiogenesis. Further details of these GO sets are presented as supplementary information in S1 Table. Major altered PE pathways GSEA of the PE decidual transcriptome yielded 42 GO sets that were consistently altered between the two transcriptome profiling batches. The 13 pathway categories of these 42 differentially expressed GO sets (p<0.001) are presented in Table 3. The top three altered pathway categories were in the areas of immunity/inflammation, cell signalling and apoptosis, which represent 28 GO sets. Detailed information of the GO sets is available in S2 Table. Overlap of PE transcriptome and PE susceptibility genes The pathway categories of the GO sets that are concordant between the PE decidual transcriptome and the susceptibility genes are presented in Table 4. There were a total of 8 common GO sets that were grouped into five pathway categories. The top two pathway categories were apoptosis and cell signalling, which represent five GO sets. Details of the individual GO sets are available in S3 Table.  Differentially expressed genes From the ranking of the genes using the limma R package, none of the genes reached statistical significance when the stringent Bonferroni's Correction (p < 1.26 X 10 6 ) for multiple testing was applied. However, by using the semi-conservative statistical cut-off of p<0.001, a total of 8 differentially expressed genes were identified as concordant between the two datasets ( Table 5). Downregulated genes were CD72, DBP, DPP7, HS3ST2, PER3, SLC2A6 and TNFRSF14. PDK4 was the only upregulated gene. The entire list of genes with concordant differential expression between the two datasets is available in S4 Table.

Discussion
In this study, we performed a novel genome-wide transcriptome directed pathway analysis of maternal PE susceptibility genes in a large set of 125 decidual samples. The transcriptome profiling yielded a total of only 8 differentially expressed genes. Two genes have reported associations with PE. DBP, an albumin promoter binding protein, was previously reported by Løset et al. [24] to be decreased in PE decidua and our results support this. In an earlier study of the fetal placenta, the other gene SLC2A6, which codes for a glucose transporter, was significantly increased in PE [39] and contrasts with our results where we found a decrease in PE maternal decidua. This may be reflective of the tissue being sampled from different parts of the maternal-fetal interface. The remaining genes have no known association with PE. However, this process of identifying differentially expressed genes that satisfy a statistically significant threshold may overlook other genes of equal or greater biological relevance; for example, smaller fold changes in the gene expression levels of several genes (an additive effect) in a common pathway may have a greater downstream impact compared with a large fold change in expression levels from a single gene [26]. By taking into account the decidual transcriptome, instead of focussing specifically on individual differentially expressed genes, there is increased power of detecting altered pathways in PE. Hence, by using our novel approach, we identified common altered pathways shared between microarray data and susceptibility genes.
To determine the interactions between the susceptibility genes from the various functional groups, we first constructed gene network pathways to identify common regulators and targets. There are many complex interactions between the genes. Some of these identified genes are both a regulator and target of the susceptibility genes. For example, one target of ERAP1 is AGT and AGT in turn is a regulator of ACVR2A. ERAP1 and ACVR2A are from different functional groups; however, we demonstrate an interactive link between these two genes via AGT, a finding that is not apparent from the traditional study approaches. These major regulators and targets are also implicated in previous PE studies. AGT is a major component of the renin-angiotensin-system, which regulates blood pressure and body-fluid volume, and has been widely investigated as a PE candidate gene [18]. Many studies have shown the remaining genes, which encode various cytokines, growth factors and protease inhibitors, are measurable in the maternal circulation and are significantly altered in PE. Activin A and inhibin A dimers derived from INHBA, IFNγ, IL6, plasminogen activator inhibitor-1, SERPINE1 and TGFβ1are all significantly increased in the maternal circulation in PE-affected women [40][41][42][43][44][45]. In contrast, VEGFA, an angiogenic factor, is significantly reduced in the circulation of women with PE [46]. Further analysis of downstream genes of the common regulators and targets was undertaken to identify potential PE biomarkers. With no selection bias, this analysis identified the anti-angiogenic factors, soluble FLT1 (sFLT) and endoglin (ENG), which are currently widely explored as predictive PE biomarkers and were first identified through microarray studies [47][48][49]. The other downstream genes also have recognised roles in PE. MMP2, MMP9 and CDH have altered expression in PE, with functional roles in the invasion of trophoblast cells into the maternal decidua [50][51][52][53][54]. KDR codes for a VEGF receptor that is significantly decreased in PE and may contribute to the endothelial dysfunction observed in PE [55]. Endothelin, which is coded by EDN1, is a vasoconstrictor that is significantly increased in the circulation of PE women [56]. Endothelin (EDN1), endothelial nitric oxide synthase (NOS3), inducible nitric oxide synthase (NOS2) and prostaglandin-endoperoxide synthase 2 (PTGS2) are well known for their involvement in maintaining blood pressure [57]. An IL10 null rodent model of PE was developed, highlighting a possible role for IL10 in PE [58]. Thus, these may be possible targets through which these PE susceptibility genes act to influence the development of PE and more targeted functional analyses can be performed with this knowledge. Therefore, a better understanding of how these different proteins interact may enable the development of a rigorous panel of PE biomarkers.
Additionally, the pathways associated with the susceptibility genes were determined and categorised. The majority of these pathway categories were associated with at least two functional groups of genes from the activin/inhibin signalling, structural components and M1 family aminopeptidases functional groups. The pathway categories of neural function, differentiation and angiogenesis had all three functional groups involved. This provides evidence that genes from distinct functional groups interact with each other and are involved in multiple pathways. Most, if not all of these pathway categories are thought to be important for a healthy, uncomplicated pregnancy. Therefore defects in multiple genes affecting several important pathways may promote PE susceptibility, providing additional weight behind the complex, multi-factorial nature of this serious disorder of obstetric medicine.
The top three altered PE pathway categories in the decidual transcriptome were immunity/ inflammation, cell signalling and apoptosis, representing 28 altered gene sets, which were consistently altered between both transcriptome profiling batches. These pathway categories are consistent with the published literature [24,59]. The top pathway category of immunity/inflammation supports the growing evidence of a highly dysregulated immune and inflammatory response at the PE maternal-fetal interface [60]. PE is hypothesised to be partly due to immune maladaptation to paternal antigens carried by the fetus, which leads to an exacerbated inflammatory response [61]. The disruption of cell signalling cascades that modulate many processes during pregnancy such as trophoblast invasion and spiral arteriole remodelling is hypothesised to lead to the shallow trophoblast invasion and poor spiral arteriole remodelling observed in PE pregnancies [62,63]. Abnormal apoptosis regulation is also commonly observed in PE with alterations in multiple pathways such as the p53 pathway [64]. The concordant pathway categories between the susceptibility genes and the PE decidual transcriptome represent the altered pathways associated with the susceptibility genes. The top two concordant pathway categories of apoptosis and cell signalling, were also among the top three categories altered in the PE transcriptome. Therefore, the susceptibility genes may contribute to the development of PE through these particular pathways and focussing our functional analyses of the susceptibility genes in these areas will be of importance.
This integrative bioinformatics approach allows us to identify novel interactions and unbiased functional roles of the susceptibility genes. For example, the effect of altered collagen expression on blood pressure regulation through vasoactive factor production could be examined. Based on the gene networks, COL4A1 regulates VEGFA, which in turn regulates many vasoactive factor genes such as NOS3, NOS2, PTGS2 and FLT1. This novel function is not apparent from the known structural role of collagen. Interestingly, recent studies show that cleavage products derived from the non-collagenous domain of both COL4A1 and COL4A2, have significant anti-angiogenic effects on endothelial cells including increased apoptosis and decreased proliferation, and are being explored as novel cancer therapeutics [65][66][67]. Hence, this may be a plausible pathway through which collagen affects blood pressure regulation. The pathway category of blood pressure regulation was nominally altered (p < 0.05) in the PE decidual transcriptome. Therefore, undertaking this pathway-directed approach allows us to rationalise various studies that appear disparate, as the results from this study show that the genes identified through the different approaches interact with each other.
Given the complex genetics of PE, it is likely that the genes from other previously identified susceptibility loci, not present in our gene networks, may be part of a further extension of the current networks of gene interactions. Of the genes represented in the gene networks of this study, two genes SHH and NOS3 reside at the 7q36 locus [28,30]. The other loci identified thus far are at chromosomes 2p13, 2p25, 2q22, 9p13 and 10q22 [13,29,31,68]. Further pathway analysis of these previously identified loci is warranted to extend the current knowledge.
In summary, we found that maternal PE susceptibility genes from distinct functional groups share similar downstream pathways through common regulators and targets. Downstream pathways associated with the susceptibility genes are altered in PE. Common pathways are the link between genes identified through the multiple approaches. An integrative bioinformatics approach allows us to identify novel interactions and unbiased functional roles of the susceptibility genes. Therefore, with this knowledge more targeted functional analyses of PE susceptibility genes in these key altered pathways can be performed to examine their contributions to the pathogenesis and severity of PE.
Supporting Information S1