Comparison of Normal and Pre-Eclamptic Placental Gene Expression: A Systematic Review with Meta-Analysis

Pre-eclampsia (PE) is a serious multi-factorial disorder of human pregnancy. It is associated with changes in the expression of placental genes. Recent transcription profiling of placental genes with microarray analyses has offered better opportunities to define the molecular pathology of this disorder. However, the extent to which placental gene expression changes in PE is not fully understood. We conducted a systematic review of published PE and normal pregnancy (NP) control placental RNA microarrays to describe the similarities and differences between NP and PE placental gene expression, and examined how these differences could contribute to the molecular pathology of the disease. A total of 167 microarray samples were available for meta-analysis. We found that the expression pattern of one group of genes was the same in PE and NP. The review also identified a set of genes (PE unique genes), including a subset that were significantly (p < 0.05) down-regulated in pre-eclamptic placentae only. Using class prediction analysis, we further identified 88 genes whose expression was highly associated with PE (p < 0.05), 10 of which (LEP, HTRA4, SPAG4, LHB, TREM1, FSTL3, CGB, INHA, PROCR, and LTF) were significant at p < 0.001. Our review also suggested that about 30% of genes currently being investigated as possibly of importance in PE placenta were not consistently and significantly affected in the PE placentae. We recommend further work to confirm the roles of the PE unique and associated genes that are not currently being investigated in the molecular pathology of the disease.


Introduction
Pre-eclampsia (PE), a major cause of perinatal mortality, complicates up to 8% of all pregnancies in Western countries [1][2][3]. It is one of the top 4 causes of maternal mortality and morbidity worldwide, causing 10 to 15% of maternal deaths [2][3][4]. PE is characterised by new hypertension (blood pressure of ≥140/90 mmHg) on two separate readings at least 6 hours apart, presenting after 20 weeks' gestation in conjunction with clinically relevant proteinuria (≥300 mg per 24 hours) [5].
PE is a multifactorial disease, and while there is a cautious acceptance of links between familial concordance and maternal polymorphism in the pathogenesis of the disease [6][7][8][9][10][11][12][13], the placenta is suggested as the primary cause of PE [14,15]. Nonetheless, there is a degree of uncertainty, especially about the roles of gene regulation and expression in the molecular pathogenesis of the disease. As expected, knowledge of placental gene expression is advancing [16][17][18]. While recent meta-analyses of Relative Gene Expression (RGE) in NP and PE placentae have linked changes in specific placental genes to PE [13,19], these studies have often focused on identifying genes that are either highly up-regulated or down-regulated between matched case and control samples. Traditionally, this approach has been regarded as highly suitable for candidate gene discovery or class prediction studies [20][21][22]. However, it has lately been suggested to be less sensitive for microarray studies that seek to account for variability in gene expression across samples within the same class, or to map the molecular pathology of a disease from 'noisy' data sets [23][24][25][26]. We therefore examined whether RGE analysis would identify the same PE genes as Absolute Gene Expression (AGE) analysis, and sought to determine the functional roles of gene sets or families that are equally expressed at high or low levels in both NP and PE placentae.
Therefore, in this study we provide evidence that AGE analyses identify gene sets whose combined expression patterns could uniquely characterise biological and functional phenotype for PE placentae. We further provide evidence for putative inter-relationships and contributory roles of equally low or high level expressed genes in the molecular pathology of PE.

Study selection
The public data repositories Gene Expression Omnibus (GEO) and ArrayExpress Archive were systematically searched in accordance with PRISMA and MIAME in December 2014, and the search was repeated in June 2015. No time limit for data publication was set. Search terms used were NP placenta, PE and Term placenta explant. Study series that did not report on placental tissue but on other tissues, such as chorionic villous tissue, decidua, trophoblast cell lines and basement membrane, were excluded. Similarly, study series with no matched control group, with a control group composed of pregnancies complicated by small for gestational age fetuses or gestational diabetes, with non-Homo sapiens controls, or with non-term placentae were excluded. Also excluded were duplicate samples, methylation profiling arrays, protein profiling arrays, long non-coding RNA (long ncRNA, lncRNA) arrays, and all complications of human pregnancy other than PE.

Array Processing and Quality Control
Data for each included sample were downloaded from GEO (or from ArrayExpress if not available in GEO). The series data were prepared according to the INMEX [27] requirements for meta-analysis, and exported into INMEX. Probe IDs from the different platforms were re-annotated in INMEX into Entrez gene IDs, using the November 2012 annotation information obtained from NCBI GenBank and Bioconductor. Multiple probes mapping to the same gene were averaged, and the combined probes are hereafter referred to as genes.
To prepare the data for differential expression analysis using Linear Models for Microarray Data (Limma), the data were log transformed onto an additive scale, and then quantile normalised. Microarray quality appraisal was further performed using the INMEX built-in protocol. Firstly, low quality samples were defined as samples with >60% missing data, and study series containing such samples were rejected. Using the INMEX built-in re-annotation protocol, study series with fewer than 10 common genes were also excluded from further analysis.
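The log transformation and quantile normalisation step described above can be illustrated with a minimal sketch. This is not the INMEX or Limma implementation; the function names and the toy matrix are hypothetical, the log base is assumed to be 2, and tied values are ranked in input order rather than averaged.

```python
import math


def log2_transform(matrix):
    """Log2-transform a genes x samples matrix of positive intensities."""
    return [[math.log2(value) for value in row] for row in matrix]


def quantile_normalise(matrix):
    """Force every sample (column) to share the same empirical distribution.

    matrix: list of rows, one row per gene, one column per sample.
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    columns = [[matrix[g][s] for g in range(n_genes)] for s in range(n_samples)]
    sorted_columns = [sorted(column) for column in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        sum(column[k] for column in sorted_columns) / n_samples
        for k in range(n_genes)
    ]
    normalised = [[0.0] * n_samples for _ in range(n_genes)]
    for s, column in enumerate(columns):
        order = sorted(range(n_genes), key=lambda g: column[g])
        for rank, g in enumerate(order):
            normalised[g][s] = reference[rank]
    return normalised


# Hypothetical raw intensities: 3 genes (rows) x 2 samples (columns).
raw = [[2.0, 4.0], [8.0, 16.0], [32.0, 8.0]]
normalised = quantile_normalise(log2_transform(raw))
```

After normalisation, every column contains the same set of values, so between-sample differences in overall intensity no longer drive the downstream comparison.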

Finding Significantly Expressed Genes in Pre-eclampsia
A significantly expressed gene was defined as a gene that showed a consistently stronger aggregated differential expression (DE) profile across the multiple datasets [24,27]. The DE genes were therefore identified by combining p-values from the multiple studies using Fisher's method (-2 × ∑ log(p)) (p < 0.05) for Relative expressions, or by RankProduct analysis for Absolute expressions.
RankProduct analysis, a non-parametric statistic [24], was used to identify genes that were consistently up-regulated or down-regulated in PE or NP placentae. The RankProduct analysis combined the gene ranks from the different arrays, instead of using actual expression data, to select genes that were consistently ranked high or low [24,26]. The rank products from all samples were then calculated as the test statistic in 100 permutations, with False Discovery Rate (FDR) confidence at 1 - alpha = 95.0%. Genes with consistently high ranks (smaller rank products) across the different microarrays were classified as up-regulated. Genes with consistently low ranks (larger rank products) across the different arrays at the stated FDR were classified as down-regulated, whereas genes with inconsistent rank products across the different studies were classified as non-significant [28].
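As an illustration of how Fisher's method combines per-study p-values for one gene, the following minimal sketch computes the statistic -2 × ∑ log(p) and its chi-square p-value with 2k degrees of freedom, where k is the number of studies. The function name is hypothetical and this is not the INMEX code; it assumes independent studies and p-values strictly greater than zero.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p), where statistic = -2 * sum(log(p))
    follows a chi-square distribution with 2k degrees of freedom under
    the null hypothesis (k = number of p-values).
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Closed-form chi-square survival function for even degrees of freedom:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical p-values for one gene across three study series.
statistic, combined_p = fisher_combine([0.04, 0.20, 0.01])
```

A gene with moderately small p-values in several series can therefore reach significance after combination even if it is not significant in every individual series.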
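The rank product idea can be sketched as follows. This is a simplified illustration rather than the RankProd or INMEX implementation: it assumes rank 1 is given to the most highly expressed gene in each array, uses the geometric mean of ranks as the statistic, and estimates p-values by permuting gene labels within each array. All names and the toy data are hypothetical.

```python
import math
import random


def rank_within_array(values):
    """Rank genes within one array; rank 1 = highest expression."""
    order = sorted(range(len(values)), key=lambda g: -values[g])
    ranks = [0] * len(values)
    for rank, gene in enumerate(order, start=1):
        ranks[gene] = rank
    return ranks


def rank_products(arrays):
    """Geometric mean of each gene's ranks across arrays (same gene order)."""
    rank_lists = [rank_within_array(array) for array in arrays]
    k = len(arrays)
    n_genes = len(arrays[0])
    return [
        math.exp(sum(math.log(ranks[g]) for ranks in rank_lists) / k)
        for g in range(n_genes)
    ]


def permutation_p_values(arrays, n_perm=100, seed=0):
    """Fraction of permuted rank products at or below each observed value."""
    rng = random.Random(seed)
    observed = rank_products(arrays)
    n_genes = len(observed)
    counts = [0] * n_genes
    for _ in range(n_perm):
        shuffled = [rng.sample(array, len(array)) for array in arrays]
        null = rank_products(shuffled)
        for g in range(n_genes):
            counts[g] += sum(1 for value in null if value <= observed[g])
    return [count / (n_perm * n_genes) for count in counts]


# Hypothetical expression values: 3 arrays x 4 genes.
arrays = [
    [9.0, 5.0, 3.0, 1.0],
    [8.0, 6.0, 2.0, 1.5],
    [7.0, 4.0, 3.5, 0.5],
]
rp = rank_products(arrays)
p_up = permutation_p_values(arrays, n_perm=100, seed=0)
```

Because only ranks enter the statistic, a gene that is consistently near the top of every array obtains a small rank product regardless of the absolute scale of each platform.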

Gene-Disease Association Analysis
Gene-Disease Association analysis was performed to identify genes whose expression could be associated with PE placentae [29]. Using BRB-ArrayTools, five prediction methods were applied: Compound covariate predictor, Diagonal linear discriminant analysis, K-nearest neighbours (for K = 1 and 3), Nearest Centroid, and Support vector machines. With a fixed internal random seed, genes were selected using a combination of univariate F-test (p < 0.05) and leave-one-out cross-validation at 100 permutations, and were further evaluated with ROC curve analysis.
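To make the cross-validation step concrete, the sketch below runs leave-one-out cross-validation with a nearest centroid classifier, one of the five methods named above. It is a simplified stand-in rather than the BRB-ArrayTools procedure: it omits the univariate F-test gene selection and the permutation test, and the function names and toy samples are hypothetical.

```python
def centroid(rows):
    """Per-gene mean of a list of samples (each sample is a list of values)."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid_predict(train_x, train_y, sample):
    """Assign the sample to the class whose centroid is closest."""
    centroids = {}
    for label in set(train_y):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = centroid(members)
    return min(centroids, key=lambda label: squared_distance(centroids[label], sample))


def loocv_accuracy(samples, labels):
    """Hold out each sample in turn, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


# Hypothetical two-gene expression profiles for six placental samples.
samples = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8], [5.0, 5.1], [4.9, 5.3], [5.2, 4.8]]
labels = ["NP", "NP", "NP", "PE", "PE", "PE"]
accuracy = loocv_accuracy(samples, labels)
```

In a full analysis, any gene selection would need to be repeated inside each held-out fold so that the left-out sample does not influence which genes are used.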

Functional Role Assignment
Biological relevance of the differentially expressed genes was determined using gene enrichment analysis programmes in WebGestalt [30]. Briefly, Kyoto Encyclopedia of Genes and Genomes (KEGG) Homo sapiens genome pathways database [31] was probed with the differentially expressed genes to identify statistically enriched pathways. The Benjamini & Hochberg (BH) hypergeometric test [32] was used for all enrichment evaluation analyses, with adjusted p values based on R function p.adjust.
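The enrichment test and multiple-testing adjustment can be sketched as below: a one-sided hypergeometric p-value for the overlap between a gene list and a pathway, followed by the Benjamini and Hochberg adjustment, which follows the same step-up rule as R's p.adjust with method "BH". This is an illustration, not WebGestalt's code; the function names and example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_background genes, n_pathway of which belong to the pathway."""
    upper = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: 10 background genes, 3 in the pathway,
# 2 differentially expressed genes, both of which fall in the pathway.
p_pathway = hypergeom_enrichment_p(10, 3, 2, 2)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```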

Results
Following the systematic search, a total of 41 microarray study series were identified (Fig 1). Twelve of the study series met the inclusion criteria, and 29 series (S1 Table) were excluded based on the eligibility criteria. Of the 12 series that met the eligibility criteria, 6 series (GSE30186, GSE25906, GSE35574, GSE43942, GSE4707, GSE47187) (Table 1) passed the quality and integrity checks. The remaining six failed the INMEX microarray quality and integrity assessments and were further excluded (S2 Table). Altogether 167 samples, consisting of 68 PE and 99 NP, met the sample inclusion criteria for the meta-analysis. A total of 16701 genes passed the filtering criteria.

Patterns of Gene Expression in NP and PE Placentae
The samples were analysed to determine the patterns of gene expression in NP and PE placentae. We used AGE and RGE analyses to characterise the respective patterns in PE and NP placentae.
AGE analysis for NP and PE placental genes. RankProd meta-analysis was used to identify AGE in NP or PE (Table 2). Significant AGE was defined as genes whose rank product of expression was persistently ranked as positively (up) or negatively (down) regulated across all PE only or NP only placentae, at a given false discovery rate (FDR < 0.05). Data output was expressed as the rank product of mean expression levels. For NP, a total of 1922 genes were identified as consistently significant (FDR < 0.05). Of these, 846 genes were negatively regulated and 1076 were positively regulated (Table 2). The expression levels of 14779 genes in NP placentae were inconsistent and were classified as non-significant. In contrast, the expression of 9540 genes in PE placentae was consistent and significant (FDR < 0.05) (Table 2). Of these, 5146 (54%) genes were significantly down-regulated and 4394 (46%) genes were up-regulated in the PE placentae. The expression levels of 7161 genes in PE placentae were inconsistent and thus were classified as non-significant (Table 2).
RGE analysis for PE placental genes. RGE was defined as the relative quantitation of the differences in the expression level of a gene between the PE and NP placental samples [27]. The data output was expressed as a fold-change of expression levels in PE relative to NP. Using Fisher's method to combine p values, the expression of 4349 genes was identified as significant in PE (p < 0.05), relative to NP (Table 2). Of these, 2197 (13%) genes were negatively regulated, and 2152 (13%) were positively regulated (Table 2). Fig 2 shows that 2071 of these genes were differentially expressed across the study series before meta-analysis, and a further 2278 genes were significant (p < 0.05) only after meta-analysis. The expression of 172 other genes lost significance after meta-analysis.
Trends in placental gene expression and PE unique genes. Trends in the changes to PE placental gene expression were determined by examining the relationships between the PE and NP Absolute and Relative gene sets. First, we compared the gene counts in the respective gene sets from the Absolute PE and NP analyses. Fig 3 shows that there were about 6-fold more negative significant genes in PE than in NP. Similarly, there were about 4-fold more positive significant genes in PE than in NP (Fig 3). The proportion of non-significant genes in NP following AGE was twice the proportion found in the PE non-significant gene set. Interestingly, while the proportions of genes identified as positive or negative significant from RGE were about half of those identified from AGE (Table 2), the number of Relative PE non-significant genes was similar to the number of Absolute NP non-significant genes. We therefore examined further whether there was any relationship between the Absolute NP non-significant genes, the Absolute PE significant genes and the Relative PE significant genes.
Using BioVenn [33] we identified four sets of genes from the comparison of the PE and NP Absolute genes: (1) a first set of genes that were significantly (p < 0.05) down-regulated in both NP and PE placentae (n = 846); (2) a second set of genes that were significantly (p < 0.05) down-regulated only in PE placentae (n = 4300; S5 Table); (3) a third set of 1076 genes that were significantly (p < 0.05) up-regulated in both NP and PE (S6 Table); and (4) a fourth set of 3318 genes that were significantly (p < 0.05) up-regulated only in PE (S7 Table).
Further comparison of the PE negative and positive significant gene sets with the NP non-significant gene sub-group showed that none of the PE unique genes was significantly regulated in NP placentae (Fig 4C & 4D). Altogether, there were 7618 more significantly regulated genes in PE than in NP at the FDR confidence of 1 - alpha = 95.0%.
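The Venn-style partition described above reduces to set intersections and differences. The sketch below is illustrative only and is not how BioVenn is implemented: the function name and the placeholder gene identifiers are hypothetical, while the final arithmetic uses the counts reported in the text.

```python
def partition_gene_sets(np_down, np_up, pe_down, pe_up):
    """Split significant genes into shared and PE-only groups by direction."""
    return {
        "down_in_both": np_down & pe_down,
        "down_pe_only": pe_down - np_down,
        "up_in_both": np_up & pe_up,
        "up_pe_only": pe_up - np_up,
    }


# Hypothetical placeholder identifiers, not genes from the study.
np_down = {"g1", "g2"}
np_up = {"g3"}
pe_down = {"g1", "g2", "g4", "g5"}
pe_up = {"g3", "g6"}
groups = partition_gene_sets(np_down, np_up, pe_down, pe_up)

# Consistency check using the counts reported in the text.
pe_only_down, pe_only_up = 4300, 3318
pe_total, np_total = 9540, 1922
excess_in_pe = pe_only_down + pe_only_up  # 7618
```

The same check shows that the 7618 additional genes regulated in PE equal the difference between the 9540 PE and 1922 NP significant genes.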

Links between Relative and Absolute PE Significant Genes
We further examined the relationship between the PE significant Relative and Absolute genes. All 4394 PE Absolute positive (up-regulated) significant genes were compared with the 2152 PE Relative positive significant genes. We expected all Relative PE genes to be identified among the Absolute PE gene sets. However, only 79% (n = 1688) of the Relative positive significant genes were identified in the PE Absolute positive genes. A much smaller proportion (24%, n = 524) of the total PE Relative negative (down-regulated) significant genes (n = 2197) was identified in the Absolute PE negative significant genes. Overall, only 51% of the Relative significant genes were identified in the Absolute gene sets, with the majority localised within the positive significant gene set. Further examination showed that the expression signals of the significant Relative genes unmatched to Absolute genes had been classified as inconsistent and non-significant by the AGE analysis. In contrast, the Absolute genes not matched to Relative genes typically showed a low level expression profile or were similarly expressed in both NP and PE placentae.

PE Placental Associated (PPA) Genes and Current Research
We tested the hypothesis that NP and PE placental gene expression profiles do not differ, and that a prediction analysis would not discriminate between the NP and PE genes but only pick up the random noise in the data set. To examine this, all 16701 genes from the 99 NP and 68 PE placental microarray samples were tested, and the expression of 88 genes (Table 3) was significantly (p < 0.05) associated with PE placentae (Pre-eclamptic Placenta Associated, PPA).
A ROC evaluation of the prediction accuracy was performed by plotting the sensitivity against 1-specificity for each result value of the test with tools available in BRB-ArrayTools. Three prediction algorithms were used to generate the ROC: compound covariate predictor (CCP), diagonal linear discriminant analysis (DLDA), and Bayesian compound covariate predictor (BCCP). The analysis yielded a very modest but comparable ROC (Fig not shown) for all three algorithms, with AUCs of 0.226 (CCP), 0.246 (DLDA) and 0.227 (BCCP).
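For reference, the area under a ROC curve can be computed directly as the probability that a randomly chosen case receives a higher predictor score than a randomly chosen control, with ties counted as one half. The sketch below is a generic illustration, not the BRB-ArrayTools routine; the function name, the choice of PE as the positive class and the toy scores are assumptions.

```python
def roc_auc(scores, labels, positive="PE"):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, label in zip(scores, labels) if label == positive]
    negatives = [s for s, label in zip(scores, labels) if label != positive]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predictor scores for four samples.
labels = ["PE", "PE", "NP", "NP"]
auc_separated = roc_auc([0.9, 0.8, 0.2, 0.1], labels)
auc_mixed = roc_auc([0.9, 0.2, 0.8, 0.1], labels)
```

On this scale an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.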
We further evaluated the currency of gene-to-publication ranks of these PPA genes by probing the scientific literature with GLAD4U (Gene List Automatically Derived For You, [34]). The search retrieved 6,288 publications, of which 642 contained information on 493 genes related to PE placenta. After ranking, 76 genes were significant (p < 0.01) and prioritised as highly relevant to PE placenta (Table 4). The overlap between GLAD4U genes and PPA genes showed that only 6 of the latter genes (FLT1, ENG, INHA, LEP, PAPPA2, and HTRA4) were scored as significant and highly relevant from GLAD4U. Interestingly, 3 of these genes (LEP, HTRA4, INHA) also appeared in the top 10 of the PPA genes (Table 3). Similarly, fewer than expected GLAD4U genes (Table 5) were respectively identified in the PE Relative genes (22 genes) and the PE Absolute genes (49 genes). Collectively, about 36% of genes identified from the literature as highly relevant for PE placenta could not be confirmed as significant or consistently expressed in PE placentae following a large scale microarray meta-analysis.
(Table 6).
We repeated the analysis with the Relative significant genes, and 176 pathways were significantly affected in PE placentae. Of these, 164 were correctly mapped to Absolute genes affected pathways, but with variations in enrichment ratios. Examination of the 12 pathways affected

Discussion
PE is a serious complication of human pregnancy. While previous studies have led to clear descriptions of symptoms and diagnosis, our understanding of the genes altered in PE is still limited. In an attempt to identify a common set of dysregulated genes in PE placentae, we subjected a thoroughly screened subset of existing datasets to a robust set of analyses. Interestingly, the data revealed that over a third of the genes identified in the literature as being implicated in PE were not identified as associated with or consistently expressed in PE placentae. This raises the question of whether current trends in PE genomic investigations accurately reflect the true nature of the molecular pathology of the condition. In cognisance of this, we identified specific gene sets that have not been previously reported for PE. There was an expectation that all the significant RGE genes would be mapped to the AGE PE genes. Rather, only 51% of the RGE genes were identified from the AGE PE genes. The remaining RGE significant genes showed varied levels of expression between PE and NP placentae but were classified as inconsistent by RankProd analysis. In contrast, 77% of the AGE genes did not match the RGE genes. Of these, about 80% were genes that showed low or similar levels of expression in both PE and NP but were consistently expressed in PE placentae only. Thus, the current findings show that the use of AGE analysis enables the description of a comprehensive set of globally and consistently expressed PE placental genes. On the other hand, the findings show that over-reliance on RGE analysis to the disadvantage of AGE could limit gene sets and our understanding of the extent and complexity of the changes that could occur in the PE state.
These findings appear to confirm earlier reports [23,24] that RGE not only identifies a limited set of candidate genes but could also exclude a large proportion of genes that may be of relevance in characterising the molecular pathology of a disease, including those with low level expression and genes with similar levels of expression in both the case and control samples. The findings also seem to suggest that RGE could inherently identify genes whose expression patterns may be inconsistent but which might have large differential expression between control and case samples.
Generally, the roles of genes with low level expressions in a disease state are unclear. However, reports from stem cell research suggest that low level gene expression may be involved in lineage priming and cell differentiation [35][36][37][38]. While such conclusions cannot be inferred as yet in the placenta from the current study, our findings showed that the PE placenta retains its ability to express the genes significantly regulated in NP placenta. The findings also showed the presence of additional subsets of unique genes including low level expressed genes that were consistently expressed only in PE placentae.
It could thus be inferred from the current findings, albeit limited to RNA messages, that: (1) there may be apparent expression of a set of genes that could be critical for the survival or development of the placenta, and the pattern of expression of these genes might be similar in both NP and PE placentae; (2) in PE placentae, there may be consistent regulation of an excess pool of genes (PE unique genes) that may exacerbate the activation of pregnancy-favourable biological pathways or precipitate pregnancy-unfavourable biological pathways; (3) PE may be a polygenic condition decompensated by the cumulative effect of multiple genes, each with small effects, and there may be no single gene with a large effect. These were most evident in the extent to which the molecular interaction and reaction pathways were affected in the PE placentae.
We identified two sets of pathways: common pathways in both NP and PE placentae, and unique pathways affected only in PE. The observation that the common pathways were enriched either more negatively or positively in PE than in NP appeared to suggest a plausible decompensation or exaggeration of normal placental functions as key factors in PE. Perhaps, of greatest significance for future research is the identification of previously unidentified dysregulated pathways in PE placentae such as: Histidine metabolism, Fc epsilon RI signaling pathway, allograft rejection, graft vs. host disease, primary immunodeficiency and renin-angiotensin, Wnt signaling, RNA degradation, and RNA Polymerase.
Wnt signaling, RNA degradation, and RNA Polymerase pathways were significantly affected only in PE. The canonical Wnt pathway leads to regulation of gene transcription [39], suggesting that PE could be linked to excessive gene expression in response to autacoids or paracrine hormones, such as histamine, with regulatory roles on the Wnt pathway [40].
Crucially, dysregulated metabolism of histamine as a consequence of impaired histidine metabolism in pregnancy is well known to affect PE [41,42]. Therefore, the concurrent identification of the Histidine metabolism pathway in PE is of significance. Possibly, the cumulative effects of the release of histamine and other substances involved in inflammation and immune responses, cell proliferation, tissue differentiation, tumour formation, apoptosis and production of purines and pyrimidines are of importance [43,44]. Significantly, dysregulation of these functions is widely accepted to be rooted in defects in early trophoblast invasion of the uterus, adaptive transformation of the uterine spiral arteries into high capacity and low impedance vessels, and development of chorionic villi [14,45,46]. These are important issues known to affect PE, commonly at the early stages of disease development [47].
The Fc epsilon RI-mediated signaling pathway was also affected only in PE. This pathway in mast cells is initiated by the interaction of multivalent allergens with the extracellular domain of the alpha chain of Fc epsilon RI to release preformed histamines, proteoglycans (especially heparin), phospholipase A2 and subsequently, leukotrienes (LTC4, LTD4 and LTE4), prostaglandins (especially PGD2), and cytokines including TNF-alpha, IL-4 and IL-5 [48]. These mediators and cytokines contribute to inflammatory responses.
In the case of inflammatory pathways in PE, it is suggested that the nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB) pathway mediates excessive maternal intravascular inflammation that leads to endothelial dysfunction [49,50]. In this context, it has been hypothesised that PE arises as a result of an excessive maternal intravascular inflammatory response to pregnancy, and that it involves the activation of both the innate and the adaptive immune systems, neutrophils, and the complement system pathways [50][51][52][53][54].
This opinion is however not universally supported. A recent review by Ahmed and Ramma [63] appears to down-play the roles of inflammatory, hypoxia and immunologic pathways in favour of the angiogenic response as the cause of PE. They argue that recent work supports the hypothesis that PE arises because of the loss of vascular endothelial growth factor (VEGF) activity, which in turn is caused by an increase in the levels of endogenous soluble fms-like tyrosine kinase-1 (sFlt-1), an anti-angiogenic factor [63]. sFlt-1 binds and reduces free circulating levels of the pro-angiogenic factor VEGF, and thus inhibits the beneficial effects mediated by flt-1 (also known as vascular endothelial growth factor receptor 1 (VEGFR-1)) on maternal endothelium, with consequent maternal hypertension and proteinuria [64,65]. It is further argued that an altered balance of circulating pro-angiogenic/anti-angiogenic factors such as sFlt-1, soluble endoglin, and placenta growth factor (PlGF) is unique to PE [63][64][65][66][67]. This view is not lost, as we also identified the VEGF signaling pathway as affected only in PE.
However, due to the complexity of pathways affected in PE, our findings contrast with the conclusions drawn by Ahmed and Ramma [63]. Instead, our findings support a more global view that multiple and concurrent dysregulated pathways underpin the aetiology of PE [47], and that no single pathway could be associated with the origins of PE.
These findings therefore provide the opportunity to re-examine current studies in PE to reflect the consistently expressed genes that are unique to PE placentae or biological pathways, especially those that may be exclusively affected in PE placentae, to improve our understanding of the molecular pathology or the genomic basis of PE.
Supporting Information S1

Author Contributions
Conceived and designed the experiments: OB AW MHFS.
Performed the experiments: OB.