Meta-analysis of miRNA expression profiles for prostate cancer recurrence following radical prostatectomy

Background Prostate cancer (PCa) is a leading reason of death in men and the most diagnosed malignancies in the western countries at the present time. After radical prostatectomy (RP), nearly 30% of men develop clinical recurrence with high serum prostate-specific antigen levels. An important challenge in PCa research is to identify effective predictors of tumor recurrence. The molecular alterations in microRNAs are associated with PCa initiation and progression. Several miRNA microarray studies have been conducted in recurrence PCa, but the results vary among different studies. Methods We conducted a meta-analysis of 6 available miRNA expression datasets to identify a panel of co-deregulated miRNA genes and overlapping biological processes. The meta-analysis was performed using the ‘MetaDE’ package, based on combined P-value approaches (adaptive weight and Fisher's methods), in R version 3.3.1. Results Meta-analysis of six miRNA datasets revealed miR-125A, miR-199A-3P, miR-28-5P, miR-301B, miR-324-5P, miR-361-5P, miR-363*, miR-449A, miR-484, miR-498, miR-579, miR-637, miR-720, miR-874 and miR-98 are commonly upregulated miRNA genes, while miR-1, miR-133A, miR-133B, miR-137, miR-221, miR-340, miR-370, miR-449B, miR-489, miR-492, miR-496, miR-541, miR-572, miR-583, miR-606, miR-624, miR-636, miR-639, miR-661, miR-760, miR-890, and miR-939 are commonly downregulated miRNA genes in recurrent PCa samples in comparison to non-recurrent PCa samples. The network-based analysis showed that some of these miRNAs have an established prognostic significance in other cancers and can be actively involved in tumor growth. Gene ontology enrichment revealed many target genes of co-deregulated miRNAs are involved in “regulation of epithelial cell proliferation” and “tissue morphogenesis”. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis indicated that these miRNAs regulate cancer pathways. The PPI hub proteins analysis identified CTNNB1 as the most highly ranked hub protein. Besides, common pathway analysis showed that TCF3, MAX, MYC, CYP26A1, and SREBF1 significantly interact with those DE miRNA genes. The identified genes have been known as tumor suppressors and biomarkers which are closely related to several cancer types, such as colorectal cancer, breast cancer, PCa, gastric, and hepatocellular carcinomas. Additionally, it was shown that the combination of DE miRNAs can assist in the more specific detection of the PCa and prediction of biochemical recurrence (BCR). Conclusion We found that the identified miRNAs through meta-analysis are candidate predictive markers for recurrent PCa after radical prostatectomy.


Introduction
Prostate cancer (PCa) is the most diagnosed malignancy and the second most reason of cancer-related death for the men over the age of 50 in the western countries [1]. The prostate-specific antigen (PSA) is the most reliable biomarker for PCa, which is helpful for diagnosis, screening, and follow-up after surgery. For treatment of PCa, two treatment methods, radiation therapy or radical prostatectomy (RP) and hormone ablation therapy are used. Yet, these methods do not provide enhanced survival rates and nearly 30% of patients experience a biochemical recurrence with enhanced PSA levels after curative treatment of RP [2]. Moreover, metastatic and advanced tumors of PCa respond very poorly to chemotherapy [3]. All these facts emphasize the significance of developing early diagnostic biomarkers for PCa progression. Identifying effective predictors of tumor recurrence after the surgical operation to determine whether treatment is required or not is a main challenge in the PCa research. To predict biochemical recurrence (BCR) of PCa after RP and develop effective predictors of tumor recurrence, multiple studies have been conducted for gene expression profiling [4][5][6].
Recently, numerous studies have been published which show that the alterations in micro-RNAs are associated with PCa initiation and progression [7][8][9].
Meta-analysis utilizes statistical methods to contrast and combines results from multiple studies in the hope of increasing the statistical power and reproducibility over individual studies and identifying patterns across studies [15]. A limited number of studies [1, 10-14, 16, 17] has been conducted on microRNA expression profiles to distinguish recurrent from nonrecurrent prostate tumor tissues and to identify novel biomarkers for prediction of PCa progression. The average differential expression level (fold change) and some level of significance as measured by the t-test are common procedures for identifying the biomarkers. These miRNA microarray data sets provide a rich resource for genome-wide information on PCa progression and make an ideal chance to perform a meta-analysis study. We assumed that a meta-analysis of some miRNA expression datasets of PCa progression can give a potentially significant list of co-deregulated miRNAs in PCa progression, which is important to specify pathways in which the miRNAs of interest and their target genes are involved. To increase the probability of revealing truly significant deregulated miRNA genes, which should have higher potentials to be utilized as consistent biomarkers for the disease, we analyzed miRNA expression profile in PCa progression considering 5 studies (6 datasets). This meta-analysis increases the significance of the results.

Literature analysis
There are a limited number of reports in the literature studied miRNAs in PCa progression. We systematically queried for these studies from PubMed database.
The following Medical Subjective Heading (MeSH) and Embase tree were used: "recurrence" or "recurrence" and "prostatic neoplasms" or "prostate cancer" and "micrornas" or "microRNA" and "gene expression" or "expression". In addition, publicly available microRNA data sets were searched by "RISmed" package in R to ensure no relevant studies were missed. Through database searching, a total of 24 studies was identified. Of these, 19 studies were retained after rejecting repetition. According to the title and abstract, a total of 14 studies was excluded. Review, case report, animal experiment, no association with PCa, and experiment on DNA microarray were the reasons for excluding these articles. The full-text articles were evaluated for the remaining 5 studies, and all of them (6 datasets) were retained in the final meta-analysis. These miRNA data sets were obtained from the National Centers for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database (http://www.ncbi. nlm.nih.gov/geo/).

The microRNA datasets and individual data analysis
In this study, a total of six microRNA datasets related to the recurrent PCa after RP (GSE55323 [10], GSE26245 and GSE26247 [11], GSE65061 [17], GSE62610 [12], and GSE46738 [14]) met the inclusion criteria and were selected for meta-analysis.
In the GSE55323, a total of 41 recurrent and 41 non-recurrent tumors after RP, which have been obtained from Baylor College of Medicine Prostate Cancer program, have been considered for performing miRNA profiling. Recurrence has been defined as a two consecutive serum PSAs greater than 0.2 ng/ml. To carry out microarray analysis, 20 samples from each group have been profiled using miRNA microarray chips.
In GSE26245 and GSE26247, total RNA from 71 formalin-fixed-paraffin-embedded (FFPE) specimens with known long-term outcome have been used for performing DASL expression profiling with a custom-designed panel of 522 PCa relevant genes. Recurrence has been defined as a two consecutive serum PSAs greater than 0.2 ng/ml. In the GSE26245, samples from 71 patients (29 with BCR and 42 without BCR) and in the GSE26247, samples from 82 patients (29 with BCR and 53 without BCR) have been used. In this study, the samples with unknown BCR have been removed.
For the GSE65061, total RNA has been extracted from tumor-enriched 1mm cores from 43 RP paraffin tissue blocks. Tissue isolated at the time of RP has been utilized for miRNA profiling. Thirty-six months has been considered as the cutoff, as it was near the median time to recurrence. From 43 patients, 19 were labeled as the samples with BCR ( 36 months) and 24 as the samples without BCR (> 36 months).
In the GSE62610, total RNA has been taken from tumor-enriched 1.5 mm cores in diameter from 36 formalin fixed paraffin embedded (FFPE) specimens. Then biochemical failure has been defined as two consecutive measurements of PSA > 0.2 ng/ml. From 36 patients 22 has been classified as the samples with BCR and 14 as the samples without BCR. In the GSE62610 most of microRNAs have null expression. After excluding miRNAs with no expression in any of the samples, 536 miRNAs have been kept for further analysis.
For GSE46738, total RNA has been taken from tumors from the 51 patients that underwent an RP by the same surgeon to treat localized PCa. In the GSE46738, the BCR status of samples is not mentioned explicitly. In the present study, according to the expression level of the miR-NAs with greater statistical power, which has been reported in the third table of the study [14], tumors were divided into the positive BCR and the negative BCR by using clustering techniques. In GSE46738, from 51 samples, 34 were classified as the samples with BCR and 17 as the samples without BCR. Table 1 has provided detailed information of each dataset.
The microRNA microarray datasets were obtained from GEO NCBI. All GEO series matrix files (GSE), platform sets, and annotation files were downloaded and parsed using 'GEOquery' package of Bioconductor 3.2 in R version 3.2.2. To identify Differentially Expressed (DE) miR-NAs in each individual dataset, moderated t-test was used.

Microarray meta-analysis
This meta-analysis was performed in accordance with the guidelines provided in [18]. First, each individual dataset was preprocessed using the log2 transformation and normalization. Then, any training and validation dataset (GSE26245 and GSE26247) was combined together. Next, gene matching was done for all microRNA probes. When multiple probe sets matched to an identical gene symbol, the probe that presented the greatest inter-quartile range (IQR) was selected to represent the target gene symbol. After matching all probes to a common microRNA gene symbol, differential expression analysis was performed with "MetaDE" package for each dataset independently using adjusted p-value < 0.05, based on the false discovery rate by the Benjamini-Hochberg procedure [19] and moderated t-test.
Data integrity was checked for all datasets, and the differential expression meta-analysis across recurrence and non-recurrence samples was carried out by p-values combination using Adaptive Weight (AW) and Fisher methods.

Statistical analysis
The meta-analysis was performed using the 'MetaDE' package in R [20]. The moderated t-statistic was utilized to identify DE miRNAs in each individual dataset. The Fisher and AW were Network analysis of common differentially expressed microRNA genes Network-based analysis was performed using a MIROB web tool (http://mirob.interactome. ru/) which has been designed to support analysis of microRNA expression data. MIROB (microRNA OncoBase) scans the set of input miRNAs to build any cancer pathways, detects key targets and Transcription Factors (TF) of candidate miRNAs, identifies any possible correlation between key targets and TF of candidate miRNAs and other diseases, and makes pathogenesis network.
Functional gene set enrichment analysis of common differentially expressed microRNA genes The TF and target genes of DE miRNAs were searched for the pathways in which they participate, using the EnrichR web-tool [21]. Specifically, Gene Ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, and Reactome pathway analysis were performed. Moreover, in order to find metabolic pathways and biochemical reactions, pathway commons analysis was performed for candidate DE miRNAs using a pathway commons network visualizer (PCViz) web tool. To further investigate the function of the DE miR-NAs, they were also mapped to the most significant KEGG pathway.

Diagnostic performance of common differentially expressed microRNA genes
Based on the hypothesis that miRNA classifiers may improve sensitivity and specificity over single markers, the diagnostic potential of 37 DE miRNAs classifier was trained and tested in each dataset by performing receiver operating characteristic (ROC) analysis using the normalized expression values of miRNA genes. Logistic regression classifier with leaveone-out cross validation (LOOCV) scheme was used for this analysis. To show the discriminatory power of 37 DE miRNAs for distinguishing recurrent PCa samples from non-recurrent samples, area under the ROC curve (AUC) was calculated using WEKA open source machine learning software. Furthermore, the best subset of DE microRNAs which improves the BCR prediction over original studies and 37 DE miRNAs set, was identified in each dataset by using geometric particle swarm optimization (PSO). The LOOCV AUC of logistic regression was utilized as the fitness function. Partial C4.5 decision tree (PART) was used to find the relation and rules between the DE microRNAs in each dataset for biological point of view.

Identification of common differentially expressed microRNAs for prostate cancer recurrence by meta-analysis
To identify a common DE microRNAs for PCa recurrence, five miRNA studies (Table 1) were analyzed using "MetaDE" package in R. First, individual analysis was performed and the moderated t-test was used to calculate the p-values which frequently used in meta-analysis. Then, AW and Fisher's method were utilized to combine the p-values and find miRNAs that were differentially expressed between samples with recurrence and non-recurrence (+/− BCR) across all studies. From miRNA microarray meta-analysis, we identified a total of 37 DE miR-NAs including 15 overexpressed and 22 under expressed microRNAs across at least two datasets under the significance threshold of adjusted p-value < 0.05. Fig 1 shows the number of DE microRNAs against FDR obtained from individual analysis as well as meta-analysis. It is clearly seen that the meta-analysis has detected more candidate markers. Fig 2 shows the heat map of those 37 microRNAs. A complete list of DE microRNAs has been provided in Table 2.
Identification of the TF and regulatory network for the differentially expressed microRNAs obtained from meta-analysis MIROB tool was used to perform regulatory microRNA network analysis to identify regulators responsible for the observed patterns in miRNA meta-analysis studies. The interaction network was constructed between DE microRNAs, TF and target genes associated with the complete set of DE (Fig 3). Twenty four of DE miRNAs were found in the network. The details of those miRNA gene networks have been given in Table 3. Key targets, ontology information on target genes, TF and a descriptive analysis of expression of the DE miRNAs have been summarized in this table. In addition, it shows that DE miRNAs are highly associated with colorectal, PCa, breast, and gastric cancer. Further enrichment analysis for identification of overrepresented biological pathways and gene ontology terms We performed gene set enrichment analysis by EnrichR tool, using the complete list of key targets and TF of DE miRNAs. GO terms and biological pathways were significantly overrepresented in the gene list if they showed an adjusted p-value < 0.05. Results for gene ontology and enriched biological pathways (KEGG, Reactome) have been shown in Tables 4-6, respectively. DE microRNAs in meta-analysis results were associated with the enriched pathways with adjusted p-value < 0.05, including "MicroRNAs in cancer (hsa05206)", "Pathways in cancer (hsa05200)", "Proteoglycans in cancer (hsa05205)", "PI3K-Akt signaling pathway (hsa04151)", "Prostate cancer (hsa05215)" and "Signal Transduction (R-HSA-162582)". The most important GO terms associated with key targets and TF of DE miRNA genes included "regulation of epithelial cell proliferation (GO: 0050678)", "tissue morphogenesis (GO: 0048729)", "regulation of cellular response to stress (GO: 0080135)", and "positive regulation of cellular component movement (GO: 0051272)". To further investigate the function of DE miRNAs, we mapped them to the KEGG database. Eleven of them (miR-1, miR-125A, miR-133A, miR-133B, miR-137, miR-199A, miR-221, miR-28, miR-324, miR-363 and miR-449A) were found in the "miRNAs in cancer" pathway (KEGG-ID: hsa05206; Fig 4) with adjusted P-value of 7.554e-15 (Table 5). Moreover, common pathway analysis revealed that TCF3, MYC, MAX, CYP26A1, and SREBF1 significantly interact with DE miRNAs (Fig 5).  The "moderated t-test" is used to perform individual analysis and calculate p-values. The corresponding p-values are adjusted, based on the false discovery rate using the Benjamini-Hochberg procedure used to select DE miRNAs across at least two datasets.

Diagnostic performance
We assessed the diagnostic potential of the 37-miRNA signature identified by meta-analysis. ROC curve analysis gave AUCs from 0.55-0.84 for miRNAs set in each GEO dataset (See Fig  6). To investigate whether a miRNA signature may increase diagnostic accuracy over 37-miRNA signature, we employed a soft computing technique (PSO/ logistic regression) and trained and tested on miRNA expression profiles. The best subset of DE miRNAs was identified in each GEO dataset and shown in Table 7.
Notably, the discriminating power of the identified signatures in each GEO dataset is higher than the case where 37-miRNA classifier was considered. For the best subset of DE miRNAs in each GEO dataset, the ROC curve analysis gave AUCs from 0.75-0.97 (See Fig 7). The highest diagnostic accuracy (97%) was given for GSE55323 with 11-miRNAs. Moreover, in order to correctly classify BCR+ vs. BCR-samples, simple rules were extracted using a decision tree classifier (Table 7). Among six GEO datasets, rules with high diagnostic potentials were extracted for GSE46738 and GSE26247.
Finally, a comparison between the expressions of co-deregulated microRNAs in BCR+ vs. BCR-was done by plotting boxplots (Fig 8). The boxplots were drawn for co-deregulated microRNAs that are involved in the PCa pathway.

Discussion
Various miRNAs are DE in individuals with recurrent PCa, and identifying the most important miRNAs and pathways associated with the disease is very important. A meta-analysis of multiple miRNA datasets combines the generated p-values of individual studies, making the identification of DE microRNA genes more reliable.
In this study, we attempted to identify common miRNAs underlying recurrent PCa using meta-analysis of six publicly available microRNA datasets to focus deeply on identifying DE microRNA genes and risk factors shared between them.
Tumor-suppressive miR-449A targets HDAC1 and induces growth arrest in PCa [37]. It also causes Rb-dependent cell cycle arrest and senescence in PCa cells (Table 3) [47]. For a previously poorly characterized miRNA, namely miR-579, no PCa related functions have been reported. MiR-579-3p is only known as a master regulator of melanoma progression and drug resistance [48].
Among the under expressed DE microRNAs mir-496, miR-137, miR-1, and miR-370 had the highest combined P-values across all studies.
MiR-496 is also a previously poorly characterized miRNA, which has no functions in PCa. Methylated DNA binding domain protein 2 (MBD2) is known as the only TF of miR-496, which coordinately silences gene expression through activation of the miR-496 promoter in breast cancer cell line [49].
Methylated mir-137 host gene is promising diagnostic and/or prognostic biomarker of PCa. The epigenetic silencing of miR137 is an important event in promoting androgen signaling during prostate carcinogenesis and progression. MiR-137 suppresses cell growth in several cancers such as ovarian, colorectal, and gastric [50].
MiR-1 is known as a biomarker of recurrence PCa, which is in agreement with the findings in present meta-analysis study. MiR-1 functions as a tumor suppressor which suppresses cancer cell proliferation, metastasis, angiogenesis, invasion, cell cycle arrest, WNT signaling and promotes apoptosis by ectopic expression. This miRNA is a potential prognostic biomarker of hepatocellular carcinoma (HCC) and colorectal cancer. The expression of miR-1 alters in several cancers such as lung, gastrointestinal, prostate, bladder, head and neck, and renal cancer [51].
MiR-370 plays an important role in the proliferation of human PCa cells by directly suppressing the tumor suppressor FOXO1 [39].
PPI Hub Proteins analysis of the TF and target genes of DE MicroRNAs was conducted for prioritization of the most important hub genes using the EnrichR web tool. CTNNB1 was the most important hub genes among TF and target genes of DE microRNAs across six microarray studies. CTNNB1 (Catenin Beta 1) functions as a Key downstream component of the canonical WNT signaling pathway. WNTs and their downstream effectors have crucial roles in the regulation of various processes that are important for cancer progression, including tumor growth, tumor initiation, differentiation, cell senescence, cell death, differentiation and metastasis [52]. Nuclear accumulation and abnormal stabilization of CTNNB1 as a consequence of missense mutations occurs at a high frequency in a variety of epithelial cancers such as colorectal cancer, medulloblastoma, ovarian cancer, and pilomatrixoma. Upregulation of CTNNB1 is also associated with PCa [53].
To elucidate the role of DE microRNAs obtained from the meta-analysis, we performed pathway analysis and gene set enrichment analysis for TF and target genes of DE miRNAs  using the EnrichR web tool. The most enriched pathway and Gene Ontology (GO) term among the TF and target genes of DE miRNAs were "MicroRNAs in cancer (hsa05206)", "Pathways in cancer (hsa05200)", Signal Transduction (R-HSA-162582)", "regulation of epithelial cell proliferation (GO: 0050678)" and "tissue morphogenesis (GO: GO:0048729)".
Common pathway analysis revealed that TCF3, MYC, MAX, CYP26A1 and SREBF1 were the most significant proteins associated with DE miRNA genes. Of note, these proteins were not identified as TF and target genes of DE microRNAs.
Previous studies have reported that the diminished activity of TCF3 plays a role in lymphoid malignancies, and up-regulation of it is involved in the development and progression of colorectal cancer. TCF3 is regulated by androgens and acts as a tumor promoter in PCa [54].
Overexpression, Mutations, translocation and rearrangement of MYC is related to several cancers such as breast, PCa, gastrointestinal, melanoma, and small cell lung cancer [55].
MAX is known as a tumor suppressor in renal oncocytomas and small cell lung cancer. The mutation of it has been identified in gastrointestinal stromal tumors [56]. High expression of CYP26A1 is associated with several cancers such as breast, head and neck, colorectal and ovarian. CYP26A1 is a methylation marker of PCs associated with ERG-positive cancers [57].
Sterol regulatory element-binding protein1 (SREBP1) is a key regulatory factor that controls lipid homeostasis. SREBP1 is a critical link between oncogenic signaling and tumor metabolism. The overexpression of SREBF1 is related to a variety of cancers such as PCa, breast, head and neck, colorectal, endometrial, glioblastoma, pancreatic, and ovarian [58]. To understand the association of the DE microRNAs list with the most significant target genes and transcription factors, we conducted a regulatory gene network analysis using the MIROB web tool. CDKN1A and LASP1 were amongst the most significant target genes associated with the DE microRNAs.
Cyclin-dependent kinase inhibitor 1 (CDKN1A) also is known as P21 is involved in p53/TP53 mediated inhibition of cellular proliferation in response to DNA damage and its overexpression results in cell cycle arrest and autophagy cell death. The expression of this gene is tightly controlled by the tumor suppressor protein p53 in a human brain tumor cell line [59]. The CDKN1A genotypes CT and TT are associated with an increased risk of advanced prostate carcinoma compared with the CC genotype [60]. Elevated p21 levels are associated with higher Gleason score, and increased PCa recurrence [61].
LIM and SH3 protein 1 (LASP1), a promoter of cell proliferation and migration, play a significant role in cancer development and progression. LASP-1 is involved in numerous biological and pathological processes. It plays an important role in the regulation of dynamic actinbased and cytoskeletal activities. LASP-1 is highly expressed in the central nervous system and contributes to the formation and progression of prostate cancer through a NF-KB pathway [62].
RELA, SNAI2, and TP53 were among the most significant transcription factors associated with the DE microRNAs. RELA also known as NF-kappa-B is a ubiquitous transcription factor involved in many biological processes such as immunity, inflammation, cell growth, differentiation, tumorigenesis and apoptosis. Zinc finger protein (SNAI2) is known as a transcriptional repressor that modulates both activator-dependent and basal transcription. SNAI2 regulates cell proliferation and invasiveness of metastatic PCa cell lines. Cellular tumor antigen p53 (TP53) acts as a tumor suppressor in many tumor types; induces growth arrest or apoptosis depending on the physiological circumstances and cell type.
In this study, we also built new miRNA diagnostic classifiers in each GEO datasets based on best subset of DE miRNAs in the meta-analysis. These classifiers predicted BCR after RP with very high accuracy. The highest diagnostic accuracy (97%) was given for GSE55323 with  (Table 3). Lines within the boxes indicate median values; whiskers-min and max for miRNA values. BCR+/ -, biochemical disease recurrence status (positive, negative). https://doi.org/10.1371/journal.pone.0179543.g008 Meta-analysis of miRNAs for PCa recurrence after RP 11-miRNAs. The performance of our 11-miRNA diagnostic classifier (97%) exceeded that of a 2-miRNA classifier (miR-1+miR-133B; AUC: 71%) developed earlier by Karatas et al. [10]. One miRNA (miR-1) is shared between these classifiers, further supporting the validity of our findings.
Briefly, we used "MetaDE" package to perform a meta-analysis, which provides options for gene matching across studies, gene filtering before meta-analysis and functions for conducting several major meta-analysis methods such as Fisher and AW for differential expression analysis. Then performed the GO enrichment analysis, pathway analysis, network analysis, and ROC analysis.
In conclusion, this is the first report that provides biological insights on common micro-RNA expression signatures for recurrent PCa after RP. The candidate miRNAs are worthy to be validated in the wet lab.