Identification of UBE2C as hub gene in driving prostate cancer by integrated bioinformatics analysis

Background The aim of this study was to identify novel genes that promote the progression of primary prostate cancer (PCa) and to explore their role in the prognosis of prostate cancer. Methods Four microarray datasets containing primary prostate cancer samples and benign prostate samples were downloaded from Gene Expression Omnibus (GEO), and differentially expressed genes (DEGs) were identified with R software (version 3.6.2). Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed to identify the functions of the DEGs. Using STRING and Cytoscape (version 3.7.1), we constructed a protein-protein interaction (PPI) network and identified the hub gene of prostate cancer. Clinical data from GSE70770 and TCGA were collected to show the role of the hub gene in prostate cancer progression. The correlations between the hub gene and clinical parameters were also assessed by Cox regression analysis. Gene Set Enrichment Analysis (GSEA) was performed to highlight the function of ubiquitin-conjugating enzyme E2C (UBE2C) in prostate cancer. Results A total of 243 upregulated genes and 298 downregulated genes that changed in at least two microarray datasets were identified. GO and KEGG analyses indicated significant changes in the oxidation-reduction process, angiogenesis and the TGF-beta signaling pathway. UBE2C, PDZ-binding kinase (PBK), cyclin B1 (CCNB1), cyclin-dependent kinase inhibitor 3 (CDKN3), topoisomerase II alpha (TOP2A), Aurora kinase A (AURKA) and MKI67 were identified as candidate hub genes, all of which were correlated with the disease-free survival of prostate cancer patients in TCGA. However, only UBE2C was highly expressed in prostate cancer compared with benign prostate tissue in TCGA, and the expression of UBE2C also paralleled the Gleason score of prostate cancer. Cox regression analysis indicated that UBE2C could function as an independent prognostic factor for prostate cancer. GSEA showed that UBE2C played an important role in prostate cancer-related pathways, such as the NOTCH signaling pathway and the WNT-β-catenin signaling pathway. Conclusions UBE2C was pivotal for the progression of prostate cancer, and the level of UBE2C was important for predicting the prognosis of patients.

Introduction Prostate cancer has become the most commonly diagnosed cancer among American men, accounting for 20% of newly diagnosed cancers [1,2]. Although the survival probability of prostate cancer is the highest, the estimated mortality of prostate carcinoma accounts for 10% of all cancer deaths [1]. Among American men, prostate cancer remains the third-leading cause of cancer death. According to cancer statistics in China, prostate cancer has become the most common cancer of the male genitourinary system [2]. In detail, from 2000 to 2011, the incidence of prostate cancer increased year by year. Prostate cancer has become a threat to the long-term health of patients and has worsened their living conditions. Unfortunately, in China, the mortality of prostate cancer has also been rising year by year. It is essential for clinicians to diagnose and treat the diverse kinds of prostate cancer correctly and appropriately. For localized prostate cancer, surgery and radiation have become the first choice, although they can bring adverse effects to patients, such as frequent urination, hematuria and sexual dysfunction, that may reduce quality of life [3]. Androgen deprivation therapy (ADT) combined with radiotherapy is superior to other therapies for high-risk prostate cancer patients [4]. Although ADT is recognized as the standard therapy for diverse prostate cancers, chemotherapy has also been recommended for metastatic prostate cancer. In terms of screening and diagnosis, prostate-specific antigen (PSA) testing has been widely used to monitor and diagnose prostate cancer. However, owing to the poor specificity of PSA, early PSA testing can lead to overdiagnosis and overtreatment of prostate cancer [5]. Previous studies have shown that high-throughput sequencing can serve as a novel way to identify biomarkers in various cancers. Fang et al. compared stroma surrounding invasive prostate tumors with matched normal stroma in order to identify a potential target for prostate cancer [6]. By comparing luminal cells, basal cells and epithelial cells from hormone-naive and castration-resistant mice, LY6D was identified as a marker of castration-resistant prostate cancer [7]. Thus, this article aimed to identify the hub genes of prostate carcinoma by bioinformatics analysis.
In this study, four datasets were downloaded from GEO, which included benign prostate gland and primary prostate cancer tissue. Using the limma and affy packages of R software, DEGs were identified. Furthermore, GO and KEGG pathway analyses were performed with the Database for Annotation, Visualization, and Integrated Discovery (DAVID) [8] (http://www.david.niaid.nih.gov). Then, PPI and module analyses were performed with STRING [9] (https://string-db.org/), Metascape [10] (http://metascape.org/gp/index.html) and Cytoscape [11,12]. Survival analysis was performed to screen and validate the hub gene of prostate cancer. In brief, UBE2C, highly expressed in prostate cancer tissue, was identified as the real hub gene. Through GSEA [13] and univariate and multivariate analysis, the role of UBE2C in prostate cancer was further explored.
Four prostate cancer gene expression profiles, GSE104749 [14], GSE3325 [15], GSE69223 [16] and GSE46602 [17], were downloaded from GEO; all were generated on the same GPL platform (GPL570). The inclusion criteria for the GEO gene expression profiles are shown in S1 Fig.

Identification of DEGs
The GSE3325, GSE46602, GSE69223 and GSE104749 gene expression profiles were normalized and transformed into a suitable matrix using Robust Multichip Average (RMA), included in the affy package of R software (version 3.6.1), which consists of background adjustment, quantile normalization and summarization [6,18]. The limma package was used to identify DEGs between primary prostate cancer and noncancerous prostate samples. The criteria for DEGs were an adjusted p value < 0.05 and |log2 (fold change)| > 1. A Venn diagram was used to find DEGs appearing in at least two datasets (Figs 1 and S2).
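The selection rule described above (adjusted p value < 0.05, |log2 fold change| > 1, retained if called in at least two datasets) can be sketched as follows. This is a minimal Python illustration of the filtering logic only, not the R/limma code used in the study; the dataset labels, gene names and numeric values are hypothetical placeholders.

```python
from collections import Counter

# Hypothetical per-dataset results: gene -> (log2 fold change, adjusted p-value).
# These values are invented for illustration and are not taken from the study.
results = {
    "dataset_A": {"GENE1": (1.8, 0.001), "GENE2": (-1.4, 0.020), "GENE3": (0.6, 0.001)},
    "dataset_B": {"GENE1": (1.2, 0.030), "GENE2": (-0.9, 0.010), "GENE4": (2.1, 0.200)},
    "dataset_C": {"GENE2": (-1.1, 0.004), "GENE3": (1.5, 0.040), "GENE4": (1.3, 0.010)},
}


def select_degs(table, p_cutoff=0.05, lfc_cutoff=1.0):
    """Return genes with adjusted p < p_cutoff and |log2FC| > lfc_cutoff."""
    return {
        gene
        for gene, (log2fc, adj_p) in table.items()
        if adj_p < p_cutoff and abs(log2fc) > lfc_cutoff
    }


degs_per_dataset = {name: select_degs(table) for name, table in results.items()}

# Count in how many datasets each gene was called a DEG,
# then keep the genes called in at least two of them.
counts = Counter(gene for degs in degs_per_dataset.values() for gene in degs)
shared_degs = sorted(gene for gene, n in counts.items() if n >= 2)
```

In this sketch the direction of change is not checked; a stricter variant would count upregulated and downregulated calls separately, as a Venn diagram per direction would.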

GO and KEGG pathway analysis of the DEGs
To further explore the function of the DEGs, GO [19] and KEGG [20] analyses were performed with DAVID. We downloaded the results and performed further analysis in R. The significance criterion for the functional analysis was a p-value < 0.05. The DEGs were further analyzed with Metascape (Fig 2).

Construction and analysis of PPI network
The STRING website and Cytoscape were used to build the PPI network and analyze the functional pathways (Fig 3). Cytoscape displays the network as a graph, with differentially expressed molecules represented as nodes and intermolecular interactions represented as links (edges) between nodes [11]. In this study, the combined score represented intermolecular interactions, and protein pairs with a combined score > 0.4 were selected to construct the PPI network [21]. Nodes were drawn in different sizes and colors, which represented the node degree and the direction of regulation (up or down), respectively [18]. The hub genes were calculated by different methods with cytoHubba [22], a Cytoscape plugin that is widely used to explore the most important nodes in diverse biological networks. CytoHubba includes 11 topological analysis methods; in this study we used Degree, Density of Maximum Neighborhood Component (DMNC) and Maximal Clique Centrality (MCC) to identify candidate hub genes of prostate cancer.
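As an illustration of the network step, the sketch below keeps protein pairs with a combined score > 0.4 and ranks nodes by degree, the simplest of the three topological measures named above. It is a plain-Python approximation under stated assumptions, not the STRING/Cytoscape/cytoHubba workflow itself: the protein names and scores are hypothetical, and the scores are assumed to be on a 0-1 scale (STRING downloads commonly report them on a 0-1000 scale, in which case the cutoff would be 400).

```python
from collections import defaultdict

# Hypothetical protein pairs with a combined score on a 0-1 scale.
edges = [
    ("P1", "P2", 0.92),
    ("P1", "P3", 0.75),
    ("P1", "P4", 0.41),
    ("P2", "P3", 0.66),
    ("P3", "P4", 0.35),  # below the cutoff, dropped
    ("P4", "P5", 0.10),  # below the cutoff, dropped
]


def build_network(pairs, min_score=0.4):
    """Keep pairs with combined score > min_score as an undirected adjacency map."""
    adjacency = defaultdict(set)
    for a, b, score in pairs:
        if score > min_score:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def rank_by_degree(adjacency, top_n=10):
    """Rank nodes by number of neighbours (degree); ties are broken by name."""
    ranked = sorted(adjacency, key=lambda node: (-len(adjacency[node]), node))
    return [(node, len(adjacency[node])) for node in ranked[:top_n]]


network = build_network(edges)
top_nodes = rank_by_degree(network, top_n=3)
```

MCC and DMNC depend on cliques and neighbourhood density rather than raw neighbour counts, so they can rank nodes differently from degree; this is why agreement across several methods is a stricter criterion than any single ranking.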

PLOS ONE
Identification of hub gene in driving prostate cancer

Survival analysis and expression validation of candidate hub genes
To identify the real hub gene among the candidate hub genes of prostate cancer, we used Gene Expression Profiling Interactive Analysis (GEPIA, http://gepia.cancer-pku.cn/) [23], an online tool based on The Cancer Genome Atlas (TCGA) [24] and the Genotype-Tissue Expression (GTEx) databases [25]. GEPIA was used both to validate the expression of candidate hub genes in prostate cancer versus benign prostate tissue and to validate the correlations between disease-free survival and the expression level of candidate hub genes (Fig 4).

Clinical parameters analysis
Considering the clinical application value of the candidate hub genes, we first analyzed the correlation between candidate hub genes and the Gleason score of prostate cancer patients in GSE70770 (Fig 5). Second, we validated the role of UBE2C in the T-stage and N-stage of prostate cancer (Fig 6). We then performed disease-free survival and expression analyses of UBE2C in GSE70770 (three samples were excluded because follow-up months were missing), GSE116918 and TCGA using Kaplan-Meier (KM) curves and boxplots (Figs 6 and 7). The correlations between clinical parameters and the expression of UBE2C included in Table 2 was done by

GSEA
The GEO dataset GSE70770 contained 206 primary prostate cancers, which were divided into two groups, high versus low, based on the median expression of UBE2C. To investigate the function and effect of UBE2C, GSEA was used for enrichment analysis. The cutoff criteria were a P-value < 0.05 and a false discovery rate (FDR) < 0.25 (Fig 8).
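The median split that defines the two GSEA groups can be sketched in a few lines of Python. The sample IDs and expression values below are hypothetical, and assigning samples exactly at the median to the low group is an assumption, since the text does not state how ties were handled.

```python
from statistics import median

# Hypothetical expression values keyed by sample ID (not data from GSE70770).
expression = {"S1": 5.2, "S2": 7.9, "S3": 6.1, "S4": 8.4, "S5": 5.8, "S6": 7.0}


def split_by_median(values):
    """Label each sample 'high' or 'low' relative to the median expression."""
    cutoff = median(values.values())
    # Assumption: values equal to the median are placed in the low group.
    groups = {
        sample: ("high" if value > cutoff else "low")
        for sample, value in values.items()
    }
    return cutoff, groups


cutoff, groups = split_by_median(expression)
high_samples = sorted(s for s, g in groups.items() if g == "high")
low_samples = sorted(s for s, g in groups.items() if g == "low")
```

The resulting high and low labels are what a two-class enrichment analysis would take as its phenotype assignment.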

GO and KEGG pathway analysis
To determine the function of these DEGs, as described in S1 Table, DAVID was used to perform GO and KEGG pathway analysis and the results were presented in S2  biological process analysis indicated that DEGs were significantly enriched in the oxidation-reduction process, negative regulation of transcription, cell-cell signaling, lipid metabolic process and cell adhesion. For the cellular component, the most significantly altered terms were proteinaceous extracellular matrix and extracellular region. As for molecular function, cadherin binding involved in cell-cell adhesion and oxidoreductase activity were significantly enriched. KEGG pathway analysis showed that transcriptional dysregulation in cancer, the TGF-beta signaling pathway and the Hippo signaling pathway were important in prostate cancer. With Metascape, the results indicated that the DEGs were clearly correlated with the development of prostate neoplasms (Fig 2E and 2F).

PPI network construction and hub genes validation
To screen the hub genes, the differentially expressed genes were uploaded to STRING, and the PPI network was then constructed with Cytoscape (Fig 3). To identify the hub genes of prostate cancer, the DEGs were ranked by Maximal Clique Centrality (MCC) [26], Degree and Density of Maximum Neighborhood Component (DMNC) [22], and the top ten genes from each method were selected. Seven genes, UBE2C, PBK, CCNB1, CDKN3, TOP2A, AURKA and MKI67, which appeared in at least two methods, were recognized as candidate hub genes. Only UBE2C, which was selected by all three methods, was considered of higher value. To verify the role and function of the seven genes in the progression and prognosis of prostate cancer, expression and survival analyses were performed with GEPIA (Fig 4). The results showed that all seven genes were associated with prostate cancer prognosis, but only UBE2C, TOP2A and CCNB1 were more highly expressed in prostate cancer samples than in normal prostate gland samples; PBK, CDKN3, AURKA and MKI67 were not.

Clinical information of the hub gene
Considering the clinical value of the seven genes, we analyzed the correlation between candidate hub genes and the Gleason score of prostate cancer. The results showed that the candidate hub genes, UBE2C in particular, were positively related to Gleason score in GSE70770 (Figs 5 and 6B). A higher expression level of UBE2C indicated a worse Gleason score. Moreover, the expression level of UBE2C increased as the T-stage and N-stage of prostate cancer advanced (Fig 6C and 6D). According to the validation results, UBE2C was highly expressed in prostate cancer (Fig 6A), in particular in castration-resistant prostate cancer. The results also showed that UBE2C was important for predicting the disease-free survival of prostate cancer and may play an important role in prostate cancer (Figs 7 and S4). UBE2C was therefore confirmed as the hub gene of prostate cancer. As shown in Table 2, primary prostate cancer samples in GSE70770 were divided into low and high groups based on the median expression of UBE2C. As expected, extra-capsular extension (ECE), Gleason score and biochemical relapse (BCR) differed markedly between the two groups, while no significant differences were observed in age, PSA or positive surgical margins (PSM). To determine whether UBE2C could function as an independent prognostic factor for prostate cancer, the effect of clinical phenotype on relapse-free survival was assessed by Cox regression (Table 3). According to the results, UBE2C functioned as a risk factor in prostate cancer progression (HR 2.796; 95% CI 1.762-4.436; P<0.0001). ROC curves were used to present the predictive accuracy of UBE2C (AUC = 0.64 in GSE70770, AUC = 0.675 in TCGA) in the DFS analysis (Fig 7). In GSE70770, the AUC of prostate cancer BCR at 1-year is 0. For localized prostate cancer, radical prostatectomy combined with radiation has become the first choice, although it can bring some adverse effects to patients.
Previous studies have shown that UBE2C is highly correlated with chemoresistance and radiotherapy resistance of prostate cancer [27][28][29][30]. According to our results, UBE2C was highly expressed in prostate cancer, especially in castration-resistant prostate cancer, and was correlated with neuroendocrine prostate cancer biomarkers such as RB1 and LDHA (S4 Fig). This indicates that targeting UBE2C may provide a new therapeutic approach for prostate cancer, especially for castration-resistant prostate cancer.
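For readers less familiar with Cox regression, the reported hazard ratio and confidence interval for UBE2C can be related to the underlying model coefficient as in the worked equations below. This is a back-calculation that assumes a standard Wald-type 95% interval, which the text does not state explicitly; the symbols h_0, β, x and SE are introduced here for illustration only.

```latex
% Cox proportional hazards model for a single covariate x (UBE2C high vs low)
h(t \mid x) = h_0(t)\, e^{\beta x}, \qquad \mathrm{HR} = e^{\beta}

% Wald-type 95% confidence interval (assumed)
\mathrm{CI}_{95\%} = \exp\!\left(\beta \pm 1.96\,\mathrm{SE}\right)

% Back-calculation from the reported values (HR 2.796; 95% CI 1.762-4.436)
\beta = \ln 2.796 \approx 1.028, \qquad
\mathrm{SE} \approx \frac{\ln 4.436 - \ln 1.762}{2 \times 1.96} \approx 0.236, \qquad
z = \frac{\beta}{\mathrm{SE}} \approx 4.36
```

Under these assumptions, the interval is symmetric around β on the log scale, and a z statistic of about 4.36 is consistent with the reported P<0.0001.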

GSEA
As shown in Fig 8, high expression of UBE2C was positively associated with the G2M checkpoint, cell cycle and DNA replication. Furthermore, E2F targets, MYC targets and the Notch signaling pathway were also significantly enriched with high UBE2C expression. As expected, UBE2C was also positively correlated with bladder cancer progression as well as with renal cell carcinoma and pancreatic cancer.

Discussion
Ubiquitin, a conserved protein of 76 amino acids, is involved in the post-translational process of protein degradation [31]. Ubiquitin is conjugated to lysine residues of target proteins through multi-step enzymatic reactions [31][32][33]. Ubiquitin-activating enzymes (E1), ubiquitin-conjugating enzymes (E2) and ubiquitin ligases (E3) are all involved in ubiquitination. Ubiquitination, one of the post-translational modifications, begins with activation of the ubiquitin molecule and ends with the linkage of ubiquitin to the target protein [32,33]. Ubiquitination is essential for the regulation of many cellular processes in almost all mammalian cells [34,35]. This modification is involved in the progression of many severe diseases, such as infection, inflammatory response, cancer and neurodegeneration [34][35][36]. Regulating the activity of ubiquitin enzymes therefore offers a novel strategy for therapeutic intervention in tumors [35]. In the current study, UBE2C, which may not only drive the progression of prostate neoplasms but also predict the prognosis of prostate cancer, was identified as the hub gene of prostate cancer. We found that the level of UBE2C paralleled the Gleason score of prostate cancer, early biochemical recurrence and poor clinical outcomes. As the results showed, the higher the UBE2C level, the worse the Gleason score. UBE2C, which belongs to the family of E2 ubiquitin-conjugating enzymes, plays a crucial role in inducing protein degradation through the ubiquitin-proteasome proteolytic (UPP) pathway in cooperation with the anaphase-promoting complex (APC) [37,38]. UBE2C is highly expressed in diverse tumors compared with the respective normal tissue [38,39], such as breast cancer, lung cancer, colon cancer, liver cancer, thyroid cancer and prostate cancer [40].
The mechanisms by which UBE2C promotes and regulates the development of prostate cancer deserve to be explored, and several studies have focused on them. One study confirmed that the level of UBE2C, which can be regulated by miR-381-3p, was positively correlated with the proliferation of prostate cancer cells [39]. Phosphorylation of Mediator 1 (MED1) at T1032, a post-translational modification regulated by the PI3K/AKT signaling pathway [41], promoted the expression of UBE2C in prostate cancer and enhanced the role of UBE2C in promoting the proliferation and driving the progression of prostate cancer. Notably, in PC-3 cells, UBE2C could recruit FOXA1 to its enhancers, which was most likely correlated with the expression of UBE2C [41] and promoted the progression of castration-resistant prostate cancer. Other research has reported that FOXM1, E2F1 and RAD51 can bind to the enhancer and promoter regions of UBE2C to regulate its expression [40]. The level of UBE2C showed a strong relationship with the differentiation and progression of prostate cancer and was clearly correlated with prognosis. Compared with low-risk androgen-dependent prostate carcinoma (ADPC), UBE2C was prominently upregulated in fatal castration-resistant prostate cancer (CRPC) [37]. ADT has been considered the first-line treatment for advanced prostate cancer. However, within a few months, prostate cancer can progress to CRPC, which does not respond to ADT [42]. The pathogenesis of CRPC is most closely related to AR mutation, AR amplification, the presence of AR-V7 and abnormal activation of signals downstream of the androgen receptor [43]. Compared with full-length AR (AR-FL), the level of UBE2C was closely correlated with the expression of AR variant 7 (AR-V7), which could regulate the expression of UBE2C [42]. This suggested that UBE2C is a downstream target of AR-V7, which offers a new way to explore therapy for CRPC by targeting the activity of UBE2C.
Through comprehensive bioinformatics analysis, we concluded that UBE2C could drive the progression of prostate cancer, but the analysis in this paper has certain limitations. First, the role of UBE2C in prostate cancer has not been proven in vivo or in vitro. It is essential to verify the importance of UBE2C in prostate cancer and to explore the mechanism by which UBE2C regulates the development of prostate cancer through experimental approaches. Second, the study of chemotherapy and radiotherapy resistance in prostate cancer patients is very important, and it would be of great value to study the relationship between the expression of UBE2C and treatment resistance in prostate cancer. Clinical trials should therefore be pursued, with attention to the prevention, treatment and care of cancer.