TCGA based integrated genomic analyses of ceRNA network and novel subtypes revealing potential biomarkers for the prognosis and target therapy of tongue squamous cell carcinoma

Objectives The study aimed to investigate the ceRNA network in biological development of Tongue Squamous Cell Carcinoma (TSCC) and to identify novel molecular subtypes of TSCC to screen potential biomarkers for target therapy and prognosis by using integrated genomic analysis based on The Cancer Genome Atlas (TCGA) database. Material and methods Data on gene expressions were downloaded from TCGA and GEO database. Differentially expressed RNAs(DERNAs) were shown by DESeq2 package in R. Functional enrichment analysis of DEmRNAs was performed using clusterprofilers in R. PPI network was established by referring to String website. Survival analysis of DERNAs was carried out by survival package in R. Interactions among mRNAs, miRNAs and lncRNAs were obtained from Starbase v3.0 and used to construct ceRNA network. Consensus Cluster Plus package was applied to identify molecular subtypes. All key genes were validated by comparing them with GEO microarray data. Statistical analyses of clinical features among different subtypes were performed using SPSS 22.0. Results A total of 2907 mRNAs (1366 up-regulated and 1541 down-regulated), 191miRNAs (98 up-regulated and 93 down-regulated) and 1831 lncRNAs (1151 up-regulated and 680 down-regulated) were identified from tumor and normal tissues. A ceRNA network was successfully constructed and 15 DEmRNAs, 1 DEmiRNA, 2 DElncRNAs associated with prognosis were employed. Furthermore, we firstly identified 2 molecular subtypes, basal and differentiated, and found that differentiated subtype consumed less alcohol and was related to a better overall survival. Conclusion The study constructed a ceRNA network and identified molecular subtypes of TSCC, and our findings provided a novel insight into this intractable cancer and potential therapeutic targets and prognostic indicators.


Introduction
Oral squamous cell carcinoma (OSCC), among others, is one of the most frequent cancers in the world. The major type of OSCC is tongue squamous cell carcinoma, the incidence which accounts for 25%~40% [1]. OSCC is characterized by a rapid local invasion, an early lymph node metastasis and a poor prognosis [2]. According to a retrospective analysis by Nair [3], TSCC was clinically different from other oral cancers due to its pathological factors, suggesting that TSCC could be regarded as a particular type of OSCC. The occurrence of TSCC could be attributed to tobacco, alcohol and areca nut [4]. Although great efforts have been made in the advances in surgical techniques, chemo-radiotherapy and some other target therapies, the overall survival of TSCC still hovers around 50% [5]. Thus, it is crucial to reveal the underlying biological mechanism and different molecular subtypes of TSCC associated with prognosis in order to discover novel biomarkers for target therapy and prognosis prediction.
MRNAs and some non-coding RNAs, including microRNAs(miRNAs) and long non-coding RNAs (lncRNAs), play an important role in the initiation and progression of TSCC. MiRNA, which is a family of functional noncoding RNA molecules with a length of 20-25 nucleotides, widely participate in post-transcriptional regulation by targeting mRNAs. A series of miRNAs have been demonstrated as oncogenes or tumor suppressors of TSCC [6,7]. LncRNAs, which are defined as endogenous non-coding RNAs that are longer than 200 nucleotides, regulate biological processes including transcriptional regulation, cell metabolism and RNA modification in tumorigenesis and metastasis of TSCC [8,9]. On the basis of the function of miRNAs and lncRNAs, Salmena et al put forward a hypothesis of competing endogenous RNA (ceRNA) across transcriptome, in which all types of RNA transcripts were believed to be able to act as a ceRNA through microRNA response elements (MREs) that are accessible to microRNA binding [10]. According to such a hypothesis, a great number of studies further demonstrated the importance of ceRNA in the occurrence and development of different cancers. For example, Zhou et al used GEO database to construct a miRNA-mRNA network of TSCC [11]. Nohata et al identified a prognostic lncRNAs malignant progression of TSCC [12]. The ceRNA network of OSCC was first established by Simin Li in 2017 [13]. However, to the best of our knowledge, no study has been conducted on ceRNA network of TSCC.
In addition to the regulation of ceRNA network in carcinogenesis, different clinical or molecular subtypes have been increasingly shown to be associated with different clinical and pathological factors. Walter et al identified 4 gene expression subtypes of head and neck squamous cell carcinoma (HNSCC) [14], and to further expand Walter's study, Zevallos [15] et al carried out gene expression analysis and demonstrated 4 subtypes to predict nodal metastasis and survival in human papilloma virus (HPV)-negative OSCC. However, we still lack a classification of TSCC subtypes.
Therefore, the aim of present study is to investigate the hidden crosstalk among various RNAs in TSCC via ceRNA network construction and to identify molecular subtypes of TSCC that can be used to predict prognosis. Furthermore, we identified that several genes with differential expressions from tumor and normal tissues were significantly associated with overall survival by integrated analysis. Our findings provided a novel insight into the biological process of TSCC and potential biomarkers for target therapy and survival prediction.

Data collection and preprocessing
Gene expression data (RNA sequencing profiles and miRNA profiling) and corresponding clinical data of TSCC were obtained from TCGA database (https://portal.gdc.cancer.gov/), and 126 TSCC samples and 13 normal controlled samples were collected. Among these data, data on mRNA and lncRNA expressions were obtained from Illumina HiSeqRNASeq platforms, while miRNA data were collected from Illumina HiSeqmiRNASeq platforms.

Functional enrichment analysis of GO annotation and KEGG Pathways
ClusterProfiler v3.8 package of R was used to analyze and visualize functional profiles (Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway) of the genes in order to determine shared functions among DEmRNAs. P<0.05 was considered as a threshold of GO and KEGG enrichment analysis.

Establishment of protein-protein interaction (PPI) network
To understand the underlying interaction of DEmRNAs, the STRING website was employed to construct the PPI network, which was visualized by the Cytoscape software v3.6.1.

DEmRNAs, DElncRNAs, and DEmiRNAs associated with prognosis
Survival analysis was carried out by using the survival package of R to help assess the prognostic value of differentially expressed RNAs in TSCC patients. All samples were divided into either the high-expression group(>median) or the low-expression group(<median) in terms of the expression of each DEmRNA, DElncRNA, and DEmiRNA. The Survival curves were plotted using the Kaplan-Meier method. The log-rank test was adopted to assess statistical significance. P<0.05 was considered as statistically significant.

Prediction of lncRNA-miRNA and miRNA-mRNA interactions
We predict the interaction between DElncRNA and DEmiRNA or DEmRNA and DEmiRNA by using starBase v3.0, which determined more than 1.1 million miRNA-ncRNA, 2.5 million miRNA-mRNA and 1.5 million RNA-RNA interactions from multi-dimensional sequencing data. In addition, the prediction results of miRanda, Targerscan and miRmap were integrated by starBase. Only the regulatory pairs of DEmiRNAs and DEGs, DElncRNAs and DEmiRNAs had opposite expressions and were therefore included in the present study.

Construction of ceRNAs regulatory network
According to ceRNA theory, the selected interaction of DEmiRNAs and DEmRNAs and of DElncRNAs and DEmiRNAs were integrated to construct the DElncRNAs-DEmiRNAs-DEmRNAs ceRNA network using Cytoscape software v3.6.1.

Validation of expressions of crucial DEmRNAs, DElncRNAs, and
DEmiRNAs basing on ceRNA network and prognosis analysis GEO2R (http://www.ncbi.nlm.nih.gov/geo/geo2r/) is an interactive web analysis tool for GEO data to identify genes with differential expressions among different groups provided by GEO database. We applied GEO2R to confirm key RNAs. Crucial DEmRNAs and DElncRNAs were verified in GSE30784 and GSE13601 and GSE28100 was used to validate crucial DEmiRNAs.

Copy number variations (CNV), somatic mutation (SNV) and methylation analysis of hub genes
To determine the possible regulatory mechanism of crucial RNAs with differential expressions from TSCC and normal tissue, CNV and methylation data analysis on selected RNAs were performed by cor package of R software. Pearson correlation analysis was carried out to identify whether they had correlated relationship with Pearson correlation coefficient >0. 40, which was set as a cutoff. SNV analysis was performed to excavate whether there existed some mutations of genes that possibly regulated genes, which were most significantly related to prognosis, by using Fisher test. P value < 0.05 was considered as statistically significant.

Molecular subtypes of TSCC classification
To find out if some molecular subtypes of TSCC were associated with prognosis, we classify TSCC subtypes in terms of gene expression from TCGA using ConsensusClusterPlus package of R. The top 2000 variable genes were reduced for integrative clustering analysis by median absolute deviation. All sample were divided into k (2 to 6) groups for determining a stable classification by k value analysis.

Clinical features description and survival analysis of different subtypes
Clinical information (i.e. clinical stage, T stage, N stage, smoking history, alcohol history, nodal extracapsular spread) was collected. Chi-square test was used to assess statistical significance between each subtype with p<0.05, which was set as the cut-off criterion. Survival analyses of patients with different subtypes were performed subsequently using a log-rank test by survival package of R. P value <0.05 was set as a threshold.

Identification of differential expression of mRNAs, miRNAs, lncRNAs between different subtypes of TSCC associated with prognostic DERNAs and functional enrichment analysis of GO and KEGG
To understand the intrinsic biological features between different molecular subtypes, we identified mRNAs, lncRNAs and miRNAs with differential expressions. Among these DERNAs, genes overlapped with prognostic genes mentioned above were considered as potential prognostic predictors for TSCC, according to different subtypes. Functional enrichment analyses were carried out to excavate biological mechanism and to define gene expression subtypes by clusterprofiler mentioned above. P<0.05 was considered as a cutoff of GO and KEGG enrichment analysis.

Functional enrichment analysis of gene ontology and KEGG pathways
2907 DEmRNAs were involved in the functional enrichment analysis using clusterProfiler package of R. GO annotation consists of biological process (BP), cellular component (CC), molecular function (MF). The top 50 GO analyses results were list in S1 Table. These results which took part in extracellular organization, ion transmembrane transport, muscle biological process and epidermal cell differentiation. MF analysis indicated that these DEmRNAs were significantly associated with 257 MFs, including channel activity, DNA-binding transcription activator activity, RNA polymerase II-specific, actin binding, enzyme inhibitor activity and cytokine activity. We found that the most enriched CC was collagen-containing extracellular matrix. KEGG pathway analysis (Fig 2 and S2 Table) showed that a total of 50 pathways were highly enriched, which was consistent to DEmRNAs. These pathways were mainly involved in Cytokine-cytokine receptor interaction, PI3K-Akt signaling pathway, Human papillomavirus infection, Focal adhesion and Calcium signaling pathway.

Protein-protein network analysis
Based on DEmRNAs, 12872 pairs of mRNA interaction relationship were identified using the STRING database with scores of >0. 4. Genes, whose number of edges were larger than 50, were employed in PPI network (S3 Table). 77 mRNAs were therefore selected to construct PPI network (Fig 3) using Cytoscape. The gene with the highest degree was ALB (Degree = 217), and ITGAX was the gene with the highest combined score (Combined score = 0.932).

Prognostic overall survival assessment of DEmRNAs, DEmiRNAs, and DElncRNAs
The survival analysis was performed to evaluate the prognostic signatures of DEmRNAs, DEmiRNAs, and DElncRNAs. Setting P<0.05 as a cutoff, a total of 154 DEmRNAs, 6 DEmiRNAs, 76 DElncRNAs were significantly related to overall survival. Among these potential prognostic biomarkers, a low expression of NAGS, which was the gene the most significantly associated with survival (P = 0.00036), was positively related to OS. For DEmiRNAs, a low expression of hsa-miR-1229-3p linked the most significantly to a high OS(P = 0.0053). However, a low expression of AL359851.1 the most significantly suggested a longer survival time, with a P = 0.00096 in DElncRNAs. The 3 biomarkers that were the most significantly related of each type of DERNAs were presented in Fig 4. All survival analysis results were given in S4 Table.

CeRNAs regulatory network construction
StarBase v3.0 was used to match DEmiRNAs and DEmRNAs or DElncRNAs. A total of 358 pairs of DEmiRNA-DEmRNA interactions (S5 Table) combined by 26 DEmiRNAs and 185 DEmRNAs were successfully predicted. In miRNA-ncRNA interaction analysis, 23 DEmiR-NAs and 22 DElncRNAs were involved in 44 pairs of DEmiRNA-DElncRNA interactions (S6 Table). All the interactive pairs were integrated to construct the lncRNA-miRNA-mRNA ceRNA network (Fig 5), which contained 249 nodes and 409 edges, using cytoscape software.

Validation of expressions of crucial DEmRNAs, DE-miRNAs, and DE-lncRNAs
To further confirm the biological function and prognostic value of the DEmRNAs, DE-miR-NAs and DE-lncRNAs, we used GEO2R to analyze 3 GEO data (GSE30784, GSE13601 and GSE28100). SYBU, AC017048.3 and LOC100506691 were demonstrated to be non-differentially expressed in none of the mRNA microarrays. However, others were validated in at least one data (Table 2).

CNV methylation and SNV analysis of hub genes
As 15 DEmRNAs mentioned above had been demonstrated as playing an important role in the development and prognosis of TSCC, we aimed to find the underlying regulatory mechanism. Methylation analysis did not show any gene whose methylation was negatively correlated with its gene expression level, by using cor package of R. However, the expression of GFPT2(Pearson's correlation coefficient = 0.436363) was possibly to be positively correlated with its CNV. SNV analysis of the most significant prognostic gene, NAGS, identified FAT1 (p = 0.039) as a potential SNV locus that regulate the gene expression of NAGS.

Molecular subtypes of TSCC classification
As DERNAs actively participated in TSCC onset and development, we speculated if there were some intrinsic gene expression subtypes related to different prognoses in TSCC. The 2000 most variable genes were employed by variance filtering (MAD method). The output of Con-sensusClusterPlus showed k (2 to 6) clusters (Fig 6), and we calculated cluster-consensus and item-consensus results, which were visualized in Fig 6. All samples were successfully categorized into 2 subtypes in terms of the most stable k value, and such a process resulted in the removal of 66 out of 126 samples from subtype A and 60 out of 126 samples from subtype B.

Clinical characteristics description and prognostic assessment
The clinical characteristics was described, and the results (Table 3) revealed no correlation of TSCC molecular subtype with clinical features such as gender, onset age, pathologic N or T stage, clinical stage and smoking. Noticeably, the patients in group A had a longer drink history(p = 0.024) and more alcohol consumption per day(p = 0.006). Kaplan-Meier survival curve in two subtypes (Fig 7) indicated that subtype B was significantly associated with a better overall survival(p = 0.0039).

Identification of differential expression of mRNAs, miRNAs, lncRNAs between different subtypes of TSCC overlapping with prognostic DERNAs and functional enrichment analysis of GO and KEGG
After performing DESeq2 package, 715 DEmRNAs, 7 DEmiRNAs and 521 DElncRNAs were screened. Top 10 DEmRNAs between two subtypes were list in Table 4. Morpheus website was Integrated genomic analyses for potential biomarkers for the prognosis and target therapy of TSCC applied for plotting the heatmap of DEmRNAs (Fig 8). The 3 most upregulated genes were BPIFB2, CTCFL and NTS in subtype A, while DEFB4B, CRNN and MUC21 were the 3 most up-regulated genes in subtype B. DERNAs, 25 mRNAs, 5lncRNAs and 0 miRNA were found to overlapped with prognostic RNAs mentioned above (Table 5). Further enrichment analysis (S7 Table) showed that upregulated genes in subtype A were highly enriched not only in metabolic process, digestion and multicellular organismal homeostasis in GO, but also in chemical carcinogenesis, metabolism by cytochrome P450 and in nicotine addiction in KEGG (Fig 9). However, upregulated genes in group B (S8 Table) was involved in GO including keratinocyte or epidermal cell differentiation, skin or development, peptide cross-linking and cell killing and KEGG, including lipid or acid metabolism and IL-17 signal pathway. Therefore, we termed these two subtypes 'basal' and 'differentiated' in terms of their particular differential expression gene functions and literature reviews.

Discussion
TSCC, which is the most common kind of OSCC, is generally characterized by a notably aggressive biological behavior and a poor survival. It is urgent to detect the underlying genetic pathogenesis and to find reliable therapeutic targets and prognostic biomarkers for TSCC in order to improve patients' clinical outcome. In this study, we used TCGA database to identify mRNAs, miRNAs, lncRNA with differential expressions between TSCC and normal control tissues. To further understand the DERNAs, GO and KEGG analysis were performed subsequently. The results of GO analysis showed that extracellular matrix organization, extracellular structure organization, muscle contraction, regulation of ion transmembrane transport and muscle system process were involved in the biological process of TSCC initiation, which was different from other GO types of OSCC [16,17]. This suggested TSCC might be a special type of OSCC, which might be resulted from particular anatomical and histological structure of the tongue. Moreover, the enriched KEGG pathway was found to be involved in cytokine-cytokine receptor interaction, PI3K-Akt signaling pathway, HPV infection, alcoholism, focal adhesion, MAPK signaling pathway and cAMP signaling pathway, which was consistent with findings from other previous studies on the tumorigenesis of TSCC [18,19]. We next constructed the PPI network with DEmRNAs and obtained the hub genes with the highest degree including ALB, FN1, EGF, MM9, KNG1, COL1A1, SPP1, ACTN2, most of which were reported to play a critical role in carcinogenesis and tumor progression [20][21][22][23].
The ceRNA network was established by integrating all DERNAs. A total of 15 prognostic DEmRNAs (SLC16A1, E2F7, SCN8A, ZIC2, NR4A1, CLU, GFPT2, CEP55, PLK1, SYBU, STC2, DEPDC1B, MICB, LIFR, BIRC5), 1 DEmiRNA(hsa-miR-337-3p) and 2 DElncRNAs (AC017048.3, AC156455.1) were involved in ceRNA network. We investigated the underlying mechanism of mRNA with differential expression, and apart from GFPT2, no correlation was identified between CNV or methylation and gene expression, which provided foundation for intensive study in future. SNV analysis of NAGS (the most significant prognostic gene) indicated that somatic mutation in FAT1(atypical cadherin 1) may be associated with the dysregulation of NAGS. According to Pickering, FAT1 was mutated in 30% of OSCC samples, which was also the highest frequency of mutation in HNSCC [24,25]. Researchers also suggested that FAT1 inhibited migration and invasion in OSCC and some other types cancers [26,27]. In this study, FAT1 mutation was shown to be related to differential expression of NAGS, which resulted in a remarkable variation of overall survival. All these results suggested that FAT1 could be used as a novel candidate tumor suppressor or prognostic predictor. SLC16A1(solute carrier family 16 member 1), alternatively known as MCT, is the most significant prognostic DEmRNA employed in ceRNA network and might have the potential to be   [29]. E2F7(transcription factor 7), targeted by MIR424, miR-3666 and miR-26a, is highly expressed in multiple tumors [30,31] such as endometrial carcinoma, breast cancer, colorectal cancer and glioma, indicating that E2F7 might constitute a potential therapeutic target. The overexpression of E2F7 was also detected in HNSCC [32]. It was also found that Sphk1-dependent activation of AKT mediated E2F7-induced doxorubicin resistance, which might be similar to the mechanism of progression in TSCC. SCN8A (sodium voltage-gated channel alpha subunit 8), which is related to different types of cancers [33,34] including colorectal cancer, prostate cancer, breast and some other types of cancer, emerges as an aggressive oncogene and could cause poor prognosis [35]. ZIC2 (Zic family member 2) has been reported to be crucial to the progression of cancer [36][37][38] such as hepatocellular carcinoma, epithelial ovarian tumor, osteosarcoma and OSCC. ZIC2 could activated the PI3K/AKT signal pathway, which played a vital role in TSCC, as it promotes viability, migration, and invasion of cancer cells. Sakuma [38] et al. detected that a significant up-regulation of ZIC2 in OSCC samples was related to a poor survival rate. NR4A1 (nuclear receptor subfamily 4 group A member 1) was validated to be targeted by miR-377 [39] and PCH4 [40] and act as a tumor promoter in OSCC. Besides, Liu and Chen [41,42] discovered that NR4A1 could be a critical general regulator in the induction of T cell dysfunction and inhibit NR4A, thus, NR4A1 was seen as promising in cancer immunotherapy. The prognostic DEmiRNA, hsa-miR-337-3p targeting one DEmRNA (AMOT, a witnessed oncogene in OSCC [43]), has been implicated to target AMOT, HOXB7 [44], MMP-14 [45], PTEN [46]  Integrated genomic analyses for potential biomarkers for the prognosis and target therapy of TSCC and acted as a tumor suppressor through sponging oncogene. Loss of miR-337-3p expression was associated with the development of cancer [47]. Most of these DERNAs were demonstrated to participate in carcinogenesis and progression in existing studies. However, the underlying functions of the 2 prognostic DElncRNAs in cancer remain poorly understood.
To further explore the possible genetic classification of TSCC, we identified 2 molecular subtypes in TSCC and termed these 2 subtypes basal and differentiated, according to gene function and Pickering's work on integrative genomic analysis of OSCC [48]. We integrated the clinical data of the 2 subtypes, interestingly, we found that basal tumors were highly associated with alcohol consumption but a poor clinical outcome. Moreover, the basal tumors were likely to come from patients with a smoking habit. All these clinical features suggested that alcohol and tobacco participate in critical progression of TSCC, therefore leading to a worse prognosis. In addition, we obtained 715 DEmRNAs, 7 DEmiRNAs, 521DElncRNAs. The functional analysis results suggested that basal tumor was expected to be related to metabolic process, chemical carcinogenesis and nicotine addiction, which could explain its different clinical manifestation. However, differentiated tumor was shown to participate in keratinocyte or epidermal cell differentiation, skin or epithelial development, peptide cross-linking, cell killing and IL-17 signal pathway, which were similar to the features characterized by a previous study on differentiated subtype [49]. These findings contributed to a better understanding on the difference of clinical characteristics and the underlying biological mechanism between different subtypes of TSCC.
Among these DERNAs, 25 mRNAs and 5lncRNAs were significantly correlated to prognostic RNAs and had a potential to act as predicted biomarkers to evaluate prognosis. These findings are mostly consistent with the results of previous studies. ODF4 (outer dense fiber of sperm tails 4) has been reported to have a high expression [50] in breast cancer, prostate cancer, basal cell carcinoma and chronic myeloid lymphoma. Afsharpad [51] determined ODF4 expression in urinary exfoliated cells, cancerous tissue and tumor-free tissue to confirm its diagnostic and surveillance potential. SPINK7(serine peptidase inhibitor, Kazal type 7 (putative)) targeted by miR-1322 [52] by inhibiting invasion of cancer cells via the urokinasetype plasmin activator receptor 1 integrin pathway or by upregulating p53 expression was considered as a tumor suppressor in multiple types cancers [53][54][55]. Li et al [56] validated the predictive value of SPINK7 in noninvasive early detection of gastric cancer. Therefore, it is possible to predict prognosis of TSCC patients by detecting SPINK7 expression in saliva. The dysregulation of LINC02487, alternatively known as LOC441178, was detected by transcriptome analysis and verified by RT-PCR in OSCC [57]. Xu [58] et al found that LINC02487 could sponge coiled-coil-containing protein kinase 1 (ROCK1) in OSCC, and that OSCC patients with a high expression of LINC02487 would have a longer survival time, which was in accordance with our result and also suggested that LINC02487 could be a new therapeutic target and prognostic indicator for TSCC. To the best of our knowledge, no relevant studies found that LINC02028, LOC101927503, LOC349160 and LINC02560 existed in cancer.
With the advance of RNA sequencing and bioinformatic analysis, a large amount of genomic data become available but requires to be decoded. In our study, we found that 15 DEmR-NAs, 1 DEmiRNA, 2 DElncRNAs were related to prognosis by constructing ceRNA network of TSCC. Most of them were found to act as oncogenes or tumor suppressors in many types of cancer, and the analysis of GO and KEGG pathway and PPI network further confirmed their function. Combining the ceRNA network theory, we considered that these DERNAs played a critical role in the occurrence of TSCC and could be utilized as biomarkers for target therapy. Additionally, we firstly confirmed 2 molecular subtypes in TSCC using data from TCGA by clustering analysis, and such an effort helped comprehend the intrinsic classification of TSCC and specific biological process of each subtype. We correlated DEmRNAs between basal and differentiated tumor with prognosis, and 25 mRNAs and 5 lncRNAs were identified to be associated with prognosis. Evidence suggested that these critical molecules not only participated in the initiation, but also in the development of TSCC and helped deciding prognosis and therapy decision. Advances and limitations of this study should also be acknowledged. It should be noticed that we integrated several categories of genetic data to bring out the importance of these selected molecules at different regulatory levels, which has hardly been done in previous studies. However, the data from TCGA only included American samples, which could not represent the whole existing conditions and as the regulatory pathway of molecular is very intricate, what we found in the study may be the tip of an iceberg. Thus, the mechanism of TSCC or even in all types of cancer still has an immense potential to be discovered, and further clinical trials and molecular experiments are required to verify our results.

Conclusion
The TCGA-based comprehensive genomic analyses successfully established a ceRNA network of TSCC and defined 2 intrinsic molecular subtypes of TSCC. Furthermore, we demonstrated key molecules involved in carcinogenesis and progression by integrating results of analyses of ceRNA network, overall survival and subtype classification. These discoveries provided a novel genetic landscape and the foundation for prognostic prediction or for more effective treatment, in which key genes are checked and targeted by specific inhibitors or promotors Supporting information S1