Characterization of long noncoding RNA and messenger RNA signatures in melanoma tumorigenesis and metastasis

The incidence of melanoma, the most aggressive and life-threatening form of skin cancer, has significantly risen over recent decades. Therefore, it is essential to identify the mechanisms that underlie melanoma tumorigenesis and metastasis and to explore novel and effective melanoma treatment strategies. Accumulating evidence s uggests that aberrantly expressed long noncoding RNAs (lncRNAs) have vital functions in multiple cancers. However, lncRNA functions in melanoma tumorigenesis and metastasis remain unclear. In this study, we investigated lncRNA and messenger RNA (mRNA) expression profiles in primary melanomas, metastatic melanomas and normal skin samples from the Gene Expression Omnibus database. We used GSE15605 as the training set (n = 74) and GSE7553 as the validation set (n = 58). In three comparisons (primary melanoma versus normal skin, metastatic melanoma versus normal skin, and metastatic melanoma versus primary melanoma), 178, 295 and 48 lncRNAs and 847, 1758, and 295 mRNAs were aberrantly expressed, respectively. We performed Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway analyses to examine the differentially expressed mRNAs, and potential core lncRNAs were predicted by lncRNA-mRNA co-expression networks. Based on our results, 15 lncRNAs and 144 mRNAs were significantly associated with melanoma tumorigenesis and metastasis. A subsequent analysis suggested a critical role for a five-lncRNA signature during melanoma tumorigenesis and metastasis. Low expression of U47924.27 was significantly associated with decreased survival of patients with melanoma. To the best of our knowledge, this study is the first to explore the expression patterns of lncRNAs and mRNAs during melanoma tumorigenesis and metastasis by re-annotating microarray data from the Gene Expression Omnibus (GEO) microarray dataset. These findings reveal potential roles for lncRNAs during melanoma tumorigenesis and metastasis and provide a rich candidate reservoir for future studies.


Introduction
The worldwide incidence of melanoma, the most aggressive form of skin cancer, has rapidly increased in recent decades. The number of new melanoma cases in the United States is expected to reach 76,380, with 10,130 deaths, by the end of 2016 [1]. Primary melanoma (PM) is curable by surgery. However, if it is not detected early and surgically removed, melanoma is highly likely to metastasize. Thus, identification of the mechanisms driving both tumorigenesis and metastasis and the development of novel and effective melanoma treatment strategies are urgently needed.
Long noncoding RNAs (lncRNAs), which exceed 200 nucleotides in length, are messenger RNA (mRNA)-like transcripts that do not encode proteins [2,3]. Unlike smaller microRNAs, which play crucial roles in human carcinogenesis, our understanding of lncRNA biological functions is in its infancy. The first functional lncRNA, XIST, was discovered in the early 1990s; XIST inactivates gene expression from the X-chromosome by dosage equalization [4,5]. Multiple reports have shown that lncRNAs regulate complex and diverse functions, including embryonic stem cell pluripotency [6], epigenetic gene regulation [7], the DNA damage response [8], and chromatin remodeling [9]. Furthermore, lncRNAs participate in wide-ranging cellular processes, including cell cycle, proliferation, apoptosis, and invasion [10].
With the emergence of next generation sequencing, large projects have identified multiple lncRNAs that are involved in carcinogenesis and development of cancer [11,12], including glioblastoma [13], ovarian cancer [14], hepatocellular carcinoma [15], gastric cancer [16] and colorectal cancer (CRC) [17]; this knowledge suggests intriguing possibilities for diagnostic and therapeutic lncRNA applications. However, little is known about lncRNA functions in melanoma tumorigenesis and metastasis. Multiple studies have identified several functions for lncRNAs in melanoma. Upregulation of SPRY4-IT1 might play an important role in melanoma and be a useful early biomarker in humans [18,19]. Tang et al. [20] has shown HOTAIR overexpression in lymph node metastases compared to PMs and demonstrated an active role in cell motility and invasion, which highlights HOTAIR as a potential target for malignant melanoma therapy. A long intergenic non-coding RNA, CASC15, correlates with melanoma progression and is involved in the regulation of phenotype-switching [21]. Another lncRNA, SLNCR1, promotes melanoma invasion by binding to the androgen receptor and brain-specific homeobox protein 3a [22]. Nevertheless, the low specificities and sensitivities of lncRNAs suggest that a single target is not likely to fully illustrate lncRNA mechanisms in melanoma. The potential roles of lncRNAs during melanoma tumorigenesis and metastasis have not yet been fully explored.
We began our study by analyzing previously published melanoma gene expression profiles from the Gene Expression Omnibus (GEO) database and conducted lncRNA profiling to identify significant lncRNAs. The identified lncRNA profiles were then verified using another independent validation set. A Gene Ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, and lncRNA-mRNA co-expression network analysis were then conducted. We performed survival analysis based on TCGA database. Our findings might uncover possible lncRNA and mRNA expression profiles associated with melanoma progression and metastasis and provide novel insights into the molecular pathogenesis of melanoma. dataset (GSE15605) as the training set and another independent dataset (GSE7553) as the validation set [23]. All of the samples in these datasets were hybridized with the HG-U133 Plus 2.0 Array (Affymetrix, Santa Clara, CA, USA); this array includes 54,675 probe sets and is widely used in biological research.

lncRNA annotation pipeline
To evaluate lncRNA expression in the melanoma gene expression data, we applied an lncRNA annotation pipeline based on the method constructed by Zhang et al. [24]. First, the Affymetrix HG-U133 Plus 2.0 probe set ID was mapped to the latest version of the NetAffx Annotation File (S1 Table, HG-U133_Plus_2 Annotations, CSV format, Release 34, 30 MB, 1/23/14). The annotations contained the probe set ID, gene title, gene symbol, Ensembl, Refseq transcript ID and other information. Second, for the probe sets from the Refseq database, the IDs that were labeled "NR" were retained (NR indicated non-coding RNA). For the probe sets from the Ensembl database, the IDs with "antisense", "processed transcripts", "sense-overlapping", "non_sense_mediated_decay", "sense_intronic", "lincRNA", "non-coding", "misc-RNA" or "3prime-overlapping-ncrna" in the Ensembl annotations were retained. Of the probe sets from the Refseq and Ensembl databases, those that were labeled "NR" in the Refseq database and also annotated with the above Ensembl gene titles were retained. Third, the probe sets were filtered by removing pseudogenes, microRNAs, rRNAs and other small RNAs, including snRNAs, snoRNAs, and tRNAs. enriched functions of the differentially expressed mRNAs. GO analyses covered three domains: Biological Process, Cellular Component and Molecular Function.

Kaplan-Meier analysis
The lncRNA reads per kilobase per million mapped reads (RPKM) of 221 skin samples from patients with cutaneous melanoma were downloaded from The Atlas of Noncoding RNAs in Cancer [30] (TANRIC, http://ibl.mdanderson.org/tanric/_design/basic/index.html), and corresponding clinical parameters and follow-up information for these patients were downloaded from The Cancer Genome Atlas (TCGA) [31]. Kaplan-Meier analyses were performed in R software to explore the association between the lncRNA and overall survival of patients with melanoma. According to the median level of each lncRNA expression, we divided patients with melanoma into low and high lncRNA expression groups; p< 0.05 was considered significant.

GEO data set characteristics
The GSE15605 and GSE7553 series were obtained from GEO and used in this study.  Table). Of these, 1,470 probe sets (952 lncRNAs) were annotated as lncRNAs by both the Refseq and the Ensembl databases; 260 probe sets (215 lncRNAs) were annotated Differentially expressed lncRNAs between PM, MM and N Using the absolute log 2 FC>1 and p<0.05 threshold, differentially expressed lncRNAs were identified for the following three comparisons: PM vs. N, MM vs. N, and MM vs. PM. In PM vs. N, we identified 207 probe sets (178 lncRNAs) containing 108 upregulated probe sets (91 lncRNAs) and 99 downregulated probe sets (87 lncRNAs) (S3 Table). In MM vs. N, we identified 348 probe sets (295 lncRNAs) containing 185 upregulated probe sets (150 lncRNAs) and 163 downregulated probe sets (145 lncRNAs) (S4 Table). In MM vs. PM, we identified 51 probe sets (48 lncRNAs) containing 20 upregulated probe sets (17 lncRNAs) and 31 downregulated probe sets (31 lncRNAs) (S5 Table), as shown in Table 1. A hierarchical clustering analysis of all samples from the GSE15605 training set and the GSE7553 validation set were processed from these differentially expressed lncRNAs. The hierarchical clustering maps for the three comparisons revealed non-random partitioning of the samples into two major groups in GSE15605. Using a training-validation approach, we validated our results in the GSE7553 dataset. Similar distinctions between two sample types were observed (Fig 2).

GO and KEGG pathway analyses
To further explore the potential functions of these differentially expressed mRNAs, we performed GO and KEGG pathway analyses. Upregulated and downregulated mRNAs were separately analyzed in the GO analysis and included the following three domains: Biological Process (GOBP), Cellular Component (GOCC) and Molecular Function (GOMF). In PM vs. N, the top enriched GO terms among upregulated mRNAs included immune response (GOBP), extracellular region (GOCC), and calcium ion binding (GOMF), whereas oxidation-reduction process (GOBP), extracellular exosome (GOCC), and structural molecule activity (GOMF) were the top enriched GO terms among downregulated mRNAs.

lncRNA-mRNA co-expression network analysis
We constructed an lncRNA-mRNA co-expression network to predict the potential functions of differentially expressed lncRNAs in PM and MM. Using a PCC analysis (absolute PCC>0.80, p<0.05), we identified 54 lncRNAs and 472 correlated mRNAs in PM vs. N to construct a clear network with 526 network nodes and 2402 connection edges (Fig 5A). The network indicated that multiple lncRNAs regulated numerous co-expressed mRNAs, including U47924.27, a downregulated lncRNA that was co-expressed with 235 mRNAs, and LINC00888, an upregulated lncRNA that was co-expressed with 15 mRNAs. In MM vs. N, we identified 73 lncRNAs and 666 correlated mRNAs by applying a PCC threshold of 0.90 and a significance threshold of 0.05. The network in MM vs. N contained 739 network nodes and 3145 connection edges ( Fig  5B). Within this network, 2,866 pairs were positively correlated, and 278 pairs were negatively

Venn diagram analysis
To identify significantly expressed lncRNAs and mRNAs associated with melanoma tumorigenesis and metastasis, we constructed a Venn diagram analysis of the common and unique lncRNAs or mRNAs in the three comparisons (PM vs. N, MM vs. N, and MM vs. PM (Fig 6)). We identified 15 lncRNAs (16 probe sets) that overlapped in all three comparisons, including 12 downregulated lncRNAs and 3 upregulated lncRNAs ( Table 2). As shown in Fig 6B, 144 mRNAs (163 probe sets) overlapped in all three comparison groups. Of these, 143 mRNAs were downregulated, including KRTDAP, KRT5, TACSTD2, and SERPINB5, but only SPP1 was upregulated (S9 Table). Based on the log 2 FC, p-value, and degree of associated mRNAs, we selected five lncRNAs, including RP11-164P12.4, TINCR, U47924.27, RP11-532F12.5 and LINC00888. We found mRNAs associated with these five lncRNAs accounted for a large proportion of the lncRNAassociated mRNAs (294/472, 467/666, and 152/169 in PM vs. N, MM vs. N, and MM vs. PM, respectively). We constructed a lncRNA-mRNA co-expression network with these five lncRNAs and associated 152 mRNAs in MM vs. PM, as shown in Fig 7. We further used a database developed by Terai et al. to analyze the target RNAs for these five lncRNAs. LINC00888 was not  included in the database. The target RNAs for four lncRNAs are sorted by rank (using either MINENERGY or SUMENERGY). We showed the target mRNAs that overlapped in three comprisons (PM/N, MM/N, and MM/PM) of our study and the database developed by Terai et al. in S1 Fig and S11-S14 Tables. The results showed that most lncRNA coexpressed mRNAs in our study overlapped with target mRNAs in the database developed by Terai et al., which indicated the lncRNA-mRNA network in our study was worthwhile and accurate.

Kaplan-Meier analysis
We validated five lncRNAs, RP11-164P12.4, TINCR, U47924.27, RP11-532F12.5 and LINC00 888, in the TCGA cohort containing 221 patients with melanoma. Four lncRNAs were annotated in the TCGA, and the findings showed that low expression of U47924.27 was associated with a shorter overall survival in patients with melanoma. However, a significant association was not detected for the other three lncRNAs (Fig 8).

Discussion
Multiple recent studies have re-annotated microarray data to discover new lncRNA biomarkers and identify therapeutic lncRNA targets [24,41,42]. Zhang et al. investigated lncRNA expression profiles in gliomas by re-annotating the Affymetrix HG-U133 Plus 2.0 array [24]. This method permitted an lncRNA and mRNA expression analysis that was feasible, accurate, and inexpensive. Based on Zhang's method, we explored lncRNA profiles in two existing melanoma patient cohorts from the GEO database as described [43,44]. In our training set (GSE15605), we identified 178, 295, and 48 lncRNAs that were aberrantly expressed in PM vs. N, MM vs. N, and MM vs. PM, respectively. The validation of these results by another independent cohort (GSE7553) highlighted the usefulness of these lncRNA signatures.
Based on the log 2 FC, p-value, and number of associated mRNAs, we established five candidate lncRNAs (TINCR, RP11-164P12.4, RP11-532F12.5, U47924.27, and LINC00888) that might play critical roles in melanoma tumorigenesis and metastasis. Tissue differentiation-inducing non-protein coding RNA (TINCR) was previously reported to be upregulated in esophageal squamous cell carcinoma and might facilitate its development through an association with CLND7 and ANAX1 [48]. Zhang et al. [49] showed that downregulated TINCR promoted proliferation and metastasis in CRC by acting as a potential cancer suppressor gene. Sarkar et al. [46] suggested that a gain of ANCR and loss of TINCR might maintain keratinocyte progenitors in their undifferentiated states, resulting in melanoma tumorigenesis. This association should be investigated further. In our study, TINCR was significantly downregulated in PM vs. N (log 2 FC = -3.031) and downregulated further in MM vs. N (log 2 FC = -5.260). This suggests that TINCR might play a critical regulatory role in melanoma formation and metastasis.
No reports that address the roles of RP11-164P12.4, RP11-532F12.5, U47924.27, and LINC00888, but we can predict their functions by analyzing their associated mRNAs. RP11-532F12.5 was downregulated with a log 2 FC of -4.971 in MM when compared with normal skin tissues. Interestingly, an analysis of the RP11-532F12.5 genomic locus showed that RHOV was its near coding gene (38 kb away). RP11-532F12.5 might regulate the expression of neighboring protein-coding genes and influence the development and progression of melanoma. RHOV, which is co-expressed with RP11-532F12. 5  LINC00888, which is located on chromosome 3, had the most co-expressed mRNAs among the upregulated lncRNAs in the three comparisons. In MM vs. N, LINC00888 (upregulated) co-expressed with 13 mRNAs, 11 downregulated and 2 upregulated. One example was BNC1 (basonuclin 1), a downregulated zinc-finger transcription factor with numerous known targets, and loss of BNC1 increases the metastatic potential of breast cancer [55]. LINC00888 is negatively correlated with BNC1, which implies a potentially active role in melanoma.
In our study, 144 differentially expressed mRNAs overlapped in all three comparison groups. Of these, the following mRNAs should be highlighted: MUCL1, DSC3, SERPINB5, CST6, and SPP1. MUCL1 (mucin-like 1), also known as SBEM, was first identified by Miksicek et al. [56]. Conley et al. [57] observed that HER2 regulated MUCL1 to promote breast cancer cell growth through the FAK/JNK signaling pathway. Valladares-Ayerbes et al. [58] showed that SBEM was detectable in bone marrow micrometastases of breast cancer patients by RT-PCR, which implied its potential utility as a bone marrow micrometastasis marker for breast cancer. These findings revealed a correlation between MUCL1 and cancer progression. DSC3 (Desmocollin 3), a member of the cadherin superfamily, has been associated with lymph node metastasis in oral squamous cell carcinoma [59]. Recently, Pan et al. [60] observed that the loss of DSC3 in prostate cancer predicted a poor prognosis; our study was consistent with these results. DSC3 was downregulated in PM compared to N (log 2 FC = -3.068) and more downregulated in MM compared to N (log 2 FC = -7.705). Maspin (SERPINB5) was identified as a cancer suppressor gene in 1994 by Zou et al. [61]. Numerous studies have demonstrated that maspin loss predicts possible metastasis and poor patient prognosis for prostate [62], cervical [63], and gastric cancers [64]. Our results showed that SERPINB5 expression was significantly decreased in PM vs. N, MM vs. N and MM vs. PM. Moreover, multiple studies have shown that CST6 encodes a secreted protein (Cystatin E/M) that suppresses metastasis. Jin et al. revealed that CST6 inhibited migration, invasion, and bone metastasis in breast cancer [65]. Another study demonstrated that CST6 overexpression decreased metastasis in prostate cancer [66]. Accordingly, our results showed CST6 downregulation in all three comparison groups. Secreted phosphoprotein 1 (SPP1, also called osteopontin) has functions in tumorigenesis, tumor progression, and metastasis in numerous cancers [67][68][69]. Liu et al. [68] showed that shRNA-mediated SPP1 suppressed the proliferation, migration, and invasion of human renal cancer ACHN cells by regulating MMP2 and MMP9. A recent study by Agrawal et al. showed that SPP1 was consistently upregulated in high-grade, invasive, and recurrent urothelial cancer cases [70]. In our study of 146 overlapping mRNAs over three comparisons, SPP1 was the only upregulated mRNA.
To explore the functions of the differentially expressed mRNAs in melanoma, we constructed GO and pathway analyses. Several GO terms from the upregulated mRNAs were related to immune and inflammatory responses, including immune response, inflammatory response, chemotaxis and regulation of immune response. However, several GO terms from the downregulated mRNAs were related to skin development, including epidermis development, keratinocyte, and keratinocyte differentiation.
According to the KEGG pathway analysis, mRNAs were targeted to pathway in cancer, metabolic pathways, melanogenesis, the p53 signaling pathway, and the PPAR signaling pathway, and others. Pathway in cancer was the top enriched term in MM vs. N and MM vs. PM, suggesting that our differentially expressed mRNAs are correlated with cancer. According to the study by Dowling et al. [71], metabolic pathways differed between melanoma in situ and invasive melanoma. Melanogenesis can be a pathogenic factor during melanoma progression. Thus, melanogenesis inhibition is a rational MM therapy approach [72]. Numerous reports have shown that the p53 signaling pathway controls cancer cell apoptosis and growth and is considered a key tumor suppressor in over half of all sporadic human cancers [73][74][75]. Peroxisome proliferator-activated receptors (PPARs) are nuclear hormone receptors with three isoforms: PPARa, PPARc, and PPARb/d. PPAR agonists, such as the thiazolidinediones, might be useful treatment modalities for malignant melanoma and melisma [76].
Furthermore, lncRNA data from the TCGA database were utilized to assess the correlation between lncRNA expression and the overall survival of patients with melanoma. Low U47924.27 expression was associated with a shorter overall survival in this study, indicating that U47924.27 down-regulation might be a potential marker of a poor prognosis.
Our study had several limitations. First, Affymetrix HG-U133 Plus 2.0 arrays included some, but not all, of the possible lncRNAs present. Second, distinct lncRNA expression patterns implied potential relationships to melanoma, but we do not have direct experimental evidence to support this hypothesis. We primarily focused our study on the value of bioinformatics-based analyses for discovering novel or important lncRNAs and mRNAs expressed during melanoma tumorigenesis and metastasis. Finally, the N sample size in our validation dataset (GSE7553) was not large and might have increased the bias in our analysis.

Conclusions
We are the first to report the identification of lncRNA and mRNA expression patterns in melanoma tumorigenesis and metastasis by re-annotating microarray data from the GEO microarray dataset. We identified 15 lncRNAs and 143 mRNAs that are associated with melanoma tumorigenesis and metastasis. Based on the follow-up investigation revealed that a five-lncRNA signature might have a critical role in melanoma tumorigenesis and metastasis. Furthermore, based on the TCGA database, low U47924.27 expression was associated with a shorter overall survival. Our study might provide a candidate reservoir for future investigations of lncRNAs and mRNAs associated with melanoma tumorigenesis and metastasis; more extensive investigations will be performed in the future.