Network-based co-expression analysis for exploring the potential diagnostic biomarkers of metastatic melanoma

Metastatic melanoma is an aggressive skin cancer and is one of the global malignancies with high mortality and morbidity. It is essential to identify and verify diagnostic biomarkers of early metastatic melanoma. Previous studies have systematically assessed protein biomarkers and mRNA-based expression characteristics. However, molecular markers for the early diagnosis of metastatic melanoma have not been identified. To explore potential regulatory targets, we have analyzed the gene microarray expression profiles of malignant melanoma samples by co-expression analysis based on the network approach. The differentially expressed genes (DEGs) were screened by the EdgeR package of R software. A weighted gene co-expression network analysis (WGCNA) was used for the identification of DEGs in the special gene modules and hub genes. Subsequently, a protein-protein interaction network was constructed to extract hub genes associated with gene modules. Finally, twenty-four important hub genes (RASGRP2, IKZF1, CXCR5, LTB, BLK, LINGO3, CCR6, P2RY10, RHOH, JUP, KRT14, PLA2G3, SPRR1A, KRT78, SFN, CLDN4, IL1RN, PKP3, CBLC, KRT16, TMEM79, KLK8, LYPD3 and LYPD5) were treated as valuable factors involved in the immune response and tumor cell development in tumorigenesis. In addition, a transcriptional regulatory network was constructed for these specific modules or hub genes, and a few core transcriptional regulators were found to be mostly associated with our hub genes, including GATA1, STAT1, SP1, and PSG1. In summary, our findings enhance our understanding of the biological process of malignant melanoma metastasis, enabling us to identify specific genes to use for diagnostic and prognostic markers and possibly for targeted therapy.


Introduction
Melanoma, also known as malignant melanoma, is a type of cancer that develops from cells with different degrees of pigmentation, including melanotic and amelanotic melanocytes [1]. The main reason for the onset of melanoma is a low pigment content in skin exposed to a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 were extracted by the edgeR package, which implements an accurate statistical method of multigroup experiments developed by Robinson and Smyth [15] and provides statistical procedures for assessing the differential expression of RNA-Seq data in ChIP-Seq experiments. In this study, a log2 fold change (logFC) > 1 and P < 0.01 were selected as the screening standard for DEGs.

Weighted gene co-expression network analysis
Co-expression networks were constructed with the identified DEGs. We first calculated the Pearson correlations between all genes present in the dataset and obtained the correlation matrix of the whole gene correlation. The power beta was then used to remove the weakly correlated genes while retaining the stronger ones. The process produced a weighted network called 'adjacent matrices' that was converted to a topological overlapping matrices (TOM) network. After constructing the TOM network, we used hierarchical, clustering genes to make the cluster dendrogram with branches corresponding to the gene co-expression modules. The module definition in the dendrogram was executed using the dynamicTreeCut algorithm with the following parameters: deep split = 2, maximum cut height = 0.95, minimal module size = 90 genes [16].

Functional analysis of network module genes
To determine the biological mechanisms of melanoma alteration, the up-regulation and down-regulation of DEGs were used for Gene Ontology (GO) biology process terminology and Kyoto Encyclopedia Gene and Genomes (KEGG) pathway enrichment analysis. All the DEGs annotated in the GO database were used to classify GO functions through the cluster-Profiler package and to understand the distribution of gene functionality at the global level. DEGs for KEGG enrichment analysis were mapped to the KEGG database. After the analysis of GO and KEGG data, significant selection criteria were set as false discovery rate (FDR) and P-value under a level of 0.05.

PPI network construction
The online database resource search tool (STRING) for retrieving interactive genes provides protein-protein interaction information, including the prediction and comparison of intergenomic interactions [17]. Protein-protein interaction (PPI) research can reveal the protein functions of DEGs at the molecular level and can explain the cellular mechanisms by elucidating the interactions of genome-wide proteins [18]. In this study, a PPI network of all identified DEGs was established. The up-regulation and down-regulation of the gene-encoded proteins are represented by nodes of different shapes and colors in the network, and lines represent the interactions between proteins. The widely linked genes (hub genes) in the PPI network are thought to be more closely related to most proteins.

Transcriptional regulation network
In this study, to understand the transcriptional regulation of the hub genes, we used the University of California Santa Cruz (UCSC) database (http://genome.ucsc.edu/) TF-to-target to search for the interactions between a transcription factor (TF) and the hub genes and to assess the effect of a TF on the expression of the central gene. We downloaded the tfbsConsSites and tfbsConsFactors files from UCSC. The tfbsConsSites gives all the TF coordinate information, and the tfbsConsFactors site gives the TF identification information. We can predict the potential target relationships of all TFs based on the refGene file from UCSC and obtain the gene location information for hg18. There are 199 TF and 199,950 TF to the target interactions.

Pre-processing of datasets
To improve the reliability of the analysis, microarray data containing 471 3-level expression data were pretreated and combined into two global datasets of both primary tumor tissue (TP) and melanoma malignant metastatic tumor (MT) tissue obtained from the GDC database. There were no statistically significant differences in the general backgrounds of the 103 cases of TP and 368 MT, including age, gender, race, ethnicity and vital status ( Table 1).

Detection of differentially expressed genes
Principal component analysis (PCA) was considered to be able to explain the variance of the data and reveal the internal structure of the data. In the present study, the distribution of 471 samples was divided into two clusters (Fig 1A). A total of 1,568 DEGs was detected in this study. Compared with primary solid tumor samples, 639 down-regulated genes and 929 upregulated genes were identified in the metastatic melanoma samples (P-value < 0.01, log2 fold change > 1) (Fig 1B).

A weighted gene co-expression network
The dimensionless topological network follows the power law and the soft threshold power (β), which is used to scale the adjacency matrix [19]. In this study, as shown in Fig 2A, six gene modules were partitioned by dynamic tree cutting. Each module is independently verified to the other modules based on the correlation and significance between each two modules ( Fig  2B). After examining their eigengene co-expression, we merged these dynamic modules into three with a threshold of 0.5, which confirmed the reliability of the module divisions (Fig 2C). The module significance of the correlation between the modules is P = 4.9e-232.
As a key component of the detection module, the eigengene is the main indicator of the change in expression profile in the module, and its relevance to the phenotypes can help reveal key biological functions and identify potential biomarkers. By merging eigengenes, we found two main modules, shown in blue and yellow, worthy of further exploration.

GO and KEGG enrichment analysis of blue and yellow module genes
By exploring the biological functions of the blue and yellow modules, we have carried out the analysis of gene ontology (GO) terms and the pathway analysis of the KEGG database. All the significant terms of the annotation systems are represented as color bars to compare the relative significance of the enriched terms, where the length and color saturation of each term are proportional to the gene count/ratio and the adjusted P-value obtained from the enrichment analysis.
In the blue module, the most GO terms in biological processes were associated with immune activities, including lymphocyte and leukocyte differentiation, proliferation and activation as well as T-cell and B-cell activation (Fig 3A, Table 2). The yellow module was enriched in keratinocyte and epidermis cell differentiation, keratinization and cornification, and the establishment of the skin barrier (Fig 3B, Table 2).
In the cellular component categories, the enriched GO terms in the blue module were primarily associated with receptor complexes, cell or granule membranes and immunological synapses (Fig 3A, Table 2). The yellow module genes were significantly related to connexin and keratin as well as gap and cell-cell junctions (Fig 3B, Table 2). In the molecular function categories, the GO terms enriched in the blue module genes included cytokines, chemokines, nucleotides, purine and G-protein coupled nucleotides or peptides or purinergic receptor activities, while the yellow module included endopeptidase, peptidase and serine-type inhibitors or regulator activities (Fig 3A and 3B, Table 2).

Potential diagnostic biomarkers of metastatic melanoma
To explore the intrinsic cascade relationship between gene functions, we constructed a plot of gene functional regulation cascades for important GO terms in biological process (Fig 4). Then, we found that the genetic function between the hierarchical tree relationships was significant. The results showed that the gene functional cascade eventually induced keratinization and cornification (GO:0070268, P adj = 1e-20), peptide cross-linking (GO:0018149, P adj = 1e-20) and regulation of cytosolic calcium ion concentration (GO:0051480, P adj = 1.02e-5).
To screen the significant pathway terms for the blue and yellow modules, the Fisher exact test was used as a statistical method for calculating the significance level (P-value < 0.05). According to the KEGG database, we obtained the pathways involved in all the genes of the blue and yellow modules by the annotating method. The blue module was mainly enriched with genes for cytokine-cytokine receptor interactions, intestinal immune network for IgA production, B-cell and T-cell receptor signaling pathways, NF-kappa B and chemokine signaling pathways (Fig 3C, Table 3). The yellow module was primarily enriched with genes involved in the metabolism of arachidonic acid, alpha-linolenic acid, linoleic acid, retinol and ether lipid as well as the GnRH, VEGF, IL-17 and Ras signaling pathways (Fig 3D, Table 3).

Identification of hub genes and PPI network construction
These two important modules, blue and yellow, contained 200 and 400 genes, respectively. The highly connected nodes in each module group were extracted as the hub genes. In each network, the node size and color depth are proportional to the connect strength (the sum of the degrees in the module).
To compare and integrate our gene co-expression network-protein interaction data, we extracted a high-quality PPI network containing 705 nodes and 12,738 edges from the STRING database (Fig 5A). A node represents a gene in the network, and the edge represents the interaction between genes. Subsequently, we found the subnetworks in the STRING network genome of all the modules and extracted the central genes from the corresponding subnetworks. As shown in Fig 5B,   Since the subnetworks were extracted from a high-quality STRING protein interaction database, the data indicate that these module genes have a tight regulatory relationship. Finally, 50 and 206 hub genes were found in the blue and the yellow module subnetworks, respectively.

Transcriptional regulatory network
To identify genes that are most relevant to immunity and keratinization or cornification, we constructed a transcriptional regulatory network of hub genes in the two modules, blue and yellow, independently. Finally, nine hub-hub genes with degrees greater than 10 were extracted from the blue module: RASGRP2, IKZF1, CXCR5, LTB, BLK, LINGO3, CCR6, P2RY10 and RHOH (Fig 5D, Table 4 and S1 Table). In the yellow transcription network, fifteen hub-hub genes, including JUP, KRT14, PLA2G3, SPRR1A, KRT78, SFN, CLDN4, IL1RN, PKP3, CBLC, KRT16, TMEM79, KLK8, LYPD3 and LYPD5 were highly enriched in transcriptional regulators (Fig 5E, Table 5 and S2 Table). Additionally, several core transcriptional regulators were found to be mostly associated with our hub-hub genes, including

Discussion
Melanoma is one of the most common skin malignancies, and its poor prognosis is associated with a high metastatic capacity [20]. Currently, although several molecular biomarkers have been identified in the progress of melanoma, the mechanism of cell metastasis development has not been completely characterized [21]. The main purpose of this study was to utilize a global approach to construct a co-expression network of DEGs that predicted candidate gene clusters and hub genes involved in the pathogenesis of metastatic melanoma. The WGCNA algorithms are designed to clarify the relationships among genes and maintain consistency among all the samples; therefore, they can be used to determine the phenotype responsible for complex biological mechanisms. The WGCNA unsupervised hierarchical clustering approach avoids potential bias and subjectivity decisions caused by a previous selection of candidate genes associated with malignant melanoma metastasis. In our study, from the 1,568 DEGs, six significant gene modules were merged into blue and yellow modules with a cut threshold of 0.5.
In the blue module, fifty hub genes with a mean logFC 2.32 were extracted from the 200 upregulated genes (S1 Table). These genes are mainly involved in immune response activities, including B-cell, T-cell, lymphocyte, leukocyte and mononuclear cell responses. Active immunotherapy is a promising anti-cancer strategy designed to trigger specific T-cell responses against tumor cells to avoid relapse or progression of the disease. C-X-C chemokine receptor type 5 (CXCR5) is also one of the hub genes enriched in the cytokine-cytokine receptor pathway. CXCR5, known as CD185 or Burkitt lymphoma receptor 1 (BLR1), belongs to the CXC chemokine receptor family of the G protein coupled seven transmembrane receptors. It allows T cells to migrate to the lymph node B cell region [21]. Recently, the overexpression of CXCR5 in breast and prostate cancer patients was found to be highly correlated with lymph node metastasis and infiltration [22,23]. Elevated CXCR5 expression may lead to the survival and migration of abnormal cells in breast cancers deficient in functional p53 [24]. In addition, CXCR5 overexpression initiates a congenital and adaptive immune response through T-cells and B-cells [25]. For the 400 down-regulated genes in the yellow module, 206 hub genes with a mean logFC of -4.02 were identified as significant (S2 Table). These genes have a great impact on skin epidermal development and keratinization. The keratin families, including KRT 2, KRT 5, KRT 6, KRT 14, KRT 16, KRT 37, KRT 75, KRT 78 and KRT 80, have been shown to be significantly down-regulated during keratinocyte and epidermis development and in cytoskeleton intermediate filament termination. The main function of human melanocytes is to produce melanin for the epidermis to prevent damage from UVR [26,27]. The major structural proteins of epithelial cells, the keratins, are components of the intermediate filaments, and are the most heterogeneous of all the intermediate filaments [28,29]. The expression of KRT5 / KRT14 pairs is closely related to the intermediate phenotype of cells that undergo the epithelial-mesenchymal transition (EMT), which has a significant effect on tumor progression and metastasis because Potential diagnostic biomarkers of metastatic melanoma it helps to spread tumor cells throughout the body. Thus, paired KRT5 / KRT14 can be used to identify basal cell metastases [30].
Tyrosine is the precursor of the synthetic pigment melanin, which is a natural barrier to UVR damage but will weaken the effects of radiotherapy or chemotherapy. Therefore, melanin and melanogenesis are considered a double-edged sword [31]. Tyrosine and L-DOPA are recognized as substrates and intermediates of melanin biosynthesis. Studies have shown that increasing the supply of L-tyrosine in melanoma/melanocyte cell lines leads to an increase in melanoma cells [32]. In addition, L-tyrosine and L-DOPA are essential for the eventual formation of melanosomes in the pigment synthesis system of normal and malignant melanocytes as important mediators of cellular processes that affect not only the state of the melanocytes but also the recipient cells they affect [33]. Tyrosinase activity was tyrosine dose-dependent in melanotic and hypomelanotic cell lines, whereas tyrosinase activity continued to increase but slowed when melanization reached a plateau in amelanotic cells [34]. Recently, a study by Sarna et al. [35] showed that the presence of melanin affects the elasticity of cells as well as the movement of malignant melanocytes, which may help to understand the metastatic process of malignant melanomas. Our GO enrichment results revealed a few inhibited cellular component pathways, including gap junction (GO:0005921), fascia adherens (GO:0005916), connexin complex (GO:0005922), cell-cell junction (GO:0005911) and anchored component of membrane (GO:0031225), which may induce the metastatic ability of malignant melanocytes.
Interestingly, recent studies have shown that melanin levels in primary tumors increase with the development of metastasis and are high in deep melanoma cells [36]. Those studies also showed that the pigmentation increased with the local development of advanced melanoma and distant metastasis [37]. Additional evidence suggested that high melanin levels also contribute to hypoxia, dramatically changing the radiosensitivity of tumor cells and thus  -06  39  AP-2alphaA, AP-2gamma, ARNT, CREB1, DAND5, E2F, E2F1, EGR1, EGR2, EGR3,  IRF1, IRF2, MAX, MEF2A, MIA3, MYC, MYCN, MZF1, NF-kappaB1, NF1, NFKB1,  NKX2-5  Potential diagnostic biomarkers of metastatic melanoma affecting the response of melanoma cells to antineoplastic agents [38]. Moreover, other evidence indicated that inhibitors of melanogenesis, including tyrosinase inhibitors, could reverse the immunosuppressive properties and chemoresistance of melanin [39]. However, they also observed the highest level of melanin in pT1-pT2 primary tumor melanomas [36]. Our results show that melanin biosynthesis was inhibited in the tyrosine metabolic pathway (ID: hsa00350) of metastatic melanoma. This inhibition could reflect the mechanism by which melanoma acquires a more malignant phenotype and resistance to therapy. The inhibition of melanogenesis may be an effective therapeutic strategy for melanotic melanomas. In addition, during the initial formation of a tumor, immune suppression promotes tumor growth and progression. Our study found that genes involved in the biological processes of desmosome organization (GO: 0002934) and hemidesmosome assembly (GO: 00031581) in metastatic melanomas, including JUP, PKP3, KRT14, and KRT5, were inhibited. The reduction in the expression of these genes may play a critical role in altering tumor cell motility and, therefore, in increasing tumor cell metastasis. In addition, studies have shown that melanin granules formed in melanosomes, essential organelles that affect melanocyte elasticity, can  Cart-1, DAND5, EGR3, GATA1, GATA2,  GATA3, IL10, MRPL36, MYOD1, MZF1, Meis-1, NF-kappaB1, NF1, NFKB1, PATZ1, PSG1,  PTK7, Pax-2, SP1, SRF, TCF3, TGIF1, TP53, ZNF423,  Potential diagnostic biomarkers of metastatic melanoma inhibit melanoma cell migration [35]. Considering this evidence, we can infer that abnormalities in melanin metabolic activity, melanosomes and desmosomes are fundamental factors affecting the migration of melanoma cells, a finding that may contribute to a better understanding of the mechanism of malignant melanoma metastasis. Tumor cell metastasis involves a series of complex factors that can migrate through the bloodstream or the lymphatic system, or directly to other parts of the body. Transcription factors play important roles in the cell movement that is the basis of metastasis formation. CREB (cAMP response element-binding protein) and its associated proteins (e.g., MMP-2, MCAM and MUC18) play a key role in tumor growth and metastasis of human melanomas [40]. Our results suggest that CREB is a transcriptional regulator of TGM1 (transglutaminase 1),LYPD3 (LY6/PLAUR domain containing 3), TMEM79 (transmembrane protein 79), and claudin 4. The activation transcription factors, which regulate SPRR1A (small proline rich protein 1A), KRT78, claudin 4 and RAB25 (a member of the RAS oncogene family), also play crucial roles in malignant melanoma metastasis [41]. However, their precise functions in metastatic melanoma have not been characterized.

Conclusions
In summary, from 471 gene chip samples, through WGCNA analysis, we identified 1,486 DEGs consisting of two major gene co-expression modules. The results provide important insights into human malignant melanoma immunity and keratinocyte dysfunction. In addition, by constructing a transcriptional regulatory network, we have identified several major transcription factors, including GATA1, STAT1, SP1, and PSG1, that regulate the expression of 24 hub-hub genes. These genes may be the focus of subsequent work. These transcription factors may act as crucial regulators affecting gene expression by post-transcriptional or epigenetic modification. In addition, some of the hub-hub genes, which have not been reported before, showed important interactions in several pathways affecting biological processes, molecular mechanisms and cellular components. Our results potentially contribute to the better understanding of the progression of malignant melanoma as well as the identification of biomarkers for the early diagnosis of melanoma.
Supporting information S1