Establishment of a CALU, AURKA, and MCM2 gene panel for discrimination of metastasis from primary colon and lung cancers

Metastasis is a key step in cancer recurrence and can be stimulated by multiple factors. Calumenin (CALU) is one of these factors and has a direct impact on cancer metastasis, yet its underlying mechanisms have not been completely elucidated. The current study aimed to identify CALU co-expressed genes, their signaling pathways, and their expression status in human cancers. To this end, CALU-associated genes were visualized using the Cytoscape plugin BisoGenet and annotated with the Enrichr web-based application. The list of CALU-related diseases was retrieved using DisGeNET, and cancer datasets were downloaded from The Cancer Genome Atlas (TCGA) and analyzed with the Cufflinks software. ROC curve analysis was used to estimate the diagnostic accuracy of differentially expressed genes (DEGs) in each cancer, and Kaplan–Meier survival analysis was performed to plot the overall survival of patients. The protein level of the signature biomarkers was measured in 40 biopsy specimens and matched adjacent normal tissues collected from colorectal cancer (CRC) and lung cancer patients. Analysis of the CALU co-expressed gene network in TCGA datasets indicated that the network is markedly altered in human colon (COAD) and lung (LUAD) cancers. Diagnostic accuracy estimation of the DEGs showed that a gene panel consisting of CALU, AURKA, and MCM2 was able to distinguish tumors from healthy samples. Cancer cases with abnormal expression of the signature genes had a significantly lower survival rate than other patients. Additionally, comparison of CALU, AURKA, and MCM2 proteins between healthy samples, early tumors, and advanced tumors showed that the level of these proteins increased through the normal–carcinoma transition in both types of cancer. These data indicate that the interactions between CALU, AURKA, and MCM2 have a pivotal role in cancer development and therefore need to be explored in the future.

CALU isoforms have distinct roles in cellular functions. As chaperone proteins, CALU isoforms participate in correct protein folding and maturation [3][4][5]. CALU activity results in the activation of ER-located enzymes such as γ-carboxylase [6,7]. CALU can also modulate cellular stress through regulation of the ER-stress factors GRP78 and phosphorylated PERK, the proapoptotic proteins C/EBP homologous protein (CHOP) and p-JNK, and the antiapoptotic protein Bcl-2 [3]. Elevation of extracellular CALU isoforms ameliorates SEPT1 expression and increases cell cycle modulation [8].
The upregulation of CALU, however, has been reported to be associated with increased cell migration and metastasis in lung and colon cancers. Analysis of human lung cancer patients indicated a higher level of CALU and oxysterol binding protein-like 5 (OSBPL5) in metastatic cancer cells than in non-metastatic samples [9]. Interestingly, cancer cells were shown to increase CALU secretion from cancer-associated fibroblasts (CAFs) via a TGF-β-induced miR-21 expression axis, which promoted tumor proliferation and metastasis [10]. A high-throughput proteomic study of tumor biopsies obtained from colon cancer patients detected the co-progressive expression of CALU and biglycan as new biomarkers of colon tumors [11]. This finding was in line with a previous study of potential biomarkers of colorectal carcinogenesis, which reported CALU as one of the metastasis-related proteins upregulated from adenoma to carcinoma [12]. These findings led us to identify the CALU-associated gene network with bioinformatics tools. Moreover, we developed a signature gene panel consisting of CALU and other differentially expressed genes (DEGs) between healthy and cancer datasets retrieved from The Cancer Genome Atlas (TCGA), and estimated the diagnostic and prognostic performance of this panel in colon and lung cancer patients.

Network visualization and analysis
Network illustration was carried out using Cytoscape version 3.7.1 [13]. The Cytoscape plugin BisoGenet was applied to visualize CALU interactions with neighboring genes [14]. The topology of the network was determined with the Cytoscape plugin CytoHubba version 0.1 [15]. Gene Ontology (GO) biological process and KEGG pathway annotations were analyzed with the Enrichr web-based application (http://amp.pharm.mssm.edu/Enrichr/). Results were considered statistically significant at P < 0.05.

Identification of calumenin related diseases
The list of CALU-related diseases was retrieved using the Cytoscape plugin DisGeNET, a bioinformatics platform that integrates gene data for various human disorders [16]. The following data sources were queried: (1) Online Mendelian Inheritance in Man (OMIM), (2) Genetic Association Database (GAD) [17], (3) Mouse Genome Database (MGD) [18], (4) Comparative Toxicogenomics Database (CTD) [19], (5) PubMed, and (6) UniProt. The associations between CALU and diseases were ranked according to the number of sources, the organism type, and the number of supporting publications.

Sample size estimation
Sample size was determined using the chi-square test with α error = 0.05 and β error = 0.2 [20]. Analysis of the gene panel expression in the colon cancer training cohort resulted in a proportion of 0.41 in healthy samples and 0.5 in CRC tumors. The corresponding values for the control group and lung tumors in the lung cancer training set were 0.57 and 0.71, respectively. The estimated ratio between healthy controls and cancer cases was set at 1:1.2 for colon cancer and 1:1.25 for lung cancer. In addition, because of the presence of one confounder (the gene panel), the sample size of the validation set was increased by about 10% in both types of cancer.

Data mining and processing
Illumina HiSeq 2000 RNA-seq datasets (level 3 per-gene RNA-seq v2 expression data) were retrieved from TCGA (https://cancergenome.nih.gov). The dbGaP accession number of the specific version of the TCGA datasets is phs000178.v10.p8. The read counts of these datasets were estimated with the RSEM package (RNA-seq by expectation maximization). To identify miRNAs, small reads (< 75 bp) were analyzed with the RNA families database (Rfam, version 13.0, http://rfam.xfam.org) to remove known small noncoding RNAs, including rRNAs, tRNAs, snRNAs, and snoRNAs. The filtered sequences were subsequently analyzed with MIR-EAP (https://sourceforge.net/projects/MIREAP, 2018), and miRNAs were detected based on their canonical hairpin structure. To identify protein-coding genes, reads > 100 bp were analyzed with the Coding-Noncoding-Index (CNI, version 2), which is based on the 64 nucleotide triplets [21], and additionally examined with the Coding Potential Calculator 2 (CPC-2) web server to distinguish noncoding transcripts from mRNAs based on the length and quality of the open reading frame of protein-coding transcripts [22]. Next, we analyzed the retrieved protein sequences with Pfam-scan (version 1.3) to evaluate their annotation [23], and also used the phylogenetic codon substitution frequency (phyloCSF, release 20121028) to distinguish coding transcripts from noncoding alignments according to the evolutionary conservation of amino acids within known protein families [24]. To identify human transcription factors and kinases, the list of coding proteins was uploaded into Ensembl BioMart version 79 (https://www.ensembl.org/Biomart) and analyzed with the human genome assembly GRCh38.p12.
The clinical cohort consisted of 40 CRC and 40 lung cancer patients who underwent cancer surgery at Shohadaye Tajrish Hospital, Tehran, Iran, during 2012-2018. Each participant received and completed an informed consent form before sampling. The median age of the subjects was 65 years (range: 60-70 years). As a healthy control, a matched adjacent normal sample was also obtained from each patient. Samples were frozen in liquid nitrogen and kept at -80 °C. Tumor assessment was performed according to the American Joint Committee on Cancer (AJCC) guidelines [25].

Western blot analysis
Tissue samples were lysed in a lysis buffer consisting of sodium dodecyl sulfate (SDS, 0.1%), Triton X-100 (1%), Tris-HCl (100 mM, pH 7.4), sodium deoxycholate (0.5%), NaCl (150 mM), the protease inhibitor phenylmethylsulfonyl fluoride (PMSF, 1 mM), ethylenediaminetetraacetic acid (5 mM), and glycerol (10%). After heating at 95 °C for 5 min, the mixture was cooled on ice and centrifuged for 5 min at 14,000 rpm to collect the supernatant. Protein concentration was determined by Bradford's method [26]. Electrophoresis was carried out on a 12% SDS-polyacrylamide gel, and the separated proteins were then electrotransferred onto a nitrocellulose membrane at a constant voltage (20 V) for 1 h at 4 °C. After transfer, the membranes were washed with Tris-buffered saline with Tween-20 (TBST-10X) at room temperature for 5 min, incubated for 1 h with blocking buffer consisting of PBS (1X), Tween-20 (0.1%), and skim milk (5%), and washed again with TBS buffer. Primary antibody incubation was performed overnight at 4 °C with a 1:1000 dilution of the following antibodies: anti-CALU (4C6) Mouse mAb (#11991S), anti-Aurora A (D3E4Q) Rabbit mAb (#14475S), anti-Phospho-Aurora A (Thr288) (C39D8) Rabbit mAb (#3079S), anti-MCM2 Antibody (#4007S), anti-Phospho-MCM2 (Ser139) (D1Z8X) XP Rabbit mAb (#12958S), and anti-β-Actin Antibody (#4967S) as a loading control. The membranes were then washed three times with TBST buffer and incubated for 1 h at room temperature with a 1:2000 dilution of the Anti-Mouse IgG, HRP-linked Antibody (#7076S, for CALU) or a 1:2000 dilution of the Anti-Rabbit IgG, HRP-linked Antibody (#7074S, for AURKA and MCM2). All antibodies were purchased from Cell Signaling Inc., USA. The blots were then washed again with TBST buffer, and the immunoreactive proteins were detected using the SignalFire ECL Reagent (#6883S, Cell Signaling Inc., USA).
Protein bands were analyzed using a densitometry scanner and the ImageJ software (NIH, Bethesda, USA). The relative density of protein bands was determined by measuring the integrated density of each band after subtraction of the X-ray film background. For AURKA and MCM2, the density value of each band was first normalized to β-Actin, and the phosphorylated/total ratio was then calculated from the normalized values.
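The normalization described above can be sketched as follows. This is a minimal illustration rather than the exact procedure used in the study: it assumes that the phosphorylated and total blots each have their own β-Actin band, and the function names and density values are hypothetical.

```python
def normalized_ratio(phospho, total, actin_phospho, actin_total):
    """Phospho/total ratio after normalizing each band to its beta-actin band.

    All inputs are background-subtracted integrated densities (arbitrary units).
    Assumption: each blot (phospho and total) has its own loading control.
    """
    if min(total, actin_phospho, actin_total) <= 0:
        raise ValueError("total and beta-actin densities must be positive")
    phospho_norm = phospho / actin_phospho   # normalize phospho band
    total_norm = total / actin_total         # normalize total band
    return phospho_norm / total_norm


def fold_change(tumor_ratio, normal_ratio):
    """Fold change of a tumor sample relative to its matched normal tissue."""
    return tumor_ratio / normal_ratio


# Hypothetical densities for one tumor/normal pair
tumor = normalized_ratio(phospho=300, total=200, actin_phospho=100, actin_total=100)
normal = normalized_ratio(phospho=50, total=100, actin_phospho=100, actin_total=100)
print(fold_change(tumor, normal))
```

When both blots share the same loading control, the two β-Actin terms cancel and the result reduces to the raw phospho/total ratio.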

Statistical analysis
The chi-square test and the unpaired unequal-variance t-test were performed to compare the sex and age variables among the studied groups. The Mann-Whitney U test was used to compare the levels of the genes between groups. Receiver operating characteristic (ROC) curve analysis was performed to evaluate the diagnostic power of the gene panel. The area under the ROC curve (AUC) was used as the accuracy criterion for the examination of the candidate gene panel. Overall survival (OS) of patients was plotted with Kaplan-Meier survival analysis. All data are represented as the mean ± S.D. (standard deviation). Statistical significance was defined as P < 0.05 (*). All statistical analyses were performed with IBM SPSS Statistics software version 22 (IBM, USA).
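The AUC and the Mann-Whitney U statistic are closely related: the AUC equals U divided by the product of the two group sizes, that is, the probability that a randomly chosen case scores higher than a randomly chosen control (ties counted as one half). The sketch below illustrates this relation in plain Python; the analyses in this study were run in SPSS, and the function name and scores here are hypothetical.

```python
def roc_auc(controls, cases):
    """AUC as the normalized Mann-Whitney U statistic.

    Counts, over all case/control pairs, how often the case score exceeds
    the control score; a tie contributes 0.5.
    """
    if not controls or not cases:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical panel scores
print(roc_auc(controls=[1, 2, 3], cases=[2, 4, 5]))
```

An AUC of 0.5 corresponds to no discrimination between the groups and an AUC of 1.0 to perfect separation.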

Identification of calumenin related diseases
To identify calumenin-related diseases, the DisGeNET database was queried with the keyword CALU. According to the results, the expression of CALU isoform 1, which contains the canonical sequence (UniProt identifier: O43852-1), was increased in three phenotypes (mental process, neoplastic process, and pathologic function) and ten human diseases (Table 1). Among the identified diseases, seven were categorized as Neoplastic Process, and two others, Arteriosclerosis and Cystic Fibrosis, were assigned to the Disease or Syndrome group. The diseases in the neoplastic category included colon and lung cancers.

Visualization of the calumenin associated genes network
The calumenin-associated gene network was illustrated with the Cytoscape software. The molecular interactions between CALU and other genes were analyzed with the Cytoscape plugin BisoGenet, and the final network consisted of 127 nodes and 739 interactions. To create a subnetwork consisting of the key nodes, we used the CytoHubba algorithms and selected the hub objects (degree value > 5) within the primary network. In cellular biology, a network represents a complex signaling mechanism that carries the information essential for a specific physiological action. The analysis of a signaling system therefore requires graph-theoretic parameters, such as centrality and motifs, to describe the organization of the network. As in other directed networks, the nodes of the current system (Fig 1) were categorized based on their incoming and outgoing edges. Degree is the simplest centrality measure and reflects the importance of a node based on its interactions (edges) with other nodes in a directed network. Under this classification, the nodes with the highest degree (top 1-5%) are termed hubs and have an important role in the network. Enrichment analysis of these nodes, in turn, explains the physiological aspects of the target network. For example, the node representing the CALU protein in Fig 1 has an out-degree of 12 and an in-degree of 14, indicating the important role of this protein in the complex network formed by known cancer-related signaling factors such as AURKA, EEF2K, EGFR, MCM2, and RAF1, which mediate cancer cell proliferation and survival.
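The degree-based hub selection described above can be illustrated with a short sketch. It is not the CytoHubba implementation; the edge list is a toy example that does not reproduce the interactions of Fig 1, and the function names are hypothetical.

```python
from collections import defaultdict


def degree_table(edges):
    """Return {node: (in_degree, out_degree)} for a directed edge list."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for source, target in edges:
        out_deg[source] += 1
        in_deg[target] += 1
    nodes = set(in_deg) | set(out_deg)
    return {node: (in_deg[node], out_deg[node]) for node in nodes}


def hubs(edges, min_degree=5):
    """Nodes whose total degree (in + out) exceeds min_degree, sorted by name."""
    table = degree_table(edges)
    return sorted(n for n, (i, o) in table.items() if i + o > min_degree)


# Toy directed edges (illustrative only, not the published network)
toy_edges = [
    ("CALU", "AURKA"), ("CALU", "MCM2"), ("CALU", "EGFR"),
    ("EGFR", "CALU"), ("RAF1", "CALU"), ("MCM2", "CALU"),
]
print(degree_table(toy_edges)["CALU"])  # (in-degree, out-degree)
print(hubs(toy_edges))
```

With the cutoff used in this study (degree > 5), a node such as CALU with an in-degree of 14 and an out-degree of 12 would be retained as a hub.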

Analysis of calumenin co-expressed genes perturbation in TCGA
The studied population consisted of 239 TCGA colon adenocarcinoma profiles (TCGA-COAD, 107 normal and 132 CRCs) and 245 lung adenocarcinoma profiles (TCGA-LUAD, 108 normal and 137 tumors). Each dataset was divided into a training set and a validation set. For TCGA-COAD, the training set consisted of 47 healthy controls and 60 CRCs, and the validation set consisted of 60 normal datasets and 72 CRCs. The training set of TCGA-LUAD consisted of 37 healthy controls and 50 lung cancer datasets, and the validation set consisted of 71 normal and 87 lung cancer datasets. Descriptive details of the colon and lung cancer patients are shown in Table 2 and Table 3, respectively. Following data processing, the DEGs of each cancer were analyzed using Cufflinks and visualized with a volcano plot. The complete list of upregulated and downregulated genes in the TCGA-COAD and TCGA-LUAD datasets is presented in S1 Table. As already shown, the expression level of CALU was significantly upregulated in both colon adenocarcinoma and lung adenocarcinoma samples in comparison to the normal groups (P < 0.001). Volcano analysis of the TCGA-COAD dataset indicated PRPH, PARK2, and CLGN as significantly downregulated genes, and ASPM, AURKA, GTSE1, NPM1, and MCM2 as the most significantly upregulated genes (P < 0.001, Fig 2A). Meanwhile, analysis of the TCGA-LUAD dataset identified NOS2, SH3GLB1, CFTR, IQCB1, TCTN1, TPRKB, PLIN3, CTNNAL1, and TMEM165 as the most downregulated genes, along with MCM2, RDX, P4HB, AURKA, LIMK2, PDLIM5, and CEP76 as significantly upregulated genes (P < 0.001, Fig 3A).
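A volcano plot places each gene at its log2-fold change (x-axis) and -log10 P value (y-axis). The sketch below shows how a gene could be labeled from these two quantities. It is an illustration only: the P < 0.001 threshold follows the text, whereas the fold-change cutoff, the pseudocount, the function name, and the expression values are assumptions, and the differential expression testing itself was done with Cufflinks.

```python
import math


def classify_gene(mean_tumor, mean_normal, p_value,
                  lfc_cutoff=1.0, p_cutoff=0.001, pseudocount=1.0):
    """Return (log2 fold change, -log10 P, label) for one gene.

    lfc_cutoff and pseudocount are illustrative assumptions; p_value must be > 0.
    """
    log2_fc = math.log2((mean_tumor + pseudocount) / (mean_normal + pseudocount))
    significant = p_value < p_cutoff
    if significant and log2_fc >= lfc_cutoff:
        label = "up"
    elif significant and log2_fc <= -lfc_cutoff:
        label = "down"
    else:
        label = "ns"  # not significant
    return log2_fc, -math.log10(p_value), label


# Hypothetical mean expression values for a single gene
print(classify_gene(mean_tumor=7.0, mean_normal=1.0, p_value=1e-4))
```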

Gene set enrichment analysis
Gene set enrichment analysis (GSEA) is a statistical evaluation performed to identify the most significant biological terms in a given gene set, including signaling pathways, molecular functions, and diseases. Here, we used the Enrichr web-based application to display the enrichment results for the CALU co-expressed gene network. GO enrichment analysis of the TCGA-COAD dataset showed that the upregulated genes were over-represented in the GO biological processes ribosomal small subunit export from the nucleus (GO:0000056) and regulation of centrosome cycle (GO:0046605) (Fig 2B), while pathway enrichment analysis of this dataset indicated KEGG's DNA replication as a significant term (Fig 2C).
Enrichment analysis of the TCGA-COAD downregulated genes showed that the genes were over-represented in the GO biological processes positive regulation of protein localization to endosome (GO:1905668) and positive regulation of protein localization to early endosome (GO:1902966) (Fig 2B), while pathway enrichment analysis indicated KEGG's Endometrial cancer and Amyotrophic lateral sclerosis (ALS) as significant terms (Fig 2C).
GO enrichment analysis of the TCGA-LUAD dataset showed that the upregulated genes were over-represented in the GO biological processes mitotic cell cycle phase transition (GO:0044772) and regulation of organelle assembly (GO:1902115) (Fig 3B), while pathway enrichment analysis of this dataset indicated KEGG's Regulation of actin cytoskeleton as a significant term (Fig 3C).
Enrichment analysis of the TCGA-LUAD downregulated genes showed that the genes were over-represented in the GO biological processes positive regulation of insulin secretion involved in cellular response to glucose stimulus (GO:0035774), positive regulation of autophagosome assembly (GO:2000786), protein localization to vacuole (GO:0072665), and positive regulation of vacuole organization (GO:0044090) (Fig 3B), while pathway enrichment analysis indicated KEGG's Arginine and proline metabolism and Arginine biosynthesis as significant terms (Fig 3C).
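Over-representation of a term in a gene list is commonly assessed with a hypergeometric (one-sided Fisher's exact) test, which asks how likely it is to observe at least the given overlap by chance. The following sketch shows that calculation under the assumption that such a test underlies the reported P values; the function name and the counts are hypothetical, and it does not reproduce Enrichr's additional ranking scores.

```python
from math import comb


def enrichment_p_value(overlap, term_size, list_size, background_size):
    """Upper-tail hypergeometric probability P(X >= overlap).

    term_size: genes annotated to the term in the background
    list_size: genes in the query list (for example, upregulated DEGs)
    background_size: all genes considered
    """
    max_overlap = min(term_size, list_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 2 of 3 query genes fall in a 4-gene term, 10 genes overall
print(enrichment_p_value(overlap=2, term_size=4, list_size=3, background_size=10))
```

In practice the resulting P values are usually corrected for the number of terms tested.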

Estimation of DEGs diagnostic accuracy in cancer groups
Next, we performed a ROC curve analysis to estimate the diagnostic accuracy of DEGs in each cancer (Table 4).

Establishment of the predictive panel
To determine the risk of being diagnosed with cancer, we built a logistic model between normal controls and cancer datasets in the training set (n = 107 for COAD and n = 87 for LUAD). Next, the diagnostic performance of the established panel was estimated using the ROC data. Analysis of CRC samples versus the normal group achieved an AUC of 0.8369 (95% CI = 0.7551-0.9187, 78.33% sensitivity and 80.85% specificity, Fig 4A). As for early CRC (I-II TNM stages), ...
[Figure caption fragment: expression differences are shown as log2-fold changes in the volcano plot; red points indicate significantly upregulated genes, and blue points indicate downregulated genes.]
As for LUAD, the corresponding AUC for all stages (I-IV TNM stages) compared to the normal group was 0.8500 (95% CI = 0.7725-0.9275; sensitivity = 80% and specificity = 82.98%, Fig ... Fig 5E). The AUC of advanced lung cancer (III-IV TNM stages) versus the normal group was 0.8876 (95% CI = 0.8220-0.9533; sensitivity = 78.33% and specificity = 97.87%, Fig 5F). These results indicated that neither the type of cancer nor its status affected the diagnostic performance of the panel.
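The way a logistic panel yields sensitivity and specificity at a given cutoff can be sketched as follows. This is an illustration under stated assumptions: the coefficients, intercept, cutoff, and probabilities are hypothetical and are not the fitted values of the model reported here.

```python
import math


def panel_probability(expression, coefficients, intercept):
    """Logistic-model probability of cancer from the panel genes."""
    z = intercept + sum(coefficients[g] * expression[g] for g in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


def sensitivity_specificity(probabilities, labels, cutoff=0.5):
    """labels: 1 = cancer, 0 = normal. Returns (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for prob, label in zip(probabilities, labels):
        predicted_cancer = prob >= cutoff
        if label == 1:
            tp += predicted_cancer
            fn += not predicted_cancer
        else:
            fp += predicted_cancer
            tn += not predicted_cancer
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical coefficients and expression values
coef = {"CALU": 0.8, "AURKA": 0.5, "MCM2": 0.6}
sample = {"CALU": 1.2, "AURKA": 0.4, "MCM2": 0.9}
print(panel_probability(sample, coef, intercept=-1.0))
print(sensitivity_specificity([0.9, 0.8, 0.4, 0.3, 0.6, 0.1], [1, 1, 1, 0, 0, 0]))
```

As a consistency check on the reported CRC figures, 47 correctly classified tumors out of 60 would give a sensitivity of 78.33%, and 38 correctly classified controls out of 47 would give a specificity of 80.85%.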

Correlation between the abnormal levels of candidate genes and overall survival
Kaplan-Meier survival analysis was performed to plot the OS of colon and lung cancer patients with abnormal levels of CALU, AURKA, and MCM2, both individually and as a panel. There was no significant correlation between the level of individual DEGs and the OS of patients with colon and lung cancers in the training sets (Fig 6A-6C and Fig 7A-7C). On the other hand, analysis of the DEGs in the validation sets indicated that an abnormal level of CALU was associated with the OS of lung cancer patients (P < 0.01, Fig 7E).
OS analysis of CALU, AURKA, and MCM2 as a panel indicated a significant correlation between abnormal expression of the DEGs and the overall survival of CRC patients in both the training set (P < 0.05, Fig 6D) and the validation set (P < 0.01, Fig 6H), as well as the overall survival of lung cancer patients in the training set (P < 0.01, Fig 7D) and the validation set (P < 0.001, Fig 7H).
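The Kaplan-Meier estimator behind these curves multiplies, at each observed death time, the fraction of at-risk patients who survive that time; censored patients leave the risk set without lowering the curve. The sketch below is a minimal pure-Python illustration with hypothetical follow-up data, not the SPSS procedure used in this study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times: follow-up time per patient
    events: 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Collect every patient whose follow-up ends at this time
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up (months) with one censored patient at t = 2
print(kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1]))
```

Comparing two such curves, for example patients with abnormal versus normal panel expression, is usually done with a log-rank test.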

PLOS ONE
Developing a Gene panel for Detecting of early tumors from advanced colon and lung cancers

Analysis of colon and lung cancer biopsy specimens indicated a gradual increase in CALU, AURKA, and MCM2 protein levels through the normal-carcinoma transition
To investigate the alteration of the candidate signature biomarkers at the protein level during cancer progression, the protein level of CALU, AURKA, and MCM2 was compared between 40 biopsy specimens and matched adjacent healthy tissues collected from CRC and lung cancer patients. To eliminate the possible impact of the TNM stage on the results, each cohort was designed to have an equal number of early (I-II TNM stage, n = 20) and advanced (III-IV TNM stage, n = 20) cancer patients. Additional details are shown in Table 2 and Table 3.
Western blot analysis of CALU, AURKA, and MCM2 proteins showed that the level of all of these candidate biomarkers was significantly increased through the normal-carcinoma transition in both types of cancer (Fig 8). Comparative analysis of CALU between normal samples and all tumors indicated a 3.2-fold and 3.63-fold increase in protein level in the CRC group and lung cancer group, respectively (P < 0.001). Analysis of the CALU level between early and advanced tumor stages of the CRC and lung cancer groups showed a 1.66-fold (P < 0.001) and 1.31-fold (P < 0.01) elevation, respectively.
To investigate the alteration of AURKA and MCM2 proteins, the ratio of their active form to their total level (AURKA phosphorylated at Thr-288/total AURKA, and MCM2 phosphorylated at Ser-139/total MCM2) was compared between target groups. Comparative analysis of p-AURKA/AURKA between normal samples and all tumors indicated a 3.43-fold and 3.12-fold increase in the CRC group and lung cancer group, respectively (P < 0.001). Analysis of the p-AURKA/AURKA ratio between early and advanced tumor stages of the CRC and lung cancer groups showed a 1.88-fold (P < 0.001) and 1.14-fold (P = 0.021) elevation, respectively.
Similar results were obtained from the analysis of the p-MCM2/MCM2 level between healthy samples and all tumors in the CRC group (3.48-fold increase, P < 0.001) and the lung group (3.56-fold increase, P < 0.00). Analysis of the p-MCM2/MCM2 ratio between early and advanced tumor stages of the CRC and lung cancer groups showed a 1.27-fold (P < 0.01) and 1.44-fold (P < 0.001) elevation, respectively.

Discussion
Tumor metastasis is defined as cancer cell migration from the primary site to distant locations and plays a critical role in cancer recurrence and mortality [27,28]. As most current anticancer therapeutic efforts are focused on metastasis prevention, the identification of metastasis signaling mediators is highly valuable.
Extracellular matrix (ECM) proteins are among the main regulators of cancer metastasis. The ECM family consists of fiber molecules and proteins such as calumenin, which interact with the extracellular focal adhesion complex (EFAC) and the intracellular cytoskeleton, and thereby improve the structural formation of cells [29]. Meanwhile, a positive impact of calumenin on cancer development has also been reported [2,8,[30][31][32]. Given the vital role of calumenin in tumor metastasis, it could be considered a potential target for anticancer therapy. However, considering the lack of knowledge about the association of calumenin with other signaling mediators, we visualized and analyzed the calumenin co-expressed gene network in human cancers.
Analysis of the CALU co-expressed gene network in TCGA datasets indicated that the network is markedly altered in human cancers. Diagnostic accuracy analysis of these DEGs in the TCGA datasets COAD and LUAD showed that a gene panel consisting of CALU, AURKA, and MCM2 could successfully distinguish tumors from healthy samples. Additionally, cancer cases with abnormal expression of these genes had a significantly lower survival rate than other patients.
AURKA is a kinase involved in destabilizing the microtubules of cancer cells by phosphorylating the tumor suppressor RAS-association domain family 1, isoform A (RASSF1A), and it subsequently increases abnormal cancer cell proliferation by disrupting the mitotic phase [33]. AURKA overexpression leads to activation of the NF-κB signaling pathway via negative regulation of IκBα and induces tumor invasion [34]. On the other hand, AURKA plays a pivotal role in tumor survival by increasing the levels of the anti-apoptotic factors Bcl-2 and MCL-1 [35,36], blocking the activity of the apoptotic mediators Bax, Bim, and PUMA [35,37,38], and disrupting the mammalian target of rapamycin (mTOR) autophagy pathway [38]. AURKA upregulation also suppresses the release of cytochrome C into the cytoplasm and prevents Apaf-1/cytochrome C apoptosome formation [39].
MCM2, the other member of our panel, belongs to the minichromosome maintenance protein complex family and has a critical impact on the initiation of DNA replication and cell proliferation [40]. MCM2 is activated in response to growth signaling pathways and stimulates DNA replication, and is therefore highly expressed in actively proliferating cells, including cancer cells [41,42]. Accordingly, an abnormal level of MCM2 has been widely reported to be associated with the development of human cancers, including breast [43], colon [44], gastric [45], rectal [46], and skin cancer [47].
To our knowledge, no investigation has examined the signaling interaction between CALU, AURKA, and MCM2 proteins. However, a survey of the published literature helped us to visualize the possible correlation between these candidate genes (Fig 9). As previously mentioned, the expression level of CALU is reported to be increased in advanced colon and lung tumors [9,11]. Like other ER-resident chaperones, the expression of CALU can be triggered in response to ER stress and the unfolded protein response (UPR) caused by physiological and pathological stimuli such as metabolic disturbance, DNA damage, accumulation of misfolded proteins, calcium depletion, and oxidative stress, which are directly associated with cancer initiation and development [48]. A recent study on neonatal rat cardiomyocytes showed that the modulatory impact of CALU on ER stress is mediated through suppression of the ER-initiated apoptosis markers CHOP and p-JNK, alleviation of the UPR-triggered pro-apoptotic PERK-ATF4 axis, and an eventual increase in the antiapoptotic protein Bcl-2 [49]. As a known UPR transcription factor gene, ATF4 is overexpressed under glucose deprivation and upregulates ATF3 expression [50,51]. ATF3 has been shown to suppress the expression of the mitotic kinase gene AURKA. The other negative effect of ER stress on AURKA expression can be mediated by ATF4, through suppression of the FOXO1/AURKA expression axis [52]. AURKA is reported to remove the inhibitory impact of the p53/p21 axis on the activation of cyclin-dependent kinase 2 (CDK2), cell cycle transition, and tumorigenesis by promoting the HDM2-induced ubiquitination and inhibition of p53 [53]. This kinase also has a positive role in Akt activation, which suppresses the p27 protein, another negative regulator of CDK2 [53]. As a tumor suppressor, p27 is also reported to block the G1/S transition by suppressing MCM2 phosphorylation [54].
The suggested mechanism is illustrated in Fig 9. Although our analysis of the clinical samples from CRC and lung cancer patients supports this possible correlation, the proposed mechanism remains theoretical and needs experimental validation to elucidate the exact underlying interactions.
Currently, there is a lack of a low-cost, noninvasive method with high sensitivity and specificity for cancer screening. For example, according to the American Cancer Society (ACS) screening guidelines for early detection of colon cancer [55], both men and women beginning at age 50 should follow one of the following testing schedules: (1) colonoscopy (every ten years); (2) flexible sigmoidoscopy (every five years); (3) double-contrast barium enema (BE) (every five years); or (4) CT colonography (every five years). Despite the high accuracy of these techniques, all current screening methods have their own disadvantages. For example, colonoscopy, the gold standard for CRC screening, requires unpleasant bowel preparation and is often accompanied by pain, discomfort, and embarrassment [56][57][58]. Therefore, many subjects delay their colonoscopy or avoid it entirely. Although other less invasive screening methods, such as the fecal occult blood test (FOBT) and carcinoembryonic antigen (CEA) blood measurement, are now used as alternatives, their low sensitivity and specificity limit their use for cancer diagnosis [59][60][61]. Hence, new biomarkers with more accurate diagnostic capability are needed for cancer screening in routine health checkups.
The current study suggests a multigene panel for cancer screening. Although a wide range of biomarker studies have been dedicated to the examination of a single gene, this strategy is not always efficient. Some clinical features, such as the personal and familial history of patients, should be considered, especially in the case of hereditary diseases like cancer [62]. In addition, there is no guarantee that repeating a test for an individual gene will reproduce the initial results. Therefore, the application of a multigene panel for cancer diagnosis can save time and cost and may provide more accurate results. Besides that, it can help elucidate the underlying interactions between candidate genes involved in malignancy.
In summary, analysis of the co-expressed gene network of CALU, one of the key regulators of tumor metastasis, introduced a gene panel consisting of CALU, AURKA, and MCM2 with high discriminative accuracy between healthy and cancer (colon and lung) populations. In addition, abnormal levels of these genes were inversely correlated with the patients' survival rate. These data indicate that the interactions between CALU, AURKA, and MCM2 have a pivotal role in cancer development and therefore need to be explored in the future.

Supporting information
S1 Table. The expression level analysis of upregulated and downregulated genes between tumors and healthy samples in TCGA-COAD (n = 107), and in TCGA-LUAD (n = 87) datasets in the training set. (XLS)