Identification of potential target genes and crucial pathways in small cell lung cancer based on bioinformatic strategy and human samples

Small cell lung cancer (SCLC) is a carcinoma of the lungs characterized by strong invasiveness, poor prognosis and resistance to multiple chemotherapeutic drugs, and it poses severe challenges for the effective treatment of lung cancer. Identifying genes related to the development and prognosis of SCLC and uncovering their underlying molecular mechanisms are therefore urgent problems. This study aimed to explore potential pathogenic and prognostic genes and key pathways of SCLC through bioinformatic analysis of public datasets. First, 117 SCLC samples and 51 normal lung samples from three gene expression datasets were collected and analyzed, and 102 up-regulated and 106 down-regulated differentially expressed genes (DEGs) were identified. Functional annotation and pathway enrichment analyses of the DEGs were then performed with FunRich. The protein-protein interaction (PPI) network of the DEGs was constructed through the STRING website and visualized with Cytoscape. Finally, the expression levels of eight hub genes were confirmed in the Oncomine database and in human samples from SCLC patients. CDC20, BUB1, TOP2A, RRM2, CCNA2, UBE2C, MAD2L1, and BUB1B were upregulated in SCLC tissues compared with paired adjacent non-cancerous tissues. These results suggest that the eight hub genes might serve as new biomarkers for the prognosis of SCLC or guide individualized medication for SCLC therapy.


Introduction
Lung cancer is a malignant tumor with one of the highest risks of morbidity and mortality, which makes it a notable healthcare issue. In 2018, about 2.09 million new cases of lung cancer were diagnosed worldwide (11.6% of all newly diagnosed cancers), and about 1.76 million deaths were recorded (18.4% of all sites) [1]. Small cell lung cancer (SCLC) accounts for approximately 20% of lung cancer patients and is a neuroendocrine tumor [2]. Unlike other histopathological subtypes of lung cancer, SCLC is accompanied by rapid clinical progression. Almost all patients with SCLC have extensive metastasis at diagnosis, and the prognosis is poor: the 5-year relative overall survival (OS) rate is not more than 6%. In clinical practice, conventional treatments for SCLC include chemotherapy, radiotherapy, surgery and immunotherapy. Chemotherapy is currently the main treatment for SCLC, but problems such as drug resistance and frequent relapse remain [3,4]. Over the past few decades, the survival rate of SCLC patients has not evidently improved, and no molecular-targeted drug has been shown to significantly prolong patient survival [5]. In recent years, studies have reported that many molecular mechanisms are altered in SCLC, including induced expression of oncogenes, such as MYC [6] and FGFR1 [7], and deletion of tumor-suppressor genes, such as TP53, PTEN, RB, and FHIT [8]. Changes in these genes and related signaling pathways, such as mutation, methylation or expression changes of PIK3CA, PTEN, AKT and other genes in the PI3K/AKT/mTOR pathway, promote cell proliferation and inhibit apoptosis, leading to early-stage metastasis of tumor cells [9]. Because of the complex biological characteristics and poor prognosis of SCLC, the key biomarkers and specific targets involved in its occurrence and development are not well known.
Therefore, it is necessary to explore more genetic information to screen out promising biomarkers for early-stage diagnosis and precision treatment of SCLC. In recent years, gene chip technology and bioinformatics analysis have been widely used to identify molecular changes in tumorigenesis and tumor development, and have proved to be an efficient approach for identifying key genes in genomics research [10][11][12]. However, because of the strong invasiveness of SCLC and the short life span of patients, gene chip data for SCLC are scarce.
In this study, gene expression data obtained from the GEO database were integrated for data mining and analysis of SCLC. A set of co-differentially expressed genes was then screened in SCLC. A series of analyses was carried out on these genes, including functional enrichment analysis, protein-protein interaction network analysis and validation in human samples. We identified a number of hub genes and analyzed the interactions between these genes and drugs. Our research may offer more insight into the molecular mechanisms of this destructive disease and into the study of available drugs. The workflow of the bioinformatics strategy for SCLC is illustrated in Fig 1.

SCLC gene expression data from GEO data repository
As a public genomics database, the Gene Expression Omnibus (GEO) of NCBI (https://www.ncbi.nlm.nih.gov/) collects submitted high-throughput gene expression data and can be used to retrieve all datasets from studies of SCLC. For the following analysis, studies were considered eligible if they met the following criteria: (1) The study examined human SCLC and corresponding adjacent or normal lung tissues. (2) Detailed information on the research technology and platform was available. (3) The study was published in English. Based on these criteria, three gene expression microarray datasets for SCLC, GSE40275, GSE99316, and GSE60052, were taken from GEO. Details of these microarray studies are shown in Table 1.

Data preprocessing and DEGs screening
The DEGs between SCLC and normal lung samples in the GEO datasets GSE40275 and GSE99316 were screened using GEO2R (http://www.ncbi.nlm.nih.gov/geo/geo2r). GEO2R is an online analysis tool that compares two or more groups of samples in a GEO series to identify DEGs under the same detection method. The matrix data (.TSV, from supplementary files) of GSE60052 were normalized and log2 transformed with the limma package of R [13]. The adjusted P-value (adj. P) was applied to balance the detection of statistically significant genes against the restriction of false positives. Probe sets without gene symbols were removed, and genes with multiple probe sets were averaged. Genes with |log2FC (fold change)| > 1 and adj. P-value < 0.05 were considered statistically significant. To visualize the identified DEGs, SangerBox (soft.sangerbox.com) and FunRich [14] were used to make volcano plots and Venn diagrams, respectively.
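The screening rule described above can be sketched in Python. This is only an illustration under stated assumptions, not the GEO2R or limma implementation: the function names `collapse_probes` and `screen_degs` and the tuple layouts are hypothetical, and the thresholds are the ones given in the text (|log2FC| > 1, adj. P < 0.05).

```python
from collections import defaultdict


def collapse_probes(probe_rows):
    """Average the expression values of probes that share a gene symbol.

    probe_rows: iterable of (gene_symbol, [expression value per sample]).
    Probes with an empty or missing gene symbol are dropped.
    """
    grouped = defaultdict(list)
    for symbol, values in probe_rows:
        if symbol:
            grouped[symbol].append(values)
    # zip(*rows) walks the samples column by column across all probes of a gene
    return {
        symbol: [sum(column) / len(column) for column in zip(*rows)]
        for symbol, rows in grouped.items()
    }


def screen_degs(results, lfc_cutoff=1.0, adj_p_cutoff=0.05):
    """Split gene-level test results into up- and down-regulated DEGs.

    results: iterable of (gene_symbol, log2_fold_change, adjusted_p_value).
    A gene is kept only if |log2FC| > lfc_cutoff and adj. P < adj_p_cutoff.
    """
    up, down = set(), set()
    for symbol, log2fc, adj_p in results:
        if adj_p >= adj_p_cutoff or abs(log2fc) <= lfc_cutoff:
            continue
        (up if log2fc > 0 else down).add(symbol)
    return up, down
```

In this sketch the probe averaging is done on expression values before any testing, which is one common reading of the step; the differential expression statistics themselves would still come from GEO2R or limma.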

Circular visualization of the DEGs
Circos (http://circos.ca/) was applied to display our data for a better understanding of DEGs, including their gene symbols and locations on chromosomes [15].

Functional enrichment analysis of DEGs
FunRich is a publicly available software tool for functional annotation and pathway enrichment analysis of genes or proteins. In this study, FunRich was used to analyze the functional enrichment of up-regulated and down-regulated DEGs, including molecular function, biological process, cellular component and biological pathway. The results of the biological pathway analysis were visualized with OmicShare tools (http://www.omicshare.com/tools), a free online platform for data analysis.

Construction and analysis of protein-protein interaction (PPI) network
For a better view of the relationships among the DEGs, the PPI network was constructed using the Search Tool for the Retrieval of Interacting Genes (STRING) database (version 11.0; http://string-db.org/) [16]. Cytoscape software (version 3.6.1) was used to establish and visualize the PPI network [17]. Network modules are a characteristic feature of PPI networks and carry specific biological significance. The salient modules in this PPI network were explored with the Cytoscape plug-in Molecular Complex Detection (MCODE) [18]. The thresholds were set as follows: Degree Cutoff = 2, Node Score Cutoff = 0.2, and K-Core = 2. Next, pathway enrichment analysis of the DEGs in the different modules was performed with FunRich and visualized with the OmicShare tools.

Screening of hub genes
The hub genes in the PPI network were screened with cytoHubba, a plug-in of Cytoscape software (version 3.6.1). Genes with a degree score ≥75 were considered hub genes.
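Degree-based screening of this kind can be illustrated with a short Python sketch. It is not cytoHubba's code: the function names and the edge-list input format are assumptions made for illustration, and only the degree threshold of 75 comes from the text.

```python
from collections import defaultdict


def node_degrees(edges):
    """Count distinct interaction partners for every node in an undirected network.

    edges: iterable of (gene_a, gene_b) pairs; duplicates and self-loops are ignored.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {gene: len(partners) for gene, partners in neighbors.items()}


def hub_genes(edges, min_degree=75):
    """Return genes whose degree is at least min_degree, highest degree first."""
    degrees = node_degrees(edges)
    hubs = [gene for gene, degree in degrees.items() if degree >= min_degree]
    return sorted(hubs, key=lambda gene: (-degrees[gene], gene))
```

For a network exported from STRING or Cytoscape as an edge list, `hub_genes(edges)` would apply the same cut-off as described above; a smaller `min_degree` is convenient when experimenting with toy networks.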

Verification of hub genes expression levels
First, the expression of these hub genes was validated in the Oncomine database. As one of the world's largest tumor gene microarray databases and integrated data analysis platforms, Oncomine (oncomine.org) is designed to mine cancer genetic information. To date, the database has collected data from 729 gene expression data sets and over 90,000 cancer and normal tissue samples. It can be used to uncover the differential expression of a single gene between SCLC tissue and related normal tissue [19]. To determine the expression levels of the hub genes in SCLC, SCLC gene expression data from the studies of Garber et al [20] and Bhattacharjee et al [21] in the Oncomine database were investigated and visualized with GraphPad Prism. The data type was restricted to mRNA. The expression levels of the hub genes were then further verified by quantitative real-time PCR (RT-qPCR) in tissue samples from SCLC patients and paired adjacent non-cancerous tissues.

Construction and analysis of hub gene-drug interaction network
The hub gene-drug interaction networks were constructed with the Comparative Toxicogenomics Database (CTD) [22], a platform for analyzing chemotherapeutic drugs that can inhibit or induce the mRNA or protein expression of hub genes. The hub gene-drug interactions were investigated in the CTD database and visualized with the OmicShare tools.

Human SCLC samples
Seven SCLC tissues and seven paired adjacent non-cancerous tissues, all formalin-fixed and paraffin-embedded (FFPE) samples, were collected from patients who had been diagnosed with SCLC from May 2019 to May 2020 at Taihe Hospital of Hubei University of Medicine, China. The FFPE samples were stored at room temperature until total RNA was extracted. All SCLC patients were diagnosed and graded according to pathological characteristics in the Department of Pathology, Taihe Hospital. All human samples were obtained with informed consent (IFC) from patients, and this study was supported and approved by the Ethics Committee of Taihe Hospital. [...] after an initial 120 s denaturation at 95˚C. HPRT1 was used as the endogenous reference gene. All reactions were run in triplicate. The relative RNA levels of SCLC samples were calculated using the 2^−ΔΔCt method. All primers for the hub genes and HPRT1 were synthesized by Sangon Biotech (Shanghai, China), and their sequences are listed in Table 2.
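The 2^−ΔΔCt calculation mentioned above can be written out as a small Python sketch. The function name and argument layout are assumptions for illustration; the only elements taken from the text are the use of a reference gene (HPRT1 in this study), triplicate reactions, and the comparison of tumor with paired adjacent tissue.

```python
from statistics import mean


def relative_expression(ct_target_tumor, ct_ref_tumor,
                        ct_target_normal, ct_ref_normal):
    """Relative expression of a target gene by the 2^-ddCt method.

    Each argument is a list of Ct values from replicate reactions
    (triplicates in the study). The reference gene Ct normalizes each sample:
        dCt  = mean Ct(target) - mean Ct(reference)
        ddCt = dCt(tumor) - dCt(normal)
        fold change = 2 ** (-ddCt)
    """
    delta_tumor = mean(ct_target_tumor) - mean(ct_ref_tumor)
    delta_normal = mean(ct_target_normal) - mean(ct_ref_normal)
    delta_delta = delta_tumor - delta_normal
    return 2 ** (-delta_delta)
```

A result above 1 means the target gene is more abundant in the tumor sample than in its paired adjacent tissue; for example, a ΔΔCt of −2 corresponds to a four-fold increase.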

Statistical analysis
Statistical analysis was performed with GraphPad Prism software (version 8.2.1, San Diego, CA). Student's t-tests were used for the comparison of two sample groups. Differences were considered statistically significant when P < 0.05 (*P < 0.05, **P < 0.01, ***P < 0.001).

Identification of DEGs in SCLC
A total of 117 SCLC samples and 51 normal lung samples were included in this study (Table 1 and S1 Table). Between SCLC tissues and normal lung tissues, 3337 DEGs (1752 up-regulated and 1585 down-regulated) were identified in GSE40275, 510 DEGs (326 up-regulated and 184 down-regulated) in GSE99316-GPL570, and 2304 DEGs (953 up-regulated and 1351 down-regulated) in GSE60052, as shown in the volcano plots (Fig 2A-2C). Venn diagram analysis showed that 208 DEGs, including 102 up-regulated genes and 106 down-regulated genes, were consistently found in the three data sets (Fig 2D-2E and S2 Table). All 208 DEGs are listed in Table 3. As shown in Fig 2F, we screened the top 20 up-regulated and top 20 down-regulated DEGs according to the cut-off criteria. Chromosome mapping of the DEGs presented their distribution on chromosomes, with chromosome 1 containing the most dysregulated genes in SCLC (Fig 2G). Interestingly, four genes on the X chromosome were dysregulated in SCLC (FHL1, SRPX, HMGB3 and WNK3), while no gene on the Y chromosome was affected.
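The Venn diagram step amounts to intersecting the per-dataset DEG lists while keeping the direction of regulation. A minimal Python sketch follows; the function name `common_degs` and the dictionary layout are assumptions for illustration, and the gene identifiers in any example are placeholders rather than results from the datasets.

```python
def common_degs(per_dataset):
    """Find DEGs that are consistently up- or down-regulated in every dataset.

    per_dataset: dict mapping a dataset name to a pair
                 (up_regulated_genes, down_regulated_genes).
    Returns (common_up, common_down) as sets.
    """
    if not per_dataset:
        raise ValueError("at least one dataset is required")
    ups = [set(up) for up, _ in per_dataset.values()]
    downs = [set(down) for _, down in per_dataset.values()]
    # A gene counts as shared only if it has the same direction in all datasets
    return set.intersection(*ups), set.intersection(*downs)
```

Keeping the up- and down-regulated sets separate means that a gene changing in opposite directions in two datasets is not counted as a shared DEG.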

Functional annotation and pathway enrichment analysis
FunRich, a tool for the analysis of genes and proteins, was used for GO functional annotation and biological pathway enrichment analysis of the DEGs.
GO functional annotation involves three categories: biological process (BP), cellular component (CC) and molecular function (MF). The top 10 enriched GO terms are displayed in Fig 3. GO BP analysis suggested that up-regulated DEGs were significantly enriched in cell cycle and in regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolism (Fig 3A). For CC, these genes were significantly enriched in nucleus, nucleoplasm, kinetochore, and chromosome (Fig 3C). The MF analysis for these genes mainly indicated DNA binding (Fig 3E). GO BP analysis indicated that down-regulated DEGs were most enriched, although not significantly, in cell growth and/or maintenance and in transmembrane receptor protein tyrosine kinase signaling pathway (Fig 3B). For CC, these genes were manifestly enriched in extracellular, extracellular space, and extracellular matrix (Fig 3D). Finally, MF analysis showed that these genes were enriched in extracellular matrix structural constituents (Fig 3F).
Furthermore, pathway enrichment analysis was carried out for the up-regulated and down-regulated DEGs. Up-regulated DEGs were mainly enriched in mitotic cell cycle/M-M/G1 phases, DNA replication, and ATM pathway (Fig 4A and Table 4). Moreover, a critical gene, CDC20, was particularly enriched in cell cycle and DNA replication in the pathway enrichment analysis for up-regulated genes (Table 4 and S3 Table). The notably enriched pathways for down-regulated DEGs were epithelial-to-mesenchymal transition, amb2 Integrin signaling, and interleukin-6 (IL-6) signaling pathway (Fig 4B). Notably, IL-6, a crucial gene associated with inflammation, was significantly enriched in amb2 Integrin signaling, Interleukin-mediated signaling, and cytokine signaling in the immune system in the pathway enrichment analysis for down-regulated genes (Table 4 and S3 Table).

Analysis of PPI network and modules
The PPI network contained a total of 178 nodes and 2466 protein pairs with a score > 0.4. The main nodes in this network were up-regulated DEGs (Fig 5A and S4 Table). Furthermore, two modules (module 1 and module 2) with a score > 4 were detected by MCODE (Fig 5B-5C). All nodes in module 1, with an MCODE score of 56.094 (65 nodes, 1795 edges), were up-regulated DEGs, while all nodes in module 2, with an MCODE score of 4.60 (11 nodes, 23 edges), were down-regulated DEGs in SCLC samples. The biological pathway enrichment analysis of the two modules is shown in Fig 6 and S5 Table. The mitotic cell cycle pathway was identified as the most significant pathway in module 1 (Fig 6A), and amb2 Integrin signaling was the most significant pathway in module 2 (Fig 6B).

PLOS ONE
[...] and BUB1B, were hub genes with higher node degrees, and they were all up-regulated genes in module 1.

Verification of hub gene mRNA expression levels
First, based on the Oncomine database, the mRNA expression levels of CDC20, BUB1, TOP2A, RRM2, CCNA2, UBE2C, MAD2L1, and BUB1B were significantly increased in SCLC samples compared with normal lung samples, which was consistent with the above bioinformatics investigation (Fig 7). Second, the expression levels of these hub genes were further verified by RT-qPCR in tissue samples from SCLC patients and paired adjacent non-cancerous tissues. The mRNA levels of the eight hub genes were significantly higher in SCLC tissues than in paired adjacent tissues (Fig 8).

Analysis of hub gene-drug interaction network
CTD was used to study the interactions between the hub genes and available therapeutic drugs for cancer. As a result, multiple drugs could alter the expression of these eight hub genes, including

Discussion
Because effective targeted therapy options are lacking, SCLC is considered a "neglected sibling" compared with NSCLC. The mutations of epidermal growth factor receptor (EGFR) [...]. Therefore, novel biomarkers with high efficiency, high sensitivity and high specificity are urgently needed for the diagnosis and prognosis of SCLC. Compared with single array analysis, integration of multiple arrays is considered a better method to enhance detection capability and improve the reliability of results [30]. In this study, we analyzed three SCLC data sets from GEO to gain insight into gene expression patterns on a genome-wide scale. We identified 208 DEGs (102 up-regulated and 106 down-regulated) and eight hub genes, which were used for further analysis. Chromosome mapping of the 208 DEGs showed that chromosome 1 contained the most dysregulated genes in SCLC. Previous studies affirmed that early-stage development of lung cancer was associated with X chromosome inactivation in females. The X-chromosome inactivation test could be used to screen women who are prone to malignant tumors, including lung cancer [31]. Our findings suggest that the abnormal expression of FHL1, SRPX, HMGB3 and WNK3 on the X chromosome may be related to SCLC in females [32]. However, no differentially expressed gene was found on the Y chromosome in the present study.
A PPI network was constructed to predict the functional protein associations of the 208 identified DEGs. Up-regulated genes were predominantly enriched in mitotic cell cycle, DNA replication, mitotic M-M/G1 phases, and ATM pathway in SCLC. Meanwhile, down-regulated genes were mainly enriched in epithelial-to-mesenchymal transition, amb2 Integrin signaling, and IL-6 signaling pathway in SCLC. Among the top 10 enriched biological pathways for down-regulated genes, 9 were closely related to IL-6 and the immune system (Table 4 and S3 Table). This indicated that the immune level of SCLC was relatively low, and that application of IL-6 inhibitors, immunotherapy or activation of immunity might be potential strategies for the treatment of SCLC.
In the current study, the eight hub genes selected by cytoHubba were overexpressed in SCLC tissues compared with paired adjacent non-cancerous tissues. Notably, seven of those hub genes, namely CDC20, TOP2A, BUB1, BUB1B, UBE2C, CCNA2 and MAD2L1 (all P-values < 0.05), were closely related to the cell cycle pathway. Additionally, the mRNA expression of the hub genes was examined by mining the Oncomine database and by testing human samples, which further validated the bioinformatics analysis. Although previous studies have shown that most of these dysregulated hub genes are closely related to the diagnosis, treatment and prognosis of various diseases, their precise functions and molecular mechanisms in the occurrence and development of SCLC have not yet been clearly elucidated.
As an evolutionarily conserved process, the cell cycle plays an imperative role in cell growth and differentiation. Dysregulation of the cell cycle is considered a hallmark of human cancer [33]. Many cancer treatment strategies targeting the cell cycle have been implemented. A growing number of studies have revealed that several cell cycle-related genes, such as CDC20, TOP2A, BUB1, BUB1B, and UBE2C, are related to the occurrence and development of cancer; these genes were also identified in the present study.
Cell division cycle protein 20 (CDC20), located on chromosome 1, is a promoter of the anaphase-promoting complex (APC). Early research has demonstrated that CDC20 is highly [...]. In our current study, TOP2A was overexpressed in SCLC and mostly enriched in the mitotic cell cycle pathway. Consequently, the results of our study are in concert with those of previous studies, suggesting that TOP2A may be a direct or indirect factor in the occurrence and deterioration of SCLC.
The multidomain protein kinases BUB1 (aliases: BUB1A) and BUB1B (aliases: BUBR1) are key elements of the mitotic checkpoint for spindle assembly [48]. BUB1 plays an important role in chromosome assembly and kinetochore localization in cells [49][50][51], while BUB1B is associated with stabilizing the centromere-microtubule junction and chromosome alignment [52]. Upregulation of BUB1B can prevent aneuploidy and cancer and prolong healthy lifetime [53]. Abnormal expression of BUB1 and BUB1B has been associated with the prognosis of patients with brain tumor [54], glioblastoma [55], colorectal cancer [56], and NSCLC [57], and results in impairment of mitotic checkpoint function. Therefore, future studies on BUB1 and BUB1B that combine genetic approaches may provide an effective strategy for clinical anti-tumor treatment and could be a focus of SCLC research.
Ribonucleotide reductase regulatory subunit M2 (RRM2) is an important enzyme in DNA replication. High expression of RRM2 has been reported in glioblastoma [64], where it promotes tumorigenicity [65], as well as in prostate cancer [66], NSCLC [67], and breast cancer. Overexpression of RRM2 was strongly associated with worse survival in breast cancer, and increased expression was shown in tamoxifen-resistant patients [68]. Recently, a study [69] showed that chemotherapy resistance was associated with the RRM2/EGFR/AKT signaling pathway in NSCLC.
Cyclin A2 (CCNA2), a member of the cyclin family, is a regulator of the cell cycle. It activates cyclin-dependent kinase 2 (CDK2) and promotes transition through G1/S and G2/M [70]. CCNA2 was overexpressed in bladder cancer [71], in ER+ breast cancer, where it was related to tamoxifen resistance [72], in hepatoma, where it promoted cell proliferation [73], and in lung adenocarcinoma [74]. CCNA2 might have prognostic value for progression-free survival (PFS) and OS in patients with lung cancer [75].
As an integral part of the mitotic spindle assembly checkpoint, mitotic arrest defect 2 like 1 (MAD2L1) is manifestly enriched in the cell cycle pathway and ensures that all chromosomes are arranged correctly on the metaphase plate [76,77]. Deletion of the tumor suppressor MAD2L1 could lead to premature degradation of cyclin B, mitotic failure in human cells and tumorigenesis [78]. MAD2L1 is overexpressed in multiple cancers, such as breast cancer [79,80], gastric cancer [81], and lung cancer [82]. Recent studies [83][84][85] demonstrated that the prognosis of SCLC patients with high MAD2L1 expression is worse than that of patients with low MAD2L1 expression. This implies that MAD2L1 may be a promising therapeutic target for SCLC.
The interactions between the eight hub genes and anti-tumor drugs were analyzed to better understand the potential of these genes as therapeutic targets for SCLC. As a result, we found that multiple drugs could change the expression of these hub genes.
As shown in Fig 9A, Sulforafan and Irinotecan promoted the expression of CDC20, while Sunitinib, Methotrexate, Fluorouracil, Dasatinib and Oxaliplatin inhibited it. This suggests that Sunitinib, Methotrexate, Fluorouracil, Dasatinib, and Oxaliplatin might be viewed as targeted drugs for the treatment of SCLC patients with high expression of CDC20. Interestingly, the reported interactions of Cisplatin and Paclitaxel with CDC20 were controversial, and the conclusions drawn in different studies were inconsistent. In the same way, Sunitinib, Nocodazole, Topotecan, Cyclosporine, Dasatinib, Etoposide, Fluorouracil, Irinotecan, Methotrexate, and Azathioprine might be viewed as targeted drugs for the treatment of SCLC patients with high expression of BUB1 (Fig 9B). Methotrexate, Oxaliplatin, Carboplatin, Teniposide, Sunitinib, Cyclosporine, Aurofusarin, Dasatinib, Daunorubicin, Dronabinol, Etoposide, Fluorouracil, and Irinotecan might be viewed as targeted drugs for patients with high expression of TOP2A (Fig 9C). Vincristine, Cyclosporine, Methotrexate, Oxaliplatin, Topotecan and Cytarabine might be viewed as targeted drugs for patients with high expression of UBE2C (Fig 9D). Methotrexate, Oxaliplatin, Topotecan, Dasatinib, Etoposide and Fluorouracil might be viewed as targeted drugs for patients with high expression of BUB1B (Fig 9E). Methotrexate, Oxaliplatin, Sunitinib, Bicalutamide, Afuresertib, Azathioprine, Dasatinib, Hydroxyurea and Topotecan might be viewed as targeted drugs for patients with high expression of RRM2 (Fig 9F). Camptothecin and Fenretinide might be viewed as targeted drugs for patients with high expression of MAD2L1 (Fig 9H). Surprisingly, no confirmed drug could be viewed as a targeted drug for the treatment of SCLC patients with high expression of CCNA2 (Fig 9G).
However, further experimental studies, including in vivo and in vitro experiments and clinical studies, are needed to explore the relationship between these hub genes and the prognosis of SCLC patients, and whether SCLC patients can benefit from these drugs.
Several related studies on SCLC core genes have been published. Liao et al screened five hub genes from four GEO datasets (GSE60052, GSE43346, GSE15240 and GSE6044) through raw data analysis with R software, functional annotation and pathway enrichment analysis, PPI network analysis, enrichment analyses of two significant modules, and analysis of the expression levels of hub genes in the Oncomine database [84]. Wen et al discerned 10 hub genes from two GEO datasets (GSE6044 and GSE11969) using the GEO2R tool, GO functional annotation and KEGG pathway enrichment analysis, PPI network, module analysis, hub gene selection, and validation of the mRNA expression levels of hub genes in the Oncomine database [86]. Mao et al identified 19 hub genes and 32 miRNAs from two GEO datasets (GSE6044 and GSE19945); further analysis was performed through functional annotation and pathway enrichment analysis of DEGs, PPI network, module analysis, hub gene screening, and a miRNA-gene regulatory network [87]. Like these published studies, we identified DEGs at the RNA level based on the GEO database and analyzed them with bioinformatics methods. Using similar methods, we performed functional annotation and pathway enrichment analysis, PPI network construction, enrichment analyses of significant modules, and verification of the expression levels of hub genes in the Oncomine database. However, we excluded two GEO datasets used in the above reports, GSE6044 and GSE11969, for the following reasons. Firstly, as reported, the human genome contains about 20,000 to 4,5000 genes encoding proteins [88], but GSE6044 covered only 8,793 genes, far fewer than the 20,000~50,000 genes covered by the three GEO datasets used here. The inadequacy of these data may therefore lead to incomplete analysis. Secondly, the technology platform of this dataset was applied more than 10 years ago and has not been updated recently. Therefore, GSE6044 was ruled out.
For GSE11969, the log2 (fold change) of all expression data was less than 2, which is completely different from the other GEO data sets and may be unreasonable. Therefore, GSE11969 was excluded. This study has further advantages. Firstly, RT-qPCR was performed to validate the mRNA expression levels of the hub genes in human samples. Secondly, this study screened potential prognostic biomarkers or therapeutic targets in SCLC and performed hub gene-drug interaction network analysis.
In the present paper, we have discussed how the overexpression of eight hub genes is closely related to the occurrence and development of SCLC, indicating that these hub genes might act as promising prognostic markers or therapeutic targets for SCLC. However, our research also has limitations. Firstly, the data used in this study were all collected from public databases, and the quality of the data cannot be evaluated. Secondly, the sample size of the relevant data is comparatively small. Thirdly, our study focused only on genes that changed significantly in multiple data sets; race, region, gender, age, tumor classification, stage, and smoking status were not considered together. Therefore, a lot of valuable biological information may have been overlooked. Finally, all eight hub genes were overexpressed in SCLC, but the corresponding mechanism has not been fully elucidated, so more molecular evidence is needed. Moreover, the current study lacked prognostic data related to these hub genes, such as survival curves, which limits the clinical application value of the hub genes. This paper mainly analyzed the expression levels of the eight hub genes. Whether these hub genes can be used as biomarkers or therapeutic targets of SCLC requires further study.

Conclusions
In conclusion, our bioinformatics analysis identified 208 DEGs, eight hub genes (CDC20, BUB1, TOP2A, RRM2, CCNA2, UBE2C, MAD2L1, and BUB1B), and the mitotic cell cycle pathway, which might play an important role in the development and prognosis of SCLC. As shown by database analysis and confirmed in human samples, these hub genes were overexpressed in SCLC, which might indicate a poor prognosis for patients with SCLC. These results indicate that a comprehensive study of these DEGs will help us to understand the pathogenesis and progression of SCLC. However, since this study is mainly based on data analysis, further basic mechanism studies and clinical studies are needed to confirm these hypotheses in SCLC. We hope that this study can provide a new genomic basis for the individualized treatment of SCLC.
Supporting information S1