Target and drug predictions for SARS-CoV-2 infection in hepatocellular carcinoma patients

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of the coronavirus disease (COVID-19), which poses a major threat to humans worldwide. With the continuous progress of the pandemic, a growing number of people are infected with SARS-CoV-2, including hepatocellular carcinoma (HCC) patients. However, the relationship between COVID-19 and HCC has not been fully elucidated. In order to provide better treatment for HCC patients infected with SARS-CoV-2, it’s urgently needed to identify common targets and find effective drugs for both. In our study, transcriptomic analysis was performed on both selected lung epithelial cell datasets of COVID-19 patients and the datasets of HCC patients to identify the synergistic effect of COVID-19 in HCC patients. What’s more, common differentially expressed genes were identified, and a protein-protein interactions network was designed. Then, hub genes and basic modules were detected based on the protein-protein interactions network. Next, functional analysis was performed using gene ontology terminology and the Kyoto Encyclopedia of Genes and Genomes pathway. Finally, protein-protein interactions revealed COVID-19 interaction with key proteins associated with HCC and further identified transcription factor (TF) genes and microRNAs (miRNA) with differentially expressed gene interactions and transcription factor activity. This study reveals that COVID-19 and HCC are closely linked at the molecular level and proposes drugs that may play an important role in HCC patients with COVID-19. More importantly, according to the results of our research, two critical drugs, Ilomastat and Palmatine, may be effective for HCC patients with COVID-19, which provides clinicians with a novel therapeutic idea when facing possible complications in HCC patients with COVID-19.


Introduction
Currently, the world is experiencing a serious pandemic due to the outbreak of coronavirus disease . The pandemic was caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1]. As far as we know, Angiotensin-converting enzyme 2, the functional receptor of SARS-CoV-2, is the crux of the viral infection [2]. According to previous literature, the infection rate of cancer patients was higher than that of the general population [3]. Furthermore, the case-fatality rate of cancer patients infected with SARS-CoV-2 was higher than the overall case-fatality rate of the general population infected with SARS-CoV-2. This indicates that cancer patients are more easily infected with SARS-CoV-2 and have a higher risk of death [4]. Liver injury caused by SARS-CoV-2 could aggravate the condition of patients with hepatocellular carcinoma (HCC) [5]. Liver cancer remains a challenge for global health [6]. It is predicted that by 2025, more than 1 million people will be affected by liver cancer every year [7]. HCC is the most common type of liver tumor, accounting for more than 90% of all liver tumors. Additionally, over 50% of HCC patients are infected by the Hepatitis B virus [8]. Benedicto et al. suggested that in HCC patients with high Neuropilin-1 expression in their liver sinusoidal endothelial cells, their hepatic stellate cells and tumor cells may have a higher chance of SARS-CoV-2 infection [9]. The mortality of cancer patients with COVID-19 is much higher than that in non-cancer patients; however, it is difficult to determine the impact of COVID-19 on HCC because of the limited data on HCC patients with COVID-19 [10]. Microarray data analysis for COVID-19 and the risk factors of HCC is still unknown.
This study attempted to find the common biological pathways and the relationship between HCC and COVID-19. We selected two datasets for our research. The SARS-CoV-2 infection in humans was analyzed using the GSE147507 dataset, while the TCGA-LIHC dataset was used for the gene expression analysis of HCC. First, we identified respectively the differentially expressed genes (DEGs) of GSE147507 and TCGA-LIHC. We then searched common DEGs for COVID- 19 and HCC so that we could obtain the basic data for the study. Based on the common DEGs, pathway analysis and gene set enrichment analysis were performed to further understand the biological processes of genome-based expression. Most importantly, we identified hub genes from the common DEGs, a critical step in order to identify relevant effective drug molecules. We designed protein-protein interactions (PPIs) networks to gather hub genes. Fig 1 shows the specific workflow of the present study.

Collection of the dataset
The GSE147507 (https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSE147507) dataset, developed by Blanco-Melo et al. [11], describes the SARS-CoV-2 infection's transcriptional responses. While the TCGA-LIHC dataset were downloaded from TCGA (https://portal.gdc. cancer.gov/). The GEO database was employed for gene expression analysis under the National Center for Biotechnology Information Platform [12]. The Illumina NextSeq 500 platform was utilized for the GSE147507 dataset for extracted RNA sequence analysis [11]. The COVID-19 dataset (GSE147507) supplies samples consisting of SARS-CoV-2 infection of human lung epithelial and alveolar cells, and the dataset contains 4 samples. We employed the TCGA-LIHC dataset containing 374 tumor samples and 50 paraneoplastic tissue samples. The paraneoplastic tissue samples were used as the control group.

Identification of DEGs and common genes between COVID-19 and HCC
Pinpointing the common DEGs between the GSE147507 and TCGA-LIHC datasets was the primary task of this study. To discern the DEGs of GSE147507, the researchers used the limma package of the R (V 3.6.3) programming language. The data generated by microarray analysis was retrieved by DESeq2 [13] and the limma package [14]. We set the criteria for considering differences in expression as P values <0.05 and | logFC |>1. Benjamini-Hochberg was interaction-based information based on experiments and predictions, and the interactions generated through the network tool are defined as 3D structures, attachment information, and confidence scores [22]. The confidence score was also utilized for the current PPIs network with a moderate confidence score of 0.400. The confidence score, considered a moderate confidence score, was set using the STRING platform. For the purpose of obtaining a better visual representation of the network and identifying hub genes, the obtained PPIs were analyzed by Cytoscape (https://cytoscape.org/).

Identification of hub genes and module analysis
The analysis of the PPIs network for the current study was implemented through Cytoscape. The hub genes of the corresponding PPIs networks were indicated using the cytoHubba plugin (http://apps.cytoscape.org/apps/cytohubba). The MCODE (http://apps.cytoscape.org/apps/ mcode) plugin was utilized to identify the most profound modules and intensely connection regions in the PPIs network. The identification of highly interconnected parts by MCODE clustering helps to study effective drug design.

TF-gene interactions
Interactions of TF-genes with identified common DEGs reveal the role TFs play in gene functional pathways and expression levels [23]. The NetworkAnalyst (https://www.networkanalyst. ca/) platform was utilized to identify the interactions of TF-genes with identified common genes. NetworkAnalyst is a synthetic web-based platform for gene expression across multiple species and enables them to be meta-analyzed [24]. The network generated for the TF interaction genes is available from the ENCODE (https://www.encodeproject.org/) database included in the NetworkAnalyst platform.

TF-miRNA coregulatory network
TF-miRNA co-regulatory interactions were collected from the RegNetwork repository [25], which facilitates the detection of miRNAs and regulatory TFs that regulate DEGs at the posttranscriptional and transcriptional levels. We used NetworkAnalyst to visualize the TF-miRNA co-regulatory network. NetworkAnalyst helped to navigate complicated datasets in the simplest way possible, identifying biological features and functions to derive valid biological hypotheses [26].

Identification of candidate drugs
Drug molecules were designed based on the COVID-19 and HCC common DEGs using DSigDB, which consists of 22,527 genomes. The Enrichr platform was utilized to identify drug molecules with common DEGs. Data were obtained from the DrugSignatures database (DSigDB). The results of the drug candidates were generated based on P-values. P < 0.05 was set as a statistical criterion. Access to the DSigDB was obtained through the Enrichr platform (https://amp.pharm.mssm.edu/Enrichr/). Enrichr is mainly utilized as an enrichment analysis platform, providing numerous visual details of the collective function of the genes provided for input [27].

Identification of common DEGs between COVID-19 and HCC
The GSE147507 dataset was utilized to identify DEGs for COVID-19. 814 DEGs were identified, including 419 up-regulated genes and 395 down-regulated genes. For the HCC dataset, TCGA-LIHC was utilized to identify a total of 4,462 DEGs, of which 3,210 genes were up-regulated and 1,252 genes were down-regulated. As shown in Fig 2, combining the results of the above analysis, we identified 33 DEGs with common up-regulated expression and 68 DEGs with common down-regulated expression. These two sets of the DEGs were used to complete further analysis.

Gene ontology and pathway findings in terms of gene set enrichment analysis
The current study analyzes GO terms and KEGG pathway for the 33 DEGs with common upregulated expression and 68 DEGs with common down-regulated expression. The three best known GO terms include biological processes, molecular functions, and cellular components. Our studies illustrate the top GO terms for each subsection (biological processes, cellular components, and molecular functions), as shown in Table 1. The data in Table 1 demonstrate that for biological processes, common 33 DEGs with common up-regulated expression are highly enhanced in extracellular matrix disassembly and collagen catabolic process. 68 DEGs with common down-regulated expression are highly enhanced in neutrophil degranulation and neutrophil activation involved in immune response. Data from the molecular function subsection suggest that metalloendopeptidase activity is well correlated with 33 common up-regulated genes. Correspondingly, carbohydrate binding is well correlated with 68 DEGs with common down-regulated expression. Cellular component studies revealed significant involvement of apical part of cell and ficolin-1-rich granule respectively in common up-regulated and down-regulated DEGs. Table 2 shows the interaction of the complement and coagulation cascades with most genes according to the KEGG pathway database. The information obtained from Table 2 indicates the interaction of the platelet activation pathway and the malaria pathway with most genes according to the KEGG pathway database. Fig 3 shows the visualization results of the GO and KEGG analysis. WikiPathways, Reactome and BioCarta pathway analyses are summarized in Table 3. Fig 4 illustrates the results of the pathway analysis from the diverse pathway databases, respectively.

PPIs network to identify hub genes and module analysis
The PPIs network was created for further analysis of this study, including the hub gene assay used to identify drug molecules for COVID-19 and HCC. Ultimately, the results of the PPIs network were connected, and the PPIs analysis was proposed to build the drug compounds at

Identification of hub genes and module analysis for suggesting therapeutic solutions
To track hub genes from the network of PPIs highlighted in Fig 5, cytohubba, a plug-in for the Cytoscape software, was utilized. Hub genes were sorted by their degree values, which indicated the number of gene interactions in the PPIs network. In the UP-PPIs network, the top 10 identified hub genes were PDGFRB, MMP14, VWF, CD34, NES, MCAM, CSPG4, MMP1, SPARCL1 and MMP10. In the DOWN-PPIs network, the top 10 identified hub genes were IL1B, S100A12, FCGR3B, CCR1, S100A8, CCL3, CCL2, CCL4, CLEC4D and LILRA1.

PLOS ONE
Target and drug predictions for COVID-19 with HCC patients  Table 4.

Transcription factor-gene interactions
Transcription factor-(TF) gene interactions were gathered using NetworkAnalyst. For the common up-regulated and down-regulated DEGs, the TF-genes were characterized. The interaction of the TF regulators with the common up-regulated and down-regulated DEGs is shown in Fig 7. For the common up-regulated DEGs, the network consists of 67 nodes and 278 edges (UP-TF-gene network). The network includes a total of 42 TF-genes. MCM7 is mainly regulated by 33 TF-genes, ADAM17 is mainly regulated by 25 TF-genes, and MAFG is mainly regulated by 25 TF-genes. For the common down-regulated DEGs, the network consists of 135 nodes and 358 edges (DOWN-TF-gene network). The network includes a total of 86 TF-genes. SERPINA1 is mainly regulated by 23 TF-genes, C5AR1 is mainly regulated by 21 TF-genes, and ARL5B is mainly regulated by 20 TF-genes. These TF-genes moderate the network of more than one common DEG, which demonstrates the high interaction of TF-genes with the common up-regulated and down-regulated DEGs. Fig 7 illustrates the TF-gene interaction network.

TF-microRNA coregulatory network
TF-microRNA (miRNA) coregulatory network was produced using NetworkAnalyst. Analysis of the TF-miRNA co-regulatory network and provides the interaction of miRNA and TFs with the common up-regulated and down-regulated DEGs. This interaction may be responsible for regulating the expression of DEGs. For the common up-regulated DEGs, the network created for the TF-miRNA co-regulatory network includes 116 nodes and 299 edges (UP-TF-miRNA network). Among all miRNAs involved in regulating genes, the largest number of regulated genes was has-miR-16, which was involved in regulating 6 genes. For the common down-regulated DEGs, the network created for the TF-miRNA co-regulatory network includes 191 nodes and 452 edges (DOWN-TF-miRNA network). Similarly, the largest number of regulated genes in this network is has-miR-9, which is involved in the regulation of 7 genes. Fig 8 illustrates the TF-miRNA co-regulatory network.

Identification of candidate drugs
This study aimed to integrate the treatment of COVID-19 and HCC. According to the DSigDB, drug molecules were derived from the 33 common up-regulated DEGs and 68 common down-regulated DEGs. Among all drug candidates based on adjusted P-values (P < 0.05), the current study highlights the top ten important drugs. Ilomastat TTD 00008545, CGS-27023A TTD 00002801, CHEMBL475540 TTD 00006054, LAMININ BOSS, Plasmasteril BOSS, Ethylene dimethacrylate BOSS, 5194442 MCF7 UP, 9001-31-4 BOSS, fluphenazine PC3 UP and norcyclobenzaprine PC3 UP are the peak candidates for COVID-19 and HCC according to common up-regulated DEGs. Palmatine CTD 00000225, betamethasone CTD 00005504, fludrocortisone CTD 00005975, suloctidil HL60 UP, Modrasone CTD 00001031, Roflumilast CTD 00003916, acetohexamide PC3 UP, alexidine CTD 00000048, Antimycin A CTD 00005427 and dequalinium CTD 00005770 are the best candidates for COVID-19 and HCC based on common down-regulated DEGs. For the common up-regulated DEGs, the analysis showed that Ilomastat TTD 00008545, CGS-27023A TTD 00002801 and CHEMBL475540 TTD 00006054 were the top 3 drug molecules that interacted with the most genes. For the common down-regulated DEGs, the analysis showed that Palmatine CTD 00000225, betamethasone CTD 00005504 and fludrocortisone CTD 00005975 were the top 3 drug molecules that interacted with the most genes. Since these characteristic drugs were detected against common DEGs, these drugs represent common drugs for COVID-19 and HCC. Tables 5 and 6 indicate the drug candidates for the common DEGs in the DSigDB.

Discussion
There are no previous studies showing that having liver cancer is a risk factor for COVID-19. When a patient has liver cancer or even advanced liver cancer, the normal physiological functions of the liver are disrupted, and many critical proteins cannot be synthesized. This condition may promote the development of COVID-19. This study contributes to the narrative Bioinformatics curriculum for the meaningful analysis of SARS-CoV-2 affected lung epithelial and alveolar samples and HCC affected human liver tissue. Bioinformatics-related methods were used for this study to screen 814 and 4,462 DEGs from GSE147507 and TCGA-LIHC, respectively. To establish relationships and detect drug candidates based on COVID-19 and HCC, common DEGs were identified between the GSE147507 and TCGA-LIHC datasets. A total of 33 common up-regulated DEGs and 68 common down-regulated DEGs were identified. Next, we continued to analyze GO, the pathway, PPIs networks, TF-gene interactions, the TF-miRNA coregulatory network, and drug candidate assays. 33 common up-regulated DEGs and 68 common down-regulated DEGs were identified for detecting GO terms. GO terms were selected based on P-values. Extracellular matrix disassembly, neutrophil degranulation, metalloendopeptidase activity, carbohydrate binding, apical part of cell and ficolin-1-rich granule were the most important GO terms. Extracellular matrix breakdown is strongly associated with the pathogenic process of COVID-19. SARS-CoV-2 induces an inflammatory response at the blood-gas barrier, causing the breakdown of adherens junctions and tight junctions between endothelial cells, leading to the collapse of the blood-gas barrier and ultimately to hypoxia [28]. Neutrophils play an important role in the course of COVID19 infection. Increased hyperactivated degranulated neutrophils in alveolar lavage fluid of patients with neoconiosis compared to influenza [29]. Carbohydrate binding leads to insulin resistance, the thereby being involved in the pathogenesis of diabetes, COVID-19 may also predispose infected individuals to hyperglycemia [30,31]. The top ranked GO terms according to cellular component are the apical portion of the cell and the paclitaxel-1 enriched granules.
The KEGG pathway of the 33 common up-regulated DEGs and 68 common down-regulated DEGs was identified, as similar pathways were found for COVID-19 and HCC. The top 2 KEGG pathways included platelet activation and Malaria. Increased platelet activation and platelet-monocyte aggregate formation can be observed in patients with severe COVID-19, while platelet activation is associated with the severity of COVID-19 and mortality [32]. Platelet activation is increased in steatosis to nonalcoholic steatohepatitis (NASH), and antiplatelet therapy may prevent the development of NASH and subsequent HCC [33]. A study by Wang et al. found that malaria effectively inhibited HCC progression and prolonged survival time in tumor-bearing mice [34]. Concurrently, the results of WikiPathways indicated that the most interactive gene pathway was matrix metalloproteinases and COVID-19 adverse outcome pathway. A study by Petito et al. also suggested that plasma matrix metalloproteinase 9 was significantly increased in COVID-19 patients [35]. The results from the Reactome and BioCarta pathway also implicated that the most critical pathway is the activation of matrix metalloproteinases and the binding of formylated peptides by formylated peptide receptors.
Because hub gene detection, module analysis, and drug identification depend entirely on the PPIs network, the PPIs network analysis is the most remarkable part of the study. The PPIs analysis was also generated for 33 common up-regulated DEGs and 68 common down-regulated DEGs. According to the PPIs network the PDGFRB, MMP14, VWF, IL1B, S100A12 and FCGR3B genes were declared central genes due to their high interaction rate or degree values. High expression of PDGFRB and MMP14 is associated with poor prognosis of HCC [36][37][38]. VWF has also been mentioned several times as a biomarker to predict the prognosis of liver cancer [33,39,40]. VWF is involved in the formation of microvascular thrombi during COVID19 pathogenesis [41]. The inflammatory cytokine IL-1β was proven to be responsible for the induction of PD-L1 expression, further mediating immune escape from HCC [42]. Del Valle et al. revealed low levels of IL-1β expression in the serum of COVID-19 patients [43]. S100A12 was low-expressed in HCC tissues, and lower expression of S100A12 was associated with poorer OS [44]. To concentrate on critical regions of the PPIs network, modular analysis of hub genes was implemented. We targeted highly concentrated regions as this is more productive according to drug compounding recommendations.
TF-gene interactions were identified with common DEGs. TF-genes act as regulators according to the expression of genes that may lead to the production of cancer cells. From the network, MCM7 and SERPINA1 showed high interaction rates with other TF-genes. The degree values of MCM7 and SERPINA1 in the TF-gene interaction network were 33 and 23, respectively. MCM7 facilitates cancer procession through cyclin D1-dependent signaling and serves as a prognostic marker for patients with HCC [45]. Alpha-1 antitrypsin (AAT), a predominant plasma serine protease inhibitor encoded by serpina1, is known to promote the immune response to viral infections [46]. Among the regulators, KLF9 and MTA2 had significant interactions. In the TF-gene interaction network, the degree values of KLF9 and MTA2 were 12 and 9, respectively. Fu et al. demonstrated that mRNA and protein levels of KLF9 were lower in hepatocellular carcinoma (HCC) tissues than in normal tissues and that upregulation of KLF9 inhibited cell proliferation and movement [47]. Guan et al. revealed that high expression levels of MTA2 were closely associated with advanced pathological stages and low overall survival of patients, and that MTA2 promoted HCC proliferation and metastasis in vitro and in vivo by inhibiting the Hippo signaling pathway [48].
Regulatory biomolecules serve as underlying biomarkers in multiple complicating diseases. With this in mind, miRNAs and TF-genes were utilized to regulate the common DEGs. They were visualized and analyzed in a TF-miRNA co-regulatory network. 106 miRNAs and 104 TF-genes were identified in this study. Among the TFs with the strongest interactions, JUN had the high degree value of 12. JUN is a key regulator of inflammatory cytokine genes, such as CCL2, and is expressed at high levels in COVID-19 patients [49][50][51]. In a study by Liu et al., JUN was discovered to be potentially involved in the molecular mechanisms of HCC pathogenesis [52]. SP1 had the high degree value of 10. SP1-regulated RasGRP1 transcription stimulates proliferation of HCC [53]. TF-genes are responders that perform expression regulation by binding to target genes and miRNAs and can regulate gene expression through mRNA degradation [54].
Antimycin A is considered to be effective in decreasing the activity of coronaviruses and also has the potential to treat COVID-19 [55]. It is predicted that cell surface-bound immunoglobulin (csBiP) readily binds to SARS-CoV-2. CsBiP has a peptide-binding substrate-binding domain (SBD), and acetohexamide can bind to the SBD thereby interfering with the binding of SARS-CoV-2 to csBiP [56]. Betamethasone improves symptoms of hyposmia in patients with COVID-19 [57]. Since SARS-CoV-2 is a novel virus, drug treatment has been less extensively studied thus far. For this reason, we were only able to gather a small number of samples for analysis. In the future, once more samples become available, the ongoing study will gain further validity in the context of the SARS-CoV-2 pandemic.

Conclusions
In regards to transcriptomic analysis, no other studies on SARS-CoV-2 and HCC have been done so far. We completed an analysis of the DEGs between the two datasets, filtered the material by common gene identification and tried to identify the infection response between SARS-CoV-2 and HCC affected hepatocytes. In this study, we elucidated the intrinsic association between HCC and COVID-19 by identifying key genes that are commonly up-or downregulated by HCC and COVID-19. In addition, this study also indicates that COVID-19 may be a risk factor for the progression of HCC. Drug targets are logically suggested because they are derived from the identification of hub genes and may serve as a positive preamble to already approved drugs. HCC and COVID-19 share common targets. We propose these targets to help scientists and clinicians better investigate the mechanisms underlying the possible complications in HCC patients with COVID-19. Given that SARS-CoV-2 is a recently evolved virus, studies of its risk factors and infection are scarce. Unique studies of SARS-CoV-2 will become increasingly significant as more data sets become available.