IL6-mediated HCoV-host interactome regulatory network and GO/Pathway enrichment analysis

During the global emergency caused by the COVID-19 outbreak, there is an urgent need to share reliable information that helps life scientists worldwide gain better insights and make sense of the large amount of data currently available. In this study we used the results presented in [1] to perform two different Systems Biology analyses on the HCoV-host interactome. In the first, we reconstructed the interactome of the HCoV-host proteins and integrated it with highly reliable miRNA and drug interaction information. We then added the IL-6 gene, identified in recent publications [2] as heavily involved in COVID-19 progression, and, interestingly, identified several interactions with the reconstructed interactome. In the second analysis, we performed Gene Ontology and pathway enrichment analyses on the full set of HCoV-host interactome proteins and on those belonging to a significantly dense cluster of interacting proteins identified in the first analysis. The results of the two analyses provide a compact but comprehensive glance at some of the current state-of-the-art regulations, GO terms, and pathways involved in the HCoV-host interactome, which could support all scientists currently focusing on SARS-CoV-2 research.


Author summary
In this paper we provide data about the HCoV-host interactome that can be extracted by integrating several publicly available databases. We started from the initial interactome published by Zhou et al. and analyzed whether it contains already known and validated interactions. We also looked into known miRNA and drug interactions to suggest possible biomarker candidates and treatment options. Finally, we performed Gene Ontology and pathway enrichment analyses to understand which pathways are most likely to involve the proteins targeted by SARS-CoV-2. This paper not only provides a set of validated and reliable data that could help researchers in their fight against the COVID-19 outbreak, but also demonstrates how Systems Biology can be used effectively to quickly gather preliminary but still significant data without resorting only to expensive lab experiments.

Introduction
The sudden global emergency caused by the newly discovered SARS-CoV-2 requires a fast but reliable and comprehensive analysis of the virus interactions with the human genome. Coronaviruses (CoVs) typically affect the respiratory tract of mammals, including humans, and lead to mild to severe respiratory tract infections [3]. To provide fast results that could contribute to the fight against the epidemic, in this study we started from the results presented in [1] and performed two different Systems Biology analyses on the HCoV-host interactome. In the first one, relying on validated interactions only, we reconstructed the regulatory network linking the proteins identified in [1], integrating it with the IL-6 protein and with miRNA and drug interaction information. We would like to stress that, while the original paper [1] focuses on protein interactions, this paper looks at gene regulatory mechanisms. We wanted to look at the same problem from a different point of view, starting from the results of Zhou et al. We consider this contribution a continuation or expansion of their work, not something that in any way contradicts it. Our first analysis was carried out using the RING database [4], a data repository integrating 30 public databases and designed for advanced biological network reconstruction. This phase allowed us to immediately identify a strongly connected cluster of proteins and drugs, as well as 3 miRNAs that the cluster produces. In the second analysis, executed using an R pipeline, we compared the full set of the HCoV-host interactome proteins with the ones belonging to the identified cluster. In particular, we performed a Gene Ontology [5] enrichment analysis, a pathway enrichment analysis (with both KEGG [6] and WikiPathways [7]), and a final cluster analysis. The results of the latter analysis are publicly available at https://precious.polito.it/covid-19/ as an interactive website (the same data is also available in the S1_File of the supplemental materials).
As pure bioinformaticians, we currently lack a valid experimental setup to prove the validity of the results. The goal of this paper is not to propose a new methodology but to apply consolidated techniques to share and propagate knowledge that may help other scientists involved in SARS-CoV-2 research to better design their studies or uncover new hypotheses.

Methods and results
During the global emergency caused by the COVID-19 outbreak, there is an urgent need to share reliable information that helps life scientists worldwide gain better insights and make sense of the large amount of data currently available. Based on this premise, we aim to provide at-a-glance insights, easy for life scientists to read, about: i) the current state-of-the-art knowledge on direct regulatory interactions among the genes/proteins included in the HCoV-host interactome [1] and IL-6, which resulted in the HCoV-Cluster network, and ii) the whole set of GO terms and pathways enriched in both HCoV sets.

Network analysis
For this first analysis we used the Graph Tools of the RING database (https://precious.polito.it/theringdb/login) [4]. This tool integrates more than 30 publicly available data repositories and allows the reconstruction of interaction networks between genes, Transcription Factors, miRNAs, Drugs, Diseases, and SNPs.
As a first step we performed a network reconstruction looking only for interactions among the proteins of the HCoV-host interactome, as reported in S3 Table in the Supplemental material of [1]. In order to build the minimum set of most reliable interactions, we used the most conservative settings, where all interactions are validated, manually curated, and limited to signaling or regulatory actions such as inhibition/activation (data sources: TRRUST [8] and SIGNOR [9]). This phase allowed us to divide the original HCoV-host interactome proteins into two sets: one that shows no obvious regulatory interactions, and one made of 20 genes (see Table 1) out of the original 135 proteins, which forms a well-defined regulatory interactome. For all interactions we report (in S3 Table of the supplemental material) the PubMed references of the supporting papers.
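The conservative filtering described above can be illustrated with a minimal Python sketch. It is not the RING implementation: the record layout, the gene names (`GENE_A`, ...), and the helper functions are hypothetical placeholders, and the toy edges are not data from this study. The sketch keeps only validated, manually curated, regulatory edges whose two ends both belong to the interactome, and then splits the proteins into those with and without a retained interaction.

```python
# Hypothetical toy records: (source, target, action, validated, curated).
# Gene names and interactions are placeholders, NOT data from the study.
EDGES = [
    ("GENE_A", "GENE_B", "activation", True, True),
    ("GENE_B", "GENE_C", "inhibition", True, True),
    ("GENE_C", "GENE_D", "binding", True, True),      # not a regulatory action
    ("GENE_D", "GENE_E", "activation", False, True),  # not validated
    ("GENE_A", "GENE_X", "activation", True, True),   # GENE_X is outside the set
]
INTERACTOME = {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"}
REGULATORY_ACTIONS = {"activation", "inhibition"}


def conservative_edges(edges, proteins):
    """Keep validated, curated, regulatory edges with both ends in `proteins`."""
    return [
        (src, dst, action)
        for src, dst, action, validated, curated in edges
        if validated
        and curated
        and action in REGULATORY_ACTIONS
        and src in proteins
        and dst in proteins
    ]


def split_by_connectivity(edges, proteins):
    """Return (connected, isolated): proteins with / without a retained edge."""
    connected = {node for src, dst, _ in edges for node in (src, dst)}
    return connected, proteins - connected


kept = conservative_edges(EDGES, INTERACTOME)
connected, isolated = split_by_connectivity(kept, INTERACTOME)
print(sorted(connected), sorted(isolated))
```

In this toy example only two edges survive the filters, so three placeholder genes form the regulatory set and two remain without obvious regulatory interactions, mirroring the two-set split described in the text.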
The HCoV-host network was then enhanced by looking for co-expressed microRNAs (in this case we selected the MIRIAD data source [10], while relaxing the Validated and Manually curated filters). This operation highlighted that 3 miRNAs are co-expressed by the identified cluster: hsa-miR 3912-3p and hsa-miR 3912-5p, hosted by the NPM1 gene, and hsa-miR 4751, hosted by the ATF5 gene. We decided to highlight miRNAs possibly involved in the COVID-19 interactome because miRNAs are known to mediate several regulatory mechanisms and to be powerful biomarkers for several diseases. Finally, we added the IL-6 gene to see whether it presented any interaction with the identified cluster.
The resulting network is presented in Fig 1. Yellow edges represent multiple interactions. For each node of the network, the S1 Table reports the node name, type, UniProt ID, and possible aliases. For each interaction, the S2 Table reports the interaction type (to match each symbol with its meaning, refer to [4]), the database of origin, and, where available, the PubMed IDs of the related papers.
As a third step, we further enhanced the network by looking for drugs that had at least two targets in the identified cluster (data sources: DGIdB [11] and DRUGBANK [12]). For the sake of clarity and readability, we kept the proposed HCoV interacting network as small as possible and avoided including further possible regulations with weaker reliability. Custom enhancements are of course possible for interested scientists, and instructions are reported in the S1 Instruction file.
The list of drugs that have at least two interactions with the cluster is reported in Table 2. Although this list shares only Paroxetine with that of [1], the other drugs could still be of interest as possible repurposing candidates.
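The "at least two targets in the cluster" criterion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the drug and gene identifiers are hypothetical placeholders, the pair list does not come from the drug databases used in the study, and `drugs_with_min_targets` is a name introduced here for the example.

```python
from collections import defaultdict

# Hypothetical drug-target pairs (placeholders, NOT records from the study).
DRUG_TARGETS = [
    ("DRUG_1", "GENE_A"),
    ("DRUG_1", "GENE_B"),
    ("DRUG_2", "GENE_A"),
    ("DRUG_2", "GENE_X"),  # GENE_X is outside the cluster
    ("DRUG_3", "GENE_C"),
    ("DRUG_3", "GENE_C"),  # same pair reported by a second source
]
CLUSTER = {"GENE_A", "GENE_B", "GENE_C"}


def drugs_with_min_targets(pairs, cluster, min_targets=2):
    """Map each drug to its distinct targets inside `cluster`, keeping only
    drugs that hit at least `min_targets` different cluster members."""
    targets = defaultdict(set)
    for drug, gene in pairs:
        if gene in cluster:
            targets[drug].add(gene)
    return {
        drug: sorted(genes)
        for drug, genes in targets.items()
        if len(genes) >= min_targets
    }


selected = drugs_with_min_targets(DRUG_TARGETS, CLUSTER)
print(selected)
```

Counting distinct targets, instead of raw records, avoids selecting a drug only because the same interaction is listed by more than one source.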

PLOS COMPUTATIONAL BIOLOGY
The proposed HCoV interacting network may be helpful to all scientists currently involved in SARS-CoV-2 research, since it provides a compact but comprehensive glance at some of the current state-of-the-art regulations that take place within the HCoV-host interactome.

Enrichment analysis
For this second analysis we resorted to an R [13] pipeline, taking advantage of the clusterProfiler package [14], and we fed it with both the full set of 135 HCoV-host proteins and the HCoV cluster of 20 proteins identified in the previous analysis.
With both sets we performed statistically valid enrichment analyses on:
• the three Gene Ontologies (GO [5]), i.e., Molecular Function, Cellular Component, and Biological Process;
• two of the most widely used pathway repositories (KEGG [6] and WikiPathways [7]).
Finally, we also added one non-statistical GO classification. The results are provided as an interactive website (available at https://precious.polito.it/covid-19) that collects and presents all results in both graphical and tabular form; all the tables are immediately available for download to provide scientists, once again, with immediately usable results for further analysis. All results are also available in the S1 Data file.
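As an illustration of what an enrichment test computes, the following Python sketch evaluates the over-representation of a single term with a one-sided hypergeometric test, a common choice for analyses of this kind. The text does not state the exact test or the background gene set used by the R pipeline, so both are assumptions here, and the function names and all numbers are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query set, k: query genes annotated to the term.
    """
    total = comb(N, n)
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


def enrich_term(query, annotated, background):
    """Return (gene_ratio, p_value) for one term; all arguments are gene sets."""
    query = query & background
    annotated = annotated & background
    overlap = query & annotated
    p_value = hypergeom_upper_tail(
        len(overlap), len(query), len(annotated), len(background)
    )
    return len(overlap) / len(query), p_value


# Hypothetical toy example: 10 background genes, 4 annotated, 3 queried.
background = {f"G{i}" for i in range(10)}
annotated = {"G0", "G1", "G2", "G3"}
query = {"G0", "G1", "G9"}
gene_ratio, p_value = enrich_term(query, annotated, background)
print(gene_ratio, p_value)
```

The gene ratio is the fraction of the query set annotated to the term, and the p-value is the probability of observing at least that many annotated genes by chance when drawing the query set from the background.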
Each enrichment analysis result is provided with its entity (i.e., a GO term or pathway), defined by its ID and Description, the gene ratio coverage, the list of proteins involved in that pathway or GO term, and its enrichment p-value adjusted by the Benjamini & Hochberg (BH) method [14]. This value has been computed to control the expected proportion of false discoveries among the rejected hypotheses (i.e., the false discovery rate, FDR), which imposes a less stringent condition on false discoveries and allows more candidate options to be included as results. Furthermore, we also computed, only for GOs, a non-statistical classification, possibly helpful for preliminary evaluation or hypothesis definition. In this case we simply report the frequency of recurring GO terms among the entities, with no statistical consideration. Because of that, the data tables related to GO classification do not report any p-value. Finally, a cluster analysis has been computed in order to elucidate possible differences between the HCoV-host protein set and the HCoV-host protein cluster set in all the computations performed in the previous steps (i.e., GO Classification, GO Enrichment, and Pathway Enrichment). The same statistical assumptions and limits previously discussed for the enrichment analysis apply to the cluster analysis.
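For readers less familiar with the BH adjustment mentioned above, the following Python sketch shows the standard step-up computation of adjusted p-values. It is a generic illustration, not the code of the R pipeline used in this study; the function name `bh_adjust` and the example p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Each p-value is scaled by m / rank (rank in ascending order), then a
    running minimum is taken from the largest p-value downwards so that the
    adjusted values remain monotone and never exceed 1.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from largest to smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four terms.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = bh_adjust(raw)
print(adjusted)
```

Terms whose adjusted p-value falls below a chosen threshold are reported as enriched; the threshold itself is a choice of the analyst and is not fixed by the method.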
As a proof of concept, Table 3 shows the top five enriched WikiPathways pathways.
WP3872 describes integrin-mediated cell survival regulation induced by parathyroid hormone-related protein. Although the pathway may not seem immediately applicable to SARS-CoV-2, it is interesting to highlight the involvement of the PI3-K/Akt pathway through increased levels of integrin A6B4, which further modulate the pro- and anti-apoptotic members of the Bcl-2 family. The Bcl-2 family of genes has been shown to play an important role in the IL-6-mediated protective response to oxidative stress. The authors of [15] showed that IL-6 induced Bcl-2 expression, both in vivo and in vitro, disrupted interactions between proapoptotic and antiapoptotic factors, and suppressed H2O2-induced loss of mitochondrial membrane potential in vitro. They concluded that IL-6 induces Bcl-2 expression to perform cytoprotective functions in response to oxygen toxicity, and that this effect is mediated by alterations in the interactions between BAK and MFNS.
WP4298 refers to the Viral Myocarditis (VM) pathway. VM is a rare cardiac disease associated with inflammation and injury of the myocardium, resulting from the cooperation between viral processes and the host's adaptive and innate immune responses (see [16][17][18]). Recent papers identify SARS-CoV-2 as responsible for acute or fulminant myocarditis; nevertheless, the authors state that the mechanism of cardiac pathology caused by SARS-CoV-2 needs further study [19,20].
Acute Respiratory Distress Syndrome (ARDS) induced by SARS-CoV-2 has recently been highlighted as mediated by high levels of the cytokine IL-6 [2,21], which lead to an excessive inflammatory response that is in turn related to poor prognosis. While IL-6 may be considered a therapeutic target on its own, common IL regulatory traits (shared with IL-3 and IL-5) may help to highlight more detailed mechanisms of action in the inflammatory response. The WP3646 pathway reports the main hub genes and their related miRNAs. The pathway was built on a set of genes differentially expressed in both chronic HCV (hepatitis C virus) infection and HCC (hepatocellular carcinoma) to highlight how the hepatitis C virus leads to hepatocellular carcinoma [22]. This pathway suggests a possibly similar behavior for viruses of the coronavirus family; liver involvement during infection has recently been highlighted, but remains largely unexplored [23].
Supporting information S1