Skip to main content
Advertisement
Browse Subject Areas
?

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Comprehensive information-based differential gene regulatory networks analysis (CIdrgn): Application to gastric cancer and chemotherapy-responsive gene network identification

Abstract

Biological condition-responsive gene network analysis has attracted considerable research attention because of its ability to identify pathways or gene modules involved in the underlying mechanisms of diseases. Although many condition-specific gene network identification methods have been developed, they are based on partial or incomplete gene regulatory network information, with most studies only considering the differential expression levels or correlations among genes. However, a single gene-based analysis cannot effectively identify the molecular interactions involved in the mechanisms underlying diseases, which reflect perturbations in specific molecular network functions rather than disorders of a single gene. To comprehensively identify differentially regulated gene networks, we propose a novel computational strategy called comprehensive analysis of differential gene regulatory networks (CIdrgn). Our strategy incorporates comprehensive information on the networks between genes, including the expression levels, edge structures and regulatory effects, to measure the dissimilarity among networks. We extended the proposed CIdrgn to cell line characteristic-specific gene network analysis. Monte Carlo simulations showed the effectiveness of CIdrgn for identifying differentially regulated gene networks with different network structures and scales. Moreover, condition-responsive network identification in cell line characteristic-specific gene network analyses was verified. We applied CIdrgn to identify gastric cancer and itsf chemotherapy (capecitabine and oxaliplatin) -responsive network based on the Cancer Dependency Map. The CXC family of chemokines and cadherin gene family networks were identified as gastric cancer-specific gene regulatory networks, which was verified through a literature survey. The networks of the olfactory receptor family with the ASCL1/FOS family were identified as capecitabine- and oxaliplatin sensitive -specific gene networks. We expect that the proposed CIdrgn method will be a useful tool for identifying crucial molecular interactions involved in the specific biological conditions of cancer cell lines, such as the cancer stage or acquired anticancer drug resistance.

Introduction

The complex mechanisms underlying some diseases states cannot be understood by gene expression level analyses alone because some diseases are caused by disturbances in the specific functions of molecular networks rather than disorders in a single gene. One of the most powerful emerging techniques for biomedical research is heterogeneous gene regulatory network analysis, especially for responsive gene sets and module identification, and such networks have attracted considerable research attention because of their ability to reveal the complex mechanisms of diseases. Many studies have focused on the expression levels of genes and developed statistical methods for identifying differentially expressed genes (DEGs) or gene sets [13]. Computational methods based on correlations between genes have also been proposed to identify condition-responsive subnetworks and gene modules. Tesson et al. [4] focused only on the coexpression of genes and developed the method DiffCoEx to identify correlation pattern changes between multiple conditions. Guo et al. [5] developed the responsive score based on covariance of genes and then proposed the method for searching responsive networks. Although various computational approaches have been developed to identify significant gene sets describing phenotype-specific characteristics, these methods are based only on partial or incomplete information (e.g., correlation, covariance of expression levels of genes) on gene regulatory networks. Thus, biological condition-responsive gene networks could not be effectively identified.

Our study focuses on the drawbacks of the previous studies. In order to consider various aspects of network structure and effectively identify responsive networks, we propose a novel computational strategy that incorporates comprehensive information of network structure unlike the existing studies, called comprehensive information-based differential gene regulatory network analysis (CIdrgn). We consider the gene regulatory network structure based on not only expression levels of genes but also edge structures and regulatory effects of regulator genes to their target genes and incorporate the comprehensive information into a dissimilarity measure to compare the gene regulatory networks. We then proposed a novel statistic strategy to identify differentially regulated gene networks across phenotypes or biological conditions and extend the proposed CIdrgn to cell line characteristic-specific gene regulatory network analysis. To the best of our knowledge, this is the first study on responsive network identification based on comprehensive information of network structure.

We demonstrate the effectiveness of the proposed CIdrgn scheme through Monte Carlo simulations. Our strategy showed outstanding performance for identifying differentially regulated gene networks with various structures and scales. We applied our strategy to the Cancer Dependency Map (DepMap) dataset and performed gastric cancer (GC) and chemotherapy (capecitabine and oxaliplatin)-responsive gene network analyses. The identified responsive networks and their members (genes) represent candidate biomarkers for mechanisms related to gastric/colorectal cancer and various chemotherapies. Although biological knowledge about the identified subnetworks and markers was not incorporated into the differential gene regulatory network analysis, our data-driven strategy provides biologically reliable results for GC and its chemotherapy-responsive gene network analysis. Our results suggested that The CXC family of chemokines and cadherin (CDH) genes may play crucial roles in tumor invasion, metastasis, and progression of GC. In addition, the loss of activity of the olfactory receptor (OR) family with the ASCL1/FOS family may lead to drug resistance in cell lines and inducing these gene families may provide vital clues to enhance the chemotherapy efficiency of capecitabine and oxaliplatin.

The remainder of this paper is organized as follows. In the Methods section, we introduce a novel strategy for the differential gene regulatory network analysis. In the Monte Carlo simulation section, we present the results of the simulation studies. In the GC and XELOX-responsive gene regulatory network analysis section, we describe the results of GC and its chemotherapy-responsive network analysis. In the Discussion section, we provide the conclusions.

Methods

Suppose that is an n × p data matrix that describes the expression of p possible regulators that control target gene transcription for n cell lines. To estimate the gene regulatory network, we considered the following linear regression model: (1) where βℓj is the regression coefficient that represents the effect of each regulator xj on its target y and ϵ = (ϵ1, …, ϵℓn)T is a random error vector. The network can be represented by a weighted graph G = (V, E, W), where V is the set of vertices corresponding to p genes and EV × V is the set of edges, where (i, j) ∈ E indicates a link between vertices i and j (i.e., genes i and j). W = (wij) is the edge weight between vertices i and j which can be estimated using and .

Previous studies

Significance analysis of microarray for gene sets (SAM-GS).

To identify statistically significant gene sets across phenotypes A and B, Dinu et al. [1] proposed the SAM-GS method by extending the SAM t-statistic. The SAM-GS statistic is given as follows: (2) where and are the averages of the expression levels of gene j in phenotypes A and B, respectively; s0 is a tuning parameter; and sj is the gene-specific scatter. This scatter is given as follows: (3) where nA and nB are the numbers of cell lines in phenotypes A and B, respectively, and a = (1/nA + 1/nB)/(nA + nB − 2). The SAM-GS statistic measures the difference in expression levels of a gene set across phenotypes A and B.

Gene set co-expression analysis (GSCA).

Choi and Kendziorski [6] proposed GSCA a statistical method to identify differentially co-expressed genes. For genes in a subnetwork G, GSCA computes pairwise correlations for all gene pairs, where |V| is the number of genes in V. The statistic of the GSCA used to measure the dispersion of correlations between phenotypes A and B is given as follows: (4) where and are the correlations between the kth and jth genes for the phenotypes A and B, respectively.

Comprehensive information-based differentially regulated gene network analysis

In this study, we proposed a novel differential gene regulatory network analysis strategy that can effectively identify biological condition-responsive subnetworks based on gene expression levels and subnetwork structures. Network structures are evaluated based on regulatory effects between genes and the edge structure of genes, and the following statistics are proposed to measure the dissimilarity of the subnetworks.

  • Dissimilarity based on regulatory effects between genes
    We define the following regulatory effect of the jth regulator gene on the th target gene in phenotypes A and B, consistent with a study conducted earlier [7]: (5) where and are the regression coefficients estimated by cell lines of phenotypes of A and B, respectively. Let R(A) = {r1j(A), …, rqj(A), rj1(A), …, rjp(A)}T and R(B) = {r1j(B), …, rqj(B), rj1(B), …, rjp(B)}T be regulatory effect vectors of the jth gene in phenotypes A and B, respectively, where the first q elements indicate the regulator effects of jth gene on its q targets and the remaining p elements indicate the regulator effect of the p regulators on the jth gene. We propose the following statistic to describe dissimilarity of the regulatory effect of the jth gene: (6)
    The large value γj(A, B) indicates the jth gene has different expression levels and size of edges connected its neighborhoods between phenotypes A and B. We then propose the following statistic to measure dissimilarity of regulatory effects of a subnetwork G: (7)
    The large value of Γ indicates that the genes consisting of G have different regulatory effects on their target genes across the phenotypes, thus the network corresponding the large Γ can be considered as the responsive network having phenotype specific regulatory effects.
  • Dissimilarity of the edge structure of nodes
    The genes linked in the networks may have similar biological functions; thus, node (gene) similarity can be defined using the Jaccard index. We propose the following statistic for the node similarity of the jth gene between phenotypes A and B in line with Li et al. [8]: (8) where NAj and NAj are the sets of nodes that are directly connected to the jth gene within phenotypes A and B, respectively. If the gene have similar target and regulator genes between phenotypes A and B, than we consider that the gene is not phenotype specific feature. Thus, we propose the following Jaccard distance metric to measure the dissimilarity of the edge structure of the jth gene: (9)
    We then propose the following statistic to describe the dissimilarity of edge structures in a subnetwork: (10)
  • Dissimilarity based on adjusted regulatory effects
    We also consider the following statistic based on the adjusted regulatory effect using the Jaccard distance: (11)

To comprehensively measure the dissimilarity of a subnetwork across biological conditions or phenotypes, we propose the following statistic that incorporates not only expression levels of genes but also edge structures and regulatory effects into the differential gene regulatory network analysis, (12) (13)

The large values of the statistics CIdrgn.1 and CIdrgn.2 indicate that the subnetworks show different regulatory structures for phenotypes A and B.

Responsive subnetwork extract.

To extract biological condition-responsive subnetworks, we computed the significance of the proposed statistics based on the permutation framework.

We first compute the statistics Γ, Λ, and ΓΛ for the subnetworks. Then, we generate permutation samples and and estimate gene regulatory networks G(pm)(A) and G(pm)(B) based on and for pm = 1, …, T. The permutation statistics Γ(pm), Λ(pm), and were computed for G(pm)(A) and G(pm)(B). Because the statistics have different scales, we normalize each statistic Γ, Λ, and ΓΛ to prevent that any one dominates CIdrgn.1 and CIdrgn.2, where (14)

For notational compactness, in the remainder of this section, we shall write the notation in Eq (14) as normalized statistics.

We then computed the similarity measures CIdrgn.1 and CIdrgn.2 based on the normalized statistics and computed the following permutation p-values: (15) (16) where CIdrgn(pm).1 and CIdrgn(pm).2 are the statistics computed by the pmth permutation sample and I(⋅) is an indicator function. The small permutation p-value indicates that networks G(A) and G(B) show significantly different regulatory structures.

The algorithm for CIdrgn is given in Algorithm1. For the dissimilarity measure CIdrgn.2, similar procedures were applied to compute the permutation p.valueCIdrgn.2 and identify the responsive subnetworks. The proposed statistic is based on comprehensive information of the gene regulatory network, that is, the dissimilarity of node edge structures, regulatory effects, and expression levels; thus, we can effectively extract biological condition-specific gene networks. Fig 1 shows overall framework of the proposed CIdrgn.

Algorithm 1 Algorithm for CIdrgn

1: Construct gene regulatory networks for phenotypes A and B: G(A), G(B).

2: Compute the statistics Γ, Λ and ΓΛ.

3: Construct permutation samples and , estimate gene networks based on the permutation samples Gpm(A) and Gpm(B) and compute the statistics Γ(pm), Λ(pm), and for pm = 1, …T.

4: Normalize Γ, Λ, ΓΛ and then compute CIdrgn.1 and CIdrgn(pm).1 for pm = 1, …, pm.

5: Compute permutation p-values of the statistic CIdrgn.1: p.valueCIdrgn.1.

6: If p.valueCIdrgn.1 < κ, then select the subnetwork as a differentially regulated gene networks, where κ is a threshold.

7: Output the list of subnetworks.

CIdrgn for cell line characteristic-specific gene networks

We consider cell line characteristic-specific gene regulatory networks that can be estimated using the following varying coefficient model [9]: (17) where βjℓ(mα) is the varying coefficient of the jth regulator gene xj on th target gene y for the αth target sample with modulator M = mα, such as the cancer characteristics of the cell lines. The varying coefficient model enables us to construct the gene regulatory networks under varying conditions of cell line characteristics M = mα. The network of the αth cell line can be represented using a weighted graph Gα = (V, Eα, Wα), where Wα = (wαij) is the edge weight, which can be computed based on and . EαV × V is the set of edges, where (i, j) ∈ Eα indicates the link between vertices i and j in the network of the αth cell line.

We also propose the method for identifying responsive subnetwork from cell line characteristic-specific gene networks.

  • Dissimilarity based on the regulatory effect of the sample-specific gene network
    For the sample-specific gene network, the regulatory effect of the jth regulator gene on th target gene in the αth cell line is given as follows: (18) where xαj is an expression level value of the jth gene in the αth cell line. We define the regulatory effect of phenotypes A and B as follows: (19) where and are the set of cell lines of phenotypes A and B, respectively. Let and be regulatory effect vectors of the jth gene in the network of αth cell line for phenotypes A and B, respectively. We consider the statistic to measure the dissimilarity of the regulatory effect of the jth gene in sample-specific gene networks as follows (20)
    We then propose the following statistic to measure dissimilarity of the regulatory effects of genes in a subnetwork Gα, (21)
  • Dissimilarity of the node edge structures for sample-specific gene network
    We consider and as the sets of edges of the jth gene in phenotypes A and B, respectively. We then define the following node similarity: (22) where Nαj is the sets of nodes that are connected to the jth gene in the network of the αth cell line. The Jaccard distance of the jth node in the cell line characteristic-specific gene network is given as follows: (23)
    Similar to (10), the dissimilarity measure of the edge structure is given as follows: (24)

We then proposed the following statistics to identify differentially regulated gene networks in the cell line characteristic-specific gene network analysis. (25) (26) where (27)

For the cell line characteristic-specific gene network, the condition-responsive subnetworks are extracted based on the permutation p-value of CIdrgnss.1 and CIdrgnss.2, which are consistent with p.valueCIdrgn.1 and p.valueCIdrgn.2.

Monte Carlo simulations

Monte Carlo simulations were conducted to investigate the performance of the proposed CIdrgn. Suppose that we have two phenotypes A and B and 10 subnetworks consisting of 10 genes, where the sample sizes of phenotypes A and B are nA = 50 and nB = 50, respectively. We consider four common subnetworks of the two phenotypes A and B, six A-specific and six B-specific subnetworks. The expression levels of the p genes in 100 (= nA + nB) cell lines were generated using the following procedure. In scenario 1, we generated four precision matrices for nw = 1,.., 4 common subnetworks from “random’’ graph structure using huge.generator in R package Huge, where the off-diagonal elements of the precision matrix were set to 0.5 and the positive number 0.2 was added to the diagonal elements of the precision matrix, which is consistent with [10]. For each 10 genes consisting of the 4 common subnetworks, the expression levels of 100 cell lines were generated from . For the six A-specific and B-specific subnetworks, the precision matrices ( and for nw = 1, …, 6) were generated from the “random” and “band” graph structures, respectively. For each 10 genes consisting of the each six A- and six B-specific subnetworks, we generated expression levels of nA = nB = 50 cell lines from and , respectively. The expression levels of the remaining p − 100 genes were generated from N(0p−100, Ip). Scenarios 2, 3, and 4 were similar to scenario 1 except that the precision matrices for nw = 1,.., 4 and for nw = 1, …, 6 have “cluster”, “scale-free”, and “hub” graph structures for scenarios 2, 3, and 4, respectively. Fig 2 shows the graph structures of the common subnetworks and A- and B-specific subnetworks.

We considered p = 1000 and μ10 = (0.3, 0.3, …, 0.3)T, and the gene regulatory networks were estimated using the lasso method [11], where the response and predictor variables are the expression levels of the target gene and regulator genes, respectively. We evaluated the proposed CIdrgn by comparing it with the statistics SAM-GS and GSCA. The permutation p-values of the statistics SAM-GS and GSCA were computed, and responsive subnetworks were extracted by the permutation p-values with a significance level of 0.05, such as in the case of CIdrgn.

The recall, precision, true negative rate (TNR), F-measure, and accuracy results are given in the column labeled “subnetwork size:10” of Table 1. CIdrgn provides effective results for responsive subnetwork identification overall, although limited differences are observed. In particular, CIdrgn.2 shows effective results in terms of precision, true negative rate, F-measure, and accuracy. The methods show similar results for the true positive rate (i.e., Recall) for six responsive network identifications, while our method provides outstanding performance for the true negative rate (TNR) for four common subnetworks.

We also consider the scenarios for a large-scale subnetwork, that is, each 10 subnetworks consist of 50 genes. The results for the large-scale subnetwork scenarios are presented in the column labeled “subnetwork size:50” in Table 1. The results show that our strategy also provides outstanding performance for large-scale subnetwork situations, where CIdrgn.2 also shows the most effective results for differentially regulated gene network identification.

Monte Carlo simulations: Sample-specific gene network analysis

For the cell line characteristic-specific gene regulatory networks, we considered the same network structures given in Fig 2. For sample-specific gene regulatory strength, we generated precision matrices as follows: (28)

The gene expression levels were determined using the same method applied in the gene network analysis with bulk cell lines except that the expression of genes in six A-specific and six B-specific subnetworks was generated as xα(A) ∼ N(010, Ωα(A)) and xα(B) ∼ N(μ10, Ωα(B)) for α = 1, …, 50. The modulator M = (m1, …, m100)T that describes the characteristics of cell lines was generated from the uniform distribution U(−1, 1).m The cell line characteristic-specific gene network was estimated using SiGN-L1 (https://sign.hgc.jp/signl1/) [7].

Table 2 shows the results for cell line characteristic-specific gene networks.

thumbnail
Table 2. Simulation results: Cell line characteristic analysis.

https://doi.org/10.1371/journal.pone.0286044.t002

CIdrgn provides effective results for condition-responsive subnetwork identification in cell line characteristic-specific gene network analyses. CIdrgn.1 outperforms in scenarios 2 and 4, while CIdrgn.2 shows an effective performance for scenarios 1 and 3.

In short, our proposed strategy provides effective results for differentially regulated gene network identification in bulk cell line gene networks and cell line characteristic-specific gene networks. This finding implies that the incorporation of comprehensive information on gene regulatory systems is crucial for differential gene regulatory network analyses.

Gastric cancer and XELOX-responsive gene regulatory network analysis

Gastric cancer (GC) is the one of the leading causes of cancer-related death worldwide. Although chemotherapy represents the standard treatment for GC, it is not always effective because of drug resistance in cancer cell lines. Uncovering crucial markers and their regulatory networks that play key roles in the mechanisms of acquired drug resistance are a vital issue to enhance chemotherapy efficacy of gastric cancer.

We applied the proposed strategy to identify GC and chemotherapy -responsive gene regulatory networks. We used a publicly available large-scale dataset from the Cancer Dependency Map (DepMap) (https://depmap.org/portal/), where the RNA expression of 19,221 genes and 1,406 cell lines from the CCLE dataset (22Q2) and the drug sensitivity of 4,686 compounds and 578 cell lines from the PRISM repurposing primary screen can be downloaded. Because CIdrgn.2 and CIdrgnss.2 show outstanding performance in simulation studies, we use CIdrgn.2 and CIdrgnss.2 for GC and XELOX-responsive gene regulatory network analysis.

GC-responsive gene regulatory network identification

To identify the GC-responsive gene regulatory networks, we considered 41 GC and 1365 non-gastric cancer (NGC) cell lines as the two phenotypes and then estimated GC gene regulatory networks based on the GC cell lines. The NGC gene network was estimated using 41 randomly selected cell lines from 1365 NGC cell lines, and the construction process was repeated 50 times based on 41 randomly selected NGC cell lines. The permutation GC and NGC networks were also estimated using the 41 cell lines randomly selected from among the 1406 cell lines, and the permutation NGC network construction process was also repeated 50 times. We then extracted subnetworks from the estimated GC and NGC networks and permutation GC and NGC networks. For NGC networks, only subnetworks existing in all 50 iterations were considered as NGC networks (permutation NGC networks). The permutation of network estimation was repeated 500 times. We extracted edges with a size greater than 0.7 and identified the GC/NGS-responsive network based on a permutation p-value with a significance level of 0.01. A total of 106 differentially regulated subnetworks were identified, where the identified 106 sub-networks are gastric cancer cell line specific gene networks, i.e., the sub-networks are not existed in NGC gene networks.

To verify the efficiency of the method, we visualized the identified GC-specific gene regulatory networks based on subnetworks with a size greater than 3 (i.e., a subnetwork with more than three nodes), as shown in Fig 3. The chemokine (C-X-C motif) ligand (CXCL) family (i.e., CXCL1, CXCL2, CXCL3 and CXCL8) is identified as a GC-responsive gene regulatory system. Moreover, the CDH gene family (CDH2 and CDH17) has GC-responsive regulatory structures. The identified GC-responsive markers (i.e., CXCL and CDH gene families) have been studied as GC markers, and their involvement in mechanisms related to GC have been demonstrated in many studies.

CXCL family.

The CXCL family plays an important role in the control of inflammation and angiogenesis because their specific receptors are expressed by both leukocytes and endothelial cells [12]. Chen et al. [13] revealed that the CXCL family represents markers for the development of GC and demonstrated that it is involved in the pathogenesis of GC. In particular, the mRNA expression levels of the CXCL family have been identified as promising prognostic indicators of Epstein-Barr virus-associated GC [14]. Zhou et al. [15] revealed that the CXCL10/CXCR3 axis promotes the invasion of GC via the PI3K/AKT pathway-dependent matrix metalloproteinase production.

CDH gene family.

The CDH gene family has been identified as a marker of tissue morphogenesis, sorting, and cell adhesion, which are important processes for tissue formation and maintenance.

CDH2, encoded by N-cadherin, is involved in the transformation of human GC cells and associated with intracellular events of gastric carcinogenesis and high expression of corresponding transcription factors [16]. GC cell invasion is upregulated by Twist and accompanied by CDH2 [17]. GC invasiveness is inhibited by microRNA-145-5p, which targets CDH2 to suppress epithelial-mesenchymal transition (EMT) [18]. Jiang et al. [18] revealed that CDH2 is involved in the molecular mechanisms that regulate GC cell invasion and migration, especially EMT progression [19]. Aberrant CDH2 expression is involved in the invasion and metastasis of GC [20].

The expression of CDH17 is increased in GC, and its silencing suppresses the metastasis and proliferation of GC cells [21]. CDH17 has been identified as a useful target molecule, and its crucial roles in the comprehensive detection and localization of GC metastases have been revealed earlier [22]. Mutation of E-cadherin (CDH1) represents the most common germline mutation in GC [23]. In particular, mutations in CDH1 lead to hereditary diffuse GC [23, 24].

To identify the biological processes involved in the identified GC-responsive subnetworks, we performed a Kyoto Encyclopedia of Genes and Genome (KEGG) pathway enrichment analysis of the GC-responsive markers. Fig 4 shows the enriched pathways corresponding to the p-value (i.e., -log(p.value)). As shown in Fig 4, the most enriched pathway is the “IL-17 signaling pathway” grouping of the CXCL family (i.e., CXCL8, LCN2, CXCL1, CXCL3, and CXCL2). Furthermore, the “viral protein interaction with cytokine and cytokine receptor” and “cytokine-cytokine receptor interaction” pathways were also enriched for GC-responsive markers. The pathway “epithelial cell signaling in Helicobacter pylori infection” was also identified as an enriched pathway. These findings imply that GC-responsive networks may be involved in EMT-related mechanisms of GC.

thumbnail
Fig 4. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis.

https://doi.org/10.1371/journal.pone.0286044.g004

XELOX-responsive gene regulatory network identification

We also performed a GC chemotherapy-responsive gene regulatory network identification. We focused on the anticancer drugs capecitabine and oxaliplatin, which have been used as standard treatments for gastric and colorectal cancer, and their combination is known as the XELOX regimen. For XELOX, we extracted cell lines that had non-missing values in sensitivities to both capecitabine and oxaliplatin and matched the cell lines with the expression level dataset. In total, 520 cell lines were extracted. We then estimated 520 capecitabine- and capecitabine-sensitive gene networks using SiGN-L1 [7]. We considered the cell lines with the top 100 highest and lowest drug sensitivity values as drug-sensitive and drug-resistant cell lines, respectively. The sensitive and resistant cell lines for both capecitabine and oxaliplatin were defined as XELOX-sensitive and XELOX-resistant cell lines, and 18 XELOX-sensitive and 24 XELOX-resistant cell lines were considered. From the 520 capecitabine-sensitive gene networks, we extracted 18 and 24 networks corresponding to the 18 and 24 XELOX-sensitive and XELOX-resistant cell lines, respectively. The 18 sensitive and 24 resistance permutated networks were also randomly selected from 520 capecitabine-sensitive gene networks. By using the permutation p-value of CIdrgnss.2 with a significance level of 0.01, we identified 62 capecitabine-responsive subnetworks. For oxaliplatin, similar processes were performed, and nine oxaliplatin-responsive subnetworks were identified. Among the genes in the 62 capecitabine- and 9 oxaliplatin-responsive subnetworks, the common genes in the sensitive (resistant) networks were defined as XELOX-sensitive (resistant) markers. We then extracted edges of the XELOX-sensitive (resistance) markers from the capecitabine and oxaliplatin-sensitive (resistance) gene networks and constructed XELOX-sensitive (resistant) networks, where the edge size was given as an average value from the 18 XELOX-sensitive and 24 XELOX-resistant networks.

Fig 5 shows the XELOX-responsive gene regulatory networks for the XELOX-sensitive and XELOX-resistant cell lines. The networks of XELOX-sensitive and XELOX-resistant cell lines showed different gene regulatory structures despite presenting common genes, and their networks were observed for both sensitive and resistant cell lines. The subnetwork of NCKAP1L was the largest XELOX-responsive gene regulatory network. The regulatory structure between NCKAP1L/ASCL1 and the OR family can be considered a XELOX-sensitive gene regulatory system. ASCL1 is an important gene in XELOX-sensitive cell lines, whereas the sub-network disappears from sensitive to resistance cell lines. Furthermore, the subnetwork of the FOS gene family can also be considered a XELOX-sensitive network.

Table 3 shows the identified XELOX-responsive markers and their association with GC/colorectal cancer and chemotherapy.

NCKAP1L.

The Nck-associated protein 1-like (NCKAP1L) gene is known as a DNA methylation-driven gene that promotes the malignancy of cancer cells [25, 26]. The expression of NCKAP1 is considered a marker of colon cancer [27]. NCKAP1 has been identified as a chemotherapy marker of immune checkpoint inhibitors [29]. NCKAP1 is associated with long-term DOX-induced cognitive dysfunction [30] Stark et al. [31] reported an association between NCKAP1 levels and cisplatin-induced apoptosis and cytotoxicity.

ASCL1.

Expression levels of ASCL1 were identified as crucial markers of various GC subtypes, that is, lower expression of ASCL1 was observed in gastric mixed adenocarcinoma, gastrointestinal stromal tumors, and gastric intestinal-type adenocarcinoma [32]. In addition, overexpression of ASCL1 mRNA was observed in gastrointestinal carcinoid tumor cells compared with that in normal tissue [34]. ASCL1 was identified as a DEG between GC and normal tissue and an upregulated transcription factor in the gene network of the DEGs [33]. Furthermore, the downregulated expression of ASCL1 has been identified as a marker of GC and colon cancer [32]. Augustyn et al. [36] revealed that the ASCL1 pathway is a promising therapeutic target of cabozantinib for lung cancer. ASCL1 is involved in the anticancer mechanisms of doxorubicin and octreotide in neuroendocrine cancer [38].

OR family.

OR family activation is a crucial marker for cancer cell growth and progression [39]. The OR family promotes the invasiveness and metastasis of colorectal cancer cells [43]. OR2B6 was detectable in colon and gastric carcinoma cell lines [39]. The long noncoding RNA of OR3A4 promotes metastasis and tumorigenicity in GC [40]. OR51B5 has been identified as a candidate oncogene in scirrhous-type GC [41].

Morita et al. [42] identified OR7C1 as a novel marker of colon cancer-initiating cells and a potent target of immunotherapy. Activation of the OR family members induces apoptosis and reduces cell migration and proliferation in colorectal cancer cells [39]. The OR family (i.e., OR1D2, OR4F15, and OR1A1) was also disrupted in colorectal cancer, and the importance of OR activity-associated genes in colorectal cancer was identified by Li et al. [43]. Choi et al. [44] found that the OR family is an important target for drug discovery and represents a crucial anticancer target.

FOS family (AP-1 transcription factor family).

The FOS gene family is involved in malignant transformation in several types of cancer, and its members (FOS, FOSB, FOSL1, and FOSL2) play crucial roles in cell proliferation and cancer development [45]. Abnormal expression of FOSB is a signature of tumor progression and poor survival. Overexpression of FOSB suppresses cell proliferation, clone formation, and migration, whereas silencing of FOSB leads to proliferation, clone formation, and migration in GC cell lines [46]. Tang et al. [46] suggested that FOSB could be a useful biomarker for the diagnosis and prognosis of GC. FOS family members have been identified as novel prognostic markers and therapeutic targets in colorectal cancer [45, 47]. The FOS family is involved in the antitumor efficacy of oxaliplatin, and the importance of the effect of oxaliplatin ‘on FOS signaling has been revealed [48, 49].

Dong et al. [50] demonstrated that upregulated FOS expression is a marker of low sensitivity to eribulin. Javellana et al. [51] revealed potentially clinically relevant roles of increased FOS family expression in driving the initial responses of tumors to carboplatin and paclitaxel and suggested that the FOS family will be a useful druggable target to counteract chemotherapy resistance.

For the XELOX-sensitive and XELOX-resistant subnetworks in Fig 5, we performed a Gene Ontology (GO) enrichment analysis. Fig 6 shows the significant GO terms corresponding to a p-value less than 0.01, where the enriched pathways of XELOX-sensitive and XELOX-resistant subnetworks are shown in red and blue, respectively. As shown in Fig 6, the XELOX-sensitive (red) and XELOX-resistant (blue) markers are enriched in different pathways, although the networks have common genes.

thumbnail
Fig 6. Gene Ontology (GO) pathway analysis of XELOX-sensitive (red) and XELOX-resistant (blue) specific markers.

https://doi.org/10.1371/journal.pone.0286044.g006

The most enriched pathway of the XELOX-sensitive markers is the “olfactory receptor activity” grouping of the OR family (i.e., OR4E2, OR4E1, OR10G2, OR52R1, OR51E2, OR51T1). Furthermore, the pathways related to “DNA binding”, “steroid hormone”, and “G protein-coupled receptor” were also identified as XELOX-sensitive gene markers. These findings imply that the XELOX-sensitive gene regulatory network is dominated by G protein-coupled receptors because the OR family belongs to the superfamily of G protein-coupled receptors [52]. For XELOX-resistance markers, the most enriched pathway is “neurotransmitter transport”, and it consists SV2B and CHAT, which have crucial molecular functions as drug targets [53].

Our results and those of previous studies show that the CXC family of chemokines may play a crucial role in the progression of GC. Furthermore, identifying the EMT-related mechanism involving the CDH gene family is essential for understanding the metastasis, proliferation, and development of GC and revealing the mechanism of the gene family in EMT-associated GC is vital to understanding the progression of GC. The GS- and anti-cancer drugs response- related mechanisms of ASCL1 that was identified as XELOX sensitive-specific hub gene were revealed in previous studies. Although its mechanisms of acquiring gastric cancer drug resistance have not yet been clearly demonstrated, our results and literatures suggest that the loss of activities and molecular interactions of the OR family with the ASCL1 and FOS families may represent a molecular mechanism underlying acquired drug resistance in GC cell lines.

Conclusions

In this study, we introduce a novel strategy for differential gene regulatory network analysis. To identify condition-responsive gene regulatory networks describing biological condition-specific regulatory characteristics, we propose a computational method that measures the dissimilarity of subnetworks based on not only gene expression, but also regulatory effects between genes and edge structures. We also extended the proposed method to cell line characteristic-specific gene network analysis. Our method performs differential gene regulatory network analysis based on comprehensive information of the network structure; thus, we can effectively identify biological condition-responsive gene networks that cannot be detected using methods based on differences in gene expression levels. Monte Carlo simulations showed that our strategy demonstrates outstanding performance for responsive gene network identification in various network structure scenarios. It implies that incorporation of various network structures not only expression levels of genes is crucial to identify responsive networks.

We applied the proposed CIdrgn to the DepMap dataset and performed an analysis of GC and chemotherapy (XELOX) -responsive gene regulatory networks. Our strategy identified the molecular interplays of CXCL family (i.e., CXCL1, CXCL2, CXCL3 and CXCL8) and CDH gene family as GS responsive networks. For the XELOX-responsive gene networks, the regulatory structure between NCKAP1L/ASCL1 and the OR family are identified. The identified markers from the GC and XELOX-responsive network analysis provide strong evidence for the mechanisms related to GC, colorectal cancer, and various chemotherapies.

Our results suggest that the CXC family of chemokines and CDH genes may play a crucial role in GC tumor invasion, metastasis, and progression. Moreover, our findings indicate that revealing the mechanisms associated with genes in EMT-associated GC will be key to understanding GC progression. We also suggest that the loss of activity of the OR family with the ASCL1/FOS family may lead to drug resistance in cell lines, and inducers of this gene family can provide crucial information to improve the chemotherapy efficiency of XELOX.

Acknowledgments

This research used the computational resources of supercomputer Fugaku provided by the RIKEN Center for Computational Science and the Super Computer System, Human Genome Center, Institute of Medical Science, University of Tokyo.

References

  1. 1. Dinu I, Potter JD, Mueller T, Liu Q, Adewale AJ, et al. (2007) Improving gene set analysis of microarray data by SAM-GS. BMC Bioinformatics 8: 242. pmid:17612399
  2. 2. Barry WT, Nobel AB, Wright FA. (2008) A statistical framework for testing functional categories in microarray data. Ann Appl Stat 2: 286–315.
  3. 3. Ho YY, Cope L, Dettling M, Parmigiani G. (2007) Statistical methods for identifying differentially expressed gene combinations. Gene Func Anal 171–191. pmid:18314583
  4. 4. Tesson BM, Breitling R, Jansen RC (2010) DiffCoEx: a simple and sensitive method to find differentially coexpressed gene modules. BMC Bioinformatics 11: 497. pmid:20925918
  5. 5. Guo Z, Wang L, Li Y, Gong X, Yao C, et al. (2007) Edge-based scoring and searching method for identifying condition-responsive protein-protein interaction sub-network. Bioinformatics. 23(16): 2121–2128. pmid:17545181
  6. 6. Choi Y, Kendziorski C (2009) Statistical methods for gene set co-expression analysis. Bioinformatics. 25(21): 2780–2786. pmid:19689953
  7. 7. Shimamura T, Imoto S, Shimada Y, Hosono Y, Niida A, et al. (2011) A novel network profiling analysis reveals system changes in epithelial-mesenchymal transition. PLoS ONE. 6(6): e20804. pmid:21687740
  8. 8. Li Y, Luo P, Wu C (2014) A new network node similarity measure method and its applications. arXiv: 1403.4303. https://doi.org/10.48550/arXiv.1403.4303.
  9. 9. Hastie T, Tibshirani R (1993) Varying-coefficient models. J Royal Stat Soc Ser B. 55(4): 757–796.
  10. 10. He H, Cao S, Zhang JG, Shen H, Wang YP, Deng HW (2019) A Statistical Test for Differential Network Analysis Based on Inference of Gaussian Graphical Model. Sci Rep. 9(1): 10863. pmid:31350445
  11. 11. Tibshirani R (1996) Regression shrinkage and selection via the lasso. J Roy Stat Soc SerB 58: 267–288.
  12. 12. Romagnani P, Lasagni L, Annunziato F, Serio M, Romagnani S (2004) CXC chemokines: the regulatory link between inflammation and angiogenesis. Trends Immunol 25(4): 201–209. pmid:15039047
  13. 13. Chen X, Chen R, Jin R, Huang Z (2020) The role of CXCL chemokine family in the development and progression of gastric cancer. Int J Clin Exp Pathol 13(3): 484–492. pmid:32269686
  14. 14. Mu L, Hu S, Li G, Wu P, Ren C, et al. (2022) Characterization of the Prognostic Values of CXCL Family in Epstein-Barr Virus Associated Gastric Cancer. Oxid Med Cell Longev 2022: 2218140. pmid:35693706
  15. 15. Zhou H, Wu J, Wang T, Zhang X, Liu D (2016) CXCL10/CXCR3 axis promotes the invasion of gastric cancer via PI3K/AKT pathway-dependent MMPs production. Biomed Pharmacother 82: 479–488. pmid:27470388
  16. 16. Wang BJ, Zhang ZQ, Ke Y (2006) Conversion of cadherin isoforms in cultured human gastric carcinoma cells. World J Gastroenterol 12(6): 966–970. pmid:16521229
  17. 17. Yang Z, Zhang X, Gang H, Li X, Li Z, et al. (2007) Up-regulation of gastric cancer cell invasion by Twist is accompanied by N-cadherin and fibronectin expression. Biochem Biophys Res Commun. 358(3): 925–930. pmid:17512904
  18. 18. Jiang SB, He XJ, Xia YJ, Hu WJ, Luo JG, et al. (2016) MicroRNA-145-5p inhibits gastric cancer invasiveness through targeting N-cadherin and ZEB2 to suppress epithelial-mesenchymal transition. Onco Targets Ther 9: 2305–2315. pmid:27143926
  19. 19. Jiang Y, Liu X, Shen R, Gu X, Qian W (2021) Fbxo21 regulates the epithelial-to-mesenchymal transition through ubiquitination of Nr2f2 in gastric cancer. J Cancer 12(5): 1421–1430. pmid:33531987
  20. 20. Zhu Y, Wu J, Ma W, Zhang H, Wang D (2007) Expression of TGF-β1, Snail, E-cadherin and N-cadherin in gastric cancer and its significance. Chinese J Clin Onc 4: 384–389.
  21. 21. Li F, Wan B, Li XQ (2022) Expression Profile and Prognostic Values of CDH Family Members in Lung Adenocarcinoma. Dis Markers 2022: 9644466. pmid:35242247
  22. 22. Matsusaka K, Ushiku T, Urabe M, Fukuyo M, Abe H, et al. (2016) Coupling CDH17 and CLDN18 markers for comprehensive membrane-targeted detection of human gastric cancer. Oncotarget 7(39): 64168–64181. pmid:27580354
  23. 23. Luo W, Fedda F, Lynch P, Tan D (2018) CDH1 Gene and Hereditary Diffuse Gastric Cancer Syndrome: Molecular and Histological Alterations and Implications for Diagnosis And Treatment. Front Pharmacol 9: 1421. pmid:30568591
  24. 24. Kaurah P, MacMillan A, Boyd N, Senz J, De Luca A, et al. (2007) Founder and recurrent CDH1 mutations in families with hereditary diffuse gastric cancer. JAMA. 297(21): 2360–2372. pmid:17545690
  25. 25. Hu M, Xie J, Hou H, Liu M, Wang J (2022) Prognostic Value of DNA Methylation-Driven Genes in Clear Cell Renal Cell Carcinoma: A Study Based on Methylation and Transcriptome Analyses. Dis Markers. 2020: 8817652.
  26. 26. Zhong XP, Kan A, Ling YH, Lu LH, Mei J, et al. (2019) NCKAP1 improves patient outcome and inhibits cell growth by enhancing Rb1/p53 activation in hepatocellular carcinoma. Cell Death Dis 10(5): 369. pmid:31068575
  27. 27. Chen J, Ge J, Zhang W, Xie X, Zhong X, Tang S (2022) NCKAP1 is a Prognostic Biomarker for Inhibition of Cell Growth in Clear Cell Renal Cell Carcinoma. Front Genet 13: 764957. pmid:35957696
  28. 28. Chu S, Wang H, Yu M (2017) A putative molecular network associated with colon cancer metastasis constructed from microarray data. World J Surg Oncol 15(1): 115. pmid:28629469
  29. 29. Pan S, Zhan Y, Chen X, Wu B, Liu B (2019) Bladder Cancer Exhibiting High Immune Infiltration Shows the Lowest Response Rate to Immune Checkpoint Inhibitors. Front Oncol 9: 1101. pmid:31737562
  30. 30. McAlpin BR, Mahalingam R, Singh AK, Dharmaraj S, Chrisikos TT, et al. (2022) HDAC6 inhibition reverses long-term doxorubicin-induced cognitive dysfunction by restoring microglia homeostasis and synaptic integrity. Theranostics 12(2): 603–619. pmid:34976203
  31. 31. Stark AL, Hause RJ, Gorsic LK, Antao NN, Wong SS, et al. (2014) Protein quantitative trait loci identify novel candidates modulating cellular response to chemotherapy. PLoS Genet 10(4): e1004192. pmid:24699359
  32. 32. Wang CY, Shahi P, Huang JT, Phan NN, Sun Z, et al. (2017) Systematic analysis of the achaete-scute complex-like gene signature in clinical cancer patients. Mol Clin Oncol 6(1): 7–18. pmid:28123722
  33. 33. Liu Y, Hu X, Pan Z, Jiang Y, Guo D, et al. (2021) Functional Modules Analysis and Hub Gene Prognostic Values Evaluation Based on CoExpression Network in Gastric Cancer. Preprint from Research Square.
  34. 34. Nakakura EK, Sriuranpong VR, Kunnimalaiyaan M, Hsiao EC, Schuebel KE, et al. (2005) Regulation of neuroendocrine differentiation in gastrointestinal carcinoid tumor cells by notch signaling. J Clin Endocrinol Metab 90(7): 4350–4356. pmid:15870121
  35. 35. Tendler S, Kanter L, Lewensohn R, Ortiz-Villalon C, Viktorsson K, et al. (2020) The prognostic implications of Notch1, Hes1, Ascl1, and DLL3 protein expression in SCLC patients receiving platinum-based chemotherapy. PLoS One 15(10): e0240973. pmid:33104707
  36. 36. Augustyn A, Borromeo M, Wang T, Fujimoto J, Shao C, et al. (2014) ASCL1 is a lineage oncogene providing therapeutic targets for high-grade neuroendocrine lung cancers. Proc Natl Acad Sci U S A. 111(41): 14788–14793. pmid:25267614
  37. 37. Chen SJ, Chen LH, Yeh YM, Lin CK, Lin PC, et al. (2021) Targeting lysosomal cysteine protease cathepsin S reveals immunomodulatory therapeutic strategy for oxaliplatin-induced peripheral neuropathy. Theranostics 11(10): 4672–4687. pmid:33754020
  38. 38. Xiao Y, Jaskula-Sztul R, Javadi A, Xu W, Eide J, et al. (2012) Co-delivery of doxorubicin and siRNA using octreotide-conjugated gold nanorods for targeted neuroendocrine cancer therapy. Nanoscale 4(22): 7185–7193. pmid:23070403
  39. 39. Weber L, Al-Refae K, Ebbert J, Jagers P, Altmuller J, et al. (2017) Activation of odorant receptor in colorectal cancer cells leads to inhibition of cell proliferation and apoptosis. PLoS ONE 12: e0172491. pmid:28273117
  40. 40. Guo X, Yang Z, Zhi Q, Wang D, Guo L, et al. (2016) Long noncoding RNA OR3A4 promotes metastasis and tumorigenicity in gastric cancer. Oncotarget 7(21): 30276–30294. pmid:26863570
  41. 41. Sera T, Sakuma T, Maruo K, Tsujio G, Yamamoto Y, et al. (2022) Candidate Oncogenes, ARHGAP4, NOS3, and OR51B5, for the Development of Scirrhous-type Gastric Cancer. Anticancer Res 42(11): 5195–5203. pmid:36288877
  42. 42. Morita R, Hirohashi Y, Torigoe T, Ito-Inoda S, Takahashi A, et al. (2016) Olfactory Receptor Family 7 Subfamily C Member 1 Is a Novel Marker of Colon Cancer-Initiating Cells and Is a Potent Target of Immunotherapy. Clin Cancer Res 22(13): 3298–3309. pmid:26861454
  43. 43. Li Z, Yu D, Gan M, Shan Q, Yin X, et al. (2015) A genome-wide assessment of rare copy number variants in colorectal cancer. Oncotarget 6(28): 26411–26423. pmid:26315111
  44. 44. Choi YR, Shim J, Park JH, Kim YS, Kim MJ (2021) Discovery of Orphan Olfactory Receptor 6M1 as a New Anticancer Target in MCF-7 Cells by a Combination of Surface Plasmon Resonance-Based and Cell-Based Systems. Sensors (Basel). 21(10): 3468. pmid:34065710
  45. 45. Cervantes-Madrid DL, Nagi S, Gustafsson AA (2017) FosB transcription factor regulates COX-2 expression in colorectal cancer cells without affecting PGE2 expression. Oncol Lett 13(3): 1411–1416. pmid:28454270
  46. 46. Tang C, Jiang Y, Shao W, Shi W, Gao X, et al. (2016) Abnormal expression of FOSB correlates with tumor progression and poor survival in patients with gastric cancer. Int J Oncol 49(4): 1489–1496. pmid:27633497
  47. 47. Wang L, Sun Y, Jiang M, Zhang S, Wolfl S (2009) FOS proliferating network construction in early colorectal cancer (CRC) based on integrative significant function cluster and inferring analysis. Cancer Invest 27(8): 816–824. pmid:19557575
  48. 48. Song L, Du A, Xiong Y, Jiang J, Zhang Y, et al. (2016) γ-Aminobutyric acid inhibits the proliferation and increases oxaliplatin sensitivity in human colon cancer cells. Tumour Biol 37(11): 14885–14894. pmid:27644246
  49. 49. Alian OM, Azmi AS, Mohammad RM (2012) Network insights on oxaliplatin anti-cancer mechanisms. Clin Transl Med 1(1): 26. pmid:23369220
  50. 50. Dong X, Yang Y, Xu G, Tian Z, Yang Q, et al. (2022) The initial expression alterations occurring to transcription factors during the formation of breast cancer: Evidence from bioinformatics. Cancer Med 11(5): 1371–1395. pmid:35037412
  51. 51. Javellana M, Eckert MA, Heide J, Zawieracz K, Weigert M, et al. (2022) Neoadjuvant Chemotherapy Indvuces Genomic and Transcriptomic Changes in Ovarian Cancer. Cancer Res 82(1): 169–176. pmid:34737212
  52. 52. Cong X, Ren W, Pacalon J, Xu R, Xu L, et al. (2022) Large-Scale G Protein-Coupled Olfactory Receptor-Ligand Pairing. ACS Cent Sci 8(3): 379–387. pmid:35350604
  53. 53. Gether U, Andersen PH, Larsson OM, Schousboe A (2006) Neurotransmitter transporters: molecular function of important drug targets. Trends Pharmacol Sci 27(7): 375–383. pmid:16762425