Comprehensive information-based differential gene regulatory networks analysis (CIdrgn): Application to gastric cancer and chemotherapy-responsive gene network identification

Biological condition-responsive gene network analysis has attracted considerable research attention because of its ability to identify pathways or gene modules involved in the underlying mechanisms of diseases. Although many condition-specific gene network identification methods have been developed, they are based on partial or incomplete gene regulatory network information, with most studies only considering the differential expression levels or correlations among genes. However, a single gene-based analysis cannot effectively identify the molecular interactions involved in the mechanisms underlying diseases, which reflect perturbations in specific molecular network functions rather than disorders of a single gene. To comprehensively identify differentially regulated gene networks, we propose a novel computational strategy called comprehensive analysis of differential gene regulatory networks (CIdrgn). Our strategy incorporates comprehensive information on the networks between genes, including the expression levels, edge structures and regulatory effects, to measure the dissimilarity among networks. We extended the proposed CIdrgn to cell line characteristic-specific gene network analysis. Monte Carlo simulations showed the effectiveness of CIdrgn for identifying differentially regulated gene networks with different network structures and scales. Moreover, condition-responsive network identification in cell line characteristic-specific gene network analyses was verified. We applied CIdrgn to identify gastric cancer and itsf chemotherapy (capecitabine and oxaliplatin) -responsive network based on the Cancer Dependency Map. The CXC family of chemokines and cadherin gene family networks were identified as gastric cancer-specific gene regulatory networks, which was verified through a literature survey. The networks of the olfactory receptor family with the ASCL1/FOS family were identified as capecitabine- and oxaliplatin sensitive -specific gene networks. We expect that the proposed CIdrgn method will be a useful tool for identifying crucial molecular interactions involved in the specific biological conditions of cancer cell lines, such as the cancer stage or acquired anticancer drug resistance.


Introduction
The complex mechanisms underlying some diseases states cannot be understood by gene expression level analyses alone because some diseases are caused by disturbances in the specific functions of molecular networks rather than disorders in a single gene. One of the most powerful emerging techniques for biomedical research is heterogeneous gene regulatory network analysis, especially for responsive gene sets and module identification, and such networks have attracted considerable research attention because of their ability to reveal the complex mechanisms of diseases. Many studies have focused on the expression levels of genes and developed statistical methods for identifying differentially expressed genes (DEGs) or gene sets [1][2][3]. Computational methods based on correlations between genes have also been proposed to identify condition-responsive subnetworks and gene modules. Tesson et al. [4] focused only on the coexpression of genes and developed the method DiffCoEx to identify correlation pattern changes between multiple conditions. Guo et al. [5] developed the responsive score based on covariance of genes and then proposed the method for searching responsive networks. Although various computational approaches have been developed to identify significant gene sets describing phenotype-specific characteristics, these methods are based only on partial or incomplete information (e.g., correlation, covariance of expression levels of genes) on gene regulatory networks. Thus, biological condition-responsive gene networks could not be effectively identified.
Our study focuses on the drawbacks of the previous studies. In order to consider various aspects of network structure and effectively identify responsive networks, we propose a novel computational strategy that incorporates comprehensive information of network structure unlike the existing studies, called comprehensive information-based differential gene regulatory network analysis (CIdrgn). We consider the gene regulatory network structure based on not only expression levels of genes but also edge structures and regulatory effects of regulator genes to their target genes and incorporate the comprehensive information into a dissimilarity measure to compare the gene regulatory networks. We then proposed a novel statistic strategy to identify differentially regulated gene networks across phenotypes or biological conditions and extend the proposed CIdrgn to cell line characteristic-specific gene regulatory network analysis. To the best of our knowledge, this is the first study on responsive network identification based on comprehensive information of network structure.
We demonstrate the effectiveness of the proposed CIdrgn scheme through Monte Carlo simulations. Our strategy showed outstanding performance for identifying differentially regulated gene networks with various structures and scales. We applied our strategy to the Cancer Dependency Map (DepMap) dataset and performed gastric cancer (GC) and chemotherapy (capecitabine and oxaliplatin)-responsive gene network analyses. The identified responsive networks and their members (genes) represent candidate biomarkers for mechanisms related to gastric/colorectal cancer and various chemotherapies. Although biological knowledge about the identified subnetworks and markers was not incorporated into the differential gene regulatory network analysis, our data-driven strategy provides biologically reliable results for GC and its chemotherapy-responsive gene network analysis. Our results suggested that The CXC family of chemokines and cadherin (CDH) genes may play crucial roles in tumor invasion, metastasis, and progression of GC. In addition, the loss of activity of the olfactory receptor (OR) family with the ASCL1/FOS family may lead to drug resistance in cell lines and inducing these gene families may provide vital clues to enhance the chemotherapy efficiency of capecitabine and oxaliplatin.
The remainder of this paper is organized as follows. In the Methods section, we introduce a novel strategy for the differential gene regulatory network analysis. In the Monte Carlo simulation section, we present the results of the simulation studies. In the GC and XELOXresponsive gene regulatory network analysis section, we describe the results of GC and its chemotherapy-responsive network analysis. In the Discussion section, we provide the conclusions.

Methods
Suppose that X ¼ ðx 1 ; . . . ; x n Þ T 2 R n�p is an n × p data matrix that describes the expression of p possible regulators that control target gene transcription y ' 2 R n ; ' ¼ 1; . . . ; q for n cell lines. To estimate the gene regulatory network, we considered the following linear regression model: where β ℓj is the regression coefficient that represents the effect of each regulator x j on its target y ℓ and ϵ ℓ = (� ℓ1 , . . ., � ℓn ) T is a random error vector. The network can be represented by a weighted graph G = (V, E, W), where V is the set of vertices corresponding to p genes and E 2 V × V is the set of edges, where (i, j) 2 E indicates a link between vertices i and j (i.e., genes i and j). W = (w ij ) is the edge weight between vertices i and j which can be estimated usingb ij andb ji .

Previous studies
Significance analysis of microarray for gene sets (SAM-GS). To identify statistically significant gene sets across phenotypes A and B, Dinu et al. [1] proposed the SAM-GS method by extending the SAM t-statistic. The SAM-GS statistic is given as follows: where � x Aj and � x Bj are the averages of the expression levels of gene j in phenotypes A and B, respectively; s 0 is a tuning parameter; and s j is the gene-specific scatter. This scatter is given as follows: sðjÞ ¼ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffiffi ffi ffi ffi ffi af where n A and n B are the numbers of cell lines in phenotypes A and B, respectively, and a = (1/n A + 1/n B )/(n A + n B − 2). The SAM-GS statistic measures the difference in expression levels of a gene set across phenotypes A and B. Gene set co-expression analysis (GSCA). Choi and Kendziorski [6] proposed GSCA a statistical method to identify differentially co-expressed genes. For genes in a subnetwork G, GSCA computes pairwise correlations for all jVj 2 À � gene pairs, where |V| is the number of genes in V. The statistic of the GSCA used to measure the dispersion of correlations between phenotypes A and B is given as follows: ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffiffi ffi ffiffi 1 jVjðjVj À 1Þ=2 where c A kj and c B kj are the correlations between the k th and j th genes for the phenotypes A and B, respectively.

Comprehensive information-based differentially regulated gene network analysis
In this study, we proposed a novel differential gene regulatory network analysis strategy that can effectively identify biological condition-responsive subnetworks based on gene expression levels and subnetwork structures. Network structures are evaluated based on regulatory effects between genes and the edge structure of genes, and the following statistics are proposed to measure the dissimilarity of the subnetworks.
• Dissimilarity based on regulatory effects between genes We define the following regulatory effect of the j th regulator gene on the ℓ th target gene in phenotypes A and B, consistent with a study conducted earlier [7]: We propose the following statistic to describe dissimilarity of the regulatory effect of the j th gene: The large value γ j (A, B) indicates the j th gene has different expression levels and size of edges connected its neighborhoods between phenotypes A and B. We then propose the following statistic to measure dissimilarity of regulatory effects of a subnetwork G: The large value of Γ indicates that the genes consisting of G have different regulatory effects on their target genes across the phenotypes, thus the network corresponding the large Γ can be considered as the responsive network having phenotype specific regulatory effects. • Dissimilarity of the edge structure of nodes The genes linked in the networks may have similar biological functions; thus, node (gene) similarity can be defined using the Jaccard index. We propose the following statistic for the node similarity of the j th gene between phenotypes A and B in line with Li et al. [8]: where N Aj and N Aj are the sets of nodes that are directly connected to the j th gene within phenotypes A and B, respectively. If the gene have similar target and regulator genes between phenotypes A and B, than we consider that the gene is not phenotype specific feature. Thus, we propose the following Jaccard distance metric to measure the dissimilarity of the edge structure of the j th gene: We then propose the following statistic to describe the dissimilarity of edge structures in a subnetwork: • Dissimilarity based on adjusted regulatory effects We also consider the following statistic based on the adjusted regulatory effect using the Jaccard distance: To comprehensively measure the dissimilarity of a subnetwork across biological conditions or phenotypes, we propose the following statistic that incorporates not only expression levels of genes but also edge structures and regulatory effects into the differential gene regulatory network analysis, The large values of the statistics CIdrgn.1 and CIdrgn.2 indicate that the subnetworks show different regulatory structures for phenotypes A and B.
Responsive subnetwork extract. To extract biological condition-responsive subnetworks, we computed the significance of the proposed statistics based on the permutation framework.
We first compute the statistics Γ, Λ, and Γ Λ for the subnetworks. Then, we generate permutation samples S ðpmÞ For notational compactness, in the remainder of this section, we shall write the notation in Eq (14)  where CIdrgn (pm) .1 and CIdrgn (pm) .2 are the statistics computed by the pm th permutation sample and I(�) is an indicator function. The small permutation p-value indicates that networks G (A) and G(B) show significantly different regulatory structures. The algorithm for CIdrgn is given in Algorithm1. For the dissimilarity measure CIdrgn.2, similar procedures were applied to compute the permutation p.value CIdrgn.2 and identify the responsive subnetworks. The proposed statistic is based on comprehensive information of the gene regulatory network, that is, the dissimilarity of node edge structures, regulatory effects, and expression levels; thus, we can effectively extract biological condition-specific gene networks.

CIdrgn for cell line characteristic-specific gene networks
We consider cell line characteristic-specific gene regulatory networks that can be estimated using the following varying coefficient model [9]: where β jℓ (m α ) is the varying coefficient of the j th regulator gene x j on ℓ th target gene y ℓ for the indicates the link between vertices i and j in the network of the α th cell line. We also propose the method for identifying responsive subnetwork from cell line characteristic-specific gene networks.
• Dissimilarity based on the regulatory effect of the sample-specific gene network For the sample-specific gene network, the regulatory effect of the j th regulator gene on ℓ th target gene in the α th cell line is given as follows: where x αj is an expression level value of the j th gene in the α th cell line. We define the regulatory effect of phenotypes A and B as follows: We then propose the following statistic to measure dissimilarity of the regulatory effects of genes in a subnetwork G α , • Dissimilarity of the node edge structures for sample-specific gene network We consider [ a2A N aj and [ a2B N aj as the sets of edges of the j th gene in phenotypes A and B, respectively. We then define the following node similarity: where N αj is the sets of nodes that are connected to the j th gene in the network of the α th cell line. The Jaccard distance of the j th node in the cell line characteristic-specific gene network is given as follows: Similar to (10), the dissimilarity measure of the edge structure is given as follows: We then proposed the following statistics to identify differentially regulated gene networks in the cell line characteristic-specific gene network analysis. where For the cell line characteristic-specific gene network, the condition-responsive subnetworks are extracted based on the permutation p-value of CIdrgn ss .1 and CIdrgn ss .2, which are consistent with p.value CIdrgn.1 and p.value CIdrgn.2 .

Monte Carlo simulations
Monte Carlo simulations were conducted to investigate the performance of the proposed CIdrgn. Suppose that we have two phenotypes A and B and 10 subnetworks consisting of 10 genes, where the sample sizes of phenotypes A and B are n A = 50 and n B = 50, respectively. We consider four common subnetworks of the two phenotypes A and B, six A-specific and six Bspecific subnetworks. The expression levels of the p genes in 100 (= n A + n B ) cell lines were generated using the following procedure. In scenario 1, we generated four precision matrices O nw c for nw = 1,.., 4 common subnetworks from "random'' graph structure using huge.generator in R package Huge, where the off-diagonal elements of the precision matrix were set to 0.5 and the positive number 0.2 was added to the diagonal elements of the precision matrix, which is consistent with [10]. For each 10 genes consisting of the 4 common subnetworks, the expression levels of 100 cell lines were generated from Nð0 10 ; O À 1 c Þ. For the six A-specific and B-specific subnetworks, the precision matrices (O nw A and O nw B for nw = 1, . . ., 6) were generated from the "random" and "band" graph structures, respectively. For each 10 genes consisting of the each six A-and six B-specific subnetworks, we generated expression levels of n A = n B = 50 cell lines from Nð0 10 ; O À 1 A Þ and Nðμ 10 ; O À 1 B Þ, respectively. The expression levels of the remaining p − 100 genes were generated from N(0 p−100 , I p ). Scenarios 2, 3, and 4 were similar to scenario 1 except that the precision matrices O nw c for nw = 1,.., 4 and O nw A for nw = 1, . . ., 6 have "cluster", "scale-free", and "hub" graph structures for scenarios 2, 3, and 4, respectively. Fig 2  shows the graph structures of the common subnetworks and A-and B-specific subnetworks.
We considered p = 1000 and μ 10 = (0.3, 0.3, . . ., 0.3) T , and the gene regulatory networks were estimated using the lasso method [11], where the response and predictor variables are the expression levels of the target gene and regulator genes, respectively. We evaluated the proposed CIdrgn by comparing it with the statistics SAM-GS and GSCA. The permutation p-values of the statistics SAM-GS and GSCA were computed, and responsive subnetworks were extracted by the permutation p-values with a significance level of 0.05, such as in the case of CIdrgn.
The recall, precision, true negative rate (TNR), F-measure, and accuracy results are given in the column labeled "subnetwork size:10" of Table 1. CIdrgn provides effective results for responsive subnetwork identification overall, although limited differences are observed. In particular, CIdrgn.2 shows effective results in terms of precision, true negative rate, F-measure, and accuracy. The methods show similar results for the true positive rate (i.e., Recall) for six responsive network identifications, while our method provides outstanding performance for the true negative rate (TNR) for four common subnetworks.
We also consider the scenarios for a large-scale subnetwork, that is, each 10 subnetworks consist of 50 genes. The results for the large-scale subnetwork scenarios are presented in the column labeled "subnetwork size:50" in Table 1. The results show that our strategy also provides outstanding performance for large-scale subnetwork situations, where CIdrgn.2 also shows the most effective results for differentially regulated gene network identification.

Monte Carlo simulations: Sample-specific gene network analysis
For the cell line characteristic-specific gene regulatory networks, we considered the same network structures given in Fig 2. For sample-specific gene regulatory strength, we generated precision matrices as follows: The gene expression levels were determined using the same method applied in the gene network analysis with bulk cell lines except that the expression of genes in six A-specific and six B-specific subnetworks was generated as x α (A) * N(0 10 , O α (A)) and x α (B) * N(μ 10 , O α (B)) for α = 1, . . ., 50. The modulator M = (m 1 , . . ., m 100 ) T that describes the characteristics of cell lines was generated from the uniform distribution U (−1, 1).m The cell line characteristic-specific gene network was estimated using SiGN-L1 (https://sign.hgc.jp/signl1/) [7]. Table 2 shows the results for cell line characteristic-specific gene networks. CIdrgn provides effective results for condition-responsive subnetwork identification in cell line characteristic-specific gene network analyses. CIdrgn.1 outperforms in scenarios 2 and 4, while CIdrgn.2 shows an effective performance for scenarios 1 and 3.
In short, our proposed strategy provides effective results for differentially regulated gene network identification in bulk cell line gene networks and cell line characteristic-specific gene networks. This finding implies that the incorporation of comprehensive information on gene regulatory systems is crucial for differential gene regulatory network analyses.

Gastric cancer and XELOX-responsive gene regulatory network analysis
Gastric cancer (GC) is the one of the leading causes of cancer-related death worldwide. Although chemotherapy represents the standard treatment for GC, it is not always effective because of drug resistance in cancer cell lines. Uncovering crucial markers and their regulatory networks that play key roles in the mechanisms of acquired drug resistance are a vital issue to enhance chemotherapy efficacy of gastric cancer.
We applied the proposed strategy to identify GC and chemotherapy -responsive gene regulatory networks. We used a publicly available large-scale dataset from the Cancer Dependency Map (DepMap) (https://depmap.org/portal/), where the RNA expression of 19,221 genes and 1,406 cell lines from the CCLE dataset (22Q2) and the drug sensitivity of 4,686 compounds and 578 cell lines from the PRISM repurposing primary screen can be downloaded. Because CIdrgn.2 and CIdrgn ss .2 show outstanding performance in simulation studies, we use CIdrgn.2 and CIdrgn ss .2 for GC and XELOX-responsive gene regulatory network analysis.

GC-responsive gene regulatory network identification
To identify the GC-responsive gene regulatory networks, we considered 41 GC and 1365 nongastric cancer (NGC) cell lines as the two phenotypes and then estimated GC gene regulatory networks based on the GC cell lines. The NGC gene network was estimated using 41 randomly selected cell lines from 1365 NGC cell lines, and the construction process was repeated 50 times based on 41 randomly selected NGC cell lines. The permutation GC and NGC networks were also estimated using the 41 cell lines randomly selected from among the 1406 cell lines, and the permutation NGC network construction process was also repeated 50 times. We then extracted subnetworks from the estimated GC and NGC networks and permutation GC and NGC networks. For NGC networks, only subnetworks existing in all 50 iterations were considered as NGC networks (permutation NGC networks). The permutation of network estimation was repeated 500 times. We extracted edges with a size greater than 0.7 and identified the GC/ NGS-responsive network based on a permutation p-value with a significance level of 0.01. A total of 106 differentially regulated subnetworks were identified, where the identified 106 subnetworks are gastric cancer cell line specific gene networks, i.e., the sub-networks are not existed in NGC gene networks.
To verify the efficiency of the method, we visualized the identified GC-specific gene regulatory networks based on subnetworks with a size greater than 3 (i.e., a subnetwork with more than three nodes), as shown in Fig 3. The chemokine (C-X-C motif) ligand (CXCL) family (i.e., CXCL1, CXCL2, CXCL3 and CXCL8) is identified as a GC-responsive gene regulatory system. Moreover, the CDH gene family (CDH2 and CDH17) has GC-responsive regulatory structures. The identified GC-responsive markers (i.e., CXCL and CDH gene families) have been studied as GC markers, and their involvement in mechanisms related to GC have been demonstrated in many studies.
CXCL family. The CXCL family plays an important role in the control of inflammation and angiogenesis because their specific receptors are expressed by both leukocytes and endothelial cells [12]. Chen et al. [13] revealed that the CXCL family represents markers for the development of GC and demonstrated that it is involved in the pathogenesis of GC. In particular, the mRNA expression levels of the CXCL family have been identified as promising prognostic indicators of Epstein-Barr virus-associated GC [14]. Zhou et al. [15] revealed that the CXCL10/CXCR3 axis promotes the invasion of GC via the PI3K/AKT pathway-dependent matrix metalloproteinase production.
CDH gene family. The CDH gene family has been identified as a marker of tissue morphogenesis, sorting, and cell adhesion, which are important processes for tissue formation and maintenance.
CDH2, encoded by N-cadherin, is involved in the transformation of human GC cells and associated with intracellular events of gastric carcinogenesis and high expression of corresponding transcription factors [16]. GC cell invasion is upregulated by Twist and accompanied by CDH2 [17]. GC invasiveness is inhibited by microRNA-145-5p, which targets CDH2 to suppress epithelial-mesenchymal transition (EMT) [18]. Jiang et al. [18] revealed that CDH2 is involved in the molecular mechanisms that regulate GC cell invasion and migration, especially EMT progression [19]. Aberrant CDH2 expression is involved in the invasion and metastasis of GC [20].
The expression of CDH17 is increased in GC, and its silencing suppresses the metastasis and proliferation of GC cells [21]. CDH17 has been identified as a useful target molecule, and its crucial roles in the comprehensive detection and localization of GC metastases have been revealed earlier [22]. Mutation of E-cadherin (CDH1) represents the most common germline mutation in GC [23]. In particular, mutations in CDH1 lead to hereditary diffuse GC [23,24].
To identify the biological processes involved in the identified GC-responsive subnetworks, we performed a Kyoto Encyclopedia of Genes and Genome (KEGG) pathway enrichment analysis of the GC-responsive markers. Fig 4 shows the enriched pathways corresponding to the p-value (i.e., -log(p.value)). As shown in Fig 4, the most enriched pathway is the "IL-17 signaling pathway" grouping of the CXCL family (i.e., CXCL8, LCN2, CXCL1, CXCL3, and CXCL2). Furthermore, the "viral protein interaction with cytokine and cytokine receptor" and "cytokine-cytokine receptor interaction" pathways were also enriched for GC-responsive markers. The pathway "epithelial cell signaling in Helicobacter pylori infection" was also identified as an enriched pathway. These findings imply that GC-responsive networks may be involved in EMT-related mechanisms of GC.

XELOX-responsive gene regulatory network identification
We also performed a GC chemotherapy-responsive gene regulatory network identification. We focused on the anticancer drugs capecitabine and oxaliplatin, which have been used as standard treatments for gastric and colorectal cancer, and their combination is known as the XELOX regimen. For XELOX, we extracted cell lines that had non-missing values in sensitivities to both capecitabine and oxaliplatin and matched the cell lines with the expression level dataset. In total, 520 cell lines were extracted. We then estimated 520 capecitabine-and capecitabine-sensitive gene networks using SiGN-L1 [7]. We considered the cell lines with the top 100 highest and lowest drug sensitivity values as drug-sensitive and drug-resistant cell lines, respectively. The sensitive and resistant cell lines for both capecitabine and oxaliplatin were defined as XELOX-sensitive and XELOX-resistant cell lines, and 18 XELOX-sensitive and 24 XELOX-resistant cell lines were considered. From the 520 capecitabine-sensitive gene networks, we extracted 18 and 24 networks corresponding to the 18 and 24 XELOX-sensitive and XELOX-resistant cell lines, respectively. The 18 sensitive and 24 resistance permutated networks were also randomly selected from 520 capecitabine-sensitive gene networks. By using the permutation p-value of CIdrgn ss .2 with a significance level of 0.01, we identified 62 capecitabine-responsive subnetworks. For oxaliplatin, similar processes were performed, and nine oxaliplatin-responsive subnetworks were identified. Among the genes in the 62 capecitabineand 9 oxaliplatin-responsive subnetworks, the common genes in the sensitive (resistant) networks were defined as XELOX-sensitive (resistant) markers. We then extracted edges of the XELOX-sensitive (resistance) markers from the capecitabine and oxaliplatin-sensitive (resistance) gene networks and constructed XELOX-sensitive (resistant) networks, where the edge size was given as an average value from the 18 XELOX-sensitive and 24 XELOX-resistant networks. Fig 5 shows the XELOX-responsive gene regulatory networks for the XELOX-sensitive and XELOX-resistant cell lines. The networks of XELOX-sensitive and XELOX-resistant cell lines showed different gene regulatory structures despite presenting common genes, and their networks were observed for both sensitive and resistant cell lines. The subnetwork of NCKAP1L was the largest XELOX-responsive gene regulatory network. The regulatory structure between NCKAP1L/ASCL1 and the OR family can be considered a XELOX-sensitive gene regulatory system. ASCL1 is an important gene in XELOX-sensitive cell lines, whereas the sub-network disappears from sensitive to resistance cell lines. Furthermore, the subnetwork of the FOS gene family can also be considered a XELOX-sensitive network. Table 3 shows the identified XELOX-responsive markers and their association with GC/ colorectal cancer and chemotherapy.
NCKAP1L. The Nck-associated protein 1-like (NCKAP1L) gene is known as a DNA methylation-driven gene that promotes the malignancy of cancer cells [25,26]. The expression of NCKAP1 is considered a marker of colon cancer [27]. NCKAP1 has been identified as a chemotherapy marker of immune checkpoint inhibitors [29]. NCKAP1 is associated with long-term DOX-induced cognitive dysfunction [30] Stark et al. [31] reported an association between NCKAP1 levels and cisplatin-induced apoptosis and cytotoxicity.
ASCL1. Expression levels of ASCL1 were identified as crucial markers of various GC subtypes, that is, lower expression of ASCL1 was observed in gastric mixed adenocarcinoma, gastrointestinal stromal tumors, and gastric intestinal-type adenocarcinoma [32]. In addition,  Table 3. Identified markers and evidence.
Morita et al. [42] identified OR7C1 as a novel marker of colon cancer-initiating cells and a potent target of immunotherapy. Activation of the OR family members induces apoptosis and reduces cell migration and proliferation in colorectal cancer cells [39]. The OR family (i.e., OR1D2, OR4F15, and OR1A1) was also disrupted in colorectal cancer, and the importance of OR activity-associated genes in colorectal cancer was identified by Li et al. [43]. Choi et al. [44] found that the OR family is an important target for drug discovery and represents a crucial anticancer target.
FOS family (AP-1 transcription factor family). The FOS gene family is involved in malignant transformation in several types of cancer, and its members (FOS, FOSB, FOSL1, and FOSL2) play crucial roles in cell proliferation and cancer development [45]. Abnormal expression of FOSB is a signature of tumor progression and poor survival. Overexpression of FOSB suppresses cell proliferation, clone formation, and migration, whereas silencing of FOSB leads to proliferation, clone formation, and migration in GC cell lines [46]. Tang et al. [46] suggested that FOSB could be a useful biomarker for the diagnosis and prognosis of GC. FOS family members have been identified as novel prognostic markers and therapeutic targets in colorectal cancer [45,47]. The FOS family is involved in the antitumor efficacy of oxaliplatin, and the importance of the effect of oxaliplatin 'on FOS signaling has been revealed [48,49].
Dong et al. [50] demonstrated that upregulated FOS expression is a marker of low sensitivity to eribulin. Javellana et al. [51] revealed potentially clinically relevant roles of increased FOS family expression in driving the initial responses of tumors to carboplatin and paclitaxel and suggested that the FOS family will be a useful druggable target to counteract chemotherapy resistance.
For the XELOX-sensitive and XELOX-resistant subnetworks in Fig 5, we performed a Gene Ontology (GO) enrichment analysis. Fig 6 shows the significant GO terms corresponding to a p-value less than 0.01, where the enriched pathways of XELOX-sensitive and XELOX-resistant subnetworks are shown in red and blue, respectively. As shown in Fig 6, the XELOX-sensitive (red) and XELOX-resistant (blue) markers are enriched in different pathways, although the networks have common genes.
The most enriched pathway of the XELOX-sensitive markers is the "olfactory receptor activity" grouping of the OR family (i.e., OR4E2, OR4E1, OR10G2, OR52R1, OR51E2, OR51T1). Furthermore, the pathways related to "DNA binding", "steroid hormone", and "G protein-coupled receptor" were also identified as XELOX-sensitive gene markers. These findings imply that the XELOX-sensitive gene regulatory network is dominated by G protein-coupled receptors because the OR family belongs to the superfamily of G protein-coupled receptors [52]. For XELOX-resistance markers, the most enriched pathway is "neurotransmitter transport", and it consists SV2B and CHAT, which have crucial molecular functions as drug targets [53].
Our results and those of previous studies show that the CXC family of chemokines may play a crucial role in the progression of GC. Furthermore, identifying the EMT-related mechanism involving the CDH gene family is essential for understanding the metastasis, proliferation, and development of GC and revealing the mechanism of the gene family in EMT-associated GC is vital to understanding the progression of GC. The GS-and anti-cancer drugs response-related mechanisms of ASCL1 that was identified as XELOX sensitive-specific hub gene were revealed in previous studies. Although its mechanisms of acquiring gastric cancer drug resistance have not yet been clearly demonstrated, our results and literatures suggest that the loss of activities and molecular interactions of the OR family with the ASCL1 and FOS families may represent a molecular mechanism underlying acquired drug resistance in GC cell lines.

Conclusions
In this study, we introduce a novel strategy for differential gene regulatory network analysis. To identify condition-responsive gene regulatory networks describing biological conditionspecific regulatory characteristics, we propose a computational method that measures the dissimilarity of subnetworks based on not only gene expression, but also regulatory effects between genes and edge structures. We also extended the proposed method to cell line characteristic-specific gene network analysis. Our method performs differential gene regulatory network analysis based on comprehensive information of the network structure; thus, we can effectively identify biological condition-responsive gene networks that cannot be detected using methods based on differences in gene expression levels. Monte Carlo simulations showed that our strategy demonstrates outstanding performance for responsive gene network identification in various network structure scenarios. It implies that incorporation of various network structures not only expression levels of genes is crucial to identify responsive networks.
We applied the proposed CIdrgn to the DepMap dataset and performed an analysis of GC and chemotherapy (XELOX) -responsive gene regulatory networks. Our strategy identified the molecular interplays of CXCL family (i.e., CXCL1, CXCL2, CXCL3 and CXCL8) and CDH gene family as GS responsive networks. For the XELOX-responsive gene networks, the regulatory structure between NCKAP1L/ASCL1 and the OR family are identified. The identified markers from the GC and XELOX-responsive network analysis provide strong evidence for the mechanisms related to GC, colorectal cancer, and various chemotherapies.
Our results suggest that the CXC family of chemokines and CDH genes may play a crucial role in GC tumor invasion, metastasis, and progression. Moreover, our findings indicate that revealing the mechanisms associated with genes in EMT-associated GC will be key to understanding GC progression. We also suggest that the loss of activity of the OR family with the ASCL1/FOS family may lead to drug resistance in cell lines, and inducers of this gene family can provide crucial information to improve the chemotherapy efficiency of XELOX.