The authors have declared that no competing interests exist.
Conceived and designed the experiments: NDJ. Performed the experiments: NDJ. Analyzed the data: NDJ NB. Contributed reagents/materials/analysis tools: NDJ NB. Wrote the paper: NDJ NB.
Understanding gene transcription regulatory networks is critical to deciphering the molecular mechanisms of different cellular states. Most studies focus on static transcriptional networks. In the current study, we used the gastrin-regulated system as a model to understand the dynamics of transcriptional networks composed of transcription factors (TFs) and target genes (TGs). The hormone gastrin activates signaling pathways that lead to various cellular states through transcriptional programs. Dysregulation of gastrin can, for example, result in cancerous tumors. However, the regulatory networks involving gastrin are highly complex, and the roles of most of the components of these networks are unknown. We used time series microarray data of AR42J adenocarcinoma cells treated with gastrin, combined with static TF-TG relationships integrated from different sources, and reconstructed the dynamic activities of TFs using network component analysis (NCA). Based on the peak expression of TGs and the peak activity of TFs, we created active sub-networks at four time ranges after gastrin treatment, namely immediate-early (IE), mid-early (ME), mid-late (ML) and very late (VL). Network analysis revealed that the active sub-networks were topologically different at the early and late time ranges. Gene ontology analysis revealed that each active sub-network was highly enriched in a particular biological process. Interestingly, network motif patterns were also distinct between the sub-networks. This analysis can be applied to other time series microarray datasets, focusing on smaller sub-networks that are activated in a cascade, allowing a better overview of the mechanisms involved at each time range.
Understanding gene transcription regulatory networks is critical to deciphering the molecular mechanisms resulting in different cellular states in response to growth factors
Gene regulatory networks are highly complex and dynamic, especially the coordinated regulation between transcription factors (TFs) and their target genes (TGs)
Understanding cellular functionality by studying its key components, interactions and network topological measures is a common approach in systems biology
It is well known that transcription networks often contain a small set of recurring regulatory patterns called network motifs. These small networks are frequently found in quantities that are significantly larger than would be expected for random networks
In the current study, we combined time series gene expression data and static TF-TG interaction data to study the dynamic features of gastrin-regulated transcriptional networks. We reconstructed the dynamic activities of key TFs using network component analysis (NCA)
The gene expression data used in this study were obtained by measuring the response of AR42J adenocarcinoma cells treated with the hormone gastrin at 11 time points over a period of 14 hours. We downloaded the data from the GEO database (accession number: GSE32869)
We collected the experimentally verified TF-TG regulations from TFacts
Active sub-networks for four time ranges were extracted from the static network by incorporating gene expression data and the TF activities predicted by NCA. For each time range, active TFs and TGs (those displaying expression or activity above a threshold) were identified and combined with the static network to define an active sub-network.
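The extraction step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy profiles, the thresholds and the rule "above the threshold at any time point in the range" are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (TF, TG) pair taken from the static network


def active_nodes(profiles: Dict[str, List[float]],
                 time_idx: List[int],
                 threshold: float) -> Set[str]:
    """Names whose value exceeds the threshold at any index of the time range."""
    return {name for name, values in profiles.items()
            if any(values[i] > threshold for i in time_idx)}


def active_subnetwork(static_edges: List[Edge],
                      tf_activity: Dict[str, List[float]],
                      tg_expression: Dict[str, List[float]],
                      time_idx: List[int],
                      tf_threshold: float,
                      tg_threshold: float) -> List[Edge]:
    """Keep the static TF-TG edges whose TF and TG are both active in the range."""
    tfs = active_nodes(tf_activity, time_idx, tf_threshold)
    tgs = active_nodes(tg_expression, time_idx, tg_threshold)
    return [(tf, tg) for tf, tg in static_edges if tf in tfs and tg in tgs]


# Hypothetical toy data: three time points, two TFs, three TGs.
static_edges = [("TF1", "G1"), ("TF1", "G2"), ("TF2", "G2"), ("TF2", "G3")]
tf_activity = {"TF1": [0.2, 1.5, 0.3], "TF2": [0.1, 0.2, 1.8]}
tg_expression = {"G1": [0.1, 1.2, 0.2], "G2": [0.0, 1.1, 1.4], "G3": [0.1, 0.2, 0.3]}

early = active_subnetwork(static_edges, tf_activity, tg_expression, [0, 1], 1.0, 1.0)
late = active_subnetwork(static_edges, tf_activity, tg_expression, [2], 1.0, 1.0)
print(early)  # edges active in the first two time points
print(late)   # edges active at the last time point
```

Filtering edges rather than nodes keeps only regulations that are already supported by the static network, so the sub-network never contains interactions absent from the curated sources.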
Network component analysis (NCA) is a computational method for reconstructing hidden regulatory signals (TFs activity) from gene expression data with known connectivity information in terms of matrix decomposition
Based on above formulation, the decomposition of [
To guarantee the uniqueness of the solution for
The connectivity matrix [C] must have full-column rank
When a node in the regulatory layer is removed along with all of the output nodes connected to it, the resulting network must be characterized by a connectivity matrix that still has full-column rank
The [T] matrix must have full row rank
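The three criteria listed above can be checked numerically before running NCA. The sketch below is an assumption-laden illustration: it takes [C] as a genes × TFs matrix and [T] as a TFs × time points matrix, and it tests rank on the numeric values supplied. A 0/1 connectivity pattern is therefore only a proxy for the unknown connection strengths. The names `rank`, `reduced` and `nca_criteria` are hypothetical.

```python
from fractions import Fraction
from typing import Dict, List

Matrix = List[List[float]]


def rank(matrix: Matrix) -> int:
    """Matrix rank by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def reduced(C: Matrix, j: int) -> Matrix:
    """Remove TF j (column j) together with every gene (row) connected to it."""
    return [[v for k, v in enumerate(row) if k != j] for row in C if row[j] == 0]


def nca_criteria(C: Matrix, T: Matrix) -> Dict[str, bool]:
    n_tfs = len(C[0])
    return {
        "C_full_column_rank": rank(C) == n_tfs,
        "reduced_C_full_column_rank": all(
            rank(reduced(C, j)) == n_tfs - 1 for j in range(n_tfs)),
        "T_full_row_rank": rank(T) == n_tfs,
    }


# Hypothetical toy example: 4 genes, 2 TFs, 3 time points.
C = [[1, 0], [1, 0], [0, 1], [1, 1]]
T = [[1, 2, 3], [0, 1, 4]]
C_bad = [[1, 1], [1, 1], [1, 1]]  # both TFs regulate exactly the same genes

print(nca_criteria(C, T))
print(nca_criteria(C_bad, T))
```

In `C_bad` the two TFs share all of their targets, so their contributions cannot be separated and the column-rank criteria fail.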
Using NCA as the reconstruction method, we predicted significant TFs and their temporal activity profiles.
The degree of a node
In the case of directed networks,
The betweenness centrality
In the case of directed networks,
Network motifs are small networks that are present in large complex networks at higher frequencies than in random networks. To understand the regulation patterns, we used the FANMOD tool
Gene ontology (GO) analysis aims to capture increasing knowledge of gene functions in a collective manner. We used the ClueGO tool
The workflow of our data pre-processing and subsequent analysis is presented in
The schematic of our approach for constructing the active sub-networks at different time ranges from gene expression data.
The changes in the expression of differentially expressed genes (DEGs) over the 14-hour period after gastrin stimulation are shown in
(A) TG expression profiles. (B) Reconstructed TF activity profiles. Clustering was performed in such a way that both sequential regulation and co-regulation can be observed. Each row represents either a TG or a TF, and each column represents a progression in time (in hours). Activations and repressions are represented by red and blue colors, respectively.
We used network component analysis (NCA)
To understand gastrin regulation in adenocarcinoma cells in terms of quantitative measurements, we computed the number of differentially expressed genes (DEGs) and TFs (computed from NCA) activated at each time point (
The number of active TGs and TFs at each time point is presented. Active TGs and TFs were defined at each time point based on peak expression or activity at that time point.
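Counting active TGs or TFs by peak time, as described in this caption, amounts to assigning each profile to the time point of its maximum. The sketch below illustrates the idea. The time points and profiles are hypothetical, and ties are resolved in favour of the earliest time point, which is an assumption.

```python
from typing import Dict, List


def peak_time_counts(profiles: Dict[str, List[float]],
                     time_points: List[float]) -> Dict[float, int]:
    """Count how many profiles reach their maximum at each time point."""
    counts = {t: 0 for t in time_points}
    for values in profiles.values():
        # Index of the maximum value; max() keeps the first index on ties.
        peak_idx = max(range(len(values)), key=lambda i: values[i])
        counts[time_points[peak_idx]] += 1
    return counts


# Hypothetical toy data: three genes measured at 0.5, 1 and 2 hours.
time_points = [0.5, 1, 2]
expression = {
    "G1": [0.2, 1.4, 0.3],
    "G2": [1.1, 0.4, 0.2],
    "G3": [0.1, 0.9, 0.5],
}
counts = peak_time_counts(expression, time_points)
print(counts)
```

The same function applies to TF activity profiles reconstructed by NCA, since only the position of the maximum is used.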
To study the dynamic features of regulatory networks, a static integrated regulatory network was constructed by combining various data resources, including ChIP-X studies, experiments and prediction databases. The regulatory TF-TG relationships were extracted from the Transcriptome Browser
The active sub-networks were built based on differential expression of TGs and peak TF activity during a specific time range. We defined four active sub-networks, namely immediate-early (IE), mid-early (ME), mid-late (ML) and very late (VL), as shown in
We then examined the structural and modular architecture of the four constructed active sub-networks. This analysis clearly distinguished each active sub-network from the others. The size and topological metrics of the static regulatory network and the four active sub-networks are presented in
Group | Metric | Static network | IE network | ME network | ML network | VL network
Size | Active TGs | 12980 | 734 | 877 | 1201 | 265
Size | Active TFs | 449 | 43 | 78 | 87 | 31
Size | Regulatory interactions | 164077 | 2008 | 3957 | 5984 | 762
Topological metrics | Average degree | 24.03 | 5.02 | 8.02 | 8.94 | 4.82
Topological metrics | Clustering coefficient | 0.274 | 0.054 | 0.062 | 0.113 | 0.272
Topological metrics | Diameter | 6 | 7 | 9 | 7 | 6
Topological metrics | Average path length | 2.634 | 2.957 | 3.343 | 2.942 | 2.697
Topological metrics | Betweenness centrality | 3.87E-06 | 7.48E-05 | 15.92E-05 | 9.07E-05 | 42.52E-05
Topological metrics | Closeness centrality | 1.38E-02 | 2.94E-02 | 3.39E-02 | 2.71E-02 | 5.64E-02
Topological metrics | Centralization | 0.285 | 0.338 | 0.245 | 0.293 | 0.456
Random networks (mean ± SD) | Average degree | NA | 1.26±0.41 | 1.76±0.61 | 2.45±0.66 | 0.61±0.32
Random networks (mean ± SD) | Clustering coefficient | NA | 0.029±0.023 | 0.05±0.03 | 0.077±0.04 | 0.014±0.02
Random networks (mean ± SD) | Diameter | NA | 6.01±1.31 | 7.1±1.63 | 7.6±1.24 | 2.9±1.27
Random networks (mean ± SD) | Average path length | NA | 2.66±0.57 | 2.99±0.5 | 3.23±0.37 | 1.52±0.46
Random networks (mean ± SD) | Centralization | NA | 0.06 | 0.151±0.05 | 0.181±0.06 | 0.109±0.06
The degree of a node is the number of interactions incident to it. The clustering coefficient measures the interconnectivity around a node. The average path length is the average length of all shortest paths among all node pairs. Betweenness centrality is the average number of shortest paths between all node pairs passing through a node. Closeness centrality is the reciprocal of the average shortest path length from a node to all other nodes. The mean and standard deviation (mean ± SD) of 100 random networks for each active sub-network are presented in the last row. All computations were performed in Cytoscape.
NA denotes not applicable.
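The metrics defined in the table notes can be illustrated on a small graph. The sketch below treats the network as undirected and unweighted, which is an assumption. It is not the Cytoscape computation used in the study, and conventions (for example for nodes with fewer than two neighbours) may differ from that tool. All function names are hypothetical.

```python
from collections import deque
from itertools import combinations
from typing import Dict, List, Set, Tuple


def build_adjacency(edges: List[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Undirected adjacency sets from an edge list."""
    adj: Dict[str, Set[str]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_degree(adj: Dict[str, Set[str]]) -> float:
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def clustering_coefficient(adj: Dict[str, Set[str]]) -> float:
    """Mean local clustering; nodes with fewer than two neighbours count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def shortest_path_lengths(adj: Dict[str, Set[str]], source: str) -> Dict[str, int]:
    """Breadth-first search distances from one source node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_statistics(adj: Dict[str, Set[str]]) -> Tuple[float, int]:
    """Average shortest path length and diameter over reachable node pairs."""
    lengths = [d for s in adj
               for t, d in shortest_path_lengths(adj, s).items() if t != s]
    return sum(lengths) / len(lengths), max(lengths)


# Hypothetical toy graph: a triangle A-B-C with a pendant node D attached to C.
adj = build_adjacency([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
avg_len, diameter = path_statistics(adj)
print(average_degree(adj), clustering_coefficient(adj), avg_len, diameter)
```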
To corroborate the significance of these topological divergences, we performed the same computations on a set of random networks created from a static network with the same number of nodes as the actual active sub-networks. We generated 100 random networks, and the structural properties were averaged over these 100 networks. The mean and standard deviations are provided in the last row of
Transcriptional regulatory networks are made up of small recurring patterns called network motifs. To understand the dynamic functional characteristics of the gastrin network, we performed a network motif analysis for each active sub-network. We identified network motifs with 3, 4 and 5 nodes in the four active sub-networks using the FANMOD tool
The significantly enriched network motifs and their Z-scores in the four active sub-networks are presented in
The key network motifs of 3-, 4-, and 5-component nodes detected in four active sub-networks are presented. Red nodes represent TGs and green nodes represent TFs. The network motif search was performed using the FANMOD tool. The motifs with p-value<0.05 were considered statistically significant.
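The significance test behind a motif search compares the motif count in the real network with counts in randomized networks. The sketch below illustrates this for a single 3-node motif, the feed-forward loop, using degree-preserving edge swaps as the null model. It is a simplified stand-in, not FANMOD's algorithm: the function names, the number of swaps and the toy network are assumptions.

```python
import random
from statistics import mean, pstdev
from typing import List, Optional, Tuple

Edge = Tuple[str, str]  # directed edge (regulator, target)


def count_ffl(edges: List[Edge]) -> int:
    """Count feed-forward loops: a->b, b->c and a->c with distinct a, b, c."""
    out = {}
    for u, v in set(edges):
        out.setdefault(u, set()).add(v)
    count = 0
    for a, targets in out.items():
        for b in targets:
            if b == a:
                continue
            for c in out.get(b, ()):
                if c not in (a, b) and c in targets:
                    count += 1
    return count


def degree_preserving_shuffle(edges: List[Edge], n_swaps: int,
                              rng: random.Random) -> List[Edge]:
    """Swap edge targets (a->b, c->d becomes a->d, c->b), keeping all degrees."""
    edges = list(edges)
    edge_set = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (a, d), (c, b)
        # Skip swaps that would create self-loops or duplicate edges.
        if a == d or c == b or new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {edges[i], edges[j]}
        edge_set |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return edges


def motif_z_score(edges: List[Edge], n_random: int = 100,
                  seed: int = 0) -> Optional[float]:
    """Z-score of the observed count; None if random counts do not vary."""
    rng = random.Random(seed)
    observed = count_ffl(edges)
    random_counts = [
        count_ffl(degree_preserving_shuffle(edges, 10 * len(edges), rng))
        for _ in range(n_random)
    ]
    sd = pstdev(random_counts)
    if sd == 0:
        return None
    return (observed - mean(random_counts)) / sd


# Hypothetical toy network containing one feed-forward loop (TF1, TF2, G1).
edges = [("TF1", "TF2"), ("TF2", "G1"), ("TF1", "G1"),
         ("TF1", "G2"), ("TF3", "G2"), ("TF3", "G1")]
print(count_ffl(edges))
print(motif_z_score(edges))
```

A large positive Z-score indicates that the motif occurs more often than expected from the degree sequence alone. On a network this small the score is not meaningful and only shows the mechanics.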
The computed Z-scores for selected network motifs across the four active sub-networks are shown. Each active sub-network is enriched in specific network motifs. The labels ‘a’, ‘b’, ‘c’, ‘d’, and ‘e’ represent the respective network motifs in
Five types of 5-node motifs were enriched in the constructed networks. MIMs (multi input motifs, types ‘a’, ‘b’ and ‘c’) were the most common regulatory pattern among 5-node motifs. Of these, type ‘b’ was significantly enriched in the VL sub-network and type ‘c’ in the ML network. MIMs are also known to regulate large-scale gene activation
To identify which biological processes are affected by differentially expressed TGs in each active sub-network, and how, we performed functional annotation using the ClueGO tool in Cytoscape
Network representations of enriched terms among active genes in the respective sub-networks. Enriched terms are represented as nodes based on their kappa score (≥0.3). The node size indicates the significance of the enrichment. (A) IE active sub-network. (B) ME active sub-network.
Network representations of enriched terms among active genes in the respective sub-networks. Enriched terms are represented as nodes based on their kappa score (≥0.3). The node size indicates the significance of the term enrichment. (A) ML active sub-network. (B) VL active sub-network.
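The kappa score mentioned in these captions measures how strongly two terms agree in the genes they annotate, corrected for chance agreement. The sketch below computes Cohen's kappa on binary gene membership and links terms at the ≥0.3 cutoff. It is an illustration under assumptions, not ClueGO's exact implementation: the gene universe, the term gene sets and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def kappa_score(genes_a: Set[str], genes_b: Set[str], universe: Set[str]) -> float:
    """Cohen's kappa for the agreement of two terms over a gene universe."""
    n = len(universe)
    a, b = genes_a & universe, genes_b & universe
    both = len(a & b)
    neither = n - len(a | b)
    observed = (both + neither) / n
    expected = (len(a) * len(b) + (n - len(a)) * (n - len(b))) / n ** 2
    if expected == 1:
        return 0.0  # degenerate case: both terms annotate all genes or none
    return (observed - expected) / (1 - expected)


def term_links(terms: Dict[str, Set[str]], universe: Set[str],
               threshold: float = 0.3) -> List[Tuple[str, str]]:
    """Pairs of terms whose kappa score reaches the threshold."""
    return [(s, t) for s, t in combinations(sorted(terms), 2)
            if kappa_score(terms[s], terms[t], universe) >= threshold]


# Hypothetical toy data: ten genes and three terms.
universe = {f"g{i}" for i in range(10)}
terms = {
    "term_a": {"g0", "g1", "g2", "g3"},
    "term_b": {"g0", "g1", "g2", "g4"},
    "term_c": {"g5", "g6"},
}
print(kappa_score(terms["term_a"], terms["term_b"], universe))
print(term_links(terms, universe))
```

Terms that share most of their genes are joined by an edge, so functionally related terms form connected groups in the resulting network.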
The IE active sub-network is highly enriched in functions involved in cellular differentiation. Many previous studies have confirmed the role of gastrin in differentiation processes
The ME active sub-network is enriched in a large number of significant categories involving several metabolic pathways and several cancer-associated pathways. This sub-network is enriched in energy-related metabolic processes such as carbohydrate, polysaccharide, lipid, organic acid and cellular aromatic compound metabolic processes, and the generation of precursor metabolites and energy. Cancer pathways such as small cell lung cancer, thyroid cancer, prostate cancer, endometrial cancer and chronic myeloid leukemia are overrepresented in this network. In addition, this network is enriched in response to endoplasmic reticulum (ER) stress, cellular response to stimulus, B cell activation and hemostasis. A recent study confirms the involvement of gastrin in regulating the genes resulting in ER stress
The ML network is highly enriched in biological processes related to morphogenesis, such as cell morphogenesis, morphogenesis of an epithelium, embryonic epithelial tube formation and heart morphogenesis. In addition, this network is enriched in regulation of cell migration and cell growth, angiogenesis and activation of Wnt and calcium signaling pathways. Previous studies of gastrin-CCK2R (cholecystokinin-2 receptor) signaling have confirmed some of the processes predicted in this study. Gastrin-CCK2R involvement in the morphogenesis of epithelial cells was found in previous studies. Pagliocca et al. found that stimulation with gastrin promotes branching morphogenesis (the process of tubule formation) in gastric AGS cells through the activation of protein kinase C (PKC)
The VL sub-network is mainly involved in metabolic processes, such as DNA and phosphorus metabolic processes, as well as in cell communication, apoptosis and tube morphogenesis. Gastrin has induced apoptosis in many cells in previous studies
Thus, our functional annotation of genes in four active sub-networks revealed several known and new functions of gastrin. In addition, this analysis contributed to identifying the gastrin response from a dynamic perspective.
The primary objective of this study was to analyze time series gene expression data generated in response to an external stimulus in order to understand the transcriptional regulatory network from a dynamic perspective. To achieve this goal, we integrated information from a static TF-TG network with gene expression data to identify the temporal dynamics of key TFs. The gene expression and TF activities showed early-, mid-, and late-phase action in response to gastrin. This indicates that gastrin regulates genes over a period of 14 hours, although the majority of the genes were active at 1 and 2 hours and of the TFs at 1.5 and 6 hours after gastrin treatment.
To understand the mechanisms of transcriptional regulation more comprehensively, we built four active sub-networks at four different time ranges. The active sub-networks defined in this study showed structural differences in their network organization. The ML and VL sub-networks were more strongly interconnected than the others, as reflected in their higher clustering coefficients. In addition, we identified key regulatory patterns, called network motifs, in all sub-networks. This analysis showed that distinct network motifs were significantly enriched in each active sub-network. The GO and pathway analysis of active TGs and TFs showed that each active sub-network was enriched in unique GO terms and pathways. This shows that gastrin triggers different cellular states through diverse and complex transcription regulation patterns depending on the time of activation. We demonstrated that partitioning time series microarray data into smaller temporal sub-networks reveals network properties that are unique to each time range and that may otherwise be hidden when the whole time range is combined.
The development of high-throughput technologies such as microarrays results in large amounts of biological data and demands the rapid development of computational methods and strategies to analyze the data and thus extract biological knowledge. Our current study provides one such strategy for using these data and integrating known biological information to decipher the mechanisms of signaling and transcriptional programs of the biological system.