Figure 1.
Schematic overview of the computational framework used to infer network motif regulatory modules.
Gene expression patterns were first clustered into biologically meaningful groups by FCM; the GO category information of the genes was used to determine the optimal cluster number. To evaluate the gene clusters, GSEA was performed on the optimal clusters. Additionally, significant network motifs detected in the combined PPI and PDI network were assigned to each transcription factor. After the gene clusters were formed and the transcription factors were assigned to network motif categories, the connections between transcription factors and gene clusters were inferred by training RNNs that mimic the topology of the network motifs to which the transcription factors were assigned. Finally, the inferred network motifs were validated using BSEA and results from the literature.
Figure 2.
The scheme illustrates the process of grouping genes into biologically meaningful clusters. The gene expression data were first used to find the optimal m value for FCM clustering. With the optimal m value, FCM clustering was performed on the gene expression data for cluster numbers ranging from 2 to 50. For each partition, the similarity scores of all pairs of genes within each cluster were averaged to give the overall similarity score of that partition. The partition with the highest overall similarity score was selected as the optimal one. GSEA was performed using FuncAssociate to evaluate the gene clusters formed with the optimal cluster number.
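The partition-selection step described in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy similarity matrix, and the toy partitions are hypothetical, and the score is assumed to be the mean similarity pooled over all within-cluster gene pairs (the caption does not say whether averaging is pooled or done per cluster first). Each gene is assumed to have been assigned to its highest-membership cluster beforehand.

```python
import numpy as np

def overall_similarity(labels, sim):
    """Mean similarity over all gene pairs that share a cluster (assumed pooling)."""
    labels = np.asarray(labels)
    sim = np.asarray(sim, dtype=float)
    same_cluster = labels[:, None] == labels[None, :]
    # Keep each unordered pair once and exclude self-pairs.
    upper = np.triu(np.ones_like(same_cluster, dtype=bool), k=1)
    mask = same_cluster & upper
    if not mask.any():
        return float("nan")  # no within-cluster pairs to average
    return float(sim[mask].mean())

def select_partition(partitions, sim):
    """Return the cluster number whose partition has the highest overall score."""
    scores = {c: overall_similarity(lab, sim) for c, lab in partitions.items()}
    valid = {c: s for c, s in scores.items() if not np.isnan(s)}
    best = max(valid, key=valid.get)
    return best, scores

# Hypothetical toy data: 4 genes, pairwise functional similarity in [0, 1].
toy_sim = np.array([
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
])
# Hypothetical hard assignments for two candidate cluster numbers.
toy_partitions = {2: [0, 0, 1, 1], 3: [0, 1, 1, 2]}

best, scores = select_partition(toy_partitions, toy_sim)
print(best, scores)
```

In the setting of the caption, `partitions` would hold one entry per cluster number from 2 to 50, and `sim` would hold GO-based gene similarity scores.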
Figure 3.
Clustering results obtained using the k-means and FCM algorithms.
Three clustering results were plotted: k-means clustering, and FCM clustering with two values of the fuzziness parameter m, namely the default value (m = 2) and the optimal value (m = 1.1548).
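To see why the choice of m matters, the sketch below evaluates the standard FCM membership formula, u_i = 1 / sum_j (d_i / d_j)^(2/(m-1)), for a single gene at the two m values named in the caption. The function name and the example distances are hypothetical, and the distances to the cluster centroids are assumed to be strictly positive.

```python
import numpy as np

def fcm_memberships(distances, m):
    """Standard FCM membership of one gene in each cluster.

    distances: strictly positive distances from the gene to each centroid.
    m: fuzziness parameter, must be greater than 1.
    """
    if m <= 1:
        raise ValueError("m must be greater than 1")
    d = np.asarray(distances, dtype=float)
    # ratio[i, j] = d_i / d_j
    ratio = d[:, None] / d[None, :]
    return 1.0 / np.sum(ratio ** (2.0 / (m - 1.0)), axis=1)

# Hypothetical distances from one gene to three cluster centroids.
toy_distances = [1.0, 2.0, 3.0]

u_default = fcm_memberships(toy_distances, m=2.0)
u_optimal = fcm_memberships(toy_distances, m=1.1548)
print(u_default)
print(u_optimal)
```

With m = 2 the gene keeps a noticeable membership in the more distant clusters, whereas with m close to 1 the memberships approach a hard, k-means-like assignment to the nearest centroid.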
Figure 4.
Predicted network motif from known cell cycle-dependent genes.
The left panel presents the four network motif regulatory modules considered in this study. The right panel depicts the inferred transcription factor-target gene relationships for eight cell cycle-dependent transcription factors.
Table 1.
The experimental results of GA-PSO with RNN.
Figure 5.
Binding site enrichment analysis for gene clusters.
Sequence logos represent the motifs significantly overrepresented in the individual gene clusters associated with their predicted upstream transcription factors, according to the WebMOTIFS discovery algorithm [48]. The height of each base letter indicates the level of conservation at that binding site position. Conserved binding motifs are the conserved binding sequences used in the WebMOTIFS discovery algorithm.
Figure 6.
Ingenuity analysis for BRCA1-related network motif: A predicted network motif, where BRCA1 regulates two clusters which interact with each other (top right corner), and a network reconstructed by the IPA software.
Shaded genes are those identified in the network motif; the others are genes associated with the identified genes based on pathway analysis.
Figure 7.
Ingenuity analysis for BRCA1 and STAT1-related network motif: A predicted network motif, in which BRCA1 and STAT1 regulate all three genes in Cluster 36 (top right corner), and a network reconstructed by the IPA software.
Shaded genes are those identified in the network motif; the others are genes associated with the identified genes based on pathway analysis.
Figure 8.
Ingenuity analysis for E2F1 and E2F2-related network motif: A predicted network motif with E2F1 and E2F2 interacting with each other and regulating the genes in Cluster 34 (top left corner), and a network reconstructed by the IPA software.
Shaded genes are those identified in the network motif; the others are genes associated with the identified genes based on pathway analysis.
Figure 9.
Ingenuity analysis for E2F and PCNA-related network motif: A predicted network motif where E2F2 and PCNA bind together and regulate downstream genes in Cluster 34 (top left corner), and a network reconstructed by the IPA software.
Shaded genes are those identified in the network motif; the others are genes associated with the identified genes based on pathway analysis.
Table 2.
Networks included in this study.