Fig 1.

The overall framework of iHerd.

iHerd is an end-to-end learning framework comprising a Graph Coarsen module, a Graph Representation Learning module, an Embedding Refinement module, and a Hierarchical Embedding Alignment module. Each component is detailed in the Method section.

Fig 2.

iHerd recovers hierarchy changes within transcription factor (TF) GRNs.

(a) The UMAP of embeddings for TFs in GM12878. (b) The boxplot of the ratio of in-degree to out-degree for the three clusters in GM12878. (c) The UMAP of embeddings for TFs in K562. (d) The boxplot of the ratio of in-degree to out-degree for the three clusters in K562. (e) The UMAP of embeddings for TF2 in GM12878 and K562 without the embedding alignment. (f) The UMAP of embeddings for TF2 in GM12878 and K562 after the embedding alignment. (g) The line plot of the sorted normalized L2 distance. A purple dot indicates a switching event for that TF. The boxplot of normalized L2 distance for non-switching versus switching TFs also shows that TFs with a higher L2 distance are more likely to switch clusters from GM12878 to K562. (h-j) The UMAPs of three example TFs that switch clusters from GM12878 to K562.
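The quantity plotted in panel (g) can be illustrated with a minimal sketch. It assumes the two embedding matrices are already aligned and row-matched by TF, and it uses min-max scaling as the normalization, which is an assumption rather than the scheme confirmed by the caption. The function names `normalized_l2` and `switching_events` and the toy values are hypothetical.

```python
import numpy as np


def normalized_l2(emb_a, emb_b):
    """Per-node L2 distance between two aligned embedding matrices.

    Distances are min-max scaled to [0, 1]; this normalization is an
    assumption made for illustration.
    """
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    dist = np.linalg.norm(emb_a - emb_b, axis=1)
    span = dist.max() - dist.min()
    if span == 0:
        return np.zeros_like(dist)
    return (dist - dist.min()) / span


def switching_events(clusters_a, clusters_b):
    """True where a node's cluster label differs between the two conditions."""
    return np.asarray(clusters_a) != np.asarray(clusters_b)


# Toy embeddings for three TFs in two cell lines (hypothetical values).
emb_gm = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
emb_k = np.array([[0.0, 0.0], [1.0, 2.0], [5.0, 4.0]])

dist = normalized_l2(emb_gm, emb_k)
order = np.argsort(dist)  # TF indices sorted by increasing distance
switched = switching_events([0, 1, 2], [0, 1, 0])
```

In this toy case the only TF that changes cluster is also the one with the largest normalized distance, which is the pattern the boxplot in panel (g) summarizes.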

Fig 3.

Simulated GRN experiments.

(a) Simulation scheme on GRNs. (b) The violin plot of the false positive test. (c) The distributions of the node change distance for the false positive test.

Table 1.

Summary of baseline methods.

Table 2.

TF prioritization benchmarking.

Table 3.

False Positive Rate Analysis.

Fig 4.

iHerd identifies different divergent genes between cell types.

(a) UMAP of RNA profiles showing the neuronal and non-neuronal groups; the data comprise seven samples and 84,852 cells. (b) The dot plot colored by gene expression for different cell types. These genes are differentially expressed across seven cell types. (c) Illustration of how the different divergent genes are discovered. The L2 distance between g2 and g’2 is below the threshold at the early stage but exceeds it at the late stage, whereas the L2 distance between g1 and g’1 exceeds the threshold at all stages. (d) The boxplot of normalized absolute correlation changes from excitatory neurons to macroglia for the top rewiring genes versus the top conserved genes. The top rewiring genes are the 5% of genes with the largest L2 distance, and the top conserved genes are the 5% of genes with the smallest L2 distance. (e) The normalized L2 distance at different stages for early divergent genes (EDGs). One example EDG is PPARG; its violin plot of gene expression indicates that PPARG is highly expressed in neurons. (f) The normalized L2 distance at different stages for late divergent genes (LDGs). One example LDG is RUNX2; its violin plot of gene expression indicates that RUNX2 is highly expressed in microglia. "Middle" in iHerd refers to the first coarsening stage of the original network. With "Early", "Middle", and "Late" representing the coarsest, semi-coarse, and original networks, respectively, the three stages give deeper insight into gene behavior in different biological contexts.
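The selection rules described in panels (c) and (d) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the helper names `top_and_bottom_fraction` and `classify_divergence`, the threshold value, and the toy distances are all hypothetical, and the label "other" for genes matching neither pattern is introduced here only for completeness.

```python
import numpy as np


def top_and_bottom_fraction(dist, frac=0.05):
    """Indices of the genes with the largest and smallest distances.

    Returns (rewiring, conserved): the top `frac` of genes by distance
    (largest first) and the bottom `frac` (smallest first).
    """
    dist = np.asarray(dist, dtype=float)
    k = max(1, int(round(frac * dist.size)))
    order = np.argsort(dist)
    return order[-k:][::-1], order[:k]


def classify_divergence(stage_dists, threshold):
    """Label a gene from its distances ordered early -> late.

    "EDG": the distance exceeds the threshold at every stage.
    "LDG": the distance is at or below the threshold at the early stage
           but exceeds it at the late stage.
    "other": neither pattern (a label assumed for this sketch).
    """
    above = [d > threshold for d in stage_dists]
    if all(above):
        return "EDG"
    if not above[0] and above[-1]:
        return "LDG"
    return "other"


# Toy example: 40 genes with evenly spaced distances (hypothetical data).
dist = np.linspace(0.0, 1.0, 40)
rewiring, conserved = top_and_bottom_fraction(dist, frac=0.05)

label_g1 = classify_divergence([0.8, 0.7, 0.9], threshold=0.5)
label_g2 = classify_divergence([0.1, 0.4, 0.9], threshold=0.5)
label_g3 = classify_divergence([0.1, 0.2, 0.3], threshold=0.5)
```

Passing the distances as an ordered sequence lets the same rule cover the "Early", "Middle", and "Late" stages without hard-coding the number of coarsening levels.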

Fig 5.

iHerd highlights extensive cell-type-specific divergent genes in brain disorders.

(a-c) The analysis of excitatory neurons from control to MDD. (a) The UMAP of one example of an early divergent gene (EDG), ENPEP, and the normalized L2 distance among different stages for EDG. (b) The UMAP of one example of a late divergent gene (LDG), INPP5D, and the normalized L2 distance among different stages for LDG. (c) The normalized correlation changes for the top rewiring gene and the top conserved gene. The top rewiring gene “ENPEP” shows larger correlation changes with other genes, while the top conserved gene “ZNF804A” shows almost no correlation changes with other genes. (d-f) The analysis of excitatory neurons from control to PTSD. (d) The UMAP of one example of an early divergent gene, TGFBR3, and the normalized L2 distance among different stages for EDG. (e) The UMAP of one example of a late divergent gene, ADARB2, and the normalized L2 distance among different stages for LDG. (f) The normalized correlation changes for the top rewiring gene and the top conserved gene. The top rewiring gene “TGFBR3” shows larger correlation changes with other genes, while the top conserved gene “SLC26A4-AS1” shows almost no correlation changes with other genes.

Table 4.

EDG vs LDG benchmarking.

Fig 6.

Simulated GCN experiments.

(a) Simulation scheme on GCNs. (b) The violin plot of the false positive test. (c) The distributions of the node change distance for the false positive test.

Fig 7.

The robustness and biological relevance analysis of the graph coarsen module in iHerd.

(a) The boxplot depicting the distribution of community sizes (node counts) in each run of iHerd. (b) A heatmap visualization of the overlap ratios calculated between communities identified across multiple runs of iHerd. (c) Enrichment heatmap from GO enrichment analysis of communities identified by iHerd.
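The overlap heatmap in panel (b) compares communities found in different runs. The caption does not define the overlap ratio, so the sketch below assumes the Jaccard index (intersection size divided by union size); the function names `overlap_ratio` and `overlap_matrix` and the toy communities are hypothetical.

```python
def overlap_ratio(community_a, community_b):
    """Jaccard index of two communities (an assumed definition of overlap)."""
    a, b = set(community_a), set(community_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_matrix(run_a, run_b):
    """Pairwise overlap ratios: rows are communities of run_a, columns of run_b."""
    return [[overlap_ratio(ca, cb) for cb in run_b] for ca in run_a]


# Two toy runs that partition the same five genes differently.
run1 = [{"g1", "g2", "g3"}, {"g4", "g5"}]
run2 = [{"g1", "g2"}, {"g3", "g4", "g5"}]

matrix = overlap_matrix(run1, run2)
```

A matrix with one value near 1 per row would indicate that each community in one run has a close counterpart in the other, which is the kind of stability a heatmap like panel (b) is meant to reveal.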

Fig 8.

The parameter tuning for iHerd.

(a) The bar plot of the number of nodes per level for control and disease samples in excitatory neurons and microglia. (b) The line plot of running time with different embedding dimensions and different learning frameworks for controls in excitatory neurons and microglia. (c) The line plot of network modality for different numbers of coarsening rounds (zero rounds indicates the initial state).

Table 5.

Statistics of GRNs.
