Fig 1.
(A) Starting from single-cell data (I), we compute the distance matrix of cells in PCA space (III) and infer the GRN of each individual cell (II). Computing the Euclidean distance between cells in the space defined by the out-degree and in-degree of all genes yields the cell distance matrix in GRN space (IV). The Hadamard product of the PCA-space and GRN-space distance matrices, with a constraint on the number of neighbors, gives the cell differentiation graph G (V). In parallel, we estimate a rough pseudotime for the cells from their relationships in GRN space (VI). We use the rough pseudotime to correct G and then take the minimum spanning tree of G to obtain the cell differentiation trajectory graph (VII). (B) The number k of supergenes obtained during gene pooling as a function of the number l of original genes in each group. We select the optimal number of supergenes and the number of original genes per group by calculating the mean number of supergenes in the preceding and following sliding windows. (C) Graph of the tick function, which attains its minimum at x = 1. We assume that, at the cluster scale, the distance between two clusters is minimized when their pseudotime difference is 1, meaning that the probability of an edge connecting the two clusters is maximized.
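As a reading aid for panel (A), steps (III) to (V) and the spanning-tree part of step (VII) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `differentiation_tree`, the parameter `n_neighbors`, and the symmetrization of the neighbor graph are all hypothetical choices, and the pseudotime correction of G (step VI) is omitted.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree


def differentiation_tree(pca_coords, degree_features, n_neighbors=5):
    """Sketch: combine two cell-cell distance matrices and extract a spanning tree.

    pca_coords      : (cells x components) array, cells embedded in PCA space.
    degree_features : (cells x 2*genes) array, per-cell in-/out-degree of each gene.
    n_neighbors     : hypothetical neighbor cap used to sparsify the graph.
    Returns a sorted list of (i, j, weight) edges.
    """
    d_pca = squareform(pdist(pca_coords, metric="euclidean"))
    d_grn = squareform(pdist(degree_features, metric="euclidean"))
    combined = d_pca * d_grn  # element-wise (Hadamard) product

    n = combined.shape[0]
    graph = np.zeros_like(combined)
    for i in range(n):
        order = np.argsort(combined[i])
        nearest = [j for j in order if j != i][:n_neighbors]
        graph[i, nearest] = combined[i, nearest]
    # Assumption: keep an edge if either endpoint lists the other as a neighbor.
    graph = np.maximum(graph, graph.T)

    # SciPy treats zero entries of a dense matrix as "no edge", so duplicate
    # cells (distance 0) are not linked. If the neighbor graph is disconnected,
    # the result is a spanning forest rather than a single tree.
    mst = minimum_spanning_tree(graph).tocoo()
    return sorted(
        (int(min(i, j)), int(max(i, j)), float(w))
        for i, j, w in zip(mst.row, mst.col, mst.data)
    )
```

With `n_neighbors` set to the number of cells minus one, the neighbor constraint is inactive and the result is the minimum spanning tree of the full Hadamard-product distance matrix.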
Fig 2.
Differentiation trajectory plots for several datasets: (A) cellbench-SC1_luyitian, (B) distal-lung-epithelium_treutlein, (C) mouse-cell-atlas-combination-5, (D) germline-human-female_li, (E) hematopoiesis-gates_olsson, (F) mESC-differentiation_hayashi, (G) neonatal-inner-ear-SC-HC_burns, and (H) placenta-trophoblast-differentiation_mca.
Table 1.
Comparison of scGRN-Entropy with existing state-of-the-art methods in terms of accuracy across 14 real datasets encompassing 8 types of differentiation trajectories.
Fig 3.
(A) The expression patterns of the 20 genes most positively correlated with pseudotime and the 20 genes most negatively correlated with it clearly reflect the characteristics of cells at different stages: Female FGC1, FGC2, and FGC3. Genes highly expressed in the early stage (Female FGC1) gradually decrease in expression over time, whereas genes highly expressed in the later stage (Female FGC3) gradually increase. During the transitional stage (Female FGC2), all 40 genes display varying levels of expression. (B) The in-degree and out-degree of supergenes, plotted on the horizontal and vertical axes, respectively, show that supergenes highly expressed during the transition period exhibit an increase in in-degree followed by a gradual decline. In contrast, supergenes highly expressed in the early and late stages show opposite trends in both in-degree and out-degree.
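The gene selection behind panel (A) can be illustrated with a short ranking sketch. The caption does not state which correlation measure was used, so Spearman rank correlation is an assumption here, and the function name `rank_genes_by_pseudotime` and its parameters are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata


def rank_genes_by_pseudotime(expr, pseudotime, n_top=20):
    """Sketch: pick the genes most positively / negatively correlated with pseudotime.

    expr       : (cells x genes) expression matrix.
    pseudotime : (cells,) pseudotime value per cell.
    Returns (positive_idx, negative_idx, corr), with indices into the gene axis.
    """
    # Spearman correlation = Pearson correlation computed on ranks (assumed measure).
    r_expr = rankdata(expr, axis=0)
    r_time = rankdata(pseudotime)
    r_expr = r_expr - r_expr.mean(axis=0)
    r_time = r_time - r_time.mean()

    num = (r_expr * r_time[:, None]).sum(axis=0)
    denom = np.sqrt((r_expr ** 2).sum(axis=0) * (r_time ** 2).sum())
    # Genes with constant expression have zero variance; assign them correlation 0.
    corr = np.divide(num, denom, out=np.zeros_like(num), where=denom > 0)

    order = np.argsort(corr)
    negative_idx = order[:n_top]        # most negative first
    positive_idx = order[::-1][:n_top]  # most positive first
    return positive_idx, negative_idx, corr
```

Calling it with `n_top=20` returns two index sets of 20 genes each, matching the 40 genes shown in the heatmap.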
Fig 4.
(A) KEGG enrichment analysis results of genes that were positively correlated with pseudotime, using the enrichment dataset “KEGG 2016”. (B) GO enrichment analysis results of genes that were positively correlated with pseudotime, using the enrichment dataset “GO Biological Process 2021”. (C) KEGG enrichment analysis results of genes that were negatively correlated with pseudotime, using the enrichment dataset “KEGG 2016”. (D) GO enrichment analysis results of genes that were negatively correlated with pseudotime, using the enrichment dataset “GO Biological Process 2021”.
Fig 5.
Evolution of gene-gene regulatory strength over time, showing that the gene regulatory network underwent significant changes initially. After a certain period, only Gene2 continued to exhibit slight regulatory influence on other genes until it eventually stabilized.
Table 2.
Chemical reactions and their propensity functions.