Hierarchical marker genes selection in scRNA-seq analysis

doi:10.1371/journal.pcbi.1012643

Fig 1.

Overview of hierarchical marker gene selection in PBMC3k data.

(a) Marker gene heatmap generated by the one-vs-all FindMarker approach in Seurat. (b) Our constructed hierarchical structure of cell clusters in the PBMC3k dataset. (c) Assembled heatmap that concatenates marker gene heatmaps for individual splits in the constructed cell cluster hierarchy.


Fig 2.

Hierarchical marker gene selection in PBMC control dataset.

(a) Marker gene heatmap generated by the one-vs-all FindMarker approach in Seurat. (b) Constructed hierarchy of cell clusters in PBMC control dataset. (c) Assembled heatmap that summarizes all marker genes for various splits in the cell cluster hierarchy.


Fig 3.

Hierarchical marker gene selection in PBMC stim dataset.

(a) Marker gene heatmap generated by the one-vs-all FindMarker approach in Seurat. (b) Constructed hierarchy of cell clusters in PBMC stim dataset. (c) Assembled heatmap that summarizes all marker genes for various splits in the cell cluster hierarchy.


Fig 4.

Comparison of hierarchical marker genes with two baselines and three existing marker gene selection methods.

Baselines are either all genes or highly variable genes. The three existing approaches are the flat one-vs-all FindMarker in Seurat, the flat version of scGeneFit, and the hierarchical version of scGeneFit. For each evaluation dataset, we trained a K-Nearest Neighbor classifier on 70% of the cells and tested classification accuracy on the remaining 30% of the cells. (a) Classification accuracies for the PBMC3k dataset; (b) Classification accuracies for the PBMC control dataset; (c) Classification accuracies for the PBMC stim dataset.
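The evaluation protocol in this caption (restrict the expression matrix to a gene set, fit a K-Nearest Neighbor classifier on 70% of the cells, score it on the remaining 30%) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the value of k, the Euclidean distance, the unstratified random split, the toy expression matrix, and the helper names `knn_predict` and `split_accuracy` are all assumptions, since the caption does not specify them.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=5):
    """Majority vote among the k nearest training cells (Euclidean distance)."""
    # Pairwise squared distances, shape (n_test, n_train).
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest]  # labels of the k neighbours, shape (n_test, k)
    return np.array([np.bincount(row).argmax() for row in votes])

def split_accuracy(X, y, gene_idx, train_frac=0.7, k=5, seed=0):
    """Accuracy on held-out cells when only the genes in gene_idx are used."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(y))
    n_train = int(round(train_frac * len(y)))
    train, test = perm[:n_train], perm[n_train:]
    X_sub = X[:, gene_idx]
    pred = knn_predict(X_sub[train], y[train], X_sub[test], k=k)
    return float((pred == y[test]).mean())

# Synthetic stand-in for a normalized cells-by-genes matrix (assumption).
rng = np.random.default_rng(0)
n_cells, n_genes, n_types = 300, 50, 3
y = rng.integers(0, n_types, size=n_cells)
X = rng.normal(size=(n_cells, n_genes))
for c in range(n_types):
    # Genes 2c and 2c+1 act as toy markers for cell type c.
    X[y == c, 2 * c:2 * c + 2] += 4.0

marker_idx = np.arange(2 * n_types)  # hypothetical selected marker genes
all_idx = np.arange(n_genes)         # "all genes" baseline

acc_markers = split_accuracy(X, y, marker_idx)
acc_all = split_accuracy(X, y, all_idx)
print(f"toy markers: {acc_markers:.3f}  all genes: {acc_all:.3f}")
```

With real data, `marker_idx` would hold the column indices of the genes returned by whichever selection method is being evaluated, and the same split would be reused across methods so that the accuracies are comparable.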


Fig 5.

UMAP visualization of hierarchical marker genes, two baselines, and three existing marker gene selection methods, applied to three datasets.

(a) UMAP visualizations of PBMC3k dataset colored by cell types; (b) UMAP visualizations of PBMC control dataset; (c) UMAP visualizations of PBMC stim dataset.


Fig 6.

Comparison of hierarchical marker genes with two baselines and three existing marker gene selection methods in the context of cell type mapping.

Given two scRNA-seq datasets with a significant batch effect between them, we trained a K-Nearest Neighbor classifier on one dataset (reference) and tested classification accuracy on the other dataset (query). (a) Classification accuracies with PBMC control as reference and PBMC stim as query; (b) Classification accuracies with PBMC stim as reference and PBMC control as query.
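The reference-to-query mapping described in this caption differs from a within-dataset split in that the classifier never sees cells from the query batch. A rough sketch, under stated assumptions: the data are synthetic, the batch effect is modeled as a simple per-gene additive shift, k and the Euclidean distance are arbitrary choices, and `simulate` and `knn_accuracy` are hypothetical helpers rather than anything taken from the paper.

```python
import numpy as np

def simulate(n_cells, n_genes, n_types, offset, rng):
    """Toy expression matrix: cell type c has genes 2c and 2c+1 elevated."""
    y = rng.integers(0, n_types, size=n_cells)
    X = rng.normal(size=(n_cells, n_genes)) + offset  # offset models the batch
    for c in range(n_types):
        X[y == c, 2 * c:2 * c + 2] += 4.0
    return X, y

def knn_accuracy(X_ref, y_ref, X_query, y_query, k=5):
    """Fit on the reference cells only, then score on the query cells."""
    d2 = ((X_query[:, None, :] - X_ref[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    pred = np.array([np.bincount(v).argmax() for v in y_ref[nearest]])
    return float((pred == y_query).mean())

rng = np.random.default_rng(1)
n_genes, n_types = 40, 3
batch_shift = rng.normal(scale=1.0, size=n_genes)  # assumed additive batch effect

X_ctrl, y_ctrl = simulate(200, n_genes, n_types, 0.0, rng)
X_stim, y_stim = simulate(200, n_genes, n_types, batch_shift, rng)

gene_sets = {
    "all genes": np.arange(n_genes),
    "toy markers": np.arange(2 * n_types),  # hypothetical selected markers
}

results = {}
for name, idx in gene_sets.items():
    results[name] = (
        knn_accuracy(X_ctrl[:, idx], y_ctrl, X_stim[:, idx], y_stim),  # ctrl -> stim
        knn_accuracy(X_stim[:, idx], y_stim, X_ctrl[:, idx], y_ctrl),  # stim -> ctrl
    )
    print(f"{name}: ctrl->stim {results[name][0]:.3f}, stim->ctrl {results[name][1]:.3f}")
```

Evaluating both directions, as panels (a) and (b) do, shows whether a gene set transfers symmetrically or only works when one particular batch serves as the reference.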


Fig 7.

Marker gene heatmap generated by the one-vs-all FindMarker approach in Seurat.
