Fig 1.

Model architecture.

In the training stage, a graph autoencoder (GAE) and a denoising autoencoder (DAE) are constructed simultaneously. Multi-head attention blocks combine the denoising embedding and the topological embedding. In the clustering stage, the predicted label of each cell is obtained by self-optimizing clustering.

Table 1.

General information on the 16 scRNA-seq datasets used for method evaluation.

Fig 2.

ARI scores of AttentionAE-sc (our method) and all baseline methods on 16 scRNA-seq datasets.

Each method was run five times on each dataset with different random seeds, and the arithmetic mean of the five runs is reported. Methods that require the number of clusters to be specified are marked with an asterisk (*).
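
As an illustration of the scoring described in this caption, the following is a minimal sketch of the adjusted Rand index (ARI) averaged over several runs. It uses the standard pair-counting definition of ARI and only the Python standard library. The function names and the toy labels are hypothetical and are not taken from the AttentionAE-sc code.

```python
from collections import Counter
from math import comb
from statistics import mean


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting ARI between two labelings of the same cells."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("both labelings must cover the same cells")
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: treated as trivially identical
    # Pairs of cells that share a label in both, in truth, and in prediction.
    both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    in_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    in_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = in_true * in_pred / total_pairs
    maximum = (in_true + in_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (both - expected) / (maximum - expected)


def mean_ari_over_seeds(labels_true, predictions_by_seed):
    """Arithmetic mean of ARI over runs (one predicted labeling per seed)."""
    return mean(adjusted_rand_index(labels_true, p) for p in predictions_by_seed)


# Toy example (hypothetical labels, not data from the paper).
truth = [0, 0, 1, 1]
runs = [[1, 1, 0, 0], [0, 1, 0, 1]]
print(mean_ari_over_seeds(truth, runs))
```

ARI is invariant to how the clusters are named, so the first toy run scores as a perfect match even though its label values are swapped.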

Fig 3.

Evaluations of cell embedding.

a. Clustering performance of AttentionAE-sc and four methods based on community detection algorithms, using Silhouette scores as the metric. Each box contains the results for 16 datasets (each run 5 times with different random seeds). In the box plot, the dotted line represents the median score, and the upper and lower solid lines represent the maximum and minimum scores. b. Comparison of UMAP visualizations from different methods on the Romanov dataset. AttentionAE-sc learned a more clustering-friendly embedding for single-cell clustering.
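
For readers unfamiliar with the metric in panel a, here is a minimal sketch of the mean Silhouette score using the standard definition. It assumes Euclidean distance between embedding vectors, which the caption does not specify. The function name and the toy points are hypothetical.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette over all points, using Euclidean distance (an assumption)."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        same = [j for j in clusters[label] if j != i]
        if not same:
            scores.append(0.0)  # singleton clusters score 0 by convention
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = mean(dist(points[i], points[j]) for j in same)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denominator = max(a, b)
        scores.append(0.0 if denominator == 0 else (b - a) / denominator)
    return mean(scores)


# Toy embedding (hypothetical): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(silhouette_score(points, [0, 0, 1, 1]))  # close to 1: compact, separated
print(silhouette_score(points, [0, 1, 0, 1]))  # negative: clusters are mixed
```

Scores range from -1 to 1, and higher values indicate an embedding in which cells sit closer to their own cluster than to the nearest other cluster.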

Fig 4.

AttentionAE-sc constructs better relationships among cells (Muraro).

a. The number of predicted cell groups and the ARI score over training epochs during the clustering stage. Better clustering performance was obtained when the cluster centers were adaptively chosen at each iteration and redundant cluster centers were discarded. Other datasets are shown in S4 Fig. b. Visualization of cell embeddings. c. Heatmap of the relative cosine distance between cells calculated from the embeddings of the multi-head attention layer. d. Heatmap of the relative cosine distance between cells calculated from the input expression matrix. In the right subfigure, the cells are sorted by cell group to give an intuitive visualization of the distances among cells (c, d, e). e. Heatmap of the relative cosine distance between cells calculated from the embeddings of an ordinary denoising autoencoder (DAE), and the corresponding visualization.
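
The sorted distance heatmaps in panels c to e can be approximated with the sketch below. It computes the plain pairwise cosine distance and reorders the cells by group so that same-group cells form contiguous blocks. How the authors rescale this to a "relative" distance is not stated in the caption and is not reproduced here. The function names and the toy vectors are hypothetical.

```python
from math import sqrt


def cosine_distance(u, v):
    """1 - cosine similarity; zero vectors get distance 1.0 by convention here."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def sorted_distance_matrix(embeddings, groups):
    """Pairwise cosine distances with cells reordered by group label."""
    # sorted() is stable, so cells keep their original order within a group.
    order = sorted(range(len(groups)), key=lambda i: groups[i])
    matrix = [
        [cosine_distance(embeddings[i], embeddings[j]) for j in order]
        for i in order
    ]
    return order, matrix


# Toy embeddings (hypothetical): cells 0 and 2 belong to the same group.
embeddings = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
groups = ["a", "b", "a"]
order, matrix = sorted_distance_matrix(embeddings, groups)
print(order)
for row in matrix:
    print([round(value, 3) for value in row])
```

After sorting, low-distance blocks along the diagonal indicate that cells of the same group lie close together in the chosen representation.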

Fig 5.

Hyperparameter search and ablation experiment.

a. Comparison of different methods for calculating cell connectivity. The ’gauss’ method outperformed the UMAP method. b. Comparison of different settings for the number of attention heads. c. Comparison of different settings for the resolution parameter in the Leiden algorithm. d. Comparison of different settings for the number of highly variable genes. e. Model ablation experiment of AttentionAE-sc on 8 scRNA-seq datasets. Four conditions were tested: without the information fusion block (wo attn), without the ZINB loss function (wo zinb), without residual connections (wo res), and without the GAE (wo gnn).

Fig 6.

Experiments on the BRCA dataset.

a. UMAP visualization of the clustering results on the BRCA dataset. b. Comparison of cell distribution between the predicted clusters and the ground-truth cell lines. In the heatmap, each row represents the cells from one cell line, and each column represents the cells from one predicted cluster. An ideal clustering result generally shows a one-to-one correspondence between cell lines and predicted clusters. c. Expression dot plot of identified DEGs in different cell lines. Based on the predicted cell labels, DEGs of all clusters were calculated by the Wilcoxon test (cluster 15 contains only 30 cells, so it was excluded). The top DEG of each predicted cluster was selected as a potential marker gene. d. The dot plot visualizes expression in each cell line; most cell lines can be distinguished by the obtained DEGs. BCAS3 was specifically expressed in cluster 18, i.e. MCF7 and KPL1. e. UMAP visualization of the sub-clustering results of AttentionAE-sc on cluster 18, which consists of these two cell lines.
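
The matrix behind a heatmap like the one in panel b can be sketched as a contingency table of cell lines against predicted clusters. This is a minimal illustration using raw counts. Whether the published heatmap shows counts or normalized proportions is not stated in the caption. The function name and the toy labels are hypothetical.

```python
from collections import Counter


def contingency_table(cell_lines, clusters):
    """Count cells per (cell line, predicted cluster) pair.

    Rows are cell lines and columns are predicted clusters, both sorted.
    """
    if len(cell_lines) != len(clusters):
        raise ValueError("need one cell line and one cluster label per cell")
    rows = sorted(set(cell_lines))
    cols = sorted(set(clusters))
    counts = Counter(zip(cell_lines, clusters))
    table = [[counts[(row, col)] for col in cols] for row in rows]
    return rows, cols, table


# Toy labels (hypothetical, not data from the paper).
cell_lines = ["line_A", "line_A", "line_B", "line_B", "line_C", "line_C"]
clusters = [0, 0, 1, 1, 1, 1]
rows, cols, table = contingency_table(cell_lines, clusters)
for name, counts_row in zip(rows, table):
    print(name, counts_row)
```

In this toy example, two cell lines fall into the same predicted cluster, which is the kind of pattern that would motivate sub-clustering that cluster.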
