
Fig 1.

Imputation performance of Network (blue), DrImpute (green), kNN-smoothing (pink), SAVER (yellow), scImpute (turquoise), SCRABBLE (purple) and Ensemble (orange).

A) Distribution of average expression levels (per gene) in each dataset. Quartiles are represented by vertical lines. B) Pearson correlation coefficient, for each gene, between the imputation by the specified method and the original values before masking. Only values that could be imputed by all methods (non-zero imputation) were considered. Correlations per gene across cells were computed for all genes for which at least 10 imputation values were available for analysis. Expression quartiles were determined for each dataset separately, on the masked data.
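The per-gene evaluation described in panel B can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the argument layout (genes x cells NumPy arrays, a boolean mask of masked entries) and the handling of constant vectors are all assumptions made for the example.

```python
import numpy as np

def per_gene_correlation(original, imputed_by_method, masked, min_values=10):
    """Pearson r per gene between imputed and original (pre-masking) values.

    original:          (genes x cells) array of values before masking
    imputed_by_method: dict of method name -> (genes x cells) imputed array
    masked:            boolean (genes x cells) array, True where a value was masked
    Returns a dict of method name -> array of per-gene r (NaN where not computed).
    """
    # Keep only masked entries that every method imputed with a non-zero value.
    usable = masked.copy()
    for imputed in imputed_by_method.values():
        usable &= imputed != 0

    n_genes = original.shape[0]
    result = {name: np.full(n_genes, np.nan) for name in imputed_by_method}
    for g in range(n_genes):
        idx = usable[g]
        # Skip genes with fewer than `min_values` comparable entries.
        if idx.sum() < min_values:
            continue
        truth = original[g, idx]
        for name, imputed in imputed_by_method.items():
            pred = imputed[g, idx]
            # Correlation is undefined for constant vectors; leave NaN (assumption).
            if truth.std() == 0 or pred.std() == 0:
                continue
            result[name][g] = np.corrcoef(truth, pred)[0, 1]
    return result
```

Restricting the comparison to entries imputed by all methods keeps the per-gene correlations comparable across methods, at the cost of discarding genes with too few shared entries.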

Table 1.

Percentage of genes best imputed by each method (highest Pearson correlation coefficient) in the seven test datasets restricted to values that could be imputed by all individual methods.

Fig 2.

Characterization of the genes best predicted by Network (blue), DrImpute (green), kNN-smoothing (pink), SCRABBLE (bulk and single cell data as reference; purple) and SAVER (yellow) in the Lung Atlas 10X dataset.

A) Determination of the top-performing methods for each gene. The best method and any method close to it (correlation no more than 0.1 below that of the best method) were selected as top performers. Genes for which all methods were top performers were included in the background but not in the foreground. B) Distribution of missing values per gene, average expression levels and variance of the genes for which a given method is one of the top performers, compared against all tested genes (background). Average gene expression is shown as log2-transformed normalized expression. Too few genes were best predicted by scImpute, so no distributions were drawn for this method.
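The selection rule in panel A can be illustrated with a short sketch. It assumes the threshold means "within 0.1 of the best correlation for that gene" and that every method has a defined correlation for every gene; the function name and data layout are hypothetical.

```python
import numpy as np

def top_performers(corr, tolerance=0.1):
    """Select top-performing methods per gene.

    corr: dict of method name -> 1-D array of per-gene correlations
          (same gene order for every method, no NaNs assumed).
    Returns (foreground, all_top):
      foreground: dict of method name -> indices of genes where the method is a
                  top performer, excluding genes where all methods are top
      all_top:    boolean array, True where every method is a top performer
    """
    names = list(corr)
    matrix = np.vstack([np.asarray(corr[n], dtype=float) for n in names])
    best = matrix.max(axis=0)            # best correlation per gene
    is_top = matrix >= best - tolerance  # within `tolerance` of the best
    all_top = is_top.all(axis=0)         # background-only genes
    foreground = {
        n: np.flatnonzero(is_top[i] & ~all_top) for i, n in enumerate(names)
    }
    return foreground, all_top
```

Genes flagged in `all_top` carry no information about which method is preferable, which is why they stay in the background only.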

Fig 3.

Comparison of impact of different imputation methods on cell trajectory inference for time course hESC differentiation data.

Cells were projected onto the trajectories determined with slingshot to compute a pseudotime of differentiation. The pseudotime assigned to each individual cell (y-axis) is then compared to the time point label of the sample (x-axis). Colored points represent the mean pseudotime per known time point. In the title, the Pearson’s r between pseudotime and time point labels (and the respective p-value) are shown. Note that slingshot always fails to correctly position the sample at time point 0, suggesting an artifact in the original data.
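The comparison between pseudotime and time point labels can be sketched as below. The trajectory inference itself (slingshot) is not reproduced; the sketch assumes a pseudotime value per cell is already available, and the function name is hypothetical. The p-value reported in the figure is not computed here.

```python
import numpy as np

def pseudotime_vs_time(pseudotime, time_labels):
    """Compare inferred pseudotime with known sampling time points.

    pseudotime:  per-cell pseudotime values
    time_labels: per-cell numeric time point labels
    Returns (r, means): Pearson r across cells, and the mean pseudotime
    per known time point.
    """
    pseudotime = np.asarray(pseudotime, dtype=float)
    time_labels = np.asarray(time_labels, dtype=float)
    r = np.corrcoef(time_labels, pseudotime)[0, 1]
    means = {
        float(t): float(pseudotime[time_labels == t].mean())
        for t in np.unique(time_labels)
    }
    return r, means
```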

Fig 4.

Detection of cell type-specific markers before and after imputation.

A) Number of significant (FDR < 0.05, |log2FC| > 0.25) cell type markers detected without dropout imputation and with each of the tested imputation methods. Horizontal dashed lines correspond to the number of markers detected irrespective of imputation. The darker-shaded fraction of each bar corresponds to the number of markers detected exclusively when using a given imputation approach. B) and C) Recovery of high confidence terms, defined as GO biological process terms significantly enriched (p-value < 0.001 and log2Enrichment > 0.5; see Methods) among the cluster markers detected without imputation. B) Sensitivity: fraction of high confidence terms detected as significantly enriched (p-value < 0.001 and log2Enrichment > 0.5) among the cluster markers detected with each imputation method. C) log2-enrichment of all high confidence terms among the cluster markers detected with each imputation method. DEC: definitive endoderm cells; EC: endothelial cells; H9: undifferentiated human embryonic stem cells; HFF: human foreskin fibroblasts; NPC: neural progenitor cells; TB: trophoblast-like cells.
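The thresholds and the sensitivity measure in this caption can be expressed as simple set operations. The sketch below is illustrative only: the differential expression and GO enrichment tests are assumed to have been run elsewhere, and all function names and input layouts are hypothetical. "Exclusive" is interpreted here as detected with one approach and with no other approach supplied in the input.

```python
def significant_markers(stats, fdr_cutoff=0.05, lfc_cutoff=0.25):
    """stats: dict of gene -> (fdr, log2fc). Returns the set of significant markers."""
    return {
        gene
        for gene, (fdr, lfc) in stats.items()
        if fdr < fdr_cutoff and abs(lfc) > lfc_cutoff
    }

def exclusive_markers(markers_by_method, method):
    """Markers detected with `method` and with no other entry in the dict."""
    others = set().union(
        *(m for name, m in markers_by_method.items() if name != method)
    )
    return markers_by_method[method] - others

def sensitivity(high_confidence_terms, enriched_terms):
    """Fraction of high confidence terms that are also enriched after imputation."""
    if not high_confidence_terms:
        return float("nan")
    return len(high_confidence_terms & enriched_terms) / len(high_confidence_terms)
```

To count markers exclusive to an imputation method relative to the unimputed data as well, the unimputed marker set would be included in `markers_by_method` under its own key.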

Fig 5.

Detection of cell type-specific transcription factors is improved upon network-based imputation.

A) Enrichment score for the GO term “DNA-binding transcription factor activity” among the genes uniquely detected after each imputation approach. B) Projection of cells onto a low-dimensional representation of the data before imputation, using ZINB-WaVE [30]. Color represents normalized expression levels of EHF (top) and OSR1 (bottom) before and after Network-based imputation. DEC: definitive endoderm cells; EC: endothelial cells; H9: undifferentiated human embryonic stem cells; HFF: human foreskin fibroblasts; NPC: neural progenitor cells; TB: trophoblast-like cells.
