CCAST: A Model-Based Gating Strategy to Isolate Homogeneous Subpopulations in a Heterogeneous Population of Single Cells
Figure 7
CCAST analysis on SUM159 breast cancer results.
A Results of the estimation process for the split point statistics in all the inner nodes in Figure 6. The root node, corresponding to EPCAM, shows one local maximum and one global maximum. Gating the data from this global maximum results in 9 distinct subpopulations. Nodes 3, 4, 8, 9, 13 and 14 have clear natural maxima indicating optimal splits of the data into these 9 homogeneous subpopulations (see Figure 6), which correspond to the 9 bar plots in B.
B Bar plots of the 9 homogeneous subpopulations from Figure 6 across all 3 markers, with standard deviation intervals for each marker. The values on the bars at the left side of each plot correspond to the minimum of the 3 bar heights. Each side bar gives a sense of the relative difference between bar heights. The main title of each plot shows the corresponding leaf node bin on the tree in Figure 6. Predicted cell types 3 and 1 correspond to P3, P4, P7 and to P5, P6, P8, respectively, indicating more homogeneous subpopulations than expected. The bar plots show evidence of at least 5 distinct subpopulations, i.e. P1, P2, P5, P7 and P9.
C The Gupta et al. [3] gating strategy isolated 3 cell states (basal, stem, and luminal) using EPCAM as the major marker. They further used CD24 to sort out these 3 states. We also automatically identify EPCAM as the major marker, but use a combination of multiple splits on CD24 and EPCAM to produce 9 homogeneous bins.
D Comparison of the breast cancer subpopulations predicted by the CCAST and Gupta et al. [3] gating strategies shows potential evidence of contamination after sorting. This analysis indicates that the CCAST subpopulation P9 is clearly a mixture of the basal, stem, and luminal subpopulations from Gupta et al. [3]. The unique CCAST subpopulations P1 and P2 were not identified by Gupta et al. [3].
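The comparison described in panel D can be illustrated with a simple cross-tabulation of two label sets. The following is only a minimal sketch, not the CCAST implementation: the function names `cross_tabulate` and `purity` are hypothetical, and the toy labels are invented for illustration and are not data from the paper. It assumes each cell carries one CCAST bin label and one reference gate label; a bin whose cells are spread across several reference states has low purity and would suggest a mixture.

```python
from collections import Counter, defaultdict


def cross_tabulate(ccast_bins, reference_states):
    """Count how many cells in each bin fall into each reference state."""
    table = defaultdict(Counter)
    for bin_label, state in zip(ccast_bins, reference_states):
        table[bin_label][state] += 1
    return table


def purity(counts):
    """Fraction of a bin's cells that belong to its most common reference state."""
    total = sum(counts.values())
    return max(counts.values()) / total if total else 0.0


# Toy labels, invented for illustration only (not data from the paper).
ccast_bins = ["P3", "P3", "P3", "P9", "P9", "P9"]
reference_states = ["luminal", "luminal", "luminal", "basal", "stem", "luminal"]

table = cross_tabulate(ccast_bins, reference_states)
for bin_label in sorted(table):
    counts = table[bin_label]
    print(bin_label, dict(counts), round(purity(counts), 2))
```

In this toy input, the hypothetical bin labelled P9 is split evenly across the three reference states, which is the kind of pattern that would be read as a mixed bin, whereas the bin labelled P3 maps to a single state.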