A guide to automated apoptosis detection: How to make sense of imaging flow cytometry data

doi:10.1371/journal.pone.0197208

Fig 1.

Cell state classification for imaging flow cytometry.

(A) Machine learning is used to predict the unknown composition of cellular populations (e.g. viable (green) and apoptotic (red)). (B) The combination of experimental expertise (1, 2, 3) and data mining (3, 4, 5) facilitates the derivation of useful information from huge amounts of biological data.

Fig 2.

From feature to model selection.

(A-C) Mutual information maximization (MIM) and the Fisher criterion applied to linearly separable and nonseparable two-class problems (red and green). (D-G) In contrast to the correlation coefficient, mutual information (MI) recognizes both linear and nonlinear dependencies between random variables. (H-M) Classifying a two-class problem (red and green) with classifiers of different complexity yields different decision boundaries. (N) Reshuffling the training and validation sets (cross validation) helps to judge model stability. As a performance check, the best model is used to classify the test set, which consists of previously unseen data not used for model selection. (O) To find the model best suited for discrimination (black arrow), the sum of two sources of error (black, dashed) has to be minimized. Bias (light gray, dashed) decreases with increasing model complexity, whereas variance (dark gray, solid) increases.
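
The contrast drawn in panels D-G can be illustrated with a minimal sketch. It is not the authors' implementation: the histogram-based MI estimator, the bin count of 8, and the synthetic linear and parabolic data are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def binned(values, bins):
    """Map each value to an equal-width bin index (assumes max > min)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys, bins=8):
    """Plug-in MI estimate in bits from a 2D histogram (bin count is an assumption)."""
    bx, by = binned(xs, bins), binned(ys, bins)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log2(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


# Synthetic data: a symmetric grid, one linear and one nonlinear dependency.
xs = [i / 100 for i in range(-100, 101)]
linear = [2 * x + 1 for x in xs]
parabola = [x * x for x in xs]

for name, ys in (("linear", linear), ("parabola", parabola)):
    print(name, "r =", round(pearson(xs, ys), 3),
          "MI =", round(mutual_information(xs, ys), 3), "bits")
```

On the parabola the correlation coefficient is close to zero although y is fully determined by x, while the MI estimate stays clearly positive. The estimate depends on the binning, so absolute values should not be compared across different bin counts.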

Fig 3.

Traditional gating on labeled data vs. machine learning on morphological properties.

(A) Two-dimensional gating approach for Annexin V and propidium iodide (PI) staining. (B) Classification of kinetics data using the gates determined in A. (C-F) CV accuracy as a function of hyperparameters for different feature selection and classification schemes. The crosses indicate the winning models. (G) Unbiased model performance of the winning models. (H-I) Comparison of the traditional gating technique (blue) and the machine learning approach (yellow) for kinetics data and titration data. The crosses in I mark the ratios obtained from A, which are used for calibration.
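
A two-dimensional gate of the kind shown in panel A can be sketched as a pair of intensity thresholds. This is a hedged illustration only: the cutoffs, the event intensities, and the quadrant labels (which follow the common reading of Annexin V/PI staining) are hypothetical and are not taken from the paper.

```python
from collections import Counter


def gate(annexin_v, pi, annexin_cut, pi_cut):
    """Assign one event to a quadrant of a rectangular two-dimensional gate."""
    if annexin_v < annexin_cut and pi < pi_cut:
        return "viable"
    if annexin_v >= annexin_cut and pi < pi_cut:
        return "early apoptotic"
    if annexin_v >= annexin_cut:
        return "late apoptotic / necrotic"
    return "PI only"


# Hypothetical (Annexin V, PI) intensities and cutoffs, for illustration only.
events = [(120, 80), (950, 110), (1400, 900), (90, 60), (2000, 150)]
ANNEXIN_CUT, PI_CUT = 500, 400

labels = [gate(a, p, ANNEXIN_CUT, PI_CUT) for a, p in events]
counts = Counter(labels)
fractions = {label: n / len(events) for label, n in counts.items()}
print(fractions)
```

Population ratios obtained this way depend entirely on where the cutoffs are drawn, which is the manual step that the machine learning approach in the remaining panels is compared against.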

Fig 4.

Impact of class imbalance and feature quality on classification.

(A) Two-dimensional representation of viable (green) and apoptotic (red) cells with the corresponding separation boundaries obtained by LDA (black, dashed) and KNN classification (black, solid). (B) Comparison of CV accuracy (yellow) and its class-specific geometric mean (blue) for LDA (dashed) and KNN classification (solid) as measures of performance. (C) Class-specific CV accuracy of viable (v, green) and apoptotic (a, red) cells for LDA (dashed) and KNN classification (solid). (D) Class-specific confusion matrix obtained by LDA. (E-F) Temporal evolution of the size of the nucleus measured by 7AAD and of the loss of membrane integrity measured by Zombie Aqua. (G-H) Comparison of the distributions of nuclear size and loss of membrane integrity for viable (green) and apoptotic (red) cells.
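
Why panel B reports a class-specific geometric mean alongside plain accuracy can be seen in a small sketch. It assumes the geometric mean is taken over the per-class accuracies, which is the usual definition but is not spelled out in the caption. The 95:5 class ratio and the trivial classifier are hypothetical.

```python
import math


def class_accuracies(y_true, y_pred, classes):
    """Fraction of correctly predicted samples, computed separately per class."""
    result = {}
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        result[c] = sum(y_pred[i] == c for i in idx) / len(idx)
    return result


def geometric_mean(values):
    """Geometric mean of a collection of non-negative numbers."""
    values = list(values)
    return math.prod(values) ** (1 / len(values))


# Hypothetical imbalanced sample and a classifier that always predicts "viable".
y_true = ["viable"] * 95 + ["apoptotic"] * 5
y_pred = ["viable"] * 100

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
per_class = class_accuracies(y_true, y_pred, ["viable", "apoptotic"])
g_mean = geometric_mean(per_class.values())

print("overall accuracy:", overall)
print("per-class accuracy:", per_class)
print("geometric mean:", g_mean)
```

The overall accuracy looks high because the majority class dominates, whereas the geometric mean drops to zero as soon as one class is never recognized.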

Fig 5.

Information content of caspase-3.

(A) Feature selection via MIM. (B) Normalized caspase-3 intensity of viable cells without stimulation (green) and apoptotic (red) cells 180 min after stimulation. (C) Model selection for all features (blue) vs. all features except caspase-3 (yellow). (D) Performance comparison of different classification algorithms trained on all features except caspase-3. (E) Classification of kinetics data using the optimal models from C.
