Figure 1.
Proposed integrated phenotyping approach (FlowMax).
CFSE flow-cytometry time series are preprocessed to create one-dimensional fluorescence histograms that are used to determine the cell proliferation parameters for each time point, using the parameters of the previous time points as added constraints (step 1). Fluorescence parameters are then used to extend a cell population model and allow for direct training of the cell population parameters on the fluorescence histograms (step 2). To estimate solution sensitivity and redundancy, step 2 is repeated many times, solutions are filtered by score, parameter sensitivities are determined for each solution, non-redundant maximum-likelihood parameter ranges are found after clustering, and a final filtering step eliminates clusters representing poor solutions (step 3).
Figure 2.
(A) Noisy log-transformed cell fluorescence is modeled by a weighted mixture of Gaussian distributions, one for each cell division, parameterized according to equations describing variability in staining (CV), background fluorescence (b), dye dilution (r), and a small correction for the fluorescence of the initial population of cells (s). Weights for each Gaussian correspond to cell counts in each generation. (B) Analysis of the cell fluorescence model fitting accuracy for 1,000 generated CFSE fluorescence time courses (see also Tables S3 and S4). Average percent error in generational cell counts, normalized to the maximum generational cell count for each time course. Numbers indicate an error ≥ 0.5%. (C) Representative cell fluorescence model fits to experimental data from wildtype B cells at the indicated time points after the start of lipopolysaccharide (LPS) stimulation (red lines indicate the undivided population).
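The mixture described in panel (A) can be sketched as follows. This is a minimal illustration, not the paper's exact equations: it assumes that the peak for generation g sits at log10 of the initial fluorescence times r^g plus the background b, and it uses a single log-scale standard deviation `sigma` in place of the CV term. The correction s is omitted, and all function names and numeric values are hypothetical.

```python
import math

def peak_means(mu0, r, b, n_gen):
    # Assumed form: generation g retains a fraction r**g of the initial dye,
    # and background fluorescence b adds on the linear scale.
    f0 = 10 ** mu0
    return [math.log10(f0 * r ** g + b) for g in range(n_gen)]

def mixture_density(x, counts, means, sigma):
    # Weighted sum of Gaussians on the log10 fluorescence axis;
    # weights are the generational cell counts normalized to sum to 1.
    total = sum(counts)
    norm = sigma * math.sqrt(2 * math.pi)
    return sum(
        (n / total) * math.exp(-0.5 * ((x - m) / sigma) ** 2) / norm
        for n, m in zip(counts, means)
    )

# Illustrative values only, not fitted parameters.
means = peak_means(mu0=3.0, r=0.5, b=10.0, n_gen=4)
density_at_first_peak = mixture_density(means[0], [400, 300, 200, 100], means, sigma=0.1)
```

Because the background term is added before taking the logarithm, later peaks crowd together as the diluted dye signal approaches b, which is one reason high generations are hard to resolve.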
Figure 3.
The fcyton cell proliferation model.
(A) A graphical representation summarizing the model parameters required to calculate the total number of cells in each generation as a function of time. Division and death times are assumed to be log-normally distributed and different between undivided and dividing cells. Progressor fractions (Fs) determine the fraction of responding cells in each generation committed to division and protected from death. (B,C) Analysis of the accuracy associated with fitting fcyton parameters for a set of 1,000 generated realistic datasets of generational cell counts assuming perfect cell counts and an optimized ad hoc objective function (see Text S1 and Tables S3 and S4). (B) Average percent error in generational cell counts normalized to the maximum generational cell count for each time course. Numbers indicate an error ≥ 0.5%. (C) Analysis of the error associated with determining key fcyton parameters. Box plots represent 5, 25, 50, 75, and 95 percentile values. Outliers are not shown. For analysis of all fcyton parameter errors see also Figure S2 (green).
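As a rough illustration of panel (A), the sketch below computes only the undivided population over time. It assumes one reading of the caption: a fraction F0 of cells is committed to division and protected from death, the remainder can only die, and both waiting times are log-normal. The full model (Text S1) also covers later generations and is not reproduced here; function names and numeric values are hypothetical.

```python
import math

def lognormal_cdf(t, mu, sigma):
    # P(T <= t) for ln(T) ~ Normal(mu, sigma); zero for non-positive t.
    if t <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(t) - mu) / (sigma * math.sqrt(2.0))))

def undivided_cells(t, n0, f0, div, die):
    # Assumed reading of the caption: a fraction f0 of cells is committed to
    # division and cannot die; the remaining 1 - f0 can only die.
    waiting_to_divide = f0 * (1.0 - lognormal_cdf(t, *div))
    waiting_to_die = (1.0 - f0) * (1.0 - lognormal_cdf(t, *die))
    return n0 * (waiting_to_divide + waiting_to_die)

# Illustrative values only (times in hours), not fitted parameters.
remaining = [
    undivided_cells(t, 1000.0, 0.6, (math.log(40), 0.3), (math.log(30), 0.5))
    for t in (0, 24, 48, 96, 500)
]
```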
Figure 4.
Accuracy of phenotyping generated datasets in a sequential or integrated manner.
The accuracy of sequentially fitting Gaussians to fluorescence data to obtain cell counts for each generation (blue) and of integrated fitting of the fcyton model directly to fluorescence data, using fitted fluorescence parameters as adaptors (purple), was determined for 1,000 sets of randomly generated realistic CFSE time courses (see also Tables S3 and S4). (A) Average percent error in generational cell counts, normalized to the maximum generational cell count for each time course. Numbers indicate an error ≥ 0.5%. (B) Analysis of the error associated with determining key fcyton cellular parameters. Box plots represent 5, 25, 50, 75, and 95 percentile values. Outliers are not shown. For a comparison of all 12 parameters see Figure S1 (blue) and Figure S2 (purple).
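The error measure used in panel (A) can be sketched for a single time course as below. This is one plausible reading of the caption, assuming that the absolute difference in each generational count is divided by the largest true generational count anywhere in that time course. Averaging over the 1,000 time courses is left out, and the function name and example counts are hypothetical.

```python
def normalized_percent_errors(true_counts, fitted_counts):
    # Rows are time points, columns are generations.
    # Every entry is scaled by the largest true count in the whole time course.
    peak = max(max(row) for row in true_counts)
    return [
        [100.0 * abs(f - t) / peak for t, f in zip(true_row, fit_row)]
        for true_row, fit_row in zip(true_counts, fitted_counts)
    ]

# Made-up counts for two time points and two generations.
errors = normalized_percent_errors([[100, 0], [50, 100]], [[90, 0], [50, 110]])
# Mark the entries that would be labeled with a number (error >= 0.5%).
flagged = [[e >= 0.5 for e in row] for row in errors]
```

Normalizing by the time-course maximum rather than by each cell's own count keeps near-empty generations from dominating the error.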
Figure 5.
Comparison of FlowMax to the Cyton Calculator.
The Cyton Calculator [9] and a computational tool implementing our methodology, “FlowMax,” were used to train the cyton model with log-normally distributed division and death times on a CFSE time course of wildtype B cells stimulated with lipopolysaccharide (LPS). The best-fit generational cell counts were input to the Cyton Calculator. (A) Visual summary of the solution quality estimation pipeline implemented as part of FlowMax. Candidate parameter sets are filtered by the normalized % area difference score, parameter sensitivity ranges are calculated for each remaining set, and these ranges are clustered to reveal non-redundant maximum-likelihood parameter ranges (red ranges). Jagged lines represent the sum of uniform parameter distributions in each cluster. (B) Best-fit cyton model parameters determined using the Cyton Calculator (blue dots) and our phenotyping tool, FlowMax (red squares: individual fits, with error bars showing sensitivity ranges; green squares: weighted cluster averages, with error bars showing the intersection of parameter sensitivity ranges for the 41 solutions in the only identified cluster). (C) Plots of Fs (the fraction of cells dividing to the next generation) and log-normal distributions for the time to divide and die of undivided and dividing cells, sampled uniformly from the best-fit cluster ranges in (B). (D) Generational (colors) and total (black) cell counts are plotted as a function of time for 250 cyton parameter sets sampled uniformly from the intersection of best-fit cluster parameter ranges. Red dots show average experimental cell counts for each time point. Error bars show the standard deviation for duplicate runs.
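The range intersection in (B) and the uniform sampling in (C) and (D) can be illustrated with a short sketch. It assumes each solution contributes a (low, high) sensitivity range per parameter and that a cluster's range is the overlap of those intervals. The parameter names and values below are hypothetical placeholders, not the reported fits.

```python
import random

def intersect_ranges(ranges):
    # Overlap of closed intervals; None when they share no common point.
    lo = max(r[0] for r in ranges)
    hi = min(r[1] for r in ranges)
    return (lo, hi) if lo <= hi else None

def sample_parameter_sets(cluster_ranges, n, seed=0):
    # Draw each parameter independently and uniformly from its cluster range.
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in cluster_ranges.items()}
        for _ in range(n)
    ]

# Hypothetical sensitivity ranges for two parameters across three solutions.
solutions = {
    "F0": [(0.40, 0.70), (0.45, 0.75), (0.50, 0.65)],
    "Tdiv0": [(30.0, 45.0), (35.0, 50.0), (32.0, 42.0)],
}
cluster_ranges = {name: intersect_ranges(r) for name, r in solutions.items()}
samples = sample_parameter_sets(cluster_ranges, n=250)
```

A parameter whose ranges do not overlap yields `None` here and would need separate handling before sampling.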
Figure 6.
Testing the accuracy of the proposed approach as a function of data quality.
Six typical CFSE time courses of varying quality were generated and fitted using our methodology (Figure 1). (A-F) The best-fit cluster solutions are shown as overlays on top of black histograms for indicated time points. Conditions tested were (A) low CV, (B) high CV (e.g. poor staining), (C) 10% Gaussian count noise (e.g. mixed populations), (D) 10% Gaussian scale noise (e.g. poor mixing of cells), (E) four distributed time points (e.g. infrequent time points), and (F) four early time points from the first 48 hours (see Methods for full description). (G) Parameter sensitivity ranges for each solution in each non-redundant cluster are shown next to the maximum-likelihood parameter ranges for fcyton fitting. The actual parameter value is shown first (black dot).
Figure 7.
Phenotyping WT, nfkb1−/−, and rel−/− B cells stimulated with anti-IgM and LPS.
(A) Visual summaries of best-fit phenotype clusters for WT (top), nfkb1−/− (middle), and rel−/− (bottom) genotypes stimulated with anti-IgM (left) and LPS (right). To visualize cellular parameter sensitivity, 250 sets of parameters were selected randomly from within parameter sensitivity ranges and used to depict individual curves for the fraction of responding cells in each generation (Fs) and lognormal distributions for time-dependent probabilities to divide (Tdiv) and die (Tdie) for undivided and divided cells. (B) Tables summarizing the best-fit cellular parameters determined using the integrated computational tool, FlowMax, as well as the relative amount of cell cycling and survival reported in previous studies [12]. Values in parentheses represent the lognormal standard deviation parameters. (C) Total cell counts simulated with the fcyton model when the indicated combinations of nfkb1−/−-specific parameters were substituted by WT-specific parameters during anti-IgM stimulation (“chimeric” solutions). Dots show WT (red) and nfkb1−/− (blue) experimental counts. Error bars show the cell count standard deviation for duplicate runs.
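The "chimeric" substitution in panel (C) can be sketched as a simple parameter swap. The sketch assumes parameters are stored by name and grouped by the cellular process they describe, and it enumerates every combination of groups, whereas the figure shows only the indicated ones. All names and numbers are hypothetical placeholders, not the values in panel (B).

```python
from itertools import combinations

def chimeric_parameter_sets(mutant, wildtype, groups):
    # For every non-empty combination of parameter groups, start from the
    # mutant parameters and overwrite the chosen groups with WT values.
    chimeras = {}
    for k in range(1, len(groups) + 1):
        for combo in combinations(groups, k):
            params = dict(mutant)
            for group in combo:
                for name in groups[group]:
                    params[name] = wildtype[name]
            chimeras[combo] = params
    return chimeras

# Placeholder values for illustration only.
wt = {"F0": 0.6, "Tdie0_mean": 40.0, "Tdie0_sd": 0.4}
ko = {"F0": 0.2, "Tdie0_mean": 25.0, "Tdie0_sd": 0.5}
groups = {"F0": ["F0"], "Tdie0": ["Tdie0_mean", "Tdie0_sd"]}
chimeras = chimeric_parameter_sets(ko, wt, groups)
```

Each resulting parameter set would then be passed to the cell population model to simulate total cell counts for comparison with the experimental data.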