OpenCyto: An Open Source Infrastructure for Scalable, Robust, Reproducible, and Automated, End-to-End Flow Cytometry Data Analysis
When reproducing manual gating, raw FCS files and FlowJo workspace XML files are read into the R environment using parseWorkspace, creating a GatingSet object that represents the compensated, transformed, and gated data stored in an ncdfFlowSet on disk. Cell populations annotated with gates can be visualized using plotGate, from the flowViz package. Gating schemes can be visualized using plot. To perform automated gating, the user defines a CSV representation of a gating tree, which is read by the OpenCyto package to generate a gatingTemplate object. This template can be applied to a GatingSet that contains data but no gates, provided the data use the markers defined in the template. OpenCyto uses built-in automated gating methods, or external methods registered via a plug-in framework, to gate different cell subsets and populate the GatingSet with data-driven gate definitions for each sample. Manual and automated gating can then be compared within a single framework. Cell populations and features can be extracted for further statistical analysis with other R and Bioconductor software packages. The diagram represents data (red boxes), software packages (blue boxes), framework functionality (gray boxes), and data flow/data structures (arrows/labeled arrows). flowCore, flowStats, and flowViz are the core Bioconductor flow packages that benefit from the substantial infrastructure changes we have made to improve scalability and data visualization.
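To make the idea of a CSV representation of a gating tree more concrete, the following is a minimal, language-neutral sketch in Python. It is not OpenCyto's R API or implementation. The column names (alias, pop, parent, dims, gating_method), the example populations, and the helper functions read_gating_tree, gating_order, and missing_channels are all assumptions introduced for illustration. The sketch shows how rows that name a parent population define a tree, how a parent-before-child gating order can be derived from it, and how one might check that the data provide the markers the template requires.

```python
import csv
import io
from collections import defaultdict

# Hypothetical gating-tree CSV: one row per population. Column names,
# populations, and gating method names are illustrative assumptions.
TEMPLATE_CSV = """alias,pop,parent,dims,gating_method
nonDebris,+,root,FSC-A,mindensity
singlets,+,nonDebris,"FSC-A,FSC-H",singletGate
lymph,+,singlets,"FSC-A,SSC-A",flowClust
cd3,+,lymph,CD3,mindensity
cd4,+,cd3,CD4,mindensity
cd8,+,cd3,CD8,mindensity
"""


def read_gating_tree(text):
    """Parse CSV text into a {parent alias: [child rows]} mapping.

    Rows must list a parent before any of its children, and aliases
    must be unique; otherwise a ValueError is raised.
    """
    children = defaultdict(list)
    known = {"root"}
    for row in csv.DictReader(io.StringIO(text)):
        if row["parent"] not in known:
            raise ValueError(
                f"unknown parent {row['parent']!r} for {row['alias']!r}"
            )
        if row["alias"] in known:
            raise ValueError(f"duplicate alias {row['alias']!r}")
        known.add(row["alias"])
        children[row["parent"]].append(row)
    return dict(children)


def gating_order(children, node="root"):
    """Yield population aliases depth-first, so each parent precedes its children."""
    for row in children.get(node, []):
        yield row["alias"]
        yield from gating_order(children, row["alias"])


def missing_channels(children, available):
    """Return the sorted channels/markers the template needs but the data lack."""
    needed = {
        dim.strip()
        for rows in children.values()
        for row in rows
        for dim in row["dims"].split(",")
    }
    return sorted(needed - set(available))


if __name__ == "__main__":
    tree = read_gating_tree(TEMPLATE_CSV)
    print(list(gating_order(tree)))
    # Channels available in a hypothetical sample that lacks CD8.
    print(missing_channels(tree, ["FSC-A", "FSC-H", "SSC-A", "CD3", "CD4"]))
```

Storing the tree as a parent-to-children mapping keeps the traversal simple: a gating method for a population only needs to run after its parent has been gated, which the depth-first order guarantees.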