Informatics-Based Discovery of Disease-Associated Immune Profiles

  • Amber Delmas,

    Affiliations Department of Cancer Biology, The Scripps Research Institute, Jupiter, Florida, United States of America, Department of Immunology and Microbial Sciences, The Scripps Research Institute, Jupiter, Florida, United States of America

  • Angelos Oikonomopoulos,

    Affiliation Division of Digestive Disease, University of California Los Angeles, Los Angeles, California, United States of America

  • Precious N. Lacey,

    Affiliation Division of Digestive Disease, University of California Los Angeles, Los Angeles, California, United States of America

  • Mohammad Fallahi,

    Affiliation Informatics Core, The Scripps Institute, Jupiter, Florida, United States of America

  • Daniel W. Hommes,

    Affiliation Division of Digestive Disease, University of California Los Angeles, Los Angeles, California, United States of America

  • Mark S. Sundrud

    Affiliations Department of Cancer Biology, The Scripps Research Institute, Jupiter, Florida, United States of America, Department of Immunology and Microbial Sciences, The Scripps Research Institute, Jupiter, Florida, United States of America


Abstract

Advances in flow and mass cytometry are enabling ultra-high resolution immune profiling in mice and humans on an unprecedented scale. However, the resulting high-content datasets challenge traditional views of cytometry data, which are both limited in scope and biased by pre-existing hypotheses. Computational solutions are now emerging (e.g., Citrus, AutoGate, SPADE) that automate cell gating or enable visualization of relative subset abundance within healthy versus diseased mice or humans. Yet these tools require significant computational fluency and fail to show quantitative relationships between discrete immune phenotypes and continuous disease variables. Here we describe a simple informatics platform that uses hierarchical clustering and nearest neighbor algorithms to associate manually gated immune phenotypes with clinical or pre-clinical disease endpoints of interest in a rapid and unbiased manner. Using this approach, we identify discrete immune profiles that correspond with either weight loss or histologic colitis in a T cell transfer model of inflammatory bowel disease (IBD), and show distinct nodes of immune dysregulation in the IBDs, Crohn’s disease and ulcerative colitis. This streamlined informatics approach for cytometry data analysis leverages publicly available software, can be applied to manually or computationally gated cytometry data, is suitable for any clinical or pre-clinical setting, and embraces ultra-high content flow and mass cytometry as a discovery engine.

Introduction

In the current era of high content flow cytometry (FACS), and even higher content mass cytometry (CyTOF), large-scale immune profiling in mice and human patients is becoming commonplace [1–3]. However, the value of using ultra-high content cytometry as a discovery tool is undermined by traditional views of FACS data, and current methods of FACS analysis. Even a single 10-parameter FACS experiment performed on human PBMC or mouse splenocytes can distinguish, assuming simple bi-modal distribution of each parameter, up to 1,024 (2^10) distinct immune phenotypes. Yet only a fraction of this information is used because it is not practical to analyze hundreds of individual immune phenotypes “one-by-one”. Rather, cytometry data is routinely distilled down to focus on small numbers of well-characterized immune cell subsets that fit a hypothesis; this in turn generates bias, and disregards a large amount of potentially valuable information.
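The 1,024 figure follows from treating each of 10 markers as simply positive or negative. A minimal sketch of that arithmetic is given here; the marker names are placeholders chosen for illustration and are not the panel used in this study.

```python
from itertools import product

# Hypothetical 10-marker panel; the names are illustrative only.
markers = ["CD3", "CD4", "CD8", "CD25", "CD44",
           "CD62L", "CCR6", "CXCR3", "CD45", "CD19"]

def enumerate_phenotypes(marker_names):
    """Yield every positive/negative combination of the given markers."""
    for signs in product("+-", repeat=len(marker_names)):
        yield "".join(name + sign for name, sign in zip(marker_names, signs))

phenotypes = list(enumerate_phenotypes(markers))
n_phenotypes = len(phenotypes)  # one phenotype per +/- combination, i.e. 2 ** 10
```

Each additional bi-modal marker doubles the count, which is why one-by-one inspection quickly becomes impractical.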

Computational solutions (e.g., Citrus, AutoGate) have been developed to remove bias introduced by manual cell gating and identify phenotypically unique cell clusters [4,5]. Yet whether raw cytometry data is parsed manually or computationally, the challenge remains to link discrete immune phenotypes to disease endpoints of interest, both within and between groups of patients or experimental animals. Programs such as spanning tree progression of density-normalized events (SPADE) begin to address this issue [6]. Currently, however, SPADE is used largely as a visualization tool to display immunophenotypic differences between groups of healthy controls and patients [3,7,8], which ignores both within-group variability and immunophenotypic associations with clinical endpoints other than diagnosis. Thus, additional approaches are called for that enable a more personalized and dynamic view of immune profiling data.

Hierarchical clustering is a now-routine method of analyzing other forms of big data (e.g., transcriptomic profiling, genome sequencing), most commonly to identify groups of differentially expressed transcripts between distinct cell types, or between similar cell types isolated from different hosts [9,10]. Importantly, hierarchical clustering also identifies relationships (i.e., direct, inverse) between individual transcripts across all cells (i.e., within and between groups), which can then be used to infer gene regulatory networks [11]. However, the most common visual output of hierarchical clustering, the heatmap, lacks quantitative information about the similarity between variables. For this reason, many hierarchical clustering software packages, such as GenePattern, include nearest neighbor search algorithms (e.g., Euclidean distance, Manhattan distance, Pearson coefficient) that yield quantitative measurements of the similarity between variables [12]. Based on these concepts, we reasoned that combining hierarchical clustering with nearest neighbor searches in the context of immune profiling could identify relationships between large numbers of immune phenotypes and continuous clinical or pre-clinical disease endpoints. This “immuno-informatics” platform values increased readouts generated in ultra-high content cytometry experiments, can be applied downstream of manual or computational raw data analyses, and provides a comprehensive view of all immune phenotypes relative to discrete disease variables.
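To make the clustering idea concrete, the following is a minimal sketch of agglomerative clustering in which the distance between two variables is 1 minus their Pearson coefficient. It is not the GenePattern implementation: average linkage is an assumption made here for illustration, and the variables A to D are invented toy data.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_linkage(profiles):
    """Cluster variables bottom-up; return merges as (cluster, cluster, distance)."""
    names = list(profiles)
    # Pairwise distance: 0 for perfectly correlated, 2 for perfectly inverse.
    dist = {frozenset(pair): 1.0 - pearson_r(profiles[pair[0]], profiles[pair[1]])
            for pair in combinations(names, 2)}

    def linkage(c1, c2):
        # Average of all member-to-member distances (assumed linkage rule).
        return sum(dist[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Toy data: A and B rise together, C and D fall together.
toy = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1], "D": [9, 6, 4, 2]}
merges = average_linkage(toy)
```

In this toy case the directly related variables merge first, and the two inversely related groups join last at a distance close to 2. The merge order conveys grouping but not a ranked, quantitative similarity to any one variable, which is the gap a nearest neighbor search fills.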

To explore the utility of this approach, we used manually gated immune profiling datasets from both a prevalent T cell transfer mouse model of chronic colitis [13], and human inflammatory bowel disease (IBD) patients. The T cell transfer model of colitis is interesting because the primary endpoints of disease are weight loss and histologic colitis, yet weight loss in this model is highly variable and does not necessarily correlate with severity of colitis [13]. Further, immune phenotypes that correlate with either weight loss or histologic colitis in this model are poorly characterized. By combining hierarchical clustering with nearest neighbor searches, we rapidly identify quantitative associations between specific immune phenotypes and disease endpoints, both in the T cell transfer model of colitis and human IBD patients.

Materials and Methods


Wild type FVB/N (FVB; model no. FVB) mice were purchased from Taconic. FVB.Rag1-/- mice were provided by Dr. Allan Bieber (Mayo Clinic, Rochester, MN). All mice used in this study were housed, bred, used in experiments, and sacrificed in accordance with a protocol (13–019) approved by the Institutional Animal Care and Use Committee of Scripps Florida.

Human samples

All experiments using human blood were conducted in accordance with IRB protocols approved by institutional review boards at The Scripps Research Institute or UCLA. Blood was obtained at UCLA following informed written consent from healthy adults, Crohn’s disease patients, or ulcerative colitis patients; consenting patients provided clinical history and demographic data at time of phlebotomy. The UCLA institutional review board approved all procedures and forms used to obtain informed patient consent, and all documentation for consenting patients is stored on paper at UCLA. PBMC was isolated and cryopreserved in de-identified and barcoded vials following ficoll density centrifugation, and frozen vials were shipped to Scripps Florida for analyses. PBMC vials were thawed and stained immediately for FACS analysis (see below).

T cell transfer-induced colitis

CD4+CD25- T cells from spleens and peripheral lymph nodes of wild type female FVB mice were magnetically isolated using an EasySep T cell negative isolation kit (Stem Cell Technologies, Inc.). Enriched splenocytes were further FACS-sorted to obtain pure naïve T cells (CD3+CD4+CD25-CD62LhiCD44lo); 0.5 × 10^6 cells were injected intraperitoneally (i.p.) into 6- to 8-week old syngeneic female FVB.Rag1-/- mice. Rag1-/- mice were weighed directly prior to T cell transfer to obtain baseline weights; mice were weighed twice weekly for the duration of the experiments, and euthanized if ≥ 20% baseline weight loss was reached.


Colons (~ 1 cm proximal, distal sections) were cut from euthanized Rag1-/- mice 5–8 weeks post-T cell transfer (depending on animal morbidity) and fixed in 10% neutral buffered formalin, embedded into paraffin blocks, cut for slides, and stained with hematoxylin and eosin (H&E). H&E-stained sections were analyzed and scored blindly by a veterinary pathologist.

Cell isolation

Mouse single mononuclear cell suspensions were prepared from spleen, or mesenteric lymph nodes (MLN) following tissue disruption. For isolation of mononuclear cells from colon, whole colons (cecum to anus) were removed, flushed with PBS to remove the fecal contents, and opened longitudinally to expose the epithelium. Tissues were incubated for 30 min at room temperature in DMEM media (without phenol red; Life Technologies) plus 0.15% DTT (Sigma-Aldrich) to remove mucus. After washing with media, colons were incubated for 30 min at room temperature in media containing 1 mM EDTA (Amresco) to remove the epithelium. After washing again with media, lamina propria was digested in media containing 0.25 mg/mL liberase TL and 10 U/mL RNase-free DNaseI (both from Roche), with shaking in a bacterial incubator (Environ Shaker; Labline) for 15–25 min at 37°C. Single cell suspensions were passed through 70 μm nylon filters (BD) and mononuclear cells were isolated by 70/30% percoll gradient centrifugation (Sigma-Aldrich). Mononuclear cells were washed twice in DMEM (LifeTech), counted, and resuspended for FACS analysis.

Flow Cytometry

FACS staining for surface antigens was performed as previously described [14]. Intracellular stains were performed following 4 hr. ex vivo stimulation with phorbol 12-myristate 13-acetate (PMA) and ionomycin in the presence of brefeldin A (all from Sigma-Aldrich). Stimulated cells were washed in PBS, stained for cell surface antigens in PBS for 20 min. at room temperature, fixed and permeabilized using a Foxp3 intracellular staining kit (eBioscience), and then stained with antibodies against transcription factors and cytokines. Anti-mouse antibodies used for FACS analysis were: Alexa700-CD45, brilliant violet (BV)650-CD3, BV711-CD4, BV605-CD25, Percp/Cy5.5-CD44, BV605-CD62L, APC-IFNγ, Percp/Cy5.5-IL-17A, PE-IL-22, PE/Cy7-IL-10, FITC-Ki-67 (all from Biolegend); PE/CF594-CD25 and PE/CF594-RORγt (from BD); and e450-Foxp3 (clone FJK-16s) (from eBioscience). Anti-human antibodies used for FACS analysis were: APC-CD3, PE-CD4, PE/Cy7-CD45RO, Percp/Cy5.5-CCR7 (from Biolegend); and PE/CF594-CD25 (from BD). For both mouse and human FACS analysis, viable cells were discriminated using an eFluor® 506 fixable viability dye (eBioscience). All FACS data was acquired on LSRII and analyzed using FlowJo software (TreeStar, Inc.; version 9.8.5). Raw FACS data was analyzed by manual gating (using strategies shown throughout) following compensation set in FlowJo using single color-stained control samples. Subset frequencies (i.e., percentage of parent gates) and in some cases, absolute cell numbers (calculated by multiplying total mononuclear cell numbers by subset frequencies) were exported to Microsoft Excel. Absolute cell numbers were obtained for only spleen and mesenteric lymph node-derived subsets; cell numbers for subsets from colon lamina propria were not recorded or used for analyses given variable cell recovery from enzymatically-digested intestinal tissues.

Bioinformatics analysis

Immunophenotypic data from FlowJo as above were collated in Microsoft Excel together with clinical, pre-clinical, and human demographic data (converted to single numeric values; as in Fig 1A and 1B and Table 1). To enable analyses in GenePattern, Microsoft Excel spreadsheets containing the data were converted to gct files as per instructions found in the GenePattern File Formats Guide. The gct files were then analyzed using the HierarchicalClustering module in GenePattern, using both row and column clustering (Pearson correlation) and log-transformation. Two-dimensional hierarchical clustering data output files (atr, cdt, gtr) were analyzed in the HierarchicalClusteringViewer module to generate heatmaps. Within the HierarchicalClusteringViewer software (run through a Java applet), “nearest neighbor searches” were performed to quantify similarity between select clinical or pre-clinical disease endpoints of interest and all other data. Pearson correlation was used for nearest neighbor searches unless noted otherwise. Euclidean or Manhattan distances were also tested in independent nearest neighbor searches to compare results with those generated using Pearson coefficients (S2 File). Follow-up analysis and graphing was performed using GraphPad Prism software.
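For readers who prefer to script the spreadsheet-to-gct conversion, a minimal sketch of a writer for the tab-delimited GCT (version 1.2) layout follows. The function name, the placement of variables in rows and mice in columns, and all values are assumptions for illustration; they are not taken from the study data.

```python
import io

def write_gct(handle, row_names, column_names, matrix, descriptions=None):
    """Write a numeric matrix to an open text handle in GCT v1.2 layout."""
    if descriptions is None:
        descriptions = ["na"] * len(row_names)
    handle.write("#1.2\n")                                      # version line
    handle.write(f"{len(row_names)}\t{len(column_names)}\n")    # data rows, data columns
    handle.write("\t".join(["Name", "Description", *column_names]) + "\n")
    for name, description, values in zip(row_names, descriptions, matrix):
        if len(values) != len(column_names):
            raise ValueError(f"row {name!r} has {len(values)} values, "
                             f"expected {len(column_names)}")
        cells = [name, description, *(f"{value:g}" for value in values)]
        handle.write("\t".join(cells) + "\n")

# Hypothetical example: three variables measured in three mice.
rows = ["DSI", "colitis_score", "spleen_iTreg_pct"]
columns = ["mouse1", "mouse2", "mouse3"]
values = [[1.2, 3.4, 2.0],
          [1, 3, 2],
          [2.5, 0.6, 1.1]]

buffer = io.StringIO()
write_gct(buffer, rows, columns, values)
gct_text = buffer.getvalue()
```

Passing a file opened with `open("profile.gct", "w")` instead of the in-memory buffer would produce a file that can be uploaded for clustering; the file name here is likewise only an example.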

Fig 1. An informatics approach to correlating immune phenotypes with weight loss or colitis in a T cell transfer mouse model of IBD.

(A) Weight loss in FVB.Rag1-/- mice (n = 9) injected with wild type naïve CD4+ T cells. Weights are shown relative to day 0 (pre-transfer baseline). Bold red trace shows mean weight loss for the group; green and blue traces show individual mice displaying mild or aggressive weight loss, respectively. Examples of disease severity index (DSI) calculations are shown in color-coded text. (B) Quantitative colitis scores (n = 9) from the same group of T cell-transferred FVB.Rag1-/- mice shown in (A). H&E-stained colon tissues were scored blindly as in [17]; representative micrographs (at right) show mild (score of 1) and severe (score of 3) inflammation (20x magnification). Red horizontal bar indicates mean colitis scores for the group. (C) Left, 10-parameter FACS panel used for analyzing ex vivo expression of surface antigens on leukocytes isolated from spleen, mesenteric lymph nodes (MLN), and colon lamina propria (colon) of FVB.Rag1-/- mice injected as in (A). Right, Gating strategy for surface FACS analysis; immune subsets used in downstream analysis are indicated by gates, text, and where appropriate, percentages. (D) Left, 11-parameter FACS panel used for analyzing ex vivo expression of intracellular transcription factors and cytokines in leukocytes isolated from T cell-transferred FVB.Rag1-/- mice as above. Right, Gating strategy for intracellular FACS analysis; immune subsets used in downstream analysis are indicated by gates, text, and where appropriate, percentages. (E) Heat map showing hierarchical clustering of 7 disease endpoints and 57 immune phenotypes in T cell-transferred FVB.Rag1-/- mice as above. Dendrograms (far left) show the clustering relationship between the mice based on all disease endpoints and immunophenotypes.

Table 1. Conversion of IBD patient clinical and demographic data to single numeric values for hierarchical clustering.

Statistical analyses

Statistical analyses were performed in Prism (GraphPad). One-way ANOVA with no pairing and Tukey correction for multiple comparisons, as well as Pearson correlation tests were used as appropriate and are indicated in the Figure legends. For One-way ANOVA analyses, statistical comparisons were made between all groups; only significant differences (P ≤ .05) are shown in Figures.

Results

Distinct immune profiles correspond to T cell transfer-induced weight loss and colitis

To explore the utility of using informatics to link immune phenotypes with disease endpoints, we performed systematic immune profiling in a group of 9 Rag1-/- mice transplanted with wild type naïve CD4+ T cells over 3 independent experiments. Transferred recipients were co-housed to normalize microflora, and both weight loss (Fig 1A) and histologic colitis (Fig 1B) were assessed. At sacrifice, leukocytes from spleen, mesenteric lymph nodes (MLN), and colon lamina propria were analyzed by ex vivo surface (Fig 1C) or intracellular (Fig 1D) FACS to assess the phenotypes and absolute numbers of immune cell subsets, including: CD25hiFoxp3+ T regulatory (Treg) cells, CD25loFoxp3- T conventional (Tconv) cells, CD62LloCD44hi effector/memory T cells (Teff cells), and CD62LhiCD44lo naïve T cells (Tnaive). Within Teff cells, we analyzed surface expression of pro-inflammatory chemokine receptors (e.g., CCR6, CXCR3), and the gut homing integrin α4β7 (Fig 1C). In addition, we analyzed intracellular expression of key transcription factors (Foxp3, RORγt), and, within Teff cells, expression of the proliferation-associated nuclear antigen, Ki-67 [15], as well as several pro- and anti-inflammatory cytokines (IL-17A, IFNγ, IL-22, IL-10) involved in mucosal immune regulation (Fig 1D) [16].

To enable hierarchical clustering of these immune phenotypes (57 in total) with weight loss or colitis, we converted information reflecting disease endpoints into single numeric values for each mouse. For weight loss, we created a disease severity index (DSI), which considers both the total percentage of bodyweight lost (relative to pre-transfer baseline) and the time in which weight loss occurs; higher values in this index reflect more aggressive weight loss (Fig 1A). For colitis, histologic inflammation was quantified using a standard scoring system of 0–4, where 0 reflects no evidence of inflammation and 4 indicates maximal severity of inflammation with transmural leukocyte infiltration and loss of goblet cells (Fig 1B) [17]. Other disease endpoints documented included colon weight (in g), colon length (in cm), and colon weight:length ratio, which correlate with histologic inflammation in some chemically induced models of colitis (data not shown) [17]. All data points (63 in total) were then collated into a single file for hierarchical clustering in GenePattern (Fig 1E).

After clustering, we highlighted either the DSI (Fig 2A) or colitis scores (Fig 2B) and used the nearest neighbor search feature in GenePattern (HierarchicalClusteringViewer module) to generate Pearson (r) coefficients for all immune phenotypes relative to each disease endpoint, ranked from high (positive Pearson coefficient; directly correlated) to low (negative Pearson coefficient; inversely correlated). As expected, the absolute percentage of weight loss was the strongest direct correlate of the DSI (r = 0.865; P = 0.0026), the time post-T cell transfer was among the strongest inverse correlates of the DSI (r = -0.646; P = 0.0503), and histologic colitis did not correlate with the DSI (r = -0.045) (Fig 2A).
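The ranking step described above can be sketched outside GenePattern as follows. This is an illustrative reimplementation rather than the HierarchicalClusteringViewer code, and the variable names and values are invented toy data, not measurements from these mice.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("Pearson r is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)

def rank_by_pearson(reference, variables):
    """Rank variables from most directly to most inversely correlated."""
    scores = {name: pearson_r(reference, values) for name, values in variables.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy reference endpoint (one value per animal) and toy candidate variables.
dsi = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "pct_weight_lost": [2.0, 4.0, 6.0, 8.0, 10.0],          # rises with the reference
    "days_post_transfer": [50.0, 40.0, 30.0, 20.0, 10.0],   # falls as the reference rises
    "unrelated": [1.0, 3.0, 2.0, 3.0, 1.0],                 # no linear relationship
}
ranked = rank_by_pearson(dsi, candidates)
```

The top of the ranked list holds direct correlates of the chosen endpoint and the bottom holds inverse correlates, mirroring the rank-ordered plots in Fig 2A and 2B.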

Fig 2. Discrete immune phenotypes correspond with T cell transfer-induced weight loss or colitis in Rag1-/- mice.

(A) Rank-ordered (Pearson r) correlation values of all disease endpoints and immune phenotypes relative to weight loss (disease severity index (DSI)), in FVB.Rag1-/- mice injected with wild type naïve CD4+ T cells as in Fig 1A. Relevant disease endpoints and immune phenotypes are indicated by black and red text, respectively. Correlation between weight loss and colitis scores is further shown in insert, where blue text indicates the Pearson r correlation value. (B) Rank-ordered (Pearson r) correlation values of all disease endpoints and immune phenotypes relative to colitis scores, determined by histology, in the same T cell-transferred FVB.Rag1-/- mice. Relevant immune phenotypes are indicated by red text; correlation with weight loss (DSI) is indicated by black text. For (A, B), the correlation of the reference variable with itself (r = 1.0) is shown at top left in grey. (C) Exemplar immune phenotypes that correlate with T cell transfer-induced weight loss (disease severity index (DSI)) (left), but not histologic colitis (right) in T cell-transferred FVB.Rag1-/- mice. (D) Exemplar immune phenotypes that correlate with T cell transfer-induced colitis (right), but not weight loss (disease severity index (DSI)) (left). Pearson r correlation values are shown in red (for correlations achieving statistical significance) and blue (for correlations not statistically significant). * P < .05, ** P < .01, *** P < .001, Pearson correlation test.

Interestingly, the percentage of induced Tregs (iTregs) in spleen was the strongest inversely correlated immune phenotype with weight loss (r = -0.758; P = 0.0179) (Fig 2B), suggesting splenic iTregs protect against weight loss in this model. As expected, iTreg frequency positively correlated with absolute iTreg numbers in spleen (Figure A in S1 File), and accordingly, absolute numbers of iTregs in spleen also inversely correlated with weight loss (r = -0.600; P = 0.087), albeit to a lesser degree than iTreg frequencies (Figure B in S1 File). By contrast, neither absolute numbers nor percentages of iTregs in other tissues (MLN, colon) correlated significantly with weight loss. Other immune phenotypes increased proportionately with weight loss, for example the percentage of RORγt+IL-22+ Teff cells in MLN (r = 0.859; P = 0.003), and the percentage of CXCR3+ Teff cells in colon (r = 0.7930; P = 0.011) (Fig 2B). Importantly, none of these correlates of weight loss showed similar relationships with the severity of histologic colitis (Fig 2B).

Whereas Pearson correlation is the default option for performing nearest neighbor searches in GenePattern, more common algorithms for assessing nearest neighbors calculate distance between variables (i.e., Euclidean distance, Manhattan distance) [12]. Nonetheless, the same immune phenotypes that correlated directly (e.g., RORγt+IL-22+ Teff cells in MLN, CXCR3+ Teff cells in colon) or inversely (iTregs in spleen) with T cell transfer-induced weight loss by Pearson coefficient were also identified as close or distant variables, respectively, by either Euclidean or Manhattan algorithms (Figure A-C in S2 File). Accordingly, these immune phenotypes also showed the strongest association with T cell transfer-induced weight loss using a cumulative rank order scoring system, which incorporates all 3 nearest neighbor algorithms (Figure D in S2 File).
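One way to combine the three measures is sketched below. The exact cumulative scoring rule used for S2 File is not described in this section, so summing each variable's rank position across the three searches is an assumption made purely for illustration, as are the toy values (which are taken to be on a comparable scale already, since distance measures are scale-sensitive).

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def euclidean(x, y):
    """Straight-line distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def rank_positions(scores, best_is_high):
    """Map each name to its 1-based rank, with rank 1 the nearest neighbor."""
    ordered = sorted(scores, key=scores.get, reverse=best_is_high)
    return {name: position for position, name in enumerate(ordered, start=1)}

def cumulative_rank(reference, variables):
    """Sum rank positions across Pearson, Euclidean and Manhattan searches."""
    by_pearson = {n: pearson_r(reference, v) for n, v in variables.items()}
    by_euclid = {n: euclidean(reference, v) for n, v in variables.items()}
    by_manhattan = {n: manhattan(reference, v) for n, v in variables.items()}
    rankings = [
        rank_positions(by_pearson, best_is_high=True),    # high r means similar
        rank_positions(by_euclid, best_is_high=False),    # small distance means similar
        rank_positions(by_manhattan, best_is_high=False),
    ]
    totals = {name: sum(r[name] for r in rankings) for name in variables}
    return sorted(totals.items(), key=lambda item: item[1])

reference = [1.0, 2.0, 3.0, 4.0]
toy = {
    "close": [1.1, 2.1, 2.9, 4.0],
    "flat": [2.5, 2.4, 2.6, 2.5],
    "inverse": [4.0, 3.0, 2.0, 1.0],
}
combined = cumulative_rank(reference, toy)
```

A variable that ranks first under all three measures receives the lowest total, so agreement between the algorithms is visible directly in the combined score.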

A distinct set of immune phenotypes correlated with colitis severity, but not weight loss; the percentage of total Teff cells in colon increased proportionately with colitis severity (r = 0.714; P = 0.031), whereas frequency of both CCR6+ (r = -0.946; P = 0.0001) and α4β7+ (r = -0.642; P = 0.042) Teff cells in colon inversely correlated with colitis severity. Together, these results highlight the utility of using informatics in analyzing large immune profiling datasets, and support the notion that T cell transfer-induced weight loss and colitis are independent events driven by distinct immunologic mechanisms.

Coupling immune phenotypes with clinical disease endpoints

To validate this approach in a clinical setting, we performed FACS analysis on frozen PBMCs from healthy adult donors (n = 26), and adult IBD patients (ulcerative colitis (UC), n = 50; Crohn’s disease (CD), n = 53). For proof-of-principle, we assessed a relatively small number (n = 24) of manually gated immune parameters reflecting frequencies of major CD3+ and CD3- lymphocyte subsets, including CD4+ and CD4- (CD8+) CD3+ T cells; CD4+CD25lo Tconv and CD4+CD25hi cells (within CD3+CD4+ T cells); and CCR7hiCD45RO- Tnaive, CCR7loCD45RO+ Teff, and CCR7loCD45RO- Teff cells (within both CD4+ and CD8+ Tconv cells) (Fig 3A). To ensure assay reliability, we performed repeated analyses on a control stock of healthy donor PBMC, run in parallel during each independent experiment on sets of healthy donor and IBD patient samples. Coefficients of variation (CVs) for each major T cell subset ranged between 6–15% (Fig 3B), indicating reliable detection.
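As a worked illustration of the reliability check, the coefficient of variation can be computed as the standard deviation divided by the mean, expressed as a percentage. The sketch below uses the sample standard deviation, which is an assumption (the text does not state which form was used), and the percentages are invented rather than taken from Fig 3B.

```python
import statistics

def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100

# Hypothetical subset frequencies (% of parent gate) from repeated runs
# of the same control stock.
repeat_runs = [8.0, 10.0, 12.0]
cv_percent = coefficient_of_variation(repeat_runs)
```

A low CV across independent staining experiments indicates that differences seen between donors are unlikely to be explained by run-to-run assay variation alone.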

Fig 3. Informatics-based identification of immune dysregulation in clinical inflammatory bowel diseases.

(A) Bottom left, 6-parameter FACS panel used for analyzing expression of surface antigens on peripheral blood mononuclear cells (PBMC) from healthy adult donors and adult IBD patients. Gating strategy for FACS analysis of human PBMC; immune subsets used in downstream analysis are indicated by gates and text. (B) Percentages of major T cell subsets in a healthy control PBMC stock, determined by repeated FACS analysis as in (A), over 10 independent staining experiments. Each subset is quantified based on the percentages within relevant parent gates (as in (A)); coefficients of variation (CVs) are indicated for each subset by color-matched text. (C) Heat map showing hierarchical clustering of 7 disease endpoints and 24 immune phenotypes in healthy adults (n = 26) and IBD patients (ulcerative colitis (UC), n = 50; Crohn’s disease (CD), n = 53). (D) Rank-ordered (Pearson r) correlation values of all disease endpoints and immune phenotypes relative to diagnosis group (i.e., healthy donors, group 1; UC patients, group 2; CD patients, group 3). Relevant disease endpoints and immune phenotypes are indicated by black and red text, respectively; the correlation of the reference variable with itself (r = 1.0) is shown at top left in grey. (E) Immune cell subsets (CD4+CD25hi–left; CD8+RO- Teff–middle; CD8+ naïve–right) identified by hierarchical clustering and ranked Pearson coefficients (as in (C, D)) perturbed in CD patient PBMC. (F) Immune cell subsets (CD4+ naïve–left; CD4+ Teff–right) identified by hierarchical clustering and ranked Pearson coefficients (as in C and S3 File) perturbed in UC PBMC. Red lines indicate median values for each group. * P < .05, ** P < .01, *** P < .001, One-way ANOVA. Teff, effector/memory T cells. Only significant differences between groups are shown.

To enable hierarchical clustering and nearest neighbor searches with clinical and demographic data, we again converted disease and demographic parameters to single numeric values (Table 1). Clinical data included: diagnosis, age at diagnosis, disease activity (based on pathology), history of ileitis, number of related surgeries (e.g., bowel resection), and medication history. Demographic data included: age at collection, gender, and ethnicity (Table 2). All data were again collated for hierarchical clustering (Fig 3C).
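The conversion to single numeric values can be sketched as a simple lookup. Only the diagnosis group numbering below follows the Fig 3D legend (healthy donors, 1; UC, 2; CD, 3); the field names and the remaining codings are hypothetical stand-ins, since the actual scheme is the one given in Table 1.

```python
# Diagnosis groups numbered as in the Fig 3D legend.
DIAGNOSIS_CODES = {"healthy": 1, "UC": 2, "CD": 3}

def encode_record(record):
    """Convert one donor record into a dict of single numeric values."""
    return {
        "diagnosis": DIAGNOSIS_CODES[record["diagnosis"]],
        # Assumed coding: 1 if ileitis was ever recorded, otherwise 0.
        "ileitis": 1 if record["history_of_ileitis"] else 0,
        # Assumed coding: raw count of related surgeries.
        "surgeries": int(record["n_surgeries"]),
        "age_at_collection": float(record["age_at_collection"]),
    }

# Hypothetical donor record; not a patient from this study.
example_donor = {
    "diagnosis": "CD",
    "history_of_ileitis": True,
    "n_surgeries": 2,
    "age_at_collection": 41,
}
encoded = encode_record(example_donor)
```

Once every clinical and demographic field is numeric, each can be treated as one more variable alongside the immune phenotypes during clustering and nearest neighbor searches.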

Table 2. Demographic data of healthy volunteers and IBD patients analyzed in the study.

We performed two nearest neighbor searches to extract immune correlates of CD (Fig 3D) or UC (S3 File). As expected, clinical parameters such as ileitis (r = 0.565; P < 0.0001), and medications (r = 0.514; P < 0.0001) directly correlated with CD, whereas age at diagnosis was the strongest inverse correlate of both CD and UC (Fig 3D and S3 File). More importantly, our analyses identified distinct nodes of T cell dysregulation in CD and UC. In CD, percentages of both CD4+CD25hi cells and CD8+RO- Teff cells were increased, and percentages of CD8+ naive cells were decreased relative to both healthy controls and UC patients (Fig 3E). By contrast, UC patients displayed increased percentages of CD4+ naive cells, and decreased percentages of CD4+RO+ Teff cells vs. either healthy donors or CD patients (Fig 3F).

Discussion

Here we describe a simple informatics pipeline that can be leveraged to rapidly discriminate immune correlates of clinical and pre-clinical disease parameters within high-content cytometry datasets. In mice, we show that distinct immune profiles correlate with T cell transfer-induced weight loss or histologic colitis. These results are interesting because they suggest that weight loss and colitis in this model are unrelated; they are important because these endpoints are widely used in the laboratory and presented in the literature as interchangeable measures of disease [13,17]. In the colon lamina propria, for example, we show that CXCR3+ Teff cells increase proportionately with more severe weight loss, but show no correlation with severity of mucosal inflammation. By contrast, CCR6+ Teff cells decrease commensurate with more severe histologic inflammation but show no association with weight loss. Expression of CXCR3 in Teff cells enriches for IFNγ-producing Th1 cells, whereas CCR6 expression broadly distinguishes RORγt+ Th17 cells that transiently express IL-17A and IL-17F pursuant to signals present in the local microenvironment [18–20]. The strong inverse correlation between CCR6+ Th17 cell abundance and histologic colitis (r = -0.946) points to a protective function of CCR6+ Th17 cells in the colonic mucosa, which is both consistent with the fact that Th17 cytokines enforce barrier function in the gut [16] and supported by previous data from both animal models of IBD and clinical trials; Th17 cells are capable of suppressing experimental colitis in mice [21], and the neutralizing IL-17A antibody, Secukinumab, not only failed to show efficacy in a recent IBD clinical trial, it exacerbated disease activity [22].

The association between mucosal CXCR3+ Th1 cells and T cell transfer-induced weight loss is perhaps more difficult to understand given that the mechanisms underlying weight loss in this model are ill defined. Th1 cell-derived IFNγ is known classically for activating phagocytes and cytolytic lymphocytes (e.g., natural killer [NK] cells), and has been experimentally shown to promote parenchymal cell death and tissue damage in a variety of autoimmune mouse models [21,23,24]. Thus, it is possible that Th1 cells induce histologically subtle damage to the colonic epithelium, leading in turn to dissemination of bacterial products from the gut (e.g., to the liver) and failure to thrive [25]. Indeed, additional methods to assess intestinal pathology, such as 3-dimensional stereomicroscopy, could be useful to distinguish qualitatively distinct mucosal lesions [26]. It is also possible that increased frequency of CXCR3+ Th1 cells in colons of morbid animals is a consequence–rather than cause–of excessive tissue damage. Expression of IFNγ per se showed less strong correlations than CXCR3 with T cell transfer-induced weight loss (data not shown), and CXCR3 ligands, such as CXCL9, CXCL10, CXCL11, are broadly expressed by endothelial and epithelial cells, where they are upregulated upon microbial infection, stress, or tissue damage [27].

Frequencies of colonic α4β7 (integrin)+ Teff cells also inversely correlated with severity of histologic colitis. This result, which implies a protective role of α4β7+ Teff cells in this model, is surprising given that α4β7 is functionally required for experimental colitis in mice [28] and the neutralizing α4β7 antibody, Vedolizumab, is now approved for use as a therapeutic in IBD [29,30]. Like most integrins, however, cell surface α4β7 is labile and can be internalized upon T cell activation [31], and we consistently observed lower α4β7 staining in cells from colon lamina propria vs. either spleen or MLN (data not shown). Thus, the apparent increase in Teff cell α4β7 expression in less inflamed colons may simply reflect reduced T cell activation in response to luminal antigens, which is also required for colitis in this model [32]. As a whole, these results highlight how informatics-based analyses of flow cytometry data can be used, both to gain insight into disease biology and identify biomarkers linked to discrete disease endpoints.

In humans, we identified non-overlapping nodes of immune dysregulation in the two common forms of IBD, CD and UC. Despite their common classification, CD and UC are distinct diseases that affect discrete regions of the intestinal tract, present as different histopathologic lesions, and display unique genetic susceptibilities [33,34]. It is interesting in this regard that CD was associated with perturbations mostly in CD8+ T cell subsets, whereas alterations in the CD4+ T cell compartment were most obvious in UC. The specific dysregulation of CD8+ T cells in CD patient blood observed here is consistent with at least two previous reports, including one from van Unen et al., which used a panel of 32 metal isotope-tagged antibodies and CyTOF to profile nearly 150 immune cell phenotypes in blood and mucosal biopsies of CD patients [8,35]. This shift from naïve (CCR7hiRO-) to chronically activated (CCR7-RO-) CD8+ T cells in the blood of CD patients predicts the previously reported infiltration of activated CD8+ T cells into small bowel lesions of both ileitis-prone mice and human CD patients [36,37]. We are not aware of other reports documenting similar changes in naïve and effector/memory CD4+ T cells in UC patient peripheral blood. The extent to which these immune profiles can be generalized within CD and UC patients requires further investigation, but these preliminary results highlight the unique biology at play in CD and UC, and predict the non-overlapping clinical responses of CD and UC patients to targeted therapies [29,30,38].

Technically, an informatics view of cytometry data offers three major advantages. First is expedience; analyzing dozens to hundreds of distinct parameters in groups of mice or humans “one-by-one” takes days to weeks. By contrast, this same analysis performed via informatics takes minutes. Second is breadth; cluster-based analysis affords a comprehensive view of all relationships between immune phenotypes and clinical or pre-clinical disease endpoints. Indeed, discriminating immune features that are not associated with disease can be as informative as understanding those that are. Third is sensitivity; we show several examples where subtle changes in immune cell frequencies correlate significantly with disease endpoints. For example, the percentage of CD4+ iTreg cells in spleen strongly correlated with protection from T cell transfer-induced weight loss (r = -0.758; P = 0.0179) (Fig 2B, top left), despite the fact that iTregs represented only 0.45–2.66% of total splenic CD4+ T cells in this cohort of animals. This information could have been lost if not for unbiased informatics analyses.

The notion of using informatics to handle ultra-high content cytometry data is not, in itself, new, and several tools have been developed in recent years, both to enable unsupervised gating of raw cytometry data (e.g., Citrus, AutoGate) [4,5], and to visualize subset abundance and lineage relationships within patient populations (e.g., SPADE) [6]. Yet whether using manual or data-driven methods of raw cytometry data analysis, the question of how one relates these phenotypes to myriad clinical or pre-clinical disease parameters has remained a challenge. Hierarchical clustering of manually analyzed cytometry data has also been used previously to display qualitative differences between cell types or healthy vs. diseased patients [39,40]. Our study now shows the added value of performing downstream nearest neighbor searches to identify quantitative relationships between immune phenotypes and select clinical or pre-clinical variables. Most importantly, this analysis platform uses publicly available software, can be retrospectively applied to existing datasets, and is suitable for any clinical or pre-clinical disease setting.
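
As a rough illustration of the nearest neighbor step discussed here, the following Python sketch ranks immune phenotypes by their Pearson correlation with a reference disease variable. It is not the GenePattern implementation used in this study; the function names, phenotype names, and all data values are hypothetical and serve only to show the idea.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def nearest_neighbors(reference, phenotypes):
    """Rank phenotypes from most direct to most inverse correlation with the reference."""
    scored = [(name, pearson_r(reference, values)) for name, values in phenotypes.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical data: one value per animal (not taken from the study).
weight_loss = [2.0, 5.0, 9.0, 14.0, 20.0]
phenotypes = {
    "phenotype_weak": [3.0, 1.0, 4.0, 1.0, 5.0],
    "phenotype_inverse": [28.0, 25.0, 21.0, 16.0, 10.0],
    "phenotype_direct": [1.0, 2.5, 4.5, 7.0, 10.0],
}

ranked = nearest_neighbors(weight_loss, phenotypes)
for name, r in ranked:
    print(f"{name}: r = {r:.3f}")
```

Phenotypes at the top of the ranked list rise with the reference variable, and those at the bottom fall as it rises, which mirrors the direct and inverse correlates reported for weight loss and histologic colitis.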

Supporting Information

S1 File. Relationship between iTreg frequency, absolute number and T cell-induced weight loss.

(A) Correlation between frequencies (i.e., percentage of total CD4+ T cells) and absolute numbers of CD25+Foxp3+ induced T regulatory cells (iTregs) in spleens of colitic FVB.Rag1-/- mice transplanted with wild type naïve CD4+ T cells (as in Fig 1A). iTreg frequencies were determined by gating in FlowJo following ex vivo intracellular FACS analysis (as in Fig 1D). iTreg numbers were calculated by multiplying the total number of mononuclear cells recovered from spleen by subset frequencies (e.g., percentage of parent gates; example shown in Fig 1D). (B) Correlation between iTreg numbers in spleen and T cell transfer-induced weight loss (disease severity). Pearson (r) coefficients are indicated in red text; ** P < .01, Pearson correlation test.
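
The calculation of absolute numbers described in this legend can be sketched as follows. This is a minimal example that assumes each gate is reported as a percentage of its parent gate; the function name, gate hierarchy, and all values are hypothetical and are not data from the study.

```python
def absolute_count(total_cells, gate_percentages):
    """Multiply a total cell count by successive percent-of-parent gate frequencies."""
    count = float(total_cells)
    for pct in gate_percentages:
        if not 0.0 <= pct <= 100.0:
            raise ValueError("gate percentages must lie between 0 and 100")
        count *= pct / 100.0
    return count


# Hypothetical example: 5e7 mononuclear cells recovered from one spleen,
# gated as lymphocytes (80%), then CD4+ (20%), then CD25+Foxp3+ (2%).
total_mononuclear_cells = 5e7
itreg_count = absolute_count(total_mononuclear_cells, [80.0, 20.0, 2.0])
print(f"Estimated iTreg cells: {itreg_count:.0f}")
```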


S2 File. Relationship between GenePattern nearest neighbor search algorithms.

Correlation between Pearson (r) coefficients (correlation with T cell transfer-induced weight loss) and Euclidean (A) or Manhattan (B) distances (distance from T cell transfer-induced weight loss) of immune phenotypes in colitic FVB.Rag1-/- mice. (C) Correlation between Euclidean and Manhattan distances (from T cell transfer-induced weight loss) of immune phenotypes in colitic FVB.Rag1-/- mice. Pearson (r) coefficients are indicated in red text; immune phenotypes identified by Pearson coefficients (in Fig 2A and 2C) are highlighted by blue text and arrowheads. **** P < .0001, Pearson correlation test. (D) Combined rank order score of all pre-clinical and immunophenotypic variables relative to T cell transfer-induced weight loss following nearest neighbor searches using Pearson coefficient, Euclidean distance, and Manhattan distance. For Pearson correlation, variables were sorted from low (inverse) to high (direct) Pearson (r) coefficients and given low-to-high rank order scores. For Euclidean and Manhattan distance searches, variables were sorted from high-to-low dissimilarity values and given low-to-high rank order scores. The combined rank order score reflects the sum of all 3 rank order values; variables with the highest combined rank order scores are increased in T cell-transferred FVB.Rag1-/- mice showing the greatest weight loss. Immune phenotypes identified by Pearson coefficients (in Fig 2A and 2C) are highlighted in blue.
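
The combined rank order score in panel D can be sketched roughly as below. This is an illustrative reconstruction from the legend, not the code used in the study. It assumes that all variables are already on a comparable scale before distances are computed and that ties are broken by input order; the helper names and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """Manhattan (city-block) distance between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def rank_scores(values, descending=False):
    """Sort variable names by value and assign rank scores 1..n in sorted order."""
    ordered = sorted(values, key=values.get, reverse=descending)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


def combined_rank_scores(reference, variables):
    """Sum the Pearson, Euclidean, and Manhattan rank scores for each variable."""
    r = {name: pearson_r(reference, v) for name, v in variables.items()}
    d_euc = {name: euclidean(reference, v) for name, v in variables.items()}
    d_man = {name: manhattan(reference, v) for name, v in variables.items()}
    # Pearson: low (inverse) to high (direct) r receives low-to-high scores.
    rank_r = rank_scores(r, descending=False)
    # Distances: high-to-low dissimilarity receives low-to-high scores.
    rank_euc = rank_scores(d_euc, descending=True)
    rank_man = rank_scores(d_man, descending=True)
    return {name: rank_r[name] + rank_euc[name] + rank_man[name] for name in variables}


# Hypothetical, pre-scaled data: one value per animal.
weight_loss = [0.0, 1.0, 2.0, 3.0]
variables = {
    "phenotype_same": [0.0, 1.0, 2.0, 3.0],
    "phenotype_near": [0.5, 1.0, 2.5, 3.5],
    "phenotype_inverse": [3.0, 2.0, 1.0, 0.0],
}

combined = combined_rank_scores(weight_loss, variables)
for name in sorted(combined, key=combined.get, reverse=True):
    print(name, combined[name])
```

Under this scheme, a variable that tracks the reference closely by all three measures accumulates the highest combined score, whereas an inversely related and distant variable accumulates the lowest.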


S3 File. Immunophenotypic correlates of ulcerative colitis.

Rank-ordered (Pearson r) correlation values of all disease endpoints and immune phenotypes relative to diagnosis group (i.e., healthy donors, group 1; CD patients, group 2; UC patients, group 3). Relevant disease endpoints and immune phenotypes are indicated by black and red text, respectively; the correlation of the reference variable with itself (r = 1.0) is shown at top left in grey.



Acknowledgments

We thank employees of the Scripps Florida Animal Resource Center (ARC) and Flow Cytometry Core Facility for assistance with veterinary care of animals and FACS analysis, respectively. We also thank Dr. Amanda Beck for quantitative histopathology analysis, and members of the Sundrud laboratory and Dr. Derya Unutmaz for critical discussions and review of the manuscript.

Author Contributions

  1. Conceptualization: DWH MSS.
  2. Data curation: AD AO PNL MF DWH MSS.
  3. Formal analysis: AD MF MSS.
  4. Funding acquisition: MSS.
  5. Investigation: AD MSS.
  6. Methodology: AD MSS.
  7. Project administration: AD AO PNL DWH MSS.
  8. Resources: AO PNL DWH.
  9. Software: AD MF MSS.
  10. Supervision: DWH MSS.
  11. Validation: AD MF MSS.
  12. Visualization: AD MF MSS.
  13. Writing – original draft: MSS.
  14. Writing – review & editing: AD AO PNL MF DWH.


References

  1. Nair N, Mei HE, Chen SY, Hale M, Nolan GP, Maecker HT, et al. Mass cytometry as a platform for the discovery of cellular biomarkers to guide effective rheumatic disease therapy. Arthritis research & therapy. 2015;17:127. pmid:25981462; PubMed Central PMCID: PMC4436107.
  2. Tsai JJ, Jen YH, Chang JS, Hsiao HM, Noisakran S, Perng GC. Frequency alterations in key innate immune cell components in the peripheral blood of dengue patients detected by FACS analysis. Journal of innate immunity. 2011;3(5):530–40. pmid:21335935.
  3. Wanke-Jellinek L, Keegan JW, Dolan JW, Lederer JA. Characterization of lung infection-induced TCRgammadelta T cell phenotypes by CyTOF mass cytometry. Journal of leukocyte biology. 2016;99(3):483–93. pmid:26428679.
  4. Bruggner RV, Bodenmiller B, Dill DL, Tibshirani RJ, Nolan GP. Automated identification of stratifying signatures in cellular subpopulations. Proceedings of the National Academy of Sciences of the United States of America. 2014;111(26):E2770–7. pmid:24979804; PubMed Central PMCID: PMC4084463.
  5. Meehan S, Walther G, Moore W, Orlova D, Meehan C, Parks D, et al. AutoGate: automating analysis of flow cytometry data. Immunologic research. 2014;58(2–3):218–23. pmid:24825775; PubMed Central PMCID: PMC4464812.
  6. Linderman MD, Bjornson Z, Simonds EF, Qiu P, Bruggner RV, Sheode K, et al. CytoSPADE: high-performance analysis and visualization of high-dimensional cytometry data. Bioinformatics. 2012;28(18):2400–1. pmid:22782546; PubMed Central PMCID: PMC3436846.
  7. Spitzer MH, Gherardini PF, Fragiadakis GK, Bhattacharya N, Yuan RT, Hotson AN, et al. IMMUNOLOGY. An interactive reference framework for modeling a dynamic immune system. Science. 2015;349(6244):1259425. pmid:26160952; PubMed Central PMCID: PMC4537647.
  8. van Unen V, Li N, Molendijk I, Temurhan M, Hollt T, van der Meulen-de Jong AE, et al. Mass Cytometry of the Human Mucosal Immune System Identifies Tissue- and Disease-Associated Immune Subsets. Immunity. 2016;44(5):1227–39. pmid:27178470.
  9. Feuerer M, Hill JA, Kretschmer K, von Boehmer H, Mathis D, Benoist C. Genomic definition of multiple ex vivo regulatory T cell subphenotypes. Proceedings of the National Academy of Sciences of the United States of America. 2010;107(13):5919–24. pmid:20231436; PubMed Central PMCID: PMC2851866.
  10. Gazit R, Garrison BS, Rao TN, Shay T, Costello J, Ericson J, et al. Transcriptome analysis identifies regulators of hematopoietic stem and progenitor cells. Stem cell reports. 2013;1(3):266–80. pmid:24319662; PubMed Central PMCID: PMC3849420.
  11. Yosef N, Shalek AK, Gaublomme JT, Jin H, Lee Y, Awasthi A, et al. Dynamic regulatory network controlling TH17 cell differentiation. Nature. 2013;496(7446):461–8. pmid:23467089; PubMed Central PMCID: PMC3637864.
  12. Fechner U, Franke L, Renner S, Schneider P, Schneider G. Comparison of correlation vector methods for ligand-based similarity searching. Journal of computer-aided molecular design. 2003;17(10):687–98. pmid:15068367.
  13. Ostanin DV, Bao J, Koboziev I, Gray L, Robinson-Jackson SA, Kosloski-Davidson M, et al. T cell transfer model of chronic colitis: concepts, considerations, and tricks of the trade. American journal of physiology Gastrointestinal and liver physiology. 2009;296(2):G135–46. pmid:19033538; PubMed Central PMCID: PMC2643911.
  14. Carlson TJ, Pellerin A, Djuretic IM, Trivigno C, Koralov SB, Rao A, et al. Halofuginone-induced amino acid starvation regulates Stat3-dependent Th17 effector function and reduces established autoimmune inflammation. Journal of immunology. 2014;192(5):2167–76. pmid:24489094; PubMed Central PMCID: PMC3936195.
  15. Gerdes J, Lemke H, Baisch H, Wacker HH, Schwab U, Stein H. Cell cycle analysis of a cell proliferation-associated human nuclear antigen defined by the monoclonal antibody Ki-67. Journal of immunology. 1984;133(4):1710–5. pmid:6206131.
  16. Bamias G, Arseneau KO, Cominelli F. Cytokines and mucosal immunity. Current opinion in gastroenterology. 2014;30(6):547–52. pmid:25203451; PubMed Central PMCID: PMC4234041.
  17. Wirtz S, Neufert C, Weigmann B, Neurath MF. Chemically induced mouse models of intestinal inflammation. Nature protocols. 2007;2(3):541–6. pmid:17406617.
  18. Hirahara K, Vahedi G, Ghoreschi K, Yang XP, Nakayamada S, Kanno Y, et al. Helper T-cell differentiation and plasticity: insights from epigenetics. Immunology. 2011;134(3):235–45. pmid:21977994; PubMed Central PMCID: PMC3209564.
  19. Sano T, Huang W, Hall JA, Yang Y, Chen A, Gavzy SJ, et al. An IL-23R/IL-22 Circuit Regulates Epithelial Serum Amyloid A to Promote Local Effector Th17 Responses. Cell. 2015;163(2):381–93. pmid:26411290; PubMed Central PMCID: PMC4621768.
  20. Wan Q, Kozhaya L, ElHed A, Ramesh R, Carlson TJ, Djuretic IM, et al. Cytokine signals through PI-3 kinase pathway modulate Th17 cytokine production by CCR6+ human memory T cells. The Journal of experimental medicine. 2011;208(9):1875–87. pmid:21825017; PubMed Central PMCID: PMC3171088.
  21. Awasthi A, Kuchroo VK. IL-17A directly inhibits TH1 cells and thereby suppresses development of intestinal inflammation. Nature immunology. 2009;10(6):568–70. pmid:19448657.
  22. Hueber W, Sands BE, Lewitzky S, Vandemeulebroecke M, Reinisch W, Higgins PD, et al. Secukinumab, a human anti-IL-17A monoclonal antibody, for moderate to severe Crohn's disease: unexpected results of a randomised, double-blind placebo-controlled trial. Gut. 2012;61(12):1693–700. pmid:22595313; PubMed Central PMCID: PMC4902107.
  23. Esensten JH, Lee MR, Glimcher LH, Bluestone JA. T-bet-deficient NOD mice are protected from diabetes due to defects in both T cell and innate immune system function. Journal of immunology. 2009;183(1):75–82. pmid:19535634; PubMed Central PMCID: PMC2732575.
  24. Hirota K, Duarte JH, Veldhoen M, Hornsby E, Li Y, Cua DJ, et al. Fate mapping of IL-17-producing T cells in inflammatory responses. Nature immunology. 2011;12(3):255–63. pmid:21278737; PubMed Central PMCID: PMC3040235.
  25. Henao-Mejia J, Elinav E, Jin C, Hao L, Mehal WZ, Strowig T, et al. Inflammasome-mediated dysbiosis regulates progression of NAFLD and obesity. Nature. 2012;482(7384):179–85. pmid:22297845; PubMed Central PMCID: PMC3276682.
  26. Rodriguez-Palacios A, Kodani T, Kaydo L, Pietropaoli D, Corridoni D, Howell S, et al. Stereomicroscopic 3D-pattern profiling of murine and human intestinal inflammation reveals unique structural phenotypes. Nature communications. 2015;6:7577. pmid:26154811; PubMed Central PMCID: PMC4510646.
  27. Van Raemdonck K, Van den Steen PE, Liekens S, Van Damme J, Struyf S. CXCR3 ligands in disease and therapy. Cytokine & growth factor reviews. 2015;26(3):311–27. pmid:25498524.
  28. Kurmaeva E, Lord JD, Zhang S, Bao JR, Kevil CG, Grisham MB, et al. T cell-associated alpha4beta7 but not alpha4beta1 integrin is required for the induction and perpetuation of chronic colitis. Mucosal immunology. 2014;7(6):1354–65. pmid:24717354; PubMed Central PMCID: PMC4417258.
  29. Feagan BG, Rutgeerts P, Sands BE, Hanauer S, Colombel JF, Sandborn WJ, et al. Vedolizumab as induction and maintenance therapy for ulcerative colitis. The New England journal of medicine. 2013;369(8):699–710. pmid:23964932.
  30. Sandborn WJ, Feagan BG, Rutgeerts P, Hanauer S, Colombel JF, Sands BE, et al. Vedolizumab as induction and maintenance therapy for Crohn's disease. The New England journal of medicine. 2013;369(8):711–21. pmid:23964933.
  31. Kinashi T. Intracellular signalling controlling integrin activation in lymphocytes. Nature reviews Immunology. 2005;5(7):546–59. pmid:15965491.
  32. Do JS, Visperas A, Freeman ML, Iwakura Y, Oukka M, Min B. Colitogenic effector T cells: roles of gut-homing integrin, gut antigen specificity and gammadelta T cells. Immunology and cell biology. 2014;92(1):90–8. pmid:24189163; PubMed Central PMCID: PMC3947309.
  33. Jostins L, Ripke S, Weersma RK, Duerr RH, McGovern DP, Hui KY, et al. Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature. 2012;491(7422):119–24. pmid:23128233; PubMed Central PMCID: PMC3491803.
  34. Kaser A, Zeissig S, Blumberg RS. Inflammatory bowel disease. Annual review of immunology. 2010;28:573–621. pmid:20192811.
  35. Hedin CR, McCarthy NE, Louis P, Farquharson FM, McCartney S, Taylor K, et al. Altered intestinal microbiota and blood T cell phenotype are shared by patients with Crohn's disease and their unaffected siblings. Gut. 2014;63(10):1578–86. pmid:24398881.
  36. Matsuzaki K, Tsuzuki Y, Matsunaga H, Inoue T, Miyazaki J, Hokari R, et al. In vivo demonstration of T lymphocyte migration and amelioration of ileitis in intestinal mucosa of SAMP1/Yit mice by the inhibition of MAdCAM-1. Clinical and experimental immunology. 2005;140(1):22–31. pmid:15762871; PubMed Central PMCID: PMC1809333.
  37. Boschetti G, Nancey S, Moussata D, Cotte E, Francois Y, Flourie B, et al. Enrichment of Circulating and Mucosal Cytotoxic CD8+ T Cells Is Associated with Postoperative Endoscopic Recurrence in Patients with Crohn's Disease. Journal of Crohn's & colitis. 2016;10(3):338–45. pmid:26589954; PubMed Central PMCID: PMC4957475.
  38. Chen ML, Sundrud MS. Cytokine Networks and T-Cell Subsets in Inflammatory Bowel Diseases. Inflammatory bowel diseases. 2016;22(5):1157–67. pmid:26863267; PubMed Central PMCID: PMC4838490.
  39. Gedye CA, Hussain A, Paterson J, Smrke A, Saini H, Sirskyj D, et al. Cell surface profiling using high-throughput flow cytometry: a platform for biomarker discovery and analysis of cellular heterogeneity. PloS one. 2014;9(8):e105602. pmid:25170899; PubMed Central PMCID: PMC4149490.
  40. Theorell J, Gustavsson AL, Tesi B, Sigmundsson K, Ljunggren HG, Lundback T, et al. Immunomodulatory activity of commonly used drugs on Fc-receptor-mediated human natural killer cell activation. Cancer immunology, immunotherapy: CII. 2014;63(6):627–41. pmid:24682538.