Table 1.

Data sets used in this manuscript.

Fig 1.

Consensus clustering results for cohorts GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17] and GSE134692 (BMS bulk RNA-seq cohort) [18], and replication of GSE32537 (Schwartz-Univ of Colorado bulk expression cohort) [10] results.

A. Consensus clustering of IPF patients in GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17] based on the 5,000 most variable genes in IPF patients, showing the distribution of samples across k = 2 consensus clusters. B. Hierarchical clustering of IPF samples from GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17] using the top 5,000 most variable genes. The x axis represents individual patients; the y axis represents genes. Subsets are indicated in the x axis color bar and the heatmap legend and correspond to the classes shown in Fig 1A. C. PCA of IPF and Control samples from GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17] with IPF subsets and Control indicated. Subsets are indicated in the legend of the PCA plot and correspond to the classes shown in Fig 1A. D. Expression of cilium-related genes previously identified by Yang et al. [10] in the 75 samples from GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17] that do not overlap with GSE32537. Subsets are indicated on the x axis of the box plots and correspond to the classes shown in Fig 1A. Adjusted p values determined by ANOVA and post-hoc Dunn’s test are reported on plots. E. Correlation plot of log fold changes between Subset 1 and Subset 2 calculated in GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17] using the 75 samples not appearing in GSE32537 (x axis), compared with those in GSE32537 (Schwartz-Univ of Colorado bulk expression cohort) [10] (y axis). Genes with a reported absolute log fold change larger than 0.58 and an adjusted p value < 0.05 from both datasets were used in this analysis. F. Consensus clustering of IPF patients in GSE134692 (BMS bulk RNA-seq cohort) [18] based on the 5,000 most variable genes in IPF patients, showing the distribution of samples across k = 2 consensus clusters. G. Hierarchical clustering of IPF samples from GSE134692 (BMS bulk RNA-seq cohort) [18] using the top 5,000 most variable genes. The x axis represents individual patients; the y axis represents genes. Subsets are indicated in the x axis color bar and the heatmap legend and correspond to the classes shown in Fig 1F. H. PCA of IPF and Normal samples from GSE134692 (BMS bulk RNA-seq cohort) [18] with IPF subsets and Normal indicated. Subsets are indicated in the legend of the PCA plot and correspond to the classes shown in Fig 1F. I. Expression of cilium-related genes from GSE134692 (BMS bulk RNA-seq cohort) [18] previously identified by Yang et al. [10]. Subsets are indicated on the x axis of the box plots and correspond to the classes shown in Fig 1F. Adjusted p values determined by ANOVA and post-hoc Dunn’s test are reported on plots. J. Correlation plot of log fold changes between Subset 1 and Subset 2 calculated in GSE32537 (Schwartz-Univ of Colorado bulk expression cohort) [10] (x axis, logFC_GSE32537), compared with those between the GSE134692 (BMS bulk RNA-seq cohort) [18] subsets (y axis, logFC_GSE134692). Genes with a reported absolute log fold change larger than 0.58 and an adjusted p value < 0.05 from both datasets were used in this analysis.
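The gene selection and fold-change comparison described in this caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the dictionary-based inputs, and the choice to correlate only genes that pass the thresholds in both datasets are assumptions made for illustration. If the log fold changes are base 2 (also an assumption), a cutoff of 0.58 corresponds to roughly a 1.5-fold change.

```python
import math


def top_variable_genes(expr, n):
    """Return the n gene names with the largest sample variance.

    expr maps gene name -> list of expression values, one per sample.
    """
    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

    return sorted(expr, key=lambda g: variance(expr[g]), reverse=True)[:n]


def significant(de, lfc_cutoff=0.58, p_cutoff=0.05):
    """Keep genes with |logFC| > lfc_cutoff and adjusted p < p_cutoff.

    de maps gene name -> (logFC, adjusted p value).
    """
    return {g: lfc for g, (lfc, p) in de.items()
            if abs(lfc) > lfc_cutoff and p < p_cutoff}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def logfc_correlation(de_a, de_b):
    """Correlate logFC values of genes passing the filter in both datasets.

    Assumption: the comparison uses the intersection of filtered genes.
    """
    a, b = significant(de_a), significant(de_b)
    shared = sorted(set(a) & set(b))
    r = pearson([a[g] for g in shared], [b[g] for g in shared])
    return shared, r
```

In practice the inputs would come from a differential expression table per cohort; here they are plain dictionaries so that the filtering and correlation steps stay visible.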

Table 2.

Pathway enrichment results in patient subsets in GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17].

Fig 2.

Evaluation of fibrotic markers and clinical parameters in IPF subsets.

A. Expression of fibrosis markers in IPF subsets based on the analysis in Fig 1 in GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17] as compared to healthy controls (‘Control’). Adjusted p values determined by ANOVA and post-hoc Dunn’s test are reported on plots. B. Distribution of clinical parameters in IPF subsets based on the analysis in Fig 1 as reported in GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17] as compared to healthy controls (‘Control’). %DLCO, FVC and FEV1 values represent pre-lung transplant values reported in GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17]. Adjusted p values determined by ANOVA and post-hoc Dunn’s test are reported on plots.

Fig 3.

Gene signature scores for non-hematopoietic cell populations in GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17] subclasses.

Signatures were determined using total lung mononuclear cell data from GSE132771 (Sheppard-UCSF single cell cohort) [19]. Cell cluster names follow labeling in S3A Fig. Labels used in S3 Fig are indicated in parentheses. A. Details of our workflow with datasets used at each step indicated. B. Gene signature scores for endothelial cell subpopulations. C. Gene signature scores for mesothelial cell populations. D. Gene signature scores for epithelial cell subpopulations. Adjusted p values are reported on plots.
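The caption does not state how the gene signature scores were computed. One common approach, shown here purely as an assumed illustration, is to z-score each gene across samples and average the z-scores of the signature genes per sample. The function names and the gene identifiers in the usage example are placeholders.

```python
from statistics import mean, pstdev


def zscore_rows(expr):
    """Z-score each gene across samples.

    expr maps gene name -> list of expression values, one per sample.
    Genes with zero variance are set to 0.0 for every sample.
    """
    z = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return z


def signature_scores(expr, signature):
    """Per-sample score: mean z-score over signature genes present in expr.

    Assumption: this mean-of-z-scores definition is illustrative only and
    is not taken from the article.
    """
    z = zscore_rows(expr)
    present = [g for g in signature if g in z]
    if not present:
        raise ValueError("no signature genes found in expression data")
    n_samples = len(z[present[0]])
    return [mean(z[g][i] for g in present) for i in range(n_samples)]


# Usage with placeholder gene names and two samples.
example_expr = {"GENE_A": [1.0, 3.0], "GENE_B": [2.0, 6.0], "GENE_C": [5.0, 5.0]}
example_scores = signature_scores(example_expr, ["GENE_A", "GENE_B", "GENE_X"])
```

Signature genes missing from the bulk dataset are skipped rather than treated as zero, so a score always reflects measured genes only.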

Fig 4.

Gene signature scores for hematopoietic cell populations in GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17] subclasses.

Signatures were determined using total lung mononuclear cell data from GSE132771 (Sheppard-UCSF single cell cohort) [19]. Cell cluster names follow labeling in S3A Fig. Labels used in S3A Fig are indicated in parentheses. A. Gene signature scores for B cell subpopulations. B. Gene signature scores for T cell populations. C. Gene signature scores for myeloid cell subpopulations. Adjusted p values are reported on plots.

Fig 5.

Gene signature scores for smooth muscle, pericyte and fibroblast populations obtained using CD45-/EPCAM-/CD235a- (‘Lineage-sorted cells’) data from GSE132771 (Sheppard-UCSF single cell cohort) [19] in GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17] subclasses.

Cell cluster names follow labeling in S3B Fig. Labels used in S3B Fig are indicated in parentheses. A. Gene signature scores for adventitial and peribronchial fibroblast subpopulations. B. Gene signature scores for alveolar fibroblast subpopulations. C. Gene signature scores for CTHRC1+ fibroblast subpopulation. D. Gene expression scores for pericytes and smooth muscle cell populations. Adjusted p values are reported on plots.

Table 3.

Summary of cellular and gene expression changes in patient subsets in GSE47460.

Fig 6.

Evaluation of differential chemokine networks in IPF subsets.

A. Hierarchical clustering of GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17] IPF patients based on chemokines differentially expressed between IPF subsets. An absolute log FC > 0.58 and an adjusted p value < 0.05 were used to define differentially expressed chemokines. The x axis represents individual patients; the y axis represents genes. Subsets are indicated in the x axis color bar and the heatmap legend and correspond to the classes shown in Fig 1A. B. Expression of chemokines detectable in GSE135893 (Kropski-Vanderbilt Univ single cell cohort) [24] scRNAseq data and in GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17] IPF subsets. scRNAseq UMAP plots (top row) were generated using the ‘FeaturePlot’ function in the R package Seurat. Each UMAP plot depicts the expression of the chemokine indicated. Color bars indicate scaled expression in each cell on the plot. Cell clusters correspond to the clusters reported in S5 Fig. Bar plots (bottom row) depict the expression of the same chemokines in GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17] IPF subsets. Adjusted p values are reported on plots.

Fig 7.

Ligand-receptor networks in ‘Ciliated_low’ and ‘Ciliated_high’ donors in GSE135893 (Kropski-Vanderbilt Univ single cell cohort) [24].

A. Histogram of the distribution of the percentage of ciliated epithelial cells in GSE135893 (Kropski-Vanderbilt Univ single cell cohort) [24], with the cutoff we selected indicated. B-E. Circular plots indicating ligand-receptor interactions in subsets of patients in GSE135893 (Kropski-Vanderbilt Univ single cell cohort) [24]. On each circular plot, the top half of the circle represents cell types expressing the receptor for the ligand of the indicated cell type on the bottom half of the plot. Connections represent inferred active ligand-receptor pairs between types of cells. The thickness of the lines represents the relative level of expression of a given ligand/receptor. Transparency of the lines represents the relative strength of the given ligand-receptor interaction as reported by the z score value calculated by PyMiner. B. Top ligands produced by macrophages (bottom half of circle) in ‘Ciliated_low’ patients and the top receptors they interact with (top half of circle with cell types expressing the receptor indicated). C. Top ligands produced by macrophages (bottom half of circle) in ‘Ciliated_high’ patients and the top receptors they interact with (top half of circle with cell types expressing the receptor indicated). D. Top ligands produced by ciliated epithelial cells (bottom half of circle) in ‘Ciliated_low’ patients and the top receptors they interact with (top half of circle with cell types expressing the receptor indicated). E. Top ligands produced by ciliated epithelial cells (bottom half of circle) in ‘Ciliated_high’ patients and the top receptors they interact with (top half of circle with cell types expressing the receptor indicated).
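The donor grouping in panel A can be sketched as a simple threshold on the percentage of ciliated epithelial cells per donor. The cutoff value is not given in this caption, so `cutoff_percent` below is a caller-supplied placeholder; the cell-type label `"Ciliated"`, the function names, and the treatment of donors exactly at the cutoff are likewise assumptions.

```python
def ciliated_percentage(cell_types, ciliated_label="Ciliated"):
    """Percentage of a donor's cells annotated with the ciliated label."""
    if not cell_types:
        raise ValueError("donor has no cells")
    n_ciliated = sum(1 for c in cell_types if c == ciliated_label)
    return 100.0 * n_ciliated / len(cell_types)


def label_donors(cells_by_donor, cutoff_percent):
    """Assign each donor to 'Ciliated_high' or 'Ciliated_low'.

    cells_by_donor maps donor id -> list of per-cell type annotations.
    Donors at or above the cutoff are labeled 'Ciliated_high'; how ties
    are handled is an assumption of this sketch.
    """
    labels = {}
    for donor, cell_types in cells_by_donor.items():
        pct = ciliated_percentage(cell_types)
        labels[donor] = "Ciliated_high" if pct >= cutoff_percent else "Ciliated_low"
    return labels


# Usage with made-up donors and an arbitrary 20% cutoff.
toy = {
    "donor_1": ["Ciliated", "Macrophage", "AT2", "Macrophage"],
    "donor_2": ["Macrophage"] * 9 + ["Ciliated"],
}
toy_labels = label_donors(toy, cutoff_percent=20.0)
```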

Fig 8.

Building a machine learning-based classifier for distinguishing subclasses in GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17].

A. ROC curves of classifiers built with three different methods on cell signature data. The legend indicates the names of the machine learning methods (svm, gbm, glmnet) used. B. Relative importance of cell types identified by the elastic net model, sorted by importance. C. ROC curves of classifiers built with three different methods on gene expression data. The legend indicates the names of the machine learning methods (svm, gbm, glmnet) used. D. Expression values of the top 5 genes identified by recursive feature elimination across subsets of patients in GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17]. Adjusted p values are reported on plots.
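For readers unfamiliar with how ROC curves such as those in panels A and C are derived, the sketch below computes ROC points and the area under the curve from class labels and classifier scores. It is a generic, dependency-free illustration rather than the code used for the figure; the function names and toy data are assumptions.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points.

    labels are 1 (positive subset) or 0 (negative subset); scores are the
    classifier outputs, with higher values meaning more likely positive.
    The threshold is swept from the highest score to the lowest.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Usage with toy labels and scores.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = auc(roc_points(toy_labels, toy_scores))
```

Running the same functions on the held-out predictions of each model (svm, gbm, glmnet) would give one curve per method, which is how several classifiers are typically overlaid in a single ROC plot.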

Fig 9.

Expression of a pirfenidone response gene signature differs between IPF subsets.

A. Pirfenidone response signature from reference [42]. B. Gene signature scores in GSE47460 (Kaminski-LGRC bulk expression cohort) [14–17]. Adjusted p values are reported on plots.
