Fig 1.
Identification of Differentially Expressed Genes (DEGs).
A. Volcano plot of DEGs in the GSE150910 dataset (103 IPF samples vs 103 normal samples). B. Heatmap of DEGs in the GSE150910 dataset (the upper part shows a density heatmap of DEG expression across samples, displaying five quantiles and mean lines; the lower part shows a heatmap of DEG expression). C. Volcano plot of DEGs in the GSE93606 dataset (57 IPF samples vs 20 normal samples). D. Heatmap of DEGs in the GSE93606 dataset (layout as in panel B). In the volcano plots (A, C), red dots represent upregulated genes, green dots represent downregulated genes, and gray dots represent non-differentially expressed genes.
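The color coding used in the volcano plots can be illustrated with a minimal Python sketch. This legend does not state the cutoffs used to call DEGs, so the values below (|log2 fold change| >= 1, adjusted P < 0.05) and the function name `classify_gene` are placeholders chosen for illustration only, not the study's settings.

```python
# Hypothetical cutoffs: the legend does not state the thresholds actually used.
COLORS = {"up": "red", "down": "green", "ns": "gray"}

def classify_gene(log2_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene as upregulated, downregulated, or not significant."""
    if adj_p < p_cutoff and log2_fc >= fc_cutoff:
        return "up"
    if adj_p < p_cutoff and log2_fc <= -fc_cutoff:
        return "down"
    return "ns"

# Made-up (log2 fold change, adjusted P) pairs for three genes
examples = {"geneA": (2.3, 0.001), "geneB": (-1.7, 0.01), "geneC": (0.4, 0.20)}
dot_colors = {g: COLORS[classify_gene(fc, p)] for g, (fc, p) in examples.items()}
```

A gene must pass both cutoffs to be colored; a large fold change with a non-significant P value stays gray.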
Fig 2.
Identification of Key Module Genes.
A. Kaplan-Meier survival curves for IPF samples (Log-rank test). B. Hierarchical clustering of samples (each branch represents a sample; the vertical axis indicates the Euclidean distance of gene expression levels). C. Soft-threshold selection (the left plot suggests 9 as the optimal soft-threshold for further analysis; the right plot shows network connectivity under different soft-thresholds). D. Identification of co-expression modules (the upper part shows a hierarchical clustering dendrogram of genes, and the lower part represents gene modules). E. Heatmap of module-trait correlations (left color blocks represent modules, right color bar represents the correlation range; in the middle heatmap, darker colors indicate higher correlations, red represents positive correlation, blue represents negative correlation, and cell numbers denote correlation and significance).
Fig 3.
Ascertainment and Functional Exploration of Candidate Genes.
A. Intersection of IPF DEGs (left: upregulated genes, right: downregulated genes). B. Differentially expressed CMRGs. C. GO enrichment analysis of candidate genes. D. Protein-protein interaction (PPI) network of candidate genes.
Fig 4.
Acquisition of Prognostic Genes.
A. Forest plot of univariate Cox regression analysis. B. LASSO regression analysis. C. Expression levels of biomarkers in IPF samples and controls (Wilcoxon test). ** represents P < 0.01, and *** represents P < 0.001.
Fig 5.
Construction of a Well-Performing Risk Model.
A. Risk score distribution (x-axis: samples sorted by increasing risk score; y-axis: risk score; dashed lines indicate median risk score and corresponding patient count). B. Survival status distribution (x-axis: samples sorted by increasing risk score; y-axis: survival time; dashed lines indicate median risk score and corresponding patient count). C. Kaplan-Meier survival curve (Log-rank test). D. ROC curves at 1, 2, and 3 years. E. Heatmap showing HP, PDLIM7, and CFAP45 expression levels in the high-risk group (n = 28) and the low-risk group (n = 29). F. Correlation among the three model genes.
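The grouping behind panels A, B, and E can be sketched as follows. The legend marks the median risk score with dashed lines but does not spell out the assignment rule; the sketch assumes samples with a score above the median form the high-risk group and all others the low-risk group. The function name `split_by_median` and the scores are hypothetical.

```python
from statistics import median

def split_by_median(risk_scores):
    """Return the median cutoff and a high/low label for each sample.

    Assumed rule: score > median -> "high", otherwise "low".
    """
    cutoff = median(risk_scores.values())
    groups = {
        sample: ("high" if score > cutoff else "low")
        for sample, score in risk_scores.items()
    }
    return cutoff, groups

# Made-up risk scores for five samples
scores = {"s1": 0.2, "s2": 1.4, "s3": 0.9, "s4": 2.1, "s5": 0.5}
cutoff, groups = split_by_median(scores)
n_high = sum(1 for g in groups.values() if g == "high")
n_low = len(groups) - n_high
```

Under this rule, an odd number of samples with distinct scores leaves the low-risk group one sample larger, which is consistent with the reported group sizes (28 vs 29).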
Fig 6.
Validation of the Well-Performing Risk Model.
A. Risk score distribution (x-axis: samples sorted by increasing risk score; y-axis: risk score; dashed lines indicate median risk score and corresponding patient count). B. Survival status distribution (x-axis: samples sorted by increasing risk score; y-axis: survival time; dashed lines indicate median risk score and corresponding patient count). C. Kaplan-Meier survival curve (Log-rank test). D. ROC curves at 1, 2, and 3 years. E. Heatmap showing HP, PDLIM7, and CFAP45 expression levels in the high-risk group (n = 22) and the low-risk group (n = 23). F. Correlation among the three model genes.
Fig 7.
Exploration of Pathway and Immune Infiltration Differences Between HRG and LRG Patients.
A-C. Functional enrichment analysis of PDLIM7, CFAP45, and HP, respectively. D. Proportion of 22 immune cell types. E. Boxplots of 14 immune cell score differences (Wilcoxon test). The horizontal axis represents different types of immune cells, and the vertical axis represents the proportion of cell infiltration. ns indicates P > 0.05, * indicates P < 0.05, ** indicates P < 0.01, and **** indicates P < 0.0001. F. Correlation heatmap of 14 immune cell types. Red indicates a positive correlation, and blue indicates a negative correlation. The darker the color, the stronger the correlation. G-I. Lollipop charts showing PDLIM7, CFAP45, and HP correlations with differential immune cells, respectively.
Fig 8.
Revealing Potential Regulatory Mechanisms and Therapeutic Compounds for IPF.
A. Construction of the ceRNA network (pink nodes represent mRNA biomarkers, green nodes represent lncRNAs, and orange nodes represent miRNAs). B. Biomarker-small molecule compound network (red nodes represent mRNA biomarkers, green nodes represent small molecule compounds).
Fig 9.
Identification of Epithelial Cells and Macrophages as Key Cells for IPF.
A. Dimensionality-reduced cell clusters. B. Annotated cell clusters. C. Differences in cell types between IPF and normal samples (Wilcoxon test). D. Proportions of differential cell types. E. Expression levels of biomarkers in different cell types (Wilcoxon test). * indicates P < 0.05, ** indicates P < 0.01, and *** indicates P < 0.001.
Fig 10.
Exploration of Interactions and Differentiation Trajectories of Key Cells.
A. Cell-cell communication networks among eight cell types (left: by interaction count, right: by weight; different colors represent different cell types, line thickness indicates interaction strength). B. Bubble chart of receptor-ligand interactions between cells (bubble size represents p-value; red indicates high communication probability, blue indicates low communication probability). C-D. Pseudotime trajectory analysis and relative expression profiles of biomarkers in epithelial cells (x-axis: pseudotime, darker colors represent earlier pseudotime, lighter colors represent later pseudotime). E-F. Pseudotime trajectory analysis and relative expression profiles of biomarkers in macrophages.
Fig 11.
Validation of Biomarker Expression in Mice.
A. H&E staining of lung tissues (scale bar: 100 μm). B. Quantification of pulmonary fibrosis severity using the Ashcroft score based on H&E-stained mouse lung sections. C. Chest CT images of mice. D. Lung hydroxyproline content measured using a commercial hydroxyproline assay kit (A030-2-1, Njjccbio, China). E-G. mRNA levels of PDLIM7, CFAP45, and HP in lung tissues determined by RT-qPCR (6 Ctrl vs 6 BLM). GAPDH was used as the housekeeping gene for normalization. Results are expressed as relative gene transcript levels, with the Ctrl group set to 1. Data are presented as mean ± SD; error bars represent SD. * P < 0.05; ** P < 0.01; *** P < 0.001.
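The normalization described for panels E-G can be illustrated with a short sketch. The legend does not name the calculation; the code assumes the widely used 2^-ΔΔCt approach, with GAPDH as the reference gene and the mean ΔCt of the Ctrl group as the calibrator. The function name and all Ct values are hypothetical.

```python
from statistics import mean

def relative_expression(target_ct, ref_ct, ctrl_delta_ct_mean):
    """Relative transcript level by the 2^-ddCt approach (assumed here).

    delta Ct = Ct(target) - Ct(reference gene, e.g. GAPDH)
    ddCt     = delta Ct(sample) - mean delta Ct(Ctrl group)
    """
    delta_ct = target_ct - ref_ct
    return 2 ** -(delta_ct - ctrl_delta_ct_mean)

# Made-up Ct values for three control samples: (target gene, GAPDH)
ctrl = [(25.0, 18.0), (25.5, 18.5), (24.5, 17.5)]
ctrl_delta_ct_mean = mean(t - r for t, r in ctrl)

# A made-up BLM sample in which the target gene amplifies two cycles earlier
blm_level = relative_expression(23.0, 18.0, ctrl_delta_ct_mean)
ctrl_level = relative_expression(25.0, 18.0, ctrl_delta_ct_mean)
```

With the mean control ΔCt as calibrator, a control sample at that ΔCt has a relative level of 1, and each cycle of earlier amplification doubles the reported level.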