Fig 1.
The flow chart of patient enrollment.
Table 1.
Clinical-radiological characteristics of patients in each cohort.
Table 2.
Univariable and multivariable logistic regression analyses of the association between clinical-radiological characteristics and IASLC grade.
Table 3.
Performance of all prediction models.
Fig 2.
Performance of the five developed models.
(a) Training cohort, (b) validation cohort, and (c) independent test cohort.
Fig 3.
Tumor subregion segmentation was performed using a two-stage approach.
(a) Patient-level supervoxel generation using simple linear iterative clustering (SLIC), followed by (b) cohort-level phenotypic clustering with a Gaussian mixture model. The optimal number of clusters (k = 3) was determined with the elbow method, based on the rate of change in the sum of squared errors. (c) Visualization of the cohort-level clustering results. Color coding: red, Subregion 1; green, Subregion 2; blue, Subregion 3.
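The elbow criterion named in this caption can be illustrated with a minimal sketch. It assumes the "rate of change" is measured as the difference between successive drops in the sum of squared errors (SSE); the caption does not give the exact rule. The function name `elbow_k` and the SSE values are hypothetical and are not taken from the study.

```python
def elbow_k(sse_by_k):
    """Return the candidate k at which the SSE curve bends most sharply.

    sse_by_k maps consecutive cluster counts to their sum of squared errors.
    The bend at k is the SSE drop gained by reaching k minus the drop
    gained by going one cluster further (an assumed, common formulation).
    """
    ks = sorted(sse_by_k)
    if len(ks) < 3:
        raise ValueError("need at least three candidate k values")
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        drop_before = sse_by_k[prev_k] - sse_by_k[k]
        drop_after = sse_by_k[k] - sse_by_k[next_k]
        bend = drop_before - drop_after
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Hypothetical SSE values for illustration only (not results from the paper).
toy_sse = {1: 1000.0, 2: 520.0, 3: 210.0, 4: 180.0, 5: 160.0, 6: 150.0}
chosen_k = elbow_k(toy_sse)
```

With these toy values the curve flattens after three clusters, so `chosen_k` is 3. In practice the SSE curve would come from refitting the clustering model for each candidate k.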
Table 4.
Comparison of performance among models.
Fig 4.
Comparison of tumor subregion proportions between high-grade and low-grade invasive pulmonary adenocarcinoma patients.
(a) Patient A (high-grade group): Subregion 1, 73.74%; Subregion 2, 0%; Subregion 3, 26.26%. (b) Patient B (low-grade group): Subregion 1, 16.16%; Subregion 2, 6.06%; Subregion 3, 77.78%.
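As a rough illustration of how per-patient subregion proportions like these could be derived, the sketch below counts the cluster labels assigned to a patient's tumor units (voxels or supervoxels) and converts them to percentages. The function name `subregion_proportions` is hypothetical. The label counts (73, 0, and 26 out of 99 units) are an assumption chosen only so that the output matches Patient A's percentages; the caption does not report the underlying counts.

```python
from collections import Counter


def subregion_proportions(labels, n_subregions=3):
    """Percentage of tumor units assigned to each subregion (1..n_subregions)."""
    total = len(labels)
    if total == 0:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return {
        region: round(100.0 * counts.get(region, 0) / total, 2)
        for region in range(1, n_subregions + 1)
    }


# Hypothetical labels: 73 units in Subregion 1, none in 2, 26 in Subregion 3.
patient_a_labels = [1] * 73 + [3] * 26
patient_a_props = subregion_proportions(patient_a_labels)
```

Subregions with no assigned units are reported as 0.0 rather than omitted, so every patient yields a vector of the same length.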
Fig 5.
Calibration curves (a-c) and decision curve analyses (d-f) of the models in the training, validation, and independent test cohorts, respectively.
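For readers unfamiliar with decision curve analysis, the quantity plotted on its Y-axis is the net benefit at a given threshold probability, conventionally defined as TP/n − (FP/n) × pt/(1 − pt). The sketch below computes it at a single threshold. The function name `net_benefit` and the toy labels and probabilities are hypothetical; this is not the authors' code.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating every case whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    if n == 0:
        raise ValueError("y_true must not be empty")
    pairs = list(zip(y_true, y_prob))
    true_pos = sum(1 for y, p in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in pairs if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


# Hypothetical outcomes (1 = high-grade) and predicted probabilities.
toy_truth = [1, 1, 1, 0, 0, 0, 0, 1]
toy_prob = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1, 0.3, 0.6]
toy_nb = net_benefit(toy_truth, toy_prob, threshold=0.5)
```

Evaluating this function over a grid of thresholds, alongside the "treat all" and "treat none" strategies, gives the curves shown in a decision curve plot.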
Fig 6.
In the feature importance plot (a), the Y-axis lists features ranked by mean |SHAP| value (overall impact on prediction), with the most important feature at the top, and the X-axis shows the mean |SHAP| value; longer bars denote stronger feature influence. In the beeswarm plot (b), the X-axis shows the SHAP value of each feature for each sample; red and blue dots represent high and low feature values, respectively. The force plots (c, d) start from the base value. Each feature contributes a force proportional to its SHAP value, shown as an arrow: red increases the predicted probability of high grade, and blue decreases it. The base value plus the sum of all forces yields the final prediction f(x). (c) A 62-year-old female patient whose pathology showed 30% micropapillary and complex glandular components (IASLC Grade 3); the model output (f(x) = 0.980 > 0.525) classified the tumor as high-grade, consistent with the pathological diagnosis. (d) An 84-year-old male patient whose pathology showed predominantly lepidic growth (IASLC Grade 1); the model output (f(x) = 0.330 < 0.525) classified the tumor as low-grade, consistent with the pathological diagnosis.
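The quantities described in this caption can be sketched in a few lines: the mean |SHAP| ranking behind panel (a), and the additive reconstruction of f(x) behind the force plots in panels (c) and (d). The sketch assumes the SHAP values are expressed in probability space, so that f(x) can be compared directly with the 0.525 cutoff, and that outputs exactly at the cutoff are treated as low-grade; the caption confirms neither. All feature names, SHAP values, and the base value are hypothetical.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value, most important first."""
    n_samples = len(shap_rows)
    scores = {
        name: sum(abs(row[j]) for row in shap_rows) / n_samples
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def force_plot_output(base_value, shap_values):
    """Additivity of SHAP: f(x) equals the base value plus all contributions."""
    return base_value + sum(shap_values)


def classify(fx, cutoff=0.525):
    """Apply the cutoff quoted in the caption (tie handling is an assumption)."""
    return "high-grade" if fx > cutoff else "low-grade"


# Hypothetical feature names and SHAP values, one row per patient.
toy_features = ["feature_A", "feature_B", "feature_C"]
toy_shap = [[0.30, -0.05, 0.10], [-0.20, 0.02, -0.15]]
toy_base = 0.45

ranking = mean_abs_shap(toy_shap, toy_features)
toy_fx = [force_plot_output(toy_base, row) for row in toy_shap]
toy_calls = [classify(fx) for fx in toy_fx]
```

Applied to the two outputs reported in the caption, `classify(0.980)` returns "high-grade" and `classify(0.330)` returns "low-grade", matching panels (c) and (d).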