Machine learning for differentiating lung squamous cell cancer from adenocarcinoma using Clinical-Metabolic characteristics and 18F-FDG PET/CT radiomics

Yalin Zhang; Huiling Liu; Cheng Chang; Yong Yin; Ruozheng Wang

doi:10.1371/journal.pone.0300170

Abstract

Noninvasive differentiation between the squamous cell carcinoma (SCC) and adenocarcinoma (ADC) subtypes of non-small cell lung cancer (NSCLC) could benefit patients who are unsuitable for invasive diagnostic procedures. This study therefore evaluates the performance of PET/CT-based radiomics models, built with four different machine learning techniques, in distinguishing lung ADC from SCC. A total of 255 NSCLC patients were retrospectively analyzed and randomly divided into training (n = 177) and validation (n = 78) sets. Radiomics features were extracted, and the Least Absolute Shrinkage and Selection Operator (LASSO) method was employed for feature selection. Models were then constructed using the four machine learning techniques, and the top-performing algorithm was determined from accuracy, sensitivity, specificity, and the area under the curve (AUC). The models were compared using the DeLong test. A nomogram was developed based on the model with the best predictive efficiency and clinical utility, and it was validated using calibration curves. For the radiomics model, the logistic regression classifier had better predictive power in the validation cohort than the other classifiers. The combined model (AUC 0.870) exhibited superior predictive power compared to the clinical model (AUC 0.848) and the radiomics model (AUC 0.774). The combined model built with the logistic regression classifier thus showed the most effective performance in classifying the histological subtypes of NSCLC.

Citation: Zhang Y, Liu H, Chang C, Yin Y, Wang R (2024) Machine learning for differentiating lung squamous cell cancer from adenocarcinoma using Clinical-Metabolic characteristics and 18F-FDG PET/CT radiomics. PLoS ONE 19(4): e0300170. https://doi.org/10.1371/journal.pone.0300170

Editor: Francesco Dondi, Università degli Studi di Brescia, ITALY

Received: October 31, 2023; Accepted: February 22, 2024; Published: April 3, 2024

Copyright: © 2024 Zhang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its Supporting information files.

Funding: This work was supported by the Science and Technology Foundation of Xinjiang Uygur Autonomous Region (2022E02050). It was also supported by the Special Funds Project of Central Guidance on Local Science and Technology Development (ZYYD2022B18).

Competing interests: The authors have declared that no competing interests exist.

Introduction

According to GLOBOCAN 2020, lung cancer ranks as the second most common type of cancer and stands as the leading cause of cancer-related deaths. Approximately 2.2 million new cases were diagnosed in 2020 alone, with the disease accounting for an estimated 1.8 million fatalities [1]. Various types of lung cancer exist, with non-small cell lung cancer (NSCLC) being the most prevalent, constituting about 85% of all lung cancer cases globally [2]. Squamous cell carcinoma (SCC) and adenocarcinoma (ADC) represent the two most common histologic subtypes. Research indicates significant variances in the genetic and epigenetic traits of ADC and SCC during tumorigenesis and progression [3]. Given the differing treatment approaches for adenocarcinoma and squamous cell carcinoma, swift and precise identification of these pathological subtypes is critical.

Approximately one-third of patients diagnosed with NSCLC are at Stage III, a stage at which most are no longer viable candidates for surgical intervention [4]. Consequently, the adoption of computed tomography (CT)-guided biopsies has become the gold standard for determining the pathologic subtype of lung cancer. However, this invasive method may not fully capture the entire tumor’s heterogeneity. Given that biopsies typically yield only a few small tissue samples, they may not provide a comprehensive understanding of the overall tumor, posing challenges for accurate diagnosis. Additionally, potential risks associated with biopsy procedures, such as pneumothorax, intrathoracic hemorrhage, pleural reaction, air embolism, and intrapleural implantation metastasis, exist. The prospect of additional biopsies due to heterogeneous or necrotic tumor tissue may deter some patients from undergoing a biopsy, particularly those with an uncontrolled cough [5, 6]. Therefore, the development of a reliable, non-invasive, and practical method for predicting NSCLC histology prior to treatment is paramount.

Relevant studies suggest that certain clinical characteristics can aid in differentiating the diagnosis of lung adenocarcinoma from lung squamous cell carcinoma. These include factors such as age, smoking history, tumor diameter, imaging signs, and microvascular density [7–9]. However, the sole reliance on clinical features for classifying pathological tissues may be influenced by the subjective judgment of the physicians or by the heterogeneity and quantity of the samples, potentially leading to variability in diagnostic outcomes.

As medical and information technologies advance, the volume of medical data, especially medical imaging, is growing exponentially. These images harbor extensive latent details pertinent to human health. Yet the manual examination and interpretation of such data are not only time-consuming but also prone to human bias. Machine learning can significantly alleviate these issues by extracting sophisticated features and minimizing subjectivity. Radiomics entails the quantitative extraction of features from conventional medical imaging. Predictive or diagnostic models developed from these features through machine learning techniques can be incorporated into clinical decision-making tools, improving the accuracy of diagnosis or prognosis [10, 11]. Zhu et al. [12] extracted 485 features from manually delineated tumor regions in 129 NSCLC patients. The area under the curve (AUC) for the training and validation sets reached 0.905 and 0.893, respectively, indicating that the imaging features have substantial efficacy in differentiating between lung adenocarcinoma and squamous cell carcinoma. Bashir et al. [13] analyzed the effectiveness of a random forest model utilizing CT image radiomics features, CT semantic features, and combined features in distinguishing lung adenocarcinoma from squamous cell carcinoma. The findings revealed that the random forest model based on radiomics features could non-invasively analyze the histological subtypes of NSCLC with an AUC of 1.

18F-fluorodeoxyglucose (FDG) PET/CT, which combines anatomical and metabolic information, is crucial for identifying primary tumors, staging diseases, and assessing treatment success. Radiomics based on PET/CT shows potential in differentiating ADC from SCC. Yan et al. [14] developed separate models based on PET and CT, as well as a combined PET-CT model. Among these, the combined model exhibited superior performance in predicting ADC, SCC, and metastasis. Additional studies have also found that the inclusion of clinical characteristics, such as gender and smoking history, further enhanced the classification performance, achieving an area under the curve (AUC) of 0.859, which surpassed the performance of radiomics alone [15, 16]. When developing a radiomics prediction model, the selection of an appropriate machine learning algorithm can significantly enhance the model’s predictive accuracy and stability. Various classifiers, including Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF), have been utilized to construct models in the studies mentioned earlier. The Light Gradient Boosting Machine (LGBM), a model rooted in gradient boosting decision trees (GBDT), shares principles with the XGBoost algorithm but offers several advantages, such as faster training efficiency, lower memory consumption, higher accuracy, and support for parallel learning. To the best of our knowledge, there has been scant research evaluating the efficacy of the LGBM classifier for radiomics models based on 18F-FDG PET/CT that incorporate clinico-metabolic features for differentiating between ADC and SCC in lung cancer. Therefore, the objective of this research was to develop and validate a machine learning (ML) model utilizing PET/CT data to distinguish between SCC and ADC in stage III NSCLC.

Materials and methods

Study design

The Ethics Committee of the Affiliated Cancer Hospital of Shandong First Medical University approved the current retrospective study (No. SDTHEC 2023010008). The requirement for written informed consent was waived due to the study’s retrospective nature. The workflow of our study is depicted in Fig 1.

Fig 1. The workflow of this study.

https://doi.org/10.1371/journal.pone.0300170.g001

Patients

Data were accessed for research purposes beginning on March 10, 2023. In this study, we selected a cohort of 255 patients diagnosed with non-small cell lung cancer (NSCLC) between September 2018 and May 2022. The inclusion criteria for this study were as follows: (1) pathologically confirmed non-small cell lung cancer (NSCLC); (2) available PET/CT images obtained before treatment; (3) a diagnosis of stage III disease; (4) a single tumor lesion exceeding 1 cm in diameter. The exclusion criteria included: (1) patients who received anti-tumor treatment prior to the PET/CT scan; (2) individuals with a history of other thoracic malignant tumors or systemic malignancies; (3) patients with pathological confirmation of histological subtypes other than ADC or SCC; (4) patients who underwent surgical intervention after their diagnosis.

In the end, the study enrolled 255 patients, who were then randomly divided into two groups: the training cohort, consisting of 177 individuals, and the internal validation cohort, comprising 78 individuals, following a 7:3 distribution. The clinical characteristics of the patients were systematically documented. Furthermore, the researchers measured various PET metabolic parameters, including metabolic tumor volume (MTV), mean standardized uptake value (SUVmean), maximum standardized uptake value (SUVmax), and minimum standardized uptake value (SUVmin). Additionally, the total lesion glycolysis (TLG) was calculated using the formula TLG = SUVmean × MTV [17].
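The random split and the TLG formula can be illustrated with a minimal sketch. The function names, the random seed, and the example SUVmean and MTV values are assumptions made for illustration; only the cohort sizes (177 and 78) and the formula TLG = SUVmean × MTV come from the text.

```python
import random


def total_lesion_glycolysis(suv_mean: float, mtv_ml: float) -> float:
    """TLG = SUVmean x MTV (MTV assumed to be in mL)."""
    return suv_mean * mtv_ml


def split_train_validation(patient_ids, n_train, seed=0):
    """Shuffle patient IDs and return (training IDs, validation IDs)."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seed is a hypothetical choice
    return ids[:n_train], ids[n_train:]


# 255 patients split into 177 training and 78 validation cases (about 7:3).
train_ids, val_ids = split_train_validation(range(255), n_train=177)

# Hypothetical lesion: SUVmean of 5.0 and MTV of 20.0 mL.
example_tlg = total_lesion_glycolysis(5.0, 20.0)
```

Fixing the seed makes the split reproducible, which matters when the same cohorts are reused across several models.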

18F-FDG PET/CT image acquisition

18F-FDG scans were conducted using a Philips Gemini TF PET/CT system (Philips Medical Systems, Netherlands) in accordance with standard clinical scanning protocols. Patients were required to fast for at least six hours prior to the scan, ensuring their blood glucose levels remained below 140 mg/dL. Approximately one hour after the administration of an intravenous dose of 4.4 MBq/kg of 18F-FDG, PET and CT images were acquired. The PET images were reconstructed in multiple planes, with a reconstruction slice thickness of 1 to 3 mm.

Tumor segmentation

Tumor segmentation was executed using AccuContour software (version 3.2, Manteia Medical Technologies Co., Ltd., Xiamen, China). Two experienced nuclear medicine physicians employed a threshold of 40% of the maximum standardized uptake value (SUVmax) to delineate the gross tumor volume (GTV) on PET images, reaching a consensus without prior knowledge of the pathology [18, 19]. Concurrently, the contours of the GTV on CT slices were outlined based on the integration of PET and anatomical data from the CT images. Subsequently, two senior radiologists conducted a collaborative review of the target images.
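The 40% SUVmax threshold can be sketched on a toy array. This is not the AccuContour workflow, only an illustration of fixed-fraction thresholding; the SUV values are invented, and the choice of a "greater than or equal" comparison is an assumption. In practice the threshold would be applied inside a region around the lesion rather than to the whole image.

```python
import numpy as np


def threshold_gtv(suv: np.ndarray, fraction: float = 0.40) -> np.ndarray:
    """Boolean mask of voxels whose SUV is at least `fraction` of SUVmax."""
    return suv >= fraction * suv.max()


# Toy 3x3 "slice" of SUV values around a lesion (hypothetical numbers).
suv_slice = np.array([
    [1.0, 2.0, 1.5],
    [3.0, 10.0, 4.5],
    [0.5, 4.2, 2.5],
])

# SUVmax is 10.0, so the cut-off is 4.0 and three voxels are selected.
mask = threshold_gtv(suv_slice)
```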

Feature extraction

In this study, the features are divided into three categories: (I) geometric, (II) intensity, and (III) textural. Geometric features capture the three-dimensional shape properties of the tumor. Intensity features reflect the statistical distribution of voxel intensities within the tumor. In contrast, textural features leverage methods such as the gray-level co-occurrence matrix (GLCM), gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), and neighborhood gray-tone difference matrix (NGTDM) to characterize patterns and spatial distributions of voxel intensities. A total of 1,834 handcrafted CT features and 2,016 handcrafted PET features were extracted. All handcrafted features were extracted using a custom feature analysis program implemented in Pyradiomics (http://pyradiomics.readthedocs.io). To integrate PET and CT features, an early fusion approach was employed.
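Early fusion is commonly understood as joining the per-patient CT and PET feature vectors into one vector before any modelling. The text does not spell out the exact procedure, so the sketch below is an assumption. Random numbers stand in for real radiomic features, and only the feature counts (1,834 and 2,016) come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n_patients = 5  # toy cohort size, chosen for illustration

# Placeholder feature matrices with the feature counts reported in the text.
ct_features = rng.normal(size=(n_patients, 1834))
pet_features = rng.normal(size=(n_patients, 2016))

# Early fusion: concatenate the two feature sets column-wise per patient.
fused = np.hstack([ct_features, pet_features])
```

Under this reading, each patient ends up with a single vector of 3,850 candidate features that is passed to feature selection.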

Feature selection and prediction model establishment

To ensure maximal representation of features while maintaining their distinctiveness, we assessed the correlation among highly repeatable features using Spearman’s rank correlation coefficient. For any pair of features with a correlation coefficient greater than 0.9, only one was retained. This filtering used a greedy recursive elimination technique, which removes the most redundant feature from the current set at each step. We then applied Least Absolute Shrinkage and Selection Operator (LASSO) regression to select effective features within the training dataset. LASSO shrinks regression coefficients toward zero and excludes irrelevant features by setting their coefficients exactly to zero. The first step was to identify the optimal regularization parameter, λ. To accomplish this, we used 10-fold cross-validation and selected the λ value that minimized the cross-validation error. The features with non-zero coefficients at this λ were retained and aggregated to construct the radiomic model. Additionally, a radiomics score was computed for each patient as a linear combination of the selected features, weighted by their respective coefficients in the model. The LASSO regression modeling was conducted using the scikit-learn package in Python.
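The selection steps in this paragraph can be sketched with scikit-learn on synthetic data. This is an illustrative sketch, not the study's code. The data are random, and the redundancy filter is a simplified greedy pass that keeps one feature from each group correlated above 0.9, which is an interpretation of the text. The Spearman helper ignores ties. `LassoCV` with `cv=10` picks the regularization strength by 10-fold cross-validation.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler


def spearman_matrix(X):
    """Spearman correlation as Pearson correlation of column ranks (ties ignored)."""
    ranks = np.argsort(np.argsort(X, axis=0), axis=0).astype(float)
    return np.corrcoef(ranks, rowvar=False)


def drop_correlated(X, threshold=0.9):
    """Keep a feature only if |rho| <= threshold with every feature kept so far."""
    rho = np.abs(spearman_matrix(X))
    kept = []
    for j in range(X.shape[1]):
        if all(rho[j, k] <= threshold for k in kept):
            kept.append(j)
    return kept


# Synthetic training data: 177 "patients", 20 features, binary label.
rng = np.random.default_rng(0)
n = 177
X = rng.normal(size=(n, 20))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=n)  # near-duplicate of feature 0
y = (X[:, 0] - X[:, 2] + 0.5 * rng.normal(size=n) > 0).astype(float)

kept = drop_correlated(X)
X_kept = StandardScaler().fit_transform(X[:, kept])

# LASSO with the penalty chosen by 10-fold cross-validation.
lasso = LassoCV(cv=10, random_state=0).fit(X_kept, y)
selected = np.flatnonzero(lasso.coef_)

# Radiomics score: intercept plus the coefficient-weighted selected features.
rad_score = lasso.intercept_ + X_kept[:, selected] @ lasso.coef_[selected]
```

Standardizing the features before LASSO is a common design choice, since the penalty treats all coefficients on the same scale. The text does not state whether the authors did so.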

To differentiate between SCC and ADC, three independent predictive models were separately developed: the Clinical-Metabolic Model (clinic model), the PET/CT Radiomic Model (RS model), and the Combined PET/CT Radiomic and Clinical-Metabolic Model (combined model). Four machine learning classifiers, including LR, LGBM, SVM, RF, were used to construct these models. During this process, 5-fold cross-validation was employed to derive these final models.

Development and validation of individualized nomogram

Furthermore, we constructed a radiomics nomogram using the validation dataset to facilitate a rapid and visual assessment of the enhanced predictive value provided by the combination of the radiomics scores with clinical risk factors. Logistic regression analysis was used in this study to combine radiomic features with clinical risk factors in the nomogram. Finally, we developed calibration curves to appraise the calibration quality of the nomogram.
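A nomogram is a graphical form of a fitted logistic regression, so the probability it reads off can be written directly. The sketch below shows that calculation and a simple binned calibration check. All coefficients and patient values are hypothetical and are not the fitted values from the study.

```python
import math


def predicted_probability(intercept, coefficients, values):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum of b_i * x_i)))."""
    linear_predictor = intercept + sum(
        coefficients[name] * values[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def calibration_bins(probabilities, outcomes, n_bins=5):
    """Per equal-width bin: (mean predicted probability, observed event rate)."""
    bins = [[] for _ in range(n_bins)]
    for prob, outcome in zip(probabilities, outcomes):
        index = min(int(prob * n_bins), n_bins - 1)
        bins[index].append((prob, outcome))
    return [
        (sum(pr for pr, _ in b) / len(b), sum(o for _, o in b) / len(b))
        for b in bins
        if b
    ]


# Hypothetical coefficients and patient values, for illustration only.
coefs = {"rad_score": 2.0, "gender": -1.0, "cea": -0.05}
patient = {"rad_score": 0.8, "gender": 1, "cea": 4.0}
p = predicted_probability(intercept=0.0, coefficients=coefs, values=patient)
```

A calibration curve plots the pairs returned by `calibration_bins`; points near the diagonal indicate that predicted probabilities match observed frequencies.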

Statistical analysis

Patient characteristics were compared using independent sample t-tests, Mann-Whitney U tests, Fisher’s exact test, or chi-square (χ²) tests where relevant. The process of identifying clinical features incorporated both univariate and multivariate logistic regression analyses. The statistical software SPSS (Version 25.0) and R (Version 3.4.0) were used for data analysis, with P-values below 0.05 signaling statistical significance. The selection of the most effective machine learning (ML) model hinged on its performance metrics: the area under the receiver operating characteristic curve (AUC), accuracy (ACC), sensitivity (SEN), and specificity (SPE). AUC comparisons of the various models on the validation set were made using the DeLong test. Decision Curve Analysis (DCA) was also implemented to evaluate the clinical usefulness of the predictive model.
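The four performance metrics can be computed from labels and scores as in the sketch below. The labels, scores, and the 0.5 decision cut-off are invented for illustration. The AUC uses its rank interpretation: the probability that a random positive case scores higher than a random negative case. The DeLong test itself is not implemented here.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "ACC": (tp + tn) / len(y_true),
        "SEN": tp / (tp + fn),
        "SPE": tn / (tn + fp),
    }


def auc_rank(y_true, scores):
    """AUC as P(score of positive > score of negative); ties count one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if sp > sn else 0.5 if sp == sn else 0.0 for sp in pos for sn in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical labels and scores).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = auc_rank(y_true, scores)
```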

Results

Clinical characteristics of patients

This study included a total of 255 participants diagnosed with non-small cell lung cancer (NSCLC). Among these cases, there were 145 patients with squamous cell carcinoma (SCC) and 110 with adenocarcinoma (ADC). The patient population ranged in age from 26 to 85 years, with a mean age of 62 years. The distribution of baseline clinical characteristics was well-balanced between the training and validation cohorts. Table 1 displays the distribution characteristics of the two groups, providing a detailed breakdown of the baseline clinical attributes for each set of patients.

Table 1. Baseline characteristics of patients in cohorts.

https://doi.org/10.1371/journal.pone.0300170.t001

Features selection and prediction model establishment

To minimize subjective variability in the segmentation of regions of interest (ROI), only radiomic features with both inter-reader and intra-reader Intraclass Correlation Coefficients (ICCs) greater than 0.75 were included. The radiomic features extracted from PET/CT images were categorized into seven distinct groups: first-order features, shape-based features, Gray Level Dependence Matrix (GLDM) features, Gray Level Run Length Matrix (GLRLM) features, Gray Level Size Zone Matrix (GLSZM) features, Neighbouring Gray Tone Difference Matrix (NGTDM) features, and Gray Level Co-occurrence Matrix (GLCM) features. Detailed information about the handcrafted features is provided in Supplementary data (S1 Table). Fig 2 depicts the quantity and distribution of the handcrafted features extracted from the CT and PET images.
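A reproducibility filter of this kind can be sketched as follows. The text does not say which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is an assumption here. The feature values are simulated, and only one of the two checks (inter-reader) is shown.

```python
import numpy as np


def icc_2_1(ratings: np.ndarray) -> float:
    """ICC(2,1) for an array of shape (n_subjects, n_raters)."""
    n, k = ratings.shape
    grand = ratings.mean()
    ss_rows = k * ((ratings.mean(axis=1) - grand) ** 2).sum()  # between subjects
    ss_cols = n * ((ratings.mean(axis=0) - grand) ** 2).sum()  # between raters
    ss_error = ((ratings - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Simulated values of two features measured by two readers on 30 lesions.
rng = np.random.default_rng(0)
truth = rng.normal(size=30)
features = {
    "stable_feature": np.column_stack([truth, truth + 0.05 * rng.normal(size=30)]),
    "unstable_feature": np.column_stack([truth, rng.normal(size=30)]),
}

# Keep only features whose ICC exceeds the 0.75 cut-off from the text.
reproducible = [name for name, r in features.items() if icc_2_1(r) > 0.75]
```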

Fig 2. Number and ratio of handcrafted features.

A shows CT features; B shows PET features.

https://doi.org/10.1371/journal.pone.0300170.g002

Three models were constructed independently using selected clinical factors-metabolic parameters, PET/CT radiomic features, and a combination of the aforementioned variables, utilizing LASSO regression in the training cohort. The clinic model comprised two clinical factors (gender and age), one tumor marker (CEA), and two metabolic parameters (SUVmax and SUVmin) (Fig 3A). The analysis suggested that SCC was more prevalent among older males with a long history of smoking, whereas ADC tended to occur in younger females, typically non-smokers (p < 0.05). Univariate logistic regression analysis identified gender, age, smoking history, T stage, white blood cell count (WBC), CEA, total lesion glycolysis (TLG), SUVmin, SUVmax, SUVmean, and metabolic tumor volume (MTV) as independent risk factors for pathological classification (Table 2, p < 0.05). Multivariate logistic regression analysis confirmed gender, age, CEA, SUVmax, and SUVmin as independent clinical predictors of histology (Table 3, p < 0.05). Within the training cohort for the RS model, eight features were identified (Fig 3B, Table 4). The combined model incorporated two clinical parameters (gender and CEA) along with seven radiomic parameters (Fig 3C–3E, Table 4). To calculate each patient’s pre-scores for each model, the following formulas were applied:

Fig 3. Radiomic features selected using a LASSO regression model for subgroups.

A–C show the coefficients of each feature in the most predictive feature subset. The abscissa is the coefficient, and the ordinate shows the retained features. The larger the coefficient, the greater the feature's predictive effect. A shows the features selected in the clinic model, B the features selected in the RS model, and C the features selected in the combined model. D shows the MSE of 10-fold cross-validation. E shows the coefficients of 10-fold cross-validation.

https://doi.org/10.1371/journal.pone.0300170.g003

Table 2. Univariate logistic regression analysis of clinical predictors of histology.

https://doi.org/10.1371/journal.pone.0300170.t002

Table 3. Multivariate logistic regression analysis of clinical predictors of histology.

https://doi.org/10.1371/journal.pone.0300170.t003

Table 4. The final selected PET-CT radiomics features for RS model and combined model.

https://doi.org/10.1371/journal.pone.0300170.t004

Prediction performance and clinical utility of prediction models

Table 5 summarizes the performance of the machine learning classifiers in differentiating ADC from SCC within the training and internal validation cohorts. In the validation cohort, the LR models demonstrated superior AUC, ACC, SEN, and SPE compared to the other ML classifiers. As a result, LR was selected as the preferred ML algorithm for classifying the pathological types.

Table 5. Performance of four machine learning algorithms for differentiating pathological subtypes in the training and internal validation cohort.

https://doi.org/10.1371/journal.pone.0300170.t005

The performance evaluation of the three predictive models using the logistic regression (LR) classifier (Fig 4A and 4B), complemented by the DeLong test results (Table 6), indicated that the combined model surpassed the others, exhibiting superior discrimination and achieving the highest level of accuracy. This was substantiated by the metrics obtained using the logistic regression classifier in both the training cohort (AUC (95% CI) = 0.869 (0.817–0.921)) and the validation cohort (AUC (95% CI) = 0.870 (0.791–0.948)), with both p-values being less than 0.05.

Fig 4. Comparison of receiver operating characteristic (ROC) curves for predicting subtype of pathology.

A shows the ROC curve of LR in the training cohort; B shows the ROC curve of LR in the validation cohort.

https://doi.org/10.1371/journal.pone.0300170.g004

Table 6. DeLong test within different models based on LR classifier for the validation cohort.

https://doi.org/10.1371/journal.pone.0300170.t006

A nomogram was developed to integrate clinical and radiomic signatures, and it exhibited the most robust performance (Fig 5A). Decision curve analysis (DCA) further confirmed that the combined model serves as the most dependable clinical instrument for predicting histological subtypes, especially when the threshold probability surpasses 20% (Fig 5B and 5C). Calibration curves for both the training and validation cohorts of the nomogram demonstrated a high degree of agreement between the predicted histology and the actual observations, with the combined model using the logistic regression algorithm showing particular effectiveness (Fig 5D and 5E).
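Decision curve analysis rests on the net benefit at a given threshold probability pt, defined as TP/n − (FP/n) × pt/(1 − pt). The sketch below computes it for a model and for the "treat all" reference strategy at the 20% threshold mentioned above. The labels and predicted probabilities are invented for illustration.

```python
def net_benefit(y_true, probabilities, threshold):
    """Net benefit = TP/n - FP/n * pt / (1 - pt) at threshold probability pt."""
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, probabilities) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, probabilities) if p >= threshold and t == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy that labels every patient as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy cohort of ten patients (hypothetical labels and predictions).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
probs = [0.9, 0.7, 0.3, 0.6, 0.1, 0.1, 0.25, 0.1, 0.1, 0.1]

nb_model = net_benefit(y_true, probs, threshold=0.20)
nb_all = net_benefit_treat_all(y_true, threshold=0.20)
```

A decision curve repeats this calculation over a range of thresholds. A model is useful wherever its curve lies above both the "treat all" and "treat none" (net benefit 0) lines.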

Fig 5. Clinical utility of prediction models.

A shows the nomogram of the clinical-radiomics model, developed with logistic regression in the training cohort (gender: 1 = male, 2 = female). B and C show the decision curve analysis (DCA) of the logistic regression prediction models in the training (B) and validation (C) cohorts. D and E show the calibration curves of the logistic regression nomogram in the training (D) and validation (E) cohorts.

https://doi.org/10.1371/journal.pone.0300170.g005

Discussion

Personalized treatment plays a crucial role in improving patient survival outcomes. At the heart of personalized medicine lies the early and accurate diagnosis and staging of lung cancer, as well as the precise identification of its pathological subtypes. Although biopsy is considered the gold standard for diagnosing lung cancer, its invasiveness, limited reproducibility, possibility of yielding false-negative results, and the associated risk of complications highlight the urgent need for enhanced diagnostic methods. Therefore, the differentiation of pathological subtypes of non-small cell lung cancer (NSCLC) through standard imaging modalities remains a substantial challenge. In this study, we compared four classifier models to identify the pathological subtypes of non-small cell lung cancer. The optimal classifier was evaluated for its predictive efficacy across three models: the RS model, the clinic model, and the combined model. In this study, we discovered that the combined model, refined by the logistic regression classifier, exhibited the most effective performance in classifying the histological subtypes of NSCLC.

We explored the clinical features that contribute to the differentiation between ADC and SCC in NSCLC. We found that gender, age, CEA levels, maximum standardized uptake value, and minimum standardized uptake value were statistically significant discriminators between ADC and SCC, in accordance with other studies. Previous research has validated gender and age as clinical characteristics that can distinguish between ADC and SCC. Koh et al. [20] compared intratumoral stromal proportions and positron emission tomography (PET) textural features in females and males diagnosed with either adenocarcinoma or squamous cell carcinoma. Their findings indicated a higher prevalence of ADC in females compared to males. Additionally, the variation in tumor heterogeneity between women with ADC and men with ADC or SCC suggests that gender may serve as a distinguishing feature. Younger patients are more commonly diagnosed with adenocarcinoma, aligning with the findings of several studies [21–24]. This trend may be attributable to the different mutation rates of genes such as EGFR, ALK, and KRAS in younger versus older lung cancer patients. In our cohort, the ADC group consisted of younger individuals than the SCC group (P < 0.05), although the average age in both groups exceeded 60 years. This contrasts with other studies that categorize patients as younger if under 40 and older if over 60. The discrepancy can primarily be ascribed to the specific sample population in our research. Serum CEA, a tumor marker, is commonly measured to identify lung cancer. Elevated CEA levels are seen in 35% to 70% of NSCLC patients, particularly in those with lung adenocarcinoma and advanced disease [25], a finding that our study corroborates. Furthermore, research by Karam et al. [26–28] on 98 NSCLC cases established a significant correlation between SUVmax and the size of primary lesions, with SCC showing notably higher SUVmax values than ADC. Our study confirms that SUVmax is indeed higher in SCC compared to ADC (P < 0.05). We also observed that SUVmin is higher in ADC than in SCC (P < 0.05), adding another layer to the diagnostic criteria for these subtypes.

In addition to analyzing clinical features, this study also integrated radiomic features from PET/CT images. In our study, the most significant radiomic features for both the RS model and the combined model were lbp_3D_m1_firstorder_Skewness, log_sigma_5_0_mm_3D_firstorder_90Percentile, squareroot_ngtdm_Busyness and log_sigma_2_0_mm_3D_firstorder_Maximum. A previous study [29] demonstrated that first-order features were particularly stable and robust in rectal cancer. In our research, three first-order features were also confirmed to be the most significant indicators for classifying histological types. Another noteworthy feature in our study was ’busyness,’ which is associated with the spatial frequency of intensity changes. Erol M et al. [30] reported that radiomic features, including busyness, were independently correlated with the staging of lung squamous cancer. Bashir et al. [13] and Hyun et al. [16] have previously explored the application of radiomics in the classification of NSCLC. In their studies, however, the best-performing radiomic features were different ones: GLSZMSZLIE, coefficient of variation, NGTDM coarseness, gray-level zone length nonuniformity, and gray-level nonuniformity for zone. The best-performing subset of radiomic features in our study also differs from those identified in other studies [31, 32]. This discrepancy may arise because there are hundreds of radiomic features, many of which are inter-correlated, so different high-ranking features might essentially represent variations of the same underlying feature.

Regarding the predictive capabilities of different models, several studies [33, 34] have indicated that a combined model incorporating both PET and CT features yields a higher Area Under the Curve than models using only PET or CT features individually. Ren [35] analyzed preoperative clinical features, tumor markers, and PET and CT imaging characteristics, subsequently constructing four independent predictive models. The DeLong test revealed that the combined model exhibited superior performance in predicting the pathological subtypes of NSCLC, with an AUC of 0.932 for the training cohort and an AUC of 0.901 for the validation cohort. These findings align with those of our study, which determined that the combined model achieved higher AUC values compared to the pure PET-CT model and the clinical model alone.

In the construction of radiomics-based prediction models, the selection of suitable machine learning algorithms can enhance the predictive accuracy and stability of the model. Recent advancements in machine learning algorithms, including Gaussian processes, decision trees, RF, SVM, LGBM, and LR, have propelled the application and development of radiomics. Shen [36] evaluated seven different classifiers to optimize a model for classification: SVM with a linear kernel, SVM with a radial basis function kernel (SVM-RBF), RF, LR, Gaussian process classifier (GP), linear discriminant analysis (LDA), and the AdaBoost classifier. The study found that a PET/CT radiomics model using the SVM-RBF classifier demonstrated the best performance, with an AUC of 0.9155, when integrating subregion imaging from PET-CT scans and clinical features for classifying histological subtypes of NSCLC. Parmar et al. [37] demonstrated that the random forest method was the most effective in managing radiomic feature instability, outperforming the other machine learning classifiers examined (bagging, Bayesian, boosting, decision trees, discriminant analysis, generalized linear models, multiple adaptive regression splines, nearest neighbors, neural networks, partial least squares and principal component regression, and SVMs) in terms of prognostic performance. Huang et al. [38] employed the LGBM algorithm to develop both a radiomic model and a fusion model (clinical + radiomic) to predict EGFR mutation status in patients with NSCLC. Models based on radiomic signatures can provide relatively accurate non-invasive predictions of EGFR expression status.

In our study, we found that the combined model constructed using the Logistic Regression algorithm performs excellently in identifying pathological subtypes, even when compared with multiple algorithms, including Light Gradient Boosting Machine. Moreover, LGBM also demonstrated good predictive performance. Logistic regression, a linear model, is widely favored for binary classification problems due to its computational simplicity and interpretability. LR is adept at handling large datasets and excels with linearly separable problems. Our finding aligns with a similar study conducted by Ren et al. [35], which reported that a combined model incorporating clinico-biological features and 18F-FDG PET/CT data, utilizing the LR algorithm, showed strong capability in distinguishing SCC from ADC. Additionally, another study suggested that a model trained on 18F-FDG PET radiomics with the LR algorithm could be effective for predicting the histological subtypes of lung cancer [16]. Given that different machine learning algorithms have their respective optimal application contexts, it is imperative to explore a variety of algorithms to identify the most suitable model for predicting NSCLC histology subtypes. For instance, Gao et al. proposed an improved adaptive neuro-fuzzy inference system-based machine learning method to predict the multi-axis fatigue life of various metal materials. The study found that this model exhibited superior predictive performance and extrapolation capabilities when compared with six classical machine learning models [39]. Moreover, a recent study indicated that a 3D convolutional neural network (CNN) model effectively differentiated between benign and malignant pulmonary nodules in 2-[18F]FDG PET images [40]. The optimization of machine learning algorithms should be prioritized in future research to improve the performance of predictions.

Our study has several limitations. First, it is based on data from a single center and includes only a training set and an internal validation set; multi-center data should be included to improve the stability of the predictions. Second, the classification methods used in this study are conventional machine learning algorithms. Future work could incorporate deep learning techniques, possibly combined with multi-omics data, to optimize the classification model.

Conclusion

In this study, we developed a combined model that integrates clinical characteristics with PET/CT imaging features using the logistic regression algorithm. This model serves as an effective tool for the “virtual biopsy” of stage III non-small cell lung cancer, distinguishing between pathological subtypes. It can help physicians make informed clinical decisions about treatment options and prognostic assessment.

Supporting information

S1 Table. Radiomics feature extraction.

https://doi.org/10.1371/journal.pone.0300170.s001

(DOC)

S1 Dataset. PET-CT radiomics features.

https://doi.org/10.1371/journal.pone.0300170.s002

(ZIP)

S2 Dataset. Baseline characteristics of patients.

https://doi.org/10.1371/journal.pone.0300170.s003

(ZIP)

S1 File.

https://doi.org/10.1371/journal.pone.0300170.s004

(DOCX)

S2 File.

https://doi.org/10.1371/journal.pone.0300170.s005

(PDF)

Acknowledgments

We thank our colleagues at the Department of Radiation Oncology, Shandong Cancer Hospital, and the Onekey AI platform for technical support.

References

1. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA: A Cancer Journal for Clinicians. 2021; 209–249. pmid:33538338
2. Bradley JD, Paulus R, Komaki R, Masters G, Blumenschein G, Schild S, et al. Standard-dose versus high-dose conformal radiotherapy with concurrent and consolidation carboplatin plus paclitaxel with or without cetuximab for patients with stage IIIA or IIIB non-small-cell lung cancer (RTOG 0617): a randomised, two-by-two factorial phase 3 study. The Lancet Oncology. 2015;16: 187–199. pmid:25601342
3. Takamochi K, Ohmiya H, Itoh M, Mogushi K, Saito T, Hara K, et al. Novel biomarkers that assist in accurate discrimination of squamous cell carcinoma from adenocarcinoma of the lung. BMC Cancer. 2016;16. pmid:27681076
4. Bryan S, Masoud H, Weir Hannah K, Woods R, Lockwood G, Smith Lisa F, et al. Cancer in Canada: Stage at diagnosis. Health Reports. 2018. pmid:30566206
5. Cooper WA, O’Toole S, Boyer M, Horvath L, Mahar A. What’s new in non-small cell lung cancer for pathologists: the importance of accurate subtyping, EGFR mutations and ALK rearrangements. Pathology. 2011;43: 103–115. pmid:21233671
6. Yuan C, Tao X, Zheng D, Pan Y, Ye T, Hu H, et al. The lymph node status and histologic subtypes influenced the effect of postoperative radiotherapy on patients with N2 positive IIIA non-small cell lung cancer. Journal of Surgical Oncology. 2019; 379–387. pmid:30536966
7. He B, Song Y, Wang L, Wang T, She Y, Hou L, et al. A machine learning-based prediction of the micropapillary/solid growth pattern in invasive lung adenocarcinoma with radiomics. Translational Lung Cancer Research. 2021;10: 955–964. pmid:33718035
8. Jiang C, Zhao M, Hou S, Hu X, Huang J, Wang H, et al. The Indicative Value of Serum Tumor Markers for Metastasis and Stage of Non-Small Cell Lung Cancer. Cancers. 2022;14: 5064. pmid:36291848
9. Deniffel D, Sauter A, Fingerle A, Rummeny EJ, Makowski MR, Pfeiffer D. Improved differentiation between primary lung cancer and pulmonary metastasis by combining dual-energy CT–derived biomarkers with conventional CT attenuation. European Radiology. 2021;31: 1002–1010. pmid:32856165
10. Avanzo M, Stancanello J, Pirrone G, Sartor G. Radiomics and deep learning in lung cancer. Strahlentherapie und Onkologie. 2020; 879–887. pmid:32367456
11. Thawani R, McLane M, Beig N, Ghose S, Prasanna P, Velcheti V, et al. Radiomics and radiogenomics in lung cancer: A review for the clinician. Lung Cancer. 2018;115: 34–41. pmid:29290259
12. Zhu X, Dong D, Chen Z, Fang M, Zhang L, Song J, et al. Radiomic signature as a diagnostic factor for histologic subtype classification of non-small cell lung cancer. European Radiology. 2018; 2772–2778. pmid:29450713
13. Bashir U, Kawa B, Siddique M, Mak SM, Nair A, Mclean E, et al. Non-invasive classification of non-small cell lung cancer: a comparison between random forest models utilising radiomic and semantic features. The British Journal of Radiology. 2019; 20190159. pmid:31166787
14. Yan M, Wang W. Development of a Radiomics Prediction Model for Histological Type Diagnosis in Solitary Pulmonary Nodules: The Combination of CT and FDG PET. Frontiers in Oncology. 2020;10. pmid:33042839
15. Bianconi F, Palumbo I, Fravolini ML, Chiari R, Minestrini M, Brunese L, et al. Texture Analysis on [18F]FDG PET/CT in Non-Small-Cell Lung Cancer: Correlations Between PET Features, CT Features, and Histological Types. Molecular Imaging and Biology. 2019;21: 1200–1209. pmid:30847822
16. Hyun SH, Ahn MS, Koh YW, Lee SJ. A Machine-Learning Approach Using PET-Based Radiomics to Predict the Histological Subtypes of Lung Cancer. Clinical Nuclear Medicine. 2019; 956–960. pmid:31689276
17. Yang B, Wang Q gen, Lu M, Ge Y, Zheng Y jun, Zhu H, et al. Correlations Study Between 18F-FDG PET/CT Metabolic Parameters Predicting Epidermal Growth Factor Receptor Mutation Status and Prognosis in Lung Adenocarcinoma. Frontiers in Oncology. 2019;9. pmid:31380265
18. Agarwal J, Tibdewal A, Patil M, Misra S, Purandare N, Rangarajan V, et al. Optimal standardized uptake value threshold for auto contouring of gross tumor volume using positron emission tomography/computed tomography in patients with operable nonsmall-cell lung cancer: Comparison with pathological tumor size. Indian Journal of Nuclear Medicine. 2021;36: 7. pmid:34040289
19. Zhang Y, Hu Y, Zhao S, Cui C. The Utility of PET/CT Metabolic Parameters Measured Based on Fixed Percentage Threshold of SUVmax and Adaptive Iterative Algorithm in the New Revised FIGO Staging System for Stage III Cervical Cancer. Frontiers in Medicine. 2021;8. pmid:34395472
20. Koh YW, Lee D, Lee SJ. Intratumoral heterogeneity as measured using the tumor-stroma ratio and PET texture analyses in females with lung adenocarcinomas differs from that of males with lung adenocarcinomas or squamous cell carcinomas. Medicine. 2019;98: e14876. pmid:30882693
21. Sacher AG, Dahlberg SE, Heng J, Mach S, Jänne PA, Oxnard GR. Association Between Younger Age and Targetable Genomic Alterations and Prognosis in Non–Small-Cell Lung Cancer. JAMA Oncology. 2016; 313. pmid:26720421
22. Bigay-Gamé L, Bota S, Greillier L, Monnet I, Madroszyk A, Corre R, et al. Characteristics of Lung Cancer in Patients Younger than 40 Years: A Prospective Multicenter Analysis in France. Oncology. 2018; 337–343. pmid:30278447
23. Garrana SH, Dagogo-Jack I, Cobb R, Kuo AH, Mendoza DP, Zhang EW, et al. Clinical and Imaging Features of Non–Small-Cell Lung Cancer in Young Patients. Clinical Lung Cancer. 2021;22: 23–31. pmid:33189594
24. Catania C, Botteri E, Barberis M, Conforti F, Toffalorio F, Marinis FD, et al. Molecular features and clinical outcome of lung malignancies in very young people. Future Oncology. 2015;11: 1211–1221. pmid:25832878
25. Yoshino I, Ichinose Y, Nagashima A, Takeo S, Motohiro A, Yano T, et al. Clinical Characterization of Node-Negative Lung Adenocarcinoma: Results of a Prospective Investigation. Journal of Thoracic Oncology. 2006;1: 825–831. pmid:17409966
26. Ito R, Iwano S, Kishimoto M, Ito S, Kato K, Naganawa S. Correlation between FDG-PET/CT findings and solid type non-small cell cancer prognostic factors: are there differences between adenocarcinoma and squamous cell carcinoma? Annals of Nuclear Medicine. 2015;29: 897–905. pmid:26342592
27. Karam MB, Doroudinia A, Behzadi B, Mehrian P, Koma AY. Correlation of quantified metabolic activity in nonsmall cell lung cancer with tumor size and tumor pathological characteristics. Medicine. 2018;97: e11628. pmid:30095621
28. Li M, Sun Y, Liu Y, Han A, Zhao S, Ma L, et al. Relationship between primary lesion FDG uptake and clinical stage at PET–CT for non-small cell lung cancer patients: An observation. Lung Cancer. 2010;68: 394–397. pmid:19683358
29. Traverso A, Kazmierski M, Shi Z, Kalendralis P, Welch M, Nissen HD, et al. Stability of radiomic features of apparent diffusion coefficient (ADC) maps for locally advanced rectal cancer in response to image pre-processing. Physica Medica. 2019;61: 44–51. pmid:31151578
30. Erol M, Önner H, Küçükosmanoğlu İ. Association of Fluorodeoxyglucose Positron Emission Tomography Radiomics Features with Clinicopathological Factors and Prognosis in Lung Squamous Cell Cancer. Nuclear Medicine and Molecular Imaging. 2022;56: 306–312. pmid:36425277
31. Wang X, Dai Y, Lin H, Cheng J, Zhang Y, Cao M, et al. Shape and texture analyses based on conventional MRI for the preoperative prediction of the aggressiveness of pituitary adenomas. European Radiology. 2023;33: 3312–3321. pmid:36738323
32. Cui Y, Lin Y, Zhao Z, Long H, Zheng L, Lin X. Comprehensive 18F-FDG PET-based radiomics in elevating the pathological response to neoadjuvant immunochemotherapy for resectable stage III non-small-cell lung cancer: A pilot study. Frontiers in Immunology. 2022. pmid:36466929
33. Zhou Y, Ma X, Zhang T, Wang J, Zhang T, Tian R. Use of radiomics based on 18F-FDG PET/CT and machine learning methods to aid clinical decision-making in the classification of solitary pulmonary lesions: an innovative approach. European Journal of Nuclear Medicine and Molecular Imaging. 2021; 2904–2913. pmid:33547553
34. Koyasu S, Nishio M, Isoda H, Nakamoto Y, Togashi K. Usefulness of gradient tree boosting for predicting histological subtype and EGFR mutation status of non-small cell lung cancer on 18F FDG-PET/CT. Annals of Nuclear Medicine. 2020; 49–57. pmid:31659591
35. Ren C, Zhang J, Qi M, Zhang J, Zhang Y, Song S, et al. Correction to: Machine learning based on clinico-biological features integrated 18F-FDG PET/CT radiomics for distinguishing squamous cell carcinoma from adenocarcinoma of lung. European Journal of Nuclear Medicine and Molecular Imaging. 2021; 1696–1696. pmid:33532911
36. Shen H, Chen L, Liu K, Zhao K, Li J, Yu L, et al. A subregion-based positron emission tomography/computed tomography (PET/CT) radiomics model for the classification of non-small cell lung cancer histopathological subtypes. Quantitative Imaging in Medicine and Surgery. 2021; 2918–2932. pmid:34249623
37. Parmar C, Grossmann P, Bussink J, Lambin P, Aerts HJWL. Machine Learning methods for Quantitative Radiomic Biomarkers. Scientific Reports. 2015;5. pmid:26278466
38. Huang X, Sun Y, Tan M, Ma W, Gao P, Qi L, et al. Three-Dimensional Convolutional Neural Network-Based Prediction of Epidermal Growth Factor Receptor Expression Status in Patients With Non-Small Cell Lung Cancer. Frontiers in Oncology. 2022;12. pmid:35186727
39. Gao J, Heng F, Yuan Y, Liu Y. A novel machine learning method for multiaxial fatigue life prediction: Improved adaptive neuro-fuzzy inference system. International Journal of Fatigue. 2024; 108007.
40. Alves VM, Dos Santos Cardoso J, Gama J. Classification of Pulmonary Nodules in 2-[¹⁸F]FDG PET/CT Images with a 3D Convolutional Neural Network. Nucl Med Mol Imaging. 2024; 9–24. pmid:38261899

