Fig 1.
Illustration of different types of lung cancer, highlighting the affected lung areas and unique cellular patterns for each type.
Fig 2.
A Seven-Stage Framework for Gene Expression-based Lung Cancer Classification: From Data Collection to Model Deployment with Integrated Feature Engineering and PCA-MI Feature Selection.
Fig 3.
A block diagram of the CNN architecture, showing the flow from the input layer through the convolutional, pooling, and fully connected layers to the output layer, which classifies the samples as either lung cancer or not.
Fig 4.
Performance evaluation of the CNN model. A) The confusion matrix shows high classification accuracy, with minimal misclassifications among the Adenocarcinoma (A), Normal (N), and Squamous (S) classes.
B) The ROC curve shows an overall AUC of 0.99, indicating excellent discriminatory power across all classes. C, D) Training and validation accuracy and loss curves demonstrate consistent learning with minimal overfitting, achieving strong generalization and low error.
Table 1.
Accuracy, precision, recall, and F1-score for various feature extraction methods. The proposed method achieves the highest values across these metrics.
Fig 5.
Accuracy of the various feature extraction methods.
Fig 6.
Top genes and their importance scores as identified by the hybrid approach, supporting its effectiveness for biomarker discovery.
Fig 7.
The PPI network visualises interactions among the significant genes identified through the PCA-MI framework. Nodes represent proteins and edges denote interactions; the network was constructed using STRING with a confidence score ≥ 0.7.
Fig 8.
Hub Gene Analysis: A) Top 20 hub genes ranked by degree centrality are highlighted, indicating their pivotal roles in the network.
B) Chart showing the degree centrality scores of the hub genes, reflecting their connectivity and influence.
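As a reading aid for Fig 8, the ranking of hub genes by degree centrality can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the edge list uses made-up placeholder names (`G1` to `G5`), the helper names `degree_centrality` and `top_hubs` are hypothetical, and dividing the degree by n − 1 is an assumed normalisation (some tools report the raw degree instead).

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected, unweighted graph."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    # Normalise by the maximum possible degree, n - 1 (assumed convention).
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def top_hubs(edges, k=20):
    """Return the k nodes with the highest degree centrality, ties broken by name."""
    scores = degree_centrality(edges)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical toy interaction list; these are not genes from the study.
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G4", "G5")]
hubs = top_hubs(toy_edges, k=3)
for gene, score in hubs:
    print(f"{gene}\t{score:.2f}")
```

In practice the edge list would be the STRING interactions that pass the chosen confidence threshold, and `k=20` would give a top-20 ranking like the one in panel A.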