Figure 1.
Proposed computational methodology for lung tumor classification from protein sequence properties.
Figure 2.
Incremental feature selection (IFS) curves of classification accuracy and Matthews correlation coefficient (MCC) in lung tumor categorization.
(A) IFS curve of classification accuracy. The x-axis shows the number of features and the y-axis shows the jackknife cross-validation accuracy. Accuracy peaks at 87.6% with 36 features. (B) IFS curve of the MCC values obtained from the classification algorithms. MCC peaks at 0.812 with 36 features. In both panels, the top 36 features derived by the hybrid feature selection approach (Gain Ratio + CFS Subset) form the optimal feature set.
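The procedure behind the curves in Figure 2 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `mcc`, `ifs_curve`, and `best_point` are hypothetical, the `evaluate` callback stands in for whatever classifier and jackknife cross-validation were actually used, and the MCC shown is the binary form computed from confusion-matrix counts.

```python
import math
from typing import Callable, List, Sequence, Tuple


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Binary Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def ifs_curve(
    ranked: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> List[Tuple[int, float]]:
    """Score the top-k ranked features for k = 1..len(ranked).

    `ranked` is a feature list already ordered by a ranking method;
    `evaluate` is an assumed callback returning a score (accuracy or MCC)
    for a given feature subset.
    """
    return [(k, evaluate(list(ranked[:k]))) for k in range(1, len(ranked) + 1)]


def best_point(curve: Sequence[Tuple[int, float]]) -> Tuple[int, float]:
    """Return the (k, score) pair at the peak; ties go to the smallest k."""
    return max(curve, key=lambda point: point[1])
```

Plotting the `(k, score)` pairs gives a curve of the kind shown in each panel, and `best_point` picks out the peak, which corresponds to the size of the optimal feature set.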
Table 1.
Optimal classification accuracy with filtered subsets and IFS.
Table 2.
Comparison of predictor models in lung cancer tumor categorization.
Table 3.
Classes to cluster evaluation.
Figure 3.
Decision tree model obtained by the Random Forest classifier.
Figure 4.
The hybrid feature selection techniques are shown as solid diamonds. A directed edge from a technique to a feature indicates that the technique selected that feature as optimal. Each technique's results are drawn in a different color.