Table 1.
Summary of related works on skin lesion classification, including classical methods, hybrid approaches, and deep learning models.
Fig 1.
Distribution of skin lesion classes in the HAM10000 dataset.
The dataset exhibits a significant class imbalance, with most images belonging to the melanocytic nevi class, while minority classes such as dermatofibroma and vascular lesions are underrepresented.
Fig 2.
Example images for each class from the HAM10000 dataset.
The images illustrate the visual diversity of skin lesion categories and highlight inter-class variability across different lesion types.
Table 2.
Number of training samples for each class after applying different sampling strategies.
Table 3.
Performance scores of different models for the original dataset (mean ± standard deviation over 5 runs).
Fig 3.
Training loss curves of multiple deep learning models.
The X-axis denotes training epochs, while the Y-axis represents loss values. All models were trained with early stopping to mitigate overfitting. Most architectures exhibit a stable reduction in loss, whereas VGG16 shows slower convergence, highlighting differences in learning dynamics across models.
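The early stopping mentioned above can be sketched as a patience counter over the validation loss; the `patience` value and the toy loss sequence below are illustrative assumptions, not settings reported for these models.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return the epoch at which training would stop.

    Training halts once the validation loss has failed to improve
    on its best value for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:          # improvement: reset the counter
            best = loss
            wait = 0
        else:                    # no improvement this epoch
            wait += 1
            if wait >= patience:
                return epoch     # stop here
    return len(val_losses) - 1   # ran out of epochs

# Toy validation-loss curve: improves, then plateaus.
losses = [0.9, 0.7, 0.6, 0.55, 0.56, 0.57, 0.58, 0.59, 0.60]
print(early_stopping_epoch(losses, patience=3))  # → 6
```

With the plateau starting at epoch 4, three non-improving epochs exhaust the patience budget and training stops at epoch 6 rather than running to the end of the schedule.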
Fig 4.
Training accuracy curves of different deep learning models.
The X-axis represents training epochs, while the Y-axis indicates accuracy. Most architectures demonstrate a rapid increase in accuracy during the early epochs, with DenseNet201, Xception, and EfficientNetB3 approaching near-perfect performance. In contrast, VGG16 converges more slowly and stabilizes at a lower accuracy level, reflecting differences in training dynamics among models.
Table 4.
Performance scores for different layers with sparsity for the original dataset (mean ± standard deviation over 5 runs).
Table 5.
Performance scores of different sampling strategies (mean ± standard deviation over 5 runs).
Table 6.
Performance scores after applying data augmentation and AvgTopK strategies (mean ± standard deviation over 5 runs).
Table 7.
Comparison of prior studies on the HAM10000 dataset for skin lesion classification. All results shown are based on the original 10,015-image dataset; in our case, oversampling and augmentation are applied only to the training set after the train–test split.
Fig 5.
Confusion matrix of the proposed model evaluated on the HAM10000 dataset.
All data augmentation and SMOTE procedures were applied exclusively to the training set after the train–test split, ensuring that the test set remained free of synthetic samples.
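The leakage-free protocol described above (split first, then resample only the training portion) can be sketched as follows. Plain random duplication stands in here for SMOTE, which would synthesize new feature vectors instead of duplicating existing ones, and the tiny labeled list is illustrative.

```python
import random

def train_test_split(samples, test_frac=0.2, seed=0):
    """Shuffle, then split BEFORE any resampling is applied."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]  # train, test

def oversample(train, seed=0):
    """Duplicate minority-class samples until every class matches
    the majority-class count. SMOTE differs in *how* new samples
    are made, but its placement in the pipeline is the same:
    training set only, never the test set."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in train:
        by_class.setdefault(y, []).append((x, y))
    target = max(len(v) for v in by_class.values())
    balanced = []
    for cls, items in by_class.items():
        balanced.extend(items)
        balanced.extend(rng.choices(items, k=target - len(items)))
    return balanced

# Toy data: sample id + class label; 'nv' dominates, 'df' is rare.
data = [(i, "nv") for i in range(8)] + [(100 + i, "df") for i in range(2)]
train, test = train_test_split(data, test_frac=0.2, seed=42)
train = oversample(train, seed=42)    # duplicates enter the training set...
assert all(s in data for s in test)   # ...while the test set stays original
```

Because the split happens first, every synthetic or duplicated sample is derived exclusively from training images, so test-set performance is not inflated by leakage.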
Fig 6.
Grad-CAM visualizations for representative skin lesion classes.
For each example, the original dermoscopic image is shown alongside its corresponding Grad-CAM heatmap. The highlighted regions indicate areas that contribute most strongly to the model's classification decision, with warmer colors representing higher relevance.
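The heatmaps above follow the standard Grad-CAM recipe: each feature map of the last convolutional layer is weighted by the global average of the class score's gradients over that map, the weighted maps are summed, and a ReLU keeps only regions that increase the class score. The core arithmetic can be sketched in plain Python; the 2×2 maps below are illustrative, and a real implementation would pull activations and gradients from the trained network.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap from one conv layer.

    activations[k][i][j]: feature map k of the last conv layer
    gradients[k][i][j]:   d(class score)/d(activation), same shape
    """
    n_maps = len(activations)
    h, w = len(activations[0]), len(activations[0][0])
    # Channel weights: global-average-pooled gradients.
    weights = [
        sum(gradients[k][i][j] for i in range(h) for j in range(w)) / (h * w)
        for k in range(n_maps)
    ]
    # Weighted sum of maps, then ReLU.
    return [
        [max(0.0, sum(weights[k] * activations[k][i][j] for k in range(n_maps)))
         for j in range(w)]
        for i in range(h)
    ]

# Two 2x2 feature maps: the first supports the class (positive
# gradients), the second opposes it (negative gradients).
acts  = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(acts, grads))  # → [[1.0, 0.0], [0.0, 2.0]]
```

The ReLU step is what gives the "warmer colors = higher relevance" reading: locations whose activations push the score toward the predicted class survive, while opposing evidence is zeroed out.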
Fig 7.
Representative test images with ground truth and predicted labels.
Each example displays a dermoscopic image from the test set along with a pair of labels, where the left label denotes the ground truth and the right label indicates the model prediction (e.g., mel|akiec).