
Breast cancer inter-image dissimilarity by feature optimization: An application of novel flea optimization algorithm

  • P.P. Fathimathul Rajeena ,

    Roles Conceptualization, Formal analysis, Funding acquisition, Methodology, Project administration, Supervision, Validation, Writing – original draft, Writing – review & editing

    fatimah.rajeena@kfu.edu.sa

    Affiliation Computer Science Department, College of Computer Science and Information Technology, King Faisal University, Alhasa, Saudi Arabia

  • Muhammad Yasir,

    Roles Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Computer Science, HITEC University Taxila, Taxila, Pakistan

  • Mona A. S. Ali,

    Roles Conceptualization, Formal analysis, Funding acquisition, Methodology, Project administration, Supervision, Writing – original draft

    Affiliation Computer Science Department, College of Computer Science and Information Technology, King Faisal University, Alhasa, Saudi Arabia

  • Junaid Ali Khan

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Project administration, Resources, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliations Department of Computer Science, HITEC University Taxila, Taxila, Pakistan, Department of Computer Engineering, Sakarya University, Serdivan/Sakarya, Türkiye

Abstract

Background/Objective: Breast cancer is a serious disease that has caused thousands of deaths around the world. According to the American Cancer Society, more than 40,000 women and about 600 men lost their lives to breast cancer in 2021, and the estimated toll reached 43,700 women and 530 men by 2023.

Method: In this paper, a modified version of ResNet-50 is exploited to extract features from breast tissue biopsy slides contained in the public BreakHis dataset. The standard 177-layer model is reduced to 146 layers by removing redundant activation and normalization operations and reducing the number of convolutional filters without compromising representational capacity. As a result, computational efficiency is improved and the number of learnable parameters drops from 23.7M to 16.8M. An optimized feature vector is then selected by a novel Flea Optimization Algorithm, which explores a d-dimensional search space to obtain globally discriminative features. An inter-image dissimilarity evaluation is performed to assess class compactness and separation, demonstrating their crucial role in achieving better classification performance. The results of the proposed framework are reported on various performance indicators, including average accuracy, precision, recall, and F1 score, while statistical analysis based on MCC, Cohen's Kappa, and a t-test is used to assess the reliability of the framework.

Results: The results of the proposed method are compared with DenseNet, VGG, CNN with LSTM, Primal-Dual Multi-Instance SVM, Single-Task CNN, and Multi-Task CNN, and show superior performance on various measures. Accuracies of 99.20%, 99.62%, 99.50%, and 99.34% were achieved at 40×, 100×, 200×, and 400× magnification, respectively.

Conclusions: The proposed approach, implemented on real hardware, can assist health experts in diagnosing breast cancer at early stages.

Section I: Introduction and related literature

Breast cancer is a serious disease that has caused thousands of deaths around the world. According to the American Cancer Society, over 40,000 women and about 600 men lost their lives to breast cancer in 2012 in the USA, and the estimated toll reached 43,700 women and 530 men by 2023; worldwide, breast cancer caused 685,000 deaths during 2023 [1–3]. Breast tissue is commonly categorized into four classes: normal, benign, in situ carcinoma, and invasive carcinoma. A benign tumour changes the breast slightly but is not harmful or dangerous. In situ carcinoma affects only certain parts of the breast and does not spread to other body parts [4]; it is not very harmful and can be treated if found early. The most dangerous type is invasive carcinoma, which can spread to other organs [5–7]. Healthcare professionals can detect breast cancer using methods such as mammography, X-rays, Positron Emission Tomography (PET), ultrasound, Magnetic Resonance Imaging (MRI), and Computed Tomography (CT). If detected at an initial stage, the cancer can be cured [8]. The best way to diagnose breast cancer is by examining breast tissue under a microscope; to make the tissue easier to see, it is stained with special dyes in a lab. Histopathological image analysis, which examines microscopic images of breast tissue, is therefore important for early cancer treatment. Genomics, the study of genes, also helps in understanding breast cancer, and by combining genetic data with medical images, doctors can make more accurate diagnoses [9,10]. Computer-Aided Diagnosis (CAD) has been used to help identify breast cancer; however, traditional CAD systems rely on hand-crafted features, which can reduce their effectiveness. With recent advances in technology, deep learning methods have been applied to breast cancer detection. Deep learning is better at solving complex problems because it requires less human involvement [11,12], which makes it useful for tasks such as natural language processing, image analysis, and pattern recognition [13,14]. Moreover, machine learning has played an important role in medical imaging, especially in feature extraction and transfer learning [15].

In [16], the authors performed classification on the BreakHis dataset. Experiments were performed using histopathological images at different magnification levels, and Support Vector Machines (SVM), 1-Nearest Neighbour (1-NN), Random Forest (RF), and Quadratic Discriminant Analysis (QDA) were used for classification, achieving a maximum accuracy of 85%. The authors highlighted issues related to the dataset, such as the difficulty of multiclass classification. Although their work laid the groundwork for a new dataset in cancer research, their proposed method has several shortcomings, including a high false-positive rate and the curse of dimensionality.

S. H. Kassani et al. [17] proposed an ensemble approach that used three pretrained deep convolutional neural network (CNN) models for feature extraction. At the preprocessing stage, augmentation, stain normalization, and hyperparameter tuning were performed. Fine-tuned CNN models, namely VGG19, DenseNet, and MobileNet, were used to achieve improved results, and the extracted features were fed to a multi-layer perceptron for classification. For validation of the proposed model, the BreakHis, Patch Camelyon, ICIAR, and Bioimaging datasets were used, achieving accuracies of 98.13%, 95%, 94.6%, and 83.10%, respectively. However, combining different deep neural networks increased both the computational and time complexity of the approach.

By utilizing the BreakHis dataset, I. Sayin et al. used the Xception, VGG, InceptionResNet, and ResNet pretrained deep learning models for classification of breast histopathological images. They concluded that Xception outperformed all models in comparison with an accuracy of 89%, whereas InceptionResNet and ResNet both gave 87% accuracy but differed in F1 score: the former gave 86 and the latter 87. The authors only utilized the 200× magnified images and did not test their approach on the other magnification levels of the dataset; likewise, the classification accuracy obtained by their approach was low [18].

A. Nahid et al. exploited the statistical and structural information of BreakHis images for classification of histopathological images using a CNN, a Long Short-Term Memory (LSTM) network, and a combination of CNN and LSTM. Feature extraction is first performed using the aforementioned models, and then a Support Vector Machine (SVM) with a softmax function is used for final classification. They obtained a maximum accuracy of 91% at the 200× magnification level and a maximum precision of 96% at the 40× magnification level, whereas the maximum F1 measure was observed at the 40× and 100× magnification levels. The authors did not evaluate all magnification levels jointly but processed each level in turn, which does not generalize over the whole dataset [19]. To detect malignant images in the BreakHis dataset, a Primal-Dual Multi-Instance SVM was introduced by H. Seo et al., who also derived an optimization algorithm for the SVM without using quadratic programming or least squares; the authors claimed that this reduces the complexity of the optimization and increases the scalability of the algorithm. Although the authors computed accuracy on all magnification levels of the BreakHis dataset, they achieved a maximum accuracy of 89.8% at the 200× magnification level, with an F1 score of 92.4 and a recall of 63.6 [20].

In their study, K. Jabeen et al. presented a preprocessing image-enhancement technique, named haze-reduced local-global, which improves the contrast of images. They then performed augmentation on the dataset and fed the preprocessed data to a pretrained EfficientNet-b0 model. Features extracted from the pretrained model were fed to an optimization algorithm for feature selection using Equilibrium-Jaya controlled Regula Falsi. The authors used two datasets, INbreast and CBIS-DDSM, achieving 95.4% and 99.7% accuracy, respectively. The datasets used are comparatively small, and a larger dataset such as BreakHis would better test the effectiveness of their approach [21]. In [22], D. Kaplun et al. used Zernike image moments to extract features from histopathological images and used neural networks to classify the extracted features. The authors also utilized Explainable Artificial Intelligence and Local Interpretable Model-Agnostic Explanation techniques to explain their model. They claimed to have achieved 100% accuracy at 40× magnification for binary classification; such a high claimed accuracy warrants meticulous review. Moreover, the use of interpretation methods increased not only the computational complexity but also the time complexity of their approach.

In [23], the authors performed classification on the BreakHis dataset independently of image magnification levels using CNNs. They used two models: a single-task CNN for classification of malignant images and a multi-task CNN for prediction of benign and malignant images. They stated that their approach performed well for classification and claimed that their model can capture additional information from different magnification levels, achieving a maximum accuracy of 83.33%. However, they did not utilize the whole BreakHis dataset but only a subset; using the complete dataset might change the reported performance. Table 1 compares state-of-the-art approaches.

The contributions of this work are as follows:

  • Proposed a modified ResNet-50 based deep neural network architecture with 146 layers, consisting of 44 convolution, 44 batch normalization, 1 max pooling, 1 average pooling, 40 activation, 13 addition, 1 softmax, 1 input, and 1 output layers.
  • A novel Flea Optimization Algorithm is presented for selection of an optimized set of features to perform effective classification.
  • The results of the proposed approach have been compared with VGG, DenseNet, ResNet-50, CNNs, InceptionResNet, and LSTM.
  • An ablation study has been conducted to assess the effectiveness of the proposed approach.
  • Simulations have been performed for 1000 iterations to check the effectiveness of the proposed approach by meticulously observing the time, space, and computational complexity in addition to convergence and accuracy.

The rest of the paper is organized as follows. Section II presents the proposed work: the general model, the modified ResNet-50 architecture, the Flea Optimization Algorithm, feature optimization, and the performance measures. Section III states the parameter values of the simulation environment along with the explanation and discussion of the results. Section IV provides conclusions, recommendations for subsequent research, and directions for future study.

Section II: Proposed work

This section is divided into three subsections: the modified ResNet-50 model is explained first, the Flea Optimization Algorithm is described second, and the feature optimization technique is explained last.

Modified ResNet-50 model

Modifying the ResNet-50 model by reducing the number of learnable parameters and layers yields optimal feature selection, which in turn helps in better classification of images; it also reduces training time and improves prediction accuracy. In this modification, the total number of layers is 146 and the number of learnable parameters decreases from 23.7M to 16.8M, thereby reducing the model cost. These 146 layers comprise: 1 input, 44 convolution, 44 batch normalization, 1 max pooling, 1 average pooling, 40 activation, 13 addition, 1 softmax, and 1 output layer. The layered architecture along with the block diagram of the modified ResNet-50 is shown in Fig 1.

Fig 1. Modified ResNet-50 model: (a) Architecture, (b) block diagram.

https://doi.org/10.1371/journal.pone.0341848.g001

Table 2 compares the number of learnable parameters and layers of the standard ResNet-50 and the modified ResNet-50.

The output of the trained network is used to train a deep neural network using transfer learning, which provides a feature vector at its output. To obtain discriminant features and reduce the dimensionality of the feature vector, feature optimization is performed using the Flea Optimization Algorithm. The optimized vector is then fed to different classifiers to find inter-image dissimilarity.
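To make the layer-reduction idea concrete, the following PyTorch sketch truncates a standard ResNet-50 and exposes the pooled activations as a 2048-dimensional feature vector. It is a minimal sketch, not the authors' MATLAB network: which blocks to drop is an assumption made here for illustration, and the exact 146-layer, 16.8M-parameter configuration of the paper is not reproduced.

```python
# Minimal sketch (assumed block choices, not the authors' MATLAB model):
# approximate the paper's layer reduction by dropping the last two bottleneck
# blocks of stage 4, then expose pooled activations as a feature vector.
import torch
import torch.nn as nn
from torchvision import models

def reduced_resnet50(num_classes: int = 8) -> nn.Module:
    net = models.resnet50(weights=None)  # use weights="IMAGENET1K_V2" for transfer learning
    net.layer4 = nn.Sequential(net.layer4[0])   # keep only the first stage-4 block
    net.fc = nn.Linear(2048, num_classes)       # new 8-class head, as in the paper
    return net

model = reduced_resnet50()
print(sum(p.numel() for p in model.parameters()) / 1e6, "M parameters")
# well below the full model's ~25.6M, in the spirit of the paper's 23.7M -> 16.8M cut

# Feature vector: pooled activations just before the classification head.
extractor = nn.Sequential(*list(model.children())[:-1])
with torch.no_grad():
    feats = extractor(torch.randn(1, 3, 224, 224)).flatten(1)
print(feats.shape)  # torch.Size([1, 2048]) -- matches the 2048 features reported later
```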

Flea optimization algorithm

A model to select a subset of discriminating features from a given feature vector is presented in this study. The objective of the optimization is to obtain an optimized feature set that accurately classifies instances and thus improves the performance metrics. This study introduces a novel Flea Optimization Algorithm (FOA) for selecting optimal features from a feature set obtained using the modified ResNet-50 model explained in Section II-A. The proposed approach follows these key steps: (i) image preprocessing, (ii) feature extraction, (iii) feature optimization, and (iv) classification of images according to their discriminant features. FOA is a meta-heuristic search algorithm that optimizes the subset of ResNet-extracted features. Standard LASSO solvers optimize coefficients under an L1 penalty but only perform convex shrinkage on all input features; they do not explore combinations of features beyond that shrinkage, may retain correlated features, can be sensitive to the regularization parameter, and may fail to identify the global optimum in high-dimensional feature spaces. In contrast, FOA performs a population-based global search that explicitly evaluates feature-selection masks encoded within the flea population. This allows FOA to discover more discriminative subsets of the ResNet features than LASSO alone, especially when the data distribution or extracted features are non-linear and multi-modal. This bi-level optimization enables FOA to remove redundant or noisy ResNet features, avoid the limitations of purely convex sparsity penalties, and escape local minima when the feature space is high-dimensional. A graphical representation of the proposed model is shown in Fig 2.

First, preprocessing is performed on the input images using augmentation techniques to enhance the model's generalization and robustness. To achieve this, we applied:

  • A random rotation ranging from –5 to 5 degrees.
  • Random X-axis and Y-axis reflection (to introduce variability into the training images).
  • Random X-axis and Y-axis shear within the range –0.05 to 0.05 (to address geometric distortions in the images).
  • Finally, images are resized to 224×224×3 before being fed to the modified ResNet-50 (a sketch of this pipeline follows the list).
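A minimal torchvision sketch of this augmentation pipeline is given below. The paper's pipeline was built in MATLAB, so the parameter semantics differ slightly: torchvision expresses shear in degrees, and a dimensionless shear of ±0.05 corresponds to roughly ±3°, which is the approximation assumed here.

```python
# Sketch of the stated augmentations in torchvision (MATLAB semantics approximated).
from torchvision import transforms

train_tf = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),        # grayscale-to-RGB where needed
    transforms.RandomRotation(degrees=5),               # random rotation in [-5, 5] degrees
    transforms.RandomHorizontalFlip(p=0.5),             # random X-axis reflection
    transforms.RandomVerticalFlip(p=0.5),               # random Y-axis reflection
    transforms.RandomAffine(degrees=0, shear=(-3, 3)),  # ~= dimensionless shear of +/-0.05
    transforms.Resize((224, 224)),                      # resize for the modified ResNet-50
    transforms.ToTensor(),
])
```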

At the modified ResNet-50 stage, the preprocessed input images are divided into 70/30 training and validation instances, and then feature extraction is performed. We tried other dataset ratios for training and testing, such as 80/20 and 60/40, but 80/20 introduced bias and 60/40 could not converge the model in the correct direction; 70/30 avoids both problems. The output of the trained network is used to train a deep neural network using transfer learning. The modified ResNet-50 is a Directed Acyclic Graph (DAG) network, which is used for feature extraction from the BreakHis dataset.

The modified ResNet-50 model's output serves as the input to the Flea Optimization Algorithm, which conducts optimization based on fitness criteria. Features with higher fitness values are prioritized in this process: the features are index-sorted in descending order, placing those with greater fitness values at the beginning indices. The Least Absolute Shrinkage and Selection Operator (LASSO) objective function is then used with FOA to reduce the dimensions of the feature vector. Subsequently, the optimized set of features is fed into the classifier to find inter-image dissimilarity. Our study includes Decision Trees and Narrow, Medium, Wide, Bi-Layered, and Tri-Layered Neural Networks for this purpose.

The population of s fleas, each described by t parameters (equal in length to the input feature vector), is represented as:

$$P = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1t} \\ p_{21} & p_{22} & \cdots & p_{2t} \\ \vdots & \vdots & \ddots & \vdots \\ p_{s1} & p_{s2} & \cdots & p_{st} \end{bmatrix} \tag{1}$$

The feature vector for the ith instance is denoted as $p_i = (p_{i1}, p_{i2}, \ldots, p_{it})$. To achieve our optimization objective, we employed the least absolute shrinkage and selection operator (LASSO). LASSO modifies the ordinary least squares (OLS) objective function by incorporating a penalty term. This penalty term encourages the selection of a smaller, more relevant subset of features by driving some of the coefficients to zero, which simplifies the model and enhances its predictive accuracy. The mathematical expression for LASSO is:

$$\hat{\psi} = \arg\min_{\psi} \; \| f - L\psi \|_2^2 + \alpha \| \psi \|_1 \tag{2}$$

In the above equation, f denotes the actual outcome, L is the feature matrix, α is the regularization parameter that defines the strength of the shrinkage term, and ψ is the vector of coefficients we intend to estimate. Increasing the value of α results in greater shrinkage of the coefficients, meaning more coefficients are reduced to zero. The equation comprises two components: $\| f - L\psi \|_2^2$ represents the ordinary least squares term, specifically the squared Euclidean norm (L2 norm) of the residuals, and indicates how well the model fits the training data; $\alpha \| \psi \|_1$ involves the L1 norm of the coefficient vector, which sums the absolute values of the coefficients. The L1 norm serves as a penalty term, causing some coefficients to become zero depending on the value of α. Coefficients that are zero are excluded, leaving the remaining features selected for classification. The objective is to find the values of ψ that minimize this objective function. The logical steps of FOA are given in Algorithm 1.

Algorithm 1. Flea optimization algorithm for feature selection.
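Since Algorithm 1 itself is not reproduced in this text, the following Python sketch illustrates the general scheme described above under assumptions made here: a population of binary feature-selection masks ("fleas"), a LASSO objective restricted to the selected columns as the fitness, index-sorting by fitness, and randomized jumps toward the best flea for exploration. The specific update rules are placeholders, not the authors' exact FOA.

```python
# Generic population-based feature-mask search in the spirit of Algorithm 1
# (update rules assumed for illustration; lower fitness = better).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

def fitness(mask, X, y, alpha=0.0030):          # LASSO objective on selected features
    if mask.sum() == 0:
        return np.inf
    m = Lasso(alpha=alpha, max_iter=5000).fit(X[:, mask], y)
    resid = y - m.predict(X[:, mask])
    return np.sum(resid ** 2) + alpha * np.sum(np.abs(m.coef_))

def flea_search(X, y, pop_size=20, iters=50, flip_prob=0.05):
    d = X.shape[1]
    pop = rng.random((pop_size, d)) < 0.5       # population of s binary masks of length t
    scores = np.array([fitness(m, X, y) for m in pop])
    for _ in range(iters):
        order = np.argsort(scores)              # index-sort: fittest fleas first
        pop, scores = pop[order], scores[order]
        for i in range(1, pop_size):            # each flea jumps toward the best one
            child = np.where(rng.random(d) < 0.5, pop[0], pop[i])
            child ^= rng.random(d) < flip_prob  # random exploration flips
            s = fitness(child, X, y)
            if s < scores[i]:                   # greedy replacement
                pop[i], scores[i] = child, s
    return pop[np.argmin(scores)]

X, y = rng.random((100, 40)), rng.random(100)   # toy stand-in for ResNet features
best = flea_search(X, y)
print("selected", int(best.sum()), "of", X.shape[1], "features")
```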

Feature optimization

In feature optimization, LASSO is used as the objective function; it uses a convex penalty to reduce the number of features, as described in the previous subsection. When the value of α is larger, more coefficients are set to zero, leading to a smaller set of features being selected for classification. Conversely, a smaller value of α sets fewer coefficients to zero, so more features are retained for classification, which may negatively impact classification accuracy and time complexity. The fitness of the discriminatory features, on the other hand, is evaluated by FOA, which performs exploration and exploitation in the search space and selects discriminatory features by utilizing the intelligence of its population. FOA selects optimal features while avoiding local optima in the d-dimensional space, whereas LASSO reduces the number of selected features so that the most optimal feature set is obtained. Experiments with various values of α were performed, and α = 0.0030 was found to strike the right balance: it significantly reduces the number of features while helping to maximize classification accuracy. Jointly, FOA and LASSO reduced the number of features from 2048 to 780, 811, 664, and 733 for the 40×, 100×, 200×, and 400× magnification levels, respectively. This reduction decreased computational time and storage costs while improving accuracy and convergence.
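The choice of α = 0.0030 can be visualized with a small sweep like the one below, which counts how many of 2048 coefficients survive shrinkage at each setting; the data here is synthetic, so the retained counts are illustrative rather than the paper's 780/811/664/733.

```python
# Sketch: sweep the LASSO strength and count surviving (non-zero) coefficients.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2048))                      # stand-in for 2048 ResNet features
y = X[:, :25] @ rng.standard_normal(25) + 0.1 * rng.standard_normal(200)

for alpha in (0.0003, 0.0030, 0.0300):
    kept = np.count_nonzero(Lasso(alpha=alpha, max_iter=10000).fit(X, y).coef_)
    print(f"alpha={alpha:.4f}: {kept} of 2048 features retained")
```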

Section III: Results and analysis

This section presents the key findings observed during the study. Simulations were carried out in MATLAB R2022a on an Intel Core i5-8350U CPU in a Windows 11 environment. Tree, Narrow, Medium, Wide, Bi-Layered, and Tri-Layered Neural Network classifiers were used for classification of the features. Table 3 shows the general and specific parameter values and settings used in the experiments.

The performance metrics used to evaluate the proposed approach, along with their mathematical equations, are given in Table 4.
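For reference, the headline metrics of Table 4 can all be computed with standard library calls; the sketch below uses scikit-learn on toy labels (macro averaging is an assumption here, since the averaging mode is not stated at this point in the text).

```python
# Sketch of the Table 4 metrics on toy multi-class labels.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, matthews_corrcoef, cohen_kappa_score)

y_true = [0, 1, 1, 2, 2, 2, 0, 1]
y_pred = [0, 1, 2, 2, 2, 1, 0, 1]

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred, average="macro"))
print("recall   :", recall_score(y_true, y_pred, average="macro"))
print("F1 score :", f1_score(y_true, y_pred, average="macro"))
print("MCC      :", matthews_corrcoef(y_true, y_pred))
print("Kappa    :", cohen_kappa_score(y_true, y_pred))
print("error    :", 1 - accuracy_score(y_true, y_pred))
```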

The BreakHis dataset has been used in our study; it consists of histopathological images from breast cancer patients: 7909 microscopic images of breast cancer tissue with two main classes, benign and malignant, captured at 40×, 100×, 200×, and 400× magnification levels. Each main class is divided into four subclasses, and each subclass contains images at all four magnification levels. The eight subclasses are: four malignant types, Ductal Carcinoma (DC), Lobular Carcinoma (LC), Mucinous Carcinoma (MC), and Papillary Carcinoma (PC); and four benign types, Adenosis (A), Fibroadenoma (F), Phyllodes Tumor (PT), and Tubular Adenoma (TA) [24]. Fig 3 shows sample images of the dataset at the 40× and 100× magnification levels.

Fig 3. Breast cancer images of dataset (a) adenosis (b) ductal carcinoma.

https://doi.org/10.1371/journal.pone.0341848.g003

The dataset images have been classified according to their inter-image dissimilarity. Five standard MATLAB neural-network classifiers were used for classification: Narrow, Medium, Wide, Bi-Layered, and Tri-Layered neural networks. These models differ in depth and number of neurons: the Narrow, Medium, and Wide models contain a single hidden layer with approximately 10, 20, and 50 neurons, respectively, whereas the Bi-Layered and Tri-Layered models include two and three hidden layers with 10 neurons per layer. All networks were trained using MATLAB's default Scaled Conjugate Gradient (SCG) algorithm, TanSig activation function, softmax output layer, and default data division and stopping criteria. The deep learning model itself was trained using Stochastic Gradient Descent with Momentum (SGDM). The dataset was divided into 70/30 training and validation subsets using stratified splitting. To improve generalization, data augmentation was applied, including random rotation (±5°), horizontal/vertical reflection, and random shear (±0.05). All images were resized to 224×224×3, using grayscale-to-RGB conversion where needed.
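Rough scikit-learn analogues of these classifiers, with hidden-layer sizes taken from the text above, are sketched below; the solver and activation differ from MATLAB's SCG/TanSig defaults, so this is only an approximation of the setup.

```python
# Approximate analogues of the MATLAB classifiers (layer sizes from the text above).
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

classifiers = {
    "Tree":           DecisionTreeClassifier(),
    "Narrow NN":      MLPClassifier(hidden_layer_sizes=(10,)),
    "Medium NN":      MLPClassifier(hidden_layer_sizes=(20,)),
    "Wide NN":        MLPClassifier(hidden_layer_sizes=(50,)),
    "Bi-Layered NN":  MLPClassifier(hidden_layer_sizes=(10, 10)),
    "Tri-Layered NN": MLPClassifier(hidden_layer_sizes=(10, 10, 10)),
}
# Usage: clf.fit(train_features, train_labels); clf.score(val_features, val_labels)
```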

The modified ResNet architecture was fine-tuned by replacing the softmax, classification, and final fully connected layers with new ones; the final layer has eight neurons to accommodate classification of the dataset into eight classes. The training hyperparameters were: learning rate 0.0002, batch size 20, and 10 epochs. Moreover, shuffling was performed every epoch, and validation was carried out using the augmented validation set. Experimental results showed a significant increase in accuracy, precision, and recall, in addition to improvements in MCC, F1 measure, and Kappa; moreover, the error and false positive rates decreased, as shown in the graphs later in this section. Table 5 depicts the performance metrics before feature optimization, whereas Table 6 displays the classification results after feature optimization. It can be observed from Tables 5 and 6 that the proposed approach outperformed on all performance measures after feature optimization.
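In PyTorch terms, the stated fine-tuning configuration corresponds to something like the snippet below; the momentum coefficient is an assumption (the paper states SGDM but not its value).

```python
# Sketch of the fine-tuning configuration (momentum 0.9 assumed).
import torch
from torchvision import models

model = models.resnet50(weights=None)    # stand-in for the modified network
optimizer = torch.optim.SGD(model.parameters(), lr=2e-4, momentum=0.9)  # SGDM, LR 0.0002
num_epochs, batch_size = 10, 20          # data reshuffled every epoch, as stated
```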

Table 5. Classification results at 40× magnification, before feature optimization on BreakHis dataset.

Boldface values depict significant results.

https://doi.org/10.1371/journal.pone.0341848.t005

Table 6. Classification results at 40× magnification, after feature optimization on BreakHis dataset.

Boldface values depict significant results.

https://doi.org/10.1371/journal.pone.0341848.t006

It is evident from Table 5 that the maximum accuracy achieved before feature optimization is approximately 90%, attained using the Wide NN; similarly, the maximum precision is 89.72% and the recall 88.45%, also exhibited by the Wide NN. After feature optimization, the maximum accuracy is 99.2%, obtained using the Wide and Tri-Layered NNs; likewise, recall and precision increased to 98.84% and 98.78%, respectively, for the Tri-Layered and Wide NNs. All other classification algorithms performed with more than 97% accuracy. It is worth mentioning that the false positive rate was reduced from 0.0157 to 0.0012, which testifies to the effectiveness of the proposed approach. Training time was also reduced after the reduction in the number of features: for example, the Tri-Layered NN took 113 seconds to train before feature optimization and only 22.535 seconds afterwards. Training time similarly decreased for all classifiers after feature optimization. Due to their comparatively simple internal structure, decision trees took less time than the neural networks, whose internal architecture is more complex.

For the 100× magnification level, Tables 7 and 8 present the classification results before and after optimization, respectively. Before optimization, the maximum accuracy attained is 89.70%, with 89.83% precision and 83.68% recall; the error was 10.30% with a 1.75% false positive rate. After feature optimization, Trees outperformed all classifiers: the maximum accuracy achieved is 99.62%, with 99.58% precision and 99.29% recall, whereas the error and false positive rates were reduced to 0.38% and 0.061%, respectively. The reduction in features not only decreased training time but also decreased input complexity, and as a result Trees performed better; these results demonstrate the effectiveness of the proposed approach. Among the neural networks, the Narrow, Medium, Wide, and Bi-Layered NNs provided more than 99% accuracy.

Table 7. Classification results at 100× magnification, before feature optimization on BreakHis dataset.

https://doi.org/10.1371/journal.pone.0341848.t007

Table 8. Classification results at 100× magnification, after feature optimization on BreakHis dataset.

https://doi.org/10.1371/journal.pone.0341848.t008

Tables 9 and 10 show the results at the 200× magnification level before and after feature optimization, respectively. We can observe in Table 9 that the Wide NN performed better than the other classifiers, owing to its ability to detect features such as texture, edges, and shapes in images and hence make a precise selection of features, resulting in improved precision and recall. In contrast, decision trees and narrow NNs often face limitations in data handling capacity, generalization, and scalability, leading to comparatively lower precision and recall. After feature optimization, however, Trees beat all other classifiers on the whole performance metric, owing to the decreased input complexity.

Table 9. Classification results at 200× magnification, before feature optimization on BreakHis dataset.

https://doi.org/10.1371/journal.pone.0341848.t009

Table 10. Classification results at 200× magnification, after feature optimization on BreakHis dataset.

https://doi.org/10.1371/journal.pone.0341848.t010

A meticulous examination of Tables 11 and 12 reveals interesting information. Before feature optimization, the Wide NN outshone all other classifiers due to its greater generalization ability than the Tree, Narrow, and Medium NNs and its lower internal complexity than the Bi-Layered and Tri-Layered NNs. After feature optimization, the Medium, Wide, and Bi-Layered NNs were equal in terms of accuracy, but we can observe in Table 12 that the Bi-Layered NN has higher MCC, Kappa, and F1 score than the Medium and Wide NNs, so the performance of the Bi-Layered NN was the best overall. Although the Wide NN has higher precision (99.16%) than the Medium NN (99.13%), the Medium NN has a higher F1 score (99.08%) than the Wide NN (98.96%) with equal Kappa and MCC, so we can conclude that the Medium NN performed better than the Wide NN.

Table 11. Classification results at 400× magnification, before feature optimization on BreakHis dataset.

https://doi.org/10.1371/journal.pone.0341848.t011

Table 12. Classification results at 400× magnification, after feature optimization on BreakHis dataset.

https://doi.org/10.1371/journal.pone.0341848.t012

The training graphs in Fig 4 show that the proposed approach converged faster, with higher accuracy, at lower magnification levels; as the magnification of the dataset images increased, convergence slowed and the model gained accuracy only after more iterations. The error curves exhibit the converse behaviour: at lower magnification levels the error decreased after fewer iterations, whereas at higher magnification levels the error decreased only after a greater number of iterations.

Fig 4. Training graphs of BreakHis dataset with respective magnification levels.

https://doi.org/10.1371/journal.pone.0341848.g004

Comparison of Figs 5 and 6 shows that feature optimization improved classification accuracy, precision, and recall, which in turn decreased the classification error, false negative rate, and false positive rate. One can say that feature optimization improves classification accuracy through the selection of discriminant features from a given feature vector. The Wide NN performed best before feature optimization, whereas after optimization the Wide and Tri-Layered NNs performed equally well. The Tree classifier made the fewest correct predictions before optimization due to the high input complexity. The improvement in classification results stems from the selection of discriminant features by the Flea Optimization Algorithm.

Fig 5. Confusion matrix of classifiers at 40× magnification (a) before optimization (b) after optimization.

https://doi.org/10.1371/journal.pone.0341848.g005

Fig 6. Confusion matrix of classifiers at 100× magnification (a) before optimization (b) after optimization.

https://doi.org/10.1371/journal.pone.0341848.g006

A slight performance degradation can be observed in Figs 7 and 8, both before and after feature optimization, when the magnification of the images increases from 200× to 400×. The reason may be blurring of the images, which hampers the detection of edges, textures, and shapes. This degradation adversely affects the performance metrics and, in turn, increases the classification error, false negative rate, and false positive rate.

Fig 7. Confusion matrix of classifiers at 200× magnification (a) before optimization (b) after optimization.

https://doi.org/10.1371/journal.pone.0341848.g007

Fig 8. Confusion matrix of classifiers at 400× magnification (a) before optimization (b) after optimization.

https://doi.org/10.1371/journal.pone.0341848.g008

As shown in Fig 9, classification accuracy increased to 99.2% at 40× magnification, from 90.65% before optimization; likewise, it increased from 89.7% to 99.23% in the 100× scenario. The increase in accuracy is due to the discriminatory ability of the selected features, which helps in the detection of inter-image dissimilarity: the more dissimilar the images, the more accurate the classification, and vice versa.

Accuracy graphs for 200× and 400× are shown in Fig 10, indicating improved accuracy after FOA was applied to the extracted features. Trees showed the greatest improvement in both cases, as their input complexity was reduced and the discriminant features helped them find inter-image dissimilarity more accurately. Among the neural networks, the Bi-Layered and Tri-Layered NNs performed well owing to their capacity to exploit textures, shapes, and edges in the images, with the extracted discriminant features again helping them find inter-image dissimilarity more accurately.

Fig 11 shows the error percentage before and after feature optimization. The error was up to 42.90% and 45.43% before feature optimization and fell to 0.8% and 0.38%, respectively, afterwards, while the minimum error before optimization was 9.35%. The largest error reduction was observed for Trees (about 45%) and the smallest for the Wide NN (about 9%); the reason lies in the classification accuracy before optimization. The accuracy of Trees was lower, so their error rate was higher; conversely, the Wide NN was more accurate than Trees, so its error rate was lower. After applying FOA, the classification accuracy of Trees increased, which decreased their error rate, whereas the Wide NN (already performing fairly well before feature optimization) showed a comparatively small improvement and hence a smaller error reduction.

The graphs on the right side of Fig 12 show precision and recall after feature optimization. The neural networks obtained a minimum precision of 95.38% (Bi-Layered NN) and a maximum of 99% (Tri-Layered NN) after feature optimization, owing to their inherent ability to detect features such as texture, edges, and shapes in images and thus make a precise selection of features, improving precision and recall. Before feature optimization, by contrast, precision ranged from a minimum of 78% (Tri-Layered NN) to a maximum of 90% (Wide NN). Moreover, high-dimensional data is handled efficiently by the Tri-Layered and Bi-Layered NNs through regularization techniques such as batch normalization, dropout, and weight regularization, which improves precision and recall. In contrast, decision trees and narrow NNs often face limitations in capacity, generalization, and scalability, leading to comparatively lower precision and recall before feature optimization; once the input to the Trees and Narrow NN was the optimized set of features, precision and recall increased due to the decrease in input complexity. This shows that feature optimization increases not only accuracy but also precision and recall; consequently, the error, false positive rate, and training time decrease as well.

Fig 12. Precision and recall graphs of classifiers at 40× and 100× magnification.

https://doi.org/10.1371/journal.pone.0341848.g012

Fig 13 illustrates the results for the 200× and 400× magnification levels. The graphs on the left show the performance before optimization, from which one can infer that the maximum recall was about 80%; after feature optimization, recall approaches approximately 99% and precision up to 99%. The increase in recall may be due to the more elaborate, zoomed-in features of the images, which helped in efficient classification.

By considering all true positives, true negatives, false positives, and false negatives, MCC provides good insight into the model. We can see in Fig 14 that the MCC values were smaller before feature optimization and improved afterwards: the minimum score achieved after feature optimization is 95.5% (40× magnification) and the maximum is 98.99% (100× magnification). Likewise, Kappa and the F1 score improved due to the increased precision and recall. As the F1 score is the harmonic mean of precision and recall, an increase in precision and recall also raises the F1 score, while Kappa indicates that the model accurately assigned instances to their correct labels. An increase in all three parameters means that the model performed well on most of the instances.

We can see in Fig 15 that at the 400× magnification level, the MCC, Kappa, and F1 score all decreased relative to the 200× magnification level in both the left and right graphs, for two reasons. First, before optimization, there were too many features at the input, and due to this curse of dimensionality the classifiers could not classify the images accurately, which decreased precision and recall and, in turn, MCC, Kappa, and the F1 score. Second, after optimization, the classifiers may not have been able to detect texture, shape, or edges due to the magnification of the images, resulting in the extraction of less discriminant features; as a result, the model's precision and recall decreased, which in turn decreased its MCC, Kappa, and F1 score.

Fig 13. Precision and recall graphs of classifiers at 200× and 400× magnification.

https://doi.org/10.1371/journal.pone.0341848.g013

Fig 14. Kappa, MCC and F1_Score graphs of classifiers at 40× and 100× magnification.

https://doi.org/10.1371/journal.pone.0341848.g014

Fig 15. Kappa, MCC and F1_Score graphs of classifiers at 200× and 400× magnification.

https://doi.org/10.1371/journal.pone.0341848.g015

Although training time does not contribute much to the improvement of accuracy, reducing it completes training in a shorter interval and reduces the time complexity of the algorithm. The graphs on the left in Fig 16 show the training time before feature optimization and the graphs on the right show the training time after feature optimization. The reduction in features decreased the time complexity by approximately 50% for the Tree, Narrow NN, and Medium NN, and by approximately 85% for the Bi-Layered and Tri-Layered NNs. This reduction in training time enables the proposed approach to complete in comparatively less time with higher accuracy, precision, recall, MCC, Kappa, and F1 score, and reduced error and false positive rate.

The graphs in Fig 17 show a detailed analysis of the improvements made by the proposed approach in the performance metrics. For the Tree, recall and precision increased on average by 55% and Kappa by 40%, which raised accuracy, whereas MCC, accuracy, F1 score, and specificity rose on average by 35%, 32%, 90%, and 80%, respectively, showing that the model predicted most of the instances correctly. Kappa increased for all classifiers: a minimum of 40% (Medium and Wide NNs) and a maximum of about 90% (Tri-Layered NN). The increase in Kappa denotes that, after feature optimization, the model predicted well against the labels for most instances during classification. The F1 score and accuracy curves depict that, after feature optimization, the model improved accuracy, precision, and recall for all classifiers.

Fig 17. Percent improvement in error, recall, specificity, precision and accuracy.

https://doi.org/10.1371/journal.pone.0341848.g017

It can be observed from Table 13 that the proposed approach performed better than state-of-the-art techniques: we obtained greater accuracy with lower computational and time complexity at all magnification levels of the BreakHis dataset. A statistical t-test conducted on the learning results yielded a t-statistic of 0.712 with 5 degrees of freedom; the corresponding two-tailed p-value of 0.51 exceeds the conventional significance threshold of 0.05. It is also worth mentioning that the 95% confidence interval for the mean accuracy ranged from approximately 0.986 to 0.996, further supporting the conclusion that the model's accuracy reliably aligns with the expected benchmark.
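The reported reliability check corresponds to a one-sample, two-tailed t-test against a benchmark accuracy, as sketched below with SciPy; the six accuracy values and the benchmark are illustrative stand-ins (six samples give the stated 5 degrees of freedom), not the paper's data.

```python
# Sketch of the t-test and confidence interval (illustrative values, df = 5).
import numpy as np
from scipy import stats

acc = np.array([0.992, 0.9962, 0.995, 0.9934, 0.987, 0.989])  # 6 results -> df = 5
benchmark = 0.99                                              # assumed reference accuracy

t_stat, p_value = stats.ttest_1samp(acc, popmean=benchmark)   # two-tailed by default
ci = stats.t.interval(0.95, df=len(acc) - 1, loc=acc.mean(), scale=stats.sem(acc))
print(f"t = {t_stat:.3f}, p = {p_value:.2f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```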

Table 13. Comparison of results with state-of-the-art approaches.

https://doi.org/10.1371/journal.pone.0341848.t013

Ablation study

The performance of LASSO and FOA has been evaluated using three ResNet-based feature-selection scenarios (ResNet-LASSO, ResNet-FOA, and hybrid ResNet-FOA-LASSO) in terms of convergence, computational efficiency, and predictive accuracy. Mean Squared Error (MSE) was taken as the performance indicator: a lower MSE indicates better accuracy. ResNet-LASSO required 308 epochs to reach the stopping criterion, took 94 seconds of training time, and achieved an MSE of 0.0444 with a gradient of 0.0855. ResNet-FOA was somewhat more efficient in training time, taking 312 epochs and 47 seconds, although it attained a higher MSE of 0.0768. The hybrid ResNet-FOA-LASSO approach outperformed both methods, converging in only 205 epochs, consuming 17 seconds of training time, and providing the lowest MSE of 0.0421 with a gradient of 0.0594. These results demonstrate that ResNet-FOA-LASSO is an efficient and accurate method for optimizing ResNet-extracted features, achieving lower network loss in fewer epochs and with less computational complexity. The experimental results are listed in Table 14, and Fig 18 shows the performance comparison of the scenarios.

Fig 18. Performance comparison of ResNet-LASSO, ResNet-FOA and hybrid ResNet-FOA-LASSO.

https://doi.org/10.1371/journal.pone.0341848.g018

Table 14. Comparison of ResNet-based feature selection methods in terms of training epochs, time, final performance (MSE), and gradient.

Lower MSE and gradient indicate better accuracy and convergence.

https://doi.org/10.1371/journal.pone.0341848.t014

Inter-image dissimilarity

Inter-image dissimilarity is a numerical measure of the degree of difference between images in a dataset. It has been used here to examine the structural separability of breast cancer tissue subtypes in the feature space generated by the proposed approach. After extraction of deep features using the pretrained ResNet, an optimal subset of highly discriminative features was selected using FOA and LASSO. Pairwise distances between all images were then computed using the Euclidean distance. For each class, intra-class dissimilarity was taken as the mean distance among samples belonging to the same subclass, whereas inter-class dissimilarity was measured as the mean distance between samples of a given subclass and all samples of the remaining classes. The difference between these two metrics defines the class margin and characterizes the compactness and separability of each class in the learned feature space. This measure gives a quantitative basis for how well the feature-extraction stage differentiates visually similar tissue patterns before classification. Table 15 shows the intra- and inter-class dissimilarity along with the class margin and accuracy.
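A minimal sketch of this computation, assuming Euclidean distances over the optimized feature vectors and toy data in place of the real features, is:

```python
# Intra-/inter-class dissimilarity and class margin from pairwise distances.
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(2)
feats = rng.standard_normal((60, 16))        # toy optimized feature vectors
labels = np.repeat(np.arange(3), 20)         # toy subclass labels

D = cdist(feats, feats, metric="euclidean")  # all pairwise distances
for c in np.unique(labels):
    same, other = labels == c, labels != c
    intra = D[np.ix_(same, same)][np.triu_indices(same.sum(), k=1)].mean()
    inter = D[np.ix_(same, other)].mean()
    print(f"class {c}: intra={intra:.2f}  inter={inter:.2f}  margin={inter - intra:.2f}")
```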

Table 15. Intra-class, inter-class dissimilarity, class margin, and classification accuracy for breast cancer classification.

https://doi.org/10.1371/journal.pone.0341848.t015

The observed dissimilarity structure in the extracted features shows that the inherent variability in the shape, texture, structure, and appearance of breast cancer tissues is closely aligned with classification performance. Classes such as papillary carcinoma, lobular carcinoma, phyllodes tumor, and tubular adenoma exhibited very low intra-class dissimilarity and relatively high inter-class dissimilarity (see Table 15), which produced large positive class margins and high accuracy (99–100%). In contrast, adenosis and ductal carcinoma demonstrated higher intra-class dissimilarity (18.13 and 2.43, respectively), indicating significant intra-class heterogeneity; as a result, classification accuracy decreased to 80% and 59%, respectively. A strong and consistent relationship was observed between class margin and classification accuracy, indicating that greater separability in the dissimilarity space yields more reliable predictions. These findings validate that the deep features learned by ResNet inherently encode inter-image dissimilarity patterns, and that classification performance is determined by how the images in a dataset are distributed in terms of their dissimilarity.

Section IV: Conclusions

Based upon the comprehensive simulations and the discussion of the graphs, tables, and findings of the proposed framework, the following conclusions can be drawn:

  • The proposed deep neural network architecture consists of 146 layers. As a result of this optimization, the number of learnable parameters decreased from 23.7M to 16.8M, and consequently the time complexity decreased.
  • A novel Flea Optimization Algorithm has been introduced for the selection of discriminant features from a given feature set. The optimized feature vector improves accuracy and reduces time complexity.
  • At the input stage, preprocessing was performed on the dataset through augmentation techniques such as shear, reflection, and rotation. As a result, the model became more robust and its accuracy improved, which improved the performance metrics.
  • On average, the Wide NN performed well both before and after feature optimization, with acceptable time complexity in both scenarios.
  • Convergence becomes slower as the magnification level of the images increases: convergence at 40× magnification was faster than at 100×, at 100× faster than at 200×, and so on. All magnification levels of the BreakHis dataset have been considered, and accuracies of 99.20% at 40×, 99.62% at 100×, 99.50% at 200×, and 99.34% at 400× magnification were achieved.
  • The performance of the proposed approach was compared with DenseNet, VGG, LSTM, Primal-Dual Multi-Instance SVM, single-task CNN, and multi-task CNN using the following performance metrics: i) accuracy, ii) F1 measure, iii) recall, iv) precision, v) error, vi) Kappa, and vii) MCC, in addition to computational, time, and space complexity. The results showed that the proposed approach outperformed all approaches in comparison; this improved performance makes the model suitable for physicians to use in real-life breast cancer diagnosis.
  • An ablation study was conducted to evaluate the effectiveness of FOA and LASSO with the ResNet architecture; it concluded that LASSO or FOA alone underperformed, whereas their combination (the hybrid approach) increased classification accuracy.
  • The observed MCC values ranged from 0.9534 to 0.9873 and the Kappa values from 0.8668 to 0.9633. Moreover, the t-test yielded a t-statistic of 0.712 with 5 degrees of freedom; the corresponding two-tailed p-value of 0.51 exceeds the conventional significance threshold of 0.05, and the 95% confidence interval for the mean accuracy further supports the reliability of the model.

In the future, the Flea Optimization Algorithm could be exploited for hyperparameter tuning of deep learning architectures. These architectures could also be implemented on microchips for practical deployment in the health sciences and allied domains.

Acknowledgments

The authors acknowledge King Faisal University, Alhasa and HITEC University Taxila for their support.

References

  1. Nasser M, Yusof UK. Deep learning based methods for breast cancer diagnosis: A systematic review and future direction. Diagnostics (Basel). 2023;13(1):161. pmid:36611453
  2. El-Nabawy A, El-Bendary N, Belal NA. A feature-fusion framework of clinical, genomics, and histopathological data for METABRIC breast cancer subtype classification. Appl Soft Comput. 2020;91:106238.
  3. Pravalika L, Bachu S, Kumar NU. Breast cancer detection and classification using artificial intelligence. In: 2024 1st international conference on innovative sustainable technologies for energy, mechatronics, and smart systems (ISTEMS); 2024. p. 1–5. https://doi.org/10.1109/istems60181.2024.10560257
  4. Aggarwal R, Sounderajah V, Martin G, Ting DSW, Karthikesalingam A, King D, et al. Diagnostic accuracy of deep learning in medical imaging: A systematic review and meta-analysis. NPJ Digit Med. 2021;4(1):65. pmid:33828217
  5. Nassif AB, Talib MA, Nasir Q, Afadar Y, Elgendy O. Breast cancer detection using artificial intelligence techniques: A systematic literature review. Artif Intell Med. 2022;127:102276. pmid:35430037
  6. Yao H, Zhang X, Zhou X, Liu S. Parallel structure deep neural network using CNN and RNN with an attention mechanism for breast cancer histology image classification. Cancers (Basel). 2019;11(12):1901. pmid:31795390
  7. Ha R, Chang P, Mutasa S, Karcich J, Goodman S, Blum E, et al. Convolutional neural network using a breast MRI tumor dataset can predict oncotype Dx recurrence score. J Magn Reson Imaging. 2019;49(2):518–24. pmid:30129697
  8. Saadatmand S, Bretveld R, Siesling S, Tilanus-Linthorst MM. Influence of tumour stage at breast cancer detection on survival in modern times: Population based study in 173 797 patients. BMJ. 2015;351.
  9. Fujioka T, Mori M, Kubota K, Oyama J, Yamaga E, Yashima Y, et al. The utility of deep learning in breast ultrasonic imaging: A review. Diagnostics (Basel). 2020;10(12):1055. pmid:33291266
  10. Yassin NIR, Omran S, El Houby EMF, Allam H. Machine learning techniques for breast cancer computer aided diagnosis using different image modalities: A systematic review. Comput Meth Progr Biomed. 2018;156:25–45. pmid:29428074
  11. Saba T. Recent advancement in cancer detection using machine learning: Systematic survey of decades, comparisons and challenges. J Infect Public Health. 2020;13(9):1274–89. pmid:32758393
  12. Shah SM, Khan RA, Arif S, Sajid U. Artificial intelligence for breast cancer analysis: Trends & directions. Comput Biol Med. 2022;142:105221. pmid:35016100
  13. Bai J, Posner R, Wang T, Yang C, Nabavi S. Applying deep learning in digital breast tomosynthesis for automatic breast cancer detection: A review. Med Image Anal. 2021;71:102049. pmid:33901993
  14. Rautela K, Kumar D, Kumar V. A systematic review on breast cancer detection using deep learning techniques. Arch Computat Methods Eng. 2022;29(7):4599–629.
  15. Hosna A, Merry E, Gyalmo J, Alom Z, Aung Z, Azim MA. Transfer learning: A friendly introduction. J Big Data. 2022;9(1):102. pmid:36313477
  16. Bharati S, Podder P, Mondal M. Artificial neural network based breast cancer screening: A comprehensive review. arXiv preprint. 2020. https://doi.org/10.48550/arXiv.2006.01767
  17. Kassani SH, Kassani PH, Wesolowski MJ, Schneider KA, Deters R. Classification of histopathological biopsy images using ensemble of deep learning networks. arXiv preprint. 2019. https://arxiv.org/abs/1909.11870
  18. Sayın İ, Soydaş MA, Mert YE, Yarkataş A, Ergun B, Yeh SS. Comparative analysis of deep learning architectures for breast cancer diagnosis using the BreaKHis dataset. arXiv preprint. 2023. https://arxiv.org/abs/2309.01007
  19. Nahid A-A, Mehrabi MA, Kong Y. Histopathological breast cancer image classification by deep neural network techniques guided by local clustering. Biomed Res Int. 2018;2018:2362108. pmid:29707566
  20. Seo H, Brand L, Barco LS, Wang H. Scaling multi-instance support vector machine to breast cancer detection on the BreaKHis dataset. Bioinformatics. 2022;38(Suppl 1):i92–100. pmid:35758811
  21. Jabeen K, Khan MA, Balili J, Alhaisoni M, Almujally NA, Alrashidi H, et al. BC2NetRF: Breast cancer classification from mammogram images using enhanced deep learning features and equilibrium-Jaya Controlled Regula Falsi-based features selection. Diagnostics (Basel). 2023;13(7):1238. pmid:37046456
  22. Kaplun D, Krasichkov A, Chetyrbok P, Oleinikov N, Garg A, Pannu HS. Cancer cell profiling using image moments and neural networks with model agnostic explainability: A case study of breast cancer histopathological (BreakHis) database. Mathematics. 2021;9(20):2616.
  23. Bayramoglu N, Kannala J, Heikkila J. Deep learning for magnification independent breast cancer histopathology image classification. In: 2016 23rd international conference on pattern recognition (ICPR); 2016. p. 2440–5. https://doi.org/10.1109/icpr.2016.7900002
  24. Araújo T, Aresta G, Castro E, Rouco J, Aguiar P, Eloy C, et al. Classification of breast cancer histology images using convolutional neural networks. PLoS One. 2017;12(6):e0177544. pmid:28570557
  25. Wakili MA, Shehu HA, Sharif MH, Sharif MHU, Umar A, Kusetogullari H, et al. Classification of breast cancer histopathological images using DenseNet and transfer learning. Comput Intell Neurosci. 2022;2022:8904768. pmid:36262621
  26. Sayın İ, Soydaş MA, Mert YE, Yarkataş A, Ergun B, Yeh SS. Comparative analysis of deep learning architectures for breast cancer diagnosis using the BreaKHis dataset. arXiv preprint. 2023. https://arxiv.org/abs/2309.01007
  27. Mehta S, Khurana S. Combining CNN and LSTM for enhanced breast tumor image analysis. In: 2024 3rd international conference for advancement in technology (ICONAT); 2024. p. 1–5.
  28. Seo H, Brand L, Barco LS, Wang H. Scaling multi-instance support vector machine to breast cancer detection on the BreaKHis dataset. Bioinformatics. 2022;38(Suppl 1):i92–100. pmid:35758811
  29. Jabeen K, Khan MA, Alhaisoni M, Tariq U, Zhang Y-D, Hamza A, et al. Breast cancer classification from ultrasound images using probability-based optimal deep learning feature fusion. Sensors (Basel). 2022;22(3):807. pmid:35161552