
AdaptiveInvolutionNet: Spatially-adaptive involution with channel-wise attention for breast MRI tumor classification

  • Saeed Alqahtani ,

    Roles Conceptualization, Formal analysis, Project administration, Validation, Visualization, Writing – original draft, Writing – review & editing

    salqahtani@nu.edu.sa

    Affiliation Radiological Sciences Department, College of Applied Medical Sciences, Najran University, Najran, Kingdom of Saudi Arabia

  • Khaled Alqahtani,

    Roles Conceptualization, Data curation, Methodology, Resources, Validation, Writing – original draft, Writing – review & editing

    Affiliation Biomedical Physics Department, King Faisal Hospital and Research Centre, Riyadh, Kingdom of Saudi Arabia

  • Faisal Alshomrani,

    Roles Data curation, Formal analysis, Investigation, Methodology, Validation, Writing – original draft, Writing – review & editing

    Affiliation Department of Diagnostic Radiology Technology, College of Applied Medical Science, Taibah University, Medinah, Kingdom of Saudi Arabia

  • Khaled AlQahtani

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Radiology Department, King Abdulaziz Specialist Hospital, Taif, Kingdom of Saudi Arabia

Abstract

Background: Early and accurate classification of breast tumors from MRI scans is essential for improving patient outcomes. However, a key limitation of conventional deep learning models, such as Convolutional Neural Networks (CNNs), is their difficulty in capturing the subtle, spatially variant features that are crucial for precise medical image interpretation. Objective: To address this limitation, we propose a novel deep learning framework called AdaptiveInvolutionNet (AIN). This hybrid architecture is specifically designed to improve discriminative feature learning for breast tumor classification by integrating two key mechanisms: spatially-adaptive involution layers and channel-wise attention. Methods: Our AIN model employs a unique strategy for feature extraction. In its early layers, it utilizes spatially-adaptive involution kernels, which are highly effective at capturing fine-grained, localized features. As the network deepens, it transitions to conventional convolutions to maintain computational efficiency. To further enhance its diagnostic capabilities, we have embedded channel-wise attention mechanisms (specifically, squeeze-and-excitation modules) within the residual connections of the network. This allows the model to dynamically and selectively amplify diagnostically relevant features while suppressing less important ones. The model was rigorously trained and evaluated on a large, balanced dataset of 6,000 breast MRI images (3,000 benign, 3,000 malignant) using a robust five-fold cross-validation protocol. Results: AIN demonstrated superior performance, achieving a high test accuracy of 97%. This performance was consistent and reliable across all folds, with an average accuracy of 96% (± 1%). The model also showed strong agreement with true labels, indicated by a high Cohen’s Kappa score of 0.93 (± 0.01), and produced well-calibrated, trustworthy predictions with a low Brier score of just 0.0241. Conclusion: By successfully uniting an adaptive spatial feature extraction method with powerful attention mechanisms, AIN represents a significant advancement in medical image analysis. Its high accuracy, robust generalization, and consistent reliability demonstrate a strong potential for it to serve as a valuable and dependable computer-aided diagnostic tool for breast cancer detection in clinical settings.

1 Introduction

Breast cancer (BC) represents a major global health challenge. It is the most commonly occurring cancer among women, accounting for around 23% of all cancers in women worldwide, and more than 80% of BC cases occur in women aged 45 years and older [1–3]. The causes of BC are multifactorial and diverse, including hormonal imbalances, radiation exposure, genetic mutations (e.g., in the BRCA genes), obesity, and possibly diet and general life stress [4].

Mammography is currently the gold standard for early detection of BC [5]. To date, it has proven the most capable method for identifying breast tumors before physical symptoms develop, thereby significantly reducing mortality rates [6,7]. Mammography nevertheless has important limitations, including a low diagnostic yield and false negatives [8]. To address these, biopsy is recommended for lesions with a greater than 2% likelihood of malignancy, yet fewer than 30% of such lesions prove to be malignant [9]. To reduce unnecessary biopsies, Magnetic Resonance Imaging (MRI) has been adopted as an advanced approach for BC detection [10,11].

MRI is an attractive detection modality because it uses non-ionizing radiation and offers superior soft-tissue resolution [12]. However, it has limitations as well: interpreting MRI images requires highly skilled experts [13,14]. Many cancer detection methods have recently been formulated and evaluated, drawing on risk factors derived from imaging, genetics, and public health data. These factors alone, however, are not sufficient to support one or more critical diagnostic screenings and to correctly estimate an individual's BC risk [15].

Some methods still rely solely on variables such as breast density, which limits their usefulness for both practitioners and patients. There is therefore a growing need for automated, standardized, personalized screening approaches that can integrate genetic, imaging, and clinical data to better predict risk and guide early interventions. Recent advances in Artificial Intelligence (AI) have significantly improved the diagnostic performance of MRI in BC detection and diagnosis [16,17].

The Support Vector Machine (SVM) is one of the most widely used supervised Machine Learning (ML) algorithms for BC diagnosis. It operates in a high-dimensional space, finding the optimal hyperplane that separates malignant from benign tumors, and is particularly effective when the dataset is small [18,19].

The Random Forest (RF) model is an ensemble learning approach that combines multiple Decision Trees (DTs) to obtain more stable, efficient, and accurate predictions [20,21]. RF is known for its ability to reduce overfitting and to handle missing data, and it is highly effective for gene expression profiling and biopsy classification. K-Nearest Neighbors (KNN) classifies data points based on the categories of their nearest neighbors and is a powerful approach for BC diagnosis; it can be further optimized by combining it with feature selection methods. Based on similarity measures, KNN is highly efficient at identifying tumor patterns [22]. Artificial Neural Networks (ANNs) and Naïve Bayes (NB) are also among the most commonly used ML approaches for BC detection [23–27].

ANNs can learn complex input-output relationships. In BC detection, such models take features such as tumor texture, shape, and size from mammography, MRI, and histopathological data and model the nonlinear relationships among them. NB is a probabilistic classifier based on Bayes' theorem; it detects disease by assuming independence among the input features and is highly efficient for binary classification problems, particularly on structured data [28]. Although these methods have diagnosed BC effectively, they still have limitations: they require large, well-structured, hand-crafted datasets for training, and because learning is typically supervised, they require human supervision.

Deep learning (DL), a subfield of AI capable of extracting complex patterns from large image datasets, emerged to address these limitations [29]. Medical imaging, autonomous systems, and natural language processing (NLP) have been massively transformed by the ability of DL methods to learn hierarchical features automatically. DL not only exploits increased computational power but also scales well with big data. In a recent study [30], the authors developed an automated Convolutional Neural Network (CNN) for efficient BC diagnosis through image texture attribute extraction.

Another study [31] introduced the MesoNet approach, which uses DL-based CNNs to predict the survival rates of mesothelioma patients. MesoNet did not require a pathologist to manually annotate regions of interest; instead, it worked directly from digitized whole-slide images. In another powerful approach, an SVM trained on CNN features was applied to T2-weighted (T2W) and DCE-MRI images [32]. That study highlighted that feature-fusion approaches outperform image-fusion and classifier-fusion methods.

In another study [33], the authors proposed an advanced CNN architecture called boosted EfficientNet. Designed to cope with low image resolution, the model automatically detects cancerous cells in BC pathology tissue. Another DL-based classification system, a stacked sparse auto-encoder model, was implemented for BC detection using multi-parametric MRI [34]. Many other DL methods, including CADNet157 [35], DCNNs [36–38], a deep fuzzy model [39], a Gamma-function-based ensemble system [40], and hybrid rule-based systems [41], have been proposed for efficient BC diagnosis.

Some researchers found that data augmentation can smooth the convergence of the training loss [42–44]. Transfer learning and data augmentation can effectively address the scarcity of large medical imaging datasets. Another study [45] led to the development of a DL-based system, the DenseNet121 CNN model, which accurately identifies BC from histopathology images. The Vision Transformer (ViT) is another popular DL approach [46]. Unlike CNNs, which focus on local tissue lesions, ViTs use self-attention to model global context and long-range dependencies. This has proven effective for image classification tasks [47], particularly in medical imaging, including breast cancer histopathological analysis [48–51], as it helps capture the spatial relationships and fine-grained features within an image.

He et al. [52] proposed a novel approach by integrating discrete wavelet transform (DWT) with a ViT, enabling the network to capture significant frequency-domain features from ultrasound images and thereby improving its receptive fields. Similarly, Jahan et al. [53] developed a comprehensive deep learning framework that utilizes a ViT-based model for both the detection of cancerous patches and the identification of cancer subtypes from whole slide images (WSIs), demonstrating superior performance compared to other deep learning models. In another study, Hayat et al. [54] introduced a hybrid model that combines the feature extraction capabilities of EfficientNetV2 with the classification power of a ViT, achieving high accuracy in classifying histopathological images. Addressing the limitations of standard ViTs in medical imaging, Babita and Nayak [55] designed the RDTNet, which incorporates a residual deformable attention-based transformer layer (RDTL) to capture both local and global contextual details from histopathological images.

DL has shown great potential for automated breast cancer detection from MRI. However, existing methods struggle to achieve the reliability needed for real clinical use. Current DL models use standard convolutions that process images in a fixed, uniform way. This is a major limitation, because breast tumors come in many different shapes and sizes: some are small and round, while others are large and irregular with unclear edges. Because these models cannot adapt to different tumor characteristics, they can miss important details such as faint or blurry tumor boundaries, small irregular patterns, and subtle differences in tissue texture. This paper addresses this gap by proposing AdaptiveInvolutionNet (AIN), a new deep learning model designed specifically to adapt to different tumor characteristics in breast MRI. The main contributions of this work are as follows:

  • Novel Hybrid Architecture: Introduces AIN, the first breast MRI classification framework to combine spatially-adaptive involution layers with channel-wise attention, enabling precise spatial feature extraction and enhanced channel-wise feature recalibration.
  • Improved Feature Learning: Incorporates learnable fusion of involution and convolutional pathways and progressive channel expansion, achieving superior representation of tumor boundaries and tissue heterogeneity.
  • Robust Evaluation Protocol: Implements five-fold cross-validation with calibration curve analysis, ensuring well-calibrated probability estimates and reliable generalization across folds.
  • High Performance: Demonstrates 97% accuracy and consistent reliability, highlighting AIN’s potential as a computer-aided diagnosis tool to support radiologists in early breast cancer detection.

These contributions advance medical image analysis by improving diagnostic accuracy for breast cancer detection.

The rest of the paper is organized as follows: Sect 2 describes the detailed material and methods used for this work. Sect 3 presents the model performance, which includes five-fold validation, an ablation study, and a comparative analysis. Finally, Sect 4 concludes the paper.

2 Materials and methods

2.1 Dataset detail

The dataset utilized in this study comprises breast Magnetic Resonance Imaging (MRI) scans, meticulously curated from The Cancer Imaging Archive (TCIA) Breast Diagnosis collection (https://www.cancerimagingarchive.net/collection/breast-diagnosis/). The original TCIA collection is extensive, encompassing over 101,000 medical images across diverse modalities, including MRI, Computed Tomography, Positron Emission Tomography, and Mammography. This comprehensive repository also includes associated clinical metadata, such as Diagnosis, Molecular Test results, Demographic information, Pathology Details, Radiomic Features, and Measurement data.

For the specific objectives of this research, only MRI images were retained, allowing for a focused investigation into breast tumor classification using this particular imaging modality. The core characteristics of the final dataset subset used for model training and evaluation are summarized in Table 1. Dataset sample images are shown in Fig 1.

Table 1. Summary of the TCIA BREAST-DIAGNOSIS dataset characteristics.

https://doi.org/10.1371/journal.pone.0340808.t001

Fig 1. Collection of breast MRI scans, categorized into Benign and Malignant cases.

https://doi.org/10.1371/journal.pone.0340808.g001

2.2 Dataset preprocessing

The breast MRI images underwent a comprehensive series of preprocessing and augmentation steps to optimize them for deep learning model training and evaluation. These procedures were systematically managed through a custom class, designed to integrate seamlessly with PyTorch’s Data Loader for efficient batch processing.

Initially, images were loaded from a structured directory in which benign and malignant cases were organized into distinct folders. Upon loading, each image file was converted to RGB format. An initial check of image dimensions was implemented: if an image's width or height was less than 32 pixels, it was automatically resized to a standard input size. A stratified split with an 80/20 ratio (80% for training and 20% for testing) was then performed.

For both the training and validation (test) data, a sequence of augmentations was applied to introduce variability and improve model robustness. Images were first resized to a fixed resolution, followed by a random crop, which encouraged the model to learn features from various parts of the image. Further augmentation included random horizontal flipping with probability 0.5, random rotations of up to 10 degrees, and minor color jittering (brightness, contrast, and saturation at 0.1; hue at 0.05) to simulate diverse lighting conditions and scanner variations. Finally, the augmented images were converted into PyTorch tensors and normalized using the standard ImageNet mean (0.485, 0.456, 0.406) and standard deviation (0.229, 0.224, 0.225), a common practice for stabilizing training, especially with pre-trained models.

Finally, the preprocessed train_dataset and test_dataset instances were integrated with PyTorch’s Data Loader to facilitate efficient batch processing during model training and evaluation. A batch_size of 16 was configured for both loaders. The training loader was set to shuffle=True to randomize sample order in each epoch and drop_last=True to discard any incomplete final batches. Both data loaders leveraged num_workers=2 for parallel data loading and pin_memory=True to expedite data transfer to the GPU, optimizing overall training efficiency.
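The following is a minimal sketch of such a pipeline, assuming a torchvision ImageFolder-style directory layout; the resize and crop sizes and the directory paths shown are placeholders rather than the exact values used in this study.

```python
# Illustrative augmentation + DataLoader setup (sizes and paths are assumptions).
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

train_transform = transforms.Compose([
    transforms.Resize((256, 256)),          # assumed resize target
    transforms.RandomCrop(224),             # assumed crop size
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=10),
    transforms.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.05),
    transforms.ToTensor(),
    transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
])

# Assumed layout: data/train/benign/*.png and data/train/malignant/*.png
train_dataset = datasets.ImageFolder("data/train", transform=train_transform)
train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True,
                          drop_last=True, num_workers=2, pin_memory=True)
```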

2.3 Comparison with other breast cancer datasets

We briefly compare the Breast MRI Tumor Classification Dataset (used in this study) with two prominent public benchmarks: the TCIA breast cancer MRI collections and the BUSI ultrasound dataset. These comparisons underscore the unique challenges and opportunities in multimodal breast imaging for AI-driven diagnostics.

The TCIA (The Cancer Imaging Archive) hosts multiple breast cancer collections, such as TCGA-BRCA (139 subjects with MRI linked to genomic data) and Duke-Breast-Cancer-MRI (922 pre-operative DCE-MRI cases), emphasizing large-scale, multi-institutional MRI data for radiogenomics and tumor phenotyping. Our dataset aligns closely with TCIA’s MRI focus, comprising 6,000 de-identified MRI slices (balanced benign/malignant) resized to 224×224, but it is more accessible via Kaggle for rapid prototyping without DICOM handling. In contrast, the BUSI (Breast Ultrasound Images) dataset contains 780 2D ultrasound scans (437 benign, 210 malignant, 133 normal) from 600 patients aged 25–75, collected in 2018 at Baheya Hospital (Egypt), with manual segmentation masks for tumor regions. BUSI supports classification and segmentation tasks in ultrasound, a cost-effective modality for dense breast screening, but lacks the volumetric depth and contrast enhancement of MRI.

2.4 Proposed model: AdaptiveInvolutionNet (AIN)

The proposed model, AIN (Fig 2), introduces a novel deep learning architecture designed for breast MRI tumor classification. AIN synergistically integrates Spatially-Adaptive Involution Layers (SAIL) and Channel-Wise Attention Mechanisms (CWAM) within a residual learning framework to enhance feature representation and discrimination. By combining dynamic, content-aware involution operations with traditional convolutional feature extraction, AIN addresses the limitations of conventional convolutional neural networks (CNNs) in capturing spatially varying patterns and adaptive feature representations critical for medical image analysis. This section provides a detailed description of the architecture, its core components, and their mathematical formulations, ensuring clarity and rigor for research reproducibility.

Fig 2. AIN Development Process - Sequential four-step development of Adaptive Involution Network for breast MRI tumor classification: introducing SAIL layers, integrating CWAM mechanisms, combining with CNN architecture, and achieving final AIN model.

https://doi.org/10.1371/journal.pone.0340808.g002

The end-to-end transformation of the input image X in AIN to the final output ŷ is mathematically expressed as Eq 1:

(1)  ŷ = f_head(f_pool(f_L(⋯ f_1(f_stem(X)) ⋯)))

where:

  • f_stem: Stem block transformation for initial feature extraction.
  • f_l: Transformation at layer l, incorporating SAIL or convolutional operations.
  • f_pool: Global feature aggregation via adaptive average pooling.
  • f_head: Classification head transformation mapping features to class probabilities.

2.4.1 Spatially-Adaptive Involution Layer (SAIL).

The Spatially-Adaptive Involution Layer (SAIL) is the core innovation of AIN, designed to overcome the limitations of traditional CNN in capturing spatially varying patterns critical for breast MRI tumor classification as shown in Fig 3. Unlike conventional convolutions, which rely on fixed, translation-invariant kernels, SAIL dynamically generates location-specific kernels conditioned on the input feature content. This adaptivity enables the layer to respond to local image characteristics which are often heterogeneous in medical images. SAIL integrates a learnable fusion coefficient α to combine the outputs of an involution pathway and a parallel conventional convolution pathway, balancing flexibility with the stability of traditional feature extraction. By generating spatially adaptive kernels, SAIL enhances the model’s ability to capture fine-grained details while maintaining robustness across diverse image regions. This makes it particularly effective for tasks requiring precise localization and contextual understanding, such as distinguishing malignant from benign tumors in MRI scans.

Fig 3. SAIL Architecture Overview - Diagram showing the key components of Spatially-Adaptive Involution Layers, including adaptive kernel generation, spatial feature unfolding, involution operations, channel projection, and parallel convolution across multi-scale feature maps.

https://doi.org/10.1371/journal.pone.0340808.g003

The SAIL module consists of several sub-components: the Adaptive Kernel Generation Network (AKGN) synthesizes position-specific kernels through a bottleneck design, reducing computational overhead while ensuring rich kernel representations; spatial feature unfolding extracts local neighborhoods for pixel-wise operations; the involution operation applies these dynamic kernels to produce location-specific features; a channel projection aligns the output dimensions; and a parallel convolution pathway complements the involution process. The fusion mechanism, governed by α, is optimized end-to-end via gradient descent, allowing the model to learn the optimal balance between adaptive and fixed feature extraction. This design not only improves feature representation but also enhances the model’s adaptability to complex, spatially varying patterns in medical images.

For an input feature tensor X ∈ ℝ^(B×C_in×H×W), where B is the batch size, C_in is the number of input channels, and H, W are the spatial dimensions, SAIL computes the output as Eq 2:

(2)  Y = σ(α) ⊙ Y_inv + (1 − σ(α)) ⊙ Y_conv

where α is initialized as a learnable scalar parameter with α = 0.0, and:

  • σ(·): Sigmoid activation to constrain the fusion coefficient to [0,1].
  • ⊙: Element-wise Hadamard product.
  • Y_inv: Output of the involution pathway.
  • Y_conv: Output of the conventional convolution pathway.

The AKGN generates position-specific kernels as depicted in Eq 3:

(3)  k_{i,j} = W_2 δ(W_1 x_{i,j})

where W_1 ∈ ℝ^((C_in/r)×C_in), W_2 ∈ ℝ^((K·K)×(C_in/r)), δ is the ReLU activation, r is the channel reduction ratio (typically 4 or 8), and K is the kernel size (typically 3 or 5). The involution operation and other sub-components follow the original formulations.
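For concreteness, the following is a minimal PyTorch sketch of a SAIL-style layer under stated assumptions: the kernel-generation bottleneck predicts a single K×K kernel per spatial position shared across channels, and the module and variable names are illustrative rather than the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SAIL(nn.Module):
    """Spatially-adaptive involution fused with a parallel convolution (sketch)."""
    def __init__(self, channels, kernel_size=3, reduction=4):
        super().__init__()
        self.k = kernel_size
        # Adaptive Kernel Generation Network: bottleneck predicting one KxK
        # kernel per spatial position (shared across channels for simplicity).
        self.kernel_gen = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, kernel_size * kernel_size, 1),
        )
        self.conv = nn.Conv2d(channels, channels, kernel_size, padding=kernel_size // 2)
        self.alpha = nn.Parameter(torch.zeros(1))  # learnable fusion coefficient, init 0.0

    def forward(self, x):
        b, c, h, w = x.shape
        # Position-specific kernels: (B, K*K, H, W) -> (B, 1, K*K, H, W)
        kernels = self.kernel_gen(x).unsqueeze(1)
        # Unfold local neighborhoods: (B, C*K*K, H*W) -> (B, C, K*K, H, W)
        patches = F.unfold(x, self.k, padding=self.k // 2).view(b, c, self.k * self.k, h, w)
        # Involution: weight each neighbor by its position-specific kernel and sum.
        y_inv = (kernels * patches).sum(dim=2)
        y_conv = self.conv(x)
        gate = torch.sigmoid(self.alpha)
        return gate * y_inv + (1.0 - gate) * y_conv
```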

2.4.2 Channel-Wise Attention Mechanism (CWAM).

The Channel-Wise Attention Mechanism (CWAM) (Fig 4) is designed to enhance the discriminability of features by modeling inter-channel dependencies and recalibrating channel responses based on global context. In medical image analysis, CWAM ensures that the network prioritizes task-relevant features while suppressing irrelevant or noisy ones. By integrating a squeeze-and-excitation strategy, CWAM captures global spatial information through global average pooling (squeeze) and transforms it into channel-wise attention weights via a two-layer neural network (excitation). These weights are then applied to scale the input feature maps, effectively amplifying important channels and attenuating less relevant ones. This mechanism enhances the model’s ability to focus on diagnostically significant features, improving classification accuracy for breast MRI tumor differentiation.

Fig 4. CWAM Three-Step Process - Channel-Wise Attention Mechanism showing global context aggregation, channel dependency modeling through bottleneck transformation, and feature recalibration with attention weights.

https://doi.org/10.1371/journal.pone.0340808.g004

The CWAM operates in three stages: global context aggregation condenses spatial information into channel descriptors; channel dependency modeling learns attention weights through a bottleneck transformation; and feature recalibration applies these weights to refine the feature maps. By embedding CWAM within the residual blocks, AIN ensures that attention-enhanced features are seamlessly integrated into the hierarchical feature learning process, contributing to robust and discriminative representations. The lightweight design of CWAM, achieved through channel reduction, maintains computational efficiency while significantly boosting feature quality.

For an input tensor X ∈ ℝ^(B×C×H×W), the CWAM output is generated as shown in Eq 4:

(4)  Y = s ⊙ X

where s ∈ ℝ^(B×C×1×1) are the attention weights, computed as Eq 5:

(5)  s = σ(W_2 δ(W_1 z)),  z = GAP(X)

where W_1 ∈ ℝ^((C/r)×C), W_2 ∈ ℝ^(C×(C/r)), GAP denotes global average pooling over the spatial dimensions, and r is the reduction ratio (typically 16).
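A compact sketch of a squeeze-and-excitation style CWAM block is shown below; module and variable names are illustrative.

```python
import torch.nn as nn

class CWAM(nn.Module):
    """Squeeze-and-excitation style channel-wise attention (sketch)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)   # global context aggregation
        self.excite = nn.Sequential(             # channel dependency modeling
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        s = self.excite(self.squeeze(x).view(b, c)).view(b, c, 1, 1)
        return x * s                              # feature recalibration
```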

2.4.3 Adaptive Residual Block (ARB).

The Adaptive Residual Block (ARB) (Fig 5) serves as the fundamental computational unit of AIN, integrating SAIL and CWAM within a residual learning framework to enable deep network training while mitigating the vanishing gradient problem. The ARB combines the adaptive feature extraction capabilities of SAIL (in early layers) or conventional convolutions (in deeper layers) with the feature recalibration of CWAM, ensuring that both local and global feature representations are optimized for the classification task. The residual structure, with its skip connections, facilitates gradient flow and allows the network to learn identity mappings when necessary, enhancing training stability and convergence. This is particularly crucial for deep architectures processing complex medical images, where maintaining gradient information across multiple layers is essential for capturing both low-level textures and high-level semantic patterns.

Fig 5. Adaptive Residual Block Flow - Circular process showing ARB components: feature extraction through SAIL/convolutions, batch normalization, ReLU activation, dropout regularization, additional convolution, CWAM recalibration, and shortcut connection.

https://doi.org/10.1371/journal.pone.0340808.g005

The ARB’s main transformation path includes a sequence of operations: either a SAIL or convolutional layer for feature extraction, followed by batch normalization, ReLU activation, dropout for regularization, a second convolution, and finally CWAM for channel-wise recalibration. The shortcut connection ensures that the input is either directly added (identity mapping) or projected to match the output dimensions, preserving information flow. By strategically deploying SAIL in early layers and transitioning to convolutions in deeper layers, the ARB balances computational efficiency with adaptive feature learning, making it well-suited for medical image analysis tasks requiring both precision and scalability.

The ARB output is defined as Eq 6:

(6)  Y = ReLU(F(X) + S(X))

where F(X) is the main transformation path, as in Eq 7:

(7)  F(X) = CWAM(Conv(Dropout(ReLU(BN(f_feat(X))))))

with f_feat denoting either a SAIL layer (early stages) or a convolutional layer (deeper stages), and S(X) is the shortcut connection, as depicted in Eq 8:

(8)  S(X) = X if the input and output dimensions match, and S(X) = W_p ∗ X otherwise

where W_p denotes a 1×1 projection convolution used to match the output dimensions.
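Assuming the SAIL and CWAM sketches given above, an ARB can be outlined as follows; the dropout rate and the batch normalization placed after the second convolution are assumptions rather than confirmed details of the authors' implementation.

```python
import torch.nn as nn

class AdaptiveResidualBlock(nn.Module):
    """Residual block combining SAIL (or a conv) with CWAM recalibration (sketch)."""
    def __init__(self, in_ch, out_ch, use_sail=True, stride=1, p_drop=0.1):
        super().__init__()
        # Main path: feature extraction -> BN -> ReLU -> dropout -> conv -> CWAM.
        feat = SAIL(in_ch) if use_sail else nn.Conv2d(in_ch, in_ch, 3, padding=1)
        self.main = nn.Sequential(
            feat,
            nn.BatchNorm2d(in_ch),
            nn.ReLU(inplace=True),
            nn.Dropout2d(p_drop),
            nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1),
            nn.BatchNorm2d(out_ch),
            CWAM(out_ch),
        )
        # Shortcut: identity when shapes match, otherwise a 1x1 projection.
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride),
                nn.BatchNorm2d(out_ch),
            )
        else:
            self.shortcut = nn.Identity()
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.main(x) + self.shortcut(x))
```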

2.4.4 AdaptiveInvolutionNet architecture.

The AdaptiveInvolutionNet (AIN) architecture (Fig 6) is a hierarchical deep learning framework tailored for breast MRI tumor classification, leveraging the complementary strengths of SAIL, CWAM, and residual learning. AIN is structured to progressively extract features at increasing levels of abstraction, starting with fine-grained, spatially adaptive features in early layers and transitioning to high-level, computationally efficient feature representations in deeper layers. The architecture is composed of four main stages, each comprising multiple ARBs, with channel dimensions increasing (64, 128, 256, 512) and spatial resolutions decreasing through downsampling. This hierarchical design ensures that the network captures both local details (e.g., tumor textures) and global context (e.g., spatial relationships), critical for accurate medical image classification.

Fig 6. AIN Sequential Architecture - Gear-based diagram showing AIN processing stages from feature extraction stem through four ARB stages (SAIL-based stages 1-2, convolution-based stages 3-4), global average pooling, to classification head.

https://doi.org/10.1371/journal.pone.0340808.g006

The AIN pipeline begins with a feature extraction stem that performs aggressive downsampling and initial feature extraction, followed by four stages of ARBs. In stages 1 and 2, ARBs incorporate SAIL to capture spatially adaptive features, enabling the model to focus on local variations in MRI images. In stages 3 and 4, ARBs use conventional convolutions to prioritize computational efficiency while maintaining robust feature generalization. The extracted features are aggregated via global average pooling, producing a compact representation that is fed into a classification head, a two-layer multi-layer perceptron (MLP) with dropout regularization to prevent overfitting. The strategic integration of SAIL in early layers and the transition to convolutions in deeper layers optimize the trade-off between adaptivity and scalability, making AIN a powerful architecture for complex medical imaging tasks.

2.4.5 Architectural composition.

This section details the mathematical formulation and implementation specifics of AIN’s four primary architectural components, providing the technical foundation for system implementation and reproducibility.

Feature extraction: The Stem block performs initial feature extraction with aggressive spatial downsampling through a sequential combination of convolution, normalization, activation, and pooling operations. The stem block transformation is described as Eq 9:

(9)  X_stem = MaxPool_{s=2,p=1}(ReLU(BN(W_stem ∗ X + b_stem)))

where:

  • W_stem: Convolution weights expanding the input from 3 to 64 channels
  • b_stem ∈ ℝ^64: Channel-wise bias parameters
  • ∗: Convolution with stride 2 and padding 3 for dimension preservation
  • MaxPool_{s=2,p=1}: Max pooling with stride 2 and padding 1
  • X_stem ∈ ℝ^(B×64×H/4×W/4): Output feature tensor
  • H/4, W/4: Spatial dimensions after 4× reduction

Hierarchical Feature Learning: The network implements a four-stage hierarchical architecture with progressive feature abstraction. Each stage contains two ARBs with specific configurations, as detailed in Eqs 10–13:

(10)  X_1 = ARB_{1,2}(ARB_{1,1}(X_stem)),  C_1 = 64
(11)  X_2 = ARB_{2,2}(ARB_{2,1}(X_1)),  C_2 = 128
(12)  X_3 = ARB_{3,2}(ARB_{3,1}(X_2)),  C_3 = 256
(13)  X_4 = ARB_{4,2}(ARB_{4,1}(X_3)),  C_4 = 512

Stage-specific configurations:

  • Stages 1-2: Implement ARBs with SAIL modules for spatially-adaptive feature extraction
  • Stages 3-4: Utilize conventional convolution-based ARBs for computational efficiency
  • Downsampling: Applied at stage transitions (2, 3, 4) via stride-2 operations in the first ARB
  • Channel progression: Doubles at each stage transition (64→128→256→512)

Global Feature Aggregation: This stage converts spatially distributed features into a fixed-size representation through adaptive average pooling, as shown in Eq 14:

(14)  z_c = (1 / (H_4 · W_4)) Σ_{i=1}^{H_4} Σ_{j=1}^{W_4} X_4(c, i, j)

where z ∈ ℝ^(B×512), and H_4, W_4 are the spatial dimensions of the final-stage feature map X_4.

Classification Head: It implements a two-layer MLP with dropout regularization to map global features to class probabilities, as shown in Eq 15:

(15)  ŷ = Softmax(W_2 · Dropout(ReLU(W_1 z + b_1)) + b_2)

where:

  • W_1 ∈ ℝ^(256×512), b_1 ∈ ℝ^256: First layer parameters
  • W_2 ∈ ℝ^(2×256), b_2 ∈ ℝ^2: Output layer parameters
  • ŷ ∈ ℝ^2: Class probability distribution

The intermediate dimension of 256 provides sufficient representational capacity for task-specific transformations while maintaining computational efficiency.
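Putting the pieces together, a hedged sketch of the overall AIN layout (stem, four two-block stages with SAIL in stages 1 and 2, global average pooling, and the two-layer head) is given below; the 7×7 stem kernel and the head dropout rate are assumptions consistent with, but not confirmed by, the description above.

```python
import torch.nn as nn

class AdaptiveInvolutionNet(nn.Module):
    """Sketch of the overall AIN layout: stem, four ARB stages, GAP, MLP head."""
    def __init__(self, num_classes=2, p_drop=0.3):
        super().__init__()
        # Stem: conv (stride 2, padding 3) + BN + ReLU + max pool (stride 2, padding 1).
        self.stem = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=2, padding=3),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(3, stride=2, padding=1),
        )

        def stage(in_ch, out_ch, use_sail, downsample):
            return nn.Sequential(
                AdaptiveResidualBlock(in_ch, out_ch, use_sail, stride=2 if downsample else 1),
                AdaptiveResidualBlock(out_ch, out_ch, use_sail),
            )

        # Stages 1-2 use SAIL-based ARBs; stages 3-4 use convolution-based ARBs.
        self.stages = nn.Sequential(
            stage(64, 64, use_sail=True, downsample=False),
            stage(64, 128, use_sail=True, downsample=True),
            stage(128, 256, use_sail=False, downsample=True),
            stage(256, 512, use_sail=False, downsample=True),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Sequential(               # two-layer MLP with dropout
            nn.Linear(512, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(p_drop),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        x = self.stages(self.stem(x))
        return self.head(self.pool(x).flatten(1))
```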

2.4.6 System model workflow.

The complete forward pass of AdaptiveInvolutionNet is summarized in Algorithm 1, which integrates the stem block, hierarchical ARB stages, global pooling, and classification head.

Algorithm 1 AdaptiveInvolutionNet Forward Pass

Require: Input image X ∈ ℝ^(B×3×H×W)

Ensure: Class probabilities ŷ ∈ ℝ^(B×2)

1: X ← f_stem(X)
2: for i = 1 to 4 do
3:   for j = 1 to 2 do
4:     if i ≤ 2 then
5:       X ← ARB_SAIL(X)   (spatially-adaptive involution block)
6:     else
7:       X ← ARB_Conv(X)   (conventional convolutional block)
8:     end if
9:   end for
10: end for
11: z ← GAP(X)
12: return ŷ = f_head(z)

The AdaptiveInvolutionNet architecture leverages the synergistic integration of SAIL, CWAM, and residual learning to achieve robust feature representation for breast MRI tumor classification. SAIL’s spatially adaptive kernels enable fine-grained feature extraction in early layers, capturing critical local patterns such as tumor boundaries. CWAM enhances feature discriminability by recalibrating channel responses, while the residual framework ensures stable training of deep networks. The strategic transition to conventional convolutions in deeper layers balances computational efficiency with generalization, making AIN suitable for complex medical image analysis tasks.

3 Results

3.1 Training environment

All training hyperparameters are detailed in Table 2 for full reproducibility. The model was trained and evaluated in a GPU-accelerated environment using torch.cuda, with automatic fallback to CPU if CUDA was unavailable.
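A minimal sketch of the device setup and the five-fold protocol is given below, reusing the AdaptiveInvolutionNet sketch from Sect 2.4; the optimizer, learning rate, and placeholder index arrays are illustrative and are not the values listed in Table 2.

```python
import numpy as np
import torch
from sklearn.model_selection import StratifiedKFold

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  # CPU fallback

# Placeholder index/label arrays standing in for the real image list (illustrative).
sample_indices = np.arange(6000)
sample_labels = np.array([0] * 3000 + [1] * 3000)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(skf.split(sample_indices, sample_labels), 1):
    model = AdaptiveInvolutionNet(num_classes=2).to(device)    # sketch from Sect 2.4
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # placeholder settings
    criterion = torch.nn.CrossEntropyLoss()
    # ... build fold-specific DataLoaders from train_idx / val_idx and run the
    #     usual train/validate loop, keeping the best checkpoint per fold ...
```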

Table 2. Complete list of training hyperparameters used for the proposed AIN model.

https://doi.org/10.1371/journal.pone.0340808.t002

3.2 Evaluation metrics definitions

Precision measures the proportion of true positive predictions among all positive predictions for a given class. It is defined as Eq 16:

(16)  Precision = TP / (TP + FP)

where TP is the number of true positives and FP is the number of false positives.

Recall (or sensitivity) measures the proportion of true positive predictions among all actual positive instances. It is defined as Eq 17:

(17)  Recall = TP / (TP + FN)

where FN is the number of false negatives.

F1-Score is the harmonic mean of precision and recall, providing a balanced measure of both metrics. It is defined as Eq 18:

(18)  F1 = 2 · (Precision · Recall) / (Precision + Recall)

Cohen’s Kappa measures the agreement between predicted and actual classifications, accounting for chance agreement. It is defined as Eq 19:

(19)  κ = (p_o − p_e) / (1 − p_e)

where p_o is the observed agreement and p_e is the expected agreement by chance.

Matthews Correlation Coefficient (MCC) is a balanced measure of classification performance, considering true and false positives and negatives. It is defined as Eq 20:

(20)  MCC = (TP · TN − FP · FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))

ROC AUC Score represents the area under the Receiver Operating Characteristic curve, measuring the model’s ability to distinguish between classes. A value of 1 indicates perfect discrimination, while 0.5 indicates no discrimination ability.
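All of these metrics can be computed per fold with scikit-learn; the helper below is an illustrative sketch with placeholder argument names (y_true, y_pred, y_prob).

```python
from sklearn.metrics import (accuracy_score, precision_recall_fscore_support,
                             cohen_kappa_score, matthews_corrcoef, roc_auc_score)

def summarize_fold(y_true, y_pred, y_prob):
    """y_true: ground-truth labels, y_pred: predicted labels,
    y_prob: predicted probability of the malignant class (placeholder names)."""
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro")
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "cohen_kappa": cohen_kappa_score(y_true, y_pred),
        "mcc": matthews_corrcoef(y_true, y_pred),
        "roc_auc": roc_auc_score(y_true, y_prob),
    }
```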

3.3 Training and validation loss-accuracy analysis

Figs 7 and 8 present comprehensive training history plots for the breast cancer classification model across five-fold cross-validation, illustrating the temporal evolution of model performance metrics including training/validation accuracy and loss functions over successive epochs. These visualizations provide critical insights into model convergence behavior, generalization capability, and potential overfitting or underfitting characteristics.

Fig 7. Training and Validation History - Folds 1-3 - Performance plots showing model accuracy and loss progression over training epochs for each fold during cross-validation.

https://doi.org/10.1371/journal.pone.0340808.g007

Fig 8. Training and Validation History - Folds 4-5 and Overall Performance - Performance plots showing model accuracy and loss progression over training epochs for the remaining cross-validation folds plus overall performance summary.

https://doi.org/10.1371/journal.pone.0340808.g008

The training history for Fold 1 (Fig 7(a)) demonstrates robust convergence characteristics with synchronized improvement in both training and validation metrics. The accuracy curves exhibit steady upward progression, indicating effective gradient descent optimization and appropriate learning rate selection. The loss curves show corresponding downward trajectories with minimal oscillations, suggesting stable backpropagation dynamics and well-conditioned parameter updates. The close alignment between training and validation curves indicates balanced model complexity without significant overfitting, demonstrating good bias-variance trade-off optimization.

Fold 2 (Fig 7(b)) exhibits similar convergence patterns with consistent improvement across epochs. The training and validation accuracy curves maintain parallel progression, indicating robust generalization capability. The loss function demonstrates smooth monotonic decrease with minimal variance, suggesting effective optimization landscape navigation. The gap between training and validation metrics remains minimal, confirming the absence of significant overfitting and validating the regularization strategies employed.

The third fold (Fig 7(c)) shows excellent training stability with convergent behavior in both accuracy and loss metrics. The validation curves closely track the training curves, indicating strong model generalization across different data distributions. The smooth convergence pattern without significant plateauing suggests optimal learning rate scheduling and effective feature extraction capability throughout the training process.

Fold 4 (Fig 8(a)) demonstrates consistent training progression with stable convergence characteristics. The accuracy metrics show steady improvement with minimal validation-training gap, indicating robust generalization performance. The loss curves exhibit expected decreasing trends with controlled variance, suggesting effective gradient flow and parameter optimization. Any minor fluctuations in the validation curves reflect natural stochastic variations inherent in the training process rather than systematic overfitting issues.

The fifth fold (Fig 8(b)) exhibits optimal training characteristics with superior convergence behavior compared to other folds. The training and validation curves show excellent alignment with minimal divergence, indicating exceptional generalization capability. The loss function demonstrates rapid initial decrease followed by stable convergence, suggesting efficient optimization dynamics and well-tuned hyperparameters. This fold’s superior performance aligns with the confusion matrix analysis, confirming its status as the best-performing configuration.

The overall training history plot (Fig 8(c)) presents macro-averaged performance across all five folds, providing comprehensive insights into the model’s collective training dynamics. The aggregated loss curve demonstrates consistent downward progression with controlled variance, indicating robust optimization across different data partitions. The ensemble behavior validates the stability of the training protocol and confirms the effectiveness of the cross-validation framework.

3.4 Classification report analysis

Table 3 summarizes the classification performance of the proposed model for breast cancer classification using five-fold cross-validation. The table reports precision, recall, F1-score, and support for two classes (0: benign, 1: malignant) for each fold, alongside overall accuracy, macro average, and weighted average metrics. In Fold 1, the model achieves an accuracy of 0.96, with balanced precision and recall for both classes (0.95–0.97). Fold 2 maintains similar performance with an accuracy of 0.96 and slightly higher precision for class 1 (0.97). Fold 3 also reports an accuracy of 0.96, with class 1 achieving a high recall of 0.99. Fold 4 shows a slightly lower accuracy of 0.95, with marginally reduced F1-scores (0.95 for both classes). Fold 5 demonstrates the best performance, achieving the highest accuracy of 0.97, with consistent precision, recall, and F1-scores of 0.97 for both classes. The overall metrics across all folds yield an accuracy of 0.97, with macro and weighted averages of 0.97, reflecting robust performance. The support values indicate a balanced class distribution, with approximately 600 samples per class in the overall dataset. Based on these results, Fold 5 is identified as the best-performing fold due to its superior accuracy and balanced metrics, highlighting the model’s effectiveness in distinguishing benign and malignant cases.

Table 4 presents the classification metrics for the proposed breast cancer classification model evaluated using five-fold cross-validation. The table reports Cohen’s Kappa, Matthews Correlation Coefficient (MCC), and ROC AUC Score for each fold, alongside their cross-validation averages and standard deviations. In Fold 1, the model achieves a Cohen’s Kappa of 0.9233, MCC of 0.9235, and ROC AUC of 0.9932, indicating strong agreement and discriminative power. Fold 2 shows improved performance with a Cohen’s Kappa of 0.9366, MCC of 0.9367, and ROC AUC of 0.9930. Fold 3 reports a Cohen’s Kappa of 0.9300, MCC of 0.9313, and a high ROC AUC of 0.9954. Fold 4 has slightly lower metrics, with a Cohen’s Kappa of 0.9167, MCC of 0.9173, and ROC AUC of 0.9927. Fold 5 demonstrates the best performance, achieving the highest Cohen’s Kappa and MCC of 0.9449 and a ROC AUC of 0.9964, reflecting superior model reliability and discrimination. The cross-validation summary indicates an average validation accuracy of 96.52% (± 0.50), with average Cohen’s Kappa of 0.9303 (± 0.0099), MCC of 0.9308 (± 0.0097), and ROC AUC of 0.9941 (± 0.0015). These results highlight the model’s robust and consistent performance across folds, with Fold 5 identified as the best-performing fold due to its highest metrics, underscoring the model’s effectiveness in classifying benign and malignant cases.

Table 4. Summary of classification metrics across five folds.

https://doi.org/10.1371/journal.pone.0340808.t004

3.5 Confusion matrix analysis

Fig 9 presents the confusion matrices for the breast cancer classification model evaluated using five-fold cross-validation. Subfigures (a) through (e) display the confusion matrices for Folds 1 to 5, respectively, detailing the true positives, true negatives, false positives, and false negatives for the two classes (0: benign, 1: malignant) across different data splits. Subfigure (f) shows the aggregated confusion matrix, summarizing the overall classification performance across all folds. Fold 1 demonstrates high true positive and true negative rates with minimal misclassifications. Fold 2 similarly shows strong classification performance with few errors. Fold 3 maintains robust performance, with balanced correct predictions for both classes. Fold 4 exhibits slightly more misclassifications compared to other folds. Fold 5 achieves the best performance, with the fewest misclassifications, indicating optimal classification of benign and malignant cases. The aggregated confusion matrix in Subfigure (f) confirms the model’s overall effectiveness, with balanced performance across both classes. Fold 5 is identified as the best-performing fold due to its minimal classification errors.

Fig 9. Detailed confusion matrices for breast cancer classification showing individual fold performance (a-e) and aggregated results across all folds (f), demonstrating model consistency and overall classification accuracy.

https://doi.org/10.1371/journal.pone.0340808.g009

3.6 Calibration curve and Brier score

Fig 10 presents the calibration curves for the breast cancer classification model evaluated using five-fold cross-validation, illustrating the reliability of predicted probabilities for the two classes (0: benign, 1: malignant). Subfigures (a) through (e) display the calibration curves for Folds 1 to 5, respectively, showing the alignment between predicted probabilities and actual outcomes across different data splits. Subfigure (f) presents the aggregated calibration curve, summarizing the overall reliability across all folds. Fold 1 shows strong calibration with a Brier score of 0.0324, indicating good probability alignment. Fold 2 exhibits slightly better calibration with a Brier score of 0.0295, reflecting improved reliability. Fold 3 maintains robust calibration with a Brier score of 0.0300, showing consistent probability estimates. Fold 4 has a slightly higher Brier score of 0.0338, suggesting marginally less precise calibration. Fold 5 achieves the best calibration performance with the lowest Brier score of 0.0241, indicating the closest alignment between predicted probabilities and actual outcomes. The aggregated calibration curve in Subfigure (f) confirms the model’s overall robust probability estimates across both classes. Fold 5 is identified as the best-performing fold due to its lowest Brier score and superior calibration, highlighting the model’s effectiveness in producing reliable probability predictions for benign and malignant cases.
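Calibration curves and Brier scores of this kind can be reproduced with scikit-learn; the sketch below uses placeholder names for the label and probability arrays, and the figure styling is illustrative only.

```python
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss

def plot_calibration(y_true, y_prob, n_bins=10):
    """y_true: binary labels (1 = malignant), y_prob: predicted malignant probability."""
    frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=n_bins)
    brier = brier_score_loss(y_true, y_prob)
    plt.plot(mean_pred, frac_pos, marker="o", label=f"Model (Brier = {brier:.4f})")
    plt.plot([0, 1], [0, 1], linestyle="--", label="Perfectly calibrated")
    plt.xlabel("Mean predicted probability")
    plt.ylabel("Fraction of positives")
    plt.legend()
    return brier
```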

Fig 10. Calibration curves illustrating the reliability of predicted probabilities for different models within a five-fold cross-validation framework.

Subfigures (a)–(e) show individual fold results, while (f) presents the overall calibration.

https://doi.org/10.1371/journal.pone.0340808.g010

3.7 Ablation study

To validate the effectiveness of our architectural design choices, we conducted a comprehensive ablation study examining the contribution of each component. As shown in Table 5, the full model combining SAIL blocks in early layers, convolutional blocks in deeper layers, and Channel-Wise Attention Modules (CWAM) achieves the highest performance at 97.17% validation and 96.87% test accuracy. Removing SAIL entirely and using only convolutional layers results in a 0.70% performance drop (96.17%), while replacing all convolutional blocks with SAIL layers causes a more significant degradation to 94.08%, demonstrating that the hybrid architecture effectively leverages the complementary strengths of both approaches.

Table 5. Ablation study on the contribution of SAIL, CWAM (SE), and the hybrid layering strategy.

All models are trained for a maximum of 50 epochs with early stopping (patience = 15).

https://doi.org/10.1371/journal.pone.0340808.t005

The attention mechanism proves crucial for performance, as removing CWAM (SE) reduces accuracy to 95.67%. Furthermore, isolating components reveals that SAIL blocks alone in early layers without attention achieve 95.83%, while using only convolutional blocks in deep layers maintains 94.50%, confirming that both the strategic placement of SAIL in early feature extraction stages and the integration of attention mechanisms are essential. These results validate our hybrid layering strategy, where SAIL captures complex spectral-spatial patterns in shallow layers while conventional convolutions provide computational efficiency in deeper layers, with CWAM enhancing discriminative feature learning throughout the network.

Fig 11 illustrates both correct and incorrect predictions, including a clear example of a detected malignant tumor, thereby providing insight into the model’s decision-making and localization behavior. The top row displays four examples of correct predictions (two benign, two malignant), with true labels, predicted labels, and confidence scores shown in green. The bottom row shows four examples of incorrect predictions, with true labels, predicted labels, and confidence scores shown in red.

3.8 Comparison with SOTA techniques

Table 6 presents a comprehensive performance comparison between the proposed AIN and existing state-of-the-art deep learning methods for breast cancer classification. The comparison demonstrates AIN’s superior performance across different architectural paradigms. The CNN-SVM hybrid approach by Edrees Almalki et al. achieved 93.6% accuracy by combining convolutional feature extraction with support vector machine classification, representing traditional machine learning integration with deep features. Chen et al.’s implementation of GoogLeNet, leveraging inception modules for multi-scale feature extraction, achieved 96.37% accuracy, demonstrating the effectiveness of multi-path architectures in medical imaging. Chaudhury et al.’s SqueezeNet implementation, designed for computational efficiency through fire modules, achieved 90.3% accuracy, highlighting the trade-off between model complexity and performance. The Inception-V3 CNN approach by Nadkarni et al. achieved 89.75% accuracy, utilizing deep inception architectures with auxiliary classifiers for improved gradient flow. In contrast, the proposed AIN achieves 97% accuracy, representing a notable improvement of 0.63% over the best-performing baseline (GoogLeNet) and significant margins over other approaches (3.4% over CNN-SVM, 6.7% over SqueezeNet, and 7.25% over Inception-V3). This superior performance can be attributed to AIN’s novel integration of spatially-adaptive involution layers that generate content-aware kernels for capturing fine-grained spatial variations in breast MRI images, combined with channel-wise attention mechanisms that enhance feature discriminability. The hybrid architecture’s strategic deployment of adaptive operations in early layers for spatial feature extraction, transitioning to conventional convolutions in deeper layers for semantic pattern recognition, enables optimal balance between adaptability and computational efficiency.

Table 6. Performance comparison of AdaptiveInvolutionNet with state-of-the-art methods for breast cancer classification.

https://doi.org/10.1371/journal.pone.0340808.t006

4 Conclusion

This study presents AdaptiveInvolutionNet (AIN), a spatially-adaptive hybrid deep learning framework for breast MRI tumor classification. By leveraging involution layers for fine-grained spatial feature extraction, channel-wise attention for selective feature amplification, and residual connections for stable deep learning, AIN consistently outperforms conventional CNNs, achieving 97% test accuracy with robust cross-validation performance. The well-calibrated predictions and low Brier scores underscore its reliability for clinical applications.

Beyond accuracy, AIN has the potential to reduce diagnostic delays, assist radiologists in interpreting large MRI datasets, and minimize false positives and negatives in breast cancer screening. However, the model’s evaluation was limited to a single dataset; future work will involve testing on multi-center, heterogeneous datasets, integrating multi-modal imaging (e.g., mammography, ultrasound), and exploring real-time deployment in clinical workflows. These future directions will further establish AIN as a practical and scalable tool for automated breast cancer diagnosis.

References

  1. Alam T, Shia W-C, Hsu F-R, Hassan T. Improving breast cancer detection and diagnosis through semantic segmentation using the Unet3+ deep learning framework. Biomedicines. 2023;11(6):1536. pmid:37371631
  2. Yurttakal AH, Erbay H, İkizceli T, Karaçavuş S. Detection of breast cancer via deep convolution neural networks using MRI images. Multimed Tools Appl. 2019;79(21–22):15555–73.
  3. Duffy SW, Vulkan D, Cuckle H, Parmar D, Sheikh S, Smith RA, et al. Effect of mammographic screening from age 40 years on breast cancer mortality (UK Age trial): final results of a randomised, controlled trial. Lancet Oncol. 2020;21(9):1165–72. pmid:32800099
  4. Akhtar N, Pant H, Dwivedi A, Jain V, Perwej Y. A breast cancer diagnosis framework based on machine learning. IJSRSET. 2023:2395–1990.
  5. Zuluaga-Gomez J, Al Masry Z, Benaggoune K, Meraghni S, Zerhouni N. A CNN-based methodology for breast cancer diagnosis using thermal images. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization. 2020;9(2):131–45.
  6. Shah SH, Iqbal MJ, Ahmad I, Khan S, Rodrigues JJPC. Optimized gene selection and classification of cancer from microarray gene expression data using deep learning. Neural Comput & Applic. 2020;37(35–36):29049–60.
  7. Gouda W, Almurafeh M, Humayun M, Jhanjhi NZ. Detection of COVID-19 based on chest X-rays using deep learning. Healthcare (Basel). 2022;10(2):343. pmid:35206957
  8. Hofmann B. Informing about mammographic screening: ethical challenges and suggested solutions. Bioethics. 2020;34(5):483–92. pmid:31633832
  9. Ma L, Guo H, Zhao Y, Liu Z, Wang C, Bu J, et al. Liquid biopsy in cancer current: status, challenges and future prospects. Signal Transduct Target Ther. 2024;9(1):336. pmid:39617822
  10. Adam R, Dell’Aquila K, Hodges L, Maldjian T, Duong TQ. Deep learning applications to breast cancer detection by magnetic resonance imaging: a literature review. Breast Cancer Res. 2023;25(1):87. pmid:37488621
  11. Zhang M, Young GS, Chen H, Li J, Qin L, McFaline-Figueroa JR, et al. Deep-learning detection of cancer metastases to the brain on MRI. J Magn Reson Imaging. 2020;52(4):1227–36. pmid:32167652
  12. Wekking D, Porcu M, De Silva P, Saba L, Scartozzi M, Solinas C. Breast MRI: clinical indications, recommendations, and future applications in breast cancer diagnosis. Curr Oncol Rep. 2023;25(4):257–67. pmid:36749493
  13. Arnold TC, Freeman CW, Litt B, Stein JM. Low-field MRI: clinical promise and challenges. J Magn Reson Imaging. 2023;57(1):25–44. pmid:36120962
  14. Teixeira PAG, Kessler H, Morbée L, Douis N, Boubaker F, Gillet R, et al. Mineralized tissue visualization with MRI: practical insights and recommendations for optimized clinical applications. Diagn Interv Imaging. 2025;106(5):147–56. pmid:39667997
  15. Abdelaziz Ismael SA, Mohammed A, Hefny H. An enhanced deep learning approach for brain cancer MRI images classification using residual networks. Artif Intell Med. 2020;102:101779. pmid:31980109
  16. Hirsch L, Huang Y, Makse HA, Martinez DF, Hughes M, Eskreis-Winkler S, et al. Early detection of breast cancer in MRI using AI. Acad Radiol. 2025;32(3):1218–25. pmid:39482209
  17. Díaz O, Rodríguez-Ruíz A, Sechopoulos I. Artificial Intelligence for breast cancer detection: technology, challenges, and prospects. European Journal of Radiology. 2024;175:111457.
  18. Elkorany AS, Marey M, Almustafa KM, Elsharkawy ZF. Breast cancer diagnosis using support vector machines optimized by whale optimization and dragonfly algorithms. IEEE Access. 2022;10:69688–99.
  19. Bilal A, Imran A, Baig TI, Liu X, Abouel Nasr E, Long H. Breast cancer diagnosis using support vector machine optimized by improved quantum inspired grey wolf optimization. Scientific Reports. 2024;14(1):10714.
  20. Jackins V, Vimal S, Kaliappan M, Lee MY. AI-based smart prediction of clinical disease using random forest classifier and Naive Bayes. J Supercomput. 2020;77(5):5198–219.
  21. Nassif AB, Talib MA, Nasir Q, Afadar Y, Elgendy O. Breast cancer detection using artificial intelligence techniques: a systematic literature review. Artificial Intelligence in Medicine. 2022;127:102276.
  22. Amethiya Y, Pipariya P, Patel S, Shah M. Comparative analysis of breast cancer detection using machine learning and biosensors. Intelligent Medicine. 2022;2(2):69–81.
  23. Eldin Rashed AE, Elmorsy AM, Mansour Atwa AE. Comparative evaluation of automated machine learning techniques for breast cancer diagnosis. Biomedical Signal Processing and Control. 2023;86:105016.
  24. Alshayeji MH, Ellethy H, Abed SE, Gupta R. Computer-aided detection of breast cancer on the wisconsin dataset: an artificial neural networks approach. Biomedical Signal Processing and Control. 2022;71:103141.
  25. Shah A, Shah M, Pandya A, Sushra R, Sushra R, Mehta M, et al. A comprehensive study on skin cancer detection using artificial neural network (ANN) and convolutional neural network (CNN). Clinical eHealth. 2023;6:76–84.
  26. S. P, Al-Turjman F, Stephan T. An automated breast cancer diagnosis using feature selection and parameter optimization in ANN. Computers & Electrical Engineering. 2021;90:106958. https://doi.org/10.1016/j.compeleceng.2020.106958
  27. Alzboon MS, Qawasmeh S, Alqaraleh M, Abuashour A, Bader AF, Al-Batah M. Machine learning classification algorithms for accurate breast cancer diagnosis. In: 2023 3rd International Conference on Emerging Smart Technologies and Applications (eSmarTA). 2023. p. 1–8. https://doi.org/10.1109/esmarta59349.2023.10293415
  28. Bhise S, Gadekar S, Gaur AS, Bepari S, Deepmala Kale DSA. Breast cancer detection using machine learning techniques. International Journal of Engineering Research and Technology. 2021;10(7):2278–0181.
  29. Aggarwal R, Sounderajah V, Martin G, Ting DSW, Karthikesalingam A, King D, et al. Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis. NPJ Digit Med. 2021;4(1):65. pmid:33828217
  30. Melekoodappattu JG, Dhas AS, Kandathil BK, Adarsh KS. Breast cancer detection in mammogram: combining modified CNN and texture feature based approach. J Ambient Intell Human Comput. 2022;14(9):11397–406.
  31. Alanazi SA, Kamruzzaman MM, Islam Sarker MN, Alruwaili M, Alhwaiti Y, Alshammari N, et al. Boosting breast cancer detection using convolutional neural network. J Healthc Eng. 2021;2021:5528622. pmid:33884157
  32. Hu Q, Whitney HM, Giger ML. A deep learning methodology for improved breast cancer diagnosis using multiparametric MRI. Sci Rep. 2020;10(1):10536. pmid:32601367
  33. Wang J, Liu Q, Xie H, Yang Z, Zhou H. Boosted EfficientNet: detection of lymph node metastases in breast cancer using convolutional neural networks. Cancers (Basel). 2021;13(4):661. pmid:33562232
  34. Parekh VS, Macura KJ, Harvey SC, Kamel IR, Ei-Khouli R, Bluemke DA, et al. Multiparametric deep learning tissue signatures for a radiological biomarker of breast cancer: preliminary results. Med Phys. 2020;47(1):75–88. pmid:31598978
  35. Mokni R, Haoues M. CADNet157 model: fine-tuned ResNet152 model for breast cancer diagnosis from mammography images. Neural Comput & Applic. 2022;34(24):22023–46.
  36. Prakash SS, Visakha K. Breast cancer malignancy prediction using deep learning neural networks. In: 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA). 2020. p. 88–92.
  37. Houssein EH, Emam MM, Ali AA. An optimized deep learning architecture for breast cancer diagnosis based on improved marine predators algorithm. Neural Comput Appl. 2022;34(20):18015–33. pmid:35698722
  38. Desai M, Shah M. An anatomization on breast cancer detection and diagnosis employing multi-layer perceptron neural network (MLP) and Convolutional neural network (CNN). Clinical eHealth. 2021;4:1–11.
  39. Shen T, Wang J, Gou C, Wang F-Y. Hierarchical fused model with deep learning and type-2 fuzzy learning for breast cancer diagnosis. IEEE Trans Fuzzy Syst. 2020;28(12):3204–18.
  40. Majumdar S, Pramanik P, Sarkar R. Gamma function based ensemble of CNN models for breast cancer detection in histopathology images. Expert Systems with Applications. 2023;213:119022.
  41. Awotunde JB, Panigrahi R, Khandelwal B, Garg A, Bhoi AK. Breast cancer diagnosis based on hybrid rule-based feature selection with deep learning algorithm. Research on Biomedical Engineering. 2023;:1–13.
  42. da Silva DS, Nascimento CS, Jagatheesaperumal SK, Albuquerque VHC de. Mammogram image enhancement techniques for online breast cancer detection and diagnosis. Sensors (Basel). 2022;22(22):8818. pmid:36433415
  43. Zebari DA, Ibrahim DA, Zeebaree DQ, Haron H, Salih MS, Damaševičius R. Systematic review of computing approaches for breast cancer detection based computer aided diagnosis using mammogram images. Applied Artificial Intelligence. 2021;35(15):2157–203.
  44. Oyelade ON, Ezugwu AE. A novel wavelet decomposition and transformation convolutional neural network with data augmentation for breast cancer detection using digital mammogram. Sci Rep. 2022;12(1):5913. pmid:35396565
  45. Pattanaik RK, Mishra S, Siddique M, Gopikrishna T, Satapathy S. Breast cancer classification from mammogram images using extreme learning machine-based DenseNet121 model. Journal of Sensors. 2022;2022:1–12.
  46. Liu Y, Zhang Y, Wang Y, Hou F, Yuan J, Tian J, et al. A survey of visual transformers. IEEE Trans Neural Netw Learn Syst. 2024;35(6):7478–98. pmid:37015131
  47. Han K, Wang Y, Chen H, Chen X, Guo J, Liu Z, et al. A survey on vision transformer. IEEE Trans Pattern Anal Mach Intell. 2023;45(1):87–110. pmid:35180075
  48. Tummala S, Kim J, Kadry S. BreaST-Net: multi-class classification of breast cancer from histopathological images using ensemble of swin transformers. Mathematics. 2022;10(21):4109.
  49. Abimouloud ML, Bensid K, Elleuch M, Ammar MB, Kherallah M. Vision transformer based convolutional neural network for breast cancer histopathological images classification. Multimed Tools Appl. 2024.
  50. Jahan I, Chowdhury MEH, Vranic S, Al Saady RM, Kabir S, Pranto ZH, et al. Deep learning and vision transformers-based framework for breast cancer and subtype identification. Neural Comput & Applic. 2025;37(16):9311–30.
  51. Fiaz A, Raza B, Faheem M, Raza A. A deep fusion-based vision transformer for breast cancer classification. Healthc Technol Lett. 2024;11(6):471–84. pmid:39720758
  52. He C, Diao Y, Ma X, Yu S, He X, Mao G, et al. A vision transformer network with wavelet-based features for breast ultrasound classification. Image Anal Stereol. 2025;43(2):185–94.
  53. Jahan I, Chowdhury MEH, Vranic S, Al Saady RM, Kabir S, Pranto ZH, et al. Deep learning and vision transformers-based framework for breast cancer and subtype identification. Neural Comput & Applic. 2025;37(16):9311–30.
  54. Hayat M, Ahmad N, Nasir A, Ahmad Tariq Z. Hybrid deep learning EfficientNetV2 and Vision Transformer (EffNetV2-ViT) model for breast cancer histopathological image classification. IEEE Access. 2024;12:184119–31.
  55. Nayak DR, et al. RDTNet: a residual deformable attention based transformer network for breast cancer classification. Expert Systems with Applications. 2024;249:123569.
  56. Edrees Almalki Y, Shaf A, Ali T, Aamir M, Khalid Alduraibi S, Mohessen Almutiri S. Breast cancer detection in Saudi Arabian women using hybrid machine learning on mammographic images. Comput Mater Contin. 2022;72(3):4833–51.
  57. Chen S-H, Wu Y-L, Pan C-Y, Lian L-Y, Su Q-C. Breast ultrasound image classification and physiological assessment based on GoogLeNet. Journal of Radiation Research and Applied Sciences. 2023;16(3):100628.
  58. Chaudhury S, Sau K, Khan MA, Shabaz M. Deep transfer learning for IDC breast cancer detection using fast AI technique and Sqeezenet architecture. Math Biosci Eng. 2023;20(6):10404–27. pmid:37322939
  59. Nadkarni S, Noronha K. Classification of mammographic images using convolutional neural networks. In: 2023 IEEE Engineering Informatics. IEEE; 2023.