GL-Net: A knowledge-guided Gaussian-gated and layered refinement network for 3D MRI segmentation of brain gliomas

Huimin Lu; Yilong Wang; Han Xue; Guizeng Wang; Jamshid Moradi Kurdestany; Songzhe Ma

doi:10.1371/journal.pone.0351953

Abstract

Glioblastoma is a highly malignant brain tumor, and accurate lesion segmentation in MRI is essential for diagnosis, treatment planning, and prognosis assessment. This paper proposes a knowledge-guided 3D hybrid Transformer-CNN framework, GL-Net, which integrates prior knowledge through a Gaussian Gating Module (GGM) and a Layered Refinement Module (LRM), together with a novel Edge-Region Voxel Dynamic Weighted Loss Function. These modules collaboratively enhance feature activation, refine label-specific structures, and improve edge delineation, enabling robust segmentation even under limited-sample conditions. The proposed GL-Net was evaluated on the BraTS2019 and BraTS2021 datasets, achieving average Dice Similarity Coefficients (DSC) of 0.877 and 0.913, and Hausdorff Distances (HD) of 1.83 and 1.55, respectively—demonstrating highly competitive performance and a substantial reduction in boundary errors relative to the reported benchmarks of current data-driven approaches. Furthermore, to assess its clinical applicability, VASARI (Visually Accessible Rembrandt Images) feature extraction was performed using both the GL-Net-generated segmentation masks and the ground truth labels on the BraTS2019 dataset for glioblastoma (GBM) diagnosis. The diagnostic performances were nearly identical (GT AUC: 0.954 / GL-Net AUC: 0.949), and the DeLong test (p = 0.99) indicated no statistically significant difference between the two. These results suggest that GL-Net not only achieves highly competitive segmentation accuracy but also produces radiomic features comparable to expert manual annotations, providing complementary evidence of its potential clinical relevance. The proposed framework shows strong clinical potential for precise and consistent glioma delineation, providing valuable support for surgical planning, radiotherapy targeting, and diagnostic decision-making in clinical workflows.

Citation: Lu H, Wang Y, Xue H, Wang G, Kurdestany JM, Ma S (2026) GL-Net: A knowledge-guided Gaussian-gated and layered refinement network for 3D MRI segmentation of brain gliomas. PLoS One 21(6): e0351953. https://doi.org/10.1371/journal.pone.0351953

Editor: Taikyeong Ted Jeong, Hallym University, KOREA, REPUBLIC OF

Received: August 11, 2025; Accepted: June 3, 2026; Published: June 22, 2026

Copyright: © 2026 Lu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The datasets used in this article are the brain tumor MRI datasets (BraTS2019 and BraTS2021) published by the Perelman School of Medicine at the University of Pennsylvania and can be found at: https://www.kaggle.com/datasets/debobratachakraborty/brats2019-dataset and https://www.kaggle.com/datasets/victorfernandezalbor/brats2021dataset.

Funding: This research is supported by the Development Special Project of Jilin Provincial Development and Reform Commission in 2023 (No. 2023C042-6). There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Glioblastoma is a complex and severe neurological disease that profoundly affects patients’ quality of life. In clinical practice, the characterization and treatment planning of gliomas largely rely on radiological interpretation based on standardized assessment systems. The VASARI (Visually Accessible Rembrandt Images) feature set [1,2] provides a structured vocabulary for describing glioma morphology, including enhancement patterns, necrosis, edema, and mass effect. Although these frameworks offer important guidance for clinical decision-making, they still depend heavily on expert visual assessment and manual delineation, which are time-consuming and subject to considerable inter-observer variability. With the rapid development of deep learning, medical imaging analysis has entered a new era [3]. Among deep learning models, CNN and Transformer architectures have gained considerable attention due to their exceptional feature-learning capabilities and ability to model complex relationships [4]. These models can learn from large volumes of medical imaging data, automatically extract features relevant to glioblastoma, and subsequently perform downstream computations [5,6], offering new perspectives for glioblastoma segmentation.

However, in practical semantic segmentation applications, the high heterogeneity and complexity of glioblastoma pose significant challenges. Models that rely solely on data-driven approaches often fail to achieve optimal performance, particularly in the following aspects: (1) The imaging characteristics of glioblastoma are highly diverse, with complex morphological and signal intensity variations on MRI. These factors make it difficult for algorithms based solely on image feature recognition to perform accurately; (2) Due to the relatively limited number of available cases, i.e., the small sample problem, training datasets are often insufficient to capture all potential patterns of disease variation. This frequently leads to overfitting and poor generalization in models [7], reducing accuracy in real-world predictions, especially when dealing with blurred tumor boundaries or overlapping regions.

To address these challenges, incorporating prior knowledge is critically important [8]. Prior knowledge may include information about the morphology and biological characteristics of glioblastoma, as well as domain expertise in the medical field [9]. By integrating such knowledge into deep learning models to enhance their regulatory constraints, the models can better understand the characteristics of glioblastoma, thereby improving the accuracy and robustness of segmentation results.

This study aims to develop a small-sample glioblastoma segmentation algorithm that integrates deep learning with prior knowledge. To this end, we designed GL-Net, which enables more efficient recognition and segmentation of glioblastoma and provides more precise and reliable support for clinical diagnosis and treatment. Furthermore, we applied the model-generated segmentation results to the VASARI framework for GBM characterization and compared them with the gold-standard segmentations. This comparison highlights the potential clinical relevance and translational value of the proposed model. The main innovations of this study are as follows:

(1) GL-Net: By combining data-driven and knowledge-driven approaches, an innovative framework based on a hybrid structure of Transformer and CNN was designed. Prior knowledge is integrated into the model through the Gaussian Gating Module (GGM), Layered Refinement Module (LRM), and Edge-Region Voxel Dynamic Weighted Loss Function. This effectively leverages prior knowledge to enhance the overall performance of the model.
(2) Gaussian Gating Module: A secondary feature optimization module based on Gaussian functions was designed to effectively activate secondary features while suppressing overly sensitive or insensitive features. By leveraging the modulation capabilities of Gaussian functions, the module enhances the neural network’s ability to identify under-activated regions, such as the blurred edge areas of glioblastoma.
(3) Layered Refinement Module: A Layered Refinement Module based on label priors is proposed to progressively refine the prediction results for different label types, identifying and correcting mispredictions. This module employs a detail-aware extractor (DAE) that utilizes deep supervision and multi-scale feature fusion to extract and refine detailed features, thereby improving segmentation accuracy.
(4) Edge-Region Voxel Dynamic Weighted Loss Function: A novel loss function, DiceBRD loss, is introduced to focus more on the voxels in edge regions. Through a dynamic weighting mechanism, it alleviates the class imbalance problem. By adaptively adjusting the penalty for misclassified voxels in edge regions, this loss function enhances model accuracy in medical image segmentation tasks.
(5) The VASARI framework: The proposed GL-Net model is applied to the VASARI framework for GBM characterization, which is a comprehensive and reliable method for evaluating GBM. By comparing the model-generated segmentation results with the gold-standard segmentations, the potential clinical relevance of the proposed model is demonstrated.

The remaining sections of this paper are organized as follows: In the Related Work section, we introduce relevant studies on semantic segmentation algorithms that are purely data-driven. The Methods section details the experimental methods used in this study, including the overall framework of the model, the workflow, and the design details of the proposed innovative modules. The Experiment and Analysis section presents the main evaluation metrics, the datasets used, detailed experimental procedures, and a comparative analysis of the results.

Related work

The encoder-decoder structure of CNN models like U-Net

In previous studies, convolutional neural network (CNN)-based architectures have dominated data-driven medical image semantic segmentation, and models such as U-Net [10], which adopt an encoder-decoder structure, have demonstrated particularly strong performance. The U-Net model achieves semantic segmentation of medical images through a symmetric encoder and decoder structure. The encoder extracts deep semantic features, progressively downsampling to capture global contextual information, while the decoder progressively upsamples to restore image resolution and reconstruct spatial details. Skip connections transfer high-resolution features between the encoder and decoder, ensuring that local details are preserved and improving the precision of edge segmentation. This model is particularly well-suited for small-sample medical image datasets, as it can capture global information while retaining local details.

In recent years, researchers have continuously improved and extended U-Net to address the challenges in the field of glioma segmentation. Some researchers have focused on introducing attention mechanisms to enhance segmentation performance by capturing significant regions. For example, Xu et al. [11] designed a novel U-Net by incorporating a corner attention module, which efficiently extracts dimensional information between slices and strengthens contextual connections, thus enhancing the network’s representational capacity. Schlemper et al. [12] proposed an Attention Gate (AG) model for medical image analysis, which highlights salient features useful for specific tasks. Akbar et al. [13] introduced a multi-path residual attention module based on the U-Net architecture, adding an attention gating mechanism in the skip connections to increase the model’s focus on target regions and reduce attention to non-target regions. Jia et al. [14] proposed an end-to-end glioma segmentation algorithm based on 3D U-Net, incorporating a coordinate attention module to enhance the ability to capture local texture features and global positional information. These improvements have significantly improved the network’s ability to accurately identify and segment regions of interest in glioma processing, providing strong support for glioma analysis.

Transformer and its combined model with CNN

The Transformer [15] is a model architecture based on the Multi-Head Attention (MHA) mechanism. Through parallel processing and self-attention mechanisms, it can more effectively capture long-range dependencies in sequences. As a result, researchers have started using Transformer to explore its potential in extracting information across different modalities for glioma segmentation.

For example, Lin et al. [16] grouped input modalities into two categories based on MRI imaging principles and named the model CKD-TransBTS. This model leverages the Transformer’s advantages, capturing local lesion boundaries and extracting long-range features from 3D images. Hu et al. [17] proposed an efficient R-Transformer dual-encoder network, which captures complex semantic features and global contextual information by constructing a feature branch and a patch branch. Li et al. [18] introduced a new DenseTrans network that utilizes the shifted window operation of the Swin Transformer to obtain global feature information and long-range dependency modeling capabilities. Yang et al. [19] proposed a flexible multimodal glioma segmentation fusion network, which employs two Transformer-based feature learning networks and a cross-modal shared learning network to extract both individual and shared feature vectors, improving glioma segmentation performance using multimodal images. Ting et al. [20] introduced a multimodal Transformer to model correlations between multimodal features, progressively integrating multimodal and multilevel features for glioma segmentation using spatial and channel self-attention modules.

The aforementioned algorithms fully integrate the Transformer’s ability to capture global contextual information without being constrained by the receptive field, enabling better feature and structural information extraction in glioma image segmentation tasks. By employing methods such as the cross-fusion of global and local features, improvements to skip connection structures, and the integration of multimodal information, the accuracy and generalization capability of segmentation algorithms can be enhanced. While the self-attention mechanism in the Transformer captures global information well, it is less effective in handling local details. In images, pixels in local regions often exhibit strong correlations, and when processing images, the Transformer may not effectively capture this local correlation, leading to insufficient utilization of information.

Consequently, some researchers have developed hybrid architectures that integrate CNNs and Transformers. For example, Wang et al. [21] were the first to propose a novel glioma segmentation model called TransBTS, which combines CNNs and Transformers to extract spatial features and capture long-range dependencies. Liang et al. [22] proposed a TransConver model based on CNNs and Transformers, achieving cross-fusion of global and local features while improving skip connection structures to mitigate the semantic gap between encoder and decoder features for better fusion. Zhu et al. [23] introduced a glioma segmentation algorithm that utilizes multimodal MRI data and integrates semantic and edge information through deep learning techniques, aiming to fully leverage multimodal information.

In handling glioma segmentation tasks, these algorithms combine the advantages of classic deep learning models like CNNs, resulting in models with improved feature extraction and spatial information processing capabilities. By incorporating traditional algorithms, these models achieve a more comprehensive analysis and interpretation of medical images. However, the performance of CNN-Transformer architectures often relies on large amounts of annotated data. In the field of medical imaging, obtaining large quantities of annotated data can be challenging. Nevertheless, algorithms that incorporate prior knowledge can effectively address this issue. By leveraging the expertise and experience of medical professionals, models can be guided during training to better understand and analyze images, thus improving segmentation accuracy.

Loss functions and the issue of class imbalance

In medical image analysis, tumor regions are significantly smaller than normal regions. Even within a single brain slice, the tumor area is much smaller than other parts of the brain, leading to class imbalance in medical images [24]. To effectively address this issue, researchers have worked on improving loss functions. Lin et al. [25] first proposed the focal loss for dense object detection; it introduces a modulating factor that alleviates the class imbalance problem and has since been adopted for image segmentation. Caliva et al. [26] used a distance map as the weight for cross-entropy, allowing the loss to focus on difficult-to-segment boundary regions. Kervadec et al. [27] developed a boundary loss that calculates region interfaces by integrating along boundaries, formulating it as a distance metric in contour space. Karimi et al. [28] introduced a loss function based on the Hausdorff Distance (HD) to directly minimize the HD between the model-generated contours and the ground truth. Liu et al. [24] proposed a multi-level structural loss that uses region, boundary, and pixel information to supervise feature fusion and achieve precise segmentation. Yeung et al. [29] proposed a Unified Focal Loss to address the issues of excessive hyperparameters and overly fast convergence in general focal loss. Du et al. [30] developed a boundary-sensitive loss function that automatically focuses on hard-to-segment boundaries, leading to more refined target delineation.

Although these algorithms help mitigate class imbalance to some extent, they still have limitations. For example, some algorithms rely on manually tuned hyperparameters or complex computational processes, increasing the difficulty of optimization. Additionally, certain methods do not sufficiently emphasize boundary regions, which can limit segmentation precision.

Materials and methods

Materials

The glioma MRI data used in this study were obtained from the standardized multi-center TCGA and TCIA repositories [33–35]. The BraTS2019 training set includes 259 high-grade glioma (HGG) and 76 low-grade glioma (LGG) cases, while the BraTS2021 dataset comprises 1,251 cases in total. The scans were collected from 13 independent medical institutions using MRI systems from GE, Siemens, and Philips, covering a magnetic field strength range of 0.5–3 T and incorporating diverse acquisition protocols (slice thickness: 1–3 mm, TR: 500–3000 ms, TE: 10–100 ms, matrix size: 128×128–512×512). This inherent clinical and technical heterogeneity enables GL-Net to learn tumor representations that are independent of specific scanners and acquisition protocols, thereby supporting its robust generalization under real-world imaging conditions.

Methods

Compared to natural images, extracting feature information from medical images is significantly more challenging, especially in multi-sequence medical imaging. Since optimal tumor image slices cannot always be guaranteed, our overall model strategy is built on 3D data to better adapt to practical applications (Fig 1). In this study, we designed GL-Net, a novel Gaussian-gated and layered refinement segmentation algorithm based on a hybrid Transformer-CNN backbone (T-CNN; Fig 2). To fully integrate layered feature information from the two branches, we introduce a Gaussian Gating Module (GGM) for secondary feature optimization, which fuses the feature representations from both sides. On the decoder side, we design a label-prior-based LRM, which progressively refines the whole tumor (WT), tumor core (TC), and enhancing tumor (ET) regions, ultimately achieving accurate segmentation. Additionally, we propose a new region-dynamic-weighted loss function that mitigates the class imbalance problem by adjusting the computation domain and enhancing the decoder’s focus on image edges.

Fig 1. Clinical integration workflow of GL-Net segmentation.

https://doi.org/10.1371/journal.pone.0351953.g001

Fig 2. Network architecture of GL-Net.

https://doi.org/10.1371/journal.pone.0351953.g002

Gaussian gating module for secondary feature optimization.

To effectively utilize the unique features of T-CNN, it is essential to address the issues of noise in layered features and the compatibility of intrinsic differences. To achieve this goal, this paper introduces a Gaussian Gating Module for secondary feature optimization. The Gaussian function enhances secondary features while suppressing both the most and least sensitive features [31]. In deep learning-based medical image segmentation, secondary features refer to regions that are not fully activated during the neural network’s learning process. As shown in Fig 3, these unactivated regions are primarily distributed along the fuzzy boundary areas of gliomas. To effectively activate secondary features, this paper proposes a Gaussian Gating Module for secondary feature optimization leveraging the modulation properties of Gaussian functions, as illustrated in Fig 4.

Fig 3. Activation mapping of glioma region.

https://doi.org/10.1371/journal.pone.0351953.g003

Fig 4. Gaussian gating module for secondary feature optimization.

https://doi.org/10.1371/journal.pone.0351953.g004

First, the feature map extracted by MHA and the feature map extracted by the convolution kernel are each passed through a 3 × 3 convolution that reduces the number of channels, which helps alleviate the computational burden of subsequent operations. To ensure spatial alignment between the two feature maps, the lower-resolution one undergoes an upsampling operation. Next, the two feature maps are merged through a concatenation operation, followed by average pooling (AvgPool) and max pooling (MaxPool) operations to obtain an average-pooled and a max-pooled feature map, respectively. Then, the output is passed through a 3 × 3 convolutional layer, and the resulting feature map is further processed using a Gaussian function to obtain the normalized weight. This weight reactivates the secondary features of both branches, compensating for the lost edge features. Finally, both feature maps are multiplied by the normalized weight. The above process can be expressed by Equations (1) to (5):

(1)

(2)

(3)

(4)

(5)

Next, the activated feature maps are concatenated. To preserve the original information of the two input feature representations, the activated features are concatenated again with the original features, followed by a convolution operation to generate the fused feature representation. The above process can be expressed by Equation (6):

(6)

In addition, to retain the contextual information from the previous layer in the encoder, the fused feature can be combined with the output of the Gaussian Gating Module from the previous layer through a concatenation operation. Finally, the features of both are extracted through a convolutional layer. This process can be expressed by Equation (7):

(7)

In Equation (7), the two terms are the outputs of the Gaussian Gating Module at the current level and the previous level, respectively.
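The gating idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `gaussian_gate`, the parameter `sigma`, the channel-wise pooling, the averaging step that stands in for the learned 3 × 3 convolution, and the exact Gaussian form (centred on the mean response) are all assumptions made for illustration.

```python
import numpy as np

def gaussian_gate(f_t, f_c, sigma=1.0):
    """Illustrative Gaussian gate for two aligned feature maps of shape (C, D, H, W).

    f_t: Transformer-branch features, f_c: CNN-branch features (names are hypothetical).
    """
    fused = np.concatenate([f_t, f_c], axis=0)   # channel-wise concatenation
    avg = fused.mean(axis=0)                     # average pooling over channels (assumed axis)
    mx = fused.max(axis=0)                       # max pooling over channels (assumed axis)
    s = 0.5 * (avg + mx)                         # stand-in for the learned 3 x 3 convolution
    z = (s - s.mean()) / (s.std() + 1e-8)        # standardise the response map
    w = np.exp(-(z ** 2) / (2.0 * sigma ** 2))   # Gaussian weight in (0, 1], peak at mid responses
    return f_t * w, f_c * w, w                   # re-weighted branches and the gate itself

# Tiny demo: responses ramp from 0 to 1, so the middle voxel gets the largest weight.
ramp = np.linspace(0.0, 1.0, 5).reshape(1, 1, 1, 5)
gated_t, gated_c, weight = gaussian_gate(ramp, ramp)
```

Under these assumptions, voxels with very low or very high responses receive small weights, while moderately activated voxels (the "secondary features") are emphasized, which matches the behaviour the text attributes to the Gaussian function.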

Layered refinement module based on label priors.

To address the impact of unclear boundaries and overlapping regions, this paper introduces an LRM. Inspired by the labeling format of gliomas, it is designed to progressively refine the prediction results for different types of labels, as shown in Fig 5. The purpose of this module is to identify and correct mispredictions. It takes deep features and current layer features as input and outputs the refined features along with the current layer’s prediction results.

Fig 5. LRM based on label prior.

https://doi.org/10.1371/journal.pone.0351953.g005

First, the deep features and their components undergo convolution operations that reduce the number of channels to 1, yielding the corresponding predicted maps. Second, deep supervision is applied to the generated predicted maps, guiding the detail-aware extraction module to refine the detail features of the corresponding predictions. Finally, upsampling and activation operations are performed sequentially to obtain the corresponding activation values. The specific process is described in Equation (8):

(8)

Equation (8) takes the deep features of the (i + 1)-th layer as input and yields the activation value for refining the j-th target region, where j takes values from {WT, TC, ET}. The activation function is the Sigmoid function.

Second, the WT activation value is multiplied with the current-layer feature to generate the foreground attention feature for WT. This feature is then fed into the DAE to perform detail extraction, yielding the detail feature for WT. The TC activation value is applied in the same way to obtain the foreground attention feature for TC, from which the DAE extracts the detail feature for TC. The detail feature for ET is obtained similarly. This process can be expressed by Equation (9).

(9)

Equation (9) uses the detail feature produced by the previous DAE stage for region j, where j takes values sequentially from {WT, TC, ET}.

Next, convolution and upsampling operations are applied to the deep feature, and the result is added to the features extracted by the DAE. The sum is then normalized and nonlinearly activated by the BR layer. This process can be expressed by Equations (10) to (14).

(10)

(11)

(12)

(13)

(14)

In these equations, BR denotes the BN + ReLU operation.
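The cascade from WT to TC to ET can be sketched as follows. This is a simplified illustration, not the published module: `layered_refinement`, `upsample2x`, and `dae_placeholder` are hypothetical names, the DAE is replaced by an identity function, nearest-neighbour upsampling is assumed, and it is assumed that each stage gates the detail feature produced by the previous stage.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def upsample2x(x):
    """Nearest-neighbour 2x upsampling over the last three (spatial) axes."""
    for axis in (-3, -2, -1):
        x = np.repeat(x, 2, axis=axis)
    return x

def dae_placeholder(x):
    """Hypothetical stand-in for the detail-aware extractor (identity here)."""
    return x

def layered_refinement(deep_logits, current_feat, dae=dae_placeholder):
    """deep_logits: dict region -> (D/2, H/2, W/2) one-channel prediction logits
    from the deeper layer; current_feat: (C, D, H, W) features of the current layer."""
    feat = current_feat
    details = {}
    for region in ("WT", "TC", "ET"):                    # coarse-to-fine label order
        attn = sigmoid(upsample2x(deep_logits[region]))  # activation value in (0, 1)
        feat = dae(attn[None] * feat)                    # foreground attention, then detail extraction
        details[region] = feat
    return details

# Demo with random logits and constant features.
rng = np.random.default_rng(0)
logits = {r: rng.normal(size=(2, 2, 2)) for r in ("WT", "TC", "ET")}
out = layered_refinement(logits, np.ones((4, 4, 4, 4)))
```

With the identity placeholder, each stage can only attenuate the features passed to it, which mirrors the nesting of the labels (ET inside TC inside WT); a learned DAE would additionally reshape the features at every stage.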

Multi-scale feature fusion DAE: The input features are first processed through a convolution kernel that reduces the number of channels, which helps reduce the computational burden of subsequent operations. This is followed by a batch normalization layer and a ReLU operation. Four parallel branches are then established to provide various receptive fields for matching candidate regions of different sizes and shapes. In each branch, a dilated convolution kernel is used, with a specific dilation rate and padding size. These four branches are then merged to obtain the fused features. Finally, a convolution kernel, batch normalization layer, and ReLU operation are applied to fuse the concatenated features, resulting in the detailed feature vector, as shown in Fig 6.

Fig 6. Structure of the detail-aware extractor for multi-scale feature fusion.

https://doi.org/10.1371/journal.pone.0351953.g006
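To make the role of the four parallel dilated branches concrete, the following sketch applies the same 3-tap kernel at four dilation rates to a 1D signal. It is a simplification under stated assumptions: the dilation rates (1, 2, 4, 8), the averaging kernel, the 1D setting, and the mean used in place of the final learned convolution with batch normalization are all illustrative choices, and the function names are hypothetical.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """'Same'-padded 1D dilated convolution (cross-correlation) with an odd-length kernel."""
    k = len(kernel)
    pad = dilation * (k - 1) // 2            # padding that keeps the output length equal to len(x)
    xp = np.pad(x, pad)                      # zero padding on both sides
    out = np.zeros(len(x), dtype=float)
    for j, w in enumerate(kernel):
        out += w * xp[j * dilation : j * dilation + len(x)]
    return out

def detail_aware_extract(x, rates=(1, 2, 4, 8)):
    """Four parallel dilated branches followed by a simple fusion (assumed rates)."""
    kernel = np.ones(3) / 3.0                            # illustrative fixed kernel
    branches = np.stack([dilated_conv1d(x, kernel, r) for r in rates])
    fused = branches.mean(axis=0)                        # stand-in for the learned fusion convolution
    return np.maximum(fused, 0.0), branches              # ReLU on the fused features

# An impulse shows each branch's receptive field: taps at the centre and at +/- the dilation rate.
impulse = np.zeros(33)
impulse[16] = 1.0
fused, branches = detail_aware_extract(impulse)
```

Each branch sees the same number of taps but spreads them over a wider span as the dilation rate grows, so the fused output combines fine and coarse context without extra parameters per branch.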

Dynamic weighted loss function for edge region voxels.

In medical images, the presence of abundant noise and artifacts often leads to blurred target edge contours, which significantly affects the loss function. To address this issue, this paper introduces an edge-region voxel dynamic weighting loss function, called DiceBRD loss [32], as shown in Fig 7. This loss not only alleviates the class imbalance problem caused by the asymmetry between the number of foreground and background voxels but also precisely identifies the true edges by adaptively adjusting the penalty for misclassified voxels in edge regions. This improves the accuracy of deep neural networks in medical image segmentation.

Fig 7. Flowchart of the dynamic weighted loss function for edge region voxels.

Red, green, and white areas represent the predicted values, ground truth labels, and background, respectively.

https://doi.org/10.1371/journal.pone.0351953.g007

DiceBRD loss extracts the edges of both the predicted results and the ground truth labels during the training iterations as the regions for loss calculation. The union of these regions defines the region of interest (ROI) for recalculating the loss. The edge extraction process can be expressed using Equations (15) to (17).

(15)

(16)

(17)

Here, R represents the dynamically extracted boundary region of interest (ROI). To integrate both global topological structure and local boundary refinement, the proposed L_DiceBRD decouples the two learning objectives: it is formulated as a weighted sum of a global Dice loss (L_Dice), calculated over the entire image volume, and a Boundary Region-restricted Dynamic weighted cross-entropy loss (L_BRD), calculated strictly within the dynamic ROI R.

Specifically, the global L_Dice is defined as Equation (18):

(18)

The local boundary-restricted term L_BRD is defined as Equation (19):

(19)

Finally, the concise combined loss equation is explicitly formulated as Equation (20):

(20)

The symbols used in the above formulations are defined as follows:

V: The set of all voxels in the entire global image volume.
R: The dynamically extracted boundary region of interest ().
|R|: The total number of voxels strictly within the boundary region R.
C: The total number of segmentation classes (including the background).
: The ground truth one-hot encoding indicating whether voxel i belongs to class c.
: The predicted probability that voxel i belongs to class c.
: The dynamic adaptive weight, computed as the Euclidean distance from voxel i to the exact boundary, heavily penalizing boundary misclassifications.
: A small smoothing factor added to prevent division by zero.
: A hyperparameter balancing the contribution of the local boundary refinement against the global structural loss (in our implementation, is set to 1.0).

By defining the Dice term over the global domain V, the model effectively mitigates the severe class imbalance inherent in medical images. Conversely, because the weighted cross-entropy term (L_BRD) is restricted to the dynamically evolving boundary domain R and scaled by the balancing hyperparameter, the network is pushed to refine blurry contours without the loss gradients being overwhelmed by easy background voxels.
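The structure of the combined loss (a global Dice term plus a boundary-restricted weighted cross-entropy term) can be sketched in NumPy for the binary case. This is an illustration under assumptions, not the published formulation: the function names, the erosion-based edge extraction with a face-neighbour structuring element, the weight form `1 + distance`, the 0.5 threshold, and the brute-force distance computation are all choices made here for a small self-contained example.

```python
import numpy as np

def erode(mask):
    """Binary erosion with a face-neighbour (6-connected in 3D) structuring element."""
    m = np.pad(mask.astype(bool), 1, constant_values=False)
    out = m.copy()
    for axis in range(mask.ndim):
        out &= np.roll(m, 1, axis=axis) & np.roll(m, -1, axis=axis)
    return out[(slice(1, -1),) * mask.ndim]

def edge(mask):
    """Boundary voxels: foreground voxels removed by one erosion step."""
    mask = mask.astype(bool)
    return mask & ~erode(mask)

def dice_brd_loss(prob, gt, lam=1.0, eps=1e-6):
    """prob: foreground probabilities, gt: binary ground truth, same shape."""
    gt = gt.astype(bool)
    # Global Dice term over the whole volume V.
    inter = np.sum(prob * gt)
    l_dice = 1.0 - (2.0 * inter + eps) / (prob.sum() + gt.sum() + eps)
    # Dynamic ROI R: union of predicted and ground-truth edges.
    gt_edge = edge(gt)
    roi = edge(prob >= 0.5) | gt_edge
    if not roi.any():
        return float(l_dice)
    roi_idx = np.argwhere(roi)
    if gt_edge.any():
        edge_idx = np.argwhere(gt_edge)
        diff = roi_idx[:, None, :] - edge_idx[None, :, :]
        dist = np.sqrt((diff ** 2).sum(-1)).min(axis=1)   # distance to nearest true boundary voxel
    else:
        dist = np.zeros(len(roi_idx))
    w = 1.0 + dist                                        # assumed weight form
    p = np.clip(prob[roi], 1e-7, 1.0 - 1e-7)
    g = gt[roi]
    ce = -(g * np.log(p) + (~g) * np.log(1.0 - p))        # voxel-wise binary cross-entropy
    l_brd = np.sum(w * ce) / roi.sum()                    # average over |R|
    return float(l_dice + lam * l_brd)

# Demo: a cube, scored against itself and against a copy shifted by two voxels.
gt_demo = np.zeros((12, 12, 12))
gt_demo[3:9, 3:9, 3:9] = 1
loss_perfect = dice_brd_loss(gt_demo.astype(float), gt_demo)
loss_shifted = dice_brd_loss(np.roll(gt_demo, 2, axis=0).astype(float), gt_demo)
```

Because the ROI is recomputed from the current prediction, predicted edge voxels that lie far from the true boundary receive larger weights and dominate the boundary term, while voxels deep inside the background never enter it. For real volumes, a distance transform would be a more efficient way to obtain the weights than the pairwise computation used here.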

GBM diagnosis based on VASARI.

In this study, we developed an automated pipeline for extracting VASARI features F4-F7 based on multi-sequence MRI and tumor segmentation masks. First, all MRI sequences were resampled to the spatial coordinate system of the label image to ensure voxel-wise spatial consistency. Tumor subregions—Whole Tumor (WT), tumor core (TC), and Enhancing Tumor (ET)—were defined according to the BraTS specifications. Additionally, connected component analysis was applied to remove small-volume noise and obtain stable tumor masks. The specific extraction rules are as follows:

F4 (Enhancement quality): Describes the degree of signal enhancement on post-contrast T1-weighted MRI, categorized as Absent, Minimal, or Avid. This requires comparing signal intensity changes in the tumor region between pre- and post-contrast T1-weighted images.
F5 (Enhancing Tumor proportion): The percentage of the tumor composed of enhancing regions.
F6 (Non-Enhancing Tumor proportion): The percentage of the tumor composed of non-enhancing regions.
F7 (Necrosis proportion): The percentage of the tumor composed of necrotic regions.

These extracted features were subsequently used as inputs to a logistic regression classifier for GBM prediction.
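As a rough sketch of how proportion-type features can be derived from a segmentation mask, the code below builds the WT, TC, and ET subregions and computes volume fractions relative to the whole tumor. It assumes the common BraTS label convention (1 = necrotic/non-enhancing core, 2 = peritumoral edema, 4 = enhancing tumor) and assumes WT as the denominator; the function names are hypothetical. The rule used to separate non-enhancing tumor (F6) from necrosis (F7) is not given in the text, so the sketch reports the label-1 fraction without splitting it.

```python
import numpy as np

def brats_subregions(label):
    """Return boolean masks (WT, TC, ET) from a BraTS-style label map (assumed labels 1, 2, 4)."""
    wt = np.isin(label, (1, 2, 4))   # whole tumor: all tumor labels
    tc = np.isin(label, (1, 4))      # tumor core: core label plus enhancing tumor
    et = label == 4                  # enhancing tumor
    return wt, tc, et

def volume_fractions(label):
    """Fractions of the whole-tumor volume occupied by each labelled component."""
    wt, _, et = brats_subregions(label)
    n = wt.sum()
    if n == 0:                       # no tumor voxels: all fractions are zero
        return {"enhancing": 0.0, "core_non_enhancing": 0.0, "edema": 0.0}
    return {
        "enhancing": float(et.sum() / n),                     # corresponds to an F5-style proportion
        "core_non_enhancing": float((label == 1).sum() / n),  # label 1, not split into F6 and F7
        "edema": float((label == 2).sum() / n),
    }

# Toy label map: 4 enhancing, 2 core, 2 edema, 2 background voxels.
toy = np.array([4, 4, 4, 4, 1, 1, 2, 2, 0, 0])
fractions = volume_fractions(toy)
```

Applying the same function to a ground-truth mask and to a model-generated mask gives two feature vectors that can be compared directly, which is the kind of comparison reported for the VASARI-based diagnosis.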

Results

Dataset and evaluation metrics

This paper conducts experiments on the BraTS2019 and BraTS2021 [33–35] datasets to validate the effectiveness of the proposed algorithm. The BraTS2019 training set consists of 259 HGG and 76 LGG cases. The BraTS2021 official training dataset consists of 1,251 cases. Since the ground truth labels for the official validation and test sets of BraTS2019 and BraTS2021 are withheld by the challenge organizers and are not publicly available, all models in this study were trained and evaluated exclusively on the official training datasets. To ensure a robust and fair evaluation, a five-fold cross-validation strategy was performed. Therefore, any reference to “test set” or “test results” in this paper specifically denotes the internal hold-out test folds partitioned during this cross-validation process, rather than the official unseen BraTS test sets. The average results of the five runs are reported. In addition, for BraTS2019, we compared GBM predictions based on VASARI features extracted from the GL-Net-generated masks and from the gold-standard masks.

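To evaluate the model’s potential clinical relevance, the following evaluation metrics, given in equations (21) and (22), are used:

$$\mathrm{DSC} = \frac{2\,TP}{2\,TP + FP + FN} \tag{21}$$

$$\mathrm{HD}(X, Y) = \max\left\{ \sup_{x \in X} \inf_{y \in Y} d(x, y),\; \sup_{y \in Y} \inf_{x \in X} d(x, y) \right\} \tag{22}$$

In these equations, TP represents the cases where the model correctly predicts the positive class. FP represents the instances where the model incorrectly predicts the negative class as positive. FN refers to the cases where the model incorrectly predicts the positive class as negative. X and Y are two proper subsets of the metric space M, d(x, y) is the distance between points x and y in M, sup represents the supremum, and inf represents the infimum.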
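To make the two segmentation metrics concrete, the following sketch computes the DSC from TP, FP, and FN counts and the symmetric Hausdorff distance between two small sets of voxel coordinates. It is illustrative only: distances are in voxel units rather than mm, the full maximum is used (no percentile variant), and the function names and toy coordinates are hypothetical.

```python
import math


def dice(pred, truth):
    """DSC = 2TP / (2TP + FP + FN) for two sets of foreground voxel coordinates."""
    tp = len(pred & truth)   # voxels labelled positive in both
    fp = len(pred - truth)   # predicted positive, actually negative
    fn = len(truth - pred)   # predicted negative, actually positive
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom


def hausdorff(x_set, y_set):
    """Symmetric Hausdorff distance between two non-empty finite point sets."""
    def directed(a, b):
        # For finite sets, sup and inf reduce to max and min.
        return max(min(math.dist(p, q) for q in b) for p in a)
    return max(directed(x_set, y_set), directed(y_set, x_set))


# Toy example: three predicted and three reference voxels (hypothetical).
pred = {(0, 0, 0), (0, 1, 0), (1, 1, 0)}
truth = {(0, 0, 0), (0, 1, 0), (0, 2, 0)}
print(dice(pred, truth), hausdorff(pred, truth))
```

This brute-force Hausdorff computation is quadratic in the number of points, so practical implementations usually restrict it to boundary voxels or use distance transforms.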

For GBM diagnosis based on VASARI, commonly used classification metrics were employed, including ACC (Accuracy), Precision, Sensitivity (Recall), Specificity, F1-score, and AUC (Area Under the Curve).
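The threshold-based classification metrics listed above all follow from the four confusion-matrix counts, as the short sketch below shows. The counts in the example are invented for illustration and are not results from this study. AUC is omitted because it requires the ranked classifier scores rather than the counts alone.

```python
def _safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion counts."""
    precision = _safe_div(tp, tp + fp)
    sensitivity = _safe_div(tp, tp + fn)      # also called recall
    return {
        "accuracy": _safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": _safe_div(tn, tn + fp),
        "f1": _safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
print(metrics)
```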

Furthermore, to rigorously validate the performance improvements of the proposed GL-Net, statistical significance testing was conducted. The quantitative segmentation results across the five-fold cross-validation are reported as Mean ± Standard Deviation (SD). A paired Student’s t-test was utilized to determine whether the performance differences between the baseline configurations (or comparative models) and the proposed GL-Net were statistically significant across the test folds. A p-value of < 0.05 was considered statistically significant, and p < 0.01 was considered highly significant.
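The paired t-test described above reduces to a one-sample test on the per-fold differences. The sketch below computes the test statistic and degrees of freedom using only the standard library. The DSC values are made up for illustration, and the helper name `paired_t_statistic` is hypothetical. Obtaining the p-value additionally requires the Student t distribution, which a statistics package such as SciPy (`scipy.stats.ttest_rel`) provides.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)   # sample SD of the differences
    return mean(diffs) / standard_error, n - 1


# Hypothetical per-fold DSC values for a proposed model and a baseline.
proposed = [0.90, 0.91, 0.89, 0.92, 0.90]
baseline = [0.88, 0.90, 0.88, 0.89, 0.89]
t_value, dof = paired_t_statistic(proposed, baseline)
print(t_value, dof)
```

With only five folds there are four degrees of freedom, so the two-sided critical value at the 0.05 level is about 2.776.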

Experimental setup and preprocessing

All experiments in this paper use MRI data provided by the BraTS2019 and BraTS2021 datasets. First, padding and cropping are performed to resize the images to 160 × 160 × 160, followed by Z-score normalization, where the mean of the normalized data is 0 and the variance is 1. Then, slice operations are carried out along the axial plane with a channel depth of 1, reducing the image size to 160 × 160. All experiments were conducted on an Ubuntu operating system equipped with an Intel(R) Xeon(R) CPU E5-2678 v3 @ 2.50GHz, 128 GB of system RAM, and dual NVIDIA RTX 3090 Ti GPUs (24 GB VRAM each). The proposed GL-Net has a total parameter count of approximately 35.2 M. Under this hardware configuration, the total training time for 100 epochs was approximately 14 hours, and the average inference time per patient volume during the testing phase was around 4.2 seconds, which fully meets the efficiency requirements for clinical practice.
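The padding, cropping, and Z-score steps can be illustrated along a single axis as follows. This is a simplified sketch: it assumes symmetric centering, zero padding, and normalization over all values (rather than over brain voxels only), none of which is specified in the text, and the function names are hypothetical. For a 3D volume the same pad-or-crop logic would be applied along each axis with a target of 160.

```python
from statistics import fmean, pstdev


def center_pad_or_crop(values, target, fill=0.0):
    """Pad symmetrically with `fill`, or center-crop, to exactly `target` items."""
    n = len(values)
    if n >= target:
        start = (n - target) // 2
        return list(values[start:start + target])
    before = (target - n) // 2
    after = target - n - before
    return [fill] * before + list(values) + [fill] * after


def z_score(values):
    """Rescale values to zero mean and unit (population) variance."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]   # constant input: nothing to scale
    return [(v - mu) / sigma for v in values]


padded = center_pad_or_crop([1, 2, 3], 5)
cropped = center_pad_or_crop([1, 2, 3, 4, 5, 6], 4)
normalized = z_score([1.0, 2.0, 3.0, 4.0])
print(padded, cropped, normalized)
```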

For the training phase, the input image size is , the batch size is set to 8, and the number of epochs is 100. The optimizer is SGD, with a weight decay of 0.0001, a momentum of 0.99, and an initial learning rate lr₀ of 1.5 × 10⁻³. The learning rate changes as the number of training iterations increases, as described by equation (23):

(23)

Our code will be open-sourced at https://github.com/Turing17/GL-Net.

Comparative experiments of GL-Net

First, the proposed GL-Net is evaluated on the internal hold-out test folds partitioned from the BraTS2019 and BraTS2021 official training data, and the results are compared with advanced algorithms, as shown in Tables 1 and 2 (Figs 8 and 9). The quantitative results of the comparative models in these tables are cited directly from their original publications rather than re-implemented in our experimental environment. This provides a contextual benchmark against current state-of-the-art methods, but these are indirect comparisons rather than head-to-head evaluations within a unified framework. Differences in preprocessing methods, training pipelines, and cross-validation splits can significantly affect performance metrics, so the comparative results should be interpreted with caution. Rather than claiming absolute superiority, we frame our results as highly competitive; definitive benchmarking would require re-implementing all models under an identical pipeline.

On BraTS2019, GL-Net achieves DSC scores of 0.855, 0.916, and 0.860 for the ET, WT, and TC regions, respectively; the ET and WT scores are the highest among the compared methods. GL-Net also achieves HD values of 1.51 mm, 2.39 mm, and 1.59 mm for the ET, WT, and TC regions, respectively, which compare favorably with current segmentation algorithms. The boundary between the TC and WT regions is relatively vague in some glioma datasets, and GL-Net, unlike the method in reference [23], lacks a dedicated edge detection module and therefore extracts edge features less effectively. As a result, its performance in the TC region is inferior to that of reference [23], although GL-Net shows clear improvements in the ET and WT regions. Compared to the TransConver model [22], which is based on transformers and CNNs, GL-Net shows favorable results across most segmentation metrics, with particularly good performance in the ET region.

On the BraTS2021 internal test set, GL-Net achieves DSC scores of 0.899, 0.936, and 0.905 and HD values of 1.12 mm, 1.85 mm, and 1.68 mm for the ET, WT, and TC regions, respectively. Compared to the dual-branch network based on attention mechanisms and super-resolution reconstruction (Jia et al. [14]), GL-Net achieves comparable or superior metrics relative to the reported results, particularly in the segmentation of the WT and TC regions. Compared to the DenseTrans model proposed by Li et al. [18], GL-Net achieves excellent segmentation performance in the ET and TC regions, with further improvements in the WT region as well.

Table 1. The comparative experimental results of the GL-Net with other models on the BraTS2019.

https://doi.org/10.1371/journal.pone.0351953.t001

Table 2. The comparative experimental results of the GL-Net with other models on the BraTS2021.

https://doi.org/10.1371/journal.pone.0351953.t002

Fig 8. Bar chart of the comparison experiment results of GL-Net with other models on the BraTS2019.

https://doi.org/10.1371/journal.pone.0351953.g008

Fig 9. Bar chart of the comparison experiment results of GL-Net with other models on the BraTS2021.

https://doi.org/10.1371/journal.pone.0351953.g009

In summary, GL-Net performs well on both the DSC and HD metrics and is highly competitive with current mainstream segmentation algorithms. We attribute this to GL-Net’s effective use of secondary features and their further extraction through the LRM, which enhances the segmentation performance for brain gliomas. Fig 10 presents a visual comparison of the segmentation results for brain gliomas obtained by different algorithms. Compared against the Ground Truth (GT), GL-Net produces more accurate segmentations than the other algorithms in these examples, which supports the quantitative findings from a qualitative perspective.

Fig 10. Examples of segmentation results on the BraTS2019 for the GL-Net, compared with other algorithms.

WT: green + red + yellow, TC: red + yellow, ET: yellow.

https://doi.org/10.1371/journal.pone.0351953.g010

Ablation experiments

Tables 3 and 4 display the quantitative ablation results of GL-Net on the BraTS2019 and BraTS2021 datasets, respectively, reported as Mean ± SD. To account for potential non-normal distributions in the case-level metric differences, we report p-values from both the paired Student’s t-test and the non-parametric Wilcoxon signed-rank test (presented in parentheses). Each of the proposed modules contributes to further improvements in segmentation performance. Both statistical tests show that the full GL-Net architecture (ALL) achieves highly significant improvements (p < 0.01) over the baseline T-CNN across almost all subregions in both DSC and HD metrics. While adding either the Gaussian Gating Module (GGM) or the Layered Refinement Module (LRM) alone noticeably enhances the segmentation of brain gliomas, the full model consistently yields statistically superior and more stable results (indicated by smaller standard deviations). This indicates that the performance gains are robust across patient volumes and are not driven by random variance. Fig 11 presents a comparison of the ablation study results for this algorithm. After incorporating the GGM, inactive regions are effectively activated, enhancing the recognition of blurred edges. Furthermore, after adding the LRM, the segmentation of the WT, TC, and ET regions becomes more closely aligned with the ground truth labels.

Table 3. Ablation experiment results of GL-Net on the BraTS2019.

https://doi.org/10.1371/journal.pone.0351953.t003

Table 4. Ablation experiment results of GL-Net on the BraTS2021.

https://doi.org/10.1371/journal.pone.0351953.t004

Fig 11. Examples of the ablation experiment results of this algorithm on the BraTS2019 dataset.

(a): T1CE, (b): T-CNN, (c): T-CNN + GGM, (d): T-CNN + LRM, (e): T-CNN + GGM + LRM, (f): GT; WT: green + red + yellow, TC: red + yellow, ET: yellow.

https://doi.org/10.1371/journal.pone.0351953.g011

To verify the effectiveness of the proposed loss function, ablation experiments were conducted. Table 5 presents the results obtained from training GL-Net with various loss functions. First, a comparison was made between Dice loss and Dice+WCE, where Dice+WCE improved the DSC by 0.5%, 0.8%, and 0.4% for the ET, WT, and TC regions, respectively, and reduced HD by 0.6 mm, 0.8 mm, and 0.84 mm, respectively. This indicates that WCE effectively mitigates the class imbalance problem. WCE refers to the use of a distance map for weighting and includes all voxels in the loss calculation. In the comparison between Dice+WCE and the proposed DiceBRD loss function, the proposed loss improved the DSC by 0.6%, 1.2%, and 0.4% for the ET, WT, and TC regions, respectively, while reducing HD by 0.18 mm, 0.61 mm, and 0.42 mm, respectively. More importantly, the dual statistical analyses rigorously validate the specific advantages of the DiceBRD loss. While the DSC improvements over Dice+WCE show moderate statistical significance (e.g., p = 0.03 for both t-test and Wilcoxon test in WT), the reductions in the Hausdorff Distance (HD) are highly significant across all subregions ( for ET, p < 0.01 for WT and TC in both tests). This explicitly confirms our methodological hypothesis: by strictly restricting the weighted cross-entropy calculation to the dynamically evolving boundary regions, the model successfully tightens the geometric contour alignment, leading to a statistically robust and reliable reduction in boundary errors (HD). Additionally, qualitative results, as shown in Fig 12, indicate that the proposed loss function enhances network performance and yields satisfactory experimental results, which is consistent with the quantitative comparisons in Table 5.

Table 5. Comparative experimental results of different loss functions for GL-Net on the BraTS2021 internal test set (via 5-fold cross-validation).

https://doi.org/10.1371/journal.pone.0351953.t005

Fig 12. Examples of segmentation results of the loss function proposed on the BraTS2019.

WT: green + red + yellow, TC: red + yellow, ET: yellow.

https://doi.org/10.1371/journal.pone.0351953.g012

It is worth noting that these findings may still be influenced by characteristics of the training data. The BraTS datasets, although standardized and widely adopted, remain limited in size compared with real-world multi-center cohorts, which may constrain the generalizability of the observed improvements. Moreover, the significant class imbalance inherent to glioma subregions—particularly the small volume of the ET region—may amplify sensitivity to sampling variations, potentially affecting the magnitude of performance gains. In addition, the curated nature of BraTS introduces a degree of selection bias, meaning the effectiveness of the proposed loss function on more heterogeneous clinical data remains to be further validated. These factors should be considered when interpreting the ablation results.

VASARI-based feature analysis

While geometric metrics such as DSC and HD provide quantitative measures of spatial overlap and boundary distances, they do not fully capture the clinical utility of a segmentation model. In clinical workflows, the ultimate goal of tumor delineation is to extract reliable morphological and compositional metrics—such as the proportions of necrosis, enhancing, and non-enhancing tumor regions—that directly inform diagnosis and treatment planning. Therefore, downstream classification consistency serves as a crucial indicator of potential clinical relevance for segmentation quality. If the segmentation masks generated by GL-Net yield VASARI features and subsequent GBM diagnostic performance that are statistically indistinguishable from those derived from expert manual annotations (ground truth), it provides valuable complementary and indirect evidence that GL-Net successfully preserves the clinically critical morphological variations and subregion proportions required for accurate diagnosis, thereby bridging the gap between pixel-level accuracy and potential clinical relevance.

Following this rationale, after extracting VASARI structural imaging features from both the GL-Net-generated masks and the BraTS gold-standard masks, we constructed GBM classification models and compared the discriminative performance of the two feature sets. The results (Table 6) showed that the model based on the gold-standard masks achieved an AUC of 0.954, ACC of 0.940, and F1-score of 0.962, while the model based on GL-Net outputs achieved an AUC of 0.949, ACC of 0.925, and F1-score of 0.952, indicating highly consistent performance across all metrics (Fig 13). The DeLong test further showed that the difference in AUC between the two models was not statistically significant (p = 0.9996).

Table 6. GBM classification performance using GL-Net vs. BraTS masks.

https://doi.org/10.1371/journal.pone.0351953.t006

Fig 13. ROC curves for GBM classification using GL-Net and BraTS segmentation masks.

https://doi.org/10.1371/journal.pone.0351953.g013

In addition, visualization of the logistic regression feature weights (Fig 14) revealed that the weight distribution patterns of both models were almost identical, further demonstrating that the structural imaging features extracted from GL-Net segmentations contribute to GBM classification comparably to manual annotations. Collectively, these results suggest that GL-Net segmentation masks can serve as a stable and reliable input for GBM imaging feature extraction and diagnostic classification, offering indirect evidence of the model’s potential clinical relevance rather than definitive proof of absolute segmentation accuracy.

Fig 14. Feature weight comparison between GL-Net (left) and gold-standard masks (right).

https://doi.org/10.1371/journal.pone.0351953.g014

Conclusion

Medical imaging plays a pivotal role in clinical diagnosis, particularly for the segmentation of brain glioma in MRI. Despite the promise of deep learning, performance is often limited by data scarcity and image complexity. In this study, we proposed GL-Net, a brain glioma segmentation algorithm that integrates deep learning with prior knowledge. The Gaussian Gating Module (GGM) effectively suppresses image noise, activates secondary features, and enhances recognition of blurred edges, while the Layered Refinement Module (LRM) leverages label priors for multi-scale feature fusion, improving the perception of fine details. A dynamic weighted loss function was also designed to emphasize edge-region voxels, mitigating the class imbalance problem.

Extensive experiments on the BraTS2019 and BraTS2021 datasets demonstrate that GL-Net achieves robust performance, with average DSCs of 0.877 and 0.913 and average HDs of 1.83 mm and 1.55 mm, respectively. These results are highly competitive relative to reported benchmarks, particularly in the segmentation of the Enhancing Tumor (ET) and Whole Tumor (WT) regions. To assess clinical relevance, VASARI-based structural imaging features were extracted from GL-Net segmentations and compared to features derived from BraTS gold-standard masks. GBM classification results showed no statistically significant difference (p = 0.9996) between the two feature sets, and logistic regression feature weight distributions were nearly identical, suggesting that GL-Net provides stable and clinically meaningful representations comparable to manual annotations, thus offering complementary evidence of its potential clinical relevance.

While GL-Net demonstrates strong segmentation accuracy and reliable downstream feature extraction, claims regarding its direct clinical applicability must be interpreted with caution. A primary limitation of the current study is the reliance on the highly curated BraTS datasets. Although multi-institutional, these standardized datasets may not fully capture the extreme heterogeneity, unpredictable artifacts, and diverse scanning protocols encountered in independent, real-world clinical cohorts. The absence of external validation on an independent real-world dataset means the model’s operational robustness in everyday clinical practice remains to be definitively proven. Furthermore, to address the limitations of citing unstandardized comparative results, future work will involve re-implementing state-of-the-art baseline models within a unified training, preprocessing, and evaluation framework (such as nnU-Net) to enable standardized, head-to-head benchmarking. Therefore, extensive external validation on independent clinical cohorts and subsequent prospective trials are essential next steps to rigorously evaluate the model’s true generalizability. Future work will focus on conducting these prospective validations, optimizing GL-Net for small or poorly defined tumors, and iteratively advancing its robust incorporation into clinical workflows to truly enhance diagnostic efficiency and reliability.

Acknowledgments

This research was based on Jilin Province Science and Technology Innovation Center for Multimodal Cognitive Computing and Analysis of Medical Biometrics, and the Smart Health Joint Innovation Laboratory for the New Generation of AI.

References

1. Gemini L, Tortora M, Giordano P, Prudente ME, Villa A, Vargas O, et al. Vasari Scoring System in Discerning between Different Degrees of Glioma and IDH Status Prediction: A Possible Machine Learning Application?. J Imaging. 2023;9(4):75. pmid:37103226
2. Negro A, Gemini L, Tortora M, Pace G, Iaccarino R, Marchese M, et al. VASARI 2.0: a new updated MRI VASARI lexicon to predict grading and IDH status in brain glioma. Front Oncol. 2024;14:1449982. pmid:39763601
3. Guan Y, Aamir M, Rahman Z, Ali A, Abro WA, Dayo ZA. A framework for efficient brain tumor classification using MRI images. 2021.
4. Chen X, Wang X, Zhang K, Fung K-M, Thai TC, Moore K, et al. Recent advances and clinical applications of deep learning in medical image analysis. Med Image Anal. 2022;79:102444. pmid:35472844
5. Haq EU, Jianjun H, Li K, Haq HU, Zhang T. An MRI-based deep learning approach for efficient classification of brain tumors. J Ambient Intell Human Comput. 2021;14(6):6697–718.
6. Tandel GS, Tiwari A, Kakde OG. Performance enhancement of MRI-based brain tumor classification using suitable segmentation method and deep learning-based ensemble algorithm. Biomedical Signal Processing and Control. 2022;78:104018.
7. Hyun CM, Kim KC, Cho HC, Choi JK, Seo JK. Framelet pooling aided deep learning network: the method to process high dimensional medical data. Mach Learn: Sci Technol. 2020;1(1):015009.
8. Akbarian S, Seyyed-Kalantari L, Khalvati F, Dolatabadi E. Evaluating Knowledge Transfer in the Neural Network for Medical Images. IEEE Access. 2023;11:85812–21.
9. Xie X, Niu J, Liu X, Chen Z, Tang S, Yu S. A survey on incorporating domain knowledge into deep learning for medical image analysis. Med Image Anal. 2021;69:101985. pmid:33588117
10. Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. In: Medical image computing and computer-assisted intervention–MICCAI 2015: 18th international conference, Munich, Germany, October 5-9, 2015, proceedings, part III 18. Springer. 2015. 234–41.
11. Xu W, Yang H, Zhang M, Cao Z, Pan X, Liu W. Brain tumor segmentation with corner attention and high-dimensional perceptual loss. Biomedical Signal Processing and Control. 2022;73:103438.
12. Schlemper J, Oktay O, Schaap M, Heinrich M, Kainz B, Glocker B, et al. Attention gated networks: Learning to leverage salient regions in medical images. Med Image Anal. 2019;53:197–207. pmid:30802813
13. Akbar AS, Fatichah C, Suciati N. Single level UNet3D with multipath residual attention block for brain tumor segmentation. Journal of King Saud University - Computer and Information Sciences. 2022;34(6):3247–58.
14. Jia Z, Zhu H, Zhu J, Ma P. Two-Branch network for brain tumor segmentation using attention mechanism and super-resolution reconstruction. Comput Biol Med. 2023;157:106751. pmid:36934534
15. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T. An image is worth 16x16 words: Transformers for image recognition at scale. 2020. https://arxiv.org/abs/2010.11929
16. Lin J, Lin J, Lu C, Chen H, Lin H, Zhao B, et al. CKD-TransBTS: Clinical Knowledge-Driven Hybrid Transformer With Modality-Correlated Cross-Attention for Brain Tumor Segmentation. IEEE Trans Med Imaging. 2023;42(8):2451–61. pmid:37027751
17. Hu Z, Li L, Sui A, Wu G, Wang Y, Yu J. An efficient R-Transformer network with dual encoders for brain glioma segmentation in MR images. Biomedical Signal Processing and Control. 2023;79:104034.
18. ZongRen L, Silamu W, Yuzhen W, Zhe W. DenseTrans: Multimodal Brain Tumor Segmentation Using Swin Transformer. IEEE Access. 2023;11:42895–908.
19. Yang H, Zhou T, Zhou Y, Zhang Y, Fu H. Flexible Fusion Network for Multi-Modal Brain Tumor Segmentation. IEEE J Biomed Health Inform. 2023;27(7):3349–59. pmid:37126623
20. Ting H, Liu M. Multimodal Transformer of Incomplete MRI Data for Brain Tumor Segmentation. IEEE J Biomed Health Inform. 2023;PP:10.1109/JBHI.2023.3286689. pmid:37327094
21. Wenxuan W, Chen C, Meng D, Hong Y, Sen Z, Jiangyun L. Transbts: Multimodal brain tumor segmentation using transformer. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer. 2021. 109–19.
22. Liang J, Yang C, Zeng M, Wang X. TransConver: transformer and convolution parallel network for developing automatic brain tumor segmentation in MRI images. Quant Imaging Med Surg. 2022;12(4):2397–415. pmid:35371952
23. Zhu Z, He X, Qi G, Li Y, Cong B, Liu Y. Brain tumor segmentation based on the fusion of deep semantics and edge information in multimodal MRI. Information Fusion. 2023;91:376–87.
24. Liu Y, Duan Y, Zeng T. Learning multi-level structural information for small organ segmentation. Signal Processing. 2022;193:108418.
25. Lin T-Y, Goyal P, Girshick R, He K, Dollar P. Focal Loss for Dense Object Detection. In: 2017 IEEE International Conference on Computer Vision (ICCV), 2017. 2999–3007. https://doi.org/10.1109/iccv.2017.324
26. Caliva F, Iriondo C, Martinez AM, Majumdar S, Pedoia V. Distance map loss penalty term for semantic segmentation. arXiv preprint. 2019.
27. Kervadec H, Bouchtiba J, Desrosiers C, Granger E, Dolz J, Ayed IB. Boundary loss for highly unbalanced segmentation. In: International conference on medical imaging with deep learning, 2019. 285–96.
28. Karimi D, Salcudean SE. Reducing the Hausdorff Distance in Medical Image Segmentation With Convolutional Neural Networks. IEEE Trans Med Imaging. 2020;39(2):499–513. pmid:31329113
29. Yeung M, Sala E, Schönlieb C-B, Rundo L. Unified Focal loss: Generalising Dice and cross entropy-based losses to handle class imbalanced medical image segmentation. Comput Med Imaging Graph. 2022;95:102026. pmid:34953431
30. Du J, Guan K, Liu P, Li Y, Wang T. Boundary-Sensitive Loss Function With Location Constraint for Hard Region Segmentation. IEEE J Biomed Health Inform. 2023;27(2):992–1003. pmid:36378793
31. Yu S, Zhang B, Xiao J, Lim EG. Structure-Consistent Weakly Supervised Salient Object Detection with Local Saliency Coherence. AAAI. 2021;35(4):3234–42.
32. Wang G, Lu H, Li N, Xue H, Sang P. Kfd-net: a knowledge fusion decision method for post-processing brain glioma MRI segmentation. Pattern Anal Applic. 2024;27(4).
33. Bakas S, Akbari H, Sotiras A, Bilello M, Rozycki M, Kirby JS, et al. Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. Sci Data. 2017;4:170117. pmid:28872634
34. Bakas S, Reyes M, Jakab A, Bauer S, Rempfler M, Crimi A, et al. Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge. 2018. https://doi.org/10.48550/arXiv.1811.02629
35. Bakas S, Akbari H, Sotiras A, Bilello M, Rozycki M, Kirby J. Segmentation labels and radiomic features for the pre-operative scans of the TCGA-LGG collection. The Cancer Imaging Archive. 2017.
36. Rehman MU, Ryu J, Nizami IF, Chong KT. RAAGR2-Net: A brain tumor segmentation network using parallel processing of multiple spatial frames. Comput Biol Med. 2023;152:106426. pmid:36565485
37. Chang Y, Zheng Z, Sun Y, Zhao M, Lu Y, Zhang Y. DPAFNet: A Residual Dual-Path Attention-Fusion Convolutional Neural Network for Multimodal Brain Tumor Segmentation. Biomedical Signal Processing and Control. 2023;79:104037.
38. AboElenein NM, Piao S, Noor A, Ahmed PN. MIRAU-Net: An improved neural network based on U-Net for gliomas segmentation. Signal Processing: Image Communication. 2022;101:116553.
39. Mazumdar I, Mukherjee J. Fully automatic MRI brain tumor segmentation using efficient spatial attention convolutional networks with composite loss. Neurocomputing. 2022;500:243–54.

[ref1] 1. Gemini L, Tortora M, Giordano P, Prudente ME, Villa A, Vargas O, et al. Vasari Scoring System in Discerning between Different Degrees of Glioma and IDH Status Prediction: A Possible Machine Learning Application?. J Imaging. 2023;9(4):75. pmid:37103226
View Article
PubMed/NCBI
Google Scholar

[2] View Article

[3] PubMed/NCBI

[4] Google Scholar

[ref2] 2. Negro A, Gemini L, Tortora M, Pace G, Iaccarino R, Marchese M, et al. VASARI 2.0: a new updated MRI VASARI lexicon to predict grading and IDH status in brain glioma. Front Oncol. 2024;14:1449982. pmid:39763601
View Article
PubMed/NCBI
Google Scholar

[6] View Article

[7] PubMed/NCBI

[8] Google Scholar

[ref3] 3. Guan Y, Aamir M, Rahman Z, Ali A, Abro WA, Dayo ZA. A framework for efficient brain tumor classification using MRI images. 2021.

[ref4] 4. Chen X, Wang X, Zhang K, Fung K-M, Thai TC, Moore K, et al. Recent advances and clinical applications of deep learning in medical image analysis. Med Image Anal. 2022;79:102444. pmid:35472844
View Article
PubMed/NCBI
Google Scholar

[11] View Article

[12] PubMed/NCBI

[13] Google Scholar

[ref5] 5. Haq EU, Jianjun H, Li K, Haq HU, Zhang T. An MRI-based deep learning approach for efficient classification of brain tumors. J Ambient Intell Human Comput. 2021;14(6):6697–718.
View Article
Google Scholar

[15] View Article

[16] Google Scholar

[ref6] 6. Tandel GS, Tiwari A, Kakde OG. Performance enhancement of MRI-based brain tumor classification using suitable segmentation method and deep learning-based ensemble algorithm. Biomedical Signal Processing and Control. 2022;78:104018.
View Article
Google Scholar

[18] View Article

[19] Google Scholar

[ref7] 7. Hyun CM, Kim KC, Cho HC, Choi JK, Seo JK. Framelet pooling aided deep learning network: the method to process high dimensional medical data. Mach Learn: Sci Technol. 2020;1(1):015009.
View Article
Google Scholar

[21] View Article

[22] Google Scholar

[ref8] 8. Akbarian S, Seyyed-Kalantari L, Khalvati F, Dolatabadi E. Evaluating Knowledge Transfer in the Neural Network for Medical Images. IEEE Access. 2023;11:85812–21.
View Article
Google Scholar

[24] View Article

[25] Google Scholar

[ref9] 9. Xie X, Niu J, Liu X, Chen Z, Tang S, Yu S. A survey on incorporating domain knowledge into deep learning for medical image analysis. Med Image Anal. 2021;69:101985. pmid:33588117
View Article
PubMed/NCBI
Google Scholar

[27] View Article

[28] PubMed/NCBI

[29] Google Scholar

[ref10] 10. Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. In: Medical image computing and computer-assisted intervention–MICCAI 2015: 18th international conference, Munich, Germany, October 5-9, 2015, proceedings, part III 18. Springer. 2015. 234–41.

[ref11] 11. Xu W, Yang H, Zhang M, Cao Z, Pan X, Liu W. Brain tumor segmentation with corner attention and high-dimensional perceptual loss. Biomedical Signal Processing and Control. 2022;73:103438.
View Article
Google Scholar

[32] View Article

[33] Google Scholar

[ref12] 12. Schlemper J, Oktay O, Schaap M, Heinrich M, Kainz B, Glocker B, et al. Attention gated networks: Learning to leverage salient regions in medical images. Med Image Anal. 2019;53:197–207. pmid:30802813
View Article
PubMed/NCBI
Google Scholar

[35] View Article

[36] PubMed/NCBI

[37] Google Scholar

[ref13] 13. Akbar AS, Fatichah C, Suciati N. Single level UNet3D with multipath residual attention block for brain tumor segmentation. Journal of King Saud University - Computer and Information Sciences. 2022;34(6):3247–58.
View Article
Google Scholar

[39] View Article

[40] Google Scholar

[ref14] 14. Jia Z, Zhu H, Zhu J, Ma P. Two-Branch network for brain tumor segmentation using attention mechanism and super-resolution reconstruction. Comput Biol Med. 2023;157:106751. pmid:36934534
View Article
PubMed/NCBI
Google Scholar

[42] View Article

[43] PubMed/NCBI

[44] Google Scholar

[ref15] 15. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T. An image is worth 16x16 words: Transformers for image recognition at scale. 2020. https://arxiv.org/abs/2010.11929

[ref16] 16. Lin J, Lin J, Lu C, Chen H, Lin H, Zhao B, et al. CKD-TransBTS: Clinical Knowledge-Driven Hybrid Transformer With Modality-Correlated Cross-Attention for Brain Tumor Segmentation. IEEE Trans Med Imaging. 2023;42(8):2451–61. pmid:37027751
View Article
PubMed/NCBI
Google Scholar

[47] View Article

[48] PubMed/NCBI

[49] Google Scholar

[ref17] 17. Hu Z, Li L, Sui A, Wu G, Wang Y, Yu J. An efficient R-Transformer network with dual encoders for brain glioma segmentation in MR images. Biomedical Signal Processing and Control. 2023;79:104034.

[ref18] 18. ZongRen L, Silamu W, Yuzhen W, Zhe W. DenseTrans: Multimodal Brain Tumor Segmentation Using Swin Transformer. IEEE Access. 2023;11:42895–908.

[ref19] 19. Yang H, Zhou T, Zhou Y, Zhang Y, Fu H. Flexible Fusion Network for Multi-Modal Brain Tumor Segmentation. IEEE J Biomed Health Inform. 2023;27(7):3349–59. pmid:37126623

[ref20] 20. Ting H, Liu M. Multimodal Transformer of Incomplete MRI Data for Brain Tumor Segmentation. IEEE J Biomed Health Inform. 2023;PP:10.1109/JBHI.2023.3286689. pmid:37327094

[ref21] 21. Wang W, Chen C, Ding M, Yu H, Zha S, Li J. TransBTS: Multimodal brain tumor segmentation using transformer. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer. 2021. 109–19.

[ref22] 22. Liang J, Yang C, Zeng M, Wang X. TransConver: transformer and convolution parallel network for developing automatic brain tumor segmentation in MRI images. Quant Imaging Med Surg. 2022;12(4):2397–415. pmid:35371952

[ref23] 23. Zhu Z, He X, Qi G, Li Y, Cong B, Liu Y. Brain tumor segmentation based on the fusion of deep semantics and edge information in multimodal MRI. Information Fusion. 2023;91:376–87.

[ref24] 24. Liu Y, Duan Y, Zeng T. Learning multi-level structural information for small organ segmentation. Signal Processing. 2022;193:108418.

[ref25] 25. Lin T-Y, Goyal P, Girshick R, He K, Dollar P. Focal Loss for Dense Object Detection. In: 2017 IEEE International Conference on Computer Vision (ICCV), 2017. 2999–3007. https://doi.org/10.1109/iccv.2017.324

[ref26] 26. Caliva F, Iriondo C, Martinez AM, Majumdar S, Pedoia V. Distance map loss penalty term for semantic segmentation. arXiv preprint. 2019.

[ref27] 27. Kervadec H, Bouchtiba J, Desrosiers C, Granger E, Dolz J, Ayed IB. Boundary loss for highly unbalanced segmentation. In: International conference on medical imaging with deep learning, 2019. 285–96.

[ref28] 28. Karimi D, Salcudean SE. Reducing the Hausdorff Distance in Medical Image Segmentation With Convolutional Neural Networks. IEEE Trans Med Imaging. 2020;39(2):499–513. pmid:31329113

[ref29] 29. Yeung M, Sala E, Schönlieb C-B, Rundo L. Unified Focal loss: Generalising Dice and cross entropy-based losses to handle class imbalanced medical image segmentation. Comput Med Imaging Graph. 2022;95:102026. pmid:34953431

[ref30] 30. Du J, Guan K, Liu P, Li Y, Wang T. Boundary-Sensitive Loss Function With Location Constraint for Hard Region Segmentation. IEEE J Biomed Health Inform. 2023;27(2):992–1003. pmid:36378793

[ref31] 31. Yu S, Zhang B, Xiao J, Lim EG. Structure-Consistent Weakly Supervised Salient Object Detection with Local Saliency Coherence. AAAI. 2021;35(4):3234–42.

[ref32] 32. Wang G, Lu H, Li N, Xue H, Sang P. Kfd-net: a knowledge fusion decision method for post-processing brain glioma MRI segmentation. Pattern Anal Applic. 2024;27(4).

[ref33] 33. Bakas S, Akbari H, Sotiras A, Bilello M, Rozycki M, Kirby JS, et al. Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. Sci Data. 2017;4:170117. pmid:28872634

[ref34] 34. Bakas S, Reyes M, Jakab A, Bauer S, Rempfler M, Crimi A, et al. Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge. 2018. https://doi.org/10.48550/arXiv.1811.02629

[ref35] 35. Bakas S, Akbari H, Sotiras A, Bilello M, Rozycki M, Kirby J. Segmentation labels and radiomic features for the pre-operative scans of the TCGA-LGG collection. The Cancer Imaging Archive. 2017.

[ref36] 36. Rehman MU, Ryu J, Nizami IF, Chong KT. RAAGR2-Net: A brain tumor segmentation network using parallel processing of multiple spatial frames. Comput Biol Med. 2023;152:106426. pmid:36565485

[ref37] 37. Chang Y, Zheng Z, Sun Y, Zhao M, Lu Y, Zhang Y. DPAFNet: A Residual Dual-Path Attention-Fusion Convolutional Neural Network for Multimodal Brain Tumor Segmentation. Biomedical Signal Processing and Control. 2023;79:104037.

[ref38] 38. AboElenein NM, Piao S, Noor A, Ahmed PN. MIRAU-Net: An improved neural network based on U-Net for gliomas segmentation. Signal Processing: Image Communication. 2022;101:116553.

[ref39] 39. Mazumdar I, Mukherjee J. Fully automatic MRI brain tumor segmentation using efficient spatial attention convolutional networks with composite loss. Neurocomputing. 2022;500:243–54.
