
Unsupervised model for structure segmentation applied to brain computed tomography

  • Paulo Victor dos Santos ,

    Contributed equally to this work with: Paulo Victor dos Santos, Marcella Scoczynski Ribeiro Martins, Solange Amorim Nogueira, Cristhiane Gonçalves, Rafael Maffei Loureiro, Wesley Pacheco Calixto

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    paulodos@ieee.org (PVS); wesley.pacheco@ufg.br (WPC)

    Affiliations Electrical, Mechanical & Computer Engineering School, Federal University of Goias, Goiania, Brazil, Department of Radiology, Hospital Israelita Albert Einstein, Sao Paulo, Sao Paulo, Brazil, Technology Research and Development Center (GCITE), Federal Institute of Goias, Goiania, Brazil

  • Marcella Scoczynski Ribeiro Martins ,


    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Electrical, Mechanical & Computer Engineering School, Federal University of Goias, Goiania, Brazil, Federal University of Technology - Parana, Ponta Grossa, Parana, Brazil

  • Solange Amorim Nogueira ,


    Roles Validation, Writing – original draft, Writing – review & editing

    Affiliations Electrical, Mechanical & Computer Engineering School, Federal University of Goias, Goiania, Brazil, Department of Radiology, Hospital Israelita Albert Einstein, Sao Paulo, Sao Paulo, Brazil

  • Cristhiane Gonçalves ,


    Roles Methodology, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Federal University of Technology - Parana, Ponta Grossa, Parana, Brazil

  • Rafael Maffei Loureiro ,


    Roles Data curation, Investigation, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Radiology, Hospital Israelita Albert Einstein, Sao Paulo, Sao Paulo, Brazil

  • Wesley Pacheco Calixto


    Roles Formal analysis, Methodology, Project administration, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    paulodos@ieee.org (PVS); wesley.pacheco@ufg.br (WPC)

    Affiliations Electrical, Mechanical & Computer Engineering School, Federal University of Goias, Goiania, Brazil, Technology Research and Development Center (GCITE), Federal Institute of Goias, Goiania, Brazil

Abstract

This article presents an unsupervised method for segmenting brain computed tomography scans. The proposed methodology involves image feature extraction and application of similarity and continuity constraints to generate segmentation maps of the anatomical head structures. Specifically designed for real-world datasets, this approach applies a spatial continuity scoring function tailored to the desired number of structures. The primary objective is to assist medical experts in diagnosis by identifying regions with specific abnormalities. Results indicate a simplified and accessible solution, reducing computational effort, training time, and financial costs. Moreover, the method presents potential for expediting the interpretation of abnormal scans, thereby impacting clinical practice. This proposed approach might serve as a practical tool for segmenting brain computed tomography scans, and make a significant contribution to the analysis of medical images in both research and clinical settings.

Introduction

Neurological diseases pose a significant risk not only to individual development but also to overall productivity. Currently, some of the most prevalent neurological disorders, such as dementia, stroke, epilepsy, Parkinson’s disease, multiple sclerosis, migraines, and tension-type headaches, generate an estimated economic cost of nearly $789 billion in the United States [1]. These disorders involve congenital, developmental, or acquired abnormalities that affect the brain, spinal cord, and nerves [2].

The World Health Organization (WHO) estimates that one in three people will be affected by neurological disorders in their lifetime. These disorders are the leading cause of disability and the second leading cause of death, resulting in more than six million deaths each year. Given these statistics, public health and society must prioritize the understanding and effective treatment of neurological disorders. Prevention strategies, early diagnosis, and effective treatments are critical to minimizing the negative impact of these conditions on patients’ health while reducing the associated economic burden [3].

The high prevalence of neurological diseases and the growing number of neuroimaging studies stored in repositories [4, 5] emphasize the importance of using artificial intelligence (AI) to develop models that can assist physicians in early diagnosis, thus improving patient care [6, 7]. By analyzing vast amounts of data, AI can identify intricate and subtle patterns that humans cannot observe.

AI has greatly advanced in the analysis of medical images, particularly through the process of annotation. This involves trained radiologists identifying and labeling specific areas with anatomical structures, lesions, or other significant features, which is crucial for AI algorithms involved in tasks such as detection, segmentation, and classification [8, 9]. The use of well-organized medical image datasets with accurate labeling facilitates the development of computational technologies for efficient classification and preliminary medical diagnoses in a supervised manner [10]. Supervised machine learning algorithms are trained using labeled data, establishing associations and inferences between data and relevant categories. This supervised approach allows the generalization and inference of new tests based on previously learned patterns. Among various supervised learning methods, artificial neural networks (ANNs) have emerged as a particularly promising approach [11, 12]. However, the annotation process can be resource-intensive in terms of time and finances. Moreover, available datasets often lack adequate annotations or a satisfactory quantity for the effective training of AI models [13–15].

Segmentation, the division of the image into regions, objects, or pixels, with the aim of separating the objects of interest from other elements [16], is a technique frequently applied, often performed by ANNs [17], and is important in biomedical applications, particularly for delineating anatomical structures [18]. In neuroimaging, the segmentation of intracranial structures facilitates the visualization and classification of brain tissues, aiding in the identification of abnormalities, including tumor localization [19–21]. Deep Neural Networks (DNNs), a subtype of neural networks, have proven effective for segmentation, acting as supervised classifiers and achieving more accurate results [18, 22, 23].

Numerous studies have employed supervised deep learning techniques for segmentation purposes. Monteiro et al. proposed the use of supervised deep learning to quantify brain injuries using the CQ500 dataset [24, 25]. Ronneberger, Fischer, and Brox, as well as Li et al., developed a specialized deep neural network architecture called U-Net for brain hemorrhage segmentation, achieving an impressive 98% accuracy in identifying bleeds across two distinct datasets [26, 27]. However, it is essential to acknowledge that these approaches are limited by the availability of pre-annotated datasets. Segmentation of brain images is challenging due to image noise, especially for computed tomography (CT) images [28]. Additionally, supervised segmentation involves certain considerations: i) it can be costly, as it relies on annotations from experts with potential interobserver variability, ii) modeling different exam types incurs high computational costs, as models need to be trained for each exam type, and iii) the CT acquisition protocol may vary from scanner to scanner, differing in terms of signal-to-noise ratio (SNR), contrast, slice thickness, and spatial resolution, among others.

On the other hand, unsupervised segmentation methods have also attracted attention. These methods offer unique advantages since they do not rely on pre-labeled data and can effectively deal with image noise. Balafar et al. [28] provided a comprehensive review of supervised and unsupervised segmentation techniques and suggested further research to enhance the speed, accuracy, and integration of these methods. A notable unsupervised segmentation approach was proposed by Atkins & Mackiewich [29], who used anisotropic filters and image processing techniques, such as noise removal, to perform unsupervised segmentation of brain lesions and generate brain contour masks. This automated method demonstrated the ability to segment images obtained from different scanners and resolutions, overcoming the limitations associated with data heterogeneity. Their work contributed to unsupervised segmentation in neuroimaging, and their results highlighted the potential to advance the field in terms of accuracy and applicability.

Lee et al. [30] proposed an approach that combines classic clustering algorithms, such as k-means and fuzzy c-means, for the segmentation of brain CT images into three distinct regions. In their study, the authors employed decision trees to analyze the interrelation between components in normal and abnormal regions within brain CT images. These unsupervised techniques have shown promising results in medical image segmentation, effectively addressing challenges associated with the need for specialized annotations and the presence of image noise. As a result, they enhanced the accuracy and reliability of medical image segmentation techniques, pushing the boundaries of the field and creating new opportunities for improved diagnosis.

Recent studies have explored unsupervised segmentation for brain images acquired through magnetic resonance imaging (MRI). For example, Dalca et al. [31] proposed a novel approach that combines Bayesian inference with classical probabilistic segmentation based on brain atlases, incorporating deep learning techniques. The authors developed neural networks that can detect, delineate, and recognize abnormalities in brain images, allowing the segmentation model to be trained on new MRI scans without manual annotation. Experimental results showed that the method enables accurate segmentation regardless of MRI contrast.

Mahata et al. [20] proposed a fuzzy segmentation technique for brain MRI. Their method consists of combining the Gaussian function with local contextual information to establish associations between neighboring pixels. Segmentation is performed by using the Gaussian function to estimate the heterogeneous intensity within each tissue region using the local gradients of the image. The authors conducted simulations using two databases of brain MRI scans and demonstrated that the proposed method is efficient compared to other clustering algorithms based on fuzzy logic. However, it is fundamental to note that their proposed approach is specifically tailored to MRI studies.

Khan et al. [32] proposed a method for accurate classification of brain tumors in MRI images. Their method comprised three phases: i) preprocessing, ii) segmentation of brain tumor using the k-means clustering technique, and iii) classification of tumors as benign or malignant. To refine the accuracy of the classification process, the authors introduced the concept of synthetic data augmentation. In this technique, additional data is generated to expand the training dataset considered by the classifier. The approach was evaluated using the BraTS2015 datasets, and results show the robustness of the proposed strategy, particularly emphasizing the effectiveness of clustering as a segmentation technique.

Raja et al. [33] presented a methodology to classify brain tumors in MRI scans by introducing a hybrid deep autoencoder with a Bayesian fuzzy clustering segmentation approach. Following segmentation, they extracted information metrics using dispersion transform and entropy methods. This work is characterized by the integration of unsupervised and supervised approaches, although it is limited to MRI scans. In a similar context, Hua et al. [34] proposed a clustering technique based on fuzzy c-means to improve the accuracy of segmentation in brain MRI scans. Their method incorporates a visualization mechanism with adaptive weight learning, which assigns an optimal weight to each visualization based on its cluster contribution. The segmentation result was obtained by combining these visualizations. Compared to other clustering algorithms, the authors’ method exhibited superior adaptability and performance.

The segmentation of medical images using MRI has been extensively explored in the literature, often combining both supervised and unsupervised learning techniques. According to Lenchik et al. [35], the brain segmentation studies conducted in 2019 indicate a predominant use of MRI over CT. In Brazil, there are approximately 15.6 CT machines per million inhabitants, with MRI machines generally having higher maintenance and depreciation costs [36, 37]. As a result, the widespread availability and rapid imaging acquisition of CT scans justify their frequent use, especially in emergency cases such as stroke, where prompt diagnosis is crucial for effective treatment.

Table 1 presents a chronological overview of studies that have addressed MRI and CT image segmentation, including the study by Kim, Kanezaki, and Tanaka [38]. Their work proposes an unsupervised segmentation approach applicable to various image types beyond the medical domain. The method uses a deep neural network architecture with different filters and processes to group similar features, eliminating the need for training or manual labeling. However, adapting this particular method for CT scans is crucial considering the unique characteristics of this imaging modality.

Table 1. Summary of studies addressing the segmentation of magnetic resonance imaging and computed tomography images.

https://doi.org/10.1371/journal.pone.0304017.t001

To bridge this gap, this work aims to adapt the method of Kim, Kanezaki, & Tanaka [38] for CT scan segmentation using a predetermined number of masks and performing an initial calibration of the neural network with reference images. This approach allows the segmentation of similar images without training or manual annotation.

The central hypothesis of this study is as follows: if it is possible to develop an unsupervised segmentation model that uses an end-to-end approach to segment intracranial structures and optimizes the network’s hyperparameters, then it can be effectively applied to the specific context of CT exams. This approach would eliminate the need for expert conceptual changes, reduce computational costs, and simplify the annotation task by facilitating the identification of regions with abnormalities. The main objective is to implement a deep neural network architecture for segmenting intracranial structures without prior labeling, manual annotation, or supervision. Specific objectives include: i) evaluating different training techniques for the neural network, ii) applying the optimization process to determine optimal values for the network’s hyperparameters, iii) controlling the number of masks used in the segmentation process, and iv) evaluating the performance of the proposed approach through validation with domain experts and a comparative analysis with other studies in the literature.

This paper presents an original unsupervised method for segmenting medical images, characterized by its high flexibility in defining the number of masks and the inclusion of customizable metrics for each segment. The originality and innovation of this approach lie in optimizing the network’s hyperparameters, without relying on pre-existing labels. This strategy is designed to reduce the financial and time costs, thus enabling faster and more efficient diagnoses. A key significance of this method lies in the effort to simplify traditional supervised segmentation models, which typically demand manual annotation and extensive computational resources. Moreover, the proposed model shows promising potential for application in several medical fields.

This article is structured as follows: The Theoretical Background section discusses the conceptual foundations, which include important theories such as brain anatomy, the Hounsfield Scale, and window sliding. The Methodology section provides a detailed description of the proposed approach, while Results section describes the experiments conducted and highlights results, complemented by relevant discussions. Finally, the Conclusion section succinctly summarizes the main conclusions and contributions of this research.

Theoretical background

This section provides an overview of the anatomy, structure, and function of the brain. It also discusses the main differences between CT and MRI and the methods of annotating and labeling medical images. In addition, it explains the concept of segmentation techniques, focusing on the unsupervised approach for medical images.

Brain structures

The human brain consists of several interconnected structures that are involved in higher cognitive functions, memory formation, sensory processing, autonomic and endocrine regulation, motor coordination, and vital functions [39, 40]. It comprises the cerebrum, diencephalon, brainstem (midbrain, pons, and medulla), and cerebellum. The cerebrum, the largest part of the brain, encompasses the cerebral hemispheres, basal ganglia, and white matter tracts. Each cerebral hemisphere is divided into five lobes listed in descending order of size: the frontal lobe, temporal lobe, parietal lobe, occipital lobe, and insular lobe. The corpus callosum serves as the connection between the two hemispheres. The brain has characteristic folds called gyri (singular: gyrus) and grooves called sulci (singular: sulcus), which increase its surface area for information processing [41–43].

Fig 1(a), adapted from [44], illustrates the brain's surface. The gray matter is the substance of the brain that contains the neuronal cell bodies. Within the cerebrum, the two main gray matter locations are on the surface of the gyri, known as the cortical gray matter, and in the nuclei of the basal ganglia. In contrast, white matter consists of fiber tracts comprising neuronal axons [45]. Compared to white matter, gray matter has a higher density, enabling the differentiation of the two on imaging examinations [46]. The brain also has a ventricular system consisting of internal ventricles filled with cerebrospinal fluid (CSF): two lateral ventricles, a third ventricle in the midline, and a fourth ventricle [39]. This fourth ventricle, located between the pons of the brainstem and the cerebellum, is mainly visible in lower cross-sectional slices [47–51].

Fig 1.

Brain mapping: (a) specific brain regions and (b) different brain structures.

https://doi.org/10.1371/journal.pone.0304017.g001

The segmentation of brain structures contributes to quantifying brain volume and diagnosing neurological diseases [39, 40]. Fig 1(b) illustrates an axial image of a brain CT scan, highlighting discernible regions.

Computed tomography × magnetic resonance imaging

CT and MRI are widely used imaging techniques for the diagnosis of neurological diseases. CT uses X-rays to produce detailed cross-sectional images of the human body, while MRI uses strong magnetic fields and radiofrequency waves to produce images with higher contrast resolution, allowing better differentiation between various tissue types based on their magnetic properties [52]. In contrast to CT, MRI is not associated with ionizing radiation, which makes it safer in terms of radiation exposure. However, CT has distinct advantages, such as higher spatial resolution, faster image acquisition, and lower costs, making it more accessible in public institutions [52]. Fig 2(a) and 2(b), adapted from Le [53], illustrate the differences between a CT and an MRI scan performed on the same individual and in the same cross-sectional view of the brain.

Fig 2.

Brain exams: (a) computed tomography and (b) magnetic resonance imaging.

https://doi.org/10.1371/journal.pone.0304017.g002

In CT, the X-rays pass through the body and are captured by sensors that convert them into electrical signals. These signals are reconstructed by a computer, resulting in volumetric images presented in slices as the X-ray tube completes a full 360° rotation around the body [52]. Modern scanners typically produce images with an average thickness of 0.5 mm. Ongoing advances in medical imaging and technology have led to specific protocols designed for accurate diagnoses. These protocols seamlessly integrate devices, standardized procedures, and AI applications [54]. To ensure interoperability and consistency, digital images adhere to standardized formats, such as the DICOM standard (Digital Imaging and Communications in Medicine). DICOM defines universal standards for storage and communication, regardless of the manufacturer or vendor [55]. This standardization improves compatibility and facilitates the seamless exchange of medical imaging data in the healthcare ecosystem.

In CT scans, the digital image is represented by intensity values expressed in Hounsfield units (HU), a numerical scale that assigns values to different substances and tissues based on their radiological attenuation [56]. Hounsfield units are derived from a linear transformation of the measured attenuation coefficients. This transformation is based on the radiodensities of pure water (assigned as 0 HU) and air (assigned as −1000 HU) at standard temperature and pressure. Each pixel in the CT image is assigned an intensity value in HU [57].
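The linear transformation can be stated explicitly. In the standard form, with μ denoting the measured linear attenuation coefficient of the tissue in a voxel:

```latex
\mathrm{HU} = 1000 \times \frac{\mu - \mu_{\text{water}}}{\mu_{\text{water}} - \mu_{\text{air}}}
```

so that pure water maps to 0 HU and air to −1000 HU by construction.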

This approach enables the classification of different tissues, with bones appearing as white regions and soft tissues such as muscles and organs having intermediate values visualized in shades of gray [56]. Regions containing air are displayed in darker tones. By applying filters or thresholds, specific tissues can be highlighted for visualization. Generally, CT images use 12-bit values capable of storing intensities between −1024 and 3071 HU [58]. The presentation of these values is determined by the application of specific window and level settings, allowing for better visualization of anatomical structures in different parts of the human body [56, 59, 60]. This windowing adjusts the values in the grayscale range in radiological exams, improving the visualization of structures of interest. It involves selecting specific attenuation intervals and associating them with corresponding ranges in the grayscale. The proper use of this procedure improves the identification of regions of interest [61].
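As a minimal sketch of this windowing step, the following maps HU values to an 8-bit grayscale range; the level/width values are illustrative of a commonly used brain window, not parameters taken from this study:

```python
import numpy as np

def apply_window(hu_image, level, width):
    """Map HU values to 8-bit grayscale using window level/width settings."""
    lo = level - width / 2.0
    hi = level + width / 2.0
    # Attenuation values outside the window are saturated to black/white.
    clipped = np.clip(hu_image, lo, hi)
    return ((clipped - lo) / (hi - lo) * 255.0).astype(np.uint8)

# Illustrative pixels: air, water, brain tissue, dense bone (in HU)
hu = np.array([[-1000, 0], [40, 3000]], dtype=np.float32)
display = apply_window(hu, level=40, width=80)  # a common brain window
```

Air and anything below 0 HU render as black (0), dense bone saturates to white (255), and soft tissue falls in the intermediate grays.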

Annotation and labeling in medical images

The manual annotation process of digital images is performed by experts who identify and label spatial regions present in the image, providing descriptions and comments in the textual form [62]. In the medical field, image annotation requires precision, often involving the participation of multiple radiologists who perform independent annotations [54]. After individual labeling, the annotations are collectively reviewed for further corrections and updates. However, this manual annotation process is financially and temporally demanding, often requiring the involvement of multiple professionals [62].

To address these challenges, several computational tools have been developed to enable automatic image annotation, ranging from classification labels to pixel-by-pixel segmentation [63]. In medical images, automatic annotation is applied to label MRI and CT images for training machine learning and deep learning models [54]. These tools play a fundamental role in the segmentation of structures in several organs, such as breast masses, lung nodules, retinal vessels, and tumors in the liver, brain, and other regions [62].

Unsupervised segmentation

Image segmentation can be performed in two ways: i) supervised and ii) unsupervised. In supervised segmentation, the model is trained with labeled data, where labels and annotations are input information about the regions of interest in the image [64, 65]. In contrast, in unsupervised segmentation, the model is trained without labels, using clustering techniques or pixel comparisons to automatically identify regions or objects in the image. The supervised approach provides more accurate results but requires prior annotation, while the unsupervised approach is useful when labels are not available or when the image has an unknown variety of regions [66].

Automatic segmentation of medical images is a growing area of research as it promises to aid the diagnosis and monitoring of patients with various diseases [66]. With the integration of AI, image analysis tasks have become more accessible, demanding higher precision in the delineation of regions of interest. The main goal of segmentation is to simplify the representation of relevant information by dividing the image into different regions using techniques that pinpoint objects or their boundaries. There are different approaches, including region-based, edge-based, or object boundary-based techniques [67]. Within this spectrum, unsupervised segmentation aims to separate objects from the image background through pixel comparison or grouping based on similarities. This process involves the application of calculations and evaluation functions tailored to the specific context.

The unsupervised DNN approach has been widely employed in segmenting brain structures or lesions, combining statistical methods, comparative methods, and clustering to achieve accurate results [66]. Recent studies have applied these methods in MRI, exploiting the high quality and contrast resolution of these images to delineate objects and regions in the brain [31, 68, 69]. Evaluation metrics are applied to measure the AI model’s ability to make correct or incorrect predictions, usually based on the confusion matrix. The model’s performance is assessed by comparing true positives to the desired values, resulting in a positive measure for the highest number of correct predictions [70].

The loss function is fundamental for quantifying the difference between the output of the model and the desired value, representing errors on a per-observation basis. Unlike the evaluation function, which includes all model data, the loss function measures the errors individually [71]. In unsupervised approaches, where there are no expected values and no labeled data for training, the loss or the evaluation function serves as a reference during the training of the feature extraction model. The applied function indicates the desired behavior, taking into account errors in prediction or model performance.

Optimization process

The optimization process of DNN models occurs during the training phase. In this process, the model is adjusted via its hyperparameters. These are configurations and decisions that influence the behavior and efficiency of the model but are not learned from the data during training [72]. Some of the optimizers frequently used in DNN are Adaptive Moment Estimation (Adam), Root Mean Square Propagation (RMSprop), and Stochastic Gradient Descent (SGD) [73]. Adam integrates elements of algorithms such as RMSprop [74], while RMSprop is a variant of SGD that adjusts the learning rate individually for each parameter [75]. SGD is the effective version of the classic gradient descent algorithm and is particularly suitable for large datasets [76].
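The update rules differ in how they use the gradient. A minimal NumPy illustration (not a framework API) contrasts one plain SGD step with one RMSprop step on the toy loss L(w) = w², whose gradient is 2w:

```python
import numpy as np

def grad(w):
    # Gradient of the toy loss L(w) = w**2
    return 2.0 * w

lr = 0.1

# Plain SGD: step directly along the negative gradient.
w_sgd = 5.0
w_sgd = w_sgd - lr * grad(w_sgd)

# RMSprop: keep a running average of squared gradients and scale the
# learning rate individually for each parameter.
w_rms, cache, beta, eps = 5.0, 0.0, 0.9, 1e-8
g = grad(w_rms)
cache = beta * cache + (1 - beta) * g ** 2
w_rms = w_rms - lr * g / (np.sqrt(cache) + eps)
```

Both moves shrink w toward the minimum at 0; RMSprop's effective step size adapts to the gradient magnitude, which is what makes it (and Adam, which builds on it) robust across parameters with very different scales.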

The main goal of most optimization processes in DNNs is to minimize the loss function. This is achieved by strategies that search for optimal values for hyperparameters [77]. Optuna is a hyperparameter optimization library for DNNs that uses the history of experiments to determine which hyperparameter configurations are best suited for a given problem [78]. Optuna employs techniques such as decision trees and Bayesian optimization to explore the hyperparameter search space and determine the most efficient combination. It uses the Parzen density estimator (PDE) to model the probability distribution of the hyperparameters during the optimization process [73, 79]. In addition, Optuna interacts with applications or platforms via an application programming interface (API), which enables communication between different software components.
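Optuna automates this search loop (adding TPE-based sampling and trial pruning on top of it). The underlying idea can be sketched, under simplifying assumptions, as a plain random search over the hyperparameter space; the quadratic objective below is a stand-in for the validation loss a real trial would obtain by training the DNN:

```python
import random

random.seed(0)

def objective(lr, momentum):
    # Hypothetical stand-in for the validation loss after training with
    # these hyperparameters; minimized at lr=0.01, momentum=0.9.
    return (lr - 0.01) ** 2 + (momentum - 0.9) ** 2

best = None
for _ in range(200):  # each iteration is one "trial"
    lr = 10 ** random.uniform(-5, -1)     # log-uniform learning rate
    momentum = random.uniform(0.0, 1.0)
    loss = objective(lr, momentum)
    if best is None or loss < best[0]:
        best = (loss, lr, momentum)

best_loss, best_lr, best_momentum = best
```

Where random search draws each trial independently, Optuna's Parzen-estimator sampler instead models the distribution of good and bad trials seen so far and proposes hyperparameters where the ratio favors improvement.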

Methodology

Computed tomography (CT) is a necessary diagnostic tool in the detection of various neurological diseases. However, careful annotation of the images can be challenging, requiring specialized professionals to identify and annotate different brain structures in the search for abnormalities. Therefore, an approach is proposed that employs automated brain region segmentation techniques to expedite the annotation process and minimize manual intervention. Thus, a deep neural network (DNN) is implemented to segment intracranial structures without relying on pre-existing labeling, manual annotation, or direct supervision. Additionally, different training techniques for neural networks are explored, the network hyperparameters are optimized, the ideal number of masks for segmentation is determined, and the overall performance of the approach is evaluated. Segmentation results are validated by comparing them with expert opinions in the field and contextualizing them with similar studies from the literature. The routines utilized in this paper are made available at Santos et al. [80].

Foundation and criteria of the methodology

The proposed methodology focuses on image segmentation without the need for pre-specified training images or pixel labels. In this approach, once the target image is provided, pixel labels and feature representations are jointly optimized, and their parameters are updated through gradient descent. The proposed process alternates between predicting labels and learning network parameters to satisfy three main criteria: i) pixels with similar characteristics should be assigned the same label, ii) spatially contiguous pixels should be assigned the same label, and iii) the number of unique cluster labels should contain a high number of pixels. These criteria reflect the intuition that desired image segmentation groups similar pixels forming parts or salient objects in the image while maintaining spatial continuity and distinguishing between distinct patterns. To meet these criteria, the approach minimizes the combination of similarity and spatial continuity losses.

The study introduces a novel end-to-end network architecture for unsupervised segmentation of intracranial medical images, which includes normalization and a differentiable clustering function. Furthermore, the spatial continuity loss function has its parameters optimized to mitigate limitations related to fixed segment boundaries, as observed in previous works [38]. The proposed architecture employs linear classification to categorize the features of each pixel into classes, followed by normalization and classification to determine cluster labels. The similarity loss of features is calculated based on the cross-entropy between the normalized response maps and the predicted cluster labels, while the spatial continuity loss is based on the Lp-norm of the horizontal and vertical differences in the response map [81]. The evaluation and validation flow of the proposed methodology is illustrated in Fig 3.

The flowchart shown in Fig 3 outlines the methodology for segmenting brain CT exams. The process starts with the selection of a database of brain CT scans to obtain the desired number of labels NR. It is then checked whether a pre-trained model exists that can generate the desired labels. If no such model exists, a new model is trained (Training). The images segmented by the proposed method are compared with the images generated by existing tools such as CTSeg [82, 83] using segmentation evaluation metrics (Validation). If the results of these metrics fall within the predefined acceptable threshold region ηc, they are evaluated by experts. If there is disagreement, the model optimization process is repeated until the results reach the desired ηc. If the segmentation is approved by the experts, the results are presented in text form. Otherwise, the model is discarded.

Data pre-processing

In order to use CT scans with DNNs, the data must be standardized. To this end, it must be ensured that all exams in the dataset have the same number of slices and uniform dimensions. To achieve this standardization, the exams are subjected to spatial dimension interpolation. In this process, pixels are created or eliminated based on neighboring pixels, allowing for convenient spatial image resizing [84, 85]. The result is a predefined number of image slices and a uniform resolution in width and height. This standardization is necessary to ensure that the deep neural network can analyze the data accurately.

To eliminate irrelevant information in the exams, a windowing procedure is applied with values of 40 HU for the window center and 80 HU for the window width, according to standard medical practice [86, 87]. An algorithm is then used to map the Hounsfield scale to a new range of gray tones. This approach simplifies the calculations during the training of the neural networks, as it involves numbers on a reduced scale, making the process more efficient [60]. At the end of the preprocessing phase, a data structure is obtained with the dimensions [Nf × L × A], where Nf is the number of slices, L is the width, and A is the height of the image in pixels.
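The windowing step above amounts to a clip-and-rescale operation. The following is a minimal sketch, assuming a linear mapping of the 40 HU center / 80 HU width window onto 8-bit gray tones; the function name and output range are illustrative, not taken from the original routines.

```python
import numpy as np

def apply_window(hu_slice, center=40.0, width=80.0, out_max=255.0):
    """Clip a slice in Hounsfield units to the window
    [center - width/2, center + width/2] and rescale linearly to [0, out_max]."""
    lo, hi = center - width / 2.0, center + width / 2.0
    clipped = np.clip(np.asarray(hu_slice, dtype=float), lo, hi)
    return (clipped - lo) / (hi - lo) * out_max

# Air (-1000 HU) maps to 0, bone (+1000 HU) saturates at 255,
# and brain tissue (~30 HU) falls inside the window (here 95.625)
slice_hu = np.array([[-1000.0, 30.0, 1000.0]])
windowed = apply_window(slice_hu)
```

Values outside the window carry no diagnostic contrast for soft tissue, so they collapse to the extremes of the gray range.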

Architecture of the proposed segmentation model

We build on the segmentation model developed by Kim, Kanezaki, and Tanaka [38], in which the authors follow an unsupervised approach that uses clustering techniques and metrics to evaluate the similarity S and the continuity rate Δ(ri, j). We propose a different approach that uses dynamic values to adjust the calculation of S and Δ(ri, j), based on the desired NR as the output of the segmentation of CT scans. The architecture of the proposed segmentation model is indicated in the gray block in Fig 3 and illustrated in Fig 4, where the black flow represents the learning phase of the segmentation, while the red flow represents the final phase of the iterative process when the desired NR is achieved.

The feature extractor, as shown in Fig 4, comprises configurable convolutional layers and activation functions that are important for extracting features from images, especially in unsupervised approaches that aim to address the similarity constraint between pixels [88, 89]. After feature extraction, normalization is performed with a certain number of convolutional filters NF for each layer [90]. Sivakumar [91] describes that CT images are inherently susceptible to noise and that grouping pixels based on S requires normalization of the pixels in the image. In the context of this study, images with dimensions [L × A] pixels are processed by the neural network, resulting in a one-dimensional tensor [Pn]. Each position in this structure stores a processed image while maintaining the original dimensions [L × A]. The result of the normalization is then a three-dimensional structure with the dimensions [L × A × Pn], which consists of images processed by different NF.

The permutator operates on the output of the feature extractor, constructing the tensor [Pn]. This structure provides data for the evaluation process of segmentation learning, shown in the black flow in Fig 4, and also feeds the structure needed for the application of the one-dimensional maximum value filter, represented by the red flow in Fig 4. Each element of [Pn] corresponds to an image processed by a specific NF. Thus, considering N as the number of pixels in each image, this data structure is transformed into a new tensor with dimensions [N × Pn] in pixels. After permutation, two reconfiguration procedures are performed: i) Remodeler 1 and ii) Remodeler 2. The aim of these procedures is to reconstruct the data structure in the corresponding dimensions. Remodeler 1 processes the data structure of the permutator, which is scaled to [N × Pn], resulting in a new structure with the dimensions [A × L × Pn]. Remodeler 2, in turn, takes the tensor [N] as input and converts it into a tensor of [A × L] pixels.
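The tensor manipulations performed by the permutator and the two remodelers amount to reshaping between the spatial layout and a per-pixel feature matrix. A toy NumPy sketch with made-up dimensions (not the 512 × 512 exams) illustrates the round trip:

```python
import numpy as np

A, L, Pn = 4, 5, 3          # toy dimensions: height, width, feature maps
N = A * L                   # number of pixels per image

# Feature-extractor output: Pn response maps of A x L pixels each
features = np.random.rand(A, L, Pn)

# Permutator: flatten the spatial dimensions -> one row of Pn features per pixel
permuted = features.reshape(N, Pn)

# Remodeler 1: restore the spatial layout [A x L x Pn]
remodeled1 = permuted.reshape(A, L, Pn)

# Maximum-value filter + Remodeler 2: one label per pixel, back to [A x L]
labels = permuted.argmax(axis=1).reshape(A, L)

assert np.allclose(remodeled1, features)   # reshaping loses no information
assert labels.shape == (A, L)
```

Since both remodelers only reinterpret the memory layout, no pixel values are changed along the way.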

The calculation of S generates positive values for similar features, while negative values are always converted to zero. This approach adjusts the values to the desired range, creating the similarity map. This map ensures that similar features are related or close to each other while distinguishing different behaviors depending on the type of image or examination processed. In this work, the calculation of S is done using the cross-entropy function, given by:

S = −Σi p(xi) log q(xi) (1)

in which p(xi) denotes the probability of the actual class obtained from the reference map for pixel xi, and q(xi) represents the probability estimated by the neural network from the similarity map for the same pixel. The cross-entropy serves as an indicator of the agreement between the model output and the reference data. In medical image segmentation, cross-entropy is the metric used to evaluate the correspondence between the segmentation determined by the neural network and the desired segmentation, allowing the quality of the segmentation obtained to be quantified.
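As an illustration of the cross-entropy calculation, the per-pixel loss between a softmax-normalized response map and its own argmax labels (the self-supervision target used in this family of methods) can be sketched in NumPy; the function name and toy scores are ours:

```python
import numpy as np

def similarity_loss(response, labels):
    """Pixel-wise cross-entropy between the softmax-normalized response
    map and the predicted (one-hot) cluster labels.

    response: [N, Pn] raw scores per pixel; labels: [N] integer cluster ids.
    """
    # Softmax gives q(x_i), the per-pixel class probabilities
    exp = np.exp(response - response.max(axis=1, keepdims=True))
    q = exp / exp.sum(axis=1, keepdims=True)
    # p(x_i) is one-hot on the target label, so the sum reduces to -log q[label]
    n = np.arange(response.shape[0])
    return -np.log(q[n, labels] + 1e-12).mean()

scores = np.array([[2.0, 0.1, 0.1], [0.2, 3.0, 0.3]])
labels = scores.argmax(axis=1)    # self-supervised targets per pixel
loss = similarity_loss(scores, labels)
```

The loss shrinks as the response map becomes more confident about its own argmax labels, which is what drives the clustering toward coherent regions.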

The calculation of Δ(ri, j) is used to accurately identify objects present in images. This process is based on the observation that similar pixel values indicate continuity in certain regions, while abrupt changes may indicate the boundaries of regions or objects. Δ(ri, j) analyzes the three-dimensional tensor [A × L × Pn] by comparing all its elements. For example, if a part of the image is contained in a certain element, the technique calculates the difference of possible reliefs between other elements and thus measures the continuity of these features. The result of Δ(ri, j) is the delineation of the regions present in the images, which indicates the differences between the types of objects present.

Δ(ri, j) can be computed by overlapping filtered images on the original image and shifting them in different directions relative to the original image. The pixel at a particular position appears displaced in the overlapping images, allowing its distance to the same pixel in the other images to be calculated. Smaller distance values indicate spatial continuity, which means that there were no abrupt changes in the object. Larger distance values, on the other hand, indicate an interruption of the object in the two images, suggesting spatial discontinuity.

The method proposed by Kim, Kanezaki, and Tanaka [38] evaluates the absolute distance between the original image and its response map in several directions. This method identifies discrepancies in the continuity of pixels relative to their neighbors, quantified by the mean absolute error. In this paper, we propose some modifications to the approach of Kim, Kanezaki, and Tanaka [38]. For example, we introduce the function Δ(ri, j) to replace the mean absolute error. The authors of the original paper consider the distance λ to be constant for all cases, λ = 1. In contrast, we determine the distance λ based on the desired value of NR, which is treated as a hyperparameter to be optimized. For the specific region in the image considered as the center point and the corresponding region in the other image after applying different NF, Δ(ri, j) measures the difference between these regions. In Cartesian coordinates, Δ(ri, j) tends to be more pronounced at the most extreme points. These adjustments include the introduction of dynamic values for comparing the central image and its response map, as well as specific values to determine the desired NR. Similar to Shibata et al. [81], we consider the L1-norm of the horizontal and vertical differences of the response map as a spatial constraint. Thus, the process can be implemented through the differential operator, defining the spatial continuity loss given by:

Δ(ri, j) = Σi=1..w−λ Σj=1..h−λ ( |ri+λ, j − ri, j| + |ri, j+λ − ri, j| ) (2)

where λ is the distance to be determined based on the desired NR, ri, j refers to the response map, ri+λ, j and ri, j+λ represent the pixel values at the positions shifted by λ from position i, j in the response map, and w and h correspond to the width and height of the image, respectively. The selection of the λ value plays a crucial role in the segmentation of structures in medical exams. Choosing a larger λ value may result in smaller regions being neglected in the segmented exams, while a smaller λ value has the potential to segment small and disconnected structures.
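A direct NumPy rendering of this spatial constraint for a single response map, with λ as the shift distance, can look as follows (toy example; variable names are ours):

```python
import numpy as np

def continuity_loss(r, lam=1):
    """Spatial continuity loss: sum of absolute vertical and horizontal
    differences of the response map r ([h, w]) at shift distance lam."""
    dv = np.abs(r[lam:, :] - r[:-lam, :]).sum()   # |r[i+lam, j] - r[i, j]|
    dh = np.abs(r[:, lam:] - r[:, :-lam]).sum()   # |r[i, j+lam] - r[i, j]|
    return dv + dh

# Toy response map: a flat left region and a flat right region
r = np.zeros((4, 4)); r[:, 2:] = 1.0
# lam=1 crosses the region boundary once per row; lam=2 crosses it twice
small, large = continuity_loss(r, lam=1), continuity_loss(r, lam=2)
```

On this toy map the loss doubles when λ goes from 1 to 2, showing how larger shifts penalize fine structure more heavily.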

The proposed evaluation function Faval considers the composition of S and Δ(ri, j). It analyzes the similarity between pixels adjacent to a given pixel, also considering the nearby neighbors. The process begins with feature extraction and culminates in the assignment of labels or categories to the images contained in the tensor [Pn]. The tensor results from the feature extraction and normalization step and encapsulates the relevant features of the original images. After processing and normalization of [Pn], the rectified linear activation function (ReLU) is applied, introducing non-linearity in the outputs of the neural network layers. The evaluation function is therefore given by:

Faval = dsim S + dcont Δ(ri, j) (3)

where dsim and dcont are regularization parameters used to find the balance between fitting the data in S and in Δ(ri, j). They determine the penalty applied in Faval according to the magnitude of each term, ensuring that the fitting process is sensitive to the data. This approach helps prevent overfitting, which could lead to difficulties in generalizing to new datasets, and contributes to the robustness of the model. In the work of Kim, Kanezaki, & Tanaka [38], these parameters are considered constant and both are assigned the same value so that S and Δ(ri, j) have no penalty in Faval. However, in this work, we propose to optimize the values of dsim and dcont, which are indirectly used in the cross-entropy function given by (1), in the calculation of S, and in the spatial continuity loss.

The data structure resulting from the permutation, with dimensions [N × Pn], is submitted to the application of a one-dimensional filter after the evaluation step. This filter is represented by the red flow in Fig 4 and transforms the data structure into a tensor of dimension [N], in which the maximum values are highlighted. The criterion for selecting the key pixels in the one-dimensional filter is based on intensity. This criterion aims to maintain continuous structures while suppressing artifacts that could interfere with cohesion. After applying the one-dimensional filter, Remodeler 2 reconstructs the tensor, enhances the edges and boundaries of each object, and obtains the image of the partial segmentation.

The grouper aims to consolidate pixels by combining repeating values into individual elements. In this way, the labels present in the image are counted. From the one-dimensional vector obtained in the filter step, its elements are identified, classified as unique, and organized. This step results in the unification of pixels in the image, and the number of unique pixels can be quantified, corresponding to the number of structures identified in the unsupervised segmentation. To calculate S, the cross-entropy algorithm described by (1) is used. This process compares the data structure resulting from the permutation step with the maximum values from the filter step.
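The grouper step reduces to counting the unique labels in the flattened segmentation vector, for example with NumPy's `unique` (toy label vector, ours):

```python
import numpy as np

# Toy output of the one-dimensional maximum-value filter: one label per pixel
labels = np.array([2, 2, 5, 5, 5, 7, 2])

# Grouper: identify the unique labels and count the pixels assigned to each
unique, counts = np.unique(labels, return_counts=True)
n_regions = len(unique)   # number of structures found by the segmentation
# unique -> [2 5 7], counts -> [3 3 1], n_regions -> 3
```

The pixel counts per label are what makes it possible to compare the obtained number of structures against the desired NR.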

The unsupervised segmentation approach is automatically guided by two metrics: Δ(ri, j) and S in (3). Δ(ri, j) controls the process through mathematical operations that outline the contours of the objects or structures present in the image. At the same time, the degree of similarity between neighboring pixels is evaluated by calculating S. The combination of these two metrics enables the identification, delineation, and segmentation of specific regions in the image. Δ(ri, j), the calculation of S, and NR are used as stopping criteria for the segmentation algorithm. The algorithm therefore stops when NR is reached or when the maximum number of iterations NI is reached.
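Putting the pieces together, the alternation between label prediction and parameter learning, with NR and NI as stopping criteria, can be sketched as below. This is a deliberately simplified stand-in (a per-pixel linear classifier on raw intensity, gradient steps on the similarity term only, toy image); all names and simplifications are ours, not the paper's implementation.

```python
import numpy as np

def unsupervised_segment(img, pn=10, n_target=3, max_iters=100, lr=0.5, seed=0):
    """Sketch of the alternating loop: predict cluster labels by argmax,
    then update a per-pixel linear classifier with the softmax cross-entropy
    (similarity) gradient; stop when the number of unique labels drops to
    the target NR or the iteration budget NI is exhausted."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    x = img.reshape(-1, 1).astype(float)        # one intensity feature per pixel
    x = np.hstack([x, np.ones_like(x)])         # append a bias term
    W = rng.normal(scale=0.1, size=(2, pn))     # linear classifier over pn labels
    for _ in range(max_iters):
        scores = x @ W                          # [N, pn] response map
        labels = scores.argmax(axis=1)          # predicted cluster labels
        if len(np.unique(labels)) <= n_target:  # stopping criterion on NR
            break
        # softmax cross-entropy gradient using the argmax labels as targets
        e = np.exp(scores - scores.max(axis=1, keepdims=True))
        q = e / e.sum(axis=1, keepdims=True)
        q[np.arange(len(labels)), labels] -= 1.0
        W -= lr * (x.T @ q) / len(labels)
    return labels.reshape(h, w)

# Toy two-region "exam": with only two distinct intensities the NR criterion
# is met immediately; on real exams the loop iterates and merges labels
img = np.zeros((8, 8)); img[:, 4:] = 1.0
seg = unsupervised_segment(img, pn=10, n_target=3)
```

Because pixels with identical features receive identical scores, each flat region is guaranteed to be assigned a single label, which is the behavior the similarity criterion enforces.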

Hyperparameter optimization

Hyperparameters are predefined configurations that guide the training process of a machine learning model. Unlike model parameters, which are directly adjusted by the data, hyperparameters are defined before training begins and remain constant throughout the entire process. The correct selection of these hyperparameters is essential to achieve satisfactory model performance, thereby saving time and computational resources. In this study, as shown in the blue block of Fig 3, an optimization process is applied to determine the most effective hyperparameters, including: i) optimizer OPT, ii) number of convolutional filters NF, iii) number of convolutional layers NC, iv) distance λ, v) maximum number of iterations NI, vi) regularization rate dsim, vii) regularization rate dcont, and viii) learning rate TA.

The choice of OPT involves evaluating different algorithms to determine which are more efficient in identifying the desired NR. The configuration of NF refers to determining the number of NC in the feature extractor. The optimization of these quantities aims to minimize the evaluation function given the desired NR. Defining the appropriate amount of NC in the architecture of the feature extractor aims to achieve optimal efficiency and capture precise details. Having too few layers may be insufficient to generate the desired features while having too many layers can lead to saturation in feature extraction [92]. The hyperparameter λ is optimized within a predefined value range. Larger values neglect smaller regions in segmented exams, and smaller values segment small and disconnected structures.

When validating the methodology, a loop with a predefined maximum NI is used. The optimization of NI is necessary as it aims to track the progress of Faval during training, allowing comparisons between different hyperparameter configurations. Moreover, this controllable constraint allows simulating the performance of the methodology during the segmentation of brain structures. The value of S is used to ensure that neighboring pixels belong to the same group. However, for images with similar pixel intensities, the values of S can be close to zero even if the images are different. To control the optimization of S, the optimized value is assigned to dsim. To achieve this, the range of values for the optimal or optimized ratio between S and NR must be determined. The effectiveness of the features generated by the extractor is directly linked to TA. Low values of TA can cause the network to stagnate in local minima, while high values can cause the network to continue training even after reaching an optimal or optimized point.
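The search itself was performed with Optuna; as a dependency-free illustration of the same idea, a random search over the hyperparameter ranges listed above can be sketched with the standard library. The `evaluate` objective below is a placeholder of ours, not the paper's Faval.

```python
import random

# Search space mirroring the hyperparameter ranges described in the text
SPACE = {
    "OPT": ["Adam", "RMSprop", "SGD"],
    "NF": range(15, 151),              # convolutional filters
    "NC": range(1, 10),                # convolutional layers
    "lam": range(1, 10),               # continuity distance lambda
    "NI": range(1, 11),                # maximum iterations
    "d_sim": [x / 10 for x in range(1, 51)],   # 0.1 .. 5.0 in steps of 0.1
    "d_cont": [x / 10 for x in range(1, 51)],
    "TA": None,                        # learning rate, sampled log-uniformly
}

def sample(rng):
    """Draw one random configuration from the search space."""
    cfg = {k: rng.choice(list(v)) for k, v in SPACE.items() if v is not None}
    cfg["TA"] = 10 ** rng.uniform(-3, -1)      # [0.001, 0.1]
    return cfg

def evaluate(cfg):
    """Placeholder objective (assumption): stands in for training the
    network with cfg and returning its F_aval."""
    return abs(cfg["NF"] - 100) / 100 + cfg["TA"]

rng = random.Random(42)
trials = [sample(rng) for _ in range(50)]
best = min(trials, key=evaluate)       # configuration minimizing the objective
```

A real study would replace `evaluate` with a full training run per trial, which is exactly what makes smarter samplers such as Optuna's TPE worthwhile.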

Evaluation of the proposed model

The model evaluation process consists of two stages: the first involves comparison with another established segmentation method from the literature, as illustrated in the green block of Fig 3, using the Dice-Sørensen coefficient Dc, a similarity measure commonly used in segmentation assessments [93–95]. The Dc provides a quantitative assessment of the quality of the segmentation generated by the proposed method in comparison to the existing approach. In the second stage, the evaluation is performed by a trained neuroradiologist who examines the segmentations produced by the proposed method.
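The Dc is the standard overlap measure between two masks; a minimal NumPy version for binary masks is shown below (our helper, for illustration — in the paper Dc is computed with 3D Slicer):

```python
import numpy as np

def dice(a, b):
    """Dice-Sørensen coefficient between two binary masks (1 = structure)."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0                      # both masks empty: perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

m1 = np.array([[1, 1, 0], [0, 1, 0]])
m2 = np.array([[1, 0, 0], [0, 1, 1]])
score = dice(m1, m2)    # 2*2 / (3+3) ~= 0.667
```

Values range from 0 (no overlap) to 1 (identical masks), matching the interpretation used later in the comparison tables.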

The specialist evaluates criteria such as: i) accuracy, ii) delineation of contours, and iii) ability to identify relevant features or regions. This qualitative assessment complements the quantitative analysis and validates the efficiency of the new method. After validation, the trained model can be used as a segmentation tool for different brain CT scans and datasets, eliminating the need for further validation. In other words, neither the use of the segmentation block with another method in the green block of Fig 3 nor the evaluation block with a specialist is required.

Results

This section presents the results of the application of the proposed method. This includes describing the employed database, optimizing the hyperparameters, performing the segmentation phase, and conducting subsequent statistical analysis. They are complemented by the illustrations in Figs 3 and 4.

The unsupervised methodology produces segmentation masks, in contrast to results generated by the CTSeg tool [82, 83], which are images with predefined labels. To ensure uniformity, we will refer to the methodology’s results as labels. This terminology choice facilitates the comparison of segments generated by this approach with those from the CTSeg tool, enabling a comprehensive assessment of similarity and confirming that the analyzed segmentation aligns with the corresponding label generated by CTSeg.

Dataset and parameters definition

We use the Computed Tomography Quality 500 (CQ500) database, a public dataset containing 500 brain CT scans performed at various hospitals in India. This dataset includes high-quality images that have been properly anonymized [25]. In addition to the images, CQ500 provides clinical reports prepared by three radiologists with 8, 12, and 20 years of experience in brain CT interpretation. The image files are structured as follows: i) patient identification, ii) study identification, and iii) exam identification, with the files in DICOM format. The exams are identified by codes starting with CQ500CT00x, where 00x represents the exam ID. In this work, we have decided to remove the prefix CQ500 and retain only the suffix CT00x.

Segmentation with a different method, as shown in Fig 3, uses the CTSeg tool [82, 83]. An acceptable threshold of ηc = 0.8 was set; higher metric values indicate greater similarity between images. Following the Pareto principle [96, 97], 100 reference exams were selected from the set of 500 exams of CQ500. Among these 100 exams, one was selected to adjust the hyperparameters to minimize the evaluation function. For the experiments, five exams were randomly selected from the remaining 400 according to the Pareto principle. No abnormalities or findings outside the normal range were detected when analyzing these five exams.

The exams selected for the experiments are: i) CT047, which serves as a reference for hyperparameter tuning, and ii) CT042, CT195, CT200, CT299, and CT418 for testing and validation. Each of these images has a resolution of A × L = 512 × 512 pixels, resulting in a total number of N = A·L = 262144 pixels. After preprocessing, a data structure with the dimensions [Nf = 256, L = 512, A = 512] is obtained. In this study, the data is transformed into a structure of Pn = 100 processed images. Each position in this structure contains a processed image, maintaining the original dimensions of A × L pixels. The proposed unsupervised segmentation method requires pre-training of the feature extractor, which is the core of the process.

Training is performed for each desired NR so that the trained neural network can subsequently be used to segment other exams and validate the results obtained. In our experiments, validation was performed with the five scans CT042, CT195, CT200, CT299, and CT418. Although it is possible to use random slices of exams for segmentation, it is important to consider that brain CT scans have sequencing in the slices and share similar information between adjacent slices. A single brain CT scan can provide sufficient and representative information for neural network training due to the similarities between different slices of the same scan. This approach saves computational resources and time and simplifies the training process. By choosing to train on a single random exam with Nf = 256 slices, the unsupervised method proves to be an efficient and feasible strategy. This is particularly important because unsupervised methods, unlike supervised methods, do not require large amounts of data for training [98, 99].

Optimization results

The search for optimized values for the hyperparameters controls the behavior of the feature extractor during training as well as the behavior of the functions responsible for segmentation. The hyperparameters considered include: i) the optimizer OPT, which covers three different algorithms: Adam, RMSprop, and SGD, ii) the number of convolutional filters NF, ranging from [15, 150], iii) the number of convolutional layers NC, ranging from [1, 9] layers, iv) the distance λ, with values in the range of [1, 9], v) the maximum number of iterations NI, ranging from [1, 10] iterations, vi) the regularization terms for similarity dsim and continuity dcont, with values ranging from [0.1, 5] in increments of 0.1, and vii) the learning rate TA, in the range [0.001, 0.1].

The three optimization techniques were considered due to their popularity in machine learning and neural network training [100, 101]. The ranges of hyperparameters to be optimized were selected based on the empirical knowledge of the researchers. To optimize these hyperparameters, the Optuna search tool was used [78]. A total of twelve simulations were performed for each NR, covering the range 3 ≤ NR ≤ 8 labels. Table 2 presents the averages and standard deviations σ of Dc along the twelve simulations for each NR and for each class: gray matter c1, white matter c2, and skull c4. The reference segmentations considered to calculate Dc are from the CTSeg tool, and the calculation of Dc is performed by 3D Slicer, an open-source software platform for processing and analyzing medical images [102].

thumbnail
Table 2. Mean Dc and σ along the twelve simulations for each NR and for each class c1, c2 and c4.

https://doi.org/10.1371/journal.pone.0304017.t002

Note the absence of some values in Table 2. These omissions are related to discrepancies between the results of the proposed segmentation model and the reference segmentation. Unsupervised methods do not guarantee identical results, as they do not include training of weights and classifiers. For example, when using the segmentations of the CTSeg tool as a reference, the segmentation obtained with the proposed model may exhibit variations in brain tissues and structures. These divergences can lead to a lack of agreement as measured by Dc.

The most efficient configuration from the twelve simulations is the one with the best averaged Dc among the three classes c1, c2 and c4. Table 3 presents the results of the hyperparameter optimization process, displaying the most efficient configuration for each NR value, along with the processing time t [min] for each optimization process. The NR column gives the predefined number of desired labels. We observe that the neural networks expand in both width and depth in order to find the required quantity of labels. Moreover, different TA values were specified for each optimizer.

thumbnail
Table 3. Hyperparameters optimized based on the predefined quantity of labels.

https://doi.org/10.1371/journal.pone.0304017.t003

Comparison between segmentation methods

The proposed method starts with a window level of 40 [HU] and a window width of 80 [HU]. This procedure ensures the presence of tissue of interest, such as the white and gray matter, while excluding unwanted physical artifacts that may be present in the exam. However, this filter can lead to disturbances such as the presence of isolated pixels or small groups of pixels with extremely high values (salt-type noise) or extremely low values (pepper-type noise) [91, 103, 104]. These isolated points or small groups of pixels with discrepant values can distort the image information, making interpretation and clinical use more difficult. Fig 5 presents the result after applying windowing. We notice that the region containing brain tissue has salt-and-pepper noise. There is also differentiation of the ventricular structure, including calcification of the choroid plexus, a physiological phenomenon [105].

When applying the CTSeg tool, it is not necessary to specify NR, because the classes are automatically identified during processing and the human anatomical structures are recognized. Typically, the labels include gray matter, white matter, CSF, skull, extracranial soft tissue, and a background label that does not correspond to any of the previous labels. To avoid favorable bias, the validation uses the three classes in which the CTSeg tool achieves better segmentation performance: i) skull, ii) gray matter, and iii) white matter. The segmentation results were obtained using both the CTSeg tool and the proposed method, which was set with the optimized hyperparameters from Table 3 and adjusted for the same NR as the CTSeg tool. Figs 6 and 7 show these results for exams CT042, CT195, CT200, CT299 and CT418, respectively.

thumbnail
Fig 6.

Segmentation of exams: (a) to (f) CT042 and (g) to (l) CT195.

https://doi.org/10.1371/journal.pone.0304017.g006

thumbnail
Fig 7.

Segmentation of exams: (a) to (f) CT200, (g) to (l) CT299, and (m) to (r) CT418.

https://doi.org/10.1371/journal.pone.0304017.g007

In Figs 6 and 7, the black and white images show the segmentation results obtained with the CTSeg tool, while the multicolored images show results of the proposed approach. Thus, the images are assigned to three different tissue classes: i) the first column represents the skull (bone) c4, ii) the second column corresponds to the gray matter c1, and iii) the third column represents the white matter c2. Results obtained with the optimized parameters shown in Table 3 are presented in Table 4 and compared with the approach proposed by Kim, Kanezaki, and Tanaka [38]. The evaluation of results is expressed in the form of Dc. The reference segmentations considered to calculate Dc are from the CTSeg tool, and the calculation of Dc is performed by the 3D Slicer. The range is 0 ≤ Dc ≤ 1, where 0 stands for no overlap and 1 for 100% overlap.

thumbnail
Table 4. Comparison of Dc between the proposed method versus Kim, Kanezaki, and Tanaka’s method [38].

https://doi.org/10.1371/journal.pone.0304017.t004

All exams presented in Table 4 are the results of about 256 slices after preprocessing. Overall, the values presented in Table 4 show an accuracy of over 65% for the proposed approach, compared to an accuracy of about 33% for the method of Kim, Kanezaki, and Tanaka [38]. The optimization performed thus resulted in a significant increase in accuracy over the method of Kim, Kanezaki, and Tanaka [38]. When analyzing the segmentation results for class c1 in the CT042 exam, it was found that the CTSeg segmentation had no significant correlation with the segmentation of the proposed method. In contrast, when comparing with the segmentation using Kim, Kanezaki, and Tanaka's method [38], a lack of correlation was found only for NR = 3 and NR = 4, with 0.58 ≤ Dc ≤ 0.74 for the remaining labels. Considering all exams for the class c1, the proposed method reached 0.41 ≤ Dc ≤ 0.62, while the method of Kim, Kanezaki, and Tanaka [38] reached 0.44 ≤ Dc ≤ 0.74.

Except for exam CT042, segmentation with CTSeg showed an association in all other exams compared to the proposed method, including all labels of class c1. The proposed method performed better in four exams than Kim, Kanezaki, and Tanaka’s method [38], which segmented all exams but performed better in only two of them. For classes c1 and c2, the proposed method showed better results in terms of Dc than the approach of Kim, Kanezaki, and Tanaka [38]. The optimization provided average improvements of 4% for class c2 and 5% for class c4 compared to the method of Kim, Kanezaki, and Tanaka [38]. These results show that the proposed work with hyperparameter optimization can achieve more accurate white matter segmentation than the approach of Kim, Kanezaki, and Tanaka [38].

The white matter is the brain tissue that establishes the connections between different brain regions and plays an important role in cognitive and motor functions. Precise segmentation of white matter is essential in the imaging assessment of neurological diseases such as multiple sclerosis, dementia, and brain tumors [106–108]. Regarding the overall results for the segmentation of class c4, the approach of Kim, Kanezaki and Tanaka [38] achieved values of 0.23 ≤ Dc ≤ 0.86, while these values for the proposed method were 0.50 ≤ Dc ≤ 0.86. When comparing the two methods over the three classes and five exams analyzed, the proposed method outperforms in ten of the fifteen comparisons, while the method of Kim, Kanezaki and Tanaka [38] outperforms in five, considering the results in terms of Dc. These results show that the proposed method segments different tissues in CT scans, generating masks similar to those of the CTSeg tool.

The optimization process proved to be efficient, since there was no saturation for the class c4, indicating a satisfactory performance of Faval. However, when considering only the best Dc result for each exam in the comparison between the methods, we observed that the proposed method performs worse on the segmentations of class c4. Although the proposed method generates segmentations for 3 ≤ NR ≤ 8, this does not necessarily mean that the class c4 in CTSeg will find similar segmentations for all labels generated by the proposal. The class may or may not be present in each label. When analyzing the segmentation difficulty, we found that out of the 30 segmentations performed in class c4, eleven have no correlation between the proposed method and the CTSeg segmentations. Moreover, in the CT299 exam, the approach of Kim, Kanezaki, and Tanaka [38] for class c4 achieved a better result with Dc = 0.26. These results indicate the complexity of the segmentation of the class c4 and highlight the particular challenges that the proposed method has to face compared to the approach of Kim, Kanezaki, and Tanaka [38].

A detailed analysis of the segmentation results of the CT299 exam by CTSeg, presented in Fig 7(h) and 7(i), reveals asymmetric distortions that indicate segmentation errors. These distortions are noticeable from class c4, Fig 7(g). Here, disturbances occur in the bone segment, accompanied by variations in bone thickness and a distorted morphology. Compared to the segmentation results of our approach, presented in Fig 7(j), no distortions are observed in class c4. Instead, there is bone formation in an ovoid shape, as expected for the human skull. Moreover, in Fig 7(k) and 7(l) we observed that the regions corresponding to classes c1 and c2 maintain the integrity of the tissue without exhibiting asymmetric distortions. This comparison shows that the segmentation performed with the proposed method can in some cases be superior to the segmentation performed with the reference method, the CTSeg tool.

Fig 8 shows the overlap between the segmentation of the CT195 exam performed by the CTSeg tool and the segmentation obtained by the proposed method. By relating the six labels used by CTSeg to the labels obtained by the proposed method, we identified overlap in classes c1, c2, and c4, as well as in the remaining classes: the CSF class c3, the soft tissue class c5, and the background class c6. Fig 8(a) to 8(f) show a partial correspondence between the segmentations of the CTSeg tool and the proposed method.

Fig 8.

Segmentation overlap of the CT195 exam: (a) c1, (b) c2, (c) c3, (d) c4, (e) c5, and (f) c6.

https://doi.org/10.1371/journal.pone.0304017.g008

For the segmentation of classes c1 and c2, shown in Fig 8(a) and 8(b), respectively, the result of the proposed method (deep shade) forms the entire set, while the segmentation by the CTSeg tool (light shade) is a subset of it. In contrast, for classes c3, c4, and c5, shown in Fig 8(c)–8(e), respectively, the relation is inverted at the intersection: the segmentation by the CTSeg tool (light shade) forms the set, and the segmentation by the proposed method (deep shade) is the subset. Finally, for the segmentation of class c6, shown in Fig 8(f), the CTSeg tool incorrectly merged classes c4, c5, and c6, resulting in an inaccurate segmentation, while the proposed method performed the segmentation correctly.

Results presented in Fig 8 underscore the importance of precise segmentation in medical imaging. The reliability of the information obtained from these exams is inextricably linked to the quality of the segmentation, which has a direct impact on diagnoses and clinical decisions. Identifying these symmetric or asymmetric distortions in the segmentation process emphasizes the need to review and optimize procedures. This is crucial to ensure the accuracy and reliability of diagnostic results.

Comparison of the cranial volumetry

Volumetry of head structures is crucial in neurological assessments, especially in neurodegenerative diseases and brain tumors (both in surgical planning and in post-treatment monitoring) [109]. The calculation process involves segmenting the images to identify and delineate a specific structure (e.g., the skull or an intracranial region), resulting in a three-dimensional model. The volume of each slice is determined by multiplying its segmentation area by the slice thickness; the volumes of all slices are then summed to obtain the total volume of the structure. Three-dimensional reconstruction software such as 3D Slicer [102] is used to create volumetric models from the segmented images.
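The slice-wise volume calculation described above can be sketched as follows (a minimal illustration with hypothetical mask data and spacing values, not the study's actual pipeline):

```python
import numpy as np

def structure_volume(masks, pixel_area_mm2, slice_thickness_mm):
    """Estimate the volume of a segmented structure from a stack of
    binary 2D masks: per-slice area times slice thickness, summed."""
    masks = np.asarray(masks, dtype=bool)
    per_slice_area = masks.sum(axis=(1, 2)) * pixel_area_mm2   # mm^2 per slice
    per_slice_volume = per_slice_area * slice_thickness_mm     # mm^3 per slice
    return per_slice_volume.sum()

# Toy example: 3 slices of a 4x4 grid, 1 mm^2 pixels, 5 mm slice thickness.
masks = np.zeros((3, 4, 4))
masks[:, 1:3, 1:3] = 1  # 4 pixels per slice belong to the structure
print(structure_volume(masks, 1.0, 5.0))  # 3 slices * 4 mm^2 * 5 mm = 60.0
```

In practice the pixel area and slice thickness come from the DICOM header of each exam, and tools such as 3D Slicer perform this accumulation internally.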

Cranial volumetry based on the CTSeg segmentations is compared with that based on the proposed method. The segmentations from both approaches were loaded into the 3D Slicer platform to build three-dimensional models and calculate the cranial volumes of the exams selected with the highest Dc values from Table 4. Results of this validation are shown in Table 5, which reports the cranial volumetry values for class c4. The values obtained with the proposed method are higher in 62.5% of the cases. This is due to the post-processing applied by the CTSeg tool, which refines the segmentation contours [82, 83] and thereby reduces the computed volume.

Table 5. Comparison of cranial volumetry values between the proposed method versus CTSeg.

https://doi.org/10.1371/journal.pone.0304017.t005

Table 5 also reports the average of the volumetry values and the percentage error between the volumes determined with the two methods, using the values from the CTSeg tool as the reference. Fig 9 shows a visualization of the cranial volume of class c4 in the CT042 exam with NR = 3, where the segmentation produced by the proposed method is shown in a deep shade and that produced by the CTSeg tool in a light shade. As observed in Table 5, the volume resulting from the segmentation of class c4 by the proposed method is larger than the volume obtained by the CTSeg tool. Fig 9 is intended to facilitate visualization in three dimensions, since the segmentations themselves are performed on two-dimensional slices. By organizing the data, each segmentation is stacked into a resulting matrix that can be interpreted as a three-dimensional volume, enabling projection and subsequent visualization.
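The stacking of per-slice segmentations into a single 3D array, together with the percentage-error comparison against the CTSeg reference volume, can be sketched as follows (function names and numeric values are illustrative, not taken from the study):

```python
import numpy as np

def stack_slices(slice_masks):
    """Combine per-slice 2D segmentation masks into one 3D volume array
    suitable for projection and visualization (e.g. in 3D Slicer)."""
    return np.stack(slice_masks, axis=0)

def percent_error(v_proposed, v_reference):
    """Percentage error of the proposed method's volume, taking the
    reference (CTSeg) volume as ground value."""
    return 100.0 * abs(v_proposed - v_reference) / v_reference

# Five hypothetical 4x4 slice masks stacked into a (5, 4, 4) volume.
volume = stack_slices([np.eye(4) for _ in range(5)])
print(volume.shape)                   # (5, 4, 4)
print(percent_error(1300.0, 1250.0))  # 4.0
```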

By quantifying the volume of segmented structures in a given patient, it is possible to compare them with other patients in the same database, which facilitates the identification of discrepant values. Our approach enables the use of the automatic segmentation model for screening and prioritization of suspected cases, accelerating the diagnosis of critical diseases at an early stage. Moreover, the efficiency and cost-effectiveness of the proposed method are notable: since it is an unsupervised model with modest computational requirements, it reduces the costs associated with data processing and makes the technology more accessible, especially in resource-constrained contexts. Although CT is more accessible than MRI, deep learning models for CT segmentation are less common than those developed for MRI. This work is a step toward reducing this disparity in clinical practice.

Evaluation and validation of the proposed methodology

The validation of the proposed method for unsupervised structural segmentation in brain computed tomography is conducted through the analysis of results by expert physicians in intracranial imaging, encompassing a detailed evaluation of all outcomes. These processes ensure the efficiency, reliability, and clinical relevance of the method, demonstrating its segmentation capability and utility for diagnosis and treatment planning in clinical settings.

Validation by medical specialists.

Results were subjected to expert analysis, revealing that the comparative analysis of CT brain segmentation offers valuable insights into process efficiency and reveals relevant aspects for clinical interpretation. Within the CQ500 dataset [25], we observed a diverse array of supports for fixing the patient's head, suggesting the use of different CT scanners. This diversity is beneficial, as it introduces variability that enables more robust assessment of the proposed method across different CT scanner configurations. The evaluation covered several variables, from the influence of device diversity to the optimized configuration of the number of labels over the range 3 ≤ NR ≤ 8, an interval considered unnecessarily wide in some exams: because many brain tissues occupy overlapping intervals on the Hounsfield scale, a large number of labels yields less clear visualization and detail of the anatomical structures. An appropriate choice of this range is necessary to avoid loss of information or excessive detail that could affect the results, ensuring coherent segmentation of tissues and structures.
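To illustrate why a large NR forces labels onto overlapping intensity intervals, the following sketch assigns coarse tissue names from approximate Hounsfield-unit (HU) windows. The thresholds are textbook approximations for illustration only; the proposed method learns its labels directly and does not use fixed HU windows:

```python
# Approximate HU windows for tissues relevant to brain CT (illustrative
# values only; real windows vary by scanner calibration and protocol).
HU_RANGES = {
    "air":          (-1024, -200),
    "fat":          (-200, -30),
    "CSF":          (0, 15),
    "white matter": (20, 30),
    "gray matter":  (30, 45),
    "bone":         (200, 3000),
}

def coarse_label(hu):
    """Return the first tissue whose HU window contains the value, else None."""
    for tissue, (lo, hi) in HU_RANGES.items():
        if lo <= hu <= hi:
            return tissue
    return None

print(coarse_label(25))    # white matter
print(coarse_label(-500))  # air
```

Note how narrow and adjacent the windows for CSF, white matter, and gray matter are: splitting this band into additional labels (NR = 7 or 8) inevitably mixes structures, consistent with the experts' observations.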

The experts discussed the visual analysis of the results and emphasized the importance of the choice of NR in the segmentation. They described NR = 6 as the visually preferable setting, since it allows differentiation between white and gray matter, and suggested reducing the labeling range to 3 ≤ NR ≤ 6, supported by the observation that this range provides more clarity and anatomical distinctiveness. Results of the CT042, CT195, CT200, CT299, and CT418 exams for NR = 7 and NR = 8, shown in Fig 10, corroborate this recommendation. Segmentation with NR = 6 provided improved detail and delineation of anatomical structures, whereas the configurations with NR = 7 and NR = 8 in Fig 10 mix distinct structures, such as fat and air. This can be explained by the nature of the Hounsfield scale, which makes segmentation more difficult as the number of labels increases. In solid regions, especially at NR = 8, increased speckle is observed.

Fig 10.

Segmentation using the proposed method in exams: (a) CT042 NR = 7, (b) CT042 NR = 8, (c) CT195 NR = 7, (d) CT195 NR = 8, (e) CT200 NR = 7, (f) CT200 NR = 8, (g) CT299 NR = 7, (h) CT299 NR = 8, (i) CT418 NR = 7, and (j) CT418 NR = 8.

https://doi.org/10.1371/journal.pone.0304017.g010

The sequential analysis of the number of labels NR applied to the same exam, such as the CT195 exam in Fig 8, shows that the proposed method can separate or group tissues according to the desired quantity. For segmentations with NR = 6, we observed clarity and definition of regions, indicating the efficiency of the proposed method in delineating anatomical structures. When evaluating the CTSeg tool for the CT299 exam, Fig 7(g) to 7(i), against the proposed method, Fig 7(j) to 7(l), we observed asymmetric deformations in the CTSeg segmentations, which casts doubt on the reliability of the reference tool, even though CTSeg is widely used in medical imaging centers. The proposed method thus serves as an alternative solution that represents an advance in brain CT segmentation and offers improvements in some cases where the widely used method faces challenges.

Evaluation of results.

The study presents an unsupervised methodology for brain CT segmentation, contrasting the results with the supervised method CTSeg, and utilizes the CQ500 database for validation. The generated segmentation is compared with the CTSeg tool, using the Dice coefficient Dc as a similarity metric. The ability of the proposed method to produce precise segmentation masks comparable to those generated by CTSeg, with a focus on the accurate identification of brain tissues, is highlighted. Additionally, a comparison was conducted between the method of Kim, Kanezaki, and Tanaka [38] and the proposed method, demonstrating segmentation accuracy exceeding 65% for the proposed method, in contrast to approximately 33% for the method of Kim, Kanezaki, and Tanaka [38]. Analyses focused on Dc, indicating significant improvements in the results obtained by the proposed method, particularly for segmentations of gray and white matter classes, with optimizations resulting in increased segmentation accuracy.

These results suggest that the optimization of hyperparameters has contributed to enhancing the segmentation accuracy, presenting an efficient strategy for the treatment of brain CT images. Furthermore, the results indicate the effectiveness of the proposed method not only in quantitative terms, with improvements in Dc and segmentation accuracy but also qualitatively, by the ability to accurately segment different brain tissues compared to the CTSeg method. Detailed analysis of Dc indicates that, for certain classes and examinations, the proposed method outperformed CTSeg, highlighting its clinical relevance and potential application in medical practice, particularly for the precise assessment of neurological diseases through CT. This advancement in unsupervised segmentation of brain CT images may facilitate faster and more accurate diagnoses, providing a valuable tool for healthcare professionals in the evaluation of brain conditions.

Discussion

Results obtained with the proposed method involve the unsupervised segmentation of brain CT scans, specifically applied to the CQ500 database. Hyperparameter tuning was performed using optimization techniques available in the Optuna tool [78]. The experiments yielded varying Dice coefficients Dc for different tissue classes. These results were compared with segmentations obtained using both the CTSeg tool and the method proposed by Kim, Kanezaki, and Tanaka [38]. The comparison revealed convergence in values, with some cases favoring our proposed method. These findings align with existing literature [110–112] that employs segmentation techniques based on supervised models. Notably, our unsupervised model is especially relevant because it does not require large amounts of data for training, thus alleviating the computational cost of the training phase. Comparative observations between the proposed method and the CTSeg tool showed similarities in several cases, which is particularly relevant given the widespread use of the CTSeg tool in both research and clinical settings [82, 83].

In most comparisons with Kim, Kanezaki, and Tanaka's approach [38], our proposed method consistently outperformed it; notably, their method was not originally designed for medical images in the DICOM format. Regarding the Dice coefficient Dc values, in certain cases it was not possible to establish a direct correspondence between the labels generated with the proposed method and those obtained with the CTSeg tool. However, this lack of correspondence does not necessarily reflect the quality of the segmentation by the proposed method. The Dc value is calculated by 3D Slicer [102], which uses the segmentation from the CTSeg tool as the reference; that reference includes post-processing to enhance contours, which reduces volume and can sometimes introduce asymmetric distortions in the segmentations.

This study contributes to the optimization of hyperparameters in brain CT segmentation, the innovative use of unsupervised deep neural networks, and the creation of an adjustable evaluation function for different label sets. Despite the remaining challenges in the proposed method, such as the difficulty in correlating the generated labels with the predefined labels, the observed improvements in some anatomical regions compared to existing methods highlight the relevance for clinical practice. This difficulty in correlating the labels may indicate the need for future studies on innovative data post-processing techniques, and the successes may indicate the ability of the proposed method to be applied to different types of segmentation.

Manual segmentation by trained experts is still considered the gold standard for identifying brain structures. However, this approach has its limitations, such as high costs and time-consuming manual effort, which makes it impractical for processing large amounts of data. It is also prone to intraobserver and interobserver variability. Although the segmentations generated by the unsupervised automatic model are limited to certain structures, they can serve as a basis for subsequent validation by experts, which significantly reduces the manual workload. If this model is successfully validated and integrated into real-time radiology workflows, it has the potential to become a valuable tool for screening and prioritizing exams in the radiologist’s routine.

Rapid interpretation of abnormal brain CT scans can improve patient care. The unsupervised approach requires no additional time for manual data annotation before training, as is common with supervised methods; analysis of the available data and identification of patterns or structures do not depend on manual labeling of individual inputs. Unsupervised learning can also reveal non-obvious insights and patterns in the data, facilitating the identification of intrinsic relationships between variables or specific data features.

When applying unsupervised neural networks to real, previously unaudited clinical datasets, challenges arise such as variability in image quality (resolution, noise, illumination, and artifacts), which affects the network's learning capacity and generalization. Artifacts introduced during image acquisition, processing, or transmission can distort information, while anatomical variations among patients hinder pattern identification [113]. Obtaining quality training data can also be challenging. Strategies such as data preprocessing, data augmentation, and involvement of clinical experts are necessary to address these limitations and ensure interpretability and clinical relevance of the results.

A detailed analysis of the use of automated methods for medical image segmentation in surgical and medical practice indicates both benefits and challenges. Automation offers advantages such as efficiency and accuracy in medical image interpretation, but faces limitations such as dependence on data quality and the need for confidence in the results. Conversely, manual segmentation, while precise, is time-consuming and subject to interobserver and intraobserver variation. The combination of automated segmentation and human review emerges as a balanced solution, allowing validation and adjustments by experts and ensuring precise and reliable clinical decisions [114–116].

The field of medical image segmentation is rapidly advancing, with an emphasis on the use of deep learning techniques to enhance segmentation accuracy. Research demonstrates that the application of Convolutional Neural Networks (CNNs) in unsupervised scenarios yields promising results, suggesting even greater potential when combined with supervised and semi-supervised approaches. This integration aims to more efficiently capture the inherent complexities of medical images, thereby enhancing diagnostic accuracy and treatment personalization. Furthermore, the incorporation of multimodal data from various imaging modalities such as MRI, CT, and ultrasound is recognized as a necessary advancement. The combination of these diverse data sources promises to enrich segmentation models, increasing their robustness and precision by providing a more comprehensive and detailed view of the clinical aspects to be analyzed.

Another important direction for the future of medical image segmentation is the development of solutions capable of operating in real-time, especially in clinical contexts where quick decisions are crucial. The ability to perform precise segmentations instantly could revolutionize surgical and diagnostic procedures by enabling immediate interventions based on detailed and reliable information. This implies challenges both in terms of developing highly efficient algorithms and in advancing hardware infrastructure to ensure the feasibility of these technologies in clinical environments. The future of medical image segmentation focuses on enhancing accuracy through deep learning, exploring the richness of multimodal data, and implementing real-time segmentations, promising significant transformations in the healthcare field.

Conclusion

This work developed an unsupervised segmentation method for brain CT based on the approach of Kim, Kanezaki, and Tanaka [38]. The objectives were achieved by implementing a DNN architecture to segment intracranial structures without relying on pre-existing labels, manual annotation, or supervision. Three training techniques were compared, using optimization to find hyperparameters and determine the number of segmentation masks. The method was evaluated by experts and compared with other tools. The main hypothesis was confirmed, demonstrating efficiency in training the neural network on a single random exam, reducing resources, training time, and costs, and easing the workload of the medical expert.

Compared to the approach of Kim, Kanezaki, and Tanaka, our method showed an accuracy exceeding 65%. Results indicated superior performance in white matter segmentation and similar or superior outcomes compared to the CTSeg tool in cranial volumetry. Experts recommended the range 3 ≤ NR ≤ 6 to achieve visually enhanced results. This study contributes significantly to improving the accuracy of brain CT segmentation, which has promising implications in research and clinical settings. It presents a simplified and accessible approach that has the potential to facilitate early detection of abnormal scans, thereby improving patient care.

References

1. Gooch CL, Pracht E, Borenstein AR. The burden of neurological disease in the United States: A summary report and call to action. Annals of neurology. 2017;81(4):479–84. pmid:28198092
2. Patel UK, Anwar A, Saleem S, Malik P, Rasul B, Patel K, et al. Artificial intelligence as an emerging technology in the current care of neurological disorders. Journal of neurology. 2021;268:1623–42. pmid:31451912
3. World Health Organization. Neurological disorders: public health challenges. World Health Organization; 2006.
4. Lima AA, Mridha MF, Das SC, Kabir MM, Islam MR, Watanobe Y. A Comprehensive Survey on the Detection, Classification, and Challenges of Neurological Disorders. Biology. 2022;11(3):469. pmid:35336842
5. Usman MB, Ojha S, Jha SK, Chellappan DK, Gupta G, Singh SK, et al. Biological databases and tools for neurological disorders. Journal of Integrative Neuroscience. 2022. pmid:35164477
6. Wahl B, Cossy-Gantner A, Germann S, Schwalbe NR. Artificial intelligence (AI) and global health: how can AI contribute to health in resource-poor settings? BMJ global health. 2018;3(4):e000798. pmid:30233828
7. Langen KJ, Galldiks N, Hattingen E, Shah NJ. Advances in neuro-oncology imaging. Nature Reviews Neurology. 2017;13(5):279–89. pmid:28387340
8. Cordeiro FR, Carneiro G. A survey on deep learning with noisy labels: How to train your model when you cannot trust on the annotations? In: 2020 33rd SIBGRAPI conference on graphics, patterns and images (SIBGRAPI). Recife, Brazil: IEEE; 2020. p. 9-16.
9. Gatidis S, Hepp T, Früh M, La Fougère C, Nikolaou K, Pfannenberg C, et al. A whole-body FDG-PET/CT Dataset with manually annotated Tumor Lesions. Scientific Data. 2022;9(1):601. pmid:36195599
10. Sait U, KV GL, Shivakumar S, Kumar T, Bhaumik R, Prajapati S, et al. A deep-learning based multimodal system for Covid-19 diagnosis using breathing sounds and chest X-ray images. Applied Soft Computing. 2021;109:107522. pmid:34054379
11. Schmidhuber J. Deep learning in neural networks: An overview. Neural networks. 2015;61:85–117. pmid:25462637
12. Caruana R, Niculescu-Mizil A. An empirical comparison of supervised learning algorithms. In: Proceedings of the 23rd international conference on Machine learning. Carnegie: ACM; 2006. p. 161-8.
13. Tajbakhsh N, Jeyaseelan L, Li Q, Chiang JN, Wu Z, Ding X. Embracing imperfect datasets: A review of deep learning solutions for medical image segmentation. Medical Image Analysis. 2020;63:101693. pmid:32289663
14. Yang L, Zhang Y, Chen J, Zhang S, Chen DZ. Suggestive annotation: A deep active learning framework for biomedical image segmentation. In: Medical Image Computing and Computer Assisted Intervention - MICCAI 2017: 20th International Conference, Quebec City, QC, Canada, September 11-13, 2017, Proceedings, Part III 20. Quebec City, QC, Canada: Springer; 2017. p. 399-407.
15. Rister B, Yi D, Shivakumar K, Nobashi T, Rubin DL. CT-ORG, a new dataset for multiple organ segmentation in computed tomography. Scientific Data. 2020;7(1):1–9. pmid:33177518
16. Senthilkumaran N, Rajesh R. Image segmentation-a survey of soft computing approaches. In: 2009 International Conference on Advances in Recent Technologies in Communication and Computing. Kottayam, Kerala, India: IEEE Computer Society; 2009. p. 844-6.
17. Zhang J, Zhao X, Chen Z, Lu Z. A review of deep learning-based semantic segmentation for point cloud. IEEE Access. 2019;7:179118–33.
18. Nazir S, Dickson DM, Akram MU. Survey of explainable artificial intelligence techniques for biomedical imaging with deep neural networks. Computers in Biology and Medicine. 2023:106668. pmid:36863192
19. Pham DL, Xu C, Prince JL. A survey of current methods in medical image segmentation. Annual review of biomedical engineering. 2000;2(3):315–37.
20. Mahata N, Kahali S, Adhikari SK, Sing JK. Local contextual information and Gaussian function induced fuzzy clustering algorithm for brain MR image segmentation and intensity inhomogeneity estimation. Applied Soft Computing. 2018;68:586–96.
21. Ker J, Wang L, Rao J, Lim T. Deep learning applications in medical image analysis. IEEE Access. 2017;6:9375–89.
22. Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems. 2012;25.
23. Szegedy C, Ioffe S, Vanhoucke V, Alemi AA. Inception-v4, inception-resnet and the impact of residual connections on learning. In: Thirty-first AAAI conference on artificial intelligence. vol. 31. San Francisco, California, USA: AAAI Press; 2017.
24. Monteiro M, Newcombe VF, Mathieu F, Adatia K, Kamnitsas K, Ferrante E, et al. Multiclass semantic segmentation and quantification of traumatic brain injury lesions on head CT using deep learning: an algorithm development and multicentre validation study. The Lancet Digital Health. 2020;2(6):e314–22. pmid:33328125
25. Chilamkurthy S, Ghosh R, Tanamala S, Biviji M, Campeau NG, Venugopal VK, et al. Deep learning algorithms for detection of critical findings in head CT scans: a retrospective study. The Lancet. 2018;392(10162):2388–96. pmid:30318264
26. Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention. Munich, Germany: Springer International Publishing; 2015. p. 234-41.
27. Li L, Wei M, Liu B, Atchaneeyasakul K, Zhou F, Pan Z, et al. Deep learning for hemorrhagic lesion detection and segmentation on brain CT images. IEEE journal of biomedical and health informatics. 2020;25(5):1646–59.
28. Balafar MA, Ramli AR, Saripan MI, Mashohor S. Review of brain MRI image segmentation methods. Artificial Intelligence Review. 2010;33(3):261–74.
29. Atkins MS, Mackiewich BT. Fully automatic segmentation of the brain in MRI. IEEE transactions on medical imaging. 1998;17(1):98–107. pmid:9617911
30. Lee TH, Fauzi MFA, Komiya R, Haw SC. Unsupervised abnormalities extraction and brain segmentation. In: 2008 3rd International Conference on Intelligent System and Knowledge Engineering. vol. 1. Xiamen, China: IEEE; 2008. p. 1185-90.
31. Dalca AV, Yu E, Golland P, Fischl B, Sabuncu MR, Eugenio Iglesias J. Unsupervised deep learning for Bayesian brain MRI segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Vancouver, Canada: Springer; 2019. p. 356-65.
32. Khan AR, Khan S, Harouni M, Abbasi R, Iqbal S, Mehmood Z. Brain tumor segmentation using K-means clustering and deep learning with synthetic data augmentation for classification. Microscopy Research and Technique. 2021;84(7):1389–99. pmid:33524220
33. Raja PS, et al. Brain tumor classification using a hybrid deep autoencoder with Bayesian fuzzy clustering-based segmentation approach. Biocybernetics and Biomedical Engineering. 2020;40(1):440–53.
34. Hua L, Gu Y, Gu X, Xue J, Ni T. A novel brain MRI image segmentation method using an improved multi-view fuzzy c-means clustering algorithm. Frontiers in Neuroscience. 2021;15:662674. pmid:33841095
35. Lenchik L, Heacock L, Weaver AA, Boutin RD, Cook TS, Itri J, et al. Automated segmentation of tissues using CT and MRI: a systematic review. Academic radiology. 2019;26(12):1695–706. pmid:31405724
36. Almeida JFdF, Pinto LR, Conceição SV, Campos FCCd. Medical centers location and specialists' allocation: a healthcare planning case study. Production. 2019;29.
37. Santos R, Pires A, Almeida R, Pereira W. Computed tomography scanner productivity and entry-level models in the global market. Journal of healthcare engineering. 2017;2017. pmid:29093804
38. Kim W, Kanezaki A, Tanaka M. Unsupervised learning of image segmentation based on differentiable feature clustering. IEEE Transactions on Image Processing. 2020;29:8055–68.
39. Bear M, Connors B, Paradiso MA. Neuroscience: exploring the brain, enhanced edition. Jones & Bartlett Learning; 2020.
40. Kandel ER, Schwartz JH, Jessell TM, Siegelbaum S, Hudspeth AJ, Mack S, et al. Principles of neural science. vol. 4. McGraw-Hill New York; 2000.
41. Andreasen NC, Flaum M, Swayze V, O'Leary DS, Alliger R, Cohen G, et al. Intelligence and brain structure in normal individuals. American Journal of Psychiatry. 1993;150:130–0. pmid:8417555
42. Martin A, Chao LL. Semantic memory and the brain: structure and processes. Current opinion in neurobiology. 2001;11(2):194–201. pmid:11301239
43. Maldonado KA, Alsayouri K. Physiology, Brain. In: StatPearls [Internet]. StatPearls Publishing; 2021.
44. Amthor F. Neurobiology For Dummies. Wiley; 2014.
45. Mercadante AA, Tadi P. Neuroanatomy, Gray Matter. StatPearls Publishing; 2020.
46. Budday S, Nay R, de Rooij R, Steinmann P, Wyrobek T, Ovaert TC, et al. Mechanical properties of gray and white matter brain tissue by indentation. Journal of the mechanical behavior of biomedical materials. 2015;46:318–30. pmid:25819199
47. Brinker T, Stopa E, Morrison J, Klinge P. A new look at cerebrospinal fluid circulation. Fluids and Barriers of the CNS. 2014;11(1):1–16.
48. Gonzalo Domínguez M, Hernández C, Ruisoto P, Juanes JA, Prats A, Hernández T. Morphological and volumetric assessment of cerebral ventricular system with 3D slicer software. Journal of medical systems. 2016;40:1–8. pmid:27147517
49. Johnstone E, Frith C, Crow T, Husband J, Kreel L. Cerebral ventricular size and cognitive impairment in chronic schizophrenia. The Lancet. 1976;308(7992):924–6. pmid:62160
50. Sakka L, Coll G, Chazal J. Anatomy and physiology of cerebrospinal fluid. European annals of otorhinolaryngology, head and neck diseases. 2011;128(6):309–16. pmid:22100360
51. Stratchko L, Filatova I, Agarwal A, Kanekar S. The ventricular system of the brain: anatomy and normal variations. In: Seminars in Ultrasound, CT and MRI. vol. 37. Elsevier; 2016. p. 72–83.
52. Bafaraj SM, et al. Evaluation of neurological disorder using computed tomography and magnetic resonance imaging. Journal of Biosciences and Medicines. 2021;9(02):42.
53. Le R, Nguyen M, Yan W. A Web-Based Augmented Reality Approach to Instantly View and Display 4D Medical Images; 2020. p. 691-704.
54. Nowak S, Rüger S. How reliable are annotations via crowdsourcing: a study about inter-annotator agreement for multi-label image annotation. In: Proceedings of the international conference on Multimedia information retrieval. Philadelphia, Pennsylvania, USA: ACM; 2010. p. 557-66.
55. Mildenberger P, Eichelberg M, Martin E. Introduction to the DICOM standard. European radiology. 2002;12(4):920–7. pmid:11960249
56. Toga AW, Mazziotta JC. Brain mapping: the methods. vol. 1. Academic press; 2002.
57. Osborne T, Tang C, Sabarwal K, Prakash V. How to interpret an unenhanced CT Brain scan. Part 1: Basic principles of Computed Tomography and relevant neuroanatomy. South Sudan Medical Journal. 2016;9(3):67–9.
58. Glide-Hurst C, Chen D, Zhong H, Chetty I. Changes realized from extended bit-depth and metal artifact reduction in CT. Medical physics. 2013;40(6Part1):061711. pmid:23718590
59. Broder JS. Diagnostic Imaging for the Emergency Physician E-Book: Expert Consult—Online and Print. Elsevier Health Sciences; 2011.
60. Razi T, Niknami M, Ghazani FA. Relationship between Hounsfield unit in CT scan and gray scale in CBCT. Journal of dental research, dental clinics, dental prospects. 2014;8(2):107. pmid:25093055
61. Radhiana H, Syazarina S, Shahizon Azura M, Hilwati H, Sobri M. Non-contrast computed tomography in acute ischaemic stroke: a pictorial review. Med J Malaysia. 2013;68(1):93–100. pmid:23466782
62. Aljabri M, AlAmir M, AlGhamdi M, Abdel-Mottaleb M, Collado-Mesa F. Towards a better understanding of annotation tools for medical imaging: A survey. Multimedia tools and applications. 2022;81(18):25877–911. pmid:35350630
63. Dias PA, Shen Z, Tabb A, Medeiros H. FreeLabel: a publicly available annotation tool based on freehand traces. In: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV). Waikoloa Village, HI, USA: IEEE; 2019. p. 21-30.
64. Ho PG. Image segmentation. BoD–Books on Demand; 2011.
65. Zhu W, Huang Y, Zeng L, Chen X, Liu Y, Qian Z, et al. AnatomyNet: deep learning for fast and fully automated whole-volume segmentation of head and neck anatomy. Medical physics. 2019;46(2):576–89. pmid:30480818
66. González-Villà S, Oliver A, Valverde S, Wang L, Zwiggelaar R, Lladó X. A review on brain structures segmentation in magnetic resonance imaging. Artificial intelligence in medicine. 2016;73:45–69. pmid:27926381
67. Lei T, Nandi AK. Image Segmentation: Principles, Techniques, and Applications. John Wiley & Sons; 2022.
68. Baur C, Wiestler B, Albarqouni S, Navab N. Deep autoencoding models for unsupervised anomaly segmentation in brain MR images. In: International MICCAI brainlesion workshop. Granada, Spain: Springer; 2018. p. 161-9.
69. Xue JH, Pizurica A, Philips W, Kerre E, Van De Walle R, Lemahieu I. An integrated method of adaptive enhancement for unsupervised segmentation of MRI brain images. Pattern Recognition Letters. 2003;24(15):2549–60.
70. Chai T, Draxler RR. Root mean square error (RMSE) or mean absolute error (MAE). Geoscientific Model Development Discussions. 2014;7(1):1525–34.
71. Willmott CJ, Matsuura K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Climate research. 2005;30(1):79–82.
72. Feurer M, Hutter F. Hyperparameter optimization. Automated machine learning. 2019:3-33. Available from: link.springer.com/chapter/10.1007/978-3-030-05318-5_1.
73. Agrawal T. Hyperparameter optimization in machine learning. Apress Berkeley: Berkeley, CA, USA. 2021:81-108.
74. Kingma DP, Ba J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980. 2014.
75. Hinton G. Deep learning—a technology with the potential to transform health care. JAMA. 2018;320(11):1101–2. pmid:30178065
76. Bottou L, Bousquet O. The tradeoffs of large scale learning. Advances in neural information processing systems. 2007;20.
  77. 77. Srinivas P, Katarya R. hyOPTXg: OPTUNA hyper-parameter optimization framework for predicting cardiovascular disease using XGBoost. Biomedical Signal Processing and Control. 2022;73:103456.
  78. 78. Akiba T, Sano S, Yanase T, Ohta T, Koyama M. Optuna: A next-generation hyperparameter optimization framework. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining. Anchorage AK USA: ACM; 2019. p. 2623-31.
  79. 79. Bergstra J, Bardenet R, Bengio Y, Kégl B. Algorithms for hyper-parameter optimization. Advances in neural information processing systems. 2011;24.
  80. 80. Santos PV, Scoczynski M, Calixto WP. Artificial intelligence-based unsupervised image segmentation model for brain computed tomography. PLoS One [Source Code]. 2024 3.
  81. 81. Shibata T, Tanaka M, Okutomi M. Misalignment-robust joint filter for cross-modal image pairs. In: Proceedings of the IEEE International Conference on Computer Vision; 2017. p. 3295-304.
  82. 82. Brudfors M, Balbastre Y, Flandin G, Nachev P, Ashburner J. Flexible Bayesian Modelling for Nonlinear Image Registration. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer; 2020. p. 253-63.
  83. 83. Brudfors M. Generative Models for Preprocessing of Hospital Brain Scans [PhD]. UCL (University College London); 2020.
  84. 84. Amanatiadis A, Andreadis I. A survey on evaluation methods for image interpolation. Measurement Science and Technology. 2009;20(10):104015.
  85. 85. Parsania P, Virparia PV, et al. A review: Image interpolation techniques for image scaling. International Journal of Innovative Research in Computer and Communication Engineering. 2014;2(12):7409–14.
  86. 86. Ee C, Sim K, Teh V, Ting F. Estimation of window width setting for CT scan brain images using mean of greyscale level to standard deviation ratio. In: 2016 International Conference on Robotics, Automation and Sciences (ICORAS). IEEE. IEEE Computer Society; 2016. p. 1-6.
  87. 87. Ho ML, Rojas R, Eisenberg RL. Cerebral edema. American Journal of Roentgenology. 2012;199(3):W258–73. pmid:22915416
  88. 88. Liu YH. Feature extraction and image recognition with convolutional neural networks. In: Journal of Physics: Conference Series. vol. 1087. IOP Publishing; 2018. p. 062032.
  89. 89. Taye MM. Theoretical understanding of convolutional neural network: concepts, architectures, applications, future directions. Computation. 2023;11(3):52.
  90. 90. Kanezaki A. Unsupervised image segmentation by backpropagation. In: 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE. Calgary, AB, Canada: IEEE; 2018. p. 1543-7.
  91. 91. Sivakumar R. Denoising of computer tomography images using curvelet transform. ARPN Journal of Engineering and Applied Sciences. 2007;2(1):21–6.
  92. 92. Wu Z, Shen C, Van Den Hengel A. Wider or deeper: Revisiting the resnet model for visual recognition. Pattern Recognition. 2019;90:119–33.
  93. 93. Bertels J, Eelbode T, Berman M, Vandermeulen D, Maes F, Bisschops R, et al. Optimizing the dice score and jaccard index for medical image segmentation: Theory and practice. In: Medical Image Computing and Computer Assisted Intervention–MICCAI 2019: 22nd International Conference, Shenzhen, China, October 13–17, 2019, Proceedings, Part II 22. Springer; 2019. p. 92-100.
  94. 94. Eelbode T, Bertels J, Berman M, Vandermeulen D, Maes F, Bisschops R, et al. Optimization for medical image segmentation: theory and practice when evaluating with dice score or jaccard index. IEEE Transactions on Medical Imaging. 2020;39(11):3679–90. pmid:32746113
  95. 95. Thada V, Jaglan V. Comparison of jaccard, dice, cosine similarity coefficient to find best fitness value for web retrieved documents using genetic algorithm. International Journal of Innovations in Engineering and Technology. 2013;2(4):202–5.
  96. 96. Arnold BC. Pareto distribution. Wiley StatsRef: Statistics Reference Online. 2014:1-10.
  97. 97. Sanders R. The Pareto principle: its use and abuse. Journal of Services Marketing. 1987;1(2):37–40.
  98. 98. Huang J, Dong Q, Gong S, Zhu X. Unsupervised deep learning by neighbourhood discovery. In: International Conference on Machine Learning. PMLR. Long Beach, California, USA: PMLR; 2019. p. 2849-58.
  99. 99. Zhao W. Research on the deep learning of the small sample data based on transfer learning. In: AIP conference proceedings. vol. 1864. AIP Publishing LLC; 2017. p. 020018.
  100. 100. Zaheer R, Shaziya H. A study of the optimization algorithms in deep learning. In: 2019 third international conference on inventive systems and control (ICISC). IEEE; 2019. p. 536-9.
  101. 101. Zhang Z. Improved adam optimizer for deep neural networks. In: 2018 IEEE/ACM 26th international symposium on quality of service (IWQoS). Ieee; 2018. p. 1-2.
  102. 102. Pinter C, Lasso A, Wang A, Jaffray D, Fichtinger G. SlicerRT: radiation therapy research toolkit for 3D Slicer. Medical physics. 2012;39(10):6332–8. pmid:23039669
  103. 103. Azzeh J, Zahran B, Alqadi Z. Salt and pepper noise: Effects and removal. JOIV: International Journal on Informatics Visualization. 2018;2(4):252–6.
  104. 104. Toh KKV, Isa NAM. Noise adaptive fuzzy switching median filter for salt-and-pepper noise reduction. IEEE signal processing letters. 2009;17(3):281–4.
  105. 105. Saade C, Najem E, Asmar K, Salman R, El Achkar B, Naffaa L. Intracranial calcifications on CT: an updated review. Journal of radiology case reports. 2019;13(8):1. pmid:31558966
  106. 106. Ghribi O, Maalej A, Sellami L, Slima MB, Maalej MA, Mahfoudh KB, et al. Advanced methodology for multiple sclerosis lesion exploring: Towards a computer aided diagnosis system. Biomedical Signal Processing and Control. 2019;49:274–88.
  107. 107. Pagnozzi AM, Fripp J, Rose SE. Quantifying deep grey matter atrophy using automated segmentation approaches: A systematic review of structural MRI studies. Neuroimage. 2019;201:116018. pmid:31319182
  108. 108. Tran P, Thoprakarn U, Gourieux E, Dos Santos CL, Cavedo E, Guizard N, et al. Automatic segmentation of white matter hyperintensities: validation and comparison with state-of-the-art methods on both Multiple Sclerosis and elderly subjects. NeuroImage: Clinical. 2022;33:102940. pmid:35051744
  109. 109. Giorgio A, De Stefano N. Clinical use of brain volumetry. Journal of Magnetic Resonance Imaging. 2013;37(1):1–14. pmid:23255412
  110. 110. Hema Rajini N, Bhavani R. Automatic classification of computed tomography brain images using ANN, k-NN and SVM. AI & society. 2014;29:97–102.
  111. 111. Tustison NJ, Shrinidhi K, Wintermark M, Durst CR, Kandel BM, Gee JC, et al. Optimal symmetric multimodal templates and concatenated random forests for supervised brain tumor segmentation (simplified) with ANTsR. Neuroinformatics. 2015;13:209–25. pmid:25433513
  112. 112. Henry T, Carré A, Lerousseau M, Estienne T, Robert C, Paragios N, et al. Brain tumor segmentation with self-ensembled, deeply-supervised 3D U-net neural networks: a BraTS 2020 challenge solution. In: Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries: 6th International Workshop, BrainLes 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Revised Selected Papers, Part I 6. Springer; 2021. p. 327-39.
  113. 113. Ambellan F, Lamecker H, von Tycowicz C, Zachow S. Statistical shape models: understanding and mastering variation in anatomy. Springer; 2019.
  114. 114. Furriel BC, Oliveira BD, Prôa R, Paiva JQ, Loureiro RM, Calixto WP, et al. Artificial intelligence for skin cancer detection and classification for clinical environment: a systematic review. Frontiers in Medicine. 2024;10:1305954. pmid:38259845
  115. 115. Zhong NN, Wang HQ, Huang XY, Li ZZ, Cao LM, Huo FY, et al. Enhancing head and neck tumor management with artificial intelligence: Integration and perspectives. In: Seminars in Cancer Biology. Elsevier; 2023.
  116. 116. Lévêque L, Outtas M, Liu H, Zhang L. Comparative study of the methodologies used for subjective medical image quality assessment. Physics in Medicine & Biology. 2021;66(15):15TR02. pmid:34225264