
Hybrid deep learning and feature selection approach for autism detection from rs-fMRI data

  • Mohamed Abd Elaziz ,

    Roles Conceptualization, Methodology, Project administration, Visualization, Writing – original draft, Writing – review & editing

    abd_el_aziz_m@yahoo.com; dahou.abdghani@univ-adrar.edu.dz

    Affiliation Artificial Intelligence Research Center (AIRC), Ajman University, Ajman, United Arab Emirates

  • Nermine Mahmoud,

    Roles Conceptualization, Data curation, Investigation, Project administration, Resources, Writing – original draft, Writing – review & editing

    Affiliation Faculty of Social and Human Sciences, Galala University, Suez, Egypt

  • Ahmed A. Ewees,

    Roles Software, Validation, Writing – original draft

    Affiliation Department of Computer, Damietta University, Damietta, Egypt

  • Mohamed G. Khattap,

    Roles Data curation, Resources, Writing – original draft

    Affiliation Technology of Radiology and Medical Imaging Program, Faculty of Applied Health Sciences Technology, Galala University, Suez, Egypt

  • Abdelghani Dahou ,

    Roles Formal analysis, Methodology, Software, Writing – original draft

    abd_el_aziz_m@yahoo.com; dahou.abdghani@univ-adrar.edu.dz

    Affiliation Mathematics and Computer Science Department, University of Ahmed DRAIA, Adrar, Algeria

  • Safar M. Alghamdi,

    Roles Writing – review & editing, Supervision

    Affiliations Department of Mathematics and Statistics, College of Science, Taif University, Taif, Saudi Arabia, King Salman Center for Disability Research, Riyadh, Saudi Arabia

  • I. Nafisah,

    Roles Resources, Supervision, Visualization, Writing – review & editing

    Affiliation Department of Statistics and Operations Research, College of Sciences, King Saud University, Riyadh, Kingdom of Saudi Arabia

  • Ibrahim A. Fares,

    Roles Data curation, Software, Writing – original draft

    Affiliation Department of Mathematics, Faculty of Science, Zagazig University, Zagazig, Egypt

  • Mohammed Azmi Al-Betar

    Roles Investigation, Methodology, Writing – review & editing

    Affiliations Artificial Intelligence Research Center (AIRC), College of Engineering and Information Technology, Ajman University, Ajman, United Arab Emirates, Center of Excellence in Precision Medicine and Digital Health, Department of Physiology, Geriatric Dentistry and Special Patients Care Program, Faculty of Dentistry, Chulalongkorn University, Bangkok, Thailand

Abstract

Autism Spectrum Disorder (ASD) is a neurodevelopmental condition that is primarily characterized by deficits in social communication and restricted or repetitive behavioral patterns. Although psychologists contribute significantly to the understanding of ASD, offering insights into its cognitive, emotional, and behavioral dimensions through assessments, diagnoses, therapeutic approaches, and family support, the diagnostic process remains complex. This complexity arises from the diverse manifestations of the disorder and the challenges associated with data sharing. In addition, conventional machine learning approaches for ASD detection may struggle with high-dimensional neuroimaging data and may require careful feature engineering. These limitations motivated us to enhance ASD diagnosis by incorporating deep learning (DL) techniques for feature extraction alongside a modified exponential-trigonometric optimization (ETO) algorithm as a feature selection (FS) technique. The modified ETO integrates the Arithmetic Optimization Algorithm (AOA) and the Guided Learning Strategy (GLS) to improve diagnostic performance. To evaluate the effectiveness of the proposed model, we utilized resting-state functional MRI (rs-fMRI) data from the Autism Brain Imaging Data Exchange (ABIDE I). Furthermore, the performance of the proposed model was compared with that of established models. The results indicate that the proposed model achieves competitive and, in most cases, superior performance compared with the benchmark methods in terms of accuracy, sensitivity, and AUC for diagnosing ASD. On average across the three atlas-based feature sets, the proposed model attains an accuracy, sensitivity, and AUC of 73%, 78%, and 79%, respectively.

Introduction

Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by persistent difficulties in social communication and interaction, along with restricted and repetitive behaviors. The term encompasses conditions that were previously diagnosed separately, such as Asperger syndrome and childhood disintegrative disorder. Numerous studies have been conducted in search of specialized therapy and technological designs that can help people cope with and overcome these challenges [1]. According to the World Health Organization (WHO), autism was estimated to affect about 1 in 127 people worldwide in 2021, while the U.S. Centers for Disease Control and Prevention (CDC) reported that approximately 1 in 31 children aged 8 years were identified with ASD in 2022 [2,3]. Although ASD is not curable, early diagnosis is crucial for facilitating timely intervention and improving developmental outcomes. However, conventional diagnostic procedures are often time-consuming, costly, and highly dependent on specialist expertise [4,5]. The WHO also reports that the onset of ASD symptoms typically occurs between the ages of 2 and 3 years. ASD is generally associated with a combination of genetic and environmental factors, and its diagnosis remains challenging because there is no definitive medical test for its identification [2,3].

Children with ASD often present with a range of challenges, particularly in communication and social interactions. These challenges include difficulties in interpreting nonverbal cues such as facial expressions and body language, struggles with forming peer relationships, a lack of spontaneous social initiation, and impairments in emotional reciprocity [6]. Additionally, individuals with ASD may engage in self-harming behaviors and suffer from insomnia. Moreover, it is common for individuals with autism to experience co-occurring conditions such as epilepsy, depression, anxiety disorders, and attention deficit hyperactivity disorder (ADHD) [7]. When it comes to obsessive interests and behaviors, individuals may display atypical responses to various stimuli, engage in repetitive behaviors, and exhibit an intense focus on specific details. On the other hand, some individuals with ASD demonstrate exceptional skills in areas such as visual perception, academics, and music [8]. The intellectual functioning of individuals with ASD can vary significantly, ranging from severe impairment to superior capabilities. Early diagnosis of autism is critical; however, an increasing number of children are being diagnosed at later ages. A comprehensive assessment for diagnosing ASD requires more than just a brief overview and a set of standardized tests administered by professionals. The clinical protocol approach, which is widely adopted, involves a thorough evaluation. Common diagnostic tools used in this process include the Autism Diagnostic Observation Schedule (ADOS) and the Autism Diagnostic Interview-Revised (ADI-R), both of which involve multiple questions and activities assessed by specialists during the diagnostic procedure [9].

The observed rise in prevalence highlights the critical need for precise and timely diagnostic approaches. The disorder manifests in diverse ways, varying by age and individual capabilities. In some cases, the condition may go undetected during childhood, only becoming evident with psychiatric comorbidities during adolescence, which complicates traditional symptom-based diagnostic methods [10,11]. To address this challenge, psychiatric neuroimaging studies aim to identify objective biomarkers through non-invasive techniques like resting-state functional magnetic resonance imaging (rs-fMRI), which investigates the functional connectivity between regions of interest (ROIs) in the brain [12]. Consequently, data-driven machine learning approaches hold great promise for classifying ASD and healthy controls by analyzing neural patterns derived from large-scale datasets such as the Autism Brain Imaging Data Exchange I (ABIDE I) [13,14]. These methods not only improve the reproducibility of results across diverse datasets but also aid in identifying neural networks that differentiate ASD from control participants, thereby advancing our understanding of the neural circuits involved in ASD and supporting early screening and diagnosis [15].

Developing a robust ML model [16–19] for ASD diagnosis faces several challenges, notably small sample sizes and the high dimensionality of the features. Each subject’s feature vector from rs-fMRI data may have tens of thousands of dimensions, which leads to very time-consuming training as well as noise and redundant information introduced by the image recording process [20]. Feature selection (FS) is considered one of the most important dimensionality reduction techniques in ML; it aims to find the important features for a particular dataset and build a minimal subset of crucial features from the input data [21,22]. FS has been applied to ASD diagnosis via various methods, including filter methods, wrapper methods, and embedded techniques [23].
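Wrapper-based FS of this kind is typically driven by a fitness function that trades classification error against subset size. The sketch below is an illustrative example of such a fitness function, not the specific objective used in this study; the nearest-centroid learner, the weight w = 0.99, and all data are our own assumptions:

```python
import numpy as np

def fs_fitness(mask, X_tr, y_tr, X_te, y_te, w=0.99):
    """Wrapper FS fitness: weighted sum of the classifier's test error and
    the fraction of selected features (lower is better)."""
    if not mask.any():
        return 1.0                                # penalize empty subsets
    Xs_tr, Xs_te = X_tr[:, mask], X_te[:, mask]
    # A simple nearest-centroid classifier stands in for the wrapped learner.
    c0 = Xs_tr[y_tr == 0].mean(axis=0)
    c1 = Xs_tr[y_tr == 1].mean(axis=0)
    pred = (np.linalg.norm(Xs_te - c1, axis=1)
            < np.linalg.norm(Xs_te - c0, axis=1)).astype(int)
    err = float((pred != y_te).mean())
    return w * err + (1 - w) * mask.mean()        # error vs. subset-size trade-off
```

A binary metaheuristic would then search over candidate `mask` vectors, keeping those with the lowest fitness.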

Recently, hybrid models that couple FS methods with Deep Learning (DL) architectures have increasingly been used to enhance diagnostic performance. These models exploit the strengths of FS techniques to preprocess data and subsequently apply DL frameworks for classification or regression tasks. An example is the hybrid approach that combines adaptive bacterial foraging optimization with SVM-RFE, which has demonstrated very high accuracy in the diagnosis of ASD [24]. Convolutional neural networks (CNNs), combined with metaheuristic algorithms (MAs), have been used on several fMRI datasets and have shown improved performance in classifying ASD patients [25]. In [26,27], explainable DL models have been proposed to detect ASD using facial images. Almars et al. [28] proposed an interpretable IoT-based EfficientNet technique to detect ASD based on emotion recognition in children. Rai et al. [29] introduced a hybrid DL model named ASD-HybridNet, which integrates region-of-interest time-series data and functional connectivity (FC) maps derived from fMRI data. Altomi et al. [30] developed a DL model based on the NasNetMobile and DeiT networks to extract features, followed by an SVM classifier to detect ASD.

Neuroimaging, particularly rs-fMRI, has emerged as a promising tool for ASD-related research and biomarker discovery. It enables the study of the functional connectivity patterns of the brain that are commonly disturbed in ASD individuals [31]. DL models, when applied with effective FS techniques, have been capable of identifying biomarkers from fMRI data with high accuracy and reliability to improve diagnostic tools. Apart from this, other modalities such as structural MRI and genomic data also joined the DL frameworks, thus increasing the scope towards ASD research [32].

Despite these advances, several challenges remain in this field. A major challenge in ASD research is the heterogeneity of the disorder, which presents in different ways among individuals and across age groups [33]. This diversity makes generalizable models difficult to develop [25,34]. In addition, the limited availability of large datasets poses a challenge for training robust DL models [35].

This study investigates the application of DL-based FS methods for diagnosing ASD. A key objective of our work is to develop an FS technique that is robust to different representations of neuroimaging data. We utilize DL techniques to extract reliable features from rs-fMRI data. Moreover, we improve the performance of ETO [36] by incorporating the AOA [37] and GLS [38] to strengthen its exploration–exploitation balance; the resulting modified optimizer, referred to as METO, is then used to select the most relevant features. The rationale for employing AOA and GLS lies in their established effectiveness in improving the performance of various MAs. For instance, AOA has been successfully applied to problems in electric vehicles [39], photovoltaic systems [40], and FS [41], while GLS has proven effective in addressing a wide range of global optimization and constrained engineering tasks [38].

The contributions of this study are summarized as follows:

  • Propose an alternative autism diagnosis technique based on the integration of DL and FS methods.
  • Employ a DL model that combines a Stacked Sparse Denoising Autoencoder (SSDAE) and a multilayer perceptron (MLP) to extract robust features from rs-fMRI data.
  • Develop an enhanced version of ETO by incorporating the strengths of AOA and GLS.
  • Evaluate the effectiveness of the proposed METO model using multiple autism datasets and compare its performance with state-of-the-art autism detection models.

Related Works

DL algorithms have shown strong potential for improving ASD diagnosis. Their ability to learn complex patterns from high-dimensional neuroimaging data makes them particularly attractive for ASD classification. Several DL-based ASD classification frameworks have demonstrated encouraging performance. Recently, a multi–sparse-autoencoder-based FS method was introduced and applied to the ABIDE I dataset, yielding an accuracy improvement of up to 9.09% [42]. Building on this, a hybrid approach combining SVM-RFE with an SVM classifier achieved an accuracy of 90.60% on larger samples [23]. Another framework, ASD-DiagNet, an autoencoder integrated with a single-layer perceptron, outperformed methods across multiple imaging centers and reached a maximum accuracy of 80% [43].

FS methods have also played a key role in reducing training time while enhancing model accuracy. For instance, a graph-based FS technique combined with a deep belief network reported improved classification performance on the full ABIDE I dataset [44]. Meanwhile, SVM-RFE continued to demonstrate strong classification capability [23]. More recently, functional connectivity features derived from rs-fMRI were fused with deep neural network classifiers, obtaining a mean accuracy of 88% [34]. Additionally, a personalized feature extraction strategy for fMRI achieved a mean classification accuracy of 87.62% and an AUC of 0.92 [45].

Further advancements include a classification model that integrates filter-based FS with a multilayer perceptron trained using a simplified variational autoencoder, achieving an accuracy of 78.12% [46]. Independent research explored ASD identification using self-collected speech recordings from children, demonstrating the feasibility of diagnosis without specialized equipment [47].

Recent progress also involves the hybridization of DL models with the F-score FS approach, reporting intra-site accuracy of 64.53% and ABIDE dataset accuracy of 70.9% [48]. DeepASDPred, a predictive model based on a CNN-LSTM architecture, achieved high accuracy in identifying ASD risk genes [32]. An adaptive bacterial foraging optimization model combined with SVM-RFE, mRMR, and a graph convolutional network resulted in an impressive accuracy of 97.512% [24]. Moreover, hybrid MAs integrated with CNNs achieved 98.6% accuracy on the ABIDE dataset [25]. DL models such as YOLOv8, trained on facial images, reported a maximum classification accuracy of 89.64% and an F1-score of 0.89 [49].

A recent meta-analysis reported encouraging classification performance for DL-based ASD diagnosis in children, further supporting the potential of such techniques in ASD research [50]. In addition, the MADE-for-ASD model, developed using fMRI data [14], integrated multi-brain atlases with demographic information and achieved a classification accuracy of up to 96.40% while identifying brain regions associated with ASD, suggesting a scalable and interpretable direction for ASD detection.

The study in [51] introduces WS-BiTM, a framework that integrates White Shark Optimization (WSO) for FS with a Bidirectional Long Short-Term Memory (Bi-LSTM) classifier to enhance ASD diagnosis. WSO effectively selects the most informative features from autism screening datasets, which are then fed into a Bi-LSTM capable of efficiently handling sequential data. This integrated approach addresses overfitting and improves computational efficiency. Comparative experiments demonstrate that WS-BiTM outperforms baseline models, achieving accuracies of 97.6%, 96.2%, and 96.4% on toddler, adult, and child datasets, respectively. This positions WS-BiTM as a promising tool for robust ASD classification.

The work in [52] proposes a two-stage model for improving ASD classification using brain MRI volumetric data. First, subcortical structures are extracted and processed using a 3D autoencoder to identify ASD-relevant regions. These regions are then fed into a Siamese Convolutional Neural Network (SCNN) classifier. Using regions identified through Mutual Information FS, the model achieved an accuracy of 66%. This study underscores the potential of combining autoencoders with SCNNs to enhance ASD classification from MRI data.

Background

Exponential-Trigonometric Optimization (ETO).

The ETO algorithm is a new metaheuristic approach designed to solve complex optimization problems using a mathematical approach based on exponential and trigonometric functions [36]. Unlike many optimization methods inspired by nature, ETO focuses purely on mathematical principles to balance exploration (searching new regions) and exploitation (refining solutions). This balance is crucial for navigating the search space efficiently, preventing premature convergence, and enhancing solution quality.

ETO introduces adaptive and random mechanisms to enhance flexibility and efficiency, as no single algorithm is best for all problems. By leveraging exponential functions to adjust search agents and trigonometric functions to fine-tune their positions, ETO effectively navigates both global and local search spaces.

With its simple structure, ease of implementation, and parameter adaptability, ETO provides a powerful and efficient alternative to existing optimization techniques, demonstrating strong performance in solving intricate engineering problems. The following sections outline the algorithm’s key steps, including initialization, exploration, and exploitation.

Initialization

The ETO algorithm begins its optimization process with the random initialization of a population of candidate solutions. In this phase, a set of N individuals is generated, where each individual represents a potential solution in a d-dimensional search space. Mathematically, this population is structured as a matrix X, where each row corresponds to an individual and each column represents a specific dimension.

Each individual in the population is represented as:

X_i = [x_{i,1}, x_{i,2}, …, x_{i,d}],  i = 1, 2, …, N   (1)

The initial values for each individual are determined by a uniform random distribution, ensuring that the solutions are spread across the search space. The initialization formula is given by:

x_{i,j} = lb_j + rand() × (ub_j − lb_j)   (2)

where x_{i,j} represents the value of the j-th dimension of the i-th individual, rand() generates a random number within the interval [0, 1], ensuring variability in the population, and the parameters ub_j and lb_j define the upper and lower boundaries for the j-th dimension, respectively, constraining the search space within a predefined range.
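Eq. (2) translates directly into code; the NumPy-based helper below is a minimal sketch (the function and variable names are our own):

```python
import numpy as np

def init_population(n, dim, lb, ub, seed=None):
    """Eq. (2): x_{i,j} = lb_j + rand() * (ub_j - lb_j), uniform in [lb_j, ub_j)."""
    rng = np.random.default_rng(seed)
    lb = np.broadcast_to(lb, (dim,))   # accepts scalar or per-dimension bounds
    ub = np.broadcast_to(ub, (dim,))
    return lb + rng.random((n, dim)) * (ub - lb)
```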

Following the initialization phase, the Changeover Method (CM) regulates the transition between exploration and exploitation to ensure an optimal balance. In the early iterations, CM prioritizes exploration to diversify the search, while later, it shifts towards exploitation to refine promising solutions. The value of CM determines the algorithm’s search strategy: for one range of CM values the algorithm enters exploration mode, expanding the search space to enhance diversity, while for the complementary range it engages in exploitation mode, focusing on refining and improving existing solutions.

The transition is governed by the following equation:

(3)

where t refers to the current iteration, T denotes the total number of iterations, rand() is a random number in the range [0, 1], and the remaining computed parameters are defined in Eqs. (4) and (5).

Two scalar parameters are introduced to systematically shape the search space, initially promoting diversity and gradually narrowing the search as the optimization process advances. By progressively reducing their values over successive iterations, the algorithm enhances both convergence speed and solution precision. Mathematically, both parameters are symmetric functions constrained within a bounded range, ensuring controlled adjustments in search behavior. Their formulations are expressed as follows:

(4)(5)

Effective optimization requires a well-defined search space to balance computational efficiency and solution accuracy. The Constrained Exploration (CE) approach in the ETO algorithm dynamically adjusts the search space, initially allowing broad exploration before gradually narrowing it to enhance convergence.

The adaptation of the search space is triggered when the iteration count t matches a predefined adaptation point, which is determined as follows:

(6)(7)

where the adjustment coefficients tune the timing of the adaptation, and the floor function guarantees integer iteration values. Upon activation, the search space boundaries are updated dynamically:

(8)(9)

where two random coefficients in [0, 1] weight the contributions of the best solution identified so far and the suboptimal (second-best) solution.

By continuously refining the search space, this approach strikes an optimal balance between exploration and exploitation, thereby facilitating greater efficiency and faster convergence in tackling complex optimization challenges.

Exploration

Balancing exploration and exploitation is essential in optimization algorithms to maintain search diversity while ensuring convergence. In the ETO algorithm, the exploration phase is divided into two stages, allowing for a broader search in the initial iterations before refining the search space to improve solution accuracy. The transition between exploration and exploitation is defined as:

(10)

First Exploration Phase

In the initial stage of exploration, individuals begin searching the solution space after population initialization and boundary definition. The early iterations emphasize diversity, with individuals dispersing across the search space to explore potential regions. As the process advances, their movement gradually shifts toward the best-discovered solutions, refining the search direction.

The update of individual positions is governed by:

(11)

where X_best denotes the best solution found so far, while x_{i,j}(t) and x_{i,j}(t+1) represent the current and updated positions of the i-th solution at dimension j. The update also involves a random coefficient within the range [0, 1].

The exploration behavior is further controlled by a weighting coefficient, ensuring individuals move outward from their initial positions while gradually converging over time. It is defined as:

(12)

This coefficient starts with significant variation to encourage broad exploration but stabilizes in later iterations. This adaptive adjustment enables a gradual transition from unrestricted exploration to targeted refinement, improving search efficiency.

Second Exploration Phase

In the second exploration phase, search agents operate more independently, reducing reliance on previously identified optimal solutions. This phase enhances diversification by allowing individuals to explore new areas of the search space without directional bias, relying solely on their current positions for movement. The position update mechanism is defined as:

(13)

where a random number within [0, 1] ensures variability in movement, and a weight coefficient governs the extent of exploration, gradually decreasing over time to refine the search. The weight coefficient is defined as:

(14)

As iterations progress, this coefficient converges toward zero, facilitating a smoother transition from broad exploration to exploitation and ensuring that the search remains diverse while progressively guiding individuals toward the most promising regions.

Exploitation

First Exploitation Phase

To achieve a comprehensive search, the exploitation phase is divided into two stages, ensuring a structured refinement of solutions. In the first phase, the algorithm emphasizes local search, enabling potential solutions to probe their immediate vicinity and progressively expand their search radius.

The position update mechanism in this phase is governed by:

(15)

where the stochastic terms are random values in [0, 1]. The weight coefficient, which dictates the degree of localization, is defined as:

(16)

The coefficient is initially large, promoting broad local movements, and is adjusted gradually over the iterations, refining the search domain.

Second Exploitation Phase

In the final phase of exploitation, the algorithm intensifies its focus on refining solutions near the optimal point identified so far. This stage is characterized by a deep, localized search, where candidate solutions undergo a more rigorous adjustment process to fine-tune their positions. As iterations progress, the emphasis on optimal regions increases, ensuring convergence to the best possible solution.

The update mechanism for candidate positions follows:

(17)

where rand() represents a random number within [0, 1]. The weight coefficient (as defined in Eq. (14)) plays a crucial role in preserving diversity while refining search accuracy. The coefficient c further ensures the diversity among the solutions. It is derived from the exponential and trigonometric functions, ensuring adaptability in the search process. This refined adaptive exploitation strategy optimizes the final search stage, enhancing solution accuracy while ensuring efficient convergence. The coefficient c is defined as:

(18)

In summary, the ETO algorithm is structured around four core components that work synergistically to enhance search efficiency and convergence precision. The Constrained Exploration (CE) strategy dynamically adjusts the search space, ensuring an optimal balance between exploration and computational efficiency. The dual-phase exploration mechanism begins with a broad search across the solution space before gradually refining towards promising regions, while the dual-phase exploitation strategy fine-tunes candidate solutions, focusing on both local refinement and intensified optimization near the best-found solution. Finally, the Changeover Method (CM) acts as a transition mechanism, adaptively balancing exploration and exploitation throughout the optimization process. Together, these components enable ETO to navigate complex search landscapes effectively, ensuring robust performance in solving intricate engineering optimization problems. The complete step-by-step implementation of the ETO algorithm is outlined in Algorithm 1, which provides a structured pseudocode representation of the optimization process.

Algorithm 1 Pseudo-code of the ETO Algorithm.

1: Set the ETO parameters, including population size (N), maximum iterations (T), and problem dimensions (Dim).

2: Randomly initialize solutions X.

3: Calculate fitness values for each solution in X.

4: Allocate the best solution X_best.

5:   While t ≤ T

6: *Changeover Method (CM) & Constrained Exploration (CE)*

7:     Compute the scalar parameters using Eq. (4) and Eq. (5).

8:     Compute CM using Eq. (3).

9:     If the CE adaptation condition (Eq. (6)) is met

10:       Update the search domain using Eq. (8) and Eq. (9).

11:     End If

12: *Dual-phase Exploration*

13:     If CM indicates exploration

14:       If the first exploration stage is selected

15:         Compute the weight coefficient using Eq. (12).

16:         Update x_{i,j} using Eq. (11).

17:       Else

18:         Compute the weight coefficient using Eq. (14).

19:         Update x_{i,j} using Eq. (13).

20:       End If

21:     End If

22: *Dual-phase Exploitation*

23:     If CM indicates exploitation

24:       If the first exploitation stage is selected

25:         Compute the weight coefficient using Eq. (16).

26:         Update x_{i,j} using Eq. (15).

27:       Else

28:         Compute coefficient c using Eq. (18).

29:         Update x_{i,j} using Eq. (17).

30:       End If

31:     End If

32:     t = t + 1

33:   End While
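For orientation, the control flow of Algorithm 1 can be sketched in code. Because Eqs. (3)–(18) are specified in the original ETO paper [36] and not reproduced here, the changeover value and all position updates below are simple placeholders, not the actual ETO rules:

```python
import numpy as np

def eto_skeleton(fitness, dim, lb, ub, n=20, t_max=100, seed=None):
    """Control-flow skeleton of Algorithm 1 with placeholder update rules."""
    rng = np.random.default_rng(seed)
    X = lb + rng.random((n, dim)) * (ub - lb)        # random initialization (Eq. (2))
    fit = np.array([fitness(x) for x in X])
    best, best_fit = X[fit.argmin()].copy(), float(fit.min())
    for t in range(1, t_max + 1):
        cm = 2.0 * rng.random() * (1.0 - t / t_max)  # placeholder for CM (Eq. (3))
        for i in range(n):
            if cm > 0.5:                             # exploration (placeholder threshold)
                step = rng.normal(0.0, 1.0, dim) * (ub - lb) * (1.0 - t / t_max)
                cand = X[i] + step if rng.random() < 0.5 else best + step
            else:                                    # exploitation near the best solution
                cand = best + rng.normal(0.0, 0.1, dim) * (ub - lb) * (1.0 - t / t_max)
            cand = np.clip(cand, lb, ub)
            f = float(fitness(cand))
            if f < fit[i]:                           # greedy replacement
                X[i], fit[i] = cand, f
                if f < best_fit:
                    best, best_fit = cand.copy(), f
    return best, best_fit
```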

Arithmetic Optimization Algorithm (AOA)

The AOA is a metaheuristic inspired by the fundamental arithmetic operators: Multiplication (M), Division (D), Subtraction (S), and Addition (A) [37]. These operators serve as the core mechanism for exploring and exploiting the search space. AOA does not require derivative calculations, making it suitable for complex, non-differentiable optimization problems. By utilizing arithmetic operations for diversification and intensification, AOA proves effective in solving challenging optimization tasks with competitive convergence and computational efficiency. This introduction sets the stage for a detailed examination of AOA’s steps, including initialization, exploration, and exploitation.

Initialization

The AOA begins the optimization process by initializing a set of solutions, denoted as X. The solutions are structured in a matrix as shown in Eq. (19), where each element represents a potential solution within the search space. At each iteration, the best candidate solution is considered as the most optimal solution found so far, or at least nearly the optimal solution.

X = [x_{i,j}],  i = 1, 2, …, N,  j = 1, 2, …, n   (19)

where N is the total number of candidate solutions, and n is the dimension of each individual solution.

Before proceeding with the search, AOA must determine whether to enter the exploration or exploitation phase. This decision is guided by the Math Optimizer Accelerated (MOA) function, which is calculated using the following equation:

MOA(t) = Min + t × ((Max − Min) / T)   (20)

where Max and Min represent the upper and lower bounds of the accelerated function, respectively, t is the current iteration, and T is the maximum number of iterations. This coefficient adjusts the search process to either explore new regions of the search space or exploit known promising regions based on the current iteration and the progress made so far.

Exploration

In the Exploration Phase, the AOA performs mathematical calculations using the D and M operators to explore various regions of the search space. These operators allow the algorithm to cover a broad range of distributed values or decisions, enhancing the search for a near-optimal solution. However, due to their high dispersion, the Division and Multiplication operators may not directly approach the target solution. To mitigate this, AOA also utilizes other operators, such as Subtraction and Addition, which assist in refining the search.

The exploration operators guide the search toward a near-optimal solution through two main search strategies: the Division (D) search strategy and the Multiplication (M) search strategy, which are mathematically modeled as shown in Eq. (21). The exploration phase is governed by the MOA function, given by Eq. (20), which decides the phase of the search process based on the value of a random number, denoted r1. Specifically, if r1 is greater than the value of MOA, the exploration phase continues, executing either the Division or Multiplication operation. The choice of operator is controlled by a second random number, r2, which determines whether the Division or the Multiplication operator is executed at each iteration.

To enhance exploration, a stochastic scaling coefficient is included to generate more diversified candidate solutions and cover new areas within the search space. This ensures that the algorithm simulates the behaviors of the Arithmetic operators effectively and produces a broad search trajectory.

x_{i,j}(t+1) = best(x_j) ÷ (MOP + ε) × ((ub_j − lb_j) × μ + lb_j),  if r2 > 0.5
x_{i,j}(t+1) = best(x_j) × MOP × ((ub_j − lb_j) × μ + lb_j),  otherwise   (21)

In this equation, x_{i,j}(t+1) represents the position of the j-th element of the i-th solution in the subsequent iteration. The term best(x_j) refers to the best-known position for the j-th dimension of the current optimal solution. ub_j and lb_j correspond to the upper and lower bounds for the j-th dimension, respectively, while ε is a small integer used to prevent division by zero. The parameter μ, set to 0.5, governs the search behavior. MOP is the probability of the Math Optimizer defined by Eq. (22).

MOP(t) = 1 − t^(1/α) / T^(1/α)
(22)

where MOP(t) denotes the function value at iteration t, and T is the maximum number of iterations. The exponent α controls the sensitivity of the exploration process and is fixed at 5 based on the experimental setup. This function plays a critical role in adjusting the exploitation accuracy as the algorithm progresses through iterations.

Exploitation

The exploitation phase in the AOA uses the Subtraction (S) and Addition (A) operators, which have low dispersion and are effective in approaching the target solution. These operators refine the search by focusing on areas near the best solution found so far. This phase is conditioned by the MOA function, as in the exploration phase. When the random value r1 does not exceed the current MOA(t) value, the search deepens using the Subtraction or Addition operators. The update mechanism is as follows:

x_{i,j}(t+1) =
  best(x_j) − MOP × ((UB_j − LB_j) × μ + LB_j),   if r3 > 0.5 (Subtraction)
  best(x_j) + MOP × ((UB_j − LB_j) × μ + LB_j),   otherwise (Addition)
(23)

This phase exploits the search space by performing a deep search in the local area, guided by the random value r3. If r3 > 0.5, the Subtraction operator is used; otherwise, the Addition operator is used. The position updates based on the operators D, M, S, and A estimate the final solution within the search bounds, allowing AOA to refine its search for a near-optimal solution. The detailed steps of the AOA are presented in the pseudocode in Algorithm 2.

Algorithm 2 Pseudo-code of the AOA Algorithm.

1: Set the AOA parameters, including population size (N), maximum iterations (T), control parameters (α, μ), and problem dimensions (Dim).

2: Randomly initialize the positions of candidate solutions in the search space.

3: Calculate the fitness values for solutions X.

4: Allocate the best solution X_b.

5:   While t ≤ T do

6:     Update MOA using (20) and MOP using (22).

7:     Generate random numbers r1, r2, and r3 in [0, 1].

8:     If r1 > MOA then

9:     *Exploration*

10:       If r2 > 0.5 then

11:         Update using the Division (D) operator in Eq. (21).

12:       Else

13:         Update using the Multiplication (M) operator in Eq. (21).

14:       End If

15:     Else

16:     *Exploitation*

17:       If r3 > 0.5 then

18:         Update using the Subtraction (S) operator in Eq. (23).

19:       Else

20:         Update using the Addition (A) operator in Eq. (23).

21:       End If

22:     End If

23:     t = t + 1

24:   End While
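As a concrete illustration, the loop in Algorithm 2 can be sketched in Python as follows. This is a minimal sketch, not the paper's implementation: the sphere test function, population size, bounds, and the MOA minimum/maximum values are illustrative assumptions, while α = 5 and μ = 0.5 follow the text.

```python
import numpy as np

def aoa(fitness, dim, lb, ub, n_pop=20, max_iter=200, alpha=5, mu=0.5,
        moa_min=0.2, moa_max=1.0, eps=1e-12, seed=0):
    """Minimal sketch of the Arithmetic Optimization Algorithm loop."""
    rng = np.random.default_rng(seed)
    X = lb + rng.random((n_pop, dim)) * (ub - lb)            # random initial population
    fit = np.array([fitness(x) for x in X])
    best = X[fit.argmin()].copy()
    best_fit = float(fit.min())
    scale = (ub - lb) * mu + lb                              # shared term in Eqs. (21)/(23)
    for t in range(1, max_iter + 1):
        moa = moa_min + t * (moa_max - moa_min) / max_iter   # Eq. (20)
        mop = 1.0 - t ** (1.0 / alpha) / max_iter ** (1.0 / alpha)  # Eq. (22)
        for i in range(n_pop):
            for j in range(dim):
                r1, r2, r3 = rng.random(3)
                if r1 > moa:                                 # exploration, Eq. (21)
                    if r2 > 0.5:
                        X[i, j] = best[j] / (mop + eps) * scale   # Division
                    else:
                        X[i, j] = best[j] * mop * scale           # Multiplication
                else:                                        # exploitation, Eq. (23)
                    if r3 > 0.5:
                        X[i, j] = best[j] - mop * scale           # Subtraction
                    else:
                        X[i, j] = best[j] + mop * scale           # Addition
        X = np.clip(X, lb, ub)                               # keep solutions in bounds
        fit = np.array([fitness(x) for x in X])
        if fit.min() < best_fit:                             # greedy elitism
            best_fit = float(fit.min())
            best = X[fit.argmin()].copy()
    return best, best_fit

# Example: minimise the sphere function over [0, 10]^5.
best, best_fit = aoa(lambda x: float(np.sum(x ** 2)), dim=5, lb=0.0, ub=10.0)
```

Note that the decreasing MOP schedule shrinks the Subtraction/Addition step sizes over iterations, which is how the algorithm gradually shifts from coarse moves to fine local refinement.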

Guided Learning Strategy (GLS)

In this section, we introduce the basic concepts of the GLS. Following [38], GLS is developed to enhance the balance between exploration and exploitation. In general, GLS is inspired by constructivist learning theory, which assumes that knowledge is constructed. Based on this inspiration, GLS computes the standard deviation of previously visited solutions, which yields the dispersion degree of the set of solutions (i.e., the population). When the solutions are stuck in exploration, GLS guides them towards exploitation; conversely, when the solutions are biased toward exploitation, GLS guides them toward exploration.

In general, GLS starts by setting the initial values of C and Cmax, which represent the initial experience value and the experience upper limit, respectively. Subsequently, the optimization algorithm's operators are employed to update the current solutions, denoted as X. The updated solutions X are then stored in the learning experience St, and the experience value C is updated. Thereafter, we compute the fitness value for X. Following [38], GLS consists of two phases, named feedback and guidance. In the feedback phase, the results are utilized to direct the solutions. Following this, the dispersion degree Ddeg of the learning experiences in St is calculated. If the previous solutions exhibit a high degree of dispersion, the process transitions to the exploitation phase; otherwise, the process proceeds with the exploration phase. The value of Ddeg is computed as follows:

Ddeg = STD(St) / (UB − LB)
(24)

In Eq. (24), UB and LB are the upper and lower limits of the search space, and STD(St) denotes the standard deviation of the solutions stored in St. Dividing by (UB − LB) normalizes Ddeg so that it is not affected by changes in UB and LB.
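The dispersion degree of Eq. (24) amounts to a normalized measure of population spread, which can be sketched as follows. Averaging the per-dimension standard deviations into a single scalar is our assumption for illustration; [38] may aggregate the dimensions differently.

```python
import numpy as np

def dispersion_degree(history, ub, lb):
    """Dispersion degree of stored solutions (Eq. (24)): per-dimension
    standard deviation, normalised by the search range, then averaged."""
    St = np.asarray(history)                       # shape: (n_stored, dim)
    return float(np.mean(np.std(St, axis=0) / (ub - lb)))

rng = np.random.default_rng(1)
scattered = rng.uniform(-10, 10, size=(30, 5))     # solutions still exploring
converged = scattered * 0.01                       # solutions clustered near zero
d_hi = dispersion_degree(scattered, 10, -10)
d_lo = dispersion_degree(converged, 10, -10)
```

A scattered population yields a clearly larger `d_hi` than the converged one's `d_lo`, which is the signal GLS uses to decide whether to switch phases.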

Moreover, during the guidance stage, the solutions X are guided to generate new updated solutions Xnew. This process is achieved using a new definition for exploitation (first branch in Eq. (25)) and exploration (second branch in Eq. (25)) [38].

(25)

For a detailed description of the steps involved in GLS, refer to Algorithm 3.

Algorithm 3 Pseudo-code of the GLS Algorithm.

Initialize the parameters C and Cmax.

Generate initial solutions X.

Update solutions X using the operators of the MH algorithm.

Update the value of learning experiences St.

Update C (i.e., C = C + 1).

Compute the fitness value (F) for each solution in X.

if C ≥ Cmax then

  Compute Ddeg using Eq. (24).

  if Ddeg exceeds its threshold then

    Generate Xnew using the exploitation update in the first branch of Eq. (25).

  else

    Generate Xnew using the exploration update in the second branch of Eq. (25).

  end if

  Compute the fitness value of Xnew.

  Select the better of X and Xnew.

  Clear the learning experience St and set C = 0.

end if

Return (X).

Materials and methods

Data Description

We analyzed rs-fMRI data from the ABIDE I dataset (http://preprocessed-connectomes-project.org/abide/), which includes 505 individuals with ASD and 530 typical controls (TCs), yielding a total of 1,035 subjects [53]. The dataset was accessed on March 23, 2025, and processed using the Configurable Pipeline for the Analysis of Connectomes (CPAC), a widely used pipeline in previous research. This pipeline includes essential preprocessing steps such as slice-timing correction, voxel-intensity normalization, and motion correction, as well as additional procedures including nuisance-signal removal, global-signal regression, band-pass filtering, and spatial registration. Importantly, the dataset is fully anonymized, and we did not have access to any information that could identify individual participants during or after data collection. Following preprocessing and quality control, the final dataset comprised 1,035 samples (623 for training, 308 for validation, and 104 for testing). The class distribution was nearly balanced; therefore, no additional balancing techniques were applied.

Proposed Autism Detection Model

This section outlines the methodology employed in the proposed autism detection model. Initially, the autism dataset is partitioned into training, validation, and testing subsets. Subsequently, a set of candidate solutions is generated, and the fitness value of each solution is computed. The next step involves selecting the optimal solution, defined as the one that attains the lowest fitness value among all candidates. Following this, an update process is performed to refine the solutions during both the exploration and exploitation phases. To achieve this, we integrated the AOA operators with the ETO operators. Moreover, the GLS approach was employed to enhance the balance between exploration and exploitation. Additional details regarding the implementation of the proposed model are presented in the following subsections.

Feature Extraction

The proposed method in [14] introduces a novel approach for diagnosing ASD by leveraging multi-atlas fMRI data. Departing from the original model, our method omits demographic information and ensemble learning, instead utilizing a pre-trained MLP component to extract learned features. These features are subsequently processed through a developed FS algorithm. The methodology is structured into four key stages: data preparation, FS, model training, and feature extraction, each of which is elaborated below.

Data Preparation

To ensure data quality, samples lacking fMRI time series were removed. The mean time series for each ROI was then computed using three distinct brain atlases:

  • Automated Anatomical Labeling (AAL): Comprising 116 ROIs, the AAL atlas provides a detailed anatomical mapping of the brain. Its precise delineation of anatomical regions has made it a standard tool in ASD diagnostic research.
  • Craddock 200 (CC): This atlas partitions the brain into 200 functionally homogeneous regions based on rs-fMRI data. Its structure is particularly advantageous for examining functional connectivity patterns.
  • Eickhoff-Zilles (EZ): The EZ atlas, also consisting of 116 ROIs, integrates cytoarchitectonic mapping to provide a hybrid view of anatomical and functional brain organization. This dual perspective is instrumental in identifying ASD-related deviations in brain connectivity.

The data from these atlases are transformed into functional connectivity matrices, similar to [14]. These atlases assess the synchronization of brain activity from rs-fMRI data. The AAL and EZ atlases, although centered on anatomical structures, also offer crucial information on functional connectivity, making them valuable tools for ASD research. Connectivity features are extracted from each atlas and converted into one-dimensional (1D) vectors for input into the feature extraction model. Functional connectivity, measured by the Pearson correlation coefficient between the time series of paired brain regions, serves as the primary feature for distinguishing ASD from TC subjects. The resulting Pearson correlation matrices are flattened by retaining the upper triangle, yielding a 1D feature vector that encapsulates functional connectivity. For example, the CC atlas generates a 19,900-dimensional vector per sample. This procedure is uniformly applied across all three atlases.
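The construction of the 1D connectivity vector can be sketched as follows; the synthetic time series and the 200-ROI count (matching the CC atlas) are illustrative.

```python
import numpy as np

def connectivity_vector(ts):
    """Pearson correlation matrix of ROI time series, flattened to the
    strict upper triangle (the diagonal of self-correlations is dropped)."""
    corr = np.corrcoef(ts.T)                    # ts: (timepoints, n_rois) -> (n_rois, n_rois)
    iu = np.triu_indices_from(corr, k=1)        # upper triangle, excluding diagonal
    return corr[iu]

ts = np.random.default_rng(0).standard_normal((150, 200))  # 150 volumes, 200 ROIs
vec = connectivity_vector(ts)
```

For 200 ROIs the vector has 200 × 199 / 2 = 19,900 entries, matching the CC atlas dimensionality quoted above.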

To optimize the feature set, the F-score metric is used to rank features based on their discriminative power between ASD and TC subjects. The top 15% of features are selected. This selective approach optimizes the feature set for subsequent model training and classification tasks.
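A minimal sketch of this ranking step uses scikit-learn's ANOVA F-statistic (`f_classif`), which for two classes measures between-class versus within-class separation and is closely related to the F-score ranking described above; the synthetic subjects, labels, and feature counts are illustrative.

```python
import numpy as np
from sklearn.feature_selection import SelectPercentile, f_classif

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 1000))     # 300 subjects, 1000 connectivity features
y = rng.integers(0, 2, size=300)         # 0 = TC, 1 = ASD (synthetic labels)
X[y == 1, :20] += 1.0                    # make the first 20 features informative

# Keep the top 15% of features by F-statistic, as in the text.
selector = SelectPercentile(f_classif, percentile=15).fit(X, y)
X_top = selector.transform(X)
```

`selector.get_support()` exposes the retained feature indices, which is useful for mapping selected connections back to ROI pairs.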

Feature Extraction Model Architecture

The proposed methodology, as shown in Fig. 1, begins with the training of the model using multi-atlas datasets, followed by the extraction of learned features from the final MLP layer, which consists of 100 units. This layer acts as the primary feature extraction step, with feature extraction performed separately for each brain atlas (AAL, CC, and EZ). The extracted features are subsequently fed into the FS algorithm to identify the most discriminative features for the classification task.

The model architecture integrates the SSDAE followed by the MLP. The SSDAE serves as a pre-training step to learn robust feature representations, which are then fine-tuned using the MLP. This pre-training phase is critical for capturing meaningful features by incorporating sparsity and denoising techniques during training. The SSDAE comprises two stacked autoencoders, each with a sparsity constraint applied to the encoding layer to mitigate overfitting and ensure the extraction of essential features. The input to the SSDAE consists of the flattened 1D feature vectors derived from the multi-atlas data. For instance, the input feature vector for the CC atlas is 3,000-dimensional, while for the AAL and EZ atlases, it is 1,000-dimensional. The first autoencoder includes an input layer corresponding to these flattened feature vectors and an encoding layer with 1,000 units, coupled with a sparsity constraint to promote sparse feature learning. The second autoencoder further compresses the data into a reduced encoding of 600 units, which serves as the pre-trained feature representation.

During SSDAE training, noise is introduced to the input data to facilitate denoising, enabling the autoencoder to learn to reconstruct the original (clean) data. Once the SSDAE is pre-trained, the model undergoes supervised fine-tuning using the MLP. In the MLP structure, the network features layers with 1,000 units in the initial stage, 600 in the next, and 100 units in the final stage. The weights for the first two layers of the MLP are initialized using those learned from the SSDAE, ensuring that the MLP begins with meaningful feature representations. The final 100-unit MLP layer after the fine-tuning process is used to extract learned features separately for each atlas (AAL, CC, and EZ), which are then passed to the FS algorithm to identify the most relevant and discriminative features for distinguishing between ASD and TC cases.
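The denoising-plus-sparsity idea behind the SSDAE pre-training can be illustrated with a heavily simplified single autoencoder in NumPy. The actual model stacks two autoencoders with 1,000- and 600-unit codes before MLP fine-tuning; all dimensions and hyperparameters below are illustrative, and for brevity the sparsity penalty is only added to the reported loss rather than differentiated.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((256, 50))            # toy inputs (e.g., selected connectivity features)
W1 = 0.1 * rng.standard_normal((50, 20)); b1 = np.zeros(20)   # encoder weights
W2 = 0.1 * rng.standard_normal((20, 50)); b2 = np.zeros(50)   # decoder weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(X, lr=1.0, noise=0.3, sparsity=1e-3):
    """One denoising-autoencoder step: corrupt the input, reconstruct the clean input."""
    global W1, b1, W2, b2
    Xn = X + noise * rng.standard_normal(X.shape)   # denoising corruption
    H = sigmoid(Xn @ W1 + b1)                       # hidden code
    R = H @ W2 + b2                                 # linear reconstruction of the clean X
    err = R - X
    loss = np.mean(err ** 2) + sparsity * np.mean(np.abs(H))  # MSE + L1 sparsity on codes
    dR = 2.0 * err / err.size                       # backprop through the MSE term only
    dW2, db2 = H.T @ dR, dR.sum(axis=0)
    dH = (dR @ W2.T) * H * (1.0 - H)
    dW1, db1 = Xn.T @ dH, dH.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
    return loss

losses = [train_step(X) for _ in range(500)]
```

The reconstruction loss falls as the decoder learns to undo the injected noise; in the full model, the learned encoder weights are what initialize the first MLP layers.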

Feature selection (FS)

The phases of the proposed FS based on a modified version of ETO are introduced in this section.

Initial phase

We generate a set of N solutions X using Eq. (26).

X_{i,j} = LB_j + r × (UB_j − LB_j),   j = 1, 2, …, D
(26)

where D indicates the number of features/dimensions of X_i, and r refers to a random value in [0, 1].

Learning Phase

This phase begins with the calculation of the fitness value for each individual X_i using the training dataset. This computation is performed by transforming the current X_i into its binary representation through the equation provided below.

BX_{i,j} = 1 if X_{i,j} > r, and 0 otherwise
(27)

In Eq. (27), r denotes a random value in [0, 1]. The next step is to assess the quality of the features selected from the training set using BX_i. This evaluation is performed by computing the fitness value (Fit_i), which is defined as follows.

Fit_i = λ × γ_i + (1 − λ) × (|BX_i| / D)
(28)

In Eq. (28), |BX_i| refers to the number of selected features, i.e., those corresponding to ones in BX_i. γ_i refers to the classification error obtained using the KNN classifier, which is the classifier applied in this study. λ denotes the factor used to balance the two parts of Eq. (28) (i.e., the ratio of selected features and the classification error).
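A sketch of this fitness evaluation (Eqs. (27)–(28)) is given below; the number of neighbors, the value of λ, and the synthetic data are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def fitness(x, X_tr, y_tr, X_va, y_va, lam=0.9):
    """Fitness of one candidate solution (Eq. (28)): weighted sum of the
    KNN classification error and the ratio of selected features."""
    bx = (x > np.random.random(x.size)).astype(int)   # binarisation, Eq. (27)
    if bx.sum() == 0:                                 # guard: keep at least one feature
        bx[np.argmax(x)] = 1
    cols = bx.astype(bool)
    knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr[:, cols], y_tr)
    gamma = 1.0 - knn.score(X_va[:, cols], y_va)      # classification error
    return lam * gamma + (1 - lam) * bx.sum() / x.size

X, y = make_classification(n_samples=200, n_features=30, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)
f = fitness(np.random.default_rng(0).random(30), X_tr, y_tr, X_va, y_va)
```

A larger λ favors classification accuracy over subset size; the optimizer minimizes this value, so solutions that are both accurate and compact win.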

Thereafter, we allocate the best solution X_b, i.e., the solution with the best fitness value. The update of X then integrates AOA and ETO by injecting the exploration operators of AOA into ETO, while combining the exploitation operators of both algorithms. The process of updating the solutions during exploration is conducted using the following formulas.

(29)

(30)

(31)

where X_min and X_max are the minimum and maximum values of the solutions, respectively.

Moreover, for the updating process during the exploitation phase, the solutions are updated using the following formula:

(32)

The next step is to apply the GLS as defined in Eqs. (24) and (25). However, to reduce the time complexity of this step, we apply it according to the following formula:

(33)

Evaluation of selected features

The primary objective of this phase is to evaluate the quality of the chosen features by utilizing the binary version of the optimal solution, denoted as BX_b, in conjunction with the testing set. Typically, we identify the features from the testing set that correspond to those in BX_b and subsequently apply them to the KNN classifier trained on the selected training features for autism classification. Additionally, performance metrics are computed to quantify the effectiveness of the predicted outcomes. The sequence of steps involved in the developed model is illustrated in Fig. 2.

Fig 2. Framework of the autism detection technique based on METO.

https://doi.org/10.1371/journal.pone.0339921.g002

The time complexity of the developed model to detect autism is given as follows:

(34)

where T is the total number of iterations, and the complexity of the initial step is given as follows:

(35)

Meanwhile, the complexity of the update steps is given as follows:

(36)

So, finally, the complexity of METO is given as follows:

(37)

Experimental Results

This section provides a detailed presentation of the experimental outcomes derived from the evaluation of METO, alongside six comparative methods: ETO, AOA, Harris hawks optimization (HHO) [54], LASHDE [55], the whale optimization algorithm (WOA) [56], and the triangulation topology aggregation optimizer (TTAO) [57]. The performance of these algorithms is measured through several metrics, including accuracy, sensitivity, AUC, precision, and F-score. The experiments were carried out on three atlas-specific feature sets derived from the ABIDE I dataset to ensure a thorough and robust evaluation of the methods.

Model Configurations

To ensure the learning of robust and meaningful features, the SSDAE is pre-trained with both sparsity and noise constraints. The learning rates for the SSDAE and the MLP are set to 0.001 and 0.0005, respectively. Optimization is performed using gradient descent (GD) for the SSDAE and stochastic gradient descent (SGD) for the MLP. To balance computational efficiency and model performance, the batch size is set to 100 for the SSDAE and 10 for the MLP. Dropout rates of 0.5 for the SSDAE and 0.3 for the MLP are applied to reduce the risk of overfitting.

The SSDAE is trained for 700 iterations, while the second autoencoder within the SSDAE is trained for 1,000 iterations to ensure comprehensive feature learning. The final layer of the MLP, consisting of 100 units, outputs refined learned features that capture the key attributes of each brain atlas (AAL, CC, and EZ). In this study, we refer to CC, AAL, and EZ as dataset-1, dataset-2, and dataset-3, respectively.

Performance Measures

The performance of the proposed method and the compared algorithms is evaluated using the following metrics:

• Accuracy:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
(38)

• Sensitivity:

Sensitivity = TP / (TP + FN)
(39)

• Precision:

Precision = TP / (TP + FP)
(40)

• F-score:

F-score = 2 × (Precision × Sensitivity) / (Precision + Sensitivity)
(41)

where TP and TN represent the number of true positives and true negatives, while FP and FN denote false positives and false negatives, respectively.

• Standard Deviation (StD):

StD = sqrt( (1/n) × Σ_{i=1}^{n} (x_i − x̄)^2 )
(42)

where x_i represents the individual values, x̄ is the mean, and n is the number of observations.

• Fitness value: We used the fitness value defined in Eq. (28) to assess the proposed model’s ability to balance the ratio of selected features and classification error.
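For completeness, Eqs. (38)–(41) can be computed directly from the confusion counts; the numbers below are illustrative.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the evaluation metrics of Eqs. (38)-(41) from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)                          # Eq. (38)
    sensitivity = tp / (tp + fn)                                        # Eq. (39)
    precision = tp / (tp + fp)                                          # Eq. (40)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)  # Eq. (41)
    return accuracy, sensitivity, precision, f_score

# Illustrative counts for a test set of 182 subjects.
acc, sen, pre, f1 = classification_metrics(tp=78, tn=60, fp=22, fn=22)
```

Sensitivity is the fraction of ASD subjects correctly detected, while precision is the fraction of ASD predictions that are correct; the F-score is their harmonic mean.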

Results and discussion

The results of the experiments are listed in Tables 1–3. Table 1 and Fig. 3 present the results for dataset-1. In this table, METO achieved the highest accuracy, with a mean value of 0.6827, indicating stable and reliable performance in classifying ASD cases. Its stability is further supported by the lowest standard deviation, showing that METO produced consistent accuracy across all runs. LASHDE and TTAO ranked second, each with a mean accuracy of 0.6490, indicating moderate performance with limited variability. HHO, AOA, WOA, and ETO recorded mean accuracy values of 0.6442, suggesting weaker performance. The standard deviation of HHO (0.0136) reflects unstable results and irregular convergence. The best accuracy values also confirm this ranking: METO consistently reached strong performance, whereas other algorithms, such as HHO, ranged from a best value of 0.6538 to a worst value of 0.6346, indicating inconsistent behavior.

Table 1. Comparative results of proposed METO with other methods using dataset-1.

https://doi.org/10.1371/journal.pone.0339921.t001

Fig 3. Results of Accuracy, sensitivity, and AUC for dataset-1.

https://doi.org/10.1371/journal.pone.0339921.g003

Regarding sensitivity, METO again ranked first with a mean of 0.7800, demonstrating accurate identification of ASD cases. Its standard deviation of 0.0000 indicates fully consistent results across all runs. HHO and WOA followed, each with a mean sensitivity of 0.7400, suggesting good but less stable detection performance, as reflected in their standard deviations of 0.0283. LASHDE achieved a mean sensitivity of 0.7300, while TTAO reached 0.7200. ETO recorded the lowest mean value of 0.7100, showing higher variation across runs.

For AUC values, METO once more ranked first, with a mean of 0.7269, demonstrating the strongest separation between ASD and non-ASD samples. Its standard deviation of 0.0213 indicates consistent performance across different runs. ETO ranked second with 0.6873, followed by AOA (0.6839), LASHDE (0.6794), WOA (0.6791), and TTAO (0.6781). HHO obtained the lowest mean AUC value of 0.6711. METO also showed balanced best and worst AUC measures (0.7419 and 0.7119), reflecting reliable decision boundaries, whereas other methods exhibited wider gaps between their best and worst values.

For precision, METO achieved the best result with a mean of 0.6393. TTAO ranked second with 0.6154, followed by LASHDE with 0.6136. ETO, AOA, WOA, and HHO recorded precision values between 0.6121 and 0.6065, indicating a slightly weaker ability to correctly identify positive cases.

The F-score was used to provide insight into the overall classification performance of the methods. Its analysis also favored METO, which achieved a mean value of 0.7027. LASHDE ranked second with 0.6667, followed closely by HHO (0.6666) and WOA. TTAO and AOA reached mean F-scores of 0.6636 and 0.6606, respectively, while ETO recorded the lowest value of 0.6574.

Regarding the FS measure, METO selected an average of 19.5 features, indicating that it successfully identified the most relevant attributes while minimizing redundancy. HHO followed with 20.5 features, and WOA with 20.0. LASHDE selected 53.5 features, whereas AOA and TTAO selected 58.5 and 62.0, respectively. ETO required the highest number of features (75.0). The minimal standard deviation of METO further demonstrates its stable FS behavior.

Based on these findings, METO demonstrated the highest overall performance across all evaluation metrics, including accuracy, sensitivity, AUC, precision, F-score, and FS results. HHO and LASHDE followed with moderately consistent results, whereas WOA, TTAO, and AOA provided average performance. In contrast, ETO showed the lowest accuracy. Overall, METO achieved the strongest trade-off between classification performance and feature reduction on datasets 1 and 3.

Table 2 and Fig. 4 present the results for dataset-2. ETO achieved the highest mean accuracy of 0.7308, ranking first among all methods. Its standard deviation of 0.0136 indicates minimal variation across different runs. HHO and METO followed with mean accuracies of 0.6827 and 0.6683, ranking second and third, with standard deviations of 0.0136 and 0.0204, respectively. The best accuracy was recorded by ETO, while METO’s best accuracy was 0.6827, ranking third after HHO. The worst accuracy values similarly placed ETO first, followed by HHO, WOA, and METO.

Table 2. Comparative results of proposed METO with other methods using dataset-2.

https://doi.org/10.1371/journal.pone.0339921.t002

Fig 4. Results of Accuracy, sensitivity, and AUC for dataset-2.

https://doi.org/10.1371/journal.pone.0339921.g004

In terms of sensitivity, ETO outperformed all other methods with a mean sensitivity of 0.7700 and a standard deviation of 0.0141. METO ranked fourth with a mean of 0.7000, following LASHDE and HHO. ETO’s best sensitivity was 0.7800, whereas METO’s highest was 0.7200. The worst sensitivity measure also ranked ETO first and METO fourth.

For AUC, ETO ranked first with a mean of 0.7957 and a StD of 0.0143. METO ranked second with a mean AUC of 0.7252 and a StD of 0.0265. ETO’s best AUC was 0.8058, while METO’s best was 0.7440. In the worst-case scenario, METO ranked third behind ETO and WOA.

Precision results show that ETO achieved a mean of 0.7006 with a deviation of 0.0232. METO and LASHDE ranked similarly, both with mean values of 0.6437. TTAO and AOA trailed with means near 0.60.

Regarding the F-score, ETO ranked first with a mean of 0.7334. METO and LASHDE shared the next rank, both with mean values near 0.67. WOA, HHO, TTAO, and AOA ranked lower, with values ranging from 0.665 to 0.637. METO’s low StD reflects stable performance across runs.

For the FS measure, METO selected the fewest attributes with a mean of 110. ETO selected more features, with a mean of 157.5, while HHO selected a substantially larger number of features, averaging around 535.5. LASHDE and WOA also selected relatively large subsets, with averages above 165 and 335 features, respectively. TTAO and AOA selected moderately sized subsets. TTAO and HHO demonstrated lower deviations in the number of selected features.

Overall, METO demonstrated competitive performance across all measures. ETO achieved higher values in most metrics, most notably accuracy, sensitivity, AUC, precision, and F-score, but at the cost of selecting larger feature subsets, which may increase model complexity. HHO and LASHDE offered moderate performance with less stability in some measures. These observations suggest that METO presents a balanced and effective option for ASD detection in this dataset, emphasizing reduced feature dimensionality while maintaining reliable classification performance.

The results for dataset-3 are summarized in Table 3 and Fig. 5. METO achieved the highest mean accuracy at 0.8357, outperforming all other methods. Its standard deviation of 0.0068 indicates highly consistent results across all runs. HHO ranked second with a mean accuracy of 0.8164, followed closely by ETO at 0.8140. METO’s best accuracy reached 0.8406, while its lowest was 0.8309, remaining higher than the worst scores of the other methods, including ETO, HHO, and WOA.

Table 3. Comparative results of proposed METO with other methods using dataset-3.

https://doi.org/10.1371/journal.pone.0339921.t003

Fig 5. Results of Accuracy, sensitivity, and AUC for dataset-3.

https://doi.org/10.1371/journal.pone.0339921.g005

For sensitivity, METO again led with a mean value of 0.8165 and a standard deviation of 0.0130. LASHDE and HHO followed with slightly lower average sensitivity levels. METO’s highest sensitivity matched that of LASHDE at 0.8257. ETO, WOA, and HHO achieved maximum sensitivity values near 0.7890. Regarding the minimum sensitivity, METO maintained the top rank at 0.8073, followed by TTAO and HHO.

In terms of AUC, METO recorded the highest mean value of 0.9071 alongside very low variability (standard deviation of 0.0009), indicating highly stable class separation. HHO and ETO achieved the second and third highest means, respectively. METO’s AUC values ranged narrowly between 0.9077 and 0.9065, consistently surpassing all other methods, including HHO, ETO, and WOA.

METO also achieved the highest mean precision at 0.8641, followed closely by ETO and HHO at approximately 0.858. The remaining algorithms produced precision values ranging between 0.833 and 0.845.

The F-score results followed a similar pattern, with METO obtaining the highest value of 0.8396. HHO and ETO came next with values near 0.817 and 0.814, respectively, while the remaining techniques ranged between 0.793 and 0.816.

Regarding FS, METO consistently selected the fewest attributes, demonstrating its ability to maintain strong performance while using a minimal feature set. WOA and AOA selected moderately small subsets, whereas TTAO and LASHDE occupied the mid-range. ETO and HHO chose larger feature subsets and exhibited greater variability in FS.

Overall, METO provides the best balance of high accuracy, stable performance, and efficient feature usage on dataset-3. While other methods offer competitive performance, they generally show less consistency across evaluation metrics.

Overall, METO demonstrated competitive performance across the three datasets in different measures. It maintained robust classification while selecting fewer features, which shows its effectiveness. Although other methods outperformed METO in certain metrics, it achieved a stable performance as a classification model for ASD.

The superior performance of the proposed METO model, particularly in terms of accuracy and AUC on datasets 1 and 3, can be attributed to its effective hybrid design. The integration of AOA operators enhances the exploration capability, enabling the algorithm to escape local optima, while the GLS component strategically guides the search toward promising regions and refines the solution quality. This synergy leads to a more discriminative and compact feature subset, as evidenced by METO consistently selecting the fewest features while maintaining high classification performance (Tables 1–3). In contrast, models such as ETO and HHO exhibit greater variability or select larger feature sets, suggesting less stable and potentially overfitted solutions.

Although the ABIDE I dataset is a valuable resource, it introduces potential biases that may influence model performance. The data are aggregated from multiple imaging sites, resulting in variability in scanner protocols and participant demographics (site effects). Moreover, ASD is a highly heterogeneous disorder, and the dataset may not fully represent the entire spectrum of phenotypic presentations. These factors may limit the model’s generalizability to new, unseen data from different sources. Although standard preprocessing (CPAC pipeline) reduces some technical variations, the intrinsic clinical and acquisition heterogeneity remains a challenge for all models evaluated in this study. Future work will include validation on more homogeneous cohorts and the implementation of domain adaptation techniques to enhance robustness.

Although the proposed METO framework achieves competitive performance across the three rs-fMRI atlases, its outputs are still separated from the way clinicians arrive at diagnostic decisions in routine practice. In particular, METO is a deep, multi-stage hybrid model (SSDAE + MLP + metaheuristic FS), which produces probabilistic class labels based on high-dimensional functional connectivity patterns. These representations are not directly expressed in terms of observable symptoms, behavioral domains, or standardized clinical criteria (e.g., ADOS or ADI-R scores). As a result, there remains an interpretability gap between the model’s decision boundary in feature space and the multi-informant, longitudinal reasoning process used by clinicians when diagnosing ASD.

The FS step does provide a more compact subset of connectivity features, yet the selected connections are not, in their current form, systematically mapped back to anatomically or functionally interpretable networks (e.g., social cognition, default mode, salience, or executive control networks) that could be directly discussed with clinicians. Moreover, the metaheuristic search in METO is optimized for predictive performance and sparsity, not for human interpretability, which may limit transparency and trust in clinical settings, despite the favorable accuracy and AUC values.

These observations underline the need for explainable AI extensions to the present framework. In future work, we plan to integrate post-hoc explanation techniques, such as permutation-based feature importance, Shapley value analysis, or layer-wise relevance propagation, to quantify the contribution of individual ROIs and functional connections to each prediction. At the subject level, saliency-style visualizations of the most influential connections could be projected back onto brain templates and summarized at the network level, thereby linking model decisions to neurobiological hypotheses about ASD. At the clinical level, METO should be regarded as a decision-support tool that complements, rather than replaces, expert clinical judgment. By combining high-performing predictive models with explicit explanations of which brain regions and connectivity patterns drive each decision, we aim to make the outputs more transparent and clinically meaningful, narrowing the gap between algorithmic predictions and real-world diagnostic decision-making.

Moreover, we compared the results of the proposed model with other models, such as MADE-for-ASD [14]. The accuracies obtained for AAL, EZ, and CC using MADE-for-ASD are 71.20%, 68.74%, and 73.42%, respectively. In contrast, the developed model achieves accuracies of 66.83%, 83.57%, and 68.27% for AAL, EZ, and CC, respectively. Thus, while MADE-for-ASD demonstrates better accuracy than METO on the AAL and CC datasets, its accuracy on the EZ dataset is lower than that of the proposed model.

From the previous discussion, it can be observed that the proposed METO achieves high performance in detecting ASD. This improvement results from the integration of ETO with AOA and GLS, where AOA and GLS are employed to enhance the exploration and exploitation capabilities of ETO. Nevertheless, the developed METO still has some limitations, such as relatively high time complexity, which requires further improvement. Additionally, the parameters of METO need to be optimized, as multiple parameters exist within the ETO, GLS, and AOA algorithms.

Conclusion and future work

This study proposed a hybrid deep learning and feature selection framework for ASD detection from rs-fMRI data. The focus of this paper has been on utilizing fMRI data, as neuroimaging studies often face replicability issues, largely due to the necessity for extensive sample sizes and the complexity of brain-behavior relationships. The psychological sciences are currently at a pivotal point regarding replication, which similarly affects neuroimaging research. Although neuroimaging techniques can identify group differences, translating these findings to individual cases remains challenging. Furthermore, while neuroimaging provides valuable insights into brain abnormalities, it does not offer mechanistic explanations for the underlying causes of these disruptions, thereby limiting the understanding of fundamental neural mechanisms. In response to these limitations, this paper proposes an alternative autism diagnostic method, which integrates DL techniques with FS strategies. The DL component, combining an SSDAE with an MLP, proved effective in extracting discriminative representations from rs-fMRI data, which are essential for the FS process. Subsequently, a modified FS technique, called METO, was applied to select pertinent features while eliminating irrelevant ones, thereby improving the autism diagnosis process. This modification incorporated the AOA to enhance both exploration and exploitation of the search space, followed by the use of GLS to optimize the balance between these two phases. To evaluate the effectiveness of the proposed model, its performance was assessed on three atlas-based feature representations derived from the ABIDE I dataset. The results, as reflected in various performance metrics, demonstrate the model's robust capability in autism diagnosis, achieving an average accuracy of 0.72, a sensitivity of 0.76, and an AUC of 0.78 with 40 selected features.

Future work will focus on improving the performance of the developed model, extending it to other disorders such as depression and ADHD, and incorporating explainable AI techniques to improve interpretation of the results.

References

  1. 1. Samar Hazim H, Albahri AS. Unlocking the Potential of Autism Detection: Integrating Traditional Feature Selection and Machine Learning Techniques. Applied Data Science and Analysis. 2023;2023:42–58.
  2. 2. https://www.who.int/news-room/fact-sheets/detail/autism-spectrum-disorders
  3. 3. Shaw KA, Williams S, Patrick ME, et al. Prevalence and Early Identification of Autism Spectrum Disorder Among Children Aged 4 and 8 Years — Autism and Developmental Disabilities Monitoring Network, 16 Sites, United States, 2022. MMWR Surveill Summ 2025;74(No. SS-2):1–22. DOI: https://doi.org/http://dx.doi.org/10.15585/mmwr.ss7402a1
  4. 4. Randall M, Egberts KJ, Samtani A, Scholten RJ, Hooft L, Livingstone N, Woolfenden S, Williams K. Diagnostic tests for autism spectrum disorder (ASD) in preschool children. Cochrane database of systematic reviews. 2018(7).
  5. 5. Galliver M, Gowling E, Farr W, Gain A, Male I. Cost of assessing a child for possible autism spectrum disorder? An observational study of current practice in child development centres in the UK. BMJ Paediatrics Open. 2017 Nov 30;1(1):e000052.
  6. 6. Edition F. Diagnostic and statistical manual of mental disorders. Am Psychiatric Assoc. 2013;21(21):591-643.
  7. 7. Micai M, Fatta LM, Gila L, Caruso A, Salvitti T, Fulceri F, Ciaramella A, D’Amico R, Del Giovane C, Bertelli M, Romano G. Prevalence of co-occurring conditions in children and adults with autism spectrum disorder: A systematic review and meta-analysis. Neuroscience & Biobehavioral Reviews. 2023 Dec 1;155:105436.
  8. 8. Bal VH, Wilkinson E, Fok M. Cognitive profiles of children with autism spectrum disorder with parent-reported extraordinary talents and personal strengths. Autism. 2022 Jan;26(1):62-74.
  9. 9. Falkmer T, Anderson K, Falkmer M, Horlin C. Diagnostic procedures in autism spectrum disorders: a systematic literature review. European Child & Adolescent Psychiatry. 2013;22(6):329–40.
  10. 10. Geschwind DH, Levitt P. Autism spectrum disorders: developmental disconnection syndromes. Curr Opin Neurobiol. 2007;17(1):103–11. pmid:17275283
  11. 11. Aggarwal S, Angus B. Misdiagnosis versus missed diagnosis: diagnosing autism spectrum disorder in adolescents. Australas Psychiatry. 2015;23(2):120–3. pmid:25653302
  12. 12. Fedorov A, Geenjaar E, Wu L, Sylvain T, DeRamus TP, Luck M, et al. Self-supervised multimodal learning for group inferences from MRI data: Discovering disorder-relevant brain regions and multimodal links. Neuroimage. 2024;285:120485. pmid:38110045
  13. 13. Di Martino A, Yan CG, Li Q, Denio E, Castellanos FX, Alaerts K, Anderson JS, Assaf M, Bookheimer SY, Dapretto M, Deen B. The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism. Molecular psychiatry. 2014 Jun;19(6):659-67.
  14. 14. Liu X, Hasan MR, Gedeon T, Hossain MZ. MADE-for-ASD: A multi-atlas deep ensemble network for diagnosing Autism Spectrum Disorder. Computers in Biology and Medicine. 2024;182:109083.
  15. 15. Deng J, Rakibul Hasan M, Mahmud M, Mahbub Hasan M, Asif Ahmed K, Zakir Hossain M. Diagnosing Autism Spectrum Disorder Using Ensemble 3D-CNN: A Preliminary Study. In: 2022 IEEE International Conference on Image Processing (ICIP). IEEE; 2022. 3480–4. https://doi.org/10.1109/icip46576.2022.9897628
  16. 16. Zare A, Shoeibi A, Shafaei N, Moridian P, Alizadehsani R, Halaji M. Accurate prediction using triangular type-2 fuzzy linear regression. 2021. https://arxiv.org/abs/2109.05461
  17. 17. Bajestani NS, Zare A. Application of optimized Type 2 fuzzy time series to forecast Taiwan stock index. In: 2009 2nd International Conference on Computer, Control and Communication, 2009. 1–6. https://doi.org/10.1109/ic4.2009.4909268
  18. 18. Bajestani NS, Kamyad AV, Esfahani EN, Zare A. Nephropathy forecasting in diabetic patients using a GA-based type-2 fuzzy regression model. Biocybernetics and Biomedical Engineering. 2017;37(2):281–9.
  19. 19. Bajestani NS, Kamyad AV, Zare A. An interval type-2 fuzzy regression model with crisp inputs and type-2 fuzzy outputs for TAIEX forecasting. In: 2016 IEEE International Conference on Information and Automation (ICIA), 2016. 681–5. https://doi.org/10.1109/icinfa.2016.7831906
  20. 20. Khodatars M, Shoeibi A, Sadeghi D, Ghaasemi N, Jafari M, Moridian P. Deep learning for neuroimaging-based diagnosis and rehabilitation of autism spectrum disorder: a review. Computers in Biology and Medicine. 2021;139:104949.
  21. 21. Rahman MM, Usman OL, Muniyandi RC, Sahran S, Mohamed S, Razak RA. A Review of Machine Learning Methods of Feature Selection and Classification for Autism Spectrum Disorder. Brain Sci. 2020;10(12):949. pmid:33297436
  22. 22. Ghnemat R, Al-Madi N, Awad M. An intelligent approach for autism spectrum disorder diagnosis and rehabilitation features identification. Neural Comput & Applic. 2024;37(4):2557–80.
  23. 23. Wang C, Xiao Z, Wu J. Functional connectivity-based classification of autism and control using SVM-RFECV on rs-fMRI data. Phys Med. 2019;65:99–105. pmid:31446358
  24. 24. Lamani MR, Benadit PJ. Automatic Diagnosis of Autism Spectrum Disorder Detection Using a Hybrid Feature Selection Model with Graph Convolution Network. SN COMPUT SCI. 2023;5(1).
  25. 25. Chola Raja K, Kannimuthu S. Deep learning-based feature selection and prediction system for autism spectrum disorder using a hybrid meta-heuristics approach. IFS. 2023;45(1):797–807.
  26. 26. Atlam E-S, Aljuhani KO, Gad I, Abdelrahim EM, Atwa AEM, Ahmed A. Automated identification of autism spectrum disorder from facial images using explainable deep learning models. Sci Rep. 2025;15(1):26682. pmid:40695996
  27. 27. Atlam E-S, Masud M, Rokaya M, Meshref H, Gad I, Almars AM. EASDM: Explainable Autism Spectrum Disorder Model Based on Deep Learning. Journal of Disability Research. 2024;3(1).
  28. 28. Almars AM, Gad I, Atlam E-S. RETRACTED ARTICLE: Unlocking autistic emotions: developing an interpretable IoT-based EfficientNet model for emotion recognition in children with autism. Neural Comput & Applic. 2025;37(21):17129–48.
  29. 29. Rai N, Pradhan PC, Saikia H, Bhutia R, Singh OP. ASD-HybridNet: A hybrid deep learning framework for detection of autism spectrum disorder. Magn Reson Imaging. 2025;124:110492. pmid:40876583
  30. 30. Altomi ZA, Alsakar YM, El-Gayar MM, Elmogy M, Fouda YM. Autism Spectrum Disorder Diagnosis Based on Attentional Feature Fusion Using NasNetMobile and DeiT Networks. Electronics. 2025;14(9):1822.
  31. 31. Villamarín A, Chumaña J, Narváez M, Guallichico G, Ocaña M, Luna A. Artificial Intelligence in the Detection of Autism Spectrum Disorders (ASD): a Systematic Review. Proceedings in Adaptation, Learning and Optimization. Springer Nature Switzerland. 2024. p. 21–32. https://doi.org/10.1007/978-3-031-71388-0_3
  32. 32. Fan Y, Xiong H, Sun G. DeepASDPred: a CNN-LSTM-based deep learning method for Autism spectrum disorders risk RNA identification. BMC Bioinformatics. 2023;24(1):261. pmid:37349705
  33. 33. Masi A, DeMayo MM, Glozier N, Guastella AJ. An Overview of Autism Spectrum Disorder, Heterogeneity and Treatment Options. Neurosci Bull. 2017;33(2):183–93. pmid:28213805
  34. 34. Subah FZ, Deb K, Dhar PK, Koshiba T. A Deep Learning Approach to Predict Autism Spectrum Disorder Using Multisite Resting-State fMRI. Applied Sciences. 2021;11(8):3636.
  35. 35. Alzubaidi L, Bai J, Al-Sabaawi A, Santamaría J, Albahri AS, Al-dabbagh BSN, et al. A survey on deep learning tools dealing with data scarcity: definitions, challenges, solutions, tips, and applications. J Big Data. 2023;10(1).
  36. 36. Luan TM, Khatir S, Tran MT, De Baets B, Cuong-Le T. Exponential-trigonometric optimization algorithm for solving complicated engineering problems. Computer Methods in Applied Mechanics and Engineering. 2024;432:117411.
  37. 37. Abualigah L, Diabat A, Mirjalili S, Abd Elaziz M, Gandomi AH. The Arithmetic Optimization Algorithm. Computer Methods in Applied Mechanics and Engineering. 2021;376:113609.
  38. 38. Jia H, Lu C. Guided learning strategy: A novel update mechanism for metaheuristic algorithms design and improvement. Knowledge-Based Systems. 2024;286:111402.
  39. 39. Kathiravan K, Rajnarayanan PN. Application of AOA algorithm for optimal placement of electric vehicle charging station to minimize line losses. Electric Power Systems Research. 2023;214:108868.
  40. 40. Mohamed MAE, Nasser Ahmed S, Eladly Metwally M. Arithmetic optimization algorithm based maximum power point tracking for grid-connected photovoltaic system. Sci Rep. 2023;13(1):5961. pmid:37045948
  41. 41. Xu M, Song Q, Xi M, Zhou Z. Binary arithmetic optimization algorithm for feature selection. Soft comput. 2023;:1–35. pmid:37362265
  42. 42. Guo X, Dominick KC, Minai AA, Li H, Erickson CA, Lu LJ. Diagnosing Autism Spectrum Disorder from Brain Resting-State Functional Connectivity Patterns Using a Deep Neural Network with a Novel Feature Selection Method. Front Neurosci. 2017;11:460. pmid:28871217
  43. 43. Eslami T, Mirjalili V, Fong A, Laird AR, Saeed F. ASD-DiagNet: A Hybrid Learning Approach for Detection of Autism Spectrum Disorder Using fMRI Data. Front Neuroinform. 2019;13:70. pmid:31827430
  44. 44. Huang Z-A, Zhu Z, Yau CH, Tan KC. Identifying Autism Spectrum Disorder From Resting-State fMRI Using Deep Belief Network. IEEE Trans Neural Netw Learn Syst. 2021;32(7):2847–61. pmid:32692687
  45. 45. Pan L, Liu J, Shi M, Wong CW, Chan KHK. Identifying autism spectrum disorder based on individual-aware down-sampling and multi-modal learning. In: 2021. https://arxiv.org/abs/2109.09129
  46. 46. Zhang F, Wei Y, Liu J, Wang Y, Xi W, Pan Y. Identification of Autism spectrum disorder based on a novel feature selection method and Variational Autoencoder. Comput Biol Med. 2022;148:105854. pmid:35863246
  47. 47. Chi NA, Washington P, Kline A, Husic A, Hou C, He C, et al. Classifying Autism From Crowdsourced Semistructured Speech Recordings: Machine Learning Model Comparison Study. JMIR Pediatr Parent. 2022;5(2):e35406. pmid:35436234
  48. 48. Zhang J, Feng F, Han T, Gong X, Duan F. Detection of Autism Spectrum Disorder using fMRI Functional Connectivity with Feature Selection and Deep Learning. Cogn Comput. 2022;15(4):1106–17.
  49. 49. Gautam S, Sharma P, Thapa K, Upadhaya MD, Thapa D, Khanal SR. Screening autism spectrum disorder in children using deep learning approach: evaluating the classification model of YOLOv8 by comparing with other models. arXiv preprint. 2023. https://arxiv.org/abs/230614300
  50. 50. Ding Y, Zhang H, Qiu T. Deep learning approach to predict autism spectrum disorder: a systematic review and meta-analysis. BMC Psychiatry. 2024;24(1):739. pmid:39468522
  51. 51. Khan K, Katarya R. WS-BiTM: Integrating White Shark Optimization with Bi-LSTM for enhanced autism spectrum disorder diagnosis. J Neurosci Methods. 2025;413:110319. pmid:39521353
  52. 52. Abu-Doleh A, Abu-Qasmieh IF, Al-Quran HH, Masad IS, Banyissa LR, Ahmad MA. Recognition of autism in subcortical brain volumetric images using autoencoding-based region selection method and Siamese Convolutional Neural Network. Int J Med Inform. 2025;194:105707. pmid:39561667
  53. 53. Craddock C, Benhajali Y, Chu C, Chouinard F, Evans A, Jakab A, Khundrakpam BS, Lewis JD, Li Q, Milham M, Yan C. The neuro bureau preprocessing initiative: open sharing of preprocessed neuroimaging data and derivatives. Frontiers in Neuroinformatics. 2013 Jan;7(27):5.
  54. 54. Heidari AA, Mirjalili S, Faris H, Aljarah I, Mafarja M, Chen H. Harris hawks optimization: Algorithm and applications. Future Generation Computer Systems. 2019;97:849–72.
  55. 55. Mohamed AW, Hadi AA, Fattouh AM, Jambi KM. LSHADE with semi-parameter adaptation hybrid with CMA-ES for solving CEC 2017 benchmark problems. In: 2017 IEEE Congress on Evolutionary Computation (CEC), 2017. 145–52.
  56. 56. Sahlol AT, Abd Elaziz M, Al-Qaness MAA, Kim S. Handwritten Arabic Optical Character Recognition Approach Based on Hybrid Whale Optimization Algorithm With Neighborhood Rough Set. IEEE Access. 2020;8:23011–21.
  57. 57. Zhao S, Zhang T, Cai L, Yang R. Triangulation topology aggregation optimizer: A novel mathematics-based meta-heuristic algorithm for continuous optimization and engineering applications. Expert Systems with Applications. 2024;238:121744.