
Intelligent diagnosis of major depressive disorder with edge convolution and contrastive learning

  • Dan Long ,

    Contributed equally to this work with: Dan Long, Chen Zhu

    Roles Conceptualization, Data curation, Methodology, Project administration, Resources, Writing – original draft

    Affiliation Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, Zhejiang, China

  • Chen Zhu ,

    Contributed equally to this work with: Dan Long, Chen Zhu

    Roles Conceptualization, Investigation, Methodology, Software, Validation, Visualization

    Affiliation School of Statistics and Data Science, Zhejiang Gongshang University, Hangzhou, Zhejiang, China

  • Lei Xiong,

    Roles Validation, Writing – review & editing

    Affiliation School of Statistics and Data Science, Zhejiang Gongshang University, Hangzhou, Zhejiang, China

  • Zhou Long,

    Roles Formal analysis, Writing – review & editing

    Affiliation The Second School of Medicine, Wenzhou Medical University, Wenzhou, Zhejiang, China

  • Fangfang Dong

    Roles Methodology, Project administration, Resources, Supervision, Writing – review & editing

    fangdong12@aliyun.com

    Affiliation School of Statistics and Data Science, Zhejiang Gongshang University, Hangzhou, Zhejiang, China

Abstract

Based on functional connectivity (FC) matrices derived from resting-state functional magnetic resonance imaging (rs-fMRI) data, graph neural networks (GNNs), as an advanced deep learning technique, have been widely applied in major depressive disorder (MDD) diagnosis. However, conventional GNNs suffer from a critical limitation in preserving the spatial specificity of brain regions, which is attributed to their intrinsic node permutation invariance that ignores the unique order and specific roles of brain regions in neural circuits. To address this limitation, in this paper, we propose a novel deep learning framework, Graph Contrastive Learning based on Edge Convolution (EC-GCL), to analyze resting-state fMRI data from 1,160 participants, including 597 patients with MDD and 563 healthy controls. This framework integrates an edge convolution encoder, specifically designed to preserve the spatial specificity of brain regions, with a learnable graph augmentation module in an adversarial graph contrastive learning scheme, thereby enhancing the extraction of discriminative MDD-related FC features and improving diagnostic classification accuracy. Compared with conventional machine learning and GNN models, our proposed EC-GCL achieved superior performance (AUC = 71.2%) and improved interpretability. In particular, the framework identified several key brain regions, including the dorsolateral superior frontal gyrus, thalamus, and insula, that are closely linked to the pathophysiology of MDD, consistent with the findings of prior neuroimaging studies. This study demonstrates that combining edge convolution with contrastive learning provides a robust and explainable method for MDD diagnosis. These findings provide new insights into depression and may support improvements in clinical practice.

Introduction

Major depressive disorder (MDD) is a leading cause of disability worldwide, affecting more than 264 million individuals annually [1]. It is the second most common cause of death among young people aged 15–29 years. The diagnosis of MDD is typically based on systematic psychiatric evaluation in accordance with the criteria outlined in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), combined with an assessment of patients’ clinical symptoms and medical history. However, the consistency of such subjective diagnostic procedures remains unsatisfactorily low, with a reliability coefficient as low as 0.25 [2]. Consequently, there is an urgent need for objective approaches to improve the accuracy of MDD identification.

Neuroimaging studies have revealed that MDD patients exhibit abnormalities in both gray matter (e.g., frontal lobe, hippocampus, amygdala) and white matter (e.g., thalamus, striatum, associated fiber bundles) [3–5]. These subtle structural changes, if captured effectively, could serve as biomarkers for MDD diagnosis and provide new perspectives for intelligent clinical assessment; however, they are difficult to identify. To address this challenge, researchers have leveraged advanced machine learning techniques to extract neuroimaging biomarkers from multi-modal data, including structural magnetic resonance imaging (sMRI), functional MRI (fMRI), and diffusion tensor imaging (DTI), to aid the diagnosis and treatment evaluation of depression [6]. Despite these efforts, the core pathophysiological mechanisms of MDD remain poorly understood due to its high complexity. In recent years, resting-state fMRI (rs-fMRI) has proven more effective than structural imaging or task-based fMRI for MDD classification [7]. Studies have revealed a strong correlation between the brain’s microscopic neural circuits and the macroscopic rs-fMRI functional network [8]. Furthermore, rs-fMRI-derived functional connectivity (FC) patterns between the limbic system and frontal-striatal circuits can be used to define distinct neurophysiological subtypes of MDD [9]; these subtypes correlate with clinical symptom profiles and predict responsiveness to neurostimulation therapies, further confirming the potential of fMRI-based brain networks as rich sources of diagnostic biomarkers. In short, many studies have shown that rs-fMRI is well suited to describing brain network features thoroughly, making it a promising objective basis for MDD diagnosis.

In the past few years, great progress has been made in computer-aided diagnosis of brain disorders using rs-fMRI data. Since brain networks inherently exhibit a graph structure, with brain regions as nodes and functional connections as edges, graph neural networks (GNNs) have been widely used to classify brain networks derived from fMRI scans. Prior studies have demonstrated that GNN-based models outperform traditional machine learning methods such as the Multilayer Perceptron (MLP) in predicting depression severity [10]. Our previous research, which integrated global network attribute features into GNNs, further supports this view [11]. However, traditional GNNs often miss spatial details about brain regions: their message-passing mechanisms treat all nodes indiscriminately, ignoring the unique order and specific roles of brain regions in neural circuits, which creates a contradiction between GNNs’ node permutation invariance and brain region specificity. This limitation reduces both classification accuracy and interpretability. To address this issue, researchers have attempted to add distinctive features to each node when building brain networks [12]. Unfortunately, these modifications do not fully resolve the problem of indiscriminate node treatment during message passing, and classification performance remains suboptimal.

In this study, we present a new framework that preserves the spatial specificity of nodes by using edge convolution instead of traditional graph convolution, capturing directional connection patterns and thus better reflecting the unique functions of neural circuits. Furthermore, to make the model more robust and less dependent on labeled data, we adopt an adversarially trained contrastive learning scheme. By automatically learning which edges to delete when generating augmented graphs, while maintaining classification consistency, the model achieves effective data augmentation and enhanced interpretability. This design not only strengthens feature representation but also enables the identification of key brain regions closely associated with the pathophysiology of MDD.

Method

All data came from the REST-Meta-MDD project (http://rfmri.org/REST-meta-MDD), which includes fMRI scans from 2,428 participants across 17 hospitals in China. To meet the requirements of this study, we further screened the dataset following the same procedures described in our previous work [13]. After screening, 1,160 subjects from 10 independent sites were retained, comprising 597 patients with MDD and 563 healthy controls. The data for each subject were then preprocessed as described in Yan et al., without global signal regression [14]. Specifically, preprocessing was performed with the Data Processing Assistant for Resting-State fMRI (DPARSF), based on SPM12 in MATLAB 2018a. Following our previous study [11], we divided the brain into 116 regions of interest (ROIs) using the AAL atlas and adopted ROI-based FC analysis: brain networks were built by computing the Pearson correlation of FC between regions, with the diagonal elements of the connectivity matrix set to zero [15].
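As a minimal illustration of this network-construction step (a NumPy sketch with synthetic data and our own function name, not the DPARSF pipeline itself), the FC matrix can be built as:

```python
import numpy as np

def functional_connectivity(ts):
    """Build an ROI-based FC matrix from a (T, R) array of regional
    time series: Pearson correlation between every pair of the R
    signals, with the diagonal zeroed as described above."""
    fc = np.corrcoef(ts.T)        # rows of ts.T are the R regional signals
    np.fill_diagonal(fc, 0.0)     # discard trivial self-connections
    return fc

# Toy example: 200 time points for the 116 AAL regions (synthetic data)
rng = np.random.default_rng(0)
ts = rng.standard_normal((200, 116))
X = functional_connectivity(ts)   # 116 x 116 symmetric FC matrix
```

The resulting matrix is symmetric with a zero diagonal, matching the description above.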

Architecture overview

In this paper, an EC-GCL framework is proposed for MDD diagnosis: an adversarial graph contrastive learning framework with learnable data augmentation built on 1D convolution (edge convolution) operations. The framework is composed of a graph augmentation module and two classifier branches, one for the original data and one for the augmented data. The two branches share the same architecture but not parameters; each comprises a feature encoder, a projection head, and a classifier. The process is as follows. First, we extract the FC matrix from fMRI as the input graph X, which is fed into the graph augmenter T to obtain an augmented graph X̃. Then, X and X̃ are fed into the dual-branch network, with encoders f and f′, projectors g and g′, and classifiers c and c′, respectively, to obtain the contrastive embeddings z and z̃ and the classification results ŷ and ỹ. Finally, the model is trained with a total loss combining a contrastive loss and a classification loss, jointly optimizing feature extraction and classification performance.

The entire forward network can be formulated as:

(1)  X̃ = T(X)
(2)  z = g(f(X)),  z̃ = g′(f′(X̃))
(3)  ŷ = c(f(X))
(4)  ỹ = c′(f′(X̃))

In this network, the encoder f consists, in sequence, of two edge convolutional layers, a 1 × R convolutional layer, an R × 1 convolutional layer, and a linear layer. To introduce non-linearity, an activation function is applied after each component. Thus, the encoder can be formulated as:

(5)  H′ = σ(Econv2(σ(Econv1(H))))
(6)  f(H) = σ(W1 · σ(wR×1 * σ(w1×R * H′)) + b1)

where H represents the input feature, Econv represents the edge convolution, σ represents the activation function, wR×1 and w1×R are the weights of the R × 1 and 1 × R convolution layers, and W1 and b1 are the weight and bias of the linear layer.

Both the projection head g and the classifier c are MLPs with one hidden layer:

(7)  g(h) = W3 σ(W2 h + b2) + b3
(8)  c(h) = softmax(W5 σ(W4 h + b4) + b5)

The details are shown in Fig 1. Note that the activation layers and linear layers are not shown in the figure for simplicity.

Edge convolution operation

To avoid the permutation invariance problem of GNNs, we propose replacing graph convolution with edge convolution blocks. Specifically, for the feature map H^l ∈ ℝ^(M×R×R) of the l-th layer, where R is the number of regions of interest and M is the number of input channels, a 1 × R convolution filter is used to obtain Hr ∈ ℝ^(N×R×1), and an R × 1 convolution filter is used to obtain Hc ∈ ℝ^(N×1×R), where N is the number of output channels. The edge convolutional block can then be defined as:

(9)   Hr = w1×R * H^l + b1×R
(10)  Hc = wR×1 * H^l + bR×1
(11)  H^(l+1) = [Hr ⊗ 1, 1 ⊗ Hc]

where * represents the convolution operation, w1×R and b1×R represent the weight and bias of the 1 × R convolutional filter, wR×1 and bR×1 represent the weight and bias of the R × 1 convolutional filter, ⊗ represents the outer product, 1 is an R-dimensional all-ones vector (so each product broadcasts an R-vector back to an R × R map, and the two maps are concatenated along the channel dimension), and H^(l+1) is the output of the edge convolution operation.

By using edge convolution, the framework captures connections between nodes through specific filters with learnable weights, while maintaining the inherent specificity of the nodes. Specifically, the input FC matrix is processed by two one-dimensional convolutional operations: (1) The 1 × R convolutional filtering is used to focus on the brain region features in the “row” dimension, capturing the connection pattern between the current brain region and all other brain regions; (2) The R × 1 convolutional filtering is used to focus on the brain region features in the “column” dimension, capturing the connection pattern between all other brain regions and the current brain region. The results of the two operations are concatenated to form a feature representation, which not only contains the connection information between nodes, but also retains the spatial specificity of a single node (brain region), avoiding the limitation of “indiscriminate treatment of nodes” in traditional graph convolution algorithms. The detailed edge convolution algorithm is shown in Fig 2.
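The row/column filtering described above can be sketched for a single input and output channel as follows (a simplified NumPy illustration under our own naming; the actual model applies it with multiple learnable channels):

```python
import numpy as np

def edge_conv(H, w_row, b_row, w_col, b_col):
    """Single-channel sketch of the edge-convolution block.
    H     : (R, R) FC matrix
    w_row : (R,) weights of the 1 x R filter -> one response per row,
            i.e. the current region's connections to all others
    w_col : (R,) weights of the R x 1 filter -> one response per column,
            i.e. all other regions' connections to the current region
    Each R-vector is broadcast back to R x R via an all-ones vector and
    the two maps are concatenated along a new channel axis."""
    r = H @ w_row + b_row                # (R,) row responses
    c = H.T @ w_col + b_col              # (R,) column responses
    ones = np.ones(H.shape[0])
    row_map = np.outer(r, ones)          # constant along each row
    col_map = np.outer(ones, c)          # constant along each column
    return np.stack([row_map, col_map])  # (2, R, R)
```

Because the row and column responses are indexed by region, the output keeps each brain region's identity rather than pooling indiscriminately over neighbors.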

Graph augmentation

The graph augmentation module generates a novel data view by masking edges of the original FC matrix, while preserving edges that are informative for the classification task so that the features derived from the network remain as invariant as possible. Unlike random edge deletion, we present a learnable graph augmentation method based on contrastive learning. Specifically, the new graph is produced by training an edge-drop mask sampled from a probability matrix, which reflects the importance of each edge for classification. The specific process is shown in Fig 3.

Given the input FC matrix X ∈ ℝ^(R×R), we feed it into an auxiliary encoder to obtain an edge-dropping probability matrix P ∈ [0, 1]^(R×R). This encoder contains two edge convolutional layers with nonlinear activations, a 1 × 1 convolution operation, and a sigmoid activation, as shown in Fig 3. The process can be defined as:

(12)  H1 = σ(Econv3(X))
(13)  H2 = σ(Econv4(H1))
(14)  P = sigmoid(w1×1 * H2)

where Econv3 and Econv4 are two edge convolutional operations whose weights are not shared with those of the edge convolution layers in the main encoder f; σ denotes the nonlinear activation function, H1 and H2 represent the hidden features, and w1×1 is the weight of the 1 × 1 convolutional filter. After feature extraction from the original FC matrix, the 1 × 1 convolution reduces the channel number of H2 to 1, and a sigmoid function then normalizes the element values to the range [0, 1], yielding a probability matrix that gives the masking probability of each edge in the original FC matrix. In this way, the FC matrix is transformed into an edge-dropping probability matrix.

To obtain an edge-dropping graph, a binary mask is required. We therefore sample a mask matrix from a Bernoulli distribution parameterized by the edge-dropping probability matrix P (i.e., M ∼ Bernoulli(P)), where 0 indicates that the corresponding edge is completely masked and discarded, and 1 indicates that the edge is retained. However, this sampling process prevents gradients from back-propagating. To enable backward gradient flow, we use the following re-parameterization trick so that the graph augmenter can be trained effectively:

(15)  Mij = sigmoid((log Pij − log(1 − Pij) + εij) / τ)

where εij = log Eij − log(1 − Eij) is constructed from Eij ∼ Uniform(0, 1), and τ is a temperature parameter that controls the smoothness of the re-parameterized sampling function. As τ → 0, M gets closer to the sampled binary mask.

After obtaining the edge-dropping mask M, we apply it to the original FC matrix X by element-wise multiplication (⊙) to obtain the augmented graph X̃:

(16)  X̃ = M ⊙ X
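The sampling and masking steps can be sketched as follows (a NumPy illustration using the standard binary-concrete relaxation; the paper's exact parameterization of its re-parameterization trick may differ, and the probability values here are synthetic):

```python
import numpy as np

def relaxed_bernoulli_mask(P, tau, rng):
    """Differentiable surrogate for M ~ Bernoulli(P): uniform noise
    E is injected through a sigmoid whose sharpness is set by the
    temperature tau; as tau -> 0 the entries approach hard 0/1."""
    E = rng.uniform(1e-6, 1.0 - 1e-6, size=P.shape)
    logits = np.log(P) - np.log1p(-P) + np.log(E) - np.log1p(-E)
    # numerically stable sigmoid(logits / tau)
    return 0.5 * (1.0 + np.tanh(logits / (2.0 * tau)))

rng = np.random.default_rng(0)
P = rng.uniform(0.05, 0.95, size=(116, 116))  # edge probabilities (synthetic)
M = relaxed_bernoulli_mask(P, tau=0.1, rng=rng)
X = rng.standard_normal((116, 116))           # stand-in FC matrix
X_aug = M * X                                 # element-wise edge masking
```

In a real training loop the sigmoid surrogate keeps the augmenter differentiable; at evaluation time the mask can be thresholded to a hard 0/1 matrix.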

Loss functions

To train the feature extractor and graph augmenter, we use a loss function that pulls the two feature vectors of a positive sample pair together and pushes the two feature vectors of a negative sample pair apart. The original FC matrix X and the augmented graph X̃ derived from it are considered a positive pair, while graphs derived from different original FC matrices form negative pairs. In this way, the trained feature extractor captures the most important information while discarding the redundant. The contrastive loss is defined as follows:

(17)  Lcon = −(1/B) Σ_{b=1}^{B} log [ exp(sim(zb, z̃b)/τ) / Σ_{b′≠b} exp(sim(zb, z̃b′)/τ) ]

where B is the number of graphs in the batch, τ is a temperature hyper-parameter, and sim(·, ·) is the cosine similarity metric, i.e.,

(18)  sim(z, z̃) = z⊤z̃ / (‖z‖ ‖z̃‖)
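The contrastive objective can be sketched as follows (a NumPy version of an NT-Xent-style loss; the exact construction of the negative set is our assumption):

```python
import numpy as np

def contrastive_loss(z, z_aug, tau=0.5):
    """Pulls each (z_b, z_aug_b) positive pair together and pushes
    cross-sample pairs apart, using cosine similarity scaled by a
    temperature tau."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    z_aug = z_aug / np.linalg.norm(z_aug, axis=1, keepdims=True)
    sim = z @ z_aug.T / tau                   # (B, B) scaled cosine similarities
    pos = np.diag(sim)                        # positive-pair similarities
    exp_sim = np.exp(sim)
    neg = exp_sim.sum(axis=1) - np.exp(pos)   # sum over negatives (b' != b)
    return float(np.mean(-pos + np.log(neg)))
```

Aligned embeddings (each sample close to its own augmented view) yield a lower loss than mismatched ones, which is what drives the augmenter to preserve class-relevant structure.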

However, this contrastive loss can be minimized trivially by discarding only a very small number of edges. To encourage the graph augmenter to drop more edges, we regularize the ratio of discarded edges. The regularization loss is defined as the mean value of all elements in the edge-dropping mask M:

(19)  Lreg = (1/R²) Σ_{i=1}^{R} Σ_{j=1}^{R} Mij

To train the feature extractor and classifier, we use a classification loss simultaneously with the contrastive loss. The classifier yields ŷ, the classification result for the original FC matrix X, and ỹ, the classification result for the augmented matrix X̃. Their average cross-entropy (CE) is used as the final classification loss:

(20)  Lcls = ½ [CE(ŷ, y) + CE(ỹ, y)]

where y is the label of the corresponding samples X and X̃.

Finally, the total loss function of our framework is expressed as:

(21)  L = Lcls + λ1 Lcon + λ2 Lreg

where λ1 and λ2 are hyper-parameters that balance the three loss terms. In Eq (21), the unknown variables are the model parameters to be optimized, including the weights and biases of the edge convolutional layers, linear layers, and convolutional layers in the dual-branch encoders, the augmenter encoder, the projectors, and the classifiers.

Experiments and results

Experimental setup

Our algorithm is implemented on an NVIDIA 4060 GPU with Python 3.9.16 and PyTorch 2.0.0. The overall network is optimized using the Adam optimizer with default momentum parameters. To maintain training stability, we set up a separate optimizer for the graph augmentation module, with a smaller learning rate than the rest of the network: lr1 = 1e-4 for the graph augmenter and lr2 = 1e-3 for the other parts. The total number of epochs is set to 300 and the batch size is 16. The hyper-parameters λ1 and λ2 in the loss function were determined by grid search. The activation function in the graph augmenter and feature encoder is LeakyReLU (negative slope = 0.33), and the activation function in the projection head and the classifier is ReLU. The specific experimental parameters are shown in Table 1.

The regularization coefficient λ2 in the total loss function is crucial for controlling the edge-masking intensity applied to the original graph in the data augmentation module, which in turn affects the model’s classification performance. To identify the optimal value of λ2, we employed a grid search strategy over the range 0.05 to 0.40. As presented in Fig 4, the experimental results demonstrate that the model attains its peak performance when λ2 is set to 0.2.

Fig 4. Effects of different regularization coefficients on classification performance.

https://doi.org/10.1371/journal.pone.0347870.g004

To evaluate the computational efficiency of the EC-GCL framework, we recorded the total number of parameters, training time, and inference time. The EC-GCL framework has 0.207 M trainable parameters. The average training time per epoch was 3.57 seconds, giving a total training time of approximately 18 minutes for 300 epochs. For inference, the average time to process a single sample was 5.88 milliseconds (ms). This computational efficiency, together with the framework’s superior diagnostic performance (presented in subsequent sections), indicates a favorable trade-off between accuracy and efficiency.

In addition to achieving accurate classification, it is also crucial to explore the relationship between the disease and the brain. A natural advantage of our method is that it supports interpretability analysis. In the contrastive learning framework, the learnable graph augmentation discards edges that are not useful for classification, preserving the functional connections (edges) that are important for disease classification. By analyzing these retained edges, we compute the importance of each brain region and thereby identify the ten brain regions most relevant to classification.

Specifically, we calculate the importance score by the edge-dropping mask M as follows:

(22)  S = sumrow(M + M⊤)

where sumrow(·) represents row-wise summation. By adding the edge-dropping mask M to its transpose, the symmetric connectivity importance between each brain region and the others is obtained; summing each row of the resulting matrix then gives the total connection-importance score S of each brain region with all others. Accordingly, the brain regions with the top 10 highest scores in S, averaged across all test samples, are taken as the 10 most important regions for the MDD classification task.
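A sketch of this scoring step (NumPy, with our own function names):

```python
import numpy as np

def region_importance(M):
    """Symmetrise the edge-dropping mask and sum each row, giving one
    connection-importance score per brain region."""
    return (M + M.T).sum(axis=1)

def top_regions(masks, k=10):
    """Average the scores over all test-sample masks and return the
    (0-based) indices of the k highest-scoring regions."""
    S_avg = np.mean([region_importance(M) for M in masks], axis=0)
    return np.argsort(S_avg)[::-1][:k]
```

With the 116-region AAL atlas, the returned indices map directly back to named brain regions for interpretation.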

Comparison results

To evaluate the performance of our EC-GCL framework, we compared it with eight well-known machine learning methods. First, we compared the proposed method with two traditional machine learning methods: the Support Vector Machine (SVM) and the Multilayer Perceptron (MLP). For these two methods, the upper triangle of the FC matrix was flattened into a 6,670-dimensional vector, which was then used as input to the SVM or MLP classifier. We also compared our framework with two well-known GNN models: Graph Convolutional Networks (GCNs) [16] and the Graph Attention Network (GAT) [17]. In addition, our method was compared with methods designed specifically for fMRI network analysis: BrainNetCNN [18], BrainGNN [19], HI-GCN [20], and A-GCL [21]. For these four methods, we followed the parameter settings described in their original papers to ensure fairness.

Model performance was evaluated by three standard metrics: area under the curve (AUC), accuracy, and F1 score. To ensure robustness, five independent experiments were conducted with five randomly initialized seeds, and the mean and standard deviation of the results are reported in Table 2. The experimental results demonstrate that the proposed EC-GCL framework consistently achieves superior performance across all evaluation metrics on the dataset. These results provide strong evidence for the effectiveness of the EC-GCL model in the classification of MDD.

Table 2. Comparison with different algorithms.

https://doi.org/10.1371/journal.pone.0347870.t002

Ablation studies

To understand how different parts of our method affect performance, we carried out ablation experiments. We compared the complete EC-GCL framework against three simplified versions:

  1. GC-GCL (Graph Convolution based Graph Contrastive Learning): replaces the edge convolution layer with a standard GCN to validate the efficacy of the edge convolution.
  2. EC-GCL-RA (Random Augmentation): substitutes the learnable data augmentation module with random edge deletion, showing the importance of adaptive augmentation.
  3. EC-GCL-DT (Decoupled Training): separates contrastive learning and supervised classification into two stages instead of joint training, highlighting the effect of joint training.

The results of the ablation experiments are shown in Table 3. EC-GCL outperforms all three variants across the evaluation metrics, indicating that edge convolution, learnable data augmentation, and joint training of the contrastive and classification losses each yield substantial improvements in model performance.

Leave-one-site-out experiments

To verify the generalization capability of the proposed method across multi-center data, we used a Leave-One-Site-Out Cross-Validation (LOSOCV) approach. In each validation fold, the data from one site were set aside as the test set, while the data from all remaining sites were used for training. This process was repeated for every site, so the number of folds (k) equals the number of sites; for the REST-meta-MDD dataset, which includes data from 10 sites, this resulted in 10 folds.

To evaluate overall performance across all sites, we calculated the three indicators as weighted averages rather than simple arithmetic means. Because the number of samples varies greatly across sites, simple averaging would bias the results disproportionately toward smaller sites, which are often less representative. We therefore used weighted average metrics, with weights proportional to each site’s sample size, providing a fairer and more reliable measure of overall performance.
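The aggregation can be sketched as follows (the per-site numbers are hypothetical, for illustration only):

```python
def weighted_average(metrics, n_samples):
    """Weight each site's metric by its sample size, so that large,
    more representative sites contribute proportionally more."""
    total = sum(n_samples)
    return sum(m * n for m, n in zip(metrics, n_samples)) / total

# Hypothetical per-site accuracies and sample counts for three sites
site_acc = [0.70, 0.60, 0.65]
site_n = [300, 50, 150]
overall = weighted_average(site_acc, site_n)   # 0.675
```

Note how the small 50-sample site barely moves the overall figure, whereas a simple arithmetic mean would give it equal weight.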

We report the three evaluation metrics for each site and compare the average results under two settings: LOSOCV and conventional cross-validation without LOSO. The detailed results are shown in Figs 5–7. On the REST-Meta-MDD dataset, the weighted average accuracy for LOSOCV was 66.2%, only 1.6% lower than the 67.8% achieved with conventional cross-validation. This small difference indicates that the model remains stable even when accounting for differences between sites. Moreover, the experiments confirm that the EC-GCL model is effective at identifying MDD using fMRI brain data from various research centers.

Fig 5. The results of the leave-one-site-out experiments (Accuracy).

https://doi.org/10.1371/journal.pone.0347870.g005

Fig 6. The results of the leave-one-site-out experiments (AUC).

https://doi.org/10.1371/journal.pone.0347870.g006

Fig 7. The results of the leave-one-site-out experiments (F1).

https://doi.org/10.1371/journal.pone.0347870.g007

Interpretability analysis

From the interpretability analysis, we identified ten brain regions that most strongly influenced the classification results: the right middle frontal gyrus (MFG.R), right dorsal cingulate gyrus (DCG.R), right insula (INS.R), left dorsolateral superior frontal gyrus (SFGdor.L), left precentral gyrus (PreCG.L), left superior parietal gyrus (SPG.L), left superior temporal gyrus (STG.L), left middle temporal gyrus (MTG.L), left dorsal cingulate gyrus (DCG.L), and left thalamus (THA.L). Most of these regions are located on the left side of the brain. The detailed distribution is shown in Fig 8.

Fig 8. The ten brain regions that have the greatest influence on classification.

https://doi.org/10.1371/journal.pone.0347870.g008

Discussion

This study proposes an intelligent diagnostic framework for MDD that leverages graph contrastive learning, integrating edge convolutional operations with learnable graph augmentation. Relative to conventional GNNs, the EC-GCL framework better preserves the spatial specificity of brain regions, thereby improving both diagnostic accuracy and interpretability.

Three key interconnected innovations collectively drive the superior performance of EC-GCL. First, edge convolution replaces traditional message-passing mechanisms to more effectively capture physiologically meaningful connectivity patterns. By processing row and column dimensions of the FC matrix separately, it retains the unique functional roles of individual brain regions in neural circuits. Our ablation experiments have confirmed that replacing standard GCN (GC-GCL) with edge convolution improves AUC by 3.3% (from 67.9% to 71.2%), highlighting the critical role of this design in capturing disease-relevant neural patterns. Second, learnable graph augmentation adaptively discards redundant edges while preserving clinically relevant connections, avoiding the blindness of random augmentation. Compared to random augmentation (EC-GCL-RA), our approach improved AUC by 4.2% (from 67.0% to 71.2%), demonstrating that adaptive augmentation strengthens feature discriminability and reduces overfitting. Third, joint training of contrastive loss, classification loss, and regularization loss ensures the model learns discriminative and invariant features, reducing overfitting and improving generalization. Contrastive loss enforces invariant feature learning between original and augmented graphs, while classification loss anchors the model to diagnostic labels. Regularization loss encourages meaningful edge deletion, further focusing the model on critical connections. Decoupled training (EC-GCL-DT) separates these objectives, leading to a 2.1% AUC reduction, confirming that joint optimization enhances both performance and generalization.

These innovations build on and extend prior advances in graph contrastive learning. Lin et al. [21] presented graph data augmentation from a spectral perspective by exploring the invariance properties of graphs in the spectral domain. Cai et al. [22] developed LightGCL, whose contrastive augmentation is based on singular value decomposition and can optimize the graph structure according to global relationships. Consistent with these works, our results validate that structure-aware graph augmentation outperforms random strategies, particularly in clinical datasets where meaningful connections are sparse. However, the graph augmentation in both works relies on rule-based perturbations rather than a learnable augmentation module, lacking adaptive optimization tailored to dataset-specific characteristics. Yin et al. [23] proposed AutoGCL, which learns a probability-distribution-based graph augmenter within the GCL framework, largely retaining the representative structure of the original graph while introducing sufficient variance into the augmented views. Suresh et al. [24] introduced adversarial GCL (AD-GCL) with trainable edge-dropping graph augmentation, which avoids capturing redundant graph features. Zhang et al. [25] further proposed dynamic memory banks for negative sample expansion. Although these studies align with ours in replacing blind random augmentation with learning-based adaptive strategies, and in integrating augmentation with contrastive learning to strengthen feature discriminability, they adopt traditional GNNs as encoders and therefore lack directional connectivity capture. By integrating learnable augmentation with edge convolution, our framework enhances both feature specificity and data utility for graph-based MDD diagnosis.

Beyond validating the framework’s diagnostic efficacy, the interpretability analysis enabled by our learnable graph augmentation module further uncovers MDD’s neural mechanisms. By quantifying the importance of retained functional connections via edge-dropping mask analysis, we identified ten specific brain regions that contribute to the classification of MDD. These regions are largely consistent with past research and support core ideas regarding the neural mechanisms of MDD, such as emotion regulation deficits, cognitive impairment, interoceptive dysfunction, and disturbances in sensory-emotional integration [11].

The lateral habenula (LHb), a subregion of the thalamus (THA), is recognized as a key brain region in the pathogenesis of depression [26]. The interaction pattern between specific brain cells (glial cells and neurons) within the LHb serves as a critical driver of depression. When LHb neurons fire abnormally, they suppress the brain’s reward and pleasure centers, which in turn induces depressive symptoms including persistent low mood and anhedonia [27]. Ketamine acts rapidly as an antidepressant by precisely blocking abnormal signals originating from the LHb [28]. Additionally, recent studies have found a decrease in the volume of the left thalamus (THA.L) [3] and a reduction in white matter fiber tract markers in MDD patients [4]. Consistent with these prior findings, our study also highlights THA.L as a critical region for the identification of MDD. Moreover, the left dorsolateral superior frontal gyrus (SFGdor.L) and THA.L are integral components of the cortico-striatal-pallidal-thalamic circuit, which exhibits hyperactivation in response to negative stimuli in individuals with MDD [29]. Our results likewise show that SFGdor.L is an important region for MDD recognition. Collectively, these results reinforce the notion that changes in neural circuit activity are reliable biomarkers for MDD identification.

The dorsal cingulate gyrus (DCG) serves as a central hub for the cognitive regulation of emotion, with a primary role in processing self-referential negative information and emotional conflict. In MDD patients, this region is hyperactivated during episodes of negative self-reflection [30], which weakens control over negative thoughts, promotes repetitive self-focus (rumination), and makes it difficult to disengage from negative thinking. Consistent with prior research, our study identifies both the left and right DCG (DCG.L and DCG.R) as critical biomarkers for MDD.

The insula is also considered a key indicator of MDD owing to its role in mood regulation and the altered connectivity observed in patients [31]. Specifically, some studies indicate that pre-treatment activity in the right anterior insula may predict whether an MDD patient will respond better to psychotherapy or to medication [32]. A reduced volume of the right anterior insula has likewise been associated with a higher risk of relapse, even after controlling for other risk factors such as stressful life events [33]. In MDD patients, the insula shows altered functional connectivity with other brain regions, including the amygdala and prefrontal cortex [34]. Our study further confirms that the right insula (INS.R) is important for identifying MDD, consistent with earlier neuroscience findings.

Although EC-GCL improves both classification performance and interpretability for MDD, this study has several limitations that should be addressed in future research. First, the dataset is derived exclusively from Chinese populations, which may limit the generalizability of the model to other ethnic groups; future work will expand the dataset to include multi-ethnic samples. Second, we relied solely on static rs-fMRI functional connectivity, ignoring the dynamic nature of brain networks and the complementary information available from other modalities. Drawing on recent related studies [35,36], we plan to deepen our research in two directions: first, inspired by [35], we will integrate rs-fMRI with clinical data, structural MRI, and other modalities to capture more comprehensive MDD-related information; second, building on [36], we will incorporate dynamic graph learning to model time-varying functional connections, complementing our static graph learning.

Conclusion

To improve the intelligent diagnosis of MDD, we proposed a new learning framework that combines edge convolution with graph contrastive learning. The framework resolves the tension between the permutation invariance of GNNs and the spatial specificity of brain regions through edge convolution blocks, while the learnable graph augmentation strategy helps retain important functional connections. The results showed that our method outperformed a range of competing methods, including traditional machine learning models, classic GNNs, and state-of-the-art fMRI analysis models. Additionally, the framework enables interpretability analysis by identifying critical brain regions via retained edges, offering insights into the neural mechanisms of MDD.

Acknowledgments

Data were provided by the members of REST-meta-MDD Consortium.

References

  1. GBD 2019 Stroke Collaborators. Global, regional, and national burden of stroke and its risk factors, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet Neurol. 2021;20(10):795–820. pmid:34487721
  2. Regier DA, Narrow WE, Clarke DE, Kraemer HC, Kuramoto SJ, Kuhl EA, et al. DSM-5 field trials in the United States and Canada, Part II: test-retest reliability of selected categorical diagnoses. Am J Psychiatry. 2013;170(1):59–70. pmid:23111466
  3. Ye J, Xu C, Guo H, Huang W, Xie G, Liang J. Structural and functional thalamic alterations in major depressive disorder with comorbid chronic pain. Sci Rep. 2025;15(1):16854. pmid:40374655
  4. He C, Gong L, Yin Y, Yuan Y, Zhang H, Lv L, et al. Amygdala connectivity mediates the association between anxiety and depression in patients with major depressive disorder. Brain Imaging Behav. 2019;13(4):1146–59. pmid:30054873
  5. Liu W, Heij J, Liu S, Liebrand L, Caan M, van der Zwaag W, et al. Structural connectivity of thalamic subnuclei in major depressive disorder: an ultra-high resolution diffusion MRI study at 7-Tesla. J Affect Disord. 2025;370:412–26. pmid:39505018
  6. Gao S, Calhoun VD, Sui J. Machine learning in major depression: from classification to treatment outcome prediction. CNS Neurosci Ther. 2018;24(11):1037–52. pmid:30136381
  7. Winter NR, Blanke J, Leenings R, Ernsting J, Fisch L, Sarink K, et al. A systematic evaluation of machine learning-based biomarkers for major depressive disorder. JAMA Psychiatry. 2024;81(4):386–95. pmid:38198165
  8. Kahali S, Raichle ME, Yablonskiy DA. The role of the human brain neuron-glia-synapse composition in forming resting-state functional connectivity networks. Brain Sci. 2021;11(12):1565. pmid:34942867
  9. Drysdale AT, Grosenick L, Downar J, Dunlop K, Mansouri F, Meng Y, et al. Resting-state connectivity biomarkers define neurophysiological subtypes of depression. Nat Med. 2017;23(1):28–38. pmid:27918562
  10. Zhang Y, Huang H. New graph-blind convolutional network for brain connectome data analysis. In: International Conference on Information Processing in Medical Imaging. 2019. p. 669–81.
  11. Zhang M, Long D, Chen Z, Fang C, Li Y, Huang P, et al. Multi-view graph network learning framework for identification of major depressive disorder. Comput Biol Med. 2023;166:107478. pmid:37776730
  12. Zhao K, Duka B, Xie H, Oathes DJ, Calhoun V, Zhang Y. A dynamic graph convolutional neural network framework reveals new insights into connectome dysfunctions in ADHD. Neuroimage. 2022;246:118774. pmid:34861391
  13. Long D, Zhang M, Yu J, Zhu Q, Chen F, Li F. Intelligent diagnosis of major depression disease based on multi-layer brain network. Front Neurosci. 2023;17:1126865. pmid:37008226
  14. Yan C-G, Chen X, Li L, Castellanos FX, Bai T-J, Bo Q-J, et al. Reduced default mode network functional connectivity in patients with recurrent major depressive disorder. Proc Natl Acad Sci U S A. 2019;116(18):9078–83. pmid:30979801
  15. Veličković P, Fedus W, Hamilton WL, Liò P, Bengio Y, Hjelm RD. Deep graph Infomax. In: International Conference on Learning Representations (ICLR). 2019.
  16. Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations (ICLR). 2017.
  17. Veličković P, Cucurull G, Casanova A, Romero A, Liò P, Bengio Y. Graph attention networks. In: International Conference on Learning Representations (ICLR). 2018.
  18. Kawahara J, Brown CJ, Miller SP, Booth BG, Chau V, Grunau RE, et al. BrainNetCNN: convolutional neural networks for brain networks; towards predicting neurodevelopment. Neuroimage. 2017;146:1038–49. pmid:27693612
  19. Li X, Zhou Y, Dvornek N, Zhang M, Gao S, Zhuang J, et al. BrainGNN: interpretable brain graph neural network for fMRI analysis. Med Image Anal. 2021;74:102233. pmid:34655865
  20. Jiang H, Cao P, Xu M, Yang J, Zaiane O. Hi-GCN: a hierarchical graph convolution network for graph embedding learning of brain network and brain disorders prediction. Comput Biol Med. 2020;127:104096. pmid:33166800
  21. Lin L, Chen J, Wang H. Spectral augmentation for self-supervised learning on graphs. In: International Conference on Learning Representations (ICLR). 2023.
  22. Cai X, Huang C, Xia L, Ren X. LightGCL: simple yet effective graph contrastive learning for recommendation. In: International Conference on Learning Representations (ICLR). 2023.
  23. Yin Y, Wang Q, Huang S, Xiong H, Zhang X. AutoGCL: automated graph contrastive learning via learnable view generators. AAAI. 2022;36(8):8892–900.
  24. Suresh S, Li P, Hao C, Neville J. Adversarial graph augmentation to improve graph contrastive learning. Advances in Neural Information Processing Systems. 2021;34:15920–33.
  25. Zhang S, Chen X, Shen X, Ren B, Yu Z, Yang H, et al. A-GCL: Adversarial graph contrastive learning for fMRI analysis to diagnose neurodevelopmental disorders. Med Image Anal. 2023;90:102932. pmid:37657365
  26. Li K, Zhou T, Liao L, Yang Z, Wong C, Henn F, et al. βCaMKII in lateral habenula mediates core symptoms of depression. Science. 2013;341(6149):1016–20. pmid:23990563
  27. Cui Y, Yang Y, Ni Z, Dong Y, Cai G, Foncelle A, et al. Astroglial Kir4.1 in the lateral habenula drives neuronal bursts in depression. Nature. 2018;554(7692):323–7. pmid:29446379
  28. Ma S, Chen M, Jiang Y, Xiang X, Wang S, Wu Z, et al. Sustained antidepressant effect of ketamine through NMDAR trapping in the LHb. Nature. 2023;622(7984):802–9. pmid:37853123
  29. Hamilton JP, Etkin A, Furman DJ, Lemus MG, Johnson RF, Gotlib IH. Functional neuroimaging of major depressive disorder: a meta-analysis and new integration of baseline activation and neural response data. Am J Psychiatry. 2012;169(7):693–703. pmid:22535198
  30. Malhi GS, Mann JJ. Depression. Lancet. 2018;392(10161):2299–312.
  31. Cooney RE, Joormann J, Eugène F, Dennis EL, Gotlib IH. Neural correlates of rumination in depression. Cogn Affect Behav Neurosci. 2010;10(4):470–8. pmid:21098808
  32. Dunlop BW, Mayberg HS. Neuroimaging-based biomarkers for treatment selection in major depressive disorder. Dialogues Clin Neurosci. 2014;16(4):479–90. pmid:25733953
  33. Schnellbächer GJ, Rajkumar R, Veselinović T, Ramkiran S, Hagen J, Shah NJ, et al. Structural alterations of the insula in depression patients - A 7-Tesla-MRI study. Neuroimage Clin. 2022;36:103249. pmid:36451355
  34. He C, Fan D, Liu X, Wang Q, Zhang H, Zhang H, et al. Insula network connectivity mediates the association between childhood maltreatment and depressive symptoms in major depressive disorder patients. Transl Psychiatry. 2022;12(1):89. pmid:35236833
  35. Lin Y, Feng J, Chen X, Xue R, Jiang J, Tian Z, et al. Multi-level graph self-supervised learning for multi-modal medical corpus construction. Pattern Recognition. 2026;171:112113.
  36. Wang S, Chen L, Liu Z. Multi-task dynamic graph learning for brain disorder identification with functional MRI. Pattern Recognition. 2026;170:111922.