Hybrid multivariate pattern analysis combined with extreme learning machine for Alzheimer’s dementia diagnosis using multi-measure rs-fMRI spatial patterns

  • Duc Thanh Nguyen,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Biomedical Science and Engineering (BMSE), Institute of Integrated Technology (IIT), Gwangju Institute of Science and Technology (GIST), Gwangju, Republic of Korea

  • Seungjun Ryu,

    Roles Conceptualization, Methodology

    Affiliation Department of Biomedical Science and Engineering (BMSE), Institute of Integrated Technology (IIT), Gwangju Institute of Science and Technology (GIST), Gwangju, Republic of Korea

  • Muhammad Naveed Iqbal Qureshi,

    Roles Conceptualization, Methodology

    Affiliations Translational Neuroimaging Laboratory, The McGill University Research Center for Studies in Aging (MCSA), McGill University, Montreal, Canada, Alzheimer’s Disease Research Unit, Douglas Mental Health University Institute, McGill University, Montreal, Canada, Department of Psychiatry, McGill University, Montreal, Canada, Montreal Neurological Institute and Hospital, Montreal, Canada

  • Min Choi,

    Roles Formal analysis

    Affiliation Department of Biomedical Science and Engineering (BMSE), Institute of Integrated Technology (IIT), Gwangju Institute of Science and Technology (GIST), Gwangju, Republic of Korea

  • Kun Ho Lee,

    Roles Data curation

    Affiliations National Research Center for Dementia, Chosun University, Gwangju, Republic of Korea, Department of Biomedical Science, Chosun University, Gwangju, Republic of Korea

  • Boreom Lee

    Roles Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Visualization, Writing – review & editing

    leebr@gist.ac.kr

    Affiliation Department of Biomedical Science and Engineering (BMSE), Institute of Integrated Technology (IIT), Gwangju Institute of Science and Technology (GIST), Gwangju, Republic of Korea

Abstract

Background

Early diagnosis of Alzheimer’s disease (AD) and Mild Cognitive Impairment (MCI) is essential for timely treatment. Machine learning and multivariate pattern analysis (MVPA) for the diagnosis of brain disorders are attracting growing attention in the neuroimaging community. In this paper, we propose a voxel-wise discriminative framework applied to multi-measure resting-state fMRI (rs-fMRI) that integrates hybrid MVPA and extreme learning machine (ELM) for the automated discrimination of AD and MCI from the cognitive normal (CN) state.

Materials and methods

We used two rs-fMRI cohorts: the public Alzheimer’s disease Neuroimaging Initiative database (ADNI2) and an in-house Alzheimer’s disease cohort from South Korea, both including individuals with AD, MCI, and normal controls. After extracting three-dimensional (3-D) patterns measuring regional coherence and functional connectivity during the resting state, we performed univariate statistical t-tests to generate a 3-D mask that retained only voxels showing significant changes. Given the initial univariate features, to enhance discriminative patterns, we implemented MVPA feature reduction using support vector machine-recursive feature elimination (SVM-RFE), and least absolute shrinkage and selection operator (LASSO), in combination with the univariate t-test. Classifications were performed by an ELM, and its efficiency was compared to linear and nonlinear (radial basis function) SVMs.

Results

The maximal accuracies achieved by the method in the ADNI2 cohort were 98.86% (p<0.001) and 98.57% (p<0.001) for AD and MCI vs. CN, respectively. In the in-house cohort, the same accuracies were 98.70% (p<0.001) and 94.16% (p<0.001).

Conclusion

From a clinical perspective, combining extreme learning machine and hybrid MVPA applied on concatenations of multiple rs-fMRI biomarkers can potentially assist the clinicians in AD and MCI diagnosis.

Introduction

Alzheimer’s disease (AD) is the most common neurodegenerative disease and is the main cause of 60% to 70% of dementia cases in aging societies. It is characterized by cognitive decline and short-term memory loss [1, 2]. Mild cognitive impairment (MCI) is referred to as the prodromal stage of AD, and subjects with MCI are at high risk of developing AD [3]. Because AD/MCI are neurodegenerative diseases and progressively attack memory cells, the development of early diagnostic tools is undoubtedly important.

In recent years, resting-state functional magnetic resonance imaging (rs-fMRI) was shown to be a powerful tool for analysing the spontaneous blood-oxygen-level-dependent (BOLD) contrasts to map neural activity associated with a variety of brain functions. In order to map the brain areas involved in a given cognitive function, the BOLD signal at the level of the individual voxel is analyzed [4]. Statistical analysis is then performed on all voxels to show regions whose BOLD signal shows significant effects. This approach is referred to as univariate t-test analysis, which is performed independently on each voxel, and has been used in neuroimaging research for decades [5, 6]. However, this approach can only show differences between group averages, and is not sufficient to diagnose individual subjects. Therefore, recently, a machine learning (ML) technique known as multivariate pattern analysis (MVPA) has been promisingly applied to classify individual subjects using neuroimaging scans [7, 8]. Multivariate methods such as support vector machine-recursive feature elimination (SVM-RFE) and least absolute shrinkage and selection operator (LASSO) investigate the mutual relationships between multiple voxels and spatial patterns. Thus, the combination of univariate t-test and multivariate MVPA approaches is expected to enhance the prediction performance as compared to each individual approach used alone.

Previous fMRI studies have indicated that the pathophysiology of AD/MCI can be associated with statistical changes, in the average sense, of regional spontaneous low-frequency (<0.08 Hz) BOLD fluctuation coherence measured in the resting state and analysed using univariate t-tests. The metrics used in these studies included regional homogeneity (ReHo) [9, 10], amplitude of low-frequency fluctuation (ALFF) [11–13], and fractional ALFF (fALFF), as well as functional connectivity (FC) [14]. For example, He et al. [10] showed that the posterior cingulate cortex (PCC) and the precuneus (PCu) have the largest ReHo differences between the AD and CN groups (p<0.05). The ALFF and fALFF studies using fMRI by Han et al. [15] revealed that MCI patients had decreased fALFF values in PCC/PCu and hippocampus, and increased fALFF values in several other regions, including occipital and temporal cortices. Rs-fMRI FC, investigated by Li et al. [16], showed that the regions with high FC were mostly located in the default mode network (DMN), and mainly involved the bilateral PCu and PCC [17]. These are all statistically significant findings at the group level. However, the discriminative ability of the above-mentioned biomarkers related to AD/MCI has not been evaluated. Since the discrimination task automatically classifies each subject into one of the studied groups (AD/MCI vs. CN), it is considered a much more complex task than the study of differences between groups [18, 19].

In neuroimaging studies, preprocessed brain scans commonly contain hundreds of thousands of non-zero voxels, which greatly outnumber the subjects (often fewer than 1000). Thus, selecting an adequate subset of relevant training features/voxels is of critical importance to obtain good generalization ability and to reduce the risk of overfitting and the computational complexity. A growing trend today is the design of ML-based feature reduction techniques integrated with classification methods applied to neuroimaging data for the voxel-based automated discrimination of patients with brain disorders, including AD and MCI (see the reviews [18, 19]). Many studies have demonstrated the relevance of feature selection. Statistical hypothesis t-tests have been broadly and successfully used not only for detecting group differences but also for feature selection. The technique relies on an optimal significance threshold (p-value) that defines a subset of important features among the whole-brain features. Although t-tests are computationally efficient and easy to implement for feature selection, the technique suffers from a significant drawback: it does not consider interactions between multiple features or spatial patterns, which reflect the inherently multivariate nature of fMRI data. By contrast, MVPA methods do evaluate the relationships between multiple patterns. However, the primary drawback of whole-brain MVPA is that it is computationally demanding because of the 3-D, high-dimensional nature of the data and the large number of images being analyzed [20–22]. Thus, to select the most informative features, a univariate feature selection strategy should be performed prior to MVPA in order to reduce the dimensionality to a level compatible with memory capacity and computational efficiency and to ensure high sensitivity to fine-grained spatial discriminative patterns, while preserving the appealing properties of whole-brain fMRI analysis and the multivariate nature of fMRI data [21, 22].
In practice, many previous studies have successfully employed hybrid combinations of filter-based t-tests and MVPA techniques, such as wrapper-based SVM-RFE, to diagnose brain disorders using neuroimaging data, e.g., ADHD [23–25], MCI [26–28], autism [29], and AD [30, 31], or for high-dimensional gene selection [22, 32] (accuracies >90%).

In this study, we propose an ML-based AD/MCI diagnosis framework combining MVPA and extreme learning machines (ELM) applied to multi-measure rs-fMRI data. We first extracted maps of 3-D regional coherence (ReHo, ALFF, and fALFF) and of resting-state FC (rsFC) (degree centrality (DC), seed-based rsFC) of multiple individual subjects. We then performed statistical univariate two-sample t-tests on whole-brain 3-D maps between two pre-defined training groups, to generate an analysis mask that retained only an initial set of relevant features (voxels) showing significant changes in any one of the measures, i.e., ReHo, fALFF, rs-FC. Next, MVPA techniques such as the wrapper-based SVM-RFE proposed by Guyon [20] and embedded-based LASSO were implemented to optimize the discriminative performance. In this study we used ELM and competing methods, including linear and non-linear SVM classifiers, to distinguish AD/MCI patients from the CN controls. We hypothesized that a hybrid combination of univariate statistical t-test and MVPA approaches applied to the concatenation of multiple functional biomarkers could boost the classification performance. Thus, the major contributions of this study can be summarized as follows:

  • We propose a voxel-wise ML-based discriminative framework integrating ELM classifier and hybrid MVPA techniques for automated AD/MCI diagnosis using multi-measure rs-fMRI.
  • The proposed framework extracts a maximum amount of information from multiple rs-fMRI biomarkers of a public Alzheimer’s disease Neuroimaging Initiative (ADNI2) and an in-house AD cohort from South Korea and, therefore, achieves maximal classification accuracies as compared to all other previous studies.
  • We demonstrate that, compared to conventional univariate statistical analysis t-test, the hybrid combination of multivariate methods (univariate t-test + SVM-RFE and univariate t-test + LASSO) increases the classification performance of the discriminative patterns.
  • The effectiveness of the ELM classifier, superior to that of linear and radial basis function (RBF)-based SVM classifiers, when combined with hybrid feature selection methods for AD/MCI identifications based on multi-biomarker rs-fMRI is addressed for the first time in this work.
  • We showed that the highest classification accuracies are achieved when all patterns from multiple regional coherence and functional connectivity biomarkers are concatenated. This suggests that different brain regions suffer different functional losses due to AD/MCI. Hence, a classification framework should include the maximum amount of informative changes to achieve the best performance.

The remainder of this paper is organized as follows. Section 2 provides details on the datasets, subjects, preprocessing of rs-fMRI data, classification algorithms, univariate and MVPA feature reduction techniques, and permutation test used for the validation of the results. Section 3 presents the comparative results, while Section 4 is devoted to the discussion and conclusions of the article.

Materials and methods

We used two independent rs-fMRI datasets: the ADNI2 dataset, which is publicly available online, and an in-house dataset whose subjects were recruited from the Chosun University Hospital in Gwangju, South Korea.

Subjects

ADNI2 cohort.

We used a cohort of 33 (17 females) Alzheimer’s disease (AD) subjects, 31 (14 females) early Mild Cognitive Impairment (MCI) and 31 (17 females) Cognitive Normal (CN) subjects from the ADNI2 database, which is publicly available on the web (www.adni.loni.usc.edu). The mean ages of AD, MCI, and CN are 73.59 ± 5.18, 74.52 ± 5.18, and 74.66 ± 5.56. General criteria for categorizing AD, MCI, and CN are well explained on the ADNI web site (http://adni.loni.ucla.edu). The subjects ranged in age from 56 to 89 years, and functional assessments of AD/MCI patients, such as Mini-Mental State Examination (MMSE) and Clinical Dementia Rating (CDR), were independently performed by the research institutions. The general criteria were as follows: the CN subjects had MMSE scores between 24 and 30, a CDR of 0, and were non-depressed, non-MCI, and non-demented. MCI patients had MMSE scores between 24 and 30, CDR scores between 0 and 1, no significant levels of impairment in other cognitive domains, essentially preserved daily living activities, and absence of dementia. The MMSE scores of AD patients were between 15 and 26, their CDR scores were 0.5 or 1, and they met the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s disease and Related Disorders Association (NINCDS/ADRDA) criteria for probable AD. In this study, to minimize the effect of different image sizes and resolutions, we selected images from subjects with the same image dimension and resolution, and we used only the baseline fMRI scans.

In-house cohort.

A total of 365 subjects were included in the in-house dataset: 81 AD subjects, 132 MCI subjects, and 152 CN subjects. This dataset was part of a large cohort enrolled at the National Dementia Research Center, Chosun University, Gwangju, South Korea. All subjects provided written informed consent before data collection. For AD patients who were unable to give consent, the next of kin gave consent before participation. Psychological tests or assessments were not used to determine whether subjects were able to provide written informed consent. The consent procedure and data acquisition were approved by the Institutional Review Board (IRB) of the Chosun University Hospital, Gwangju, South Korea (IRB number 2013-12-018). Briefly, subjects were between 56 and 87 years of age, and the study partners were able to provide independent functional evaluations. The MMSE and CDR scores and the other clinical criteria for inclusion in the three groups were the same as in the ADNI2 cohort. The demographics of the participants from the two cohorts are shown in Table 1, and subject IDs are provided in supporting S1 Table.

Table 1. Demographic details of all participants of two cohorts in this study.

https://doi.org/10.1371/journal.pone.0212582.t001

Rs-fMRI data acquisition

ADNI2 cohort.

ADNI2 subjects were scanned at different centres using 3.0 T Philips Achieva scanners with the same scanning protocol and parameters: Repetition Time (TR)/Echo Time (TE) = 3000/30 ms, flip angle = 80°, acquisition matrix size = 64 × 64, 48 slices, 140 volumes, and a voxel thickness = 3.3 mm.

In-house cohort.

The participants in the Chosun University Hospital were scanned with a Siemens Skyra 3.0-Tesla scanner. A 2D EPI MR acquisition type was used with the following parameters: TR/TE = 3000/30 ms, flip angle = 90°, field of view (FOV) = 240 × 240 mm, acquisition matrix size = 64 × 64, 35 slices, 90 volumes, voxel size = 3.75 × 3.75 × 3.75 mm, spacing between slices = 4.8 mm, number of echoes = 1, imaging frequency = 123.206 MHz, slice acquisition order = ascending (bottom-up), direction = 'Transverse > Coronal (2.6) > Sagittal (1.7)', pixel bandwidth = 3440, in-plane phase encoding direction = ‘ROW’, number of phase encoding steps = 63, echo train length = 31, percent sampling = 100, percent phase field of view = 100, variable flip angle flag = ‘N’, and specific absorption rate (SAR) = 0.0778.

Preprocessing of rs-fMRI data

Preprocessing of rs-fMRI data was carried out using the Data Processing Assistant for Resting-State fMRI (DPARSF; http://www.restfmri.net) [33] and the Statistical Parametric Mapping platform (SPM8; http://www.fil.ion.ucl.ac.uk/spm). All Digital Imaging and Communications in Medicine (DICOM) files were obtained from the scanners as described above and converted into the Neuroimaging Informatics Technology Initiative (NIfTI) file format. The first 10 time points for each participant were discarded to allow for signal calibration and participants’ adaptation to the scanning noise. Subsequently, the functional images went through the following preprocessing steps: slice-timing correction was performed with the last slice as the reference; realignment for head movement compensation was performed by applying a Friston 24-parameter model (6 head motion parameters, 6 head motion parameters from the previous time point, and the 12 corresponding squared items); individual structural images (T1-weighted MPRAGE) were co-registered to the mean functional image after realignment; normalization of the rs-fMRI images to the original space was performed with the Diffeomorphic Anatomical Registration Through Exponentiated Lie algebra (DARTEL) toolbox [34] (resampling voxel size = 3 × 3 × 3 mm³); and spatial smoothing was performed with a 6-mm full-width at half-maximum (FWHM) Gaussian kernel. Then, linear trend removal and temporal band-pass filtering (0.01 Hz < f < 0.08 Hz) were performed on the time series of each voxel. Finally, we regressed out cerebrospinal fluid and white matter signals as well as six head-motion parameters to further reduce the effects of nuisance signals and focus only on the gray matter signal. A mask image was created according to the intersection of the subject-specific normalized T1 anatomical images. Only the voxels within the mask were further analyzed. The mask image was also used for correcting for multiple comparisons in later analyses.
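
The motion-related nuisance regression mentioned above can be illustrated with a short NumPy sketch. It is not the DPARSF implementation: the function names `friston24` and `regress_out` are hypothetical, the motion estimates are random placeholders, and setting the "previous time point" of the first volume to zero is an assumption.

```python
import numpy as np

def friston24(motion):
    """Expand 6 realignment parameters (T x 6) into 24 regressors (T x 24)."""
    motion = np.asarray(motion, dtype=float)
    # Parameters at the previous time point; the first row is set to zero (assumption).
    prev = np.vstack([np.zeros((1, motion.shape[1])), motion[:-1]])
    # Current, previous, and their squares: 6 + 6 + 12 = 24 columns.
    return np.hstack([motion, prev, motion ** 2, prev ** 2])

def regress_out(ts, regressors):
    """Remove an intercept and nuisance regressors from time series (T x V) by least squares."""
    design = np.column_stack([np.ones(len(regressors)), regressors])
    beta, *_ = np.linalg.lstsq(design, ts, rcond=None)
    return ts - design @ beta

rng = np.random.default_rng(0)
motion = rng.standard_normal((130, 6))                     # placeholder motion estimates
nuisance = friston24(motion)
voxels = rng.standard_normal((130, 5)) + nuisance[:, :1]   # toy data contaminated by motion
cleaned = regress_out(voxels, nuisance)
```

After regression, the cleaned time series are orthogonal to every nuisance regressor, which is the property the preprocessing step relies on.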

Proposed framework

Fig 1 illustrates all the procedures and techniques proposed in this study. The first step of the procedure is to extract the whole-brain 3-D measures from processed rs-fMRI images. The measures are ReHo, fALFF, ALFF, Degree Centrality (DC), left hippocampus-based rsFC (LeftHC-based rsFC), right hippocampus-based rsFC (RightHC-based rsFC), left posterior cingulate cortex-based rsFC (LeftPCC-based rsFC), right posterior cingulate cortex-based rsFC (RightPCC-based rsFC), left precuneus-based rsFC (LeftPCu-based rsFC), and right precuneus-based rsFC (RightPCu-based rsFC). Due to the small size of the datasets, we used leave-one-out (LOO-CV) and 10-fold cross-validation (10-fold CV) for the ADNI2 and the in-house cohort, respectively, to validate the classification performance of the methods. In LOO-CV, one sample was selected as testing data whereas the rest were used for training. In 10-fold CV, 90% of the data were used for training and the remaining 10% for testing. Given the training 3-D spatial maps, we then performed univariate statistical t-tests to obtain a 3-D mask which identified a set of ‘active’ voxels. We then implemented the MVPA techniques (SVM-RFE and LASSO) on the 1-D concatenated training features to select the most relevant features for training the ELM and SVM classifiers. Finally, given the indices of the highest ranked features on the training data, we extracted the corresponding features from the testing data for classification.

Fig 1. Descriptions of the proposed framework in this study.

Block (a) presents the 3-D feature measure extractions from preprocessed fMRI scans. Block (b) describes the LOO-CV and 10-fold-CV cross validation for ADNI2 and in-house cohorts, respectively. Block (c) presents the multivariate feature reduction techniques using LASSO and SVM-RFE. The combined univariate t-test and multivariate LASSO as well as SVM-RFE informative features are trained by ELM and SVM classifiers as illustrated in block (d). Finally, the trained classifiers and testing features are used to evaluate the performance as in block (e).

https://doi.org/10.1371/journal.pone.0212582.g001

Feature extraction

We describe here some biomarkers measured from rs-fMRI using the Resting-State fMRI Data Analysis Toolkit (REST) toolbox [35]. The measures can be categorized into regional spontaneous measures (ReHo, ALFF, fALFF), and functional connectivity measures (DC, seed-based rsFC), as described below.

Regional homogeneity (ReHo).

We used the ReHo measure to explore regional brain activity during the resting state. The computation was performed on a voxel-wise basis by calculating Kendall’s Coefficient of Concordance (KCC) [36] of fMRI time series of a given voxel with those of its nearest neighbours. From all the voxels in the brain, an individual ReHo map was obtained for each subject. A higher regional coherence within a cluster, consisting of a voxel and its nearest neighbours, was represented by a larger ReHo value for the voxel. Several recent studies in literature have shown the potential value of ReHo in clinical applications [9, 10, 37].
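
As a rough illustration of the ReHo computation, the sketch below evaluates Kendall's coefficient of concordance for a voxel and its neighbours. The 27-voxel (3 × 3 × 3) neighbourhood, the function names `kendall_w` and `reho_at`, and the toy data are assumptions made for this example; rank ties are not corrected.

```python
import numpy as np

def kendall_w(ts):
    """Kendall's coefficient of concordance for time series of shape (n_timepoints, n_voxels)."""
    ts = np.asarray(ts, dtype=float)
    n, k = ts.shape
    # Rank every voxel's time series along time (1..n); ties are not corrected here.
    ranks = ts.argsort(axis=0).argsort(axis=0) + 1
    rank_sums = ranks.sum(axis=1)                      # sum of ranks at each time point
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()
    return 12.0 * s / (k ** 2 * (n ** 3 - n))

def reho_at(data, x, y, z):
    """ReHo of voxel (x, y, z) in a 4-D array (X, Y, Z, T), using its 3 x 3 x 3 cube."""
    cube = data[x - 1:x + 2, y - 1:y + 2, z - 1:z + 2, :]
    return kendall_w(cube.reshape(-1, cube.shape[-1]).T)

rng = np.random.default_rng(0)
signal = rng.standard_normal(100)
coherent = np.tile(signal, (3, 3, 3, 1))               # all 27 voxels share one time course
noisy = rng.standard_normal((3, 3, 3, 100))            # 27 independent time courses
w_coherent = reho_at(coherent, 1, 1, 1)
w_noisy = reho_at(noisy, 1, 1, 1)
```

Perfectly synchronized neighbours give a value of 1, whereas independent noise gives a value close to 0, which matches the interpretation of ReHo as local coherence.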

Amplitude of low-frequency fluctuation (ALFF) and fractional ALFF (fALFF).

The regional spontaneous activities can be examined by the ALFF measure and its improved version, the fALFF measure. After preprocessing, the filtered time series was transformed to a frequency domain using a fast Fourier transform (FFT), and the power spectrum was obtained. The average square root of the power spectrum (amplitude) between the frequencies of 0.01 and 0.08 Hz was computed at each voxel to give the ALFF measure [11, 38]. The fALFF measure is a modified version of ALFF, defined as the ratio of the average amplitude in the low-frequency range (0.01–0.08 Hz) to that of the entire frequency range (0–0.25 Hz) [33].
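
The two amplitude measures can be sketched for a single voxel as follows. This is a simplified NumPy illustration rather than the REST toolbox code: the function name `alff_falff` is hypothetical, fALFF is computed here as a ratio of summed amplitudes, and the upper limit of the "entire" frequency range is whatever the sampling rate allows (about 0.167 Hz for the TR of 3 s assumed below).

```python
import numpy as np

def alff_falff(ts, tr, low=0.01, high=0.08):
    """Return (ALFF, fALFF) for one voxel time series sampled every `tr` seconds."""
    ts = np.asarray(ts, dtype=float)
    ts = ts - ts.mean()                                 # drop the zero-frequency component
    freqs = np.fft.rfftfreq(ts.size, d=tr)
    # Amplitude = square root of the power spectrum at each frequency bin.
    amplitude = np.sqrt(np.abs(np.fft.rfft(ts)) ** 2 / ts.size)
    band = (freqs >= low) & (freqs <= high)
    alff = amplitude[band].mean()
    # Ratio of summed in-band amplitude to summed amplitude over all non-zero frequencies.
    falff = amplitude[band].sum() / amplitude[1:].sum()
    return alff, falff

tr = 3.0
t = np.arange(120) * tr                                 # 360 s of toy data
slow = np.sin(2 * np.pi * 0.05 * t)                     # inside the 0.01-0.08 Hz band
fast = np.sin(2 * np.pi * 0.15 * t)                     # outside the band
alff_slow, falff_slow = alff_falff(slow, tr)
alff_fast, falff_fast = alff_falff(fast, tr)
```

A slow oscillation inside the band yields an fALFF near 1, while an oscillation outside the band yields an fALFF near 0.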

Degree centrality (DC).

We used a commonly employed graph-based measure of network organization, degree centrality (DC), to perform a full-brain exploration of the regions that were influenced by AD and MCI. Within the study mask, individual network centrality maps were generated in a voxel-wise fashion. First, the preprocessed functional runs were subjected to voxel-based whole-brain correlation analysis. The time course of each voxel from each participant that was within the gray matter mask was correlated with the time course of every other voxel, to obtain a correlation matrix. An undirected adjacency matrix was then obtained by thresholding the correlation at r > 0.25 [39, 40], and the DC was computed as the sum of the weights of the significant weighted connections for each voxel. Finally, the individual-level voxel-wise DC was converted into a z-score map by subtracting the mean DC across the entire brain and dividing by the standard deviation of the whole-brain DC.
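
The weighted degree centrality described above can be sketched on a handful of toy "voxels". The function name `degree_centrality` and the synthetic sinusoidal time courses are assumptions for illustration; a whole-brain analysis would process the correlation matrix in blocks to limit memory use.

```python
import numpy as np

def degree_centrality(ts, r_thresh=0.25):
    """Weighted degree centrality (z-scored) for time series of shape (n_timepoints, n_voxels)."""
    r = np.corrcoef(ts, rowvar=False)                   # voxel-by-voxel correlation matrix
    np.fill_diagonal(r, 0.0)                            # ignore self-connections
    weights = np.where(r > r_thresh, r, 0.0)            # keep supra-threshold connections only
    dc = weights.sum(axis=1)
    return (dc - dc.mean()) / dc.std()                  # z-score across voxels

t = np.arange(100)

def wave(period):
    return np.sin(2 * np.pi * t / period)

base = wave(20)
ts = np.column_stack([
    base, base + 0.1 * wave(10), base + 0.1 * wave(5),  # three tightly coupled voxels
    wave(10), wave(5), wave(4),                          # three weakly coupled voxels
])
dc_z = degree_centrality(ts)
```

The three coupled voxels receive positive z-scores and the remaining voxels negative ones, since none of their correlations exceed the threshold.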

Seed-based resting-state functional connectivity (rsFC).

To examine the detailed rsFC differences among the AD, MCI and CN groups at the regional level, we performed seed-based rsFC analysis. Briefly, the mean time course within each seed was extracted by averaging the time courses of all the voxels belonging to the seed. Subsequently, the mean time course was used to compute the correlation coefficients with the time courses of all voxels. The resulting correlation coefficients were then converted to z-scores using Fisher’s r-to-z transform to improve normality [16, 41]. In this study, we selected bilateral PCC, bilateral Hippocampus, and bilateral Precuneus as the seeds. Table 2 provides detailed information about the seeds.
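
A minimal sketch of the seed-based analysis, assuming the data have already been arranged as a time-by-voxel matrix, is given below. The function name `seed_fc_z`, the clipping constant `eps`, and the toy signals are illustrative choices, not part of the original pipeline.

```python
import numpy as np

def seed_fc_z(ts, seed_idx, eps=1e-6):
    """Fisher z-transformed correlation between a seed's mean time course and every voxel.

    ts has shape (n_timepoints, n_voxels); seed_idx lists the voxel columns in the seed.
    """
    ts = np.asarray(ts, dtype=float)
    seed = ts[:, seed_idx].mean(axis=1)                 # mean time course within the seed
    ts_c = ts - ts.mean(axis=0)
    seed_c = seed - seed.mean()
    r = ts_c.T @ seed_c / (np.linalg.norm(ts_c, axis=0) * np.linalg.norm(seed_c))
    r = np.clip(r, -1 + eps, 1 - eps)                   # keep arctanh finite when |r| = 1
    return np.arctanh(r)                                # Fisher r-to-z transform

t = np.arange(100)
a = np.sin(2 * np.pi * t / 20)
b = np.sin(2 * np.pi * t / 10)                          # uncorrelated with a over this window
ts = np.column_stack([a, a + b, -a, b])
z = seed_fc_z(ts, seed_idx=[0])
```

In this toy case the second voxel has a correlation of 1/√2 with the seed, the third is perfectly anti-correlated, and the fourth is uncorrelated.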

Table 2. Detailed information of the seeds for seed-based rsFC measures.

https://doi.org/10.1371/journal.pone.0212582.t002

Feature concatenation.

Combining multiple measures is a very effective approach for boosting the performance of a machine learning setup [42], which has been used in many research domains, including neuroimaging classification [43]. In this work, we investigated a common feature concatenation that linked many feature measures of the same dataset. We believe that feature concatenation will enhance accuracy and enable the inference of indirect or direct associations between multiple features extracted from the same fMRI data.

Feature reduction techniques

The number of predictor voxels obtained in our spatial maps was larger than the number of subjects. Thus, a dimensionality reduction process was necessary in order to select the most relevant features, discard redundant features and noise, and avoid numerical singularities and overfitting problems, and thus enhance the classification performance. Importantly, feature reduction was performed using the training data only. Once identified, the same brain regions identified during training were used to assess the classifier predictive accuracy [44] on the testing data. In this study, we used univariate t-test and MVPA approaches, including SVM-RFE and LASSO, as voxel-wise feature reduction techniques. The univariate t-test is performed voxel-wise to identify independent voxels, whereas the multivariate RFE and LASSO investigate the mutual associations between multiple features and spatial patterns. We also used hybrid combinations of univariate and MVPA approaches to outperform the individual techniques.

Univariate two sample t-test.

Many neuroimaging studies have shown abnormalities, at the level of the average signal, in one or more brain features in a diseased group compared to a control group using univariate statistical tests [19]. Recently, classification studies have used t-tests to select informative features for machine learning in neuroimaging [8, 45]. The key results of an analysis based on statistical tests are usually expressed by means of p-values. The optimal p-value cutoff for selecting the relevant features is then determined through a cross-validation process, and the features thus selected are used in the subsequent machine learning analysis. In this study, we applied t-test-based feature reduction techniques to machine-learning-based diagnosis. Using t-tests on the training dataset, we generated an analytical mask that retained only the voxels presenting significant changes in any of the analytical feature measures, i.e., ReHo, ALFF, fALFF, DC, and rsFC, between any two groups at the threshold p-values (p<0.05 with |t|>1.9715, p<0.01 with |t|>2.599, and p<0.001 with |t|>3.3381). The cluster size threshold yielding a corrected p = 0.05 for each individual voxel p-value was computed by Monte Carlo simulations with the program AlphaSim in REST [35] (1000 iterations). As a result, cluster sizes of 85 voxels (2295 mm³), 18 voxels (486 mm³), and 6 voxels (162 mm³) were found for individual voxel p-values of 0.05, 0.01, and 0.001, respectively. Fig 2 shows selected regions resulting from the univariate t-test applied to ReHo maps of one fold of training data, i.e., AD vs. CN and MCI vs. CN (out of >62 different folds for the ADNI2 cohort and 10 folds for the in-house cohort).
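
The voxel-wise thresholding step can be sketched as below. This is only an illustration under simplifying assumptions: it uses a pooled-variance two-sample t statistic, omits the AlphaSim cluster-size correction, and the function name `ttest_mask` and the tiny data matrices are made up for the example. The |t| cutoffs quoted above depend on the sample sizes of the study, so the value used here is just a placeholder.

```python
import numpy as np

def ttest_mask(group_a, group_b, t_thresh):
    """Voxel-wise pooled-variance two-sample t statistic and the mask |t| > t_thresh.

    group_a and group_b have shape (n_subjects, n_voxels) and come from training data only.
    """
    a, b = np.asarray(group_a, float), np.asarray(group_b, float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(axis=0, ddof=1)
                  + (nb - 1) * b.var(axis=0, ddof=1)) / (na + nb - 2)
    t = (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(pooled_var * (1 / na + 1 / nb))
    return np.abs(t) > t_thresh, t

# Two toy "voxels": the first differs strongly between groups, the second does not.
patients = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]])
controls = np.array([[11.0, 1.5], [12.0, 2.5], [13.0, 2.0], [14.0, 4.0]])
mask, t = ttest_mask(patients, controls, t_thresh=2.599)
```

Only the first voxel survives the threshold, so only that column would be passed on to the multivariate feature reduction stage.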

Fig 2.

An example of one-fold univariate statistical two-sample t-test on ReHo maps between two training analytical groups, i.e., AD against CN (left subfigure) and MCI against CN (right subfigure). The threshold was set to p-value<0.05 with cluster size of 85 voxels (2295 mm³), which corresponded to a corrected p-value<0.05. The t-test maps are overlaid on the anatomical image. The hot and cold colours represent positive and negative changes.

https://doi.org/10.1371/journal.pone.0212582.g002

Support vector machine-recursive feature elimination (SVM-RFE).

While the t-test is a univariate procedure that does not take into account interactions between multiple features and spatial patterns, support vector machine-recursive feature elimination (SVM-RFE) is a multivariate wrapper-model-based feature reduction algorithm, which efficiently fits a model and removes the weakest features until the specified informative number of features is reached. The ranking criterion of SVM-RFE is closely related to the SVM model. In each iteration of the RFE, an SVM model is trained. Then, the feature with smallest ranking criterion is removed since it has the least effect on classification, while the remaining features are kept for the SVM model in the next iteration. The sequential process is repeated until all the features have been eliminated. Then, according to the order of elimination, the features are graded. The later a feature is eliminated, the more significant it should be [46]. A detailed description of the SVM-RFE algorithm can be found in a previous paper [20]. In this work, after the application of SVM-RFE, the most important training features that maximize cross-validated accuracy were kept for training the classifiers. Fig 3 illustrates the process of hybrid combination of univariate t-test and multivariate SVM-RFE as well as LASSO to select the most relevant features.
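
The elimination loop described above can be sketched with scikit-learn's linear `SVC`. The helper name `svm_rfe_ranking` and the synthetic data are assumptions for illustration; the squared weight is used as the ranking criterion, following the usual SVM-RFE formulation for linear kernels, and one feature is removed per iteration.

```python
import numpy as np
from sklearn.svm import SVC

def svm_rfe_ranking(X, y, C=1.0):
    """Rank features by recursive elimination; the returned list starts with the most important."""
    remaining = list(range(X.shape[1]))
    eliminated = []
    while remaining:
        svm = SVC(kernel="linear", C=C).fit(X[:, remaining], y)
        scores = svm.coef_.ravel() ** 2                 # ranking criterion: squared weights
        worst = int(np.argmin(scores))                  # feature with the smallest criterion
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]                             # last eliminated = most important

rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.standard_normal((40, 6))
X[:, 0] += 4.0 * (2 * y - 1)                            # only feature 0 carries class information
ranking = svm_rfe_ranking(X, y)
```

With thousands of voxels, removing one feature per iteration becomes slow; scikit-learn's `RFE` class offers a `step` argument that removes several features at a time, which may be a practical alternative.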

Fig 3. Illustration of the hybrid combination of univariate t-test and MVPA feature reduction techniques (SVM-RFE and LASSO) on the 3-D cross-validated fMRI measures.

https://doi.org/10.1371/journal.pone.0212582.g003

Least absolute shrinkage and selection operator (LASSO).

A good example of MVPA feature reduction with error and regularization terms is LASSO, which has been successfully applied in neuroimaging machine learning tasks to mitigate problems related to the so-called curse of dimensionality. LASSO computes the model coefficients γj by minimizing the following function:

$$\min_{\gamma_0,\,\gamma}\ \left( \frac{1}{2n} \sum_{i=1}^{n} \left( u_i - \gamma_0 - x_i^{T}\gamma \right)^2 + \lambda \sum_{j=1}^{q} \left| \gamma_j \right| \right)$$

where xi is the voxel-wise feature input data, a vector of q values at observation i, and n is the number of observations. ui is the response at observation i. Lambda (λ) is a non-negative user-defined regularization parameter which controls the balance between limiting the number of non-zero coefficients γj (sparsity) and high prediction accuracy. As λ increases, the model becomes increasingly sparse, meaning it retains fewer features, while as λ approaches 0, the model becomes less sparse and includes more features [5]. The parameter γ0 is a scalar intercept. The function minimized by LASSO involves the l1 norm of γ [47–49]. In this paper, we chose the value of λ that minimized the cross-validated mean squared error (MSE), as shown in Fig 4. The hybrid combination of univariate t-test and multivariate LASSO for selecting the most discriminative training features is shown in Fig 3.
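
To make the role of λ concrete, the sketch below minimizes a LASSO objective of the form (1/2n) Σ (ui − γ0 − xiᵀγ)² + λ Σ |γj| by cyclic coordinate descent. The solver, the function names `soft_threshold` and `lasso_cd`, the fixed λ values, and the synthetic data are all assumptions for illustration; in practice λ would be chosen on a grid by cross-validated MSE.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink z towards zero by lam and clip at zero."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_cd(X, u, lam, n_iter=200):
    """Coordinate descent for (1/2n) * sum (u - g0 - X g)^2 + lam * sum |g_j|."""
    n, q = X.shape
    x_mean, u_mean = X.mean(axis=0), u.mean()
    Xc, uc = X - x_mean, u - u_mean                     # centring handles the intercept
    gamma = np.zeros(q)
    col_sq = (Xc ** 2).sum(axis=0) / n
    resid = uc.copy()
    for _ in range(n_iter):
        for j in range(q):
            resid += Xc[:, j] * gamma[j]                # partial residual without feature j
            rho = Xc[:, j] @ resid / n
            gamma[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= Xc[:, j] * gamma[j]
    gamma0 = u_mean - x_mean @ gamma
    return gamma0, gamma

rng = np.random.default_rng(0)
X = rng.standard_normal((80, 8))
u = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.standard_normal(80)
_, gamma = lasso_cd(X, u, lam=0.5)                      # moderate penalty
_, gamma_big = lasso_cd(X, u, lam=10.0)                 # very strong penalty
selected = np.flatnonzero(gamma)
```

With the moderate penalty only the two informative features keep non-zero coefficients, whereas the very strong penalty sets every coefficient to zero.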

Fig 4. An example of cross-validated MSE of LASSO fit with a parameter lambda (λ).

https://doi.org/10.1371/journal.pone.0212582.g004

Classification

In this study, three machine learning classification algorithms were used, namely ELM, linear SVM, and non-linear SVM. We compared the results of all the classifiers, and ELM proved to be the most efficient algorithm in terms of both computation time and accuracy. A brief description of each method follows.

ELM classifier.

An ELM consists of an input layer, a hidden layer, and an output layer. Whereas traditional feedforward neural networks require the weights and biases of all layers to be adjusted by gradient-based learning algorithms, ELM assigns the input weights and hidden-layer biases randomly, without iterative adjustment, and computes the output weights by solving a single linear system [23]. Thus, ELM learns much faster than traditional neural networks and is widely employed in various classification applications as an efficient learning algorithm [24]. In this work, the number of hidden nodes was varied between 1 and 400, and we selected a sigmoid activation function. A grid search on the training data was used to tune the number of hidden nodes so as to maximize the cross-validated accuracy. To minimize the random effects due to the weight initializations, each candidate number of hidden nodes was evaluated 100 times and the average performance was reported.
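The training procedure described above can be sketched in a few lines. This is an illustrative sketch only: it assumes NumPy is available, uses the Moore-Penrose pseudo-inverse to solve for the output weights, and runs on hypothetical toy data. The helper names (`elm_train`, `elm_predict`) are not from the authors' code.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def elm_train(X, y, n_hidden, rng):
    """Random, fixed hidden layer; only the output weights are solved for."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # input weights, never updated
    b = rng.standard_normal(n_hidden)                # hidden biases, never updated
    H = sigmoid(X @ W + b)                           # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ y                     # least-squares output weights
    return W, b, beta


def elm_predict(X, W, b, beta):
    return (sigmoid(X @ W + b) @ beta > 0.5).astype(int)


# Toy two-class data (hypothetical), averaged over repeated random initializations.
rng = np.random.default_rng(0)
y = np.array([0, 1] * 20)
X = rng.standard_normal((40, 2)) + 3.0 * (2 * y[:, None] - 1)
accs = [np.mean(elm_predict(X, *elm_train(X, y, 20, rng)) == y) for _ in range(100)]
print("mean training accuracy:", np.mean(accs))
```

Averaging over repeated initializations mirrors the 100 repetitions per hidden-node setting described above. A real evaluation would score held-out folds, not the training data.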

SVM classifier.

Support vector machines (SVMs) have recently become popular as supervised classifiers of fMRI data due to their high performance, their ability to deal with large high-dimensional datasets, and their flexibility in modeling diverse sources of data [4]. In the present study, we utilized a linear SVM and a non-linear SVM based on radial basis function (RBF) kernels. In SVMs, the parameters that need to be tuned are the kernel scale (γ) and the box constraint (C). We used a greedy search method on the training data to tune these parameters to maximize the cross-validated accuracy. The search scales for the kernel scale and the box constraint were set to γ = [0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000] and C = [0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000], respectively.
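As a rough illustration of tuning over the two listed scales, the sketch below simply evaluates every (γ, C) pair. This exhaustive enumeration is an assumption, since the text describes the procedure as a greedy search and the authors' exact strategy may differ. The `cv_score` callable is hypothetical and stands in for a routine that returns the cross-validated accuracy of an SVM on the training data.

```python
from itertools import product
from math import log10

GAMMAS = [0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000]
CS = [0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000]


def exhaustive_search(cv_score, gammas=GAMMAS, cs=CS):
    """Return (best_score, gamma, C) over all 64 pairs of the two scales."""
    return max((cv_score(g, c), g, c) for g, c in product(gammas, cs))


# Hypothetical stand-in score that peaks at gamma = 10 and C = 0.01.
def toy_score(g, c):
    return -((log10(g) - 1) ** 2) - (log10(c) + 2) ** 2


print(exhaustive_search(toy_score))
```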

Cross-validation, performance evaluation, and significance testing methods

Cross-validation.

In this work, we used Leave-One-Out cross-validation (LOO-CV) for the ADNI2 cohort and 10-fold cross-validation (CV) for the in-house cohort. In the LOO-CV, N-1 subjects out of N were used for training, and the remaining one was left for testing, and the procedure was repeated for all the N subjects. In 10-fold CV, the subjects were randomly divided into 10 equally sized subsets: each of these subsets (folds), containing 10% of the subjects, was then used as testing set for a model trained on the remaining 90%. The mean performance of all N test subjects in LOO-CV, or all the 10 folds in 10-fold CV was reported as the final result.
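The two splitting schemes can be written down directly. The sketch below generates index splits only. The function names are hypothetical, and when the number of subjects is not divisible by 10 the fold sizes differ by at most one, which is an assumption about how "equally sized" is handled.

```python
import random


def loo_splits(n):
    """Yield (train_idx, test_idx) with exactly one held-out subject per fold."""
    for i in range(n):
        yield [j for j in range(n) if j != i], [i]


def kfold_splits(n, k=10, seed=0):
    """Yield (train_idx, test_idx) for k folds after a random shuffle."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]  # fold sizes differ by at most one
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, folds[f]


for train, test in kfold_splits(25, k=10):
    print(len(train), len(test))
```

Any feature selection or parameter tuning must use the training indices of each split only, so that the held-out subjects never influence the selected features.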

Performance evaluation.

To evaluate the performance of the classifiers, we reported accuracy (ACC), sensitivity (SEN), specificity (SPEC), balanced accuracy (BAC), positive predictive value (PPV), and negative predictive value (NPV). TP, TN, FP, and FN indicate the number of true positives, true negatives, false positives, and false negatives, respectively. In terms of these numbers, ACC, SEN, SPEC, BAC, PPV, and NPV can be computed as follows:

  1. Accuracy (ACC) = (TP + TN)/(TP + TN + FP + FN)
  2. Sensitivity (SEN) = TP/(TP + FN)
  3. Specificity (SPEC) = TN/(TN + FP)
  4. Balanced accuracy (BAC) = (SEN + SPEC)/2
  5. Positive predictive value (PPV) = TP/(TP + FP)
  6. Negative predictive value (NPV) = TN/(TN + FN)
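The six definitions above translate directly into code. The confusion counts in the example are hypothetical and are not results from this study, and the function name is illustrative.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Compute the six measures from confusion counts (assumes no empty denominators)."""
    sen = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "SEN": sen,
        "SPEC": spec,
        "BAC": (sen + spec) / 2,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }


# Hypothetical counts for illustration only.
for name, value in diagnostic_metrics(tp=28, tn=27, fp=4, fn=5).items():
    print(f"{name}: {value:.4f}")
```

Note that when these measures are averaged over folds, the mean BAC need not equal the average of the mean SEN and mean SPEC.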

Significance testing methods.

To assess the statistical significance of the classifiers’ performance, a permutation test was performed on the classification accuracies: the labels of the test data in each of the N (LOO-CV) or 10 (10-fold CV) folds were randomly permuted 1000 times to estimate the probability of a successful classification arising by chance. In general, the lower the p-value of the permuted prediction rate against the prediction rate of the original data labels, the higher the significance of the classifier performance.
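A minimal sketch of such a label-permutation test is given below. It makes two assumptions that the text does not specify: predictions are pooled across folds before permuting, and the p-value uses the common (count + 1)/(n_perm + 1) correction. The function names are hypothetical.

```python
import random


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_p_value(y_true, y_pred, n_perm=1000, seed=0):
    """Fraction of label permutations whose accuracy reaches the observed one."""
    rng = random.Random(seed)
    observed = accuracy(y_true, y_pred)
    shuffled = list(y_true)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between labels and predictions
        if accuracy(shuffled, y_pred) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)  # assumed correction, avoids p = 0


y_true = [0, 1] * 10
print(permutation_p_value(y_true, y_pred=list(y_true)))  # perfect predictions
print(permutation_p_value(y_true, y_pred=[0] * 20))      # uninformative predictions
```

With 1000 permutations the smallest attainable p-value under this correction is 1/1001, which is consistent with reporting results as p-value<0.001.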

Results

Classification results: Univariate t-test

ADNI2 cohort.

Tables 3 and 4 summarize the classification performance in discriminating between AD and CN, and between MCI and CN, respectively, of all the competing methods on the ADNI2 cohort. In terms of the mean diagnosis accuracy, the ELM classifier with concatenated features obtained a maximal accuracy of 89.92% (p-value<0.001) with a sensitivity of 86.51%, specificity of 84.17%, balanced accuracy of 84.58%, PPV of 94.00%, and NPV of 87.40% when discriminating between AD and CN; and a maximal accuracy of 85.81% (p-value<0.001) with a sensitivity of 86.67%, specificity of 85.83%, balanced accuracy of 84.85%, PPV of 86.50%, and NPV of 90.00% when discriminating between MCI and CN. The concatenated measure outperformed all individual measures. In addition, the ELM outperformed the linear and RBF-based non-linear SVMs in terms of diagnosis accuracy for all measures, including the concatenated ones.

Table 3. Leave-One-Out cross-validation mean classification performance for AD versus CN of multi-measure features at p-value = 0.05 with ADNI2 cohort.

https://doi.org/10.1371/journal.pone.0212582.t003

Table 4. Leave-One-Out cross-validation mean classification performance for MCI against CN of multi-functional features at p-value = 0.05 with ADNI2 cohort.

https://doi.org/10.1371/journal.pone.0212582.t004

In-house cohort.

The experimental results on the in-house dataset are summarized in Table 5 (AD against CN) and Table 6 (MCI against CN). Our proposed method with the ELM classifier achieved very high mean accuracies for all types of measures (above 90% mean accuracy for AD against CN; around 80% mean accuracy for MCI against CN). Note that, in AD vs. CN, the concatenation of all measures resulted in a maximal mean accuracy of 94.45% (p-value<0.001) with a sensitivity of 83.67%, a specificity of 96.67%, a balanced accuracy of 90.17%, a PPV of 95.67%, and an NPV of 91.07%; in MCI vs. CN, the maximal mean accuracy was 87.20% (p-value<0.001), with a sensitivity of 78.85%, a specificity of 87.50%, a balanced accuracy of 81.69%, a PPV of 84.66%, and an NPV of 81.27%. Again, the performance of the combined measures was superior to that of the individual measures. In addition, the mean accuracy of the ELM classifier was superior to that of the linear and non-linear SVMs, as can be seen in Tables 5 (AD vs. CN) and 6 (MCI vs. CN).

Table 5. 10-fold cross-validation mean classification performance for AD against CN of multi-functional features at p-value = 0.05 with In-house cohort.

https://doi.org/10.1371/journal.pone.0212582.t005

Table 6. 10-fold cross-validation mean classification performance for MCI against CN of multi-functional features at p-value = 0.05 with In-house cohort.

https://doi.org/10.1371/journal.pone.0212582.t006

Classification results: Group differences and classifications

To date, there are no guidelines available for the optimal user-defined threshold of significance (p-value) to select the relevant features to be used in machine learning for the differentiation of AD and MCI vs. CN [5, 44]. To investigate the effects of univariate statistical p-values, we show in Table 7 the ELM classification performance at different p-values (p = 0.05, 0.01, and 0.001). Interestingly, the best performance was found with the least stringent threshold (p-value = 0.05) for both datasets and both classification problems (AD vs. CN and MCI vs. CN). Specifically, in the ADNI2 cohort, the maximal mean accuracy in AD vs. CN classification was 89.92% (p-value<0.001), with a sensitivity of 86.51% and a specificity of 84.17%; while in classifying MCI vs. CN, the maximal accuracy was 85.81% (p-value<0.001), with a sensitivity of 86.67% and a specificity of 85.83%. For the in-house cohort, we achieved, in AD vs. CN classification, a maximal accuracy of 94.45% (p-value<0.001), with a sensitivity of 83.67% and a specificity of 96.67%; while for the MCI vs. CN classification the maximal accuracy was 87.20% (p-value<0.001), with a sensitivity of 78.85% and a specificity of 87.50%. Therefore, we can conclude that a highly significant group difference (p-value = 0.01, 0.001) does not necessarily result in a stronger classification performance, and, conversely, that a high classification performance does not necessarily mean that strong differences exist between the means of the groups.

Table 7. The effects of significant p-values on the classification performances reported with ADNI2 and in-house cohorts.

https://doi.org/10.1371/journal.pone.0212582.t007

Classification results: Hybrid combination of MVPA methods

In the previous section, we reported the results using only univariate t-tests, not combined with MVPA methods, for discriminating AD and MCI from CN. In this section, we examine the hybrid combinations of t-tests and multivariate techniques, including LASSO and SVM-RFE. Table 8 presents the performance in AD and MCI discrimination using the ELM classifier with only the univariate t-test (on concatenated features), and its combination with LASSO or SVM-RFE. The results show that the ELM classifier combined with the hybrid feature optimization framework outperformed the same classifier without feature optimization, in both cohorts and in both AD and MCI discrimination (accuracies up to 98.86% for AD and 98.57% for MCI diagnosis in the ADNI2 cohort; up to 98.70% for AD and 94.16% for MCI diagnosis in the in-house cohort). In addition, the ELM performance with combined univariate t-test and SVM-RFE is clearly superior to that of combined univariate t-test and LASSO. Interestingly, the hybrid combinations of univariate t-test with different threshold p-values and SVM-RFE resulted in similar accuracies. These similar performances can be explained as follows: we chose the highest-ranked features using a grid-search cross-validation method on the training data only, and SVM-RFE eliminated the remaining low-ranked features. Even with different p-value thresholds, the number of highest-ranked features retained was the same for the classifiers, which resulted in equal performance.

Table 8. The effects of multivariate feature optimization methods (LASSO and SVM-RFE) on the ELM classification performances reported with ADNI2 and in-house cohorts.

https://doi.org/10.1371/journal.pone.0212582.t008

Discussion

Comparison with previous studies

In recent years, many studies have been carried out to classify AD/MCI subjects using rs-fMRI. Studies based on binary classification reported accuracies from about 75% to about 95% [18, 19]. Table 9 summarizes the results of recently published studies using rs-fMRI neuroimaging-based machine learning to discriminate AD and MCI from CN and compares them with our results. It should be noted that our method outperformed the ones proposed in [26, 50–52], which used the same MCI and CN subject selection from the ADNI2 cohort. Direct performance comparison with other studies would not be fair, because of the different datasets, preprocessing pipelines, feature measures, and classifiers. Nevertheless, it is noteworthy that the method we propose achieved the highest accuracy among all the methods listed for the classification of AD and MCI vs. CN using only rs-fMRI data.

Table 9. Comparison of classification accuracy of AD/MCI subjects with state-of-the-art methods using rs-fMRI.

https://doi.org/10.1371/journal.pone.0212582.t009

Feature selection techniques on ADNI cohort

In recent years, MVPA feature selection methods have been widely applied to neuroimaging data sets from the public ADNI cohorts. In Table 10, we summarize the results of previous works that applied univariate and MVPA methods, as well as their hybrid combinations, for discriminating AD and MCI patients. In a recent study [60], Kim et al. proposed a multi-modal hierarchical ELM integrated with t-test and LASSO, applied to ROI-based features, for the classification of AD and MCI against CN. Volume and mean intensity extracted from 93 ROIs of preprocessed MRI and FDG-PET images, respectively, as well as CSF values, were used as features. The maximal accuracies achieved by the t-test method were 96.11% and 86.15%, while the LASSO-based method achieved 96.03% and 86.17% for AD and MCI vs. CN, respectively. A similar AD/MCI identification framework [61] used a multiple-kernel SVM method to combine the biomarkers of three modalities (MRI, FDG-PET, and CSF). Simple feature selection based on a t-test was implemented, leading to the highest classification accuracies of 93.2% and 76.4% for AD and MCI diagnosis, respectively, compared to using all features. Another study [62] used LASSO-based feature selection on GM and WM volume maps to achieve maximal accuracies of 85.7% and 81.1%. Hidalgo-Muñoz et al. [63] compared voxel-wise feature selection methods, i.e., the univariate t-test and multivariate SVM-RFE, for the classification of AD patients from CN using segmented GM and WM maps. Their results suggested that SVM-RFE selects discriminant features more efficiently than t-test significance for classification purposes (99.7% vs. 93.2%). Using rs-fMRI data, Khazaee et al. [56] computed functional brain network-based features and used the univariate Fisher score for feature selection and SVM as the classifier for AD classification, achieving a maximum accuracy of 97%.

More recently, other MVPA techniques, such as principal component analysis (PCA) and independent component analysis (ICA), have been applied to keep informative features while disregarding uninformative sources of noise. Salvatore et al. used a PCA method to reduce the dimensions of WM and GM density maps [64]. The reduced density maps were fed to SVM classifiers to identify AD (accuracy = 76%) and MCI (accuracy = 72%) patients from CN. Similar predictive improvements due to a single MVPA feature selection method or to hybrid combinations were obtained in unimodal rs-fMRI [26, 27], sMRI [60, 63, 65], and PET [60, 66] studies, and in multi-modal sMRI+PET studies [60, 66].

Table 10. Comparison of classification performances of AD/MCI patients on ADNI cohort with hybrid MVPA feature selections.

https://doi.org/10.1371/journal.pone.0212582.t010

Hybrid combinations of feature selection methods have been shown to diagnose AD and MCI successfully. In studies [26, 27], Wee et al. combined two filter-based methods (t-test and minimum redundancy and maximum relevance, mRMR) and the wrapper-based SVM-RFE method to select the most discriminative functional connectivity features extracted from rs-fMRI images. They reported maximum accuracies of 92.35% and 84% for the identification of AD and MCI patients from healthy controls. In other studies [67, 68], a new hybrid voxel-wise feature selection approach that combines the t-test with a Fisher criterion-based genetic algorithm was proposed to predict AD patients from CNs using segmented GM images. They reported that the hybrid method’s performance (accuracy: 93.01%) is superior to that of the PCA-based feature selection method (88.70%) and of no feature selection (accuracy: 87.63%). In addition, combinations of PCA with LDA and FDR (Fisher discriminant ratio) as feature selection methods outperformed the whole-brain voxel-wise approach, achieving AD classification accuracies of up to 96.7% and 89.5% for PET and SPECT images, respectively [69].

By contrast, some studies have reported that feature selection without utilizing prior knowledge did not increase classification accuracy. Chu et al. [70] compared four common feature selection methods: 1) pre-selected ROIs based on prior knowledge, 2) univariate t-test, 3) RFE, and 4) t-test constrained by ROIs, applied to segmented GM maps from T1 MRI scans of three groups (AD, MCI, CN). Surprisingly, the results showed that: 1) the predictive accuracies with either the univariate t-test or RFE were no better than those achieved using the whole-brain data, and 2) the hybrid method (t-test + ROI), which used the ROIs as a spatial constraint and the t-test to rank the features, did show significant improvements in classification accuracy for AD vs. CN and MCI vs. CN. Similarly, voxel-wise hybrid combinations of t-test and SVM-RFE applied to whole-brain GM maps did not significantly improve the AD and MCI diagnosis performance as compared to the whole-GM approach [65].

Hybrid combinations of feature selection methods have also been used for AD and MCI classification in cohorts other than the standard ADNI data sets. For example, Jie et al. [28] combined the t-test and RFE to select the most discriminative topological features extracted from fMRI scans for discriminating MCI from CN subjects. They reported a maximal accuracy of 91.9%. Another study [31] utilized a hybrid feature selection approach that combines three filter-based and two wrapper-based methods, and compared the performance of six different combinations of them. They reported a best accuracy of 90.4% using the proposed hybrid approach with an SVM classifier in LOO-CV for the diagnosis of AD patients taken from the Open Access Series of Imaging Studies (OASIS) database (http://www.oasis-brains.org/).

The benefits of MVPA feature reduction methods

It is known that the performance of pattern recognition methods such as SVM and ELM decreases as the number of non-informative features increases [19]. Machine learning techniques take advantage of the multivariate nature of the fMRI data and are able to identify maximally discriminative spatial patterns [58]. In the present work, we have examined and assessed an approach for fMRI pattern discrimination analysis based on ELM and hybrid combinations of multi-voxel methods, including univariate and MVPA feature reduction. Our results show that the conventional univariate t-test, when used alone, can be combined with a classifier to identify AD/MCI patients. In addition, as shown in Table 7, a very low p-value cut-off does not guarantee a strongly informative feature, while a larger p-value does not necessarily indicate an irrelevant feature. Thus, discarding voxels based only on the results of statistical tests sensitive to group means could lead to a loss of discriminative ability. Therefore, additional MVPA methods should be used in combination with the univariate group-level t-test.

We also demonstrated that the hybrid combination of multi-voxel methods (t-test + SVM-RFE and t-test + LASSO) increases the discriminative power of the patterns (Table 8). In our studies, we searched for the most relevant discriminative patterns using SVM-RFE, which iteratively eliminates the lowest-ranked patterns based on multivariate information classified by RBF-based SVM; and LASSO, which chooses the sparse features that contribute the most to the accuracy of the model during training. It is worth noting that because of the lesser sensitivity of the univariate method, the wisest setting for combining univariate and multivariate is to use larger p-value thresholds (thus preventing the exclusion of potentially relevant voxels), and then remove irrelevant voxels based on multivariate ranking functions.
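The two-stage idea, a permissive univariate filter followed by multivariate recursive elimination, can be sketched as follows. This is not the authors' pipeline. It assumes NumPy, replaces the p-value cut-off with a threshold on the absolute Welch t-statistic (the hypothetical `t_min`), and ranks features with an ordinary least-squares linear model where the paper uses an RBF-based SVM. The data are synthetic.

```python
import numpy as np


def t_filter(X, y, t_min):
    """Stage 1: keep features whose two-sample |t| statistic is at least t_min."""
    a, b = X[y == 1], X[y == 0]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    t = (a.mean(axis=0) - b.mean(axis=0)) / se
    return np.flatnonzero(np.abs(t) >= t_min)


def rfe(X, y, keep, n_keep):
    """Stage 2: repeatedly refit and drop the feature with the smallest |weight|."""
    keep = list(keep)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize so weights are comparable
    target = 2.0 * y - 1.0                     # labels recoded to -1 / +1
    while len(keep) > n_keep:
        w, *_ = np.linalg.lstsq(Xs[:, keep], target, rcond=None)
        keep.pop(int(np.argmin(np.abs(w))))
    return keep


# Synthetic data (hypothetical): only features 0 and 1 carry group information.
rng = np.random.default_rng(0)
y = np.array([0, 1] * 30)
X = rng.standard_normal((60, 30))
X[:, :2] += 3.0 * y[:, None]

survivors = t_filter(X, y, t_min=2.0)      # permissive filter
selected = rfe(X, y, survivors, n_keep=2)  # multivariate pruning
print(len(survivors), sorted(int(i) for i in selected))
```

A permissive first-stage threshold lets a few uninformative features through, and the multivariate ranking is then responsible for removing them. Both stages would be run on the training folds only.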

Clinical significance of the results

The regions showing significant changes in a univariate t-test play an important role in achieving highly accurate differential diagnosis when used in combination with MVPA feature reduction methods. The following discussion of the significant regions may have clinical relevance.

We showed that the highest discrimination patterns were achieved when all information from regional coherence and functional connectivity measures were combined. This may imply that different parts of the brain undergo different functional failures as a consequence of AD/MCI. Therefore, classification methods should include the maximum amount of informative change to achieve optimal discrimination.

One important finding of the current study is that the significant regional features depend on the dataset. Our binary classification results between folds indicated that the significant features change when the cross-validation subgroups of AD and MCI subjects are changed. Therefore, no specific regional feature can be labeled as a global biomarker for AD or MCI diagnosis. For instance, Figs 5 and 6 present an example of the statistical group-level differences between AD and CN, and between MCI and CN, for all measures of a CV fold. Regions with significant changes were mostly located in the DMN (mainly involving the prefrontal cortex, the PCu, and the PCC).

Fig 5. Univariate t-statistical difference maps between AD and CN groups of ten measures extracted from in-house cohort.

Voxels with p-value<0.05 and a cluster size of 85 voxels (2295 mm³), corresponding to a corrected p-value<0.05, were used to identify the significant clusters. Hot and cold colours indicate AD-related increases and decreases in the measures, respectively.

https://doi.org/10.1371/journal.pone.0212582.g005

Fig 6. Univariate t-statistical difference maps between MCI and CN groups of ten measures extracted from in-house cohort.

Voxels with p-value<0.05 and a cluster size of 85 voxels (2295 mm³), corresponding to a corrected p-value<0.05, were used to identify the significant clusters. Hot and cold colours indicate MCI-related increases and decreases in the measures, respectively.

https://doi.org/10.1371/journal.pone.0212582.g006

Limitations and future perspectives

Notwithstanding the discriminative power of the framework we presented for AD and MCI, this work has several limitations. First, the limited sample size of the in-house cohort (81 AD, 132 MCI, and 152 CN), and especially of the ADNI2 cohort (33 AD, 31 MCI, and 31 CN), restricted how much the algorithm could learn during the training phase. These small datasets do not adequately represent the patient population, so the generalization of our results to other groups is not guaranteed.

A second limitation concerns model complexity, as our proposed voxel-wise method may require more computation and resources than methods based on regions of interest (ROIs). However, the computation and resource burden occurs only in the training phase, which can be carried out offline, whereas the computation for testing consists of simple functions. Thus, from the clinical perspective, we believe this limitation is acceptable considering the better accuracies obtained.

Third, our multi-measure classification framework only considers functional MRI data. However, it is expected that combining as many modalities as possible would be advantageous for the discrimination of AD and MCI from CN [71]. Accordingly, in future studies, we plan to develop a multi-modal classification framework combining multiple data sources, including structural MRI and PET data.

Conclusion

In conclusion, we demonstrated the feasibility of using rs-fMRI scans for AD/MCI prediction in individual subjects. Using a standard Alzheimer’s Disease Neuroimaging Initiative cohort and an in-house AD cohort from South Korea, the proposed framework extracts the maximum amount of information about AD/MCI-related changes from concatenations of multiple rs-fMRI biomarkers, which led to higher classification accuracies than those reported in other recent studies. The combination of t-test-based univariate and RFE-based multivariate feature selection techniques, performed on the concatenated measures extracted from rs-fMRI data, provided the best discriminative performance when the selected features were used by the ELM classifier, which was superior to the linear and non-linear SVM classifiers. These results may direct future studies using rs-fMRI scans for the classification of patients with preclinical AD or MCI.

Supporting information

S1 Table. The subject IDs of three groups of ADNI2 cohort used in this study.

https://doi.org/10.1371/journal.pone.0212582.s001

(DOC)

Acknowledgments

The authors thankfully acknowledge the ADNI consortium for the use of the released dataset, and also thank Dr. Heung-Il Suk and Chong-Yaw Wee for sharing the ADNI2 subject IDs.

References

  1. 1. Braak H, Braak E. Neuropathological stageing of Alzheimer-related changes. Acta Neuropathol. 1991;82: 239–259. pmid:1759558
  2. 2. Zheng W, Yao Z, Xie Y, Fan J, Hu B. Identification of Alzheimer’s Disease and Mild Cognitive Impairment Using Networks Constructed Based on Multiple Morphological Brain Features. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging. 2018; pmid:30077576
  3. 3. Khazaee A, Ebrahimzadeh A, Babajani-Feremi A. Classification of patients with MCI and AD from healthy controls using directed graph measures of resting-state fMRI. Behav Brain Res. 2017;322: 339–350. pmid:27345822
  4. 4. Mahmoudi A, Takerkart S, Regragui F, Boussaoud D, Brovelli A. Multivoxel Pattern Analysis for fMRI Data: A Review. Comput Math Methods Med. 2012;2012: 1–14.
  5. 5. Mwangi B, Tian TS, Soares JC. A Review of Feature Reduction Techniques in Neuroimaging. Neuroinformatics. 2013;12: 229–244.
  6. 6. Ashburner J, Friston KJ. Voxel-Based Morphometry—The Methods. Neuroimage. 2000;11: 805–821. pmid:10860804
  7. 7. Yang Z, Fang F, Weng X. Recent developments in multivariate pattern analysis for functional MRI. Neurosci Bull. 2012. pmid:22833038
  8. 8. Chaves R, Ramírez J, Górriz JM, López M, Salas-Gonzalez D, Alvarez I, et al. SVM-based computer-aided diagnosis of the Alzheimer’s disease using t-test NMSE feature selection with feature correlation weighting. Neurosci Lett. 2009;461: 293–297. pmid:19549559
  9. 9. Zang Y, Jiang T, Lu Y, He Y, Tian L. Regional homogeneity approach to fMRI data analysis. Neuroimage. 2004;22: 394–400. pmid:15110032
  10. 10. He Y, Wang L, Zang Y, Tian L, Zhang X, Li K, et al. Regional coherence changes in the early stages of Alzheimer’s disease: a combined structural and resting-state functional MRI study. Neuroimage. 2007;35: 488–500. pmid:17254803
  11. 11. Zou Q-H, Zhu C-Z, Yang Y, Zuo X-N, Long X-Y, Cao Q-J, et al. An improved approach to detection of amplitude of low-frequency fluctuation (ALFF) for resting-state fMRI: fractional ALFF. J Neurosci Methods. 2008;172: 137–141. pmid:18501969
  12. 12. Guo Z, Liu X, Li J, Wei F, Hou H, Chen X, et al. Fractional amplitude of low-frequency fluctuations is disrupted in Alzheimer’s disease with depression. Clin Neurophysiol. 2017;128: 1344–1349. pmid:28570868
  13. 13. Li Y, Jing B, Liu H, Li Y, Gao X, Li Y, et al. Frequency-Dependent Changes in the Amplitude of Low-Frequency Fluctuations in Mild Cognitive Impairment with Mild Depression. J Alzheimers Dis. 2017;58: 1175–1187. pmid:28550250
  14. 14. Lin Q, Rosenberg MD, Yoo K, Hsu TW, O’Connell TP, Chun MM. Resting-State Functional Connectivity Predicts Cognitive Impairment Related to Alzheimer’s Disease. Front Aging Neurosci. 2018;10: 94. pmid:29706883
  15. 15. Han Y, Wang J, Zhao Z, Min B, Lu J, Li K, et al. Frequency-dependent changes in the amplitude of low-frequency fluctuations in amnestic mild cognitive impairment: a resting-state fMRI study. Neuroimage. 2011;55: 287–295. pmid:21118724
  16. 16. Li Y, Wang X, Li Y, Sun Y, Sheng C, Li H, et al. Abnormal Resting-State Functional Connectivity Strength in Mild Cognitive Impairment and Its Conversion to Alzheimer’s Disease. Neural Plast. 2016;2016: 4680972. pmid:26843991
  17. 17. Dai Z, Yan C, Li K, Wang Z, Wang J, Cao M, et al. Identifying and Mapping Connectivity Patterns of Brain Network Hubs in Alzheimer’s Disease. Cereb Cortex. 2015;25: 3723–3742. pmid:25331602
  18. 18. Rathore S, Habes M, Iftikhar MA, Shacklett A, Davatzikos C. A review on neuroimaging-based classification studies and associated feature extraction methods for Alzheimer’s disease and its prodromal stages. Neuroimage. 2017;155: 530–548. pmid:28414186
  19. 19. Arbabshirani MR, Plis S, Sui J, Calhoun VD. Single subject prediction of brain disorders in neuroimaging: Promises and pitfalls. Neuroimage. 2017;145: 137–165. pmid:27012503
  20. 20. Guyon I, Weston J, Barnhill S, Vapnik V. Gene Selection for Cancer Classification using Support Vector Machines. Mach Learn. Kluwer Academic Publishers; 2002;46: 389–422.
  21. 21. De Martino F, Valente G, Staeren N, Ashburner J, Goebel R, Formisano E. Combining multivariate voxel selection and support vector machines for mapping and classification of fMRI spatial patterns. Neuroimage. 2008;43: 44–58. pmid:18672070
  22. 22. Mishra S, Mishra D. SVM-BT-RFE: An improved gene selection framework using Bayesian T-test embedded in support vector machine (recursive feature elimination) algorithm. Karbala International Journal of Modern Science, 2015.
  23. 23. Qureshi M. N. I., Min B., Jo H. J., Lee B. (2016). Multiclass classification for the differential diagnosis on the adhd subtypes using recursive feature elimination and hierarchical extreme learning machine: structural MRI Study. PLoS ONE 11:e0160697. pmid:27500640
  24. 24. Qureshi M. N. I., Oh J., Min B., Jo H. J., Lee B. (2017b). Multi-modal, multi-measure, and multi-class discrimination of ADHD with hierarchical feature extraction and extreme learning machine using structural and functional brain MRI. Front. Hum. Neurosci. 11:157. pmid:28420972
  25. 25. Dai D, Wang J, Hua J, He H. Classification of ADHD children through multimodal magnetic resonance imaging. Front Syst Neurosci. 2012 Sep 3;6:63. pmid:22969710
  26. 26. Wee CY, Yap PT, Zhang D, Wang L, Shen D. Group-constrained sparse fMRI connectivity modeling for mild cognitive impairment identification. Brain Struct Funct. 2014 pmid:23468090
  27. 27. Wee CY, Yap PT, Shen D; Alzheimer's Disease Neuroimaging Initiative. Prediction of Alzheimer's disease and mild cognitive impairment using cortical morphological patterns. Hum Brain Mapp. 2013 Dec;34(12):3411–25. pmid:22927119
  28. 28. Jie B, Zhang D, Wee CY, Shen D. Topological graph kernel on multiple thresholded functional connectivity networks for mild cognitive impairment classification. Hum Brain Mapp. 2014 Jul;35(7):2876–97. pmid:24038749
  29. 29. Wee CY, Wang L, Shi F, Yap PT, Shen D. Diagnosis of autism spectrum disorders using regional and interregional morphological features. Hum Brain Mapp. 2014 Jul;35(7):3414–30. pmid:25050428
  30. 30. Falahati F, Westman E, Simmons A. Multivariate data analysis and machine learning in Alzheimer's disease with a focus on structural magnetic resonance imaging. J Alzheimers Dis. 2014. pmid:24718104
  31. 31. Dai D, Huiguang H, Joshua TV, Zengguang H. Accurate prediction of AD patients using cortical thickness networks. Machine Vision and Applications (2013).
  32. 32. Li X, Peng S, Chen J, Lü B, Zhang H, Lai M. SVM-T-RFE: a novel gene selection algorithm for identifying metastasis-related genes in colorectal cancer using gene expression profiles. Biochem Biophys Res Commun. 2012. pmid:22306013
  33. 33. Chao-Gan Y, Yu-Feng Z. DPARSF: A MATLAB Toolbox for “Pipeline” Data Analysis of Resting-State fMRI. Front Syst Neurosci. 2010;4: 13. pmid:20577591
  34. 34. Ashburner J. A fast diffeomorphic image registration algorithm. Neuroimage. 2007;38: 95–113. pmid:17761438
  35. 35. Song X-W, Dong Z-Y, Long X-Y, Li S-F, Zuo X-N, Zhu C-Z, et al. REST: a toolkit for resting-state functional magnetic resonance imaging data processing. PLoS One. 2011;6: e25031. pmid:21949842
  36. 36. Kendall M, Gibbons JDR. Correlation methods. Oxford: Oxford University Press; 1990.
  37. 37. Zang Y, Jiang T, Lu Y, He Y, Tian L. Regional homogeneity approach to fMRI data analysis. Neuroimage. 2004;22: 394–400. pmid:15110032
  38. 38. Zang Y-F, He Y, Zhu C-Z, Cao Q-J, Sui M-Q, Liang M, et al. Altered baseline brain activity in children with ADHD revealed by resting-state functional MRI. Brain Dev. 2007;29: 83–91. pmid:16919409
  39. 39. Zhou Y, Wang Y, Rao L-L, Liang Z-Y, Chen X-P, Zheng D, et al. Disrutpted resting-state functional architecture of the brain after 45-day simulated microgravity. Front Behav Neurosci. 2014;8. pmid:24926242
  40. Zuo X-N, Ehmke R, Mennes M, Imperati D, Xavier Castellanos F, Sporns O, et al. Network Centrality in the Human Functional Connectome. Cereb Cortex. 2011;22: 1862–1875. pmid:21968567
  41. Zhan Z-W, Lin L-Z, Yu E-H, Xin J-W, Lin L, Lin H-L, et al. Abnormal resting-state functional connectivity in posterior cingulate cortex of Parkinson’s disease with mild cognitive impairment and dementia. CNS Neurosci Ther. 2018. pmid:29500931
  42. Qureshi MNI, Oh J, Cho D, Jo HJ, Lee B. Multimodal Discrimination of Schizophrenia Using Hybrid Weighted Feature Concatenation of Brain Functional Connectivity and Anatomical Features with an Extreme Learning Machine. Front Neuroinform. 2017;11: 59. pmid:28943848
  43. Calhoun VD, Sui J. Multimodal Fusion of Brain Imaging Data: A Key to Finding the Missing Link(s) in Complex Mental Illness. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging. 2016;1: 230–244.
  44. Mwangi B, Ebmeier KP, Matthews K, Steele JD. Multi-centre diagnostic classification of individual structural neuroimaging scans from patients with major depressive disorder. Brain. 2012;135: 1508–1521. pmid:22544901
  45. Wee C-Y, Yap P-T, Zhang D, Denny K, Browndyke JN, Potter GG, et al. Identification of MCI individuals using structural and functional connectivity networks. Neuroimage. 2012;59: 2045–2056. pmid:22019883
  46. Yan K, Zhang D. Feature selection and analysis on correlated gas sensor data with recursive feature elimination. Sens Actuators B Chem. Elsevier; 2015;212: 353–363.
  47. Tibshirani R. Regression Shrinkage and Selection via the Lasso. J R Stat Soc Series B Stat Methodol. [Royal Statistical Society, Wiley]; 1996;58: 267–288.
  48. Wu TT, Chen YF, Hastie T, Sobel E, Lange K. Genome-wide association analysis by lasso penalized logistic regression. Bioinformatics. academic.oup.com; 2009;25: 714–721. pmid:19176549
  49. Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. NIH Public Access; 2010;33: 1. pmid:20808728
  50. Eavani H, Satterthwaite TD, Gur RE, Gur RC, Davatzikos C. Unsupervised learning of functional network dynamics in resting state fMRI. Inf Process Med Imaging. 2013;23: 426–437. pmid:24683988
  51. Leonardi N, Richiardi J, Gschwind M, Simioni S, Annoni J-M, Schluep M, et al. Principal components of functional connectivity: a new approach to study dynamic brain connectivity during rest. Neuroimage. 2013;83: 937–950. pmid:23872496
  52. Suk H-I, Wee C-Y, Lee S-W, Shen D. State-space model with deep learning for functional dynamics estimation in resting-state fMRI. Neuroimage. 2016;129: 292–307. pmid:26774612
  53. de Vos F, Koini M, Schouten TM, Seiler S, van der Grond J, Lechner A, et al. A comprehensive analysis of resting state fMRI measures to classify individual patients with Alzheimer’s disease. Neuroimage. 2018;167: 62–72. pmid:29155080
  54. Zhou J, Greicius MD, Gennatas ED, Growdon ME, Jang JY, Rabinovici GD, et al. Divergent network connectivity changes in behavioural variant frontotemporal dementia and Alzheimer’s disease. Brain. 2010;133: 1352–1367. pmid:20410145
  55. Wu X, Li J, Ayutyanont N, Protas H, Jagust W, Fleisher A, et al. The receiver operational characteristic for binary classification with multiple indices and its application to the neuroimaging study of Alzheimer’s disease. IEEE/ACM Trans Comput Biol Bioinform. 2013;10: 173–180. pmid:23702553
  56. Khazaee A, Ebrahimzadeh A, Babajani-Feremi A. Identifying patients with Alzheimer’s disease using resting-state fMRI and graph theory. Clin Neurophysiol. 2015;126: 2132–2141. pmid:25907414
  57. Challis E, Hurley P, Serra L, Bozzali M, Oliver S, Cercignani M. Gaussian process classification of Alzheimer’s disease and mild cognitive impairment from resting-state fMRI. Neuroimage. 2015;112: 232–243. pmid:25731993
  58. Jie B, Zhang D, Gao W, Wang Q, Wee C-Y, Shen D. Integration of network topological and connectivity properties for neuroimaging classification. IEEE Trans Biomed Eng. 2014;61: 576–589. pmid:24108708
  59. Beltrachini L, De Marco M, Taylor ZA, Lotjonen J, Frangi AF, Venneri A. Integration of Cognitive Tests and Resting State fMRI for the Individual Identification of Mild Cognitive Impairment. Curr Alzheimer Res. 2015;12: 592–603. pmid:26238814
  60. Kim J, Lee B. Identification of Alzheimer's disease and mild cognitive impairment using multimodal sparse hierarchical extreme learning machine. Hum Brain Mapp. 2018. pmid:29736986
  61. Zhang D, Wang Y, Zhou L, Yuan H, Shen D; Alzheimer's Disease Neuroimaging Initiative. Multimodal classification of Alzheimer's disease and mild cognitive impairment. Neuroimage. 2011. pmid:21236349
  62. Casanova R, Whitlow CT, Wagner B, Williamson J, Shumaker SA, Maldjian JA, Espeland MA. High dimensional classification of structural MRI Alzheimer's disease data based on large scale regularization. Front Neuroinform. 2011. pmid:22016732
  63. Hidalgo-Muñoz AR, Ramírez J, Górriz JM, Padilla P. Regions of interest computed by SVM wrapped method for Alzheimer's disease examination from segmented MRI. Front Aging Neurosci. 2014. pmid:24634656
  64. Salvatore C, Cerasa A, Battista P, Gilardi MC, Quattrone A, Castiglioni I; Alzheimer's Disease Neuroimaging Initiative. Magnetic resonance imaging biomarkers for the early diagnosis of Alzheimer's disease: a machine learning approach. Front Neurosci. 2015. pmid:26388719
  65. Retico A, Bosco P, Cerello P, Fiorina E, Chincarini A, Fantacci ME. Predictive Models Based on Support Vector Machines: Whole-Brain versus Regional Analysis of Structural MRI in the Alzheimer's Disease. J Neuroimaging. 2015. pmid:25291354
  66. Ota K, Oishi N, Ito K, Fukuyama H; SEAD-J Study Group; Alzheimer's Disease Neuroimaging Initiative. Effects of imaging modalities, brain atlases and feature selection on prediction of Alzheimer's disease. J Neurosci Methods. 2015. pmid:26318777
  67. Beheshti I, Demirel H; Alzheimer’s Disease Neuroimaging Initiative. Feature-ranking-based Alzheimer's disease classification from structural MRI. Magn Reson Imaging. 2016. pmid:26657976
  68. Beheshti I, Demirel H, Matsuda H; Alzheimer's Disease Neuroimaging Initiative. Classification of Alzheimer's disease and prediction of mild cognitive impairment-to-Alzheimer's conversion from structural magnetic resource imaging using feature ranking and a genetic algorithm. Comput Biol Med. 2017. pmid:28260614
  69. López M, Ramírez J, Górriz JM, Álvarez I, Salas-Gonzalez D, Segovia F, et al. Principal component analysis-based techniques and supervised classification schemes for the early detection of Alzheimer's disease. Neurocomputing. 2011.
  70. Chu C, Hsu AL, Chou KH, Bandettini P, Lin C; Alzheimer's Disease Neuroimaging Initiative. Does feature selection improve classification accuracy? Impact of sample size and feature selection on classification using anatomical magnetic resonance images. Neuroimage. 2012. pmid:22166797
  71. Duc NT, Lee B. Microstate functional connectivity in EEG cognitive task revealed by multivariate Gaussian hidden Markov model with phase locking value. J Neural Eng. 2019. pmid:30673644