Abstract
Background
Early diagnosis of Alzheimer’s disease (AD) and Mild Cognitive Impairment (MCI) is essential for timely treatment. Machine learning and multivariate pattern analysis (MVPA) for the diagnosis of brain disorders are attracting attention in the neuroimaging community. In this paper, we propose a voxel-wise discriminative framework applied to multi-measure resting-state fMRI (rs-fMRI) that integrates hybrid MVPA and extreme learning machine (ELM) for the automated discrimination of AD and MCI from the cognitively normal (CN) state.
Materials and methods
We used two rs-fMRI cohorts: the public Alzheimer’s disease Neuroimaging Initiative database (ADNI2) and an in-house Alzheimer’s disease cohort from South Korea, both including individuals with AD, MCI, and normal controls. After extracting three-dimensional (3-D) patterns measuring regional coherence and functional connectivity during the resting state, we performed univariate statistical t-tests to generate a 3-D mask that retained only voxels showing significant changes. Given the initial univariate features, to enhance discriminative patterns, we implemented MVPA feature reduction using support vector machine-recursive feature elimination (SVM-RFE), and least absolute shrinkage and selection operator (LASSO), in combination with the univariate t-test. Classifications were performed by an ELM, and its efficiency was compared to linear and nonlinear (radial basis function) SVMs.
Citation: Nguyen DT, Ryu S, Qureshi MNI, Choi M, Lee KH, Lee B (2019) Hybrid multivariate pattern analysis combined with extreme learning machine for Alzheimer’s dementia diagnosis using multi-measure rs-fMRI spatial patterns. PLoS ONE 14(2): e0212582. https://doi.org/10.1371/journal.pone.0212582
Editor: Jingwen Yan, Indiana University Purdue University at Indianapolis, UNITED STATES
Received: November 4, 2018; Accepted: February 5, 2019; Published: February 22, 2019
Copyright: © 2019 Nguyen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: This work was supported by the Bio & Medical Technology Development Program of the NRF funded by the Korean government, MSIT (NRF-2016M3A9E9941946). The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Alzheimer’s disease (AD) is the most common neurodegenerative disease and is the main cause of 60% to 70% of dementia cases in aging societies. It is characterized by cognitive decline and short-term memory loss [1, 2]. Mild cognitive impairment (MCI) is referred to as the prodromal stage of AD, and subjects with MCI are at high risk of developing AD [3]. Because AD/MCI are neurodegenerative diseases and progressively attack memory cells, the development of early diagnostic tools is undoubtedly important.
In recent years, resting-state functional magnetic resonance imaging (rs-fMRI) has been shown to be a powerful tool for analysing spontaneous blood-oxygen-level-dependent (BOLD) contrasts to map neural activity associated with a variety of brain functions. In order to map the brain areas involved in a given cognitive function, the BOLD signal at the level of the individual voxel is analyzed [4]. Statistical analysis is then performed on all voxels to identify regions whose BOLD signal shows significant effects. This approach, referred to as univariate t-test analysis, is performed independently on each voxel and has been used in neuroimaging research for decades [5, 6]. However, it can only show differences between group averages and is not sufficient to diagnose individual subjects. Therefore, a machine learning (ML) technique known as multivariate pattern analysis (MVPA) has recently been applied, with promising results, to classify individual subjects using neuroimaging scans [7, 8]. Multivariate methods such as support vector machine-recursive feature elimination (SVM-RFE) and least absolute shrinkage and selection operator (LASSO) investigate the mutual relationships between multiple voxels and spatial patterns. Thus, the combination of univariate t-test and MVPA approaches is expected to enhance the prediction performance compared to either approach used alone.
Previous fMRI studies have indicated that the pathophysiology of AD/MCI can be associated with statistical changes, in the average sense, of regional spontaneous low-frequency (<0.08 Hz) BOLD fluctuation coherence measured in the resting state and analysed using univariate t-tests. The metrics used in these studies included regional homogeneity (ReHo) [9, 10], amplitude of low-frequency fluctuation (ALFF) [11–13], and fractional ALFF (fALFF), as well as functional connectivity (FC) [14]. For example, He et al. [10] showed that the posterior cingulate cortex (PCC) and the precuneus (PCu) have the largest ReHo differences between the AD and CN groups (p<0.05). The ALFF and fALFF studies using fMRI by Han et al. [15] revealed that MCI patients had decreased fALFF values in the PCC/PCu and hippocampus, and increased fALFF values in several other regions, including the occipital and temporal cortices. Rs-fMRI FC, investigated by Li et al. [16], showed that the regions with high FC were mostly located in the default mode network (DMN), and mainly involved the bilateral PCu and PCC [17]. These are all statistically significant findings at the group level. However, the discriminative ability of the above-mentioned biomarkers related to AD/MCI has not been evaluated. Since the discrimination task automatically classifies each subject into one of the studied groups (AD/MCI vs. CN), it is considered a much more complex task than the study of differences between groups [18, 19].
In neuroimaging studies, preprocessed brain scans commonly contain hundreds of thousands of non-zero voxels, which greatly outnumber the subjects (often fewer than 1000). Thus, selecting an adequate subset of relevant training features (voxels) is of critical importance to obtain good generalization and to reduce the risk of overfitting and the computational complexity. A growing trend today is the design of ML-based feature reduction techniques integrated with classification methods applied to neuroimaging data for the voxel-based automated discrimination of patients with brain disorders, including AD and MCI (see the reviews [18, 19]). Many studies have demonstrated the relevance of feature selection. Statistical hypothesis t-tests have been broadly and successfully used not only for detecting group differences but also for feature selection. The technique relies on an optimal significance threshold (p-value) that defines a subset of important features from the whole-brain features. Although t-test-based feature selection is computationally efficient and easy to implement, it suffers from a significant drawback: it does not consider interactions between multiple features or spatial patterns, which reflect the inherently multivariate nature of fMRI data. By contrast, MVPA methods do evaluate the relationships between multiple patterns. However, the primary drawback of whole-brain MVPA is that it is computationally demanding because of the 3-D, high-dimensional nature of the data and the large number of images being analyzed [20–22]. Thus, to select the most informative features, a univariate feature selection step should be performed prior to MVPA in order to reduce the dimensionality to a level compatible with memory capacity and computational efficiency and to ensure high sensitivity to fine-grained spatial discriminative patterns, while preserving the appealing properties of whole-brain fMRI analysis and the multivariate nature of fMRI data [21, 22].
Practically, many previous studies have successfully employed hybrid combinations of the filter-based t-test and MVPA techniques, such as the wrapper-based SVM-RFE, to diagnose brain disorders using neuroimaging data, e.g., ADHD [23–25], MCI [26–28], autism [29], and AD [30, 31], or for high-dimensional gene selection [22, 32] (accuracies >90%).
In this study, we propose an ML-based AD/MCI diagnosis framework combining MVPA and extreme learning machines (ELM) applied to multi-measure rs-fMRI data. We first extracted maps of 3-D regional coherence (ReHo, ALFF, and fALFF) and of resting-state FC (rsFC) (degree centrality (DC), seed-based rsFC) of multiple individual subjects. We then performed statistical univariate two-sample t-tests on whole-brain 3-D maps between two pre-defined training groups, to generate an analysis mask that retained only an initial set of relevant features (voxels) showing significant changes in any one of the measures, i.e. ReHo, fALFF, rs-FC. Next, MVPA techniques such as the wrapper-based SVM-RFE proposed by Guyon [20] and the embedded-based LASSO were implemented to optimize the discriminative performance. We used ELM and competing methods, including linear and non-linear SVM classifiers, to distinguish AD/MCI patients from the CN controls. We hypothesized that a hybrid combination of univariate statistical t-test and MVPA approaches applied to a concatenation of multiple functional biomarkers could boost the classification performance. The major contributions of this study can be summarized as follows:
- We propose a voxel-wise ML-based discriminative framework integrating ELM classifier and hybrid MVPA techniques for automated AD/MCI diagnosis using multi-measure rs-fMRI.
- The proposed framework extracts a maximum amount of information from multiple rs-fMRI biomarkers of a public Alzheimer’s disease Neuroimaging Initiative (ADNI2) and an in-house AD cohort from South Korea and, therefore, achieves maximal classification accuracies as compared to all other previous studies.
- We demonstrate that, compared to conventional univariate statistical analysis t-test, the hybrid combination of multivariate methods (univariate t-test + SVM-RFE and univariate t-test + LASSO) increases the classification performance of the discriminative patterns.
- The effectiveness of the ELM classifier, superior to that of linear and radial basis function (RBF)-based SVM classifiers, when combined with hybrid feature selection methods for AD/MCI identifications based on multi-biomarker rs-fMRI is addressed for the first time in this work.
- We show that the highest classification accuracies are achieved when all patterns from multiple regional coherence and functional connectivity biomarkers are concatenated. This suggests that different brain regions suffer different functional losses due to AD/MCI. Hence, a classification framework should include the maximum amount of informative changes to achieve the best performance.
The remainder of this paper is organized as follows. Section 2 provides details on the datasets, subjects, preprocessing of rs-fMRI data, classification algorithms, univariate and MVPA feature reduction techniques, and permutation test used for the validation of the results. Section 3 presents the comparative results, while Section 4 is devoted to the discussion and conclusions of the article.
Materials and methods
We used two independent rs-fMRI datasets: the ADNI2 dataset, which is publicly available online, and an in-house dataset whose subjects were recruited from the Chosun University Hospital in Gwangju, South Korea.
Subjects
ADNI2 cohort.
We used a cohort of 33 (17 females) Alzheimer’s disease (AD) subjects, 31 (14 females) early Mild Cognitive Impairment (MCI) subjects, and 31 (17 females) Cognitively Normal (CN) subjects from the ADNI2 database, which is publicly available on the web (www.adni.loni.usc.edu). The mean ages of the AD, MCI, and CN groups were 73.59 ± 5.18, 74.52 ± 5.18, and 74.66 ± 5.56 years, respectively. General criteria for categorizing AD, MCI, and CN are well explained on the ADNI web site (http://adni.loni.ucla.edu). The subjects ranged in age from 56 to 89 years, and functional assessments of AD/MCI patients, such as the Mini-Mental State Examination (MMSE) and the Clinical Dementia Rating (CDR), were independently performed by the research institutions. The general criteria were as follows: the CN subjects had MMSE scores between 24 and 30, a CDR of 0, and were non-depressed, non-MCI, and non-demented. MCI patients had MMSE scores between 24 and 30, CDR scores between 0 and 1, no significant levels of impairment in other cognitive domains, essentially preserved daily living activities, and absence of dementia. The MMSE scores of AD patients were between 15 and 26, their CDR scores were 0.5 or 1, and they met the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association (NINCDS/ADRDA) criteria for probable AD. In this study, to minimize the effect of different image sizes and resolutions, we selected images from subjects with the same image dimension and resolution, and we used only the baseline fMRI scans.
In-house cohort.
A total of 365 subjects were included in the in-house dataset: 81 AD subjects, 132 MCI subjects, and 152 CN subjects. This dataset was part of a large cohort enrolled at the National Dementia Research Center, Chosun University, Gwangju, South Korea. All subjects provided written informed consent before the data collection. For AD patients who were unable to give consent, the next of kin gave consent before participation. Psychological tests or assessments were not used to determine whether subjects were able to provide written informed consent. The consent procedure and data acquisition were approved by the Institutional Review Board (IRB) of the Chosun University Hospital, Gwangju, South Korea (IRB number 2013-12-018). Briefly, subjects were between 56 and 87 years of age, and the study partners were able to provide independent functional evaluations. The MMSE and CDR scores and the other clinical criteria for inclusion in the three groups were the same as in the ADNI2 cohort. The demographics of the participants from the two cohorts are shown in Table 1, and subject IDs are provided in supporting S1 Table.
Rs-fMRI data acquisition
ADNI2 cohort.
ADNI2 subjects were scanned at different centres using 3.0 T Philips Achieva scanners with the same scanning protocol and parameters: Repetition Time (TR)/Echo Time (TE) = 3000/30 ms, flip angle = 80°, acquisition matrix size = 64 × 64, 48 slices, 140 volumes, and a voxel thickness = 3.3 mm.
In-house cohort.
The participants in the Chosun University Hospital were scanned with a Siemens Skyra 3.0-Tesla scanner. A 2D EPI MR acquisition type was used with the following parameters: TR/TE = 3000/30 ms, flip angle = 90°, field of view (FOV) = 240 × 240 mm, acquisition matrix size 64 × 64, 35 slices, 90 volumes, voxel size = 3.75 x 3.75 x 3.75, spacing between slices = 4.8 mm, number of echoes = 1, imaging frequency = 123.206 Hz, slice acquisition order = ascending (bottom-up), direction = 'Transverse > Coronal (2.6) > Sagittal (1.7)', pixel bandwidth = 3440, in-plane phase encoding direction = ‘ROW’, number of phase encoding steps = 63, echo train length = 31, percent sampling = 100, percent phase field of view = 100, variable flip angle flag = ‘N’, and specific absorption rate (SAR) = 0.0778.
Preprocessing of rs-fMRI data
Preprocessing of rs-fMRI data was carried out using the Data Processing Assistant for Resting-State fMRI (DPARSF; http://www.restfmri.net) [33] and the Statistical Parametric Mapping platform (SPM8; http://www.fil.ion.ucl.ac.uk/spm). All Digital Imaging and Communications in Medicine (DICOM) files were obtained from the scanners as described above and converted into the Neuroimaging Informatics Technology Initiative (NIfTI) file format. The first 10 time points for each participant were discarded to allow for signal calibration and participants’ adaptation to the scanning noise. Subsequently, functional images went through the following preprocessing steps: slice-timing correction was performed with the last slice as the reference; realignment for head movement compensation was performed by applying a Friston 24-parameter model (6 head motion parameters, 6 head motion parameters from the previous time point, and 12 corresponding squared items); individual structural images (T1-weighted MPRAGE) were co-registered to the mean functional image after realignment; normalization of the rs-fMRI images to the original space was performed with the Diffeomorphic Anatomical Registration Through Exponentiated Lie algebra (DARTEL) toolbox [34] (resampling voxel size = 3 × 3 × 3 mm3); and spatial smoothing was performed with a 6-mm full-width at half-maximum (FWHM) Gaussian kernel. Then, linear trend removal and temporal band-pass filtering (0.01 Hz < f < 0.08 Hz) were performed on the time series of each voxel. Finally, we regressed out cerebrospinal fluid and white matter signals as well as six head-motion parameters to further reduce the effects of nuisance signals and focus only on the gray matter signal. A mask image was created from the intersection of the subject-specific normalized T1 anatomical images. Only the voxels within the mask were further analyzed. The mask image was also used for correcting for multiple comparisons in later analyses.
Proposed framework
Fig 1 illustrates all the procedures and techniques proposed in this study. The first step of the procedure is to extract the whole-brain 3-D measures from the preprocessed rs-fMRI images. The measures are ReHo, fALFF, ALFF, Degree Centrality (DC), left hippocampus-based rsFC (LeftHC-based rsFC), right hippocampus-based rsFC (RightHC-based rsFC), left posterior cingulate cortex-based rsFC (LeftPCC-based rsFC), right posterior cingulate cortex-based rsFC (RightPCC-based rsFC), left precuneus-based rsFC (LeftPCu-based rsFC), and right precuneus-based rsFC (RightPCu-based rsFC). Due to the small size of the datasets, we used leave-one-out (LOO-CV) and 10-fold cross-validation (10-fold CV) for the ADNI2 and the in-house cohort, respectively, to validate the classification performance of the methods. In LOO-CV, one sample was selected as testing data whereas the rest were used for training. In 10-fold CV, 90% of the data were used for training and the remaining 10% for testing. Given the training 3-D spatial maps, we then performed univariate statistical t-tests to obtain a 3-D mask which identified a set of ‘active’ voxels. We then applied the MVPA techniques (SVM-RFE and LASSO) to the 1-D concatenated training features to select the most relevant features for training the ELM and SVM classifiers. Finally, given the indices of the highest ranked features on the training data, we extracted the corresponding features from the testing data for classification.
Block (a) presents the 3-D feature measure extractions from preprocessed fMRI scans. Block (b) describes the LOO-CV and 10-fold-CV cross validation for ADNI2 and in-house cohorts, respectively. Block (c) presents the multivariate feature reduction techniques using LASSO and SVM-RFE. The combined univariate t-test and multivariate LASSO as well as SVM-RFE informative features are trained by ELM and SVM classifiers as illustrated in block (d). Finally, the trained classifiers and testing features are used to evaluate the performance as in block (e).
Feature extraction
We describe here some biomarkers measured from rs-fMRI using the Resting-State fMRI Data Analysis Toolkit (REST) toolbox [35]. The measures can be categorized into regional spontaneous measures (ReHo, ALFF, fALFF), and functional connectivity measures (DC, seed-based rsFC), as described below.
Regional homogeneity (ReHo).
We used the ReHo measure to explore regional brain activity during the resting state. The computation was performed on a voxel-wise basis by calculating Kendall’s Coefficient of Concordance (KCC) [36] of fMRI time series of a given voxel with those of its nearest neighbours. From all the voxels in the brain, an individual ReHo map was obtained for each subject. A higher regional coherence within a cluster, consisting of a voxel and its nearest neighbours, was represented by a larger ReHo value for the voxel. Several recent studies in literature have shown the potential value of ReHo in clinical applications [9, 10, 37].
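As an illustration of the concordance computation described above, the following minimal sketch computes Kendall's coefficient of concordance for the time series of one voxel and its neighbours. The function name `kendall_w` and the array layout are hypothetical, the neighbourhood size is left to the caller, and the sketch assumes no tied values within a time series (no tie correction is applied); it is not the REST implementation.

```python
import numpy as np

def kendall_w(ts):
    """Kendall's coefficient of concordance (no tie correction).

    ts: array of shape (n_timepoints, k_voxels) holding the time series of a
        voxel and its neighbours (hypothetical layout for this sketch).
    Returns a value in [0, 1]; 1 means all series rank the time points identically.
    """
    n, k = ts.shape
    # Rank each voxel's series over time (1..n); assumes no ties.
    ranks = ts.argsort(axis=0).argsort(axis=0) + 1
    # Sum of ranks given to each time point across the k voxels.
    rank_sums = ranks.sum(axis=1)
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()
    return 12.0 * s / (k ** 2 * (n ** 3 - n))
```

Applying such a function to every voxel together with its neighbourhood would yield one ReHo value per voxel, that is, one ReHo map per subject.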
Amplitude of low-frequency fluctuation (ALFF) and fractional ALFF (fALFF).
The regional spontaneous activities can be examined by the ALFF measure and its improved version, the fALFF measure. After preprocessing, the filtered time series was transformed to a frequency domain using a fast Fourier transform (FFT), and the power spectrum was obtained. The average square root of the power spectrum (amplitude) between the frequencies of 0.01 and 0.08 Hz was computed at each voxel to give the ALFF measure [11, 38]. The fALFF measure is a modified version of ALFF, defined as the ratio of the average amplitude in the low-frequency range (0.01–0.08 Hz) to that of the entire frequency range (0–0.25 Hz) [33].
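The two measures can be sketched for a single voxel as follows. This is a hedged illustration, not the DPARSF/REST code: the function name `alff_falff`, the division of the spectrum by the number of samples, and the use of summed amplitudes in the fALFF ratio are assumptions of this sketch, and the upper end of the "entire" range is simply the Nyquist frequency implied by the repetition time passed in.

```python
import numpy as np

def alff_falff(signal, tr, low=0.01, high=0.08):
    """ALFF and fALFF of one voxel time series (illustrative sketch).

    signal: 1-D array, the voxel's BOLD time series.
    tr:     repetition time in seconds (sampling interval).
    """
    n = signal.size
    freqs = np.fft.rfftfreq(n, d=tr)
    # Amplitude spectrum (square root of the power spectrum), mean removed.
    amplitude = np.abs(np.fft.rfft(signal - signal.mean())) / n
    band = (freqs >= low) & (freqs <= high)
    # ALFF: average amplitude inside the low-frequency band.
    alff = amplitude[band].mean()
    # fALFF: band amplitude relative to the whole spectrum (zero frequency excluded).
    falff = amplitude[band].sum() / amplitude[1:].sum()
    return alff, falff
```

For a pure 0.05 Hz oscillation nearly all amplitude lies inside the band, so fALFF is close to 1, whereas an oscillation outside the band gives a value close to 0.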
Degree centrality (DC).
We used a commonly employed graph-based measure of network organization, degree centrality (DC), to perform a full-brain exploration of the regions that were influenced by AD and MCI. Within the study mask, individual network centrality maps were generated in a voxel-wise fashion. First, the preprocessed functional runs were subjected to voxel-based whole-brain correlation analysis. The time course of each voxel from each participant that was within the gray matter mask was correlated with the time course of every other voxel, to obtain a correlation matrix. An undirected adjacency matrix was then obtained by thresholding the correlation at r > 0.25 [39, 40], and the DC was computed as the sum of the weights of the significant weighted connections for each voxel. Finally, the individual-level voxel-wise DC was converted into a z-score map by subtracting the mean DC across the entire brain and dividing by the standard deviation of the whole-brain DC.
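A small-scale sketch of this computation is given below. The function name `degree_centrality` is hypothetical, and the full voxel-by-voxel correlation matrix is built in memory, which is only feasible for a small number of voxels; whole-brain implementations typically process the matrix in blocks.

```python
import numpy as np

def degree_centrality(ts, r_thresh=0.25):
    """Weighted degree centrality and its z-score map (illustrative sketch).

    ts: array of shape (n_timepoints, n_voxels) within the gray matter mask.
    """
    corr = np.corrcoef(ts, rowvar=False)   # voxel-by-voxel correlation matrix
    np.fill_diagonal(corr, 0.0)            # ignore self-connections
    # Keep only connections above the threshold, with their weights.
    weights = np.where(corr > r_thresh, corr, 0.0)
    dc = weights.sum(axis=1)
    # Standardize across all voxels in the mask.
    z = (dc - dc.mean()) / dc.std()
    return dc, z
```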
Seed-based resting-state functional connectivity (rsFC).
To examine the detailed rsFC differences among the AD, MCI and CN groups at the regional level, we performed seed-based rsFC analysis. Briefly, the mean time course within each seed was extracted by averaging the time courses of all the voxels belonging to the seed. Subsequently, the mean time course was used to compute the correlation coefficients with the time courses of all voxels. The resulting correlation coefficients were then converted to z-scores using Fisher’s r-to-z transform to improve normality [16, 41]. In this study, we selected bilateral PCC, bilateral Hippocampus, and bilateral Precuneus as the seeds. Table 2 provides detailed information about the seeds.
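The seed-based procedure can be sketched as follows. The function name `seed_fc_map` and the way the seed is specified (a list of column indices) are assumptions of this sketch; correlations are clipped slightly inside (-1, 1) so that the Fisher transform stays finite for voxels that belong to the seed itself.

```python
import numpy as np

def seed_fc_map(ts, seed_idx):
    """Seed-based functional connectivity as Fisher z-scores (illustrative sketch).

    ts:       array of shape (n_timepoints, n_voxels).
    seed_idx: indices of the voxels that form the seed region.
    """
    seed = ts[:, seed_idx].mean(axis=1)            # mean time course of the seed
    ts_c = ts - ts.mean(axis=0)
    seed_c = seed - seed.mean()
    # Pearson correlation of the seed with every voxel.
    r = (ts_c * seed_c[:, None]).sum(axis=0) / (
        np.linalg.norm(ts_c, axis=0) * np.linalg.norm(seed_c)
    )
    r = np.clip(r, -0.999999, 0.999999)            # keep arctanh finite
    return np.arctanh(r)                           # Fisher r-to-z transform
```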
Feature concatenation.
Combining multiple measures is a very effective approach for boosting the performance of a machine learning setup [42], which has been used in many research domains, including neuroimaging classification [43]. In this work, we investigated a common feature concatenation that linked many feature measures of the same dataset. We believe that feature concatenation will enhance accuracy and enable the inference of indirect or direct associations between multiple features extracted from the same fMRI data.
Feature reduction techniques
The number of predictor voxels obtained in our spatial maps was larger than the number of subjects. Thus, a dimensionality reduction process was necessary in order to select the most relevant features, discard redundant features and noise, and avoid numerical singularities and overfitting problems, and thus enhance the classification performance. Importantly, feature reduction was performed using the training data only. Once identified, the same brain regions identified during training were used to assess the classifier predictive accuracy [44] on the testing data. In this study, we used univariate t-test and MVPA approaches, including SVM-RFE and LASSO, as voxel-wise feature reduction techniques. The univariate t-test is performed voxel-wise to identify independent voxels, whereas the multivariate RFE and LASSO investigate the mutual associations between multiple features and spatial patterns. We also used hybrid combinations of univariate and MVPA approaches to outperform the individual techniques.
Univariate two sample t-test.
Many neuroimaging studies have shown abnormalities, at the level of the average signal, in one or more brain features in a diseased group compared to a control group using univariate statistical tests [19]. Recently, classification studies have used t-tests to select informative features for machine learning in neuroimaging [8, 45]. The key results of the analysis based on statistical tests are usually expressed by means of p-values. Subsequently, the optimal p-value cutoff to select the relevant features is determined through a cross-validation process, and the features thus selected are used in the subsequent machine learning analysis. In this study, we applied t-test-based feature reduction techniques to machine learning based diagnosis. Using t-tests on the training dataset, we generated an analytical mask that retained only the voxels presenting significant changes in any of the analytical feature measures, i.e. ReHo, ALFF, fALFF, DC, rsFC, between the two training groups at the threshold p-values (p<0.05 with |t|>1.9715, p<0.01 with |t|>2.599, and p<0.001 with |t|>3.3381). The cluster-size threshold giving a corrected p = 0.05 for each individual voxel p-value was determined by Monte Carlo simulations (1000 iterations) with the AlphaSim program in REST [35]. As a result, cluster sizes of 85 voxels (2295 mm3), 18 voxels (486 mm3), and 6 voxels (162 mm3) were found to correspond to individual voxel p-values of 0.05, 0.01, and 0.001, respectively. Fig 2 shows selected regions resulting from the univariate t-test applied to the ReHo maps of one fold of training data, i.e., AD vs. CN and MCI vs. CN (out of >62 different folds for the ADNI2 cohort and 10 folds for the in-house cohort).
An example of one-fold univariate statistical two-sample t-test on ReHo maps between two training analytical groups, i.e., AD against CN (left subfigure) and MCI against CN (right subfigure). The threshold was set to p-value<0.05 with cluster size of 85 voxels (2295 mm3), which corresponded to a corrected p-value<0.05. The t-test maps are overlaid on the anatomical image. The hot and cold colours represent positive and negative changes.
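The voxel-wise masking step described above can be sketched as a pooled-variance two-sample t-test computed on the training subjects only. The function name `ttest_mask` and the array layout are hypothetical; the default threshold is the |t| value quoted in the text for p<0.05, and the cluster-size correction performed with AlphaSim is deliberately omitted from this sketch.

```python
import numpy as np

def ttest_mask(group_a, group_b, t_thresh=1.9715):
    """Voxel-wise two-sample t-test mask (illustrative sketch, no cluster correction).

    group_a, group_b: arrays of shape (n_subjects, n_voxels) with one flattened
                      feature map per training subject.
    Returns the t-map and a boolean mask of voxels with |t| above the threshold.
    """
    na, nb = len(group_a), len(group_b)
    mean_diff = group_a.mean(axis=0) - group_b.mean(axis=0)
    # Pooled variance under the equal-variance assumption.
    pooled_var = (
        (na - 1) * group_a.var(axis=0, ddof=1)
        + (nb - 1) * group_b.var(axis=0, ddof=1)
    ) / (na + nb - 2)
    t = mean_diff / np.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, np.abs(t) > t_thresh
```

The mask obtained from the training fold would then be applied unchanged to the held-out subjects, so that no test information enters feature selection.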
Support vector machine-recursive feature elimination (SVM-RFE).
While the t-test is a univariate procedure that does not take into account interactions between multiple features and spatial patterns, support vector machine-recursive feature elimination (SVM-RFE) is a multivariate wrapper-model-based feature reduction algorithm, which repeatedly fits a model and removes the weakest features until the specified number of informative features is reached. The ranking criterion of SVM-RFE is closely related to the SVM model. In each iteration of the RFE, an SVM model is trained. Then, the feature with the smallest ranking criterion is removed since it has the least effect on classification, while the remaining features are kept for the SVM model in the next iteration. The sequential process is repeated until all the features have been eliminated. The features are then ranked according to the order of elimination: the later a feature is eliminated, the more significant it should be [46]. A detailed description of the SVM-RFE algorithm can be found in a previous paper [20]. In this work, after the application of SVM-RFE, the most important training features that maximized the cross-validated accuracy were kept for training the classifiers. Fig 3 illustrates the hybrid combination of the univariate t-test with multivariate SVM-RFE and LASSO to select the most relevant features.
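The elimination loop described above is available in scikit-learn as `RFE`, which can wrap a linear `SVC`. The sketch below runs it on synthetic data; the data, the choice of keeping 5 features, and the value of `C` are illustrative assumptions and do not reproduce the authors' settings or software.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Synthetic stand-in for t-test-selected training features (hypothetical data):
# 60 subjects, 20 features, of which only the first two carry class information.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 20))
X[:, :2] += 1.5 * y[:, None]

# A linear SVM is refit at each step and the feature with the smallest
# weight magnitude is removed (step=1) until 5 features remain.
selector = RFE(estimator=SVC(kernel="linear", C=1.0),
               n_features_to_select=5, step=1)
selector.fit(X, y)

selected = np.flatnonzero(selector.support_)  # indices of retained features
ranking = selector.ranking_                   # 1 = retained, larger = removed earlier
```

In practice the number of retained features would itself be chosen by cross-validation on the training data, as described in the text.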
Least absolute shrinkage and selection operator (LASSO).
A good example of MVPA feature reduction with error and regularization terms is LASSO, which has been successfully applied in neuroimaging machine learning tasks to mitigate problems related to the so-called curse of dimensionality. LASSO computes model coefficients γj by minimizing the following function:
$$\min_{\gamma_0,\,\gamma}\left(\frac{1}{2n}\sum_{i=1}^{n}\left(u_i-\gamma_0-x_i^{T}\gamma\right)^{2}+\lambda\sum_{j=1}^{q}\left|\gamma_j\right|\right)$$
where xi is the voxel-wise feature input data, a vector of q values at observation i, and n is the number of observations. ui is the response at observation i. Lambda (λ) is a non-negative user-defined regularization parameter which controls the balance between limiting the number of non-zero coefficients γj (sparsity) and high prediction accuracy. As λ increases, the model becomes increasingly sparse, meaning that it retains fewer features, while as λ approaches 0, the model becomes less sparse and includes more features [5]. The parameter γ0 is a scalar. The function minimized by LASSO involves the l1 norm of γj [47–49]. In this paper, we chose the value of λ that minimized the cross-validated mean squared error (MSE), as shown in Fig 4. The hybrid combination of univariate t-test and multivariate LASSO for selecting the most discriminative training features is shown in Fig 3.
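To make the effect of λ concrete, the following sketch solves a LASSO problem by cyclic coordinate descent with soft-thresholding. It assumes the common objective (1/(2n)) Σ (ui − γ0 − xiᵀγ)² + λ Σ |γj|; the 1/(2n) scaling, the function names `soft_threshold` and `lasso_cd`, and the fixed number of sweeps are assumptions of this illustration rather than details confirmed by the text.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, u, lam, n_iter=200):
    """LASSO by cyclic coordinate descent (illustrative sketch).

    X: (n, q) feature matrix, u: (n,) response, lam: non-negative penalty.
    Returns the intercept gamma0 and the coefficient vector gamma.
    """
    n, q = X.shape
    x_mean, u_mean = X.mean(axis=0), u.mean()
    Xc, uc = X - x_mean, u - u_mean            # centering absorbs the intercept
    gamma = np.zeros(q)
    col_sq = (Xc ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(q):
            if col_sq[j] == 0.0:
                continue                        # constant feature, stays at zero
            # Residual with feature j's current contribution added back.
            r_j = uc - Xc @ gamma + Xc[:, j] * gamma[j]
            rho = Xc[:, j] @ r_j / n
            gamma[j] = soft_threshold(rho, lam) / col_sq[j]
    gamma0 = u_mean - x_mean @ gamma
    return gamma0, gamma
```

Features whose coefficient is driven exactly to zero are discarded; a larger `lam` zeroes more coefficients, which is the sparsity behaviour discussed above.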
Classification
In this study, three machine learning classification algorithms were used, namely ELM, linear SVM, and non-linear SVM. We compared the results of all the classifiers, and ELM proved to be the most efficient algorithm in terms of both computation time and accuracy. A brief description of each method follows.
ELM classifier.
An ELM consists of an input layer, a hidden layer, and an output layer. Whereas traditional feedforward neural networks require the weights and biases of all layers to be adjusted by gradient-based learning algorithms, ELM randomly assigns the input weights and hidden layer biases without iterative adjustment, and computes the output weights by solving a single linear system [23]. Thus, ELM learns much faster than traditional neural networks and is widely employed in various classification applications as an efficient learning algorithm [24]. In this work, the number of hidden nodes was varied between 1 and 400, and we selected a sigmoid activation function. A grid search on the training data was used to tune this parameter to achieve the maximum cross-validated accuracy. To minimize the random effects due to the weight initializations, each setting of the number of hidden nodes was run 100 times and the average performance was reported.
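The training procedure described above fits in a few lines. The class below is a minimal sketch of a binary ELM with a sigmoid hidden layer and output weights obtained through the Moore-Penrose pseudoinverse; the class name `ELM`, the normal distribution used for the random weights, and the ±1 target coding are assumptions of this illustration, not the authors' implementation.

```python
import numpy as np

class ELM:
    """Minimal binary extreme learning machine (illustrative sketch)."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the randomly projected inputs.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        # Input weights and biases are drawn once and never updated.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        targets = np.where(y == 1, 1.0, -1.0)
        # Output weights: least-squares solution of H @ beta = targets.
        self.beta = np.linalg.pinv(H) @ targets
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.beta > 0).astype(int)
```

Because the hidden layer is random, repeating `fit` with different seeds and averaging the resulting accuracies mirrors the repeated runs described in the text.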
SVM classifier.
Support vector machines (SVM) have recently become popular as supervised classifiers of fMRI data due to their high performance, their ability to deal with large high-dimensional datasets, and their flexibility in modeling diverse sources of data [4]. In the present study, we utilized a linear SVM and a non-linear SVM based on radial basis function (RBF) kernels. In SVMs, the parameters that need to be tuned are the gamma value of the kernel scale (γ) and the box constraint (C). We used a greedy search method on training data to tune these parameters to maximize cross-validated test accuracy. In this study, the search scale for selecting gamma values of kernel scale and box constraint were set to γ = [0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000], and C = [0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000], respectively.
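A parameter search over the two grids listed above can be sketched with scikit-learn's `GridSearchCV`. The synthetic data and the 5-fold inner cross-validation are assumptions of this example, and scikit-learn's `gamma` may not be parameterized in the same way as the kernel scale in the authors' software, so the numerical values are not directly comparable.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hypothetical training data: 40 subjects, 5 selected features.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 5))
X[:, 0] += 2.0 * y

grid = [0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000]
search = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": grid, "gamma": grid},
    cv=5,                 # inner cross-validation on the training data only
    scoring="accuracy",
)
search.fit(X, y)
best = search.best_params_   # dictionary with the selected C and gamma
```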
Cross-validation, performance evaluation, and significance testing methods
Cross-validation.
In this work, we used Leave-One-Out cross-validation (LOO-CV) for the ADNI2 cohort and 10-fold cross-validation (CV) for the in-house cohort. In the LOO-CV, N-1 subjects out of N were used for training, and the remaining one was left for testing, and the procedure was repeated for all the N subjects. In 10-fold CV, the subjects were randomly divided into 10 equally sized subsets: each of these subsets (folds), containing 10% of the subjects, was then used as testing set for a model trained on the remaining 90%. The mean performance of all N test subjects in LOO-CV, or all the 10 folds in 10-fold CV was reported as the final result.
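Both schemes can be expressed with one splitting routine, since LOO-CV is k-fold cross-validation with as many folds as subjects. The generator below is a sketch; the name `kfold_indices` and the seeded shuffle are assumptions, and when the number of subjects is not divisible by the number of folds the folds differ in size by at most one subject.

```python
import numpy as np

def kfold_indices(n_subjects, n_folds, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Setting n_folds equal to n_subjects gives leave-one-out cross-validation.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_subjects)         # random assignment to folds
    folds = np.array_split(order, n_folds)
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate(
            [fold for j, fold in enumerate(folds) if j != i]
        )
        yield train_idx, test_idx
```

Inside each iteration, feature selection and classifier tuning would be refit on `train_idx` only, and the held-out subjects in `test_idx` would be used solely for evaluation.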
Performance evaluation.
To evaluate the performance of the classifiers, we reported accuracy (ACC), sensitivity (SEN), specificity (SPEC), balanced accuracy (BAC), positive predictive value (PPV), and negative predictive value (NPV). TP, TN, FP, and FN indicate the number of true positives, true negatives, false positives, and false negatives, respectively. In terms of these numbers, ACC, SEN, SPEC, BAC, PPV, and NPV can be computed as follows:
- Accuracy (ACC) = (TP + TN)/(TP + TN + FP + FN)
- Sensitivity (SEN) = TP/(TP + FN)
- Specificity (SPEC) = TN/(TN + FP)
- Balanced accuracy (BAC) = (SEN + SPEC)/2
- Positive predictive value (PPV) = TP/(TP + FP)
- Negative predictive value (NPV) = TN/(TN + FN)
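The six definitions above translate directly into code. The sketch below is illustrative only: the function name is hypothetical, and the counts in the usage note are made up rather than taken from the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute ACC, SEN, SPEC, BAC, PPV, and NPV from confusion-matrix counts."""
    sen = tp / (tp + fn)   # sensitivity: fraction of patients correctly detected
    spec = tn / (tn + fp)  # specificity: fraction of controls correctly rejected
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "SEN": sen,
        "SPEC": spec,
        "BAC": (sen + spec) / 2,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }
```

For example, with hypothetical counts TP = 40, TN = 45, FP = 5, and FN = 10, the function gives ACC = 0.85, SEN = 0.80, SPEC = 0.90, and BAC = 0.85. Balanced accuracy is the more informative summary when the two groups differ in size.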
Significance testing methods.
To assess the statistical significance of the classifiers’ performance, a permutation test was performed on the classification accuracies: the labels of the test data in each of the N (LOO-CV) or 10 (10-fold CV) folds were randomly permuted 1000 times to estimate the probability of a successful classification arising by chance. In general, the lower the p-value of the permuted prediction rate against the prediction rate of the original data labels, the higher the significance of the classifier performance.
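A simplified version of this test is sketched below. It makes several assumptions not stated in the text: the out-of-sample predictions from all folds are pooled into one list, the p-value is the proportion of permutations whose accuracy is at least the observed accuracy, and a +1 correction is applied so that the estimate is never exactly zero. The function name is hypothetical.

```python
import random


def permutation_p_value(y_true, y_pred, n_permutations=1000, seed=0):
    """Return (observed accuracy, permutation p-value) for pooled predictions."""
    rng = random.Random(seed)
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    labels = list(y_true)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # break the link between labels and predictions
        acc = sum(t == p for t, p in zip(labels, y_pred)) / n
        if acc >= observed:
            hits += 1
    # The +1 terms count the observed labelling itself (assumed correction).
    return observed, (hits + 1) / (n_permutations + 1)
```

With 1000 permutations, the smallest p-value this sketch can return is 1/1001, which is consistent with reporting results as p-value<0.001.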
Results
Classification results: Univariate t-test
ADNI2 cohort.
Tables 3 and 4 summarize the classification performance in discriminating between AD and CN, and between MCI and CN, respectively, of all the competing methods on the ADNI2 cohort. In terms of the mean diagnosis accuracy, the ELM classifier with concatenated features obtained a maximal accuracy of 89.92% (p-value<0.001) with a sensitivity of 86.51%, specificity of 84.17%, balanced accuracy of 84.58%, PPV of 94.00%, and NPV of 87.40% when discriminating between AD and CN; and a maximal accuracy of 85.81% (p-value<0.001) with a sensitivity of 86.67%, specificity of 85.83%, balanced accuracy of 84.85%, PPV of 86.50%, and NPV of 90.00% when discriminating between MCI and CN. The concatenated measure outperformed all individual measures. In addition, the ELM outperformed the linear and RBF-based non-linear SVMs in terms of diagnosis accuracy in all measures, including concatenated ones.
In-house cohort.
The experimental results on the in-house dataset are summarized in Table 5 (AD against CN) and Table 6 (MCI against CN). Our proposed method with the ELM classifier achieved very high mean accuracies for all types of measures (above 90% mean accuracy for AD against CN; around 80% mean accuracy for MCI against CN). Note that, in AD vs. CN, the concatenation of all measures resulted in a maximal mean accuracy of 94.45% (p-value<0.001) with a sensitivity of 83.67%, a specificity of 96.67%, a balanced accuracy of 90.17%, a PPV of 95.67%, and an NPV of 91.07%; in MCI vs. CN, the maximal mean accuracy was 87.20% (p-value<0.001), with a sensitivity of 78.85%, a specificity of 87.50%, a balanced accuracy of 81.69%, a PPV of 84.66%, and an NPV of 81.27%. Again, the performance of the combined measures was superior to that of the individual measures. In addition, the mean accuracy of the ELM classifier was superior to that of the linear and non-linear SVMs, as can be seen in Tables 5 (AD vs. CN) and 6 (MCI vs. CN).
Classification results: Group differences and classifications
To date, there are no guidelines available for the optimal user-defined threshold of significance (p-values) to select the relevant features to be used in machine learning for the differentiation of AD and MCI vs. CN [5, 44]. To investigate the effects of univariate statistical p-values, we show in Table 7 the ELM classification performance at different p-values (p = 0.05, 0.01, and 0.001). Interestingly, the best performance was found with the least significant difference (p-value = 0.05) for both datasets and both classification problems (AD vs. CN and MCI vs. CN). Specifically, in the ADNI2 cohort, the maximal mean accuracy in AD vs. CN classification was 89.92% (p-value<0.001), with a sensitivity of 86.51% and a specificity of 84.17%; in classifying MCI vs. CN, the maximal accuracy was 85.81% (p-value<0.001), with a sensitivity of 86.67% and a specificity of 85.83%. For the in-house cohort, we achieved, in AD vs. CN classification, a maximal accuracy of 94.45% (p-value<0.001), with a sensitivity of 83.67% and a specificity of 96.67%; for the MCI vs. CN classification, the maximal accuracy was 87.20% (p-value<0.001), with a sensitivity of 78.85% and a specificity of 87.50%. We therefore conclude that a highly significant group difference (p-value = 0.01, 0.001) does not necessarily result in a stronger classification performance, and, conversely, that a high classification performance does not necessarily mean that strong differences exist between the means of the groups.
Classification results: Hybrid combination of MVPA methods
In the previous section, we reported the results using only univariate t-tests, not combined with MVPA methods, for discriminating AD and MCI from CN. In this section, we examine the hybrid combinations of t-tests and multivariate techniques, including LASSO and SVM-RFE. Table 8 presents the performance in AD and MCI discrimination using the ELM classifier with only the univariate t-test (on concatenated features), and with its combination with LASSO or SVM-RFE. The results show that the ELM classifier combined with the hybrid feature optimization framework outperformed the same classifier without feature optimization, in both cohorts and in both AD and MCI discrimination (accuracies up to 98.86% for AD and 98.57% for MCI diagnosis in the ADNI2 cohort; up to 98.70% for AD and 94.16% for MCI diagnosis in the in-house cohort). In addition, the ELM performance with the combined univariate t-test and SVM-RFE is clearly superior to that with the combined univariate t-test and LASSO. Interestingly, the hybrid combinations of the univariate t-test at different threshold p-values with SVM-RFE resulted in similar accuracies. This can be explained as follows: we chose the highest-ranked features using a grid search with cross-validation on the training data only, and SVM-RFE eliminated the remaining, low-ranked features. Even with different p-value thresholds, the number of highest-ranked features retained was the same for the classifiers, which resulted in equal performance.
Discussion
Comparison with previous studies
In recent years, many studies have been carried out to classify AD/MCI subjects using rs-fMRI. Studies based on the use of a binary classification reported accuracies from about 75% to about 95% [18, 19]. Table 9 summarizes the results of recently published studies using rs-fMRI neuroimaging-based machine learning to discriminate AD and MCI from CN and compares them with our results. It should be noted that our method outperformed the ones proposed in [26, 50–52], which used the same MCI and CN subject selection from the ADNI2 cohort. Direct performance comparison with other studies would not be fair, because of the different datasets, preprocessing pipelines, feature measures, and classifiers. Nevertheless, it is noteworthy that the method we propose achieved the highest accuracy among all the methods described in the classification of AD and MCI vs. CN using only rs-fMRI data.
Feature selection techniques on ADNI cohort
Recent years have seen wide application of MVPA feature selection methods to neuroimaging data sets from the public ADNI cohorts. In Table 10, we summarize the results of previous works that applied univariate and MVPA methods, as well as their hybrid combinations, for discriminating AD and MCI patients. In a recent study [60], Kim et al. proposed a multimodal hierarchical ELM integrated with t-test and LASSO, applied to ROI-based features, for classification of AD and MCI against CN. Volume and mean intensity extracted from 93 ROIs of preprocessed MRI and FDG-PET images, respectively, as well as CSF values, were used as features. The maximal accuracies achieved by the t-test method were 96.11% and 86.15%, while the LASSO-based method achieved 96.03% and 86.17% for AD and MCI vs. CN, respectively. A similar AD/MCI identification framework [61] used a multiple-kernel SVM method to combine the biomarkers of three modalities (MRI, FDG-PET, and CSF). Simple feature selection based on the t-test was implemented, leading to the highest classification accuracies of 93.2% and 76.4% for AD and MCI diagnosis, respectively, compared to using all features. Another study [62] used LASSO-based feature selection on GM and WM volume maps to achieve maximal accuracies of 85.7% and 81.1%. Hidalgo-Muñoz et al. [63] compared voxel-wise feature selection methods, i.e., the univariate t-test and multivariate SVM-RFE, for classification of AD patients from CN using segmented GM and WM maps. Their results suggested that SVM-RFE selects discriminant features more efficiently than t-test significance for classification purposes (99.7% vs. 93.2%). Using rs-fMRI data, Khazaee et al. [56] computed functional brain network-based features, and used the univariate Fisher score for feature selection and SVM as the classifier for AD classification, achieving a maximum accuracy of 97%.
More recently, other MVPA techniques, such as principal component analysis (PCA) and independent component analysis (ICA), have been developed to keep informative features while disregarding uninformative sources of noise. Salvatore et al. used PCA to reduce the dimensions of WM and GM density maps [64]. The reduced density maps were used by SVM classifiers to identify AD (accuracy = 76%) and MCI (accuracy = 72%) patients from CN. Similar predictive improvements due to a single MVPA feature selection method or to hybrid combinations were obtained in unimodal rs-fMRI studies [26, 27], sMRI [60, 63, 65], PET [60, 66], and multimodal sMRI+PET [60, 66].
Hybrid combinations of feature selection methods have been shown to diagnose AD and MCI successfully. In studies [26, 27], Wee et al. combined two filter-based methods (t-test and minimum redundancy and maximum relevance, mRMR) and the wrapper-based SVM-RFE method to select the most discriminative functional connectivity features extracted from rs-fMRI images. They reported maximum accuracies of 92.35% and 84% for identification of AD and MCI patients from healthy controls. In other studies [67, 68], a new hybrid voxel-wise feature selection approach that combines the t-test with a Fisher criterion-based genetic algorithm was proposed to predict AD patients from CNs using segmented GM images. They reported that the hybrid method’s performance (accuracy: 93.01%) is superior to that of a PCA-based feature selection method (88.70%) and to that with no feature selection (accuracy: 87.63%). In addition, combinations of PCA with LDA and FDR (Fisher discriminant ratio) as feature selection methods outperform the whole-brain voxel-wise approach, achieving AD classification accuracies of up to 96.7% and 89.5% for PET and SPECT images, respectively [69].
By contrast, some studies have reported that feature selection without utilizing prior knowledge did not increase classification accuracy. Chu et al. [70] compared four common feature selection methods: 1) pre-selected ROIs based on prior knowledge, 2) univariate t-test, 3) RFE, and 4) t-test constrained by ROIs, applied to segmented GM maps from T1 MRI scans of three groups (AD, MCI, CN). Surprisingly, the results showed that: 1) the predictive accuracies with either the univariate t-test or RFE were no better than those achieved using the whole-brain data, and 2) the hybrid method (t-test + ROI), which used the ROI as a spatial constraint and the t-test for ranking features, did show significant improvements in classification accuracy in AD vs. CN and MCI vs. CN. Similarly, voxel-wise hybrid combinations of t-test and SVM-RFE applied to whole-brain GM maps did not significantly improve AD and MCI diagnosis performance compared to the whole-GM approach [65].
Hybrid combinations of feature selection methods have also been used for AD and MCI classification in cohorts other than the standard ADNI data sets. For example, Jie et al. [28] combined t-test and RFE to select the most discriminative topological features extracted from fMRI scans for discriminating MCI from CN subjects. They reported a maximal accuracy of 91.9%. Another study [31] utilized a hybrid feature selection approach that combines three filter-based and two wrapper-based methods, and compared the performance of six different combinations of them. They reported a best accuracy of 90.4% using the proposed hybrid approach with an SVM classifier in LOO-CV for the diagnosis of AD patients from the Open Access Series of Imaging Studies (OASIS) database (http://www.oasis-brains.org/).
The benefits of MVPA feature reduction methods
It is known that the performance of pattern recognition methods such as SVM and ELM decreases as the number of non-informative features increases [19]. Machine learning techniques take advantage of the multivariate nature of the fMRI data and are able to identify maximally discriminative spatial patterns [58]. In the present work, we have examined and assessed an approach for fMRI pattern discrimination analysis based on ELM and hybrid combinations of multi-voxel feature reduction methods, both univariate and MVPA. Our results show that the conventional univariate t-test, used alone, can be combined with a classifier for identification of AD/MCI patients. In addition, as shown in Table 7, a very low p-value cut-off does not guarantee a strongly informative feature, while a larger p-value does not necessarily indicate an irrelevant feature. Thus, discarding voxels based only on the results of statistical tests sensitive to group means could lead to a loss of discriminative ability. Therefore, additional MVPA methods should be used in combination with the univariate group-level t-test.
We also demonstrated that the hybrid combination of multi-voxel methods (t-test + SVM-RFE and t-test + LASSO) increases the discriminative power of the patterns (Table 8). In our study, we searched for the most relevant discriminative patterns using SVM-RFE, which iteratively eliminates the lowest-ranked patterns based on multivariate information classified by an RBF-based SVM, and LASSO, which chooses the sparse features that contribute the most to the accuracy of the model during training. It is worth noting that, because of the lower sensitivity of the univariate method, the most sensible setting for combining univariate and multivariate methods is to use larger p-value thresholds (thus preventing the exclusion of potentially relevant voxels), and then to remove irrelevant voxels based on multivariate ranking functions.
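The two-stage idea (a liberal univariate filter followed by recursive elimination on a multivariate ranking) can be sketched as follows. This is a simplified stand-in and not the pipeline used in this study: it assumes NumPy, thresholds the Welch t statistic directly instead of a p-value, and ranks features by the squared weights of a ridge-regularized linear model instead of an RBF-based SVM. All function names and default values are hypothetical.

```python
import numpy as np


def t_statistics(X, y):
    """Welch two-sample t statistic per feature (column) for labels y in {0, 1}."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return (a.mean(axis=0) - b.mean(axis=0)) / se


def linear_weights(X, y, ridge=1.0):
    """Ridge least-squares weights (assumed stand-in for the SVM ranking)."""
    Xc = X - X.mean(axis=0)
    t = 2.0 * y - 1.0
    A = Xc.T @ Xc + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, Xc.T @ (t - t.mean()))


def hybrid_select(X, y, t_threshold=2.0, n_keep=10, step=1):
    """Stage 1: liberal univariate filter. Stage 2: recursive elimination."""
    kept = np.flatnonzero(np.abs(t_statistics(X, y)) > t_threshold)
    while kept.size > n_keep:
        w = linear_weights(X[:, kept], y)
        n_drop = min(step, kept.size - n_keep)
        drop = np.argsort(w ** 2)[:n_drop]  # smallest squared weights go first
        kept = np.delete(kept, drop)
    return kept
```

Both stages would be run on the training data of each fold only, so that the held-out subjects never influence which voxels are retained.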
Clinical significance of the results
The regions showing significant changes in a univariate t-test play an important role in achieving highly accurate differential diagnosis when used in combination with MVPA feature reduction methods. The following discussion of the significant regions may have clinical relevance.
We showed that the highest discrimination was achieved when all information from the regional coherence and functional connectivity measures was combined. This may imply that different parts of the brain undergo different functional failures as a consequence of AD/MCI. Therefore, classification methods should include the maximum amount of informative change to achieve optimal discrimination.
One important finding of the current study is that the significant regional features depend on the dataset. Our binary classification results across folds indicated that the significant features change when the cross-validation subgroups of AD and MCI subjects change. Therefore, no specific regional feature can be labeled as a global biomarker for AD or MCI diagnosis. For instance, Figs 5 and 6 present an example of the statistical group-level differences between AD and CN, and between MCI and CN, for all measures of a CV fold. Regions with significant changes were mostly located in the DMN (mainly involving the prefrontal cortex, the PCu, and the PCC).
Voxels with p-value<0.05 and a cluster size of 85 voxels (2295 mm³), corresponding to a corrected p-value<0.05, were used to identify the significant clusters. Hot and cold colours indicate AD-related increases and decreases in the measures, respectively.
Voxels with p-value<0.05 and a cluster size of 85 voxels (2295 mm³), corresponding to a corrected p-value<0.05, were used to identify the significant clusters. Hot and cold colours indicate MCI-related increases and decreases in the measures, respectively.
Limitations and future perspectives
Notwithstanding the discriminative power of the framework we presented for AD and MCI, this work has several limitations. First, the limited sample size of the in-house cohort (81 AD, 132 MCI, and 152 CN), and especially of the ADNI2 cohort (33 AD, 31 MCI, and 31 CN), limited what the algorithm could learn during the training phase. Such small datasets do not adequately represent the patient population, so the generalization of our results to other groups is not guaranteed.
A second limitation concerns model complexity, as our proposed voxel-wise method may require more computation and resources than methods based on regions-of-interest (ROIs). However, the computation and resource burden occurs only in the training phase, which can be performed offline, whereas the computation for testing consists of simple functions. Thus, from the clinical perspective, we believe this limitation is acceptable considering the better accuracies obtained.
Third, our multi-measure classification framework only considers functional MRI data. However, it is expected that combining as many modalities as possible would be advantageous for the discrimination of AD and MCI from CN [71]. Accordingly, in future studies, we plan to develop a multi-modal classification framework combining multiple data sources, including structural MRI and PET data.
Conclusion
In conclusion, we demonstrated the feasibility of using rs-fMRI scans for AD/MCI prediction in individual subjects. Using a standard Alzheimer’s disease Neuroimaging Initiative cohort and an in-house AD cohort from South Korea, the proposed framework extracts the maximum amount of information on AD/MCI-related changes from concatenations of multiple rs-fMRI biomarkers, which led to the highest classification accuracies among the recent studies compared. The combination of t-test-based univariate and RFE-based multivariate feature selection, performed on the concatenated measure extracted from rs-fMRI data, provided the best discriminative performance when the selected features were used by the ELM classifier, which was superior to the linear and non-linear SVM classifiers. These results may direct future studies using rs-fMRI scans for the classification of patients with preclinical AD or MCI.
Supporting information
S1 Table. The subject IDs of three groups of ADNI2 cohort used in this study.
https://doi.org/10.1371/journal.pone.0212582.s001
(DOC)
Acknowledgments
The authors thankfully acknowledge the ADNI consortium for the use of the released dataset, and also thank Dr. Heung-Il Suk and Chong-Yaw Wee for sharing the ADNI2 subject IDs.
References
- 1. Braak H, Braak E. Neuropathological stageing of Alzheimer-related changes. Acta Neuropathol. 1991;82: 239–259. pmid:1759558
- 2. Zheng W, Yao Z, Xie Y, Fan J, Hu B. Identification of Alzheimer’s Disease and Mild Cognitive Impairment Using Networks Constructed Based on Multiple Morphological Brain Features. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging. 2018; pmid:30077576
- 3. Khazaee A, Ebrahimzadeh A, Babajani-Feremi A. Classification of patients with MCI and AD from healthy controls using directed graph measures of resting-state fMRI. Behav Brain Res. 2017;322: 339–350. pmid:27345822
- 4. Mahmoudi A, Takerkart S, Regragui F, Boussaoud D, Brovelli A. Multivoxel Pattern Analysis for fMRI Data: A Review. Comput Math Methods Med. 2012;2012: 1–14.
- 5. Mwangi B, Tian TS, Soares JC. A Review of Feature Reduction Techniques in Neuroimaging. Neuroinformatics. 2013;12: 229–244.
- 6. Ashburner J, Friston KJ. Voxel-Based Morphometry—The Methods. Neuroimage. 2000;11: 805–821. pmid:10860804
- 7. Yang Z, Fang F, Weng X. Recent developments in multivariate pattern analysis for functional MRI. Neurosci Bull. 2012. pmid:22833038
- 8. Chaves R, Ramírez J, Górriz JM, López M, Salas-Gonzalez D, Alvarez I, et al. SVM-based computer-aided diagnosis of the Alzheimer’s disease using t-test NMSE feature selection with feature correlation weighting. Neurosci Lett. 2009;461: 293–297. pmid:19549559
- 9. Zang Y, Jiang T, Lu Y, He Y, Tian L. Regional homogeneity approach to fMRI data analysis. Neuroimage. 2004;22: 394–400. pmid:15110032
- 10. He Y, Wang L, Zang Y, Tian L, Zhang X, Li K, et al. Regional coherence changes in the early stages of Alzheimer’s disease: a combined structural and resting-state functional MRI study. Neuroimage. 2007;35: 488–500. pmid:17254803
- 11. Zou Q-H, Zhu C-Z, Yang Y, Zuo X-N, Long X-Y, Cao Q-J, et al. An improved approach to detection of amplitude of low-frequency fluctuation (ALFF) for resting-state fMRI: fractional ALFF. J Neurosci Methods. 2008;172: 137–141. pmid:18501969
- 12. Guo Z, Liu X, Li J, Wei F, Hou H, Chen X, et al. Fractional amplitude of low-frequency fluctuations is disrupted in Alzheimer’s disease with depression. Clin Neurophysiol. 2017;128: 1344–1349. pmid:28570868
- 13. Li Y, Jing B, Liu H, Li Y, Gao X, Li Y, et al. Frequency-Dependent Changes in the Amplitude of Low-Frequency Fluctuations in Mild Cognitive Impairment with Mild Depression. J Alzheimers Dis. 2017;58: 1175–1187. pmid:28550250
- 14. Lin Q, Rosenberg MD, Yoo K, Hsu TW, O’Connell TP, Chun MM. Resting-State Functional Connectivity Predicts Cognitive Impairment Related to Alzheimer’s Disease. Front Aging Neurosci. 2018;10: 94. pmid:29706883
- 15. Han Y, Wang J, Zhao Z, Min B, Lu J, Li K, et al. Frequency-dependent changes in the amplitude of low-frequency fluctuations in amnestic mild cognitive impairment: a resting-state fMRI study. Neuroimage. 2011;55: 287–295. pmid:21118724
- 16. Li Y, Wang X, Li Y, Sun Y, Sheng C, Li H, et al. Abnormal Resting-State Functional Connectivity Strength in Mild Cognitive Impairment and Its Conversion to Alzheimer’s Disease. Neural Plast. 2016;2016: 4680972. pmid:26843991
- 17. Dai Z, Yan C, Li K, Wang Z, Wang J, Cao M, et al. Identifying and Mapping Connectivity Patterns of Brain Network Hubs in Alzheimer’s Disease. Cereb Cortex. 2015;25: 3723–3742. pmid:25331602
- 18. Rathore S, Habes M, Iftikhar MA, Shacklett A, Davatzikos C. A review on neuroimaging-based classification studies and associated feature extraction methods for Alzheimer’s disease and its prodromal stages. Neuroimage. 2017;155: 530–548. pmid:28414186
- 19. Arbabshirani MR, Plis S, Sui J, Calhoun VD. Single subject prediction of brain disorders in neuroimaging: Promises and pitfalls. Neuroimage. 2017;145: 137–165. pmid:27012503
- 20. Guyon I, Weston J, Barnhill S, Vapnik V. Gene Selection for Cancer Classification using Support Vector Machines. Mach Learn. Kluwer Academic Publishers; 2002;46: 389–422.
- 21. De Martino F, Valente G, Staeren N, Ashburner J, Goebel R, Formisano E. Combining multivariate voxel selection and support vector machines for mapping and classification of fMRI spatial patterns. Neuroimage. 2008;43: 44–58. pmid:18672070
- 22. Mishra S, Mishra D. SVM-BT-RFE: An improved gene selection framework using Bayesian T-test embedded in support vector machine (recursive feature elimination) algorithm. Karbala International Journal of Modern Science, 2015.
- 23. Qureshi MNI, Min B, Jo HJ, Lee B. Multiclass classification for the differential diagnosis on the ADHD subtypes using recursive feature elimination and hierarchical extreme learning machine: structural MRI study. PLoS One. 2016;11: e0160697. pmid:27500640
- 24. Qureshi MNI, Oh J, Min B, Jo HJ, Lee B. Multi-modal, multi-measure, and multi-class discrimination of ADHD with hierarchical feature extraction and extreme learning machine using structural and functional brain MRI. Front Hum Neurosci. 2017;11: 157. pmid:28420972
- 25. Dai D, Wang J, Hua J, He H. Classification of ADHD children through multimodal magnetic resonance imaging. Front Syst Neurosci. 2012 Sep 3;6:63. pmid:22969710
- 26. Wee CY, Yap PT, Zhang D, Wang L, Shen D. Group-constrained sparse fMRI connectivity modeling for mild cognitive impairment identification. Brain Struct Funct. 2014 pmid:23468090
- 27. Wee CY, Yap PT, Shen D; Alzheimer's Disease Neuroimaging Initiative. Prediction of Alzheimer's disease and mild cognitive impairment using cortical morphological patterns. Hum Brain Mapp. 2013 Dec;34(12):3411–25. pmid:22927119
- 28. Jie B, Zhang D, Wee CY, Shen D. Topological graph kernel on multiple thresholded functional connectivity networks for mild cognitive impairment classification. Hum Brain Mapp. 2014 Jul;35(7):2876–97. pmid:24038749
- 29. Wee CY, Wang L, Shi F, Yap PT, Shen D. Diagnosis of autism spectrum disorders using regional and interregional morphological features. Hum Brain Mapp. 2014 Jul;35(7):3414–30. pmid:25050428
- 30. Falahati F, Westman E, Simmons A. Multivariate data analysis and machine learning in Alzheimer's disease with a focus on structural magnetic resonance imaging. J Alzheimers Dis. 2014. pmid:24718104
- 31. Dai D, Huiguang H, Joshua TV, Zengguang H. Accurate prediction of AD patients using cortical thickness networks. Machine Vision and Applications (2013).
- 32. Li X, Peng S, Chen J, Lü B, Zhang H, Lai M. SVM-T-RFE: a novel gene selection algorithm for identifying metastasis-related genes in colorectal cancer using gene expression profiles. Biochem Biophys Res Commun. 2012. pmid:22306013
- 33. Chao-Gan Y, Yu-Feng Z. DPARSF: A MATLAB Toolbox for “Pipeline” Data Analysis of Resting-State fMRI. Front Syst Neurosci. 2010;4: 13. pmid:20577591
- 34. Ashburner J. A fast diffeomorphic image registration algorithm. Neuroimage. 2007;38: 95–113. pmid:17761438
- 35. Song X-W, Dong Z-Y, Long X-Y, Li S-F, Zuo X-N, Zhu C-Z, et al. REST: a toolkit for resting-state functional magnetic resonance imaging data processing. PLoS One. 2011;6: e25031. pmid:21949842
- 36. Kendall M, Gibbons JDR. Correlation methods. Oxford: Oxford University Press; 1990.
- 37. Zang Y, Jiang T, Lu Y, He Y, Tian L. Regional homogeneity approach to fMRI data analysis. Neuroimage. 2004;22: 394–400. pmid:15110032
- 38. Zang Y-F, He Y, Zhu C-Z, Cao Q-J, Sui M-Q, Liang M, et al. Altered baseline brain activity in children with ADHD revealed by resting-state functional MRI. Brain Dev. 2007;29: 83–91. pmid:16919409
- 39. Zhou Y, Wang Y, Rao L-L, Liang Z-Y, Chen X-P, Zheng D, et al. Disrutpted resting-state functional architecture of the brain after 45-day simulated microgravity. Front Behav Neurosci. 2014;8. pmid:24926242
- 40. Zuo X-N, Ehmke R, Mennes M, Imperati D, Xavier Castellanos F, Sporns O, et al. Network Centrality in the Human Functional Connectome. Cereb Cortex. 2011;22: 1862–1875. pmid:21968567
- 41. Zhan Z-W, Lin L-Z, Yu E-H, Xin J-W, Lin L, Lin H-L, et al. Abnormal resting-state functional connectivity in posterior cingulate cortex of Parkinson’s disease with mild cognitive impairment and dementia. CNS Neurosci Ther. 2018; pmid:29500931
- 42. Qureshi MNI, Oh J, Cho D, Jo HJ, Lee B. Multimodal Discrimination of Schizophrenia Using Hybrid Weighted Feature Concatenation of Brain Functional Connectivity and Anatomical Features with an Extreme Learning Machine. Front Neuroinform. 2017;11: 59. pmid:28943848
- 43. Calhoun VD, Sui J. Multimodal Fusion of Brain Imaging Data: A Key to Finding the Missing Link(s) in Complex Mental Illness. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging. 2016;1: 230–244.
- 44. Mwangi B, Ebmeier KP, Matthews K, Steele JD. Multi-centre diagnostic classification of individual structural neuroimaging scans from patients with major depressive disorder. Brain. 2012;135: 1508–1521. pmid:22544901
- 45. Wee C-Y, Yap P-T, Zhang D, Denny K, Browndyke JN, Potter GG, et al. Identification of MCI individuals using structural and functional connectivity networks. Neuroimage. 2012;59: 2045–2056. pmid:22019883
- 46. Yan K, Zhang D. Feature selection and analysis on correlated gas sensor data with recursive feature elimination. Sens Actuators B Chem. Elsevier; 2015;212: 353–363.
- 47. Tibshirani R. Regression Shrinkage and Selection via the Lasso. J R Stat Soc Series B Stat Methodol. [Royal Statistical Society, Wiley]; 1996;58: 267–288.
- 48. Wu TT, Chen YF, Hastie T, Sobel E, Lange K. Genome-wide association analysis by lasso penalized logistic regression. Bioinformatics. academic.oup.com; 2009;25: 714–721. pmid:19176549
- 49. Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. NIH Public Access; 2010;33: 1. pmid:20808728
- 50. Eavani H, Satterthwaite TD, Gur RE, Gur RC, Davatzikos C. Unsupervised learning of functional network dynamics in resting state fMRI. Inf Process Med Imaging. 2013;23: 426–437. pmid:24683988
- 51. Leonardi N, Richiardi J, Gschwind M, Simioni S, Annoni J-M, Schluep M, et al. Principal components of functional connectivity: a new approach to study dynamic brain connectivity during rest. Neuroimage. 2013;83: 937–950. pmid:23872496
- 52. Suk H-I, Wee C-Y, Lee S-W, Shen D. State-space model with deep learning for functional dynamics estimation in resting-state fMRI. Neuroimage. 2016;129: 292–307. pmid:26774612
- 53. de Vos F, Koini M, Schouten TM, Seiler S, van der Grond J, Lechner A, et al. A comprehensive analysis of resting state fMRI measures to classify individual patients with Alzheimer’s disease. Neuroimage. 2018;167: 62–72. pmid:29155080
- 54. Zhou J, Greicius MD, Gennatas ED, Growdon ME, Jang JY, Rabinovici GD, et al. Divergent network connectivity changes in behavioural variant frontotemporal dementia and Alzheimer’s disease. Brain. 2010;133: 1352–1367. pmid:20410145
- 55. Wu X, Li J, Ayutyanont N, Protas H, Jagust W, Fleisher A, et al. The receiver operational characteristic for binary classification with multiple indices and its application to the neuroimaging study of Alzheimer’s disease. IEEE/ACM Trans Comput Biol Bioinform. 2013;10: 173–180. pmid:23702553
- 56. Khazaee A, Ebrahimzadeh A, Babajani-Feremi A. Identifying patients with Alzheimer’s disease using resting-state fMRI and graph theory. Clin Neurophysiol. 2015;126: 2132–2141. pmid:25907414
- 57. Challis E, Hurley P, Serra L, Bozzali M, Oliver S, Cercignani M. Gaussian process classification of Alzheimer’s disease and mild cognitive impairment from resting-state fMRI. Neuroimage. 2015;112: 232–243. pmid:25731993
- 58. Jie B, Zhang D, Gao W, Wang Q, Wee C-Y, Shen D. Integration of network topological and connectivity properties for neuroimaging classification. IEEE Trans Biomed Eng. 2014;61: 576–589. pmid:24108708
- 59. Beltrachini L, De Marco M, Taylor ZA, Lotjonen J, Frangi AF, Venneri A. Integration of Cognitive Tests and Resting State fMRI for the Individual Identification of Mild Cognitive Impairment. Curr Alzheimer Res. 2015;12: 592–603. pmid:26238814
- 60. Kim J, Lee B. Identification of Alzheimer's disease and mild cognitive impairment using multimodal sparse hierarchical extreme learning machine. Hum Brain Mapp. 2018 pmid:29736986
- 61. Zhang D, Wang Y, Zhou L, Yuan H, Shen D; Alzheimer's Disease Neuroimaging Initiative. Multimodal classification of Alzheimer's disease and mild cognitive impairment. Neuroimage. 2011 pmid:21236349
- 62. Casanova R, Whitlow CT, Wagner B, Williamson J, Shumaker SA, Maldjian JA, Espeland MA. High dimensional classification of structural MRI Alzheimer's disease data based on large scale regularization. Front Neuroinform. 2011 pmid:22016732
- 63. Hidalgo-Muñoz AR, Ramírez J, Górriz JM, Padilla P. Regions of interest computed by SVM wrapped method for Alzheimer's disease examination from segmented MRI. Front Aging Neurosci. 2014. pmid:24634656
- 64. Salvatore C, Cerasa A, Battista P, Gilardi MC, Quattrone A, Castiglioni I; Alzheimer's Disease Neuroimaging Initiative. Magnetic resonance imaging biomarkers for the early diagnosis of Alzheimer's disease: a machine learning approach. Front Neurosci. 2015. pmid:26388719
- 65. Retico A, Bosco P, Cerello P, Fiorina E, Chincarini A, Fantacci ME. Predictive Models Based on Support Vector Machines: Whole-Brain versus Regional Analysis of Structural MRI in the Alzheimer's Disease. J Neuroimaging. 2015. pmid:25291354
- 66. Ota K, Oishi N, Ito K, Fukuyama H; SEAD-J Study Group; Alzheimer's Disease Neuroimaging Initiative. Effects of imaging modalities, brain atlases and feature selection on prediction of Alzheimer's disease. J Neurosci Methods. 2015. pmid:26318777
- 67. Beheshti I, Demirel H; Alzheimer’s Disease Neuroimaging Initiative. Feature-ranking-based Alzheimer's disease classification from structural MRI. Magn Reson Imaging. 2016. pmid:26657976
- 68. Beheshti I, Demirel H, Matsuda H; Alzheimer's Disease Neuroimaging Initiative. Classification of Alzheimer's disease and prediction of mild cognitive impairment-to-Alzheimer's conversion from structural magnetic resource imaging using feature ranking and a genetic algorithm. Comput Biol Med. 2017. pmid:28260614
- 69. López M, Ramírez J, Górriz JM, Álvarez I, Salas-Gonzalez D, Segovia F, et al. Principal component analysis-based techniques and supervised classification schemes for the early detection of Alzheimer's disease. Neurocomputing. 2011.
- 70. Chu C, Hsu AL, Chou KH, Bandettini P, Lin C; Alzheimer's Disease Neuroimaging Initiative. Does feature selection improve classification accuracy? Impact of sample size and feature selection on classification using anatomical magnetic resonance images. Neuroimage. 2012. pmid:22166797
- 71. Duc NT, Lee B. Microstate functional connectivity in EEG cognitive task revealed by multivariate Gaussian hidden Markov model with phase locking value. J Neural Eng. 2019. pmid:30673644