Prediction and classification of Alzheimer disease based on quantification of MRI deformation

Detecting early morphological changes in the brain and making early diagnosis are important for Alzheimer’s disease (AD). High resolution magnetic resonance imaging can be used to help diagnosis and prediction of the disease. In this paper, we proposed a machine learning method to discriminate patients with AD or mild cognitive impairment (MCI) from healthy elderly and to predict the AD conversion in MCI patients by computing and analyzing the regional morphological differences of brain between groups. Distance between each pair of subjects was quantified from a symmetric diffeomorphic registration, followed by an embedding algorithm and a learning approach for classification. The proposed method obtained accuracy of 96.5% in differentiating mild AD from healthy elderly with the whole-brain gray matter or temporal lobe as region of interest (ROI), 91.74% in differentiating progressive MCI from healthy elderly and 88.99% in classifying progressive MCI versus stable MCI with amygdala or hippocampus as ROI. This deformation-based method has made full use of the pair-wise macroscopic shape difference between groups and consequently increased the power for discrimination.


Introduction
Alzheimer disease (AD), the most common form of dementia, is known for its unresolved etiology and pathophysiology. Neurofibrillary tangles, plaque buildup and tissue loss in the brain parenchyma [1,2] suggest the progressive degenerative nature of the disease. Early detection of AD at the preclinical stage is of great importance in terms of patient management. Since the earliest symptoms of AD, such as short-term memory loss and paranoid suspicion, are often mistaken as related to aging and stress, or are confused with symptoms resulting from other brain disorders, it remains challenging to predict the disease onset and the dynamics of AD in the scenario of dementia until it manifests as severe cognitive impairment with typical neuroimaging signs.
AD is usually diagnosed clinically from the patient history and cognitive impairment testing [3]. Interviews with family members and caregivers are also utilized in the assessment of the disease [4]. Diagnosis based on neuropsychological scales requires rich clinical experience of physicians, and as a result it is subjective and less repeatable. Moreover, it is more challenging to identify patients at a prodromal stage of AD, named mild cognitive impairment (MCI), as these subjects have cognitive impairments beyond that expected for their age and education but do not meet neuropathological criteria for AD. Neuroimaging, especially high resolution magnetic resonance imaging (MRI), was recommended in more precise research criteria for prediction or early diagnosis of AD [5]. Structural MR images provide additional information about abnormal tissue atrophy or other abnormal biomarkers that can be sensitively detected at the early stage of the disease. Automatic image analysis methods are therefore desired to help diagnose the illness before irreversible neuronal loss has set in, or to help detect brain differences between patients who will and will not convert to AD [6].
To this end, many algorithms for distinguishing AD or MCI have been proposed, ranging from conceptually simple measurement of volumes and mathematically complex descriptions of shape difference in a priori regions of interest (ROI) [7][8][9][10][11][12][13], to voxel-wise modeling of tissue density changes on the whole brain region, e.g. voxel-wise morphometry [11][14][15][16][17][18]. There has been interest in machine learning and computer-aided diagnostics in the field of medical imaging, where a machine learning algorithm is trained to produce a desired output from a set of input training data such as features obtained from voxel intensity, tissue density or shape descriptors. Machine learning diagnostics can also be divided into ROI based and whole-brain based methods. ROI based algorithms usually focus on the medial temporal structures of the brain, including the hippocampus and entorhinal cortex. In the work of Chupin et al. [19], Gutman et al. [20] and Gerardin et al. [21], support vector machines (SVM) were used for classification of AD or MCI subjects with hippocampal volume or shape as features. Another study compared linear discriminant analysis (LDA) and SVM for MCI classification and prediction based on hippocampal volume [22]. The entorhinal cortical thickness and modified tissue density in the amygdala and parahippocampal gyrus have also been used as features in AD and MCI discrimination [23,24]. ROI based analyses typically do not make use of all the available information contained in the whole brain, and require a priori decisions concerning which structures to assess. Atrophy in the inferior-lateral temporal lobes, cingulate gyrus, and in the parietal and frontal lobes has also been reported [25,26]. Whether hippocampus, medial temporal lobe, or other ROIs would be a better choice for discrimination or prediction of AD is still controversial.
Algorithms that extracted features from wider or cohort-adaptive brain regions have been proposed [27][28][29][30][31][32]. Kloppel et al. [33] developed a supervised method using linear SVM to group the gray matter segment of T1-weighted MR images on a high dimensional space, treating voxels as coordinates and intensity value at each voxel as their location. Aguilar et al. [34] explored the classification performance of orthogonal projections to latent structures (OPLS), decision trees, artificial neural networks (ANN), and SVM based on 10 features selected from 23 volumetric and 34 cortical thickness variables. Beheshti et al. [35] combined voxel-based morphometry and Fisher Criterion for feature selection and reduction over the entire brain, followed by SVM for classification. The whole-brain techniques have shown high discriminative power for individual diagnoses.
In this paper, we proposed a deformation-based machine learning method that quantified deformation field between subjects as distance and projected each subject onto a low dimensional Euclidean space in which a machine learning algorithm was applied to classify groups of

Data and subjects
Data used in the study were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu/). The ADNI was launched in 2003 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies and non-profit organizations, as a $60 million, 5-year public-private partnership. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early Alzheimer's disease (AD). ADNI is the result of efforts of many investigators from a broad range of academic institutions and private corporations, and subjects have been recruited from over 50 sites across the U.S. and Canada. The ADNI study was approved by IRB of all participating sites. Written informed consent was provided by all subjects and if applicable, their legal representatives. For up-to-date information, see www.adni-info.org. Data from a total of 427 subjects was retrieved from the ADNI database for whom preprocessed images and FreeSurfer post-processed images were available. The subjects were categorized into groups of normal elderly controls (NC) (n = 135, aged 76.19±5.48), stable MCI subjects (sMCI) (n = 132, aged 75.25±7.27) who had not converted to AD within 36 months, progressive MCI subjects (pMCI) (n = 95, aged 75.1±7.05) who had converted to AD 36 months after their baseline visit, and mild AD patients (n = 65, aged 75.58±8.39). 
The criteria used to characterize and to track a patient's level of impairment were as follows: normal controls had a CDR (Clinical Dementia Rating) of 0 and MMSE (Mini-Mental State Examination) score between 24 and 30, MCI subjects had a CDR of 0.5 and MMSE score between 22 and 30, and mild AD patients had a CDR of 1 and MMSE score between 20 and 26 at the baseline test. Detailed demographic information of the studied population was listed in Table 1.
The baseline 3D T1-weighted image of each subject was segmented and parcellated using FreeSurfer (http://surfer.nmr.mgh.harvard.edu/). In this study, we chose only the subjects with FreeSurfer processing provided in the database, in order to exclude segmentation variance due to different software-related settings and standards of quality control. The FreeSurfer processing in ADNI was performed by the team from the Center for Imaging of Neurodegenerative Diseases, UCSF. The analysis was completed using Version 4.3, and quality control was conducted with both global and regional assessment, including checking of the skull-stripped brain mask, surface segmentation and generation. The classical pipeline (recon-all) was applied to each image, including intensity normalization, skull stripping, alignment to a standard space, tissue partition, surface reconstruction and inflation, spherical mapping to a standard coordinate system, as well as parcellation of the cerebral cortex [36][37][38][39][40]. The whole-brain gray matter (GM), whole-brain white matter (WM), frontal lobe, parietal lobe, occipital lobe, temporal lobe, cingulate cortex, as well as amygdala, hippocampus, caudate, putamen, globus pallidus, and thalamus were selected as regions of interest (ROI) (Fig 1).

Registration and distance metric
Images of each subject were affinely aligned to the MNI space using FSL FLIRT (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FLIRT) prior to deformable registration, so that differences in subject positioning were removed and true differences in shape could be detected. The symmetric log-domain diffeomorphic demons algorithm was used for the deformable registration; its output deformation field is invertible and symmetric with respect to the order of the inputs [41]. The algorithm defines a smooth and continuous mapping $\phi(\cdot)$ that best aligns two images $I_0(\cdot)$ and $I_1(\cdot)$. The global energy function of diffeomorphic demons is

$$E(I_0, I_1, \phi, u) = \| I_0 - I_1 \circ \phi \circ \exp(u) \|^2 + \| u \|^2$$

where $u$ is the smooth update field and $\circ$ denotes a warping operation. The optimization is performed within the space of diffeomorphisms using updates of the form $\phi \circ \exp(u)$. If $\phi$ is also represented as an exponential of a smooth velocity field $v$, i.e. $\phi = \exp(v)$, then diffeomorphic demons is extended to represent the complete spatial transformation in the log domain; the algorithm is therefore called log-domain diffeomorphic demons. The algorithm uses the approximation

$$\phi \circ \exp(u) = \exp(v) \circ \exp(u) \approx \exp(Z(v, u))$$

where $Z(v, u)$ is a velocity field. The log-domain diffeomorphic demons registration has a symmetric (or inverse-consistent) extension, obtained by symmetrizing the energy function:

$$\phi_{opt} = \arg\min_{\phi} \left[ E(I_0, I_1, \phi) + E(I_1, I_0, \phi^{-1}) \right]$$

where $E(I_0, I_1, \phi)$ denotes the energy of aligning $I_1$ to $I_0$ with $\phi$. After registration, the algorithm provides not only the deformation field $\phi$, but also the logarithm of the diffeomorphism, $v = \log(\phi)$, which can be directly used in computational anatomical analysis. More details about the symmetric log-domain diffeomorphic demons registration are given in the paper of Vercauteren et al. [41].
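For orientation, the velocity field $Z(v, u)$ in log-domain registration is commonly obtained from the Baker-Campbell-Hausdorff (BCH) expansion. The truncation shown below is a general illustration and an assumption here; the text does not state which order was used in this study.

```latex
Z(v, u) = v + u + \tfrac{1}{2}[v, u] + \tfrac{1}{12}[v, [v, u]] + \cdots
```

Here $[\cdot, \cdot]$ is the Lie bracket of vector fields. Keeping only $v + u$ gives the cheapest update, while the bracket terms account for the fact that composition of deformations does not commute.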
To compute the distance between images, the Riemannian distance was defined [42]. For each pair of images $\{I_j, I_k\}$, the symmetric log-domain diffeomorphic demons algorithm calculated a mapping $\phi$ from $I_k$ to $I_j$, a velocity field $v = \log(\phi)$ (that is, $\phi = \exp(v)$), and an inverse mapping $\phi^{-1} = \exp(-v)$ from $I_j$ back to $I_k$. The Riemannian distance between $I_j$ and $I_k$ was then computed, as defined in [42], from the ROI-restricted mapping $\phi^{ROI}$ and the identity transformation Id, using the log-domain fields $v_j^{ROI}$ and $v_k^{ROI}$. Here $\phi^{ROI}$ can be either a diffeomorphism of the whole brain or a sub-field of any segmented region of the brain, and $v_j^{ROI}$ and $v_k^{ROI}$ represent the log-domain diffeomorphism of the specific ROI in $I_j$ and $I_k$, respectively. For example, the specific ROI can be the whole-brain gray matter (GM) or white matter (WM), cortical lobes, hippocampus or other subcortical structures.
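As a rough illustration of restricting a log-domain field to an ROI, the sketch below computes the L2 norm of a velocity field inside a binary mask. This is not the distance formula of [42]; the function name `roi_velocity_norm`, the `(X, Y, Z, 3)` array layout and the choice of an L2 norm are all assumptions made for illustration.

```python
import numpy as np

def roi_velocity_norm(velocity, roi_mask):
    """L2 norm of a velocity field restricted to an ROI (illustrative only).

    velocity: array of shape (X, Y, Z, 3), one 3D vector per voxel (assumed layout).
    roi_mask: boolean array of shape (X, Y, Z), True inside the ROI.
    """
    velocity = np.asarray(velocity, dtype=float)
    roi_mask = np.asarray(roi_mask, dtype=bool)
    vectors = velocity[roi_mask]              # shape (n_voxels_in_roi, 3)
    return float(np.sqrt((vectors ** 2).sum()))

# Toy field: two non-zero vectors, only one of them inside the small ROI.
velocity = np.zeros((4, 4, 4, 3))
velocity[1, 1, 1] = (3.0, 4.0, 0.0)
velocity[2, 2, 2] = (0.0, 0.0, 12.0)

small_roi = np.zeros((4, 4, 4), dtype=bool)
small_roi[1, 1, 1] = True
whole_volume = np.ones((4, 4, 4), dtype=bool)

small_norm = roi_velocity_norm(velocity, small_roi)
whole_norm = roi_velocity_norm(velocity, whole_volume)
```

The same field yields a different magnitude depending on the mask, which is the sense in which a distance can be made ROI-specific.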

Embedding algorithm
A distance matrix was constructed after the distance between each pair of subjects was calculated. The embedding algorithm projected all the labeled images onto a low-dimensional space with this distance matrix, and a discrimination hyperplane was obtained by training the labeled subjects on the embedded space. To classify a new unlabeled image, an out-of-sample extension of embedding algorithms was used to project the new subject onto the constructed embedded space. The metric multi-dimensional scaling (MDS) algorithm was applied for embedding. The idea of metric MDS is to transform the distance matrix into a cross-product matrix and then to find its eigen-decomposition, which gives a principal component analysis (PCA). Let $S_i$ be the i-th row sum of the distance matrix $D$, $S_i = \sum_j D_{ij}$. The cross-product matrix is obtained by using the "double-centering" formula:

$$\tilde{D}_{ij} = -\frac{1}{2} \left( D_{ij} - \frac{1}{n} S_i - \frac{1}{n} S_j + \frac{1}{n^2} \sum_k S_k \right)$$

The embedding $e_{im}$ of subject $i$ is

$$e_{im} = \sqrt{\lambda_m} \, v_{im}$$

where $\lambda_m$ denotes the m-th principal eigenvalue and $v_{im}$ denotes the i-th element of the m-th principal eigenvector.
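The double-centering and eigen-decomposition steps can be sketched in a few lines of NumPy. This is a minimal sketch, not the implementation used in the study: the function name `classical_mds` is hypothetical, and the input is assumed to hold squared distances, as in the classical formulation of metric MDS.

```python
import numpy as np

def classical_mds(sq_dist, n_components=3):
    """Embed n subjects from an n x n matrix of squared distances (assumed input)."""
    sq_dist = np.asarray(sq_dist, dtype=float)
    n = sq_dist.shape[0]
    row_sums = sq_dist.sum(axis=1)                      # S_i = sum_j D_ij
    # Double centering: -1/2 (D_ij - S_i/n - S_j/n + sum_k S_k / n^2)
    gram = -0.5 * (sq_dist
                   - row_sums[:, None] / n
                   - row_sums[None, :] / n
                   + row_sums.sum() / n ** 2)
    eigvals, eigvecs = np.linalg.eigh(gram)             # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]    # keep the largest ones
    lam = np.clip(eigvals[order], 0.0, None)            # guard against tiny negative values
    return eigvecs[:, order] * np.sqrt(lam)             # e_im = sqrt(lambda_m) * v_im

# Toy check: points that truly live in 3D should be recovered up to rotation.
rng = np.random.default_rng(0)
points = rng.normal(size=(6, 3))
sq_dist = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
embedding = classical_mds(sq_dist, n_components=3)
```

For real deformation-based distances the matrix is generally not exactly Euclidean, so the three retained components only approximate the original distances.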
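To calculate the embedding coordinate of a new point, define the kernel function $\tilde{K}$ yielding the symmetric matrix $\tilde{M}$ on the dataset $I = \{x_1, \ldots, x_n\}$, with $x_i$ sampled from an unknown distribution with density $p$:

$$\tilde{K}(a, b) = -\frac{1}{2} \left( d^2(a, b) - E_x[d^2(x, b)] - E_{x'}[d^2(a, x')] + E_{x, x'}[d^2(x, x')] \right)$$

where $d(a, b)$ is the original distance and the expectations $E$ are taken over the training data $I$. Let $(v_l, \lambda_l)$ be an (eigenvector, eigenvalue) pair that solves $\tilde{M} v_l = \lambda_l v_l$, and let $e_l(x)$ denote the embedding associated with the new point $x$. Then

$$e_l(x) = \frac{1}{\sqrt{\lambda_l}} \sum_{i=1}^{n} v_{il} \, \tilde{K}(x, x_i)$$

where $v_{il}$ is the i-th element of $v_l$. Readers can refer to the work of Bengio et al. for algorithm details and proof [43]. In this study, subjects were all projected onto an $\mathbb{R}^3$ space for classification.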
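The out-of-sample projection can be sketched as follows, with the expectations replaced by averages over the training subjects. The helper names `fit_mds` and `embed_new` are hypothetical, the inputs are assumed to be squared distances, and the retained eigenvalues are assumed to be strictly positive.

```python
import numpy as np

def fit_mds(sq_dist, n_components=3):
    """Fit metric MDS on training subjects; keeps what the out-of-sample step needs."""
    sq_dist = np.asarray(sq_dist, dtype=float)
    row_mean = sq_dist.mean(axis=1)                     # E_x'[d^2(x_i, x')]
    total_mean = sq_dist.mean()                         # E_x,x'[d^2(x, x')]
    gram = -0.5 * (sq_dist - row_mean[:, None] - row_mean[None, :] + total_mean)
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    return {"row_mean": row_mean, "total_mean": total_mean,
            "eigvals": eigvals[order], "eigvecs": eigvecs[:, order]}

def embed_new(model, sq_dist_to_train):
    """Project a new subject given its squared distances to all training subjects."""
    d = np.asarray(sq_dist_to_train, dtype=float)
    # K(x, x_i) with expectations taken over the training set
    kernel = -0.5 * (d - d.mean() - model["row_mean"] + model["total_mean"])
    # e_l(x) = (1 / sqrt(lambda_l)) * sum_i v_il K(x, x_i); requires lambda_l > 0
    return (model["eigvecs"].T @ kernel) / np.sqrt(model["eigvals"])

# Toy data: 8 training points and 1 new point in 3D.
rng = np.random.default_rng(1)
train = rng.normal(size=(8, 3))
new_point = rng.normal(size=3)
sq_train = ((train[:, None, :] - train[None, :, :]) ** 2).sum(axis=-1)

model = fit_mds(sq_train)
train_embedding = model["eigvecs"] * np.sqrt(model["eigvals"])
new_embedding = embed_new(model, ((train - new_point) ** 2).sum(axis=-1))
```

Only the distances from the new subject to the training subjects are needed, so a test image does not require the embedding to be recomputed.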

Classification
An SVM with a linear kernel, implemented using the MATLAB 'libsvm' toolbox (http://www.csie.ntu.edu.tw/~cjlin/libsvm/), was applied on the embedded space to classify subjects. The C-SVM model was chosen, and the cost parameter C was fixed at 1 in all experiments. k-fold cross validation was adopted to estimate the classification performance. The subjects were randomly partitioned into k "equal" sized subgroups. In this study, as the number of subjects in each group was unequal and could not always be evenly divided by k, some subgroups had one or two more subjects in practice. Of the k subgroups, a single subgroup was used as the validation data and the remaining k-1 subgroups were used as training data. The process was repeated k times, and k was set to 10 in this study. Classification sensitivity, specificity, and accuracy were then calculated. The receiver operating characteristic (ROC) curve was plotted and the area under the ROC curve (AUC) was measured.
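The partitioning and the three reported metrics can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: the function names are hypothetical, the split is a plain random partition (not necessarily the one used in the study), and the SVM training step itself is omitted.

```python
import random

def k_fold_indices(n_subjects, k=10, seed=0):
    """Randomly partition subject indices into k folds of near-equal size."""
    indices = list(range(n_subjects))
    random.Random(seed).shuffle(indices)
    # Striding deals subjects out like cards, so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]

def train_test_splits(n_subjects, k=10, seed=0):
    """Yield (train, test) index lists; each fold is held out exactly once."""
    folds = k_fold_indices(n_subjects, k, seed)
    for held_out in range(k):
        test = folds[held_out]
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, test

def classification_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity and accuracy; assumes both classes are present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {"sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "accuracy": (tp + tn) / len(pairs)}

# 95 pMCI + 132 sMCI subjects do not divide evenly into 10 folds.
folds = k_fold_indices(227, k=10)
fold_sizes = sorted(len(fold) for fold in folds)
```

With 227 subjects the folds hold 22 or 23 subjects each, which illustrates why the subgroups are only approximately equal in size.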

Results
No significant differences in age were found between any pair of groups using the Student's t test. For the baseline MMSE score, significant differences were found between all pairs of groups except sMCI versus pMCI.
The deformable registration and distance quantification results of two pairs of subjects were shown in Fig 2, where the same reference was used. Images before and after registration, deformation fields, and quantified ROI-specific Riemannian distances for the two source subjects were shown. It was observed that the reference and source images were considerably well aligned using the symmetric log-domain diffeomorphic demons registration. The deformation from the subject who is more morphologically different from the reference was notably larger than that from the other subject. Consequently, the difference was manifest in the quantified distances.
Classification results for differentiating normal elderly controls and AD patients were summarized in Table 2 and Fig 3. Using the whole-brain gray matter as ROI, the highest classification accuracy was 96.5% with a sensitivity of 93.85%, specificity of 97.78% and AUC of 0.995. In addition, using the other six ROIs including temporal lobe, whole-brain white matter, hippocampus, parietal lobe, amygdala, and frontal lobe, the algorithm achieved high sensitivity and specificity above 90% (AUCs>0.96). The worst performance resulted from caudate, where the sensitivity was substantially lower in discrimination.
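As a consistency check, the reported percentages for the whole-brain gray matter can be related to the group sizes (65 AD, 135 NC). The counts of correctly classified subjects below are inferred from those percentages under the assumption that results were pooled over the held-out folds; they are not stated in the text.

```python
n_ad, n_nc = 65, 135     # group sizes reported for mild AD and normal controls
true_pos = 61            # inferred: AD subjects classified as AD (assumption)
true_neg = 132           # inferred: controls classified as controls (assumption)

sensitivity = true_pos / n_ad
specificity = true_neg / n_nc
accuracy = (true_pos + true_neg) / (n_ad + n_nc)

print(f"sensitivity={sensitivity:.2%} specificity={specificity:.2%} accuracy={accuracy:.2%}")
```

Under these inferred counts, 4 of 65 AD patients and 3 of 135 controls would have been misclassified, which matches the reported 93.85%, 97.78% and 96.5%.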
Fig 3. (a) Classification sensitivity (green), specificity (blue), and accuracy (red) of normal elderly controls versus AD patients with different ROIs. The highest accuracy (96.5%) was achieved using the whole-brain gray matter as ROI with 93.85% sensitivity and 97.78% specificity. The algorithm obtained high sensitivity and specificity (>90%) with half of the ROIs. (b) The ROC curve of the prediction accuracy between normal controls versus AD. The AUCs were larger than 0.98 for the whole-brain gray matter and white matter (left), amygdala and hippocampus (middle), parietal and temporal lobes (right). doi:10.1371/journal.pone.0173372.g003
Classification results for stable MCI versus progressive MCI subjects were summarized in Table 4 and Fig 5. As in differentiating normal controls and pMCI subjects, amygdala and hippocampus remained the top two ROIs, with which the method obtained the highest accuracy. The algorithm also performed well when using the whole-brain gray matter, frontal lobe, and cingulate cortex as ROI, achieving accuracy over 85% (AUCs>0.875). For the globus pallidus, thalamus, and putamen, we obtained high specificity but significantly lower sensitivity, resulting in classification accuracy lower than 67%. To summarize and compare the classification performance of each ROI, we calculated the mean accuracy for each ROI over the three experiments (Table 5). Hippocampus and amygdala were ranked the top two ROIs, with excellent performance in all tests. Gray matter and its subdivisions also ranked highly, except for the occipital lobe, followed by the white matter and other subcortical structures.

Discussion
Classification performance compared with existing algorithms
Many algorithms have been proposed for early diagnosis of AD, with accuracy ranging from 75% to 96% [44][45][46][47][48]. Kloppel et al. considered the voxels of tissue probability maps of the whole brain or volumes of interest (VOI) as features in the classification, obtaining accuracy of 95.6% in discriminating normal controls and AD [33]. In recent work, Beheshti et al. selected the regions with significant difference between groups as VOIs and considered each voxel in the VOIs as a feature, followed by a feature selection step [35]. They obtained 96.32% accuracy between controls and AD. In this work, we observed a classification rate of 96.5% using the whole-brain gray matter as ROI with an AUC of 0.995. For five ROIs, the classification accuracy exceeded 95%, indicating that global morphological changes have occurred in mild AD patients and that mild AD is highly distinguishable from healthy controls.
By contrast, the brain shape difference between healthy elderly and MCI subjects is smaller, which increases the difficulty of discrimination. Fan et al. proposed a method that considered the tissue density from pathology-adaptive anatomical parcellation as features and obtained classification accuracy of 81.8% [48]. Chupin et al. used hippocampal volume to discriminate between elderly controls and progressive MCI who had developed AD in 18 months and obtained 71% accuracy [19]. Our proposed algorithm performed well in this test, where 91.74% accuracy (0.971 AUC) was obtained in classifying MCI who had developed AD at 36 months follow-up. For the ROIs of amygdala, hippocampus, temporal lobe, the whole-brain gray matter, frontal lobe, and parietal lobe, the algorithm obtained AUC values all higher than 0.9. Distinguishing progressive MCI from stable MCI, which is important for prediction of conversion in MCI subjects, is challenging in MRI-based classification. An algorithm based on hippocampal volume measurement obtained accuracy of 67% [19]. Normalized thickness index in specific cortical regions was considered as a feature in another algorithm proposed by Querbes et al. [49], where 76% accuracy was obtained in classifying MCI converters for the 24-month period. Lillemark et al. reported a classification accuracy of 76.6% using region-based surface connectivity as features for grouping MCI subjects who had developed AD at 12-month follow-up [50]. Westman et al. [45] and Aguilar et al. [34] each collected multiple surface and volumetric indices via FreeSurfer processing and applied multivariate models for discrimination. Westman et al. obtained 75.9% accuracy for MCIs with conversion at 18 months follow-up, while Aguilar et al. obtained 86% accuracy for MCIs with conversion at 12 months follow-up.
Using the proposed method, we obtained an overall accuracy of 88.99% (0.932 AUC) to classify MCI patients who had progressed to AD after 36 months of baseline visit. Algorithm comparison was summarized in Table 6.
The proposed algorithm developed a new strategy that quantified the deformation field to represent shape difference between subjects rather than comparing the tissue density or surface/volumetric indices. This deformation-based method characterized the macroscopic differences in brain anatomy which were discarded in most of the existing approaches at the spatial normalization step. The quantified deformation was then used to denote dissimilarity between subjects and a distance matrix was constructed. The MDS algorithm used in the study was guaranteed to recover the true dimensionality and geometric structure of manifolds in which each subject is represented as an element [52]. Finally, MDS constructed an embedding of the data in a low-dimensional Euclidean space that best preserved the manifold's estimated intrinsic geometry. The advantage of this algorithm may be due to the amount of information it used in dimensionality reduction for spatially representing the similarity relationships between subjects: computing pair-wise registrations, instead of aligning subjects to an atlas or a constructed template, resulted in a more informative embedding and consequently an enhanced power to discriminate between different populations.

Prediction of AD conversion in MCI patients
Identifying MCI patients at high risk for conversion to AD is crucial for the effective treatment of the disease. Over the past decade, numerous biomarkers have been proposed for prediction of AD conversion in MCI patients [19,34,45,49,50][53][54][55][56][57][58]. Cognitive performance data including the Spatial Pattern of Abnormalities for Recognition of Early AD (SPARE-AD) index, AD Assessment Scale-Cognitive (ADAS-Cog) subscale, or composite cognitive scores were introduced to assess AD conversion. However, the accuracy is not satisfactory, with a classification rate around 65% [44]. Combining cognitive measures with MRI and age information, the discrimination rate has risen to 82% [57]. Cerebrospinal fluid (CSF) tau and Aβ42 measures have also been proposed as potential predictors of risk for developing AD [59]. Integrating CSF biomarkers together with MRI patterns resulted in accuracy of 62% [53]. When further including positron emission tomography (PET) data and routine clinical tests, the prediction accuracy increased to 72% [55].
Compared to previous studies using the ADNI database, the proposed algorithm based on quantification of MRI deformation demonstrated a promising strategy for predicting MCI-to-AD conversion 3 years in advance, with accuracy of 88.99% and AUC of 0.932, which are the highest rates reported to the best of our knowledge. If MRI can provide sufficient information for good prediction using a robust algorithm, the use of CSF and PET biomarkers can be avoided, as the former requires lumbar puncture, which is invasive and painful for patients, and the latter suffers from high cost and radiation exposure [60].

Selection of regions of interest for classification
Global and regional cerebral atrophy has been reported in previous studies. Annual rates of global brain atrophy in AD are about 2-3%, compared with 0.2-0.5% in healthy controls [6,61]. At the early stage of AD progression, prominent atrophy has emerged in the medial temporal regions and the posterior cortical regions including posterior cingulate, retrosplenial, and lateral parietal cortex [62]. Medial temporal lobe atrophy, particularly of the amygdala, hippocampus, entorhinal cortex, and parahippocampal gyrus, can be observed with higher frequency in patients with AD or probable AD [63,64]. Shape changes have also been demonstrated in the caudate, putamen, globus pallidus, and thalamus in AD [65]. Although remarkable morphological alterations were found in certain regions in AD or prodromal AD at the group level, individual classification based on different regions in this study yielded substantially distinct results. The whole-brain gray matter and temporal lobe performed the best in distinguishing AD from normal elderly controls, while amygdala and hippocampus worked better in classifying progressive MCI versus either healthy elderly or stable MCI. This result was mostly consistent with the previous finding that significantly increased rates of hippocampal atrophy were observed in presymptomatic and mild AD, while more widespread tissue shrinkage has been shown in mild to moderate AD patients [6,66]. Evidence has also been documented that increased oxygen extraction capacity and tissue atrophy were observed in the basal ganglia and thalamus in patients with AD [65,67]. These ROIs indeed resulted in a classification accuracy higher than 80% in discriminating AD, but a much lower accuracy in classifying progressive MCI, indicating that shape changes of the basal ganglia and thalamus were prominent features in AD but not yet in the prodromal stage.
By an integrative comparison, we proposed that hippocampus, amygdala, the whole-brain gray matter, temporal lobe, and parietal lobe should be of higher preference for AD or MCI classification, where amygdala and hippocampus could be the leading candidate for predicting AD conversion in MCI, while occipital lobe, thalamus, globus pallidus, and putamen should be non-priority selections for early diagnosis.

Conclusion
In this study, we proposed a deformation-based machine learning method for discrimination of AD and prediction of MCI-to-AD conversion with high resolution MRI. The proposed algorithm performed well on both classification and prediction of AD, with 96.5% accuracy in discriminating AD from healthy elderly, 91.74% accuracy for progressive MCI versus healthy elderly, and 88.99% accuracy for progressive MCI versus stable MCI. Large deformation in hippocampus and amygdala was advantageous for differentiating progressive MCI patients, while diffuse morphological changes in the whole-brain gray matter were prominent in identifying mild or moderate AD patients.
A limitation of the algorithm is that it is computationally expensive. A balance between classification accuracy and computational time should be achieved in our future research. In general, MRI-based analysis can be a beneficial supplement to clinical diagnosis and prediction of AD.