The impact of atlas-based MR attenuation correction on the diagnosis of FDG-PET/MR for Alzheimer’s diseases— A simulation study combining multi-center data and ADNI-data

Background The purpose of this study was to assess the impact of vendor-provided atlas-based MRAC on FDG PET/MR for the evaluation of Alzheimer’s disease (AD) by using simulated images. Methods We recruited 47 patients, from two institutions, who underwent PET/CT and PET/MR (GE SIGNA) examination for oncological staging. From the PET raw data acquired on PET/MR, two FDG-PET series were generated, using vendor-provided MRAC (atlas-based) and CTAC. The following simulation steps were performed in MNI space: After spatial normalization and smoothing of the PET datasets, we calculated the error map for each patient, PETMRAC/PETCTAC. We multiplied each of these 47 error maps with each of the 203 Alzheimer’s Disease Neuroimaging Initiative (ADNI) cases after the identical normalization and smoothing. This resulted in 203 × 47 = 9541 datasets. To evaluate the probability of AD in each resulting image, a cumulative t-value was calculated automatically using commercially available software (PMOD PALZ), which has been used in multiple large cohort studies. The diagnostic accuracy for the discrimination of AD and for predicting progression from mild cognitive impairment (MCI) to AD was evaluated in simulated images and compared with that in the original ADNI images. Results The accuracy and specificity for the discrimination of AD patients from normal controls were not substantially impaired, but sensitivity was slightly impaired in 5 out of 47 datasets (original vs. error; 83.2% [CI 75.0%-89.0%], 83.3% [CI 74.2%-89.8%] and 83.1% [CI 75.6%-88.3%] vs. 82.7% [range 80.4–85.0%], 78.5% [range 72.9–83.3%] and 86.1% [range 81.4–89.8%]). The accuracy, sensitivity and specificity for predicting progression from MCI to AD during 2-year follow-up were not impaired (original vs. error; 62.5% [CI 53.3%-69.3%], 78.8% [CI 65.4%-88.6%] and 54.0% [CI 47.0%-69.1%] vs. 64.8% [range 61.5–66.7%], 75.7% [range 66.7–81.8%] and 59.0% [range 50.8–63.5%]).
The worst 3 error maps show a tendency towards underestimation of PET scores. Conclusion FDG-PET/MR based on atlas-based MR attenuation correction showed similar diagnostic accuracy to the CT-based method for the diagnosis of AD and the prediction of progression of MCI to AD using commercially-available software, although with a minor reduction in sensitivity.



Background
Integrated positron emission tomography (PET) / magnetic resonance (MR) systems are now widely distributed (over 100 institutions worldwide). Previous studies have shown that 2-deoxy-2-[18F]fluoro-D-glucose (FDG)-PET/MR is useful in the evaluation of neurodegenerative diseases [1][2][3][4][5][6]. Additionally, combined PET/MR not only provides detailed brain anatomy, but the immediate availability of coregistered anatomy might even improve PET image quality by facilitating the correction of partial volume effects and/or motion artifacts [7,8]. However, several technical challenges must be solved to exploit the full performance of PET/MR. One limitation in need of improvement is attenuation correction (AC) using MR imaging data (MRAC) [9]. On PET/MR systems, it is difficult to derive AC maps from conventional MR data because there is no direct relationship between photon attenuation and MR signal intensity. To solve this problem, several AC methods (i.e. Dixon-based AC, atlas-based AC, model-based AC, zero echo time MRI-based AC and ultrashort echo time MRI-based AC) have been proposed by vendors and researchers [9,10]. For clinical use, Dixon-based four-class segmentation approaches (i.e. air, lung, fat and soft tissue) were implemented in both the Biograph mMR (Siemens Healthcare, Erlangen, Germany) and SIGNA PET/MR (GE Healthcare, Waukesha, WI, USA). However, these methods are not recommended for brain studies, because neglecting bone introduces a significant bias in cortical areas [11]. One of the alternative methods currently implemented on clinical PET/MR scanners is the atlas-based method [12,13]. This method is comparably accurate in supratentorial regions, but not accurate enough in the temporal lobe and in the infratentorial region, where FDG uptake is underestimated. This variability of error distribution may impact the diagnostic accuracy of FDG PET in several diseases, including Alzheimer's disease (AD).
Quantitative evaluation of FDG-PET typically relies on a normalization of local FDG uptake to that in dedicated anatomical regions (e.g. cerebellum and thalamus) or to the whole brain average. If regions with overestimated FDG accumulation are normalized to an underestimated region, or vice versa, the result could be under- or over-diagnosis of AD. However, the impact of the vendor-provided MRAC on the accuracy of diagnosis of AD has not been reported in the literature. The aim of this study was to clarify the clinical utility of FDG-PET from PET/MR, with vendor-provided atlas-based MRAC, for the diagnosis of AD. The analysis was performed on simulated data that combined real patient data from two institutions (Institution A (InA) and Institution B (InB)) with data from the Alzheimer Disease Neuroimaging Initiative (ADNI), a well-established large cohort. The probability of AD was calculated using fully automated procedures in commercially available software.

Materials and methods
This study was approved by the local institutional review boards (the Cantonal Ethics Committee of Zurich and the Institutional Review Board of Mayo Clinic). All subjects provided signed informed consent prior to the examinations. All experiments were performed in accordance with relevant guidelines and regulations. We recruited 47 patients, from two institutions, who underwent both PET/CT and PET/MR (GE SIGNA) examination. In addition, we extracted 203 subjects from the ADNI dataset. From the PET raw data acquired on PET/MR, two FDG-PET series were generated, using either the vendor-provided atlas-based MRAC or CTAC. Following spatial normalization to MNI space, we calculated the error map for each patient as PET MRAC / PET CTAC. We multiplied each of these 47 error maps with each of the 203 ADNI cases in MNI space. This resulted in 203 × 47 = 9541 datasets (Fig 1). To evaluate the probability of AD in each resulting image, a cumulative t-value was calculated using a fully automated method in commercially available software (S1 Fig). The diagnostic accuracy for the discrimination of AD and for predicting progression from mild cognitive impairment (MCI) to AD was evaluated in the simulated images and compared with that in the original ADNI images (Fig 1).

Alzheimer Disease Neuroimaging Initiative (ADNI) data
Data used in the preparation of this article were obtained from the ADNI database (adni.loni.usc.edu). ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early Alzheimer's disease (AD). For up-to-date information, see www.adni-info.org. The authors had no special access privileges to the data that others would not have. This means that any interested researcher can replicate the current study findings in their entirety by obtaining the data directly from ADNI's website and combining them with their own error map dataset.
From ADNI-1 data, we extracted 203 participants (76.0±6.3 years, 129 males, 48 healthy, 59 AD and 96 MCI participants). Out of 96 MCI participants, 33 progressed to AD at 24 months after imaging. The inclusion criteria were: completeness of date of birth, baseline diagnosis (healthy, MCI, or AD) and the diagnosis at 24 months after imaging. All PET images were of sufficient quality for visual scoring and for software-based analysis using PMOD's Alzheimer Discrimination tool PALZ (PMOD Technologies LLC, Zurich, Switzerland) [14,15]. The baseline PET data was utilized. The reported FDG-PET imaging parameters were: injected dose, 185 MBq (5 mCi), dynamic 3D acquisition, six 5-min frames 30-60 min post injection.

Patients
We recruited 47 patients from two institutions (InA and InB) who underwent both PET/CT and PET/MR for oncologic staging. Twenty patients (11 males and 9 females, 61.6±12.4 years) with lymphoma, lung cancer, head and neck cancer, pancreatic cancer, pleural mesothelioma, uterine carcinoma, cervical carcinoma and malignant melanoma at InA were collected by combining the cohorts recruited in previous studies [12,16]. The other twenty-seven patients (15 males and 12 females, 60.0±13.0 years) with lymphoma, pheochromocytoma, myeloma, melanoma, Merkel-cell cancer, lung cancer, pancreatic cancer, breast cancer and dementia at InB were collected from another previous study, after excluding 3 patients. Two of these three excluded patients had infarction and one had multiple brain metastases [17]. A neuroradiologist (T.S.) reviewed and confirmed that all included patients were free of brain abnormalities.

Attenuation map generation
The atlas AC map was calculated from the LAVA-Flex T1w images using the vendor-provided default processing [18].
For the generation of CTAC, the processing steps detailed below were performed using custom Matlab scripts and PMOD 3.8. The co-registered CTAC map was generated as follows. First, the original head CT was exported from the PET/CT scanner and converted into an AC map using a Matlab version of the same bilinear mapping implemented in the SIGNA PET/MRI. From this map, the CT table was removed manually. A threshold was set to extract the outside air component from the CTAC map. None of the images used in this study contained artifacts likely to affect air thresholding. To derive the registration parameters necessary to match CT to LAVA-Flex T1w, a normalized mutual information matching algorithm (PMOD) was used and the final matching was performed using custom Matlab routines. Finally, the CTAC map was superimposed on the atlas AC map, replacing it [13].
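The bilinear conversion from CT numbers to 511 keV attenuation coefficients mentioned above can be illustrated with a minimal Python sketch. The break point at 0 HU and the bone slope used here are placeholder assumptions chosen for illustration only; they are not the coefficients of the SIGNA PET/MRI or of the Matlab scripts used in this study. Only the approximate linear attenuation coefficient of water at 511 keV (about 0.096 cm^-1) is a standard value.

```python
import numpy as np

# Approximate linear attenuation coefficient of water at 511 keV (cm^-1).
MU_WATER_511 = 0.096
# ASSUMPTION: illustrative slope for the bone segment (cm^-1 per HU);
# real scanners use kVp-dependent, vendor-specific values.
BONE_SLOPE = 5.0e-5
# ASSUMPTION: break point between the two linear segments, in HU.
BREAK_HU = 0.0


def hu_to_mu(hu):
    """Map CT numbers (HU) to attenuation coefficients with two linear segments."""
    hu = np.asarray(hu, dtype=float)
    # Air-to-water segment: 0 at -1000 HU, MU_WATER_511 at 0 HU.
    soft = MU_WATER_511 * (hu + 1000.0) / 1000.0
    # Water-to-bone segment: shallower slope above the break point.
    bone = MU_WATER_511 + BONE_SLOPE * (hu - BREAK_HU)
    mu = np.where(hu <= BREAK_HU, soft, bone)
    # Attenuation cannot be negative (e.g. for values below -1000 HU).
    return np.clip(mu, 0.0, None)


mu_values = hu_to_mu([-1000.0, 0.0, 1000.0])
```

The two segments meet at the break point, so the mapping is continuous; changing the placeholder slope only affects voxels denser than water.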

Automated software for AD probability assessment
Automated AD probability assessment was performed in a commercially available tool (PMOD Alzheimer's Discrimination, PALZ). This software tool has been used in multiple large cohort studies, e.g. ADNI, NEST-DD and SEAD-Japan [15,19,20]. The software runs the following procedures, in a fully automated workflow, in accordance with the methods described by Herholz et al. [21]. First, spatial normalization is performed by transforming the original images to the SPM99 PET template, followed by smoothing with a Gaussian filter of 12 x 12 x 12 mm [22,23]. Second, voxel values are normalized by dividing each image voxel value by the mean voxel value, averaged within a mask representing voxels in which FDG uptake is typically preserved even in AD patients. Third, the expected value in each voxel is calculated from a pre-stored, age-matched, reference PET database of healthy controls. This is achieved by combining voxel-wise regression parameters, where brain atrophy is taken into account by adjusting the normal reference values using linear regression by age. Fourth, a Student's t-value is calculated from the voxel-wise differences between the expected value and the patient-specific value [24]. Fifth, the AD t-sum is calculated by summing the t-values in predefined AD-related voxels. Finally, the PET score is calculated as log2(AD t-sum/11089 + 1), where 11089 is the 95% prediction limit of the AD t-sum established in the NEST-DD multi-center trial [15]. This analysis was initially performed on all 203 ADNI PET datasets (e.g. PETScore ADNI−j) before multiplication with the 47 error maps. The detailed procedure is shown in S1 Fig.
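The final transformation from AD t-sum to PET score can be written out as a short Python sketch. It covers only the formula quoted above and not the internal PALZ implementation; the function and constant names are our own. Note that an AD t-sum equal to the 95% prediction limit gives a PET score of exactly 1.

```python
import math

# 95% prediction limit of the AD t-sum quoted in the text (NEST-DD trial).
T_SUM_LIMIT = 11089.0


def pet_score(ad_t_sum: float) -> float:
    """PET score = log2(AD t-sum / 11089 + 1), as given in the text."""
    ratio = ad_t_sum / T_SUM_LIMIT + 1.0
    if ratio <= 0.0:
        # The logarithm is undefined here; this guard is our own addition.
        raise ValueError("AD t-sum must be greater than -11089")
    return math.log2(ratio)


score_at_limit = pet_score(11089.0)  # t-sum at the prediction limit
score_at_zero = pet_score(0.0)       # no abnormal voxels
```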

Creation of simulated data: ADNI-data with Atlas-AC
All simulation steps were performed in MNI space with the same spatial resolution (2 mm isotropic voxels). First, we divided the locally acquired PET images based on atlas AC by those based on CTAC (47 patients) (e.g. Error PET pt−i). Second, the resulting images were spatially normalized to the SPM99 PET template using the transformation calculated for PET images based on CTAC to the template, then a Gaussian filter of 12 x 12 x 12 mm full-width half-maximum was applied (Norm Error PET pt−i). A brain mask was applied to avoid distortion at the edges of the measured data. These steps were designed to replicate the preprocessing steps used in the PMOD Alzheimer's Discrimination tool, as used to calculate the PET score. Therefore, the resulting images were the error maps (between atlas AC and CTAC) in the same image space as the spatially normalized ADNI PET data (Norm PET ADNI−j). Third, we multiplied each of the 203 normalized ADNI datasets with each of the 47 normalized error maps, resulting in 203 × 47 = 9541 normalized PET images (e.g. Norm Error PET pt−i ADNI−j). Thus, the value error was simply imposed in a voxel-wise manner, and further PET score calculation was performed without additional need for spatial deformation or filtering. Therefore, we expected any bias due to impaired spatial normalization or differences in PET acquisition protocol to be minimized. For each of these 9541 images, PALZ analysis (from the second to fifth analysis step) was performed to calculate the PET score (PETScore pt−i ADNI−j). This workflow is summarized in Fig 1.
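The voxel-wise combination of error maps and ADNI images can be sketched in Python with NumPy as follows. This is an illustration under simplifying assumptions: random arrays on a tiny 4 × 4 × 4 grid stand in for real images, all volumes are assumed to be already spatially normalized and smoothed, and the array names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = (4, 4, 4)          # ASSUMPTION: toy grid; real images use 2 mm MNI voxels
n_pt, n_adni = 47, 203    # numbers of error maps and ADNI cases from the text

# Synthetic stand-ins for the locally acquired PET images (same raw data,
# reconstructed with atlas-based MRAC and with CTAC).
pet_mrac = rng.uniform(0.5, 1.5, size=(n_pt, *grid))
pet_ctac = rng.uniform(0.5, 1.5, size=(n_pt, *grid))

# Error map per patient: ratio of the MRAC image to the CTAC image.
error_maps = pet_mrac / pet_ctac

# Synthetic stand-ins for the normalized ADNI PET images.
adni = rng.uniform(0.5, 1.5, size=(n_adni, *grid))

# Broadcasting multiplies every error map with every ADNI image voxel by voxel.
simulated = error_maps[:, None] * adni[None, :]

n_datasets = simulated.shape[0] * simulated.shape[1]
```

Here `simulated[i, j]` holds ADNI case j with the error of patient i imposed, so the first two axes enumerate all 9541 combinations.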

Evaluation of diagnostic accuracy for Alzheimer's disease
First, to clarify the distribution of MRAC error, we calculated the averaged error in all SPM99 PET voxels, all AD-related voxels and all non-AD-related voxels in each of the 47 normalized error maps (Norm Error PET pt−i). Second, we calculated the difference in PET score (PETScore pt−i ADNI−j − PETScore ADNI−j) for all 9541 datasets. To reveal whether differing PET acquisition protocols affected the simulation results, we additionally compared the PET score between InA and InB.
We evaluated the diagnostic accuracy from two points of view. The first was the discrimination of AD patients from normal controls. Based on a previous study, the cut-off was PET score = 1 [15]. The second was the prediction of conversion from MCI to AD [14]. For the prediction of conversion from MCI to AD, the PET score cut-off was 0.79, defined using the Youden index [25].
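The two summary measures described above, the regional mean error and the PET score difference, can be sketched as follows. The masks, the error values and the scores are synthetic placeholders chosen for illustration; they are not taken from the study data.

```python
import numpy as np


def mean_error(error_map, mask):
    """Average error ratio over the voxels selected by a boolean mask."""
    return float(np.mean(error_map[mask]))


grid = (4, 4, 4)                               # ASSUMPTION: toy grid
brain_mask = np.ones(grid, dtype=bool)         # stand-in for all template voxels
ad_mask = np.zeros(grid, dtype=bool)
ad_mask[:2] = True                             # hypothetical AD-related voxels
non_ad_mask = brain_mask & ~ad_mask

# Hypothetical error map: 5% underestimation in AD-related voxels,
# 2% overestimation elsewhere.
error_map = np.where(ad_mask, 0.95, 1.02)

regional_error = {
    "whole": mean_error(error_map, brain_mask),
    "ad": mean_error(error_map, ad_mask),
    "non_ad": mean_error(error_map, non_ad_mask),
}

# PET score difference: simulated score minus original score for each
# (error map, ADNI case) pair; the scores here are random placeholders.
rng = np.random.default_rng(0)
original_scores = rng.uniform(0.0, 2.0, size=203)
simulated_scores = original_scores[None, :] + rng.normal(0.0, 0.05, size=(47, 203))
score_diff = simulated_scores - original_scores[None, :]
```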
We calculated sensitivity and specificity using the data for each error map multiplied by each PET image. We created Bland-Altman plots of PET scores in the best 3 and the worst 3 cases [26]. The confidence intervals (CI) were calculated using the original data.
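The accuracy measures and the Bland-Altman statistics can be computed as in the following Python sketch. Treating a PET score at or above the cut-off as a positive result is our assumption (the text gives only the cut-off values), and the example scores and labels are invented for illustration.

```python
import numpy as np


def diagnostic_metrics(scores, is_positive, cutoff):
    """Sensitivity, specificity, accuracy and Youden index at a given cut-off."""
    scores = np.asarray(scores, dtype=float)
    is_positive = np.asarray(is_positive, dtype=bool)
    predicted = scores >= cutoff  # ASSUMPTION: '>=' counts as test-positive
    tp = np.sum(predicted & is_positive)
    tn = np.sum(~predicted & ~is_positive)
    fp = np.sum(predicted & ~is_positive)
    fn = np.sum(~predicted & is_positive)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": float(sensitivity),
        "specificity": float(specificity),
        "accuracy": float((tp + tn) / scores.size),
        "youden": float(sensitivity + specificity - 1.0),
    }


def bland_altman(original, simulated):
    """Mean difference and 95% limits of agreement (mean +/- 1.96 SD)."""
    diff = np.asarray(simulated, dtype=float) - np.asarray(original, dtype=float)
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Invented example: three AD patients followed by three normal controls.
metrics = diagnostic_metrics(
    scores=[1.5, 1.2, 0.8, 0.4, 1.1, 0.2],
    is_positive=[True, True, True, False, False, False],
    cutoff=1.0,
)
bias, lower, upper = bland_altman([1.0, 2.0, 3.0], [1.1, 1.9, 3.3])
```

A negative bias in the Bland-Altman output would correspond to simulated PET scores that are lower than the original scores.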

Results
Neither accuracy nor specificity for the discrimination of AD patients from normal controls was significantly impaired by MRAC, but sensitivity was slightly impaired when using 5 [...] 1 and Fig 2B). The worst 3 error maps showed a tendency towards underestimation of PET scores (Fig 3). A representative case is shown in Fig 4.

Discussion
In the current study, we estimated the diagnostic accuracy of FDG-PET with vendor-provided atlas AC from PET/MR (GE SIGNA) for AD. This was achieved by simulating the error introduced by MRAC on ADNI data and investigating the subsequent effect on an automated method for Alzheimer's discrimination. The result shows that error induced by MRAC could lead to an underestimation of the probability of AD. Accuracy and specificity were maintained, but sensitivity for the discrimination of Alzheimer's disease from normal subjects was slightly impaired. A similar slight tendency was found for the prediction of progression from MCI to AD.
There have been few studies evaluating diagnostic accuracy for AD on clinical PET/MR machines that included more than 10 patients. Hitz et al. recruited 30 patients with suspected AD [27]. FDG-PET images on PET/CT with CTAC and on PET/MR with MRAC were generated separately. Quantitative analysis showed that inconsistent over- and underestimation, depending on the anatomical region, was apparent on PET/MR even after normalization to the global mean. In visual assessment, even experienced observer ratings diverged between PET/CT and PET/MR in 3 out of 29 patients. In that study, they used Dixon-based MRAC, which is no longer recommended for use in brain PET/MR imaging, and generated PET data for each modality from different PET raw data [28]. Our goal was to evaluate the vendor-provided atlas AC, and to use the same PET raw data in the generation of error maps. Moodley et al. enrolled 24 dementia patients [29]. However, the main focus of their study was to evaluate the concordance between FDG-PET and MRI in dementia patients, rather than the validation of MRAC compared with the gold standard: FDG-PET derived from CTAC.
Further studies have been published on the validation of MRAC for brain FDG-PET, some of which recruited AD patients [10,30,31]. However, the main focus of these studies was to clarify the extent and distribution of error introduced by MRAC. None of these studies evaluated the impact of these errors on the diagnostic accuracy for AD in an objective manner. In addition, these studies were performed on different MR systems with different underlying [...]
We used a fully automated, commercially available Alzheimer's discrimination analysis to calculate a predictive value for AD. The normal database of FDG PET images used for the discrimination (calculation of t-value) in this tool was acquired on conventional PET/CT scanners [15]. Based on our study alone, it is difficult to determine whether a PET/MR-specific FDG-PET database should be constructed [27]. However, our finding that diagnostic accuracy was maintained between FDG-PET/MR with atlas-based MRAC and PET/CT indicates that the database for FDG-PET/CT could be used for FDG-PET/MR.
A recent multi-center trial revealed that some novel MRAC methods could generate near "gold standard" AC maps from MR data on PET/MR [30]. In addition, deep-learning-based MRAC methods have been validated and have shown high accuracy [32][33][34]. However, implementation and adoption on commercially available systems will certainly vary among vendors and users. At most institutions with a PET/MR scanner, researchers and physicians can only choose between the MRAC provided by the vendor, a multi-atlas based method available via a web interface (http://cmictig.cs.ucl.ac.uk/niftyweb/program.php?p=PCT) or their own reconstruction algorithm [12,35].
There were a number of limitations in our study. First, the data acquisition in patients was heterogeneous and not optimized in all cases for brain PET (short 2 min scan duration and variable post-injection time). Using only a 2 min acquisition for brain PET images could limit image quality, despite the use of a state-of-the-art PET/MR scanner with a high-sensitivity detector [36]. To mitigate this, all images were smoothed with a Gaussian filter of 12 x 12 x 12 mm, a step that is included in PALZ as one of the analysis steps. In addition, we found no significant difference in PET score error between the short scan time (2 min) cohort at InA and the relatively longer scan time (10 min) cohort at InB. Second, brain PET data were taken from patients imaged for oncologic staging, rather than from dedicated brain imaging. However, the error introduced by MRAC primarily results from differences in skull bone detection, which is not expected to vary systematically between oncology and AD patients. Third, a limited number of patients, n = 47, were used to generate MRAC error maps. This is substantially smaller than a previous larger cohort study that validated the performance of MRAC (n = 337) [30]. In addition, the dataset evaluated in the current study consisted entirely of simulated images that combined PET data from different scanners. In future studies a larger dataset of real patient images should be assessed. Fourth, we chose not to focus on the diagnosis of other types of dementia, e.g. fronto-temporal lobar degeneration, dementia with Lewy bodies and vascular dementia, which are sometimes difficult to distinguish from AD in a clinical setting. Fifth, the outcome measure was derived from a single software tool, and visual assessment by an experienced reader was not included. However, the main focus of this study was not on the tool itself but on assessing the diagnostic accuracy of FDG PET from PET/MR with vendor-provided atlas AC for AD, in a controlled manner. This goal was fulfilled by the current study, because the diagnostic concept is similar for software-based and visual assessment. Sixth, we only evaluated a single atlas-based MRAC method. Cross-validation using several MRAC methods, provided by other vendors or researchers, should be performed in the future.

Conclusion
FDG-PET/MR based on atlas-based MR attenuation correction showed similar diagnostic accuracy to the CT-based method for the diagnosis of Alzheimer's disease and the prediction of the progression of mild cognitive impairment to Alzheimer's disease using the PMOD-based PALZ software, although with a minor reduction in sensitivity.

[...] were spatially normalized to the SPM99 PET template, followed by smoothing with a Gaussian filter of 12 x 12 x 12 mm. Secondly, voxel values were normalized by dividing each image voxel value by the mean voxel value, averaged within a mask representing voxels with AD-preserved activity (B). The voxel-wise differences between the expected and patient-specific images are used to calculate t-values (C). The error map was obtained by dividing PET based on Atlas-AC by that on CT-AC, followed by normalization to the identical space as the spatially normalized ADNI-PET data. The original error map is shown in figures (D) and (F). We multiplied the normalized PET (B) with each normalized error map (D and F), followed by calculation of the t-value map of the simulated PET (not shown in this figure). The error of the t-value maps was obtained by subtracting the original t-value map from the simulated t-value map (E and G). The yellow VOI corresponds to AD-related voxels. The original PET-score of ADNI-data was 0.9385. The PET-score of simulated data was 1.0478 for pt-11 [...]