Variability in quantitative analysis of atherosclerotic plaque inflammation using 18F-FDG PET/CT

Background: 18F-FDG-PET(/CT) is increasingly used in studies aiming at quantifying atherosclerotic plaque inflammation. Considerable methodological variability exists. The effect of data acquisition and image analysis parameters on quantitative uptake measures, such as standardized uptake value (SUV) and target-to-background ratio (TBR), has not been investigated extensively.

Objective: The goal of this study was to explore the effect of several data acquisition and image analysis parameters on quantification of vascular wall 18F-FDG uptake measures, in order to increase awareness of potential variability.

Methods: Three whole-body emission scans and a low-dose CT scan were acquired 38, 60 and 90 minutes after injection of 18F-FDG in six rheumatoid arthritis patients with high cardiovascular risk profiles. Data acquisition (1 and 2) and image analysis (3, 4 and 5) parameters comprised: 1. 18F-FDG uptake time, 2. SUV normalisation, 3. drawing regions/volumes of interest (ROIs/VOIs) according to: a. hot-spot (HS), b. whole-segment (WS) and c. most-diseased segment (MDS), 4. background activity, 5. image matrix/voxel size. Intraclass correlation coefficients (ICCs) and Bland-Altman plots were used to assess agreement between these techniques and between observers. A linear mixed model was used to determine the association between uptake time and continuous outcome variables.

Results:
1. Significantly higher TBRmax values were found at 90 minutes (1.57, 95% CI 1.35–1.80) compared to 38 minutes (1.30, 95% CI 1.21–1.39) (P = 0.024).
2. Normalising SUV for BW, LBM and BSA significantly influences average SUVmax (2.25 (±0.60) vs 1.67 (±0.37) vs 0.058 (±0.013)).
3. Intraclass correlation coefficients were high in all vascular segments when SUVmax HS was compared to SUVmax WS. SUVmax HS was consistently higher than SUVmax MDS in all vascular segments.
4. Blood pool activity significantly decreases in all (venous and arterial) segments over time, but does not differ between segments.
5. Image matrix/voxel size does not influence SUVmax.

Conclusion: Quantitative measures of vascular wall 18F-FDG uptake are affected mainly by changes in data acquisition parameters. Standardization of methodology needs to be considered when studying atherosclerosis and/or vasculitis.


Introduction
18F-Fluorodeoxyglucose positron emission tomography (18F-FDG PET) is a nuclear imaging modality that is increasingly used for assessment of vascular inflammation in patients with atherosclerosis. [1,2] Its clinical relevance is related to the role of inflammation in atherosclerotic plaque development and instability. [3] Quantitative characteristics of 18F-FDG PET are increasingly recognized as providing a more accurate and less observer-dependent measure of inflammatory atherosclerosis than qualitative assessment of PET images. [4] Maximum Standardized Uptake Value (SUVmax) is regularly used for quantification in PET-atherosclerosis studies. [5] SUV is the decay-corrected tissue concentration of intravenously injected 18F-FDG normalised for injected dose and either body weight, lean body mass or body surface area. [6] In addition to SUV, the target-to-background ratio (TBR), which is the ratio of vascular wall and blood pool SUV, is frequently used. [5] Both SUV and TBR correlate with histologically determined macrophage content in atherosclerotic lesions. [7]

To optimize reliability and comparison of results between studies (and within multicenter studies), standardized quantification of vascular wall 18F-FDG uptake is essential. [5,8] Nonetheless, there is large variability in data acquisition and image analysis in 18F-FDG PET quantification of vascular inflammation. (S1 Table) For instance, arterial segments studied comprise carotid arteries, (parts of) the aorta and/or all large arteries. Also, regions of interest (ROIs) are not constructed in a consistent manner and 18F-FDG uptake time varies, ranging from 45 to 193 minutes. [9,10] Both image reconstruction and resolution have been shown to influence quantification of 18F-FDG uptake in oncological studies. [11,12] In quantitative assessment of vascular inflammation, however, there is still little (though increasing) knowledge on the effect of variability of data acquisition and image analysis. [13] The objective of this exploratory study was to investigate the effect of several data acquisition and image analysis parameters on quantification of vascular wall 18F-FDG uptake.
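The definitions of SUV and TBR given above can be written out as a short calculation. The sketch below is illustrative only: the function names, the units (kBq/mL for tissue concentration, MBq for injected activity) and all numerical values are assumptions, not taken from the study's analysis software. The normalising quantity may be body weight or lean body mass (kg) or body surface area (m2), which is why SUV normalised for BSA is numerically much smaller than SUV normalised for BW.

```python
def suv(tissue_kbq_per_ml, injected_mbq, normaliser):
    """Standardized uptake value (illustrative helper, not the study's code).

    tissue_kbq_per_ml: decay-corrected tissue activity concentration (kBq/mL)
    injected_mbq:      injected activity (MBq)
    normaliser:        body weight or lean body mass (kg), or body surface area (m2)

    With kBq/mL, MBq and kg the result is in g/mL.
    """
    if injected_mbq <= 0 or normaliser <= 0:
        raise ValueError("injected activity and normaliser must be positive")
    return tissue_kbq_per_ml * normaliser / injected_mbq


def tbr(wall_suv, blood_pool_suv):
    """Target-to-background ratio: vascular wall SUV over blood pool SUV."""
    if blood_pool_suv <= 0:
        raise ValueError("blood pool SUV must be positive")
    return wall_suv / blood_pool_suv


# Hypothetical patient: 75 kg, injected with 3.5 MBq/kg.
weight_kg = 75.0
injected_mbq = 3.5 * weight_kg

# Hypothetical measured concentrations (kBq/mL).
wall = suv(7.0, injected_mbq, weight_kg)    # vascular wall SUVmax, BW-normalised
blood = suv(5.6, injected_mbq, weight_kg)   # blood pool SUVmean, BW-normalised

print(round(wall, 2), round(blood, 2), round(tbr(wall, blood), 2))
```

Because TBR is a ratio of two SUVs with the same normalisation, the choice of BW, LBM or BSA cancels out in TBR but not in SUV itself.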

Patients
inflammation are studied. Male and female RA patients > 50 years of age were included. Patients were excluded if they were using oral corticosteroids, had active tuberculosis or severe infections/sepsis, had a plasma glucose > 11 mmol/l at the time of the 18F-FDG-PET scan, had moderate heart failure (NYHA class III/IV) or had cancer with a limited life expectancy (< 12 months). All patients provided written informed consent. The study was approved by the medical ethical review board of the VU University Medical Center.

Data acquisition and image analysis
Data acquisition. A Philips Gemini TOF PET/CT system (Philips Medical Systems, Eindhoven, The Netherlands) was used to perform 18F-FDG PET/CT studies. All patients fasted for at least 6 hours prior to the intravenous injection of 18F-FDG. Blood glucose was measured before injection in all patients and did not exceed 8 mmol/l (144 mg/dl).
18F-FDG uptake time: To explore the effect of uptake time on the detection of vascular wall inflammation, three whole-body (cranium to mid-femur) emission scans were acquired 38, 60 and 90 minutes after intravenous injection of 3.5 MBq/kg 18F-FDG.
Scan duration equalled 2 min/bed position. Immediately following the third emission scan, a low-dose CT scan (80-120 kV, 20-35 mAs) was performed for attenuation correction and anatomic localization (voxel size 1.17 x 1.17 x 4 mm). PET images were reconstructed using a time-of-flight ordered subset expectation maximisation algorithm, as implemented by the vendor, providing images with a matrix size of 144 x 144 and a voxel size of 4 x 4 x 4 mm.
SUV normalisation: Scans were normalised for body weight (BW), lean body mass (LBM) and body surface area (BSA). We explored the effect of these normalisations on SUVmax values.
Image analysis. All images were analysed using the PET image analysis research tool developed at the Department of Radiology & Nuclear Medicine of the VU University Medical Center Amsterdam.
Region of interest (ROI): Regional maximum vascular 18F-FDG uptake (a surrogate marker of plaque inflammation) can be assessed using several methods. [9,15,16] Three methods were compared in this study.
First, ROIs were drawn on each axial slice of low-dose CT images in predefined vascular segments. (S2 Table and S1 Fig) Sagittal and coronal views were used to ensure correct placement. The resulting volume of interest (VOI) was transferred to the corresponding PET images. Subsequently, maximal activity in the VOI (SUVmax WS) was calculated after (visually) correcting for potential spill-over from adjacent FDG-avid regions (e.g. oesophagus for the descending aorta) and for pre-scan glucose levels. This (time-consuming) method should be highly sensitive for detecting the most inflamed region.
Second, ROIs were drawn directly on the axial slice of the PET image showing the most intense 18F-FDG uptake (hot-spot, SUVmax HS).
Third, SUVmax most-diseased segment (MDS) was derived by including two ROIs in slices adjacent (one proximal and one distal) to the visually determined hot-spot.
Two observers independently determined SUVmax WS and HS. Agreement between observers and between SUVmax WS, HS and MDS was determined.
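The three ROI methods can be contrasted on a list of per-slice SUVmax values. The sketch below is an illustration under stated assumptions: the function names and data are hypothetical, the observer's visual hot-spot is passed in as a slice index, and MDS is taken as the mean SUVmax of the hot-spot slice and its two neighbours, which is one reading of the description above rather than a confirmed implementation.

```python
def suvmax_whole_segment(slice_suvmax):
    """WS: highest SUVmax over every axial slice in the VOI."""
    return max(slice_suvmax)


def suvmax_hot_spot(slice_suvmax, hot_index):
    """HS: SUVmax of the single slice the observer picked visually."""
    return slice_suvmax[hot_index]


def suvmax_most_diseased_segment(slice_suvmax, hot_index):
    """MDS (assumed): mean SUVmax of the hot-spot slice and its
    proximal and distal neighbours, clipped at the segment ends."""
    lo = max(hot_index - 1, 0)
    hi = min(hot_index + 1, len(slice_suvmax) - 1)
    window = slice_suvmax[lo:hi + 1]
    return sum(window) / len(window)


# Hypothetical per-slice SUVmax values along one vascular segment.
slices = [1.8, 2.1, 2.6, 2.2, 1.9]
hot = 2  # slice index chosen visually by the observer (assumed here)

ws = suvmax_whole_segment(slices)
hs = suvmax_hot_spot(slices, hot)
mds = suvmax_most_diseased_segment(slices, hot)
print(ws, hs, round(mds, 2))
```

If the observer picks a slice other than the true maximum, HS falls below WS; averaging over neighbouring slices makes MDS lower than or equal to HS whenever the chosen slice is the local maximum.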
Background activity: Background (i.e. blood pool) activity was calculated by drawing ROIs on at least 3 axial slices in the blood pool of the inferior and superior vena cava and the center of the blood pool of the ascending aorta. SUVmean (i.e. average of all voxels in the resulting VOI) was compared between and within these regions at 60 and 90 minutes.
Voxel size/image matrix: The effect of voxel size and/or image rebinning was studied by comparing images with the original PET-voxel size (in these cases the co-registered CT-images were rebinned to PET-voxel size) with images with CT-voxel size (i.e. PET-images were rebinned to the voxel size of the CT-images, potentially affecting measured SUVmax values).
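Why rebinning could affect SUVmax can be illustrated in one dimension. The sketch below assumes linear interpolation between voxel centres, which may differ from the rebinning used by the analysis tool; the function name and the activity profile are hypothetical. Because each interpolated value is a weighted average of two neighbouring voxels, the resampled maximum cannot exceed the original maximum and may fall slightly below it when no fine voxel centre coincides with the peak.

```python
def resample_linear(values, old_spacing_mm, new_spacing_mm):
    """Linearly resample a 1-D profile sampled at voxel centres
    (illustrative stand-in for image rebinning)."""
    n_old = len(values)
    n_new = int(n_old * old_spacing_mm / new_spacing_mm)
    resampled = []
    for j in range(n_new):
        # Centre of new voxel j, expressed in fractional old-voxel indices.
        pos = (j + 0.5) * new_spacing_mm / old_spacing_mm - 0.5
        pos = min(max(pos, 0.0), n_old - 1.0)  # clamp at the profile edges
        lo = int(pos)
        hi = min(lo + 1, n_old - 1)
        frac = pos - lo
        resampled.append(values[lo] * (1.0 - frac) + values[hi] * frac)
    return resampled


# Hypothetical SUV profile across a vessel at PET voxel size (4 mm).
pet_profile = [1.0, 1.2, 2.4, 1.3, 1.0]

# Resample to the in-plane CT voxel size reported in the text (1.17 mm).
fine_profile = resample_linear(pet_profile, 4.0, 1.17)

print(max(pet_profile), round(max(fine_profile), 3))
```

A nearest-neighbour scheme would instead copy voxel values unchanged, so the size of any effect on SUVmax depends on the interpolation that is actually used.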
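In summary, data acquisition and image analysis parameters that were considered to potentially influence quantification of vascular wall 18F-FDG uptake were: 1. 18F-FDG uptake time, 2. SUV normalisation, 3. ROI/VOI definition (HS, WS and MDS), 4. background activity and 5. image matrix/voxel size.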

Statistical analysis
Continuous data are presented as mean (± standard deviation, SD) after normality was ascertained using frequency histograms and Q-Q plots. Categorical data are presented as proportions. Agreement between analytical methods and observers was assessed using intraclass correlation coefficients (ICCs; two-way random, absolute agreement) and Bland-Altman plots (for which limits of agreement were calculated). A linear mixed model was used to determine the relation between uptake time and continuous outcome variables (SUV and TBR), correcting for repeated measures within one subject: 18F-FDG uptake time (i.e. repeated measurements in time in different subjects) was considered as a fixed and random effect, and vascular segment (i.e. repeated measurements within one patient at one given time point) was considered as a random effect. Levene's test for homogeneity was used to compare differences in coefficients of variation (for background activity). SUVmax values for the voxel size comparison were transformed (using the natural logarithm) because of an association between average SUVmax and the difference between SUVmax for both voxel sizes. Statistical analyses were performed using SPSS (version 20; SPSS Inc.). Individual patient data (anonymized) can be found as supplemental data. (S1 File)
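The two agreement measures named above can be sketched in a few lines. This is a minimal illustration, not the SPSS procedure used in the study: it assumes the single-measures form of the two-way random, absolute-agreement ICC (often written ICC(2,1)) and the conventional 1.96 SD limits of agreement, neither of which is specified in the text. Function names and data are hypothetical.

```python
from statistics import mean, stdev


def icc_two_way_random_absolute(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measures.

    ratings: list of rows, one per target (e.g. a vascular segment in a
             patient), each row holding one value per observer or method.
    """
    n = len(ratings)       # number of targets
    k = len(ratings[0])    # number of observers/methods
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def bland_altman(a, b):
    """Return bias and lower/upper limits of agreement (bias +/- 1.96 SD)."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical SUVmax values for the hot-spot and whole-segment methods.
suvmax_hs = [2.0, 2.5, 3.0, 2.2]
suvmax_ws = [2.1, 2.5, 3.1, 2.2]

icc = icc_two_way_random_absolute([list(p) for p in zip(suvmax_hs, suvmax_ws)])
bias, lower, upper = bland_altman(suvmax_hs, suvmax_ws)
print(round(icc, 3), round(bias, 3), round(lower, 3), round(upper, 3))
```

Absolute agreement penalises a systematic offset between methods or observers, which a consistency-type ICC would ignore; the Bland-Altman bias shows the size and direction of that offset.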

Results
Patient characteristics are displayed in Table 1. None of the patients used statins and 4 of 6 patients (67%) used anti-hypertensive medication.

Uptake time
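Results for SUVmax and TBRmax at three post-injection scan times are illustrated in Fig 1. SUVmax gradually decreases over time. In contrast, TBRmax increases: significantly higher TBRmax values were found at 90 minutes (1.57, 95% CI 1.35–1.80) compared to 38 minutes (1.30, 95% CI 1.21–1.39; P = 0.024).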

Region of interest
Scatter plots and Bland-Altman plots for the agreement between SUVmax HS and WS are displayed in Fig 2 and S3A Fig. Comparable Bland-Altman plots were constructed (data not shown), with the average difference between SUVmax WS and HS approaching zero in most vascular segments without many outliers. Intraclass correlation coefficients were high in all vascular segments when SUVmax HS was compared to SUVmax WS. (Table 2) However, 95% confidence intervals (combined with ICCs > 0.9) suggest that agreement was best in the aortic segments.
SUVmax HS was higher than SUVmax MDS in all vascular segments. This is shown in S3B Fig in a Bland-Altman plot for the aortic arch. On several occasions, however, SUVmax MDS was higher, suggesting that the 'culprit' lesion was missed using SUVmax HS. Intraclass correlation coefficients were high in all vascular segments. (Table 2)

Interobserver agreement
Interobserver agreement was excellent for all vascular segments, except the right iliac artery, for the whole-segment method (Table 2). Using the hot-spot method, agreement was excellent for the aortic arch, abdominal aorta and the right femoral artery.
Scatter and Bland-Altman plots (Fig 3 and S4 Fig) show excellent agreement and suggest that, for both methods, there were no systematic differences in observer agreement for high or low values, most values lying within 2 standard deviations of the mean difference, indicating good agreement between the observers.

Background activity
Blood pool activity in the 3 vascular segments that were studied at all time points is shown in Fig 4. At 38 minutes, blood pool activity (SUVmean) was significantly higher in the ascending aorta than in the superior vena cava, whereas there were no differences at 60 and 90 minutes.
Variation in blood pool activity appears to be larger after 90 minutes, although not statistically significant.

Voxel size
The average difference between SUVmax at PET and CT voxel size approximated zero, with SUVmax at PET voxel size being slightly higher (Bland-Altman, S5 Fig). Values were transformed (using the natural logarithm), after which the association between average SUVmax and the difference between SUVmax for both voxel sizes disappeared. A summary of the main findings is shown in Table 3.

Discussion
Our study shows that data acquisition and image analysis may cause differences in quantification of vascular wall 18F-FDG uptake. Data acquisition, in this case 18F-FDG uptake time and SUV normalisation, appears to affect quantification considerably, whereas image analysis (ROIs, voxel size and background activity) causes minimal to moderate differences.
Previously, dynamic studies have suggested that late-imaging (i.e. increasing FDG uptake time) is superior for the detection of vessel wall inflammation. [2] This finding has been disputed by other investigators. [17] Interestingly, the latter group studied the abdominal aorta as opposed to the carotid arteries. In this study, we showed that SUVmax decreases when uptake time increases, whereas TBRmax increases significantly over time. This effect could not be established for individual segments, most likely due to the small sample size. We may conclude, however, that the decrease in blood pool activity is larger than the decrease in vascular wall uptake. Furthermore, the inter-subject variability of TBRmax values was larger at late-imaging, from which we could hypothesize that vascular wall activity in inflammatory/vulnerable atherosclerotic lesions may decrease at a slower pace than in non-inflammatory lesions. Conversely, late-imaging did not improve correlations between the hot-spot and whole-segment methods. It appears that late-imaging is not required for the visual detection of atherosclerotic plaque activity. This result is in accordance with a previous study in which patterns and locations of 18F-FDG uptake could be identified in early and delayed images in patients with carotid artery disease. [10]
Quantification by SUV depends on the type of normalisation used. SUVs are commonly normalised for either body weight (BW) or lean body mass (LBM), although body surface area (BSA) is also used occasionally. SUVmax BW was clearly higher (i.e. 35%) than SUVmax LBM. SUVmax BSA was much lower due to a different calculation. At present, SUVmax BSA has not been used in vascular wall uptake quantification. In oncology, the most appropriate method for SUV normalisation is still a matter of debate, although use of LBM is increasingly being introduced. [18] Our results show that the type of normalisation used should definitely be consistent within (multicenter) studies.
To date, there are no standardized methods to draw regions of interest to detect the most inflamed vascular region. Initially, a 'systemic' approach was used in which the average of SUVmax values (as global burden of atherosclerotic disease) of all slices along the course of a vascular segment was calculated. [2] Subsequently, a more 'focal' approach, including the single-hottest-slice (hot-spot/HS) and most-diseased segment (MDS), was added. [9,15,16] Several studies have used visually enhanced 18F-FDG uptake. [19-21] This method is less time-consuming than a systematic approach in which all axial slices are analysed, but potentially less sensitive to detect the culprit lesion. Our study showed similar SUVmax values for the two methods, indicating that the (visually detected) hot-spot method is equally sensitive and can be used safely without the risk of missing inflamed lesions. Nevertheless, interobserver agreement was not optimal in all regions. Therefore, visual (HS and MDS) analysis should only be used in the aortic arch or abdominal aorta (both showing excellent agreement between HS and WS SUVmax and between observers). These findings contradict previous studies showing high interobserver agreement for all vascular segments. [8,22] In addition, most studies reporting observer agreement show that it is generally good to excellent. [7,23-27] Moderate observer agreement was described in one earlier study that analysed peripheral arteries. [28] Low observer agreement for the visual identification of the site with most intense FDG uptake in peripheral arteries is not surprising, as detecting the most inflamed lesion is more challenging there due to the smaller calibre of the vessels, high FDG uptake in adjacent structures (e.g. muscles) and patient movement. Studies using a 'systemic' approach by averaging SUVmax from multiple axial slices may be less prone to observer variability.
SUVmax MDS was consistently lower than SUVmax HS, probably due to the larger number of axial slices used to average SUVmax (as it would include more slices having a lower SUVmax value). Though both methods have been used previously, there are no data that support either one as a predictor of future cardiovascular events.
Background (blood pool) activity is almost exclusively assessed by measuring SUVmean in the blood pool of an adjacent vein (e.g. jugular vein) or the inferior or superior vena cava. Two earlier reports have also calculated blood pool activity in the center of a large artery. [17,29] It has been suggested that spill-over from atherosclerotic lesions may lead to overestimation of blood pool activity. [30] Conversely, venous blood pool activity may be lower due to tissue metabolism extracting glucose at the capillary level, yet our results show that arterial and venous blood pool activity were comparable at both 60 and 90 minutes post-injection.
To our knowledge, this study is the first to report on the effect of voxel size. We hypothesized that, due to rebinning, conversion of PET images to CT voxel size would lead to changes in SUVmax. We were not able to establish this in the vascular segments that we studied. However, comparisons were only made in the large (aortic) segments, as visual detection of smaller segments was too difficult after CT images were converted to PET voxel size. Therefore, it remains to be elucidated whether the results also apply to smaller vessels. Nonetheless, our results suggest that adjusting the voxel size of PET images to CT voxel size, which is required for proper transfer of VOIs, can be performed without significantly affecting SUVmax. The purpose of this procedure is to enhance differentiation between vascular segments and structures that may potentially cause spill-over (e.g. oesophagus).
Our study has strengths and limitations. First, our population consists of patients with high cardiovascular risk. RA is associated with increased cardiovascular disease. [15] Additionally, a recent study showed that RA patients had significantly higher SUVs than a historical control group of patients with cardiovascular disease. [15] As our population had a history of both RA and cardiovascular disease, we were more certain that we were actually studying vascular inflammation, although a gold standard (histological proof) was absent. Whether findings in this patient group can be generalized to atherosclerosis patients without systemic autoimmune inflammation remains to be determined. Second, our multi-segment approach enabled us to study the effects of potential factors on different vascular segments, which proved to be useful as differences between segments in susceptibility to variation were observed.
The small sample size is a limitation. With a larger sample, additional statistically significant differences might have been observed, especially for individual segments in the analysis of uptake time variability. Also, the limited number of possible effectors studied may be considered a limitation.
Parameters that were not investigated, but may potentially influence quantification, are reconstruction parameters, pre-scan glucose values (although we did correct for them) and image acquisition parameters (e.g. time/bed position). The European Association of Nuclear Medicine (EANM) has recently published a position statement on the use of 18F-FDG PET imaging in atherosclerosis to optimize standardization of arterial PET imaging, in which recommendations on these parameters are provided. [5] Also, 'atherosclerotic' lesions on the 18F-FDG PET scan were not confirmed by histology or CTA/MRI. Understandably, it was impossible to obtain histological specimens in these patients. In addition, the large amount of radiation of whole-body CTA and the costs of whole-body MRI precluded the use of these imaging modalities. Finally, we only studied the effect of factors potentially influencing focal SUVmax. Earlier studies mainly calculated the average SUVmax in a vascular segment and this method might be less susceptible to the influence of some of the variation in image acquisition and analysis. [31]

Conclusion
Quantification of vascular inflammation, using SUVmax/TBRmax, is affected by several data acquisition parameters, i.e. 18F-FDG uptake time and SUV normalisation. Image analysis may also introduce variability/affect observer agreement. Although the individual factors may not have a large impact by themselves, the cumulative effects of these factors may result in substantial differences in reported SUVs throughout studies and within multi-center trials. These results stress the need for standardisation in both clinical practice and research settings. Some of our findings corroborate the recently published position statement of the EANM (i.e. recommendation of late-imaging, particularly for TBR measurements) and some add to this statement (importance of standardization of SUV normalisation and preference of visual (i.e. less time-consuming) methods for aortic arch and abdominal aorta). [5]

Supporting information
S1 Fig. In the whole-segment method axial low-dose CT slices (Fig A) were used to draw a region of interest (ROI) comprising the entire artery (Fig B). Afterwards this ROI was transferred to the corresponding 18F-FDG-PET slices (Fig C,D). Multiple axial ROIs were drawn, resulting in a final volume of interest (VOI), which is illustrated in a coronal low-dose CT slice. Maximal standardized uptake values were determined in this VOI after being transferred to the 18