Comparison of Texture Features Derived from Static and Respiratory-Gated PET Images in Non-Small Cell Lung Cancer

Background PET-based texture features have been used to quantify tumor heterogeneity because of their predictive power for treatment outcome. We investigated the sensitivity of texture features to tumor motion by comparing static (3D) and respiratory-gated (4D) PET imaging.

Methods Twenty-six patients (34 lesions) received 3D and 4D [18F]FDG-PET scans before chemo-radiotherapy. The acquired 4D data were retrospectively binned into five breathing phases to create the 4D image sequence. Texture features, including Maximal correlation coefficient (MCC), Long run low gray (LRLG), Coarseness, Contrast, and Busyness, were computed within the physician-defined tumor volume. The relative difference (δ3D-4D) in each texture between the 3D- and 4D-PET imaging was calculated. The coefficient of variation (CV) was used to determine the variability in the textures between all 4D-PET phases. Correlations between tumor volume, motion amplitude, and δ3D-4D were also assessed.

Results 4D-PET increased LRLG (δ3D-4D = 1%–2%, p<0.02) and Busyness (δ3D-4D = 7%–19%, p<0.01), and decreased MCC (δ3D-4D = 1%–2%, p<7.5×10−3), Coarseness (δ3D-4D = 5%–10%, p<0.05), and Contrast (δ3D-4D = 4%–6%, p>0.08) compared to 3D-PET. Nearly negligible variability was found between the 4D phase bins, with CV<5% for MCC, LRLG, and Coarseness. For Contrast and Busyness, moderate variability was found, with CV = 9% and 10%, respectively. No strong correlation was found between the tumor volume and δ3D-4D for the texture features. Motion amplitude had a moderate impact on δ3D-4D for MCC and Busyness and no impact for LRLG, Coarseness, and Contrast.

Conclusions Significant differences were found in MCC, LRLG, Coarseness, and Busyness between 3D and 4D PET imaging. The variability between phase bins for MCC, LRLG, and Coarseness was negligible, suggesting that similar quantification can be obtained from all phases. Texture features, blurred out by respiratory motion during 3D-PET acquisition, can be better resolved by 4D-PET imaging. 4D-PET textures may have better prognostic value as they are less susceptible to tumor motion.


Introduction
Positron emission tomography (PET) with [18F]fluorodeoxyglucose (FDG), a surrogate of glucose metabolism, is an essential clinical tool for tumor diagnosis, staging, and monitoring of tumor progression [1][2][3][4]. Accurate quantification of tumor characteristics based on [18F]FDG-PET images can provide valuable information for optimizing therapy [5,6]. Standardized uptake value (SUV) measures, such as maximum, peak, mean, and total SUV, are commonly used to quantify tumor characteristics [7][8][9][10]. High baseline SUV has been found to be associated with poor treatment outcome in many tumors, such as esophageal, lung, and head-and-neck cancer [11][12][13].
High intra-tumoral heterogeneity has been shown to relate to poor prognosis and treatment resistance [14,15]. However, SUV measures fail to adequately capture the spatial heterogeneity of the intra-tumoral uptake distribution [16,17]. Therefore, texture features, which can be derived from a number of mathematical models of the relationship between multiple voxels and their neighborhood, have been proposed to describe tumor heterogeneity [18,19]. In particular, pretreatment [18F]FDG PET texture features have shown promise for delineating nodal and tumor volumes [20,21] and assessing therapeutic response [22][23][24]. Studies have suggested that texture features perform better than SUV measures in treatment outcome prediction [22][24][25][26]. For example, Cook et al (2013) compared the predictive power of common SUV measures and four neighborhood gray-tone difference matrix (NGTDM) derived textures in non-small cell lung cancer (NSCLC) patients [27]. They found that NGTDM-derived Coarseness, Contrast, and Busyness were not only better prognostic predictors than the SUV measures, but also better able to differentiate responders from nonresponders.
Despite the clinical potential of texture features, their accurate quantification may be hindered by respiratory motion in lung cancer patients. Motion-induced image blurring in static PET images (3D PET) can lead to a reduction in measured tumor uptake and an overestimation of metabolic tumor volume [28][29][30]. 4D PET imaging gates PET image acquisition to the respiratory motion in order to improve image quality, and has been shown to reduce motion blurring in the PET images, providing more accurate quantification of lung tumor activity [28][31][32][33][34]. We hypothesize that fine texture features are likely to be blurred during 3D PET acquisition of lung tumors.
With the growing interest in texture features and tumor heterogeneity, the impact of tumor motion on PET-based texture quantification needs to be studied, as it is still unknown. In this study, we compared the quantification of texture features between 3D and 4D PET imaging. Although numerous texture features can be found in the literature [22,35,36], we focused on five. Three NGTDM-derived features, Coarseness, Contrast, and Busyness, were chosen for their predictive value in lung cancer patients [27]. A gray level co-occurrence matrix (GLCM) derived Maximal Correlation Coefficient (MCC) [37] and a gray level run length matrix (GLRLM) derived Long Run Low Gray level emphasis (LRLG) [38] were also computed because of their robustness against variation of the reconstruction parameters of PET images [36].
The NGTDM texture features were originally designed to resemble human perception and were first proposed by Amadasun and King (1989) [18]. In a coarse image, the texture is made up of large patterns, such as large areas with a uniform intensity distribution. Contrast measures the intensity difference between neighboring regions within the tumor. Busyness is a measure of the intensity change between multiple voxels and their surroundings. GLCM-MCC was first introduced by Haralick et al in 1973 [37] and is used to measure the statistical relationship between two neighboring voxels. GLRLM-LRLG measures the joint distribution of long runs and low intensity values, where a run is a set of consecutive voxels with the same intensity along a specific direction [38].
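To make the three NGTDM measures concrete, the following is a minimal sketch of the commonly cited Amadasun and King definitions, not the toolbox implementation used in this study. It assumes a 2D image of integer gray levels and a 3×3 neighborhood, and it uses only interior pixels; the function name `ngtdm_features` and the small `eps` guard are illustrative choices.

```python
from collections import defaultdict


def ngtdm_features(img, eps=1e-12):
    """Return (coarseness, contrast, busyness) for a 2D list of integer gray levels.

    Simplifying assumptions: 3x3 neighborhood, interior pixels only.
    """
    rows, cols = len(img), len(img[0])
    s = defaultdict(float)    # s[i]: summed |i - mean of 8 neighbors| over pixels at level i
    count = defaultdict(int)  # number of interior pixels at level i
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            level = img[r][c]
            neigh = [img[r + dr][c + dc]
                     for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                     if (dr, dc) != (0, 0)]
            s[level] += abs(level - sum(neigh) / len(neigh))
            count[level] += 1

    n = sum(count.values())
    levels = sorted(count)
    p = {i: count[i] / n for i in levels}  # probability of each gray level

    # Coarseness: inverse of the probability-weighted neighborhood difference.
    weighted = sum(p[i] * s[i] for i in levels)
    coarseness = 1.0 / (eps + weighted)

    # Contrast: gray-level spread multiplied by the mean neighborhood difference.
    ng = len(levels)
    if ng > 1:
        spread = sum(p[i] * p[j] * (i - j) ** 2 for i in levels for j in levels)
        contrast = spread / (ng * (ng - 1)) * sum(s.values()) / n
    else:
        contrast = 0.0

    # Busyness: weighted neighborhood difference relative to gray-level separation.
    denom = sum(abs(i * p[i] - j * p[j]) for i in levels for j in levels)
    busyness = weighted / denom if denom > 0 else 0.0
    return coarseness, contrast, busyness


# Two large uniform regions versus a fine checkerboard of the same two levels.
blocky = [[1, 1, 1, 4, 4, 4] for _ in range(6)]
checker = [[1 if (r + c) % 2 == 0 else 4 for c in range(6)] for r in range(6)]
print(ngtdm_features(blocky))
print(ngtdm_features(checker))
```

In this toy comparison the image made of two large uniform regions has the higher Coarseness, while the checkerboard has the higher Contrast and Busyness, which matches the qualitative descriptions above.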

Patients and imaging
This study was conducted under a Dana-Farber Cancer Institute institutional review board (IRB) approved protocol (protocol #: 06-294), and written consent was obtained from all patients. Twenty-six patients (mean age = 65±10 yr, 14 males, 12 females) with NSCLC received a treatment planning CT (both 3D and 4D) two weeks before the start of radiotherapy with or without concurrent chemotherapy. A 3D [18F]FDG-PET/CT scan, a free-breathing chest CT, and a 4D [18F]FDG-PET scan were acquired 1-2 weeks prior to the therapy. There were sixteen patients with adenocarcinoma and ten patients with squamous cell carcinoma. The internal tumor volumes (ITV), which encompassed tumor motion, of thirty-four lesions (1-3 malignant tumors/patient) were delineated by an experienced radiation oncologist on a 4D planning CT. 3D PET and 4D PET scans were performed on a Siemens Biograph PET/CT scanner (Siemens AG, Erlangen, Germany). Attenuation correction of the 3D PET images was performed using the whole-body 3D CT images, while the 4D PET images were corrected using the free-breathing chest CT images. 3D PET scans were acquired approximately 100 min after injection of 16.7-22 mCi of [18F]FDG. For the 3D PET scan, the images were acquired for 3-5 min/bed position in six to seven bed positions. The 3D PET images were reconstructed with ordered-subset expectation-maximization (OSEM) with 4 iterations, 8 subsets, and 7 mm full-width-half-maximum (FWHM) post-filtration, and sampled onto a 168×168 grid of 4.06×4.06 mm² pixels. The 4D PET acquisition followed immediately after the completion of the 3D PET scan.
4D PET images were acquired at one bed position centered on the tumor and covering part of the lung for 20-30 min, depending on the comfort of the patient. An AZ-733V respiratory gating system (Anzai Medical System, Tokyo, Japan) was employed to monitor patient respiratory motion [39]. The acquired data were retrospectively binned into five phases, starting at the inhale peak (bin 1), to create the 4D image sequence using the phase-based algorithm provided by the Siemens Biograph PET/CT scanner (Siemens AG, Erlangen, Germany). The five phase bins corresponded to the end of inhalation (bin 1), inhalation-to-exhalation (bin 2), mid exhalation (bin 3), end of exhalation (bin 4), and exhalation-to-inhalation (bin 5), respectively. The respiratory-gated 4D PET images were reconstructed with OSEM with 2 iterations, 8 subsets, and 5 mm FWHM, and sampled onto a 256×256 grid of 2.67×2.67 mm² pixels.

Texture features
The planning CT was rigidly registered to the 3D and 4D PET images using normalized mutual information. The transformations were then applied to each ITV. The 3D and 4D PET images were cropped to the tumor region using the registered ITV contour. The number of voxels per tumor region ranged from 85 to 6483, with a median of 545. Prior to texture feature computation, all PET images (PET(x)) were preprocessed using the following equation,
$$\mathrm{PET}'(x) = \frac{\mathrm{PET}(x) - \mathrm{minPET}}{\mathrm{maxPET} - \mathrm{minPET}}$$
where minPET and maxPET are the minimum and maximum intensities of PET within the tumor region. The intensity range of the post-processed image (PET'(x)) was converted into 32 discrete values as suggested by Orlhac et al (2014) [40]. Within the tumor region, the following four neighborhood gray-tone difference matrix (NGTDM) derived texture features were computed to quantify tumor heterogeneity: Coarseness, Contrast, Busyness, and Complexity. These were implemented in MATLAB (The Mathworks Inc., Natick, MA) using the Chang-Gung Image Texture Analysis Toolbox [41,42]. The mathematical definitions of the NGTDM, GLCM, and GLRLM texture features can be found in Amadasun and King (1989) [18], Haralick et al (1973, 1979) [37,43], and Galloway (1975) [38], respectively. The 3D (168×168) and 4D (256×256) PET images were reconstructed to different matrix sizes based on different reconstruction parameters. Additionally, because of the difference in 3D and 4D PET acquisition times, fewer photon counts and higher noise may be found in the 4D PET images. Therefore, to reduce noise, all 4D PET images were downsampled to the grid and resolution of the 3D PET images using linear interpolation prior to texture feature computation.
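The intensity preprocessing described above can be sketched as follows. This is only an illustration under stated assumptions: the rescaling is taken to be a min-max normalization over the tumor region, and the 32 discrete values are taken to be equal-width bins numbered from 1. The function name `quantize` and the exact binning rule are hypothetical, not taken from the toolbox used in the study.

```python
def quantize(values, levels=32):
    """Rescale tumor-region intensities to [0, 1] and map them to 1..levels.

    Assumptions: min-max normalization and equal-width bins (illustrative only).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A perfectly uniform region collapses to a single gray level.
        return [1] * len(values)
    out = []
    for v in values:
        norm = (v - lo) / (hi - lo)            # min-max rescaling to [0, 1]
        level = min(int(norm * levels), levels - 1) + 1  # clamp the maximum into the top bin
        out.append(level)
    return out


# Example: SUV-like values inside a hypothetical tumor region.
region = [1.2, 2.5, 3.1, 4.8, 7.9, 9.6]
print(quantize(region))
```

Normalizing within each tumor region before discretization means the gray levels describe relative uptake inside the lesion rather than absolute SUV.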

Data analysis
The relative difference (δ3D-4D) in texture features between 3D and 4D PET was calculated for each phase bin, where Q3D is the quantification (i.e. the texture feature measure) based on 3D PET and Q4D,j is the quantification based on bin j of the 4D PET images. A Wilcoxon signed-rank test (p<0.05) was performed on the pairs to determine whether Q3D and Q4D,j were significantly different. We defined the avid tumor volume (ATV) as the region within the ITV with SUV above 40% of the maximum SUV [29]. We investigated the influence of ATV and ITV on δ3D-4D using Spearman's correlation coefficient (R) with a significance level of p = 0.05.
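The defining equation for δ3D-4D did not survive in this text, so the sketch below assumes one plausible form: the percent change of each 4D phase bin relative to the 3D value, (Q4D,j − Q3D)/Q3D × 100%. The sign convention, the function names, and the example numbers are assumptions for illustration only. The ATV helper simply counts voxels whose SUV exceeds 40% of the maximum SUV in the region.

```python
def relative_difference(q3d, q4d_bins):
    """Percent difference of each 4D phase bin relative to the 3D value.

    Assumed form: 100 * (Q4D_j - Q3D) / Q3D; the sign convention is a guess.
    """
    return [100.0 * (q4d - q3d) / q3d for q4d in q4d_bins]


def avid_tumor_volume(suv, voxel_volume, fraction=0.40):
    """Volume of voxels with SUV above `fraction` of the maximum SUV.

    `suv` is a flat list of SUVs inside the ITV; `voxel_volume` is the
    volume of a single voxel in whatever unit the result should carry.
    """
    threshold = fraction * max(suv)
    n_avid = sum(1 for v in suv if v > threshold)
    return n_avid * voxel_volume


# Hypothetical Busyness values: one 3D measurement and five 4D phase bins.
print(relative_difference(2.0, [2.2, 2.3, 2.1, 2.4, 2.2]))
print(avid_tumor_volume([1.0, 2.0, 5.0, 10.0, 4.1], voxel_volume=0.5))
```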
A Kruskal-Wallis test was used to assess whether one phase was significantly different from the other phases (p<0.05). The variability in the texture feature measures between all five phase bins was assessed using the coefficient of variation (CV).
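As a small illustration of the phase-bin variability measure, the CV can be computed as the standard deviation of a feature across the five bins divided by its mean. Whether the sample or the population standard deviation was used is not stated, so the sample version below is an assumption, as are the function name and the example values.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical Contrast values measured on the five phase bins of one lesion.
contrast_bins = [0.41, 0.45, 0.39, 0.47, 0.43]
print(coefficient_of_variation(contrast_bins))
```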
To estimate the extent of motion, the centers of mass (C_j) of the PET avid region (ATV) on all five 4D PET bins were recorded. The amplitude of the tumor motion was estimated using the maximum difference in C_j between the five bins [28,29]:
$$\mathrm{Amp} = \max\{C_i - C_j\} \qquad (5)$$
where i and j range from 1 to 5.
To study the impact of tumor motion, we calculated Spearman's correlation coefficient between the Amplitude:ATV ratio and δ3D-4D, with a significance level of p = 0.05. The Amplitude:ATV ratio is a measure of motion amplitude relative to the tumor volume. A large Amplitude:ATV ratio indicates large tumor movement relative to the tumor size.
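The motion estimate above can be sketched as follows. The sketch assumes that each center of mass is intensity-weighted and that the difference between two centers is their Euclidean distance; neither detail is stated explicitly in the text, and the function names are illustrative.

```python
import math
from itertools import combinations


def center_of_mass(voxels):
    """Intensity-weighted centroid of (x, y, z, intensity) tuples, in mm."""
    total = sum(w for *_, w in voxels)
    return tuple(sum(v[k] * v[3] for v in voxels) / total for k in range(3))


def motion_amplitude(centers):
    """Largest Euclidean distance between any two phase-bin centroids."""
    return max(math.dist(a, b) for a, b in combinations(centers, 2))


def amplitude_to_atv_ratio(amplitude_mm, atv_mm3):
    """Motion amplitude relative to avid tumor volume (mm / mm^3 = mm^-2)."""
    return amplitude_mm / atv_mm3


# Hypothetical centroids (mm) of the avid region on the five phase bins.
centers = [(0.0, 0.0, 0.0), (0.5, 0.2, 1.5), (1.0, 0.4, 3.0),
           (0.8, 0.3, 4.0), (0.3, 0.1, 2.0)]
amp = motion_amplitude(centers)
print(amp, amplitude_to_atv_ratio(amp, atv_mm3=12000.0))
```

Dividing a length by a volume gives the mm⁻² unit reported for the Amplitude:ATV ratio.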

Furthermore, textures may be affected by motion differently according to the tumor histology. Therefore, we investigated whether δ3D-4D was significantly different between adenocarcinomas (21 lesions) and squamous cell carcinomas (13 lesions) using the Mann-Whitney U-test with p<0.05.

Results
4D PET images appeared to have higher uptake and less blurring than the corresponding 3D PET images (Fig. 1). The differences between 3D and 4D PET were found to be significant (p<<0.01) for Busyness, MCC, and LRLG, as shown in Table 1. A significant difference for Coarseness was found in all bins (p<<0.01) except bin 2 (p = 0.59) (Table 1). The Coarseness determined on the 3D PET images was about 10% higher than on the 4D PET images. 4D PET images were found to have as much as a 19% increase in Busyness compared to the corresponding 3D PET images (Table 1, Fig. 2). MCC was found to be 2% higher in 3D PET than in 4D PET, while LRLG was 2% higher in 4D PET than in 3D PET. However, Contrast on 3D images was only about 5% lower than on 4D PET, and δ3D-4D was not significant (p>0.08) (Table 1, Fig. 2).

Discussion
In this study, we investigated the sensitivity of prognostic PET texture features to respiratory motion. Our results suggest that texture measures are sensitive to tumor motion. Substantial differences between 3D and 4D (δ3D-4D >10%) were found in Coarseness and Busyness. Therefore, the temporal resolution offered by 4D PET imaging may lead to more accurate quantification of image features. Coarseness, Contrast, and Busyness considered in this study were originally designed to resemble human perception and were first proposed by Amadasun and King (1989) [18]. Cook et al (2012) [27] have shown that these three texture features are clinically relevant to lung cancer because of their predictive value for patient outcome. In a coarse image, the texture is made up of large patterns, such as large areas with a uniform intensity distribution. As breathing motion blurs the fine textures in the images, the 3D PET images appear more uniform (Fig. 1) and therefore have higher Coarseness than the 4D PET images. Contrast was not significantly sensitive to motion-induced blurring. The intensity difference between neighboring regions within the tumor was observed to be more pronounced in the 4D PET images (Fig. 1), leading to slightly higher Contrast (δ3D-4D of about 5%) in 4D PET than in 3D PET images. Busyness is a measure of the intensity change between single voxels and their surroundings. Busyness computed on 4D PET images was found to be as much as 20% higher than on the 3D PET images. Since δ3D-4D tended to be higher at large Amplitude:ATV, the quantification of Busyness is especially sensitive to large relative tumor amplitude. However, 3D PET imaging was employed in the study of Cook et al (2012). Our results suggest that the quantification and prognostic value of Busyness can be adversely affected by tumor motion.
GLCM-MCC and GLRLM-LRLG were included in the 3D vs 4D PET imaging comparison as they are insensitive to the reconstruction parameters of PET images [36]. Tumor motion blurring in 3D PET images can reduce the intensity difference between neighboring voxels. Therefore, neighboring voxels are better correlated in 3D PET than in 4D PET, leading to a significant 2% higher MCC in 3D PET images. LRLG measures the joint probability of long runs and low gray values. As observed in Fig. 1, low intensity voxels are more localized (less distance apart) in the motion-blurred 3D PET than in the 4D PET images. Therefore, LRLG was higher in 4D PET than in 3D PET. In this study, the 4D PET images were binned into five phases. The activity uptake of each bin was slightly different, as in Huang and Wang (2013) [30]. The bin with the highest SUVmax is often chosen to be the "best" bin for 4D PET imaging [29,44,45]. However, we found that the variability between phase bins for MCC, LRLG, and Coarseness was negligible (CV<5%), suggesting that similar quantification can be obtained from all phases. The small variability may be due to the small tumor amplitude (4.4±4.6 mm) in our dataset. On the other hand, the phase bin variability was found to be moderate for Contrast and Busyness (CV of 9% and 10%, respectively). The values of Contrast and Busyness may therefore depend on the choice of phase bin. MCC, LRLG, and Coarseness are independent of the choice of phase bin, and therefore should be recommended for quantification of tumor characteristics in 4D PET imaging.
Fig. 2. Distribution of the difference between 3D and 4D PET (δ3D-4D) in the texture features across 34 lesions. The top vertical line of a boxplot represents the 75th-95th percentiles of the data. The bottom vertical line is the 5th-25th percentiles. The interquartile range (IQR) of the data is indicated by the width of the boxplot. Asterisks indicate the maximum and minimum differences. Median and mean differences are indicated by the bar and square inside the box plots, respectively. MCC = Maximal correlation coefficient. LRLG = Long run low gray-level emphasis. The first boxplot represents the comparison of 3D and 3D PET textures (δ3D-3D). δ3D-3D is therefore zero by definition, as shown in the first "boxplot" for each texture. doi:10.1371/journal.pone.0115510.g002
Table 2. Spearman correlation coefficient of Amplitude:ATV (mm⁻²) and δ3D-4D and its p-value.
Apart from the texture features, studies often investigate the effect of respiratory motion on the quantification of various SUV measures, especially the maximum SUV [28,29,33]. In these studies, SUVmax was found to increase by 25% to 80% with 4D PET imaging. The motion-induced artifacts not only lower the maximum tumor uptake on the 3D PET images, but may also lead to misclassification of lesions. For example, García Vicente et al (2010) compared the SUVmax determined on 3D and 4D PET images for 42 lesions in lung cancer patients [33]. Tumors with SUVmax over 2.5 were considered malignant in their study. As a result, 40% (17/42) of the lesions needed to be changed from benign to malignant. To this end, although the results are not shown, we also compared the differences in four SUV measures (SUVmax, SUVpeak, SUVmean, and SUVtotal). 4D PET imaging increased the measurements of SUVmax and SUVpeak by about 30% and 25%, respectively, while the increases in SUVmean and SUVtotal were only about 5%. Our SUVmax results are comparable to those of the previous studies [28,29,33].
However, our texture and SUV comparison has one limitation: malignant tumor tissue has been shown to continue increasing its uptake of [18F]FDG even 2 hours after injection [46][47][48]. While the 3D PET imaging was  [49]. SUVmax was also found to be highly correlated with entropy and energy in a study conducted by Orlhac et al (2014) [40] using patients with metastatic colorectal, lung, and breast cancer. These two studies may therefore suggest that the histogram-derived textures are likely to be affected by the delayed imaging. However, none of the textures used in our study has been found to be highly correlated with SUVmax [40]. This may be because the textures we used are based on the spatial relationship between neighborhoods of voxels, and are not directly dependent on the intensity value of single or multiple voxels within the tumors. Nevertheless, further study is needed to better understand the impact of delayed imaging on texture quantification. All the PET images in our study underwent attenuation correction using the free-breathing CT images. The anatomical mismatch between the PET and CT scans caused by respiratory motion may affect the quality of the attenuation-corrected 4D PET images, and subsequently the quantification of texture features [29,50,51]. Moreover, because of the difference in 3D and 4D PET acquisition times, fewer photon counts and higher noise may be found in the 4D PET images, which may subsequently affect the accuracy of texture feature quantification. To mitigate the effect of noise, all 4D PET images had a minimum acquisition time of 20 min. These potential effects will be explored further in a future study.

Conclusions
Texture features, representing tumor heterogeneity, are blurred out by respiratory motion during 3D PET acquisition. 4D PET imaging reduces motion blurring, enabling PET-based features to be better resolved. Significant differences were found in MCC, LRLG, Coarseness, and Busyness between 3D and 4D PET imaging. When measuring tumor heterogeneity characteristics with PET imaging, reduced motion blurring by 4D PET acquisition enables significantly better spatial resolution of texture features. 3D PET textures may lead to inaccurate prediction of treatment outcome, hindering optimal lung cancer patient management. 4D PET textures may have better prognostic value as they are less susceptible to tumor motion.