Prognostic value of baseline [18F]-fluorodeoxyglucose positron emission tomography parameters MTV, TLG and asphericity in an international multicenter cohort of nasopharyngeal carcinoma patients

Purpose [18F]-fluorodeoxyglucose (FDG) positron emission tomography (PET) parameters have shown prognostic value in nasopharyngeal carcinomas (NPC), mostly in monocenter studies. The aim of this study was to assess the prognostic impact of standard and novel PET parameters in a multicenter cohort of patients. Methods The established PET parameters metabolic tumor volume (MTV), total lesion glycolysis (TLG) and maximal standardized uptake value (SUVmax) as well as the novel parameter tumor asphericity (ASP) were evaluated in a retrospective multicenter cohort of 114 NPC patients with FDG-PET staging, treated with (chemo)radiation at 8 international institutions. Uni- and multivariable Cox regression and Kaplan-Meier analysis with respect to overall survival (OS), event-free survival (EFS), distant metastases-free survival (FFDM), and locoregional control (LRC) were performed for clinical and PET parameters. Results When analyzing metric PET parameters, ASP showed a significant association with EFS (p = 0.035) and a trend for OS (p = 0.058). MTV was significantly associated with EFS (p = 0.026), OS (p = 0.008) and LRC (p = 0.012) and TLG with LRC (p = 0.019). TLG and MTV showed a very high correlation (Spearman's rho = 0.95); therefore, TLG was subsequently not further analyzed. Optimal cutoff values for defining high and low risk groups were determined by maximization of the p-value in univariate Cox regression considering all possible cutoff values. Generation of stable cutoff values was feasible for MTV (p<0.001), ASP (p = 0.023) and combination of both (MTV+ASP = occurrence of one or both risk factors, p<0.001) for OS and for MTV regarding the endpoints OS (p<0.001) and LRC (p<0.001). In multivariable Cox (age >55 years + one binarized PET parameter), MTV > 11.1 ml (hazard ratio (HR): 3.57, p<0.001) and ASP > 14.4% (HR: 3.2, p = 0.031) remained prognostic for OS. MTV additionally remained prognostic for LRC (HR: 4.86, p<0.001) and EFS (HR: 2.51, p = 0.004).
Bootstrapping analyses showed that a combination of high MTV and ASP improved prognostic value for OS compared to each single variable significantly (p = 0.005 and p = 0.04, respectively). When using the cohort from China (n = 57 patients) for establishment of prognostic parameters and all other patients for validation (n = 57 patients), MTV could be successfully validated as prognostic parameter regarding OS, EFS and LRC (all p-values <0.05 for both cohorts). Conclusions In this analysis, PET parameters were associated with outcome of NPC patients. MTV showed a robust association with OS, EFS and LRC. Our data suggest that combination of MTV and ASP may potentially further improve the risk stratification of NPC patients.

Introduction Nasopharyngeal carcinomas (NPC) are a subset of head and neck squamous cell carcinomas (HNSCC) with an etiology, treatment, and prognosis differing from other HNSCC. In Europe and Northern America, the incidence of NPC is low, but there are regions, including Southern China, where NPC are endemic, while other regions like Northern Africa or the Middle East exhibit an intermediate incidence. Standard treatment of non-metastatic NPC is radiotherapy, or chemoradiation (CRT) in case of locally advanced disease. Compared to non-human papilloma virus (HPV) associated HNSCC of other locations, NPC possess a relatively high radiosensitivity. Most cases of NPC seem to be related to an infection with Epstein-Barr virus (EBV); other classical risk factors for HNSCC like smoking usually play a minor causative role. Due to the relatively young age of patients with overall good prognosis, individually tailored treatment is a pivotal issue. This could comprise either de-escalation/escalation of radiation therapy or escalation of concurrent chemotherapy with induction or adjuvant chemo- and/or immunotherapy [1][2][3].
Several publications suggest that [18F]-fluorodeoxyglucose (FDG) positron emission tomography (PET) parameters bear prognostic value in NPC and could potentially be used for treatment individualization. Two meta-analyses investigated the prognostic role of FDG-PET in NPC and found that the parameters maximum standardized uptake value (SUVmax), metabolic tumor volume (MTV), and total lesion glycolysis (TLG) bear a significant prognostic value for various important clinical endpoints, including event-free survival (EFS) and overall survival (OS) [4,5]. Some recent publications suggest that assessment of tumor heterogeneity by PET may also provide prognostic value [6,7]. Our group and others have identified tumor asphericity (ASP) as a prognostic parameter in HNSCC [8][9][10].
The aim of this study was to assess the prognostic value of several FDG-PET parameters, including ASP, in a multicenter cohort of European, American and Chinese NPC patients. None of these patients had been included in the mentioned meta-analyses.

Ethics
The research has been reviewed and approved by institutional ethical committees of all the participating centers.

Patients
Inclusion criteria for this study were: histologically confirmed NPC without evidence of distant metastases, definitive radiotherapy or CRT with curative intent, and availability of pre-treatment FDG-PET. We analyzed PET images and patient data from Xiamen, China and Charité Berlin, Germany plus additional images and patient data from three American databases, available in the cancer imaging archive [11][12][13][14]. Data for the Chinese patients and the patients of the cancer imaging archive have been published previously [15][16][17][18].
The whole dataset includes 57 patients from Xiamen, China, 24 patients from Berlin, Germany and 33 patients from the above-mentioned three publicly available datasets. For additional independent validation of identified PET parameters, patients from China were used for establishment of prognostic parameters and all other (European and American) patients for independent validation.

Imaging
All patients underwent a hybrid FDG PET/CT examination prior to therapy. Data acquisition started 75.6 ± 27 min after injection of 132-770 MBq FDG. Examinations in Xiamen (3D PET acquisition, 90 seconds (s) per bed position) were performed with a Discovery STE (General Electric Medical Systems, Milwaukee, WI, USA). PET raw data were reconstructed using CT based attenuation weighted OSEM reconstruction (2 iterations, 20 subsets, 6 mm FWHM Gaussian filter). Examinations in Berlin (3D PET acquisition, median 150 s per bed position, range 90-210 s) were performed with a Gemini TF 16 (Philips Medical Systems, Cleveland, OH, USA). PET data were reconstructed using BLOB-OS-TF reconstruction (Philips Astonish TF technology: 3 iterations, 33 subsets; voxel size: 4.42 x 4.42 x 4.42 mm³). Canadian data were acquired at four different sites. Details on the acquisition protocols can be found in [16].

Treatment
Patients with stages T1 or T2 and N0 were usually treated with radiotherapy alone, while more advanced stages were treated with CRT, except if patient age, comorbidities or patient refusal contraindicated concomitant therapy.
Treatment details of Canadian patients can be found in the supplementary files of the original publication [16]. All patients received intensity modulated radiotherapy (IMRT) or volumetric arc modulated radiotherapy (VMAT) with a total dose of 70 Gray (Gy) in 35 fractions. Patients from Berlin received VMAT with a total dose of 57.5 to 76.6 Gy. Most patients were treated with a simultaneous integrated boost (SIB) delivering single fractions of 2.2 Gy; some patients received hyperfractionated radiotherapy with twice daily 1.4 Gy. In case of concomitant chemotherapy, either cisplatin or cisplatin in combination with 5-FU was the most commonly applied regimen. One patient received mitomycin C and one patient cetuximab. Patients were treated between August 2009 and March 2018.

Data analysis
The metabolically active part of the primary tumor was delineated in the PET data by a semiautomatic algorithm based on thresholding relative to the maximum activity with adaptation for local background [19,20]. The resulting regions of interest (ROI) were inspected visually by an experienced observer (SZ) who was blinded to patient outcome. Manual correction was applied in 8 out of 107 patients who exhibited only low diffuse tracer accumulation in the respective lesion. For the delineated ROIs, ASP was computed as $\mathrm{ASP} = \sqrt[3]{H} - 1$ with $H = \frac{1}{36\pi}\frac{S^3}{V^2}$, where V is the volume of the ROI and S is its surface. ASP is equal to zero for spheres. For nonspherical shapes ASP > 0 and is a quantitative measure of the degree of deviation from a spherical shape. In addition, the metabolic tumor volume (MTV), the maximum standardized uptake value (SUVmax), and the total lesion glycolysis (TLG = MTV x SUVmean) were computed. It should be noted that in two PET examinations, the time point of injection was missing in the data (presumably due to incorrect pseudonymization). The corresponding patients, therefore, had to be excluded from analysis of SUVmax and TLG. Uptake time after injection was not standardized. Therefore, all SUVs were corrected for scan time to $T_0$ = 75 min after injection using $\mathrm{SUV}_{tc} = \mathrm{SUV} \cdot (T_0/T)^{1-b}$, where T is the time at which the SUV was actually measured and b = 0.31 describes the shape and decrease of the arterial input function over time [21]. Since only time corrected values were investigated, the index 'tc' is omitted in the following. ROI definition and analysis was performed using the ROVER software, version 3.0.41 (ABX, Radeberg, Germany).
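The PET parameters described above can be illustrated with a minimal Python sketch. It assumes that ASP is expressed in percent (as suggested by the cutoff of 14.4% reported in this study), that volume and surface are given in consistent units (ml and cm²), and that the scan time correction follows a power law with exponent 1 - b; the function names and the power-law form are illustrative assumptions and not the ROVER implementation.

```python
import math


def asphericity(volume_ml, surface_cm2):
    """ASP in percent: 100 * (cbrt(S^3 / (36 * pi * V^2)) - 1).

    Equals 0 for a sphere and grows with deviation from spherical shape.
    """
    h = surface_cm2 ** 3 / (36.0 * math.pi * volume_ml ** 2)
    return 100.0 * (h ** (1.0 / 3.0) - 1.0)


def total_lesion_glycolysis(mtv_ml, suv_mean):
    """TLG = MTV x SUVmean, as defined in the text."""
    return mtv_ml * suv_mean


def time_corrected_suv(suv, t_min, t0_min=75.0, b=0.31):
    """Scale an SUV measured at t_min to the reference time t0_min.

    ASSUMPTION: power-law correction (T0 / T) ** (1 - b), which matches an
    arterial input function decreasing as t ** (-b).
    """
    return suv * (t0_min / t_min) ** (1.0 - b)


# Hypothetical shapes: a sphere of radius 1 cm and a cube with 2 cm edges.
sphere_asp = asphericity(4.0 / 3.0 * math.pi, 4.0 * math.pi)
cube_asp = asphericity(8.0, 24.0)
```

For the sphere the result is zero up to rounding error, while the cube yields an ASP of roughly 24%, which illustrates how the parameter grows with deviation from sphericity.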

Statistical analysis
Survival analysis was performed with respect to overall survival (OS), locoregional tumor control (LRC), distant metastases-free survival (FFDM), and event-free survival (EFS, defined as death or any recurrence or occurrence of DM) from the start of therapy to death and/or event.
Patients who did not keep follow-up appointments and for whom information on survival or tumor status was thus unavailable were censored with the date of last follow-up. The association of OS, LRC, FFDM, and EFS with clinically relevant parameters (age, EBV status, T stage, N stage, and UICC stage) as well as quantitative PET parameters was analyzed using univariable Cox proportional hazard regression in which the PET parameters were included as metric variables. PET parameters showing a significant association or a trend for significance (p ≤ 0.1) in this analysis were further analyzed in univariable Cox regression using binarized PET parameters. Binarization was performed using the cutoff value with the highest hazard ratio (HR) in univariable Cox regression for each variable. To avoid too small group sizes, only values leading to a minimum group size of 15% of the whole group were considered as potential cutoff. The cutoff values were separately computed for OS, EFS, LRC, and FFDM. Cutoff values leading to p < 0.05 were further investigated for stability by determining the full range of cutoff values around the optimal cutoff for which a trend for significance remained in univariable Cox regression. The probability of survival was computed and rendered as Kaplan-Meier curves. Independence of parameters was analyzed by multivariable Cox regression. When combining two prognostic PET parameters, combination was defined as co-occurrence of both prognostic negative parameters. HRs were compared using the bootstrap method ($10^5$ samples) to determine the statistical distribution of ($\mathrm{HR}_1 - \mathrm{HR}_2$) from which the relevant p value then was derived. Statistical significance was assumed at a p value of less than 0.05. Statistical analysis was performed with the R language and environment for statistical computing version 3.6.2 [22].
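The cutoff search with a minimum group size can be sketched as follows. This is a minimal Python illustration, not the R code used in the study: the scoring function is passed in as a callable, whereas the analysis described above used the HR from univariable Cox regression. The names `candidate_cutoffs`, `best_cutoff` and `rate_difference`, as well as the toy data, are hypothetical.

```python
def candidate_cutoffs(values, min_fraction=0.15):
    """Return cutoffs c for which both groups (value <= c and value > c)
    contain at least min_fraction of all patients."""
    n = len(values)
    cutoffs = []
    for c in sorted(set(values)):
        high = sum(1 for v in values if v > c)
        low = n - high
        if low >= min_fraction * n and high >= min_fraction * n:
            cutoffs.append(c)
    return cutoffs


def best_cutoff(values, score, min_fraction=0.15):
    """Return (cutoff, score) with the highest score among admissible cutoffs.

    score receives a list of booleans (True = high-risk group) and returns
    a number; in the study this role is played by the Cox regression HR.
    """
    best = None
    for c in candidate_cutoffs(values, min_fraction):
        s = score([v > c for v in values])
        if best is None or s > best[1]:
            best = (c, s)
    return best


# Toy data (hypothetical): 20 patients, events only for values above 10.
values = list(range(1, 21))
events = [0] * 10 + [1] * 10


def rate_difference(is_high):
    """Stand-in score (NOT Cox regression): event rate high minus low."""
    high = [e for e, h in zip(events, is_high) if h]
    low = [e for e, h in zip(events, is_high) if not h]
    return sum(high) / len(high) - sum(low) / len(low)


result = best_cutoff(values, rate_difference, min_fraction=0.25)
```

With these toy data the search selects the cutoff 10, which separates the patients with events from those without. In a real analysis the stand-in score would be replaced by a Cox model fit for each candidate cutoff.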

Results
Median follow-up time in surviving patients was 87 months and 66 months in all patients (inter-quartile range: 53-108 months and 27-95 months, respectively). OS, EFS, and LRC rates five years after start of treatment were 74%, 60%, and 79%, respectively. These treatment results are in line with results of current phase III studies on NPC [2,3]. Table 1 summarizes patient and treatment characteristics of all patients included in the study. In a first step, the association between clinical parameters and metric PET parameters and outcome of patients was analyzed by univariate Cox regression analysis. The clinical parameters age, N stage, and EBV negative tumors were significantly associated with decreased FFDM (p = 0.004, p = 0.046, and p = 0.022, respectively). Additionally, younger patients and patients with EBV positive tumors showed a better OS (p = 0.003 and p = 0.046). Furthermore, higher age was associated with decreased EFS (p = 0.001) and LRC (p = 0.014). Regarding metric PET parameters (Table 2), a significant association between higher tumor MTV or higher ASP and decreased EFS (p = 0.026 for MTV and p = 0.035 for ASP) was observed. MTV and TLG showed a significant association with OS (p = 0.008 and p = 0.049) and ASP showed a trend for association with OS (p = 0.058). MTV and TLG showed a significant association with LRC (p = 0.012 and p = 0.019, respectively).
After binarization, univariable Cox regression showed a significant association of ASP with EFS (p = 0.023) and OS (p = 0.027) and of MTV with OS (p<0.001), EFS (p<0.001) and LRC (p<0.001). Correlation analyses of all PET parameters revealed a strong correlation between MTV and TLG, but no strong correlation between ASP and MTV (Spearman's rho = 0.35, see S1 Table for correlations of all PET parameters). Therefore, ASP and MTV were combined for OS and showed a strong association with outcome (MTV+ASP, p<0.001). Since ASP may be more relevant in large tumors, we investigated if the combination of MTV and ASP bears additional prognostic value compared to each individual parameter. We performed bootstrap analysis for the parameter MTV and the combination of MTV and ASP with OS as endpoint. These analyses revealed an improved association with OS for the combination of both parameters (p = 0.005 and p = 0.04, respectively). Table 3 shows details of univariable Cox regression for all binarized PET parameters. Due to the high correlation with MTV, TLG was not further investigated.
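As an illustration of the correlation analysis, Spearman's rho can be computed as the Pearson correlation of the ranks. The following Python sketch uses only the standard library and average ranks for ties; the function names and the example values are hypothetical and are not data from this study.

```python
import math


def average_ranks(xs):
    """Rank values from 1..n, assigning the average rank to ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical MTV (ml) and TLG values with a strictly monotone relation.
mtv = [3.2, 7.5, 11.1, 17.8, 25.0]
tlg = [15.0, 40.0, 90.0, 173.0, 310.0]
rho = spearman_rho(mtv, tlg)
```

Because the hypothetical TLG values increase strictly with MTV, rho equals 1 here; a value such as the 0.95 reported for MTV and TLG indicates a nearly monotone relation between the two parameters.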
Cutoff stability testing revealed that MTV seems to discriminate across a relatively broad range of values with respect to OS, EFS and LRC and ASP with regard to OS. However, ASP only led to a significant discrimination of risk-groups within a narrow range of cutoff values with respect to EFS, see S2 Table for details. Due to the small range of ASP in regard to EFS, ASP was not further evaluated for the EFS endpoint.
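The idea of cutoff stability can be sketched in Python as follows: starting from the optimal cutoff, the range is extended in both directions as long as the p value stays at or below a limit. The limit of 0.1 (trend for significance), the function name `stable_range` and the p values in the example are assumptions for illustration only; in the study the p values came from univariable Cox regression.

```python
def stable_range(cutoffs, p_values, best_index, p_limit=0.1):
    """Return (lowest, highest) cutoff of the contiguous range around
    cutoffs[best_index] in which every p value is <= p_limit.

    cutoffs must be sorted in ascending order and aligned with p_values.
    Returns None if the optimal cutoff itself exceeds p_limit.
    """
    if p_values[best_index] > p_limit:
        return None
    lo = best_index
    while lo > 0 and p_values[lo - 1] <= p_limit:
        lo -= 1
    hi = best_index
    while hi + 1 < len(cutoffs) and p_values[hi + 1] <= p_limit:
        hi += 1
    return cutoffs[lo], cutoffs[hi]


# Hypothetical cutoffs (ml) and p values; the optimum is at index 3.
cutoffs = [8.0, 9.0, 10.0, 11.0, 12.0, 13.0]
p_values = [0.30, 0.08, 0.02, 0.001, 0.09, 0.20]
stable = stable_range(cutoffs, p_values, best_index=3)
```

In this toy example the stable range extends from 9.0 to 12.0 ml. A broad range suggests that the discrimination does not depend on one specific cutoff value, whereas a narrow range indicates a fragile cutoff.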
In multivariable Cox regression of each PET parameter with clinical parameters (Table 4), MTV, ASP and the combination of both remained significantly associated with OS (p<0.001, p = 0.031 and p<0.001, respectively; Fig 1) and MTV was significantly associated with LRC (p<0.001; Fig 2) and EFS (p = 0.004; Fig 2). Sub-group analyses revealed that the combination of ASP and MTV delivered prognostic value in regard to OS irrespective of the treatment center (Chinese versus Euro-American), see S1 Fig. Additionally, it was investigated if PET parameters can be successfully validated by splitting patients into an exploration and a validation cohort. For this purpose, the Chinese cohort was used as exploration cohort. The cut-offs for the PET parameters MTV and ASP were separately optimized for this cohort. Subsequently these cut-offs were applied to the remaining patients treated at the other centers. MTV discriminated significantly between risk groups for the endpoints OS (p = 0.043 exploration, p = 0.002 validation), EFS (p = 0.015 exploration, p = 0.01 validation) and LRC (p<0.001 exploration, p = 0.018 validation). ASP showed only a trend for significance in regard to OS (p = 0.064) in the exploration cohort. Taken together, MTV could be established as a strong prognostic factor regardless of the geographic location of the participating center. The results are shown in S2-S5 Figs.

Discussion
To our knowledge this is the first international multicenter PET evaluation study of patients with NPC. Our data suggest that the established PET parameter MTV is best suited for stratification with regard to OS, LRC and EFS. No PET parameters showed an association with FFDM. The results on the high prognostic impact of MTV are in line with two recent meta-analyses [4,5]. In contrast to our findings, a recent publication with 179 patients was able to show a correlation between the FDG-PET tumor parameter SUVmax and FFDM [23]. This discrepancy might be explained by the lower number of patients in our study or a different composition of tumor stages. In accordance with the current results, a study with 294 patients with advanced NPC showed that SUVmax of neck lymph nodes but not primary tumor SUVmax was associated with FFDM [24]. Another explanation for the poor performance of SUV could be the well-known limitations in reproducibility of SUV between examination time points, acquisition protocols, PET scanners and reconstruction algorithms [25,26], which are especially manifest in multicenter studies. Recent publications demonstrated that the uptake time normalized tumor-to-blood SUV ratio (standardized uptake ratio, SUR) essentially removes most of these shortcomings [27,28], which leads to a significantly better prognostic value compared to tumor SUV in other malignancies [29][30][31]. However, the blood SUV, which is necessary for SUR computation, could not be determined in about one half of the patients included in the present study since the aorta was not in the field of view, mainly because the head and neck region was scanned with thermoplastic masks for radiation treatment planning, while the remaining body was imaged in a second examination without mask. Therefore, the question if SUR might be able to improve the prognostic value of tracer uptake in the present context remains open.
In our analysis, the stratification power of MTV was further improved by ASP. However, this has to be considered as an exploratory finding that needs to be confirmed by future, ideally prospective, analyses. Unfortunately, most larger (i.e. with more than 100 patients) PET studies on NPC patients did not investigate the prognostic value of MTV or TLG but restricted analysis to SUV. To our knowledge only Chan and colleagues evaluated MTV in a larger cohort of 196 NPC patients. Chan reported cutoff values for MTV (45 ml) and TLG (330) that seem to be quite high compared to our cohort of patients [32]. Therefore, we were not able to validate these cutoffs in this study (maximum cutoff for MTV 17.8 ml and TLG 173, see S2 Table).
Several limitations of this study have to be mentioned. First, this is a retrospective analysis with all limitations inherent to this approach. Additionally, due to the partial use of publicly available imaging databases, important clinical information like type of chemotherapy or Karnofsky performance status was not available at an individual patient level, and clinical parameters like EBV association were missing for several patients. In this regard, further prognostic or potentially prognostic parameters were also not available, especially the EBV viral load, which showed an association with patients' outcome in a large meta-analysis [33].
Our study design has several strengths: first, all original PET images were analyzed by a highly standardized workflow and semi-automatic delineation. Second, in the two above-mentioned meta-analyses, only Asian centers were included (12 of 12). A recent publication investigated the prognostic impact of FDG-PET in a monocentric study with 49 Italian patients and found TLG and SUVmax to be significantly associated with OS of patients [34]. Another monocentric study on 52 Turkish patients found SUVmax of the primary tumor to be significantly associated with FFDM and disease-free survival, but not with OS [35]. Furthermore, another monocentric study from Turkey investigated the role of primary tumor and lymph node SUVmax on the outcome of 32 patients. The authors observed a statistical trend towards worse survival for patients with higher SUVmax of the primary tumor [36]. Unfortunately, neither Turkish study included further volumetric PET parameters like MTV or TLG. By splitting patients into two independent cohorts (Chinese patients and European/American patients), we were able to validate the prognostic value of MTV in regard to the endpoints OS, EFS and LRC. Additionally, it is notable that geographic location does not seem to influence the prognostic impact of MTV substantially. In our analysis, half of the evaluated patients were treated in Europe or America, and the high prognostic impact of MTV and MTV+ASP could be confirmed in this international cohort of patients. Given the high prognostic value of MTV, it could potentially be relevant for treatment individualization regarding the prescribed radiation dose. Dose escalation within high FDG uptake regions seems to be a feasible approach.
A recent publication with 213 NPC patients retrospectively investigated two groups of patients: one group of 101 patients received a PET based radiation dose escalation with about 12% increased radiation dose, while 112 patients received standard curative radiation doses. This PET boost approach improved LRC, FFDM and OS of patients [37]. This study is limited by the retrospective (non-randomized) design; however, improvement in patient outcome by a PET based treatment individualization could be demonstrated consistently for different endpoints in a comparably large patient sample. Additionally, given the high radiosensitivity of NPC, PET parameters might also be used for treatment de-escalation with the aim to preserve high rates of long-term cure in conjunction with decreased rates of radiation induced side effects.

Conclusions
Our data suggest that MTV is an excellent prognostic parameter for OS, LRC and EFS of NPC patients. The prognostic value of MTV seems to be independent from geographic location, as cut-off values generated in China also discriminated European and American patients. It seems that the stratification power of MTV might be further improved by the novel parameter ASP but this initial finding needs to be validated by further independent studies.