Pre-treatment 2D and 3D dosimetric verification of volumetric arc therapy. A correlation study between gamma index passing rate and clinical dose volume histogram

Objectives To evaluate methods for the pre-treatment verification of volumetric modulated arc therapy (VMAT) based on the percentage gamma passing rate (%GP) and its correlation with, and sensitivity to, percentage dosimetric errors (%DE). Methods A total of 25 patients with prostate cancer and 15 with endometrial cancer were analysed. The %GP values of 2D and 3D verifications with different acceptance criteria (1%/1 mm, 2%/2 mm, and 3%/3 mm) were obtained using OmniPro and Compass. The %DE was calculated from the planned dose volume histogram (DVH), which relates radiation dose to tissue volume, created in the Monaco treatment planning system (TPS) and the patient’s predicted DVH in Compass. Statistical correlation between %GP and %DE was verified using Pearson’s correlation coefficient. Sensitivity was calculated based on the receiver operating characteristics (ROC) curve. Plans were calculated using the Collapsed Cone Convolution and Monte Carlo algorithms. Results The t-test results of the planned and estimated DVH showed that the mean values were comparable (P > 0.05). For the 3%/3 mm criterion, the average %GP was acceptable for the prostate and endometrial cancer groups, with average rates of 99.68 ± 0.49% and 99.03 ± 0.59% for 2D and 99.86 ± 0.39% and 99.53 ± 0.44% for 3D, respectively. Correlations were poor for all analysed data. The mean Pearson’s R-values for prostate and endometrial cancer were < 0.45 and < 0.43, respectively. The area under the ROC curve for the prostate and endometrial cancer groups was lower than 0.667. Conclusions Analysis of the %GP versus %DE values revealed only weak correlations for both 2D and 3D verifications. DVH results obtained using the Compass system will be helpful in confirming that the analysed plans respect dosimetric constraints.


Introduction
Volumetric modulated arc therapy (VMAT) has become a standard delivery method of intensity-modulated radiotherapy (IMRT) that improves the conformance of the dose distributions in the target area while simultaneously reducing doses in the organ at risk (OAR). As a result, organs are spared to a greater extent than that with the standard 3D conformal radiotherapy technique. In VMAT, highly conformal dose distributions are obtained through concomitant continuous gantry rotation, variable dose rate and dynamic beam modulation [1]. The increasing complexity of VMAT plans with sharp gradients requires a patient-specific quality assurance (QA). Pre-treatment verification is recommended for each VMAT plan and is essential for the detection of possible mismatches between planned and delivered doses. This process is typically performed by applying the treatment plan to a dosimetric phantom and comparing the measured and calculated phantom dose distributions based on the gamma index (GI). This method of quantitatively comparing measured and calculated dose maps was first introduced by Low et al., [2]. Detector arrays consisting of ion chambers or diodes can be used for absolute dose distribution measurement in a 2D plane or 3D geometries. In the 2D method, checking delivery precision in only a single plane exported from TPS is commonly used but is insufficient, since it is difficult to interpret the results due to missing patient anatomy. The 3D method verifies the whole patient volume, and the reconstructed dose on the CT scan is compared with the planned dose to judge dosimetric errors on their clinical relevance. A variety of VMAT verification methods have been described in the literature [3][4][5][6], some of which have been proven useful for QA; however, they have weaknesses. 
For instance, Electronic Portal Imaging Device (EPID) dosimetry has a limited field of view; film dosimetry offers high resolution but is labour intensive; and detector arrays also have a limited field of view and limited spatial resolution. This has made it necessary to implement an effective quality control programme [7][8][9][10][11], since the delivery precision of the treatment equipment and the calculation accuracy of the treatment planning system (TPS) must be assured so that all critical aspects of the VMAT method function properly.

Patients
A total of 25 patients with prostate cancer and 15 with endometrial cancer, treated with the dual arc VMAT technique, were enrolled in the present study. Plans were optimised using the Monte Carlo (MC) algorithm in Monaco's TPS (version 5.11.02) for 6 MV of photon energy, and were realised on an Elekta Synergy accelerator equipped with an Agility 160 multileaf collimator. Calculation options based on the dose deposition-to-medium with grid settings of 3 mm were used. In addition, statistical uncertainty (SU) for the MC algorithm was defined as 0.5% [12][13], which is a standard value in our department for radical VMAT plans. The prostate group was treated with a dose of up to 50 Gy in 25 fractions, while the planned dose for endometrial cancer patients was 45 Gy in 25 fractions. Dose evaluations were performed based on Quantitative Analyses of Normal Tissue Effects in the Clinic Group [14], International Commission on Radiation Units and Measurements [15], and Radiation Therapy Oncology Group recommendations [16][17][18].

Compass system with MatriXX Evolution
The Compass software (version 3.1b) uses a 2D detector array measurement device, such as the MatriXX Evolution, which is combined with a gantry angle sensor (GAS) to measure the gantry angle [19,20]. The sensors in the MatriXX are vented pixel ionisation chambers, and each chamber has its own measurement channel. When the MatriXX is irradiated, the air in the chambers is ionised. The released charge is separated in the electrical field created between two electrodes. The current, which is proportional to the dose rate, is measured and digitised by current-sensitive analogue-to-digital converters. The chambers are arranged in a 32 x 32 grid, except for the four corner positions. The distance between the chambers is 7.62 mm from centre to centre. The effective point of measurement is 3 mm below the surface, which is 3.3 mm water equivalent depth. This level is indicated by markers on the sides of the MatriXX detector. The device is mounted to the gantry of the accelerator. The gantry mount consists of two parts: an advanced holder and a gantry fixture. The gantry fixture is customised to a type of linear accelerator (LINAC), and the advanced holder consists of an adjustable XY table and a supporting frame. The adjustable XY table is mounted on the top of a supporting frame and is used to finely adjust the MatriXX position to the crosshairs in the light field or positioning lasers of the accelerator. Geometric and absolute calibration were performed prior to measurements. Geometric calibration requires measurement of a field size larger than 7 x 7 cm, and graphical evaluation detects the edge of the field. Absolute calibration determines the response of the detector to the dose factor. In addition, a characterised Hounsfield-units-to electron density (HU-to-ED) calibration curve was implemented in the Compass system to assure accurate dose calculations on the computed tomography scan (CT). 
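The chamber layout described above (a 32 x 32 grid without the four corner positions, 7.62 mm centre-to-centre spacing) can be sketched as follows. This is only an illustration: the function name and the choice of placing the origin at the centre of the array are assumptions made here, not details of the MatriXX or Compass software.

```python
import numpy as np

PITCH_MM = 7.62  # centre-to-centre chamber spacing quoted in the text
N_SIDE = 32      # chambers per side of the grid


def chamber_centres(n_side=N_SIDE, pitch_mm=PITCH_MM):
    """Return an (n, 2) array of chamber centres in mm.

    Assumption: the origin sits at the centre of the array.
    The four corner positions are left empty, as described in the text.
    """
    offsets = (np.arange(n_side) - (n_side - 1) / 2.0) * pitch_mm
    xx, yy = np.meshgrid(offsets, offsets)
    keep = np.ones((n_side, n_side), dtype=bool)
    for row in (0, n_side - 1):
        for col in (0, n_side - 1):
            keep[row, col] = False  # no chamber at the corners
    return np.column_stack([xx[keep], yy[keep]])


centres = chamber_centres()
# Distance between the outermost chamber centres along one axis
span_mm = centres[:, 0].max() - centres[:, 0].min()
print(len(centres), round(span_mm, 2))
```

Removing the four corners from the 32 x 32 grid leaves 1020 chambers, and the outermost chamber centres are 31 x 7.62 mm = 236.22 mm apart along each axis.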
The radiotherapy plan (RT plan) from the Monaco system was exported to Compass and to the accelerator for measurement with the MatriXX Evolution. In Compass, this 2D detector array measurement was used to reconstruct the fluence in four steps: 1. Computation of the expected fluence on the detector from a fluence model and the RT plan exported from the TPS to Compass; The OmniPro and Compass systems give 1-mm resolution with linear interpolation using a low pass filter. Target pixels are calculated from the four surrounding source pixels using a bilinear algorithm. The maximum and minimum dose rates that are detectable by the detectors are 5 Gy/min and 0.1 Gy/min, respectively. Additionally, the responses of the I'mRT MatriXX and MatriXX Evolution devices are linear with respect to dose and dose rate, but the limited resolution is insufficient to detect hot and cold spots in highly modulated VMAT plans.
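The bilinear step mentioned above, in which each target pixel is computed from its four surrounding source pixels, can be illustrated with a short sketch. It is a generic bilinear resampler written for this purpose; the function name and arguments are hypothetical, the low pass filter is not modelled, and nothing here reproduces the actual OmniPro or Compass implementation.

```python
import numpy as np


def bilinear_resample(grid, src_pitch_mm, dst_pitch_mm):
    """Resample a 2D dose grid (at least 2 x 2) to a finer pitch.

    Each target pixel is a weighted mean of the four source pixels
    that surround it (top-left, top-right, bottom-left, bottom-right).
    """
    grid = np.asarray(grid, dtype=float)
    rows, cols = grid.shape
    # Target positions expressed in units of source pixels
    r = np.arange(0.0, (rows - 1) * src_pitch_mm + 1e-9, dst_pitch_mm) / src_pitch_mm
    c = np.arange(0.0, (cols - 1) * src_pitch_mm + 1e-9, dst_pitch_mm) / src_pitch_mm
    r0 = np.clip(np.floor(r).astype(int), 0, rows - 2)
    c0 = np.clip(np.floor(c).astype(int), 0, cols - 2)
    fr = (r - r0)[:, None]  # fractional row offset within the source cell
    fc = (c - c0)[None, :]  # fractional column offset within the source cell
    tl = grid[np.ix_(r0, c0)]
    tr = grid[np.ix_(r0, c0 + 1)]
    bl = grid[np.ix_(r0 + 1, c0)]
    br = grid[np.ix_(r0 + 1, c0 + 1)]
    return (tl * (1 - fr) * (1 - fc) + tr * (1 - fr) * fc
            + bl * fr * (1 - fc) + br * fr * fc)


# Hypothetical 2 x 2 source grid at 2 mm pitch, resampled to 1 mm
fine = bilinear_resample([[0.0, 2.0], [4.0, 6.0]], src_pitch_mm=2.0, dst_pitch_mm=1.0)
print(fine)
```

Interpolation of this kind cannot recover structure finer than the source pitch, which is consistent with the remark above that the limited detector resolution may miss hot and cold spots.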
In addition, the Compass system can perform a full 3D Collapsed Cone Convolution (CCC) algorithm [22]. Fig 1 shows the 3D gamma index (GI) analysis in the Compass system.

OmniPro system with the I'mRT MatriXX
The OmniPro system (version 1.7b) uses the I'mRT MatriXX 2D detector array, which consists of 1020 vented ion chambers arranged in a 32 x 32 grid that resembles the Evolution array. The main difference between the I'mRT MatriXX and MatriXX Evolution is that the latter can be combined with a GAS to measure the gantry angles. The I'mRT MatriXX, with built-up and backscatter (RW3) plates, is mounted on the treatment couch under the gantry. A calibration factor was obtained prior to measurements. The system calculates k_user using the entered dose reference value and the average values for the four middle MatriXX chambers. A value of 100 MU, with a field of 10 x 10 cm², was required during the calibration procedure. The treatment plan in the Monaco system was transferred to a measuring phantom containing the I'mRT MatriXX (QA plan). The dose plane output from the QA plan was subsequently exported to OmniPro, and measurements were then performed on a LINAC and compared with the TPS dose distribution. The OmniPro system offers 2D verification with the gamma index (GI) passing rate. Fig 2 presents the 2D GI analysis in the OmniPro system. All QA plans were delivered through MosaiQ (version 2.50, Elekta).

Gamma analysis in 2D and 3D dosimetric verification
VMAT QA dose distributions for each treatment plan were evaluated using the GI method. The percentage gamma passing rate (%GP) was calculated for different acceptance criteria (1%/1 mm, 2%/2 mm, and 3%/3 mm) in the 2D and 3D methods of dosimetric verification in the OmniPro and Compass systems [23]. Analyses were performed with a dose threshold of 10%, and dose values below this level were not included in the comparison.
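To make the %GP calculation concrete, the sketch below implements the gamma index of Low et al. [2] for a 1D dose profile. It is a simplified illustration under stated assumptions: the dose-difference criterion is normalised to the maximum calculated dose (global normalisation), the 10% threshold is applied to the measured dose relative to that maximum, and a point passes when gamma is at most 1. The text does not specify these details for OmniPro or Compass, and the function name and test profile are hypothetical.

```python
import numpy as np


def gamma_passing_rate(x_mm, measured, calculated,
                       dose_pct=3.0, dta_mm=3.0, threshold_pct=10.0):
    """Return (%GP, gamma per measured point) for 1D profiles on a common grid."""
    x = np.asarray(x_mm, dtype=float)
    m = np.asarray(measured, dtype=float)
    c = np.asarray(calculated, dtype=float)
    d_norm = c.max()                       # assumption: global normalisation
    dose_tol = dose_pct / 100.0 * d_norm
    include = m >= threshold_pct / 100.0 * d_norm
    if not include.any():
        raise ValueError("no points above the dose threshold")
    # Rows: measured points; columns: candidate calculated points
    dist2 = (x[:, None] - x[None, :]) ** 2 / dta_mm ** 2
    dose2 = (m[:, None] - c[None, :]) ** 2 / dose_tol ** 2
    gamma = np.sqrt((dist2 + dose2).min(axis=1))
    passed = gamma[include] <= 1.0
    return 100.0 * passed.mean(), gamma


# Hypothetical profile: Gaussian calculated dose, measurement 2% too high
x = np.arange(-50.0, 51.0, 1.0)
calc = 2.0 * np.exp(-x ** 2 / (2 * 15.0 ** 2))
meas = 1.02 * calc
gp_33, _ = gamma_passing_rate(x, meas, calc, dose_pct=3.0, dta_mm=3.0)
gp_11, _ = gamma_passing_rate(x, meas, calc, dose_pct=1.0, dta_mm=1.0)
print(gp_33, gp_11)
```

In this toy case every point passes at 3%/3 mm, while the 1%/1 mm criterion rejects the points near the peak, where neither the dose tolerance nor the distance tolerance can absorb the 2% scaling error.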

Evaluation of the predicted dose-volume histogram
The planning target volume (PTV) parameters, D1%, D98% (dose in 1% and 98% of the volume), and Dmean, were analysed in both groups. In prostate cancer cases, D15%, D25%, D35%, and D50% were evaluated in the bladder and rectum; Dmax, D25%, and D40% were evaluated in the femoral head; Dmean was evaluated in the penile bulb; and Dmax was evaluated in the bowel. In the endometrial cancer group, D35% and D50% were analysed in the bladder; D35%, D50%, and D60% were analysed in the rectum; Dmax and D15% were assessed in the femoral head; D30% was analysed in the bowel; and Dmean was evaluated in the bone marrow. The percentage dosimetric errors (%DE) between the DVH values from TPS and DVH Compass were calculated using:

$$\%DE = \frac{D_{DVHCompass} - D_{TPS}}{D_{TPS}} \times 100\%$$

where $D_{DVHCompass}$ is the dose taken from Compass and $D_{TPS}$ is the dose extracted from Monaco.
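As a worked illustration of the %DE calculation, the sketch below assumes the usual relative-difference form, (D_DVHCompass - D_TPS) / D_TPS x 100%, with the TPS dose as the reference. Both the sign convention and the example dose values are assumptions chosen for illustration; they are not taken from the study data.

```python
def percent_dose_error(d_compass_gy, d_tps_gy):
    """Percentage dosimetric error of a DVH parameter relative to the TPS value.

    Assumption: positive values mean Compass reports a higher dose than the TPS.
    """
    if d_tps_gy == 0:
        raise ValueError("TPS dose must be non-zero")
    return (d_compass_gy - d_tps_gy) / d_tps_gy * 100.0


# Hypothetical DVH parameters (Gy) for one plan: (Compass, TPS)
plan = {
    "PTV Dmean": (50.5, 50.0),
    "Rectum D50%": (19.4, 20.0),
}
errors = {name: percent_dose_error(dc, dt) for name, (dc, dt) in plan.items()}
for name, value in errors.items():
    print(f"{name}: {value:+.2f}%")
```

A small absolute difference can still give a large %DE when the reference dose is low, which matters when relative errors in low-dose structures are interpreted.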

Correlations and sensitivity analysis
Statistical correlation between %DE and %GP was studied using Pearson's correlation coefficient in Statistica (version 12, StatSoft, Poland). The %GP in 2D and 3D verifications was compared with the %DE parameters from DVH for PTV and OAR. The strength of correlations, in terms of R-values, was compared. A total of 57 R-values were analysed in the prostate cancer group: 19 for each of the three acceptance criteria (3%/3 mm, 2%/2 mm, and 1%/1 mm). The R-value was analysed for 19 dose parameters, e.g., D1%, D98%, Dmean in PTV; D15%, D25%, D35%, D50% in bladder, etc. For the endometrial cancer group, 45 R-values were obtained (for 15 dose parameters such as D1%, D98%, Dmean in PTV; D35%, D50% in bladder, etc., for each of the three criteria). The numbers 19 and 15 refer to the numbers of DVH parameters evaluated for all structures from 25 patients with prostate cancer and 15 patients with endometrial cancer, respectively, under each acceptance criterion. To quantify the sensitivity of the GI method, the numbers of false negative (FN) and true positive (TP) cases were also calculated [24][25][26]. FN cases were included for all structures with a %DE > 3% among patients with a %GP > 95%. All cases with a %DE > 3% and a %GP < 95% were considered TP. Receiver operating characteristics (ROC) curves were generated based on the FN and TP rates, and the area under the curve (AUC) was analysed to investigate the ability of the 2D and 3D methods to accurately identify plans with dose errors > 3%.
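The two quantities used in this analysis, Pearson's R between %GP and %DE and the FN/TP classification at the 95% and 3% levels, can be sketched as follows. The study used Statistica; this NumPy version is only an illustration. The data are hypothetical, the use of the absolute value of %DE is an assumption, and a %GP of exactly 95% is counted as passing here, which the text does not specify.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))


def classify_cases(gp, de, gp_action=95.0, de_tol=3.0):
    """Count FN (error present, gamma passes) and TP (error present, gamma fails)."""
    gp = np.asarray(gp, dtype=float)
    err = np.abs(np.asarray(de, dtype=float)) > de_tol  # assumption: |%DE|
    fn = int((err & (gp >= gp_action)).sum())
    tp = int((err & (gp < gp_action)).sum())
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return {"FN": fn, "TP": tp, "sensitivity": sensitivity}


# Hypothetical %GP and %DE values for five plans
gp = [99.5, 98.0, 94.0, 97.0, 93.5]
de = [0.5, 3.5, 4.0, 1.0, 2.0]
r_value = pearson_r(gp, de)
counts = classify_cases(gp, de)
print(round(r_value, 3), counts)
```

In this toy data set one plan with a dose error above 3% still passes the gamma test (FN) and one fails it (TP), giving a sensitivity of 0.5.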

CCC in the Compass system
The accuracy of the Monte Carlo algorithm in heterogeneous media was evaluated based on the secondary independent algorithm CCC. This comparison could be useful for detecting possible discrepancies in the TPS and is recommended for each treatment plan. VMAT plans from TPS were exported into the Compass system and recalculated using the CCC algorithm. Dose comparisons were performed based on the same DVH parameters used during the evaluation of the predicted DVH Compass [27,28]. The percentage dosimetric errors (%DE) between the DVH values from TPS and DVH CompassCCC were calculated using:

$$\%DE = \frac{D_{DVHCompassCCC} - D_{TPS}}{D_{TPS}} \times 100\%$$

where $D_{DVHCompassCCC}$ is the dose recalculated using the CCC algorithm in Compass and $D_{TPS}$ is the dose extracted from TPS.

Evaluation of the %GP
The results for patients with prostate and endometrial cancers who were treated with radiotherapy are presented in Table 1. For the 3%/3 mm criterion, the average %GP was acceptable in both the prostate and endometrial cancer groups, with average rates of 99.68 ± 0.49% and 99.03 ± 0.59% for 2D and 99.86 ± 0.39% and 99.53 ± 0.44% for 3D, respectively. The %GP values decreased significantly as the acceptance criteria became stricter. The average passing rate for the 2%/2 mm acceptance criterion was < 95% for the 2D method (OmniPro) and > 95% for the 3D method (Compass) in the endometrial cancer group. In the prostate cancer group, the %GP for the same criterion was higher than the standard action level. No patients had a %GP > 95% when using an acceptance criterion of 1%/1 mm, and the %GP was generally too low for the establishment of acceptance thresholds.

Dose comparison (%DE)
The %DE values obtained from TPS in Monaco and DVH Compass are given in Tables 2 and 3, respectively. DVH parameters were compared using a parametric t-test, and the P-values show that the doses were not significantly different (P > 0.05). Relatively higher %DE values were observed for parameters with a large dose gradient [29]. In the bowel structure, the %DE of Dmax was 7.93%, which corresponds to a dose of 0.16 Gy. Variable bladder filling affected the values of the standard deviations for DVH parameters in the group of patients with prostate cancer. A lower %DE for the D1%, D98%, and Dmean in PTV was obtained for the prostate cancer group in comparison with the gynecological group. Some differences in the bone region were as high as 3.20%.

Correlations and sensitivity analysis
The correlations for patients with prostate and endometrial cancers are presented in Tables 4 and 5, respectively. Statistical correlations between %DE and %GP were analysed using three different acceptance criteria and three methods of dosimetric verification for DVH parameters. The number of correlations based on 3D dosimetric verifications was higher (56% of cases) than that based on the 2D dosimetric verifications in the prostate cancer group. For patients with endometrial cancer, the number of correlations based on the 3D verifications was also higher than that based on the 2D verifications (53% of cases) [24]. The mean correlation coefficient values for patients with prostate and endometrial cancers are presented in Tables 6 and 7, respectively. The R-values were compared with the parametric t-test values. The P-values did not show any statistical difference (P > 0.05) in either group of patients, and the R-coefficients were mainly negative [25,30]. These values indicate that the clinical metrics decreased with increasing passing rates in all treatment plans. Clinical metrics are related to DVH errors, %DE (e.g., D1%, D15%). Negative R-values indicate an inverse relationship between %GP and %DE. In the endometrial cancer group, the R-values were also mainly negative. The negative R-values for the 3D and 2D methods totalled 58% and 93% of cases, respectively. These results are in accordance with those previously reported [26]. In particular, for both groups and the Dmean parameter, the %GP calculated using the 3D method and different acceptance criteria resulted in a high correlation with %DE (r > 0.75). Fig 4 presents the correlations between %GP and %DE for PTV in the endometrial cancer group.
Sensitivity analysis was performed for the 3%/3 mm acceptance criterion. Neither the 2%/2 mm nor the 1%/1 mm level is applicable in routine clinical practice, since a %GP > 95% is difficult to achieve for all plans. The AUC values of the ROCs for DVH metrics in patients with prostate cancer were 0.540 and 0.480 for 2D and 3D, respectively. In the group of patients with endometrial cancer, the AUC values were 0.364 and 0.636 for 2D and 3D, respectively. Figs 5 and 6 present the ROC curves for patients with prostate and endometrial cancers, respectively, with the corresponding AUC values.
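For readers less familiar with the AUC, the sketch below computes it through its rank interpretation: the probability that a randomly chosen plan with a dose error receives a lower %GP than a randomly chosen plan without one, with ties counted as one half. An AUC of 0.5 corresponds to chance-level discrimination. The function name and data are hypothetical, and this is not the procedure used in Statistica.

```python
import numpy as np


def auc_from_gp(gp, has_error):
    """AUC for detecting dose errors when a lower %GP is taken as the alarm."""
    gp = np.asarray(gp, dtype=float)
    has_error = np.asarray(has_error, dtype=bool)
    pos = gp[has_error]    # plans with a dose error above tolerance
    neg = gp[~has_error]   # plans without such an error
    if pos.size == 0 or neg.size == 0:
        raise ValueError("both classes must be present")
    diff = neg[None, :] - pos[:, None]  # > 0 when the error plan has the lower %GP
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))


# Hypothetical plans: the first two carry a dose error above 3%
labels = [True, True, False, False]
auc_good = auc_from_gp([90.0, 92.0, 99.0, 98.0], labels)
auc_chance = auc_from_gp([96.0, 99.0, 97.0, 98.0], labels)
print(auc_good, auc_chance)
```

Read in this way, the AUC values reported above (0.364 to 0.636) lie close to the chance level of 0.5.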

Comparison of the CCC and MC algorithms
The results of the comparison between CCC and MC in patients with prostate and endometrial cancers are given in Tables 8 and 9, respectively. The %DE in PTV was lower in the first group, but higher values were reached in the OAR with a large dose gradient.

Discussion
The aim of the present study was to evaluate the predictive value of GI analysis in terms of the correlation between the %GP and %DE obtained by pre-treatment QA verification. In addition, the standard action level of the 95% passing rate for 2D and 3D pre-treatment verification was analysed with the criteria of 3%/3 mm, 2%/2 mm, and 1%/1 mm. No significant differences between doses calculated using the TPS Monaco and Compass software were found for the selected DVH parameters. The pre-treatment verification was performed carefully. Analysis of the DVH results from the Compass system provided more helpful information than that from the gamma method and confirmed that the analysed plans respected dose-tolerance limits. Parameters such as average dose, dose at volume, and volume at dose were more useful during plan evaluation. Application of the gamma method for the evaluation of dose at volume may be insufficient. This can be explained by the fact that although the gamma passing rate provides the quantity of errors, it does not specify the magnitude of the error. For instance, if a 95% gamma passing rate is reported for a serial organ (e.g., the brain stem or spinal cord), what is immediately important is not whether 95% is high enough, but rather the magnitude and direction (increase or decrease) of the error for those 5% of failing voxels and their impact on the clinically relevant dose metrics (i.e., Dmax and D1%), which cannot be identified from the passing rate itself. Furthermore, Nelms showed that analysis of the average GI passing rate was not acceptable on its own, since some cases with a high %GP could be clinically acceptable in one patient and unacceptable in another [31]. Therefore, evaluation based on DVH should be considered for clinical decisions. Observed dose differences may result from incorrect implementation of irradiation or a difference between the models in the treatment planning and dosimetry systems. Possible uncertainty of treatment delivery was controlled during a nationwide audit of the IMRT technique and internal measurements. Verification of the dosimetric systems was carried out at the beginning of clinical use based on film dosimetry. Additionally, comparison of the Compass beam modelling and OmniPro measurements with TPS using the Elekta Express QA plan was performed.
The accuracy of the dose calculation model and dose delivery on the LINAC must be checked prior to clinical use.
Table 4. Correlation between the 2D and 3D GI passing rate and the dose difference in patients with prostate cancer. Columns: acceptance criterion, structure, DVH parameter, correlation indices 3D, correlation indices 2D.
Higher dose differences were found for structures with a large dose gradient [29] and may result from insufficient spatial resolution of the detectors used in the matrix [32]. The limited resolution of the I'mRT MatriXX and MatriXX Evolution can affect the detection of hot and cold spots in highly modulated fields. As a result, dosimetric systems may slightly underestimate or overestimate the planned dose. We observed a lower dose in the high dose region, particularly in the bowel, in the prostate group. This artifact may be caused by the interpolation of the dose measured around the ion chamber in a field with a high dose gradient. Therefore, the spatial resolution of the detector should be considered during evaluation of the measured dose. In addition, a higher %DE in bone regions was confirmed. The %GP results fulfilled the standards recommended in the Code of Practice for QA and Control for IMRT published by the Netherlands Commission on Radiation Dosimetry [33], the European Society for Radiotherapy and Oncology [34], etc. The standards in radiotherapy recommend a pass rate > 95% using the 3%/3 mm criterion. The results obtained in the present study are in agreement with those obtained by other groups who have analysed the scores obtained from IMRT audits. The present study identified lower %GP results for 2D (OmniPro) as compared with 3D (Compass) verification. This may have been caused by the different methods of dose reconstruction in the two systems, since Compass reconstructed the dose on a heterogeneous medium (CT scan of the patient) and OmniPro used a QA plan (RW3 material with a 2D detector array). For 2D analysis, the %GP decreased more rapidly than for the 3D analysis as the criteria became stricter, which is likely a result of the blurring effect, noise, or a combination of both.
In addition, the %GP in prostate cancer patients using the 2%/2 mm criterion was higher than the 95% action level, which should be considered in clinical practice for this group.
Relatively weak correlations between the %GP and %DE were observed for both 2D and 3D pre-treatment VMAT dosimetric evaluations. ROC curve analysis showed that the sensitivity of DVH evaluation and both GI methods was not sufficient for clinical acceptance, with AUC values < 0.667. Similar results have been previously reported in the literature [24,25,30]. The low AUC values confirm that the 2D and 3D methods cannot accurately identify plans with dose errors > 3%. The value of %GP shows only how many voxels fail or pass the criteria and does not provide information regarding the anatomic location of the failure or the dose level at which it failed. The risk of underdosing targets and overdosing the organ at risk cannot be analysed based only on the gamma methods. Analysing the DVH results from Compass instead of the gamma passing rate gives more information about dosimetric errors and their effect on dose distribution. Therefore, the %DE obtained from pre-treatment QA verification provides a more helpful solution for VMAT QA and should be considered for clinical use.

Conclusions
The present study identified weak correlations and sensitivity between the GI passing rate and dose errors from the dose-volume histograms for 2D and 3D pre-treatment verifications. The %GP only shows how many voxels failed to pass the criteria and is insufficient for the evaluation of dose parameters; therefore, the gamma passing rate cannot be exclusively relied upon. Evaluation of the clinical tolerance of PTV and OAR should be implemented. Comparison of the CCC and MC algorithms in the pelvic region led to similar results and may be useful for detecting possible discrepancies in the TPS [35,36]. The results indicate that the percentage dose difference between the Compass software and the TPS calculation was <2.09% for analysis using the definition of D1%, D98%, and Dmean in PTV for each group. New approaches to evaluate QA plans need to be urgently implemented in clinical practice. VMAT QA analysis with a methodology that allows clinicians to predict the impact of a delivered dose on the DVH curve from 3D reconstructions of patient anatomy needs to be employed.