Correlation of the gamma passing rates with the differences in the dose-volumetric parameters between the original VMAT plans and actual deliveries of the VMAT plans

Purpose The aim of this study was to investigate the correlations of the gamma passing rates (GPR) with the dose-volumetric parameter changes between the original volumetric modulated arc therapy (VMAT) plans and the actual deliveries of the VMAT plans (DV errors). We compared the correlations of the TrueBeam STx system to those of a C-series linac. Methods A total of 20 patients with head and neck (H&N) cancer were retrospectively selected for this study. For each patient, two VMAT plans with the TrueBeam STx and Trilogy (C-series linac) systems were generated under similar modulation degrees. Both the global and local GPRs with various gamma criteria (3%/3 mm, 2%/2 mm, 2%/1 mm, 1%/2 mm, and 1%/1 mm) were acquired with the 2D dose distributions measured using the MapCHECK2 detector array. During VMAT deliveries, the linac log files of the multi-leaf collimator positions, gantry angles, and delivered monitor units were acquired. The DV errors were calculated with the 3D dose distributions reconstructed using the log files. Subsequently, Spearman’s rank correlation coefficients (rs) and the corresponding p values were calculated between the GPRs and the DV errors. Results For the Trilogy system, the rs values with p < 0.05 showed weak correlations between the GPRs and the DV errors (rs<0.4) whereas for the TrueBeam STx system, moderate or strong correlations were observed (rs≥0.4). The DV errors in the V20Gy of the left parotid gland and those in the mean dose of the right parotid gland showed strong correlations (always with rs > 0.6) with the GPRs with gamma criteria except 3%/3 mm. As the GPRs increased, the DV errors decreased. Conclusion The GPRs showed strong correlations with some of the DV errors for the VMAT plans for H&N cancer with the TrueBeam STx system.



Introduction
The most popular method of pre-treatment patient-specific quality assurance (QA) for intensity modulated radiation therapy (IMRT) or volumetric modulated arc therapy (VMAT) in the clinic is the gamma index method proposed by Low et al. [1]. The gamma index method can effectively identify and quantify differences between two dose distributions [2,3]. However, several studies have questioned the clinical relevance of gamma passing rates (GPRs) [4,5]. Nelms et al. demonstrated a lack of correlation between the GPRs and clinically relevant dose-volumetric parameter changes between plans and deliveries (DV errors) by utilising a total of 24 IMRT plans generated with a C-series linac [4]. Similarly, Stasi et al. showed weak correlations between the GPRs and the DV errors of clinically relevant DV endpoints by utilising 27 prostate and 15 head and neck (H&N) IMRT plans [5]. They also showed cases where high GPRs did not necessarily indicate good consistency in anatomy dose metrics (i.e., false negatives) [5]. In this respect, several studies suggested log-file-based pre-treatment QA or calculation of the modulation indices as a pre-treatment patient-specific QA method for IMRT or VMAT [6][7][8][9][10]. However, these methods have a limitation in that they are not based on independent dose measurements; therefore, the gamma evaluation is still widely adopted in the clinic as a verification method of IMRT and VMAT plans.
The previous studies demonstrated the clinical irrelevance of the GPRs of IMRT plans with a C-series linac [4,5]; however, no study has been performed with the TrueBeam STx system (Varian Medical Systems, Palo Alto, CA, USA), which delivers treatment plans more accurately than the C-series linac by using an integrated control system called the supervisor. It also has a more advanced log-file generation capability than the C-series linac. Park et al. demonstrated that the GPRs of VMAT plans with the C-series linac were different from those with the TrueBeam STx system although both VMAT plans were generated under identical conditions [11]. In other words, although both VMAT plans were generated with an identical treatment planning system using identical patient computed tomography (CT) images, structure sets, prescription doses, and normal tissue tolerance levels, and the GPRs were acquired using the same dosimeter, the GPRs with the TrueBeam STx could be different from those with the C-series linac [11]. This might be attributed to the difference in the modulation degree between the TrueBeam STx and C-series linac plans, or the difference in the operation mechanisms of the TrueBeam STx system and the C-series linac. In addition, it is unclear whether the predictive power of the GPRs with the TrueBeam STx regarding the accuracy of VMAT delivery is as poor as that with the C-series linac. Therefore, in the present study, we investigated the correlations of the GPRs with the DV errors of clinically relevant DV endpoints with the TrueBeam STx compared with those with the C-series linac, utilising a total of 20 VMAT plans.

Patient selection and simulation
After receiving approval from the institutional review board (IRB) of Seoul National University Hospital (IRB No. 1901-059-1002), a total of 20 patients with nasopharyngeal cancer (H&N cancer) treated using the VMAT technique were retrospectively selected for this study. Because this is a retrospective study using anonymised patient CT image sets and treatment plans, which poses minimal risk to the patients, the IRB granted an exemption from informed consent. Each patient underwent CT scans in the supine position using the Brilliance CT Big Bore™ system (Philips, Amsterdam, Netherlands) with a slice thickness of 3 mm. Each patient was immobilised using a thermoplastic mask and a Silverman pillow (Bionix Radiation Therapy, Toledo, OH).

Volumetric modulated arc therapy plans
For each patient, two VMAT plans were generated: one using the Trilogy™ system with the Millennium 120™ multi-leaf collimator (MLC), which is a C-series linac, and the other using the TrueBeam STx™ system with the high-definition (HD) 120™ MLC (Varian Medical Systems, Palo Alto, CA). For both plans of each patient, the CT image set, structure set, prescription doses, and set of dose-volume constraints used for planning were identical. The simultaneous integrated boost technique was used with three planning target volumes (PTVs), which were PTV67.5 (prescription dose = 67.5 Gy and daily dose = 2.25 Gy), PTV54 (prescription dose = 54 Gy and daily dose = 1.8 Gy), and PTV48 (prescription dose = 48 Gy and daily dose = 1.6 Gy). These prescription doses were delivered in 30 fractions. Each plan in the present study was generated with the Eclipse™ system (Varian Medical Systems, Palo Alto, CA) using 6 MV photon beams and two full arcs. For optimisation, the progressive resolution optimizer (PRO3, ver. 13, Varian Medical Systems, Palo Alto, CA) was used. During optimisation, identical dose-volume constraints based on the Quantitative Analyses of Normal Tissue Effects in the Clinic (QUANTEC) recommendations were used for both the VMAT plans with the Trilogy and TrueBeam STx systems [12]. After optimisation, dose distributions were calculated (dose calculation resolution = 1 mm) using the anisotropic analytic algorithm (AAA, ver. 13, Varian Medical Systems, Palo Alto, CA). After dose calculation, each VMAT plan in the present study was normalised to cover 95% of the PTV67.5 with 95% of the prescription dose of 67.5 Gy.
A total of 46 clinically relevant dose-volumetric parameters were calculated for each VMAT plan. For each PTV, the dose received by at least 99% of the structure volume (D99%), D98%, D95%, D50%, D5%, D2%, D1%, minimum dose, maximum dose, and mean dose were calculated. For both the left and right parotid glands (PGs), the volumes irradiated by at least 20 Gy (V20Gy), V50%, and mean doses were calculated. For the optic chiasm, both the left and right optic nerves, both the left and right lenses, the spinal cord, and the brain stem, the maximum doses were calculated. For the body, the values of V100% and V50% and the mean dose were calculated. To investigate the modulation degrees of the VMAT plans with the Trilogy and TrueBeam STx systems, the modulation complexity score for VMAT (MCSv) and the leaf travel modulation complexity score (LTMCS) were calculated for each VMAT plan [13].

Gamma index method
For the gamma evaluation, a MapCHECK2™ detector array inserted in a MapPHAN™ (Sun Nuclear Corporation, Melbourne, FL) was utilised for the measurements of 2D planar dose distributions of VMAT plans. To determine the reference dose distributions of each VMAT plan, a CT image set of the MapCHECK2 in the MapPHAN was acquired with a slice thickness of 1 mm, and the CT number of that structure (the MapPHAN including the MapCHECK2) was assigned as 455 according to the manufacturer's guideline [2]. Utilising this CT image set, verification plans of each VMAT plan were generated in the Eclipse system. The reference 2D dose distributions were calculated with a dose calculation grid size of 1 mm, which is the finest dose calculation grid size of the Eclipse system. Before the measurements with the MapCHECK2 detector array, the outputs of the Trilogy and TrueBeam STx systems were calibrated according to the American Association of Physicists in Medicine (AAPM) Task Group 51 (TG-51) protocol [14]. The MapCHECK2 detector array was also calibrated according to the manufacturer's guideline before the measurements of 2D dose distributions of VMAT plans [2]. The setup of the MapCHECK2 dosimeter was verified by acquiring cone beam computed tomography (CBCT) images with the Trilogy and TrueBeam STx systems. Using the SNC Patient™ software (Sun Nuclear Corporation, Melbourne, FL), both the global and local gamma evaluations with absolute doses were performed with gamma criteria of 3%/3 mm, 2%/2 mm, 2%/1 mm, 1%/2 mm, and 1%/1 mm. The threshold value for the gamma evaluation was 10%.
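To make the distinction between global and local gamma evaluation concrete, the following is a minimal sketch of a passing-rate calculation. It is not the SNC Patient implementation: the function name `gamma_passing_rate`, the brute-force search without interpolation, the use of a single shared regular grid, and the way the 10% threshold is applied (to the measured dose, relative to the maximum reference dose) are all assumptions made for illustration.

```python
import math


def gamma_passing_rate(reference, measured, spacing_mm, dose_pct, dta_mm,
                       local=False, threshold_pct=10.0):
    """Percentage of evaluated points with gamma <= 1 (illustrative only).

    reference, measured: 2D lists of absolute dose on the same regular grid.
    spacing_mm: grid spacing in mm, assumed equal in both directions.
    dose_pct, dta_mm: gamma criteria, e.g. 2.0 and 1.0 for 2%/1 mm.
    local: normalise the dose difference to the reference dose at the
           evaluated point (local) or to the maximum reference dose (global).
    threshold_pct: measured points below this percentage of the maximum
           reference dose are excluded (assumed convention).
    """
    rows, cols = len(reference), len(reference[0])
    max_dose = max(max(row) for row in reference)
    cutoff = max_dose * threshold_pct / 100.0
    # A point can only pass if a reference point lies within dta_mm,
    # so the search is limited to that neighbourhood.
    reach = math.ceil(dta_mm / spacing_mm)
    passed = evaluated = 0
    for i in range(rows):
        for j in range(cols):
            if measured[i][j] < cutoff:
                continue
            evaluated += 1
            norm = reference[i][j] if local else max_dose
            if norm <= 0:
                continue  # no valid normalisation: counted as a failure
            tolerance = norm * dose_pct / 100.0
            best = math.inf
            for di in range(-reach, reach + 1):
                for dj in range(-reach, reach + 1):
                    ii, jj = i + di, j + dj
                    if not (0 <= ii < rows and 0 <= jj < cols):
                        continue
                    distance = spacing_mm * math.hypot(di, dj)
                    dose_diff = measured[i][j] - reference[ii][jj]
                    gamma = math.hypot(distance / dta_mm, dose_diff / tolerance)
                    best = min(best, gamma)
            if best <= 1.0:
                passed += 1
    return 100.0 * passed / evaluated if evaluated else float("nan")
```

With local normalisation the dose tolerance shrinks in low-dose regions, so the same dose discrepancy is more likely to fail there than under global normalisation.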

Differences in the dose-volumetric parameters between the original VMAT plans and the VMAT plans reconstructed with the log files
During measurements with the MapCHECK2 detector array, the log files recorded in the linac control system during VMAT deliveries were acquired with both the Trilogy and TrueBeam STx systems. The log files were records of the actual MLC positions, gantry angles, and delivered monitor units (MUs) during VMAT delivery [15]. Using an in-house program written in MATLAB (ver.8.1, MathWorks Inc., Natick, MA), the log files were combined and formatted to correspond to the VMAT plan file in DICOM-RT format and this plan file was imported to the Eclipse system. Subsequently, 3D dose distribution was calculated with a CT image set identical to that used to generate the original VMAT plan. The dose calculation resolution was identical to that of the original VMAT plan, which was 1 mm. With a structure set identical to that of the original VMAT plan, a total of 46 clinically relevant dose-volumetric parameters were calculated, which were the same as those calculated with the original VMAT plans. Subsequently, the DV errors were calculated.
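The DV error calculation can be sketched as follows. This is an illustration under stated assumptions, not the in-house MATLAB program or the Eclipse calculation: the function names are hypothetical, each structure is represented as a plain list of voxel doses in Gy with equal voxel volumes, only three example parameters are computed, and the sign convention (reconstructed minus original) is assumed because the text does not specify it.

```python
import math


def dose_at_volume(doses, volume_pct):
    """D_x%: minimum dose received by the hottest x% of the structure volume."""
    ordered = sorted(doses, reverse=True)
    # Number of voxels making up x% of the volume (small epsilon guards
    # against floating-point round-up).
    count = max(1, math.ceil(volume_pct * len(ordered) / 100.0 - 1e-9))
    return ordered[count - 1]


def volume_at_dose(doses, dose_gy):
    """V_xGy: percentage of the structure volume receiving at least dose_gy."""
    return 100.0 * sum(d >= dose_gy for d in doses) / len(doses)


def mean_dose(doses):
    return sum(doses) / len(doses)


def dv_errors(original, reconstructed):
    """DV errors as (log-file reconstruction) minus (original plan); assumed sign."""
    metrics = {
        "D50%": lambda d: dose_at_volume(d, 50.0),
        "V20Gy": lambda d: volume_at_dose(d, 20.0),
        "mean": mean_dose,
    }
    return {name: f(reconstructed) - f(original) for name, f in metrics.items()}


# Hypothetical voxel doses (Gy) for one structure, not data from this study.
original_plan = [10.0, 20.0, 30.0, 40.0]
log_file_plan = [10.0, 18.0, 30.0, 40.0]
errors = dv_errors(original_plan, log_file_plan)
```

In this toy example one voxel drops below 20 Gy in the reconstructed plan, so V20Gy and the mean dose change while D50% does not.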

Differences in the GPRs between the original VMAT plans and the VMAT plans reconstructed with the log files (GPRcal)
To evaluate the changes in the doses between the original VMAT plans and the VMAT plans reconstructed with the log files, we performed gamma evaluations between the two. A 3D gamma evaluation over the patient's whole body involves an enormous number of dose points, which potentially results in underestimation of the changes in the GPR values; therefore, we performed 2D gamma evaluations as described above instead of 3D gamma evaluations. In other words, for both the original VMAT plans and the VMAT plans reconstructed with the log files, 2D dose distributions were calculated utilising the CT image set of the MapPHAN with the MapCHECK2, and the values of the GPRcal (the GPRs between the 2D dose distribution calculated with the original VMAT plan and that calculated with the VMAT plan reconstructed with the log files) were acquired.

Correlations between the GPRs and the DV errors
The correlations between the GPRs with various gamma criteria (calculated vs. measured) and the DV errors were analysed by calculating Spearman's rank correlation coefficients (rs) with the corresponding p values. The rs values with p values equal to or less than 0.05 were regarded as statistically significant in the present study. Following the Evans guidelines proposed in 1996, the absolute rs values were interpreted as weak correlations (0.2 ≤ |rs| < 0.4), moderate correlations (0.4 ≤ |rs| < 0.6), strong correlations (0.6 ≤ |rs| < 0.8), or very strong correlations (|rs| ≥ 0.8) [16].
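The correlation analysis can be illustrated with a short sketch. It is not the statistical software used in the study: the function names are hypothetical, rs is computed as the Pearson correlation of average ranks, p values are omitted, and the label for |rs| < 0.2 is taken from the usual statement of the Evans scale rather than from the text above.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank correlation coefficient (Pearson correlation of ranks)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def evans_strength(rs):
    """Map |rs| to the bands described in the text."""
    magnitude = abs(rs)
    if magnitude >= 0.8:
        return "very strong"
    if magnitude >= 0.6:
        return "strong"
    if magnitude >= 0.4:
        return "moderate"
    if magnitude >= 0.2:
        return "weak"
    return "very weak"  # band not listed in the text; assumed from the Evans scale


# Hypothetical values, not data from this study.
gpr = [99.1, 98.0, 96.5, 95.2, 93.0]   # local GPRs (%)
dv_error = [0.1, 0.2, 0.4, 0.5, 0.9]   # absolute DV errors
rs = spearman_rs(gpr, dv_error)
label = evans_strength(rs)
```

Because Spearman's coefficient depends only on ranks, a perfectly monotonic decrease of the DV error with increasing GPR yields rs = -1 regardless of the shape of the relationship.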

GPRs of the Trilogy and TrueBeam STx systems
The global and local GPRs with the gamma criteria of 3%/3 mm, 2%/2 mm, 2%/1 mm, 1%/2 mm, and 1%/1 mm as well as the values of MCSv and LTMCS are shown in Table 1. According to the previous studies and guidelines, each VMAT plan in the present study was clinically acceptable based on the QA threshold, as the global GPRs with 2%/2 mm were always higher than 90% [17,18]. Except for the GPRs with 3%/3 mm, both the global and local GPRs of the VMAT plans with the TrueBeam STx system were always higher than those with the Trilogy system with statistical significance (p ≤ 0.05). Consistent with this, the average MLC positioning error, gantry angle error, and MU delivery error during VMAT delivery with the TrueBeam STx system were 0.09 ± 0.01 mm, 0.03° ± 0.00°, and 0.02 ± 0.01 MU, respectively, whereas those with the Trilogy system were 0.19 ± 0.06 mm, 0.05° ± 0.00°, and 0.11 ± 0.09 MU, respectively. However, the values of the MCSv and LTMCS indicated no statistically significant differences in the modulation degrees between the VMAT plans with the TrueBeam STx and Trilogy systems (p > 0.05). The values of MCSv and LTMCS range from 0 to 1 and decrease as the modulation degree increases [13]; the values obtained here indicated that the modulation degrees of the VMAT plans with both the TrueBeam STx and Trilogy systems were high.

GPRcal between the original VMAT plans and the VMAT plans reconstructed with the log files
The values of the global and local GPRcal with the gamma criteria of 3%/3 mm, 2%/2 mm, 2%/1 mm, 1%/2 mm, and 1%/1 mm between the original VMAT plans and the VMAT plans reconstructed with the log files are shown in Table 2. Statistically significant differences in the GPRcal values between the original VMAT plans and the VMAT plans reconstructed with the log files were observed for the global gamma evaluation with 2%/1 mm and the local gamma evaluations with 1%/2 mm and 1%/1 mm. For both global and local GPRcal, the differences between the GPRcal with 3%/3 mm and those with 1%/1 mm of the TrueBeam STx system were higher than those of the Trilogy system.

Correlations between the GPRs and the DV errors of the Trilogy system
The average values of each DV error with the Trilogy system are shown in S1 Table. Only the statistically significant correlation coefficients between the GPRs and the DV errors with the Trilogy system are shown in Table 3 with the corresponding p values. All correlation coefficients between the GPRs and the DV errors are shown in S2 Table.

Correlations between the GPRs and the DV errors of the TrueBeam STx system
The average values of each DV error with the TrueBeam STx system are shown in S1 Table. Only the statistically significant correlation coefficients between the GPRs and the DV errors with the TrueBeam STx system are shown in Table 4 with the corresponding p values. All correlation coefficients between the GPRs and the DV errors are shown in S2 Table. Among a total of 46 dose-volumetric parameters tested in this study, both the global and local GPRs with 1%/2 mm and the local GPRs with 1%/1 mm showed statistically significant correlations with DV errors most frequently (a total of 19 rs values with p values less than 0.05). The DV errors in the V20Gy of both the left and right PGs and the mean dose of the right PG showed higher rs values than the others with the local GPRs with various gamma criteria. In particular, the mean dose of the right PG showed strong correlations with every local GPR tested in this study except the GPR with 3%/3 mm (absolute values of rs > 0.6). The DV errors in the mean dose of the right PG as well as those in the V20Gy of the left PG with the TrueBeam STx system are plotted according to the local GPRs with various gamma criteria in Fig 1. As the GPRs increased, the DV errors decreased.

Discussion
In the present study, we observed moderate or strong correlations of the GPRs with some of the DV errors of the VMAT plans for H&N cancer with the TrueBeam STx system. In particular, the DV errors at the PGs showed higher correlations with the GPRs with various gamma criteria than those at the other structures. In contrast, weak or no correlations were generally observed between the GPRs and the DV errors with the Trilogy system, which is a C-series linac. The results obtained with the Trilogy system are consistent with those of the previous studies [4,5].
According to the values of the MCSv and LTMCS, the modulation degrees of the VMAT plans with the TrueBeam STx system were almost the same as those with the Trilogy system, and no statistically significant differences between them were observed (both with p > 0.05). In addition, the values of the local GPRcal with 1%/2 mm and 1%/1 mm (both with p < 0.05) of the TrueBeam STx system were lower than those of the Trilogy system, which is reasonable because the mechanical errors (MLC positioning errors, gantry angle errors, and MU delivery errors) during delivery with the TrueBeam STx were much smaller than those with the Trilogy system. Nonetheless, both the global and local GPRs with various gamma criteria of the TrueBeam STx system were higher than those of the Trilogy system with statistical significance, except for the GPRs with 3%/3 mm (all with p < 0.05). This means that the mechanical errors recorded in the Trilogy system during delivery were larger than those in the TrueBeam STx system; however, the actual dose delivery errors of the Trilogy system were larger than those of the TrueBeam STx according to the GPRs based on the measurements. This could be attributed to the more accurate VMAT delivery records in the log files of the TrueBeam STx system than those of the Trilogy system [19].
Previous studies showed that, among the three types of mechanical errors (MLC positioning error, gantry angle error, and MU delivery error), the MLC positioning error has the most dominant effect on VMAT delivery accuracy [20][21][22]. The log file of the actual MLC positions during the VMAT delivery of the TrueBeam STx system is the Trajectory file, which is a record of the direct MLC position values with an update interval of 20 ms [19]. However, the actual MLC position log file of the Trilogy system is the DynaLog file, which is a record of the actual motor values with an update interval of 50 ms [10]. The actual motor values are converted to MLC positioning values using a conversion table (mlctable.txt) [10]. Therefore, the Trajectory file contains more accurate MLC positioning information than the DynaLog file owing to its shorter update interval and direct-record method. This could result in a more accurate calculation of the DV errors with the TrueBeam STx system; therefore, higher correlations of the DV errors with the GPRs could be obtained with the TrueBeam STx system than with the Trilogy system.
The different results of the TrueBeam STx system from those of the Trilogy system in the present study might also be attributed to the different types of MLC systems (HD 120 MLC and Millennium 120 MLC), since the leaf width and the design of the HD 120 MLC are different from those of the Millennium 120 MLC. This was not investigated in this study, which is a limitation of the present study; therefore, we will investigate this in the future by utilising the VitalBeam system (Varian Medical Systems, Palo Alto, CA, USA).
Despite the high modulation degrees (average MCSv value less than 0.0014 and average LTMCS value less than 0.0006) of the H&N VMAT plans in the present study, all the VMAT plans utilised in this study were clinically acceptable, always showing global GPRs with 2%/2 mm higher than 90% [13,17,18]. Consequently, the magnitudes of the mechanical discrepancies between the original VMAT plans and the actual deliveries of the VMAT plans were small, which resulted in small DV errors. In particular, the more accurate VMAT deliveries with the TrueBeam STx could be consistently identified from the higher GPRs, smaller mechanical errors recorded in the log files, and smaller DV errors than those with the Trilogy system. This was attributed to the more accurate VMAT delivery of the TrueBeam STx system than that of the Trilogy system, owing to the integrated mechanical parameter control system [19]. Despite the smaller ranges of the delivery errors with the TrueBeam STx system than those with the Trilogy system, higher correlations between the GPRs and the DV errors were observed with the TrueBeam STx system than with the Trilogy system owing to the reasons described above. Previous studies that examined the correlations between the GPRs and the DV errors with IMRT plans concluded that the GPRs did not predict DV errors of clinically relevant DV endpoints [4,5]. In contrast, in the present study with VMAT plans using the TrueBeam STx, the GPRs with 1%/2 mm and 1%/1 mm showed moderate or strong correlations with some of the DV errors, which, to the best of our knowledge, has not been reported previously. The GPRs showed strong correlations with some (but not all) of the DV errors. Since the gamma index method summarises the accuracy of VMAT delivery with a single value, i.e., the GPR, it can only indicate whether the overall VMAT delivery accuracy is high or not.
In other words, if the GPR is low, we would be aware that the VMAT delivery to a patient would be inaccurate but we would not know the location of the inaccuracy in the patient's body, i.e., spatial information of the errors in the patient's body would not be provided by the GPRs. This is a limitation of the plan verification with the GPR. Therefore, as the AAPM TG-218 protocol recommended, in addition to the GPR, various types of information provided by the gamma index method such as gamma value and gamma map should be examined comprehensively [23]. According to the results of the present study, the GPRs of the VMAT plans with the TrueBeam STx system could indicate the occurrence of DV errors during VMAT deliveries; however, they could not indicate where and what kind of DV errors would occur in a patient's body.
In the present study, the local GPRs with 1%/2 mm and 1%/1 mm showed strong correlations with the DV errors at the PGs. This appears reasonable, as the DV constraints of the PGs are generally difficult to satisfy because the PGs generally overlap with or lie extremely close to the PTVs (PTV54 or PTV67.5). Therefore, steep dose gradients were generally generated between the PGs and the target volumes by highly modulated photon beams, and such gradients are highly sensitive to the uncertainty of VMAT delivery [23][24][25]. If there is a discrepancy between the original VMAT plan and the actual delivery of VMAT at the steep dose gradient generated in or near the PGs, DV errors would occur at the PGs [23][24][25]. This would explain the strong correlations observed between the GPRs and the DV errors of the PGs in the present study.
The limitation of the present study is that we could not include the clinically unacceptable VMAT plans, which failed in the gamma evaluation owing to the extremely high modulations. Therefore, we could not determine the tolerance levels of the gamma index method based on the DV errors of clinically relevant DV endpoints in this study. Another limitation of the present study is that the number of VMAT plans analysed in this study was limited, i.e., only 20. Besides the number of VMAT plans, the tumour site of the VMAT plans in the present study was limited by the analysis of only H&N VMAT plans. We will conduct further studies in the future to overcome the limitations of the present study.
Fig 1. Differences in the dose-volumetric parameters of the parotid glands between the original volumetric modulated arc therapy (VMAT) plans and the plans reconstructed with the log files recorded during VMAT delivery according to the local gamma passing rates (GPRs) with various gamma criteria of the TrueBeam STx system. Differences in the dose-volumetric parameters between the original VMAT plans and the plans reconstructed with the log files recorded during VMAT delivery (DV errors) of the parotid glands (PGs) are plotted according to the local GPRs with various gamma criteria for the TrueBeam STx system. The DV errors of the mean dose of the right PG vs. local GPRs with the gamma criteria of 2%/2 mm (a), 2%/1 mm (b), 1%/2 mm (c), and 1%/1 mm (d) are shown. The DV errors of the V20Gy of the left PG vs. local GPRs with the gamma criteria of 2%/2 mm (e), 2%/1 mm (f), 1%/2 mm (g), and 1%/1 mm (h) are also shown.
https://doi.org/10.1371/journal.pone.0244690.g001
In the present study, we demonstrated that the GPRs with tight gamma criteria could predict some DV errors when utilising a linac system whose log record system during VMAT delivery is accurate. Therefore, the gamma index method is worth performing before the delivery of VMAT plans to patients for accurate patient treatment. In addition, the gamma index method is an independent, measurement-based verification method for the accuracy of VMAT plan delivery. Although the GPRs with 1%/2 mm and 1%/1 mm could predict some DV errors of clinically relevant DV endpoints of VMAT in the present study, they do not provide spatial information on the DV errors. Therefore, when utilising the gamma index method, a comprehensive analysis that includes gamma value and gamma map analyses in addition to the GPR evaluation should be performed.
Supporting information S1