Assessment and agreement of the CT appearance pattern and its severity grading of radiation-induced lung injury after stereotactic body radiotherapy for lung cancer

Purpose Radiographic severity of radiation-induced lung injury (RILI) has not been well-studied. The goal of this study was to assess the CT appearance pattern and severity of RILI without consideration of the clinical presentation. Material and methods A total of 49 patients, 41 with primary lung cancer and 8 with metastatic lung cancer, were treated by 4-fraction stereotactic body radiotherapy (SBRT). RILI after SBRT was separately assessed by two observers. The early and late CT appearance patterns and CT-based severity grading were explored. Results The median follow-up period was 39.0 months. In the early CT findings of observers 1 and 2, there was diffuse consolidation in 15 and 8, diffuse ground glass opacity (GGO) in 0 and 0, patchy consolidation and GGO in 17 and 20, patchy GGO in 3 and 3, and no changes in 10 and 14, respectively (kappa = 0.61). In late CT findings of observer 1 and 2, there were modified conventional pattern in 28 and 24, mass-like pattern in 8 and 11, scar-like pattern in 12 and 12, and no changes in 1 and 2, respectively (kappa = 0.63). In the results of the CT-based grading by observers 1 and 2, there were grade 0 in 1 and 2, grade 1 in 10 and 14, grade 2 in 31 and 29, grade 3 in 7 and 4, and none of grade 4 or more, respectively (kappa = 0.66). According to multivariate analyses (MVA), the significant predicting factors of grade 2 or more CT-based RILI were age (p = 0.01), oxygen dependence (p = 0.03) and interstitial shadow (p = 0.03). Conclusions The agreement of the CT appearance and CT-based grading between two observers was good. These indicators may be able to provide us with more objective information and a better understanding of RILI.

In most cases, RILI is graded according to the Common Terminology Criteria for Adverse Events (CTCAE). Although CTCAE grading of pulmonary fibrosis includes radiographic pulmonary fibrosis, it is not appropriate for RILI after SBRT considering the CT appearance pattern after SBRT. CTCAE grading of pneumonitis deals with the symptomatic and therapeutic factors for RILI, but cannot assess the radiographic severity. It is true that symptomatic RILI is important, and many reports have emphasized the predictive factors of symptomatic RILI. However, the comorbidities of the patient, baseline respiratory function, subclinical interstitial lung disease, performance status or subjectivity of physicians can affect symptomatic complaints [10][11][12][13][14][15]. Thus, the radiological appearance of RILI is not always accompanied by clinical symptoms [16]. For example, many doctors have difficulty assessing RILI in oxygen-dependent patients treated with home oxygen therapy (HOT) because these patients frequently develop dyspnea and require an increase in oxygen flow in spite of only a moderate radiographic change. These symptoms can be caused not only by RILI but also by progression of their underlying disease and poor pulmonary reserve. In this case, doctors would decide on the best-fit category to grade RILI according to CTCAE, but the agreement between doctors would be low. This study aimed to objectively assess RILI and gather information that can be masked by clinical findings. In this study, one radiologist and one radiation oncologist separately assessed early and late CT appearances and graded late RILI using a CT-based severity grading scale after SBRT. In addition, the predictive factors of CT-based RILI grading were explored.

Patients and treatments
This retrospective study was approved by the Ethical Committee of Tohoku University Hospital (reference number: 2016-768), and informed consent was obtained from all patients. SBRT for a non-centrally located lung tumor was assigned to a 4-fraction schedule at our institute. Patients who were treated with 4-fraction SBRT and for whom 6 months or more of follow-up CT data were available were included in this study. A total of 49 eligible patients were treated between December 2007 and August 2015. There were 41 patients with primary lung cancer and 8 with metastatic lung cancer. Six patients had received HOT treatment at the time of receiving SBRT (HOT patients), and all of these patients were diagnosed with chronic obstructive pulmonary disease. The characteristics of the patients and tumors are shown in Table 1.
In SBRT, a vacuum pillow (Vac-loc, Med-tek, Orange City, IA) was used to immobilize each patient. An X-ray simulator (Ximatron or Acuity, Varian Medical Systems, Palo Alto, CA), 4-D CT, or both were used to evaluate intrafractional lung tumor motion. If the respiratory amplitude was larger than 10 mm, the abdominal compression or breath hold method was used to reduce the internal target volume (ITV) margin. Planning CT scans were performed at intervals of 2.5 mm (GE Light Speed Qxi, GE Healthcare, Waukesha, WI). Gross tumor volume (GTV) was defined as the visible extent of the tumor on planning CT images. ITV was typically created from 10 respiratory phases generated from 4-D CT. A 0-5 mm margin was added to the ITV to account for microscopic extension, and the result was then expanded by 5 mm in all directions to account for set-up uncertainty and to form the planning target volume (PTV). Radiotherapy planning was performed using a 3-D radiotherapy planning system (Eclipse, Varian Medical Systems, Palo Alto, CA). SBRT was delivered using multiple coplanar and non-coplanar static beams with a linear accelerator (Clinac 23EX, Varian Medical

Early and late CT appearance patterns and CT-based grading scale of RILI
The CT appearance pattern was judged according to Linda's classification of the CT findings [17]. Early appearance was defined as CT findings in the first 6 months after SBRT; late appearance was defined as CT findings after the first 6 months after SBRT. The early CT appearance pattern consisted of a diffuse consolidation, diffuse ground-glass opacity (GGO), patchy consolidation and GGO, patchy GGO, and no changes. The late CT appearance pattern consisted of a modified conventional pattern, mass-like pattern, scar-like pattern, and no changes.
For the CT-based severity grading scale, the modified RTOG/EORTC Late Radiation Morbidity Scoring Schema of the lung was used. The classification of the CT-based grading was as follows: grade 0, none; grade 1, slight radiographic appearance; grade 2, patchy radiographic appearance; grade 3, diffuse radiographic change <25% of the lung volume; grade 4, diffuse radiographic change ≥25% of the lung volume; and grade 5, death (Table 2). Because the late radiation scoring schema was used, follow-up CT 6 months after SBRT was used for the judgments. For grading, diffuse radiographic changes were used instead of dense radiographic changes of the RTOG/EORTC criteria because dense radiographic changes or similar changes (such as mass-like shadow) were sometimes observed after SBRT. Thus, the CTCAE of pulmonary fibrosis was referenced to define grades 3 and 4.
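The grading rule described above can be written as a small lookup. The following is only an illustrative sketch: the function name, the text labels ("none", "slight", "patchy", "diffuse") and the arguments are hypothetical and are not taken from the study, in which grading was a visual judgment made by the observers.

```python
def ct_based_grade(change, lung_fraction=0.0, died_of_rili=False):
    """Return the CT-based grade (0-5) for one patient.

    change        -- hypothetical label for the observer's impression:
                     "none", "slight", "patchy" or "diffuse"
    lung_fraction -- fraction of lung volume showing diffuse change (0.0-1.0);
                     only used when change == "diffuse"
    died_of_rili  -- True if death was attributed to RILI (grade 5)
    """
    if died_of_rili:
        return 5
    simple_grades = {"none": 0, "slight": 1, "patchy": 2}
    if change in simple_grades:
        return simple_grades[change]
    if change == "diffuse":
        # Grade 3: diffuse change in <25% of the lung volume; grade 4: 25% or more.
        return 3 if lung_fraction < 0.25 else 4
    raise ValueError(f"unknown change label: {change!r}")


if __name__ == "__main__":
    print(ct_based_grade("patchy"))          # patchy change
    print(ct_based_grade("diffuse", 0.10))   # diffuse change in 10% of the lung
```

The 25% threshold is the only quantitative element of the scale; the distinction between slight, patchy and diffuse remains a matter of observer interpretation.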

Statistical analysis
The early/late CT appearance patterns and CT-based grading of RILI were assessed by two observers: one radiation oncologist and one radiologist (T.Y. and Y.M., with 9 and 5 years of experience, respectively). Each assessment was blinded, but the clinical and treatment information was open to the observers. Interobserver agreement was measured using the kappa statistic [18]. Cohen's unweighted kappa was applied for the agreement of the CT appearance pattern, and the quadratic-weighted kappa was applied for the agreement of the grading. Interobserver agreement was categorized by kappa values, as follows: poor, <0.20; fair, 0.20-0.39; moderate, 0.40-0.59; good, 0.60-0.79; or excellent, >0.80. To perform radiotherapeutic parameter analysis, all treatment plans were recalculated with Acuros XB, version 11031. The parameter of V n Gy was defined as the percentage volume of the lung that received n Gy or more. The time to an event was calculated from the first day of SBRT to the day an event was confirmed. The Cox proportional hazards model was used to perform univariate analyses (UVA) and multivariate analyses (MVA). A stepwise backward elimination/forward addition approach using the Akaike information criterion (AIC) was applied to build the best MVA model. A p value less  [19].
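As an illustration of the agreement statistics named above, the following minimal sketch computes Cohen's unweighted kappa and the quadratic-weighted kappa for two raters and maps a kappa value to the agreement categories listed in the text. It is not the software used in the study; the function names are hypothetical, and treating a value of exactly 0.80 as "excellent" is an assumption, because the listed ranges leave that value unassigned.

```python
def cohen_kappa(ratings_a, ratings_b, categories, weights="none"):
    """Cohen's kappa for two raters over the same cases.

    weights="none"      -> unweighted kappa (nominal categories)
    weights="quadratic" -> quadratic-weighted kappa (ordered categories,
                           given in `categories` from lowest to highest)
    """
    if weights not in ("none", "quadratic"):
        raise ValueError("weights must be 'none' or 'quadratic'")
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)

    # Observed joint proportions and each rater's marginal proportions.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        if weights == "quadratic":
            return ((i - j) / (k - 1)) ** 2
        return 0.0 if i == j else 1.0

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed_dis = sum(disagreement(i, j) * observed[i][j] for i, j in cells)
    expected_dis = sum(disagreement(i, j) * row[i] * col[j] for i, j in cells)
    if expected_dis == 0:
        raise ValueError("kappa is undefined when chance disagreement is zero")
    return 1.0 - observed_dis / expected_dis


def agreement_category(kappa):
    """Map kappa to the categories in the text (0.80 itself assumed excellent)."""
    if kappa < 0.20:
        return "poor"
    if kappa < 0.40:
        return "fair"
    if kappa < 0.60:
        return "moderate"
    if kappa < 0.80:
        return "good"
    return "excellent"


if __name__ == "__main__":
    # Made-up grades for four cases, for demonstration only.
    obs1 = [0, 0, 2, 2]
    obs2 = [0, 1, 2, 2]
    kappa = cohen_kappa(obs1, obs2, categories=[0, 1, 2], weights="quadratic")
    print(round(kappa, 3), agreement_category(kappa))
```

The quadratic weighting penalizes a two-grade disagreement more heavily than a one-grade disagreement, which is why it suits the ordered severity grades, whereas the unweighted form treats every mismatch between appearance patterns equally.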

Treatment results
The median follow-up period was 39.0 months for all patients and 42.3 months for living patients. During follow-up, 14 patients died: 10 died from primary disease; 2 from another cancer; 1 from cerebral infarction; and 1 in an accident. No treatment-related deaths occurred. The median recalculated dose of D95 was 39.7 Gy (range, 34.3-43.4 Gy), and local tumor failure occurred in 9 patients during follow-up. Symptomatic RILI occurred in 10 patients, and steroids were administered to 4 patients. According to CTCAE, grade 0 was assessed in 10 patients; grade 1 in 29 patients; grade 2 in 9 patients; and grade 3 RILI in 1 patient. There were rib fractures in 7 patients.

Interobserver variability
The unweighted kappa for the early CT appearance of RILI was 0.61 (95% CI: 0.43-0.80), which suggested that the interobserver agreement was good. The unweighted kappa for the late CT appearance was 0.63 (95% CI: 0.45-0.82), which similarly indicated that the interobserver agreement was good. The agreement for the CT-based late radiographic grading of RILI was also good (quadratic weighted kappa = 0.66; 95% CI: 0.23-1.00). However, the agreement between the CTCAE grade and radiographic grading by observer 2 was moderate (quadratic weighted kappa = 0.43; 95% CI: 0.20-0.67), and the agreement between the CTCAE grade and radiographic grading by observer 1 was not calculated because the observed concordance was smaller than the mean chance concordance.
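For reference, the standard definition of kappa makes the last case explicit. The symbols p_o (observed concordance) and p_e (concordance expected by chance) are conventional notation and do not appear in the original text; for the weighted statistic they denote the weighted concordances.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
\qquad
p_o < p_e \;\Longrightarrow\; \kappa < 0
```

When the observed concordance falls below the chance concordance, kappa is negative and therefore lies outside the agreement categories defined in the statistical analysis.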

Cox regression analyses for CT-based RILI grades
Cox regression analyses for grade 2 or more RILI as assessed by observer 1 using only CT findings were performed. The results of UVA and MVA are shown in Table 6.

Discussion
This study aimed to assess the early and late CT appearance and severity of RILI after SBRT using a radiographic severity grading scale without consideration of the clinical presentation and treatment content for RILI. This attempt was considered to be successful because the results of each agreement were good. Although the agreement between the CTCAE grade and radiographic grading was not good, the result of MVA for the CT-based grade was interesting. The better the CT-based severity grading scale, the more information it can provide.
MVA for the CT-based grade assessed by observer 1 showed that age, HOT and interstitial shadow were significant predictive factors for grade 2 or more CT-based RILI. Older age was reported to be a risk factor for RILI [20][21]. In chemoradiotherapy for lung cancer, both the carboplatin/paclitaxel regimen and an age greater than 65 years were classified as high risks for RILI [22]. However, the result of this study showed the opposite: older age reduced the HR, which suggested that the poorer tolerance to RILI in older age comes from age-related problems, such as comorbidities and frailty. In addition, HOT was a significant factor: patients treated with HOT had a lower CT-based RILI grade. Patients who receive HOT have been thought to be susceptible to developing dyspnea and sometimes require an increased oxygen flow, but these points have not been well-studied. Our result suggested that older patients and HOT patients have poorer tolerance to RILI, but this does not mean that they have higher radiosensitivity. On the other hand, the presence of an interstitial shadow indicated a higher CT-based RILI grade, comparable to previous CTCAE-based findings [23]. RILI is induced not only by the progression of underlying disease but also by increased radiosensitivity, which sometimes leads to acute exacerbation of the underlying disease [24]. There have been some previous reports on RILI using CT appearance. Avanzo et al. regarded diffuse consolidation and patchy consolidation and GGO in acute RILI as severe RILI. They reported that V 5 Gy, V 20 Gy, the mean lung dose, and the number of fractions correlated significantly with severe RILI; the dose for a 50% probability of severe RILI was 73.0 Gy in 5 and 8 fractions [25][26]. Bernchou et al. divided the CT appearance of acute RILI after conventional fractionated radiotherapy into 3 categories: interstitial changes, GGO, or consolidation [27].
The factors affecting all categories were the interval between the commencement of radiotherapy and the follow-up CT scan and the lung dose metrics. On the other hand, dosimetric factors such as V 5 Gy and V 20 Gy were not significant factors in this study, possibly because of the difference between "acute" and "late". In regard to the severity of RILI, previous reports used an acute CT appearance; however, the late CT severity grading scale was used in the current study. Dosimetric factors may be better predictors for an early CT appearance than a late CT appearance. Although the agreement between the two observers was good, it fell short of excellent. A more precise definition would lead to better agreement. The difference between the training of the radiologist and that of the radiation oncologist may also have contributed to the less-than-excellent agreement. One of the reasons may be the distinction between "patchy" and "diffuse". This distinction is defined as incompletely versus completely filling the "high-dose region" in Linda's criteria, the interpretation of which may differ between observers [17]. Dahele et al. defined the difference using an objective cutoff: whether or not the change was more than 5 cm in the largest dimension [28]. This definition may offer better interobserver agreement, but may increase the influence of the PTV on the radiological assessment of SBRT.
The date of the late CT appearance diagnosis also showed some difference. The intervals between SBRT and early CT diagnosis by observer 1 and observer 2 were almost the same, with averages of 4.4 months and 4.8 months, respectively. By contrast, the intervals to late CT diagnosis by observer 1 and observer 2 differed: the averages were 22.2 months and 16.7 months, respectively. The interval periods were consistent with previous findings [17]. However, the difference between observers in the interval to late CT diagnosis indicated that prolonged or transitional shadows of an early CT appearance may have confused the observers. Defining this point more precisely may contribute to better agreement between observers.
There were several limitations in the current study. This study was a retrospective study conducted at a single institute with a limited sample size. The timing of follow-up CT was not constant. The number of patients receiving HOT and number of patients with interstitial shadows were small. Some possible factors, such as peripheral oxygen saturation, spirometry data and the serum KL-6 level, were lacking. A prospective study with a larger sample size is needed to overcome these limitations.

Conclusions
In conclusion, the CT-based appearance and severity of RILI were assessed with good agreement. Older age, receiving HOT and absence of an interstitial shadow were related to a lower grade of RILI. This relatively objective assessment could provide further information that has been masked by clinical presentation.

Supporting information
S1 Data. Relevant data.xls to this manuscript. (XLS)