Time efficiency and reliability of established computed tomographic obstruction scores in patients with acute pulmonary embolism

Objective Acute pulmonary embolism (PE) is a life-threatening disease with high mortality. Computed tomographic pulmonary angiography (CTPA) is used in clinical routine to diagnose PE. Many pulmonary obstruction scores have been proposed to aid in stratifying the clinical course of PE. The purpose of the present study was to compare common pulmonary obstruction scores in PE with regard to time efficiency and interreader agreement in a representative patient sample. Methods Overall, 50 patients with acute PE were included in this single-center, retrospective analysis. Two readers, blinded to each other, scored the CT images using the scores proposed by Mastora et al., Qanadli et al., Ghanima et al. and Kirchner et al. The time required for each scoring was recorded. Results For reader 1, the Mastora score took the longest, followed by the Kirchner score, the Qanadli score and finally the Ghanima score (every test, p<0.0001). Interreader agreement was excellent for all scores, with no significant differences between them. In the Spearman’s correlation analysis, strong correlations were identified between the scores of Mastora, Qanadli and Kirchner, whereas the Ghanima score was only moderately correlated with the other scores. There was a weak correlation between time duration and the Mastora score (r = 0.35, p = 0.014). For the Ghanima score, a significant inverse correlation was found (r = -0.67, p<0.0001). Conclusion For the investigated obstruction scores, there are significant differences in time consumption but no relevant differences in interreader variability in patients with acute pulmonary embolism. The Mastora score requires the most time, whereas the Ghanima score requires the least.



Introduction
Acute pulmonary embolism (PE) is a potentially life-threatening disease with 30-day mortality rates ranging from 0.5% to over 20%, depending on clinical symptoms at presentation [1]. However, there are also low-risk clinical courses without severe complications. Therefore, immediate risk stratification of patients with acute PE at the time of presentation is crucial for planning patient care. Computed tomographic pulmonary angiography (CTPA) has been established as the diagnostic gold standard for the detection of PE [2,3]. Sensitivity and specificity of up to 100% have been reported in some studies [2]. Since then, CT technology has improved significantly, especially owing to the increasing number of CT slices and the consequently better image quality.
In clinical routine, the radiologist assesses whether PE is present. Yet some CT signs carry prognostic information that can guide treatment planning and predict mortality [4]. In clinical evaluation, the right ventricle (RV) to left ventricle (LV) diameter ratio was identified as the strongest and most robust predictor of clinical outcomes in patients with acute PE [3]. Contrast media reflux into the inferior vena cava has also been reported as a significant prognostic marker in acute PE [5,6].
The quantified total embolus burden represents another important CTPA parameter. The rationale is that more obstructed vessels lead to higher resistance and therefore to right heart insufficiency. In fact, in the first studies, the scores were associated with invasive pulmonary angiography and were able to predict short-term mortality. For example, Wu et al. found that clot burden quantified on CT pulmonary angiography was an important predictor of death in patients with PE [7]. Similar results were also reported by Van der Meer et al. [8]. However, other authors did not find any associations between total clot burden and mortality in PE [9][10][11].
Despite the growing body of literature on embolus burden scores, there are only a few comparisons between these scores [12]. Moreover, the complexity of the scores differs considerably. For the score proposed by Mastora et al. [13], every pulmonary vessel is scored from 0 to 5, with 0 reflecting no embolism and 5 complete obstruction, whereas for the score proposed by Ghanima et al. [15] only the level of the most proximal vessel obstruction is quantified. The resulting point range is 0 to 155 for the Mastora score and 0 to 4 for the Ghanima score. These differences could affect how well each score reflects clinical features.
In our clinical experience, the time needed to quantify these scores differs considerably. Some of the obstruction scores have been rated as too cumbersome for clinical routine [3]. Moreover, interreader agreement might be higher for the simpler scores than for the score by Mastora et al. Clearly, there is a need to investigate the differences between these pulmonary embolism scores.
Therefore, the purpose of the present study was to compare four of the most commonly used pulmonary obstruction scores for PE with regard to time efficiency and interreader agreement in a representative patient sample.

Methods
This retrospective study was approved by the institutional review board (Nr: 118/19-ck, Ethics Committee, University of Leipzig, Leipzig, Germany).
The patient sample was drawn from a larger study sample, which had been used to assess the associations between the Mastora score and clinical features in patients with acute PE [6,11]. Inclusion criteria were sufficient pulmonary vessel contrast and a representative PE manifestation. Patients with only small subsegmental emboli were excluded. Patients with other pulmonary diseases, such as pneumonia, cardiac decompensation or pleural effusions, were not excluded to ensure external validity. The CT scans were obtained between 2015 and 2018.

Imaging technique
CTPA was performed on a 128-slice CT scanner (Ingenuity 128, Philips, Hamburg, Germany). An iodine-based contrast medium (60 mL Imeron 400 MCT, Bracco Imaging Germany GmbH, Konstanz, Germany) was administered intravenously at a rate of 4.0 mL/s via a peripheral venous line. Automatic bolus tracking was performed in the pulmonary trunk with a trigger of 100 Hounsfield units (HU). Typical imaging parameters were: 100 kVp; 125 mAs; slice thickness = 1 mm; and pitch = 0.9. CTPA was performed at deep inspiration in every case.
Qanadli score. This obstruction score (0-100%) was defined based on the number of obstructed segmental arteries and was corrected according to the estimated degree of occlusion of each vessel (1 for partial obstruction; 2 for complete obstruction) [14].
Ghanima score. This obstruction score is based on the proximal extension of the embolus relative to the main pulmonary arteries. The pulmonary arterial tree is divided into four levels, and the score is calculated according to the level of the proximal extension in each lung: mediastinal arteries (4 points), lobar arteries (3 points), segmental arteries (2 points), and subsegmental (1 point) [15].
Kirchner score (modified Miller score). The fourth score was calculated as reported by Kirchner et al. [16]; it is a modification of a score previously proposed by Miller et al. [17]. This score ranges from 0 to 16. The presence of embolic material is rated on a two-point scale (0 absent, 1 present) in a total of 16 segmental arteries.
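The scoring rules described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used in the study: the function names and input formats are hypothetical, and the Qanadli function assumes 20 segmental arteries (10 per lung), a number taken from the original Qanadli publication and not stated in this text. It also simplifies by rating each segmental artery directly.

```python
from typing import Iterable

# Points per level of the pulmonary arterial tree, as listed for the Ghanima score.
GHANIMA_POINTS = {"subsegmental": 1, "segmental": 2, "lobar": 3, "mediastinal": 4}


def ghanima_score(levels_with_emboli: Iterable[str]) -> int:
    """Points of the most proximal level containing an embolus (0 if none)."""
    return max((GHANIMA_POINTS[level] for level in levels_with_emboli), default=0)


def kirchner_score(segment_has_embolus: Iterable[bool]) -> int:
    """Number of segmental arteries with embolic material (0 absent, 1 present)."""
    flags = list(segment_has_embolus)
    if len(flags) != 16:
        raise ValueError("expected findings for 16 segmental arteries")
    return sum(1 for flag in flags if flag)


def qanadli_score(degrees: Iterable[int], n_segments: int = 20) -> float:
    """Obstruction in percent from per-segment degrees (0 none, 1 partial, 2 complete).

    n_segments = 20 is an assumption (see lead-in), not a value from this text.
    """
    values = list(degrees)
    if len(values) != n_segments or any(d not in (0, 1, 2) for d in values):
        raise ValueError("expected one degree in {0, 1, 2} per segmental artery")
    return 100.0 * sum(values) / (2 * n_segments)
```

With these definitions, a patient with emboli only at the segmental and subsegmental levels receives a Ghanima score of 2 regardless of how many segments are involved, whereas the Kirchner and Qanadli values grow with the number of affected segmental arteries.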

Image analysis and assessment of time duration
The images were evaluated by two advanced residents (NB, HJM) with 5 and 4 years of general radiological experience including CT imaging, respectively.
Pulmonary embolism was defined as a contrast medium filling defect within a pulmonary vessel on at least 2 slices.
Before the patient images were read, both radiologists were trained for 2 hours in scoring PE with these scores, using images of other patients with PE. To reduce possible recognition bias, there was a one-week delay between the assessments of the different scores in these cases.
The time required to assess the scores was measured with a stopwatch. The start was defined as the opening of the CT study, the end as the recording of the score in a spreadsheet. Both readers were blinded to each other's results.

Statistical analysis
The statistical analysis and graphics creation were performed using GraphPad Prism 5 (GraphPad Software, La Jolla, CA, USA). Collected data were evaluated by means of descriptive statistics (absolute and relative frequencies). Spearman's correlation coefficient (r) was used to analyze associations between the investigated scores. The intraclass correlation coefficient (ICC) was used to quantify interreader variability. Bland-Altman plots were used to visualize the interreader variability. Group differences were calculated with the Mann-Whitney test. In all instances, p values <0.05 were taken to indicate statistical significance.
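For readers who want to reproduce two of these analyses outside GraphPad Prism, the following Python sketch computes Spearman's r and the Bland-Altman bias with 95% limits of agreement using only the standard library. It is a minimal illustration under stated assumptions, not the procedure used in the study: the function names are hypothetical, p values, the ICC and the Mann-Whitney test are not covered, and the limits of agreement assume approximately normally distributed differences.

```python
from statistics import mean, stdev


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_r(x, y):
    """Spearman's r: Pearson correlation computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def bland_altman(reader1, reader2):
    """Return (bias, lower limit, upper limit) of agreement between two readers."""
    diffs = [a - b for a, b in zip(reader1, reader2)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width
```

Passing the paired score values of both readers to `bland_altman` gives the numbers behind a Bland-Altman plot, and `spearman_r` can be applied either to two scores or to a score and its reading time.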

Results
The descriptive score results are provided in Table 1.
For reader 1, the Mastora score took the longest, followed by the Kirchner score, the Qanadli score and finally the Ghanima score (every test p<0.0001).
For reader 2, the Mastora score also took the longest, followed by the Qanadli score, the Kirchner score and the Ghanima score (every test p<0.0001).
There were significant differences between readers 1 and 2 with regard to time duration: reader 2 was significantly faster for every score (p<0.0001 for every score). Fig 2 displays the time duration for every score as scatter plots. In the Spearman's correlation analysis, strong correlations were identified between the Mastora score, the Qanadli score and the Kirchner score, whereas the Ghanima score was only moderately correlated with the other scores (Table 3).
There was a weak correlation between the time duration and Mastora score (r = 0.35, p = 0.014). For Ghanima score, a significant inverse correlation with the time duration was identified (r = -0.67, p<0.0001, Fig 4). Regarding the other scores, no correlation was identified with time duration (Qanadli score r = -0.19, p = 0.19, Kirchner score r = 0.17, p = 0.24).

Discussion
The present study investigated commonly used pulmonary embolism obstruction scores with regard to their interreader variability and the time needed to score them. There were significant differences in time consumption between the scores. Notably, all of them had excellent interreader agreement. CT is the imaging modality of choice to diagnose and to rule out PE [2,3]. There is extensive scientific effort to obtain potential biomarkers from CT images rather than relying only on qualitative assessment by the radiologist.
With this approach, novel biomarkers could be obtained to predict possible complications and mortality in PE. This is of particular interest because PE is a potentially life-threatening disease with high mortality [1]. Early prediction of a hazardous course of PE could be crucial, especially as CT imaging is often one of the first diagnostic procedures in these patients. Several imaging signs have been identified as having prognostic implications [4]. Potential CT signs are the ratio of right ventricular to left ventricular diameter and contrast media reflux into the inferior vena cava [4][5][6].
There is an extensive body of literature investigating the possible clinical benefit of embolism obstruction scores [9]. Different scoring systems have been proposed. What they have in common is that all of them rate the location of the embolism, with a more proximal location resulting in a higher rating. However, these scores were primarily validated by the authors of the original studies, and only a few independent evaluations have been performed.
The score by Mastora et al. is very complex: it rates the degree of obstruction of every pulmonary vessel down to the segmental vessels from 0 to 5 [13]. For the Ghanima score, the radiologist only has to locate the most proximal embolism, resulting in a score value from 0 to 4 [15].
These requirements result in different time effort for the scoring. The Mastora score needed the most time, followed by the Qanadli and Kirchner scores, whereas the Ghanima score was the fastest. In clinical routine, the Ghanima score can therefore easily be reported by the radiologist with little effort.

Moreover, there is a weak correlation between the time duration and the Mastora score, indicating that more occluded pulmonary vessels result in more time needed to calculate the score. For the other scores, no similar association was identified, which suggests that their scoring time does not depend on overall embolus burden and that complex cases are scored as quickly as cases with fewer involved vessels.
On the other hand, a strong inverse correlation between the time duration and the Ghanima score value was identified, which suggests that severe, proximal occlusions are fast and easy to score. The radiologist only needs to scroll to the pulmonary trunk and the main pulmonary vessels to see a large embolism, which results in a score of 4. This score can, therefore, be easily used in clinical routine.
For all scores, the interreader agreement was very good to excellent, with no substantial differences between the scores. These results are well comparable with the interreader agreement published in the study by Aribas et al. [12].
According to the literature, the scores of Mastora and Qanadli were most often used to assess the pulmonary obstruction and to correlate them with possible outcome predictors.
In other studies, these promising results could not be replicated. For example, Bach et al. could not identify any predictive power of the Mastora score for 30-day mortality [5]. Similar results were also reported by other authors [6]. Moreover, the Mastora score was only weakly correlated with lactate level and not with other serological or clinical parameters [11].
In contrast, some studies did report correlations with clot burden. For example, Thieme et al. [18] reported a statistically significant correlation between the Mastora score and troponin level (r = 0.37, p = 0.016). Furthermore, Gül et al. investigated 28 patients with PE and identified a slight correlation between the Qanadli score and troponin level (r = 0.32, p = 0.01) [19]. Similarly, Jeebun et al. showed a significant correlation between clot burden and troponin level (r = 0.41, p = 0.048) [20]. In brief, only weak correlations, if any, can be assumed between clot burden and serological parameters in PE patients.
Ghanima et al. proposed a simple score to predict right heart dilation and serum troponin levels [15]. Yet contrary to initial belief, the proximal localization of the embolus is not associated with all-cause mortality, which could weaken the prognostic relevance of this score [4]. This might be of interest, as the score by Ghanima et al. does not account for multiple emboli and might therefore miss clinically relevant information. For example, a patient with multiple segmental emboli is scored the same as a patient with only one segmental embolus, with a score value of 2.
One finding of the present study is that the Ghanima score was only moderately correlated with the other scores. This may indicate that the simplified approach by Ghanima also leads to a loss of information that the other scores retain. Indeed, in a recent study, the Ghanima score showed the lowest area under the curve for predicting right ventricular dysfunction, significantly lower than that of the other scores [12]. Further research is needed to assess whether these differences also lead to differences in the prediction of mortality as an outcome parameter in PE patients.
In a recent study, 30 patients with pulmonary embolism were investigated by human reading and by a computer-aided approach to calculate the obstruction scores by Mastora and Qanadli [21]. A mean time of 374.9 ± 150.2 s for calculating both scores was reported, which is well comparable with the present results. The authors reported that the computer-aided approach significantly reduced the time needed to calculate the scores. However, the study included only patients with a small embolus burden, with a mean Mastora score of 20.5, which is significantly lower than in the present study. Presumably, patients with segmental embolism in only one lobe are easier and faster to score than patients with bilateral and severe embolism.
Nevertheless, in the future, computer-aided approaches and even machine learning techniques will help to further score patients with PE [22].
Possible reasons for the interreader variability are filling defects within the vessels that are misdiagnosed as emboli. This could be especially important at the segmental and subsegmental level. Thus, scores that give a higher weighting to smaller vessels, most of all the score by Mastora et al., suffer from this limitation. Another reason can be vessel anomalies, which might lead to a diagnosis of lobar instead of segmental PE and in this way influence all scores, yet, again, especially the Mastora score.
A strength of the present study is that a CT scanner with 128 slices was used, which is a newer generation of CT than the scanner technology used in the studies that first validated the proposed scores. Presumably, the assessment of small emboli is improved by the higher image quality and better contrast of the pulmonary vessels.
The present study has some limitations. Firstly, the scores were only assessed by advanced residents. Yet, they are experienced in CT imaging and were trained in the scoring before the study CTs were evaluated. The external validity of the analysis can therefore be assumed to be sufficient. Secondly, there might be recognition bias after the first scoring of the patients. However, there was sufficient time between the reading sessions to reduce this possible bias.
In conclusion, there are significant differences in time consumption but no relevant differences in interreader variability between obstruction scores for pulmonary embolism. The Mastora score needs the most time, whereas the Ghanima score needs the least. However, future studies are needed to assess whether this simplified scoring method is equally good at predicting the individual clinical course.
Supporting information S1