A comparison of radiological descriptions of spinal cord compression with quantitative measures, and their role in non-specialist clinical management

Introduction Magnetic resonance imaging (MRI) is gold-standard for investigating Degenerative Cervical Myelopathy (DCM), a disabling disease triggered by compression of the spinal cord following degenerative changes of adjacent structures. Quantifiable compression correlates poorly with disease and language describing compression in radiological reports is un-standardised. Study design Retrospective chart review. Objectives 1) Identify terminology in radiological reporting of cord compression and elucidate relationships between language and quantitative measures 2) Evaluate language’s ability to distinguish myelopathic from asymptomatic compression 3) Explore correlations between quantitative or qualitative features and symptom severity 4) Investigate the influence of quantitative and qualitative measures on surgical referrals. Methods From all cervical spine MRIs conducted during one year at a tertiary centre (N = 1123), 166 patients had reported cord compression. For each spinal level deemed compressed by radiologists (N = 218), four quantitative measurements were calculated: ‘Maximum Canal Compromise (MCC); ‘Maximum Spinal Cord Compression’ (MSCC); ‘Spinal Canal Occupation Ratio’ (SCOR) and ‘Compression Ratio’ (CR). These were compared to associated radiological reporting terminology. Results 1) Terminology in radiological reports was varied. Objective measures of compromise correlated poorly with language. “Compressed” was used for more severe cord compromise as measured by MCC (p<0.001), MSCC (p<0.001), and CR (p = 0.002). 2) Greater compromise was seen in cords with a myelopathy diagnosis across MCC (p<0.001); MSCC (p = 0.002) and CR (p<0.001). “Compress” (p<0.001) and “Flatten” (p<0.001) were used more commonly for myelopathy-diagnosis levels. 3) Measurements of cord compromise (MCC: p = 0.304; MSCC: p = 0.217; SCOR: p = 0.503; CR: p = 0.256) and descriptive terms (p = 0.591) did not correlate with i-mJOA score. 4) The only variables affecting spinal surgery referral were increased MSCC (p = 0.001) and use of ‘Compressed’ (p = 0.045). Conclusions Radiological reporting in DCM is variable and language is not fully predictive of the degree of quantitative cord compression. Additionally, terminology may influence surgical referrals.


Introduction
Magnetic resonance imaging (MRI) is gold-standard for investigating Degenerative Cervical Myelopathy (DCM), a disabling disease triggered by compression of the spinal cord following degenerative changes of adjacent structures. Quantifiable compression correlates poorly with disease and language describing compression in radiological reports is unstandardised.

Study design
Retrospective chart review. Objectives 1) Identify terminology in radiological reporting of cord compression and elucidate relationships between language and quantitative measures 2) Evaluate language's ability to distinguish myelopathic from asymptomatic compression 3) Explore correlations between quantitative or qualitative features and symptom severity 4) Investigate the influence of quantitative and qualitative measures on surgical referrals. Introduction 'Degenerative Cervical Myelopathy'(DCM) refers to spinal cord disease triggered by degeneration of the cervical spine, including cervical spondylotic myelopathy, degenerative disc disease, ossification of the posterior longitudinal ligament and ossification of the ligamentum flavum [1]. DCM is the most common form of spinal cord dysfunction [2,3] with treatment limited to surgical decompression. Surgery can prevent further injury and improve neurological function and general health [4]. However, many patients retain life-long disabilities and reduced quality of life [4,5]. Prompt diagnosis and treatment are key to preserving function and ensuring good recovery [6], however diagnosis is frequently delayed [1,7]. As treatment is limited to surgical decompression, new international guidelines recommend that all patients with DCM are reviewed by specialists who can offer surgery [8]. Initial assessment and diagnosis, however, is typically carried out by other specialities. Diagnosis of DCM requires clinical signs and symptoms, confirmed with MRI (magnetic resonance imagining) examination. MRI [9] is the best imaging modality for assessing extent of cord compromise or injury [10] and typical features include visible cord compression, altered cord signal intensity, canal stenosis, altered sagittal spinal alignment and ligamentous changes [11].

Methods
Despite investigation, no standard MRI features consistently representing disease severity in DCM have been found [12], and whilst cord compression is considered a hallmark, its extent correlates poorly with severity. This may be due to dynamic injury mechanisms undetected by standard MRI protocols [13] or biological differences in responses to mechanical stress. Significant cord compression can be present in asymptomatic individuals [14,15].
Currently, various quantitative measurements of cord compromise have been described, including 'Transverse Area', 'Compression Ratio', 'Maximum Canal Compromise', 'Maximum Spinal Cord Compression' and 'Spinal Cord Occupation Ratio'. However, their chief usage is in research. Whilst such measurements provide objective and quantitative measures of cord compromise, in daily practice, non-specialist clinicians are primarily informed by the qualitative reports provided with MRIs. Whilst standardised nomenclature for MRI features such as disc pathology is well-established, no such guidance exists for radiologists reporting cervical spine imaging. Semi-quantitative and qualitative scales for cervical stenosis exist [16] but are not widely used. We hypothesised that language used by radiologists reporting cord compromise influences clinical management.
This study aimed to 1) identify terminology used in radiological reporting of spinal cord compromise 2) compare this with objective, quantitative measures of cord compromise 3) evaluate its ability to distinguish myelopathic from asymptomatic cord compromise and 4) investigate whether language influences referral to spinal surgeons.

Methods
This was a retrospective study examining data from all patients receiving a cervical MRI in one year at a tertiary NHS centre (N = 1123).
Author BH assessed radiological reports for each MRI for reference to spinal cord involvement. For each patient in whom compromise was identified (N = 166), the main descriptive term used to describe the cord at the affected level (e.g. flattened), as well as any qualifiers (e.g. mild) were recorded. In patients with multiple levels of cord involvement suggested, details for each were recorded and analysed separately, giving 218 unique cord levels.
Radiological reports were primarily authored by consultant neuroradiologists, but a minority (N = 9) were written by radiology trainees and endorsed by supervising consultants. In the single case where the consultant used a different term to the trainee, this was recorded in preference.
For each spinal level, four ratio measurements were calculated using raw measurements gathered from computer-based MRI records: 'Maximum Canal Compromise (MCC) [17]; 'Maximum Spinal Cord Compression' (MSCC) [17]; 'Spinal Cord Occupation Ratio '(SCOR) [11,18] and 'Compression Ratio' (CR) [19][20][21] (Table 1). Drawn from previous literature, these represent the selection felt to best reflect visible compromise and offer clinical Intra-and inter-observer ICCs reported previously as 0.76 ± 0.08 and 0.79 ± 0.09 for T2 images [23] SCOR Ratio of the sum of the cord width above and below, and the sum of the canal width above and below the point of compression Intra-and inter-observer ICCs reported previously as 0.82 ± 0.13 and 0.80 ± 0.05 on axial T2 images [23] significance [11]. Greater cord compromise is indicated by a larger MCC, MSCC or SCOR, or smaller CR. Reflecting work by Nouri et al. [22] we defined an SCOR value of �70% as diagnostic of congenital stenosis (or cord-canal mismatch). Ratio measurements were chosen to allow for better standardisation and comparison between patients than raw measurements of cord diameter. The accuracy of raw measurements (used to calculate ratio values) was verified by second researcher BD. Bland-Altman analysis showed acceptable agreement (SD = 0.24). Data summarising the treatment pathway of this cohort were extracted from the hospital database of patient records. Here, patients presenting acutely, with non-degenerative conditions (e.g. malignancy), or for whom insufficient documentation existed, were excluded, leaving 113 unique patients, and 148 spinal levels overall (Fig 1).
The mJOA is the most common assessment of DCM [24], however it relies upon clinicians completing specific assessments. Symptom severity was therefore assessed using clinical notes, and quantified with the i-mJOA, a modified form of the mJOA [25] developed and validated by our team [26].

Statistical methods
Statistical analysis was carried out using SPSS (v24, UK), with statistical significance at the level of p<0.05. Descriptive statistics were generated for all data sets and each was assessed for normal distribution as required, using Shapiro-Wilk and visual inspection.
To assess whether a relationship existed between the quantitative degree of cord compromise at each spinal level and the phrase describing that level in the associated radiological report, one-way ANOVA (and post-hoc Tukey's range test)-or in cases of unequal variances, Welch's test (and post-hoc Games-Howell analysis)-was used to compare the main descriptive terms against MCC, MSCC and CR. Only descriptive terms used ten or more times were included in analysis.
To compare characteristics of cords associated with a myelopathy diagnosis to those with no such diagnosis, it was assumed the diagnosis was applicable to all levels with compromise. Comparison of quantitative measures of compromise between myelopathy-diagnosis and non-myelopathic spinal cords was carried out using one-way ANOVA for MCC, MSCC and CR and chi-squared testing to compare myelopathy rates in levels with a SCOR of more or less than 70%. Use of main descriptive terms and qualifiers in myelopathic and non-myelopathic cords was evaluated using Chi-Squared testing, and post-hoc analysis with adjusted p-values.
Binary logistic regression models were used to consider the significance of quantitative measures of cord compromise and descriptive phrase on predicting whether spinal levels were referred to spinal surgeons.

Subject characteristics
Data from 113 patients was analysed. 46% of subjects were male, with a mean age of 55.1 years and 32% (N = 36) were at any time diagnosed with myelopathy. Patient characteristics are presented in Table 2.

What is the relationship of qualitative terms with quantitative measures?
A range of vocabulary was used to describe spinal cord levels. Within the cohort, 11 distinct descriptive terms and 11 qualifier terms were identified in radiological reports, though 52% of reports used no qualifiers. Details are summarised in Tables 3 and 4. Only descriptive terms used ten or more times ('Compress', 'Indent', 'Abut', 'Flatten' and 'Touch') were included in statistical analysis.> Maximum canal compromise (MCC). Comparing main descriptive term to the ratio measurements showed a significant difference in MCC between groups (Welch's test, (p<0.001)). MCC was higher in the group of MRIs where cords were described as 'Compressed' compared to 'Abutted' (19.9 ± 7.6, p<0.001), 'Indented' (22.3 ± 7.1, p<0.001) or 'Touched' (20. 0± 10.9 p<0.001). There were no other significant differences between groups (Fig 2).
Compression ratio (CR). Comparing main descriptive term and CR, showed a significant difference between groups (Welch's test, (p = 0.002)), with CR lower in the group of cords described as 'Compressed' compared to 'Abutted' (-0.11±0.06, p<0.001), indicating greater compromise in the former. There were no other significant differences between groups (Fig 2).
Overall, objective measures of cord compromise correlated poorly with radiological descriptions. The term "Compressed" seemed to be used in more severe cord compromise as measured by MCC, MSCC, and CR.
A wide combination of qualifier terms was used to modify descriptive terminology (Table 4), however the sample size was insufficient to allow analysis of the relationship between qualifiers and degree of cord compromise.

Do qualitative or quantitative features identify myelopathic spinal cord levels?
Within the sample, 48 spinal levels (32.3%) identified by radiologist were associated with a diagnosis of DCM. Within this sample, patients showing compression at more than one site were excluded from analysis, as it is not possible to pinpoint which level was responsible for their symptoms. This left 27 unique spinal levels associated with a diagnosis of DCM.   Comparing cord levels in patients considered myelopathic to those without the diagnosis found significantly greater compromise in myelopathy-diagnosis cords across three quantitative measurements (MCC: p <0.001; MSCC: p = 0.002 and CR: p<0.001) (Fig 3). Rates of myelopathy diagnosis also differed between cords with an SCOR greater than 70% and an SCOR less than 70% (χ 2 = 4.211, p = 0.040).

Do qualitative or qualitative features correlate with symptom severity?
Data was available from all 81 patients receiving a surgical consultation to calculate an i-mJOA score [26], an 18 point, semi-quantitative assessment scale of disease severity, based on the mJOA score [27]. Individuals with cord compromise but no evidence of myelopathy receive an i-mJOA score of 18. Pearson product-moment-correlation analysis showed no correlation between the four ratio measurements (MCC (r = -0.116, p = 0.304), MSCC (r = -0.139, p = 0.217), SCOR (r = -0.076, p = 0.503), CR (r = 0.128, p = 0.256)) and i-MJOA at the time of first appointment with a spinal surgeon.
Similarly, one-way ANOVA showed no relationship between descriptive term used and i-mJOA at the time of first appointment with a spinal surgeon (p = 0.591).
Overall, objective measurements of cord compromise do not appear to correlate with severity of DCM symptoms.

Do quantitative or qualitative features of DCM predict surgical consultation?
Within the cohort, 81 patients (72%) with 107 spinal levels (73%) were assessed by spinal surgeons. Of these patients, 44 (54%) received a diagnosis of myelopathy.
Two of the four quantitative measures of cord compromise were significantly different in those spinal levels referred to surgeons compared to those which were not. MCC (p = 0.009) and MSCC (p<0.001) showed greater compromise in patients referred for surgical review, however neither SCOR (p = 0.220) nor CR (p = 0.063) varied. A binary logistic regression model considering MCC, MSCC, CR and SCOR of greater than 70% found a model (χ2 = 18.46, p = 0.001) correctly classifying 76.4% of cases, in which the only variable significantly affecting referral was increased MSCC (p = 0.001).

Discussion
Our study showed that many different terms are used in radiological reporting of spinal cord involvement, however their quantitative features overlap greatly. Whilst some relationships were identified-for example, the qualitative term 'Compressed' was associated with greater quantifiable compromise-this was inconsistent. Moreover, neither qualitative or quantitative measures of cord involvement correlated with clinical symptoms, despite compression acting as a determinant of referral to spinal surgeons.
These preliminary results raise several key issues: firstly, that little relationship was seen between quantitative and qualitative features of spinal MRIs; secondly, the apparent ability of radiological reporting to influence clinicians, and finally, the requirement for MRI to diagnose DCM despite its inability to specify or stage disease.
Debate over the value of qualitative versus quantitative descriptors in radiology is longstanding and has been considered across various conditions [28][29][30]. Typically, such work compares the relative benefit of quantitative measurements to qualitative grading systems with set descriptive criteria and a proven inter/intra-rater reliability. In DCM, however, description of cord compromise uses terms chosen at a radiologist's discretion, with no clear guidance as to their individual meaning. This is particularly important when we know radiological reports are influential in clinical decision-making. Surveys suggests that the majority of clinicians rely on radiological reports to guide their practice, and believe that radiologists are equally or better able to interpret imaging than themselves [31]. Additionally, evidence suggests that radiologists' choice of language may be confusing to other clinicians and affect clinical practice. For example, work has suggested that use of the word 'infiltrate' in reports produces a range of non-overlapping interpretations by clinicians regarding possible underlying pathology and diagnoses [32]. Equally, reporting of chest radiographs has been shown to affect both diagnosis and management in childhood respiratory disease [32][33][34]. We also know that there is mismatch between how radiologists and requesting clinicians would like reports to be made [31]. Collectively, this literature suggests that radiologists' choice of language may have unintended effects on patient care. This is consistent with our findings suggesting that language choice may influence non-expert clinician's decision whether to refer patients with DCM.
This becomes particularly significant given the delays patients face in diagnosis: typically over 2 years [7], delaying treatment. The chance of full recovery is greatest if surgery is offered within 6 months of symptom onset [35] and cervical MRIs are key in the diagnostic pathway [36]. Additionally, even patients with demonstrable cord compression (but no myelopathy) risk developing DCM, and may sometimes opt to undergo surgery [37]. Therefore, it is recommended that all patients with MRI features and symptoms of DCM are assessed by spinal surgeons [38].
The route to a diagnosis of myelopathy involves multiple different specialities [39] with a shared key diagnostic stage in MR imaging. MRI is therefore an attractive target for strategies to improve patient care. The question remains however, how to improve interpretation by non-specialist clinicians.
Quantitative measurements are attractive in their objectivity, however, as demonstrated, their relationship to myelopathy is inconsistent: some quantitative MRI measures of compromise may relate to clinical symptoms of DCM [40] but, overall, degree of radiological compromise correlates poorly with disease severity. Patients with cord compression may not suffer from myelopathy [41] and some patients suffer myelopathy without visualised compression due to dynamic injury [13].
There are of course extensive advancements in MR imaging [42] and techniques such as fractional anisotropy have the potential to detect microstructural changes [43] which better correlate with severity [44], and may also detect subclinical spinal cord injury [45]. However, these techniques are far removed from current practice.
Consequently, isolated MR imaging cannot currently replace clinical assessment and it is notable that interpretation of MRI reports by non-expert clinicians may contribute to false reassurances and variable care. To prevent confusion for non-expert clinicians, descriptive terminology could be removed from reporting and replaced by statements of consistency (or non-consistency) with DCM but further investigation is needed to confirm the value of such an approach. There are, of course, limitations to the conclusions of this study-data was drawn retrospectively from a single centre, and patterns of language likely differ across individuals, centres and countries. Furthermore, we focussed on the quantifiable degree of cord compromise visible on MRI without considering other features which could influence radiologists' choice of language (for example signal hyperintensity [46]) or other clinical factors contributing to seeking a spinal surgery consult. However, while choices of wording may differ, the subjectivity implicit with the use of qualitative descriptions identified here is likely to be present elsewhere. Moreover, the relationship with myelopathy diagnosis and severity is in-keeping with the wider medical literature [1].

Conclusions
This is the first study considering radiological reporting and routine patient care in DCM. Many different terms are used to describe spinal cord involvement and choice of word does not consistently represent quantitative compromise. Moreover, objective assessments of spinal cord compromise do not correlate with severity of myelopathy symptoms. However, choice of qualitative description may influence decisions regarding referral to spinal surgery and as all symptomatic patients are now recommended to receive assessment by a spinal surgeon [8] it is possible that ambiguities in language are affecting patients' treatment and quality of life. Any evidence of spinal cord compression with myelopathy symptoms should therefore be considered significant.
Supporting information S1