Monitoring for myelopathic progression with multiparametric quantitative MRI

Background Patients with mild degenerative cervical myelopathy (DCM) are often managed non-operatively, and surgery is recommended if neurological progression occurs. However, detection of progression is often subjective. Quantitative MRI (qMRI) directly measures spinal cord (SC) tissue changes, detecting axonal injury, demyelination, and atrophy. This longitudinal study compared multiparametric qMRI with clinical measures of progression in non-operative DCM patients. Methods 26 DCM patients were followed. Clinical data included modified Japanese Orthopedic Association (mJOA) and additional assessments. 3T qMRI data included cross sectional area, diffusion fractional anisotropy, magnetization transfer ratio, and T2*-weighted white/grey matter signal ratio, extracted from the compressed SC and above/below. Progression was defined as 1) patients’ subjective impression, 2) 2-point mJOA decrease, 3) ≥3 clinical measures worsening ≥5%, 4) increased compression on MRI, or 5) ≥1 of 10 qMRI measures or composite score worsening (p < 0.004, corrected). Results Follow-up (13.5 ± 4.9 months) included mJOA in all 26 patients, MRI in 25, and clinical/qMRI in 22. 42.3% reported subjective worsening, compared with mJOA (11.5%), MRI (20%), comprehensive assessments (54.6%), and qMRI (68.2%). Relative to subjective worsening, qMRI showed 100% sensitivity and 53.3% specificity compared with comprehensive assessments (75%, 60%), mJOA (27.3%, 100%), and MRI (18.2%, 81.3%). A decision-making algorithm incorporating qMRI identified progression and recommended surgery for 11 subjects (42.3%). Conclusions Quantitative MRI shows high sensitivity to detect myelopathic progression. Our results suggest that neuroplasticity and behavioural adaptation may mask progressive SC tissue injury. qMRI appears to be a useful method to confirm subtle myelopathic progression in individual patients, representing an advance toward clinical translation of qMRI.


Introduction
Degenerative cervical myelopathy (DCM) is among the most common causes of spinal cord (SC) dysfunction, involving age-related degeneration of the discs, ligaments, and vertebrae leading to extrinsic compression and dynamic injury. [1,2] Low quality evidence suggests that 20%-62% of DCM subjects will deteriorate over 3-7 years. [3][4][5] Non-operative treatments such as cervical collars and physiotherapy are sometimes employed, but no evidence exists to support their benefit. [5] Decompressive surgery not only halts neurological deterioration, it improves outcomes and is the recommended treatment for moderate/severe DCM in recent clinical practice guidelines (CPGs). [6,7] However, optimal management of mild DCM is controversial; surgery is a treatment option, but many patients are managed non-operatively and monitored periodically, in which case surgery is recommended if neurological deterioration occurs. [7] Subtle progression can be difficult to identify, relying on the patient's perception of symptoms and subjective findings of the neurological examination, while the utility of electrophysiology studies in this context has not been established. [8,9] The majority of natural history studies have used Japanese Orthopedic Association (JOA) [10][11][12][13][14] or modified JOA (mJOA) [15,16] scores as the primary outcome measure, but these have limited sensitivity and granularity, and poorly characterized reliability. [17,18] Furthermore, these studies have used variable definitions of deterioration, including a lack of improvement (in moderate-severe subgroups), [10,11] a 1-point decline, [15,16] or conversion from the mild to moderate category of JOA. [12,13] More detailed myelopathy assessments are available, but their psychometric properties and evidence supporting their use are similarly lacking. [19] As a result, practice patterns are variable and frequently rely on subjective factors, including the rudimentary question: "are your symptoms better, the same, or worse?" An array of MRI techniques have emerged that measure aspects of SC microstructure and tissue injury. [20] Cross-sectional area (CSA) measures the degree of SC compression in DCM, and atrophy in uncompressed regions. The diffusion tensor imaging (DTI) metric fractional anisotropy (FA) reflects axonal injury and demyelination. Magnetization transfer ratio (MTR) is a more specific measure of myelin quantity. T2 Ã -weighted imaging (T2 Ã WI) shows strong contrast between white and grey matter, and the white matter to grey matter signal intensity ratio (T2 Ã WI WM/GM) reflects demyelination, gliosis, calcium, and iron changes. [21,22] We developed a clinically feasible multiparametric quantitative MRI (qMRI) protocol that collects these data across the cervical SC, producing 10 measures of tissue injury that correlate with myelopathic impairment in DCM. [21,23] In the current study, we compare several methods of detecting myelopathic progression, including 1) patients' subjective impression of worsening, 2) mJOA, 3) comprehensive clinical assessments, 4) anatomical MRI, and 5) multiparametric qMRI. We hypothesize that qMRI will show a higher rate of progression than other measures due to the effects of neuroplasticity and behavioral adaption, which we suspect compensate for progressive tissue injury. Finally, we develop a practical framework for monitoring DCM patients and describe its initial implementation.

Study design and subjects
This prospective longitudinal study received institutional approval from the University Health Network (Toronto, Ontario, Canada) and all participants provided written informed consent. A total of 58 patients were consecutively enrolled between October 2014 and August 2016 that showed one or more symptoms of cervical myelopathy (including sensory dysfunction, hand clumsiness, gait dysfunction, bladder dysfunction, or perceived weakness), one or more signs (including sensory deficit, motor deficit, hand incoordination, gait ataxia, or hyperreflexia), and imaging evidence of spinal cord compression from degenerative causes (disc, ligament, or bone). Among this cohort, 26 patients were initially managed non-operatively based on shared decision-making between the attending surgeon and patient, and this subgroup comprised the population of interest in this study. Factors that led to non-operative management were typically very mild symptoms and/or the patient's preference to be managed non-operatively. These 26 individuals were reassessed approximately 12 months later, depending on subject availability.

Clinical assessments
A battery of clinical assessments was administered by a clinician-scientist (ARM, 6 years experience; SKR, 10 years experience; Table 1). To reduce inter-observer variability, scripts and agreed-upon criteria to interpret answers were used. This included a modified version of the mJOA (Table 2) to simplify language and allow substitute findings, such as worsened handwriting for mild upper extremity motor impairment. The percent change in clinical measures was calculated using the maximum score as the denominator for finite scales (e.g. 18 for mJOA) or the baseline score for infinite scales (e.g. grip strength).

MRI acquisitions
All imaging was performed on the same clinical scanner (3T GE). T2-weighted imaging (T2WI) utilized sagittal FIESTA-C with 0.8 mm 3 isotropic resolution. DTI used spin echo single shot EPI (ssEPI) with 80x80 mm 2 field of view, anterior/posterior saturation bands, second order localized shimming with a box volume of interest, no cardiac triggering, and 1.25x1.25x5 mm 3 resolution. MT was performed using gradient echo with and without MT prepulse with 1x1x5 mm 3 resolution. T2 Ã WI images used a multi-echo recombined gradient echo (MERGE) sequence with 3 echoes and 0.6x0.6x4 mm 3 resolution. DTI, MT, and T2 Ã WI images had 13 axial slices positioned perpendicular to the spinal cord (at C3), covering C1 to C7 using a variable gap, alternating between mid-vertebral body and intervertebral disc. Imaging time was 30-35 minutes in total, and further details of each sequence are as previously reported. [23] Image analysis Images were reviewed by 2 raters (ARM, 6 years experience; AN, 6 years experience) and excluded if they showed motion or other artifacts (e.g. aliasing), along with corresponding images from the comparison examination. T2WI and T2 Ã WI images were reviewed to identify T2WI hyperintensity and record levels with extrinsic SC compression, defined as indentation, flattening, torsion, or circumferential compression. The maximally compressed level (MCL) was subjectively determined, with discrepancies resolved by consensus. When the MCL changed between baseline and follow-up, the new level was used for comparisons. Quantitative image analysis was performed with the Spinal Cord Toolbox (SCT) v3.0 by ARM (3 years experience). [24] Automatic SC segmentation was performed, and segmentation masks were reviewed and manually corrected if necessary (Fig 1). Segmentation editing was blinded by anonymizing and randomizing baseline and follow-up scans. CSA was calculated from the T2 Ã WI segmentation (or T2WI segmentation if T2 Ã WI was excluded). Registration to the SCT template was performed for each dataset and FA, MTR, and T2 Ã WI WM/GM were extracted from WM in each slice with correction for partial volume effects using the maximum a posteriori method. [25] Metrics were age-corrected based on linear regression in 40 healthy subjects (CSA: β = -0.0867 mm 2 /year, FA: β = -0.00121/year, MTR: β = -0.0815%/year, T2 Ã WI WM/GM: β = 0.000740/year). [23] Corrected metrics were averaged across rostral (C1-C3) and caudal (C6-C7) levels, excluding compressed slices, and at MCL using a single slice for CSA or 3 slices for FA, MTR, and T2 Ã WI WM/GM. [21] This approach produces 12 metrics, of which 10 previously demonstrated significant clinical correlations in DCM, [21] leading to exclusion of caudal CSA and MTR from this study.

Myelopathic progression
Patients were asked if their neurological symptoms were better, the same, maybe worse (defining borderline progression), or worse (defining progression) than at the initial assessment. For mJOA, progression was defined as a decrease of ! 2 points and borderline progression as a 1-point decline. [18] For comprehensive clinical assessments, progression was defined as ! 3 measures worsening by ! 5%, and borderline progression as 1-2 measures worsening. qMRI progression was defined by statistical tests described in the statistical analysis section. Patients' subjective impression was used as the clinical case definition of myelopathic progression.

Statistical analysis
Overall α was set to 0.05. Continuous data were summarized by mean ± standard deviation (SD). Group deterioration at follow-up was analyzed with single-tailed paired t tests. 95% confidence intervals (CIs) of proportions were calculated using the Wilson method (continuity corrected). qMRI progression was tested against the null hypothesis that changes were due to measurement error (assuming normal distribution and SD = p 2 Ã standard error of measurement, SEM), using z scores. [26] SEM of FA, MTR, and T2 Ã WI WM/GM were derived from our previous reliability study, and SEM of CSA was calculated using T2 Ã WI data from 5 healthy and 11 DCM subjects. [23] For rostral and caudal measures, pooled estimates of SEM were derived from healthy and DCM subjects, whereas MCL SEM values were derived from DCM subjects only (Table 3). Z scores were also averaged to yield an unweighted composite score (null hypothesis: t distribution, 10 degrees of freedom, d.f.s, standard error = 1/ p 10). qMRI progression was defined as z score < -2.65 for any single metric or composite score: t 10 < -3.30 (p = 0.004, single-tailed, Bonferroni corrected). Sensitivity, specificity, and Youden's Index (YI) were calculated for each measure of myelopathic progression relative to the clinical case definition (based on available follow-up events for each measure). Pairwise Fisher exact tests were used to compare the sensitivity, specificity, and concordance with the clinical case definition between measures of myelopathic progression.

Subjects
The cohort was aged 57.6 ± 9.1 years, included 15 men and 11 women, and baseline mJOA score was 15.7 ± 1.3 (21 mild, 5 moderate severity). Follow-up data included subjective impression and mJOA score for all 26 subjects (100%), anatomical MRI for 25 subjects (96.2%), and comprehensive clinical and qMRI data for 22 subjects (84.6%) ( Table 4). One subject had two complete follow-up assessments due to interim subjective deterioration. Among four subjects without complete follow-up, three (11.5%) experienced rapid progression (subjectively worse, mJOA declined ! 2 points) requiring urgent surgery and the remaining subject reported stable symptoms and mJOA but declined follow-up.

Quantitative MRI
All DTI and MT datasets were of acceptable quality, but two T2 Ã WI datasets were degraded by motion artifact and excluded. Individual slices were excluded 24/585 DTI, 17/585 MT, and 11/ 533 T2 Ã WI images. Analysis was successful for all remaining data, including accurate registration to the SCT atlas.
At the group level, all age-corrected qMRI metrics deviated pathologically at follow-up, including significant changes in five measures (CSA MCL , FA Rostral , FA MCL , T2 Ã WI WM/ GM Caudal , and MTR MCL ) and trends in three (CSA Rostral , FA Caudal , T2 Ã WI WM/GM Rostral (Table 3). Composite score showed the strongest group change (p = 0.00004).

Comparison of approaches to monitor myelopathic progression
qMRI showed greater sensitivity to detect myelopathic progression compared with mJOA (p = 0.003) and anatomical MRI (p<0.001), but not compared with comprehensive clinical assessments (p = 0.47). mJOA had higher specificity for progression than qMRI (p = 0.002) and comprehensive clinical assessments (p = 0.007), but not anatomical MRI (p = 0.10). qMRI showed a trend toward higher concordance with the gold standard (subjective worsening) compared with anatomical MRI (p = 0.08), whereas differences with mJOA (p = 0.55) and comprehensive clinical assessments (p = 0.75) were non-significant.

Clinical implementation
Based on the results, the authors developed a practical definition of myelopathic progression: subjective progression of neurological symptoms and any objective sign of progression, with the latter including mJOA, comprehensive clinical assessments, anatomical MRI, or qMRI. (Fig 4). Possible myelopathic progression was defined as either subjective or objective  worsening. Using these definitions, 11 subjects had progression at latest follow-up (42.3%, 95% CI: 24.0%-62.8%), seven (30.8%) had possible progression, and eight were stable (including three with clinical deterioration that was attributed to another cause). Fifteen subjects were invited for reassessment in clinic, with the decision-making algorithm being used to help guide surgical recommendations, in addition to patient-specific factors such as preferences and goals. The remaining subjects were educated about myelopathy symptoms and encouraged to contact their surgeon if subjective progression occurred. To date, seven patients have been reassessed in clinic and two are planned for operative treatment, two have pending visits, and six declined, stating they are comfortable monitoring their symptoms.

Discussion
In this study, myelopathic progression was more frequently detected with multiparametric qMRI than any other method. Furthermore, qMRI progression was highly congruent with subjective progression, indicating that the macro-and microstructural changes captured by qMRI are clinically meaningful. qMRI demonstrated greater sensitivity for progression than anatomical MRI and mJOA, but it should be noted that it was less specific than these measures. This is possibly explained by the concept of neuroplasticity and/or behavioural adaptation, as qMRI may pick up subtle progressive damage that an individual can compensate for (discussed below). qMRI also showed the highest concordance with subjective progression, although the differences with other measures were non-significant and no conclusions can be drawn about superiority. At present, a "gold standard" method of determining neurological progression in patients with DCM does not exist, and thus the ground truth is unknown. As a result, clinical practice varies widely, and many surgeons rely on patients' subjective impression of neurological deterioration as a trigger to recommend surgery. Detailed clinical assessments showed moderate sensitivity and specificity to detect deterioration, while anatomical MRI was insensitive but fairly specific. Overall, we felt that all of the methods of determining myelopathic progression had potential clinical utility, and we incorporated them into a decision-making algorithm that is consistent with recent CPGs that recommend surgery when myelopathic progression occurs. [7] qMRI results were sufficiently convincing to include in this algorithm, providing confirmatory evidence of deterioration. However, caution is warranted before qMRI results take a more central role in decision-making, as this study had a relatively small sample size. In fact, our proposed algorithm is arguably more conservative than the decision-making process currently used by many surgeons, as stable qMRI results helped to avoid surgery in 2 patients that had concerning findings at follow-up that were difficult to interpret (fluctuating symptoms and confounding physical ailments). Ultimately, the algorithm provides only general guidance and the final decision regarding surgery occurs in a standard clinic visit that incorporates patient preference and other factors, including a fulsome discussion to balance risks and benefits and select the optimal treatment. The initial implementation of this algorithm has led to surgical treatment in two patients; both showed only 1-point decreases in mJOA and minimal neurological worsening, which some surgeons would manage conservatively, but qMRI helped to confirm progression. This study represents, to the authors' knowledge, the first instance in which qMRI measurement of SC integrity has been used to inform decision-making in individual patients, constituting an important step toward clinical translation. Longitudinal monitoring for progression is an attractive first use of qMRI because it circumvents the normal inter-subject variability of these data, which limit qMRI's utility for diagnosis and prognostication. [23,27] However, further data, including long-term clinical outcomes, are needed to fully characterize the utility of this approach. Eight qMRI metrics demonstrated significant deterioration in either group or individual analyses, with the greatest individual and group differences observed using the composite score. The composite averages data from the compressed cord and uncompressed regions above and below, which we feel strikes a good balance between greater sensitivity to pathology (using measurements at the level of compression), [21] and unbiased values from undistorted regions. [28] Calculation of an unweighted composite score is a naïve approach that could be further strengthened by using weightings (e.g. logistic regression), but this was not performed to avoid overfitting given our small sample. Other groups have also developed multiparametric protocols, [29][30][31] and our data suggest that this type of approach can overcome the limitations of single qMRI techniques, such as modest reliability. Two potential outliers (improvements of z > 2.65) were observed, close to the expected value of 1.1, validating our statistical approach. These changes may represent tissue regeneration (e.g. remyelination), or alternatively these and some qMRI decreases could be spurious due to sampling error, artifacts, analysis errors, or inaccurate estimation of SEM.
Our results suggest that DCM is less benign than previously thought. [4] mJOA showed a rate of progression of 3.0%-31.3%, consistent with previous reports (adjusting for follow-up duration). [10][11][12][13][14][15][16] In contrast, progression with our clinical battery was 32.7%-74.9%, in spite of missing follow-up data in three subjects that had rapid deterioration. The higher rate of progression using more comprehensive methods was expected, as our clinical instruments were selected to detect subtle myelopathic changes. [19] Quantitative MRI showed even higher frequency of progression (40.8%-82.0%). These results cast doubt that the natural history of myelopathy has been accurately characterized, and larger prospective studies are needed with clear definitions of progression and comprehensive assessments. If the natural history is in fact as aggressive as our estimates suggest, it may be advisable to recommend surgery for patients with mild DCM. However, further research is needed to determine the impact of subtle progression on 1) quality of life and 2) the risk of more substantial deterioration before CPGs are modified for mild DCM. Long-term monitoring of non-operative subjects will also reveal if isolated qMRI progression is a precursor to physical deterioration.
Myelopathy can present variably, and the design of valid, reliable, and responsive instruments is challenging. Unfortunately, assessments that rely on patients' perception of progression are prone to recall bias and numerous other factors such as personality and anxiety. One subject clearly had worsened gait and hand dexterity but reported feeling "the same", highlighting that patients are often unaware or resistant to acknowledge neurological deterioration. Therefore, it is important to establish accurate and reliable clinical methods of confirming myelopathic progression to advise surgical decision-making. The mJOA is easy to administer and provides a useful summary measure, but lacks sensitivity to detect subtle changes. [6] Furthermore, one-point changes in mJOA are probably not trustworthy, based on one small reliability study. [18] Two-point mJOA changes were specific but not sensitive for progression, and thus, we feel that mJOA is not adequate as a standalone measure for detecting progression. Comprehensive clinical assessments were far more sensitive but less specific, primarily due to confounding physical ailments that commonly affect older individuals. The neurological impairments in cervical myelopathy include gait imbalance, hand incoordination, sensory dysfunction, weakness (e.g. hand intrinsics), and bladder dysfunction, which are all captured in our comprehensive clinical assessments. Grip strength was the most sensitive measure of progression, which has high inter-subject variability but excellent within-subject reliability, making it ideal for longitudinal monitoring. [32] Decreases in hand dexterity were also often encountered, which involved judging subjects' precision, grasp, and speed of tightening metallic nuts on screws. [33] QuickDASH, a questionnaire of upper limb function, [34] frequently showed progression, but it is not specific to myelopathic impairment. Gait impairment in DCM primarily involves imbalance, which is difficult to measure, and quantitative analysis with GAITRite may offer greater sensitivity than the 30-meter walk test. [35] However, quantitative gait analysis produces dozens of parameters, and further investigation is needed to determine if gait stability ratio is the optimal measure. Quantitative standardized clinical assessments are needed to enable precise quantification of myelopathic impairment (i.e. "personalized medicine"), which will allow more informed treatment decisions and greater standardization of care. However, direct measurement of spinal cord integrity with qMRI is also appealing because it avoids the challenges of clinical measurement, which assess injury to the SC indirectly. Further investigation of multimodal electrophysiology approaches, including motor and sensory evoked potentials and contact head evoked potentials (CHEPs), may also prove useful. [36] Moving forward, we envision that DCM management will evolve such that it is driven by a combination of quantitative clinical, electrophysiological, and qMRI assessments.
qMRI showed a higher rate of progression than clinical measures, suggesting that homeostatic mechanisms act to preserve normal function in the context of progressive tissue injury. Physical assessments (strength, dexterity) showed higher rates of progression than selfreported functional measures (mJOA, QuickDASH), which may be related to behavioural adaption, recall bias, and psychological denial of symptom progression. DCM patients typically alter their grasp and gait, often unconsciously, to maintain function despite incoordination and hyperactive reflexes. Furthermore, deterioration of low-level physical functions (e.g. grip strength) occurred more often than higher-level functions (gait, dexterity) that involve more complex neurological systems, potentially due to neuroplasticity. [37,38] Complex neural circuits show more plasticity than simple circuits, such as spinal reflexes, due to the number of neurons and synapses involved. [37] However, our data are only suggestive of this concept; histopathological studies that correlate qMRI measurements with actual tissue changes are needed to fully elucidate these mechanisms. However, other qMRI techniques such as functional MRI have provided similar evidence of neuroplasticity in spinal cord injury (SCI) and may yield further insights as they become more refined. [38] This study was subject to several limitations including a relatively small sample size and a lack of electrophysiology data, and larger multi-center confirmatory studies are planned to validate our results and more accurately characterize test-retest reliability, relationships with age, and the natural history of DCM. The cohort of 26 patients may be subject to selection bias, as the decision to manage patients non-operatively is subjective and varies between surgeons. The effect of age on qMRI metrics was derived from a cross-sectional study of 40 healthy subjects, but a longitudinal study in this population may be beneficial to fully elucidate the effects of age. Four subjects (15%) did not have complete follow-up data available, but were included in certain analyses to determine the rate of myelopathic progression. This may have resulted in underestimates of the rate of progression and sensitivity of qMRI and comprehensive clinical assessments because three of these subjects had severe worsening and would likely have shown worsening on all measures (extrapolating from our other results). The accuracy of CSA measurement could likely be improved with high-resolution T2WI using a different sequence that is less affected by motion. DTI with cardiac triggering may slightly improve reliability, based on previous data. [23] We assumed that qMRI measurement errors were normally distributed, but this is potentially incorrect. The definitions of myelopathic progression on mJOA (2-points) and comprehensive clinical assessments (3 measures by 5%) were arbitrary, and greater research is needed to optimize clinical assessments for myelopathic deterioration. The methods used in this study require considerable resources (MRI, clinical tools, expertise) that may not be feasible to implement in some clinical settings, highlighting the importance of developing and validating simple clinical tools. Finally, our decision-making algorithm is an initial attempt at rational use of these novel assessments, but this is expected to evolve as greater experience is obtained, while taking into account additional patient-specific factors.

Conclusions
Multiparametric qMRI appears to detect subtle myelopathic progression with high sensitivity in individual DCM patients, while correlating well with patients' perceptions. This novel approach may be a useful adjunctive method to confirm neurological worsening, warranting further study with long-term follow-up. The natural history of DCM appears to be more progressive than previously thought, perhaps because neuroplasticity and behavioral adaption act to mask progressive tissue injury. Our pilot implementation of qMRI into a decision-making algorithm represents one of the first clinical uses of SC qMRI to inform management of individual patients.