Skip to main content
Advertisement
Browse Subject Areas
?

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Improved inter-subject alignment of the lumbosacral cord for group-level in vivo gray and white matter assessments: A scan-rescan MRI study at 3T

  • Silvan Büeler,

    Roles Conceptualization, Formal analysis, Investigation, Project administration, Visualization, Writing – original draft

    Affiliation Department of Neuro-Urology, Balgrist University Hospital, University of Zürich, Zürich, Switzerland

  • Patrick Freund,

    Roles Conceptualization, Supervision, Writing – review & editing

    Affiliations Spinal Cord Injury Center, Balgrist University Hospital, University of Zürich, Zürich, Switzerland, Department of Neurophysics, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany, UCL Queen Square Institute of Neurology, Wellcome Trust Centre for Human Neuroimaging, University College London, London, United Kingdom

  • Thomas M. Kessler,

    Roles Conceptualization, Supervision, Writing – review & editing

    Affiliation Department of Neuro-Urology, Balgrist University Hospital, University of Zürich, Zürich, Switzerland

  • Martina D. Liechti,

    Roles Conceptualization, Investigation, Supervision, Writing – review & editing

    Affiliation Department of Neuro-Urology, Balgrist University Hospital, University of Zürich, Zürich, Switzerland

  • Gergely David

    Roles Conceptualization, Formal analysis, Investigation, Software, Supervision, Visualization, Writing – original draft

    gergely.david@balgrist.ch

    Affiliations Department of Neuro-Urology, Balgrist University Hospital, University of Zürich, Zürich, Switzerland, Spinal Cord Injury Center, Balgrist University Hospital, University of Zürich, Zürich, Switzerland

Abstract

Introduction

Magnetic resonance imaging (MRI) enables the investigation of pathological changes in gray and white matter at the lumbosacral enlargement (LSE) and conus medullaris (CM). However, conducting group-level analyses of MRI metrics in the lumbosacral spinal cord is challenging due to variability in CM length, lack of established image-based landmarks, and unknown scan-rescan reliability. This study aimed to improve inter-subject alignment of the lumbosacral cord to facilitate group-level analyses of MRI metrics. Additionally, we evaluated the scan-rescan reliability of MRI-based cross-sectional area (CSA) measurements and diffusion tensor imaging (DTI) metrics.

Methods

Fifteen participants (10 healthy volunteers and 5 patients with spinal cord injury) underwent axial T2*-weighted and diffusion MRI at 3T. We assessed the reliability of spinal cord and gray matter-based landmarks for inter-subject alignment of the lumbosacral cord, the inter-subject variability of MRI metrics before and after adjusting for the CM length, the intra- and inter-rater reliability of CSA measurements, and the scan-rescan reliability of CSA measurements and DTI metrics.

Results

The slice with the largest gray matter CSA as an LSE landmark exhibited the highest reliability, both within and across raters. Adjusting for the CM length greatly reduced the inter-subject variability of MRI metrics. The intra-rater, inter-rater, and scan-rescan reliability of MRI metrics were the highest at and around the LSE (scan-rescan coefficient of variation <3% for CSA measurements and <7% for DTI metrics within the white matter) and decreased considerably caudal to it.

Conclusions

To facilitate group-level analyses, we recommend using the slice with the largest gray matter CSA as a reliable LSE landmark, along with an adjustment for the CM length. We also stress the significance of the anatomical location within the lumbosacral cord in relation to the reliability of MRI metrics. The scan-rescan reliability values serve as valuable guides for power and sample size calculations in future longitudinal studies.

1. Introduction

The lumbosacral spinal cord (SC) contains nuclei that innervate the lower limbs and pelvic organs. Pathological changes in the gray matter (GM) or white matter (WM) of the lumbosacral cord can lead to various dysfunctions such as motor and sensory impairments in the lower limbs, as well as lower urinary tract, sexual and bowel dysfunction [13].

Pathological changes in the SC can be investigated in vivo by utilizing magnetic resonance imaging (MRI) [4,5]. Cross-sectional area (CSA) measurements of SC, GM, and WM, derived from multi-echo gradient-echo sequences, have been utilized as indirect measures of atrophy in the cervical cord and lumbosacral enlargement (LSE) [612]. Furthermore, advanced MRI techniques, such as diffusion MRI, have provided complementary information on WM integrity [69,11,1316]. In contrast, there have been only a few studies investigating the conus medullaris (CM) extending caudally from the LSE [17,18]. Imaging the entire lumbosacral cord is of particular interest, as tissue damage may also occur [19] or even originate within the CM [20].

In the lumbosacral cord, SC level-specific and group-level analyses are challenged by the mismatch between vertebral and neurological levels [21], which precludes the use of vertebral levels as neuroanatomical landmarks. Given the difficulty of identifying neurological levels in vivo in the lumbosacral cord [22], previous studies have suggested the use of the slice with the largest SC CSA, here referred to as the "LSE landmark", as an image-based neuroanatomical landmark [18,23]. However, the intra- and inter-rater reliability of image-based landmarks have not been reported. Additionally, slice-wise group comparisons of MRI metrics are increasingly challenging toward the tip of the spinal cord due to the high inter-subject variability in the CM length [18].

Automatic SC and GM segmentation techniques have been shown to perform well for the cervical cord [24] but have not yet been optimized for the lumbosacral cord. This is primarily attributed to the smaller size and lower signal-to-noise ratio, as well as the close proximity of nerve roots, which can compromise SC segmentation. Consequently, manual segmentation is still the standard segmentation technique for the lumbosacral cord. For healthy volunteers, the feasibility of manual GM and WM segmentation has been demonstrated within the LSE [23] and CM [18], while the feasibility of diffusion tensor imaging (DTI) has been shown within the LSE [25]. However, none of these studies reported scan-rescan values for the CM.

The aim of this study is twofold. First, to facilitate group-level analyses, we aimed to improve the inter-subject alignment of the lumbosacral cord by comparing GM- and SC-based landmark definition methods and proposing a method to adjust for the individual CM length. Second, we computed intra-rater, inter-rater, and scan-rescan reliability of the SC, GM, and WM CSA, as well as scan-rescan reliability of the DTI metrics, within both LSE and CM, and both in healthy volunteers and patients. Here, our focus was on measuring the reliability of MRI metrics in a patient cohort where the lumbosacral cord is not directly affected. For this reason, we included patients with a spinal cord injury (SCI) in the cervical or thoracic cord, who represent a challenging imaging cohort as they often experience higher levels of involuntary motion (e.g., due to spasticity).

2. Materials and methods

2.1 Study participants

Fifteen participants including 10 healthy volunteers (4 females, 6 males, age (mean ± standard deviation (SD)): 32.9 ± 10.8 years) and 5 SCI patients with cervical or thoracic injuries, but without injury-related radiological structural abnormalities in the lumbosacral cord (1 female, 4 males, age: 46.1 ± 20.3 years, time since injury: 5.9 ± 0.1 months) participated in this study. SCI patients had neurological impairments with mixed aetiologies, injury levels, and severities (see S1 Table for demographic and clinical information). The patient cohort was part of a randomized controlled trial investigating the effect of transcutaneous tibial nerve stimulation on the emergence of neurogenic lower urinary tract dysfunction following SCI (TASCI, ClinicalTrials.gov identifier: NCT03965299) [26]. The main inclusion criteria for all participants were: (i) no MRI contraindications, (ii) no pre-existing neurological or mental disorders, and (iii) > 18 years of age. For a comprehensive list of inclusion and exclusion criteria specific to SCI patients, refer to [26]. Healthy volunteers were scanned twice, with an interval of (mean ± SD) 8.4 ± 3.0 weeks (range: 6–15 weeks) between scans. Participants were recruited between 17.04.2020 and 14.02.2022. The study was approved by the local ethics committee (Kantonale Ethikkommission Zürich, BASEC ID: 2019–00074) and conducted in accordance with the Declaration of Helsinki. Written informed consent was obtained from all participants.

2.2 MRI acquisition

Scanning was performed on a 3T Siemens Prisma MRI scanner (Siemens Healthineers, Erlangen, Germany) using a body transmit coil and a 32-channel spine matrix coil. Foam wedges were placed beneath the knees to minimize the lower spine natural lordotic curve and maximize contact between the lower back and the spine matrix coil. Motion artifacts in the lower back area were reduced by placing the legs onto a spine vacuum cushion and applying Velcro straps around the knees, hips, and chest.

A sagittal T2-weighted turbo spin echo sequence was acquired as an anatomical reference of the lumbosacral cord as described in [17]. Subsequently, axial T2*-weighted images were acquired using a 3D spoiled multi-echo gradient-echo sequence (ME-GRE, Siemens FLASH) with 20 axial-oblique slices of 5 mm thickness (no gap) (Fig 1). A saturation band was placed anterior to the spine to suppress potential artifacts from the abdomen. The sequence parameters were as described in [17].

thumbnail
Fig 1.

(A) The axial slice stacks of the 3D spoiled multi-echo gradient-echo (ME-GRE) sequence (indicated in green) and the diffusion MRI sequence (indicated in yellow), overlaid on the sagittal T2-weighted image. The field of view of the ME-GRE scan was set such that its 6th most rostral slice (slice #15) aligned with the maximum width of the spinal cord as observed in the sagittal T2-weighted scan (slice highlighted in red across all images). The slice stack of the ME-GRE scan covers the lumbosacral enlargement (LSE) and the entire conus medullaris (CM), while that of the diffusion MRI is smaller and centered at the 9th most rostral slice (slice #12) of the ME-GRE scan. (B) Corresponding axial slices of the ME-GRE scan. (C) Corresponding axial slices of the maps of diffusion tensor imaging metrics, including fractional anisotropy (FA), mean diffusivity (MD), axial diffusivity (AD), and radial diffusivity (RD). Axial slices are displayed in rostral (top left) to caudal (bottom right) direction.

https://doi.org/10.1371/journal.pone.0301449.g001

Images for diffusion MRI were acquired using a reduced field of view (FOV) single-shot spin-echo echo planar imaging (EPI) sequence with 15 slices of 5 mm thickness (no gap) (Fig 1). To prevent fold-over artifacts, two saturation bands were respectively placed anterior and posterior to the FOV. The sequence consisted of 180 diffusion-weighted (b = 800 s/mm2) and 6 T2-weighted (b = 0 s/mm2) images with an in-plane resolution of 0.9x0.9 mm2, in-plane FOV of 86x32 mm2, repetition time of 440 ms, echo time of 56 ms, no parallel imaging, 7/8 phase partial Fourier in the anterior-posterior phase-encoding direction, and bandwidth of 1270 Hz/pixel. The acquisition was cardiac gated, acquiring 3 slices per cardiac cycle with a trigger delay of 120 ms. The total acquisition time depended on the heart rate and was approximately 12 min.

2.3 Processing of ME-GRE images

For each repetition, the first three echoes of the ME-GRE scan were combined via root-mean-squares, as it has been demonstrated to be the optimal combination for segmenting the SC and GM within the same image [17]. The resulting combined echo was then averaged across all repetitions. The SC and GM were manually segmented according to a standard operating procedure (SOP), made available on GitHub (https://github.com/NeuroimagingBalgrist/LumbosacralCordMRI), using the sub-voxel segmentation tool in JIM 7.0 (Xinapse systems), providing corresponding CSA values. WM CSA was obtained by subtracting GM CSA from SC CSA. Sub-voxel segmentations were binarized at an inclusion threshold of 100% for SC and 50% for GM to create binary SC and GM masks. A binary WM mask was created by subtracting the binary GM mask from the binary SC mask.

The three raters were instructed to segment the SC and GM by drawing an isointense contour within the partial volumes along the edges of the SC and GM, respectively, while also considering the anatomical shape and smoothness of the SC and GM. The raters had extensive experience in manually segmenting the lumbosacral cord, having individually segmented over 100 images prior to this study. To establish consensus guidelines for segmentation, the raters initially segmented a separate training set comprising three healthy volunteers, which was not included in the main analysis. These initial segmentations were then compared, and any disagreements were resolved through discussion among the raters. The resulting consensus guidelines were added to the SOP.

2.4 Processing of diffusion MRI images

The diffusion MRI images were processed using the ACID toolbox [27]. All images were cropped to an in-plane FOV of 32x32 mm2. Eddy current and motion correction was performed using ECMOCO [28] with a 3-degrees-of-freedom (DOF) volume-wise registration (translation along x and y; scaling along y (with x and y being the left-right and anterior-posterior direction, respectively)), followed by a 2-DOF slice-wise registration (translation and scaling along y). Images underwent adaptive smoothing using msPOAS [29], with parameters k* = 5 and lambda = 10, to increase the signal-to-noise ratio without introducing blurring across tissue edges. The diffusion tensor model was then fitted on the corrected images using a robust tensor fitting algorithm [30,31] to generate maps of fractional anisotropy (FA), mean diffusivity (MD), axial diffusivity (AD), and radial diffusivity (RD).

The mean corrected diffusion-weighted image was manually segmented for SC in FSLeyes and was spatially normalized to the PAM50 template [32] resulting in both forward (native to template space) and backward (template to native space) warping fields. Normalization was aided by labeling both the slice with the largest SC CSA and the most caudal slice of the SC (determined from the ME-GRE sequence), which were aligned with the corresponding labels in the PAM50 template (disc labels 19 and 21). The probabilistic WM atlas, integrated into the PAM50 template, was warped into the native space using the obtained backward warping field. Using sct_extract_metric from the Spinal Cord Toolbox (v.5.8) [33], weighted average values of DTI metrics were extracted slice-wise within the GM, WM, as well as the dorsal, lateral, and ventral WM columns. As the PAM50 atlas does not encompass the caudal half of the CM, DTI metrics were extracted only from the upper half of the CM.

2.5 Comparison of neuroanatomical landmark definition methods

The LSE landmark as an image-based neuroanatomical landmark was defined either as the slice with the largest SC CSA or the largest GM CSA. The curves of slice-wise CSA values (see Fig 3 for examples) were either unsmoothed or underwent smoothing by moving window averaging across three adjacent slices (corresponding to 15 mm). The choice of three slices was driven by the intrinsic smoothness of the curves of slice-wise CSA values and the need for selecting an uneven number of slices (thus avoiding the need for interpolation). Overall, this resulted in a total of four LSE landmark definition methods:

  1. SCmax: slice with the largest SC CSA without moving window averaging
  2. SCmax,mw: slice with the largest SC CSA with moving window averaging
  3. GMmax: slice with the largest GM CSA without moving window averaging
  4. GMmax,mw: slice with largest GM CSA with moving window averaging

To assess the reliability of the LSE landmarks, the mean absolute deviation (MAD) of the determined slices was calculated across three sets of segmentations performed by the same rater (intra-rater reliability) and across segmentations performed by three different raters on the same set (inter-rater reliability) (see Section 2.7 for segmentation procedures).

Identifying the tip of the spinal cord (CMtip landmark) was difficult due to the large slice thickness (5 mm). Therefore, it was determined by extrapolating the curve of slice-wise SC CSA values to zero. Note that the most caudal slice where SC was segmented was usually 1–2 slices above the CMtip landmark.

2.6 Adjusting for the conus medullaris length

To adjust for the individual CM length, we divided the CM, i.e., space between the two landmarks (LSE and CMtip), into five segments of equal thickness in each subject. For example, if the CM consisted of 8 slices (40 mm) in length, each of the five segments comprised 8/5 = 1.6 slices (equivalent to 8 mm). Without reslicing the images, we extracted CSA values and DTI metrics within each segment, computed as a weighted average of the slice-wise values, where the weights represent the spatial contribution of each slice to the particular segment. This allowed us to determine MRI metrics within segments centered at the LSE and the CMtip landmarks, respectively, four segments located between them, and three segments rostral to the LSE landmark (Fig 2).

thumbnail
Fig 2. Adjustment for the individual conus medullaris (CM) length (60, 50, 40, and 30 mm for the displayed cases).

The acquired slices are illustrated by rectangular gray boxes (slice thickness of 5 mm). First, two image-based neuroanatomical landmarks (LSE, defined here as the slice with the largest cross-sectional GM area, and the CMtip) are determined in each subject (indicated by asterisks). Then, the space between these landmarks are divided into 5 segments of equal thickness (resulting in segment thicknesses of 12, 10, 8, and 6 mm). The segments, displayed as colored spinal cord regions, are defined such that one segment (light blue) is centered at the LSE landmark (segment LSE), another one at the tip of the spinal cord (segment LSE-5), and the space between them is covered by four segments. The spinal cord rostral to the LSE landmark is also divided into segments using the same segment thickness. Average values are extracted within each segment as a weighted average of the slice-wise values, where the weights represent the spatial contribution of each slice to the segment. If a value for a slice which contributes more than 25% to the segment was not available, the value for that segment was not calculated.

https://doi.org/10.1371/journal.pone.0301449.g002

2.7 Intra- and inter-rater reliability of cross-sectional area measurements

A total of 10 participants including 5 healthy volunteers (1 female, 4 males, age (mean ± SD): 41.4 ± 8.6 years) and 5 patients with SCI (1 female, 4 males, age: 46.1 ± 20.3 years) were included in this analysis. No images had to be discarded due to motion or other artifacts. Segmentation of the axial ME-GRE images was performed by three raters (S.B., G.D., M.D.L.). Two raters performed the segmentation three times and one rater performed it once, with a minimum of two weeks between each round of segmentation. The order of segmentation in each round was pseudo-randomized into three blocks using a computer-generated randomization list to ensure a balanced distribution of patients and healthy volunteers. Raters were provided with the pseudonymized set of images (ME-GRE of the lumbosacral cord). The raters were not provided with images covering the injury site, where the participant status would have been visible. Since the patients did not exhibit any injury-related radiological abnormalities or implant-related artifacts in the lumbosacral cord, raters were unable to discern whether the participant was a healthy volunteer or a patient. The LSE landmark, defined as the GMmax,mw slice, was determined in each subject as the median across three raters.

Slice-wise coefficient of variation (CV) of SC, GM, and WM CSA was calculated as the percent ratio between the standard deviation and mean of CSA values either (i) across three sets of segmentations performed by the same rater (intra-rater reliability) or (ii) across segmentations performed by three different raters on the same set (inter-rater reliability). The CV was then averaged across all subjects (n = 10) and separately for patients (n = 5) and healthy volunteers (n = 5).

Intra- and inter-rater intraclass correlation coefficients (ICC) were simultaneously computed using Eliasziw’s two-way mixed effects, absolute agreement, single-measure model [34], as implemented in the relInterIntra function of the irr package in R. Two-sided (upper and lower bounds) 95% confidence intervals were additionally calculated according to the formulas in [35]. The Dice coefficient represents the spatial overlap between masks and was computed as the size of the union of two (binary) segmentation masks, created either by the same rater (intra-rater Dice coefficient) or different raters (inter-rater Dice coefficient), divided by the average size of the two segmentations masks. Values range from 0 (no overlap) to 1 (perfect overlap). Single Dice coefficients were calculated by averaging pairwise Dice coefficients for each subject, and then further averaged across all subjects.

2.8 Scan-rescan reliability of cross-sectional area measurements and diffusion tensor imaging

The reliability analysis across both imaging sessions (scan and rescan) included the 10 healthy volunteers. Patients were not included, as their scan-rescan data might be affected by disease-related longitudinal changes. No images had to be discarded due to motion or other artifacts. Segmentation was performed by a single rater (S.B.). The LSE landmark was defined as the GMmax,mw slice.

We calculated the mean of the scan-rescan differences (), along with the 95% limits of agreement (). Scan-rescan CV was calculated as the percent ratio between the standard deviation and mean of MRI metrics across scan and rescan. Scan-rescan ICC was computed using a two-way mixed effects, absolute agreement, single-measure model using the icc function of the irr package in R. The minimal detectable change (MDC), or smallest real difference represents the smallest longitudinal change that be confidently attributed to a meaningful change rather than mere scan-rescan variability. The MDC with 95% confidence threshold were calculated as , where SDtotal denotes the total standard deviation (taken across all measurements) and ICC denotes the scan-rescan ICC. While MDC describes the minimal detectable longitudinal change in an individual, it is important to note that group studies can detect even smaller longitudinal changes by utilizing multiple subjects.

3. Results

3.1 Comparison of neuroanatomical landmark definition methods

Visual comparison of different LSE landmarks is provided in Fig 3. The inter-rater MAD values of the LSE landmarks were 0.58 for SCmax, 0.42 for SCmax,mw, 0.29 for GMmax, and 0.22 for GMmax,mw. Intra-rater MAD values were consistently the same or lower than corresponding inter-rater values (0.16 for SCmax, 0.19 for SCmax,mw, 0.29 for GMmax, and 0.11 for GMmax,mw). The lowest intra- and inter-rater MAD values were achieved for GMmax,mw. The GMmax,mw slice was located (mean ± SD) 8.5 ± 4.5 mm more caudally than the SCmax,mw slice. The inter-subject variability of SC and GM CSA was lower at caudal locations when using the GMmax,mw slice as LSE landmark; however, no clear trend was observed at rostral locations (S2 Table, Fig 4).

thumbnail
Fig 3. Curves of slice-wise cross-sectional area (CSA) of the spinal cord (SC) and gray matter (GM), obtained by three different raters in two subjects, with and without applying moving window averaging across 3 adjacent slices.

The LSE landmarks, as determined by the raters, are indicated by asterisks. The benefit of using the slice with the largest GM CSA (GMmax,mw) as LSE landmark, as opposed to the slice with the largest SC CSA (SCmax,mw), is evident in Subject 1, with no inter-rater variability for the GMmax,mw slice. Subject 2 demonstrates the advantage of using moving window averaging: There is no inter-rater variability in GMmax,mw and SCmax,mw slice after applying moving window averaging.

https://doi.org/10.1371/journal.pone.0301449.g003

thumbnail
Fig 4. Inter-subject mean (solid line) and standard deviation (shaded area) of the cross-sectional area (CSA) of the spinal cord (SC) and gray matter (GM), computed across 10 healthy volunteers, when aligning the individual slice stacks at the slice either with the largest SC CSA (SCmax,mw, in red) or GM SCA (GMmax,mw, in blue), without adjusting for the length of the conus medullaris.

The inter-subject variability at caudal locations was slightly lower when aligning at the GMmax,mw slice, as seen by the smaller width of the blue shaded areas.

https://doi.org/10.1371/journal.pone.0301449.g004

3.2 Adjusting for the conus medullaris length

After adjusting for the CM length, the inter-subject CV of CSA measurements remained similar at and around the GMmax,mw slice, but decreased substantially caudal to it (Table 1). For example, the inter-subject CV of SC CSA was 31.5% five slices (25 mm) below the GMmax,mw slice (before adjustment), and it reduced to 17.4% after adjustment at roughly the same anatomical location (segment LSE-3) (compare values highlighted in bold in Table 1).

thumbnail
Table 1. Inter-subject variability of cross-sectional area measurements.

https://doi.org/10.1371/journal.pone.0301449.t001

3.3 Intra- and inter-rater reliability of cross-sectional area measurements

Fig 5 illustrates the intra- and inter-rater variability in SC and GM segmentations, while Table 2 lists the intra- and inter-rater reliability metrics for the CSA measurements. Intra-rater reliability was higher than inter-rater reliability for all CSA values and slices, indicated by lower CV and higher ICC and Dice coefficients. Intra- and inter-rater reliability was in general higher for SC CSA than for GM and WM CSA. Intra- and inter-rater CV increased considerably from the GMmax,mw slice toward the tip of the spinal cord, accompanied by a decrease in Dice coefficients. Notably, such a trend could not be observed in the ICC. Intra-rater CV values were similar between healthy volunteers and patients, but inter-rater CV values were higher in the patient group, especially at caudal locations (S3 Table).

thumbnail
Fig 5. Illustration of intra- and inter-rater variability of manual spinal cord (SC) and gray matter (GM) segmentations within a representative subject, i.e., a subject whose intra- and inter-rater variability values are close to the mean values reported in Table 2.

SC and GM segmentations are shown on alternating slices for display purposes. For the displayed SC and GM segmentations, the coefficient of variation (CV) values of the corresponding SC cross-sectional areas (CSA) were (intra- vs. inter-rater) 0.8% vs. 3.7% (slice 17), 2.4% vs. 3.2%, 1.2% vs. 9.9%, 4.8% vs. 12.3%, and 6.8% vs. 6.7% (slice 9). The CV values of the corresponding GM CSA were (intra- vs. inter-rater) 1.8% vs. 9.1% (slice 16), 1.5% vs. 9.1%, 2.4% vs. 8.9%, 6.5% vs. 10.9%, and 3.6% vs. 9.5% (slice 8).

https://doi.org/10.1371/journal.pone.0301449.g005

thumbnail
Table 2. Slice-wise intra- and inter-rater reliability of cross-sectional area measurements (n = 10; 5 healthy volunteers and 5 patients with spinal cord injury).

https://doi.org/10.1371/journal.pone.0301449.t002

3.4 Scan-rescan reliability of cross-sectional area measurements and diffusion tensor imaging

There was no systematic bias, as measured by , between the scan and rescan values for any of the MRI metrics (Tables 35, S1 Fig). In general, scan-rescan reliability was higher for SC CSA than for GM and WM CSA, indicated by lower CV and MDC, and higher ICC values (Table 3). Among the DTI metrics, RD tended to have the lowest scan-rescan reliability. In general, reliability was higher for DTI metrics when extracted within the entire WM rather than separate WM columns. Reliability (with the exception of ICC values for DTI metrics) decreased considerably from the LSE segment toward the tip of the SC. Without adjusting for the CM length, the scan-rescan CV values for DTI metrics were higher (compare Tables 35 with S4S6 Tables, respectively). The scan-rescan reliability decreased only minimally (i.e., the increase in CV was, on average, below 1 percentage point for MRI metrics extracted within the WM) when the landmarks were determined independently for scan and rescan, as opposed to when they were determined in the first scan (compare Tables 35 with S7S9 Tables, respectively).

thumbnail
Table 3. Scan-rescan reliability of cross-sectional area measurements (n = 10 healthy volunteers).

https://doi.org/10.1371/journal.pone.0301449.t003

thumbnail
Table 4. Scan-rescan reliability of fractional anisotropy and mean diffusivity values (n = 10 healthy volunteers).

https://doi.org/10.1371/journal.pone.0301449.t004

thumbnail
Table 5. Scan-rescan reliability of axial and radial diffusivity values (n = 10 healthy volunteers).

https://doi.org/10.1371/journal.pone.0301449.t005

4. Discussion

We investigated methods for improving the inter-subject alignment of axial slice stacks within the lumbosacral cord. We found that the slice with the largest gray matter (GM) cross-sectional area (CSA) can serve as a reliable image-based neuroanatomical landmark for the lumbosacral enlargement (LSE). Adjusting for the conus medullaris (CM) length by dividing the CM into a fixed number of segments and extracting MRI metrics from those segments substantially reduced inter-subject variability. Additionally, we reported different aspects of reliability for CSA measurements and diffusion tensor imaging (DTI) metrics within the lumbosacral cord at 3T. We found that intra-rater, inter-rater, and scan-rescan reliability was the highest at and around the LSE landmark, and decreased caudal to it.

4.1 Slice with the largest gray matter cross-sectional area is a reliable image-based neuroanatomical landmark

We found that the slice with the largest GM CSA, as opposed to the slice with the largest spinal cord (SC) CSA, can be identified more consistently across raters and across repeated assessments of the same rater. This is likely because of the sharper peak in the curves of slice-wise GM CSA values at the LSE compared to the flatter peak in SC CSA (Fig 3). Sharper peaks are easier for raters to identify than flatter peaks. Further improvement in the reliability of image-based landmarks can be achieved by smoothing the curves of slice-wise CSA values through moving window averaging across three adjacent slices, which reduces fluctuations inherent in the slice-wise CSA values.

Furthermore, when aligning the individual axial slice stacks at the LSE landmark determined based on GM CSA, as opposed to SC CSA, we noticed a slight reduction in inter-subject variability of CSA values at caudal locations. The reason for this reduction remains unclear. We argue that the slice with the largest GM CSA is more closely associated with the neurological level than the slice with the largest SC CSA, considering that the LSE is neuroanatomically attributed to the enlargement of the GM.

Notably, a previous study based on post-mortem MRI utilized distinct morphological features of the ventral GM horns to characterize the lumbosacral cord across species [36]. Another study revealed a strong link between these morphological features and the motor neuron pools located within the ventral GM horns [37]. We anticipate that, for in vivo studies, the shape of GM would serve as an even better neuroanatomical landmark than its size. In a recent study, a distinct approach was used to directly identify neurological levels through nerve root tracing [38]. However, this approach requires the acquisition of an additional image optimized for nerve segmentation, thereby increasing the scan time, and might not work reliably in all subjects. Future research should focus on identifying neurological levels directly from the axial ME-GRE images.

4.2 Improved inter-subject alignment after adjusting for the conus medullaris length

To adjust for the considerable inter-subject variation in CM length, we divided the CM into a fixed number of segments and extracted MRI metrics from these segments using a weighted average of the slice-wise values. This adjustment substantially reduced the inter-subject variability of MRI metrics at caudal locations. This is beneficial for group-level analyses, as reduced inter-subject variability increases statistical power for the same sample size or necessitates a smaller sample size for the same statistical power. We note that another line of research is concerned with reducing inter-subject variability of MRI metrics that arise from biological variability, using regression models incorporating demographic information, spine, and SC metrics [18,39]; however, this was not the focus of our investigation.

The decision of dividing the CM into five segments was determined by both the slice thickness (5 mm) and the range of CM lengths observed within the healthy population [18]. By using five segments, we ensured that segments were thicker than the slice thickness in all participants, avoiding the need for interpolating within slices. While thinner slices would allow for a higher number of segments and increased spatial specificity, this comes at the cost of reduced scan-rescan reliability. Conversely, dividing the CM into fewer segments would likely enhance scan-rescan reliability, but it would come at the expense of specificity along the rostro-caudal axis.

Spinal cord segments, created by correcting for the individual CM length, do not correspond to specific neurological levels. Instead, each segment consists of a combination of neurological levels, where more caudal segments encompass more neurological levels due to their progressively shorter length [40]. Nevertheless, this approach is still consistent with neurological levels as long as their distribution within each segment remains constant across subjects. In addition, while the adjustment method was demonstrated on ME-GRE images, it can be applied to other MRI sequences as well.

4.3 Excellent intra-rater reliability of cross-sectional area measurements at the lumbosacral enlargement

We demonstrated excellent intra-rater reliability for CSA measurements, when using the previously optimized ME-GRE imaging protocol [17], with intra-rater CV values below 4%, 5%, and 6% for SC, GM, and WM CSA, respectively, at and within a 15 mm radius of the LSE landmark. Notably, the inter-rater CV values for CSA measurements were approximately twice as high as the corresponding intra-rater CV values. Consequently, we strongly advocate for the segmentation of all images by the same rater.

We also highlight the importance of anatomical location for the reliability of CSA measurements. More caudal slices within the CM had lower intra- and inter-rater reliability, in line with a previous report [18]. This observation is likely attributed to the conus medullaris becoming thinner; for example, a minor difference between two segmentations has a larger impact when the area being segmented is smaller. Despite the increasing CV values toward the tip of the spinal cord, the intra- and inter-rater ICC values exhibited relatively consistent trends along the CM. This is because higher CV values are offset by higher inter-subject variability at more caudal locations (Table 1).

While intra-rater CV values for SC, GM, and WM CSA measurements were, on average, less than 1 percentage point higher in patients with spinal cord injury compared to healthy controls, the inter-rater CV values were, on average, between 2 and 5 percentage points higher, with the largest differences occurring caudal to the LSE landmark. This finding suggests that, for patients, the lower quality of the ME-GRE images increases the inter-rater, but not the intra-rater variability of SC and GM segmentations. The lower quality for patients may be attributed to a higher level of involuntary motion (4 of 5 patients reported spasticity in the lower limbs). While 4 of 5 patients had spinal instrumentation, they were not in close proximity to our field of view (FOV) and are therefore unlikely to affect the image quality.

The higher values for SC CSA, in comparison to GM and WM CSA, reflect the experience of the raters that SC segmentation is an "easier" task than GM segmentation. This is due to the fact that the contrast between the GM and WM is typically lower than between the WM and cerebrospinal fluid on ME-GRE images, and GM exhibits more ambiguity owing to its irregular shape [17]. The reliability of CSA measurements depends on the confidence with which manual segmentation is performed, referred to as "segmentability". SC and GM segmentability is influenced by various factors, including the rater’s experience, the subject being investigated, the imaging hardware, pulse sequence, and sequence parameters [17]. Consequently, the comparability of reliability values across studies is limited by variations in these factors. In comparison to values obtained on a 3T Philips MRI scanner [18], our study showed lower intra-rater (1.6% vs. 3.5%) but higher inter-rater CV for SC CSA (4.7% vs. 3.0%) at the LSE landmark within healthy volunteers. The CV values for GM CSA were found to be lower in our study for both intra-rater (4.0% vs. 7.7%) and inter-rater (6.1% vs. 9.9%) analyses.

4.4 Scan-rescan reliability depends on the anatomical location within the lumbosacral cord

Scan-rescan reliability was generally lower compared to the corresponding intra-rater reliability. This is expected, as in addition to the variabilities captured by intra-rater reliability, scan-rescan reliability also encompasses variabilities arising from subject positioning, the position of the imaged organ, FOV positioning, and potential imaging hardware instabilities (e.g., scanner drift). The time interval between scan and rescan is another important factor; longer intervals are associated with lower reliability. In our study, we deliberately selected a relatively long time interval of 6 to 15 weeks to mimic time frames commonly encountered in longitudinal clinical studies. Scan-rescan reliability was generally higher compared to the corresponding inter-reliability, which further emphasizes the importance of employing the same rater for segmentation.

The scan-rescan reliability of DTI metrics was found to be higher (i) in slices with larger CSA, particularly at and around the LSE landmark, as opposed to more caudal slices, and (ii) within larger regions (such as WM), compared to smaller ones (such as dorsal and ventral WM). This is because averaging DTI metrics within a larger region reduces the impact of individual outliers, small anatomical variations, and partial volume effects. Scan-rescan reliability can be further improved by averaging DTI metrics across several adjacent slices or segments, although this comes at the cost of decreased spatial specificity. For example, when averaging across three adjacent slices around the LSE landmark, as in [23,25], we observed a reduction in scan-rescan CV from 3.4% to 2.2% for WM CSA and from 4.2% to 2.7% for FA within the WM. When adjusting for the CM length, a small improvement in the scan-rescan reliability for DTI metrics was observed, which is probably due to the larger thickness of the segments (1–2 slices) compared to a single slice.

The DTI metrics exhibited lower scan-rescan reliability in comparison to CSA measurements. This is because diffusion MRI is inherently noisier than structural MRI, and, unlike CSA measurements, the average DTI metrics within a ROI are not solely determined by the size of the ROI, but also by its location. For example, if two SC segmentations do not fully overlap but have the same area, they would yield the same SC CSA, yet different average DTI metrics within the SC. Nevertheless, we demonstrated a very high overlap between segmentation masks created by the same rater, as measured by the Dice coefficient, with values above 91% for SC, 90% for GM, and 76% for WM segmentation.

Despite differences in the cohort, MRI scanner, sequence, and processing pipeline, our scan-rescan CV for SC CSA at the LSE landmark is similar to that reported previously in healthy volunteers scanned on a 3T Philips MRI scanner (vs. Yiannakas et al., 2014 [23]; 1.8% vs. 2.0%). However, our CV for GM CSA was notably lower (2.9% vs. 7.8%). For DTI metrics extracted within the WM at the LSE landmark, our CV values were in line with previously reported values for FA (vs. Yiannakas et al., 2016 [25]: 2.8% vs. 6.0%), MD (5.5% vs. 5.0%), AD (4.7% vs. 5.4%), and RD (6.4% vs. 8.3%). The provided scan-rescan reliability values (Tables 35) serve as valuable guides for power and sample size calculations in future longitudinal studies. For example, we can calculate, using the equations in [41], that detecting a 3% change over time in GM CSA at the LSE with a power of 80% and significance of 5% (one-sided) would require 12 subjects.

4.5 Limitations

In contrast to CSA values, we did not separately investigate intra- and inter-rater reliability for DTI metrics. This is because while obtaining CSA values involves a single manual step (SC and GM segmentation in the ME-GRE images), the processing pipeline for DTI requires several manual interventions (selection of SC midpoint, SC segmentation in both the EPI and ME-GRE images), making it challenging to isolate individual effects. However, we argue that the reported scan-rescan reliability values encompass the combined effect of all these sources of variability and are therefore most relevant for planning future studies. Additionally, DTI metrics could not be extracted from the caudal half of the CM, as this region is currently not covered by the PAM50 atlas. This is a recognized issue within the community and is likely to be addressed in future developments. A limitation of the study is the relatively small sample size; however, this sample size is typical in method development and scan-rescan studies. Moreover, the findings may not be generalizable to patient populations with pathologies (e.g. lesions) within the lumbosacral cord. Another limitation is the considerable length of the MRI examination; however, the acquisition time of the ME-GRE images can be reduced for clinical applications [17].

5. Conclusions

We provide recommendations for improved inter-subject alignment within the lumbosacral cord to facilitate group-level analyses of MRI metrics. Specifically, we propose using the slice with the largest gray matter cross-sectional area as a reliable image-based neuroanatomical landmark, along with an adjustment method for the length of the conus medullaris. We emphasize the importance of anatomical location for intra-rater, inter-rater, and scan-rescan reliability, which were the highest at the lumbosacral enlargement and decreased toward the tip of the spinal cord. The provided scan-rescan reliability values serve as valuable guides for power and sample size calculations in future longitudinal studies.

Supporting information

S1 Fig. Bland-Altman plots illustrating the scan-rescan differences of cross-sectional area measurements and diffusion tensor imaging metrics within the white matter (WM), at the lumbosacral enlargement (segment LSE, in blue) and the middle of the conus medullaris (segment LSE-3, in red) (n = 10 healthy volunteers).

The dotted lines represent the bias, i.e., the mean of the scan-rescan differences, while the dashed lines represent the 95% limits of agreement.

https://doi.org/10.1371/journal.pone.0301449.s001

(TIF)

S1 Table. Demographic and clinical information of patients with spinal cord injury.

https://doi.org/10.1371/journal.pone.0301449.s002

(DOCX)

S2 Table. Slice-wise cross-sectional area values when aligning the individual slice stacks at different lumbosacral enlargement landmarks.

https://doi.org/10.1371/journal.pone.0301449.s003

(DOCX)

S3 Table. Slice-wise intra- and inter-rater reliability of cross-sectional area measurements (n = 10; 5 healthy controls and 5 patients with spinal cord injury).

https://doi.org/10.1371/journal.pone.0301449.s004

(DOCX)

S4 Table. Slice-wise scan-rescan reliability of cross-sectional area measurements (n = 10 healthy volunteers).

https://doi.org/10.1371/journal.pone.0301449.s005

(DOCX)

S5 Table. Slice-wise scan-rescan reliability of fractional anisotropy and mean diffusivity values (n = 10 healthy volunteers).

https://doi.org/10.1371/journal.pone.0301449.s006

(DOCX)

S6 Table. Slice-wise scan-rescan reliability of axial and radial diffusivity values (n = 10 healthy volunteers).

https://doi.org/10.1371/journal.pone.0301449.s007

(DOCX)

S7 Table. Scan-rescan reliability of cross-sectional area measurements (n = 10 healthy volunteers).

https://doi.org/10.1371/journal.pone.0301449.s008

(DOCX)

S8 Table. Scan-rescan reliability of fractional anisotropy and mean diffusivity values (n = 10 healthy volunteers).

https://doi.org/10.1371/journal.pone.0301449.s009

(DOCX)

S9 Table. Scan-rescan reliability of axial and radial diffusivity values (n = 10 healthy volunteers).

https://doi.org/10.1371/journal.pone.0301449.s010

(DOCX)

Acknowledgments

We thank all the volunteers who participated in this study. We also thank Veronika Birkhäuser and Oliver Gross, as well as the entire clinical team of the Department of Neuro-Urology, Balgrist University Hospital, University of Zürich for recruiting the patients and screening the healthy volunteers. Furthermore, we thank Collene Anderson and Adrian Cathomen for pseudo-randomizing the images for segmentation. We also thank all members of the TASCI Study Group for their support. Imaging was performed with support of the Swiss Center for Musculoskeletal Imaging, SCMI, Balgrist Campus AG, Zürich.

References

  1. 1. Panicker JN, Fowler CJ, Kessler TM. Lower urinary tract dysfunction in the neurological patient: clinical assessment and management. Lancet Neurol. 2015;14(7):720–32. pmid:26067125
  2. 2. Park SE, Elliott S, Noonan VK, Thorogood NP, Fallah N, Aludino A, et al. Impact of bladder, bowel and sexual dysfunction on health status of people with thoracolumbar spinal cord injuries living in the community. J Spinal Cord Med. 2017;40(5):548–59. pmid:27576584
  3. 3. Ahuja CS, Wilson JR, Nori S, Kotter MRN, Druschel C, Curt A, et al. Traumatic spinal cord injury. Nat Rev Dis Prim. 2017;3(1):17018.
  4. 4. Seif M, Gandini Wheeler-Kingshott CAM, Cohen-Adad J, Flanders AE, Freund P. Guidelines for the conduct of clinical trials in spinal cord injury: Neuroimaging biomarkers. Spinal Cord. 2019;57(9):717–28. pmid:31267015
  5. 5. David G, Mohammadi S, Martin AR, Cohen-Adad J, Weiskopf N, Thompson A, et al. Traumatic and nontraumatic spinal cord injury: pathological insights from neuroimaging. Nat Rev Neurol. 2019;15(12):718–31. pmid:31673093
  6. 6. David G, Pfyffer D, Vallotton K, Pfender N, Thompson A, Weiskopf N, et al. Longitudinal changes of spinal cord grey and white matter following spinal cord injury. J Neurol Neurosurg Psychiatry. 2021;92(11):1222–1230. pmid:34341143
  7. 7. David G, Vallotton K, Hupp M, Curt A, Freund P, Seif M. Extent of cord pathology in the lumbosacral enlargement in non-traumatic versus traumatic spinal cord injury. J Neurotrauma. 2022;39(9–10):639–50. pmid:35018824
  8. 8. David G, Seif M, Huber E, Hupp M, Rosner J, Dietz V, et al. In vivo evidence of remote neural degeneration in the lumbar enlargement after cervical injury. Neurology. 2019;92(12):e1367–e1377. pmid:30770423
  9. 9. Huber E, David G, Thompson AJ, Weiskopf N, Mohammadi S, Freund P. Dorsal and ventral horn atrophy is associated with clinical outcome after spinal cord injury. Neurology. 2018;90(17). pmid:29592888
  10. 10. Paquin M, Mendili M, Gros C, Dupont SM, Cohen-Adad J, Pradat PF. Spinal cord gray matter atrophy in amyotrophic lateral sclerosis. Am J Neuroradiol. 2018;39(1):184–92. pmid:29122760
  11. 11. Vallotton K, David G, Hupp M, Pfender N, Cohen-Adad J, Fehlings MG, et al. Tracking white and gray matter degeneration along the spinal cord axis in degenerative cervical myelopathy. J Neurotrauma. 2021;38(21). pmid:34238034
  12. 12. White ML, Zhang Y, Healey K. Cervical spinal cord multiple sclerosis: Evaluation with 2D multi-echo recombined gradient echo MR imaging. J Spinal Cord Med. 2011;34(1):93–8. pmid:21528632
  13. 13. Ciccarelli O, Wheeler-Kingshott CA, McLean MA, Cercignani M, Wimpey K, Miller DH, et al. Spinal cord spectroscopy and diffusion-based tractography to assess acute disability in multiple sclerosis. Brain. 2007;130(8):2220–31. pmid:17664178
  14. 14. Cohen-Adad J, El Mendili MM, Lehéricy S, Pradat PF, Blancho S, Rossignol S, et al. Demyelination and degeneration in the injured human spinal cord detected with diffusion and magnetization transfer MRI. Neuroimage. 2011;55(3):1024–33. pmid:21232610
  15. 15. Combes AJE, O’Grady KP, Rogers BP, Schilling KG, Lawless RD, Visagie M, et al. Functional connectivity in the dorsal network of the cervical spinal cord is correlated with diffusion tensor imaging indices in relapsing-remitting multiple sclerosis. NeuroImage Clin. 2022;35:103127. pmid:35917721
  16. 16. Cohen-Adad J, El Mendili MM, Morizot-Koutlidis R, Lehéricy S, Meininger V, Blancho S, et al. Involvement of spinal sensory pathway in ALS and specificity of cord atrophy to lower motor neuron degeneration. Amyotroph Lateral Scler Front Degener. 2013;14(1):30–8. pmid:22881412
  17. 17. Büeler S, Yiannakas MC, Damjanovski Z, Freund P, Liechti MD, David G. Optimized multi-echo gradient-echo magnetic resonance imaging for gray and white matter segmentation in the lumbosacral cord at 3 T. Sci Rep. 2022;12(1):16498.
  18. 18. Yiannakas MC, Liechti MD, Budtarad N, Cullinane P, Yang X, Toosy AT, et al. Gray vs. white matter segmentation of the conus medullaris: reliability and variability in healthy volunteers. J Neuroimaging. 2019;29(3):410–7. pmid:30582252
  19. 19. Konno H, Yamamoto T, Iwasaki Y, Iizuka H. Shy-Drager syndrome and amyotrophic lateral sclerosis. Cytoarchitectonic and morphometric studies of sacral autonomic neurons. J Neurol Sci. 1986;73(2):193–204. pmid:3701375
  20. 20. Panicker JN, Simeoni S, Miki Y, Batla A, Iodice V, Holton JL, et al. Early presentation of urinary retention in multiple system atrophy: can the disease begin in the sacral spinal cord? J Neurol. 2020;267(3):659–64. pmid:31720822
  21. 21. Canbay S, Gürer B, Bozkurt M, Comert A, Izci Y, Başkaya MK. Anatomical relationship and positions of the lumbar and sacral segments of the spinal cord according to the vertebral bodies and the spinal roots. Clin Anat. 2014;27(2):227–33. pmid:23649511
  22. 22. Nunès A, Glaudot G, Lété A, Balci A, Lengelé B, Behets C, et al. Measurements and morphometric landmarks of the human spinal cord: A cadaveric study. Clin Anat. 2023;36(4):631–40. pmid:36647816
  23. 23. Yiannakas MC, Kakar P, Hoy LR, Miller DH, Wheeler-Kingshott CAM. The use of the lumbosacral enlargement as an intrinsic imaging biomarker: Feasibility of grey matter and white matter cross-sectional area measurements using MRI at 3T. PLoS One. 2014;9(8):e105544. pmid:25170763
  24. 24. Prados F, Ashburner J, Blaiotta C, Brosch T, Carballido-Gamio J, Cardoso MJ, et al. Spinal cord grey matter segmentation challenge. Neuroimage. 2017;152:312–29. pmid:28286318
  25. 25. Yiannakas MC, Grussu F, Louka P, Prados F, Samson RS, Battiston M, et al. Reduced field-of-view diffusion-weighted imaging of the lumbosacral enlargement: A pilot in vivo study of the healthy spinal cord at 3T. PLoS One. 2016;11(10):e0164890. pmid:27741303
  26. 26. Birkhäuser V, Liechti MD, Anderson CE, Bachmann LM, Baumann S, Baumberger M, et al. TASCI-transcutaneous tibial nerve stimulation in patients with acute spinal cord injury to prevent neurogenic detrusor overactivity: Protocol for a nationwide, randomised, sham-controlled, double-blind clinical trial. BMJ Open. 2020;10(8):e039164. pmid:32792454
  27. 27. David G, Fricke B, Oeschger JM, Ruthotto L, Freund P, Tabelow K, et al. ACID: A Comprehensive Toolbox for Image Processing and Modeling of Brain, Spinal Cord, and Post-mortem Diffusion MRI Data. bioRxiv. 2023;2023.10.13.562027.
  28. 28. Mohammadi S, Möller HE, Kugel H, Müller DK, Deppe M. Correcting eddy current and motion effects by affine whole-brain registrations: Evaluation of three-dimensional distortions and comparison with slicewise correction. Magn Reson Med. 2010;64(4):1047–56. pmid:20574966
  29. 29. Tabelow K, Mohammadi S, Weiskopf N, Polzehl J. POAS4SPM: A toolbox for SPM to denoise diffusion MRI data. Neuroinformatics. 2015;13(1):19–29. pmid:24993814
  30. 30. Mohammadi S, Freund P, Feiweier T, Curt A, Weiskopf N. The impact of post-processing on spinal cord diffusion tensor imaging. Neuroimage. 2013;70:377–85. pmid:23298752
  31. 31. David G, Freund P, Mohammadi S. The efficiency of retrospective artifact correction methods in improving the statistical power of between-group differences in spinal cord DTI. Neuroimage. 2017;158:296–307. pmid:28669912
  32. 32. De Leener B, Fonov VS, Collins DL, Callot V, Stikov N, Cohen-Adad J. PAM50: Unbiased multimodal template of the brainstem and spinal cord aligned with the ICBM152 space. Neuroimage. 2018;165:170–9. pmid:29061527
  33. 33. De Leener B, Lévy S, Dupont SM, Fonov VS, Stikov N, Louis Collins D, et al. SCT: Spinal Cord Toolbox, an open-source software for processing spinal cord MRI data. Neuroimage. 2017;145:24–43. pmid:27720818
  34. 34. Eliasziw M, Young SL, Woodbury MG, Fryday-Field K. Statistical methodology for the concurrent assessment of interrater and intrarater reliability: Using goniometric measurements as an example. Phys Ther. 1994;74(8):777–88. pmid:8047565
  35. 35. Hayen A, Dennis RJ, Finch CF. Determining the intra- and inter-observer reliability of screening tools used in sports injury research. J Sci Med Sport. 2007;10(4):201–10. pmid:17194623
  36. 36. Toossi A, Everaert DG, Perlmutter SI, Mushahwar VK. Functional organization of motor networks in the lumbosacral spinal cord of non-human primates. Sci Rep. 2019;9(1):13539. pmid:31537819
  37. 37. Gross C, Ellison B, Buchman AS, Terasawa E, VanderHorst VG. A novel approach for assigning levels to monkey and human lumbosacral spinal cord based on ventral horn morphology. PLoS One. 2017;12(5):e0177243. pmid:28542213
  38. 38. Mesbah S, Herrity A, Ugiliweneza B, Angeli C, Gerasimenko Y, Boakye M, et al. Neuroanatomical mapping of the lumbosacral spinal cord in individuals with chronic spinal cord injury. Brain Commun. 2023;5(1):fcac330. pmid:36632181
  39. 39. Papinutto N, Asteggiano C, Bischof A, Gundel TJ, Caverzasi E, Stern WA, et al. Intersubject variability and normalization strategies for spinal cord total cross-sectional and gray matter areas. J Neuroimaging. 2020;30(1):110–8. pmid:31571307
  40. 40. Frostell A, Hakim R, Thelin EP, Mattsson P, Svensson M. A review of the segmental diameter of the healthy human spinal cord. Front Neurol. 2016;7:238. pmid:28066322
  41. 41. Kadam P, Bhalerao S. Sample size calculation. Int J Ayurveda Res. 2010;1(1):55–7. pmid:20532100