Outcome measures used in trials on gait rehabilitation in multiple sclerosis: A systematic literature review

Background: Multiple Sclerosis (MS) is associated with impaired gait and a growing number of clinical trials have investigated efficacy of various interventions. Choice of outcome measures is crucial in determining efficiency of interventions. However, it remains unclear whether there is consensus on which outcome measures to use in gait intervention studies in MS.
Objective: We aimed to identify the commonly selected outcome measures in randomized controlled trials (RCTs) on gait rehabilitation interventions in people with MS. Additional aims were to identify which of the domains of the International Classification of Functioning, Disability and Health (ICF) are the most studied and to characterize how outcome measures are combined and adapted to MS severity.
Methods: Pubmed, Cochrane Central, Embase and Scopus databases were searched for RCT studies on gait interventions in people living with MS according to PRISMA guidelines.
Results: In 46 RCTs, we identified 69 different outcome measures. The most used outcome measures were the 6-minute walking test and the Timed Up and Go test, used in 37% of the analyzed studies. They were followed by gait spatiotemporal parameters (35%), most often used to inform on gait speed, cadence, and step length. Fatigue was measured in 39% of studies. Participation was assessed in 50% of studies, albeit with a wide variety of scales. Only 39% of studies included measures covering all ICF levels, and Participation measures were rarely combined with gait spatiotemporal parameters (only two studies).
Conclusions: Selection of outcome measures remains heterogeneous in RCTs on gait rehabilitation interventions in MS. However, there is a growing consensus on the need for quantitative gait spatiotemporal parameter measures combined with clinical assessments of gait, balance, and mobility in RCTs on gait interventions in MS. Future RCTs should incorporate measures of fatigue and measures from the Participation domain of the ICF to provide a comprehensive evaluation of trial efficacy across all levels of functioning.


Rationale
Multiple Sclerosis (MS) is an inflammatory demyelinating chronic disease of the central nervous system, and it is the most common non-traumatic cause of disability among young adults [1]. The clinical presentation and evolution of this disease are very heterogeneous, generating quite different disorders with important functional repercussions [2]. Gait impairment is one of the most common motor disorders [3] and is perceived as one of the most important bodily functions across the MS disability spectrum [4].
The central nervous system remodels after inflammatory and demyelinating injuries through spontaneous mechanisms of recovery [1]. This remodeling can be enhanced by rehabilitation interventions that promote activity-dependent neural plasticity [5], improve the degree of functionality and increase Participation [6,7].
In recent years, with advances in the field of technology and neurorehabilitation, there has been a growing number of new rehabilitation approaches and randomized controlled trials (RCTs) to assess their efficacy [8]. Assessment in this context is central, and selecting the most appropriate outcome measures is crucial for determining which rehabilitation treatments are most efficient [9]. There are many assessment tools, clinical scales, self-questionnaires, and technological devices that are validated and commonly used in gait assessment in MS [8,10]. Psychometric properties of some of these assessment methods have already been studied by many authors [11,12]. However, a consensus about which are the most appropriate is lacking, although agreement is crucial to generalize outcomes.
Primary symptoms of MS impact not only on disability and functioning but can also have major effects on quality of life and socioeconomic issues. The World Health Organization proposes a framework and classification for measuring health and disability known as the International Classification of Functioning, Disability and Health (ICF). According to the ICF, health domains of people living with MS (pwMS) are classified into three levels: Body structure/Body function, Activity, and Participation domains [13,14]. In RCTs, assessing health according to all three ICF domains is considered beneficial in determining efficacy of rehabilitation techniques in the different health domains. For example, including a measure from the Participation domain would provide information on whether the bio-psycho-social situation of people changes following the rehabilitation intervention. Gait rehabilitation interventions can improve not only walking abilities, classified in the ICF Activity domain, but also other aspects like strength, range of movement or spasticity, included in the Body function/Body structure domain, and aspects like self-esteem, social interaction or quality of life, included in the ICF Participation domain [10,15].
European Multiple Sclerosis rehabilitation recommendations [16] state that a comprehensive view of the pwMS status across all ICF domains is needed to provide adequate health care.
These recommendations emphasize selecting outcome measures according to the ICF framework in clinical trials on MS rehabilitation.
There is a need for a systematic literature review focusing on assessment methods used in clinical trials on gait rehabilitation interventions in pwMS in recent years. This would inform on which outcome measures are most used in the clinical and scientific community. If measures are quite common across all studies, this would indicate a good consensus in the field. Knowing which outcome measures are used in clinical trials is a first step that would help improve the design of future studies by identifying weaknesses and strong points in gait assessment procedures.
The first aim of this systematic review was to identify the commonly selected outcome measures in randomized controlled trials (RCTs) on gait rehabilitation interventions in pwMS.
Secondary aims were to identify which of the domains of the ICF are the most studied and to characterize how outcome measures are combined and adapted to MS severity.

Study design and search strategy
A systematic literature review was performed according to PRISMA guidelines 2009 [17] and following the recommendations provided in the Cochrane handbook for literature reviews [18].
The search was performed in the following databases: Medline using Pubmed interface, Cochrane Central, Embase and Scopus.
The search strategy included articles from January 2010 until February 2021, using the following key words and MeSH terms: ("Walking"
The literature search included manual scanning of the reference lists of the included articles. We limited the search (using database filters) to studies performed on human adults and published from 1/1/2010 to 28/02/2021. Two reviewers (L.S., A.R.-L.) independently performed the search and selection processes and identified which articles to include. Disagreements on whether to include a study were resolved by discussion with a third author (J.I.) until consensus was reached.

Study identification
Following the removal of duplicates with Refworks and verifying them manually, included studies were identified by first screening the title and abstract and, secondly, by full text screening.
Articles were included if they fulfilled the following inclusion criteria: i) randomized clinical trials regarding rehabilitation interventions to improve gait capacities in pwMS, ii) adult participants > 18 years old. Exclusion criteria included: i) literature reviews, ii) study protocols, iii) studies regarding the psychometric properties of outcome measures, iv) studies combining participants with other neurological diseases, v) studies evaluating specific rehabilitation interventions of other impairments (e.g. upper limb rehabilitation interventions, pelvic floor muscle rehabilitation interventions, memory rehabilitation interventions, swallowing rehabilitation interventions, balance specific rehabilitation interventions, vestibular rehabilitation interventions), if the aim of the intervention was not to improve gait capacities.

Data extraction
Full articles were reviewed for: year of publication, characteristics of the participants (age, disease severity according to EDSS, form of MS), type of rehabilitation intervention, number of participants and reported outcome measures.

Data analysis
The data have been analyzed using Microsoft Excel software.

Results
The electronic search yielded 88 articles in Pubmed, 90 in Cochrane Central, 363 in Embase and 258 in Scopus. The selection process is explained in Fig 1. Forty-six articles (shown in Table 1) fulfilled the selection criteria, involving a total of 1842 patients. Sixty-nine outcome measures were identified in the included RCTs; they are shown in Table 2. The summary of data collection is shown in Table 1.

Most commonly used outcome measures according to ICF levels
The wide range of outcome measures used across RCTs is depicted in Fig 2. The most used outcome measures were the 6-minute walking test and the Timed Up and Go test, followed by gait spatiotemporal parameters (GSTP).
Of the 69 outcome measures found, 20 assessed the Body function and Body structure domain, 35 assessed the Activity domain and 14 assessed the Participation domain of the ICF (see Fig 3). Of the studies, 17% assessed only one ICF domain, 44% included measures covering two ICF domains and only 39% included measures from all three ICF domains.
The Body structure/Body function domain was assessed in 80% of studies and the most used outcome measure to assess this domain was GSTP, used in 35% of RCTs. GSTP assessment, corresponding to ICF item b770 [15], was performed using different systems: nine studies used the GAITrite system, two used the Vicon system, one used the Smart-D BTS bioengineering system, two used the Qualisys motion system, one used the Gait-Real-time-Analysis-Interactive-Lab and one used 3D photogrammetry. All these systems provide GSTP, and some of them also provide kinematic parameters with information about displacement and range of movement of joints. Only 10% of the studied RCTs provided kinematic parameters.
In terms of GSTP, most studies (87%) reported gait speed, 67% of these studies reported cadence (steps/minute), 56% reported step length, and 37% analyzed stride length. Specific GSTP used in each study are reported in Table 3.
Fatigue, corresponding to the Body function/Body structure ICF item b4552 [15], is a cardinal symptom in MS that affects gait pattern and functioning. It was assessed in 39% of studies using four different scales: the Fatigue Severity Scale (15% of studies), the Fatigue Impact Scale (15% of studies), the Fatigue Scale for Motor and Cognitive Function (4% of studies), and the Wei-MUS scale (4% of studies).
The Activity domain was assessed in 91% of studies, which assessed walking capacities corresponding to ICF items d450 (walking) and d4609 (move around) [15]. Following the 6-minute walking test and the Timed Up and Go test used in 37% of studies, the Multiple Sclerosis Walking Scale-12 was used in 26% of studies and the Berg Balance Scale was used in 24% of studies. The Expanded Disability Status Scale (EDSS) for MS was used in 91% of the studies, for different purposes: only 13.33% used the EDSS to assess intervention efficacy, and 80% of the studies used the EDSS to classify the clinical status of the participants.
Participation and quality of life were assessed in 50% of studies, using 14 different scales. The most used outcome measure to assess this domain was the Multiple Sclerosis Impact Scale 29, used in 17% of the studies, followed by the Quality of Life Short Form 36, used in 6% of the studies.
How outcome measures are distributed according to ICF levels is described in Fig 3.
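The share of studies covering one, two, or three ICF domains can be tallied from a per-study list of outcome measures. The following Python sketch illustrates one way to do this. The measure-to-domain mapping is a small illustrative subset, the three studies are made up, and the function names are hypothetical; it is not the authors' extraction procedure, which used Microsoft Excel.

```python
from collections import Counter

# Illustrative subset of outcome measures mapped to ICF domains.
ICF_DOMAIN = {
    "GSTP": "Body function/structure",
    "Fatigue Severity Scale": "Body function/structure",
    "6-minute walking test": "Activity",
    "Timed Up and Go": "Activity",
    "MSIS-29": "Participation",
}

# Made-up studies for illustration only; not data from the review.
studies = {
    "study_A": ["6-minute walking test", "Timed Up and Go"],
    "study_B": ["GSTP", "6-minute walking test"],
    "study_C": ["Fatigue Severity Scale", "Timed Up and Go", "MSIS-29"],
}


def domains_covered(measures):
    """Return the set of ICF domains addressed by a study's measures."""
    return {ICF_DOMAIN[m] for m in measures}


def coverage_distribution(studies):
    """Percentage of studies covering one, two, or three ICF domains."""
    counts = Counter(len(domains_covered(m)) for m in studies.values())
    total = len(studies)
    return {n: round(100 * counts.get(n, 0) / total) for n in (1, 2, 3)}


print(coverage_distribution(studies))
```

With real extraction data, the same tally would show at a glance how many trials report measures from all three domains.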

Combination of outcome measures
How often outcome measures were combined with each other is shown in Fig 4. Four scales were combined as 'Minutes walked': 2-minute walking test, 3-minute walking test, 5-minute walking test, and 6-minute walking test. 'Meters walked' represents a combination of the 10-meter walking test and the Timed 25-foot walk test. 'Ms' represents a combination of muscle strength measures obtained with the Lokomat device, isokinetic dynamometers, mechanical devices, and static strength measures. The most common combinations of measures were between 'Meters walked' and 'Minutes walked' measures, used in 32% of studies (15 RCTs), and between 'Minutes walked' and Timed Up and Go, used in 24% of studies (11 RCTs).

PLOS ONE

The most common inter-domain combinations of measures were between the Fatigue Impact Scale (FIS) on the Body structure/Function level and the 'Minutes walked' measure on the Activity level (85% of studies using FIS), and between the Multiple Sclerosis Impact Scale (MSIS) on the Participation level and 'Minutes walked' on the Activity level (88% of studies using MSIS). GSTP assessment was complemented by other clinical mobility measures: 31% of studies using GSTP also used a measure of walking time (predominantly the 6-minute walking test) and 31% also assessed the Timed 25-foot walk test (meters walked; Fig 4). GSTP was less often combined with the Berg Balance Scale (three studies, 19%), the Multiple Sclerosis Walking Scale-12 (four studies, 25%) and the Timed Up and Go (two studies, 12%). GSTP was combined with muscle strength measurement in 19% of studies, but was rarely combined with fatigue measures (only one study, 6%, using the Fatigue Severity Scale), and was combined with quality of life or participation assessments in only two RCTs.

Fig 4 shows how outcome measures were combined in studies. Each combination is represented by a line between two scales: the thicker the line, the more often the two scales were used in the same RCTs.
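The pairwise combination counts that underlie a figure of this kind can be obtained by counting, for every pair of measures, the number of studies that used both. The Python sketch below shows this under stated assumptions: the three studies are made up, and `pair_counts` is a hypothetical helper, not the authors' method.

```python
from collections import Counter
from itertools import combinations

# Made-up studies for illustration only; not data from the review.
studies = [
    {"Minutes walked", "Meters walked", "Timed Up and Go"},
    {"Minutes walked", "Meters walked"},
    {"GSTP", "Minutes walked"},
]


def pair_counts(studies):
    """Count, for each unordered pair of measures, the studies using both."""
    counts = Counter()
    for measures in studies:
        # Sorting gives each pair a single canonical ordering.
        for pair in combinations(sorted(measures), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(studies)
for (a, b), n in counts.most_common():
    print(f"{a} + {b}: {n} of {len(studies)} studies")
```

Each count would correspond to the thickness of one line between two measures.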

Outcome measure selection adapted to severity of MS
We stratified studies according to clinical status and gait capacity of the participants to study whether this influenced selection of outcome measures. A score of 4.5 on EDSS has been used [65,66] to classify MS participants into those with mild walking disability (score <4.5) and moderate to severe (score >4.5) gait disturbance [67]. In 19 RCTs, including participants with severe gait disturbance according to EDSS, the Timed Up and Go was the most used outcome measure, used in 47% of studies, followed by the 6-minute walking test used in 42% of studies. In 22 RCTs with less affected participants, the most used outcome measure was GSTP used in

Discussion
This systematic review showed that the most used outcome measures in RCTs on gait interventions in MS were the 6-minute walking test and the Timed Up and Go test, followed by GSTP, and that the choice of outcome measures depended on the MS disease severity of participants. This study also highlights the large heterogeneity in the outcome measures used, and the fact that only 39% of the analyzed studies considered the three ICF domains in their assessment.

Gait spatiotemporal parameters and clinical assessments of gait
Assessments performed with technological devices to assess GSTP provide clinicians and researchers with accurate, objective information. The studied parameters included time or distance parameters such as stance duration, swing duration, stride length, gait cycle duration, cadence, velocity and normalized velocity [68]. One advantage of technological gait evaluation is that it yields specific and sensitive information about gait quality (e.g., lower limb movement symmetry, support phase symmetry) and gait pattern (e.g., spastic-paretic, ataxia-like, unstable gait) [69], making it possible to gauge the impact of the studied interventions on these aspects.
In the reviewed studies, the GSTP most often assessed with technological devices was gait speed. Other parameters like step length or support are not sensitive enough to detect changes in gait capacity across the EDSS spectrum of mobility [70].
In the included RCTs, GSTP were more frequently reported in studies on patients with mild disability according to the EDSS (score <4.5). GSTP were also often combined with clinical assessments of gait (Fig 4). Included RCTs have thus provided comprehensive evaluations of gait.
There is a growing tendency to use GSTP to assess gait capacities in RCTs. Nevertheless, studies on the psychometric properties of these methods are needed. This was already pointed out by Andreopoulou in 2019 [71], who stated that although 3D gait analysis is considered a "gold" standard, the psychometric properties of some of the measures provided by these technological systems have not been examined in pwMS. They studied the relative and absolute reliability of ankle kinematics and GSTP provided by the VICON system in a sample of 49 pwMS. Their results indicate good to excellent relative reliability of walking speed, step length and cadence. Psychometric properties of other systems like GAITrite have also been studied. Riis in 2020 [72] studied its convergent validity in a sample of 24 geriatric patients by examining correlations with the Berg Balance Scale, the DGI and the Timed Up and Go test, and found moderate correlations between GAITrite parameters and functional tests. Hochsprung in 2014 [73] compared GAITrite-provided GSTP with results of the Timed 25-foot walk test in a sample of 85 pwMS and concluded that the GAITrite system has the same clinical validity in gait evaluation as the Timed 25-foot walk test. Sosnoff in 2011 [74] studied the validity of the functional ambulatory profile (FAP) score from GAITrite in a sample of 13 pwMS. They found that this specific parameter strongly correlated with the EDSS and with walking performance (Timed 25-foot walk and Timed Up and Go tests), supporting the validity of this GAITrite measure. However, there is still a lack of knowledge about the psychometric properties of GSTP obtained using other technological systems.
The most used clinical scales for gait assessment in the Activity domain of the ICF were the 6-minute walking test, the Timed Up and Go test, the 10-meter walking test and the Timed 25-foot walk test. These clinical measures have good psychometric properties [75] and assess gait in a quantitative manner. The 6-minute walking test gives information about cardiopulmonary function and also about walking capacities; the Timed Up and Go test provides quantitative information about gait and functional capacities by assessing a sit-to-stand transfer from a chair followed by a 3-meter walk, a turn and a return to the sitting position, which also allows assessment of dynamic balance and gait stability; the Timed 25-foot walk test is a short-distance measure of walking speed; and the 10-meter walking test assesses a short-distance walk, allowing assessment of gait speed [76]. All these tests can be complementary to each other, giving information about different aspects of gait. However, it is difficult to compare the efficacy of interventions across RCTs when different outcome measures are used. This makes clinical decision making and the establishment of evidence-based guidelines challenging, particularly when meta-analyses are lacking.

Gait speed
Gait speed was the most commonly used GSTP and was also measured in clinical gait assessments. There is thus good consensus among clinical researchers to use gait speed to assess efficacy of gait rehabilitation interventions. Other authors also describe gait speed as a suitable outcome to assess differences in gait performance [70]. However, GSTP, the 10-meter walking test, the 2-minute and 3-minute walking tests, and the Timed 25-foot walk test assess gait speed in different ways. Gait speed over short distances is assessed in the 10-meter walking test and the Timed 25-foot walk test, while the 2-minute, 3-minute, 5-minute, and 6-minute walking tests assess gait speed and endurance over longer distances. Clinical scales and assessments with technological systems also differ in terms of the instructions provided to the subject or the required speed (maximal speed, comfort speed), with no standardized protocol for every technological system. Gait speed seems to be the parameter that researchers choose to assess gait rehabilitation interventions, assessing gait capacities in a quantitative manner. Although all trials include gait speed as an outcome measure, it is difficult to compare across clinical trials since testing procedures differed, e.g., distances covered and instructions provided were not the same. A consensus about the modalities of assessment of this parameter, including a standardized protocol for short- and long-distance testing, could help in comparing results across RCTs.
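To make the difference between test formats concrete, the following Python sketch derives mean gait speed from a fixed-distance test (time is measured) and from a fixed-duration test (distance is measured). The function names and the example performances are hypothetical and chosen only for illustration; they are not values from the reviewed trials.

```python
FEET_TO_METERS = 0.3048  # exact definition of the international foot


def speed_from_fixed_distance(distance_m, time_s):
    """Mean gait speed (m/s) for a fixed-distance test, e.g. the 10-meter walking test."""
    if time_s <= 0:
        raise ValueError("time must be positive")
    return distance_m / time_s


def speed_from_fixed_duration(distance_m, duration_min):
    """Mean gait speed (m/s) for a fixed-duration test, e.g. the 6-minute walking test."""
    if duration_min <= 0:
        raise ValueError("duration must be positive")
    return distance_m / (duration_min * 60)


# Hypothetical performances, for illustration only.
t25fw_speed = speed_from_fixed_distance(25 * FEET_TO_METERS, 6.0)  # 7.62 m in 6 s
six_mwt_speed = speed_from_fixed_duration(420.0, 6)                # 420 m in 6 min

print(f"Timed 25-foot walk: {t25fw_speed:.2f} m/s")
print(f"6-minute walking test: {six_mwt_speed:.2f} m/s")
```

Both results are expressed in m/s, but they are not interchangeable: the first reflects speed over a few seconds, while the second averages speed over a period long enough for endurance and fatigue to play a role.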
Although gait speed is one of the parameters that is affected in pwMS, decreasing as EDSS increases [69], one may ask whether improving gait speed in the performed tests really reflects an improvement in gait capacities. A less studied aspect, walking speed reserve (i.e., the difference between usual and fastest speed), could be important for the interpretation of RCT results. Gijbels in 2010 [77] found that the pace instructions provided influenced the gait speed of the participants. They also reported that the difference between comfortable self-induced walking pace and fastest possible walking speed decreases as the degree of ambulatory dysfunction increases. This means that in more affected patients the performed gait speed is not necessarily a reflection of their comfortable walking speed. Taking this discrepancy into account in RCTs on gait interventions could help in improving accuracy and identifying the efficacy of interventions on gait capacities.
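Walking speed reserve, as defined above, is a simple difference between two measured speeds. The short Python sketch below computes it; the function name and the two example participants are hypothetical and serve only to illustrate how the reserve shrinks when usual and fastest speeds converge.

```python
def walking_speed_reserve(usual_speed, fastest_speed):
    """Fastest minus usual walking speed, both in the same unit (e.g. m/s)."""
    if usual_speed <= 0 or fastest_speed <= 0:
        raise ValueError("speeds must be positive")
    return fastest_speed - usual_speed


# Hypothetical participants, for illustration only.
less_affected = walking_speed_reserve(usual_speed=1.0, fastest_speed=1.5)
more_affected = walking_speed_reserve(usual_speed=0.6, fastest_speed=0.7)

print(f"less affected: {less_affected:.2f} m/s")
print(f"more affected: {more_affected:.2f} m/s")
```

Reporting the reserve requires that both a usual-pace and a fastest-pace trial be recorded with explicit instructions, which is one reason standardized pace instructions matter.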

Fatigue
Fatigue is a cornerstone symptom in pwMS [78] that likely determines gait pattern and gait functionality in everyday life [79,80]. Our results show that 39% of studies assessed this aspect, using four different scales. Knowing which gait rehabilitation intervention minimizes this symptom is central for optimal clinical decision making. Few studies combined GSTP evaluation with measures of fatigue. This highlights a gap in previous research priorities in RCTs on gait interventions. Fatigue interacts with GSTP; for example, fatigue can be reflected by changes in stride length, gait velocity and stride time [81]. Future RCTs should therefore combine GSTP and fatigue measurements for a more complete mechanistic understanding.

Participation
Reducing restriction in Participation and obtaining good quality of life is the overall objective of rehabilitation interventions. Quality of life questionnaires provide useful information about this aspect that is identified by therapists as one of the goals of their therapies [82]. However, Participation was not systematically assessed (only 50% of studies assessed it) and there was considerable heterogeneity in the choice of outcome measures, with 14 different outcome measures for assessing Participation. Assessing this aspect more frequently in RCTs on gait interventions is recommended since this review showed a lack of consensus among researchers on the need to assess this aspect and on which measure to select. Improved consensus here would make it possible to compare the effects of rehabilitation interventions on quality of life across studies more easily.
In our findings, GSTP were combined with Participation assessments in only two studies, showing that most RCTs that focus on objective and fine assessment of gait parameters do not consider the repercussions of the studied intervention on the patient's specific life context. It is important that future studies on gait interventions combine these measures, both to extend results to the quality of life of pwMS, which is the final objective of rehabilitation interventions, and to enable a more comprehensive understanding of intervention effects.

Gait capacities characterized by EDSS
EDSS is widely used for defining participant characteristics [65,66,83] and in our results, we observed that different outcome measures were used depending on gait capacities assessed by the EDSS.
Assessment with the EDSS has many limitations [84], and assessments capable of compensating for these limitations are needed when assessing gait capacities. Some outcome measures can be challenging for patients with a high EDSS, while others may not be sensitive enough to assess changes in pwMS with high gait capacities. GSTP, for example, were more frequently used in less affected pwMS with a lower EDSS, who need a fine assessment to detect changes in gait, since other tests like the Timed Up and Go test can have ceiling effects and would not be responsive enough to changes due to rehabilitation interventions. In contrast, the Timed Up and Go test, which provides information about gait over short distances and functional aspects like transfers, was used in more affected patients with higher EDSS scores.
Regarding GSTP in pwMS, absolute and relative reliability of GSTP have been studied [71] in populations with lower (0-3.5) and higher (4-6) EDSS scores, and this study showed that higher walking disability in pwMS was associated with higher within-subject variability. These results are consistent with our review findings showing that clinical researchers less often chose this kind of assessment in pwMS with lower gait capacities.

Measuring across ICF domains
Comprehensive assessment, with outcome measures spanning all the ICF domains, is recommended by the European recommendations on MS rehabilitation (RIMS) [16] and by the International Consensus Conference on ICF core sets in MS [15]. A recent study about goal setting and assessment according to the ICF in MS points out the need to use ICF Core Sets and standardized outcome measures for evaluation at the different ICF levels, both in clinical practice and in research [82]. This multidimensional assessment can give information about the efficacy of gait interventions on the global status of pwMS and not only about one specific component. As our results show, only 39% of the analyzed clinical trials considered the three domains of the ICF. Covering all ICF domains more systematically in studies will be useful for comparing the global efficacy of physical interventions among studies. Combining Participation measures with GSTP would make it possible to answer whether gait interventions that improve quality of gait also enhance the quality of life of pwMS. Assessment using the ICF framework has also been recommended in other neurological diseases like Parkinson's [85] and stroke [86], and in pediatric pathology [87].
There are some authors that have already pointed out the need to refine the assessment in MS clinical trials, alluding to the need for multidimensional measures in order to allow full coverage of disease progression and the value of technological measures [10,80]. Nonetheless, our results point to a lack of consensus among researchers as to the best outcome measures to assess gait performance in all ICF domains after gait rehabilitation interventions in MS.

Implications for research
There are literature reviews about measurement properties of gait assessment in people with MS [88], and some authors have been interested in studying psychometric properties of specific technological devices for assessment in MS [11]. However, there is still a lack of knowledge of psychometric properties of all technological devices used to assess GSTP in pwMS.
There is a clear need for a systematic review evaluating measurement properties of gait assessment in people with MS, including all technological systems used for assessing GSTP, to recommend specific outcome measures for future studies.

Limitations of the study
In this review we only included RCTs. Data from longitudinal or cross-sectional studies were not included.
We have analyzed the influence of gait capacities on the choice of outcome measures, but we have not analyzed whether the type of MS can influence this choice.
Neither have we analyzed whether the sample of participants in studies could influence the choice of outcome measures.
Another limitation is that we have only included studies on rehabilitation interventions if the aim of the study was to improve gait capacities. Rehabilitation interventions such as balance interventions, vestibular-specific interventions or exercise interventions, which focus on improving specific aspects other than gait capacities but can influence gait performance, are not included in this review.

Conclusion
Assessment in pwMS poses a great challenge due to the heterogeneity of symptoms and the progressive changing status of pwMS. This systematic literature review highlights the heterogeneity in choice of outcome measures used in RCTs on gait interventions and the lack of systematic assessment across the whole ICF spectrum. Improved consensus in assessment across studies would help clinicians and researchers interpret results of rehabilitation interventions and facilitate meta-analyses to compare results across studies [18]. Assessment of the whole ICF spectrum is needed to determine which gait interventions are the most efficient ones to improve capacities at Body structure and Body function, Activity, and Participation levels. A growing consensus was identified for the use of GSTP to evaluate the effects of gait interventions. These measures were often combined with clinical gait, mobility, and balance measures. However, GSTP were rarely combined with measures of fatigue or Participation, highlighting an important gap in research knowledge. Continued efforts are needed to move forward in establishing consensus on selection of outcome measures in clinical trials on gait interventions in MS and assessing psychometric properties of commonly used assessment methods.