
Physical measures of physical functioning as prognostic factors to predict outcomes in low back pain: A systematic review and narrative synthesis

  • Rameeza Rashed ,

    Roles Conceptualization, Formal analysis, Methodology, Writing – original draft, Writing – review & editing

    rrashed2@uwo.ca

    Affiliation Health and Rehabilitation Sciences Graduate Program and School of Physical Therapy, London, Ontario, Canada

  • Afieh Niazigharemakher,

    Roles Formal analysis

    Affiliation Health and Rehabilitation Sciences Graduate Program and School of Physical Therapy, London, Ontario, Canada

  • David Walton,

    Roles Conceptualization, Methodology, Supervision, Validation, Writing – review & editing

    Affiliation Health and Rehabilitation Sciences Graduate Program and School of Physical Therapy, London, Ontario, Canada

  • Katie Kowalski,

    Roles Conceptualization, Methodology, Supervision, Validation, Writing – review & editing

    Affiliation Health and Rehabilitation Sciences Graduate Program and School of Physical Therapy, London, Ontario, Canada

  • Alison Rushton

    Roles Conceptualization, Methodology, Supervision, Validation, Writing – review & editing

    Affiliation Health and Rehabilitation Sciences Graduate Program and School of Physical Therapy, London, Ontario, Canada

Abstract

Background

Low back pain (LBP) remains a major global health challenge. Effective management of LBP requires prognostic research to identify people at risk of poor outcome, enabling timely and targeted interventions.

Objective

To synthesize the evidence for physical measures of physical functioning as prognostic factors for predicting outcome in LBP.

Methods

This systematic review followed PRISMA and a published protocol [PROSPERO-CRD42023406796] [1]. Searches were conducted in MEDLINE, EMBASE, CINAHL, Scopus and ProQuest Dissertations/Theses from inception to 29/5/2024. Hand searches of key journals and screening of the reference lists of included studies were performed. Prospective longitudinal studies evaluating physical measures of physical functioning as prognostic factors in adults (≥18 years) with LBP and/or LBP-related leg pain were included. LBP related to malignancy, fracture, infection, cauda equina, or inflammatory conditions was excluded, as were measures based on imaging, EMG, and motion capture with force plates or 3D video analysis. Two independent reviewers screened articles, extracted data, and assessed risk of bias (RoB) using QUIPS. Due to high heterogeneity, a narrative synthesis was conducted and GRADE was used to determine the quality of evidence.

Results

From 15,889 citations, 42 studies were included, with 50% assessed as high RoB. Low-quality evidence supports no predictive ability of high isometric back extension endurance, high handgrip strength, and high fingertip-to-floor test for good long term LBP outcomes. Very low-quality evidence supports inconsistent predictive ability of high lumbar extension range of motion and high straight leg raise range for good short-term outcomes, and high isometric back flexion endurance for good long-term LBP outcome. For studies that could not be synthesized, 41 physical measures of physical functioning were investigated, with 23 of them showing promising predictive ability for LBP outcome.

Conclusion

This review highlights a lack of high-quality evidence regarding the predictive ability of physical measures of physical functioning in LBP. Findings indicate that the existing evidence is low-quality for no predictive ability and very low-quality for inconsistent predictive ability of physical measures of physical functioning. Low/very low-quality evidence suggests cautious interpretation. Imprecision, high RoB studies, and inadequately controlled confounding factors contributed to low/very low-quality evidence. This review also identifies emerging potential prognostic factors. An adequately powered, low RoB prospective longitudinal study using standardized measurement protocols and multivariable analysis is required to further investigate the promising predictive ability of physical measures of physical functioning in LBP. Future prognostic research should be grounded in strong theoretical rationale, including biological plausibility.

Introduction

Low back pain (LBP) remains a major global health challenge, ranking among the top causes of years lived with disability [1,2]. Its impact extends beyond physical and mental health, imposing significant economic burdens [3–5]. Chronic LBP leads to ongoing medical expenses and indirect costs, such as lost work productivity [6]. Prognostic research identifies people at risk of poor outcome. Stratification based on prognostic factors facilitates personalized treatment plans, which could enhance effective LBP management [7]. A prognostic factor is any indicator that can predict a subsequent health outcome and provides insights into the likely progression of a condition [8,9]. However, prognostic factors may not directly cause the outcome; rather, they can be markers or indicators of risk without being part of the causal pathway [10]. Causal factors always have some predictive value, but prognostic factors do not necessarily represent underlying causes. This study focuses exclusively on identifying and synthesizing prognostic factors that can predict LBP outcomes, rather than establishing causation.

Physical functioning is a fundamental aspect of health, defined by the Core Outcome Measures in Effectiveness Trials (COMET) Initiative as the impact of a disease or condition on physical activities of daily living, such as walking and self-care [11,12]. It is recognized as a multidimensional construct encompassing several interconnected domains, including bodily structures and functions, performance of physical activities, as well as social and role-related participation [13]. Limitations in one domain may impact others, contributing to a decline in quality of life (QOL) [14].

Physical functioning can be assessed in different ways: through standardized self-report, such as the physical functioning subscale of the Short-Form 36 (SF-36) [15,16]; through direct observation by a rater (e.g., 6-minute walk test) [17]; or through quantification in real-world settings using wearable devices such as accelerometers [18]. Each offers different insights into physical function, such as the patient’s own self-perceptions in the case of self-rating scales, or activity in ecological settings in the case of accelerometers. The Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (IMMPACT) recommends using direct observation/quantification of activity in addition to participant self-report for a fuller evaluation of a participant’s physical function [18]. In this systematic review, we focus on physical measures of physical functioning that can predict outcomes in LBP.

Existing literature on prognostic factors in LBP has addressed a range of variables, e.g., psychological, personal and work-related factors [19,20]. However, there is a gap for the comprehensive investigation of physical measures of physical functioning. To date, two systematic reviews have investigated these factors. Hartvigsen et al. included physical measures evaluating physical functioning limited to low-tech clinical tests, and reported inconsistent evidence for various prognostic factors [21]. Verkerk et al. investigated a variety of prognostic factors, but did not comprehensively include physical measures of physical functioning. Their focus was solely on muscle endurance, strength, and aerobic capacity [22]. Both reviews also exhibited some methodological limitations, which contributed to low AMSTAR-2 criteria scores [23]. The AMSTAR-2 assessment for both reviews is provided in S1 File.

Despite extensive research on prognostic factors in LBP [24] there remains a significant gap in understanding the role of physical measures of physical functioning as prognostic factors for predicting LBP outcomes. Therefore, the purpose of this study was to comprehensively assess how physical measures of physical functioning can predict LBP outcomes.

Objective

To synthesize the evidence for physical measures of physical functioning as prognostic factors predicting outcomes in the LBP population.

Methods

Design

This systematic review was designed using the PRISMA statement [25] and Cochrane Handbook [26]. It is registered in PROSPERO (CRD42023406796) and follows a published protocol [27]. Our protocol initially included only English-language studies. However, with advancements in AI translation, we translated non-English articles and validated the accuracy of the translations with a bilingual individual familiar with the subject matter.

Eligibility criteria (Informed by PICOS framework)

Inclusion and exclusion criteria informed by PICOS are summarized in Table 1.

Table 1. Eligibility criteria (Informed by PICOS framework).

https://doi.org/10.1371/journal.pone.0335535.t001

Physical measures of physical functioning are categorized as the following:

  1. Impairment-based measures: evaluating structure or function of a specific body part or system (e.g., range of motion) [15].
  2. Performance-based measures: evaluating performance on a defined task in a standardized environment (e.g., 6-minute walk test) [18].
  3. Activity in natural environment/real-world: evaluating activity in the natural environment (e.g., accelerometry) [18].

Information sources

A comprehensive search was performed from inception to May 29, 2024 on MEDLINE, EMBASE, CINAHL, and Scopus. Grey literature was searched using Open Grey System and ProQuest Dissertations and Theses. Hand searches of key journals (Spine, European Spine Journal, The Spine Journal) and screening of the reference lists of included studies were also performed.

Search strategy

The search strategy was developed in collaboration with a research librarian around the constructs of LBP, physical measures of physical functioning and prognostic factors. Search terms were informed by the “National Institute for Health and Care Excellence guidelines for LBP and sciatica in adults over the age of 16 year” [28], a previous systematic review [29], and a search filter developed to identify prognostic factor studies [30]. The search strategy developed in MEDLINE, which was adapted for use in the other databases, is provided in S2 File.

Study selection process

The citations retrieved from searches were imported and archived into Covidence. This software detected and removed duplicate records. Based on the eligibility criteria, two authors (RR/AN) independently screened titles and abstracts, followed by full-text screening. Discrepancies were resolved through discussion, and it was planned to consult a third reviewer (AR) if consensus was not achieved. Inter-rater reliability was assessed using Cohen’s Kappa [31,32].
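The distinction between raw agreement and Cohen’s Kappa can be illustrated with a minimal sketch. The function names and the screening decisions below are hypothetical and are not the review’s data; the sketch only shows why a high percentage agreement can coexist with a modest Kappa when most records are excluded by both reviewers.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of records on which both raters made the same decision."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions per category.
    p_expected = sum(
        counts_a[c] * counts_b[c] for c in set(counts_a) | set(counts_b)
    ) / n ** 2
    if p_expected == 1.0:
        # Both raters used one identical category throughout.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical screening decisions for 10 records (illustrative only).
reviewer_1 = ["include"] + ["exclude"] * 3 + ["include"] + ["exclude"] * 5
reviewer_2 = ["include"] + ["exclude"] * 8 + ["include"]

print(f"agreement = {percent_agreement(reviewer_1, reviewer_2):.2f}")
print(f"kappa     = {cohens_kappa(reviewer_1, reviewer_2):.3f}")
```

In this made-up example the reviewers agree on 8 of 10 records, but because both exclude most records, much of that agreement is expected by chance and Kappa is considerably lower than the raw agreement.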

Data extraction process and items

Data were extracted by two independent authors using a standardized data extraction form, the Checklist for Critical Appraisal and Data Extraction for systematic reviews of prognostic factor studies (CHARMS-physical functioning) [33], which is an adapted version of the CHARMS checklist for primary studies of prediction models [9,34]. To ensure the reliability and feasibility of this modified form, pilot testing was performed. Data items extracted from each study were: LBP characteristics, participants, potential prognostic factors, outcome measure, and results. For missing data, 5 authors were contacted via email as specified in the protocol [27]; responses were received from only 2 authors [35,36].

Risk of bias (RoB) in individual studies

To evaluate RoB, two independent authors used the QUIPS tool [37], recommended by the Cochrane Collaboration for assessing RoB in prognostic studies [38]. The inter-rater reliability of QUIPS has been demonstrated to be acceptable, and previous studies have used QUIPS successfully in prognostic reviews [39,40]. It consists of multiple prompting items categorized into six domains: study participation, study attrition, prognostic factor measurement, outcome measurement, study confounding, and statistical analysis and reporting. Each domain is graded as low, moderate, or high risk of bias. Each study’s overall RoB assessment was determined based on the original QUIPS article and supporting studies [37,41,42]. Overall classification was low RoB if all domains were graded as low or one as moderate; high RoB if any domain was high or ≥3 were moderate; all studies in between were classified as moderate RoB [41]. Details of the domains are provided in S3 File.
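The overall classification rule described above can be written as a short decision function. This is an illustrative sketch only: the function name, the dictionary layout, and the example ratings are assumptions made for illustration, while the thresholds are taken from the rule as stated in the text.

```python
QUIPS_DOMAINS = (
    "study participation",
    "study attrition",
    "prognostic factor measurement",
    "outcome measurement",
    "study confounding",
    "statistical analysis and reporting",
)


def overall_rob(domain_ratings):
    """Combine six QUIPS domain ratings into one overall RoB rating.

    Rule as stated in the text:
      high     -> any domain rated high, or three or more rated moderate
      low      -> all domains low, or exactly one moderate
      moderate -> everything in between (two moderate, none high)
    """
    missing = [d for d in QUIPS_DOMAINS if d not in domain_ratings]
    if missing:
        raise ValueError(f"missing domain ratings: {missing}")
    ratings = [domain_ratings[d] for d in QUIPS_DOMAINS]
    if any(r not in ("low", "moderate", "high") for r in ratings):
        raise ValueError("each rating must be 'low', 'moderate' or 'high'")

    n_high = ratings.count("high")
    n_moderate = ratings.count("moderate")
    if n_high >= 1 or n_moderate >= 3:
        return "high"
    if n_moderate <= 1:
        return "low"
    return "moderate"


# Hypothetical study: two domains rated moderate, the rest low.
example = dict.fromkeys(QUIPS_DOMAINS, "low")
example["study attrition"] = "moderate"
example["study confounding"] = "moderate"
print(overall_rob(example))
```

Under this rule the only route to an overall moderate rating is exactly two moderate domains with no high domain, which makes the classification fairly strict.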

Data synthesis and GRADE assessment

In line with the published protocol of this systematic review [27] and consistent with Cochrane handbook [26], narrative synthesis was planned a priori for circumstances of substantial heterogeneity. Due to high clinical, methodological and statistical heterogeneity [43], data were not pooled quantitatively. Clinically there was variability in LBP population characteristics, coexisting conditions, outcomes and follow up timepoints. Methodologically, almost half of the included studies were at high RoB, while others were at moderate or low RoB. Statistical heterogeneity was high as indicated by I² values >50% and reflected in wide variation in effect estimates across studies. This precluded a meaningful quantitative synthesis (meta-analysis), so a narrative synthesis was conducted [40]. According to the Cochrane Handbook, conducting a meta-analysis in the presence of substantial heterogeneity can produce misleading or clinically meaningless results and reduce the interpretability of the findings [44].

Narrative synthesis was based on prognostic factors and their significant/nonsignificant associations with different outcomes in relation to follow-up time points, guided by the Cochrane Consumers and Communication Review Group. Prognostic factors and outcomes at short-term (<3 months), medium-term (≥3 months to <12 months), and long-term (≥12 months) were grouped and summarised when examined in ≥2 studies. The presence/absence and direction of an association between prognostic factors and outcomes at a given time point were reported. If two studies reported the association in the same direction, findings were considered consistent. Findings were considered inconsistent if studies reported associations in different directions or if they differed in statistical significance (achieved vs. not achieved), particularly when confidence intervals were not reported and only p-values were provided. Studies examining the same prognostic factor and outcome were narratively synthesized because high clinical, methodological and statistical heterogeneity made a meaningful meta-analysis impossible. Bivariate analyses, odds ratios, beta coefficients, likelihood ratios, p-values, confidence intervals (CI), chi-square tests, narrative statements and multivariable analyses were reported as in the original studies.
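The grouping and consistency rules above can be sketched as follows. The `Finding` record, the function names, and the example studies are hypothetical; only the follow-up thresholds, the two-study minimum, and the consistency rule come from the text, and follow-up is assumed to be expressed in months.

```python
from collections import defaultdict
from typing import NamedTuple


class Finding(NamedTuple):
    study: str
    measure: str
    outcome: str
    follow_up_months: float
    direction: str      # direction of the reported association
    significant: bool   # whether statistical significance was achieved


def follow_up_category(months):
    """Map follow-up time to the three windows used in the synthesis."""
    if months < 0:
        raise ValueError("follow-up time cannot be negative")
    if months < 3:
        return "short-term"
    if months < 12:
        return "medium-term"
    return "long-term"


def synthesize(findings):
    """Label each measure/outcome/time-window group examined in >= 2 studies."""
    groups = defaultdict(list)
    for f in findings:
        key = (f.measure, f.outcome, follow_up_category(f.follow_up_months))
        groups[key].append(f)

    summary = {}
    for key, items in groups.items():
        if len({f.study for f in items}) < 2:
            continue  # single studies are not synthesized
        same_direction = len({f.direction for f in items}) == 1
        same_significance = len({f.significant for f in items}) == 1
        consistent = same_direction and same_significance
        summary[key] = "consistent" if consistent else "inconsistent"
    return summary


# Hypothetical findings (not taken from the included studies).
findings = [
    Finding("Study A", "back extension endurance", "pain", 12, "positive", False),
    Finding("Study B", "back extension endurance", "pain", 30, "positive", False),
    Finding("Study C", "SLR-ROM", "disability", 0.5, "positive", True),
    Finding("Study D", "SLR-ROM", "disability", 1, "positive", False),
    Finding("Study E", "gait speed", "disability", 6, "positive", True),
]
for key, label in synthesize(findings).items():
    print(key, "->", label)
```

Note that the sketch treats a difference in statistical significance as inconsistency even when the direction matches, mirroring the rule in the text.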

Cumulative evidence was assessed by two authors independently, using the modified GRADE proposed by Huguet et al. for prognostic factor research [45]. Cochrane recommends using GRADE to assess the quality of evidence in systematic reviews, including those with narrative syntheses when meta-analysis is not possible [46]. It is also recommended that narrative syntheses provide structured summaries of findings using GRADE assessments to help interpret confidence in the evidence. The modified GRADE consists of six domains (phase of investigation, study limitations, inconsistency, indirectness, imprecision, publication bias) that determine certainty of evidence. In the modified GRADE for prognostic studies, quality of evidence can be downgraded due to 5 factors (study limitations, inconsistency, indirectness, imprecision, and publication bias) and upgraded by 2 factors (moderate/large effect sizes, e.g., SMD 0.5–0.8, OR 2.5–4.25, and an exposure–response gradient). Longitudinal designs are standard for prognostic research, so study design is not a significant feature in the modified GRADE [47]. The GRADE criteria used for determining the quality of evidence are provided in S4 File.

Reporting bias

Reporting bias was evaluated by checking consistency between study protocols and published articles where available. Information on study protocols was obtained from the included studies.

Results

Study selection

The search of 4 databases identified 15,889 citations and an additional 1,179 were identified from other sources. After removal of duplicates, 13,295 articles were screened by title and abstract, followed by full-text screening of 488 articles. A total of 42 studies were included, with 2 articles by Nordeman et al. (2014, 2017) reported as 1 study. There were 9 non-English studies identified: 2 each in Japanese, German and Turkish, and 1 each in French, Spanish and Chinese. The list of non-English studies is provided in S5 File. Non-English studies were translated using open-source software, Chat Generative Pre-Trained Transformer, and the validity of the translation was cross-checked by a bilingual individual familiar with the subject matter. A PRISMA flow diagram [25] in Fig 1 shows details of identified citations, selection and reasons for exclusion. Reasons for exclusion at the full-text screening stage are detailed in S6 File for each article. Inter-rater reliability between reviewers was 95.7% for the title and abstract stage and 80.2% for the full-text screening stage, with Cohen’s Kappa indicating fair agreement [48]. After discussion, disagreements were resolved and complete agreement was achieved for each stage of screening.

Study characteristics

The 42 included studies were published between 1989 and 2023 in 16 different countries, with the greatest number of studies coming from the United States (n = 8). Follow-up time ranged from 4 days to 30 months. The total number of participants was 6,808, with sample sizes ranging from 24 to 675 participants. A total of 47 physical measures of physical functioning were assessed in the included studies. Details of included studies are provided in Table 2. Among the 42 studies, a total of 17 different outcomes were predicted. Disability was the most frequently evaluated outcome (15 studies), followed by pain (13 studies), return to work (6 studies), and non-recovery (2 studies). Another 13 outcomes were assessed, each in only one study.

RoB within studies

Of the 42 included studies, 21 (50%) were assessed as high RoB, 12 (29%) as low RoB, and 9 (21%) as moderate RoB (Table 3). The “study confounding” domain was the one most frequently rated as high RoB (13 studies), due to not accounting for confounding factors in the analysis. The “study participation” domain was rated as low RoB in 33 studies, where the study samples fully represented the populations of interest. Details of each QUIPS domain, with reasons for high or low RoB ratings, are provided in S3 File.

Results per physical prognostic factor of physical functioning

A total of 47 physical measures of physical functioning were investigated. Fingertip to floor test (FTF) was the most frequently evaluated measure in 7 studies followed by the back extension endurance test in 6 studies. Due to heterogeneity, only 6 measures could be synthesised across studies (≥2 studies) using GRADE (Table 4). Findings of 5 measures were based on bivariate analysis, while 1 measure was reported based on multivariable analysis. Overall quality of evidence using GRADE is shown in Table 5.

Table 5. Adapted grading of recommendations assessment, development and evaluation (GRADE).

https://doi.org/10.1371/journal.pone.0335535.t005

Impairment-based measures

Consistent findings.

Fingertip to floor distance (FTF) with pain at long-term: Low-quality evidence (2 low RoB studies [49,50]) supports no statistically significant association between higher FTF (distance from the tip of the middle finger to the floor in flexion) and improved pain intensity (Visual Analogue Scale, VAS, or Numeric Pain Rating Scale, NRS) at 12 months. Therefore, low-quality evidence supports that higher FTF does not predict improved pain intensity long term.

Handgrip strength with disability at long-term: Low-quality evidence (2 low RoB studies [51,52]) supports no statistically significant association between higher handgrip strength (maximum force exerted by hand muscles using a hand-held dynamometer) and improved disability (Roland-Morris Disability Questionnaire, RMDQ) at 12 and 24 months. Therefore, low-quality evidence supports that higher handgrip strength does not predict improved disability long term.

Inconsistent findings.

Lumbar extension range of motion (ROM) with disability at short-term: Very low-quality evidence (2 high RoB studies [35,53]) supports inconsistent statistically significant associations between higher lumbar extension ROM (distance between marks 10 cm above and 5 cm below the lumbosacral junction during active extension) and improved disability (Oswestry Disability Index-ODI and a self-reported disability questionnaire) at 2 and 4 weeks. Therefore, very low-quality evidence supports that higher lumbar extension ROM does not consistently predict improved disability short term.

Straight Leg Raise-Range of Motion (SLR-ROM) with disability at short-term: Very low-quality evidence (2 high RoB studies [54,55]) supports inconsistent statistically significant associations between SLR-ROM (degrees of hip flexion achieved through passive leg raising with a fully extended knee) and improved disability (Oswestry Disability Questionnaire-ODQ) at 2 and 4 weeks. Therefore, very low-quality evidence supports that higher SLR-ROM does not consistently predict improved disability short term.

Performance-based measures

Consistent findings.

Isometric back extension endurance with pain at long term: Low-quality evidence (1 low and 1 moderate RoB studies [49,56]) supports no statistically significant association between higher isometric extension endurance (maximum time holding a prone position with trunk horizontal off the edge of a bench) and improved pain intensity (VAS and NRS) at 12 and 30 months. Therefore, low-quality evidence supports that higher isometric back extension endurance does not predict improved pain intensity long term.

Inconsistent findings.

Isometric back flexion endurance test with pain at long-term: Very low-quality evidence (1 moderate and 1 low RoB studies [53,61]) supports inconsistent statistically significant associations between higher isometric back flexion endurance (maximum time holding the supine position with head and shoulders lifted) and improved pain intensity (NRS and VAS) at 12 and 30 months. Therefore, very low-quality evidence supports that higher isometric back flexion endurance does not consistently predict improved pain long term.

Measures of activity in natural environment

No measures of activity in natural environment were synthesized due to wide variability of outcomes and follow-up time points across included studies.

Potential prognostic factors in single studies

Across single studies, 23 of 41 physical measures of physical functioning showed statistically significant associations with outcomes, indicating potential predictive ability. A list of these measures in single studies is provided in S7 File. Twenty-two measures showed that higher measurement values predicted a good outcome. In contrast, only one measure, ‘time spent sedentary’, indicated that less sedentary time predicted a good outcome.

Reporting bias

Three studies [57–59] were aligned with their registered protocols and 4 studies [55,60–62] had institutional review board-approved but unregistered protocols. No information was found for the remaining studies, contributing to risk of reporting bias.

Discussion

The objective of this systematic review was to synthesize the evidence for physical measures of physical functioning as prognostic factors for predicting outcomes in the LBP population. The narrative syntheses highlighted the low-quality evidence for no predictive ability of higher FTF and higher isometric back extension endurance for improved pain intensity long term, and higher handgrip strength for improved disability long term. Very low-quality evidence supported inconsistent predictive ability of higher lumbar extension ROM and higher SLR-ROM for improved disability short term, and higher isometric back flexion endurance for improved pain intensity long term. Single studies identified 23 potential prognostic factors showing promising predictive ability for LBP outcomes. Variability in prognostic factors, outcomes and follow up timepoints hindered a comprehensive narrative synthesis.

Consistent findings of no predictive ability

Low-quality evidence for no predictive ability of isometric back extension endurance for long-term pain aligns with the findings of a previous systematic review of Hartvigsen et al. [21] who found no association of isometric back extension endurance with short-term and long-term outcomes of disability and leg pain. Similarly, current review findings regarding the FTF test showing no predictive ability for long term pain, are consistent with the previous review findings [21]. Both reviews included prospective longitudinal studies; though the previous review was limited to the studies up to June 2012, the current systematic review includes studies from inception to May 2024 and also applies the GRADE approach, providing updated evidence and a more rigorous quality assessment. In this review, GRADE indicated that the low quality of supporting evidence is primarily due to imprecision, such as unreported or wide confidence intervals in the study results.

Low-quality evidence suggests no predictive ability of handgrip strength (HGS) for LBP outcomes in this review. To the best of our knowledge, no prior systematic review has explored the predictive ability of HGS in the LBP population. However, HGS has been shown to have predictive value in other populations. An umbrella review [64] demonstrated the predictive ability of HGS for various health outcomes in cardiovascular disease, chronic kidney conditions, diabetes, and the general population. This contrast between the low-quality evidence of no predictive ability in LBP and the highly suggestive (class 2 evidence) predictive value of HGS in other health conditions highlights the need for high-quality evidence in the LBP population to better understand the predictive ability of HGS.

While these consistent findings suggest no predictive ability across the different physical measures of physical functioning, it is important to note that the findings are supported by low-quality evidence, so caution is required to interpret these findings. To address this, future research should focus on generating high quality evidence by conducting an adequately powered prospective longitudinal study. This will help to understand the predictive ability of these measures in LBP.

Inconsistent findings of predictive ability

Very low-quality evidence supported inconsistent predictive ability of passive SLR-ROM and lumbar extension ROM for short-term outcomes, and of isometric back flexion endurance for long-term outcomes. In this review, included studies used passive SLR-ROM to assess passive hip flexion ROM, but its predictive ability was unclear due to inconsistent findings. The previous systematic review [21] also found that hip flexion ROM did not show a consistent association with short- and long-term outcomes in LBP. In both systematic reviews, included studies used different methods of measuring hip flexion ROM, which may have contributed to the inconsistent results.

In this review, very low-quality evidence for spinal extension ROM demonstrated inconsistent predictive ability for disability in the short term, which is similar to the findings of a review [63] that found low-quality evidence for no consistent relationship between changes in lumbar extension ROM and changes in pain or activity limitation. The current review covers the broader LBP population, whereas the previous review [63] was restricted to nonspecific LBP. Similarly, in this review, very low-quality evidence for back flexion endurance demonstrated inconsistent predictive ability for long-term outcomes, mirroring the inconsistent associations reported for long-term outcomes in the previous review [21]. Despite both reviews including the same study design (prospective longitudinal studies), the very low quality of supporting evidence in this review is primarily attributed to high RoB studies in the synthesized evidence. Inconsistency in predictive ability may be due to methodological differences across included studies, differing measurement techniques (for example, lumbar ROM measured by flexicurve or inclinometer), imprecision, inadequately controlled confounding factors, and inconsistent reporting of univariable and multivariable analyses. Considering these challenges, future research should focus on conducting a low RoB prospective longitudinal study that accounts for potential moderating variables and ensures consistent standards for test application and outcome measurement, with more consistent reporting of univariable and multivariable analyses, to produce the required high-quality evidence.

Single studies suggesting predictive ability of potential prognostic factors

This review highlights significant heterogeneity among single studies in outcomes, follow-up timepoints and measurement methods of prognostic factors, which hindered a comprehensive synthesis of findings. The diversity across single studies emphasizes the need for standardized approaches (standardized protocols and methodologies) to enable consistent comparisons in future research. Out of the 41 prognostic factors identified across the single studies, 23 emerging physical measures of physical functioning showed promising potential as predictors of outcomes. Notably, half of the studies investigating performance-based measures and measures of physical activity in the natural environment had a moderate to low risk of bias, highlighting an increasing trend towards high-quality research investigating these physical measures in LBP. Given this, further research on these promising performance-based measures and natural activity measures is needed, through a low RoB prospective longitudinal study, to establish their predictive value in LBP.

Strengths and limitations of review

The strengths of this review are its rigorous conduct and reporting in accordance with PRISMA, Cochrane and AMSTAR. Inclusion of prospective longitudinal studies, the gold standard for prognostic research, enabled optimal synthesis of existing evidence for predictive ability of physical measures of physical functioning [23]. While screening the articles no restrictions such as language or publication date were applied. Every step was performed in duplicate with two independent reviewers. There are however some limitations that reflect the weak body of evidence that currently exists. The majority of studies in the narrative synthesis were limited to bivariate analysis, despite multivariable analysis being a powerful tool as it captures complex interplay between variables [64]. Also, wide variability across included studies for follow up times and outcomes hindered a comprehensive narrative synthesis.

In this review, applying the modified GRADE tailored for prognostic factor research provided a structured and transparent way to assess and communicate the quality of evidence. However, we acknowledge that GRADE is fundamentally a quantitative tool, and its use in the context of narrative synthesis presents inherent limitations.

Challenges and future directions in prognostic research

This systematic review highlights methodological challenges in prognostic research of physical measures of physical functioning in LBP. One of the challenges is variability in how outcomes are defined (e.g., change versus absolute scores, recurrence versus trajectory), which adds to the heterogeneity and complicates synthesis and comparison across studies. Future research should adopt standardized outcome definitions to enhance consistency and comparability. Another challenge is that most of the studies reported only p-values without corresponding confidence intervals, restricting true interpretation of prognostic associations. Future work should ensure that effect estimates are accompanied by appropriate measures of precision. Additionally, studies lack clear reporting of symptom duration, hindering evaluation of its role in prognosis. Improved reporting of symptom duration is needed to enhance the quality of prognostic analyses. Reflecting further upon the results, we noted that the biologic rationale informing the selection of many of the identified prognostic factors has not been adequately theorized. Some could be inferred (e.g., lower range of motion reflecting more severe structural pathology) yet even here they have been largely assumed. Strong theoretical rationale, including biological plausibility, is a necessary component of causality and seems an opportunity to strengthen the overall work in this area, which is important for advancing the field.

Conclusion

This rigorous systematic review highlights that the existing literature on the predictive ability of physical measures of physical functioning in LBP lacks high-quality evidence. Low-quality evidence supports no predictive ability of higher isometric back extension endurance, higher handgrip strength, and higher fingertip-to-floor test results for good LBP outcomes in the long term. Very low-quality evidence supports inconsistent predictive ability of higher lumbar extension ROM and higher SLR-ROM in the short term, and of higher isometric back flexion endurance in the long term, for good LBP outcomes. Because the evidence is of low or very low quality, these results should be interpreted with caution. Imprecision, high risk of bias, variability of measurement techniques across studies, lack of standardized protocols, and inadequately controlled confounding factors contributed to the low or very low quality of evidence.

This review also identifies emerging potential prognostic factors (performance-based measures, measures of activity in the natural environment) that show promising predictive ability, and highlights a trend towards improved research quality in this area. An adequately powered, low risk of bias prospective longitudinal study using standardized measurement protocols and multivariable analysis is required to further investigate the promising predictive ability of physical measures of physical functioning in LBP. Future prognostic research should be grounded in a strong theoretical rationale, including biological plausibility. In a body of literature that is highly heterogeneous, this systematic review provides an initial robust synthesis, positioning researchers to advance the field from a strong basis of understanding.

Supporting information

S1 File. AMSTAR 2 score of previous systematic reviews.

https://doi.org/10.1371/journal.pone.0335535.s001

(DOCX)

S3 File. QUIPS domains and judgmental formula.

https://doi.org/10.1371/journal.pone.0335535.s003

(DOCX)

S6 File. Reasons of excluded studies at full text screening stage.

https://doi.org/10.1371/journal.pone.0335535.s006

(DOCX)

S7 File. List of potential prognostic factors in single studies.

https://doi.org/10.1371/journal.pone.0335535.s007

(DOCX)

Acknowledgments

The authors thank Alanna Marson, Librarian, Western University, for help in developing the search strategies. Patient and public involvement: The spinal pain research Patient Partner Advisory Group (PPAG) in the School of Physical Therapy at Western University has shaped the research methodology focused on physical measures of physical functioning. The systematic review results have been discussed with the PPAG to inform future research initiatives.

References

  1. Wu A, March L, Zheng X, Huang J, Wang X, Zhao J. Global low back pain prevalence and years lived with disability from 1990 to 2017: estimates from the Global Burden of Disease Study 2017. Ann Transl Med. 2020;8(6).
  2. Cieza A, Causey K, Kamenov K, Hanson SW, Chatterji S, Vos T. Global estimates of the need for rehabilitation based on the Global Burden of Disease study 2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet. 2021;396(10267):2006–17. pmid:33275908
  3. Driscoll T, Jacklyn G, Orchard J, Passmore E, Vos T, Freedman G, et al. The global burden of occupationally related low back pain: estimates from the Global Burden of Disease 2010 study. Ann Rheum Dis. 2014;73(6):975–81. pmid:24665117
  4. Walker BF, Muller R, Grant WD. Low back pain in Australian adults: prevalence and associated disability. J Manipulative Physiol Ther. 2004;27(4):238–44. pmid:15148462
  5. Hartvigsen J, Hancock MJ, Kongsted A, Louw Q, Ferreira ML, Genevay S, et al. What low back pain is and why we need to pay attention. Lancet. 2018;391(10137):2356–67. pmid:29573870
  6. Rojanasarot S, Bhattacharyya SK, Edwards N. Productivity loss and productivity loss costs to United States employers due to priority conditions: a systematic review. J Med Econ. 2023;26(1):262–70. pmid:36695516
  7. Tousignant-Laflamme Y, Houle C, Cook C, Naye F, LeBlanc A, Décary S. Mastering prognostic tools: an opportunity to enhance personalized care and to optimize clinical outcomes in physical therapy. Phys Ther. 2022;102(5):pzac023. pmid:35202464
  8. Hayden JA, Dunn KM, van der Windt DA, Shaw WS. What is the prognosis of back pain? Best Pract Res Clin Rheumatol. 2010;24(2):167–79. pmid:20227639
  9. Riley RD, Moons KGM, Snell KIE, Ensor J, Hooft L, Altman DG, et al. A guide to systematic review and meta-analysis of prognostic factor studies. BMJ. 2019;364:k4597. pmid:30700442
  10. Moons KG, Royston P, Vergouwe Y, Grobbee DE, Altman DG. Prognosis and prognostic research: what, why, and how? BMJ. 2009;338.
  11. O’Neill D, Forman DE. The importance of physical function as a clinical outcome: assessment and enhancement. Clin Cardiol. 2020;43(2):108–17. pmid:31825137
  12. Dodd S, Clarke M, Becker L, Mavergames C, Fish R, Williamson PR. A taxonomy has been developed for outcomes in medical research to help improve knowledge discovery. J Clin Epidemiol. 2018;96:84–92. pmid:29288712
  13. Jette AM. Toward a common language for function, disability, and health. Phys Ther. 2006;86(5):726–34. pmid:16649895
  14. Tomey KM, Sowers MR. Assessment of physical functioning: a conceptual model encompassing environmental factors and individual compensation strategies. Phys Ther. 2009;89(7):705–14. pmid:19443558
  15. Reiman MP, Manske RC. The assessment of function: how is it measured? A clinical perspective. J Man Manip Ther. 2011;19(2):91–9. pmid:22547919
  16. Ware JE Jr, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care. 1992;30(6):473–83. pmid:1593914
  17. Alamrani S, Rushton A, Gardner A, Falla D, Heneghan NR. Outcome measures evaluating physical functioning and their measurement properties in adolescent idiopathic scoliosis: a protocol for a systematic review. BMJ Open. 2020;10(4):e034286. pmid:32241788
  18. Taylor AM, Phillips K, Patel KV, Turk DC, Dworkin RH, Beaton D, et al. Assessment of physical function and participation in chronic pain clinical trials: IMMPACT/OMERACT recommendations. Pain. 2016;157(9):1836–50. pmid:27058676
  19. Boissoneault J, Mundt J, Robinson M, George SZ. Predicting low back pain outcomes: suggestions for future directions. J Orthop Sports Phys Ther. 2017;47(9):588–92. pmid:28859589
  20. Cook CE, Learman KE, O’Halloran BJ, Showalter CR, Kabbaz VJ, Goode AP, et al. Which prognostic factors for low back pain are generic predictors of outcome across a range of recovery domains? Phys Ther. 2013;93(1):32–40. pmid:22879443
  21. Hartvigsen L, Kongsted A, Hestbaek L. Clinical examination findings as prognostic factors in low back pain: a systematic review of the literature. Chiropr Man Therap. 2015;23:13. pmid:25802737
  22. Verkerk K, Luijsterburg PAJ, Miedema HS, Pool-Goudzwaard A, Koes BW. Prognostic factors for recovery in chronic nonspecific low back pain: a systematic review. Phys Ther. 2012;92(9):1093–108. pmid:22595238
  23. Shea BJ, Reeves BC, Wells G, Thuku M, Hamel C, Moran J, et al. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ. 2017;358:j4008. pmid:28935701
  24. Otero-Ketterer E, Peñacoba-Puente C, Ferreira Pinheiro-Araujo C, Valera-Calero JA, Ortega-Santiago R. Biopsychosocial factors for chronicity in individuals with non-specific low back pain: an umbrella review. Int J Environ Res Public Health. 2022;19(16):10145. pmid:36011780
  25. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71. pmid:33782057
  26. Moons KG, Hooft L, Williams K, Hayden JA, Damen JA, Riley RD. Implementing systematic reviews of prognosis studies in Cochrane. Cochrane Database Syst Rev. 2018;2018(10).
  27. Rashed R, Kowalski K, Walton D, Niazigharemakhe A, Rushton A. Physical measures of physical functioning as prognostic factors to predict outcomes in low back pain: protocol for a systematic review. PLoS One. 2023;18(12):e0295761. pmid:38079434
  28. National Guideline Centre (UK). Low back pain and sciatica in over 16s: assessment and management. 2016.
  29. Kowalski KL, Lukacs MJ, Mistry J, Goodman M, Rushton AB. Physical functioning outcome measures in the lumbar spinal surgery population and measurement properties of the physical outcome measures: protocol for a systematic review. BMJ Open. 2022;12(6):e060950. pmid:35667717
  30. Stallings E, Gaetano-Gil A, Alvarez-Diaz N, Solà I, López-Alcalde J, Molano D, et al. Development and evaluation of a search filter to identify prognostic factor studies in Ovid MEDLINE. BMC Med Res Methodol. 2022;22(1):107. pmid:35399050
  31. Delgado R, Tibau X-A. Why Cohen’s Kappa should be avoided as performance measure in classification. PLoS One. 2019;14(9):e0222916. pmid:31557204
  32. Nurjannah I, Siwi SM. Guidelines for analysis on measuring interrater reliability of nursing outcome classification. Int J Res Med Sci. 2017;5(4):1169.
  33. Moons KGM, de Groot JAH, Bouwmeester W, Vergouwe Y, Mallett S, Altman DG, et al. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist. PLoS Med. 2014;11(10):e1001744. pmid:25314315
  34. Debray TPA, Damen JAAG, Snell KIE, Ensor J, Hooft L, Reitsma JB, et al. A guide to systematic review and meta-analysis of prediction model performance. BMJ. 2017;356:i6460. pmid:28057641
  35. Burton AK, Tillotson KM. Prediction of the clinical course of low-back trouble using multivariable models. Spine (Phila Pa 1976). 1991;16(1):7–14. pmid:1825895
  36. Wittink H, Michel TH, Sukiennik A, Gascon C, Rogers W. The association of pain with aerobic fitness in patients with chronic low back pain. Arch Phys Med Rehabil. 2002;83(10):1467–71. pmid:12370889
  37. Hayden JA, Côté P, Bombardier C. Evaluation of the quality of prognosis studies in systematic reviews. Ann Intern Med. 2006;144(6):427–37. pmid:16549855
  38. Moons C, Hooft L, Hayden J. Systematic reviews of prognosis studies II: Assessing bias in studies of prognostic factors using the QUIPS tool. 2023.
  39. Middlebrook A, Middlebrook N, Bekker S, Rushton A. Physical prognostic factors predicting outcome following anterior cruciate ligament reconstruction: a systematic review and narrative synthesis. Phys Ther Sport. 2022;53:115–42. pmid:34896673
  40. Rushton A, Zoulas K, Powell A, Staal JB. Physical prognostic factors predicting outcome following lumbar discectomy surgery: systematic review and narrative synthesis. BMC Musculoskelet Disord. 2018;19(1):326. pmid:30205812
  41. Grooten WJA, Tseli E, Äng BO, Boersma K, Stålnacke B-M, Gerdle B, et al. Elaborating on the assessment of the risk of bias in prognostic studies in pain rehabilitation using QUIPS-aspects of interrater agreement. Diagn Progn Res. 2019;3:5. pmid:31093575
  42. Hayden JA, van der Windt DA, Cartwright JL, Côté P, Bombardier C. Assessing bias in studies of prognostic factors. Ann Intern Med. 2013;158(4):280–6. pmid:23420236
  43. Fletcher J. What is heterogeneity and is it important? BMJ. 2007;334(7584):94–6. pmid:17218716
  44. McKenzie JE, Brennan SE. Synthesizing and presenting findings using other methods. Cochrane handbook for systematic reviews of interventions. 2019. 321–47.
  45. Huguet A, Hayden JA, Stinson J, McGrath PJ, Chambers CT, Tougas ME, et al. Judging the quality of evidence in reviews of prognostic factor research: adapting the GRADE framework. Syst Rev. 2013;2:71. pmid:24007720
  46. Schünemann HJ, Higgins JP, Vist GE, Glasziou P, Akl EA, Skoetz N, et al. Completing ‘Summary of findings’ tables and grading the certainty of the evidence. In: Cochrane handbook for systematic reviews of interventions. 2019. 375–402.
  47. Huguet A, Hayden JA, Stinson J, McGrath PJ, Chambers CT, Tougas ME, et al. Judging the quality of evidence in reviews of prognostic factor research: adapting the GRADE framework. Syst Rev. 2013;2:71. pmid:24007720
  48. Park CU, Kim HJ. Measurement of inter-rater reliability in systematic review. Hanyang Med Rev. 2015;35(1):44.
  49. Enthoven P, Skargren E, Kjellman G, Oberg B. Course of back pain in primary care: a prospective study of physical measures. J Rehabil Med. 2003;35(4):168–73. pmid:12892242
  50. van den Berg R, Chiarotto A, Enthoven WT, de Schepper E, Oei EHG, Koes BW, et al. Clinical and radiographic features of spinal osteoarthritis predict long-term persistence and severity of back pain in older adults. Ann Phys Rehabil Med. 2022;65(1):101427. pmid:32798770
  51. Nordeman L, Thorselius L, Gunnarsson R, Mannerkorpi K. Predictors for future activity limitation in women with chronic low back pain consulting primary care: a 2-year prospective longitudinal cohort study. BMJ Open. 2017;7(6):e013974. pmid:28674128
  52. Felício DC, Diz JBM, Pereira DS, Queiroz BZ de, Silva JP de, Moreira B de S, et al. Handgrip strength is associated with, but poorly predicts, disability in older women with acute low back pain: a 12-month follow-up study. Maturitas. 2017;104:19–23. pmid:28923172
  53. Hirayama K, Tsushima E, Arihara H, Omi Y. Developing a clinical prediction rule to identify patients with lumbar disc herniation who demonstrate short-term improvement with mechanical lumbar traction. Phys Ther Res. 2019;22(1):9–16. pmid:31289707
  54. Hicks GE, Fritz JM, Delitto A, McGill SM. Preliminary development of a clinical prediction rule for determining which patients with low back pain will respond to a stabilization exercise program. Arch Phys Med Rehabil. 2005;86(9):1753–62. pmid:16181938
  55. Stolze LR, Allison SC, Childs JD. Derivation of a preliminary clinical prediction rule for identifying a subgroup of patients with low back pain likely to benefit from Pilates-based exercise. J Orthop Sports Phys Ther. 2012;42(5):425–36. pmid:22281950
  56. Strøyer J, Jensen LD. The role of physical fitness as risk indicator of increased low back pain intensity among people working with physically and mentally disabled persons: a 30-month prospective study. Spine (Phila Pa 1976). 2008;33(5):546–54. pmid:18317201
  57. Gilmore SJ, Hahne AJ, Davidson M, McClelland JA. Predictors of substantial improvement in physical function six months after lumbar surgery: is early post-operative walking important? A prospective cohort study. BMC Musculoskelet Disord. 2019;20(1):418. pmid:31506099
  58. Lagersted-Olsen J, Thomsen BL, Holtermann A, Søgaard K, Jørgensen MB. Does objectively measured daily duration of forward bending predict development and aggravation of low-back pain? A prospective study. Scand J Work Environ Health. 2016;528–37.
  59. Rodríguez-Romero B, Smith MD, Pértega-Díaz S, Quintela-Del-Rio A, Johnston V. Thirty minutes identified as the threshold for development of pain in low back and feet regions, and predictors of intensity of pain during 1-h laboratory-based standing in office workers. Int J Environ Res Public Health. 2022;19(4):2221. pmid:35206409
  60. Jain S, Shetty GM, Linjhara S, Chutani N, Ram CS. Do improved trunk mobility and isometric strength correlate with improved pain and disability after multimodal rehabilitation for low back pain? Rev Bras Ortop (Sao Paulo). 2023;58(5):e698–705. pmid:37908535
  61. Scheele J, Enthoven WTM, Bierma-Zeinstra SMA, Peul WC, van Tulder MW, Bohnen AM, et al. Course and prognosis of older back pain patients in general practice: a prospective cohort study. Pain. 2013;154(6):951–7. pmid:23597679
  62. Schiøttz-Christensen B, Nielsen GL, Hansen VK, Schødt T, Sørensen HT, Olesen F. Long-term prognosis of acute low back pain in patients seen in general practice: a 1-year prospective follow-up study. Fam Pract. 1999;16(3):223–32. pmid:10439974
  63. Wernli K, Tan J-S, O’Sullivan P, Smith A, Campbell A, Kent P. Does movement change when low back pain changes? A systematic review. J Orthop Sports Phys Ther. 2020;50(12):664–70. pmid:33115341
  64. Kroonenberg PM. Applied multiway data analysis. John Wiley & Sons; 2008.