Development and reliability testing of a qualitative observational rating system for individuals with brachial plexus injury performing functional capacity evaluation tests

  • Tallie M. J. van der Laan ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Writing – original draft, Writing – review & editing

    t.m.j.van.der.laan@umcg.nl

    Affiliation University of Groningen, University Medical Center Groningen, Department of Rehabilitation Medicine, Groningen, The Netherlands

  • Sietke G. Postema ,

    Contributed equally to this work with: Sietke G. Postema, Corry K. van der Sluis, Michiel F. Reneman

    Roles Conceptualization, Investigation, Methodology, Project administration, Supervision, Writing – review & editing

    Affiliation University of Groningen, University Medical Center Groningen, Department of Rehabilitation Medicine, Groningen, The Netherlands

  • Corry K. van der Sluis ,

    Contributed equally to this work with: Sietke G. Postema, Corry K. van der Sluis, Michiel F. Reneman

    Roles Conceptualization, Investigation, Methodology, Project administration, Supervision, Writing – review & editing

    Affiliation University of Groningen, University Medical Center Groningen, Department of Rehabilitation Medicine, Groningen, The Netherlands

  • Michiel F. Reneman

    Contributed equally to this work with: Sietke G. Postema, Corry K. van der Sluis, Michiel F. Reneman

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Supervision, Writing – review & editing

    Affiliation University of Groningen, University Medical Center Groningen, Department of Rehabilitation Medicine, Groningen, The Netherlands

Abstract

Background

Individuals with brachial plexus injury may be more prone to develop musculoskeletal complaints, due to compensatory strategies. Quantifying compensatory strategies of these individuals may help to minimize the use of dysfunctional compensatory strategies and prevent musculoskeletal complaints.

Purpose

To develop and to explore the feasibility of an observational rating system for rating postures and movements of the shoulders and trunk in individuals with brachial plexus injury during the performance of the functional capacity evaluation one-handed-individuals (FCE-OH) and to explore the interrater and intrarater reliability of this rating system.

Study design

Psychometric study including development and reliability testing.

Methods

Individuals with brachial plexus injury (n = 15) and able-bodied controls (n = 21) were videotaped during the performance of five FCE-OH tests. Abnormal shoulder and trunk movements and postures were identified. The rating system was developed, pilot tested, and adjusted. The interrater and intrarater reliability of the final draft were determined. Sixteen raters performed two rating sessions, two weeks apart, and rated 40 video fragments. Absolute percentages of agreement, kappa (ĸ) and 95% confidence intervals (CI) were calculated. Feasibility was explored using a questionnaire.

Results

The interrater reliability of the rating system was: first session ĸ = 0.48, 95%CI = 0.36–0.60; second session ĸ = 0.59, 95%CI = 0.45–0.72. The intrarater reliability was ĸ = 0.64, 95%CI = 0.50–0.70. Half of the raters agreed that the system was easy to use in clinical practice.

Conclusions

A rating system for measuring postures and shoulder and trunk movements of individuals with brachial plexus injury was developed. The reliability appeared to be sufficient when the system was applied by the same rater. The interrater reliability was moderate for both rating sessions. The reliability of both overhead lifting tests of the FCE-OH was low. Variation in movement patterns due to differences in the remaining function of the affected arm may have caused difficulties in rating.

Introduction

Individuals with brachial plexus injury (BPI) have loss of function of their affected arm caused by damage to the brachial plexus. The brachial plexus is a complex network of nerves originating from the spinal cord in the neck and running through the axilla to the limb, controlling movement and sensation of the entire arm and hand. It consists of three trunks: the upper trunk, which consists of roots C5 and C6; the middle trunk (root C7); and the lower trunk, which is formed by roots C8 and T1. Injury to the brachial plexus leads to diminished motor and sensory function and thereby causes complete or partial loss of arm and/or hand function depending on the affected roots and the severity of the BPI.

Individuals with BPI need to compensate for the loss of function of their affected arm. Little is known about the compensatory strategies used by individuals with BPI to perform daily activities and work. Compensation may involve increasing the load on the unaffected arm and using compensatory trunk and/or shoulder movements [1–4]. Compensatory movements may make individuals with BPI more prone to develop musculoskeletal complaints (MSCs), because they may be more exposed to biomechanical risk factors for MSCs such as static muscle contractions, forceful and repetitive movements, and working in awkward positions [5]. Approximately half of the individuals with BPI experience MSCs [6], compared with one-third of the general population [7]. Most often the neck and unaffected shoulder are affected [6]. Pain, whether neuropathic or due to MSCs, affects the quality of life and job choices of individuals with BPI [8–10].

After acquiring BPI only 45% to 66% of the individuals are able to resume their previous work, and of these individuals, 31% need to adjust or to change jobs after injury [11–14]. A functional capacity evaluation (FCE) is a standardized assessment containing work-related tasks and is used to evaluate an individual’s capacity in order to make recommendations for participation in work, while considering the individual’s body functions and structures, environmental factors, personal factors, and health status [15]. By matching an individual’s functional capacity to physical work demands, musculoskeletal complaints may be prevented. The functional capacity evaluation one-handed (FCE-OH) is a short-form functional capacity evaluation adapted for use in one-handed individuals and upper limb prosthesis wearers [16]. The results of FCE tests are expressed in a quantitative manner, for example how much weight an individual can lift, or the time needed for 30 reaching movements. However, how an individual gets to a certain test result is also important. For example, an individual may use a lot of compensatory movements and thereby be more at risk of developing MSCs. A qualitative score that evaluates movement patterns of individuals with BPI during the performance of the FCE-OH tests may improve return-to-work recommendations based on the FCE-OH test results. It may also help professionals to develop treatment plans to minimize the use of dysfunctional compensatory movements during the rehabilitation treatment. This is important because it is difficult to replace a learned dysfunctional movement with a more optimal movement, and the use of dysfunctional compensatory movements may lead to more disability [17]. Furthermore, such a tool may give individuals with BPI insight into their movement patterns and movement habits. This awareness can help in learning to adjust compensation strategies, which may help to prevent musculoskeletal complaints in individuals with BPI.

Two studies evaluated the use of compensatory movements in individuals with BPI during activities of daily living and during specific movements of the BPI-affected upper limb (like reaching with the hand to the mouth). Both studies used motion capture systems and showed that the movement patterns were task-specific and that individuals with BPI compensated mostly with their trunk and shoulder [2–4]. However, motion capture systems have challenges, including a) preparation time required for placing markers, b) lack of portability from the use of fixed cameras, or lengthy set-up requirements for portable systems, and c) equipment costs and the need for trained, specialized staff [18]. With an observational rating system, compensatory movements of individuals with BPI may also be measured reliably during specific tasks. Such a rating system does not have the disadvantages of motion capture systems.

To the best of our knowledge, an observational rating system for rating movement that is applicable to individuals with BPI does not exist. Previously, a qualitative rating system for rating shoulder and trunk movements in upper limb prosthesis wearers during the performance of FCE-OH tests was developed [19]. However, we expect that this rating system is not applicable to individuals with BPI, because upper limb prosthesis wearers compensate for the loss of wrist, forearm (pro- and supination) and sometimes elbow movements due to the limited degrees of freedom of the prosthesis [20]. In contrast, individuals with BPI need to compensate for loss of sensation and strength resulting in a limited active range of motion. This will probably result in the use of different compensatory strategies. We therefore aimed (1) to develop and to explore the feasibility of an observational rating system for rating posture and movements of the shoulders and trunk in individuals with BPI during the performance of FCE-OH tests, and (2) to explore the interrater and intrarater reliability of this new rating system.

Methods

Design

This study consisted of a development phase and a qualitative evaluation phase (Fig 1).

Fig 1. Flowchart of the development and exploration of the reliability of the scoring system for rating postures and movement patterns in individuals with BPI during the performance of FCE-one handed tests.

Abbreviations: BPI, brachial plexus injury; ROM, range of motion; SD, standard deviation; FCE, functional capacity evaluation; ĸ, Fleiss kappa; nWNL, not within normal limits.

https://doi.org/10.1371/journal.pone.0345464.g001

Participants

Three groups of participants were included: raters, individuals with BPI and able-bodied controls. Video recordings of individuals with BPI and able-bodied controls, executing FCE-OH tests as part of previous studies, were used; all individuals signed informed consent before entering the previous study and gave consent to reuse video recordings for research purposes [16,21]. All raters signed informed consent prior to study entrance. The medical ethics committee of the University Medical Center Groningen decided that no formal approval was needed (METC file number: METc 2020/518). All procedures were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, as revised in 2000.

Raters.

Development phase: all raters had an (allied) medical background and were experienced in observing movement patterns. All raters worked in the local medical center. Most raters had no experience with FCE tests. Five pilot tests were performed in this phase. In order to avoid a learning effect, all raters participated in one pilot test only. In the first and second pilot test a total of four hand therapists participated, two therapists per test. In the third pilot test five raters participated, one hand therapist and four physiotherapists. In the fourth and fifth pilot test a total of ten residents in physical rehabilitation medicine participated, five raters per test.

Qualitative evaluation phase: Twenty raters were recruited using an international LinkedIn FCE research network and by contacting medical professionals at our local medical center. The sample size was based on two previous studies that assessed the reliability of an observational rating system [19,22]. All raters had a medical background and had experience in observing movement patterns. Experience with FCEs was desirable but not required. All raters had a good understanding of Dutch or English. Recruitment of the raters took place from 19 November 2021 to 1 February 2022.

Individuals with BPI.

Video recordings of 15 individuals with BPI performing FCE-OH tests, recorded during a previous study that aimed to determine the functional capacity of individuals with BPI [21], were reused. The characteristics of the individuals with BPI are described in Table 1. In the previous study, a physical examination was performed in order to assess the remaining function of the BPI-affected upper limb. The physical examination included an assessment of the active range of motion, strength, and sensation of the hand (see supporting information S1 Table). In order to protect the participants’ privacy their faces were blurred. Camera positions were standardized (Table 2). The videotapes were recorded from March 2016 to June 2019 at the local medical center. Individuals with BPI who were between 18 and 65 years of age, who had been diagnosed with BPI, had remaining hand activity, had sufficient understanding of the Dutch language, performed paid work, and who had normal hand function in the sound hand were eligible to participate in the study. Exclusion criteria were hypertension (blood pressure >160/100 mmHg at rest), serious pulmonary conditions, cardiac conditions, or other conditions that could cause unsafe situations during physical effort exerted during test performance. The seven-item physical activity readiness questionnaire (PAR-Q) was used to screen participants for serious health problems [23]. Participants responding “Yes” to one or more of the items were excluded.

Table 1. Characteristics of individuals with BPI and controls.

https://doi.org/10.1371/journal.pone.0345464.t001

Table 2. FCE-OH tests, participant instructions, FCE-outcome, camera positions and video fragment selected for rating.

https://doi.org/10.1371/journal.pone.0345464.t002

Able-bodied controls.

Video recordings of 21 able-bodied controls (Table 1) performing FCE-OH tests at the local medical center in 2014, made during a previous study [16], were reused. Inclusion criteria for controls were: aged between 18 and 67 years, good understanding of Dutch or English, and normal hand function in both hands. Performing paid work was not an inclusion criterion for controls; the inclusion criterion was added for individuals with BPI because of a previous study that compared functional capacity to physical work demands [24]. Exclusion criteria for controls were similar to those for individuals with BPI. Controls were aware of the study and knew they were participating as controls.

FCE-OH tests

Five out of six FCE-OH tests were selected: the overhead lifting test two-handed, the overhead working test, the repetitive reaching test, the fingertip dexterity test and the overhead lifting test one-handed. The hand grip strength test was not selected, because movements of the trunk and shoulders were not allowed in this test (Table 2).

Scale development

Of the four phases for instrument development and validation (planning, construction, qualitative evaluation, and validation), the first three were followed [25]. Fig 1 shows the steps taken in each phase.

Planning phase.

In this phase, deviating movement patterns of the shoulders and trunk of individuals with BPI were identified. After the identification of deviating movements and postures, it was checked whether these movements and postures could be rated with the previously developed rating system for upper limb prosthesis wearers, or whether a new rating system specifically for individuals with BPI needed to be developed.

Movement patterns of the shoulders and trunk of individuals with BPI were compared to the controls for each FCE-OH test separately, because compensatory movements seem to be task specific [3]. Postures and movements deviating from controls were defined as not within normal limits (nWNL) if (1) the maximum range of motion deviated more than two standard deviations from the mean of the control group or if (2) additional movements were performed that were not observed in the control group. These criteria were based on the previously developed observational rating system for upper limb prosthesis wearers [19]. The range of motion in a direction was measured using VideoStudio Pro 2020 (Corel Corporation), which allows videos to be played frame by frame. The frames in which the range of motion was maximal were chosen and angles were manually measured. In order to prevent missing deviating movements and postures, all videotapes of individuals with BPI were analysed separately and the range of motion of each individual was compared with the mean range of motion of controls for each test. One medical student listed the deviating postures and movements, and the listed movements were checked by a medical professional. In case the medical professional disagreed with the medical student, the video recordings were reassessed together. The findings of the planning phase were discussed in the research group (which consisted of three rehabilitation physicians and one FCE-expert) until consensus was reached. Only three out of eight items of the rating system for upper limb prosthesis wearers were suitable for use in individuals with BPI. These items included (1) trunk and (2) shoulder movements from the two-handed overhead lifting test, as well as (3) trunk movements from the fingertip dexterity test [11]. Therefore, a new rating system needed to be developed for individuals with BPI.
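The two nWNL criteria described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure (angles were measured manually from video frames). The function name `is_nwnl`, the use of the sample standard deviation, the treatment of deviations in either direction, and the control values are all assumptions made for the example.

```python
from statistics import mean, stdev


def is_nwnl(patient_rom_deg, control_roms_deg,
            extra_movement_observed=False, n_sd=2.0):
    """Return True if a posture or movement would count as not within normal limits.

    Criterion 1: the maximum range of motion deviates more than n_sd standard
    deviations from the control mean (either direction; an assumption).
    Criterion 2: an additional movement was seen that no control showed.
    """
    control_mean = mean(control_roms_deg)
    control_sd = stdev(control_roms_deg)  # sample SD; the paper does not specify
    deviates = abs(patient_rom_deg - control_mean) > n_sd * control_sd
    return bool(deviates or extra_movement_observed)


# Hypothetical control angles in degrees (not study data)
controls = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0, 5.0]

print(is_nwnl(6.0, controls))    # within two SDs of the control mean -> False
print(is_nwnl(12.0, controls))   # far outside two SDs -> True
print(is_nwnl(5.0, controls, extra_movement_observed=True))  # criterion 2 -> True
```

Because the comparison is made per FCE-OH test, a real analysis would hold one set of control angles for each test and movement direction.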

Construction phase.

The first draft of the new rating system consisted of the three items that matched the observed movement patterns, supplemented with five adjusted items (S1 Fig). Based on our previous experiences with developing a comparable rating system in upper limb prosthesis wearers, we decided to rate all items dichotomously (within normal limits (WNL) or not within normal limits (nWNL)) [19].

The first draft was pilot-tested five times and adjusted according to the feedback of the raters (Fig 1). All pilot tests were performed online, guided by a moderator (TMJL) using Microsoft Teams. Before rating an item, raters received a short instruction and four example video fragments were shown (two test performances WNL and two test performances nWNL). After rating all items, the raters were asked to provide feedback on the instructions and the rating system. Some video recordings appeared to be more difficult to rate than others. In order to reduce the influence of one single video recording on the results, all raters participating in pilot tests three, four, and five rated eight video fragments of individuals with BPI performing the FCE-OH tests. Raters of these pilot tests were also asked to rate the certainty of their ratings (dichotomous, certain: yes or no), in order to identify difficulties in rating.

Qualitative evaluation phase.

In this final phase interrater and intrarater reliability were explored (Fig 1). Each rater was asked to participate in two rating sessions, with two weeks between rating sessions. Prior to each rating session the use of the rating system was explained by an instruction video, containing two examples of a performance WNL and two examples of a performance nWNL for each FCE-OH test. For each item, except for the overhead lifting test one-handed, the same eight video fragments as in the pilot tests were selected for rating. For the overhead lifting test one-handed seven video fragments were selected for rating, because only seven individuals with BPI were able to perform this test with the affected upper limb. The video instruction and all video fragments were offered online, using a secured video fragment rating system (VFR MAS Outreach, Leeuwarden, the Netherlands). This system allows videos to be shared without the possibility of downloading them, forwarding them to others, or distributing them in any other way. Raters were instructed to rate independently and not to replay or pause the videos during rating, in order to mimic a clinical setting as closely as possible. As in the pilot tests, raters were asked if they were certain of their ratings. After each rating session raters handed in their results and were instructed to remove their rating form from their computer. After the second rating session raters were asked to provide (qualitative) feedback on the feasibility and usability of the rating system and the video instruction, using a self-developed questionnaire (S1 File). This questionnaire had previously been used during the development of the rating system for upper limb prosthesis wearers, but has not been formally validated. Rating took place from January to March 2022.

Statistical analysis

Construction phase: interrater reliability for all items in the pilot tests was determined using Fleiss kappa (ĸ) for multiple raters. Qualitative evaluation phase: Fleiss kappa for multiple raters was used to determine the interrater reliability, and Cohen’s kappa was used to determine intrarater reliability. Because the data were categorical, kappa statistics were used to determine the interrater and intrarater reliability [26,27]. Percentages of absolute agreement and 95% confidence intervals (CI) were determined. Reliability was considered poor if ĸ ≤ 0.4, moderate if 0.41 < ĸ < 0.59, sufficient if 0.6 ≤ ĸ < 0.79 and good if ĸ ≥ 0.80 [28]. In case of missing data the ratings of that particular rater were omitted from the analyses for the interrater reliability. For the intrarater reliability analysis, corresponding ratings of the first and second rating session were removed pairwise in case of missing data. Gwet’s AC1 was determined post hoc to assess inter- and intrarater reliability, because of the risk of prevalence bias when using kappa statistics [28,29]. The feedback provided by the raters through the questionnaire was analyzed using descriptive statistics only. Statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS) version 25.0 software package (SPSS; IBM Corp, Armonk, NY). AgreeStat 360 was used to calculate Gwet’s AC1 [30].
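As an illustration of the interrater statistic named above, the following Python sketch computes Fleiss kappa for dichotomous WNL/nWNL ratings and maps the result to a reliability label. It is not the SPSS procedure used in the study. The function names and the example ratings are hypothetical, and the label boundaries are closed up (0.40, 0.60, 0.80) as an assumption, because the published bands leave small gaps between categories.

```python
def fleiss_kappa(ratings):
    """Fleiss kappa for a list of fragments, each a list with one label per rater.

    Assumes every fragment was rated by the same number of raters.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for row in ratings for label in row})
    # counts[i][j] = number of raters who gave fragment i the category j
    counts = [[row.count(c) for c in categories] for row in ratings]
    # Per-fragment proportion of agreeing rater pairs
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_subjects
    # Overall proportion of ratings falling in each category
    p_j = [sum(row[j] for row in counts) / (n_subjects * n_raters)
           for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        return float("nan")  # all ratings identical: kappa is undefined
    return (p_bar - p_e) / (1.0 - p_e)


def interpret(kappa):
    """Reliability label; boundaries closed up at 0.40/0.60/0.80 (assumption)."""
    if kappa <= 0.40:
        return "poor"
    if kappa < 0.60:
        return "moderate"
    if kappa < 0.80:
        return "sufficient"
    return "good"


# Hypothetical ratings: four video fragments, three raters each (not study data)
example = [
    ["WNL", "WNL", "WNL"],
    ["nWNL", "nWNL", "nWNL"],
    ["WNL", "WNL", "nWNL"],
    ["nWNL", "nWNL", "WNL"],
]
k = fleiss_kappa(example)
print(round(k, 3), interpret(k))
```

With 16 raters and 40 fragments, each row would simply hold 16 labels; confidence intervals would need an additional variance estimate or bootstrap that is not shown here.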

Results

Construction phase

The results of the construction phase are shown in S1 Fig. The most important adjustment after the five pilot tests was that the descriptions were focused more on symmetry, in order to improve the rating of subtle deviating movements such as shoulder elevation in both two-handed FCE-OH tests. We did not develop separate items for these subtle deviating movements, because from the development of the previous observational rating system for upper limb prosthesis wearers, we learned that these items were difficult to rate reliably by observation [19]. The final draft consisted of five items, one item per FCE-OH test (see also S2 File). Points of attention were specified per item. Raters were instructed to rate the worst performance in case of a variable movement pattern. Fig 2 shows an illustrative example of the overhead lifting test two-handed item.

Fig 2. Illustrative example of the overhead lifting test two-handed item in the final draft of the scoring system.

https://doi.org/10.1371/journal.pone.0345464.g002

Qualitative evaluation phase

In total 20 raters were recruited, eight from Austria and twelve from the Netherlands. The first rating session was performed by 17 raters (5 males, 12 females; professions: 8 physiotherapists, 5 rehabilitation physicians, 2 occupational therapists and 2 residents in rehabilitation medicine), and 16 of these 17 raters completed the second rating session as well. Two raters had time constraints and two raters gave no reason for not performing one or both rating sessions. Seven raters had FCE-experience (mean years of FCE-experience 3.3 ± 7.7 years, missing data n = 1). The mean duration of work experience of raters without FCE-experience was 10.7 ± 11.8 years (missing data n = 2). The mean time between the first and the second rating session was 16.3 ± 3.8 days. Raters were able to open all documents and watch the videos, although one rater had problems because of stuttering videos. Almost all raters declared that the instructions for the testing procedure were clear. One rater answered that the instructions were unclear, but stated a lack of clarity in the rating system, and not in the instructions, as the reason. A post-hoc analysis to test whether the results were influenced by this rater showed similar results for analyses that included or excluded this rater. The rating sessions were conducted from 17 January 2022 to 2 March 2022.

The overall interrater reliability of both rating sessions was moderate (Table 3). Fleiss kappa increased in the second rating session and was almost sufficient. In both rating sessions 77% of the raters were certain of their ratings. In the second rating session the interrater reliability was good for the overhead working test and sufficient for the repetitive reaching test and the fingertip dexterity test. The interrater reliability of both overhead lifting tests was poor in both rating sessions. Raters most often reported uncertainty about their ratings for the two overhead lifting tests (19–30% of raters). The interrater reliability was similar for raters with and without FCE experience (first rating session raters with FCE-experience ĸ = 0.56, 95% CI 0.41–0.70, raters without FCE-experience ĸ = 0.53, 95% CI 0.39–0.66; second rating session raters with FCE-experience ĸ = 0.64, 95% CI 0.56–0.73, raters without FCE-experience ĸ = 0.58, 95% CI 0.43–0.73).

Table 3. Inter- and intrarater reliability of the observation-based scoring system.

https://doi.org/10.1371/journal.pone.0345464.t003

The overall intrarater reliability was sufficient. The intrarater reliability was good for the overhead working test and sufficient for the other items except for the overhead lifting test one-handed. The overall intrarater reliability was similar for raters with and without FCE-experience (raters with FCE-experience ĸ = 0.64, 95% CI 0.58–0.70, raters without FCE-experience ĸ = 0.68, 95% CI 0.61–0.76).

Gwet’s AC1 values, determined post hoc, were similar to the kappa values, indicating no prevalence bias (Table 3).
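To show why comparing kappa with Gwet’s AC1 says something about prevalence bias, the sketch below computes both for two rating sessions by a single rater. It is an illustration only, not the AgreeStat 360 or SPSS calculation used in the study; the function names and the skewed example ratings are hypothetical.

```python
def cohen_kappa(first, second):
    """Cohen's kappa for two equally long lists of categorical ratings."""
    n = len(first)
    cats = set(first) | set(second)
    p_o = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement from the product of the two sessions' marginals
    p_e = sum((first.count(c) / n) * (second.count(c) / n) for c in cats)
    return (p_o - p_e) / (1.0 - p_e)


def gwet_ac1(first, second):
    """Gwet's AC1 for two lists of ratings (needs at least two categories)."""
    n = len(first)
    cats = set(first) | set(second)
    p_o = sum(a == b for a, b in zip(first, second)) / n
    # Average marginal proportion per category across both sessions
    pi = [(first.count(c) + second.count(c)) / (2 * n) for c in cats]
    p_e = sum(p * (1.0 - p) for p in pi) / (len(cats) - 1)
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical, strongly skewed ratings: 8 of 10 fragments are rated nWNL
# in both sessions, 2 are rated differently (not study data).
session_1 = ["nWNL"] * 8 + ["nWNL", "WNL"]
session_2 = ["nWNL"] * 8 + ["WNL", "nWNL"]

print(round(cohen_kappa(session_1, session_2), 3))
print(round(gwet_ac1(session_1, session_2), 3))
```

In this skewed example the observed agreement is 80%, yet kappa is slightly negative while AC1 stays high. When the two statistics are close, as reported in Table 3, such a prevalence effect is unlikely to be driving the kappa values.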

The intra- and interrater reliability may have been influenced by insufficient power of the kappa analyses. Therefore we conducted a post hoc power analysis in R (version 4.3.1; R Core Team) using custom-made scripts developed in RStudio and a web-based sample size calculator [31]. These analyses indicated that the sample size was sufficient for the overall intrarater reliability (number of videos required 10, for a power of 0.8 with kappa expected 0.6, precision 0.1 and number of raters 16) and interrater reliability (246 ratings required, for a power of 0.8 with kappa expected 0.6, precision 0.1). However, the intra- and interrater reliability analyses for the individual items were underpowered.

Feasibility and usability

Fifteen raters provided feedback on the feasibility and usability of the developed rating system (Table 4). In their feedback raters complained that the camera position was not perfect in the frontal plane in some videotapes. Furthermore, they suggested adding feedback to the video instruction to improve the training.

Table 4. Feedback of the raters on the developed rating system.

https://doi.org/10.1371/journal.pone.0345464.t004

Discussion

A new system was developed for rating deviant postures and movements of the shoulders and trunk in individuals with BPI during the performance of FCE-OH tests. The interrater and intrarater reliability were explored. The overall interrater reliability of the rating system was moderate in both rating sessions, although nearly sufficient in the second rating session. The overall intrarater reliability was sufficient.

The developed rating system was based on the rating system for compensatory trunk and shoulder movements in upper limb prosthesis wearers [19], because that rating system appeared not to be applicable to individuals with BPI. In both rating systems the intrarater reliability was higher than the interrater reliability. The interrater reliability of the rating system for individuals with BPI was lower than that of the rating system developed for upper limb prosthesis wearers. Rating movement patterns in individuals with BPI appeared to be more challenging, because of greater variation in remaining upper limb function compared with upper limb prosthesis wearers, resulting in greater variability in movement patterns. This variability was also observed in children with brachial plexus birth injury performing modified Mallet scale tasks [4]. The variability in movement patterns may have caused difficulties in recognizing and rating movement patterns and postures, because the deviant movement patterns in the video fragments selected for rating were sometimes different from the examples of deviant movement patterns shown in the video instructions. This issue was particularly apparent in the two overhead lifting tests, where a combination of movements had to be rated simultaneously. Pilot testing had indicated that scoring these components individually was problematic (see also supporting information S1 Fig), which led to the decision to assess them as combined movement patterns. However, the substantial variation in how these combined movements were performed likely increased the complexity of the rating and may therefore explain the low interrater reliability observed for the overhead lifting tests.

The interrater and intrarater reliability of the rating system was similar for medical professionals with and without FCE-experience, suggesting that specific FCE-experience is not required for the use of the rating system after following a training program. The interrater reliability increased from the first to the second rating session, which was also observed in other studies [19,22]. This finding may suggest a learning effect; however, other unexamined rater specific factors, such as fatigue, mental health or time taken for rating, may also have contributed.

The combination of sufficient intrarater reliability and insufficient interrater reliability may have been a consequence of the limited rating training. Only 40% of the raters agreed that the video instruction was sufficient before the use of the system. Furthermore, when they handed in their ratings, some raters mentioned that they would have liked to receive feedback on their ratings in order to know if their ratings were right. Clearly, a single video instruction should be considered insufficient for future use. A more extensive training program that incorporates structured feedback may help to clarify rating criteria, help raters to recognize and rate variable movement patterns of individuals with BPI, and thereby improve interrater reliability.

We developed an observational rating system for all individuals with BPI. Observations may also improve by developing different rating systems for different levels of BPI. Probably the variation in movement patterns will decrease when rating individuals with similar levels of BPI, which could make it easier to recognize movement patterns and thereby simplify rating. The disadvantage of different systems is that raters need to train and maintain rating skills for different rating systems. This may be challenging, especially because of the low prevalence of BPI [32,33]. It may nevertheless be necessary to opt for optoelectrical motion capture systems for reliable measurement of deviant movements and postures in individuals with BPI, despite the disadvantages of these systems [18].

It was remarkable that almost all raters agreed with the statement that the system was easy to use, but that only 53% of the raters agreed with the statement that the system would be easy to use in clinical practice. This finding is in contrast to the rating system for compensatory trunk and shoulder movements in upper limb prosthesis wearers, where all raters agreed with the statements that the system was easy to use and could be easily implemented in clinical practice [19]. Unfortunately, raters were not asked why they did not agree with the statement that the system would be easy to use in clinical practice. Insufficient training may be one reason. Another reason may be that raters experienced rating as difficult, as indicated by the uncertainty about their ratings. This finding needs further investigation.

Based on the reliability test results and the comments of the raters, the currently developed rating system cannot yet be used in clinical practice. It can be questioned whether an observational rating system is the right tool to measure deviating movement patterns and postures in individuals with BPI, because of the variability in movement patterns. Follow-up research should therefore focus on the use of an optoelectrical movement system to measure movement patterns and postures during the performance of FCE-OH tests, to provide more insight into the qualitative aspects of the FCE-OH test performances in individuals with BPI.

Limitations

It is plausible that certain deviating movement patterns and postures remain undetected by the developed rating system. This limitation may be caused by the definition of movement patterns classified as nWNL, which was based on a maximum range of motion deviating by more than two standard deviations from the mean of the control group. Consequently, smaller deviating movements may not have been identified as nWNL. However, from a clinical perspective, the more severe deviating movements may be the most important to target in order to prevent MSCs. These movements cause larger internal moments around joints and thus require more muscle force, leading to muscle fatigue, which subsequently may increase the risk of MSCs [5].
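The two-standard-deviation rule described above can be illustrated with a minimal sketch. The function name, the example values, and the use of the sample standard deviation are assumptions made for illustration only; "WNL" is assumed here to be the counterpart label of nWNL. This is not the procedure used in the study.

```python
from statistics import mean, stdev


def classify_max_rom(control_max_roms, observed_max_rom, n_sd=2.0):
    """Label a maximum range of motion as 'WNL' or 'nWNL'.

    A value is labelled 'nWNL' when it deviates by more than n_sd
    standard deviations from the control-group mean. Using the sample
    standard deviation is an assumption of this sketch.
    """
    if len(control_max_roms) < 2:
        raise ValueError("need at least two control values to estimate an SD")
    control_mean = mean(control_max_roms)
    control_sd = stdev(control_max_roms)  # sample standard deviation
    deviation = abs(observed_max_rom - control_mean)
    return "nWNL" if deviation > n_sd * control_sd else "WNL"


# Hypothetical control-group values in degrees (illustrative only).
control_values = [10.0, 12.0, 14.0, 16.0, 18.0]

print(classify_max_rom(control_values, 19.0))  # within mean +/- 2 SD
print(classify_max_rom(control_values, 25.0))  # beyond mean + 2 SD
```

With such a rule, a movement that deviates clearly from the control mean but stays within two standard deviations is still labelled WNL, which is the limitation discussed above.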

Raters reported that the use of fixed camera positions in a single plane caused difficulties during rating. Additionally, because of the fixed camera positions, certain movement patterns and postures in other planes may not have been observed during the planning phase. Therefore, the use of multi-angle recordings is desirable in future studies.

Deviating movements and postures of individuals with BPI were determined by two researchers who manually measured the range of motion from video frames. This approach primarily captured deviations large enough to be assessed by observation; however, subtle deviations may have gone undetected. The validity and reliability of this method are unknown, and therefore results of the planning phase should be interpreted with caution. For future research, the use of motion capture systems should be considered to determine deviating movement patterns and postures. Previous studies have demonstrated that such systems also identify smaller deviating movement patterns of individuals with BPI, like scapulothoracic movements, rotation of the humerus and movement of the scapula and clavicle with respect to the thorax (virtual thoracohumeral movement) [2,4,34].

The item selection in the planning phase was based on an internal consensus group consisting of three medical professionals and one FCE-expert. In future research, the validity of the rating system may be enhanced by performing a Delphi study, as it systematically incorporates expert consensus and reduces individual bias.

The sample size of individuals with BPI was small. Post hoc, we checked whether saturation of the observed movement patterns and postures was reached for each FCE-OH test. Saturation was reached if no new deviating movement patterns and postures were observed in at least the last four analyzed videotapes. In four out of five FCE-OH tests, saturation of the observed deviating movement patterns and postures was reached. Only for the overhead lifting test was saturation not reached.
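The saturation criterion above (no new deviating movement patterns or postures in at least the last four analyzed videotapes) can be sketched as follows. The function name, the data layout (one set of observed patterns per videotape, in order of analysis), and the example pattern labels are hypothetical and serve only to illustrate the rule.

```python
def saturation_reached(patterns_per_video, window=4):
    """Return True if the last `window` videos add no new patterns.

    patterns_per_video: list of sets of pattern labels, one set per
    analyzed videotape, in the order in which they were analyzed.
    """
    if len(patterns_per_video) < window:
        return False  # too few videos to apply the criterion
    # Collect everything observed before the final `window` videos.
    seen_before = set()
    for observed in patterns_per_video[:-window]:
        seen_before |= set(observed)
    # Saturation fails as soon as a late video shows an unseen pattern.
    for observed in patterns_per_video[-window:]:
        if set(observed) - seen_before:
            return False
    return True


# Hypothetical observations for one test (labels are illustrative).
videos = [
    {"trunk lateral flexion"},
    {"shoulder elevation"},
    {"trunk lateral flexion"},
    set(),
    {"shoulder elevation"},
    {"trunk lateral flexion"},
]
print(saturation_reached(videos))
```

In this example the last four videotapes only repeat patterns seen in the first two, so the criterion is met; appending a videotape with a previously unseen pattern would reset it.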

A post-hoc power analysis showed that the power of the overall interrater and intrarater reliability analyses was sufficient; however, the power of the reliability analyses of the individual items was insufficient. A lower number of raters, each rating more videotapes per item, would have increased the power [28]. However, this would have led to long rating sessions, which was undesirable because it could lead to fatigue and time constraints for the raters, which in turn could negatively affect reliability. Therefore, and because of the explorative character of this study, it was decided to increase the number of raters instead of the number of videotapes that needed to be rated.

Conclusions

A rating system was developed to rate movement patterns and postures of the trunk and shoulders in individuals with BPI during the performance of FCE-OH tests by observation. Movement patterns and postures of individuals with BPI could be rated reliably by the same rater using the developed rating system in combination with the video instruction. The interrater reliability was moderate in the first rating session and almost sufficient in the second rating session (κ was 0.59, whereas κ ≥ 0.6 was considered sufficient). The interrater reliability of the items of both overhead lifting tests was particularly low. The moderate interrater reliability of the rating system may be explained by the great variation in movement patterns in individuals with BPI, caused by differences in the remaining function of the affected upper limb. No differences were observed in the results of raters with or without FCE-experience, which implies that FCE-experience is not required for the use of the system. The feasibility of the rating system could be improved by implementing a more extended training program. Further research is needed to determine how the reliability of measuring deviant movement patterns and postures in individuals with BPI can be increased, taking advantages and disadvantages for use in daily practice into account.
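To make the κ ≥ 0.6 criterion concrete, the following minimal sketch computes Cohen's kappa for two raters and compares it with that threshold. It is a simplification: the study involved more than two raters, and the exact agreement statistic and software used there are not reproduced here. The function names and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who rated the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty item set")
    n = len(ratings_a)
    # Observed proportion of agreement.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = counts_a.keys() | counts_b.keys()
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


def is_sufficient(kappa, threshold=0.6):
    """Apply the sufficiency criterion stated in the text."""
    return kappa >= threshold


# Hypothetical ratings of ten items by two raters.
rater_a = ["WNL"] * 5 + ["nWNL"] * 5
rater_b = ["WNL"] * 4 + ["nWNL"] * 5 + ["WNL"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2), is_sufficient(kappa), is_sufficient(0.59))
```

In the example the raters agree on 8 of 10 items while chance agreement is 0.5, which gives a kappa of about 0.6; a value of 0.59, as reported for the second rating session, falls just below the threshold.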

Supporting information

S1 Table. Physical examination: Assessed range of movement and muscles selected for strength testing.

https://doi.org/10.1371/journal.pone.0345464.s001

(DOCX)

S2 Table. Results of the qualitative evaluation phase.

https://doi.org/10.1371/journal.pone.0345464.s002

(DOCX)

S1 Fig. Flow chart of the construction phase of the development of the scoring system for rating postures and movement patterns in individuals with BPI during the performance of FCE-one handed tests.

https://doi.org/10.1371/journal.pone.0345464.s003

(DOCX)

S1 File. Questionnaire qualitative evaluation phase.

https://doi.org/10.1371/journal.pone.0345464.s004

(DOCX)

S2 File. Qualitative scoring system for rating posture and movements of the shoulders and trunk in individuals with brachial plexus injury during the performance of functional capacity evaluation one-handed (FCE-OH) tests*.

https://doi.org/10.1371/journal.pone.0345464.s005

(DOCX)

Acknowledgments

The authors thank M. Reijmerink for his help in listing deviating postures and movement patterns.

Furthermore, we would like to thank all raters who participated in this study.

References

  1. Mancuso CA, Lee SK, Dy CJ, Landers ZA, Model Z, Wolfe SW. Compensation by the uninjured arm after brachial plexus injury. Hand (N Y). 2016;11(4):410–5. pmid:28149206
  2. Webber CM, Shin AY, Kaufman KR. Kinematic profiles during activities of daily living in adults with traumatic brachial plexus injuries. Clin Biomech (Bristol). 2019;70:209–16. pmid:31669918
  3. Mosqueda T, James MA, Petuskey K, Bagley A, Abdala E, Rab G. Kinematic assessment of the upper extremity in brachial plexus birth palsy. J Pediatr Orthop. 2004;24(6):695–9. pmid:15502572
  4. Mahon J, Malone A, Kiernan D, Meldrum D. Kinematic differences between children with obstetric brachial plexus palsy and healthy controls while performing activities of daily living. Clin Biomech (Bristol). 2018;59:143–51. pmid:30241094
  5. Gallagher S, Heberger JR. Examining the interaction of force and repetition on musculoskeletal disorder risk. Hum Factors: J Hum Factors Ergon Soc. 2013;55(1):108–24.
  6. van der Laan TMJ, Postema SG, van Bodegom JM, Postema K, Dijkstra PU, van der Sluis CK. Prevalence and factors associated with musculoskeletal complaints and disability in individuals with brachial plexus injury: a cross-sectional study. Disabil Rehabil. 2023;45(18):2936–45. pmid:36149019
  7. Huisstede BMA, Miedema HS, Verhagen AP, Koes BW, Verhaar JAN. Multidisciplinary consensus on the terminology and classification of complaints of the arm, neck and/or shoulder. Occup Environ Med. 2007;64(5):313–9. pmid:17043078
  8. Ciaramitaro P, Mondelli M, Logullo F, Grimaldi S, Battiston B, Sard A, et al. Traumatic peripheral nerve injuries: epidemiological findings, neuropathic pain and quality of life in 158 patients. J Peripher Nerv Syst. 2010;15(2):120–7. pmid:20626775
  9. Bailey R, Kaskutas V, Fox I, Baum CM, Mackinnon SE. Effect of upper extremity nerve damage on activity participation, pain, depression, and quality of life. J Hand Surg Am. 2009;34(9):1682–8. pmid:19896011
  10. van der Holst M, Groot J, Steenbeek D, Pondaag W, Nelissen RGHH, Vliet Vlieland TPM. Participation restrictions among adolescents and adults with neonatal brachial plexus palsy: the patient perspective. Disabil Rehabil. 2018;40(26):3147–55.
  11. Holdenried M, Schenck T, Akpaloo J, Müller-Felber W, Holzbach T, Giunta R. Lebensqualität bei posttraumatischen Paresen des Plexus brachialis im Erwachsenenalter. Handchir Mikrochir plast Chir. 2013;45(04):229–34.
  12. Kretschmer T, Ihle S, Antoniadis G, Seidel JA, Heinen C, Börm W, et al. Patient satisfaction and disability after brachial plexus surgery. Neurosurgery. 2009;65(4 Suppl):A189-96. pmid:19927067
  13. Haldane C, Frost G, Ogalo E, Bristol S, Doherty C, Berger M. A systematic review and meta-analysis of patient-reported outcomes following nerve transfer surgery for brachial plexus injury. PM R. 2022.
  14. Gushikem A, Mendonca de Cardoso M, Cabral ALL, Mendes Barros CS, Isidro HBTM, Rodrigues Silva J, et al. Predictive factors for return to work or study and satisfaction in traumatic brachial plexus injury individuals undergoing rehabilitation: a retrospective follow-up study of 101 cases. J Hand Ther. 2021.
  15. Soer R, van der Schans CP, Groothoff JW, Geertzen JHB, Reneman MF. Towards consensus in operational definitions in functional capacity evaluation: a Delphi Survey. J Occup Rehabil. 2008;18(4):389–400. pmid:19011956
  16. Postema SG, Bongers RM, Reneman MF, van der Sluis CK. Functional capacity evaluation in upper limb reduction deficiency and amputation: development and pilot testing. J Occup Rehabil. 2018;28(1):158–69. pmid:28397018
  17. Levin MF, Weiss PL, Keshner EA. Emergence of virtual reality as a tool for upper limb rehabilitation: incorporation of motor control and motor learning principles. Phys Ther. 2015;95(3):415–25. pmid:25212522
  18. van der Kruk E, Reijne MM. Accuracy of human motion capture systems for sport applications; state-of-the-art review. Eur J Sport Sci. 2018;18(6):806–19. pmid:29741985
  19. van der Laan TMJ, Postema SG, Reneman MF, Bongers RM, van der Sluis CK. Development and reliability of the rating of compensatory movements in upper limb prosthesis wearers during work-related tasks. J Hand Ther. 2019;32(3):368–74. pmid:29439843
  20. Metzger AJ, Dromerick AW, Holley RJ, Lum PS. Characterization of compensatory trunk movements during prosthetic upper limb reaching tasks. Arch Phys Med Rehabil. 2012;93(11):2029–34. pmid:22449551
  21. van der Laan TMJ, Postema SG, van der Sluis CK, Reneman MF. Functional capacity of individuals with brachial plexus injury. Work. 2023;76(3):1019–30. pmid:37248939
  22. Trippolini MA, Dijkstra PU, Jansen B, Oesch P, Geertzen JHB, Reneman MF. Reliability of clinician rated physical effort determination during functional capacity evaluation in patients with chronic musculoskeletal pain. J Occup Rehabil. 2014;24(2):361–9. pmid:23975060
  23. Thomas S, Reading J, Shephard RJ. Revision of the physical activity readiness questionnaire (PAR-Q). Can J Sport Sci. 1992;17(4):338–45. pmid:1330274
  24. van der Laan TMJ, Postema SG, Alkozai SA, van der Sluis CK, Reneman MF. Musculoskeletal complaints, physical work demands, and functional capacity in individuals with a brachial plexus injury: an exploratory study. Work. 2024;77(3):811–25. pmid:37781839
  25. Benson J, Clark F. A guide for instrument development and validation. Am J Occup Ther. 1982;36(12):789–800. pmid:6927442
  26. Tooth LR, Ottenbacher KJ. The kappa statistic in rehabilitation research: an examination. Arch Phys Med Rehabil. 2004;85(8):1371–6. pmid:15295769
  27. Sainani KL. Reliability statistics. PM R. 2017;9(6):622–8.
  28. Sim J, Wright CC. The kappa statistic in reliability studies: use, interpretation, and sample size requirements. Phys Ther. 2005;85(3):257–68. pmid:15733050
  29. Vach W, Gerke O. Gwet’s AC1 is not a substitute for Cohen’s kappa - A comparison of basic properties. MethodsX. 2023;10:102212. pmid:37234937
  30. Gwet KL. AgreeStat360. [cited 2025 Dec 3]. Available from: https://AgreeStat360.com
  31. Arifin WN. A web-based sample size calculator for reliability studies. Educ Med J. 2018;10(3):67–76.
  32. Midha R. Epidemiology of brachial plexus injuries in a multitrauma population. Neurosurgery. 1997;40(6):1182–8; discussion 1188-9. pmid:9179891
  33. Chauhan SP, Blackwell SB, Ananth CV. Neonatal brachial plexus palsy: incidence, prevalence, and temporal trends. Semin Perinatol. 2014;38(4):210–8. pmid:24863027
  34. Wu G, van der Helm FCT, Veeger HEJD, Makhsous M, Van Roy P, Anglin C, et al. ISB recommendation on definitions of joint coordinate systems of various joints for the reporting of human joint motion--Part II: shoulder, elbow, wrist and hand. J Biomech. 2005;38(5):981–92. pmid:15844264