The authors have declared that no competing interests exist.
The Patient-Reported Outcomes Measurement Information System (PROMIS) is a universally applicable set of instruments, including item banks, short forms and computer adaptive tests (CATs), measuring patient-reported health across different patient populations. PROMIS CATs are highly efficient, and their use in practice is considered feasible because administration takes little time, enabling standardized and routine patient monitoring. Before an item bank can be used as a CAT, its psychometric properties have to be examined. Therefore, the objective of this study was to assess the psychometric properties of the Dutch-Flemish PROMIS Physical Function item bank (DF-PROMIS-PF) in Dutch patients receiving physical therapy.
Cross-sectional study.
805 patients >18 years, who received any kind of physical therapy in primary care in the past year, completed the full DF-PROMIS-PF (121 items).
Unidimensionality was examined by Confirmatory Factor Analysis and local dependence and monotonicity were evaluated. A Graded Response Model was fitted. Construct validity was examined with correlations between DF-PROMIS-PF T-scores and scores on two legacy instruments (SF-36 Health Survey Physical Functioning scale [SF36-PF10] and the Health Assessment Questionnaire Disability-Index [HAQ-DI]). Reliability (standard errors of theta) was assessed.
The results for unidimensionality were mixed (scaled CFI = 0.924, TLI = 0.923, RMSEA = 0.045, first factor explained 61.5% of variance). Some local dependence was found (8.2% of item pairs). The item bank showed a broad coverage of the physical function construct (threshold-parameters range: -4.28–2.33) and good construct validity (correlation with SF36-PF10 = 0.84 and HAQ-DI = -0.85). Furthermore, the DF-PROMIS-PF showed greater reliability over a broader score-range than the SF36-PF10 and HAQ-DI.
The psychometric properties of the DF-PROMIS-PF item bank are sufficient. The DF-PROMIS-PF can now be used as short forms or CAT to measure the level of physical function of physiotherapy patients.
Patient Reported Outcome Measures (PROMs) have become standard instruments to measure patients’ perceived health, and are used to assist in patient-physician shared decision-making and to monitor patients’ health over time. However, their application in daily clinical practice is not without problems. Many traditional PROMs are too long for use in daily clinical practice, and sometimes contain irrelevant and poorly formulated questions [
The Patient-Reported Outcomes Measurement Information System (PROMIS®) has the potential to overcome some of the shortcomings of existing PROMs [
The PROMIS Physical Function (PROMIS-PF) item bank is one of the PROMIS instruments which is highly relevant for physical therapists and their patients [
In line with the international PROMIS goal of recalibrating item banks and evaluating their psychometric properties in multiple validation studies and in patients with multiple conditions before an item bank is used as a CAT, the aim of the current study was to examine the psychometric properties of the V1.2 Dutch-Flemish PROMIS-PF item bank (DF-PROMIS-PF) in Dutch patients receiving physical therapy in primary care. This is the first study to recalibrate the PROMIS-PF in patients receiving physical therapy. The ultimate aim is to obtain a user-friendly, efficient, precise and valid instrument to measure physical function in patients receiving physical therapy in daily clinical practice and research.
For this study Dutch patients (18 years or older) receiving physical therapy in primary care in the past year were invited. Patients were eligible if they provided informed consent.
The study was approved by the local institutional review board of the VU University Medical Center. Physical therapy practices across the Netherlands were approached to recruit patients for the study through the personal network of the authors, and advertisement in a Dutch physical therapy journal. Thereafter, the patients were invited by their physical therapist, by e-mail or flyer, to complete an online questionnaire.
The questionnaire included items addressing demographic and clinical characteristics. The questionnaire also included all 121 items of the V1.2 DF-PROMIS-PF. The items cover a wide range of activities, from self-care (activities of daily living) to more complex activities that require a combination of skills. The item bank includes items about functioning of the axial regions (neck and back), the upper and lower extremities, and ability to carry out instrumental activities of daily living (e.g. housework, shopping) [
In addition, two generic legacy instruments were administered: the SF36-PF10 and the HAQ-DI [
Demographic and clinical characteristics were described by descriptive statistics. Psychometric analyses were conducted in accordance with the PROMIS analysis plan and were similar to the re-doing of the calibration of the DF-PROMIS-PF in Dutch patients with chronic pain [
To check unidimensionality, a Confirmatory Factor Analysis (CFA) was fitted using the R package lavaan (version 0.5–16) [
Local independence was evaluated by considering the residual correlation matrix [
Monotonicity was evaluated by estimating a nonparametric Mokken scale with the R package mokken [
When the assumptions of unidimensionality, local independence, and monotonicity were met, the fit of the Graded Response Model (GRM) was examined to determine whether the IRT model fits the response data. A logistic GRM was used to estimate item slopes, thresholds, and individual theta scores, using IRTPRO [
Differential Item Functioning (DIF) analyses evaluate whether persons from different groups (for example male vs. female) with similar levels of physical function respond similarly to the items, which supports the validity of comparisons between the groups at issue [
Construct validity indicates whether the item bank really measures the intended construct (physical function). Therefore, construct validity of the DF-PROMIS-PF was evaluated by calculating Spearman correlations between the T-scores of the DF-PROMIS-PF and scores on the two legacy instruments (SF36-PF10 and HAQ-DI). If an instrument measures the intended construct, its scores should be highly correlated with scores of other PROMs measuring the same construct. We hypothesized that the DF-PROMIS-PF would have strong correlations with both legacy instruments (r>0.60), but the strongest correlation (r>0.70) with the SF36-PF10, because the DF-PROMIS-PF and SF36-PF10 were both developed for use in a general population whereas the HAQ-DI was initially developed for patients with rheumatoid arthritis.
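As an illustration of the correlation analysis described above, the following is a minimal sketch of a Spearman rank correlation using only the Python standard library. The function names and the six-patient scores are hypothetical and are not study data; the statistical software actually used for this step is not stated here.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for six patients (illustration only, not study data).
t_scores = [35.2, 41.0, 48.5, 52.3, 57.9, 63.4]   # PROMIS-PF T-scores
sf36_pf10 = [30, 55, 50, 80, 90, 100]             # higher = better function
haq_di = [1.9, 1.1, 1.2, 0.4, 0.1, 0.0]           # higher = more disability

rho_sf36 = spearman(t_scores, sf36_pf10)
rho_haq = spearman(t_scores, haq_di)

# The hypotheses concern the strength of the correlation, so compare |rho|.
print(abs(rho_sf36) > 0.70, abs(rho_haq) > 0.60)
```

Because the HAQ-DI is scored in the opposite direction, its correlation with the T-scores is expected to be negative, which is why the check uses absolute values.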
Reliability indicates whether a measure is precise in estimating the level of the construct, in other words, precise in estimating the physical function T-scores. Reliability within IRT is conceptualized as “information”, in which the fact that measurement precision can differ across levels of the measured trait (θ = Theta) is taken into account [
A total of 805 patients completed the questionnaire. Their demographic and clinical characteristics are summarized in
Characteristic | Physical therapy patients (n = 805); n (%) unless stated otherwise |
---|---|
Age in years, mean (SD), range | 53 (14), 18–88 |
Male | 331 (41) |
Female | 474 (59) |
Netherlands | 761 (95) |
Other | 44 (5) |
Less than High School degree | 21 (3) |
High School degree | 82 (10) |
Some college | 301 (37) |
College degree | 37 (5) |
Advanced degree | 364 (45) |
Head | 14 (2) |
Breast/abdomen | 25 (3) |
Neck/upper back | 152 (19) |
Shoulders/upper arm | 113 (14) |
Elbow/forearm/hand | 23 (3) |
Low back | 157 (20) |
Pelvis/hip/upper leg | 76 (9) |
Knee | 86 (11) |
Lower leg/ankle/foot | 52 (6) |
More than 1 region | 107 (13) |
Disorder of muscles, bones or joints without surgery | 391 (49) |
Recovery after surgery | 100 (12) |
Condition resulting from an accident without surgery | 70 (9) |
Cardiac, vascular or lymphatic disorder | 25 (3) |
Pulmonary affection | 20 (2) |
Other internal disorder | 4 (1) |
Neurological disorder | 15 (2) |
Gynaecological disorder | 7 (1) |
Disorder with no known cause | 11 (1) |
Rheumatic disease | 17 (2) |
Osteoarthritis | 45 (6) |
Other | 100 (12) |
0–3 months | 126 (16) |
3–6 months | 116 (14) |
6–12 months | 166 (21) |
1–2 years | 146 (18) |
2–5 years | 85 (10) |
>5 years | 166 (21) |
DF-PROMIS-PF T-score, mean (SD), range | 48.2 (9.4), 21.4–73.5 |
SF36-PF10, mean (SD) | 75.2 (26) |
HAQ-DI, mean (SD) | 0.4 (0.5) |
SF36-PF10 = Short Form Health Survey Physical Functioning (range 0–100, higher scores indicate better physical function); HAQ-DI = Health Assessment Questionnaire-Disability Index (range 0–3, higher scores indicate less physical functioning).
The results for unidimensionality were mixed: the CFA showed unscaled fit indices of CFI = 0.982, TLI = 0.982 and RMSEA = 0.091, and scaled indices of CFI = 0.924, TLI = 0.923 and RMSEA = 0.045. The scaled CFI and TLI did not meet the criterion of >0.95, while the scaled RMSEA did meet the criterion of <0.06. Furthermore, the first factor accounted for 61.5% of the variance and the ratio of the variance explained by the first to the second factor was 7.6, well above the criterion of 4. Altogether, these results indicate sufficient unidimensionality.
Some violations of local independence were found: The residual correlation matrix showed that 592 out of 7260 item pairs (8.2%) were flagged for local dependence (
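The number of item pairs and the flagged percentage reported above follow directly from the size of the item bank. A short arithmetic check (a sketch; the variable names are illustrative):

```python
from math import comb

n_items = 121          # items in the DF-PROMIS-PF bank (from the text)
flagged_pairs = 592    # item pairs flagged for local dependence (from the text)

n_pairs = comb(n_items, 2)                    # unordered pairs: 121 * 120 / 2
pct_flagged = 100 * flagged_pairs / n_pairs   # share of pairs flagged, in percent

print(n_pairs)                 # 7260
print(round(pct_flagged, 1))   # 8.2
```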
No violations of monotonicity were found. The scalability coefficients for all item pairs were positive, only one item had a scalability coefficient <0.30 and the scalability coefficient H for the full scale was 0.57, suggesting strong scalability.
The GRM item slope parameters ranged from 1.36 to 4.29 (mean 2.74) and the item threshold parameters ranged from -4.28 to 2.33, indicating good coverage of the physical function construct. The mean T-score of the DF-PROMIS-PF for the Dutch sample was 48.2 (SD = 9.4), with a range from 21.4 to 73.5. This indicates a slightly lower average level of physical function in Dutch patients who were receiving physical therapy or had completed it in the past year than in the US general population, with large variation among patients.
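To make the slope and threshold parameters concrete, the sketch below computes response-category probabilities for a single item under the standard logistic GRM and converts theta to the T-score metric. It assumes the usual PROMIS convention T = 50 + 10 × theta. The slope is the mean slope reported above, but the thresholds are hypothetical values for a five-category item, and the function names are illustrative rather than taken from IRTPRO.

```python
import math


def grm_category_probs(theta, slope, thresholds):
    """Category probabilities for one item under the logistic GRM.

    thresholds must be in increasing order; K-1 thresholds give K categories.
    """
    # Cumulative probabilities P(X >= k) for k = 1..K-1.
    cumulative = [1 / (1 + math.exp(-slope * (theta - b))) for b in thresholds]
    # P(X >= 0) = 1 and P(X >= K) = 0 bracket the cumulative curve.
    bounds = [1.0] + cumulative + [0.0]
    # Each category probability is the difference of adjacent cumulative values.
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]


def theta_to_t(theta):
    """Convert theta to a T-score (assumed convention: mean 50, SD 10)."""
    return 50 + 10 * theta


slope = 2.74                          # mean slope reported in the text
thresholds = [-2.0, -1.0, 0.0, 1.0]   # hypothetical thresholds, not study values
theta = -0.18                         # corresponds to the sample mean T-score

probs = grm_category_probs(theta, slope, thresholds)
print([round(p, 3) for p in probs])
print(round(theta_to_t(theta), 1))
```

A higher slope makes the category probabilities change more sharply around each threshold, which is why items with high slopes contribute more information near their thresholds.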
Only two out of 121 Dutch-Flemish PROMIS Physical Function items were flagged for DIF for age and fourteen for DIF for gender (
The DF-PROMIS-PF correlated strongly with both the SF36-PF10 (r = 0.84) and the HAQ-DI (r = -0.85), as expected. The correlation with the SF36-PF10 was, however, not higher than the correlation with the HAQ-DI.
The horizontal axis represents the different physical function abilities with T = 50 representing the mean of the US general population with a standard deviation of 10. The vertical axis represents the standard error (reliability), with reference reliabilities of 0.80, 0.90 and 0.95. The lower the curve, the greater the reliability. The lower plot shows the distribution of the Dutch physiotherapy patients (Dutch PT) sample and the US general population sample along the T-score scale.
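The relation between the standard error and the reference reliabilities can be sketched as follows. This assumes the common IRT convention that reliability = 1 − SE² when the trait has unit variance in the reference population, so that on the T-score metric (SD = 10) the standard error is first divided by 10. The function names are illustrative.

```python
import math


def reliability_from_se(se_t, sd_t=10.0):
    """Reliability implied by a standard error on the T-score metric."""
    return 1 - (se_t / sd_t) ** 2


def se_for_reliability(reliability, sd_t=10.0):
    """Standard error on the T-score metric for a given reliability."""
    return sd_t * math.sqrt(1 - reliability)


for r in (0.80, 0.90, 0.95):
    print(r, round(se_for_reliability(r), 2))
# 0.8 4.47
# 0.9 3.16
# 0.95 2.24
```

Under this assumption, the part of the curve lying below a standard error of about 3.2 T-score points corresponds to a reliability above 0.90.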
This is the first validation study of PROMIS in a physical therapy population. The results of the current study add to the evidence on the psychometric properties of the DF-PROMIS-PF. The results supported sufficient unidimensionality and showed some local dependence, good monotonicity, good IRT model fit, good coverage of the construct of physical function and negligible DIF for age and gender. Furthermore, good construct validity and high reliability across the physical function construct were found. Although further improvement of the item bank may be possible, we consider the psychometric properties of the DF-PROMIS-PF sufficient to measure the level of physical function of physiotherapy patients when applied as short forms or CAT.
The average age (53 years) and percentage of females (60%) in our sample match those of the average physical therapy patient population in the Netherlands in 2015, but the percentage of patients with complaints lasting longer than 6 months was much higher in our sample (70% compared to 25% in the Netherlands) [
Our results regarding the psychometric properties of the DF-PROMIS-PF were similar to those of previous studies of the PROMIS-PF. Previous studies in different patient populations and in people from the general population also found small problems with the IRT assumptions unidimensionality and local dependence (
In the current study, items with DIF with respect to age and gender were found; however, their impact on the physical function scores was negligible. Previous studies on the PROMIS-PF also showed no or minimal impact of DIF for age and gender [
The current study supports the construct validity of the DF-PROMIS-PF by showing strong correlations between the DF-PROMIS-PF and the traditionally used SF36-PF10 (r = 0.84) and HAQ-DI (r = -0.85). This is in line with Oude Voshaar et al., who found similarly strong correlations between the DF-PROMIS-PF and both the SF36-PF10 (r = 0.84) and the HAQ-DI (r = -0.76) [
The current study as well as several previous studies showed that the PROMIS-PF measures have good reliability; the reliability of the total item bank as well as of the short forms and CATs was greater than 0.80, or even 0.90 or 0.95, for the range of the scale where the study samples were located [
Whereas the current study focused on the psychometric properties of the whole item bank and showed that the DF-PROMIS-PF has sufficient properties to be used as a CAT, future research is recommended on known-groups validity, test-retest reliability and responsiveness of the CAT, because the CAT would most likely be used in pre- and post-intervention measurements. A recent study in orthopedic reconstruction patients showed that the PROMIS-PF CAT outperformed the legacy instruments Knee injury and Osteoarthritis Outcome Score Joint Replacement (KOOS-JR) and Hip disability and Osteoarthritis Outcome Score Joint Replacement (HOOS-JR) with respect to responsiveness [
The psychometric properties of the DF-PROMIS-PF item bank are sufficient for use as short forms or CAT to measure the level of physical function in Dutch physical therapy practice. Using the highly efficient DF-PROMIS-PF CAT in clinical practice is considered feasible with little administration time, and has the potential for standardized and routine patient monitoring across a wide range of patients receiving physical therapy.
The Dutch-Flemish PROMIS group is an initiative that aims to translate and implement PROMIS item banks and CATs in the Netherlands and Flanders (