Psychometric evaluation of the Chinese version of the revised American Pain Society Patient Outcome Questionnaire concerning pain management in Chinese orthopedic patients

The present study tested the psychometric properties (item grouping, internal consistency of the subscales, and construct validity) and clinical feasibility of a widely used pain assessment system, the Mandarin version of the American Pain Society Patient Outcome Questionnaire (APS-POQ-R-C), in Chinese patients. We also attempted to investigate the current quality of pain management provided in orthopedic inpatient units in China and provide baseline data. In addition, we investigated, for the first time, the test–retest reliability of APS-POQ-R-C. In total, 236 orthopedic patients were evaluated. Our results showed that APS-POQ-R-C has satisfactory internal consistency and construct validity, although some items are not appropriate for orthopedic patients. Test–retest reliability outcomes indicated that APS-POQ-R-C is a satisfactory instrument with acceptable validity and reliability; it is therefore recommended for evaluating pain management in future studies.


Introduction
Pain is one of the most common symptoms associated with many disorders and is more frequent in surgical and cancer patients [1][2][3][4][5]. Unrelieved pain often causes serious negative consequences, including physiological and psychological impairments, which are harmful to patient outcomes and quality of life [6]. Pain is, therefore, regarded as the fifth vital sign by many organizations, such as the Veterans Health Administration (VHA), the National Pain Management Coordinating Committee, and the American Pain Society (APS) [7][8][9]. Eliminating pain, or at least reducing it to a tolerable level, is emphasized by more and more clinicians, and pain management plays a crucial role in achieving this [10][11][12][13]. Pain is a subjective experience encompassing multiple dimensions [14,15]. In order to effectively manage pain, a quality improvement (QI) approach based on reliable and effective assessments is very important [16,17]. A satisfactory assessment should include not only pain intensity (PI), but also all the domains affected by pain or affecting pain, such as ongoing pain assessment, interdisciplinary cooperation (nursing, clinical medicine, clinical pharmacy, and psychology), appropriate treatment, specialty care, and patient input [16]. Only an appropriate QI approach based on sound pain assessments can provide a reliable evaluation of patients with pain, which in turn contributes to effective pain management. There are many measurements of pain, such as the Strategic and Clinical Quality Indicators in Postoperative Pain Management [18], the Pain Treatment Satisfaction Scale [19], and the Patient Pain Questionnaire [20]. However, the most common assessment is the American Pain Society Patient Outcome Questionnaire (APS-POQ), which was developed in 1991 [21] and revised in 1995 and 2010 [1,10,14]. The latest version of APS-POQ was released in 2010. 
It is an investigator-reported measurement of six aspects of pain management: pain relief and severity; effect of pain on activity, sleep, and emotion; usefulness of information regarding pain treatment; nonpharmacological pain management; side effects of pain treatment; and participation in pain treatment decisions [10,16]. This version has been translated into 11 languages [16]. In 2010, Gordon et al. investigated pain in 299 patients using the English version (APS-POQ-R). That study reported the initial psychometric properties of APS-POQ-R for QI and verified the internal consistency of the instrument subscales and the construct validity of APS-POQ-R [16]. A later study evaluated the quality of an Icelandic version of APS-POQ-R (APS-POQ-R-I) in 143 patients. The authors found that APS-POQ-R-I was feasible and had acceptable construct validity and reliability; it is now recommended for evaluating the quality of pain management in hospitals in Iceland [10]. These studies were based on Caucasian populations; there are no data regarding the usage of APS-POQ-R among Mongolian populations. Many studies suggest that sensitivity to and tolerance of pain differ among ethnic groups. Tan et al. compared pain scores among the four main ethnic groups in Singapore and found that Indians had the highest mean pain score and used the largest amount of morphine [22]. Recently, Holmgaard et al. also reported that people with dark eyes and hair exhibit higher pain sensitivity [23]. In this regard, it is meaningful to investigate the APS-POQ-R among Mongolian populations.
The aim of the present study was to test the item grouping, internal consistency of the instrument subscales, construct validity, and clinical feasibility of the Mandarin version of APS-POQ-R (APS-POQ-R-C) in Chinese orthopedic patients. We also investigated the relationships among items, subscales, and variables, which may potentially play a role in the prediction of outcomes affected by regional environments, cultural traditions, and native background behind the reported methods. We further attempted to investigate the current quality of pain management provided in orthopedic inpatient units in China and to provide baseline data. Moreover, we investigated the test-retest reliability of APS-POQ-R-C.

Materials and methods

Scale
The Mandarin version of APS-POQ-R was obtained directly from the American Pain Society website and can be freely used without further permission. The questionnaire consists of 22 items, covering pain severity (P1-P3), pain interference (P4, P5), adverse drug effects (P6), pain relief, participation in decision-making, satisfaction, and received pain treatment information (P7-P10), nonpharmacological methods for treating pain (P11, P12), and whether patients received help from investigators when completing the questionnaire (P13). Patients were asked to read and sign the informed consent form at the beginning of the questionnaire.
During the preliminary investigation, we found that patients undergoing spine or hip joint surgery had difficulty completing the item on activities out of bed (P4b) because local or systemic immobilization was required in the first 24 h after surgery. These patients were often unsure whether their disability was caused by pain or by the immobilization requirement. However, in order to retain the integrity of the scale, the item was not removed in subsequent tests. Instead, repeated attempts were made to help patients understand the context of the question. Only 2 patients remained confused; we excluded their responses and counted them among the blank forms.
A number of patients could not clearly distinguish between pain levels (maximum or minimum) and may have reported the same level for both. In this regard, two items ("time spent in severe pain," P3 and "pain relief," P7) were somewhat difficult for patients to report. We therefore supplemented the definition of "severe pain," namely, a pain experience causing bad effect, according to a previous study [16].

Patients
All patients were recruited from the 50-bed ward of the orthopedics department of a 1216-bed comprehensive university hospital (February 2016 to June 2016). Patients in this department are treated for fractures and acute musculoskeletal injury, spinal injury, wrist arthroscopy, hand and wrist fusions, shoulder and elbow replacements, hip disorders, and knee ligament reconstruction. The present study complied with the Declaration of Helsinki of the World Medical Association (2000), and was approved and supervised by the Ethics Committee of Huashan Hospital of Fudan University (approval number: 2015291). Informed consent was obtained from each patient or their relatives after all procedures were fully explained. The patients were recruited using the same methods as a previous study [16]. Briefly, a head nurse who was not directly involved in treating the patients was employed to select patients. The head nurse was well trained in performing assessments and data collection. The criteria for inclusion were age ≥14 years, orthopedic surgery in the past 72 h, pain in the past 24 h, being conscious and responsive throughout the study, and being a native Chinese speaker without any communication obstacles. Once a patient was considered, the head nurse communicated with him/her and decided whether the patient would be included. If the patient met the inclusion criteria and agreed to participate in the study, the nurse introduced all the items of the APS-POQ-R-C and personally answered all questions from patients or their relatives. Patients were encouraged to complete the scale independently. Only those who underwent local or systemic immobilization completed the scoring with the help of the head nurse. In this case, the nurse only objectively recorded the patient's answers on the scale. All the data were checked by the investigators of the present study.

Statistical analysis
We used the same statistical methods as in a previous study [16]. Briefly, the difference in survey results between patients undergoing nonpharmacological therapy and those without nonpharmacological therapy was tested by the Mann-Whitney U test for skewed data. A correlation analysis was conducted to assess the relationships among items, subscales, and variables. Multiple stepwise linear regression analyses were used to evaluate the effect of items, subscales, and variables on satisfaction outcomes and to identify the predictors of satisfaction. Face validity was determined according to the methods of Shaik et al. [24]; 10 patients were involved in the preliminary experiment to acquire proper feedback. The Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test of sphericity were used to assess the appropriateness of using factor analysis on the data, and principal component analysis with varimax rotation was used to confirm the construct validity of the questionnaire. For the evaluation of the original APS-POQ-R, the scores of three items (P7, P8, and P9, for which higher scores represent a favorable situation) were reversed to be in line with the other major items. The estimated items P3 (time spent in severe pain) and P7 (pain relief) were normalized to 0-10 scales to match the other items. Exploratory principal components factor analysis with varimax rotation was employed to extract components from all 18 continuous-scale items (P1 to P9 listed in the APS-POQ-R). The internal consistency of APS-POQ-R-C was assessed with Cronbach's alpha (α), which is acceptable if the α value is >0.7 [24]. For the test-retest assessment, patients were selected randomly and asked to complete the questionnaire twice before leaving the hospital; only 26 patients agreed to participate in these tests. These 26 patients were similar to the rest of the group. Because the satisfaction item was continuous, linear regression was employed to identify the satisfaction predictors. The correlation between scales and subscales of the two surveys was used to evaluate the test-retest reliability. All statistical analyses were completed with SPSS version 20.0 software (IBM SPSS Inc., Chicago, IL, USA).
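As an illustration of the item recoding described above, the following Python sketch reverses the three favorable-direction items (P7, P8, P9) and rescales the two estimated items (P3, P7) to 0-10. It is not the SPSS procedure used in the study. The function name `recode`, the dictionary layout, and the assumption that P3 and P7 are recorded as 0-100 percentages are introduced here for illustration only.

```python
# Hypothetical recoding helper; item keys follow the questionnaire labels.
REVERSED = {"P7", "P8", "P9"}   # higher raw score = more favorable
PERCENT = {"P3", "P7"}          # assumed to be recorded as 0-100 percentages


def recode(responses):
    """Return a copy of the responses on a common 0-10, higher-is-worse scale."""
    out = {}
    for item, value in responses.items():
        score = float(value)
        if item in PERCENT:
            score = score / 10.0      # 0-100% -> 0-10
        if item in REVERSED:
            score = 10.0 - score      # reverse so that higher means worse
        out[item] = score
    return out


# Made-up example respondent (not study data)
print(recode({"P1": 2, "P3": 20, "P7": 70, "P8": 9, "P9": 10}))
```

Rescaling before reversing keeps every item on the same 0-10 range, so the reversal can use a single constant.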

Patient demographics
A total of 249 questionnaires were returned from 269 eligible participants (20 cases declined), for a response rate of 92.6%; of these, 236 were analyzed (13 were excluded because they returned blank forms, including the 2 patients who remained confused about item P4b). The final participation rate was 87.7%. The average completion time of the questionnaire was approximately 10 min (range, 6-20 min). Table 1 summarizes the characteristics of the participants (n = 236; 47.5% males and 52.5% females). Mean patient age was 54.8 ± 14.7 years (range, 16-82 years). Three categories of orthopedic surgery were involved in the present study: 47 upper limbs (19.9%), 113 lower limbs (47.9%), and 76 spinal columns (32.2%). With respect to education, 58 (24.6%) reported a college (or greater) level of education, and 144 (61.0%) reported a high school education.
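The rates reported above can be checked directly from the counts given in the text. The short Python sketch below does this arithmetic; the helper name `pct` is hypothetical, and the denominators (269 eligible patients for the rates, 236 analyzed patients for the surgical categories) are read from the paragraph above.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(part / whole * 100, 1)


eligible, returned, analyzed = 269, 249, 236

print(pct(returned, eligible))   # response rate: 92.6
print(pct(analyzed, eligible))   # final participation rate: 87.7

# Surgical categories as a share of the 236 analyzed patients
for label, n in [("upper limbs", 47), ("lower limbs", 113), ("spinal columns", 76)]:
    print(label, pct(n, analyzed))
```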

Initial survey of validity and reliability of APS-POQ-R-C
Regarding the initial component loading matrix of APS-POQ-R, the KMO was 0.80 and the probability value of the Bartlett test of sphericity was low (χ² = 1229.8; p < 0.001), indicating that factor analysis was applicable. Six factors with eigenvalues over 1 were identified, which together explained 67.57% of the total variance. According to the rotated component matrix, the 18 items were assigned to 6 groups using the criterion of a loading coefficient >0.5. Except for "severity of itching" (P6c), which was isolated as a single factor, the initial results were similar to the subscales of the original version, which were labeled the affective subscale (4 variables: P5a-P5d), pain severity and sleep interference subscale (5 variables: P1-P3, P4c, P4d), perceptions of care subscale (3 variables: P7-P9), activity interference subscale (2 variables: P4a, P4b), and adverse drug reaction (ADR) subscale (3 variables: P6a, P6b, and P6d).
We also investigated the Cronbach's α and the corrected item-total correlation of the total scale and subscales. The Cronbach's α of the initial total APS-POQ-R-C was 0.798, whereas the value based on standardized items was 0.820. Items were adjusted according to the changes in the α values of the total scale and subscales after item deletion. Four items, namely, "activity interference out of bed" (P4b, 0.818), "severity of nausea" (P6a, 0.8), "severity of drowsiness" (P6b, 0.799), and "participation of pain treatment decision" (P8, 0.804), would increase the total α if deleted. After rounding, only deletion of P4b could enhance the α value. Spinal and hip joint surgery required immobilization after the operation; these patients could not get out of bed, which is why some of them did not complete the interference section, in contrast to patients undergoing surgery on other parts of the body. In this regard, this item may not be appropriate for patients strictly confined to bed.
Moreover, the loading coefficient of the itching item on the component formed by the other 3 ADR items was 0.057, distinct from those of the other 3 items (severity of nausea, 0.795; severity of dizziness, 0.793; severity of drowsiness, 0.582). We attempted to retain the itching item in the ADR subscale, but it lowered the α of the ADR subscale to <0.6. The items "activity interference out of bed" and "severity of itching" were therefore removed, and the validity and reliability of the adjusted APS-POQ-R-C were recalculated.
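The item-deletion check described above can be sketched in Python using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). This is a minimal illustration, not the SPSS routine used in the study; the function names and the small data set are hypothetical and chosen only so that one inconsistent item visibly lowers α.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per respondent."""
    k = len(rows[0])
    items = list(zip(*rows))                       # one tuple of scores per item
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var / total_var)


def alpha_if_deleted(rows, names):
    """Alpha of the remaining items after removing each item in turn."""
    result = {}
    for j, name in enumerate(names):
        reduced = [[v for i, v in enumerate(r) if i != j] for r in rows]
        result[name] = cronbach_alpha(reduced)
    return result


# Made-up scores: three consistent items plus one that does not track them
rows = [[1, 1, 1, 3], [2, 2, 2, 1], [3, 3, 3, 2], [4, 4, 4, 1]]
names = ["item_a", "item_b", "item_c", "noisy"]
print(round(cronbach_alpha(rows), 3))
for name, alpha in alpha_if_deleted(rows, names).items():
    print(name, round(alpha, 3))
```

An item whose deletion raises α above the full-scale value is a candidate for removal, which is the logic applied to P4b and P6c in the text.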

Construction of the final scale and subscale of APS-POQ-R-C for orthopedic surgery
Table 2 shows the component loading matrix of the adjusted APS-POQ-R-C. After removing P4b and P6c, the KMO was 0.818 and the P value of the Bartlett test was <0.001. The exploratory factor analysis with 16 items produced 4 factors with eigenvalues >1, namely, "pain severity and interference" (6 items: P1-P3, P4a, P4c, P4d), "affection" (4 items: P5a-P5d), "perceptions of care" (3 items: P7-P9), and "ADR" (3 items: P6a, P6b, P6d). The eigenvalues after rotation were 3.47, 2.61, 1.80, and 1.73, respectively. The loading coefficients of items on each factor were >0.5, with the exception of the item of pain relief in the first hours, which had a loading coefficient >0.4 on 2 factors ("perceptions of care" and "pain severity and interference"). The pain severity and interference factor could explain 30.52% of the variance, and all 4 factors could explain 60.07% of the total variance. Table 3 shows the correlations of all items, including the subscales. All of the correlations were >0.3, which indicated that each item was consistent with the measurement behavior of its subscale and could not be discarded [25]. The Cronbach's α of the total scale was 0.818 and the standardized value was 0.83; these data indicated that the reliability of the final version of the total APS-POQ-R-C was good. The α values of the 4 subscales were 0.819 ("pain severity and interference"), 0.812 ("affection"), 0.609 ("ADR"), and 0.618 ("perceptions of care") (Table 3). Although 0.7 is usually regarded as the threshold for acceptable internal consistency, for subscales with few but important items, such as "ADR" and "perceptions of care," the threshold can be adjusted to 0.6 [16,26]. Table 4 shows the correlation matrix among subscales. The results indicated that the pairwise correlations among 3 subscales ("pain severity and interference," "affection," and "perceptions of care") were higher than those between the ADR subscale and the others. The highest pairwise correlation coefficient was between "pain severity and interference" and "affection" (0.508), which indicated good independence of the 4 subscales (Table 4).
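The total variance explained can be related to the rotated eigenvalues reported above. The Python sketch below assumes the factor analysis was run on standardized items (a correlation matrix), so that each of the 16 items contributes one unit of variance; that assumption, and the function name, are introduced here and are not stated in the study.

```python
def total_variance_explained(eigenvalues, n_items):
    """Percentage of total variance carried by the retained factors,
    assuming each standardized item contributes one unit of variance."""
    return sum(eigenvalues) / n_items * 100.0


rotated_eigenvalues = [3.47, 2.61, 1.80, 1.73]   # values reported in the text
retained = [ev for ev in rotated_eigenvalues if ev > 1]   # eigenvalue > 1 criterion

print(len(retained))
print(round(total_variance_explained(retained, 16), 2))
```

Under this assumption the result is close to the reported 60.07%, with the small gap attributable to the eigenvalues being rounded to two decimals.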
The test-retest reliability of the final APS-POQ-R-C was assessed in 26 patients. The correlation coefficients of the subscales and total scale between the two tests of the final APS-POQ-R-C were high (0.714-0.914). In the independent sample test, no significant difference was found in the 4 subscales or the total scale between the two tests. The intraclass correlation coefficients (ICCs) were >0.80 for the total scale and >0.75 for all subscales except the perception of care subscale (0.654). According to these results, the test-retest reliability of the final APS-POQ-R-C was verified.
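For readers who wish to reproduce this kind of test-retest check, the Python sketch below computes a Pearson correlation and a two-way, single-measure, consistency-type ICC for two administrations. The study does not state which ICC form was used, so the consistency form is an assumption made here, and the function names and scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def icc_consistency(test, retest):
    """Two-way, single-measure, consistency ICC for k = 2 administrations."""
    n, k = len(test), 2
    rows = list(zip(test, retest))
    grand = (sum(test) + sum(retest)) / (n * k)
    ss_rows = k * sum((sum(r) / k - grand) ** 2 for r in rows)      # between patients
    col_means = (sum(test) / n, sum(retest) / n)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)          # between occasions
    ss_total = sum((v - grand) ** 2 for r in rows for v in r)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Made-up subscale scores for four patients at test and retest
first, second = [1, 2, 3, 4], [2, 1, 4, 3]
print(pearson_r(first, second), icc_consistency(first, second))
```

The consistency form ignores a constant shift between the two occasions; an absolute-agreement form would penalize such a shift and could give lower values.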

Construction of the model of satisfaction prediction
Tests of independent samples were employed to assess the differences in APS-POQ-R-C survey results between patients undergoing nonpharmacological therapy and those without nonpharmacological therapy. Of the 16 items in the scale, only "percentage of time spent in severe pain" (P3) and "pain interference with falling asleep" (P4c) were significantly higher in patients undergoing nonpharmacological therapy. No difference was found in "patient satisfaction," "participation in decision-making," or the other items (Table 5). Table 6 shows the regression model of satisfaction predictors. To screen for predictors of patient satisfaction, the satisfaction item (P9) was employed as the dependent variable, while the other items in the final APS-POQ-R-C (together with age and binary variables including sex, information received about pain treatment [P10], and nonpharmacological methods to relieve pain [P11]) were selected as independent variables. The model contained 5 variables: "pain relief," "participation in pain treatment decisions," "activity interference in bed," "depression caused by pain," and "least pain in 24 hours" (adjusted R² = 0.29, F = 20.36, p < 0.001). Our data indicated that these five items may be important components that influence pain satisfaction in patients (Table 6).

Descriptive statistics of final version of APS-POQ-R-C items
As in the original version of APS-POQ-R-C, the continuous scales as well as the necessary additional items for pain treatment were included in the final version of the scale. Table 7 shows the numeric rating scale (NRS) data measured from 0-10. The mean of "worst pain in 24 hours" was 3.3136 ± 1.97301 (indicating a mild degree of pain), "pain relief" was 68.405% ± 30.9244%, and "satisfaction" was 9.0819 ± 1.57007. These data showed that the scores for "degree of satisfaction" were high, which indicated that pain in these postoperative patients was generally under control. With regard to the additional items for pain, we calculated and summarized the data using the methods of Gordon et al. [16] and Zoega et al. [10]. Fig 1 shows that 92.37% of the patients received information about pain treatment options, and 60.17% of patients selected nonpharmacological methods to relieve pain. Table 8 shows that the average "usefulness of information received for pain treatment" was 8.7477 ± 2.08475. As for how often the clinician encouraged nonpharmacological methods, most of the patients (66.2%) considered that the clinician "sometimes" encouraged nonmedication methods, while 18.2% selected "often" and 15.6% selected "never."

Discussion
Evaluating the reliability of the total scale
Although the use of APS-POQ (1995) and APS-POQ-R (2010) is well documented [10,16,17,27], no studies have evaluated these scales in Asian populations. In the present study, we found that APS-POQ-R-C has satisfactory internal consistency for the total scale (original version α = 0.798; final version α = 0.818) in Chinese patients. In previous studies using APS-POQ-R in different ethnic groups, internal consistency differed considerably among populations. The Cronbach's α was 0.86 in Gordon's study [16] and 0.84 in the final Icelandic version [10], but was low in Danish (Cronbach's α = 0.54) and Australian (Cronbach's α = 0.63) cohorts [28]. These differences may derive from differences among populations in pain sensitivity and tolerance, pain management systems, prescribing habits for pain, and understanding of the items owing to cultural differences. Based on these results, we suggest that APS-POQ-R-C is a valid and reliable instrument for pain evaluation in Chinese patients, although some revision (removal of P4b and P6c) is recommended in actual practice for orthopedic patients. To the best of our knowledge, the present study is the first to apply APS-POQ-R among Asian patients.

Modifying the items of the APS-POQ-R-C
Moreover, several differences were found in comparison with the previously investigated original American APS-POQ-R [16] and the Icelandic version [10]. First, we had to delete the "itching" item (P6c) from the ADR subscale because including it lowered the Cronbach's α of the subscale to <0.6, whereas deleting it raised the Cronbach's α to >0.6. Generally, ADR symptoms can be divided into type A or type B [29,30]. In this study, the "itching" item was type B, whereas the others were type A. For patients with pain, opioid intake is an important factor that should be taken seriously into account in clinical practice, since release of histamine by opiates is a potential cause of allergic reactions. In China, nonsteroidal anti-inflammatory drugs, instead of opioids, are the most widely used drugs to treat pain after orthopedic surgery. The present study evaluated orthopedic patients, whose use of opioid analgesics was lower (and therefore related to less itching) than that of the tumor patients in the previous study [16]. Differences in drug administration caused by patient selection between the present study and previous ones may explain the difference in the "itching" item. Second, we deleted the item "interference with activities out of bed" (P4b) and then merged the "pain severity and sleep interference" subscale and the "activity interference" subscale into one subscale in the final APS-POQ-R-C according to the results of the exploratory factor analysis. The removal of P4b enhanced the Cronbach's α of the total scale to >0.8. For some orthopedic patients, immobilization was required during the first 24 h after surgery. Also, it was difficult for some patients (such as those who received spinal or hip joint surgery) to distinguish whether their disability was caused by pain, which might have caused confusion or difficulty in completing P4b, consequently lowering the Cronbach's α. After removal of P4b, the remaining item of the "activity interference" subscale (P4a) was therefore classified in the "pain severity and sleep interference" subscale.

The lower Cronbach's α of "ADR" and "perceptions of care" subscales in the present study
We also found that the Cronbach's α of the ADR subscale in the present study (0.609) was slightly lower than that in the American study (0.63) and lower than that in the Icelandic study (0.75) [10,16]. One reason is that the subscale in the present study had fewer items (3 vs. 4 in the American study). Moreover, as in the American study, the ADRs of the orthopedic patients in the present study were very low; this might be another explanation for the low Cronbach's α. Another problem in the present study was that the Cronbach's α of the "perceptions of care" subscale was only 0.618. This result was not as satisfactory as those for the total scale and the "affection" subscale. As with the ADR subscale, too few items may lower the Cronbach's α. Although all items of the "perceptions of care" subscale (P7, P8, and P9) were well described, some patients may have confused the questions on "satisfaction and participation" in the pain treatment questionnaire with an investigation of "medical or nursing quality." A previous study reported that patients were satisfied with pain management even with an uncontrolled pain level [27]. In addition, the Cronbach's α of the "satisfaction" subscale was low in Dihle's study [1], which investigated the reliability and validity of an older version of the APS questionnaire in orthopedic postoperative patients; therefore, further studies are needed to investigate non-surgical patients and patients undergoing other types of surgery.

Comparing the scores between patients with and without nonpharmacological therapy
With regard to the comparison between patients undergoing nonpharmacological therapy and those without nonpharmacological therapy, only the scores of 2 items, namely "percentage of time spent in severe pain" and "pain interference with falling asleep," were significantly higher in the nonpharmacological group. There was no difference in the "degree of satisfaction and participation" between the two groups. This differs from the American study, in which patients in the nonpharmacological methods group had significantly lower satisfaction scores and significantly higher "worst pain" scores.

Factors affecting the satisfaction of pain treatment
Along with "pain relief" (P7) and "level of participation" (P8), which were also predictors in the American version, other predictors, namely "least pain" (P1), "depression" (P5b), and "activity interference in bed" (P4a), were included in the APS-POQ-R-C model; however, "the time in severe pain" (P3) and "information received" (P10), which were predictors in the American version, were not included. When the subscale scores were selected as the independent variables, only the "pain severity and interference" subscale and the "affective" subscale demonstrated a significant contribution to changes in "satisfaction" or the "perception of care" subscale, while the ADR subscale did not influence the result.

The test-retest reliability of the APS-POQ-R-C
The test-retest reliability of the APS-POQ was verified for the first time in the present study. Spearman correlation and ICC are usually employed as indicators of the reliability coefficient, with the ICC being the more rigorous. Both indexes were adopted in this study to assess test-retest reliability. Test-retest reliabilities for each of the 4 subscales and for the total scale were statistically significant (p < 0.001). Except for the ICC of the "perception of care" subscale (0.654), which was acceptable and might have been caused by some patients confusing the items on "satisfaction and participation concerning pain care" with the "investigation of medical or nursing quality," the other subscales showed very satisfactory retest reliability coefficients (0.769-0.920). Good test-retest reliability of the first three factors (pain severity and interference, affection, and ADR) indicated a minimal measurement error related to random variance [31]. Our test-retest reliability results confirmed the stability of APS-POQ-R-C. The second scoring during the test-retest procedure may have been influenced by patients' feelings about the medical service, including the doctor, nurse, or hospital.

Conclusions
Taken together, the present study is the first to investigate APS-POQ-R-C among Chinese orthopedic patients. However, our patients differed from those in previous studies conducted in other languages, so it is difficult to draw conclusions by simply comparing the present study with the previous ones. Our data indicate that APS-POQ-R-C has satisfactory internal consistency and construct validity in general. Moreover, the test-retest results further indicated that APS-POQ-R-C is valid and reliable, and it is therefore recommended for pain management in Chinese orthopedic patients.