Item response theory evaluation of the biomedical scale of the Pain Attitudes and Beliefs Scale

Objectives The assessment of health care professionals’ attitudes and beliefs towards musculoskeletal pain is essential because they are key determinants of their clinical practice behaviour. The Pain Attitudes and Beliefs Scale (PABS) biomedical scale evaluates the degree of health professionals’ biomedical orientation towards musculoskeletal pain and has never been assessed using item response theory (IRT). This study aimed to assess the psychometric performance of the 10-item biomedical scale of the PABS using IRT. Methods Two cross-sectional samples (BeBack, n = 1016; DABS, n = 958) of health care professionals working in the UK were analysed. Mokken scale analysis (nonparametric IRT) and common factor analysis were used to assess dimensionality of the instrument. Parametric IRT was used to assess model fit, item parameters, and local reliability (measurement precision). Results Results were largely similar in the two samples and the scale was found to be unidimensional. The graded response model showed adequate fit, covering a broad range of the measured construct in terms of item difficulty. Item 3 showed some misfit but only in the DABS sample. Some items (i.e. 7, 8 and 9) displayed markedly higher discrimination parameters than others (4, 5 and 10). The scale showed satisfactory measurement precision (reliability > 0.70) between theta values -2 and +3. Discussion The 10-item biomedical scale of the PABS displayed adequate psychometric performance in two large samples of health care professionals, and its use is suggested for assessing professionals’ degree of biomedical orientation towards musculoskeletal pain at the group level.


Introduction
Musculoskeletal (MSK) pain disorders such as low back pain (LBP), neck pain (NP), and osteoarthritis (OA) are a leading cause of disability globally [1]. Moreover, the financial costs of these disorders represent a considerable burden to health care systems and society [2][3][4][5]. Clinical practice guidelines (CPGs) support health care professionals (HCPs) who routinely manage patients with these disorders to deliver best practice care [6][7][8]. However, HCPs managing patients with MSK pain often fail to follow the recommendations of CPGs and, consequently, deliver sub-optimal care [9,10]. One key explanation for not following CPGs recommendations is that clinical practice behaviour is strongly related to HCPs' attitudes and beliefs towards MSK pain [11][12][13][14]. Considering the influential role of HCPs' attitudes and beliefs on clinical practice behaviour [12], and to be able to better target training strategies to those HCPs who do not deliver optimal care, there is a need to have sound measurement instruments with which to assess these variables.
Different self-reported multi-item questionnaires exist to measure HCP attitudes and beliefs towards pain and the most thoroughly tested is the Pain Attitudes and Beliefs Scale (PABS) [15]. This questionnaire was developed and tested in the field of LBP [16,17], and then adapted for other disorders like NP and OA of the knee [18][19][20]. The PABS measures the strength of two theoretically derived clinical approaches by means of two subscales: one covering a biomedical approach and one a biopsychosocial approach [16,17]. Several studies have shown satisfactory Cronbach's alpha, test-retest reliability, construct validity, structural validity and responsiveness for the biomedical scale, whereas unsatisfactory Cronbach's alpha and structural validity highlight the need for major reworking of the biopsychosocial scale [15,21]. The component items recommended for inclusion in the biopsychosocial subscale have varied markedly in previous investigations using the PABS, driven mainly by attempts to improve the dimensionality of the subscale [22][23][24][25].

Item response theory
Item response theory (IRT) provides a framework and toolbox for psychometric evaluations, as it encompasses a family of measurement models that focus on explaining the dependencies between item responses within a person and between persons. IRT models are especially suitable for dichotomous or polytomous (e.g. Likert-type scale) item response data [26,27], like those of the PABS. IRT permits the assessment of dimensionality of a scale and measurement precision at the item level [26,27]. Some analytic features of IRT cannot be obtained with classical test theory (CTT) analysis, such as item parameters and reliability estimation along the continuum representing the measured latent trait, and examination of the optimal number of response options in each item [26][27][28][29][30][31]. The reliability of a measurement instrument is usually represented by a single fixed number such as Cronbach's alpha; yet this conflicts with the fact that a scale cannot be expected to measure each person equally efficiently along the latent trait. In IRT, this problem is solved by using the (Fisher) information function as an estimate of reliability/measurement precision conditional on the latent trait value; this function, showing information for different latent trait values, is known as the scale information function (SIF) [26,27].

Aims of the study
IRT methods provide a valid way to assess and refine scales; however, response option sparseness was highlighted as a key finding of the biopsychosocial scale of the PABS [11] and needs to be resolved prior to IRT testing. Exploratory factor analysis (EFA) has shown the 10-item biomedical scale of the PABS to be unidimensional in samples of HCPs from the Netherlands [16]. Nevertheless, some items displayed factor loadings at the lower limit for acceptability and the dimensionality has never been assessed in HCPs from other countries, making it crucial to further investigate its psychometric performance in other samples and with other analytic methods. Since the goal of the PABS biomedical scale is to provide adequate information for different degrees of biomedical attitudes and beliefs towards pain, an IRT analysis of this scale is warranted. To date, however, no studies have assessed the measurement properties of the biomedical scale of the PABS with IRT methods. Therefore, this study aimed to use IRT to further assess the psychometric performance of the biomedical scale of the PABS.

Study participants
This study used secondary data analysis of two large samples of HCPs in the UK: one assessing their biomedical orientation towards LBP (BeBack study) [11], and the other towards MSK pain more broadly (DABS study).
The BeBack study was a cross-sectional postal survey of general practitioners (GPs) and physiotherapists (PTs) involved in the management of LBP, conducted between April and November 2005 [11]. This study aimed to explore associations between HCPs' attitudes and beliefs towards LBP and their reported clinical behaviour. Simple random sampling was used to obtain details of 2000 GPs and 2000 PTs from national databases [11]. A single reminder was sent to all non-responders four weeks after the first mailing; no incentives were provided for completing the questionnaire [11]. The overall response rate was 38% for a total of 1534 HCPs (443 GPs and 1091 PTs); 66.7% of these (n = 1022, 442 GPs and 580 PTs) reported treating at least one patient with LBP in the previous six months and were included in the analyses of the original study [11].
The DABS dataset used in the current study was a cross-sectional psychometric study involving GPs, PTs, chiropractors and osteopaths. Random samples of HCPs involved in the management of patients with MSK pain (1650 GPs, 750 PTs, 749 chiropractors, 250 osteopaths) were identified through national registries: Binleys (GPs), Chartered Society of Physiotherapy (PTs), British Chiropractic Association (chiropractors), Institute of Osteopathy (osteopaths). A study pack was mailed and contained: letter of invitation, participant information sheet, PABS, and pre-paid return envelope. After two weeks, non-responders were sent a reminder postcard. Two weeks later the study pack was sent again to non-responders and, if a response was not received within two weeks, potential participants were not contacted again. Overall response rates were: 17.7% for GPs, 41.7% for PTs, 45.1% for chiropractors, and 31.6% for osteopaths. After selecting only professionals that treated patients with LBP in the previous six months: 279 GPs, 268 PTs, 329 chiropractors and 78 osteopaths were included.
Ethical approval for the BeBack study was obtained from the West Midlands Multi-centre Research Ethics Committee (MREC) (reference 05/MRE07/1), and for the DABS study from Keele University Ethics Review Panel.

Measurement instrument
The PABS was developed to measure PTs' attitudes and beliefs about non-specific LBP to determine the degree to which they adopted a biomedical or a biopsychosocial treatment approach [17]. The two-factor structure of the original scale was in line with the intentions of the developers [17]; however, the number of items in each subscale was reduced by means of EFA, yielding a 19-item version in a subsequent study [16]. Each PABS item is rated on a 6-point Likert scale, ranging from 'Totally disagree' (score = 1) to 'Totally agree' (score = 6). Ten items load on one subscale representing the biomedical orientation (total score range: 10-60), while the other nine load on the biopsychosocial subscale (total score range: 9-54). This version of the PABS was developed and refined in PTs in the Netherlands and two of its items were slightly amended in the version used in the UK, to ensure face validity for both GPs and PTs [11]. A two-factor structure was also found in a sample of PTs and GPs in the UK [25]. Considering its satisfactory measurement properties, the 10-item biomedical scale was retained in the DABS study, in which a new MSK generic version of the PABS was developed to measure HCPs' attitudes and beliefs towards MSK pain more broadly. Small amendments were made to five items (i.e. 2, 3, 4, 5 and 10) of the PABS biomedical scale to make it applicable to different MSK pain conditions.

Statistical analysis
All analyses were performed in the two datasets (BeBack and DABS) separately. The following three main steps were undertaken in the analyses: 1) missing data handling and descriptive statistics, 2) evaluation of IRT assumptions, 3) IRT fit evaluation and estimations.

Missing data handling and descriptive statistics
Frequencies of missing data at the item level were calculated and respondents with missing data on all items of the scale were excluded from analysis. Patterns of missing values were explored to find any recurrent pattern. To evaluate whether a desirable 'missing completely at random' (MCAR) situation was present, Little's MCAR test was used, with a p-value > 0.05 taken to indicate MCAR [32]. If fewer than 10% of respondents displayed missing data and data were MCAR, a two-way imputation technique was used at the item level [33][34][35].
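To make the imputation step concrete, the following is a minimal sketch of deterministic two-way imputation, assuming the common formulation in which a missing score is estimated as person mean + item mean - overall mean, then rounded and restricted to the 1-6 response range. The function name, the rounding rule and the example data are illustrative assumptions, not the procedure as run in SPSS for this study.

```python
import math

def two_way_impute(data, low=1, high=6):
    """Fill None entries with person mean + item mean - overall mean.

    `data` is a list of respondents, each a list of item scores (None = missing).
    Respondents with all items missing must be removed beforehand, as in the study.
    """
    n_items = len(data[0])
    observed = [v for row in data for v in row if v is not None]
    overall_mean = sum(observed) / len(observed)

    # Item means computed over respondents who answered that item
    item_means = []
    for j in range(n_items):
        column = [row[j] for row in data if row[j] is not None]
        item_means.append(sum(column) / len(column))

    completed = []
    for row in data:
        answered = [v for v in row if v is not None]
        person_mean = sum(answered) / len(answered)
        new_row = []
        for j, value in enumerate(row):
            if value is None:
                estimate = person_mean + item_means[j] - overall_mean
                # Round half up (assumed rule), then clamp to the response range
                value = min(high, max(low, math.floor(estimate + 0.5)))
            new_row.append(value)
        completed.append(new_row)
    return completed

# Invented example: three respondents, three items, one missing value
example = [[2, 3, None], [4, 4, 5], [1, 2, 2]]
imputed = two_way_impute(example)
```

In the invented example the missing score is estimated as 2.5 + 3.5 - 2.875 = 3.125 and therefore imputed as 3. Variants of the technique that add a random error term before rounding are not shown.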
Response frequencies for each category of each item were also assessed. If fewer than 10 participants endorsed a response option, that option was collapsed with the contiguous one that had a similar meaning. Descriptive statistics were calculated for the socio-demographic characteristics of the participants. All descriptive statistics and missing data handling were conducted with the statistical software SPSS, version 21.
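The sparseness rule described above can be sketched as follows. This is an illustrative outline only: the helper names and the response vector are invented, and the choice of which neighbouring option has "a similar meaning" remains a substantive judgement that the code leaves to the analyst.

```python
from collections import Counter

def sparse_categories(responses, categories=range(1, 7), min_count=10):
    """Return the response options endorsed by fewer than `min_count` respondents."""
    counts = Counter(r for r in responses if r is not None)
    return [c for c in categories if counts[c] < min_count]

def merge_category(responses, source, target):
    """Recode `source` into the contiguous option `target` (missing values are kept)."""
    if abs(source - target) != 1:
        raise ValueError("options can only be merged with a contiguous one")
    return [target if r == source else r for r in responses]

# Invented item: option 1 was endorsed by only three respondents
item = [1] * 3 + [2] * 40 + [3] * 50 + [4] * 30 + [5] * 20 + [6] * 12
flagged = sparse_categories(item)
merged = merge_category(item, source=1, target=2)
```

Here only option 1 is flagged, and after merging it into option 2 the item has five response options.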

Evaluation of dimensionality and local independence
Following Lenferink et al. [36], we used two complementary statistical methods to evaluate the dimensionality of the PABS: 1) common factor analysis, 2) Mokken scale analysis (MSA; a non-parametric technique). Factor analysis was performed using the software programme FACTOR [37] (version 10.8.01), and MSA using the R package Mokken version 2.8.10 [38].
The procedure used for determining the number of factors was Parallel Analysis based on Minimum Rank Factor Analysis, abbreviated here as PA-MRFA [39]. PA-MRFA can be seen as the current gold standard method for exploratory factor analysis. In PA-MRFA the empirical value of the proportion of explained common variance (ECV) is compared to the corresponding factor's ECV derived from random data [39]; this is done for each factor separately. The random data are generated based on the sample size of the real data, assuming independence among items [40]. To determine the optimal number of factors, the observed ECV associated with a factor can be compared to the mean or the 95th percentile of the sampling distribution associated with the corresponding factor. We used the standard configuration for PA-MRFA available in FACTOR: 500 random correlation matrices were generated based on "random permutation of sample values" [39]. The factor analyses were based on the polychoric correlation matrix.
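The decision rule of parallel analysis can be illustrated with a short sketch. It assumes that factors are retained sequentially for as long as the observed ECV exceeds the chosen reference value from the random data; the function name is hypothetical and the code does not reproduce the MRFA estimation performed by FACTOR. The example percentages are those reported for the BeBack sample in the Results (observed ECV versus the 95th percentile for the first two factors).

```python
def factors_to_retain(observed_ecv, reference_ecv):
    """Count leading factors whose observed ECV exceeds the random-data reference."""
    retained = 0
    for observed, reference in zip(observed_ecv, reference_ecv):
        if observed <= reference:
            break  # stop at the first factor that does not beat random data
        retained += 1
    return retained

# BeBack sample, first two factors (percentages of explained common variance)
beback_observed = [69.3, 11.3]
beback_random_p95 = [47.4, 35.0]
n_factors = factors_to_retain(beback_observed, beback_random_p95)
```

With these values only the first factor exceeds its reference, which corresponds to a one-factor solution.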
MSA investigates the dimensionality of a set of items and, at the same time, identifies scales that allow an ordering of respondents on one or more underlying one-dimensional scales using the unweighted sum of item scores [41][42][43]. The imputed dataset was used as MSA is not appropriate for use with missing data [44]. Scalability coefficients (denoted as H) are calculated on several levels (scale: H; item: H_i; item-pair: H_ij). H_ij and H_i values can be used to determine which of the items form a scale; the H_i value expresses the degree to which an item is related to other items. The H coefficient expresses the degree to which the total score can be reliably used to order respondents on the latent trait. A scale is considered acceptable if 0.3 ≤ H < 0.4, good if 0.4 ≤ H < 0.5, and strong if H ≥ 0.5 [42]. First, a confirmatory analysis was run and an H ≥ 0.3 for the total scale was considered satisfactory [41,42]. Second, an exploratory analysis was performed using the Automated Item Selection Procedure (AISP). The AISP is a bottom-up, iterative approach in which a starting pair of items is selected with a favourable H_ij value, after which one item at a time is added to form a scale. Items are only added to the scale if they have a positive relationship (H_ij) with the other items in the scale, and if the selected item has an H_i value exceeding a pre-defined lowerbound. This analysis included successive iterations in which the lowerbound scalability coefficient was increased by 0.1, from 0.1 to 0.5. The resulting pattern of outcomes is thought to be indicative of the dimensionality of a set of items. The scale was assumed to be unidimensional if at lowerbounds from 0.1 to 0.3, only one scale, or a bigger scale and a smaller one, were found [42].
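As an illustration of what the scale-level coefficient expresses, the sketch below computes H under the covariance-ratio formulation of Loevinger's H, in which the sum of observed inter-item covariances is divided by the sum of the maximum covariances attainable given the item marginals. This formulation, the helper names and the toy data are assumptions made for illustration; the code is not the implementation used by the mokken package, and it does not cover the AISP.

```python
from itertools import combinations

def _cov(x, y):
    """Population covariance of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

def _max_cov(x, y):
    """Largest covariance attainable with the same marginals (both lists sorted)."""
    return _cov(sorted(x), sorted(y))

def scale_h(data):
    """Scale-level H for complete data (rows = respondents, columns = items)."""
    items = list(zip(*data))
    numerator = denominator = 0.0
    for x, y in combinations(items, 2):
        numerator += _cov(x, y)
        denominator += _max_cov(x, y)
    return numerator / denominator

def label_h(h):
    """Classify H with the cut-offs stated in the text [42]."""
    if h >= 0.5:
        return "strong"
    if h >= 0.4:
        return "good"
    if h >= 0.3:
        return "acceptable"
    return "below threshold"

# Invented two-item example: the last two respondents break the perfect ordering
toy = [[1, 1], [2, 2], [3, 4], [4, 3]]
h_toy = scale_h(toy)
```

For the toy data the observed covariance is 1.0 against a maximum of 1.25, so H is 0.8, which the cut-offs above classify as a strong scale.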
Local independence signifies that, after controlling for the dominant construct, there should not be residual correlations among items [26-29, 31, 45]. Local independence was also assessed under MSA, using the R package Mokken version 2.8.10 [38].

IRT fit evaluation and estimations
For model fit, we estimated the one-parameter logistic (1PL) model, the generalized partial credit model (GPCM) and Samejima's two-parameter (2PL) graded response model (GRM) in the R package mirt version 1.26.3 [46], to ascertain which of these models showed the best overall fit. The original datasets, not the imputed ones, were used for the parametric IRT models as these models can handle the presence of missing data [47]. The Akaike Information Criterion (AIC) was used to determine which model provided the best fit to the data [48]. The AIC allows comparison of non-nested models when the parameters within the models are estimated by the method of maximum likelihood, and identifies the most parsimonious model by taking into account both goodness of fit and complexity of the models [48,49]. In this study, the GRM was the model with the best fit to the data.
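The AIC comparison can be sketched in a few lines, using the standard definition AIC = 2k - 2 log L, where k is the number of estimated parameters and log L the maximised log-likelihood; the model with the lowest AIC is preferred. The log-likelihoods and parameter counts below are invented placeholders, not values from this study.

```python
def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion; lower values indicate a better trade-off."""
    return 2 * n_parameters - 2 * log_likelihood

def best_by_aic(models):
    """Return the name of the model with the lowest AIC.

    `models` maps a model name to a (log_likelihood, n_parameters) tuple.
    """
    return min(models, key=lambda name: aic(*models[name]))

# Hypothetical fit results for three candidate models (illustrative numbers only)
candidates = {
    "1PL": (-15200.0, 51),
    "GPCM": (-15050.0, 60),
    "GRM": (-15010.0, 60),
}
preferred = best_by_aic(candidates)
```

Because the penalty term depends only on the number of parameters, two models with equal k, as the hypothetical GPCM and GRM here, are ranked purely by their log-likelihood.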
Model fit was assessed with S-X² item fit statistics for polytomous data, which quantify differences between observed and expected response frequencies under the GRM [50,51]. S-X² statistics with a p-value < 0.001 were considered to indicate item misfit [45].
Among item parameters, item thresholds (β, or item difficulty parameters) represent the level of difficulty of an item and its response options, and item slopes (α, or item discrimination parameters) indicate the relationship of an item with the measured construct with higher values indicating a greater ability of an item to discriminate between adjoining values on the construct [26-29, 31, 45].
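The roles of the slope and threshold parameters can be seen in a small sketch of the category probabilities under a graded response model. It assumes the usual logistic form, in which the probability of responding in category k or higher is 1 / (1 + exp(-α(θ - β_k))) and category probabilities are differences between adjacent cumulative probabilities. The parameter values are invented, and mirt's internal slope-intercept parameterisation is not reproduced here.

```python
import math

def grm_category_probs(theta, slope, thresholds):
    """Category probabilities for one item under a logistic graded response model.

    `thresholds` must be in increasing order; K - 1 thresholds give K categories.
    """
    # Cumulative probabilities P(X >= k), padded with 1 (lowest) and 0 (beyond highest)
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]

# Invented 6-category item with a fairly high slope
hypothetical_thresholds = [-2.0, -1.0, 0.0, 1.0, 2.0]
probs_average = grm_category_probs(0.0, 1.5, hypothetical_thresholds)
probs_high = grm_category_probs(2.0, 1.5, hypothetical_thresholds)
```

Plotting such probabilities against theta gives the item characteristic curves. A lower slope flattens the curves, so the item separates neighbouring trait levels less sharply, while widely spaced thresholds spread the categories over a broader range of theta.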
Item characteristic curves (ICCs) and item information functions (IIFs) were estimated for each item under the GRM. ICCs illustrate visually the probability of selecting the response options of an item given the level of a respondent on the estimated underlying theta [26-29, 31, 45]. IIFs were estimated to determine which items were the most precise in measuring different levels of theta [26,27,31,45]. All IIFs were summed to plot a SIF that gives an indication of the measurement precision of the total scale across different levels of the latent trait [26,27,31,45]. In this context, information is conceptualized as an index of local reliability (r), where r can be calculated as r = 1 - (1/information) to obtain a value between 0 and 1 [30,31]. A value of r > 0.70 is usually used to consider an instrument as having satisfactory reliability when comparing population means [52][53][54] and this value corresponds to information > 3.3 [30,31]. A standard error (SE) of the estimated theta can also be calculated, being the inverse of the square root of information [28,31,45]. Where ICCs displayed items with response options having an endorsement probability lower than contiguous options, analyses were repeated after collapsing these response categories to assess whether this led to an improvement in unidimensionality and measurement precision of the scale. All parametric IRT analyses were conducted using the R package mirt version 1.26.3 [46]. The GRM was estimated using a full information maximum likelihood approach.
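The conversions between information, local reliability and standard error stated above can be written out directly. The sketch below uses only the two relations given in the text (r = 1 - 1/information and SE = 1/sqrt(information)); the function names are illustrative.

```python
import math

def local_reliability(information):
    """Local reliability r = 1 - 1/information."""
    return 1.0 - 1.0 / information

def standard_error(information):
    """Standard error of theta: inverse of the square root of information."""
    return 1.0 / math.sqrt(information)

def information_for_reliability(r):
    """Information needed to reach a target local reliability r (0 <= r < 1)."""
    return 1.0 / (1.0 - r)

# Information required for the group-level cut-off of 0.70
info_needed = information_for_reliability(0.70)
se_at_cutoff = standard_error(info_needed)
```

The cut-off r = 0.70 corresponds to information of about 3.33 and a standard error of about 0.55, whereas a reliability of 0.90 would require information of 10.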

BeBack study
Six participants had missing data on all items and were excluded from analysis, leaving a total sample of 1016 respondents. Analysis of missing data revealed that 56 participants (5.5%) had at least one missing item and that 67 item values (0.7%) were missing in total. Little's MCAR test was not significant (p = 0.968), suggesting that data were missing completely at random.
Descriptive statistics for the socio-demographic characteristics of the HCPs are presented in Table 1, while descriptive statistics for the 10 items of the scale are displayed in Table 2. The sample had a mean score of 31.0 (standard deviation (SD) = 6.5) on the scale. The lowerbound of the reliability, estimated using Cronbach's alpha, equalled 0.78.
The factor analysis showed support for a unidimensional solution. The polychoric correlation matrix can be found in S1 Table. Only the first factor explained a larger percentage of common variance (69.3%) than could be expected when using random data (mean: 33.3%, 95th percentile: 47.4%); the second factor explained a smaller percentage of common variance (11.3%) than expected when using random data (mean: 26.1%, 95th percentile: 35.0%). In contrast, an H value of 0.273 was found for the total scale using confirmatory MSA, which is below the threshold of 0.3 for an acceptable scale. H_i values for the individual items are presented in Table 3. The results of running exploratory analyses for increasing values of the lowerbound scalability coefficient were inconclusive. At lowerbounds 0.1 and 0.2 all ten items were placed in the first scale, while at lowerbound 0.3 five items (i.e. 2, 6, 7, 8, 9) were placed in the first scale, two items (1, 3) in a second smaller scale, and the other three items (4, 5, 10) were discarded. The H value of the first and largest scale was equal to 0.4. No locally dependent item pairs were found under MSA. Since the factor analysis showed support for a unidimensional solution, IRT analyses were performed using unidimensional models.
All items exhibited satisfactory item fit statistics under the GRM (S-X² p-values > 0.001, Table 3). The estimated item thresholds and item slopes are listed in Table 3. Items 4, 5 and 10 were those with the lowest discriminative power and with difficulty parameters covering a larger range of theta values; items 7, 8 and 9 showed the highest discrimination and difficulty parameters covering a smaller range of theta values. We decided to rerun all analyses after removing item 10, as it was the one showing the poorest psychometric performance (Table 3, Fig 1). This deletion did not lead to any improvement in item slopes or SIF (Fig 2).
All poorly endorsed response options (Fig 1) were merged with adjacent ones having similar meaning (e.g. 'disagree to some extent' with 'agree to some extent') resulting in a modified 10-item version with varying number of response options across items. All analyses were also rerun for this modified 10-item version. Parametric IRT item parameters did not change substantially and no substantial changes could be identified in the SIF (Fig 2).

DABS study
All 958 respondents were included in the analyses. Analysis of missing data revealed that 53 subjects (5.5%) had at least one missing item and that 70 item values (0.7%) were missing in total; a MCAR situation was present (Little's MCAR test, p = 0.356). Tables 1 and 2 also present the socio-demographic characteristics of the HCPs and item-level statistics in this sample. The sample mean score on the scale was 33.7 (SD = 6.7) and Cronbach's alpha equalled 0.78. PA-MRFA exhibited strong support for a unidimensional solution. The polychoric correlation matrix is included in S2 Table. Only the first factor accounted for a larger percentage of common variance (64.4%) than what could be expected when using random data (mean: 33.1%, 95th percentile: 46.4%); the second factor accounted for a smaller proportion (13.2%) than what could be expected with random data (mean: 26.5%; 95th percentile: 34.9%). As for the BeBack sample, the scale scalability coefficient (H = 0.274) was below the threshold to be considered an acceptable scale. The three items (4, 5 and 10) found with the lowest H_i values in this dataset were the same as those in the BeBack sample (Table 3). In the exploratory MSA, the scale also demonstrated satisfactory unidimensionality: all ten items were assigned to the first scale at lowerbound 0.1, eight items were assigned to the first scale and the other two to a second scale at lowerbound 0.2, and six items were assigned to the first scale and four discarded at lowerbound 0.3. At this latter lowerbound, the scale H value was 0.426. Local independence assessment did not show any locally dependent item pairs. Since both PA-MRFA and MSA displayed support for a unidimensional solution, IRT analyses were performed using unidimensional models.
Item 3 displayed an unsatisfactory fit statistic (S-X² p-value < 0.001), while all other items fitted the GRM (Table 3). As in the BeBack dataset, items 7, 8 and 9 were those with the highest item slopes, while items 4, 5 and 10 were those with the lowest ones (Table 3); these latter three items, together with item 6, were also those with threshold parameters spreading across a broader range of the latent trait (Table 3). Fig 3 displays all ICCs of the biomedical scale PABS version. Also in this sample, the scale exhibited acceptable measurement precision between -2 and +3 theta values (Fig 4). An additional analysis was run to evaluate whether the removal of the worst performing item, item 10 (consistent with the BeBack dataset), led to substantial improvements in the scale. The SIF of the 9-item version of the questionnaire was very similar to the curve of the original 10-item version (Fig 4).
Analyses were repeated as for the BeBack dataset for a 10-item modified version in which all response options with low probabilities of endorsement were collapsed (Fig 3). No substantial improvement could be observed in item thresholds and item slopes. A loss in information could be observed for theta values between -1 and 2 but without compromising local reliability (Fig 4).

Discussion
The biomedical scale of the PABS was assessed with IRT analytic methods in two large samples of HCPs in the UK. Factor analyses offered clear support for unidimensionality of the PABS biomedical scale in both samples. This finding was supported by the MSA for the DABS sample as well; for the BeBack sample, the MSA findings were inconclusive. Three items (i.e. 4, 5 and 10) were consistently found to show poor discrimination values, and three items (i.e. 7, 8 and 9) showed the highest discrimination, as estimated using the GRM (parametric IRT). The scale showed satisfactory measurement precision for estimated latent trait values within an acceptable interval around the population mean level.
The PABS was developed following a CTT approach and this is the first study that assesses its biomedical scale with IRT analytic methods. Modern IRT techniques provide some advantages over CTT, providing a deeper insight into the measurement properties of a self-reported questionnaire and its items [26][27][28][29][30][31]. Our results were very similar in two different samples of HCPs in the UK, one including only GPs and PTs, the other also chiropractors and osteopaths (Table 1). These results are relevant considering that the PABS biomedical scale was originally developed to evaluate PTs' attitudes and beliefs towards non-specific LBP [17] and subsequently adapted to also assess GPs' attitudes and beliefs [11]. The same scale, with some small adaptations, was recently included in a new generic MSK version of the PABS to measure attitudes and beliefs of PTs, GPs, chiropractors and osteopaths towards non-specific MSK more broadly. The fact that the questionnaire showed consistently similar results in two different versions and in different HCPs shows that this scale has the potential to be adapted to different MSK pain conditions and HCP populations.
In this study, some issues were consistently identified for items 4, 5 and 10 of the scale in both samples. These items were those with lower MSA scalability coefficients and IRT discrimination parameters (Table 3). These findings are not surprising for item 10, considering that previous EFA studies have shown this item to be the most problematic [16,17]. Nevertheless, the results for items 4 and 5 in the present study have not been previously reported. The content of items 4, 5 and 10 seems to refer to aspects of the treatment or prognosis of patients with musculoskeletal pain, whereas the items with the highest discriminative power (i.e. 7, 8, 9) refer more to aspects of pain neurophysiology; a similar distinction can also be made with other items (e.g. 1 and 3) that showed acceptable and higher levels of discrimination (Table 2). This apparent difference in content could explain why some items present such low discrimination. These considerations could be further explored in future studies involving experts in the field of pain attitudes and beliefs, asking them to judge the content validity of this scale.
Additional analyses without item 10 indicated that removing this item did not lead to loss of measurement precision (Figs 2 and 4). Considering that this questionnaire has been used in different languages and with reference to different MSK pain conditions [11,16,[18][19][20]23], it seems inappropriate to suggest the removal of item 10 as this would also lead to a discrepancy with the version of the questionnaire used in previous studies. However, if the results of this and prior studies are replicated in other samples, the future removal of item 10 and/or refinement of the scale should be further discussed and reconsidered. The misfit of item 3 in the DABS sample was a new and surprising finding, considering that this item seems to cover a pain neurophysiological aspect, in line with items 7, 8 and 9, which are the best performing ones. Additionally, its discrimination and difficulty parameters were very similar in the two samples (Table 3). For these reasons, we decided not to run additional analyses with this item removed.
ICCs of different PABS versions showed that some response options of some items had a low probability of endorsement compared to adjacent options (Figs 1 and 3). We ran additional analyses to assess whether merging these response options led to positive changes in item parameters and measurement precision. Merging produced no substantial improvement in item parameters or measurement precision (Figs 2 and 4); our findings were therefore not sufficient to justify merging these response options, as this would lead to an impractical version of the scale with items having varying numbers of response options. Hence, our analyses and considerations favour keeping the PABS biomedical scale in its current form.
The original version of the PABS was developed and tested in the Dutch language and culture [16,17]. The versions used in this study of HCPs in the UK are an adaptation of the 19-item version refined by Houben et al. [11,16]. To date, no studies assessing the cross-cultural validity of this questionnaire have been performed. A commonly used definition of cross-cultural validity is 'the degree to which the performance of the items on a translated or culturally adapted instrument is an adequate reflection of the performance of the items of the original version of the instrument' [55]. This measurement property can be tested by assessing differential item functioning under an IRT model, for which samples of different language versions should be aggregated [27,28]. Therefore, considering that this questionnaire is already available in several languages, future international collaborations and studies should attempt to assess this measurement property by merging datasets from different countries.
The results of this and previous studies on the PABS biomedical scale have shown that this scale has exhibited acceptable psychometric performance and precision for group-level analyses of the degree of HCP biomedical orientation towards MSK pain. Future research efforts should be directed towards improving the measurement precision of this scale for individual-level analyses (i.e. to reach reliability estimates ≥ 0.9); this could be accomplished by adding more items that reflect the same construct. Importantly, the original intention of the PABS developers was to have a questionnaire that could classify HCPs as having a biomedical approach or a biopsychosocial approach [17]. In fact, the PABS includes another subscale aimed at assessing the biopsychosocial orientation of HCPs towards pain [11,16,17]. This scale was not assessed in the current study because previous research has indicated that it needs psychometric improvement [15,21]. This discrepancy in the scales' psychometric performance could be due to different factors, one of them being the widespread diffusion and acceptance of the biopsychosocial model for explaining MSK pain disorders, like LBP [56][57][58][59]. In fact, the popularity of this model has probably influenced HCPs' attitudes and beliefs towards MSK pain, so that it has become difficult for them to 'disagree' with items on the biopsychosocial orientation and this has led to sparseness and lack of variation in responses on this subscale. Overall, taking into account the psychometric differences in the two subscales, it can be asserted that research in the field of measurement of attitudes and beliefs towards pain is still at a preliminary stage, and that further psychometric research is necessary.