The assessment of pressure pain threshold (PPT) provides a quantitative value related to the mechanical pain sensitivity of deep structures. Although excellent reliability of PPT has been reported in numerous anatomical locations, its absolute and relative reliability in the lower back region remains to be determined. Because low back pain is highly prevalent in the general population and is one of the leading causes of disability in industrialized countries, assessing pressure pain thresholds over the low back is of particular interest. The purpose of this study was (1) to evaluate the intra- and inter-session absolute and relative reliability of PPT within 14 locations covering the low back region of asymptomatic individuals and (2) to determine the number of trials required to ensure reliable PPT measurements. Fifteen asymptomatic subjects were included in this study. PPTs were assessed at 14 anatomical locations in the low back region over two sessions separated by a one-hour interval. In each session, three PPT assessments were performed at each location. Reliability was assessed by computing intraclass correlation coefficients (ICC), standard error of measurement (SEM) and minimum detectable change (MDC) for all possible combinations between trials and sessions. Bland-Altman plots were also generated to assess potential bias in the dataset. Relative reliability for both intra- and inter-session was almost perfect, with ICCs ranging from 0.85 to 0.99. With respect to the intra-session, no statistical difference was reported for ICCs and SEM regardless of the conducted comparisons between trials. Conversely, for inter-session, ICCs and SEM values were significantly larger when two consecutive PPT measurements were used for data analysis. No significant difference was observed for the comparison between two consecutive measurements and three measurements.
Excellent relative and absolute reliabilities were reported for both intra- and inter-session. Reliable measurements can be equally achieved when using the mean of two or three consecutive PPT measurements, as usually proposed in the literature, or with only the first one. Although reliability was almost perfect regardless of the conducted comparison between PPT assessments, our results suggest using two consecutive measurements to obtain higher short term absolute reliability.
Citation: Balaguier R, Madeleine P, Vuillerme N (2016) Is One Trial Sufficient to Obtain Excellent Pressure Pain Threshold Reliability in the Low Back of Asymptomatic Individuals? A Test-Retest Study. PLoS ONE 11(8): e0160866. https://doi.org/10.1371/journal.pone.0160866
Editor: Neil R. Smalheiser, University of Illinois-Chicago, UNITED STATES
Received: December 15, 2015; Accepted: July 26, 2016; Published: August 11, 2016
Copyright: © 2016 Balaguier et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: With regard to data availability, ethical restrictions prevent deposition in a public repository. Data will be made available from Univ. Grenoble Alpes for researchers who meet the criteria for access to confidential data. Requests for data access should be made to the corresponding author (Dr. Nicolas Vuillerme).
Funding: This joint PhD project is financed by a grant from the French Ministry of Higher Education and Research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Pain is defined as an unpleasant sensory and emotional experience associated with actual or potential tissue damage, or described in terms of such damage. According to the American Pain Society, pain is the fifth vital sign of medical examination. Pressure algometry (PA) performed with a handheld algometer is a method that has been increasingly used since the 1980s to assess mechanical pain sensitivity in different anatomical regions. When applied perpendicularly to the skin, the algometer creates a painful mechanical stimulation by activating group III and group IV muscle nociceptors. Through pressure pain thresholds (PPT), PA provides a quantitative value related to the sensitivity of deep structures, allowing clinicians or researchers to make comparisons over time. In the case of musculoskeletal pain, as recently proposed in a literature review by Arendt-Nielsen and Yarnitsky [4], PA seems particularly relevant to compare pain over time or between various normal, affected or treated anatomical regions.
It has been reported that pressure pain sensitivity differs between individual muscles and is also non-uniformly distributed between the muscle belly and tendons of the same muscle [6–9]. Thus, according to Andersen and colleagues [6], the assessment of pain sensitivity at two adjacent sites can lead to two significantly different PPT values. This difference could be explained by a change in muscle thickness and density of nociceptors. However, no differences are observed when PPTs are assessed bilaterally over homologous body locations [5,10]. Among all the different anatomical locations, the low back region is of particular interest for PPT measurements since 70% of the population will experience low back pain (LBP) at least once in their lifetime and because LBP is often reported in relation to work-related musculoskeletal disorders, disability and sickness absence from work [13,14]. The assessment of PPT can be used as a method to diagnose and monitor the effectiveness of various treatments or interventions over the lower back region [15–17].
According to a literature review by Arendt-Nielsen and Yarnitsky [4], PA seems relevant to compare pain sensitivity over time or between various normal, affected, or treated anatomical locations. In numerous studies, PA has shown good to excellent intra- and inter- reliability for assessing pain sensitivity in the low back [16,18–21]. Mokkink and colleagues [22] have defined relative reliability as the extent to which scores for subjects who have not changed are the same for repeated measurements, assessed in our study by one examiner on two different occasions. Relative reliability is commonly quantified using the intraclass correlation coefficient (ICC). Absolute reliability, also called “agreement” or “absolute measurement error”, is defined as how close the scores on repeated measures are, and it is quantified using the standard error of measurement (SEM). Interestingly, PPT reliability studies have (1) generally assessed only two or four locations over the low back region and/or (2) assessed PPT reliability only unilaterally. Further investigations are therefore needed to ensure that PA is a reliable method to assess PPT in numerous locations covering the low back region of asymptomatic individuals.
The purpose of this study was (1) to evaluate the intra- and inter-session absolute and relative reliability of PPT within 14 locations covering the low back region of young asymptomatic individuals and (2) to determine the number of trials required to ensure reliable PPT assessments.
Materials and Methods
Fifteen asymptomatic subjects (8 women and 7 men), described in Table 1, volunteered to participate in this study. The subjects were recruited within the Grenoble community and consisted of students (11) and newly hired workers (4). Inclusion criteria were being aged 18 to 55 years, no musculoskeletal pain in the low back during the last week, no previous injury and/or surgery in the low back region and no pregnancy. This study was conducted in accordance with the Declaration of Helsinki and was approved by the national ethics committee (French society for independent-living technologies and gerontechnology). Subjects gave their informed written consent to the experimental procedure.
A Somedic Algometer (Type 2, Sollentuna, Sweden) with a probe size of 1 cm², calibrated before each session, was used to assess PPT over two sessions separated by one hour, each lasting approx. 30 minutes. The pressure was applied (1) by a single examiner, (2) over 14 anatomical locations in the lower back region, with 7 locations on each side of the lumbar spinal processes L1-L5, and (3) at a rate of 30 kPa/s in line with previous studies. To avoid tissue injury [26–28] and temporal sensitization, a 1-minute interval was observed between two consecutive PPT assessments over the same location.
Subjects lying comfortably in a prone position were asked to press a button that locks the algometer when the pressure became painful. The examiner then noted the pressure indicated on the algometer display, corresponding to the PPT. As in numerous studies [30–32], a training PPT measurement was performed prior to the recordings on the tibialis anterior, a site remote from the low back.
Procedure to mark the 14 anatomical locations
After palpation, the examiner placed two marks at the level of the spinal processes of the first (L1) and fifth (L5) lumbar vertebrae and measured the distance between these two locations (d1). This distance allowed the examiner to select one paper grid with 14 anatomical locations among 8 grids specially designed according to the average L1-L5 distance reported earlier [32,33]. Once the grid was selected, the examiner aligned it with the L1 and L5 marks on the skin and started the experiment.
To design these grids, we calculated d2, corresponding to the quarter of the distance L1-L5. A first column of 5 points was placed bilaterally at the distance (d2) from a fictive line joining L1 to L5. Then, a second column of 2 points was set bilaterally at 2 times the distance (d2) of L2 and L3 (Fig 1).
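The grid construction described above can be sketched in a few lines. This is only an illustration, not the authors' tooling: the function name `lumbar_grid`, the coordinate frame (origin at L1, y axis pointing toward L5, x axis lateral), the even spacing of the five inner points between L1 and L5, and the placement of the two outer points at the second and third levels (taken here as L2 and L3) are all assumptions made for the example.

```python
def lumbar_grid(d1_cm):
    """Return 14 hypothetical (x, y) grid points in cm for an L1-L5 distance d1_cm.

    Assumed frame: origin at L1, y increases toward L5, x is the lateral offset
    (negative = left, positive = right of the L1-L5 line).
    """
    if d1_cm <= 0:
        raise ValueError("d1_cm must be positive")
    d2 = d1_cm / 4.0  # d2 is a quarter of the L1-L5 distance
    points = []
    for side in (-1, 1):
        # Inner column: 5 points at lateral distance d2,
        # assumed evenly spaced from the L1 level to the L5 level.
        for level in range(5):
            points.append((side * d2, level * d2))
        # Outer column: 2 points at lateral distance 2 * d2,
        # assumed to sit at the L2 and L3 levels.
        for level in (1, 2):
            points.append((side * 2 * d2, level * d2))
    return points


# Example with a hypothetical L1-L5 distance of 12 cm.
grid = lumbar_grid(12.0)
```

Because every coordinate scales with d1, one function of this kind could generate each of the 8 paper grids from its nominal L1-L5 distance.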
PPT measurements were found to be normally distributed (Shapiro-Wilk normality test).
The results of the first session were analyzed using a repeated-measures analysis of variance (ANOVA) to investigate the intra-session reliability, followed by a Tukey post-hoc test to highlight differences between trials. The relative and absolute reliability across trials 1-2-3 were computed using ICC, SEM and minimum detectable change (MDC). The relative reliability was evaluated by calculating a 2-way fixed ICC2,1 (for absolute agreement). Reliability coefficients (i.e. ICC values) were interpreted according to Landis and Koch [35], in which an ICC between 0.00–0.20 is considered poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, and 0.81–1.00 almost perfect. The SEM, expressed in the same unit as pain sensitivity (kPa), quantifies the precision of PPT measurements of individual subjects [23,36]. The SEM was calculated as SD × √(1 − ICC), where SD is the standard deviation of the scores from all subjects and ICC the relative reliability. The MDC, calculated as SEM × 1.96 × √2, provides information on the threshold required to be confident that a difference can be considered as “real”. Mean, standard deviation, ICC2,1, SEM, MDC and limits of agreement (LOA) values were calculated for the two sessions to investigate the inter-session reliability in relation to the following three comparisons [16,37–39]:
- trial 1 from session 1 versus trial 1 from session 2;
- the mean of trials 1 and 2 (trials 1–2) from session 1 versus the mean of trials 1 and 2 (trials 1–2) from session 2;
- the mean of trials 1,2 and 3 (trials 1-2-3) from session 1 versus the mean of trials 1, 2 and 3 (trials 1-2-3) from session 2.
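The reliability indices named above can be computed with the standard library alone. The sketch below is a minimal illustration rather than the authors' analysis code: it assumes the single-measure, absolute-agreement ICC formula of Shrout and Fleiss for ICC(2,1), SEM = SD × √(1 − ICC) and MDC = SEM × 1.96 × √2. The function names and the PPT values in `example_kpa` are hypothetical.

```python
import math
import statistics


def icc_2_1(scores):
    """ICC(2,1), absolute agreement, single measure.

    scores: list of rows, one row per subject, one column per trial or session.
    """
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 measurements")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares (subjects, measurements, residual).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def sem(sd, icc):
    """Standard error of measurement, in the unit of sd (kPa here)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem_value):
    """Minimum detectable change at the 95% level."""
    return sem_value * 1.96 * math.sqrt(2.0)


# Hypothetical PPT values (kPa): 5 subjects x 3 trials.
example_kpa = [
    [610, 640, 625],
    [480, 470, 495],
    [720, 700, 735],
    [555, 560, 540],
    [830, 815, 850],
]
icc = icc_2_1(example_kpa)
pooled_sd = statistics.stdev([x for row in example_kpa for x in row])
example_sem = sem(pooled_sd, icc)
example_mdc = mdc95(example_sem)
```

For the inter-session comparisons listed above, each row would hold one subject's session 1 and session 2 value (a single trial, or the mean of two or three trials).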
Furthermore, Bland and Altman plots of the differences between trials against their mean, together with the LOA, were used to assess the magnitude of disagreement between trials of the 2 sessions. Of note, a difference between trials outside the LOA can be considered as a real change.
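The quantities behind a Bland and Altman plot can be sketched as follows. The example assumes the usual definition of the 95% limits of agreement (mean difference ± 1.96 × SD of the differences); the text does not state the exact formula used, and the function name and the sample values are hypothetical.

```python
import statistics


def bland_altman(session1, session2):
    """Return bias, 95% limits of agreement and the points of a Bland-Altman plot."""
    if len(session1) != len(session2):
        raise ValueError("both sessions need one value per subject")
    if len(session1) < 2:
        raise ValueError("need at least 2 subjects")
    diffs = [a - b for a, b in zip(session1, session2)]
    means = [(a + b) / 2.0 for a, b in zip(session1, session2)]
    bias = statistics.mean(diffs)      # systematic difference between sessions
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences
    lower = bias - 1.96 * sd_diff
    upper = bias + 1.96 * sd_diff
    # Indices of subjects whose difference falls outside the LOA.
    outside = [i for i, d in enumerate(diffs) if d < lower or d > upper]
    return {
        "bias": bias,
        "loa": (lower, upper),
        "means": means,   # x axis of the plot
        "diffs": diffs,   # y axis of the plot
        "outside": outside,
    }


# Hypothetical PPT values (kPa) for 6 subjects in two sessions.
s1 = [610, 480, 720, 555, 830, 690]
s2 = [625, 470, 700, 560, 845, 675]
result = bland_altman(s1, s2)
```

Plotting `means` against `diffs` with horizontal lines at `bias` and at the two `loa` values gives the usual figure, and narrower limits indicate closer agreement between sessions.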
Finally, one way ANOVA (with the number of trials as within-subject factor) followed by Tukey post-hoc test for pair-wise comparison was performed to compare reliability values (ICC and SEM) of the two sessions.
Relative reliability of PPT in the low back.
As shown in Table 2, the ICCs of the 14 anatomical locations and of the left, right and overall low back (Pleft, Pright, Pall) ranged from 0.85 to 0.99 and were almost perfect regardless of the conducted comparison (trials 1-2-3).
Absolute reliability of PPT in the low back.
Table 2 also shows that absolute reliability (i.e. SEM) did not differ significantly across all possible combinations. SEM values ranged from 26 to 91 kPa.
Number of trials to ensure reliable measurements in the low back.
The mean PPT values at each anatomical location were not significantly different between trials regardless of the three conducted comparisons (trial 1 versus trial 2, trial 1 versus trial 3, trial 2 versus trial 3); p-values ranged from 0.7220 to 1.000 (Table 3). For the left, right and overall low back (Pleft, Pright, Pall), p-values ranged from 0.9960 to 0.9995.
See “Procedure to mark the 14 anatomical locations” for explanation concerning the locations of PPT assessments.
The comparison of means of ICC regardless of the conducted comparisons between trials (Table 2) showed a statistical difference for Trials 2–3 versus Trials 1–3 (p = 0.0338). The same analysis for SEM further showed no significant difference between trials.
Relative reliability of PPT in the low back.
The visual analysis of Bland and Altman plots suggested no difference in PPT values between sessions because (1) zero was included in the 95% confidence interval and (2) all the subjects were inside the limits of agreement (Fig 3). Furthermore, this visual analysis also suggested narrower LOA for the combinations trials 1–2 and trials 1-2-3 than for the first trial alone.
Number of trials to ensure reliable measurements in the low back.
The mean PPT values at each location did not differ significantly between sessions 1 and 2 for any of the following three comparisons: (1) trial 1 from session 1 versus trial 1 from session 2, (2) the mean of trials 1 and 2 (trials 1–2) from session 1 versus the mean of trials 1 and 2 (trials 1–2) from session 2, (3) the mean of trials 1, 2 and 3 (trials 1-2-3) from session 1 versus the mean of trials 1, 2 and 3 (trials 1-2-3) from session 2; p-values ranged from 0.4137 to 0.9974.
When two consecutive measurements (Trials 1–2 or Trials 2–3) or all trials were used to calculate subjects’ relative and absolute reliability, ICC and SEM values were significantly higher than when only the first trial was used. Conversely, no statistical difference was observed between the first two consecutive trials and all trials, or between the last two consecutive trials and all trials. Finally, no statistical difference was observed between the first two and the last two consecutive measurements (Tables 5 and 6). Visual analysis of Bland and Altman plots showed that LOA values decreased when the first two trials and all three trials were analyzed (Fig 4).
Considering the importance of collecting reliable PPT over the lower back region, the purpose of the present experiment was (1) to evaluate the intra- and inter-session absolute and relative reliability of PPT within 14 locations covering the low back region of asymptomatic individuals, and (2) to determine the number of trials required to ensure reliable PPT assessments. Intra-session results will be discussed before those of the inter-session.
First, the analysis of PPT measurements of the low back showed excellent relative reliability for the intra-session. ICC values were almost perfect regardless of the conducted comparisons (trial 1 versus trial 2, trial 1 versus trial 3, trial 2 versus trial 3), suggesting no difference in PPT measurements between trials and no systematic error in the data. Moderate to excellent relative reliability was also obtained in previous studies assessing PPT in other anatomical locations such as the tibia, calf, hand and trapezius. In a recent study assessing PPT in the lower back region of young healthy subjects, Waller and colleagues [43] reported ICCs ranging from 0.94 to 0.99 and concluded that intra-rater reliability was excellent in the low back. However, Waller and colleagues [43] assessed PPT over only one location in the low back (2 cm lateral to L4/L5), so the generalization of such a finding is questionable considering that PPT can differ over the same muscle. Moreover, the study population was small but sufficient to obtain substantial relative reliability values.
The inter-session relative reliability was also excellent in our study. It is first important to note that no significant difference was observed for PPT measurements between session 1 and session 2. The trial-to-trial analysis then showed that ICC values were also almost perfect regardless of the number of trials considered, confirming the excellent reliability previously reported by Koo and colleagues [19] in the low back of healthy individuals. In the latter study, six anatomical locations assessed (1) bilaterally in the low back perpendicularly to the spinal processes of L1, L3 and L5 and (2) over two sessions separated by 5 minutes led to ICCs ranging from 0.86 to 0.91. In contrast to the intra-session results, we report higher relative and absolute inter-session reliability (i.e. ICC and SEM values) when two consecutive PPT measurements were used for data analysis. Similar results have also been reported in the low back of healthy individuals by Chesterton and colleagues [29], i.e. higher intra-session reliability for the mean of three consecutive PPT assessments than when only the first assessment was used for analysis. Even more interesting, in our study, contrary to numerous studies [18,31,44], the first PPT assessment did not need to be discarded to obtain excellent reliability for both intra- and inter-session. Lacourt and colleagues [44], in a test-retest study, reported significant differences in PPT values between the first and second PPT measurements and also between the first and third ones. This result led them to use only the second and third PPT measurements for data analysis. Higher inter-session reliability was also found by Nussbaum and Downes [31] when the first PPT measurement was omitted. The difference with our findings could be explained by the effort made in the current study to familiarize the subject with PPT measurements (tests at a remote location, one practice session).
As ICC is largely influenced by between-subjects variability and does not provide information on typical error, it was necessary to complete our analysis by computing SEM and MDC. When the first PPT measurement was combined with the second or third one, SEM values were generally below 65 kPa and MDC ranged from 11% to 27% (71 kPa to 179 kPa). In other words, this result suggests that (1) the true PPT score lay within about 65 kPa of the observed score and (2) a clinical change will not be masked by measurement error if the observed score changes by more than 11 to 27%. The limited number of published studies assessing absolute reliability makes comparison with the existing literature difficult. However, after two sessions of PPT measurements over one location on the trapezius muscle and the tibialis anterior separated by three to five days, Walton and colleagues [45] reported a SEM of 49 kPa, close to ours. Similar results were found by Fingleton and colleagues [25] in the lower limbs, with SEM ranging from 16 to 39 kPa, and by Chesterton and colleagues [29] in the back, with a SEM of 60 kPa.
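The interpretation of SEM and MDC given above can be illustrated with a short sketch. It assumes that a band of ± 1 SEM around an observed score covers the true score about 68% of the time under normally distributed measurement error, and that a change counts as detectable only when it exceeds the MDC. The function names and the observed PPT of 650 kPa are hypothetical; the SEM of 65 kPa and the MDC of 179 kPa are the upper values quoted in the text.

```python
def true_score_band(observed_kpa, sem_kpa, z=1.0):
    """Return the (low, high) band of observed +/- z * SEM, in kPa."""
    half_width = z * sem_kpa
    return (observed_kpa - half_width, observed_kpa + half_width)


def exceeds_mdc(before_kpa, after_kpa, mdc_kpa):
    """True when the absolute change between two assessments is larger than the MDC."""
    return abs(after_kpa - before_kpa) > mdc_kpa


def mdc_as_percent(mdc_kpa, reference_ppt_kpa):
    """Express an MDC in kPa as a percentage of a reference PPT value."""
    return 100.0 * mdc_kpa / reference_ppt_kpa


# Hypothetical subject with an observed PPT of 650 kPa.
band = true_score_band(650.0, 65.0)
small_change = exceeds_mdc(650.0, 800.0, 179.0)  # change of 150 kPa
large_change = exceeds_mdc(650.0, 850.0, 179.0)  # change of 200 kPa
percent = mdc_as_percent(179.0, 650.0)
```

Under these assumptions, a 150 kPa change stays within measurement error while a 200 kPa change exceeds it, which is why the MDC is the more useful index when judging whether an individual has changed after an intervention.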
In general, regarding the number of trials required to ensure reliable measurement in the low back, our results are rather original. Indeed, we have reported almost perfect intra- and inter-session reliability for the first PPT measurement (ICCs ranging from 0.85 to 0.99), suggesting that one training trial over the tibialis anterior would be sufficient to familiarize the participant with the PPT procedure. Hopkins in 2000 assumed that the reliability of a test could be influenced by several factors such as motivation or boredom. For instance, during a series of trials, the second one is often better than the first because participants want to improve their performance or because they benefit from the experience of the first one. Conversely, a decrease in performance between the first trial and the following ones could be explained by fatigue or loss of motivation. In our study, there seems to be no learning effect between trials because PPT values did not change. It also seems that the cognitive and attentional resources needed to perform three consecutive PPT assessments over the low back do not generate boredom or loss of motivation and do not influence reliability. Furthermore, even though both relative and absolute reliabilities were significantly higher when two or three consecutive measurements were used for data analysis compared with the first measurement, no statistical difference was observed between the first two and the last two PPT measurements. Therefore, this result suggests that using the first two PPT measurements for data analysis will not lead to lower relative and absolute reliabilities than using the last two or all three PPT measurements. Finally, in accordance with the existing literature in the low back [16,45], MDC values were regularly between 100 and 200 kPa, corresponding to approx. 10–20% of the PPT scale range, which is considered an acceptable measurement error by Chiarotto and colleagues.
For instance, to be confident that a true change was observed in the low back of young football players after an intervention, Madeleine and colleagues [16] reported an MDC value of 140 kPa. Walton and colleagues [45], assessing PPTs in the trapezius muscle among young healthy subjects, reported an MDC of 113 kPa. These results imply a small sensitivity to change and that a change in PPT measurement can be masked by the measurement error regardless of the absolute changes in PPT.
Recent studies have reported good to excellent PPT reliability between sessions separated by one day and assessed over 14 locations covering the abdominal region, by two days and assessed over 2 locations in the low back, and by twenty-one days and assessed over 1 location on the trapezius muscle. Still, the current results need to be confirmed by assessing PPT reliability over longer periods of time. In addition, the absence of significant difference for some important parameters such as ICC and SEM values between two or three consecutive PPT measurements could be explained by the relatively small sample size. Indeed, true significant effects might have been missed in our study because the sample size might not have provided adequate power for detecting a true difference of a meaningful magnitude.
Finally, we recruited a mixed population of asymptomatic individuals, classified as such since they did not report pain in the low back within the 7 days prior to the experiment and had no history of low back injury and/or surgery. However, pain usually fluctuates, as reported by recent studies [50–52]. Further, the present results should not be generalized to specific populations or to either gender, as gender differences are reported in pressure pain sensitivity [33,53,54]. Still, Paungmali and colleagues [20] have reported almost perfect relative reliability for individuals with chronic non-specific low back pain, in line with our results. Future studies could address the relative and absolute variability of PPT assessed over 14 locations covering the low back in populations suffering from LBP.
Excellent relative and absolute reliability of PPT measured over 14 locations covering the low back of asymptomatic individuals was found for both intra- and inter-session. Reliable measurements can be equally achieved when using the mean of consecutive PPT measurements or with only the first one. Although reliability was almost perfect regardless of the conducted comparison between PPT assessments, our results suggest using at least two consecutive measurements to obtain higher inter-session absolute reliability among asymptomatic participants in the low back region. Further studies are needed to enable a broader generalization of these findings.
The presented work is part of the joint PhD thesis of Romain Balaguier at Univ. Grenoble Alpes (France) and Aalborg University (Denmark), who was supported by a grant from the French Ministry of Higher Education and Research. This work is also part of a larger pluri-disciplinary project called ‘EWS’ (Ergonomics at Work and in Sports). The EWS project has benefited from support from the Blåtand French-Danish scientific cooperation program (Institut Français du Danemark), the Direction des Relations Territoriales et Internationales of Univ. Grenoble Alpes (France) and Aalborg University (Denmark), which we gratefully acknowledge. The authors would like to thank the anonymous reviewers for their helpful comments and suggestions.
- Conceptualization: RB PM NV.
- Data curation: RB PM NV.
- Formal analysis: RB PM NV.
- Funding acquisition: RB PM NV.
- Investigation: RB PM NV.
- Methodology: RB PM NV.
- Project administration: RB PM NV.
- Resources: RB PM NV.
- Software: RB PM NV.
- Supervision: PM NV.
- Validation: RB PM NV.
- Visualization: RB PM NV.
- Writing - original draft: RB PM NV.
- Writing - review & editing: RB PM NV.
- 1. Bonica JJ. International for the study of pain: pain definition. The need of a taxonomy. Pain. 1979; 6(3):247–248. pmid:460931
- 2. Merboth MK, Barnason S. Managing pain: the fifth vital sign. Nurs Clin North Am. 2000; 35(2):375–383. pmid:10873249
- 3. Mense S, Simons DG. Muscle pain. Understanding its nature, diagnosis, and treatment. Philadelphia: Lippincott Williams & Wilkins; 1992.
- 4. Arendt-Nielsen L and Yarnitsky D. Experimental and clinical applications of quantitative sensory testing applied to skin, muscles and viscera. J Pain. 2009; 10(6):556–572. pmid:19380256
- 5. Fischer AA. Pressure algometry over normal muscles. Standard values, validity and reproducibility of pressure threshold. Pain. 1987; 30(1):115–126. pmid:3614975
- 6. Andersen H, Arendt-Nielsen L, Danneskiold-Samsøe B and Graven-Nielsen T. Pressure pain sensitivity and hardness along human normal and sensitized muscle. Somatosens Mot Res. 2006; 23(3–4):97–109. pmid:17178545
- 7. Baker SJ, Kelly NM, Eston RG. Pressure pain tolerance at different sites on the quadriceps femoris prior to and following eccentric exercise. Eur J Pain. 1997; 1:229–233. pmid:15102404
- 8. Fridén J, Lieber RL. Eccentric exercise-induced injuries to contractile and cytoskeletal muscle fibre components. Acta Physiol Scand. 2001; 171(3):321–326. pmid:11412144
- 9. Nie H, Arendt-Nielsen L, Madeleine P, Graven-Nielsen T. Enhanced temporal summation of pressure pain in the trapezius muscle after delayed onset muscle soreness. Exp Brain Res. 2006; 170(2):182–190. pmid:16328284
- 10. Ylinen J. Pressure algometry. Aust J Physiother. 2007; 53(3):207. pmid:17899675
- 11. Andersson GB. Epidemiological features of chronic low-back pain. The Lancet. 1999; 354(9178):581–585.
- 12. Punnett L and Wegman DH. Work-related musculoskeletal disorders: the epidemiologic evidence and the debate. J Electromyogr Kinesiol. 2004; 14(1):13–23. pmid:14759746
- 13. Freburger JK, Holmes GM, Agans RP, Jackman AM, Darter JD et al. The rising prevalence of chronic low back pain. Arch Int Med. 2009; 169(3):251–258.
- 14. Hoy D, Brooks P, Blyth F and Buchbinder R. The epidemiology of low back pain. Best Pract Res Clin Rheumatol. 2010; 24(6):769–781. pmid:21665125
- 15. Andersen CH, Andersen LL, Zebis MK and Sjøgaard G. Effect of scapular function training on chronic pain in the neck/shoulder region: a randomized controlled trial. J Occup Rehabil. 2014; 24(2):316–324. pmid:23832167
- 16. Madeleine P, Hoej BP, Fernandez-de-Las-Peñas C, Rathleff MS and Kaalund S. Pressure pain sensitivity changes after use of shock-absorbing insoles among young soccer players training on artificial turf: a randomized controlled trial. J Orthop Sports Phys Ther. 2014; 44(8):587–594. pmid:25029914
- 17. Ylinen J, Takala EP, Kautiainen H, Nykänen M, Häkkinen A, Pohjolainen T, Karppi SL and Airaksinen O. Effect of long-term neck muscle training on pressure pain threshold: A randomized controlled trial. Eur J Pain. 2005; 9(6):673–673. pmid:16246820
- 18. Farasyn A and Meeusen R. Pressure pain thresholds in healthy subjects: influence of physical activity, history of lower back pain factors and the use of endermology as a placebo-like treatment. J Bodyw Mov Ther. 2003; 7(1):53–61.
- 19. Koo TK, Guo JY and Brown CM. Test-Retest Reliability, Repeatability, and Sensitivity of an Automated Deformation-Controlled Indentation on Pressure Pain Threshold Measurement. J Manipulative Physiol Ther. 2013; 36(2):84–90. pmid:23499143
- 20. Paungmali A, Sitilertpisan P, Taneyhill K, Pirunsan U and Uthaikhup S. Intrarater reliability of pain intensity, tissue blood flow, thermal pain threshold, pressure pain threshold and lumbo-pelvic stability tests in subjects with low back pain. Asian J Sports Med. 2012; 3(1):8. pmid:22461960
- 21. Vanderweeen L, Oostendorp RAB, Vaes P and Duquet W. Pressure algometry in manual therapy. Man Ther. 1996; 1(5):258–265. pmid:11440515
- 22. Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol. 2010; 63(7):737–745. pmid:20494804
- 23. Weir JP. Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res. 2005; 19(1):231–240. pmid:15705040
- 24. Terwee CB, Bot SD, de Boer MR, van der Windt DA, Knol DL, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol. 2007; 60(1):34–42. pmid:17161752
- 25. Fingleton CP, Dempsey L, Smart K and Doody CM. Intraexaminer and interexaminer reliability of manual palpation and pressure algometry of the lower limb nerves in asymptomatic subjects. J Manipulative and Physiol Ther. 2014; 37(2):97–104.
- 26. Graven-Nielsen T, Wodehouse T, Langford RM, Arendt-Nielsen L and Kidd BL. Normalization of widespread hyperesthesia and facilitated spatial summation of deep-tissue pain in knee osteoarthritis patients after knee replacement. Arthritis & Rheum. 2012; 64(9):2907–2916.
- 27. Ohrbach R and Gale EN. Pressure pain thresholds in normal muscles: reliability, measurement effects, and topographic differences. Pain. 1989; 37(3):257–263. pmid:2755707
- 28. Nie H, Graven-Nielsen T, Arendt-Nielsen L. Spatial and temporal summation of pain evoked by mechanical pressure stimulation. Eur J Pain. 2009; 13(6):592–599. pmid:18926745
- 29. Chesterton LS, Sim J, Wright CC and Foster NE. Interrater reliability of algometry in measuring pressure pain thresholds in healthy humans, using multiple raters. Clin J Pain. 2007; 23(9):760–766. pmid:18075402
- 30. Mutlu EK and Ozdincler AR. Reliability and responsiveness of algometry for measuring pressure pain threshold in patients with knee osteoarthritis. J Phys Ther Sci. 2015; 27(6):1961. pmid:26180358
- 31. Nussbaum EL and Downes L. Reliability of clinical pressure-pain algometric measurements obtained on consecutive days. Phys Ther. 1998; 78(2):160–169. pmid:9474108
- 32. Binderup AT, Holtermann A, Søgaard K and Madeleine P. Pressure pain sensitivity maps, self-reported musculoskeletal disorders and sickness absence among cleaners. Int Arch Occup Environ Health. 2011; 84(6):647–654. pmid:21400102
- 33. Binderup AT, Arendt-Nielsen L and Madeleine P. Pressure pain sensitivity maps of the neck-shoulder and the low back regions in men and women. BMC Musculoskelet Disord. 2010; 11(1):234.
- 34. Schless SH, Desloovere K, Aertbeliën E, Molenaers G, Huenaerts C et al. The intra-and inter-rater reliability of an instrumented spasticity assessment in children with cerebral palsy. PloS One. 2015; 10(7):e0131011. pmid:26134673
- 35. Landis JR and Koch GG. An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers. Biometrics. 1977; 33(2):363–374. pmid:884196
- 36. Harvill LM. Standard error of measurement. Educational Measurement: Issues and Practice. 1991; 10(2):33–41.
- 37. Bruyneel AV and Bridon F. Inclinométrie du genou: comparaison de la reproductibilité d’un outil mécanique et d’une application sur smartphone. Kinésithérapie, la Revue. 2015; 15(158):74–79.
- 38. Roren A, Fayad F, Roby-Brami A, Revel M, Fermanian J et al. Precision of 3D scapular kinematic measurements for analytic arm movements and activities of daily living. Man Ther. 2013; 18(6):473–480. pmid:23726286
- 39. Pinsault N, Fleury A, Virone G, Bouvier B, Vaillant J et al. Test-retest reliability of cervicocephalic relocation test to neutral head position. Physiother Theory Pract. 2008; 24(5):380–391. pmid:18821444
- 40. Aweid O, Gallie R, Morrissey D, Crisp T, Maffulli N et al. Medial tibial pain pressure threshold algometry in runners. Knee Surg Sports Traumatol Arthrosc. 2014; 22(7):1549–1555. pmid:23740326
- 41. Nikolajsen L, Kristensen AD, Pedersen LK, Rahbek O, Jensen TS et al. Intra- and interrater agreement of pressure pain thresholds in children with orthopedic disorders. J Child Orthop. 2011; 5(3):173–178. pmid:22654978
- 42. Park G, Kim CW, Park SB, Kim MJ and Jang SH. Reliability and usefulness of the pressure pain threshold measurement in patients with myofascial pain. Ann Rehabil Med. 2011; 35(3):412–417. pmid:22506152
- 43. Waller R, Straker L, O'Sullivan P, Sterling M and Smith A. Reliability of pressure pain threshold testing in healthy pain free young adults. Scand J Pain. 2015; 9:38–41.
- 44. Lacourt TE, Houtveen JH and van Doornen LJ. Experimental pressure-pain assessments: test–retest reliability, convergence and dimensionality. Scand J Pain. 2012; 3(1):31–37.
- 45. Walton D, MacDermid J, Nielson W, Teasell R, Chiasson M et al. Reliability, standard error, and minimum detectable change of clinical pressure pain threshold testing in people with and without acute neck pain. J Orthop Sports Phys Ther. 2011; 41(9):644–650. pmid:21885906
- 46. Hopkins WG. Measures of reliability in sports medicine and science. Sports Med. 2000; 30(1):1–15. pmid:10907753
- 47. Chiarotto A, Maxwell LJ, Terwee CB, Wells GA, Tugwell P et al. Roland-Morris Disability Questionnaire and Oswestry Disability Index: which has better measurement properties for measuring physical functioning in nonspecific low back pain? A systematic review and meta-analysis. Phys Ther. 2016.
- 48. Montenegro MLLS, Braz CA, Mateus-Vasconcelos EL, Rosa-e-Silva JC, Candido-dos-Reis FJ et al. Pain pressure threshold algometry of the abdominal wall in healthy women. Braz J Med Biol Res. 2012; 45(7):578–582. pmid:22527127
- 49. Soee ABL, Thomsen LL, Tornoe B and Skov L. Reliability of four experimental mechanical pain tests in children. J Pain Res. 2013; 6:103–110. pmid:23403523
- 50. Dunn KM, Jordan K and Croft PR. Characterizing the course of low back pain: a latent class analysis. Am J Epidemiol. 2006; 163(8):754–761. pmid:16495468
- 51. Downie AS, Hancock MJ, Rzewuska M, Williams CM, Lin CWC et al. Trajectories of acute low back pain: a latent class growth analysis. Pain. 2016; 157(1):225–234. pmid:26397929
- 52. Kongsted A, Kent P, Hestbaek L and Vach W. Patients with low back pain had distinct clinical course patterns that were typically neither complete recovery nor constant pain. A latent class analysis of longitudinal data. Spine J. 2015; 15(5):885–894. pmid:25681230
- 53. Fillingim RB, King CD, Ribeiro-Dasilva MC, Rahim-Williams B and Riley JL. Sex, gender, and pain: a review of recent clinical and experimental findings. J Pain. 2009; 10:447–485. pmid:19411059
- 54. Ge HY, Madeleine P and Arendt-Nielsen L. Sex differences in temporal characteristics of descending inhibitory control: an evaluation using repeated bilateral experimental induction of muscle pain. Pain. 2004; 110(1):72–78.