Healthy Volunteers Can Be Phenotyped Using Cutaneous Sensitization Pain Models

Background Human experimental pain models leading to development of secondary hyperalgesia are used to estimate efficacy of analgesics and antihyperalgesics. The ability to develop an area of secondary hyperalgesia varies substantially between subjects, but little is known about the agreement following repeated measurements. The aim of this study was to determine if the areas of secondary hyperalgesia were sufficiently robust to be useful for phenotyping subjects, based on their pattern of sensitization by the heat pain models. Methods We performed post-hoc analyses of 10 completed healthy volunteer studies (n = 342 [409 repeated measurements]). Three different models were used to induce secondary hyperalgesia to monofilament stimulation: the heat/capsaicin sensitization (H/C), the brief thermal sensitization (BTS), and the burn injury (BI) models. Three studies included both the H/C and BTS models. Results Within-subject variability was low compared to between-subject variability, and there was substantial strength of agreement between repeated induction sessions in most studies. The intraclass correlation coefficient (ICC) improved little with repeated testing beyond two sessions. There was good agreement in categorizing subjects into ‘small-area’ (1st quartile [<25%]) and ‘large-area’ (4th quartile [>75%]) responders: 56–76% of subjects consistently fell into the same ‘small-area’ or ‘large-area’ category on two consecutive study days. There was moderate to substantial agreement between the areas of secondary hyperalgesia induced on the same day using the H/C (forearm) and BTS (thigh) models. Conclusion Secondary hyperalgesia induced by experimental heat pain models seems a consistent measure of sensitization in pharmacodynamic and physiological research. The analysis indicates that healthy volunteers can be phenotyped based on their pattern of sensitization by the heat [and heat plus capsaicin] pain models.


Introduction
It has been hypothesized that the propensity for developing central sensitization in response to noxious events is a heritable trait and may explain differences in susceptibility to developing chronic pain [1]. Noxious skin stimulation of sufficient intensity and duration leads to the development of an area of secondary hyperalgesia surrounding the stimulation site (cutaneous sensitization). The area of secondary hyperalgesia is experimentally induced with chemical, electrical or heat stimuli. This phenomenon is inducible in most subjects and thought to be due, at least in part, to central neuronal sensitization [1]. Further, the size of the area of secondary hyperalgesia is commonly used as the primary outcome measure in human experimental studies of pain mechanisms, or when testing the effect of anti-hyperalgesic drugs [2–8].
It has been recognized that areas of secondary hyperalgesia vary substantially between subjects, but there is a scarcity of information in the literature about the within-subject and between-subject variability. Human twin studies indicate that genetic factors play an important role in the size of the secondary hyperalgesia area [9]. In support, a recent study of human twins showed that familial effects accounted for 24–32% of observed variances in heat and cold pressor pain thresholds, and in opioid-mediated elevations in cold pressor pain tolerance [10]. These observations need further validation.
To address the variability issue, we pooled drug- and placebo-free data from 10 different healthy volunteer pain model studies in which the area of secondary hyperalgesia, determined with mechanical stimulation, was used as a primary outcome measure. We aimed to examine the robustness of cutaneous secondary hyperalgesia induced by one or more of three different human experimental models: the heat/capsaicin sensitization model (H/C), the brief thermal sensitization model (BTS), and the burn injury model (BI). Because some of the studies used two different models (H/C and BTS) in the same subject, the variability and reliability of the two models could be compared directly. Our hypothesis was that it is possible to phenotype subjects based on their pattern of sensitization across days and across models.

Materials and Methods
Data were obtained from ten studies (9 published [2,4–8,11–13] and 1 unpublished [Petersen et al., unpublished data]) of healthy volunteers (conducted in Copenhagen or San Francisco) in which experimental cutaneous secondary hyperalgesia was induced on two or more study days, and in which the area of secondary hyperalgesia to monofilament stimulation was the primary outcome measure (Table 1). Three different pain models were used: the heat/capsaicin sensitization (H/C) model, the brief thermal sensitization (BTS) model and the burn injury (BI) model. Of the ten studies, five used only the H/C model ([2,4–6], unpublished data), one used the BTS model [11], and one used the BI model [12]. In three studies subjects underwent testing with both the H/C and BTS models [7,8,13]. Study design, timing and procedures were largely similar; differences are noted below and in Table 1. Data collected after administration of a study drug or placebo were not included in the analyses. In eight studies secondary hyperalgesia areas were measured prior to any study drug or placebo administration, on two or more study days at least one week apart. In two methodology studies not involving any drug administration, secondary hyperalgesia areas were collected during multiple sessions (Table 1) [12,13].

Subjects
All subjects were paid healthy volunteers recruited through flyers and advertisements in the Copenhagen and the San Francisco areas. Subjects with use of acute or chronic pain medication, or a prior history of drug or alcohol abuse were excluded. Written informed consent was obtained at the first visit in each study. All studies were conducted in accordance with the Helsinki Declaration and appropriate amendments, and approved by one or more of the following: Ethics Committee of Copenhagen, Danish Data Protection Agency, Danish Medicines Agency (for drug studies), and the Committee on Human Research at the University of California, San Francisco. All subjects were familiarized with the experimental pain model in a pre-study training session.

Sensitization Methods
Heat/capsaicin sensitization. The stimulation site was marked with a felt pen on the volar side of the dominant forearm. Sensitization was produced by heating the skin to 45°C for 5 min with the contact thermode. The skin was then covered with capsaicin cream (0.075%) for 30 minutes. After capsaicin removal, the area of secondary hyperalgesia was mapped.
Brief thermal sensitization. The stimulation site was marked on the center of the anterior side of the dominant thigh. BTS was induced by heating the skin to 45°C for 3 min. With the thermode still in place (45°C), the borders of secondary hyperalgesia were mapped.
Burn injury sensitization. The stimulation site on the medial upper part of the non-dominant lower leg was marked. A first-degree burn injury, with erythema and hyperalgesia, was induced by applying a contact thermode at 47°C for 7 minutes. The area of secondary hyperalgesia was mapped 1, 2 and 3 hours after induction of the BI.

Assessment of Secondary Hyperalgesia
The area of secondary hyperalgesia to stimulation with a monofilament (von Frey hair) was the primary outcome measure in all 10 studies. In all but one study the borders of secondary hyperalgesia were determined by stimulating with a monofilament (20.9 g [4], 21.5 g [2,5–7,13], 26.0 g [8,11]) along 4 linear paths arranged at 90° angles around the stimulation center, in 5-mm steps at 1-s intervals. Stimulation along each path started well outside the hyperalgesic area and continued toward the stimulated skin area. The subjects, who had their eyes closed during the assessments, reported the occurrence of a definite change in sensation: a mildly noxious pin-like sensation, more intense pricking, burning or tenderness. The border was marked, and the transverse and longitudinal axes were measured for surface area calculations.
In the burn injury study the secondary hyperalgesia border was determined with a monofilament (90.6 g), stimulating along 8 symmetric lines, each separated by an angle of 45°, converging towards the centre of the burn injury. Secondary hyperalgesia areas were measured 1, 2 and 3 hours after the BI. The corners of the octagon were marked on the skin with a felt-tipped pen and transferred to a transparent sheath. The octagonal secondary hyperalgesia area was calculated using a computer-based vector-algorithm (Canvas 12.0, ACD Systems International, Victoria, Canada).
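The octagonal area calculation can be illustrated with a minimal sketch. It is not the Canvas algorithm used in the burn injury study; it assumes only that each marked corner lies on one of 8 rays spaced 45° apart from the centre of the injury, so the area is the sum of the triangles between neighbouring rays. The function name and the example distances are hypothetical.

```python
import math


def polygon_area_from_rays(radii_cm):
    """Area (cm^2) of a polygon whose corners lie on equally spaced rays.

    radii_cm lists the distance from the centre to the marked border along
    each ray, in angular order. With 8 rays the spacing is 45 degrees.
    """
    n = len(radii_cm)
    if n < 3:
        raise ValueError("need at least three rays")
    step = 2 * math.pi / n  # angle between neighbouring rays
    # Each pair of neighbouring rays spans a triangle of area 0.5*r1*r2*sin(step).
    return 0.5 * math.sin(step) * sum(
        radii_cm[i] * radii_cm[(i + 1) % n] for i in range(n)
    )


# Hypothetical border distances (cm) along the 8 rays, not study data.
example_area = polygon_area_from_rays([5.0, 4.5, 6.0, 5.5, 5.0, 4.0, 4.5, 5.0])
```

Because the triangles share the centre point, the sketch is only valid when the centre lies inside the mapped area, which holds when every ray starts at the stimulation site.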

Statistical Analyses
Parallel analyses were performed for all 3 methods of induction of experimental cutaneous secondary hyperalgesia. Analyses are shown by study, then pooled across all studies. In 4 studies, the data sets no longer contained information on the order of treatment [2,5–7]. For those studies, pre-treatment data from the placebo session were used as the first measurement.
For the continuous outcome of area of secondary hyperalgesia, normality was examined using residual plots and histograms. The strength of association between the first and second measurement was examined with Pearson's or Spearman's correlation coefficients, depending on the distribution. The size of the difference between the 2 measurements was examined with mean differences, tested with a paired t-test. Agreement between the repeat measurements was examined with intraclass correlation coefficients (ICC(1,1)), reported with their 95% confidence intervals (CI) [14]. While there are no agreed-upon classifications for levels of agreement for ICCs, for interpretation purposes ICCs were categorized as slight/poor (≤0.2), fair (>0.2 to 0.4), moderate (>0.4 to 0.6), substantial (>0.6 to 0.8) and almost perfect (>0.8) [15]. Bland-Altman plots were presented on pooled data to assess systematic bias in the differences between measurement 1 and measurement 2 [16]. A mixed models approach was used to examine the effect of time in the six studies where the session order was known.
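As an illustration of the agreement measure, the following sketch computes ICC(1,1) from a one-way analysis of variance and applies the categories listed above. It is not the SAS or MedCalc code used in the analyses; the function names and the example areas are hypothetical, and confidence intervals are omitted.

```python
def icc_1_1(data):
    """One-way random-effects ICC(1,1).

    data is a list of per-subject lists, each holding k repeated measurements.
    """
    n = len(data)
    k = len(data[0])
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need >= 2 subjects with the same number (>= 2) of measurements")
    grand_mean = sum(sum(row) for row in data) / (n * k)
    subject_means = [sum(row) / k for row in data]
    # Between-subject and within-subject mean squares.
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(data, subject_means) for x in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def agreement_label(icc):
    """Map an ICC to the interpretation categories given in the text."""
    if icc <= 0.2:
        return "slight/poor"
    if icc <= 0.4:
        return "fair"
    if icc <= 0.6:
        return "moderate"
    if icc <= 0.8:
        return "substantial"
    return "almost perfect"


# Hypothetical areas (cm^2) for five subjects on two study days, not study data.
areas = [[60.0, 55.0], [120.0, 130.0], [30.0, 42.0], [95.0, 80.0], [150.0, 160.0]]
label = agreement_label(icc_1_1(areas))
```

ICC(1,1) treats the two sessions as interchangeable, so a systematic shift between days lowers the coefficient rather than being absorbed into a session effect.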
All significance levels reported are 2-sided. Statistical analyses were conducted using SAS version 9.2 (SAS Institute Inc, Cary, NC) and MedCalc Software version 11.1.6.0 (Mariakerke, Belgium). In the secondary analyses, a Bonferroni correction was applied to P-values to examine if the significance observed held after correction for multiple comparisons. The significance level was set to <0.05/9 = 0.006 for the H/C analyses, <0.05/5 = 0.01 for the BTS analyses and left as P<0.05 for the BI analyses because there is only one study with this method of induction. We do not know of an existing public repository for this type of data. However, data are freely available upon request.
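The Bonferroni thresholds quoted above follow from dividing the nominal significance level by the number of comparisons. A short worked sketch (the function name is hypothetical):

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


hc_threshold = bonferroni_threshold(0.05, 9)   # 0.0055..., reported as 0.006
bts_threshold = bonferroni_threshold(0.05, 5)  # 0.01
bi_threshold = bonferroni_threshold(0.05, 1)   # single study, stays at 0.05
```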

Reliability of Secondary Hyperalgesia Measurements
Compared to the size of the area of secondary hyperalgesia, the magnitude of the mean difference between measurement 1 and measurement 2 was small and non-significant for all studies but two (Table 2; BTS [11], BI [12]).
The standard deviations of measurement 1 and measurement 2 for the analysis on pooled data demonstrate the size of the between-subject variation in relation to the average of the measure (Table 2). For H/C the standard deviation was half the size of the average value, for BTS 52 to 57% of the average value, and for BI 65 to 67% of the average value.

Sensitization Models
Heat/capsaicin sensitization model. The eight studies ([2,4–8,13], unpublished data) using the H/C model had a level of association between the two repeat measurements that was 'fair' to 'substantial' (rho 0.28 to 0.76), with a correlation coefficient (rho) of 0.69 on the pooled data (Table 2). The mean differences between measurements 1 and 2 ranged from −15.95 cm² to 7.5 cm² and were not statistically significant (P ≥ 0.06), showing a small level of disagreement in the pooled analysis (−1.38 ± 42.8 cm², P = 0.66). The reliability as shown by the ICCs demonstrated 'fair' agreement in one study (ICC = 0.30), 'moderate' agreement in two studies (0.52 and 0.55), and 'substantial' agreement in 5 studies. The analysis on pooled data demonstrated a 'substantial' strength of agreement between the 2 measurements (ICC = 0.69). The Bland-Altman plots (Figure 2A) illustrate a random scattering of points both above and below the average mean difference with few points falling outside the 95% confidence interval, showing a lack of systematic bias between the first and second measurements. Mixed models did not show any systematic bias based on whether the measurement was from the first day or a subsequent day (P = 0.45).
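The quantities behind a Bland-Altman plot can be sketched as follows. This is not the MedCalc procedure used for Figure 2; it assumes the difference is taken as measurement 1 minus measurement 2 and uses the conventional bias ± 1.96 SD band. The function name and example values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(measurement_1, measurement_2):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences (measurement 1 minus
    measurement 2, an assumed direction); the limits are bias +/- 1.96
    times the sample standard deviation of the differences.
    """
    if len(measurement_1) != len(measurement_2) or len(measurement_1) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(measurement_1, measurement_2)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical areas (cm^2) on two study days, not study data.
bias, lower, upper = bland_altman([60.0, 120.0, 30.0, 95.0], [55.0, 130.0, 42.0, 80.0])
```

A bias close to zero with points scattered evenly inside the band corresponds to the absence of systematic offset described for the H/C model; a bias clearly away from zero would indicate a day-order effect.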
Brief thermal sensitization model. The four studies [7,8,11,13] using the BTS model had a level of association between the 2 repeat measurements that was 'substantial' (rho 0.57 to 0.86), with a correlation coefficient (rho) of 0.74 on pooled data (Table 2). The ICCs ranged from 0.48 to 0.84, with one study categorized as 'moderate' (0.48) and the remaining 3 studies all categorized as 'substantial' agreement or higher (>0.6). The ICC based on pooled data showed 'substantial' agreement (0.74). The pooled results demonstrated a small level of disagreement on average (−0.79 ± 63.75 cm², N.S.). The Bland-Altman plots (Figure 2B) showed no discernible pattern of systematic bias. Mixed models showed no systematic offset based on day order (P = 0.48).
Burn injury model. Secondary hyperalgesia induced by the BI model was assessed in one study [12]. The reliability as shown by the ICC was 0.58. There was a significant disagreement on average values (7.81 ± 24.21 cm², P = 0.002), a bias pattern also evident in the Bland-Altman plot (Figure 2C). The mixed models showed a systematic offset in area of secondary hyperalgesia based on day order, with the second day lower on average (P = 0.002). These calculations were based on the one-hour post-burn assessment.
Five percent of subjects did not develop an area of secondary hyperalgesia at all (six area measurements collected on two session days) and an additional 11% did not develop a secondary hyperalgesia area in ≤4 of the 6 measurements.
Categorizations of secondary hyperalgesia areas into quartiles. Agreement between measurements, split into quartiles, across days and models, is presented in Table 3. Perfect test-retest agreement for all quartiles in the heat/capsaicin, brief thermal sensitization and burn injury models was 51%, 54% and 49%, respectively. Perfect agreement for the 1st quartile ('small-area') and 4th quartile ('large-area') was between 56 and 76%. The weighted kappa statistics were 0.66 (95% CI: 0.57–0.74) for the heat/capsaicin model, 0.74 (0.65–0.82) for the brief thermal sensitization model and 0.74 (0.65–0.82) for the burn injury model, corresponding to a good strength of agreement. The probability of measurement 2 moving more than one quartile away from measurement 1 (Table 4) was fairly similar for the three models, ranging from 0 to 13%.
Table 2. Areas of secondary hyperalgesia for measurement 1 and 2 (mean ± SD), P-values for paired t-tests between measurements, Spearman's rank correlation coefficient (rho), coefficient of variation (CV, mean ± SD), intra-class correlation coefficient (ICC, 95% CI) for 2 measurements and for data with more than 2 measurements.
Repeatability across models (H/C and BTS). To simplify calculations, the observations were categorized as 'small-area' (1st quartile), 'mid-area' (2nd and 3rd quartile) and 'large-area' (4th quartile) secondary hyperalgesia (Table 4). There were three studies [5,7,8] with two measurements of secondary hyperalgesia induced concurrently with both the H/C and BTS models. The agreement in classification to 'small-', 'mid-' or 'large-area' across induction methods was 59% for measurement 1 and 69% for measurement 2 (Table 5). When assessed with both the H/C and BTS models on the first assessment day, 32% of the subjects fell in either the 'small-area' or 'large-area' category with both models. Similarly, 42% fell in the same category on the second assessment day (Table 5). The weighted kappa statistics for these six measures ranged from 0.24 to 0.67, with 5 (62.5%) of the comparisons having 'moderate' or 'substantial' agreement (Table 6).
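The quartile categorization and weighted kappa can be sketched as below. This is an illustration rather than the procedure used in the analyses: the weighting scheme is not stated in the text, so linear disagreement weights are assumed, and the quartile cut points come from Python's default `statistics.quantiles` method. Function names and example areas are hypothetical.

```python
from statistics import quantiles


def quartile_labels(values):
    """Assign each value to quartile 1-4 using cut points from this sample."""
    q1, q2, q3 = quantiles(values, n=4)
    return [1 + (v > q1) + (v > q2) + (v > q3) for v in values]


def weighted_kappa(labels_1, labels_2, n_categories=4):
    """Cohen's kappa with linear disagreement weights (assumed scheme)."""
    n = len(labels_1)
    observed = [[0] * n_categories for _ in range(n_categories)]
    for a, b in zip(labels_1, labels_2):
        observed[a - 1][b - 1] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [
        sum(observed[i][j] for i in range(n_categories)) for j in range(n_categories)
    ]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            weight = abs(i - j) / (n_categories - 1)  # 0 on the diagonal
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical areas (cm^2) for eight subjects on two days, not study data.
day_1 = [20.0, 35.0, 50.0, 65.0, 80.0, 95.0, 110.0, 125.0]
day_2 = [25.0, 30.0, 70.0, 55.0, 85.0, 120.0, 100.0, 130.0]
kappa = weighted_kappa(quartile_labels(day_1), quartile_labels(day_2))
```

With weights of this kind, a move to the neighbouring quartile is penalized less than a jump from 'small-area' to 'large-area', which matches the emphasis in the text on subjects rarely moving more than one quartile.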
Value of additional measurements. Including additional measurements in the ICC calculations for the three studies [5,7,8] with >2 days of data had little effect, except for one study with a low ICC for the measurements on the first 2 days, in which the ICC improved when an additional 4 measurements were added (H/C: 0.30 vs. 0.66; BTS: 0.48 vs. 0.81) [5]. In the BI study, using the median values of the 3 hourly post-burn assessments, the correlation coefficient (rho) increased from 0.64 to 0.77 (95% CI, 0.68–0.84) and the ICC from 0.58 to 0.67 (95% CI, 0.52–0.77).
Secondary statistical analyses. Results were largely similar after applying a Bonferroni correction to P-values. The mean difference in area of secondary hyperalgesia for the BTS method was no longer significantly different from zero, and most tests of whether the kappa statistics differed from zero lost significance.

Discussion
This study demonstrated that measurement of secondary hyperalgesia areas, induced by experimental heat [and heat plus capsaicin] pain models, seems a consistent and repeatable measure of sensitization in pharmacodynamic and physiological research. The statistical analyses indicated that healthy volunteers can be phenotyped based on their pattern of sensitization by the heat pain models, confirming the study hypothesis.

Within-Subject vs. Between-Subject Variation
Most importantly, the present analysis showed a low within-subject test-retest variation, compared to the substantial between-subject variation. Intraclass correlation coefficients indicated that the within-subject variance was approximately 25% of the between-subject variance, demonstrating good within-subject agreement during repeated assessments.

Agreement across Measurements and Methods
The agreement between measurements, in pooled secondary hyperalgesia areas, was 'substantial' in all models, and the agreement between areas induced with the H/C and BTS models on the same study day was 'moderate' to 'substantial' (Table 2). Grouping the first measurement session data into quartiles, widely separating the 'small-area' from 'large-area' responders, illustrated that the categorizations generally held up at the second measurement session (Tables 3 and 4). For 'small-area' (1st quartile) as well as 'large-area' (4th quartile) responders, 56–76% of subjects remained in the same quartile grouping. Subjects almost never crossed all the way from 'small-area' to 'large-area' (or vice versa); this was seen in only 1 out of 409 assessments. Surprisingly, for those studies that had more than two sessions, the improvement of ICC from having additional measurement sessions was limited.
These results suggest that the area of secondary hyperalgesia is reproducible across multiple sessions. 'Small-area' and 'large-area' responders can be selected with reasonable confidence by requiring that a subject fall in the respective categories on two consecutive model inductions. If the goal in a pharmacodynamic trial was to select a group of 20 'small-area' and 20 'large-area' responders, approximately 130 subjects would need to undergo pre-trial induction of secondary hyperalgesia on the first measurement. From these tested subjects, 'small-area' and 'large-area' responders are selected (n = 65) and continue with the same model on the second measurement. Since 60–70% will repeat an identical outcome at the second measurement, at least 20 'small-area' and 20 'large-area' subjects with a consistent secondary hyperalgesia response are likely to appear. Notably, when secondary hyperalgesia was induced with both the H/C (forearm) and BTS (thigh) models in the same subject on the first assessment day, 32% of subjects were classified in the same responder group ('small' or 'large') with both models, suggesting robustness across models, and that subjects can be classified as 'small-area' or 'large-area' subjects in a single study day.
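The screening arithmetic in this paragraph can be written out as a short sketch. It assumes that the 1st and 4th quartiles together make up half of the screened subjects, that 'small-area' and 'large-area' responders repeat their category at the same rate, and that consistent responders split evenly between the two groups; the function name is hypothetical.

```python
def expected_consistent_responders(n_screened, repeat_rate, extreme_fraction=0.5):
    """Expected responder counts after a two-session screening.

    extreme_fraction is the share of screened subjects in the 1st or 4th
    quartile at session 1; repeat_rate is the share of those who fall in
    the same quartile again at session 2.
    Returns (selected_after_session_1, consistent_total, consistent_per_group).
    """
    selected = n_screened * extreme_fraction
    consistent = selected * repeat_rate
    return selected, consistent, consistent / 2


low = expected_consistent_responders(130, repeat_rate=0.60)   # about 19.5 per group
high = expected_consistent_responders(130, repeat_rate=0.70)  # about 22.75 per group
```

Under these assumptions, screening 130 subjects yields roughly 20 to 23 consistent responders per group, so the lower end of the 60–70% range sits right at the target of 20.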

'No-Responders'
Little information is available on the proportion of healthy persons in the general population who would fail to develop an area of secondary hyperalgesia in response to one of the three pain models used in the 10 studies analyzed here. For the 9 studies using H/C and/or BTS ([2,4–8,11,13], unpublished data), such subjects were excluded after the training session for failing to develop the primary outcome measure and no further data were collected. In the BI study [12] analyzed, 5% failed to develop an area at any session.

Limitations and Advantages of the Study
First, this is a secondary analysis of 9 published studies and one unpublished study, which should be taken into consideration when interpreting the results. The study is, as such, a hypothesis-generating study, but the high number of subjects (n = 342) and test-retest measurements (409 repeated measurements) seems to give the hypothesis a solid basis. Second, the studies differ in methodology with regard to thermode size, measurement timing, bending force of the monofilaments, area calculations and body region sensitized. Some of these variables may affect the outcome, but unfortunately there is a void in the literature on the significance of these interactions. In each study, the methods were identical during the test-retest measurements, counterbalancing some of the negative effects of pooling the studies. An illustrative example was seen in the BI model, where a highly significant carry-over effect across the measurements was evident, indicating a habituation effect between measurements (Figure 2C). The median interval between the measurements was 3 weeks, but hyposensitivity to painful stimuli has been described for up to 6 weeks after the BI [18]. Carry-over effects were not observed with either the H/C or the BTS pain models. To counterbalance this effect in the BI model we refrained from using absolute area values and instead used a relative ranking of the values during each of the two measurements, which seemed to eliminate the effect of habituation. Further, the robustness of the results across models and trial methodology may be considered to strengthen the overall conclusion of the analyses. Third, secondary hyperalgesia may differ not only in area size but also in duration [19], but the present study, based on single-time measurements, is not suited to address this question. Fourth, it is important to realize that data from the present study in volunteers cannot be extrapolated to clinical situations with patients.

Phenotyping and Prediction of Pain Sensitivity
Although speculative, our findings may represent a window of opportunity to further explore differences (phenotypes) in individual pain sensitivity and propensity to develop areas of secondary hyperalgesia, which likely depends on central neuronal sensitization. As pointed out by Woolf [1], the comorbidity of a number of pain hypersensitivity syndromes and their similar pattern of clinical presentation may reflect inherent, common attributes of central sensitization to their pathophysiology. Most importantly, the potential predictive value of identifying 'small-area' and 'large-area' responders needs to be explored. Thus far, one study comprising 20 patients investigated the potential of preoperatively induced secondary hyperalgesia to predict subsequent dynamic postoperative pain in surgical patients, but no correlations were found [20]. However, a recent study demonstrated that the postsurgical area of secondary hyperalgesia predicted chronic postsurgical neuropathic pain in the iliac crest bone harvest model [21]. Determining the predictive value of secondary hyperalgesia area for acute pain and other measures of pain hypersensitivity, as well as the risk for development of chronic pain, awaits much larger, sufficiently powered studies.

Phenotyping and Cutaneous Sensitization
The results of the previously mentioned studies in twins [9,10] demonstrate substantial influence of heritable factors on the response to pain stimuli, such as the cold pressor response. A subject's propensity for developing cutaneous sensitization may also contain underlying genetic factors that can be explored [9]. Phenotyping subjects based on their ability to develop cutaneous sensitization may, in fact, be equally or more important than phenotyping subjects on the basis of pain thresholds and could potentially be added to the battery of testing following algorithms like those of the German Research Network on Neuropathic Pain (DFNS) [22].
Table 5. The agreement in classification to small-, mid- or large-areas across induction methods was 59% for measurement 1 and 69% for measurement 2. doi:10.1371/journal.pone.0062733.t005
Table 6. The weighted Cohen's kappa statistic for the pooled analyses showed moderate agreement for measurement 1 (0.44) and substantial agreement for measurement 2 (0.62).

Conclusion
In conclusion, post-hoc analyses of test-retest data from 342 healthy volunteers in 10 studies employing experimentally induced cutaneous secondary hyperalgesia corroborated the methodological consistency of these models in physiological and pharmacodynamic research, and indicated that healthy volunteers can be phenotyped based on their pattern of sensitization by the heat [and heat plus capsaicin] pain models. These findings may represent an incentive to further explore individual differences in the ability to develop neuronal sensitization, to investigate if 'large-area' responders are at an increased risk of developing conditions associated with pain hypersensitivity and chronification, and to examine whether the pharmacodynamic profiles of analgesics and anti-hyperalgesics differ between 'large-area' and 'small-area' responders.