A comparison of three strategies to reduce the burden of osteoarthritis: A population-based microsimulation study

Objectives The purpose of this study was to compare three strategies for reducing population health burden of osteoarthritis (OA): improved pharmacological treatment of OA-related pain, improved access to joint replacement surgery, and prevention of OA by reducing obesity and overweight. Methods We applied a validated computer microsimulation model of OA in Canada. The model simulated a Canadian-representative open population aged 20 years and older. Variables in the model included demographics, body mass index, OA diagnosis, OA treatment, mortality, and health-related quality of life. Model parameters were derived from analyses of national surveys, population-based administrative data, a hospital-based cohort study, and the literature. We compared 8 what-if intervention scenarios in terms of disability-adjusted life years (DALYs) relative to base-case, over a wide range of time horizons. Results Reductions in DALYs depended on the type of intervention, magnitude of the intervention, and the time horizon. Medical interventions (a targeted increase in the use of painkillers) tended to produce effects quickly and were, therefore, most effective over a short time horizon (a decade). Surgical interventions (increased access to joint replacement) were most effective over a medium time horizon (two decades or longer). Preventive interventions required a substantial change in BMI to generate a significant impact, but produced more reduction in DALYs than treatment strategies over a very long time horizon (several decades). Conclusions In this population-based modeling study we assessed the potential impact of three different burden reduction strategies in OA. Data generated by our model may help inform the implementation of strategies to reduce the burden of OA in Canada and elsewhere.

BMI separately for men and women, based on BMI history and other covariates, such as region of residence, income, and education. The response variable is log BMI, treated as a continuous, normally distributed variable. The model includes the most recent BMI value (2 years back) and up to 3 earlier BMI values, i.e., 4, 6 and 8 years back. The model can be presented as: Log (BMIt) = b0 + b1 log (BMI history) + b2 Age + b3 Education + b4 Income + b5 Region + error(t).
There are regression equations separately for males and females. Within each sex stratum, there are four models, depending on the number of prior BMI values used, resulting in 8 models (equations). The parameters (regression coefficients) are shown in Table A1-2.

Baseline OA incidence rates
Age and sex-specific baseline OA incidence rates for MSM-OA were derived from administrative data in British Columbia (BC). A random sample of 742,070 records was obtained from Population Data BC, an administrative database which covers virtually all hospitalizations and visits to health professionals in BC since 1986. Similar to prior studies [16,23], we defined incident OA as a hospital discharge or two visits to a health professional within 2 years (and not on the same day) with a diagnostic code for OA (715 in ICD-9 or M15-M19 in ICD-10) in a person without OA within the previous 10 years. Visits to all types of health professionals were included. The date of the second code was the date of diagnosis. Person-years were obtained from individual data on registration with the provincial health plan. Baseline OA incidence rates by age and sex are shown in Table A1-3.

Initial prevalence of OA
The initial age/sex-specific prevalence of OA was calibrated in a simulation run using the incidence rates from Population Data BC (Table A1-3), such that the population was in a steady state at the beginning of the simulation [16]. To ensure that the distribution of BMI by OA status in the initial population was correct, we assigned OA to subjects who self-reported OA in the CCHS. In the age/sex groups in which the proportion self-reporting OA was greater than the calibrated prevalence from administrative data, we randomly reduced the number of persons with OA to match the calibrated prevalence, and in the age/sex groups in which the proportion self-reporting OA was smaller than the calibrated prevalence, we randomly increased the number with OA. The proportion of subjects in different stages of OA in the initial population was assigned based on the distribution observed in the Population Data BC database.

Effect of BMI on OA incidence
Relative risks (hazard ratios) for the effect of BMI on OA incidence was derived by analyzing longitudinal data from multiple cycles (1994/96/98/00/02) of the NPHS [16] (Table A1-4). Both OA and BMI were self-reported. We applied a survival regression model, adjusting for age. For the underweight category, because the sample size in the NPHS was too small, we obtained the relative risks from the CCHS (crosssectional data) and adjusted for cross-sectional bias by applying as a multiplier the mean ratio of the relative risks for the other BMI categories from the NPHS to the relative risks from the CCHS.

Medication use
We computed a drug use table, which contains stratified estimated probabilities of medication use (each of four medication categories) according to age group, sex, OA stage, and level of pain (Table A1-5). The drugs modeled, Drug(i) (i=1 to 4) were: 1) Acetaminophen (paracetamol); 2) Non-steroidal antiinflammatory drugs (NSAIDs); 3) COX-2 inhibitors (coxibs) and 4) Opioids. The level of pain was obtained from the pain domain of the Health Utilities Index Mark 3 (HUI3), a well-established measure of health utility, which consists of 8 domains (vision, hearing, speech, mobility, dexterity, cognition, emotion, and pain/discomfort). For details of modeling HUI3, please see the Section "HUI3 domain-specific model".
Modelling drug use was carried out in the following steps: Step 1. Using NPHS data, we modeled each Drug(i) as a function HUI3 Pain domain + age + sex + SROA, where age is continuous, and SROA is self-reported OA (0/1). Analysis dataset was the NPHS combined cycles, age 12+ only. We obtained odds ratios (ORs) estimating adjusted effect of pain levels on drug use.
Step 3b. We fit models to estimate BC use in 2012 of each Drug(i) from the 6 data points obtained in Step 3a. We used linear models with dependent variable percent use, using the same independent variables as before.
Step 6. Using BC administrative data (12+), we estimated the aggregate proportion using Drug(i) on a given day in 2012.
Step 7. We estimated drug use ratios to convert probability estimates for Drug(i) from administrative to self-report metric, as follows. Let SRprob_Drug(i) = P(Drug(i)) in 2012 estimated in Step 2b on self-report aggregate Canadian data, and ADMINprob_Drug(i) = P(Drug(i)) in 2012 estimated in Step 6 on administrative aggregate BC data. Then Ratio_Drug(i) = logit (SRprob_Drug(i)) -logit (ADMINprob_Drug(i)) (note, this is a difference in logits, not a ratio of logits).

Side effects of medication
We considered four side-effects (complications) associated with the use of NSAIDs, coxibs and opioids in OA: serious GI complications (ulcer with bleeding or perforation); dyspepsia; cardiovascular disease (CVD, except stroke); and stroke. In addition, we included lethal overdose as a side effect of opioids. We assumed no side effects from acetaminophen. The parameters describing the risk of side effects were derived from the literature [26][27][28][29][30][31][32][33][34][35][36]. However, we used different types of parameters for different side effects. For CVD and stroke, we obtained baseline incidence rates of these conditions in Canada from 2001 to 2017 (assumed to be constant thereafter) by age, sex and year from the Global Burden of Disease Study (GBD 2017) (Tables A1-6 and A1-7) and multiplied them by the relative risks (RRs) associated with the use of each medication, derived from the literature (Table A1-8). These RRs were assumed to be constant across age and sex strata and similar for all three classes of drugs [32,34). For GI complications, we obtained excess rates of ulcer and dyspepsia among patients taking medication from the literature (Table A1-8). We assumed those rates to vary by age (<70 vs. 70+ years). For opioid overdose, we used an estimate of the total number of deaths in Canada due to prescription opioids, excluding deaths due to illicit drugs (Table A1-8) and derived the probability of lethal overdose among patients with OA. Probabilities of death due to CVD and stroke by age, sex, and year were obtained from GBD 2017 as mortality/incidence ratios (Tables A1-9 and A1-10), and the risk of death from bleeding or perforated ulcer was obtained from the literature [43,44] (Table A1-11).

Impact of medication on HUI3
We assumed pain was the only HUI3 domain affected directly by medication (please see the Section on modeling HUI3). Other domains could have been affected indirectly, because of their relationship with pain. The effects of each medication type on pain was obtained from meta-analyses of published randomized trials (Table A1-12). The effect was expressed as change in pain relative to baseline, in a before-after comparison (rather than against placebo). This was done to better reflect the real-life effect of treatment, which includes the placebo effect. Each simulated subject was randomly assigned a probability of medication being effective. It should be noted that in our model pain was one of the factors determining the use of medication (Table A1-5) and the use of medication affected the level of pain. To avoid a circular relationship, we modeled two types of pain for each simulated subject taking medication, latent or counterfactual pain that the patient would have experienced without medication and actual pain (on medication). The probability of taking pain medication was determined by the former type of pain, whereas effect on HUI3 was determined by the latter. Rather than modeling the level of pain reduction in each simulated individual, we used the pain reduction parameter to simulate the proportion of users who get 100% reduction in pain (which is equivalent).

Frequency of joint replacement surgery (JRS)
We used administrative data from BC (Pop Data BC -Medical Services Plan Database) to derive baseline JRS rates. We defined primary and revision JRS in hospital data as a Canadian Classification of Diagnostic, Therapeutic, and Surgical Procedures (CCP) procedure code 935 or 934.1, or a Canadian Classification of Health Interventions (CCI) code 1VA53LAPN or 1VA53PNPN or 1VG53. We excluded cases if concurrently there was a diagnosis in the hospital record of the following ICD9 codes: 800-999, E800-E869, E880-E928, E950-E999, 140-208, 235-239, except codes 996.4, 996.6, 996.7, or T84, or the following ICD10 codes: S00-S99, V01-V99, W00-W99, X093-X99, C00-C97 and D37-D48. We defined joint replacement surgery as revision if concurrently there was a diagnosis in that hospital record with codes 996.4, 996.6, 996.7, or T84, otherwise, it was defined as primary. Baseline rates of primary and revision JRS by age, sex and time were obtained from a Poisson regression model with predictors age group, sex, and log10 (year-2000). The latter term was needed to accommodate a time trend, and necessitated that the model not be applied until the year 2001 at the earliest (current starting year for MSM-OA). Subjects could have up to 4 primary surgeries and any number of revision surgeries over lifetime. These models were only applied once the subject had OA. The parameters (model coefficients) are in Table A1-13.
To model JRS as a function of HUI3 domains, we carried out a case-control study in which cases were 220 patients with OA scheduled to undergo hip/knee replacement at Vancouver General Hospital  (vision, hearing, speech, dexterity, ambulation, pain, cognition, and emotion). HUI3 was measured prior to surgery in the cases and as part of the survey in the controls.
We used conditional logistic regression to model the association of JRS with each of the dichotomized (for pain there were 3 categories) HUI3 domains (Table A1-14). The baseline JRS rates by age and year for males and females are shown in Table A1-15 (primary) and Table A1-16 (revisions).
We fit the following model to predict a run-in time correction factor (the ratio of asymptotically estimated OA prevalence over the estimated prevalence at a given run-in time): OA_prev ~ years + years^2 + log10(years). The plots in Figure 1 show that the model fits the published numbers very well, and predictions asymptotically approach a sensible level. The second plot illustrates our approach to subsequently level off the estimated asymptotic prevalence rather than let it begin to drop when too far beyond the data range.
Since our data begin in 1990, we inflated the OA person-time for each subject within each calendar year they contributed to, by the factor prev(20) / (prev(min(20,max(year,1991)-1990)), where prev(t) is to be read off the right plot ( Figure 1). We then fit Poisson models for primary and revision surgery separately, as ln(E(count)) ~ age group + sex + log10 (year-2000), with an offset to put estimated regression coefficients in the per 1000 person-year scale.

Mortality post-surgery
The risk of death post JRS relative to age/sex-specific general population risk was assumed to be 1.80 during the first 90 days post-JRS based on data from the literature [45][46][47]. After 90 days the risk of death was assumed to return to the general population risk.

HUI3 domain-specific model
We modeled each HUI3 domain (attribute) separately. To this end, we used NPHS (longitudinal data, 1994-2012)  We applied domain-specific proportional odds (PO) models to predict each domain of HUI, applying models and updating domains in the following order: Vision, Hearing, Speech, Pain, Cognition, Dexterity, Mobility, Emotion. The order was important because models were nested. The vision model did not include any other domains, the hearing model included vision as a predictor, the speech model contained vision and hearing as predictors, etc., until finally the emotion model included all the other domains as predictors. After each domain-specific model was run, that domain was updated before applying the next model.
To update a given component, the cumulative logit model was applied. To this end, we calculated the linear predictor etai=X*beta for each level of the domain except level 1. The ith intercept was selected for the ith level's calculation. The intercept selected depended on the level being calculated, this was calculated once with each intercept. Then a cumulative probability was calculated as P(level≥i)=1/(1+exp (-etai)). This was repeated for each intercept from i=2 to k (where there are k levels in the domain). We used subtraction to calculate point probabilities from the cumulative probabilities as follows: P(level=j)=P(level≥j)-P(level≥j+1), where P(level≥1)=1. This produced a probability mass function (PMF) from which POHEM randomly selected a level for the given HUI3 domain to assign to the subject.
When calculating a linear predictor, we applied a multiplier to each coefficient of either 0 or 1 (for categorical variables, where a 1 is for the applicable category), or the variable's value for a continuous variable.
After all domains were updated, we re-calculated the aggregate HUI3 from the individual components using modified HUI3 coefficients derived as population-weighted averages of the coefficients for the uncollapsed categories that are available at http://www.healthutilities.com/hui3.htm. The population counts used in the weighting were obtained from the master CCHS file. The coefficients were calculated as follows: HUI3 aggregate was then calculated as: HUI3 =1.371*(b1*b2*b3*b4*b5*b6*b7*b8)-0.371, where b1-b8 are the modified coefficients for the 8 domains.
Model coefficients for the side effects of medication (stroke and ulcer/bleeding) for the HUI3 domains were obtained from the CCHS 2001 data. The effects of the side effects were adjusted for the same variables as the HUI3 domain models as follows: 1. Ordinal logistic models (proportional odds models) were fit regressing each collapsed ordinal HUI3 domain on stroke and ulcer in the same model plus the same variables as in the HUI3 model (Table A1- 3. We compared the regression coefficients for CVD, osteoarthritis, diabetes, and high blood pressure not adjusted for year and not including previous values of the domain vs. models adjusted for log base 10 of year and either one or two 2-year cycles back values of the HUI3 outcome domain. For each domain, the median of the eight ratios thus obtained was taken as the domain ratio. This domain ratio was multiplied against the regression coefficients and SEs for stroke, ulcer and non-ulcer dyspepsia in the new no-previous-year CCHS models. The purpose of this step was to reflect in the estimates the effects of year and autoregressive variables in the model on the odds ratios. The additional side effects were treated as 0s when applying the expanded HUI3 domain models, until such time as they appeared as side effects of drugs. The coefficients are shown in Table A1-17.  (Table A1-18). This model was applied once post-JRS, using data from a cohort study of patients undergoing JRS, obtained on average a year post-surgery. After that, HUI3 domains continued to be updated using the general HUI3 model (Table A1-17) that was used prior to JRS; however, prior levels of the domains in this model were those observed after surgery.  Incidence rates for selected ages are shown for illustration purposes. In the simulation, we used rates for each single year of age. The relative risk were estimated using a survival regression model.      The coefficients were obtained from a conditional logistic regression model.

Appendix 3: Sensitivity analysis
We estimated the impact of uncertainty in 17 key model parameters on the results in a one-way sensitivity analysis. The parameters, their confidence intervals, and source of data are listed in Table A3-1. For each parameter, we ran the model for the base-case and three intervention scenarios (one per strategy) assuming three values for the parameter, the mean, lower limit, and upper limit of the 95% confidence interval. We selected the Medication x2, Surgery x2 and BMI-0.3 scenarios for this analysis.
For each value of each parameter, we computed lifetime QALYs per person for the base-case scenario and each intervention scenario. We specified the sample size as 1 million simulated individuals for the duration of the simulation, which resulted in a population of about 360,000 in 2020. It may be noted that the sample size for the sensitivity analysis was 1/10 of the sample size for the main analysis (this was necessary for feasibility). As a result, standard error due to random variation in the model was around 0.018-0.019, which is significantly larger than uncertainty associated with some of the model parameters (e.g., reduction in pain due to medication). Therefore, the impact of these parameters should be interpreted with caution.
The results are presented numerically (actual QALYs) in Tables A3-2 -A3-5 and graphically in Figures A3-1 -A3-3 (tornado plots). The plots show the differences in QALYs between base-case and intervention scenarios for the mean, lower, and upper limit of the confidence interval for each parameter.  QALYs per person between base-case and Surgery x2 intervention scenario.