Development and validation of a one year predictive model for secondary fractures in osteoporosis

The number of osteoporosis-related fractures in the United States is no longer declining. Existing risk-based assessment tools focus on long-term risk. Payers and prescribers need additional tools to identify patients at risk for imminent fracture. We developed and validated a predictive model for secondary osteoporosis fractures in the year following an index fracture using administrative medical and pharmacy claims from the Optum Research Database and Symphony Health, PatientSource. Patients ≥50 years with a case-qualifying fracture identified using a validated claims-based algorithm were included. Logistic regression models were created with binary outcome of a second fracture versus no second fracture within a year of index fracture, with the goal of predicting second fracture occurrence. In the Optum Research Database, 197,104 patients were identified with a case-qualifying fracture (43% commercial, 57% Medicare Advantage). Using Symphony data, 1,852,818 met the inclusion/exclusion criteria. Average patient age was 70.09 (SD = 11.09) and 71.28 (SD = 14.24) years in the Optum Research Database and Symphony data, respectively. With the exception of history of falls (41.26% vs 18.74%) and opioid use (62.80% vs 46.78%), which were both higher in the Optum Research Database, the two populations were mostly comparable. A history of falls and steroid use, which were previously associated with increased fracture risk, continue to play an important role in secondary fractures. Conditions associated with bone health (liver disease), or those requiring medications that impact bone health (respiratory disease), and cardiovascular disease and stroke—which may share etiology or risk factors with osteoporosis fractures—were also predictors of imminent fractures. 
The model highlights the importance of assessment of patient characteristics beyond bone density, including patient comorbidities and concomitant medications associated with increased fall and fracture risk, in alignment with recently issued clinical guidelines for osteoporosis treatment.


Introduction
The number of osteoporosis-related fractures in the US is no longer declining [1,2]. Contributing factors include the demographic shift towards an aging population and a decrease in screening. In addition to limited primary prevention measures, secondary prevention to reduce the rate of subsequent fractures is also suboptimal. Most patients who incur a low-trauma fracture do not undergo osteoporosis evaluation or initiate treatment [3] despite existing quality of care measures [4].
In the first year following initial fracture, patients have the highest risk of incurring a subsequent fracture. The rate of secondary fractures varies from 4-17% depending on initial fracture site and population characteristics [5,6], with the cumulative incidence increasing to 21% in the four years after the index fracture [7]. In a 10-year follow-up study, secondary fracture rates were 28% for patients with an initial hip fracture and 35-38% for those with non-hip fractures. The risk was highest in the first year after fracture (5-45%) and declined progressively during the 10-year follow-up [8]. There is an incremental cost for second fractures [5]. The total all-cause cost of care is significantly higher in the year following index fracture for those experiencing a second fracture compared to those without a prior fracture for both Medicare ($34,327 vs $20,790; P < .001) and commercial health plan enrollees ($39,501 vs $19,131; P < .001) [6].
Several risk assessment tools, including FRAX and GARVAN, predict the 5- or 10-year probability of fractures [9]. The identification of patients at high risk for subsequent fracture is important to payers, providers, and patients alike. Assessment of risk in the year immediately following index fracture, when the potential to reduce avoidable events and associated burden and cost of illness is greatest, is even more relevant.
The goal of the current study was to develop and validate a predictive model for secondary osteoporosis-related fractures in the year following an index fracture using administrative claims data.

Data source
The study included commercial and Medicare Advantage health plan members with evidence of a case-qualifying fracture between January 1, 2007 and May 31, 2017 (identification period) using the administrative claims data from the Optum Research Database (ORD). Optum has access to a proprietary research database with medical and pharmacy claims data (including linked enrollment data) from 1993 covering 59.5 million lives or approximately 10% of the US population.
Anonymized patient-level data from Symphony Health, PatientSource were used for assessment of the model's predictive performance in new individuals (external validation). Data are payer-agnostic and provide access to individual-level healthcare claims for >280 million US-based commercial and Medicare Advantage enrollees. Unlike the ORD, the Symphony data include enrollees from various US payers, but not from Optum. The study identification period was September 1, 2012 to October 31, 2018. The index date was defined as the first fracture claim during the identification period.

Patient selection
Patients ≥50 years of age with a case-qualifying fracture during the identification period were included. Patients with Paget's disease of bone or malignancy, except nonmelanoma skin cancers, carcinoma in situ of the cervix, ductal carcinoma in situ of breast at baseline or through

Fracture definition
Fractures, including pathologic ones, were considered case qualifying if they were identified during either an inpatient hospitalization (any position on a hospital claim) or an outpatient visit with a repair procedure code, based on a primary or secondary International Classification of Diseases, Ninth Revision or Tenth Revision (ICD-9 or 10) code, listed on the same claim and based on a validated algorithm shown to have accuracy with a positive predictive value exceeding 90% [10]. Fracture episodes that started >30 days after the index date at a different anatomic site or those that started after 90 days with no fracture claims at the same anatomic site were considered subsequent fractures.
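The subsequent-fracture rule can be read as a simple decision function. The following is a minimal sketch under one possible reading of the sentence above, not the study's validated algorithm: a fracture episode at a different anatomic site counts if it starts more than 30 days after the index date, and an episode at the same site counts if more than 90 days have passed with no fracture claim at that site. The function name and parameters are hypothetical.

```python
def is_subsequent_fracture(days_after_index: int,
                           same_site_as_index: bool,
                           days_since_last_same_site_claim: int) -> bool:
    """Classify a fracture episode as subsequent (True) or not (False).

    Assumed reading of the text:
      - different anatomic site: episode starts >30 days after index
      - same anatomic site: >90 days with no fracture claim at that site
    """
    if not same_site_as_index:
        return days_after_index > 30
    return days_since_last_same_site_claim > 90


# Usage: a wrist fracture 45 days after an index hip fracture qualifies;
# a further hip claim 60 days after the last hip claim does not.
wrist_after_hip = is_subsequent_fracture(45, False, 0)
hip_follow_up = is_subsequent_fracture(60, True, 60)
```

Keeping the two washout windows as explicit comparisons makes it easy to see how a different reading of the rule would change the classification.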

Candidate predictor variables
Predictor variables considered during the baseline (12 months pre-index) period included demographic characteristics (ie, age, gender, geography), setting where the fracture occurred (inpatient/outpatient), clinical characteristics (ie, fracture history, site of index fracture, fall history, mobility issues, Parkinson's disease, stroke), concomitant medications associated with increased fall or fracture risk (ie, muscle relaxants, anxiolytics, sedatives, sleep medications, diuretics), or diseases or medications associated with poor bone quality or healing (ie, liver disease, diabetes, oral corticosteroids) [11]. Comorbid conditions were identified by the presence of a diagnostic claim in an inpatient or outpatient setting prior to index fracture using ICD-9 codes through September 30, 2015 and ICD-10 codes after this date. Medication use was assessed by the presence of ≥1 prescription claim. Comorbidity scores were calculated per the Quan-Charlson Comorbidity Index using diagnosis codes in the preindex period and categorized as 0, 1 to 2, 3 to 4, and ≥5 [12].
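Two of the coding rules in this paragraph are mechanical and can be sketched in a few lines. The example below is illustrative only: it assumes claims dated through September 30, 2015 are read as ICD-9 and later claims as ICD-10, and it bins an already-computed Quan-Charlson score into the four categories named above. The function names are hypothetical, and the score calculation itself is not shown.

```python
from datetime import date

ICD9_LAST_DAY = date(2015, 9, 30)  # assumed last service date coded in ICD-9


def icd_version(service_date: date) -> str:
    """Return the coding system assumed to apply to a claim date."""
    return "ICD-9" if service_date <= ICD9_LAST_DAY else "ICD-10"


def charlson_category(score: int) -> str:
    """Bin a Quan-Charlson score into the categories 0, 1-2, 3-4, >=5."""
    if score < 0:
        raise ValueError("score must be non-negative")
    if score == 0:
        return "0"
    if score <= 2:
        return "1-2"
    if score <= 4:
        return "3-4"
    return ">=5"


# Usage: a baseline claim from mid-2014 and a comorbidity score of 3.
example_version = icd_version(date(2014, 6, 15))
example_band = charlson_category(3)
```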

Model development and internal validation (ORD)
Logistic regression models were created with binary outcome of a second fracture versus no second fracture within a year of index fracture, with the goal of predicting the occurrence of a second fracture. The model selection process included an examination of several models, which considered all covariates as main effects only as well as main effects and two-way interactions. A stratified analysis was conducted separately for commercial and Medicare Advantage enrollees. Stepwise, forward, and backward selection including/excluding select variables based on clinical and statistical significance were carried out. Performance of each model was subsequently evaluated using the concordance index (c-statistic), and the final set of covariates was chosen to balance parsimony, performance, and interpretation.
Internal validation was conducted using bootstrapping methodology [13]. For each of 100 bootstrapped samples drawn with replacement: (a) a logistic regression model with the same set of covariates as the original model was fit to the bootstrapped sample and its c-statistic was calculated; (b) the original dataset was scored using the model from (a) and the c-statistic was calculated; (c) the difference between the c-statistics from (a) and (b) was taken as the optimism. The optimism was then averaged across all 100 samples, that average was subtracted from the c-statistic of the original model on the observed data, and the difference was reported as the internally validated c-statistic.
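The bootstrap steps (a) to (c) can be illustrated with a small, self-contained sketch. It is not the study's code: the data are simulated, the logistic regression is fit with a plain gradient-ascent routine written for this example, and all function names (`simulate`, `fit_logistic`, `optimism_corrected_c`) are hypothetical.

```python
import math
import random


def predict(w, x):
    """Predicted probability from weights [intercept, *coefficients]."""
    z = w[0] + sum(wj * v for wj, v in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=100):
    """Fit logistic regression by batch gradient ascent on the log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - predict(w, xi)
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def c_statistic(scores, y):
    """Share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, t in zip(scores, y) if t == 1]
    neg = [s for s, t in zip(scores, y) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def optimism_corrected_c(X, y, n_boot=100, seed=0):
    """Return (apparent c, optimism-corrected c) following steps (a)-(c)."""
    rng = random.Random(seed)
    w = fit_logistic(X, y)
    apparent = c_statistic([predict(w, x) for x in X], y)
    optimism = []
    for _ in range(n_boot):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:  # skip samples without both outcomes
            continue
        wb = fit_logistic(Xb, yb)
        c_boot = c_statistic([predict(wb, x) for x in Xb], yb)  # step (a)
        c_orig = c_statistic([predict(wb, x) for x in X], y)    # step (b)
        optimism.append(c_boot - c_orig)                        # step (c)
    return apparent, apparent - sum(optimism) / len(optimism)


def simulate(n, seed):
    """Simulated cohort: two covariates, only the first one predictive."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    y = [int(rng.random() < predict([-1.0, 1.2, 0.0], x)) for x in X]
    return X, y


if __name__ == "__main__":
    X, y = simulate(120, seed=1)
    apparent, corrected = optimism_corrected_c(X, y, n_boot=100, seed=2)
    print(f"apparent c = {apparent:.3f}, corrected c = {corrected:.3f}")
```

Refitting the model inside every bootstrap sample is what lets the procedure estimate how much of the apparent c-statistic is due to fitting and evaluating on the same data.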

Model scoring
The model calculates a patient's individual probability or risk of a second fracture in the year following the index fracture (prognostic prediction model) using coefficients for the variables relevant to a particular patient. The total score is based on the summation of values corresponding to each predictor variable (Fig 1A) with separate models by insurance type (Fig 1B). The summed value is then converted to a probability. The predicted probability is a patient-level measure. The predicted probabilities should be used with caution as they are dependent on the prevalence of refracture in the population and might be best used as relative or comparative measures (ie, Patient A is at higher risk than Patient B). A threshold for intervention may be set by the end user as to what may be considered a high-risk probability, with the objective of identifying and treating individuals thought to be at imminent risk of fracture.
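The scoring step can be sketched as a sum of coefficients passed through the logistic function. The coefficients below are made up for illustration and are not the values reported in Fig 1 or Tables 3 and 4; the variable names and the function `second_fracture_probability` are likewise hypothetical.

```python
import math

# Hypothetical coefficients for illustration only; NOT the published estimates.
EXAMPLE_COEFS = {
    "intercept": -3.0,
    "history_of_falls": 0.40,
    "oral_corticosteroids": 0.25,
    "index_hip_fracture": 0.30,
}


def second_fracture_probability(patient: dict, coefs: dict) -> float:
    """Sum the coefficients that apply to a patient and convert to a probability.

    `patient` maps predictor names to values (1 for present, 0 for absent).
    """
    score = coefs["intercept"] + sum(coefs[name] * value
                                     for name, value in patient.items())
    return 1.0 / (1.0 + math.exp(-score))


# Usage: compare two patients on relative risk, as the text recommends.
patient_a = {"history_of_falls": 1, "oral_corticosteroids": 1, "index_hip_fracture": 1}
patient_b = {"history_of_falls": 0, "oral_corticosteroids": 0, "index_hip_fracture": 1}
risk_a = second_fracture_probability(patient_a, EXAMPLE_COEFS)
risk_b = second_fracture_probability(patient_b, EXAMPLE_COEFS)
```

Because the calculation is only a weighted sum and one transformation, the same logic can be reproduced in a spreadsheet formula.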

External model validation (Symphony data)
Patient cohorts were created using the validated fracture algorithm as described previously.
Variables previously thought to be predictive of future fracture risk were considered. Subsequently, the predictive power of the two models and of the individual variables within them was compared. After validation was completed, additional conditions and concomitant medications, not included in the Optum model, but which were recommended for risk-based assessment by the recently issued clinical practice guidelines, were considered to further test the fit of the model (S1 Table) [10]. The guidelines suggest an evaluation of factors beyond bone health assessment, including comorbidities that increase the risk of falls and/or fractures, consistent with a recently issued consensus statement regarding secondary fracture prevention [14]. TRIPOD guidelines were followed in reporting of these results [15].

Results
In the ORD, 197,104 patients were identified with 1) a case-qualifying fracture between January 1, 2007 and May 31, 2017, and 2) ≥12 months of follow-up after the case-qualifying fracture. Forty-three percent (n = 84,866) were commercial and 57% (n = 112,238) were Medicare Advantage enrollees. Using Symphony data, 1,852,818 met the inclusion/exclusion criteria (Table 1).

Population differences
Differences in study populations were expected given the variations in database structures, availability of variables within the databases, and variations in patient populations covered. With the exception of history of falls (41.26% vs 18.74%) and opioid use (62.80% vs 46.78%), both of which were higher in the ORD, the two populations were mostly comparable. The most prevalent conditions associated with increased fall risk [11] for both populations were history of falls and mobility impairments. The most prevalent conditions associated with increased length of fracture healing were diabetes and renal disease. The most prevalent chronic conditions included cardiovascular disease (CVD), hypertension, respiratory disease, and arthritis. Commonly prescribed medications associated with increased fall risk [11] included opioids, beta blockers, proton pump inhibitors, and selective serotonin reuptake inhibitors.

Predictive model results: ORD versus Symphony
Various selection techniques generally selected the same set of variables and therefore had similar predictive performance. Predictive performance of the model, coefficients of all variables included in the model, and a summary of the key differences between the results of the ORD and Symphony model are provided (Tables 3 and 4).

Model validation
After forcing the same variables from the ORD model using Symphony data, the direction and parameter estimates remained consistent with a few notable exceptions. In both the Commercial and Medicare models, the estimate for history of ankle fracture changed from a negative to a positive predictor of subsequent fractures (Tables 3 and 4). In the Medicare model, prior oral corticosteroid use also changed from positive in the ORD model to negative in the Symphony model, though neither of these variables was statistically significant (Table 4). The two models differed mainly in the magnitude of the estimates. In both the Commercial and Medicare models, the Symphony model assigned a higher intercept, but the variable estimates tended to be of lower absolute magnitude (Tables 3 and 4). The fit of the models was comparable (C-statistics for the commercial model: Optum 0.737, Symphony 0.620; C-statistics for the Medicare model: Optum 0.637, Symphony 0.603). After separating commercial and Medicare patients, all models resulted in approximately the same c-statistics, suggesting that the models were not overly affected by including or excluding any single variable. Given that the predictors of future fracture risk remained consistent, the direction of the association persisted, and the fit of the two models was comparable, we consider the ORD model to have been validated using a different database.

Additional covariates of interest
All of the additional variables were available in the Symphony data, though some had relatively low prevalence (S1 Table in the Supplemental Materials). Most variables were significant predictors of future fracture risk except for acromegaly, Cushing disease, immobile paralysis, seizure disorders, and osteomalacia; however, the fit of the model did not improve in consideration of these characteristics and thus they were not considered. As an additional attempt to improve model accuracy, variables were checked for correlation; pairs with a correlation of 0.3 or higher had an interaction term added. The new variables and interaction terms were introduced to the model and tested for significance and impact with a forward stepwise methodology. The introduction of these variables slightly improved the fit of the models, increasing the c-statistic in the Commercial and Medicare models from 0.62 to 0.63 and from 0.603 to 0.609, respectively. The increase in predictive power indicated that these variables were not poor inclusions, but the difference was not large enough to render the original ORD model obsolete.
The estimated probability of fracture varied by patient, depending on demographic characteristics and presence or absence of comorbidities and conditions that increased the risk of falls and fractures. For example, in the commercial health plan population, a 59-year-old male with a hip fracture and no comorbidities had a lower risk of fracture (1.75%) than a 70-year-old female with a hip and shoulder fracture who also had a history of CVD (6.7%). In the Medicare Advantage population, a 70-year-old male with a hip fracture and no comorbidities had a lower risk of fracture in the year following his index fracture (3.78%) than a female counterpart with a spine fracture and who was being treated for anxiety (SSRI) and use of oral corticosteroids (OCS), and had a history of falls (19.53%).
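The correlation screen for interaction terms can be sketched as follows. This is an illustrative reading of the step described above, not the study's code: it assumes Pearson correlation on indicator variables, applies the 0.3 threshold to the signed correlation as stated in the text, and uses a tiny made-up dataset. The function names and the `a:b` naming of interaction columns are hypothetical.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation of two equal-length numeric lists (0.0 if constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def add_interactions(columns, threshold=0.3):
    """Add a product column for every pair correlated at or above the threshold."""
    out = dict(columns)
    for a, b in combinations(columns, 2):
        if pearson(columns[a], columns[b]) >= threshold:
            out[f"{a}:{b}"] = [x * y for x, y in zip(columns[a], columns[b])]
    return out


# Usage with made-up indicator data for six patients.
example = {
    "falls":    [1, 1, 0, 0, 1, 0],
    "mobility": [1, 1, 0, 0, 0, 0],
    "diabetes": [0, 1, 0, 1, 0, 1],
}
expanded = add_interactions(example)
```

The resulting interaction columns would then be candidates for the forward stepwise selection, which is not shown here.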

Discussion
The current study focused on the development and validation of a prediction model for secondary fracture. The output is the estimated probability of fracture in the year following an initial fracture for a patient with a given set of characteristics, which can be compared with that of another patient with a different set of characteristics. The C-statistics of 0.737 for the ORD commercial model and 0.637 for the ORD Medicare model indicate a 74% and 64% probability, respectively, that the model assigns a higher predicted risk to a randomly selected patient who had a second fracture than to a randomly selected patient who did not. Both model-based predictions are better than chance (0.5). The performance of the developed and internally validated ORD model was tested and externally validated in a new population of patients. The validation of the model is indicative of its robustness.
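The pairwise interpretation of the c-statistic can be made concrete with a short sketch. The predicted risks and outcomes below are made up for illustration, and the function name is hypothetical; ties in predicted risk are counted as half, a common convention assumed here.

```python
def c_statistic(predicted_risk, second_fracture):
    """Proportion of (fracture, no-fracture) patient pairs in which the patient
    with a second fracture received the higher predicted risk."""
    cases = [p for p, y in zip(predicted_risk, second_fracture) if y == 1]
    controls = [p for p, y in zip(predicted_risk, second_fracture) if y == 0]
    concordant = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return concordant / (len(cases) * len(controls))


# Four illustrative patients: two with a second fracture, two without.
risks = [0.90, 0.40, 0.35, 0.20]
outcomes = [1, 0, 1, 0]
example_c = c_statistic(risks, outcomes)  # 3 of 4 pairs are concordant
```

A value of 0.5 corresponds to ranking pairs no better than chance, and 1.0 to ranking every pair correctly.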
In addition to having a good performance in the development sample, the model performed well in a different population of patients supporting its generalizability. Specifically, some differences in the population characteristics in the ORD and Symphony models, including a higher history of falls and opioid use in ORD, are important given their association with increased fracture risk. We believe that the higher opioid use is due to fewer restrictions for use in the time period of the ORD model compared to today. A history of falls, a known risk factor for fractures, continues to play a role in secondary fracture risk and remains an important consideration for risk assessment.
The use of OCS was a positive predictor in ORD and a negative predictor in Symphony Medicare patients. Overall, higher steroid use was noted in Symphony; however, this was not limited to chronic steroid users. While we were able to adjust for certain conditions typically associated with steroid use (ie, arthritis, COPD), we did not have detailed clinical information on other factors associated with pre-treatment bone density (ie, weight) that could further impact the effect of steroids on fractures.
In addition to mobility issues, other concomitant medications [16] used to treat patient comorbidities need evaluation and consideration in fracture risk assessment. Both commercial and Medicare models had respiratory disease, depression, and liver disease as risk factors for secondary fractures, whereas stroke, mobility issues, and CVD were also predictors of risk in the commercial population. While mobility issues increase fall risk and subsequent fractures directly, CVD and stroke may be associated with increased fracture risk due to shared etiology or risk factors or as a marker of frailty [17].
Our model has several advantages over FRAX, a risk assessment tool used to evaluate the 10-year probability of hip and major osteoporotic fractures [18,19]. First, our model predicts one-year risk for patients who have already incurred a fragility fracture, thus focusing on secondary fracture prevention. Second, our model predicts all fractures, not just hip and major osteoporotic fractures. Third, our model is transparent, providing the model parameters. Fourth, the model recognizes that treatment and treatment failure could occur and includes consideration of osteoporosis treatment received [20]. In comparison, FRAX is to be used in treatment-naïve patients. Finally, our model included population characteristics that are readily available from the patient's health records (ie, comorbidities and concomitant medications associated with increased fall and fracture risk) and not subject to reporting bias associated with self-reporting of lifestyle risk factors required by FRAX. Our model is therefore more closely related to a population risk management tool because the output of the query is a list of patients at risk for subsequent fracture.
The model is also flexible and predicts outcomes for two different patient populations (commercial and Medicare enrollees). Each provider or health system may have a large number of patients in one or the other category and may want to run the model separately for individuals in the population of interest. The model does not require a special software tool. Microsoft Excel or IBM's Statistical Package for the Social Sciences can be used for calculations using the coefficient estimates and a list of comorbidities, concomitant medications, and osteoporosis disease and treatment history.
The model has several limitations, mainly due to the retrospective nature and inherent features of an administrative claims database. First, the database, like other claims data, is structured largely to collect information for billing purposes and not research. Any errors or inconsistencies in the documentation of diagnosis or medication codes may lead to misclassification of patients; however, we expect such errors to be low because the proper documentation of information in claims is a prerequisite for reimbursement. The ORD allows for the identification of individuals with guaranteed coverage during the desired period, while Symphony data does not. To maximize the opportunity to include the full spectrum of claims for patients in Symphony, patients were required to have recorded activity in the 9-14 months pre- and post-index fracture, which provided sufficient certainty that the patient remained in the database during the time periods. Second, the model is US-centric and not validated in other regions/countries, whereas other models, such as FRAX, provide calculations separately for different regions/countries based on local epidemiology and population characteristics. Third, because we used administrative data, we could not consider environmental factors that increase fall risk in the patient's residence [11]. Lastly, our data did not specifically evaluate future fracture risk for institutionalized patients as data were not available; however, we did include conditions that are prevalent in this population including diseases associated with poor mobility and balance.
Fractures are associated with increased disease burden even years after fracture incidence [21]. In an evaluation of longer term outcomes of women 70 years of age, the mean utility decrement due to fractures was 12-fold greater in patients with sentinel hip fracture and was increased 15-fold for spinal, 4-fold for forearm, and 8-fold for humeral fracture, highlighting the importance of secondary fracture prevention [8]. These data further emphasize the importance of secondary fracture prevention especially given the low rate of treatment initiation and adherence following fracture incidence. Information on risk of subsequent fracture may guide healthcare professionals in their decision-making regarding testing or initiation of treatment. The information could potentially modify patient attitude and behavior regarding their risk and subsequently impact treatment adherence.
In summary, the model suggests consideration of factors beyond bone density, including comorbidities and concomitant medications associated with an increased risk of falls, fractures, or reduced bone health/quality. Future research includes implementation in a health system to determine the usability in the fracture care pathway and impact on patient outcome.
Supporting information
S1 Checklist. TRIPOD checklist: Prediction model development and validation. (PDF)
S1