Using machine learning to study the effect of medication adherence in Opioid Use Disorder

Background Opioid Use Disorder (OUD) and opioid overdose (OD) impose huge social and economic burdens on society and health care systems. Research suggests that Medication for Opioid Use Disorder (MOUD) is effective in the treatment of OUD. We use machine learning to investigate the association between patients' adherence to prescribed MOUD, along with other risk factors, and potential OD following treatment in patients diagnosed with OUD.

Methods We used longitudinal Medicaid claims for two selected US states to subset a total of 26,685 patients with an OUD diagnosis and appropriate Medicaid coverage between 2015 and 2018. We considered patient age, sex, region-level socio-economic data, past comorbidities, MOUD prescription type and other selected prescribed medications, along with the Proportion of Days Covered (PDC) as a proxy for adherence to MOUD, as predictive variables for our model, and overdose events as the dependent variable. We applied four different machine learning classifiers and compared their performance, focusing on the importance and effect of PDC as a variable. We also calculated results based on risk stratification, where our models separate high-risk individuals from low-risk individuals, to assess usefulness in clinical decision-making.

Results Among the selected classifiers, the XGBoost classifier has the highest AUC (0.77), followed closely by Logistic Regression (LR). LR has the best stratification result: patients in the top 10% of risk scores account for 35.37% of overdose events over the next 12-month observation period. The PDC score calculated over the treatment window is one of the most important features, with better PDC lowering the risk of OD, as expected. In terms of risk stratification results, of the 35.37% of overdose events that the predictive model could detect within the top 10% of risk scores, 72.3% were in patients who were non-adherent to their medication (PDC < 0.8). Targeting the top 10% outcome of the predictive model could decrease the total number of OD events by 10.4%.

Conclusions The best performing models allow identification of, and focus on, those at high risk of opioid overdose. With MOUD being included for the first time as a factor of interest, and being identified as a significant factor, outreach activities related to MOUD can be targeted at those at highest risk.


$$\mathrm{PDC} = \frac{\text{Days Covered}}{\text{Total Days in Period}}$$
where days covered is calculated by: 1. Identifying all prescription claims and days supply for MOUD in the examined period; 2. Iterating through prescriptions in chronological order counting any "gap" days where the patient is not covered and adding any overlapping days to the next claim; and then 3. Deriving days covered by subtracting the total gap days from the days in the period.
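The three steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `proportion_of_days_covered`, the `(fill_date, days_supply)` claim layout, the inclusive period boundaries, and the decision to ignore claims filled outside the period are all assumptions made for the example.

```python
from datetime import date, timedelta


def proportion_of_days_covered(claims, period_start, period_end):
    """Return PDC for a list of (fill_date, days_supply) MOUD claims.

    Assumptions (not from the source): the period includes both end dates,
    and only claims filled inside the period are counted.
    """
    total_days = (period_end - period_start).days + 1

    # Step 1: keep the MOUD claims that fall in the examined period.
    in_period = [c for c in claims if period_start <= c[0] <= period_end]

    # Step 2: walk the claims chronologically, counting gap days and
    # pushing overlapping supply forward onto the next claim.
    gap_days = 0
    next_uncovered = period_start  # first day with no supply so far
    for fill_date, days_supply in sorted(in_period):
        if next_uncovered > period_end:
            break  # the rest of the period is already covered
        if fill_date > next_uncovered:
            # The patient had no supply between the two dates.
            gap_days += (fill_date - next_uncovered).days
            start = fill_date
        else:
            # Early refill: the new supply starts when the old one ends.
            start = next_uncovered
        next_uncovered = start + timedelta(days=days_supply)

    # Any days left after the last supply runs out are also gap days.
    if next_uncovered <= period_end:
        gap_days += (period_end - next_uncovered).days + 1

    # Step 3: days covered = days in period minus total gap days.
    return (total_days - gap_days) / total_days


example_claims = [
    (date(2016, 1, 1), 10),   # covers 1-10 January
    (date(2016, 1, 8), 10),   # early refill, shifted to 11-20 January
    (date(2016, 1, 26), 10),  # 5 gap days (21-25 January) before this fill
]
pdc = proportion_of_days_covered(example_claims, date(2016, 1, 1), date(2016, 1, 30))
```

In this example 25 of the 30 days are covered, so the patient would sit just above the PDC < 0.8 non-adherence threshold used in the abstract.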
Treatment medication
An important factor in determining outcomes is the treatment approach chosen by the treating doctor, including both whether prescription medication is used and which type of medication is prescribed for the OUD. Pharmacy claims in the first 3 months after the first diagnosis are scanned for any of the primary treatment medications. The medication with the largest number of claims in the period is considered to be the MOUD approach prescribed at the diagnosis point. A special case is considered where both Buprenorphine and Naltrexone are claimed in the treatment period, in order to capture protocols combining the medications [1,2]. This treatment approach is then encoded into a categorical predictor variable for all models, indicating window medication as one of:
• Buprenorphine

The targeted categories are then simplified into 29 binary variables, each indicating the presence of at least one diagnosed comorbid condition in the given category before OUD diagnosis. These binary variables represent a simplified medical history that can easily and realistically be used in predictive modelling at the time of first OUD diagnosis. This approach has many advantages:
1. Claims-level data without indicators of severity can be used.
2. It is simple and reproducible given a history of diagnosis codes.
3. Complex patient history can be considered without overwhelming the model with trivial features.
4. Features have a clear general clinical meaning, improving interpretability.
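The comorbidity encoding described above can be sketched as a simple scan over a patient's diagnosis history. This is only an illustration under stated assumptions: the three categories and their ICD-10 code prefixes in `CATEGORY_PREFIXES` are hypothetical stand-ins for the 29 targeted categories, and the function name and the `(diagnosis_date, code)` input layout are invented for the example.

```python
from datetime import date

# Hypothetical mapping from comorbidity category to diagnosis-code prefixes.
# The real analysis uses 29 targeted categories that are not reproduced here.
CATEGORY_PREFIXES = {
    "depression": ("F32", "F33"),
    "anxiety": ("F41",),
    "chronic_pain": ("G89",),
}


def comorbidity_flags(diagnoses, oud_diagnosis_date):
    """Return one 0/1 flag per category from (diagnosis_date, code) pairs.

    A flag is set when at least one code in the category was recorded
    strictly before the first OUD diagnosis.
    """
    flags = {category: 0 for category in CATEGORY_PREFIXES}
    for diagnosis_date, code in diagnoses:
        if diagnosis_date >= oud_diagnosis_date:
            continue  # only history prior to the OUD diagnosis counts
        for category, prefixes in CATEGORY_PREFIXES.items():
            if code.startswith(prefixes):
                flags[category] = 1
    return flags


history = [
    (date(2015, 3, 1), "F32.9"),  # before OUD diagnosis: sets "depression"
    (date(2016, 5, 1), "F41.1"),  # after OUD diagnosis: ignored
]
flags = comorbidity_flags(history, date(2016, 1, 1))
```

Because each flag needs only the presence of a code, the sketch works on claims-level data with no severity information, which is the first advantage listed above.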
Other prescriptions
NDC codes for the drug classes included as predictors (SSRIs, benzodiazepine class drugs, opioid-based analgesics) are extracted from the RxNorm API via the RxNormR R package and then used to scan prescription claims for the cohort. Prescriptions from the three classes are partitioned into:
• "Prior": claimed before the patient's first OUD diagnosis.
• "During": claimed during the month following the first OUD diagnosis.
This separation is made to differentiate historical factors from potential pharmaceutical interactions in co-prescribed medications. The presence or absence of a prescription claim in each class for each partition is then encoded as a binary variable (6 total). An assumption is made that some ongoing prescriptions a patient will use during treatment for OUD can be determined at the point of diagnosis.
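The partitioning into six binary variables can be sketched as follows. This is a minimal sketch under assumptions that are not in the source: the class labels in `DRUG_CLASSES`, the function name, the `(claim_date, drug_class)` input layout, the reading of "the month following" as 30 days, and the choice to count a claim made on the diagnosis date as "During" are all hypothetical. It also assumes each claim has already been matched to a drug class via its NDC code.

```python
from datetime import date, timedelta

# Hypothetical labels for the three drug classes named in the text.
DRUG_CLASSES = ("ssri", "benzodiazepine", "opioid_analgesic")


def prescription_flags(claims, oud_diagnosis_date, during_days=30):
    """Return six 0/1 flags (3 classes x prior/during) from claim pairs.

    `claims` is an iterable of (claim_date, drug_class). The 30-day
    "during" window is an assumption standing in for "the month following".
    """
    flags = {
        f"{drug_class}_{partition}": 0
        for drug_class in DRUG_CLASSES
        for partition in ("prior", "during")
    }
    during_end = oud_diagnosis_date + timedelta(days=during_days)
    for claim_date, drug_class in claims:
        if drug_class not in DRUG_CLASSES:
            continue  # not one of the predictor classes
        if claim_date < oud_diagnosis_date:
            flags[f"{drug_class}_prior"] = 1
        elif claim_date < during_end:
            flags[f"{drug_class}_during"] = 1
        # Claims after the "during" window are not used as predictors.
    return flags


claims = [
    (date(2015, 12, 1), "ssri"),             # prior
    (date(2016, 1, 10), "benzodiazepine"),   # during
    (date(2016, 3, 1), "opioid_analgesic"),  # outside both partitions
]
rx_flags = prescription_flags(claims, date(2016, 1, 1))
```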
The complete lists of drugs in each class follow.

• Flurazepam
List of Opioids

Income
Household income is included as a single categorical variable by dividing regions into income categories based on average household income quantiles:
• Low Income (quantile < 0.2)

Urban development
Urban development is indicated by a single categorical variable indicating whether an area is considered urban or rural according to its level of development. Source: Economic Research Service (ERS) of the United States Department of Agriculture. Rural-Urban Commuting Area Codes.