Combinations of medicines in patients with polypharmacy aged 65–100 in primary care: Large variability in risks of adverse drug related and emergency hospital admissions

Background Polypharmacy can be a consequence of overprescribing that is prevalent in older adults with multimorbidity. Polypharmacy can cause adverse reactions and result in hospital admission. This study predicted risks of adverse drug reaction (ADR)-related and emergency hospital admissions by medicine classes. Methods We used electronic health record data from general practices of Clinical Practice Research Datalink (CPRD GOLD) and Aurum. Older patients who received at least five medicines were included. Medicines were classified using the British National Formulary sections. Hospital admission cases were propensity-matched to controls by age, sex, and propensity for specific diseases. The matched data were used to develop and validate random forest (RF) models to predict the risk of ADR-related and emergency hospital admissions. Shapley Additive eXplanation (SHAP) values were calculated to explain the predictions. Results In total, 89,235 cases with polypharmacy and hospitalised with an ADR-related admission were matched to 443,497 controls. There were over 112,000 different combinations of the 50 medicine classes most implicated in ADR-related hospital admission in the RF models, with the most important medicine classes being loop diuretics, domperidone and/or metoclopramide, medicines for iron-deficiency anaemias and for hypoplastic/haemolytic/renal anaemias, and sulfonamides and/or trimethoprim. The RF models strongly predicted risks of ADR-related and emergency hospital admission. The observed Odds Ratio in the highest RF decile was 7.16 (95% CI 6.65–7.72) in the validation dataset. The C-statistics for ADR-related hospital admissions were 0.58 for age and sex and 0.66 for RF probabilities. Conclusions Polypharmacy involves a very large number of different combinations of medicines, with substantial differences in risks of ADR-related and emergency hospital admissions. 
Although the medicines may not be causally related to increased risks, RF model predictions may be useful in prioritising medication reviews. Simple tools based on few medicine classes may not be effective in identifying high risk patients.

Introduction
A recent UK Government Review of Overprescribing of medicines highlighted the need to reduce prescribing, as at least 10% of the current volume of medicines in the UK may be unnecessary [1,2]. Older patients frequently receive multiple medicines as they are more likely to have multiple long-term conditions. These conditions often result in multiple medicines being prescribed, or polypharmacy, which is particularly common in frail older people [2]. Polypharmacy is often intended to reduce the risk of future morbidity and mortality in each of the patient's specific health conditions. The underlying evidence for drug treatment in patients with multiple long-term conditions is often poor, as clinical trials usually focus on single conditions and drugs, excluding participants with multimorbidity and polypharmacy [3]. A recent policy report proposed a pragmatic approach by classifying polypharmacy into 'appropriate' and 'problematic'. Appropriate polypharmacy was defined as pharmacotherapy that extends life expectancy and improves quality of life. In contrast, problematic polypharmacy concerns pharmacotherapy with an increased risk of drug interactions and adverse drug reactions (ADRs), together with impaired adherence to medication and quality of life for patients [4]. The World Health Organization has highlighted that unsafe medication practices and medication errors are a leading cause of injury and avoidable harm in health care systems across the world [5].
A systematic review of problematic polypharmacy, its burden and the effectiveness of interventions to reduce it found that interventions can reduce problematic polypharmacy but without effect on health outcomes. It concluded that evidence of the extent of problematic polypharmacy in the UK, and of what interventions are effective, is limited [6]. A possible reason for the limited effectiveness of interventions to optimise prescribing in patients with polypharmacy may be the lack of screening tools to identify polypharmacy at higher risk of ADRs. The 2015 NICE Medicines optimisation guideline provides general advice on, e.g., systems for reporting ADRs but only limited information on which medicine combinations would need a medicine review. It recommended using screening tools such as STOPP/START, based on pharmacological considerations and expert consensus, to identify potentially inappropriate prescribing and treatments that might be changed [7]. However, a cluster randomised trial found that a structured medicine review based on the STOPP/START criteria reduced prescribing but without any effect on drug-related hospital admissions, which was the primary outcome [8]. A recent review found limited evidence that interventions in polypharmacy, such as medication reviews, resulted in clinically significant improvements [6].
The aim of this study was to develop and test a new screening tool for identifying medicine combinations in patients with polypharmacy at high risk of hospital admissions. The approach in this study was data-driven without prior hypotheses of pharmacological plausibility of the effects of the medicines considered.

Database
Data sources were the Clinical Practice Research Datalink (CPRD GOLD) [9] and Aurum [10]. CPRD GOLD and Aurum contain longitudinal, anonymised, patient-level electronic health records (EHRs) from general practices in the UK. Almost all UK residents are registered with a general practice, which typically provides most of the primary healthcare. If a patient receives emergency care (e.g., at an Accident & Emergency department) or inpatient or outpatient hospital care, the general practice of the patient will be informed. All UK general practices use EHRs, which are provided by different EHR vendors, including EMIS and Vision. EMIS is the most frequently used primary care EHR, whereas Vision was previously used more frequently [11]. The CPRD GOLD database includes general practices that use the Vision EHR software system, while Aurum practices use EMIS Web. Practices can change their EHR software, although this will be reflected in the start and end of data collection for each practice. CPRD GOLD includes data on about 11.3 million patients [9] and Aurum 19 million patients [10], although practices and patients may have contributed data for varying durations of time. These databases include the clinical diagnoses, medication prescribed, vaccination history, lifestyle information, clinical referrals, as well as patients' age, sex, ethnicity, smoking history, and body mass index (BMI). The patient-level data from the general practices in England were linked through a trusted third party to hospital admission data (hospital episode statistics) using unique patient identifiers [9]. The hospital data contained information on the date of hospital admission and the clinical diagnoses established at and during admission and coded using ICD-10. Linked data were also available, starting April 1, 2007, for visits to emergency departments, including the visit day, but presenting diagnosis data were less complete for these visits.
Patient-level socioeconomic information was approximated from Index of Multiple Deprivation (IMD) linked to the patient's residential postcode [12]. Patient-level IMD was aggregated into quintiles for the current analysis. Medicines were classified using the British National Formulary (BNF) sections which is the prescribing guide for UK clinicians.

Study population
The overall study population consisted of patients aged 65-100 years at any time during the observation period (from January 1, 2000 to July 1, 2020 for CPRD GOLD or up to September 1, 2020 for Aurum) who were registered with a practice in England that participated in record linkage. Patient demographics included sex, age, ethnicity, and medical history. We calculated the Charlson comorbidity score for each patient using their medical history [13]. Follow-up of individual patients considered their start date of registration with a general practice, prior history of registration in the practice of at least three years, time of reaching age 65 as well as end date due to moving away or death and time of reaching age 101. The follow-up of each patient was divided into 3-month periods with risk factors such as presence of morbidity assessed at each of these time-periods. These data were used in the matching process. Presence of polypharmacy, defined as the prescription of ≥5 medicines in the 84 days before [2], was assessed at each interval. Most prescriptions are typically issued for a duration of 1-2 months (the 95th percentile of prescription duration was 60 days). Prescribing in the 84 days before the start of each interval was assessed and the number of distinct drug classes counted. Non-pharmacological prescribing, such as blood glucose monitoring equipment, dressings, stoma, or urinary catheter-related products and vaccines, was not included.
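The polypharmacy definition above can be illustrated with a minimal sketch. The record layout (a list of issue date and BNF section pairs), the function name and the treatment of the window boundaries are assumptions made for illustration; they are not taken from the study code.

```python
from datetime import date, timedelta

LOOKBACK_DAYS = 84   # look-back window stated in the text
MIN_CLASSES = 5      # polypharmacy threshold stated in the text


def has_polypharmacy(prescriptions, interval_start):
    """Return True if at least 5 distinct medicine classes were prescribed
    in the 84 days before interval_start.

    prescriptions: iterable of (issue_date, bnf_section) tuples
                   (hypothetical layout, assumed for this sketch).
    """
    window_start = interval_start - timedelta(days=LOOKBACK_DAYS)
    # Assumption: the window includes its first day and excludes interval_start.
    classes = {
        section
        for issued, section in prescriptions
        if window_start <= issued < interval_start
    }
    return len(classes) >= MIN_CLASSES
```

Counting distinct classes rather than prescriptions means that repeat prescriptions of the same class within the window are counted once.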
The outcomes of interest were based on hospital admission data from the linked data. Two sets of hospital admissions were analysed in this study, including (i) admission code for an adverse-drug reaction (ADR) and (ii) emergency hospital admission. For ADR-related hospital admission, we used a code list based on a systematic search and assessment of lists in 41 publications identifying ADRs from administrative data [14]. This review suggested a comprehensive list of definitions and their corresponding codes, classifying them according to level of likely causality based on the ICD-10 code, which could be used to build consensus among health researchers [14]. The categories used in the current study included (i) ICD-10 codes with phrase 'induced by medication/drug', (ii) ICD-10 codes with phrase 'induced by medication or other causes' or 'poisoning by medication', (iii) ADRs deemed to be very likely or (iv) likely although the ICD-10 code description does not refer to a drug [14]. Emergency hospital admissions were defined as hospital admissions with a visit to the Accident & Emergency on the same day as the hospital admission (following the approach by Budnitz et al. [15]).
Cases were patients with a first hospital admission during follow-up and with recent history of polypharmacy. Cases were matched to up to six controls without hospital admission on the index date (hospital admission date of case) and with history of polypharmacy. The objective of the matching was to closely match on extent of morbidity based on disease (although not on treatments). Matching was done using propensity matching (using the QAdmissions score) as well as matching by variables including age, sex, morbidity cluster, presence of frailty, practice coding level and calendar time. The QAdmissions score estimates the risk of emergency hospital admission for patients aged 18-100 years in primary care [16]. It is based on variables such as age, sex, deprivation score, ethnicity, lifestyle variables (smoking, alcohol intake) and chronic diseases [16]. Predictors such as prescribed medications and laboratory values were not used in the calculation as medications were the exposure of interest and laboratory values were not extracted. Age and calendar time matching was done stepwise (age from the same year of birth up to a difference of five years; calendar time from within three months up to a difference of five years). Larger clusters of co-morbidity were also identified using k-means methods. Using 38 conditions [17], the number of clusters was increased stepwise until the number of patients in smaller clusters exceeded 5% of the size of the population. The mean level of coding was assessed for each general practice. Nine inception cohorts of starters of medications were identified (including antiarrhythmics, drugs for hypertension / heart failure, thyroid disorders, anti-Parkinson drugs, anti-dementia drugs, antidepressants, antiepileptics, antihyperglycemic therapy and inhaled bronchodilators). The presence of a code for the indication of treatment was measured and then averaged across the practice.
Cases and controls were matched on the quintile of practice coding level (mean in CPRD of 64.6% with 5-95% range of 54.4 to 76.6; Aurum 74.4%, 61.6-85.7%). Matching was done separately for CPRD GOLD and Aurum and the risk-set approach to control sampling was used (with control patients potentially included as controls for multiple cases although only once for a particular case).

Statistical analysis
The propensity matching procedure used a caliper (pre-specified maximum difference) of 0.25 of the logit of the propensity score [18]. Greedy nearest neighbour matching was used to select the control unit nearest to each treated unit. The SAS procedure PSMATCH was used to conduct the matching.
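As a rough illustration of greedy nearest-neighbour matching within a caliper, the sketch below matches each case to up to six controls on the logit of a propensity score. It is not the PSMATCH procedure: it assumes the caliper is 0.25 times the standard deviation of all logit scores, it matches without replacement for simplicity (the study's risk-set sampling allowed a control to serve several cases), and all names are hypothetical.

```python
import math
from statistics import pstdev


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def greedy_caliper_match(case_ps, control_ps, caliper_sd=0.25, max_controls=6):
    """Greedy nearest-neighbour matching on the logit of the propensity score.

    case_ps / control_ps: dicts mapping an id to a propensity score in (0, 1).
    Returns {case_id: [control_id, ...]} with up to max_controls per case.
    """
    case_logit = {i: logit(p) for i, p in case_ps.items()}
    ctrl_logit = {i: logit(p) for i, p in control_ps.items()}
    # Assumption: the caliper is a multiple of the SD of all logit scores.
    sd = pstdev(list(case_logit.values()) + list(ctrl_logit.values()))
    caliper = caliper_sd * sd

    available = dict(ctrl_logit)  # controls not yet used (simplification)
    matches = {}
    for case_id, c in case_logit.items():
        # Controls within the caliper, nearest first.
        candidates = sorted(
            (abs(v - c), ctrl_id)
            for ctrl_id, v in available.items()
            if abs(v - c) <= caliper
        )
        chosen = [ctrl_id for _, ctrl_id in candidates[:max_controls]]
        for ctrl_id in chosen:
            del available[ctrl_id]
        matches[case_id] = chosen
    return matches
```

A case with no control inside the caliper receives an empty list, which corresponds to the unmatched cases that a study of this kind would exclude.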
Random forest (RF) models were used to predict the probabilities of being a case or control based on the subgroups of medicine classes. RF is a supervised tree-based classifier developed by Breiman [19]. It has been broadly used and cited in different areas including medicine and pharmaceutical applications [20,21]. Tree-based methods such as RF offer superior performance for sub-group classification over techniques such as logistic regression because of the difficulty of defining the subgroups a priori [22]. The RF method first creates subsets of the original data by sampling with replacement on the rows of the original data and randomly selecting the features or columns of the original data. This process is known as bootstrapping. After this, RF forms an ensemble of trees, each trained on its own subset of the data independently of the other trees. The prediction of each tree depends on a random vector θ that is generated independently of the other trees [20]. This leads to generation of a set of random classifiers that are generalised. For classification with RF, a number of parameters need to be specified, including the number of trees in the forest, the maximum depth of the tree, and the maximum number of leaf nodes [19,23]. To explain RF models, we used SHapley Additive eXplanation (SHAP) values, which can explain the role of each feature or predictor variable in making a prediction [24]. SHAP values are calculated by removing each feature and measuring its marginal contribution. They can explain the output of the model as a global interpretability of feature importance, impact of top features toward target prediction (i.e., ADR-related and emergency hospital admissions), and local interpretability of the prediction of a single observation (i.e., one patient).
Global interpretability is drawn as feature importance plots that rank the features in descending order based on the average impact of each feature on model output, calculated as the mean of the absolute SHAP values of the feature. The impact of top features is depicted by ranking the features along with the impact of individual observations on each feature for prediction of the target variable. In this depiction of feature importance, each observation is represented by a dot and the horizontal location of the dots indicates whether the variable's observations are associated with the risk for the target variable or not. The baseline shows no impact on predictions, and dots farther to the right of the baseline indicate a greater risk for the target variable. Local interpretability demonstrates the role of each feature in the prediction of one specific observation [25]. This type of explanation specifies a base value that indicates the base prediction of the model in the absence of any features [26].
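The idea of marginal contributions, a base value and mean absolute SHAP values can be made concrete with a small brute-force sketch. It computes exact Shapley values for a toy model by replacing "removed" features with a baseline value; this replacement rule, the toy risk score and all names are assumptions for illustration, and the SHAP package uses more refined estimators for tree models.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one observation x.

    A feature is 'removed' by replacing it with its baseline value
    (an assumption made for this sketch).
    """
    n = len(x)

    def value(subset):
        # Model output when only features in `subset` take their observed values.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def global_importance(model, rows, baseline):
    """Mean absolute Shapley value per feature across observations."""
    totals = [0.0] * len(baseline)
    for x in rows:
        for i, v in enumerate(shapley_values(model, x, baseline)):
            totals[i] += abs(v)
    return [t / len(rows) for t in totals]


def toy_model(z):
    """Hypothetical risk score over three 0/1 medicine-class indicators,
    with an interaction between the first two classes."""
    return 0.10 + 0.20 * z[0] + 0.05 * z[1] + 0.10 * z[0] * z[1] + 0.0 * z[2]
```

For a single observation, the Shapley values sum to the difference between the model output and the base value, here `toy_model(baseline)`. The brute-force enumeration grows exponentially with the number of features, so it is only practical for very small examples.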
The study population was split into development (75%) and validation (25%) datasets. The first step in the development of the RF models was to select the top 50 medicine classes based on the variable importance in the models. The second step was to estimate the probabilities of being a case or control for these top 50 medicine classes. This two-step approach was used because, due to memory constraints, the RF models would not converge with detailed RF estimations for the probabilities. Two types of plots explain the prediction of RF models for ADR-related hospital admissions and emergency hospital admissions. These plots express the contribution of each medicine class to hospital admissions, with colour-encoding to differentiate cases and controls.
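The data split and the first selection step can be sketched as follows. The random split by identifier, the fixed seed and the dictionary of importance scores are assumptions for illustration; the text does not state how the split was randomised or whether matched sets were kept together.

```python
import random


def split_development_validation(ids, dev_fraction=0.75, seed=0):
    """Randomly split identifiers into development and validation sets."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(dev_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def top_k_classes(importance, k=50):
    """Names of the k medicine classes with the largest importance score.

    importance: dict mapping a medicine class name to its variable
                importance from a first-pass model (hypothetical input).
    """
    ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]
```

A second model would then be fitted on the development set using only the columns returned by `top_k_classes`, and evaluated on the validation set.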
The propensity matching was done using SAS software version 9.4; the RF analyses were done with Python 3.7 using Jupyter Notebooks, although they were redone using SAS, with high correlations found between the two packages. We used the SHAP package to explain the prediction of RF models for hospital admission predictions [27].

Results
89,235 cases with polypharmacy and hospitalised with an ADR-related admission were matched to 443,497 controls on age, sex and disease characteristics. A small number of cases (1.1%) could not be matched to any control and were excluded. Most cases were matched by year of birth and within 3 months (81.1%). Table 1 shows characteristics of cases and controls stratified by Aurum and CPRD GOLD. The age and sex distributions were similar between cases and controls (due to the matching). Comparing medical history between cases and one randomly sampled control (per case) showed that medical histories were broadly comparable. Older cases were found to have fewer controls than younger cases. S1 Table provides characteristics of cases of emergency hospital admissions and their matched controls. We found over 112,000 different combinations of the 50 BNF categories that were most important in predicting ADR-related hospital admission in the RF models. For emergency hospital admissions, there were over 484,000 combinations.
The calibration of the RF probabilities in the development and validation datasets is shown in Table 2. The RF probabilities were strongly predictive of risk of ADR-related and emergency hospital admission. The observed Odds Ratio (OR) in the highest RF decile was 7.16 (95% CI 6.65-7.72) in the validation dataset, compared to the lowest decile. The RF probabilities of being a case were close to the observed probabilities. The ORs as predicted by RF were smaller than the observed OR in the highest deciles (a small change in the probabilities can lead to a substantive difference in the OR at higher probabilities). Table 3 gives the discrimination of different logistic models for ADR-related and emergency hospital admissions. The effects of age/sex, QAdmissions score and RF scores on the C-statistic were moderate for each of these individually. The C-statistics for ADR-related hospital admissions were 0.58 for age and sex and 0.66 for RF probabilities. Table 4 presents the range of ORs within each medicine class based on RF predictions for ADR-related hospital admissions. These ORs indicate the effect of taking each medicine class compared to not taking the medicine class. The range of ORs (5th, 50th and 95th percentiles) shows the variability in the effects depending on co-medication. As an example, the ORs for  Table 5 including three levels of medicines based on predictions by RF model. As an example, users of loop diuretics had a mean OR of 7.97 when co-prescribed with medicines for hypoplastic/haemolytic/renal anaemias and clindamycin/lincomycin. Conversely, users of loop diuretics, renin-angiotensin system drugs and beta-adrenoceptor blocking drugs had an OR of 2.53. S2 Table provides the range of ORs within each medicine class for emergency hospital admissions. Fig 2 displays the local interpretability of the RF model prediction for ADR-related admission for a fictitious observation.
The figure shows that exposure to loop diuretics (rx1), medicines for iron-deficiency anaemias (rx3), opioid analgesics (rx6) and antispasmodics (rx12) was associated with an increased risk of ADR-related hospital admission (red lines). The medicines for iron-deficiency anaemias (rx3) contributed most to the increased risk. Conversely, absence of penicillins (rx14) was associated with a lowered risk (blue lines).
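The two summary measures reported above, an odds ratio between the highest and lowest decile and a C-statistic, can be computed as in the sketch below. It uses an unconditional odds ratio with a Wald confidence interval and a pairwise C-statistic; whether the study used these exact estimators (rather than, for example, conditional models for the matched sets) is not stated, so this is an assumption, and the counts in the usage note are invented.

```python
import math


def odds_ratio_ci(cases_hi, controls_hi, cases_lo, controls_lo, z=1.96):
    """Odds ratio (high vs low group) with a Wald 95% CI on the log scale."""
    or_ = (cases_hi * controls_lo) / (controls_hi * cases_lo)
    se = math.sqrt(1 / cases_hi + 1 / controls_hi + 1 / cases_lo + 1 / controls_lo)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper


def c_statistic(scores, labels):
    """Probability that a random case scores higher than a random control.

    Ties count as 0.5. labels: 1 for case, 0 for control.
    """
    case_scores = [s for s, y in zip(scores, labels) if y == 1]
    ctrl_scores = [s for s, y in zip(scores, labels) if y == 0]
    concordant = 0.0
    for a in case_scores:
        for b in ctrl_scores:
            if a > b:
                concordant += 1.0
            elif a == b:
                concordant += 0.5
    return concordant / (len(case_scores) * len(ctrl_scores))
```

With invented counts, `odds_ratio_ci(40, 60, 10, 90)` gives an odds ratio of 6.0, and `c_statistic([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])` gives 0.75. A C-statistic of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination.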

Discussion
Our study found that primary care patients with polypharmacy were prescribed a myriad of combinations of medicines. The risks of ADR-related and emergency hospital admissions varied substantially with the specific combinations of medicines. RF models identified sub-groups of medicine users with substantially increased risks of hospital admission (ORs of about 7 for highest vs lowest decile). Loop diuretics, domperidone and/or metoclopramide, medicines for iron-deficiency anaemias and for hypoplastic/haemolytic/renal anaemias, and sulfonamides/trimethoprim were the top 5 medicine classes with highest importance in the RF models for ADR-related and emergency hospital admissions. Various classes of antibiotics (including widely used penicillins, macrolides, cephalosporins, nitrofurantoin and methenamine) were also associated with substantively increased risk of ADR-related and emergency hospital admissions. Medicine classes for pain treatment (such as opioid analgesics and non-opioid analgesics and compound preparations) showed an association with higher risk of ADR-related and emergency hospital admission. Although some analgesics may not even be effective [28], they are usually prescribed to treat chronic pain that older people are more likely to suffer from, which can lead to or exacerbate polypharmacy and its risks [29,30]. The evidence base for the safety and effectiveness of medicine combinations is limited, and this study has shown this is likely to be a substantial problem for delivering safer care. As outlined in a recent review, older people remain under-represented in clinical trials, and differential effects of medicines under-researched [31]. Treatment guidelines are often developed with a focus on patients with single conditions, and less consideration of multimorbidity and effects of polypharmacy. A review and expert consensus of guidelines for the management of patients with multimorbidity and polypharmacy concluded that there is limited availability of reliable risk prediction models and absence of interventions of proven effectiveness [32]. Despite the widely recognised need for medicine optimisation [5,33], there are only limited tools available to guide clinicians. A 2015 national guideline in England for medicine optimisation mostly provides general guidance on systems rather than specific patient or medicine characteristics to act on [34].
Fig 1. https://doi.org/10.1371/journal.pone.0281466.g001
One exception is the recommendation to use a screening tool such as the STOPP/START tool, which includes 80 STOPP criteria for stopping a medicine or reducing the dose, mostly for single disease-medicine combinations or for two-medicine combinations [7]. The advantages of STOPP/START are the detailed considerations by an expert panel and the biological plausibility of adverse effects. A major disadvantage is that these sets of criteria do not capture the huge number of medicine combinations with substantive variations in risks in patients with polypharmacy, as observed in our study or acknowledged in the Scottish polypharmacy guidance [35]. RF models may be useful to better capture the large and complex heterogeneity in risks and medicine combinations.

Global interpretability of RF models can help to distinguish the medicines by their level of association with risks such as ADR-related or emergency hospital admissions. Local interpretability can explain the prediction and relative associations of different medicines with risk for one patient, and may be useful in supporting medication reviews for individual patients. These techniques may provide information on the relative importance of various predictors of risk; however, they do not provide causally explainable evidence. Explainability has been considered an essential prerequisite for machine learning models such as RF models [36]. A widely used method is to focus on medicines with pharmacologically well-established mechanisms that can lead to ADRs, like the STOPP/START criteria [7]. A recent trial in patients with polypharmacy found that an intervention applying STOPP/START reduced the prevalence of inappropriate medicine use, but without effect on drug-related hospital admission [8]. A challenge for managing ADR risks in this way is that polypharmacy is a complex system [37], with very many medicine combinations and with hugely varying risks, as observed in this study. It has been argued that explainability of AI models may not be essential but rather empirical evaluation of successful implementation and effectiveness [38]. In the case of RF models in polypharmacy, such evaluation could involve highlighting medicines at higher ADR risk to clinicians, with any deprescribing decision considering both patient preferences for the medicine and perceived clinical need. This study was successful in predicting risks of ADR-related and emergency hospital admissions and it could identify the most important medicine classes that contributed to those risks; however, there are several limitations to this study. A major limitation is residual confounding due to differences in disease severity between various medication combinations despite propensity matching. Cases and controls were broadly matched on presence of disease but not on severity of disease. Like most risk prediction models, the results of this study should not be used for counterfactual risk prediction and causal inference [39]. Therefore, the risk difference between exposed and non-exposed patients cannot be assumed to be the effect of the exposure. A limitation of our study is that we do not provide direct evidence for specific interventions to reduce risks. However, our results could support targeting of patients at higher risk for ADR-related or emergency hospital admissions, who could be considered for a structured medication review. Another limitation is that medicines were combined into sometimes broad categories covering various pharmacological effects. A further limitation is that our study focuses on hospital admission of older people; however, there can be other adverse outcomes related to polypharmacy such as losing independence, incontinence, or deteriorating cognition. Also, not only older people but also younger people with complex multimorbidity and polypharmacy can experience these adverse outcomes and may need a medication review.
Fig 2. https://doi.org/10.1371/journal.pone.0281466.g002
In conclusion, polypharmacy involves a very large number of different combinations of medicines, with substantial differences in risks of ADR-related and emergency hospital admissions. Although the medicines may not be causally related to increased risks, RF models may be used to target interventions to those individuals at greatest need. Simple tools based on counts of medicines or focussed on few medicine classes may not be effective in identifying high risk patients. Predictions based on RF models may help to prioritise patients for structured medication reviews. Future work could involve developing a clinical decision-support tool with a user interface for doctors to predict and provide the risk of ADR-related and emergency hospital admissions in polypharmacy.

Acknowledgments
This study is based on data from the Clinical Practice Research Datalink (CPRD) obtained under licence from the UK Medicines and Healthcare products Regulatory Agency (MHRA). The data is provided by patients and collected by the NHS as part of their care and support. Hospital Episode Statistics (HES) data are subject to Crown copyright (2022) protection, reused with the permission of The Health & Social Care Information Centre, all rights reserved. The interpretation and conclusions contained in this study are those of the authors alone, and not necessarily those of the MHRA, NIHR, NHS or the Department of Health and Social Care. The study protocol was approved by CPRD's Independent Scientific Advisory Committee (ISAC) (reference: 20_150R). We would like to acknowledge all the data providers and general practices who make anonymised data available for research.