Incidence of Exposure of Patients in the United States to Multiple Drugs for Which Pharmacogenomic Guidelines Are Available

  • Matthias Samwald,

    Affiliation Section for Artificial Intelligence and Decision Support; Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Vienna, Austria

  • Hong Xu,

    Affiliation Section for Artificial Intelligence and Decision Support; Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Vienna, Austria

  • Kathrin Blagec,

    Affiliation Section for Artificial Intelligence and Decision Support; Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Vienna, Austria

  • Philip E. Empey,

    Affiliation Department of Pharmacy and Therapeutics, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America

  • Daniel C. Malone,

    Affiliation College of Pharmacy, University of Arizona, Tucson, Arizona, United States of America

  • Seid Mussa Ahmed,

    Affiliation Department of Pharmacy, College of Public Health and Medical Sciences, Jimma University, Jimma, Ethiopia

  • Patrick Ryan,

    Affiliations Janssen Research and Development, Titusville, New Jersey, United States of America, Observational Health Data Sciences and Informatics, New York, New York, United States of America

  • Sebastian Hofer,

    Affiliation Section for Artificial Intelligence and Decision Support; Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Vienna, Austria

  • Richard D. Boyce

    Affiliation Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America


Pre-emptive pharmacogenomic (PGx) testing of a panel of genes may be easier to implement and more cost-effective than reactive pharmacogenomic testing if a sufficient number of medications are covered by a single test and future medication exposure can be anticipated. We analysed the incidence of exposure of individual patients in the United States to multiple drugs for which pharmacogenomic guidelines are available (PGx drugs) within a selected four-year period (2009–2012) in order to quantify the incidence of pharmacotherapy in a nation-wide patient population that could be impacted by pre-emptive PGx testing based on currently available clinical guidelines. In total, 73 024 095 patient records from private insurance, Medicare Supplemental and Medicaid were included. Patients enrolled in Medicare Supplemental (age ≥ 65) or Medicaid (age 40–64) had the highest incidence of PGx drug use, with approximately half of the patients receiving at least one PGx drug during the four-year period and one fourth to one third of patients receiving two or more PGx drugs. These data suggest that exposure to multiple PGx drugs is common and that it may be beneficial to implement wide-scale pre-emptive genomic testing. Future work should therefore concentrate on investigating the cost-effectiveness of multiplexed pre-emptive testing strategies.


Ineffective medicinal treatments and drug-associated adverse events place a significant burden on modern healthcare systems [1]. Pharmacogenomic testing of patients prior to treatment initiation might help address these issues by tailoring pharmacotherapy to individual patient needs. Unfortunately, a current barrier to the widespread adoption of pharmacogenomic testing is the lack of information on how to implement it in an efficient and economic manner within clinical workflows [2,3]. Among potential implementation scenarios, “pre-emptive” population-based pharmacogenomic testing is a promising strategy. In pre-emptive testing, a panel of pharmacogenomic markers is tested once, and the test results are stored to optimize drug treatment in later patient care [2]. Still, healthcare organizations will likely not adopt pre-emptive testing without a clearer understanding of the magnitude of its potential impact, as well as associated costs.

Several leading health systems that have launched pharmacogenomics initiatives have reported their institutional processes and key implementation metrics. Table 1 provides an overview of some of these approaches, including targeted drugs and the testing procedures that were employed.

Table 1. Examples of health systems that have launched pharmacogenomics initiatives.

For example, Vanderbilt University Medical Center’s PREDICT (Pharmacogenomic Resource for Enhanced Decisions in Care and Treatment) program [4] reports on the organization’s experience of genotyping 10 044 patients. The frequencies of actionable results (i.e., results that warrant a deviation from standard dosage and drug selection according to clinical guidelines) in 9 589 patients with complete genotype data based on the VeraCode ADME Core Panel (Illumina, San Diego, CA) that was used at the time were: CYP2C9/VKORC1-warfarin (70% of patients), CYP2C19-clopidogrel (28.5% of patients), SLCO1B1-simvastatin (25.7% of patients), CYP3A5-tacrolimus (23.5% of patients) and TPMT-thiopurines (9.1% of patients) [10]. Bush et al. examined the prevalence of actionable PGx variants based on sequencing data from about 5 000 subjects participating in the eMERGE-PGx program. They found that 96.19% of all samples had one or more actionable PGx variants [11].

At the time of this writing, few studies assessing the feasibility of large-scale pre-emptive pharmacogenomic testing are available. Dunnenberger et al. analyzed nationally representative outpatient drug prescriptions in the United States and found that a substantial number of prescriptions are for drugs with pharmacogenetic risk [2]. The PREDICT program published a cohort analysis examining medication exposure, allele frequencies, and adverse event risk estimates in 52 942 patients at Vanderbilt University Medical Center [12]. In this population, over a five-year period, 65% were expected to receive at least one, and more than 10% were expected to receive at least four, medications for which evidence exists that the drug response is influenced by pharmacogenetics. Based on data on medication exposure and the event probability of six selected severe adverse drug reactions, they estimated the occurrence of 398 (95% CI: 225–583) severe adverse events that are potentially preventable by an effective pre-emptive testing program in their investigated patient cohort. Medications that posed the greatest risk included: clopidogrel (myocardial infarction, stroke, death), abacavir (hypersensitivity), azathioprine (leukopenia), simvastatin (myopathy), tamoxifen (breast cancer recurrence), and warfarin (bleeding). A pharmacoeconomic analysis was not performed, but the authors reported cost-per-adverse-event estimates from three cost-effectiveness studies for abacavir ($121–$36 850, depending on severity of hypersensitivities), tamoxifen ($24 400–$56 521) and warfarin ($11 542).

Data from these studies suggest that pre-emptive pharmacogenomics testing might be most cost-effective when 1) a large number of different medications can be optimized by performing a single test and 2) future exposure to a multitude of these medications can be expected (e.g., elderly patients or patients with multiple morbidities).

The current study seeks to expand these analyses beyond a single academic medical center to quantify the incidence of pharmacotherapy in a nation-wide patient population that could be impacted by pre-emptive PGx testing based on currently available clinical guidelines in different hypothetical implementation scenarios. In particular, we aim to identify the fraction of patients across different age and insurance categories who had incident prescriptions of a multitude of different PGx drugs within a four-year time window. The purpose is to estimate the potential impact of pre-emptive PGx testing, and to provide the foundation for further in-depth cost-effectiveness analyses by large healthcare organisations and payers based on additional parameters such as estimated frequency and cost of adverse events, as well as costs of PGx testing.

Materials and Methods

The study was conducted through the following analytical steps:

  1. Identifying medications for which pharmacogenomic clinical guidelines are available (from here on called PGx drugs).
  2. Analysing which PGx drugs were most frequently prescribed.
  3. Describing the population of individuals exposed to PGx drugs by age and insurance coverage.
  4. Determining the fraction of patients with incident prescriptions of a multitude of different PGx drugs within a four-year period.
  5. Comparing the results of different implementation models (pre-emptive vs. mixed reactive/pre-emptive approach) on the number of persons who would be eligible for genomic testing.
  6. Estimating the frequency of high-risk drug–phenotype co-occurrences in the investigated populations.

Identification of medications with pharmacogenomic guidelines

Our analysis is based on well-established clinical guidelines for pharmacogenomic treatment optimisation from two organisations: the Clinical Pharmacogenetics Implementation Consortium (CPIC) [13,14] in the United States (US), and the Dutch Pharmacogenetics Working Group (DPWG) [15] in Europe.

The CPIC has published comprehensive reviews of the existing literature on specific drug-gene pairs and has authored guidelines on the clinical use of pharmacogenomic information. The CPIC guidelines are established in the field: they have been endorsed by the American Society of Health-System Pharmacists, incorporated into the US National Institutes of Health’s Genetic Test Registry [16], and indexed in PubMed as part of its Practice Guideline search tool.

The DPWG was formed by the Royal Dutch Pharmacists Association in 2005 and has authored clinically applicable pharmacogenetic dosing recommendations for integration into an automated medical surveillance system in the Netherlands. The guidelines are developed through extensive systematic literature review.

We manually reviewed the pharmacogenomic treatment recommendations of both organisations in mid-2014 and compiled a unified list of PGx drugs mentioned in at least one of the guidelines.

To restrict the list to drugs associated with widely established pharmacogenes for which testing is feasible, we compared the pharmacogenes covered in the guidelines with the genes covered by two consensus lists of important pharmacogenes (the PharmaADME Core Marker List [17] and PharmGKB VIP genes [18]) and by seven panel-based pharmacogenomic assays to generate a ‘core PGx list’.

Patient data sources

Healthcare administrative claims data from three sources were used to establish population-level data on the frequency of prescription of the PGx drugs:

  1. Administrative claims data for a privately-insured population of over 100 million patients from multiple large employers/payers in the US covering the years 2003 to 2013 (Truven MarketScan® Commercial Claims and Encounters, CCAE)
  2. Administrative claims data for over 15 million Medicaid enrollees from multiple states in the US covering the years 2002 to 2012 (Truven MarketScan® Multi-state Medicaid)
  3. Administrative claims for over 8 million US retirees with Medicare supplemental insurance paid by employers covering 2003 to 2013 (Truven MarketScan® Medicare Supplemental Beneficiaries)

In the US, commercial insurers (CCAE) cover mostly working individuals and their families because health insurance is largely provided through employment. Because the CCAE dataset represents a privately-insured population, it underrepresents the elderly and patients with lower socioeconomic status relative to US demographics.

Medicare and Medicaid are government programs. Medicaid covers individuals who are economically disadvantaged. Patients who qualify for state Medicaid programs have lower socioeconomic status and tend to be younger than the overall population. Medicare largely consists of older individuals over the age of 65 who are no longer working. In the dataset, only plans where both the Medicare-paid amounts and the employer-paid amounts were available and evident on the claims were selected. Because it represents patients who can afford supplemental coverage above and beyond the typical Medicare coverage afforded to all persons > 65 in the US, our dataset tends to reflect a younger and healthier population with higher socioeconomic status than the overall Medicare population.

While the three MarketScan® databases are not directly nationally representative (without applying appropriate weights), they contain longitudinal data from a very large convenience sample of the US population and have been used in hundreds of population based studies. All datasets had been previously translated and loaded into a common data model and standard vocabulary as part of the Observational Medical Outcomes Partnership (OMOP) project [19,20]. Details on the decisions that were made by the OMOP project when translating and loading each dataset can be found in three publicly available mapping specification documents [21–23]. Version 4 of the common data model and standard vocabulary was used for all datasets.

A set of database queries was developed to conduct a cross-sectional evaluation of drug utilization across each dataset; database queries are available on GitHub [24]. This was possible because the standard vocabulary provided mappings to terms in the RxNorm terminology [25] for all medication claims contained in the administrative datasets described above. The three claims datasets were then queried using SQL Workbench/J (build 116) [26] on a virtual computer instance provided by the Innovation in Medical Evidence Development and Surveillance (IMEDS) program [27]. The IMEDS laboratory is a secure Amazon Web Services Elastic Cloud Computing (EC2) image that provides approved researchers access to clinical research datasets that meet the US regulatory requirements for “de-identified” or regulatory “safe harbor.” The database queries are available online [28]. This study was reviewed by the University of Pittsburgh Institutional Review Board and found to include no involvement of human subjects.

Prescription Drug Statistics

Statistics on incident claims for PGx drugs were generated within a selected four-year period (1/1/2009–12/31/2012). Incident claims were defined as administrative claims for a specific medication within the four-year period, without any previous claims for the medication prior to the start date of the four-year period. The rationale for focusing on incident use (i.e., excluding cases where a drug was already prescribed prior to the four-year period) instead of prevalent use (i.e., including all cases irrespective of prior drug use) was based on the assumption that treating physicians might not see enough value in optimizing treatment for patients that have already received a medication in the past without experiencing notable adverse events.
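The incident-use definition above can be illustrated with a short Python sketch. It is not the SQL used in the study; the tuple layout (patient_id, drug, claim_date), the function name and the example claims are assumptions made only for illustration.

```python
from datetime import date

WINDOW_START = date(2009, 1, 1)
WINDOW_END = date(2012, 12, 31)

def incident_drugs(claims):
    """Return {patient_id: set of drugs with incident use in the window}.

    `claims` is an iterable of (patient_id, drug, claim_date) tuples
    (a hypothetical layout, not the OMOP common data model).
    """
    prior = set()      # (patient, drug) pairs claimed before the window
    in_window = set()  # (patient, drug) pairs claimed inside the window
    for patient_id, drug, claim_date in claims:
        if claim_date < WINDOW_START:
            prior.add((patient_id, drug))
        elif claim_date <= WINDOW_END:
            in_window.add((patient_id, drug))
    result = {}
    # A drug is incident only if it was never claimed before the window.
    for patient_id, drug in in_window - prior:
        result.setdefault(patient_id, set()).add(drug)
    return result

# Made-up claims, not study data.
example_claims = [
    (1, "clopidogrel", date(2008, 6, 1)),  # prior use, so not incident
    (1, "clopidogrel", date(2010, 3, 1)),
    (1, "tramadol", date(2011, 5, 9)),     # incident
    (2, "codeine", date(2009, 1, 1)),      # incident (first day of window)
    (3, "sertraline", date(2013, 2, 2)),   # after the window, ignored
]
incident = incident_drugs(example_claims)
```

In this toy example, patient 1 counts as an incident user of tramadol only, because clopidogrel was already claimed before the window opened.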

Topical preparations of PGx drugs were excluded because PGx guidelines are not generally relevant for topical therapies due to poor absorption. For each patient, the number of distinct PGx drug claims within the four-year period was calculated. Aggregate statistics were calculated for patient groups with specific ages at first time of PGx drug prescription. For the CCAE and Medicaid datasets, these age ranges (inclusive) were 0–13, 14–39 and 40–64 years; for the Medicare dataset, the age range was ≥ 65 years without an upper bound. These statistics were calculated for the core list of PGx drugs.

The aggregate statistics for prescription of medications were further stratified into two hypothetical, idealized scenarios: a ‘pre-emptive’ scenario, in which a pre-emptive genetic test for PGx genes would be conducted in the selected subpopulation at the start of the four-year period; and a mixed ‘reactive pre-emptive’ scenario, in which a genetic test panel would only be conducted at the time of first incident use of a PGx drug within the four-year time window. A single summary table was created to show how many different PGx drugs would be prescribed per tested patient in both scenarios, as well as the eight most prescribed PGx drugs for each dataset and age group.
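The practical difference between the two scenarios is the denominator: who counts as a tested patient. The following Python sketch makes this explicit under that reading of the text; the function name and the ten-patient cohort are hypothetical and not taken from the study data.

```python
def fraction_with_at_least(drug_counts, k, scenario):
    """Fraction of tested patients with >= k distinct incident PGx drugs.

    drug_counts: one integer per patient in the subpopulation
                 (0 for patients who never received a PGx drug).
    scenario:    "pre-emptive" tests everyone at the start of the window;
                 "reactive pre-emptive" tests only patients with at least
                 one incident PGx drug in the window.
    """
    if scenario == "pre-emptive":
        tested = list(drug_counts)
    elif scenario == "reactive pre-emptive":
        tested = [n for n in drug_counts if n >= 1]
    else:
        raise ValueError(f"unknown scenario: {scenario}")
    if not tested:
        return 0.0
    return sum(1 for n in tested if n >= k) / len(tested)

# Made-up cohort of ten patients (not study data).
counts = [0, 0, 0, 0, 0, 1, 1, 1, 2, 3]
pre = fraction_with_at_least(counts, 2, "pre-emptive")
reactive = fraction_with_at_least(counts, 2, "reactive pre-emptive")
```

With these made-up counts, two of ten patients receive two or more PGx drugs in the pre-emptive scenario, but two of the five tested patients do in the reactive pre-emptive scenario, since untreated patients drop out of the denominator.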

Estimation of phenotype distributions and clinical significance

Compiling haplotype frequencies and calculating the frequency of diplotypes in major population groups.

Haplotype frequencies for all genes that are covered by at least one CPIC or DPWG guideline were compiled from studies cited in these guidelines [29–35]. These frequencies were used to estimate the frequencies of diplotypes (combinations of haplotypes) via Punnett squares. Punnett squares are a simple way to determine the probability of diplotypes by using a tabular representation of all potential haplotype combinations. Because haplotype frequencies differ, in some cases substantially, between ethnic groups, separate Punnett squares were created for major population groups (i.e. Caucasian, African / African American, Asian). Details on these Punnett squares are provided in S1 Table. The *1 haplotype was used as a “default” haplotype, indicating that none of the considered variants is present. For the calculations, the frequency of the *1 haplotype was therefore imputed as 100% − (sum of all other haplotype frequencies in %) for all genes. An example of how diplotype frequencies were calculated is provided in S1 Text–Supplementary methods.
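A Punnett-square calculation of this kind can be sketched as follows. The sketch assumes that the two haplotypes of a diplotype are drawn independently and that the cells a/b and b/a are merged into one unordered diplotype; the haplotype frequencies are illustrative and are not taken from S1 Table.

```python
from itertools import combinations_with_replacement

def diplotype_frequencies(haplotype_freqs):
    """Punnett-square diplotype frequencies from haplotype frequencies.

    haplotype_freqs: {haplotype: frequency as a fraction}, excluding *1.
    The *1 frequency is imputed as 1 minus the sum of the others.
    """
    star1 = 1.0 - sum(haplotype_freqs.values())
    if star1 < -1e-9:
        raise ValueError("haplotype frequencies sum to more than 1")
    freqs = dict(haplotype_freqs)
    freqs["*1"] = max(star1, 0.0)
    diplotypes = {}
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        # Off-diagonal cells occur twice in the square (a/b and b/a).
        weight = 1 if a == b else 2
        diplotypes[f"{a}/{b}"] = weight * freqs[a] * freqs[b]
    return diplotypes

# Illustrative frequencies only.
example = diplotype_frequencies({"*2": 0.15, "*3": 0.05})
```

Because every cell of the square is covered exactly once, the resulting diplotype frequencies sum to 1, which is a useful sanity check on the input frequencies.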

Estimating the distribution of drug metabolizing phenotypes in major ethnic population groups.

Drug metabolizing phenotypes associated with diplotypes were inferred based on CPIC guidelines and, where CPIC guidelines were unavailable, on DPWG guidelines. For the majority of genes (i.e. CYP2C19, CYP2C9, CYP2D6, CYP3A5, DPYD, TPMT, UGT1A1), the following phenotype classification was used: Extensive metabolizer (EM), Poor metabolizer (PM), Intermediate metabolizer (IM) and Ultrarapid metabolizer (UM). For example, the CYP2C19*2/*3 diplotype (two loss-of-function alleles) was assigned to the phenotype “CYP2C19 Poor metabolizer”. For SLCO1B1, the phenotype categories “non-functional”, “intermediate function” and “low function” were used. For VKORC1, no phenotype classification was used since only one variant was considered. The diplotype-phenotype assignments for all genes included in this analysis are provided in S2 Table. For each gene and population group, the estimated overall frequencies of different drug metabolizing phenotypes were calculated by adding up the frequencies of the diplotypes assigned to the respective drug metabolizing phenotype. The estimated distributions of drug metabolizing phenotypes for major ethnic population groups can be found in S2 Table.
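The aggregation step described above (adding up diplotype frequencies per phenotype) can be sketched in Python. The mapping and the frequencies below are illustrative placeholders in the style of a CYP2C19-like gene; the actual assignments used in the study are those in S2 Table.

```python
from collections import defaultdict

def phenotype_frequencies(diplotype_freqs, diplotype_to_phenotype):
    """Sum diplotype frequencies within each assigned phenotype."""
    totals = defaultdict(float)
    for diplotype, freq in diplotype_freqs.items():
        totals[diplotype_to_phenotype[diplotype]] += freq
    return dict(totals)

# Illustrative diplotype frequencies for one population group.
diplotype_freqs = {
    "*1/*1": 0.64, "*1/*2": 0.24, "*1/*3": 0.08,
    "*2/*2": 0.0225, "*2/*3": 0.015, "*3/*3": 0.0025,
}
# Hypothetical assignment: no variant allele = EM, one = IM, two = PM.
assignment = {
    "*1/*1": "EM", "*1/*2": "IM", "*1/*3": "IM",
    "*2/*2": "PM", "*2/*3": "PM", "*3/*3": "PM",
}
phenotypes = phenotype_frequencies(diplotype_freqs, assignment)
```

A diplotype missing from the assignment raises a KeyError, which is deliberate here: silently dropping a diplotype would make the phenotype frequencies sum to less than 1.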

Ethnic distribution in drug prescription datasets

The distribution of major population groups was derived for each dataset and age group. For the Medicaid dataset, the distribution of ethnic population groups was queried directly from the dataset. The categories “White” and “Black” reported in the dataset were assigned to the phenotype categories “Caucasian” and “African / African American”. The CCAE and Medicare Supplemental datasets did not contain data on population groups. Demographic statistics on these insurance populations were derived from an external source (i.e. the Kaiser Family Foundation [KFF], a US non-profit organization that publishes reports on health care issues) to estimate the distribution in the investigated cohorts [36,37]. A more detailed description on how ethnic distributions were inferred can be found in S1 Text.

Clinical significance classification

The clinical significance of all drug-phenotype co-occurrences was categorized based on DPWG guidelines, which offer a unified categorisation scheme for most of the PGx drugs considered in this analysis (Table 2), as well as on CPIC priority levels, which are available for a subset of the considered PGx drugs. All of these assignments can be found in S4–S6 Tables. We focused our analysis on categories of higher clinical significance (i.e., DPWG classes C to F) and higher priority (i.e., CPIC priority level A).

Table 2. Classification of potential clinical effects observed in patients with risk phenotypes, based on DPWG guidelines.

Drug substances included in the estimation

Only a subset of the PGx drugs was included in this final step of our analysis (see Table 3). Drug substances with a clinical significance level A or B, and drug substances that could not be assigned a clinical significance level according to the DPWG classification were excluded. Furthermore, we decided to conduct two variants of the analysis; one excluding codeine, and one including codeine. The exclusion of codeine is likely to yield more accurate results since the drug is widely used as a low-dosed cough medicine, whereas PGx guidelines generally only apply to higher, analgesic dosages of codeine. A sufficiently reliable distinction of low-dose versus high-dose regimen of codeine was not deemed feasible with the datasets used. A detailed listing of the reasons for including / excluding drugs can be found in S1 Text–Supplementary methods.

Table 3. Overview of drugs for which CPIC or DPWG guidelines were available at the time of our analysis (mid-2014).

Drugs and substances that we assigned to the ‘core list’ are listed separately. Only ‘core list’ substances were used for generating prescription statistics. Drug substances that were included in the estimation of the number of high-risk drug-phenotype co-occurrences are printed in bold.

Estimating the number of high-risk drug-phenotype co-occurrences

The probability that a PGx drug is prescribed to a patient whose drug metabolizing phenotype puts the patient at risk of developing an adverse drug reaction at standard dosage was calculated as follows:

p(Rx, risk phenotype, ethnicity) = p(Rx) × p(risk phenotype | ethnicity) × p(ethnicity)

where p(Rx) is the probability that a patient is prescribed a PGx drug in the observed time frame (see S3 Table), p(risk phenotype | ethnicity) is the estimated prevalence of risk phenotypes relevant to the respective PGx drug in the respective population group (see S2 Table) and p(ethnicity) is the prevalence of the ethnic population group in the respective dataset and age group (see S4–S6 Tables).

These calculations were performed for all PGx drug–risk phenotype combinations considered in this analysis, across all datasets (Medicaid, Medicare, CCAE), age groups, and ethnicities (Caucasian, African / African American, Asian). The results were summed up for each dataset, age group and clinical significance level. The calculations can be found in S4–S6 Tables. An exemplary calculation for the number of CYP2C19 poor metabolizer / codeine prescription co-occurrences in the Medicare dataset is provided in S1 Text.
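As a rough sketch of this estimation, the Python snippet below assumes that, for one drug–risk phenotype combination, the three quantities p(Rx), p(risk phenotype | ethnicity) and p(ethnicity) are multiplied per population group and the products are summed over groups. This treats prescribing as independent of ethnicity and phenotype, which is an assumption of the sketch. All numbers and names are hypothetical and are not taken from the supplementary tables.

```python
def expected_cooccurrences(n_patients, p_rx, p_risk_given_ethnicity, p_ethnicity):
    """Estimate drug-phenotype co-occurrences for one drug and risk phenotype.

    p_rx:                   probability of an incident prescription.
    p_risk_given_ethnicity: {group: prevalence of the risk phenotype}.
    p_ethnicity:            {group: share of the group in the cohort}.
    Returns (overall probability, expected number of patients).
    """
    p_total = sum(
        p_rx * p_risk_given_ethnicity[group] * p_ethnicity[group]
        for group in p_ethnicity
    )
    return p_total, n_patients * p_total

# Hypothetical inputs for a single drug and cohort.
p_total, expected = expected_cooccurrences(
    n_patients=100_000,
    p_rx=0.10,
    p_risk_given_ethnicity={
        "Caucasian": 0.07,
        "African / African American": 0.03,
        "Asian": 0.01,
    },
    p_ethnicity={
        "Caucasian": 0.7,
        "African / African American": 0.2,
        "Asian": 0.1,
    },
)
```

Summing such per-drug results within a dataset, age group and clinical significance level corresponds to the aggregation described in the text.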


Based on pharmacogenomic guidelines, we included 61 drugs in the analysis (Table 3). Of these, 10 medications were associated with two different genes, resulting in 72 different drug-gene interaction pairs (Table 3).

A total of 24 interaction pairs were derived from CPIC guidelines and 61 interaction pairs were derived from DPWG guidelines. Thirteen drug-gene interaction pairs were included in both guideline sources. We found that some of the genes covered by PGx guidelines were included in the majority of consensus lists and assays (e.g., CYP2C19, CYP2D6), while a small set of other genes were rarely included (e.g., “Others”—F5, HLA-B, IFNL3). From here on, we refer to genes and drugs that we found to be well-established based on this analysis as the core list of genes and drugs (Table 3).
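The guideline counts reported above can be cross-checked with inclusion-exclusion: pairs covered by both sources must be counted once. The variable names below are only for illustration.

```python
# Counts as reported in the text.
cpic_pairs = 24     # drug-gene pairs derived from CPIC guidelines
dpwg_pairs = 61     # drug-gene pairs derived from DPWG guidelines
shared_pairs = 13   # pairs included in both guideline sources

# Inclusion-exclusion: subtract the overlap so shared pairs count once.
unique_pairs = cpic_pairs + dpwg_pairs - shared_pairs
cpic_only = cpic_pairs - shared_pairs
dpwg_only = dpwg_pairs - shared_pairs
```

The result matches the 72 distinct drug-gene interaction pairs reported for the analysis, of which 11 appear only in CPIC guidelines and 48 only in DPWG guidelines.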

Pre-emptive testing

In total, 73 024 095 patient records were included in the analysis, of which 55.7% were associated with female patients. An overview of the results is shown in Table 4, with greater detail provided in several spreadsheets as supplementary material (S3–S6 Tables). The top prescribed PGx drugs were indicated for pain relief and cardiovascular conditions. Incident use of PGx drugs in the 0–13 year age range was very low, with only 1.1% (CCAE) to 1.8% (Medicaid) receiving two or more different PGx drugs. Incident use of two or more PGx drugs increased with age, rising to 17.8% (CCAE age 40–64) and 32.8% (Medicaid age 40–64), respectively. In general, the utilisation of PGx drugs in the Medicaid dataset was significantly higher than in the CCAE dataset. Patients in the Medicare dataset (age ≥ 65) received a large number of PGx drugs, with 27.5% receiving two or more PGx drugs. Still, utilisation of medications that may benefit from genomic testing was lower in the Medicare dataset than in the patients of age 40–64 who were enrolled in Medicaid.

Table 4. Incidence of exposure to drugs for which pre-emptive pharmacogenomic testing is available.

‘Reactive pre-emptive’ testing

The results for the ‘reactive pre-emptive’ scenario differ significantly from the results of the purely pre-emptive scenario (Table 4 and S3 Table). Since the ‘reactive pre-emptive’ scenario is based on the assumption that the pre-emptive pharmacogenomic test is triggered when the first incident use of a PGx drug occurs, 100% of tested patients in this scenario receive at least one PGx drug. The fraction of tested patients receiving two or more different PGx drugs is also considerably greater in this scenario across all groups (Table 4). For example, in the CCAE 14–39 and 40–64 groups, the fraction of patients receiving two or more PGx drugs rises about 3.5-fold and 2.4-fold, respectively, compared to the pre-emptive scenario (from 9% to 31.1% and from 17.8% to 43.3%). In the Medicare ≥ 65 group the fraction is doubled (from 27.5% to 54.6%). The highest relative increase is seen in the youngest age group, e.g., a rise from 1.1% to 9.6% in the CCAE 0–13 group.

Table 5. Number of expected drug-phenotype co-occurrences of highest priority according to CPIC guidelines (CPIC level A) or high clinical significance according to DPWG guidelines (DPWG clinical significance classes C–F) within the observed four-year time window.

PGx drugs included in all estimations: amitriptyline, azathioprine, clomipramine, clopidogrel, doxepin, glimepiride, haloperidol, imipramine, mercaptopurine, metoprolol, nortriptyline, paroxetine, propafenone, risperidone, sertraline, tamoxifen, thioguanine, tramadol, venlafaxine. Estimations which additionally included codeine are shown for comparison. The rationale for including / excluding drug substances is described in the Methods section (‘Drug substances included in the estimation’).

The relative distribution of PGx drugs among therapeutic areas is shown in Fig 1. Analgesics/anaesthesiology and psychiatry/neurology medications made up a large fraction of PGx drugs across all groups studied with cardiology and endocrinology medications becoming increasingly important with advanced age.

Fig 1. Distribution of incident use within four-year time window among therapeutic areas.

Clinical significance

Table 5 presents the estimated fractions of patients with drug-phenotype co-occurrences that pose a risk for inducing significant adverse events for all investigated datasets, broken down by DPWG clinical significance levels (see Table 2 for the categorisation scheme) and CPIC priority level. When focusing on the estimates not including codeine, the greatest estimated overall fractions of high risk drug-phenotype co-occurrences across all clinical significance levels were found in the Medicaid 40–64 and the Medicare ≥ 65 group (4.3% and 4.1%, respectively). Our estimation of drug-phenotype co-occurrences assigned with the greatest clinical significance level F, indicating a risk for potentially life-threatening adverse events, resulted in 13 745 and 91 346 co-occurrences in these subgroups, respectively. Level F co-occurrences accounted for nearly half (1.7%) of the overall number of high risk drug-phenotype co-occurrences in the Medicare ≥ 65 group, and more than one quarter (1.4%) in the Medicaid 40–64 group.

As described in the Methods section, inclusion of codeine prescriptions in the analysis of clinical significance was deemed problematic, as it might result in overestimations caused by the presumably wider application of codeine as a cough medicine instead of as an analgesic. The inclusion of codeine in this final step of the analysis increased the number of DPWG class C to F co-occurrences by an additional 0.3% to 0.4% for each dataset and age group. Similarly, CPIC level A co-occurrences are increased by an additional 0.3% to 0.4% as well.

Full data on estimation results excluding and including codeine can be found in S7 Table.


This study was motivated by the observation that current data on potential return on investment for pre-emptive pharmacogenomics testing is either not detailed enough in terms of demographics or limited to academic healthcare environments. The results suggest that a significant portion of the population will be exposed to one or more PGx drugs and that considerable variation exists in the PGx drugs that are most frequently prescribed depending on age and insurance. This variation extends to the proportion of individuals for which pre-emptive testing could be used to optimize a multitude of medications and the estimated fraction of high-risk drug-phenotype co-occurrences.

Incident users of two or more PGx drugs ranged between 9% and 33% of patients of age ≥ 14. Older patients enrolled in Medicare (age ≥ 65) or Medicaid (age 40–64) had the highest incidence of PGx drug use, with approximately half of the patients receiving at least one PGx drug and one fourth to one third of patients receiving two or more PGx drugs. Furthermore, these two subpopulations showed the highest estimated fraction of high-risk drug-phenotype co-occurrences, which pose a risk for inducing significant adverse drug events that are potentially preventable by pre-emptive PGx testing. When interpreting these results, it is important to bear in mind that we report here estimated frequencies of drug-phenotype co-occurrences and not estimated frequencies of adverse drug events. Estimating the actual frequency of adverse drug events would require the inclusion of risk measures in the calculation that describe the actual risk for developing a certain adverse drug reaction in the presence of a certain phenotype, which was beyond the scope of this work. It is also important to note that our results are likely to underestimate the number of actual high-risk drug-phenotype co-occurrences, since several drug substances with high exposure and high clinical significance (in particular warfarin and simvastatin) were not included in this analysis, as they could not be classified according to the DPWG classification scheme.

We used a conservative approach for this analysis, focusing on incident use of a core list of drugs associated with a small number of well-researched pharmacogenes where testing is feasible with common pharmacogenomic assays. Including genes that are associated with high quality evidence, but are not included in the majority of pharmacogenomic testing panels (HLA-B and IFNL3), or genes with comparatively weak evidence (F5) that are also rarely included in panels, would further increase the percentage of patients exposed to medications with pharmacogenomic recommendations. However, we wanted our results to reflect the current technological capabilities and therefore chose to exclude these genes from our analysis.

The drug substances considered in this analysis were selected based on clinical PGx dosing guidelines authored by two different consortia: the DPWG from the Netherlands, and the CPIC, which is primarily composed of members from the USA. It is worth mentioning that, besides clinical evidence, differences in the US and European healthcare systems, such as the prescription drug market or national prescribing practices, may have influenced the consortia’s selection of drug substances for their guidelines. A prominent example of this is the anticoagulant warfarin, which is commonly prescribed in the USA (and subject to a CPIC guideline), whereas in Europe phenprocoumon (covered by a DPWG guideline) is mostly used instead. Furthermore, there are several drug substances covered by the DPWG guidelines that are not yet subject to a CPIC guideline but are listed as “high-priority” (level A) gene-drug combinations, such as UGT1A1–irinotecan or CYP2D6–tamoxifen. For this analysis, we decided to unify the DPWG and CPIC list to provide a complementary view of PGx drugs.

This study focused on incident use rather than prevalent use. Focusing on prevalent use would make sense in cases of drug therapy optimization based on PGx markers. Such optimization of ongoing treatments might be especially useful in cases where pharmacogenomic markers confer a long-term risk of adverse events (e.g., risk of thrombosis when taking oral contraceptives [15]), where biomarkers of ADE risk do not exist, or in patients with multiple morbidities and polypharmacy, where potential side-effects of improperly dosed medications might be difficult to associate with their root causes. If prevalent use in such cases were included in the analysis, the fraction of patients who could benefit from pre-emptive testing would increase.
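The difference between the two notions can be sketched as follows. This is an illustration under assumed definitions, not the query logic used in this study: here a (patient, drug) pair counts as prevalent if any exposure starts inside the observation window, and as incident if, in addition, no exposure is recorded before the window. The records and the function name `classify_users` are hypothetical.

```python
from datetime import date

# Hypothetical exposure records: (patient_id, drug, start date of an exposure).
exposures = [
    ("p1", "clopidogrel", date(2008, 5, 1)),
    ("p1", "clopidogrel", date(2010, 3, 1)),
    ("p2", "codeine", date(2011, 7, 15)),
    ("p3", "codeine", date(2007, 1, 10)),
]

WINDOW_START = date(2009, 1, 1)
WINDOW_END = date(2012, 12, 31)


def classify_users(records, window_start, window_end):
    """Return (incident, prevalent) sets of (patient_id, drug) pairs.

    Assumed definitions: 'prevalent' = any exposure starting inside the window;
    'incident' = prevalent AND no exposure recorded before the window.
    """
    seen_before = set()
    seen_within = set()
    for patient_id, drug, start in records:
        key = (patient_id, drug)
        if start < window_start:
            seen_before.add(key)
        elif start <= window_end:
            seen_within.add(key)
    incident = seen_within - seen_before
    return incident, seen_within


incident, prevalent = classify_users(exposures, WINDOW_START, WINDOW_END)
print(sorted(incident))
print(sorted(prevalent))
```

In this toy example the prevalent set is a superset of the incident set, which mirrors the expectation that including prevalent use would increase the fraction of patients who could benefit from testing.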

We evaluated a mixed ‘reactive pre-emptive’ scenario and found that, in some of the patient populations, the number of different medications prescribed to a tested patient is considerably higher than with a purely pre-emptive approach. This more targeted approach could result in a better return on investment for PGx testing, but it also suffers from some of the difficulties associated with reactive PGx testing (e.g., uncertainty about when to order tests). The feasibility of implementing such a ‘reactive pre-emptive’ approach based on the drug substances covered by DPWG guidelines will be further investigated within the context of the European PGx implementation project “Ubiquitous pharmacogenomics” (U-PGx) in a multi-centred cross-over trial, starting in January 2017 [38].

Panel-based pre-emptive testing has the potential to increase the use of pharmacogenetic data based on its immediate availability, because results might either be available from a previous pre-emptive test or additional benefits from the pre-emptive test might be expected in the future. Additionally, the accessibility of test results to healthcare providers can be facilitated by novel information technologies. This includes expanded use of electronic health records, and innovative methods such as the Medication Safety Code (MSC) [39], in which a patient can carry personal pharmacogenomic data on a pocket-sized card. Besides human readable information on pharmacogenomic risk factors, the MSC card contains a barcode which can be scanned with a mobile device to retrieve patient-specific pharmacogenomic guidelines. In the context of U-PGx, the MSC system will be implemented at several European sites as an auxiliary tool for maximizing the utility of pharmacogenomic data and enabling decision support in a wide variety of care settings.


The data show results for pre-emptive and reactive pre-emptive scenarios that may be of use for health systems with patient populations similar to those included here. However, some health systems might have very different patient populations. In practice, the impact of pre-emptive pharmacogenomic testing might vary depending on demographics, ethnicity, clinical conditions, the diagnosis for which a medication is prescribed, and medication use in the population to which it is applied. Conducting a sensitivity analysis to assess the impact of variations in allele frequency or ethnic distribution on our results was beyond the scope of this work.

Practical constraints and missing data in the datasets used made it necessary to apply a pragmatic and therefore simplified model to assess the potential impact of pre-emptive pharmacogenomic testing, resulting in two main limitations. Firstly, some of the PGx guidelines included in this analysis are restricted to specific indications, patient groups or dosage levels. Examples of this include the CPIC guideline for clopidogrel, which only addresses patients undergoing percutaneous coronary intervention, and the guideline for codeine, which only applies to analgesic doses of codeine. Because data on indication were unavailable in the datasets used, our results on incident PGx drug prescriptions do not fully reflect these restrictions.

Secondly, the partly incomplete coverage of ethnicity in the datasets used made it necessary to infer the estimated distribution of ethnic subgroups in the different insurance populations from other publicly available demographic datasets. Thus, the estimate of the prevalence of high-risk drug-phenotype co-occurrences does not account for any differences in the prevalence of medication use between ethnic subpopulations that may result from ethnic predispositions to certain health conditions. These limitations of our study emphasize the importance of comprehensive data on medication use that, besides age and gender, also capture additional parameters such as ethnicity, diagnosis, and dosage amount.
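The simplification described above can be illustrated with a minimal sketch. The group labels, shares, and phenotype frequencies are hypothetical placeholders, and the function name is our own; the key assumption made explicit in the code is that medication use is independent of ethnicity, so that one pooled phenotype frequency is applied to all users of a drug.

```python
import math


def pooled_phenotype_frequency(ethnic_shares, phenotype_freq_by_group):
    """Weight group-specific phenotype frequencies by the inferred ethnic mix.

    Assumption (a simplification): drug use does not differ between groups,
    so the pooled frequency applies equally to every user of the drug.
    """
    if not math.isclose(sum(ethnic_shares.values()), 1.0, abs_tol=1e-9):
        raise ValueError("ethnic shares must sum to 1")
    return sum(
        share * phenotype_freq_by_group[group]
        for group, share in ethnic_shares.items()
    )


# Hypothetical inputs, not taken from the study datasets.
shares = {"group_a": 0.6, "group_b": 0.3, "group_c": 0.1}
freqs = {"group_a": 0.02, "group_b": 0.05, "group_c": 0.15}

pooled = pooled_phenotype_frequency(shares, freqs)
print(round(pooled, 4))
```

If medication use did differ between groups, the weights would have to be the ethnic mix among users of each drug rather than the mix of the insurance population as a whole, which is the information the datasets did not provide.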

Future work should also take into account that the costs of genetic testing and adverse drug events might vary widely between institutions and regions. This paper sets the stage for the broader issue of the economic implications of PGx testing. However, providing more than speculative estimates is not possible here, because the required economic studies are complicated and time-consuming to conduct. We plan to develop such studies, but that is beyond the scope of the current investigation. We think a promising approach to addressing these issues is to enable a broad range of stakeholders and interested researchers to collaborate on extending the analysis. Towards this goal, we have created a new research study within the Observational Health Data Sciences and Informatics (OHDSI) collaborative [40]. The OHDSI program is a multi-stakeholder, interdisciplinary collaborative with the goal of bringing out the value of health data through large-scale, open-source analytics. OHDSI has established an international network of researchers and observational health databases with a central coordinating center housed at Columbia University, New York, USA. Immediate next steps are to generate statistics on genes associated with PGx drugs. Furthermore, we plan to extend the study to cover regions outside the United States.

Finally, the evidence for and strength of different pharmacogenomic treatment recommendations vary greatly and are constantly changing as new data are generated by the scientific community; implementers might deem only a subset of the recommendations from CPIC and DPWG suitable for utilisation in clinical practice.


To our knowledge, this is the first analysis to use large claims datasets to examine the potential for pre-emptive pharmacogenomic testing by examining the incidence of exposure to multiple PGx drugs in different patient populations. However, deriving incidence data is only the first step toward a detailed health economic analysis that predicts the impact of different pre-emptive PGx testing scenarios in different regions and care settings on the quality of care, total costs and cost savings. While this study is not a formal cost-effectiveness analysis, it provides the basis for determining key parameters that are required to estimate the economic implications of implementing the various testing strategies. A formal cost-effectiveness analysis would need to consider not only the incidence of use of these medications, but also to quantify the risks and benefits associated with those therapies that could be used more efficiently with pharmacogenomic information. This would include avoiding the prescription of medications that a patient may not metabolise into their active form, as well as preventing debilitating adverse drug reactions and drug-drug interactions.

Supporting Information

S4 Table. Drug-phenotype co-occurrence (CCAE).


S5 Table. Drug-phenotype co-occurrence (Medicaid).


S6 Table. Drug-phenotype co-occurrence (Medicare).


S7 Table. Drug-phenotype co-occurrence overall results.


Author Contributions

  1. Conceptualization: MS RDB.
  2. Formal analysis: MS HX KB PEE DCM RDB.
  3. Funding acquisition: MS RDB.
  4. Methodology: MS HX KB PEE DCM PR SH RDB.
  5. Project administration: MS RDB.
  6. Software: MS HX SH RDB.
  7. Writing – original draft: MS KB RDB.
  8. Writing – review & editing: MS HX KB PEE DCM SMA PR SH RDB.


References

  1. US Department of Health and Human Services. National action plan for adverse drug event prevention. Washington, DC. 2014.
  2. Dunnenberger HM, Crews KR, Hoffman JM, Caudle KE, Broeckel U, Howard SC, et al. Preemptive Clinical Pharmacogenetics Implementation: Current Programs in Five US Medical Centers. Annu Rev Pharmacol Toxicol. 2015;55: 89–106. pmid:25292429
  3. Johnson SG, Gruntowicz D, Chua T, Morlock RJ. Financial Analysis of CYP2C19 Genotyping in Patients Receiving Dual Antiplatelet Therapy Following Acute Coronary Syndrome and Percutaneous Coronary Intervention. J Manag Care Spec Pharm. 2015;21: 552–557. pmid:26108379
  4. Pulley JM, Denny JC, Peterson JF, Bernard GR, Vnencak-Jones CL, Ramirez AH, et al. Operational implementation of prospective genotyping for personalized medicine: the design of the Vanderbilt PREDICT project. Clin Pharmacol Ther. 2012;92: 87–95. pmid:22588608
  5. Crews KR, Hicks JK, Pui C-H, Relling MV, Evans WE. Pharmacogenomics and individualized medicine: translating science into practice. Clin Pharmacol Ther. 2012;92: 467–475. pmid:22948889
  6. Hoffman JM, Haidar CE, Wilkinson MR, Crews KR, Baker DK, Kornegay NM, et al. PG4KDS: A model for the clinical implementation of pre-emptive pharmacogenetics. Am J Med Genet C Semin Med Genet. 2014;166: 45–55. pmid:24619595
  7. Weitzel KW, Elsey AR, Langaee TY, Burkley B, Nessl DR, Obeng AO, et al. Clinical pharmacogenetics implementation: approaches, successes, and challenges. Am J Med Genet C Semin Med Genet. 2014;166C: 56–67. pmid:24616371
  8. Nutescu EA, Drozda K, Bress AP, Galanter WL, Stevenson J, Stamos TD, et al. Feasibility of implementing a comprehensive warfarin pharmacogenetics service. Pharmacotherapy. 2013;33: 1156–1164. pmid:23864527
  9. Shuldiner AR, Palmer K, Pakyz RE, Alestock TD, Maloney KA, O’Neill C, et al. Implementation of pharmacogenetics: the University of Maryland Personalized Anti-platelet Pharmacogenetics Program. Am J Med Genet C Semin Med Genet. 2014;166C: 76–84. pmid:24616408
  10. Van Driest SL, Shi Y, Bowton EA, Schildcrout JS, Peterson JF, Pulley J, et al. Clinically actionable genotypes among 10,000 patients with preemptive pharmacogenomic testing. Clin Pharmacol Ther. 2014;95: 423–431. pmid:24253661
  11. Bush WS, Crosslin DR, Owusu-Obeng A, Wallace J, Almoguera B, Basford MA, et al. Genetic variation among 82 pharmacogenes: The PGRNseq data from the eMERGE network. Clin Pharmacol Ther. 2016;100: 160–169. pmid:26857349
  12. Schildcrout JS, Denny JC, Bowton E, Gregg W, Pulley JM, Basford MA, et al. Optimizing drug outcomes through pharmacogenetics: a case for preemptive genotyping. Clin Pharmacol Ther. 2012;92: 235–242. pmid:22739144
  13. Caudle KE, Klein TE, Hoffman JM, Muller DJ, Whirl-Carrillo M, Gong L, et al. Incorporation of Pharmacogenomics into Routine Clinical Practice: the Clinical Pharmacogenetics Implementation Consortium (CPIC) Guideline Development Process. Curr Drug Metab. 2014;15: 209–217. pmid:24479687
  14. Relling MV, Klein TE. CPIC: Clinical Pharmacogenetics Implementation Consortium of the Pharmacogenomics Research Network. Clin Pharmacol Ther. 2011;89: 464–467. pmid:21270786
  15. Swen JJ, Nijenhuis M, de Boer A, Grandia L, Maitland-van der Zee AH, Mulder H, et al. Pharmacogenetics: From Bench to Byte—An Update of Guidelines. Clinical Pharmacology & Therapeutics. 2011;89: 662–673. pmid:21412232
  16. Rubinstein WS, Maglott DR, Lee JM, Kattman BL, Malheiro AJ, Ovetsky M, et al. The NIH genetic testing registry: a new, centralized database of genetic tests to enable access to comprehensive information and improve transparency. Nucleic Acids Res. 2013;41: D925–935. pmid:23193275
  17. Phillips MS. [Internet]. 2013 [cited 19 Apr 2012]. Available:
  18. Thorn CF, Klein TE, Altman RB. Pharmacogenomics and bioinformatics: PharmGKB. Pharmacogenomics. 2010;11: 501–505. pmid:20350130
  19. Schuemie MJ, Gini R, Coloma PM, Straatman H, Herings RMC, Pedersen L, et al. Replication of the OMOP Experiment in Europe: Evaluating Methods for Risk Identification in Electronic Health Record Databases. Drug Saf. 2013;36: 159–169. pmid:24166232
  20. Overhage JM, Ryan PB, Schuemie MJ, Stang PE. Desideratum for evidence based epidemiology. Drug Saf. 2013;36 Suppl 1: S5–14. pmid:24166219
  21. Don Torok. OMOP Common Data Model (CDM V4.0) ETL Mapping Specification for Truven CCAE. Observational Medical Outcomes Partnership. [Internet]. 2012 [cited 24 Feb 2015]. Available:
  22. Don Torok. OMOP Common Data Model (CDM V4.0) ETL Mapping Specification for Truven MDCR. Observational Medical Outcomes Partnership [Internet]. 2012 [cited 24 Feb 2015]. Available:
  23. Mark Khayter. OMOP Common Data Model (CDM V4.0) ETL Mapping Specification for Truven MDCD 0108. Observational Medical Outcomes Partnership [Internet]. 2012 [cited 24 Feb 2015]. Available:
  24. Database queries. In: GitHub [Internet]. [cited 7 Sep 2015]. Available:
  25. Nelson SJ, Zeng K, Kilbourne J, Powell T, Moore R. Normalized names for clinical drugs: RxNorm at 6 years. J Am Med Inform Assoc. 2011;18: 441–448. pmid:21515544
  26. SQL Workbench/J—Home [Internet]. [cited 24 Feb 2015]. Available:
  27. Innovation in Medical Evidence Development and Surveillance (IMEDS) [Internet]. [cited 10 Nov 2014]. Available:
  28. OHDSI/StudyProtocols. In: GitHub [Internet]. [cited 21 Aug 2015]. Available:
  29. Relling MV, Gardner EE, Sandborn WJ, Schmiegelow K, Pui C-H, Yee SW, et al. Clinical Pharmacogenetics Implementation Consortium Guidelines for Thiopurine Methyltransferase Genotype and Thiopurine Dosing: 2013 Update. Clinical Pharmacology & Therapeutics. 2013;93: 324–325. pmid:23422873
  30. Hicks J, Bishop J, Sangkuhl K, Müller D, Ji Y, Leckband S, et al. Clinical Pharmacogenetics Implementation Consortium (CPIC) Guideline for CYP2D6 and CYP2C19 Genotypes and Dosing of Selective Serotonin Reuptake Inhibitors. Clinical Pharmacology & Therapeutics. 2015;98: 127–134. pmid:25974703
  31. Hicks J, Bishop J, Sangkuhl K, Müller D, Ji Y, Leckband S, et al. Clinical Pharmacogenetics Implementation Consortium (CPIC) Guideline for CYP2D6 and CYP2C19 Genotypes and Dosing of Selective Serotonin Reuptake Inhibitors. Clinical Pharmacology & Therapeutics. 2015;98: 127–134. pmid:25974703
  32. Caudle KE, Thorn CF, Klein TE, Swen JJ, McLeod HL, Diasio RB, et al. Clinical Pharmacogenetics Implementation Consortium Guidelines for Dihydropyrimidine Dehydrogenase Genotype and Fluoropyrimidine Dosing. Clinical Pharmacology & Therapeutics. 2013;94: 640–645. pmid:23988873
  33. Gammal R, Court M, Haidar C, Iwuchukwu O, Gaur A, Alvarellos M, et al. Clinical Pharmacogenetics Implementation Consortium (CPIC) Guideline for UGT1A1 and Atazanavir Prescribing. Clinical Pharmacology & Therapeutics. 2016;99: 363–369. pmid:26417955
  34. Ramsey LB, Johnson SG, Caudle KE, Haidar CE, Voora D, Wilke RA, et al. The Clinical Pharmacogenetics Implementation Consortium Guideline for SLCO1B1 and Simvastatin-Induced Myopathy: 2014 Update. Clinical Pharmacology & Therapeutics. 2014;96: 423–428. pmid:24918167
  35. Johnson JA, Gong L, Whirl-Carrillo M, Gage BF, Scott SA, Stein CM, et al. Clinical Pharmacogenetics Implementation Consortium Guidelines for CYP2C9 and VKORC1 Genotypes and Warfarin Dosing. Clinical Pharmacology & Therapeutics. 2011;90: 625–629. pmid:21900891
  36. 13 M, 2013. Health Coverage by Race and Ethnicity: The Potential Impact of the Affordable Care Act [Internet]. [cited 22 Aug 2016]. Available:
  37. 09 M, 2016. Profile of Medicare Beneficiaries by Race and Ethnicity: A Chartpack—Chartpack [Internet]. [cited 22 Aug 2016]. Available:
  38. Ubiquitous Pharmacogenomics (U-PGx) | Making actionable pharmacogenomic data and effective treatment optimization accessible to every European citizen [Internet]. [cited 22 Aug 2016]. Available:
  39. Samwald M, Adlassnig K-P. Pharmacogenomics in the pocket of every patient? A prototype based on quick response codes. J Am Med Inform Assoc. 2013;20: 409–412. pmid:23345409
  40. Matthias Samwald, Richard Boyce. OHDSI Research Study Homepage: Incidence of exposure to drugs for which pre-emptive pharmacogenomic testing is available [Internet]. [cited 25 Feb 2015]. Available: