Thromboembolic adverse event study of combined estrogen-progestin preparations using Japanese Adverse Drug Event Report database

Combined estrogen-progestin preparations (CEPs) are associated with thromboembolic (TE) side effects. The aim of this study was to evaluate the incidence of TE using the Japanese Adverse Drug Event Report (JADER) database. Adverse events recorded from April 2004 to November 2014 in the JADER database were obtained from the Pharmaceuticals and Medical Devices Agency (PMDA) website (www.pmda.go.jp). We calculated the reporting odds ratios (RORs) of suspected CEPs, analyzed the time-to-onset profile, and assessed the hazard type using the Weibull shape parameter (WSP). Furthermore, we applied the association rule mining technique to discover undetected relationships such as possible risk factors. The JADER database contained 338,224 reported cases. The RORs (95% confidence interval, CI) of drospirenone combined with ethinyl estradiol (EE, Dro-EE), norethisterone with EE (Ne-EE), levonorgestrel with EE (Lev-EE), desogestrel with EE (Des-EE), and norgestrel with EE (Nor-EE) were 56.2 (44.3–71.4), 29.1 (23.5–35.9), 42.9 (32.3–57.0), 44.7 (32.7–61.1), and 38.6 (26.3–56.7), respectively. The medians (25%–75%) of the time-to-onset of Dro-EE, Ne-EE, Lev-EE, Des-EE, and Nor-EE were 150.0 (75.3–314.0), 128.0 (27.0–279.0), 204.0 (44.0–660.0), 142.0 (41.3–344.0), and 16.5 (8.8–32.0) days, respectively. The 95% CIs of the WSP β for Ne-EE, Lev-EE, and Nor-EE lay below 1 and thus excluded 1. Association rule mining indicated that patients with anemia had a potential risk of developing a TE when using CEPs. Our results suggest that it is important to monitor patients administered CEPs for TE. Careful observation is recommended, especially for those using Nor-EE, and this information may be useful for efficient therapeutic planning.


Introduction
Combined estrogen-progestin preparations (CEPs) are one of the most commonly used birth control methods worldwide. CEPs have benefits beyond preventing an undesired pregnancy, including reduced ovarian and endometrial cancer risk, reduced dysfunctional uterine bleeding, decreased menstrual flow and menorrhagia, decreased primary dysmenorrhea, improved hirsutism and acne, and decreased risk of premenstrual syndrome/premenstrual dysphoric disorder [1].
Because CEPs are administered to healthy women over the long term, patients should be carefully monitored for adverse events (AEs). CEPs, such as oral contraceptives (OCs), have a variety of side effects, of which thrombosis is the most frequent and important [2]. Numerous studies have demonstrated a relationship between CEPs and thromboembolism (TE), including venous thromboembolism (VTE) [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15]. According to the American College of Obstetricians and Gynecologists, the incidence of TE increases from 1 to 5 occurrences per 10,000 women per year in non-OC users to 3 to 9 occurrences per 10,000 women per year in OC users [16]. A systematic review indicated that the risk of VTE in women of childbearing age who were non-OC users was 4 per 10,000 women per year, whereas in OC users, the risk was 7 to 10 per 10,000 women per year [4]. Appropriate treatment of TE after onset resolves the thrombus; however, in approximately 20%-50% of cases of TE, and in proximal deep vein thrombosis (DVT) to a greater extent, patients develop a post-thrombotic syndrome with lifelong problems including pain and swelling of the leg [17,18]. In rare cases, thrombi cause pulmonary embolism, and 1 in 100 cases results in death [13].
Few studies have examined the association between CEP use and arterial thromboembolism (ATE), such as myocardial infarction and ischemic stroke [2,10,19-23]. Although ATE is less frequent than VTE, the consequences of ATE are often more serious [23]. The World Health Organization (WHO) has reported that the use of CEPs increased the risk of myocardial infarction by approximately 5-fold and the risk of ischemic stroke by approximately 3-fold [2,19,20].
Because VTE and ATE are rare AEs associated with CEP use, the implementation phase of epidemiologic research is difficult. The Pharmaceuticals and Medical Devices Agency (PMDA) in Japan has released the Japanese Adverse Drug Event Report (JADER) database, which is a large spontaneous reporting system (SRS) and reflects the realities of clinical practice in Japan [24]. Therefore, JADER has been used for pharmacovigilance assessments for rare AEs using the reporting odds ratio (ROR) [24][25][26][27].
Several studies have indicated that the risk of developing CEP-induced VTE is greatest during the first year of use [2,4,6,7,9,10,12]. However, detailed onset profiles of CEP-induced VTE are not clear. The analysis of time-to-onset data has been proposed as a new method of detecting signals for AEs in SRSs [24,27,28]. In this study, we applied the index of ROR to TE and evaluated time-to-onset profiles of TE for CEPs in the real world.
Furthermore, association rule mining has been proposed as a new analytical approach for identifying undetected clinical factor combinations, such as possible risk factors, between variables in huge databases [29][30][31]. This is the first application of association rule mining for the detection of association rules between CEPs and TE.

Materials and methods
AEs recorded from April 2004 to November 2014 in the JADER database were obtained from the PMDA website (www.pmda.go.jp). The JADER database consists of 4 tables: patient demographic information, such as sex, age, and reporting year (demo); drug information, such as non-proprietary name of the prescribed drug, route, and start and end date of administration (drug); adverse events, such as type, outcome, and onset date (reac); and primary disease (hist). We constructed a relational database that integrated the 4 data tables using FileMaker Pro 12 software (FileMaker, Inc., Santa Clara, CA, USA). The "drug" file included the role codes assigned to each drug: suspected, concomitant, and interacting drugs (higiyaku, heiyouyaku, and sougosayou in Japanese, respectively). The suspected drug records were extracted and analyzed in this study.
AEs in the JADER database are coded according to the terminology preferred by the Medical Dictionary for Regulatory Activities/Japanese version 17.1 (MedDRA/J) (www.pmrj.jp/jmo/php/indexj.php). The standardized MedDRA Queries (SMQ) index consists of groupings of MedDRA terms, ordinarily at the preferred term (PT) level, that relate to a defined medical condition or area of interest [32]. We used the SMQ for embolic and thrombotic events, arterial (SMQ code: 20000082), embolic and thrombotic events, vessel type unspecified and mixed arterial and venous (SMQ code: 20000083), and embolic and thrombotic events, venous (SMQ code: 20000084; Table 1).
The mosaic plot of the two-way frequency table was constructed with the age-category (X) and primary disease (Y). A mosaic plot is divided into rectangles so that the vertical length of each rectangle is proportional to the proportion of the Y variable at each level of the X variable.
We assessed the association between CEPs and TE using the ROR, which is an established parameter for pharmacovigilance research. The ROR is the ratio of the odds of reporting an adverse event versus all other events associated with the drug of interest compared with the reporting odds for all other drugs present in the database [33]. We calculated the ROR using a two-by-two contingency table by defining the rows using CEPs and all other drugs and the columns using TE and all other adverse events (Fig 1). RORs are expressed as point estimates with 95% confidence intervals (CI). The detection of a signal was dependent on the signal indices exceeding a predefined threshold. Safety signals are considered significant when the ROR estimates and the lower limits of the corresponding 95% CI exceed 1. At least 2 cases are required to define a signal [33,34].
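The ROR calculation and signal criteria described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the argument names are descriptive stand-ins for the cells of the two-by-two table (the cell letters of Fig 1 are not assumed), the CI uses the common log-normal approximation (an assumption, since the text does not state how the CI was derived), and any counts passed in are hypothetical.

```python
import math


def reporting_odds_ratio(drug_event, drug_other, rest_event, rest_other, z=1.96):
    """Return (ROR, lower, upper) for a two-by-two reporting table.

    drug_event: reports with the drug of interest and the event of interest
    drug_other: reports with the drug of interest and any other event
    rest_event: reports with all other drugs and the event of interest
    rest_other: reports with all other drugs and any other event
    """
    cells = (drug_event, drug_other, rest_event, rest_other)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    ror = (drug_event / drug_other) / (rest_event / rest_other)
    # Standard error of log(ROR) under the log-normal approximation (assumed).
    se_log = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(ror) - z * se_log)
    upper = math.exp(math.log(ror) + z * se_log)
    return ror, lower, upper


def is_ror_signal(drug_event, drug_other, rest_event, rest_other):
    """Signal if ROR > 1, lower 95% CI limit > 1, and at least 2 cases."""
    ror, lower, _ = reporting_odds_ratio(
        drug_event, drug_other, rest_event, rest_other
    )
    return drug_event >= 2 and ror > 1 and lower > 1
```

With the hypothetical counts 20, 80, 100, and 9800, the point estimate is (20/80)/(100/9800) = 24.5, and the lower CI limit is well above 1, so the combination would be flagged as a signal.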
Proportional reporting ratios (PRRs) are measures of disproportionality used for detecting signals in SRS databases [35]. PRRs are calculated from the same 2 × 2 table as the ROR, and the calculation is identical to that of the relative risk (RR) from a cohort study, i.e., [a / (a + c)] / [b / (b + d)]. If the drug and adverse event are independent, the expected value of the PRR is 1. The minimum criteria for signal detection are as follows: 3 or more cases, PRR of at least 2, and Chi-square of at least 4.
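A minimal sketch of the PRR and its signal criteria is given below. The argument names are descriptive stand-ins for the table cells, the Chi-square is the uncorrected Pearson statistic (an assumption; the text does not say whether a continuity correction was applied), and the counts in the note that follows are hypothetical.

```python
def prr_with_chi_square(drug_event, drug_other, rest_event, rest_other):
    """Return (PRR, chi_square) for a two-by-two reporting table.

    drug_event: reports with the drug of interest and the event of interest
    drug_other: reports with the drug of interest and any other event
    rest_event: reports with all other drugs and the event of interest
    rest_other: reports with all other drugs and any other event
    """
    n_drug = drug_event + drug_other
    n_rest = rest_event + rest_other
    n_event = drug_event + rest_event
    n_other = drug_other + rest_other
    if min(n_drug, n_rest, n_event, n_other) <= 0 or rest_event <= 0:
        raise ValueError("table margins and the comparator event count must be positive")
    # Proportion of the event among the drug's reports divided by the
    # proportion of the event among all other drugs' reports.
    prr = (drug_event / n_drug) / (rest_event / n_rest)
    n = n_drug + n_rest
    # Pearson Chi-square without continuity correction (assumed).
    diff = drug_event * rest_other - drug_other * rest_event
    chi_square = n * diff ** 2 / (n_drug * n_rest * n_event * n_other)
    return prr, chi_square


def is_prr_signal(drug_event, drug_other, rest_event, rest_other):
    """Signal if >= 3 cases, PRR >= 2, and Chi-square >= 4."""
    prr, chi_square = prr_with_chi_square(
        drug_event, drug_other, rest_event, rest_other
    )
    return drug_event >= 3 and prr >= 2 and chi_square >= 4
```

For the hypothetical counts 20, 80, 100, and 9800, the PRR is (20/100)/(100/9900) = 19.8. For a table in which drug and event are independent, such as 10, 90, 100, and 900, the PRR is 1 and the Chi-square is 0.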
Time-to-onset duration of the data from the JADER database was calculated from the time of the patient's first prescription to the occurrence of the AE. The median duration, quartiles, and Weibull shape parameters (WSPs) were used to evaluate the time from administration to the development of TE [27,36-38]. The WSP test is used for the statistical analysis of time-to-onset data and can describe the non-constant rate of the incidence of AEs [24,39]. The scale parameter α of the Weibull distribution determines the scale of the distribution function. A larger scale value stretches the distribution, whereas a smaller scale value shrinks it. The shape parameter β of the Weibull distribution indicates the hazard without a reference population. When β is equal to 1, the hazard is estimated to be constant over time.
When β is greater than 1 and the 95% CI of β excludes 1, the hazard is considered to increase over time. When β is smaller than 1 and the 95% CI of β excludes 1, the hazard is considered to decrease over time [39]. The data analyses were performed using JMP 11.2 (SAS Institute Inc., Cary, NC, USA).
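The Weibull analysis in this study was run in JMP. Purely as an illustration of the idea, the sketch below estimates α and β by maximum likelihood with the Python standard library and applies the interpretation rule described above. The function names are hypothetical, the fit assumes complete (uncensored) and strictly positive onset times, and the 95% CI of β is taken as an input rather than computed.

```python
import math


def fit_weibull(times, tol=1e-10):
    """Maximum-likelihood (alpha, beta) for positive, uncensored times."""
    if len(times) < 2 or min(times) <= 0:
        raise ValueError("need at least two strictly positive times")
    if min(times) == max(times):
        raise ValueError("times must not all be equal")
    logs = [math.log(t) for t in times]
    mean_log = sum(logs) / len(logs)
    t_max = max(times)

    def score(beta):
        # Profile-likelihood equation for the shape parameter. Dividing by
        # t_max avoids overflow and cancels in the ratio.
        w = [(t / t_max) ** beta for t in times]
        weighted = sum(wi * li for wi, li in zip(w, logs)) / sum(w)
        return weighted - 1.0 / beta - mean_log

    # score() increases with beta, so bracket the root and bisect.
    lo, hi = 1e-3, 1.0
    while score(hi) < 0:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    beta = (lo + hi) / 2.0
    mean_pow = sum((t / t_max) ** beta for t in times) / len(times)
    alpha = t_max * mean_pow ** (1.0 / beta)
    return alpha, beta


def classify_hazard(beta_lower, beta_upper):
    """Interpret the 95% CI of beta as described in the text."""
    if beta_upper < 1:
        return "decreasing"  # early failure type
    if beta_lower > 1:
        return "increasing"
    return "constant"  # CI includes 1
```

For example, a CI of (0.5, 0.9) for β would be read as a hazard that decreases over time, whereas (0.8, 1.2) includes 1 and would be read as a constant hazard.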

Association rule mining
The association rule mining approach searches for frequent items in databases and discovers interesting relationships between variables. Given a set of transactions T (each transaction is a set of items), an association rule can be expressed as X -> Y, where X and Y are mutually exclusive sets of items [40]. The rule's statistical significance and strength are measured by the support and confidence, respectively. Support is defined as the percentage of transactions in the data that contain all items in both the antecedent (left-hand side of the rule: lhs) and the consequent (right-hand side of the rule: rhs) [40]. The support indicates how frequently the rule occurs in the transactions. The formula for calculating support is as follows:
$$\mathrm{support}(X \rightarrow Y) = P(X \cap Y) = \frac{n(X \cap Y)}{D}$$
where n(X ∩ Y) is the number of transactions that contain all items of both X and Y, and D is the total number of transactions. Confidence corresponds to the conditional probability P(Y|X). A rule with high confidence is important because it provides an accurate prediction of the association of the items in the rule. The formula for calculating confidence is as follows:
$$\mathrm{confidence}(X \rightarrow Y) = P(Y \mid X) = \frac{P(X \cap Y)}{P(X)}$$
Lift is the ratio of the probability that X and Y occur together to the product of the two individual probabilities for X and Y; that is,
$$\mathrm{lift}(X \rightarrow Y) = \frac{P(X \cap Y)}{P(X)\,P(Y)}$$
Since P(Y) appears in the denominator of the lift measure, the lift can be expressed as the confidence divided by P(Y). The lift can be evaluated as follows: lift = 1, if X and Y are independent; lift > 1, if X and Y are positively correlated; lift < 1, if X and Y are negatively correlated. We performed these analyses using the apriori function of the arules package in R version 3.3.3 [41].
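The three measures can be illustrated with a short sketch. It is written in Python for illustration only and is not the arules implementation used in the study; the function name and the example reports are hypothetical.

```python
def rule_metrics(transactions, lhs, rhs):
    """Return (support, confidence, lift) for the rule lhs -> rhs.

    transactions: list of sets of items (one set per report)
    lhs, rhs: iterables of items forming the antecedent and consequent
    """
    lhs, rhs = frozenset(lhs), frozenset(rhs)
    n = len(transactions)
    n_lhs = sum(1 for t in transactions if lhs <= t)
    n_rhs = sum(1 for t in transactions if rhs <= t)
    n_both = sum(1 for t in transactions if (lhs | rhs) <= t)
    if n == 0 or n_lhs == 0 or n_rhs == 0:
        raise ValueError("lhs and rhs must each occur in at least one transaction")
    support = n_both / n              # P(X and Y)
    confidence = n_both / n_lhs       # P(Y | X)
    lift = confidence / (n_rhs / n)   # confidence divided by P(Y)
    return support, confidence, lift


# Hypothetical reports, not taken from JADER.
EXAMPLE_REPORTS = [
    {"CEP", "anemia", "TE"},
    {"CEP", "anemia", "TE"},
    {"CEP", "TE"},
    {"CEP"},
    {"anemia"},
    {"other drug"},
    {"other drug"},
    {"other drug", "TE"},
]
```

In these eight hypothetical reports, the rule {CEP, anemia} -> {TE} holds in 2 reports, its antecedent occurs in 2, and TE occurs in 4, giving a support of 0.25, a confidence of 1.0, and a lift of 2.0.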

Results
The […] (Table 2). In the mosaic plot, Dro-EE and Nor-EE were primarily administered to patients with dysmenorrhea and endometriosis, respectively (Fig 2). The ROR and 95% CI of patients stratified […] (Table 3).
The association rule mining technique was applied to TE (as the consequent) using demographic data such as age category and patient history. The apriori algorithm extracts frequent combinations from a large database to efficiently find sets of adverse events that occur more frequently than the minimum support threshold (defined as 0.00001 in this study). It then generates rules that meet the minimum confidence threshold (defined as 0.9 in this study). Furthermore, the maximum size of mined frequent item sets (maxlen: a parameter in the arules package) was restricted to 3. The result of the mining algorithm was a set of 12 rules. The support, confidence, and lift of each association rule are summarized in Table 4, in which the rules are listed up to the twelfth position in descending order of support […] (Fig 3). Additionally, the lift value of the rule combining {smoking, Nor-EE} was high (Table 4, id [3]).
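The mining step described above (minimum support, minimum confidence, maxlen, and TE fixed as the consequent) can be sketched as a small level-wise search with apriori pruning. This is a Python illustration under stated assumptions, not the arules code used in the study: the function name, the tuple layout of the returned rules, and any input data are hypothetical.

```python
from itertools import combinations


def apriori_rules(transactions, target, min_support, min_confidence, maxlen=3):
    """Mine rules lhs -> {target}; returns (lhs, support, confidence, lift)."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {i for t in transactions for i in t}
    level = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            level[frozenset([item])] = s
    frequent = dict(level)

    # Level k: join frequent (k-1)-itemsets, prune, then count.
    k = 2
    while level and k <= maxlen:
        candidates = {x | y for x, y in combinations(list(level), 2)
                      if len(x | y) == k}
        level = {}
        for cand in candidates:
            # Apriori pruning: every (k-1)-subset must already be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(cand, k - 1)):
                s = support(cand)
                if s >= min_support:
                    level[cand] = s
        frequent.update(level)
        k += 1

    # Keep only rules whose consequent is the target item.
    rules = []
    for itemset, s in frequent.items():
        if target in itemset and len(itemset) > 1:
            lhs = itemset - {target}
            confidence = s / frequent[lhs]
            if confidence >= min_confidence:
                lift = confidence / frequent[frozenset([target])]
                rules.append((tuple(sorted(lhs)), s, confidence, lift))
    # Descending support, ties broken by the antecedent for determinism.
    rules.sort(key=lambda r: (-r[1], r[0]))
    return rules
```

Here maxlen counts all items in a rule (antecedent plus consequent), so with maxlen set to 3 and a single-item consequent, each antecedent contains at most two items. A call mirroring the thresholds reported in the text would pass min_support=0.00001 and min_confidence=0.9.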

Discussion
The RORs and PRRs suggested that all CEPs were associated with an increased risk of TE. Several studies demonstrated that the increase in VTE risk after administration of Dro-EE or Des-EE was greater than that after administration of Lev-EE [6,8,11,15,42]. The risk of VTE might be associated with the type of progestin, the amount of estrogen, or the pharmacological activity of estrogen [6,7]. In contrast, Odlind et al. suggested that those associations might be subject to bias [43,44]. Whereas some studies indicated that Des-EE reduced the risk of ATE compared to other CEPs, other studies did not [2,23]. Lidegaard et al. found the risk of ATE decreased with lower doses of estrogen [23]. We did not observe significant differences in the RORs among Dro-EE, Nor-EE, Lev-EE, and Des-EE. We do not have a conclusive explanation for the differences in TE risk between the various progestins in low-dose CEPs.
The median time to TE onset induced by Nor-EE, which contained the highest amount of EE (50 μg), was the shortest time to onset among the CEPs (16.5 days). EE enhances the effects of the procoagulation factors 2, 7, 9, 10, 12, 13, and fibrinogen, while reducing natural anticoagulant protein S and antithrombin, and acts as a procoagulant [2,45]. The effects of EE were reported to be dose-dependent [2]. With an estrogen dose of 30 μg as the reference category, the thrombotic risk was 0.8 (95% CI 0.5 to 1.2) for an estrogen dose of 20 μg and 1.9 (1.1 to 3.4) for a dose of 50 μg [11]. In contrast, progestin has no effect on coagulation factor levels [2]. One plausible reason for the "short" median time to TE onset induced by Nor-EE might be the high amount of EE in Nor-EE in our study. However, the mechanism of development of thrombosis is poorly understood. It may be due to the differential effects on sex hormone binding globulin, anticoagulant protein S resistance in early OC use, or the unmasking of an underlying inherited coagulation disorder [4]. CEPs have several metabolic effects on lipid, carbohydrate, and hemostatic parameters [3]. To reveal the mechanism of the short time to onset of TE by Nor-EE, further pharmacological study is necessary.
The WSP β of Ne-EE, Lev-EE, and Nor-EE was less than 1, which indicated an early failure type and suggested that the hazard of TE associated with these CEPs might decrease over time. It was reported that the risk of VTE decreased with prolonged administration [46][47][48] and recovered to the level of non-users of CEPs within 3 months after discontinuation [15].
In our study, the median occurrence of TE for all CEPs was within 3 months; however, several instances of VTE were observed after 3 months. The risks of VTE were reported to be observed within 4 months following CEP administration [15]. These results corresponded with those of previous studies and confirmed the necessity of long-term observation after the administration of these drugs.

Table 3. Quartiles and parameters of the Weibull distribution and failure pattern for combined estrogen-progestin preparations. Columns: Drugs; Case reports (n); Median (lower−upper quartile) (days); Scale parameter, α (95% CI); Shape parameter, β (95% CI).
In the association rule mining, the lift values of rules combining CEPs with anemia-related items, including iron pill administration, were high, which suggests that patients with anemia had a potential risk of TE when using CEPs. Recently, an association between anemia and cerebral venous thrombosis was reported [49]. Therefore, anemia patients should be monitored carefully. The lift values of the two combined items, {smoking, Nor-EE}, were also high enough to suggest an association. This finding suggests that smoking while taking CEPs may increase the risk of TE.
Association rule mining is one of the most important tasks in data mining, and various effective algorithms have been proposed. Several groups have evaluated the performance of association rule mining algorithms, such as apriori, Frequent Pattern (FP)-Growth, and Eclat, in terms of execution time or the confidence, lift, and conviction values of the resulting rules. Apriori is a level-wise, breadth-first algorithm that counts transactions, generates candidates, and discovers frequent itemsets using user-specified support and confidence measures. With a large number of itemsets, the algorithm requires more space and time; consequently, its complexity increases [50]. The FP-Growth algorithm was proposed by Han as an alternative to the apriori-based approach [51,52]. FP-Growth encodes the data set of all transactions in a compact data structure called an FP-tree, which can save considerable amounts of memory in transaction storage [52,53]. The Eclat algorithm, which also solves the frequent itemset problem, is a depth-first search-based algorithm that uses equivalence classes, a vertical database layout, and set intersection instead of counting [54]. However, the performance of each algorithm differs owing to various parameters, such as the size of the itemsets and the structure of the database. We consider that the relative merits of the algorithms have not yet been settled.
An apriori algorithm is designed to efficiently identify association rules in large databases and is the most classical algorithm for mining frequent item sets [55]. This algorithm has recently been used for the analysis of AEs in the JADER and the US Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS), and its usefulness for pharmacovigilance has been confirmed [29-31]. Therefore, we used an apriori algorithm.
The numerous known risk factors for TE in women are as follows: advanced age [5,6,11], high body mass index [14,47,56-60], smoking [20,61-65], breast cancer, migraine, hypertension, and medical history of a cardiovascular event [1,66]. CEP use should be discouraged among women older than 35 years who smoke because they have an increased risk of arterial vascular disease when using CEPs [10]. CEP users should have their blood pressure routinely monitored and smoking cessation should be encouraged in older women. Clinicians should monitor for any symptoms suggestive of stroke, myocardial infarction, or venous thrombosis and discontinue the agent immediately if any symptoms occur during the first 3 months of CEP use. From our results, Nor-EE users should be closely monitored for the first 2 to 3 weeks. Regarding the prescribing of CEPs, clinicians should consider a woman's risk factors for TE. The choice of an appropriate CEP should be made by considering the need to minimize the risk of TE, patient preference, and available alternatives.
Like the JADER database, the FAERS database is an SRS and is the largest and best-known AE database worldwide. Therefore, the FDA uses it for pharmacovigilance activities, such as looking for new safety concerns that might be related to a drug. The FAERS database files are publicly available on the FDA web site (open.fda.gov/data/faers/) [33]. FAERS includes information about the country where the AEs occurred. From our preliminary analysis of the FAERS database from April 2004 to November 2014, the total number of reported cases in the FAERS database was 6,165,659 and the number of reports from the US and Japan was 3,652,497 (59.2%) and 275,268 (4.5%), respectively (detailed data not shown). The number of reported AEs in the JADER (338,224 in this study) was greater than that in the FAERS (275,268 from Japan). Nomura et al. reported that there are differences in the reported number of AEs between JADER and FAERS, but which reports were common to FAERS and JADER was uncertain [67]. SRS databases mostly depend on the compliance of pharmaceutical companies to report according to regulatory requirements. Each company has its own operational rules for AE reports, which makes it impossible for researchers to validate the contents of SRS databases [67]. Regional differences in drug prescriptions or genetic backgrounds may be related to AEs. However, we did not analyze this issue further.
The JADER database does not contain detailed background information regarding patients' body mass index, smoking, or accurate medical history, such as migraine and cardiovascular disease. Furthermore, SRS has several limitations, including under-reporting, over-reporting, missing data, bias, confounding factors, and lack of a control population as a reference group [34]. Further epidemiological studies for confirmation might be required.
Several pharmacovigilance indexes have been developed to detect drug-associated AEs, including the ROR used by the PMDA and the Netherlands Pharmacovigilance Centre (Lareb), the PRR used by the Medicines and Healthcare Products Regulatory Agency in the United Kingdom (UK), the information component (IC) used by WHO, and the empirical Bayes geometric mean (EBGM) used by the FDA. The multi-item gamma Poisson shrinker (MGPS) method is a disproportionality method that utilizes an empirical Bayesian model to detect the magnitude of drug-event associations in drug safety databases [68,69]. MGPS calculates adjusted reporting ratios for pairs of drug-event combinations. The adjusted reporting ratio values are termed the EBGM. Although many studies regarding the performance, accuracy, and reliability of different data mining algorithms are in progress, there is no recognized gold standard methodology. We did not perform an analysis using the EBGM, but this might be a future consideration.
The ROR is defined as the ratio of the odds of reporting of one specific event versus all other events for a given drug compared to the reporting odds for all other drugs present in the database. Basically, the higher the value, the stronger the disproportion appears to be. The ROR indicates an increased risk of AE reporting and not a risk of AE occurrence. Therefore, the ROR does not allow risk quantification, but only offers a rough indication of signal strength and is only relevant to the hypothesis [24,33,34]. The ROR is a clear and easily applicable technique that allows for the control of confounding factors through logistic regression analysis [27,70-72]. An additional advantage of using the ROR is that non-selective underreporting of a drug or AE has no influence on the value of the ROR compared with the population of patients experiencing an AE [73]. Therefore, we first selected the ROR as the pharmacovigilance index in this study.
ROR and PRR are both measures of disproportionality used to detect signals in SRS databases. In our study, the tendencies of the results from the RORs and the PRRs for signal detection were similar. Evans et al. suggested that the PRR might be much less error prone than the ROR [35]. In contrast, Rothman et al. proposed that SRS should be treated as a data source for a case-control study, thereby excluding from the control series those events that may be related to drug exposure. Therefore, the ROR may offer an advantage over PRR by estimating the relative risk [74]. However, this apparent superiority has been called into question [75]. Van Puijenbroek et al. concluded that, in practice, there is no important difference between the ROR and PRR measures for pharmacovigilance [34]. A judgment on the validity and utility of these measures should be based on comparison of their sensitivity, specificity, and predictive values in signal detection from a real dataset.
The aforementioned limitations inherent to the SRS should be recognized in the interpretation of the results from the JADER database. We stress that our results do not provide any justification for the restriction of CEP use because the benefits and tolerability of CEPs have been accepted worldwide.

Conclusion
This study was the first to evaluate the correlation between CEPs and TE using an SRS analysis strategy. Despite the limitations inherent to SRS, we showed the potential risk of TE during CEP use in a real-life setting. The present analysis suggested that Nor-EE users should be closely monitored for TE with a short time to onset (within 3 weeks). Patients with anemia who are using CEPs might be advised to adhere to an appropriate care plan. We recommend the close monitoring of patients, and those who experience any symptoms suggestive of TE should be advised to discontinue administration.