Imputation of adverse drug reactions: Causality assessment in hospitals

  • Fabiana Rossi Varallo ,

    Contributed equally to this work with: Fabiana Rossi Varallo, Cleopatra S. Planeta

    Affiliations São Paulo State University (UNESP), School of Pharmaceutical Sciences, Araraquara, São Paulo, Brazil, CAPES Foundation, Ministry of Education of Brazil, Brasília—DF, Brazil

  • Cleopatra S. Planeta ,

    Contributed equally to this work with: Fabiana Rossi Varallo, Cleopatra S. Planeta

    Affiliation São Paulo State University (UNESP), School of Pharmaceutical Sciences, Araraquara, São Paulo, Brazil

  • Maria Teresa Herdeiro ,

    ‡ These authors also contributed equally to this work.

    Affiliation Departamento de Ciências Médicas—Universidade de Aveiro, Aveiro, Portugal

  • Patricia de Carvalho Mastroianni

    ‡ These authors also contributed equally to this work.

    Affiliation São Paulo State University (UNESP), School of Pharmaceutical Sciences, Araraquara, São Paulo, Brazil


Background & objectives

Different algorithms have been developed to standardize the causality assessment of adverse drug reactions (ADRs). Although most share common characteristics, the results of the causality assessment vary depending on the algorithm used. Therefore, using 10 different algorithms, this study aimed to compare inter-rater and multi-rater agreement for ADR causality assessment and to identify the algorithm most consistent for use in hospitals.


Methods

Using ten causality algorithms, four judges independently assessed the first 44 cases of ADRs reported during the first year of implementation of a risk management service in a medium-complexity hospital in the state of São Paulo (Brazil). Owing to variations in the terminology used for causality, the equivalent imputation terms were grouped into four categories: definite, probable, possible and unlikely. Inter-rater and multi-rater agreement were analyzed by calculating Cohen's and Light's kappa coefficients, respectively.


Results

None of the algorithms showed 100% reproducibility in the causal imputation. Fair inter-rater and multi-rater agreement was found. The Emanueli (1984) and WHO-UMC (2010) algorithms showed a fair rate of agreement between the judges (k = 0.36).

Interpretation & conclusions

Although the ADR causality assessment algorithms were poorly reproducible, our data suggest that the WHO-UMC algorithm is the most consistent for imputation in hospitals, since it allows the quality of the report to be evaluated. However, to improve the ability to assess causality using algorithms, it is necessary to include criteria for the evaluation of drug-related problems, which may act as confounding variables that lead to underestimation of the causal association.

1 Introduction

Adverse drug reaction (ADR) causality assessment is a routine procedure in pharmacovigilance[1] because it allows the assessment of drug safety parameters and of the likelihood of a relationship between drug exposure and the occurrence of ADRs to health technologies in the post-marketing period.

Since the 1970s, different methods to standardize the evaluation of the causal association of ADRs have been available, ranging from small questionnaires to comprehensive algorithms[2].

The development of these tools, which are simple to use[3] and require minimal expertise[1,4], aims to address methodological bias, reliability, and validity issues in the imputation of drug-induced adverse effects[5]. However, their main advantage is the possibility of decentralizing the causality assessment from the medical diagnosis, extending it to different health care levels: academia, the pharmaceutical industry, and health agencies[6].

Standardizing ADR causality assessment does not reduce the uncertainty of the association between a drug and an adverse event; rather, it categorizes that uncertainty semi-quantitatively[2] into different levels of probability.

Establishing a causal link may influence the rationale for the correlation of an event that occurs to drug consumers[7]; therefore, the results of the causality assessments using algorithms must be reproducible. This is important to ratify the viability of their employment in pharmacovigilance[8], as well as their capacity to detect ADR signals[9,10]. This is because the higher the agreement on a defined ADR causal link, the more robust the hypothesis about the relationship between the use of a medication and the adverse event observed, allowing the communication of the risk and, therefore, the implementation of risk minimization and patient safety plans.

Because serious ADRs lead to hospitalization, it is necessary to assess causality at the tertiary health care level. However, there are few data about the agreement on ADR causality assessment using different algorithms in patients hospitalized in internal medicine units in developing countries.

It is known that most ADR evidence arises from hospitals, owing to the high risks associated with treatments at the tertiary health care level[11]. Therefore, causality assessment in high-complexity institutions contributes to: i) the early recognition of adverse effects, which helps to prevent iatrogenic complications; ii) therapy optimization[2]; iii) establishing barriers to prevent recurrence; iv) reducing the length of hospitalization and the unnecessary burden of hospitalizations that could be avoided[12].

This study aimed to compare the results of the imputation of ADRs using different algorithms in a Brazilian public hospital, in order to identify the most appropriate one for establishing causal associations between medication use and the occurrence of adverse events.

2 Material and methods

2.1 Study design

We assessed the causality of all of the ADRs reported by health professionals during the first year of implementation of the pharmacovigilance service (March 2012 until March 2013) in a general assistance, public, medium complexity (secondary health care level) hospital with 104 beds located in the state of São Paulo.

2.2 Selection of algorithms

Twenty-nine (29) algorithms for ADR causality assessment were identified by literature review. Nineteen (19) were excluded for the following reasons: absence of equivalent terminology for the level of imputation of ADRs (n = 6); inclusion of information that is not required for the causality assessment in Brazil (n = 3); tools that were developed for the assessment of specific ADRs (n = 3); and no access to the article (n = 7).

The ten algorithms (Table 1) considered eligible for the study included the combination of five main criteria for the causality assessment[13], namely: i) plausible temporality; ii) prior bibliographic description of the adverse effects related to the use of the drug involved; iii) alternative causes; iv) positive withdrawal (discontinuation of the drug with improvement of the ADR); v) positive rechallenge (reintroduction of the drug with reappearance of the ADR).

Table 1. Algorithms selected for the causality assessment of adverse drug reactions in a public and general hospital in the State of Sao Paulo, Brazil (n = 10).

Owing to the quantitative and qualitative variability in the terminologies used to express the results of the imputation of ADRs in the included algorithms, the nomenclature developed by Macedo et al. (2005)[13] was used. To improve the accuracy of the comparison, the equivalent terms of the likelihood level were grouped into four major categories: definite, probable, possible and unlikely.
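The grouping step described above can be pictured as a lookup from each algorithm's own term to one of the four common categories. The sketch below is illustrative only: the specific equivalences are an assumption (the study follows the nomenclature of Macedo et al. (2005)[13], which is not reproduced in this text), and the names `EQUIVALENT_TERMS` and `harmonize` are hypothetical.

```python
# Illustrative only: this mapping is an assumption, not the table used in the study.
COMMON_CATEGORIES = ("definite", "probable", "possible", "unlikely")

# Hypothetical equivalences between algorithm-specific terms and the common categories.
EQUIVALENT_TERMS = {
    "definite": "definite",
    "certain": "definite",
    "probable": "probable",
    "likely": "probable",
    "possible": "possible",
    "unlikely": "unlikely",
    "doubtful": "unlikely",
}


def harmonize(term):
    """Map an algorithm-specific causality term to one of the four common categories."""
    key = term.strip().lower()
    if key not in EQUIVALENT_TERMS:
        raise ValueError(f"no equivalent category assumed for term: {term!r}")
    return EQUIVALENT_TERMS[key]


print([harmonize(t) for t in ("Certain", "Doubtful", "possible")])
```

Harmonizing the terms before computing agreement means that two judges are compared on the same four-level scale regardless of which algorithm produced the rating.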

2.3 Causality assessment

Using 10 causality algorithms (Table 1), four judges, FRV (rater A), ADFS (rater B), SPS (rater C) and IO (rater D), independently assessed the first 44 cases of ADRs reported to the hospital's risk management service during its first year of implementation.

The group of judges who conducted the analysis included a clinical pharmacist of the hospital (rater A), who had a PhD in Pharmaceutical Sciences and 8 years of professional experience with pharmacovigilance issues, and three pharmacy undergraduate students (raters B, C and D), who were in the last year of the course and had at least 1 year of previous experience in scientific research on pharmacovigilance. The students were trained in order to standardize the analysis of causal association. The 12-hour training included: 1) discussion of scientific papers on the subject (evaluation of ADR causality; evaluation of ADR causality with different decision algorithms; application of Austin Bradford-Hill's criteria in pharmacoepidemiological studies); 2) directed study (comparison and critical analysis) of the algorithms used; 3) simulation of an ADR causality assessment with a fictional case[14].

The cases of ADRs reported and selected for the study contained at least the following information: i) suspected drug (start and end date); ii) a brief description of the event (start and end date, data of laboratory tests when relevant); iii) polypharmacy (start and end date); iv) the patient’s medical history; v) relevant interventions.

We defined an ADR as any noxious, unintended, or undesired effect of a drug occurring at doses used in humans for prophylaxis, diagnosis, or therapy[15].

The clinical manifestations reported were classified according to seriousness and expectedness. Serious ADRs were defined as those causing hospitalization, those that were fatal or life-threatening, or those that resulted in significant changes in patient treatment (thereby prolonging hospitalization)[16].

Drug information sheets approved by the National Agency of Sanitary Surveillance (ANVISA) and monographs, such as those in DRUGDEX (MICROMEDEX® database), the UpToDate® database and Lexi-Comp Manole (2009), were consulted to verify the expectedness of the ADRs.

The results of imputation obtained with the ten algorithms were compared to analyze the agreement between the judges and the feasibility of the algorithms in the causality assessment in hospitals.

2.4 Statistical analysis

Two descriptive statistics were used to measure the nominal agreement between two or more raters: Cohen's kappa and Light's kappa.

Cohen's kappa measures the degree of concordance between two judges. The analysis carried out by FRV (rater A) was considered the gold standard for calculating the inter-rater agreement with each of judges B, C and D.
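As a rough illustration of the two-rater statistic, Cohen's kappa compares the observed proportion of agreement with the agreement expected by chance: kappa = (p_o − p_e) / (1 − p_e). The sketch below is a minimal pure-Python version under stated assumptions: the ratings are hypothetical, the function name `cohen_kappa` is ours, and it is not the software the authors used.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters over the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(ratings_a)
    # Observed agreement: share of cases with identical categories.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined,
        # and returning 1.0 here is a convention assumed for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings for six cases (not data from the study).
rater_a = ["definite", "probable", "probable", "possible", "unlikely", "possible"]
rater_b = ["definite", "probable", "possible", "possible", "unlikely", "probable"]
print(round(cohen_kappa(rater_a, rater_b), 2))
```

In the design described above, rater A would be passed as the reference in each of the three comparisons (A versus B, A versus C, A versus D).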

Light's kappa is a multi-rater statistic that measures the degree of concordance among multiple judges without a gold standard. It is an extension of Cohen's kappa. For both tests, we adopted a significance level of α = 0.05 and 95% confidence intervals. Values were interpreted according to the Landis and Koch (1977) scale[17] (Table 2).
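Light's kappa is commonly computed as the mean of Cohen's kappa over all pairs of raters, and the Landis and Koch scale assigns a verbal label to each range of kappa. The sketch below illustrates both steps under stated assumptions: the ratings are hypothetical, the function names are ours, and values falling exactly on a boundary are assigned to the lower band.

```python
from collections import Counter
from itertools import combinations
from statistics import mean


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equally long rating lists."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    fa, fb = Counter(a), Counter(b)
    expected = sum(fa[c] * fb[c] for c in set(fa) | set(fb)) / (n * n)
    # kappa is undefined when expected == 1; 1.0 is a convention assumed here.
    return 1.0 if expected == 1.0 else (observed - expected) / (1.0 - expected)


def light_kappa(ratings_by_rater):
    """Mean of Cohen's kappa over every pair of raters (no gold standard)."""
    if len(ratings_by_rater) < 2:
        raise ValueError("at least two raters are required")
    pairs = combinations(ratings_by_rater.values(), 2)
    return mean(cohen_kappa(x, y) for x, y in pairs)


def landis_koch(kappa):
    """Verbal label for a kappa value on the Landis and Koch (1977) scale."""
    if kappa < 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical ratings of four cases by three raters (not data from the study).
ratings = {
    "A": ["definite", "probable", "possible", "unlikely"],
    "B": ["definite", "probable", "possible", "unlikely"],
    "C": ["definite", "probable", "possible", "possible"],
}
k = light_kappa(ratings)
print(round(k, 2), landis_koch(k))
```

On this scale, the coefficients reported in this study (for example k = 0.15 and k = 0.36) fall in the "slight" and "fair" bands, respectively.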

2.5 Research ethics committee

This study (E-015/10 protocol) was approved by the Research Ethics Committee of the Instituto Lauro de Souza Lima.

3 Results

During the period of data collection, the risk management department received 24 ADR reports comprising 36 different types of clinical manifestations attributed to 19 drugs (Table 3). Because the causality imputation was carried out case by case, each judge independently assessed 44 cases, since a single report may describe more than one clinical manifestation associated with a single drug or may indicate more than one suspected drug for a single clinical manifestation.
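The step from reports to cases can be sketched as pairing each suspected drug with each clinical manifestation within a report. The example below uses hypothetical reports and field names; treating a report with several drugs and several manifestations as a full cross product is an assumption of this sketch, not something stated in the text.

```python
from itertools import product

# Hypothetical reports (not data from the study).
reports = [
    {"id": "R1", "drugs": ["drug A"], "manifestations": ["rash", "pruritus"]},
    {"id": "R2", "drugs": ["drug B", "drug C"], "manifestations": ["nausea"]},
    {"id": "R3", "drugs": ["drug D"], "manifestations": ["headache"]},
]


def expand_cases(reports):
    """Return one (report id, drug, manifestation) case per drug-manifestation pair."""
    return [
        (report["id"], drug, manifestation)
        for report in reports
        for drug, manifestation in product(report["drugs"], report["manifestations"])
    ]


cases = expand_cases(reports)
print(len(reports), "reports ->", len(cases), "cases")
```

This is why the number of cases each judge assessed (44) exceeds the number of reports received (24).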

Table 3. Characteristics of the adverse drug reactions (ADR) analyzed, according to patient characteristics, expectedness, frequency and seriousness of the event (n = 44).

Regarding seriousness, seven ADR reports described symptoms classified as serious: 4 prolonged the length of hospital stay, 2 resulted in temporary disability and 1 was related to hospital admission.

After causality assessment, none of the algorithms showed 100% agreement between judges on the imputation of ADRs. Fair agreement was observed for both statistical tests (Cohen's and Light's kappa) (Table 4). These findings suggest poor reproducibility of the algorithms when ADR imputation is performed by different judges.

Table 4. Inter-rater and multi-rater agreement in adverse drug reaction causality assessment, according to the statistical analysis with Cohen's and Light's kappa.

The lowest inter-rater agreement was observed for judge B, except for the Emanueli (1984), Jones (1986) and WHO-UMC (2010) algorithms (Table 4).

The Venulet (k = 0.15), Kramer (k = 0.19), and Naranjo (k = 0.20) algorithms showed the lowest multi-rater coefficients, indicating slight agreement on causal association. By contrast, the agreement was stronger for the Emanueli (k = 0.36), Mashford (1984) and WHO-UMC (k = 0.36) algorithms, but not better than fair (Table 4).

4 Discussion

Our data suggest that the WHO-UMC algorithm is the most consistent for causal imputation of ADRs affecting patients admitted to an internal medicine unit of a medium-complexity hospital. The advantage of this tool is the semi-quantitative assessment of the causal likelihood and of the quality of the report; it has been used as a gold standard in causality studies[8,13]. Moreover, this tool was developed to evaluate the occurrence of adverse effects during the post-marketing period, which helps to achieve higher probability scores of causal association and better reproducibility between the judges.

The Emanueli (1984) algorithm, which has a minimalist, simplified, dichotomous structure that considers only the clinical condition of the patient as an alternative cause, may overestimate the cases of ADR and generate false-positive signals in risk communication, which is why it is not the most recommended for causality assessment in the context of this study.

For the remaining algorithms, we noted weak agreement between the judges on ADR causality. Studies have shown great variability in the results of imputation of ADRs using different algorithms[8,13,18–22]. According to Shakir and Layton (2002)[10], the tools are inconsistent and sometimes of poor quality for signal detection. Furthermore, they have significant limitations that reduce the accuracy and reliability of the assessment of the probability of ADR[1].

Regarding the Naranjo et al. (1981) algorithm, data from a previous study showed slight agreement between the judges[21], because it was developed and validated for the assessment of ADRs that occur during randomized clinical trials[19]. Other authors suggest the use of this tool for the imputation of ADRs[21] because of its rapid implementation. However, we do not agree that this is the only factor to consider when choosing an algorithm. The reliability of the results and the limitations of each tool, especially in the context of medication use (clinical trial versus post-marketing surveillance), should also be considered. According to the data from our study, WHO-UMC (2010) partially meets these criteria.

We consider that it meets these criteria only partially because none of the analyzed algorithms includes other factors that may be associated with adverse events, such as medication errors, product quality deviations, and suspected therapeutic ineffectiveness. Most consider only drug safety issues in the assessment and either neglect (Emanueli, 1984; Blanc et al., 1979; Gallagher et al., 2011) or ambiguously (Karch and Lasagna, 1977), subjectively (Naranjo et al., 1981; Mashford, 1984; WHO-UMC, 2010) or complexly (Kramer et al., 1979; Venulet et al., 1986) manage these other factors. Even the Gallagher et al. (2011) algorithm, which was developed after the new definition of pharmacovigilance in 2002, did not include relevant information in the assessment, allowing underestimation of a causal association. This may also be correlated with the low agreement between the judges.

The arbitrary weighting given to the evaluation criteria is another limitation that may contribute to the inconsistency of algorithms[5,23]. This introduces subjectivity into the algorithm structure itself, since the criteria that the authors of each algorithm deem most important receive greater weight in the scoring. The causality assessment itself also includes some subjectivity[2,6]. Both situations may contribute to the poor agreement between the algorithms in the imputation of the causal link of ADRs.

Another factor that may decrease the accuracy of risk communication is the lack of good-quality ADR reports, together with underreporting[10,24,25]. Poor or missing information in the reports makes it difficult to assess causality in detail, to differentiate between probable and possible cases[2], and to find a definitive causal association. Consequently, the assumptions are not robust enough to generate signals in pharmacovigilance, impairing the assessment of drug safety in the post-marketing period.

Currently, there is no gold-standard algorithm for the assessment of events occurring in primary care[6,26] or in studies that compared the imputation of ADRs reported to national pharmacovigilance centers[13,23]. At the tertiary health care level, Kane-Gill et al. (2012)[20] found strong agreement when comparing three algorithms by active search of retrospective cases of ADR in the intensive care unit. This can be explained by the ward where the study was conducted, the methodology (one judge) and the active search method. Critical patients are constantly monitored, so the records in medical charts are more complete, which allows the collection of better information and increases the robustness of causality assessment. However, the disadvantage of the active search is the time necessary to review medical records[27], which makes the process unfeasible.

Considering the limitations described, there is evidence that it is necessary to develop better quality tools that improve the diagnosis of ADR[26]. A strategy to increase the generation of pharmacovigilance signals is the monitoring of adverse drug events[28], and its evaluation criteria should be included in the algorithms to improve the reproducibility in the causal imputation. These criteria involve the assessment of any drug-related problem.

Therefore, in an attempt to minimize the described flaws and confounding variables during the assessment, the need, effectiveness, adherence and safety parameters should also be included in the algorithms in order to update the assessment with the new concepts of WHO (2002)[29] about post-marketing studies.

Considering that drug use[28,30] and intentional non-compliance with pharmacotherapy related to the diagnosis of stereotypical diseases[31] are associated with undesirable effects, it is also necessary to evaluate the impact of ADRs on the patient. This would help to identify the priorities, difficulties and factors that may motivate the use or discontinuation of therapy and, therefore, the occurrence of undesirable effects derived from these perceptions and practices.

Finally, although the literature presents a wide range of methods for causality assessment, including computational approaches, algorithms are still viable alternatives for causality assessment in hospitals, since these tools are easy to use, require few financial resources to be applied in the clinical routine and need minimal expertise[18]. Thus, it is important to update these algorithms in accordance with the new definition of pharmacovigilance, allowing the monitoring of adverse drug events[26], in order to minimize confounding variables associated with the causal imputation process and therefore improve the risk/benefit assessment of medications available on the market.

5 Conclusion

Our data show slight agreement on the ADR causality assessment for the majority of the tested algorithms. However, the WHO-UMC (2010) algorithm showed fair reproducibility and allows the analysis of the quality of the report, which is why we suggest that it is the best tool for causality assessment of ADRs occurring in hospitals. Since the Naranjo algorithm was developed and validated to diagnose ADRs occurring in randomized clinical trials and showed slight concordance between the judges, this tool is not the most consistent for the assessment of ADRs that affect non-critical patients in a secondary hospital. In addition, the data demonstrate the need to develop better-quality tools that include other criteria for the assessment of drug-related problems, such as effectiveness, safety, compliance, quality or quality deviation and medication errors.


Acknowledgments

The authors thank the CAPES Foundation, Ministry of Education of Brazil, for the scholarship (PDSE), grant no. 014301/2013-00. The authors would also like to thank the São Paulo Research Foundation (FAPESP) for the financial support of this project under grant #2013/10263-9, the Programa de Apoio ao Desenvolvimento Científico da Faculdade de Ciências Farmacêuticas da UNESP-PADC, and the Instituto de Biomedicina (iBiMED), FCT Ref. no. UID/BIM/04501/2013. We are also thankful to the Hospital Estadual Américo Brasiliense, which allowed its data to be collected.

Author Contributions

  1. Conceptualization: FRV CSP MTH PCM.
  2. Data curation: FRV CSP MTH PCM.
  3. Formal analysis: FRV CSP MTH PCM.
  4. Funding acquisition: FRV PCM.
  5. Investigation: FRV CSP MTH PCM.
  6. Methodology: FRV CSP MTH PCM.
  7. Project administration: CSP PCM.
  8. Supervision: PCM MTH.
  9. Validation: FRV CSP MTH PCM.
  10. Visualization: FRV PCM.
  11. Writing – original draft: FRV CSP.
  12. Writing – review & editing: PCM MTH.


References

  1. WHO-UMC. The use of the WHO-UMC system for standardised case causality assessment. 2010. [Last accessed on 2015 Nov 16]. Available from:
  2. Meyboom RH, Hekster YA, Egberts ACG, Gribnau FWJ, Edwards RI. Causal or casual? The role of causality assessment in pharmacovigilance. Drug Safety 1997; 17 (6): 374–389. pmid:9429837
  3. Théophile H, André M, Arimone Y, Haramburu F, Miremont-Salamé G, Bégaud B. An updated method improved the assessment of adverse drug reaction in routine pharmacovigilance. Journal of Clinical Epidemiology 2012; 65: 1069–1077. pmid:22910538
  4. Coloma PM, Avillach P, Salvo F, Schuemie MJ, Ferrajolo C, Pariente A, et al. A reference standard for evaluation of methods for drug safety signal detection using electronic healthcare record databases. Drug Safety 2013; 36: 13–23. pmid:23315292
  5. Doherty MJ. Algorithms for assessing the probability of an Adverse Drug Reaction. Respiratory Medicine CME 2009; 2: 63–67.
  6. Agbabiaka TB, Savovic J, Ernst E. Methods for causality assessment of adverse drug reactions: a systematic review. Drug Safety 2008; 31 (1): 21–37. pmid:18095744
  7. Meyboom RHB, Egberts ACG, Gribnau FWJ, Hekster YA. Pharmacovigilance in perspective. Drug Safety 1999; 21 (6): 429–447. pmid:10612268
  8. Macedo AF, Marques FB, Ribeiro CF, Teixeira F. Causality assessment of adverse drug reactions: comparison of the results obtained from published decisional algorithms and from the evaluations of an expert panel, according to different levels of imputability. Journal of Clinical Pharmacy and Therapeutics 2003; 28: 137–143. pmid:12713611
  9. OPS. Buenas Prácticas de Farmacovigilancia para las Américas. Washington, D. C.: OPS, © 2011. (Red PARF Documento Técnico No. 5). 78 pages.
  10. Shakir SAW, Layton D. Causal association in pharmacovigilance and pharmacoepidemiology thoughts on the application of the Austin Bradford-Hill criteria. Drug Safety 2002; 25 (6): 467–471. pmid:12071785
  11. Mugosa S, Bukumirić Z, Kovacević A, Bosković A, Protić D, Todorović Z. Adverse drug reactions in hospitalized cardiac patients: characteristics and risk factors. Vojnosanit Pregl. 2015;72(11):975–81. pmid:26731971
  12. Angamo MT, Chalmers L, Curtain CM, Bereznicki LR. Adverse-Drug-Reaction-Related Hospitalisations in Developed and Developing Countries: A Review of Prevalence and Contributing Factors. Drug Saf. 2016;39(9):847–57. pmid:27449638
  13. Macedo AF, Marques FB, Ribeiro CT, Teixeira F. Causality assessment of adverse drug reactions: comparison of the results obtained from published decisional algorithms and from the evaluations of an expert panel. Pharmacoepidemiology and Drug Safety 2005; 14: 885–890. pmid:16059869
  14. Farcas A, Bojita M. Adverse Drug Reactions in Clinical Practice: a Causality Assessment of a Case of Drug-Induced Pancreatitis. The Journal of Gastrointestinal and Liver Diseases 2009; 18: 353–358. pmid:19795031
  15. WHO. International drug monitoring: the role of the national centers. WHO Technical Report Series n. 498. Geneva: WHO, 1972.
  16. Moore N, Lecointre D, Noblet C, Mabille M. Frequency and cost of serious adverse drug reactions in a department of general medicine. Br J Clin Pharmacol 1998; 45: 301–308.
  17. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977 Mar;33(1):159–74. pmid:843571
  18. Kane-Gill SL, Devlin JW. Adverse drug event reporting in the intensive care unit: a survey of current practices. Annals of Pharmacotherapy 2006; 40: 1267–1273. pmid:16849619
  19. Davies EC, Rowe PH, James S, Nickless G, Ganguli A, Danjuma M, et al. An investigation of disagreement in causality assessment of adverse drug reactions. Pharmaceutical Medicine 2011; 25 (1): 17–24.
  20. Kane-Gill SL, Forsberg EA, Verrico MM, Handler SM. Comparison of three pharmacovigilance algorithms in the ICU setting: a retrospective and prospective evaluation of ADRS. Drug Safety 2012; 35 (8): 645–653. pmid:22720659
  21. Belhekar MN, Taur SF, Munshi RP. A study of agreement between the Naranjo algorithm and WHO-UMC criteria for causality assessment of adverse drug reactions. Indian J Pharmacol 2014; 46 (1): 117–120. pmid:24550597
  22. Théophile H, André M, Miremont-Salamé G, Arimone Y, Bégaud B. Comparison of three methods (an updated logistic probabilistic method, the Naranjo and Liverpool algorithms) for the evaluation of routine pharmacovigilance case reports using consensual expert judgments as reference. Drug Safety 2013; 36: 1033–1044. pmid:23828659
  23. Arimone Y, Bégaud B, Miremont-Salamé G, Fourrier-Réglat A, Moore N, Molimard M, et al. Agreement of expert judgment in causality assessment of adverse drug reactions. Eur J Clin Pharmacol 2005; 61: 169–173.
  24. Edwards R. An agenda for UK clinical pharmacology pharmacovigilance. British Journal of Clinical Pharmacology 2012; 73 (6): 979–982. pmid:22360774
  25. Pal SN, Duncombe C, Falzon D, Olsson S. WHO Strategy for Collecting Safety Data in Public Health Programmes: Complementing Spontaneous Reporting Systems. Drug Safety 2013; 36: 75–81. pmid:23329541
  26. Khan LM, Al-Harthi SE, Osman AM, Sattar MAAA, Ali AS. Dilemmas of the causality assessment tools in the diagnosis of adverse drug reactions. Saudi Pharm J. 2016;24(4):485–93. pmid:27330379
  27. Classen DC, Resar R, Griffin F, Federico F, Frankel T, Kimmel N, et al. 'Global trigger tool' shows that adverse events in hospitals may be ten times greater than previously measured. Health Aff (Millwood). 2011;30(4):581–9.
  28. Seeger JD. Future proofing adverse event monitoring. Drug Safety 2015; 38 (10): 847–1048. pmid:26323240
  29. World Health Organization (WHO). The importance of pharmacovigilance. Geneva: WHO, 2002, 48 pages.
  30. De Vries ST, Keers JC, Visser R, de Zeeuw D, Haaijer-Ruskamp FM, Voorham J, et al. Medications beliefs, treatment complexity, and non-adherence to different drug classes in patients with type 2 diabetes. J Psychosom Res 2014; 76 (2): 134–138.
  31. Kamarulzaman A, Altice FL. Challenges in managing HIV in people who use drugs. Curr Opin Infect Dis 2015; 28 (1): 10–16.