Explainable detection of adverse drug reaction with imbalanced data distribution

doi:10.1371/journal.pcbi.1010144


Fig 3

Interpretability analysis of the selected examples for the proposed weighted BERT-CRF model.

Green and red indicate that a portion of the text contributed positively or negatively, respectively, to the classification of the target label. The weights are interpreted by applying them to the prediction probabilities. (A) Example 1 (target = tired). If the tokens go and bed were removed from the text, the classifier would be expected to predict tired as <B-ADR> with a probability of 0.95 − 0.33 − 0.31 = 0.31. Thus, the tokens go and bed can be regarded as indicators of ADR. Compared with the model without a weighting strategy, the proposed model can accurately predict the ADR label based on this local information. (B) Example 2 (target = weight). The proposed model predicted <B-ADR> and <I-ADR> for the tokens gain and weight. The word pristiq is a strong indicator that those tokens are ADR, which suggests that, in the dataset, pristiq is often a drug that may cause an ADR. In contrast, the model without the weighted loss function tends to miss both the <B-ADR> and <I-ADR> labels for the tokens gain and weight, even though there is a strong indicator, pristiq. Because that model uses cross-entropy as its loss function, it tends to predict <O> for all tokens to achieve the lowest cross-entropy value.
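The arithmetic in panel (A) can be reproduced with a minimal sketch, assuming the explanation is a local linear one in which each token's weight is subtracted from the predicted probability when that token is removed. The function name `probability_without` and the assignment of 0.33 to go and 0.31 to bed are illustrative assumptions; the caption gives the two weights but does not say which token carries which.

```python
def probability_without(base_prob, token_weights, removed):
    """Estimate the predicted probability after removing tokens.

    Assumes a local linear explanation: each removed token's weight is
    subtracted from the base probability. The result is clipped to [0, 1].
    """
    adjusted = base_prob - sum(token_weights[t] for t in removed)
    return min(1.0, max(0.0, adjusted))


# Values from panel (A); the token-to-weight mapping is an assumption.
weights = {"go": 0.33, "bed": 0.31}
estimate = probability_without(0.95, weights, ["go", "bed"])
print(round(estimate, 2))  # 0.31
```

Under this reading, removing both tokens lowers the probability of <B-ADR> for tired from 0.95 to about 0.31, which is why the two tokens are treated as indicators of ADR.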

doi: https://doi.org/10.1371/journal.pcbi.1010144.g003