Fig 1.
Workflow of the offensive language detection methodology for the Persian language.
Table 1.
Shared tasks on abusive language identification across different types and languages.
Fig 2.
Paper structure diagram.
Table 2.
Distribution of annotated data in three levels of annotation schema.
A set of 6,000 tweets out of the 520,000 sampled instances is randomly selected for the annotation process.
Fig 3.
Tweet samples (original and translated) from the annotated data with their categories for each level of the annotation schema.
Table 3.
Baseline ML models.
Table 4.
Baseline DL models.
Table 5.
Description of the transformer-based neural network models used for identifying offensive language in Persian.
Fig 4.
Diagram of stacking with K-fold cross-validation.
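The stacking scheme in the diagram can be sketched as follows; this is a minimal illustration using scikit-learn with placeholder base models (GaussianNB, a decision tree) and a logistic-regression meta-classifier, not the paper's exact classifiers or data. Each base model produces out-of-fold probabilities, which become the features of the meta-classifier.

```python
# Minimal sketch of stacking with K-fold cross-validation.
# Base models and data are illustrative, not the paper's setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
base_models = [GaussianNB(), DecisionTreeClassifier(random_state=0)]
kf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Out-of-fold predicted probabilities from each base model
# become the meta-features for the stacking classifier.
meta_features = np.zeros((len(y), len(base_models)))
for train_idx, test_idx in kf.split(X, y):
    for j, model in enumerate(base_models):
        model.fit(X[train_idx], y[train_idx])
        meta_features[test_idx, j] = model.predict_proba(X[test_idx])[:, 1]

# The meta-classifier is trained on the stacked out-of-fold probabilities,
# so it never sees probabilities produced on a model's own training fold.
meta_clf = LogisticRegression().fit(meta_features, y)
```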
Fig 5.
Preprocessing steps of the dataset.
Table 6.
Results for offensive language identification (first level).
The bold and underlined numbers represent the best and second-best scores, respectively, in each category: classical ML, DL, and transformer-based neural networks.
Table 7.
Results for targeted offensive language identification (second level).
The bold and underlined numbers represent the best and second-best scores, respectively, in each category: classical ML, DL, and transformer-based neural networks.
Table 8.
Results for target type of offensive language identification (third level).
The bold and underlined numbers represent the best and second-best scores, respectively, in each category: classical ML, DL, and transformer-based neural networks.
Fig 6.
Pairwise Pearson correlation coefficients between the predicted probabilities of different single classifiers on the out-of-fold test set.
First level (a) shows the correlation between the output predictions of classifiers trained on offensive vs. non-offensive annotated data. Second level (b) shows the correlation between the output predictions of classifiers trained on targeted vs. untargeted samples. Third level (c) shows the correlation between the output predictions of classifiers trained on targeted offensive samples directed at an individual or a group.
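Such a pairwise correlation matrix can be computed directly from the classifiers' out-of-fold probability vectors; the sketch below uses NumPy's `corrcoef` with randomly generated toy probabilities (the classifier names are placeholders, not the paper's models).

```python
# Minimal sketch: pairwise Pearson correlation between classifiers'
# out-of-fold predicted probabilities. Values here are synthetic.
import numpy as np

rng = np.random.default_rng(0)
probs = {
    "clf_a": rng.random(100),
    "clf_b": rng.random(100),
}
# A third classifier made deliberately similar to clf_a.
probs["clf_c"] = 0.9 * probs["clf_a"] + 0.1 * rng.random(100)

names = list(probs)
# np.corrcoef treats each row as one variable and returns the
# full pairwise Pearson correlation matrix.
corr = np.corrcoef([probs[n] for n in names])
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        print(f"{names[i]} vs {names[j]}: r = {corr[i, j]:.2f}")
```

Highly correlated classifiers add little diversity to an ensemble, which is why such a matrix is a common diagnostic when selecting base models for stacking.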
Fig 7.
Offensive language identification performance among all models in three levels of annotation.
First level (a), second level (b), and third level (c) show the performance of the selected base-level classifiers together with the stacking ensemble classifier in identifying offensive vs. non-offensive content, targeted vs. untargeted offensive content, and whether the target of the offensive language is an individual or a group, respectively.