A comparative analysis of video vision transformers on word-level sign language datasets
Fig 7
Long-tail nature of WLASL-2000 on the VideoMAE Kinetics model.
This figure illustrates the long-tail effect in WLASL-2000 by plotting class-wise F1-score against training-set class frequency (log scale), with classes grouped into head, middle, and tail. The correlation is positive and statistically significant but weak (Spearman r = 0.122, p < 0.001), and F1-scores vary more among tail classes.
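The analysis behind this figure can be approximated in a few lines. The following is a minimal sketch, not the authors' code: it assumes the training labels, test labels, and model predictions are available as plain Python lists, and all function names (`per_class_f1`, `average_ranks`, `spearman`, `f1_vs_frequency`) are hypothetical. It computes one F1-score per class and the Spearman coefficient between training-set frequency and F1. It does not compute the p-value and does not reproduce the head/middle/tail split, whose thresholds are not stated in the caption.

```python
from collections import Counter
from math import sqrt


def per_class_f1(y_true, y_pred):
    """Return {class: F1} for every class that appears in y_true."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1  # true class t was missed
            fp[p] += 1  # class p was predicted wrongly
    scores = {}
    for c in set(y_true):
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / sqrt(var_x * var_y)


def f1_vs_frequency(train_labels, y_true, y_pred):
    """Correlate training-set class frequency with class-wise F1."""
    freq = Counter(train_labels)
    f1 = per_class_f1(y_true, y_pred)
    classes = sorted(f1)
    return spearman([freq[c] for c in classes], [f1[c] for c in classes])
```

Because Spearman's coefficient depends only on ranks, it is unchanged by the log scaling of the frequency axis, so the raw class counts can be passed in directly.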