SensiMix: Sensitivity-Aware 8-bit index & 1-bit value mixed precision quantization for BERT compression
Fig 1
SensiMix achieves the best trade-off among accuracy, model size, and inference time compared with competing methods.
We report the average accuracy over four GLUE tasks (QQP, QNLI, SST-2, and MRPC).
doi: https://doi.org/10.1371/journal.pone.0265621.g001