SensiMix: Sensitivity-Aware 8-bit index & 1-bit value mixed precision quantization for BERT compression
Table 2
Overall performance of SensiMix compared to the competitors.
SensiMix achieves the smallest model size and the shortest inference time while maintaining similar accuracy.
doi: https://doi.org/10.1371/journal.pone.0265621.t002