SensiMix: Sensitivity-Aware 8-bit index & 1-bit value mixed precision quantization for BERT compression

Table 2

Overall performance of SensiMix compared to the competitors.

SensiMix achieves the smallest model size and the shortest inference time while maintaining similar accuracy.

