SensiMix: Sensitivity-Aware 8-bit index & 1-bit value mixed precision quantization for BERT compression

Fig 1

SensiMix shows the best trade-off between accuracy, model size, and inference time among the competitors.

We report the accuracy averaged over four GLUE tasks (QQP, QNLI, SST-2, and MRPC).