
SensiMix: Sensitivity-Aware 8-bit index & 1-bit value mixed precision quantization for BERT compression

Fig 6

Accuracy vs. model size for QQP, QNLI, SST-2, and MRPC tasks.

SensiMix shows the best trade-off between accuracy and model size. The two points of SensiMix correspond to SensiMix (3+3) and SensiMix (3+6). The three points of BERT pruning correspond to pruning ratios of 90%, 80%, and 70%.
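As background for the "1-bit value" component named in the title, the following is a minimal sketch of scaled sign binarization, a standard 1-bit quantization scheme (in the style of XNOR-Net). It is an illustrative assumption, not SensiMix's exact procedure; the function names `binarize` and `dequantize` are hypothetical.

```python
import numpy as np

def binarize(w: np.ndarray):
    """Quantize full-precision weights to 1 bit: signs plus one scale.

    alpha = mean(|w|) is the per-tensor scale that minimizes the L2
    reconstruction error for sign-based binarization (assumption: the
    common XNOR-Net-style formulation, not the paper's exact method).
    """
    alpha = float(np.abs(w).mean())
    return np.sign(w), alpha

def dequantize(b: np.ndarray, alpha: float) -> np.ndarray:
    """Reconstruct approximate weights from the 1-bit codes."""
    return alpha * b

# Example: a 4-weight tensor compresses to 4 sign bits + 1 float scale.
w = np.array([0.4, -0.2, 0.1, -0.5])
b, alpha = binarize(w)       # b = [1, -1, 1, -1], alpha = 0.3
w_hat = dequantize(b, alpha)
```

Storing only signs and a single scale is what drives the large model-size reductions shown on the x-axis of the figure, at some cost in accuracy.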


doi: https://doi.org/10.1371/journal.pone.0265621.g006