SensiMix: Sensitivity-Aware 8-bit index & 1-bit value mixed precision quantization for BERT compression
Table 6
Comparison of the sensitivity of different encoder layers.
SensiMix applies mixed-precision (MP) encoders to the upper three layers, SensiMix-L applies them to the lower three layers, and SensiMix-E applies them to the even-numbered layers 2, 4, and 6. Of the three variants, SensiMix achieves the best accuracy.
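The three layer placements compared here can be written out as a small sketch. It assumes a six-layer encoder stack (implied by the even-numbered layers 2, 4, and 6) with layers numbered from 1 (nearest the input) to 6 (nearest the output), so "upper" is read as layers 4 to 6. The function name `mp_layers` and its parameters are hypothetical and do not come from the SensiMix code.

```python
# Illustrative sketch only: which encoder layers receive MP encoders
# in each variant compared in the table. Assumes 6 layers, 1-indexed,
# with layer 1 nearest the input (an assumption, not stated in the caption).

def mp_layers(variant: str, num_layers: int = 6, num_mp: int = 3) -> list[int]:
    """Return the 1-indexed layers that use MP encoders for a variant."""
    layers = list(range(1, num_layers + 1))
    if variant == "SensiMix":
        # Upper layers: the last `num_mp` layers of the stack.
        return layers[-num_mp:]
    if variant == "SensiMix-L":
        # Lower layers: the first `num_mp` layers of the stack.
        return layers[:num_mp]
    if variant == "SensiMix-E":
        # Even-numbered layers.
        return [i for i in layers if i % 2 == 0]
    raise ValueError(f"unknown variant: {variant!r}")


if __name__ == "__main__":
    for name in ("SensiMix", "SensiMix-L", "SensiMix-E"):
        print(name, mp_layers(name))
```

Each variant places MP encoders in exactly three of the six layers, so the comparison isolates the effect of position rather than the number of MP layers.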