Paper Title
PD-Quant: Post-Training Quantization based on Prediction Difference Metric
Authors
Abstract
Post-training quantization (PTQ) is a neural network compression technique that converts a full-precision model into a quantized model using lower-precision data types. Although it helps reduce the size and computational cost of deep neural networks, it also introduces quantization noise and can reduce prediction accuracy, especially in extremely low-bit settings. Determining appropriate quantization parameters (e.g., scaling factors and the rounding of weights) is currently the main problem. Existing methods attempt to determine these parameters by minimizing the distance between features before and after quantization, but this approach considers only local information and may not yield optimal quantization parameters. We analyze this issue and propose PD-Quant, a method that addresses this limitation by considering global information. It determines the quantization parameters using the difference between network predictions before and after quantization. In addition, PD-Quant alleviates the overfitting problem in PTQ caused by the small size of the calibration set by adjusting the distribution of activations. Experiments show that PD-Quant leads to better quantization parameters and improves the prediction accuracy of quantized models, especially in low-bit settings. For example, PD-Quant pushes the accuracy of ResNet-18 up to 53.14% and RegNetX-600MF up to 40.67% with 2-bit weights and 2-bit activations. The code is released at https://github.com/hustvl/PD-Quant.
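The contrast the abstract draws between a local feature distance and a global prediction difference can be illustrated with a minimal sketch. This is not the released PD-Quant implementation: the toy two-layer network, the helper names (`quantize`, `metrics_for_scale`, `search_scale`), the grid search over candidate scaling factors, and the use of KL divergence as the prediction difference are all assumptions made for illustration only.

```python
import math

def quantize(values, scale, bits=2):
    """Uniform symmetric fake-quantization: round to the grid, clamp, rescale."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return [scale * max(qmin, min(qmax, round(v / scale))) for v in values]

def linear(weights, x):
    """Matrix-vector product for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); used here as an assumed form of the prediction difference."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def metrics_for_scale(w1, w2, x, scale, bits=2):
    """Return (local feature distance, global prediction difference)
    when only the first layer's weights are quantized with `scale`."""
    w1_q = [quantize(row, scale, bits) for row in w1]
    feat_fp = linear(w1, x)
    feat_q = linear(w1_q, x)
    local = mse(feat_fp, feat_q)  # local: compares this layer's features only
    # Global: propagate both feature vectors through the rest of the network
    # (ReLU followed by a full-precision classifier) and compare predictions.
    pred_fp = softmax(linear(w2, [max(0.0, f) for f in feat_fp]))
    pred_q = softmax(linear(w2, [max(0.0, f) for f in feat_q]))
    return local, kl_divergence(pred_fp, pred_q)

def search_scale(w1, w2, x, candidates, bits=2):
    """Pick the best candidate scale under each metric separately."""
    scored = []
    for s in candidates:
        local, pd = metrics_for_scale(w1, w2, x, s, bits)
        scored.append((s, local, pd))
    best_local = min(scored, key=lambda t: t[1])[0]
    best_pd = min(scored, key=lambda t: t[2])[0]
    return best_local, best_pd

if __name__ == "__main__":
    # Hypothetical toy weights and input, chosen only for demonstration.
    w1 = [[0.9, -0.4, 0.1], [-0.7, 0.3, 0.8]]
    w2 = [[1.5, -0.2], [-0.3, 2.0], [0.4, 0.4]]
    x = [1.0, 0.5, -1.0]
    print(search_scale(w1, w2, x, [0.2, 0.3, 0.4, 0.6, 0.8]))
```

The two metrics need not select the same scaling factor: the local metric sees only the quantized layer's output, whereas the prediction difference also reflects how the remaining layers amplify or suppress that error.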