Paper Title
Post-training Quantization for Neural Networks with Provable Guarantees
Paper Authors
Paper Abstract
While neural networks have been remarkably successful in a wide array of applications, implementing them in resource-constrained hardware remains an area of intense research. By replacing the weights of a neural network with quantized (e.g., 4-bit or binary) counterparts, massive savings in computation cost, memory, and power consumption are attained. To that end, we generalize a post-training neural-network quantization method, GPFQ, that is based on a greedy path-following mechanism. Among other things, we propose modifications to promote sparsity of the weights, and rigorously analyze the associated error. Additionally, our error analysis expands the results of previous work on GPFQ to handle general quantization alphabets, showing that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights -- i.e., the level of over-parametrization. Our result holds across a range of input distributions and for both fully-connected and convolutional architectures, thereby also extending previous results. To empirically evaluate the method, we quantize several common architectures with few bits per weight, and test them on ImageNet, showing only minor loss of accuracy compared to unquantized models. We also demonstrate that standard modifications, such as bias correction and mixed precision quantization, further improve accuracy.
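To give a concrete picture of the greedy path-following mechanism the abstract refers to, below is a minimal NumPy sketch of how such a per-layer post-training quantizer can be organized: each neuron's weights are quantized one at a time, and at each step the quantized value is chosen so that the running quantized pre-activation tracks the analog one on calibration data. This is an illustrative sketch, not the authors' implementation; the function name `gpfq_quantize_layer`, the calibration matrix `X`, and the `alphabet` argument are assumptions, and details such as alphabet scaling, the sparsity-promoting modification, bias correction, mixed precision, and convolutional layers are omitted.

```python
import numpy as np

def gpfq_quantize_layer(W, X, alphabet):
    """Greedy path-following quantization of one fully-connected layer (sketch).

    W        : (N_in, N_out) float weights of the layer.
    X        : (m, N_in) matrix of input activations from calibration data.
    alphabet : 1-D array of allowed quantized values (e.g., scaled 4-bit levels).
    """
    N_in, N_out = W.shape
    Q = np.zeros_like(W)
    for j in range(N_out):            # quantize each output neuron independently
        u = np.zeros(X.shape[0])      # running gap between analog and quantized paths
        for t in range(N_in):
            x_t = X[:, t]
            target = u + W[t, j] * x_t
            # project the target onto the direction of x_t, then round to the alphabet
            c = x_t @ target / max(x_t @ x_t, 1e-12)
            Q[t, j] = alphabet[np.argmin(np.abs(alphabet - c))]
            u = target - Q[t, j] * x_t
        # ||u|| at the end is the quantization error || X w_j - X q_j || for neuron j
    return Q
```

As a usage sketch, a symmetric 4-bit alphabet could be something like `delta * np.arange(-7, 8)` for a layer-dependent step size `delta`; the abstract's claim is that, under suitable input distributions, the relative square error of this kind of procedure essentially decays linearly in the number of weights per neuron.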