Combining Gradients and Probabilities for Heterogeneous Approximation of Neural Networks

Paper title

Combining Gradients and Probabilities for Heterogeneous Approximation of Neural Networks

Authors

Elias Trommer, Bernd Waschneck, Akash Kumar

Abstract

This work explores the search for heterogeneous approximate multiplier configurations for neural networks that produce high accuracy and low energy consumption. We discuss the validity of additive Gaussian noise added to accurate neural network computations as a surrogate model for behavioral simulation of approximate multipliers. The continuous and differentiable properties of the solution space spanned by the additive Gaussian noise model are used as a heuristic that generates meaningful estimates of layer robustness without the need for combinatorial optimization techniques. Instead, the amount of noise injected into the accurate computations is learned during network training using backpropagation. A probabilistic model of the multiplier error is presented to bridge the gap between the domains; the model estimates the standard deviation of the approximate multiplier error, connecting solutions in the additive Gaussian noise space to actual hardware instances. Our experiments show that the combination of heterogeneous approximation and neural network retraining reduces the energy consumption for multiplications by 70% to 79% for different ResNet variants on the CIFAR-10 dataset with a Top-1 accuracy loss below one percentage point. For the more complex Tiny ImageNet task, our VGG16 model achieves a 53% reduction in energy consumption with a drop in Top-5 accuracy of 0.5 percentage points. We further demonstrate that our error model can predict the parameters of an approximate multiplier in the context of the commonly used additive Gaussian noise (AGN) model with high accuracy. Our software implementation is available at https://github.com/etrommer/agn-approx.
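The idea of linking an approximate multiplier to a noise standard deviation can be illustrated with a small sketch. It is not the authors' error model and does not use the agn-approx code. The operand-truncating multiplier `approx_mul`, the uniform operand distribution, and all function names are assumptions chosen for illustration. Under those assumptions the per-product errors are independent, so the error of an n-term dot product has a standard deviation of sqrt(n) times the per-product standard deviation, which the simulation checks empirically.

```python
import math
import random
import statistics


def approx_mul(a: int, b: int, dropped_bits: int = 4) -> int:
    """Hypothetical approximate multiplier: zero the low bits of both operands."""
    mask = ~((1 << dropped_bits) - 1)
    return (a & mask) * (b & mask)


def per_product_error_stats(dropped_bits: int = 4, width: int = 8):
    """Mean and std of the multiplier error over all unsigned operand pairs,
    i.e. assuming uniformly distributed operands (an assumption of this sketch)."""
    errors = [
        approx_mul(a, b, dropped_bits) - a * b
        for a in range(1 << width)
        for b in range(1 << width)
    ]
    return statistics.fmean(errors), statistics.pstdev(errors)


def predicted_dot_error_std(n_terms: int, dropped_bits: int = 4, width: int = 8) -> float:
    """Predicted std of the accumulated error of an n-term dot product,
    treating the per-product errors as independent."""
    _, sigma = per_product_error_stats(dropped_bits, width)
    return math.sqrt(n_terms) * sigma


def simulated_dot_error_std(n_terms: int, trials: int = 2000, dropped_bits: int = 4,
                            width: int = 8, seed: int = 0) -> float:
    """Empirical std of the accumulated error over random dot products."""
    rng = random.Random(seed)
    hi = (1 << width) - 1
    sums = []
    for _ in range(trials):
        err = 0
        for _ in range(n_terms):
            a, b = rng.randint(0, hi), rng.randint(0, hi)
            err += approx_mul(a, b, dropped_bits) - a * b
        sums.append(err)
    return statistics.pstdev(sums)


if __name__ == "__main__":
    print("predicted:", predicted_dot_error_std(64))
    print("simulated:", simulated_dot_error_std(64))
```

In a setting like the one the abstract describes, a per-layer noise level learned during training would be compared against such a predicted standard deviation to select a multiplier. Real layers have non-uniform operand distributions, which this sketch ignores.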
