Paper Title

A Gradient-Interleaved Scheduler for Energy-Efficient Backpropagation for Training Neural Networks

Authors

Nanda Unnikrishnan and Keshab K. Parhi

Abstract

This paper addresses the design of accelerators using systolic architectures for training neural networks with a novel gradient-interleaving approach. Training a neural network involves backpropagation of the error and computation of gradients with respect to the activation functions and the weights. It is shown that the gradient with respect to the activation function can be computed using a weight-stationary systolic array, while the gradient with respect to the weights can be computed using an output-stationary systolic array. The novelty of the proposed approach lies in interleaving the computations of these two gradients on the same configurable systolic array. This allows variables to be reused from one computation in the other and eliminates unnecessary memory accesses. The proposed approach achieves 1.4-2.2$\times$ savings in the number of cycles and $1.9\times$ savings in memory accesses. Thus, the proposed accelerator reduces both latency and energy consumption.
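As a concrete illustration of the two gradients the abstract refers to, the NumPy sketch below computes them for a single fully connected layer. This is only an illustrative software model under assumed names (`W`, `a`, `delta`), not the paper's systolic-array hardware; it shows why the activation gradient reuses the weights (suiting a weight-stationary array) while the weight gradient accumulates outputs in place (suiting an output-stationary array).

```python
import numpy as np

# Illustrative model of the two backpropagation gradients for one
# fully connected layer y = W @ a (bias omitted). The variable names
# W, a, and delta are assumptions for this sketch, not the paper's.

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))      # weights (out_dim x in_dim)
a = rng.standard_normal((3, 1))      # input activations to the layer
delta = rng.standard_normal((4, 1))  # error backpropagated from the next layer

# Gradient w.r.t. the activations: W^T @ delta.
# W is reused across the whole matrix-vector product, which is why
# this computation maps naturally to a weight-stationary array.
grad_a = W.T @ delta

# Gradient w.r.t. the weights: delta @ a^T (an outer product).
# Each entry of grad_W accumulates in place, which is why this
# computation maps naturally to an output-stationary array.
grad_W = delta @ a.T

print(grad_a.shape)  # (3, 1), same shape as a
print(grad_W.shape)  # (4, 3), same shape as W
```

Interleaving both computations on one configurable array lets `delta` (and the other operands already on-chip) serve both gradients, which is the source of the reported reduction in memory accesses.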
