Paper Title

The Two-Pass Softmax Algorithm

Paper Authors

Marat Dukhan, Artsiom Ablavatski

Paper Abstract

The softmax (also called softargmax) function is widely used in machine learning models to normalize real-valued scores into a probability distribution. To avoid floating-point overflow, the softmax function is conventionally implemented in three passes: a first pass to compute the normalization constant, and two further passes to compute the outputs from the normalized inputs. We analyze two variants of the Three-Pass algorithm and demonstrate that, in a well-optimized implementation on HPC-class processors, the performance of all three passes is limited by memory bandwidth. We then present a novel algorithm that computes softmax in just two passes. The proposed Two-Pass algorithm avoids both numerical overflow and the extra normalization pass by employing an exotic representation for intermediate values, in which each value is represented as a pair of floating-point numbers: one representing the "mantissa" and another representing the "exponent". Performance evaluation demonstrates that on out-of-cache inputs on an Intel Skylake-X processor, the new Two-Pass algorithm outperforms the traditional Three-Pass algorithm by up to 28% in the AVX512 implementation and by up to 18% in the AVX2 implementation. The proposed Two-Pass algorithm also outperforms the traditional Three-Pass algorithm on Intel Broadwell and AMD Zen 2 processors. To foster reproducibility, we have released an open-source implementation of the new Two-Pass Softmax algorithm and the other experiments in this paper as part of the XNNPACK library at GitHub.com/google/XNNPACK.
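Since the abstract is dense, a minimal scalar sketch may help make the pass structure and the (mantissa, exponent) pair representation concrete. The code below is an illustrative assumption, not the XNNPACK implementation: the actual kernels are vectorized for AVX2/AVX512 and replace the libm calls with polynomial approximations of exp, but the memory-access pattern of the two algorithms follows the description in the abstract.

```c
#include <limits.h>
#include <math.h>
#include <stddef.h>

/* Conventional Three-Pass softmax, shown for contrast: pass 1 finds the
 * maximum (so the exponentials cannot overflow), pass 2 computes exp()
 * and the normalization constant, pass 3 scales the outputs. */
static void three_pass_softmax(const float* x, float* y, size_t n) {
    float max = -INFINITY;
    for (size_t i = 0; i < n; i++) {           /* pass 1: read x */
        if (x[i] > max) max = x[i];
    }
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {           /* pass 2: read x, write y */
        y[i] = expf(x[i] - max);
        sum += y[i];
    }
    const float inv_sum = 1.0f / sum;
    for (size_t i = 0; i < n; i++) {           /* pass 3: read y, write y */
        y[i] *= inv_sum;
    }
}

/* Sketch of the Two-Pass idea: each exp(x[i]) is kept as a pair (m, e)
 * with exp(x[i]) = m * 2^e and m close to 1, so the running sum cannot
 * overflow even though no maximum is subtracted, eliminating the
 * separate max-finding pass. */
static void two_pass_softmax(const float* x, float* y, size_t n) {
    const float log2e = 1.44269504f;   /* 1/ln(2) */
    const float ln2 = 0.693147181f;

    float sum_m = 0.0f;                /* "mantissa" of the running sum */
    int sum_e = INT_MIN;               /* "exponent" of the empty (zero) sum */
    for (size_t i = 0; i < n; i++) {   /* pass 1: read x only */
        const int e = (int) lrintf(x[i] * log2e);      /* "exponent" */
        const float m = expf(x[i] - (float) e * ln2);  /* "mantissa", ~[0.71, 1.42] */
        if (e > sum_e) {
            /* Rescale the accumulated sum to the larger exponent;
             * terms too small to matter flush to zero here. */
            sum_m = (sum_e == INT_MIN) ? 0.0f : ldexpf(sum_m, sum_e - e);
            sum_e = e;
        }
        sum_m += ldexpf(m, e - sum_e);
    }

    const float inv_sum_m = 1.0f / sum_m;
    for (size_t i = 0; i < n; i++) {   /* pass 2: read x, write y */
        const int e = (int) lrintf(x[i] * log2e);
        const float m = expf(x[i] - (float) e * ln2);
        /* y[i] = exp(x[i]) / sum = (m / sum_m) * 2^(e - sum_e) */
        y[i] = ldexpf(m * inv_sum_m, e - sum_e);
    }
}
```

Note that the first pass of the Two-Pass sketch only reads the input, so the whole computation traverses memory twice instead of three times; on out-of-cache inputs, where every pass is bandwidth-bound, this is the source of the reported speedup.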
