Paper Title

Adaptive Low-Precision Training for Embeddings in Click-Through Rate Prediction

Authors

Shiwei Li, Huifeng Guo, Lu Hou, Wei Zhang, Xing Tang, Ruiming Tang, Rui Zhang, Ruixuan Li

Abstract

Embedding tables are usually huge in click-through rate (CTR) prediction models. To train and deploy CTR models efficiently and economically, it is necessary to compress their embedding tables at the training stage. To this end, we formulate a novel quantization training paradigm that compresses the embeddings from the training stage onward, termed low-precision training (LPT). We also provide a theoretical analysis of its convergence. The results show that stochastic weight quantization achieves a faster convergence rate and a smaller convergence error than deterministic weight quantization in LPT. Further, to reduce accuracy degradation, we propose adaptive low-precision training (ALPT), which learns the step size (i.e., the quantization resolution) through gradient descent. Experiments on two real-world datasets confirm our analysis and show that ALPT can significantly improve prediction accuracy, especially at extremely low bit widths. For the first time in CTR models, we successfully train 8-bit embeddings without sacrificing prediction accuracy. The code of ALPT is publicly available.
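
To make the two ideas in the abstract concrete, here is a minimal PyTorch sketch combining stochastic weight quantization with a step size learned by gradient descent through a straight-through estimator. All names (`ALPTEmbedding`, `init_step`, the symmetric integer range, the weight initialization) are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class ALPTEmbedding(nn.Module):
    """Sketch of low-precision embedding training with a learnable step size.

    Illustrative only: a uniform symmetric quantizer with stochastic rounding
    and a straight-through estimator, under assumed names and defaults.
    """

    def __init__(self, num_embeddings: int, dim: int, bits: int = 8,
                 init_step: float = 0.05):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_embeddings, dim) * 0.01)
        # The quantization resolution ("step size") is itself a trainable
        # parameter, updated by the same gradient descent as the weights.
        self.step = nn.Parameter(torch.tensor(init_step))
        self.qmin = -(2 ** (bits - 1))
        self.qmax = 2 ** (bits - 1) - 1

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        w = self.weight[ids]
        scaled = w / self.step
        # Stochastic rounding: round up with probability equal to the
        # fractional part, so E[rounded] = scaled (unbiased).
        floor = torch.floor(scaled)
        rounded = floor + torch.bernoulli(torch.clamp(scaled - floor, 0.0, 1.0))
        # Straight-through estimator: the forward pass uses the rounded value,
        # while the backward pass treats rounding as the identity, so gradients
        # flow to both the embedding weights and the step size.
        rounded = scaled + (rounded - scaled).detach()
        q = torch.clamp(rounded, self.qmin, self.qmax)
        return q * self.step
```

Replacing the `torch.bernoulli` rounding with `torch.round` gives the deterministic variant that the paper's analysis finds inferior. In a real deployment, the memory savings come from storing the clamped integers `q` rather than the full-precision weights; this sketch keeps the float weights for clarity.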
