Paper Title

Distilled Neural Networks for Efficient Learning to Rank

Authors

Nardini, F. M., Rulli, C., Trani, S., Venturini, R.

Abstract

Recent studies in Learning to Rank have shown the possibility to effectively distill a neural network from an ensemble of regression trees. This result leads neural networks to become a natural competitor of tree-based ensembles on the ranking task. Nevertheless, ensembles of regression trees outperform neural models both in terms of efficiency and effectiveness, particularly when scoring on CPU. In this paper, we propose an approach for speeding up neural scoring time by applying a combination of Distillation, Pruning and Fast Matrix multiplication. We employ knowledge distillation to learn shallow neural networks from an ensemble of regression trees. Then, we exploit an efficiency-oriented pruning technique that performs a sparsification of the most computationally-intensive layers of the neural network that is then scored with optimized sparse matrix multiplication. Moreover, by studying both dense and sparse high performance matrix multiplication, we develop a scoring time prediction model which helps in devising neural network architectures that match the desired efficiency requirements. Comprehensive experiments on two public learning-to-rank datasets show that neural networks produced with our novel approach are competitive at any point of the effectiveness-efficiency trade-off when compared with tree-based ensembles, providing up to 4x scoring time speed-up without affecting the ranking quality.
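To make the pipeline concrete, below is a minimal sketch (not the authors' implementation) of the three steps named in the abstract: distilling an ensemble of regression trees into a shallow neural network, sparsifying its most computationally intensive layer by magnitude pruning, and scoring on CPU with a sparse matrix multiplication. The libraries (scikit-learn, PyTorch, SciPy), the synthetic data, and all hyperparameters are illustrative assumptions, not values from the paper.

```python
# Sketch of: tree-ensemble teacher -> distilled shallow MLP -> pruned layer
# stored as a sparse matrix -> sparse-matmul scoring. All settings are assumptions.
import numpy as np
import scipy.sparse as sp
import torch
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.standard_normal((5000, 136)).astype(np.float32)  # query-document features
y = rng.random(5000).astype(np.float32)                   # relevance labels

# 1) Teacher: an ensemble of regression trees trained on the ranking labels.
teacher = GradientBoostingRegressor(n_estimators=100, max_depth=4).fit(X, y)
soft_targets = teacher.predict(X).astype(np.float32)      # distillation targets

# 2) Student: a shallow feed-forward network trained to mimic the teacher's scores.
student = torch.nn.Sequential(
    torch.nn.Linear(136, 512), torch.nn.ReLU(),
    torch.nn.Linear(512, 1),
)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
xb, tb = torch.from_numpy(X), torch.from_numpy(soft_targets).unsqueeze(1)
for _ in range(200):                                      # simple distillation loop
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(student(xb), tb)
    loss.backward()
    opt.step()

# 3) Efficiency-oriented pruning: zero the smallest-magnitude weights of the
#    largest layer and keep it in CSR format for sparse scoring.
W = student[0].weight.detach().numpy()                    # shape (512, 136)
threshold = np.quantile(np.abs(W), 0.9)                   # keep roughly 10% of weights
W_sparse = sp.csr_matrix(np.where(np.abs(W) >= threshold, W, 0.0))

def score(features: np.ndarray) -> np.ndarray:
    """CPU scoring path: sparse matmul for the pruned layer, dense for the rest."""
    h = W_sparse @ features.T + student[0].bias.detach().numpy()[:, None]
    h = np.maximum(h, 0.0)                                # ReLU
    w2 = student[2].weight.detach().numpy()
    b2 = student[2].bias.detach().numpy()
    return (w2 @ h + b2[:, None]).ravel()

print(score(X[:5]))                                       # scores for five documents
```

The paper additionally builds a scoring-time prediction model from measurements of dense and sparse matrix multiplication to choose architectures meeting a target latency; that step is not reproduced in this sketch.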
