Paper Title

One Person, One Model--Learning Compound Router for Sequential Recommendation

Paper Authors

Zhiding Liu, Mingyue Cheng, Zhi Li, Qi Liu, Enhong Chen

Paper Abstract

Deep learning has brought significant breakthroughs in sequential recommendation (SR) by capturing dynamic user interests. A series of recent studies has revealed that models with more parameters usually achieve the best performance on SR tasks, which inevitably poses great challenges for deploying them in real systems. Following the simple assumption that light networks might already suffice for certain users, in this work we propose CANet, a conceptually simple yet highly scalable framework that assigns adaptive network architectures in an input-dependent manner to reduce unnecessary computation. The core idea of CANet is to route the input user behaviors with a lightweight router module. Specifically, we first construct the routing space with various submodels parameterized along multiple model dimensions, such as the number of layers, the hidden size, and the embedding size. To avoid extra storage overhead for the routing space, we employ a weight-slicing scheme that maintains all the submodels within exactly one network. Furthermore, we leverage several techniques to solve the discrete optimization issue caused by the router module. Thanks to these, CANet can adaptively adjust its network architecture for each input in an end-to-end manner, so that user preferences are effectively captured. To evaluate our work, we conduct extensive experiments on benchmark datasets. The results show that CANet reduces computation by 55-65% while preserving the accuracy of the original model. Our code is available at https://github.com/icantnamemyself/CANet.
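To make the two core ideas concrete, below is a minimal sketch in PyTorch of (a) a weight-slicing layer that serves every candidate width from one shared parameter tensor, and (b) a lightweight router whose discrete choice is relaxed with straight-through Gumbel-softmax, one common remedy for the discrete optimization issue the abstract mentions. The abstract does not name the authors' exact techniques, and all names here (SlicedLinear, Router, hidden_choices) are illustrative, not the CANet API; the real implementation is at the repository linked above.

```python
# Illustrative sketch only; names and design details are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlicedLinear(nn.Module):
    """One shared weight matrix; each submodel uses a top-left slice of it,
    so maintaining all submodels adds no extra storage."""
    def __init__(self, max_in: int, max_out: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out, max_in) * 0.02)
        self.bias = nn.Parameter(torch.zeros(max_out))

    def forward(self, x: torch.Tensor, out_dim: int) -> torch.Tensor:
        in_dim = x.size(-1)
        # Slice rather than copy: every submodel shares these parameters.
        return F.linear(x, self.weight[:out_dim, :in_dim], self.bias[:out_dim])

class Router(nn.Module):
    """Pools the behavior sequence and scores each candidate submodel.
    Gumbel-softmax with hard=True yields a one-hot choice in the forward
    pass while keeping gradients flowing in the backward pass."""
    def __init__(self, emb_dim: int, num_choices: int):
        super().__init__()
        self.proj = nn.Linear(emb_dim, num_choices)

    def forward(self, seq: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
        logits = self.proj(seq.mean(dim=1))      # pool over the sequence
        return F.gumbel_softmax(logits, tau=tau, hard=True)

hidden_choices = [32, 64, 128]       # hypothetical routing space of widths
max_dim = max(hidden_choices)
layer = SlicedLinear(max_in=max_dim, max_out=max_dim)
router = Router(emb_dim=max_dim, num_choices=len(hidden_choices))

x = torch.randn(4, 20, max_dim)      # (batch, seq_len, embedding)
gate = router(x)                     # (batch, num_choices), one-hot rows

# Training-time forward: run every width, pad to a common size, and combine
# with the straight-through gate so the router is trained end to end.
branches = torch.stack(
    [F.pad(layer(x, out_dim=d), (0, max_dim - d)) for d in hidden_choices],
    dim=-1,
)                                    # (batch, seq_len, max_dim, num_choices)
y = (branches * gate[:, None, None, :]).sum(dim=-1)
```

Note that during training all branches are evaluated so the gate stays differentiable; the compute savings the abstract reports would come at inference time, when only the routed submodel needs to run.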
