Data Augmentation Strategies for Improving Sequential Recommender Systems

Paper Title

Data Augmentation Strategies for Improving Sequential Recommender Systems

Authors

Song, Joo-yeong; Suh, Bongwon

Abstract

Sequential recommender systems have recently achieved significant performance improvements by exploiting deep learning (DL) based methods. However, although various DL-based methods have been introduced, most of them focus only on changes to the network structure and neglect other influential factors, including data augmentation. DL-based models require a large amount of training data to estimate their parameters well and achieve high performance, which motivated early efforts in the computer vision and speech domains to enlarge the training data through data augmentation. In this paper, we investigate whether various data augmentation strategies can improve the performance of sequential recommender systems, especially when the training dataset is not large enough. To this end, we propose a simple set of data augmentation strategies, all of which transform the original item sequences by directly corrupting them, and we describe how data augmentation changes performance. Extensive experiments on a recent DL-based model show that applying data augmentation helps the model generalize better and can significantly boost model performance, especially when the amount of training data is small. Furthermore, we show that the proposed strategies reach performance that is better than or competitive with the existing strategies suggested in prior work.
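The abstract does not name the individual strategies, so the following is only a minimal sketch of what "directly corrupting" an item sequence could look like. The functions `mask_items`, `crop_sequence`, and `augment`, the `MASK_ID` placeholder, and the probability and length values are hypothetical illustrations chosen here, not the authors' methods.

```python
import random

MASK_ID = 0  # hypothetical placeholder ID reserved for masked items (assumption)


def mask_items(seq, p, rng):
    """Replace each item with MASK_ID independently with probability p."""
    return [MASK_ID if rng.random() < p else item for item in seq]


def crop_sequence(seq, min_len, rng):
    """Return a random contiguous subsequence with at least min_len items."""
    if len(seq) <= min_len:
        return list(seq)
    length = rng.randint(min_len, len(seq))      # inclusive on both ends
    start = rng.randint(0, len(seq) - length)
    return list(seq[start:start + length])


def augment(sequences, n_copies, rng):
    """Keep each original sequence and append n_copies corrupted variants."""
    out = []
    for seq in sequences:
        out.append(list(seq))
        for _ in range(n_copies):
            cropped = crop_sequence(seq, 2, rng)   # illustrative minimum length
            out.append(mask_items(cropped, 0.2, rng))  # illustrative mask rate
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    data = [[1, 2, 3, 4, 5], [6, 7, 8]]
    for seq in augment(data, n_copies=2, rng=rng):
        print(seq)
```

With `n_copies=2`, each original sequence yields three training sequences, which is the kind of enlargement the abstract argues matters most when the training set is small.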
