Paper Title

NEAT: A Label Noise-resistant Complementary Item Recommender System with Trustworthy Evaluation

Authors

Luyi Ma, Jianpeng Xu, Jason H. D. Cho, Evren Korpeoglu, Sushant Kumar, Kannan Achan

Abstract

A complementary item recommender system (CIRS) recommends complementary items for a given query item. Existing CIRS models treat the item co-purchase signal as a proxy for the complementary relationship, because human-curated labels are not available for huge volumes of transaction records. These methods represent items in a complementary embedding space and model the complementary relationship as a point estimate of the similarity between item vectors. However, co-purchased items are not necessarily complementary to each other. For example, customers may frequently purchase bananas and bottled water within the same transaction, but these two items are not complementary. Hence, using co-purchase signals directly as labels will degrade model performance. Likewise, model evaluation will not be trustworthy if the evaluation labels do not reflect true complementary relatedness. To address these challenges arising from the noisy labeling of co-purchase data, we model the co-purchases of two items as a Gaussian distribution, where the mean captures the co-purchases that stem from complementary relatedness and the covariance captures the co-purchases that stem from noise. To do so, we represent each item as a Gaussian embedding and parameterize the Gaussian distribution of co-purchases by the means and covariances of the item Gaussian embeddings. To reduce the impact of noisy labels during evaluation, we propose an independence test-based method to generate a trustworthy label set with a certain level of confidence. Our extensive experiments on both a publicly available dataset and a large-scale real-world dataset demonstrate the effectiveness of our proposed model in complementary item recommendation compared with state-of-the-art models.
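To make the idea of independence test-based label filtering concrete, the following is a minimal sketch rather than the authors' procedure. The abstract does not say which test is used, so a Pearson chi-square test on a 2x2 purchase table stands in here as an assumption. The function names `chi2_independence` and `is_trustworthy_pair`, the threshold `alpha`, and all counts are hypothetical and chosen only for illustration.

```python
import math


def chi2_independence(n_total, n_a, n_b, n_ab):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 purchase table.

    n_total: number of transactions
    n_a, n_b: transactions containing item A / item B
    n_ab: transactions containing both
    Assumes 0 < n_a < n_total and 0 < n_b < n_total so no expected cell is zero.
    """
    observed = [
        n_ab,                        # both A and B
        n_a - n_ab,                  # A only
        n_b - n_ab,                  # B only
        n_total - n_a - n_b + n_ab,  # neither
    ]
    p_a = n_a / n_total
    p_b = n_b / n_total
    # Expected counts if purchases of A and B were independent.
    expected = [
        n_total * p_a * p_b,
        n_total * p_a * (1 - p_b),
        n_total * (1 - p_a) * p_b,
        n_total * (1 - p_a) * (1 - p_b),
    ]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def is_trustworthy_pair(n_total, n_a, n_b, n_ab, alpha=0.01):
    """Keep a co-purchase pair as a label only if independence is rejected
    at level alpha AND the pair co-occurs more often than chance predicts."""
    _, p_value = chi2_independence(n_total, n_a, n_b, n_ab)
    expected_both = n_a * n_b / n_total
    return p_value < alpha and n_ab > expected_both


# Hypothetical counts: two popular items that co-occur only at the chance rate.
popular_pair = is_trustworthy_pair(10_000, 3_000, 2_000, 600)
# Hypothetical counts: two niche items that co-occur far above the chance rate.
niche_pair = is_trustworthy_pair(10_000, 200, 300, 150)
print(popular_pair, niche_pair)
```

In this sketch, a pair of popular items such as bananas and bottled water can have a high raw co-purchase count yet fail the test, because that count is no larger than independence would predict. Requiring `n_ab > expected_both` is a design choice of the sketch: rejecting independence alone would also admit pairs that are bought together less often than chance.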
