Paper Title

Learning What You Need from What You Did: Product Taxonomy Expansion with User Behaviors Supervision

Authors

Cheng, Sijie; Gu, Zhouhong; Liu, Bang; Xie, Rui; Wu, Wei; Xiao, Yanghua

Abstract

Taxonomies are widely used across domains to underpin numerous applications. In particular, product taxonomies play an essential role in e-commerce, supporting recommendation, browsing, and query understanding. However, a taxonomy must constantly capture newly emerging terms and concepts on the platform to stay up to date, which is expensive and labor-intensive when it relies on manual maintenance and updates. We therefore target the taxonomy expansion task, which attaches new concepts to an existing taxonomy automatically. In this paper, we present a self-supervised, user behavior-oriented product taxonomy expansion framework. The framework extracts hyponymy relations that conform to users' intentions and cognition. Specifically, i) to fully exploit user behavioral information, we extract candidate hyponymy relations that match user interests from query-click concepts; ii) to enrich the semantic information of new concepts and better detect hyponymy relations, we model concepts and relations through both user-generated content and the structural information in existing taxonomies and user click logs, leveraging Pre-trained Language Models and a Graph Neural Network combined with Contrastive Learning; iii) to reduce the cost of dataset construction and overcome data skew, we construct a high-quality, balanced training dataset from the existing taxonomy with no supervision. Extensive experiments on real-world product taxonomies from the Meituan platform, a leading Chinese vertical e-commerce platform for ordering take-out with more than 70 million daily active users, demonstrate the superiority of the proposed framework over state-of-the-art methods. Notably, our method enlarges the real-world product taxonomy from 39,263 to 94,698 relations with 88% precision.
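
Step iii) of the abstract, building a balanced training set from the existing taxonomy, can be illustrated with a minimal sketch. The abstract does not describe the sampling procedure, so everything here is an assumption made for illustration: the function name `build_balanced_pairs` is hypothetical, existing (parent, child) edges are treated as positives, and an equal number of negatives is drawn uniformly from ordered concept pairs that are not edges (so a reversed edge may count as a negative).

```python
import random


def build_balanced_pairs(edges, seed=0):
    """Return shuffled (parent, child, label) examples with a 1:1 class ratio.

    edges: iterable of (parent, child) hyponymy pairs taken from an
    existing taxonomy. Label 1 marks an existing edge, label 0 a sampled
    non-edge. This is an illustrative assumption, not the paper's procedure.
    """
    rng = random.Random(seed)
    # Deduplicate edges and ignore self-loops.
    positives = sorted({(p, c) for p, c in edges if p != c})
    edge_set = set(positives)
    concepts = sorted({node for edge in positives for node in edge})

    # Never ask for more negatives than there are non-edge ordered pairs.
    n = len(concepts)
    max_negatives = n * (n - 1) - len(positives)
    target = min(len(positives), max_negatives)

    # Rejection sampling: draw ordered pairs until enough non-edges are found.
    negatives = set()
    while len(negatives) < target:
        parent, child = rng.sample(concepts, 2)
        if (parent, child) not in edge_set:
            negatives.add((parent, child))

    examples = [(p, c, 1) for p, c in positives]
    examples += [(p, c, 0) for p, c in sorted(negatives)]
    rng.shuffle(examples)
    return examples


if __name__ == "__main__":
    # Toy taxonomy with made-up concepts.
    toy_edges = [("food", "noodles"), ("food", "rice"), ("noodles", "ramen")]
    for parent, child, label in build_balanced_pairs(toy_edges):
        print(parent, child, label)
```

Uniform negative sampling is the simplest choice that yields a balanced set without manual labels; a real system would likely need harder negatives (for example siblings or ancestors beyond the direct parent), which this sketch does not attempt.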
