Paper Title

Improving Opinion Spam Detection by Cumulative Relative Frequency Distribution

Authors

Michela Fazzolari, Francesco Buccafurri, Gianluca Lax, Marinella Petrocchi

Abstract

In recent years, online reviews have become very important because they can influence the purchase decisions of consumers and the reputation of businesses. The practice of writing fake reviews can therefore have severe consequences for customers and service providers. Various approaches have been proposed for detecting opinion spam in online reviews, especially ones based on supervised classifiers. In this contribution, we start from a set of effective features used for classifying opinion spam and re-engineer them by considering the Cumulative Relative Frequency Distribution of each feature. Through an experimental evaluation carried out on real data from Yelp.com, we show that the use of the distributional features improves the performance of classifiers.
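The abstract does not spell out how the features are re-engineered, so the following is only a minimal sketch of the general idea: each raw feature value is replaced by its cumulative relative frequency, that is, the fraction of observed values less than or equal to it. The function name `cumulative_relative_frequency` and the sample data `review_counts` are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def cumulative_relative_frequency(values):
    """Return a function mapping x to the fraction of `values` that are <= x."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("values must be non-empty")

    def crf(x):
        # bisect_right counts how many sorted observations are <= x.
        return bisect_right(ordered, x) / n

    return crf


# Hypothetical raw feature: number of reviews written by each reviewer.
review_counts = [1, 1, 2, 3, 3, 3, 10, 50]
crf = cumulative_relative_frequency(review_counts)

# Assumed "distributional" version of the feature: each raw value is
# replaced by its cumulative relative frequency in the observed data.
distributional_feature = [crf(v) for v in review_counts]
```

Under this assumption, every transformed feature lies in the interval [0, 1] and reflects where a value sits relative to the rest of the data, which makes features with very different raw scales comparable before they are passed to a supervised classifier.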
