SpaML: a Bimodal Ensemble Learning Spam Detector based on NLP Techniques

Paper title

SpaML: a Bimodal Ensemble Learning Spam Detector based on NLP Techniques

Authors

Jaouhar Fattahi, Mohamed Mejri

Abstract

In this paper, we put forward a new tool, called SpaML, for spam detection using a set of supervised and unsupervised classifiers and two Natural Language Processing (NLP) techniques, namely Bag of Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF). We first present the NLP techniques used. We then present our classifiers and their performance with each technique, followed by our overall Ensemble Learning classifier and the strategy we use to combine the individual classifiers. Finally, we present the results obtained by SpaML in terms of accuracy and precision.
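
For readers unfamiliar with the two text representations named in the abstract, the following is a minimal sketch of how BoW counts and TF-IDF weights can be computed. It is not taken from the paper: the whitespace tokenizer, the function names, and the unsmoothed `log(N / df)` form of the inverse document frequency are all assumptions chosen for illustration, and SpaML itself may use different preprocessing and weighting.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; real pipelines
    # usually also strip punctuation and stop words.
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, count_vectors) for a list of documents."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, weight_vectors) using tf * log(N / df)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weighted = []
    for vec in counts:
        total = sum(vec)
        # Term frequency is the count normalized by document length;
        # an empty document gets an all-zero vector.
        weighted.append(
            [(c / total) * w if total else 0.0 for c, w in zip(vec, idf)]
        )
    return vocab, weighted
```

BoW records only how often each word occurs in a message, whereas TF-IDF down-weights words that occur in many messages. Either vector can then be passed to a classifier.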
