Paper Title
Yelp Review Rating Prediction: Machine Learning and Deep Learning Models
Paper Authors
Paper Abstract
We predict restaurant ratings from Yelp review text using the Yelp Open Dataset. We present the data distribution and build a balanced training dataset. Two vectorizers are evaluated for feature engineering. Four machine learning models are implemented: Naive Bayes, Logistic Regression, Random Forest, and Linear Support Vector Machine. Four transformer-based models are also applied: BERT, DistilBERT, RoBERTa, and XLNet. Models are evaluated with accuracy, weighted F1 score, and the confusion matrix. On 5-star classification, XLNet achieves 70% accuracy, compared with 64% for Logistic Regression.
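The three evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names and the toy labels below are assumptions made for illustration, and only the Python standard library is used so the definitions stay visible. Weighted F1 here is the per-class F1 averaged with weights proportional to each class's number of true examples.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count matrix: rows are true ratings, columns are predicted ratings."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of reviews whose predicted rating equals the true rating."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def weighted_f1(y_true, y_pred, labels):
    """Per-class F1, averaged with weights equal to each class's support."""
    matrix = confusion_matrix(y_true, y_pred, labels)
    total = len(y_true)
    score = 0.0
    for i in range(len(labels)):
        true_pos = matrix[i][i]
        support = sum(matrix[i])                   # true examples of class i
        predicted = sum(row[i] for row in matrix)  # predictions of class i
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * support / total
    return score


# Hypothetical toy data: star ratings 1 to 5 for ten reviews.
labels = [1, 2, 3, 4, 5]
y_true = [1, 2, 3, 4, 5, 5, 4, 3, 2, 1]
y_pred = [1, 2, 3, 5, 5, 5, 4, 2, 2, 1]

matrix = confusion_matrix(y_true, y_pred, labels)
acc = accuracy(y_true, y_pred)
wf1 = weighted_f1(y_true, y_pred, labels)

print("accuracy:", acc)
print("weighted F1:", round(wf1, 4))
for label, row in zip(labels, matrix):
    print(label, row)
```

Reading the confusion matrix row by row shows which star ratings are confused with their neighbours, which a single accuracy figure cannot reveal.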