Paper Title
Comparative Study on Supervised versus Semi-supervised Machine Learning for Anomaly Detection of In-vehicle CAN Network
Paper Authors
Paper Abstract
As the central nervous system of an intelligent vehicle's control system, the in-vehicle network bus is crucial to driving safety. One of the leading standards for in-vehicle networks is the Controller Area Network (CAN bus) protocol. However, the CAN bus is vulnerable to various attacks because it was designed without security mechanisms. To enhance the security of in-vehicle networks and promote research in this area, this study comprehensively compares fully supervised and semi-supervised machine learning methods for CAN message anomaly detection, using large-scale CAN network traffic data and features extracted from it. Both traditional machine learning models (single classifiers and ensemble models) and neural-network-based deep learning models are evaluated. Furthermore, the study proposes a deep-autoencoder-based semi-supervised learning method for CAN message anomaly detection and verifies its superiority over other semi-supervised methods. Extensive experiments show that fully supervised methods generally outperform semi-supervised ones because they use more information as input. In particular, the XGBoost-based model developed in this study achieves state-of-the-art performance, with the best accuracy (98.65%), precision (0.9853), and ROC AUC (0.9585), outperforming other methods reported in the literature.
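The abstract reports three metrics: accuracy (as a percentage), precision, and ROC AUC (both as fractions). As a reading aid, the sketch below shows how these quantities are commonly computed for a binary anomaly detector. It is not the authors' evaluation code: the labels, the scores, and the 0.5 decision threshold are made-up toy values, and the function names are hypothetical.

```python
# Minimal sketch of the three metrics named in the abstract.
# Assumption: label 1 = anomalous CAN message, label 0 = normal message.
# All data below are toy values for illustration, not results from the paper.

def accuracy(y_true, y_pred):
    """Fraction of messages whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred):
    """Of the messages flagged as anomalous, the fraction that truly are."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0

def roc_auc(y_true, scores):
    """Probability that a random anomaly scores higher than a random
    normal message (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy ground truth and anomaly scores (hypothetical).
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
scores = [0.1, 0.2, 0.15, 0.6, 0.9, 0.8, 0.4, 0.3]

# Assumed decision threshold of 0.5 turns scores into hard labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
prec = precision(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.2%} precision={prec:.4f} roc_auc={auc:.4f}")
```

Accuracy and precision depend on the chosen threshold, whereas ROC AUC is computed from the raw scores and is threshold-independent, which is one reason the three numbers are usually reported together.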