Paper title
Explaining Machine Learning DGA Detectors from DNS Traffic Data
Paper authors
Paper abstract
One of the most common causes of service interruption in online systems is the widespread cyber attack known as Distributed Denial of Service (DDoS), in which a network of infected devices (a botnet) is exploited, at an attacker's command, to overwhelm the computational capacity of a service. The attack leverages the Domain Name System (DNS) through Domain Generation Algorithms (DGAs), a stealthy connection strategy that nonetheless leaves suspicious data patterns. Advances have been made in analyzing these patterns to detect such threats. Most of this work turns to Machine Learning (ML), which can be highly effective at analyzing and classifying massive amounts of data. Although they perform strongly, ML models are to some degree opaque in their decision-making process. To cope with this problem, a branch of ML known as Explainable ML tries to break down the black-box nature of classifiers and make them interpretable and human-readable. This work addresses Explainable ML in the context of botnet and DGA detection and, to the best of our knowledge, is the first to concretely break down the decisions of ML classifiers devised for botnet/DGA detection, providing both global and local explanations.