Paper Title
SoK: All You Ever Wanted to Know About x86/x64 Binary Disassembly But Were Afraid to Ask
Paper Authors
Paper Abstract
Disassembly of binary code is hard, but necessary for improving the security of binary software. Over the past few decades, research in binary disassembly has produced many tools and frameworks, which have been made available to researchers and security professionals. These tools employ a variety of strategies that grant them different characteristics. The lack of systematization, however, impedes new research in the area and makes selecting the right tool hard, as we do not understand the strengths and weaknesses of existing tools. In this paper, we systematize binary disassembly through the study of nine popular, open-source tools. We couple the manual examination of their code bases with the most comprehensive experimental evaluation (thus far) using 3,788 binaries. Our study yields a comprehensive description and organization of strategies for disassembly, classifying each as either an algorithm or a heuristic. Meanwhile, we measure and report the impact of individual algorithms on the results of each tool. We find that while principled algorithms are used by all tools, they still heavily rely on heuristics to increase code coverage. Depending on the heuristics used, different coverage-vs-correctness trade-offs come into play, leading to tools with different strengths and weaknesses. We envision that these findings will help users pick the right tool and assist researchers in improving binary disassembly.