Type-Centric Kotlin Compiler Fuzzing: Preserving Test Program Correctness by Preserving Types

Paper title

Type-Centric Kotlin Compiler Fuzzing: Preserving Test Program Correctness by Preserving Types

Authors

Daniil Stepanov, Marat Akhin, Mikhail Belyaev

Abstract

Kotlin is a relatively new programming language from JetBrains: its development started in 2010 with release 1.0 done in early 2016. The Kotlin compiler, while slowly and steadily becoming more and more mature, still crashes from time to time on the more tricky input programs, not least because of the complexity of its features and their interactions. This makes it a great target for fuzzing, even the basic forms of which can find a significant number of Kotlin compiler crashes. There is a problem with fuzzing, however, closely related to the cause of the crashes: generating a random, non-trivial and semantically valid Kotlin program is hard. In this paper, we talk about type-centric compiler fuzzing in the form of type-centric enumeration, an approach inspired by skeletal program enumeration and based on a combination of generative and mutation-based fuzzing, which solves this problem by focusing on program types. After creating the skeleton program, we fill the typed holes with fragments of suitable type, created via generation and enhanced by semantic-aware mutation. We implemented this approach in our Kotlin compiler fuzzing framework called Backend Bug Finder (BBF) and did an extensive evaluation, not only testing the real-world feasibility of our approach, but also comparing it to other compiler fuzzing techniques. The results show our approach to be significantly better compared to other fuzzing approaches at generating semantically valid Kotlin programs, while creating more interesting crash-inducing inputs at the same time. We managed to find more than 50 previously unknown compiler crashes, of which 18 were considered important after their triage by the compiler team.
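
As a rough illustration of the idea of filling typed holes in a skeleton program, the toy sketch below enumerates every program obtained by substituting fragments of the matching type. It is not BBF's implementation: the `Hole` class, the `SKELETON` and `FRAGMENTS` data, and the `enumerate_programs` function are all hypothetical names invented here, and the fixed fragment pool stands in for the generation and semantic-aware mutation steps that the abstract describes.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Hole:
    """A placeholder in the skeleton that expects an expression of one Kotlin type."""
    type_name: str


# Hypothetical skeleton: literal Kotlin text interleaved with typed holes.
SKELETON = [
    "fun main() { val n: Int = ", Hole("Int"),
    "; val s: String = ", Hole("String"),
    "; println(s + n) }",
]

# Hypothetical fragment pool, keyed by the Kotlin type of each expression.
FRAGMENTS = {
    "Int": ["0", "listOf(1, 2).size"],
    "String": ['"a"', "1.toString()"],
}


def enumerate_programs(skeleton, fragments):
    """Yield each program formed by filling every hole with a fragment of its type."""
    holes = [part for part in skeleton if isinstance(part, Hole)]
    choices = [fragments[hole.type_name] for hole in holes]
    for combo in product(*choices):
        filler = iter(combo)  # fragments are consumed in hole order
        yield "".join(
            next(filler) if isinstance(part, Hole) else part
            for part in skeleton
        )


programs = list(enumerate_programs(SKELETON, FRAGMENTS))
for program in programs:
    print(program)
```

Because each hole only ever receives a fragment whose type matches, every enumerated program stays well typed by construction, which is the property the abstract credits for producing semantically valid test programs.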
