Paper title
Grammars for Free: Toward Grammar Inference for Ad Hoc Parsers
Paper authors
Paper abstract
Ad hoc parsers are everywhere: they appear any time a string is split, looped over, interpreted, transformed, or otherwise processed. Every ad hoc parser gives rise to a language: the possibly infinite set of input strings that the program accepts without going wrong. Any language can be described by a formal grammar: a finite set of rules that can generate all strings of that language. But programmers do not write grammars for ad hoc parsers -- even though they would be eminently useful. Grammars can serve as documentation, aid program comprehension, generate test inputs, and allow reasoning about language-theoretic security. We propose an automatic grammar inference system for ad hoc parsers that would enable all of these use cases, in addition to opening up new possibilities in mining software repositories and bi-directional parser synthesis.
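To make the idea of an ad hoc parser and its implicit language concrete, here is a minimal Python sketch. It is not taken from the paper: the function `parse_point`, the helper `accepts`, and the regular expression `GRAMMAR` are hypothetical names chosen for illustration. The grammar is written by hand rather than inferred, and it is only an approximation, since `int()` also accepts surrounding whitespace, underscores between digits, and non-ASCII digits.

```python
import re
from typing import Tuple


def parse_point(s: str) -> Tuple[int, int]:
    """A typical ad hoc parser: split on a comma, convert both halves."""
    x, y = s.split(",")      # raises ValueError unless there is exactly one comma
    return int(x), int(y)    # raises ValueError if a half is not an integer literal


def accepts(s: str) -> bool:
    """True if parse_point runs on s 'without going wrong'."""
    try:
        parse_point(s)
        return True
    except ValueError:
        return False


# Hand-written approximation of the language parse_point accepts:
#   point ::= int "," int
#   int   ::= ("+" | "-")? digit+
# Assumption: whitespace, underscores and non-ASCII digits are ignored here,
# so this describes a subset of what int() really accepts.
GRAMMAR = re.compile(r"[+-]?[0-9]+,[+-]?[0-9]+")

SAMPLES = ["1,2", "-3,4", "1,2,3", "a,b", "12", ""]

for sample in SAMPLES:
    print(repr(sample), accepts(sample), bool(GRAMMAR.fullmatch(sample)))
```

On these samples the parser and the hand-written grammar agree, but the grammar had to be worked out by reading the code. An inference system of the kind the abstract proposes would aim to derive such a description automatically.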