Paper title
Where Did My Variable Go? Poking Holes in Incomplete Debug Information
Paper authors
Abstract
The availability of debug information for optimized executables can greatly ease crucial tasks such as crash analysis. Source-level debuggers use this information to display program state in terms of source code, allowing users to reason about it even when optimizations alter program structure extensively. A few recent endeavors have proposed effective methodologies for identifying incorrect instances of debug information, which can mislead users by presenting them with an inconsistent program state. In this work, we identify and study a related important problem: the completeness of debug information. Unlike correctness issues, for which an unoptimized executable can serve as a reference, we find there is no analogous oracle to determine whether the cause behind an unreported part of program state is an unavoidable effect of optimization or a compiler implementation defect. In this scenario, we argue that empirically derived conjectures on the expected availability of debug information can serve as an effective means to expose classes of these defects. We propose three conjectures involving variable values and study how often synthetic programs compiled with different configurations of the popular gcc and LLVM compilers deviate from them. We then discuss techniques to pinpoint the optimizations behind such violations and minimize bug reports accordingly. Our experiments revealed, among others, 24 bugs already confirmed by the developers of the gcc-gdb and clang-lldb ecosystems.