Paper Title
Learning Autocompletion from Real-World Datasets
Paper Authors
Paper Abstract
Code completion is a popular software development tool integrated into all major IDEs. Many neural language models have achieved promising results in completion suggestion prediction on synthetic benchmarks. However, a recent study, "When Code Completion Fails: A Case Study on Real-World Completions", demonstrates that these results may not translate to improvements in real-world performance. To combat this effect, we train models on real-world code completion examples and find that these models outperform models trained on committed source code and working version snapshots by 12.8% and 13.8% accuracy, respectively. We observe this improvement across modeling technologies and show through A/B testing that it corresponds to a 6.2% increase in programmers' actual autocompletion usage. Furthermore, our study characterizes a large corpus of logged autocompletion usages to investigate why training on real-world examples leads to stronger models.
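The key contrast in the abstract is between training data drawn from committed source code and training data drawn from logged autocompletion usage, i.e. (context, accepted suggestion) pairs captured at the moment a programmer accepts a completion. As a minimal Python sketch of what one such training example might look like, assuming a hypothetical CompletionEvent log schema and to_training_example helper (neither is from the paper):

from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class CompletionEvent:
    """One logged autocompletion usage (hypothetical schema, not from the paper)."""
    prefix_tokens: List[str]  # source tokens preceding the cursor at request time
    accepted: str             # the suggestion the programmer actually accepted

def to_training_example(
    event: CompletionEvent, context_window: int = 128
) -> Tuple[List[str], str]:
    """Turn a logged completion event into a (context, target) training pair."""
    context = event.prefix_tokens[-context_window:]  # keep only the nearest context
    return context, event.accepted

# A programmer typed "results." and accepted the suggestion "append".
event = CompletionEvent(prefix_tokens=["results", "."], accepted="append")
context, target = to_training_example(event)
print(context, "->", target)  # ['results', '.'] -> append

Unlike tokens sampled uniformly from committed files, pairs like this reflect the actual distribution of cursor positions and accepted suggestions seen in the IDE, which is one plausible reading of why the paper finds such training data yields stronger models.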