Paper Title

Understanding Long Programming Languages with Structure-Aware Sparse Attention

Paper Authors

Tingting Liu, Chengyu Wang, Cen Chen, Ming Gao, Aoying Zhou

Paper Abstract

Programming-based Pre-trained Language Models (PPLMs) such as CodeBERT have achieved great success in many downstream code-related tasks. Since the memory and computational complexity of self-attention in the Transformer grow quadratically with the sequence length, PPLMs typically limit the code length to 512 tokens. However, code in real-world applications, for example in code search, is generally long and cannot be processed efficiently by existing PPLMs. To solve this problem, in this paper we present SASA, a Structure-Aware Sparse Attention mechanism, which reduces the complexity and improves performance on long code understanding tasks. The key components of SASA are top-$k$ sparse attention and Abstract Syntax Tree (AST)-based structure-aware attention. With top-$k$ sparse attention, the most crucial attention relations can be obtained at a lower computational cost. Since code structure captures the logic of code statements and complements the code's sequence characteristics, we further introduce AST structures into the attention mechanism. Extensive experiments on CodeXGLUE tasks show that SASA achieves better performance than the competing baselines.
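
The abstract describes two components: top-$k$ sparse attention and AST-based structure-aware attention. The sketch below illustrates only the first idea, masking each query's attention scores so that just its top-$k$ keys survive the softmax. It is not the authors' SASA implementation: the function name topk_sparse_attention, the tensor shapes, and the choice of top_k are assumptions for illustration, and the sketch still materializes the dense score matrix, whereas the actual efficiency gain would require block-sparse computation.

import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=32):
    # Illustrative sketch, not the paper's code.
    # q, k, v: (batch, heads, seq_len, head_dim); shapes are hypothetical.
    scores = torch.matmul(q, k.transpose(-2, -1)) / q.size(-1) ** 0.5
    # Keep only the top_k highest scores per query; mask the rest before softmax.
    topk_vals, _ = scores.topk(top_k, dim=-1)
    threshold = topk_vals[..., -1:]  # k-th largest score for each query
    sparse_scores = scores.masked_fill(scores < threshold, float("-inf"))
    attn = F.softmax(sparse_scores, dim=-1)
    return torch.matmul(attn, v)

# Example usage on a long (1024-token) sequence.
q = torch.randn(1, 8, 1024, 64)
k = torch.randn(1, 8, 1024, 64)
v = torch.randn(1, 8, 1024, 64)
out = topk_sparse_attention(q, k, v, top_k=32)
print(out.shape)  # torch.Size([1, 8, 1024, 64])

The AST-based structure-aware attention described in the abstract would additionally restrict or bias attention according to syntactic relations between tokens; that part is not shown here, as its exact form depends on the paper's design.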
