Title
The Deep Learning Compiler: A Comprehensive Survey
Authors
Abstract
The difficulty of deploying various deep learning (DL) models on diverse DL hardware has boosted the research and development of DL compilers in the community. Several DL compilers have been proposed by both industry and academia, such as TensorFlow XLA and TVM. These DL compilers take the DL models described in different DL frameworks as input and then generate optimized code for diverse DL hardware as output. However, none of the existing surveys has comprehensively analyzed the unique design architecture of DL compilers. In this paper, we perform a comprehensive survey of existing DL compilers by dissecting the commonly adopted design in detail, with an emphasis on the DL-oriented multi-level IRs and the frontend/backend optimizations. Specifically, we provide a comprehensive comparison among existing DL compilers from various aspects. In addition, we present a detailed analysis of the design of multi-level IRs and illustrate the commonly adopted optimization techniques. Finally, several insights are highlighted as potential research directions for DL compilers. This is the first survey paper focusing on the design architecture of DL compilers, which we hope can pave the way for future research on DL compilers.