Paper Title
Third ArchEdge Workshop: Exploring the Design Space of Efficient Deep Neural Networks
Paper Authors
Paper Abstract
This paper gives an overview of our ongoing work on design space exploration for efficient deep neural networks (DNNs). Specifically, we cover two aspects: (1) static architecture design efficiency and (2) dynamic model execution efficiency. For static architecture design, unlike existing approaches that rely on end-to-end hardware modeling assumptions, we conduct full-stack profiling at the GPU core level to identify better accuracy-latency trade-offs for DNN designs. For dynamic model execution, unlike prior work that tackles model redundancy at the DNN channel level, we explore a new dimension, DNN feature map redundancy, which can be dynamically traversed at runtime. Finally, we highlight several open questions that are poised to draw research attention in the next few years.