Paper Title
CoCoPIE: Making Mobile AI Sweet As PIE -- Compression-Compilation Co-Design Goes a Long Way
Paper Authors
Paper Abstract
Assuming hardware is the major constraint on enabling real-time mobile intelligence, the industry has mainly dedicated its efforts to developing specialized hardware accelerators for machine learning and inference. This article challenges that assumption. Drawing on CoCoPIE, a recent real-time AI optimization framework, it argues that with effective compression-compiler co-design, real-time artificial intelligence is possible on mainstream end devices without special hardware. CoCoPIE is a software framework that holds numerous records in mobile AI: it is the first framework to support all major kinds of DNNs, from CNNs to RNNs, transformers, and language models; it is the fastest DNN pruning and acceleration framework, up to 180x faster than current DNN pruning on other frameworks such as TensorFlow-Lite; it enables many representative AI applications to run in real time on off-the-shelf mobile devices, which had previously been regarded as possible only with special hardware support; and it makes off-the-shelf mobile devices outperform a number of representative ASIC and FPGA solutions in energy efficiency and/or performance.