Accelerating Neural Network Inference with Processing-in-DRAM: From the Edge to the Cloud

Paper title

Accelerating Neural Network Inference with Processing-in-DRAM: From the Edge to the Cloud

Authors

Geraldo F. Oliveira, Juan Gómez-Luna, Saugata Ghose, Amirali Boroumand, Onur Mutlu

Abstract

Neural networks (NNs) are growing in importance and complexity. A neural network's performance (and energy efficiency) can be bound either by computation or memory resources. The processing-in-memory (PIM) paradigm, where computation is placed near or within memory arrays, is a viable solution to accelerate memory-bound NNs. However, PIM architectures vary in form, where different PIM approaches lead to different trade-offs. Our goal is to analyze, discuss, and contrast DRAM-based PIM architectures for NN performance and energy efficiency. To do so, we analyze three state-of-the-art PIM architectures: (1) UPMEM, which integrates processors and DRAM arrays into a single 2D chip; (2) Mensa, a 3D-stack-based PIM architecture tailored for edge devices; and (3) SIMDRAM, which uses the analog principles of DRAM to execute bit-serial operations. Our analysis reveals that PIM greatly benefits memory-bound NNs: (1) UPMEM provides 23x the performance of a high-end GPU when the GPU requires memory oversubscription for a general matrix-vector multiplication kernel; (2) Mensa improves energy efficiency and throughput by 3.0x and 3.1x over the Google Edge TPU for 24 Google edge NN models; and (3) SIMDRAM outperforms a CPU/GPU by 16.7x/1.4x for three binary NNs. We conclude that the ideal PIM architecture for NN models depends on a model's distinct attributes, due to the inherent architectural design choices.
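
The abstract's distinction between compute-bound and memory-bound kernels can be illustrated with a simple roofline-style estimate. The sketch below is not taken from the paper: the function names, the assumption that each operand crosses the memory interface exactly once, and the peak compute and bandwidth figures are all hypothetical, chosen only to show why a general matrix-vector multiplication (GEMV) tends to be limited by memory rather than by arithmetic.

```python
def gemv_arithmetic_intensity(rows: int, cols: int, bytes_per_elem: int = 4) -> float:
    """FLOPs per byte moved for y = A @ x, assuming each operand is read or written once."""
    flops = 2 * rows * cols  # one multiply and one add per matrix element
    bytes_moved = bytes_per_elem * (rows * cols + cols + rows)  # A, x, and y
    return flops / bytes_moved


def is_memory_bound(intensity: float, peak_flops: float, peak_bandwidth: float) -> bool:
    """A kernel is memory-bound when its intensity is below the machine balance point."""
    return intensity < peak_flops / peak_bandwidth


# Hypothetical machine (illustrative numbers only):
# 10 TFLOP/s peak compute, 500 GB/s peak memory bandwidth.
PEAK_FLOPS = 10e12
PEAK_BANDWIDTH = 500e9

intensity = gemv_arithmetic_intensity(4096, 4096)
print(f"GEMV intensity: {intensity:.3f} FLOP/byte")
print("memory-bound:", is_memory_bound(intensity, PEAK_FLOPS, PEAK_BANDWIDTH))
```

Under these assumptions, a 32-bit GEMV performs roughly half a floating-point operation per byte moved, far below the balance point of the hypothetical machine (20 FLOP/byte), so its run time is governed by memory bandwidth. This is the class of kernel for which the abstract reports the largest PIM benefits.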
