Paper Title
Combinatorics and Geometry for the Many-ported, Distributed and Shared Memory Architecture
Authors
Abstract
Manycore SoC architectures based on on-chip shared memory are preferred for flexible and programmable solutions in many application domains. However, developing many-ported memory is becoming increasingly challenging as we approach the end of Moore's Law, while system requirements demand larger shared memories and more access ports. Memory can no longer be designed simply to minimize single-transaction access time; it must also take into account the functionality on the SoC. In this paper we examine a common use of large memory in SoCs, where the memory serves as storage for large buffers that are then moved for time-scheduled processing. We merge two aspects of many-ported memory design, combinatorial analysis of the interconnect and geometric analysis of critical paths, and extend both to show that in this case SoC performance benefits significantly from a hierarchical, distributed and staged architecture with lower-radix switches and fractal randomization of memory bank addressing, along with judicious, geometry-aware application of speedup. The results presented show that the new architecture supports 20% higher throughput with 20% lower latency and 30% less interconnection area at approximately the same power consumption. We demonstrate the flexibility and scalability of this architecture on silicon from a physical design perspective by taking the design through layout. The architecture enables a much easier implementation flow that works well with physically irregular port access and memory-dominant layouts, which are a common issue in real designs.
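To illustrate why randomizing memory bank addressing can matter for a many-ported shared memory, the sketch below compares plain modulo interleaving with a generic XOR-folding hash under a strided access pattern. This is not the paper's fractal randomization scheme, whose details the abstract does not give; the XOR fold is a stand-in, and the bank count, function names, and access patterns are all assumptions chosen for illustration.

```python
from collections import Counter

NUM_BANKS = 16                          # assumed power of two, not a figure from the paper
BANK_BITS = NUM_BANKS.bit_length() - 1  # 4 address bits select a bank


def bank_modulo(addr: int) -> int:
    """Plain interleaving: the low address bits select the bank."""
    return addr % NUM_BANKS


def bank_xor_fold(addr: int) -> int:
    """Stand-in hash (hypothetical): XOR successive BANK_BITS-wide address slices."""
    bank = 0
    while addr:
        bank ^= addr & (NUM_BANKS - 1)
        addr >>= BANK_BITS
    return bank


def max_bank_load(addresses, bank_fn) -> int:
    """Largest number of accesses that land on any single bank."""
    counts = Counter(bank_fn(a) for a in addresses)
    return max(counts.values())


# Sixteen accesses with a stride equal to the bank count: 0, 16, ..., 240.
strided = [i * NUM_BANKS for i in range(NUM_BANKS)]
# Sixteen accesses with stride 17: 0, 17, ..., 255.
stride_17 = [i * (NUM_BANKS + 1) for i in range(NUM_BANKS)]

print("stride 16, modulo:  ", max_bank_load(strided, bank_modulo))
print("stride 16, xor fold:", max_bank_load(strided, bank_xor_fold))
print("stride 17, xor fold:", max_bank_load(stride_17, bank_xor_fold))
```

With modulo interleaving, every stride-16 access falls on bank 0, whereas the XOR fold spreads the same accesses over all 16 banks. The stride-17 pattern shows the limit of such a simple hash: there the XOR fold itself sends every access to one bank, which is one reason stronger randomization of bank addressing is of interest.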