Paper title
The BlackParrot BedRock Cache Coherence System
Paper authors
Paper abstract
This paper presents BP-BedRock, the open-source cache coherence protocol and system implemented within the BlackParrot 64-bit RISC-V multicore processor. BP-BedRock implements the BedRock directory-based MOESIF cache coherence protocol and includes two different open-source coherence protocol engines, one FSM-based and the other microcode-programmable. Both coherence engines support coherent uncacheable access to cacheable memory and L1-based atomic read-modify-write operations. Fitted within the BlackParrot multicore, BP-BedRock has been silicon-validated in a GlobalFoundries 12nm FinFET process and FPGA-validated with both coherence engines in 8-core configurations, booting Linux and running off-the-shelf benchmarks. After describing BP-BedRock and the design of the two coherence engines, we study their performance by analyzing processing occupancy and running the Splash-3 benchmarks on the 8-core FPGA implementations. Careful design and coherence-specific ISA extensions enable the programmable controller to achieve performance within 1% of the fixed-function FSM controller on average (2.3% in the worst case), as demonstrated on our FPGA test system. Analysis shows that the programmable coherence engine increases die area by only 4% in an ASIC process and increases logic utilization by only 6.3% on FPGA, with one additional block RAM per core.