Paper Title
Worksharing Tasks: An Efficient Way to Exploit Irregular and Fine-Grained Loop Parallelism
Authors
Abstract
Shared memory programming models usually provide worksharing and task constructs. The former rely on the efficient fork-join execution model to exploit structured parallelism, while the latter rely on fine-grained synchronization among tasks and a flexible data-flow execution model to exploit dynamic, irregular, and nested parallelism. In applications that show both structured and unstructured parallelism, worksharing and task constructs can be combined. However, it is difficult to mix the two execution models without penalizing the data-flow execution model. Hence, in many applications structured parallelism is also exploited using tasks, to leverage the full benefits of a pure data-flow execution model. However, task creation and management can introduce a non-negligible overhead that prevents the efficient exploitation of fine-grained structured parallelism, especially on many-core processors. In this work, we propose worksharing tasks: tasks that internally leverage worksharing techniques to exploit fine-grained structured loop-based parallelism. The evaluation shows promising results on several benchmarks and platforms.
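
To make the contrast in the abstract concrete, the following minimal sketch shows the same loop written once with a worksharing construct and once with tasks. It assumes OpenMP as the shared memory programming model, which the abstract does not name; the function names `scale_worksharing` and `scale_tasks` and the `grain` parameter are hypothetical and chosen only for illustration. It does not show the proposed worksharing tasks themselves, only the two existing approaches they aim to combine.

```c
#include <stddef.h>

/* Worksharing (fork-join): the threads of the team split the iteration
 * space among themselves and meet at the implicit barrier that ends the
 * parallel loop. No per-chunk task is created. */
void scale_worksharing(double *x, size_t n, double a)
{
    #pragma omp parallel for schedule(static)
    for (size_t i = 0; i < n; i++)
        x[i] *= a;
}

/* Tasking: one thread partitions the loop into tasks that the other
 * threads of the team execute. `grain` (assumed > 0) sets the approximate
 * number of iterations per task, so a small grain means many tasks and
 * more creation and management overhead. */
void scale_tasks(double *x, size_t n, double a, size_t grain)
{
    #pragma omp parallel
    #pragma omp single
    #pragma omp taskloop grainsize(grain)
    for (size_t i = 0; i < n; i++)
        x[i] *= a;
}
```

Both functions compute the same result. Compiled without OpenMP support, the pragmas are ignored and the loops run sequentially. The task version can participate in a data-flow execution, but it pays a per-task cost that grows as `grain` shrinks, which is the overhead the abstract refers to.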