Paper Title

Distributed asynchronous convergence detection without detection protocol

Authors

Gbikpi-Benissan, Guillaume; Magoules, Frederic

Abstract

In this paper, we address the problem of detecting when an ongoing asynchronous parallel iterative process can be terminated so that it provides a sufficiently precise solution to the fixed-point problem being solved. Formulating the detection problem as a global solution identification problem, we analyze the snapshot-based approach, which is the only one that allows for exact global residual error computation. Starting from a recently developed approximate snapshot protocol that provides a reliable global residual error, we also experimentally investigate the reliability of a global residual error computed without any prior particular detection mechanism. Results on a single-site supercomputer show that such high-performance computing platforms may provide computational environments stable enough to simply resort to non-blocking reduction operations for computing reliable global residual errors, which saves noticeable time at both the implementation and execution levels.
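
The idea of relying on a plain non-blocking reduction, with no dedicated detection protocol, can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the authors' implementation: the toy fixed-point problem, the per-rank speeds (`periods`), the message `delay`, and the helper class `ToyNonBlockingMax` (a stand-in for something like a non-blocking max-reduction) are all hypothetical. Each simulated rank contributes its local residual whenever it happens to iterate, so the reduced value mixes residuals taken at different times rather than from a consistent snapshot.

```python
class ToyNonBlockingMax:
    """Hypothetical stand-in for a non-blocking max-reduction (not a real MPI API)."""

    def __init__(self, nprocs):
        self.nprocs = nprocs
        self.values = {}

    def contribute(self, rank, value):
        # Each rank contributes once per round; later calls in the same round are ignored.
        self.values.setdefault(rank, value)

    def done(self):
        # The reduction completes once every rank has contributed.
        return len(self.values) == self.nprocs

    def result(self):
        return max(self.values.values())


def run_async_iteration(eps=1e-8, periods=(1, 2, 3, 5), delay=2, max_ticks=100_000):
    """Simulate asynchronous iterations on x_i = coef * sum_{j != i} x_j + 1.

    Assumed toy problem: a max-norm contraction whose fixed point is x_i = 4.
    Rank r iterates every periods[r] ticks; updates reach other ranks after `delay` ticks.
    """
    nprocs = len(periods)
    coef = 0.75 / (nprocs - 1)
    x = [0.0] * nprocs                                # x[r] is owned by rank r
    views = [[0.0] * nprocs for _ in range(nprocs)]   # possibly stale local copies
    in_flight = []                                    # (arrival_tick, source_rank, value)
    reduction = ToyNonBlockingMax(nprocs)

    for tick in range(max_ticks):
        # Deliver messages whose arrival time has been reached.
        pending = []
        for arrival, src, value in in_flight:
            if arrival <= tick:
                for r in range(nprocs):
                    if r != src:
                        views[r][src] = value
            else:
                pending.append((arrival, src, value))
        in_flight = pending

        # Ranks that are due iterate on whatever data they currently hold.
        for r in range(nprocs):
            if tick % periods[r] != 0:
                continue
            new = coef * sum(views[r][j] for j in range(nprocs) if j != r) + 1.0
            local_residual = abs(new - x[r])
            x[r] = new
            views[r][r] = new
            in_flight.append((tick + delay, r, new))
            # No snapshot, no detection protocol: just feed the pending reduction.
            reduction.contribute(r, local_residual)

        # Test completion without blocking; restart the reduction if not converged.
        if reduction.done():
            global_residual = reduction.result()
            if global_residual < eps:
                return list(x), global_residual, tick
            reduction = ToyNonBlockingMax(nprocs)

    raise RuntimeError("no convergence detected within max_ticks")
```

Calling `run_async_iteration()` returns the final iterate, the reduced residual that triggered termination, and the tick at which it was detected. In this toy setting the fastest rank can report a zero residual simply because no new data has arrived yet, which hints at why the reliability of such an unprotected reduction is an experimental question rather than a guarantee.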
