Paper Title


Multi-Level Local SGD for Heterogeneous Hierarchical Networks

Authors

Timothy Castiglia, Anirban Das, Stacy Patterson

Abstract


We propose Multi-Level Local SGD, a distributed gradient method for learning a smooth, non-convex objective in a heterogeneous multi-level network. Our network model consists of a set of disjoint sub-networks, with a single hub and multiple worker nodes; further, worker nodes may have different operating rates. The hubs exchange information with one another via a connected, but not necessarily complete communication network. In our algorithm, sub-networks execute a distributed SGD algorithm, using a hub-and-spoke paradigm, and the hubs periodically average their models with neighboring hubs. We first provide a unified mathematical framework that describes the Multi-Level Local SGD algorithm. We then present a theoretical analysis of the algorithm; our analysis shows the dependence of the convergence error on the worker node heterogeneity, hub network topology, and the number of local, sub-network, and global iterations. We back up our theoretical results via simulation-based experiments using both convex and non-convex objectives.
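The three-level structure described in the abstract can be illustrated with a minimal simulation sketch. This is a hypothetical toy implementation, not the authors' code: each worker holds a simple quadratic objective f(x) = 0.5*(x - c)^2 (so its full gradient is used in place of a stochastic one), workers run local gradient steps, each hub averages its workers' models (a sub-network iteration), and hubs then average with neighboring hubs via a mixing matrix (a global iteration). The function name, the scalar model, and the quadratic objective are all illustrative choices.

```python
# Hypothetical sketch of the multi-level local SGD structure from the abstract.
# Simplifications: scalar models, deterministic gradients of 0.5*(x - c)^2,
# and a doubly stochastic hub mixing matrix standing in for the hub network.
import numpy as np

def multi_level_local_sgd(centers, hub_mixing, local_steps=5,
                          sub_iters=4, global_iters=10, lr=0.1):
    """centers[h] lists each worker's optimum c in sub-network h.
    hub_mixing: doubly stochastic matrix over the hub graph (rows sum to 1)."""
    n_hubs = len(centers)
    hub_models = np.zeros(n_hubs)               # one model per hub
    for _ in range(global_iters):
        for _ in range(sub_iters):              # sub-network iterations
            new_models = []
            for h in range(n_hubs):
                worker_models = []
                for c in centers[h]:
                    x = hub_models[h]           # worker starts from hub model
                    for _ in range(local_steps):
                        x -= lr * (x - c)       # gradient of 0.5*(x - c)^2
                    worker_models.append(x)
                # hub-and-spoke step: hub averages its workers' models
                new_models.append(np.mean(worker_models))
            hub_models = np.array(new_models)
        # global iteration: hubs average with their neighbors
        hub_models = hub_mixing @ hub_models
    return hub_models

# Two sub-networks of two workers each; the joint optimum is the mean
# of all worker optima, (0 + 2 + 4 + 6) / 4 = 3.
centers = [[0.0, 2.0], [4.0, 6.0]]
hub_mixing = np.array([[0.5, 0.5], [0.5, 0.5]])  # complete 2-hub graph
models = multi_level_local_sgd(centers, hub_mixing)
```

With a complete hub graph both hub models contract toward the global optimum 3 at each global round; a sparser (connected but not complete) mixing matrix slows this agreement, mirroring the topology dependence in the paper's convergence analysis.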
