Paper Title
Hybrid Learning for Orchestrating Deep Learning Inference in Multi-user Edge-cloud Networks
Paper Authors
Paper Abstract
Deep-learning-based intelligent services have become prevalent in cyber-physical applications including smart cities and healthcare. Collaborative end-edge-cloud computing for deep learning provides a range of performance and efficiency options that can address application requirements through computation offloading. The decision to offload computation is a communication-computation co-optimization problem that varies with both system parameters (e.g., network conditions) and workload characteristics (e.g., inputs). Identifying the optimal orchestration, considering cross-layer opportunities and requirements in the face of varying system dynamics, is a challenging multi-dimensional problem. While Reinforcement Learning (RL) approaches have been proposed earlier, they suffer from a large number of trial-and-error interactions during the learning process, resulting in excessive time and resource consumption. We present a Hybrid Learning orchestration framework that reduces the number of interactions with the system environment by combining model-based and model-free reinforcement learning. Our deep learning inference orchestration strategy employs reinforcement learning to find the optimal orchestration policy. Furthermore, we deploy Hybrid Learning (HL) to accelerate the RL learning process and reduce the number of direct samplings. We demonstrate the efficacy of our HL strategy through experimental comparison with state-of-the-art RL-based inference orchestration, showing that our HL strategy accelerates the learning process by up to 166.6x.
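The core idea of combining model-based and model-free RL to reduce real environment interactions can be illustrated with a classic Dyna-style sketch. This is a minimal, generic illustration and not the paper's actual algorithm: the `DynaQ` class, its state/action encoding, and all hyperparameters are assumptions for demonstration only.

```python
import random

class DynaQ:
    """Minimal Dyna-Q sketch (illustrative, not the paper's method):
    model-free Q-learning updates from real transitions are combined
    with model-based planning updates replayed from a learned model,
    cutting the number of real environment samples needed."""

    def __init__(self, n_states, n_actions,
                 alpha=0.1, gamma=0.95, planning_steps=10):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.model = {}  # learned model: (state, action) -> (reward, next_state)
        self.alpha = alpha
        self.gamma = gamma
        self.planning_steps = planning_steps

    def update(self, s, a, r, s2):
        # 1) Model-free: Q-learning update from the real transition.
        self._q_update(s, a, r, s2)
        # 2) Model-based: record the transition in the learned model.
        self.model[(s, a)] = (r, s2)
        # 3) Planning: replay simulated transitions from the model,
        #    so each costly real sample yields many Q-updates.
        for _ in range(self.planning_steps):
            (ps, pa), (pr, ps2) = random.choice(list(self.model.items()))
            self._q_update(ps, pa, pr, ps2)

    def _q_update(self, s, a, r, s2):
        best_next = max(self.q[s2])
        self.q[s][a] += self.alpha * (r + self.gamma * best_next - self.q[s][a])
```

In an offloading setting, states could encode network conditions and workload features, and actions could select the execution tier (end, edge, or cloud); the planning loop is what lets the agent converge with far fewer direct samplings of the real system.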