Title
Deep reinforcement learning for RAN optimization and control
Authors
Abstract
Due to the high variability of traffic in the radio access network (RAN), fixed network configurations are not flexible enough to achieve optimal performance. Our vendors provide several eNodeB settings for optimizing RAN performance, such as the media access control (MAC) scheduler, load balancing, etc. However, the detailed mechanisms behind the eNodeB configurations are usually very complicated and not disclosed, not to mention the large space of key performance indicators (KPIs) that must be considered. These factors make it difficult to construct a simulator, tune offline, or build rule-based solutions. We aim to build an intelligent controller that requires no strong assumptions or domain knowledge about the RAN and can run 24/7 without supervision. To achieve this goal, we first build a closed-loop control testbed RAN in a lab environment, with one eNodeB provided by one of the largest wireless vendors and four smartphones. Next, we build a double Q-network agent trained with live feedback of the key performance indicators from the RAN. Our work demonstrates the effectiveness of applying deep reinforcement learning to improve network performance in a real RAN environment.
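The core of the agent described above is the double Q-network update rule, in which the online network selects the next action and a separate target network evaluates it, reducing the overestimation bias of plain Q-learning. The following is a minimal sketch of that update, assuming a small discrete state/action space and tabular Q-values as stand-ins for the two networks; the actual network architecture, RAN state encoding, and KPI-based reward are not specified in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS = 8, 4   # toy sizes; real RAN state/action spaces differ
GAMMA, LR = 0.9, 0.1

# Online and target Q-tables (stand-ins for the online and target networks).
q_online = np.zeros((N_STATES, N_ACTIONS))
q_target = np.zeros((N_STATES, N_ACTIONS))

def double_q_update(s, a, r, s_next):
    """Double DQN target: action chosen by the online net,
    value taken from the target net."""
    a_star = int(np.argmax(q_online[s_next]))          # select with online net
    td_target = r + GAMMA * q_target[s_next, a_star]   # evaluate with target net
    q_online[s, a] += LR * (td_target - q_online[s, a])

# Example transitions (state, action, reward, next_state); in the paper's
# setting the reward would come from live RAN KPIs (e.g., throughput).
for _ in range(500):
    s = int(rng.integers(N_STATES))
    a = int(rng.integers(N_ACTIONS))
    r = 1.0 if a == s % N_ACTIONS else 0.0             # toy KPI-style reward
    s_next = int(rng.integers(N_STATES))
    double_q_update(s, a, r, s_next)
    # Softly sync the target net toward the online net.
    q_target = 0.99 * q_target + 0.01 * q_online
```

The soft target update shown here is one common choice; periodic hard copies of the online network into the target network are equally standard.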