Paper title
Correlation-aware Cooperative Multigroup Broadcast 360° Video Delivery Network: A Hierarchical Deep Reinforcement Learning Approach
Paper authors
Paper abstract
Given the stringent requirement of receiving video from unmanned aerial vehicles (UAVs) anywhere in the stadium at sports events, and the significantly high per-cell throughput needed for video transmission to virtual reality (VR) users, a promising solution is a cell-free multi-group broadcast (CF-MB) network with cooperative reception and broadcast access points (APs). To exploit the benefit of broadcasting user-correlated, decode-dependent video resources to spatially correlated VR users, the network should dynamically schedule the video and cluster APs into virtual cells, each serving a different group of VR users with overlapping video requests. By decomposing the problem into scheduling and association sub-problems, we first introduce conventional non-learning-based scheduling and association algorithms, and a centralized deep reinforcement learning (DRL) association approach based on the Rainbow agent with a convolutional neural network (CNN) that generates decisions from observations. To reduce its complexity, we then decompose the association problem into multiple sub-problems, resulting in a networked distributed partially observable Markov decision process (ND-POMDP). To solve it, we propose a multi-agent DRL algorithm. To jointly solve the coupled association and scheduling problems, we further develop a hierarchical federated DRL algorithm with the scheduler as the meta-controller and the association agent as the controller. Our simulation results show that the CF-MB network can effectively handle real-time video transmission from UAVs to VR users. Our proposed learning architectures are effective and scalable for a high-dimensional cooperative association problem with increasing numbers of APs and VR users. In addition, our proposed algorithms significantly outperform the non-learning-based methods.