Paper title
AI-based Robust Resource Allocation in End-to-End Network Slicing under Demand and CSI Uncertainties
Paper authors
Paper abstract
Network slicing (NwS) is one of the main technologies in the fifth generation of mobile communications and beyond (5G+). One of the important challenges in NwS is information uncertainty, which mainly involves demand and channel state information (CSI). Demand uncertainty is divided into three types: the number of user requests, the amount of bandwidth, and the requested virtual network function (VNF) workloads. Moreover, CSI uncertainty is modeled by three methods: worst-case, probabilistic, and hybrid. In this paper, our goal is to maximize the utility of the infrastructure provider by exploiting deep reinforcement learning algorithms in end-to-end NwS resource allocation under demand and CSI uncertainties. The proposed formulation is a nonconvex mixed-integer nonlinear programming problem. To perform robust resource allocation in problems that involve uncertainty, a history of previous information is needed. To this end, we use the recurrent deterministic policy gradient (RDPG) algorithm, a recurrent and memory-based approach in deep reinforcement learning. We then compare the RDPG method in different scenarios with the soft actor-critic (SAC), deep deterministic policy gradient (DDPG), distributed, and greedy algorithms. The simulation results show that the SAC method outperforms the DDPG, distributed, and greedy methods, in that order. Moreover, the RDPG method outperforms the SAC approach by 70% on average.
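To illustrate the key idea the abstract attributes to RDPG — a deterministic policy that conditions on the *history* of observations through a recurrent hidden state, rather than on the current observation alone — here is a minimal, self-contained sketch. All class names, dimensions, and weight initializations are illustrative assumptions, not the paper's actual architecture or the NwS environment:

```python
import math
import random

class RecurrentActor:
    """Toy recurrent deterministic actor (the memory mechanism behind RDPG).

    The hidden state h_t summarizes past observations, so under demand/CSI
    uncertainty the action can depend on history, not just the latest sample.
    Weights are random placeholders; real RDPG trains them with the
    recurrent deterministic policy gradient.
    """

    def __init__(self, obs_dim, hidden_dim, act_dim, seed=0):
        rng = random.Random(seed)
        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]
        self.W_in = mat(hidden_dim, obs_dim)      # observation -> hidden
        self.W_rec = mat(hidden_dim, hidden_dim)  # hidden -> hidden (memory)
        self.W_out = mat(act_dim, hidden_dim)     # hidden -> action
        self.h = [0.0] * hidden_dim

    def reset(self):
        """Clear the memory at the start of an episode."""
        self.h = [0.0] * len(self.h)

    def step(self, obs):
        """h_t = tanh(W_in obs_t + W_rec h_{t-1});  a_t = tanh(W_out h_t)."""
        new_h = []
        for i in range(len(self.h)):
            s = sum(w * o for w, o in zip(self.W_in[i], obs))
            s += sum(w * h for w, h in zip(self.W_rec[i], self.h))
            new_h.append(math.tanh(s))
        self.h = new_h
        # Deterministic action, e.g. normalized per-slice resource shares
        # (an assumption for illustration).
        return [math.tanh(sum(w * h for w, h in zip(row, self.h)))
                for row in self.W_out]

actor = RecurrentActor(obs_dim=3, hidden_dim=8, act_dim=2)
a1 = actor.step([0.5, 0.1, -0.2])  # action after the first observation
a2 = actor.step([0.5, 0.1, -0.2])  # same observation, different action:
                                   # the hidden state (history) has changed
```

Feeding the same observation twice yields different actions because the recurrent state carries the history forward — the property the abstract cites as the reason to prefer RDPG over memoryless methods like DDPG under uncertainty.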