Paper Title
Reinforcement Learning Based Power Control for Reliable Mission-Critical Wireless Transmission
Paper Authors
Paper Abstract
In this paper, we investigate sequential power allocation over fast-varying channels for mission-critical applications, aiming to minimize the expected sum power while guaranteeing the transmission success probability. In particular, a reinforcement learning framework is constructed with an appropriate reward design so that the optimal policy maximizes the Lagrangian of the primal problem, where the maximizer of the Lagrangian is shown to have several desirable properties. For the model-based case, a fast-converging algorithm is proposed to find the optimal Lagrange multiplier and, in turn, the corresponding optimal policy. For the model-free case, we develop a three-stage strategy consisting, in order, of online sampling, offline learning, and online operation, where a backward Q-learning scheme that fully exploits the sampled channel realizations is designed to accelerate the learning process. According to our simulations, the proposed reinforcement learning framework solves the primal optimization problem from the dual perspective. Moreover, the model-free strategy achieves performance close to that of the optimal model-based algorithm.
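To make the reward design and the three-stage strategy more concrete, below is a minimal Python sketch under strong simplifying assumptions: a toy quantized channel with an i.i.d. gain each slot, a deterministic per-slot success model, and a fixed Lagrange multiplier `LAM` (the paper instead searches for the optimal multiplier). All constants, the success model, and the sample-averaged backward recursion here are illustrative stand-ins, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# ---- Toy problem setup (all constants below are illustrative assumptions) ----
N = 4                                      # slots available to deliver one packet
GAINS = np.array([0.2, 0.5, 1.0, 2.0])     # quantized channel-gain states
GAIN_P = np.array([0.3, 0.3, 0.25, 0.15])  # gain distribution (unknown to the learner)
POWERS = np.array([0.0, 0.5, 1.0, 2.0])    # discrete transmit power levels
LAM = 12.0                                 # fixed Lagrange multiplier weighting success
SNR_TH = 1.0                               # toy model: success iff g * p >= SNR_TH

def succeeds(g, p):
    return g * p >= SNR_TH

# ---- Stage 1: online sampling of channel realizations ----
samples = rng.choice(len(GAINS), size=5000, p=GAIN_P)  # indices of observed gains

# ---- Stage 2: offline backward learning ----
# Q[t, g, a]: value of using power level a in gain state g at slot t, given the
# packet is not yet delivered. Reward design: -power per slot plus LAM upon
# successful delivery, so the greedy policy maximizes a Lagrangian of the
# constrained problem (expected sum power vs. success-probability constraint).
Q = np.zeros((N, len(GAINS), len(POWERS)))
for t in range(N - 1, -1, -1):
    # Continuation value of reaching slot t+1 undelivered, averaged over the
    # sampled gains (i.e., fully reusing the sample set, in the spirit of the
    # paper's backward Q-learning).
    if t == N - 1:
        v_next = 0.0                       # packet lost after the last slot
    else:
        v_next = Q[t + 1, samples].max(axis=1).mean()
    for gi, g in enumerate(GAINS):
        for ai, p in enumerate(POWERS):
            if succeeds(g, p):
                Q[t, gi, ai] = -p + LAM
            else:
                Q[t, gi, ai] = -p + v_next

policy = Q.argmax(axis=2)                  # greedy power index per (slot, gain)

# ---- Stage 3: online operation of the learned policy ----
def run_episode():
    used = 0.0
    for t in range(N):
        g = rng.choice(len(GAINS), p=GAIN_P)
        p = POWERS[policy[t, g]]
        used += p
        if succeeds(GAINS[g], p):
            return used, True
    return used, False

results = [run_episode() for _ in range(2000)]
powers, oks = zip(*results)
print(f"avg sum power {np.mean(powers):.3f}, success rate {np.mean(oks):.3f}")
```

In this sketch, raising `LAM` trades higher average sum power for a higher success rate, which mirrors how tuning the Lagrange multiplier enforces the success-probability constraint in the dual approach described above.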