Title
Fast Reinforcement Learning for Anti-jamming Communications
Authors
Abstract
This letter presents a fast reinforcement learning algorithm for anti-jamming communications that chooses the previous action with probability $τ$ and applies $ε$-greedy with probability $(1-τ)$. A dynamic threshold based on the average value of the several previous actions is designed, and the probability $τ$ is formulated as a Gaussian-like function to guide the wireless devices. As a concrete example, the proposed algorithm is implemented in a wireless communication system against multiple jammers. Experimental results demonstrate that the proposed algorithm outperforms Q-learning, deep Q-networks (DQN), double DQN (DDQN), and prioritized experience replay based DDQN (PDDQN) in terms of signal-to-interference-plus-noise ratio and convergence rate.
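The action-selection rule described in the abstract can be sketched as follows. This is a minimal illustration only, not the paper's implementation: the function names, the choice of reward history as the "previous actions" statistic, and the exact Gaussian-like form of $τ$ are all assumptions for the sketch.

```python
import math
import random

def select_action(q_values, prev_action, prev_rewards, epsilon=0.1, sigma=1.0):
    """Sketch of the tau-guided selection rule (names and tau form assumed).

    With probability tau, repeat the previous action; with probability
    (1 - tau), fall back to standard epsilon-greedy over the Q-values.
    tau is a Gaussian-like function of how far the latest reward falls
    below a dynamic threshold (here, the mean over a recent window), so
    a device that is performing well tends to keep its current action
    and explores only when performance drops.
    """
    # Dynamic threshold: average over the recent window.
    threshold = sum(prev_rewards) / len(prev_rewards)
    latest = prev_rewards[-1]
    # Gaussian-like tau: 1 when the latest reward meets the threshold,
    # decaying toward 0 as it falls below it.
    deficit = max(threshold - latest, 0.0)
    tau = math.exp(-deficit ** 2 / (2 * sigma ** 2))
    if random.random() < tau:
        return prev_action  # stick with the previous action
    # epsilon-greedy branch: explore uniformly with prob epsilon, else exploit.
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

When the latest reward meets or exceeds the threshold, $τ = 1$ and the previous action is always repeated; a large shortfall drives $τ$ toward 0 and the rule reduces to ordinary $ε$-greedy.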