Paper Title

Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning

Paper Authors

Yongyuan Liang, Yanchao Sun, Ruijie Zheng, Furong Huang

Paper Abstract

Recent studies reveal that a well-trained deep reinforcement learning (RL) policy can be particularly vulnerable to adversarial perturbations on input observations. Therefore, it is crucial to train RL agents that are robust against any attacks with a bounded budget. Existing robust training methods in deep RL either treat correlated steps separately, ignoring the robustness of long-term rewards, or train the agents and RL-based attacker together, doubling the computational burden and sample complexity of the training process. In this work, we propose a strong and efficient robust training framework for RL, named Worst-case-aware Robust RL (WocaR-RL) that directly estimates and optimizes the worst-case reward of a policy under bounded l_p attacks without requiring extra samples for learning an attacker. Experiments on multiple environments show that WocaR-RL achieves state-of-the-art performance under various strong attacks, and obtains significantly higher training efficiency than prior state-of-the-art robust training methods. The code of this work is available at https://github.com/umd-huang-lab/WocaR-RL.
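To make the idea in the abstract concrete, below is a minimal PyTorch sketch of one way a worst-case-aware objective can be set up: a worst-case action-value head is trained against a pessimistic (minimum over bounded perturbations) Bellman target, and its estimate regularizes the policy update, so no separate RL-based attacker needs to be trained. Everything here is an assumption made for illustration rather than the paper's actual implementation: the names (WorstCaseAwareActorCritic, worst_case_bootstrap, worst_case_aware_loss), the kappa weighting, and especially the random-sampling approximation of the worst case are hypothetical; see the linked repository (https://github.com/umd-huang-lab/WocaR-RL) for the authors' method.

```python
# Minimal illustrative sketch (not the authors' released implementation) of a
# worst-case-aware RL objective: learn a worst-case action-value head under
# bounded l_inf observation perturbations and use it to regularize the policy
# update, so no separately trained RL attacker is required. All names and the
# random perturbation sampling below are assumptions made for this sketch.

import torch
import torch.nn as nn
import torch.nn.functional as F


class WorstCaseAwareActorCritic(nn.Module):
    """Hypothetical discrete-action actor-critic with a worst-case Q head."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()

        def mlp(out_dim):
            return nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, out_dim))

        self.policy = mlp(act_dim)      # action logits
        self.value = mlp(1)             # standard value estimate
        self.worst_q = mlp(act_dim)     # worst-case action values under attack


def worst_case_bootstrap(model, rewards, next_obs, gamma, eps, n_samples=8):
    """Pessimistic Bellman target: minimum bootstrapped value over random
    perturbations inside the eps-ball. (The paper estimates the perturbed
    value more carefully; random sampling is only a cheap stand-in here.)"""
    with torch.no_grad():
        noise = (torch.rand(n_samples, *next_obs.shape) * 2.0 - 1.0) * eps
        perturbed = next_obs.unsqueeze(0) + noise               # [n, B, obs_dim]
        pi_next = F.softmax(model.policy(perturbed), dim=-1)    # [n, B, act_dim]
        q_next = model.worst_q(perturbed)                       # [n, B, act_dim]
        v_next = (pi_next * q_next).sum(dim=-1)                 # [n, B]
        return rewards + gamma * v_next.min(dim=0).values       # [B]


def worst_case_aware_loss(model, obs, actions, returns, rewards, next_obs,
                          gamma=0.99, eps=0.05, kappa=0.5):
    """Standard policy-gradient + value loss, plus a worst-case term weighted
    by kappa, mirroring the abstract's 'directly estimates and optimizes the
    worst-case reward'."""
    logits = model.policy(obs)
    probs = F.softmax(logits, dim=-1)
    logp_a = F.log_softmax(logits, dim=-1).gather(1, actions.unsqueeze(1)).squeeze(1)
    v = model.value(obs).squeeze(-1)

    adv = (returns - v).detach()
    pg_loss = -(logp_a * adv).mean()                            # vanilla policy gradient
    value_loss = F.mse_loss(v, returns)

    # Fit the worst-case Q head to the pessimistic bootstrapped target ...
    wc_target = worst_case_bootstrap(model, rewards, next_obs, gamma, eps)
    q_wc_sa = model.worst_q(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    wc_value_loss = F.mse_loss(q_wc_sa, wc_target)

    # ... and push the policy toward actions with higher worst-case value.
    wc_policy_term = -(probs * model.worst_q(obs).detach()).sum(dim=-1).mean()

    return pg_loss + value_loss + kappa * (wc_value_loss + wc_policy_term)
```

The point of this structure is that the worst-case term only adds extra forward passes on perturbed observations to an otherwise standard actor-critic update, which is what allows the worst-case estimate to be learned without the doubled sample complexity of training an attacker alongside the agent.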
