Paper Title

Sampling Attacks on Meta Reinforcement Learning: A Minimax Formulation and Complexity Analysis

Authors

Tao Li, Haozhe Lei, Quanyan Zhu

Abstract

Meta reinforcement learning (meta RL), as a combination of meta-learning ideas and reinforcement learning (RL), enables the agent to adapt to different tasks using a few samples. However, this sampling-based adaptation also makes meta RL vulnerable to adversarial attacks. By manipulating the reward feedback from sampling processes in meta RL, an attacker can mislead the agent into building wrong knowledge from training experience, which deteriorates the agent's performance when dealing with different tasks after adaptation. This paper provides a game-theoretic underpinning for understanding this type of security risk. In particular, we formally define the sampling attack model as a Stackelberg game between the attacker and the agent, which yields a minimax formulation. It leads to two online attack schemes: Intermittent Attack and Persistent Attack, which enable the attacker to learn an optimal sampling attack, defined by an $ε$-first-order stationary point, within $\mathcal{O}(ε^{-2})$ iterations. These attack schemes free-ride on the agent's learning progress without requiring extra interactions with the environment. By corroborating the convergence results with numerical experiments, we observe that a minor effort by the attacker can significantly deteriorate the learning performance, and that the minimax approach can also help robustify meta RL algorithms.
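To make the minimax formulation concrete, a schematic sketch follows. The notation here is illustrative and not taken from the paper: $\delta$ denotes the attacker's reward perturbation drawn from a budget set $\Delta$, $\theta$ the agent's meta-parameters, $\theta'(\theta;\, r_{\mathcal{T}}+\delta)$ the parameters adapted from samples with perturbed rewards, and $J_{\mathcal{T}}$ the post-adaptation return on task $\mathcal{T}\sim p(\mathcal{T})$:

$$ \min_{\delta \in \Delta} \; \max_{\theta} \; \mathbb{E}_{\mathcal{T} \sim p(\mathcal{T})}\!\left[ J_{\mathcal{T}}\!\left(\theta'(\theta;\, r_{\mathcal{T}} + \delta)\right) \right] $$

Under this reading, the attacker acts as the Stackelberg leader, committing to a perturbation of the sampled rewards, while the meta RL agent is the follower that maximizes its expected post-adaptation return on the corrupted experience. Writing the attacker's objective as $f(\delta) := \max_{\theta} \mathbb{E}_{\mathcal{T}}\big[J_{\mathcal{T}}(\theta'(\theta;\, r_{\mathcal{T}}+\delta))\big]$, an $ε$-first-order stationary point is a perturbation with $\|\nabla f(\delta)\| \le ε$, which the Intermittent and Persistent Attack schemes reach within $\mathcal{O}(ε^{-2})$ iterations.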
