A Framework for Understanding and Visualizing Strategies of RL Agents

Paper Title

A Framework for Understanding and Visualizing Strategies of RL Agents

Authors

Pedro Sequeira, Daniel Elenius, Jesse Hostetler, Melinda Gervasio

Abstract

Recent years have seen significant advances in explainable AI as the need to understand deep learning models has gained importance with the increased emphasis on trust and ethics in AI. Comprehensible models for sequential decision tasks are a particular challenge as they require understanding not only individual predictions but a series of predictions that interact with environmental dynamics. We present a framework for learning comprehensible models of sequential decision tasks in which agent strategies are characterized using temporal logic formulas. Given a set of agent traces, we first cluster the traces using a novel embedding method that captures frequent action patterns. We then search for logical formulas that explain the agent strategies in the different clusters. We evaluate our framework on combat scenarios in StarCraft II (SC2), using traces from a handcrafted expert policy and a trained reinforcement learning agent. We implemented a feature extractor for SC2 environments that extracts traces as sequences of high-level features describing both the state of the environment and the agent's local behavior from agent replays. We further designed a visualization tool depicting the movement of units in the environment that helps understand how different task conditions lead to distinct agent behavior patterns in each trace cluster. Experimental results show that our framework is capable of separating agent traces into distinct groups of behaviors for which our approach to strategy inference produces consistent, meaningful, and easily understood strategy descriptions.
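
The two ideas named in the abstract, embedding traces by frequent action patterns and describing strategies with temporal logic formulas, can be illustrated with a minimal sketch. This is not the authors' method: the n-gram count embedding, the cosine similarity, the `eventually`/`always` helpers, and the example traces are all assumptions chosen for illustration, and the paper's actual embedding and formula search are not specified here.

```python
from collections import Counter
from math import sqrt


def ngram_embedding(trace, n=2):
    """Count contiguous action n-grams in one trace (a list of action names).

    Assumption: n-gram counts stand in for the paper's embedding of
    frequent action patterns, which the abstract does not detail.
    """
    return Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def eventually(trace, predicate):
    """Finite-trace 'eventually': the predicate holds at some step."""
    return any(predicate(step) for step in trace)


def always(trace, predicate):
    """Finite-trace 'always': the predicate holds at every step."""
    return all(predicate(step) for step in trace)


# Hypothetical traces; the action names are illustrative, not SC2 features.
attack_trace = ["move", "attack", "attack", "attack"]
rush_trace = ["move", "move", "attack", "attack"]
retreat_trace = ["attack", "retreat", "retreat", "retreat"]

embeddings = {
    "attack": ngram_embedding(attack_trace),
    "rush": ngram_embedding(rush_trace),
    "retreat": ngram_embedding(retreat_trace),
}

# Traces with shared action patterns score higher and would tend to cluster.
sim_attack_rush = cosine_similarity(embeddings["attack"], embeddings["rush"])
sim_attack_retreat = cosine_similarity(embeddings["attack"], embeddings["retreat"])

# A candidate strategy description for the hypothetical retreat trace.
retreats_eventually = eventually(retreat_trace, lambda a: a == "retreat")
never_retreats = always(attack_trace, lambda a: a != "retreat")
```

In this sketch the similarity scores could feed any standard clustering routine, and a formula such as "eventually retreat" would be kept as a description of a cluster if it holds on that cluster's traces and fails on the others.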
