A Generative User Simulator with GPT-based Architecture and Goal State Tracking for Reinforced Multi-Domain Dialog Systems

Paper title

A Generative User Simulator with GPT-based Architecture and Goal State Tracking for Reinforced Multi-Domain Dialog Systems

Authors

Hong Liu, Yucheng Cai, Zhijian Ou, Yi Huang, Junlan Feng

Abstract

Building user simulators (USs) for reinforcement learning (RL) of task-oriented dialog systems (DSs) has attracted growing attention, but it still faces several fundamental challenges. First, it is unclear whether pretrained language models can be leveraged to design USs (for example, GPT-2 based USs) that can catch up and interact with recently advanced GPT-2 based DSs. Second, an important ingredient of a US is that the user goal can be effectively incorporated and tracked, but how to flexibly integrate goal state tracking and develop an end-to-end trainable US for multiple domains remains a challenge. In this work, we propose a generative user simulator (GUS) with a GPT-2 based architecture and goal state tracking to address these two challenges. Extensive experiments are conducted on MultiWOZ2.1. Different DSs are trained via RL with GUS, the classic agenda-based user simulator (ABUS), and other ablation simulators, respectively, and are compared in cross-model evaluation, corpus-based evaluation, and human evaluation. GUS achieves superior results in all three evaluation tasks.

Paper title