Paper Title
Privacy-Preserving Domain Adaptation of Semantic Parsers
Paper Authors
Paper Abstract
Task-oriented dialogue systems often assist users with personal or confidential matters. For this reason, the developers of such a system are generally prohibited from observing actual usage. So how can they know where the system is failing and needs more training data or new functionality? In this work, we study ways in which realistic user utterances can be generated synthetically, to help increase the linguistic and functional coverage of the system, without compromising the privacy of actual users. To this end, we propose a two-stage Differentially Private (DP) generation method that first generates latent semantic parses and then generates utterances based on those parses. Our proposed approach improves MAUVE by 2.5$\times$ and parse tree function type overlap by 1.3$\times$ relative to current approaches for private synthetic data generation, improving both fluency and semantic coverage. We further validate our approach on a realistic domain adaptation task of adding new functionality from private user data to a semantic parser, and show overall gains of 8.5 percentage points in accuracy with the new feature.
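The two-stage idea in the abstract (first produce private-safe parses, then produce utterances conditioned on them) can be illustrated with a toy sketch. This is not the authors' implementation. The abstract does not specify the models or the DP mechanism, so the sketch swaps in a Laplace-noised histogram over parse function types for stage 1 and public templates for stage 2. The names `FUNCTION_TYPES`, `TEMPLATES`, `dp_parse_distribution`, and `generate` are hypothetical, and the sketch assumes each user contributes exactly one parse, so the histogram has sensitivity 1.

```python
import random
from collections import Counter

# Hypothetical public inventory of parse function types. It is fixed in
# advance, so the set of histogram bins does not depend on private data.
FUNCTION_TYPES = ["CreateEvent", "FindEvent", "DeleteEvent", "WeatherQuery"]

# Hypothetical public templates standing in for a parse-to-utterance model.
# They are not derived from private data, so stage 2 uses no privacy budget.
TEMPLATES = {
    "CreateEvent": ["schedule a meeting tomorrow", "add lunch to my calendar"],
    "FindEvent": ["when is my next meeting", "what is on my calendar today"],
    "DeleteEvent": ["cancel my 3pm meeting", "remove the lunch event"],
    "WeatherQuery": ["will it rain tomorrow", "what is the weather today"],
}


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_parse_distribution(private_parses, epsilon, rng):
    """Stage 1 (toy): epsilon-DP histogram over function types.

    Assumes one parse per user, so adding or removing a user changes
    one count by 1 and Laplace noise of scale 1/epsilon suffices.
    Parses whose type is outside FUNCTION_TYPES are ignored.
    """
    counts = Counter(private_parses)
    noisy = {
        f: max(0.0, counts.get(f, 0) + laplace_noise(1.0 / epsilon, rng))
        for f in FUNCTION_TYPES
    }
    total = sum(noisy.values())
    if total == 0.0:
        # All noisy counts were clipped to zero: fall back to uniform.
        return {f: 1.0 / len(FUNCTION_TYPES) for f in FUNCTION_TYPES}
    return {f: v / total for f, v in noisy.items()}


def generate(private_parses, n, epsilon, seed=0):
    """Sample n synthetic (parse, utterance) pairs."""
    rng = random.Random(seed)
    dist = dp_parse_distribution(private_parses, epsilon, rng)
    weights = [dist[f] for f in FUNCTION_TYPES]
    # Stage 1: sample parses from the noised distribution.
    parses = rng.choices(FUNCTION_TYPES, weights=weights, k=n)
    # Stage 2: generate an utterance conditioned on each sampled parse.
    return [(p, rng.choice(TEMPLATES[p])) for p in parses]


if __name__ == "__main__":
    private = ["CreateEvent"] * 50 + ["FindEvent"] * 30 + ["WeatherQuery"] * 20
    for parse, utterance in generate(private, n=5, epsilon=1.0):
        print(parse, "->", utterance)
```

Only stage 1 reads the private data in this sketch, and clipping, normalizing, and sampling are post-processing of the noised counts, so they do not add privacy cost. A method like the one the abstract describes would replace both stages with learned generators.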