Foreseeing the Benefits of Incidental Supervision

Paper Title

Foreseeing the Benefits of Incidental Supervision

Authors

Hangfeng He, Mingyuan Zhang, Qiang Ning, Dan Roth

Abstract

Real-world applications often require improved models by leveraging a range of cheap incidental supervision signals. These could include partial labels, noisy labels, knowledge-based constraints, and cross-domain or cross-task annotations -- all having statistical associations with gold annotations but not exactly the same. However, we currently lack a principled way to measure the benefits of these signals to a given target task, and the common practice of evaluating these benefits is through exhaustive experiments with various models and hyperparameters. This paper studies whether we can, in a single framework, quantify the benefits of various types of incidental signals for a given target task without going through combinatorial experiments. We propose a unified PAC-Bayesian motivated informativeness measure, PABI, that characterizes the uncertainty reduction provided by incidental supervision signals. We demonstrate PABI's effectiveness by quantifying the value added by various types of incidental signals to sequence tagging tasks. Experiments on named entity recognition (NER) and question answering (QA) show that PABI's predictions correlate well with learning performance, providing a promising way to determine, ahead of learning, which supervision signals would be beneficial.
