Paper Title

Referee: Reference-Free Sentence Summarization with Sharper Controllability through Symbolic Knowledge Distillation

Paper Authors

Melanie Sclar, Peter West, Sachin Kumar, Yulia Tsvetkov, Yejin Choi

Paper Abstract

We present Referee, a novel framework for sentence summarization that can be trained reference-free (i.e., requiring no gold summaries for supervision), while allowing direct control for compression ratio. Our work is the first to demonstrate that reference-free, controlled sentence summarization is feasible via the conceptual framework of Symbolic Knowledge Distillation (West et al., 2022), where latent knowledge in pre-trained language models is distilled via explicit examples sampled from the teacher models, further purified with three types of filters: length, fidelity, and Information Bottleneck. Moreover, we uniquely propose iterative distillation of knowledge, where student models from the previous iteration of distillation serve as teacher models in the next iteration. Starting off from a relatively modest set of GPT3-generated summaries, we demonstrate how iterative knowledge distillation can lead to considerably smaller, but better summarizers with sharper controllability. A useful by-product of this iterative distillation process is a high-quality dataset of sentence-summary pairs with varying degrees of compression ratios. Empirical results demonstrate that the final student models vastly outperform the much larger GPT3-Instruct model in terms of the controllability of compression ratios, without compromising the quality of resulting summarization.
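To make the pipeline concrete, here is a minimal Python sketch of the iterative distillation loop the abstract describes: sample summaries from a teacher, purify them with the three filters (length, fidelity, Information Bottleneck), train a student on the surviving pairs, and promote that student to teacher for the next round. Every function name and filter heuristic below is a hypothetical stand-in, not the authors' implementation; only the control flow mirrors the abstract.

```python
# Hypothetical sketch of Referee-style iterative symbolic knowledge distillation.
# The filters and training step are illustrative stubs, not the paper's code.
from typing import Callable, List, Tuple

Summarizer = Callable[[str, float], str]

def compression_ratio(source: str, summary: str) -> float:
    return len(summary.split()) / max(len(source.split()), 1)

def length_filter(source: str, summary: str, target: float, tol: float = 0.1) -> bool:
    # Keep only pairs whose compression ratio is close to the requested target.
    return abs(compression_ratio(source, summary) - target) <= tol

def fidelity_filter(source: str, summary: str) -> bool:
    # Stand-in for the paper's fidelity filter: require some lexical overlap.
    return len(set(summary.split()) & set(source.split())) > 0

def info_bottleneck_filter(source: str, summary: str) -> bool:
    # Stand-in for the Information Bottleneck filter: discard trivially short outputs.
    return len(summary.split()) >= 3

def distill(teacher: Summarizer,
            sentences: List[str],
            target_ratio: float,
            train_student: Callable[[List[Tuple[str, str]]], Summarizer],
            n_iterations: int = 3) -> Summarizer:
    """The student from iteration i serves as the teacher for iteration i + 1."""
    for _ in range(n_iterations):
        pairs = [(s, teacher(s, target_ratio)) for s in sentences]
        purified = [(s, y) for s, y in pairs
                    if length_filter(s, y, target_ratio)
                    and fidelity_filter(s, y)
                    and info_bottleneck_filter(s, y)]
        teacher = train_student(purified)  # smaller model trained on purified pairs
    return teacher

if __name__ == "__main__":
    # Toy usage: a "teacher" that truncates to the target ratio, and a "training"
    # step that just returns it; a real run would fine-tune a smaller model.
    def toy_teacher(src: str, ratio: float) -> str:
        words = src.split()
        return " ".join(words[: max(1, int(len(words) * ratio))])

    def toy_train(pairs: List[Tuple[str, str]]) -> Summarizer:
        return toy_teacher

    model = distill(toy_teacher,
                    ["the quick brown fox jumps over the lazy dog"],
                    target_ratio=0.5,
                    train_student=toy_train)
```

A useful side effect of this loop, as the abstract notes, is that the purified `(sentence, summary)` pairs accumulated across iterations form a dataset of sentence-summary pairs at varying compression ratios.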
