Title
PairReranker: Pairwise Reranking for Natural Language Generation
Authors
Abstract
Pre-trained language models have been successful in natural language generation (NLG) tasks. While various decoding methods have been employed, they often produce suboptimal results. We first present an empirical analysis of three NLG tasks: summarization, machine translation, and constrained text generation. We find that selecting the best output from the results of multiple decoding methods can significantly improve performance. To further improve reranking for NLG tasks, we propose a novel method, \textsc{PairReranker}, which uses a single encoder and a pairwise loss function to jointly encode a source input and a pair of candidates and compare them. Experiments on the three NLG tasks demonstrate the effectiveness and flexibility of \textsc{PairReranker}, which shows strong results compared with previous baselines. In addition, \textsc{PairReranker} generalizes to significantly improve GPT-3 (text-davinci-003) results (e.g., by 24.55\% on CommonGen and 11.35\% on WMT18 zh-en), even though our rerankers are not trained with any GPT-3 candidates.
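The inference-time selection the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the real \textsc{PairReranker} uses a learned encoder to score candidate pairs, whereas here `compare` is a hypothetical stand-in scoring function supplied by the caller, and the best candidate is picked by aggregating pairwise wins.

```python
def pairwise_rerank(source, candidates, compare):
    """Select the best candidate via round-robin pairwise comparisons.

    `compare(source, a, b)` is assumed to return a value > 0 when
    candidate `a` is judged better than candidate `b` for `source`
    (a hypothetical interface standing in for a trained pair encoder).
    """
    wins = [0] * len(candidates)
    for i, a in enumerate(candidates):
        for j, b in enumerate(candidates):
            # Count a "win" for candidate i each time it beats another.
            if i != j and compare(source, a, b) > 0:
                wins[i] += 1
    # Return the candidate with the most pairwise wins.
    best = max(range(len(candidates)), key=lambda k: wins[k])
    return candidates[best]
```

With n candidates this makes O(n^2) comparisons, which is the usual cost of pairwise (as opposed to pointwise) reranking; for small candidate pools from a handful of decoding methods this is cheap.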