Evaluation of Text Generation: A Survey

Paper Title

Evaluation of Text Generation: A Survey

Authors

Asli Celikyilmaz, Elizabeth Clark, Jianfeng Gao

Abstract

The paper surveys evaluation methods of natural language generation (NLG) systems that have been developed in the last few years. We group NLG evaluation methods into three categories: (1) human-centric evaluation metrics, (2) automatic metrics that require no training, and (3) machine-learned metrics. For each category, we discuss the progress that has been made and the challenges still being faced, with a focus on the evaluation of recently proposed NLG tasks and neural NLG models. We then present two examples for task-specific NLG evaluations for automatic text summarization and long text generation, and conclude the paper by proposing future research directions.
