Paper Title
Generating Major Types of Chinese Classical Poetry in a Uniformed Framework
Paper Authors
Paper Abstract
Poetry generation is an interesting research topic in the field of text generation. Chinese classical poetry, one of the most valuable parts of China's literary and cultural heritage, has been familiar to and loved by Chinese people for generations. Its language structure has many distinctive characteristics, ranging from form and sound to meaning, so it is regarded as an ideal test task for text generation. In this paper, we propose a unified framework based on GPT-2 for generating the major types of Chinese classical poems. We define a unified format that encodes detailed form information for training samples of all types, and then present a simple form-stressed weighting method for GPT-2 that strengthens control over the form of the generated poems, with special emphasis on forms with longer bodies. Preliminary experimental results show that this enhanced model can generate Chinese classical poems of the major types with high quality in both form and content, which validates the effectiveness of the proposed strategy. The model has been incorporated into Jiuge, the most influential Chinese classical poetry generation system, developed by Tsinghua University (Guo et al., 2019).
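The abstract does not spell out the form-stressed weighting, so the following is only a minimal sketch of one plausible reading: tokens that carry form information (for example punctuation or separators) receive a larger weight in the language-model loss than ordinary content tokens. The function name `form_stressed_nll`, the `form_weight` parameter, the boolean mask, and the normalization by the sum of weights are all assumptions made for illustration, not details taken from the paper.

```python
import math


def form_stressed_nll(token_probs, is_form_token, form_weight=2.0):
    """Weighted mean negative log-likelihood over one sequence.

    token_probs   -- probability the model assigned to each target token
    is_form_token -- True where the target token carries form information
                     (hypothetical mask, e.g. punctuation or separators)
    form_weight   -- assumed extra weight for form tokens; 1.0 gives the
                     ordinary unweighted mean
    """
    if len(token_probs) != len(is_form_token):
        raise ValueError("token_probs and is_form_token must have equal length")
    if not token_probs:
        raise ValueError("sequence must not be empty")

    # Form tokens get form_weight, all other tokens get weight 1.0.
    weights = [form_weight if flag else 1.0 for flag in is_form_token]
    # Per-token negative log-likelihood.
    losses = [-math.log(p) for p in token_probs]

    # Normalizing by the sum of weights (an assumption) keeps the loss on
    # the same scale as the unweighted mean.
    weighted_sum = sum(w * l for w, l in zip(weights, losses))
    return weighted_sum / sum(weights)
```

With `form_weight` above 1.0, a poorly predicted form token raises the loss more than an equally poorly predicted content token, which pushes training toward getting the form right. In a real GPT-2 setup the same idea would be applied to per-token cross-entropy before reduction.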