Paper Title
Incorporating Stylistic Lexical Preferences in Generative Language Models
Paper Authors
Paper Abstract
While recent advances in language modeling have resulted in powerful generation models, their generation style remains implicitly dependent on the training data and cannot emulate a specific target style. Leveraging the generative capabilities of a transformer-based language model, we present an approach to induce certain target-author attributes by incorporating continuous multi-dimensional lexical preferences of an author into generative language models. We introduce rewarding strategies in a reinforcement learning framework that encourage the use of words across multiple categorical dimensions, to varying extents. Our experiments demonstrate that the proposed approach can generate text that distinctively aligns with a given target author's lexical style. We conduct quantitative and qualitative comparisons with competitive and relevant baselines to illustrate the benefits of the proposed approach.
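The abstract describes a policy-gradient setup in which a reward encourages generated text to match a target author's continuous, multi-dimensional lexical-category preferences. The paper does not include code here; the snippet below is only a minimal sketch of that general idea, assuming a LIWC-style word-to-category `LEXICON`, a fixed `CATEGORIES` list, a toy target preference vector, and a plain REINFORCE update — all names, categories, and values are illustrative assumptions, not the authors' implementation.

```python
# Minimal, illustrative sketch (not the paper's code): reward a sample for
# matching a target author's lexical-category usage, then apply REINFORCE.
import torch

# Hypothetical word -> category lexicon and category inventory.
LEXICON = {
    "wonderful": "positive_emotion",
    "terrible": "negative_emotion",
    "think": "cognitive",
    "because": "cognitive",
    "we": "social",
    "you": "social",
}
CATEGORIES = ["positive_emotion", "negative_emotion", "cognitive", "social"]

def category_profile(tokens):
    """Normalized usage of each lexical category in a generated sample."""
    counts = torch.zeros(len(CATEGORIES))
    for tok in tokens:
        cat = LEXICON.get(tok.lower())
        if cat is not None:
            counts[CATEGORIES.index(cat)] += 1.0
    total = counts.sum()
    return counts / total if total > 0 else counts

def lexical_reward(tokens, target_profile):
    """Higher (less negative) reward when category usage matches the target author."""
    profile = category_profile(tokens)
    return -torch.norm(profile - target_profile, p=2).item()

def reinforce_loss(token_logprobs, reward, baseline=0.0):
    """REINFORCE objective: scale sequence log-likelihood by the baselined reward."""
    advantage = reward - baseline
    return -advantage * token_logprobs.sum()

# Toy usage: pretend these are a sampled sequence and its per-token log-probs
# from a generator LM (here replaced by random logits for self-containment).
target = torch.tensor([0.4, 0.1, 0.3, 0.2])  # assumed continuous preference vector
sample = ["we", "think", "this", "is", "wonderful"]
logits = torch.randn(len(sample), 10, requires_grad=True)
logprobs = torch.log_softmax(logits, dim=-1)[:, 0]
loss = reinforce_loss(logprobs, lexical_reward(sample, target))
loss.backward()  # in a real setup, gradients would flow into the generator's parameters
```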