Paper title
Narrative Text Generation with a Latent Discrete Plan
Paper authors
Abstract
Past work on story generation has demonstrated the usefulness of conditioning on a generation plan to generate coherent stories. However, these approaches have used heuristics or off-the-shelf models to first tag training stories with the desired type of plan, and then train generation models in a supervised fashion. In this paper, we propose a deep latent variable model that first samples a sequence of anchor words, one per sentence in the story, as part of its generative process. During training, our model treats the sequence of anchor words as a latent variable and attempts to induce anchoring sequences that help guide generation in an unsupervised fashion. We conduct experiments with several types of sentence decoder distributions: left-to-right and non-monotonic, with different degrees of restriction. Further, since we use amortized variational inference to train our model, we introduce two corresponding types of inference network for predicting the posterior on anchor words. We conduct human evaluations which demonstrate that the stories produced by our model are rated better in comparison with baselines which do not consider story plans, and are similar or better in quality relative to baselines which use external supervision for plans. Additionally, the proposed model gets favorable scores when evaluated on perplexity, diversity, and control of story via discrete plan.
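To make the training objective concrete, the following is a minimal sketch of variational inference over a discrete latent anchor word. It is not the paper's model: it assumes a single sentence with a single anchor, replaces the neural decoder and inference network with small hand-written probability tables, and uses hypothetical anchor words and numbers throughout. It only illustrates the general fact that the evidence lower bound (ELBO) never exceeds the log marginal likelihood, and matches it when the approximate posterior equals the true posterior.

```python
import math

# Hypothetical toy setup (not from the paper): one sentence x, one latent
# anchor word z drawn from a three-word vocabulary.
prior = {"storm": 0.5, "picnic": 0.3, "exam": 0.2}           # p(z)
likelihood = {"storm": 0.02, "picnic": 0.001, "exam": 0.005}  # p(x | z) for a fixed x


def log_marginal(prior, likelihood):
    """log p(x) = log sum_z p(z) p(x | z), computed exactly."""
    return math.log(sum(prior[z] * likelihood[z] for z in prior))


def elbo(q, prior, likelihood):
    """E_q[log p(x | z) + log p(z) - log q(z)], summed exactly over z."""
    total = 0.0
    for z, qz in q.items():
        if qz > 0.0:  # terms with q(z) = 0 contribute nothing
            total += qz * (
                math.log(likelihood[z]) + math.log(prior[z]) - math.log(qz)
            )
    return total


def true_posterior(prior, likelihood):
    """p(z | x) by Bayes' rule; tractable here only because z is tiny."""
    evidence = sum(prior[z] * likelihood[z] for z in prior)
    return {z: prior[z] * likelihood[z] / evidence for z in prior}


# Stand-ins for the output of an inference network: a poor guess and the
# exact posterior.
q_uniform = {z: 1.0 / len(prior) for z in prior}
q_exact = true_posterior(prior, likelihood)

log_px = log_marginal(prior, likelihood)
elbo_uniform = elbo(q_uniform, prior, likelihood)
elbo_exact = elbo(q_exact, prior, likelihood)

print(f"log p(x)            = {log_px:.4f}")
print(f"ELBO (uniform q)    = {elbo_uniform:.4f}")
print(f"ELBO (exact q)      = {elbo_exact:.4f}")
```

In an amortized setting, the table `q_uniform` would instead be produced by a network that reads the story, and the sum over anchors would extend over one anchor per sentence; both are outside the scope of this sketch.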