Paper Title
Demystifying Prompts in Language Models via Perplexity Estimation
Paper Authors
Paper Abstract
Language models can be prompted to perform a wide variety of zero- and few-shot learning problems. However, performance varies significantly with the choice of prompt, and we do not yet understand why this happens or how to pick the best prompts. In this work, we analyze the factors that contribute to this variance and establish a new empirical hypothesis: the performance of a prompt is coupled with the extent to which the model is familiar with the language it contains. Over a wide range of tasks, we show that the lower the perplexity of the prompt is, the better the prompt is able to perform the task. As a result, we devise a method for creating prompts: (1) automatically extend a small seed set of manually written prompts by paraphrasing using GPT3 and backtranslation and (2) choose the lowest perplexity prompts to get significant gains in performance.
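The abstract's core recipe (score candidate prompts by their perplexity under the model and keep the lowest-perplexity one) can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: GPT-2 via Hugging Face transformers stands in for GPT-3, the candidate prompts are hypothetical paraphrases, and the paper's step of averaging perplexity over prompts filled with actual task inputs is omitted.

```python
# Minimal sketch of perplexity-based prompt selection (illustrative only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for GPT-3; any causal LM works for this sketch
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def prompt_perplexity(prompt: str) -> float:
    """Perplexity of the prompt text under the language model."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        # With labels == input_ids, the returned loss is the mean token-level
        # negative log-likelihood; exponentiating it gives perplexity.
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return torch.exp(loss).item()

# Hypothetical paraphrases of a seed prompt for a sentiment task.
candidates = [
    "Review: {text}\nSentiment:",
    "Is the following review positive or negative?\n{text}\nAnswer:",
    "{text}\nThe sentiment of this review is",
]

# Rank candidates by perplexity and keep the lowest-perplexity prompt.
ranked = sorted(candidates, key=prompt_perplexity)
print("Lowest-perplexity prompt:", ranked[0])
```

In the method described above, the candidate set would instead be generated automatically by paraphrasing the manually written seed prompts with GPT3 and by backtranslation, rather than being written by hand.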