Paper title
On the impressive performance of randomly weighted encoders in summarization tasks
Paper authors
Paper abstract
In this work, we investigate the performance of untrained, randomly initialized encoders in a general class of sequence-to-sequence models and compare them with fully-trained encoders on the task of abstractive summarization. We hypothesize that random projections of an input text have enough representational power to encode the hierarchical structure of sentences and the semantics of documents. Using a trained decoder to produce abstractive text summaries, we empirically demonstrate that architectures with untrained, randomly initialized encoders perform competitively with the equivalent architectures with fully-trained encoders. We further find that increasing the capacity of the encoder not only improves overall model generalization but also narrows the performance gap between untrained, randomly initialized and fully-trained encoders. To our knowledge, this is the first time that general sequence-to-sequence models with attention have been assessed with both trained and randomly projected representations on abstractive summarization.
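The core setup the abstract describes — a randomly initialized encoder that is frozen while only the decoder is trained — can be sketched as follows. This is a minimal illustration, not the authors' actual architecture: the model sizes, the use of GRUs, and the shared embedding layer are assumptions for the sake of a compact example.

```python
import torch
import torch.nn as nn

class RandomEncoderSeq2Seq(nn.Module):
    """Sketch of a seq2seq model whose encoder keeps its random
    initialization: encoder parameters are frozen, so only the decoder
    and output head receive gradient updates during training.
    (Hypothetical sizes; the paper's exact architecture may differ.)"""

    def __init__(self, vocab_size=1000, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)
        # Freeze the encoder: it stays an untrained random projection.
        for p in self.encoder.parameters():
            p.requires_grad = False

    def forward(self, src, tgt):
        # Random projection of the source text into a hidden state.
        _, h = self.encoder(self.embed(src))
        # Trained decoder conditions on that random representation.
        dec_out, _ = self.decoder(self.embed(tgt), h)
        return self.out(dec_out)

model = RandomEncoderSeq2Seq()
src = torch.randint(0, 1000, (2, 10))  # batch of 2 source sequences
tgt = torch.randint(0, 1000, (2, 7))   # batch of 2 target prefixes
logits = model(src, tgt)               # shape: (2, 7, vocab_size)
```

An optimizer built over `filter(lambda p: p.requires_grad, model.parameters())` would then update only the decoder side, matching the trained-decoder / random-encoder comparison the abstract describes.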