Paper Title

Evaluating the Effectiveness of Efficient Neural Architecture Search for Sentence-Pair Tasks

Paper Authors

Ansel MacLaughlin, Jwala Dhamala, Anoop Kumar, Sriram Venkatapathy, Ragav Venkatesan, Rahul Gupta

Paper Abstract

Neural Architecture Search (NAS) methods, which automatically learn entire neural model or individual neural cell architectures, have recently achieved competitive or state-of-the-art (SOTA) performance on a variety of natural language processing and computer vision tasks, including language modeling, natural language inference, and image classification. In this work, we explore the applicability of a SOTA NAS algorithm, Efficient Neural Architecture Search (ENAS; Pham et al., 2018), to two sentence-pair tasks, paraphrase detection and semantic textual similarity. We use ENAS to perform a micro-level search and learn a task-optimized RNN cell architecture as a drop-in replacement for an LSTM. We explore the effectiveness of ENAS through experiments on three datasets (MRPC, SICK, STS-B), with two different models (ESIM, BiLSTM-Max), and two sets of embeddings (GloVe, BERT). In contrast to prior work applying ENAS to NLP tasks, our results are mixed -- we find that ENAS architectures sometimes, but not always, outperform LSTMs and perform similarly to random architecture search.
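To make the "drop-in replacement" framing concrete, below is a minimal PyTorch sketch, not the authors' code, of a BiLSTM-Max sentence-pair model in which the recurrent encoder is a constructor argument. All class and parameter names here are hypothetical; the point is only that a searched ENAS cell, wrapped to expose the same interface as nn.LSTM, could replace it without touching the rest of the model.

```python
import torch
import torch.nn as nn

class BiLSTMMaxPairModel(nn.Module):
    """Sentence-pair classifier in the BiLSTM-Max style (sketch).

    The recurrent encoder is passed in as a class, so an
    ENAS-discovered RNN cell (wrapped to accept the same
    constructor arguments and return (output, state) like
    nn.LSTM) could replace the stock LSTM unchanged.
    """

    def __init__(self, embed_dim=300, hidden_dim=512, num_classes=2,
                 rnn_cls=nn.LSTM):
        super().__init__()
        # rnn_cls is the swappable component: nn.LSTM by default.
        self.encoder = rnn_cls(embed_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        # Pair features [u, v, |u-v|, u*v], each of size 2*hidden_dim.
        self.classifier = nn.Sequential(
            nn.Linear(8 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def encode(self, emb):
        # emb: (batch, seq_len, embed_dim) pre-computed embeddings,
        # e.g. GloVe vectors or BERT features as in the paper's setup.
        out, _ = self.encoder(emb)       # (batch, seq_len, 2*hidden_dim)
        return out.max(dim=1).values     # max-pool over time steps

    def forward(self, emb_a, emb_b):
        u, v = self.encode(emb_a), self.encode(emb_b)
        feats = torch.cat([u, v, (u - v).abs(), u * v], dim=-1)
        return self.classifier(feats)

# Smoke test with random "embeddings" for a batch of 4 sentence pairs.
model = BiLSTMMaxPairModel()
a = torch.randn(4, 20, 300)
b = torch.randn(4, 18, 300)
print(model(a, b).shape)  # torch.Size([4, 2])
```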
