Paper Title

Evaluating the Effectiveness of Efficient Neural Architecture Search for Sentence-Pair Tasks

Paper Authors

Ansel MacLaughlin, Jwala Dhamala, Anoop Kumar, Sriram Venkatapathy, Ragav Venkatesan, Rahul Gupta

Paper Abstract

Neural Architecture Search (NAS) methods, which automatically learn entire neural model or individual neural cell architectures, have recently achieved competitive or state-of-the-art (SOTA) performance on a variety of natural language processing and computer vision tasks, including language modeling, natural language inference, and image classification. In this work, we explore the applicability of a SOTA NAS algorithm, Efficient Neural Architecture Search (ENAS; Pham et al., 2018), to two sentence-pair tasks, paraphrase detection and semantic textual similarity. We use ENAS to perform a micro-level search and learn a task-optimized RNN cell architecture as a drop-in replacement for an LSTM. We explore the effectiveness of ENAS through experiments on three datasets (MRPC, SICK, STS-B), with two different models (ESIM, BiLSTM-Max), and two sets of embeddings (GloVe, BERT). In contrast to prior work applying ENAS to NLP tasks, our results are mixed -- we find that ENAS architectures sometimes, but not always, outperform LSTMs and perform similarly to random architecture search.
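To make the "drop-in replacement" framing concrete, below is a minimal PyTorch sketch, not the authors' code, of a BiLSTM-Max sentence-pair model in which the recurrent encoder is a constructor argument. All class and parameter names here are hypothetical; the point is only that a searched ENAS cell, wrapped to expose the same interface as nn.LSTM, could replace it without touching the rest of the model.

```python
import torch
import torch.nn as nn

class BiLSTMMaxPairModel(nn.Module):
    """Sentence-pair classifier in the BiLSTM-Max style (sketch).

    The recurrent encoder is passed in as a class, so an
    ENAS-discovered RNN cell (wrapped to accept the same
    constructor arguments and return (output, state) like
    nn.LSTM) could replace the stock LSTM unchanged.
    """

    def __init__(self, embed_dim=300, hidden_dim=512, num_classes=2,
                 rnn_cls=nn.LSTM):
        super().__init__()
        # rnn_cls is the swappable component: nn.LSTM by default.
        self.encoder = rnn_cls(embed_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        # Pair features [u, v, |u-v|, u*v], each of size 2*hidden_dim.
        self.classifier = nn.Sequential(
            nn.Linear(8 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def encode(self, emb):
        # emb: (batch, seq_len, embed_dim) pre-computed embeddings,
        # e.g. GloVe vectors or BERT features as in the paper's setup.
        out, _ = self.encoder(emb)       # (batch, seq_len, 2*hidden_dim)
        return out.max(dim=1).values     # max-pool over time steps

    def forward(self, emb_a, emb_b):
        u, v = self.encode(emb_a), self.encode(emb_b)
        feats = torch.cat([u, v, (u - v).abs(), u * v], dim=-1)
        return self.classifier(feats)

# Smoke test with random "embeddings" for a batch of 4 sentence pairs.
model = BiLSTMMaxPairModel()
a = torch.randn(4, 20, 300)
b = torch.randn(4, 18, 300)
print(model(a, b).shape)  # torch.Size([4, 2])
```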
