Paper Title

Discovering Salient Neurons in Deep NLP Models

Paper Authors

Nadir Durrani, Fahim Dalvi, Hassan Sajjad

Paper Abstract

While a lot of work has been done in understanding the representations learned within deep NLP models and what knowledge they capture, little attention has been paid to individual neurons. We present a technique called Linguistic Correlation Analysis to extract salient neurons in the model with respect to any extrinsic property, with the goal of understanding how such knowledge is preserved within neurons. We carry out a fine-grained analysis to answer the following questions: (i) can we identify subsets of neurons in the network that capture specific linguistic properties? (ii) how localized or distributed are neurons across the network? (iii) how redundantly is the information preserved? (iv) how does fine-tuning pre-trained models towards downstream NLP tasks impact the learned linguistic knowledge? and (v) how do architectures vary in learning different linguistic properties? Our data-driven, quantitative analysis illuminates interesting findings: (i) we found small subsets of neurons that can predict different linguistic tasks; (ii) neurons capturing basic lexical information (such as suffixation) are localized in the lowermost layers; (iii) those learning complex concepts (such as syntactic role) reside predominantly in the middle and higher layers; (iv) salient linguistic neurons are relocated from higher to lower layers during transfer learning, as the network preserves the higher layers for task-specific information; (v) we found interesting differences across pre-trained models with respect to how linguistic information is preserved within them; and (vi) concepts exhibit similar neuron distributions across different languages in multilingual transformer models. Our code is publicly available as part of the NeuroX toolkit.
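The general idea behind this kind of analysis can be illustrated with a minimal probing sketch: train a regularized linear classifier to predict a linguistic property from neuron activations, then rank neurons by the magnitude of their learned weights. The sketch below is an assumption-laden illustration, not the authors' NeuroX implementation; it uses scikit-learn and randomly generated activations as stand-ins for real hidden states, and all variable names are hypothetical.

```python
# Minimal sketch of probing-based neuron ranking (illustrative only,
# not the NeuroX API). Assumes `activations` is an (n_tokens, n_neurons)
# matrix of hidden activations and `labels` encodes a linguistic
# property (e.g. POS tags) for each token.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_tokens, n_neurons = 1000, 768              # e.g. one transformer layer
activations = rng.normal(size=(n_tokens, n_neurons))
labels = rng.integers(0, 5, size=n_tokens)   # 5 hypothetical classes

# An elastic-net penalty encourages the probe to rely on a small set of
# salient neurons, in the spirit of Linguistic Correlation Analysis.
probe = LogisticRegression(
    penalty="elasticnet", solver="saga", l1_ratio=0.5, C=0.1, max_iter=1000
)
probe.fit(activations, labels)

# Rank neurons by the total absolute weight assigned across classes.
salience = np.abs(probe.coef_).sum(axis=0)
top_neurons = np.argsort(salience)[::-1][:20]
print("Top-20 salient neurons for this property:", top_neurons)
```

A natural follow-up check, under the same assumptions, is to re-train or re-evaluate the probe using only the top-ranked neurons and compare its accuracy against the full set; if a small subset retains most of the accuracy, that subset can be treated as salient for the property.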
