Paper Title

Similarity Analysis of Contextual Word Representation Models

Authors

John M. Wu, Yonatan Belinkov, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, James Glass

Abstract

This paper investigates contextual word representation models through the lens of similarity analysis. Given a collection of trained models, we measure the similarity of their internal representations and attention. Critically, these models come from vastly different architectures. We use existing and novel similarity measures that aim to gauge the level of localization of information in the deep models, and facilitate the investigation of which design factors affect model similarity, without requiring any external linguistic annotation. The analysis reveals that models within the same family are more similar to one another, as may be expected. Surprisingly, different architectures have rather similar representations, but different individual neurons. We also observe differences in information localization between lower and higher layers, and find that higher layers are more affected by fine-tuning on downstream tasks.
