Paper Title
Finding Skill Neurons in Pre-trained Transformer-based Language Models
Paper Authors
Paper Abstract
Transformer-based pre-trained language models have demonstrated superior performance on various natural language processing tasks. However, it remains unclear how the skills required to handle these tasks are distributed among model parameters. In this paper, we find that after prompt tuning for specific tasks, the activations of some neurons within pre-trained Transformers are highly predictive of the task labels. We dub these neurons skill neurons and confirm that they encode task-specific skills by finding that: (1) Skill neurons are crucial for handling tasks. The performance of pre-trained Transformers on a task drops significantly when the corresponding skill neurons are perturbed. (2) Skill neurons are task-specific. Similar tasks tend to have similar distributions of skill neurons. Furthermore, we demonstrate that the skill neurons are most likely generated in pre-training rather than fine-tuning, by showing that the skill neurons found with prompt tuning are also crucial for other fine-tuning methods that freeze neuron weights, such as adapter-based tuning and BitFit. We also explore applications of skill neurons, including accelerating Transformers with network pruning and building better transferability indicators. These findings may promote further research on understanding Transformers. The source code can be obtained from https://github.com/THU-KEG/Skill-Neuron.
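To make the notion of "activations highly predictive of task labels" concrete, the sketch below ranks the neurons of one layer by how well a simple threshold on each neuron's activation predicts a binary task label. This is only a minimal illustration, not the authors' released implementation: the array shapes, the mean-activation threshold, and the function names neuron_predictivity and find_skill_neurons are assumptions made for the example.

# Minimal sketch (not the authors' code): rank neurons by how well a simple
# threshold on their activation predicts a binary task label.
# Assumes `acts` is an [n_examples, n_neurons] array of activations recorded
# after prompt tuning, and `labels` is an [n_examples] array of 0/1 labels.
import numpy as np

def neuron_predictivity(acts: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Per-neuron accuracy of predicting the label by thresholding at the mean activation."""
    thresholds = acts.mean(axis=0)                  # one threshold per neuron
    preds = (acts > thresholds).astype(int)         # [n_examples, n_neurons]
    acc = (preds == labels[:, None]).mean(axis=0)   # accuracy of each neuron
    return np.maximum(acc, 1.0 - acc)               # either polarity counts as predictive

def find_skill_neurons(acts: np.ndarray, labels: np.ndarray, top_k: int = 10):
    """Return indices of the top-k most predictive (candidate "skill") neurons and all scores."""
    scores = neuron_predictivity(acts, labels)
    return np.argsort(scores)[::-1][:top_k], scores

if __name__ == "__main__":
    # Toy usage with random data standing in for one FFN layer's activations.
    rng = np.random.default_rng(0)
    acts = rng.normal(size=(256, 3072))
    labels = rng.integers(0, 2, size=256)
    top_idx, scores = find_skill_neurons(acts, labels)
    print(top_idx, scores[top_idx])

Neurons scoring near 0.5 carry little label information, while neurons scoring well above chance correspond to the highly predictive units the abstract refers to; the paper's perturbation and pruning analyses operate on neurons selected this way.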