Paper Title
MicroNet for Efficient Language Modeling
Paper Authors
Paper Abstract
It is important to design compact language models for efficient deployment. We improve upon recent advances in both the language modeling and model compression domains to construct parameter- and computation-efficient language models. We use an efficient transformer-based architecture with adaptive embedding and softmax, a differentiable non-parametric cache, Hebbian softmax, knowledge distillation, network pruning, and low-bit quantization. In this paper, we present the winning solution to the NeurIPS 2019 MicroNet Challenge in the language modeling track. Compared to the baseline language model provided by the MicroNet Challenge, our model is 90 times more parameter-efficient and 36 times more computation-efficient, while achieving the required test perplexity of 35 on the WikiText-103 dataset. We hope this work will aid future research into efficient language models; our full source code is released at https://github.com/mit-han-lab/neurips-micronet.
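The abstract lists knowledge distillation among the compression techniques used. As an illustration only, the minimal sketch below shows a standard distillation loss for a language model, blending hard-label cross-entropy with a temperature-softened KL term; it is not the authors' implementation, and the function name and hyperparameter values (`temperature`, `alpha`) are assumptions, not values from the paper or its released code.

```python
# Minimal sketch of a knowledge-distillation loss (assumed form, not the paper's code):
# the student matches the teacher's softened token distribution while also
# minimizing the usual cross-entropy against the gold next token.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=2.0, alpha=0.5):
    """student_logits, teacher_logits: (batch, vocab) unnormalized scores.
    targets: (batch,) gold token ids.
    temperature, alpha: illustrative hyperparameters."""
    # Soft targets: KL between teacher and student at temperature T, scaled by T^2
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Hard targets: standard language-modeling cross-entropy
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard
```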