Paper Title

torchdistill: A Modular, Configuration-Driven Framework for Knowledge Distillation

Paper Authors

Matsubara, Yoshitomo

Paper Abstract

While knowledge distillation (transfer) has been attracting attention from the research community, recent developments in the field have heightened the need for reproducible studies and highly generalized frameworks that lower the barrier to such high-quality, reproducible deep learning research. Several researchers have voluntarily published the frameworks used in their knowledge distillation studies to help other interested researchers reproduce their original work. Such frameworks, however, are usually neither well generalized nor maintained, so researchers still need to write a lot of code to refactor or build on them when introducing new methods, models, and datasets or designing experiments. In this paper, we present our open-source framework, built on PyTorch and dedicated to knowledge distillation studies. The framework is designed to let users design experiments through declarative PyYAML configuration files and helps researchers complete the recently proposed ML Code Completeness Checklist. Using the developed framework, we demonstrate its various efficient training strategies and implement a variety of knowledge distillation methods. We also reproduce some of their original experimental results on the ImageNet and COCO datasets presented at major machine learning conferences such as ICLR, NeurIPS, CVPR, and ECCV, including recent state-of-the-art methods. All the source code, configurations, log files, and trained model weights are publicly available at https://github.com/yoshitomo-matsubara/torchdistill .
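To make the core idea concrete, the following is a minimal, self-contained PyTorch sketch of the standard knowledge distillation objective (a softened teacher-student KL term combined with the usual cross-entropy) that frameworks of this kind implement. It is an illustrative example only, not torchdistill's actual API; the function name and the temperature and alpha defaults are assumptions chosen for the demonstration.

    # Illustrative sketch, not torchdistill's API: the classic knowledge
    # distillation loss (Hinton et al.), mixing a softened KL term driven
    # by the teacher with the standard hard-label cross-entropy.
    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, targets,
                          temperature=4.0, alpha=0.5):
        # alpha weights the soft (teacher) term against the hard-label term;
        # both hyperparameter defaults here are arbitrary example values.
        soft_loss = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=1),
            F.softmax(teacher_logits / temperature, dim=1),
            reduction="batchmean",
        ) * (temperature ** 2)  # rescale so gradients stay comparable across temperatures
        hard_loss = F.cross_entropy(student_logits, targets)
        return alpha * soft_loss + (1.0 - alpha) * hard_loss

    # Minimal usage with random tensors standing in for a real batch.
    if __name__ == "__main__":
        student_logits = torch.randn(8, 10, requires_grad=True)
        teacher_logits = torch.randn(8, 10)
        targets = torch.randint(0, 10, (8,))
        loss = distillation_loss(student_logits, teacher_logits, targets)
        loss.backward()
        print(loss.item())

In a configuration-driven setup such as the one the paper describes, the pieces hard-coded above (teacher and student models, dataset, loss weights, temperature) would instead be declared in a PyYAML configuration file and assembled by the framework at runtime.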
