Paper Title

An Information-Geometric Distance on the Space of Tasks

Authors

Yansong Gao, Pratik Chaudhari

Abstract

This paper prescribes a distance between learning tasks modeled as joint distributions on data and labels. Using tools in information geometry, the distance is defined to be the length of the shortest weight trajectory on a Riemannian manifold as a classifier is fitted on an interpolated task. The interpolated task evolves from the source to the target task using an optimal transport formulation. This distance, which we call the "coupled transfer distance", can be compared across different classifier architectures. We develop an algorithm to compute the distance which iteratively transports the marginal on the data of the source task to that of the target task while updating the weights of the classifier to track this evolving data distribution. We develop theory to show that our distance captures the intuitive idea that a good transfer trajectory is the one that keeps the generalization gap small during transfer, in particular at the end on the target task. We perform thorough empirical validation and analysis across diverse image classification datasets to show that the coupled transfer distance correlates strongly with the difficulty of fine-tuning.
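
To make the idea of an "interpolated task" concrete, the following is a minimal toy sketch of displacement interpolation between two one-dimensional samples. It is not the authors' algorithm: the function name `displacement_interpolation`, the sample values, and the five-step schedule are all hypothetical, and the sketch assumes equal-size, uniformly weighted samples with a squared-distance cost, in which case the optimal transport plan simply matches points in sorted order. It ignores labels and classifier weights entirely.

```python
def displacement_interpolation(source, target, t):
    """Interpolate between two equal-size 1-D samples at time t in [0, 1].

    Assumption: uniform weights and squared-distance cost in one dimension,
    where the optimal transport plan pairs the i-th smallest source point
    with the i-th smallest target point, so sorting yields the coupling.
    """
    if len(source) != len(target):
        raise ValueError("this sketch assumes equal sample sizes")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    pairs = zip(sorted(source), sorted(target))
    # Move each source point a fraction t of the way to its matched target.
    return [(1.0 - t) * x + t * y for x, y in pairs]


# Hypothetical toy samples standing in for source and target data marginals.
source = [0.0, 2.0, 1.0]
target = [5.0, 3.0, 4.0]

# A schedule of interpolated samples from the source (t=0) to the target (t=1).
trajectory = [displacement_interpolation(source, target, k / 4) for k in range(5)]
```

In the setting the abstract describes, a classifier would be refitted at each step of such a schedule and the length of the resulting weight trajectory measured; that part is outside the scope of this sketch.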
