Paper Title

Active Multi-Task Representation Learning

Authors

Yifang Chen, Simon S. Du, Kevin Jamieson

Abstract
To leverage the power of big data from source tasks and overcome the scarcity of the target task samples, representation learning based on multi-task pretraining has become a standard approach in many applications. However, up until now, choosing which source tasks to include in the multi-task learning has been more art than science. In this paper, we give the first formal study on resource task sampling by leveraging the techniques from active learning. We propose an algorithm that iteratively estimates the relevance of each source task to the target task and samples from each source task based on the estimated relevance. Theoretically, we show that for the linear representation class, to achieve the same error rate, our algorithm can save up to a \textit{number of source tasks} factor in the source task sample complexity, compared with the naive uniform sampling from all source tasks. We also provide experiments on real-world computer vision datasets to illustrate the effectiveness of our proposed method on both linear and convolutional neural network representation classes. We believe our paper serves as an important initial step to bring techniques from active learning to representation learning.
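The abstract's core procedure — iteratively estimate each source task's relevance to the target, then draw samples from each source task in proportion to that estimate — can be illustrated with a minimal sketch. This is not the paper's actual algorithm: the cosine-similarity relevance proxy, the function names, and the synthetic data below are all assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def estimate_relevance(source_feats, target_feats):
    # Toy relevance proxy (an assumption, not the paper's estimator):
    # cosine similarity between the mean feature vectors of a source
    # task and the target task.
    s = source_feats.mean(axis=0)
    t = target_feats.mean(axis=0)
    denom = np.linalg.norm(s) * np.linalg.norm(t) + 1e-12
    return float(s @ t) / denom

def allocate_budget(relevances, total_budget):
    # Allocate each source task a sampling budget proportional to its
    # (clipped, normalized) estimated relevance, instead of sampling
    # uniformly across all source tasks.
    w = np.clip(np.asarray(relevances, dtype=float), 1e-6, None)
    w = w / w.sum()
    return np.floor(w * total_budget).astype(int)

# Synthetic example: source task 0 shares the target's feature shift,
# source task 1 is unrelated noise.
target = rng.normal(size=(50, 8)) + 2.0
source_aligned = rng.normal(size=(200, 8)) + 2.0
source_unrelated = rng.normal(size=(200, 8))

rel = [estimate_relevance(s, target) for s in (source_aligned, source_unrelated)]
budgets = allocate_budget(rel, total_budget=1000)
# The aligned source task receives the larger share of the budget.
```

In the actual algorithm this estimate-then-sample step is repeated, with the relevance estimates refined as more labeled source samples arrive; the sketch shows only a single round of the allocation.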
