Paper Title
In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness
Paper Authors
Paper Abstract
Consider a prediction setting with few in-distribution labeled examples and many unlabeled examples both in- and out-of-distribution (OOD). The goal is to learn a model which performs well both in-distribution and OOD. In these settings, auxiliary information is often cheaply available for every input. How should we best leverage this auxiliary information for the prediction task? Empirically across three image and time-series datasets, and theoretically in a multi-task linear regression setting, we show that (i) using auxiliary information as input features improves in-distribution error but can hurt OOD error; but (ii) using auxiliary information as outputs of auxiliary pre-training tasks improves OOD error. To get the best of both worlds, we introduce In-N-Out, which first trains a model with auxiliary inputs and uses it to pseudolabel all the in-distribution inputs, then pre-trains a model on OOD auxiliary outputs and fine-tunes this model with the pseudolabels (self-training). We show both theoretically and empirically that In-N-Out outperforms auxiliary inputs or outputs alone on both in-distribution and OOD error.
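The abstract describes a four-step procedure (aux-inputs model, pseudolabeling, aux-outputs pre-training, fine-tuning). Below is a minimal sketch of those steps in a toy multi-task linear regression setting, mirroring the theoretical setup mentioned in the abstract. The synthetic data, dimensions, and variable names are illustrative assumptions, not the paper's code or datasets.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# --- Hypothetical synthetic data (illustrative only) ---
d, k = 5, 3                       # input dim, auxiliary-information dim
A = rng.normal(size=(d, k))       # x -> z structure shared across domains
w = rng.normal(size=k)            # z -> y structure

def make_data(n, shift=0.0):
    x = rng.normal(loc=shift, size=(n, d))
    z = x @ A + 0.1 * rng.normal(size=(n, k))   # auxiliary information
    y = z @ w + 0.1 * rng.normal(size=n)        # prediction target
    return x, z, y

x_lab, z_lab, y_lab = make_data(30)              # few labeled in-distribution examples
x_unlab, z_unlab, _ = make_data(500)             # many unlabeled in-distribution examples
x_ood, z_ood, y_ood = make_data(500, shift=2.0)  # OOD examples (labels used only for evaluation)

# Step 1 (aux-inputs): train a model that takes auxiliary info as extra input
# features, using only the labeled in-distribution data.
aux_in = LinearRegression().fit(np.hstack([x_lab, z_lab]), y_lab)

# Step 2: pseudolabel all unlabeled in-distribution inputs with that model.
pseudo_y = aux_in.predict(np.hstack([x_unlab, z_unlab]))

# Step 3 (aux-outputs pre-training): predict the auxiliary info from x on
# unlabeled in-distribution + OOD data; the learned map serves as a
# pre-trained representation.
pretrain = LinearRegression().fit(np.vstack([x_unlab, x_ood]),
                                  np.vstack([z_unlab, z_ood]))
feats = lambda x: pretrain.predict(x)

# Step 4 (fine-tune / self-train): fit the final predictor on the pre-trained
# representation using the pseudolabels plus the original labeled data.
final = LinearRegression().fit(np.vstack([feats(x_unlab), feats(x_lab)]),
                               np.concatenate([pseudo_y, y_lab]))

print("OOD MSE:", np.mean((final.predict(feats(x_ood)) - y_ood) ** 2))
```

In this linear sketch, "fine-tuning" is reduced to fitting a head on top of the pre-trained x -> z map; in the paper's image and time-series experiments the pre-trained network itself is fine-tuned.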