Paper title
Wild-Time: A Benchmark of in-the-Wild Distribution Shift over Time
Paper authors
Abstract
Distribution shift occurs when the test distribution differs from the training distribution, and it can considerably degrade performance of machine learning models deployed in the real world. Temporal shifts -- distribution shifts arising from the passage of time -- often occur gradually and have the additional structure of timestamp metadata. By leveraging timestamp metadata, models can potentially learn from trends in past distribution shifts and extrapolate into the future. While recent works have studied distribution shifts, temporal shifts remain underexplored. To address this gap, we curate Wild-Time, a benchmark of 5 datasets that reflect temporal distribution shifts arising in a variety of real-world applications, including patient prognosis and news classification. On these datasets, we systematically benchmark 13 prior approaches, including methods in domain generalization, continual learning, self-supervised learning, and ensemble learning. We use two evaluation strategies: evaluation with a fixed time split (Eval-Fix) and evaluation with a data stream (Eval-Stream). Eval-Fix, our primary evaluation strategy, aims to provide a simple evaluation protocol, while Eval-Stream is more realistic for certain real-world applications. Under both evaluation strategies, we observe an average performance drop of 20% from in-distribution to out-of-distribution data. Existing methods are unable to close this gap. Code is available at https://wild-time.github.io/.
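The two evaluation strategies named in the abstract can be illustrated with a minimal sketch. This is not the Wild-Time codebase or its API: the function names (`fixed_time_split`, `stream_windows`, `accuracy`), the record layout, the toy data, and the window size `k` are all hypothetical and chosen only for illustration. For brevity the in-distribution score is computed on the past records themselves rather than on a held-out portion of them.

```python
def fixed_time_split(records, split_time):
    """Eval-Fix-style split (illustrative): records stamped at or before
    split_time count as in-distribution, later ones as out-of-distribution."""
    past = [r for r in records if r["t"] <= split_time]
    future = [r for r in records if r["t"] > split_time]
    return past, future


def stream_windows(timestamps, k):
    """Eval-Stream-style schedule (illustrative): at each step, train on all
    timestamps seen so far and evaluate on the next k timestamps."""
    ordered = sorted(set(timestamps))
    for i in range(len(ordered) - 1):
        yield ordered[: i + 1], ordered[i + 1 : i + 1 + k]


def accuracy(predict, records):
    """Fraction of records whose label the predictor gets right."""
    return sum(predict(r["x"]) == r["y"] for r in records) / len(records)


# Hypothetical toy data: the labeling rule changes after 2015,
# which stands in for a temporal distribution shift.
records = [
    {"t": year, "x": x, "y": int(x > 0) if year <= 2015 else int(x > 1)}
    for year in range(2010, 2020)
    for x in (-1, 0.5, 2)
]

# A fixed "model" that matches the pre-2015 rule exactly.
predict = lambda x: int(x > 0)

past, future = fixed_time_split(records, split_time=2015)
id_acc = accuracy(predict, past)
ood_acc = accuracy(predict, future)
print(f"ID accuracy:  {id_acc:.2f}")
print(f"OOD accuracy: {ood_acc:.2f}")
print(f"Absolute drop: {id_acc - ood_acc:.2f}")

# Inspect the first streaming step: train timestamps, then test timestamps.
train_ts, test_ts = next(stream_windows([r["t"] for r in records], k=3))
print(train_ts, test_ts)
```

The gap between the two accuracies is the kind of in-distribution to out-of-distribution drop the abstract reports. The size of the drop here is an artifact of the toy data and says nothing about the benchmark's actual results.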