Paper Title
Multiple Modes for Continual Learning
Paper Authors
Paper Abstract
Adapting model parameters to incoming streams of data is a crucial factor in deep learning scalability. Interestingly, prior continual learning strategies in online settings inadvertently anchor their updated parameters to a local parameter subspace in order to remember old tasks, or else drift away from that subspace and forget. From this observation, we formulate a trade-off between constructing multiple parameter modes and allocating tasks per mode. Mode-Optimized Task Allocation (MOTA), our contributed adaptation strategy, trains multiple modes in parallel and then optimizes the task allocation per mode. We empirically demonstrate improvements over baseline continual learning strategies and across varying distribution shifts, namely sub-population, domain, and task shift.
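To make the high-level idea in the abstract concrete, below is a minimal, hedged sketch in PyTorch of what "training multiple parameter modes in parallel and then allocating tasks per mode" could look like. This is not the authors' MOTA implementation: the `ModePool` class, the loss-based `allocate` rule, and all hyperparameters are illustrative assumptions.

```python
# Illustrative sketch only (assumed design, not the paper's MOTA algorithm):
# maintain several parameter "modes" (independent model copies), train them in
# parallel on incoming data, then assign each task to the mode that handles it best.
import torch
import torch.nn as nn

class ModePool:
    def __init__(self, n_modes: int, in_dim: int, n_classes: int, lr: float = 1e-3):
        # Each mode is an independent copy of the network with its own optimizer.
        self.modes = [nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                    nn.Linear(64, n_classes))
                      for _ in range(n_modes)]
        self.opts = [torch.optim.SGD(m.parameters(), lr=lr) for m in self.modes]
        self.allocation = {}  # task_id -> mode index (illustrative bookkeeping)

    def train_step(self, x, y):
        # "Parallel" training sketch: every mode sees the same mini-batch.
        losses = []
        for model, opt in zip(self.modes, self.opts):
            opt.zero_grad()
            loss = nn.functional.cross_entropy(model(x), y)
            loss.backward()
            opt.step()
            losses.append(loss.item())
        return losses

    @torch.no_grad()
    def allocate(self, task_id, x_val, y_val):
        # Assumed allocation rule: give the task to the mode with the lowest
        # validation loss (a stand-in for the paper's optimized allocation).
        losses = [nn.functional.cross_entropy(m(x_val), y_val).item()
                  for m in self.modes]
        self.allocation[task_id] = min(range(len(losses)), key=losses.__getitem__)
        return self.allocation[task_id]

# Toy usage on random data.
pool = ModePool(n_modes=3, in_dim=20, n_classes=5)
x, y = torch.randn(32, 20), torch.randint(0, 5, (32,))
pool.train_step(x, y)
print("task 0 allocated to mode", pool.allocate(0, x, y))
```

The sketch keeps the modes as fully separate parameter copies for simplicity; how the paper actually parameterizes the modes and optimizes the allocation is described in the full text, not here.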