Paper Title

Video Object Segmentation with Episodic Graph Memory Networks

Paper Authors

Xiankai Lu, Wenguan Wang, Martin Danelljan, Tianfei Zhou, Jianbing Shen, Luc Van Gool

Abstract

How to make a segmentation model efficiently adapt to a specific video, and to online variations in target appearance, is a fundamentally crucial issue in the field of video object segmentation. In this work, a graph memory network is developed to address the novel idea of "learning to update the segmentation model". Specifically, we exploit an episodic memory network, organized as a fully connected graph, to store frames as nodes and capture cross-frame correlations by edges. Further, learnable controllers are embedded to ease memory reading and writing, as well as to maintain a fixed memory scale. The structured, external memory design enables our model to comprehensively mine and quickly store new knowledge, even with limited visual information, while the differentiable memory controllers slowly learn, via gradient descent, an abstract method for storing useful representations in the memory and for later using these representations for prediction. In addition, the proposed graph memory network yields a neat yet principled framework, which generalizes well to both one-shot and zero-shot video object segmentation tasks. Extensive experiments on four challenging benchmark datasets verify that our graph memory network is able to facilitate the adaptation of the segmentation network for case-by-case video object segmentation.
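To make the memory mechanism described above concrete, the following is a minimal sketch of a fixed-size episodic memory with content-based (attention) reading and gated writing. All names here (`GraphMemory`, `num_cells`, `dim`) are illustrative assumptions, not the paper's actual implementation: the real model uses learned, differentiable controllers and graph message passing between frame nodes, whereas this toy version only shows how a memory of fixed scale can be read and updated per frame.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

class GraphMemory:
    """Toy fixed-size episodic memory (illustrative only).

    Each cell stores one frame embedding (a graph node); attention
    weights between a query and the cells stand in for graph edges.
    """
    def __init__(self, num_cells, dim, seed=0):
        rng = np.random.default_rng(seed)
        # memory keeps a fixed scale: always (num_cells, dim)
        self.memory = 0.01 * rng.standard_normal((num_cells, dim))

    def read(self, query):
        # content-based addressing: attend over memory cells,
        # return the attention-weighted summary of the memory
        weights = softmax(self.memory @ query)      # (num_cells,)
        return weights @ self.memory                # (dim,)

    def write(self, frame_embedding):
        # gated update: each cell moves toward the new frame embedding
        # in proportion to its attention weight, so the memory absorbs
        # new frames without growing in size
        weights = softmax(self.memory @ frame_embedding)
        gate = weights[:, None]                     # (num_cells, 1)
        self.memory = (1 - gate) * self.memory + gate * frame_embedding

# usage: write one frame embedding, then read with it as a query
mem = GraphMemory(num_cells=4, dim=8)
frame = np.ones(8)
mem.write(frame)
summary = mem.read(frame)
print(mem.memory.shape, summary.shape)
```

In the actual model, the read result would condition the segmentation network on the current frame, and the write gate would be produced by a learned controller trained end-to-end with gradient descent; here the gate is simply the attention weight itself, kept only to show that reads and writes preserve a fixed memory scale.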
