Paper title

Collaborative Attention Memory Network for Video Object Segmentation

Paper authors

Huang, Zhixing; Zha, Junli; Xie, Fei; Zheng, Yuwei; Zhong, Yuandong; Tang, Jinpeng

Paper abstract

Semi-supervised video object segmentation is a fundamental yet challenging task in computer vision. The embedding-matching-based CFBI series of networks has achieved promising results through a foreground-background integration approach. Despite their strong performance, these methods exhibit distinct shortcomings, especially false predictions for instances that barely appear in the first frame, even when those instances could easily be recognized from the previous frame. Moreover, they suffer from object occlusion and error drift. To overcome these shortcomings, we propose the Collaborative Attention Memory Network with an enhanced segmentation head. We introduce an object context scheme that explicitly enhances object information by gathering, as the context of a given pixel, only the pixels that belong to the same category as that pixel. Additionally, a segmentation head with a Feature Pyramid Attention (FPA) module is adopted to apply a spatial pyramid attention structure to the high-level output. Furthermore, we propose an ensemble network that combines the STM network with all of these refined CFBI networks. Finally, we evaluated our approach on the 2021 YouTube-VOS challenge, where we obtained 6th place with an overall score of 83.5%.
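
The object context idea in the abstract (using as a pixel's context only the pixels of the same category) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `same_class_context`, the use of hard integer labels, and plain averaging are all assumptions made for illustration, and the actual network presumably operates on learned feature maps with soft assignments.

```python
import numpy as np

def same_class_context(features, labels):
    """Illustrative only: average the features of all pixels sharing a label.

    features: (N, C) array, one feature row per pixel.
    labels:   (N,) array of hypothetical per-pixel category ids.
    Returns an (N, C) array holding each pixel's same-category context.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Map arbitrary label ids to 0..K-1 so they can index a (K, C) table.
    classes, inverse = np.unique(labels, return_inverse=True)
    inverse = inverse.ravel()
    # Accumulate per-category feature sums and pixel counts.
    sums = np.zeros((classes.size, features.shape[1]))
    np.add.at(sums, inverse, features)
    counts = np.bincount(inverse, minlength=classes.size)
    # Per-category mean feature, then broadcast back to every pixel.
    means = sums / counts[:, None]
    return means[inverse]
```

Calling `same_class_context(features, labels)` with four pixels in two categories returns, for each pixel, the mean feature of its own category, so pixels from other categories never contribute to its context.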
