Learning Attention Propagation for Compositional Zero-Shot Learning

Paper Title

Learning Attention Propagation for Compositional Zero-Shot Learning

Authors

Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc Van Gool, Alain Pagani, Didier Stricker, Muhammad Zeshan Afzal

Abstract

Compositional zero-shot learning aims to recognize unseen compositions of seen visual primitives of object classes and their states. While all primitives (states and objects) are observable during training in some combination, their complex interaction makes this task especially hard. For example, wet changes the visual appearance of a dog very differently from a bicycle. Furthermore, we argue that relationships between compositions go beyond shared states or objects. A cluttered office can contain a busy table; even though these compositions don't share a state or object, the presence of a busy table can guide the presence of a cluttered office. We propose a novel method called Compositional Attention Propagated Embedding (CAPE) as a solution. The key intuition to our method is that a rich dependency structure exists between compositions arising from complex interactions of primitives in addition to other dependencies between compositions. CAPE learns to identify this structure and propagates knowledge between them to learn class embedding for all seen and unseen compositions. In the challenging generalized compositional zero-shot setting, we show that our method outperforms previous baselines to set a new state-of-the-art on three publicly available benchmarks.
