Paper Title
Interpretable Foreground Object Search As Knowledge Distillation
Authors
Abstract
This paper proposes a knowledge distillation method for foreground object search (FoS). Given a background and a rectangle specifying the foreground location and scale, FoS retrieves compatible foregrounds of a given category for later image composition. Foregrounds within the same category can be grouped into a small number of patterns. Instances within each pattern are interchangeably compatible with any query input; we refer to them as interchangeable foregrounds. We first present a pipeline for building a pattern-level FoS dataset that contains labels of interchangeable foregrounds, and then follow this pipeline to establish a benchmark dataset for training and testing. For the proposed method, we first train a foreground encoder to learn representations of interchangeable foregrounds. We then train a query encoder to learn query-foreground compatibility within a knowledge distillation framework, which transfers knowledge from interchangeable foregrounds to supervise the representation learning of compatibility. The query feature representation is projected into the same latent space as the interchangeable foregrounds, enabling very efficient and interpretable instance-level search. Furthermore, pattern-level search becomes feasible, retrieving more controllable, reasonable, and diverse foregrounds. The proposed method outperforms the previous state of the art by 10.42% in absolute difference and 24.06% in relative improvement, as measured by mean average precision (mAP). Extensive experimental results also demonstrate its efficacy from various aspects. The benchmark dataset and code will be released shortly.
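The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes the query encoder and the foreground encoder have already produced vectors in one shared latent space, and it replaces both encoders with hand-written toy vectors. The cosine-similarity ranking, the centroid-based pattern scoring, and all names (`instance_search`, `pattern_search`, the `fg_*` identifiers, the pattern labels) are illustrative assumptions, not details confirmed by the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def instance_search(query_vec, gallery, top_k=3):
    """Rank individual foregrounds by similarity to the query embedding.

    gallery maps a (hypothetical) foreground id to its embedding.
    """
    ranked = sorted(
        gallery.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [fg_id for fg_id, _ in ranked[:top_k]]


def pattern_search(query_vec, gallery, pattern_of, top_k=2):
    """Rank patterns by similarity between the query and each pattern centroid.

    Scoring a pattern by its centroid is an assumption made for this sketch.
    """
    groups = {}
    for fg_id, vec in gallery.items():
        groups.setdefault(pattern_of[fg_id], []).append(vec)
    centroids = {
        pattern: [sum(col) / len(vecs) for col in zip(*vecs)]
        for pattern, vecs in groups.items()
    }
    ranked = sorted(
        centroids,
        key=lambda pattern: cosine(query_vec, centroids[pattern]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy 2-D embeddings standing in for encoder outputs (assumed values).
gallery = {
    "fg_a": [1.0, 0.1],
    "fg_b": [0.9, 0.2],
    "fg_c": [0.0, 1.0],
    "fg_d": [0.1, 0.9],
}
pattern_of = {
    "fg_a": "standing",
    "fg_b": "standing",
    "fg_c": "sitting",
    "fg_d": "sitting",
}
query = [1.0, 0.0]  # stands in for the encoded background plus rectangle

print(instance_search(query, gallery, top_k=2))
print(pattern_search(query, gallery, pattern_of, top_k=1))
```

Because queries and foregrounds share one space in this sketch, foreground embeddings can be computed once offline, and each search reduces to a similarity ranking. Grouping results by pattern is one way to return diverse candidates rather than near-duplicates.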