Paper Title

Your "Flamingo" is My "Bird": Fine-Grained, or Not

Authors

Dongliang Chang, Kaiyue Pang, Yixiao Zheng, Zhanyu Ma, Yi-Zhe Song, Jun Guo

Abstract

Whether what you see in Figure 1 is a "flamingo" or a "bird" is the question we ask in this paper. While fine-grained visual classification (FGVC) strives to arrive at the former, for the majority of us non-experts just "bird" would probably suffice. The real question is therefore: how can we tailor for different fine-grained definitions under divergent levels of expertise? For that, we re-envisage the traditional setting of FGVC, from single-label classification to top-down traversal of a pre-defined coarse-to-fine label hierarchy, so that our answer becomes "bird"-->"Phoenicopteriformes"-->"Phoenicopteridae"-->"flamingo". To approach this new problem, we first conduct a comprehensive human study in which we confirm that most participants prefer multi-granularity labels, regardless of whether they consider themselves experts. We then discover the key intuition that coarse-level label prediction hinders fine-grained feature learning, whereas fine-level features improve the learning of coarse-level classifiers. This discovery enables us to design a very simple yet surprisingly effective solution to our new problem, where we (i) leverage level-specific classification heads to disentangle coarse-level features from fine-grained ones, and (ii) allow finer-grained features to participate in coarser-grained label predictions, which in turn helps with better disentanglement. Experiments show that our method achieves superior performance in the new FGVC setting, and outperforms the state of the art on the traditional single-label FGVC problem as well. Thanks to its simplicity, our method can be easily implemented on top of any existing FGVC framework and is parameter-free.
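
The two design points in the abstract, level-specific heads and finer-grained features taking part in coarser predictions, can be illustrated with a small forward-pass sketch. This is not the authors' implementation. The equal-size feature split, the bias-free linear heads with random weights, the toy label sets, and every function name below are assumptions made only for illustration. The abstract also does not say how gradients should flow between levels during training, so training is left out entirely.

```python
import random


def split_features(features, num_levels):
    """Split one feature vector into equal level-specific chunks (coarse first).

    Equal-size chunks are an assumption; the abstract does not specify the split.
    """
    size = len(features) // num_levels
    return [features[i * size:(i + 1) * size] for i in range(num_levels)]


def head_inputs(chunks):
    """Build the input of each head: its own chunk plus every finer chunk."""
    inputs = []
    for k in range(len(chunks)):
        merged = []
        for chunk in chunks[k:]:
            merged.extend(chunk)
        inputs.append(merged)
    return inputs


def linear_head(x, weights):
    """Bias-free linear classifier: one score per class."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def predict_path(features, heads, label_names):
    """Return one predicted label per level, ordered coarse to fine."""
    chunks = split_features(features, len(heads))
    path = []
    for x, weights, names in zip(head_inputs(chunks), heads, label_names):
        scores = linear_head(x, weights)
        path.append(names[scores.index(max(scores))])
    return path


# Toy four-level hierarchy (hypothetical label sets, coarse to fine).
label_names = [
    ["bird", "mammal"],
    ["Phoenicopteriformes", "Passeriformes"],
    ["Phoenicopteridae", "Corvidae"],
    ["flamingo", "crow"],
]
num_levels = len(label_names)
chunk_size = 2
feature_dim = chunk_size * num_levels

rng = random.Random(0)
# Head k has one weight row per class; each row spans chunks k..last.
heads = [
    [[rng.uniform(-1, 1) for _ in range(chunk_size * (num_levels - k))]
     for _ in names]
    for k, names in enumerate(label_names)
]
features = [rng.uniform(-1, 1) for _ in range(feature_dim)]

path = predict_path(features, heads, label_names)
print(" --> ".join(path))
```

Because each head in this sketch predicts independently from random weights, the printed path need not be a valid branch of the hierarchy. It only shows how the inputs are wired: the coarsest head sees the whole feature vector, while the finest head sees only its own chunk.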
