When and how CNNs generalize to out-of-distribution category-viewpoint combinations

Paper title

When and how CNNs generalize to out-of-distribution category-viewpoint combinations

Authors

Spandan Madan, Timothy Henry, Jamell Dozier, Helen Ho, Nishchal Bhandari, Tomotake Sasaki, Frédo Durand, Hanspeter Pfister, Xavier Boix

Abstract

Object recognition and viewpoint estimation lie at the heart of visual understanding. Recent works suggest that convolutional neural networks (CNNs) fail to generalize to out-of-distribution (OOD) category-viewpoint combinations, i.e., combinations not seen during training. In this paper, we investigate when and how such OOD generalization may be possible by evaluating CNNs trained to classify both object category and 3D viewpoint on OOD combinations, and by identifying the neural mechanisms that facilitate such OOD generalization. We show that increasing the number of in-distribution combinations (i.e., data diversity) substantially improves generalization to OOD combinations, even with the same amount of training data. We compare learning category and viewpoint in separate and shared network architectures, and observe starkly different trends on in-distribution and OOD combinations: while shared networks are helpful in-distribution, separate networks significantly outperform shared ones on OOD combinations. Finally, we demonstrate that such OOD generalization is facilitated by the neural mechanism of specialization, i.e., the emergence of two types of neurons: neurons selective to category and invariant to viewpoint, and vice versa.
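To make the terms "in-distribution combination", "OOD combination", and "data diversity with the same amount of training data" concrete, here is a minimal Python sketch. It is not taken from the paper: the function names, the diagonal-band choice of which combinations are seen, and the example sizes are all assumptions made for illustration.

```python
from itertools import product


def split_combinations(num_categories, num_viewpoints, seen_per_category):
    """Split the category-viewpoint grid into in-distribution and OOD sets.

    Hypothetical scheme: category c is seen at viewpoints
    c, c+1, ..., c+seen_per_category-1 (mod num_viewpoints). With at least
    as many categories as viewpoints, every category and every viewpoint
    still appears in training; only some of their pairings are held out.
    """
    all_combos = set(product(range(num_categories), range(num_viewpoints)))
    in_dist = {
        (c, (c + k) % num_viewpoints)
        for c in range(num_categories)
        for k in range(seen_per_category)
    }
    ood = all_combos - in_dist  # combinations never seen during training
    return in_dist, ood


def images_per_combination(total_images, in_dist):
    """Keep the training budget fixed: more combinations, fewer images each."""
    return total_images // len(in_dist)


# Low diversity: 2 seen viewpoints per category on a 5 x 5 grid.
low_in, low_ood = split_combinations(5, 5, 2)
# Higher diversity: 4 seen viewpoints per category, same total budget.
high_in, high_ood = split_combinations(5, 5, 4)

print(len(low_in), len(low_ood), images_per_combination(1000, low_in))
print(len(high_in), len(high_ood), images_per_combination(1000, high_in))
```

In this sketch, raising `seen_per_category` increases data diversity while the total image budget stays fixed, which is the comparison the abstract describes; accuracy would then be measured separately on the in-distribution and OOD sets.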
