Behind the Machine's Gaze: Neural Networks with Biologically-inspired Constraints Exhibit Human-like Visual Attention

Title

Behind the Machine's Gaze: Neural Networks with Biologically-inspired Constraints Exhibit Human-like Visual Attention

Authors

Leo Schwinn, Doina Precup, Björn Eskofier, Dario Zanca

Abstract

By and large, existing computational models of visual attention tacitly assume perfect vision and full access to the stimulus, and thereby deviate from foveated biological vision. Moreover, modeling top-down attention is generally reduced to the integration of semantic features, without incorporating the signal of a high-level visual task, which has been shown to partially guide human attention. We propose the Neural Visual Attention (NeVA) algorithm to generate visual scanpaths in a top-down manner. With our method, we explore whether neural networks, on which we impose a biologically-inspired foveated vision constraint, can generate human-like scanpaths without being trained directly for this objective. The loss of a neural network performing a downstream visual task (i.e., classification or reconstruction) flexibly provides top-down guidance to the scanpath. Extensive experiments show that our method outperforms state-of-the-art unsupervised human attention models in terms of similarity to human scanpaths. Additionally, the flexibility of the framework allows quantitative investigation of the role of different tasks in the generated visual behaviors. Finally, we demonstrate the superiority of the approach in a novel experiment that investigates the utility of scanpaths in real-world applications under imperfect viewing conditions.
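The idea of letting a task loss steer fixations under a foveation constraint can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the foveation model, the task network, or the optimization procedure. The sketch assumes a Gaussian acuity mask over a box-blurred periphery, replaces the neural network's loss with a plain pixel-wise reconstruction error, and picks each fixation by greedy search over a coarse grid. The names `box_blur`, `fovea_mask`, and `greedy_scanpath`, and all parameter values, are hypothetical.

```python
import numpy as np


def box_blur(image, radius):
    """Mean filter of size (2*radius+1)^2 with edge padding (stand-in for peripheral blur)."""
    padded = np.pad(image, radius, mode="edge")
    k = 2 * radius + 1
    out = np.zeros(image.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (k * k)


def fovea_mask(shape, center, sigma):
    """Gaussian acuity map: 1 at the fixation point, decaying towards 0 in the periphery."""
    ys, xs = np.mgrid[:shape[0], :shape[1]]
    d2 = (ys - center[0]) ** 2 + (xs - center[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))


def greedy_scanpath(image, n_fixations=3, sigma=4.0, blur_radius=3, stride=4):
    """Pick fixations one at a time, each minimizing an assumed reconstruction loss."""
    image = image.astype(float)
    blurred = box_blur(image, blur_radius)
    memory = np.zeros(image.shape)  # best acuity obtained so far at each pixel
    candidates = [(y, x)
                  for y in range(0, image.shape[0], stride)
                  for x in range(0, image.shape[1], stride)]
    scanpath, losses = [], []
    for _ in range(n_fixations):
        best = None
        for c in candidates:
            # Combine what was already seen with the new foveated view.
            mask = np.maximum(memory, fovea_mask(image.shape, c, sigma))
            percept = mask * image + (1.0 - mask) * blurred
            # Assumed task loss: pixel-wise MSE, standing in for a network's loss.
            loss = float(np.mean((percept - image) ** 2))
            if best is None or loss < best[0]:
                best = (loss, c, mask)
        losses.append(best[0])
        scanpath.append(best[1])
        memory = best[2]
    return scanpath, losses
```

Calling `greedy_scanpath(img)` on a 2-D grayscale array returns a list of `(row, col)` fixations and the loss after each one. Because the acuity memory only grows, the loss in this toy setup cannot increase from one fixation to the next; a learned task model would not necessarily share that property.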
