Paper Title

LAP: An Attention-Based Module for Concept Based Self-Interpretation and Knowledge Injection in Convolutional Neural Networks

Paper Authors

Rassa Ghavami Modegh, Ahmad Salimi, Alireza Dizaji, Hamid R. Rabiee

Paper Abstract

Despite the state-of-the-art performance of deep convolutional neural networks, they are susceptible to bias and malfunction in unseen situations. Moreover, the complex computation behind their reasoning is not human-understandable, which makes it hard to develop trust. External explainer methods have tried to interpret network decisions in a human-understandable way, but they are accused of fallacies due to their assumptions and simplifications. On the other hand, the inherent self-interpretability of models, while being more robust to the mentioned fallacies, cannot be applied to already trained models. In this work, we propose a new attention-based pooling layer, called Local Attention Pooling (LAP), that accomplishes self-interpretability and the possibility of knowledge injection without performance loss. The module is easily pluggable into any convolutional neural network, even already trained ones. We have defined a weakly supervised training scheme to learn the distinguishing features in decision-making without depending on experts' annotations. We verified our claims by evaluating several LAP-extended models on two datasets, including ImageNet. The proposed framework offers more valid, human-understandable, and faithful-to-the-model interpretations than the commonly used white-box explainer methods.
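The abstract describes LAP as an attention-based pooling layer that can replace standard pooling in a CNN so that the attention weights themselves serve as an explanation. Below is a minimal PyTorch sketch of a *generic* attention-based pooling layer in that spirit; the 1x1-convolution scorer, the within-window softmax, and the module/argument names are illustrative assumptions, not the authors' exact LAP design.

```python
# Minimal sketch of a generic attention-based pooling layer (assumed design,
# not the paper's exact LAP module).
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionPool2d(nn.Module):
    """Downsamples a feature map by a weighted average, where the weights are
    per-location attention scores. The attention map can then be inspected as a
    coarse indication of which regions drive the pooled features."""

    def __init__(self, in_channels: int, kernel_size: int = 2, stride: int = 2):
        super().__init__()
        self.kernel_size = kernel_size
        self.stride = stride
        # 1x1 conv producing one attention logit per spatial location (assumed scorer).
        self.score = nn.Conv2d(in_channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor):
        # x: (N, C, H, W)
        logits = self.score(x)                                                # (N, 1, H, W)
        # Gather pooling windows for both the features and the attention logits.
        patches = F.unfold(x, self.kernel_size, stride=self.stride)          # (N, C*k*k, L)
        weights = F.unfold(logits, self.kernel_size, stride=self.stride)     # (N, k*k, L)
        weights = torch.softmax(weights, dim=1)                              # normalize inside each window
        n, _, num_windows = patches.shape
        patches = patches.view(n, x.size(1), self.kernel_size ** 2, num_windows)
        pooled = (patches * weights.unsqueeze(1)).sum(dim=2)                 # (N, C, L)
        out_h = (x.size(2) - self.kernel_size) // self.stride + 1
        out_w = (x.size(3) - self.kernel_size) // self.stride + 1
        # Return the pooled features plus a per-location attention map for inspection.
        return pooled.view(n, x.size(1), out_h, out_w), torch.sigmoid(logits)
```

Under these assumptions, such a layer could be dropped in place of a `MaxPool2d` in an existing backbone (matching the abstract's claim of pluggability into already trained networks), and the returned attention map is the quantity one would visualize, or weakly supervise, for interpretation and knowledge injection.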
