Paper Title

Interpreting Deep Neural Networks with Relative Sectional Propagation by Analyzing Comparative Gradients and Hostile Activations

Paper Authors

Woo-Jeoung Nam, Jaesik Choi, Seong-Whan Lee

Paper Abstract

The clear transparency of Deep Neural Networks (DNNs) is hampered by complex internal structures and nonlinear transformations along deep hierarchies. In this paper, we propose a new attribution method, Relative Sectional Propagation (RSP), for fully decomposing the output predictions with the characteristics of class-discriminative attributions and clear objectness. We carefully revisit some shortcomings of backpropagation-based attribution methods, which are trade-off relations in decomposing DNNs. We define a hostile factor as an element that interferes with finding the attributions of the target, and we propagate it in a distinguishable way to overcome the non-suppressed nature of activated neurons. As a result, it is possible to assign the bi-polar relevance scores of the target (positive) and hostile (negative) attributions while maintaining each attribution aligned with the importance. We also present purging techniques to prevent the decrement of the gap between the relevance scores of the target and hostile attributions during backward propagation by eliminating the conflicting units to channel attribution map. Therefore, our method makes it possible to decompose the predictions of DNNs with clearer class-discriminativeness and detailed elucidations of activation neurons compared to the conventional attribution methods. In a verified experimental environment, we report the results of the assessments: (i) Pointing Game, (ii) mIoU, and (iii) Model Sensitivity with PASCAL VOC 2007, MS COCO 2014, and ImageNet datasets. The results demonstrate that our method outperforms existing backward decomposition methods, including distinctive and intuitive visualizations.
