Paper Title
Counterfactual Explanations for Misclassified Images: How Human and Machine Explanations Differ
Paper Authors
Paper Abstract
Counterfactual explanations have emerged as a popular solution to the eXplainable AI (XAI) problem of elucidating the predictions of black-box deep-learning systems, owing to their psychological validity, flexibility across problem domains, and proposed legal compliance. Although over 100 counterfactual methods exist, each claiming to generate plausible explanations akin to those people prefer, few ($\sim7\%$) have actually been tested on users. Hence, the psychological validity of these counterfactual algorithms as effective XAI for image data has not been established. This issue is addressed here using a novel methodology that (i) gathers ground-truth, human-generated counterfactual explanations for misclassified images in two user studies and then (ii) compares these human-generated explanations to computationally generated explanations for the same misclassifications. Results indicate that humans do not "minimally edit" images when generating counterfactual explanations. Instead, they make larger, "meaningful" edits that better approximate prototypes in the counterfactual class.
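To make the abstract's contrast concrete, the minimal sketch below (not taken from the paper; the toy linear classifier, synthetic data, and all names are illustrative assumptions) compares a "minimal edit" counterfactual, i.e., the smallest perturbation that flips a prediction, with a "meaningful" edit that moves the instance toward the prototype (class mean) of the counterfactual class.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64  # flattened "image" dimension (e.g., an 8x8 grid of pixels)

# Synthetic two-class data drawn around different class means ("prototypes").
proto_0 = rng.normal(0.3, 0.05, D)
proto_1 = rng.normal(0.7, 0.05, D)
X = np.vstack([proto_0 + 0.1 * rng.standard_normal((50, D)),
               proto_1 + 0.1 * rng.standard_normal((50, D))])
y = np.array([0] * 50 + [1] * 50)

# Toy linear classifier: predict class 1 when w.x + b > 0.
w = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
b = -w @ X.mean(axis=0)
predict = lambda v: int(w @ v + b > 0)

x = X[0]                                   # an instance predicted as class 0
target_prototype = X[y == 1].mean(axis=0)  # prototype of the counterfactual class

# (a) "Minimal edit": smallest L2 perturbation that crosses the decision boundary.
margin = -(w @ x + b)                      # score distance to the boundary
x_minimal = x + (margin + 1e-3) * w / (w @ w)

# (b) "Meaningful edit": interpolate toward the counterfactual-class prototype
#     until the prediction flips; this changes many more pixels, by larger amounts.
alpha, x_meaningful = 0.0, x.copy()
while predict(x_meaningful) == 0 and alpha < 1.0:
    alpha += 0.05
    x_meaningful = (1 - alpha) * x + alpha * target_prototype

print("minimal edit    L2 distance:", round(float(np.linalg.norm(x_minimal - x)), 3))
print("meaningful edit L2 distance:", round(float(np.linalg.norm(x_meaningful - x)), 3))
print("both flip the prediction to class:", predict(x_minimal), predict(x_meaningful))
```

On this toy setup the prototype-guided edit is much larger in L2 terms than the boundary-crossing one, loosely mirroring the reported finding that human-generated counterfactuals favour larger, prototype-approximating changes over minimal pixel perturbations.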