Title

Diagnosing AI Explanation Methods with Folk Concepts of Behavior

Authors

Alon Jacovi, Jasmijn Bastings, Sebastian Gehrmann, Yoav Goldberg, Katja Filippova

Abstract

We investigate a formalism for the conditions of a successful explanation of AI. We consider "success" to depend not only on what information the explanation contains, but also on what information the human explainee understands from it. Theory of mind literature discusses the folk concepts that humans use to understand and generalize behavior. We posit that folk concepts of behavior provide us with a "language" that humans understand behavior with. We use these folk concepts as a framework of social attribution by the human explainee - the information constructs that humans are likely to comprehend from explanations - by introducing a blueprint for an explanatory narrative (Figure 1) that explains AI behavior with these constructs. We then demonstrate that many XAI methods today can be mapped to folk concepts of behavior in a qualitative evaluation. This allows us to uncover their failure modes that prevent current methods from explaining successfully - i.e., the information constructs that are missing for any given XAI method, and whose inclusion can decrease the likelihood of misunderstanding AI behavior.
