Paper title
xRAI: Explainable Representations through AI
Paper authors
Paper abstract
We present xRAI, an approach for extracting, from a trained neural network, a symbolic representation of the mathematical function the network was supposed to learn. The approach is based on training a so-called interpretation network that receives the weights and biases of the trained network as input and outputs a numerical representation of the target function, which can be translated directly into a symbolic representation. We show that interpretation nets for different classes of functions can be trained offline on synthetic data, using Boolean functions and low-order polynomials as examples. We show that the training is rather efficient and that the quality of the results is promising. Our work aims to contribute to a better understanding of neural decision making by making the target function explicit.
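The two ends of the pipeline described in the abstract can be illustrated with a minimal sketch. It assumes that the input to the interpretation network is a flat vector of all weights and biases, and that for low-order polynomials the numerical representation is a coefficient vector ordered by increasing power. Both encodings, and the helper names `flatten_parameters` and `coefficients_to_polynomial`, are hypothetical and are not taken from the paper; the interpretation network itself is omitted.

```python
from typing import List, Sequence


def flatten_parameters(
    weights: Sequence[Sequence[Sequence[float]]],
    biases: Sequence[Sequence[float]],
) -> List[float]:
    """Concatenate every layer's weight matrix (row by row) and bias vector
    into one flat list. Assumption: this is the input format of the
    interpretation network."""
    flat: List[float] = []
    for layer_weights, layer_biases in zip(weights, biases):
        for row in layer_weights:
            flat.extend(row)
        flat.extend(layer_biases)
    return flat


def coefficients_to_polynomial(
    coeffs: Sequence[float], var: str = "x", tol: float = 1e-6
) -> str:
    """Translate a coefficient vector [c0, c1, c2, ...] into a readable
    polynomial string, dropping terms whose magnitude is below `tol`.
    Assumption: the numerical output is ordered by increasing power."""
    terms: List[str] = []
    for power, c in enumerate(coeffs):
        if abs(c) < tol:
            continue
        if power == 0:
            terms.append(f"{c:g}")
        elif power == 1:
            terms.append(f"{c:g}*{var}")
        else:
            terms.append(f"{c:g}*{var}^{power}")
    return " + ".join(terms) if terms else "0"


# A toy network with one layer: a 2x2 weight matrix and a bias vector of length 2.
network_input = flatten_parameters([[[1.0, 2.0], [3.0, 4.0]]], [[5.0, 6.0]])

# A made-up output vector standing in for what an interpretation network might predict.
symbolic = coefficients_to_polynomial([1.0, 0.0, -2.5])
```

In this sketch the symbolic form is read directly from the predicted vector, which is one way to interpret "directly translated" for polynomials; Boolean functions would need a different encoding.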