Title
Algebraic Learning: Towards Interpretable Information Modeling
Authors
Abstract
Along with the proliferation of digital data collected using sensor technologies and a boost in computing power, Deep Learning (DL) based approaches have drawn enormous attention in the past decade due to their impressive performance in extracting complex relations from raw data and representing valuable information. Meanwhile, owing to its notorious black-box nature, the merit of DL has been widely debated due to its lack of interpretability. On the one hand, DL only utilizes statistical features contained in raw data while ignoring human knowledge of the underlying system, which results in both data inefficiency and trust issues; on the other hand, a trained DL model does not provide researchers with any extra insight about the underlying system beyond its output, which, however, is the essence of most fields of science, e.g., physics and economics.

This thesis addresses the issue of interpretability in general information modeling and endeavors to ease the problem from two scopes. Firstly, a problem-oriented perspective is applied to incorporate knowledge into modeling practice, where interesting mathematical properties emerge naturally and cast constraints on the modeling. Secondly, given a trained model, various methods could be applied to extract further insights about the underlying system. These two pathways are termed guided model design and secondary measurements. Remarkably, a novel scheme emerges for the modeling practice in statistical learning: Algebraic Learning (AgLr). Instead of being restricted to the discussion of any specific model, AgLr starts from idiosyncrasies of a learning task itself and studies the structure of a legitimate model class. This novel scheme demonstrates the noteworthy value of abstract algebra for general AI, which has been overlooked in recent progress, and could shed further light on interpretable information modeling.
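As an illustrative sketch (not taken from the thesis) of how a task's algebraic structure can constrain the legitimate model class, consider a learning task whose label is invariant under any permutation of the input elements. Any legitimate model must then satisfy f(x) = f(Px) for every permutation P, and one algebraic way to guarantee this is to pool element-wise features with a symmetric function before the final mapping. The feature map `phi` and pooling choice below are hypothetical:

```python
import numpy as np

# Hypothetical sketch: a task symmetry (permutation invariance of a
# set-valued input) constrains the class of legitimate models.

rng = np.random.default_rng(0)

def phi(x):
    # Per-element feature map (an arbitrary, illustrative choice).
    return np.stack([x, x**2], axis=-1)

def f(xs):
    # Summing over elements is symmetric, so the permutation group
    # acts trivially on the pooled representation: f is invariant
    # by construction, not by training.
    pooled = phi(xs).sum(axis=0)
    return np.tanh(pooled).sum()

xs = rng.normal(size=5)
perm = rng.permutation(5)
assert np.isclose(f(xs), f(xs[perm]))  # invariance holds for any permutation
```

The point of the sketch is that the constraint is enforced structurally: the symmetry of the task dictates the form of the model class before any parameters are fit.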