论文标题

分类的最佳决策图

Optimal Decision Diagrams for Classification

论文作者

Florio, Alexandre M., Martins, Pedro, Schiffer, Maximilian, Serra, Thiago, Vidal, Thibaut

论文摘要

分类的决策图比决策树具有一些显着的优势,因为它们的内部连接可以在训练时间确定,并且它们的宽度并不能随着其深度而成倍增长。因此,决策图通常不太容易发生内部节点的数据碎片。但是,培训这些分类器的固有复杂性是其广泛采用的长期障碍。在这种情况下,我们从数学编程的角度研究了最佳决策图(赔率)的培训。我们介绍了一种新型的混合企业线性编程模型,用于培训,并证明其适用于许多实际重要性数据集。此外,我们展示了如何轻松扩展该模型以公平,简约和稳定性概念。我们提出了数值分析,表明我们的模型允许在短时间计算时间内进行训练赔率,并且比最佳决策树的赔率更高,同时允许提高稳定性,而不会出现明显的准确性损失。

Decision diagrams for classification have some notable advantages over decision trees, as their internal connections can be determined at training time and their width is not bound to grow exponentially with their depth. Accordingly, decision diagrams are usually less prone to data fragmentation in internal nodes. However, the inherent complexity of training these classifiers acted as a long-standing barrier to their widespread adoption. In this context, we study the training of optimal decision diagrams (ODDs) from a mathematical programming perspective. We introduce a novel mixed-integer linear programming model for training and demonstrate its applicability for many datasets of practical importance. Further, we show how this model can be easily extended for fairness, parsimony, and stability notions. We present numerical analyses showing that our model allows training ODDs in short computational times, and that ODDs achieve better accuracy than optimal decision trees, while allowing for improved stability without significant accuracy losses.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源