Paper Title
Large-scale empirical validation of Bayesian Network structure learning algorithms with noisy data
Paper Authors
Paper Abstract
Numerous Bayesian Network (BN) structure learning algorithms have been proposed in the literature over the past few decades. Each publication makes an empirical or theoretical case for the algorithm it proposes, and results across studies are often inconsistent in their claims about which algorithm is 'best'. This is partly because there is no agreed evaluation approach to determine their effectiveness. Moreover, each algorithm is based on a set of assumptions, such as complete data and causal sufficiency, and tends to be evaluated with data that conforms to these assumptions, however unrealistic these assumptions may be in the real world. As a result, it is widely accepted that synthetic performance overestimates real performance, although to what degree this may happen remains unknown. This paper investigates the performance of 15 structure learning algorithms. We propose a methodology that applies the algorithms to data that incorporates synthetic noise, in an effort to better understand the performance of structure learning algorithms when applied to real data. Each algorithm is tested over multiple case studies, sample sizes, and types of noise, and assessed with multiple evaluation criteria. This work involved approximately 10,000 graphs with a total structure learning runtime of seven months. It provides the first large-scale empirical validation of BN structure learning algorithms under different assumptions of data noise. The results suggest that traditional synthetic performance may overestimate real-world performance by anywhere between 10% and more than 50%. They also show that while score-based learning is generally superior to constraint-based learning, a higher fitting score does not necessarily imply a more accurate causal graph. To facilitate comparisons with future studies, we have made all data, raw results, graphs and BN models freely available online.
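
The abstract describes the noise-injection methodology only at a high level. As a purely illustrative sketch, and not the authors' actual procedure, the Python snippet below shows one way synthetic noise could be incorporated into a discrete dataset before structure learning; the noise types (random corruption and MCAR missingness), variable names, and rates are assumptions made for this example.

import numpy as np
import pandas as pd

def inject_noise(df, missing_rate=0.05, corrupt_rate=0.05, seed=0):
    # Illustrative noise injection (assumed, not the paper's exact scheme):
    # corrupt some cells by replacing them with a randomly drawn state of the
    # same variable, then blank out other cells completely at random (MCAR).
    rng = np.random.default_rng(seed)
    noisy = df.copy()
    n_rows = len(noisy)
    for col in noisy.columns:
        states = df[col].dropna().unique()
        corrupt_mask = rng.random(n_rows) < corrupt_rate
        noisy.loc[corrupt_mask, col] = rng.choice(states, size=int(corrupt_mask.sum()))
        missing_mask = rng.random(n_rows) < missing_rate
        noisy.loc[missing_mask, col] = np.nan
    return noisy

# Toy discrete dataset standing in for samples generated from a known ("true") BN.
rng = np.random.default_rng(1)
clean = pd.DataFrame({
    "Smoker": rng.choice(["yes", "no"], size=1000),
    "Cancer": rng.choice(["yes", "no"], size=1000),
    "Xray": rng.choice(["pos", "neg"], size=1000),
})
noisy = inject_noise(clean, missing_rate=0.10, corrupt_rate=0.05)
# `noisy` would then be passed to each structure learning algorithm, and the
# learned graph compared against the known ground-truth structure.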