Paper Title

Approximate Summaries for Why and Why-not Provenance (Extended Version)

Authors

Seokki Lee, Bertram Ludaescher, Boris Glavic

Abstract

Why and why-not provenance have been studied extensively in recent years. However, why-not provenance, and to a lesser degree why provenance, can be very large, resulting in severe scalability and usability challenges. In this paper, we introduce a novel approximate summarization technique for provenance that overcomes these challenges. Our approach uses patterns to encode (why-not) provenance concisely. We develop techniques for efficiently computing provenance summaries that balance informativeness, conciseness, and completeness. To achieve scalability, we integrate sampling techniques into provenance capture and summarization. Our approach is the first to scale to large datasets and to generate comprehensive and meaningful summaries.
