Paper Title

Explicit and Implicit Pattern Relation Analysis for Discovering Actionable Negative Sequences

Authors

Wang, Wei; Cao, Longbing

Abstract

Real-life events, behaviors and interactions produce sequential data. An important but rarely explored problem is to analyze those non-occurring (also called negative) yet important sequences, forming negative sequence analysis (NSA). A typical NSA area is to discover negative sequential patterns (NSPs) consisting of important non-occurring and occurring elements and patterns. The limited existing work on NSP mining relies on frequentist and downward closure property-based pattern selection, producing large and highly redundant NSP sets that are not actionable for business decision-making. This work makes the first attempt at actionable NSP discovery. It builds an NSP graph representation, quantifies both explicit occurrence-based and implicit non-occurrence-based element and pattern relations, and then discovers significant, diverse and informative NSPs in the NSP graph that represent the entire NSP set, yielding actionable NSPs. EINSP, a DPP-based NSP representation and actionable NSP discovery method, makes novel and significant contributions to NSA and sequence analysis: (1) it represents NSPs by a determinantal point process (DPP) based graph; (2) it quantifies actionable NSPs in terms of their statistical significance, diversity, and strength of explicit/implicit element/pattern relations; and (3) it models and measures both explicit and implicit element/pattern relations in the DPP-based NSP graph to represent direct and indirect couplings between NSP items, elements and patterns. We analyze the effectiveness of EINSP extensively from theoretical and empirical perspectives, including complexity, item/pattern coverage, pattern size and diversity, implicit pattern relation strength, and data factors.
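The abstract's idea of picking a small set of significant yet diverse NSPs with a determinantal point process can be illustrated with a minimal sketch. This is not the EINSP algorithm. It is a generic quality-diversity DPP kernel, L[i][j] = q_i * sim(i, j) * q_j, combined with greedy subset selection. The toy patterns, the quality scores, the use of Jaccard similarity over pattern elements, and all function names are assumptions made purely for illustration; the paper's own graph construction and implicit-relation measures are not reproduced here.

```python
def jaccard(a, b):
    """Jaccard similarity between two patterns, viewed as sets of elements."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [list(map(float, row)) for row in matrix]
    n = len(m)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return det


def build_kernel(patterns, quality):
    """Quality-diversity kernel: L[i][j] = q_i * sim(i, j) * q_j."""
    n = len(patterns)
    return [
        [quality[i] * jaccard(patterns[i], patterns[j]) * quality[j] for j in range(n)]
        for i in range(n)
    ]


def greedy_select(kernel, k):
    """Greedily add the index that maximizes det of the selected submatrix."""
    selected = []
    for _ in range(k):
        best, best_det = None, 0.0
        for i in range(len(kernel)):
            if i in selected:
                continue
            idx = selected + [i]
            sub = [[kernel[r][c] for c in idx] for r in idx]
            d = determinant(sub)
            if d > best_det:
                best, best_det = i, d
        if best is None:  # no candidate adds positive volume
            break
        selected.append(best)
    return selected


# Hypothetical toy NSPs; "¬b" marks a non-occurring element.
patterns = [
    ("a", "¬b", "c"),
    ("a", "¬b", "d"),
    ("e", "¬f"),
    ("a", "¬b", "c", "d"),
]
# Hypothetical significance scores, one per pattern.
quality = [0.9, 0.8, 0.75, 0.85]

selected = greedy_select(build_kernel(patterns, quality), k=2)
print([patterns[i] for i in selected])
```

In this toy run the second pick is the pattern that shares no elements with the first, even though two overlapping patterns have higher quality scores. That is the redundancy-reducing behavior a DPP-style selection is meant to provide.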