DPP的最大可能学习的硬度

论文标题

DPP的最大可能学习的硬度

Hardness of Maximum Likelihood Learning of DPPs

论文作者

Grigorescu, Elena, Juba, Brendan, Wimmer, Karl, Xie, Ning

论文摘要

确定点过程（DPPS）是一种用于负相关集的广泛使用的概率模型。 DPP已成功地用于机器学习应用程序中，以选择一个多样化但代表性的数据子集。在机器学习中有关DPPS的开创性工作中，Kulesza在他的博士学位论文（2011）中猜想，为给定数据集找到最大似然DPP模型的问题是NP完整的。在这项工作中，我们证明了Kulesza的猜想。实际上，我们证明了以下更强的近似结果：即使计算$ \ left（1-o（\ frac {1} {\ log^9 {n}}}）\ right）$ - 近似值至$ n $ elements的DPP的最大log-likelihood，np elements是NP-Complete。同时，我们还获得了第一个多项式时算法，该算法达到了最佳的log-oikelienie of to the the the-log-fikelienie：近似因子为$ \ frac {1} {（1+o（1+o（1+o（1））\ log log {m}} $ novecrientionally（对于$ mm $ mm $ mm $ mm） $ 1- \ frac {1+o（1）} {\ log n} $如果所有$ n $元素出现在$ o（1/n）$ - 子集的分数中。在技术方面，我们减少了数据集中DPP的最大对数可能性，以求解超图上“向量着色”问题的差距实例。这样的超图是建立在Bogdanov，Obata和Trevisan的有限度图建造上的（FOCS 2002），并通过Alon和Capalbo的强大扩展器进一步增强（FOCS 2007），以实现我们的目的。

Determinantal Point Processes (DPPs) are a widely used probabilistic model for negatively correlated sets. DPPs have been successfully employed in Machine Learning applications to select a diverse, yet representative subset of data. In seminal work on DPPs in Machine Learning, Kulesza conjectured in his PhD Thesis (2011) that the problem of finding a maximum likelihood DPP model for a given data set is NP-complete. In this work we prove Kulesza's conjecture. In fact, we prove the following stronger hardness of approximation result: even computing a $\left(1-O(\frac{1}{\log^9{N}})\right)$-approximation to the maximum log-likelihood of a DPP on a ground set of $N$ elements is NP-complete. At the same time, we also obtain the first polynomial-time algorithm that achieves a nontrivial worst-case approximation to the optimal log-likelihood: the approximation factor is $\frac{1}{(1+o(1))\log{m}}$ unconditionally (for data sets that consist of $m$ subsets), and can be improved to $1-\frac{1+o(1)}{\log N}$ if all $N$ elements appear in a $O(1/N)$-fraction of the subsets. In terms of techniques, we reduce approximating the maximum log-likelihood of DPPs on a data set to solving a gap instance of a "vector coloring" problem on a hypergraph. Such a hypergraph is built on a bounded-degree graph construction of Bogdanov, Obata and Trevisan (FOCS 2002), and is further enhanced by the strong expanders of Alon and Capalbo (FOCS 2007) to serve our purposes.

下载PDF全文

下载文献需遵守相关版权规定

论文标题