论文标题
DPP的最大可能学习的硬度
Hardness of Maximum Likelihood Learning of DPPs
论文作者
论文摘要
确定点过程(DPPS)是一种用于负相关集的广泛使用的概率模型。 DPP已成功地用于机器学习应用程序中,以选择一个多样化但代表性的数据子集。在机器学习中有关DPPS的开创性工作中,Kulesza在他的博士学位论文(2011)中猜想,为给定数据集找到最大似然DPP模型的问题是NP完整的。 在这项工作中,我们证明了Kulesza的猜想。实际上,我们证明了以下更强的近似结果:即使计算$ \ left(1-o(\ frac {1} {\ log^9 {n}}})\ right)$ - 近似值至$ n $ elements的DPP的最大log-likelihood,np elements是NP-Complete。同时,我们还获得了第一个多项式时算法,该算法达到了最佳的log-oikelienie of to the the the-log-fikelienie:近似因子为$ \ frac {1} {(1+o(1+o(1+o(1))\ log log {m}} $ novecrientionally(对于$ mm $ mm $ mm $ mm) $ 1- \ frac {1+o(1)} {\ log n} $如果所有$ n $元素出现在$ o(1/n)$ - 子集的分数中。 在技术方面,我们减少了数据集中DPP的最大对数可能性,以求解超图上“向量着色”问题的差距实例。这样的超图是建立在Bogdanov,Obata和Trevisan的有限度图建造上的(FOCS 2002),并通过Alon和Capalbo的强大扩展器进一步增强(FOCS 2007),以实现我们的目的。
Determinantal Point Processes (DPPs) are a widely used probabilistic model for negatively correlated sets. DPPs have been successfully employed in Machine Learning applications to select a diverse, yet representative subset of data. In seminal work on DPPs in Machine Learning, Kulesza conjectured in his PhD Thesis (2011) that the problem of finding a maximum likelihood DPP model for a given data set is NP-complete. In this work we prove Kulesza's conjecture. In fact, we prove the following stronger hardness of approximation result: even computing a $\left(1-O(\frac{1}{\log^9{N}})\right)$-approximation to the maximum log-likelihood of a DPP on a ground set of $N$ elements is NP-complete. At the same time, we also obtain the first polynomial-time algorithm that achieves a nontrivial worst-case approximation to the optimal log-likelihood: the approximation factor is $\frac{1}{(1+o(1))\log{m}}$ unconditionally (for data sets that consist of $m$ subsets), and can be improved to $1-\frac{1+o(1)}{\log N}$ if all $N$ elements appear in a $O(1/N)$-fraction of the subsets. In terms of techniques, we reduce approximating the maximum log-likelihood of DPPs on a data set to solving a gap instance of a "vector coloring" problem on a hypergraph. Such a hypergraph is built on a bounded-degree graph construction of Bogdanov, Obata and Trevisan (FOCS 2002), and is further enhanced by the strong expanders of Alon and Capalbo (FOCS 2007) to serve our purposes.