Paper Title
Tight Bounds on the Hardness of Learning Simple Nonparametric Mixtures
Paper Authors
Paper Abstract
We study the problem of learning nonparametric distributions in a finite mixture, and establish tight bounds on the sample complexity for learning the component distributions in such models. Namely, we are given i.i.d. samples from a pdf $f$ where $$ f=w_1f_1+w_2f_2, \quad w_1+w_2=1, \quad w_1,w_2>0 $$ and we are interested in learning each component $f_i$. Without any assumptions on $f_i$, this problem is ill-posed. In order to identify the components $f_i$, we assume that each $f_i$ can be written as a convolution of a Gaussian and a compactly supported density $\nu_i$ with $\text{supp}(\nu_1)\cap \text{supp}(\nu_2)=\emptyset$. Our main result shows that $(\frac{1}{\varepsilon})^{\Omega(\log\log \frac{1}{\varepsilon})}$ samples are required for estimating each $f_i$. The proof relies on a quantitative Tauberian theorem that yields a fast rate of approximation with Gaussians, which may be of independent interest. To show this is tight, we also propose an algorithm that uses $(\frac{1}{\varepsilon})^{O(\log\log \frac{1}{\varepsilon})}$ samples to estimate each $f_i$. Unlike existing approaches to learning latent variable models based on moment-matching and tensor methods, our proof instead involves a delicate analysis of an ill-conditioned linear system via orthogonal functions. Combining these bounds, we conclude that the optimal sample complexity of this problem lies strictly between polynomial and exponential, which is uncommon in learning theory.
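To make the generative model concrete, the following is a minimal sketch of how i.i.d. samples from such a mixture arise. The specific choices here (weights, unit-variance Gaussian, uniform densities on $[-2,-1]$ and $[1,2]$ as the compactly supported $\nu_1, \nu_2$ with disjoint supports) are illustrative assumptions, not parameters from the paper; the key point is that convolving $\nu_i$ with a Gaussian is the same as adding independent Gaussian noise to a draw from $\nu_i$.

```python
import numpy as np

def sample_mixture(n, w1=0.3, sigma=1.0, rng=None):
    """Draw n i.i.d. samples from f = w1*f1 + w2*f2, where each
    f_i = N(0, sigma^2) convolved with nu_i, and the nu_i are
    (illustratively) uniform densities on disjoint compact supports."""
    rng = np.random.default_rng(rng)
    # Choose the latent component: 1 with prob w1, 2 with prob w2 = 1 - w1.
    labels = rng.random(n) < w1
    # nu_1 = Uniform[-2, -1], nu_2 = Uniform[1, 2]: disjoint supports,
    # matching the identifiability assumption supp(nu_1) ∩ supp(nu_2) = ∅.
    latent = np.where(labels,
                      rng.uniform(-2.0, -1.0, size=n),
                      rng.uniform(1.0, 2.0, size=n))
    # Convolution with a Gaussian = adding independent Gaussian noise.
    return latent + sigma * rng.normal(size=n)

samples = sample_mixture(10_000, w1=0.3, rng=0)
```

The learner observes only `samples`, with no access to the component labels or the latent draws; the paper's question is how many such samples are needed to recover each $f_i$ to accuracy $\varepsilon$.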