Paper title
Reweighting samples under covariate shift using a Wasserstein distance criterion
Paper authors
Paper abstract
Consider two random variables with different laws, accessible only through finite-size i.i.d. samples. We address how to reweight the first sample so that its empirical distribution converges to the true law of the second sample as the sizes of both samples go to infinity. We study an optimal reweighting that minimizes the Wasserstein distance between the empirical measures of the two samples and leads to an expression of the weights in terms of nearest neighbors. Consistency and some asymptotic convergence rates in terms of expected Wasserstein distance are derived; they do not require the assumption of absolute continuity of one random variable with respect to the other. These results have applications in uncertainty quantification for decoupled estimation and in bounding the generalization error of nearest neighbor regression under covariate shift.
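
The nearest-neighbor form of the weights mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard construction in which each point of the second sample is assigned to its nearest point in the first sample under the Euclidean distance, and each first-sample point receives the fraction of second-sample points assigned to it. The function name `nearest_neighbor_weights`, the argument names, and the tie-breaking rule are illustrative choices and are not taken from the paper.

```python
import math


def nearest_neighbor_weights(source, target):
    """Weights on `source` points from nearest-neighbor counts of `target` points.

    Assumption (not from the paper's text): Euclidean distance, with ties
    broken in favor of the lowest source index.
    `source` and `target` are sequences of equal-dimension coordinate tuples.
    """
    counts = [0] * len(source)
    for y in target:
        # Index of the source point closest to this target point.
        nearest = min(range(len(source)), key=lambda i: math.dist(source[i], y))
        counts[nearest] += 1
    # Normalize so the weights sum to 1.
    return [c / len(target) for c in counts]


# Toy one-dimensional example: the far-away source point gets zero weight.
source = [(0.0,), (1.0,), (10.0,)]
target = [(0.1,), (0.9,), (1.2,), (0.2,)]
print(nearest_neighbor_weights(source, target))  # [0.5, 0.5, 0.0]
```

In this toy example the source point at 10 has no target point nearby, so it receives weight zero, while the two source points close to the target sample share the mass equally.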