Paper Title


Privately Estimating a Gaussian: Efficient, Robust and Optimal

Paper Authors

Daniel Alabi, Pravesh K. Kothari, Pranay Tankala, Prayaag Venkat, Fred Zhang

Abstract


In this work, we give efficient algorithms for privately estimating a Gaussian distribution in both the pure and approximate differential privacy (DP) models with optimal dependence on the dimension in the sample complexity. In the pure DP setting, we give an efficient algorithm that estimates an unknown $d$-dimensional Gaussian distribution up to an arbitrarily tiny total variation error using $\widetilde{O}(d^2 \log \kappa)$ samples while tolerating a constant fraction of adversarial outliers. Here, $\kappa$ is the condition number of the target covariance matrix. The sample bound matches the best non-private estimators in the dependence on the dimension (up to a polylogarithmic factor). We prove a new lower bound on differentially private covariance estimation to show that the dependence on the condition number $\kappa$ in the above sample bound is also tight. Prior to our work, only identifiability results (yielding inefficient super-polynomial-time algorithms) were known for the problem. In the approximate DP setting, we give an efficient algorithm to estimate an unknown Gaussian distribution up to an arbitrarily tiny total variation error using $\widetilde{O}(d^2)$ samples while tolerating a constant fraction of adversarial outliers. Prior to our work, all efficient approximate DP algorithms incurred a super-quadratic sample cost or were not outlier-robust. For the special case of mean estimation, our algorithm achieves the optimal sample complexity of $\widetilde{O}(d)$, improving on the $\widetilde{O}(d^{1.5})$ bound from prior work. Our pure DP algorithm relies on a recursive private preconditioning subroutine that utilizes recent work on private mean estimation [Hopkins et al., 2022]. Our approximate DP algorithm is based on a substantial upgrade of the method of stabilizing convex relaxations introduced in [Kothari et al., 2022].
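For readers unfamiliar with the primitives the abstract builds on, below is a minimal sketch of private mean estimation via the Gaussian mechanism (clip each sample, average, add calibrated noise). This is a generic textbook baseline, not the paper's algorithm: the function name, clipping radius, and noise calibration are illustrative assumptions, and unlike the paper's estimators this baseline is neither outlier-robust nor sample-optimal.

```python
import numpy as np

def private_mean(samples, clip_radius, epsilon, delta, rng=None):
    """(epsilon, delta)-DP mean estimate via the Gaussian mechanism.

    Hypothetical illustration only: clip each sample to an l2 ball of
    radius `clip_radius`, average, and add Gaussian noise scaled to the
    resulting l2 sensitivity. Not the algorithm from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    samples = np.asarray(samples, dtype=float)
    n, d = samples.shape
    # Clip each row to l2 norm <= clip_radius, bounding each sample's influence.
    norms = np.linalg.norm(samples, axis=1, keepdims=True)
    clipped = samples * np.minimum(1.0, clip_radius / np.maximum(norms, 1e-12))
    # Replacing one sample can move the clipped mean by at most 2R/n in l2.
    sensitivity = 2.0 * clip_radius / n
    # Standard Gaussian-mechanism calibration for (epsilon, delta)-DP.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return clipped.mean(axis=0) + rng.normal(0.0, sigma, size=d)
```

Note the dependence on a known clipping radius: removing that assumption (e.g., via private preconditioning of the covariance, as the abstract describes) is where the technical difficulty and the $\log \kappa$ dependence arise.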
