Paper title
Integer Subspace Differential Privacy
Paper authors
Abstract
We propose new differential privacy solutions for settings in which external \emph{invariants} and \emph{integer} constraints are simultaneously enforced on the data product. These requirements arise in real-world applications of private data curation, including the public release of the 2020 U.S. Decennial Census. They pose a great challenge to the production of provably private data products with adequate statistical usability. We propose \emph{integer subspace differential privacy} to rigorously articulate the privacy guarantee when data products maintain both the invariants and integer characteristics, and we demonstrate the composition and post-processing properties of our proposal. To address the challenge of sampling from a potentially highly restricted discrete space, we devise a pair of unbiased additive mechanisms, the generalized Laplace and the generalized Gaussian mechanisms, by solving the Diophantine equations defined by the constraints. The proposed mechanisms have good accuracy, with errors exhibiting sub-exponential and sub-Gaussian tail probabilities, respectively. To implement our proposal, we design an MCMC algorithm and supply an empirical convergence assessment using estimated upper bounds on the total variation distance via $L$-lag coupling. We demonstrate the efficacy of our proposal with applications to a synthetic problem with intersecting invariants, a sensitive contingency table with known margins, and the 2010 Census county-level demonstration data with mandated fixed state population totals.
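As a rough illustration of an integer-valued additive mechanism that preserves invariants, consider a $2 \times 2$ contingency table whose row and column totals must be released exactly. The integer solutions of the homogeneous margin constraints are the integer multiples of a single basis table, so invariant-preserving integer noise can be drawn by sampling one integer coefficient. The sketch below is a toy under stated assumptions, not the mechanism as specified in the paper: the names `BASIS`, `sample_discrete_laplace`, and `privatize_table` are hypothetical, the noise scale `exp(-4 * epsilon)` is an illustrative choice that is not calibrated to any query sensitivity, and nonnegativity of the cells is not enforced. In this one-dimensional case the noise can be sampled directly; the MCMC algorithm mentioned above targets settings where direct sampling is not available.

```python
import math
import random

# Assumption: for a 2x2 table with fixed row and column sums, every integer
# perturbation that keeps all margins unchanged is an integer multiple of BASIS.
BASIS = [[1, -1], [-1, 1]]


def sample_geometric(alpha, rng):
    """Draw G >= 0 with P(G >= g) = alpha**g, for 0 < alpha < 1 (inverse CDF)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return int(math.floor(math.log(u) / math.log(alpha)))


def sample_discrete_laplace(alpha, rng):
    """Draw K with P(K = k) proportional to alpha**abs(k).

    The difference of two i.i.d. geometric variables has this symmetric law,
    so the noise is integer-valued and has mean zero.
    """
    return sample_geometric(alpha, rng) - sample_geometric(alpha, rng)


def privatize_table(table, epsilon, rng):
    """Add margin-preserving integer noise to a 2x2 table (illustrative only)."""
    # The L1 norm of k * BASIS is 4 * |k|, so this choice of alpha gives a
    # noise density proportional to exp(-epsilon * ||noise||_1).
    # Calibration to a sensitivity is deliberately omitted (assumption).
    alpha = math.exp(-4.0 * epsilon)
    k = sample_discrete_laplace(alpha, rng)
    return [[table[i][j] + k * BASIS[i][j] for j in range(2)] for i in range(2)]


if __name__ == "__main__":
    rng = random.Random(2024)
    confidential = [[10, 20], [30, 40]]
    released = privatize_table(confidential, epsilon=0.25, rng=rng)
    print(released)
    print([sum(row) for row in released])  # row totals match the input
```

Whatever coefficient is drawn, the released table has the same row and column totals as the confidential one and contains only integers, which is the combination of requirements the abstract describes.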