Title
Assessing Statistical Disclosure Risk for Differentially Private, Hierarchical Count Data, with Application to the 2020 U.S. Decennial Census
Authors
Abstract
We propose Bayesian methods to assess the statistical disclosure risk of data released under zero-concentrated differential privacy, focusing on settings with a strong hierarchical structure and categorical variables with many levels. Risk assessment is performed by hypothesizing Bayesian intruders with various amounts of prior information and examining the distance between their posteriors and priors. We discuss applications of these risk assessment methods to differentially private data releases from the 2020 decennial census and perform simulation studies using public individual-level data from the 1940 decennial census. Among these studies, we examine how the data holder's choice of privacy parameter affects the disclosure risk and quantify the increase in risk when a hypothetical intruder incorporates substantial amounts of hierarchical information.
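The posterior-versus-prior comparison described in the abstract can be illustrated with a toy sketch for a single count. Everything in it is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, the noise is a continuous Gaussian calibrated so that a sensitivity-1 count satisfies rho-zCDP (sigma = 1 / sqrt(2 * rho)), the intruder's prior is uniform over a small range, and total variation stands in for whichever distance the authors use.

```python
import math
import random


def gaussian_sigma(rho, sensitivity=1.0):
    """Noise scale at which the Gaussian mechanism satisfies rho-zCDP."""
    return sensitivity / math.sqrt(2.0 * rho)


def release(true_count, rho, rng):
    """Noisy count as the data holder might publish it (toy version)."""
    return true_count + rng.gauss(0.0, gaussian_sigma(rho))


def posterior(prior, noisy, rho):
    """Intruder's posterior over counts 0..len(prior)-1 given the noisy release."""
    sigma = gaussian_sigma(rho)
    log_lik = [-0.5 * ((noisy - k) / sigma) ** 2 for k in range(len(prior))]
    shift = max(log_lik)  # subtract the max to avoid underflow
    weights = [p * math.exp(ll - shift) for p, ll in zip(prior, log_lik)]
    total = sum(weights)
    return [w / total for w in weights]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    rng = random.Random(0)
    prior = [1.0 / 11] * 11  # hypothetical uniform prior on counts 0..10
    true_count = 3
    for rho in (0.01, 1.0, 10.0):
        noisy = release(true_count, rho, rng)
        post = posterior(prior, noisy, rho)
        print(f"rho={rho}: TV(posterior, prior) = {total_variation(post, prior):.3f}")
```

In this toy setting a larger rho means less noise, so the intruder's posterior tends to move further from the prior. This matches the qualitative dependence on the privacy parameter that the abstract says the simulation studies examine.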