Paper title

Differential privacy and noisy confidentiality concepts for European population statistics

Author

Bach, Fabian

Abstract

The paper aims to give an overview of various approaches to statistical disclosure control based on random noise that are currently being discussed for official population statistics and censuses. A particular focus is on a stringent delineation between different concepts influencing the discussion: we separate clearly between risk measures, noise distributions and output mechanisms - putting these concepts into scope and into relation with each other. After recapitulating differential privacy as a risk measure, the paper also remarks on utility and risk aspects of some specific output mechanisms and parameter setups, with special attention on static outputs that are rather typical in official population statistics. In particular, it is argued that unbounded noise distributions, such as plain Laplace, may jeopardise key unique census features without a clear need from a risk perspective. On the other hand, bounded noise distributions, such as the truncated Laplace or the cell key method, can be set up to keep unique census features while controlling disclosure risks in census-like outputs. Finally, the paper analyses some typical attack scenarios to constrain generic noise parameter ranges that suggest a good risk/utility compromise for the 2021 EU census output scenario. The analysis also shows that strictly differentially private mechanisms would be severely constrained in this scenario.
