On the Choice of Databases in Differential Privacy Composition

Paper title

On the Choice of Databases in Differential Privacy Composition

Authors

Valentin Hartmann, Vincent Bindschaedler, Robert West

Abstract

Differential privacy (DP) is a widely applied paradigm for releasing data while maintaining user privacy. Its success is in large part due to its composition property, which guarantees privacy even when data is released multiple times. Consequently, composition has received a lot of attention from the research community: there exist several composition theorems for adversaries with different amounts of flexibility in their choice of mechanisms. But apart from mechanisms, the adversary can also choose the databases on which these mechanisms are invoked. The classic tool for analyzing the composition of DP mechanisms, the so-called composition experiment, allows neither for incorporating constraints on databases nor for different assumptions on the adversary's prior knowledge about database membership. We therefore propose a generalized composition experiment (GCE), which has this flexibility. We show that composition theorems that hold with respect to the classic composition experiment also hold with respect to the worst case of the GCE. This implies that existing composition theorems give a privacy guarantee for more cases than are explicitly covered by the classic composition experiment. Beyond these theoretical insights, we demonstrate two practical applications of the GCE: the first is to give better privacy bounds in the presence of restrictions on the choice of databases; the second is to reason about how the adversary's prior knowledge influences the privacy leakage. In this context, we show a connection between adversaries with an uninformative prior and subsampling, an important primitive in DP. To the best of our knowledge, this paper is the first to analyze the interplay between the databases in DP composition, and it thereby gives both a better understanding of composition and practical tools for obtaining better composition bounds.
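For readers unfamiliar with the two primitives the abstract mentions, the following is a minimal sketch of textbook basic composition and of the standard amplification-by-subsampling bound. It does not reproduce the GCE or any bound from the paper. The function names `basic_composition` and `subsampled_epsilon` are hypothetical, and the subsampling formula assumes Poisson subsampling with rate `q` applied before a pure epsilon-DP mechanism.

```python
import math


def basic_composition(epsilons, deltas):
    """Textbook basic composition: running mechanisms that are
    (eps_i, delta_i)-DP in sequence is (sum eps_i, sum delta_i)-DP."""
    return sum(epsilons), sum(deltas)


def subsampled_epsilon(epsilon, q):
    """Standard amplification-by-subsampling bound (assumption: Poisson
    subsampling, each record kept independently with probability q,
    followed by an epsilon-DP mechanism)."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("sampling rate q must lie in [0, 1]")
    return math.log(1.0 + q * (math.exp(epsilon) - 1.0))


# Three releases with hypothetical privacy parameters.
eps_total, delta_total = basic_composition([0.5, 0.5, 1.0], [1e-6, 1e-6, 0.0])

# Subsampling with q < 1 yields a smaller effective epsilon than the
# original mechanism; q = 1 (no subsampling) leaves it unchanged.
eps_amplified = subsampled_epsilon(1.0, 0.1)
```

Basic composition is the loosest of the commonly used bounds. The abstract's point is that such theorems are usually stated for an adversary who chooses databases freely, so tighter bounds may be available when the choice of databases is restricted.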
