Paper Title

Proof of Unlearning: Definitions and Instantiation

Authors

Jiasi Weng, Shenglong Yao, Yuefeng Du, Junjie Huang, Jian Weng, Cong Wang

Abstract

The "Right to be Forgotten" rule in machine learning (ML) practice enables some individual data to be deleted from a trained model, as pursued by recently developed machine unlearning techniques. To truly comply with the rule, a natural and necessary step is to verify whether the individual data are indeed deleted after unlearning. Yet, previous parameter-space verification metrics may be easily evaded by a distrustful model trainer. Thus, Thudi et al. recently presented a call to action on algorithm-level verification at USENIX Security'22. We respond to the call by reconsidering the unlearning problem in the scenario of machine learning as a service (MLaaS), and proposing a new definition framework for Proof of Unlearning (PoUL) at the algorithm level. Specifically, our PoUL definitions (i) enforce correctness properties on both the pre- and post-phases of unlearning, so as to prevent the state-of-the-art forging attacks; and (ii) highlight proper practicality requirements on both the prover and verifier sides, with minimal invasiveness to the off-the-shelf service pipeline and computational workloads. Under the definition framework, we subsequently present a trusted hardware-empowered instantiation using an SGX enclave, by logically incorporating an authentication layer for tracing the data lineage with a proving layer for supporting the audit of learning. We customize authenticated data structures to support large out-of-enclave storage with simple operation logic, and meanwhile enable proving complex unlearning logic with affordable memory footprints in the enclave. We finally validate the feasibility of the proposed instantiation with a proof-of-concept implementation and multi-dimensional performance evaluation.
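The abstract does not spell out the customized authenticated data structure; the sketch below is not the authors' design, but a minimal Merkle-tree example illustrating the general idea of how a small digest held inside an enclave can authenticate large out-of-enclave storage: the enclave keeps only the root hash, the bulk data lives outside, and membership proofs let the enclave check any block it reads back. Deleting (unlearning) a block changes the root, so stale proofs over forged lineage no longer verify. All names here are illustrative.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest used as the tree's hash function."""
    return hashlib.sha256(data).digest()


class MerkleTree:
    """Toy authenticated structure over out-of-enclave data blocks.

    In an SGX-style deployment, only `root` would be kept in enclave
    memory; blocks and proofs live in untrusted storage.
    """

    def __init__(self, blocks):
        level = [h(b) for b in blocks]
        self.levels = [level]
        while len(level) > 1:
            nxt = []
            for i in range(0, len(level), 2):
                left = level[i]
                # Duplicate the last node when the level has odd length.
                right = level[i + 1] if i + 1 < len(level) else left
                nxt.append(h(left + right))
            level = nxt
            self.levels.append(level)

    @property
    def root(self) -> bytes:
        return self.levels[-1][0]

    def proof(self, index):
        """Membership proof: list of (sibling_digest, sibling_is_right)."""
        path = []
        for level in self.levels[:-1]:
            sib = index ^ 1
            if sib >= len(level):
                sib = index  # odd level: sibling is the node itself
            path.append((level[sib], sib > index))
            index //= 2
        return path


def verify(root: bytes, block: bytes, path) -> bool:
    """Recompute the root from a block and its proof (enclave-side check)."""
    digest = h(block)
    for sibling, sibling_is_right in path:
        digest = h(digest + sibling) if sibling_is_right else h(sibling + digest)
    return digest == root
```

After unlearning a block, the tree is rebuilt without it; the new root differs from the old one, so the verifier can detect whether the deleted data still influences the lineage. The paper's actual structure additionally supports simple update logic and proving complex unlearning computations, which this sketch does not attempt.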
