Paper Title

De-biased Representation Learning for Fairness with Unreliable Labels

Paper Authors

Yixuan Zhang, Feng Zhou, Zhidong Li, Yang Wang, Fang Chen

Abstract

Removing bias while keeping all task-relevant information is challenging for fair representation learning methods, since they yield random or degenerate representations w.r.t. the labels when the sensitive attributes correlate with the labels. Existing works propose injecting the label information into the learning procedure to overcome such issues. However, the assumption that the observed labels are clean is not always met. In fact, label bias is acknowledged as the primary source inducing discrimination. In other words, fair pre-processing methods ignore the discrimination encoded in the labels either during the learning procedure or at the evaluation stage. This contradiction puts a question mark on the fairness of the learned representations. To circumvent this issue, we explore the following question: \emph{Can we learn fair representations predictive of latent ideal fair labels given only access to unreliable labels?} In this work, we propose a \textbf{D}e-\textbf{B}iased \textbf{R}epresentation Learning for \textbf{F}airness (DBRF) framework which disentangles the sensitive information from the non-sensitive attributes while keeping the learned representations predictive of the ideal fair labels rather than the observed biased ones. We formulate the de-biased learning framework through information-theoretic concepts such as mutual information and the information bottleneck. The core idea is that DBRF advocates not using unreliable labels for supervision when sensitive information benefits the prediction of those labels. Experimental results on both synthetic and real-world data demonstrate that DBRF effectively learns de-biased representations towards the ideal labels.
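To make the information-theoretic formulation above concrete, the following is a minimal sketch of a fairness-constrained information-bottleneck objective of the kind the abstract describes. The symbols and weighting here are illustrative assumptions, not the paper's exact objective: $X$ denotes the non-sensitive input attributes, $S$ the sensitive attribute, $\tilde{Y}$ the observed (unreliable) label, and $Z$ the learned representation with encoder $p_\theta(z \mid x)$.

\[
\min_{p_\theta(z \mid x)} \; I(Z; S) \;+\; \beta\, I(Z; X) \;-\; \lambda\, I(Z; \tilde{Y})
\]

In this reading, $I(Z; X)$ is the usual bottleneck compression term, $I(Z; S)$ penalizes sensitive information retained in the representation, and, per the core idea stated above, the supervision term $I(Z; \tilde{Y})$ would be down-weighted or withheld whenever the sensitive information is what makes $\tilde{Y}$ predictable.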
