Paper Title


Bias Mitigation Framework for Intersectional Subgroups in Neural Networks

Authors

Narine Kokhlikyan, Bilal Alsallakh, Fulton Wang, Vivek Miglani, Oliver Aobo Yang, David Adkins

Abstract


We propose a fairness-aware learning framework that mitigates intersectional subgroup bias associated with protected attributes. Prior research has primarily focused on mitigating one kind of bias by incorporating complex fairness-driven constraints into optimization objectives or by designing additional layers that focus on specific protected attributes. We introduce a simple and generic bias mitigation approach that prevents models from learning relationships between protected attributes and the output variable by reducing the mutual information between them. We demonstrate that our approach is effective in reducing bias with little or no drop in accuracy. We also show that models trained with our learning framework become causally fair and insensitive to the values of protected attributes. Finally, we validate our approach by studying feature interactions between protected and non-protected attributes. We demonstrate that these interactions are significantly reduced when our bias mitigation is applied.
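The abstract's core idea, discouraging the model from encoding the protected attribute in its output by penalizing the mutual information between them, can be sketched for discrete variables as below. This is a minimal illustration, not the paper's implementation: the function names, the plug-in (histogram-based) MI estimate, and the λ-weighted combined objective are all our assumptions.

```python
import math
from collections import Counter

def mutual_information(a, b):
    """Empirical (plug-in) mutual information, in nats, between two
    discrete sequences of equal length."""
    n = len(a)
    p_joint = Counter(zip(a, b))   # joint counts over (a, b) pairs
    p_a = Counter(a)               # marginal counts for a
    p_b = Counter(b)               # marginal counts for b
    # I(A;B) = sum_xy p(x,y) * log( p(x,y) / (p(x) * p(y)) )
    return sum(
        (c / n) * math.log((c / n) / ((p_a[x] / n) * (p_b[y] / n)))
        for (x, y), c in p_joint.items()
    )

def fairness_regularized_loss(task_loss, protected, predictions, lam=1.0):
    """Hypothetical combined objective (our naming, not the paper's):
    the ordinary task loss plus an MI penalty that grows when the
    model's discretized predictions carry information about the
    protected attribute."""
    return task_loss + lam * mutual_information(protected, predictions)
```

In a training loop, `predictions` would be the model's (discretized) outputs on a batch and `protected` the corresponding protected-attribute labels; driving the penalty toward zero pushes the outputs toward statistical independence from the protected attribute, which is the insensitivity property the abstract describes.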
